English Pages 268 [269] Year 2014
Igor O. Cherednikov, Tom Mertens, Frederik F. Van der Veken Wilson Lines in Quantum Field Theory
De Gruyter Studies in Mathematical Physics
Editors: Michael Efroimsky, Bethesda, Maryland, USA; Leonard Gamberg, Reading, Pennsylvania, USA; Dmitry Gitman, São Paulo, Brazil; Alexander Lazarian, Madison, Wisconsin, USA; Boris Smirnov, Moscow, Russia
Volume 24
Physics and Astronomy Classification Scheme 2010: 11.15.-q, 11.15.Tk, 12.38.Aw, 02.10.Hh, 02.20.Qs, 02.40.Hw, 03.65.Vf

Authors
Igor Olegovich Cherednikov, Universiteit Antwerpen, Departement Fysica, Groenenborgerlaan 171, 2020 Antwerp, Belgium, and Bogoliubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research, 141980 Dubna, Russia
Tom Mertens, Universiteit Antwerpen, Departement Fysica, Groenenborgerlaan 171, 2020 Antwerp, Belgium
Frederik F. Van der Veken, Universiteit Antwerpen, Departement Fysica, Groenenborgerlaan 171, 2020 Antwerp, Belgium

ISBN 978-3-11-030910-2
e-ISBN (PDF) 978-3-11-030921-8
e-ISBN (EPUB) 978-3-11-038293-8
Set-ISBN 978-3-11-030922-5
ISSN 2194-3532

Library of Congress Cataloging-in-Publication Data: A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2014 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: PTP-Berlin, Protago TEX-Produktion GmbH, www.ptp-berlin.de
Printing and binding: CPI books GmbH, Leck
Printed on acid-free paper
Printed in Germany
www.degruyter.com
Preface

Aristotle held that human intellectual activity or philosophical (in a broad sense) knowledge can be seen as a threefold research program. This program contains metaphysics, the most fundamental branch, which tries to find the right way to deal with Being as such; mathematics, an exact science studying calculable (at least in principle) abstract objects and formal relations between them; and, finally, physics, the science working with changeable things and the causes of the changes. Therefore, physics is the science of evolution, in the first place the evolution in time. Put in more ‘contemporary’ terms, at any energy scale there are things which a physicist has to accept as being ‘given from above’ and then try to formulate a theory of how these things, whatever they are, change. Of course, by increasing the energy and, therefore, by improving the resolution of the experimental facility, one discovers that those things emerge, in fact, as a result of the evolution of other things, which should now be considered as ‘given from above’.¹

The very possibility that the evolution of material things, whatever they are, can be studied quantitatively is highly non-trivial. First of all, to introduce changes of something, one has to secure the existence of something that does not change. Indeed, changes can be observed only with respect to something permanent. Kant proposed that what is permanent in all changes of phenomena is substance. Although phenomena occur in time and time is the substratum, wherein co-existence or succession of phenomena can take place, time as such cannot be perceived. Relations of time are only possible on the background of the permanent. Given that changes ‘really’ take place, one derives the necessity of the existence of a representation of time as the substratum and defines it as substance. Substance is, therefore, the permanent thing only with respect to which all time relations of phenomena can be identified.
1 It is worth noticing that this scheme is one of the most consistent ways to introduce the concept of the renormalization group, which is crucial in a quantum field theoretical approach to describe the three fundamental interactions.

Kant then gave a proof that all changes occur according to the law of the connection between cause and effect, that is, the law of causality. Given that the requirement of causality is fulfilled, at least locally, we are able to use the language of differential equations to describe quantitatively the physical evolution of things. There is, however, a hierarchy of levels of causality. For example, Newton’s theory of gravitation is causal only if we do not ask how the gravitational force gets transported from one massive body to another. The concept of a field as an omnipresent mediator of all interactions allows us to step up to a higher level of causality. The field approach to the description of the natural forces culminated in the creation in the 20th century of the quantum field theoretic approach as an (almost) universal framework to study the physical phenomena at the level of the most elementary constituents of matter.

To be more precise, the quantitative picture of the three fundamental interactions is provided by the Standard Model, the quantum field theory of the strong, weak and electromagnetic forces. The aesthetic appeal and unprecedented predictive power of this theory are due to the most successful and nowadays commonly accepted way to introduce the interactions by adopting the principle of local (gauge) symmetry. This principle allows us to make use of the local field functions, which depend on the choice of the specific gauge and, as such, do not represent any observables, to construct a mathematically consistent and phenomenologically useful theory. In any gauge field theory we need, therefore, gauge-invariant objects, which are supposed to be the fundamental ingredients of the Lagrangian of the theory, and which can be consistently related, at least in principle, to physical observables. The most straightforward implementation of the idea of a scalar gauge invariant object is provided by the traced product of field strength tensors

$$ \mathrm{Tr}\left[ F_{\mu\nu}(x) F^{\mu\nu}(x) \right], \tag{1} $$

$$ F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) \pm ig\,[A_\mu(x), A_\nu(x)], \tag{2} $$

where $A_\mu(x)$ are the local gauge potentials belonging to the adjoint representation of the $N$-parametric group of local transformations $U(x)$, and $g$ is a coupling constant. Field strength tensors are also local quantities, which change covariantly under the gauge transformations:

$$ F_{\mu\nu}(x) \to U(x) F_{\mu\nu}(x) U^\dagger(x). \tag{3} $$

Interesting non-local realizations of gauge-invariant objects emerge from Wilson lines defined as path-ordered ($\mathcal{P}$) exponentials² of contour (path, loop, line) integrals of the local gauge fields $A_\mu(z)$:

$$ U_\gamma[y, x] = \mathcal{P}\exp\left[ \pm ig \int_x^y dz^\mu A_\mu(z) \right]_\gamma . \tag{4} $$
The integration goes along an arbitrary path 𝛾: z∈𝛾 from the initial point x to the end point y. The notion of a path will be one of the crucial issues throughout the book.
2 The terminology and the choice of the signature ± will be explained below.
The Wilson line (4) is gauge covariant, but, in contrast to the field strength, the transformation law reads

$$ U_\gamma[y, x] \to U(y)\, U_\gamma[y, x]\, U^\dagger(x), \tag{5} $$

so that the transformation operators $U$, $U^\dagger$ are defined at different space-time points. For closed paths $x = y$, so that we speak about the Wilson loop:

$$ U_\gamma \equiv U_\gamma[x, x] = \mathcal{P}\exp\left[ \pm ig \oint_\gamma dz^\mu A_\mu(z) \right], \tag{6} $$

which transforms similarly to the field strength:

$$ U_\gamma = U_\gamma[x, x] \to U(x)\, U_\gamma\, U^\dagger(x). \tag{7} $$

The simplest scalar gauge invariant objects made from the Wilson loops are, therefore, the traced Wilson loops $W_\gamma = \mathrm{Tr}\, U_\gamma$. From the mathematical point of view, one can construct a loop space whose elements are the Wilson loops defined on an infinite set of contours. The recast of a quantum gauge field theory in the loop space is supposed to enable one to utilize the scalar gauge-invariant field functionals as the fundamental degrees of freedom instead of the traditional gauge-dependent boson and fermion fields. Physical observables are then supposed to be expressed in terms of the vacuum averages of the products of the Wilson loops

$$ W^{(n)}_{\{\gamma\}} = \langle 0 |\, \mathrm{Tr}\, U_{\gamma_1}\, \mathrm{Tr}\, U_{\gamma_2} \cdots \mathrm{Tr}\, U_{\gamma_n} | 0 \rangle . \tag{8} $$
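The gauge invariance of the traced loop $W_\gamma$ is a one-line consequence of the transformation law (7). The worked step below is a sketch that assumes only the unitarity of $U(x)$ and the cyclic property of the trace:

```latex
\begin{align*}
W_\gamma = \mathrm{Tr}\, U_\gamma
  \;\to\; \mathrm{Tr}\left[ U(x)\, U_\gamma\, U^\dagger(x) \right]
  = \mathrm{Tr}\left[ U^\dagger(x)\, U(x)\, U_\gamma \right]
  = \mathrm{Tr}\, U_\gamma ,
  \qquad U^\dagger(x)\, U(x) = 1 .
\end{align*}
```

The same argument fails for an open line, since in (5) the two transformation matrices sit at different points $x \neq y$ and do not cancel under the trace.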
The concept of Wilson lines finds an enormously wide range of applications in a variety of branches of modern quantum field theory, from condensed matter and lattice simulations to quantum chromodynamics, high-energy effective theories and gravity. However, there exist surprisingly few reviews or textbooks which contain a more or less comprehensive pedagogical introduction into the subject. Even the basics of the Wilson lines theory may put students and non-experts in significant trouble. In contrast to the generic quantum field theory, which can be taught with the help of plenty of excellent textbooks and lecture courses, the theory of the Wilson lines and loops still lacks such a support. The objective of the present book is, therefore, to collect, overview and present in the appropriate form the most important results available in the literature with the aim to familiarize the reader with the theoretical and mathematical foundations of the concept of Wilson lines and loops. We intend also to give an introductory idea of how to implement elementary calculations utilizing the Wilson lines within the context of modern quantum field theory, in particular, in Quantum Chromodynamics. The target audience of our book consists of graduate and postgraduate students working in various areas of quantum field theory, as well as curious researchers from
other fields. Our lettore modello is assumed to have already followed standard university courses in advanced quantum mechanics, theoretical mechanics, classical fields and the basics of quantum field theory, elements of differential geometry, etc. However, we give all necessary information about those subjects to keep with the logical structure of the exposition.

Chapters 2, 3, and 4 were written by T. Mertens, Chapter 5 by F. F. Van der Veken. Preface, Introduction and general editing are due to I. O. Cherednikov. In our exposition we used extensively the results, theorems, proofs and definitions given in many excellent books and original research papers. For the sake of uniformity, we usually refrain from citing the original works in the main text. We hope that the dedicated literature guide in Appendix D will do this job better. Besides this, we have benefited from presentations made by our colleagues at conferences and workshops and informal discussions with a number of experts. Unfortunately, it is not possible to mention everybody without the risk of missing many others who deserve mentioning as well. However, we are happy to thank our current and former collaborators, from whom we have learned a lot: I. V. Anikin, E. N. Antonov, U. D’Alesio, A. E. Dorokhov, E. Iancu, A. I. Karanikas, N. I. Kochelev, E. A. Kuraev, J. Lauwers, L. N. Lipatov, O. V. Teryaev, F. Murgia, N. G. Stefanis, and P. Taels. Our special thanks go to I. V. Anikin, M. Khalo, and P. Taels for reading parts of the manuscript and making valuable critical remarks on its content. We greatly appreciate the inspiring atmosphere created by our colleagues from the Elementary Particle Physics group at the University of Antwerp, where this book was written. We are grateful to M. Efroimsky and L. Gamberg for their invitation to write this book, and to the staff of De Gruyter for their professional assistance in the course of the preparation of the manuscript.
Antwerp, May 2014
I. O. Cherednikov T. Mertens F. F. Van der Veken
Contents

Preface
1 Introduction: What are Wilson lines?
2 Prolegomena to the mathematical theory of Wilson lines
  2.1 Shuffle algebra and the idea of algebraic paths
    2.1.1 Shuffle algebra: Definition and properties
    2.1.2 Chen’s algebraic paths
    2.1.3 Chen iterated integrals
  2.2 Gauge fields as connections on a principal bundle
    2.2.1 Principal fiber bundle, sections and associated vector bundle
    2.2.2 Gauge field as a connection
    2.2.3 Horizontal lift and parallel transport
  2.3 Solving matrix differential equations: Chen iterated integrals
    2.3.1 Derivatives of a matrix function
    2.3.2 Product integral of a matrix function
    2.3.3 Continuity of matrix functions
    2.3.4 Iterated integrals and path ordering
  2.4 Wilson lines, parallel transport and covariant derivative
    2.4.1 Parallel transport and Wilson lines
    2.4.2 Holonomy, curvature and the Ambrose–Singer theorem
  2.5 Generalization of manifolds and derivatives
    2.5.1 Manifold: Fréchet derivative and Banach manifold
    2.5.2 Fréchet manifold
3 The group of generalized loops and its Lie algebra
  3.1 Introduction
  3.2 The shuffle algebra over Ω = ⋀ M as a Hopf algebra
  3.3 The group of loops
  3.4 The group of generalized loops
  3.5 Generalized loops and the Ambrose–Singer theorem
  3.6 The Lie algebra of the group of the generalized loops
4 Shape variations in the loop space
  4.1 Path derivatives
  4.2 Area derivative
  4.3 Variational calculus
  4.4 Fréchet derivative in a generalized loop space
5 Wilson lines in high-energy QCD
  5.1 Eikonal approximation
    5.1.1 Wilson line on a linear path
    5.1.2 Wilson line as an eikonal line
  5.2 Deep inelastic scattering
    5.2.1 Kinematics
    5.2.2 Invitation: the free parton model
    5.2.3 A more formal approach
    5.2.4 Parton distribution functions
    5.2.5 Operator definition for PDFs
    5.2.6 Gauge invariant operator definition
    5.2.7 Collinear factorization and evolution of PDFs
  5.3 Semi-inclusive deep inelastic scattering
    5.3.1 Conventions and kinematics
    5.3.2 Structure functions
    5.3.3 Transverse momentum dependent PDFs
    5.3.4 Gauge-invariant definition for TMDs
A Mathematical vocabulary
  A.1 General topology
  A.2 Topology and basis
  A.3 Continuity
  A.4 Connectedness
  A.5 Local connectedness and local path-connectedness
  A.6 Compactness
  A.7 Countability axioms and Baire theorem
  A.8 Convergence
  A.9 Separation properties
  A.10 Local compactness and compactification
  A.11 Quotient topology
  A.12 Fundamental group
  A.13 Manifolds
  A.14 Differential calculus
  A.15 Stokes’ theorem
  A.16 Algebra: Rings and modules
  A.17 Algebra: Ideals
  A.18 Algebras
  A.19 Hopf algebra
  A.20 Topological, C*-, and Banach algebras
  A.21 Nuclear multiplicative convex Hausdorff algebras and the Gel’fand spectrum
B Notations and conventions in quantum field theory
  B.1 Vectors and tensors
  B.2 Spinors and gamma matrices
  B.3 Light-cone coordinates
  B.4 Fourier transforms and distributions
  B.5 Feynman rules for QCD
C Color algebra
  C.1 Basics
    C.1.1 Representations
    C.1.2 Properties
  C.2 Advanced topics
    C.2.1 Calculating products of fundamental generators
    C.2.2 Calculating traces in the adjoint representation
D Brief literature guide
Bibliography
Index
1 Introduction: What are Wilson lines?

The idea of gauge symmetry suggests that any field theory must be invariant under the local (i.e., depending on space-time points) transformations of field functions

$$ \psi(x) \to U(x)\psi(x), \qquad \bar\psi(x) \to \bar\psi(x)\, U^\dagger(x), \tag{1.1} $$

where the matrices $U(x)$ belong to the fundamental representation of a given Lie group. In other words, the Lagrangian has to exhibit local symmetry. We shall mostly deal with special unitary groups, $SU(N_c)$, which are used in the Yang–Mills theories. Although a number of important results can be obtained by using only the general form of the gauge transformation, it will sometimes be helpful to use the parameterization¹

$$ U(x) = e^{\pm ig\alpha(x)}, \tag{1.2} $$

where

$$ \alpha(x) = t^a \alpha^a(x), \qquad t^a = \frac{\lambda^a}{2}, $$

and $\lambda^a$ are the generators of the Lie algebra of the group $U$. Consider for simplicity the free Lagrangian for a single massless fermion field $\psi(x)$:

$$ L_{\text{fermion}} = \bar\psi(x)\, i\partial\!\!\!/\, \psi(x), \qquad \partial\!\!\!/ = \gamma^\mu \frac{\partial}{\partial x^\mu}. \tag{1.3} $$

We easily observe that the local transformations (1.1) do not leave this Lagrangian intact. The reason is that the derivative of the field transforms as

$$ \partial_\mu \psi(x) \to U(x)[\partial_\mu \psi(x)] + [\partial_\mu U(x)]\psi(x). \tag{1.4} $$

The minimal extension of the Lagrangian (1.3), which exhibits the property of gauge invariance, consists in the introduction of the set of gauge fields

$$ A_\mu(x) = t^a A^a_\mu(x) \tag{1.5} $$

belonging to the adjoint representation of the same gauge group, which are required to transform as

$$ A_\mu(x) \to U(x) A_\mu(x) U^\dagger(x) \mp \frac{i}{g} [\partial_\mu U(x)]\, U^\dagger(x). \tag{1.6} $$

Thus, the usual derivative has to be replaced by the covariant derivative:

$$ D_\mu = \partial_\mu \mp ig A_\mu . \tag{1.7} $$

1 The coupling constant g can be chosen to have a positive or a negative sign. As this is merely a matter of convention, we leave the choice open and will write ±g.
The covariant derivative transforms as

$$ D_\mu \to U(x)\, D_\mu\, U^\dagger(x). \tag{1.8} $$

Then

$$ D_\mu \psi(x) \to U(x)[D_\mu \psi(x)]. \tag{1.9} $$

This procedure obviously makes the minimally extended Lagrangian gauge invariant:

$$ L_{\text{gauge inv.}} = \bar\psi(x)\, i D\!\!\!\!/\, \psi(x). \tag{1.10} $$
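The covariance property (1.9) can be verified by hand in the simplest setting. The worked step below is a sketch restricted to the Abelian special case, assuming a scalar gauge function in $U(x) = e^{\pm ig\alpha(x)}$ and the Abelian shift $A_\mu \to A_\mu + \partial_\mu\alpha$ (equation (1.17) of this chapter):

```latex
\begin{align*}
D_\mu \psi
  &\to \left[ \partial_\mu \mp ig \left( A_\mu + \partial_\mu \alpha \right) \right]
       e^{\pm ig\alpha}\, \psi \\
  &= e^{\pm ig\alpha} \left[ \partial_\mu \psi \pm ig (\partial_\mu \alpha)\, \psi
       \mp ig A_\mu \psi \mp ig (\partial_\mu \alpha)\, \psi \right] \\
  &= e^{\pm ig\alpha} \left( \partial_\mu \mp ig A_\mu \right) \psi
   = U(x) \left[ D_\mu \psi \right] .
\end{align*}
```

The term generated by differentiating the phase is cancelled exactly by the shift of the gauge field, which is the mechanism the minimal extension relies on.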
Consider now a slightly more complicated object, namely, the bi-local product of two matter fields

$$ \Delta(y, x) = \bar\psi(y)\psi(x). \tag{1.11} $$

Such products arise in various correlation functions in quantum field theory;² in particular, they determine the most fundamental quantities, Green’s functions, via

$$ G(y, x) = \langle 0 | T\, \bar\psi(y)\psi(x) | 0 \rangle, \tag{1.12} $$

where the symbol $T$ stands for the time-ordering operation. It is evident that in such a naive form the bi-local field products and Green’s functions are not gauge invariant:

$$ \Delta(y, x) \to \bar\psi(y)\, U^\dagger(y) U(x)\, \psi(x). \tag{1.13} $$

Therefore, the problem arises of how to find an operator $T_{[y,x]}$, which transports the field $\psi(x)$ to the point $y$, so that

$$ T_{[y,x]}\psi(x) \to U(y)\left[ T_{[y,x]}\psi(x) \right]. \tag{1.14} $$

In this case, we have

$$ \bar\psi(y)\, T_{[y,x]}\, \psi(x) \to \bar\psi(y)\, U^\dagger(y) U(y) \left[ T_{[y,x]}\psi(x) \right] = \bar\psi(y)\, T_{[y,x]}\, \psi(x), \tag{1.15} $$

so that the product (1.13) becomes gauge invariant. Consider first the Abelian gauge group $U(1)$. In this case

$$ U(x) = e^{\pm ig\alpha(x)}, \tag{1.16} $$

where $\alpha(x)$ is a scalar function. Then³

$$ A_\mu(x) \to A_\mu(x) + \partial_\mu \alpha(x), \tag{1.17} $$

so it is straightforward to see that the ‘transporter’ $T_{[y,x]}$ is given by⁴

$$ T_{[y,x]} = \exp\left[ \mathrm{sign}\; ig \int_x^y dz^\mu A_\mu(z) \right]. \tag{1.18} $$

2 See, in particular, references in the section ‘Gauge invariance in particle physics’, Appendix D.
3 Note that the sign in front of $\partial_\mu \alpha(x)$ is independent of the sign choice of g.
4 No ordering of the field functions is needed in the classical Abelian case.
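Indeed, the product (1.15) transforms as

$$ \bar\psi(y)\, U_\gamma[y, x]\, \psi(x) \to \bar\psi(y)\, e^{\mp ig\alpha(y)} \exp\left[ \mathrm{sign}\; ig \int_x^y dz^\mu \left[ A_\mu(z) + \partial_\mu \alpha(z) \right] \right] e^{\pm ig\alpha(x)}\, \psi(x). \tag{1.19} $$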
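The cancellation of phases in (1.19) can be checked numerically. The sketch below is an illustration only: it assumes a one-dimensional straight path, the upper-sign convention $U(x) = e^{+ig\alpha(x)}$, and hypothetical choices for the potential `gauge_field`, the gauge function `alpha`, and the value of `G`. It evaluates the factor that multiplies the fermion bilinear after the transformation, once with the sign of the exponent matching the convention of $U(x)$ and once with the opposite sign.

```python
import cmath
import math

G = 0.7  # illustrative value of the coupling constant g

def gauge_field(z):
    """Hypothetical Abelian potential A(z) along a straight one-dimensional path."""
    return math.sin(z) + 0.3 * z

def alpha(z):
    """Hypothetical gauge function alpha(z)."""
    return math.cos(2.0 * z)

def d_alpha(z):
    """Derivative of alpha(z): the shift of A under the transformation (1.17)."""
    return -2.0 * math.sin(2.0 * z)

def line_integral(f, x, y, steps=20000):
    """Midpoint-rule approximation of the integral of f from x to y."""
    h = (y - x) / steps
    return h * sum(f(x + (i + 0.5) * h) for i in range(steps))

def transporter(field, x, y, sign):
    """Abelian transporter (1.18): exp[sign * i g * int_x^y A dz]."""
    return cmath.exp(sign * 1j * G * line_integral(field, x, y))

def bilinear_factor(sign, x=0.2, y=1.9):
    """Factor between psi_bar(y) and psi(x) after the transformation, as in (1.19).

    Uses the upper-sign convention U(x) = exp(+i g alpha(x)).
    """
    def shifted(z):
        return gauge_field(z) + d_alpha(z)

    u_x = cmath.exp(1j * G * alpha(x))
    u_y = cmath.exp(1j * G * alpha(y))
    return u_y.conjugate() * transporter(shifted, x, y, sign) * u_x

original = transporter(gauge_field, 0.2, 1.9, sign=+1)
matching = bilinear_factor(sign=+1)        # sign follows the convention of U(x)
original_flipped = transporter(gauge_field, 0.2, 1.9, sign=-1)
mismatched = bilinear_factor(sign=-1)      # opposite sign: phases do not cancel

print("matching sign, deviation:", abs(matching - original))
print("opposite sign, deviation:", abs(mismatched - original_flipped))
```

With the matching sign the deviation is at the level of the quadrature error, while the opposite sign leaves a residual phase of order one, in line with the requirement that the sign in the exponent follow the parameterization of $U(x)$.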
It is instructive to see that the choice of the sign in equation (4) depends on the parameterization of the symmetry transformation $U(x)$. Taking the line integral for the integrand $\partial_\mu \alpha$ explicitly, one concludes that in order to preserve the gauge invariance, the sign should be chosen as sign = ±. In what follows we shall always specify the signature conventions we adopt.

In the non-Abelian case the situation is more involved. The fields at different space-time points, $A_\mu(z)$ and $A_\mu(z')$, equation (1.5), do not commute, so the exponential of non-commuting functions is ill-defined. An infinitesimal version of equation (1.15) suggests the following equation for the transporter $T_{[y,x]}$:

$$ \frac{d}{dt} T_{[y,x]} = g A_\gamma(t)\, T_{[y,x]}, \tag{1.20} $$

where we introduce an arbitrary path $\gamma$ along which $T_{[y,x]}$ ‘transports’ a field $\psi(x)$ from the point $x$ to the point $y$. The need for this path stems from the fact that we do not know (unless, for some reason, stated otherwise) along which trajectory we have to transfer the fields from one point to another. The requirement of gauge invariance is not affected by the choice of path, but, as we will see, the transporter becomes a functional of the path. The path $\gamma$ is assumed to be parameterized by the coordinate $z \in \gamma$ depending on the parameter $t$ in such a way that

$$ dz_\mu = \dot\gamma_\mu(t)\, dt, \qquad z(0) = x, \qquad z(t) = y. $$

The operator $A_\gamma(t)$ on the r.h.s. of equation (1.20) is given by

$$ A_\gamma(t) = A_\mu[z(t)]\, \dot\gamma_\mu(t). \tag{1.21} $$

It is easy to see that (1.18) solves equation (1.20) in the classical Abelian case. Integrating equation (1.20) from 0 to $t$ yields an integral equation instead of a differential one:

$$ T_{[y,x]} - T_{[x,x]} = T(t) - T(0) = g \int_0^t A_\gamma(t_1)\, T(t_1)\, dt_1 . \tag{1.22} $$

Imagine that the coupling constant $g$ can be considered as small and let us solve this equation perturbatively. Namely, we assume that a solution can be presented as an infinite series

$$ T_{[y,x]}(t) = T^{(0)} + g T^{(1)} + g^2 T^{(2)} + \dots + g^n T^{(n)} + \dots \tag{1.23} $$
Suppose we have an initial condition

$$ T(0) = T_{[x,x]} = T^{(0)} . \tag{1.24} $$

Then, for the first non-trivial term in the expansion (1.23) we obtain

$$ T^{(1)}(t) = \left[ \int_0^t A_\gamma(t_1)\, dt_1 \right] T(0). \tag{1.25} $$

$T(0)$ is $t_1$-independent by construction and thus can be separated out from the integration. The next order gives

$$ T^{(2)}(t) = \int_0^t A_\gamma(t_1)\, T^{(1)}(t_1)\, dt_1 = \left[ \int_0^t A_\gamma(t_1) \int_0^{t_1} A_\gamma(t_2)\, dt_2\, dt_1 \right] T(0). \tag{1.26} $$

We can rewrite equation (1.26) as

$$ T^{(2)}(t) = \frac{1}{2} \left[ \mathcal{P} \int_0^t \int_0^t A_\gamma(t_1) A_\gamma(t_2)\, dt_1\, dt_2 \right] T(0), \tag{1.27} $$

where the path-ordering operator reads

$$ \mathcal{P} A_\gamma(t_1) A_\gamma(t_2) = \theta(t_1 - t_2)\, A_\gamma(t_1) A_\gamma(t_2) + \theta(t_2 - t_1)\, A_\gamma(t_2) A_\gamma(t_1). \tag{1.28} $$
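The factor 1/2 in (1.27) comes from splitting the square $[0,t]\times[0,t]$ into two triangles. The short derivation below is a sketch that uses only the definition (1.28) and a relabeling of the integration variables:

```latex
\begin{align*}
\int_0^t \!\! \int_0^t \mathcal{P}\, A_\gamma(t_1) A_\gamma(t_2)\, dt_1\, dt_2
  &= \int_0^t dt_1 \int_0^{t_1} dt_2\; A_\gamma(t_1) A_\gamma(t_2)
   + \int_0^t dt_2 \int_0^{t_2} dt_1\; A_\gamma(t_2) A_\gamma(t_1) \\
  &= 2 \int_0^t dt_1 \int_0^{t_1} dt_2\; A_\gamma(t_1) A_\gamma(t_2) .
\end{align*}
```

The second term becomes identical to the first after exchanging the names $t_1 \leftrightarrow t_2$, so the ordered integral over the square is twice the nested integral over one triangle, which is the content of (1.26) and (1.27) taken together.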
It is straightforward to see that the generic $n$-th order term is given by

$$ T^{(n)}(t) = \frac{1}{n!}\, \mathcal{P} \int_0^t \cdots \int_0^t \left[ A_\gamma(t_1) \cdots A_\gamma(t_n)\, dt_1\, dt_2 \cdots dt_n \right] T(0), \tag{1.29} $$

with the obvious generalization of the path-ordering to $n$ functions $A$. Therefore, the final solution can be presented in the form

$$ T(t) = \sum_{n=0}^{\infty} g^n\, \frac{1}{n!}\, \mathcal{P} \int_0^t \cdots \int_0^t \left[ A_\gamma(t_1) \cdots A_\gamma(t_n)\, dt_1\, dt_2 \cdots dt_n \right] T(0) \equiv \mathcal{P}\exp\left[ g \int_0^t A_\gamma(t')\, dt' \right] T(0). \tag{1.30} $$

Remembering the definition of $A_\gamma$, equation (1.21), and taking the natural initial condition $T_0 = T_{[x,x]} = 1$, we have finally

$$ T_{[y,x]} = \mathcal{P}\exp\left[ \pm g \int_x^y A_\mu[z]\, dz_\mu \right]_\gamma , \tag{1.31} $$

that is the Wilson line (4):

$$ T_{[y,x]} = U_\gamma[y, x], \tag{1.32} $$

with arbitrary path $\gamma$. In other words, the function (1.15) is gauge invariant, but path dependent. The rest of the book will be devoted to the mathematical motivation of the above manipulations and to some applications of the Wilson line approach in Quantum Field Theory.
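The difference between a path-ordered exponential and an ordinary one can be made concrete with a small numerical experiment. The sketch below is a toy illustration, not a gauge-theory computation: the 2×2 matrices `A1` and `A2`, the piecewise-constant profile `a_gamma`, and the value of `G` are hypothetical choices, and the real coupling of equation (1.20) is used as written. It integrates $dT/dt = g A_\gamma(t) T$ as an ordered product of short steps, builds the first two terms (1.25) and (1.26) of the series, and compares both with the unordered exponential of the integrated field.

```python
# Toy check of the path-ordered exponential; all matrices and values are illustrative.

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]
G = 0.8                              # coupling g, as in equation (1.20)
A1 = [[0.0, 1.0], [0.0, 0.0]]        # hypothetical A_gamma on the first half of the path
A2 = [[0.0, 0.0], [1.0, 0.0]]        # hypothetical A_gamma on the second half

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def mat_exp(a, terms=30):
    """Matrix exponential via its Taylor series (fine for small 2x2 matrices)."""
    result, power = IDENTITY, IDENTITY
    for n in range(1, terms):
        power = mat_scale(1.0 / n, mat_mul(power, a))   # a^n / n!
        result = mat_add(result, power)
    return result

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def a_gamma(t):
    """Piecewise-constant stand-in for A_gamma(t) of equation (1.21), t in [0, 1]."""
    return A1 if t < 0.5 else A2

def path_ordered_exp(steps=1000):
    """Integrate dT/dt = g A_gamma(t) T, T(0) = 1; later factors multiply from the left."""
    dt = 1.0 / steps
    transporter = IDENTITY
    for k in range(steps):
        step = mat_exp(mat_scale(G * dt, a_gamma((k + 0.5) * dt)))
        transporter = mat_mul(step, transporter)
    return transporter

def dyson_terms(steps=1000):
    """First- and second-order terms T^(1), T^(2) of (1.25) and (1.26), with T(0) = 1."""
    dt = 1.0 / steps
    first, second = ZERO, ZERO       # `first` is the running integral of A_gamma
    for k in range(steps):
        a = a_gamma((k + 0.5) * dt)
        inner_mid = mat_add(first, mat_scale(0.5 * dt, a))  # inner integral at cell midpoint
        second = mat_add(second, mat_scale(dt, mat_mul(a, inner_mid)))
        first = mat_add(first, mat_scale(dt, a))
    return first, second

ordered = path_ordered_exp()
naive = mat_exp(mat_scale(0.5 * G, mat_add(A1, A2)))        # ignores the ordering
t1, t2 = dyson_terms()
dyson = mat_add(IDENTITY, mat_add(mat_scale(G, t1), mat_scale(G * G, t2)))

print("ordered vs. unordered exponential:", max_diff(ordered, naive))
print("ordered vs. series to second order:", max_diff(ordered, dyson))
```

For this particular nilpotent choice every ordered product of three or more factors vanishes, so the series stops at second order and agrees with the ordered product up to rounding, while the unordered exponential differs visibly. For generic non-commuting matrices the series would not terminate and the agreement would hold only order by order in $g$.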
2 Prolegomena to the mathematical theory of Wilson lines In this part of the book we give the necessary conceptual thesaurus and overview the main steps towards the construction of the mathematical theory of Wilson lines and loops. To be more precise, a goal of this exposition is to demonstrate that gauge theories can be consistently formulated in the principal fiber bundle setting, where the gauge fields (or potentials) are identified with pullbacks of sections of a connection one-form in the gauge bundle. The gauge potentials give rise to a parallel transport equation in the gauge bundle that can be solved by using product integrals. As a result, we shall show that the solution of the parallel transport equation can be presented as a Wilson line. We shall also discuss its relation to the standard covariant derivative in gauge theories. Then, an alternative way to construct a gauge theory will be discussed, which is based on the use of the holonomies in the gauge bundle instead of the gauge potentials. This possibility is based on the Ambrose–Singer theorem, which claims that the entire gauge invariant content of a gauge theory is included in the holonomies. However, the issues of overcompleteness, reparameterization invariance, and additional algebraic constraints, coming from the matrix representation of the Lie algebra associated with the gauge group, impede the straightforward application of the standard loop space approach to gauge field theories. An interesting solution to these problems arises if one extends this setting to the so-called generalized loops, first proposed by Chen and further studied by Tavares (for references, see section ‘Algebraic paths’ in Literature Guide D) within the framework of the generalized loop space (GLS) approach. Our exposition is based mostly on the original works by these authors. 
Aiming towards the appropriate formulation of the generalized loop space framework and having in mind the demonstration of its relation to Wilson lines and loops, we start with an introductory discussion of the most relevant algebraic concepts. Then we make use of these concepts to construct Chen’s algebraic d-paths, and, consequently, the generalized loop space. We end the chapter with a discussion on the differential operators which can be defined in generalized loop space. Assuming that gauge field theories can be recast within the GLS framework, and given that, to this end, a suitable action should be found, one can use relevant differential operators to generate the variations of the generalized degrees of freedom in the GLS, and hence, to construct the appropriate equations of motion in the GLS. Let us mention that the ambitious program of reformulation of gauge theories in the GLS setting has never been fully accomplished and thus remains a challenge. Note that we give complete definitions of the notions, formulations of theorems and their proofs only when we find it necessary for the consistency of the exposition. For an extended list of definitions and some helpful theorems and statements we refer to Appendix A.
2.1 Shuffle algebra and the idea of algebraic paths 2.1.1 Shuffle algebra: Definition and properties 2.1.1.1 Algebraic preliminaries For our purposes it is sufficient to describe an n-dimensional manifold as a topological space, wherein a neighborhood to each point is equivalent (strictly speaking, homeomorphic) to the n-dimensional Euclidean space. The fundamental geometrical object in a manifold we will be concerned about is a path. One has a natural intuitive idea of what a path or a loop in a manifold is. Mathematically one usually defines a path 𝛾 in a manifold M as the map 𝛾 : [0, 1] → M, t → 𝛾(t). For closed paths, which are called loops, one just adds the extra condition that the initial and final points of the path coincide 𝛾(0) = 𝛾(1) ∈ M. The straightforward idea of paths and loops can be generalized to the so-called algebraic d-paths, where the d-paths are constructed as algebraic objects possessing certain (desirable) properties. The resulting algebraic structure can then be supplied with a topology, turning it into a topological algebra. The topology is used to complete the algebraic properties with analytic ones, allowing one to introduce the necessary differential operators.¹ Several algebraic structures must be introduced before we begin the main discussion. Without going too deep into details, we define a ring as a set wherein two binary operations of multiplication and addition are defined. Putting it another way, a ring is an Abelian group (with addition being the group operation) supplied with an extra operation (multiplication). If the second operation is commutative, the ring is also called commutative. The set of integers provides one of the simplest examples of a commutative ring. Otherwise we speak of noncommutative rings. The set of square matrices is an example of a noncommutative ring. 
Having introduced the notion of a ring, we are able to introduce another algebraic structure, namely a field, which is defined as a commutative ring where division by a nonzero element is allowed. It is evident that nonzero elements of a field make up an Abelian group under multiplication. For example, the set of real numbers forms a field. One can then construct a vector space over a field. In this case, the elements of the vector space are called vectors, while the elements of the field are scalars, and two
1 Most of the material in this section is based on the original works by Chen (see Literature Guide), where the proofs to a number of the stated theorems can also be found. For the sake of brevity we skip those proofs which do not bring more insight than needed into the subject of the book.
binary operations (addition of two vectors and multiplication of a vector by a scalar) acting within the vector space should be defined. One easily captures the idea of a vector space by thinking of the usual Euclidean vectors of velocities or forces. The notion of a module over a ring generalizes the concept of a vector space over a field: now scalars only have to form a ring, not necessarily a field. For example, any Abelian group is a module over the ring of integers. In what follows ‘K-module’ stands for a module over a ring K.
2.1.1.2 Shuffle algebra

The generalization of the concept of paths in a manifold calls for the introduction of the notion of a shuffle algebra. The shuffle algebra is an algebra based on the shuffle product, which in turn is defined via (k, l)-shuffles. Let us start with the definitions of the shuffles.

Definition 2.1 ((k, l)-shuffle). A (k, l)-shuffle is a permutation $P$ of the $k + l$ letters, such that

$$ P(1) < \cdots < P(k) \quad \text{and} \quad P(k+1) < \cdots < P(k+l). $$

Exercise 2.2. How can one explain a (k, l)-shuffle using a deck of cards?

Using these (k, l)-shuffles we can introduce the shuffle multiplication, represented by the symbol ∙. Let us consider a set of arbitrary objects $Z_i$.

Definition 2.3 (Shuffle multiplication). Using the notations

$$ Z_1 \cdots Z_k = Z_1 \otimes \cdots \otimes Z_k \in \bigotimes\nolimits^{k} \bigwedge M , \quad k \geq 1, $$

and, by convention, $Z_1 \cdots Z_k = 0$ for $k = 0$, we write the shuffle multiplication as

$$ Z_1 \cdots Z_k \bullet Z_{k+1} \cdots Z_{k+l} = \sum_{P_{k,l}} Z_{P(1)} \cdots Z_{P(k+l)}, \tag{2.1} $$

where $\sum_{P_{k,l}}$ denotes the sum over all (k, l)-shuffles and $\bigwedge^n M$ stands for the $n$-th exterior power of the exterior algebra $\bigwedge M$ over the manifold $M$.

Several examples will be instructive to make the shuffle multiplication clear.

Example 2.4 (Shuffle multiplication).
– The situation with two objects is evident:
$$ Z_1 \bullet Z_2 = Z_1 Z_2 + Z_2 Z_1 $$
– Shuffle multiplication of three objects reads:
$$ Z_1 \bullet Z_2 Z_3 = Z_1 Z_2 Z_3 + Z_2 Z_1 Z_3 + Z_2 Z_3 Z_1 $$
– Four objects multiply as:
$$ Z_1 Z_2 \bullet Z_3 Z_4 = Z_1 Z_2 Z_3 Z_4 + Z_1 Z_3 Z_2 Z_4 + Z_1 Z_3 Z_4 Z_2 + Z_3 Z_1 Z_2 Z_4 + Z_3 Z_1 Z_4 Z_2 + Z_3 Z_4 Z_1 Z_2 . \tag{2.2} $$
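The examples above can be reproduced mechanically. The sketch below is one possible reading of Definitions 2.1 and 2.3, under the assumption that $P(i)$ denotes the slot occupied by the $i$-th letter in the resulting word: choosing the $k$ slots for the letters of the first word fixes the whole interleaving, because each word keeps its internal order. The function name `shuffle_product` and the representation of words as tuples of strings are illustrative choices.

```python
from itertools import combinations
from math import comb

def shuffle_product(left, right):
    """Return all interleavings of two words (tuples), each keeping its own letter order."""
    k, n = len(left), len(left) + len(right)
    terms = []
    for slots in combinations(range(n), k):   # slots taken by the letters of `left`
        chosen = set(slots)
        left_letters, right_letters = iter(left), iter(right)
        word = tuple(next(left_letters) if slot in chosen else next(right_letters)
                     for slot in range(n))
        terms.append(word)
    return terms

two = shuffle_product(("Z1",), ("Z2",))
three = shuffle_product(("Z1",), ("Z2", "Z3"))
four = shuffle_product(("Z1", "Z2"), ("Z3", "Z4"))

for term in four:
    print(" ".join(term))
```

The number of terms is the binomial coefficient $\binom{k+l}{k}$, which gives 2, 3 and 6 for the three cases of Example 2.4.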
If we consider the objects $Z$ to be one-forms $\omega$ (or linear functionals) defined on some manifold and compare their shuffle products with the usual antisymmetric wedge products, then the shuffle product can be treated as a symmetric counterpart to the wedge product. Let now $M$ be a manifold and

$$ \Omega = \bigwedge\nolimits^{1} M = \bigwedge M $$

be the set of one-forms on $M$. We interpret $\Omega$ as a K-module, where for the moment we assume that $K$ is a general ring of scalars with a multiplicative unity. Introducing the shuffle product on a K-module $\Omega$ defines the shuffle K-algebra.

Definition 2.5 (Shuffle K-algebra). Consider a K-module $\Omega$ and the regular tensor algebra over $K$ based on $\Omega$, denoted by $T(\Omega)$. Then $T^r(\Omega)$ represents the degree $r$ components of the algebra. It is easy to see that $T^0(\Omega) = K$. Replacing the tensor product by the shuffle multiplication generates a new algebra called the shuffle K-algebra $Sh(\Omega)$ based on the K-module $\Omega$.

In this algebra the shuffle product plays the role of the algebra multiplication $m$, so that one can write

$$ m \equiv \bullet : Sh \otimes Sh \to Sh, $$

and for the algebra unit map one has

$$ u : K \to Sh, \qquad 1_K \mapsto 1_{Sh} . $$

It is now possible to extend the algebraic structure based on the shuffle product by introducing the K-linear maps $\epsilon$, $\Delta$.

Definition 2.6 (Co-unit and co-multiplication).
– Co-unit $\epsilon \in \mathrm{Alg}(Sh(\Omega), K)$ is defined by
$$ \begin{cases} \epsilon(1_{Sh}) = 1_K & \text{for } n = 0 \\ \epsilon(\omega_1 \cdots \omega_n) = 0 & \text{for } n > 0. \end{cases} \tag{2.3} $$
– Co-multiplication $\Delta : Sh(\Omega) \to Sh(\Omega) \otimes Sh(\Omega)$ acts as
$$ \begin{cases} \Delta(1) = 1 & \text{for } n = 0 \\ \Delta(\omega_1 \cdots \omega_n) = \sum_{i=0}^{n} (\omega_1 \cdots \omega_i) \otimes (\omega_{i+1} \cdots \omega_n) & \text{for } n > 0. \end{cases} \tag{2.4} $$

The map $\Delta$ can be considered as a K-module morphism and is also an associative co-multiplication since $(1 \otimes \Delta)\Delta = (\Delta \otimes 1)\Delta$.

Exercise 2.7. Prove the above statement.

The co-multiplication $\Delta$ and co-unit $\epsilon$ introduce a co-algebra structure on the shuffle algebra, so that it becomes a bi-algebra. We can go a step further and show that the shuffle algebra is also a Hopf algebra.² For that reason we introduce the notion of an antipode.

Definition 2.8 (Antipode). A K-linear map $J : Sh \to Sh$ is called the shuffle algebra antipode provided that

$$ J(\omega_1 \cdots \omega_n) = (-1)^n\, \omega_n \cdots \omega_1 . \tag{2.5} $$
It is evident that J(1) = 1, J² = 1. Consider now the shuffle multiplication ms : Sh ⊗ Sh → Sh and the unit map η : K → Sh. Let the transposition map or flipping operation T : Sh ⊗ Sh → Sh ⊗ Sh be defined as
$$
T(u_1 \otimes u_2) = u_2 \otimes u_1 . \tag{2.6}
$$

² A Hopf algebra is at the same time an algebra and a co-algebra, see Appendix A.
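Then, for all u1, u2 ∈ Sh, the antipode can be shown to possess the following properties:
$$
\begin{aligned}
J(u_1 \bullet u_2) &= J(u_2) \bullet J(u_1), \\
m_s \circ (J \otimes 1) \circ \Delta &= m_s \circ (1 \otimes J) \circ \Delta = \eta \circ \epsilon, \\
T \circ (J \otimes J) \circ \Delta &= \Delta \circ J, \\
\epsilon \circ J &= \epsilon .
\end{aligned} \tag{2.7}
$$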
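The second identity in (2.7) can be checked mechanically on short words. The following is a minimal sketch, assuming that a word ω1⋯ωn is represented as a Python tuple of letter labels and an element of Sh(Ω) as a dictionary from words to integer coefficients; the function names are illustrative and do not come from the text.

```python
from collections import Counter
from itertools import combinations

def shuffle(u, v):
    """Shuffle product of two words (tuples): all interleavings, with multiplicity."""
    out = Counter()
    n, m = len(u), len(v)
    for pos in combinations(range(n + m), n):  # slots occupied by the letters of u
        chosen, iu, iv = set(pos), iter(u), iter(v)
        word = tuple(next(iu) if k in chosen else next(iv) for k in range(n + m))
        out[word] += 1
    return out

def coproduct(w):
    """Deconcatenation (2.4): all (prefix, suffix) pairs for i = 0..n."""
    return [(w[:i], w[i:]) for i in range(len(w) + 1)]

def antipode(w):
    """Antipode (2.5): the sign (-1)^n together with the reversed word."""
    return (-1) ** len(w), w[::-1]

def m_J_tensor_1_Delta(w):
    """Evaluate m_s ∘ (J ⊗ 1) ∘ Δ on the word w; zero coefficients are dropped."""
    total = Counter()
    for left, right in coproduct(w):
        sign, reversed_left = antipode(left)
        for word, mult in shuffle(reversed_left, right).items():
            total[word] += sign * mult
    return {word: c for word, c in total.items() if c != 0}

print(m_J_tensor_1_Delta(("a", "b", "c")))  # all terms cancel for n > 0
print(m_J_tensor_1_Delta(()))               # the unit is mapped to itself
```

For a non-empty word all terms cancel, in agreement with η ∘ 𝜖 vanishing for n > 0, while the empty word (the unit of Sh) is returned unchanged.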
Therefore, the following theorem holds:

Theorem 2.9 (Sh(Ω) is a Hopf algebra). The shuffle algebra Sh(Ω) is a Hopf K-algebra provided that its co-multiplication Δ, co-unit 𝜖 and antipode J are defined as in equations (2.3), (2.4), and (2.5).

Keeping in mind the algebraic structure of the shuffle algebra discussed above, we can go on with the study of the algebra homomorphisms³ Alg(Sh(Ω), K).

Definition 2.10 (Group multiplication on Alg(Sh(Ω), K)). Consider the algebra homomorphisms αi ∈ Alg(Sh(Ω), K). Define the multiplication α1α2 ∈ Alg(Sh(Ω), K) as α1α2 = (α1 ⊗ α2)Δ. For this multiplication we have 𝜖α1 = α1𝜖 = α1 and α1(α2α3) = (α1 ⊗ α2 ⊗ α3)(1 ⊗ Δ)Δ = (α1 ⊗ α2 ⊗ α3)(Δ ⊗ 1)Δ = (α1α2)α3. The multiplication of algebra morphisms is depicted in Figure 2.1.

It is easy to observe that:

Proposition 2.11. The multiplication in Definition 2.10, defined on the algebra morphisms Alg(Sh(Ω), K), turns it into a group.
3 It suffices here to describe a homomorphism as a map between two sets which preserves their algebraic structures.
[Figure 2.1: diagram of the maps Δ : Sh(Ω) → Sh(Ω) ⊗ Sh(Ω) and α1 ⊗ α2 : Sh(Ω) ⊗ Sh(Ω) → K ⊗ K ≅ K, with the composite labeled by Alg(Sh(Ω), K).]
Fig. 2.1: Multiplication of algebra morphisms.
We can now study the properties of the algebra homomorphisms Alg(Sh(Ω), Sh(Ω)). To this end, let us define an algebra morphism which might look a bit strange at the moment, but will turn out to be valuable when considering the group structure of algebraic paths and loops.

Definition 2.12 (L-operator). For α ∈ Alg(Sh(Ω), K) define
$$
\tilde{L}_\alpha = (\alpha \otimes 1)\Delta \in \mathrm{Alg}(Sh(\Omega), Sh(\Omega)) \tag{2.8}
$$
and
$$
\hat{L}_\alpha = \tilde{L}_\alpha \otimes 1 \in \mathrm{Hom}(Sh(\Omega) \otimes \Omega, Sh(\Omega) \otimes \Omega). \tag{2.9}
$$
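This operator has the following interesting property with respect to the products of elements of Alg(Sh(Ω), K):

Property 2.13. If α1, α2 ∈ Alg(Sh(Ω), K), then, by Definition 2.10,
$$
\alpha_2 \tilde{L}_{\alpha_1} = \alpha_1 \alpha_2
$$
and
$$
(\tilde{L}_{\alpha_1 \alpha_2}, \hat{L}_{\alpha_1 \alpha_2}) = (\tilde{L}_{\alpha_2} \tilde{L}_{\alpha_1}, \hat{L}_{\alpha_2} \hat{L}_{\alpha_1}). \tag{2.10}
$$

The proof of the above statement is straightforward:

Proof.
$$
\begin{aligned}
\tilde{L}_{\alpha_1 \alpha_2} &= (\alpha_1 \alpha_2 \otimes 1)\Delta = [(\alpha_1 \otimes \alpha_2)\Delta \otimes 1]\Delta \\
&= (\alpha_1 \otimes \alpha_2 \otimes 1)(1 \otimes \Delta)\Delta = (\alpha_1 \otimes \tilde{L}_{\alpha_2})\Delta \\
&= \tilde{L}_{\alpha_2} (\alpha_1 \otimes 1)\Delta = \tilde{L}_{\alpha_2} \tilde{L}_{\alpha_1} .
\end{aligned} \tag{2.11}
$$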
One also obtains
$$
\hat{L}_{\alpha_1 \alpha_2} = \tilde{L}_{\alpha_1 \alpha_2} \otimes 1 = \hat{L}_{\alpha_2} \hat{L}_{\alpha_1} . \tag{2.12}
$$
Notice that L̃_𝜖 and L̂_𝜖 are identity morphisms on Sh(Ω) and Sh(Ω) ⊗ Ω.
2.1.1.3 Shuffle differentiations
We wish to discuss generalized or algebraic paths and loops which are based on shuffle algebra morphisms. Because we are ultimately interested in the mathematically consistent formalism for variations of these paths and loops, we need the operations of differentiation to be well-defined. The appropriate introduction of these differentiations requires some basic knowledge of category theory. We now give a brief introduction to category theory, restricting ourselves only to those concepts which will be explicitly used in our discussion. Define first the concept of a category.

Definition 2.14 (Category). A category C includes:
1. a class ob(C) of objects ai;
2. a class Hom(C) of morphisms Fl which can be interpreted as maps between the objects. A morphism has a unique source object ai and target object aj: Fl : ai → aj;
3. a binary operation called the composition of morphisms, such that Hom(a1, a2) × Hom(a2, a3) → Hom(a1, a3),
which exhibits the properties of
1. identity: for each object ai ∈ ob(C) there exists an identity morphism 1_{ai} : ai → ai, such that F ⋅ 1_{ai} = F = 1_{aj} ⋅ F for every F : ai → aj;
2. associativity: if Fi : ai → ai+1, then F3 ⋅ (F2 ⋅ F1) = (F3 ⋅ F2) ⋅ F1.
It is easy to show that there exists a unique identity map. We also need maps between categories which are captured by the notion of a functor.
Definition 2.15 (Functor). Let C1 and C2 be two categories. A functor F from C1 to C2 is a map with the following properties:
1. The mapping rule: for each X1 ∈ C1, there exists X2 ∈ C2, such that X1 ↦ X2 = F(X1).
2. For a covariant functor: for each f : X1 → X2 in C1 there exists F(f) : F(X1) → F(X2) in C2.
3. For a contravariant functor, correspondingly: F(f) : F(X2) → F(X1) in C2,
such that the identity and composition of morphisms are preserved. Namely, we have
1. for the unity 1_{C1} in C1: 1_{C1} ↦ F(1_{C1}) = 1_{C2} ∈ C2;
2. for a covariant functor F(f1 ∘ f2) = F(f1) ∘ F(f2);
3. for a contravariant functor F(f1 ∘ f2) = F(f2) ∘ F(f1).
One calls a functor F : C1 → C2 full (faithful, fully faithful) if, for all objects a1 and a2 of C1, the map
$$
\mathrm{Hom}(a_1, a_2) \to \mathrm{Hom}(F(a_1), F(a_2)) \tag{2.13}
$$
is surjective (injective, bijective). There exist some special types of functors, of which we only mention the forgetful functor, since we shall deal with it in further discussions.

Definition 2.16 (Forgetful functor). Suppose two categories C1 and C2 are given, and the object X ∈ C1 can be regarded as an object of C2 by ignoring certain mathematical structures of X. Then a functor U : C1 → C2 which ‘forgets’ about any mathematical structure is called a forgetful functor.
Now we are ready to introduce differentiations. Let us begin with the notion of a K-module differentiation.

Definition 2.17 (K-module differentiation). Consider a K-module U and a U-module Ω. Let F1, F2 ∈ U. A differentiation d is a morphism d : U → Ω which obeys the rule
$$
d(F_1 F_2) = F_2 (dF_1) + F_1 (dF_2). \tag{2.14}
$$
Extending to K-algebras, K-module differentiations form a category:

Definition 2.18 (Category of K-algebra differentiations). Consider K-algebras U and U′ and the differentiations d : U → Ω and d′ : U′ → Ω′. Denote by Diff(d, d′) the set of all pairs (φ̃, φ̂), where φ̃ ∈ Alg(U, U′) and φ̂ ∈ Hom_K(Ω, Ω′), such that d′φ̃ = φ̂d and, if F ∈ U, w ∈ Ω, then φ̂(Fw) = (φ̃F)φ̂w. In what follows, D stands for the category of differentiations of unitary commutative K-algebras with the category morphisms defined above.

The next category of differentiations we shall introduce is the category of pointed differentiations.

Definition 2.19 (Category of pointed differentiations). Consider a pair (d, p) with the operation d : U → Ω being a differentiation, and p ∈ Alg(U, K).
Such a pair is said to be a pointed differentiation. The corresponding category PD can be introduced, such that the morphisms Diff(d, p; d′, p′) in PD, (d, p) → (d′, p′), are given by pairs (φ̃, φ̂) ∈ Diff(d, d′) such that p = p′φ̃. The category morphisms then define equivalences of differentiations. Anticipating a forthcoming discussion, we notice that it is the above-mentioned property of morphisms, p = p′φ̃, which guarantees the uniqueness of the initial point of a generalized path. A subcategory of the pointed differentiations is generated if one imposes the constraint of surjectivity on the K-module differentiation.

Definition 2.20 (Surjective pointed differentiation). We call a pointed differentiation (d, p) surjective if d is surjective.

The last subcategory of pointed differentiations we need to define is the category of splitting pointed differentiations. To define these differentiations we also need to introduce the notion of a short exact sequence. In general, ker F stands for the kernel of a map F : A1 → A2, that is, the subset of A1 which maps under F into the zero of A2. In other words, the image of ker F under F is the zero in A2. Then we define:

Definition 2.21 (Exact sequence). Consider a sequence of homomorphisms of K-modules
$$
U_1 \xrightarrow{F_0} U_2 \xrightarrow{F_1} U_3 .
$$
It is said to be exact at U2 if Im F0 = ker F1. If each term except the first and the last one in a sequence
$$
U_0 \xrightarrow{F_0} U_1 \xrightarrow{F_1} U_2 \xrightarrow{F_2} \cdots \xrightarrow{F_{n-1}} U_n
$$
is exact, then we speak about an exact sequence. A five-term exact sequence 0 → U1 → U2 → U3 → 0 is short exact.
Using short exact sequences we can finally define the splitting pointed differentiations.

Definition 2.22 (Splitting pointed differentiation). A pointed differentiation (d, p) is called splitting if for a K-module U
$$
U = \ker d \oplus \ker p . \tag{2.15}
$$
That is, (d, p) is splitting if and only if ker d ∩ ker p = 0. (d, p) is splitting and surjective if and only if
$$
0 \to K \xrightarrow{u} U \xrightarrow{d} \Omega \to 0 \tag{2.16}
$$
is a short exact sequence, where u : K → U is the unit map in the algebra U. In what follows, SPD stands for the subcategory of splitting surjective differentiations of the category PD.

Consider an application of the above differentiations to the specific case of shuffle algebras. Applying the K-module differentiation d from Definition 2.17 with U = Sh(Ω) and K-module Ω yields
$$
Sh(\Omega) \xrightarrow{d} Sh(\Omega) \otimes \Omega ,
$$
where we treat Ω as an Sh(Ω)-module with the properties for ωi ∈ Sh(Ω) and wi ∈ Ω:
1. 1_{Sh(Ω)} w = w;
2. ω(w1 +_Ω w2) = ωw1 +_Ω ωw2;
3. (ω1 +_{Sh(Ω)} ω2)w = ω1w +_Ω ω2w;
4. (ω1 ∙ ω2)w = ω1w ⋅_Ω ω2w.
Similarly we can consider the surjective differentiations:

Definition 2.23 (Surjective shuffle module differentiation). Suppose we have the K-module Sh(Ω) ⊗ Ω. Consider it as an Sh(Ω)-module, so that for u1, u2 ∈ Sh(Ω), u3 ∈ Ω we have u1 ∙ (u2 ⊗ u3) = (u1 ∙ u2) ⊗ u3.
The surjective differentiation δ ∈ Hom(Sh(Ω), Sh(Ω) ⊗ Ω) is defined by δ(u1u3) = u1 ⊗ u3, δ(1) = 0. To see that δ is a differentiation, it is instructive to first consider an example of a shuffle product of the tensor products.

Example 2.24 (Shuffle product of tensor products). Consider u1, u2 ∈ T¹(Ω) and w1, w2 ∈ T¹(Ω). We have
$$
\begin{aligned}
(u_1 w_1) \bullet (u_2 w_2) &= (u_1 \otimes w_1) \bullet (u_2 \otimes w_2) \\
&= u_1 w_1 u_2 w_2 + u_1 u_2 w_1 w_2 + u_1 u_2 w_2 w_1 + u_2 u_1 w_2 w_1 + u_2 w_2 u_1 w_1 + u_2 u_1 w_1 w_2 \\
&= (u_1 w_1 \bullet u_2) w_2 + (u_2 w_2 \bullet u_1) w_1 .
\end{aligned} \tag{2.17}
$$
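Equation (2.17) can be reproduced by enumerating interleavings directly. The sketch below assumes that words are Python tuples of letter labels (the strings "u1", "w1", ... stand for the one-forms of Example 2.24) and that the helper name `shuffle` is purely illustrative.

```python
from collections import Counter
from itertools import combinations

def shuffle(u, v):
    """Shuffle product of two words: sum over all order-preserving interleavings."""
    out = Counter()
    n, m = len(u), len(v)
    for pos in combinations(range(n + m), n):  # slots occupied by the letters of u
        chosen, iu, iv = set(pos), iter(u), iter(v)
        out[tuple(next(iu) if k in chosen else next(iv) for k in range(n + m))] += 1
    return out

# Left-hand side of (2.17): (u1 w1) ∙ (u2 w2)
lhs = shuffle(("u1", "w1"), ("u2", "w2"))

# Right-hand side of (2.17): (u1 w1 ∙ u2) w2 + (u2 w2 ∙ u1) w1
rhs = Counter()
for word, c in shuffle(("u1", "w1"), ("u2",)).items():
    rhs[word + ("w2",)] += c
for word, c in shuffle(("u2", "w2"), ("u1",)).items():
    rhs[word + ("w1",)] += c

for word in sorted(lhs):
    print(" ".join(word))
print(lhs == rhs)
```

The six printed words are the six terms of the middle line of (2.17), and the final comparison confirms the regrouping by the last letter that is used in the proof of Theorem 2.25.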
Therefore, we obtain:

Theorem 2.25. δ is a differentiation.

Proof. We have
$$
(u_1 w_1) \bullet (u_2 w_2) = (u_1 w_1 \bullet u_2) w_2 + (u_2 w_2 \bullet u_1) w_1 \tag{2.18}
$$
by the properties of the shuffle multiplication as discussed above. Applying δ one gets
$$
\begin{aligned}
\delta [(u_1 w_1) \bullet (u_2 w_2)] &= (u_1 w_1 \bullet u_2) \otimes w_2 + (u_2 w_2 \bullet u_1) \otimes w_1 \\
&= (u_1 w_1) \bullet (u_2 \otimes w_2) + (u_2 w_2) \bullet (u_1 \otimes w_1) \\
&= (u_1 w_1) \bullet \delta (u_2 w_2) + (u_2 w_2) \bullet \delta (u_1 w_1).
\end{aligned} \tag{2.19}
$$
Thus δ obeys the Leibniz rule, showing that δ is indeed a differentiation according to Definition 2.17.

For splitting pointed differentiations we have the following lemma:

Lemma 2.26 (Splitting Pointed Differentiation Homomorphisms). Suppose we have a splitting pointed differentiation (d, p) (Definition 2.22). A commutative diagram of K-module morphisms is shown in Figure 2.2 (the double line between the K’s indicates that their values are equal). Assuming that θ̃(1) = 1 and, for all u1 ∈ Sh(Ω), u2 ∈ Ω, θ̂(u1 ⊗ u2) = (θ̃u1)θ̂(1 ⊗ u2), we obtain (θ̃, θ̂) ∈ Diff(δ, 𝜖; d, p).

[Figure 2.2: diagram with rows (K, Sh(Ω), Sh(Ω) ⊗ Ω) and (K, U, Ω), horizontal maps δ, p, d and vertical maps θ̃, θ̂.]
Fig. 2.2: Splitting pointed differential.
Suppose that θ ∈ Hom(Ω, Ω′) generates a homomorphism between the tensor algebras T(Ω) and T(Ω′). Denote this homomorphism by T(θ). This algebra morphism is shuffle product preserving, such that we can write it as Sh(θ). The following special functor can be defined:

Definition 2.27 (Covariant functor to SPD). Let ΔF denote the covariant functor (Definition 2.15) to the category of splitting pointed differentiations (Definition 2.22) on the category of K-modules, which exhibits the properties ΔF(Ω) = (δ, 𝜖) = (δ(Ω), 𝜖(Ω)) and, for θ ∈ Hom(Ω, Ω′), ΔF(θ) = (Sh(θ), Sh(θ) ⊗ θ). A diagrammatic representation of this definition is given in Figure 2.3.

Fig. 2.3: Covariant functor to the category of SPD.

A theorem holds which states that the morphism (θ̃, θ̂) is a unique homomorphism in the category of splitting pointed differentiations:

Theorem 2.28 (Uniqueness). Suppose we have a K-module Ω and a splitting surjective pointed differentiation (d′, p′), where d′ : U′ → Ω′. Then, for θ ∈ Hom(Ω, Ω′), there exists a unique pair (θ̃, θ̂) ∈ Diff(δ, 𝜖; d′, p′) such that for all w ∈ Ω
$$
\hat{\theta}(1 \otimes w) = \theta w .
$$
This theorem demonstrates that ΔF is adjoint to the forgetful functor (Definition 2.16) to the category of K-modules on the category of splitting pointed differentiations, which assigns to each (d, p) the K-module Ω and to each (φ̃, φ̂) ∈ Diff(d, p; d′, p′) the morphism φ of K-modules. Continuing along the same line and taking into account that Sh(Ω) ⊗ Sh(Ω) ⊗ Ω is an Sh(Ω) ⊗ Sh(Ω)-module, and that 𝜖 ⊗ 𝜖 ∈ Alg(Sh(Ω) ⊗ Sh(Ω), K), we come to the following statements:
1. The morphism of K-modules 1 ⊗ δ : Sh(Ω) ⊗ Sh(Ω) → Sh(Ω) ⊗ Sh(Ω) ⊗ Ω is a differentiation.
2. The pair (1 ⊗ δ, 𝜖 ⊗ 𝜖) is a splitting surjective pointed differentiation.
We end the discussion on shuffle algebras and their differentiations by stating the following property of the L-operator, defined in Definition 2.12, with respect to the category of differentiations defined on the shuffle algebras:

Proposition 2.29. (L̃_α, L̂_α) is an equivalence in the category of differentiations D. That is,
$$
\hat{L}_\alpha \delta = \delta \tilde{L}_\alpha .
$$
2.1.2 Chen’s algebraic paths
In this part we introduce the d-paths and d-loops, generalizing the intuitive notion of paths and loops, as it was originally proposed by Chen.
2.1.2.1 Algebraic paths
The whole concept of d-paths is schematically visualized in Figure 2.4. This diagram merges the properties of the shuffle product and algebra on the K-module of one-forms on a manifold M into a unified structure, allowing the construction of algebraic or generalized paths on M. Now we shall discuss the properties of the maps shown in Figure 2.4 and the way they generate the d-paths.

[Figure 2.4: diagram with rows (K, U, Ω), (K, Sh(Ω), Sh(Ω) ⊗ Ω) and (K, Sh(Ω)/I, (Sh(Ω) ⊗ Ω)/δI), horizontal maps p, d, δ, δ1 and vertical maps χ̃0, χ̂0, χ̃, χ̂, ρ̃, ρ̂.]
Fig. 2.4: Path diagram.
The entire structure is built starting from a given pointed differentiation (d, p), which is mapped to the pointed differentiation (δ, 𝜖) by the equivalence of differentiations we introduced in Definition 2.19. The δ in the figure is the differentiation introduced in Definition 2.23, and 𝜖 is the co-unit from the co-algebra structure on Sh(Ω). Anticipating that the notion of an ideal will play an important role, let us give the definition of an ideal and review some of its properties related to kernels of maps.

Definition 2.30 (Ring ideal). Consider a ring K. Its subset A ⊂ K is called an ideal if
1. A is a subgroup of K under the addition;
2. for a ∈ A and k ∈ K one has ka ∈ A.

We shall denote an ideal of Sh(Ω) by I = I(d, p), and (δ1, 𝜖1) stands for the pointed differentiation induced by (δ, 𝜖) after dividing out this ideal. χ̃0 and χ̂0 are the K-module morphisms as defined in Lemma 2.26, such that χ̃0F = pF + dF and χ̂0w = 1 ⊗ w. Given the maps in the above diagram, a d-path from p can be defined as a K-algebra morphism Sh(Ω) → K, factorizable through Sh(Ω)/I. We can also consider sums and products of ideals.

Definition 2.31. Consider two ideals in K, A1 and A2. Then, for a1 ∈ A1, a2 ∈ A2, the set {a1 + a2} is an ideal written as A1 + A2. Similarly, the set of finite sums of products a1a2 is an ideal A1A2.
Note that A1A2 ⊂ A1 ∩ A2. The following property of ideals is important for our purposes.

Proposition 2.32. The kernel of a homomorphism F : A1 → A2 is an ideal in A1.

To prove this powerful statement we first need the following theorem:

Theorem 2.33 (Kernel is a subring). Consider two rings K1 and K2 with the binary operations {+_{1,2}, ∘_{1,2}}. Suppose that we have a ring homomorphism Φ : K1 → K2. Then the kernel of Φ is a subring in K1.

Proof. With respect to addition, a ring homomorphism is a group homomorphism, so its kernel is a subgroup ker Φ ≤ K1, where ≤ denotes subgroup. Let now x1, x2 ∈ ker Φ, then Φ(x1 ∘1 x2) = Φ(x1) ∘2 Φ(x2) = 0_{K2} ∘2 0_{K2} = 0_{K2}. Therefore, x1 ∘1 x2 ∈ ker Φ so that the conditions for a subring are fulfilled.

Now we are in a position to show that the kernel of a homomorphism is indeed an ideal.

Theorem 2.34 (Kernel is an ideal). Let K1,2 be again rings with the corresponding binary operations. Consider a ring homomorphism Φ : K1 → K2. Then the kernel of Φ is an ideal in K1.

Proof. By Theorem 2.33, ker Φ is a subring of K1. Consider x1 ∈ ker Φ, such that Φ(x1) = 0_{K2}. Suppose now that x2 ∈ K1. Then, given that x1 ∈ ker Φ, Φ(x2 ∘1 x1) = Φ(x2) ∘2 Φ(x1) = Φ(x2) ∘2 0_{K2} = 0_{K2}. Taking into account that (now obviously) Φ(x1 ∘1 x2) = 0_{K2} as well, the theorem is proven.

With the aid of ideals we can introduce d-closed differentiations.

Definition 2.35 (d-closed differentiation). Consider a differentiation d : U → Ω. An ideal J of U is called d-closed if dJ is a U-submodule of Ω and if JΩ ⊂ dJ. If J is a d-closed ideal for U, then d induces a differentiation d_J which maps
$$
U/J \to \Omega/dJ . \tag{2.20}
$$
This definition calls for more detailed explanation. We take U to be a K-module, so that U supplied with addition (U, +) is an Abelian group, and we can use elements of K as scalars in the multiplication with elements of U. Expressing this multiplication as a map, we can write K × U → U. We take similarly (Ω, +) to be an Abelian group as a U-module, such that the elements of U now act as scalars. The differentiation d then makes the ideal J a subset of Ω, but also turns it into a U-(sub)module U × dJ → Ω. The term ‘closed’ refers then to the fact that JΩ ⊂ dJ in Ω, where the elements of J are multiplied by the elements of Ω. As it was discussed before, kernels of homomorphisms generate ideals. The proposition below explains how a d-closed ideal is related to the kernel of a homomorphism between pointed differentiations.
Proposition 2.36. Consider the pointed differentiations (d, p) and (d′, p′), of which (d, p) is surjective and (d′, p′) splitting. Hence, if (Φ̃, Φ̂) ∈ Diff(d, p; d′, p′), then ker Φ̃ is a d-closed ideal of U.

Therefore, we see that J from Definition 2.35 becomes J → ker Φ̃.
2.1.2.2 Chen iterated integrals as extension of line integrals
Given that the concept of ideals is introduced and their relation to kernels of homomorphisms is clarified, we are in a position to study an ideal in the shuffle algebra. To this end, we define an extension of the notion of line integrals, so-called Chen iterated integrals.

Definition 2.37 (Chen iterated integrals). Consider a line integral along the path 𝛾(t) from the point p ∈ 𝛾 to q ∈ 𝛾:
$$
I_i[\gamma] = \int_p^q dx_i(t) = x_i(q) - x_i(p). \tag{2.21}
$$
Being generalized by recursion for n ≥ 2, it gives
$$
I_{i_1 \cdots i_n}[\gamma] = \int_p^q dx_{i_n}(t) \, I_{i_1 \cdots i_{n-1}}[\gamma_t], \tag{2.22}
$$
where 𝛾t represents the part of the path 𝛾 for which the path parameter runs from 0 to t (or, equivalently, the coordinates along the path vary from the point p to the point 𝛾(t)).

The above definition depends, however, on the coordinates, which is not desirable in a manifold setting with coordinate independent equations. One can give an alternative definition that is free of explicit coordinate dependence.

Definition 2.38 (Chen iterated integrals without coordinates). Consider a smooth n-dimensional manifold M, the set PM of piecewise-smooth paths 𝛾 : I → M, where I = [0, 1],
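and real one-forms ω1, ω2, ⋯, ωn ∈ ⋀M. Using the notation
$$
\omega_1 \otimes \omega_2 \otimes \cdots \otimes \omega_n = \omega_1 \cdots \omega_n, \qquad \omega_k(t) \equiv \omega_k[\gamma(t)] \cdot \dot{\gamma}(t), \qquad \gamma_t : I \to M, \ \gamma_t(t') \equiv \gamma(t t'),
$$
we can define the iterated line integrals by induction:
$$
\int_\gamma \omega_1 = \int_0^1 \omega_1(t) \, dt ,
$$
$$
\int_\gamma \omega_1 \omega_2 = \int_0^1 \Big[ \int_0^t \omega_1(t_1) \, dt_1 \Big] \omega_2(t) \, dt
$$
and, generically,
$$
\int_\gamma \omega_1 \omega_2 \cdots \omega_n = \int_0^1 \Big[ \int_{\gamma_t} \omega_1 \cdots \omega_{n-1} \Big] \omega_n(t) \, dt . \tag{2.23}
$$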
The following property of the Chen iterated integrals helps us to derive the form of the elements of an ideal of the shuffle algebra.

Proposition 2.39 (Chen iterated integrals preserve multiplication). Consider again a piecewise-smooth path 𝛾 in the manifold M and let Ω be the set of one-forms on M. If we interpret 𝛾 as the map
$$
\gamma : T(\Omega) \to \mathbb{R}, \qquad \omega_1 \omega_2 \cdots \omega_n \mapsto \int_\gamma \omega_1 \cdots \omega_n ,
$$
then this map preserves multiplication:
$$
\int_\gamma \omega_1 \cdots \omega_k \int_\gamma \omega_{k+1} \cdots \omega_{k+l} = \int_\gamma (\omega_1 \cdots \omega_k) \bullet (\omega_{k+1} \cdots \omega_{k+l}). \tag{2.24}
$$
This means that the map 𝛾 is a homomorphism.

Exercise 2.40. Prove Proposition 2.39. Hint: use induction.

This proposition can be straightforwardly extended to one-forms taking values in ℂ or in GL(n, ℂ). Then, the shuffle multiplication is replaced by the matrix multiplication, where the matrix entries get multiplied by means of the shuffle multiplication.
It is worth noting that the extension of Proposition 2.39 to Lie algebra valued one-forms allows one to use Chen iterated integrals in the principal fiber bundle setting when solving the parallel transport equation in gauge theory. Let us give some simple examples of how the above proposition actually works.

Example 2.41.
$$
\int_\gamma \omega_1 \int_\gamma \omega_2 = \int_\gamma \omega_1 \bullet \omega_2 = \int_\gamma (\omega_1 \omega_2 + \omega_2 \omega_1), \tag{2.25}
$$
$$
\int_\gamma \omega_1 \int_\gamma \omega_2 \omega_3 = \int_\gamma \omega_1 \bullet \omega_2 \omega_3 = \int_\gamma (\omega_1 \omega_2 \omega_3 + \omega_2 \omega_1 \omega_3 + \omega_2 \omega_3 \omega_1). \tag{2.26}
$$
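The two identities of Example 2.41 can be tested numerically. The following is a minimal sketch, assuming an illustrative path 𝛾(t) = (t, t²) in the plane and the one-forms ω1 = dx, ω2 = x dy, ω3 = y dx (none of these choices come from the text); the iterated integral (2.23) is approximated by a nested trapezoidal rule.

```python
def iterated_integral(pullbacks, n=2000):
    """Approximate the iterated integral (2.23) over [0, 1] by the trapezoidal rule.

    pullbacks: list of functions t -> ω_k(t), ordered as ω_1, ..., ω_r.
    """
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    inner = [1.0] * (n + 1)              # the empty word integrates to 1
    for f in pullbacks:
        integrand = [inner[i] * f(ts[i]) for i in range(n + 1)]
        acc = [0.0] * (n + 1)
        for i in range(1, n + 1):        # running integral from 0 to t_i
            acc[i] = acc[i - 1] + 0.5 * h * (integrand[i - 1] + integrand[i])
        inner = acc
    return inner[-1]

# Pullbacks ω_k(t) = ω_k[γ(t)] · γ'(t) along the assumed path γ(t) = (t, t²)
def w1(t): return 1.0            # ω1 = dx
def w2(t): return 2.0 * t * t    # ω2 = x dy
def w3(t): return t * t          # ω3 = y dx

# Equation (2.25)
lhs_225 = iterated_integral([w1]) * iterated_integral([w2])
rhs_225 = iterated_integral([w1, w2]) + iterated_integral([w2, w1])

# Equation (2.26)
lhs_226 = iterated_integral([w1]) * iterated_integral([w2, w3])
rhs_226 = (iterated_integral([w1, w2, w3]) + iterated_integral([w2, w1, w3])
           + iterated_integral([w2, w3, w1]))

print(lhs_225, rhs_225)
print(lhs_226, rhs_226)
```

For this choice the exact values are 2/3 for (2.25) and 1/9 for (2.26), so both printed pairs should agree up to the discretization error of the quadrature.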
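Suppose now that 𝛾 is a map as defined in Proposition 2.39 and F ∈ U. We obtain
$$
F[\gamma(t)] = F[\gamma(0)] + \int_0^t dF = pF + \int_0^t dF , \tag{2.27}
$$
$$
\int_\gamma F\omega_1 = \int_\gamma \Big( pF + \int_0^t dF \Big) \omega_1 = \int_\gamma dF \, \omega_1 + pF \int_\gamma \omega_1 , \tag{2.28}
$$
$$
\int_\gamma \omega_1 (F\omega_2) = \int_0^1 \Big( \int_0^t \omega_1 \int_0^t dF \Big) \omega_2[\gamma(t)] \, dt + pF \int_0^1 \Big( \int_0^t \omega_1 \Big) \omega_2[\gamma(t)] \, dt \tag{2.29}
$$
$$
= \int_\gamma (\omega_1 \bullet dF) \omega_2 + pF \int_\gamma \omega_1 \omega_2 , \tag{2.30}
$$
where we defined pF ≡ F[𝛾(0)]. Therefore, the general expression reads
$$
\int_\gamma \omega_1 \cdots \omega_{i-1} (F\omega_i) \omega_{i+1} \cdots \omega_n = \int_\gamma [(\omega_1 \cdots \omega_{i-1}) \bullet dF] \, \omega_i \cdots \omega_n + pF \int_\gamma \omega_1 \cdots \omega_n , \tag{2.31}
$$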
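where the integrals are Chen iterated integrals as defined in Definition 2.38. This expression can again be extended to other one-forms. The following proposition holds:

Proposition 2.42. For all F ∈ C∞(M) and ω1, ⋯, ωn ∈ ⋀M, one has
$$
\int_\gamma dF \cdot \omega_1 \cdots \omega_n = \int_\gamma (F \cdot \omega_1) \omega_2 \cdots \omega_n - F[\gamma(0)] \cdot \int_\gamma \omega_1 \cdots \omega_n ,
$$
$$
\int_\gamma \omega_1 \cdots \omega_n \cdot dF = \Big( \int_\gamma \omega_1 \cdots \omega_n \Big) \cdot F[\gamma(1)] - \int_\gamma \omega_1 \cdots \omega_{n-1} \cdot (\omega_n \cdot F) ,
$$
$$
\int_\gamma \omega_1 \cdots \omega_{i-1} \cdot (dF) \cdot \omega_{i+1} \cdots \omega_n = \int_\gamma \omega_1 \cdots \omega_{i-1} \cdot (F \cdot \omega_{i+1}) \cdot \omega_{i+2} \cdots \omega_n - \int_\gamma \omega_1 \cdots (\omega_{i-1} \cdot F) \cdot \omega_{i+1} \cdots \omega_n ,
$$
$$
\int_\gamma \omega_1 \cdots \omega_{i-1} \cdot (F \cdot \omega_i) \cdot \omega_{i+1} \cdots \omega_n = F[\gamma(0)] \cdot \int_\gamma \omega_1 \cdots \omega_n + \int_\gamma [(\omega_1 \cdots \omega_{i-1}) \bullet dF] \cdot \omega_i \cdots \omega_n .
$$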
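The first two identities of Proposition 2.42 can be illustrated numerically. The sketch below assumes the path 𝛾(t) = (t, t²), the function F(x, y) = 1 + xy and the one-forms ω1 = dx, ω2 = x dy; these are illustrative choices, not taken from the text, and the quadrature is a plain nested trapezoidal rule.

```python
def iterated_integral(pullbacks, n=2000):
    """Approximate the iterated integral (2.23) over [0, 1] by the trapezoidal rule."""
    h = 1.0 / n
    ts = [i * h for i in range(n + 1)]
    inner = [1.0] * (n + 1)              # the empty word integrates to 1
    for f in pullbacks:
        integrand = [inner[i] * f(ts[i]) for i in range(n + 1)]
        acc = [0.0] * (n + 1)
        for i in range(1, n + 1):        # running integral from 0 to t_i
            acc[i] = acc[i - 1] + 0.5 * h * (integrand[i - 1] + integrand[i])
        inner = acc
    return inner[-1]

# Assumed data: γ(t) = (t, t²), F(x, y) = 1 + x y, so F[γ(t)] = 1 + t³
def F(t): return 1.0 + t ** 3
def dF(t): return 3.0 * t * t        # pullback of dF along γ
def w1(t): return 1.0                # ω1 = dx
def w2(t): return 2.0 * t * t        # ω2 = x dy
def F_w1(t): return F(t) * w1(t)     # the one-form F·ω1
def w2_F(t): return w2(t) * F(t)     # the one-form ω2·F

# First identity: ∫ dF ω1 ω2 = ∫ (F ω1) ω2 − F[γ(0)] ∫ ω1 ω2
first_lhs = iterated_integral([dF, w1, w2])
first_rhs = iterated_integral([F_w1, w2]) - F(0.0) * iterated_integral([w1, w2])

# Second identity: ∫ ω1 ω2 dF = (∫ ω1 ω2) F[γ(1)] − ∫ ω1 (ω2 F)
second_lhs = iterated_integral([w1, w2, dF])
second_rhs = iterated_integral([w1, w2]) * F(1.0) - iterated_integral([w1, w2_F])

print(first_lhs, first_rhs)
print(second_lhs, second_rhs)
```

With these choices the exact values are 1/14 and 3/14 respectively; F[𝛾(0)] = 1 is deliberately nonzero so that the boundary term pF contributes.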
In Section 2.1.3 we shall discuss these integrals and their further properties in detail, but for the moment the above is sufficient to construct an ideal in the shuffle algebra. We learned from Proposition 2.39 that the map 𝛾 : T(Ω) → ℝ is a homomorphism. If one brings all the terms in equation (2.31) to the l.h.s., then the r.h.s. becomes zero. In this way the new l.h.s. becomes an element of the kernel of the homomorphism 𝛾. From Theorem 2.34 we know that this operation generates an ideal on T(Ω), and hence also on the shuffle algebra Sh(Ω), according to Proposition 2.39. This suggests that the K-submodule spanned by the elements
$$
I(d, p) \equiv u_1 (Fw) u_2 - (u_1 \bullet dF) w u_2 - (pF) u_1 w u_2 \tag{2.32}
$$
is an ideal in Sh(Ω) with p ∈ Alg(U, K) and u1, u2 ∈ T(Ω), w ∈ T¹(Ω), F ∈ U. We still need to prove that this is actually an ideal of the K-algebra Sh(Ω).

Lemma 2.43. The K-submodule I(d, p) is an ideal of the K-algebra Sh(Ω).

Proof. Given u1, u2 ∈ T(Ω), w ∈ T¹(Ω), F ∈ U, w1, ⋯, wn ∈ T¹(Ω),
we set W_i = w1⋯w_{n−i}, W^i = w_{n−i+1}⋯wn, and W_n = W^0 = 1. Then:
$$
\begin{aligned}
(w_1 \cdots w_n) \bullet (u_1 (Fw) u_2) &= \sum_i (W_i \bullet u_1)(Fw)(W^i \bullet u_2), \\
(w_1 \cdots w_n) \bullet [(u_1 \bullet dF) w u_2] &= \sum_i (W_i \bullet u_1 \bullet dF) \, w \, (W^i \bullet u_2), \\
(w_1 \cdots w_n) \bullet (u_1 w u_2) &= \sum_i (W_i \bullet u_1) \, w \, (W^i \bullet u_2),
\end{aligned}
$$
as is clear from the definition of the shuffle product. This allows us to write
$$
F_p(F, w, u_1, u_2) = u_1 (Fw) u_2 - (u_1 \bullet dF) w u_2 - (pF) u_1 w u_2 ,
$$
and
$$
(w_1 \cdots w_n) \bullet F_p(F, w, u_1, u_2) = \sum_i F_p(F, w, W_i \bullet u_1, W^i \bullet u_2) \in I , \tag{2.33}
$$
which by Definition 2.30 turns it into an ideal. Note that 1 ∉ I, so that the factor algebra Sh(Ω)/I is again a commutative unitary K-algebra.

Having the above ideal of Sh(Ω), we are now ready to introduce Chen’s d-paths.

Definition 2.44 (d-path). A d-path 𝛾 from p is an element of Alg(Sh(Ω), K), such that 𝛾I = {0}.

From the discussion on Chen integrals and their relation to the shuffle algebra’s ideal, it is clear that if one takes such an integral over an element of the ideal I(d, p), this will return zero, in addition making the link with the ideal being the kernel of the map ∫_𝛾. This is consistent with the definition of a d-path 𝛾 where one needs to have
$$
\gamma[I(d, p)] = \int_\gamma I(d, p) = 0 .
$$
In other words, Chen integrals can be considered as d-paths. We shall come back to this point in Section 2.1.3. Notice that the homomorphisms induced by the Chen iterated integrals also preserve the algebraic (shuffle) structure. Hence, they can also be considered as algebra morphisms, where we denote the resulting algebra by Ap . This leads us to the following remark, which will become relevant when introducing generalized loop space.
Remark 2.45. The kernel of the algebra map Sh(Ω) → Ap, when considering closed d-paths (i.e., loops), not only contains the ideal of the shuffle algebra, but also dC∞(M), which we denote by ⟨dC⟩. This generates a new ideal in Sh(Ω) when considered on the space of closed paths at p: Jp = Ip + ⟨dC⟩, such that for d-loops we have the isomorphism Sh(Ω)/Jp ≅ Ap.

The following proposition relates the d-closed property to the ideal.

Proposition 2.46 (Least δ-closed ideal). I is the least δ-closed ideal of Sh(Ω) which is contained in ker 𝜖 and for F ∈ U, w ∈ Ω contains all Fw − dFw − (pF)w.

Proof. We start by determining that the intersection of two δ-closed ideals is not necessarily δ-closed. It is easy to see that 𝜖I = 0 and that
$$
I \bullet (Sh(\Omega) \otimes \Omega) \subset (I \bullet Sh(\Omega)) \otimes \Omega \subset I \otimes \Omega \subset \delta I , \tag{2.34}
$$
with δ as defined in Definition 2.23, next to
$$
Sh(\Omega) \bullet \delta I \subset \delta (Sh(\Omega) \bullet I) + I \bullet \delta Sh(\Omega) \subset \delta I . \tag{2.35}
$$
We conclude that I is a δ-closed ideal of Sh(Ω). Consider now another δ-closed ideal of Sh(Ω), I′, which is contained in ker 𝜖 and itself contains all elements of the form F_p(F, w, 1, 1) = Fw − dFw − (pF)w. Then δI′ includes all
$$
u \bullet \delta F_p(F, w, 1, 1) = \delta F_p(F, w, u, 1), \tag{2.36}
$$
such that F_p(F, w, u, 1) ∈ I′, by virtue that I′ ⊂ ker 𝜖.
Taking into account that for n ≥ 1
$$
\delta F_p(F, w, u, w_1 \cdots w_n) = F_p(F, w, u, w_1 \cdots w_{n-1}) \otimes w_n \in I' \otimes w_n \subset \delta I' ,
$$
we obtain by induction
$$
F_p(F, w, u, w_1 \cdots w_n) \in I' . \tag{2.37}
$$
We now introduce the following notation for the canonical morphisms:
$$
\begin{aligned}
\delta_I = \delta_1 = \delta_1(d, p) &: Sh(\Omega)/I \to (Sh(\Omega) \otimes \Omega)/\delta I , \\
\tilde{\rho} &: Sh(\Omega) \to Sh(\Omega)/I , \\
\hat{\rho} &: Sh(\Omega) \otimes \Omega \to (Sh(\Omega) \otimes \Omega)/\delta I .
\end{aligned} \tag{2.38}
$$
By virtue that 𝜖I = 0, we have that 𝜖 has a unique factorization through the ideal: 𝜖 = 𝜖1ρ̃, 𝜖1 ∈ Alg(Sh(Ω)/I, K). Obviously, ker δ1 ∩ ker 𝜖1 = 0, so that a pair (δ1, 𝜖1) = (δ1(d, p), 𝜖1(d, p)) is a splitting surjective pointed differentiation, and (ρ̃, ρ̂) ∈ Diff(δ, 𝜖; δ1, 𝜖1). Using the definition displayed in Figure 2.4 we find δχ̃0 = χ̂0d. Note that in general this is not the case: (χ̃0, χ̂0) ∉ Diff(δ, 𝜖; δ1, 𝜖1). This indicates that we need an extra condition, provided by the following theorem.

Theorem 2.47. Let (d′, p′) be a splitting pointed differentiation and (θ̃, θ̂) ∈ Diff(δ, 𝜖; d′, p′). We find that (θ̃χ̃0, θ̂χ̂0) ∈ Diff(d, p; d′, p′) if and only if I ⊂ ker θ̃.
From this theorem one can derive two corollaries that make the diagram in Figure 2.4, defining the mathematical construct for d-paths, consistent.

Corollary 2.48. From χ̃ = ρ̃χ̃0 and χ̂ = ρ̂χ̂0 it follows that (χ̃, χ̂) ∈ Diff(d, p; δ1, 𝜖1).

Corollary 2.49. Given a splitting pointed differentiation (d′, p′) and (θ̃, θ̂) ∈ Diff(δ, 𝜖; d′, p′), such that (θ̃χ̃0, θ̂χ̂0) ∈ Diff(d, p; d′, p′), we get that there is a unique (Θ̃, Θ̂) ∈ Diff(δ1, 𝜖1; d′, p′) with
$$
(\tilde{\Theta} \tilde{\chi}, \hat{\Theta} \hat{\chi}) = (\tilde{\theta} \tilde{\chi}_0, \hat{\theta} \hat{\chi}_0).
$$
Using the ideal I(d, p) of the shuffle algebra, any d-path 𝛾 which begins at the point p can be factorized through 𝛾′ ∈ Alg(Sh(Ω)/I, K) as 𝛾 = 𝛾′ρ̃. With the aid of this factorization we obtain q = 𝛾χ̃0 = 𝛾′χ̃ ∈ Alg(U, K). We call p and q the initial and end (terminal) points of 𝛾, with 𝛾 being the d-path from p to q. If 𝛾 is such a d-path from p to q, then 𝛾(dF) = 𝛾(χ̃0F − pF) = qF − pF, which follows from the factorization through the ideal I(d, p). The following proposition states that, under certain assumptions about the scalars in K, the initial point of the d-path is unique.

Proposition 2.50 (Unique initial point). Consider an integral domain K (that is, a commutative ring wherein the product of any nonzero elements is nonzero). The initial point of a d-path, provided that 𝛾 ≠ 𝜖, is unique.
Proof. Consider 𝛾 to be a d-path from p as well as from p′. By virtue that 𝛾 ≠ 𝜖, we find that there exist w ∈ T¹(Ω), w′ ∈ T(Ω) for which 𝛾(ww′) ≠ 0. If now F is an element of U, we get
$$
\gamma[(Fw)w'] = \gamma(dF \, w w') + (pF)\gamma(w w') = \gamma(dF \, w w') + (p'F)\gamma(w w').
$$
From this it follows that pF = p′F. We indeed have a unique initial point for the d-path 𝛾.

It might seem that the algebraic structures introduced above depend on the choice of the initial point of the d-path. The following lemma shows that this is not the case, i.e., the algebraic structure is preserved under translation of the path to another initial point.

Lemma 2.51. Consider a d-path 𝛾 from p to q. Then the L-operator acts as
$$
\tilde{L}_\gamma I(d, p) = I(d, q). \tag{2.39}
$$

Proof. We have learned from Proposition 2.29 that (L̃_𝛾, L̂_𝛾) is an equivalence in the category of differentiations D, such that L̃_𝛾 I(d, p) is indeed a δ-closed ideal of Sh(Ω). Hence,
$$
\tilde{L}_\gamma (Fw - dF \, w - (pF)w) = (\gamma \otimes 1)\Delta(Fw - dF \, w - (pF)w).
$$
Then one gets
$$
\begin{aligned}
\Delta(Fw) &= Fw \otimes 1 + 1 \otimes Fw , \\
(\gamma \otimes 1)\Delta(Fw) &= \gamma(Fw) + Fw , \\
\Delta(dF \, w) &= 1 \otimes dF \, w + dF \, w \otimes 1 + dF \otimes w , \\
(\gamma \otimes 1)\Delta(dF \, w) &= dF \, w + \gamma(dF \, w) + \gamma(dF) w , \\
\Delta(pF \, w) &= pF \, w \otimes 1 + 1 \otimes pF \, w , \\
(\gamma \otimes 1)\Delta(pF \, w) &= (pF)\gamma(w) + pF \, w .
\end{aligned} \tag{2.40}
$$
Summing all the above and taking into account that 𝛾 is a d-path, so that 𝛾(dF) = qF − pF, we get
$$
\begin{aligned}
\tilde{L}_\gamma (Fw - dF \, w - (pF)w) &= \gamma(Fw) + Fw - dF \, w - \gamma(dF \, w) - \gamma(dF) w - (pF)\gamma(w) - pF \, w \\
&= Fw + \gamma(Fw - dF \, w - pF \, w) - qF \, w - dF \, w + pF \, w - pF \, w \\
&= Fw - dF \, w - (qF) w ,
\end{aligned} \tag{2.41}
$$
where we used 𝛾(I) = 0, so that
$$
I(d, q) \subset \tilde{L}_\gamma I(d, p).
$$
Similarly, we obtain for L̃_𝛾⁻¹
$$
I(d, p) \subset \tilde{L}_\gamma^{-1} I(d, q),
$$
so that
$$
\tilde{L}_\gamma I(d, p) \subset I(d, q).
$$
This shows that L̃_𝛾 I(d, p) = I(d, q).
The meaning of the L-operator is now clear: it is the operator associated with a path 𝛾 from p to q that translates the algebra ideal I(d, p) at p to the algebra ideal I(d, q) at q, which is the endpoint of the d-path 𝛾. With Definition 2.10 for the product of two algebra homomorphisms 𝛾1, 𝛾2 ∈ Alg(Sh(Ω), K) we can introduce products of d-paths and inverse d-paths. As this multiplication turned the algebra homomorphisms into a group, the same will also be true for d-paths.

Theorem 2.52. Suppose we have 𝛾1 to be a d-path from p to q and 𝛾2 to be a d-path from q to q′. In this case, 𝛾12 = 𝛾1𝛾2 is a d-path from p to q′, and 𝛾1⁻¹ is a d-path from q to p.

Proof. Given that
$$
\gamma_{12} I(d, p) = \gamma_1 \gamma_2 I(d, p) = \gamma_2 \tilde{L}_{\gamma_1} I(d, p) = \gamma_2 I(d, q) = 0 , \tag{2.42}
$$
we see that 𝛾12 = 𝛾1𝛾2 is a d-path from p. For F ∈ U,
$$
\gamma_{12}(dF) = (\gamma_1 \gamma_2)(dF) = \gamma_1(dF) + \gamma_2(dF) = q'F - pF . \tag{2.43}
$$
That is, q′ is the endpoint of 𝛾12. Next to this we also have
$$
\gamma_1^{-1} I(d, q) = \gamma_1^{-1} \tilde{L}_{\gamma_1} I(d, p) = (\gamma_1 \gamma_1^{-1}) I(d, p) = \epsilon \, I(d, p) = 0 \tag{2.44}
$$
and
$$
\gamma_1^{-1}(dF) = -\gamma_1(dF) = pF - qF . \tag{2.45}
$$
Hence, 𝛾1−1 is a d-path from q to p. Given that the d-paths form a group under the above multiplication, one is able to construct the group of generalized loops.
2.1.2.3 Connectedness

In the previous sections we have formally introduced Chen's generalization of the intuitive idea of paths in a given space. As in the case of intuitive paths, we can now ask whether a given space is connected with respect to these generalized paths. If this turns out to be true, we shall say that the space is d-connected, by analogy with path-connected spaces.

Definition 2.53 (d-connectedness). $U$ is called d-connected if for arbitrary $p, q \in \mathrm{Alg}(U, K)$ there always is a d-path from $p$ to $q$.

From topology we know that continuous maps transform path-connected spaces to path-connected spaces. The d-path counterpart of this statement is given by the following proposition:

Proposition 2.54 (Maps between d-connected spaces). Consider $(\tilde\varphi, \hat\varphi) \in \mathrm{Diff}(d, d')$. If $U'$ is $d'$-connected and if $\tilde\varphi$ generates a surjective map $\mathrm{Alg}(U', K) \to \mathrm{Alg}(U, K)$, then $U$ is d-connected.

Recall that the d-paths are defined by means of a differentiation $(d, p)$ that returns the ideal $I(d, p)$ on which the d-path vanishes. Not surprisingly, if two points lie in a d-connected space, i.e., they can be connected by a d-path, their differentiations are equivalent.

Proposition 2.55 (Equivalence of differentiations). Suppose that $U$ is d-connected. For all $p, q \in \mathrm{Alg}(U, K)$ the differentiation operations $\delta_1(d, p)$ and $\delta_1(d, q)$ are equivalent.
Exercise 2.56. Although this statement seems trivial at first, why is it not so trivial? How can one interpret this equivalence in the principal fiber bundle context?

As for usual topological spaces, we can define discrete points with respect to d-paths.

Definition 2.57 (d-discrete point). A point $p \in \mathrm{Alg}(U, K)$ is a d-discrete point if there is no d-path $\gamma \neq \epsilon$ starting at it.

Using these discrete points we can determine when a $w \in \Omega$ may be called trivial.

Definition 2.58 (P-triviality). Suppose that $P \subset \mathrm{Alg}(U, K)$ contains at least one d-nondiscrete point. Then $w \in \Omega$ is called P-trivial if, for all d-paths $\gamma$ from a point $p \in P$ and for all $u_1, u_2 \in T(\Omega)$, $F \in U$, we have $\gamma(u_1 (Fw) u_2) = 0$. Then $F \in U$ is called P-trivial if $dF$ is P-trivial and if $Fw$ is P-trivial for all $w \in \Omega$.

Exercise 2.59. Show that not every point of $\Omega$ is P-trivial and that $1 \in U$ is not P-trivial.

Let $U_P$ stand for the quotient K-algebra of $U$ over the ideal of the P-trivial elements of $U$, and $\Omega_P$ stand for the quotient of the U-module $\Omega$ over the U-submodule of the P-trivial elements of $\Omega$. Then $\Omega_P$ is a $U_P$-module. The differentiation $d$ maps the ideal of the P-trivial elements of $U$ into the submodule of the P-trivial elements of $\Omega$ and thus generates the following differentiation:

Definition 2.60. $d_P : U_P \to \Omega_P$.

Consider the canonical homomorphisms $\tilde\pi_P \in \mathrm{Alg}(U, U_P)$ and $\hat\pi_P \in \mathrm{Hom}(\Omega, \Omega_P)$. Hence, we get $\pi_P = (\tilde\pi_P, \hat\pi_P) \in \mathrm{Diff}(d, d_P)$. $U_P$, $\Omega_P$ and $\pi_P$ depend only on the d-nondiscrete points of $P$. An injective map $\mathrm{Alg}(U_P, K) \to \mathrm{Alg}(U, K)$ is generated by the projection $\tilde\pi_P$.

Proposition 2.61. Consider an integral domain $K$. The set of all the d-nondiscrete points of $P$ is contained in the image of the injective map $\mathrm{Alg}(U_P, K) \to \mathrm{Alg}(U, K)$.

Since we shall only be interested in the nontrivial elements, we can introduce reduced spaces that only contain such nontrivial elements. The purpose of this reduction will become clear when we return to the properties of Chen iterated integrals and their relation to d-paths in Section 2.1.3.

Definition 2.62 (d-reduced). $U$ is called d-reduced if in $\mathrm{Alg}(U, K)$ there exists at least one d-nondiscrete point and the only $\mathrm{Alg}(U, K)$-trivial element of $U$ is zero.
2.1.2.4 d-Loops

Before returning to Chen iterated integrals we comment a bit more on d-loops. Generalized loops or d-loops can be naturally defined as d-paths whose initial point and endpoint coincide, but where one needs to complete the ideal with the set $\{dU\}$. This becomes clear when one considers Chen integrals as d-paths: since they return zero over this set, the set $\{dU\}$ can indeed be added to the algebra ideal $I(d, p)$.⁴

4 See also Remark 2.45.

Definition 2.63 (d-loop). A d-loop from $p$ is defined as a d-path which begins and ends at the same point $p$. Here $\{dU\}$ stands for the ideal of $Sh(\Omega)$ generated by $dU \subset T^1(\Omega)$.

Therefore $\gamma \in \mathrm{Alg}(Sh(\Omega), K)$ is a d-loop from $p$ if and only if $\gamma$ annuls the ideal $I(d, p) + \{dU\}$ of $Sh(\Omega)$. In what follows we shall use the notation $Shc(d, p)$ for the quotient K-algebra $Sh(\Omega)/I$ with $I = I(d, p) + \{dU\}$. Notice that $Shc(d, p)$ is commutative and unitary. There exists a canonical bijective map from the set $\mathrm{Alg}(Shc(d, p), K)$ to the set of d-loops. Using again the multiplication from Definition 2.10 and Theorem 2.52, it is easy to see that the d-loops also form a group. In Section 2.1.1.2 we have shown that $Sh(\Omega)$ is a Hopf algebra. The same is true for $Shc$:

Theorem 2.64. $Shc(d, p)$ is a Hopf K-algebra with a co-multiplication $\Delta_c$, a co-unit $\epsilon_c$ and an antipode $J_c$, generated by $\Delta$, $\epsilon$ and $J$.

Considering loops in topological spaces one usually discusses the fundamental group. One of the nice properties of the fundamental group is that it is independent of the base point of the loops. In the case of d-loops we have similar properties, namely that the Hopf-algebra structure and the group structure of $Shc$ are independent of the base point of the loops. The following proposition holds:

Proposition 2.65. Suppose we have a d-path from $p$ to $q$. Then the Hopf K-algebras $Shc(d, p)$ and $Shc(d, q)$ are isomorphic.

It follows directly from this proposition that, for the same path, the group of d-loops from $p$ is isomorphic with the group of d-loops from $q$.

We have already introduced Chen's d-paths and d-loops as algebra morphisms. We also discussed some of their properties, emphasizing ideals of algebra morphisms. The shuffle algebra ideal was constructed by using Chen's generalization of line integrals. We have presented some of the properties of Chen iterated integrals that will be used for introducing the group of generalized loops.
2.1.3 Chen iterated integrals

2.1.3.1 d-loops and Chen iterated integrals

Let us discuss the relationship between Chen's integrals and the d-loops. From Remark 2.45 we learn that the integral algebra $A_p$ is isomorphic to the algebra $Shc(d, p)$. A d-loop $\gamma$ is then considered as an algebra morphism in $\mathrm{Alg}(Sh(\Omega), K)$ that vanishes on the ideal $I(d, p) + \{dU\}$. On the other hand, this ideal is also an ideal in the algebra of Chen iterated loop integrals $A_p$, by definition. The isomorphism of both algebras then enables one to identify a d-loop with an element of $A_p^*$, the dual space of $A_p$ formed by the real (complex, $GL(n, \mathbb{C})$) valued linear functionals on $A_p$, inducing the identification:
\[ \mathrm{Alg}(Shc(d, p), K) \ni \gamma \to \oint_\gamma \in A_p^*. \tag{2.46} \]
This property, in combination with its relevance to the solution of the parallel transport equation in a principal fiber bundle (see Section 2.4), is the reason that we are interested in the Chen integrals. In the principal fiber bundle setting we shall assume the one-forms used in the functionals $X^{\omega_1\cdots\omega_n}$ (see equation (2.52)) to be Lie algebra-valued. In other words, we shall assume $\omega_i \in \bigwedge M \otimes gl(\mathfrak{g})$, where $gl$ is a matrix representation (i.e., an element of $GL(n, \mathbb{C})$) of the Lie algebra $\mathfrak{g}$, which explains the presence of $\omega_i \in \bigwedge M \otimes GL(n, \mathbb{C})$ in many of the previous and following definitions and properties of Chen iterated integrals.
2.1.3.2 Chen iterated integrals: properties

In Section 2.1.2 we introduced Chen iterated integrals (Definitions 2.37 and 2.38) and discussed some of their properties. Here we extend the list of properties of these integrals, which will be relevant for the construction of generalized loop space. Let us begin with answering several elementary questions concerning the behavior of the Chen integrals.

Exercise 2.66. What is the behavior of the Chen integrals if we take into account intermediate points along the path?

This question is answered by the following lemma:

Lemma 2.67 (Intermediate points). Suppose we have the three points $p \le c \le q$ and the line integrations along the paths $\gamma^c$ and $\gamma_c$ defined as
\[ \gamma^c \sim \int_p^c, \quad\text{and}\quad \gamma_c \sim \int_c^q. \]
Then we find that
\[ I_{i_1\cdots i_n}[\gamma] = I_{i_1\cdots i_n}[\gamma^c] + I_{i_1\cdots i_{n-1}}[\gamma^c]\, I_{i_n}[\gamma_c] + \cdots + I_{i_1\cdots i_k}[\gamma^c]\, I_{i_{k+1}\cdots i_n}[\gamma_c] + \cdots + I_{i_1\cdots i_n}[\gamma_c]. \tag{2.47} \]
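As a small worked instance of Lemma 2.67 (a sketch only, writing $\gamma^c$ for the part of the path from $p$ to $c$ and $\gamma_c$ for the part from $c$ to $q$, and assuming the usual ordering $p \le t_1 \le t_2 \le q$ of the integration variables), the cases $n = 1$ and $n = 2$ read:

```latex
% n = 1: an ordinary line integral is additive in the path
I_{i_1}[\gamma] = I_{i_1}[\gamma^c] + I_{i_1}[\gamma_c],

% n = 2: split the domain p <= t_1 <= t_2 <= q into three regions,
%   t_1 <= t_2 <= c,   t_1 <= c <= t_2,   c <= t_1 <= t_2
I_{i_1 i_2}[\gamma]
  = I_{i_1 i_2}[\gamma^c]
  + I_{i_1}[\gamma^c]\, I_{i_2}[\gamma_c]
  + I_{i_1 i_2}[\gamma_c].
```

In the middle region the two integration variables lie on different pieces of the path, which is why the corresponding term factorizes into a product of two lower-order integrals.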
Notice that these integrals, as extensions of line integrals, are reparameterization invariant if the reparameterization preserves the orientation.

Proposition 2.68 (Reparameterization).
\[ \int_\gamma \omega_1\cdots\omega_n \]
is invariant under orientation-preserving reparameterizations.

Having this property in mind, we shall investigate how the integrals behave when we combine two paths $\gamma_1$ and $\gamma_2$, with the endpoint of $\gamma_1$ being the starting point of $\gamma_2$, which we denote by $c$. Combining the paths we create the path $\gamma_{12} = \gamma_1\gamma_2$, where $\gamma_2$ goes after $\gamma_1$. Applying Lemma 2.67 to the new path $\gamma_{12}$ with an intermediate point $c$, we immediately find out how to deal with combined paths:

Lemma 2.69 (Combining paths).
\[ I_{i_1\cdots i_n}[\gamma_{12}] = I_{i_1\cdots i_n}[\gamma_1] + I_{i_1\cdots i_{n-1}}[\gamma_1]\, I_{i_n}[\gamma_2] + \cdots + I_{i_1\cdots i_k}[\gamma_1]\, I_{i_{k+1}\cdots i_n}[\gamma_2] + \cdots + I_{i_1\cdots i_n}[\gamma_2] \]
or, in the notation of Definition 2.38:

Proposition 2.70 (Composition of paths). Given that $\gamma_1, \gamma_2 \in PM$, the space of paths in the real smooth manifold $M$, i.e.,
\[ \gamma_1, \gamma_2 : [0, 1] \to M \tag{2.48} \]
with $\gamma_1(1) = \gamma_2(0)$, we can compose the paths using equation (2.47). Let $\omega_1, \ldots, \omega_n \in \bigwedge M$, where for $n = 0$ it is assumed that
\[ \int_\gamma \omega_1\cdots\omega_n = 1. \]
Then, under composition of the paths, the Chen integrals change in the following way (where we introduced the notion of an inverse path):
\[ \int_{\gamma_1\cdot\gamma_2} \omega_1\cdots\omega_n = \sum_{i=0}^{n} \int_{\gamma_1}\omega_1\cdots\omega_i \cdot \int_{\gamma_2}\omega_{i+1}\cdots\omega_n, \tag{2.49} \]
\[ \int_{\gamma_1^{-1}} \omega_1\cdots\omega_n = (-1)^n \int_{\gamma_1} \omega_n\cdots\omega_1. \tag{2.50} \]
When applied to $\omega_1, \ldots, \omega_n \in \bigwedge M \otimes GL(n, \mathbb{C})$ (i.e., general linear group complex matrix-valued one-forms), equation (2.50) is replaced by
\[ \int_{\gamma_1^{-1}} \omega_1\cdots\omega_n = (-1)^n \int_{\gamma_1} [\omega_n^T\cdots\omega_1^T]^T \tag{2.51} \]
with $\omega^T$ the transpose of the matrix $\omega$. The matrix-valued one-forms can be considered as matrix functions, which will be discussed in more detail in Section 2.3.1. Within the principal fiber bundle approach to the formulation of gauge theories (Section 2.2) we shall identify the gauge potentials $A_\mu$ with such one-forms, where the matrices form a representation of the Lie algebra. The Chen integrals will then be applied to solving the parallel transport equation. In what follows we adopt the notation $X^{\omega_1\cdots\omega_n}$:
\[ X^{\omega_1\cdots\omega_n}[\gamma] = \int_\gamma \omega_1\cdots\omega_n = \gamma[\omega_1\cdots\omega_n], \tag{2.52} \]
where $\int_\gamma$ is interpreted as a d-path and where we considered the one-forms $\omega_i$ to be complex-valued, $\omega_i \in \bigwedge M \otimes \mathbb{C}$. This notation can be straightforwardly extended to complex matrix-valued one-forms, and thus to Lie algebra-valued one-forms as well. Let us give a simple example:
Example 2.71.
\[ X^{\omega_1\omega_2}[\gamma] = \int_\gamma \omega_1\omega_2 \]
is a matrix in $GL(n, \mathbb{C})$ with the elements given by
\[ \Big(\int_\gamma \omega_1\omega_2\Big)^i_{\ j} = \int_\gamma (\omega_1)^i_{\ k} \otimes (\omega_2)^k_{\ j}, \tag{2.53} \]
where $\omega_1, \omega_2 \in \bigwedge M \otimes GL(n, \mathbb{C})$ are matrices of one-forms on $M$.

Recall now that Chen's integrals can be considered as d-paths/loops. The above properties of these integrals allow us to give some extra notions related to d-paths.

Definition 2.72 (Elementary equivalent paths). Two paths are called elementary equivalent if $\gamma_1\gamma_2\gamma_2^{-1}\gamma_3 = \gamma_1\gamma_3$.

This equivalence induces an equivalence relation on the d-paths, and thus also induces equivalence classes of paths $[\gamma_1\gamma_3]$.

Definition 2.73 (Piecewise regular paths). A piecewise regular path is a path in $PM$ with nonvanishing tangent vectors.

Definition 2.74 (Reduced paths). A path is called a reduced path if it is piecewise regular and if it is not of the type $\gamma_1\gamma_2\gamma_2^{-1}\gamma_3$ for arbitrary $\gamma_2$.

From these definitions, together with (2.49) and (2.50), it is clear that the functionals $X$ defined in (2.52) depend only on the equivalence class and not on the specific path representing the class. As a specific example we see that $\gamma$ and $\gamma\gamma'\gamma'^{-1}$ are representatives of the same class, and from the composition of paths property of the Chen integrals we get
\[ X^{\omega_1\cdots\omega_n}[\gamma] = X^{\omega_1\cdots\omega_n}[\gamma\gamma'\gamma'^{-1}]. \]
In the literature on loop space, this property is sometimes graphically represented as in Figure 2.5, and functionals $X$ exhibiting this property are sometimes referred to as Stokes' functionals.
Fig. 2.5: The property of path reduction.
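The composition rule (2.49), the inversion rule (2.50) and the path-reduction property can be checked numerically. The following is a minimal sketch, not a definitive implementation: it approximates the iterated integrals by nested Riemann sums along finely sampled polylines in the plane. The helper names (`chen_integrals`, `sample`), the three one-forms and the two curves are all hypothetical choices made for illustration.

```python
# Sketch: Riemann-sum approximation of Chen iterated integrals along
# polylines in R^2. All names, forms and curves are illustrative assumptions.

def chen_integrals(points, forms):
    """Return [X_0, ..., X_n], where X_k approximates the iterated integral
    of forms[0] ... forms[k-1] along the polyline through `points`."""
    n = len(forms)
    acc = [1.0] + [0.0] * n              # X_0 = 1 by convention (n = 0 case)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # midpoint of the step
        dx, dy = x1 - x0, y1 - y0
        inc = [w(xm, ym, dx, dy) for w in forms]    # each form applied to the step
        for k in range(n, 0, -1):        # descending k: acc[k-1] is still the old value
            acc[k] += acc[k - 1] * inc[k - 1]
    return acc

def sample(curve, steps):
    """Sample a parameterized curve [0, 1] -> R^2 at steps + 1 points."""
    return [curve(j / steps) for j in range(steps + 1)]

# Three example one-forms, each evaluated at (x, y) on the displacement (dx, dy).
forms = [
    lambda x, y, dx, dy: x * dy,   # omega_1 = x dy
    lambda x, y, dx, dy: dx,       # omega_2 = dx
    lambda x, y, dx, dy: y * dx,   # omega_3 = y dx
]
n = len(forms)

N = 2000
g1 = sample(lambda t: (t, t * t), N)                # gamma_1: (0,0) -> (1,1)
g2 = sample(lambda t: (1.0 + t, 1.0 - t * t), N)    # gamma_2: (1,1) -> (2,0)
g12 = g1 + g2[1:]                                   # composed path gamma_1 gamma_2

# Composition of paths, equation (2.49).
composition_lhs = chen_integrals(g12, forms)[n]
composition_rhs = sum(
    chen_integrals(g1, forms[:i])[i] * chen_integrals(g2, forms[i:])[n - i]
    for i in range(n + 1)
)

# Inverse path, equation (2.50): reversed path versus reversed order of forms.
inverse_lhs = chen_integrals(g1[::-1], forms)[n]
inverse_rhs = (-1) ** n * chen_integrals(g1, forms[::-1])[n]

# Path reduction: gamma_1 gamma_2 gamma_2^{-1} versus gamma_1 alone.
g_backtrack = g12 + g2[::-1][1:]
reduced_value = chen_integrals(g_backtrack, forms)[n]
direct_value = chen_integrals(g1, forms)[n]

print(abs(composition_lhs - composition_rhs))
print(abs(inverse_lhs - inverse_rhs))
print(abs(reduced_value - direct_value))
```

In this discretization the composition and inversion rules hold up to rounding, because the nested sums split over the two pieces of the polyline in the same way as the integrals do. The backtracking identity holds only up to the discretization error, which shrinks as the number of steps `N` grows.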
The properties of Chen integrals can be used to prove the following lemma:

Lemma 2.75 (Nonvanishing Chen integrals). Consider a reduced piecewise regular path $\gamma \neq \epsilon$ in $PM$. For $n \ge 1$, there exist one-forms $\omega_1, \omega_2, \ldots, \omega_n \in \bigwedge M$, such that $X^{\omega_1\cdots\omega_n}[\gamma] \neq 0$.

From the above lemma we can derive an important separation property of the functionals $X^{\omega_1\cdots\omega_n}$:

Theorem 2.76 (Separation Property Theorem). Two piecewise regular paths $\gamma, \gamma'$ are equivalent if and only if, for all sets of one-forms $\omega_1, \omega_2, \ldots, \omega_n \in \bigwedge M$ ($n \ge 1$), the corresponding Chen integrals are equal to each other:
\[ X^{\omega_1\cdots\omega_n}[\gamma] = X^{\omega_1\cdots\omega_n}[\gamma']. \tag{2.54} \]
Exercise 2.77. Prove Theorem 2.76 by using the definitions and properties of Chen integrals and reduced paths.

This lemma and theorem state that each d-path, defined by Chen integrals, is equivalent to exactly one reduced path. The theorem also says that the functionals $X$ can be used to separate d-paths.

Remark 2.78. With the principal fiber bundle approach in mind, we can go a step further. The following theorem states that if the d-paths of $\gamma_1$ and $\gamma_2$ return the same value for the exponential homomorphism $\Theta$, then $\gamma_1$ and $\gamma_2$ only differ by parameterization and left translation, provided they are reduced (see Definition 2.74):

Theorem 2.79. Introduce the exponential homomorphism $\Theta$:
\[ \Theta[\gamma_1] = 1 + \sum_{n=1}^{\infty} \sum \int_{\gamma_1} \omega_{i_1}\cdots\omega_{i_n}\, X_{i_1}\cdots X_{i_n}, \tag{2.55} \]
where the $X_j$ are noncommutative indeterminates with respect to a base $\omega_1, \ldots, \omega_n$ of the Maurer–Cartan forms ($g^{-1}dg$) of a real Lie group $G$, and $\Theta[\gamma_1]$ is an element of $G$. Then one of two irreducible piecewise regular continuous paths $\gamma_1$ and $\gamma_2$ can be obtained from the other by left translation and change of parameter if and only if $\Theta[\gamma_1] = \Theta[\gamma_2]$.

Identifying the exponential homomorphism with the parallel transporter or Wilson line, the above theorem strengthens the equivalence relation on d-paths and d-loops induced by path reduction from Definition 2.74. It also permits a stronger separation of paths compared to the functionals $X^{\omega_1\cdots\omega_n}$. In other words, the parallel transporter can be used to distinguish or separate d-paths and d-loops, a fact that will be quite helpful when introducing a topology on the algebra $A_p$.
2.2 Gauge fields as connections on a principal bundle

A mathematical point of view on Quantum Field Theory suggests that the fundamental interactions between matter fields can be conveniently expressed in a geometrical setting using principal fiber bundles.⁵ In this section we present some of the basic concepts of the fiber bundle theory, which will be used to derive the parallel transport equation. Then we shall link the solution of this equation to the concept of Wilson lines.

5 Notice that principal fiber bundles are not the only method to provide a geometrical description of physical interactions. Other approaches, which in some sense are closer to the ideas of Quantum Mechanics, are given by (Lie) algebroids and noncommutative geometry.
2.2.1 Principal fiber bundle, sections and associated vector bundle

The simplest structure we wish to define is a fiber bundle.⁶ A fiber bundle $P(Y, \pi)$ consists of the base space $Y$, another set $P$, and the projection $\pi : P \to Y$, which maps the fiber bundle $P$ to the base space $Y$. Therefore, for each element $y \in Y$ there exists a set of elements (the fiber) $P_y = \pi_y^{-1} \subset P$. So far the sets $P$ and $Y$ are arbitrary. More useful structures arise if we let them be differentiable manifolds and, correspondingly, $\pi$ be a differentiable projection. Adding also a group $G$ (which we shall assume to be a Lie group of Yang–Mills theory) enables us to define a principal (Yang–Mills) fiber bundle.

6 We base our exposition mostly on the works given in the section 'Gauge theory and the principal fiber bundle approach' in the Literature Guide.

Definition 2.80 (Yang–Mills principal fiber bundle). A principal fiber bundle $P(Y, G, \pi)$ is a set of the following ingredients:
1. a base space $Y$, which is assumed in what follows to be a four-dimensional Minkowskian manifold $M^4$;
2. a differentiable manifold $P$;
3. a surjective projection $\pi : P \to Y$;
4. a structure group $G$ (in the Yang–Mills case it is a gauge (Lie) group), which is equivalent to a fiber $P_y$, so that an inverse image yields the fiber at $y$: $\pi_y^{-1} \equiv G_y \cong G$.

Notice the following properties:
1. A Lie group $G$ acts on the fibers from the left.
2. There exists an open cover $\{U_i\}$ of $Y$ together with the diffeomorphisms $\Phi_i : U_i \times G \to \pi^{-1}(U_i)$, so that $(\pi \circ \Phi_i)(y, g) = y$, where $g$ is an element of $G$. The $\Phi_i$ are referred to as the local gauge or local trivialization since $\Phi_i^{-1}$ acts as $\pi^{-1}(U_i) \to U_i \times G$.
3. The mapping $\Phi_i(y) : G \to G_y$ is a diffeomorphism. On $U_i \cap U_j \neq \emptyset$, it is required that $S_{ij}(y) \equiv \Phi_i^{-1}(y)\Phi_j(y)$, which maps $G \to G$, is an element of the structure group $G$. Both maps $\Phi_i$ and $\Phi_j$ are related by a smooth map $S_{ij} : U_i \cap U_j \to G$, so that $\Phi_j(y, g) = \Phi_i(y, S_{ij}(y)g)$. We refer to the $S_{ij}$ as the transition functions or passive gauge transformations.

We also have a right action of $G$ (see Figure 2.6) on the fiber which does not depend on the local gauges. Given that the structure group is equivalent to the fiber, the right action of $G$ on $\pi^{-1}(U_i)$ reads $\Phi_i^{-1}(\pi^{-1}(y)g) = (y, g_i g)$ or $\pi^{-1}(y)g = \Phi_i(y, g_i g)$. To see that this is independent of local gauges, let us consider a $y \in U_i \cap U_j$, for which
\[ \pi^{-1}(y)g = \Phi_j(y, g_j g) = \Phi_j(y, S_{ji} g_i g) = \Phi_i(y, g_i g). \tag{2.56} \]
Fig. 2.6: Right action of G on a fiber.
Next to the principal fiber bundles we shall also need the concept of a section:

Definition 2.81 (Section). A smooth map $S : Y \to P$ is called a section when $\pi \circ S = 1_Y$.

Suppose we have a section $S_i(y)$ over $U_i$. Then we can construct a corresponding local gauge $\Phi_i$. To this end, let us consider, for $y \in U_i$, a point $p \in \pi^{-1}(y)$, for which there exists a unique element $g_p \in G$ such that $p = S_i(y) g_p$. Now we define $\Phi_i$ through its inverse
\[ \Phi_i^{-1}(p) = (y, g_p). \]
Notice that in this specific gauge (often referred to as the canonical local trivialization) we get $S_i(y) = \Phi_i(y, \epsilon)$, where $\epsilon$ is the identity element in the structure group $G$. The gauge potentials can be defined in these principal fiber bundles, which form an appropriate geometrical space.

Exercise 2.82. How does one include a matter field $\psi(x)$ or, put differently, how does one geometrize $\psi(x)$?

The geometrization is performed with the aid of another structure called the associated vector bundle $E(Y, G, V, P, \pi_E)$. The vector bundle $E$ is constructed using an n-dimensional vector space $V$, with $(p, v) \in P \times V$, on which the gauge group $G$ acts:
\[ (p, v) \to (pg, \rho^{-1}(g)v) \tag{2.57} \]
with $\rho$ the n-dimensional unitary representation of $G$. Therefore, the vector bundle $E(Y, G, V, P, \pi_E)$ is formed by the equivalence classes $(P \times V)/G$, such that $(p, v) \equiv (pg, \rho^{-1}(g)v)$. The bundle $E$ now also has a fiber bundle structure $E = P \times_\rho V$, where $\pi_E : E \to Y$, $\pi_E(p, v) = \pi(p)$, with local trivialization $\Psi_i : U_i \times V \to \pi_E^{-1}(U_i)$. Again we have transition functions, which are now $\rho(S_{ij}(y))$ with the $S_{ij}$ the transition functions on $P$. A local section $S_i$ on $P$ can then be used to determine a local gauge not only on $P$, but also on $E$:
\[ \Phi_i^{-1}(y) \circ S_i(x) = 1_G, \tag{2.58} \]
\[ \Psi_i^{-1}(y) \circ S_i(x) = 1_V, \tag{2.59} \]
with $\Psi_i^{-1}(y) : \pi_E^{-1}(y) \to V$. Now the associated vector bundle $E(Y, G, V, P, \pi_E)$ allows us to geometrize matter fields.

Definition 2.83 (Matter field). A matter field of type $(\rho, V)$ is defined as a section $\psi(y) : Y \to E$.

Expressed in a gauge-independent way, this yields:

Definition 2.84 (Gauge-independent definition of a matter field). A matter field of type $(\rho, V)$ (see Figure 2.7) is defined as a map $\tilde\psi : P \to V$. This map is equivariant under the structure group $G$ for each $p \in P$:
\[ \tilde\psi(pg) = \rho(g^{-1})\tilde\psi(p). \]

Fig. 2.7: Definition of matter field.
2.2.2 Gauge field as a connection

In a gauge theory the fields (or potentials) can be introduced as Lie algebra-valued one-forms on the principal fiber bundle associated to the gauge theory.⁷ Now we shall motivate and discuss this identification.⁸ We shall give two equivalent definitions of a connection, the first of which is used more by mathematicians, while the second one is favored by physicists.

Definition 2.85 (Connection (math)). Consider a principal fiber bundle $P(Y, G, \pi)$. Then a connection on $P$ is defined as a unique splitting of the tangent space $T_pP$ into the vertical subspace $V_pP$ and the horizontal subspace $H_pP$, such that
1. $T_pP = H_pP \oplus V_pP$.
2. A smooth vector field $X$ on $P$ can be split into horizontal and vertical fields $X = X^H + X^V$, where $X^H \in H_pP$ and $X^V \in V_pP$.
3. For $p \in P$ and $g \in G$ one has $H_{pg}P = R_{g*}H_pP$.

The vertical space is considered to be tangent to $G_x$ at $p$, which we shall discuss in more detail below. The last statement in the definition says that the horizontal spaces $H_{pg}P$ and $H_pP$ on the same fiber are related by a linear transformation generated by the right action of the structure group. Most physicists, however, would prefer the definition of a connection one-form introduced by Ehresmann. Before we give that definition, let us first review some facts about Lie groups and Lie algebras.
7 Let us emphasize that the identification of fields, as defined in quantum field theory in physics, with sections of the principal (gauge) fiber bundle is only valid in the perturbative sector. In the nonperturbative regime the situation becomes much more involved. Sometimes one runs into problems of uniqueness, even in the perturbative sector. An example of this is, for instance, the U(1)-bundle over the sphere $S^2$.
8 A gauge field $A_\mu$ (also referred to as gauge potential) transforms under a gauge transformation $U(x)$ as was discussed in the Introduction: $A_\mu \to U(x)A_\mu(x)U^\dagger(x) \mp \frac{i}{e_0}\partial_\mu U(x)U^\dagger(x)$, with $e_0$ the coupling constant. This obviously differs from the transformation law for vectors and looks more like the transformation of a connection $\omega \to g^{-1}\omega g + g^{-1}\partial_\mu g$.
2.2.2.1 Lie groups and Lie algebras

Consider a Lie group $G$ to which we can associate a left action ($L_g$) and a right action ($R_g$) defined respectively as $L_g h = gh$ and $R_g h = hg$ for $g, h \in G$. The left action $L_g$ generates the map (push-forward)⁹ $L_{g*} : T_h(G) \to T_{gh}(G)$ between tangent spaces at different points in the Lie group $G$. This allows us to define a left-invariant vector field $X$ by demanding that $L_{g*}X|_h = X|_{gh}$. These left-invariant vector fields generate a Lie algebra of $G$, which we write as $\mathfrak{g}$. Now $X \in \mathfrak{g}$ is specified by its value at the Lie group's unit element $e$, and vice versa. This means there exists a vector space isomorphism between the Lie algebra $\mathfrak{g}$ and the tangent space of $G$ at the unit element, i.e., $\mathfrak{g} \cong T_eG$. From Lie theory we learn that the Lie algebra $\mathfrak{g}$ has a set of generators $\{T_\alpha\}$ that also define the structure constants $f_{\alpha\beta}^{\gamma}$:
\[ [T_\alpha, T_\beta] = f_{\alpha\beta}^{\gamma} T_\gamma. \]
Besides the left and right action, Lie groups also allow for an adjoint action $\mathrm{ad} : G \to G$, $h \to \mathrm{ad}_g h \equiv ghg^{-1}$, which in its turn generates the adjoint map $\mathrm{Ad}_g : T_h(G) \to T_{ghg^{-1}}(G)$ between tangent spaces. By choosing $h \in G$ in the adjoint map to be the unit element $e$, we immediately see that $\mathrm{Ad}_g$ maps $T_e(G) \cong \mathfrak{g}$ onto itself.
9 Notice that this map is well-defined due to the fact that this action is an automorphism of G.
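To make the structure constants and the adjoint map concrete, here is a small sketch for the special case $G = SU(2)$. It assumes the anti-Hermitian generators $T_a = -\tfrac{i}{2}\sigma_a$ (a choice of convention not fixed by the text above), for which $f_{ab}^{c} = \epsilon_{abc}$, and it checks numerically that $g T_a g^{-1}$ is again a real linear combination of the $T_b$. All helper names are hypothetical, and plain nested lists are used to keep the example dependency-free.

```python
# Illustrative check for G = SU(2): commutation relations and the adjoint map.
# Assumed convention: T_a = -(i/2) sigma_a, so that [T_a, T_b] = eps_{abc} T_c.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [scale(-0.5j, s) for s in SIGMA]   # anti-Hermitian, traceless generators
ZERO = [[0, 0], [0, 0]]

# Largest deviation from [T_a, T_b] = eps_{abc} T_c over all index pairs.
commutator_error = 0.0
for a in range(3):
    for b in range(3):
        expected = ZERO
        for c in range(3):
            expected = add(expected, scale(levi_civita(a, b, c), T[c]))
        commutator_error = max(
            commutator_error, max_abs_diff(commutator(T[a], T[b]), expected)
        )

# A sample group element g = exp(2 * alpha * T_2), written out explicitly.
alpha = 0.4
g = [[math.cos(alpha), -math.sin(alpha)], [math.sin(alpha), math.cos(alpha)]]
g_inv = [[math.cos(alpha), math.sin(alpha)], [-math.sin(alpha), math.cos(alpha)]]

# Ad_g T_a = g T_a g^{-1}, expanded in the basis using tr(T_a T_b) = -delta_ab / 2.
adjoint_error = 0.0
adjoint_coefficients = []
for a in range(3):
    image = matmul(matmul(g, T[a]), g_inv)
    coeffs = [-2 * trace(matmul(T[b], image)) for b in range(3)]
    rebuilt = ZERO
    for b in range(3):
        rebuilt = add(rebuilt, scale(coeffs[b], T[b]))
    adjoint_error = max(adjoint_error, max_abs_diff(image, rebuilt))
    adjoint_coefficients.append(coeffs)

print(commutator_error, adjoint_error)
```

The expansion coefficients collected in `adjoint_coefficients` are real up to rounding, which is the numerical counterpart of the statement that the adjoint map sends the Lie algebra to itself.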
With the aid of this information on Lie groups and algebras, we now understand how to construct the vertical subspace $V_pP$, defined in Definition 2.85, of the tangent space $T_pP$ of the principal fiber bundle. Suppose we have $A \in \mathfrak{g}$ and $p \in P$. Then the right action
\[ R_{e^{tA}}\, p = p\, e^{tA} \tag{2.60} \]
defines a curve through $p$ parameterized by $t$. Since $\pi(p) = \pi(p\, e^{tA}) = y$, the curve lies in $G_y$, the fiber above $y \in Y$. Using an arbitrary smooth function $F : P \to \mathbb{R}$ we define the vector $A^\sharp \in T_pP$ as
\[ A^\sharp F(p) = \frac{d}{dt} F(p\, e^{tA})\Big|_{t=0}. \tag{2.61} \]
This vector is tangent to $P$ at $p$ and by definition tangent to $G_y$, so that we have $A^\sharp \in V_pP$. Constructing such an $A^\sharp$ at each point of $P$ builds a vector field, also denoted by $A^\sharp$ and referred to as the fundamental vector field generated by $A$. We obtain, therefore, the isomorphism
\[ \sharp : \mathfrak{g} \to V_pP : A \to A^\sharp. \]
We identify the complement of $V_pP$ with $H_pP$ from Definition 2.85. We are now in a position to define the Ehresmann connection one-form:

Definition 2.86 (Ehresmann connection one-form). A connection one-form $\omega \in T^*P \otimes \mathfrak{g}$ is a projection of $T_pP$ onto the vertical component $V_pP \cong \mathfrak{g}$, the Lie algebra of $G$. This one-form possesses the following properties:
1. $\omega(A^\sharp) = A$ with $A \in \mathfrak{g}$;
2. $R_g^*\omega = \mathrm{Ad}_{g^{-1}}\omega$ or, for $X \in T_pP$,
\[ R_g^*\omega_{pg}(X) = \omega_{pg}(R_{g*}X) = g^{-1}\omega_p(X)g. \]

Using this definition, the horizontal subspace $H_pP$ can also be identified with the kernel of $\omega$. Recall now that we wish to relate $A_\mu$ to a Lie algebra-valued one-form. We have constructed a one-form, so the following question naturally arises:

Exercise 2.87. How does one find the relation between the gauge fields $A_\mu$ and the Ehresmann connection one-form?

Take an open covering $\{U_i\}$ of $Y$ and let $S_i$ be a local section on each $U_i$. Using the Ehresmann connection $\omega$, define the Lie algebra-valued one-form $A_i$ on $U_i$ by:¹⁰
\[ A_i \equiv S_i^*\omega \in \bigwedge(U_i) \otimes \mathfrak{g}. \tag{2.62} \]
Now it is also possible, given a gauge field and a section $S_i : U_i \to \pi^{-1}(U_i)$, to reconstruct a connection one-form $\omega$. The following theorem helps us to proceed:

Theorem 2.88. Given a $\mathfrak{g}$-valued one-form $A_i$ on $U_i$ and a local section $S_i : U_i \to \pi^{-1}(U_i)$, there exists a connection one-form $\omega$ whose pullback by $S_i^*$ is $A_i$.

It is worth mentioning, however, that the connection one-form $\omega$ can be defined globally, while the Lie algebra-valued one-form $A_i$ cannot because of the need for the local sections $S_i$. Theorem 2.88 states that given a gauge potential $A_i$ in $U_i$ there exists a connection one-form $\omega$, but it does not say whether it is unique. If we wish this connection one-form to be unique, it needs to satisfy an extra condition that is called the compatibility condition.

10 The indices $i$ refer to the covering and not to the space-time indices $\mu$ that accompany each $A_i$ in $U_i$ for a specific $i$.

This condition follows from the fact that if $\omega$ is to be unique, one needs that $\omega_i = \omega_j$ on $U_i \cap U_j$, with $\omega_i = \omega|_{U_i}$. From manifold theory it is clear that this restriction has something to do with the transition function associated to the transformation from $U_i$ to $U_j$, so we can expect a statement that restricts the transition functions. The explicit form of this condition can be derived by applying the connection one-form $\omega$ to (2.63) in the following lemma:

Lemma 2.89. Consider a principal fiber bundle $P(Y, G, \pi)$ and local sections $S_i$, $S_j$ over $U_i$ and $U_j$, such that $U_i \cap U_j \neq \emptyset$. For $X \in T_pM$ with $p \in U_i \cap U_j$, $S_{i*}X$ and $S_{j*}X$ satisfy
\[ S_{j*}X = R_{t_{ij}*}(S_{i*}X) + \big(t_{ij}^{-1}\,dt_{ij}(X)\big)^\sharp, \tag{2.63} \]
where $t_{ij} : U_i \cap U_j \to G$ is the transition function.

After application of $\omega$ to (2.63), using $\omega(S_{j*}X) = S_j^*\omega(X)$ together with the second property of Definition 2.86, we obtain
\[ A_j = t_{ij}^{-1} A_i t_{ij} + t_{ij}^{-1}\,dt_{ij}. \tag{2.64} \]
Identifying the $A_j$ with gauge potentials, we obtain for the components
\[ A_{2\mu} = g^{-1}(p)A_{1\mu}(p)g(p) + g^{-1}(p)\partial_\mu g(p), \tag{2.65} \]
which is identical to a gauge transformation in gauge theory. In local coordinates it reads
\[ A_i = \big(-ig A^a_\mu t_a\,dx^\mu\big)_i, \tag{2.66} \]
where $g$ is now the coupling constant and $t_a$ are the Lie algebra generators.
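The step from (2.63) to (2.64) can be spelled out as follows. This is only a sketch of the intermediate algebra, using the two properties listed in Definition 2.86 and the identification $A_i(X) = S_i^*\omega(X) = \omega(S_{i*}X)$ from (2.62):

```latex
\begin{aligned}
A_j(X) = \omega(S_{j*}X)
  &= \omega\big(R_{t_{ij}*}(S_{i*}X)\big)
   + \omega\big((t_{ij}^{-1}\,dt_{ij}(X))^{\sharp}\big)
   && \text{apply } \omega \text{ to (2.63)} \\
  &= \big(R_{t_{ij}}^{*}\omega\big)(S_{i*}X) + t_{ij}^{-1}\,dt_{ij}(X)
   && \text{property 1: } \omega(A^{\sharp}) = A \\
  &= t_{ij}^{-1}\,\omega(S_{i*}X)\,t_{ij} + t_{ij}^{-1}\,dt_{ij}(X)
   && \text{property 2: } R_g^{*}\omega = \mathrm{Ad}_{g^{-1}}\omega \\
  &= t_{ij}^{-1}\,A_i(X)\,t_{ij} + t_{ij}^{-1}\,dt_{ij}(X).
\end{aligned}
```

Since this holds for every tangent vector $X$, it is the statement (2.64) for the one-forms themselves.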
2.2.3 Horizontal lift and parallel transport

Now that we have defined the splitting of the tangent space $TP$ of the principal fiber bundle $P(Y, G, \pi)$, we can define the horizontal lift of a curve in the base manifold $Y = M^4$.

Definition 2.90 (Horizontal lift). Consider a principal fiber bundle $P(Y, G, \pi)$ and a curve in $Y$, $\gamma : [0, 1] \to Y$. Then a curve $\tilde\gamma : [0, 1] \to P$ is called a horizontal lift of $\gamma$ if the tangent vector to $\tilde\gamma(t)$ is contained in $H_{\tilde\gamma(t)}P$.

With this definition we have the following theorem:

Theorem 2.91. Consider again a curve in $Y$, $\gamma : [0, 1] \to Y$, and $p \in \pi^{-1}[\gamma(0)]$. One can show that there is a unique horizontal lift $\tilde\gamma(t)$ in $P$ (see Figure 2.8), such that $\tilde\gamma(0) = p$.

It follows from this statement that if $\tilde\gamma'$ is another horizontal lift of $\gamma$, such that $\tilde\gamma'(0) = \tilde\gamma(0)g$, then for all $t \in [0, 1]$ one gets
\[ \tilde\gamma'(t) = \tilde\gamma(t)g. \]
The last statement demonstrates the global gauge symmetry: a global right action does not change the connection on the principal fiber. Consider now $X$ to be the tangent vector of $\gamma(t)$ at $\gamma(0)$. Using the horizontal lift we have that $\tilde X = \tilde\gamma_* X$ is tangent to $\tilde\gamma$ at $p = \tilde\gamma(0)$. Given that this lifted tangent vector is horizontal by definition, we get $\omega(\tilde X) = 0$. Rewriting equation (2.63) using the fact that the transition functions are elements of $G$ returns:
\[ \tilde X = g_i^{-1}(t)\,S_{i*}X\,g_i(t) + \big(g_i^{-1}(t)\,dg_i(X)\big)^\sharp. \tag{2.67} \]
Applying $\omega$ to this result we have
\[ 0 = \omega(\tilde X) = g_i^{-1}(t)\,\omega(S_{i*}X)\,g_i(t) + g_i^{-1}(t)\frac{dg_i(t)}{dt}. \tag{2.68} \]
Fig. 2.8: Horizontal lifts of a curve.
Exercise 2.92. Derive equation (2.68) by applying $\omega$ to equation (2.67).

This result can now be used to answer the question:

Exercise 2.93. What is the parallel transport equation in gauge theory?

From the expression for the gauge potentials $\omega(S_{i*}X) = S_i^*\omega(X) = A_i(X)$ in equation (2.68) it follows that the parallel transport equation in the local form reads
\[ \frac{dg_i(t)}{dt} = -A_i(X)\,g_i(t). \tag{2.69} \]
Thus we have discussed the relation between gauge potentials and connection one-forms on principal fiber bundles. This eventually allowed us to derive the parallel transport equation in gauge theory. In the next section we shall introduce the mathematical tools to solve this type of equation.
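Before turning to the analytic tools, it may help to see equation (2.69) solved numerically in a toy setting. The sketch below assumes a hypothetical su(2)-valued potential along the curve, $A(t) = (1-t)T_1 + tT_2$ with $T_a = -\tfrac{i}{2}\sigma_a$, which does not commute with itself at different values of $t$. It builds the solution as an ordered product of short-step exponentials, with later factors multiplying from the left, and compares it with the naive exponential of $-\int_0^1 A\,dt$. The helper names and the choice of $A(t)$ are illustrative assumptions, not taken from the text.

```python
# Sketch: numerical solution of dg/dt = -A(t) g, g(0) = 1, for a toy
# su(2)-valued potential. Names and the choice of A(t) are assumptions.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def identity():
    return [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def expm(m, terms=20):
    """Matrix exponential by a truncated Taylor series (fine for small 2x2 input)."""
    result, term = identity(), identity()
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, m))
        result = add(result, term)
    return result

T1 = scale(-0.5j, [[0, 1], [1, 0]])
T2 = scale(-0.5j, [[0, -1j], [1j, 0]])

def potential(t):
    """Hypothetical A(X) along the curve; [A(t), A(s)] != 0 for t != s."""
    return add(scale(1.0 - t, T1), scale(t, T2))

def transport(steps):
    """Ordered product of exp(-h A(t_mid)); each new factor acts from the left."""
    h = 1.0 / steps
    g = identity()
    for k in range(steps):
        g = matmul(expm(scale(-h, potential((k + 0.5) * h))), g)
    return g

g_coarse = transport(200)
g_fine = transport(400)

# Naive guess that ignores the ordering: exp(-integral of A) = exp(-(T1 + T2)/2).
g_naive = expm(scale(-0.5, add(T1, T2)))

convergence_gap = max_abs_diff(g_coarse, g_fine)
unitarity_error = max_abs_diff(matmul(dagger(g_fine), g_fine), identity())
ordering_gap = max_abs_diff(g_fine, g_naive)
print(convergence_gap, unitarity_error, ordering_gap)
```

The ordered product converges as the step is refined and stays unitary, while it differs visibly from the naive exponential. That difference is the reason an ordered construction, such as the product integrals and Chen iterated integrals discussed next, is needed.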
2.3 Solving matrix differential equations: Chen iterated integrals

The main goal of this section is to clarify the relationship between iterated integrals, solutions of the parallel transporter equation in the perturbative sector of a gauge theory, and Wilson lines. We shall see that this parallel transporter equation is a linear differential matrix equation (LDME) given that we work with a matrix representation of the gauge group generators. Since we will be interested in the solutions of this transporter equation, it is instructive to discuss in detail the procedure of their construction. Finally, the solutions will be expressed in terms of product integrals and Chen iterated integrals. We restrict ourselves to the definitions and properties that are necessary in our exposition, with the aim to make it clear how Chen iterated integrals emerge in the solution of the parallel transporter equation.
2.3.1 Derivatives of a matrix function

We assume that the reader is familiar with the basics of matrix theory, so that we only define the derivative and product integral of a matrix-valued function $A : [a,b] \to \mathbb{R}^{n\times n}$. For the moment we restrict ourselves to real-valued matrices, but most definitions and properties can be straightforwardly extended to complex matrices. The first concept we need is the differentiability of a matrix function.

Definition 2.94 (Differentiability of a matrix function). A matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$ is called differentiable at a point $x \in (a,b)$ if all its entries $a_{ij}$, $i,j \in \{1,\dots,n\}$, are differentiable at $x$, where the entries are considered to be real-valued functions $a_{ij} : [a,b] \to \mathbb{R}$. If the matrix function $A$ is differentiable we use the notation:

$$A'(x) = \{a'_{ij}(x)\}_{i,j=1}^{n}. \tag{2.70}$$
Building on the differentiability of the matrix function $A$ we can now define not one, but two derivatives.

Definition 2.95 (Left and right derivative of a matrix function). Let $A : [a,b] \to \mathbb{R}^{n\times n}$ be a differentiable and regular (single-valued and analytic) matrix function at $x \in (a,b)$, with $A(x)$ invertible. Then we define the left derivative of $A$ at $x$ as

$$\frac{d}{dx}A(x) = A'(x)A^{-1}(x) = \lim_{\Delta x\to 0}\frac{A(x+\Delta x)A^{-1}(x) - I}{\Delta x}, \tag{2.71}$$

and similarly the right derivative as

$$A(x)\frac{d}{dx} = A^{-1}(x)A'(x) = \lim_{\Delta x\to 0}\frac{A^{-1}(x)A(x+\Delta x) - I}{\Delta x}. \tag{2.72}$$
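The two difference quotients in (2.71) and (2.72) can be evaluated numerically to see that the left and right derivatives really differ. The sketch below is illustrative only: the test function `A(x) = [[e^x, x], [0, 1]]`, its closed-form inverse and the step size `h` are assumptions made for this example and do not come from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def A(x):
    """Hypothetical test function A(x) = [[e^x, x], [0, 1]] (invertible for all x)."""
    return [[math.exp(x), x], [0.0, 1.0]]

def A_inv(x):
    """Closed-form inverse of the test function A(x)."""
    return [[math.exp(-x), -x * math.exp(-x)], [0.0, 1.0]]

def left_derivative(x, h=1e-6):
    """Difference quotient (A(x+h) A^{-1}(x) - I) / h from (2.71)."""
    m = matmul(A(x + h), A_inv(x))
    return [[(m[i][j] - (1.0 if i == j else 0.0)) / h for j in range(2)]
            for i in range(2)]

def right_derivative(x, h=1e-6):
    """Difference quotient (A^{-1}(x) A(x+h) - I) / h from (2.72)."""
    m = matmul(A_inv(x), A(x + h))
    return [[(m[i][j] - (1.0 if i == j else 0.0)) / h for j in range(2)]
            for i in range(2)]

x0 = 0.5
left = left_derivative(x0)    # analytically A'A^{-1} = [[1, 1 - x], [0, 0]]
right = right_derivative(x0)  # analytically A^{-1}A' = [[1, e^{-x}], [0, 0]]
```

For this test function the upper right entries are $1 - x$ for the left derivative and $e^{-x}$ for the right derivative, so the two notions agree only where the matrices involved happen to commute.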
Derivatives at the endpoints of the interval $[a,b]$ are defined in the same way as left and right¹¹ derivatives are defined for scalar functions, again by using the matrix entries $a_{ij}$. Both the left and right derivatives of a matrix function share many properties with the ordinary derivatives of functions, but in some cases we have to be careful. To demonstrate this we just mention the application to a product.

Theorem 2.96. Let $A_1, A_2 : [a_1,a_2] \to \mathbb{R}^{n\times n}$ be differentiable and regular matrix functions at $x \in (a_1,a_2)$. Then

$$\frac{d}{dx}(A_1A_2) = \frac{d}{dx}A_1 + A_1\Bigl(\frac{d}{dx}A_2\Bigr)A_1^{-1} = A_1\Bigl(A_1\frac{d}{dx} + \frac{d}{dx}A_2\Bigr)A_1^{-1}, \tag{2.73}$$

$$(A_1A_2)\frac{d}{dx} = A_2\frac{d}{dx} + A_2^{-1}\Bigl(A_1\frac{d}{dx}\Bigr)A_2 = A_2^{-1}\Bigl(A_1\frac{d}{dx} + \frac{d}{dx}A_2\Bigr)A_2. \tag{2.74}$$
Exercise 2.97. Prove Theorem 2.96.
Exercise 2.98. Show that

$$(CA)\frac{d}{dx} = A\frac{d}{dx},$$

where $C$ is a constant regular matrix.

Exercise 2.99. Demonstrate that

$$\frac{d}{dx}(A^{-1}) = -A\frac{d}{dx}, \qquad (A^{-1})\frac{d}{dx} = -\frac{d}{dx}A.$$

Theorem 2.100. Suppose that $A_1, A_2 : [a,b] \to \mathbb{R}^{n\times n}$ are differentiable and regular matrix functions at every $x \in (a,b)$, such that

$$\frac{d}{dx}A_1 = \frac{d}{dx}A_2. \tag{2.75}$$

Then there exists a constant matrix $A_3 \in \mathbb{R}^{n\times n}$ such that for all $x \in [a,b]$

$$A_2(x) = A_1(x)A_3.$$

Exercise 2.101. Prove Theorem 2.100. (Hint: use $A_3 = A_1^{-1}A_2$.)

11 Here "left" and "right" refer to approaching the endpoints of the interval from the left or the right, and not to the left and right derivatives of the matrix function.
2.3.2 Product integral of a matrix function

Having introduced the derivatives of a matrix function, we now turn to integrals of matrix functions. Consider a matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$ and a partition $D$ of the interval $[a,b]$ defined as

$$a = t_0 \le \xi_1 \le t_1 \le \xi_2 \le \cdots \le t_{m-1} \le \xi_m \le t_m = b. \tag{2.76}$$

Next we introduce the notation

$$\Delta t_i = t_i - t_{i-1}, \quad i = 1,\dots,m, \tag{2.77}$$

$$\nu(D) = \max_{1\le i\le m}\Delta t_i, \tag{2.78}$$

and

$$P(A,D) = \prod_{i=m}^{1}\bigl(I + A(\xi_i)\Delta t_i\bigr) = \bigl(I + A(\xi_m)\Delta t_m\bigr)\cdots\bigl(I + A(\xi_1)\Delta t_1\bigr), \tag{2.79}$$

$$P^*(A,D) = \prod_{i=1}^{m}\bigl(I + A(\xi_i)\Delta t_i\bigr) = \bigl(I + A(\xi_1)\Delta t_1\bigr)\cdots\bigl(I + A(\xi_m)\Delta t_m\bigr). \tag{2.80}$$

Volterra then defined the left and right integral of the matrix function $A$ as:

$$\int_a^b \{a_{ij}\} = \lim_{\nu(D)\to 0}P(A,D) \quad \text{(left integral)}, \tag{2.81}$$

$$\{a_{ij}\}\int_a^b = \lim_{\nu(D)\to 0}P^*(A,D) \quad \text{(right integral)}, \tag{2.82}$$

where the statement

$$\lim_{\nu(D)\to 0}M(D) = M \tag{2.83}$$

means that for every $\epsilon > 0$ there exists $\delta > 0$ such that

$$|M(D)_{ij} - M_{ij}| < \epsilon$$

for all $i, j$ and for every partition $D$ of $[a,b]$ as defined in equation (2.76) with $\nu(D) < \delta$. This allows us to define the left and right product integrals:
Definition 2.102 (Left and right product integrals). Consider a matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$. If the limits

$$\prod_a^b \bigl(I + A(t)\,dt\bigr) = \lim_{\nu(D)\to 0}P(A,D), \tag{2.84}$$

$$\bigl(I + A(t)\,dt\bigr)\prod_a^b = \lim_{\nu(D)\to 0}P^*(A,D) \tag{2.85}$$

exist, then they are called, respectively, the left and right product integral of $A$ over $[a,b]$.

In order to link this operation with the usual Riemann integrals, we observe that a matrix function $A$ is Riemann integrable if its matrix entries $a_{ij}$ are Riemann integrable functions on $[a,b]$. In this case one has

$$\int_a^b A(t)\,dt = \Bigl\{\int_a^b a_{ij}(t)\,dt\Bigr\}_{i,j=1}^{n}. \tag{2.86}$$
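The finite products $P(A,D)$ and $P^*(A,D)$ of (2.79) and (2.80) are easy to evaluate numerically, which makes the role of the ordering visible. The following is a minimal sketch under stated assumptions: the uniform partition, the midpoint tags $\xi_i$ and the two test functions `N` and `B` are choices made for this illustration and are not taken from the text.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def factor(A, xi, dt):
    """One factor I + A(xi) * dt of the products (2.79) and (2.80)."""
    a = A(xi)
    n = len(a)
    return [[(1.0 if i == j else 0.0) + a[i][j] * dt for j in range(n)]
            for i in range(n)]

def product_integrals(A, a, b, m):
    """Return (P, P_star) for a uniform partition of [a, b] with m cells.

    The tags xi_i are the cell midpoints (an arbitrary choice).
    """
    n = len(A(a))
    dt = (b - a) / m
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P_star = [row[:] for row in P]
    for i in range(1, m + 1):
        f = factor(A, a + (i - 0.5) * dt, dt)
        P = matmul(f, P)            # later factors multiply from the left
        P_star = matmul(P_star, f)  # later factors multiply from the right
    return P, P_star

# Constant nilpotent matrix: N * N = 0, so every product collapses to I + (b - a) N.
N = [[0.0, 1.0], [0.0, 0.0]]
P_nil, P_star_nil = product_integrals(lambda t: N, 0.0, 2.0, 50)

# Hypothetical non-commuting example: B(s) B(t) != B(t) B(s) for s != t.
B = lambda t: [[0.0, t], [1.0, 0.0]]
P_B, P_star_B = product_integrals(B, 0.0, 1.0, 2000)
```

For the constant matrix the left and right products coincide, while for `B` the two orderings give visibly different matrices, which is why two product integrals are needed.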
Riemann integrability allows us to expand the product integrals of a matrix function in order to relate them to the Chen iterated integrals. This expansion is captured by the following theorem:

Theorem 2.103. Let $A : [a,b] \to \mathbb{R}^{n\times n}$ be a Riemann integrable matrix function. Then the left and right product integrals exist and are given by¹²

$$\prod_a^x \bigl(I + A(t)\,dt\bigr) = I + \sum_{k=1}^{\infty}\int_a^x\!\int_a^{t_k}\!\cdots\!\int_a^{t_2} A(t_k)\cdots A(t_1)\,dt_1\cdots dt_k, \tag{2.87}$$

$$\bigl(I + A(t)\,dt\bigr)\prod_a^x = I + \sum_{k=1}^{\infty}\int_a^x\!\int_a^{t_k}\!\cdots\!\int_a^{t_2} A(t_1)\cdots A(t_k)\,dt_1\cdots dt_k, \tag{2.88}$$

where the series converge absolutely and uniformly for $x \in [a,b]$.

12 Notice the ordering of the matrix functions under the integral signs.
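As a quick check of Theorem 2.103, consider the special case of a constant matrix function $A(t) = A$ (an assumption made only for this illustration). The ordering under the integral signs is then immaterial, each iterated integral is $A^k$ times the volume of the simplex $a \le t_1 \le \cdots \le t_k \le x$, and both series reduce to the matrix exponential:

```latex
\begin{align*}
\int_a^x\!\int_a^{t_k}\!\cdots\!\int_a^{t_2} A^k \, dt_1\cdots dt_k
  &= A^k\,\frac{(x-a)^k}{k!}, \\
\prod_a^x (I + A\,dt) = (I + A\,dt)\prod_a^x
  &= I + \sum_{k=1}^{\infty}\frac{(x-a)^k}{k!}\,A^k = e^{(x-a)A}.
\end{align*}
```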
The following theorem holds:

Theorem 2.104. Consider a Riemann integrable matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$ and

$$Y_1(x) = \prod_a^x \bigl(I + A(t)\,dt\bigr), \tag{2.89}$$

$$Y_2(x) = \bigl(I + A(t)\,dt\bigr)\prod_a^x. \tag{2.90}$$

Then for all $x \in [a,b]$ the following integral equations are satisfied:

$$Y_1(x) = I + \int_a^x A(t)Y_1(t)\,dt, \tag{2.91}$$

$$Y_2(x) = I + \int_a^x Y_2(t)A(t)\,dt. \tag{2.92}$$
2.3.3 Continuity of matrix functions

In order to continue toward our goal of finding solutions to the type of differential matrix equation that emerged in the parallel transport equation in gauge theory, we need to consider the continuity of matrix functions. Just as differentiability of a matrix function was defined using the differentiability of its matrix entries $a_{ij}$, we do the same for continuity.

Definition 2.105. Consider a matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$. Then $A$ is called continuous if the entries $a_{ij}$ of $A$ are continuous functions on $[a,b]$.

With this definition we can write down the types of differential equations we require, which are obtained by differentiating the integral equations of Theorem 2.104.

Theorem 2.106. Consider a continuous matrix function $A : [a,b] \to \mathbb{R}^{n\times n}$. Then for all $x \in [a,b]$ the product integrals

$$Y_1(x) = \prod_a^x \bigl(I + A(t)\,dt\bigr), \tag{2.93}$$

$$Y_2(x) = \bigl(I + A(t)\,dt\bigr)\prod_a^x \tag{2.94}$$

satisfy the conditions

$$Y_1'(x) = A(x)Y_1(x), \tag{2.95}$$

$$Y_2'(x) = Y_2(x)A(x). \tag{2.96}$$

Written in the notation of the left and right derivatives defined in Section 2.3.1, equations (2.95) and (2.96) read:

$$\frac{d}{dx}\prod_a^x \bigl(I + A(t)\,dt\bigr) = A(x), \qquad \bigl(I + A(t)\,dt\bigr)\prod_a^x\,\frac{d}{dx} = A(x). \tag{2.97}$$
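Equation (2.95) can be checked numerically by comparing a fine-partition product $P(A,D)$ with a case whose solution is known in closed form. This is a sketch under assumptions: the test function $A(t) = 2tJ$ with $J = [[0,1],[-1,0]]$ is chosen here because its values at different $t$ commute, so that $Y' = AY$, $Y(0) = I$ is solved by the rotation through the angle $x^2$; the partition size `m` is likewise an arbitrary choice.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def left_product_integral(A, a, x, m=20000):
    """Approximate the left product integral of A over [a, x] by P(A, D)."""
    n = len(A(a))
    dt = (x - a) / m
    Y = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i in range(m):
        Ai = A(a + (i + 0.5) * dt)  # midpoint tag of the i-th cell
        step = [[(1.0 if r == c else 0.0) + Ai[r][c] * dt for c in range(n)]
                for r in range(n)]
        Y = matmul(step, Y)         # later factors stand to the left
    return Y

def A(t):
    """Assumed test function A(t) = 2 t J with J = [[0, 1], [-1, 0]]."""
    return [[0.0, 2.0 * t], [-2.0 * t, 0.0]]

def exact(x):
    """Closed-form solution of Y' = A Y, Y(0) = I: rotation by the angle x**2."""
    c, s = math.cos(x * x), math.sin(x * x)
    return [[c, s], [-s, c]]

Y1 = left_product_integral(A, 0.0, 1.0)
max_error = max(abs(Y1[i][j] - exact(1.0)[i][j])
                for i in range(2) for j in range(2))
```

The discrepancy `max_error` shrinks roughly in proportion to $1/m$, in line with the limit $\nu(D) \to 0$ in the definition of the product integral.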
Moreover, we have

Corollary 2.107. Consider a function $Y : [a,b] \to \mathbb{R}^{n\times n}$. Then $Y$ is a solution of the equation

$$Y'(x) = A(x)Y(x) \tag{2.98}$$

for $x \in [a,b]$ and satisfies $Y(a) = I$ if and only if $Y$ solves the integral equation

$$Y(x) = I + \int_a^x A(t)Y(t)\,dt. \tag{2.99}$$
From the above it is now evident that solutions of equations (2.95) and (2.96) can be presented as

$$Y_1(x) = I + \sum_{k=1}^{\infty}\int_a^x\!\int_a^{x_k}\!\cdots\!\int_a^{x_2} A(x_k)\cdots A(x_1)\,dx_1\cdots dx_k, \tag{2.100}$$

$$Y_2(x) = I + \sum_{k=1}^{\infty}\int_a^x\!\int_a^{x_k}\!\cdots\!\int_a^{x_2} A(x_1)\cdots A(x_k)\,dx_1\cdots dx_k, \tag{2.101}$$

to be compared to the expressions given in Example 2.108.
All the above properties and theorems can be readily extended to matrix functions $A : [a,b] \to \mathbb{C}^{n\times n}$, so that working with matrix representations of gauge groups such as, for example, SU(N) poses no obstacle.
2.3.4 Iterated integrals and path ordering

In this section we shall rewrite the product integrals presented above in the iterated-integral form (Theorem 2.103) in a notation that is more familiar in the context of Wilson lines. To this end we start with a well-known example:

Example 2.108. Consider the Schrödinger equation for a quantum evolution operator in the interaction representation:

$$i\partial_t U(t) = H(t)U(t), \qquad U(0) = 1, \tag{2.102}$$

where $H(t)$ is the interaction Hamiltonian, an operator function acting in the Hilbert space. This unitary operator can also be treated as a complex-valued scalar matrix function $U(t) : [0,t] \to \mathbb{C}$. The iterated integrals which contribute to the solution of equation (2.102) can be rewritten as

$$\int_0^t\!\int_0^{t_1}\!\cdots\!\int_0^{t_{l-1}} H(t_1)\cdots H(t_l)\,dt_1\cdots dt_l = \frac{1}{l!}\int_0^t dt_1\cdots dt_l\; T\{H(t_1)\cdots H(t_l)\}, \tag{2.103}$$

where on the right-hand side each of the variables $t_1,\dots,t_l$ is integrated over $[0,t]$, and $T$ indicates the time-ordering operation for the Hamilton operator $H(t)$. That is, this operation orders the factors $H(t_1)\cdots H(t_l)$ in time, with later times standing to the left. The previous expression then allows for the formal notation for the unitary operator $U(t)$

$$U_\tau(t) \equiv P e^{\left[-i\int_0^t dt'\, H(t')\right]}, \tag{2.104}$$

which could be interpreted as a parallel propagator along a path through the time axis $\tau = [0,t]$. We now wish to do the same thing, but replace the time integration variable $t$ with the variable that parameterizes a curve (path) in a smooth real manifold $M$. More specifically, we are considering the matrix function $A : [0,1] \to \mathbb{C}^{n\times n}$,
so that $A$ can be written as $A = S\circ\varphi$, where

$$\varphi : [0,1] \to M, \qquad t \mapsto \varphi(t) = x^\mu(t)$$

and

$$S : M \to \mathbb{C}^{n\times n}, \qquad x^\mu \mapsto S(x^\mu),$$

that is, $A(t) = S(x(t))$. Applying the same reasoning as in Example 2.108, we see that the equation

$$Y'(t) = A(t)Y(t) \tag{2.105}$$

has a unique solution

$$Y(t) = T e^{\left[\int_0^t dt'\, A(t')\right]} = P e^{\left[\int_0^y dx\, S(x)\right]} \tag{2.106}$$
given the initial condition $Y(0) = 1$, where the time-ordering is replaced with the path-ordering, which orders the operators $S(x)$ along the path in the manifold $M$. We shall return to this type of equation in what follows, after a brief discussion of the relation between product integrals and the Chen integrals from Section 2.1.3. Investigating equation (2.23) more closely, it is easy to see that the operators $\omega_i$ are ordered under the integral sign. Hence, we can rewrite it as

$$\int_0^1\Bigl(\int_{\gamma^t}\omega_1\cdots\omega_{n-1}\Bigr)\omega_n(t)\,dt = P\Bigl\{\int_\gamma\cdots\int_\gamma \omega_1\cdots\omega_n\Bigr\}, \tag{2.107}$$

where we considered the integrals between the braces as ordinary integrals and not as Chen iterated integrals. Using this result we can rewrite the function $Y(t)$ from equation (2.106) with Chen iterated integrals:

$$Y(t) = P e^{\left[\int_0^y dx\, S(x)\right]} = e^{\left[\int_\gamma S\right]}, \tag{2.108}$$

if one identifies the operator $S(x)\,dx$ (interpreted as a form) with the forms $\omega = \omega_1 = \cdots = \omega_n$ from (2.23).

Exercise 2.109. One needs to be careful with this last statement about the $\omega_i$. We can indeed identify them all with $\omega$, which will still depend on the coordinates $x^\mu$ after having chosen a coordinate chart. Consider the simple example $\omega_1\omega_2 \to \omega(x_1)\omega(x_2)$ to clarify this statement.
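The factor $1/l!$ in equation (2.103) of Example 2.108 can be made explicit for $l = 2$. The sketch below writes the time-ordered product with the Heaviside step function $\theta$ (a standard convention assumed here, with later times to the left) and relabels $t_1 \leftrightarrow t_2$ in the second term:

```latex
\begin{align*}
T\{H(t_1)H(t_2)\} &= \theta(t_1 - t_2)\,H(t_1)H(t_2) + \theta(t_2 - t_1)\,H(t_2)H(t_1), \\
\frac{1}{2!}\int_0^t\!\!\int_0^t dt_1\,dt_2\; T\{H(t_1)H(t_2)\}
 &= \frac{1}{2}\int_0^t\! dt_1\!\int_0^{t_1}\! dt_2\, H(t_1)H(t_2)
  + \frac{1}{2}\int_0^t\! dt_2\!\int_0^{t_2}\! dt_1\, H(t_2)H(t_1) \\
 &= \int_0^t\! dt_1\!\int_0^{t_1}\! dt_2\, H(t_1)H(t_2).
\end{align*}
```

For general $l$ the same relabeling applies to each of the $l!$ orderings of $t_1,\dots,t_l$, each of which contributes the same iterated integral.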
Now that the relation between product integrals, Chen integrals and path ordering has been explained we are ready to investigate the parallel transport equation in gauge theory and its connection with Wilson lines.
2.4 Wilson lines, parallel transport and covariant derivative

2.4.1 Parallel transport and Wilson lines

We return now to the parallel transport equation in gauge theory, equation (2.69),

$$\frac{dg_i(t)}{dt} = -A_i(X)\,g_i(t), \tag{2.109}$$

where $A_i$ is a Lie algebra-valued (i.e., a complex matrix when considering matrix representations of the Lie algebra) one-form. Given the initial condition $g_i(0) = e$, a solution can be expressed using product integrals or Chen integrals, yielding (locally) the formal solution in the form of a functional of an arbitrary path $\gamma(t)$:

$$g_i[\gamma(t)] = P e^{\left[-\int_0^t A_{i\mu}(x(t'))\,\frac{dx^\mu}{dt'}\,dt'\right]} \tag{2.110}$$

$$\phantom{g_i[\gamma(t)]} = P e^{\left[-\int_{\gamma(0)}^{\gamma(t)} A_{i\mu}(x)\,dx^\mu\right]} = P e^{\left[-\int_\gamma A_i\right]}, \tag{2.111}$$

where $A_{i\mu} = igA^a_{i\mu}t_a$, with horizontal lift

$$\tilde\gamma(t) = s_i[\gamma(t)]\,g_i[\gamma(t)]. \tag{2.112}$$

Note that the integrals in equation (2.111) are interpreted as Chen iterated integrals. More specifically, we find that if $u_0 \in \pi^{-1}[\gamma(0)]$, then $u_1 \in \pi^{-1}[\gamma(1)]$ is the parallel transport of $u_0$ along the curve $\gamma$:

$$\Gamma(\tilde\gamma) : \pi^{-1}[\gamma(0)] \to \pi^{-1}[\gamma(1)], \qquad u_0 \mapsto u_1.$$

Introducing a coordinate chart we can thus write locally:

$$u_1 = s_i(1)\, P e^{\left[-\int_0^1 A_{i\mu}\,\frac{dx^\mu}{dt}\,dt\right]}. \tag{2.113}$$
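The path-ordered exponential in (2.110) can be approximated by the same kind of ordered product used for product integrals: each small step along the path multiplies by $I - A_\mu\,\Delta x^\mu$ from the left, which is the discretized form of $dg/dt = -A(X)g$. The following is only a sketch. The potential `A_field`, the path `quarter_circle` and the step counts are hypothetical choices made for illustration and do not describe any particular gauge theory.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def wilson_line(A, path, t0, t1, m):
    """Approximate P exp[-int A_mu dx^mu] along path(t) for t in [t0, t1].

    A(x) returns one matrix per coordinate direction mu; path(t) returns the
    coordinates x^mu(t). Later steps multiply from the left.
    """
    n = len(A(path(t0))[0])
    g = [[1.0 + 0.0j if i == j else 0.0j for j in range(n)] for i in range(n)]
    dt = (t1 - t0) / m
    for k in range(m):
        xa = path(t0 + k * dt)
        xb = path(t0 + (k + 1) * dt)
        xm = [(p + q) / 2 for p, q in zip(xa, xb)]  # midpoint of the step
        A_mid = A(xm)
        step = [[1.0 + 0.0j if i == j else 0.0j for j in range(n)]
                for i in range(n)]
        for mu in range(len(xa)):
            dx = xb[mu] - xa[mu]
            for i in range(n):
                for j in range(n):
                    step[i][j] -= A_mid[mu][i][j] * dx  # I - A_mu dx^mu
        g = matmul(step, g)
    return g

def A_field(x):
    """Hypothetical anti-Hermitian potential: A_0 = i x^1 sigma_1, A_1 = i sigma_3."""
    return [[[0j, 1j * x[1]], [1j * x[1], 0j]],
            [[1j, 0j], [0j, -1j]]]

def quarter_circle(t):
    """Hypothetical path in the plane, parameterized by t in [0, 1]."""
    return [math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)]

full = wilson_line(A_field, quarter_circle, 0.0, 1.0, 2000)
first = wilson_line(A_field, quarter_circle, 0.0, 0.5, 1000)
second = wilson_line(A_field, quarter_circle, 0.5, 1.0, 1000)
composed = matmul(second, first)  # transport along the first half, then the second
```

Transporting along the whole path agrees with transporting along the first half and then the second half, with the later segment standing to the left.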
Exercise 2.110. Why is the formal solution (2.110) only valid locally?

The relation with Wilson lines is now straightforward when considering equation (2.110): a Wilson line along a path $\gamma$ is the parallel transporter along this path. Because of this relationship a Wilson line is sometimes also referred to as a gauge link. Using the properties of the principal fiber bundle we obtain

$$R_g\,\Gamma(\tilde\gamma)(u_0) = u_1 g \tag{2.114}$$

and

$$\Gamma(\tilde\gamma)\,R_g(u_0) = \Gamma(\tilde\gamma)(u_0 g), \tag{2.115}$$

which, together with the fact that $\tilde\gamma(t)g$ is the horizontal lift through $u_0 g$ and $u_1 g$, shows that $\Gamma(\tilde\gamma)$ commutes with the right action.

Exercise 2.111. Using the properties of Chen integrals, prove that

$$\Gamma(\tilde\gamma^{-1}) = \bigl(\Gamma(\tilde\gamma)\bigr)^{-1}.$$

Exercise 2.112. Again using the properties of Chen integrals, prove that if we have two curves $\alpha_{1,2} : [0,1] \to M$ such that $\alpha_1(1) = \alpha_2(0)$, then

$$\Gamma(\widetilde{\alpha_1\alpha_2}) = \Gamma(\tilde\alpha_2)\circ\Gamma(\tilde\alpha_1).$$
2.4.2 Holonomy, curvature and the Ambrose–Singer theorem

2.4.2.1 Holonomy

In the previous section we have clarified the relation between Wilson lines and the parallel transport equation. Now we wish to discuss the relation between Wilson loops and holonomies. Consider a fiber bundle $P(Y,G,\pi)$ and two curves $\gamma_1$ and $\gamma_2$ in $Y$ such that

$$\gamma_1(0) = \gamma_2(0) = p_0 \quad\text{and}\quad \gamma_1(1) = \gamma_2(1) = p_1.$$

If we consider the horizontal lifts of these curves for which $\tilde\gamma_1(0) = \tilde\gamma_2(0) = u_0$, then we do not necessarily get $\tilde\gamma_1(1) = \tilde\gamma_2(1)$. This means that if we consider a loop $\gamma$ in $Y$, i.e., $\gamma(0) = \gamma(1)$, then, in general, the horizontal lift is not closed: $\tilde\gamma(0) \ne \tilde\gamma(1)$. In other words, a loop $\gamma$ induces a transformation

$$\tau_\gamma : \pi^{-1}(p) \to \pi^{-1}(p)$$

on the fiber at $p$. Because the parallel transport $\Gamma(\tilde\gamma)$ commutes with the right action we obtain

$$\tau_\gamma(ug) = \tau_\gamma(u)g. \tag{2.116}$$

Fixing a point $p$ in the manifold $Y$ and considering all loops for which this point is the base point, written as $C_p(Y)$, $\tau_\gamma$ can only reach certain elements of $G$. The set of elements that can be reached forms a subgroup of the structure group $G$, called the holonomy group at $u$, where $\pi(u) = p$:

$$\Phi_u = \{g \in G \mid \tau_\gamma(u) = ug,\ \gamma \in C_p(Y)\}. \tag{2.117}$$

Exercise 2.113. Show that the elements of $\Phi_u$ form a group.

An interesting fact is that $\tau_{\gamma^{-1}} = \tau_\gamma^{-1}$, inducing $g_{\gamma^{-1}} = g_\gamma^{-1}$. From the discussion on parallel transport, we find that the elements of the holonomy group can be treated as Wilson loops

$$g_\gamma = P\exp\Bigl[-\oint_\gamma A_{i\mu}(x)\,dx^\mu\Bigr]. \tag{2.118}$$
2.4.2.2 Curvature

Before we continue the discussion of holonomies, we need to introduce the curvature two-form in gauge theory.

Definition 2.114 (Covariant derivative). Suppose we have a vector space $V$ of dimension $k$ with a basis denoted by $\{e_\alpha\}$. Let $\phi : TP\wedge\cdots\wedge TP \to V$ and $X_1,\dots,X_{n+1} \in T_uP$. The covariant derivative acting on

$$\phi = \sum_{\alpha=1}^{k}\phi^\alpha\otimes e_\alpha$$

is then defined as

$$D\phi(X_1,\dots,X_{n+1}) \equiv d_P\phi(X_1^H,\dots,X_{n+1}^H), \tag{2.119}$$

with $d_P\phi \equiv d_P\phi^\alpha\otimes e_\alpha$, where $d_P$ is the exterior differential for the fiber bundle $P$ and $X^H$ denotes the horizontal component of $X$.

The curvature can then be introduced using this definition of the covariant derivative:

Definition 2.115 (Curvature two-form). The curvature two-form $\Omega$ is the covariant derivative of the Ehresmann connection one-form $\omega$:

$$\Omega \equiv D\omega \in \textstyle\bigwedge^2 P\otimes\mathfrak{g}. \tag{2.120}$$

The right action on the curvature is expressed by the proposition:

Proposition 2.116. The curvature transforms under the right action of an element $g \in G$ as

$$R_g^*\Omega = g^{-1}\Omega g. \tag{2.121}$$

Exercise 2.117. Prove this proposition starting from the observation that $R_{g*}$ preserves horizontal subspaces and that $d_P R_g^* = R_g^* d_P$.

In gauge theory notation this can be rewritten as

$$R_g^*F_{\mu\nu} = g^{-1}F_{\mu\nu}g,$$

where $F_{\mu\nu}$ is the gauge-covariant field strength. The above notation allows us to introduce Cartan's structure equation, which will also look familiar when written with field strength tensors.

Theorem 2.118 (Cartan's structure equation). Consider $X_1, X_2 \in T_uP$. The curvature $\Omega$ and the Ehresmann connection $\omega$ satisfy the Cartan structure equation

$$\Omega(X_1,X_2) = d_P\omega(X_1,X_2) + [\omega(X_1),\omega(X_2)]. \tag{2.122}$$

It can also be written in the form

$$\Omega = d_P\omega + \omega\wedge\omega. \tag{2.123}$$

Now the field strength tensor (also called the gauge curvature) can be written as $F = dA + A\wedge A$, with components

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu,A_\nu], \tag{2.124}$$

which should look more familiar to physicists.
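As a consistency check, consider the special case of an abelian structure group, for instance $G = U(1)$ (an assumption made only for this illustration). All commutators then vanish, path ordering is trivial, and for a loop $\gamma$ that bounds a compatibly oriented surface $\Sigma$ Stokes' theorem expresses the Wilson loop (2.118) through the flux of the field strength:

```latex
\begin{align*}
F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu, \\
g_\gamma &= \exp\Bigl[-\oint_\gamma A_\mu\,dx^\mu\Bigr]
          = \exp\Bigl[-\tfrac{1}{2}\int_\Sigma F_{\mu\nu}\,dx^\mu\wedge dx^\nu\Bigr],
          \qquad \gamma = \partial\Sigma .
\end{align*}
```

In this case a vanishing field strength on $\Sigma$ forces the holonomy around $\gamma$ to be trivial.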
2.4.2.3 The Ambrose–Singer theorem

The connection of Wilson loops with holonomies is supposed to allow one, in principle, to recast gauge theory in the space of generalized loops. The Ambrose–Singer theorem is the cornerstone of this program.

Theorem 2.119 (Ambrose–Singer). Consider a principal fiber bundle $P(Y,G,\pi)$ with connection $\omega$ and curvature form $\Omega$. Let $\Phi(u)$ be the holonomy group with reference point $u \in P(Y,G,\pi)$ and $P(u)$ the holonomy bundle of $\omega$ through $u$. Then the Lie algebra of $\Phi(u)$ is equal to the Lie subalgebra of $\mathfrak{g}$ generated by all elements of the form $\Omega_p(v_1,v_2)$ for $p \in P(u)$ and $v_1, v_2$ horizontal vectors at $p$, where $\mathfrak{g}$ is the Lie algebra of $G$.

Expressed in words, this theorem says that the physical content of a gauge theory on the principal fiber bundle $P$ with connection $\omega$ can also be found in the holonomy group $\Phi(u)$. In other words, there exists an equivalent loop space representation of a gauge theory. A downside of this approach is that the holonomies are labeled by loops, which form an infinite dimensional space, so that we have abundant information or, said differently, the free loop space is overcomplete. Furthermore, the holonomy group is gauge dependent, so that if we want to express physical observables as functions of the holonomies, these functions will need to be gauge invariant. Fortunately, we shall see that considering generalized loops in the sense of Chen integrals as d-loops enables us to deal with these issues.
2.4.2.4 Wilson loop functional

Let us summarize and recapitulate some of the properties of Wilson lines and loops from a gauge theory point of view and introduce the gauge invariant Wilson loop functionals, which in the next sections will be used to introduce and study generalized loop space. Remember that a Wilson line

$$U_\gamma = P e^{\left[\int_\gamma A_\mu\right]} \tag{2.125}$$

is a solution of the parallel transport equation. When $\gamma$ is a closed path (a loop) this becomes

$$U_\gamma = P e^{\left[\oint_\gamma A_\mu\right]}. \tag{2.126}$$

Notice that the infinite series obtained by expanding the exponential converges to an element $g \in G$. As we have seen before, the gauge link is not gauge invariant, but transforms as

$$U_\gamma^g = g_y^{-1}U_\gamma g_x \tag{2.127}$$

for a path $\gamma$ from $x$ to $y$, or as

$$U_\gamma^g = g_x^{-1}U_\gamma g_x \tag{2.128}$$

when $\gamma$ is a loop with base point $x = \gamma(0)$. Since observables are by definition gauge invariant and, as we will see, the advantage of using generalized loop space is its gauge invariance, we define the gauge invariant Wilson path/loop functional $W : LM \to \mathbb{C}$ by

$$W(\gamma) = \frac{1}{N}\operatorname{Tr}U_\gamma, \tag{2.129}$$

where $LM$ represents the space of all loops in $M$. By continuity of the trace and the expansion of the exponential in Chen integrals we get

$$W(\gamma) = \frac{1}{N}\sum_{n\ge 0}\operatorname{Tr}\int_\gamma\underbrace{\omega\omega\cdots\omega}_{n}, \tag{2.130}$$

with, as before, the convention that

$$\int_\gamma\underbrace{\omega\omega\cdots\omega}_{n} = \mathrm{Id}$$

if $n = 0$. Expressed with the gauge potentials $A_\mu$ this Wilson loop can be written as

$$W_\gamma = \operatorname{Tr}P e^{\left[\int_\gamma A_\mu\right]} \tag{2.131}$$

for open paths and

$$W_\gamma = \operatorname{Tr}P e^{\left[\oint_\gamma A_\mu\right]} \tag{2.132}$$

for loops. The loop expression (2.132) is gauge invariant owing to the cyclicity of the trace, so that Wilson loop functionals are indeed gauge invariant functions of the holonomies; for an open path, (2.127) shows that the trace (2.131) is invariant only under gauge transformations that coincide at the two endpoints. In terms of d-paths these Wilson loop functionals are complex-valued d-paths

$$W_\gamma \in \mathrm{Alg}(\mathrm{Sh}(\Omega),\mathbb{C}),$$

i.e., they vanish on the ideal $I(d,p)$ defined in Section 2.1.2.
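The gauge invariance of the Wilson loop functional for a loop with base point $x$ follows in one line from the transformation law (2.128) and the cyclicity of the trace; the sketch below only spells out this step:

```latex
\begin{align*}
\operatorname{Tr} U_\gamma^{g}
 = \operatorname{Tr}\bigl(g_x^{-1}\, U_\gamma\, g_x\bigr)
 = \operatorname{Tr}\bigl(U_\gamma\, g_x\, g_x^{-1}\bigr)
 = \operatorname{Tr} U_\gamma .
\end{align*}
```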
2.5 Generalization of manifolds and derivatives

We wish to show that the generalized loop space exhibits a manifold structure, although not a usual one. Namely, this space is not locally homeomorphic to the Euclidean space $\mathbb{R}^n$, as is required for ordinary manifolds. To describe the manifold-like structure we need to generalize the manifold concept to allow for spaces that are modeled on, for instance, Banach spaces. This generalization allows us to extend the manifold concept to infinite dimensional spaces. With the aid of generalized manifolds one can generalize derivatives. The most important generalization for our purposes is the Fréchet derivative. In the last section of the present chapter we shall discuss this derivative and some of its nice properties in more detail; here we only present the necessary mathematical preliminaries.
2.5.1 Manifold: Fréchet derivative and Banach manifold

A real smooth manifold is a topological space that is locally homeomorphic to $\mathbb{R}^n$. This manifold concept can be extended to a larger class in which the manifold is no longer modeled on a Euclidean space but on a Banach space.¹³ Put differently, the underlying topological space is locally homeomorphic to an open set in a Banach space, which allows one to extend the manifold concept to infinite dimensions. A more formal definition will be given below, but we first need to generalize the derivative concept to the so-called Fréchet derivative. This derivative is defined on Banach spaces and can be interpreted as a generalization of the derivative of a real-valued function of one real parameter to the case of a vector-valued function depending on multiple real values. This is what we will need to define derivatives on the generalized loop space; as we will see, it is actually necessary to define the functional derivative in this space. To give the definition of the Fréchet derivative we need the concept of a bounded linear operator.

13 Complete normed vector spaces.

Definition 2.120 (Bounded linear operator). A bounded linear operator is a linear transformation $L$ between normed vector spaces $X$ and $Y$ for which the ratio of the norm of $L(v)$ to that of $v$ is bounded by the same number over all nonzero vectors $v \in X$. In other words, there exists $M > 0$ such that for all $v \in X$

$$\|L(v)\|_Y \le M\|v\|_X.$$

The smallest such $M$ is called the operator norm $\|L\|_{\mathrm{op}}$ of $L$.

A bounded linear operator is generally not a bounded function, which would require that the norm of $L(v)$ be bounded for all $v$; this is not possible unless $L$ is the zero operator. Put more correctly, a bounded linear operator is a locally bounded function. Let us recall that a linear operator on a metrizable vector space is bounded if and only if it is continuous. With the above we are now ready to define the Fréchet derivative.

Definition 2.121 (Fréchet derivative). Consider Banach spaces $X_1, X_2$, and let $U \subset X_1$ be an open subset. A function $F : U \to X_2$ is called Fréchet differentiable at $x \in U$ if there exists a bounded linear operator $A_x : X_1 \to X_2$ such that

$$\lim_{\Delta\to 0}\frac{\|F(x+\Delta) - F(x) - A_x(\Delta)\|_{X_2}}{\|\Delta\|_{X_1}} = 0, \tag{2.133}$$

where the limit is understood in the usual sense. If this limit exists, then $DF(x) = A_x$ stands for the Fréchet derivative. We call the function $F$ $C^1$ if

$$DF : U \to B(X_1,X_2); \qquad x \mapsto DF(x) = A_x \tag{2.134}$$

is continuous, where $B(X_1,X_2)$ denotes the space of bounded linear operators from $X_1$ to $X_2$. Note the difference between this continuity of the map $x \mapsto DF(x)$ and the continuity of each operator $DF(x)$ required above. The usual derivative of a real function can be easily recovered from this definition. To this end, let us take $F : \mathbb{R} \to \mathbb{R}$, so that $DF(x)$ is the function $t \mapsto tF'(x)$.
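A simple worked example of Definition 2.121 is the squaring map on square matrices. The sketch assumes $X_1 = X_2 = \mathbb{R}^{n\times n}$ equipped with a submultiplicative norm (for instance the operator norm); this setting is chosen for illustration and is not taken from the text.

```latex
\begin{align*}
F(A) &= A^2, \qquad
F(A + \Delta) - F(A) = A\Delta + \Delta A + \Delta^2, \\
DF(A)(\Delta) &= A\Delta + \Delta A, \qquad
\|DF(A)(\Delta)\| \le 2\|A\|\,\|\Delta\|, \\
\frac{\|F(A+\Delta) - F(A) - DF(A)(\Delta)\|}{\|\Delta\|}
 &= \frac{\|\Delta^2\|}{\|\Delta\|} \le \|\Delta\| \;\xrightarrow{\;\Delta\to 0\;}\; 0 .
\end{align*}
```

The candidate derivative $\Delta \mapsto A\Delta + \Delta A$ is linear and bounded, and the remainder $\Delta^2$ vanishes faster than $\|\Delta\|$, so (2.133) is satisfied. For $n = 1$ this reduces to the familiar $t \mapsto 2xt$.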
The Fréchet derivative can be extended to arbitrary topological vector spaces (TVSs). The latter are defined as vector spaces with a topology that makes the addition and scalar multiplication operations continuous, i.e., the topology is consistent with the linear structure of the vector space.

Definition 2.122 (Fréchet derivative for topological vector spaces). Let $X_1, X_2$ be topological vector spaces with $U \subset X_1$ an open subset that contains the origin, and let $F : U \to X_2$ be a function preserving the origin, $F(0) = 0$. To continue, it is necessary to explain what it means for this function to have 0 as its derivative. We call the function $F$ tangent to 0 if for every open neighborhood $V_2 \subset X_2$ of $0_{X_2}$ there is an open neighborhood $V_1 \subset X_1$ of $0_{X_1}$, together with a function $H : \mathbb{R} \to \mathbb{R}$, such that

$$\lim_{\Delta\to 0}\frac{H(\Delta)}{\Delta} = 0$$

and for all $\Delta$

$$F(\Delta V_1) \subset H(\Delta)V_2.$$

The somewhat strange constraint $F(0) = 0$ can be removed by defining $F$ to be Fréchet differentiable at a point $x_0 \in U$ if there exists a continuous linear operator $\lambda : X_1 \to X_2$ such that $F(x_0 + \Delta) - F(x_0) - \lambda\Delta$, considered as a function of $\Delta$, is tangent to 0. It can further be demonstrated that if the Fréchet derivative exists, then it is unique. Similarly to the usual properties of differentiable functions we find that
– if a function is Fréchet differentiable at a point, it is necessarily continuous at this point;
– sums and scalar multiples of Fréchet differentiable functions are differentiable.
Hence we conclude that the space of functions that are Fréchet differentiable at some point $x$ forms a subspace of the functions that are continuous at that point. Moreover, the chain rule also holds, as does the Leibniz rule whenever the target space is an algebra and a topological vector space in which multiplication is continuous. This will turn out to be exactly the case for the space of generalized loops, where the algebra multiplication is the shuffle product. Using the above generalization of the derivative we can extend the manifold concept to that of a Banach manifold:

Definition 2.123 (Banach manifold). Take a set $X$. An atlas of class $C^n$, $n \ge 0$, on $X$ is defined as a collection of pairs (charts) $(U_i,\varphi_i)$, $i \in I$, such that
1. for each $i \in I$, $U_i \subset X$, and $\bigcup_i U_i = X$;
2. for each $i \in I$, $\varphi_i$ is a bijection from $U_i$ onto an open subset $\varphi_i(U_i)$ of some Banach space $E_i$, and $\varphi_i(U_i\cap U_j)$ is open in $E_i$;
3. the crossover map $\varphi_j\circ\varphi_i^{-1} : \varphi_i(U_i\cap U_j) \to \varphi_j(U_i\cap U_j)$ is an $n$-times continuously differentiable function for all $i, j \in I$, meaning that the $n$-th Fréchet derivative $D^n(\varphi_j\circ\varphi_i^{-1}) : \varphi_i(U_i\cap U_j) \to \mathrm{Lin}(E_i^n; E_j)$ exists and is a continuous function with respect to the $E_i$-norm topology on subsets of $E_i$ and the operator norm topology (i.e., the topology induced by a norm on the space of bounded linear operators, Definition 2.120) on the space of linear operators $\mathrm{Lin}(E_i^n; E_j)$,
where $E_i^n$ takes into account that the $n$-times iterated application of the linear operator defines the $n$-th Fréchet derivative.

It can be shown that there is a unique topology on $X$ such that for all $i \in I$, $U_i$ is open and $\varphi_i$ is a homeomorphism. This topological space is assumed to be a Hausdorff space in most cases, but this is not necessary from the point of view of the formal definition. In the cases where all the $E_i$ are equal to the same space $E$, the atlas is called an $E$-atlas. However, it is not necessary that the Banach spaces $E_i$ be the same space, or even isomorphic as topological vector spaces. But if two charts $(U_i,\varphi_i)$ and $(U_j,\varphi_j)$ are such that $U_i\cap U_j \ne \emptyset$, it clearly follows from the derivative of the crossover map

$$\varphi_j\circ\varphi_i^{-1} : \varphi_i(U_i\cap U_j) \to \varphi_j(U_i\cap U_j)$$

that $E_i \cong E_j$, that is, they are isomorphic as topological vector spaces. It is important to realize that the set of points $x \in X$ for which there is a chart $(U_i,\varphi_i)$ with $x \in U_i$ and $E_i$ isomorphic to a given Banach space $E$ is both open and closed. Hence, one can assume that, on each connected component of $X$, the atlas is an $E$-atlas for some fixed $E$.

Similarly to the common differentiable manifolds, a new chart $(U,\varphi)$ is called compatible with a given atlas $\{(U_i,\varphi_i) \mid i \in I\}$ if the crossover map

$$\varphi_i\circ\varphi^{-1} : \varphi(U\cap U_i) \to \varphi_i(U\cap U_i)$$

is an $n$-times continuously differentiable function for all $i \in I$. Two atlases are compatible when each chart in one atlas is compatible with the other atlas. Compatibility of atlases defines an equivalence relation on the class of all possible atlases on $X$. Just like in the situation with real smooth manifolds, a $C^n$-manifold structure on $X$ is defined as a choice of an equivalence class of atlases on $X$ of class $C^n$. If all the Banach spaces $E_i$ are isomorphic as topological vector spaces (as is guaranteed to be the case if $X$ is connected), then an equivalent atlas can be found for which they are all equal to some Banach space $E$. $X$ is then called an $E$-manifold, or one says that $X$ is modeled on $E$. We end this discussion with the remark that a Hilbert manifold is a special case of a Banach manifold in which the manifold is locally modeled on Hilbert spaces.
2.5.2 Fréchet manifold

The concept of Banach manifolds can be further generalized by making use of Fréchet spaces, which are a special kind of topological vector spaces. Fréchet spaces are locally convex spaces which are complete with respect to a translation invariant metric; their metric does not need to be generated by a norm. Notice that this means that not every Fréchet space is a Banach space, since the latter requires a norm. Typical examples are spaces of infinitely differentiable functions. We give below two equivalent definitions of a Fréchet space, one using translation invariant metrics and one using a family of semi-norms.

Definition 2.124 (Fréchet spaces via translation invariant metrics). A topological vector space $X$ is a Fréchet space if and only if it satisfies the following three properties:
1. It is locally convex, i.e., there is a local basis of convex sets for its topology at every point.
2. Its topology can be induced by a translation invariant metric $d$, meaning that a subset $U \subset X$ is open if and only if for all $u_1 \in U$ there exists $\epsilon > 0$ such that $\{u_2 : d(u_2,u_1) < \epsilon\} \subset U$.
3. It is a complete metric space.

Note that there is no natural notion of distance between two points of a Fréchet space: many different translation invariant metrics may induce the same topology. The second definition is built on a family of semi-norms.

Definition 2.125 (Fréchet spaces via family of semi-norms). A topological vector space $X$ is a Fréchet space if and only if it satisfies the following three properties:
1. It is a Hausdorff space.
2. Its topology may be induced by a countable family of semi-norms $\|\cdot\|_l$, $l = 0, 1, 2, \dots$. This means that a subset $U \subset X$ is open if and only if for all $u_1 \in U$ there exist $K \ge 0$ and $\epsilon > 0$ such that $\{u_2 : \|u_2 - u_1\|_l < \epsilon,\ \forall l \le K\} \subset U$.
3. It is complete with respect to the family of semi-norms.
A sequence $(x_n)$ in $X$ converges to $x$ in the Fréchet space defined by a family of semi-norms if and only if it converges to $x$ with respect to each of the given semi-norms. Note that every Banach space is a Fréchet space, as the norm induces a translation invariant metric and the space is complete with respect to this metric. The following example illustrates how semi-norms turn a vector space into a Fréchet space; the shuffle algebra can be made topological by semi-norms in the same way.

Example 2.126. The vector space $C^\infty([0,1])$ of infinitely differentiable functions $F : [0,1] \to \mathbb{R}$ becomes a Fréchet space with the semi-norms

$$|F|_{(l)} = \sup\Bigl\{\Bigl|\frac{d^l}{dx^l}F(x)\Bigr| : x \in [0,1]\Bigr\} \tag{2.135}$$

for all integers $l \ge 0$. A sequence $(F_n)$ of functions converges to $F \in C^\infty([0,1])$ if and only if for all integers $l \ge 0$ the sequence $(F_n^{(l)})$ converges uniformly to $F^{(l)}$, where

$$F^{(l)} = \frac{d^l}{dx^l}F(x).$$
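The need for the whole family of semi-norms in Example 2.126 can be illustrated numerically. The sketch below uses the hypothetical test sequence $F_n(x) = x^n/n$ and approximates each supremum in (2.135) on a finite grid; both the sequence and the grid are assumptions made for this illustration.

```python
def derivative(coeffs):
    """Coefficients of the derivative of sum_k coeffs[k] * x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def seminorm(coeffs, l, grid_points=1001):
    """Grid approximation of |F|_(l) = sup |d^l F / dx^l| on [0, 1]."""
    for _ in range(l):
        coeffs = derivative(coeffs)
    return max(abs(evaluate(coeffs, i / (grid_points - 1)))
               for i in range(grid_points))

def F(n):
    """Coefficients of the hypothetical test function F_n(x) = x**n / n."""
    return [0.0] * n + [1.0 / n]

# |F_n|_(0) = 1/n tends to zero, but |F_n|_(1) = sup |x**(n-1)| stays equal to 1.
sup_norms = [seminorm(F(n), 0) for n in (1, 10, 100)]
first_derivative_norms = [seminorm(F(n), 1) for n in (1, 10, 100)]
```

The sequence tends to zero in the semi-norm with $l = 0$ but not in the one with $l = 1$, so it does not converge to zero in $C^\infty([0,1])$: convergence has to hold in every semi-norm of the family.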
Considering differentiation of maps between Fréchet spaces, one has to be careful. Take Fréchet spaces $X_1, X_2$. The set $L(X_1,X_2)$ of all continuous linear maps $X_1 \to X_2$ is not a Fréchet space in any natural manner. This is where the theory of Banach spaces and that of Fréchet spaces strongly deviate, and we need a different definition for continuous differentiability of functions defined on Fréchet spaces, the Gâteaux derivative:

Definition 2.127 (Gâteaux derivative). Suppose $X_1, X_2$ are Fréchet spaces, $U \subset X_1$ open, $P : U \to X_2$ a function, $x \in U$, and $V \in X_1$. Then $P$ is called differentiable at $x$ in the direction $V$ if the following limit exists:

$$D_V[P(x)] = \lim_{\Delta\to 0}\frac{P(x + V\Delta) - P(x)}{\Delta}. \tag{2.136}$$
Then P is called continuously differentiable in U if D[P] : U × X1 → X2 ,
(2.137)
is continuous. Since the product of Fr´echet spaces is again a Fr´echet space, we can then differentiate D[P] and define the higher derivatives of P in this fashion. The derivative operator P : C∞ ([0, 1]) → C∞ ([0, 1]) defined by P(x) = x is itself infinitely differentiable. The first derivative reads DV [P(x)] = V
(2.138)
for any two elements x, V ∈ C∞ ([0, 1]). This is an important advantage of the Fr´echet space C∞ ([0, 1]) as compared to the Banach space Ck ([0, 1]) for finite k. If P : U → X2 is a continuously differentiable function, then the differential equation x (t) = P(x(t)), x(0) = x0 ∈ U,
(2.139)
need not have any solutions, and even if it does, the solutions need not be unique, in strong contrast to the situation in Banach spaces.¹⁴ One can now define Fréchet manifolds as spaces that locally look like Fréchet spaces, and one can then extend the concept of Lie groups to these manifolds, leading
14 We emphasize that the inverse function theorem does not hold in Fréchet spaces. A partial substitute for it is the Nash–Moser theorem, which extends the notion of an inverse function from Banach spaces to a class of Fréchet spaces. In contrast to the Banach space case, in which the invertibility of the derivative (where the derivative is interpreted as a linear operator) at a point is sufficient for a map to be locally invertible, the Nash–Moser theorem requires the derivative to be invertible in a vicinity of the point. The theorem is widely used to prove local uniqueness for nonlinear partial differential equations in spaces of smooth functions.
to a Fréchet Lie group. Such a Lie group is a group G which is also a manifold, but now an (infinite-dimensional) Fréchet manifold, such that the map
$$G \times G \to G, \qquad (g, h) \mapsto gh^{-1} \tag{2.140}$$
is continuous. This is useful because for a given (ordinary) compact $C^\infty$-manifold M, the set of all $C^\infty$ diffeomorphisms F : M → M forms a generalized Lie group in this sense, and this Lie group captures the symmetries of M. Some of the relations between Lie algebras and Lie groups remain valid in this setting, which will be used when studying the group structure and Lie algebra structure of generalized loop space.
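As a worked illustration of the Gâteaux derivative (2.136) for a nonlinear map, one may consider the squaring map on $C^\infty([0,1])$. This map is chosen here purely for illustration and is not taken from the text; the computation is a sketch of how the limit is evaluated.

```latex
% Illustrative example (an assumption of this sketch): P(x) = x^2 on C^\infty([0,1]).
\begin{align*}
D_V[P(x)]
  &= \lim_{\Delta\to 0}\frac{(x+V\Delta)^2 - x^2}{\Delta}
   = \lim_{\Delta\to 0}\bigl(2xV + \Delta\,V^2\bigr)
   = 2xV .
\end{align*}
% The limit holds in every semi-norm (2.135), since
% |\Delta V^2|_{(l)} = |\Delta|\,|V^2|_{(l)} \to 0 for each fixed l.
```

The remainder term $\Delta V^2$ vanishes with respect to each semi-norm separately, which is precisely the notion of convergence used in a Fréchet space.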
3 The group of generalized loops and its Lie algebra

3.1 Introduction

In the previous chapter we introduced d-paths and d-loops as algebra morphisms. We have already demonstrated that Shc(d, p) forms a group with respect to the multiplication introduced in Definition 2.10 and that this algebra is isomorphic to the (Chen) integral algebra $A_p$ generated by all functionals $X^{\omega_1\cdots\omega_n}$, such that d-loops can be identified with elements of the algebra morphisms Alg($A_p$, K). From now on we set K ≡ ℂ. The algebra Shc(d, p) can be supplied with a topology turning it into a topological algebra, more specifically into a locally multiplicative-convex (LMC) algebra. This topology is built from semi-norms, a construction that is due to Tavares. We shall explicitly discuss the construction of this topology, next to a diagrammatic overview of the different steps in this topologization process. Equipped with such a topology, Shc(d, p) turns into a Fréchet space, and combined with the fact that the generalized loops form a group this will also return a Fréchet Lie group and algebra. The algebraic properties, combined with the differential operations from Section 2.1.1 and the fact that limits are well-defined in this new space, allow us to extend differential calculus on manifolds to the generalized manifolds discussed before. Several differential operators will be introduced in Section 2.5, which generate variations of the loops. The exposition in this chapter is based mostly on the works by Tavares (see References).
3.2 The shuffle algebra over Ω = ⋀M as a Hopf algebra

The main advantage of d-paths is that they can be considered as algebraic paths, in the sense that they have a rich algebraic structure that can be used to derive many interesting properties. In this section we investigate these issues in more detail. We start by restating the co-multiplication and co-unit of the shuffle algebra and their properties:
$$\Delta(\omega_1\cdots\omega_n) = \sum_{i=0}^{n}\omega_1\cdots\omega_i \otimes \omega_{i+1}\cdots\omega_n, \qquad \epsilon(\omega_1\cdots\omega_n) = \begin{cases} 0, & \text{if } n \ge 1,\\ 1, & \text{if } n = 0. \end{cases} \tag{3.1}$$
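Properties of the co-multiplication and co-unit are the following:
$$\begin{aligned}
(\Delta\otimes 1)\circ\Delta &= (1\otimes\Delta)\circ\Delta && \text{(co-associative law)}\\
(1\otimes\epsilon)\circ\Delta &= (\epsilon\otimes 1)\circ\Delta = 1 && \text{(co-unitary property)}\\
\Delta(u\bullet v) &= \Delta(u)\bullet\Delta(v) && \text{($\Delta$ is an algebra morphism)}\\
\epsilon(u\bullet v) &= \epsilon(u)\bullet\epsilon(v) && \text{($\epsilon$ is an algebra morphism)}
\end{aligned} \tag{3.2}$$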
for all u, v ∈ Sh. A complete Hopf algebra structure is given by a multiplication (the shuffle product), a unit, a co-multiplication, a co-unit and an antipode. The antipode was defined in Definition 2.8 as a K-linear map J : Sh → Sh and is restated here for convenience:
$$J(\omega_1\cdots\omega_n) = (-1)^n\,\omega_n\cdots\omega_1, \tag{3.3}$$
with the properties given before, see equation (2.7).

Exercise 3.1. Using the above definitions and properties prove that
$$\sum_{i=0}^{n}(-1)^i\,\omega_i\cdots\omega_1\bullet\omega_{i+1}\cdots\omega_n = \sum_{i=0}^{n}(-1)^{n-i}\,\omega_1\cdots\omega_i\bullet\omega_n\cdots\omega_{i+1} = \epsilon(\omega_1\cdots\omega_n). \tag{3.4}$$
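The identity (3.4) can be checked on small words before attempting a proof. The following is a minimal sketch, assuming that words ω1⋯ωn are modelled as Python tuples of arbitrary labels and that elements of Sh are modelled as dictionaries from words to integer coefficients; the function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def shuffle(u, v):
    """Shuffle product of two words (tuples); returns word -> multiplicity."""
    if not u:
        return Counter({tuple(v): 1})
    if not v:
        return Counter({tuple(u): 1})
    out = Counter()
    # The first letter of any shuffle comes either from u or from v.
    for w, c in shuffle(u[1:], v).items():
        out[(u[0],) + w] += c
    for w, c in shuffle(u, v[1:]).items():
        out[(v[0],) + w] += c
    return out


def deconcatenate(w):
    """Co-multiplication (3.1): all splittings of w into a left and a right part."""
    return [(w[:i], w[i:]) for i in range(len(w) + 1)]


def antipode(w):
    """Antipode (3.3): returns the sign (-1)^n and the reversed word."""
    return (-1) ** len(w), tuple(reversed(w))


def antipode_identity(w, antipode_on_left=True):
    """Evaluate one of the two sums in (3.4); zero coefficients are dropped."""
    total = Counter()
    for left, right in deconcatenate(w):
        if antipode_on_left:
            sign, left = antipode(left)
        else:
            sign, right = antipode(right)
        for word, c in shuffle(left, right).items():
            total[word] += sign * c
    return {word: c for word, c in total.items() if c != 0}


if __name__ == "__main__":
    for word in [(), ("a",), ("a", "b"), ("a", "b", "c")]:
        print(word, antipode_identity(word), antipode_identity(word, False))
```

For a non-empty word both sums cancel term by term and the result is the empty dictionary, i.e. zero, while for the empty word only the unit survives, in agreement with the co-unit in (3.1). Such a check does not replace the proof asked for in the exercise.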
The above part describes the Hopf algebra structure of Sh(Ω), but we wish to extend this structure to the algebra $A_p$ generated by the functionals $X^{\omega_1\cdots\omega_n}$, equation (2.52). This extension follows from Proposition 2.39, which turns the surjective map Sh(Ω) → $A_p$, defined by $1 \mapsto 1$ and $\omega_1\cdots\omega_n \mapsto X^{\omega_1\cdots\omega_n}$, into a homomorphism of algebras. Since this map is an algebra morphism, the algebraic structure of Sh(Ω) is preserved under it. Proposition 2.42 and Theorem A.114 imply that the kernel of this morphism contains the ideal I(d, p). This ideal reads
$$\omega_1\cdots\omega_{i-1}(F\omega_i)\omega_{i+1}\cdots\omega_n - F(p)\,\omega_1\cdots\omega_n - \bigl((\omega_1\cdots\omega_{i-1})\bullet dF\bigr)\,\omega_i\cdots\omega_n, \tag{3.5}$$
or in reduced notation
$$u_1(F\omega)u_2 - (u_1\bullet dF)\,\omega\,u_2 - F(p)\,u_1\,\omega\,u_2 \tag{3.6}$$
for $u_1, u_2 \in$ Sh, ω ∈ ⋀M, $F \in C^\infty(M)$. With this algebra morphism we obtain that d-paths can be seen as elements of the set of algebra morphisms Alg($A_p$, ℂ); that is, a d-path is an algebra morphism 𝛾 ∈ Alg($A_p$, ℂ), where 𝛾 : $A_p$ → ℂ vanishes on the ideal I(d, p) by definition. In the case of d-loops we need to extend the ideal to include $dC^\infty(M)$. In the integral algebra, however, this is included by definition, since
$$\int_\gamma dF = 0$$
for 𝛾 ∈ LM, and thus $dC^\infty(M) \subset \ker(\mathrm{Sh}(\Omega) \to A_p)$. As before we denote this ideal by $J_p$:
$$J_p = I(d, p) + \langle dC^\infty(M)\rangle, \tag{3.7}$$
where I(d, p) is the shuffle algebra ideal associated to the pointed differentiation (d, p). We have already seen that this new ideal induces the algebra isomorphism
$$\mathrm{Sh}(\Omega)/J_p \simeq A_p. \tag{3.8}$$
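The algebra $A_p$ has an induced Hopf algebra structure, where the unit and multiplication follow from Proposition 2.39 and the co-multiplication, co-unit and antipode follow from these operations on Sh(Ω) as
$$\begin{aligned}
\Delta(X^{\omega_1\cdots\omega_n}) &= \sum_{i=0}^{n} X^{\omega_1\cdots\omega_i}\otimes X^{\omega_{i+1}\cdots\omega_n},\\
\epsilon(X^{\omega_1\cdots\omega_n}) &= \begin{cases} 0, & \text{if } n \ge 1,\\ 1, & \text{if } n = 0,\end{cases}\\
J(X^{\omega_1\cdots\omega_n}) &= (-1)^n X^{\omega_n\cdots\omega_1}.
\end{aligned} \tag{3.9}$$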
Exercise 3.2. How can one understand the co-multiplication $\Delta : A_p \to A_p \otimes A_p$ in equation (3.9), taking into account Proposition 2.39?

This explains the Hopf algebra structure, but the integral algebra can be equipped with a much richer structure, namely that of a nuclear locally multiplicative-convex (NLMC) algebra. This structure is generated by a topology, giving it the structure of a Fréchet space. Figure 3.1 gives a diagrammatic overview of how different topologies are constructed on the involved algebras. Let us derive a topology on $\bigotimes^n \bigwedge M$, which can then be used to obtain a topology on Sh(Ω) consistent with its linear structure. We write Ω = ⋀M as before. The construction of the topology will give us more than just a topology: it will enrich Sh(Ω) with the structure of a nuclear locally multiplicative-convex topological vector space (TVS), or Fréchet space, that is also Hausdorff, Banach and Hopf. The construction of the topology starts from the Riemannian metric and connection on M. The connection allows us to define a covariant derivative D and the metric induces a norm |⋅|. On the other hand, we know that M, as a manifold, has a topology induced from its Riemannian metric. Combining this with the atlas of M we get a local basis $(U_k)_{k\in\mathbb{N}}$ for this topology. Using this local basis it is possible to construct a sequence of nested compacts $\{K_m^U\}_{m\ge 1}$ in a local coordinate chart (U, x), such that
$$\bigcup_{m\ge 1} K_m^U = U.$$
We can then define a first family of semi-norms on (U, x) by using the norm induced by the Riemannian metric and the covariant derivative:
$$\|\omega_i\|_{m,p} = \sup_{x\in K_m^U}\bigl|D^p\omega_i(x)\bigr|,$$
where the $\omega_i \in C^\infty(U)$ are the coefficient functions defined by
$$\omega = \sum_{i=1}^{n}\omega_i\,dx^i \in \bigwedge M. \tag{3.10}$$
Fig. 3.1: Topology on Sh(Ω). [Diagram: atlas (U, x), basis $(U_k)_{k\in\mathbb{N}}$ for the topology on M and nested compacts $K_m^U$; Riemannian metric and p-th covariant derivative giving the semi-norms on one-forms; semi-norm family $N_m^U$ on local charts; inclusion maps $i_k : U_k \to M$ and pull-backs $i_k^*$ giving the initial topology and the semi-norm family $p_{k,m,l}$ on ⋀M; projective tensor product topology with completion (Banach) and semi-norms $N^{(r)}_{k,m,l}$; direct sum topology on T(⋀M); commutative NLMC Hopf algebra Sh(Ω) that is Banach, Hausdorff and nuclear.]
$D^p$ denotes the p-th covariant derivative with respect to the connection. A second family of semi-norms is now constructed from the first family of semi-norms $\|\omega_i\|_{m,p}$ by
$$N_m^U(\omega) = \max_{1\le i\le n}\rho_m^U(\omega_i), \qquad \omega \in \bigwedge U, \tag{3.11}$$
where
$$\rho_m^U(\omega_i) = \sup_{|p|\le m}\Bigl(\sup_{x\in K_m^U}\bigl|D^p\omega_i(x)\bigr|\Bigr). \tag{3.12}$$
As a result we obtain a family of semi-norms on the local coordinate chart (U, x). The next step is to extend this to the entire manifold M. This will be implemented by means of the inclusion map on the local basis for the (Riemannian) topology on M. Consider again the local basis $\{U_k\}_{k\in\mathbb{N}}$, which can now also be interpreted as local charts. Define the map $i_k : U_k \to M$ as the inclusion map which embeds the local basis into M. The linear pullback maps $i_k^* : \bigwedge M \to \bigwedge U_k$ now relate the one-forms on M to the same one-forms considered on this local basis. Endowing ⋀M with the initial topology defined by these maps equips it with a topology induced by semi-norms. Notice that by definition this topology is the weakest topology for which all the maps $i_k^*$ are continuous, and a local topology basis consists of sets of the form
$$\bigcap_{j=1}^{r}(i_{k_j}^*)^{-1}(O_{k_j}),$$
where the sets $O_{k_j}$ run over a local basis of $\bigwedge U_{k_j}$. Therefore, ⋀M becomes a nuclear locally-convex topological vector space (Fréchet space), with the topology that is given by the family of semi-norms
$$p_{k,m,l}(\omega) = \max_{1\le j\le l} N_m^{U_{k_j}}\bigl(i^*_{U_{k_j}}\omega\bigr). \tag{3.13}$$
From elementary calculus one learns that the definition of differentiation depends on taking limits, which in turn are defined through the convergence of sequences. Given that we eventually will be interested in well-defined derivatives, let us briefly consider convergence with respect to the above family of semi-norms. With such a family of semi-norms a sequence converges only if it converges for all semi-norms in the family. In other words, a sequence of one-forms $(\omega_n)_{n\ge 1}$ in ⋀M converges to zero if and only if, in a vicinity of every point of M, each derivative of each coefficient of $\omega_n$ converges uniformly to zero. The tensor powers $\bigotimes^n\bigwedge M$ now get a topology by the projective tensor product topology and become Banach spaces when we also complete them with respect to the semi-norms that describe this tensor topology. In other words, this topology is described by the semi-norms $N^{(r)}_{k,m,l}$, which are the tensor products of the above ones. To make this explicit, consider the example where r = 2.

Example 3.3. Let u ∈ ⋀M ⊗ ⋀M, for which we have
$$N^{(2)}_{k,m,l}(u) = \inf\sum_{i=1}^{n} p_{k,m,l}(\omega^i)\cdot p_{k,m,l}(\eta^i), \tag{3.14}$$
where the infimum is taken over all expressions of the element u in the form
$$u = \sum_{i=1}^{n}\omega^i\otimes\eta^i.$$
Extending now to elements in
$$\bigoplus_{n\ge 0}\Bigl(\bigotimes^n\bigwedge M\Bigr),$$
which are finite sums
$$u = \sum_n u_n, \qquad u_n \in \bigotimes^n\bigwedge M,$$
we get the semi-norms
$$N_{k,m,l}(u) = \sum_n N^{(n)}_{k,m,l}(u_n), \tag{3.15}$$
inducing a nuclear locally-convex topology on T(⋀M).
Due to the fact that all above topologies are consistent with the linear structures of the algebras, the shuffle product is a continuous map in this last topology. Moreover, the shuffle product is commutative so that Sh(Ω) inherits the structure of a commutative LMC algebra from T(⋀M) that also is Hopf, Banach and Hausdorff. We continue to write Sh(Ω) for this algebra. We end this section with the remark that the integral algebra Ap inherits the same structure through the isomorphism equation (3.8).
3.3 The group of loops

The (naive) piecewise smooth loops based at p form a loop space $LM_p$, which is a semi-group with respect to the product $\gamma_1\cdot\gamma_2$, for $\gamma_1, \gamma_2 \in LM_p$. Looking again at the equivalence relation introduced in equation (2.54), we can introduce a multiplication
$$[\gamma_1] * [\gamma_2] = [\gamma_1\cdot\gamma_2] \tag{3.16}$$
on the set $LM_p/\!\sim$ of equivalence classes, turning this set into a group, where $[\gamma_1]$ is the equivalence class of $\gamma_1$. The inverse of an element $[\gamma] \in LM_p/\!\sim$ is clearly given by $[\gamma]^{-1} = [\gamma^{-1}]$ and the unit element reads 𝜖 = [p], the class of the constant loops equal to the point p. Therefore, we have described the group $(LM_p/\!\sim, *)$, referred to as the group of loops on the manifold M based at p. In what follows, we symbolically represent this group by $LM_p$.
3.4 The group of generalized loops

In order to be able to introduce the space of generalized loops, or equivalently the space of d-loops, as the algebra morphisms from Sh(Ω) to ℂ that vanish on the ideal $J_p$, we need to extend the consideration of the algebra $A_p$. The main concept that we need in this extension is that of a spectrum of a commutative Banach algebra.¹

Definition 3.4 (Gel’fand space or spectrum). Consider a commutative Banach algebra A. Let Δ(A) (or Δ) stand for the collection of nonzero complex homomorphisms H : A → ℂ. Elements of the Gel’fand space are called characters.

Applying this definition to the algebra² $A_p$ and writing $\Delta_p$ for the spectrum, we find that $\varphi \in \Delta_p$ is also an element of the dual space $A_p^*$ of $A_p$. We can now also consider the dual space $A_p^{**}$ of the dual space, in which we can embed the original space $A_p$ by the map $x \mapsto \Phi_x$, $\Phi_x(\varphi) = \varphi(x)$. With the maps $\Phi_x$ we can define a coarsest topology on $A_p^*$ such that all the $\Phi_x$ are continuous maps $A_p^* \to \mathbb{C}$. This topology is referred to as the weak-∗ topology, in which the characters are now continuous by definition. From Section 3.2 we know that $A_p$ inherits a semi-norm structure from Sh(Ω), such that by the Banach–Alaoglu theorem $A_p$ is reflexive:
$$A_p^{**} \equiv A_p.$$
From this it follows that every bounded sequence has a weakly converging subsequence, similar to the case in regular calculus. Under weak-∗ convergence, a sequence $\varphi_n \in A_p^*$ converges to φ if and only if $\varphi_n(x) \to \varphi(x)$ for all $x \in A_p$. The Hausdorff property of $A_p$ can be understood from the separation property (2.76) of the functionals $X^{\omega_1\cdots\omega_n}$. As a consequence the d-loops $\tilde\gamma : \mathrm{Sh}(\Omega) \to \mathbb{C}$ can be identified with elements of $A_p^*$.
1 Where the commutative refers to the shuffle product which is commutative. 2 For the moment considering the one-form to be complex valued.
Notice that up until now we have only considered complex valued one-forms, but in a gauge theory setting and using the principal fiber bundle formalism we need to deal with Lie algebra valued one-forms. Choosing to represent the Lie algebra by matrices, the algebra elements form a sub-algebra of GL(n, ℂ). So let us consider the case of GL(n, ℂ) valued one-forms. In this case the nuclear property of Sh(Ω) comes to the rescue. Since this algebra is of the nuclear or trace class, the trace of the matrices does not spoil the algebraic or topological structures, such that convergence is still well-defined. Thus, by adding the trace operator to the integrals in the functionals of $A_p$ in the case of matrix valued one-forms, we again get a set of continuous characters (complex valued!). The nuclear property also assures that there exists a well-defined trace operator on the linear bounded operators used to define the Fréchet derivative in (2.121). Moreover, it also assures that this trace is finite. We thus find that d-loops are identified in this way with the spectrum of $A_p$, with the remark that if the $\omega \in \bigwedge^1 M$ are GL(n, ℂ)-valued, we need to take the trace to reduce the GL(n, ℂ)-valued matrix to an element of ℂ. Let us now extend the previously introduced equivalence relation (2.76) on d-loops to:
$$W_{\gamma_1} = \mathrm{Tr}\,U_{\gamma_1} = \mathrm{Tr}\,U_{\gamma_2} = W_{\gamma_2}, \tag{3.17}$$
for two d-loops $\gamma_1, \gamma_2$ in M, with U and W defined in equations (2.125) and (2.129). These form a subset of the d-loops, and also of the generalized loops, that are still separable by Theorem 2.79. Weak-∗ convergence is also still applicable due to the fact that convergence requires convergence for all elements in $A_p$. The continuity of the trace now allows us to define generalized loops.

Definition 3.5 (Generalized loop). A generalized loop based at p ∈ M is a character of the algebra $A_p$ or, equivalently, a continuous complex algebra homomorphism $\tilde\gamma : \mathrm{Sh}(\Omega) \to \mathbb{C}$ that vanishes on the ideal $J_p$.

By making use of the weak-∗ topology we can define convergence on the space of generalized loops as above. With this new space we can now ask the question:

Exercise 3.6. How are the naive loops from the previous section embedded in the space of generalized loops?

This embedding is realized by the Dirac map
$$\delta : LM_p \to \Delta_p, \qquad [\gamma] \mapsto \delta_{[\gamma]}, \tag{3.18}$$
defined by
$$\delta_{[\gamma]}(X^{\omega_1\cdots\omega_n}) = X^{\omega_1\cdots\omega_n}([\gamma]), \tag{3.19}$$
$[\gamma] \in LM_p$. We now have an injective embedding due to Theorem 2.76. Identifying $LM_p$ with its image under δ in $\Delta_p$, it also inherits an induced topology. Another question one can pose at this point is ‘How to compose loops?’.

Exercise 3.7. Find out what the definition of the composition, or multiplication, operation in the space of generalized loops is.

The multiplication for the generalized loops is introduced as a convolution multiplication $\tilde\gamma_1\star\tilde\gamma_2$ of the two elements $\tilde\gamma_1, \tilde\gamma_2 \in \Delta_p$, defined by
$$\tilde\gamma_1\star\tilde\gamma_2 \equiv (\tilde\gamma_1\otimes\tilde\gamma_2)\circ\Delta, \tag{3.20}$$
which gives $\Delta_p$ a group structure and where we used K ⊗ K ≃ K, ℂ ⊗ ℂ ≃ ℂ. Hence, we have obtained the group of generalized loops. Writing this out explicitly with the definition of Δ from equation (3.9) we have
$$\tilde\gamma_1\star\tilde\gamma_2\,(X^{\omega_1\cdots\omega_n}) = \sum_{i=0}^{n}\tilde\gamma_1(X^{\omega_1\cdots\omega_i})\cdot\tilde\gamma_2(X^{\omega_{i+1}\cdots\omega_n}). \tag{3.21}$$
With the aid of the Dirac map we can rewrite equation (3.21) on $LM_p$ as
$$\begin{aligned}
\sum_{i=0}^{n}\tilde\gamma_1(X^{\omega_1\cdots\omega_i})\cdot\tilde\gamma_2(X^{\omega_{i+1}\cdots\omega_n})
&= \sum_{i=0}^{n}\gamma_1(X^{\omega_1\cdots\omega_i})\cdot\gamma_2(X^{\omega_{i+1}\cdots\omega_n})\\
&= \sum_{i=0}^{n}(X^{\omega_1\cdots\omega_i})(\gamma_1)\cdot(X^{\omega_{i+1}\cdots\omega_n})(\gamma_2)\\
&= \sum_{i=0}^{n}\int_{\gamma_1}\omega_1\cdots\omega_i\cdot\int_{\gamma_2}\omega_{i+1}\cdots\omega_n\\
&= \int_{\gamma_1\cdot\gamma_2}\omega_1\cdots\omega_n,
\end{aligned} \tag{3.22}$$
which also shows that the convolution product defined on the generalized loop space makes sense as a composition of d-loops. The inverse in the group of $\tilde\gamma \in \Delta_p$ reads $\tilde\gamma\circ J$, so that
$$\tilde\gamma^{-1}(\omega_1\cdots\omega_n) = (-1)^n\,\tilde\gamma(\omega_n\cdots\omega_1), \tag{3.23}$$
with 𝜖 the unit element. Considering the topologization of the previous sections, the group of generalized loops can also be considered as a topological group.

Definition 3.8 (Generalized loop space as topological group). The paths $\tilde\gamma_1\star\tilde\gamma_2$, $\tilde\gamma_1^{-1}$ and 𝜖 belong to the set of generalized loops based at p. In other words, they are continuous characters on the algebra $A_p$. In addition, $(\Delta_p, \star)$ has the properties of a topological group.

Definition 3.9 (Group of generalized loops). This topological group $(\Delta_p, \star)$ is then called the group of generalized loops of M at p ∈ M. It will be denoted by $\widetilde{LM}_p$.
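The convolution product (3.21) and the inverse (3.23) can be tried out on a concrete family of characters. The sketch below rests on stated assumptions: words are Python tuples of coordinate labels, and the sample character is the one obtained from iterated integrals of the constant one-forms $dx^i$ along a straight segment with increment v, for which a word of length n evaluates to the product of the corresponding components of v divided by n!. A straight segment is a path rather than a loop, so this only illustrates the algebra of the formulas and ignores the ideal $J_p$; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def segment_character(v):
    """Sample character: word (i_1, ..., i_n) -> v[i_1] * ... * v[i_n] / n!."""
    def gamma(word):
        value = Fraction(1, factorial(len(word)))
        for letter in word:
            value *= v[letter]
        return value
    return gamma


def counit(word):
    """Unit element of the group: 1 on the empty word, 0 otherwise."""
    return Fraction(1 if len(word) == 0 else 0)


def convolve(g1, g2):
    """Convolution product (3.21): sum over all splittings of the word."""
    def product(word):
        return sum(
            (g1(word[:i]) * g2(word[i:]) for i in range(len(word) + 1)),
            Fraction(0),
        )
    return product


def inverse(g):
    """Inverse (3.23): sign (-1)^n times g evaluated on the reversed word."""
    def inv(word):
        return (-1) ** len(word) * g(tuple(reversed(word)))
    return inv


if __name__ == "__main__":
    g = segment_character({"x": Fraction(1, 2), "y": Fraction(-3)})
    for word in [(), ("x",), ("x", "y"), ("y", "x", "y")]:
        # g * g^{-1} should agree with the counit on every word.
        print(word, convolve(g, inverse(g))(word), counit(word))
```

In this example the convolution of a character with its inverse reproduces 𝜖 on every word, and convolving the character of a segment with itself gives the character of the segment of twice the length, which mirrors the composition of paths in (3.22).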
The Dirac map preserves group operations, so that $LM_p$ is a topological subgroup of $\widetilde{LM}_p$. The above discussion clearly shows that the naive loops form a subset of generalized loops. Let us clarify the distinction with an example.

Example 3.10. Consider a manifold $M = S^1$. Then one has
$$LS^1_p = \mathbb{Z} \quad\text{and}\quad \widetilde{LS^1}_p = \mathbb{R}.$$
Given that $H^1(S^1, \mathbb{R}) = \mathbb{R}$, each one-form $\omega \in \bigwedge S^1$ equals a constant multiple of $\omega_0 \equiv d\theta$, i.e., the volume form on $S^1$, modulo an exact form: $\omega = c\,\omega_0 + dF$, c ∈ ℝ. Therefore, we obtain
$$\bigwedge S^1 = \mathbb{R}\,\omega_0 \oplus dC^\infty(S^1). \tag{3.24}$$
Now we are in a position to prove that $A_p$, being a Hopf algebra, is isomorphic to the polynomial ring in one variable ℝ[t], with $t \leftrightarrow X^{\omega_0}$. The Hopf operations on ℝ[t] read
$$\Delta(t) = 1\otimes t + t\otimes 1, \qquad J(t) = -t, \qquad \epsilon(t) = 0.$$
Hence, we find
$$\widetilde{LS^1}_p = \mathbb{R}.$$
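The last step of the example can be made explicit by evaluating the group operations on characters of ℝ[t]. This is a sketch under the assumption that characters are taken to be real-valued algebra homomorphisms; the symbol $\chi_a$ is introduced here only for illustration.

```latex
% A character of R[t] is fixed by its value on t (notation \chi_a is an assumption of this sketch).
\begin{align*}
\chi_a(t) &= a, \qquad \chi_a(t^n) = a^n, \qquad a \in \mathbb{R},\\
(\chi_a \star \chi_b)(t) &= (\chi_a\otimes\chi_b)(1\otimes t + t\otimes 1) = b + a,\\
\chi_a^{-1}(t) &= \chi_a(J(t)) = -a, \qquad \epsilon(t) = 0 = \chi_0(t).
\end{align*}
```

The convolution product thus reduces to addition of real numbers, so the characters form the group (ℝ, +). A naive loop winding k times around the circle gives $a = \int_\gamma d\theta = 2\pi k$, which recovers the subgroup isomorphic to ℤ.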
Remark 3.11 (Generalized paths). Consider the path space $PM_p$ of paths based at p ∈ M, and the algebra $B_p$ generated by all the functions $X^{\omega_1\cdots\omega_n}$, considered now as functions on $PM_p$. Similarly to the previous case, there exists an algebra isomorphism Sh(Ω)/$I_p$ ≃ $B_p$, which allows us to consider $B_p$ as an LMC algebra and to define generalized paths, based at p, as continuous characters on Sh(Ω) that vanish on $I_p$. These generalized paths, however, do not form a group but only a semi-group.
3.5 Generalized loops and the Ambrose–Singer theorem

The equivalence relation, equation (3.17), has its origin in the Ambrose–Singer theorem (2.119). In our discussion of this theorem we argued that a naive loop space is overcomplete and that we would solve this by introducing an equivalence relation, which is exactly realized by the definition above. We also mentioned algebraic constraints and nonlinear constraints coming from the fact that it has to be possible to write the complex value of the Wilson loop functional as a trace of an N × N SU(N) matrix; both
constraints are combined in the Mandelstam constraints.³ Below we discuss how this equivalence takes care of these constraints. Recall that due to the translation invariance of d-paths and path reduction, the algebraic structure is independent of the chosen base point for the d-loops, just like the fundamental group was base point independent. Choosing now a fixed base point for the loops, we look for the equivalent expression in Wilson loop variables of the following property of the holonomy:
$$U_\gamma[[x_2, x_1]\circ[x_3, x_2]] = U_\gamma[x_2, x_1]\,U_\gamma[x_3, x_2], \tag{3.25}$$
where $\gamma : [x_1, x_2]$ represents the path between the points $x_1$ and $x_2$ in the base manifold. This eventually gives rise to the so-called Mandelstam constraints. Writing down the equivalence relationship introduced by the Wilson loop functionals for n loops⁴ reproduces the Mandelstam constraints and returns the Wilson loop variant of equation (3.25). These constraints now also allow the reconstruction of the N × N matrices, and thus of the gauge fields $A_\mu$ up to a gauge transformation, starting from characters of the spectrum. In other words, adopting this equivalence is equivalent to taking into account the Mandelstam constraints. Note that this equivalence reduces the infinite dimensional group algebra of the holonomy to the finite dimensional matrix representation of the holonomy group. This means that many elements of the holonomy group algebra are represented by the same matrix, thus taking care of the overcompleteness, which actually has its origin in this infinite dimensionality of the holonomy group algebra. Hence, we have found an alternative representation of gauge theory that does not make use of gauge potentials, where the fundamental degrees of freedom are gauge invariant due to the trace operation. Although this is a very nice property, one has to keep in mind that we have paid some price for this: instead of the gauge dependence we are charged with the path dependence, which is also sometimes hard to deal with.⁵ However, if we are able to keep the path dependence under control, this does not produce any serious problems. A good thing now is that we can construct relations which are by definition gauge invariant.
3 Giles [54] demonstrated this for the classical case, in the quantum field case there is no strict proof that this is really the case. 4 More explicitly one considers a loop formed by n loops, where the equivalence relation then states that they form the same loop if their Wilson loop functionals have the same complex value. 5 In many cases one considers paths and loops on the light cone, where there is only one possible path between two sequential points on the same light cone if one assumes that the complete path needs to stay on the light cone.
3.6 The Lie algebra of the group of the generalized loops

In the previous section we have found that the generalized loops form a topological group, namely a Fréchet Lie group. Now we investigate if we can also construct the associated Lie algebra. As we know from Section 2.2.1, Lie algebras have a close connection with right/left invariant vector fields. With this in mind we repeat here the definition of a left invariant derivation (respectively, right invariant derivation) on $A_p$.

Definition 3.12 (Left invariant derivation). A K-linear map $d : A_p \to A_p$ is called a left invariant derivation (respectively, right invariant derivation) on $A_p$ if d satisfies the following two conditions:
$$d(X^{u_1}X^{u_2}) = X^{u_1}d(X^{u_2}) + d(X^{u_1})X^{u_2}, \tag{3.26}$$
$$\Delta\circ d = (1\otimes d)\circ\Delta \tag{3.27}$$
(respectively, $\Delta\circ d = (d\otimes 1)\circ\Delta$), for all $u_1, u_2 \in$ Sh.

Using the topological group property of $\widetilde{LM}_p$ and the above invariant derivations we have the following definition from the general theory of affine K-groups:

Definition 3.13 (Lie algebra of $\widetilde{LM}_p$). The Lie algebra of the group $\widetilde{LM}_p$ is defined as the K-linear space $\widetilde{lM}_p$ of all continuous left invariant derivations on $A_p$, with the Lie bracket defined as
$$[d_1, d_2] = d_1 d_2 - d_2 d_1. \tag{3.28}$$
Note that these fields are left invariant vector fields on $\widetilde{LM}_p$. Just as in the case of principal fiber bundles one can show that this Lie algebra is isomorphic to the tangent space of $\widetilde{LM}_p$ at the unit element 𝜖, i.e., to $T_\epsilon\widetilde{LM}_p$.
To justify this relation and make the derivations more explicit, let us consider the convolution product $F_1\star F_2$ of two elements $F_1, F_2 \in A_p^*$, the topological (weak) dual of $A_p$:
$$F_1\star F_2\,(X^{\omega_1\cdots\omega_n}) = (F_1\otimes F_2)\circ\Delta(X^{\omega_1\cdots\omega_n}) = \sum_{i=0}^{n}F_1(X^{\omega_1\cdots\omega_i})\cdot F_2(X^{\omega_{i+1}\cdots\omega_n}). \tag{3.29}$$
With this convolution product we can define left- and right-invariant endomorphisms on A∗p : Lemma 3.14. (A∗p , ⋆) is a topological K-algebra, isomorphic (antiisomorphic) to the topological algebra EndLL (Ap ) (EndRL (Ap )) of all left (or right) invariant K-linear endomorphisms of Ap (i.e., K-linear morphisms σ : Ap → Ap that satisfy the left (right) invariance condition) Δ ∘ σ = (1 ⊗ σ ) ∘ Δ
(3.30)
and, respectively, Δ ∘ σ = (σ ⊗ 1) ∘ Δ, and endowed with the topology of pointwise convergence. The elements of EndLL (Ap ) commute with the elements of EndRL (Ap ). To understand these definitions and properties better, it is instructive to start with some examples. Example 3.15. Let 𝛾 ∈ LMp and δ𝛾 ∈ A∗p be the Dirac map as defined before. Then Ψδ𝛾 is the automorphism Xu → 𝛾 ⋅ Xu corresponding to the action of 𝛾, on LMp from the right.⁶ In fact, the right action of LMp on itself, through right translations r𝛾1 : 𝛾2 → 𝛾2 ⋅ 𝛾1 , induces a left action of LMp on Ap by (𝛾1 ⋅ X u )(𝛾2 ) ≡ X u (𝛾2 ⋅ 𝛾1 ).
(3.31)
6 X u (β ) → 𝛾 ⋅ X u (β ) = ∫β 𝛾 X u note the order change of the paths which makes it into a right action on LMp although it is written as a product from the left.
By the identification $\gamma_1 \mapsto \delta_{\gamma_1}$, we can write the right-hand side of the above equation in the form
$$X^u(\gamma_2\cdot\gamma_1) = \delta_{\gamma_2\cdot\gamma_1}(X^u) = \delta_{\gamma_2}\star\delta_{\gamma_1}(X^u) = \delta_{\gamma_2}\bigl((1\otimes\delta_{\gamma_1})\Delta X^u\bigr) = \delta_{\gamma_2}\bigl(\Psi_{\delta_{\gamma_1}}(X^u)\bigr), \tag{3.32}$$
while the left-hand side is simply $\delta_{\gamma_2}(\gamma_1\cdot X^u)$, which allows the above mentioned identification $\Psi_{\delta_{\gamma_1}} \simeq \gamma_1\cdot X^u$. In the same way we can prove that $\Lambda_{\delta_{\gamma_1}}$ is the automorphism $X^u \mapsto X^u\cdot\gamma_1$ corresponding to the action of $\gamma_1$ on $LM_p$ from the left. Taking now the σ, defined in Lemma 3.14, to be a left invariant derivation σ = d, we can write $\Phi(d) = F_d = \epsilon\circ d \in A_p^*$ with
$$F_d(X^{u_1}X^{u_2}) = \epsilon\,d(X^{u_1}X^{u_2}) = \epsilon\bigl(X^{u_1}dX^{u_2} + dX^{u_1}X^{u_2}\bigr) = \epsilon(X^{u_1})F_d(X^{u_2}) + F_d(X^{u_1})\epsilon(X^{u_2}), \tag{3.33}$$
demonstrating that the Lie algebra $\widetilde{lM}_p$ is isomorphic, as K-linear space, to the subspace of $A_p^*$ consisting of the pointed derivations (2.19) at 𝜖:
$$\widetilde{lM}_p \cong \bigl\{\delta \in A_p^* : \delta(X^{u_1}X^{u_2}) = \epsilon(X^{u_1})\delta(X^{u_2}) + \delta(X^{u_1})\epsilon(X^{u_2})\bigr\}. \tag{3.34}$$
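It is this K-linear space of pointed derivations at 𝜖 that is considered to be the tangent space $T_\epsilon\widetilde{LM}_p$, just as we wanted. To motivate why we call this space the tangent space, consider $\tilde\gamma_t$, a curve of generalized loops such that
$$\tilde\gamma_0 = \epsilon, \qquad \lim_{\Delta\to 0}\tilde\gamma_\Delta = \epsilon, \qquad \lim_{\Delta\to 0}\frac{1}{\Delta}\bigl(\tilde\gamma_\Delta - \epsilon\bigr) = \delta \in A_p^*, \tag{3.35}$$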
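where $\delta \in A_p^*$ and the limits are defined in the weak (topology) sense: for all u ∈ Sh(Ω),
$$\lim_{\Delta\to 0}\tilde\gamma_\Delta(X^u) = \epsilon(X^u).$$
Applying δ to $X^{u_1}X^{u_2}$ returns:
$$\begin{aligned}
\delta(X^{u_1}X^{u_2}) &= \lim_{\Delta\to 0}\frac{1}{\Delta}\bigl(\tilde\gamma_\Delta(X^{u_1}X^{u_2}) - \epsilon(X^{u_1}X^{u_2})\bigr)\\
&= \lim_{\Delta\to 0}\Bigl(\tilde\gamma_\Delta(X^{u_1})\,\frac{1}{\Delta}\bigl(\tilde\gamma_\Delta(X^{u_2}) - \epsilon(X^{u_2})\bigr) + \frac{1}{\Delta}\bigl(\tilde\gamma_\Delta(X^{u_1}) - \epsilon(X^{u_1})\bigr)\,\epsilon(X^{u_2})\Bigr)\\
&= \epsilon(X^{u_1})\delta(X^{u_2}) + \delta(X^{u_1})\epsilon(X^{u_2}),
\end{aligned} \tag{3.36}$$
where we used the property that the Chen integrals preserve multiplication (2.39):
$$\tilde\gamma_\Delta(X^uX^v) = (X^uX^v)(\tilde\gamma_\Delta) = X^u(\tilde\gamma_\Delta)\,X^v(\tilde\gamma_\Delta) = \tilde\gamma_\Delta(X^u)\,\tilde\gamma_\Delta(X^v). \tag{3.37}$$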
Therefore, we obtain a K-linear isomorphism
$$T_\epsilon\widetilde{LM}_p \cong \widetilde{lM}_p, \tag{3.38}$$
given by $\delta \mapsto d_\delta = (1\otimes\delta)\circ\Delta$. The Lie bracket of $T_\epsilon\widetilde{LM}_p$ for the pointed derivations is defined by
$$[\delta, \eta] \equiv \epsilon\circ[d_\delta, d_\eta] = \delta\star\eta - \eta\star\delta. \tag{3.39}$$
Notice that for any pointed derivation δ at 𝜖, and for all n, m ≥ 1,
$$\delta\bigl(X^{\omega_1\cdots\omega_n}X^{\omega_{n+1}\cdots\omega_{n+m}}\bigr) = 0, \tag{3.40}$$
which stems from the product properties of the $X^u$ and the definition of $\epsilon(X^u)$ that show up when taking the δ of a product (see Proposition 2.39, equations (3.9) and (3.39)). From the above result we also conclude that for all m > n ≥ 0
$$\delta^m(X^{\omega_1\cdots\omega_n}) = 0, \tag{3.41}$$
where for m ≥ 1, $\delta^m \equiv \delta^{m-1}\star\delta$. The exponential map $e^\delta$ can now be defined for each $\delta \in T_\epsilon\widetilde{LM}_p \cong \widetilde{lM}_p$ as $e^\delta \equiv \epsilon + \sum_{n\ge 1}\frac{\delta^n}{n!}$, where for each $X^{\omega_1\cdots\omega_n}$,
$$e^\delta(X^{\omega_1\cdots\omega_n}) \tag{3.42}$$
is given by
$$\Bigl(\epsilon + \sum_{m\ge 1}\frac{\delta^m}{m!}\Bigr)(X^{\omega_1\cdots\omega_n}), \tag{3.43}$$
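which is only valid under the assumption that the series converges. By equation (3.41) this series is finite, so that $e^\delta$ is well-defined. Interestingly, we can show that $e^\delta$ is a generalized loop, again similar to the situation with the usual Lie groups and algebras. Considering the inverse case, given $\tilde\gamma \in \widetilde{LM}_p$, we can define
$$\log\tilde\gamma \equiv \sum_{n\ge 1}\frac{(-1)^{n-1}}{n}(\tilde\gamma - \epsilon)^n,$$
with $(\tilde\gamma - \epsilon)^n \equiv (\tilde\gamma - \epsilon)^{n-1}\star(\tilde\gamma - \epsilon)$. Since for m > n ≥ 0 we have $(\tilde\gamma - \epsilon)^m(X^{\omega_1\cdots\omega_n}) = 0$, $\log\tilde\gamma$ is well-defined and is an element of $T_\epsilon\widetilde{LM}_p \cong \widetilde{lM}_p$. The formal power series allow (for k ∈ ℤ) for
$$\exp(k\log\tilde\gamma) = \tilde\gamma^k, \qquad \log(\exp\delta) = \delta,$$
which can be extended to define, for each Δ ∈ K, $\tilde\gamma^\Delta \equiv \exp(\Delta\log\tilde\gamma)$. It is now not so hard to show that $\Delta \mapsto \tilde\gamma^\Delta$ is a one-parameter subgroup of $\widetilde{LM}_p$, generated by $\log\tilde\gamma$, i.e.,
$$\tilde\gamma^0 = \epsilon, \qquad \tilde\gamma^{\Delta}\star\tilde\gamma^{\Delta'} = \tilde\gamma^{\Delta+\Delta'}, \qquad \lim_{\Delta\to 0}\frac{1}{\Delta}\bigl[\tilde\gamma^\Delta - \epsilon\bigr] = \log\tilde\gamma = \delta, \quad\text{such that } \tilde\gamma^\Delta = e^{\Delta\delta}, \tag{3.44}$$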
3.6 The Lie algebra of the group of the generalized loops
|
99
a generalized loop and where in the last line the limit is taken in the weak (topology) sense. Now that we have introduced the left- and right-invariant derivations, discussed their relation with the derivations defined on the shuffle algebra and have defined a Lie algebra, we can move on in the next section to differential calculus in the loop space.
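The truncation expressed by equation (3.41), and the resulting finiteness of the exponential series, can be illustrated with a small sketch. It assumes that the product $\star$ is the convolution induced by the deconcatenation coproduct, that words are modelled as tuples of letters, and that a tangent vector is specified by a hypothetical table of values (chosen so that it vanishes on the shuffle product of two one-letter words). None of these names come from the text.

```python
from fractions import Fraction
from math import factorial

def eps(word):
    """Counit: 1 on the empty word, 0 on every non-empty word."""
    return 1 if len(word) == 0 else 0

def star(a, b):
    """Convolution of two functionals (assumes the deconcatenation coproduct)."""
    def product(word):
        return sum(a(word[:i]) * b(word[i:]) for i in range(len(word) + 1))
    return product

def make_delta(values):
    """Functional defined by a table of values; it vanishes on the empty word."""
    def delta(word):
        return values.get(word, 0) if word else 0
    return delta

def power(delta, m):
    """delta^m = delta^(m-1) * delta, with delta^0 = eps."""
    result = eps
    for _ in range(m):
        result = star(result, delta)
    return result

def exp_delta(delta, word):
    """e^delta on a word; powers above len(word) vanish, so the sum is finite."""
    return sum(Fraction(power(delta, m)(word), factorial(m))
               for m in range(len(word) + 1))

# Hypothetical values; delta(ab) + delta(ba) = 0, so delta kills the shuffle X^a X^b.
delta = make_delta({("a",): 2, ("b",): 3, ("a", "b"): 5, ("b", "a"): -5})

print(power(delta, 3)(("a", "b")))   # third power on a two-letter word
print(exp_delta(delta, ("a", "b")))  # finite sum defining e^delta
```

In this toy example the product $e^\delta(X^a)\,e^\delta(X^b)$ equals $e^\delta(X^{ab}) + e^\delta(X^{ba})$, which is the multiplicativity expected of a generalized loop.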
4 Shape variations in the loop space
In this chapter we introduce the differential operators which enable the formulation of the equations of motion in the generalized path and loop space, with the final goal being to define the so-called Fréchet derivative.
4.1 Path derivatives
The first class of differential operators we wish to introduce acts on both generalized paths and loops; they are referred to as the initial and terminal endpoint derivatives. This class of derivatives depends on a vector field, which for the terminal endpoint derivative is assumed to exist in a vicinity $U$ of the endpoint $q = \gamma(1)$ of the generalized path $\gamma$. Writing $V(\gamma(1)) = v \in T_{\gamma(1)}M$ for the value of the vector field at $q$, this local vector field generates a local integral curve, starting at $q = \gamma(1)$ for $s = 0$, which we symbolically write as $\eta_s^V = \Phi_V(s)(q)$. We write $\gamma_s = \gamma\cdot\eta_s^V$ for the new path composed of the original path $\gamma$ followed by the local integral curve induced by $v$, and $q_s = \gamma_s(1)$ for the varying endpoint of the combined path. This is graphically represented in the left panel of Figure 4.1. The right panel shows that extending the original path, $\gamma\to\gamma_s$, returns a different path, i.e., a different point in the path space $PM$, from which it is clear that the endpoint derivatives are actually directional derivatives in $PM$. The direction of this directional derivative is determined by the local vector field $V$. We implicitly assumed a reparameterization such that the parameter that describes the curve is in the interval $[0,1]$.
Fig. 4.1: $\gamma_s = \gamma\cdot\eta_s^V$ and $q_s = \gamma_s(1)$.
Here we identified the generalized paths and loops with Chen
integrals, where reparameterization invariance is naturally included. One could also introduce the invariance explicitly by dividing out the equivalence relation for paths that only differ by reparameterization. In a quantum setting, using the path-integral formalism, this results in integrating over all reparameterizations, which gives rise to a constant factor. This factor then divides out if one divides by the vacuum diagrams in the calculation of an expectation value in quantum field theory.¹ With these notations and parameterizations we can now give the definition of the terminal covariant endpoint derivative.²
1 Notice that this reparameterization invariance is not assumed in all path or loop spaces described in the literature.
2 Instead of $\mathbb{R}$ one can consider other sets, such as $\mathbb{C}$ or $GL(n,\mathbb{C})$; the definitions do not change.
Definition 4.1 (Terminal covariant endpoint derivative). Consider a functional $U_\gamma$ defined on a path in $PM$, which returns elements of $\mathbb{R}$. The terminal covariant endpoint derivative $\nabla_V^T(q_s)U_\gamma$ of $U_\gamma$, at $\gamma$, in the direction of $V$, is defined by
$$
\nabla_V^T(q_s)U_\gamma = \lim_{\Delta\to0}\frac{U_{\gamma_{s+\Delta}} - U_{\gamma_s}}{\Delta}.
\tag{4.1}
$$
Replacing $\gamma_s$ with $\gamma_s = (\eta_s^V)^{-1}\cdot\gamma$ yields the initial covariant endpoint derivative $\nabla_V^I(q_s)U_\gamma$. Clearly this is only well-defined in a vicinity of the endpoint $q_s = \gamma_s(1)$, and moreover depends on the vector field $V$, which gives it the behavior of a directional derivative. In the special case $s = 0$, we can define the terminal endpoint derivative.
Definition 4.2 (Terminal endpoint derivative). Consider a functional $U_\gamma$ defined on a path in $PM$, which returns elements of $\mathbb{R}$. The terminal endpoint derivative $\partial_v^T U_\gamma$ of $U_\gamma$, in the direction of $v\in T_{\gamma(1)}M$, is defined by
$$
\partial_v^T U_\gamma = \lim_{\Delta\to0}\frac{U_{\gamma_\Delta} - U_\gamma}{\Delta}
\tag{4.2}
$$
given that this limit exists independently of the choice of the vector field $V\in\mathfrak{X}M$, with $V(\gamma(1)) = v$. To demonstrate how this class of derivatives works in practice, consider the following example.
Example 4.3. Suppose we have a smooth function $F\in C^\infty M$ and a path functional $U_F$ which reads $U_F[\gamma] = F[\gamma(1)]$. Applying the terminal endpoint derivatives yields
$$
\nabla_V^T(q_s)U_F[\gamma] = V\cdot F[q_s] = dF[V_{q_s}],
\tag{4.3}
$$
and
$$
\partial_v^T U_F[\gamma] = V\cdot F[\gamma(1)] = dF[v],
\tag{4.4}
$$
the latter depending only on the vector $v$, and not on the specific extension $V$.
The above example is not only useful to demonstrate the operation of the endpoint derivatives, but also to introduce the concept of a marked path functional, where "marked" refers to the fact that it is determined by the evaluation of some function at a certain point along the path. This is where it might make a difference whether one assumes reparameterization invariance or not.
Definition 4.4 (Marked path functional). Consider a path-dependent functional U𝛾 and F ∈ C∞ M. We define the marked path functional F ⊙ U𝛾 , by (F ⊙ U𝛾 )[𝛾] = F[𝛾(1)]U𝛾 .
(4.5)
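Similar to the regular derivatives, the endpoint derivatives obey the Leibniz rule:
Lemma 4.5 (Leibniz Rule). Suppose that the limit in equation (4.1) exists for a path functional $U_\gamma$, which satisfies the continuity condition, for $s\ge0$,
$$
\lim_{\Delta\to0}U_{\gamma_{s+\Delta}} = U_{\gamma_s}.
$$
The covariant endpoint derivative then obeys the Leibniz rule:
$$
\begin{aligned}
\nabla_V^T(q_s)(F\odot U_\gamma)[\gamma] &= V\cdot F(q_s)\,U_{\gamma_s} + F[q_s]\,\nabla_V^T(q_s)U_\gamma\\
&= \nabla_V^T(q_s)F[\gamma]\,U_{\gamma_s} + F[\gamma_s]\,\nabla_V^T(q_s)U_\gamma
\end{aligned}
\tag{4.6}
$$
with $q_s = \gamma_s(1)$. In particular, if $\partial_v^T U_\gamma$ exists in the sense of Definition 4.2, then we have at the endpoint $q = \gamma(1)$ that
$$
\partial_v^T(F\odot U_\gamma)[\gamma] = \partial_v^T F[\gamma]\,U_\gamma + F[\gamma]\,\partial_v^T U_\gamma
\tag{4.7}
$$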
which depends only on the vector $v$, and not on the particular extension $V$. We can thus safely state that the object defined in Definition 4.1 is really a derivative.
Exercise 4.6. What does the endpoint derivative of the path functionals $X^{\omega_1\cdots\omega_n}$ look like?
To be able to answer this question we need the following lemma:
Lemma 4.7. Suppose we have $\eta_s = \eta_s^V$. Then the following limits hold (the second for $n\ge2$):
$$
\lim_{\Delta\to0}\frac{1}{\Delta}\int_{\eta_\Delta}\omega = \omega(v),
\tag{4.8}
$$
$$
\lim_{\Delta\to0}\frac{1}{\Delta}\int_{\eta_\Delta}\omega_1\cdots\omega_n = 0.
\tag{4.9}
$$
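Lemma 4.7 can be checked numerically in a simple setting. The sketch below assumes $M = \mathbb{R}^2$, a constant extension $V \equiv v$ (so that the integral curve is a straight line), two arbitrarily chosen one-forms, and a midpoint quadrature; the function names and the forms are illustrative assumptions, not taken from the text.

```python
import math

def pair(form, point, vel):
    """Evaluate the one-form with components form(point) on the vector vel."""
    ax, ay = form(point)
    return ax * vel[0] + ay * vel[1]

def chen_integrals(form1, form2, q, v, delta, steps=2000):
    """Midpoint estimates of int w1 and int w1 w2 along x(t) = q + t v, 0 <= t <= delta."""
    h = delta / steps
    single = 0.0  # running value of the line integral of w1
    double = 0.0  # running value of the iterated integral over t1 < t2
    for k in range(steps):
        t = (k + 0.5) * h
        point = (q[0] + t * v[0], q[1] + t * v[1])
        f1 = pair(form1, point, v) * h
        f2 = pair(form2, point, v) * h
        double += (single + 0.5 * f1) * f2
        single += f1
    return single, double

# Hypothetical one-forms: w1 = x y dx + sin(x) dy, w2 = dx + x dy.
omega1 = lambda p: (p[0] * p[1], math.sin(p[0]))
omega2 = lambda p: (1.0, p[0])

q, v, delta = (1.0, 2.0), (0.5, -1.0), 1e-3
single, double = chen_integrals(omega1, omega2, q, v, delta)

print(single / delta, pair(omega1, q, v))  # both approximate omega1(v), cf. (4.8)
print(double / delta)                      # small, of order delta, cf. (4.9)
```

The single integral scales like $\Delta$ and the double integral like $\Delta^2$, which is why only the former survives the division by $\Delta$.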
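The proof of this lemma is straightforward after introduction of local coordinates, for which we can write $\omega = A_\mu(x)\,dx^\mu$.
Proof.
$$
\begin{aligned}
\lim_{\Delta\to0}\frac{1}{\Delta}\int_{\eta_\Delta}\omega &= \lim_{\Delta\to0}\frac{1}{\Delta}\int_{\eta_\Delta}A_\mu(x)\,dx^\mu\\
&= \lim_{\Delta\to0}\frac{1}{\Delta}\int_0^\Delta A_\mu[x(t)]\,\frac{dx^\mu}{dt}\,dt\\
&= \lim_{\Delta\to0}\frac{1}{\Delta}\int_0^\Delta A_\mu[x(t)]\,v^\mu(t)\,dt\\
&= A_\mu[x(0)]\,v^\mu(0) = \omega(v),
\end{aligned}
\tag{4.10}
$$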
which is valid under the assumption that there are no divergences in the kernel of the integral. Thus, Lemma 4.7 and equation (2.49) with $\gamma_1\cdot\gamma_2 = \gamma\cdot\eta_s$ (where $\gamma$ is a path from $\gamma(0)$ to $\gamma(s)$), for $n\ge1$, allow for the calculation of the endpoint path derivative
$$
\nabla_V^T(q_s)X^{\omega_1\cdots\omega_n}(\gamma) = X^{\omega_1\cdots\omega_{n-1}}(\gamma_s)\cdot\omega_n(V_{q_s})
\tag{4.11}
$$
and for $\partial_v^T$ this reduces to
$$
\partial_v^T X^{\omega_1\cdots\omega_n}(\gamma) = X^{\omega_1\cdots\omega_{n-1}}(\gamma)\cdot\omega_n(v).
\tag{4.12}
$$
The dependence on the vector $v$ is worth noticing here.
Exercise 4.8. Derive equation (4.11).
Equivalent expressions for the initial endpoint derivative can be derived for $n\ge1$:
$$
\nabla_V^I(q_s)X^{\omega_1\cdots\omega_n}(\gamma) = -\omega_1(V_{q_s})\cdot X^{\omega_2\cdots\omega_n}(\gamma_s),
\tag{4.13}
$$
$$
\partial_v^I X^{\omega_1\cdots\omega_n}(\gamma) = -\omega_1(v)\cdot X^{\omega_2\cdots\omega_n}(\gamma),
\tag{4.14}
$$
where $\omega_1,\cdots,\omega_n\in\bigwedge M$ or $\omega_1,\cdots,\omega_n\in\bigwedge M\otimes GL(n,\mathbb{C})$.
Keeping quantum field theory in mind, we can consider the commutator of two endpoint derivatives. Applying the commutator to $X^{\omega_1\cdots\omega_n}$ returns the following result:
$$
[\partial_{u_1}^T,\partial_{u_2}^T]X^{\omega_1\cdots\omega_n}(\gamma) = X^{\omega_1\cdots\omega_{n-2}}(\gamma)\cdot(\omega_{n-1}\wedge\omega_n)(u_1\wedge u_2).
\tag{4.15}
$$
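As a worked step, equation (4.15) can be recovered by applying equation (4.12) twice. The sketch below assumes that the two derivatives are composed algebraically (each one strips the last one-form of the word) and uses the convention $(\alpha\wedge\beta)(u_1\wedge u_2) = \alpha(u_1)\beta(u_2) - \alpha(u_2)\beta(u_1)$; both are assumptions about conventions rather than statements from the text.

```latex
\begin{aligned}
\partial^T_{u_1}\partial^T_{u_2} X^{\omega_1\cdots\omega_n}(\gamma)
  &= \partial^T_{u_1} X^{\omega_1\cdots\omega_{n-1}}(\gamma)\,\omega_n(u_2)
   = X^{\omega_1\cdots\omega_{n-2}}(\gamma)\,\omega_{n-1}(u_1)\,\omega_n(u_2),\\
\partial^T_{u_2}\partial^T_{u_1} X^{\omega_1\cdots\omega_n}(\gamma)
  &= X^{\omega_1\cdots\omega_{n-2}}(\gamma)\,\omega_{n-1}(u_2)\,\omega_n(u_1),\\
[\partial^T_{u_1},\partial^T_{u_2}] X^{\omega_1\cdots\omega_n}(\gamma)
  &= X^{\omega_1\cdots\omega_{n-2}}(\gamma)\,
     \bigl[\omega_{n-1}(u_1)\,\omega_n(u_2)-\omega_{n-1}(u_2)\,\omega_n(u_1)\bigr]\\
  &= X^{\omega_1\cdots\omega_{n-2}}(\gamma)\,(\omega_{n-1}\wedge\omega_n)(u_1\wedge u_2).
\end{aligned}
```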
The above results clearly show that $\nabla_V^T(q_s)X^{\omega_1\cdots\omega_n}(\gamma)$, given by equation (4.11), is a marked path functional as defined in Definition 4.4, with $F = \omega_n(V)$. It then follows that when we consider two vector fields $U_1, U_2$, locally defined around $q = \gamma(1)$, we can apply the Leibniz rule (Lemma 4.5) and the identity
$$
d\omega(U_1,U_2) = U_1\cdot\omega(U_2) - U_2\cdot\omega(U_1) - \omega([U_1,U_2])
$$
to derive that at $q$ we get
$$
[\nabla_{U_1}^T(q),\nabla_{U_2}^T(q)]X^{\omega_1\cdots\omega_n}(\gamma) = X^{\omega_1\cdots\omega_{n-1}}(\gamma)\cdot d\omega_n(u_1\wedge u_2) + X^{\omega_1\cdots\omega_{n-2}}(\gamma)\cdot(\omega_{n-1}\wedge\omega_n)(u_1\wedge u_2).
\tag{4.16}
$$
This only depends on the local vectors $u_1, u_2$, and not on the particular extensions $U_1, U_2$, allowing for the notation $[\nabla_{u_1}^T,\nabla_{u_2}^T]X^{\omega_1\cdots\omega_n}(\gamma)$.
Here we are specifically interested in the result of the application of the endpoint derivatives to the parallel transporter. Consider the parallel transport path functional $U_\gamma: PM\to GL(p)$, which was introduced in equation (2.125). Applying the terminal endpoint derivative results in
$$
\partial_{u_2}^T U_\gamma = U_\gamma\cdot\omega(u_2),
\tag{4.17}
$$
and for the initial endpoint derivative it returns:
$$
\partial_{u_2}^I U_\gamma = -\omega(u_2)\cdot U_\gamma.
\tag{4.18}
$$
The relevance of the application of the commutator to the gauge link will become clear in the next section, when we demonstrate its relation to the area derivative. For the moment let us just state the result:
$$
[\nabla_{u_1}^T,\nabla_{u_2}^T]U_\gamma = U_\gamma\cdot(d\omega + \omega\wedge\omega)(u_1\wedge u_2) = U_\gamma\cdot\Omega(u_1\wedge u_2),
\tag{4.19}
$$
where $\Omega$ is the curvature of the connection one-form $\omega$. In a more familiar quantum field notation this becomes:
$$
[\nabla_{u_1}^T,\nabla_{u_2}^T]U_\gamma = U_\gamma\cdot F_{\mu\nu}\,(u_1^\mu\wedge u_2^\nu),
\tag{4.20}
$$
where now $F_{\mu\nu}$ is the usual field strength tensor.³ In the above exposition we sometimes referred to the endpoint derivatives as being covariant, without explicitly explaining why. The following example demonstrates where the name covariant endpoint derivative has its origin.
3 Note the type of the different tensors: $F$ is a dual covariant tensor and the vectors $u_1, u_2$ are indeed two contra-variant tensors. This makes the contractions well-defined.
Example 4.9. Consider a path $\lambda\in PM_p$ and a function $F\in C^\infty M$. A marked path functional reads
$$
Z_{(i)}^{\omega_1\cdots\omega_n}(\lambda;F) \equiv X^{\omega_1\cdots\omega_i}(\lambda)\,F[\lambda(1)]\,X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1})
\tag{4.21}
$$
where $\omega_1\cdots\omega_n\in\bigwedge M$. Applying the Leibniz rule returns
$$
\begin{aligned}
\nabla_v^T Z_{(i)}^{\omega_1\cdots\omega_n}(\lambda;F) ={}& X^{\omega_1\cdots\omega_i}(\lambda)\cdot dF_q(v)\,X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1})\\
&+ X^{\omega_1\cdots\omega_{i-1}}(\lambda)\cdot\omega_i(v)\cdot F(q)\cdot X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1})\\
&- X^{\omega_1\cdots\omega_i}(\lambda)\cdot F(q)\cdot\omega_{i+1}(v)\cdot X^{\omega_{i+2}\cdots\omega_n}(\lambda^{-1}),
\end{aligned}
\tag{4.22}
$$
where $q = \lambda(1)$. Consider now a connection one-form $\omega$. A marked path functional $\Psi$ can be defined as
$$
\Psi(\lambda;F) \equiv U_\lambda\cdot F(q)\cdot U_{\lambda^{-1}}
\tag{4.23}
$$
where $q = \lambda(1)$, $U$ is the parallel transport operator of the connection $\omega$, and $F\in C^\infty M\otimes GL(n,\mathbb{C})$. By means of equation (4.22), one finds
$$
\nabla_v^T\Psi(\lambda;F) = U_\lambda\cdot\bigl(dF_q(v) + [\omega,F](v)\bigr)\cdot U_{\lambda^{-1}} = U_\lambda\cdot D_q^\omega F(v)\cdot U_{\lambda^{-1}},
\tag{4.24}
$$
where $D_q^\omega F(v) \equiv dF_q(v) + [\omega,F](v)$ stands for the usual covariant derivative of $F$. This explains the name of the operator $\nabla_v^T$ as the terminal endpoint covariant derivative.
4.2 Area derivative
In the previous section we introduced the path endpoint derivatives. Now we focus on the area derivative, Figure 4.2. We start with the definition of an area extension.
Fig. 4.2: $\Delta_{\lambda;u_1\wedge u_2}(q)X^{\omega_1\cdots\omega_r}(\gamma)$.
Definition 4.10 (Area extension). Consider a loop $\gamma\in LM_p$, a point $q\in M$, and a path $\lambda\in PM_p$ going from $p$ to $q = \lambda(1)$. Given an ordered pair $(u_1,u_2)$ of tangent vectors $u_1,u_2\in T_qM$, we extend them by two commuting vector fields $U_1,U_2\in\mathfrak{X}U$, defined in a vicinity $U$ of $q = \lambda(1)$. We introduce the infinitesimal loop $\square_\Delta^{(U_1,U_2)}$ based at $q$, which is defined by the local flows $\Phi$:
$$
\square_\Delta^{(U_1,U_2)} = \Phi_{U_2}(-\Delta)\,\Phi_{U_1}(-\Delta)\,\Phi_{U_2}(\Delta)\,\Phi_{U_1}(\Delta)(q)
\tag{4.25}
$$
where $\Phi_{U_1}, \Phi_{U_2}$ are the local flows of $U_1, U_2$. We use the notation $\lambda_\Delta$ for the ($\Delta$-dependent) loop
$$
\lambda_\Delta = \lambda\cdot\square_\Delta^{(U_1,U_2)}\cdot\lambda^{-1},
$$
see Figure 4.2, where the right panel now shows a curve of loops in $LM$. Due to the path reduction property (2.74), we obtain
$$
\lim_{\Delta\to0}\lambda_\Delta = \epsilon,
$$
where $\epsilon$ is the unity in the group $LM_p$ (of equivalence classes) of loops at $p\in M$. In the classical case one can write
$$
\lim_{\Delta\to0}\lambda_\Delta(X^u) = \lim_{\Delta\to0}X^u(\lambda_\Delta) = \epsilon(X^u).
\tag{4.26}
$$
Given that $\lambda_\Delta\cdot\gamma$ represents an infinitesimal deformation of the loop $\gamma$, in the topology of $LM_p$, the area derivative can be defined as follows:
Definition 4.11 (Area derivative). Given a loop functional $U_\gamma$ on $LM_p$, with values in $\mathbb{R}$, its area derivative $\Delta_{\lambda;(u_1,u_2)}(q)\cdot U_\gamma$ is defined by the limit (if it exists independently of the choice of the vector fields $U_1,U_2\in\mathfrak{X}U$)
$$
\Delta_{\lambda;(u_1,u_2)}(q)U_\gamma = \lim_{\Delta\to0}\frac{1}{\Delta^2}\left[U_{\lambda_\Delta\cdot\gamma} - U_\gamma\right].
\tag{4.27}
$$
With the goal of applying the area derivative to Wilson loop variables (for $SU(N)$ gauge theory)
$$
W_\gamma = \frac{1}{N}\left\langle 0\left|\,\mathrm{Tr}\,\mathcal{P}\exp\left[ig\oint_\gamma A_\mu(x)\,dx^\mu\right]\right|0\right\rangle,
\tag{4.28}
$$
we investigate the application of this derivative to the Chen iterated integrals $X^{\omega_1\cdots\omega_n}: LM_p\to\mathbb{R}$. The fact that the area derivative of the functionals $X^{\omega_1\cdots\omega_n}$ is well-defined stems from the following lemma.
Lemma 4.12. Write $\square_\Delta$ for $\square_\Delta^{(U_1,U_2)}$ as before. Then
$$
\lim_{\Delta\to0}\frac{1}{\Delta^2}\int_{\square_\Delta}\omega = \lim_{\Delta\to0}\frac{1}{\Delta^2}\int_V d\omega = d\omega(u_1\wedge u_2),
\tag{4.29}
$$
$$
\lim_{\Delta\to0}\frac{1}{\Delta^2}\int_{\square_\Delta}\omega_1\omega_2 = (\omega_1\wedge\omega_2)(u_1\wedge u_2),
\tag{4.30}
$$
$$
\lim_{\Delta\to0}\frac{1}{\Delta^2}\int_{\square_\Delta}\omega_1\cdots\omega_n = 0,
\tag{4.31}
$$
where $\omega,\omega_1,\cdots,\omega_n\in\bigwedge M$. Just like in the case of the endpoint derivatives, this lemma can be proved by introducing local coordinates. Again these integrals are well-defined by the Stokes theorem, but one does need to make a remark with respect to the goal of applying the area derivative to the Wilson loop variables, equation (4.28). To show the intricacies, rewrite the integral in (4.30) in a more familiar gauge theory notation:
$$
\int_{\square_\Delta}\omega_1\omega_2 = \int_{\square_\Delta}A_\mu A_\nu,
\tag{4.32}
$$
where $A_\mu, A_\nu$ are the gauge connection one-forms. This integral is well-defined in the classical case, but becomes problematic in a field theory setting when taking vacuum expectation values. Even when both one-forms are considered locally constant, the vacuum expectation value gives rise to a tadpole, which can be taken care of by a convenient regularization scheme. In the general case, however, and more specifically in the case of Wilson loops lying entirely on the light cone, one will not be able to resolve this issue. The question is then whether one can interchange the integrals and the vacuum expectation values. For the remainder of this section we will assume that the integrals in Lemma 4.12 are well-defined and use the values shown there. Notice that the area derivative defined above introduces extra cusps along the contour, which is the main cause of divergences of the integrals in Lemma 4.12 in a quantum field theory setting. To investigate the properties of the area derivative applied to the functionals $X^{\omega_1\cdots\omega_n}$ further, we define the following derivation.
Definition 4.13. For $u_1\wedge u_2\in\bigwedge^2 T_qM$ define a derivation $D_{u_1\wedge u_2}(q)$ in the algebra of iterated integrals by
$$
D_{u_1\wedge u_2}(q)X^{\omega_1\cdots\omega_n} = X^{\omega_1\cdots\omega_{n-1}}\cdot d\omega_n(u_1\wedge u_2).
\tag{4.33}
$$
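The limits of Lemma 4.12, which underlie the derivation just defined, can be checked numerically in a simple case. The sketch below assumes $M = \mathbb{R}^2$ with the commuting coordinate fields $U_1 = \partial_x$, $U_2 = \partial_y$ (so $u_1 = e_x$, $u_2 = e_y$), a square loop traversed as in (4.25), and the ordering convention $\int\omega_1\omega_2 = \int_{t_1<t_2}\omega_1(t_1)\,\omega_2(t_2)$; the function name and the chosen forms are illustrative assumptions.

```python
def loop_integrals(form1, form2, q, delta, steps=500):
    """Return (int w1, int w1 w2) over the square loop based at q with side delta.

    The loop runs along +x, +y, -x, -y; forms map (x, y) to components (a_x, a_y).
    """
    directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    x, y = q
    h = delta / steps
    single = 0.0  # running line integral of w1
    double = 0.0  # running iterated integral over t1 < t2
    for dx, dy in directions:
        for k in range(steps):
            px = x + (k + 0.5) * h * dx
            py = y + (k + 0.5) * h * dy
            a1 = form1(px, py)
            a2 = form2(px, py)
            f1 = (a1[0] * dx + a1[1] * dy) * h
            f2 = (a2[0] * dx + a2[1] * dy) * h
            double += (single + 0.5 * f1) * f2
            single += f1
        x += delta * dx
        y += delta * dy
    return single, double

q, delta = (1.0, 2.0), 1e-2

# w = x y dy, so d(w) = y dx^dy and d(w)(e_x, e_y) = 2 at q = (1, 2).
single, _ = loop_integrals(lambda x, y: (0.0, x * y), lambda x, y: (0.0, 0.0), q, delta)
print(single / delta**2)

# w1 = dx, w2 = dy, so (w1 ^ w2)(e_x, e_y) = 1.
_, double = loop_integrals(lambda x, y: (1.0, 0.0), lambda x, y: (0.0, 1.0), q, delta)
print(double / delta**2)
```

Both ratios approach the values stated in (4.29) and (4.30) as $\Delta\to0$; the first carries a correction of order $\Delta$ because $d\omega$ varies over the square.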
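From the (algebraic) commutator $[\partial_{u_1}^T,\partial_{u_2}^T]$ of two terminal endpoint derivatives at $q$, we also define an extended derivation, again written $D_{u_1\wedge u_2}(q)$, according to:
$$
D_{u_1\wedge u_2}(q) = D_{u_1\wedge u_2}(q)\big|_{(4.33)} + [\partial_{u_1}^T,\partial_{u_2}^T],
\tag{4.34}
$$
where the first term on the right-hand side is the derivation of Definition 4.13. For this extended derivation we formulate the lemma below, establishing its relationship to the area derivative.
Lemma 4.14. Let $\Delta_{\lambda;(u_1,u_2)}(q)$ be as introduced in Definition 4.11. Then
$$
\Delta_{\lambda;(u_1,u_2)}(q)X^{\omega_1\cdots\omega_n}(\epsilon) = \sum_{i=1}^{n}\bigl(D_{u_1\wedge u_2}(q)X^{\omega_1\cdots\omega_i}(\lambda)\bigr)\bigl(X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1})\bigr),
\tag{4.35}
$$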
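where $\omega_1\cdots\omega_n\in\bigwedge M$. Since the derivative depends only on the wedge product $u_1\wedge u_2$ of the local vectors, we introduce the notation $\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}(\epsilon)$. The lemma can be proved using the properties of the Chen iterated integrals for products of loops in Definition 4.11 and comparing the resulting expression with the results of applying the derivative, equation (4.13), to the functionals $X^{\omega_1\cdots\omega_n}$.
Example 4.15.
$$
\begin{aligned}
\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega}(\epsilon) &= d\omega(u_1\wedge u_2),\\
\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\omega_2}(\epsilon) &= d\omega_1(u_1\wedge u_2)\,X^{\omega_2}(\lambda^{-1}) + X^{\omega_1}(\lambda)\cdot d\omega_2(u_1\wedge u_2) + (\omega_1\wedge\omega_2)(u_1\wedge u_2),
\end{aligned}
\tag{4.36}
$$
and, more generally,
$$
\begin{aligned}
\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}(\epsilon) ={}& \sum_{i=1}^{n}X^{\omega_1\cdots\omega_{i-1}}(\lambda)\cdot d\omega_i(u_1\wedge u_2)\,X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1})\\
&+ \sum_{i=2}^{n}X^{\omega_1\cdots\omega_{i-2}}(\lambda)\cdot(\omega_{i-1}\wedge\omega_i)(u_1\wedge u_2)\cdot X^{\omega_{i+1}\cdots\omega_n}(\lambda^{-1}).
\end{aligned}
\tag{4.37}
$$
Keeping in mind that
$$
\lim_{\Delta\to0}\lambda_\Delta = \epsilon,
$$
and still assuming that the integrals in Lemma 4.12 are well-defined, one can demonstrate that
$$
\lim_{\Delta\to0}\frac{\lambda_\Delta - \epsilon}{\Delta} = 0
$$
by introducing local coordinates. At the same time we find that
$$
\lim_{\Delta\to0}\frac{\lambda_\Delta - \epsilon}{\Delta^2}
$$
exists and is actually the area derivative. We now write $\delta_{(\lambda;u_1\wedge u_2)}$ for the operator in the algebra of iterated integrals $A_p$, defined through the derivations from equations (4.33) and (4.34):
$$
\delta_{(\lambda;u_1\wedge u_2)}X^u \equiv \Delta_{(\lambda;u_1\wedge u_2)}(q)X^u(\epsilon) = (\lambda\otimes\lambda)\bigl((D_{u_1\wedge u_2}(q)\otimes J)\circ\Delta\bigr)X^u,
\tag{4.38}
$$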
for $u\in Sh$, where in the last line $J$ is the antipode and $\Delta$ the co-multiplication of the Hopf algebra structure on $A_p$. The last equality can be understood from combining the definitions of the operators written in the last line of equation (4.38) with the left action of a loop on the space of generalized loops. Since this is a topological group, as we have seen in the previous chapter, this action is well-defined.
Exercise 4.16. Demonstrate that the last line in equation (4.38) is indeed equivalent to equation (4.37).
In a similar way it is possible to show that $\delta_{(\lambda;u_1\wedge u_2)}$ is a pointed derivation at $\epsilon$:
$$
\delta_{(\lambda;u_1\wedge u_2)}(X^{u_1}X^{u_2}) = \delta_{(\lambda;u_1\wedge u_2)}(X^{u_1})\,\epsilon(X^{u_2}) + \epsilon(X^{u_1})\,\delta_{(\lambda;u_1\wedge u_2)}(X^{u_2}),
\tag{4.39}
$$
for all $u_1,u_2\in Sh(\Omega)$. Note that equation (4.38) indicates that $\delta_{(\lambda;u_1\wedge u_2)}: A_p\to K$ is a linear map. In the discussion of the Lie algebra of the generalized loop space we derived that the tangent space $T_\epsilon LM_p$, to the group $LM_p$, at $\epsilon$, is a $K$-linear subspace of $A_p^*$. We will now demonstrate that this space is generated by all the $\delta_{(\lambda;u_1\wedge u_2)}$. Suppose we have a loop $\gamma\in LM_p$ and a path $\lambda\in PM_p$, and evaluate the area derivative (Figure 4.3) $\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}[\gamma]$. We have the following lemma:
Lemma 4.17. For $u_1\wedge u_2\in\bigwedge^2 T_{\lambda(1)}M$ one has
$$
\begin{aligned}
\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}(\gamma) &= \sum_{i=1}^{n}\Delta_{(\lambda;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_i}(\epsilon)\,X^{\omega_{i+1}\cdots\omega_n}(\gamma)\\
&= \gamma\circ(\delta_{(\lambda;u_1\wedge u_2)}\otimes1)\circ\Delta(X^{\omega_1\cdots\omega_n})
\end{aligned}
\tag{4.40}
$$
with $(\delta_{(\lambda;u_1\wedge u_2)}\otimes1)\circ\Delta$ the right invariant derivation on the algebra $A_p$, which is associated to the tangent vector $\delta_{(\lambda;u_1\wedge u_2)}$. This lemma motivates the notation $R_{(\lambda;u_1\wedge u_2)}: LM_p\to A_p^*$, given by
$$
R_{(\lambda;u_1\wedge u_2)}(\gamma) \equiv \gamma\circ(\delta_{(\lambda;u_1\wedge u_2)}\otimes1)\circ\Delta
\tag{4.41}
$$
and its designation as the right invariant 'vector field' on $LM_p$, determined by $\delta_{(\lambda;u_1\wedge u_2)}$.
Fig. 4.3: $\Delta_{\lambda;u_1\wedge u_2}(q)X^{\omega_1\cdots\omega_r}(\gamma)$.
If $\lambda = \epsilon$, then $\Delta_{(\epsilon;u_1\wedge u_2)}(p)$ is said to be the initial endpoint area derivative, denoted by $I_{(\epsilon;u_1\wedge u_2)}(p)$, as visualized in Figure 4.4.
Fig. 4.4: $I_{(\epsilon;u_1\wedge u_2)}(p)$.
In this case the area derivative reduces to:
$$
I_{(\epsilon;u_1\wedge u_2)}(p)X^{\omega_1\cdots\omega_n}(\gamma) = d\omega_1(u_1\wedge u_2)\cdot X^{\omega_2\cdots\omega_n}(\gamma) + (\omega_1\wedge\omega_2)(u_1\wedge u_2)\cdot X^{\omega_3\cdots\omega_n}(\gamma).
\tag{4.42}
$$
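Another possibility is to consider $\lambda = \gamma\cdot\eta$, with $\gamma\in LM_p$, $\eta\in PM_p$, and $u_1\wedge u_2\in\bigwedge^2 T_{\eta(1)}M$. In the latter case, Figure 4.5,
$$
\begin{aligned}
\lambda_\Delta\cdot\gamma &\equiv (\lambda\cdot\square_\Delta^{(U_1,U_2)}\cdot\lambda^{-1})\cdot\gamma = \gamma\cdot\eta\cdot\square_\Delta^{(U_1,U_2)}\cdot(\gamma\cdot\eta)^{-1}\cdot\gamma\\
&= \gamma\cdot(\eta\cdot\square_\Delta^{(U_1,U_2)}\cdot\eta^{-1}) \equiv \gamma\cdot\eta_\Delta.
\end{aligned}
\tag{4.43}
$$
The area derivative in this situation is referred to as the terminal endpoint area derivative and is denoted by $E_{(\eta;u_1\wedge u_2)}(q)$. Similarly to the case of the initial endpoint area derivative, one derives
$$
\begin{aligned}
E_{(\eta;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}(\gamma) &= \sum_{i=1}^{n}X^{\omega_1\cdots\omega_i}(\gamma)\,\Delta_{(\eta;u_1\wedge u_2)}(q)X^{\omega_{i+1}\cdots\omega_n}(\epsilon)\\
&= \gamma\circ(1\otimes\delta_{(\eta;u_1\wedge u_2)})\circ\Delta(X^{\omega_1\cdots\omega_n}).
\end{aligned}
\tag{4.44}
$$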
Fig. 4.5: $E_{(\eta;u_1\wedge u_2)}(q)$.
Analogous to the right invariant derivations, we can define the left invariant derivation $(1\otimes\delta_{(\eta;u_1\wedge u_2)})\circ\Delta$ associated to $\delta_{(\eta;u_1\wedge u_2)}$. Naturally, $L_{(\eta;u_1\wedge u_2)}: LM_p\to A_p^*$, given by
$$
L_{(\eta;u_1\wedge u_2)}(\gamma) \equiv \gamma\circ(1\otimes\delta_{(\eta;u_1\wedge u_2)})\circ\Delta,
\tag{4.45}
$$
is now referred to as the left invariant 'vector field' on $LM_p$, determined by $\delta_{(\eta;u_1\wedge u_2)}$.
Fig. 4.6: $I_{(\epsilon;u_1\wedge u_2)}(p)$.
If $\eta = \epsilon$ (see Fig. 4.6), the above formula reduces to
$$
\begin{aligned}
E_{(\epsilon;u_1\wedge u_2)}(p)X^{\omega_1\cdots\omega_n}(\gamma) &= D_{u_1\wedge u_2}(p)X^{\omega_1\cdots\omega_n}\\
&= X^{\omega_1\cdots\omega_{n-1}}(\gamma)\cdot d\omega_n(u_1\wedge u_2) + X^{\omega_1\cdots\omega_{n-2}}(\gamma)\cdot(\omega_{n-1}\wedge\omega_n)(u_1\wedge u_2).
\end{aligned}
\tag{4.46}
$$
Particularly interesting in this last case is that we can relate the area derivative to the Lie bracket of terminal endpoint path derivations, equation (4.16):
$$
\Delta_{(\epsilon;u_1\wedge u_2)}(q)X^{\omega_1\cdots\omega_n}(\gamma) = [\nabla_{u_1}^T,\nabla_{u_2}^T]X^{\omega_1\cdots\omega_n}(\gamma).
\tag{4.47}
$$
If, in this specific case, we consider not the functionals $X^{\omega_1\cdots\omega_n}$ but the holonomy $U_\gamma$ instead, we obtain
$$
E_{(\epsilon;u_1\wedge u_2)}(p)U_\gamma = U_\gamma\cdot(d\omega + \omega\wedge\omega)(u_1\wedge u_2) = U_\gamma\cdot\Omega(u_1\wedge u_2)
\tag{4.48}
$$
where again $\Omega$ is the curvature of the connection $\omega$. The fact that $A_p$ is a nuclear algebra means that we can apply this derivation to the Wilson loop $W$:
$$
E_{(\epsilon;u_1\wedge u_2)}(p)W(\gamma) = \mathrm{Tr}\bigl((d\omega + \omega\wedge\omega)(u_1\wedge u_2)\cdot U_\gamma\bigr) = \mathrm{Tr}\bigl(\Omega(u_1\wedge u_2)\cdot U_\gamma\bigr),
\tag{4.49}
$$
which are also referred to as the Mandelstam formulas. Since we are dealing with Lie algebras, it is not a surprise that we also have a Bianchi identity.
Theorem 4.18 (Bianchi identity).
$$
\sum_{\mathrm{cycl}\{u_1,u_2,u_3\}}\nabla_{u_1}^T(\lambda(1))\,\delta_{(\lambda;u_2\wedge u_3)} = 0,
\tag{4.50}
$$
where $\sum_{\mathrm{cycl}\{u_1,u_2,u_3\}}$ stands for the sum over the cyclic permutations of the vectors $u_1, u_2, u_3$. Just like with the endpoint derivatives, we can consider the commutator of two area derivatives which, as elements of the Lie algebra, will allow the determination of the structure constants:
$$
[\delta_{(\lambda;a_1\wedge a_2)},\delta_{(\eta;u_1\wedge u_2)}] = \delta_{(\lambda;a_1\wedge a_2)}\star\delta_{(\eta;u_1\wedge u_2)} - \delta_{(\eta;u_1\wedge u_2)}\star\delta_{(\lambda;a_1\wedge a_2)}.
\tag{4.51}
$$
Using the definition of the area derivative this can be written as:
$$
\begin{aligned}
[\delta_{(\lambda;a_1\wedge a_2)},\delta_{(\eta;u_1\wedge u_2)}]X^{\omega_1\cdots\omega_n} ={}& \sum_{i=0}^{n}\sum_{k=0}^{i}\bigl(D_{a_1\wedge a_2}(\lambda(1))X^{\omega_1\cdots\omega_k}(\lambda)\bigr)\bigl(X^{\omega_{k+1}\cdots\omega_i}(\lambda^{-1})\bigr)\,\delta_{(\eta;u_1\wedge u_2)}(X^{\omega_{i+1}\cdots\omega_n})\\
&- \sum_{i=0}^{n}\sum_{k=0}^{i}\bigl(D_{u_1\wedge u_2}(\eta(1))X^{\omega_1\cdots\omega_k}(\eta)\bigr)\bigl(X^{\omega_{k+1}\cdots\omega_i}(\eta^{-1})\bigr)\,\delta_{(\lambda;a_1\wedge a_2)}(X^{\omega_{i+1}\cdots\omega_n}),
\end{aligned}
\tag{4.52}
$$
from which one is formally able to extract the structure constants. With this we end our introduction of the area derivative and move on to the variational derivative.
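The statement of equation (4.48), that the area derivative of the holonomy produces the curvature, can be illustrated for a constant connection. The sketch below assumes $M = \mathbb{R}^2$, a constant matrix-valued connection $\omega = A_1\,dx + A_2\,dy$ (so that $d\omega = 0$ and $\Omega(e_x\wedge e_y) = [A_1,A_2]$), the ordering convention $U_{\gamma_1\cdot\gamma_2} = U_{\gamma_1}U_{\gamma_2}$ suggested by (4.17), and no coupling factor $ig$; the matrices and helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    """Multiply a matrix by the scalar c."""
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=20):
    """Matrix exponential by a truncated Taylor series (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def commutator(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def square_holonomy(a1, a2, delta):
    """Holonomy around the square loop +x, +y, -x, -y for a constant connection."""
    u = mat_exp(scale(a1, delta))
    for gen, sign in ((a2, 1.0), (a1, -1.0), (a2, -1.0)):
        u = mat_mul(u, mat_exp(scale(gen, sign * delta)))
    return u

def curvature_estimate(a1, a2, delta):
    """(U_square - 1) / delta^2, the discrete analogue of the area derivative at the identity."""
    u = square_holonomy(a1, a2, delta)
    n = len(u)
    return [[(u[i][j] - float(i == j)) / delta**2 for j in range(n)] for i in range(n)]

A1 = [[0.0, 1.0], [0.0, 0.0]]  # hypothetical connection components
A2 = [[0.0, 0.0], [1.0, 0.0]]

estimate = curvature_estimate(A1, A2, 1e-3)
exact = commutator(A1, A2)
print(estimate)
print(exact)
```

For this choice the estimate differs from $[A_1,A_2]$ only by terms of order $\Delta$, in line with the $1/\Delta^2$ scaling of Definition 4.11.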
4.3 Variational calculus
In the previous section we introduced the area derivative, which depends on the local flows of two independent local vector fields. A crucial problem with this derivative in the context of quantum field theory, when calculating perturbative matrix elements, vacuum expectation values, etc., is that it may introduce extra cusps (angle-like obstructions) in the loops and, consequently, may generate extra singularities in the perturbative expansion. To deal with this problem we shall define a different area derivative that, under certain assumptions, will not introduce extra singularities. Within this scheme the variations of the shape of a contour are generated by diffeomorphisms, which can be related to the Fréchet derivative. In some respects the latter can be considered as a differential operator situated, in a sense, between the path derivative and the area derivative. To introduce this derivative consider $\mathrm{Diff}(M)$, the diffeomorphism group of $M$. Let now $\phi\in\mathrm{Diff}(M)$ be a diffeomorphism of $M$ and $\gamma\in PM$ be a path in $M$. Then we write $\phi\cdot\gamma$ for the image of the path $\gamma$ under the diffeomorphism $\phi$. From elementary manifold theory we conclude that the action of the diffeomorphism $\phi$ on the functionals $X^{\omega_1\cdots\omega_n}$ is given by
$$
X^{\omega_1\cdots\omega_n}[\phi\cdot\gamma] = X^{\phi^*\omega_1\cdots\phi^*\omega_n}[\gamma]
\tag{4.53}
$$
where, as before, $\omega_1,\cdots,\omega_n\in\bigwedge M$, and $\phi^*\omega_i$ are the pullbacks of $\omega_i$ under the map $\phi$. We focus our attention on the diffeomorphisms that form a one-parameter group, infinitesimally generated by $Y\in\mathfrak{X}M$, a vector field on $M$. The vector field $Y$ then generates a one-parameter group of active diffeomorphisms $\psi(t)$ by the identification
$$
\psi_t^Y(p) := c_p^Y(t),
\tag{4.54}
$$
where $t\to c_p^Y(t)$ is the maximal integral curve in $M$ starting at $p\in M$, with tangent vector field $Y$ at each point of the curve. For a composition of diffeomorphisms we have
$$
\psi_t^Y\circ\psi_s^Y(p) = \psi_{s+t}^Y(p),
$$
so that $\mathrm{Diff}(M)$ turns into a group. The local flow of a vector field allows us to define a Lie derivative of any tensor field $t$ by the identification
$$
(L_Y(t))(p) := \left(\frac{d}{ds}\Big|_{s=0}(\psi_s^Y)^*(t)\right)(p),
\tag{4.55}
$$
which, being applied to equation (4.53), results in
$$
\begin{aligned}
D_V X^{\omega_1\cdots\omega_n}(\gamma) &\equiv \frac{d}{ds}\Big|_{s=0}X^{\omega_1\cdots\omega_n}(\phi_s\cdot\gamma)\\
&= \sum_{i=1}^{n}X^{\omega_1\cdots\omega_{i-1}(L_Y\omega_i)\omega_{i+1}\cdots\omega_n}(\gamma),
\end{aligned}
\tag{4.56}
$$
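where $L_Y\omega$ refers to the Lie derivative of the one-form $\omega$ in the direction of $Y$. Making use of Cartan's formula,
$$
L_Y = \iota_Y d + d\,\iota_Y,
\tag{4.57}
$$
where
$$
\iota_Y\omega^n(v_2,\cdots,v_n) = \omega^n(Y,v_2,\cdots,v_n)
$$
is the interior product, equation (4.56) reduces to⁴
$$
\begin{aligned}
D_V X^{\omega_1\cdots\omega_n}(\gamma) ={}& \sum_{i=1}^{n}X^{\omega_1\cdots\omega_{i-1}\cdot(\iota_Y d\omega_i)\cdot\omega_{i+1}\cdots\omega_n}[\gamma]\\
&+ \sum_{i=2}^{n}X^{\omega_1\cdots\omega_{i-2}\cdot\iota_Y(\omega_{i-1}\wedge\omega_i)\cdot\omega_{i+1}\cdots\omega_n}[\gamma]\\
&+ \omega_n[V(1)]\,X^{\omega_1\cdots\omega_{n-1}}(\gamma) - \omega_1(V(0))\,X^{\omega_2\cdots\omega_n}[\gamma].
\end{aligned}
\tag{4.58}
$$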
4 Note the different limits of the summations.
From Section 3.2 we know that $A_p$ is a Fréchet space, such that the Lie derivative $D_V X^{\omega_1\cdots\omega_n}[\gamma]$ associated with $V$ now is a Fréchet derivative of $X^{\omega_1\cdots\omega_n}$ at $\gamma$, in the direction of the tangent vector $V = Y\circ\gamma\in\gamma^*TM$, as introduced in Definition 2.121. Restricting ourselves to the pointed diffeomorphism group $\mathrm{Diff}_p(M)$, i.e., the diffeomorphisms $\phi$ that fix the point $p$, and keeping in mind that this is also a topological group, we can consider its 'Lie algebra' $\mathfrak{X}_p(M)$. This algebra consists of the vector fields $Y$ that vanish at $p$. The action of the elements of this algebra on the algebra $A_p$ can be naturally defined by making use of the pullbacks of the one-forms $\omega_i$:
$$
(\phi,X^{\omega_1\cdots\omega_n}) \to \phi\cdot X^{\omega_1\cdots\omega_n} \equiv X^{\phi^*\omega_1\cdots\phi^*\omega_n}.
\tag{4.59}
$$
Since diffeomorphisms do not change the algebraic structure of the one-forms, this action also preserves the Hopf algebra structure, and as such $\phi$ is a Hopf algebra automorphism. Written explicitly:
$$
\begin{aligned}
\phi\cdot(X^uX^v) &= (\phi\cdot X^u)\cdot(\phi\cdot X^v),\\
\Delta\circ\phi &= (\phi\otimes\phi)\circ\Delta.
\end{aligned}
\tag{4.60}
$$
As a direct consequence of this fact, $\phi$ induces an automorphism of $\widetilde{LM}_p$ through the identification:
$$
\phi\cdot\tilde\alpha(X^{\omega_1\cdots\omega_n}) \equiv \tilde\alpha(\phi\cdot X^{\omega_1\cdots\omega_n}) = \tilde\alpha(X^{\phi^*\omega_1\cdots\phi^*\omega_n}).
\tag{4.61}
$$
$\phi$, as an element of $\mathrm{Aut}(\widetilde{LM}_p)$, has a differential $d\phi: \widetilde{lM}_p\to\widetilde{lM}_p$, defined as in standard differential geometry by
$$
d\phi(\delta)(X^{\omega_1\cdots\omega_n}) \equiv \delta(X^{\phi^*\omega_1\cdots\phi^*\omega_n}),
\tag{4.62}
$$
with $\delta$ a tangent vector (or derivation) of $\widetilde{LM}_p$. This differential allows $\phi\to d\phi$ to produce a linear representation of the pointed diffeomorphism group $\mathrm{Diff}_p(M)$ on $\widetilde{lM}_p$. The infinitesimal action of $Y\in\mathfrak{X}_p(M)$ on $\delta$, written as $Y\cdot\delta$, is represented by:
$$
(Y\cdot\delta)(X^{\omega_1\cdots\omega_n}) = \sum_{i=1}^{n}\delta(X^{\omega_1\cdots\omega_{i-1}\cdot(L_Y\omega_i)\cdot\omega_{i+1}\cdots\omega_n}),
\tag{4.63}
$$
where we used $Y(0) = 0 = Y(1)$. Using Cartan's expression (4.57) for the Lie derivative and the expressions that defined the ideal $J_p$ of $A_p$, the above result can be reduced to
$$
\begin{aligned}
(Y\cdot\delta)(X^{\omega_1\cdots\omega_n}) ={}& \sum_{i=1}^{n}\delta(X^{\omega_1\cdots\omega_{i-1}\cdot(\iota_Y d\omega_i)\cdot\omega_{i+1}\cdots\omega_n})\\
&+ \sum_{i=2}^{n}\delta(X^{\omega_1\cdots\omega_{i-2}\cdot\iota_Y(\omega_{i-1}\wedge\omega_i)\cdot\omega_{i+1}\cdots\omega_n}),
\end{aligned}
\tag{4.64}
$$
where $\iota_Y$ stands for the interior product.
4.4 Fréchet derivative in a generalized loop space
We end the mathematical introduction to the theory of the Wilson lines with the discussion of the connection between the Fréchet derivative and diffeomorphisms. Namely, we shall show how the diffeomorphism-generating vector field $V$ from the previous section becomes a variational vector field. Suppose we have a path $\gamma\in PM_p$ based at $p$, with $T_\gamma PM_p$ the tangent space of $PM_p$ at $\gamma$, as visualized in Figure 4.7.
Fig. 4.7: Diffeomorphism of a path.
The vector fields along $\gamma$ are defined through the pullback bundle $\gamma^*TM$. Notice that these vanish at $p$, since this point needs to stay fixed. Let us now choose such a vector $V\in T_\gamma PM_p$. Defining $s\to\gamma_s$ as a curve of paths in $PM_p$, starting at $\gamma$ for $s = 0$, with velocity $V$, we can write:
$$
\gamma_0 = \gamma,
\tag{4.65}
$$
$$
V(t) = \frac{\partial}{\partial s}\Big|_{s=0}\gamma_s(t),
\tag{4.66}
$$
$$
V(0) = 0.
\tag{4.67}
$$
The map $s\to\gamma_s$ is the variation of $\gamma = \gamma_0$, with associated variational vector field $V$. In the special case that the variation $\gamma_s$ is induced by a diffeomorphism, like in the previous section, $\gamma_s = \phi_s\circ\gamma$, and the vector field is the diffeomorphism generator, $V = Y\circ\gamma$, we can determine the Fréchet derivative of the path functionals $X^{\omega_1\cdots\omega_n}$ at $\gamma\in PM_p$.
This derivative was defined in Definition 2.121 as the linear map
$$
D_{\cdot}X^{\omega_1\cdots\omega_n}[\gamma]: T_\gamma LM\to\mathbb{R}.
$$
In the previous section we concluded that in the case we are considering here it can be written as:
$$
D_V X^{\omega_1\cdots\omega_n}[\gamma] \equiv \frac{d}{ds}\Big|_{s=0}X^{\omega_1\cdots\omega_n}[\gamma_s].
\tag{4.68}
$$
The derivation of this result requires one more lemma:
Lemma 4.19. Consider a manifold $N$ ($I$ or $S^1$), an immersion $\gamma: N\to M$, and a differential form $\omega$ in $M$. Suppose that $\Gamma: N\times[0,\epsilon]\to M$ is a smooth variation of $\gamma$, with variational vector field $V$. This means that, given $\gamma_s(t) = \Gamma(t,s)$ for $(t,s)\in N\times[0,\epsilon]$, one finds $\gamma_0 = \gamma$ and
$$
V(t) = \frac{\partial}{\partial s}\Big|_{s=0}\Gamma(t,s) = \Gamma_{*(t,0)}\left(\frac{\partial}{\partial s}\Big|_{(t,0)}\right),
$$
for $t\in N$. Therefore, we have
$$
\begin{aligned}
\frac{d}{ds}\Big|_{s=0}\gamma_s^*\omega &= \gamma^*\bigl(\iota_V d\omega + d(\iota_V\omega)\bigr)\\
&= \gamma^*(\iota_V d\omega) + d\bigl(\gamma^*(\iota_V\omega)\bigr)
\end{aligned}
\tag{4.69}
$$
as differential forms on $N = N\times\{0\}$. In this lemma $\iota_{V(t)}\omega$ is the interior product of the form $\omega(\gamma(t))$ with $V(t)\in T_{\gamma(t)}M$, and $d$ is the usual differential operator.
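Consider the case when $\gamma: I\to M$ is an immersed path, based at $p$, and $\gamma_s$ a variation of this path generated by the variational vector field $V$. The Fréchet derivative then becomes:
$$
\begin{aligned}
\frac{d}{ds}\Big|_{s=0}\left(\int_{\gamma_s}\omega\right) &= \frac{d}{ds}\Big|_{s=0}\left(\int_I\gamma_s^*\omega\right) = \int_I\gamma^*\bigl(\iota_V d\omega + d(\iota_V\omega)\bigr)\\
&= \int_I\gamma^*(\iota_V d\omega) + \int_{\partial I}\gamma^*(\iota_V\omega)\\
&= \int_I\gamma^*(\iota_V d\omega) + \omega[V(1)] - \omega[V(0)]\\
&= \int_\gamma\iota_V d\omega + \omega[V(1)],
\end{aligned}
\tag{4.70}
$$
where we used the above lemma, the condition $V(0) = 0$, and, in the last equality, the identity
$$
\int_\gamma\iota_V d\omega = \int_I\gamma^*(\iota_V d\omega).
$$
Applying this to a loop $\gamma\in LM_p$ and taking into account that $V(0) = 0 = V(1)$ results in
$$
D_V X^{\omega}[\gamma] = X^{\iota_V d\omega}[\gamma] = \int_\gamma\iota_V d\omega
\tag{4.71}
$$
for the functional $X^\omega$. Lemma 4.19 combined with an induction procedure results in the following expression for a general functional $X^{\omega_1\cdots\omega_n}$:
$$
\begin{aligned}
D_V X^{\omega_1\cdots\omega_n}[\gamma] ={}& \sum_{i=1}^{n}\int_\gamma\omega_1\cdots\omega_{i-1}\cdot\iota_V(d\omega_i)\cdot\omega_{i+1}\cdots\omega_n\\
&+ \sum_{i=2}^{n}\int_\gamma\omega_1\cdots\omega_{i-2}\cdot\iota_V(\omega_{i-1}\wedge\omega_i)\cdot\omega_{i+1}\cdots\omega_n\\
&+ \left(\int_\gamma\omega_1\cdots\omega_{n-1}\right)\cdot\omega_n[V(1)],
\end{aligned}
\tag{4.72}
$$
which is the same as equation (4.58). For an immersed loop $\gamma\in LM_p$, we have only to consider variations $V$ that keep the base point $p$ fixed:
$$
V_p \equiv \{V\in\gamma^*TM : V(0) = 0 = V(1)\}.
\tag{4.73}
$$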
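The first-variation formula (4.70) can be checked numerically on a simple example. The sketch below assumes $M = \mathbb{R}^2$, the one-form $\omega = x\,dy$ (so $d\omega = dx\wedge dy$ and $\iota_V d\omega = V^x\,dy - V^y\,dx$), the path $\gamma(t) = (t,t^2)$, and the variation $\gamma_s = \gamma + sV$ with $V(t) = (t^2,t)$, which satisfies $V(0) = 0$. All of these choices, and the function names, are illustrative assumptions.

```python
def gamma(t):
    return (t, t * t)

def gamma_dot(t):
    return (1.0, 2.0 * t)

def V(t):
    """Variational vector field along gamma, vanishing at t = 0."""
    return (t * t, t)

def V_dot(t):
    return (2.0 * t, 1.0)

def integral_omega(s, steps=2000):
    """Midpoint estimate of the integral of omega = x dy along gamma_s = gamma + s V."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = gamma(t)[0] + s * V(t)[0]
        dy = gamma_dot(t)[1] + s * V_dot(t)[1]
        total += x * dy * h
    return total

def frechet_formula(steps=2000):
    """Right-hand side of (4.70): bulk term plus boundary terms, for omega = x dy."""
    h = 1.0 / steps
    bulk = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        vx, vy = V(t)
        gx, gy = gamma_dot(t)
        bulk += (vx * gy - vy * gx) * h   # pullback of iota_V (dx ^ dy)
    boundary = gamma(1.0)[0] * V(1.0)[1] - gamma(0.0)[0] * V(0.0)[1]
    return bulk + boundary

s = 1e-4
finite_difference = (integral_omega(s) - integral_omega(-s)) / (2.0 * s)
print(finite_difference, frechet_formula())
```

In this example the bulk term vanishes and the whole variation comes from the endpoint term $\omega[V(1)]$, which illustrates why the boundary contribution drops out only for variations in $V_p$.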
Let us mention that any solution $\Psi$ of the equation
$$
D_\gamma\Psi(V) = 0 \quad\text{for all } V\in V_p
$$
is called a relative homotopy invariant of the loop $\gamma$, which has its own interesting properties, for instance in Chern–Simons theories or in String Theory. Returning to our motivation for introducing the Fréchet derivative, we now see that if we consider smooth diffeomorphisms the number of cusps is preserved and we still have an area variation. Of special interest for us is the subgroup of diffeomorphisms that also preserve angles, i.e., the locally conformal diffeomorphisms. Despite this striking difference between the area derivative and the Fréchet derivative, they are still closely related to each other. To make this relation apparent we define an element of $\widetilde{lM}_p$ by
$$
\Theta(\gamma;V) \equiv \int_0^1\delta_{(\gamma_0^t;V(t)\wedge\dot\gamma(t))}(\gamma(t))\,dt
\tag{4.74}
$$
where $V\in V_p$ and $\gamma_0^t$ stands for the part of $\gamma$ from $\gamma(0)$ to $\gamma(t)$. Applying (4.74) to the functionals $X^u$, $u\in Sh(\Omega)$, results in
$$
\Theta(\gamma;V)(X^u) \equiv \int_0^1\delta_{(\gamma_0^t;V(t)\wedge\dot\gamma(t))}(\gamma(t))(X^u)\,dt
\tag{4.75}
$$
if, of course, this is well-defined. Tavares showed that
$$
D_V X^u(\gamma) = \int_0^1\Delta_{(\gamma_0^t;V\wedge\dot\gamma)}(\gamma(t))X^u\,dt
\tag{4.76}
$$
$$
= \bigl(\gamma\circ(\Theta(\gamma;V)\otimes1)\circ\Delta\bigr)X^u.
\tag{4.77}
$$
From this we conclude that the Fr´echet derivative associated with the variational vector field V can be considered as an integral along the path of area derivatives. If one considers area variations induced by the area derivatives as little squares along the path, and integrate over them we get a smooth area variation. The fact that this is possible is due to the fact that the overlapping sides of the little squares are traversed in opposite direction such that they disappear due to the path reduction property and inverses. This cancelling effectively eliminates the cusps introduced by every square
4.4 Fr´echet derivative in a generalized loop space
| 125
V γ˙ γ
δ( λ; γ˙ ^V)
Pa th reduction
Fig. 4.8: Variation induced by Fr´echet derivative.
such that in the end we have not introduced any new cusp and the result is a smoothly varied contour. Figure 4.8 represents this idea graphically. Naturally we are interested in the application of this result not only to the functionals $X^u$, $u \in Sh(\Omega)$, but also to the holonomy or Wilson loop $U : LM_p \to GL(n, \mathbb{C})$ of a connection $\omega$. The result for the holonomy can be written as:
$$D_V U_\gamma = U_\gamma \cdot \left(\int_0^1 U_{\gamma_0^t}\, \Omega_{\gamma(t)}\big(V(t) \wedge \dot\gamma(t)\big) \cdot U_{\gamma_0^t}^{-1}\, dt\right), \tag{4.78}$$
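also referred to as the ‘non-Abelian Stokes’ theorem’, which for Wilson loop variables becomes (making use of the nuclear or trace class property):
$$\delta_{(\lambda;\, u\wedge v)} \langle 0|\, \mathrm{Tr}\, U_\lambda\, |0\rangle = \langle 0|\, \mathrm{Tr}\left\{U_\lambda\, F_{\mu\nu}(\lambda(1))\,(u^\mu \wedge v^\nu) \cdot U_\lambda^{-1}\right\} |0\rangle. \tag{4.79}$$
Similar computations yield
$$\delta_{(\lambda;\, u_1\wedge u_2)} U_\lambda = U_\lambda \cdot F_{\lambda(1)}(u_1 \wedge u_2) \cdot U_\lambda^{-1}. \tag{4.80}$$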
Now that we have concluded that the Fréchet derivative induces a smooth variation of the (Wilson) loops, it can be used to derive equations of motion in generalized loop space (Figure 4.9 shows such a variation, and its effect on the holonomy and on the spectrum). In the applications of this derivative in the previous chapters of this book we have only considered variations that are angle-preserving. Certainly, we could also consider smooth diffeomorphisms which still preserve the number of angles, but do not preserve the angle sizes. To our knowledge such variations have not yet been investigated, which opens the door to extending contour variations to a bigger class of transformations.
Fig. 4.9: Loop variation and its effects on the holonomy and on the spectrum.
5 Wilson lines in high-energy QCD
In this chapter we focus on the major use of Wilson lines in field theories, more specifically in high-energy QCD.¹ As we shall show in the first section, a Wilson line on a linear path resums soft and collinear gluon radiation to all orders. This property can then be used to construct parton density functions, both collinear and $k_\perp$-dependent, essentially making them gauge-invariant.
5.1 Eikonal approximation
A highly energetic fermion does not deviate much from its path when radiating a soft gluon, that is, we can limit the effect of radiating soft gluons to a phase factor. This is called the eikonal approximation.
5.1.1 Wilson line on a linear path
Before we investigate the eikonal approximation, let us first derive Feynman rules for Wilson lines on a linear path. We start with a path going from $-\infty$ to a point $b^\mu$ along a direction $n^\mu$. In what follows we use the notation
$$U[y\,;x] = U_\gamma[y, x] \tag{5.1}$$
for the Wilson lines evaluated along straight paths $\gamma$, which are determined by vectors $n^\mu$. In this case, the path-dependence becomes trivial and reduces to the dependence on these vectors. Such a path can be parameterized as
$$z^\mu = b^\mu + n^\mu \lambda, \qquad \lambda = -\infty \ldots 0. \tag{5.2}$$
In the coordinate representation the Wilson line evaluated along this path reads
$$U = \mathcal{P}\exp\left[-ig\, n_\mu \int_{-\infty}^{0} d\lambda\; A^\mu(b + n\lambda)\right]. \tag{5.3}$$
Its perturbative expansion is given by
$$U = 1 - ig\, n_\mu \int_{-\infty}^{0} d\lambda\; A^\mu(b + n\lambda) + \ldots \tag{5.4}$$
1 The QCD Lagrangian, Feynman rules etc. are given in Appendix B.
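As a worked illustration of how the path-ordering enters beyond first order, the next term of the expansion can be written out explicitly. This is a sketch that follows directly from the path-ordered exponential (5.3), with the convention that the field with the highest value of $\lambda$ stands leftmost; the shorthand $z_i = b + n\lambda_i$ is introduced here only for readability.

```latex
\begin{align*}
U &= 1 - ig \int_{-\infty}^{0} d\lambda_1\; n\cdot A(z_1)
   + (-ig)^2 \int_{-\infty}^{0} d\lambda_2 \int_{-\infty}^{\lambda_2} d\lambda_1\;
     n\cdot A(z_2)\; n\cdot A(z_1) + \ldots ,
\qquad z_i = b + n\lambda_i .
\end{align*}
```

The nested integration limits enforce $\lambda_1 \le \lambda_2$, so no factor $1/2!$ appears: the ordered region is exactly half of the square $(-\infty, 0]^2$.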
Fourier transform of the gauge fields allows us to rewrite this series in the momentum representation. Then we can expand the Wilson line as:²
$$U = \sum_{n=0}^{\infty} \left(\frac{-ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, b\cdot\sum_{j=1}^{n} k_j}\; I_n, \tag{5.5}$$
$$I_n(k_1, \ldots, k_n) = \int_{-\infty}^{0} \int_{-\infty}^{\lambda_n} \cdots \int_{-\infty}^{\lambda_2} d\lambda_n \cdots d\lambda_1\; e^{-i\, n\cdot\sum_{j=1}^{n} k_j \lambda_j}. \tag{5.6}$$
Remember that the path-ordering is defined such that the field with the highest value for $\lambda$ is written leftmost. This is the field which will be drawn at the rightmost side of the diagram (assuming the path to be drawn from left to right), implying that we read Wilson lines in a Feynman diagram as we do with Dirac lines: from right to left when writing the corresponding formula. The path-ordering is manifest in the integration limits of $I_n$, as they make sure that $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$. Solving this integral is straightforward. First we calculate the innermost integral:
$$\int_{-\infty}^{\lambda_2} d\lambda_1\; e^{-i\, n\cdot k_1 \lambda_1} = \frac{i}{n\cdot k_1 + i\epsilon}\; e^{-i\, n\cdot k_1 \lambda_2}. \tag{5.7}$$
The effect of the innermost integral is a factor $\frac{1}{n\cdot k_1}$ and an extra term $n\cdot k_1$ in front of $\lambda_2$. Then the next integral will give a factor $\frac{1}{n\cdot k_1 + n\cdot k_2}$ and so on. In other words, we simply get:
$$I_n(k_1, \ldots, k_n) = \prod_{j=1}^{n} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon}, \tag{5.8}$$
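giving
$$U[b\,; -\infty] = \sum_{n=0}^{\infty} (-ig)^n \int \prod_{i=1}^{n} \frac{d^4k_i}{16\pi^4}\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, b\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon}.$$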
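The product formula (5.8) can be checked numerically at low orders. The following is a minimal sketch rather than part of the original derivation: it assumes that every $\lambda_j$ integration is damped by its own factor $e^{\epsilon\lambda_j}$ with a finite $\epsilon$, so that the $j$-th denominator carries $j\epsilon$ instead of the schematic $\epsilon$ of the text (the two agree for $\epsilon \to 0$). The function names and the cutoff replacing $-\infty$ are illustrative choices.

```python
import cmath


def ordered_product(nk, eps):
    """Closed form of I_n for a line coming from -infinity, cf. (5.8).

    nk  : projections n.k_1, ..., n.k_n (real numbers)
    eps : finite damping; the j-th denominator carries j*eps because
          every lambda_j is damped separately (assumption of this sketch).
    """
    result = 1.0 + 0.0j
    partial = 0.0
    for j, a in enumerate(nk, start=1):
        partial += a
        result *= 1j / (partial + 1j * j * eps)
    return result


def trapezoid(f, lo, hi, steps):
    """Composite trapezoidal rule for a complex-valued integrand."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for s in range(1, steps):
        total += f(lo + s * h)
    return total * h


def numeric_i1(a, eps, cutoff=80.0, steps=100_000):
    """First-order integral, with -infinity replaced by -cutoff."""
    return trapezoid(lambda lam: cmath.exp((-1j * a + eps) * lam),
                     -cutoff, 0.0, steps)


def numeric_i2(a1, a2, eps, cutoff=80.0, steps=100_000):
    """Second-order integral: inner integral as in (5.7), outer one numerically."""
    def integrand(lam2):
        inner = 1j / (a1 + 1j * eps) * cmath.exp((-1j * a1 + eps) * lam2)
        return cmath.exp((-1j * a2 + eps) * lam2) * inner
    return trapezoid(integrand, -cutoff, 0.0, steps)


if __name__ == "__main__":
    # Differences between numerical integration and the product formula.
    print(abs(numeric_i1(1.3, 0.5) - ordered_product([1.3], 0.5)))
    print(abs(numeric_i2(1.3, -0.4, 0.5) - ordered_product([1.3, -0.4], 0.5)))
```

Both printed differences should be small compared to the values themselves, limited only by the step size of the trapezoidal rule and the finite cutoff.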
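This now gives rise to the following Feynman rules:
(1) Wilson line propagator: [line carrying momentum $k$] $= \dfrac{i}{n\cdot k + i\epsilon}$
(2) External point: [momentum $k$ flowing into the point $b^\mu$] $= e^{-i\, b\cdot k}$
(3) Line from infinity: [line attached to $-\infty$] $= 1$ (no momentum flow)
(4) Wilson vertex: [gluon with momentum $k$, Lorentz index $\mu$ and colour index $a$, attached to the line between colour indices $i$ and $j$] $= -ig\, n^\mu (t^a)_{ij}$
2 The symbol $n$ is used both as an index (in the $n$-th order expansion) and as a directional vector. The difference should be clear from context.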
We need to realize that when drawing Wilson lines with these rules, the momenta of the gluons should always be pointing inwards (towards the Wilson line) and be collected at the external point, to ensure the correct momentum summations in the Wilson line propagators. For each momentum $k_i$ that happens to point outwards, make the substitution $k_i \to -k_i$ in the Feynman rules. The resulting $n$-th order diagram is drawn in Figure 5.1.
Fig. 5.1: n-gluon radiation for a Wilson line going from $-\infty$ to $b^\mu$. The successive Wilson line propagators carry the momenta $k_1$, $k_1 + k_2$, $\ldots$, $\sum_{j=1}^{n-1} k_j$, $\sum_{j=1}^{n} k_j$.
The logical next step is to investigate a path that starts at a point $a^\mu$ and now goes up to $+\infty$, which we parameterize as
$$z^\mu = a^\mu + n^\mu \lambda, \qquad \lambda = 0 \ldots +\infty. \tag{5.9}$$
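In this case, it is easier to reverse the integration variables and limits as follows:
$$I_n = \int_{0}^{+\infty} \int_{\lambda_1}^{+\infty} \cdots \int_{\lambda_{n-1}}^{+\infty} d\lambda_1 \cdots d\lambda_n\; e^{-i\, n\cdot\sum_{j=1}^{n} k_j \lambda_j}. \tag{5.10}$$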
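This keeps the same path-ordering (in other words $\lambda_1 \le \ldots \le \lambda_n$ remains valid). The calculation goes as before, giving:
$$I_n(k_1, \ldots, k_n) = \prod_{j=1}^{n} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{n-l+1} - i\epsilon},$$
$$U[+\infty\,; a] = \sum_{n=0}^{\infty} (-ig)^n \int \prod_{i=1}^{n} \frac{d^4k_i}{16\pi^4}\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, a\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{n-l+1} - i\epsilon}.$$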
The Feynman rules derived before remain valid if we make the substitution $k \to -k$ in the Wilson line propagators (but not in the external point). Then we can draw the $n$-th order diagram for a Wilson line going from $a^\mu$ to $+\infty$, as demonstrated in Figure 5.2. The path still flows from left to right, but now the momentum is opposite to the path flow.
Fig. 5.2: n-gluon radiation for a Wilson line going from $a^\mu$ to $+\infty$.
Let us now investigate what changes when we reverse the path of a Wilson line from $a^\mu$ to $b^\mu$. First, the integration limits are of course interchanged, because the path flows from $b^\mu$ to $a^\mu$. This is the same as keeping the integration limits as they are, and flipping the sign in the exponent. But the most important thing is that the order of the fields is reversed, because the field first on the path will be encountered last when following the reversed path flow. This is the idea of antipath-ordering $\bar{\mathcal{P}}$, defined such that the field with the highest value for $\lambda$ is written rightmost. The reversed Wilson line is thus given by
$$U[a\,; b] = \bar{\mathcal{P}}\, e^{\,ig \int_a^b dz\cdot A}. \tag{5.11}$$
But this is exactly the same as the Hermitian conjugate. The latter also reverses the order of the fields, as $(A_n \cdots A_1)^\dagger = A_1^\dagger \cdots A_n^\dagger$. By using the fact that $A(k)^\dagger = A(-k)$ ($A$ is a Hermitian function³), and making the substitution $k \to -k$, the relation to the reversed path becomes apparent. We thus have (see e.g. [9])
$$U[a\,; b] = U[b\,; a]^\dagger. \tag{5.12}$$
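But of course it would be desirable to express the Hermitian conjugate line in terms of normal path-ordered fields, such that we can use the same Feynman rules as before. Let us see for instance how a Wilson line from $-\infty$ to $b^\mu$ behaves when Hermitian conjugated:
$$\begin{aligned}
U[b\,; -\infty]^\dagger &= \left[\sum_{n=0}^{\infty} \left(\frac{-ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, b\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon}\right]^\dagger \\
&= \sum_{n=0}^{\infty} \left(\frac{ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(-k_1) \cdots n\cdot A(-k_n)\; e^{\,i\, b\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{-i}{n\cdot\sum_{l=1}^{j} k_l - i\epsilon} \\
&= \sum_{n=0}^{\infty} \left(\frac{ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(k_1) \cdots n\cdot A(k_n)\; e^{-i\, b\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon},
\end{aligned}$$
where we used the fact that $A(k)^\dagger = A(-k)$ and, in the last step, substituted $k_j \to -k_j$. We can relabel the fields by doing $1 \to n$, $2 \to n-1$, $\ldots$, $n \to 1$, which gives
$$U[b\,; -\infty]^\dagger = \sum_{n=0}^{\infty} \left(\frac{ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, b\cdot\sum_{j=1}^{n} k_j} \prod_{j=1}^{n} \frac{i}{n\cdot\sum_{l=1}^{j} k_{n-l+1} + i\epsilon}.$$
3 Because $A(x)$ is real.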
This is the expansion of a Wilson line from $b^\mu$ to $+\infty$, but with reversed path direction, i.e.,
$$U[b\,; -\infty]^\dagger = U[+\infty\,; b]\Big|_{\hat n \to -\hat n}. \tag{5.13}$$
Watch the change in the sign of $\infty$. This relation is illustrated schematically in Figure 5.3.
Fig. 5.3: Taking the Hermitian conjugate of a Wilson line literally mirrors it: the sign of ∞ is flipped and the path direction reversed.
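First, for the Wilson line propagator we see that it gets complex conjugated when the momentum flow is opposed to the path direction:
[path to the right, momentum $k$ to the right] $= \dfrac{i}{n\cdot k + i\epsilon}$, [path to the right, momentum $k$ to the left] $= \dfrac{-i}{n\cdot k - i\epsilon}$, (5.14a)
[path to the left, momentum $k$ to the right] $= \dfrac{-i}{n\cdot k - i\epsilon}$, [path to the left, momentum $k$ to the left] $= \dfrac{i}{n\cdot k + i\epsilon}$. (5.14b)
Note that $n^\mu$ is always defined in the positive direction. The vertex coefficient changes as well:
[vertex, path along $n^\mu$] $= -ig\, n^\mu (t^a)_{ij}$, [vertex, path against $n^\mu$] $= ig\, n^\mu (t^a)_{ij}$. (5.15)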
On the other hand, the sign in the exponent for an external point does not depend on the direction of the path, but only on the direction of the momentum flow:
[momentum $k$ flowing towards the point $r^\mu$, for either path direction] $= e^{-i\, r\cdot k}$, (5.16a)
[momentum $k$ flowing away from the point $r^\mu$, for either path direction] $= e^{\,i\, r\cdot k}$. (5.16b)
Most of the time, we will drop the arrow indicating the path direction on the Wilson line, as it obscures readability. We will assume the path flows from left to right, unless specified otherwise. Another possible configuration is an infinite Wilson line, going from $-\infty$ to $+\infty$ along a direction $n^\mu$, while passing through a point $r^\mu$. This we parameterize as
$$z^\mu = r^\mu + n^\mu \lambda, \qquad \lambda = -\infty \ldots +\infty. \tag{5.17}$$
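In this case we can calculate the $n-1$ innermost integrals as before, while the outermost integral gives a $\delta$-function:
$$\begin{aligned}
I_n &= \int_{-\infty}^{+\infty} d\lambda_n\; e^{-i\, n\cdot k_n \lambda_n} \int_{-\infty}^{\lambda_n} \cdots \int_{-\infty}^{\lambda_2} d\lambda_{n-1} \cdots d\lambda_1\; e^{-i\, n\cdot\sum_{j=1}^{n-1} k_j \lambda_j} \\
&= \left(\prod_{j=1}^{n-1} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon}\right) \int_{-\infty}^{+\infty} d\lambda_n\; e^{-i \left(n\cdot\sum_{j=1}^{n} k_j\right) \lambda_n} \\
&= \left(\prod_{j=1}^{n-1} \frac{i}{n\cdot\sum_{l=1}^{j} k_l + i\epsilon}\right) 2\pi\, \delta\!\left(n\cdot\sum_{j=1}^{n} k_j + i\epsilon\right).
\end{aligned} \tag{5.18}$$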
There are some technical difficulties with the validity of the integral representation for the $\delta$-function (as written here it is divergent because of the convergence terms $i\epsilon$). But after a suitable regularization of the path, it can be shown that a $\delta$-function with a complex argument is well-defined if used with the sifting property: $\int dk\; \delta(k \pm i\epsilon)\, f(k \pm i\epsilon) = f(0)$, but the integral representation remains divergent. This implies that writing $\delta(k \pm i\epsilon) = \int dx\; e^{\,i x (k \pm i\epsilon)}$ should be avoided.
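Returning to the infinite Wilson line, we can reverse the integration variables and limits, as we did before in equation (5.10), to get an equivalent definition:
$$\begin{aligned}
I_n &= \int_{-\infty}^{+\infty} d\lambda_1\; e^{-i\, n\cdot k_1 \lambda_1} \int_{\lambda_1}^{+\infty} \cdots \int_{\lambda_{n-1}}^{+\infty} d\lambda_2 \cdots d\lambda_n\; e^{-i\, n\cdot\sum_{j=2}^{n} k_j \lambda_j} \\
&= \left(\prod_{j=1}^{n-1} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{n-l+1} - i\epsilon}\right) 2\pi\, \delta\!\left(n\cdot\sum_{j=1}^{n} k_j - i\epsilon\right).
\end{aligned} \tag{5.19}$$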
We add the following Feynman rules:
(5) An infinite Wilson line is parameterized as passing through a point $r^\mu$. This point is connected to $\pm\infty$ on one side, and to a cut line on the other side. All gluons are radiated from the part between this cut line and $\mp\infty$.
(6) Wilson cut propagator: [cut line carrying momentum $k$ in the direction of the path] $= 2\pi\, \delta(n\cdot k + i\epsilon)$.
For the Wilson cut propagator we have the same rule as with the normal Wilson line propagator, viz. that $k \to -k$ when the momentum direction is opposite to the path direction. We thus have two options for drawing the $n$-th order diagram of an infinite Wilson line, as shown in Figure 5.4.
Fig. 5.4: Two possible diagrams for n-gluon radiation from a Wilson line going from $-\infty$ to $+\infty$. The upper diagram corresponds to equation (5.19) and the lower one to (5.18).
The last possible configuration is a finite Wilson line, going from a point $a^\mu$ to a point $b^\mu$ (where now the direction is defined by $n^\mu = \frac{b^\mu - a^\mu}{\|b - a\|}$). We parameterize this as:
$$z^\mu = a^\mu + n^\mu \lambda, \qquad \lambda = 0 \ldots \|b - a\|, \tag{5.20}$$
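and expand the Wilson line as
$$U = \sum_{n=0}^{\infty} \left(\frac{-ig}{16\pi^4}\right)^{n} \int d^4k_1 \cdots d^4k_n\; n\cdot A(k_n) \cdots n\cdot A(k_1)\; e^{-i\, a\cdot\sum_{j=1}^{n} k_j}\; I_n, \tag{5.21}$$
$$I_n(k_1, \ldots, k_n) = \int_{0}^{\|b-a\|} \int_{0}^{\lambda_n} \cdots \int_{0}^{\lambda_2} d\lambda_n \cdots d\lambda_1\; e^{-i\, n\cdot\sum_{j=1}^{n} k_j \lambda_j}. \tag{5.22}$$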
The innermost integral can be easily calculated:
$$\int_{0}^{\lambda_2} d\lambda_1\; e^{-i\, n\cdot k_1 \lambda_1} = \frac{i}{n\cdot k_1} \left(e^{-i\, n\cdot k_1 \lambda_2} - 1\right). \tag{5.23}$$
For higher orders we find a recursion relation:
$$I_n(k_1, \ldots, k_n) = \frac{i}{n\cdot k_1} \Big(I_{n-1}(k_1 + k_2, \ldots, k_n) - I_{n-1}(k_2, \ldots, k_n)\Big), \tag{5.24}$$
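which we can solve exactly by careful inspection:⁴
$$I_n = \sum_{m=0}^{n-1} \left(e^{-i\, (b-a)\cdot\sum_{j=m+1}^{n} k_j} - 1\right) \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l}. \tag{5.25}$$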
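We will rewrite this result in a more symmetrical way. First note that
$$\sum_{m=0}^{n}\; \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l} = 0,$$
meaning we can rewrite the integral in the following way:
$$\begin{aligned}
I_n &= \sum_{m=0}^{n-1} e^{-i\, (b-a)\cdot\sum_{j=m+1}^{n} k_j} \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l} - \sum_{m=0}^{n-1} \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l} \\
&= \sum_{m=0}^{n-1} e^{-i\, (b-a)\cdot\sum_{j=m+1}^{n} k_j} \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l} + \prod_{j=1}^{n} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{n-l+1}} \\
&= \sum_{m=0}^{n} e^{-i\, (b-a)\cdot\sum_{j=m+1}^{n} k_j} \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l}.
\end{aligned} \tag{5.26}$$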
4 Remember that by definition $\sum_{j=a}^{b} f(j) = 0$ if $a > b$; this is an ‘empty sum’. The analogous convention holds for multiplication: $\prod_{j=a}^{b} f(j) = 1$ if $a > b$.
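The step from the recursion (5.24) to the symmetric closed form (5.26) is easy to get wrong by an index, so a numerical cross-check may be helpful. The sketch below is an illustration, not the authors' code: the list `a` holds the projections $n\cdot k_j$, `length` stands for $\|b-a\|$ (so that $(b-a)\cdot k_j$ equals `length * a[j]`), and the function names are made up for this example. The projections must be chosen such that no partial sum in a denominator vanishes.

```python
import cmath


def i_recursive(a, length):
    """Ordered integral I_n from the recursion (5.24); a[j] stands for n.k_{j+1}."""
    if len(a) == 1:
        # Base case: the single integral (5.23) evaluated at the upper limit.
        return 1j / a[0] * (cmath.exp(-1j * a[0] * length) - 1)
    merged = [a[0] + a[1]] + list(a[2:])
    return 1j / a[0] * (i_recursive(merged, length)
                        - i_recursive(list(a[1:]), length))


def split_products(a, m):
    """The two products of (5.26) for a given split point m (0 <= m <= n)."""
    n = len(a)
    left = 1.0 + 0.0j
    partial = 0.0
    for j in range(1, m + 1):        # sums k_m, k_m + k_{m-1}, ...
        partial += a[m - j]
        left *= -1j / partial
    right = 1.0 + 0.0j
    partial = 0.0
    for j in range(m + 1, n + 1):    # sums k_{m+1}, k_{m+1} + k_{m+2}, ...
        partial += a[j - 1]
        right *= 1j / partial
    return left * right              # empty products count as 1


def i_closed(a, length):
    """Closed form (5.26): sum over all split points m = 0..n."""
    total = 0.0 + 0.0j
    for m in range(len(a) + 1):
        phase = cmath.exp(-1j * length * sum(a[m:]))
        total += phase * split_products(a, m)
    return total


if __name__ == "__main__":
    a, length = [0.7, -1.9, 2.4], 1.3
    print(abs(i_closed(a, length) - i_recursive(a, length)))
    # The phase-free sum used in the text to pass from (5.25) to (5.26):
    print(abs(sum(split_products(a, m) for m in range(len(a) + 1))))
```

Both printed numbers should vanish up to rounding. The second one is the identity quoted just before (5.26); it is simply the closed form evaluated at zero path length.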
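Combining this with the exponent in equation (5.21) we finally get
$$e^{-i\, a\cdot\sum_{j=1}^{n} k_j}\, I_n = \sum_{m=0}^{n} e^{-i\, a\cdot\sum_{j=1}^{m} k_j}\; e^{-i\, b\cdot\sum_{j=m+1}^{n} k_j} \prod_{j=1}^{m} \frac{-i}{n\cdot\sum_{l=1}^{j} k_{m-l+1}} \prod_{j=m+1}^{n} \frac{i}{n\cdot\sum_{l=m+1}^{j} k_l}. \tag{5.27}$$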
Using the fact that the product of two infinite sums can in general be written as a chained sum:
$$\left(\sum_{i=0}^{\infty} A_i\right) \left(\sum_{j=0}^{\infty} B_j\right) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} A_m B_{n-m},$$
we can transform equation (5.27) into a product of two Wilson lines:
$$U[b\,; a] = U[+\infty\,; b]^\dagger\; U[+\infty\,; a]. \tag{5.28}$$
There is thus only one Feynman rule to add:
(7) A finite Wilson line from $a^\mu$ to $b^\mu$ can be calculated by cutting the path in two at $+\infty$ or $-\infty$, where the second part is a Hermitian conjugate line.
This is illustrated schematically in the diagram in Figure 5.5.
Fig. 5.5: The diagram for n-gluon radiation from a finite Wilson line can be understood as a sum of products of two half-infinite Wilson lines.
Let us recapitulate all the Feynman rules we have derived for a general linear Wilson line:
Feynman rules for linear Wilson lines
(1) Wilson line propagator: [line carrying momentum $k$] $= \dfrac{i}{n\cdot k + i\epsilon}$ (momentum in the direction of the path).
(2) External point: [momentum $k$ at the point $r^\mu$] $= e^{-i\, r\cdot k}$ (momentum flowing towards the external point).
(3) Infinite point: [point at $\pm\infty$] $= 1$ ($k = 0$).
(4) Wilson vertex: [gluon with momentum $k$, indices $\mu$, $a$, between colour indices $i$ and $j$] $= -ig\, n^\mu (t^a)_{ij}$.
(5) An infinite Wilson line is parameterized as passing through a point $r^\mu$. This point is connected to $\pm\infty$ on one side, and to a cut line on the other side. All gluons are radiated from the part between this cut line and $\mp\infty$.
(6) Wilson cut propagator: [cut line carrying momentum $k$] $= 2\pi\, \delta(n\cdot k + i\epsilon)$.
(7) A finite Wilson line from $a^\mu$ to $b^\mu$ can be calculated by cutting the path in two at $+\infty$ or $-\infty$, where the second part is a Hermitian conjugate line.
It is important to realize that different conventions exist in the literature. We follow the same convention as Peskin and Schroeder [7], where the inverse Fourier transform has a minus sign in the exponent. If one uses the convention with a plus sign in the exponent, as in Collins’s book [9], one has to draw a Wilson line diagram with gluon momenta pointing outwards instead of inwards (essentially making the flip $k \to -k$). Also, the sign in the Wilson line exponential is defined by the sign choice of the coupling constant $g$. As is clear from equation (5.5), we use a negative sign. If one were to use a positive sign, one would get the complex conjugate of rule (4). Of course, the choice of convention only matters for intermediate calculations; the final result concerning observables is invariant.
5.1.2 Wilson line as an eikonal line
Now we return to the original goal of this section: the investigation of the eikonal approximation. In the eikonal approximation we assume a quark with momentum large enough to neglect the change in momentum due to the emission or absorption of a soft gluon. We take an incoming quark with momentum $p$ that absorbs a soft gluon with momentum $q$. This is illustrated in Figure 5.6 (where the blob $F$ represents all possible diagrams connected to the quark propagator). This diagram is equal to
$$F\; \frac{i\,(\not{p} + \not{q})}{(p + q)^2 + i\epsilon}\; (-ig\, t^a \gamma^\mu)\; u(p). \tag{5.29}$$
Fig. 5.6: A quark radiating a soft gluon.
Making the soft approximation is the same as neglecting $\not{q}$ with respect to $\not{p}$, and $q^2$ with respect to $p\cdot q$, giving
$$F\; \frac{i\, p^\nu \gamma_\nu \gamma^\mu}{2\, p\cdot q + i\epsilon}\; (-ig\, t^a)\; u(p).$$
Because of the Dirac equation (B.16), $\not{p}\, u(p) = 0$, we can add a term $i\, p^\nu \gamma^\mu \gamma_\nu$ to the numerator of the fraction:
$$F\; \frac{i\, p^\nu \{\gamma_\nu, \gamma^\mu\}}{2\, p\cdot q + i\epsilon}\; (-ig\, t^a)\; u(p). \tag{5.30}$$
Last we use the anticommutation rule (B.14) and write the momentum as $p^\mu = |p|\, n^\mu$, with $n^\mu$ a normalized directional vector, in order to get
$$F\; \frac{i\, n^\mu}{n\cdot q + i\epsilon}\; (-ig\, t^a)\; u(p). \tag{5.31}$$
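For readers who want to see the intermediate algebra in one place, the chain from (5.29) to (5.31) can be written out as follows. This is a sketch under the assumptions already made in the text: a massless on-shell quark ($p^2 = 0$, $\not{p}\,u(p) = 0$), a soft gluon, and the standard Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2 g^{\mu\nu}$; the positive factor $|p|$ is absorbed into $\epsilon$ in the last step.

```latex
\begin{align*}
(p+q)^2 &= p^2 + 2\,p\cdot q + q^2 \;\approx\; 2\,p\cdot q , \\
(\not{p} + \not{q})\,\gamma^\mu\, u(p) &\approx \not{p}\,\gamma^\mu\, u(p)
  = \big(\{\not{p}, \gamma^\mu\} - \gamma^\mu \not{p}\big)\, u(p)
  = 2\,p^\mu\, u(p) , \\
\frac{i\,2\,p^\mu}{2\,p\cdot q + i\epsilon}
  &= \frac{i\,|p|\,n^\mu}{|p|\,n\cdot q + i\epsilon}
  = \frac{i\,n^\mu}{n\cdot q + i\epsilon} .
\end{align*}
```

The second line is where the on-shell condition enters: the term $\gamma^\mu \not{p}\, u(p)$ vanishes only because the quark is external.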
What we see is that the Dirac propagator has been replaced by a Wilson line propagator, and the Dirac-gluon coupling by a Wilson-gluon coupling. By using the eikonal approximation, we literally factorized out the gluon contribution from the Dirac part. Of course this remains valid when radiating more than one gluon. In the latter case, the resulting formula is straightforward:
$$(-ig)^n\; F\; t^{a_n} \cdots t^{a_1}\; u(p)\; \frac{i\, n^{\mu_n}}{n\cdot\sum_{i=1}^{n} q_i + i\epsilon} \cdots \frac{i\, n^{\mu_2}}{n\cdot(q_1 + q_2) + i\epsilon}\; \frac{i\, n^{\mu_1}}{n\cdot q_1 + i\epsilon}.$$
This is exactly the result for an incoming bare quark connected to the blob, multiplied with a Wilson line going from $-\infty$ to $0$:
$$F\; U(0\,; -\infty)\; u(p). \tag{5.32}$$
This is illustrated in the diagram in Figure 5.7. Note that the symbol $\tilde{\otimes}$ does not denote a convolution, but is used to remind ourselves that the relation does not give a bare multiplication either, because the $t^{a_i}$ are placed between the $u(p)$ of the external quark and the blob. Writing out the Dirac and Lie indices makes this clear:
$$(F)^{j}_{\beta}\; \delta_{\beta\alpha}\; (t^{a_n} \cdots t^{a_1})_{ji}\; (u(p))^{i}_{\alpha}\; \frac{(-ig)\, i\, n^{\mu_n}}{n\cdot\sum_{i=1}^{n} q_i + i\epsilon} \cdots \frac{(-ig)\, i\, n^{\mu_1}}{n\cdot q_1 + i\epsilon}.$$
Fig. 5.7: A quark radiating n soft gluons can be represented as a bare quark multiplied with a Wilson line going from $-\infty$ to $0$.
From this result, we introduce the concept of an eikonal quark. This is a quark that is only interacting softly with the gauge field, and thus does not deviate from its straight path:
An eikonal quark can be understood as a bare quark multiplied with a Wilson line to all orders:
$$|\psi^{i}_{\text{eik.}}\rangle = U^{ij}(0\,; -\infty)\; |\psi^{j}\rangle. \tag{5.33}$$
In other words, the net effect of multiple soft gluon interactions on an eikonal quark is just a color rotation (nothing but a phase). It is common to denote an eikonal quark with a double line, but this gives rise to ambiguities: the double line was already used to denote a Wilson line propagator. These are, although related, not the same. The eikonal line represents a quark (carrying spinor indices) resummed with soft gluon radiation to all orders, while the Wilson line propagator represents gluon radiation at a specified order (not necessarily soft), still to be multiplied with the quark (carrying no spinor indices itself). In short, Wilson line propagators are used in the calculation of an eikonal line. To appreciate the difference, have a look at equation (5.32): the eikonal quark is the combination $U(0\,; -\infty)\, u(p)$, while the Wilson line propagators are components of $U(0\,; -\infty)$. To avoid confusion, we will draw an arrowhead (representing the quark’s momentum flow) only on an eikonal line:
[double line with arrowhead, momentum $p$]: an eikonal line, i.e., $|\psi^{i}_{\text{eik.}}(p)\rangle_\alpha$ (5.34a)
[double line without arrowhead, momentum $q$]: a Wilson line propagator, i.e., $\dfrac{i}{n\cdot q + i\epsilon}$. (5.34b)
But keep in mind that these are commonly interchanged in literature. Using our notation for the eikonal line, we can write down the eikonal approximation diagrammatically as in Figure 5.8. A final remark: in the derivation of the eikonal approximation, more specifically equation (5.30), we used the fact that the quark in question is external, by adding a term 𝛾μ p/ u(p) = 0. This is a crucial step, without which we would not have been able to resum all gluons into a Wilson line, i.e., Wilson lines as a resummation of gluon radiation can only appear next to quarks that are on-shell.
Fig. 5.8: In the soft limit, a bare quark can be represented as an eikonal quark.
It is possible to resum gluon radiation into a Wilson line even if it is not soft. For example, in the collinear approximation we allow for large radiated momenta q which are collinear to p, i.e., if pμ = |p| nμ then qμ = |q| nμ in the same direction. The Dirac equation tells us that p/ u(p) = 0 and thus n/ u(p) = 0, which implies we can add a term 𝛾μ q/ u(p) to equation (5.29). If we keep the quasi on-shell constraint, q2 ≈ 0 as compared to p ⋅ q, this again leads to a Wilson line, but this time with possibly big q momentum components (as long as they are collinear to p).
5.2 Deep inelastic scattering
Now that we have identified the physical interpretation of a Wilson line, being a resummation of soft gluons, we continue by exploring how this translates into real-world examples. By far the most used application of Wilson lines is inside the definition of a Parton Density Function, or PDF for short, which is a function containing all information on the proton content. We start with the easiest setup, which is Deep Inelastic Scattering, or DIS for short. Here an electron is collided with a proton, but in the final state only the electron is measured while all other final states are integrated out. This means we only get to use one PDF and we can integrate out the transversal component of the momentum of the struck quark, leaving only longitudinal dependence in the PDF. In Section 5.3 we go one step further by identifying a final state hadron, implying the need of two PDFs concurrently, and the preservation of transversal momentum dependence.
5.2.1 Kinematics
Deep inelastic scattering is the most straightforward process to probe the insides of a hadron. An electron is collided head-on with a proton (or whatever hadron), destroying it. The kinematic diagram is shown in Figure 5.9. We will always neglect electron masses. The center-of-mass energy squared $s$ is then given by
$$s = (P + l)^2 = m_p^2 + 2\, P\cdot l, \tag{5.35}$$
and $q$ is the momentum transferred by the photon:
$$q^\mu \overset{N}{=} l^\mu - l'^\mu. \tag{5.36}$$
Fig. 5.9: Kinematics of deep inelastic electron-proton scattering (incoming electron $l$, outgoing electron $l'$, exchanged photon $q$, proton $P$, struck parton $k$, final state $X$).
Because $q^2 = 2 E_e E_{e'} (\cos\theta_{ee'} - 1) \le 0$, we define $Q^2 \overset{N}{=} -q^2 \ge 0$. The invariant mass of the final state $X$ is then given by
$$m_X^2 = (P + q)^2 = m_p^2 + 2\, P\cdot q - Q^2. \tag{5.37}$$
In order for the photon to probe the contents of the proton, it should have a wavelength much smaller than the proton size, which requires $Q^2 \gg m_p^2$. We are thus dealing with deep ($Q^2 \gg m_p^2$) and inelastic ($m_X^2 \gg m_p^2$) scattering. The two Lorentz invariants of interest in the process are $Q^2$ and $P\cdot q$, but it is convenient to use the variables $Q$ and $x_B$ instead, where
$$x_B \overset{N}{=} \frac{Q^2}{2\, P\cdot q} \tag{5.38}$$
is called the Bjorken-x. Unless necessary to avoid confusion, we will always drop the index ‘B’; just remember that $x$ always denotes the Bjorken-x (and thus not a general fraction, see further). Its kinematics restrain $x$ to lie between $\frac{Q^2}{s + Q^2}$ (neglecting terms of $O(\frac{m_p}{Q})$) and 1 (the elastic limit). Another useful variable is
$$y \overset{N}{=} \frac{P\cdot q}{P\cdot l} \tag{5.39a}$$
$$\phantom{y} = \frac{Q^2}{x\, (s - m_p^2)}. \tag{5.39b}$$
In the rest frame of the proton this equals $y = \frac{E - E'}{E}$, the fractional energy loss of the lepton. It is not an independent variable because
$$Q^2 = x\, y\, (s - m_p^2). \tag{5.40}$$
Let us finish this subsection on kinematics with two trivial relations:
$$2x\, P\cdot l = \frac{Q^2}{y}, \tag{5.41a}$$
$$l\cdot q = -l'\cdot q = -\frac{Q^2}{2}. \tag{5.41b}$$
The latter can be demonstrated by calculating $(l - q)^2 = l'^2 = 0$.
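The relations of this subsection can be verified with explicit four-vectors. The sketch below is illustrative only: it works in the proton rest frame with a massless electron, uses the metric signature $(+,-,-,-)$, and the beam energy, scattered energy and angle in the example are arbitrary numbers, not values taken from the text. All function and key names are made up for this example.

```python
import math


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def dis_invariants(beam_energy, scattered_energy, theta, proton_mass=0.938):
    """DIS invariants in the proton rest frame (energies in GeV, angle in rad)."""
    P = (proton_mass, 0.0, 0.0, 0.0)
    l = (beam_energy, 0.0, 0.0, beam_energy)                    # massless, along z
    lp = (scattered_energy,
          scattered_energy * math.sin(theta), 0.0,
          scattered_energy * math.cos(theta))                   # massless, deflected
    q = tuple(u - v for u, v in zip(l, lp))                     # (5.36)
    total = tuple(u + v for u, v in zip(P, l))
    Q2 = -mdot(q, q)
    return {
        "s": mdot(total, total),                                # (5.35)
        "Q2": Q2,
        "x": Q2 / (2.0 * mdot(P, q)),                           # (5.38)
        "y": mdot(P, q) / mdot(P, l),                           # (5.39a)
        "P.l": mdot(P, l),
        "l.q": mdot(l, q),
        "lp.q": mdot(lp, q),
        "mp": proton_mass,
    }


if __name__ == "__main__":
    inv = dis_invariants(beam_energy=27.5, scattered_energy=15.0, theta=0.1)
    # (5.40): Q^2 = x y (s - m_p^2)
    print(inv["Q2"], inv["x"] * inv["y"] * (inv["s"] - inv["mp"] ** 2))
    # (5.41a): 2 x P.l = Q^2 / y
    print(2 * inv["x"] * inv["P.l"], inv["Q2"] / inv["y"])
    # (5.41b): l.q = -l'.q = -Q^2 / 2
    print(inv["l.q"], -inv["lp.q"], -inv["Q2"] / 2)
```

Each printed line should show numbers that agree up to rounding, which confirms that (5.40) and (5.41) are pure kinematic identities once the electron mass is neglected.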
5.2.2 Invitation: the free parton model
A ‘parton’ is a term used to denote any pointlike constituent of the proton, being quarks, antiquarks or gluons. The Parton model (or PM for short) describes the proton as a box containing an undetermined number of such partons. The mutual interactions of these partons have large timescales compared to the interaction with the photon, allowing us to separate the latter from the former. For instance, inside the proton a gluon could fluctuate into a quark-antiquark pair. The photon would enter the proton and kick out one of the quarks, much faster than the pair can recombine. The pair looks ‘frozen’ to the photon: because of the much larger timescale of the parton interactions, all dynamics are hidden from the photon. The PM thus describes DIS without the strong interaction participating, i.e., we set $g_s = 0$, because all strong interactions are hidden in the proton.
It is convenient to call the short-distance process, the interaction with the photon, the hard part, which we will often denote with a hat, e.g., $\hat{s}$ is the hard c.o.m. energy squared. In contrast to this stands the soft part, which, as we will see in the later subsections, contains all interactions at large distances. For now, we can make an intuitive distinction: everything inside the proton is soft, everything outside the proton (the photon and the struck quark) is hard. Later on we will give a more rigorous formulation for this distinction.
Before we really delve into the PM, we try to get a general idea by investigating an extreme case: the Free Parton model (FPM). In this toy model the proton has no dynamic structure, but merely consists of exactly three quarks, totally unaware of each other’s existence. From the point of view of the photon it does not matter how the proton structure looks, be it in the FPM or the standard PM: it just hits a parton like it would hit any electromagnetically charged particle, ignoring all other particles in the proton.
The hard part of the PM is therefore genuine electron-quark scattering, which we can describe similarly to electron-muon scattering.⁵ This is illustrated schematically in Figure 5.10. The differential cross section for (unpolarized) $e^-\mu^+$ scattering can be calculated by basic QED techniques and equals
$$\frac{d\sigma}{dy}\left(e^-\mu^+ \to e^-\mu^+\right) = \frac{4\pi\alpha^2 s}{Q^4} \left(1 - y + \frac{y^2}{2}\right), \tag{5.42}$$
where $\alpha \approx \frac{1}{137}$ is the electromagnetic fine-structure constant. The only difference between the cross section for $e^-\mu^+$ scattering and that for $e^- q^\pm$ scattering is the charge of the quark:
$$\frac{d\hat\sigma}{dy}\left(e^- q^\pm \to e^- q^\pm\right) = e_q^2\; \frac{4\pi\alpha^2 \hat{s}}{Q^4} \left(1 - y + \frac{y^2}{2}\right),$$
but now $\hat{s} = (l + k)^2$, the centre-of-mass energy squared of the electron and the quark. In order to relate the hard cross section to the full cross section, we define the quark momentum as a fraction of the proton momentum:
$$k = \xi P, \qquad 0 < \xi < 1,$$
such that
$$\hat{s} = \xi s, \qquad \hat{y} = y.$$
For the outgoing quark to be on-shell, we have the requirement
$$(k + q)^2 \approx 2\xi\, P\cdot q - Q^2 \equiv 0 \quad\Rightarrow\quad \xi \equiv x.$$
In this case, the momentum fraction equals the Bjorken variable, but this is certainly not a general result. The electron-quark cross section is then given by
$$\frac{d^3\hat\sigma_q}{dx\, dy\, d\xi} = \frac{4\pi\alpha^2 s}{Q^4} \left(1 - y + \frac{y^2}{2}\right) e_q^2\; \xi\; \delta(x - \xi). \tag{5.43}$$
Fig. 5.10: Deep inelastic scattering in the parton model. The virtual photon strikes one of the quarks, while the other two quarks are left unharmed and do not influence the process in any way.
5 Note that we deliberately chose $e^-\mu^+$ scattering over $e^-e^+$ scattering, because the latter also contains a diagram where the electron and the positron annihilate into a virtual photon, which has no correspondence with $e^- q$ scattering.
We have made a distinction between $x$ and $\xi$ from a physical point of view. The Bjorken fraction $x$ is related to and kinematically constrained by the type of experiment (in the case of DIS it is given by (5.38)), while $\xi$ is related to the proton only, by representing the momentum fraction the quark carries in a specific event. Going to the electron-proton cross section is obvious in the FPM. We simply integrate over the quark fraction $\xi$ and make a weighted sum over the three quarks:
$$\frac{d^2\sigma^{\text{FPM}}}{dx\, dy} = \frac{1}{3} \sum_q \int d\xi\; \frac{d^3\hat\sigma_q}{dx\, dy\, d\xi} = \frac{4\pi\alpha^2 s}{Q^4} \left(1 - y + \frac{y^2}{2}\right) x\; \frac{1}{3} \sum_q e_q^2. \tag{5.44}$$
|
143
5.2.3 A more formal approach Let us redo our intuitive derivation from the previous section in a more formal way. We will treat the proton as a ‘black box’ (contrary to the FPM representation where it is a packet of three partons), which we deeply probe with a highly virtual photon. This is depicted in Figure 5.11. What we keep from the parton model is the assumption that the photon interacts with one constituent of the proton only (a quark, an antiquark, or at higher orders possibly a gluon), on a timescale sufficiently small to allow the struck parton to be considered temporarily ‘free’. To motivate this quantitatively, we write the components of the proton momentum P and the parton momentum k in light-cone coordinates (see Appendix B.3): Pμ = (P+ ,
X
m2p 2P+
kμ = (k+ , k− , p⊥ ).
, 0⊥ )
Fig. 5.11: Deep inelastic scattering to all orders: a photon hitting a proton and breaking it.
In the rest frame of the proton, the distribution of its constituents is isotropic, i.e., all components of $k^\mu$ are of the order $\lesssim m_p$. In the limit $P^+ \to \infty$, the so-called infinite-momentum frame, the only remaining component of the proton momentum is its plus-component. The parton naturally follows the proton in the boost. Then the 4-momenta become:
$$P^\mu_{\text{IMF}} = \left(P^+,\; 0^-,\; 0_\perp\right), \qquad k^\mu_{\text{IMF}} \approx \left(k^+,\; 0^-,\; 0_\perp\right).$$
The parton’s transverse component $k_\perp \sim m_p$ can be trivially neglected when compared to $k^+ \to \infty$. The ratio of the plus momenta is boost invariant, so that we can write:
$$\xi = \frac{k^+}{P^+} \quad\Rightarrow\quad k^\mu_{\text{IMF}} = \xi\, P^\mu_{\text{IMF}}.$$
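The boost invariance of the ratio of plus components, and the way the transverse momentum becomes negligible in the infinite-momentum frame, can be illustrated numerically. The sketch below is not taken from the text: it assumes the normalisation $p^+ = (p^0 + p^3)/\sqrt{2}$ for the plus component (the ratio does not depend on this choice), a boost along the $z$-axis parameterized by a rapidity, and an arbitrary example parton momentum inside a proton at rest.

```python
import math


def light_cone_plus(p):
    """Plus component of p = (E, px, py, pz); the 1/sqrt(2) normalisation is an assumption."""
    return (p[0] + p[3]) / math.sqrt(2.0)


def boost_z(p, rapidity):
    """Lorentz boost along the z-axis with the given rapidity."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (ch * p[0] + sh * p[3], p[1], p[2], sh * p[0] + ch * p[3])


def momentum_fraction(k, P):
    """xi = k^+ / P^+."""
    return light_cone_plus(k) / light_cone_plus(P)


def transverse_over_plus(k):
    """|k_perp| / k^+, which measures how collinear the parton is."""
    return math.hypot(k[1], k[2]) / light_cone_plus(k)


if __name__ == "__main__":
    proton = (0.938, 0.0, 0.0, 0.0)        # proton at rest (GeV)
    parton = (0.30, 0.10, -0.05, 0.15)     # arbitrary example, components below m_p
    for rapidity in (0.0, 2.0, 5.0):
        P, k = boost_z(proton, rapidity), boost_z(parton, rapidity)
        print(rapidity, momentum_fraction(k, P), transverse_over_plus(k))
```

The second column stays constant while the third shrinks like $e^{-\eta}$ with the rapidity $\eta$, which is the quantitative content of the statement that the parton becomes collinear to the proton in the infinite-momentum frame.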
As long as we can boost to a frame where $P^+$ is the only remaining large component of the proton momentum, the parton is fully collinear to the parent proton and can thus be considered to be ‘free’. From now on we will always parameterize the proton momentum and the struck quark momentum based on the dominantly large $P^+$:
$$P^\mu = \left(P^+,\; \frac{m_p^2}{2P^+},\; 0_\perp\right), \qquad k^\mu = \left(\xi P^+,\; \frac{k^2 + k_\perp^2}{2\xi P^+},\; k_\perp\right), \tag{5.45}$$
where we can safely assume $k^2, k_\perp^2$ […] $m_p^2$. Also note that the projection on $\hat{q}^\mu$ equals
$$\frac{P\cdot\hat{q}}{\hat{q}\cdot\hat{q}}\; \hat{q}^\mu = -P\cdot\hat{q}\;\, \hat{q}^\mu.$$
The next basis vector is then constructed by subtracting from it its projection on $\hat{q}^\mu$ and $\hat{t}^\mu$. This is the same as contracting it with
$$g_\perp^{\mu\nu} = g^{\mu\nu} + \hat{q}^\mu \hat{q}^\nu - \hat{t}^\mu \hat{t}^\nu, \tag{5.51}$$
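with the following useful properties:
$$\hat{q}_\mu\, g_\perp^{\mu\nu} = g_\perp^{\mu\nu}\, \hat{q}_\nu = 0, \tag{5.52a}$$
$$\hat{t}_\mu\, g_\perp^{\mu\nu} = g_\perp^{\mu\nu}\, \hat{t}_\nu = 0, \tag{5.52b}$$
$$g_\perp^{\mu\nu}\, g_{\perp\,\nu\rho} = \delta^\mu_\rho + \hat{q}^\mu \hat{q}_\rho - \hat{t}^\mu \hat{t}_\rho, \tag{5.52c}$$
$$g_\perp^{\mu\nu}\, g_{\perp\,\mu\nu} = 2. \tag{5.52d}$$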
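Note that this definition of $g_\perp^{\mu\nu}$ is compatible with the definition in (B.40). We can construct a third orthonormal (spacelike) vector from, say, $l^\mu$:
$$\hat{l}^\mu \overset{N}{=} \frac{g_\perp^{\mu\nu}\, l_\nu}{\sqrt{-l_\mu\, g_\perp^{\mu\nu}\, g_{\perp\,\nu\rho}\, l^\rho}} = \frac{1}{\sqrt{1 - y - \frac{\kappa^2 - 1}{4}\, y^2}} \left(\kappa\, \frac{y}{Q}\, l^\mu - \kappa\, \frac{y}{2}\, \hat{q}^\mu - \frac{2 - y}{2}\, \hat{t}^\mu\right),$$
where we used the relations in (5.41). It is again a spacelike orthonormal vector:
$$\hat{l}^\mu\, \hat{l}_\mu = \hat{l}^\mu\, g_{\mu\nu}\, \hat{l}^\nu = \hat{l}^\mu\, g_{\perp\,\mu\nu}\, \hat{l}^\nu = -1. \tag{5.53}$$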
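Now normally we would proceed with the construction of the last orthonormal basis vector, but we do not have any independent vectors left in our process. But we still can define an antisymmetric projection tensor as follows:
$$\varepsilon_\perp^{\mu\nu} = \varepsilon^{\mu\nu\rho\sigma}\, \hat t_\rho\, \hat q_\sigma . \tag{5.54}$$
As before, this definition of $\varepsilon_\perp^{\mu\nu}$ is compatible with the definition in (B.42). It is easy to show that
$$\varepsilon_\perp^{\mu\nu}\, \hat t_\nu = 0, \tag{5.55a}$$
$$\varepsilon_\perp^{\mu\nu}\, \hat q_\nu = 0, \tag{5.55b}$$
$$\varepsilon_\perp^{\mu\nu}\, g_{\perp\,\nu\rho} = \varepsilon_\perp{}^{\mu}{}_{\rho} , \tag{5.55c}$$
$$\varepsilon_\perp^{\mu\nu}\, g_{\perp\,\mu\nu} = \varepsilon_\perp{}^{\mu}{}_{\mu} = 0, \tag{5.55d}$$
by use of the antisymmetry of $\varepsilon^{\mu\nu\rho\sigma}$. Note that $\varepsilon_\perp{}^{\mu}{}_{\nu}$ has the same components as $\varepsilon_\perp^{\mu\rho}$ but with opposite signs. Furthermore, because in general
$$\varepsilon^{\mu\nu\rho\sigma}\, \varepsilon_{\mu\nu\tau\upsilon} = -2\left(\delta^\rho_\tau\, \delta^\sigma_\upsilon - \delta^\rho_\upsilon\, \delta^\sigma_\tau\right), \tag{5.56}$$
we have
$$\varepsilon_\perp^{\mu\nu}\, \varepsilon_{\perp\,\mu\nu} = 2. \tag{5.57}$$
Let us summarize our new basis:

Orthonormal basis vectors:
$$\hat q^\mu = \frac{q^\mu}{Q} \tag{5.58a}$$
$$\hat t^\mu = \frac{1}{\kappa}\frac{1}{Q}\left(2xP^\mu + q^\mu\right) \tag{5.58b}$$
$$\hat l^\mu = \frac{1}{\sqrt{1 - y - \frac{\kappa^2-1}{4}\,y^2}} \left(\kappa\,\frac{y}{Q}\, l^\mu - \kappa\,\frac{y}{2}\,\hat q^\mu - \frac{2-y}{2}\,\hat t^\mu\right). \tag{5.58c}$$

Transversal tensors:
$$g_\perp^{\mu\nu} = g^{\mu\nu} + \hat q^\mu \hat q^\nu - \hat t^\mu \hat t^\nu \tag{5.59a}$$
$$\varepsilon_\perp^{\mu\nu} = \varepsilon^{\mu\nu\rho\sigma}\, \hat t_\rho\, \hat q_\sigma . \tag{5.59b}$$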
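The properties (5.52), (5.55) and (5.57) of the transversal tensors can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the metric signature $(+,-,-,-)$, an explicit frame in which $\hat t^\mu$ points along time and $\hat q^\mu$ along $z$, and the sign convention $\varepsilon^{0123} = +1$ (the checked properties do not depend on that sign). All function names are illustrative.

```python
import itertools
import numpy as np

# Metric with signature (+, -, -, -); all vectors below carry upper indices.
g = np.diag([1.0, -1.0, -1.0, -1.0])

# Assumed frame (illustrative choice): t_hat along time, q_hat along z.
t_hat = np.array([1.0, 0.0, 0.0, 0.0])   # timelike,  t_hat^2 = +1
q_hat = np.array([0.0, 0.0, 0.0, 1.0])   # spacelike, q_hat^2 = -1


def lower(v):
    """Lower the index of a four-vector: v_mu = g_{mu nu} v^nu."""
    return g @ v


def levi_civita():
    """Totally antisymmetric symbol with eps[0,1,2,3] = +1 (assumed convention)."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        # Parity of the permutation from its inversion count.
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


# g_perp^{mu nu} as in (5.59a) and eps_perp^{mu nu} as in (5.59b).
g_perp = g + np.outer(q_hat, q_hat) - np.outer(t_hat, t_hat)
eps_perp = np.einsum("mnrs,r,s->mn", levi_civita(), lower(t_hat), lower(q_hat))


def contract(a, b):
    """Full contraction a^{mu nu} b_{mu nu}, lowering both indices of b."""
    return float(np.sum(a * (g @ b @ g)))


if __name__ == "__main__":
    print("g_perp . g_perp     =", contract(g_perp, g_perp))
    print("eps_perp . eps_perp =", contract(eps_perp, eps_perp))
    print("eps_perp . g_perp   =", contract(eps_perp, g_perp))
```

In this frame $g_\perp^{\mu\nu}$ reduces to minus the identity in the $x$-$y$ plane, which makes the geometric meaning of "transversal" explicit.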
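Now we can express all these momenta in our new basis using (5.45) (remember that the projections on $\hat q^\mu$ and $\hat l^\mu$ give an extra minus sign, because $\hat q^2 = \hat l^2 = -1$):
$$q^\mu = Q\, \hat q^\mu \tag{5.60a}$$
$$P^\mu = \kappa\,\frac{Q}{2x}\, \hat t^\mu - \frac{Q}{2x}\, \hat q^\mu \tag{5.60b}$$
$$k^\mu \approx \frac{\xi}{x}\frac{Q}{2}\, \hat t^\mu - \frac{\xi}{x}\frac{Q}{2}\, \hat q^\mu \tag{5.60c}$$
$$l^\mu = \frac{1}{\kappa}\,\frac{2-y}{2y}\, Q\, \hat t^\mu + \frac{Q}{2}\, \hat q^\mu + \frac{1}{\kappa}\,\frac{Q}{y}\sqrt{1 - y - \frac{\kappa^2-1}{4}\,y^2}\;\hat l^\mu \tag{5.60d}$$
$$l'^\mu = \frac{1}{\kappa}\,\frac{2-y}{2y}\, Q\, \hat t^\mu - \frac{Q}{2}\, \hat q^\mu + \frac{1}{\kappa}\,\frac{Q}{y}\sqrt{1 - y - \frac{\kappa^2-1}{4}\,y^2}\;\hat l^\mu . \tag{5.60e}$$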
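The decomposition (5.60) lends itself to a quick numerical cross-check. The sketch below is illustrative only: it picks an explicit frame for $\hat t^\mu$, $\hat q^\mu$, $\hat l^\mu$ (an assumption, any Lorentz frame would do), builds the momenta from (5.60a), (5.60b), (5.60d) and (5.60e), and lets one test the on-shell conditions together with the usual DIS definitions $x = Q^2/(2P\cdot q)$ and $y = P\cdot q / P\cdot l$, which are assumed here to be the relations the text refers to.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])          # metric (+, -, -, -)

# Assumed explicit frame for the orthonormal basis.
t_hat = np.array([1.0, 0.0, 0.0, 0.0])        # t_hat^2 = +1
q_hat = np.array([0.0, 0.0, 0.0, 1.0])        # q_hat^2 = -1
l_hat = np.array([0.0, 1.0, 0.0, 0.0])        # l_hat^2 = -1


def dot(a, b):
    """Minkowski product a.b of two four-vectors with upper indices."""
    return float(a @ g @ b)


def dis_momenta(Q, x, y, kappa):
    """Return (q, P, l, l_prime) built from (5.60a), (5.60b), (5.60d), (5.60e)."""
    root = np.sqrt(1.0 - y - 0.25 * (kappa**2 - 1.0) * y**2)
    q = Q * q_hat
    P = kappa * Q / (2 * x) * t_hat - Q / (2 * x) * q_hat
    # Part shared by the incoming and outgoing lepton momenta.
    common = (2 - y) * Q / (2 * kappa * y) * t_hat + Q / (kappa * y) * root * l_hat
    l = common + 0.5 * Q * q_hat
    l_prime = common - 0.5 * Q * q_hat
    return q, P, l, l_prime


if __name__ == "__main__":
    q, P, l, lp = dis_momenta(Q=3.0, x=0.2, y=0.4, kappa=1.05)
    print("q^2  =", dot(q, q))
    print("l^2  =", dot(l, l), " l'^2 =", dot(lp, lp))
    print("y    =", dot(P, q) / dot(P, l))
```

Note that $P^2$ comes out as $Q^2(\kappa^2-1)/(4x^2)$, so the proton mass enters only through $\kappa$.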
It is easy to verify that these formulae indeed reproduce the correct definitions; for instance one can quickly check the on-shell conditions $q^2 = -Q^2$, $k^2 = \xi^2 m_p^2$, $l^2 = l'^2 = 0$. Let us return to equation (5.48), and specify the lepton and hadron tensor in our new basis. We consider the electron beam to be polarized, say longitudinally, but we do not measure the polarization of the outgoing electron, implying we have to sum over outgoing polarization states using (B.22). Then the lepton tensor $L^{\mu\nu}$ is given by
$$L^{\mu\nu} \overset{N}{=} \sum_{\lambda'} \left(\bar u^{\lambda}(l)\,\gamma^\mu\, u^{\lambda'}(l')\right) \left(\bar u^{\lambda'}(l')\,\gamma^\nu\, u^{\lambda}(l)\right) = -Q^2 g^{\mu\nu} + 4\, l^{(\mu} l'^{\nu)} + 2i\lambda\, \varepsilon^{\mu\nu\rho\sigma}\, l_\rho\, l'_\sigma . \tag{5.61}$$
Writing it in our new basis gives
$$L^{\mu\nu} = \frac{Q^2}{y^2} \left[ -y^2 g_\perp^{\mu\nu} + 4(1-y)\left(\hat t^\mu \hat t^\nu + \hat l^\mu \hat l^\nu\right) + 4\sqrt{1-y}\,(2-y)\, \hat t^{(\mu} \hat l^{\nu)} - i\lambda\, y\,(2-y)\, \varepsilon_\perp^{\mu\nu} + 2i\lambda\, y\sqrt{1-y}\; \varepsilon^{\mu\nu\rho\sigma}\, \hat q_\rho\, \hat l_\sigma \right]. \tag{5.62}$$
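On the other hand, from equation (5.48) we see that the hadronic tensor is defined as
$$W^{\mu\nu} \overset{N}{=} 4\pi^3 \sum_X \int\! \frac{d^3 p_X}{(2\pi)^3\, 2E_X}\; \delta^{(4)}(P + q - p_X)\, \langle P|\, J^{\dagger\mu}(0)\, |X\rangle \langle X|\, J^{\nu}(0)\, |P\rangle = \frac{1}{4\pi} \int\! d^4 z\; e^{i q\cdot z}\, \langle P|\, J^{\dagger\mu}(z)\, J^{\nu}(0)\, |P\rangle , \tag{5.63}$$
where we used the translation operator
$$\langle P|\, J^{\dagger\mu}(0)\, |X\rangle\; e^{i (P - p_X)\cdot z} = \langle P|\, J^{\dagger\mu}(z)\, |X\rangle , \tag{5.64}$$
and integrated out a complete set of states by use of the completeness relation:
$$\sum_X \int\! \frac{d^3 p_X}{(2\pi)^3\, 2E_X}\; |X\rangle \langle X| = 1. \tag{5.65}$$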
Figure 5.12 shows the common convention to draw the hadronic tensor. It is a squared amplitude for a proton absorbing a photon going to any final state X, while summing over all possible final states. The straight line, a so-called ‘final-state cut’, acts both as a separator (everything to the left is the amplitude M, everything to the right is the complex conjugate M∗ ) and as a symbol representing the completeness relation (reminding us that we have to sum over all final states and integrate out their momenta). It is straightforward to use the final-state cut in perturbative calculations: every particle crossing it is a real particle and thus has to be on-shell. This can be incorporated by adding a δ (p2 − m2 ), matching the particle’s momentum squared to its mass squared.
Fig. 5.12: The hadronic tensor is a squared amplitude defined with a sum over all possible external states. This sum, and the separation between the amplitude and its conjugate, is represented by the vertical final-state cut line.
We have no information about the contents of the hadronic tensor, as it sits in the highly nonperturbative region of QCD; the proton constituents are strongly confined. But we can parameterize the hadronic tensor based on its mathematical structure. We will in this book only work with unpolarized hadron tensors, as polarization brings some technicalities with it which would distract us too much from our main topic of interest. The main course of calculations remains however the same for polarized hadrons.
For an unpolarized proton, $W^{\mu\nu}$ will only exist in the vector space spanned by the orthonormal vectors we derived before. But as the electron momentum $l^\mu$ does not have any physical significance inside the hadron tensor, we will use $\hat q^\mu$, $\hat t^\mu$, and their crossings (remember that the transversal plane can be described by $g_\perp^{\mu\nu}$ and $\varepsilon_\perp^{\mu\nu}$, being combinations of $\hat q^\mu$ and $\hat t^\nu$). Thus we can expand it as:
$$W^{\mu\nu} = A\, g^{\mu\nu} + B\, \hat q^\mu \hat q^\nu + C\, \hat q^\mu \hat t^\nu + D\, \hat t^\mu \hat q^\nu + E\, \hat t^\mu \hat t^\nu + i F\, \varepsilon^{\mu\nu\rho\sigma}\, \hat t_\rho\, \hat q_\sigma ,$$
where the scalar functions $A, \ldots, F$ only depend on $m_p^2$, $Q^2$ and $x$ (because there are no other invariants in the proton system). In the case of polarized hadrons, the spin vector $S^\mu$ and its combinations should be added to the basis. Next we impose current conservation, which requires $\partial_\mu J^\mu = 0$. Applying this to equation (5.63) we find
$$\hat q_\mu W^{\mu\nu} = W^{\mu\nu} \hat q_\nu = 0.$$
This condition gives:
$$A \equiv B, \qquad C \equiv D \equiv 0.$$
$W^{\mu\nu}$ should also be Hermitian and time-reversal invariant, and for the electromagnetic and the strong force it should be parity invariant as well. By using the transformation matrix
$$\Lambda_\mu{}^{\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix},$$
we can write out these conditions (adding spin-dependence for future reference):
$$\text{Hermiticity:} \qquad W^{*}_{\mu\nu}(q, P, S) \equiv W_{\nu\mu}(q, P, S) \tag{5.66a}$$
$$\text{parity-reversal:} \qquad \Lambda_\mu{}^{\rho}\, \Lambda_\nu{}^{\sigma}\, W_{\rho\sigma}(q, P, S) \equiv W_{\mu\nu}(\tilde q, \tilde P, -\tilde S) \tag{5.66b}$$
$$\text{time-reversal:} \qquad \Lambda_\mu{}^{\rho}\, \Lambda_\nu{}^{\sigma}\, W^{*}_{\rho\sigma}(q, P, S) \equiv W_{\mu\nu}(\tilde q, \tilde P, \tilde S) \tag{5.66c}$$
where $\tilde q^\mu = \delta^{\mu 0} q^0 - \delta^{\mu i} q^i$. The effect of these conditions is that $A, \ldots, F$ should be real functions, and the parity-reversal requirement sets $F = 0$. But parity is not conserved in weak interactions; in that case $F$ is allowed. We can rewrite $W^{\mu\nu}$ as
$$W^{\mu\nu} = -\frac{1}{2x} \left[ g_\perp^{\mu\nu}\, F_T(x, Q^2) - \hat t^\mu \hat t^\nu\, F_L(x, Q^2) - i\, \varepsilon_\perp^{\mu\nu}\, F_A(x, Q^2) \right] \tag{5.67}$$
where
$$F_T = -2x\, A, \qquad F_L = 2x\, (A + E), \qquad F_A = 2x\, F.$$
These are called the transversal, longitudinal and axial structure functions of the proton, respectively. They are nonperturbative (and thus noncalculable) objects, which have to be extracted from experiment. In parallel to these, a different notation is also used in the literature:
$$F_1 = \frac{1}{2x}\, F_T , \qquad F_T = 2x\, F_1 , \tag{5.68a}$$
$$F_2 = \frac{1}{\kappa^2} \left(F_L + F_T\right), \qquad F_L = \kappa^2 F_2 - 2x\, F_1 , \tag{5.68b}$$
$$F_3 = \frac{1}{x\, \kappa^2}\, F_A , \qquad F_A = x\, \kappa^2\, F_3 . \tag{5.68c}$$
We can also express the hadron tensor as a function of the last three structure functions:
$$W^{\mu\nu} = -g_\perp^{\mu\nu}\, F_1 + \hat t^\mu \hat t^\nu \left( \frac{\kappa^2}{2x}\, F_2 - F_1 \right) + i\, \frac{\kappa^2}{2}\, \varepsilon_\perp^{\mu\nu}\, F_3 . \tag{5.69}$$
The difference between $F_T, F_L, F_A$ and $F_1, F_2, F_3$ is just a matter of historic convention. However, there exist different conventions for the normalization of the structure functions, often differing by a factor of 2 or $2x$. We follow the same convention as, e.g., in [63], as we believe it to be the most commonly accepted one. The structure functions can be extracted from the hadronic tensor by projecting with appropriate tensors:
$$F_1 = -\frac{1}{2}\, g_\perp^{\mu\nu}\, W_{\mu\nu} , \qquad F_T = -x\, g_\perp^{\mu\nu}\, W_{\mu\nu} , \tag{5.70a}$$
$$F_2 = \frac{x}{\kappa^2} \left( 2\, \hat t^\mu \hat t^\nu - g_\perp^{\mu\nu} \right) W_{\mu\nu} , \qquad F_L = 2x\, \hat t^\mu \hat t^\nu\, W_{\mu\nu} , \tag{5.70b}$$
$$F_3 = -\frac{2i}{\kappa^2}\, \varepsilon_\perp^{\mu\nu}\, W_{\mu\nu} , \qquad F_A = -2x\, i\, \varepsilon_\perp^{\mu\nu}\, W_{\mu\nu} . \tag{5.70c}$$
For the rest of the book we will ignore weak interactions, dropping $F_A$ from the hadronic tensor. Combining the result from the leptonic and the hadronic tensor, we get
$$L_{\mu\nu} W^{\mu\nu} = \frac{2Q^2}{x\, y^2} \left[ \left(1 - y + \frac{y^2}{2}\right) F_T(x, Q^2) + (1 - y)\, F_L(x, Q^2) \right].$$
Plugging this result in equation (5.48) gives us the final expression for the unpolarized cross section for electron-proton deep inelastic scattering (neglecting terms of order $m_p^2/Q^2$):
$$\frac{d^2\sigma}{dx\, dy} = \frac{4\pi\alpha^2 s}{Q^4} \left[ \left(1 - y + \frac{y^2}{2}\right) F_T(x, Q^2) + (1 - y)\, F_L(x, Q^2) \right]. \tag{5.71}$$
If we compare this with the result in equation (5.44), we find the following structure functions for the free parton model:
$$F_T^{\mathrm{FPM}}(x, Q^2) = \frac{1}{3}\, x \sum_q e_q^2 \tag{5.72a}$$
$$F_L^{\mathrm{FPM}}(x, Q^2) = 0. \tag{5.72b}$$
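The contraction $L_{\mu\nu}W^{\mu\nu}$ quoted before (5.71) can be reproduced numerically from the unpolarized part of (5.62) and from (5.67) with $F_A = 0$. The sketch below is a hedged illustration: the frame for $\hat t^\mu$, $\hat q^\mu$, $\hat l^\mu$ and all function names are assumptions, and the symmetrization $\hat t^{(\mu}\hat l^{\nu)}$ is taken to include a factor $1/2$.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])          # metric (+, -, -, -)

# Assumed explicit frame for the orthonormal basis.
t_hat = np.array([1.0, 0.0, 0.0, 0.0])
q_hat = np.array([0.0, 0.0, 0.0, 1.0])
l_hat = np.array([0.0, 1.0, 0.0, 0.0])

g_perp = g + np.outer(q_hat, q_hat) - np.outer(t_hat, t_hat)


def lepton_tensor_unpolarized(Q, y):
    """Helicity-independent part of (5.62), with upper indices."""
    sym_tl = 0.5 * (np.outer(t_hat, l_hat) + np.outer(l_hat, t_hat))
    return (Q**2 / y**2) * (
        -y**2 * g_perp
        + 4 * (1 - y) * (np.outer(t_hat, t_hat) + np.outer(l_hat, l_hat))
        + 4 * np.sqrt(1 - y) * (2 - y) * sym_tl
    )


def hadron_tensor(x, F_T, F_L):
    """Hadronic tensor (5.67) without the axial term, with upper indices."""
    return -(g_perp * F_T - np.outer(t_hat, t_hat) * F_L) / (2 * x)


def contract(a, b):
    """Full contraction a^{mu nu} b_{mu nu}, lowering both indices of b."""
    return float(np.sum(a * (g @ b @ g)))


def closed_form(Q, x, y, F_T, F_L):
    """Right-hand side of the contraction quoted in the text."""
    return 2 * Q**2 / (x * y**2) * ((1 - y + 0.5 * y**2) * F_T + (1 - y) * F_L)


if __name__ == "__main__":
    Q, x, y, F_T, F_L = 2.0, 0.3, 0.45, 0.8, 0.1
    numeric = contract(lepton_tensor_unpolarized(Q, y), hadron_tensor(x, F_T, F_L))
    print(numeric, closed_form(Q, x, y, F_T, F_L))
```

The mixed $\hat t\,\hat l$ term of the lepton tensor drops out of the contraction because the unpolarized hadron tensor has no component along $\hat l^\mu$ other than through $g_\perp^{\mu\nu}$.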
5.2.4 Parton distribution functions

In Section 5.2.2 we succeeded in deriving a lowest order result for the cross section, starting from a static proton. On the other hand, in Section 5.2.3 we followed a more formal approach, without any assumptions about the proton structure but one: that we can separate the hard interaction from the proton contents. This is the concept of factorization: in any process containing hadrons we try to separate the perturbative hard part (the scattering Feynman diagram) from the nonperturbative part (the hadron contents). The latter is not calculable, and consequently it has to be described by a probability density function (or parton distribution function, PDF for short) that gives the probability to find a parton with momentum fraction $x$ in the parent hadron. However, one has to proceed with caution because factorization has not been proven except for a small number of processes, including $e^+e^-$-annihilation, DIS, SIDIS and Drell–Yan. The PDF is literally the object that describes the proton as a black box. You give it a fraction $x$ and it returns the probability to hit a parton carrying this momentum fraction when you bombard the proton with a photon. It is commonly written as $f_q(\xi)$, where $q$ is the type of parton for which the PDF is defined. There are thus 7 PDFs, one for each quark and antiquark, and one for the gluon. Parton distribution functions are not calculable; they have to be extracted from experiment. However, as we will see in Section 5.2.7, we can calculate their evolution equations, such that we can evolve an extracted PDF from a given kinematic region to a new kinematic region. A PDF is a probability density, but it is also a distribution in momentum space; by plotting the PDF as a function of $x$ one gets a clear view of the distribution of the partons in the proton. Furthermore we assume that the PDF only depends on $x$, and not, e.g., on the parton’s transverse momentum.
This does not mean that we automatically neglect the struck parton’s transverse momentum component! But because we do not identify any hadron in the final state, and because we have to sum over all final states and integrate out their momenta (the final-state cut), any transverse momentum dependence in the PDF or the hard part is integrated out. Factorization in DIS, also called collinear factorization because of the collinearity of the quark to the proton, is a factorization over $x$ (and an energy scale). We can write this formally as
$$\frac{d\sigma}{dx} \sim f_q(x, \mu_F^2) \otimes \hat H(x, \mu_F^2),$$
which is just a schematic. We will treat the technical details soon, in Section 5.2.7. Whenever information on the transverse momentum is needed, e.g., when identifying a final hadron as in semi-inclusive DIS, collinear factorization will not do, and $k_\perp$-factorization is needed instead, where a transverse momentum dependent PDF, or TMD for short, is convoluted with the hard part:
$$\frac{d\sigma}{d\xi} \sim f_q(\xi, k_\perp, \mu_F^2) \otimes \hat H(\xi, k_\perp, \mu_F^2).$$
Formally, a PDF and a TMD should be related by integrating out the transverse momentum dependence:
$$f_q(\xi) = \int\! d^2 k_\perp\; f_q(\xi, k_\perp);$$
however, after QCD corrections this equality is no longer valid. In the parton model, the concept of (collinear) factorization can be painlessly implemented:
$$d\sigma^{\mathrm{PM}} \equiv \sum_q \int\! d\xi\; f_q(\xi)\; d\hat\sigma_q\!\left(\frac{x}{\xi}\right), \tag{5.73a}$$
$$\overset{N}{=} f_q \otimes d\hat\sigma_q . \tag{5.73b}$$
Note that this is not a standard convolution the way you might know it, like $\int\! d\tau\, f(\tau)\, g(t - \tau)$. This is because the latter is a convolution as defined in Fourier space. In QCD, a lot of theoretical progress has been made by the use of Mellin moments. These form an advanced mathematical tool, which would take us too long to delve into. Just know that the type of convolution in (5.73) is a convolution in Mellin space. If we now plug equations (5.71) and (5.43) into (5.73), we get
$$F_T^{\mathrm{PM}}(x, Q^2) = \sum_q \int_x^1\! d\xi\; f_q(\xi)\; \hat F_T^q\!\left(\frac{x}{\xi}\right), \tag{5.74a}$$
$$= \sum_q e_q^2\, x f_q(x), \tag{5.74b}$$
$$F_L^{\mathrm{PM}}(x, Q^2) = 0, \tag{5.74c}$$
where
$$\hat F_T^q(x) = x\, e_q^2\, \delta(1 - x) \tag{5.75}$$
is the structure function of the quark. Note that $F_T^{\mathrm{PM}}$ does not depend on $Q^2$! This is called the “Bjorken scaling” prediction: the structure functions scale with $x$, independently of $Q^2$. Because this prediction is a direct result from the parton model, it should be clearly visible in leading order (up to first-order QCD corrections, where the Bjorken scaling is broken). This is indeed confirmed by experiment. Also note that by comparing (5.74) to (5.72), we can easily find the quark PDFs in the free parton model:
$$f_q^{\mathrm{FPM}}(x) = \frac{1}{3}, \tag{5.76}$$
which is exactly what the initial assumption for the FPM is: the proton equals exactly three quarks, thus the probability of finding a quark is always one third, regardless of the value of $x$.

A note on the difference between structure functions and PDFs. A structure function emerges in the parameterization of the hadronic tensor, the latter being process dependent. If we have a look at its definition for DIS in equation (5.63), we see that the hadronic tensor contains information both on the proton content and the photon hitting it. This is illustrated in Figure 5.12, where the blob represents the hadronic tensor, describing the process of a photon hitting a (black box) proton. As a structure function is just a parameterization of the hadronic tensor, the same applies to it. If we change the process to, say, deep inelastic neutrino scattering, our structure functions change as well, because now they describe the process of a $W^\pm$ or $Z^0$ boson hitting a proton. But the main idea behind factorization is that, inside the structure functions, we can somehow factorize out the proton content (which is process independent) from the process dependent part. This is shown in Figure 5.13, where the smaller blob now represents a quark PDF. The factorization of structure functions in the parton model is demonstrated in equations (5.74). The initial factorization ansatz, equation (5.73), is required to be valid for any cross section, given a unique set of PDFs, i.e., the PDFs are universal. We can extract these PDFs in one type of experiment, like electron DIS, and reuse them in another experiment like neutrino DIS. In contrast with the structure functions, PDFs emerge in the parameterization of the quark correlator, as we will see in the next subsection, which is universal by definition.
Fig. 5.13: Difference between structure functions and PDFs.
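To make the remark about Mellin-space convolutions in the parton-model factorization (5.73) more concrete, here is a small numerical sketch. It assumes the convolution measure $d\xi/\xi$ (with this measure, (5.74a) is the convolution of $\xi f_q(\xi)$ with $\hat F_T^q$) and the moment convention $M[h](n) = \int_0^1 dx\, x^{n-1} h(x)$; both conventions, the toy functions and all names are illustrative assumptions, not taken from the text. The point it illustrates is that moments of a Mellin convolution factorize into a product of moments.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def mellin_convolution(f, g, x, n_points=2001):
    """(f (x) g)(x) = int_x^1 dxi/xi f(xi) g(x/xi), evaluated numerically."""
    xi = np.linspace(x, 1.0, n_points)
    return trapezoid(f(xi) * g(x / xi) / xi, xi)


def mellin_moment(h, n, x_min=0.0, n_points=401):
    """M[h](n) = int_{x_min}^1 dx x^(n-1) h(x); x_min > 0 avoids evaluating h at 0."""
    x = np.linspace(x_min, 1.0, n_points)
    return trapezoid(x ** (n - 1) * h(x), x)


# Toy inputs (hypothetical, chosen so that everything is known in closed form).
def f_toy(xi):
    return xi**3


def g_toy(z):
    return z


# Vectorized numerical convolution, usable as a function of x > 0.
conv_toy = np.vectorize(lambda x: mellin_convolution(f_toy, g_toy, x))

if __name__ == "__main__":
    n = 2
    lhs = mellin_moment(conv_toy, n, x_min=1e-6)
    rhs = mellin_moment(f_toy, n) * mellin_moment(g_toy, n)
    print("moment of convolution:", lhs, " product of moments:", rhs)
```

For these toy functions the convolution is $(x - x^3)/2$ and both sides of the moment relation equal $1/15$ for $n = 2$, up to discretization error.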
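5.2.5 Operator definition for PDFs

As we have shown before, we can assume that the photon scatters off a quark with mass $m$ inside the proton, if $Q^2$ is sufficiently large. The final state can therefore be split in a quark with momentum $p$ and the full remaining state with momentum $p_X$. Constructing the (unpolarized) hadronic tensor for this setup is straightforward. First we remark that pulling a quark out of the proton at a space-time point $(0^+, 0^-, \mathbf{0}_\perp)$ is simply $\psi_\alpha(0)\, |P\rangle$. Then we construct the diagram for the hadronic tensor, the so-called ‘handbag diagram’, step-by-step:
$$\text{(diagram: quark } k \text{ leaving the proton, remnant } X\text{)} = \langle X|\, \psi_\alpha(0)\, |P\rangle$$
$$\text{(diagram: photon vertex } \nu \text{ added, outgoing quark } p\text{)} = \bar u_\beta^{\lambda}(p)\, (\gamma^\nu)_{\beta\alpha}\, \langle X|\, \psi_\alpha(0)\, |P\rangle$$
$$\text{(diagram: full handbag with vertices } \mu, \nu\text{)} \sim \left[\gamma^\mu (\slashed{p} + m)\, \gamma^\nu\right]_{\beta\alpha} \langle P|\, \bar\psi_\beta(0)\, |X\rangle \langle X|\, \psi_\alpha(0)\, |P\rangle ,$$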
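where we omitted the prefactor, sums and integrations over $X$ and $p$ and the $\delta$ function. Then the full hadronic tensor is given by
$$W^{\mu\nu} = \frac{1}{4\pi} \sum_q e_q^2\, \widetilde{\sum_X} \int\! \frac{d^3 p}{(2\pi)^3\, 2p^0} \int\! d^4 z\; e^{i (P + q - p_X - p)\cdot z} \times \left[\gamma^\mu (\slashed{p} + m)\, \gamma^\nu\right]_{\beta\alpha} \langle P|\, \bar\psi_\beta(0)\, |X\rangle \langle X|\, \psi_\alpha(0)\, |P\rangle , \tag{5.77}$$
where we used the shorthand notation
$$\widetilde{\sum_X} \overset{N}{=} \sum_X \int\! \frac{d^3 p_X}{(2\pi)^3\, 2E_X} . \tag{5.78}$$
Next we replace the integral over $p$ with an on-shell condition
$$\int\! \frac{d^3 p}{(2\pi)^3\, 2p^0} \;\to\; \int\! \frac{d^4 p}{(2\pi)^4}\; 2\pi\, \delta^+(p^2 - m^2) ,$$
where $\delta^+$ is defined in (B.47). We introduce the momentum $k = p - q$, giving
$$W^{\mu\nu} = \frac{1}{4\pi} \sum_q e_q^2\, \widetilde{\sum_X} \int\! \frac{d^4 k}{(2\pi)^3}\; \delta^+\!\left((k + q)^2 - m^2\right) \int\! d^4 z\; e^{i (P - k - p_X)\cdot z} \times \left[\gamma^\mu (\slashed{k} + \slashed{q} + m)\, \gamma^\nu\right]_{\beta\alpha} \langle P|\, \bar\psi_\beta(0)\, |X\rangle \langle X|\, \psi_\alpha(0)\, |P\rangle .$$
Now the next steps are the same as in equation (5.63), using the translation operator and the completeness relation: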
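$$W^{\mu\nu} = \frac{1}{2} \sum_q e_q^2 \int\! d^4 k\; \delta^+\!\left((k + q)^2\right) \mathrm{Tr}\!\left(\Phi^q\, \gamma^\mu (\slashed{k} + \slashed{q})\, \gamma^\nu\right) \tag{5.79a}$$
$$\Phi^q_{\alpha\beta} \overset{N}{=} \int\! \frac{d^4 z}{(2\pi)^4}\; e^{-i k\cdot z}\, \langle P|\, \bar\psi_\beta(z)\, \psi_\alpha(0)\, |P\rangle . \tag{5.79b}$$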
$\Phi$ is the quark correlator, which will be used as a basic building block to construct PDFs. Note that its Dirac indices are defined in a reversed way; this is deliberate, to set the trace right. This result is quite general, valid for a range of processes. Using equation (5.45) and neglecting terms of $O(1/Q)$, we can approximate the $\delta$ function in (5.79a) as
$$\delta\!\left((k + q)^2\right) \approx P^+\, \delta(\xi - x) ,$$
which again sets $\xi \equiv x$ as in the free parton model. This then gives
$$W^{\mu\nu} \approx \frac{1}{4} \sum_q e_q^2\; \mathrm{Tr}\!\left(\Phi^q(x)\, \gamma^\mu\, \frac{P^+}{P\cdot q}\, (\slashed{k} + \slashed{q})\, \gamma^\nu\right), \tag{5.80}$$
where the integrated quark correlator is defined as
$$\Phi(x) = \int\! dk^-\, d^2 k_\perp\; \Phi(x, k^-, k_\perp) = \frac{1}{2\pi} \int\! dz^-\; e^{-i x P^+ z^-}\, \langle P|\, \bar\psi_\beta(0^+, z^-, \mathbf{0}_\perp)\, \psi_\alpha(0)\, |P\rangle . \tag{5.81}$$
A last simplification that we can make is to assume that the outgoing quark is moving largely in the minus direction: $k^\mu + q^\mu \approx k^- + q^-$. This is easily understood in the infinite momentum frame, where the quark ricochets back after being struck head-on by the photon. However, it is a valid simplification in any frame, which can be shown by making a $1/Q$ expansion of $W^{\mu\nu}$. With this assumption we get
$$\frac{P^+}{P\cdot q}\, (\slashed{k} + \slashed{q}) \approx \gamma^+\, \underbrace{\frac{P^+}{P\cdot q} \left( \frac{k^2 + k_\perp^2}{2\xi P^+} + q^- \right)}_{\approx\, 1} ,$$
giving the final result for the unpolarized hadron tensor in DIS at leading twist (this means up to $O(1/Q)$):
$$W^{\mu\nu} \approx \frac{1}{4} \sum_q e_q^2\; \mathrm{Tr}\!\left(\Phi^q(x)\, \gamma^\mu \gamma^+ \gamma^\nu\right). \tag{5.82}$$
Now let us investigate the unintegrated quark correlator (5.79b) a bit deeper. Since it is a Dirac matrix, we can expand it in terms of Lorentz vectors, pseudovectors and Dirac matrices. The variables on which it depends are $p^\mu$, $P^\mu$ and $S^\mu$ (the latter is a pseudovector in the case of fermionic hadrons). Our basis is then (see (B.24)) spanned by
$$1,\ \gamma^5,\ \gamma^\mu,\ \gamma^\mu \gamma^5,\ \gamma^{\mu\nu}, \qquad p^\mu,\ P^\mu,\ S^\mu ,$$
where $\gamma^{\mu\nu} = \gamma^{[\mu} \gamma^{\nu]}$. The next steps go completely analogously to our derivation of the structure functions from the hadron tensor in Section 5.2.3. The conditions to satisfy are
$$\text{Hermiticity:} \qquad \Phi(p, P, S) \equiv \gamma^0\, \Phi^\dagger(p, P, S)\, \gamma^0 \tag{5.83a}$$
$$\text{Parity:} \qquad \Phi(p, P, S) \equiv \gamma^0\, \Phi(\tilde p, \tilde P, -\tilde S)\, \gamma^0 . \tag{5.83b}$$
For instance the integrated quark correlator can be expanded up to leading twist as
$$\Phi(x, P, S) = \frac{1}{2} \left( f_1(x)\, \gamma^- + g_{1L}(x)\, S_L\, \gamma^5 \gamma^- + \frac{1}{2}\, h_1(x) \left[\slashed{S}_T , \gamma^-\right] \gamma^5 \right) \tag{5.84}$$
where the three integrated PDFs $f_1$, $g_{1L}$ and $h_1$ are the unpolarized, helicity and transversity distributions, respectively. They can be recovered from the quark correlator by projecting on the correct gamma matrix:
$$f_1 = \frac{1}{2}\, \mathrm{Tr}\!\left(\Phi\, \gamma^+\right) \tag{5.85a}$$
$$g_{1L} = \frac{1}{2}\, \mathrm{Tr}\!\left(\Phi\, \gamma^+ \gamma^5\right) \tag{5.85b}$$
$$h_1 = \frac{1}{2}\, \mathrm{Tr}\!\left(\Phi\, \gamma^{+i} \gamma^5\right). \tag{5.85c}$$
5.2.6 Gauge invariant operator definition

A general Dirac field transforms under a non-Abelian gauge transformation as
$$\psi(x) \to e^{i \alpha^a(x) t^a}\, \psi(x) \tag{5.86a}$$
$$\bar\psi(x) \to \bar\psi(x)\, e^{-i \alpha^a(x) t^a} . \tag{5.86b}$$
As a result, the quark correlator is not gauge-invariant:
$$\Phi \to \int\! \frac{d^4 z}{(2\pi)^4}\; e^{-i k\cdot z}\, \langle P|\, \bar\psi_\beta(z)\, e^{-i \alpha^a(z) t^a}\, e^{i \alpha^a(0) t^a}\, \psi_\alpha(0)\, |P\rangle .$$
But as we saw in the Introduction, a Wilson line $U_{[x\,;\,y]}$ from $y$ to $x$ transforms as
$$U_{[x\,;\,y]} \to e^{i \alpha^a(x) t^a}\; U_{[x\,;\,y]}\; e^{-i \alpha^a(y) t^a} ,$$
so the following definition for the quark correlator is gauge-invariant:
$$\Phi \overset{N}{=} \int\! \frac{d^4 z}{(2\pi)^4}\; e^{-i k\cdot z}\, \langle P|\, \bar\psi_\beta(z)\; U_{[z\,;\,0]}\; \psi_\alpha(0)\, |P\rangle . \tag{5.87}$$
Note that the gauge transformation of $U$ only depends on its endpoints. Although the latter are fully fixed by the quark correlator, there is still the freedom of the choice of the path, influencing the result. The gauge-invariant correlator is thus path dependent! This will play a big role when working with the $k_\perp$-dependent correlator, which we will investigate further in Section 5.3. The integrated quark correlator on the other hand, see equation (5.81), has its quark fields only separated along the $z^-$ direction. This simplifies the Wilson line⁶ considerably:
$$\Phi(x) = \frac{1}{2\pi} \int\! dz^-\; e^{-i x P^+ z^-}\, \langle P|\, \bar\psi_\beta(0^+, z^-, \mathbf{0}_\perp)\; U^-_{[z\,;\,0]}\; \psi_\alpha(0)\, |P\rangle$$
$$U^-_{[z\,;\,0]} = \mathcal{P} \exp\!\left( -i g \int_0^{z^-}\! d\lambda\; n^-\!\cdot\! A(0^+, \lambda, \mathbf{0}_\perp) \right).$$
In the light-cone gauge, we have $A^+ = 0$ and thus $U^- = 1$, reducing the quark correlator to the definition in (5.81). As long as one stays in the $A^+ = 0$ gauge, it is valid to neglect the Wilson line inside the PDFs. Using the trick in equation (5.28), we can write the Wilson line as
$$U^-_{[z\,;\,0]} = \left[U^-_{[+\infty\,;\,z]}\right]^\dagger U^-_{[+\infty\,;\,0]} . \tag{5.88}$$
We can associate the part of the Wilson line going to $+\infty$ with an on-shell line, because we can throw a complete set of final states between the two Wilson lines. We thus extend the final-state cut through the Wilson line as well. This is illustrated in Figure 5.14. Remember from equation (5.33) that a quark dressed with a Wilson line can be considered an eikonal quark, essentially being a quark with soft and collinear gluon resummation. The physical interpretation for the quark correlator is not different: it represents all soft and collinear interactions between the struck quark and the proton.
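The two algebraic facts used above, the endpoint-only gauge transformation of the Wilson line and the splitting (5.88), can be illustrated on a discretized line. The sketch below is a toy model under stated assumptions: the gauge group is SU(2) instead of SU(3), the line is a chain of random link matrices standing in for $\exp(-ig\, a\, n^-\!\cdot\! A)$ on each segment, the far end of the chain plays the role of $+\infty$, and only the color structure of the fields is kept (no Dirac indices). None of the names come from the text.

```python
import numpy as np


def random_su2(rng):
    """Random SU(2) matrix built from a unit quaternion (a, b, c, d)."""
    a, b, c, d = rng.normal(size=4)
    norm = np.sqrt(a * a + b * b + c * c + d * d)
    a, b, c, d = a / norm, b / norm, c / norm, d / norm
    return np.array([[a + 1j * b, c + 1j * d], [-c + 1j * d, a - 1j * b]])


def wilson_line(links, start, end):
    """Path-ordered product U[end; start] = links[end-1] ... links[start]."""
    u = np.eye(2, dtype=complex)
    for j in range(start, end):
        u = links[j] @ u          # later links multiply from the left
    return u


def gauge_transform_links(links, v):
    """links[j] -> V(j+1) links[j] V(j)^dagger for site matrices v[0..N]."""
    return [v[j + 1] @ links[j] @ v[j].conj().T for j in range(len(links))]


def dressed_bilinear(psi_z, psi_0, links, z):
    """Color-space analogue of psi_bar(z) U[z; 0] psi(0)."""
    return psi_z.conj() @ wilson_line(links, 0, z) @ psi_0


rng = np.random.default_rng(1)
N, z = 12, 5                                       # N plays the role of +infinity
links = [random_su2(rng) for _ in range(N)]
v = [random_su2(rng) for _ in range(N + 1)]        # local gauge transformation
psi_0 = rng.normal(size=2) + 1j * rng.normal(size=2)
psi_z = rng.normal(size=2) + 1j * rng.normal(size=2)

before = dressed_bilinear(psi_z, psi_0, links, z)
after = dressed_bilinear(v[z] @ psi_z, v[0] @ psi_0, gauge_transform_links(links, v), z)

if __name__ == "__main__":
    print("dressed bilinear before/after:", before, after)
```

The bilinear without the Wilson line, `psi_z.conj() @ psi_0`, changes under the same transformation, which is the lattice counterpart of the non-invariance of the naive correlator.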
Fig. 5.14: (a) The gauge-invariant quark correlator function, with a cut Wilson line. (b) The Wilson lines inside the definition of the correlator account for the resummation of soft gluons.
We inserted the Wilson line somewhat ad hoc: we were looking for an object having the correct transformation properties to make the quark correlator gauge invariant, and the Wilson line happens to be such an object. It is not so difficult to prove this in a more formal way, using the eikonal approximation. Consider the diagram in Figure 5.15, where one soft gluon before the cut connects the struck quark with the blob. The hadronic tensor is then (see also (5.82)):
$$W^{\mu\nu} \sim \sum_q e_q^2\; \frac{1}{2}\, \mathrm{Tr}\!\left(\Phi_A^\rho(k, k - l)\; \gamma^\mu \gamma^+ \gamma_\rho\, \frac{\slashed{p} - \slashed{l} + m}{(p - l)^2 - m^2 + i\epsilon}\, \gamma^\nu\right),$$
where the quark-quark-gluon correlator is given by
$$\Phi_A^\rho(k, k - l) = \frac{1}{2} \int\! \frac{d^4 z}{(2\pi)^4} \frac{d^4 u}{(2\pi)^4}\; e^{-i k\cdot z}\, e^{-i l\cdot(u - z)}\, \langle P|\, \bar\psi_\beta(z)\; g A^\rho(u)\; \psi_\alpha(0)\, |P\rangle .$$

Fig. 5.15: A first order correction to the PDF.

6 In the context of PDFs, Wilson lines are commonly called gauge links. We do not use this terminology in our book.
Remember that we have an on-shell quark so that we can use the eikonal approximation. The $\gamma^+$ is what is left of the real quark, after making the sum over polarization states:
$$\sum_s u^s(p)\, \bar u^s(p) = \slashed{p} + m \to p^- \gamma^+ , \tag{5.89}$$
so we can use $\gamma^+$ as though it were a $u(p)$ on which to perform the eikonal approximation (as in equation (5.30)). Then we can make the approximation
$$\gamma^+ \gamma_\rho\, \frac{\slashed{p} - \slashed{l} + m}{(p - l)^2 - m^2 + i\epsilon} \approx \gamma^+\, \frac{-n_\rho}{n\cdot l - i\epsilon} . \tag{5.90}$$
This is indeed a Wilson line propagator for a line from $z$ to $\infty$. An important remark: the definition of $U^\dagger_{[+\infty\,;\,z]}$ also incorporates an exponential coming from the Feynman rule for the external point. This exponential has been extracted from $U$ (it is $e^{-i x P^+ z^-}$), but this remains valid by momentum conservation. The choice to extract the exponential from the Wilson line is by historic convention. It is straightforward to generalize this to any number of gluons, where gluons on the left of the cut will be associated with a line from $z$ to $\infty$, and gluons on the right of the cut with a Hermitian conjugate line. In other words:
$$W^{\mu\nu} \sim \sum_q e_q^2\; \frac{1}{2}\, \mathrm{Tr}\!\left(\Phi(x)\, \gamma^\mu \gamma^+ \gamma^\nu\right),$$
where now the quark-quark-gluon correlator is resummed to all orders:
$$\Phi = \frac{1}{2\pi} \int\! dz^-\; e^{-i x P^+ z^-}\, \langle P|\, \bar\psi_\beta(0^+, z^-, \mathbf{0}_\perp)\; U^{-\dagger}_{[+\infty\,;\,z]}\; U^-_{[+\infty\,;\,0]}\; \psi_\alpha(0)\, |P\rangle . \tag{5.91}$$
This is indeed the anticipated result. Using equation (5.85a), we can give a gauge-invariant formulation of the unpolarized integrated quark parton density function:
$$f_{q/p}(x) = \frac{1}{4\pi} \int\! dz^-\; e^{-i x P^+ z^-}\, \langle P|\, \bar\psi(z^-)\; U^{-\dagger}_{[+\infty\,;\,z]}\; \gamma^+\; U^-_{[+\infty\,;\,0]}\; \psi(0)\, |P\rangle , \tag{5.92}$$
where the subscript in $f_{q/p}$ is a common convention to denote “the integrated quark PDF for a quark with flavor $q$ inside a proton”. But what about the gluon PDF? Until now we totally ignored the possibility of the photon hitting a gluon inside the proton, because it is a higher order interaction. But while we are moving towards a more realistic approach of QCD, we cannot ignore gluon densities any further. A photon can hit a gluon by exchanging a quark. This process is called “boson-gluon fusion” and is illustrated in Figure 5.16. To construct the integrated gluon PDF, we start in the light-cone gauge $A^+ = 0$ such that we can ignore Wilson lines for now. There is a constraint equation on $A^-$ relating it to the transverse gauge fields, implying that the latter are the only independent fields. Following the same derivation as in Section 5.2.5, we find
$$f_{g/p}(\xi) = \frac{1}{2\pi} \int\! dz^-\; \xi P^+\, e^{-i \xi P^+ z^-}\, \langle P|\, A^{i a}(z^-)\, A^{i a}(0)\, |P\rangle .$$

Fig. 5.16: Boson-gluon fusion in DIS.

The factor $\xi P^+$ is typical for fields with even-valued spin. To make this gauge-invariant, we cannot simply insert a Wilson line as before, because the gauge fields transform with an extra derivative term. However, the gauge field density $F^{\mu\nu}$ transforms without such a derivative. We can easily relate the two:
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu$$
$$F^a_{+i} = \partial_+ A^a_i \quad (A^+ = 0) \qquad\Rightarrow\qquad A^a_i = \frac{1}{\partial_+}\, F^a_{+i} , \tag{5.93}$$
which we can use to redefine the gluon PDF. Inserting a Wilson line (in the adjoint representation, as it has to couple to gluons) then gives our final result for the integrated gluon PDF:
$$f_{g/p}(\xi) = \frac{1}{2\pi} \int\! \frac{dz^-}{\xi P^+}\; e^{-i \xi P^+ z^-}\, \langle P|\, F^{+i\, b}(z^-)\; U^{A\,-}_{[z\,;\,0]\; ba}\; F^{+i\, a}(0)\, |P\rangle . \tag{5.94}$$
5.2.7 Collinear factorization and evolution of PDFs

We started the idea of factorization in the parton model (see equation (5.73)). The integrated PDF $f_q(x)$ can be defined operator-wise by constructing it from the integrated quark correlator (see equation (5.85a)). By the demand of gauge-invariance, we modified the quark-quark correlator by injecting Wilson lines, leading to a resummation of soft gluons inside the PDF (see equation (5.92)). Figure 5.17 shows the factorization in DIS as we have seen so far.

Fig. 5.17: Factorization in DIS.

To get a better understanding of the PDFs and factorization in general, we investigate the process at first order in $\alpha_s$, and see how that changes our factorization rules:
$$F_T(x, Q^2) = \sum_q e_q^2\, x f_q(x) + O(\alpha_s)$$
$$F_L(x, Q^2) = 0 + O(\alpha_s) .$$
In what follows we will continue by using $F_2 = F_T + F_L$, to be in accordance with common literature. The correct approach to continue is as follows: we calculate $\hat W^{\mu\nu}$ for a single quark up to first order in $\alpha_s$, then we extract $\hat F_2$ for a single quark using (5.70a). We compare the result with (5.75), plug it in (5.74a) and see how it changes the PDF. There are 3 types of real gluon exchanges at first order, where the exchanged gluon is on-shell, and 3 types of virtual gluon exchanges, shown in Figure 5.18. We will calculate the real contributions, and label the momenta as shown in Figure 5.19. The corresponding amplitude for the initial state gluon radiation (Figure 5.19a) is (see also
(B.52) for the QCD Feynman rules):
$$\mathcal{M}_i^{a,\lambda,\lambda'} = \bar u^{\lambda'}(p)\, (i e_q \gamma^\mu)\, \frac{i\, \slashed{l}}{l^2 + i\epsilon}\, (-i g\, \slashed{\varepsilon}\, t^a)\, u^{\lambda}(k) .$$

Fig. 5.18: All types of first order corrections to the hard part. Real corrections are on the upper line; virtual on the lower line.

Fig. 5.19: (a) Initial state gluon radiation. (b) Final state gluon radiation.
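We average over color and incoming spin states, and sum over final spin (see (B.22)) and gluon polarization states,
$$\overline{|\mathcal{M}_i|^2} \overset{N}{=} \frac{1}{N} \sum_{a,b} \frac{1}{2} \sum_{\lambda} \sum_{\lambda'} \sum_{\mathrm{pol}} \mathcal{M}_i^{a,\lambda,\lambda'\,*}\, \mathcal{M}_i^{b,\lambda,\lambda'} , \tag{5.95}$$
and get for the complex squared amplitude
$$|\mathcal{M}|^2 = \frac{1}{2}\, C_F\, e_q^2\, g^2\, \frac{1}{l^4} \sum_{\mathrm{pol}} \mathrm{tr}\!\left(\slashed{\varepsilon}^{*}\, \slashed{k}\, \slashed{\varepsilon}\, \slashed{l}\, \gamma^\mu\, \slashed{p}\, \gamma^\nu\, \slashed{l}\right), \tag{5.96}$$
where we used equation (C.4) to simplify the color generators. We can sum over the gluon polarization states by using (B.53b); this simplifies the trace into
$$\mathrm{tr}(\ldots) = -\mathrm{tr}\!\left(\gamma^\rho\, \slashed{k}\, \gamma_\rho\, \slashed{l}\, \gamma^\mu \slashed{p}\, \gamma^\nu \slashed{l}\right) + \frac{1}{k^+ - l^+}\, \mathrm{tr}\!\left(\gamma^+ \slashed{k}\, (\slashed{k} - \slashed{l})\, \slashed{l}\, \gamma^\mu \slashed{p}\, \gamma^\nu \slashed{l}\right) + \frac{1}{k^+ - l^+}\, \mathrm{tr}\!\left((\slashed{k} - \slashed{l})\, \slashed{k}\, \gamma^+ \slashed{l}\, \gamma^\mu \slashed{p}\, \gamma^\nu \slashed{l}\right).$$
The first term can be simplified using (B.26b), while the other two can be simplified by using (B.39):
$$\begin{aligned} (\slashed{k} - \slashed{l})\, \slashed{k}\, \gamma^+ &= 2k^+ (\slashed{k} - \slashed{l}) - (\slashed{k} - \slashed{l})\, \gamma^+ \slashed{k} \\ &= 2k^+ (\slashed{k} - \slashed{l}) - 2 (k^+ - l^+)\, \slashed{k} + \gamma^+ (\slashed{k} - \slashed{l})\, \slashed{k} \\ &= 2l^+ \slashed{k} - 2k^+ \slashed{l} - 2\, l\cdot k\; \gamma^+ - \gamma^+ \slashed{k}\, (\slashed{k} - \slashed{l}) . \end{aligned}$$
Next we move to a frame where the quark lies dominantly in the plus direction while having some transversal momentum $k_\perp$, and where $l$ carries a fraction $z$ of the plus-momentum of the quark, while its transversal momentum $l_\perp$ is zero (all transversal momentum is carried away by the radiated gluon):
$$k = \left(k^+,\ \frac{k_\perp^2}{2k^+},\ \mathbf{k}_\perp\right), \qquad l = \left(z k^+,\ \frac{l^2}{2 z k^+},\ \mathbf{0}_\perp\right). \tag{5.97}$$
Combining these gives us
$$\mathrm{tr}(\ldots) = \frac{2}{z} \left(l^2 + z^2 k_\perp^2\right) \mathrm{tr}\!\left(\slashed{l}\, \gamma^\mu \slashed{p}\, \gamma^\nu\right).$$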
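The Dirac-algebra rearrangement of $(\slashed{k} - \slashed{l})\,\slashed{k}\,\gamma^+$ used above only relies on the anticommutator $\{\slashed{a}, \slashed{b}\} = 2\,a\cdot b$ and on $k^2 = 0$, so it can be verified with explicit matrices. The following sketch makes several assumptions that are not fixed by the text: the Dirac representation of the gamma matrices, the identification $\gamma^+ \equiv \slashed{n}$ and $a^+ \equiv n\cdot a$ for a lightlike vector $n$, and a particular normalization of $n$. The identity itself is insensitive to these choices.

```python
import numpy as np

I2 = np.eye(2)
Z2 = np.zeros((2, 2))
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Dirac representation (assumed): gamma^0 = diag(1, 1, -1, -1), gamma^i off-diagonal.
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    """Minkowski product of two four-vectors with upper indices."""
    return float(a @ METRIC @ b)


def slash(a):
    """Feynman slash a_mu gamma^mu for a four-vector with upper indices."""
    lowered = METRIC @ a
    return sum(lowered[mu] * GAMMA[mu] for mu in range(4))


def identity_residual(k, l, n):
    """Largest entry of LHS - RHS of the rearrangement, with gamma^+ -> slash(n)."""
    lhs = (slash(k) - slash(l)) @ slash(k) @ slash(n)
    rhs = (
        2 * dot(l, n) * slash(k)
        - 2 * dot(k, n) * slash(l)
        - 2 * dot(l, k) * slash(n)
        - slash(n) @ slash(k) @ (slash(k) - slash(l))
    )
    return float(np.max(np.abs(lhs - rhs)))


rng = np.random.default_rng(0)
k_vec = rng.normal(size=3)
k = np.concatenate(([np.linalg.norm(k_vec)], k_vec))     # massless: k^2 = 0
l = rng.normal(size=4)                                   # generic off-shell momentum
n = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2.0)       # lightlike, n.a = (a^0 + a^3)/sqrt(2)

if __name__ == "__main__":
    print("residual for massless k:", identity_residual(k, l, n))
```

For a momentum with $k^2 \neq 0$ the residual no longer vanishes, since the last step of the rearrangement drops a term $2k^2\gamma^+$.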
The next steps are straightforward but tedious; we will just give the result. After integrating over $k^\mu$ and projecting out $\hat F_2$ using (5.70a), we find the divergent correction terms:
$$\hat F_2^{\mathrm{div}} = e_q^2\, \frac{\alpha_s}{2\pi}\, x\, P_{qq}(x)\, \ln\frac{Q^2}{\mu_0^2} . \tag{5.98}$$
We did not list the finite terms, as they are easily calculable. The integration over $k_\perp$ led to an infrared divergence (the integral becomes infinite for $k_\perp \to 0$), which we regulated with a lower cut-off $\mu_0^2$. $P_{qq}(x)$ is the so-called splitting function:
$$P_{qq}(z) = C_F\, \frac{1 + z^2}{1 - z} . \tag{5.99}$$
This function is specific for the diagram in Figure 5.19(a). We use the notation $P_{ij}(z)$ to denote “the probability to get a parton of type $i$ with a momentum fraction $z$ from a parent parton of type $j$”. In this case, $P_{qq}(z)$ represents the probability for a quark to split into a quark carrying a fraction $z$ of its momentum and a gluon carrying a fraction $1 - z$ of its momentum. The other real diagrams do not add any divergences, only finite, calculable parts. So do the virtual diagrams, which can be easily calculated using standard loop-integral methods, as all ultraviolet divergences which appear in individual loop diagrams cancel out. So we can write the full result for $\hat F_2$ at leading order in $\alpha_s$:
$$\hat F_2 = e_q^2\, x \left[ \delta(1 - x) + \frac{\alpha_s}{2\pi} \left( P_{qq}(x)\, \ln\frac{Q^2}{\mu_0^2} + C(x) \right) \right], \tag{5.100}$$
where $C(x)$ contains all finite parts. Bjorken scaling is, as expected, violated; $\hat F_2$ now depends on $Q^2$. The singularity which is regulated by $\mu_0^2$ appears when the gluon is emitted collinear to the quark ($k_\perp = 0$), hence it is called a collinear divergence. Physically the limit $k_\perp \to 0$ corresponds to a long-range (soft) interaction, where QCD can no longer be calculated in a perturbative way. To extend our result to the proton structure, we convolute $\hat F_2$ with a PDF, as in equation (5.74a):
$$F_2 = \sum_q e_q^2\, x \int_x^1\! \frac{d\xi}{\xi}\; f^q(\xi) \left[ \delta\!\left(1 - \frac{x}{\xi}\right) + \frac{\alpha_s}{2\pi} \left( P_{qq}\!\left(\frac{x}{\xi}\right) \ln\frac{Q^2}{\mu_0^2} + C\!\left(\frac{x}{\xi}\right) \right) \right].$$
However, care has to be taken as $f^q$ is the bare, unrenormalized PDF, exactly the same situation as for the renormalization of the coupling constant. From now on we will write it as $f_0^q(\xi)$ to make the distinction clear. We want to absorb the collinear divergence into the PDF and renormalize it up to an arbitrary scale. We choose such a scale $\mu_F$, with $\mu_0^2 < \mu_F^2 < Q^2$, and we use it to split the logarithm:
$$\ln\frac{Q^2}{\mu_0^2} = \ln\frac{Q^2}{\mu_F^2} + \ln\frac{\mu_F^2}{\mu_0^2} , \tag{5.101}$$
and define a renormalized PDF as:
$$f^q(x, \mu_F^2) = \int_x^1\! \frac{d\xi}{\xi}\; f_0^q(\xi) \left[ \delta\!\left(1 - \frac{x}{\xi}\right) + \frac{\alpha_s}{2\pi} \left( P_{qq}\!\left(\frac{x}{\xi}\right) \ln\frac{\mu_F^2}{\mu_0^2} + C'\!\left(\frac{x}{\xi}\right) \right) \right]. \tag{5.102}$$
Then we can rewrite the factorization formula in terms of the renormalized PDF and the factorization scale:
$$F_2 = \sum_q e_q^2\, x \int_x^1\! \frac{d\xi}{\xi}\; f^q(\xi, \mu_F^2)\; \hat H\!\left(\frac{x}{\xi}, Q^2, \mu_F^2\right), \tag{5.103a}$$
$$\hat H\!\left(\frac{x}{\xi}\right) = \left[ \delta\!\left(1 - \frac{x}{\xi}\right) + \frac{\alpha_s}{2\pi} \left( P_{qq}\!\left(\frac{x}{\xi}\right) \ln\frac{Q^2}{\mu_F^2} + \tilde C\!\left(\frac{x}{\xi}\right) \right) \right]. \tag{5.103b}$$
In other words, we can retrieve the structure function by convoluting the PDF $f^q$ with the partonic hard part $\hat H$. Note that we have divided the finite part into two parts:
$$C(x) = \tilde C(x) + C'(x). \tag{5.104}$$
$C'$ is subtracted from the hard part and gets absorbed by the PDF, while $\tilde C$ is what remains in the factorization formula. The exact choice of how to do this is up to convention, and is called a factorization scheme. Two common schemes are the DIS scheme, where $\tilde C = 0$, i.e., everything is subtracted into the PDF, and the more common $\overline{\mathrm{MS}}$ scheme, where $C' = \ln 4\pi - \gamma_E$ only. It is very important to have a clear understanding of what is happening here. In the calculation of the correction to the hard part, we integrated out all $k_\perp^2$-dependence between $\mu_0^2$ and $Q^2$. The kinematics of the system make sure that $k_\perp \le Q$ always, i.e., the upper border of the integration is justified. In the infrared region however, there is no such kinematic restriction. By cutting the lower border of the integration at $\mu_0^2$ we discarded gluon radiation with $k_\perp < \mu_0$ from the hard part. In order to avoid dropping these gluons entirely, we have to absorb them in the PDF, which we subsequently renormalize up to an arbitrary scale $\mu_F$. By doing this, we hide the divergence from the process, inside an object that was not perturbative to begin with. The physical interpretation goes as follows: we choose an arbitrary energy scale $\mu_F$ that separates the process in two parts, namely a hard part with $k_\perp$ larger than this scale, and a nonperturbative part (the PDF) with $k_\perp$ smaller than this scale. This interpretation is illustrated in Figure 5.20, and is literally factorization as we have seen it before, but now emerging in a natural way. For this reason we will call $\mu_F$ the factorization scale.
Fig. 5.20: (a) The transverse momentum of the gluon is smaller than the factorization scale ($k_\perp < \mu_F$), so we absorb it in the PDF. (b) The transverse momentum of the gluon is larger than the factorization scale ($k_\perp > \mu_F$), so we add it to the hard part.
Since $F_2$ is a physical observable, it cannot depend on the factorization scale (which is merely an unphysical leftover of a mathematical tool). This implies
\[
\frac{\partial F_2}{\partial \ln \mu_F^2} \equiv 0, \tag{5.105}
\]
from which we can derive an evolution equation for $f$, the so-called DGLAP evolution equation:
\[
\frac{\partial}{\partial \ln \mu_F^2} f^q(x, \mu_F^2) = \frac{\alpha_s(\mu_F^2)}{2\pi} \int_x^1 \frac{d\xi}{\xi}\, P_{qq}\!\left(\frac{x}{\xi}, \alpha_s(\mu_F^2)\right) f^q(\xi, \mu_F^2), \tag{5.106}
\]
where we already incorporated the effect of the running coupling $\alpha_s(\mu_F^2)$. Note that $P_{qq}$ depends on the coupling because this is an all-order equation; corrections from higher-order calculations will manifest themselves inside the splitting function.

Everything we have derived so far was for quarks only. Adding gluons, we can now calculate the leading-order contribution (in $\alpha_s$) to $\hat F_2$ from the boson-gluon fusion diagram in Figure 5.16, and convolute this with the gluon PDF. We find for the partonic structure function:
\[
\hat F_2^g = \sum_q e_q^2\, x\, \frac{\alpha_s}{2\pi}\left(P_{qg}(x) \ln\frac{Q^2}{\mu_0^2} + C_q(x)\right). \tag{5.107}
\]
This is quite similar to equation (5.100); in particular, there is again a singularity from the integration over $k_\perp^2$. As we already knew, there is no gluon contribution to $\hat F_2$ when $\alpha_s = 0$. The splitting function is given by
\[
P_{qg}(z) = \frac{1}{2}\left(z^2 + (1 - z)^2\right), \tag{5.108}
\]
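where $P_{qg}$ is the probability to find a quark in a gluon. Note that in $\hat F_2^g$ we sum over quark flavors. We have to renormalize the gluon PDF as we did with the quark PDF, but we absorb the singularities in the quark PDF:
\[
\begin{aligned}
f^q(x, \mu_F^2) = f_0^q(x) &+ \frac{\alpha_s}{2\pi}\int_x^1 \frac{d\xi}{\xi}\, f_0^q(\xi)\left(P_{qq}\!\left(\frac{x}{\xi}\right)\ln\frac{\mu_F^2}{\mu_0^2} + \overline{C}^q\!\left(\frac{x}{\xi}\right)\right) \\
&+ \frac{\alpha_s}{2\pi}\int_x^1 \frac{d\xi}{\xi}\, f_0^g(\xi)\left(P_{qg}\!\left(\frac{x}{\xi}\right)\ln\frac{\mu_F^2}{\mu_0^2} + \overline{C}^g\!\left(\frac{x}{\xi}\right)\right).
\end{aligned}
\]
On the other hand, higher-order calculations show that the renormalization of the gluon PDF is given by:
\[
\begin{aligned}
f^g(x, \mu_F^2) = f_0^g(x) &+ \frac{\alpha_s}{2\pi}\int_x^1 \frac{d\xi}{\xi}\, f_0^g(\xi)\left(P_{gg}\!\left(\frac{x}{\xi}\right)\ln\frac{\mu_F^2}{\mu_0^2} + \overline{C}^q\!\left(\frac{x}{\xi}\right)\right) \\
&+ \frac{\alpha_s}{2\pi}\int_x^1 \frac{d\xi}{\xi}\, f_0^q(\xi)\left(P_{gq}\!\left(\frac{x}{\xi}\right)\ln\frac{\mu_F^2}{\mu_0^2} + \overline{C}^g\!\left(\frac{x}{\xi}\right)\right).
\end{aligned}
\]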
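With these renormalization definitions, we can write the factorization formulae as:
\[
F_2 = \sum_q e_q^2\, x \left(f^q \otimes \hat H^q + f^g \otimes \hat H^g\right), \tag{5.109a}
\]
\[
\hat H^q(z) = \delta(1 - z) + \frac{\alpha_s}{2\pi}\left(P_{qq}(z)\ln\frac{Q^2}{\mu_F^2} + \tilde C^q(z)\right), \tag{5.109b}
\]
\[
\hat H^g(z) = \frac{\alpha_s}{2\pi}\left(P_{qg}(z)\ln\frac{Q^2}{\mu_F^2} + \tilde C^g(z)\right), \tag{5.109c}
\]
\[
(f \otimes H)(x) \overset{N}{=} \int_x^1 \frac{d\xi}{\xi}\, f(\xi)\, H\!\left(\frac{x}{\xi}\right). \tag{5.109d}
\]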
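The convolution (5.109d) is easy to evaluate numerically once $f$ and $H$ are ordinary functions. The following is a minimal sketch, assuming a simple midpoint rule; the function names and the toy inputs `toy_pdf` and `toy_kernel` are hypothetical and chosen only because their convolution has a closed form. Distributional pieces of the hard part, such as $\delta(1 - z)$, are not handled here and would have to be treated separately.

```python
import math


def mellin_convolution(f, H, x, n=2000):
    """Midpoint-rule estimate of (f ⊗ H)(x) = ∫_x^1 dξ/ξ f(ξ) H(x/ξ).

    Only suitable for ordinary (non-distributional) kernels H.
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    step = (1.0 - x) / n
    total = 0.0
    for i in range(n):
        xi = x + (i + 0.5) * step  # midpoint of the i-th sub-interval
        total += f(xi) * H(x / xi) / xi
    return total * step


def toy_pdf(xi):
    """Hypothetical smooth test function, not a fitted PDF."""
    return xi * (1.0 - xi)


def toy_kernel(z):
    """Hypothetical smooth kernel standing in for a hard part."""
    return z


def toy_exact(x):
    """Closed form of (toy_pdf ⊗ toy_kernel)(x) = x (x - 1 - ln x)."""
    return x * (x - 1.0 - math.log(x))
```

Comparing `mellin_convolution(toy_pdf, toy_kernel, x)` with `toy_exact(x)` for a few values of $x$ gives a quick check of the integration before any realistic input is used.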
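Of course, in order to fully validate collinear factorization, one needs to derive factorization formulae for $F_1$ as well and verify that they agree with those for $F_2$. This has been done quite thoroughly, such that we can accept collinear factorization as a valid framework. Finally, the full DGLAP evolution equations can be expressed as a matrix equation (written with the convolution variable exchanged, $\xi \to x/\xi$, relative to (5.106)):
\[
\frac{\partial}{\partial \ln \mu^2}
\begin{pmatrix} q_i(x, \mu^2) \\ g(x, \mu^2) \end{pmatrix}
= \frac{\alpha_s}{2\pi}\int_x^1 \frac{d\xi}{\xi}
\begin{pmatrix} P_{q_i q_j}(\xi) & P_{q_i g}(\xi) \\ P_{g q_j}(\xi) & P_{gg}(\xi) \end{pmatrix}
\cdot
\begin{pmatrix} q_j\!\left(\frac{x}{\xi}, \mu^2\right) \\ g\!\left(\frac{x}{\xi}, \mu^2\right) \end{pmatrix}. \tag{5.110}
\]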
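How the scale independence (5.105) leads to an evolution equation of the DGLAP type can be made explicit in the quark-only case. The following is a sketch under simplifying assumptions that are not part of the derivation in the text: the coupling is kept fixed, terms of order $\alpha_s^2$ are dropped, and $\otimes$ denotes the convolution (5.109d).

```latex
% Sketch: fixed coupling, O(alpha_s^2) dropped, quark channel only.
\begin{align*}
0 &= \frac{\partial}{\partial \ln\mu_F^2}
     \Big[ f^q(\mu_F^2) \otimes \hat H(Q^2, \mu_F^2) \Big] \\
  &= \frac{\partial f^q}{\partial \ln\mu_F^2} \otimes \hat H
     + f^q \otimes \frac{\partial \hat H}{\partial \ln\mu_F^2} \\
  &= \frac{\partial f^q}{\partial \ln\mu_F^2} \otimes
     \big[ \delta(1 - z) + O(\alpha_s) \big]
     - \frac{\alpha_s}{2\pi}\, f^q \otimes P_{qq} ,
\end{align*}
% using d/d(ln mu_F^2) of ln(Q^2/mu_F^2) = -1 in (5.103b).
% Since the derivative of f^q is itself O(alpha_s), this gives
\begin{equation*}
\frac{\partial f^q(x, \mu_F^2)}{\partial \ln\mu_F^2}
  = \frac{\alpha_s}{2\pi} \int_x^1 \frac{d\xi}{\xi}\,
    P_{qq}\!\left(\frac{x}{\xi}\right) f^q(\xi, \mu_F^2)
    + O(\alpha_s^2) .
\end{equation*}
```

The scheme-dependent finite term $\tilde C$ drops out at this order because it does not depend on $\mu_F$.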
For the sake of completeness, we list all splitting functions at leading order:
\[
P_{qq}(z) = C_F\, \frac{1 + z^2}{1 - z}, \tag{5.111a}
\]
\[
P_{qg}(z) = \frac{1}{2}\left(z^2 + (1 - z)^2\right), \tag{5.111b}
\]
\[
P_{gq}(z) = C_F\, \frac{1 + (1 - z)^2}{z}, \tag{5.111c}
\]
\[
P_{gg}(z) = 2\, C_A \left(\frac{z}{1 - z} + \frac{1 - z}{z} + z(1 - z)\right). \tag{5.111d}
\]
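The leading-order splitting functions (5.111) translate directly into code. The sketch below assumes the SU(3) color factors $C_F = 4/3$ and $C_A = 3$ and is meant for $0 < z < 1$ only; the endpoint singularities at $z \to 1$ and $z \to 0$ are not regularized, exactly as in the formulas above. The function names are illustrative.

```python
C_F = 4.0 / 3.0  # SU(3) fundamental Casimir (assumed N_c = 3)
C_A = 3.0        # SU(3) adjoint Casimir (assumed N_c = 3)


def p_qq(z):
    """Quark from quark, eq. (5.111a), valid for 0 < z < 1."""
    return C_F * (1.0 + z * z) / (1.0 - z)


def p_qg(z):
    """Quark from gluon, eq. (5.111b)."""
    return 0.5 * (z * z + (1.0 - z) ** 2)


def p_gq(z):
    """Gluon from quark, eq. (5.111c), valid for 0 < z < 1."""
    return C_F * (1.0 + (1.0 - z) ** 2) / z


def p_gg(z):
    """Gluon from gluon, eq. (5.111d), valid for 0 < z < 1."""
    return 2.0 * C_A * (z / (1.0 - z) + (1.0 - z) / z + z * (1.0 - z))
```

Two useful sanity checks follow from momentum sharing in a single splitting: `p_gq(z)` equals `p_qq(1 - z)`, and both `p_qg` and `p_gg` are symmetric under $z \to 1 - z$.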
This leads us to the end of this section on deep inelastic scattering. In the next section, we will investigate what changes when we can no longer integrate over transverse momentum.
5.3 Semi-inclusive deep inelastic scattering

Collinear factorization is a well-explored and experimentally verified framework, but it only works when integrating out all final states. Keeping these final states, i.e., fully exclusive DIS, would maximally break factorization. In this section we investigate an intermediate solution, where we identify exactly one hadron in the final state, and integrate out all other states. This is called semi-inclusive DIS (or SIDIS for short). Because there is no restriction on the momentum of the final hadron, it can acquire a transversal part. To put it more formally: in DIS we were able to describe our process on a plane, because it only has two independent directions, viz. the direction of the incoming proton (which is parallel to the incoming electron) and the direction of the outgoing electron. We have chosen a frame where the plus and minus components of the momenta span this plane, such that the transversal components are zero. In SIDIS a third direction emerges from the momentum of the identified hadron, which does not necessarily lie in the plane spanned by the incoming and outgoing electron. In this frame, the final hadron will have a nonzero transverse momentum component.

As we will discover in this section, the breaking of collinear factorization is not an insurmountable obstacle; we can adapt our factorization framework to allow for $k_\perp$-dependence, such that the convolution between the hard part and the PDF (now also dependent on $k_\perp$, and thus from now on called a Transverse Momentum Dependent PDF or TMD for short) is a convolution over $k_\perp$. In this book we will not delve into the technicalities of $k_\perp$-factorization, as they are quite intricate and would lead us too far.
5.3.1 Conventions and kinematics

Different conventions exist in the literature concerning the naming of the different TMDs and azimuthal angles. We will follow the “Trento conventions”, as defined in [65]. Furthermore, concerning the labeling of momenta, we will follow the same convention as used in [64]. In a SIDIS process, we have an electron with momentum $l$ that collides with a proton with momentum $P$. The exchanged photon has momentum $q$, and hits a parton with momentum $k$, which has momentum $p$ after scattering (i.e., $p = k + q$). The struck parton then fragments into a hadron with momentum $P_h$. This is shown in Figure 5.21. Note that we now have two density functions: one that represents the probability to find a parton in the proton (the TMD), and one that represents the probability for a parton to fragment into a specific hadron (the fragmentation function or FF). We will assume the final hadron to be a spin-0 hadron, like a pion.

Fig. 5.21: Kinematics of semi-inclusive deep inelastic electron-proton scattering.

We will use $x$ and $y$ as defined in Section 5.2.1, and we will define a new Lorentz invariant $z$:
\[
z = \frac{P \cdot P_h}{P \cdot q}. \tag{5.112}
\]
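As a small illustration of (5.112), the invariant $z$ can be computed from explicit four-vectors. This is a sketch that assumes the metric signature $(+,-,-,-)$ with components ordered as $(E, p_x, p_y, p_z)$; the helper names and the numerical momenta are hypothetical. In the proton rest frame the definition reduces to $z = E_h / q^0$.

```python
import math


def minkowski_dot(a, b):
    """a·b with metric (+,-,-,-); four-vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def sidis_z(P, q, P_h):
    """Lorentz invariant z = (P·P_h) / (P·q) of eq. (5.112)."""
    return minkowski_dot(P, P_h) / minkowski_dot(P, q)


# Illustrative numbers in GeV, proton at rest (not taken from any experiment).
M_PROTON = 0.938
M_PION = 0.1396
P = (M_PROTON, 0.0, 0.0, 0.0)
q = (10.0, 0.0, 0.0, -10.5)  # spacelike photon: Q^2 = -q·q = 10.25
E_h, px_h = 4.0, 0.3
pz_h = -math.sqrt(E_h**2 - M_PION**2 - px_h**2)  # puts the pion on shell
P_h = (E_h, px_h, 0.0, pz_h)

z_value = sidis_z(P, q, P_h)  # equals E_h / q^0 in this frame
```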
The value of $z$ can be measured in experiment; it approximates the fractional momentum of the detected hadron relative to its parent parton, in the same way that $x$ approximates the fractional momentum of the struck quark relative to the parent proton. Intuitively, we can add a fragmentation function $D^q(z)$ to (5.74), giving a parton-model (PM) collinear estimate for $F_2$ in SIDIS:
\[
F_2^{\text{PM}} = \sum_q e_q^2\, x\, f^q(x)\, D^q(z), \tag{5.113}
\]
which gives us, using (5.71), a first estimate for the SIDIS cross section:
\[
\frac{d^3\sigma}{dx\, dy\, dz} \approx \frac{4\pi\alpha^2 s}{Q^4}\left(1 - y + \frac{y^2}{2}\right)\sum_q e_q^2\, x\, f^q(x)\, D^q(z). \tag{5.114}
\]
Another important variable is the azimuthal angle $\phi_h$, which is defined as
\[
\cos\phi_h = -\frac{\hat l \cdot P_h}{P_{h\perp}},
\]
where $P_{h\perp}$ is the length of the transversal component of the momentum of the outgoing hadron:
\[
P_{h\perp} = \sqrt{-g_{\perp\,\mu\nu}\, P_h^\mu P_h^\nu}.
\]
The geometrical construction of the azimuthal angle is shown in Figure 5.22. It is straightforward to show that the cross section is given by
\[
\frac{d^6\sigma}{dx\, dy\, dz\, d\phi_h\, dP_{h\perp}^2} = \frac{\alpha^2}{2 z x s Q^2}\, L_{\mu\nu} W^{\mu\nu}, \tag{5.115}
\]
where we approximated
\[
d^3 P_h \approx dz\, d^2 P_{h\perp}\, \frac{E_h}{z}.
\]
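5.3.2 Structure functions

The hadronic tensor is defined as (compare it to (5.63)):
\[
\begin{aligned}
W^{\mu\nu} &\overset{N}{=} 4\pi^3\, \widetilde{\sum_X}\, \delta^{(4)}(P + q - p_X - P_h)\, \langle P|\, J^{\dagger\mu}(0)\, |X, P_h\rangle \langle X, P_h|\, J^\nu(0)\, |P\rangle \\
&= \frac{1}{4\pi}\int d^4 r\; e^{i q\cdot r}\, \langle P|\, J^{\dagger\mu}(r)\, |P_h\rangle \langle P_h|\, J^\nu(0)\, |P\rangle.
\end{aligned} \tag{5.116}
\]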
Fig. 5.22: In the rest frame of the proton, $P_{h\perp}$ is the projection of $P_h$ onto the plane perpendicular to the photon momentum. The azimuthal angle $\phi_h$ is the angle between $P_{h\perp}$ and the lepton plane.
As we will see in Section 5.3.3, this is a bit simplistic as we cannot integrate out the $X$ states without affecting $P_h$, but the general idea is correct. Note that because we do not integrate over $P_h$ (we measure it in the final state), we cannot drop the state $|P_h\rangle\langle P_h|$. This leads to an important difference as compared to the hadronic tensor in DIS, viz. that we cannot naively impose the same constraints as in (5.66a), because time-reversal invariance is not automatically satisfied. We can restore this invariance by modifying the requirement slightly: we demand invariance under the simultaneous reversal of time and of initial and final states. For the parameterization of the hadronic tensor, we use the same orthonormal basis as before, viz. equations (5.58), but now we have an additional physical vector at our disposal, which we can use to construct the fourth basis vector:
\[
\hat h^\mu \overset{N}{=} \frac{g_\perp^{\mu\nu}\, P_{h\,\nu}}{P_{h\perp}}. \tag{5.117}
\]
$\hat h^\mu$ is a spacelike unit vector:
\[
\hat h^\mu \hat h_\mu = -1. \tag{5.118}
\]
Note that although we normalized this vector, it is not orthogonal to all the other basis vectors. We have, as expected,
\[
\hat h \cdot \hat t = 0, \qquad \hat h \cdot \hat q = 0,
\]
but it is not orthogonal to $\hat l^\mu$:
\[
\hat h \cdot \hat l = \cos\phi_h. \tag{5.119}
\]
This is a deliberate choice, because now we have the azimuthal dependence hardcoded inside our new basis. Note that
\[
-\hat l_{\perp\mu}\, \varepsilon_\perp^{\mu\nu}\, \hat h_\nu = \sin\phi_h, \tag{5.120}
\]
which implies that $\phi_h$ is fully defined in the region $0 \ldots 2\pi$. We can parameterize $W^{\mu\nu}$ in the same way as we did in (5.67), now with $\hat h$ added. This gives (for the unpolarized case):⁷
\[
W^{\mu\nu} = \frac{z}{x}\Big[ -g_\perp^{\mu\nu} F_{UU,T} + \hat t^\mu \hat t^\nu F_{UU,L} + 2\,\hat t^{(\mu}\hat h^{\nu)} F_{UU}^{\cos\phi_h} + \left(2\hat h^\mu \hat h^\nu + g_\perp^{\mu\nu}\right) F_{UU}^{\cos 2\phi_h} - 2i\, \hat t^{[\mu}\hat h^{\nu]} F_{LU}^{\sin\phi_h} \Big]. \tag{5.121}
\]
The subscript $UU$ denotes a structure function for an unpolarized beam on an unpolarized target, while the labeling in terms of $\phi_h$ is motivated by contracting with the lepton tensor (5.62):
\[
\begin{aligned}
L_{\mu\nu} W^{\mu\nu} = \frac{4 z s}{y}\Big[ &\left(1 - y + \frac{y^2}{2}\right) F_{UU,T} + (1 - y) F_{UU,L} + \sqrt{1 - y}\,(2 - y)\cos\phi_h\, F_{UU}^{\cos\phi_h} \\
&+ (1 - y)\cos 2\phi_h\, F_{UU}^{\cos 2\phi_h} + \lambda\, y \sqrt{1 - y}\, \sin\phi_h\, F_{LU}^{\sin\phi_h} \Big].
\end{aligned}
\]
As anticipated, $F_{UU}^{\cos\phi_h}$ has a factor $\cos\phi_h$ in front, and so on. Note that $F_{LU}^{\sin\phi_h}$ is the structure function for a longitudinally polarized lepton beam (on an unpolarized proton target), which is confirmed by the factor $\lambda$ in front (originating from the last term in the lepton tensor (5.62)). The cross section then follows from (5.115):
\[
\begin{aligned}
\frac{d^6\sigma}{dx\, dy\, dz\, d\phi_h\, dP_{h\perp}^2} = \frac{2\alpha^2}{x y Q^2}\Big[ &\left(1 - y + \frac{y^2}{2}\right) F_{UU,T} + (1 - y) F_{UU,L} + \sqrt{1 - y}\,(2 - y)\cos\phi_h\, F_{UU}^{\cos\phi_h} \\
&+ (1 - y)\cos 2\phi_h\, F_{UU}^{\cos 2\phi_h} + \lambda\, y\sqrt{1 - y}\,\sin\phi_h\, F_{LU}^{\sin\phi_h} \Big],
\end{aligned} \tag{5.122a}
\]
\[
\frac{d^3\sigma}{dx\, dy\, dz} = \frac{4\pi\alpha^2}{x y Q^2}\left[\left(1 - y + \frac{y^2}{2}\right)\tilde F_{UU,T} + (1 - y)\tilde F_{UU,L}\right], \tag{5.122b}
\]
where we integrated over $P_{h\perp}$ in the last step, which removes the $\phi_h$-dependence. The tilde structure functions are the integrated versions:
\[
\tilde F_{UU,T}(x, z, Q^2) = \int d^2 P_{h\perp}\, F_{UU,T}(x, z, Q^2, P_{h\perp}), \tag{5.123}
\]
7 The lepton sector did not change when going from DIS to SIDIS, implying we can use the same one again.
and similarly for $\tilde F_{UU,L}$. From the logical demand
\[
\sum_h \int dz\; z\, \frac{d^3\sigma_{\text{SIDIS}}}{dx\, dy\, dz} \equiv \frac{d^2\sigma_{\text{DIS}}}{dx\, dy}
\]
we can relate the SIDIS structure functions to the DIS structure functions:
\[
\sum_h \int dz\; z\, \tilde F_{UU,T}(x, z, Q^2) \equiv F_T(x, Q^2), \tag{5.124a}
\]
\[
\sum_h \int dz\; z\, \tilde F_{UU,L}(x, z, Q^2) \equiv F_L(x, Q^2). \tag{5.124b}
\]
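The statement that the azimuthal modulations drop out when going from (5.122a) to (5.122b) can be checked numerically for the angular part alone. The sketch below evaluates the bracket of (5.122a) at fixed $P_{h\perp}$ with arbitrary constant sample values for the structure functions (these numbers are placeholders, not data) and integrates over $\phi_h$; only the $F_{UU,T}$ and $F_{UU,L}$ terms survive, multiplied by $2\pi$.

```python
import math


def azimuthal_bracket(phi, y, lam, F):
    """Bracket of eq. (5.122a) at angle phi; F maps labels to sample values."""
    root = math.sqrt(1.0 - y)
    return ((1.0 - y + 0.5 * y * y) * F["UU,T"]
            + (1.0 - y) * F["UU,L"]
            + root * (2.0 - y) * math.cos(phi) * F["UU,cos"]
            + (1.0 - y) * math.cos(2.0 * phi) * F["UU,cos2"]
            + lam * y * root * math.sin(phi) * F["LU,sin"])


def integrate_over_phi(y, lam, F, n=360):
    """Midpoint rule on [0, 2π); accurate for low-order trigonometric terms."""
    step = 2.0 * math.pi / n
    return step * sum(azimuthal_bracket((i + 0.5) * step, y, lam, F)
                      for i in range(n))


def phi_independent_part(y, F):
    """2π times the terms of the bracket that carry no phi dependence."""
    return 2.0 * math.pi * ((1.0 - y + 0.5 * y * y) * F["UU,T"]
                            + (1.0 - y) * F["UU,L"])


# Placeholder values, chosen only to exercise every term.
SAMPLE_F = {"UU,T": 1.3, "UU,L": 0.4, "UU,cos": -0.2,
            "UU,cos2": 0.1, "LU,sin": 0.05}
```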
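5.3.3 Transverse momentum dependent PDFs

We can construct the diagram for the hadronic tensor following the same step-by-step procedure we used in DIS (Section 5.2.5), this time adding a fragmentation function, as is illustrated in Figure 5.23. Remember that the amplitude for extracting a quark from a proton with momentum $P$ is $\psi_\alpha(0)\,|P\rangle$. The amplitude for a quark fragmenting into a hadron with momentum $P_h$ is then of course $\langle P_h|\,\bar\psi_\alpha(0)$. So for the diagram with the photon vertex $\nu$, the extracted parton $k$ (with remnant $X$) and the fragmenting parton $p$ (with remnant $Y$), we simply have
\[
\langle Y, P_h|\, \bar\psi_\beta(0)\, |0\rangle\, (\gamma^\nu)_{\beta\alpha}\, \langle X|\, \psi_\alpha(0)\, |P\rangle .
\]
The QED vertex adds a $\delta$-function, and making the final-state cut adds two final-state sums (using the notation defined in equation (5.78)) and two $\delta$-functions:
\[
\begin{aligned}
W^{\mu\nu} = \frac{1}{2}\sum_q e_q^2 \int d^4 k\, d^4 p\; &\widetilde{\sum_X}\, \widetilde{\sum_Y}\; \delta^{(4)}(P - k - p_X)\, \delta^{(4)}(P_h + p_Y - p)\, \delta^{(4)}(k + q - p) \\
&\times \langle P|\, \bar\psi\, |X\rangle\, \gamma^\mu\, \langle 0|\, \psi\, |Y, P_h\rangle \langle Y, P_h|\, \bar\psi\, |0\rangle\, \gamma^\nu\, \langle X|\, \psi\, |P\rangle .
\end{aligned}
\]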
Next we will separate the proton content from the fragmenting hadron content, applying to each the same steps as before (expressing the $\delta$-function as an exponential, using the translation operator and the completeness relation). Then we get the general leading-order result:
\[
W^{\mu\nu} = \frac{1}{2}\sum_q e_q^2 \int d^4 k\, d^4 p\; \delta^{(4)}(k + q - p)\; \mathrm{tr}\big(\Phi(k, P)\, \gamma^\mu\, \Delta(p, P_h)\, \gamma^\nu\big), \tag{5.125a}
\]
\[
\Phi_{\alpha\beta}(k, P) = \int \frac{d^4 r}{16\pi^4}\; e^{-i k\cdot r}\, \langle P|\, \bar\psi_\beta(r)\, \psi_\alpha(0)\, |P\rangle, \tag{5.125b}
\]
\[
\Delta_{\alpha\beta}(p, P_h) = \int \frac{d^4 r}{16\pi^4}\; e^{-i p\cdot r}\, \langle 0|\, \psi_\alpha(0)\, |P_h\rangle \langle P_h|\, \bar\psi_\beta(r)\, |0\rangle . \tag{5.125c}
\]

Fig. 5.23: Leading order diagram for the hadronic tensor in SIDIS.
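Next we choose a frame where the parton in the TMD carries a fraction $\xi$ of the proton’s plus momentum, and where the final hadron carries a fraction $\zeta$ of the fragmenting parton’s minus momentum, i.e.,
\[
k^\mu = \left(\xi P^+,\; \frac{k^2 + k_\perp^2}{2\xi P^+},\; k_\perp\right), \qquad
p^\mu = \left(\zeta\,\frac{p^2 + p_\perp^2}{2 P_h^-},\; \frac{P_h^-}{\zeta},\; p_\perp\right), \tag{5.126}
\]
such that we can write (neglecting terms that are $\frac{1}{Q}$ suppressed):
\[
\begin{aligned}
\delta^{(4)}(k + q - p) &\approx \delta(k^+ + q^+)\; \delta(q^- - p^-)\; \delta^{(2)}(k_\perp + q_\perp - p_\perp) \\
&\approx \frac{1}{P^+ P_h^-}\; \delta(\xi - x)\; \delta\!\left(\frac{1}{\zeta} - \frac{1}{z}\right) \delta^{(2)}(k_\perp + q_\perp - p_\perp),
\end{aligned}
\]
and we transform the integral measures as
\[
d^4 k = P^+\, d\xi\; dk^-\, d^2 k_\perp, \qquad d^4 p = dp^+\, d\zeta\; \frac{P_h^-}{\zeta^2}\; d^2 p_\perp .
\]
Then we can rewrite the hadronic tensor as:
\[
W^{\mu\nu} = \sum_q e_q^2 \int d^2 k_\perp\; z\; \mathrm{tr}\big(\Phi(x, k_\perp, P)\, \gamma^\mu\, \Delta(z, k_\perp + q_\perp, P_h)\, \gamma^\nu\big), \tag{5.127}
\]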
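where we defined the $k_\perp$-dependent correlators as:
\[
\Phi(\xi, k_\perp, P) \overset{N}{=} \int \frac{d^3 r}{8\pi^3}\; e^{-i \xi P^+ r^- + i k_\perp\cdot r_\perp}\, \langle P|\, \bar\psi(0^+, r^-, r_\perp)\, \psi(0)\, |P\rangle, \tag{5.128a}
\]
\[
\Delta(z, p_\perp, P_h) \overset{N}{=} \frac{1}{2z}\int \frac{d^3 r}{8\pi^3}\; e^{-i \frac{P_h^-}{z} r^+ + i p_\perp\cdot r_\perp}\, \langle 0|\, \psi(0)\, |P_h\rangle \langle P_h|\, \bar\psi(r^+, 0^-, r_\perp)\, |0\rangle . \tag{5.128b}
\]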
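The parameterization (5.126) fixes the small light-cone component of each momentum through its virtuality. A quick numerical check is sketched below; it assumes the light-cone product $a\cdot b = a^+ b^- + a^- b^+ - a_\perp\cdot b_\perp$ (so that $k^2 = 2k^+k^- - |k_\perp|^2$), reads $k_\perp^2$ in (5.126) as the positive squared length of the transverse vector, and uses hypothetical helper names and illustrative numbers.

```python
import math


def lc_dot(a, b):
    """Light-cone product for vectors given as (plus, minus, (t1, t2)).

    Assumed convention: a·b = a+ b- + a- b+ - a_T·b_T.
    """
    return (a[0] * b[1] + a[1] * b[0]
            - a[2][0] * b[2][0] - a[2][1] * b[2][1])


def parton_momentum(xi, P_plus, virtuality, kT):
    """k^mu of eq. (5.126): fraction xi of the proton's plus momentum."""
    kT2 = kT[0] ** 2 + kT[1] ** 2
    return (xi * P_plus, (virtuality + kT2) / (2.0 * xi * P_plus), kT)


def fragmenting_momentum(zeta, Ph_minus, virtuality, pT):
    """p^mu of eq. (5.126): the hadron carries fraction zeta of p^-."""
    pT2 = pT[0] ** 2 + pT[1] ** 2
    return (zeta * (virtuality + pT2) / (2.0 * Ph_minus), Ph_minus / zeta, pT)


# Illustrative values only.
k = parton_momentum(xi=0.2, P_plus=50.0, virtuality=-0.3, kT=(0.4, -0.1))
p = fragmenting_momentum(zeta=0.5, Ph_minus=8.0, virtuality=0.7, pT=(0.2, 0.3))
```

With these definitions `lc_dot(k, k)` and `lc_dot(p, p)` return the virtualities that were put in, which confirms that the minus component of $k$ and the plus component of $p$ are suppressed by the large momentum components.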
We can parameterize the quark correlator and the fragmentation correlator in terms of TMDs and FFs, precisely as we did with the quark correlator in the case of DIS. Keeping only the contributions at leading twist, we obtain the following unpolarized TMDs and FFs:
\[
\Phi(\xi, k_\perp) = \frac{1}{2}\, f_1(\xi, k_\perp)\, \gamma^- + \frac{i}{2}\, h_1^\perp(\xi, k_\perp)\, \frac{\not k_\perp}{m_p}\, \gamma^-, \tag{5.129a}
\]
\[
\Delta(\zeta, k_\perp) = \frac{1}{2}\, D_1(\zeta, k_\perp)\, \gamma^+ + \frac{i}{2}\, H_1^\perp(\zeta, k_\perp)\, \frac{\not k_\perp}{m_h}\, \gamma^+ . \tag{5.129b}
\]
If we plug this result into (5.115) and (5.122b), and use the approximation
\[
q_\perp \approx -\frac{P_{h\perp}}{z}, \tag{5.130}
\]
we get the factorization formula for the unpolarized transversal structure function in SIDIS:
\[
F_{UU,T} = \sum_q e_q^2\, x\, f_1^q \otimes D_1^q, \tag{5.131}
\]
where we defined the convolution over transverse momentum as
\[
f_1^q \otimes D_1^q = \int d^2 k_\perp\, d^2 p_\perp\; \delta^{(2)}\!\left(k_\perp - p_\perp - \frac{1}{z} P_{h\perp}\right) f_1^q(x, k_\perp)\, D_1^q(z, p_\perp). \tag{5.132}
\]
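To get a feeling for the transverse-momentum convolution (5.132), it helps to evaluate it for a simple model. The sketch below assumes normalized Gaussian shapes in $k_\perp$ and $p_\perp$; this is purely an illustrative assumption, not something derived in the text. The $x$- and $z$-dependent prefactors are suppressed and the vector $P_{h\perp}/z$ is passed in as `(qx, qy)`. After the $\delta$-function removes the $p_\perp$ integral, the result is again a Gaussian whose width parameter is the sum of the two input widths, which the code verifies on a grid.

```python
import math


def gaussian_2d(kx, ky, width2):
    """Normalized 2D Gaussian exp(-k^2 / width2) / (pi * width2)."""
    return math.exp(-(kx * kx + ky * ky) / width2) / (math.pi * width2)


def tmd_convolution(qx, qy, w_f, w_d, half_range=4.0, n=160):
    """Grid estimate of ∫d²k f(k) D(k - q), i.e. eq. (5.132) after using
    the delta function, with q = P_h⊥ / z and Gaussian model inputs."""
    h = 2.0 * half_range / n
    total = 0.0
    for i in range(n):
        kx = -half_range + (i + 0.5) * h
        for j in range(n):
            ky = -half_range + (j + 0.5) * h
            total += (gaussian_2d(kx, ky, w_f)
                      * gaussian_2d(kx - qx, ky - qy, w_d))
    return total * h * h


def gaussian_closed_form(qx, qy, w_f, w_d):
    """Analytic result for Gaussian inputs: the widths simply add."""
    return gaussian_2d(qx, qy, w_f + w_d)
```

For example, `tmd_convolution(0.3, -0.2, 0.25, 0.20)` and `gaussian_closed_form(0.3, -0.2, 0.25, 0.20)` should agree to high accuracy; the width values are in arbitrary units of momentum squared.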
5.3.4 Gauge-invariant definition for TMDs

Just as was the case in the previous section for DIS, our TMDs and FFs defined so far (equations (5.128)) are not gauge-invariant, and are only valid in the light-cone gauge $A^+ = 0$. Gauge invariance can be restored by inserting a Wilson line:
\[
\Phi(\xi, k_\perp, P) = \int \frac{d^3 r}{8\pi^3}\; e^{-i \xi P^+ r^- + i k_\perp\cdot r_\perp}\, \langle P|\, \bar\psi(r)\, U_{[r\,;\,0]}\, \psi(0)\, |P\rangle, \tag{5.133}
\]
where now the space-time point separation no longer lies on the light-cone, i.e., the Wilson line has to connect the point $(0^+, 0^-, 0_\perp)$ with the point $(0^+, r^-, r_\perp)$. But as we have seen in the Introduction, the Wilson line is path dependent, meaning different choices for the Wilson path give different results. How do we choose a path, or at least motivate our choice?

In the collinear case we could interpret the Wilson line as a color rotation on the quark, making it an eikonal quark. We have split the Wilson line into two parts at infinity using equation (5.28). This splitting had two advantages: first, we could associate a line with the quarks on each side of the cut diagram separately, and second, we could use easy Feynman rules (all Feynman rules we derived in Section 5.1.1 are formulated in terms of Wilson lines from a point to $\pm\infty$).
In the TMD definition, we would like to do something analogous. We add a light-like line to each quark:
\[
U^-_{[+\infty^-,\, 0_\perp\,;\; 0^-,\, 0_\perp]}\; \psi(0^+, 0^-, 0_\perp), \tag{5.134a}
\]
\[
\bar\psi(0^+, r^-, r_\perp)\; U^{-\dagger}_{[+\infty^-,\, r_\perp\,;\; r^-,\, r_\perp]} . \tag{5.134b}
\]
But because of the transverse separation we now have
\[
U_{[r\,;\,0]} \neq U^{-\dagger}_{[+\infty^-,\, r_\perp\,;\; r^-,\, r_\perp]}\; U^-_{[+\infty^-,\, 0_\perp\,;\; 0^-,\, 0_\perp]} .
\]
So we need a Wilson line to connect the transverse ‘gap’, i.e.,
\[
U_{[r\,;\,0]} = U^{-\dagger}_{[+\infty^-\,;\; r^-]}\; U^\perp_{[r_\perp\,;\; 0_\perp]}\; U^-_{[+\infty^-\,;\; 0^-]} .
\]
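We will split this line at $+\infty_\perp$ for the same reasons as before. Adding this to equations (5.134) gives
\[
U^\perp_{[+\infty^-,\, +\infty_\perp\,;\; +\infty^-,\, 0_\perp]}\; U^-_{[+\infty^-,\, 0_\perp\,;\; 0^-,\, 0_\perp]}\; \psi(0^+, 0^-, 0_\perp), \tag{5.135a}
\]
\[
\bar\psi(0^+, r^-, r_\perp)\; U^{-\dagger}_{[+\infty^-,\, r_\perp\,;\; r^-,\, r_\perp]}\; U^{\perp\dagger}_{[+\infty^-,\, +\infty_\perp\,;\; +\infty^-,\, r_\perp]}, \tag{5.135b}
\]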
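leading to the final definition for the gauge-invariant TMD correlator (see Figure 5.24):
\[
\Phi = \int \frac{d^3 r}{8\pi^3}\; e^{-i \xi P^+ r^- + i k_\perp\cdot r_\perp}\, \langle P|\, \bar\psi(r)\, \tilde U^\dagger_{[+\infty\,;\, r]}\, \tilde U_{[+\infty\,;\, 0]}\, \psi(0)\, |P\rangle, \tag{5.136a}
\]
\[
\tilde U_{[+\infty\,;\, 0]} = U^\perp_{[+\infty^-,\, +\infty_\perp\,;\; +\infty^-,\, 0_\perp]}\; U^-_{[+\infty^-,\, 0_\perp\,;\; 0^-,\, 0_\perp]}, \tag{5.136b}
\]
\[
\tilde U^\dagger_{[+\infty\,;\, r]} = U^{-\dagger}_{[+\infty^-,\, r_\perp\,;\; r^-,\, r_\perp]}\; U^{\perp\dagger}_{[+\infty^-,\, +\infty_\perp\,;\; +\infty^-,\, r_\perp]} . \tag{5.136c}
\]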
Fig. 5.24: Structure of the Wilson lines in the TMD definition. The path runs from $(0^+, 0^-, 0_\perp)$ along $n^-$ to $(0^+, +\infty^-, 0_\perp)$, along $n_\perp$ to $(0^+, +\infty^-, +\infty_\perp)$, and back to $(0^+, r^-, r_\perp)$.
What about the physical interpretation? Consider again the one-gluon exchange as depicted in Figure 5.15. We saw in equation (5.90) that the net contribution for a soft or collinear gluon is a factor
\[
g\int \frac{d^4 l}{16\pi^4}\; \frac{\not p - \not l}{(p - l)^2 + i\varepsilon}\; \not A \;\approx\; -g\int \frac{d^4 l}{16\pi^4}\; \frac{n\cdot A}{n\cdot l - i\varepsilon}, \tag{5.137}
\]
where $l^\mu$ is the momentum of the exchanged gluon and $n^\mu = p^\mu/|p|$ the direction of the outgoing quark. We were able to make this simplification because in the correlator this correction stands to the right of a factor $\bar u(p)$, such that we can make use of the fact that $\bar u(p)\, \not p = 0$:
\[
\bar u(p)\, \not A\, \not p = \bar u(p)\left(\not A\, \not p + \not p\, \not A\right) = 2\, \bar u(p)\; p\cdot A .
\]
As we saw before, this contribution calculated to all orders leads to the light-like Wilson line. In the collinear case this was the end of the story. But now that we are in the TMD case, we cannot simply take the exchanged gluon to be collinear; instead, we need to add a term to equation (5.137):
\[
g\int \frac{d^4 l}{16\pi^4}\; \frac{\not A\, \not l}{2 p\cdot l + l_\perp^2 - i\varepsilon} \;\approx\; g\int \frac{d^2 l_\perp}{4\pi^2}\; \frac{\gamma_\mu\, \not l_\perp}{l_\perp^2 - i\varepsilon}\; A_\perp^\mu(0^+, \infty^-, l_\perp) .
\]
It is not so straightforward to prove (see, e.g., [68]), but these parts will sum up to a transversal Wilson line. So in the end, inside the TMD we have both a resummation of (soft) collinear gluons, coming from the line parts $U^-$ and $U^{-\dagger}$, and a resummation of soft transversal gluons, coming from the $U^\perp$ and $U^{\perp\dagger}$ parts. Note however that by choosing an appropriate gauge, it is possible to cancel the contribution of one type of these lines, e.g. in the LC gauge only the transversal parts remain. Of course, the same reasoning can be repeated for the fragmentation function, but then the light-like Wilson lines will lie in the plus direction. This is illustrated in Figure 5.25a.

To end this chapter, we give an example for the use of Wilson lines in the Drell–Yan process. In this setup, two protons (or a proton and an antiproton) are collided and create a photon or weak boson by quark-antiquark annihilation. We thus need two TMDs, which are both in the initial state. The longitudinal part of the Wilson line used to make the TMD gauge-invariant represents a resummation of gluons connected to the parton struck from the other TMD. This is illustrated in Figure 5.25b. Because the Wilson line now represents initial-state radiation, the line structure will be
different. More specifically, the path will run towards $-\infty^-$ before returning, as shown in Figure 5.26. This has an important consequence: two out of the eight (unpolarized and polarized) TMDs are T-odd and will have a sign change with this line structure as compared to SIDIS. This would imply that TMDs are process-dependent, and not universal as they ought to be. However, so far it has not been experimentally verified whether these PDFs have a nonzero value. These days, a lot of effort is aimed at finding or excluding them, for the sake of TMD universality.

Fig. 5.25: (a) In SIDIS, the longitudinal Wilson line inside the fragmentation function represents a resummation of soft and collinear gluons connected to the incoming quark. (b) In Drell–Yan, the longitudinal Wilson line inside one TMD represents a resummation of soft and collinear gluons connected to the parton extracted from the other TMD.

Fig. 5.26: Structure of the Wilson lines in the Drell–Yan TMD definition. The path runs from $(0^+, 0^-, 0_\perp)$ to $(0^+, -\infty^-, 0_\perp)$, on to $(0^+, -\infty^-, +\infty_\perp)$, and back to $(0^+, r^-, r_\perp)$.
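The difference between the SIDIS and Drell–Yan line structures is purely geometrical, and it can be made concrete by listing the corner points of the staple-shaped path. The sketch below is only a bookkeeping aid under stated assumptions: infinity is replaced by a large finite `cutoff` (a hypothetical regulator), the transverse direction towards $+\infty_\perp$ is arbitrarily taken along the first transverse axis, and points are stored as `(minus, perp_1, perp_2)` at fixed $r^+ = 0$. No gauge field or path-ordered exponential is evaluated.

```python
def staple_path(r_minus, r_perp, direction=+1, cutoff=1.0e3):
    """Corner points of the Wilson-line path from (0-, 0_perp) to (r-, r_perp).

    direction=+1 gives the future-pointing staple of eq. (5.136) (SIDIS);
    direction=-1 gives the past-pointing staple used for Drell-Yan.
    `cutoff` stands in for infinity and is an assumption of this sketch.
    """
    if direction not in (+1, -1):
        raise ValueError("direction must be +1 or -1")
    far_minus = direction * cutoff
    far_perp = (cutoff, 0.0)  # assumed direction n_perp of +infinity_perp
    return [
        (0.0, 0.0, 0.0),                        # position of psi(0)
        (far_minus, 0.0, 0.0),                  # end of the U^- segment
        (far_minus, far_perp[0], far_perp[1]),  # end of the U^perp segment
        (far_minus, r_perp[0], r_perp[1]),      # back along U^perp-dagger
        (r_minus, r_perp[0], r_perp[1]),        # back along U^- dagger
    ]


sidis_path = staple_path(0.7, (0.2, -0.1), direction=+1)
drell_yan_path = staple_path(0.7, (0.2, -0.1), direction=-1)
```

Both paths share their end points and their transverse segments; only the sign of the light-like excursion differs, which is the feature responsible for the sign change of the T-odd functions mentioned above.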
A Mathematical vocabulary

A.1 General topology

Definition A.1 (Topological space). Let $X$ be a set and $\mathcal U$ a collection of subsets of $X$. Then $X$ is called a topological space if
1. the empty set and $X$ itself belong to $\mathcal U$;
2. $\mathcal U$ is closed with respect to finite intersections: $U_1, \ldots, U_N \in \mathcal U,\ N \in \mathbb N \;\Rightarrow\; \bigcap_{k=1}^{N} U_k \in \mathcal U$;
3. $\mathcal U$ is closed with respect to arbitrary (uncountably infinite as well) unions: $U_\alpha \in \mathcal U,\ \alpha \in A \;\Rightarrow\; \bigcup_{\alpha\in A} U_\alpha \in \mathcal U$.

The sets $U \in \mathcal U$ are open, their complements $X - U$ are closed in $X$. To define a topology means to say which subsets of $X$ are called open. From a topology we derive the concept of a neighborhood of a point $x \in X$.
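For a finite set, the three conditions of Definition A.1 can be checked mechanically. The following is a minimal sketch (the function name is hypothetical); it relies on the fact that for a finite collection, closure under pairwise intersections and pairwise unions already implies closure under all finite intersections and all unions.

```python
from itertools import combinations


def is_topology(X, opens):
    """Check Definition A.1 for a finite set X and a collection of subsets."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    # Condition 1: the empty set and X itself must be open.
    if frozenset() not in opens or X not in opens:
        return False
    # Every member must actually be a subset of X.
    if any(not U <= X for U in opens):
        return False
    # Conditions 2 and 3, reduced to pairs in the finite case.
    for U, V in combinations(opens, 2):
        if (U & V) not in opens or (U | V) not in opens:
            return False
    return True
```

For instance, on $X = \{1, 2, 3\}$ the collection $\{\emptyset, \{1\}, \{1,2\}, X\}$ passes the check, while $\{\emptyset, \{1\}, \{2\}, X\}$ fails because the union $\{1, 2\}$ is missing.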
Definition A.2 (Neighborhood). Let $x \in X$ be a point and $U$ be an open set which contains it. Then $U$ is called a neighborhood of $x$ in $X$.

If different topologies are defined on $X$, they can be compared with each other. A topology $\mathcal U_1$ is called stronger (finer) than a topology $\mathcal U_2$, which then is weaker (coarser), if $\mathcal U_2 \subset \mathcal U_1$ as collections of subsets. A topology on a space $X$ naturally induces a topology on each of its subsets, referred to as the induced topology.

Definition A.3 (Induced topology). Let $(X_1, \mathcal U_1)$, $(X_2, \mathcal U_2)$ be topological spaces, such that $X_2 \subset X_1$. The relative or subspace topology $\mathcal U_1|_{X_2}$ induced on $X_2$ is the one in which the open sets are the sets $U_1 \cap X_2$ with $U_1 \in \mathcal U_1$. A topological inclusion then reads $X_2 \to X_1$, given that the intrinsic topology $\mathcal U_2$ is stronger than the relative one ($\mathcal U_1|_{X_2} \subset \mathcal U_2$).

One of the properties of topologies most relevant for our purposes is the so-called Hausdorff property:

Definition A.4 (Hausdorff). A topological space $X$ is said to be Hausdorff if and only if for any two distinct points $x_1 \neq x_2$ there exist disjoint neighborhoods $U_1$, $U_2$ of $x_1$, $x_2$ respectively.

This property allows one to separate points in a given topological space. It becomes highly relevant when considering limits.
A.2 Topology and basis

Given a set $X$ and a collection $\mathcal U$ of its subsets, we wish to be able to define some operations like, e.g., differentiation. To this end, we have to introduce a number of properties, which are generated by choosing a topology on $X$. We start with the following lemma:

Lemma A.5. Let $X$ be a set and $(\mathcal U_\beta)_{\beta\in B}$ be a collection of topologies on $X$. Then $\mathcal T := \bigcap_{\beta\in B}\mathcal U_\beta$ is again a topology on $X$.

This topology can now be optimized by making use of the following proposition:

Proposition A.6. Let $X$ be a set and $\mathcal D \subseteq \mathcal P(X)$ be a collection of subsets of $X$. Then there exists a weakest topology on $X$, such that all subsets $U \in \mathcal D$ are open. That is to say, there exists a topology $\mathcal T$ such that:
1. every $U \in \mathcal D$ is open in $\mathcal T$;
2. if $\mathcal U$ is a topology on $X$, such that each $U \in \mathcal D$ is open in $\mathcal U$, then $\mathcal U$ is finer than $\mathcal T$.

Here $\mathcal P(X)$ represents the power set of $X$, i.e., the set of all subsets of $X$. However, Proposition A.6 does not provide us with an explicit method to determine the topology $\mathcal T$. Let us first consider a simpler case where the collection of sets $\mathcal D$ has an extra property.

Definition A.7 (Topology basis). Let $X$ be a set. A basis for a topology on $X$ is a collection $\mathcal B$ of subsets of $X$ with the following properties:
1. For every $x \in X$ there exists a $B \in \mathcal B$, such that $x \in B$.
2. If $B_1, B_2 \in \mathcal B$ and $x \in B_1 \cap B_2$, then there exists a $B_3 \in \mathcal B$ with $x \in B_3$ and $B_3 \subseteq (B_1 \cap B_2)$.

If $\mathcal T$ is the weakest topology on $X$, such that all $B \in \mathcal B$ are open in $\mathcal T$, then we call $\mathcal B$ a basis for $\mathcal T$, or we call $\mathcal T$ the topology generated by $\mathcal B$.
Proposition A.8. Let $\mathcal B$ be a basis for a topology on a set $X$ and let $\mathcal T$ be the topology generated by this basis. If $U \subseteq X$ then the following properties are equivalent:
1. $U$ is open in $\mathcal T$;
2. for each $x \in U$ there exists a $B \in \mathcal B$, such that $x \in B$ and $B \subseteq U$;
3. $U$ can be represented as a union of sets $B_\alpha$ from the collection $\mathcal B$.
It might happen that one needs to consider spaces which are equipped with a metric. A metric can be used to construct a topology for which the open balls form a basis. Let us first give the definition of a metric on a set.

Definition A.9 (Metric on a set). Let $X$ be a set. A metric on $X$ is a function $d : X \times X \to \mathbb R_{\geq 0}$ with the following properties:
1. $d(x_1, x_2) = 0$ if and only if $x_1 = x_2$;
2. symmetry: $d(x_1, x_2) = d(x_2, x_1)$, $\forall x_1, x_2 \in X$;
3. triangle inequality: $d(x_1, x_3) \leq d(x_1, x_2) + d(x_2, x_3)$, $\forall x_1, x_2, x_3 \in X$.

A set with a metric is called a metric space. A metric is called an ultra-metric if it satisfies the stronger version of the triangle inequality, where points can never fall between other points: $\forall x_1, x_2, x_3 \in X$, $d(x_1, x_3) \leq \max\left(d(x_1, x_2), d(x_2, x_3)\right)$. A metric $d$ on $X$ is called intrinsic if any two points $x_1, x_2 \in X$ can be joined by a curve with length arbitrarily close to $d(x_1, x_2)$. For sets on which the operation of addition ‘+’ is defined, $d$ is called a translation-invariant metric if $\forall x_1, x_2, a \in X$: $d(x_1, x_2) = d(x_1 + a, x_2 + a)$.

Let us now explicitly construct the topology induced by a metric. Define an open ball $B$ for $x_1 \in X$ and a real number $R > 0$,
\[
B(x_1, R) := \{x_2 \in X \mid d(x_1, x_2) < R\}, \tag{A.1}
\]
and a collection
\[
\mathcal B := \{B(x_1, R) \mid x_1 \in X,\ R > 0\}. \tag{A.2}
\]
It is easy to see that the balls in $\mathcal B$ obey the conditions of Definition A.7 and thus form the basis of a topology. A topological space $(X, \mathcal T)$ is called metrizable if there exists a metric on $X$ that induces $\mathcal T$. Such a space is Hausdorff.

In order to be able to construct a topology starting from a given collection of subsets, which do not necessarily obey the properties of a basis, we need to introduce the concept of a sub-basis.

Definition A.10 (Sub-basis of a topology). Let $X$ be a set. Then a sub-basis of a topology on $X$ is a collection $\mathcal S$ of subsets of $X$, such that $\bigcup_{S\in\mathcal S} S = X$.

Sub-bases can be used to construct a basis:

Proposition A.11. Let $\mathcal S$ be a sub-basis for a topology on $X$. Then define the collection $\mathcal B$ of subsets $B \subseteq X$ that can be presented as the intersection of a finite number of sets in the collection $\mathcal S$. That is to say, $B \in \mathcal B$ if and only if there exist $S_1, S_2, \ldots, S_n \in \mathcal S$ such that $B = S_1 \cap S_2 \cap \cdots \cap S_n$. Then $\mathcal B$ is a basis for a topology on $X$, and the topology generated by $\mathcal B$ is the weakest topology on $X$ in which each $S \in \mathcal S$ is open.
From Proposition A.11 it is now easy to construct a topology from a given collection of subsets. One just adds the set X to this given collection, so that this new collection becomes a sub-basis for a topology on X. Proposition A.11 then shows how to construct a basis and the weakest topology for which the original collection of sets is open.
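The construction just described can be carried out explicitly for a finite set. The sketch below (with a hypothetical function name) adds $X$ to the given collection to obtain a sub-basis, closes it under finite intersections to get a basis, and then forms all unions of basis sets. It enumerates unions by brute force, so it is only meant for small examples.

```python
from itertools import combinations


def generated_topology(X, collection):
    """Weakest topology on the finite set X in which every set of
    `collection` is open, built along the lines of Proposition A.11."""
    X = frozenset(X)
    # Adding X turns any collection into a sub-basis.
    sub_basis = {frozenset(S) for S in collection} | {X}
    # Basis: close the sub-basis under finite intersections.
    basis = set(sub_basis)
    grew = True
    while grew:
        grew = False
        for A, B in combinations(list(basis), 2):
            if (A & B) not in basis:
                basis.add(A & B)
                grew = True
    # Topology: all unions of basis sets (the empty union gives the empty set).
    topology = {frozenset()}
    for B in basis:
        topology |= {U | B for U in topology}
    return topology
```

Starting from the collection $\{\{1,2\}, \{2,3\}\}$ on $X = \{1,2,3\}$, the basis picks up the intersection $\{2\}$, and the generated topology is $\{\emptyset, \{2\}, \{1,2\}, \{2,3\}, X\}$.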
A.3 Continuity

Definition A.12 (Continuity). A function $F : X_1 \to X_2$ between the topological spaces $X_1$, $X_2$ is continuous if the pre-image $F^{-1}(V)$ of any set $V \subset X_2$ which is open in $X_2$ is also open in $X_1$. The pre-image is defined by
\[
F^{-1}(V) = \{x \in X_1 \,;\; F(x) \in V\} \tag{A.3}
\]
and does not require $F$ to be either an injection or a surjection.

Definition A.13 (Homeomorphism). If $F$ is a continuous bijection and also $F^{-1}$ is continuous, then $F$ is a homeomorphism or a topological isomorphism.

One can think of a homeomorphism as an isomorphism between topological spaces. Note that Definition A.12 for continuity is consistent with the usual definition of continuity in real calculus in the following way:

Corollary A.14. Let $(X_1, d_{X_1})$ and $(X_2, d_{X_2})$ be metric spaces. Let $F : X_1 \to X_2$ be a map. Then $F$ is continuous with respect to the metric topologies on $X_1$ and $X_2$ if and only if $\forall x \in X_1$, $\forall \epsilon > 0$, $\exists \delta > 0 : F(B(x, \delta)) \subseteq B(F(x), \epsilon)$. In other words, $F$ is continuous if and only if $\forall x \in X_1$, $\forall \epsilon > 0$, $\exists \delta > 0$ such that $\forall \xi \in X_1$: $d_{X_1}(x, \xi) < \delta \Rightarrow d_{X_2}(F(x), F(\xi)) < \epsilon$.

The property of continuity can be used to define a topology on products of sets.

Definition A.15 (Product topology). Let $\{X_\alpha\}_{\alpha\in A}$ be a collection of topological spaces. Consider the product set $Y := \prod_{\alpha\in A} X_\alpha$. The projection on the factor $X_\alpha$ reads $\mathrm{pr}_\alpha : Y \to X_\alpha$. The product topology on $Y$ is then the weakest topology on $Y$ wherein each of the projections $\mathrm{pr}_\alpha$ is continuous.

Imposing extra conditions on a map allows us to strengthen continuity to homeomorphism. Imposing other restrictions, one can define open and closed maps.

Definition A.16 (Open and closed map). Let $F : X_1 \to X_2$ be a map between two topological spaces. $F$ is an open map if for every open $U \subseteq X_1$ its image $F(U)$ is open in $X_2$. On the other hand, $F$ is a closed map if for every closed $U \subseteq X_1$ its image $F(U)$ is closed in $X_2$.

When discussing manifolds, one needs to define a specific map referred to as an embedding.

Definition A.17 (Embedding). A continuous map $G : X_1 \to X_2$ is an embedding if $G$ is injective and is a homeomorphism from $X_1$ to its image $G(X_1) \subset X_2$, where $G(X_1)$ is supplied with the subspace (induced) topology. An embedding possesses the following three properties:
1. $G$ is continuous;
2. $G$ is injective;
3. for every open $U \subseteq X_1$ there exists an open $V \subseteq X_2$ with $U = G^{-1}(V)$.
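For maps between finite topological spaces, Definition A.12 can be tested directly by computing pre-images as in (A.3). The sketch below represents a map as a Python dictionary; the function names are hypothetical.

```python
def preimage(F, domain, V):
    """F^{-1}(V) = {x in domain : F(x) in V}, cf. eq. (A.3)."""
    return frozenset(x for x in domain if F[x] in V)


def is_continuous(F, X1, opens1, opens2):
    """Definition A.12 for finite spaces: every open set of the target
    must have an open pre-image in the source."""
    opens1 = {frozenset(U) for U in opens1}
    return all(preimage(F, X1, frozenset(V)) in opens1 for V in opens2)
```

As an example, take the two-point spaces $X_1 = \{1, 2\}$ with open sets $\emptyset, \{1\}, X_1$ and $X_2 = \{a, b\}$ with open sets $\emptyset, \{a\}, X_2$. The map $1 \mapsto a$, $2 \mapsto b$ is continuous, while the swapped map $1 \mapsto b$, $2 \mapsto a$ is not, because the pre-image of $\{a\}$ is $\{2\}$, which is not open.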
A.4 Connectedness

Definition A.18 (Connected). A topological space $X$ is called disconnected if there exist nonempty open subsets $U_1, U_2 \subset X$, such that $U_1 \cap U_2 = \emptyset$ and $U_1 \cup U_2 = X$. A topological space is connected if it is not disconnected.

Notice that according to this definition the empty set is connected. By considering maps between topological spaces and restricting to continuous maps we get a generalization of the intermediate value theorem.

Proposition A.19 (Continuous image of a connected space is connected). Let $F : X_1 \to X_2$ be a continuous map. If $X_1$ is connected, then $F(X_1) \subseteq X_2$ is also connected.

The link with the usual intermediate value theorem can be made visible by considering a map $F : X \to \mathbb R$ on a connected space $X$. Suppose that this map is continuous. From Proposition A.19 we obtain that $F(X) \subseteq \mathbb R$ is connected. In other words, we find that if $x_1, x_2 \in X$, such that $F(x_1) < F(x_2)$, and $c$ is a real number, such that $F(x_1) < c < F(x_2)$, then there exists an $x_3 \in X$, such that $F(x_3) = c$.

Sometimes one needs to parameterize a space by its connected components, which can be considered as the equivalence classes induced by connectedness.

Definition A.20 (Connected components). Let $X$ be a topological space. The equivalence classes for the equivalence relation introduced by connectedness are called connected components of $X$.

One can see now that $X$ is the disjoint union of its connected components. To introduce path-connectedness we first need to define a path in a topological space.

Definition A.21 (Path and loop in a topological space). Let $X$ be a topological space. A path in $X$ is a continuous map $\gamma : [0, 1] \to X$, with $\gamma(0)$ being the initial point and $\gamma(1)$ being the terminal point or endpoint of the path. If $\gamma(0) = \gamma(1)$, then $\gamma$ is said to be a loop.

This definition can be used to define a path-connected topological space and a new equivalence relation introducing path-connected components. This reflects the naive idea that every two points in a path-connected component can be connected by a path which completely belongs to this component.

Definition A.22 (Path-connected components). Consider a topological space. The equivalence classes introduced by the above-mentioned equivalence relation are called its path-connected components.

Definition A.23 (Path-connected). A topological space $X$ is path-connected if every two points of $X$ are equivalent with respect to this equivalence relation.

Naturally we have the following relation between path-connectedness and connectedness:

Corollary A.24. A path-connected space is connected.

The concept of connectedness can be easily extended to more complicated spaces by means of the product topology (Definition A.15).

Definition A.25 (Connected products). Consider the topological spaces $X_1$ and $X_2$. Let $X_1 \times X_2$ possess the product topology. Then:
1. if $X_1$ and $X_2$ are connected, then $X_1 \times X_2$ is also connected;
2. if $X_1$ and $X_2$ are path-connected, then $X_1 \times X_2$ is also path-connected.

Corollary A.26. Consider the topological spaces $X_1, \ldots, X_n$. Then:
1. if each $X_i$ is connected, then $X_1 \times \cdots \times X_n$ is also connected;
2. if each $X_i$ is path-connected, then $X_1 \times \cdots \times X_n$ is also path-connected.
We also need the concept of a topological group.
Definition A.27 (Topological group). A topological group is a group G with a topology such that the maps
G × G → G, (g1, g2) → g1 g2 (multiplication), (A.4)
G → G, g → g⁻¹ (inverse), (A.5)
are continuous. The group elements are interpreted as points of the topological space. It is now easy to prove the following statements:
1. Left translation on G defined by ta : G → G, ta(g) = ag, ∀a ∈ G, is a homeomorphism of G.
2. A topological group G is Hausdorff if and only if the unit element e ∈ G is a closed point.
3. Let G0 ⊂ G be the connected component of G containing the unit element e. Then G0 is a subgroup of G.
4. If G′ ⊂ G is a subgroup that is open, then G′ is also closed.
Example A.28. Define the group
GL_2(ℝ) := \left\{ \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \,\middle|\, a_1, a_2, a_3, a_4 ∈ ℝ,\ a_1 a_4 − a_2 a_3 ≠ 0 \right\}. (A.6)
We can consider GL2(ℝ) as an open subset of ℝ4 with the Euclidean topology, which then induces a topology on GL2(ℝ). This is indeed a topological group. It possesses, moreover, the structure of a C∞ manifold, making it an example of a Lie group.
This group is not connected. To see this, let us consider the determinant map GL2(ℝ) → ℝ∗, which is continuous and surjective, while ℝ∗ = ℝ \ {0} is not connected. By the contrapositive of Proposition A.19, the group cannot be connected.
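The determinant argument can be made concrete with a small numerical sketch. It follows only one particular path, the straight segment in ℝ4 between the identity (determinant +1) and a reflection (determinant −1), and locates by bisection the parameter at which the determinant vanishes, i.e., where the segment leaves GL2(ℝ). The helper names det2, segment and find_singular_point are illustrative choices, not notation from the text, and the sketch illustrates the intermediate value theorem rather than proving that every path must cross the singular set.

```python
def det2(m):
    """Determinant a1*a4 - a2*a3 of a 2x2 matrix given as ((a1, a2), (a3, a4))."""
    (a1, a2), (a3, a4) = m
    return a1 * a4 - a2 * a3


def segment(a, b, t):
    """Point (1 - t) * a + t * b on the straight segment between matrices a and b."""
    return tuple(
        tuple((1.0 - t) * x + t * y for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    )


def find_singular_point(a, b, tol=1e-12):
    """Bisection on t in [0, 1] for a zero of t -> det2(segment(a, b, t))."""
    lo, hi = 0.0, 1.0
    f_lo = det2(segment(a, b, lo))
    if f_lo * det2(segment(a, b, hi)) >= 0:
        raise ValueError("endpoints must have determinants of opposite sign")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = det2(segment(a, b, mid))
        if f_mid == 0.0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


identity = ((1.0, 0.0), (0.0, 1.0))      # determinant +1
reflection = ((1.0, 0.0), (0.0, -1.0))   # determinant -1
t_star = find_singular_point(identity, reflection)
singular_det = det2(segment(identity, reflection, t_star))
```

Along this segment the determinant is 1 − 2t, so the crossing found by the bisection sits at t = 1/2.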
A.5 Local connectedness and local path-connectedness
Definition A.29 (Local connectedness). A topological space X is locally connected if for all x ∈ X and for each open neighborhood U1 of x there exists a connected open neighborhood U2 of x such that U2 ⊆ U1.
Definition A.30 (Local path-connectedness). A topological space X is locally path-connected if for all x ∈ X and each open neighborhood U1 of x there exists a path-connected open neighborhood U2 of x such that U2 ⊆ U1.
The link between connectedness and path-connectedness gets stronger in the local versions, as stated in the following proposition:
Proposition A.31. If X is locally path-connected, then X is also locally connected.
It is worth noticing that if a space is locally (path-)connected, it is not necessarily (path-)connected. The converse statement is also not always true.
A.6 Compactness
Consider a topological space X and define an open cover of X as a collection U = {Uα}α∈A of open subsets of X such that X = ⋃α∈A Uα. Given the cover U and a subset A′ ⊂ A, one defines an open subcover U′ = {Uα}α∈A′, provided that U′ is itself an open cover of X. Such covers allow us to determine whether a topological space is compact or not.
Definition A.32 (Compactness). A topological space X is called compact if each open cover U of X (a collection of open sets of X whose union is all of X) has a finite subcover.
Example A.33. A closed interval [a, b] ⊂ ℝ is compact in the Euclidean topology.
Alternative definitions of compactness, continuity and closedness can be given by using nets. Nets will help us to introduce the Tychonoff topology and Tychonoff’s theorem.
Definition A.34 (Partially ordered set). A (nonstrict) partial order is a binary relation ≤ over a set P which is reflexive, antisymmetric, and transitive, i.e., which, for all a1, a2, a3 ∈ P, possesses the properties
1. reflexivity: a1 ≤ a1;
2. antisymmetry: if a1 ≤ a2 and a2 ≤ a1, then a1 = a2;
3. transitivity: if a1 ≤ a2 and a2 ≤ a3, then a1 ≤ a3.
Definition A.35 (Directed set). A directed set (also a directed preorder or a filtered set) is a nonempty set A together with a reflexive and transitive binary relation ≤ having also the property that every pair of elements has an upper bound:
∀a1, a2 ∈ A, ∃ a3 ∈ A : a1 ≤ a3, a2 ≤ a3.
Definition A.36 (Net).
1. A net (xα) in a topological space X is a map α → xα from a partially ordered and directed index set A (with relation ≥) to X.
2. A net (xα) converges to x, denoted by limα xα = x, if for every open neighborhood U ⊂ X of x there exists α(U) ∈ A such that xα ∈ U for each α ≥ α(U). It is said then that (xα) is eventually in U.
3. A subnet (xα1(α2)) of a net (xα1) is defined by means of a map
A2 → A1, α2 → α1(α2) (A.7)
between partially ordered and directed index sets, such that for each α0 ∈ A1 there exists α2(α0) ∈ A2 with α1(α2) ≥ α0 for any α2 ≥ α2(α0) (one says that A2 is co-final for A1).
4. A net (xα) in a topological space X1 is called universal if for any subset X2 ⊂ X1 the net (xα) is eventually either only in X2 or only in X1 − X2.
The notions of closedness, continuity and compactness can be reformulated in terms of nets. One uses nets instead of sequences because Lemma A.37 does not hold when A = ℕ unless we are dealing with metric spaces.
Lemma A.37 (Closedness, continuity and compactness using nets).
1. A subset X2 of a topological space X1 is closed if for all convergent nets (xα) in X1 with xα ∈ X2 for all α the limit belongs to X2.
2. A function F : X1 → X2 between topological spaces is continuous if for all convergent nets (xα) in X1, the net (F(xα)) is convergent in X2.
3. A topological space X is compact if for all nets there is a convergent subnet. The limit point of the convergent subnet is then called a cluster (accumulation) point of the original net.
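For finite index sets, the axioms of Definitions A.34 and A.35 can be checked by brute force. The following sketch does this for divisibility on sets of positive integers; the function names and the chosen examples are illustrative assumptions, not taken from the text.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on a finite set."""
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(
        a == b or not (leq(a, b) and leq(b, a))
        for a, b in product(elements, repeat=2)
    )
    transitive = all(
        leq(a, c) or not (leq(a, b) and leq(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def is_directed(elements, leq):
    """Check that the set is nonempty and every pair has an upper bound in the set."""
    return len(elements) > 0 and all(
        any(leq(a, c) and leq(b, c) for c in elements)
        for a, b in product(elements, repeat=2)
    )


def divides(a, b):
    """a <= b in the divisibility order means that a divides b."""
    return b % a == 0


divisors_of_12 = [1, 2, 3, 4, 6, 12]   # 12 is an upper bound for every pair
truncated = [1, 2, 3]                  # 2 and 3 have no common multiple in this set

full_is_order = is_partial_order(divisors_of_12, divides)
full_is_directed = is_directed(divisors_of_12, divides)
truncated_is_order = is_partial_order(truncated, divides)
truncated_is_directed = is_directed(truncated, divides)
```

The second set is partially ordered but not directed, which shows that the upper-bound property in Definition A.35 is a genuinely additional requirement on an index set for nets.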
From the above it is seen that if a net converges in some topology, then it also converges in any weaker topology. Before continuing with the Tychonoff topology, we shall first make some useful statements. Proposition A.38. If F : X1 → X 2 is a continuous map and X1 is compact, then the image F(X1 ) ⊆ X2 is also compact.
Proposition A.39. If X1 is compact and X3 ⊆ X1 is closed in X1, then X3 is compact.
Proposition A.40. If X1 is Hausdorff and X3 ⊆ X1 is compact, then X3 is closed in X1.
Definition A.41 (Tychonoff topology). Let L be an index set and let Xl, l ∈ L, be topological spaces. The Tychonoff topology on the direct product
X∞ = ∏l∈L Xl
is the weakest topology such that all the projections
pl : X∞ → Xl, (xl′)l′∈L → xl (A.8)
are continuous. In other words, a net xα = (xlα)l∈L converges to x = (xl)l∈L if and only if xlα → xl for all l ∈ L, that is, pointwise (not necessarily uniformly) in L. Equivalently, the sets
pl⁻¹(Ul) = [∏l′≠l Xl′] × Ul,
with Ul open in Xl, are open and form a subbasis for the topology of X∞. That is, any open set can be obtained from these sets by finite intersections and arbitrary unions.
The definition of this topology is motivated by the following theorem.
Theorem A.1 (Tychonoff). Let L be an index set of arbitrary cardinality and suppose that for all l ∈ L a compact topological space Xl is defined. Then the direct product space
X∞ = ∏l∈L Xl
is a compact topological space in the Tychonoff topology.
As a consequence we observe that
Corollary A.42. A subset X ⊆ ℝn is compact if and only if X is closed and bounded.
A.7 Countability axioms and Baire theorem
In order to study the separation properties of topological spaces, we need to define the notion of countability.
Definition A.43 (Neighborhood basis). Let X be a topological space and x ∈ X. Let U = {Uα}α∈A be a collection of open neighborhoods of x. Then U is a neighborhood basis of x if for each open neighborhood V of x there exists an α such that Uα ⊆ V.
The first countability axiom, defining A1-spaces, then reads:
Definition A.44 (A1). A topological space X obeys the first countability axiom if every x ∈ X has a countable neighborhood basis. Such a topological space is called A1.
Note that every metric space is A1. This can be seen by considering open balls of radius 1/N, N ∈ ℕ. A stronger version of the above axiom is called the second countability axiom.
Definition A.45 (A2). A topological space X obeys the second countability axiom if there exists a countable basis for the topology on X. Then X is said to be A2.
Proposition A.46. Let X be an A2 topological space.
1. Each open cover of X has a countable subcover. A space with such a property is called a Lindelöf space.
2. There exists a countable subset of X which is dense (see Definition A.47 below) in X.
Notice that
A2 ⇒ A1. (A.9)
Definition A.47 (Denseness). Let A be a subset of a topological space X. It is said to be dense in X if every point x ∈ X either belongs to A or is a limit point of A (see Definition A.51).
Definition A.48 (Meagre subset, first Baire category). Let X be a topological space. We say that a subset U ⊆ X is nowhere dense if the interior of the closure of U is empty. We call U meagre if U is a countable union of nowhere dense subsets.
Meagre subsets have several important properties.
Proposition A.49. Let X be a topological space.
1. A subset U ⊆ X is nowhere dense if and only if the interior of the complement X \ U is dense in X.
2. A finite union of nowhere dense subsets is again nowhere dense.
3. A countable union of meagre subsets is again meagre.
Lemma A.50. The following properties of a topological space X are equivalent:
1. Every countable intersection of dense open sets is again dense in X.
2. If C1, C2, ⋅⋅⋅ are closed subsets of X with empty interiors, then their union ⋃l=1…∞ Cl also has an empty interior.
3. If U1 ⊆ X is a nonempty open subset, then U1 is not meagre.
4. If U2 ⊆ X is a meagre subset, then the complement X \ U2 is dense in X.
By the interior of a set C we mean the set of points x ∈ C that have an open neighborhood U ∋ x such that U ⊂ C. Spaces that have the above properties are called Baire spaces, and with them comes a theorem:
Theorem A.2 (Baire category theorem). Each compact Hausdorff space is a Baire space.
Baire referred to a meagre subset as a subset of the first Baire category and to a nonmeagre subset as being of the second Baire category. A Baire space is then a space in which all nonempty open sets belong to the second category.¹
A.8 Convergence
We shall now define the property of convergence of sequences in a topological space. This allows us to give meaning to limits of sequences.
Definition A.51 (Convergence and accumulation point). Let (xn)n∈ℕ be a sequence of elements in a topological space X and let ξ belong to X.
1. A sequence (xn) converges to ξ, or ξ is the limit of the sequence (xn), if for each open neighborhood U of ξ there exists an index N(U) such that xn ∈ U for all n ≥ N(U). In other words, a sequence is called convergent if it has a limit.
2. We call ξ an accumulation point of the sequence (xn) if for each open neighborhood U of ξ there exists an infinite number of indices n such that xn ∈ U.
One might expect that for a sequence to have a unique limit it suffices to prove that it converges, but in general this is not true. The space must also be Hausdorff for a sequence to have at most one limit.
Definition A.52 (Countable compactness). A topological space X is countably compact if each countable open cover
X = ⋃α∈A Uα
has a finite subcover.
Notice the difference with ordinary compactness, where every open cover needs a finite subcover and not only the countable ones. Moreover, we see that if X is compact, then it is also countably compact, but the converse is not always true.
Proposition A.53. A topological space X is countably compact if and only if each sequence (xn), n ∈ ℕ, possesses an accumulation point.
1 Note that here the notion of category has nothing to do with category theory; we mention this old terminology since it still occurs in the literature.
Definition A.54 (Sequential compactness). A topological space X is sequentially compact if each sequence in X has a convergent subsequence.
Lemma A.55. Let X be a topological space which is A1, ξ ∈ X, and (xn), n ∈ ℕ, be a sequence in X. The following two statements are equivalent:
1. the sequence (xn) has a subsequence that converges to ξ;
2. ξ is an accumulation point of the sequence (xn).
Theorem A.3. Let X be a topological space.
1. If X is sequentially compact, then X is countably compact.
2. If X is countably compact and A1, then X is also sequentially compact.
3. If X is countably compact and A2, then X is also compact.
Graphically this can be represented as follows:
compact ⇒ countably compact ⇐ sequentially compact,
compact ⇐ (+A2) countably compact (+A1) ⇒ sequentially compact.
Definition A.56 (Cauchy sequence and complete space). Let (X, d) be a metric space.
1. A sequence (xn), n ∈ ℕ, in X is called Cauchy if for all 𝜖 > 0 there exists an index N such that d(xm, xn) < 𝜖 for all m, n ≥ N.
2. The metric space is called complete if each Cauchy sequence in X converges.
It follows from the above that if a Cauchy sequence has a convergent subsequence, then the sequence itself is convergent. We also see that every metric space can be completed.
Definition A.57 (Totally bounded). A metric space (X, d) is called totally bounded if for all 𝜖 > 0 there exists a finite cover of X with open balls of radius 𝜖.
Lemma A.58. A totally bounded metric space (X, d) is A2.
Theorem A.4. If X is a metric space, then X is compact for the metric induced topology if and only if X is complete and totally bounded.
Definition A.59 (Uniform continuity). Let (X1, dX1) and (X2, dX2) be metric spaces. A map F : X1 → X2 is uniformly continuous if for all 𝜖 > 0 there exists δ > 0 such that for all x, x′ ∈ X1 with dX1(x, x′) < δ it holds that dX2(F(x), F(x′)) < 𝜖.
Theorem A.5. Let (X1, dX1) and (X2, dX2) be metric spaces. If X1 is compact, then every continuous map F : X1 → X2 is uniformly continuous.
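Total boundedness (Definition A.57) can be illustrated for the unit square [0, 1]² with the Euclidean metric. The sketch below builds a finite set of ball centers on a regular grid and then checks on a finite sample of points that each one lies in some open 𝜖-ball. The grid construction, the sample and the function names are assumptions made for this illustration, and the sampled check is only a sanity test, not a proof of covering.

```python
import math


def epsilon_net_unit_square(eps):
    """Centers of a k x k grid of cells, with k chosen so that each cell has side <= eps."""
    k = math.ceil(1.0 / eps)
    return [((i + 0.5) / k, (j + 0.5) / k) for i in range(k) for j in range(k)]


def is_covered(point, centers, eps):
    """True if the point lies in the open ball of radius eps around some center."""
    return any(
        math.hypot(point[0] - c[0], point[1] - c[1]) < eps for c in centers
    )


eps = 0.25
centers = epsilon_net_unit_square(eps)

# Deterministic sample of the square, including its boundary and corners.
samples = [(i / 37, j / 37) for i in range(38) for j in range(38)]
all_covered = all(is_covered(p, centers, eps) for p in samples)
```

For 𝜖 = 0.25 the grid has 16 centers, and any point of the square is at distance at most √2/8 ≈ 0.18 from the center of its cell, which is why the finite family of balls covers the square.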
A.9 Separation properties
We begin with the separation axioms. Roughly speaking, they are supposed to determine which basic objects can be separated² in a topological space.
Definition A.60 (Separation axioms). Let X be a topological space. It is said that X is:
1. T1: if all one-point sets {x} are closed in X.
2. T2: if X is Hausdorff.
3. T3: if for all points x ∈ X and for each closed subset X′ ⊂ X with x ∉ X′, there exist open neighborhoods U of x and U′ of X′ such that U ∩ U′ = ∅.
4. T4: if for each pair of closed sets Y1, Y2 ⊂ X with Y1 ∩ Y2 = ∅, there exist open neighborhoods U1 of Y1 and U2 of Y2 such that U1 ∩ U2 = ∅.
Definition A.61 (Regular and normal). A topological space is called regular if it is T1 and T3. It is called normal if it is T2 and T4.
It is clear that some of the axioms imply others.
Lemma A.62. Normal implies regular, regular implies Hausdorff, Hausdorff implies T1:
(T4 + T1) ⇒ (T3 + T1) ⇒ T2 ⇒ T1.
2 ‘T’ below refers to the German Trennung, i.e., separation.
Proposition A.63.
1. Each metric space is normal:
T1 + T3 + A2 ⇒ T1 + T4. (A.10)
2. If a metric space is compact and Hausdorff, then it is normal.
Lemma A.64 (Urysohn). Let X be normal. If A1 and A2 are disjoint closed subsets of X, then there exists a continuous map F : X → ℝ such that F(a1) = 0 for all a1 ∈ A1 and F(a2) = 1 for all a2 ∈ A2.
Theorem A.6 (Tietze). Let X be a normal space and X′ a closed subset of X. Suppose that there exists a continuous function FX′ : X′ → ℝ. Then there exists a continuous function F : X → ℝ such that F|X′ = FX′.
Theorem A.7 (Urysohn’s metrizability theorem). If a regular space X is A2, then it is metrizable. In other words, one can define a metric on X which induces the topology on X.
A.10 Local compactness and compactification The following statements will be important when considering Wilson lines which are allowed to go to infinity in the space-time manifold. Because infinity, strictly speaking, is not a part of the space-time manifold, one needs to introduce its compactification. Definition A.65 (Neighborhood of sets). Let A1 be a subset of a topological space X. Then a subset A2 ⊆ X is a neighborhood of A1 if A1 is contained in the interior of A2 . Definition A.66. A topological space X is called locally compact if for all x ∈ X there is a compact neighborhood.
For a locally compact Hausdorff space the following compactification is most convenient:
Theorem A.8 (Alexandroff compactification). Let X be a locally compact Hausdorff space. Then there exists a compact Hausdorff space X∗ and a point p ∈ X∗ such that X is homeomorphic to X∗ \ {p}. Moreover, the pair (X∗, p) is unique up to homeomorphism in the following sense: suppose that compact Hausdorff spaces X1∗ and X2∗ are given, together with points xi ∈ Xi∗ and homeomorphisms F1 : X → X1∗ \ {x1} and F2 : X → X2∗ \ {x2}. Then there is a unique homeomorphism F3 : X1∗ → X2∗ such that F3(x1) = x2 and F3 ∘ F1 = F2.
This compactification method is sometimes called the one-point compactification since one adds one point at infinity.³ An alternative compactification approach is given by the Stone–Čech compactification, which we do not discuss here.
A.11 Quotient topology
When we introduce an equivalence relation on a topological space, the question arises whether the set of equivalence classes has a topological structure. We shall now see that this set indeed has such a structure, the quotient topology. We start by investigating the connection between the following concepts:
1. equivalence relations on a set X;
2. partitions of a set X;
3. surjective maps X1 ↠ X2.
3 Note that this adds also an extra symmetry to the space. This is best seen in two dimensions. Adding one point at infinity turns the ‘plane’ into a Riemann sphere, a projective space that has conformal symmetry. This simple example demonstrates clearly that one has to be very careful when applying compactifications, otherwise an extra structure can be introduced, which is not necessarily wanted.
If there is an equivalence relation “∼” on X, then the equivalence classes form a partition of X. On the other hand, if there is a partition of X, then an equivalence relation is introduced by stating that two elements x1, x2 ∈ X are equivalent if they belong to the same subset of the partition. We conclude that there is a bijection:
equivalence relations on X ↔ partitions of X.
In order to see the relationship with surjective maps, suppose that there is an equivalence relation on X and denote the set of equivalence classes by X/∼. Then we get a map Q : X ↠ X/∼, which sends an element x ∈ X to its equivalence class. We also refer to this map Q as dividing out the equivalence relation. Conversely, we get an equivalence relation on X1 from a surjective map F : X1 ↠ X2 by calling two elements x1, x2 ∈ X1 equivalent if F(x1) = F(x2).
Definition A.67 (Quotient topology). Let X be a topological space and let ∼ be an equivalence relation on X. Then the quotient topology on X/∼ is defined as the finest topology for which the map Q : X → X/∼ is continuous.
Definition A.68 (Quotient map).
1. Let X1, X2 be topological spaces and P : X1 → X2 a surjection. The map P is a quotient map if a subset V ⊂ X2 is open in X2 if and only if P⁻¹(V) is open in X1.
2. If X1 is a topological space, X2 a set and P : X1 → X2 a surjection, then there exists a unique topology on X2 with respect to which P is a quotient map.
3. Let X be a topological space and let [X] be a partition of X. Let [x], x ∈ X, be the subset of X in the partition of X which contains x. Let us supply [X] with the quotient topology induced by the map [ ] : X → [X], x → [x]. Then [X] is called the quotient space of X.
It should be noticed that the requirement for P to be a quotient map is stronger than just being continuous. The latter would only require that P⁻¹(V) is open in X1 whenever V is open in X2 (but not the other way round). Quotient spaces naturally arise if a group action
λ : G × X → X, (g, x) → λg(x) := λ(g, x)
is given on a topological space X and one defines [x] := {λg(x), g ∈ G} to be the orbit of x. The orbits clearly define a partition of X.
Lemma A.69. Let X1 be a compact topological space, X2 a set and P : X1 → X2 a surjection. Then X2 is compact in the quotient topology.
Lemma A.70 (Hausdorff in quotient topology). Let X be a Hausdorff space and λ : G × X → X a continuous group action on X. Then the quotient space X/G := {[x], x ∈ X} defined by the orbits [x] = {λg(x), g ∈ G} is Hausdorff in the quotient topology.
Theorem A.9 (Equivariance). Let X1, X2 be topological spaces and let G be a group acting (not necessarily continuously) on them as λ, λ′ respectively. If F : X1 → X2 is a homeomorphism which is equivariant with respect to the actions λ, λ′ (that is, commuting with the group action), then F extends as a homeomorphism to the quotient spaces X1/G, X2/G in their quotient topologies.
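The statement that the orbits of a group action form a partition can be checked directly on a small finite example. In the sketch below the cyclic group of order four acts on a 3 × 3 grid of integer points by rotations through multiples of 90°; the set X, the action rotate and the helper orbit are hypothetical choices made for illustration, and no topology is involved.

```python
def rotate(point, k):
    """Action of the cyclic group of order 4: rotate the point by k quarter turns."""
    x, y = point
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


def orbit(point):
    """Orbit [x] = {rotate(x, k) : k = 0, 1, 2, 3} of a point under the action."""
    return frozenset(rotate(point, k) for k in range(4))


# The set being acted on: a 3 x 3 grid of integer points centered at the origin.
X = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]

# The set of orbits plays the role of the quotient X/G (as a set only).
quotient = {orbit(p) for p in X}

union_of_orbits = set().union(*quotient)
total_size = sum(len(o) for o in quotient)
```

Here there are three orbits: the fixed origin, the four edge midpoints and the four corners. Their sizes add up to the size of X, so they are pairwise disjoint and exhaust X.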
A.12 Fundamental group
Now we shall introduce the notion of a fundamental group, which is very relevant in the discussion of loops in a manifold. In what follows we define I as I = [0, 1] unless stated otherwise.
Definition A.71 (Homotopy). Let F0, F1 : X1 → X2 be two continuous maps between topological spaces. A homotopy from F0 to F1 is a continuous map F : X1 × I → X2 such that F(x, 0) = F0(x) and F(x, 1) = F1(x). If such a map exists, then F0 and F1 are said to be homotopy equivalent, F0 ≃ F1. Hence, there is a continuous deformation between the two maps.⁴
Lemma A.72. Homotopy is an equivalence relation on the set C(X1, X2) of continuous maps X1 → X2.
We can restrict homotopies further by imposing extra conditions.
Definition A.73 (Relative homotopy). Let X1, X2 be topological spaces and A ⊆ X1. Consider two continuous maps F0, F1 : X1 → X2 such that (F0)|A = (F1)|A. Then a homotopy from F0 to F1 relative to A is a continuous map F : X1 × I → X2 such that F(x, 0) = F0(x) and F(x, 1) = F1(x)
4 Note that homotopy also provides a set of intermediate functions, which occur in the study of renormalization-group flows and in relating vacuum expectation values or vacua in different frames or with different Hamiltonians.
for all x ∈ X1, and F(a, t) = F0(a) for all a ∈ A, t ∈ I.
Applying the definition of relative homotopy to loops with a fixed base point in a manifold (the set A then consists of the endpoints of I, which are both mapped to the base point), we find that two loops 𝛾0, 𝛾1 : I → X are homotopy equivalent if there exists a homotopy F : I × I → X relative to {0, 1}. This leads to the definition of a fundamental group.
Definition A.74 (Fundamental group). Let X be a topological space and x0 ∈ X a base point. Then we define π1(X, x0) as the set of homotopy classes of loops in X with base point x0. We call it the fundamental group of X with base point x0.
The group operation is defined as the composition of loops, and it can be proven that it induces a group structure on the homotopy equivalence classes. Moreover, it can be shown that
Proposition A.75 (The fundamental group is independent of the base point). Let X be a path-connected space and let x0, x0′ ∈ X. Then the groups π1(X, x0) and π1(X, x0′) are isomorphic.
Let us now investigate what happens to the fundamental groups if we consider maps between topological spaces. Suppose we have two topological spaces X, Y with base points x0, y0 and a continuous map F : X → Y such that y0 = F(x0). If now 𝛾 is a loop in X at x0, then F ∘ 𝛾 : I → Y is a loop in Y at y0. Assuming that 𝛾 is homotopy equivalent to 𝛾′, we find that F ∘ 𝛾 is homotopy equivalent to F ∘ 𝛾′.
Thus we have a map F∗ between fundamental groups
F∗ : π1(X, x0) → π1(Y, y0), [𝛾] → [F ∘ 𝛾]. (A.11)
Lemma A.76. Let X be a topological space with x0 ∈ X as a base point. Consider the continuous maps F1 : X → Y and F2 : Y → Z such that y0 := F1(x0) and z0 := F2(y0). Then
(F2 ∘ F1)∗ = F2∗ ∘ F1∗ : π1(X, x0) → π1(Z, z0).
Corollary A.77. Let X and Y be homeomorphic path-connected spaces. Then for each choice of base points x0 and y0
π1(X, x0) ≅ π1(Y, y0).
The following theorem supports Corollary A.77:
Theorem A.10. Let X be a topological space having x0 ∈ X as a base point. Let F1, F2 : X → Y be homotopic continuous maps and y0 := F1(x0), y1 := F2(x0). Take a homotopy F : X × I → Y from F1 to F2 and consider the path α in Y from y0 to y1 which reads α(t) := F(x0, t). If
a : π1(Y, y1) → π1(Y, y0)
is the isomorphism [𝛾] → [α⁻¹𝛾α], then the homomorphisms a ∘ F2∗ and F1∗, both mapping π1(X, x0) → π1(Y, y0), are equal.
Definition A.78 (Homotopy equivalence of spaces). A continuous map F : X1 → X2 is a homotopy equivalence if there exists a continuous map F′ : X2 → X1 such that F′ ∘ F is homotopic to idX1 and F ∘ F′ to idX2.
It should be mentioned that homotopy equivalence is a weaker notion than homeomorphism (every homeomorphism is a homotopy equivalence, but not conversely), and is closer to the notion of topological spaces as rubber objects.
Corollary A.79. Let X, Y be two path-connected spaces. If they are also homotopy equivalent, then the fundamental groups are isomorphic for every choice of the base points x0 and y0:
π1(X, x0) ≅ π1(Y, y0).
Definition A.80 (Simple connectedness). A path-connected space X is called simply connected if π1(X, x0) = 0.
Definition A.81 (Contractibility). A topological space X is called contractible if it is homotopy equivalent to a point.
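A standard example, not worked out in the text, is the punctured plane ℝ² \ {0}, whose fundamental group is isomorphic to ℤ: the homotopy class of a loop is labelled by its winding number around the origin. The sketch below estimates this integer numerically by accumulating the angle swept between consecutive sample points of a loop. The function names and the sample loops are illustrative assumptions, and the estimate is reliable only if the loop is sampled finely enough and stays away from the origin.

```python
import math


def winding_number(loop, samples=2000):
    """Estimate the winding number around the origin of a loop given as a map [0, 1] -> R^2."""
    total_angle = 0.0
    x_prev, y_prev = loop(0.0)
    for i in range(1, samples + 1):
        x, y = loop(i / samples)
        # Signed angle between consecutive position vectors (cross and dot product).
        total_angle += math.atan2(x_prev * y - y_prev * x, x_prev * x + y_prev * y)
        x_prev, y_prev = x, y
    return round(total_angle / (2.0 * math.pi))


def circle(n):
    """Loop running n times around the unit circle (clockwise for negative n)."""
    return lambda t: (math.cos(2 * math.pi * n * t), math.sin(2 * math.pi * n * t))


def offset_circle(t):
    """Unit circle centered at (3, 0); it does not enclose the origin."""
    return (3.0 + math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


w_once = winding_number(circle(1))
w_twice = winding_number(circle(2))
w_reversed = winding_number(circle(-1))
w_contractible = winding_number(offset_circle)
```

Loops with different winding numbers cannot be deformed into each other without crossing the origin, whereas the offset circle can be shrunk to a point inside the punctured plane and therefore represents the trivial class.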
A.13 Manifolds
Definition A.82 (Manifolds).
1. A topological space M is an m-dimensional Ck manifold if there is a family of pairs (UI, xI), I ∈ ℐ, consisting of an open cover of M and homeomorphisms xI : UI → xI(UI) ⊂ ℝm, such that for all I, J ∈ ℐ with UI ∩ UJ ≠ ∅ the map
φIJ := xJ ∘ xI⁻¹ : xI(UI ∩ UJ) → xJ(UI ∩ UJ)
is a Ck map between open subsets of ℝm. The sets UI are called charts, the functions xI coordinates, and the family of charts and coordinates forms an atlas.
2. Two atlases (UI, xI) and (VJ, yJ) for a topological space M are compatible if their union is again an atlas. Compatibility yields an equivalence relation on atlases, with an equivalence class being a differentiable Ck structure.
3. A topological space M is called a manifold with a boundary 𝜕M if each of the UI is homeomorphic to an open subset of the negative half-space H− = {x ∈ ℝm ; x1 ≤ 0}. The smoothness condition now demands that the φIJ are Ck on open subsets of ℝm including xI(UI ∩ UJ). The boundary points are mapped into 𝜕H− = {x ∈ ℝm ; x1 = 0}.
Definition A.83 (Diffeomorphism). A map ψ : M1 → M2 between two Ck-manifolds is called Ck if for all pairs of charts UI, VJ of atlases for M1, M2 for which ψ(UI) ∩ VJ ≠ ∅, the maps
ψIJ := xJ ∘ ψ ∘ xI⁻¹ : xI(UI) → xJ(VJ)
are Ck maps between open subsets of ℝm, ℝn respectively. If all the ψIJ are invertible and the inverses are Ck, then ψ is a Ck-diffeomorphism. The diffeomorphisms of a manifold form a group, denoted by Diff(M).
Definition A.84 (Paracompactness). One calls an atlas (UI, xI) locally finite if every x ∈ M has an open neighborhood intersecting only a finite number of the charts. A manifold M is said to be paracompact if each atlas (UI, xI) allows a locally finite refinement (VJ, yJ), where every VJ is included in some UI.
Definition A.85 (Submanifold). Let N be a subset of an m-dimensional manifold M. Let us equip N with a manifold structure by making use of the induced topology and an induced (subspace) differentiable structure given, for an atlas (UI, xI) for M, by the atlas (VI = N ∩ UI, yI = (xI)|VI) for N. We thus obtain a differentiable structure provided that the maps φIJ = yJ ∘ yI⁻¹ for VI ∩ VJ ≠ ∅ have constant rank n.
Definition A.86 (Immersion and embedding). Let M′ be an n-dimensional manifold and let ψ : M′ → M be Ck. Then ψ is called a local immersion if every x′ ∈ M′ possesses an open neighborhood V such that V → ψ(V) is an injection. If ψ is a global immersion, i.e., M′ → ψ(M′) is an injection, then ψ is called an embedding. If, in addition, for every V open in M′ the set ψ(V) is open in the subset topology induced from M, then ψ is said to be a regular embedding. In the latter case one says that M′ is an embedded submanifold of M. An embedded submanifold of dimension n = m − 1 is a hypersurface.
Definition A.87 (Orientability). A manifold M is orientable if there exists an atlas such that for all y ∈ [UI ∩ UJ]
det [𝜕xJ(y)/𝜕xI(y)] > 0. (A.12)
If M has a boundary, then the orientation of M induces an orientation on 𝜕M.
Definition A.88 (Smoothness and real analyticity).
– A manifold is smooth if it is C∞.
– A manifold is real analytic or Cω if the maps φIJ are real analytic.
– A manifold of real dimension 2m is complex analytic or a holomorphic manifold of complex dimension m if the maps φIJ = zJ ∘ zI⁻¹ : ℂm → ℂm satisfy the Cauchy–Riemann equations, where (xI, yI) → zI = xI + iyI is the standard isomorphism between ℝ2m and ℂm.
A.14 Differential calculus
Several differential objects can be defined on a manifold.
Definition A.89 (Smooth function). A smooth function on a manifold M is a map F : M → ℂ such that F ∘ xI⁻¹ is smooth on xI(UI) ⊂ ℝm.
Definition A.90 (Vector field). A smooth vector field on M is a derivation on C∞(M), that is, a linear map
v : C∞(M) → C∞(M), F → v(F),
which obeys the Leibniz rule
v(F1 F2) = v(F1)F2 + F1 v(F2)
and annihilates constants. Given an atlas (UI, xI), we can define special vector fields ∂^I_μ on UI as obeying the condition
∂^I_μ(x_I^ν)(p) = δ_μ^ν
for p ∈ UI, where
x_I(p) = (x_I^1(p), ⋅⋅⋅, x_I^m(p)) ∈ ℝm.
This allows us to represent a vector field v in the form
v(p) = v_I^μ[x_I(p)] ∂^I_μ(p),
where summation over the repeated index μ is assumed. The Leibniz rule induces the chain rule, so that
v_I^μ[x_I(p)] ∂^I_μ(p) = v_J^μ[x_J(p)] ∂^J_μ(p), (A.13)
if p ∈ [UI ∩ UJ], x_J(p) = φIJ(x_I(p)). With the aid of this definition we can investigate the action of a vector field on a smooth function:
v(F) = v_I^μ ∂_μ[F_I(x)], x = x_I(p), (A.14)
where F_I = F ∘ x_I⁻¹. In what follows, the space of smooth vector fields on M is denoted by T¹(M).
Definition A.91 (Contravariant vector). A tangent or contravariant vector X at a point p assigns to each coordinate patch (U, x) containing p an n-tuple of real numbers
(X_U^i) = (X_U^1, . . . , X_U^n), (A.15)
such that if p ∈ [U1 ∩ U2], then the coefficients of the contravariant vector transform as
X_{U_2}^i = \sum_j \frac{\partial x_{U_2}^i}{\partial x_{U_1}^j}(p)\, X_{U_1}^j, (A.16)
or, in matrix form,
X_{U_2} = C_{U_2 U_1} X_{U_1}, (A.17)
where C_{U_2 U_1} is called the transition function. In a local coordinate system tangent vectors can be defined as first-order differential operators
X_p = \sum_j X^j \left.\frac{\partial}{\partial x^j}\right|_p. (A.18)
Definition A.92 (Linear functional, covector). A real linear functional α on a vector space E is a real-valued map α : E → ℝ from E to the one-dimensional vector space ℝ. The condition of linearity holds for real numbers a1, a2 and vectors v1, v2:
α(a1 v1 + a2 v2) = a1 α(v1) + a2 α(v2). (A.19)
These linear functionals are also called covectors, covariant vectors, or one-forms. With the aid of local coordinates and a basis one gets
α = \sum_j a_j(x)\, dx^j, (A.20)
with the basis one-forms transforming as
dx_{U_2}^i = \sum_j \frac{\partial x_{U_2}^i}{\partial x_{U_1}^j}\, dx_{U_1}^j, (A.21)
so that the coefficients transform as
a_i^{U_2} = \sum_j a_j^{U_1} \frac{\partial x_{U_1}^j}{\partial x_{U_2}^i}. (A.22)
Notice the difference in this transformation rule with the transformation rule for contravariant vectors. This difference extends also to tensors, so that the transformation rule can be used to determine its contravariant and covariant rank.
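The two transformation rules can be compared numerically. In the sketch below the patch U1 carries Cartesian coordinates (x, y) and U2 polar coordinates (r, θ) on the plane without the origin, which is an assumed example rather than one from the text. Vector components are transformed with the Jacobian ∂(r, θ)/∂(x, y) as in (A.16), covector components with the inverse Jacobian ∂(x, y)/∂(r, θ) as in (A.22), and the pairing Σ a_i X^i is then found to be the same in both patches. All function names and numerical values are illustrative.

```python
import math


def jacobian_polar_from_cartesian(x, y):
    """Matrix of partial derivatives d(r, theta)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]


def jacobian_cartesian_from_polar(r, theta):
    """Matrix of partial derivatives d(x, y)/d(r, theta) at the point (r, theta)."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def transform_vector(jac, comps):
    """Contravariant rule: new X^i = sum_j (d new^i / d old^j) * old X^j."""
    return [sum(jac[i][j] * comps[j] for j in range(2)) for i in range(2)]


def transform_covector(inv_jac, comps):
    """Covariant rule: new a_i = sum_j old a_j * (d old^j / d new^i)."""
    return [sum(comps[j] * inv_jac[j][i] for j in range(2)) for i in range(2)]


# A point of the overlap, in both coordinate systems.
x, y = 1.0, 2.0
r, theta = math.hypot(x, y), math.atan2(y, x)

X_cart = [0.3, -1.1]   # vector components in the Cartesian patch
a_cart = [2.0, 0.5]    # covector components in the Cartesian patch

X_pol = transform_vector(jacobian_polar_from_cartesian(x, y), X_cart)
a_pol = transform_covector(jacobian_cartesian_from_polar(r, theta), a_cart)

pairing_cart = sum(a * v for a, v in zip(a_cart, X_cart))
pairing_pol = sum(a * v for a, v in zip(a_pol, X_pol))
```

The invariance of the pairing reflects the fact that the two Jacobian matrices are inverse to each other, which is exactly why vectors and covectors must transform differently.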
Definition A.93 (Dual space). The collection of all linear functionals α on a vector space E forms another vector space E∗, which is the dual space to E:
$$(\alpha_1 + \alpha_2)(v_1) := \alpha_1(v_1) + \alpha_2(v_1), \tag{A.23}$$
$$(c\,\alpha_1)(v_1) := c\,\alpha_1(v_1), \tag{A.24}$$
where α1, α2 ∈ E∗, v1 ∈ E, c ∈ ℝ.
Definition A.94 (Tangent bundle). The tangent bundle TM to a differentiable manifold M is defined as the collection of all tangent vectors at all points of M.
Now we are in a position to introduce the important concepts of interior and exterior products.
Definition A.95 (Interior product). Let v be a vector and α be a p-form. Their interior product, the (p − 1)-form $i_v \alpha$, is defined as
$$\begin{aligned} i_v \alpha^0 &= 0 && \text{if } \alpha \text{ is a 0-form},\\ i_v \alpha^1 &= \alpha^1(v) && \text{if } \alpha \text{ is a 1-form},\\ i_{v_1} \alpha^p (v_2, \ldots, v_p) &= \alpha^p(v_1, v_2, \ldots, v_p) && \text{if } \alpha \text{ is a } p\text{-form}. \end{aligned} \tag{A.25}$$
It is evident that $i_{v_1 + v_2} = i_{v_1} + i_{v_2}$ and $i_{av} = a\, i_v$.
Definition A.96 (Exterior or wedge product and exterior algebra). The exterior algebra ⋀(V) over a vector space V over a field K is defined as the quotient algebra of the tensor algebra T(V) by the two-sided ideal I generated by all elements of the form x ⊗ x with x ∈ V: ⋀(V) := T(V)/I. The exterior product ∧ of two elements of ⋀(V) is defined by α1 ∧ α2 = α1 ⊗ α2 (mod I), or, for α1, α2 ∈ V, α1 ∧ α2 = α1 ⊗ α2 − α2 ⊗ α1. The algebra associated with this product is the exterior algebra on M, $\bigwedge^n M$. It is constructed from the vector space of one-forms on M.
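Theorem A.97 (Interior product is an antiderivation). We say that
$$i_v : \bigwedge\nolimits^{p} \to \bigwedge\nolimits^{p-1}$$
is an antiderivation. That is to say,
$$i_v\left(\alpha_1^{p_1} \wedge \alpha_2^{p_2}\right) = \left[i_v \alpha_1^{p_1}\right] \wedge \alpha_2^{p_2} + (-1)^{p_1}\, \alpha_1^{p_1} \wedge \left[i_v \alpha_2^{p_2}\right]. \tag{A.26}$$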
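For two one-forms the antiderivation rule (A.26) can be verified by hand or by machine. The sketch below is a hedged illustration: it represents one-forms on ℝ³ by their component lists, uses the convention α ∧ β = α ⊗ β − β ⊗ α of Definition A.96, and compares the contraction of the resulting two-form with the right-hand side of (A.26) for p1 = 1. The function names and the sample components are assumptions made for the example.

```python
def wedge(alpha, beta):
    """Components (alpha ^ beta)_ij = alpha_i beta_j - alpha_j beta_i of a two-form."""
    n = len(alpha)
    return [[alpha[i] * beta[j] - alpha[j] * beta[i] for j in range(n)]
            for i in range(n)]

def interior_one_form(v, alpha):
    """i_v alpha = alpha(v) for a one-form alpha."""
    return sum(vi * ai for vi, ai in zip(v, alpha))

def interior_two_form(v, omega):
    """(i_v omega)_j = sum_i v^i omega_ij, i.e. omega(v, .)."""
    n = len(v)
    return [sum(v[i] * omega[i][j] for i in range(n)) for j in range(n)]

# Hypothetical sample data.
v = [1, -2, 3]
alpha = [2, 0, 1]
beta = [-1, 4, 5]

# Left-hand side of (A.26): contract the two-form alpha ^ beta with v.
lhs = interior_two_form(v, wedge(alpha, beta))

# Right-hand side of (A.26) with p1 = 1: (i_v alpha) beta - alpha (i_v beta).
rhs = [interior_one_form(v, alpha) * b - interior_one_form(v, beta) * a
       for a, b in zip(alpha, beta)]

print(lhs, rhs)
```

The minus sign in the second term is the factor (−1)^{p1} with p1 = 1.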
Definition A.98 (Differential of a map). Let φ : M → M′ be a smooth map and φ(x) = x′. The differential φ_* is defined as the map between the tangent spaces φ_* : T_x M → T_{x′} M′ such that φ_*(v_x) = v_{x′}, where v_x ∈ T_x M and v_{x′} ∈ T_{x′} M′ are elements of the tangent spaces at x and x′.
Definition A.99 (Pullback). Let φ : M → M′ be a smooth map and φ(x) = x′. Let φ_* : T_x M → T_{x′} M′ be the differential of φ. The pullback φ^* is the linear transformation turning covectors at x′ into covectors at x: φ^* : M′^*(x′) → M^*(x), so that φ^*(β)(v) := β(φ_*(v)) for all covectors β at x′ and vectors v at x.
Definition A.100 (Push-forward). Let φ : M → M′ be a smooth map and let X be a vector field on M. A section of φ^*TM′ over M is called a vector field along φ. Applying the differential (Definition A.98) point-wise to X yields the push-forward φ_*X, which is a vector field along φ, i.e., a section of φ^*TM′ over M.
Any vector field X′ on M′ defines a pullback section φ^*X′ of φ^*TM′ with (φ^*X′)_x = X′_{φ(x)}. A vector field X on M and a vector field X′ on M′ are φ-related if φ_*X = φ^*X′ as vector fields along φ. In other words, for all x ∈ M, dφ_x(X) = X′_{φ(x)}. Using this language we can give an alternate definition for an immersion.
Definition A.101 (Immersion). A smooth map of manifolds φ : M → M′ is an immersion, and φ(M) is an immersed submanifold, if φ_* : T_x M → T_{φ(x)} M′ is one-to-one, that is, ker φ_* = 0 for all x ∈ M.
Definition A.102 (Support of a function). Suppose that F : M → ℝ is a real-valued continuous function. From the definition of continuity it follows that the inverse image of each open set of ℝ is open in M. The set of nonzero real numbers forms an open subset of ℝ, so that the subset of M where F ≠ 0, namely F^{-1}(ℝ ∖ {0}), is an open subset of M. The closure of this set is called the support of F.
Definition A.103 (Bump function). Let p be a point in an n-dimensional manifold M. One can construct an n-form with the support contained in an open 𝜖-ball around p:
$$\omega^n := f(\|x\|)\, dx^1 \wedge \cdots \wedge dx^n \quad \text{inside the ball}, \qquad \omega^n := 0 \quad \text{outside of the ball}.$$
If n = 0, this n-form is called a bump function.
Definition A.104 (Partition of unity). Let M be an n-dimensional manifold that can be covered by a finite number of coordinate patches {Uα}. Then a partition of unity subordinate to this covering consists of real-valued differentiable functions Fα : M → ℝ, one for each patch, such that for all x and α
1. Fα ≥ 0;
2. ∑α Fα(x) = 1;
3. the support of Fα is a closed subset of the patch Uα.
One can see that such a partition always exists. Note that if a manifold is compact, then each open cover of this manifold has a finite sub-cover, which permits a partition of unity.
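The three conditions of Definition A.104 can be made concrete on the real line. The sketch below is an illustration under stated assumptions: the manifold is M = ℝ, the cover consists of the two patches U1 = (−∞, 1) and U2 = (−1, ∞), and the functions are built from the standard smooth function exp(−1/t). These choices and the names `h`, `smooth_step`, and `partition` are hypothetical and not taken from the text.

```python
import math

def h(t):
    """Smooth function that vanishes for t <= 0 and is positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(x, lo=-0.5, hi=0.5):
    """Smooth function equal to 0 for x <= lo and to 1 for x >= hi."""
    up, down = h(x - lo), h(hi - x)
    return up / (up + down)  # the denominator is positive for every x

def partition(x):
    """Return (F1(x), F2(x)) subordinate to U1 = (-inf, 1), U2 = (-1, inf).

    supp F1 = (-inf, 0.5] lies in U1 and supp F2 = [-0.5, inf) lies in U2.
    """
    f2 = smooth_step(x)
    return 1.0 - f2, f2

samples = [-3.0, -0.5, -0.1, 0.0, 0.3, 0.5, 4.0]
for x in samples:
    f1, f2 = partition(x)
    print(x, f1, f2, f1 + f2)
```

Both functions are nonnegative, they sum to one at every point, and each vanishes outside a closed subset of its own patch.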
A.15 Stokes’ theorem
Many derivations in loop space depend heavily on Stokes’ theorem.
Theorem A.105 (Stokes’ theorem). Let X be an oriented n-dimensional C² manifold. Let ω be a C¹ (n − 1)-form on X. Suppose that ω possesses compact support. Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
Note that this theorem can be generalized to ω which have almost compact support. More relevant for our purpose is Stokes’ theorem for loops with derivative discontinuities (such as angles, intersections, etc.). Under certain circumstances Stokes’ theorem is still valid. In order to proceed, we need the concept of negligible subsets.
Definition A.106 (Negligible subset). Let S be a closed subset of ℝⁿ. We call S negligible for X if there exists an open neighborhood U of S in ℝⁿ, a fundamental (Cauchy) sequence of open neighborhoods {U_k} of S in U, with the closure $\bar{U}_k \subset U$, and a sequence of C¹ functions {g_k}, such that
1. 0 ≤ g_k ≤ 1, g_k = 0 for x in some open neighborhood of S, and g_k = 1 for x ∉ U_k;
2. if ω is an (n − 1)-form of class C¹ on U, and μ_k is the measure associated with dg_k ∧ ω on X ∩ U, then μ_k is finite for large k, and
$$\lim_{k \to \infty} \mu_k(U \cap X) = 0.$$
We can now formulate Stokes’ theorem with singularities.
Theorem A.107 (A version of Stokes’ theorem). Let X be an oriented n-dimensional C³ submanifold without boundary in ℝⁿ. Let ω be a C¹ (n − 1)-form on an open neighborhood of the closure $\bar{X}$ in ℝⁿ, with compact support. Suppose that:
1. if S is the set of singular points in the boundary $\bar{X} - X$, then S ∩ supp ω is negligible for X;
2. the measures associated with |dω| on X, and |ω| on ∂X, are finite.
Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
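A simple instance of a boundary with corner singularities is the unit square in the plane. The sketch below is a hedged numerical check of the statement ∫_X dω = ∫_∂X ω for the one-form ω = −y² dx + xy dy, for which dω = 3y dx ∧ dy. The choice of ω, the midpoint quadrature, and the function names are assumptions made for this example; the four corners play the role of the singular set S.

```python
def P(x, y):
    """Coefficient of dx in omega = P dx + Q dy."""
    return -y * y

def Q(x, y):
    """Coefficient of dy in omega = P dx + Q dy."""
    return x * y

def d_omega(x, y):
    """Coefficient dQ/dx - dP/dy = y + 2y of dx ^ dy in d(omega)."""
    return 3.0 * y

def integrate_interior(n=200):
    """Midpoint rule for the integral of d(omega) over the unit square."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += d_omega((i + 0.5) * step, (j + 0.5) * step)
    return total * step * step

def integrate_boundary(n=200):
    """Midpoint rule for omega along the counterclockwise boundary."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            t = (k + 0.5) / n
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            total += P(x, y) * dx + Q(x, y) * dy
    return total

print(integrate_interior(), integrate_boundary())
```

Both integrals equal 3/2 analytically: the right edge contributes 1/2, the top edge contributes 1, and the other two edges contribute nothing.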
A.16 Algebra: Rings and modules
Definition A.108 (Monoid). A monoid (semi-group with unit) is a pair containing a set M and a binary operation ‘⋅’ which satisfies the following axioms for all a1, a2, a3 ∈ M:
1. closure: a1 ⋅ a2 ∈ M;
2. associativity: (a1 ⋅ a2) ⋅ a3 = a1 ⋅ (a2 ⋅ a3);
3. identity element: there exists an element e ∈ M, such that (a1 ⋅ e) = (e ⋅ a1) = a1.
Definition A.109 (Ring). A ring is defined as a set R with two binary operations ‘+’ and ‘⋅’, called addition and multiplication, which map every pair of elements of R to a unique element of R. These operations satisfy the following properties for all a1, a2, a3 ∈ R:
– Addition is Abelian:
1. associativity: (a1 + a2) + a3 = a1 + (a2 + a3);
2. existence of zero 0 ∈ R: 0 + a1 = a1;
3. commutativity: a1 + a2 = a2 + a1;
4. existence of inverse element: ∃ −a1 ∈ R | a1 + (−a1) = (−a1) + a1 = 0.
– Multiplication is associative:
1. (a1 ⋅ a2) ⋅ a3 = a1 ⋅ (a2 ⋅ a3).
– Multiplication distributes over addition:
1. a1 ⋅ (a2 + a3) = (a1 ⋅ a2) + (a1 ⋅ a3),
2. (a1 + a2) ⋅ a3 = (a1 ⋅ a3) + (a2 ⋅ a3).
Definition A.110 (Field). A field F consists of a set and two composition operations:
$$+ : F \times F \to F, \quad (a_1, a_2) \mapsto a_1 + a_2, \tag{A.27}$$
$$\times : F \times F \to F, \quad (a_1, a_2) \mapsto a_1 a_2, \tag{A.28}$$
called addition and multiplication. They have the following properties:
1. (F, +) is an Abelian group;
2. (F, ×) is associative and commutative, turning F ∖ {0} into a group. The identity element is denoted by 1;
3. distributivity: (a1 + a2)a3 = a1a3 + a2a3.
Definition A.111 (Vector space). A vector space V over a field F consists of a set and two composition operations:
$$+ : V \times V \to V, \quad (v_1, v_2) \mapsto v_1 + v_2, \tag{A.29}$$
$$\times : F \times V \to V, \quad (c, v_1) \mapsto c\,v_1, \tag{A.30}$$
called addition and scalar multiplication. They have the following properties for all a1, a2 ∈ F and v, v1, v2 ∈ V:
1. (V, +) is an Abelian group;
2. scalar multiplication is associative with multiplication in F:
$$(a_1 a_2)v = a_1(a_2 v); \tag{A.31}$$
3. the element 1 is an identity: 1v = v;
4. double distributivity: (a1 + a2)v1 = a1v1 + a2v1 and a1(v1 + v2) = a1v1 + a1v2.
Sometimes the notion of a vector space does not suffice. A useful extension is provided by the concept of modules. A module over a ring generalizes the notion of a vector space over a field, with the scalars now being the elements of an arbitrary ring instead of a field. Modules also generalize the concept of Abelian groups, which are modules over the ring of integers.
Definition A.112 (Module). A left K-module M over a ring K is defined to contain an Abelian group (M, +) and an operation K × M → M, such that for all k, k′ ∈ K and x, x′ ∈ M
1. k(x + x′) = kx + kx′;
2. (k + k′)x = kx + k′x;
3. (kk′)x = k(k′x);
4. 1_K x = x.
A right module can be defined analogously.
A.17 Algebra: Ideals
Now that we have introduced the concept of a ring, we can define the ideal of a ring, which becomes relevant when considering homomorphisms and algebra morphisms. The definition of an ideal is given in the main text of the book. The following statements hold:
Theorem A.113 (Kernel is a subring). Suppose that we have a ring homomorphism φ : K1 → K2. Then the kernel of φ is a subring of K1.
Theorem A.114 (Kernel is an ideal). Suppose that we have a ring homomorphism φ : K1 → K2. Then the kernel of φ is an ideal of K1.
Definition A.115 (Co-kernel). Suppose that we have a K-module homomorphism F : A1 → A2. The co-kernel is then defined as the quotient group A2/Im(F). Hence, F is injective if and only if its kernel is 0, and surjective if and only if its co-kernel is 0.
Definition A.116 (Prime ideal [29]). Let K be a ring. An ideal p of K is called prime if p ≠ K and a1a2 ∈ p ⇒ a1 ∈ p or a2 ∈ p.
Definition A.117 (Maximal ideal). An ideal p_m in K is called maximal if it is maximal among the ideals which are strictly smaller than the ring itself (proper ideals). Hence, p_m is maximal if and only if K/p_m is nonzero and has no proper nonzero ideals. For a commutative ring K, the quotient K/p_m is then a field.
Definition A.118 (Zero divisor). A zero divisor a1 ∈ K is an element of the ring which is both a left and a right zero divisor:
∃ a2 ≠ 0 ∈ K : a1 ⋅ a2 = 0 (left zero divisor),
∃ a3 ≠ 0 ∈ K : a3 ⋅ a1 = 0 (right zero divisor).
Definition A.119 (Nonzero divisor). An element a1 ∈ K is a nonzero divisor if a1 ⋅ a2 ≠ 0 for all a2 ≠ 0. The element a1 is a unit if there exists another element a2, such that a1 ⋅ a2 = 1.
Definition A.120 (Domain). A nonzero ring is called a domain if each nonzero element is a nonzero divisor. It is a field if every nonzero element is a unit. Obviously, a field is a domain.
Definition A.121 (Integral domain). An integral domain is a commutative ring with no zero divisors.
Definition A.122 (Local ring). A ring K is local if it contains exactly one maximal ideal p_m.
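The notions of zero divisor, unit, domain, field, maximal ideal, and local ring can be explored by brute force in the finite rings ℤ/n. The sketch below is only an illustration: the restriction to ℤ/n with n ≥ 2, the restriction of `zero_divisors` to nonzero elements, and all function names are assumptions made here. It also uses the standard fact that every ideal of ℤ/n is generated by a divisor of n.

```python
def zero_divisors(n):
    """Nonzero a in Z/n with a * b = 0 (mod n) for some nonzero b."""
    return [a for a in range(1, n)
            if any(a * b % n == 0 for b in range(1, n))]

def units(n):
    """Elements a in Z/n with a * b = 1 (mod n) for some b."""
    return [a for a in range(1, n)
            if any(a * b % n == 1 for b in range(1, n))]

def is_domain(n):
    """Z/n is a domain if no nonzero element is a zero divisor."""
    return not zero_divisors(n)

def is_field(n):
    """Z/n is a field if every nonzero element is a unit."""
    return len(units(n)) == n - 1

def ideals(n):
    """Ideals of Z/n, keyed by the divisor d of n that generates them."""
    return {d: frozenset(d * k % n for k in range(n))
            for d in range(1, n + 1) if n % d == 0}

def maximal_ideals(n):
    """Proper ideals that are not strictly contained in another proper ideal."""
    proper = [ideal for d, ideal in ideals(n).items() if d != 1]
    return [i for i in proper if not any(i < j for j in proper)]

def is_local(n):
    """Z/n is local if it has exactly one maximal ideal."""
    return len(maximal_ideals(n)) == 1

for n in (4, 6, 7):
    print(n, zero_divisors(n), units(n), is_domain(n), is_field(n), is_local(n))
```

In this family ℤ/7 is a field, ℤ/4 is local but not a domain, and ℤ/6 has the two maximal ideals generated by 2 and 3.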
A.18 Algebras
Definition A.123 (Ring algebra). An algebra over a commutative ring is an extension of an algebra over a field, such that the base field is replaced by a commutative ring K. A K-algebra is a pair consisting of a K-module M and a binary operation called the M-multiplication [⋅, ⋅]:
[⋅, ⋅] : M × M → M,
such that ∀a1, a2 ∈ K, ∀x1, x2, x3 ∈ M
[a1x1 + a2x2, x3] = a1[x1, x3] + a2[x2, x3]
and
[x3, a1x1 + a2x2] = a1[x3, x1] + a2[x3, x2].
The following definition is useful:
Definition A.124 (Unital algebra). Let (AR , m) be an algebra over the ring R. Then (AR , m) is a unitary (unital) algebra if it contains an identity element 1A called a unit of algebra for m, such that for all a ∈ AR : m(a, 1A ) = m(1A , a) = a. Usually 1 stands for the unity. Then an alternative definition of a ring algebra is possible that reads: Definition A.125 (K-algebra). A K-algebra is a compound of a K-vector space A and two linear maps m : A ⊗K A → A and u : K → A, such that the maps are unital (Definition A.124) and associative, with the extra conditions that both diagrams in Figure A.1 are commutative (s stands for the scalar multiplication) and the unit element in A is given by 1A = u(1K ).
Fig. A.1: K-algebra commutative diagrams.
Definition A.126 (Graded ring). A graded ring K is a ring that has a decomposition into (Abelian) additive groups:
$$K = \bigoplus_n K_n = K_0 \oplus K_1 \oplus K_2 \oplus \cdots$$
such that the ring multiplication satisfies
1. $x_1 \in K_{s_1},\ x_2 \in K_{s_2} \Rightarrow x_1 x_2 \in K_{s_1 + s_2}$;
2. $K_{s_1} K_{s_2} \subseteq K_{s_1 + s_2}$.
Elements of any term K_n of the decomposition are called homogeneous elements of degree n. A subset 𝔨 ⊂ K is said to be homogeneous if each element k ∈ 𝔨 is the sum of homogeneous elements that belong to 𝔨. For a given k the homogeneous elements are uniquely defined. If I is a homogeneous ideal in K, then K/I is also a graded ring, allowing the decomposition
$$K/I = \bigoplus_i (K_i + I)/I.$$
Each nongraded ring K can be made graded by setting K_0 = K and K_i = 0 for positive i.
Definition A.127 (Graded module). A graded module is a left module M over a graded ring K, such that
1. M = ⨁_i M_i;
2. K_i M_j ⊆ M_{i+j}.
Definition A.128 (Graded algebra). An algebra A over a ring K is said to be a graded algebra if it is graded as a ring.
Definition A.129 (K-algebra homomorphism). Let A1, A2 be some K-algebras. A K-algebra homomorphism is defined as a K-linear map F : A1 → A2, such that for all x1, x2 ∈ A1
F(x1x2) = F(x1)F(x2).
The space formed by all K-algebra homomorphisms is denoted by Hom_K(A1, A2). Let U1 and U2 be two commutative unitary K-algebras. We denote by Alg(U1, U2) = Alg_K(U1, U2) the totality of K-algebra morphisms from U1 to U2 which map the unit element of one algebra into the unit element of the other one.
Definition A.130 (Associative co-algebra). A K-co-algebra is a compound of a K-vector space V and two linear maps
Δ : V → V ⊗ V and 𝜖 : V → K,
that is, the co-multiplication and the co-unit, respectively. The following axioms, stating co-associativity and the co-unit property, yield the commutative diagrams visualized in Figure A.2:
1. (1 ⊗ Δ) ∘ Δ = (Δ ⊗ 1) ∘ Δ;
2. (1 ⊗ 𝜖) ∘ Δ = 1 = (𝜖 ⊗ 1) ∘ Δ, under the identifications V ⊗ K ≅ V ≅ K ⊗ V.
Fig. A.2: K-co-algebra commutative diagrams.
A.19 Hopf algebra
In order to introduce the notion of a Hopf algebra, we first merge an algebra and a co-algebra into a bi-algebra.
Definition A.131 (Bi-algebra). A bi-algebra A is a K-vector space A = (A, m, u, Δ, 𝜖), where (A, m, u) is an algebra and (A, Δ, 𝜖) is a co-algebra, such that
1. m and u are co-algebra homomorphisms;
2. Δ and 𝜖 are algebra homomorphisms.
Let A1, A2 be two K-algebras. Then A1 ⊗ A2 is also a K-algebra with the composition
(a1 ⊗ a2)(a3 ⊗ a4) = a1a3 ⊗ a2a4,
such that
$$A_1 \otimes A_2 \otimes A_1 \otimes A_2 \xrightarrow{\;1_{A_1} \otimes \tau \otimes 1_{A_2}\;} A_1 \otimes A_1 \otimes A_2 \otimes A_2 \xrightarrow{\;m_{A_1} \otimes m_{A_2}\;} A_1 \otimes A_2,$$
where τ : A2 ⊗ A1 → A1 ⊗ A2 is called a flipping operation. The unit of A1 ⊗ A2 is obtained by the rules
$$u_{A_1 \otimes A_2} : K \cong K \otimes K \xrightarrow{\;u_{A_1} \otimes u_{A_2}\;} A_1 \otimes A_2,$$
$$u_{A_1 \otimes A_2}(1_K) = (u_{A_1} \otimes u_{A_2})(1_K \otimes 1_K) = 1_{A_1} \otimes 1_{A_2} = 1_{A_1 \otimes A_2}.$$
Similarly, one gets the co-algebra
$$B_1 \otimes B_2 \xrightarrow{\;\Delta_{B_1} \otimes \Delta_{B_2}\;} B_1 \otimes B_1 \otimes B_2 \otimes B_2 \xrightarrow{\;1_{B_1} \otimes \tau \otimes 1_{B_2}\;} B_1 \otimes B_2 \otimes B_1 \otimes B_2.$$
The co-unit is given by
$$B_1 \otimes B_2 \xrightarrow{\;\epsilon_{B_1} \otimes \epsilon_{B_2}\;} K \otimes K \cong K,$$
$$(\epsilon_{B_1} \otimes \epsilon_{B_2})(1_{B_1} \otimes 1_{B_2}) = \epsilon(1_{B_1}) \otimes \epsilon(1_{B_2}) = 1_K \otimes 1_K = 1_K.$$
A bi-algebra morphism is simultaneously an algebra and a co-algebra homomorphism.
Definition A.132 (Bi-ideal). Let F : A1 → A2 be a bi-algebra homomorphism. Then ker F is called a bi-ideal, meaning that ker F is both an ideal and a co-ideal. I ⊂ A1 is a co-ideal if 𝜖(I) = 0 and Δ(I) ⊆ A1 ⊗ I + I ⊗ A1.
Definition A.133 (Hopf algebra). Let K be a commutative ring. A K-algebra H is said to be a Hopf algebra if it possesses extra structure provided by K-algebra homomorphisms:
1. co-multiplication: Δ : H → H ⊗_K H;
2. co-unit: 𝜖 : H → K;
3. antipode K-module homomorphism: λ : H → H,
which obey the conditions of
1. co-associativity: (I ⊗ Δ)Δ = (Δ ⊗ I)Δ : H → H ⊗ H ⊗ H;
2. co-unitarity: m(I ⊗ 𝜖)Δ = I = m(𝜖 ⊗ I)Δ;
3. antipode: m(I ⊗ λ)Δ = ι𝜖 = m(λ ⊗ I)Δ,
where I is the identity map on H, m : H ⊗ H → H is the multiplication in H, and ι : K → H is the K-algebra structure map for H, called the unit map. Here ⊗_K shows that the product is K-equivariant: for k ∈ K, h1, h2 ∈ H
Δ(kh1, h2) = kΔ(h1, h2).
Lemma A.134. Let I be a bi-ideal of a bi-algebra A. The operations on A induce the structure of a bi-algebra on A/I, such that the bi-algebraic structure does not change under projection of A on A/I.
As we have seen in the construction of generalized loop space, a specific ideal plays a prominent role there. One of the reasons why this ideal turns out to be so important can be clarified by some observations on universal enveloping algebras of Lie algebras and their representations. Recall that a representation R assigns to each element x_i of a Lie algebra a linear operator R(x_i). Because of this linearity, the operators not only form a Lie algebra, but also an associative algebra that allows one to define the products R(x1)R(x2). The result of this product depends, strictly speaking, on the chosen representation. Some of its properties, however, can be shown to hold for all representations. Once the universal enveloping algebra is defined, one is able to identify these universal properties. It should now be clear that U(g) is a bi-algebra for all Lie algebras g, with, for all x ∈ g,
Δ(x) = 1 ⊗ x + x ⊗ 1 and 𝜖(x) = 0,
since U(g) = T(g)/I(g) and T(g) is a bi-algebra.
Definition A.135 (Opposite algebra and co-algebra). The opposite algebra A^op to a K-algebra A is defined to be the same vector space as A, but now with the multiplication operation m′ defined for all elements in A as
m′(a1, a2) := m(a2, a1).
Similarly, for the co-algebra B one defines the opposite B^op, such that Δ_{B^op} := τ ∘ Δ_B, where τ is again the flipping operation.
Definition A.136 (Co-commutativity). A co- or bi-algebra is co-commutative if its opposite is equal to itself.
For the sake of simplicity of notation, we introduce the so-called Sweedler notation. Let B be a co-algebra. For its elements b we have
$$\Delta(b) = \sum b_1 \otimes b_2.$$
Using co-associativity one obtains
$$(1 \otimes \Delta) \circ \Delta(b) = (1 \otimes \Delta)\Big(\sum b_1 \otimes b_2\Big) = \sum b_1 \otimes b_{21} \otimes b_{22} = \sum b_{11} \otimes b_{12} \otimes b_2,$$
which will be denoted as $\sum b_1 \otimes b_2 \otimes b_3$. In general, $\Delta^{n-1} : B \to B^{\otimes n}$. Then, with the aid of the right diagram in Figure A.2, one concludes that
$$b = \sum \epsilon(b_1)\, b_2 = \sum b_1\, \epsilon(b_2)$$
and that B is co-commutative if and only if for all b ∈ B
$$\Delta(b) = \sum b_2 \otimes b_1.$$
Definition A.137 (Antipode). Suppose we have the bi-algebra A = (A, m, u, Δ, 𝜖). A linear endomorphism S : A → A is called an antipode if the diagram in Figure A.3 commutes. In terms of the Sweedler notation
$$\epsilon(a)\, 1_A = \sum a_1 S(a_2) = \sum S(a_1)\, a_2. \tag{A.32}$$
A Hopf algebra is, therefore, a bi-algebra with an antipode, and Hopf algebra morphisms are antipode-preserving bi-algebra morphisms.
Fig. A.3: Hopf commutative diagram.
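The antipode condition (A.32) can be checked explicitly in a small case. The sketch below is an illustration under stated assumptions: it takes the polynomial algebra K[x], viewed as the universal enveloping algebra of a one-dimensional Abelian Lie algebra, with Δ(x) = 1 ⊗ x + x ⊗ 1 and 𝜖(x) = 0 as in the text, and with S(x) = −x. Polynomials are stored as dictionaries from powers to coefficients and tensors as dictionaries from pairs of powers to coefficients; this encoding and all function names are hypothetical choices made for the example.

```python
from math import comb

def coproduct(p):
    """Delta(x^n) = sum_k C(n, k) x^k (x) x^(n-k), extended linearly."""
    out = {}
    for n, c in p.items():
        for k in range(n + 1):
            out[(k, n - k)] = out.get((k, n - k), 0) + c * comb(n, k)
    return out

def antipode(p):
    """S(x^n) = (-1)^n x^n, extended linearly."""
    return {n: (-1) ** n * c for n, c in p.items()}

def counit(p):
    """epsilon(p) is the constant coefficient of p."""
    return p.get(0, 0)

def multiply_id_tensor_antipode(t):
    """Apply m o (1 (x) S) to a tensor t given as {(i, j): coefficient}."""
    out = {}
    for (i, j), c in t.items():
        for n, s in antipode({j: 1}).items():
            out[i + n] = out.get(i + n, 0) + c * s
    return {n: c for n, c in out.items() if c != 0}

p = {0: 7, 1: 2, 3: -4}      # hypothetical sample: 7 + 2x - 4x^3
t = coproduct(p)
lhs = multiply_id_tensor_antipode(t)
print(lhs, {0: counit(p)})
```

For each power n > 0 the contributions add up to (x − x)^n = 0, so only the constant term 𝜖(p) survives, in agreement with (A.32). The coefficients of the coproduct are symmetric under exchange of the two tensor factors, so this example is also co-commutative.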
Definition A.138 (Convolution product). Let A = (A, m, u) be an algebra and C = (C, Δ, 𝜖) a co-algebra. We define the convolution product ∗ over Hom_K(C, A) for elements f1, f2 ∈ Hom_K(C, A) and c ∈ C as
$$(f_1 * f_2)(c) = \sum f_1(c_1)\, f_2(c_2).$$
Proposition A.139. (Hom_K(C, A), ∗, u ∘ 𝜖) is an algebra.
The easiest way to see this is by noticing that ∗ plays the role of the multiplication m on Hom_K(C, A), while u ∘ 𝜖 ∈ Hom_K(C, A) acts as its identity element. Setting C = A for a bi-algebra A, this yields the algebra (End_K(A), ∗, u ∘ 𝜖). We derive, therefore, that the antipode S for A is an inverse of 1_A in (End_K(A), ∗, u ∘ 𝜖), which is uniquely determined due to the uniqueness of inverses.
Corollary A.140. Let C = (C, Δ, 𝜖) be any K-co-algebra. C∗ = Hom_K(C, K) is an algebra with
$$(f_1 * f_2)(c) = \sum f_1(c_1)\, f_2(c_2),$$
so that C∗ is commutative if and only if C is co-commutative (Definition A.136).
Theorem A.141. Let H = (H, m, Δ, u, 𝜖, S) be a Hopf algebra. Then S is a bi-algebra homomorphism H → H^{op cop}, that is, for all x, x1, x2 ∈ H:
1. S(m(x1, x2)) = m[S(x2), S(x1)], S(1) = 1;
2. (S ⊗ S) ∘ Δ = Δ^op ∘ S and 𝜖 ∘ S = 𝜖, where Δ^op = τ ∘ Δ; in Sweedler notation, $\sum S(x_2) \otimes S(x_1) = \sum (Sx)_1 \otimes (Sx)_2$.
Definition A.142 (Anti-homomorphism). An antihomomorphism of rings is a map θ : K1 → K2, where K1, K2 are rings, such that
θ(k1k2) = θ(k2)θ(k1).
Example A.143 (Hopf algebra). Let g be a Lie algebra. Given that U(g) := T(g)/I(g), we find for all x1, x2 ∈ g
$$S([x_1, x_2] - x_1 x_2 + x_2 x_1) = -[x_1, x_2] - (-x_2)(-x_1) + (-x_1)(-x_2) = -([x_1, x_2] - x_1 x_2 + x_2 x_1) \in I(g). \tag{A.33}$$
S is an antiautomorphism of U(g) that makes it a Hopf algebra. Definition A.144 (Restricted dual of a K-algebra). For a K-algebra A the restricted dual is given by the set A∘ = {F ∈ A∗ : F(I) = 0 for some I ⊲ A, dimK (A/I) < ∞},
(A.34)
where I ⊲ A indicates that I is an ideal of A. We can extend the concept of a module over a ring to a module over a K-algebra and a K-co-algebra generating also the concept of a co-module. Definition A.145 (Module over a K-algebra). The left module M over a K-algebra A includes a K-vector space M with a K-linear map λ : A ⊗ M → M, such that the diagrams shown in Figure A.4 commute.
Fig. A.4: Module over a K-algebra.
Definition A.146 (Co-module). A right co-module M over a co-algebra C is a K-vector space M with a K-linear map ρ : M → M ⊗ C, (A.35) such that the diagrams in Figure A.5 commute.
Fig. A.5: Co-module over a K-co-algebra.
Proposition A.147 (Duality).
1. Let M be a right co-module for the co-algebra C. Then M is a left module for C∗ = Hom(C, K).
2. Let A be an algebra and M a left A-module. Then M is a right A∘-co-module if and only if dim_K(Am) < ∞ for all m ∈ M.
Definition A.148 (Rational module). An A-module M with dim_K(Am) < ∞ for all m ∈ M is called rational.
We end this set of definitions with some comments on tensor products of modules and co-modules as well as on homomorphisms between modules.
– Tensor products of modules: Let A be a bi-algebra, and V1 and V2 be left A-modules. Then V1 ⊗ V2 is a left A-module with
$$a \cdot (v_1 \otimes v_2) = \sum a_1 v_1 \otimes a_2 v_2.$$
If we now consider a third A-module V3, then co-associativity assures that (V1 ⊗ V2) ⊗ V3 ≅ V1 ⊗ (V2 ⊗ V3). In the specific case of the trivial left A-module K, with a ⋅ v = 𝜖(a)v for a ∈ A, v ∈ K, it is clear that V1 ⊗ K ≅ V1 ≅ K ⊗ V1 as left modules. If A is co-commutative, then we also have V1 ⊗ V2 ≅ V2 ⊗ V1 as left modules, with the isomorphism given by the flip function τ : v1 ⊗ v2 → v2 ⊗ v1.
– Tensor products of co-modules: If B is a bi-algebra and V1, V2 are right B-co-modules, then V1 ⊗ V2 is a right co-module with
$$v_1 \otimes v_2 \mapsto \sum (v_1)_0 \otimes (v_2)_0 \otimes (v_1)_1 (v_2)_1.$$
– Homomorphism of modules: Let H be a Hopf algebra and V1, V2 be left H-modules. Then Hom_K(V1, V2) is a left H-module with the action, for h ∈ H and F ∈ Hom_K(V1, V2),
$$(h \cdot F)(v_1) = \sum h_1 F\big((S h_2)\, v_1\big).$$
Now we shall introduce algebras that also carry topological structures.
A.20 Topological, C∗-, and Banach algebras
Definition A.149 (Topological algebra). A topological algebra is an algebra supplied with a nontrivial topology τ which is compatible with its linear structure, such that the map X × X → X, (x1, x2) → x1x2, is continuous.
Definition A.150 (Normed algebra). An algebra A having a norm is a normed algebra if the norm is sub-multiplicative, that is, for all a1, a2 ∈ A
$$\|a_1 a_2\| \le \|a_1\|\, \|a_2\|. \tag{A.36}$$
Lemma A.151 (Continuity in normed algebras). If A is a normed algebra, then all the algebraic operations are continuous in the norm topology on A.
Definition A.152 (Involution on an algebra). An involution on an algebra A is a map ∗ : A → A, a → a∗, such that the following conditions are fulfilled:
1. conjugate linearity: $(z a_1 + z' a_2)^* = \bar{z}\, a_1^* + \bar{z}'\, a_2^*$;
2. order reversing: $(a_1 a_2)^* = a_2^* a_1^*$;
3. involution: $(a_1^*)^* = a_1$
for all a1, a2 ∈ A and z, z′ ∈ ℂ. An algebra with involution is called a ∗-algebra. Since we consider here algebras over the complex numbers this is called a C∗-algebra.
Definition A.153 (Banach space). A normed space X is said to be a Banach space if for each Cauchy sequence $\{x_n\}_{n=1}^{\infty} \subset X$ there exists an element x ∈ X such that
$$\lim_{n \to \infty} x_n = x.$$
Definition A.154 (Banach algebra). An algebra which is complete in the metric induced by its norm is a Banach algebra.
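Sub-multiplicativity (A.36) is a genuine restriction on a norm. As a hedged illustration, the sketch below works in the algebra of real 2 × 2 matrices, which is an assumption made for this example, and compares the Frobenius norm, which is known to be sub-multiplicative, with the maximum-entry norm, which is a norm on the underlying vector space but fails (A.36). The function names and sample matrices are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def frobenius(a):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

def max_entry(a):
    """Largest absolute value of an entry."""
    return max(abs(x) for row in a for x in row)

def is_submultiplicative_on(norm, a, b):
    """Check ||a b|| <= ||a|| ||b|| for one pair, with a rounding tolerance."""
    return norm(matmul(a, b)) <= norm(a) * norm(b) + 1e-12

ones = [[1.0, 1.0], [1.0, 1.0]]
rot = [[0.0, -1.0], [1.0, 0.0]]

print(is_submultiplicative_on(frobenius, ones, ones))
print(is_submultiplicative_on(frobenius, rot, ones))
print(is_submultiplicative_on(max_entry, ones, ones))
```

For the all-ones matrix the square has entries equal to 2, so the maximum-entry norm of the product is 2 while the product of the norms is 1.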
A.21 Nuclear multiplicative convex Hausdorff algebras and the Gel’fand spectrum
A topological algebra isomorphism is an algebra morphism which is also a homeomorphism. This allows us to build a number of structures on such isomorphisms, of which we shall introduce some and briefly discuss their properties.
Definition A.155 (Filter basis). A nonempty subset F of a partially ordered set (P, ≤) is said to be a filter if the following conditions are fulfilled:
1. for all x1, x2 ∈ F there exists an element x3 ∈ F, such that x3 ≤ x1 and x3 ≤ x2;
2. for all x1 ∈ F and x2 ∈ P the statement x1 ≤ x2 implies that x2 ∈ F;
3. a filter is proper if it is not equal to the whole set P.
Definition A.156 (Bases for compatible topologies). The filter basis ℬ in an algebra A determines a basis at 0 for a compatible topology for A if and only if
1. ℬ is a neighborhood base at 0 for a topology which is compatible with the linear structure of A;
2. for each B ∈ ℬ there exists a B′ ∈ ℬ such that the product B′B′ is a subset of B.
The following type of induced topology is important in the topologization of the shuffle algebra.
Definition A.157 (Initial topology). Let X1 be an algebra, X2 a topological algebra with neighborhood filter V(0). Let A : X1 → X2 be a homomorphism. Then the filter A^{-1}(V(0)) defines a topology compatible with the linear structure of X1. The topology determined by the filter is said to be the initial topology induced by A.
Definition A.158 (Final topology). Let X1 be a topological algebra with V(0) being the filter, X2 an algebra, and A1 : X1 → X2 a homomorphism. It is easy to see that the collection 𝒜 of subsets U of X2, such that A1^{-1}(U) ∈ V(0), forms a base at 0 for a topology compatible with the linear structure of X2. For each U ∈ 𝒜 one can select A2 ∈ V(0), such that A2A2 ⊂ A1^{-1}(U). Hence,
A1(A2)A1(A2) = A1(A2A2) ⊂ A1(A1^{-1}(U)) ⊂ U.
Since A1^{-1}(A1(A2)) ⊃ A2 ∈ V(0), we have A1^{-1}(A1(A2)) ∈ V(0), i.e., A1(A2) ∈ 𝒜, so that 𝒜 is a base at 0 for a topology which is compatible with the algebraic structure of X2. The topology generated by this base is called the final topology for X2 determined by the homomorphism A1.
Suppose V is a vector space over K, a field or a subfield of the complex numbers.
Definition A.159 (Convex space via convex sets). A subset C in V is called
1. convex: if for each x1, x2 ∈ C and for all α ∈ [0, 1], αx1 + (1 − α)x2 ∈ C;
2. circled: if for all x ∈ C, λx ∈ C if |λ| = 1;
3. a cone: if for every x ∈ C and 0 ≤ λ ≤ 1, λx ∈ C;
4. balanced: if for all x ∈ C, λx ∈ C if |λ| ≤ 1;
5. absorbent: if the union of αC over all α > 0 is all of V, or equivalently for all x ∈ V, αx ∈ C for some α > 0;
6. absolutely convex: if it is balanced and convex.
A locally convex topological vector space is a topological vector space in which the origin has a local base of absolutely convex absorbent sets. Because translation is continuous, all translations are homeomorphisms, so each base for the neighborhoods of the origin can be translated into a base for the neighborhoods of any given vector.
Definition A.160 (Convex space via semi-norms). A semi-norm on V is a map F : V → ℝ, such that
1. F is positive or positive semidefinite: F(x) ≥ 0;
2. F is positive homogeneous or positive scalable: F(λx) = |λ|F(x) for each number λ. In particular, F(0) = 0;
3. F is subadditive, that is, F(x1 + x2) ≤ F(x1) + F(x2).
If F obeys positive definiteness, that is, if F(x) = 0 implies x = 0, then F is a norm. A locally convex space is defined to be a vector space V with a family of semi-norms {Fα}, α ∈ A. The space possesses the initial topology of the semi-norms. That is to say, it is the weakest topology for which all mappings, for x0 ∈ V and α ∈ A,
x → Fα(x − x0)
are continuous. A base of neighborhoods of x0 for this topology is obtained in the following way: for each finite subset A′ ⊂ A and each 𝜖 > 0, we set
$$U_{A', \epsilon}(x_0) = \{x \in V : F_\alpha(x - x_0) < \epsilon,\ \alpha \in A'\}.$$
Continuity of the vector space operations follows from properties 2 and 3 above. The resulting topological vector space is locally convex because each $U_{A', \epsilon}(0)$ is absolutely convex and absorbent.
Definition A.161 (Multiplicative convexity (m-convex)). A subset A′ of an algebra A is multiplicative (idempotent) if A′² = A′A′ ⊂ A′. It is multiplicatively-convex or m-convex if it is convex and multiplicative. It is absolutely m-convex if it is balanced and m-convex.
Definition A.162 (Multiplicative semi-norm). A semi-norm N on an algebra A is multiplicative if for all x1, x2 ∈ A
N(x1x2) ≤ N(x1)N(x2).
Definition A.163 (Locally m-convex algebras and Fréchet algebras). A topological algebra (X, τ) is a locally m-convex algebra (LMC algebra) if there is a base of m-convex sets for V(0). X is a locally convex algebra if it is a topological algebra with a locally convex linear space structure. If, in addition to being locally m-convex, the topology τ is Hausdorff, we say that X is an LMCH algebra and that τ is LMCH. An LMC algebra which is a complete metrizable topological space is a Fréchet algebra.
Proposition A.164. A topological algebra X is locally m-convex if and only if its topology is generated by a family of multiplicative semi-norms.
The following space is a generalization of a Banach space that is locally convex and complete with respect to a translation-invariant metric. However, the metric does not need to arise from a norm (a semi-norm suffices).
Definition A.165 (Fréchet space I). A topological vector space X is called a Fréchet space if it possesses the following properties:
1. it is Hausdorff;
2. it is complete with respect to the family of semi-norms;
3. the topology on X can be induced by a countable family of semi-norms ‖⋅‖_l, l = 0, 1, 2, . . . , i.e., a subset U ⊂ X is open if and only if for all elements u ∈ U there exist K ≥ 0 and 𝜖 > 0, such that {𝜈 : ‖𝜈 − u‖_l < 𝜖 for all l ≤ K} is a subset of U.
Definition A.166 (Hilbert space). One speaks of a Hilbert space H if there are:
1. a positive definite inner product on the complex linear space X, ⟨⋅, ⋅⟩ : X × X → ℂ, such that
(a) ⟨x1, x1⟩ ≥ 0 and ⟨x1, x1⟩ = 0 if and only if x1 = 0,
(b) ⟨x1, x2 + x3⟩ = ⟨x1, x2⟩ + ⟨x1, x3⟩,
(c) ⟨x1, λx2⟩ = λ⟨x1, x2⟩,
(d) $\langle x_1, x_2 \rangle = \overline{\langle x_2, x_1 \rangle}$.
Under these conditions, H is said to be a pre-Hilbert space;
2. a collection of vectors (x_n) which are orthonormal, that is, ⟨x_m, x_n⟩ = δ_mn.
Definition A.167 (Compact operator). An operator L acting in a Hilbert space H, L : H → H, is a compact operator if it can be presented as
$$L = \sum_{n=1}^{N} \rho_n \langle f_n, \cdot \rangle\, g_n, \qquad 1 \le N \le \infty,$$
where f1, . . . , fN and g1, . . . , gN are (not necessarily complete) orthonormal sets. Here, ρ1, . . . , ρN are a set of real numbers, the singular values of the operator, obeying ρn → 0 if N = ∞. The bracket ⟨⋅, ⋅⟩ is the scalar product on the Hilbert space; the sum on the right-hand side must converge in the norm.
Definition A.168 (Singular values of a compact operator). The singular values, or s-numbers, of a compact operator T : H1 → H2 acting between Hilbert spaces H1 and H2 are the square roots of the eigenvalues of the nonnegative self-adjoint operator T∗T : H1 → H1, where T∗ stands for the adjoint of T.
Definition A.169 (Nuclear operator). A compact operator is nuclear or trace-class if
$$\sum_{n=1}^{\infty} \rho_n < \infty.$$
The most important observation for our purposes is that for a nuclear operator in a Hilbert space one can define the trace, which is finite and independent of the basis.
Proposition A.170 (Finite trace of nuclear operator on Hilbert space). Given an orthonormal basis {ψn} for the Hilbert space, one defines the trace as
$$\operatorname{tr} L = \sum_n \langle \psi_n, L \psi_n \rangle, \tag{A.37}$$
where the sum converges absolutely and is independent of the basis. Furthermore, this trace is identical to the sum over the eigenvalues of L.
Proposition A.171 (Trace of nuclear operator on Banach space). Let A1, A2 be two Banach spaces, and A1∗ be the dual of A1, that is, the set of all continuous linear functionals on A1 with the usual norm. Then the operator L : A1 → A2 is said to be nuclear of order q if there exist sequences of vectors {gn} ∈ A2 with ‖gn‖ ≤ 1, functionals {Fn∗} ∈ A1∗ with ‖Fn∗‖ ≤ 1, and complex numbers {ρn} with
$$\inf \Big\{ p \ge 1 : \sum_n |\rho_n|^p < \infty \Big\} = q,$$
such that the operator can be presented as
$$L = \sum_n \rho_n F_n^*(\cdot)\, g_n$$
with the sum converging in the operator norm. Definition A.172 (Weak topology). The collection of all unions of finite intersection of sets Fi−1 (Oi ) for F : X1 → X 2 ,
where $i \in I$ and $O_i$ is an open set in $X_{2i}$, is a topology. It is called the weak topology on $X_1$ generated by the $(F_i)$, $i \in I$, and is denoted by $\sigma(X_1, (F_i), i \in I)$.

Definition A.173 (Weak-∗ topology). The weak-∗ topology on X is the topology $\sigma(X, (F), F \in X^*)$.

Definition A.174 (Vanish at infinity). If X is a locally compact Hausdorff space, then a continuous function F on X vanishes at infinity if $\{x \in X : |F(x)| \ge \epsilon\}$ is compact for all $\epsilon > 0$.

Definition A.175 (Gel'fand space or spectrum). Let A be a commutative Banach algebra. Then we denote the collection of nonzero complex homomorphisms $H : A \to \mathbb{C}$ by $\Delta(A)$. Elements of the Gel'fand space are called characters.

Theorem A.176. Let A be a commutative unital Banach algebra. Then
1. $\Delta(A) \neq \emptyset$;
2. J is a maximal ideal in A if and only if $J = \ker H$ for some $H \in \Delta(A)$;
3. $\|H\| = 1$, $\forall H \in \Delta(A)$;
4. $\forall a \in A : \sigma(a) = \{H(a) : H \in \Delta(A)\}$.

Lemma A.177. If A is a unital Banach algebra, then each proper ideal is included in a maximal ideal and each maximal ideal is closed.

Theorem A.178 (Gel'fand–Mazur theorem). A unital Banach algebra wherein each nonzero element is invertible is isometrically isomorphic to $\mathbb{C}$.

Definition A.179 (Gel'fand transform). Let A be a commutative Banach algebra with $\Delta(A)$ nonempty. The Gel'fand transform of $a \in A$ is the function
\[ \hat a : \Delta(A) \to \mathbb{C}, \qquad H \mapsto \hat a(H) := H(a). \]
The space $\Delta(A)$ is called the spectrum of A.

Definition A.180 (Gel'fand topology). The Gel'fand topology on $\Delta(A)$ is the smallest topology making each $\hat a$ continuous.

Lemma A.181 (Gel'fand topology). The Gel'fand topology on $\Delta(A)$ is the relative topology on $\Delta(A)$ viewed as a subset of $A^*$ with the weak-∗ topology.
B Notations and conventions in quantum field theory

B.1 Vectors and tensors

In general, we use the same conventions as in [7]. We will work in natural units:
\[ \hbar = c = 1. \tag{B.1} \]
For the Minkowski metric, we take the common convention
\[ g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{B.2} \]
where Greek indices run over 0, 1, 2, 3 (for t, x, y, z). To denote only the spatial components, we use Roman indices, like i, j, etc. We use the Einstein summation convention throughout the whole book, meaning that repeated indices are to be summed over. A four-vector is denoted in italic, a three-vector in bold and a two-vector (the transversal components) in bold and with a subscript ⊥:
\[ p^\mu = (p^0, p^1, p^2, p^3) = (p^0, \boldsymbol{p}) = (p^0, \boldsymbol{p}_\perp, p^3), \tag{B.3} \]
while a length is mostly denoted in italic, be it the length of a four-, three- or two-vector. The difference should be clear from context, but when needed for clarity, we use $|\boldsymbol{p}|$ and $|\boldsymbol{p}_\perp|$. The scalar product is fully defined by the metric:
\[ x\cdot p = x^0 p^0 - \boldsymbol{x}\cdot\boldsymbol{p}. \tag{B.4} \]
This implies that we can define a vector with a lower index as
\[ p_\mu = (p^0, -p^1, -p^2, -p^3), \tag{B.5} \]
such that
\[ x\cdot p = x_\mu p^\mu. \tag{B.6} \]
Note that the index moves up or down when placing the coordinate in the denominator, as is the case for the derivative:
\[ \frac{\partial}{\partial x^\mu} \equiv \partial_\mu. \tag{B.7} \]
The position four-vector combines time and three-position, while the four-momentum combines energy and three-momentum:
\[ x^\mu = (t, \boldsymbol{x}), \qquad p^\mu = (E, \boldsymbol{p}). \tag{B.8} \]
A particle that sits on its mass-shell (on-shell for short) has
\[ p^2 = E^2 - \boldsymbol{p}^2 = m^2. \tag{B.9} \]
Last, we define the symmetrization $(\ldots)$ and antisymmetrization $[\ldots]$ of a tensor as
\[ A^{(\mu\nu)} = \frac{1}{2}\left(A^{\mu\nu} + A^{\nu\mu}\right), \tag{B.10a} \]
\[ A^{[\mu\nu]} = \frac{1}{2}\left(A^{\mu\nu} - A^{\nu\mu}\right). \tag{B.10b} \]
Contracting a symmetrized tensor with an antisymmetrized one returns zero:
\[ A^{(\mu\nu)} B_{[\mu\nu]} = 0. \tag{B.11} \]
It is straightforward to generalize the definition to tensors of higher rank:
\[ A^{(\mu_1\cdots\mu_n)} = \frac{1}{n!}\left(A^{\mu_1\cdots\mu_n} + \text{all permutations}\right), \tag{B.12a} \]
\[ A^{[\mu_1\cdots\mu_n]} = \frac{1}{n!}\left(A^{\mu_1\cdots\mu_n} - \text{all odd perm.} + \text{all even perm.}\right). \tag{B.12b} \]
B.2 Spinors and gamma matrices

Any field with half-integer spin, i.e., a Dirac field, anticommutes:
\[ \psi(x)\psi(y) = -\psi(y)\psi(x). \tag{B.13} \]
We define gamma matrices by the anticommutation relations
\[ \{\gamma^\mu, \gamma^\nu\} \equiv 2\, g^{\mu\nu}\, \mathbb{1}_n, \tag{B.14} \]
with the following additional property:
\[ (\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0. \tag{B.15} \]
Then we can define the Dirac equation for a particle field $\psi$:
\[ (i\slashed{\partial} - m)\,\psi = 0, \tag{B.16} \]
where the slash is a shortcut notation for
\[ \slashed{p} = \gamma^\mu p_\mu. \tag{B.17} \]
We can identify an antiparticle field with $\bar\psi$ if we define
\[ \bar\psi = \psi^\dagger \gamma^0, \tag{B.18} \]
which satisfies a slightly adapted Dirac equation:
\[ (i\slashed{\partial} + m)\,\bar\psi = 0. \tag{B.19} \]
We can expand Dirac fields in terms of a set of plane waves:
\[ \psi(x) = u^s(p)\, e^{-i p\cdot x} \qquad (p^2 = m^2,\ p^0 > 0), \tag{B.20a} \]
\[ \psi(x) = v^s(p)\, e^{+i p\cdot x} \qquad (p^2 = m^2,\ p^0 < 0), \tag{B.20b} \]
where s is a spin-index. If we define
\[ \bar u = u^\dagger \gamma^0, \qquad \bar v = v^\dagger \gamma^0, \tag{B.21} \]
we can find the completeness relations by summing over spin:
\[ \sum_s u^s(p)\,\bar u^s(p) = \slashed{p} + m, \tag{B.22a} \]
\[ \sum_s v^s(p)\,\bar v^s(p) = \slashed{p} - m. \tag{B.22b} \]
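We will identify
– $u$ with an incoming fermion;
– $\bar u$ with an outgoing fermion;
– $\bar v$ with an incoming antifermion;
– $v$ with an outgoing antifermion.
If we define
\[ \gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3 = -\frac{i}{4!}\,\varepsilon^{\mu\nu\rho\sigma}\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma, \tag{B.23a} \]
\[ \gamma^{\mu\nu} \equiv \gamma^{[\mu}\gamma^{\nu]} = \frac{1}{2}\left(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu\right), \tag{B.23b} \]
we can construct a complete Dirac basis:
\[ \mathbb{1},\quad \gamma^\mu,\quad \gamma^{\mu\nu},\quad \gamma^\mu\gamma^5,\quad \gamma^5. \tag{B.24} \]
We will identify
– $\mathbb{1}$ with a scalar;
– $\gamma^\mu$ with a vector;
– $\gamma^{\mu\nu}$ with a tensor;
– $\gamma^\mu\gamma^5$ with a pseudo-vector;
– $\gamma^5$ with a pseudo-scalar.
Furthermore, $\gamma^5$ has the following properties:
\[ (\gamma^5)^\dagger = \gamma^5, \qquad (\gamma^5)^2 = \mathbb{1}, \qquad \{\gamma^5, \gamma^\mu\} = 0. \tag{B.25} \]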
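The relations (B.14), (B.15) and (B.25) can be checked numerically once a concrete representation is chosen. The sketch below assumes the Dirac representation in four dimensions; the text itself only fixes the algebra, so the explicit matrices and all helper names (`matmul`, `clifford_algebra_holds`, and so on) are choices made for this illustration.

```python
# Numerical check of (B.14), (B.15) and (B.25).
# Assumption: Dirac representation of the gamma matrices in 4 dimensions.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def scaled_identity(c, n=4):
    """c times the n x n identity matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

METRIC = [1, -1, -1, -1]  # diagonal of the metric (B.2)

GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],      # gamma^0
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],      # gamma^1
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],  # gamma^2
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],      # gamma^3
]

def gamma5():
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3, first form of (B.23a)."""
    g = scaled_identity(1j)
    for m in GAMMA:
        g = matmul(g, m)
    return g

def clifford_algebra_holds():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} 1 for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            expected = scaled_identity(2 * METRIC[mu] if mu == nu else 0)
            if not is_close(anticommutator(GAMMA[mu], GAMMA[nu]), expected):
                return False
    return True

def hermiticity_holds():
    """Check (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0."""
    g0 = GAMMA[0]
    return all(is_close(dagger(g), matmul(matmul(g0, g), g0)) for g in GAMMA)

def gamma5_properties_hold():
    """Check the three properties listed in (B.25)."""
    g5 = gamma5()
    return (is_close(dagger(g5), g5)
            and is_close(matmul(g5, g5), scaled_identity(1))
            and all(is_close(anticommutator(g5, g), scaled_identity(0))
                    for g in GAMMA))
```

Any other representation related to this one by a unitary change of basis would pass the same three checks, since they only test the algebraic relations.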
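Let us list some contraction identities for gamma matrices in ω dimensions:
\[ \gamma^\mu\gamma_\mu = \omega, \tag{B.26a} \]
\[ \gamma^\mu\gamma^\nu\gamma_\mu = (2 - \omega)\,\gamma^\nu, \tag{B.26b} \]
\[ \gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu = 4\, g^{\nu\rho} + (\omega - 4)\,\gamma^\nu\gamma^\rho, \tag{B.26c} \]
\[ \gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu = -2\,\gamma^\sigma\gamma^\rho\gamma^\nu + (4 - \omega)\,\gamma^\nu\gamma^\rho\gamma^\sigma, \tag{B.26d} \]
and some trace identities:
\[ \operatorname{tr}(\mathbb{1}) = \omega, \tag{B.27a} \]
\[ \operatorname{tr}(\text{odd number of } \gamma\text{'s}) = 0, \tag{B.27b} \]
\[ \operatorname{tr}(\gamma^\mu\gamma^\nu) = 4\, g^{\mu\nu}, \tag{B.27c} \]
\[ \operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = 4\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}\right). \tag{B.27d} \]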
B.3 Light-cone coordinates

Light-cone coordinates form a useful basis to represent 4-vectors. For an arbitrary vector $k^\mu$, they are defined by
\[ k^+ = \frac{1}{\sqrt{2}}\left(k^0 + k^3\right), \tag{B.28a} \]
\[ k^- = \frac{1}{\sqrt{2}}\left(k^0 - k^3\right), \tag{B.28b} \]
\[ \boldsymbol{k}_\perp = (k^1, k^2). \tag{B.28c} \]
We will represent the plus-component first, i.e.,
\[ k^\mu = (k^+, k^-, \boldsymbol{k}_\perp). \tag{B.29} \]
One often encounters in the literature the notation $(k^-, k^+, \boldsymbol{k}_\perp)$, but this is merely a matter of convention. The factor $\frac{1}{\sqrt{2}}$ normalizes the transformation to unit Jacobian, such that
\[ d^4k = dk^+\, dk^-\, d^2\boldsymbol{k}_\perp. \tag{B.30} \]
It is straightforward to show that the scalar product has the form
\[ k\cdot p = k^+ p^- + k^- p^+ - \boldsymbol{k}_\perp\cdot\boldsymbol{p}_\perp, \tag{B.31a} \]
\[ k^2 = 2 k^+ k^- - \boldsymbol{k}_\perp^2. \tag{B.31b} \]
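As a quick sanity check of (B.28) and (B.31a), one can convert two sample four-vectors to light-cone components and compare the two forms of the scalar product. This is a minimal sketch; the function names and the sample numbers are illustrative and not taken from the text.

```python
import math

def to_light_cone(k):
    """Map (k0, k1, k2, k3) to (k+, k-, (k1, k2)) following (B.28)."""
    k0, k1, k2, k3 = k
    s = math.sqrt(2.0)
    return ((k0 + k3) / s, (k0 - k3) / s, (k1, k2))

def dot_cartesian(k, p):
    """Scalar product with the metric diag(1, -1, -1, -1), equation (B.4)."""
    return k[0] * p[0] - k[1] * p[1] - k[2] * p[2] - k[3] * p[3]

def dot_light_cone(k_lc, p_lc):
    """Scalar product in light-cone components, equation (B.31a)."""
    kp, km, kt = k_lc
    pp, pm, pt = p_lc
    return kp * pm + km * pp - (kt[0] * pt[0] + kt[1] * pt[1])

k = (3.0, 1.0, 2.0, 0.5)
p = (2.0, -1.0, 0.3, 1.5)
same = math.isclose(dot_cartesian(k, p),
                    dot_light_cone(to_light_cone(k), to_light_cone(p)))
```

Setting `p = k` in the same comparison reproduces (B.31b).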
This implies that the metric becomes off-diagonal:
\[ g_{LC}^{\mu\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{B.32} \]
We will drop the index LC when clear from context. Note that this basis is not orthonormal. Note also that
\[ g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho, \qquad g^{\mu\nu} g_{\mu\nu} = 4, \tag{B.33} \]
just like the Cartesian metric. We can also define two light-like basis vectors:
\[ n_+^\mu = (1^+, 0^-, \boldsymbol{0}_\perp), \tag{B.34a} \]
\[ n_-^\mu = (0^+, 1^-, \boldsymbol{0}_\perp). \tag{B.34b} \]
They are light-like vectors, and maximally nonorthogonal:
\[ n_+^2 = 0, \qquad n_-^2 = 0, \qquad n_+\cdot n_- = 1. \tag{B.35} \]
Watch out, as lowering the index switches the light-like components because of the form of the metric:
\[ n_{+\mu} = (0_+, 1_-, \boldsymbol{0}_\perp), \tag{B.36a} \]
\[ n_{-\mu} = (1_+, 0_-, \boldsymbol{0}_\perp), \tag{B.36b} \]
such that they project out the other light-like component of a vector:
\[ k\cdot n_+ = k^-, \qquad k\cdot n_- = k^+. \tag{B.37} \]
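The index switching in (B.36) and the projections (B.37) follow mechanically from the off-diagonal metric (B.32). The sketch below makes this explicit with components ordered as $(+, -, \perp_1, \perp_2)$; the names `G_LC`, `lower` and `dot` are illustrative, and the code relies on the fact that the matrix in (B.32) is numerically its own inverse, so the same array serves for upper and lower indices.

```python
# Components are ordered (+, -, perp1, perp2), as in (B.29).
G_LC = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]  # light-cone metric (B.32)

def lower(v):
    """Lower the index of a contravariant vector with the light-cone metric."""
    return [sum(G_LC[m][n] * v[n] for n in range(4)) for m in range(4)]

def dot(a, b):
    """Scalar product a^mu b_mu of two contravariant vectors."""
    return sum(x * y for x, y in zip(a, lower(b)))

N_PLUS = [1, 0, 0, 0]   # n_+^mu, equation (B.34a)
N_MINUS = [0, 1, 0, 0]  # n_-^mu, equation (B.34b)

k = [2.0, 3.0, 0.5, -1.0]  # sample vector with k+ = 2, k- = 3
```

With these definitions `dot(k, N_PLUS)` returns the minus component of `k` and `dot(k, N_MINUS)` the plus component, which is exactly the switching described above.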
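In other words, we can write
\[ k^\mu = (k\cdot n_-)\, n_+^\mu + (k\cdot n_+)\, n_-^\mu + k_\perp^\mu, \tag{B.38} \]
where $k_\perp^\mu = (0^+, 0^-, \boldsymbol{k}_\perp)$ contains only the transversal components.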
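The switching of plus and minus components is also apparent when using Dirac matrices in LC-coordinates, e.g.,
\[ \{\gamma^+, \slashed{k}\} = 2\, g^{+\mu} k_\mu = 2 k^+ \quad\Rightarrow\quad \gamma^+ \slashed{k} = 2 k^+ - \slashed{k}\,\gamma^+. \tag{B.39} \]
Note that
\[ \frac{1}{2}\{\gamma^+, \gamma^-\} = \mathbb{1}, \qquad (\gamma^+)^2 = (\gamma^-)^2 = 0, \]
such that equation (B.14) remains valid in light-cone coordinates. We can use the light-like basis vectors to construct a metric for nothing but the transversal part:
\[ g_\perp^{\mu\nu} = g^{\mu\nu} - 2\, n_+^{(\mu} n_-^{\nu)} \tag{B.40a} \]
\[ \phantom{g_\perp^{\mu\nu}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{B.40b} \]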
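Note that
\[ g_\perp^{\mu\nu}\, g_{\perp\,\nu\rho} = \delta^\mu_\rho - n_+^\mu n_{-\rho} - n_-^\mu n_{+\rho}, \qquad g_\perp^{\mu\nu}\, g_{\perp\,\mu\nu} = 2. \tag{B.41} \]
Last, we can define an antisymmetric metric:
\[ \varepsilon_\perp^{\mu\nu} = \varepsilon^{+-\mu\nu} \tag{B.42a} \]
\[ \phantom{\varepsilon_\perp^{\mu\nu}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \tag{B.42b} \]
where we adopt the convention $\varepsilon^{0123} = \varepsilon^{+-12} = +1$.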
B.4 Fourier transforms and distributions

The Heaviside step function is defined as
\[ \theta(x) = \begin{cases} 0 & x < 0 \\ 1 & x > 0 \end{cases}, \tag{B.43} \]
and is undefined for x = 0. The Dirac δ-function is defined as its derivative:
\[ \delta(x) = \frac{d}{dx}\theta(x) \quad\Rightarrow\quad \int dx\, \delta(x) = 1, \tag{B.44} \]
and is zero everywhere, except at x = 0. A generalization to n dimensions is straightforward:
\[ \int d^n x\, \delta^n(x) = 1. \tag{B.45} \]
The most important use of the Dirac δ-function is the sifting property, which follows straight from (B.45):
\[ \int d^n x\, f(x)\, \delta^n(x - t) = f(t). \tag{B.46} \]
When dealing with on-shell conditions, we often encounter the combination of a Heaviside θ and a Dirac δ-function. To save space, we define the shorthand notation
\[ \delta^+(p^2 - m^2) = \delta(p^2 - m^2)\, \theta(p^0). \tag{B.47} \]
When dealing with Fourier transforms, we will use the following conventions:
\[ f(x) = \int \frac{d^4 k}{16\pi^4}\, \tilde f(k)\, e^{-i k\cdot x}, \tag{B.48a} \]
\[ \tilde f(k) = \int d^4 x\, f(x)\, e^{i k\cdot x}. \tag{B.48b} \]
The tilde will always be omitted, as the function argument specifies clearly enough whether we are dealing with the coordinate or momentum representation. Note that due to the Minkowski metric, Fourier transforms over spatial components have the signs in their exponents flipped:
\[ f(\boldsymbol{x}) = \int \frac{d^3 k}{8\pi^3}\, \tilde f(\boldsymbol{k})\, e^{i \boldsymbol{k}\cdot\boldsymbol{x}}, \tag{B.49a} \]
\[ \tilde f(\boldsymbol{k}) = \int d^3 x\, f(\boldsymbol{x})\, e^{-i \boldsymbol{k}\cdot\boldsymbol{x}}, \tag{B.49b} \]
and the same for two-dimensional Fourier transforms. An 'empty' Fourier transform gives a δ-function:
\[ \int \frac{d^n k}{(2\pi)^n}\, e^{-i k\cdot x} = \delta^{(n)}(x), \tag{B.50a} \]
\[ \int d^n x\, e^{i k\cdot x} = (2\pi)^n\, \delta^{(n)}(k). \tag{B.50b} \]
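B.5 Feynman rules for QCD

The full Lagrangian for QCD is given by:
\[ \mathcal{L} = \bar\psi\,(i\slashed{\partial} - m)\,\psi - \frac{1}{4}\left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)^2 - g\,\bar\psi \slashed{A}\, \psi + g f^{abc}\left(\partial_\mu A^a_\nu\right) A^{\mu b} A^{\nu c} - \frac{1}{2}\, g^2 f^{abx} f^{xcd} A^a_\mu A^b_\nu A^{\mu c} A^{\nu d}, \tag{B.51} \]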
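where $\slashed{A} = A^a_\mu \gamma^\mu t^a$. This gives rise to the following Feynman rules, listed here with the external line, propagator or vertex each one is attached to:

– fermion line, momentum p, color i, spin s (initial): $u^s_i(p)$ (B.52a)
– fermion line, momentum p, color i, spin s (final): $\bar u^s_i(p)$ (B.52b)
– antifermion line, momentum p, color i, spin s (initial): $\bar v^s_i(p)$ (B.52c)
– antifermion line, momentum p, color i, spin s (final): $v^s_i(p)$ (B.52d)
– gluon line, momentum p, indices μ, a (initial): $\varepsilon^a_\mu(p)$ (B.52e)
– gluon line, momentum p, indices μ, a (final): $\varepsilon^{*a}_\mu(p)$ (B.52f)
– fermion propagator, momentum p, colors i → j:
\[ \frac{i\,\delta_{ij}\,(\slashed{p} + m)}{p^2 - m^2 + i\epsilon} \tag{B.52g} \]
– gluon propagator, momentum k, colors a, b (Lorentz gauge):
\[ \frac{-i\,\delta^{ab}}{k^2 + i\varepsilon}\left[g^{\mu\nu} - (1 - \xi)\,\frac{k^\mu k^\nu}{k^2}\right] \tag{B.52h} \]
– gluon propagator, momentum k, colors a, b (LC gauge):
\[ \frac{-i\,\delta^{ab}}{k^2 + i\varepsilon}\left[g^{\mu\nu} - 2\,\frac{k^{(\mu} n^{\nu)}}{k\cdot n} + \xi\,\frac{k^2\, k^\mu k^\nu}{(k\cdot n)^2}\right] \tag{B.52i} \]
– quark-gluon vertex, fermion colors i → j, gluon indices μ, a:
\[ -i g\, \gamma^\mu\, (t^a)_{ji} \tag{B.52j} \]
– three-gluon vertex, with legs (μ, a), (ν, b), (ρ, c) carrying momenta k, p, q respectively:
\[ -g f^{abc}\left[\, g^{\mu\nu}(k - p)^\rho + g^{\nu\rho}(p - q)^\mu + g^{\rho\mu}(q - k)^\nu \,\right] \tag{B.52k} \]
– four-gluon vertex, with legs (μ, a), (ν, b), (ρ, c), (σ, d):
\[ -i g^2\left[\, f^{abx} f^{xcd}\left(g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho}\right) - f^{acx} f^{xbd}\left(g^{\mu\sigma} g^{\nu\rho} - g^{\mu\nu} g^{\rho\sigma}\right) + f^{adx} f^{xbc}\left(g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma}\right) \right]. \tag{B.52l} \]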
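The sum over gluon polarization states depends on the gauge, and equals
\[ \sum_{\text{pol}} \varepsilon^\mu(k)\, \varepsilon^{*\nu}(k) = -g^{\mu\nu} \qquad \text{(Lorentz)} \tag{B.53a} \]
\[ \sum_{\text{pol}} \varepsilon^\mu(k)\, \varepsilon^{*\nu}(k) = -g^{\mu\nu} + \frac{2\, k^{(\mu} n^{\nu)}}{n\cdot k} \qquad \text{(LC)} \tag{B.53b} \]
where the light-cone gauge is defined by $n_-\cdot A = A^+ = 0$.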
C Color algebra

C.1 Basics

C.1.1 Representations

Let us review some basic color algebra. As is well known, the group which governs QCD is SU(3), but for the sake of generality we list some basic rules and derive some properties for SU(N). The latter is fully defined by $d_A = N^2 - 1$ linearly independent Hermitian generators $t^a$ and their commutation relations
\[ [t^a, t^b] = i f^{abc}\, t^c, \tag{C.1} \]
where the $f^{abc}$ are real and fully antisymmetric constants (the so-called structure constants). In practice we will work with representations of the algebra, where the generators are represented by $d_R \times d_R$ Hermitian matrices, with $d_R$ the dimension of the representation. Two representations are of particular interest. The first is the fundamental representation, which has dimension $d_F = N$ and forms a complete basis for the algebra if complemented with the identity matrix: $(\mathbb{1}, t^a)$. The second is the adjoint representation, which is constructed from the structure constants, $(T^a)^{bc} = -i f^{abc}$, and has dimension $d_A = N^2 - 1$. We will make the distinction in notation by writing the fundamental with lowercase t and the adjoint with uppercase T. Note that in the literature several different notations exist (for instance $t_F$ and $t_A$).
C.1.2 Properties

All matrices are traceless in every representation: $\operatorname{tr}(t_R^a) = 0$. The trace of a product of two matrices is zero if they are different:
\[ \operatorname{tr}(t_R^a t_R^b) = D_R\, \delta^{ab}. \tag{C.2} \]
$D_R$ is a constant depending on the representation. By convention, the normalization is almost always fixed by $D_F = \frac{1}{2}$ in the fundamental representation. Summing all squared matrices gives an operator that commutes with all others, the so-called Casimir operator:
\[ t_R^a t_R^a = C_R\, \mathbb{1}. \tag{C.3} \]
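Again, $C_R$ is a constant depending on the representation. Both constants can be easily related:
\[ t_R^a t_R^a = C_R\, \mathbb{1} \;\Rightarrow\; \operatorname{tr}(t_R^a t_R^a) = C_R \operatorname{tr}(\mathbb{1}) = C_R\, d_R, \]
\[ \operatorname{tr}(t_R^a t_R^b) = D_R\, \delta^{ab} \;\Rightarrow\; \operatorname{tr}(t_R^a t_R^a) = D_R\, \delta^{aa} = D_R\, d_A, \]
\[ \Rightarrow\quad \frac{C_R}{d_A} = \frac{D_R}{d_R}. \tag{C.4} \]
Let us list the constants for the fundamental and the adjoint representation:
\[ D_F = \frac{1}{2}, \qquad D_A = 2 D_F\, d_F = N, \]
\[ C_F = D_F\, \frac{d_A}{d_F} = \frac{N^2 - 1}{2N}, \qquad C_A = D_A = N, \]
\[ d_F = N, \qquad d_A = d_F^2 - 1 = N^2 - 1. \]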
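The constants listed above can be tabulated for any N with exact rational arithmetic. The following is a minimal sketch; the function name `color_constants` and the dictionary keys are illustrative choices, and the only input beyond N is the convention $D_F = \frac{1}{2}$ stated above.

```python
from fractions import Fraction

def color_constants(n):
    """Return the SU(n) constants of the fundamental (F) and adjoint (A) representations."""
    d_f = n                        # dimension of the fundamental representation
    d_a = n * n - 1                # dimension of the adjoint representation
    big_d_f = Fraction(1, 2)       # normalization convention D_F = 1/2
    big_d_a = 2 * big_d_f * d_f    # D_A = 2 D_F d_F = N
    c_f = big_d_f * d_a / d_f      # C_F = D_F d_A / d_F, from (C.4)
    c_a = big_d_a                  # C_A = D_A, since d_R = d_A for the adjoint
    return {"d_F": d_f, "d_A": d_a, "D_F": big_d_f,
            "D_A": big_d_a, "C_F": c_f, "C_A": c_a}

qcd = color_constants(3)
```

For QCD (N = 3) this gives $C_F = 4/3$, $C_A = 3$ and $d_A = 8$, and relation (C.4) can be checked directly as `qcd["C_F"] / qcd["d_A"] == qcd["D_F"] / qcd["d_F"]`.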
These are the only properties that are representation independent. Because the fundamental representation forms a complete basis, we can derive additional properties that are not valid in other representations. First of all, the anticommutator has to be an element of the algebra, and thus a linear combination of the identity and the generators:
\[ \{t^a, t^b\} = \frac{1}{N}\, \delta^{ab}\, \mathbb{1} + d^{abc}\, t^c. \tag{C.5} \]
The constant in front of the identity was calculated by taking the trace and comparing to equation (C.2), while $d^{abc}$ can be retrieved, as well as $f^{abc}$, from
\[ f^{abc} = -\frac{i}{D_R} \operatorname{tr}\left([t_R^a, t_R^b]\, t_R^c\right), \tag{C.6a} \]
\[ d^{abc} = 2 \operatorname{tr}\left(\{t^a, t^b\}\, t^c\right). \tag{C.6b} \]
It is easy to check that the $d^{abc}$ are fully symmetric and that they vanish when contracting any two indices: $d^{aab} = d^{baa} = d^{aba} = 0$. By combining the commutation rules with the anticommutation rules we can find another useful property:
\[ t^a t^b = \frac{\{t^a, t^b\} + [t^a, t^b]}{2} = \frac{1}{2N}\, \delta^{ab}\, \mathbb{1} + \frac{1}{2}\, h^{abc}\, t^c, \tag{C.7} \]
where we defined
\[ h^{abc} = d^{abc} + i f^{abc}. \tag{C.8} \]
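$h^{abc}$ is Hermitian and cyclic in its indices:
\[ \bar h^{abc} = h^{bac} = h^{cba} = h^{acb}, \]
\[ h^{abc} = h^{bca} = h^{cab}, \]
\[ h^{aab} = h^{baa} = h^{aba} = 0. \]
A last useful property is the Fierz identity
\[ (t^a)_{\alpha\beta}\, (t^a)_{\gamma\delta} = \frac{1}{2}\, \delta_{\alpha\delta}\, \delta_{\beta\gamma} - \frac{1}{2N}\, \delta_{\alpha\beta}\, \delta_{\gamma\delta}. \tag{C.9} \]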
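It is straightforward to prove this identity; first we write a general element of the fundamental representation as
\[ X = c_0\, \mathbb{1} + i c_a t^a, \tag{C.10} \]
where $c_0$ and $c_a$ are easily calculated:
\[ c_0 = \frac{1}{N} \operatorname{tr}(X), \qquad c_a = -2i \operatorname{tr}(X t^a). \]
We then get the requested result by calculating $\partial X_{\alpha\beta} / \partial X_{\gamma\delta} = \delta_{\alpha\gamma}\, \delta_{\beta\delta}$. The Fierz identity is especially handy to rearrange traces containing contractions:
\[ \operatorname{tr}(A\, t^a B\, t^a C) = \frac{1}{2} \operatorname{tr}(AC) \operatorname{tr}(B) - \frac{1}{2N} \operatorname{tr}(ABC), \tag{C.11a} \]
\[ \operatorname{tr}(t^a B\, t^a) = C_F \operatorname{tr}(B), \tag{C.11b} \]
\[ \operatorname{tr}(A\, t^a B) \operatorname{tr}(C\, t^a D) = \frac{1}{2} \operatorname{tr}(ADCB) - \frac{1}{2N} \operatorname{tr}(AB) \operatorname{tr}(CD), \tag{C.11c} \]
where A, B, C, D are expressions built from $t_F^a$'s.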
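The Fierz identity (C.9) and the trace rule (C.11b) are easy to verify numerically for N = 3. The sketch below assumes the standard Gell-Mann matrices $\lambda^a$ with $t^a = \lambda^a/2$ as the explicit fundamental generators; the text does not fix a particular basis, so this choice and all function names are illustrative.

```python
import math

def gell_mann():
    """The eight Gell-Mann matrices as nested lists (an assumed explicit basis)."""
    s = 1 / math.sqrt(3)
    i = 1j
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -i], [0, i, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]

N = 3
# Fundamental generators t^a = lambda^a / 2.
T = [[[x / 2 for x in row] for row in lam] for lam in gell_mann()]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def normalization_holds(tol=1e-12):
    """Check tr(t^a t^b) = delta^{ab} / 2, i.e. (C.2) with D_F = 1/2."""
    return all(abs(trace(matmul(T[a], T[b])) - (0.5 if a == b else 0.0)) < tol
               for a in range(8) for b in range(8))

def fierz_holds(tol=1e-12):
    """Check (C.9) for every choice of the four fundamental indices."""
    for al in range(N):
        for be in range(N):
            for ga in range(N):
                for de in range(N):
                    lhs = sum(t[al][be] * t[ga][de] for t in T)
                    rhs = (0.5 * (al == de) * (be == ga)
                           - (al == be) * (ga == de) / (2 * N))
                    if abs(lhs - rhs) > tol:
                        return False
    return True

def casimir_trace_holds(b, tol=1e-12):
    """Check (C.11b): tr(t^a B t^a) = C_F tr(B) for a 3x3 matrix B."""
    c_f = (N * N - 1) / (2 * N)
    lhs = sum(trace(matmul(matmul(t, b), t)) for t in T)
    return abs(lhs - c_f * trace(b)) < tol
```

A call such as `casimir_trace_holds([[1, 2, 0], [0, 1j, 3], [4, 0, -2]])` exercises (C.11b) on an arbitrary matrix; the check does not require B to be built from generators, because $t^a t^a$ is proportional to the identity.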
C.2 Advanced topics

C.2.1 Calculating products of fundamental generators

We would like to find an expression for a general product of fundamental generators like $t^{a_1} t^{a_2} \cdots t^{a_n}$. As we did for the Fierz identity in (C.10), we can write this product in terms of the basis for the fundamental representation:
\[ t^{a_1} t^{a_2} \cdots t^{a_n} = A^{a_1 a_2 \cdots a_n}\, \mathbb{1} + B^{a_1 a_2 \cdots a_n b}\, t^b, \]
and we can calculate the color factors by tracing:
\[ \operatorname{tr}(t^{a_1} t^{a_2} \cdots t^{a_n}) = N A^{a_1 a_2 \cdots a_n}, \]
\[ \operatorname{tr}(t^{a_1} t^{a_2} \cdots t^{a_n} t^c) = \frac{1}{2}\, B^{a_1 a_2 \cdots a_n c}. \]
But the latter can also be calculated as one order higher, giving
\[ \operatorname{tr}(t^{a_1} t^{a_2} \cdots t^{a_n} t^c) = N A^{a_1 a_2 \cdots a_n c} \quad\Rightarrow\quad A^{a_1 a_2 \cdots a_n} \equiv \frac{1}{2N}\, B^{a_1 a_2 \cdots a_n}. \]
Only one of these is linearly independent. We will adopt the notation
\[ A^{a_1 a_2 \cdots a_n} \equiv \frac{1}{N}\, C^{a_1 a_2 \cdots a_n}, \tag{C.12} \]
with C standing for 'color factor':
\[ t^{a_1} \cdots t^{a_n} = \frac{1}{N}\, C^{a_1 \cdots a_n}\, \mathbb{1} + 2\, C^{a_1 \cdots a_n b}\, t^b, \tag{C.13a} \]
\[ C^{a_1 \cdots a_n} = \operatorname{tr}(t^{a_1} \cdots t^{a_n}). \tag{C.13b} \]
The color factor has the same properties as the trace, namely cyclicity and Hermiticity:
\[ C^{a_1 a_2 \cdots a_n} = C^{a_2 a_3 \cdots a_n a_1} = \ldots, \tag{C.14a} \]
\[ \bar C^{a_1 a_2 \cdots a_n} = C^{a_n \cdots a_2 a_1}. \tag{C.14b} \]
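The first color factors are straightforward to calculate:
\[ C_0 = \operatorname{tr}(\mathbb{1}) = N, \tag{C.15a} \]
\[ C_1^{a} = 0, \tag{C.15b} \]
\[ C_2^{ab} = \frac{1}{2}\, \delta^{ab}, \tag{C.15c} \]
\[ C_3^{abc} = \frac{1}{4}\, h^{abc}. \tag{C.15d} \]
To calculate the higher orders, we use equation (C.7) to deduce a recursion formula for traces (and thus color factors) in the fundamental representation:
\[ \operatorname{tr}(t^{a_1} \cdots t^{a_n}) = \frac{\delta^{a_{n-1} a_n}}{2N} \operatorname{tr}(t^{a_1} \cdots t^{a_{n-2}}) + \frac{h^{a_{n-1} a_n b}}{2} \operatorname{tr}(t^{a_1} \cdots t^{a_{n-2}} t^b), \tag{C.16a} \]
\[ C^{a_1 \cdots a_n} = \frac{\delta^{a_{n-1} a_n}}{2N}\, C^{a_1 \cdots a_{n-2}} + \frac{h^{a_{n-1} a_n b}}{2}\, C^{a_1 \cdots a_{n-2} b}. \tag{C.16b} \]
This gives, for instance,
\[ C_4^{a_1 a_2 a_3 a_4} = \frac{1}{4N}\, \delta^{a_1 a_2} \delta^{a_3 a_4} + \frac{1}{8}\, h^{a_1 a_2 b}\, h^{b a_3 a_4}, \]
\[ C_5^{a_1 a_2 a_3 a_4 a_5} = \frac{1}{8N}\left(h^{a_1 a_2 a_3} \delta^{a_4 a_5} + \delta^{a_1 a_2} h^{a_3 a_4 a_5}\right) + \frac{1}{16}\, h^{a_1 a_2 b_1} h^{b_1 a_3 b_2} h^{b_2 a_4 a_5}, \]
\[ C_6^{a_1 a_2 a_3 a_4 a_5 a_6} = \frac{1}{8N^2}\, \delta^{a_1 a_2} \delta^{a_3 a_4} \delta^{a_5 a_6} + \frac{1}{32}\, h^{a_1 a_2 b_1} h^{b_1 a_3 b_2} h^{b_2 a_4 b_3} h^{b_3 a_5 a_6} + \frac{1}{16N}\left(h^{a_1 a_2 b} h^{b a_3 a_4} \delta^{a_5 a_6} + h^{a_1 a_2 a_3} h^{a_4 a_5 a_6} + \delta^{a_1 a_2} h^{a_3 a_4 b} h^{b a_5 a_6}\right). \]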
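One extremely useful property is the fact that inner summations only appear between consecutive h's, and never with a δ. This allows us to define the following shorthand notation:
\[ \delta \equiv \delta^{a_i a_{i+1}}, \]
\[ h \equiv h^{a_i a_{i+1} a_{i+2}}, \]
\[ hh \equiv h^{a_i a_{i+1} b}\, h^{b a_{i+2} a_{i+3}}, \]
\[ hhh \equiv h^{a_i a_{i+1} b_1}\, h^{b_1 a_{i+2} b_2}\, h^{b_2 a_{i+3} a_{i+4}}, \]
\[ \cdots \]
Here juxtaposed h's (hh, hhh, ...) form a contracted chain, while factors separated by a thin space are not contracted with each other. This allows us to rewrite the former as (note that the order of the δ's and h's is significant):
\[ C_2 = \frac{1}{2}\, \delta, \tag{C.17a} \]
\[ C_3 = \frac{1}{4}\, h, \tag{C.17b} \]
\[ C_4 = \frac{1}{4N}\, \delta\,\delta + \frac{1}{8}\, hh, \tag{C.17c} \]
\[ C_5 = \frac{1}{8N}\left(h\,\delta + \delta\,h\right) + \frac{1}{16}\, hhh, \tag{C.17d} \]
\[ C_6 = \frac{1}{8N^2}\, \delta\,\delta\,\delta + \frac{1}{16N}\left(hh\,\delta + h\,h + \delta\,hh\right) + \frac{1}{32}\, hhhh. \tag{C.17e} \]
If we generalize this to an n-th order trace, we get from equation (C.16):
\[ n \text{ even:} \quad C^{a_1 \cdots a_n} = \sum_{i=0}^{\frac{n}{2}-1} \frac{1}{2^{\frac{n}{2}+i}\, N^{\frac{n}{2}-i-1}} \left(\text{all allowed } \delta\text{–}h \text{ combinations built from } 2i\ h\text{'s}\right), \]
\[ n \text{ odd:} \quad C^{a_1 \cdots a_n} = \sum_{i=0}^{\frac{n-3}{2}} \frac{1}{2^{\frac{n+1}{2}+i}\, N^{\frac{n-1}{2}-i-1}} \left(\text{all allowed } \delta\text{–}h \text{ combinations built from } 2i+1\ h\text{'s}\right), \tag{C.18} \]
where the δ–h combinations need to have n open indices, using
– δ: 2 open indices,
– h: 3 open indices,
– hh: 4 open indices,
– hhh: 5 open indices,
– ⋅⋅⋅
and where it is forbidden to put any δ or h in-between two contracted h's. Thus, for instance, contracting the two outer h's across the middle factor in $h\,\delta\,h$ or in $h\,h\,h$ is not allowed. With this in mind, we can tackle any trace without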
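the need for recursive calculations. For instance:
\[ C^{a_1 \cdots a_{10}} = \frac{1}{32N^4}\, \delta\,\delta\,\delta\,\delta\,\delta \]
\[ + \frac{1}{64N^3}\big(hh\,\delta\,\delta\,\delta + \delta\,hh\,\delta\,\delta + \delta\,\delta\,hh\,\delta + \delta\,\delta\,\delta\,hh + h\,h\,\delta\,\delta + h\,\delta\,h\,\delta + h\,\delta\,\delta\,h + \delta\,h\,h\,\delta + \delta\,h\,\delta\,h + \delta\,\delta\,h\,h\big) \]
\[ + \frac{1}{128N^2}\big(hhhh\,\delta\,\delta + \delta\,hhhh\,\delta + \delta\,\delta\,hhhh + hhh\,h\,\delta + hhh\,\delta\,h + \delta\,hhh\,h + h\,hhh\,\delta + h\,\delta\,hhh + \delta\,h\,hhh + hh\,hh\,\delta + hh\,\delta\,hh + \delta\,hh\,hh + hh\,h\,h + h\,hh\,h + h\,h\,hh\big) \]
\[ + \frac{1}{256N}\big(hhhhhh\,\delta + \delta\,hhhhhh + h\,hhhhh + hhhhh\,h + hhhh\,hh + hh\,hhhh + hhh\,hhh\big) + \frac{1}{512}\, hhhhhhhh. \]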
As a consequence of (C.16), we can use a trick to double-check our result, namely that the total number of terms should equal the (n − 1)-th Fibonacci number (counting 0 as the zeroth Fibonacci number). Thus indeed, for the 10-th order trace we have 34 terms.
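The counting rule can be made concrete by noting that every allowed δ–h combination is an ordered way of splitting the n open indices into blocks of size at least two: a block of size 2 is a δ, and a block of size m + 2 is a chain of m contracted h's. The sketch below enumerates these blocks. It is an illustration rather than part of the original derivation; the closed-form prefactor used in it, $1/(2^{k+H} N^{k-1})$ for a term with k blocks and H factors of h, is obtained here by rewriting the exponents of (C.18) with n = 2k + H, and all function names are hypothetical.

```python
def compositions(n, min_part=2):
    """Yield all ordered tuples of integers >= min_part that sum to n."""
    if n == 0:
        yield ()
        return
    for first in range(min_part, n + 1):
        for rest in compositions(n - first, min_part):
            yield (first,) + rest

def block_label(part):
    """A part of size 2 is a delta; a part of size m + 2 is a chain of m contracted h's."""
    return "δ" if part == 2 else "h" * (part - 2)

def color_factor_terms(n):
    """Return (power of 1/2, power of 1/N, label) for each term of the n-th order color factor."""
    terms = []
    for comp in compositions(n):
        blocks = len(comp)
        num_h = n - 2 * blocks  # each block carries 2 indices plus one per h
        label = " ".join(block_label(p) for p in comp)
        terms.append((blocks + num_h, blocks - 1, label))
    return terms

def fibonacci(k):
    """k-th Fibonacci number, counting 0 as the zeroth."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

tenth_order = color_factor_terms(10)
```

For n = 4 the function returns the two terms of (C.17c), with prefactors $1/(2^2 N)$ and $1/2^3$, and `len(tenth_order)` agrees with `fibonacci(9)`. In the labels, blocks are separated by a space, so `"h h"` denotes two uncontracted h's and `"hh"` a contracted pair.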
C.2.2 Calculating traces in the adjoint representation

It is not so straightforward to calculate general n-th order traces in the adjoint representation, because we cannot easily calculate the anticommutation relations, which we need to get a recursion relation as in (C.16). We will not attempt to do so; instead we will relate traces in the adjoint representation to traces in the fundamental using a nifty trick. First, note that in general
\[ F \otimes \bar F \simeq A \oplus 1, \tag{C.19} \]
from which we can derive ($U_A$ denotes 'the group element U expressed in the adjoint representation'):
\[ \operatorname{tr}(U_A) = \operatorname{tr}(U_F) \operatorname{tr}(\bar U_F) - 1. \tag{C.20} \]
Indeed, if we take $U = \mathbb{1}$, we get $d_A = d_F^2 - 1$. To calculate the n-th order trace, it is sufficient to take $U = \prod_{i=1}^{n} e^{t_R^{a} \alpha_i^{a}}$, expand it and compare terms of the same order in $\alpha_i$. Furthermore we can use $\bar U_F = U_F^\dagger = U_F^{-1}$:
\[ \overline{e^{t^{a_1} \alpha_1^{a_1}} \cdots e^{t^{a_n} \alpha_n^{a_n}}} = e^{-t^{a_n} \alpha_n^{a_n}} \cdots e^{-t^{a_1} \alpha_1^{a_1}}. \]
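Relation (C.20) can be tested numerically in the simplest case, SU(2). The sketch below makes several assumptions that are not in the text: it uses Pauli matrices with $t^a = \sigma^a/2$, it parametrizes a unitary group element as $U_F = \cos(\theta/2) - i \sin(\theta/2)\, \hat n\cdot\sigma$ (a different parametrization from the exponential above), and it takes the adjoint matrix to be $(U_A)^{ab} = 2 \operatorname{tr}(t^a U_F\, t^b U_F^\dagger)$.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]  # Pauli matrices; the SU(2) generators are t^a = sigma^a / 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def su2_element(theta, axis):
    """U_F = cos(theta/2) 1 - i sin(theta/2) n.sigma, with n the normalized axis."""
    norm = math.sqrt(sum(x * x for x in axis))
    n = [x / norm for x in axis]
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (i == j) - 1j * s * sum(n[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]

def adjoint_trace(u):
    """tr(U_A) = sum_a 2 tr(t^a U t^a U^dagger) = sum_a tr(sigma^a U sigma^a U^dagger) / 2."""
    ud = dagger(u)
    return sum(0.5 * trace(matmul(matmul(matmul(s, u), s), ud)) for s in SIGMA)

def fundamental_side(u):
    """Right-hand side of (C.20), using tr(U_F-bar) = conj(tr(U_F))."""
    t = trace(u)
    return t * t.conjugate() - 1

u = su2_element(0.7, [1.0, 2.0, -0.5])
difference = abs(adjoint_trace(u) - fundamental_side(u))
```

Both sides reduce to $1 + 2\cos\theta$ for a rotation angle θ, the familiar trace of a three-dimensional rotation matrix, which gives an independent cross-check of the result.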
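For example, the fourth-order trace in the adjoint can be calculated as follows:
\[ \operatorname{tr}\left(e^{T^a \alpha_1^a} e^{T^b \alpha_2^b} e^{T^c \alpha_3^c} e^{T^d \alpha_4^d}\right) = \operatorname{tr}\left(e^{t^a \alpha_1^a} e^{t^b \alpha_2^b} e^{t^c \alpha_3^c} e^{t^d \alpha_4^d}\right) \operatorname{tr}\left(e^{-t^d \alpha_4^d} e^{-t^c \alpha_3^c} e^{-t^b \alpha_2^b} e^{-t^a \alpha_1^a}\right) - 1, \]
\[ \alpha_1^a \alpha_2^b \alpha_3^c \alpha_4^d \operatorname{tr}(T^a T^b T^c T^d) = \alpha_1^a \alpha_2^b \alpha_3^c \alpha_4^d \big[\, N \operatorname{tr}(t^a t^b t^c t^d) + N \operatorname{tr}(t^d t^c t^b t^a) + 2 \operatorname{tr}(t^a t^b) \operatorname{tr}(t^c t^d) + 2 \operatorname{tr}(t^a t^c) \operatorname{tr}(t^b t^d) + 2 \operatorname{tr}(t^a t^d) \operatorname{tr}(t^b t^c) \big]. \]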
Using this trick we can calculate any trace in the adjoint representation in terms of traces in the fundamental representation. Also note that we can derive equations similar to (C.20) using different representation combinations. For example, in SU(3) we have $3 \otimes 3 \simeq 6 \oplus \bar 3$, implying $\operatorname{tr}(U_6) = \operatorname{tr}(U_F) \operatorname{tr}(U_F) - \operatorname{tr}(\bar U_F)$. Now back to the adjoint generators. We could generalize their trace as
\[ \operatorname{tr}(T^{a_1} \cdots T^{a_n}) = N\left(\operatorname{tr}(t^{a_1} \cdots t^{a_n}) + (-)^n\, \overline{\operatorname{tr}}(t^{a_1} \cdots t^{a_n})\right) + \sum_{m=2}^{n-2} (-)^{n-m}\, \frac{n!}{m!\,(n-m)!}\, \operatorname{tr}\left(t^{(a_1} \cdots t^{a_m |}\right) \overline{\operatorname{tr}}\left(t^{a_{m+1}} \cdots t^{a_n)_o}\right). \tag{C.21} \]
We introduced two new notations: first we have the 'conjugated' trace, which is simply the trace in reversed order:
\[ \overline{\operatorname{tr}}(t^{a_1} \cdots t^{a_n}) = \operatorname{tr}(t^{a_n} \cdots t^{a_1}). \]
The only thing that changes when reversing a trace of fundamental generators is that every h gets replaced by its complex conjugate $\bar h$ (hence the notation $\overline{\operatorname{tr}}$). The result can be simplified further using relations like $h - \bar h = 2i f$, $hh + \bar h \bar h = 2\,(dd - ff)$, etc. The second notation we introduced, $(\ |\ )_o$, is an 'ordered' symmetrization which for a general tensor M is defined as
\[ M^{(a_1 \cdots a_m | a_{m+1} \cdots a_n)_o} = \frac{m!\,(n-m)!}{n!}\left(M^{a_1 \cdots a_n} + \text{all permutations for which both the first } m \text{ indices and the last } n - m \text{ indices are ordered with respect to } (a_1 \cdots a_n)\right). \tag{C.22} \]
For instance,
\[ M^{(ab |} N^{cd)_o} = \frac{1}{6}\left(M^{ab} N^{cd} + M^{ac} N^{bd} + M^{ad} N^{bc} + M^{bc} N^{ad} + M^{bd} N^{ac} + M^{cd} N^{ab}\right). \]
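One handy property is that when A and B are commutative, we have
\[ A^{(a_1 \cdots a_m |} B^{a_{m+1} \cdots a_n)_o} = B^{(a_1 \cdots a_{n-m} |} A^{a_{n-m+1} \cdots a_n)_o}, \]
or (in our shorthand notation) for instance $(\delta\,|\,h)_o = (h\,|\,\delta)_o$. To conclude, let us list some traces:¹
\[ \operatorname{tr}(T^{a_1} T^{a_2}) = N \delta, \tag{C.23a} \]
\[ \operatorname{tr}(T^{a_1} T^{a_2} T^{a_3}) = \frac{N}{4}\left(h - \bar h\right), \tag{C.23b} \]
\[ \operatorname{tr}(T^{a_1} T^{a_2} T^{a_3} T^{a_4}) = \frac{1}{2}\left(\delta\,\delta + 3\,(\delta\,\delta)\right) + \frac{N}{8}\left(hh + \bar h \bar h\right), \tag{C.23c} \]
\[ \operatorname{tr}(T^{a_1} T^{a_2} T^{a_3} T^{a_4} T^{a_5}) = \frac{1}{8}\left[(h - \bar h)\,\delta + \delta\,(h - \bar h) + 10\,\big(\delta\,|\,(h - \bar h)\big)_o\right] + \frac{N}{16}\left(hhh - \bar h \bar h \bar h\right), \tag{C.23d} \]
\[ \operatorname{tr}(T^{a_1} T^{a_2} T^{a_3} T^{a_4} T^{a_5} T^{a_6}) = \frac{1}{4N}\left(\delta\,\delta\,\delta + 15\,(\delta\,\delta\,\delta)\right) + \frac{1}{16}\left[(hh + \bar h \bar h)\,\delta + (h\,h + \bar h\,\bar h) + \delta\,(hh + \bar h \bar h) + 15\,\big(\delta\,|\,(hh + \bar h \bar h)\big)_o - 20\,\big(h\,|\,\bar h\big)_o\right] + \frac{N}{32}\left(hhhh + \bar h \bar h \bar h \bar h\right). \tag{C.23e} \]

¹ Note that $(\delta\,|\,\delta)_o = (\delta\,\delta)$, for any number of δ's.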
D Brief literature guide

We do not intend to give any kind of review of the (huge) existing literature. Our list of references by no means pretends to be complete; it certainly misses a number of important (or even crucial) works on the subject. We list, however, all research papers, reviews and books whose results we directly used in our exposition. Below we give a very brief guide to these works according to the main issues considered in them.

1. General questions: [1–10]
2. Gauge theory and the principal fiber bundle approach: [10, 15–18, 20]
3. Product integrals: [21–23]
4. Topology: [16, 17, 24]
5. Manifolds: [15–18, 25–27]
6. Algebra: [28–33]
7. Topological algebra: [31]
8. Algebraic paths: [34–38]
9. Loop space: [19, 38–56]
10. Mandelstam constraints: [12, 44, 45, 54]
11. Gauge invariance in particle physics: [9–11, 14, 59–76]
Bibliography

[1] N. N. Bogolyubov and D. V. Shirkov, Introduction to the Theory of Quantized Fields, Intersci. Monogr. Phys. Astron. 3, 1 (1959).
[2] N. N. Bogoliubov, A. A. Logunov, A. I. Oksak, and I. T. Todorov, General Principles of Quantum Field Theory, Dordrecht Boston, Kluwer Academic Publishers, 1990.
[3] L. D. Faddeev and A. A. Slavnov, Gauge Fields: Introduction to Quantum Theory, Westview Press, 1991.
[4] N. P. Konopleva and V. N. Popov, Gauge Fields, Chur, Harwood, 1981.
[5] M. B. Mensky, Group of Paths: Measurements, Fields, Particles, Moscow, Nauka, 1983 [in Russian].
[6] A. S. Schwarz, Mathematical Foundations of Quantum Field Theory, Moscow, Atomizdat, 1975 [in Russian].
[7] M. E. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory, Advanced Book Classics, Boulder, Colorado, Westview Press, 1995.
[8] R. C. Field, Applications of Perturbative QCD, Advanced Book Classics, Redwood City, California, Addison-Wesley Publishing Company, 1989.
[9] J. Collins, Foundations of perturbative QCD, Cambridge monographs on particle physics, nuclear physics and cosmology 32, Cambridge, Cambridge University Press, 2011.
[10] T. P. Cheng and L. F. Li, Gauge Theory of Elementary Particle Physics, Oxford Science Publications, Oxford New York, Clarendon Press, 1984.
[11] J. S. Schwinger, Gauge Invariance and Mass, 2, Phys. Rev. 128 (1962), 2425.
[12] S. Mandelstam, Quantum electrodynamics without potentials, Ann. Phys. 19 (1962), 1–24.
[13] S. Mandelstam, Feynman rules for electromagnetic and Yang-Mills fields from the gauge-independent field-theoretic formalism, Phys. Rev. 175 (1968), 1580–1603.
[14] K. G. Wilson, Confinement of Quarks, Phys. Rev. D 10 (1974), 2445.
[15] T. Thiemann, Modern Canonical Quantum General Relativity, Cambridge Monographs on Mathematical Physics, Cambridge, Cambridge University Press, 2007.
[16] T. Frankel, The Geometry of Physics: An Introduction, Cambridge, Cambridge University Press, 2011.
[17] M. Nakahara, Geometry, Topology and Physics, Graduate Student Series in Physics, Oxon, Taylor & Francis, 2003.
[18] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Interscience Tracts in Pure and Applied Mathematics, Interscience Publishers, 1963.
[19] Y. M. Makeenko, Methods of Contemporary Gauge Theory, Cambridge, Cambridge University Press, 2002.
[20] A. Guay, Geometrical aspects of local gauge symmetry, http://philsciarchive.pit.edu/id/eprint/2133, Pittsburg, 2004.
[21] V. Volterra, Sui fondamenti della teoria delle equazioni differenziali lineari, I, Mem. Soc. Ital. Sci. 6(3), (1877), 1–104.
[22] R. P. Feynman, An Operator calculus having applications in quantum electrodynamics, Phys. Rev. 84 (1951), 108–128.
[23] A. Slavík, Product Integration, Its History and Applications, History of mathematics, Prague, Matfyzpress, 2007.
[24] S. Willard, General Topology, Addison-Wesley Series in Mathematics, New York, Addison-Wesley, 2004.
[25] S. Lang, Introduction to Differentiable Manifolds, Universitext Series, New Haven, Springer, 2002.
[26] L. W. Tu, An Introduction to Manifolds, Universitext Series, New York, Springer, 2010.
[27] R. Hermann, Differential geometry and the calculus of variations, Interdisciplinary mathematics, New York Math, Sci, Press, 1977.
[28] M. Artin, Algebra, New Jersey, Prentice Hall, 1991.
[29] J. S. Milne, Algebraic Geometry: V5.0, Taiaroa Publishing, 2005.
[30] K. Brown, Hopf Algebras, Lecture Notes, University of Glasgow.
[31] E. Beckenstein, L. Narici and C. Suffel, Topological algebras, Notas de matemática 24, New York, Elsevier Science, 1977.
[32] D. P. Williams, Lecture Notes on C∗-algebras, Department of Mathematics, Dartmouth College, 2011.
[33] E. Abe, Hopf Algebras, Cambridge, Cambridge University Press, 1977.
[34] K. T. Chen, Algebraic paths, J. Alg. 9 (1968), 8–36.
[35] K. T. Chen, Iterated integrals and exponential homomorphisms, Proc. London Math. Soc. s3-4 (1954), 502–512.
[36] K. T. Chen, Algebras of iterated path integrals and fundamental groups, Tr. Am. Math. Soc. 156 (1971), 359–379.
[37] K. T. Chen, Integration of paths: A faithful representation of paths by noncommutative formal power series, Tr. Am. Math. Soc. 89 (1958), 395–407.
[38] J. N. Tavares, Chen integrals, generalized loops and loop calculus, Int. J. Mod. Phys. A 9 (1994), 4511–4548.
[39] H.-M. Chan and S. T. Tsou, Gauge Theories in Loop Space, Acta Phys. Polon. B 17 (1986), 259.
[40] C. Di Bartolo, R. Gambini, and J. Griego, The Extended loop group: An Infinite dimensional manifold associated with the loop space, Commun. Math. Phys. 158 (1993), 217–240.
[41] R. Gambini and J. Pullin, Loops, Knots, Gauge Theories and Quantum Gravity, Cambridge Monographs on Mathematical Physics, Cambridge, Cambridge University Press, 1996.
[42] I. Arefeva, Non-Abelian Stokes formula, Theor. Math. Phys. 43 (1980), 353.
[43] I. Y. Arefeva, Quantum Contour Field Equations, Phys. Lett. B 93 (1980), 347–353.
[44] R. A. Brandt, F. Neri, and M.-A. Sato, Renormalization of Loop Functions for All Loops, Phys. Rev. D 24 (1981), 879.
[45] R. A. Brandt, A. Gocksch, M.-A. Sato, and F. Neri, Loop Space, Phys. Rev. D 26 (1982), 3611.
[46] G. P. Korchemsky and A. V. Radyushkin, Renormalization of the Wilson Loops Beyond the Leading Order, Nucl. Phys. B 283 (1987), 342–364.
[47] S. V. Ivanov, G. P. Korchemsky, and A. V. Radyushkin, Infrared Asymptotics of Perturbative QCD: Contour Gauges, Yad. Fiz. 44 (1986), 230–240.
[48] G. P. Korchemsky and A. V. Radyushkin, Loop Space Formalism and Renormalization Group for the Infrared Asymptotics of QCD, Phys. Lett. B 171 (1986), 459–467.
[49] Y. M. Makeenko and A. A. Migdal, Quantum Chromodynamics as Dynamics of Loops, Nucl. Phys. B 188 (1981), 269.
[50] Y. M. Makeenko and A. A. Migdal, Self-consistent Areas Law in QCD, Phys. Lett. B 97 (1980), 253.
[51] Y. M. Makeenko and A. A. Migdal, Exact Equation for the Loop Average in Multicolor QCD, Phys. Lett. B 88 (1979), 135.
[52] A. M. Polyakov, String Representations and Hidden Symmetries for Gauge Fields, Phys. Lett. B 82 (1979), 247–250.
[53] A. M. Polyakov, Gauge Fields as Rings of Glue, Nucl. Phys. B 164 (1980), 171.
[54] R. Giles, Reconstruction of gauge potentials from Wilson loops, Phys. Rev. D 24 (1981), 2160–2168.
[55] I. P. Zois, On Polyakov's basic variational formula for loop spaces, Rept. Math. Phys. 42 (1988), 373–384.
[56] R. Loll, Loop approaches to gauge field theory, Theor. Math. Phys. 93 (1992), 1415.
[57] U. Schreiber, Quantization via Linear homotopy types, arXiv:1402.7041 [math-ph].
[58] W. Ambrose and I. M. Singer, A Theorem on Holonomy, Tr. Am. Math. Soc. 75 (1953), 428–443.
[59] N. G. Stefanis, Gauge Invariant Quark Two-Point Green's Function through Connector Insertion to O(αs), Nuovo Cim. A 83 (1984), 205.
[60] I. I. Balitsky and V. M. Braun, Evolution Equations for QCD String Operators, Nucl. Phys. B 311 (1989), 541.
[61] A. V. Belitsky and A. V. Radyushkin, Unraveling Hadron Structure with Generalized Parton Distributions, Phys. Rept. 418 (2005), 1.
[62] V. Barone and E. Predazzi, High-Energy Particle Diffraction, Berlin, Springer, 2002.
[63] A. D. Martin, Proton Structure, Partons, QCD, DGLAP and Beyond, Acta Phys. Polon. B 39 (2008), 2025.
[64] D. Boer, M. Diehl, R. Milner, R. Venugopalan, W. Vogelsang, D. Kaplan, H. Montgomery, S. Vigdor, et al., Gluons and the quark sea at high energies: Distributions, polarization, tomography, arXiv:1108.1713 [nucl-th].
[65] A. Bacchetta, U. D'Alesio, M. Diehl, and C. A. Miller, Single-spin asymmetries: The Trento conventions, Phys. Rev. D 70 (2004), 117504.
[66] A. Bacchetta, Transverse Momentum Distributions, Lecture Notes for the Doctoral Training Programme, Trento, ECT, Trento, 2010.
[67] X.-D. Ji and F. Yuan, Parton distributions in light cone gauge: Where are the final state interactions?, Phys. Lett. B 543 (2002), 66–72.
[68] A. V. Belitsky, X.-D. Ji, and F. Yuan, Final state interactions and gauge invariant parton distributions, Nucl. Phys. B 656 (2003), 165–198.
[69] D. Boer, P. J. Mulders, and F. Pijlman, Universality of T-odd effects in single spin and azimuthal asymmetries, Nucl. Phys. B 667 (2003), 201–241.
[70] J. C. Collins, What exactly is a parton density?, Acta Phys. Polon. B 34 (2003), 3103.
[71] I. O. Cherednikov and N. G. Stefanis, Wilson lines and transverse-momentum dependent parton distribution functions: A Renormalization-group analysis, Nucl. Phys. B 802 (2008), 146–179.
[72] I. O. Cherednikov, A. I. Karanikas, and N. G. Stefanis, Wilson lines in transverse-momentum dependent parton distribution functions with spin degrees of freedom, Nucl. Phys. B 840 (2010), 379–404.
[73] C. J. Bomhof and P. J. Mulders, Non-universality of transverse momentum dependent parton distribution functions, Nucl. Phys. B 795 (2008), 409–427.
[74] N. G. Stefanis, Worldline techniques and QCD observables, Acta Phys. Polon. Supp. 6 (2013), 71–80.
[75] U. D'Alesio and F. Murgia, Azimuthal and Single Spin Asymmetries in Hard Scattering Processes, Prog. Part. Nucl. Phys. 61 (2008), 394.
[76] N. Brambilla, et al., QCD and strongly coupled gauge theories: challenges and perspectives, arXiv:1404.3723 [hep-ph].
Index
accumulation point, 188, 192 algebra, 214 – ∗, 205 – antipode, 220 – Banach, 88, 224, 225 – bi-algebra, 217 – C∗, 224 – co-commutative, 219 – co-kernel, 213 – co-module, 222 – commutative, 88 – convex – multiplicative, 225 – duality, 223 – graded, 216 – Hausdorff, 225 – homomorphism, 216 – Hopf, 217, 218 – involution, 205, 224 – module, 222 – normed, 224 – nuclear, 89, 225 – opposite, 219 – restricted dual, 222 – unital, 215 algebraic paths, 21 Ambrose–Singer theorem, 69, 92 anticommutation relations – for gamma matrices, 233 antiderivation, 208 antihomomorphism, 221 antipode, 10, 81 antisymmetrization, 233 azimuthal angle, 167, 168
Baire – first category, 191 – second category, 192 basis – neighborhood, 190 Bjorken scaling, 151, 162 Bjorken-x, 140, 144 boson-gluon fusion, 158, 164
bounded – linear operator, 71, 74 – totally, 193 category, 13 Cauchy sequence, 193 center-of-mass energy, 139 Chen iterated integrals, 25 – intermediate points, 40 – multiplication, 26 – reparametrization, 40 – separation property, 43 – without coordinates, 25 cluster, 188 co-final, 188 co-multiplication, 9 co-unit, 9 collinear – divergence, 162 – parton, 143 color algebra, 240–247 – anticommutation relations, 241 – Casimir operator, 240 – color factors, 243 – commutation relations, 240 – $d^{abc}$, 241 – $f^{abc}$, 241 – Fierz identity, 242 – $h^{abc}$, 241 compact – countable, 192 compactification – Alexandroff, 196 compactness, 186 complete, 193 completeness relation – for spinors, 234 – for states, 147 connected, 183 – components, 184 – locally, 186 – path, 184 – locally, 186 connectedness – simply, 202
continuity, 181 – uniform, 193 contractible, 202 convergence, 86, 89, 192 convolution, 151 countability, 190 – A1, 190 – A2, 190 cross section – for unpolarized DIS, 149 – for unpolarized SIDIS, 169 – in the FPM, 142 curvature, 67 – Cartan’s structure equation, 68 – field strength tensor, 69 – two-form, 68 d-connected, 35 d-discrete point, 36 d-loop, 37, 82 – Shc(d, p), 38 d-path, 29, 82 d-reduced, 37 deep inelastic scattering, see DIS $\delta^{+}$, 237 dense, 191 derivation, 205 derivative – area, 100, 107 – parallel transporter, 116 – covariant, 68 – endpoint – terminal, 101 – Fréchet, 71, 73, 100, 120 – left/right invariant, 94 – Lie, 118, 120 – path, 100 – Leibniz, 103 – Polyakov, 100 diffeomorphism, 117, 203 differential of a map, 208 differentiation – category of pointed differentiations, 15 – d-closed, 24 – k-algebra, 15 – k-module, 15 – splitting pointed, 17 – splitting pointed differentiation homomorphisms, 18
– surjective shuffle module, 17 differentiations – pointed, 96 Dirac – basis, 234 – δ-function, 237 – equation, 233, 234 Dirac map, 89 DIS, 139–165 – orthonormal basis, 144–146 eikonal – approximation, 127, 136 – quark, 138, 156 embedding, 182 evolution equations, 150 – DGLAP, 164, 165 exact sequence, 16 factorization, 150 – collinear, 150, 162, 165 – in PM, 151 – scale, 163 – scheme, 163 – TMD-, 150–151, 172 fiber bundle – connection, 50 – Ehresmann connection, 52 – horizontal lift, 55 – parallel transport, 55, 65 – principal, 45 field, 212 filter – basis, 225 final-state cut, 147, 156 flipping operation, 10 form – one-form, 206 Fourier transform, 237 fractional energy loss, 140 fragmentation function, 167 function – bump function, 209 – support, 209 functional – linear, 206
functor, 14 – covariant functor to SPD, 19 – forgetful, 14 gamma matrices, 233 – contraction identities, 235 – trace identities, 235 gauge potential, 50, 54 – compatibility condition, 53 Gel’fand – character, 231 – space, 88 – spectrum, 88, 231 – topology, 231 – transform, 231 group – fundamental, 199 – topological, 185 hadronic tensor, 147, 148, 153, 167–169, 171 – constraints, 148 – expansion, 148 hard part, 141 – partonic, 162 Hausdorff, 177, 180, 192, 194, 198 Heaviside step function, 237 holonomy, 66 homeomorphism, 181 homotopy, 199 – equivalent spaces, 202 – relative, 199 Hopf algebra, 11 ideal, 22, 213 – bi, 218 – least δ-closed, 30 – maximal, 214 – prime, 213 IMF, see infinite-momentum frame infinite-momentum frame, 143 infrared divergence, 161 invariant mass, 140 κ, 144 $L^{\mu\nu}$, see lepton tensor L-operator, 12 large-distance process, 141
leading twist, 154 left translation, 185 lepton tensor, 146 Lie – adjoint action, 51 – algebra of $\widetilde{LM}_p$, 94 – Fréchet, 79, 94 – generalized, 79 – infinite-dimensional, 79 – left action, 51 – right action, 51 light-cone – coordinates, 235–237 – gauge, 156, 174, 239 Lindelöf, 190 loop, 184 loops – derivative – Fréchet, 120 – generalized, 89, 92 – group, 87 – Lie algebra, 94 – multiplication, 90 – topological group, 91 – loop group $LM_p$, 87 manifold – algebra – exterior, 207 – Banach, 71, 74 – boundary, 202 – bundle – tangent, 207 – dual space, 207 – embedding, 204 – Fréchet, 76, 80, 83 – function – smooth, 205 – immersion, 204, 209 – orientability, 204 – product – exterior, 207 – interior, 207 – real, 202 – smooth, 204 – sub-manifold, 203 – real analytic, 204 map – closed, 182
– equivariant, 198 – open, 182 – quotient, 197 mass-shell condition, 153 matrix function – continuity, 61 – derivatives, 58 – differentiability, 57 – integrals, 59 meagre, 191 metric, 179, 232 – intrinsic, 179 – LC coordinates, 235 – set, 179 – space, 179 – translation invariant, 180 – transversal, 145 – $g_\perp^{\mu\nu}$, 144 – $\varepsilon_\perp^{\mu\nu}$, 145 – $g_\perp^{\mu\nu}$, 236 – ultra, 179 module, 211, 213 – graded, 216 – left, 213 – right, 213 momentum fraction, 142 monoid, 211 multiplication – on Alg(Sh(Ω), K), 11 negligible subset, 210 net, 187 norm – semi-norm – multiplicative, 228 – sub-multiplicative, 224 normal, 194 on-shell, see mass-shell condition operator – compact, 229 – singular value, 229 – nuclear, 230 – trace-class, 230 P-trivial, 36 paracompact, 203 parity, 148 partition of unity, 209
parton, 141 parton distribution function, see PDF parton model, 141 – free, 141–142 path, 184 path dependence, 156 path-ordering, 64 paths – elementary equivalent, 42 – piecewise regular, 42 – reduced, 42 PDF, 150 – full definition, 158 – gauge invariant definition, 155–159 – in FPM, 151 – operator definition, 152–155 – renormalized, 162, 164 $\Phi_q$, see quark correlator PM, see parton model powerset, 178 preimage, 181 probability density function, see PDF product integrals, 60 pull-back, 208 push-forward, 208
QCD – Feynman rules, 238 – gluon polarization sum, 239 – Lagrangian, 238 quark correlator, 154, 173 – expansion, 154
regular, 194 ring, 211 – algebra, 214 – antihomomorphism, 221 – domain, 214 – graded, 215 – homogeneous elements, 215 – integral domain, 214 – local, 214 – nonzero divisor, 214 – trivial gradation, 215 – unit, 214 – zero divisor, 214 running coupling, 164
s, see center-of-mass energy separation axioms, 194 sequence – Cauchy, 225 set – absolutely m-convex, 228 – directed, 187 – m-convex, 228 – multiplicative, 228 – partially ordered, 187 short exact, 16 short-distance process, 141 shuffle – ideal, 22, 81 – (k, l), 8 – k-algebra, 9 – multiplication, 8 SIDIS, 165–174 – orthonormal basis, 168, 169 sifting property, 237 σ, see cross section soft part, 141 space – absolutely convex, 227 – absorbent, 227 – balanced, 227 – Banach, 225 – circled, 227 – cone, 227 – convex, 227 – Fréchet, 228 – Hilbert, 229 – pre-Hilbert, 229 splitting function, 161, 164, 165 Stokes, 210 – non-Abelian, 125 structure function, 148 – difference with PDF, 152 – factorization, 159 – for quark, 151, 161, 164 – in FPM, 149 – in PM, 151 – in SIDIS, 169 surjective pointed differentiation, 16 symmetrization, 233 – ordered, 246
t, see transferred momentum theorem – Gel’fand–Mazur, 231 – mean value, 183 – Tychonov, 189 time-reversal, 148, 168 TMD, 151, 166 topological vector space, 73 – locally convex, 85 – nuclear, 85 topology, 176 – basis, 178 – final, 226 – Gel’fand, 231 – induced, 176 – initial, 226 – neighborhood, 176 – product, 181 – quotient, 196 – sub-basis, 180 – Tychonov, 189 – weak, 230 – weak-∗, 231 trace, 89 transferred momentum, 139 transition function, 206 translation operator, 147 transverse – convolution, 172 – momentum, 143, 150–151, 163, 166 – separation, 173 transverse momentum dependent PDF, see TMD Urysohn, 195 vector – contravariant, 206 – covariant, 206 vector field – fundamental, 52 – left-invariant, 51 vector space, 212 $W^{\mu\nu}$, see hadronic tensor Wilson – loop functional, 69 Wilson line – cut propagator, 133, 136 – external point, 128, 132, 135, 157
– Feynman rules, 135 – Hermitian conjugate, 131 – on a linear path – from $-\infty$ to $b^{\mu}$, 127–129 – from $a^{\mu}$ to $+\infty$, 129–130 – from $a^{\mu}$ to $b^{\mu}$, 133–135 – from $-\infty$ to $+\infty$, 132–133 – propagator, 128, 131, 135
– reversed path, 132 – transversal, 174 – vertex, 129, 131, 136 x, see Bjorken-x ξ, see momentum fraction y, see fractional energy loss