English Pages 262 [263] Year 2022
Advanced Classical and Quantum
Probability Theory with Quantum
Field Theory Applications
Harish Parthasarathy Professor
Electronics & Communication Engineering
Netaji Subhas Institute of Technology (NSIT)
New Delhi, Delhi-110078
First edition published 2023 by CRC Press 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN and by CRC Press 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742 © 2023 Manakin Press CRC Press is an imprint of Informa UK Limited The right of Harish Parthasarathy to be identified as author of this work has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. For permission to photocopy or use material electronically from this work, access www. copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected] Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Print edition not for sale in South Asia (India, Sri Lanka, Nepal, Bangladesh, Pakistan and Bhutan) British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library ISBN: 9781032405124 (hbk) ISBN: 9781032405148 (pbk) ISBN: 9781003353430 (ebk) DOI: 10.4324/9781003353430 Typeset in Arial, BookManOldStyle, CourierNewPS, MS-Mincho, Symbol, and TimesNewRoman by Manakin Press, Delhi
Preface
The contents of this book are based on three undergraduate and postgraduate courses taught by the author on matrix theory, probability theory and antenna theory over the past several years. The portion on matrix theory covers basic linear algebra including quotient vector spaces, variational principles for computing the eigenvalues of a matrix, the primary and Jordan decomposition theorems for nondiagonable matrices, simultaneous triangulability, and the basic matrix decomposition theorems useful in statistics, signal processing and control. It also covers Lie algebra theory, culminating in the celebrated root space decomposition of a semisimple Lie algebra in terms of Cartan subalgebras and root vectors, introduced by E. Cartan, and also some interesting topics in control theory such as controllability of the partial differential equations of mathematical physics, including Maxwell’s equations, the Dirac equation and their quantum versions. By control of a quantum system comprising electrons, positrons and photons described by the second quantized Maxwell and Dirac equations, we mean the design of classical control fields, such as current and electromagnetic fields, so that the radiation pattern generated by the electron-positron field has spacetime moments that provide a good match to a given set of moments. Some discussion of large deviation theory in control has also been included; it involves designing control fields for pde’s driven by weak stochastic noise so that the probability that the controlled field deviates from a prescribed set of fields by more than a given threshold is a minimum. For constructing the irreducible finite dimensional representations of a semisimple Lie algebra, we discuss the notion of maximal ideals of the universal enveloping algebra of a semisimple Lie algebra and the one-one correspondence between the irreducible representations and the maximal ideals.

The second part of the book deals with probability theory, and we introduce Brownian motion, the Poisson process and some of the features associated with such processes. The third part of the book covers basic antenna theory, including the far field radiation pattern produced by a current source at a given frequency, quantum electrodynamics within a cavity described by the coupling of the second quantized Maxwell and Dirac fields, and how to control these cavity fields so as to get a far field radiation pattern having prescribed statistics in a coherent state of cavity photons and Fermions. This portion of the book also introduces the quantum Boltzmann equation in the presence of electromagnetic fields and shows how, from the resulting nonlinear evolution of the density matrix, we can compute the refractive index of materials from quantum averages of electric and magnetic dipole moments. We demonstrate that the refractive index computed in this quantum mechanical way will generally be field dependent. We also show that if gravitational effects are taken into account, for example the metric tensor of spacetime in an expanding universe, then Dirac’s equation has to be modified by the presence of background curvature, and hence the resulting quantum Boltzmann equation will contain gravitational terms, thereby causing the refractive index to depend on the spacetime metric of gravitation. We also discuss an interesting notion in electrodynamics, namely that of specifying the charges in space in terms of the singularities of the electric and magnetic fields. This leads to the important conclusion that all charges are in fact generated by the singularities of the electromagnetic field, and hence one can postulate that the electron’s mass and charge are of electromagnetic origin. In this context, we explain how to compute the corrected electron propagator due to external electromagnetic and gravitational fields by formulating an appropriate differential equation for the electron propagator and, from this corrected propagator, how to determine the shift in the electron’s mass due to electromagnetic and gravitational fields.
Author
Table of Contents
1. Matrix Theory
2. Probability Theory
3. Antenna Theory
4. Miscellaneous Problems
5. More Problems in Linear Algebra and Functional Analysis
6. Models for the Refractive Index of Materials and Liquids
7. More Problems in Probability Theory, Antennas and Refractive Index of Materials
Detailed Contents
1. Matrix Theory
1.1 Prerequisites of Linear Algebra
1.2 Quotient of a Vector Space
1.3 Triangulability of Commuting Operators
1.4 Simultaneous Diagonability of a Family of Commuting Normal Operators w.r.t. an onb in a Finite Dimensional Complex Inner Product Space
1.5 Tensor Products of Vectors and Matrices
1.6 The Minimax Variational Principle for Calculating all the Eigenvalues of a Hermitian Matrix
1.7 The Basic Decomposition Theorems of Matrix Theory
1.8 A Computational Problem in Lie Group Theory
1.9 Primary Decomposition Theorem
1.10 Existence of Cartan Subalgebra
1.11 Exercises in Matrix Theory
1.12 Conjugacy Classes of Cartan Subalgebras
1.13 Exercises
1.14 Appendix: Some Applications of Matrix Theory to Control Theory Problems
1.15 Controllability of Supersymmetric Field Theoretic Problems
1.16 Controllability of Yang-Mills Gauge Fields in the Quantum Context Using Feynman’s Path Integral Approach to Quantum Field Theory
1.17 Large Deviations and Control Theory
1.18 Approximate Controllability of the Maxwell Equations
1.19 Controllability Problems in Quantum Scattering Theory
1.20 Kalman’s Notion of Controllability and Its Extension to pde’s
1.21 Controllability in the Context of Representations of Lie Groups
1.22 Irreducible Representations and Maximal Ideals
1.23 Controllability of the Maxwell-Dirac Equations Using External Classical Current and Field Sources
1.24 Controllability of the EEG Signals on the Brain Surface Modeled as a Spherical Surface by Influencing the Infinitesimal Dipoles in the Cells of the Brain Cortex to Vary in Accord to Sensory Perturbations
1.25 Control and Relativity

2. Probability Theory
2.1 The Basic Axioms of Kolmogorov
2.2 Exercises
2.3 Exercises on Stationary Stochastic Processes, Spectra and Polyspectra
2.4 A Research Problem Based on Problem
2.5 Exercises on the Construction of the Integral w.r.t. a Probability Measure
2.6 Exercises on Stationarity, Dynamical Systems and Ergodic Theory

3. Antenna Theory
3.1 Course Outline
3.2 The Far Field Poynting Vector
3.3 Exercises
3.4 Order of Magnitudes in Quantum Antenna Theory
3.5 The Notion of a Fermionic Coherent State and its Application to the Computation of the Quantum Statistical Moments of the Quantum Electromagnetic Field Generated by Electrons and Positrons Within a Quantum Antenna
3.6 Calculating the Moments of the Radiation Field Produced by Electrons and Positrons in the Far Field when the Fermions are in a Coherent State
3.7 Controlling the Classical em Fields Interacting with the Dirac Field so that the Mean Value of the em Field Radiated by the Resulting Dirac Second Quantized Current in a Fermionic Coherent State is as Close as Possible to a Given Deterministic Pattern in Space and Simultaneously the Mean Square Fluctuations of this Field in a Fermionic Coherent State are Minimized
3.8 Approximate Analysis of a Rectangular Quantum Antenna
3.9 Remark on the Perturbation in the Quantum Dirac Field and the Quantum Electromagnetic Field Interacting with Each Other Caused by Further Interaction of the Dirac Field with a Classical Control em Field and Interaction of the Quantum Electromagnetic Field with a Control Classical Current
3.10 Quantum Antennas Constructed Using Supersymmetric Field Theories
3.11 Quantization of the Maxwell and Dirac Field in a Background Curved Metric of Spacetime
3.12 Relationship Between the Electron Self Energy and the Electron Propagator
3.13 Electron Self Energy Corrections Induced by Quantum Gravitational Effects

4. Miscellaneous Problems
4.1 A Problem in Robotics
4.2 More on Root Space Decomposition of a Semisimple Lie Algebra
4.3 A Project Proposal for Developing an Experimental Setup for Transmitting Quantum States Over a Channel in the Presence of an Eavesdropper
4.4 A Problem in Lie Group Theory

5. More Problems in Linear Algebra and Functional Analysis
5.1 Riesz Representation Theorem
5.2 Lie’s Theorem on Solvable Lie Algebras
5.3 Engel’s Theorem on nil-representation of a Lie Algebra
5.4 Aperture Antenna Pattern Fluctuations
5.5 Spectral Theorem Using Gelfand-Naimark Theorem
5.6 The Atiyah-Singer Index Theorem: A Supersymmetric Proof
5.7 Replicas, Regular Elements, Jordan Decomposition and Cartan Subalgebras
5.8 Lecture Plan, Matrix Theory
5.9 More Assignment Problems in Probability Theory
5.10 Multiple Choice Questions on Probability Theory
5.11 Design of a Quantum Unitary Gate Using Superstring Theory with Noise Analysis Based on the Hudson-Parthasarathy Quantum Stochastic Calculus
5.12 Study Projects in Probability Theory: Construction of Brownian Motion, Law of the Iterated Logarithm
5.13 Quantum Boltzmann Equation for a System of Particles Interacting with a Quantum Electromagnetic Field
5.14 Device Physics in a Semiconductor Using the Classical Boltzmann Transport Equation
5.15 Describing the Value of a Point Charge and Its Location in Space in Terms of the Electrostatic Potential Generated by It
5.16 Calculating the Masses of N Gravitating Particles and Their Positions and Their Trajectories from Measurement of the Gravitational Potential Distribution in Space-time Using the Newtonian Theory
5.17 The Quantum Boltzmann Equation for a Plasma
5.18 Some Other Remarks on Lie Algebras
5.19 Question Paper on Matrix Theory
5.20 Study Project on Quantum Antennas
5.21 Heat and Mass Transfer Equations in a Fluid
5.22 Quantum Electrodynamics in a Background Medium Described by a Permittivity and Permeability Function
5.23 Temperature and Field Dependence of Refractive Index
5.24 Quantum Statistical Field Theory
5.25 Root Space Decompositions of the Complex Classical Lie Algebras

6. Models for the Refractive Index of Materials and Liquids
6.1 Quantum Electrodynamics with the Electronic Charge Expressed in Terms of the Quantum Fields
6.2 Calculating the Masses of N Gravitating Particles and Their Positions and Their Trajectories from Measurement of the Gravitational Potential Distribution in Space-time Using the Newtonian Theory
6.3 The Quantum Boltzmann Equation for a Plasma
6.4 Quantum Electrodynamics in a Background Medium Described by a Permittivity and Permeability Function
6.5 Models for the Refractive Index of a Material Based on Classical and Quantum Physics
6.6 Quantum Statistical Field Theory
6.7 Relating the Refractive Index of a Material to the Metric Tensor of Space-time
6.8 Cosmological Effects on the Refractive Index

7. More Problems in Probability Theory, Antennas and Refractive Index of Materials
7.1 Levy’s Modulus of Continuity for Brownian Motion
7.2 Test 2: Antennas and Wave Propagation
7.3 Article Submitted to the Quantum Information Processing Journals for Publication
Chapter 1
Matrix Theory

Prerequisites of linear algebra. Fields, rings, vector spaces over a field, modules over a ring, algebras, ideals in a ring and an algebra, bases for vector spaces, linear transformations in a vector space, matrix of a linear transformation relative to a basis, inner product spaces, unitary, Hermitian and normal operators in an inner product space, spectral theorem for normal operators.
[1] Quotient of a vector space by another space.
[2] Simultaneous triangulability of commuting matrices relative to an onb.
[3] Simultaneous diagonability of commuting normal matrices relative to an onb.
[4] Tensor products of vector spaces.
[5] Variational principles for calculating the eigenvalues of a Hermitian matrix.
[6] Positive definite matrices.
[7] The basic decomposition theorems of matrix theory.
[a] Row reduced echelon form.
[b] Spectral theorem for normal matrices.
[c] Polar decomposition.
[d] Singular value decomposition.
[e] QR decomposition based on the Gram-Schmidt orthonormalization process.
[f] LDU decomposition of positive definite matrices.
[8] Applications of matrix theory to finite state quantum systems.
[a] Schrodinger and Heisenberg evolution in finite dimensional Hilbert spaces.
[b] Different kinds of unitary gates for finite state quantum computation: CNOT, Swap, Fredkin, Toffoli, phase gate, quantum Fourier transform gate.
[c] Perturbation theory for quantum systems in finite dimensional state space.
[8] Lie groups and Lie algebras.
[a] Group action on a differentiable manifold.
[b] Differential of a group action on a manifold.
[c] Differentiable and analytic structures on a group: the notion of a Lie group based on differentiability of the composition and inversion operations.
[d] The Lie algebra of vector fields on a differentiable manifold.
[e] The notion of the Lie algebra of a Lie group determined by left invariance of vector fields.
[f1] Identification of the Lie algebra of a Lie group with its tangent space at the identity.
[f2] The exponential map from the Lie algebra into a Lie group.
[f3] Examples when the exponential map is not surjective: O(3).
[f4] The one-one correspondence between Lie algebra elements and one parameter subgroups of a Lie group.
[g] Structure constants associated with a basis for a Lie algebra.
[h] Examples of linear Lie groups and their Lie algebras: the classical Lie groups.
[i] Examples of nonlinear Lie groups taken from dynamical systems.
[j] Homotopy and covering groups of a Lie group.
[k] The universal enveloping algebra of a Lie algebra.
[l] The invariant bilinear form on a Lie algebra.
[m] Solvable and semisimple Lie algebras.
[n] Simple Lie algebras, root space decomposition and Cartan’s classification theory based on the theory of roots.
[9] Representation theory for Lie groups and Lie algebras.
[a] Completely reducible representations.
[b] Irreducible representations.
[c] Classification of finite dimensional irreducible representations of a semisimple Lie algebra based on dominant integral weights: the Cartan-Weyl-Harish-Chandra theory.
[d] The character of a representation.
[e] Representations of compact groups: Schur lemmas and the Peter-Weyl theory.
[f] Characters of a compact semisimple group: Weyl’s character formula.
[g] Applications of representation theory to image processing problems.
[10] Algebraic varieties in a multivariable polynomial ring.
[11] Prime ideals and maximal ideals of a ring.
[12] Grothendieck’s generalization of an algebraic variety to schemes on a ring consisting of prime ideals.
[13] Some remarks on algebraic groups.
[a] Examples.
[b] Flag variety.
[c] The Grassmannian variety and the Schubert variety.
[d1] The relationship between irreducible representations of an algebraic group and flag varieties.
[d2] The relationship between Schubert cells in a Grassmannian variety and the Schubert variety constructed from the Bruhat decomposition of a semisimple algebraic group associated with a parabolic subgroup.
[e] Plücker’s coordinates on a Grassmannian variety.
[f] Quadratic relations satisfied by Plücker’s coordinates on the Grassmannian variety.
1.1 Prerequisites of linear algebra
[1] A field is a set $\mathbb{F}$ with two binary operations called addition and multiplication, a zero element $0$ and a unit element $1$, such that under addition $\mathbb{F}$ is an Abelian group with identity $0$, under multiplication $\mathbb{F} - \{0\}$ (i.e., $\mathbb{F}$ without the zero element) is an Abelian group with identity $1$, and multiplication distributes over addition. Examples of fields are

[a] $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}$, the sets of real, complex and rational numbers.

[b] $\{0, 1, ..., p-1\}$ where $p$ is any prime, with addition and multiplication defined as for ordinary integers but modulo $p$.

[c] If $x$ is an indeterminate and $\mathbb{F}$ is any field, then $Q_{\mathbb{F}}[x]$, the set of all rational functions in $x$ over $\mathbb{F}$, i.e., the set of all ratios of polynomials in $x$ with coefficients in $\mathbb{F}$ and with the denominator never being the zero polynomial, is a field.

[2] A vector space over a field $\mathbb{F}$ is a set $V$ endowed with a binary operation called vector addition and denoted by $+$, a scalar multiplication, i.e., a map $\mathbb{F} \times V \to V$ denoted by a dot, and a zero element called the zero vector, so that under vector addition $V$ forms an Abelian group whose identity is the zero vector $0$, scalar multiplication distributes over vector addition, and, if $c_1, c_2 \in \mathbb{F}$, $x \in V$, then
$$c_1.(c_2.x) = (c_1 c_2).x, \quad (c_1 + c_2).x = c_1.x + c_2.x.$$
If $0$ denotes the zero element of $\mathbb{F}$, then $0.x = 0$, and if $1$ denotes the unit element of $\mathbb{F}$, then $1.x = x$. If $-x$ denotes the additive inverse of $x$ in the Abelian group $V$ (under vector addition), then $(-1).x = -x$.

Examples of vector spaces:

[a] $V = \mathbb{F}^n$, the set of all column vectors (or equivalently row vectors), i.e., ordered $n$-tuples with the $n$ entries in $\mathbb{F}$, with vector addition and scalar multiplication defined componentwise:
$$[c_1, ..., c_n]^T + [d_1, ..., d_n]^T = [c_1 + d_1, ..., c_n + d_n]^T, \quad c_i, d_i \in \mathbb{F}, \; i = 1, 2, ..., n,$$
$$c.[c_1, ..., c_n]^T = [c c_1, ..., c c_n]^T, \quad c, c_1, ..., c_n \in \mathbb{F}.$$
In other words, vector addition and scalar multiplication in $\mathbb{F}^n$ are induced by addition and multiplication in $\mathbb{F}$.

[3] A ring is a set $R$ with two binary operations $+$, $.$ and a zero element $0$ such that under $+$, $R$ is an Abelian group, under $.$, $R$ is a semigroup (a semigroup has the properties of a group such as closure and associativity under composition, except that there may be no identity and no inverse of an element), and $.$ distributes over $+$. Clearly, a field is a special case of a ring. Some examples of rings that are not fields are

[a] $\mathbb{F}[x]$, the set of all polynomials in an indeterminate $x$ with coefficients taken from the field $\mathbb{F}$. This is called the ring of polynomials over the field $\mathbb{F}$. Note that a polynomial does not have a multiplicative inverse unless it is a nonzero constant polynomial. $\mathbb{F}[x]$ is an example of a commutative (Abelian) ring.

[b] $R = M_n(\mathbb{F})$, the space of $n \times n$ matrices with elements from $\mathbb{F}$. The $.$ and $+$ operations in this ring are respectively matrix multiplication and matrix addition. Note that this ring is non-Abelian for $n \geq 2$.

[c] As a generalization of [a], we define $R$ to be the commutative ring of all functions $f$ from a set $X$ to a field $\mathbb{F}$. The $.$ operation is the ordinary pointwise multiplication of functions induced by multiplication in $\mathbb{F}$, and the addition operation is likewise the ordinary pointwise addition of functions induced by addition in $\mathbb{F}$.

[d] As a generalization of [a] and [c], we consider the ring $R$ of all polynomials in an indeterminate $x$ with coefficients coming from $M_n(\mathbb{F})$, i.e., all $n \times n$ matrix polynomials.
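The finite field example {0, 1, ..., p − 1} can be checked by brute force. The following is a minimal sketch in plain Python; the function names `is_field_mod` and `inverse_mod` are illustrative and not from the text. It tests whether every nonzero residue modulo n has a multiplicative inverse, which is the only field axiom that can fail for arithmetic modulo n.

```python
def is_field_mod(n):
    """True if every nonzero residue modulo n has a multiplicative inverse."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))


def inverse_mod(a, p):
    """Inverse of a modulo a prime p, using Fermat's little theorem a^(p-1) = 1."""
    return pow(a, p - 2, p)


# Moduli below 12 for which arithmetic modulo n is a field: exactly the primes.
field_moduli = [n for n in range(2, 12) if is_field_mod(n)]
print(field_moduli)

# In the field with 7 elements, the inverse of 3 multiplies with 3 to give 1.
print(inverse_mod(3, 7), (3 * inverse_mod(3, 7)) % 7)
```

For a composite modulus such as 4, the residue 2 has no inverse, so the integers modulo 4 form a ring but not a field.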
1.2 Quotient of a vector space
2. Simultaneous triangulability of a family of commuting matrices relative to an o.n.b.

3. Simultaneous diagonability of a family of commuting normal matrices relative to an o.n.b.

2[a]. First we prove the triangulability of a matrix over the complex field relative to an o.n.b. Let $V$ be a complex inner product space of dimension $n$ and let $T : V \to V$ be a linear operator. $T$ has at least one eigenvalue, say
$$T e_1 = c_1 e_1, \quad c_1 \in \mathbb{C}, \quad \| e_1 \| = 1.$$
Define
$$W_1 = N(T - c_1 I).$$
Then $W_1 \neq \{0\}$. Consider the quotient vector space $V/W_1$, for which $\dim(V/W_1) < n$. $W_1$ is $T$-invariant and hence
$$T_1 : V/W_1 \to V/W_1, \quad T_1(x + W_1) = T(x) + W_1$$
is a well defined linear operator. We define an inner product on $V/W_1$ as follows: for $x, y \in V$,
$$\langle x + W_1, y + W_1 \rangle = \langle P^\perp x, P^\perp y \rangle$$
where $P$ is the orthogonal projection of $V$ onto $W_1$ and $P^\perp = I - P$. It is easy to see that this is a well defined inner product on $V/W_1$. Now by the induction hypothesis on the vector space dimension, $T_1$ can be brought into upper triangular form relative to an onb, say $\{f_k + W_1 : k = 1, 2, ..., m\}$, of $V/W_1$. We note that
$$P^\perp f_k + W_1 = f_k + W_1$$
and hence we can replace the above onb for $V/W_1$ by $\{P^\perp f_k + W_1 : k = 1, 2, ..., m\}$. Then by construction, $P^\perp f_k$, $k = 1, 2, ..., m$ is an orthonormal set in $V$ and further the span of this set is orthogonal to $W_1$. Thus, if we choose an orthonormal basis $\{e_1, ..., e_{n-m}\}$ for $W_1$, then $T$ is upper triangular w.r.t. the onb
$$B = \{e_1, ..., e_{n-m}, P^\perp f_1, ..., P^\perp f_m\}$$
for $V$: indeed $T e_j = c_1 e_j$, while $T P^\perp f_k$ is a linear combination of $P^\perp f_1, ..., P^\perp f_k$ plus an element of $W_1$.
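The induction above can be mirrored numerically. The sketch below is an assumption-laden illustration rather than the construction of the proof: instead of the full eigenspace $W_1$ and the quotient $V/W_1$, it splits off the span of a single eigenvector and recurses on the complementary block, which is the usual form of Schur's argument. The function name `unitary_triangularize` is hypothetical; only standard NumPy calls are used.

```python
import numpy as np


def unitary_triangularize(T):
    """Return a unitary U such that U^H T U is upper triangular."""
    T = np.asarray(T, dtype=complex)
    n = T.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # One normalized eigenvector e1 of T (it exists over the complex field).
    _, vecs = np.linalg.eig(T)
    e1 = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
    # Complete e1 to an orthonormal basis: the QR factorization of [e1 | I]
    # gives a unitary Q whose first column is e1 up to a phase factor.
    Q, _ = np.linalg.qr(np.column_stack([e1, np.eye(n)]))
    A = Q.conj().T @ T @ Q              # A = [[c1, *], [0, T1]]
    U1 = unitary_triangularize(A[1:, 1:])   # induction on the dimension
    W = np.eye(n, dtype=complex)
    W[1:, 1:] = U1
    return Q @ W


rng = np.random.default_rng(0)
T = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U = unitary_triangularize(T)
R = U.conj().T @ T @ U
# U should be unitary and the strictly lower part of R should vanish
# up to rounding error.
print(np.allclose(U.conj().T @ U, np.eye(5)), np.abs(np.tril(R, -1)).max())
```

The columns of `U` play the role of the orthonormal basis $B$, ordered so that the eigenvector comes first.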
1.3 Triangulability of commuting operators
Now we prove simultaneous triangulability of a family of commuting operators on a finite dimensional complex inner product space w.r.t. an onb. Let $\dim_{\mathbb{C}} V = n < \infty$ and let $\mathcal{F}$ be a family of commuting operators in this space. If every operator in $\mathcal{F}$ is a scalar multiple of the identity, there is nothing to prove. Otherwise choose an operator $T$ from this family that is not a scalar multiple of the identity, and choose an eigenvalue $c_1$ with a corresponding normalized eigenvector $e_1$ for this operator:
$$T e_1 = c_1 e_1, \quad \| e_1 \| = 1.$$
Then
$$W_1 = N(T - c_1)$$
is a nonzero proper subspace of $V$ and is invariant under every element of $\mathcal{F}$ since all these elements commute with $T$. Thus for each $S \in \mathcal{F}$, we can define a linear operator $S_1 : V/W_1 \to V/W_1$ by
$$S_1(x + W_1) = S(x) + W_1, \quad x \in V,$$
and it is immediate that $\{S_1 : S \in \mathcal{F}\}$ is a commuting family of operators in $V/W_1$. Hence by the induction hypothesis on the vector space dimension, this family can be simultaneously triangulated w.r.t. an onb for $V/W_1$. Let $\{f_k + W_1 : k = 1, 2, ..., m\}$ be such an onb. Then again by the induction hypothesis, $\mathcal{F}$ restricted to $W_1$ can also be simultaneously triangulated w.r.t. an onb for it. Denote this onb by $\{e_1, ..., e_{n-m}\}$. It is immediate then that $\mathcal{F}$ is simultaneously triangulable relative to the onb
$$\{e_1, ..., e_{n-m}, P^\perp f_1, ..., P^\perp f_m\}$$
for $V$, where $P$ is the orthogonal projection onto $W_1$ and $P^\perp = I - P$. The proof is complete.

Reference: This problem was taken from Rajendra Bhatia, "Matrix Analysis", Springer.
1.4 Simultaneous diagonability of a family of commuting normal operators w.r.t. an onb in a finite dimensional complex inner product space

Let $\mathcal{F}$ be a commuting family of normal operators in $V$ with $\dim_{\mathbb{C}} V = n < \infty$. We first choose an element $T \in \mathcal{F}$ and a vector $e_1$ of unit norm such that
$$T e_1 = c_1 e_1$$
for some $c_1 \in \mathbb{C}$. Again define
$$W_1 = N(T - c_1).$$
Choose any $S \in \mathcal{F}$. Then $W_1$ is $S$-invariant since $[T, S] = 0$. Now define $S_1 : V/W_1 \to V/W_1$ in the usual way. We claim that $S_1$ is normal. Indeed, let $u, v \in V$. Then
$$\langle u + W_1 | S_1^* S_1 | v + W_1 \rangle = \langle S(u) + W_1 | S(v) + W_1 \rangle = \langle P^\perp S u | P^\perp S v \rangle$$
where $P$ is the orthogonal projection onto $W_1$. $P$ is expressible as a function of $T$ using the spectral theorem for normal operators, i.e., $P = f(T)$. From this it is immediate that $S$ commutes with $P$ and hence also with $P^\perp = I - P$. Since $P^\perp$ is Hermitian, $S^*$ also commutes with $P^\perp$. Then, using this and the normality of $S$,
$$\langle P^\perp S u | P^\perp S v \rangle = \langle u | S^* P^\perp S v \rangle = \langle u | P^\perp S^* S | v \rangle = \langle u | P^\perp S S^* | v \rangle = \langle [S^*]_1 (u + W_1) | [S^*]_1 (v + W_1) \rangle.$$
On the other hand, we claim that
$$[S^*]_1 = S_1^*$$
for the following reason:
$$\langle u + W_1 | [S^*]_1 | v + W_1 \rangle = \langle u + W_1 | S^*(v) + W_1 \rangle = \langle P^\perp u | P^\perp S^* v \rangle = \langle u | P^\perp S^* v \rangle$$
while on the other hand,
$$\langle u + W_1 | S_1^* | v + W_1 \rangle = \langle S_1(u + W_1) | v + W_1 \rangle = \langle S(u) + W_1 | v + W_1 \rangle = \langle P^\perp S(u) | P^\perp v \rangle = \langle S u | P^\perp v \rangle = \langle u | S^* P^\perp v \rangle = \langle u | P^\perp S^* v \rangle,$$
proving the claim. Thus, we get
$$\langle u + W_1 | S_1^* S_1 | v + W_1 \rangle = \langle S_1^*(u + W_1) | S_1^*(v + W_1) \rangle = \langle u + W_1 | S_1 S_1^* | v + W_1 \rangle$$
and therefore
$$S_1^* S_1 = S_1 S_1^*,$$
i.e., $S_1$ is normal for all $S \in \mathcal{F}$. Moreover, it is clear that $\{S_1 : S \in \mathcal{F}\}$ is a commuting family since if $S, L \in \mathcal{F}$, then
$$S_1 L_1 (u + W_1) = S_1(L(u) + W_1) = SL(u) + W_1 = LS(u) + W_1 = L_1 S_1 (u + W_1), \quad u \in V.$$
Thus, $\{S_1 : S \in \mathcal{F}\}$ is a commuting family of normal operators in $V/W_1$ and by the induction hypothesis (on the dimension of the vector space), it follows that this family is simultaneously diagonable w.r.t. an onb, say $\{f_k + W_1 : k = 1, 2, ..., m\}$, for $V/W_1$. Likewise by the same induction hypothesis, $\mathcal{F}$ is simultaneously diagonable on $W_1$ w.r.t. an onb, say $\{e_1, ..., e_{n-m}\}$, for $W_1$. In other words, for any $S \in \mathcal{F}$, we have
$$S(f_k) - c_k(S) f_k \in W_1, \quad k = 1, 2, ..., m.$$
Thus,
$$S(P^\perp f_k) - c_k(S) P^\perp f_k = P^\perp (S f_k - c_k(S) f_k) = 0.$$
This proves that $\{P^\perp f_k, k = 1, 2, ..., m, e_1, ..., e_{n-m}\}$ is an onb for $V$ that simultaneously diagonalizes $\mathcal{F}$.
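A numerical illustration of the result may help. The sketch below does not follow the inductive proof: it swaps in a different, commonly used device, namely diagonalizing one generic real linear combination of the commuting operators. It is restricted to Hermitian matrices (a special case of normal ones) so that `numpy.linalg.eigh` returns an orthonormal eigenbasis. The particular matrices and the weight `t` are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# Random unitary U from the QR factorization of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
# Two commuting Hermitian matrices that share the eigenbasis U; each one
# has a repeated eigenvalue, so neither determines the common basis alone.
A = U @ np.diag([1.0, 1.0, 2.0, 3.0]) @ U.conj().T
B = U @ np.diag([5.0, 7.0, 7.0, 7.0]) @ U.conj().T
commute = np.allclose(A @ B, B @ A)

# The combination A + t B has the distinct eigenvalues 2.85, 3.59, 4.59, 5.59,
# so its orthonormal eigenvectors diagonalize A and B at the same time.
t = 0.37
_, V = np.linalg.eigh(A + t * B)


def off_diagonal_norm(M):
    """Frobenius norm of M with its diagonal removed."""
    return np.linalg.norm(M - np.diag(np.diag(M)))


off_A = off_diagonal_norm(V.conj().T @ A @ V)
off_B = off_diagonal_norm(V.conj().T @ B @ V)
print(commute, off_A, off_B)   # both norms should be at rounding level
```

The columns of `V` form an onb that diagonalizes both matrices, in agreement with the theorem.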
1.5 Tensor products of vectors and matrices
[a] Tensor products of vector spaces. [b] The symmetric tensor product and permanents. [c] The antisymmetric tensor product and determinants.
[d] Tensor product of matrices.
[e] Eigenvalues and eigenvectors of the tensor product, antisymmetric tensor
product and symmetric tensor product of matrices.
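As a small illustration of items [d] and [e], the sketch below uses NumPy's Kronecker product `numpy.kron` as the matrix of the tensor product (an assumption about the basis ordering, which the outline does not fix). It checks that the eigenvalues of $A \otimes B$ are the pairwise products of the eigenvalues of $A$ and $B$, with the tensor products of eigenvectors as eigenvectors. Symmetric real matrices are used only to keep the eigenvalues real and sortable.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 2))
A = A + A.T                          # real symmetric 2 x 2
B = rng.standard_normal((3, 3))
B = B + B.T                          # real symmetric 3 x 3

a, va = np.linalg.eigh(A)            # eigenvalues and eigenvectors of A
b, vb = np.linalg.eigh(B)            # eigenvalues and eigenvectors of B

K = np.kron(A, B)                    # 6 x 6 matrix of the tensor product
products = np.sort(np.outer(a, b).ravel())
kron_eigs = np.linalg.eigvalsh(K)    # sorted in ascending order

# The tensor product of an eigenvector of A with an eigenvector of B
# is an eigenvector of K with the product eigenvalue.
w = np.kron(va[:, 0], vb[:, 1])
print(np.allclose(kron_eigs, products), np.allclose(K @ w, a[0] * b[1] * w))
```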
1.6 The minimax variational principle for calculating all the eigenvalues of a Hermitian matrix

Let $X$ be an $n \times n$ Hermitian matrix and let $\{v_1, ..., v_n\}$ be an orthonormal eigenbasis for $X$ with corresponding eigenvalues $\{c_1, ..., c_n\}$ so that $c_1 \geq c_2 \geq ... \geq c_n$. Let $M$ be any $k$ dimensional subspace of $\mathbb{C}^n$. Consider the $n - k + 1$ dimensional subspace $W_k = \mathrm{span}\{v_k, ..., v_n\}$ of $\mathbb{C}^n$. It is clear that
$$\dim(M \cap W_k) = \dim M + \dim W_k - \dim(M + W_k) \geq k + (n - k + 1) - n = 1$$
and hence we can choose $v \in M \cap W_k$, $\| v \| = 1$, and therefore
$$\langle v | X | v \rangle \leq c_k.$$
It follows that
$$\inf_{x \in M, \| x \| = 1} \langle x | X | x \rangle \leq c_k.$$
Thus,
$$\sup_{\dim N = k} \; \inf_{x \in N, \| x \| = 1} \langle x | X | x \rangle \leq c_k.$$
Now choosing
$$N = \mathrm{span}\{v_1, ..., v_k\}$$
it follows that $\dim N = k$, that $\langle x | X | x \rangle \geq c_k$ for every unit vector $x \in N$, and that $v_k \in N$ with $\langle v_k | X | v_k \rangle = c_k$. Thus we have proved
$$\sup_{\dim N = k} \; \inf_{x \in N, \| x \| = 1} \langle x | X | x \rangle = c_k.$$
This is the first minimax theorem for computing the eigenvalues of a Hermitian matrix. Now we prove the second minimax theorem.
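The first minimax theorem can be probed numerically. In the sketch below, which is an illustration with arbitrary sizes `n = 6`, `k = 3` and a hypothetical helper `inf_rayleigh`, the infimum of $\langle x|X|x \rangle$ over unit vectors of a subspace is computed as the smallest eigenvalue of the compression $Q^* X Q$, where the columns of $Q$ are an orthonormal basis of the subspace. Random $k$ dimensional subspaces should never exceed $c_k$, and the span of the top $k$ eigenvectors should attain it.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 3
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = (G + G.conj().T) / 2                  # random Hermitian matrix

vals, vecs = np.linalg.eigh(X)            # eigenvalues in ascending order
c = vals[::-1]                            # c[0] >= c[1] >= ... as in the text
v = vecs[:, ::-1]                         # matching eigenvectors v_1, ..., v_n


def inf_rayleigh(X, basis):
    """Infimum of <x|X|x> over unit vectors x in the column span of basis."""
    Q, _ = np.linalg.qr(basis)            # orthonormal basis of the subspace
    return np.linalg.eigvalsh(Q.conj().T @ X @ Q)[0]


random_subspaces = [
    rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    for _ in range(200)
]
largest_random = max(inf_rayleigh(X, N) for N in random_subspaces)
optimal = inf_rayleigh(X, v[:, :k])       # N = span{v_1, ..., v_k}

# c_k of the text is c[k - 1] with zero-based indexing.
print(largest_random <= c[k - 1] + 1e-10, np.isclose(optimal, c[k - 1]))
```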
1.7 The basic decomposition theorems of matrix theory

[a] LDU decomposition of a positive definite matrix. Let $R$ be a positive definite matrix. It can be looked upon as the correlation matrix of a random vector $X$. We write
$$X = [X_1, ..., X_n]^T$$
and then Gram-Schmidt orthonormalize the components of this vector relative to the standard inner product on $L^2(\Omega, \mathcal{F}, P)$: $\langle u, v \rangle = E(uv)$. Then we get an orthonormal set $\{e_1, e_2, ..., e_n\}$ of random variables in $L^2(\Omega, \mathcal{F}, P)$ where
$$X_1 = a(1,1) e_1, \quad X_2 = a(2,1) e_1 + a(2,2) e_2, \quad ..., \quad X_k = a(k,1) e_1 + ... + a(k,k) e_k, \quad k = 1, 2, ..., n.$$
Writing
$$e = [e_1, ..., e_n]^T$$
we can express this as
$$X = L e$$
where $L$ is the lower triangular matrix $((a(i,j)))$. Then taking correlations on both sides gives us
$$R = E(X X^T) = L E(e e^T) L^T = L L^T$$
which is known as the LU (Cholesky) decomposition of $R$. Defining $B$ as $L$ with its $j$th column divided by $a(j,j)$, so that the diagonal entries of $B$ are all 1, and defining the diagonal matrix
$$D = \mathrm{diag}[a(i,i)^2, \; i = 1, 2, ..., n]$$
we get
$$R = B D B^T$$
with $B$ lower triangular and having ones on its diagonal. This is called the LDU decomposition of the matrix $R$. This decomposition is important in linear prediction theory of stochastic processes, as its derivation suggests.

There is another way to derive the LDU decomposition of a positive definite matrix $R$. Using the spectral decomposition theorem, we can write $R = X X^T$ where $X$ is a square matrix. Then using the QR decomposition, write $X^T = Q Y$ where $Y$ is upper triangular and $Q$ is orthogonal. This can be achieved, for example, by applying the Gram-Schmidt process to the columns of $X^T$ or equivalently to the rows of $X$. In other words,
$$Q = [e_1, e_2, ..., e_n], \quad e_a^T e_b = \delta_{ab}, \quad X^T = [x_1, x_2, ..., x_n].$$
Then,
$$x_1 = y(1,1) e_1, \quad x_2 = y(1,2) e_1 + y(2,2) e_2, \quad ..., \quad x_k = y(1,k) e_1 + ... + y(k,k) e_k, \quad k = 1, 2, ..., n.$$
Then,
$$R = X X^T = Y^T Q^T Q Y = Y^T Y$$
and $Y^T$ is lower triangular.
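The passage from $R = LL^T$ to $R = BDB^T$ can be checked on a small example. This is a minimal sketch, assuming NumPy's `numpy.linalg.cholesky` as a stand-in for the Gram-Schmidt construction (both produce the lower triangular factor with positive diagonal); the test matrix is an arbitrary positive definite matrix built for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4))
R = M @ M.T + 4.0 * np.eye(4)     # symmetric positive definite test matrix

L = np.linalg.cholesky(R)         # lower triangular with R = L L^T
a = np.diag(L)                    # the diagonal entries a(i, i) > 0
B = L / a                         # column j divided by a(j, j): unit diagonal
D = np.diag(a ** 2)               # D = diag[a(i, i)^2]

print(np.allclose(R, L @ L.T))        # Cholesky form
print(np.allclose(R, B @ D @ B.T))    # LDU form R = B D B^T
print(np.allclose(np.diag(B), 1.0))   # ones on the diagonal of B
```

Note that `L / a` relies on broadcasting along the last axis, so entry $(i, j)$ is divided by $a(j,j)$, which is a column scaling.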
[b] The QR decomposition of any rectangular matrix. First assume that $A \in \mathbb{C}^{m \times n}$ has rank $r$. Let
$$A = [a_1, ..., a_n], \quad a_j \in \mathbb{C}^{m \times 1}, \; j = 1, 2, ..., n.$$
Choose a permutation matrix $P$ so that if $B = AP$, so that the columns of $B$ are obtained by permuting the columns of $A$, then
$$B = [b_1, ..., b_n]$$
and $b_1, ..., b_r$ are linearly independent while $b_{r+1}, ..., b_n$ can be expressed as linear combinations of $b_1, ..., b_r$. We can thus write
$$[b_{r+1}, ..., b_n] = [b_1, ..., b_r] C$$
for some appropriate $r \times (n - r)$ matrix $C$. Then
$$B = [B_1 | B_1 C] = B_1 [I_r | C], \quad B_1 = [b_1, ..., b_r] \in \mathbb{C}^{m \times r}.$$
Apply the Gram-Schmidt process to the columns of $B_1$, which are linearly independent, and obtain
$$B_1 = [e_1, ..., e_r] R_1$$
where $e_1, ..., e_r$ are orthonormal and $R_1$ is an $r \times r$ upper triangular matrix. Extend the set $\{e_1, ..., e_r\}$ to an orthonormal basis $\{e_1, ..., e_m\}$ for $\mathbb{C}^m$ and thus define the unitary matrix
$$Q = [e_1, ..., e_m] \in \mathbb{C}^{m \times m}.$$
Define
$$R_2 = \begin{pmatrix} R_1 \\ 0 \end{pmatrix} \in \mathbb{C}^{m \times r}.$$
Then, we can write
$$B_1 = Q R_2$$
and
$$B = [Q R_2 | Q R_2 C] = Q R_2 [I | C]$$
and hence
$$A = B P^{-1} = Q R_2 [I | C] P^{-1}.$$
In the special case when the columns of $A$ are all linearly independent, this formula reduces to
$$A = Q R_2.$$
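The construction above can be sketched in code. The function below is a hypothetical illustration, not a library routine: it selects independent columns greedily with a fixed tolerance `tol` (an assumption; a production code would use a pivoted QR with a relative threshold), orthonormalizes them by Gram-Schmidt, and returns the thin factors, i.e. only the first $r$ columns $E = [e_1, ..., e_r]$ of $Q$. It assumes that $A$ has at least one nonzero column and that $r < n$.

```python
import numpy as np


def qr_with_column_selection(A, tol=1e-10):
    """Thin form of A P = E R1 [I | C], with E = [e_1, ..., e_r]."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[1]
    basis, chosen = [], []
    for j in range(n):
        v = A[:, j].copy()
        for e in basis:                      # Gram-Schmidt: remove components
            v = v - (e.conj() @ v) * e       # along the vectors found so far
        if np.linalg.norm(v) > tol:          # column j is independent
            basis.append(v / np.linalg.norm(v))
            chosen.append(j)
    # Column order of B = A P: independent columns first, the rest after.
    order = chosen + [j for j in range(n) if j not in chosen]
    r = len(chosen)
    E = np.column_stack(basis)               # m x r with orthonormal columns
    B = A[:, order]
    R1 = E.conj().T @ B[:, :r]               # r x r upper triangular
    C = np.linalg.solve(R1, E.conj().T @ B[:, r:])   # r x (n - r)
    return order, E, R1, C


rng = np.random.default_rng(6)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # rank 2
order, E, R1, C = qr_with_column_selection(A)
B_rebuilt = E @ R1 @ np.hstack([np.eye(len(R1)), C])
print(len(R1), np.allclose(A[:, order], B_rebuilt))
```

Padding `E` with further orthonormal columns and `R1` with zero rows gives the full factors $Q$ and $R_2$ of the text without changing the product.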
An application of the QR decomposition to linear least squares problems.
Advanced Classical and Quantum Probability Theory with Quantum Field Theory Applications 1.7. THE BASIC DECOMPOSITION THEOREMS OF MATRIX THEORY1311
Let $A$ be a real $m \times n$ matrix of full column rank $n$. Then, by Gram-Schmidt orthonormalization of its columns, we get
$$A = Q_1R_1$$
where $Q_1$ is $m \times n$ with orthonormal columns and $R_1$ is $n \times n$ upper triangular and nonsingular, ie, with nonzero diagonal entries. Extend $Q_1$ by appending $m - n$ more orthonormal columns so as to obtain an $m \times m$ orthogonal matrix $Q$ and define
$$R = \begin{pmatrix} R_1 \\ 0 \end{pmatrix} \in \mathbb{R}^{m\times n}$$
Then
$$A = QR$$
Now consider the problem of minimizing
$$E(\theta) = \| x - A\theta \|^2, \quad \theta \in \mathbb{R}^n$$
Since $Q$ is orthogonal, we have
$$\| x - QR\theta \|^2 = \| y - R\theta \|^2, \quad y = Q^Tx$$
Write
$$y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad y_1 \in \mathbb{R}^n, \ y_2 \in \mathbb{R}^{m-n}$$
Then, since $R\theta = \begin{pmatrix} R_1\theta \\ 0 \end{pmatrix}$, clearly
$$E(\theta) = \| y_1 - R_1\theta \|^2 + \| y_2 \|^2$$
Thus, $E(\theta)$ is minimized when and only when
$$\theta = R_1^{-1}y_1$$
In other words, when $A$ has full column rank the minimizer is unique, and the minimum value of $E(\theta)$ is $\| y_2 \|^2$.
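The least squares recipe above can be sketched numerically as follows. This is only an illustrative sketch, assuming NumPy; the function name `qr_least_squares` and the random test data are hypothetical and not part of the text.

```python
import numpy as np

def qr_least_squares(A, x):
    """Minimize ||x - A theta||^2 for a full-column-rank A using A = QR."""
    m, n = A.shape
    # mode="complete" returns the m x m orthogonal Q and the m x n R = [R1; 0]
    Q, R = np.linalg.qr(A, mode="complete")
    y = Q.T @ x
    y1, y2 = y[:n], y[n:]
    R1 = R[:n, :]
    theta = np.linalg.solve(R1, y1)   # theta = R1^{-1} y1
    residual = float(y2 @ y2)         # minimum value ||y2||^2
    return theta, residual

# Hypothetical data: a random 6 x 3 matrix (full column rank with probability one)
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
x = rng.standard_normal(6)
theta, residual = qr_least_squares(A, x)
```

The value `residual` is computed from $y_2$ alone, which mirrors the fact that the part of $y$ orthogonal to the column space of $A$ cannot be reduced by any choice of $\theta$.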
1.8 A computational problem in Lie group theory
Let $G$ be a Lie group and $\mathfrak{g}$ its Lie algebra. For $X \in \mathfrak{g}$, define the left invariant vector field $v_X$ on $G$ by the formula
$$v_Xf(g) = \frac{d}{dt}f(g\exp(tX))\Big|_{t=0}$$
where $f : G \to \mathbb{C}$ is a differentiable function.
Now do the following:
[a] Take $G = SL(2, \mathbb{R})$ so that $\mathfrak{g}$ has a basis $H, X, Y$ where
$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$
Express $v_H, v_X, v_Y$ as linear first order differential operators in the coordinates $(t, x, y)$, where $g \in G$ is parametrized as
$$g = g(t, x, y) = \exp(tH + xX + yY)$$
[b] Consider the Lie group $G = SO(3)$. Its Lie algebra is the real vector space of all $3 \times 3$ skew symmetric real matrices. We choose the standard basis for this Lie algebra, namely
$$X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
Prove that the element
$$R(n) = \exp(n_1X_1 + n_2X_2 + n_3X_3), \quad n_1, n_2, n_3 \in \mathbb{R}$$
defines a rotation around the axis
$$\hat{n} = (n_1, n_2, n_3)/\sqrt{n_1^2 + n_2^2 + n_3^2}$$
by an angle
$$\phi = \sqrt{n_1^2 + n_2^2 + n_3^2}$$
in the counterclockwise sense. Evaluate the vector fields $v_{X_k}$, $k = 1, 2, 3$ in terms of the Euler angles which parametrize $R \in SO(3)$:
$$R = R_z(\phi)R_x(\theta)R_z(\psi) = R(\phi, \theta, \psi)$$
by expressing
$$\frac{d}{dt}f(R(\phi, \theta, \psi)\exp(tX_k))\Big|_{t=0}, \quad k = 1, 2, 3$$
as a linear partial differential operator of the first order in $(\phi, \theta, \psi)$. Now writing
$$v_{X_k} = v_{k1}(\phi, \theta, \psi)\frac{\partial}{\partial\phi} + v_{k2}(\phi, \theta, \psi)\frac{\partial}{\partial\theta} + v_{k3}(\phi, \theta, \psi)\frac{\partial}{\partial\psi}, \quad k = 1, 2, 3$$
compute
$$f(\phi, \theta, \psi) = \det((v_{ij}(\phi, \theta, \psi)))^{-1}$$
and prove using general arguments that $f(\phi, \theta, \psi)\,d\phi\,d\theta\,d\psi$ is the Haar measure on $SO(3)$, ie, invariant under left and right translations.
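Before attempting the proof in part [b], the rotation claim can be checked numerically. The sketch below assumes NumPy; the helper `expm_series` (a truncated power series for the matrix exponential) and the sample vector `n` are hypothetical choices made only for illustration.

```python
import numpy as np

# Standard basis of so(3) as given in the exercise
X1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
X2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
X3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

def expm_series(M, terms=60):
    """Matrix exponential by a truncated power series (adequate for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # term = M^k / k!
        result = result + term
    return result

def rotation(n):
    """R(n) = exp(n1 X1 + n2 X2 + n3 X3)."""
    n1, n2, n3 = n
    return expm_series(n1 * X1 + n2 * X2 + n3 * X3)

n = np.array([0.3, -0.5, 0.8])       # hypothetical sample vector
R = rotation(n)
angle = np.linalg.norm(n)            # claimed rotation angle
axis = n / angle                     # claimed rotation axis
```

For this sample one can confirm that `R` is orthogonal, fixes `axis`, and has trace $1 + 2\cos\phi$ with $\phi$ equal to `angle`, which is what a rotation by $\phi$ about $\hat{n}$ requires.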
1.9 Primary decomposition theorem
Let $T$ be a linear operator in a finite dimensional complex vector space $V$. Prove the primary decomposition theorem:
$$V = \bigoplus_{k=1}^r W_k$$
where
$$W_k = N((T - c_k)^{m_k}), \quad k = 1, 2, \dots, r$$
with $c_1, \dots, c_r$ being the distinct eigenvalues of $T$ and
$$p(t) = \prod_{k=1}^r (t - c_k)^{m_k}$$
being the minimal polynomial of $T$. Prove that
$$W_k = \bigcup_{m\geq 1} N((T - c_k)^m)$$
Let $E_k$ denote the projection onto $W_k$ corresponding to this decomposition, ie,
$$R(E_k) = W_k, \quad E_kE_j = 0, \ k \neq j, \quad \sum_{k=1}^r E_k = I$$
Show that an operator $S$ in $V$ leaves every $W_k$ invariant iff $E_kSE_m = 0 \ \forall k \neq m$ iff
$$S = \sum_{k=1}^r E_kSE_k$$
In particular, every operator that commutes with $T$ has this property, since it commutes with each $E_k$. Now suppose $\mathrm{ad}(T)(S) = [T, S]$ leaves all the subspaces $W_k$, $k = 1, 2, \dots, r$ invariant. Then, we claim that $S$ also shares this same property. Indeed, consider for some $k \neq l$ the operator $S_{kl} = E_kSE_l$. Then since the $E_j$'s commute with $T$ (in fact, they are polynomials in $T$), it follows that
$$E_k\,\mathrm{ad}(T)(S)E_l = E_k[T, S]E_l = [T, S_{kl}]$$
and the lhs is zero by hypothesis. Thus, $S_{kl}$ commutes with $T$ and hence leaves all the subspaces $W_j$, $j = 1, 2, \dots, r$ invariant. Since $S_{kl}$ vanishes on $W_j$ for $j \neq l$ and maps $W_l$ into $W_k \cap W_l = \{0\}$, this means that $S_{kl} = 0$, and since $k \neq l$ are arbitrary, it follows that $S = \sum_{j=1}^r E_jSE_j$ leaves all the subspaces $W_j$, $j = 1, 2, \dots, r$ invariant.
Now we can prove the following theorem: If for some positive integer $n$, $\mathrm{ad}(T)^n(S) = 0$, then $S$ leaves all the subspaces $W_j$, $j = 1, 2, \dots, r$ invariant. In fact,
$$0 = \mathrm{ad}(T)^n(S) = [T, \mathrm{ad}(T)^{n-1}(S)]$$
implies that $\mathrm{ad}(T)^{n-1}(S)$ commutes with $T$ and so leaves all the subspaces $\{W_j\}$ invariant, and by induction, it follows that $\mathrm{ad}(T)^j(S)$, $j = n-2, n-3, \dots, 1, 0$ also leave all these subspaces invariant. The proof of the theorem is complete.
Now we wish to use the primary decomposition theorem to prove that
$$V = \Big[\bigcup_{m\geq 1} N(T^m)\Big] \oplus \Big[\bigcap_{m\geq 1} R(T^m)\Big]$$
Indeed, suppose $T$ has no zero eigenvalue. Then, the primary decomposition theorem for $T$ reads
$$V = \bigoplus_{k=1}^r N((T - c_k)^{m_k})$$
where none of the $c_k$'s are zero. Now, if for some $k$
$$v \in N((T - c_k)^{m_k})$$
then
$$(T - c_k)^{m_k}v = 0$$
and this equation can be expressed as
$$(-1)^{m_k}c_k^{m_k}v + Tf_k(T)v = 0$$
where $f_k(T)$ is a polynomial in $T$. Since $c_k \neq 0$, it follows that
$$v = Tg_k(T)v$$
where $g_k$ is also a polynomial. By induction, it follows that $v \in R(T^m)$, $m = 1, 2, \dots$ and hence
$$v \in \bigcap_{m\geq 1} R(T^m)$$
Since $T$ has no zero eigenvalue, we also have $N(T^m) = 0$, $m = 1, 2, \dots$ and the proof of the theorem for this case is complete. Now suppose that zero is an eigenvalue of $T$. Then we can take $c_1 = 0$ and since $c_k \neq 0$, $k > 1$, the result follows in the same way from the primary decomposition theorem by noting that
$$W_1 = N(T^{m_1}) = \bigcup_{m\geq 1} N(T^m)$$
and for $k > 1$,
$$W_k = N((T - c_k)^{m_k}) \subset \bigcap_{m\geq 1} R(T^m)$$
Remark: If $T^mv = 0$ for some positive integer $m$ and simultaneously $v \in R(T^r)$ for all $r \geq 1$, then we can write $v = T^rv_r$, $r \geq 1$ and hence $T^{m+r}v_r = 0$ for all $r \geq 1$ and hence $v_r \in N(T^{m_1})$ for all $r \geq 1$. Therefore, $v = T^{m_1+r}v_{m_1+r} = 0$. This proves that
$$\Big[\bigcup_{m\geq 1} N(T^m)\Big] \cap \Big[\bigcap_{m\geq 1} R(T^m)\Big] = \{0\}$$
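The decomposition $V = [\bigcup_m N(T^m)] \oplus [\bigcap_m R(T^m)]$ can be illustrated on a small example. The sketch below assumes NumPy and uses the fact that in dimension $n$ both chains have stabilized by $m = n$; the matrices `N`, `B`, `S` and the helper names are hypothetical choices, not taken from the text.

```python
import numpy as np

def null_space(M, tol=1e-8):
    """Orthonormal basis of N(M), read off from the SVD."""
    _, s, vh = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vh[rank:].conj().T

def column_space(M, tol=1e-8):
    """Orthonormal basis of R(M), read off from the SVD."""
    u, s, _ = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return u[:, :rank]

# T is similar to (nilpotent block) + (invertible block)
N = np.array([[0.0, 1.0], [0.0, 0.0]])      # nilpotent 2 x 2 Jordan block
B = np.array([[2.0, 1.0], [0.0, 3.0]])      # invertible 2 x 2 block
block = np.block([[N, np.zeros((2, 2))], [np.zeros((2, 2)), B]])
S = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 2.0]])       # an invertible change of basis (det = 6)
T = S @ block @ np.linalg.inv(S)

n = T.shape[0]
Tn = np.linalg.matrix_power(T, n)           # N(T^m) and R(T^m) are stable for m >= n
kernel = null_space(Tn)                     # basis of the union of the N(T^m)
image = column_space(Tn)                    # basis of the intersection of the R(T^m)
combined = np.hstack([kernel, image])       # full rank iff the sum is direct and equals V
```

Here `kernel` and `image` each have two columns and `combined` has rank four, so the two subspaces intersect trivially and together span $V$.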
1.10 Existence of Cartan subalgebra
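Now we apply this result to the proof of the existence of a Cartan subalgebra of a semisimple Lie algebra, leading thereby to the root space decomposition and hence to Cartan's classification of all the simple Lie algebras. Let $\mathfrak{g}$ be any finite dimensional Lie algebra over the complex field. A Lie subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ is said to be a Cartan algebra (or a Cartan subalgebra of $\mathfrak{g}$) if (a) $\mathfrak{h}$ is nilpotent and (b) $\mathfrak{h}$ is its own normalizer in $\mathfrak{g}$, ie, $X \in \mathfrak{g}$ and $[X, \mathfrak{h}] \subset \mathfrak{h}$ together imply $X \in \mathfrak{h}$.
We first show that a Cartan algebra is maximal nilpotent. Indeed, suppose that $\mathfrak{n}$ is a nilpotent Lie subalgebra of $\mathfrak{g}$ that properly contains $\mathfrak{h}$. Consider the quotient vector space $\mathfrak{n}/\mathfrak{h}$, which is nontrivial. Since $\mathfrak{h}$ is a subalgebra of $\mathfrak{n}$, each $\mathrm{ad}(H)$, $H \in \mathfrak{h}$, maps $\mathfrak{h}$ into itself and therefore induces an operator on $\mathfrak{n}/\mathfrak{h}$; since $\mathfrak{n}$ is nilpotent, these operators are nilpotent, ie, $\mathfrak{h}$ acts on $\mathfrak{n}/\mathfrak{h}$ by a nil representation. By a basic theorem on nil representations, it follows therefore that there exists a nonzero element $\xi = X + \mathfrak{h} \in \mathfrak{n}/\mathfrak{h}$ (ie, $X \notin \mathfrak{h}$) such that
$$\mathrm{ad}(\mathfrak{h})(\xi) = \mathfrak{h}$$
(Note that $\mathfrak{h}$ is the zero element in $\mathfrak{n}/\mathfrak{h}$.) It follows therefore that $[\mathfrak{h}, X] \subset \mathfrak{h}$, ie, $[X, \mathfrak{h}] \subset \mathfrak{h}$ with $X \notin \mathfrak{h}$. This contradicts the fact that a Cartan algebra is its own normalizer.
Regular element: Now given an $X \in \mathfrak{g}$, define the characteristic polynomial of $\mathrm{ad}(X)$:
$$p_X(t) = \det(tI - \mathrm{ad}(X)) = t^n + t^{n-1}c_1(X) + \dots + tc_{n-1}(X) + c_n(X), \quad n = \dim\mathfrak{g}$$
and set
$$l(X) = \min(m : c_{n-m}(X) \neq 0)$$
Then $0 \leq l(X) \leq n$ (with $c_0 = 1$), and $l(X)$ is the multiplicity of zero as an eigenvalue of $\mathrm{ad}(X)$. Define
$$\mathrm{rk}(\mathfrak{g}) = \min(l(X) : X \in \mathfrak{g})$$
An element $X \in \mathfrak{g}$ is called regular if $l(X) = \mathrm{rk}(\mathfrak{g})$. Further, if $\mathfrak{h}$ is any subspace of $\mathfrak{g}$ invariant under $\mathrm{ad}(X)$, define
$$\zeta_{\mathfrak{h}}(X) = \det(\mathrm{ad}(X)|_{\mathfrak{g}/\mathfrak{h}})$$
If $\mathfrak{h}$ is such a subspace with $\mathrm{rk}(\mathfrak{g}) = \dim\mathfrak{h}$ and if $X \in \mathfrak{h}$ is a regular element of $\mathfrak{g}$ with $\mathrm{ad}(X)$ nilpotent on $\mathfrak{h}$, then it is clear that $\zeta_{\mathfrak{h}}(X) \neq 0$ since in this case, $l(X) = \mathrm{rk}(\mathfrak{g}) = \dim\mathfrak{h}$.
Construction of Cartan algebras: Let $X$ be any regular element in $\mathfrak{g}$. Define
$$\mathfrak{h} = \mathfrak{h}_X = \bigcup_{m>0} N(\mathrm{ad}(X)^m)$$
Then, we claim that $\mathfrak{h}$ is a Cartan algebra. First observe that $\mathfrak{h}$ is closed under linear combinations and under the Lie bracket operation. Indeed, $Y, Z \in \mathfrak{h}$ imply
$$\mathrm{ad}(X)^m([Y, Z]) = \sum_{r=0}^m \binom{m}{r}[\mathrm{ad}(X)^r(Y), \mathrm{ad}(X)^{m-r}(Z)] = 0$$
for sufficiently large $m$. Here we have used that $\mathrm{ad}(X)$ is a derivation on $\mathfrak{g}$ and that $\mathrm{ad}(X)^r(Y) = 0$ or $\mathrm{ad}(X)^{m-r}(Z) = 0$ for all sufficiently large $m$ and all $r = 0, 1, \dots, m$. Next we show that $\mathfrak{h}$ is a nilpotent algebra. Define $\mathfrak{h}'$ to be the set of all $Y \in \mathfrak{h}$ for which $\zeta_{\mathfrak{h}}(Y) \neq 0$. Clearly we have the direct sum decomposition
$$\mathfrak{g} = \Big[\bigcup_{m\geq 1} N(\mathrm{ad}(X)^m)\Big] \oplus \Big[\bigcap_{m\geq 1} R(\mathrm{ad}(X)^m)\Big] = \mathfrak{h} \oplus \mathfrak{q}$$
Now $Y \in \mathfrak{h}$ implies that for some positive integer $m$, $\mathrm{ad}(X)^m(Y) = 0$ and this implies that
$$(\mathrm{ad}(\mathrm{ad}X))^m(\mathrm{ad}(Y)) = 0$$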
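which implies that $\mathrm{ad}(Y)$ leaves $\mathfrak{h} = \bigcup_{m\geq 1} N(\mathrm{ad}(X)^m)$ invariant. Thus, $\mathfrak{h}$ is a Lie subalgebra of $\mathfrak{g}$. This also implies that $\mathrm{ad}(Y)$ leaves $\mathfrak{q} = \bigcap_{m\geq 1} R(\mathrm{ad}(X)^m)$ invariant as discussed above in the context of consequences of the primary decomposition theorem. $\mathrm{ad}(X)$ is nilpotent on $\mathfrak{h}$ and nonsingular on $\mathfrak{q}$. Thus, since $X$ has been assumed to be a regular element, it follows that $\mathrm{rk}(\mathfrak{g}) = \dim\mathfrak{h}$. Now suppose that $Y \in \mathfrak{h}'$. Then, $\mathrm{ad}(Y)$ is nonsingular on $\mathfrak{g}/\mathfrak{h}$ and hence also nonsingular on $\mathfrak{q}$. Thus, $l(Y) \leq l(X) = \mathrm{rk}(\mathfrak{g}) \leq l(Y)$. Thus, $l(Y) = \mathrm{rk}(\mathfrak{g}) = \dim\mathfrak{h}$. Since $\mathrm{ad}(Y)$ is nonsingular on $\mathfrak{q}$ and since $\mathrm{ad}(Y)$ leaves $\mathfrak{h}$ invariant, it must necessarily follow that if $\mathrm{ad}(Y)^m(Z) = 0$ for some positive integer $m$ and $Z \in \mathfrak{g}$, then $Z \in \mathfrak{h}$. In other words, we have shown that $\mathfrak{h}_Y \subset \mathfrak{h}$ and since $\dim\mathfrak{h}_Y = l(Y) = \dim\mathfrak{h}$ it follows that
$$\mathfrak{h}_Y = \mathfrak{h}$$
It follows in particular that $\mathrm{ad}(Y)$ is nilpotent on $\mathfrak{h}$ (since by definition, it is nilpotent on $\mathfrak{h}_Y$). Since $\mathfrak{h}'$ is a nonempty open, and hence dense, subset of $\mathfrak{h}$ and since $Y$ is an arbitrary element of $\mathfrak{h}'$, it follows by continuity that $\mathrm{ad}(Y)$ is nilpotent on $\mathfrak{h}$ for every $Y \in \mathfrak{h}$, so that $\mathfrak{h}$ is nilpotent by Engel's theorem. Moreover, $\mathfrak{h}$ is its own normalizer: if $[Z, \mathfrak{h}] \subset \mathfrak{h}$, then $[X, Z] \in \mathfrak{h}$ and hence $\mathrm{ad}(X)^m(Z) = 0$ for some $m$, ie, $Z \in \mathfrak{h}$. This completes the proof that $\mathfrak{h} = \mathfrak{h}_X$ is a Cartan algebra for every regular $X \in \mathfrak{g}$.
Now let $\mathfrak{h}$ be a Cartan subalgebra of $\mathfrak{g}$. Let $\mathfrak{h}'$ denote the set of regular elements in $\mathfrak{h}$. Let $X \in \mathfrak{h}'$. We claim that $\mathfrak{h} = \mathfrak{h}_X$ where $\mathfrak{h}_X$ has been defined above as $\bigcup_{n\geq 1} N(\mathrm{ad}(X)^n)$. Indeed, since $\mathfrak{h}$ is nilpotent, it follows that $\mathrm{ad}(X)$ is nilpotent on $\mathfrak{h}$ and therefore $\mathfrak{h} \subset \mathfrak{h}_X$. On the other hand, we have seen above that $\mathfrak{h}_X$ is a Cartan algebra and hence nilpotent. Since $\mathfrak{h}$ is also a Cartan algebra, it is maximally nilpotent. Hence $\mathfrak{h} = \mathfrak{h}_X$. In other words, we have proved that every Cartan algebra is of the form $\mathfrak{h}_X$ for some regular element $X$ and in fact our argument shows that $\mathfrak{h} = \mathfrak{h}_X$ for any regular element $X$ of $\mathfrak{h}$.
Now let $\mathfrak{g}$ be any semisimple Lie algebra. Let $\mathfrak{h}$ be any Cartan algebra. Then, since $\mathfrak{h}$ is a nilpotent Lie algebra and hence also a solvable Lie algebra and $H \to \mathrm{ad}(H)$ is a representation of $\mathfrak{h}$ in $\mathfrak{g}$, it follows that there is a basis for $\mathfrak{g}$ relative to which all the operators $\mathrm{ad}(H)$, $H \in \mathfrak{h}$ are upper triangular (not necessarily strictly upper triangular). Now, if $N \in \mathfrak{h}$ is such that $\mathrm{ad}(N)$ is nilpotent on $\mathfrak{g}$, then its matrix relative to this basis will be strictly upper triangular and hence
$$\langle N, H \rangle = \mathrm{Tr}(\mathrm{ad}(N).\mathrm{ad}(H)) = 0, \quad H \in \mathfrak{h}$$
If we are able therefore to prove that $\langle ., . \rangle$ is nonsingular on $\mathfrak{h} \times \mathfrak{h}$, then it would follow from the above relation that $N = 0$, ie, $\mathfrak{h}$ does not have any nonzero nilpotent elements. To prove this claim, we choose $X \in \mathfrak{h}'$. Then, $\mathfrak{h} = \mathfrak{h}_X$. Let $X = S + N_1$ be the Jordan decomposition of $X$ into its semisimple and nilpotent components. Then, from the basic property of the Jordan decomposition, $[X, S] = 0$ and hence $S \in \mathfrak{h}_X = \mathfrak{h}$. Further, since $\mathrm{ad}(X) = \mathrm{ad}(S) + \mathrm{ad}(N_1)$ and $\mathrm{ad}(S)$ is semisimple while $\mathrm{ad}(N_1)$ is nilpotent, it follows that $\mathrm{ad}(X)$ and $\mathrm{ad}(S)$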
have the same characteristic polynomial and hence $S \in \mathfrak{h}'$. Hence, $\mathfrak{h} = \mathfrak{h}_S$. Since $\mathrm{ad}(S)$ is semisimple, it then follows that for any $Y \in \mathfrak{h}$, we have $\mathrm{ad}(S)(Y) = 0$, ie, $[S, Y] = 0$ and in fact, $\mathfrak{h} = N(\mathrm{ad}(S))$. Let $\mathfrak{q} = [S, \mathfrak{g}] = R(\mathrm{ad}(S))$. Since $\mathrm{ad}(S)$ is semisimple, we get
$$\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{q}$$
Then,
$$\langle [Y, S], H \rangle = \langle Y, [S, H] \rangle = 0, \quad H \in \mathfrak{h}, \ Y \in \mathfrak{g}$$
In other words,
$$\langle \mathfrak{q}, \mathfrak{h} \rangle = 0$$
and hence, from the nonsingularity of $\langle ., . \rangle$ on $\mathfrak{g}$ (Cartan's criterion for semisimplicity of $\mathfrak{g}$), it must follow that $\langle ., . \rangle$ is nonsingular on $\mathfrak{h}$. This completes the proof of the claim.
Next, we show that if $\mathfrak{g}$ is semisimple, then any Cartan algebra $\mathfrak{h}$ is maximal Abelian. Let $X \in \mathfrak{h}'$ and let $X = S + N$ be its Jordan decomposition. Then, $\mathrm{ad}(X)(N) = [X, N] = 0$ and hence $N \in \mathfrak{h}_X = \mathfrak{h}$. Since $\mathrm{ad}(N)$ is also nilpotent, by what we proved above, $N = 0$. Hence $X = S$ is semisimple. Thus $\mathrm{ad}(X)^m(Y) = 0$ for some $Y$, $m > 0$ implies $[X, Y] = 0$. In other words, we have that $\mathfrak{h}'$ is Abelian and since $\mathrm{ad}(X)$ is semisimple for all $X \in \mathfrak{h}'$, it follows that $\mathrm{ad}(\mathfrak{h}')$ can be simultaneously diagonalized in $\mathfrak{g}$. By taking limits, noting that $\mathfrak{h}'$ is a nonempty open subset of $\mathfrak{h}$ and hence dense in it, it follows immediately that $\mathrm{ad}(\mathfrak{h})$ is simultaneously diagonable and hence Abelian. From the faithfulness of the adjoint representation of a semisimple Lie algebra, it then follows that $\mathfrak{h}$ is also Abelian and since it is maximally nilpotent, it is also necessarily maximal Abelian. (Suppose $[\mathfrak{h}, Y] = 0$. Then, $\mathrm{ad}(X)(Y) = 0$ for $X \in \mathfrak{h}'$ and hence $Y \in \mathfrak{h}_X = \mathfrak{h}$.) Thus, we have proved the following fundamental result in the theory of semisimple Lie algebras:
Theorem: If $\mathfrak{g}$ is a semisimple Lie algebra then there exists a Lie subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ such that $\mathfrak{h}$ is its own normalizer in $\mathfrak{g}$ and secondly, $\mathfrak{h}$ is maximal Abelian, ie, $\mathfrak{h}$ is Abelian and $Y \in \mathfrak{g}$, $[\mathfrak{h}, Y] = 0$ implies $Y \in \mathfrak{h}$. $\mathfrak{h}$ is called a Cartan subalgebra of $\mathfrak{g}$.
Later on we shall prove that if the field is complex, $\mathfrak{g}$ has exactly one Cartan subalgebra up to conjugacy, ie, if $\mathfrak{h}_1, \mathfrak{h}_2$ are any two Cartan subalgebras of $\mathfrak{g}$, then there exists an $x \in G$ such that $\mathfrak{h}_2 = \mathrm{Ad}(x).\mathfrak{h}_1$. We have already shown, assuming that $\mathfrak{g}$ is a semisimple Lie algebra, that if $\mathfrak{h}$ is a Cartan subalgebra, then there exists a regular element $X \in \mathfrak{g}$ such that $\mathfrak{h} = \mathfrak{h}_X$.
Now we prove a slightly stronger version of this result, namely: There exists a finite number $X_1, \dots, X_r$ of regular elements in $\mathfrak{g}$ such that if $\mathfrak{h}$ is any Cartan algebra, then there exists an $i \in \{1, 2, \dots, r\}$ and an $x \in G$ such that $\mathfrak{h} = \mathfrak{h}_i^x$ where $\mathfrak{h}_i = \mathfrak{h}_{X_i}$, and further that $\mathfrak{h}_i$, $i = 1, 2, \dots, r$ are all mutually nonconjugate.
Note: Since we have made use of the Jordan decomposition on a semisimple Lie algebra in our construction of a Cartan subalgebra and in the proofs of some of these properties, we shall give a proof of this theorem in what follows.
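As a concrete illustration of the statements above, one may take $\mathfrak{g} = sl(2)$ with the basis $H, X, Y$ used elsewhere in this chapter. The sketch below, assuming NumPy, computes $\mathrm{ad}$ in that basis together with the Cartan-Killing form; the helper names `coords` and `ad` are hypothetical, and the choice of $sl(2)$ as the example is an assumption made only for illustration.

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])
basis = [H, X, Y]

def coords(M):
    """Coordinates (a, b, c) of a traceless 2 x 2 matrix M = aH + bX + cY."""
    return np.array([M[0, 0], M[0, 1], M[1, 0]])

def ad(A):
    """Matrix of ad(A) = [A, .] relative to the basis (H, X, Y)."""
    return np.column_stack([coords(A @ B - B @ A) for B in basis])

# Cartan-Killing form <A, B> = Tr(ad(A) ad(B)) on the basis
killing = np.array([[np.trace(ad(A) @ ad(B)) for B in basis] for A in basis])

# ad(H) has eigenvalues 0, 2, -2, so zero is a simple eigenvalue and
# h_H = N(ad(H)) is the line spanned by H
eigenvalues_ad_H = np.sort(np.linalg.eigvals(ad(H)).real)
```

In this example the Killing form is nonsingular on $\mathfrak{g}$, its restriction to the line spanned by $H$ is the nonzero number $\langle H, H \rangle = 8$, and that line is Abelian, in agreement with the theorem.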
Let $\mathfrak{g}$ be a semisimple Lie algebra and let $X \in \mathfrak{g}$. Write
$$\mathrm{ad}(X) = T + U$$
where $U$ is nilpotent on $\mathfrak{g}$, $T$ is semisimple on $\mathfrak{g}$ and $[T, U] = 0$. This is the Jordan decomposition of a linear operator in a vector space. We claim that $T$ and $U$ are also derivations on $\mathfrak{g}$. In fact, we know from Chevalley's theory of replicas that $T$ and $U$ are replicas of $\mathrm{ad}(X)$ and therefore since $\mathrm{ad}(X)$ is a derivation, so are $T$ and $U$.
Remark: Let $V$ be a vector space, $\beta : V \times V \to V$ bilinear and let $D$ be a derivation on $V$ w.r.t. $\beta$, ie,
$$D\beta(X, Y) = \beta(DX, Y) + \beta(X, DY), \quad X, Y \in V$$
Then let $L$ be a replica of $D$. It is easy to see that $L$ is then also a derivation on $V$ w.r.t. $\beta$. Indeed, consider the mapping
$$\eta : V \otimes V^* \otimes V^* \to B(V \times V, V)$$
where $B(V \times V, V)$ is the space of $V$-valued bilinear maps on $V$. The map $\eta$ is defined by
$$\eta(X \otimes f_1 \otimes f_2)(U, V) = f_1(U)f_2(V)X$$
and then extending $\eta$ by linearity. It is then easy to see that $\eta$ is a vector space isomorphism. Note that if we choose a basis $\{e_1, \dots, e_n\}$ for $V$, and if $\{e_1^*, \dots, e_n^*\}$ denotes the corresponding dual basis, then
$$\beta(U, V) = \sum_{i,j} \beta(e_i, e_j)e_i^*(U)e_j^*(V) = \eta\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes e_j^*\Big)(U, V)$$
or equivalently,
$$\beta = \eta\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes e_j^*\Big)$$
which proves that $\eta$ is surjective and therefore also injective since the dimension of $V \otimes V^* \otimes V^*$ equals $(\dim V)^3$, which is also the dimension of the space of all $V$-valued bilinear maps on $V$. Writing
$$\beta_1(U, V) = \beta(DU, V), \quad \beta_2(U, V) = \beta(U, DV)$$
we have
$$\beta_1(U, V) = \sum_{i,j} \beta_1(e_i, e_j)e_i^*(U)e_j^*(V) = \sum_{i,j} \beta(De_i, e_j)e_i^*(U)e_j^*(V)$$
$$= \sum_{i,j,k} [D]_{ki}\beta(e_k, e_j)e_i^*(U)e_j^*(V) = \sum_{j,k} \beta(e_k, e_j)(D^Te_k^*)(U)e_j^*(V)$$
and therefore,
$$\beta_1 = \eta\Big(\sum_{i,j} \beta(e_i, e_j) \otimes (D^Te_i^*) \otimes e_j^*\Big)$$
Likewise,
$$\beta_2 = \eta\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes D^Te_j^*\Big)$$
Further,
$$D(\beta(U, V)) = \sum_{i,j} (D\beta(e_i, e_j))e_i^*(U)e_j^*(V) = \eta\Big(\sum_{i,j} (D\beta(e_i, e_j)) \otimes e_i^* \otimes e_j^*\Big)(U, V)$$
and hence the derivation property of $D$, namely
$$D(\beta(U, V)) = \beta_1(U, V) + \beta_2(U, V)$$
can equivalently be expressed as
$$\sum_{i,j} \big[(D\beta(e_i, e_j)) \otimes e_i^* \otimes e_j^* - \beta(e_i, e_j) \otimes D^Te_i^* \otimes e_j^* - \beta(e_i, e_j) \otimes e_i^* \otimes D^Te_j^*\big] = 0$$
which in the notation of replicas means that
$$D_{1,2}\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes e_j^*\Big) = 0 \qquad (1)$$
Now let $D$ be a derivation on $V$ w.r.t. $\beta$ and let
$$D = T + U$$
be the Jordan decomposition of $D$. Then, it is known that $T$ and $U$ are also replicas of $D$ and in particular (1) implies that
$$T_{1,2}\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes e_j^*\Big) = 0, \quad U_{1,2}\Big(\sum_{i,j} \beta(e_i, e_j) \otimes e_i^* \otimes e_j^*\Big) = 0$$
ie, $T$ and $U$ are also derivations on $V$ w.r.t. $\beta$.
Now assuming that $\mathfrak{g}$ is a semisimple Lie algebra, we observe that $\mathrm{ad}$ is faithful, ie, injective on $\mathfrak{g}$. Indeed, this follows from the fact that $\mathrm{ad}(X) = 0$ for some nonzero $X \in \mathfrak{g}$ would imply that $[X, \mathfrak{g}] = 0$ with some nonzero $X$ and this would imply that $\mathfrak{g}$ has a nonzero centre, which contradicts the semisimplicity of $\mathfrak{g}$. Thus, writing the Jordan decomposition of the derivation $\mathrm{ad}(X)$ as
$$\mathrm{ad}(X) = T + U$$
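we get that the semisimple and nilpotent components of $\mathrm{ad}(X)$, namely $T, U$, are also derivations and hence are inner, ie, there exist $S, N \in \mathfrak{g}$ such that $T = \mathrm{ad}(S)$, $U = \mathrm{ad}(N)$. Then, from the faithfulness of $\mathrm{ad}$, it follows that
$$X = S + N$$
with $\mathrm{ad}(S)$ semisimple, $\mathrm{ad}(N)$ nilpotent and $[\mathrm{ad}(S), \mathrm{ad}(N)] = 0$, or equivalently, $\mathrm{ad}([S, N]) = 0$, or equivalently, $[S, N] = 0$. This is the celebrated Jordan decomposition in a semisimple Lie algebra.
Remark: Let $D$ be a derivation on $\mathfrak{g}$ where $\mathfrak{g}$ is a semisimple Lie algebra. Then, $D$ is inner. Indeed, for $X, Y \in \mathfrak{g}$, we have
$$\mathrm{ad}(DX)(Y) = [DX, Y] = D[X, Y] - [X, DY] = D\circ\mathrm{ad}(X)(Y) - \mathrm{ad}(X)(DY)$$
or equivalently,
$$\mathrm{ad}(DX) = [D, \mathrm{ad}(X)], \quad \forall X \in \mathfrak{g}$$
Now, by nondegeneracy of the Cartan-Killing form $\langle ., . \rangle$ on a semisimple Lie algebra, we have a unique $X \in \mathfrak{g}$ such that
$$\mathrm{Tr}(D.\mathrm{ad}(Y)) = \langle X, Y \rangle = \mathrm{Tr}(\mathrm{ad}(X).\mathrm{ad}(Y)) \quad \forall Y \in \mathfrak{g}$$
since $Y \to \mathrm{Tr}(D.\mathrm{ad}(Y))$ is a linear functional on $\mathfrak{g}$. This shows that
$$\mathrm{Tr}(D'.\mathrm{ad}(Y)) = 0 \quad \forall Y \in \mathfrak{g}$$
where
$$D' = D - \mathrm{ad}(X)$$
Now, $D'$ is also a derivation since $D$ and $\mathrm{ad}(X)$ are, and hence applying the above formula to $D'$ in place of $D$, we get
$$\mathrm{ad}(D'Y) = [D', \mathrm{ad}(Y)], \quad Y \in \mathfrak{g}$$
and hence
$$\langle D'Y, Z \rangle = \mathrm{Tr}(\mathrm{ad}(D'Y)\mathrm{ad}(Z)) = \mathrm{Tr}([D', \mathrm{ad}(Y)].\mathrm{ad}(Z))$$
Now,
$$\mathrm{Tr}([D', \mathrm{ad}(Y)]\mathrm{ad}(Z)) = \mathrm{Tr}(D'\mathrm{ad}(Y)\mathrm{ad}(Z)) - \mathrm{Tr}(\mathrm{ad}(Y)D'\mathrm{ad}(Z))$$
$$= \mathrm{Tr}(D'(\mathrm{ad}(Y)\mathrm{ad}(Z) - \mathrm{ad}(Z)\mathrm{ad}(Y))) = \mathrm{Tr}(D'[\mathrm{ad}(Y), \mathrm{ad}(Z)]) = \mathrm{Tr}(D'\mathrm{ad}([Y, Z])) = 0$$
by what we just proved. Therefore,
$$\langle D'Y, Z \rangle = 0 \quad \forall Y, Z \in \mathfrak{g}$$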
and hence by nondegeneracy of $\langle ., . \rangle$, it follows that $D'Y = 0 \ \forall Y \in \mathfrak{g}$ and hence
$$D' = 0$$
ie,
$$D = \mathrm{ad}(X)$$
This completes the proof that $D$ is inner. This result has been used in the proof of the Jordan decomposition on a semisimple Lie algebra.
Remark: Let $T$ be a linear operator in a complex vector space $V$. Let $T'$ denote the commutant of $T$, ie, $T'$ is the set of all operators in $V$ that commute with $T$. Let $T''$ denote the double commutant of $T$, ie, the commutant of $T'$, ie, the set of all operators that commute with every operator in $T'$. Then, it is easy to show that $T''$ is precisely the set of all polynomials in $T$ with complex coefficients. In fact, this can be proved by restricting $T'$ and $T''$ to the space $W_k = N((T - c_k)^{m_k}) = R(E_k)$ where $I = E_1 + \dots + E_r$ or equivalently,
$$V = W_1 \oplus \dots \oplus W_r$$
is the primary decomposition of $T$ with $p(t) = \prod_{k=1}^r (t - c_k)^{m_k}$ being the minimal polynomial of $T$. Note that $W_k$ is also $T'$ and $T''$ invariant since $T'' \subset T'$. Then by restricting to $W_k$ the claim reduces to proving that if $X \in (cI + N)''$ where $c$ is a complex scalar and $N$ is nilpotent in $V$, then $X$ is a polynomial in $cI + N$. Further reduction of this problem can be achieved by using the Jordan canonical form of $N$. In other words, proving the claim reduces to proving that if $J_c$ is a Jordan matrix in $V$, ie, $J_c = cI + Z$ where $c \in \mathbb{C}$ and $Z$ has ones on the first superdiagonal and all the other entries as zero, then $J_c''$ is precisely the set of all polynomials in $J_c$.
Remark: Let $T$ be an operator in $V$. We claim that if $S$ is a replica of $T$, then $S$ is a polynomial in $T$ with the constant term in the polynomial being zero, ie, $S = Tf(T)$ where $f$ is a polynomial. To see this, we first define an isomorphism $\mu : V \otimes V^* \to L(V)$, where $L(V)$ is the space of all linear operators on $V$, by
$$\mu(v \otimes w^*)(x) = w^*(x)v, \quad x, v \in V, \ w^* \in V^*$$
and then extending $\mu$ by bilinearity. Then,
$$\mathrm{ad}(\mu(v \otimes w^*))(T)(x) = [\mu(v \otimes w^*), T](x) = w^*(Tx)v - w^*(x)Tv$$
$$= (T^Tw^*)(x)v - w^*(x)Tv = \mu(v \otimes T^Tw^*)(x) - \mu(Tv \otimes w^*)(x) = -\mu(T_{1,1}(v \otimes w^*))(x)$$
Equivalently,
$$\mathrm{ad}(T)(\mu(v \otimes w^*))(x) = \mu(T_{1,1}(v \otimes w^*))(x)$$
or equivalently,
$$\mathrm{ad}(T)\circ\mu = \mu\circ T_{1,1}$$
or equivalently,
$$\mathrm{ad}(T) = \mu\circ T_{1,1}\circ\mu^{-1}, \quad T_{1,1} = \mu^{-1}\circ\mathrm{ad}(T)\circ\mu$$
If $S$ is a replica of $T$, then $T_{r,s}\xi = 0$ implies $S_{r,s}\xi = 0$ for any $r, s \geq 0$, $r + s \geq 1$. Thus, in particular, $T_{1,1}\xi = 0$ implies $S_{1,1}\xi = 0$, or equivalently, in view of the above discussion, for an operator $U$ in $V$, $\mathrm{ad}(T)(U) = 0$ implies $\mu\circ T_{1,1}\circ\mu^{-1}(U) = 0$ implies $T_{1,1}\circ\mu^{-1}(U) = 0$ implies $S_{1,1}\circ\mu^{-1}(U) = 0$ implies $\mu\circ S_{1,1}\circ\mu^{-1}(U) = 0$ implies $\mathrm{ad}(S)(U) = 0$. In other words, $S$ commutes with any operator $U$ that commutes with $T$, ie, $S \in T''$ and hence by the previous remark, $S$ is a polynomial in $T$, say $S = f(T)$ where $f$ is a polynomial. Further, since $S$ is a replica of $T$, $T\xi = 0$ implies $S\xi = 0$ implies $f(T)\xi = 0$ implies that if $c$ is the constant term in $f(t)$, then $c\xi = 0$. In other words, if $T$ has a zero eigenvalue, then $c = 0$, ie, $S = Tg(T)$ where $g$ is a polynomial. Suppose that zero is not an eigenvalue of $T$. Then, $T$ is invertible and hence the minimal polynomial of $T$ has the form
$$p(t) = \prod_{k=1}^r (t - c_k)^{m_k}, \quad c_k \neq 0 \ \forall k$$
Thus,
$$p(t) = c + tq(t), \quad c \neq 0$$
with $q$ a polynomial. Then $p(T) = 0$ implies
$$cI + Tq(T) = 0$$
and therefore,
$$T^{-1} = -c^{-1}q(T)$$
ie, $T^{-1}$ is also a polynomial in $T$. It then follows that if $S$ is a replica of $T$, then
$$S = f(T) = c_0I + Tg(T) = c_0TT^{-1} + Tg(T) = T((-c_0/c)q(T) + g(T))$$
ie, $S$ can be expressed as a polynomial in $T$ where the polynomial has zero constant term. We shall now prove the following important result: $S$ is a replica of $T$ iff for all $r, s \geq 0$, $r + s \geq 1$, we have that
$$S_{r,s} = p_{r,s}(T_{r,s})$$
where $p_{r,s}$ is a polynomial with zero constant term.
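The step $T^{-1} = -c^{-1}q(T)$ can be illustrated numerically. The sketch below, assuming NumPy, uses the characteristic polynomial (obtained from `np.poly`) in place of the minimal polynomial of the text; this substitution is an assumption made for convenience and is legitimate because the characteristic polynomial also annihilates $T$ (Cayley-Hamilton) and has nonzero constant term when $T$ is invertible. The function name and the sample matrix are hypothetical.

```python
import numpy as np

def inverse_as_polynomial(T):
    """Return T^{-1} written as -q(T)/c, where p(t) = c + t q(t) annihilates T."""
    n = T.shape[0]
    a = np.poly(T)            # coefficients [1, a_1, ..., a_n] of det(tI - T)
    c = a[-1]                 # constant term; nonzero iff T is invertible
    # Horner's rule for q(T) = T^{n-1} + a_1 T^{n-2} + ... + a_{n-1} I
    q = np.zeros((n, n))
    for coeff in a[:-1]:
        q = q @ T + coeff * np.eye(n)
    return -q / c

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])   # hypothetical invertible matrix (det = 25)
T_inv = inverse_as_polynomial(T)
```

The product `T_inv @ T` should agree with the identity up to rounding, confirming that the inverse of an invertible operator is a polynomial in that operator.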
1.11 Exercises in Matrix Theory
[1] Let $X$ be a square matrix of size $n \times n$. Prove that $X$ has a polar decomposition:
$$X = UP$$
where $U$ is unitary and $P$ is positive semidefinite. In case $X$ is nonsingular, show that
$$P = \sqrt{X^*X}, \quad U = X(X^*X)^{-1/2}$$
where $\sqrt{Q}$ denotes the unique positive semidefinite square root of a positive semidefinite matrix $Q$.
hint: Show that if $Q$ is positive definite of size $n \times n$ with distinct eigenvalues, then $Q$ can have at most $2^n$ distinct square roots, out of which exactly one is positive semidefinite. Show that $X$, $X^*X$ and $|X| = \sqrt{X^*X}$ all have the same nullspace and hence the same nullity and hence the same rank, and hence $R(|X|)^\perp$ and $R(X)^\perp$ also have the same dimension. Show that the operator $U_1 : R(|X|) \to R(X)$ defined by $U_1|X|x = Xx \ \forall x \in \mathbb{C}^n$ is a well defined unitary operator. Do this by showing that the lengths of $|X|x$ and $Xx$ are the same. Hence show that there exists a unitary operator $U_2$ from $R(|X|)^\perp \to R(X)^\perp$. Let $V = \mathbb{C}^n$ and hence
$$V = R(|X|) \oplus R(|X|)^\perp = R(X) \oplus R(X)^\perp$$
Define a linear operator $U : V \to V$ by the relation that $U$ restricted to $R(|X|)$ equals $U_1$ and $U$ restricted to $R(|X|)^\perp$ equals $U_2$. Show that $U$ is unitary and
$$X = U_1|X| = U|X|$$
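For the nonsingular case of exercise [1], the formulas $P = \sqrt{X^*X}$ and $U = X(X^*X)^{-1/2}$ can be tried out directly. The following is a minimal sketch assuming NumPy; the function name `polar` and the random test matrix are hypothetical, and the square root is computed from the spectral decomposition of the Hermitian matrix $X^*X$.

```python
import numpy as np

def polar(X):
    """Polar decomposition X = U P of a nonsingular square matrix X."""
    # Spectral decomposition X* X = V diag(w) V*, with w > 0 for nonsingular X
    w, V = np.linalg.eigh(X.conj().T @ X)
    P = (V * np.sqrt(w)) @ V.conj().T      # P = sqrt(X* X), positive definite
    U = X @ np.linalg.inv(P)               # U = X (X* X)^{-1/2}
    return U, P

# Hypothetical data: a random complex 4 x 4 matrix (nonsingular with probability one)
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, P = polar(X)
```

One can then check that `U` is unitary, `P` is Hermitian with positive eigenvalues, and `U @ P` reproduces `X`.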
[2] Deduce the singular value decomposition from the polar decomposition: If $X$ is a matrix of size $m \times n$ of rank $r$ (note that $r \leq \min(m, n)$), then there exist unitary matrices $U \in \mathbb{C}^{m\times m}$, $V \in \mathbb{C}^{n\times n}$ and a matrix $D \in \mathbb{C}^{m\times n}$ having the block structure
$$D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix}$$
where
$$D_1 = \mathrm{diag}[\sigma_1, \dots, \sigma_r], \quad \sigma_1, \dots, \sigma_r > 0$$
such that
$$X = UDV^*$$
[3] Prove the Riesz representation theorem in an infinite dimensional Hilbert space $H$: If $f : H \to \mathbb{C}$ is a bounded linear functional, ie, $\| f \| = \sup_{x\neq 0} |f(x)|/\| x \| < \infty$, then there exists a unique vector $z_f \in H$ such that
$$f(x) = \langle z_f, x \rangle, \quad x \in H$$
hint: Assume $H$ to be separable, which guarantees the existence of an orthonormal basis $\{e_n : n \geq 1\}$. Then show that for any $x \in H$, we have
$$x = \sum_n e_n\langle e_n, x \rangle$$
Deduce using the linearity and boundedness of $f$ that
$$f(x) = \sum_n f(e_n)\langle e_n, x \rangle$$
Finally, show that
$$\infty > \| f \|^2 = \sum_n |f(e_n)|^2$$
and hence that
$$z_f = \sum_n \bar{f}(e_n)e_n \in H$$
is well defined.
[4] If $T : V_1 \to V_2$ is a linear transformation from one vector space $V_1$ to another vector space $V_2$, both assumed to be finite dimensional, then letting $V_1^*, V_2^*$ denote the vector spaces of linear functionals on $V_1$ and $V_2$ respectively, define the transpose $T'$ of $T$ as a transformation $T' : V_2^* \to V_1^*$ by
$$T'f(x) = f(Tx), \quad x \in V_1, \ f \in V_2^*$$
Show that $T'$ is a well defined linear transformation and if we have a third vector space $V_3$ and a linear transformation $S : V_2 \to V_3$, then
$$(ST)' = T'S'$$
For these statements to be true, do we actually require $V_1, V_2, V_3$ to be finite dimensional or can we drop this condition?
[5] This problem gives us some properties of the derivative of a function with values in a Hilbert space. Let $H$ be an infinite dimensional Hilbert space and $x : \mathbb{R} \to H$ be a function such that
$$\lim_{\delta\to 0} (x(t + \delta) - x(t))/\delta = y(t) \in H$$
exists for $t \in (-a, +a)$, where the convergence is in the sense of the norm induced by the inner product in $H$. Deduce that if $z \in H$ is arbitrary, then
$$\frac{d}{dt}\langle x(t), z \rangle = \langle y(t), z \rangle, \quad t \in (-a, a), \ z \in H$$
where now the convergence is in the usual sense on the real line. We shall write $dx(t)/dt = y(t)$, $t \in (-a, a)$ and say that $x(t)$ is differentiable in $(-a, a)$ with derivative $dx(t)/dt$ equal to $y(t)$ for $t \in (-a, a)$. Now prove that if $x_1(t)$ and
$x_2(t)$ assume values in $H$ and are differentiable in $(-a, a)$, then for any $c_1, c_2 \in \mathbb{C}$, $c_1x_1(t) + c_2x_2(t)$ is also differentiable in $(-a, a)$ with derivative given by
$$\frac{d}{dt}(c_1x_1(t) + c_2x_2(t)) = c_1\frac{dx_1(t)}{dt} + c_2\frac{dx_2(t)}{dt}, \quad t \in (-a, a)$$
hint: Prove using the definitions
$$\lim_{\delta\to 0} \Big\| \frac{x_k(t + \delta) - x_k(t)}{\delta} - \frac{dx_k(t)}{dt} \Big\| = 0, \quad k = 1, 2$$
and the triangle inequality that
$$\lim_{\delta\to 0} \Big\| \frac{c_1x_1(t + \delta) + c_2x_2(t + \delta) - c_1x_1(t) - c_2x_2(t)}{\delta} - \Big(c_1\frac{dx_1(t)}{dt} + c_2\frac{dx_2(t)}{dt}\Big) \Big\| = 0$$
[6] Let $T$ be a linear operator on a finite dimensional complex vector space $V$ with minimal polynomial $p(t) = \prod_{k=1}^r (t - c_k)^{m_k}$, the $c_k$'s distinct and $m_k > 0$. The primary decomposition of $T$ is
$$V = \bigoplus_{k=1}^r W_k, \quad I = \sum_{k=1}^r E_k, \quad R(E_k) = W_k, \quad E_kE_j = 0, \ k \neq j$$
Then if $S$ is another linear operator in $V$ such that $[T, S]$ leaves each of the $W_k$ invariant, show that $S$ also shares the same property.
hint: Note that the $E_k$'s are all polynomials in $T$ and hence commute with $T$. Also note that an operator $L$ leaves each of the $W_k$ invariant iff
$$L = \sum_{k,j} E_kLE_j = \sum_k E_kLE_k$$
or equivalently, iff
$$E_kLE_j = 0 \quad \forall k \neq j$$
Hence, for all $k \neq j$,
$$0 = E_k[T, S]E_j = [T, E_kSE_j]$$
and hence $E_kSE_j$ leaves every $W_i$ invariant and in particular, $W_j$ invariant. This means that $E_kSE_j = 0$ for all $k \neq j$ and hence
$$S = \sum_{k,j} E_kSE_j = \sum_k E_kSE_k$$
proving that $S$ leaves every $W_k$ invariant. Note that in the proof, we have used the easily verified fact that if $[T, U] = 0$, then $U$ leaves every $W_k$ invariant because
$$UE_k = E_kU, \quad W_k = R(E_k)$$
Note that $U$ commutes with $E_k$ because $E_k$ is a polynomial in $T$.
[7] Let $G$ be a connected Lie group and let $H$ be a connected Lie subgroup of $G$. Thus if $\mathfrak{g}$ and $\mathfrak{h}$ are respectively the Lie algebras of $G$ and $H$, then $G$ and $H$ are generated by the images of the respective exponential maps, ie, by $\exp(\mathfrak{g})$ and $\exp(\mathfrak{h})$. Show then that if $N(H)$ denotes the normalizer of $H$ in $G$, ie, the set of all $g \in G$ for which $gHg^{-1} \subset H$, and if $\mathfrak{n}(H)$ denotes the Lie algebra of $N(H)$, then
$$\mathfrak{n}(H) = \{X \in \mathfrak{g} : [X, \mathfrak{h}] \subset \mathfrak{h}\}$$
[8] This problem discusses a method for obtaining all the finite dimensional irreducible representations of the Lie group $SL(2, \mathbb{C})$ or equivalently of its Lie algebra $sl(2, \mathbb{C})$. Let $G = SL(2, \mathbb{C})$, ie, the set of all $2 \times 2$ complex matrices having determinant one. Let $sl(2, \mathbb{C}) = \mathfrak{g}$, the Lie algebra of $G$. Show that $sl(2, \mathbb{C})$ is the set of all $2 \times 2$ complex matrices having trace zero. Show that a basis for $sl(2, \mathbb{C})$ is given by $\{H, X, Y\}$, where
$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$
Now prove the following identities: [a]
$$[H, X] = 2X, \quad [H, Y] = -2Y, \quad [X, Y] = H$$
and hence if $\pi$ is any representation of $sl(2, \mathbb{C})$ in a vector space $V$, then
$$[\pi(H), \pi(X)] = 2\pi(X), \quad [\pi(H), \pi(Y)] = -2\pi(Y), \quad [\pi(X), \pi(Y)] = \pi(H)$$
Let now $\pi$ be in particular a representation of $sl(2, \mathbb{C})$ in a finite dimensional complex vector space $V$. Show that if $v$ is a vector in $V$ such that $\pi(H)v = \lambda v$ for some $\lambda \in \mathbb{C}$, then
$$\pi(H)\pi(X)v = (\lambda + 2)\pi(X)v, \quad \pi(H)\pi(Y)v = (\lambda - 2)\pi(Y)v$$
Hence, deduce from the finite dimensionality of $V$ that there exist a nonzero vector $v_0 \in V$ and a $\lambda_0 \in \mathbb{C}$ such that
$$\pi(X)v_0 = 0, \quad \pi(H)v_0 = \lambda_0v_0$$
and that there exists a smallest positive integer $l$ such that $\pi(Y)^lv_0 = 0$. By smallest, we mean that $\pi(Y)^{l-1}v_0 \neq 0$. Deduce that
$$V_0 = \mathrm{span}\{\pi(Y)^mv_0 : 0 \leq m \leq l - 1\}$$
is an invariant subspace for $\pi$ in $V$, ie, $\pi(sl(2, \mathbb{C}))(V_0) \subset V_0$, and hence deduce that if $\pi$ is assumed to be irreducible, then $\{\pi(Y)^mv_0 : 0 \leq m \leq l - 1\}$ is a basis for $V$. Now prove that if $n$ is any positive integer, then
$$[\pi(H), \pi(X)^n] = 2n\,\pi(X)^n, \quad [\pi(H), \pi(Y)^n] = -2n\,\pi(Y)^n$$
and
$$[\pi(X), \pi(Y)^n] = \pi(H)\pi(Y)^{n-1} + \pi(Y)\pi(H)\pi(Y)^{n-2} + \dots + \pi(Y)^{n-1}\pi(H)$$
$$= [-2(n-1)\pi(Y)^{n-1} + \pi(Y)^{n-1}\pi(H)] + [-2(n-2)\pi(Y)^{n-1} + \pi(Y)^{n-1}\pi(H)] + \dots + [0\cdot\pi(Y)^{n-1} + \pi(Y)^{n-1}\pi(H)]$$
$$= -2(0 + 1 + 2 + \dots + (n-1))\pi(Y)^{n-1} + n\pi(Y)^{n-1}\pi(H) = -n(n-1)\pi(Y)^{n-1} + n\pi(Y)^{n-1}\pi(H)$$
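The commutator identities of this exercise can be checked on explicit matrices. The sketch below, assuming NumPy, builds the matrices of $\pi(H), \pi(X), \pi(Y)$ in the basis $v_m = \pi(Y)^mv_0$ under the assumption that the highest weight is a nonnegative integer `lam`, with $\pi(H)v_m = (\lambda - 2m)v_m$ and $\pi(X)v_m = m(\lambda - m + 1)v_{m-1}$. These explicit formulas are the standard ones but are not stated in the text above, so they should be read as an assumption; the helper names are hypothetical.

```python
import numpy as np

def sl2_irrep(lam):
    """Matrices of pi(H), pi(X), pi(Y) in the basis v_m = pi(Y)^m v_0, m = 0..lam."""
    d = lam + 1
    H = np.diag([float(lam - 2 * m) for m in range(d)])
    X = np.zeros((d, d))
    Y = np.zeros((d, d))
    for m in range(1, d):
        Y[m, m - 1] = 1.0                   # pi(Y) v_{m-1} = v_m
        X[m - 1, m] = m * (lam - m + 1)     # pi(X) v_m = m (lam - m + 1) v_{m-1}
    return H, X, Y

def comm(A, B):
    """Commutator [A, B]."""
    return A @ B - B @ A

H, X, Y = sl2_irrep(5)                      # a 6-dimensional example
n = 3
Yn = np.linalg.matrix_power(Y, n)
Yn1 = np.linalg.matrix_power(Y, n - 1)
lhs = comm(X, Yn)
rhs = -n * (n - 1) * Yn1 + n * Yn1 @ H      # -n(n-1) Y^{n-1} + n Y^{n-1} H
```

For this example `lhs` and `rhs` coincide, and the same check can be repeated for other values of `n` and `lam`.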
1.12 Conjugacy classes of Cartan subalgebras
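Consider a semisimple Lie algebra $\mathfrak{g}$ and let $\mathfrak{g}'$ denote the set of all of its regular elements. Let $\mathfrak{g}_i$, $i = 1, 2, \dots, r$ denote all the connected components of $\mathfrak{g}'$. For each $i = 1, 2, \dots, r$, choose an $X_i \in \mathfrak{g}_i$. Then since $X_i$ is regular, it follows that $\mathfrak{h}_i = \mathfrak{h}_{X_i}$ is a Cartan algebra for each $i$. Let $\mathfrak{h}$ be any Cartan algebra. Then $\mathfrak{h} = \mathfrak{h}_X$ for some regular $X$ as we have already seen above. Since $X$ is regular, it follows that $X \in \mathfrak{g}_i$ for some $i$. Our aim is to show that $\mathfrak{h} = \mathfrak{h}_X$ is conjugate to $\mathfrak{h}_i$. We would then have established that any semisimple Lie algebra (finite dimensional) has a finite set of nonconjugate Cartan subalgebras such that any Cartan subalgebra is conjugate to one in this set.
Let $\mathfrak{h}_X'$ denote the set of regular elements in $\mathfrak{h}_X$, ie, $\mathfrak{h}_X' = \mathfrak{h}_X \cap \mathfrak{g}'$. Let $\mathfrak{h}_X^+$ denote the connected component of $\mathfrak{h}_X'$ that contains $X$ and define $\mathfrak{b}_X = (\mathfrak{h}_X^+)^G$. Then, it is clear that $\mathfrak{b}_X$ is connected. Choose any $Z \in \mathfrak{b}_X$. Then, $Z$ is regular and hence the Cartan algebra $\mathfrak{h}_Z$ is defined. We claim that $\mathfrak{b}_Z = \mathfrak{b}_X$. Indeed, $Z^y$ is a regular element in $\mathfrak{h}_X^+$ for some $y \in G$ and hence, $\mathfrak{h}_Z^y = \mathfrak{h}_{Z^y} = \mathfrak{h}_X$, which implies that
$$(\mathfrak{h}_Z^+)^y = \mathfrak{h}_X^+$$
and therefore
$$\mathfrak{b}_Z = (\mathfrak{h}_Z^+)^G = (\mathfrak{h}_X^+)^G = \mathfrak{b}_X$$
proving the claim. Now let $U, V \in \mathfrak{g}_i$. We claim that either $\mathfrak{b}_U = \mathfrak{b}_V$ or else $\mathfrak{b}_U \cap \mathfrak{b}_V = \emptyset$. Indeed, suppose $Z \in \mathfrak{b}_U \cap \mathfrak{b}_V$. Then, $Z \in \mathfrak{b}_U$ which implies as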
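shown above that $\mathfrak{b}_Z = \mathfrak{b}_U$ and likewise, $\mathfrak{b}_Z = \mathfrak{b}_V$. Thus, $\mathfrak{b}_U = \mathfrak{b}_V$, thereby proving the claim. Further, $\mathfrak{b}_U$ is an open connected set of regular elements containing $U$ and hence $\mathfrak{b}_U \subset \mathfrak{g}_i$. In other words, we have proved that $\{\mathfrak{b}_U : U \in \mathfrak{g}_i\}$ is a family of connected open sets, each of which is contained in $\mathfrak{g}_i$, and two elements in this family are either disjoint or the same. Further, the union of this whole family is precisely $\mathfrak{g}_i$ since if $U \in \mathfrak{g}_i$, we have that $U \in \mathfrak{b}_U$. It follows from the connectedness of $\mathfrak{g}_i$ that $\mathfrak{b}_U = \mathfrak{g}_i \ \forall U \in \mathfrak{g}_i$. We have thus shown that $\mathfrak{b}_{X_i} = \mathfrak{g}_i$, $i = 1, 2, \dots, r$. Now, let $\mathfrak{h} = \mathfrak{h}_X$ be as above with $X \in \mathfrak{g}_i$ (recall that any Cartan subalgebra is of this form for some regular $X$, and some $i$). Then we have established that $\mathfrak{b}_X = \mathfrak{b}_{X_i}$. In other words, we have established that
$$(\mathfrak{h}_X^+)^G = (\mathfrak{h}_{X_i}^+)^G$$
and this implies that $X = Z^y$ for some $y \in G$ and some $Z \in \mathfrak{h}_{X_i}^+$ (recall that $X \in \mathfrak{h}_X^+$). Then,
$$\mathfrak{h} = \mathfrak{h}_X = \mathfrak{h}_{Z^y} = \mathfrak{h}_Z^y = \mathfrak{h}_{X_i}^y$$
(since $Z, X_i$ are both regular elements in $\mathfrak{h}_{X_i}$). Therefore, $\mathfrak{h} = \mathfrak{h}_i^y$, ie, $\mathfrak{h}$ is conjugate to $\mathfrak{h}_i$. This establishes our claim.
Remark: If the underlying field of $\mathfrak{g}$ is complex, then $\mathfrak{g}$ has just one Cartan subalgebra up to conjugacy, ie, any two of its Cartan subalgebras are conjugate.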
1.13 Exercises
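[1] Let $\mathfrak{h}$ be a Cartan subalgebra of any finite dimensional Lie algebra $\mathfrak{g}$, i.e., $\mathfrak{h}$ is a Lie subalgebra that is nilpotent and is its own normalizer. Show that there exists an $X \in \mathfrak{g}'$ ($\mathfrak{g}'$ is the set of regular elements in $\mathfrak{g}$) such that $\mathfrak{h}_X = \mathfrak{h}$, where
$$\mathfrak{h}_X = \bigcup_{m>0} N(\mathrm{ad}(X)^m)$$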
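[2] With $\mathfrak{h}$ any Cartan subalgebra of a Lie algebra $\mathfrak{g}$, show that $X \in \mathfrak{h}$ is regular iff
$$\zeta(X) = \det(\mathrm{ad}(X)|_{\mathfrak{g}/\mathfrak{h}}) \neq 0$$
Hint: for $X \in \mathfrak{h}$, $\zeta(X)$ is zero iff $\mathrm{ad}(X)|_{\mathfrak{g}/\mathfrak{h}}$ has a zero eigenvalue iff, when we write $\det(tI - \mathrm{ad}(X)) = c_k t^k + c_{k+1} t^{k+1} + \ldots + t^n$, $n = \dim \mathfrak{g}$, with $c_k \neq 0$, we have $k > \dim(\mathfrak{h}) = \mathrm{rk}(\mathfrak{g})$. Do this by noting that for $X \in \mathfrak{h}$, all the eigenvalues of $\mathrm{ad}(X)|_{\mathfrak{h}}$ are zero since, by definition, $\mathrm{ad}(X)$ is nilpotent on $\mathfrak{h}$, and therefore $\det(tI - \mathrm{ad}(X)|_{\mathfrak{h}}) = t^l$, $l = \dim(\mathfrak{h}) = \mathrm{rk}(\mathfrak{g})$.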
[3] With $\mathfrak{h}_X$ as in problems [1] and [2] for $X$ regular (i.e., $X \in \mathfrak{g}'$, $\mathfrak{h}_X = \bigcup_{m>0} N(\mathrm{ad}(X)^m)$), show without assuming that $\mathfrak{h}_X$ is a Cartan algebra that $\mathrm{ad}(X)$ is nonsingular on $\mathfrak{g}/\mathfrak{h}_X$ and hence $\dim(\mathfrak{h}_X) = \mathrm{rk}(\mathfrak{g})$. (Note that to prove that $\mathrm{ad}(X)$ is nonsingular on $\mathfrak{g}/\mathfrak{h}_X$, you need not even assume that $X$ is regular, since by the definition of $\mathfrak{h}_X$, $\mathrm{ad}(X)|_{\mathfrak{g}/\mathfrak{h}_X}$ is nonsingular.) Now show that with $X$ regular, if $Y \in \mathfrak{h}_X$, then $\mathrm{ad}(X)^m(Y) = 0$ for some $m > 0$ and hence $\mathrm{ad}(Y)$ leaves $\mathfrak{h}_X$ invariant. Show further that if $\zeta(Y) = \det(\mathrm{ad}(Y)|_{\mathfrak{g}/\mathfrak{h}_X}) \neq 0$, then $Y$ is regular and hence $\mathfrak{h}_Y = \mathfrak{h}_X$. Deduce from this that for such a $Y$, $\mathrm{ad}(Y)$ is nilpotent on $\mathfrak{h}_X$. Then observe that if $Y \in \mathfrak{h}_X$ is arbitrary, we can write $Y = \lim Y_n$ with $\zeta(Y_n) \neq 0$ and hence conclude that $\mathrm{ad}(Y)$ is nilpotent on $\mathfrak{h}_X$, i.e., $\mathfrak{h}_X$ is a nilpotent Lie algebra. Deduce from this that if $Y \in \mathfrak{h}_X$, then $Y$ is regular iff $\zeta(Y) = \det(\mathrm{ad}(Y)|_{\mathfrak{g}/\mathfrak{h}_X}) \neq 0$.
[4] This problem is a prerequisite for attempting the previous problems: let $V$ be a finite dimensional vector space and $T$ a linear operator on $V$. Let $W$ be a $T$-invariant $r$-dimensional subspace of $V$. Choose any basis $\{e_{r+1} + W, \ldots, e_n + W\}$ for $V/W$. Show that if $\{e_1, \ldots, e_r\}$ is any basis for $W$, then $B = \{e_1, \ldots, e_n\}$ is a basis for $V$. Show that $[T]_B$ has the block structure
$$[T]_B = \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix}$$
and hence deduce that
$$\det(T) = \det(A_{11})\det(A_{22}) = \det(T|_W)\det(T|_{V/W})$$
Note that $W$ is the zero element of the vector space $V/W$. Specialize this result to show that the characteristic polynomial of $T$ can be expressed as
$$f(t) = \det(tI - T) = \det(tI - T|_W)\det(tI - T|_{V/W})$$
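The determinant and characteristic polynomial factorizations in the last exercise can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly chosen blocks; the sizes `n`, `r` and the names `A11`, `A12`, `A22` are illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2  # dim V and dim W (illustrative sizes)

# Matrix of T in a basis adapted to the invariant subspace W:
# upper block-triangular with A11 = T|W and A22 = T|V/W.
A11 = rng.standard_normal((r, r))
A12 = rng.standard_normal((r, n - r))
A22 = rng.standard_normal((n - r, n - r))
T = np.block([[A11, A12], [np.zeros((n - r, r)), A22]])

det_T = np.linalg.det(T)
det_product = np.linalg.det(A11) * np.linalg.det(A22)

# np.poly applied to a square matrix returns the coefficients of det(tI - M).
char_T = np.poly(T)
char_product = np.polymul(np.poly(A11), np.poly(A22))

print(det_T, det_product)
print(np.allclose(char_T, char_product))
```

A numerical check of this kind does not replace the proof asked for in the exercise; it only confirms the statement on one random instance.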
1.14 Appendix: Some applications of matrix theory to control theory problems
A. Controllability of the Yang-Mills non-Abelian field equations
The Lagrangian for the non-Abelian gauge fields $A^a_\mu(x)$, $a = 1, 2, \ldots, N$, $\mu = 0, 1, 2, 3$, is
$$L = (-1/2) F^a_{\mu\nu} F^{\mu\nu a}$$
where
$$F^a_{\mu\nu} = [D_\mu, D_\nu]^a$$
with $D_\mu$ the gauge covariant derivative defined by
$$D_\mu = \partial_\mu + iA^a_\mu \tau_a = \partial_\mu + iA_\mu$$
where $\tau_a$, $a = 1, 2, \ldots, N$, are Hermitian generators of the gauge group Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. The structure constants associated with these generators are denoted by $C(abc)$:
$$[\tau_a, \tau_b] = -i \sum_{c=1}^{N} C(abc)\tau_c$$
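Note that
$$F_{\mu\nu} = F^a_{\mu\nu}\, i\tau_a = [D_\mu, D_\nu] = [\partial_\mu + iA_\mu, \partial_\nu + iA_\nu]$$
$$= i(A_{\nu,\mu} - A_{\mu,\nu}) - [A_\mu, A_\nu]$$
$$= (A^a_{\nu,\mu} - A^a_{\mu,\nu})\, i\tau_a - A^b_\mu A^c_\nu [\tau_b, \tau_c]$$
$$= (A^a_{\nu,\mu} - A^a_{\mu,\nu} + C(abc) A^b_\mu A^c_\nu)\, i\tau_a$$
Thus,
$$F^a_{\mu\nu} = A^a_{\nu,\mu} - A^a_{\mu,\nu} + C(abc) A^b_\mu A^c_\nu$$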
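The nonlinear term in this component formula can be checked against the commutator for a concrete gauge group. The sketch below assumes $G = SU(2)$ with $\tau_a = \sigma_a/2$ (Pauli matrices), for which the convention $[\tau_a, \tau_b] = -iC(abc)\tau_c$ gives $C(abc) = -\epsilon_{abc}$, and it takes the potentials to be constant so that the derivative terms drop out. These choices are illustrative assumptions, not part of the text.

```python
import numpy as np

# Pauli matrices; tau_a = sigma_a / 2 are Hermitian generators of su(2).
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
tau = sigma / 2

# Levi-Civita symbol. Since [tau_a, tau_b] = i eps_abc tau_c, the convention
# [tau_a, tau_b] = -i C(abc) tau_c used in the text gives C(abc) = -eps_abc.
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
C = -eps

rng = np.random.default_rng(1)
A_mu = rng.standard_normal(3)  # components A^a_mu at one point (assumed constant)
A_nu = rng.standard_normal(3)  # components A^a_nu at the same point

# For constant potentials, [D_mu, D_nu] = [iA_mu, iA_nu] = -[A_mu, A_nu].
Amu = np.einsum("a,aij->ij", A_mu, tau)
Anu = np.einsum("a,aij->ij", A_nu, tau)
F_matrix = -(Amu @ Anu - Anu @ Amu)

# Component formula: F^a = C(abc) A^b_mu A^c_nu and F = F^a (i tau_a).
F_comp = np.einsum("abc,b,c->a", C, A_mu, A_nu)
F_from_components = 1j * np.einsum("a,aij->ij", F_comp, tau)

print(np.allclose(F_matrix, F_from_components))
```

The same check extends to other groups by replacing `tau` and `C` with the corresponding generators and structure constants.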
The field equations in the absence of current sources are obtained from the variational principle
$$\delta \int F^a_{\mu\nu} F^{\mu\nu a}\, d^4x = 0$$
and these give
$$[D_\nu, F^{\mu\nu}] = 0$$
or equivalently,
$$\partial_\nu F^{\mu\nu a} + C(abc) A^b_\nu F^{\mu\nu c} = 0$$
In the presence of interaction with a current source $J^{\mu a}$, with the interaction Lagrangian being
$$L_{int} = J^{\mu a} A^a_\mu$$
the field equations are
$$\partial_\nu F^{\mu\nu a} + C(abc) A^b_\nu F^{\mu\nu c} = J^{\mu a}$$
and these are non-Abelian $G$ generalizations of the Abelian $U(1)$ electromagnetic field equations, where $G$ is a Lie subgroup of $U(M)$ with $\dim G = N$, and where by the dimension of a Lie group we mean the dimension of its Lie algebra as a vector space. The above field equations can be expanded as
$$\partial_\nu (A^{\nu a,\mu} - A^{\mu a,\nu} + C(abc) A^{\mu b} A^{\nu c}) + C(abc) A^b_\nu (A^{\nu c,\mu} - A^{\mu c,\nu} + C(cde) A^{\mu d} A^{\nu e}) = J^{\mu a}$$
or equivalently, after attaching a perturbation parameter $\delta$ to keep track of the quadratic and cubic nonlinear terms,
$$A^{\nu a,\mu}_{\ \ \ \ \ ,\nu} - A^{\mu a,\nu}_{\ \ \ \ \ ,\nu} + \delta C(abc)(A^{\mu b} A^{\nu c})_{,\nu} + \delta C(abc) A^b_\nu (A^{\nu c,\mu} - A^{\mu c,\nu}) + \delta^2 C(abc) C(cde) A^b_\nu A^{\mu d} A^{\nu e} = J^{\mu a} \qquad (1)$$
The controllability problem for these field equations is then posed as follows: Let $U$ and $V$ be two disjoint subsets of $\mathbb{R}^3$. Given that at time $t = 0$ the potentials $A^a_\mu$ and their time derivatives $A^a_{\mu,0}$ have prescribed values on $U$, does there exist a control current field $J^{\mu a}(t, r)$, $0 \le t \le T$, $r \in \mathbb{R}^3$, such that at time $T$ these potentials have prescribed values on $V$? More generally, we can ask: given two disjoint subsets $U, V$ of $\mathbb{R}^4$, does there exist a control current field $J^{\mu a}(x)$ on $\mathbb{R}^4$ and a solution $A^a_\mu$ to the above Yang-Mills field equations corresponding to this current source such that these potentials have prescribed values on both $U$ and $V$? We shall attempt to solve this controllability problem approximately by means of perturbation theory. First, we expand the solution in powers of $\delta$:
$$A^a_\mu = A^{a(0)}_\mu + \sum_{k \ge 1} \delta^k A^{a(k)}_\mu \qquad (2)$$
Since the gauge group $G$ has dimension $N$ equal to the number of possible values of the gauge index $a$, we can always gauge transform the gauge field so that the gauge conditions $\partial_\mu A^{\mu a} = 0$ hold. In that case, (1) reduces to
$$-A^{\mu a,\nu}_{\ \ \ \ \ ,\nu} + \delta C(abc) A^{\mu b}_{\ \ \ ,\nu} A^{\nu c} + \delta C(abc) A^b_\nu (A^{\nu c,\mu} - A^{\mu c,\nu}) + \delta^2 C(abc) C(cde) A^b_\nu A^{\mu d} A^{\nu e} = J^{\mu a} \qquad (3a)$$
or equivalently, with
$$\Box = \partial^\nu \partial_\nu = \partial_0^2 - \nabla^2$$
denoting the d'Alembert wave operator,
$$-\Box A^{\mu a} + \delta C(abc) A^{\mu b}_{\ \ \ ,\nu} A^{\nu c} + \delta C(abc) A^b_\nu (A^{\nu c,\mu} - A^{\mu c,\nu}) + \delta^2 C(abc) C(cde) A^b_\nu A^{\mu d} A^{\nu e} = J^{\mu a} \qquad (3b)$$
or, on using the antisymmetry of the structure constants,
$$-\Box A^{\mu a} + \delta C(abc) A^{\nu c} (2A^{\mu b}_{\ \ \ ,\nu} - A^{b\ ,\mu}_{\nu}) + \delta^2 C(abc) C(cde) A^b_\nu A^{\mu d} A^{\nu e} = J^{\mu a} \qquad (3c)$$
Substituting the perturbation expansion (2) into (3c) and equating equal powers of $\delta$ gives us, for $\delta^0 = 1$,
$$-\Box A^{\mu a(0)} = J^{\mu a},$$
for $\delta^1 = \delta$,
$$-\Box A^{\mu a(1)} + C(abc) A^{\nu c(0)} (2A^{\mu b(0)}_{\ \ \ \ \ \ ,\nu} - A^{b(0)\ ,\mu}_{\nu}) = 0,$$
and for $\delta^2$,
$$-\Box A^{\mu a(2)} + C(abc)\big[A^{\nu c(0)} (2A^{\mu b(1)}_{\ \ \ \ \ \ ,\nu} - A^{b(1)\ ,\mu}_{\nu}) + A^{\nu c(1)} (2A^{\mu b(0)}_{\ \ \ \ \ \ ,\nu} - A^{b(0)\ ,\mu}_{\nu})\big] + C(abc) C(cde) A^{b(0)}_\nu A^{\mu d(0)} A^{\nu e(0)} = 0$$
Let $G(x - x')$ denote the Green's function for the wave operator $\Box$. Then we can successively solve the above equations as
$$A^{\mu a(0)}(x) = -\int G(x - x') J^{\mu a}(x')\, d^4x'$$
$$A^{\mu a(1)}(x) = C(abc) \int G(x - x') A^{\nu c(0)}(x') (2A^{\mu b(0)}_{\ \ \ \ \ \ ,\nu} - A^{b(0)\ ,\mu}_{\nu})(x')\, d^4x',$$
$$A^{\mu a(2)}(x) = C(abc) \int G(x - x') \big[A^{\nu c(0)}(x') (2A^{\mu b(1)}_{\ \ \ \ \ \ ,\nu} - A^{b(1)\ ,\mu}_{\nu})(x') + A^{\nu c(1)}(x') (2A^{\mu b(0)}_{\ \ \ \ \ \ ,\nu} - A^{b(0)\ ,\mu}_{\nu})(x')\big]\, d^4x'$$
$$+\, C(abc) C(cde) \int G(x - x') A^{b(0)}_\nu(x') A^{\mu d(0)}(x') A^{\nu e(0)}(x')\, d^4x'$$
Remark: $G(x)$ satisfies the pde
$$\Box G(x) = \delta^4(x)$$
which on four dimensional Fourier transforming gives
$$\hat{G}(k) = \frac{1}{k^2}, \quad k^2 = k_\mu k^\mu$$
It is easily deduced then that $G(x) = C\delta(x^2)$, $x^2 = x_\mu x^\mu$, is one such solution. This follows by four dimensional Fourier inversion. It is easily seen from the above formulas that up to $O(\delta^2)$, the solution can be expressed as
$$A(x) = \int G_0(x - x') J(x')\, d^4x' + \delta \int G_1(x - x', x - y') (J(x') \otimes J(y'))\, d^4x'\, d^4y'$$
$$+\, \delta^2 \int G_2(x - x', x - y', x - z') (J(x') \otimes J(y') \otimes J(z'))\, d^4x'\, d^4y'\, d^4z' \qquad (4)$$
where $A(x)$, $J(x)$ are appropriate vector space valued smooth functions on $\mathbb{R}^4$ and $G_0, G_1, G_2$ are appropriate matrix valued known functions on $\mathbb{R}^4$, $\mathbb{R}^4 \times \mathbb{R}^4$ and $\mathbb{R}^4 \times \mathbb{R}^4 \times \mathbb{R}^4$. Now the controllability problem is easily stated: Given an $\epsilon > 0$ and a function $A_d(x)$ on $\mathbb{R}^4$, does there exist a source current $J(x)$ such that the output of the above system is within $\epsilon$ of $A_d(x)$? Actually, the problem is more intricate if we take initial conditions into account. In that case, we absorb the initial conditions $A_i(0, r)$ into the solution for the zeroth order perturbation. Then,
$$A_0(x) = \int G(x - x') J(x')\, d^4x' + \int F(t, r | r') A_i(0, r')\, d^3r'$$
This initial condition then propagates into the higher order perturbations, yielding finally up to second order in $\delta$ a solution of the form
$$A(x) = \int G_0(x - x') J(x')\, d^4x' + \delta \int G_1(x - x', x - y') (J(x') \otimes J(y'))\, d^4x'\, d^4y'$$
$$+\, \delta^2 \int G_2(x - x', x - y', x - z') (J(x') \otimes J(y') \otimes J(z'))\, d^4x'\, d^4y'\, d^4z'$$
$$+ \int F_0(t, r | r') A_i(0, r')\, d^3r' + \delta \int F_1(t, r | r', r'') (A_i(0, r') \otimes A_i(0, r''))\, d^3r'\, d^3r''$$
$$+\, \delta^2 \int F_2(t, r | r', r'', r''') (A_i(0, r') \otimes A_i(0, r'') \otimes A_i(0, r'''))\, d^3r'\, d^3r''\, d^3r'''$$
and the question of approximate controllability is then the following: for a given input field $A_i(r)$ at time $0$, a given output field $A_f(r)$ at time $T$ and an $\epsilon > 0$, does there exist a control input current field $J(x)$ for which the output $A(T, r)$ in the above equation at time $T$ has a mean weighted square distance from $A_f(r)$, defined by
$$\int W(r) \| A(T, r) - A_f(r) \|^2\, d^3r,$$
smaller than $\epsilon$?

Remark: After discretization in the spatial variables, the Yang-Mills field equations appear in state variable form as
$$x'(t) = A_0 x(t) + \delta A_1 (x(t) \otimes x(t)) + \delta^2 A_2 (x(t) \otimes x(t) \otimes x(t)) + u(t)$$
A second order perturbative solution gives
$$x(t) = \int_0^t G_0(t - s) u(s)\, ds + \delta \int_0^t \int_0^t G_1(t - s_1, t - s_2) (u(s_1) \otimes u(s_2))\, ds_1\, ds_2$$
$$+\, \delta^2 \int_{[0,t]^3} G_2(t - s_1, t - s_2, t - s_3) (u(s_1) \otimes u(s_2) \otimes u(s_3))\, ds_1\, ds_2\, ds_3$$
where
G0 (t) = exp(tA0 )
G1 (t1 , t2 ) = G2 (t1 , t2 , t3 ) =
Using second order perturbation theory, with
$$x(t) = x_0(t) + \delta x_1(t) + \delta^2 x_2(t) + O(\delta^3)$$
we get
$$x_1'(t) = A_0 x_1(t) + A_1 (x_0(t) \otimes x_0(t))$$
so
$$x_1(t) = \int_{0 < s_1, s_2 < s < t} G_0(t - s) A_1 (G_0(s - s_1) \otimes G_0(s - s_2)) (u(s_1) \otimes u(s_2))\, ds_1\, ds_2\, ds$$
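The first order perturbation equations above can be tested on a small example. The sketch below is an illustration under assumptions: a two-dimensional state, arbitrarily chosen matrices `A0`, `A1`, the cubic term $A_2$ set to zero, zero initial state, and a hand-written RK4 integrator. It integrates the full state equation together with the equations for $x_0$ and $x_1$ and compares the zeroth and first order approximations.

```python
import numpy as np

# Illustrative system matrices (not from the text); A2 is taken to be zero.
A0 = np.array([[-1.0, 0.5], [0.0, -2.0]])
A1 = np.array([[0.3, 0.0, 0.0, -0.2],
               [0.0, 0.4, 0.1, 0.0]])  # maps x (x) x in R^4 to R^2
delta = 0.05

def u(t):
    """Control input; an arbitrary smooth test signal."""
    return np.array([np.sin(t), np.cos(2.0 * t)])

def rhs(t, z):
    """Stacked dynamics for (x, x0, x1): full state and first two orders."""
    x, x0, x1 = z[0:2], z[2:4], z[4:6]
    dx = A0 @ x + delta * A1 @ np.kron(x, x) + u(t)
    dx0 = A0 @ x0 + u(t)
    dx1 = A0 @ x1 + A1 @ np.kron(x0, x0)
    return np.concatenate([dx, dx0, dx1])

def rk4_path(f, z0, t_end, steps):
    """Classical fourth order Runge-Kutta; returns the whole trajectory."""
    h = t_end / steps
    z, t = z0.copy(), 0.0
    path = [z.copy()]
    for _ in range(steps):
        k1 = f(t, z)
        k2 = f(t + h / 2, z + h / 2 * k1)
        k3 = f(t + h / 2, z + h / 2 * k2)
        k4 = f(t + h, z + h * k3)
        z = z + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        path.append(z.copy())
    return np.array(path)

path = rk4_path(rhs, np.zeros(6), t_end=5.0, steps=2000)
x, x0, x1 = path[:, 0:2], path[:, 2:4], path[:, 4:6]

err0 = np.max(np.linalg.norm(x - x0, axis=1))               # expected O(delta)
err1 = np.max(np.linalg.norm(x - x0 - delta * x1, axis=1))  # expected O(delta^2)
print(err0, err1)
```

Solving the hierarchy as differential equations avoids evaluating the nested integral for $x_1(t)$ directly; the two are equivalent because $G_0(t) = \exp(tA_0)$ is the propagator of the linear part.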
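$$P\Big(\int_D \| v(x, \theta) + \sqrt{\epsilon}\, d(x) \|^2\, dx \ge \delta^2\Big) \approx \exp\Big(-(2\epsilon)^{-1} \inf_{g : \int_D \| v(x,\theta) + g(x) \|^2 dx > \delta^2} \int_{D \times D} g(x)^T Q(x, y) g(y)\, dx\, dy\Big)$$
Finally, the parameter vector $\theta$ must be chosen so that this deviation probability is as small as possible, or equivalently, such that
$$\inf_{g : \int_D \| v(x,\theta) + g(x) \|^2 dx > \delta^2} \int_{D \times D} g(x)^T Q(x, y) g(y)\, dx\, dy$$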
is as large as possible.

Some additional remarks on controllability: Suppose we have a nonlinear vector sde of the form
$$dx(t) = f(t, x(t), \theta)\, dt + \sqrt{\epsilon}\, g(t, x(t), \theta)\, dB(t)$$
Then we wish to design the control parameters $\theta$ such that the probability of the trajectory falling in a set $D \subset C[0, T]^n$ over the time duration $[0, T]$ is minimized. The large deviation solution to this problem is to maximize
$$E(\theta) = \inf_{\{x \in D,\ \xi \in C^1[0,T] :\ dx(t)/dt = f(t, x(t), \theta) + g(t, x(t), \theta)\xi(t)\}} \int_0^T \| \xi(t) \|^2\, dt$$
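A scalar example makes the role of $E(\theta)$ concrete. The sketch below assumes the linear sde $dx = -\theta x\, dt + \sqrt{\epsilon}\, dB$ with $x(0) = 0$ and takes $D$ to be the set of paths with $|x(t)| > 1$ for some $t \le T$; both are illustrative assumptions. For this system the infimum has the closed form $2\theta / (1 - e^{-2\theta T})$ (the minimum energy needed to steer the noiseless system to the level $1$ at time $T$), and a crude Euler-Maruyama Monte Carlo estimate of the probability is computed alongside it. The function names and parameter values are hypothetical.

```python
import numpy as np

def exit_probability(theta, eps, T=1.0, steps=200, paths=4000, level=1.0, seed=0):
    """Euler-Maruyama estimate of P(sup_{t<=T} |x(t)| > level)."""
    rng = np.random.default_rng(seed)
    h = T / steps
    x = np.zeros(paths)
    hit = np.zeros(paths, dtype=bool)
    for _ in range(steps):
        dB = rng.standard_normal(paths) * np.sqrt(h)
        x = x - theta * x * h + np.sqrt(eps) * dB
        hit |= np.abs(x) > level
    return hit.mean()

def min_energy(theta, T=1.0, level=1.0):
    """inf of int_0^T xi(t)^2 dt subject to dx/dt = -theta*x + xi, x(0) = 0,
    |x(T)| = level; equals level^2 divided by the controllability Gramian."""
    gram = (1.0 - np.exp(-2.0 * theta * T)) / (2.0 * theta)
    return level**2 / gram

for theta in (0.5, 2.0):
    print(theta, min_energy(theta), exit_probability(theta, eps=0.3))
```

Under these assumptions, a larger feedback gain $\theta$ raises the minimum energy and lowers the estimated probability, which is the qualitative behaviour the large deviation criterion predicts.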
1.18 Approximate controllability of the Maxwell equations
The wave equations for the vector and scalar potentials are
$$\Box A^\mu = J^\mu + \sqrt{\epsilon}\, w^\mu$$
where
$$J^\mu_{,\mu} = 0, \quad w^\mu_{,\mu} = 0$$
and $w^r(x)$ are jointly Gaussian noise fields with zero mean. Thus,
$$w^0_{,0} = -w^r_{,r}, \quad w^0 = -\int_0^{x^0} w^r_{,r}\, dx^0$$
We now incorporate control terms in the current, i.e., replace $J^\mu$ by $J^\mu(x) + K^\mu(x|\theta)$, where $\theta$ are control parameters and $K^\mu$ satisfies
$$K^\mu_{,\mu}(x|\theta) = 0 \quad \forall \theta \in \Theta$$
The solution is
$$A^\mu(x) = \int G(x - x') (J^\mu(x') + K^\mu(x'|\theta) + \sqrt{\epsilon}\, w^\mu(x'))\, d^4x'$$
and the problem is to use large deviation theory for Gaussian processes to minimize the probability
$$P\Big(\int_D \| A^\mu(x) - A^\mu_d(x) \|^2 W(x)\, d^4x > \epsilon\Big)$$
by instead maximizing, w.r.t. the control parameters $\theta$, the infimum of the rate function of $A^\mu$ over the set indicated in the probability.
1.19 Controllability problems in quantum scattering theory
Let $H_0$ denote the free projectile Hamiltonian and $H_1(\theta)$ the Hamiltonian when the projectile interacts with the scattering centre; $\theta$ is a control parameter vector for the scattering potential. The wave operators are
$$\Omega_+(\theta) = \lim_{t \to \infty} \exp(-itH_1(\theta)) \exp(itH_0), \quad \Omega_-(\theta) = \lim_{t \to -\infty} \exp(-itH_1(\theta)) \exp(itH_0)$$
These wave operators can be computed using the Lippmann-Schwinger equations. The scattering matrix is then
$$S(\theta) = \Omega_+(\theta)^* \Omega_-(\theta)$$
The controllability problem is then to determine whether, for each scattering operator $S_d$ (i.e., a unitary operator) in a given family, there exists a $\theta$ for which $S(\theta)$ has a distance smaller than $\epsilon$ from $S_d$ w.r.t. the spectral norm.
1.20 Kalman's notion of controllability and its extension to pde's
Consider the state equations
$$X'(t) = AX(t) + BU(t)$$
This can also be expressed as
$$[I\, d/dt - A,\ -B] \begin{pmatrix} X(t) \\ U(t) \end{pmatrix} = 0$$
The controllability problem then involves determining whether a solution $(X(t), U(t))$ to this equation exists over the time interval $[t_1, t_2]$ such that the values of $X(t)$ at $t_1$ and $t_2$ are given. Generalizing this to pde's, we ask the question: given a matrix partial differential operator $p(\partial)$, does there exist a solution $f$ to it, i.e., $p(\partial) f(x) = 0$, such that $\Psi(f(x))$ has specified values on two disjoint Borel sets $U$ and $V$? Note that in the above special case considered by Kalman, the two disjoint sets are $\{t_1\}$ and $\{t_2\}$.
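For the finite dimensional case, the question has a standard constructive answer via the Kalman rank condition and the controllability Gramian. The sketch below is illustrative only: it assumes a double integrator for $(A, B)$, takes $t_1 = 0$, $t_2 = T$, uses a truncated Taylor series for the matrix exponential, and steers the state with the minimum energy control $U(t) = B^T e^{A^T(T - t)} W^{-1}(x_{target} - e^{AT} x_{start})$. None of these specific choices come from the text.

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
n = A.shape[0]

# Kalman rank test: (A, B) is controllable iff [B, AB, ..., A^{n-1}B] has rank n.
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
rank = np.linalg.matrix_rank(ctrb)

def expm_series(M, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small M)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

T, steps = 2.0, 2000
h = T / steps
ts = np.linspace(0.0, T, steps + 1)

# Controllability Gramian W = int_0^T e^{As} B B^T e^{A^T s} ds (trapezoid rule).
vals = np.array([expm_series(A * s) @ B @ B.T @ expm_series(A * s).T for s in ts])
W = h * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)

x_start = np.array([1.0, 0.0])
x_target = np.array([0.0, 0.0])
eta = np.linalg.solve(W, x_target - expm_series(A * T) @ x_start)

def control(t):
    """Minimum energy control steering x_start at time 0 to x_target at time T."""
    return B.T @ expm_series(A * (T - t)).T @ eta

def f(t, x):
    return A @ x + B @ control(t)

# Integrate the controlled state equation with classical RK4.
x = x_start.copy()
for t in ts[:-1]:
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
x_final = x

print(rank, np.linalg.norm(x_final - x_target))
```

When the rank test fails, the Gramian is singular and the linear solve for `eta` breaks down, which is the finite dimensional analogue of the non-existence of a solution with the prescribed boundary values.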
1.21 Controllability in the context of representations of Lie groups
Let $\pi$ be a representation of a Lie group $G$ that acts on a manifold $M$. Let an image field $f_1(x)$ be given on $M$. After transforming it by a $G$-action and adding noise to it, the image field becomes
$$f_2(x) = f_1(g^{-1}.x) + w(x), \quad x \in M$$
We can regard $f_1$ as the input image field and $f_2$ as the output image field, with the system being defined by $g \in G$. We can write $g(t) = \exp(tX)$ for a one parameter subgroup $t \to g(t)$ of $G$, where $X$ is an element of the Lie algebra $\mathfrak{g}$ of $G$. Then the initial image field $f_1(x)$ after time $t$ transforms to
$$f_2(t, x) = f_1(\exp(-tX).x), \quad t \ge 0,\ x \in M$$
Its rate of change at time $t$ is given by
$$\partial f_2(t, x)/\partial t = -\xi_X(x).f_2(t, x)$$
where $\xi_X$ is the vector field induced on $M$ by the infinitesimal action of the one parameter group $g(t)$ on $M$, i.e.,
$$\xi_X(x) = \frac{d}{dt} \exp(tX).x \Big|_{t=0}$$
Formally, we can express the solution to the above pde as
$$f_2(t, x) = \exp(-t\xi_X(x)).f_1(x)$$
and now we can pose the controllability question: Given a family $F$ of smooth functions on $M$ and two elements $f_a, f_b$ in $F$, does there exist an element $X \in \mathfrak{g}$ such that
$$f_b(x) = \exp(-T\xi_X(x)) f_a(x)$$
for some fixed $T > 0$? Note that this is equivalent to the existence of a one parameter group $g(t)$ such that
$$f_b(x) = f_a(g(T)^{-1} x)$$
Now the representation $\pi$ of $G$ induces a representation $d\pi$ of $\mathfrak{g}$ such that
$$\pi(\exp(tX)) = \exp(t\, d\pi(X)), \quad t \ge 0$$
If $G$ acts transitively on $M$, we can define a Fourier transform of $f(x)$ at $\pi$ formally by choosing a point $x_0 \in M$ and defining
$$\hat{f}(\pi) = \int_G f(gx_0) \pi(g)\, dg$$
After time $t$, $f(x)$ evolves to $f(t, x) = f(\exp(-tX)x)$ and its Fourier transform at $\pi$ is then given by
$$\hat{f}(t, \pi) = \int_G f(t, gx_0) \pi(g)\, dg = \int_G f(\exp(-tX) g x_0) \pi(g)\, dg$$
$$= \int_G f(gx_0) \pi(\exp(tX) g)\, dg = \pi(\exp(tX)) \hat{f}(\pi)$$
which is equivalent to saying that
$$\partial \hat{f}(t, \pi)/\partial t = d\pi(X).\hat{f}(t, \pi)$$
the right side being interpreted in terms of ordinary matrix multiplication. Thus, we have
$$\hat{f}(t, \pi) = \exp(t\, d\pi(X)) \hat{f}(\pi)$$
and hence we can pose the controllability problem of when this operation will carry the Fourier transform of an initial signal field evaluated at $\pi$ to the Fourier transform of another signal field at $\pi$, when both the signal fields are taken from a given family.
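The covariance property $\hat{f}(t, \pi) = \pi(\exp(tX))\hat{f}(\pi)$ can be illustrated in the simplest discrete setting. The sketch below assumes $G = M = \mathbb{Z}_n$ acting on itself by translation, base point $x_0 = 0$, the integral over $G$ replaced by a sum, and the one dimensional representations $\pi_k(g) = e^{2\pi i k g/n}$; this finite group stands in for the Lie group of the text and is an illustrative assumption.

```python
import numpy as np

n = 8
rng = np.random.default_rng(2)
f = rng.standard_normal(n)  # signal on M = Z_n, base point x0 = 0

def character(k, g):
    """One dimensional representation pi_k of Z_n evaluated at g."""
    return np.exp(2j * np.pi * k * g / n)

def fourier(signal, k):
    """f^(pi_k) = sum_g f(g . x0) pi_k(g), the discrete analogue of the integral."""
    g = np.arange(n)
    return np.sum(signal * character(k, g))

g0 = 3  # group element applied to the image field
shifted = np.array([f[(x - g0) % n] for x in range(n)])  # x -> f(g0^{-1} . x)

lhs = np.array([fourier(shifted, k) for k in range(n)])
rhs = np.array([character(k, g0) * fourier(f, k) for k in range(n)])
print(np.allclose(lhs, rhs))
```

In this setting the controllability question reduces to asking whether the Fourier coefficients of two signals differ, for every $k$, by the same character value $\pi_k(g_0)$ for a single group element $g_0$.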
1.22 Irreducible representations and maximal ideals
Let $A$ be an algebra with a unit $1$ and $\pi$ a representation of this algebra in a vector space $V$. Assume that $\pi$ has a cyclic vector $v$, i.e.,
$$\pi(A)v = V$$
Define
$$I_0 = \{x \in A : \pi(x)v = 0\}$$
Clearly, $I_0$ is a left ideal of $A$ and $A/I_0$ is isomorphic to $V$ as a vector space via the mapping $x + I_0 \to \pi(x)v$, $x \in A$. Now let $W$ be a $\pi$-invariant subspace of $V$ and define
$$I_W = \{x \in A : \pi(x)v \in W\}$$
Clearly, since $W$ is $\pi$-invariant, it follows that $I_W$ is a left ideal in $A$ containing $I_0$. We can define a representation $\bar{\pi}$ of $A$ in $A/I_0$ by $\bar{\pi}(x)(y + I_0) = xy + I_0$, $x, y \in A$. Then it is clear that $\bar{\pi}$ is isomorphic to $\pi$ and that $I_W/I_0$ is a $\bar{\pi}$-invariant subspace of $A/I_0$. This argument shows that there is a one-one correspondence, i.e., a bijection, between all $\pi$-invariant subspaces of $V$ and all $\bar{\pi}$-invariant subspaces of $A/I_0$. Equivalently, there is a one-one correspondence between all $\pi$-invariant subspaces of $V$ and all left ideals of $A$ containing $I_0$. In particular, if $\pi$ is irreducible, then $I_0$ is a maximal left ideal of $A$, and conversely, if $I_0$ is any maximal left ideal of $A$, then the representation $\bar{\pi}$ of $A$ in $A/I_0$ is irreducible with $1 + I_0$ as a cyclic vector. In other words, there is a one-one correspondence between equivalence classes of irreducible representations of $A$ and maximal left ideals in $A$. This fact was used by Harish-Chandra with great power in constructing all the finite dimensional irreducible representations of a semisimple Lie algebra using dominant integral weights.
1.23 Controllability of the Maxwell-Dirac equations using external classical current and field sources
Suppose that we are given a second quantized Dirac electron-positron field and a second quantized Maxwell photon field inside a cavity having a perfectly conducting boundary. We wish to control these fields so that the far field radiation pattern has a given set of quantum moments in a given state of the cavity field, say the tensor product of a Bosonic and a Fermionic coherent state. Let $A^c_\mu$ and $J^c_\mu$ denote the external classical field and current sources into the cavity. The perturbed Maxwell-Dirac equations are then given by
$$\Box A_\mu = -e\psi^* \alpha_\mu \psi + J^c_\mu, \quad (i\gamma^\mu \partial_\mu - m)\psi = -e\gamma^\mu (A_\mu + A^c_\mu)\psi$$
Let $D(x - x')$ and $S(x - x')$ denote respectively the photon and electron propagator kernels:
$$D = \Box^{-1}, \quad S = (i\gamma^\mu \partial_\mu - m + i0)^{-1}$$
The approximate solutions to these equations are
$$A_\mu(x) = A^{(0)}_\mu(x) + \int D(x - x') (-e\psi_0(x')^* \alpha_\mu \psi_0(x') + J^c_\mu(x'))\, d^4x' = A^{(0)}_\mu(x) + \delta A_\mu(x)$$
$$\psi(x) = \psi_0(x) + \int S(x - x') (-e\gamma^\nu (A^{(0)}_\nu(x') + A^c_\nu(x'))) \psi_0(x')\, d^4x' = \psi_0(x) + \delta\psi(x),$$
where $\psi_0$ is the free second quantized Dirac field that satisfies
$$[i\gamma^\mu \partial_\mu - m]\psi_0 = 0$$
and $A^{(0)}_\mu$ is the free second quantized photon field that satisfies
$$\Box A^{(0)}_\mu = 0$$
$\psi_0$ is expressed as a linear superposition of plane waves with dispersion relation $p^0 = \sqrt{m^2 + p_1^2 + p_2^2 + p_3^2}$, with coefficients being the electron annihilation and positron creation operators in momentum space, while $A^{(0)}_\mu$ is expressed as a linear superposition of plane waves with dispersion relation $k^0 = \sqrt{k_1^2 + k_2^2 + k_3^2}$, with coefficients being photon annihilation and creation operators in momentum space. Using these expressions, if $|\Phi\rangle$ is any state of the electrons, positrons and photons within the cavity, then we can calculate in principle all the moments of the radiation and Dirac field in this state. Electromagnetic radiation from the cavity comes from the current of electrons and positrons within the cavity as well as from the surface current density induced by the quantum magnetic field on the cavity boundary. Applying the retarded potential formula to these two currents, it follows that the total quantum electromagnetic field radiated from the cavity will have the form
$$A_{rad,\mu}(x) = \int_{cavity} G_1(x, x') \psi(x')^* \alpha_\mu \psi(x')\, d^4x' + \int_{cavity} G^{\nu}_{2\mu}(x, x') A_\nu(x')\, d^4x'$$
In this expansion, we retain terms only up to linear order in $\delta\psi$ and $\delta A_\mu$. It follows then that we can express the radiated electromagnetic field as
$$A_{rad,\mu}(x) = F_{1\mu}(x) + \int F_{2\mu}(x, x') \delta\psi(x')\, d^4x' + \int F^{\nu}_{3\mu}(x, x') \delta A_\nu(x')\, d^4x'$$
where $F_{2\mu}(x, x')$ is a linear functional of $\psi_0$ and hence of the electron-positron creation and annihilation operators, while the $F^{\nu}_{3\mu}$ are c-number functions. $F_{1\mu}$ is the retarded Maxwell potential produced by the free Dirac quantum current density $-e\psi_0^* \alpha_\mu \psi_0$ and is therefore a quadratic functional of the electron and positron creation and annihilation operator fields. It should be noted that the classical current and potential sources are contained in the terms $\delta\psi$ and $\delta A_\mu$. The controllability issue can then be posed as follows: For given $\epsilon, \delta > 0$ and a given electromagnetic field $A_{g,\mu}(x)$ in a region $V$ of space-time, do there exist classical control fields $J^c_\mu$ and $A^c_\mu$ so that the quantum average of $A_{rad,\mu}(x)$ in the given coherent state of the electrons, positrons and photons has a distance smaller than $\epsilon$ from the given electromagnetic field, in the sense of a weighted integral of the error square over $V$, and simultaneously this field has a fluctuation mean square value smaller than $\delta^2$ over this region in the coherent state? By fluctuation mean square value, we mean the quantity
$$\int_{V \times V} \langle\Phi| A_{rad,\mu}(x) A_{rad,\nu}(x') |\Phi\rangle W^{\mu\nu}(x, x')\, d^4x\, d^4x' - \int_{V \times V} \langle\Phi| A_{rad,\mu}(x) |\Phi\rangle \langle\Phi| A_{rad,\nu}(x') |\Phi\rangle W^{\mu\nu}(x, x')\, d^4x\, d^4x'$$
We require that this must be smaller than $\delta^2$ and likewise, we require that
$$\int_{V \times V} (\langle\Phi| A_{rad,\mu}(x) |\Phi\rangle - A_{g,\mu}(x)) (\langle\Phi| A_{rad,\nu}(x') |\Phi\rangle - A_{g,\nu}(x')) W^{\mu\nu}(x, x')\, d^4x\, d^4x'$$
must be smaller than $\epsilon^2$.
1.24 Controllability of the EEG signals on the brain surface modeled as a spherical surface by influencing the infinitesimal dipoles in the cells of the brain cortex to vary in accord with sensory perturbations
If $U(r)$ is the potential generated on the brain surface by infinitesimal dipoles $p_1, \ldots, p_N$ present at the locations $r_1, \ldots, r_N$ in the cortex, then $U$ satisfies Poisson's equation
$$\nabla^2 U(r) = -\rho(r)/\epsilon, \quad \rho(r) = \sum_k p_k.\nabla\delta^3(r - r_k)$$
More generally, if we assume the presence of stochastic perturbation terms in the charge distribution, we obtain the following stochastic pde
$$\nabla^2 U(r) = -\sum_k p_k.\nabla\delta^3(r - r_k) + w(r)$$
Assume that there are no charges present on the brain surface, i.e., if $\hat{n}$ is the unit normal at any point on the brain surface, then
$$\partial U(r)/\partial \hat{n} = 0$$
Then if $G(r, r')$ is the Green's function for the Neumann boundary value problem, we have
$$\nabla^2 G(r, r') = \delta^3(r - r'), \quad \partial G(r, r')/\partial \hat{n} = 0$$
and we get as solution
$$U(r) = \int G(r, r') \Big(-\sum_k p_k.\nabla'\delta^3(r' - r_k) + w(r')\Big)\, d^3r'$$
Assume that $w(r)$ is weak zero mean Gaussian noise with known autocorrelation
$$R_w(r, r') = E(w(r) w(r'))$$
Then the controllability problem is to add additional charge sources to the brain cortex, defined by a control charge density $\rho_c(r)$, so that the controlled potential on the brain surface
$$U_c(r) = \int G(r, r') \Big(-\sum_k p_k.\nabla'\delta^3(r' - r_k) + w(r') + \rho_c(r')\Big)\, d^3r'$$
has a minimal probability of determining an electric field $E_c(r) = -\nabla U_c(r)$ on the brain surface $S$ that deviates from a given surface electric field $E_g(r)$ on $S$ by an amount greater than $\epsilon$. This is the classical control problem based on large deviation theory.
1.25 Control and relativity
Application of the representation theory of $SL(2, \mathbb{C})$ as an alternative way of characterizing Lorentz transformations to control problems. We use the irreducible representations of $SL(2, \mathbb{C})$ to estimate a Lorentz group transformation element on a time varying three dimensional image field and, using this estimate, to design an error feedback controller in the group domain so that the transformed image field is as close as possible, in the sense of some distance measure, to a given 3D time varying image field. This idea can be compared to the use of an extended Kalman filter based state observer to design a controller based on output error feedback so that the state tracks a given trajectory.
Given a Lie group, or more generally an algebraic group over a field, the question is how to construct an appropriate basis for an irreducible representation of the group or an irreducible representation of a module. The standard Verma module or Borel-Weil method involves starting with a formal vector, called the highest weight vector, operating on it freely by the negative root vectors of the semisimple Lie algebra, then extracting a maximal ideal from this free module and using the fact that the quotient of the universal enveloping algebra by a maximal ideal is an irreducible module for the universal enveloping algebra of the Lie algebra of the group. Standard monomials based on Schubert varieties of the Grassmannian provide nice bases for irreducible modules of algebraic groups like $SL(n, K)$, $SO(n, K)$, where $K$ is any algebraic field. Such fields naturally appear in the construction of classical codes, and we can look upon the elements of these algebraic groups as linear transformations on the space of code vectors and formulate code pattern recognition problems for the same via the irreducible representations of these groups.
Acknowledgements: I am grateful to Professor Shiva Shankar for suggesting this problem to me and for providing me with his lecture notes on controllability, partial differential equations and the vector potential delivered at the Steklov Institute, Moscow.
References:
[1] Shiva Shankar, "Six lectures at the Steklov Institute, Moscow on Controllability and the vector potential".
[2] Amir Dembo and Ofer Zeitouni, "Large deviations, Techniques and Applications", Springer.
[3] V.S.Varadarajan, "Lie groups, Lie algebras and their Representations", Springer, 1984.
[4] C.S.Seshadri, "An Introduction to Standard Monomials", Hindustan Book Agency.
Chapter 2
Probability Theory, Course Instructor: Harish Parthasarathy
[0] Some philosophical remarks on probability theory: Why are probabilistic models required to simplify calculations involving very complex deterministic dynamical systems? The Buffon needle problem and its application to the Monte Carlo calculation of π.
[1] A.N.Kolmogorov's axiomatic foundations of probability theory: The sample space, σ-algebra of events and probability measure; the notion of a classical probability space (Ω, F, P). Importance of the countable additivity postulate for the probability measure.
[2] Properties of the probability measure.
[3] The notion of independence of events. The Borel-Cantelli lemmas.
[4a] The general definition of a random variable on a probability space. Joint probability distributions and their properties.
[4b] Lebesgue integration in a probability space: The notion of expectation of a random variable.
[4c] Cornerstone theorems of Lebesgue integration theory: Monotone convergence theorem, Fatou's lemma, dominated convergence theorem.
[5] Statement of the Caratheodory extension theorem: Extension of a probability measure as a countably additive measure on an algebra of events to the σ-algebra generated by the algebra.
[6] The product of probability spaces: Notion of the product measure and its application to the construction of independent experiments, Fubini's theorem on integration w.r.t a product probability measure.
[7] Examples of probability spaces from die throwing to coin tossing.
[8] Absolute continuity of two measures and the Radon-Nikodym theorem.
[9] Application of the Radon-Nikodym theorem to the construction of the conditional expectation of a random variable given a sub σ-algebra.
[10] Another derivation of the conditional expectation using orthogonal projection operators in a Hilbert space.
[11] Properties of the conditional expectation.
[12] Application of the conditional expectation to the construction of the minimum mean square nonlinear estimate of a random variable given a family of random variables.
[13] Application of the Radon-Nikodym theorem to the construction of the probability density of a finite set of random variables.
[14] Describing discrete probability distributions using the Dirac δ-distribution.
[15] Estimation of parameters in linear models using linear minimum mean square methods.
[16] Joint characteristic function of a finite set of random variables.
[17] Positive definite properties of the characteristic function and Bochner's theorem.
[18] Jensen's inequality for convex functions of random variables.
[19] Chebyshev's inequality, Markov's inequality.
[20] [a] Various notions of convergence of an infinite sequence of random variables. Convergence almost surely, convergence in probability, convergence in the mean square sense, convergence in Lp norm, convergence in distribution. The relationship between these modes of convergence.
[b] The weak and strong laws of large numbers for sequences of independent random variables.
[c] The Gaussian distribution and the central limit theorem: Proof based on the use of the characteristic function.
[21] Definition of a stochastic process in discrete time and continuous time. Kolmogorov's existence theorem for stochastic processes in discrete time and in continuous time.
[22] The AR, MA and ARMA time series models.
[23] Transmission of stochastic processes through linear and nonlinear filters. Derivation of differential and difference equations satisfied by the output moments and the input-output cross moments in terms of the input moments.
[24] Stationary and wide sense stationary processes.
[25] von Neumann's L2 ergodic theorem, Birkhoff's individual ergodic theorem and ergodicity of a measure preserving transformation with applications to stochastic processes.
[26] Autocorrelation, spectrum, higher order spectra and the causal and noncausal Wiener filters.
[27] Nonlinear filtering and the Kalman and extended Kalman filters for real time filtering.
[28] Simulation of random variables on a computer by transformation of a uniformly distributed random variable.
[29] The Brownian motion, Poisson process and some of their properties.
[30] Stochastic integration w.r.t Brownian motion and stochastic differential equations driven by the Brownian motion process.
[31] Martingales and their properties. Doob's inequality for martingales, the martingale downcrossing inequality and the martingale convergence theorem.
[32] Stochastic integration w.r.t a martingale.
[33] Examples of martingales.
[34] An introduction to large deviation theory with applications to the diffusion exit problem and stabilization of stochastic differential equations with feedback controllers.
[35] Stochastic processes in robotics:
[a] The d-link robot equation.
[b] The d-link robot equation with 3D links: analysis using Lie group theory.
[c] Large deviation control of 3D link robots.
[35] Markov chains and the Chapman-Kolmogorov equations. Examples including the pure birth process, the birth-death process, the telegraph process. The stationary distribution of a Markov chain.
[36] Derivation of the Fokker-Planck equations for a continuous state space Markov process from Ito's stochastic differential equation.
[37] Approximation of the Boltzmann kinetic transport equation for a plasma by the Fokker-Planck equation.
[38] An introduction to probability in quantum mechanics.
[a] Interference of wave functions.
[b] Interpretation of quantum probabilities using Feynman's path integral formula for the probability amplitude.
[c] Transition probabilities in quantum mechanics.
[d] Transition probabilities when the system Hamiltonian is perturbed by a time varying Hamiltonian: development of time dependent perturbation theory.
[e] The quantum mechanical harmonic oscillator and its application to the construction of the Boson Fock space.
[f] The creation, annihilation and conservation processes of Hudson and Parthasarathy in Boson Fock space.
[g] Quantum stochastic integration and quantum stochastic differential equations in the Hudson-Parthasarathy formalism for describing the evolution of quantum systems in the presence of quantum noise.
References:
[1] A.Papoulis, "Probability Theory, Random Variables and Stochastic Processes".
[2] William Feller, "An introduction to probability theory and its applications, vol.I and II", John Wiley.
[3] K.R.Parthasarathy, "An introduction to probability and measure", Hindustan Book Agency.
[4] K.R.Parthasarathy, "An introduction to quantum stochastic calculus", Birkhauser, 1992.
[5] Harish Parthasarathy, "Developments in Mathematical and Conceptual Physics: Concepts and Applications for Engineers", Springer Nature, 2020.
[6] I.Karatzas and S.Shreve, "Brownian motion and stochastic calculus", Springer.
2.1 The basic axioms of Kolmogorov
A triplet $(\Omega, \mathcal{F}, P)$ is called a classical probability space for an experiment, where $\Omega$ is the sample space, namely a set whose elements are called the elementary outcomes of the experiment, $\mathcal{F}$ is a σ-field of subsets of $\Omega$ and $P : \mathcal{F} \to [0, 1]$ is a probability measure. By a σ-field, we mean a nonempty collection of subsets that is closed under countable unions and complementation, and hence also under countable intersections by De Morgan's rule $\bigcap_n E_n = (\bigcup_n E_n^c)^c$; it therefore also contains the sample space as well as the null set. The elements of $\mathcal{F}$ (which are subsets of $\Omega$) are called the events of the experiment. If $E \in \mathcal{F}$, we say that the event $E$ has occurred if, on performing the experiment, the elementary outcome $\omega \in E$. We say that the event $E$ has not occurred, i.e., $E^c$ has occurred, if the elementary outcome $\omega \in E^c$, or equivalently $\omega \notin E$. If $E_n$, $n = 1, 2, ...$ is a finite or infinite sequence of events, then the finite/countable union $\bigcup_n E_n$ is an event by hypothesis, and this event is said to have occurred if the elementary outcome $\omega$ is in at least one of the $E_n$'s. Likewise, if $\omega$ is in all the $E_n$'s, i.e., $\omega \in \bigcap_n E_n$, then we say that all the events $E_n$, $n = 1, 2, ...$ have occurred.

Note that $\mathcal{F}$ need not be closed under arbitrary unions/intersections. The reason for this is seen when we define the probability measure $P$ as a countably additive set function on $\mathcal{F}$ (which means that $E = \bigcup_n E_n$, $E_n \cap E_m = \phi\ \forall n \neq m$ imply $P(E) = \sum_n P(E_n)$) such that $P(\Omega) = 1$ and hence $P(\phi) = 0$. Now suppose we assumed that $\mathcal{F}$ is closed under arbitrary unions, not necessarily countable, and that $P$ is also additive under uncountable unions of disjoint events. Then we run into trouble, as the following example shows. Let $P$ be the uniform distribution on the closed interval $[0, 1]$. Then $P([0, 1]) = 1$. However, $P(\{x\}) = 0$ for any single point $x \in [0, 1]$, since $P$ by definition is given by $P([a, b]) = b - a$ for $0 \le a \le b \le 1$. On the other hand, uncountable additivity of $P$ would result in

$$1 = P([0, 1]) = \sum_{x \in [0,1]} P(\{x\}) = \sum_{x \in [0,1]} 0 = 0$$

which is absurd. That is the reason why we have to be content with $\mathcal{F}$ being closed under countable unions and $P$ being countably additive on $\mathcal{F}$.
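The closure properties above can be made concrete on a finite sample space, where every field is automatically a σ-field. The following is a minimal sketch, not taken from the text: the function names `generate_sigma_field` and `uniform_probability`, the six-point sample space and the two generating sets are all illustrative assumptions.

```python
from itertools import combinations


def generate_sigma_field(omega, generators):
    """Smallest collection of subsets of a finite omega that contains the
    generators and is closed under complementation and (finite) unions."""
    omega = frozenset(omega)
    field = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(field)
        # close under complementation
        for a in current:
            complement = omega - a
            if complement not in field:
                field.add(complement)
                changed = True
        # close under pairwise unions (repeated passes give all finite unions)
        for a, b in combinations(current, 2):
            union = a | b
            if union not in field:
                field.add(union)
                changed = True
    return field


def uniform_probability(event, omega):
    """Uniform probability measure on a finite sample space (an assumption)."""
    return len(event) / len(omega)


omega = {1, 2, 3, 4, 5, 6}  # hypothetical sample space (one die throw)
F = generate_sigma_field(omega, [{1, 2}, {2, 3}])
print(len(F))  # number of events in the generated field
```

The generated collection is built only from complements and unions, yet it also contains intersections such as {2} = {1,2} ∩ {2,3}, which is De Morgan's rule at work.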
2.2 Exercises
[1] Show that if $A, B \in \mathcal{F}$ and $A \subset B$, then $P(A) \le P(B)$.

[2] Show using countable additivity of $P$ on $\mathcal{F}$ that if $E_n \in \mathcal{F}$, $n = 1, 2, ...$ and $E_n \uparrow E$, i.e., $E_n \subset E_{n+1}\ \forall n \ge 1$ and $\bigcup_{n \ge 1} E_n = E$, then

$$P(E_n) \uparrow P(E)$$

Conversely, show that if $P$ is finitely additive on $\mathcal{F}$ and this property holds, then $P$ is countably additive on $\mathcal{F}$.
hint: Let $E_0 = \phi$ and define $F_n = E_n - E_{n-1} = E_n \cap E_{n-1}^c$, $n \ge 1$. Then the $F_n$'s are pairwise disjoint events and

$$\bigcup_n F_n = \bigcup_n E_n = E$$

Apply now the countable additivity property of $P$ and use the fact that $P(F_n) = P(E_n) - P(E_{n-1})$.

[3] Show that if $E_n \downarrow E$ (all being events), i.e., $E_{n+1} \subset E_n\ \forall n$ and $E = \bigcap_n E_n$, then

$$P(E_n) \downarrow P(E)$$

hint: $E_n \downarrow E$ iff $E_n^c \uparrow E^c$. Now use the result of the previous exercise. Conversely, show that if $P$ is finitely additive on $\mathcal{F}$ and this property holds, then $P$ is countably additive.

[4] Study project on the Caratheodory extension theorem. Let $\mathcal{B}$ be a field, i.e., a collection of subsets of $\Omega$ that is closed under finite unions and complementation. If $P$ is a countably additive probability measure on $\mathcal{B}$ (i.e., $E_n \in \mathcal{B}$, $n \ge 1$, $E_n \cap E_m = \phi\ \forall n \neq m$ and $E = \bigcup_n E_n \in \mathcal{B}$ imply $P(E) = \sum_n P(E_n)$), then $P$ has a unique countably additive extension to a probability measure $P_0$ on the σ-field $\mathcal{F} = \sigma(\mathcal{B})$ generated by $\mathcal{B}$. By unique countably additive extension, we mean that (a) $P_0$ is a probability measure on $\mathcal{F}$, (b) $P_0(E) = P(E)\ \forall E \in \mathcal{B}$, and (c) $P_0$ is the only probability measure on $\mathcal{F}$ satisfying (b).

[5] If $X_n$ is a bounded sequence of random variables such that $X_n \le X_{n+1}\ \forall n$, then $X_n$ increases to a limit $X$. Show that for $X$ to be measurable, i.e., a random variable, we require in general that $\mathcal{F}$ be a σ-field; just being a field will not suffice.
hint: Since $X_n \uparrow X$, the set $\{\omega : X(\omega) > a\}$ is the increasing limit, i.e., the countable union, of the events $\{\omega : X_n(\omega) > a\}$, $n = 1, 2, ...$, while $\{\omega : X(\omega) \le b\}$ is the countable intersection of the events $\{\omega : X_n(\omega) \le b\}$. Hence for $\{\omega : X(\omega) \in (a, b]\}$ to be measurable, i.e., an event, the class $\mathcal{F}$ of events must be closed under countable unions and intersections. Further, if we require the continuity condition $P(X \in (a, b]) = \lim_n P(X_n \in (a, b])$, then $P$ must be countably additive in general; finite additivity will not suffice.

[6] Let $(\Omega_k, \mathcal{F}_k, P_k)$, $k = 1, 2, ..., r$ be probability spaces. Let $\Omega = \Omega_1 \times \Omega_2 \times ... \times \Omega_r$ and let $\mathcal{F}$ be the σ-field on $\Omega$ generated by the measurable rectangles, i.e., by sets of the form $E_1 \times E_2 \times ... \times E_r$ with $E_m \in \mathcal{F}_m$, $m = 1, 2, ..., r$. Let $\mathcal{B}$ denote the field consisting of finite disjoint unions of such rectangles (show that this is indeed a field). It is clear that $\mathcal{F}$ is the σ-field generated by $\mathcal{B}$. Prove using the countable additivity of $P_k$ on $\mathcal{F}_k$, $k = 1, 2, ..., r$ that the finitely additive set function $P$ defined on $\mathcal{B}$ by

$$P(A_1 \cup A_2 \cup ... \cup A_m) = P(A_1) + ... + P(A_m)$$

where $A_1, ..., A_m$ are disjoint measurable rectangles, and, if $A = E_1 \times ... \times E_r$ is a measurable rectangle, by

$$P(A) = P_1(E_1) ... P_r(E_r)$$

is also countably additive on $\mathcal{B}$, and hence use Caratheodory's extension theorem to deduce that $P$ extends to a unique probability measure $P_0$ on $\mathcal{F}$. (When we say probability measure, we mean that it should be countably additive.)

Note: In order to show that $P$ is countably additive on $\mathcal{B}$, it suffices to show that if $E_n \in \mathcal{B}$, $E_n \downarrow \phi$, then $P(E_n) \downarrow 0$.
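[7] The Kolmogorov existence theorem for stochastic processes. Let $F_n(x_1, ..., x_n)$, $n \ge 1$ be a consistent family of probability distributions on $\mathbb{R}^n$, $n = 1, 2, ...$ respectively. Then, if

$$\mathcal{B} = \bigcup_{n \ge 1} \left(\mathcal{B}(\mathbb{R}^n) \times \mathbb{R}^{\mathbb{Z}_+}\right)$$

where $\mathcal{B}(\mathbb{R}^n) \times \mathbb{R}^{\mathbb{Z}_+}$ denotes the class of cylinder sets $B \times \mathbb{R}^{\mathbb{Z}_+}$ with $B$ a Borel subset of $\mathbb{R}^n$, prove that $\mathcal{B}$ is a field in $\mathbb{R}^{\mathbb{Z}_+}$.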
Exercises: [1] If $X_1, ..., X_n, ...$ is a sequence of random variables on a probability space $(\Omega, \mathcal{F}, P)$, and we define the joint probability distribution function of the first $n$ r.v.'s by

$$F_n(x_1, ..., x_n) = P(X_1 \le x_1, ..., X_n \le x_n) = P\Big(\bigcap_{k=1}^{n} X_k^{-1}((-\infty, x_k])\Big), \quad x_1, ..., x_n \in \mathbb{R}$$

then show that $F_n$, $n \ge 1$ has the following property:

$$\lim_{x_i \downarrow y_i} F_n(x_1, ..., x_i, ..., x_n) = F_n(x_1, ..., y_i, ..., x_n)$$

i.e., $F_n$ is right continuous in each of its arguments. To prove this, make use of the fact that $(-\infty, y_i] = \lim_{x_i \downarrow y_i} (-\infty, x_i]$ and hence $\lim_{x_i \downarrow y_i} X_i^{-1}((-\infty, x_i]) = X_i^{-1}((-\infty, y_i])$. Hence, deduce that

$$\lim_{x_i \downarrow y_i} \bigcap_{k=1}^{n} X_k^{-1}((-\infty, x_k]) = X_1^{-1}((-\infty, x_1]) \cap ... \cap X_i^{-1}((-\infty, y_i]) \cap ... \cap X_n^{-1}((-\infty, x_n])$$

Then, make use of the continuity of the probability measure $P$ (which is a consequence of its countable additivity) to deduce the result.
2.3 Exercises on stationary stochastic processes, spectra and polyspectra

[1] Let $X(t)$, $t \in \mathbb{R}$ ($X(n)$, $n \in \mathbb{Z}$) be a stochastic process in continuous time (discrete time). The process is said to be stationary if the joint distribution of the random variables $(X(t), X(t + t_1), ..., X(t + t_k))$ does not depend on $t$ for any $k, t_1, ..., t_k$. In this case, define the $(k + 1)$th order moments of the process as

$$M_X(t_1, ..., t_k) = E(X(t)X(t + t_1)...X(t + t_k))$$

Show that this does not depend upon $t$, i.e., the process is $(k + 1)$th order stationary. Second order stationarity in particular means that the autocorrelation function

$$R_X(s) = E(X(t)X(t + s))$$

does not depend on $t$. Give an example of a process that is second order stationary but is not stationary. Then, define the $k$-variate Fourier transform of $M_X$ by

$$P_{X,k}(\omega_1, ..., \omega_k) = \int_{\mathbb{R}^k} M_X(t_1, ..., t_k)\exp(-j(\omega_1 t_1 + ... + \omega_k t_k))\,dt_1...dt_k$$

$P_{X,k}$ is called the $k$th order polyspectrum of the process $X$. In the discrete time case, we define it using the $k$-variate DTFT of the moment sequence rather than the continuous time FT, i.e., CTFT. Show that if $X(t)$ is passed through an LTI system with (real valued) impulse response $h(t)$ so that its output is

$$Y(t) = \int_{\mathbb{R}} h(s)X(t - s)\,ds$$

or in discrete time,

$$Y(n) = \sum_{m \in \mathbb{Z}} h(m)X(n - m)$$

then

$$P_{Y,k}(\omega_1, ..., \omega_k) = H(\omega_1)...H(\omega_k)\bar{H}(\omega_1 + ... + \omega_k)P_{X,k}(\omega_1, ..., \omega_k), \quad k = 1, 2, ...$$

In particular, taking $k = 1$, show that

$$S_Y(\omega) = P_{Y,1}(\omega) = |H(\omega)|^2 S_X(\omega), \quad S_X(\omega) = P_{X,1}(\omega)$$
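The relation between the input and output spectra of an LTI system can be checked numerically, and the same computation gives an estimate of |H(ω)| from measured spectra. The following is a minimal discrete time sketch under stated assumptions: the three-tap impulse response `h`, the white Gaussian input, the segment length and the helper `averaged_periodogram` are all illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
seg_len, n_seg = 256, 400
h = np.array([1.0, 0.5, 0.25])  # hypothetical real FIR impulse response

x = rng.standard_normal(seg_len * n_seg)  # white input, S_X(omega) = 1
y = np.convolve(x, h)[: x.size]           # Y(n) = sum_m h(m) X(n - m)


def averaged_periodogram(signal, seg_len):
    """Average of |DFT|^2 / seg_len over non-overlapping segments,
    a simple estimate of the power spectral density."""
    segments = signal.reshape(-1, seg_len)
    spectra = np.abs(np.fft.rfft(segments, axis=1)) ** 2 / seg_len
    return spectra.mean(axis=0)


S_x = averaged_periodogram(x, seg_len)
S_y = averaged_periodogram(y, seg_len)

H_true = np.fft.rfft(h, n=seg_len)   # DTFT of h on the same frequency grid
H_mag_est = np.sqrt(S_y / S_x)       # |H| estimated from S_Y = |H|^2 S_X
max_rel_err = np.max(np.abs(H_mag_est - np.abs(H_true)) / np.abs(H_true))
print(max_rel_err)
```

Only the magnitude of H is recovered this way, since the power spectral densities carry no phase information.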
A process that is both first and second order stationary is said to be wide sense stationary (WSS). Let $X(t)$ be a WSS process and define its time and ensemble average power by

$$W_X = \lim_{T \to \infty} E\left[\frac{1}{T}\int_{-T/2}^{T/2} X(t)^2\,dt\right]$$

Prove using the Parseval theorem that

$$W_X = \frac{1}{2\pi}\int_{\mathbb{R}} S_X(\omega)\,d\omega$$

where

$$S_X(\omega) = \lim_{T \to \infty} \frac{1}{T}E|\hat{X}_T(\omega)|^2 = \int_{\mathbb{R}} R_X(s)\exp(-j\omega s)\,ds$$

where

$$\hat{X}_T(\omega) = \int_{-T/2}^{T/2} X(t)\exp(-j\omega t)\,dt$$

and

$$R_X(s) = E(X(t)X(t + s))$$

For this reason $S_X(\omega)$ is called the power spectral density (PSD) of the WSS process $\{X(t) : t \in \mathbb{R}\}$. The above result, namely that the PSD of a WSS process is the Fourier transform of its autocorrelation function, is called the Wiener-Khintchine theorem. For proving this, you must assume that $\lim_{|s| \to \infty} R_X(s) = 0$, which, in particular, is true if

$$\int_{\mathbb{R}} |R_X(s)|\,ds < \infty$$

2.4 A research problem based on problem [1]
Explain how, using measurements of the power spectral density of the input and output of an LTI system, you can estimate the magnitude $|H(\omega)|$ of the transfer function of the system, and how, by using measurements of the polyspectrum of order $k \ge 3$ of the input and output of the LTI system, you can also estimate the phase of the LTI system.

[3] Let $Z(\omega)$, $\omega \in \mathbb{R}$ be a zero mean complex valued stochastic process on a probability space such that

$$E(dZ(\omega)\,d\bar{Z}(\omega')) = (2\pi)^{-1}S(\omega)\,d\omega\,\delta_{\omega,\omega'}$$

Now define a stochastic process $X(t)$ as the stochastic integral

$$X(t) = \int_{\mathbb{R}} \exp(j\omega t)\,dZ(\omega)$$
Show that $X(t)$ is WSS with autocorrelation

$$R_X(s) = E(X(t + s)\bar{X}(t)) = \int_{\mathbb{R}} \exp(j\omega s)S(\omega)\,d\omega/2\pi$$

Hence deduce that

$$S(\omega) = 2\pi E(|dZ(\omega)|^2)/d\omega$$

is the power spectral density of the process $X(t)$. Conversely, given a WSS process $X(t)$, define a process $Z(\omega)$, $\omega \in \mathbb{R}$ by the equation

$$Z(\omega_2) - Z(\omega_1) = (2\pi)^{-1}\int_{\mathbb{R}} \frac{\exp(-j\omega_2 t) - \exp(-j\omega_1 t)}{-jt}X(t)\,dt, \quad -\infty < \omega_1 < \omega_2 < \infty$$

Then show that formally we can write

$$dZ(\omega)/d\omega = (2\pi)^{-1}\int_{\mathbb{R}} \exp(-j\omega t)X(t)\,dt, \quad \omega \in \mathbb{R}$$

Note that, when we are rigorous, this statement is true only if the complex measure on $\mathbb{R}$ defined by $\mu_Z((\omega_1, \omega_2]) = Z(\omega_2) - Z(\omega_1)$ is absolutely continuous w.r.t. the Lebesgue measure. Show that even if it is not so, but the limit

$$\lim_{\omega_2 \to \omega_1} E(|Z(\omega_2) - Z(\omega_1)|^2)/(\omega_2 - \omega_1) = (2\pi)^{-1}S(\omega_1)$$

exists for all $\omega_1 \in \mathbb{R}$, then by virtue of the WSS property of $X(t)$ we have the orthogonality relations

$$E[(Z(\omega_2) - Z(\omega_1))(\bar{Z}(\omega_3) - \bar{Z}(\omega_4))] = 0$$

for $\omega_2 > \omega_1 \ge \omega_3 > \omega_4$, and hence deduce the relation

$$R_X(s) = E(X(t + s)\bar{X}(t)) = (2\pi)^{-1}\int_{\mathbb{R}} S(\omega)\exp(j\omega s)\,d\omega$$

in the sense of Riemann-Stieltjes. In this case, show that we can define the integral

$$\int_{\mathbb{R}} \exp(j\omega t)\,dZ(\omega)$$

in the $L^2$ sense as an $L^2$ limit of Riemann sums and that this $L^2$ limit equals $X(t)$.
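The spectral representation in problem [3] can be illustrated by replacing the stochastic integral with a finite Riemann sum over a frequency grid. The sketch below is only an approximation under stated assumptions: the grid `omega`, the example spectrum `S`, the Gaussian law of the increments `dZ` and the sample size are all hypothetical choices. It checks by Monte Carlo that the autocorrelation depends on the lag s only and matches the discretized formula.

```python
import numpy as np

rng = np.random.default_rng(1)
omega = np.linspace(-4.0, 4.0, 64)   # hypothetical frequency grid
d_omega = omega[1] - omega[0]
S = 1.0 / (1.0 + omega**2)           # hypothetical power spectral density
var_dZ = S * d_omega / (2 * np.pi)   # E|dZ(omega)|^2 = S(omega) d_omega / 2 pi

n_real = 20000
# Independent zero mean complex increments, hence orthogonal in L^2
dZ = (rng.standard_normal((n_real, omega.size))
      + 1j * rng.standard_normal((n_real, omega.size))) * np.sqrt(var_dZ / 2)


def X(t):
    """Riemann-sum version of X(t) = int exp(j omega t) dZ(omega),
    evaluated for every realization at once."""
    return dZ @ np.exp(1j * omega * t)


def R_theory(s):
    """Discretized (2 pi)^{-1} int S(omega) exp(j omega s) d omega."""
    return np.sum(np.exp(1j * omega * s) * S) * d_omega / (2 * np.pi)


def R_empirical(t, s):
    """Monte Carlo estimate of E[X(t + s) conj(X(t))]."""
    return np.mean(X(t + s) * np.conj(X(t)))


s = 0.7
r_a = R_empirical(0.0, s)   # autocorrelation measured at t = 0
r_b = R_empirical(5.0, s)   # same lag, different t
print(r_a, r_b, R_theory(s))
```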
[4] This problem is a generalization of the previous problem. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $H = L^2(\Omega, \mathcal{F}, P)$ denote the Hilbert space of all complex valued random variables $X$ on this probability space for which

$$E|X|^2 = \int |X(\omega)|^2\,dP(\omega) < \infty$$

Let $Z$ be an $H$-valued set function on a measurable space $(X, \mathcal{E})$ with the property that $Z$ is countably additive in the $L^2$ sense, i.e., if $E_1, E_2, ...$ is a sequence of pairwise disjoint sets in $\mathcal{E}$, then

$$E\Big|Z\Big(\bigcup_n E_n\Big) - \sum_{n=1}^{N} Z(E_n)\Big|^2 \to 0, \quad N \to \infty \quad (1)$$

Let $\mu$ be a measure on $(X, \mathcal{E})$, i.e., $(X, \mathcal{E}, \mu)$ is a measure space. Assume that

$$E(Z(A)\bar{Z}(B)) = \mu(A \cap B), \quad A, B \in \mathcal{E}$$

Show using the countable additivity of $\mu$ on $\mathcal{E}$ that this condition automatically guarantees that $Z(.)$ will be countably additive in the $L^2$ sense, i.e., the property (1) will hold. Let $f$ be a complex valued measurable function on this measure space and assume that

$$\int_X |f(x)|^2\,d\mu(x) < \infty$$

i.e.,

$$f \in L^2(X, \mathcal{E}, \mu)$$

We wish to define a stochastic integral

$$\int_X f(x)\,dZ(x) \in H$$

in the $L^2$ sense and elucidate some properties of this stochastic integral. Choose a sequence of simple measurable functions $f_n$ on $(X, \mathcal{E}, \mu)$ converging in the $L^2$ sense to $f$, i.e., each $f_n$ has the form

$$f_n(x) = \sum_{k=1}^{N_n} c(n, k)\chi_{E_{n,k}}(x), \quad n \ge 1$$

By saying that this sequence converges to $f$ in the $L^2$ sense, we mean that

$$\int_X |f_n(x) - f(x)|^2\,d\mu(x) \to 0, \quad n \to \infty$$

Then define

$$I_Z(f_n) = \sum_{k=1}^{N_n} c(n, k)Z(E_{n,k})$$
Show that $\{I_Z(f_n)\}$ is a Cauchy sequence in $H$, or more precisely,

$$E|I_Z(f_n) - I_Z(f_m)|^2 = \int_X |f_n(x) - f_m(x)|^2\,d\mu(x) \to 0, \quad n, m \to \infty$$

where the last convergence follows from the fact that every convergent sequence in a Hilbert space (or more generally, in any inner product space) is Cauchy. Deduce that there exists an element $I_Z(f) \in H$ such that

$$E|I_Z(f_n) - I_Z(f)|^2 \to 0$$

and that this $L^2$ limit $I_Z(f)$ does not depend upon the sequence $f_n$ of simple functions converging to $f$. We write

$$I_Z(f) = \int_X f(x)\,dZ(x)$$

and call it the $L^2$ stochastic integral of $f$ w.r.t. $Z$.
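The isometry that drives this construction can be checked by simulation in the simplest case. The following sketch makes several assumptions that are not in the text: X = [0, 1] with μ the Lebesgue measure, Z a real Gaussian set function with independent values on disjoint cells, the test function f(x) = x², and simple functions obtained by sampling f at cell midpoints on a coarse and a fine partition.

```python
import numpy as np

rng = np.random.default_rng(2)
n_fine, n_coarse, n_real = 64, 8, 20000
cell = 1.0 / n_fine  # mu(E) for each fine cell of [0, 1]

# Z(E) ~ N(0, mu(E)), independent over disjoint cells, so that
# E[Z(A) Z(B)] = mu(A intersect B) for unions of cells.
Z_fine = rng.standard_normal((n_real, n_fine)) * np.sqrt(cell)


def f(x):
    return x**2  # hypothetical integrand in L^2([0, 1])


mid_fine = (np.arange(n_fine) + 0.5) / n_fine
mid_coarse = (np.arange(n_coarse) + 0.5) / n_coarse
c_fine = f(mid_fine)                                             # simple function f_n
c_coarse_on_fine = np.repeat(f(mid_coarse), n_fine // n_coarse)  # simple function f_m

I_fine = Z_fine @ c_fine                # I_Z(f_n) = sum_k c(n, k) Z(E_{n,k})
I_coarse = Z_fine @ c_coarse_on_fine    # I_Z(f_m) on the same realizations

lhs_norm = np.mean(I_fine**2)                       # E|I_Z(f_n)|^2
rhs_norm = np.sum(c_fine**2) * cell                 # int |f_n|^2 d mu
lhs_diff = np.mean((I_fine - I_coarse)**2)          # E|I_Z(f_n) - I_Z(f_m)|^2
rhs_diff = np.sum((c_fine - c_coarse_on_fine)**2) * cell
print(lhs_norm, rhs_norm, lhs_diff, rhs_diff)
```

The second pair of numbers is the Cauchy estimate: the distance between the two stochastic integrals in H equals the L² distance between the two simple functions.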
2.5 Exercises on the construction of the integral w.r.t. a probability measure
[1] [a] Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $X$ be a random variable on it. We say that $X$ is a simple r.v. if it assumes at most a finite number of distinct values, say $c_1, ..., c_n$. Define

$$E_k = X^{-1}(\{c_k\}) = \{\omega \in \Omega : X(\omega) = c_k\}, \quad k = 1, 2, ..., n$$

Show that $E_1, ..., E_n$ are disjoint events, i.e.,

$$E_k \cap E_j = \phi, \quad k \neq j, \quad E_k \in \mathcal{F}$$

and further

$$\Omega = \bigcup_{k=1}^{n} E_k$$

Show that we can write

$$X(\omega) = \sum_{k=1}^{n} c_k\chi_{E_k}(\omega)$$

Define

$$\int X\,dP = \sum_{k=1}^{n} c_k P(E_k)$$

Show that if we write the same r.v. $X$ in another way as

$$X(\omega) = \sum_{k=1}^{m} d_k\chi_{F_k}(\omega)$$

where the $F_k$'s need not be disjoint (but they are events), then

$$\sum_{k=1}^{m} d_k P(F_k) = \int X\,dP = \sum_{k=1}^{n} c_k P(E_k)$$
[b] Show that the set of simple r.v.s is a vector space over the real numbers, i.e., it is closed under addition and scalar multiplication by real numbers. Therefore, this set is also closed under all finite real linear combinations.

[c] Show that if $X, Y$ are simple r.v.'s and $X(\omega) \le Y(\omega)\ \forall \omega \in \Omega$, then

$$\int X\,dP \le \int Y\,dP$$

In particular, deduce using [b] that if $X$ is a nonnegative simple r.v., then

$$\int X\,dP \ge 0$$

[2] First we construct the integral of a nonnegative r.v. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X$ a nonnegative real valued random variable on this space. A simple r.v. is a r.v. that assumes only a finite number of values. For each positive integer $N$, define the simple r.v. $X_N$ by

$$X_N(\omega) = \sum_{k=0}^{N 2^N} (k/2^N)\,\chi_{X^{-1}((k/2^N, (k+1)/2^N])}(\omega)$$

where $\chi_E(\omega)$ denotes the indicator of $E$, i.e., $\chi_E(\omega) = 1$ if $\omega \in E$ and $\chi_E(\omega) = 0$ if $\omega \notin E$. Using the decomposition

$$(k/2^N, (k + 1)/2^N] = (2k/2^{N+1}, (2k + 1)/2^{N+1}] \cup ((2k + 1)/2^{N+1}, (2k + 2)/2^{N+1}]$$

of the lhs into a disjoint union, deduce that

$$0 \le X_N(\omega) \le X_{N+1}(\omega)\ \forall N \ge 1$$

i.e., $X_N$, $N \ge 1$ is a nondecreasing sequence of simple r.v.s. Show further that

$$\lim_{N \to \infty} X_N(\omega) = X(\omega)\ \forall \omega \in \Omega$$

Deduce using the result of the previous exercise that

$$0 \le \int X_N\,dP \le \int X_{N+1}\,dP < \infty, \quad \forall N \ge 1$$

and hence that

$$I = \lim_{N \to \infty} \int X_N\,dP$$

exists. Define

$$\int X\,dP = I$$
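The dyadic approximation used in this construction is easy to evaluate pointwise. The sketch below is illustrative only: `dyadic_approx` computes the value of X_N at an outcome where X takes the value x, and `integral_of_approx_uniform` evaluates the integral of X_N when X is assumed, purely as an example, to be uniformly distributed on [0, 1].

```python
import math


def dyadic_approx(x, N):
    """Value of X_N at an outcome where the nonnegative r.v. X equals x:
    k / 2^N if k / 2^N < x <= (k + 1) / 2^N for some 0 <= k <= N 2^N, else 0."""
    scale = 2**N
    if x <= 0 or x > N + 1.0 / scale:
        return 0.0
    k = math.ceil(x * scale) - 1
    return k / scale


def integral_of_approx_uniform(N):
    """Integral of X_N w.r.t. P when X is uniform on [0, 1] (an assumption):
    each cell (k / 2^N, (k + 1) / 2^N] inside [0, 1] has probability 1 / 2^N."""
    scale = 2**N
    return sum((k / scale) * (1.0 / scale) for k in range(scale))


x = 2.71828
approximations = [dyadic_approx(x, N) for N in range(1, 21)]
integrals = [integral_of_approx_uniform(N) for N in range(1, 11)]
print(approximations[:5], integrals[-1])
```

The approximations increase to x from below, and in the uniform example the integrals increase to 1/2, the value of the integral of X.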
2.6 Exercises on stationarity, dynamical systems and ergodic theory
[1] Let $f \in L^1(\Omega, \mathcal{F}, P)$ and let $T : \Omega \to \Omega$ be a measure preserving transformation, i.e., $T^{-1}(\mathcal{F}) \subset \mathcal{F}$ and $P \circ T^{-1} = P$. If $T$ is invertible and $X$ is a random variable on $(\Omega, \mathcal{F}, P)$, then show that the process $X(T^n\omega)$, $n \in \mathbb{Z}$ is a stationary stochastic process on $(\Omega, \mathcal{F}, P)$.
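A standard example of an invertible measure preserving transformation is a rotation of the unit interval with the uniform measure. The sketch below is an illustration under stated assumptions, not part of the exercise: the rotation angle `alpha`, the observable X(ω) = cos(2πω) and the grid used to average over ω are hypothetical choices. It checks that the second order moment of the process X(Tⁿω) depends only on the lag k and not on n.

```python
import numpy as np

alpha = np.sqrt(2.0) - 1.0                      # hypothetical rotation angle
omega_grid = (np.arange(4096) + 0.5) / 4096     # uniform grid on [0, 1)


def T_power(omega, n):
    """n-fold iterate of the rotation T(omega) = omega + alpha mod 1;
    negative n gives iterates of the inverse map."""
    return (omega + n * alpha) % 1.0


def X(omega):
    return np.cos(2.0 * np.pi * omega)          # hypothetical random variable


def correlation(n, k):
    """E[X(T^n w) X(T^(n+k) w)] under the uniform measure, computed by
    averaging over the grid (exact up to rounding for this trigonometric X)."""
    return np.mean(X(T_power(omega_grid, n)) * X(T_power(omega_grid, n + k)))


lag = 2
values = [correlation(n, lag) for n in (-3, 0, 7)]
expected = 0.5 * np.cos(2.0 * np.pi * lag * alpha)
print(values, expected)
```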
Chapter 3

Antenna Theory, Course Instructor: Harish Parthasarathy

3.1 Course Outline
[1] Maxwell's equations in the frequency domain. Solution using retarded potentials, the far field approximation. Calculation of the total power radiated at a given frequency in the far field zone.
[2] Maxwell's equations in the frequency domain taking inhomogeneities, anisotropicity and field dependence (nonlinearity) of the permittivity and permeability tensors into consideration. Perturbative solution of the differential equations.
[3] Construction of the Green's function of the Helmholtz operator for boundaries of various shapes for the Dirichlet and Neumann boundary conditions: applications to cavity resonator antennas.
[4] The reciprocity theorem between two electromagnetic fields driven by two different current densities.
[5] The radar equation, directivity and antenna aperture, reciprocity theorem between transmitter and receiver antenna.
[6a] The basic antenna parameters: loss resistance, radiation resistance, total resistance, effective aperture. Equivalent circuit of a transmitter and receiver antenna.
[6b] Poynting's theorem and its evaluation in the far field zone in terms of the current distribution in the source.
[7] The fields of an infinitesimal electric dipole in the near field zone, far field zone, intermediate zone.
[8] The fields produced by a finite straight wire of length L carrying a spatial sinusoid current distribution vanishing at its ends.
[9] The fields produced by a circular loop carrying sinusoidal current in the far field zone.
[10] The far field pattern produced by an infinitesimal loop of wire in terms of the magnetic moment of the wire.
[11] The far field pattern produced by a volume carrying a sinusoidal current density.
[12] The far field pattern produced by an infinitesimal dipole, a finite straight wire and a circular loop placed above an infinite ground plane. Calculation based on the method of images.
[13] The total power radiated by an antenna in the far field zone.
[14] The radiation resistance of an antenna in terms of the current distribution in it and the feeding current.
[15] A planar aperture on which an electromagnetic field is incident as an antenna. Computation of the far field Poynting vector.
[16] A waveguide terminated by a horn: the far field radiation pattern.
[17] Helical antennas as broad band antennas.
[18] A cavity resonator as a microstrip antenna.
[19] Waveguide feeding an aperture as an antenna.
[20] The method of stationary phase.
[21] The mutual impedance between two antennas, the self impedance of an antenna.
[22] Antenna arrays: the pattern multiplication theorem, application to binomial and Chebyshev arrays.
[23] Plotting of antenna pattern lobes.
[24a] Computing the current density induced on an antenna surface by an incident electromagnetic field using generalizations of the Pocklington and Hallen integral equations.
[24b] Application to the problem of determining the induced currents on the driver and driven elements of a Yagi-Uda array.
[24c] Solving Pocklington's integral equations by the method of moments.
[25] The effect of a gravitational field and inhomogeneous, anisotropic and field dependent permittivity and permeability on the pattern of an antenna: general relativistic considerations based on the covariant form of the Maxwell equations in a background curved space-time.
[26] A brief description of the finite element method for numerically solving antenna problems.
[25] An introduction to quantum antennas.
[a] Quantum current generated by electrons, positrons and photons within a cavity on the atomic scale. Description of currents on the cavity surface generated by the quantum magnetic field and the currents within the cavity generated by the electron-positron Dirac field. Evaluation of the quantum statistical moments of the near and far field radiation patterns in a given state of the electrons, positrons and photons. A description of coherent states for photons and electrons-positrons.
3.2 The far field Poynting vector
The far field magnetic vector potential is given by

$$A(\omega, R) = (\mu/4\pi)\int_S J(\omega, r)\exp(-jk|R - r|)\,d^3r/|R - r| \approx P(\omega, \hat{R})\exp(-jkR)/R$$

where

$$P(\omega, \hat{R}) = (\mu/4\pi)\int_S J(\omega, r)\exp(jk\hat{R}.r)\,d^3r$$

and $S$ denotes the source volume. Thus, the far field electric field (i.e., up to $O(1/R)$) is given by

$$E(\omega, R) = -\nabla\Phi - j\omega A = -\nabla((jc^2/\omega)\,\mathrm{div}A) - j\omega A$$

$$= [-(jc^2/\omega)(-jk\hat{R}.P)(-jk\hat{R}) - j\omega P]\exp(-jkR)/R$$

$$= [(jk^2c^2/\omega)P_r\hat{R} - j\omega(P_r\hat{R} + P_\theta\hat{\theta} + P_\phi\hat{\phi})]\exp(-jkR)/R$$

$$= -j\omega(P_\theta\hat{\theta} + P_\phi\hat{\phi})\exp(-jkR)/R$$

where the last step uses $k = \omega/c$.

Note: The only $O(1/R)$ term when we take the gradient of a function of the form $f(R, \theta, \phi)\exp(-jkR)/R$ is given by

$$-jk\hat{R}\,f(R, \theta, \phi)\exp(-jkR)/R$$

All the other terms are of order $1/R^2$ and they do not contribute anything to the outward radiated power. In other words, the $O(1/R)$ terms come only from differentiation of the phase factor, or equivalently from differentiation of the far field delay factor, not from differentiating the amplitude factor. We can likewise evaluate the far field magnetic field (i.e., with neglect of $O(1/R^2)$ terms):

$$B(\omega, R) = \nabla \times A(\omega, R) = -jk\hat{R} \times P\,\exp(-jkR)/R = -jk(P_\theta\hat{\phi} - P_\phi\hat{\theta})\exp(-jkR)/R$$

Therefore, the time averaged far field Poynting vector, i.e., power flux, is given by (up to $O(1/R^2)$ terms)

$$S(\omega, R, \hat{R}) = S(\omega, R, \theta, \phi) = \mathrm{Re}[E \times B^*]/2\mu_0$$

$$= [\omega k(|P_\theta|^2 + |P_\phi|^2)/2\mu_0R^2]\hat{R}$$

$$= (\omega^2/2\mu_0cR^2)(|P_\theta(\omega, \hat{R})|^2 + |P_\phi(\omega, \hat{R})|^2)\hat{R}$$

where we use the relation

$$\hat{\theta} \times \hat{\phi} = \hat{R}$$
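The total power radiated out by the antenna when it operates at the frequency $\omega$ is then

$$W = \int_{\text{sphere of radius } R} S(\omega, R, \hat{R}).\hat{R}\,dS(\hat{R}) = \int_{\text{unit sphere}} S(\omega, R, \hat{R}).\hat{R}\,R^2\,d\Omega(\hat{R})$$

$$= (\omega^2/2\mu_0c)\int_0^{\pi}\int_0^{2\pi} (|P_\theta(\omega, \theta, \phi)|^2 + |P_\phi(\omega, \theta, \phi)|^2)\sin(\theta)\,d\theta\,d\phi$$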
Note that this result is independent of the radial distance $R$ as long as we are operating in the far field zone. This formula can be used to define the radiation resistance $R_r$ of the antenna at frequency $\omega$ as

$$|I_0(\omega)|^2R_r/2 = W$$

where $I_0(\omega)$ is the input sinusoidal current phasor at frequency $\omega$.

Reciprocity theorem: Sources $(\rho_1, J_1, \rho_{m1}, M_1)$ generate fields $(E_1, H_1)$ while sources $(\rho_2, J_2, \rho_{m2}, M_2)$ generate fields $(E_2, H_2)$. Thus,

$$\mathrm{curl}E_k = -j\omega\mu H_k - M_k, \quad \mathrm{curl}H_k = j\omega\epsilon E_k + J_k, \quad k = 1, 2$$

Compute

$$\mathrm{div}(E_1 \times H_2) = \mathrm{curl}(E_1).H_2 - E_1.\mathrm{curl}(H_2) = (-j\omega\mu H_1 - M_1).H_2 - E_1.(j\omega\epsilon E_2 + J_2),$$

$$\mathrm{div}(E_2 \times H_1) = \mathrm{curl}(E_2).H_1 - E_2.\mathrm{curl}(H_1) = (-j\omega\mu H_2 - M_2).H_1 - E_2.(j\omega\epsilon E_1 + J_1)$$

Thus,

$$\mathrm{div}(E_1 \times H_2 - E_2 \times H_1) = M_2.H_1 - M_1.H_2 + J_1.E_2 - J_2.E_1$$

Integrating this identity over the entire three dimensional space, making use of Gauss' integral theorem and the fact that the electromagnetic fields vanish at $\infty$, gives us the fundamental reciprocity relation

$$\int_{\mathbb{R}^3} (J_1.E_2 - J_2.E_1 + M_2.H_1 - M_1.H_2)\,d^3r = 0$$
3.3 Exercises
3.3.1 Equivalent circuit of a transmitter-receiver antenna system

3.3.2 Radiation resistance of an infinitesimal dipole starting from the relation

$$A(\omega, r) = (\mu\,dl/4\pi)\hat{z}\exp(-jkr)/r$$

While calculating the fields by differentiating the potentials, make use of the fact that spatial derivatives of only the phase term $\exp(-jkr)$ need to be taken, not of the other amplitude terms, for only the phase derivatives contribute to $O(1/r)$ terms in the field and hence to $O(1/r^2)$ terms in the Poynting vector in the far field zone and hence to nonzero total radiated power. Amplitude derivatives of the potentials contribute to $O(1/r^2)$ terms in the fields and hence to $O(1/r^3)$ terms in the Poynting vector, which do not contribute anything to the radiated power in the far field zone. Note that the electric field can be computed using

$$j\omega\epsilon\mu E = \mathrm{curl}B = \mathrm{curl}\,\mathrm{curl}A = \nabla(\mathrm{div}A) - \nabla^2A = \nabla(\mathrm{div}A) + k^2A$$

and the magnetic field using

$$H = \mathrm{curl}A/\mu$$

Thus, up to $O(1/r)$, we have

$$H = \mathrm{curl}A/\mu = (-jk\,dl/4\pi)\,\hat{r} \times \hat{z}\,\exp(-jkr)/r = (jk\,dl/4\pi r)\sin(\theta)\hat{\phi}\exp(-jkr)$$

and hence also

$$E = \mathrm{curl}H/j\omega\epsilon = (-jk^2dl/(\omega\epsilon\,4\pi r))\sin(\theta)\,\hat{r} \times \hat{\phi}\,\exp(-jkr) = (jk^2dl/(\omega\epsilon\,4\pi r))\sin(\theta)\hat{\theta}\exp(-jkr)$$

where we have used

$$\hat{r} \times \hat{z} = -\sin(\theta)\hat{\phi}, \quad \hat{r} \times \hat{\phi} = -\hat{\theta}$$

Note that the above formulae imply

$$E_\theta/H_\phi = k/\omega\epsilon = 1/c\epsilon = \sqrt{\mu/\epsilon}$$

which is the characteristic impedance of the medium, as it should be.
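The radiation resistance asked for in this exercise can be cross-checked numerically by inserting the dipole pattern into the total radiated power formula of Section 3.2. The sketch below rests on assumptions that are not in the text: free space constants, a current amplitude `I0` multiplying the potential (the relation above is written without one), and a midpoint rule for the θ integral. For a z-directed dipole the θ component of P is −(μ₀ I₀ dl/4π) sin θ and the φ component vanishes.

```python
import numpy as np

mu0 = 4e-7 * np.pi          # free space permeability (assumed medium)
c = 299792458.0             # speed of light in free space


def dipole_radiated_power(freq_hz, dl, I0=1.0, n_theta=2000):
    """W = (w^2 / 2 mu0 c) * int |P_theta|^2 sin(theta) dtheta dphi for a
    z-directed infinitesimal dipole with assumed current amplitude I0."""
    w = 2.0 * np.pi * freq_hz
    d_theta = np.pi / n_theta
    theta = (np.arange(n_theta) + 0.5) * d_theta           # midpoint rule nodes
    P_theta = -(mu0 * I0 * dl / (4.0 * np.pi)) * np.sin(theta)
    integrand = np.abs(P_theta) ** 2 * np.sin(theta)
    # the phi integral contributes a factor 2 pi by azimuthal symmetry
    return (w**2 / (2.0 * mu0 * c)) * 2.0 * np.pi * np.sum(integrand) * d_theta


def radiation_resistance(freq_hz, dl):
    """R_r defined through |I0|^2 R_r / 2 = W, evaluated with I0 = 1."""
    return 2.0 * dipole_radiated_power(freq_hz, dl, I0=1.0)


freq_hz, dl = 300e6, 0.01   # hypothetical operating point: 300 MHz, 1 cm dipole
R_numeric = radiation_resistance(freq_hz, dl)
R_closed = mu0 * (2.0 * np.pi * freq_hz) ** 2 * dl**2 / (6.0 * np.pi * c)
print(R_numeric, R_closed)
```

The closed form used for comparison follows from integrating sin³θ over the sphere, which gives 8π/3.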
[3] Show that for a small loop of wire carrying a current $I(\omega)$ located around the origin of the coordinate system, with the parametric equation of the loop being given by $s \to R(s)$, $0 \le s \le 1$, the far field magnetic vector potential is given by

$$A(\omega, r) = (\mu I(\omega)/4\pi)\int_0^1 dR(s)\exp(-jk|r - R(s)|)/|r - R(s)|$$

$$\approx (\mu I/4\pi r)\exp(-jkr)\int_0^1 dR(s)\exp(jk\hat{r}.R(s))$$

Show that when this is a circular loop of radius $a$, the above formula simplifies to

$$A(\omega, r, \theta, \phi) = (\mu Ia/4\pi r)\exp(-jkr)\int_0^{2\pi} (-\hat{x}\sin(\phi') + \hat{y}\cos(\phi'))\exp(jka\sin(\theta)\cos(\phi' - \phi))\,d\phi'$$

and then using the formulas

$$A_\rho = A_x\cos(\phi) + A_y\sin(\phi), \quad A_\phi = -A_x\sin(\phi) + A_y\cos(\phi)$$

deduce that in the far field zone,

$$A_\rho = 0, \quad A_z = 0,$$

$$A_\phi = (\mu Ia/4\pi r)\exp(-jkr)\int_0^{2\pi} \exp(jka\sin(\theta)\cos(\phi'))\cos(\phi')\,d\phi'$$
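The remaining angular integral in the expression for the φ component can be evaluated in closed form: by the Jacobi-Anger expansion it equals 2πj J₁(x) with x = ka sin θ, where J₁ is the Bessel function of the first kind of order one. This identity is not stated in the text; the sketch below checks it numerically. The helper names and the power series used for J₁ are illustrative choices.

```python
import math
import numpy as np


def bessel_j1(x, terms=40):
    """Bessel function J_1(x) from its power series
    sum_m (-1)^m (x/2)^(2m+1) / (m! (m+1)!)."""
    return sum((-1) ** m * (x / 2.0) ** (2 * m + 1)
               / (math.factorial(m) * math.factorial(m + 1))
               for m in range(terms))


def loop_integral(x, n=4096):
    """int_0^{2 pi} exp(j x cos(phi)) cos(phi) dphi by the rectangle rule,
    which is highly accurate for smooth periodic integrands."""
    phi = 2.0 * np.pi * np.arange(n) / n
    return np.sum(np.exp(1j * x * np.cos(phi)) * np.cos(phi)) * (2.0 * np.pi / n)


for x in (0.1, 1.0, 3.0):   # sample values of k a sin(theta)
    print(x, loop_integral(x), 2j * np.pi * bessel_j1(x))
```

For a small loop, J₁(x) ≈ x/2, so the φ component becomes proportional to ka² sin θ, i.e., to the magnetic moment of the loop.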
Problem: From the expression for the far field electromagnetic field pattern up to $O(1/r)$ for a source at a definite frequency, calculate the Poynting vector field up to $O(1/r^2)$ and hence determine the total power radiated out into space in the far field zone as a quadratic form in the source current density. Specialize this result to infinitesimal dipoles, to finite length dipoles and to circular loops carrying sinusoidal current, and hence calculate the radiation resistance $R$ using $P = |I_0|^2R/2$, where $I_0$ is the input current phasor.

Appendix, B.E and M.Tech projects
3.4 Order of magnitudes in quantum antenna theory

Consider a cavity resonator of one Angstrom size, i.e., a cube with each side of length $a = 10^{-10}$ m. The Maxwell equations in such a cube have solutions of the form

$$A_r(t, x, y, z) = \sum_{mnp} c(mnp, t)\,u_{r,mnp}(x, y, z), \quad r = 1, 2, 3$$
where $u_{r,mnp}$ are spatial functions obtained by integrating the electric field w.r.t. time. These functions are of the form

$$\{\cos(m\pi x/a), \sin(m\pi x/a)\} \otimes \{\cos(n\pi y/a), \sin(n\pi y/a)\} \otimes \{\cos(p\pi z/a), \sin(p\pi z/a)\}$$

multiplied by some constants depending on the indices $(m, n, p)$. We may, without loss of generality, assume that the functions $u_{r,mnp}$ are normalized so that

$$\int_C u_{r,mnp}(r)\bar{u}_{s,m'n'p'}(r)\,d^3r = \delta_{rs}\delta_{mm'}\delta_{nn'}\delta_{pp'}$$

The dependence of $c(mnp, t)$ on $t$ is $\exp(i\omega(mnp)t)$, where $\omega(mnp)$ are the characteristic frequencies of oscillation:

$$\omega(mnp) = (\pi c/a)\sqrt{m^2 + n^2 + p^2}, \quad m, n, p = 1, 2, ...$$

which are of the order of magnitude

$$\omega = \pi c/a$$

The electric field is

$$E_r = -\partial_tA_r = -\sum_{mnp} c(mnp, t)\,i\omega(mnp)\,u_{r,mnp}(r)$$

The magnetic field is

$$B = \mathrm{curl}A$$

which is of the order of magnitude $|c(mnp, t)|/a$, where by $c(mnp, t)$ we actually mean its average in a coherent state. The total electric field energy within the cavity $C$ is

$$U_E = (\epsilon_0/2)\int_C |E|^2\,d^3r$$

which has components of the order of magnitude

$$\epsilon_0|\omega(mnp)c(mnp, t)|^2a^3 = \epsilon_0\omega(mnp)^2a^3|c(mnp, t)|^2$$

The total magnetic field energy within the cavity is

$$U_B = (2\mu_0)^{-1}\int_C |B|^2\,d^3r$$

which has components of the order of magnitude

$$|c(mnp, t)/a|^2a^3/\mu_0 = |c(mnp, t)|^2a/\mu_0$$

The ratio of the electric field energy to the magnetic field energy within the cavity therefore has the order of magnitude

$$U_E/U_B \approx \mu_0\epsilon_0\omega(mnp)^2a^2 \approx \omega^2a^2/c^2 \approx 1$$
as expected. The canonical commutation relations are

$$[A_r(t, r), \partial_tA_s(t, r')] = (ih/2\pi)\delta_{rs}\delta^3(r - r')$$

These yield

$$[c_r(mnp, t), \omega(mnp)c_s(m'n'p', t)^*] = (h/2\pi)\delta_{rs}\delta_{mm'}\delta_{nn'}\delta_{pp'}$$

so that the eigenvalues of $c_r(mnp, t)^*c_r(mnp, t)$ are positive integer multiples of $h/2\pi\omega(mnp)$. This means that the field energy within the cavity, when a finite number of modes are excited, assumes eigenvalues that are of the same order of magnitude as positive integer multiples of $h\omega/2\pi$, as expected from Planck's quantum theory of radiation. This fact also yields the result that $|c(mnp, t)|$ is of the order of magnitude of $\sqrt{h/(2\pi\omega)}$.

Now we come to the question of computing the order of magnitude of the Poynting vector power flux at a given radial distance $R$ from the quantum cavity antenna caused by the surface current density induced by the magnetic field on the antenna surface. The magnetic field on the surface, and hence the corresponding induced surface current density, both have the order of magnitude of $|c(mnp, t)|/a$, which is of the order $a^{-1}\sqrt{h/\omega}$. Therefore, the far field magnetic vector potential at a distance $R$ from the cavity is of the order of magnitude (use the retarded potential formula) $(a/R)\sqrt{h/\omega}$, and hence the corresponding far field radiated magnetic field is of the order of magnitude $(\omega/c)(a/R)\sqrt{h/\omega}$, while the near field magnetic field is of the order of magnitude $(a/R^2)\sqrt{h/\omega}$. Actually, these expressions for the magnetic field must be multiplied by $\sqrt{N}$, where $N$ is a positive integer corresponding to the largest modal eigenvalue of the operators $(2\pi\omega(mnp)/h)c(mnp, t)^*c(mnp, t)$. The far field Poynting vector has the order of magnitude of $B^2c/2\mu_0$, which is of the order

$$(c/2\mu_0)\left(\sqrt{N}(\omega/c)(a/R)\sqrt{h/\omega}\right)^2 = (h/2\mu_0)N(\omega/c)(a^2/R^2)$$

and the total power radiated outward by this quantum antenna in the far field zone is thus of the order of magnitude

$$P = N(h/2\mu_0)(a^2\omega/c)$$

Now we look at the order of magnitude of the power radiated in the far field zone by the Dirac field of electrons and positrons within the cavity. The Dirac equation is

$$[i\gamma^\mu\partial_\mu - m]\psi(x) = 0$$

or more precisely, in arbitrary units,

$$[(ih/2\pi)\partial_t - c(\alpha, (-ih/2\pi)\nabla) - \beta mc^2]\psi(x) = 0$$

Here, the appearance of the constants $h, m, c$ is explicitly shown. Now $|\psi(x)|^2$ is the probability density of the electron, which must integrate to unity
over the cavity volume. Thus $\psi(x)$ is of the order of magnitude $a^{-3/2}$. The Dirac current density $J^\mu = e\psi^*\gamma^0\gamma^\mu\psi$ has the same order of magnitude as $e|\psi(x)|^2c$, which is $ec/a^3$. Therefore the far field magnetic vector potential at a radial distance of $R$ from the cavity is, in accordance with the retarded potential theory, of the order

$$(ec/a^3)(a^3/R) = ec/R$$

The electric field in the far field zone is then of the order

$$E \approx \omega ec/R$$

where $\omega$ is the characteristic oscillation frequency of the Dirac current. The magnetic field is of the order

$$B \approx a^{-1}ec/R = ec/(aR)$$

If $P$ is the characteristic momentum of the electrons and positrons in a given state, for example the average momentum of an electron in that state, then according to De Broglie, $P$ is of the order $h/a$, since $a$ is the order of the electron wavelength. Then the electron energy is of the order

$$E_e = c\sqrt{m^2c^2 + P^2} \approx c\sqrt{m^2c^2 + h^2/a^2}$$

and the characteristic frequency of oscillation of the Dirac wave field is then

$$\omega = E_e/h$$

The Poynting vector corresponding to the power radiated by the Dirac field in the far field zone then has the order of magnitude

$$S \approx c(\epsilon_0E^2 + B^2/\mu_0) = c^3\epsilon_0\omega^2e^2/R^2 + e^2c^3/(\mu_0a^2R^2)$$

and the total power radiated in the far field zone is of the order

$$W = SR^2 = c^3\epsilon_0\omega^2e^2 + e^2c^3/(\mu_0a^2)$$
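To get a feeling for the scales involved, the characteristic quantities of this section can be evaluated for the one Angstrom cavity. The sketch below only plugs SI values of the physical constants into the order of magnitude formulas ω = πc/a, P ≈ h/a and E_e ≈ c√(m²c² + h²/a²); the variable names are illustrative, and the numbers are order of magnitude estimates, not predictions.

```python
import math

c = 299792458.0             # speed of light, m/s
h = 6.62607015e-34          # Planck constant, J s
m_e = 9.1093837015e-31      # electron mass, kg
eV = 1.602176634e-19        # joules per electron volt
a = 1e-10                   # cavity side, one Angstrom

# Characteristic photon mode frequency and the energy quantum (h / 2 pi) * omega
omega_photon = math.pi * c / a
photon_quantum_eV = (h / (2.0 * math.pi)) * omega_photon / eV

# De Broglie momentum scale and the corresponding electron energy
momentum = h / a
electron_energy = c * math.sqrt((m_e * c) ** 2 + momentum**2)
rest_energy = m_e * c**2
omega_dirac = electron_energy / h   # as defined in the text, omega = E_e / h

print(omega_photon)                    # rad/s, of order 1e19
print(photon_quantum_eV)               # a few keV
print(electron_energy / rest_energy)   # close to 1: rest energy dominates
print(omega_dirac)
```

The photon quantum comes out in the keV range, while the kinetic contribution to the electron energy is a small correction to the rest energy at this length scale.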
3.5 The notion of a Fermionic coherent state and its application to the computation of the quantum statistical moments of the quantum electromagnetic field generated by electrons and positrons within a quantum antenna

Aim: The aim of this section is to present a calculation involving the computation of the quantum statistical moments of the electromagnetic field produced by an ensemble of electrons and positrons whose state is specified by
a mixed state superposition of Fermionic coherent states. Fermionic coherent states are parameterized by Grassmannian/Fermionic numbers and, in order to attach physical significance to the final results, we must use the Berezin integral for Fermionic variables to determine the above mentioned superposition of Fermionic coherent states. We can incorporate some unknown real parameters into the Berezin linear combination of coherent states and estimate these parameters by minimizing the distance between the average value of the electromagnetic field generated by the Fermions and the desired electromagnetic field pattern. If need be, we may modify this cost function by constraining the higher order quantum statistical moments of the generated quantum electromagnetic field to take specified values. An example of an application of this circle of ideas is to use a quantum antenna to generate a set of desired spatial patterns at a given set of frequencies.

First consider just a single Fermion specified by the annihilation operator $a$ and the creation operator $a^*$. Thus,
$$a^2 = a^{*2} = 0, \quad aa^* + a^*a = 1$$
Let $\gamma$ be a Grassmannian variable that will be used to specify the coherent state of this Fermion, just as a complex number $z$ is used to specify the coherent state of a single Boson. $\gamma$ anticommutes with itself, with $\gamma^*$ and with $a, a^*$, just as in the Bosonic situation the complex number $z$ that specifies the coherent state commutes with itself, with $\bar z$ and with the Boson creation and annihilation operators:
$$\gamma^2 = 0, \quad \gamma\gamma^* + \gamma^*\gamma = 0, \quad \gamma a + a\gamma = 0, \quad \gamma a^* + a^*\gamma = 0, \quad \gamma^*a + a\gamma^* = 0, \quad \gamma^*a^* + a^*\gamma^* = 0$$
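Define now the Fermionic Weyl operator
$$D(\gamma) = \exp(\gamma a^* - a\gamma^*)$$
Clearly, $D(\gamma)$ is a unitary operator since it is the exponential of a skew Hermitian operator. Now, since the cube of the exponent vanishes,
$$\begin{aligned}
D(\gamma) &= 1 + \gamma a^* - a\gamma^* + (1/2)(\gamma a^* - a\gamma^*)^2 \\
&= 1 + \gamma a^* - a\gamma^* - (1/2)(\gamma a^*a\gamma^* + a\gamma^*\gamma a^*) \\
&= 1 + \gamma a^* - a\gamma^* + (1/2)(\gamma^*\gamma a^*a - \gamma^*\gamma(1 - a^*a)) \\
&= 1 + \gamma a^* - a\gamma^* + \gamma^*\gamma(a^*a - 1/2)
\end{aligned}$$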
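Then,
$$aD(\gamma) = a - \gamma aa^* + \gamma^*\gamma a/2 = (1 + \gamma^*\gamma/2)a - \gamma aa^*$$
$$D(\gamma)a = a + \gamma a^*a - \gamma^*\gamma a/2 = (1 - \gamma^*\gamma/2)a + \gamma a^*a$$
Thus,
$$aD(\gamma) - D(\gamma)a = \gamma^*\gamma a - \gamma$$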
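However,
$$D(\gamma)\gamma = \gamma - \gamma^*\gamma a, \quad \gamma D(\gamma) = \gamma - \gamma^*\gamma a$$
ie,
$$[\gamma, D(\gamma)] = 0$$
Thus,
$$aD(\gamma) - D(\gamma)a = -D(\gamma)\gamma = -\gamma D(\gamma)$$
These equations can be rearranged as
$$D(\gamma)aD(\gamma)^{-1} = a + \gamma, \quad D(\gamma)^{-1}aD(\gamma) = a - \gamma$$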
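The algebra above uses only the anticommutation relations, so it can be checked mechanically in any matrix representation of those relations. The sketch below is one such check and not part of the original development: it assumes a three mode Jordan-Wigner construction in which the creation operators of two auxiliary modes stand in for $\gamma$ and $\gamma^*$ (they are nilpotent and anticommute with everything, but the matrix called `gs` is a formal stand-in for $\gamma^*$, not the matrix adjoint of `g`), while the third mode carries $a, a^*$. All helper names are hypothetical.

```python
from fractions import Fraction

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in A for rb in B]

def mul(A, B):
    """Matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lin(A, B, s=1):
    """Return A + s*B."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron3(A, B, C):
    return kron(kron(A, B), C)

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
S = [[0, 1], [0, 0]]               # lowers |1> to |0>

g = transpose(kron3(S, I2, I2))    # stand-in for gamma
gs = transpose(kron3(Z, S, I2))    # formal stand-in for gamma^*
a = kron3(Z, Z, S)                 # Fermion annihilation operator
ad = transpose(a)                  # Fermion creation operator
ONE = kron3(I2, I2, I2)
ZERO = scale(0, ONE)
NUM = mul(ad, a)                   # number operator a^* a

def anti(A, B):
    """Anticommutator AB + BA."""
    return lin(mul(A, B), mul(B, A))

def weyl_closed(gam, gam_s):
    """Closed form 1 + gam a^* - a gam^* + gam^* gam (a^* a - 1/2)."""
    out = lin(lin(ONE, mul(gam, ad)), mul(a, gam_s), -1)
    half = scale(Fraction(1, 2), ONE)
    return lin(out, mul(mul(gam_s, gam), lin(NUM, half, -1)))

X = lin(mul(g, ad), mul(a, gs), -1)                      # exponent gamma a^* - a gamma^*
X2 = mul(X, X)
D_series = lin(lin(ONE, X), scale(Fraction(1, 2), X2))   # exp(X), since X^3 = 0
D = weyl_closed(g, gs)
D_inv = weyl_closed(scale(-1, g), scale(-1, gs))         # D(-gamma)

if __name__ == "__main__":
    print("closed form equals series:", D == D_series)
    print("D a D^-1 == a + gamma:", mul(mul(D, a), D_inv) == lin(a, g))
```

Exact rational arithmetic (`Fraction`) is used so that the identities can be compared with `==` rather than with a numerical tolerance.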
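We define the Fermionic single particle coherent state as
$$|\gamma\rangle = D(-\gamma)|0\rangle = D(\gamma)^{-1}|0\rangle$$
where $|0\rangle$ is the vacuum, ie, zero particle state. Then
$$a|\gamma\rangle = aD(\gamma)^{-1}|0\rangle = D(\gamma)^{-1}\cdot D(\gamma)aD(\gamma)^{-1}|0\rangle = D(\gamma)^{-1}(a + \gamma)|0\rangle = \gamma D(\gamma)^{-1}|0\rangle = \gamma|\gamma\rangle$$
This proves the desired property of a coherent state, namely that it should be an eigenvector of the annihilation operator. We observe that
$$D(\gamma)^{-1} = 1 - \gamma a^* + a\gamma^* + \gamma^*\gamma(a^*a - 1/2)$$
and hence
$$|\gamma\rangle = |0\rangle - \gamma|1\rangle - (1/2)\gamma^*\gamma|0\rangle = (1 - \gamma^*\gamma/2)|0\rangle - \gamma|1\rangle \qquad (1)$$
From this expression, we can directly verify the coherent state property:
$$a|\gamma\rangle = \gamma a|1\rangle = \gamma|0\rangle,$$
while
$$\gamma|\gamma\rangle = \gamma|0\rangle$$
since
$$\gamma^2 = 0, \quad \gamma\gamma^*\gamma = -\gamma^*\gamma^2 = 0,$$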
proving thereby the coherent state property for the state (1).

Now we are in a position to discuss physical implications for Fermionic coherent states. The first observation is that a coherent state is not parametrized by a complex number; it is parametrized by a Fermionic/Grassmannian parameter or a set of anticommuting Grassmannian parameters. Then, if we compute average values of quantities like, for example, the Dirac four current density in such a state, we will get a Grassmannian number. What physical significance
does this have when our averages are not real or complex numbers? The answer to this question is provided by the Berezin integral: Let $\phi(\gamma, \gamma^*)$ be a function of the Grassmannian parameters $\gamma, \gamma^*$ so that the Berezin integral
$$\rho = \int\phi(\gamma, \gamma^*)|\gamma\rangle\langle\gamma|\,d\gamma\,d\gamma^*$$
defines a mixed state. Then, the average value of a function $F(a, a^*)$ of the Fermionic operators $a, a^*$ in the state $\rho$ becomes a complex number to which we can attach physical meaning:
$$Tr(\rho F(a, a^*)) = \int\phi(\gamma, \gamma^*)\langle\gamma|F(a, a^*)|\gamma\rangle\,d\gamma\,d\gamma^*$$
Another example involves computing average values of the electromagnetic field emitted by a field of electrons and positrons in a given coherent state of the electron-positron field. Let $a_k, k = 1, 2, ...$ denote the annihilation operators of the electrons and positrons after discretizing in momentum space. They satisfy the CAR
$$[a_k, a_m]_+ = 0, \quad [a_k, a_m^*]_+ = \delta_{km}$$
The current density field generated by this field is, according to Dirac's theory, a quadratic function of these operators and hence the electromagnetic field generated by this current density according to the retarded potential formula is also a quadratic function of these operators. We can express this electromagnetic field as
$$F_{\mu\nu}(x) = \sum_{k,m=1}^{N}[G_{\mu\nu}(x, k, m, 1)a_ka_m + \bar G_{\mu\nu}(x, k, m, 1)a_m^*a_k^* + G_{\mu\nu}(x, k, m, 2)a_k^*a_m], \quad x \in \mathbb{R}^4$$
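This should be a Hermitian operator field and hence
$$\bar G_{\mu\nu}(x, k, m, 2) = G_{\mu\nu}(x, m, k, 2)$$
The coherent state of the electrons and positrons is given, in agreement with the single particle definition above, by
$$|\gamma\rangle = D(\gamma)^{-1}|0\rangle, \quad D(\gamma) = \prod_{k=1}^{N}\exp(\gamma(k)a_k^* - a_k\gamma(k)^*)$$
where $\gamma = ((\gamma(k)))_{k=1}^{N}$ are Fermionic/Grassmannian parameters and $\gamma(k)$ and $\gamma(k)^*$ anticommute with $\gamma(l), \gamma(l)^*, a_l, a_l^*$ for all $l$. We can write
$$D(\gamma) = \prod_{k=1}^{N}(1 + \gamma(k)a_k^* - a_k\gamma(k)^* + \gamma(k)^*\gamma(k)(a_k^*a_k - 1/2))$$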
The state of the electrons and positrons is assumed to be given by a Berezin integral based superposition of the coherent states:
$$\rho(\theta) = \int\phi(\gamma, \gamma^*|\theta)|\gamma\rangle\langle\gamma|\,d\gamma\,d\gamma^*$$
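and hence the average electromagnetic field in this state is
$$\langle F_{\mu\nu}(x)\rangle(\theta) = Tr(\rho(\theta)F_{\mu\nu}(x)) = \int\phi(\gamma, \gamma^*|\theta)\langle\gamma|F_{\mu\nu}(x)|\gamma\rangle\,d\gamma\,d\gamma^*$$
where
$$\begin{aligned}
\langle\gamma|F_{\mu\nu}(x)|\gamma\rangle &= \sum_{k,m=1}^{N}[G_{\mu\nu}(x, k, m, 1)\langle\gamma|a_ka_m|\gamma\rangle + \bar G_{\mu\nu}(x, k, m, 1)\langle\gamma|a_m^*a_k^*|\gamma\rangle + G_{\mu\nu}(x, k, m, 2)\langle\gamma|a_k^*a_m|\gamma\rangle] \\
&= \sum_{k,m=1}^{N}[G_{\mu\nu}(x, k, m, 1)\gamma(k)\gamma(m) + \bar G_{\mu\nu}(x, k, m, 1)\gamma(m)^*\gamma(k)^* + G_{\mu\nu}(x, k, m, 2)\gamma(k)^*\gamma(m)], \quad x \in \mathbb{R}^4
\end{aligned}$$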
We can now control the parameter vector $\theta$ so that this average electromagnetic field is as close as possible to a desired electromagnetic field $F_{d\mu\nu}(x)$ over a given spacetime region $x \in D$ by minimizing
$$E(\theta) = \int_D|\langle F_{\mu\nu}(x)\rangle(\theta) - F_{d\mu\nu}(x)|^2\,d\mu(x)$$
where $\mu(.)$ is a measure on $D$.

Remark 1: More generally, we can compute all the statistical moments of the radiation field
$$Tr(\rho(\theta)F_{\mu_1\nu_1}(x_1)...F_{\mu_k\nu_k}(x_k))$$
in the superposed coherent state $\rho(\theta)$. This computation will involve determining coherent state expectations such as
$$\langle\gamma|a_{k_1}^*...a_{k_r}^*a_{s_1}...a_{s_m}|\gamma\rangle$$
and noting that, with the same ordering convention as in the formula for $\langle\gamma|a_m^*a_k^*|\gamma\rangle$ above, this evaluates to
$$\gamma(k_1)^*...\gamma(k_r)^*\gamma(s_1)...\gamma(s_m)$$
The reference for Fermionic coherent states for us has been the master's thesis by Greplova, titled "Fermionic Gaussian States".
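As a small worked illustration of how the trace formula $Tr(\rho F) = \int\phi\langle\gamma|F|\gamma\rangle\,d\gamma\,d\gamma^*$ turns Grassmann valued expectations into ordinary numbers, consider a single mode. The weight $\phi = \alpha + \beta\gamma^*\gamma$ with real $\alpha, \beta$ and the Berezin normalisation $\int\gamma^*\gamma\,d\gamma\,d\gamma^* = 1$, $\int d\gamma\,d\gamma^* = \int\gamma\,d\gamma\,d\gamma^* = \int\gamma^*\,d\gamma\,d\gamma^* = 0$ are assumptions made here purely for illustration; sign and ordering conventions for the Berezin integral differ between references. Using $\langle\gamma|\gamma\rangle = 1$, $\langle\gamma|a^*a|\gamma\rangle = \gamma^*\gamma$ and $(\gamma^*\gamma)^2 = 0$:

$$\begin{aligned}
Tr(\rho) &= \int(\alpha + \beta\gamma^*\gamma)\,d\gamma\,d\gamma^* = \beta, \\
Tr(\rho\,a^*a) &= \int(\alpha + \beta\gamma^*\gamma)\gamma^*\gamma\,d\gamma\,d\gamma^* = \alpha.
\end{aligned}$$

Under these assumptions, normalisation of the state requires $\beta = 1$, and the free parameter $\alpha$ is the mean occupation number of the mode, which is the kind of real parameter $\theta$ that the optimization above would adjust.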
3.6 Calculating the moments of the radiation field produced by electrons and positrons in the far field when the Fermions are in a coherent state

Remark 2: From Steven Weinberg's book, "The quantum theory of fields, vol.1", it is known that the free Dirac field can be expanded in terms of momentum-spin space electron annihilation operators $a(P, \sigma)$ and positron creation operators $b(P, \sigma)^*$ which satisfy the CAR (canonical anticommutation relations)
$$[a(P, \sigma), a(P', \sigma')^*]_+ = \delta^3(P - P')\delta_{\sigma\sigma'}, \quad [b(P, \sigma), b(P', \sigma')^*]_+ = \delta^3(P - P')\delta_{\sigma\sigma'}$$
with all the other anticommutators evaluating to zero. The second quantized Dirac wave field is then the solution to Dirac's relativistic wave equation and is given by
$$\psi(x) = \psi(t, r) = \sum_\sigma\int[a(P, \sigma)u(P, \sigma)\exp(-ip.x) + b(P, \sigma)^*v(P, \sigma)\exp(ip.x)]\,d^3P$$
where
$$p^0 = E(P) = \sqrt{m^2 + P^2}$$
The Dirac current density operator field is then
$$J^\mu(x) = -e\psi(x)^*\gamma^0\gamma^\mu\psi(x)$$
and it is evident that this can be expressed as a linear combination of the quadratic operators
$$a(P, \sigma)^*a(P', \sigma'), \quad a(P, \sigma)^*b(P', \sigma')^*, \quad b(P, \sigma)a(P', \sigma'), \quad b(P, \sigma)b(P', \sigma')^*$$
Thus, using the retarded potential formula for the Maxwell equations in the form
$$A^\mu(x) = \int G(x - x')J^\mu(x')\,d^4x'$$
it is evident that once again $A^\mu(x)$ is expressible as a linear combination of the above quadratic operators. After discretizing the integrals in 3-momentum space, we then group all the electron and positron annihilation operators into one set $\{a_k\}$ and their adjoints into $\{a_k^*\}$ and then use the above coherent state formalism of Greplova to determine the quantum averages of the electromagnetic field.
3.7 Controlling the classical em fields interacting with the Dirac field so that the mean value of the em field radiated by the resulting Dirac second quantized current in a Fermionic coherent state is as close as possible to a given deterministic pattern in space and simultaneously the mean square fluctuations of this field in a Fermionic coherent state are minimized

The ultimate aim of all these computations can be formulated in very simple terms as an optimization problem: Design the control parameters $\theta$ or the control classical fields to a quantum antenna so that the error energy between the average value of the quantum electromagnetic field produced by the quantum antenna and the desired classical electromagnetic field pattern is a minimum subject to the constraint that the second order central moments of the quantum electromagnetic field (ie, the variance of fluctuations) are smaller than a given threshold.

Remark: More generally, we can control the wave function operator of the Dirac field of electrons and positrons as well as the Maxwell photon field operators within the cavity resonator antenna by introducing classical control current and electromagnetic field sources into the cavity. The quantum cavity photon and electron-positron fields will then be expressible in terms of the free quantum fields plus additional perturbation terms involving the classical current and field sources. Once this is done, we can in principle calculate the far field antenna pattern produced by the cavity surface currents induced by the tangential components of the quantum magnetic field operators as well as that produced by the Dirac field of electrons and positrons, and then design these classical control fields so that the far field quantum Poynting radiation pattern has a mean value and correlations, in a given quantum coherent state of the photons and electrons-positrons within the cavity, as close as possible to specified values.
To formulate this optimization problem in abstract terms, let
$$X(t, r) = X(x) = \sum_k[f_k(\theta)X_k(t, r) + \bar f_k(\theta)X_k(t, r)^*]$$
be the quantum field radiated out by the quantum antenna where $\theta$ is a control parameter vector or a classical field. This form arises typically by perturbatively solving the Maxwell-Dirac field equations up to linear orders in the perturbing classical current and electromagnetic field. $f_k(\theta)$ is a complex valued function of $\theta$ and the $X_k(t, r)$ are quantum operator fields. The average value of this radiated
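field in a state $|\Phi\rangle$ is given by
$$\langle X(t, r)\rangle = \langle\Phi|X(t, r)|\Phi\rangle = \sum_k[f_k(\theta)\langle\Phi|X_k(t, r)|\Phi\rangle + \bar f_k(\theta)\langle\Phi|X_k(t, r)^*|\Phi\rangle]$$
and the central correlations in the field are
$$\begin{aligned}
C(t, r|t', r') &= \langle X(t, r)X(t', r')^*\rangle - \langle X(t, r)\rangle\langle X(t', r')^*\rangle \\
&= \sum_{k,m}f_k(\theta)\bar f_m(\theta)\langle\Phi|(X_k(t, r) - \langle X_k(t, r)\rangle)(X_m(t', r')^* - \langle X_m(t', r')\rangle^*)|\Phi\rangle \\
&\quad + \sum_{k,m}f_k(\theta)f_m(\theta)\langle\Phi|(X_k(t, r) - \langle X_k(t, r)\rangle)(X_m(t', r') - \langle X_m(t', r')\rangle)|\Phi\rangle + c.c
\end{aligned}$$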
where c.c denotes the complex conjugate of the previous terms. It is then easy to see that if the $f_k$'s are linear functions of $\theta$, then the problem of minimizing
$$\int_D(\langle X(t, r)\rangle - X_d(t, r))^2\,dt\,d^3r$$
subject to the constraint that
$$\int_{D\times D}W(t, r|t', r')C(t, r|t', r')\,dt\,d^3r\,dt'\,d^3r'$$
is fixed is equivalent to finding the minimum of the ratio of two quadratic forms:
$$E(\theta) = \frac{\theta^TQ_1\theta}{\theta^TQ_2\theta}$$
where $Q_1, Q_2$ are two Hermitian positive definite matrices, and the optimal equations for $\theta$ then result in the generalized eigenvalue problem
$$(Q_1 - cQ_2)\theta = 0$$
The minimum value of this ratio is the minimum of all the generalized eigenvalues $c$, ie, the minimum of all the $c$'s for which $\det(Q_1 - cQ_2) = 0$, and the optimal value of $\theta$ is a generalized eigenvector corresponding to the minimum $c$ with the normalization condition $\theta^TQ_2\theta = E$ where $E$ is the prescribed value of the energy of the quantum fluctuations
$$\int_{D\times D}W(t, r|t', r')C(t, r|t', r')\,dt\,d^3r\,dt'\,d^3r'.$$
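The generalized eigenvalue step can be made concrete in the smallest nontrivial case. The following is a minimal sketch under simplifying assumptions that are not in the text: $\theta$ is real and two dimensional, and $Q_1, Q_2$ are real symmetric positive definite $2\times 2$ matrices, so that $\det(Q_1 - cQ_2) = 0$ is a quadratic in $c$. The function names and the sample matrices are hypothetical.

```python
import math
import random

def quad(Q, v):
    """Quadratic form v^T Q v for a 2x2 matrix Q."""
    return sum(v[i] * Q[i][j] * v[j] for i in range(2) for j in range(2))

def rayleigh(Q1, Q2, v):
    """Ratio of quadratic forms that is being minimized."""
    return quad(Q1, v) / quad(Q2, v)

def min_generalized_eigenpair(Q1, Q2):
    """Smallest root c of det(Q1 - c Q2) = 0 and a vector theta with (Q1 - c Q2) theta = 0."""
    (p11, p12), (_, p22) = Q1
    (q11, q12), (_, q22) = Q2
    # det(Q1 - c Q2) = A c^2 - B c + C0
    A = q11 * q22 - q12 * q12
    B = p11 * q22 + p22 * q11 - 2 * p12 * q12
    C0 = p11 * p22 - p12 * p12
    c = (B - math.sqrt(B * B - 4 * A * C0)) / (2 * A)
    # Null vector of the singular symmetric matrix M = Q1 - c Q2,
    # taken orthogonal to whichever row of M is larger.
    m11, m12, m22 = p11 - c * q11, p12 - c * q12, p22 - c * q22
    if abs(m11) + abs(m12) >= abs(m12) + abs(m22):
        theta = (-m12, m11)
    else:
        theta = (m22, -m12)
    if theta == (0.0, 0.0):        # Q1 is a multiple of Q2: any vector is optimal
        theta = (1.0, 0.0)
    return c, theta

def normalise(theta, Q2, energy):
    """Rescale theta so that theta^T Q2 theta equals the prescribed energy."""
    s = math.sqrt(energy / quad(Q2, theta))
    return (s * theta[0], s * theta[1])

if __name__ == "__main__":
    Q1 = [[2.0, 0.5], [0.5, 1.0]]
    Q2 = [[1.0, 0.2], [0.2, 3.0]]
    c_min, theta = min_generalized_eigenpair(Q1, Q2)
    print(c_min, normalise(theta, Q2, 2.5))
```

For Hermitian matrices of larger size the same problem would normally be handed to a numerical linear algebra library; the closed form above is chosen only so that the example stays self-contained.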
3.8 Approximate analysis of a rectangular quantum antenna

The quantum antenna is assumed to be the cuboid region $[0, a]\times[0, b]\times[0, d]$. This rectangular cavity is assumed to contain photons, electrons and positrons. The exact equations governing the quantum fields corresponding to these particles are (a) the Maxwell equations for the four vector potential driven by the Dirac field current and (b) the Dirac field equations driven by an interaction between the Dirac field and the Maxwell field four vector potential. These exact field equations are:
$$\Box A^\mu(x) = \mu_0e\psi^*(x)\alpha^\mu\psi(x), \quad x = (t, r) \qquad (1)$$
$$((\alpha, -i\nabla) + \beta m_0)\psi(t, r) + eA_\mu(x)\alpha^\mu\psi(x) = i\partial_t\psi(t, r) \qquad (2)$$
where
$$\alpha^\mu = \gamma^0\gamma^\mu, \quad \beta = \gamma^0$$
and $\gamma^\mu$ are the Dirac Gamma matrices. Note that $(\gamma^0)^2 = I_4$ and hence $\alpha^0 = I_4$. The boundary conditions under which we need to solve these Maxwell-Dirac equations are that the Dirac operator wave field $\psi(x)$, the tangential components of the electric field $F_{0r} = A_{r,0} - A_{0,r}$, $r = 1, 2, 3$, and the normal components of the magnetic field $F_{rs} = A_{s,r} - A_{r,s}$, $1 \le r < s \le 3$, must vanish on the boundaries of the cavity. In particular, the free Dirac field must have an expansion
$$\psi^{(0)}(t, r) = \sum_{mnp}c(mnp, t)u_{mnp}(r)$$
where $m, n, p$ run over positive integers and
$$u_{mnp}(r) = (2\sqrt{2}/\sqrt{abd})\sin(m\pi x/a)\sin(n\pi y/b)\sin(p\pi z/d)$$
Substituting this into the free Dirac equation, ie, without any electromagnetic interactions, we get
$$\sum_{mnp}(i\partial_tc(mnp, t))u_{mnp}(r) = ((\alpha, -i\nabla) + \beta m_0)\sum_{mnp}c(mnp, t)u_{mnp}(r)$$
from which we derive, on taking the inner products on both sides with $u_{kls}(r)$ and using the orthonormality of this set of functions over the cavity volume, ie,
$$\langle u_{kls}, u_{mnp}\rangle = \int_Bu_{kls}(r)u_{mnp}(r)\,d^3r = \delta_{km}\delta_{ln}\delta_{sp}$$
where $B$ is the cavity volume $B = [0, a]\times[0, b]\times[0, d]$,
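the following sequence of differential equations
$$i\partial_tc(kls, t) = \sum_{mnp}[\langle u_{kls}, -i\partial_xu_{mnp}\rangle\alpha_1c(mnp, t) + \langle u_{kls}, -i\partial_yu_{mnp}\rangle\alpha_2c(mnp, t) + \langle u_{kls}, -i\partial_zu_{mnp}\rangle\alpha_3c(mnp, t)] + m_0\beta c(kls, t)$$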
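Now we evaluate
$$\langle u_{kls}, -i\partial_xu_{mnp}\rangle = -i\delta_{ln}\delta_{sp}(m\pi/a)\int_0^a(2/a)\sin(k\pi x/a)\cos(m\pi x/a)\,dx = a_1(k, m)\delta_{ln}\delta_{sp}$$
where
$$a_1(k, m) = (-2im\pi/a^2)\int_0^a\sin(k\pi x/a)\cos(m\pi x/a)\,dx$$
Likewise,
$$\langle u_{kls}, -i\partial_yu_{mnp}\rangle = a_2(l, n)\delta_{km}\delta_{sp},$$
and
$$\langle u_{kls}, -i\partial_zu_{mnp}\rangle = a_3(s, p)\delta_{km}\delta_{ln}$$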
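The overlap integral entering $a_1(k, m)$ has an elementary closed form: writing $\sin A\cos B = \frac{1}{2}[\sin(A + B) + \sin(A - B)]$ gives $\int_0^a\sin(k\pi x/a)\cos(m\pi x/a)\,dx = 2ak/(\pi(k^2 - m^2))$ when $k + m$ is odd and zero otherwise. The sketch below, with hypothetical function names and an arbitrarily chosen cavity length in the usage lines, compares this closed form against a midpoint rule quadrature and evaluates $a_1(k, m)$ as defined above.

```python
import math

def overlap_closed_form(k, m, a):
    """Integral over [0, a] of sin(k pi x/a) cos(m pi x/a) dx for positive integers k, m."""
    if (k + m) % 2 == 0:
        return 0.0                      # vanishes when k + m is even, including k == m
    return 2.0 * a * k / (math.pi * (k * k - m * m))

def overlap_numeric(k, m, a, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    h = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.sin(k * math.pi * x / a) * math.cos(m * math.pi * x / a)
    return h * total

def a1(k, m, a):
    """Matrix element a1(k, m) = (-2 i m pi / a^2) times the overlap integral."""
    return (-2j * m * math.pi / a ** 2) * overlap_closed_form(k, m, a)

if __name__ == "__main__":
    length = 1.7                        # arbitrary cavity side used for the comparison
    for k in range(1, 4):
        for m in range(1, 4):
            print(k, m, overlap_closed_form(k, m, length), overlap_numeric(k, m, length))
```

The closed form also makes it visible that $a_1(k, m) = \overline{a_1(m, k)}$, which is what the Hermiticity of the resulting Hamiltonian matrix requires.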
Combining all these equations gives us finally,
$$i\partial_tc(kls, t) = \sum_ma_1(k, m)\alpha_1c(mls, t) + \sum_na_2(l, n)\alpha_2c(kns, t) + \sum_pa_3(s, p)\alpha_3c(klp, t) + m_0\beta c(kls, t)$$
Note that $\alpha_1, \alpha_2, \alpha_3, \beta$ are $4\times 4$ Hermitian matrices while $c(mnp, t)$ is a $4\times 1$ complex vector. We arrange the $4\times 1$ vectors $c(mnp, t)$, $m, n, p \ge 1$, in lexicographic order to give an infinite vector $c(t)$ and likewise define a block structured infinite dimensional Dirac Hamiltonian matrix $H_0$ by
$$\begin{aligned}
H_0 &= \sum_{klsm}a_1(k, m)(I_4\otimes e(kls))\alpha_1(I_4\otimes e(mls)^T) + \sum_{lksn}a_2(l, n)(I_4\otimes e(kls))\alpha_2(I_4\otimes e(kns)^T) \\
&\quad + \sum_{klsp}a_3(s, p)(I_4\otimes e(kls))\alpha_3(I_4\otimes e(klp)^T) + m_0\beta\otimes I
\end{aligned}$$
where we may choose $e(mnp)$, $m, n, p \ge 1$, as any orthonormal basis for $l^2(\mathbb{Z}_+)$, the Hilbert space of all one sided square summable infinite sequences, and define
$$c(t) = \sum_{mnp}c(mnp, t)\otimes e(mnp)$$
By orthonormal, we mean that
$$e(kls)^Te(mnp) = \delta_{km}\delta_{ln}\delta_{sp}$$
Thus the free Dirac equation in the RDRA has been put in "standard" block matrix form:
$$i\frac{dc(t)}{dt} = H_0c(t)$$
the general solution to which can be expressed as
$$c(t) = \sum_nd(n)c_n\exp(-iE(n)t)$$
where the $c_n$, $n \ge 1$, are eigenvectors of $H_0$ forming an orthonormal basis and the $d(n)$'s are arbitrary complex numbers such that
$$\sum_n|d(n)|^2 = 1$$
The $E(n)$'s are the (energy) eigenvalues of the infinite dimensional Hermitian matrix $H_0$:
$$\det(H_0 - E(n)I) = 0$$
The average energy of the free Dirac field of electrons and positrons within the cavity is then
$$\langle c(t), H_0c(t)\rangle = \sum_nE(n)d(n)^*d(n)$$
It is easy to see, as in the case of the Dirac equation in free space, that if $E(n)$ is an eigenvalue of $H_0$ then so is $-E(n)$, where the $E(n)$'s may be taken as positive. Hence if $c_{en}$ is an eigenvector of $H_0$ corresponding to the eigenvalue $E(n)$ and $c_{pn}$ is an eigenvector corresponding to the eigenvalue $-E(n)$, then the solution can be expressed as
$$c(t) = \sum_n[d_e(n)c_{en}\exp(-iE(n)t) + d_p(n)^*c_{pn}\exp(iE(n)t)]$$
Therefore, it is plausible in the second quantized theory to look upon the $d_e(n)$'s as annihilation operators of the electrons and the $d_p(n)^*$'s as the creation operators of the positrons. The actual Dirac wave function $\psi(t, r)$ in the absence of electromagnetic interactions is then
$$\psi(t, r) = \sum_{kmnp}[d_e(k)c_{ek}(mnp)u_{mnp}(r)\exp(-iE(k)t) + d_p(k)^*c_{pk}(mnp)u_{mnp}(r)\exp(iE(k)t)] \qquad (3)$$
A simple calculation then shows that the second quantized Hamiltonian of the free Dirac field of electrons and positrons within the cavity is given by
$$\begin{aligned}
H_{D0} &= \int_B\psi(t, r)^*((\alpha, -i\nabla) + \beta m)\psi(t, r)\,d^3r \\
&= \int_B\psi(t, r)^*i\partial_t\psi(t, r)\,d^3r \\
&= \sum_kE(k)(d_e(k)^*d_e(k) - d_p(k)d_p(k)^*)
\end{aligned}$$
Now from the basic anticommutation relations for the Dirac field, we have
$$\{\psi(t, r), \psi(t, r')^*\} = \delta^3(r - r')I$$
and this immediately implies the following anticommutation relations for the electron and positron creation and annihilation operators:
$$\{d_e(k), d_e(m)^*\} = \delta_{km}, \quad \{d_p(k), d_p(m)^*\} = \delta_{km}$$
with all the other anticommutators vanishing. This completes our description of the free Dirac field of electrons and positrons within the RDRA. Using these anticommutation relations in the form $-d_p(k)d_p(k)^* = d_p(k)^*d_p(k) - 1$, we immediately get that the total second quantized Hamiltonian of the free Dirac field in the cavity can, after discarding the additive constant $-\sum_kE(k)$, equivalently be expressed as
$$H_{D0} = \sum_kE(k)(d_e(k)^*d_e(k) + d_p(k)^*d_p(k))$$
namely, the sum of the total electron and positron energies. Likewise, when we solve the free Maxwell equations within the cavity after incorporating the appropriate boundary conditions, we get that the scalar potential is zero since there are no charges, while the magnetic vector potential admits an expansion obtained from $E = -\partial_tA$ as
$$A(t, r) = \sum_k[b(k)w_k(r)\exp(-i\omega(k)t) + b(k)^*w_k(r)^*\exp(i\omega(k)t)] \qquad (4)$$
where now $w_k(r)$ has three components that are calculated from the expansion of $E_z$ and the relationship between the transverse and longitudinal components of the electric field within the cavity. Note that the electromagnetic field is being computed in the Coulomb gauge, in which the scalar potential satisfies Poisson's equation and is therefore a matter field which evaluates to zero since there is no unperturbed charge density. We also note that the third, ie, $z$ component of $w_k(r)$, where the index $k$ is identified with the modal triplet $(mnp)$, is proportional to $\sin(m\pi x/a)\sin(n\pi y/b)\cos(p\pi z/d)$ in view of the boundary conditions on the electric field and the fact that each mode of the magnetic vector potential is proportional to the electric field ($-j\omega A = E$). $b(k) = b(mnp)$ is identified with a photon annihilation operator while $b(k)^*$ is identified with a photon creation operator. They satisfy the canonical commutation relations
$$[b(k), b(m)^*] = \delta_{km}$$
Formally, we can compute both the free Dirac current density $\psi(t, r)^*\alpha^\mu\psi(t, r)$ of electrons and positrons within the cavity as well as the surface current density on the RDRA walls induced by the tangential components of the quantum magnetic field $B = curlA$, and obtain the far field radiation pattern generated by both of these cavity current components. Obviously, this far field radiation pattern will have its first component being a quadratic form in the electron-positron creation and annihilation operators $d_e(k), d_p(k), d_e(k)^*, d_p(k)^*$ while the second component will be linear in the photon creation-annihilation operators $b(k), b(k)^*$ and therefore, in principle, we can compute all the statistical moments of the radiation field in a joint coherent state of the photons, electrons and positrons. However, this picture of the far field quantum radiation pattern is incomplete because it does not take into account the cavity current density terms caused by the perturbation in the Dirac wave field due to interaction with the photons, and it also does not take into account the cavity surface current density terms caused by the perturbation in the Maxwell field due to its interaction with the Dirac field. We shall now indicate an approximate first order calculation by which these extra correction terms, which are due to interactions between the Maxwell field and the Dirac field, may be obtained. We denote the free Dirac field within the cavity derived above by $\psi^{(0)}(t, r)$ and the corresponding momentum space wave function $c(mnp, t)$ by $c^{(0)}(mnp, t)$. Likewise, we denote the free Maxwell field within the cavity by $A^{(0)}$. Let $\delta A$ denote the perturbation to the Maxwell field caused by the Dirac current and $\delta\psi, \delta c(mnp, t)$ the perturbation to the Dirac field caused by the Maxwell field.
Then, clearly if $S(x - y)$ denotes the electron propagator and $D(x - y)$ the photon propagator, we have using (1) and (2), approximately,
$$\delta A^\mu(t, r) = \mu_0e\int D(t - t', r - r')\psi^{(0)*}(t', r')\alpha^\mu\psi^{(0)}(t', r')\,dt'\,d^3r'$$
$$\delta\psi(t, r) = e\int S(t - t', r - r')A_\mu^{(0)}(t', r')\alpha^\mu\psi^{(0)}(t', r')\,dt'\,d^3r'$$
where we substitute for $\psi^{(0)}$ and $A_\mu^{(0)}$ the expressions given in (3) and (4). Then,
$$\begin{aligned}
\psi^{(0)*}\alpha^\mu\psi^{(0)}(t, r) &= \sum_{kmnpk'm'n'p'}[d_e(k)^*\bar c_{ek}(mnp)\exp(iE(k)t) + d_p(k)\bar c_{pk}(mnp)\exp(-iE(k)t)]\,\alpha^\mu \\
&\qquad \cdot[d_e(k')c_{ek'}(m'n'p')\exp(-iE(k')t) + d_p(k')^*c_{pk'}(m'n'p')\exp(iE(k')t)]\,u_{mnp}(r)u_{m'n'p'}(r) \\
&= \sum[d_e(k)^*d_e(k')\bar c_{ek}(mnp)\alpha^\mu c_{ek'}(m'n'p')\exp(i(E(k) - E(k'))t)u_{mnp}(r)u_{m'n'p'}(r)] \\
&\quad + \sum[d_e(k)^*d_p(k')^*\bar c_{ek}(mnp)\alpha^\mu c_{pk'}(m'n'p')\exp(i(E(k) + E(k'))t)u_{mnp}(r)u_{m'n'p'}(r)] \\
&\quad + \sum[d_p(k)d_e(k')\bar c_{pk}(mnp)\alpha^\mu c_{ek'}(m'n'p')\exp(-i(E(k) + E(k'))t)u_{mnp}(r)u_{m'n'p'}(r)] \\
&\quad + \sum[d_p(k)d_p(k')^*\bar c_{pk}(mnp)\alpha^\mu c_{pk'}(m'n'p')\exp(-i(E(k) - E(k'))t)u_{mnp}(r)u_{m'n'p'}(r)]
\end{aligned}$$
where each sum runs over $k, m, n, p, k', m', n', p'$.
We see that the frequencies of the Dirac current that generate the perturbation to the quantum electromagnetic field are $E(k) \pm E(k')$, $k, k' = 1, 2, ...$, or more precisely, these divided by Planck's constant. Here the $E(k)$'s were obtained by solving the free Dirac eigenvalue equation inside the rectangular cavity with zero boundary conditions, ie, as the eigenvalues of the Dirac Hamiltonian. From basic principles of special relativity, it is easy to see that these $E(k)$'s are of the order
$$c\sqrt{m_0^2c^2 + P^2}$$
where
$$P^2 = (h/2\pi)^2((m\pi/a)^2 + (n\pi/b)^2 + (p\pi/d)^2)$$
with $m, n, p$ being positive integers determined by the mode of oscillation of the field within the cavity. Now this current is of the general form
$$-e\psi^{(0)*}\alpha^\mu\psi^{(0)}(t, r) = \sum_{k,k'}[d(k)^*d(k')f^\mu_{kk'}(t, r) + d(k)d(k')g^\mu_{kk'}(t, r) + d(k)^*d(k')^*\bar g^\mu_{kk'}(t, r)]$$
where the $d(k)$'s are the annihilation operators of the electrons and positrons and their adjoints $d(k)^*$ are the corresponding creation operators. The functions $f_{kk'}, g_{kk'}$ are constructed by superposing $\exp(\pm i(E(k) \pm E(k'))t)u_{mnp}(r)u_{m'n'p'}(r)$, and these components are easily seen to be expressible as superpositions of spacetime sinusoids with the temporal frequencies being $E(k) \pm E(k')$ or their negatives and the spatial frequencies, ie, wavenumbers, being $m\pi/a, n\pi/b, p\pi/d, m'\pi/a, n'\pi/b, p'\pi/d$. Note that because the current density is a Hermitian operator field, it follows that
$$\bar f^\mu_{kk'}(t, r) = f^\mu_{k'k}(t, r)$$
[d(k)∗ d(k ' )
k,kl
J
δAµ (t, r) = µ ' 3 ' ' ' ' D(t−t' , r−r' )fkk l (t , r )dt d r +d(k)d(k )
+d(k)∗ d(k ' )∗
Let
where
J
J
µ ' ' ' 3 ' D(t−t' , r−r' )gkk l (t , r )dt d r
µ ' ' ' 3 ' D(t−t' , r−r' )¯ gkk l (t , r )dt d r ]
J J µ (k) = −e ψ (0)∗ (x)αµ ψ (0) (x)exp(−ik.x)d4 x J = ψ (0)∗ (t, r)αµ ψ (0) (t, r)exp(−i(k 0 t − K.r))dtd3 r k = (k µ ) = (k 0 , K)
denote the spacetime four dimensional Fourier transform of the unperturbed Dirac four current density. Then, we can write down the spacetime Fourier transform of the correction $\delta A^\mu(x)$, $x = (t, r)$, to the electromagnetic four potential caused by this Dirac current as
$$\delta A^\mu(k) = \int\delta A^\mu(x)\exp(-ik.x)\,d^4x = \mu_0D(k)J^\mu(k) = \mu_0J^\mu(k)/k^2, \quad k^2 = k_\mu k^\mu = (k^0)^2 - |K|^2$$
in units where $c = 1$. It should be noted that by the convolution theorem for Fourier transforms, if $\psi^{(0)}(k)$ denotes the spacetime Fourier transform of $\psi^{(0)}(x)$, then
$$J^\mu(k) = -e(2\pi)^{-4}\int\psi^{(0)*}(k' - k)\alpha^\mu\psi^{(0)}(k')\,d^4k'$$
and hence, the perturbation to the electromagnetic four potential in the spacetime Fourier domain, ie, in four momentum space of the photon, can be expressed as
$$\delta A^\mu(k) = -(e\mu_0/((2\pi)^4k^2))\int\psi^{(0)*}(k' - k)\alpha^\mu\psi^{(0)}(k')\,d^4k'$$
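The convolution step can be spelled out as follows. This is a sketch under an assumed Fourier convention, namely $\psi^{(0)}(k) = \int\psi^{(0)}(x)e^{-ik.x}\,d^4x$, so that $\psi^{(0)}(x) = (2\pi)^{-4}\int\psi^{(0)}(k)e^{ik.x}\,d^4k$, with $\psi^{(0)*}(k)$ denoting the adjoint of $\psi^{(0)}(k)$; other conventions move the factors of $2\pi$ around.

$$\begin{aligned}
\int\psi^{(0)*}(x)\alpha^\mu\psi^{(0)}(x)e^{-ik.x}\,d^4x
&= (2\pi)^{-8}\int\!\!\int\psi^{(0)*}(k'')\alpha^\mu\psi^{(0)}(k')\Big[\int e^{i(k' - k'' - k).x}\,d^4x\Big]d^4k'\,d^4k'' \\
&= (2\pi)^{-8}\int\!\!\int\psi^{(0)*}(k'')\alpha^\mu\psi^{(0)}(k')\,(2\pi)^4\delta^4(k' - k'' - k)\,d^4k'\,d^4k'' \\
&= (2\pi)^{-4}\int\psi^{(0)*}(k' - k)\alpha^\mu\psi^{(0)}(k')\,d^4k'
\end{aligned}$$

The only ingredient is $\int e^{iq.x}\,d^4x = (2\pi)^4\delta^4(q)$, which forces $k'' = k' - k$; the charge and $\mu_0/k^2$ factors then multiply this bilinear expression.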
3.9 Remark on the perturbation in the quantum Dirac field and the quantum electromagnetic field interacting with each other caused by further interaction of the Dirac field with a classical control em field and interaction of the quantum electromagnetic field with a control classical current

The unperturbed electromagnetic field is in the Coulomb gauge, ie, $divA^{(0)} = 0$, and also since there is no charge/current for the unperturbed field, the unperturbed electric scalar potential is a matter field which is identically zero, ie, $A^{(0)0} = 0$. Hence, we are guaranteed that the unperturbed electromagnetic potentials also satisfy the Lorentz gauge conditions, ie, $divA^{(0)} + \partial_tA^{(0)0} = 0$. This means that while computing the perturbations to the electromagnetic potentials caused by currents coming from the Dirac field, we can safely work in the Lorentz gauge.

Likewise, the change in the Dirac field caused by interaction with the electromagnetic field within the cavity is given up to first order perturbation theory
by
$$\begin{aligned}
\delta\psi(x) &= e\int S(x - x')A_\mu^{(0)}(x')\alpha^\mu\psi^{(0)}(x')\,d^4x' \\
&= e\int S(x - x')A_r^{(0)}(x')\alpha^r\psi^{(0)}(x')\,d^4x' \\
&= -e\int S(x - x')(\alpha, A^{(0)}(x'))\psi^{(0)}(x')\,d^4x' \\
&= -e\int S(t - t', r - r')\Big[\sum_k[b(k)(\alpha, w_k(r'))\exp(-i\omega(k)t') + b(k)^*(\alpha, w_k(r')^*)\exp(i\omega(k)t')]\Big] \\
&\qquad \cdot\Big[\sum_{kmnp}[d_e(k)c_{ek}(mnp)u_{mnp}(r')\exp(-iE(k)t') + d_p(k)^*c_{pk}(mnp)u_{mnp}(r')\exp(iE(k)t')]\Big]dt'\,d^3r' \\
&= -e\sum_{kk'mnp}b(k)d_e(k')\int S(t - t', r - r')(\alpha, w_k(r'))c_{ek'}(mnp)u_{mnp}(r')\exp(-i(\omega(k) + E(k'))t')\,dt'\,d^3r' \\
&\quad - e\sum_{kk'mnp}b(k)d_p(k')^*\int S(t - t', r - r')(\alpha, w_k(r'))c_{pk'}(mnp)u_{mnp}(r')\exp(-i(\omega(k) - E(k'))t')\,dt'\,d^3r' \\
&\quad - e\sum_{kk'mnp}b(k)^*d_e(k')\int S(t - t', r - r')(\alpha, w_k(r')^*)c_{ek'}(mnp)u_{mnp}(r')\exp(i(\omega(k) - E(k'))t')\,dt'\,d^3r' \\
&\quad - e\sum_{kk'mnp}b(k)^*d_p(k')^*\int S(t - t', r - r')(\alpha, w_k(r')^*)c_{pk'}(mnp)u_{mnp}(r')\exp(i(\omega(k) + E(k'))t')\,dt'\,d^3r'
\end{aligned}$$
From this expression, it is clear that the characteristic frequencies of the interaction term between the electromagnetic potentials and the Dirac field, and hence the characteristic frequencies of the perturbation in the Dirac field caused by electromagnetic interaction, are $\pm\omega(k) \pm E(k')$. In terms of the compact notation introduced above, namely using the same symbol $d(k)$ for both electron and positron annihilation operators and likewise $d(k)^*$ for both electron and positron creation operators, we can write
$$\delta\psi(x) = \int S(t - t', r - r')\Big[\sum_{kk'}[b(k)d(k')h_{1kk'}(t', r') + b(k)d(k')^*h_{2kk'}(t', r') + b(k)^*d(k')h_{3kk'}(t', r') + b(k)^*d(k')^*h_{4kk'}(t', r')]\Big]dt'\,d^3r'$$
where the functions $h_{mkk'}(t, r)$ are built by superposing the functions
$$\exp(\pm i(\omega(k) \pm E(k'))t)(\alpha, w_k(r))c_{k'}(mnp)u_{mnp}(r)$$
and the same expression with $w_k(r)$ replaced by its complex conjugate $w_k(r)^*$. Here, the symbol $c_{k'}(mnp)$ stands for either $c_{ek'}(mnp)$ or $c_{pk'}(mnp)$. In particular, this expression shows that the perturbation to the Dirac field caused by electromagnetic interactions has frequencies $\pm\omega(k) \pm E(k')$, namely linear combinations of the unperturbed electromagnetic characteristic frequencies and the unperturbed Dirac characteristic frequencies. This represents a new feature of our model.
Before proceeding further, observe that we can write in the four dimensional momentum/spacetime frequency domain,
$$\delta\psi(k) = S(k)\mathcal{F}(eA_\mu^{(0)}\alpha^\mu\psi^{(0)})(k)$$
where
$$S(k) = (k^0 - (\alpha, K) - \beta m_0 + i0)^{-1}$$
is the electron propagator in the four momentum domain $k = (k^\mu) = (k^0, K)$ and $\mathcal{F}$ denotes the four dimensional spacetime Fourier transform.

Control of the quantum electromagnetic field and the Dirac field of electrons and positrons within the rectangular cavity by means of a classical electromagnetic field coming from a laser source connected to the cavity plus a classical current source coming from a probe inserted into the cavity: Let $A^c_\mu(x)$ denote the classical electromagnetic four potential from the laser and $J^c_\mu(x)$ the classical current density coming from the probe insertion. The relevant equations are
$$\Box A^\mu = -e\mu_0\psi^*\alpha^\mu\psi + \mu_0J_c^\mu, \quad (i\partial_t - (\alpha, -i\nabla) - \beta m)\psi = [-e(\alpha, A) - e(\alpha, A_c)]\psi$$
The first order perturbative solution to these equations is, with $x = (t, r)$,
$$\psi(x) = \psi^{(0)}(x) + \delta\psi(x), \quad A^r(x) = A^{r(0)}(x) + \delta A^r(x), \quad r = 1, 2, 3$$
where
$$\psi^{(0)}(x) = \sum_{kmnp}[d_e(k)c_{ek}(mnp)u_{mnp}(r)\exp(-iE(k)t) + d_p(k)^*c_{pk}(mnp)u_{mnp}(r)\exp(iE(k)t)]$$
$$A^{r(0)}(x) = \sum_k[b(k)w_k^r(r)\exp(-i\omega(k)t) + b(k)^*\bar w_k^r(r)\exp(i\omega(k)t)]$$
$$\begin{aligned}
\delta\psi(x) &= -e\int S_e(x - y)[(\alpha, A^{(0)}(y)) + (\alpha, A_c(y))]\psi^{(0)}(y)\,d^4y \\
&= -e\int S_e(x - y)[\alpha^rA_r^{(0)}(y) + \alpha^rA_r^c(y)]\psi^{(0)}(y)\,d^4y \\
&= \delta\psi_1(x) + \delta\psi_{ctr}(x),
\end{aligned}$$
$$\begin{aligned}
\delta A^r(x) &= -e\mu_0\int D(x - y)(\psi^*\alpha^r\psi)(y)\,d^4y + \mu_0\int D(x - y)J_r^c(y)\,d^4y \\
&\approx -e\mu_0\int D(x - y)(\psi^{(0)*}\alpha^r\psi^{(0)})(y)\,d^4y + \mu_0\int D(x - y)J_r^c(y)\,d^4y
\end{aligned}$$
where the classically controllable part of the Dirac field is
$$\delta\psi_{ctr}(x) = -e\int S_e(x-y)\alpha^rA^c_r(y)\psi^{(0)}(y)\,d^4y$$
and this component contains a classical field component $A^c_r$ and a quantum field component $\psi^{(0)}$, while the part of the Dirac field perturbation that is not controllable is
$$\delta\psi_1(x) = -e\int S_e(x-y)\alpha^rA_r^{(0)}(y)\psi^{(0)}(y)\,d^4y$$
On the other hand, the controllable part of the electromagnetic field is purely classical:
$$\delta A_{r,ctr}(x) = \mu_0\int D(x-y)J^c_r(y)\,d^4y$$
If we go one step further in the perturbation series, then we get additional terms in the controllable part of the electromagnetic field, so that the above equation gets modified to
$$\delta A_{r,ctr}(x) = \mu_0\int D(x-y)J^c_r(y)\,d^4y - e\mu_0\int D(x-y)\delta\psi_{ctr}(y)^*\alpha^r(\psi^{(0)} + \delta\psi_1)(y)\,d^4y - e\mu_0\int D(x-y)(\psi^{(0)*} + \delta\psi_1^*)(y)\alpha^r\delta\psi_{ctr}(y)\,d^4y$$
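Note that in this analysis, the perturbation parameter is the electron charge $e$. Since $\delta\psi_{ctr}$ and $\delta\psi_1$ are each of order $e$, the terms containing $\delta\psi_1$ are of order $e^3$. If we neglect these, then the above expression for the controllable part of the electromagnetic field simplifies to
$$\delta A_{r,ctr}(x) = \mu_0\int D(x-y)J^c_r(y)\,d^4y - e\mu_0\int D(x-y)\delta\psi_{ctr}(y)^*\alpha^r\psi^{(0)}(y)\,d^4y - e\mu_0\int D(x-y)\psi^{(0)*}(y)\alpha^r\delta\psi_{ctr}(y)\,d^4y$$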
In the particular case of the RDRA considered here, we find that the controllable part of the Dirac field has the expansion
$$\delta\psi_{ctr}(x) = -e\int S_e(x-y)\alpha^rA^c_r(y)\psi^{(0)}(y)\,d^4y$$
$$= -e\int S_e(t-t', r-r')\alpha^rA^c_r(t', r')\sum_{kmnp}[d_e(k)c_{ek}(mnp)u_{mnp}(r')\exp(-iE(k)t') + d_p(k)^*c_{pk}(mnp)u_{mnp}(r')\exp(iE(k)t')]\,dt'\,d^3r'$$
Now define the following Fourier components of the classical, laser generated control electromagnetic field with respect to the cavity boundary conditions and the energy spectrum of the free Dirac field in the cavity:
$$\int A^c_r(t', r')u_{mnp}(r')\exp(-iK.r')\exp(i\omega t')\,d^3r'\,dt' = C_{A,r}(\omega, K|m, n, p)$$
which we abbreviate as $C_r(\omega, K|mnp)$. Then we can express the above controllable part of the Dirac field in the following form in the spatiotemporal Fourier domain:
$$\int\delta\psi_{ctr}(t, r)\exp(i(\omega t - K.r))\,dt\,d^3r = \sum_{mnpk}[-e\,d_e(k)S(\omega, K)c_{ek}(mnp)\alpha^rC_r(\omega - E(k), K|mnp) - e\,d_p(k)^*S(\omega, K)c_{pk}(mnp)\alpha^rC_r(\omega + E(k), K|mnp)]$$
$$= -eS(\omega, K)\sum_{mnpk}[d_e(k)c_{ek}(mnp)\alpha^rC_r(\omega - E(k), K|mnp) + d_p(k)^*c_{pk}(mnp)\alpha^rC_r(\omega + E(k), K|mnp)]$$
The controllable part of the Dirac four current density is then given up to first order perturbation terms by ($x = (t, r)$)
$$\delta J^\mu(t, r) = -e\psi^{(0)*}(x)\alpha^\mu\delta\psi_{ctr}(x) - e\delta\psi_{ctr}(x)^*\alpha^\mu\psi^{(0)}(x)$$
and it is immediately clear from the above expression that the far field radiated electromagnetic potential generated by this controllable current field can be expressed in the form
$$\delta A^\mu_R(t, r) = \int D(t-t', r-r')\delta J^\mu(t', r')\,dt'\,d^3r'$$
$$= \sum_{mnpkr\,m'n'p'k's}d_e(k)^*d_e(k')\int C_r(\omega - E(k), K|mnp)\bar{C}_s(-\omega' - E(k'), K'|m'n'p')\,F^{\mu rs}(t, r|\omega, K, \omega', K', mnpk, m'n'p'k')\,d\omega\,d^3K\,d\omega'\,d^3K'$$
plus three other similar terms involving $d_e(k)^*d_p(k')^*$, $d_p(k)d_e(k')$, $d_p(k)d_p(k')^*$. In compact notation, the expected value of this controllable far field pattern can be expressed as a Hermitian quadratic form in the complex numbers $C_r(\omega, K|mnp)$, $\omega\in\mathbb{R}$, $K\in\mathbb{R}^3$, $m, n, p\in\mathbb{Z}_+$. These complex numbers are controllable since they represent the spatiotemporal Fourier components of the classical control electromagnetic field $A^c_\mu$.
3.10 Quantum Antennas constructed using supersymmetric field theories
Reference: Steven Weinberg, "The quantum theory of fields, vol. III, Supersymmetry", Cambridge University Press.
[1] Let $\Phi$ be a left chiral field, i.e., a function of only $\theta_L = (1 + \gamma_5)\theta/2$ and
$$x_+^\mu = x^\mu + (1/2)\theta_R^T\epsilon\gamma^\mu\theta_L$$
Let $V^A$ be gauge superfields, one for each Yang-Mills gauge group index $A$. Expand $V^A$ as
$$V^A(x, \theta) = \theta^T\epsilon\gamma^\mu\theta\,V^A_\mu(x) + \theta^T\epsilon\theta\,\theta^T\gamma_5\epsilon\lambda^A(x) + (\theta^T\epsilon\theta)^2D^A(x)$$
$V^A_\mu$ is called the gauge field, $\lambda^A$ is called the gaugino field and $D^A$ is called the auxiliary field. The transformation law of the gauge superfield under extended gauge transformations defined by an arbitrary left chiral superfield $\Omega$ is given by
$$\Gamma \to \exp(i\Omega)\,\Gamma\,\exp(-i\Omega^*)$$
Now define a left chiral spinor superfield
$$W_L = D_R^T\epsilon D_R\exp(t.V)D_L(\exp(-t.V))$$
where $t.V = t_AV^A$. Note that the gauge superfield transformation law implies
$$\exp(t.V) \to \exp(i\Omega)\exp(t.V)\exp(-i\Omega^*),$$
$$\exp(-t.V) \to \exp(i\Omega^*)\exp(-t.V)\exp(-i\Omega)$$
Then, since $\Omega$ is left chiral, $\Omega^*$ is right chiral and therefore under the gauge transformation,
$$W_L \to \exp(i\Omega)D_R^T\epsilon D_R\exp(t.V)D_L(\exp(-t.V)\exp(-i\Omega))$$
$$= \exp(i\Omega)D_R^T\epsilon D_R\exp(t.V)(D_L\exp(-t.V))\exp(-i\Omega) + \exp(i\Omega)D_R^T\epsilon D_RD_L(\exp(-i\Omega))$$
The second term is zero since $D_R^T\epsilon D_R\exp(-i\Omega) = 0$, because $D_R\exp(-i\Omega) = 0$, and $[D_R^T\epsilon D_R, D_L]$ is proportional to $\gamma^\mu\partial_\mu D_R$. Thus under gauge transformations, $W_L$ transforms as
$$W_L \to \exp(i\Omega)W_L\exp(-i\Omega)$$
which is consistent with the fact that since $W_L$ is left chiral, its gauge transform should also be left chiral. It is then easy to see that the quantity
$$Tr[W_L^T\epsilon W_L]_F = [W_L^{AT}\epsilon W_L^A]_F$$
is gauge invariant, Lorentz invariant and supersymmetry invariant, where the subscript $F$ denotes the coefficient of $\theta_L^T\epsilon\theta_L$ in the expansion of a chiral superfield. The matter superfield $\Phi$ is left chiral and has a scalar field component, a left handed Dirac field component and an auxiliary $F$ component. The Lagrangian
$$L_1 = [\Phi^*\exp(-t.V)\Phi]_D$$
is supersymmetric and also gauge invariant, since under a gauge transformation
$$\exp(-t.V) \to \exp(i\Omega^*)\exp(-t.V)\exp(-i\Omega),$$
while on the other hand,
$$\Phi \to \exp(i\Omega)\Phi,\quad \Phi^* \to \Phi^*\exp(-i\Omega^*)$$
$L_1$ describes the scalar field, the Dirac field and their interactions with the gauge field in a way that generalizes the interaction of the Dirac field with the electromagnetic field and, more generally, with non-Abelian gauge fields. If the left chiral superpotential field $f(\Phi)$ is also taken into account by using its $F$ term, then we obtain a Lagrangian for matter interacting with gauge fields that is supersymmetry, Lorentz and gauge invariant. This Lagrangian can be written down easily and we leave it as an exercise.
3.11 Quantization of the Maxwell and Dirac field in a background curved metric of space time
Let $\Gamma_\mu$ denote the spinor connection of the gravitational field. If $V_a^\mu$ is the tetrad field of the metric, i.e., $g^{\mu\nu} = \eta^{ab}V_a^\mu V_b^\nu$ where $\eta$ is the Minkowski metric, then it is known (see for example, [a] Steven Weinberg, "Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity", Wiley, or [b] Steven Weinberg, "The quantum theory of fields, vol. III, Supersymmetry", Cambridge University Press) that
$$\Gamma_\mu = (1/2)V_{a\nu:\mu}V_b^\nu[\gamma^a, \gamma^b]$$
and that Dirac's equation, in a form that is diffeomorphism invariant as well as locally Lorentz invariant, is given by
$$[\gamma^aV_a^\mu(i\partial_\mu + eA_\mu + i\Gamma_\mu) - m_0]\psi = 0$$
Local Lorentz invariance is checked by noting that if $\Lambda(x)$ is a local Lorentz transformation (i.e., a spacetime dependent Lorentz transformation matrix w.r.t. the Minkowski metric $\eta$, so that $\Lambda(x)^T\eta\Lambda(x) = \eta$), and if $D(.)$ denotes Dirac's spinor representation of the Lorentz group, then
$$V_a^\mu(x)D(\Lambda(x))\gamma^a(\partial_\mu + \Gamma_\mu(x))D(\Lambda(x))^{-1}$$
$$= V_a^\mu D(\Lambda)\gamma^aD(\Lambda)^{-1}\,D(\Lambda)(\partial_\mu + \Gamma_\mu(x))D(\Lambda(x))^{-1}$$
$$= V_a^\mu\Lambda^a{}_b\gamma^b\,D(\Lambda)(\partial_\mu + \Gamma_\mu(x))D(\Lambda(x))^{-1}$$
$$= V'^\mu_a(x)\gamma^a(\partial_\mu + \Gamma'_\mu(x))$$
where $\Gamma'_\mu(x)$ is obtained from $\Gamma_\mu(x)$ by replacing $V_a^\mu(x)$ by $V'^\mu_a(x)$ with
$$V'^\mu_a(x) = \Lambda^b{}_a(x)V_b^\mu(x)$$
This is proved by assuming $\Lambda(x) = I + \omega(x)$ to be an infinitesimal local Lorentz transformation. Note that $i\Gamma_\mu(x)$ is not a Hermitian matrix since $[\gamma^a, \gamma^b]$ is not skew-Hermitian for all $a, b = 0, 1, 2, 3$.
Remark: $\gamma^{0*} = \gamma^0$, $\gamma^{a*} = -\gamma^a$, $a = 1, 2, 3$ and hence $[\gamma^0, \gamma^a]$ is Hermitian for $a = 1, 2, 3$ while $[\gamma^a, \gamma^b]$ is skew-Hermitian for $a, b = 1, 2, 3$. However, it can be shown using integration by parts that the action functional for the Dirac field in curved spacetime defined by
$$S[\psi, \psi^*] = \int\psi^*(x)\gamma^0[i\gamma^aV_a^\mu(x)(\partial_\mu + \Gamma_\mu(x)) - m_0]\psi(x)\sqrt{-g(x)}\,d^4x$$
is real and, apart from being locally Lorentz invariant, is also diffeomorphism invariant.
The electron propagator in a curved background metric: The electron propagator is given by the formal operator theoretic expression
$$S_e = (V_a^\mu\gamma^a(i\partial_\mu + i\Gamma_\mu) - m_0)^{-1}$$
More specifically, if $S_e(x, y)$ is the position space kernel representation of the electron propagator, then
$$[i\gamma^aV_a^\mu(x)(\partial_\mu + \Gamma_\mu(x)) - m_0]S_e(x, y) = \delta^4(x - y)$$
We shall obtain an approximate solution to the electron propagator using perturbation theory. Let $S_{e0}(x - y)$ denote the unperturbed electron propagator. Then,
$$[i\gamma^a\partial_a - m_0]S_{e0}(x - y) = \delta^4(x - y)$$
with solution
$$S_{e0}(x - y) = [i\gamma^a\partial_a - m_0]^{-1}\delta^4(x - y)$$
or equivalently, using Fourier transforms,
$$S_{e0}(x - y) = (2\pi)^{-4}\int(\gamma^ap_a - m_0 + i0)^{-1}\exp(-ip.(x - y))\,d^4p$$
Let $\delta S_e(x, y)$ denote the correction to the electron propagator up to first order perturbation theory caused by the gravitational terms. Up to first order in the perturbations of the metric from the flat spacetime Minkowski metric, we have
$$V_a^\mu = \delta_a^\mu + \delta V_a^\mu(x)$$
and hence up to first order,
$$\Gamma_\mu = \delta\Gamma_\mu = (1/2)[\gamma^a, \gamma^b]\delta_a^\nu[\delta V_{b\nu,\mu} - \Gamma^\rho_{\nu\mu}\eta_{b\rho}] = (1/2)[\gamma^a, \gamma^b][\delta V_{ba,\mu} - \Gamma_{ba\mu}]$$
Note that up to first order,
$$g_{\mu\nu} = \eta_{ab}V^a_\mu V^b_\nu = \eta_{ab}(\delta^a_\mu + \delta V^a_\mu)(\delta^b_\nu + \delta V^b_\nu) = \eta_{\mu\nu} + \eta_{ab}\delta^a_\mu\delta V^b_\nu + \eta_{ab}\delta^b_\nu\delta V^a_\mu = \eta_{\mu\nu} + 2\delta V_{\mu\nu}$$
where the tetrad perturbation $\delta V_{\mu\nu}$ has been chosen to be symmetric in its two indices. In fact, if the metric is expressed as $g_{\mu\nu} = \eta_{\mu\nu} + \delta g_{\mu\nu}$ then we can choose $\delta V_{\mu\nu} = \delta g_{\mu\nu}/2$. Now writing the differential equation satisfied by the propagator as
$$[i\gamma^a(\delta_a^\mu + \delta V_a^\mu(x))(\partial_\mu + \delta\Gamma_\mu(x)) - m_0][S_{e0}(x - y) + \delta S_e(x, y)] = \delta^4(x - y)$$
we get on equating first order terms,
$$[i\gamma^\mu\partial_\mu - m_0]\delta S_e(x, y) + i\gamma^a\delta V_a^\mu(x)\partial_\mu S_{e0}(x - y) + i\gamma^\mu\delta\Gamma_\mu(x)S_{e0}(x - y) = 0$$
from which we deduce the following formula for the first order propagator correction:
$$\delta S_e(x, y) = -i\int S_{e0}(x - z)[\gamma^a\delta V_a^\mu(z)\partial_\mu S_{e0}(z - y) + \gamma^\mu\delta\Gamma_\mu(z)S_{e0}(z - y)]\,d^4z$$
$$= -i\int S_{e0}(x - z)\gamma^a[\delta V_a^\mu(z)\partial_\mu S_{e0}(z - y) + \delta\Gamma_a(z)S_{e0}(z - y)]\,d^4z$$
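In order to relate all this to quantum antennas, we must also calculate the Dirac four current density in curved space time. Consider
$$\psi^*\gamma^0[V_a^\mu\gamma^a(i\partial_\mu + i\Gamma_\mu) - m_0]\psi = 0$$
or equivalently, with $\alpha^a = \gamma^0\gamma^a$ and $\beta = \gamma^0$,
$$\psi^*[V_a^\mu\alpha^a(i\partial_\mu + i\Gamma_\mu) - m_0\beta]\psi = 0$$
or equivalently,
$$i\partial_\mu[\psi^*V_a^\mu\alpha^a\psi] - i\partial_\mu[\psi^*V_a^\mu]\alpha^a\psi + \psi^*[iV_a^\mu\alpha^a\Gamma_\mu - m_0\beta]\psi = 0$$
Taking the conjugate of this equation gives
$$-i\partial_\mu[\psi^*V_a^\mu\alpha^a\psi] + i\psi^*\alpha^a\partial_\mu[V_a^\mu\psi] + \psi^*[-iV_a^\mu\Gamma_\mu^*\alpha^a - m_0\beta]\psi = 0$$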
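The Hermiticity remark on the commutators $[\gamma^a, \gamma^b]$ and the first order relation $g_{\mu\nu} = \eta_{\mu\nu} + 2\delta V_{\mu\nu}$ used in this section can be checked numerically. The following is a minimal sketch, assuming the Dirac representation of the gamma matrices and the signature $(+,-,-,-)$; the function names are illustrative and not part of the text.

```python
import numpy as np

# Dirac representation of the gamma matrices (an assumed, standard choice).
I2 = np.eye(2)
Z2 = np.zeros((2, 2))
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


def is_hermitian(m):
    return np.allclose(m, m.conj().T)


def is_skew_hermitian(m):
    return np.allclose(m, -m.conj().T)


def metric_from_tetrad(dg):
    """g_{mu nu} = eta_{ab} V^a_mu V^b_nu with V^a_mu = delta^a_mu + eta^{a rho} dg_{rho mu} / 2.

    dg is a symmetric 4x4 array holding the metric perturbation delta g_{mu nu}.
    """
    eta = np.diag([1.0, -1.0, -1.0, -1.0])
    v = np.eye(4) + np.linalg.inv(eta) @ dg / 2.0  # v[a, mu]
    return v.T @ eta @ v
```

With a small symmetric perturbation `dg`, `metric_from_tetrad(dg)` differs from `eta + dg` only by terms quadratic in `dg`, which is the statement that the symmetric choice $\delta V_{\mu\nu} = \delta g_{\mu\nu}/2$ reproduces the metric to first order.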
3.12 Relationship between the electron self energy and the electron propagator
Consider an electron bound to its nucleus. Let $E_n$ denote its energy eigenvalue corresponding to the eigenfunction $u_n(x)$. Then, the propagator of the electron in the second quantized picture can be expressed as
$$S(x, y) = S(t, x|t', y) = \theta(t - t')\sum_nu_n(x)\bar{u}_n(y)\exp(-iE_n(t - t'))$$
To check this, we prove that $S$ satisfies the propagator differential equation
$$(i\partial_t - H)S(t, x|t', y) = i\delta^4(x - y),\quad x = (t, x),\ y = (t', y)$$
This is proved using the identities
$$\theta'(t - t') = \delta(t - t'),\quad \sum_nu_n(x)\bar{u}_n(y) = \delta^3(x - y)$$
In the frequency/energy domain, the propagator is given by
$$S(x, y|E) = \int_{\mathbb{R}}S(t, x|t', y)\exp(iE(t - t'))\,dt = i\sum_nu_n(x)\bar{u}_n(y)/(E - E_n)$$
or equivalently, in operator theoretic notation,
$$S(E) = i\sum_n|u_n\rangle\langle u_n|/(E - E_n),\quad \langle u_n|u_m\rangle = \delta_{n,m}$$
In particular,
$$\int\bar{u}_n(x)S(x, y|E)\,d^3x = i\bar{u}_n(y)/(E - E_n)$$
Then the change in the propagator caused by radiative effects, in which the energy levels get perturbed by $\delta E_n$ and correspondingly the stationary state eigenfunctions get perturbed by $\delta u_n(x)$, is given by
$$-i\delta S(x, y|E) = \sum_n[\delta u_n(x)\bar{u}_n(y) + u_n(x)\delta\bar{u}_n(y)]/(E - E_n) + \sum_nu_n(x)\bar{u}_n(y)\delta E_n/(E - E_n)^2$$
It follows, on writing the one loop radiative correction to the electron propagator as $\delta S = S\Sigma S$ and using $S|u_n\rangle = i|u_n\rangle/(E - E_n)$, that
$$-i\langle u_n|\delta S|u_n\rangle = -i\langle u_n|S\Sigma S|u_n\rangle = i\langle u_n|\Sigma|u_n\rangle/(E - E_n)^2$$
on the one hand, while on the other,
$$-i\langle u_n|\delta S|u_n\rangle = \delta E_n/(E - E_n)^2$$
where we have used the relation $\langle\delta u_n|u_n\rangle + \langle u_n|\delta u_n\rangle = 0$, which holds to first order since both $u_n$ and $u_n + \delta u_n$ are normalized, i.e., have unit norm. This gives us the fundamental relation between the change in the electron propagator caused by one loop radiative corrections and the shift in the electron energy as
$$\delta E_n = i\langle u_n|\Sigma|u_n\rangle$$
This is the extra energy gained by the electron due to propagator corrections coming from radiative as well as gravitational effects.
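The relation between the propagator correction and the level shift can be illustrated on a finite dimensional toy model. The sketch below assumes the convention $S(E) = i(E - H)^{-1}$ used in this section, for which a small Hermitian perturbation $\delta H$ gives $\delta S = S\Sigma S$ with $\Sigma = -i\delta H$ to first order, so that the first order level shift is $\langle u_n|\delta H|u_n\rangle$. The matrix sizes and numerical values are hypothetical.

```python
import numpy as np


def propagator(h, energy):
    """Frequency-domain propagator S(E) = i (E - H)^{-1} for a Hermitian matrix H."""
    n = h.shape[0]
    return 1j * np.linalg.inv(energy * np.eye(n) - h)


rng = np.random.default_rng(1)
h0 = np.diag([1.0, 2.5, 4.0])  # unperturbed levels E_n (hypothetical values)
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
dh = 1e-5 * (a + a.conj().T)   # small Hermitian perturbation delta H

energy = 0.3                   # any E away from the spectrum
s0 = propagator(h0, energy)
ds_exact = propagator(h0 + dh, energy) - s0

sigma = -1j * dh               # self energy in the convention delta S = S Sigma S
ds_first_order = s0 @ sigma @ s0

# Exact level shifts versus the first order formula <u_n| delta H |u_n>.
level_shift_exact = np.linalg.eigvalsh(h0 + dh) - np.diag(h0)
level_shift_first_order = np.real(np.diag(dh))
```

Since $h_0$ is diagonal here, the unperturbed eigenvectors are the standard basis vectors and $\langle u_n|\delta H|u_n\rangle$ is simply the $n$-th diagonal entry of `dh`.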
3.13 Electron self energy corrections induced by quantum gravitational effects
For the free gravitational field, let
$$\langle 0|T(\delta\Gamma_\mu(x)\delta\Gamma_\nu(y))|0\rangle = D_\Gamma(x, y)$$
This can be viewed as a propagator for the free quantum gravitational field. The wave equation satisfied by the Dirac field in the presence of quantum gravitational effects is given by
$$[i\gamma^a(\delta_a^\mu + \delta V_a^\mu(x))(\partial_\mu + \delta\Gamma_\mu(x)) - m_0][\psi^{(0)}(x) + \delta\psi(x)] = 0$$
$\psi^{(0)}$ is the free electron-positron wave operator field. It satisfies the zeroth order perturbation equation
$$[i\gamma^\mu\partial_\mu - m_0]\psi^{(0)} = 0$$
and its solution is expressible as a superposition of the electron annihilation operators and the positron creation operators in momentum space. $\delta\psi$ is the correction to the free Dirac field caused by gravitational effects up to first order. It satisfies the first order perturbation equation
$$[i\gamma^\mu\partial_\mu - m_0]\delta\psi(x) + i\gamma^a\delta V_a^\mu(x)\partial_\mu\psi^{(0)}(x) + i\gamma^\mu\delta\Gamma_\mu(x)\psi^{(0)}(x) = 0$$
i.e., $\delta\psi(x)$ satisfies the same differential equation as the first order perturbation $\delta S_e(x, y)$ in the electron propagator, and its solution is given by
$$\delta\psi(x) = -i\int S_{e0}(x - y)[\gamma^a\delta V_a^\mu(y)\partial_\mu\psi^{(0)}(y) + \gamma^\mu\delta\Gamma_\mu(y)\psi^{(0)}(y)]\,d^4y$$
and hence the approximate corrected electron propagator up to linear order in the graviton propagator is given by
$$S_e(x, y) = \langle 0|T\{(\psi^{(0)}(x) + \delta\psi(x)).(\psi^{(0)}(y) + \delta\psi(y))^*\}|0\rangle$$
$$= S_{e0}(x - y) + \langle 0|T\{\psi^{(0)}(x).\delta\psi(y)^*\}|0\rangle + \langle 0|T\{\delta\psi(x).\psi^{(0)*}(y)\}|0\rangle + \langle 0|T\{\delta\psi(x).\delta\psi(y)^*\}|0\rangle$$
$$= S_{e0}(x - y) + \langle 0|T\{\delta\psi(x).\delta\psi(y)^*\}|0\rangle$$
where the two cross terms drop out because they are linear in the gravitational field perturbations, whose vacuum expectation is taken to be zero, and with
$$\langle 0|T\{\delta\psi(x).\delta\psi(y)^*\}|0\rangle = \int\langle 0|T\{S_{e0}(x - u)[\gamma^a\delta V_a^\mu(u)\partial_\mu\psi^{(0)}(u) + \gamma^\mu\delta\Gamma_\mu(u)\psi^{(0)}(u)].[\delta V_b^\nu(v)\partial_\nu\psi^{(0)}(v)^*\gamma^{b*} + \psi^{(0)}(v)^*\delta\Gamma_\nu(v)^*\gamma^{\nu*}]S_{e0}(y - v)^*\}|0\rangle\,d^4u\,d^4v$$
Chapter 4
Miscellaneous Problems
4.1 A problem in robotics
Let $R_k(t)$, $k = 1, 2, ..., N$ denote the positions of moving point objects which have to be picked up by a robot. More generally, instead of point objects, we can assume extended rigid bodies moving on the ground or in space which have to be picked up by the robot. We have a camera in synchronization with the robot, whose location (i.e., the position of its centre of mass) is $R(t)$. The camera takes pictures of the point objects and also of the robot, and a digital computer calculates the distances and bearings of the images of the objects relative to that of the robot. It accordingly generates control torques that manipulate the robot so that it moves closer to one of the objects, say the $m$th one. That is, the robot uses the error between the image of the $m$th object and its own image to generate a control torque signal that enables it to track this object, reduce the error in the object's position relative to the robot to zero, and finally pick the object up. This series of jobs is performed successively on the different objects so that finally all the objects are picked up.
The algorithm is based on gradient descent and can be described as follows. Let $I(R_k(t), x, y)$ denote the image field on the camera screen generated by the $k$th object at time $t$ and let $I(R(t), q(t), x, y)$ be the image field on the camera screen generated by the robot at time $t$, whose centre of mass is located at $R(t)$ and whose link angles relative to a given direction are denoted by $q(t)$. The computer calculates the error energy
$$E_k(t, R(t), q(t)) = \int_{screen}(I(R_k(t), x, y) - I(R(t), q(t), x, y))^2\,dx\,dy$$
and then moves the robot using a force and torque that cause the robot's location and link angles to change after a small time $\delta t$ to
$$R(t) + \delta R(t),\quad q(t) + \delta q(t)$$
where
$$\delta R(t) = -\mu.\delta t\,\nabla_{R(t)}E_k(t, R(t), q(t)),$$
$$\delta q(t) = -\mu.\delta t\,\nabla_{q(t)}E_k(t, R(t), q(t))$$
Our aim will be to generalize this model to the tracking and picking up of rigid bodies and even nonrigid extended bodies. The force and torque generation mechanisms are based on Newtonian mechanics:
$$F(t) = MR''(t),\quad \tau(t) = J(q(t))q''(t) + N(q(t), q'(t))$$
where $M$, $J$ are respectively the robot mass and its mass moment of inertia matrix, and $N(q(t), q'(t))$ consists of centrifugal, Coriolis, frictional and gravitational potential contributions to the computed torques. In practice, these control forces and torques are generated by discretizing the time derivatives with a time step of $\delta t$:
$$q''(t) \approx (q(t) - 2q(t - \delta t) + q(t - 2\delta t))/(\delta t)^2,$$
$$R''(t) \approx (R(t) - 2R(t - \delta t) + R(t - 2\delta t))/(\delta t)^2$$
We propose in our project to do a noise analysis of this algorithm based on Varadhan's large deviation theory. We also propose to do a robustness analysis of this problem, i.e., to determine how sensitive the tracking error energy is to errors induced in camera imaging and to errors induced in the digital computer by finite register effects while computing the control forces and torques. This work is to be regarded as an extension of a paper [1] based on the gradient search algorithm. [1] does not consider a mathematical analysis of noise effects on the algorithm. We propose to do such an analysis by adding WGN to the right hand side of the gradient algorithm, thereby obtaining a nonlinear stochastic difference equation, and we shall use the standard techniques based on mean and variance propagation to analyze these effects [6]. The gradient algorithm used to compute the forces and torques takes into account only the instantaneous error. In our project, we shall also consider generalizations of it based on the past error history.
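The gradient update for the robot position can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the setup of [1]: the images are modelled as Gaussian blobs on a square screen, the link angles $q(t)$ are omitted, the object is held fixed, and the gradient of $E_k$ is estimated by central differences. All names and parameter values are hypothetical.

```python
import numpy as np

# Camera screen grid (hypothetical size and resolution).
xs = np.linspace(-5.0, 5.0, 64)
X, Y = np.meshgrid(xs, xs, indexing="ij")
CELL = (xs[1] - xs[0]) ** 2


def image(center, width=1.5):
    """Toy image field: a Gaussian blob centred at `center` (assumed model)."""
    return np.exp(-((X - center[0]) ** 2 + (Y - center[1]) ** 2) / (2 * width ** 2))


def error_energy(robot_pos, object_pos):
    """E_k = integral over the screen of (I_object - I_robot)^2."""
    diff = image(object_pos) - image(robot_pos)
    return np.sum(diff ** 2) * CELL


def gradient(robot_pos, object_pos, h=1e-4):
    """Central-difference estimate of the gradient of E_k w.r.t. the robot position."""
    grad = np.zeros(2)
    for i in range(2):
        step = np.zeros(2)
        step[i] = h
        grad[i] = (error_energy(robot_pos + step, object_pos)
                   - error_energy(robot_pos - step, object_pos)) / (2 * h)
    return grad


def track(robot_pos, object_pos, mu=2.0, dt=0.05, steps=200):
    """Iterate delta R = -mu * dt * grad_R E_k and record the error energy."""
    pos = np.array(robot_pos, dtype=float)
    history = [error_energy(pos, object_pos)]
    for _ in range(steps):
        pos = pos - mu * dt * gradient(pos, object_pos)
        history.append(error_energy(pos, object_pos))
    return pos, history


final_pos, energies = track([-1.0, -0.5], np.array([1.0, 0.5]))
```

Adding a noise term to the right hand side of the update inside `track` would turn this into the stochastic difference equation whose mean and variance propagation the project proposes to study.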
4.2 More on root space decomposition of a semisimple Lie algebra
Let $g$ be a semisimple Lie algebra and let $h$ be a Cartan subalgebra of it. Then, $h$ is Abelian, $ad(h)$ is an Abelian family of semisimple linear operators on $g$ and hence can be simultaneously diagonalized. Thus we get the root space decomposition of $g$ as
$$g = h \oplus \bigoplus_{\alpha\in\Delta}g_\alpha$$
where $\Delta$ is a finite subset of $h^*$, none of whose elements is zero (the zero eigenspace of $ad(h)$ is precisely $h$ itself since $h$ is maximal Abelian), and
$$ad(H)(X) = [H, X] = \alpha(H)X\ \ \forall H\in h \text{ iff } X\in g_\alpha$$
In other words, for any $\alpha\in\Delta$, $g_\alpha$ is the eigenspace of $ad(H)$ corresponding to the eigenvalue $\alpha(H)$ for every $H\in h$. An element of $\Delta$ is called a root. We now claim that
$$\Delta = -\Delta$$
In fact, we have for any $X\in g_\alpha$, $Y\in g_\beta$ the identity
$$B([H, X], Y) = -B(X, [H, Y]),\quad H\in h$$
and hence
$$(\alpha(H) + \beta(H))B(X, Y) = 0$$
Further,
$$\alpha(H')B(H, X) = B(H, [H', X]) = B([H, H'], X) = 0,\quad H, H'\in h$$
and since $\alpha\in h^*$ is nonzero, it follows that
$$B(H, X) = 0\ \ \forall H\in h$$
In other words, we have proved two things: one, that $h\perp g_\alpha\ \forall\alpha\in\Delta$, and two, that $\beta\neq-\alpha$ implies that $g_\alpha\perp g_\beta$. From these two facts and the above root space decomposition of $g$, it easily follows that if there is an $\alpha\in\Delta$ such that $-\alpha\notin\Delta$, then $g_\alpha\perp g$, which contradicts the nondegeneracy of $B(., .)$. This proves that $\Delta = -\Delta$.
Next, we prove that $\dim g_\alpha = 1\ \forall\alpha\in\Delta$ and $g_{k\alpha} = 0$, $k > 1$, $\alpha\in\Delta$. Indeed, consider the subspace $V$ of $g$ defined by
$$V = span\{Y\}\oplus h\oplus g_\alpha\oplus g_{2\alpha}\oplus ...\oplus g_{k\alpha}\oplus ...$$
the series terminating after a finite number of steps since $g$ and hence $V$ are finite dimensional. Here, $Y\in g_{-\alpha}$ is any nonzero element. Note that for any $\alpha, \beta\in\Delta$, we have
$$[g_\alpha, g_\beta]\subset g_{\alpha+\beta}$$
and
$$[g_\alpha, g_{-\alpha}]\subset h$$
The first follows from Jacobi's identity and the second from Jacobi's identity and the maximal Abelian property of $h$. Now choose $0\neq X\in g_\alpha$. Then $ad(X)$ leaves $V$ invariant and $ad(Y)$ also leaves $V$ invariant. Hence $ad(H)$ with $H = [X, Y]\in h$ also leaves $V$ invariant. But $ad(H) = [ad(X), ad(Y)]$ and hence $Tr(ad(H)|_V) = 0$. Thus we get
$$0 = -\alpha(H) + \alpha(H)\dim(g_\alpha) + 2\alpha(H)\dim(g_{2\alpha}) + ...$$
Choosing $H\in h$ so that $\alpha(H)\neq 0$, we get
$$0 = -1 + \dim(g_\alpha) + 2\dim(g_{2\alpha}) + ...$$
and this easily results in
$$\dim(g_\alpha) = 1,\quad \dim(g_{k\alpha}) = 0,\ k\geq 2$$
Now let $\alpha, \beta\in\Delta$. We look at the root spaces $g_{\alpha+k\beta}$, $k = -p, -p+1, ..., q-1, q$ where $\{\alpha + k\beta : k = -p, -p+1, ..., q-1, q\}$ is a maximal chain, i.e., $\alpha + k\beta$, $k = -p, -p+1, ..., q-1, q$ are all roots, i.e., elements of $\Delta$, but $\alpha + (q+1)\beta$ and $\alpha - (p+1)\beta$ are not roots. By maximality, it is clear that no other chain of this sort can overlap with this chain. Note that we have already proved that $\dim(g_{\alpha+k\beta}) = 1$, $k = -p, ..., q$. Our immediate aim is to show that $V = \bigoplus_{k=-p}^{q}g_{\alpha+k\beta}$ is a vector space on which the adjoint representation of the $sl(2, \mathbb{C})$ Lie algebra generated by $\{H_\beta, X_\beta, X_{-\beta}\}$, restricted to $V$, is an irreducible representation of $sl(2, \mathbb{C})$. To construct such a triplet for a given root $\alpha$, let $X_\alpha\in g_\alpha$ and $X_{-\alpha}\in g_{-\alpha}$ be nonzero elements whose normalizations will be fixed below. Then, we have
$$[X_\alpha, X_{-\alpha}] = c\bar{H}_\alpha$$
for some $\bar{H}_\alpha\in h$ and hence,
$$cB(\bar{H}_\alpha, H) = B([X_\alpha, X_{-\alpha}], H) = B(X_\alpha, [X_{-\alpha}, H]) = \alpha(H)B(X_\alpha, X_{-\alpha}),\quad H\in h$$
We choose $\bar{H}_\alpha$ so that
$$B(\bar{H}_\alpha, H) = \alpha(H),\quad H\in h$$
This is possible in view of the above equation by taking
$$c = B(X_\alpha, X_{-\alpha})$$
Note that since $B$ is nonsingular on $g$, $X_\alpha$ is orthogonal to both $h$ and to $g_\beta\ \forall\beta\neq-\alpha$, and $X_{-\alpha}$ is a nonzero element of the one dimensional vector space $g_{-\alpha}$, it follows that $B(X_\alpha, X_{-\alpha})\neq 0$. Thus with the above choice of $c$, we get that
$$[X_\alpha, X_{-\alpha}] = B(X_\alpha, X_{-\alpha})\bar{H}_\alpha$$
Now we define
$$H_\alpha = B(X_\alpha, X_{-\alpha})\bar{H}_\alpha$$
and obtain
$$[X_\alpha, X_{-\alpha}] = H_\alpha$$
and then,
$$B(H_\alpha, H_\alpha) = B(H_\alpha, [X_\alpha, X_{-\alpha}]) = B([H_\alpha, X_\alpha], X_{-\alpha}) = \alpha(H_\alpha)B(X_\alpha, X_{-\alpha})$$
If we choose the normalizations of $X_\alpha, X_{-\alpha}$ so that
$$B(X_\alpha, X_{-\alpha}) = 2/B(\bar{H}_\alpha, \bar{H}_\alpha) = 2/\alpha(\bar{H}_\alpha)$$
then we get
$$H_\alpha = 2\bar{H}_\alpha/\alpha(\bar{H}_\alpha) = 2\bar{H}_\alpha/B(\bar{H}_\alpha, \bar{H}_\alpha)$$
and then we get
$$[H_\alpha, X_\alpha] = \alpha(H_\alpha)X_\alpha = 2X_\alpha,\quad [H_\alpha, X_{-\alpha}] = -\alpha(H_\alpha)X_{-\alpha} = -2X_{-\alpha}$$
and as before,
$$[X_\alpha, X_{-\alpha}] = H_\alpha$$
In other words, the triplet $\{H_\alpha, X_\alpha, X_{-\alpha}\}$ forms a standard set of generators of an $sl(2, \mathbb{C})$ Lie algebra.
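The normalization conditions above can be checked on the simplest example, $g = sl(2)$ itself with its single positive root $\alpha$, $\alpha(H) = 2$. The sketch below computes the Killing form $B(A, C) = Tr(ad(A)ad(C))$ in the basis $\{H, X, Y\}$ with $X = X_\alpha$, $Y = X_{-\alpha}$; the helper names are illustrative assumptions.

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [0.0, 0.0]])   # spans g_alpha
Y = np.array([[0.0, 0.0], [1.0, 0.0]])   # spans g_{-alpha}
BASIS = [H, X, Y]


def bracket(a, b):
    """Lie bracket [a, b] of two matrices."""
    return a @ b - b @ a


def coords(m):
    """Coefficients (h, x, y) of a traceless 2x2 matrix m = h*H + x*X + y*Y."""
    return np.array([m[0, 0], m[0, 1], m[1, 0]])


def ad(a):
    """Matrix of ad(a) in the basis (H, X, Y); column j holds coords([a, BASIS[j]])."""
    return np.column_stack([coords(bracket(a, b)) for b in BASIS])


def killing(a, b):
    """Killing form B(a, b) = Tr(ad(a) ad(b))."""
    return np.trace(ad(a) @ ad(b))


# H_bar is defined by B(H_bar, H') = alpha(H') on the Cartan subalgebra span{H};
# with alpha(H) = 2 and B(H, H) = 8 this gives H_bar = H / 4.
h_bar = H / 4.0
```

Here $B(X, Y) = 4 = 2/B(\bar{H}_\alpha, \bar{H}_\alpha)$, so the standard matrices already satisfy the normalization chosen in the text, and $B(X, Y)\bar{H}_\alpha$ reproduces $H$.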
4.3 A project proposal for developing an experimental setup for transmitting quantum states over a channel in the presence of an eavesdropper
In quantum computation and information theory, it is by now a well established fact that a qubit state, and more generally a $d$-qubit state (i.e., a pure state in $\mathbb{C}^{2^d}$), can be transmitted over a channel from $A$ to $B$ by transmitting just $2d$ classical bits, provided that $A$ and $B$ share a maximally entangled state, i.e., a state of the form $d^{-1/2}\sum_{k=0}^{d-1}|k, k\rangle$. The idea is simply to append the state to be transmitted to the maximally entangled state at $A$'s end, then perform a unitary transformation on the total $2d$-qubit state of $A$, and perform a measurement at $A$'s end, thereby causing $B$'s state to collapse to one of $2^{2d}$ possible $d$-qubit states. When $A$ then reports his measurement outcome to $B$ via $2d$ classical bits, $B$ is able to apply an appropriate unitary gate at his end to recover the original state that $A$ had intended to transmit.
In quantum information theory, another important problem is the Cq problem, in which $A$ wishes to transmit classical information over a quantum channel by encoding his classical bits in the form of quantum states. Thus, if $A$'s classical information source is the alphabet $\mathcal{A} = \{1, 2, ..., a\}$, with the letter $k$ occurring with probability $p(k)$, then the total information contained in this source that $A$ wishes to transmit is $H(A) = H(p) = -\sum_{x=1}^{a}p(x)\log(p(x))$. $A$ encodes the letter $x\in\mathcal{A}$ in the form of a density matrix $\rho(x)$ (i.e., a mixed state in a finite dimensional Hilbert space $\mathcal{H}$), and transmits this state over the channel, assumed to be noiseless.
The state received by $B$ is then $\rho(x)$ and the average state received by $B$ is $\bar\rho = \sum_{x\in\mathcal{A}}p(x)\rho(x)$. It is natural to expect that the total information that $A$ has transmitted to $B$ must be given by
$$I_p(A, B) = H(B) - H(B|A) = H(\bar\rho) - \sum_{x\in\mathcal{A}}p(x)H(\rho(x))$$
where
$$H(W) = -Tr(W.\log(W))$$
is the von Neumann entropy of the state $W$. In order for this to be a meaningful measure of the information transmitted, we can ask the following question: Suppose $A$ encodes a string of his source letters $x = (x(1), ..., x(n))\in\mathcal{A}^n$ into the state $\rho(x) = \rho(x(1))\otimes...\otimes\rho(x(n))$ in the tensor product Hilbert space $\mathcal{H}^{\otimes n}$. Can he then choose $M_n$ distinct sequences $x_1, ..., x_{M_n}$ such that
(a) these sequences are all typical for $A$'s source w.r.t. the probability distribution $p_n(x) = p(x(1))...p(x(n))$,
(b) there exist positive "detection operators" $D_1, ..., D_{M_n}$ for $B$ in the Hilbert space $\mathcal{H}^{\otimes n}$ such that for any $\epsilon > 0$ and $n$ sufficiently large, one has
$$D_1 + ... + D_{M_n}\leq I,\quad Tr(\rho(x_k)D_k) > 1 - \epsilon,\ k = 1, 2, ..., M_n$$
and
$$Tr(D_k)\leq Tr(E(x_k, n, \delta)),\ k = 1, 2, ..., M_n$$
where $E(x, n, \delta)$ is a $\delta$-typical projection on $\mathcal{H}^{\otimes n}$ corresponding to the situation when $x$ is a $\delta$-typical sequence for $A$'s source? These requirements amount to saying that the detection operator $D_k$ of $B$ does not have too large a dimension, so that it does not "leak" into another sequence $x_j$, $j\neq k$, when $x_k$ is transmitted, and further that, with a large probability, $B$'s decision on what sequence $A$ had transmitted is correct when he uses his detection operators. The question is then whether, if $M_n$ is maximal subject to these requirements, so that the rate of reliable transmission of information (i.e., with error probability smaller than $\epsilon$) is $\log(M_n)/n$, one has $\lim\log(M_n)/n = \sup_pI_p(A, B)$. In other words, the maximum rate of reliable transmission of information on a Cq channel is precisely the Cq channel capacity defined by
$$C = \sup_pI_p(A, B) = \sup_p\Big(H\Big(\sum_xp(x)\rho(x)\Big) - \sum_xp(x)H(\rho(x))\Big)$$
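The quantity $I_p(A, B)$ defined above is straightforward to evaluate numerically for a given ensemble. The following is a minimal sketch that computes it in nats for two hypothetical qubit ensembles; the function names and the example states are assumptions made for illustration only.

```python
import numpy as np


def von_neumann_entropy(rho):
    """H(W) = -Tr(W log W) in nats, computed from the eigenvalues of W."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log(evals)))


def holevo_information(probs, states):
    """I_p(A, B) = H(sum_x p(x) rho(x)) - sum_x p(x) H(rho(x))."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(probs, states))


def pure_state(vec):
    """Density matrix |v><v| of the normalized vector v."""
    vec = np.asarray(vec, dtype=complex)
    vec = vec / np.linalg.norm(vec)
    return np.outer(vec, vec.conj())


probs = [0.5, 0.5]
orthogonal = [pure_state([1, 0]), pure_state([0, 1])]    # perfectly distinguishable letters
overlapping = [pure_state([1, 0]), pure_state([1, 1])]   # non-orthogonal letters

info_orthogonal = holevo_information(probs, orthogonal)
info_overlapping = holevo_information(probs, overlapping)
```

For orthogonal encoding states the result equals the source entropy $H(p)$, while non-orthogonal encodings give strictly less. Evaluating the capacity $C$ would additionally require maximizing `holevo_information` over `probs`, which is not attempted here.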
Our project will involve verifying this capacity formula by preparing the states $\rho(x)$, $x\in\mathcal{A}$ using lasers and ions, so that by shining the laser on an ion, we can generate excited ion states used for transmission. In other words, one of the primary objectives of our experimental setup will be to prepare a large class of quantum states using the quantum electromagnetic field generated by a laser interacting with ions. If the ion and the laser field start in an initial state $|k, \phi(u)\rangle$, where $|k\rangle$ represents a stationary state of the ion and $|\phi(u)\rangle$ a coherent state of the laser, then after interacting with each other for a time duration $T$, the final state of the ion and the laser field will be the pure state $U(T)|k, \phi(u)\rangle$, and by partially tracing this out over the laser field state, the ion state becomes the mixed state
$$\rho_{ion}(k, u, T) = Tr_2(U(T)|k, \phi(u)\rangle\langle k, \phi(u)|U(T)^*)$$
By varying $k, u, T$, namely the initial state of the ion, the coherent state of the laser field and the time duration of interaction of these two, we can thus generate a host of mixed states on the Hilbert space of the ion. These mixed states can be used for transmission.
Another application of our experimental setup will be to create entangled states between three people $A, B, E$, or more generally a mixed state $\rho_{ABE}$ in the tensor product Hilbert space $\mathcal{H}_{ABE} = \mathcal{H}_A\otimes\mathcal{H}_B\otimes\mathcal{H}_E$, and to transmit maximal information from $A$ to $B$ while restricting the information transmitted from $A$ to $E$ to be a minimum. Let $A$ have a classical source with alphabet $\mathcal{A}$ and source probability distribution $p(x)$, $x\in\mathcal{A}$, and for each $x\in\mathcal{A}$ let $\rho_{BE}(x)$ be a state in $\mathcal{H}_{BE} = \mathcal{H}_B\otimes\mathcal{H}_E$. Then the information transmitted from $A$ to $B$ is given by
$$I(A, B) = H\Big(\sum_xp(x)\rho_B(x)\Big) - \sum_xp(x)H(\rho_B(x))$$
while the information transmitted from $A$ to $E$ is given by
$$I(A, E) = H\Big(\sum_xp(x)\rho_E(x)\Big) - \sum_xp(x)H(\rho_E(x))$$
where
$$\rho_B(x) = Tr_E(\rho_{BE}(x)),\quad \rho_E(x) = Tr_B(\rho_{BE}(x))$$
The problem is to select the probability distribution $p(x)$, $x\in\mathcal{A}$ for $A$'s source and the states $\rho_{BE}(x)$, $x\in\mathcal{A}$ so that $I(A, B) - I(A, E)$ is a maximum, i.e., to transmit maximum information across the Cq channel from $A$ to $B$ while keeping the Cq information transmitted to $E$ a minimum. Such a setup can be arranged using lasers and ion trap experiments as follows: Let $A$ generate a current $I(t, x)$ dependent upon his source letter $x\in\mathcal{A}$ and let him connect this current to a classical antenna that transmits electromagnetic waves to both $B$ and $E$. The magnetic vector potential at $B$ is then given by
$$A_B(t, \xi) = \int_{S_A}G(t - s, R_B + \xi, u)J(s, u, x)\,ds\,du$$
while the magnetic vector potential at $E$'s end is given by
$$A_E(t, \xi) = \int_{S_A}G(t - s, R_E + \xi, u)J(s, u, x)\,ds\,du$$
where $G(t - s, R, u)$ is the standard retarded potential Green's function between the point $u$ on $A$'s antenna surface and the point $R$ of reception of the field. $R_B$ is the location of the nucleus of $B$'s atomic receiver while $R_E$ is the location of the nucleus of $E$'s atomic receiver. $J(t, u, x)$ is the surface current density on $A$'s antenna surface $S_A$. It depends upon the letter $x$ that $A$ wishes to transmit and hence, from basic antenna theory, it can be expressed as
$$J(t, u, x) = \int F(t - s, u)I(s, x)\,ds,\quad x\in\mathcal{A}$$
where the function $F(t, u)$, $t\in\mathbb{R}$, $u\in S_A$ depends only upon the antenna surface geometry and the point on this surface where the current source $I(t, x)$ is fed in. There is an interaction potential $V_{BE}$ between the systems used by $B$ and $E$ and hence the Schrodinger equation for the joint state $\rho_{BE}$ when $A$ transmits the symbol $x$ has the form
$$i\rho'_{BE}(t) = [H_{BE}(t), \rho_{BE}(t)]$$
where
$$H_{BE}(t) = H_B(t) + H_E(t) + V_{BE}$$
with
$$H_B(t) = (p_B + eA_B(t, \xi_B))^2/2m - Ze^2/|\xi_B| - e\Phi_B(t, \xi_B),$$
and
$$H_E(t) = (p_E + eA_E(t, \xi_E))^2/2m - Ze^2/|\xi_E| - e\Phi_E(t, \xi_E),$$
$$p_B = -i\nabla_{\xi_B},\quad p_E = -i\nabla_{\xi_E}$$
When $A$ uses a quantum antenna source to transmit a quantum electromagnetic field, the fields $A_B, A_E$ also become quantum fields, and then the solution of the above Schrodinger equation for $\rho_{BE}(t)$ must be partially traced out over the coherent state of the bath field to obtain the "system part" of $\rho_{BE}(t)$. Note that since the surface current density $J(t, u, x)$ in $A$'s antenna depends upon the symbol $x$ that $A$ wishes to transmit, it follows that $\rho_{BE}(t) = \rho_{BE}(t, x)$ will also depend upon the symbol $x$, and then the Cq approach mentioned above can be applied to design $A$'s antenna and his source probability distribution $p(x)$ for maximal transmission of information from $A$ to $B$ while keeping the information that is leaked into $E$'s receiver at a minimum.
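The figure of merit $I(A, B) - I(A, E)$ of this section combines partial traces with the entropy difference defined earlier. The sketch below evaluates it for a given ensemble $\{p(x), \rho_{BE}(x)\}$ on $\mathcal{H}_B\otimes\mathcal{H}_E$; the two-letter qubit example is hypothetical and is not derived from the antenna model.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy -Tr(rho log rho) in nats."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))


def partial_trace(rho_be, dim_b, dim_e, keep):
    """Reduced state of rho_BE on H_B (x) H_E; `keep` is 'B' or 'E'."""
    r = rho_be.reshape(dim_b, dim_e, dim_b, dim_e)
    if keep == "B":
        return np.einsum("iaja->ij", r)   # trace over E
    return np.einsum("aiaj->ij", r)       # trace over B


def holevo(probs, states):
    """H(sum_x p(x) rho(x)) - sum_x p(x) H(rho(x))."""
    avg = sum(p * s for p, s in zip(probs, states))
    return entropy(avg) - sum(p * entropy(s) for p, s in zip(probs, states))


def information_gap(probs, states_be, dim_b, dim_e):
    """I(A, B) - I(A, E) for the ensemble {p(x), rho_BE(x)}."""
    rho_b = [partial_trace(s, dim_b, dim_e, "B") for s in states_be]
    rho_e = [partial_trace(s, dim_b, dim_e, "E") for s in states_be]
    return holevo(probs, rho_b) - holevo(probs, rho_e)


# Hypothetical example: B receives the letter exactly, E receives only noise.
zero = np.diag([1.0, 0.0])
one = np.diag([0.0, 1.0])
noise = np.eye(2) / 2.0
states = [np.kron(zero, noise), np.kron(one, noise)]
gap = information_gap([0.5, 0.5], states, 2, 2)
```

In this idealized example $E$'s reduced state does not depend on $x$, so $I(A, E) = 0$ and the gap equals the full source entropy. The design problem described in the text is to approach such a situation by the choice of $p(x)$ and of the antenna.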
4.4
A problem in Lie group theory
Let H, X, Y be the standard generators of sl(2, C), ie, [H, X] = 2X, [H, Y ] = −2Y, [X, Y ] = H. Then define for t, x, y ∈ R, g(t, x, y) = exp(tH + xX + yY ) and express ∂t g(t, x, y) = g(t, x, y)(a(1)H + a(2)X + a(3)Y ),
∂x g(t, x, y) = g(t, x, y)(b(1)H + b(2)X + b(3)Y ),
∂y g(t, x, y) = g(t, x, y)(c(1)H + c(2)X + c(3)Y )
Evaluate a(k), b(k), c(k), k = 1, 2, 3. Hint: Use the following formula for the differential of the exponential map:
$$\partial_t \exp(A + tB) = \exp(A + tB)\, f(\mathrm{ad}(A + tB))(B)$$
where $f(z) = (1 - \exp(-z))/z$.
Hence, express the Haar measure on SL(2, R) in terms of t, x, y, ie if µ is the Haar measure, then dµ(g(t, x, y)) = F (t, x, y)dtdxdy and evaluate the Haar density F . For doing this, you must use the formulas obtained to express g(t, x, y)H, g(t, x, y)X, g(t, x, y)Y as linear combinations of ∂t g(t, x, y), ∂x g(t, x, y), ∂y g(t, x, y)
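The hint can be checked numerically before attempting the closed-form evaluation. The sketch below is only an illustration under stated assumptions: it realises H, X, Y as the usual 2x2 matrices, uses truncated power series for both the exponential and for f(ad(C)), and all helper names (`expm`, `f_ad`, `coefficients`, `finite_difference_error`) are hypothetical, not part of the text.

```python
# Numerical sanity check of the differential-of-exp formula on sl(2).
# Pure-Python 2x2 matrices; every helper name here is illustrative.

H = [[1.0, 0.0], [0.0, -1.0]]
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(ca, a, cb, b):
    """Entrywise ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def bracket(a, b):
    return lin(1.0, matmul(a, b), -1.0, matmul(b, a))

def expm(a, terms=40):
    """Matrix exponential by truncated power series (adequate for small norms)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = lin(1.0, result, 1.0, term)
    return result

def f_ad(c, b, terms=40):
    """f(ad(c))(b) with f(z) = (1 - exp(-z))/z = sum_k (-z)^k / (k+1)!."""
    n = len(c)
    total = [[0.0] * n for _ in range(n)]
    power, coeff = b, 1.0          # ad(c)^k (b) and (-1)^k / (k+1)!
    for k in range(terms):
        total = lin(1.0, total, coeff, power)
        power = bracket(c, power)
        coeff = -coeff / (k + 2)
    return total

def algebra_element(t, x, y):
    return [[t, x], [y, -t]]       # tH + xX + yY

def coefficients(t, x, y, direction):
    """Coefficients in (H, X, Y) of g^{-1} times the derivative of g along `direction`."""
    m = f_ad(algebra_element(t, x, y), direction)
    return m[0][0], m[0][1], m[1][0]

def finite_difference_error(t, x, y, eps=1e-5):
    """Largest entrywise gap between a central difference in t and g * f(ad)(H)."""
    g = expm(algebra_element(t, x, y))
    numeric = lin(0.5 / eps, expm(algebra_element(t + eps, x, y)),
                  -0.5 / eps, expm(algebra_element(t - eps, x, y)))
    predicted = matmul(g, f_ad(algebra_element(t, x, y), H))
    return max(abs(numeric[i][j] - predicted[i][j])
               for i in range(2) for j in range(2))

a = coefficients(0.3, 0.5, -0.2, H)    # a(1), a(2), a(3) at a sample point
b = coefficients(0.3, 0.5, -0.2, X)    # b(1), b(2), b(3)
c = coefficients(0.3, 0.5, -0.2, Y)    # c(1), c(2), c(3)
```

Stacking a, b, c as the rows of a 3x3 matrix gives the matrix that relates ∂t g, ∂x g, ∂y g to gH, gX, gY at the sample point; this is the matrix to be inverted in the last step of the problem.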
Chapter 5
More Problems in Linear Algebra and Functional Analysis
Course titles: [A] Matrix theory, [B] Antennas and wave propagation, [C] Probability theory. Course instructor: Harish Parthasarathy
5.1 Riesz representation theorem
[1] Prove Riesz' representation theorem in the following form: Let $X = C[0, 1]$, the space of continuous functions on $[0, 1]$ with the sup norm, ie,
$$\| x \| = \sup\{|x(t)| : t \in [0, 1]\}, \quad x \in X$$
Show that $(X, \| \cdot \|)$ is a Banach space.
[2] Let $(X, d)$ be a compact metric space and let $C(X)$ denote the space of all continuous functions on $X$. Prove that $C(X)$ under the sup norm is a Banach space. Let $\psi : C(X) \to \mathbb{R}$ be a continuous linear functional on $C(X)$. Then, show that there exists a unique signed measure $\mu$ on the Borel subsets of $X$ such that
$$\psi(f) = \int_X f(x)\, d\mu(x), \quad f \in C(X)$$
Show further that
$$\| \psi \| = \sup(|\psi(f)| : \| f \| \le 1) = |\mu|(X)$$
where
$$\| f \| = \sup(|f(x)| : x \in X)$$
and $|\mu|$ is the variation norm of $\mu$, ie, corresponding to the Hahn decomposition $X = X_+ \cup X_-$, $X_+ \cap X_- = \emptyset$, so that $\mu$ is nonnegative on all the Borel subsets of $X_+$ and nonpositive on all the Borel subsets of $X_-$, we define
$$|\mu|(E) = \mu(X_+ \cap E) - \mu(X_- \cap E)$$
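On a finite set every function is continuous and every signed measure is a list of point masses, so the identity ‖ψ‖ = |µ|(X) can be seen directly. The following sketch uses a made-up measure `mu`; the names `variation`, `psi` and `f_star` are illustrative assumptions only.

```python
# Hypothetical signed measure on X = {0, 1, 2, 3}, given by point masses.
mu = {0: 0.5, 1: -1.5, 2: 2.0, 3: -0.25}

# Hahn decomposition: mu >= 0 on subsets of X_plus, mu <= 0 on subsets of X_minus.
X_plus = {p for p, mass in mu.items() if mass >= 0}
X_minus = set(mu) - X_plus

def variation(E):
    """|mu|(E) = mu(X_plus & E) - mu(X_minus & E)."""
    return sum(mu[p] for p in E & X_plus) - sum(mu[p] for p in E & X_minus)

def psi(f):
    """psi(f) = integral of f with respect to mu; here a finite sum."""
    return sum(f[p] * mu[p] for p in mu)

# A function of sup norm 1 at which |psi| attains |mu|(X):
# +1 on X_plus and -1 on X_minus.
f_star = {p: 1.0 if p in X_plus else -1.0 for p in mu}
```

Here `psi(f_star)` and `variation(set(mu))` agree, while any other f with sup norm at most 1 gives a value of |ψ(f)| no larger than this.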
5.2 Lie's theorem on solvable Lie algebras
Let $g$ be a solvable Lie algebra, ie, for some finite positive integer $p$, we have
$$D^{p+1} g = 0, \quad D^p g \ne 0$$
where if $S$ is any subset of $g$, then $DS$ denotes the linear span of the elements $[X, Y]$, $X, Y \in S$. Let $\rho$ be any finite dimensional representation of $g$ in a complex vector space $V$. Then, we claim that there exists a nonzero vector $v \in V$ and a linear functional $\lambda$ on $g$, ie, $\lambda \in g^*$, such that
$$\rho(X)v = \lambda(X)v, \quad \forall X \in g$$
We prove this theorem by induction on $\dim g$. Since $g$ is solvable, it is clear that $Dg$ is a proper subspace of $g$ (in fact $Dg$ is an ideal in $g$). Thus, we can choose a subspace $h$ of $g$ such that $\dim h = \dim g - 1$, or equivalently, $\dim(g/h) = 1$, and such that
$$Dg \subset h \subset g$$
It is clear that $h$ is an ideal in $g$ since
$$[X, h] \subset Dg \subset h, \quad X \in g$$
In particular, $Dh \subset Dg \subset h$, or in other words, $h$ is a proper Lie subalgebra of $g$ having codimension one in $g$. Hence, $\rho$ restricted to $h$ is a representation. Since $h$ is again solvable, the induction hypothesis shows that there is a nonzero $w_0 \in V$ such that
$$\rho(Y)w_0 = \lambda(Y)w_0, \quad \forall Y \in h$$
for some $\lambda \in h^*$. Now since $h$ has codimension one in $g$, it is clear that we can choose an $X_0 \in g$, $X_0 \notin h$, and then it follows that
$$g = \{cX_0 + Y : c \in \mathbb{C}, Y \in h\} = \langle X_0 \rangle + h$$
Now let $q$ be the largest nonnegative integer for which $\{\rho(X_0)^k w_0 : 0 \le k \le q\}$ forms a linearly independent set. Then, $\rho(X_0)^{q+1} w_0$ is linearly dependent on $\{\rho(X_0)^k w_0 : 0 \le k \le q\}$ and by successive application of $\rho(X_0)$, it follows that for any $m > q$, $\rho(X_0)^m w_0$ is also linearly dependent upon $\{\rho(X_0)^k w_0 : 0 \le k \le q\}$. Define
$$w_k = \rho(X_0)^k w_0, \quad k = 0, 1, ..., q, \qquad W_m = \mathrm{span}\{w_0, w_1, ..., w_m\}, \quad m = 0, 1, ..., q$$
Since then $\rho(X_0)w_k = w_{k+1}$, $k = 0, 1, ..., q - 1$ and $\rho(X_0)w_q \in W_q$, it follows that $W_q$ is $\rho(X_0)$-invariant. We have further,
$$\rho(Y)\rho(X_0)w_k = [\rho([Y, X_0]) + \rho(X_0)\rho(Y)]w_k, \quad Y \in h \qquad (1)$$
Now consider the proposition $P_k$ defined by
$$\rho(Y)w_k = \lambda(Y)w_k \bmod W_{k-1}, \quad Y \in h$$
$P_0$ is true with $W_{-1} = 0$. Since $h$ is an ideal in $g$, we have $[Y, X_0] \in h$ for $Y \in h$, and hence $P_k$ implies in view of (1) that
$$\rho(Y)w_{k+1} = \lambda([Y, X_0])w_k + \lambda(Y)w_{k+1} \bmod W_k, \quad Y \in h \qquad (2)$$
so that $P_{k+1}$ is also true. Thus, by induction, it follows that $P_k$ is true for $k = 0, 1, ..., q$, ie,
$$\rho(Y)w_k = \lambda(Y)w_k \bmod W_{k-1}, \quad k = 0, 1, ..., q, \quad Y \in h$$
In particular, $W_q$ is $\rho(h)$-invariant and further,
$$\mathrm{Tr}(\rho(Y)|_{W_q}) = (q + 1)\lambda(Y), \quad Y \in h \qquad (3)$$
Now, we have seen that $W_q$ is $\rho(X_0)$-invariant and also $\rho(Y)$-invariant and hence, since $[Y, X_0] \in h$, we have
$$\mathrm{Tr}(\rho([Y, X_0])|_{W_q}) = \mathrm{Tr}([\rho(Y), \rho(X_0)]|_{W_q}) = 0, \quad Y \in h$$
Thus from (3), with $[Y, X_0]$ in place of $Y$, we have
$$\lambda([Y, X_0]) = 0, \quad Y \in h$$
Repeating the induction that led to (2), now with exact equalities and $\lambda([Y, X_0]) = 0$, we get
$$\rho(Y)w_{k+1} = \lambda(Y)w_{k+1}, \quad k = 0, 1, ..., q - 1, \quad Y \in h$$
and hence
$$\rho(Y)w_k = \lambda(Y)w_k, \quad k = 0, 1, ..., q, \quad Y \in h$$
In other words, for any $Y \in h$, $\rho(Y)$ acts on $W_q$ as multiplication by $\lambda(Y)$ times the identity operator. Further, since $W_q$ is $\rho(X_0)$-invariant, and the field $\mathbb{C}$ is algebraically closed, we can choose a nonzero vector $v \in W_q$ such that
$$\rho(X_0)v = c_1 v$$
for some complex number $c_1$. Now, we extend the domain of definition of $\lambda$ from $h$ to the whole of $g$ by setting
$$\lambda(cX_0 + Y) = c c_1 + \lambda(Y), \quad Y \in h$$
and then the proof of the result is complete.
Now from this result we deduce that if $g$ is a solvable Lie algebra and $\rho$ is a finite dimensional representation of $g$ in a complex vector space $V$, then we can choose a basis $B$ for $V$ such that for each $X \in g$, $[\rho(X)]_B$ is an upper triangular matrix. In fact, using Lie's theorem, we first choose a nonzero vector $v_1 \in V$ and a $\lambda_1 \in g^*$ so that
$$\rho(X)v_1 = \lambda_1(X)v_1, \quad X \in g$$
Then define $V_1 = \langle v_1 \rangle$ and observe that since $V_1$ is $\rho$-invariant, $\rho$ induces in a natural way a representation $\rho_1$ of $g$ on $V/V_1$. Applying Lie's theorem again to this representation, we get an element $v_2 \in V$, $v_2 \notin V_1$ such that $\rho(X)v_2 - \lambda_2(X)v_2 \in V_1$, $X \in g$ for some $\lambda_2 \in g^*$. Define $V_2 = \langle v_1, v_2 \rangle$. Then we have
$$\rho(X)v_2 = \lambda_2(X)v_2 \bmod V_1, \quad V_1 \subset V_2$$
In general, suppose we have constructed linearly independent vectors $v_1, ..., v_k$ and linear functionals $\lambda_j \in g^*$, $j = 1, 2, ..., k$ such that
$$\rho(X)v_j = \lambda_j(X)v_j \bmod \langle v_1, ..., v_{j-1} \rangle, \quad j = 1, 2, ..., k, \quad X \in g$$
Then, the vector space $V_k = \langle v_1, ..., v_k \rangle$ is clearly $\rho$-invariant. If $V_k = V$, the proof is complete, while if $V_k \ne V$, then $\rho$ induces in a natural way a representation of $g$ on the nonzero vector space $V/V_k$ and hence by application of Lie's theorem, we can select a $v_{k+1} \notin V_k$ and a $\lambda_{k+1} \in g^*$ such that
$$\rho(X)v_{k+1} - \lambda_{k+1}(X)v_{k+1} = 0 \bmod V_k$$
Then set $V_{k+1} = \langle v_1, ..., v_k, v_{k+1} \rangle$ and the induction proceeds further. This process will terminate in a finite number of steps since $V$ is finite dimensional.
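A small concrete case may help fix ideas. The sketch below takes the two-dimensional non-abelian Lie algebra spanned by two 2x2 matrices A, B with [A, B] = B (an assumed example, not taken from the text), and checks that the first standard basis vector is a common eigenvector whose weight vanishes on the derived algebra, as the proof predicts. All names are illustrative.

```python
# Assumed example: g = span{A, B} with [A, B] = B, so Dg = span{B} and D(Dg) = 0.
A = [[1, 0], [0, 0]]
B = [[0, 1], [0, 0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def is_eigenvector(a, v, weight):
    """True when a v = weight * v."""
    return matvec(a, v) == [weight * x for x in v]

v = [1, 0]                     # candidate common eigenvector
weights = {"A": 1, "B": 0}     # lambda(A) = 1, lambda(B) = 0
```

In the basis (e1, e2) both A and B are already upper triangular, which is the triangular form guaranteed by the corollary; note also that λ vanishes on B = [A, B], matching the step λ([Y, X0]) = 0 of the proof.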
5.3 Engel's theorem on nilrepresentations of a Lie algebra
Let $g$ be a (finite dimensional) Lie algebra and let $\rho$ be a nilrepresentation of $g$ in a finite dimensional complex vector space $V$, ie, $\rho(X)$ is nilpotent for all $X \in g$. Then, there exists a nonzero vector $v \in V$ such that $\rho(X)v = 0$ $\forall X \in g$. Hence, there exists a basis $B$ for $V$ such that $[\rho(X)]_B$ is strictly upper triangular for all $X \in g$. In particular, if $g$ is a nilpotent Lie algebra, ie, $\mathrm{ad}(X)$ is nilpotent for all $X \in g$, then there is a basis $B$ for $g$ such that $[\mathrm{ad}(X)]_B$ is strictly upper triangular for every $X \in g$ and consequently, if $n = \dim g$ and $X_1, ..., X_n \in g$ are arbitrary, then
$$\mathrm{ad}(X_1)\,\mathrm{ad}(X_2) \cdots \mathrm{ad}(X_n) = 0$$
Solution: Let
$$a = \mathrm{Ker}(\rho) = \{X \in g : \rho(X) = 0\}$$
Then, $a$ is an ideal in $g$ since if $X \in g$, $Y \in a$, then $\rho(Y) = 0$ and therefore, $\rho([X, Y]) = [\rho(X), \rho(Y)] = 0$, implying that $[X, Y] \in a$. So it is meaningful to speak of the Lie algebra $g' = g/a$ with the bracket defined by
$$[X + a, Y + a] = [X, Y] + a, \quad X, Y \in g$$
Note that the bracket is well defined since $X' + a = X + a$, $Y' + a = Y + a$ imply
$$U = X' - X, \quad V = Y' - Y \in a$$
which imply
$$[X', Y'] = [X + U, Y + V] = [X, Y] + Z$$
where $Z = [X, V] + [U, Y] + [U, V] \in a$ since $a$ is an ideal. Thus, $[X', Y'] + a = [X, Y] + a$, proving thereby that the bracket on the vector space $g/a$ is well defined. The necessary properties of the bracket, namely bilinearity, skew-symmetry and the Jacobi identity, immediately follow from the same properties of the bracket on $g$. Now, clearly the Lie algebra $g/a$ is isomorphic with the Lie algebra $\rho(g)$ via
the isomorphism $\rho_1$ that maps $X + a$ to $\rho(X)$ for $X \in g$. For example, the injectivity of $\rho_1$ follows from the fact that $\rho(X) = 0$ implies $X \in a$, ie, $X + a = a$. Surjectivity of $\rho_1$ is obvious. That $\rho_1$ preserves the bracket follows from
$$\rho_1([X + a, Y + a]) = \rho_1([X, Y] + a) = \rho([X, Y]) = [\rho(X), \rho(Y)] = [\rho_1(X + a), \rho_1(Y + a)]$$
Note that $\rho_1$ is well defined because $X + a = X' + a$ implies $X' - X \in a$ implies $\rho(X') - \rho(X) = \rho(X' - X) = 0$. In other words, $\rho_1$ is a faithful representation of $g'$ in $V$. Since the elements of $\rho(g)$ are all nilpotent, it follows from the above isomorphism that $g' = g/a$ is a nilpotent Lie algebra, ie, all the elements of $\mathrm{ad}(g')$ are nilpotent. To see this clearly, we observe that if $U, V \in g'$, then
$$\rho_1 \circ \mathrm{ad}(U)(V) = \rho_1([U, V]) = [\rho_1(U), \rho_1(V)] = \mathrm{ad}(\rho_1(U))(\rho_1(V))$$
or equivalently,
$$\rho_1 \circ \mathrm{ad}(U) = \mathrm{ad}(\rho_1(U)) \circ \rho_1$$
on $g'$, or equivalently,
$$\mathrm{ad}(U) = \rho_1^{-1} \circ \mathrm{ad}(\rho_1(U))|_{\rho(g)} \circ \rho_1$$
on $g'$. Since $\mathrm{ad}(\rho_1(U))$ is nilpotent on $L(V)$ because $\rho_1(U)$ is nilpotent on $V$, it follows that $\mathrm{ad}(U)$ is also nilpotent on $g'$.
Remark: if $A$ is a linear nilpotent operator on a vector space $W$, then $\mathrm{ad}(A)$ is nilpotent on $L(W)$. This follows from the identity
$$\mathrm{ad}(A)^n(B) = (L_A - R_A)^n(B) = \sum_{r=0}^{n} \binom{n}{r} L_A^r (-R_A)^{n-r}(B) = \sum_{r=0}^{n} \binom{n}{r} (-1)^{n-r} A^r B A^{n-r}, \quad \forall B \in L(W)$$
where $L_A$ and $R_A$ denote left and right multiplication by $A$, which commute. If $A^m = 0$, then for $n = 2m - 1$ every term has either $r \ge m$ or $n - r \ge m$ and hence vanishes.
In view of the above remarks, we can assume without any loss of generality that g is a nilpotent Lie algebra and ρ is a nilrepresentation of g in order to prove the existence of a nonzero vector v such that ρ(g)v = 0.
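The remark on the nilpotency of ad(A) is easy to test on a small matrix. In the sketch below, A is an arbitrarily chosen strictly upper triangular 3x3 matrix (so A^3 = 0) and B is an arbitrary test matrix; both, and the helper `ad_power`, are illustrative assumptions.

```python
# Check that ad(A)^(2m-1) = 0 when A^m = 0, for an assumed 3x3 example (m = 3).
A = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]     # strictly upper triangular, A^3 = 0
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]     # arbitrary test matrix

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def ad_power(a, b, n):
    """Apply ad(a) to b a total of n times."""
    for _ in range(n):
        b = bracket(a, b)
    return b

zero = [[0] * 3 for _ in range(3)]
fifth = ad_power(A, B, 5)      # 2m - 1 = 5 brackets: must vanish
fourth = ad_power(A, B, 4)     # one fewer: need not vanish
```

For this choice the fourth power is still nonzero (only the term with r = 2 in the binomial identity survives), so the exponent 2m - 1 cannot be lowered in general.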
5.4 Aperture antenna pattern fluctuations
Consider a surface antenna with the surface equation $z = f(x, y)$. Let $E_i(x, y, z)$ be an electric field at fixed frequency $\omega$ that is incident upon this surface.
[a] Justify that the surface magnetic current density on the antenna surface is given by
$$M_s(x, y) = -\hat{n} \times E_i(x, y, f(x, y))$$
where
$$\hat{n} = (-f_{,x}(x, y)\hat{x} - f_{,y}(x, y)\hat{y} + \hat{z})/\sqrt{1 + f_{,x}^2 + f_{,y}^2}$$
where
$$f_{,x} = \frac{\partial f}{\partial x}, \quad f_{,y} = \frac{\partial f}{\partial y}$$
[b] Show that the differential surface area element on the antenna surface is given by
$$dS(x, y) = \sqrt{1 + f_{,x}^2 + f_{,y}^2}\; dx\, dy$$
[c] Show that the far field electric vector potential radiated by the antenna surface aperture is given by
$$F(r) = (\epsilon/4\pi)(\exp(-jkr)/r) \int M_s(x', y') \exp(jk\hat{r}\cdot(x'\hat{x} + y'\hat{y} + f(x', y')\hat{z}))\, dS(x', y')$$
[d] Hence, if the surface fluctuates by a small amount so that its new equation is $z = f(x, y) + \delta f(x, y)$, then evaluate $\delta F(r)$ and hence $\delta E(r)$, the radiated fields in the far field zone, as a linear functional of $\delta f$. hint:
$$\delta\sqrt{1 + f_{,x}^2 + f_{,y}^2} = (f_{,x}\,\delta f_{,x} + f_{,y}\,\delta f_{,y})/\sqrt{1 + f_{,x}^2 + f_{,y}^2}$$
Hence, evaluate $\delta\hat{n}(x, y)$, $\delta dS(x, y)$ in terms of $\delta f(x, y)$ and its partial derivatives.
[6] If $\delta f(x, y)$ in the previous problem is a random function with mean zero and correlations
$$E(\delta f(x, y)\cdot\delta f(x', y')) = R_{ff}(x, y|x', y')$$
then evaluate the correlations in the far field pattern fluctuations.
[7] In problem [5], evaluate the total power radiated out by the aperture surface antenna in the far field zone. hint: In the far field zone, the electric field up to $O(1/r)$ is
$$E = -\nabla \times F/\epsilon = jk\hat{r} \times F/\epsilon$$
and the corresponding far field zone magnetic field is given by
$$-j\omega\mu H = \nabla \times E = -jk\hat{r} \times E$$
Now calculate the far field Poynting vector field $(1/2)\mathrm{Re}(E \times H^*)$ in the far field zone up to $O(1/r^2)$, take the radial component, multiply it by the surface element $dS = r^2 d\Omega$ where $d\Omega$ is the solid angle differential element and integrate the result over all solid angles to get the total radiated power.
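The first-order variations needed in part [d] can be checked against finite differences at a single surface point. The sketch below is an illustration only: the slope values are made up, and the expression used for the variation of the unit normal is the one obtained by differentiating n̂ = N/w with N = (-f,x, -f,y, 1) and w the square-root factor, which is an assumption of this sketch rather than a formula quoted from the text.

```python
import math

def area_factor(fx, fy):
    """w = sqrt(1 + fx^2 + fy^2), so that dS = w dx dy."""
    return math.sqrt(1.0 + fx * fx + fy * fy)

def unit_normal(fx, fy):
    w = area_factor(fx, fy)
    return (-fx / w, -fy / w, 1.0 / w)

def delta_area_factor(fx, fy, dfx, dfy):
    """First-order variation of w, as given in the hint."""
    return (fx * dfx + fy * dfy) / area_factor(fx, fy)

def delta_unit_normal(fx, fy, dfx, dfy):
    """First-order variation of N / w: dN / w - n * dw / w."""
    w = area_factor(fx, fy)
    dw = delta_area_factor(fx, fy, dfx, dfy)
    n = unit_normal(fx, fy)
    raw = (-dfx, -dfy, 0.0)
    return tuple(raw[i] / w - n[i] * dw / w for i in range(3))

# Made-up slopes and small perturbations of the slopes at one point.
fx, fy, dfx, dfy = 0.3, -0.4, 1e-6, 2e-6

area_gap = abs(area_factor(fx + dfx, fy + dfy) - area_factor(fx, fy)
               - delta_area_factor(fx, fy, dfx, dfy))
n_new, n_old = unit_normal(fx + dfx, fy + dfy), unit_normal(fx, fy)
dn = delta_unit_normal(fx, fy, dfx, dfy)
normal_gap = max(abs(n_new[i] - n_old[i] - dn[i]) for i in range(3))
```

Both gaps are of second order in the perturbation, which is what makes δF a linear functional of δf to the order considered.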
5.5 Spectral theorem using Gelfand-Naimark theorem
[8] Prove the spectral theorem for bounded normal operators on a Hilbert space using the Gelfand-Naimark theorem on the spectrum of a commutative Banach algebra combined with the Riesz representation theorem for continuous functionals on the space of continuous functions on a compact metric space.
hint: Let $H$ be a Hilbert space and $A$ a commutative Banach subalgebra of $B(H)$. Let $\Delta$ denote the space of continuous homomorphisms from $A$ into $\mathbb{C}$. Thus, $h \in \Delta$ means that $h : A \to \mathbb{C}$ is a continuous map such that
$$h(TS) = h(T)h(S), \quad h(I) = 1, \quad h(aT + bS) = ah(T) + bh(S), \quad a, b \in \mathbb{C}, \; T, S \in A$$
Note that if $T \in A$ is invertible, then it follows that
$$h(T^{-1}) = h(T)^{-1}$$
If $T \in A$, then $\lambda \in \sigma(T)$ (the spectrum of $T$) iff $\lambda - T$ is noninvertible in $A = B(H)$. Note that if $T \in A$, then for $\lambda \in \mathbb{C}$, $\lambda - T \in A$; in particular, this is in $B(H)$ and hence defined on the whole of $H$, so if it is invertible, then by the open mapping theorem, $(\lambda - T)^{-1}$ is bounded, ie, in $A$, which means that $\lambda \notin \sigma(T)$. Thus, for $T \in A$, we have that $\lambda \in \sigma(T)$ iff $(\lambda - T)$ is noninvertible. If $\lambda \notin \sigma(T)$ and $h \in \Delta$, then
$$h((\lambda - T)^{-1})\,h(\lambda - T) = 1$$
implies that $h(T) \ne h(\lambda) = \lambda$ ($h(\lambda)$ is an abbreviation for $h(\lambda.e) = \lambda.h(e) = \lambda$). Equivalently, if $h(T) = \lambda$ for some $h \in \Delta$, then $\lambda \in \sigma(T)$ and this implies that $(\lambda - T)$ is not invertible. Note that if we write $\hat{\lambda}(h) = h(\lambda)$, $\lambda \in \mathbb{C}$, then $\hat{\lambda}(h) = \lambda$ $\forall h \in \Delta$ $\forall \lambda \in \mathbb{C}$ and $\hat{T}(h) = h(T)$, $h \in \Delta$, $T \in A$. Then, it follows that $\lambda \in \sigma(T)$ iff $h(T) = \hat{T}(h) = \lambda$ for some $h \in \Delta$.
Remark: If $\lambda \in \sigma(T)$, then $\lambda - T$ is not invertible and hence $(\lambda - T)^{-1}$ is not defined. Choose a sequence $\lambda_n \to \lambda$ such that $\lambda_n - T$ is invertible for every $n$ and then we get
$$h((\lambda_n - T)^{-1})\,h(\lambda_n - T) = 1$$
which implies
$$h((\lambda_n - T)^{-1})(\lambda_n - h(T)) = 1 \quad \forall n$$
Now since $\lambda - T$ is not invertible and $\lambda_n \to \lambda$, it follows that there must exist an $h \in \Delta$ such that
$$\limsup_n |h((\lambda_n - T)^{-1})| = \infty$$
and therefore
$$h(T) = \lim \lambda_n = \lambda$$
for such an h. Remark: Another way to see this is as follows: Suppose X = λ − T is not invertible. Then, I = {Y X : Y ∈ A} is an ideal in A that does not contain the identity element e and hence I is contained in a maximal proper ideal of A. Thus, there exists an h ∈ Δ that vanishes on this ideal and in particular, h(X) = 0. From this, it follows that h(T ) = h(λ) = λ. In other words, we have proved that σ(T ) = {h(T ) : h ∈ Δ} = Tˆ(Δ) Note that if h ∈ Δ, then λµ = h(λµ) = h(λ)h(µ), λ, µ ∈ C.
Now let T be a normal operator in B(H), ie, T is bounded and it commutes
with $T^*$. Consider the commutative Banach algebra $A$ generated by $\{T, T^*\}$. Choose $x, y \in H$ and consider the mapping $\hat{T} \to \langle x, Ty \rangle$ from $C(\Delta)$ into $\mathbb{C}$. This mapping is well defined since if $T, S \in A$ and $\hat{T} = \hat{S}$, then $h(T) = h(S)$ $\forall h \in \Delta$ and hence $h(T - S) = 0$ $\forall h \in \Delta$, which implies that $\| T - S \| = 0$, ie, $T = S$.
Remark: if $A$ is a commutative Banach algebra and $x \in A$, then with $\Delta$ denoting the space of continuous homomorphisms on $A$, we have that
$$\| x \| = \sup\{|h(x)| : h \in \Delta\} = \sup\{|\hat{x}(h)| : h \in \Delta\} = \| \hat{x} \|$$
(The supremum here is the spectral radius of $x$; its equality with the norm uses the $C^*$-structure of $A$, as in the algebra generated by a normal $T$ and $T^*$.)
Remark: We have seen that the spectrum $\sigma(T)$ of $T$ can be identified with the range of the continuous function $\hat{T}$ on $\Delta$. In fact, we have seen that if $A$ is a commutative Banach algebra, then
$$\sigma(T) = \hat{T}(\Delta) = \{h(T) : h \in \Delta\}$$
For example, if $T$ is a bounded Hermitian operator in the Hilbert space $H$ and if $A$ is the Banach algebra of all bounded functions of $T$, then from the spectral representation of $T$, for any bounded function $f$ on $\mathbb{R}$, we have
$$f(T) = \int f(\lambda)\, dE(\lambda) = \int \lambda\, dE(f^{-1}(-\infty, \lambda]) \qquad (1)$$
We note that if $S_1, S_2 \in A$,
$$S_k = f_k(T) = \int f_k(\lambda)\, dE(\lambda), \quad k = 1, 2$$
for some bounded measurable functions $f_k$, $k = 1, 2$ on $\mathbb{R}$ and therefore, if $h \in \Delta$ where $\Delta$ is the space of continuous homomorphisms from $A$ into $\mathbb{C}$, then
$$h(S_1 S_2) = \int f_1(\lambda) f_2(\lambda)\, h(dE(\lambda)) = \int f_1(\lambda) f_2(\lambda)\, dh(E(\lambda))$$
on the one hand and on the other,
$$h(S_1 S_2) = h(S_1)h(S_2) = \int\!\!\int f_1(\lambda) f_2(\mu)\, h(dE(\lambda)).h(dE(\mu)) = \int\!\!\int f_1(\lambda) f_2(\mu)\, dh(E(\lambda))\, dh(E(\mu))$$
and comparing these two expressions, we can infer that
$$dh(E(\lambda)).dh(E(\mu)) = dh(E(\lambda))\,\delta_{\lambda,\mu}$$
which is equivalent to saying that
$$h(E(B_1)).h(E(B_2)) = h(E(B_1 \cap B_2)), \quad B_1, B_2 \in \mathcal{B}(\mathbb{R})$$
This is in agreement with the homomorphism property of $h : A \to \mathbb{C}$:
$$h(E(B_1 \cap B_2)) = h(E(B_1).E(B_2)) = h(E(B_1)).h(E(B_2)), \quad B_1, B_2 \in \mathcal{B}(\mathbb{R})$$
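Now, suppose we pick a $\lambda \in \sigma(T)$ and define $h_\lambda(T) = \lambda$, and more generally, for
$$S = \int f(\lambda)\, dE(\lambda)$$
we define
$$h_\lambda(S) = f(\lambda)$$
Then
$$h_\lambda(S_1 S_2) = f_1(\lambda) f_2(\lambda) = h_\lambda(S_1) h_\lambda(S_2),$$
$$h_\lambda(c_1 S_1 + c_2 S_2) = c_1 f_1(\lambda) + c_2 f_2(\lambda) = c_1 h_\lambda(S_1) + c_2 h_\lambda(S_2)$$
and hence $h_\lambda : A \to \mathbb{R}$ is a homomorphism, ie, $h_\lambda \in \Delta$. It is clear from the above formulas that
$$\sup\{|h_\lambda(S)| : \lambda \in \sigma(T)\} = \sup\{|f(\lambda)| : \lambda \in \sigma(T)\} = \| S \|$$
More generally, if $h \in \Delta$ is arbitrary, ie, $h : A \to \mathbb{C}$ is a homomorphism, then for
$$S = \int f(\lambda)\, dE(\lambda)$$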
we find that
$$h(S) = \int f(\lambda)\, dh(E(\lambda))$$
where the integral is taken over $\lambda \in \sigma(T)$. We now observe that for any Borel function $g$ on $\mathbb{R}$, we have that
$$h(g(T)) = g(h(T)), \quad h \in \Delta$$
where $g(T)$ is defined as an operator via the spectral theorem. This result follows from
$$h(T^n) = h(T)^n, \quad n = 0, 1, ...$$
and hence by taking linear combinations and limits,
$$h(g(T)) = g(h(T))$$
In particular, taking $g$ as the function which maps $T$ to $E(\lambda)$, ie, $g(x) = \chi_{(-\infty,\lambda]}(x)$, we get
$$h(E(\lambda)) = \chi_{(-\infty,\lambda]}(h(T))$$
and hence,
$$h(S) = \int f(\lambda)\, d_\lambda \chi_{(-\infty,\lambda]}(h(T)) = \int f(\lambda)\, d\theta(\lambda - h(T)) = f(h(T))$$
where $\theta$ denotes the unit step function. This is consistent with our above observation. Thus,
$$\| S \| = \sup\{|f(\lambda)| : \lambda \in \sigma(T)\} = \sup\{|f(h(T))| : h \in \Delta\}$$
which is once again consistent with our previous observations regarding the definition of the spectrum of $T$ as the set of all $\lambda$ for which $\lambda - T$ is not invertible and our result that this spectrum is also equal to the set of all $h(T)$ as $h$ varies over $\Delta$.
Now we return to the commutative Banach algebra $A$ generated by $T, T^*$ where $T$ is a bounded normal operator. The map $\hat{S} \to \langle x, Sy \rangle$ from $C(\Delta)$ into $\mathbb{C}$ is a bounded linear functional; in fact,
$$\| S \| = \sup\{|h(S)| : h \in \Delta\} = \sup\{|\hat{S}(h)| : h \in \Delta\} = \| \hat{S} \|$$
where the last norm is the standard sup norm on the space $C(\Delta)$. Then, we get from the Riesz representation theorem
$$\langle x, Sy \rangle = \int_\Delta \hat{S}(h)\, d\mu_{x,y}(h), \quad S \in A$$
for some measure $\mu_{x,y}$ on $\Delta$. Replacing $S$ by $f(T)$ gives us
$$\langle x, f(T)y \rangle = \int_\Delta \Phi(f)(h)\, d\mu_{x,y}(h)$$
where we have defined
$$\Phi(f) = \hat{S} \in C(\Delta), \quad S = f(T)$$
where $f$ is a bounded Borel function on $\mathbb{R}$. Now writing $S_1 = f_1(T)$, $S_2 = f_2(T)$, we get
$$\Phi(f_1 f_2)(h) = (S_1 S_2)\hat{\ }(h) = h(S_1 S_2) = h(S_1)h(S_2) = \hat{S}_1(h)\hat{S}_2(h) = \Phi(f_1)(h)\Phi(f_2)(h) = (\Phi(f_1)\Phi(f_2))(h), \quad h \in \Delta$$
and hence
$$\Phi(f_1 f_2) = \Phi(f_1)\Phi(f_2)$$
for all bounded Borel functions $f_1, f_2$ on $\mathbb{R}$. In particular, if $B_1, B_2$ are bounded Borel subsets of $\mathbb{R}$, then we get
$$\Phi(\chi_{B_1 \cap B_2}) = \Phi(\chi_{B_1}).\Phi(\chi_{B_2})$$
We also have the following useful identities:
$$\int \Phi(f_1)(h)\, d\mu_{f_2(T)x,y}(h) = \langle f_2(T)x, f_1(T)y \rangle = \langle x, f_2(T)^* f_1(T)y \rangle = \langle f_1(T)^* x, f_2(T)^* y \rangle = \mu_{f_1(T)^* x, f_2(T)^* y}(\Delta)$$
In fact, suppose $S \in A$, ie, $S$ is a Borel function of $T, T^*$. Then,
$$\langle x, STy \rangle = \langle S^* x, Ty \rangle = \langle x, TSy \rangle = \langle T^* x, Sy \rangle$$
and hence,
$$\mu_{x,STy}(\Delta) = \mu_{S^* x,Ty}(\Delta) = \mu_{x,TSy}(\Delta) = \mu_{T^* x,Sy}(\Delta)$$
For $S = f(T, T^*)$ we define
$$\Psi(f) = \hat{S} \in C(\Delta)$$
and then get
$$\langle x, Sy \rangle = \int_\Delta h(S)\, d\mu_{x,y}(h)$$
Replacing $S$ by $\chi_B(S)$ where $B$ is a bounded Borel set in $\mathbb{R}$ gives us
$$\langle x, \chi_B(S)y \rangle = \int_\Delta h(\chi_B(S))\, d\mu_{x,y}(h)$$
and thus, if $B_1, B_2$ are two bounded Borel subsets of $\mathbb{R}$, we have
$$\langle x, \chi_{B_1 \cap B_2}(S)y \rangle = \int_\Delta h(\chi_{B_1}(S))\,h(\chi_{B_2}(S))\, d\mu_{x,y}(h)$$
Remark: To relate this circle of ideas to the spectral theorem for normal operators, we must transform the integral over $\Delta$ to an integral over $\sigma(T) = \hat{T}(\Delta)$. Since $\hat{T}(\Delta) = \{h(T) : h \in \Delta\} = \sigma(T)$, it follows that
$$\langle x, Ty \rangle = \int_\Delta \hat{T}(h)\, d\mu_{x,y}(h) = \int_{\sigma(T)} \lambda\, d(\mu_{x,y} \circ \hat{T}^{-1})(\lambda)$$
and more generally,
$$\langle x, f(T)y \rangle = \int_\Delta \Phi(f)(h)\, d\mu_{x,y}(h)$$
with
$$\Phi(f_1 f_2) = \Phi(f_1)\Phi(f_2), \quad \Phi(cf_1 + f_2) = c\Phi(f_1) + \Phi(f_2)$$
More importantly,
$$\langle x, f(T)y \rangle = \int_\Delta h(f(T))\, d\mu_{x,y}(h)$$
Now if $f(\lambda) = \lambda^n$, then $h(f(T)) = h(T^n) = h(T)^n = f(h(T))$ and since $h$ is linear, it follows that if $f$ is any polynomial, then $h(f(T)) = f(h(T))$, and by taking limits and using the continuity of $h$ we get that this identity is true even if $f$ is any bounded Borel function. Thus,
$$\langle x, f(T)y \rangle = \int_\Delta f(h(T))\, d\mu_{x,y}(h) = \int_\Delta f \circ \hat{T}(h)\, d\mu_{x,y}(h) = \int_{\sigma(T)} f(\lambda)\, d(\mu_{x,y} \circ \hat{T}^{-1})(\lambda)$$
which is precisely the content of the spectral theorem once we are able to show that for any bounded Borel set $B$ in $\mathbb{R}$, we can write
$$\mu_{x,y}(\hat{T}^{-1}(B)) = \langle x, E_T(B)y \rangle$$
where $E_T$ is a spectral measure on the Borel subsets of $\mathbb{R}$ with values in the space of orthogonal projection operators on $H$.
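The finite dimensional case makes the objects above completely explicit. The following worked example is a sketch under assumptions not made in the text: $H = \mathbb{C}^n$ with standard basis $e_1, ..., e_n$, the operator $T$ is diagonal with distinct eigenvalues, and the inner product is conjugate linear in its first argument.

```latex
% Assumed setting: T = diag(\lambda_1, ..., \lambda_n) with distinct \lambda_k,
% so the algebra A generated by T, T^* and I is the algebra of diagonal matrices.
\begin{align*}
\Delta &= \{h_1, \dots, h_n\}, \qquad h_k(S) = S_{kk}, \qquad
  \sigma(T) = \hat{T}(\Delta) = \{\lambda_1, \dots, \lambda_n\}, \\
\mu_{x,y}(\{h_k\}) &= \bar{x}_k y_k, \qquad
  \langle x, Sy \rangle = \sum_{k=1}^{n} S_{kk}\, \bar{x}_k y_k
  = \int_\Delta \hat{S}(h)\, d\mu_{x,y}(h), \\
\langle x, f(T)y \rangle &= \sum_{k=1}^{n} f(\lambda_k)\, \bar{x}_k y_k
  = \int_{\sigma(T)} f(\lambda)\, d(\mu_{x,y} \circ \hat{T}^{-1})(\lambda), \\
\mu_{x,y}(\hat{T}^{-1}(B)) &= \sum_{k : \lambda_k \in B} \bar{x}_k y_k
  = \langle x, E_T(B) y \rangle, \qquad
  E_T(B) = \sum_{k : \lambda_k \in B} e_k e_k^* .
\end{align*}
```

In this setting $E_T(B)$ is visibly an orthogonal projection and $E_T(B_1 \cap B_2) = E_T(B_1)E_T(B_2)$, which is the multiplicativity that the general argument extracts from the homomorphism property of each $h$.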
5.6 The Atiyah-Singer index theorem: A supersymmetric proof
Prerequisites:
[1] The Dirac operator in curved spacetime with a Yang-Mills connection term.
[2] Lichnerowicz' formula for the square of the Dirac operator in curved spacetime with a Yang-Mills connection term.
[3] The index of a linear operator.
[4] Expressing the index of a linear operator $D$ using the difference of the traces of the heat kernels $\exp(-tD^*D)$ and $\exp(-tDD^*)$, by observing that the nonzero eigenvalues of $D^*D$ and $DD^*$ are identical inclusive of multiplicities and hence the contribution to the index comes only from the multiplicities of their zero eigenvalues.
[5] Write
$$T = \begin{pmatrix} 0 & D^* \\ D & 0 \end{pmatrix}$$
and observe that
$$T^2 = \begin{pmatrix} D^*D & 0 \\ 0 & DD^* \end{pmatrix}$$
and hence with str denoting supertrace, we have
$$\mathrm{str}(\exp(-tT^2)) = \mathrm{Tr}(\exp(-tD^*D)) - \mathrm{Tr}(\exp(-tDD^*)) = \mathrm{Index}(D), \quad t > 0$$
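The cancellation of the nonzero eigenvalues in [4] and [5] can be seen in the smallest possible example. The computation below is an illustrative sketch with an assumed finite dimensional operator $D : \mathbb{C}^2 \to \mathbb{C}^1$, which is not taken from the text.

```latex
% Assumed example: D = (a  b) with (a, b) \neq (0, 0), and s = |a|^2 + |b|^2.
\begin{align*}
DD^* &= s \quad (1 \times 1), \qquad
D^*D = \begin{pmatrix} |a|^2 & \bar{a} b \\ \bar{b} a & |b|^2 \end{pmatrix}
  \ \text{with eigenvalues } s \text{ and } 0, \\
\mathrm{Tr}(\exp(-tD^*D)) - \mathrm{Tr}(\exp(-tDD^*))
  &= (e^{-ts} + 1) - e^{-ts} = 1, \\
\mathrm{Index}(D) &= \dim \mathrm{Ker}(D) - \dim \mathrm{Ker}(D^*) = 1 - 0 = 1 .
\end{align*}
```

The common nonzero eigenvalue $s$ drops out for every $t$, leaving only the count of zero modes; this independence of $t$ is what allows the limit $t \to 0$ to be used in the heat kernel approach.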
5.7 Replicas, regular elements, Jordan decomposition and Cartan subalgebras
[6] Describe Chevalley's theory of replicas and how it is used in proving the Jordan decomposition of a semisimple Lie algebra $g$. The steps involved are as follows:
[a] Choose any $X \in g$ and consider the derivation $D = \mathrm{ad}(X)$ of $g$.
[b] Write down the Jordan decomposition of $D$ viewed as a linear operator on $g$:
$$D = U + V$$
where $U, V$ are respectively the semisimple and nilpotent parts of $D$. Thus $[U, V] = 0$ and hence $[D, U] = [D, V] = 0$. In fact, we know from basic linear algebra that $U$ and $V$ are expressible as polynomial functions of $D$ having constant coefficient zero and that $U, V$, subject to these constraints of semisimplicity, nilpotency and mutual commutativity, are uniquely determined by $D$.
[c] Hence, it is obvious that
$$D_{r,s} = U_{r,s} + V_{r,s}, \quad r, s \ge 0, \; r + s \ge 1$$
is the Jordan decomposition of $D_{r,s}$ on $W_{r,s} = W^{\otimes r} \otimes W^{*\otimes s}$ with $W = g$.
[d] Thus $U_{r,s}, V_{r,s}$ are expressible as polynomial functions of $D_{r,s}$ with these polynomials having zero constant coefficient. Hence
$$N(D_{r,s}) \subset N(U_{r,s}), N(V_{r,s}), \quad \forall r, s$$
ie, $U, V$ are replicas of $D$. In particular, taking $r = 1$, $s = 2$ gives us
$$N(D_{1,2}) \subset N(U_{1,2}), N(V_{1,2})$$
which means that $U, V$ are also derivations on $g$. Since every derivation of a semisimple Lie algebra is inner (this is proved using the nonsingularity of the Cartan-Killing form), it follows that
$$U = \mathrm{ad}(S), \quad V = \mathrm{ad}(N)$$
where $S, N \in g$. Since $U$ and $V$ commute and $\mathrm{ad}([S, N]) = [\mathrm{ad}(S), \mathrm{ad}(N)]$, it follows from the faithfulness of the ad map on $g$ that $[S, N] = 0$ and hence, we obtain the Jordan decomposition on the semisimple Lie algebra $g$: $X = S + N$, $S, N \in g$, $\mathrm{ad}(S)$ is semisimple, $\mathrm{ad}(N)$ is nilpotent and $[S, N] = 0$. Furthermore, this decomposition is unique, as follows from the Jordan decomposition on a vector space and the faithfulness of the ad map on semisimple Lie algebras.
[7] Show that if $g$ is a semisimple Lie algebra, and $h$ is a Cartan subalgebra, then there exists a regular element $X \in g$ such that
$$h = h_X = \bigcup_{m \ge 1} N(\mathrm{ad}(X)^m)$$
hint: Let $X \in h$ be a regular element. Since $\mathrm{ad}(X)$ is nilpotent on $h$ and since $X$ is regular, it follows that $\mathrm{ad}(X)$ is nonsingular on $g/h$. Then, it is easy to see that $h_X \subset h$ and by regularity of $X$, we get that $h_X = h$.
[8] Show that if $g$ is a semisimple Lie algebra, and $h$ is a Cartan subalgebra, then $h$ is maximal Abelian. hint: The steps involved in the proof are outlined below:
[a] Let $X$ be regular in $g$ such that $h = h_X$. Let
$$X = S + N$$
be its Jordan decomposition. Then since $\mathrm{ad}(X)(S) = [X, S] = 0$, $\mathrm{ad}(X)(N) = [X, N] = 0$, it follows that $S, N \in h_X = h$. It is also clear that $S$ is regular and hence $h_S = h$. Regularity of $S$ can be seen as follows. Clearly, since $\mathrm{ad}(S)$ and $\mathrm{ad}(N)$ commute, $\mathrm{ad}(N)$ leaves each eigensubspace of $\mathrm{ad}(S)$ invariant and on such a subspace, $\mathrm{ad}(N)$ is nilpotent. Thus, there is a basis for $g$ relative to which $\mathrm{ad}(S)$ is diagonal and $\mathrm{ad}(N)$ is strictly upper triangular. It follows that relative to such a basis $\mathrm{ad}(X)$ is upper triangular with the same diagonal entries as those of $\mathrm{ad}(S)$. Hence, the characteristic polynomial of $\mathrm{ad}(X)$ is the same
as that of $\mathrm{ad}(S)$ and since $X$ is regular, it follows that $S$ is also regular. Now we wish to show that $N = 0$ and that will prove that $\mathrm{ad}(h)$ consists of only semisimple elements.
Another proof of the fact that $h_S = h_X$: First observe that $\mathrm{ad}(X)(S) = 0$ and hence $S \in h_X$. Suppose that $Y \in h_S$. Then, $\mathrm{ad}(S)^m(Y) = 0$ for some positive $m$ and since $\mathrm{ad}(S)$ is semisimple, it follows that $\mathrm{ad}(S)(Y) = 0$. It follows that $\mathrm{ad}(S)^m(Y) = 0$ for all $m > 0$ and therefore,
$$\mathrm{ad}(X)^m(Y) = \sum_{r=0}^{m} \binom{m}{r} \mathrm{ad}(N)^{m-r}\, \mathrm{ad}(S)^r(Y) = \mathrm{ad}(N)^m(Y) = 0$$
for sufficiently large $m$ since $\mathrm{ad}(N)$ is nilpotent. This implies that $Y \in h_X$ and therefore,
$$h_S \subset h_X$$
Conversely suppose $Y \in h_X$. Then,
$$\mathrm{ad}(S)^m(Y) = (\mathrm{ad}(X) - \mathrm{ad}(N))^m(Y) = \sum_{r=0}^{m} \binom{m}{r} (-1)^{m-r} (\mathrm{ad}(X))^r (\mathrm{ad}(N))^{m-r}(Y) = 0$$
for sufficiently large $m$ since $\mathrm{ad}(X)$ is nilpotent on $h_X$ while $\mathrm{ad}(N)$ is nilpotent. This proves that $Y \in h_S$ and hence,
$$h_X = h_S$$
Another way to see that $h_X = h_S$ is simply to prove that $S$ is a regular element of $h_X$. But this would follow immediately if we are able to prove that the characteristic polynomial of $\mathrm{ad}(S)$ equals that of $\mathrm{ad}(X)$. This can be seen in the following way:
$$\mathrm{ad}(X) = \mathrm{ad}(S) + \mathrm{ad}(N), \quad [\mathrm{ad}(S), \mathrm{ad}(N)] = 0$$
is the Jordan decomposition of $\mathrm{ad}(X)$. Now $\mathrm{ad}(S)$ is semisimple on $g$ and hence we can choose a basis $B$ for $g$ such that $[\mathrm{ad}(S)]_B$ is diagonal. Let $c_1, .., c_r$ be the distinct eigenvalues of $\mathrm{ad}(S)$; then we can write the spectral decomposition of $\mathrm{ad}(S)$ as
$$\mathrm{ad}(S) = \sum_{k=1}^{r} c_k E_k$$
where
$$I = \sum_{k=1}^{r} E_k$$
is a resolution of the identity $I$ (not necessarily orthogonal). Since $\mathrm{ad}(N)$ commutes with $\mathrm{ad}(S)$, it follows that it also commutes with every $E_k$, ie, it leaves $R(E_k) = N(\mathrm{ad}(S) - c_k I)$ invariant. Now we can choose a basis for $R(E_k)$ so that $\mathrm{ad}(N)$ restricted to $R(E_k)$ is strictly upper triangular, since $\mathrm{ad}(N)$ restricted to $R(E_k)$ is nilpotent. If we pool up all these bases for the $R(E_k)$, then we get a basis for $g$ such that $\mathrm{ad}(S)$ in this basis is diagonal and $\mathrm{ad}(N)$ in this basis is strictly upper triangular. In other words, we have shown that $\mathrm{ad}(X) = \mathrm{ad}(S) + \mathrm{ad}(N)$ in this basis is upper triangular with the same diagonal entries as those of $\mathrm{ad}(S)$. This immediately proves that the characteristic polynomials of $\mathrm{ad}(X)$ and $\mathrm{ad}(S)$ are the same and therefore since $X$ is regular, it follows that $S$ is also regular.
Problem: Show that if $X$ is regular then $Y \in h_X = h$ is regular iff $\det(\mathrm{ad}(Y)|_{g/h}) \ne 0$.
hint: The above determinant is zero iff there exists a $Z_1 \in g - h$ such that
$$\mathrm{ad}(Y)(Z_1) \in h$$
Now choose elements $Z_2, ..., Z_r$ such that $\{Z_k + h : k = 1, 2, ..., r\}$ form a basis for $g/h$ and show that
$$g = h \oplus q$$
where
$$q = \mathrm{span}\{Z_1, ..., Z_r\}$$
Now choose a basis $\{H_1, ..., H_l\}$ for $h$ such that $\mathrm{ad}(Y)|_h$ in this basis is strictly upper triangular (this is possible because $h$ is a nilpotent Lie algebra). Then compute the characteristic polynomial of $\mathrm{ad}(Y)$ in the basis $\{H_1, ..., H_l, Z_1, ..., Z_r\}$ for $g$ and observe that if $\mathrm{ad}(Y)(Z_1) \in h$, then this characteristic polynomial is of the form
$$t^{l+1} f(t)$$
where $f$ is a polynomial and hence $Y$ cannot be regular. Further, if there is no such vector $Z_1$, then show that the characteristic polynomial of $\mathrm{ad}(Y)$ relative to this basis must be of the form $t^l f(t)$ where $f$ is a polynomial with $f(0) \ne 0$, or in other words, $Y$ is regular. To show this, we must use the fact that $h_X$ is a nilpotent Lie algebra for regular $X$. To prove this fact, let $Y \in h_X$ be such that the determinant of $\mathrm{ad}(Y)$ on $g/h_X$ is nonzero. Then, it is clear that the characteristic polynomial of $\mathrm{ad}(Y)$ must be of the form $t^k f(t)$ where $k \le \dim h_X$ and $f(0) \ne 0$. Hence, $h_Y \subset h_X$. This is possible only if $h_Y = h_X$ because $X$ is regular, $\mathrm{ad}(X)$ is nilpotent on $h_X$ and nonsingular on $g/h_X$, and further $h_X$ is $\mathrm{ad}(Y)$-invariant (since $\mathrm{ad}(X)^m(Y) = 0$ for some positive $m$, it follows that $\mathrm{ad}(\mathrm{ad}(X))^m(\mathrm{ad}(Y)) = 0$ and therefore from elementary linear algebra, $\mathrm{ad}(Y)$ leaves $h_X = \bigcup_{m \ge 1} N(\mathrm{ad}(X)^m)$ invariant). In particular, $\mathrm{ad}(Y)$ is nilpotent on $h_X$. Since the elements $Y$ of $h_X$ for which $\mathrm{ad}(Y)$ is nonsingular on $g/h_X$ are dense in $h_X$, it follows that every $Y \in h_X$ is such that $\mathrm{ad}(Y)$ is nilpotent on $h_X$. Note that this proof also shows that if $Y$ is a regular element of $h_X$, then $h_Y = h_X$.
Now coming back to the proof that every Cartan algebra $h$ is maximal abelian when $g$ is semisimple, we choose a regular $X$ in $h$ and observe that $h = h_X$. Indeed, $ad(X)$ is nilpotent on $h$ (by that part of the definition of a Cartan algebra which states that a Cartan algebra is nilpotent) and hence $h \subset h_X$. Further, consider the quotient space $h_X/h$. By Engel's theorem applied to the representation of the nilpotent algebra $h$ on this space induced by the adjoint representation, we deduce that if $h \neq h_X$, there exists a $Y \in h_X - h$ such that $[Y, h] \subset h$, and since a Cartan algebra is its own normalizer, it follows that $Y \in h$, a contradiction unless $h = h_X$. (This argument does not require $g$ to be semisimple.)

Now, for semisimple $g$, we let $X = S + N$ be the Jordan decomposition of $X$, where $X$ is a regular element in $h$. We wish to show that $N = 0$. (Note that the existence of the Jordan decomposition depends on the semisimplicity of $g$.) Since $ad(S)$ is semisimple, we have the decomposition
$$g = R(ad(S)) \oplus N(ad(S))$$
Since $S$ is a regular element of $h = h_X$, it follows that $h_S = h$, and since $ad(S)$ is semisimple, it follows that
$$h = h_S = N(ad(S))$$
So we can write
g = R(ad(S)) ⊕ h
Now since $[S, N] = 0$, it follows that for any $Y \in g$, we have
$$\langle N, ad(S)(Y) \rangle = \langle N, [S, Y] \rangle = \langle [N, S], Y \rangle = 0$$
where $\langle ., . \rangle$ is the Cartan-Killing form (which is nonsingular for semisimple Lie algebras), and hence
$$N \perp R(ad(S))$$
To prove that $N \perp h$, we take any $Y \in h = N(ad(S))$. We must show that
$$Tr(ad(N).ad(Y)) = \langle N, Y \rangle = 0$$
Recall that $ad(X)(S) = ad(X)(N) = 0$ and hence $S, N \in h_X = h$. Now every nilpotent Lie algebra is also solvable. (This can be seen by making use of Engel's theorem for nil-representations. Indeed let $g_0$ be a nilpotent Lie algebra. Then the adjoint representation of $g_0$ on itself is a nil-representation and hence according to Engel's theorem, there exists a basis for $g_0$ relative to which all the operators $ad(Z), Z \in g_0$ are strictly upper triangular. Hence, by working in this basis, we deduce that for all sufficiently large $n$, we have
$$ad(Z_1).ad(Z_2)...ad(Z_n) = 0 \ \forall Z_1, ..., Z_n \in g_0$$
which is the same as saying that
$$[Z_1, [Z_2, ..., [Z_{n-1}, [Z_n, Z]]...]] = 0 \ \forall Z_1, ..., Z_n, Z \in g_0$$
and in particular,
$$D^n g_0 = 0$$
thereby establishing solvability of $g_0$). Thus, $h$ is a solvable Lie algebra and hence by applying Lie's theorem to the adjoint representation of $h$ in $g$, we deduce that there exists a basis for $g$ relative to which all the operators $ad(H), H \in h$ are upper-triangular matrices. In particular, relative to this basis $ad(N)$ and $ad(Y)$ are also upper-triangular. But since $ad(N)$ is nilpotent on $g$, it must necessarily follow that relative to this basis, $ad(N)$ is strictly upper-triangular. Since the product of an upper-triangular matrix and a strictly upper-triangular matrix is strictly upper-triangular, it follows that $ad(N).ad(Y)$ is strictly upper-triangular and hence its trace is zero, ie,
$$\langle N, Y \rangle = 0$$
Therefore, we have proved that
$$N \perp R(ad(S)) \oplus N(ad(S)) = g$$
and therefore $N = 0$ by nondegeneracy of the Cartan-Killing symmetric bilinear form on $g$. Hence $X = S$. In other words, we have proved that if $X$ is a regular element of $h$, then $X$ is semisimple (ie, $ad(X)$ is semisimple on $g$).

Now for such an $X$, since $h = h_X$, we have $ad(X)^m(h) = 0$ for sufficiently large $m$ and therefore (by expressing the semisimple operator $ad(X)$ on $h$ relative to a basis which makes it diagonal) $ad(X)(h) = 0$, ie, $[X, h] = 0$. Now if $Y \in h$ is arbitrary, we can write $Y = \lim X_n$ where the $X_n \in h$ are semisimple, because the set of regular elements of $h$ is dense in $h$ and we have shown that every regular element of $h$ is semisimple in $g$. Hence,
$$[Y, h] = \lim [X_n, h] = 0$$
proving that $h$ is Abelian. Another way to see this is to let $h'$ denote the set of regular elements of $h$ and to start with the equation
$$[X, h] = 0 \ \forall X \in h'$$
and hence,
$$[h', h'] = 0$$
with $ad(h')$ consisting only of semisimple elements. Thus, $ad(h')$ is simultaneously diagonable and hence there is a basis $B$ for $g$ so that $[ad(H)]_B$ is diagonal for all $H \in h'$. Then taking the limit points of this set we get that $[ad(H)]_B$ is diagonal for all $H \in h$.

Remark: We are making use of the fact that the space of regular elements of any Lie algebra $g$ is dense in $g$. To prove this result, we choose an element $X \in g$ that is nonregular, ie, if $l = rk(g)$, then
$$\det(tI - ad(X)) = c_{l+1} t^{l+1} + ... + t^n$$
Let $Z \in g$ be arbitrary and consider
$$f(t, s) = \det(tI - ad(X + sZ)) = \det(tI - ad(X) - s.ad(Z))$$
This is a polynomial in $t, s$ and hence it can be expressed as
$$f(t, s) = c_l(s) t^l + c_{l+1}(s) t^{l+1} + ... + c_n(s) t^n$$
where $c_l(s)$ is a polynomial in $s$. Suppose that $c_l(s)$ vanishes for all $|s| < \delta$. Then it vanishes for all $s$, since a polynomial that is not identically zero has only a finite number of zeroes. In this case, it would follow that $X + sZ$ is an irregular element for all $s$. By choosing a basis $\{Z_1, ..., Z_n\}$ for $g$ and applying this argument, we would deduce that any element of the form $X + s_1 Z_1 + ... + s_n Z_n$ is irregular and hence that there exists no regular element in $g$, which is a contradiction to the definition of $l = rk(g)$. This means that there exists an element $Z \in g$ such that for each $\delta > 0$ there is an $s$ with $|s| < \delta$ such that $X + sZ$ is regular. In other words, there is a sequence $s_n \to 0$ such that $X + s_n Z$ is regular for all $n = 1, 2, ...$. Writing $X_n = X + s_n Z$, we deduce that $X_n$ is regular for each $n$ and $\lim X_n = X$. The proof is complete.

Remark: If $g$ is any Lie algebra and $h$ is a CSA (ie, $h$ is a nilpotent Lie subalgebra of $g$ and is also its own normalizer in $g$), then $h$ is also maximally nilpotent, ie, every CSA is maximally nilpotent. Note that when we say that $h$ is a nilpotent subalgebra, we mean firstly that $[h, h] \subset h$ and secondly that $ad(H)^m(h) = 0 \ \forall H \in h$ where $m$ is some finite positive integer. Note that since we are dealing only with finite dimensional Lie algebras, nilpotency of $h$ is equivalent to saying that for each $H_1 \in h$ there is a finite positive integer $m$ such that $ad(H_1)^m(H_2) = 0 \ \forall H_2 \in h$.

To see the maximal nilpotency of a CSA $h$, suppose $h \subset h_1$ is a proper inclusion where $h_1$ is a nilpotent subalgebra. Then since $h$ is $ad(h)$-invariant, it follows that $ad : h \to L(h_1/h)$ is a well defined representation, and since $ad(h_1)$ is nilpotent on $h_1$ and $h \subset h_1$, it follows that $ad(h)$ is nilpotent on $h_1/h$. In other words, this representation is a nil-representation and hence by Engel's theorem, there exists a $Y \in h_1 - h$ such that $ad(H)(Y + h) = h$ for all $H \in h$ ($h$ being the zero element of $h_1/h$), which is equivalent to saying that $[Y, h] \subset h$, and since $h$ is its own normalizer in $g$, it follows that $Y \in h$, a contradiction.
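The notions used in this section (the generalized null space $h_X = \bigcup_m N(ad(X)^m)$, regularity, and nondegeneracy of the Cartan-Killing form) can be checked by hand on the smallest semisimple example. The following is only an illustrative sketch: the choice of $sl(2)$ with basis $(H, E, F)$ and brackets $[H,E] = 2E$, $[H,F] = -2F$, $[E,F] = H$, as well as all function names, are assumptions made for the example and do not appear in the text.

```python
from fractions import Fraction

# sl(2) in the basis order (H, E, F); column j of ad(X) holds the
# coordinates of [X, basis_j].  This example algebra is an assumption.
AD = {
    "H": [[0, 0, 0], [0, 2, 0], [0, 0, -2]],
    "E": [[0, 0, 1], [-2, 0, 0], [0, 0, 0]],
    "F": [[0, -1, 0], [0, 0, 0], [2, 0, 0]],
}

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def rank(a):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in a]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def generalized_nullity(a):
    """dim of the union of N(a^m), m >= 1, which equals n - rank(a^n)."""
    n = len(a)
    power = a
    for _ in range(n - 1):
        power = matmul(power, a)
    return n - rank(power)

def killing_matrix():
    """Matrix of <X, Y> = Tr(ad(X).ad(Y)) in the basis (H, E, F)."""
    return [[trace(matmul(AD[x], AD[y])) for y in "HEF"] for x in "HEF"]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

In this example $\dim h_H = 1$ while $\dim h_E = 3$, so $H$ is regular (the rank of $sl(2)$ is one) and the nilpotent element $E$ is not; the Killing matrix has nonzero determinant, in line with semisimplicity.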
5.8 Lecture plan, matrix theory, M.Tech, instructor: Harish Parthasarathy

[1] Vector spaces, basis, linear transformations.
[2] Matrix representation of a linear transformation relative to a basis, similarity transformations.
[3] Rank and nullity of a matrix.
[4] Decomposition theorems for matrices.
[a] QR factorization based on the Gram-Schmidt orthonormalization process.
[b] LDU decomposition of a positive definite matrix.
[e] Spectral theorem for normal operators in finite and infinite dimensional Hilbert spaces.
[f] Polar decomposition in a Hilbert space.
[g] Singular value decomposition in finite dimensional Hilbert spaces.
[5] Primary decomposition theorem of a matrix in a finite dimensional complex vector space.
[6] Jordan decomposition of a matrix.
[7] Canonical representation of nilpotent matrices.
[8] The Jordan canonical form.
[9] Computational algorithms for calculating the Jordan canonical form.
[10] Perturbation theory for computing the eigenvalues and eigenvectors of Hermitian and diagonable matrices.
[11] Calculating functions of matrices using contour integrals in the complex plane.
[12] Computing functions of matrices using the Jordan canonical form.
[13] Matrix norms and Banach algebras.
[14] Homomorphisms and spectra of commutative Banach algebras.
[15] Numerical methods for computing eigenvalues and eigenvectors of a matrix.
[16] Tensor product of vector spaces and of linear transformations.
[17] Application of the tensor product to describing nonlinear systems.
[18] Quotient vector spaces and linear transformations on them.
[19] Lie groups, Lie algebras and their representations.
[19.1] Definition of a Lie group and its Lie algebra.
[19.2] Jacobi identity on a Lie algebra.
[19.3] Representation of a Lie group and the associated representation of its Lie algebra.
[19.4] The adjoint representation of a Lie algebra.
[19.5] Solvable and semisimple Lie algebras.
[19.6] Nilpotent Lie algebras and nilpotent representations of a Lie algebra.
[19.7] Lie's theorem on representations of solvable Lie algebras.
[19.8] Engel's theorem on nilpotent representations of a Lie algebra.
[19.9] Regular elements of a Lie algebra.
[19.10] Cartan subalgebras of a Lie algebra and their construction.
[19.11] Properties of Cartan subalgebras.
[19.12] Jordan decomposition on a Lie algebra, proofs based on Chevalley's theory of replicas.
[19.13] Cartan subalgebras of a semisimple Lie algebra and their properties.
[19.14] The root space decomposition of a semisimple Lie algebra.
[19.15] Cartan's classification of all the complex simple Lie algebras.
[20] Some notions in the theory of unbounded operators in a Hilbert space.
[20.1] The uniform boundedness principle.
[20.2] The Hahn-Banach theorem.
[20.3] The closed graph theorem.
[20.4] Closed operators.
[20.5] Adjoint of an operator.
[21] Applications of unbounded operators to quantum mechanics.

Test on Probability theory

[1] Let $(\Omega, F, P)$ be a probability space and let $E_n, n = 1, 2, ...$ be an infinite sequence of events on this space, ie $E_n \in F, n = 1, 2, ...$. Then justify the statement that the event that an infinite number of the $E_n$'s occur is given by
$$\{E_n, i.o\} = \bigcap_{n \geq 1} \bigcup_{k \geq n} E_k$$
Show further that the probability of this event satisfies
$$P(\{E_n, i.o\}) = \lim_{n \to \infty} P\Big(\bigcup_{k \geq n} E_k\Big), \quad P\Big(\bigcup_{k \geq n} E_k\Big) \leq \sum_{k \geq n} P(E_k)$$
and in particular, show that if
$$\sum_{k \geq 1} P(E_k) < \infty$$
then the probability of an infinite number of the $E_n$'s occurring is zero.

[2] Show that if $X_n, n \geq 1$ is an infinite sequence of random variables on the same probability space, then $X_n$ converges to zero with probability one if for each $\epsilon > 0$,
$$\sum_{n \geq 1} P(|X_n| > \epsilon) < \infty$$
hint: Show that the event that $X_n$ does not converge to zero can be expressed as
$$\{X_n \to 0\}^c = \bigcup_{k \geq 1} \{|X_n| > 1/k, i.o\}$$
and that this event has probability zero if
$$P(\{|X_n| > 1/k, i.o\}) = 0, \ k = 1, 2, ...$$
Now make use of the result of the preceding problem.
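The union bound behind problem [1] can be seen in exact arithmetic on a concrete family of events. The sketch below assumes, purely for illustration, independent events with $P(E_k) = 1/k^2$; neither the independence nor this choice of probabilities comes from the problem statement, and the function names are hypothetical.

```python
from fractions import Fraction

def union_prob(n, m):
    """Exact P(E_n or ... or E_m) for independent events with P(E_k) = 1/k^2.

    Independence is an assumption of this example only; it lets us write
    the union probability as 1 - prod(1 - P(E_k)).
    """
    none_occur = Fraction(1)
    for k in range(n, m + 1):
        none_occur *= 1 - Fraction(1, k * k)
    return 1 - none_occur

def tail_sum(n, m):
    """The bound sum_{k=n}^{m} P(E_k) from the first Borel-Cantelli lemma."""
    return sum(Fraction(1, k * k) for k in range(n, m + 1))
```

Both quantities shrink as $n$ grows, which is the mechanism that forces $P(\{E_n, i.o\}) = 0$ when the series of probabilities converges.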
[3] Let $(\Omega, F, P)$ be a probability space and let $\phi$ be a continuous convex bounded function. By convex, we mean that
$$\phi(\lambda x + (1 - \lambda) y) \leq \lambda \phi(x) + (1 - \lambda) \phi(y) \ \forall x, y \in R, \ 0 \leq \lambda \leq 1$$
Let $X \in L^1(\Omega, F, P)$, ie, $E|X| < \infty$. Then prove Jensen's inequality:
$$E(\phi(X)) \geq \phi(EX)$$
hint: First prove this result for simple random variables by using the given definition of convexity, then obtain a sequence of simple random variables that converge to the given random variable and take limits using Lebesgue's dominated convergence theorem.
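The first step of the hint, Jensen's inequality for a simple random variable, can be checked on a small example. The three-point distribution and the convex function $\phi(x) = x^2$ below are assumptions chosen only for illustration.

```python
from fractions import Fraction

def expectation(values, probs, g=lambda x: x):
    """E g(X) for a simple random variable taking values[i] with probs[i]."""
    return sum(p * g(x) for x, p in zip(values, probs))

# Hypothetical simple random variable and convex function.
values = [Fraction(-1), Fraction(0), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
phi = lambda x: x * x

lhs = expectation(values, probs, phi)   # E(phi(X))
rhs = phi(expectation(values, probs))   # phi(EX)
```

Here $EX = 0$ and $E(X^2) = 2$, so the inequality $E(\phi(X)) \geq \phi(EX)$ holds with room to spare.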
5.9 More Assignment problems in probability theory

[1] Let $X(n), n \in Z$ be a random walk on the $d$-dimensional lattice, ie, $X(n) \in Z^d$, with transition probabilities given by
$$P(X(n+1) - X(n) = e_k | X(n)) = p(k), \quad P(X(n+1) - X(n) = -e_k | X(n)) = q(k), \quad k = 1, 2, ..., d$$
where
$$e_k = [0, 0, ..., 0, 1, 0, ..., 0]^T \in Z^d$$
with a one in the $k$th position and zeros at all the other positions, and
$$p(k), q(k) \geq 0, \quad \sum_{k=1}^d (p(k) + q(k)) = 1$$
Define the probability of the random walk being at the position $k = \sum_{j=1}^d k_j e_j$ at time $n$ by
$$P(n, k) = Pr(X(n) = k)$$
where
$$k = [k_1, ..., k_d]^T = k_1 e_1 + ... + k_d e_d, \quad k_1, ..., k_d \in Z$$
From elementary intuition, derive the recurrence relation
$$P(n+1, k) = \sum_{j=1}^d (P(n, k - e_j) p(j) + P(n, k + e_j) q(j))$$
By elementary intuitive arguments, show that if the walk starts at the origin, then
$$P(n, k) = \sum_{r_1, ..., r_d, s_1, ..., s_d} \frac{n!}{r_1! ... r_d! s_1! ... s_d!} p(1)^{r_1} ... p(d)^{r_d} q(1)^{s_1} ... q(d)^{s_d}$$
where the sum is over all nonnegative integers $r_1, ..., r_d, s_1, ..., s_d$ for which
$$r_j - s_j = k_j, \ j = 1, 2, ..., d, \quad \sum_{j=1}^d (r_j + s_j) = n$$
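The recurrence relation and the multinomial formula can be compared directly in exact arithmetic. The sketch below assumes $d = 2$, a walk started at the origin, and the particular step probabilities shown; these values and all names are illustrative assumptions rather than part of the problem.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import factorial

# Hypothetical step probabilities for d = 2; they sum to one.
P_STEP = (Fraction(1, 4), Fraction(1, 8))   # p(1), p(2)
Q_STEP = (Fraction(1, 2), Fraction(1, 8))   # q(1), q(2)
D = len(P_STEP)

def shift(k, j, sign):
    """Return k + sign * e_j as a tuple."""
    return tuple(kj + sign * (i == j) for i, kj in enumerate(k))

@lru_cache(maxsize=None)
def by_recurrence(n, k):
    """P(n, k) from P(n+1, k) = sum_j P(n, k-e_j) p(j) + P(n, k+e_j) q(j)."""
    if n == 0:
        return Fraction(int(all(kj == 0 for kj in k)))  # start at the origin
    return sum(
        by_recurrence(n - 1, shift(k, j, -1)) * P_STEP[j]
        + by_recurrence(n - 1, shift(k, j, +1)) * Q_STEP[j]
        for j in range(D)
    )

def by_multinomial(n, k):
    """P(n, k) from the multinomial sum over r_j - s_j = k_j, sum(r + s) = n."""
    total = Fraction(0)
    for s in product(range(n + 1), repeat=D):
        r = tuple(kj + sj for kj, sj in zip(k, s))
        if min(r) < 0 or sum(r) + sum(s) != n:
            continue
        denom = 1
        weight = Fraction(1)
        for j in range(D):
            denom *= factorial(r[j]) * factorial(s[j])
            weight *= P_STEP[j] ** r[j] * Q_STEP[j] ** s[j]
        total += Fraction(factorial(n), denom) * weight
    return total
```

Using `Fraction` keeps the comparison exact, so agreement of the two functions on every lattice point reachable in $n$ steps is a genuine check rather than a floating-point coincidence.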
Now, suppose we view this random walk as a space-time discretized version of a continuous time stochastic process $Y(t)$ with values in $R^d$, such that if $f(t, x)$ is the probability density of $Y(t)$ with $x \in R^d$, then $P(n, k)$ is approximated by $f(n\tau, k\Delta)\Delta^d$, with $\tau$ being the time discretization step size and $\Delta$ the spatial discretization step size. Show that the above recursion can then be expressed as
$$f(t + \tau, x) = \sum_{j=1}^d (p(j) f(t, x - \Delta e_j) + q(j) f(t, x + \Delta e_j))$$
Show that if
$$\Delta \to 0, \ \tau \to 0, \ \Delta^2/2\tau \to D, \ p(j) - q(j) \to 0, \ (p(j) - q(j))\Delta/\tau \to v_j, \ p(j) + q(j) \to a(j)$$
then taking this continuum limit, $f(t, x)$ will satisfy the partial differential equation (a diffusion equation)
$$\frac{\partial f(t, x)}{\partial t} = \sum_{j=1}^d \Big(-v_j \frac{\partial f(t, x)}{\partial x_j} + D_j \frac{\partial^2 f(t, x)}{\partial x_j^2}\Big)$$
where
$$D_j = D a(j), \ j = 1, 2, ..., d$$
Assuming that at time $t = 0$ the particle executing this diffusion process is located at the origin, ie,
$$f(0, x) = \delta(x)$$
show that if we define the spatial Fourier transform of $f$ by
$$F(t, K) = \int f(t, x) \exp(iK.x) d^d x$$
where
$$K.x = \sum_{j=1}^d K_j x_j$$
then $F$ satisfies the ode
$$\frac{\partial F(t, K)}{\partial t} = (i(v, K) - K^T D K) F(t, K), \ t \geq 0$$
with the initial condition
$$F(0, K) = 1$$
where
$$v = [v_1, ..., v_d]^T, \quad D = diag[D_1, ..., D_d]$$
so that
$$(v, K) = v^T K = \sum_{j=1}^d v_j K_j, \quad K^T D K = \sum_{j=1}^d D_j K_j^2$$
Show that the solution is
$$F(t, K) = \exp(it v^T K - t K^T D K) = \exp\Big(it \sum_{j=1}^d K_j v_j - t \sum_{j=1}^d D_j K_j^2\Big)$$
which is the characteristic function of a $d$-dimensional Gaussian random vector having mean $vt$ and covariance matrix $2tD$. By Fourier inversion show that
$$f(t, x) = (4\pi t)^{-d/2} \det(D)^{-1/2} \exp(-(x - vt)^T D^{-1} (x - vt)/4t) = ((4\pi t)^d D_1 ... D_d)^{-1/2} \exp\Big(-\sum_{j=1}^d (x_j - v_j t)^2/4 D_j t\Big)$$
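The relation between the Gaussian density and its Fourier transform can be checked numerically in one dimension. The sketch below assumes $d = 1$ and uses a plain Riemann sum for the integral; the parameter values used when calling it, and the function names, are illustrative assumptions.

```python
import cmath
import math

def density(t, x, v, D):
    """1-D solution f(t, x) = (4 pi D t)^(-1/2) exp(-(x - v t)^2 / (4 D t))."""
    return math.exp(-(x - v * t) ** 2 / (4 * D * t)) / math.sqrt(4 * math.pi * D * t)

def fourier_numeric(t, K, v, D, half_width=12.0, dx=0.01):
    """Riemann-sum approximation of F(t, K) = integral of f(t, x) exp(iKx) dx.

    The grid is centred on the mean v*t and extends half_width standard
    deviations (sigma = sqrt(2 D t)) on either side.
    """
    sigma = math.sqrt(2 * D * t)
    n = int(half_width * sigma / dx)
    total = 0j
    for i in range(-n, n + 1):
        x = v * t + i * dx
        total += density(t, x, v, D) * cmath.exp(1j * K * x)
    return total * dx

def fourier_exact(t, K, v, D):
    """Closed form F(t, K) = exp(i t v K - t D K^2)."""
    return cmath.exp(1j * t * v * K - t * D * K * K)
```

Setting $K = 0$ in the same check confirms that the density integrates to one, as required by the initial condition $F(0, K) = 1$ propagated by the ode.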
5.10 Multiple choice questions on probability theory

Instructions: Select the most appropriate answer.

[1] Let $X, Y$ be two random variables with joint density $f(x, y)$ and marginal densities $f_X(x), f_Y(y)$ respectively. Then the probability density of $Z = X + Y$ is given by
[a] $\int f(z + y, y) dy$
[b] $\int f_X(z - y) f_Y(y) dy$
[c] $\int f(z - y, y) dy$
[d] none of the above
[2] Let $X$ be the outcome when a fair die is thrown. Then, the probability distribution function (CDF) of $X$ is given by
[a] $\frac{1}{6} \sum_{k=1}^6 \delta(x - k)$
[b] $\frac{1}{6} \sum_{k=1}^6 \theta(x - k)$
[c] $\theta(x)/6$
[d] none of the above.
Here, $\theta(x)$ is the unit step function.

[3] Let $X_1, X_2, ..., X_n$ be random variables defined on the same probability space that are not necessarily uncorrelated. Then the variance $Var(S_n)$ of $S_n = X_1 + ... + X_n$ is given by
[a] $\sum_{k=1}^n Var(X_k)$
[b]
n �
k=1
[c]
n �
k=1
V ar(Xk ) +
�
Cov(Xk , Xj )
1≤k 0, [3] $E(X_n - X)^2 \to 0$, [4] $P(X_n \to X) = 1$. Then
[a] [3] $\implies$ [2]
[b] [4] $\implies$ [2] $\implies$ [1]
[c] [4] $\implies$ [3]
[d] both [a] and [b]

[9] Let $X(t)$ be a random process passed through a system having the input-output relation described by the differential equation
$$d^2 Y(t)/dt^2 = a \, dY(t)/dt + b Y(t) + X(t)$$
Then the cross-correlation $R_{YX}(t, s) = E(Y(t)X(s))$ satisfies
[a] $\frac{\partial^2 R_{YX}(t,s)}{\partial t^2} = a \frac{\partial R_{YX}(t,s)}{\partial t} + b R_{YX}(t, s) + R_{XX}(t, s)$
[b] $\frac{\partial^2 R_{YX}(t,t)}{\partial t^2} = a \frac{\partial R_{YX}(t,t)}{\partial t} + b R_{YX}(t, t) + R_{XX}(t, t)$
[c] $\frac{\partial^2 R_{YX}(t,s)}{\partial t^2} = a \frac{\partial R_{YY}(t,s)}{\partial t} + b R_{YY}(t, s) + R_{XX}(t, s)$
[d] None of the above.

[10] Let $X$ be a random variable with a strictly increasing probability distribution function $F(x)$. Then, if $U$ is a uniformly distributed random variable with values in $[0, 1]$, the random variable $F^{-1}(U)$ has the probability distribution
[a] uniform
[b] $F(x)$
[c] $F^{-1}(x)$
[d] none of the above.
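5.11 Design of a quantum unitary gate using superstring theory with noise analysis based on the Hudson-Parthasarathy quantum stochastic calculus

The superstring action is given by
$$S[X, \psi] = \int ((1/2) \partial_\alpha X^\mu \partial^\alpha X_\mu - i \psi^{\mu T} \sigma^2 \rho^\alpha \partial_\alpha \psi_\mu) d^2\sigma + \int B_{\mu\nu}(X) dX^\mu \wedge dX^\nu$$
Note that
$$dX^\mu \wedge dX^\nu = (X^\mu_{,1} X^\nu_{,2} - X^\mu_{,2} X^\nu_{,1}) d^2\sigma = \epsilon^{\alpha\beta} \partial_\alpha X^\mu . \partial_\beta X^\nu d^2\sigma$$
where
$$\epsilon^{12} = 1, \ \epsilon^{21} = -1, \ \epsilon^{11} = \epsilon^{22} = 0$$
The supersymmetry transformations under which this action is invariant are
$$\delta X^\mu = c_1 k^T \sigma^2 \psi^\mu, \quad \delta \psi^\mu = c_2 \rho^\alpha k \partial_\alpha X^\mu$$
Here,
$$\rho^0 = \sigma^1, \ \rho^1 = \sigma^3$$
where $k$ is an infinitesimal Fermionic parameter. This supersymmetric action can also be derived using basic superfield theory by defining our superfield on the space of two Bosonic and two Fermionic dimensions as
$$\Phi^\mu(\sigma, \theta) = X^\mu(\sigma) + \theta^T \epsilon \psi^\mu(\sigma) + \theta^T \epsilon \theta . Y$$
where $Y$ is a scalar Bosonic field and $\epsilon = i\sigma^2$. The infinitesimal supersymmetry transformations are defined by the super-vector field
$$L = k^T (\epsilon \rho^\alpha \theta . \partial_\alpha + \partial/\partial\theta)$$
with $k$ being an infinitesimal Fermionic parameter. It is clear that under such an infinitesimal transformation of the superfield $\Phi$, we have
$$\delta X^\mu = k^T \epsilon \psi^\mu, \quad \theta^T \epsilon \delta\psi^\mu = k^T \epsilon \rho^\alpha \theta . \partial_\alpha X^\mu$$
or equivalently,
$$\delta \psi^\mu = \rho^\alpha k \partial_\alpha X^\mu$$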
5.12 Study projects in probability theory: Construction of Brownian motion, law of the iterated logarithm

[1] Construction of Brownian motion on $[0, 1]$ using the Haar basis.

Step 1: For $n \geq 1$ and $k = 0, 1, ..., 2^{n-1} - 1$, define $H_{n,k}(t)$ to be $2^{(n-1)/2}$ for $t \in [2k/2^n, (2k+1)/2^n)$ and $-2^{(n-1)/2}$ for $t \in [(2k+1)/2^n, (2k+2)/2^n)$. For all other $t \in [0, 1]$ set $H_{n,k}(t) = 0$. Define $H_0(t) = 1$. Show that if $f \in L^2[0, 1]$ and $\langle f, H_{n,k} \rangle = 0$, $\langle f, H_0 \rangle = 0$ for all $n, k$, then $f = 0$. This is proved by noting that $f \perp H_{n,k}$ implies
$$\int_{2k/2^n}^{(2k+1)/2^n} f(t) dt = \int_{(2k+1)/2^n}^{(2k+2)/2^n} f(t) dt, \ k \in I(n)$$
where
$$I(n) = \{0, 1, ..., 2^{n-1} - 1\}$$
By considering the limit of the above equations as $n \to \infty$, deduce that $f \perp H_{n,k}$ for all $n, k$ implies that $f$ is a.s. a constant on $[0, 1]$, and by further making use of $f \perp H_0$, deduce that this constant is zero.

Step 2: Prove the orthogonality relations
$$\langle H_{nk}, H_{ml} \rangle = \delta_{nm} \delta_{kl}$$
Do this by first taking $m = n$ and noting that if $k \neq l$, then $H_{nk}, H_{nl}$ have disjoint, ie nonoverlapping, supports and hence the two are orthogonal. Next observe that if $n > m$ then, by expressing $j/2^m$ as $j.2^{n-m}/2^n$, it follows that if $H_{nk}$ and $H_{ml}$ have overlapping supports, then the support of $H_{nk}$ is contained entirely in the first half or entirely in the second half of that of $H_{ml}$, and hence by using the fact that the integral of $H_{nk}$ is zero, deduce orthogonality of these two. Finally observe the trivial result that $H_{nk}$ is orthogonal to $H_0$ since its integral over $[0, 1]$ is zero. Conclude then that $H_0, \{H_{nk} : n \geq 1, k \in I(n)\}$ is an onb for $L^2[0, 1]$.

Step 3: Define the Schauder functions
$$S_{nk}(t) = \int_0^t H_{nk}(s) ds$$
(and likewise $S_0(t) = \int_0^t H_0(s) ds = t$), and using the identity
$$f(t) = \sum_{n,k} \langle f, H_{nk} \rangle H_{nk}(t), \ f \in L^2[0, 1]$$
where the sum runs over the whole onb including $H_0$, deduce by taking $f(s) = \chi_{[0,t]}(s), t, s \in [0, 1]$ that
$$\sum_{n,k} S_{nk}(t) S_{nk}(s) = \min(t, s)$$
Observe that the graph of $S_{nk}(t)$ is a symmetric tent of height $2^{-(n+1)/2}$ over the interval $[2k/2^n, (2k+2)/2^n]$, the symmetry being about the midpoint $(2k+1)/2^n$ of this interval. Observe also that for a fixed $n$, the $S_{nk}$'s have nonoverlapping supports as $k$ varies over $I(n)$.

Step 4: Now let $\xi(n, k), n \geq 1, k \in I(n)$ be iid $N(0, 1)$ r.v's. Define
$$b(n) = \max\{|\xi(n, k)| : k \in I(n)\}$$
and observe that
$$P(b(n) > n) \leq 2^{n-1} . P(|\xi(1, 0)| > n) = 2^n . (1 - \Phi(n)) \leq 2^n . \exp(-n^2/2)/(n\sqrt{2\pi})$$
and hence
$$\sum_n P(b(n) > n) < \infty$$
so that by the Borel-Cantelli lemma,
$$P(b(n) > n, i.o) = 0$$
Show that this is the same as saying that for a.e. $\omega$, there exists a finite positive integer $N(\omega)$ such that
$$b(n, \omega) \leq n, \ \forall n > N(\omega)$$
Conclude that
$$\sum_{k \in I(n)} |\xi(n, k)(\omega) S_{nk}(t)| \leq b(n, \omega) \sum_{k \in I(n)} S_{nk}(t) \leq b(n, \omega) 2^{-(n+1)/2} \leq n . 2^{-(n+1)/2}, \ \forall n > N(\omega)$$
Conclude that if we define the processes
$$B_N(t) = \sum_{n=1}^N \sum_{k \in I(n)} \xi(n, k) S_{nk}(t), \ N \geq 1$$
then the $B_N$'s are continuous processes that converge uniformly to a limiting process $B(t)$ over $[0, 1]$ almost surely, and hence the limiting process $B(t)$ has almost surely continuous sample paths. (Here the term $\xi_0 S_0(t) = \xi_0 t$ coming from $H_0$, with $\xi_0$ an $N(0, 1)$ r.v. independent of the $\xi(n, k)$, is to be included in $B_N(t)$ and in the sums below; without it the sum over $n \geq 1$ equals $\min(t, s) - ts$, the covariance of a Brownian bridge.) Complete the proof by showing that
$$E B_N(t) = 0, \quad Cov(B_N(t), B_N(s)) = \sum_{k \in I(n), 1 \leq n \leq N} S_{nk}(t) S_{nk}(s)$$
and further that for each $t$, $B_N(t)$ is a Cauchy sequence in $L^2(\Omega, F, P)$ and therefore converges in $L^2$ to $B(t)$. Then using continuity properties of the inner product in $L^2(\Omega, F, P)$, deduce that
$$E(B(t)) = \lim_N E(B_N(t)) = 0, \quad E(B(t)B(s)) = \lim_N E(B_N(t) B_N(s)) = \sum_{n \geq 1, k \in I(n)} S_{nk}(t) S_{nk}(s) = \min(t, s)$$
ie, $B(.)$ is a Gaussian process with a.s. continuous sample paths having zero mean and covariance $\min(t, s)$, or in other words, $B(.)$ is a Brownian motion process over $[0, 1]$.

Remark: To prove that $B(.)$ is a Gaussian process, it suffices to show that if $\{t_1, ..., t_k\}$ is a finite set of points in $[0, 1]$, then $(B(t_1), ..., B(t_k))$ is a Gaussian random vector. But this is immediately a consequence of the fact that $(B_N(t_1), ..., B_N(t_k))$ is a Gaussian random vector which converges in distribution, since it converges in probability, since it converges a.s., since the process $B_N(.)$ converges uniformly a.s.

[2] The law of the iterated logarithm: This result states that if $B(.)$ is standard BM, then
$$\limsup_{t \to \infty} \frac{B(t)}{\sqrt{2t . \log\log(t)}} = 1 \ a.s.$$
Equivalently, since $tB(1/t)$ is also a BM, we can state the law of the iterated logarithm as
$$\limsup_{t \to 0} \frac{B(t)}{\sqrt{2t . \log\log(1/t)}} = 1 \ a.s.$$
Equivalently, since $-B(t)$ is also a BM, this law can also be stated as
$$\liminf_{t \to \infty} \frac{B(t)}{\sqrt{2t . \log\log(t)}} = -1 \ a.s.$$
Intuitively, what these results state is that for very large times $t$, $B(t)$ almost surely oscillates between the two bounding curves $x = \pm\sqrt{2t . \log\log(t)}$. To prove this result, define
$$h(t) = \sqrt{2t . \ln\ln(1/t)}, \ 0 < t < 1$$
so that for $0 < \theta < 1$ and $n = 0, 1, ...$, we have
$$h(\theta^n) = \sqrt{2\theta^n . \ln(n . \ln(1/\theta))}$$
Then we use Doob’s Martingale’s inequality in the form P (max0 0, the inequality
$$1 - \Phi(x) = (2\pi)^{-1/2} \int_x^\infty \exp(-y^2/2) dy \geq (2\pi)^{-1/2} \frac{x}{1 + x^2} \exp(-x^2/2), \ x > 0$$
which follows from
$$\int_x^\infty (1 + y^{-2}) \exp(-y^2/2) dy = x^{-1} \exp(-x^2/2)$$
together with $1 + y^{-2} \leq 1 + x^{-2}$ for $y \geq x$. In particular, with $x_n = h(\theta^n)/\theta^{n/2} = \sqrt{2\ln(n . \ln(1/\theta))}$,
$$1 - \Phi(x_n) \geq (2\pi)^{-1/2} \frac{x_n}{1 + x_n^2} \exp(-h(\theta^n)^2/2\theta^n) = K \frac{\sqrt{2\ln(n . \ln(1/\theta))}}{1 + 2\ln(n . \ln(1/\theta))} . \exp(-\ln(n . \ln(1/\theta)))$$
with $K = (2\pi)^{-1/2}$,
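whose sum over $n \geq 1$ is clearly divergent. Hence, by the second Borel-Cantelli lemma, the events
$$E_n = \{(B(\theta^n) - B(\theta^{n+1}))/h(\theta^n) > 1 - \sqrt{\theta}\}, \ n \geq 1$$
(which are independent, being defined by increments of $B$ over disjoint intervals) occur infinitely often with probability one. Further, from the previous half with the Brownian motion $B$ replaced by $-B$, we have the result that a.s. there exists an integer $N = N(\omega)$ such that
$$-B(\theta^{n+1})/h(\theta^{n+1}) \leq 1 + \delta, \ \forall n > N$$
Since $h(\theta^{n+1})/h(\theta^n) \to \sqrt{\theta}$, this is the same (after slightly enlarging $\delta$) as saying that
$$-B(\theta^{n+1})/h(\theta^n) \leq \sqrt{\theta}(1 + \delta) \ \forall n > N$$
and combining this with the previous result gives us the result that the events
$$F_n = \{B(\theta^n)/h(\theta^n) > 1 - \sqrt{\theta} - (1 + \delta)\sqrt{\theta}\}$$
occur infinitely often. This is true for every $\delta > 0$ and for every $\theta \in (0, 1)$. Letting first $\delta \downarrow 0$ gives us the result that
$$P(\limsup_{t \to 0} B(t)/h(t) \geq 1 - \sqrt{\theta} - \sqrt{\theta}) = 1$$
Then letting $\theta \downarrow 0$ gives us the result that
$$P(\limsup_{t \to 0} B(t)/h(t) \geq 1) = 1$$
and this, together with the upper bound from the first half, completes the proof of the law of the iterated logarithm for Brownian motion:
$$P(\limsup_{s \to 0} B(s)/h(s) = 1) = 1$$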
5.13 Quantum Boltzmann equation for a system of particles interacting with a quantum electromagnetic field

Let $\rho(t) = \rho_{123...N}(t)$ denote the state of the system of $N$ identical particles. This state is an operator on $H^{\otimes N}$ and it satisfies Schrodinger's equation
$$i\rho'(t) = [H(t), \rho(t)]$$
where
$$H(t) = \sum_{j=1}^N ((p_j + eA(t, r_j))^2/2m - e\Phi(t, r_j)) + H_F(t)$$
where $H_F(t)$ is the electromagnetic field Hamiltonian in Boson Fock space. It is given by
$$H_F(t) = (\epsilon/2) \int |E(t, r)|^2 d^3r + (1/2\mu) \int |B(t, r)|^2 d^3r$$
where
$$E(t, r) = -\nabla\Phi(t, r) - \partial_t A(t, r), \quad B(t, r) = curl A(t, r)$$
The Maxwell equations for $E, B$ are written down taking into account the quantum current density and charge density associated with the charges of the $N$ particles and their joint density operator $\rho(t)$. If $\rho_1(t, r, r')$ denotes the position space representation of the marginal density for one particle, then we know by analogy with the expression for the quantum current and charge density in a pure state $\psi(t, r)$,
$$J(t, r) = (i/2m)(\psi(t, r)^* \nabla\psi(t, r) - \psi(t, r) \nabla\psi(t, r)^*), \quad \sigma(t, r) = \psi(t, r)^* \psi(t, r)$$
that the same quantities in the mixed state $\rho_1$ are given by
$$J(t, r) = (i/2m)[\nabla_1 \rho_1(t, r, r') - \nabla_2 \rho_1(t, r, r')]|_{r' = r}, \quad \sigma(t, r) = \rho_1(t, r, r)$$
Note that
$$\rho_1(t) = Tr_{23...N} \, \rho_{123...N}(t)$$
These expressions for the current and charge densities are to be substituted into the Maxwell equations
$$curl E(t, r) = -\partial_t B(t, r), \quad curl B(t, r) = \mu J(t, r) + \mu\epsilon \partial_t E(t, r),$$
$$div B(t, r) = 0, \quad div E(t, r) = \sigma(t, r)/\epsilon$$
This model can be used to describe a quantum plasma within a quantum cavity resonator having a quantum electromagnetic field within it. The solutions for the quantum electromagnetic field will be given by a sum of two terms. The first term is the free field solution as conventionally described in quantum electrodynamics in terms of the photon creation and annihilation operators. This part is the solution to the homogeneous (ie, source free) part of the Maxwell equations. The second term is the particular solution of the Maxwell equations that is linear in the current and charge densities. This expression for the electromagnetic field operators is to be substituted into the quantum Boltzmann equation for $\rho_1(t, r, r')$ in order to get an appropriate description of the plasma.
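The marginal density $\rho_1 = Tr_{23...N} \rho_{123...N}$ is a partial trace. As a finite dimensional illustration, the sketch below traces out the second factor of a two-particle density matrix; the restriction to two particles with finite dimensional state spaces, the index convention, and the function names are all assumptions of this example.

```python
import math

def outer(psi):
    """Density matrix |psi><psi| of a pure state given as a list of amplitudes."""
    return [[x * complex(y).conjugate() for y in psi] for x in psi]

def partial_trace_second(rho, d1, d2):
    """Trace out the second factor of a (d1*d2) x (d1*d2) density matrix.

    Rows and columns of rho are indexed by i*d2 + a, where i labels the
    first particle and a labels the second particle.
    """
    return [
        [sum(rho[i * d2 + a][j * d2 + a] for a in range(d2)) for j in range(d1)]
        for i in range(d1)
    ]

# Hypothetical two-level example: the entangled state (|00> + |11>)/sqrt(2).
s = 1 / math.sqrt(2)
bell = [complex(s), 0j, 0j, complex(s)]
rho1_bell = partial_trace_second(outer(bell), 2, 2)

# Product state |0>|1>: the marginal is the pure state |0><0|.
rho1_product = partial_trace_second(outer([0j, 1 + 0j, 0j, 0j]), 2, 2)
```

For the entangled state the marginal is the maximally mixed state, which shows that $\rho_1$ is in general a mixed state even when $\rho_{12}$ is pure.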
5.14 Device physics in a semiconductor using the classical Boltzmann kinetic transport equation

Study project. Study the effect of small electric and magnetic field perturbations on the solution to the kinetic transport equation using first order perturbation theory and hence evaluate approximately the current density produced by such field fluctuations. Use this to describe the photocurrent in a device when radiation falls upon it.
5.15 A. Describing the value of a point charge and its location in space in terms of the electrostatic potential generated by it. B. Rewriting Dirac's relativistic wave equation for the electron interacting with a quantum electromagnetic field and the classical nuclear field without explicitly bringing in the electronic charge

Let the point charge be $Q$ and let its location be $r_0$. The potential produced by it is
$$V(r) = Q/4\pi\epsilon|r - r_0|$$
Thus,
$$\nabla^2 V(r) = -Q\delta(r - r_0)/\epsilon$$
We can thus recover $Q$ and $r_0$ from $V(.)$ using the formula
$$\int f(r) \nabla^2 V(r) d^3r = (-Q/\epsilon) f(r_0)$$
for any measurable function $f$ having compact support. In particular, let $g(r)$ be another function. Then,
$$\frac{\int f(r) \nabla^2 V(r) d^3r}{\int g(r) \nabla^2 V(r) d^3r} = f(r_0)/g(r_0)$$
Therefore, if $f, g$ are functions that are zero outside a compact subset $K$ of $R^3$ and are such that $f(x, y, z)/g(x, y, z) = x$, then
$$x_0 = \frac{\int f(r) \nabla^2 V(r) d^3r}{\int g(r) \nabla^2 V(r) d^3r}$$
Likewise $y_0, z_0$ can be recovered from $V(.)$. In this way, we can recover $r_0 = (x_0, y_0, z_0)$ from $V(.)$. Then, $Q$ is also determined using
$$Q = (-\epsilon/f(r_0)) \int f(r) \nabla^2 V(r) d^3r$$
Now consider the problem of determining the point charges and their locations given their number from the electrostatic potential generated by them. Let these charges be Q1 , ..., Qn and let their locations be r1 , ..., rn . If V (r) is the electrostatic potential generated by them, then Poisson’s equation gives ��2 V (r) =
n �
k=1
Qk δ(r − rk )
and hence, if f (r) is a bounded measurable function, we have that n �
Qk f (rk ) = �
k=1
�
f (r)�2 V (r)d3 r
Now choose the functions f1 (r) = xm , f2 (r) = y m , f3 (r) = z m , m = 0, 1, 2, ... and derive from the above, the following system of equations n �
Qk xkm =
�
2 3 xm k � V (r)d r,
Qk ykm
=
�
ykm �2 V (r)d3 r,
Qk zkm =
�
zkm �2 V (r)d3 r
k=1 n �
k=1 n �
k=1
for $m = 0, 1, 2, ..., N-1$. Thus we get a system of $3N$ equations which can be solved for $Q_k, x_k, y_k, z_k$, $k = 1, 2, ..., n$, or a least squares solution can be obtained, provided that $N \geq 4n$. In this way a finite discrete charge distribution in space can be completely determined from the potential field. We could also do this using measurements of the electric field only, using Gauss' law:
$$\epsilon\,\mathrm{div}E(r) = \sum_{k=1}^{n} Q_k\delta(r - r_k)$$
Thus,
$$\epsilon\int f(r)\,\mathrm{div}E(r)\,d^3r = \sum_{k=1}^{n} Q_k f(r_k)$$
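As an illustration of how the power-moment equations $\sum_k Q_k x_k^m = s_m$ might be solved, here is a minimal sketch for the $x$-coordinates alone. It uses a Prony-type method (a Hankel system followed by polynomial root finding), which is an assumption of this example and not a method stated in the text. The function name `charges_from_moments` and the test values are hypothetical, and the moments are synthesized rather than computed from a measured potential.

```python
import numpy as np

def charges_from_moments(s, n):
    """Recover Q_k, x_k from s[m] = sum_k Q_k * x_k**m, m = 0..2n-1.

    Assumes the x_k are distinct and the Q_k are nonzero.
    """
    s = np.asarray(s, dtype=float)
    # The monic polynomial p(x) = x^n + sum_j c_j x^j with roots x_k satisfies
    # s[m+n] + sum_j c_j s[m+j] = 0 for m = 0..n-1 (a Hankel system in c).
    H = np.array([[s[m + j] for j in range(n)] for m in range(n)])
    c = np.linalg.solve(H, -s[n:2 * n])
    # np.roots expects the highest-degree coefficient first.
    x = np.sort(np.roots(np.concatenate(([1.0], c[::-1]))).real)
    # With the x_k known, the first n moment equations are linear in Q_k.
    Vm = np.vander(x, N=n, increasing=True).T   # Vm[m, k] = x_k**m
    Q = np.linalg.solve(Vm, s[:n])
    return Q, x

# Hypothetical test data: three charges on the x-axis.
Q_true = np.array([1.0, -2.0, 0.5])
x_true = np.array([-0.7, 0.2, 1.1])
moments = [np.sum(Q_true * x_true ** m) for m in range(6)]

Q_est, x_est = charges_from_moments(moments, 3)
print(Q_est, x_est)
```

The same routine applies to the $y$ and $z$ moments. High-order moments are numerically ill-conditioned, so for noisy data a least squares fit over more moments, as the text suggests, would likely be preferable.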
Another way to express the point charge distribution as a functional of the potential is to assume that the distance between the locations of any two charges in the set is greater than $2\delta$. Let $Q_1, ..., Q_n$ denote the point charges with locations $r_1, ..., r_n$, so that the charge density is
$$\rho(r) = \sum_{k=1}^{n} Q_k\delta(r - r_k)$$
with
$$|r_k - r_j| > 2\delta \quad \forall k \neq j$$
Then let $B(\delta)$ denote the open ball in $R^3$ with the origin as centre and radius $\delta$:
$$B(\delta) = \{r \in R^3 : |r| < \delta\}$$
Then, given an arbitrary point $r \in R^3$, the ball $B(r, \delta) = r + B(\delta)$ can contain at most one of the $r_k$'s. It follows that for any $r$,
$$-\epsilon\int_{B(r,\delta)} \nabla^2 V(r')\,d^3r'$$
equals either zero or $Q_k$ for some $k = 1, 2, ..., n$. It equals $Q_k$ iff $|r - r_k| < \delta$. In other words, by moving the centre of the ball $B(\delta)$ to different points, we get a result equal either to zero or to $Q_k$, and from the location of the centre of the ball we can determine $r_k$ up to an accuracy of $\delta$. This result can also be stated as
$$\lim_{r \to r_k,\ \delta \to 0}\int_{B(r,\delta)} (-\epsilon\nabla^2 V(r'))\,d^3r' = Q_k$$

Now consider an electron of charge $-e$ interacting with the atomic nucleus of charge $Ze$. Let $A_q(t, r)$ denote the free quantum electromagnetic field in space-time and let $\psi(t, r)$ denote the second quantized Dirac wave function of the electron. It satisfies the equation
$$[\gamma^\mu(i\partial_\mu + eA_{q\mu}(t, r) + eA_{N\mu}(r)) - m]\psi(t, r) = 0$$
where
$$A_{N0}(r) = -Ze^2/|r|, \quad A_{Nj}(r) = 0,\ j = 1, 2, 3$$
is the classical nuclear potential. According to our theory, $A_q$, the free quantum electromagnetic field, does not have any singularity, and hence if $\delta$ is sufficiently small, we have
$$\epsilon\int_{B(\delta)} \nabla^2(A_{q\mu}(t, r) + A_{N\mu}(r))\,d^3r = -Ze\,\delta_{\mu,0}$$
This equation then determines the electron charge $-e$ from the total electromagnetic potential. One way to write down Dirac's equation without explicitly introducing the electronic charge $-e$ is then to write it as
$$\left[\gamma^\mu\left(i\partial_\mu - Z^{-1}\left[\int_{B(\delta)} \nabla^2 A_0(t, r')\,d^3r'\right]A_\mu(t, r)\right) - m\right]\psi(t, r) = 0 \qquad (1)$$
where
$$A_\mu = A_{q\mu} + A_{N\mu}$$
is the total electromagnetic four-potential, comprising the free field part described in terms of photon creation and annihilation operators and the classical nuclear part.

Remark: It should be noted that in our formalism, the electron exists only because it is a part of the atom having an atomic nucleus. The existence of the electron without a corresponding nucleus is not meaningful.

Now the total electromagnetic field $A_\mu$ in the region $|r| > 0$, i.e., in $R^3 - \{0\}$, satisfies the wave equation
$$\nabla^2 A_\mu - \frac{1}{c^2}\partial_t^2 A_\mu = 0, \quad \partial^\mu A_\mu = 0 \qquad (2)$$
The question is: "Is it possible to derive all the consequences of conventional quantum field theory from equations (1) and (2) only?" To answer this question, let us assume first that the total electromagnetic field $A_\mu$ is given and we have to solve (1). Perturbatively solving it gives, to a first order approximation in the interaction term,
$$\psi = \psi_0 + \psi_1, \quad [i\gamma^\mu\partial_\mu - m]\psi_0 = 0,$$
$$[i\gamma^\mu\partial_\mu - m]\psi_1 = Z^{-1}\left[\int_{B(\delta)} \nabla^2 A_0(t, r')\,d^3r'\right]A_\mu(t, r)\gamma^\mu\psi_0$$
$\psi_0$ therefore represents the free Dirac field, expressible as a superposition of electron and positron creation and annihilation operators. The first order perturbation $\psi_1$ to the free Dirac field is then, with $x' = (t', r')$,
$$\psi_1(x) = Z^{-1}\int S(x - x')\left[\int_{B(\delta)} \nabla^2 A_0(t', r'')\,d^3r''\right]A_\mu(x')\gamma^\mu\psi_0(x')\,d^4x' \qquad (3)$$
We now use (3) to compute radiative corrections to the electron propagator in terms of the photon propagator:
$$\langle T(\psi(x)\psi(x')^*)\rangle \approx \langle T(\psi_0(x)\psi_0(x')^*)\rangle + \langle T(\psi_0(x)\psi_1(x')^*)\rangle + \langle T(\psi_1(x)\psi_0(x')^*)\rangle$$
where now
$$\langle T(\psi_0(x)\psi_0(x')^*)\rangle = S_0(x - x')$$
is the free Dirac field electron propagator, known to be given by
$$S_0(x - x') = K\int (\gamma^\mu p_\mu - m)^{-1}\exp(ip\cdot(x - x'))\,d^4p$$

Now come to the time-varying case for calculating charges and their velocities from the singularities in the electromagnetic field. A point charge $Q$ moving along the trajectory $R(t)$, $t \geq 0$, with nonrelativistic velocity $V(t)$ generates an electromagnetic field given approximately by
$$E(t, r) = \frac{Q(r - R(t))}{4\pi\epsilon|r - R(t)|^3}, \quad B(t, r) = \frac{\mu Q V(t)\times(r - R(t))}{4\pi|r - R(t)|^3}$$
Equivalently in terms of the Maxwell equations,
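$$\mathrm{div}E(t, r) = Q\delta(r - R(t))/\epsilon,$$
$$\mathrm{curl}B(t, r) = \mu Q V(t)\delta(r - R(t)) + \mu\epsilon\,\partial_t E(t, r)$$
and hence we deduce that for a smooth function $f(r)$ of space coordinates,
$$\int f(r)\,\mathrm{div}E(t, r)\,d^3r = (Q/\epsilon)f(R(t)),$$
$$\int f(r)\,\mathrm{curl}B(t, r)\,d^3r - \mu\epsilon\int f(r)\,\partial_t E(t, r)\,d^3r = \mu Q V(t)f(R(t))$$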
and by selecting $f$ appropriately, it is clear from these two equations how to determine the charge and its trajectory, including its velocity at each time, from the total electromagnetic field in space. We apply this idea to quantize the electromagnetic field and the Dirac field when the nucleus having charge $Q = Ze$ that binds the electron moves along a trajectory $R(t)$. The magnetic vector potential generated by such a nucleus is
$$A_N(t, r) = \frac{\mu Q V(t)}{4\pi|r - R(t)|},$$
and the electric scalar potential is given by
$$A_{N0}(t, r) = \frac{Q}{4\pi\epsilon|r - R(t)|}$$
in the nonrelativistic approximation. Equivalently, in the nonrelativistic approximation, we have
$$\nabla^2 A_N(t, r) = -\mu Q V(t)\delta(r - R(t)), \quad \nabla^2 A_{N0}(t, r) = -Q\delta(r - R(t))/\epsilon$$
so that for any test function $f(r)$, we have
$$\int f(r)\nabla^2 A_N(t, r)\,d^3r = -\mu Q V(t)f(R(t)),$$
$$\int f(r)\nabla^2 A_{N0}(t, r)\,d^3r = -Qf(R(t))/\epsilon$$
By taking the ratio of these two equations, we obtain the charge velocity vector
$$V(t) = (\mu\epsilon)^{-1}\frac{\int f(r)\nabla^2 A_N(t, r)\,d^3r}{\int f(r)\nabla^2 A_{N0}(t, r)\,d^3r}$$
The charge $Q$ can be calculated in terms of the field $A_{N0}$ by integrating the Laplacian applied to it over a small neighbourhood of its position $R(t)$. However, to do so, we first need to estimate $R(t)$ from the field. That can be done by taking $f(r) = |r|$, giving thereby
$$\int |r|\,\nabla^2 A_{N0}(t, r)\,d^3r = -Q|R(t)|/\epsilon,$$
and then taking $f(r) = |r|^2$, we get
$$\int |r|^2\nabla^2 A_{N0}(t, r)\,d^3r = -Q|R(t)|^2/\epsilon$$
Eliminating $|R(t)|$ between these two equations gives us the charge as
$$Q = -\epsilon\frac{\left(\int |r|\,\nabla^2 A_{N0}(t, r)\,d^3r\right)^2}{\int |r|^2\nabla^2 A_{N0}(t, r)\,d^3r}$$
$R(t)$ may now be calculated using
$$\int r\,\nabla^2 A_{N0}(t, r)\,d^3r = -QR(t)/\epsilon$$
and $V(t)$ using
$$\int \nabla^2 A_N(t, r)\,d^3r = -\mu Q V(t)$$
If we assume that the quantum electromagnetic field $A_{q\mu}$ fluctuates rapidly in space, then its spatial average over any small open ball of finite radius will be negligible, and hence we can, to a good degree of approximation, write
$$\int f(r)\left(\frac{\int_{B(r,\delta)} \nabla^2 A_0(t, r')\,d^3r'}{V(B(\delta))}\right)d^3r = -Qf(R(t))/\epsilon,$$
$$\int f(r)\left(\frac{\int_{B(r,\delta)} \nabla^2 A(t, r')\,d^3r'}{V(B(\delta))}\right)d^3r = -\mu Q V(t)f(R(t))$$
where
$$A = A_N + A_q, \quad A_0 = A_{N0} + A_{q0}$$
or equivalently,
$$A_\mu = A_{N\mu} + A_{q\mu}$$
These are, respectively, the total magnetic vector potential and the total electric scalar potential due to the nucleus and the quantum field. Note that this works because the nuclear potential has a singularity at the origin and at other spatial points it varies slowly in space, while the quantum field is smooth, thereby ensuring that
$$\frac{\int_{B(r,\delta)} A_{q\mu}(t, r')\,d^3r'}{V(B(\delta))} \approx 0$$
and since
$$\frac{\int_{B(r,\delta)} A_{N\mu}(t, r')\,d^3r'}{V(B(\delta))} \approx A_{N\mu}(t, r)$$
for small positive $\delta$, we have
$$\frac{\int_{B(r,\delta)} A_\mu(t, r')\,d^3r'}{V(B(\delta))} \approx A_{N\mu}(t, r)$$
Dirac's equation for the electron wave function is now expressible entirely in terms of the total electromagnetic field without even bringing in the electronic charge parameter. Formally, this equation is therefore expressible in a form analogous to (1).

Relativistic considerations: Suppose that the nucleus is moving with relativistic velocities. Then we replace the Laplacian operator by the wave operator in the above equations, thereby obtaining
$$\Box A_N(t, r) = -\mu Q V(t)\delta(r - R(t)), \quad \Box A_{N0}(t, r) = -Q\delta(r - R(t))/\epsilon$$
where
$$\Box = \nabla^2 - \mu\epsilon\,\partial_t^2$$
It is then clear how all the parameters of the moving nucleus, namely its charge, position trajectory and velocity, can be computed as functions of weighted integrals of the total electromagnetic field. Specifically, we find that
$$\int f(r)\left(\frac{\int_{B(r,\delta)} \Box A(t, r')\,d^3r'}{V(B(\delta))}\right)d^3r = -\mu Q V(t)f(R(t)),$$
$$\int f(r)\left(\frac{\int_{B(r,\delta)} \Box A_0(t, r')\,d^3r'}{V(B(\delta))}\right)d^3r = -Qf(R(t))/\epsilon$$
The Dirac equation is now of the form
$$[\gamma^\mu(i\partial_\mu + F(A_\nu(t, r), r \in R^3)A_\mu(t, r)) - m]\psi(t, r) = 0$$
where $e = F(A_\nu(t, r), r \in R^3)$ is the electronic charge value determined as above as a spatial functional of the electromagnetic field. The electromagnetic field, on the other hand, satisfies Maxwell's equations in the form
$$\Box A_\mu(t, r) = 0, \quad r \neq R(t)$$
The whole point of this exercise is that by measuring data about the Dirac wave function, or equivalently the Dirac four-current density, we can in principle calculate the electromagnetic field $A_\mu$ and hence, from the singularity theory mentioned above, calculate the nuclear charge as well as its trajectory.
5.16 Calculating the masses of N gravitating particles and their positions and their trajectories from measurement of the gravitational potential distribution in spacetime using the Newtonian theory

Let $m_1, ..., m_N$ denote the masses of $N$ point particles moving under their mutual gravitation along trajectories $r_1(t), ..., r_N(t)$. The Newtonian equations of motion are
$$r_j''(t) = \sum_{k=1, k\neq j}^{N} Gm_k(r_k - r_j)/|r_k - r_j|^3, \quad j = 1, 2, ..., N$$
The gravitational potential generated by these masses is then
$$\Phi(t, r) = -\sum_{j=1}^{N} Gm_j/|r - r_j(t)|$$
and this potential satisfies Poisson's equation
$$\nabla^2\Phi(t, r) = 4\pi G\sum_{j=1}^{N} m_j\delta(r - r_j(t))$$
Thus, we get for a test function $f(r)$,
$$\int f(r)\nabla^2\Phi(t, r)\,d^3r = 4\pi G\sum_{j=1}^{N} m_j f(r_j(t))$$
By choosing test functions $f_1, ..., f_N$ appropriately, we get the following linear system of equations for the masses given their positions:
$$\sum_{j=1}^{N} f_k(r_j(t))m_j = (4\pi G)^{-1}\int f_k(r)\nabla^2\Phi(t, r)\,d^3r, \quad k = 1, 2, ..., N$$
Define the $N \times N$ matrix-valued function of time
$$A(t) = ((f_k(r_j(t))))_{1\leq k,j\leq N}$$
and let
$$B(t) = A(t)^{-1} = ((b_{ij}(t)))$$
Then,
$$m_k = (4\pi G)^{-1}\sum_{j=1}^{N} b_{kj}(t)\int f_j(r)\nabla^2\Phi(t, r)\,d^3r$$
This formula will work even if the masses are functions of time. Now the $b_{kj}(t)$'s are functions of the $r_j(t)$'s, so the $r_j(t)$'s also have to be estimated from the potential distribution. Choose $N$ vectors $\xi_1, ..., \xi_N$ in $R^3$. Then we have
$$\int \langle\xi_k, r\rangle\nabla^2\Phi(t, r)\,d^3r = 4\pi G\sum_{j=1}^{N} m_j\langle\xi_k, r_j\rangle, \quad k = 1, 2, ..., N$$
This is a system of $N$ linear equations for the $N$ masses, and defining the matrix
$$((c_{kj}))_{1\leq k,j\leq N} = C(r_1, ..., r_N) = 4\pi G((\langle\xi_k, r_j\rangle))_{1\leq k,j\leq N}$$
gives us
$$m_k = \sum_{j=1}^{N} e_{kj}(r_1, ..., r_N)\int \langle\xi_j, r\rangle\nabla^2\Phi(t, r)\,d^3r, \quad k = 1, 2, ..., N$$
where
$$((e_{kj})) = C^{-1}$$
Thus we obtain the following $N$ equations for $r_1, ..., r_N$:
$$(4\pi G)^{-1}\sum_{j=1}^{N} b_{kj}(t)\int f_j(r)\nabla^2\Phi(t, r)\,d^3r = \sum_{j=1}^{N} e_{kj}(r_1, ..., r_N)\int \langle\xi_j, r\rangle\nabla^2\Phi(t, r)\,d^3r, \quad k = 1, 2, ..., N$$
and by varying the vectors $\xi_k$ in these equations, we can derive at least $3N$ equations for the $N$ vectors $r_1, ..., r_N$, which can in principle be solved.

Now we address the same problem in Einsteinian gravity. The energy-momentum tensor for $N$ point particles of masses $m_1, ..., m_N$ is given by
$$T^{\mu\nu}(x) = \sum_k m_k(-g(x))^{-1/2}\delta^3(x - x_k(t))(dx_k^\mu(t)/dt)(dx_k^\nu/d\tau_k)$$
where $\tau_k$ is the proper time for the $k$-th particle. It is given by
$$d\tau_k^2 = g_{\mu\nu}(x_k(t))\,dx_k^\mu(t)\,dx_k^\nu(t)$$
where $x_k^0(t) = t$ is the universal coordinate time. The Einstein field equations corresponding to this energy-momentum tensor are
$$G_{\mu\nu} = R_{\mu\nu} - (1/2)Rg_{\mu\nu} = -KT_{\mu\nu}, \quad K = 8\pi G$$
We find that
$$\int T^{\mu\nu}(t, r)\sqrt{-g(t, r)}\,f(t, r)\,dt\,d^3r = \int T^{\mu\nu}(x)\sqrt{-g(x)}\,f(x)\,d^4x = \sum_k m_k\int f(x_k(t))v_k^\mu(t)v_k^\nu(t)\,d\tau_k(t)$$
where
$$v_k^\mu(t) = dx_k^\mu/d\tau_k$$
is the four-velocity of the $k$-th particle. From this equation, we can infer, by choosing different functions $f: R^4 \to R$, the particle trajectories as functions of coordinate time as well as their masses.

Application of the same ideas to superstring theory: A superstring comprising a Bosonic and a Fermionic part is given by
$$X^\mu(\tau, \sigma) = x^\mu + p^\mu\tau - i\sum_{n\neq 0}(\alpha^\mu(n)/n)\exp(in(\tau - \sigma)) - i\sum_{n\neq 0}(\tilde{\alpha}^\mu(n)/n)\exp(in(\tau + \sigma))$$
$$\psi^\mu(\tau, \sigma) = \psi_+^\mu(\tau, \sigma) + \psi_-^\mu(\tau, \sigma) = \sum_n S_n^\mu\exp(in(\tau - \sigma)) + \sum_n \tilde{S}_n^\mu\exp(in(\tau + \sigma))$$
since these satisfy the string field equations
$$\partial_+\partial_- X^\mu = 0, \quad \partial_-\psi_+^\mu = 0, \quad \partial_+\psi_-^\mu = 0$$
where
$$\partial_+ = \partial_\tau + \partial_\sigma, \quad \partial_- = \partial_\tau - \partial_\sigma$$
so that
$$\partial_+\partial_- = \partial_\tau^2 - \partial_\sigma^2$$
Note that the Lagrangian for the Bosonic part of the string is
$$L_B = (1/2)\partial_+ X^\mu\,\partial_- X_\mu$$
while that of the Fermionic part is
$$L_F = -i\psi_+^T\partial_-\psi_+ - i\psi_-^T\partial_+\psi_-$$
Note that $\psi_+^T\partial_-\psi_+$ is an abbreviation for $\psi_+^\mu\partial_-\psi_{+\mu}$, and likewise for the other term. If $\psi_+$ and $\psi_-$ denote the canonical position fields for the Fermionic component of the superstring, then the corresponding canonical momenta are
$$\pi_+ = \partial L_F/\partial(\partial_\tau\psi_+) = -i\psi_+, \quad \pi_- = \partial L_F/\partial(\partial_\tau\psi_-) = -i\psi_-$$
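so that the canonical anticommutation relations are
$$[\psi_+(\tau, \sigma), \psi_+(\tau, \sigma')]_+ = -\delta(\sigma - \sigma'), \quad [\psi_-(\tau, \sigma), \psi_-(\tau, \sigma')]_+ = -\delta(\sigma - \sigma')$$
These equations give
$$[S_n^\mu, S_m^\nu]_+ = \eta^{\mu\nu}\delta(n + m), \quad [\tilde{S}_n^\mu, \tilde{S}_m^\nu]_+ = \eta^{\mu\nu}\delta(n + m)$$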
To obtain the Noether conserved currents for the Fermionic sector, we first observe that $L_F$ is invariant under the infinitesimal transformations
$$\delta\psi_+ = \epsilon\,\psi_-, \quad \delta\psi_- = -\epsilon\,\psi_+$$
where $\epsilon$ is an infinitesimal parameter. The first conserved Noether current corresponding to this symmetry is then given by
$$J^- = (\partial L_F/\partial(\partial_-\psi_+))\delta\psi_+ + (\partial L_F/\partial(\partial_-\psi_-))\delta\psi_- = \psi_+^T\psi_+$$
which obeys the conservation law
$$\partial_- J^- = 0$$
when the field equations are satisfied. Likewise, the second conserved current corresponding to this symmetry is
$$J^+ = (\partial L_F/\partial(\partial_+\psi_-))\delta\psi_- = \psi_-^T\psi_-$$
which satisfies the conservation law
$$\partial_+ J^+ = 0$$
when the field equations are satisfied. The problem is: can we calculate $p^\mu$, the translational D-momentum of the string, from measurements on the string observables?

Exercise: Evaluate the Fourier series components of the energy-momentum tensor of a superstring. Also evaluate the Fourier series components of the supercurrent of a superstring. Show that the supercurrent field determines the Fermionic supersymmetry generators.

Acknowledgements: I am grateful to Prof. Hans Van Leunen, Prof. Andre Michaud and Prof. Steven Arthur Langford for encouraging me to work on this problem and apply the method of determining all the charges and their locations from electromagnetic field measurements to Dirac's relativistic wave equation, by replacing the electronic charge which appears in this equation with functionals of the quantum electromagnetic field.
5.17 The quantum Boltzmann equation for a plasma
Suppose that the joint density matrix of N particles is ρ(123...N ). It satisfies the Schrodinger equation i∂t ρt (12...N ) = [
N �
�
Ha +
a=1
Vab , ρt (12..N )]
1≤a