NOVOSIBIRSK STATE UNIVERSITY
Physics department
Valeriy G. SERBO
LECTURES ON ELEMENTARY PARTICLE PHYSICS
Novosibirsk 2018
Contents

Notations
Acknowledgements
§ 1. Introduction: elementary particles and their interactions
  1.1. Particles
  1.2. Interactions
  1.3. Three generations of leptons and quarks
  1.4. Quarks and hadrons
  1.5. Notion of quantum field theory
§ 2. Quantization of the electromagnetic field
  2.1. Electromagnetic field as a set of oscillators
  2.2. Quantization of the field
  2.3. Creation and annihilation of field quanta
  2.4. The Heisenberg representation versus the Schrödinger representation
§ 3. The Lagrange approach in the field theory
  3.1. The Lagrangian equations
  3.2. Symmetry and the conservation laws
§ 4. Real scalar field
§ 5. Complex scalar field
§ 6. C, P, and T transformations for the complex scalar field
§ 7. C, P, and T transformations for the electromagnetic field
§ 8. The spinor Dirac field
  8.1. Three-dimensional spinors
  8.2. Four-dimensional spinors
  8.3. The Dirac equation
  8.4. C, P, and T transformations for the Dirac spinor field
  8.5. Hamiltonian form of the Dirac equation
  8.6. Plane waves solution of the Dirac equation
  8.7. Quantization of the Dirac spinor field
§ 9. Interaction representation
§ 10. Invariant perturbation theory
§ 11. Probability amplitudes and transition probability
  11.1. The scattering amplitude
  11.2. The decay width
  11.3. Cross section
§ 12. The first order of the perturbation theory
  12.1. Interaction $g\,\hat\varphi^{+}\hat\varphi\,\hat\Phi$
  12.2. Interaction $g\,\hat{\bar\Psi}\hat\Psi\,\hat\Phi$. Decay of the Higgs boson
  12.3. Production of the Higgs boson in $e^+e^-$, $\mu^+\mu^-$ and $\gamma\gamma$ collisions
  12.4. QED interaction $e\,\hat{\bar\Psi}\gamma^\mu\hat\Psi\,\hat A_\mu$. The Feynman rules in QED
§ 13. The second order of the perturbation theory with interaction $g\,\hat\varphi^{+}\hat\varphi\,\hat\Phi$. The Mandelstam variables. Propagator of the scalar particle
  13.1. The Mandelstam variables
  13.2. Scattering of charged particles
  13.3. Propagator of the scalar particle
  13.4. Processes $\pi^0\pi^- \to \pi^0\pi^-$ and $\pi^+\pi^- \to \pi^0\pi^0$
§ 14. The second order of the perturbation theory in QED. The photon propagator
  14.1. Scattering of electrons
  14.2. The photon propagator
  14.3. The Feynman diagrams and the Coulomb law
  14.4. The annihilation processes $e^+e^- \to \mu^+\mu^-$ and $e^+e^- \to \tau^+\tau^-$
  14.5. Processes $e^+e^- \to q\bar q$ and $e^+e^- \to$ hadrons at high energies
  14.6. Process $e\mu \to e\mu$ and crossing symmetry
§ 15. The second order of the perturbation theory in QED. The electron propagator
  15.1. The $\gamma e$ scattering
  15.2. The electron propagator
  15.3. The Compton effect and its applications
  15.4. Main characteristics of $e^+e^- \to \gamma\gamma$ and $\gamma\gamma \to e^+e^-$ processes at high energies
Bibliography
Notations

Formulas are numbered with two numbers: for example, (3.7) denotes formula (7) of § 3. References to formulas within the same section omit the section number.

Constants:
$\hbar = 1.055 \cdot 10^{-27}$ erg·s (the Planck constant);
$c = 2.998 \cdot 10^{10}$ cm/s (speed of light in vacuum);
$|e| = 4.803 \cdot 10^{-10}$ esu (electron charge magnitude);
$\alpha = e^2/(\hbar c) = 1/137.04$ (fine-structure constant);
1 eV $= 1.602 \cdot 10^{-12}$ erg $= 1.602 \cdot 10^{-19}$ J.

Units: In the initial sections § 1 and § 2, the absolute Gaussian system of units is used. In the other sections, relativistic units with $c = 1$, $\hbar = 1$ are used. In this system the energy, momentum, frequency, (length)$^{-1}$ and (time)$^{-1}$ have the same dimension; in particular,
$m_e = 0.511$ MeV (electron mass);
$m_p = 0.938$ GeV (proton mass);
$1/m_e = 3.862 \cdot 10^{-11}$ cm (the reduced Compton wavelength of the electron);
$r_e = \alpha/m_e = 2.818 \cdot 10^{-13}$ cm (classical electron radius);
$1/(1\ \text{GeV}) = 1.97 \cdot 10^{-14}$ cm.

Four-vectors: We agree to sum over any repeated index and to omit the summation sign, i.e. the expression $A^\mu B_\mu$ means
$$A^\mu B_\mu \equiv A_0 B_0 - A_x B_x - A_y B_y - A_z B_z = A_0 B_0 - \mathbf{A}\mathbf{B}.$$
We will often use the shortened notation $AB \equiv A^\mu B_\mu$. The radius four-vector is
$$x^\mu = (t, \mathbf{r}), \qquad x_\mu = (t, -\mathbf{r}),$$
and the derivatives are
$$\frac{\partial}{\partial x^\mu} = \left(\frac{\partial}{\partial t}, +\nabla\right) \equiv \partial_\mu, \qquad \frac{\partial}{\partial x_\mu} = \left(\frac{\partial}{\partial t}, -\nabla\right) \equiv \partial^\mu.$$
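The lengths quoted above for the relativistic units can be checked with a few lines of arithmetic. The following is only a minimal sketch: the value of $\hbar c$ in MeV·cm and the function name are assumptions introduced here for illustration, not part of the lecture notes.

```python
# Sketch: converting inverse energies (units with hbar = c = 1) to centimetres.
# HBAR_C_MEV_CM is an assumed input: hbar*c = 197.327 MeV*fm = 1.97327e-11 MeV*cm.
HBAR_C_MEV_CM = 1.97327e-11
ALPHA = 1 / 137.04      # fine-structure constant, as quoted in the Notations
M_E_MEV = 0.511         # electron mass in MeV, as quoted in the Notations


def inverse_energy_to_cm(energy_mev: float) -> float:
    """Length in cm that corresponds to 1/E when hbar = c = 1."""
    return HBAR_C_MEV_CM / energy_mev


compton_cm = inverse_energy_to_cm(M_E_MEV)       # 1/m_e
classical_radius_cm = ALPHA * compton_cm         # r_e = alpha/m_e
gev_inverse_cm = inverse_energy_to_cm(1000.0)    # 1/(1 GeV)

print(f"1/m_e     = {compton_cm:.3e} cm")
print(f"r_e       = {classical_radius_cm:.3e} cm")
print(f"1/(1 GeV) = {gev_inverse_cm:.3e} cm")
```

The same helper can be reused for any mass expressed in MeV, which is how the interaction radii of § 1.2 are estimated.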
Acknowledgements

I am very grateful to Vitaly Vorobyev, Irina Sokolova and my daughter Olga Karpushina for their invaluable help in the preparation of this manuscript.
§ 1. Introduction: elementary particles and their interactions

We will study the science referred to as Elementary Particle Physics, although this course will be mostly an introduction to it. The course relates mainly to issues that are explained in books on quantum field theory. Our ultimate goal is to obtain a basic understanding of elementary particles and their interactions. At the end of this course, you will also be able to draw and calculate simple Feynman diagrams. The classics said that an illiterate person is beyond politics. The situation in modern elementary particle physics is such that a person who does not know the language of Feynman diagrams is also largely outside particle physics. This course is not designed as “pure theoretical”; rather, it aims to provide practical knowledge to those directly involved with the Budker Institute of Nuclear Physics. So, such is our program.

Now about the textbooks. The basic book will be Quantum Electrodynamics by Berestetskii, Lifshitz and Pitaevskii [1]. It is the fourth volume of Theoretical Physics by Landau and Lifshitz. In our course we use notations and normalizations mainly from this textbook. But, of course, there are many other useful books, including the books by Bogolyubov and Shirkov [2, 3] and by Peskin and Schroeder [4], the latter being perhaps for in-depth studies. It is rather funny that there is a thick book, Introduction to the Theory of Quantized Fields by Bogolyubov and Shirkov, and there is a short one that is called not Introduction but Quantum Fields.

The course is serious enough, but do not think that it is beyond your strength. Its difficulty lies just in its unfamiliarity. You are used to the fact that analytical mechanics is a natural continuation of general physics, which particle physics is not. Still, quantum mechanics was also a fundamentally new science to you, and you have already mastered it.
And now we proceed to another science, one you have met very rarely. I mean Relativistic Quantum Field Theory, which presents a double difficulty. On the one hand, there is relativity, which you have to get used to, and on the other hand, there is field theory, which is very different from quantum mechanics. Still, you already know the basic concepts. Therefore, we will act in the following way: we will start with familiar objects and investigate them very thoroughly, so as to eliminate any ambiguity. Then, when we proceed to other, more complex subjects, it will be possible to accelerate the pace. At the initial stage, it is very important not to shelve any questions and to seek immediate clarification if you have any.

Today our main topic is just the Introduction. I am going to present it in bold strokes, i.e. to talk about what particles are, about their main types, their interactions, and the relation to quantum field theory, which will be our main subject. This section will cover a lot of things that are not proved but just declared. In fact, this is a kind of running ahead, so if you have questions of principle, it would be better to postpone them a little. Then the regular course will start. In the Introduction, I will talk about the elementary particles and their interactions, then about quarks and leptons, next about quarks and hadrons, and, finally, about quantum field theory. Those are our small subsections. Once again, this will be a kind of bird’s-eye view.

Well, a few words about the system of units. A relativistic system in which both the Planck constant $\hbar$ and the speed of light $c$ are taken as basic units is a natural system of units. You should get used to this system little by little, so we will introduce it later on, when we proceed to specific calculations. Until then we continue using the system we are accustomed to, i.e. the absolute Gaussian system of units. Nevertheless, from the very beginning we have to work with four dimensions, i.e. the radius vector $x^\mu$ is a set of temporal and spatial coordinates, the zero component $x^0$ being $ct$. In addition to the contravariant vector $x^\mu = (ct, \mathbf{r})$, we will also have the covariant one, which differs in the sign of the spatial part: $x_\mu = (ct, -\mathbf{r})$. Well, you probably know this quite well.
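The relation between contravariant and covariant components, and the contraction $A^\mu B_\mu = A_0 B_0 - \mathbf{A}\mathbf{B}$ from the Notations, can be illustrated numerically. This is a small sketch under stated assumptions: the function names and the sample event are hypothetical, chosen only for illustration.

```python
# Sketch: lowering an index and contracting two four-vectors
# with the metric signature (+, -, -, -) used in these lectures.
C_CM_PER_S = 2.998e10              # speed of light, Gaussian units
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of the metric tensor


def lower_index(a):
    """Covariant components a_mu from contravariant components a^mu."""
    return tuple(g * component for g, component in zip(METRIC, a))


def dot4(a, b):
    """Invariant contraction a^mu b_mu = a^0 b^0 - (a . b)."""
    return sum(x * y for x, y in zip(a, lower_index(b)))


def four_position(t_seconds, r_cm):
    """Contravariant radius four-vector x^mu = (ct, r)."""
    return (C_CM_PER_S * t_seconds, *r_cm)


# Hypothetical event: t = 1e-10 s, r = (1, 2, 2) cm.
x = four_position(1.0e-10, (1.0, 2.0, 2.0))
interval_sq = dot4(x, x)   # (ct)^2 - r^2; negative here, i.e. space-like
print(lower_index(x), interval_sq)
```

In the relativistic units used from § 3 onwards, one would simply set `C_CM_PER_S` to 1 and measure time and length in the same units.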
1.1. Particles

So let’s start talking about particles. First of all, what is a particle? Earlier, anything smaller than an atom used to be referred to as an elementary particle. If so, the number of particles in the table of elementary particles would be a few times bigger than the number of elements in the Mendeleev periodic table. Besides that, each year a few new particles are discovered, most of which are unstable. The most characteristic feature of these particles is that they can be produced and transformed into other particles. Let us compare the photo-effect
$$\gamma + H \to p + e$$
and the β-decay of the neutron
$$n \to p + e + \bar\nu_e.$$
In the first case, the hydrogen atom $H$ is a bound state of $e$ and $p$; in the second case, the $p$, $e$ and $\bar\nu_e$ are produced as a result of the reaction. If we switch to the fundamental level and speak of the fundamental components, the situation becomes somewhat simpler. In this case we have a relatively small set of elementary particles:
• quarks and leptons ($q$ and $l$), spin $J = \frac{1}{2}$;
• gauge vector bosons ($\gamma$, $W^\pm$, $Z^0$, $g$), $J = 1$;
• scalar Higgs boson ($H$), $J = 0$.
1.2. Interactions

It is important to recall some facts about the strength and radius of the interactions. For all interactions, we can roughly write the potential energy as
$$U(r) = \frac{g^2}{r}\, e^{-r/R}$$
with some coupling constant $g$ and the exponential factor $e^{-r/R}$, where $R$ is the interaction radius. The main types of interactions are:

1. Electromagnetic (EM) interaction: the characteristic radius of interaction is $R_{em} \sim \hbar/(m_\gamma c) = \infty$, because $m_\gamma = 0$; the intensity of the interaction is characterized by the dimensionless constant $\alpha = e^2/(\hbar c) \approx 1/137 \ll 1$, which is why perturbation theory can be used here: this is quantum electrodynamics (QED).
Moreover, from the Feynman diagrams we will see that the interaction radius is associated with the mass of the particle carrying this interaction. Well, if we talk about electromagnetic interaction, it is the interaction of charged particles, such as electrons, with an electromagnetic field. These charged particles interact via the electromagnetic field. The quantum of the electromagnetic field has zero mass, and therefore the radius of electromagnetic interaction is infinite. In other words, this interaction has a purely Coulomb form, in which the constant $g$ is the charge value $e$. Thus the purely Coulomb interaction is strictly due to the fact that the photon has no mass, which is experimentally well established, i.e. this mass has a very low experimental upper limit. Electrodynamics is a totally unprecedented science, because some electrodynamic characteristics have been measured with an accuracy of $10^{-11}$, and calculations have been carried out up to the third, fourth, and in some cases even the fifth order in $\alpha$.

2. Gravitational interaction: $R_g \sim \infty$; it is very weak, and on atomic scales it is completely negligible.

Speaking further about gravitational interaction, we should mention that a quantum theory of gravitational interaction is yet to be built. Nevertheless, gravitational interaction is assumed to be carried by gravitons, particles with spin 2 and mass 0. From this viewpoint, the gravitational radius is also infinite. As for the strength, we can try to discuss it using a simple example. Let us consider the interaction of two hydrogen atoms a distance $r$ apart. They are neutral, and therefore the interaction between them will be mainly gravitational. If we want to compare the electromagnetic and gravitational interactions, it may be better to take not the atoms but a pair of protons. The ratio of the gravitational and electromagnetic forces is given by a formula you know quite well. The gravitational force is given by the law of universal gravitation,
$$F_g = \frac{G m_p^2}{r^2},$$
where $m_p$ is the proton mass and $G$ is the gravitational constant. The electromagnetic interaction is given by the Coulomb law; as the proton charges are equal, the corresponding force is
$$F_{em} = \frac{e^2}{r^2}.$$
So, we have obtained a dimensionless ratio
$$\frac{F_g}{F_{em}} = \frac{G m_p^2}{e^2}.$$
If we substitute all the values (let me omit the details), the result will be
$$\frac{F_g}{F_{em}} \sim 10^{-36}.$$
This means that gravitational interaction plays practically no role in processes associated with elementary particles. Of course, a question arises: “In general, where does gravitational interaction matter if electromagnetic interaction is so much stronger than gravity?” The answer is that for some reason, at least in our part of the Universe, matter is mainly electrically neutral. This is the only reason why, when large masses of electrically neutral matter arise, the electromagnetic forces are fully compensated, while the gravitational forces begin to manifest themselves. This is crucial for the life of the planets, the stars, and the Universe as a whole.

3. Strong interaction: it is responsible for the binding of nucleons in nuclei and for the quick decays of resonance states; its characteristic time is $\tau_s \sim 10^{-24}$ s, its radius is $R_s \sim \hbar/(m_\pi c) \sim 10^{-13}$ cm, and the strong force can be characterized by the dimensionless constant $\alpha_s \sim 1$ at distances $\sim R_s$.

The radius of strong interaction is $R \sim \hbar/(m_\pi c)$, where $m_\pi$ is the pion mass, and this value is about $10^{-13}$ cm. This is the characteristic size of the proton. The size of the proton is measured in various experiments, including studies of the scattering of electrons on protons, so it is a reliably determined value. In terms of the interaction of quarks (we’ll talk about that later on), such distances are large. Those are the distances at which strong interaction occurs, and the corresponding constant of strong interaction is about 1. This is sad, because, as the classics have written, only in a state of deep despair can one perform an expansion in a constant that exceeds 1. This means that as long as we are dealing with such distances, we have no fundamental theory, only phenomenology. But if we switch to much smaller distances, there is a fundamental theory that describes strong interaction precisely: quantum chromodynamics (QCD). We’ll talk about this theory in some more detail.

4. Weak interaction: it is responsible for the decays of many long-lived particles ($n$, $\pi$, $K$, . . .); its characteristic time is $\tau_w \sim 10^{-13} \div 10^{-8}$ s, and its radius is $R_w \sim \hbar/(m_W c) \sim 10^{-16}$ cm.

Its characteristic radius is determined by the mass of its carriers, i.e. the $W^\pm$ and $Z$ bosons, and the respective distances are of the order of $10^{-16}$ cm. The strength of the weak interaction is, however, a more complicated issue. At low energies, the weak interaction is associated with a constant that is approximately four orders of magnitude smaller than the electromagnetic one.
But at higher energies, the weak interaction is comparable with the electromagnetic interaction. This made it possible to combine these two theories into a joint electroweak theory. Just one example: a neutrino from a reactor (at low energies) may pass through the Earth, but at $E \sim m_W c^2$ its cross section becomes comparable with the electromagnetic one.

Interactions of the elementary particles are due to the following exchanges:
• $\gamma$ (photon): for EM interaction;
• $W^\pm$ and $Z$ bosons: for weak interaction;
• $g$ (gluon): for strong interaction.
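The estimates of this subsection, namely the interaction radii $R \sim \hbar/(mc)$ and the ratio $F_g/F_{em} = G m_p^2/e^2$, can be reproduced with the Gaussian constants from the Notations. The sketch below is illustrative only: the gravitational constant, the proton mass in grams and the charged-pion rest energy are not given in the text and are inserted here as assumed standard values, and the function names are hypothetical.

```python
import math

# Constants quoted in the Notations (Gaussian units).
HBAR = 1.055e-27          # erg*s
C = 2.998e10              # cm/s
E_CHARGE = 4.803e-10      # esu
ERG_PER_MEV = 1.602e-6    # 1 MeV in erg

# Assumed values, not quoted in the lecture text.
G_NEWTON = 6.674e-8       # cm^3 g^-1 s^-2
M_PROTON_G = 1.673e-24    # proton mass in grams
PION_MEV = 139.6          # charged-pion rest energy, MeV
W_BOSON_MEV = 80.4e3      # W rest energy, MeV (80.4 GeV)


def yukawa_potential(r, g, radius):
    """U(r) = (g^2 / r) * exp(-r / R); radius = inf gives the Coulomb form."""
    return g ** 2 / r * math.exp(-r / radius)


def interaction_radius_cm(rest_energy_mev):
    """R = hbar / (m c) = hbar c / (m c^2) for a carrier of given rest energy."""
    return HBAR * C / (rest_energy_mev * ERG_PER_MEV)


r_strong = interaction_radius_cm(PION_MEV)     # of order 1e-13 cm
r_weak = interaction_radius_cm(W_BOSON_MEV)    # of order 1e-16 cm
force_ratio = G_NEWTON * M_PROTON_G ** 2 / E_CHARGE ** 2   # F_g / F_em

print(f"R_s ~ {r_strong:.2e} cm, R_w ~ {r_weak:.2e} cm")
print(f"F_g / F_em ~ {force_ratio:.1e}")
```

Both forces fall off as $1/r^2$, so the distance $r$ cancels in the ratio and the result does not depend on how far apart the two protons are.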
1.3. Three generations of leptons and quarks

There are three sets (or generations) of leptons: the electron, the muon and the tau lepton. As I have already said, they all have spin 1/2. Conventionally, they correspond to negatively charged particles. Their partners are the respective neutrinos: the electron neutrino, the muon neutrino and the tau neutrino.
$$\begin{pmatrix} \nu_e \\ e \end{pmatrix},\ \begin{pmatrix} u \\ d \end{pmatrix}\ \text{: the first generation},$$
$$\begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix},\ \begin{pmatrix} c \\ s \end{pmatrix}\ \text{: the second generation},$$
$$\begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix},\ \begin{pmatrix} t \\ b \end{pmatrix}\ \text{: the third generation},$$
plus their antiparticles. The electric charge $Q$ (in units of $|e|$) is
$$\begin{pmatrix} Q_\nu = 0 \\ Q_e = -1 \end{pmatrix},\ \begin{pmatrix} Q_u = \frac{2}{3} \\ Q_d = -\frac{1}{3} \end{pmatrix},$$
the spin is $J = \frac{1}{2}$, and the same holds for the other generations.

That is, processes involving electrons emit neutrinos that “love” the electron. If some decay results in the production of a muon, then the muon is mated with another neutrino, which differs from the neutrinos produced together with the electron. Actually, this is an unusual property, and Bruno Pontecorvo was the first to note it; before him, these neutrinos were considered to be the same particles. Then the neutrinos produced together with the electron and with the muon were experimentally proved to be different particles. Accordingly, when the tau lepton was discovered, it immediately became clear that there is a corresponding neutrino. The most sophisticated recent experiments suggest that, in fact, these particles can gradually turn into each other, but this is the next, more advanced level of study.

They are grouped in generations: the first, second, and third generations. They have quite different masses. While the electron rest energy is only 0.5 MeV, the muon mass is already about 100 MeV, and the tau mass is about 1.8 GeV. Perhaps, to navigate this set more easily, you’d better have this table before your eyes. I’ve prepared some tips for you. You will occasionally look at them, and then this will go into the subcortex. And thus this will become habitual, giving a feeling of understanding. The same can be said about our leptons: here is a table listing them all with their properties. So, I recommend that you look now and then at these tables, for it will be very useful. In short, all leptons have spin 1/2; the charge of all neutrinos is zero.

It turned out somehow that there are also three generations of quarks with spin 1/2.
The up quarks and down quarks are the first generation. The up and down quarks, of course, have spin 1/2, but they differ in charge. The charge of the up quark is +2/3 of the elementary charge, while the down quark has a charge of −1/3. The same goes for the second generation: there are strange and charmed quarks. And in the third generation, there are top quarks and b quarks; of course, if this is top, then that is bottom, which in English is a hint at some indecent parts of the human body. So it is sometimes called the beauty quark. They all have the respective charges. All quarks have an additional quantum number, colour: $q = q^i$, $i = 1, 2, 3$ (red, blue, green). Besides, we note that quarks, in addition to the electric charge, also have the baryon charge. This is the classification of the most fundamental particles.

Well, now we will talk about the various types of interaction. The gravitational interaction is omitted because it plays no role in the region considered. How are these fundamental particles involved in the various interactions? Quarks are involved in strong, electromagnetic and weak interactions. As for the electron, muon and tau lepton, they are involved only in electromagnetic and weak interactions. Finally, all the neutrinos, which have no electric charge, are involved only in weak interaction. This is a complete list of particles and their interactions.
1.4. Quarks and hadrons

Hadrons are colourless states:
• mesons: $q\bar q$, for example, $\pi^+ = u\bar d$;
• baryons: $qqq$, for example, $p = uud$, $n = udd$.
There is a possibility for exotica: four-quark mesons $q\bar q q\bar q$, five-quark baryons $qqqq\bar q$, and so on. Quarks interact with colour gluons $g^i_j$, $i, j = 1, 2, 3$; this leads to colour capture, or confinement.

I have already mentioned that quarks have colour. It is a purely conventional name. The electron, muon and tau have a charge but no colour, and the respective science is called quantum electrodynamics (QED). Likewise, the respective science for particles that have colour and are involved in strong interaction is called colour dynamics, or chromodynamics (QCD). This means that a description of a quark should specify its colour (the first, second or third). It’s just an additional inner degree of freedom. Conventionally, the first, second and third colours are referred to as red, blue and green, respectively. As for the particles that are available to the experimentalists, they are always colourless, or white. These particles are called hadrons, and they are composed of coloured particles.

In Greek, hadros means large and leptos means small. While we were considering the electron and the proton, this was quite correct, since the former is small and the latter is large, i.e. the proton mass is about two thousand times greater than the electron mass. However, later on, when the tau was discovered, with a mass greater than that of the proton, those attributions started sounding rather strange. Nevertheless, the terms were fixed.

Grouped quarks make mesons, colourless particles composed mainly of quark-antiquark pairs. But this does not mean that these particles have no charge, because the quark can be of one sort and the antiquark of another sort.
For instance, if I take an up quark and a down antiquark, then the total charge is the sum of +2/3 and +1/3, because the antiquark charge is opposite to that of the quark. The result will be the $\pi^+$ meson.

Baryons. They are usually composed of three quarks. The proton is a typical baryon composed of three quarks: up, up and down, with charges +2/3, +2/3, and −1/3. Therefore, its charge equals +1. And, accordingly, the neutron is composed of a pair of down quarks and an up quark with charges −1/3, −1/3, and +2/3, so its charge equals 0. Most baryons are composed of three quarks. Look at the tables you have; they contain leptons and selected baryons. This table also shows the quark composition of these particles, their masses, etc. I must say that this is a well-established pattern, which, nonetheless, is not the ultimate truth.

What about other states of quarks? I was just about to say that the above scheme is at least well determined for a number of well-studied particles. However, for a number of resonances, including those that have been investigated at BINP, there are very serious grounds for believing that the mesons can be four-quark states, i.e. $qq\bar q\bar q$. Very recently, there was a big buzz due to the discovery of baryons consisting of five quarks, $qqqq\bar q$. The situation at present is the following. The four-quark state is considered almost proven for some mesons. As for pentaquarks, the experiments are still conflicting. Some experimenters see pentaquarks, while others, on the contrary, do not see them. So the situation is unsettled, but such exotic things are already thought to be acceptable.

Finally, I would like to talk about how all this wonderful picture of elementary particles and their interactions is related to our main subject, i.e. to quantum field theory.
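The charge counting used in this subsection for $\pi^+ = u\bar d$, $p = uud$ and $n = udd$ can be sketched in a few lines with exact fractions. The notation for an antiquark (a trailing "~") and the function name are conventions assumed here for illustration only.

```python
from fractions import Fraction

# Quark electric charges in units of the elementary charge |e|.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def hadron_charge(content):
    """Total charge of a hadron given its quark content.

    An antiquark is written with a trailing '~' (an assumed convention),
    e.g. 'd~' for the down antiquark; its charge has the opposite sign.
    """
    total = Fraction(0)
    for quark in content:
        if quark.endswith("~"):
            total -= QUARK_CHARGE[quark[:-1]]
        else:
            total += QUARK_CHARGE[quark]
    return total


pi_plus = hadron_charge(["u", "d~"])      # +2/3 + 1/3
proton = hadron_charge(["u", "u", "d"])   # +2/3 + 2/3 - 1/3
neutron = hadron_charge(["u", "d", "d"])  # +2/3 - 1/3 - 1/3
print(pi_plus, proton, neutron)
```

The same function applies to the exotic states mentioned above; for example, a four-quark combination is passed as a list of two quarks and two antiquarks.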
1.5. Notion of quantum field theory

All elementary particles are quanta of the corresponding fields, and their fundamental interactions are described as interactions of quantum fields.

Electromagnetic interaction

So, we’ll start the section relating to quantum field theory. This will be our main object of study: not elementary particles, but quantum field theory, because in fact it is the basic theory that describes the interactions of elementary particles. We can begin with the electromagnetic field. We are used to perceiving it classically, as spatial electric and magnetic fields, which can be measured directly, even felt with one’s finger in a socket. Later on it turned out that such an approach, which is ideally described by the classical Maxwell equations, written down as early as the nineteenth century, is incomplete. In some cases, an electromagnetic field behaves not as a propagating wave but as a set of particles. As you know, these particles were named photons. An example is an experiment in which ultraviolet rays fall on a zinc plate and a suitable neighbouring anode draws the electrons out. Naturally, the photoeffect was explained on the basis of the fact that an electromagnetic field is a set of photons; an individual photon is absorbed by an atom, which results in an electron flying out. In 1905, Einstein published three great works: on the photoeffect, on the special theory of relativity and on stochastics (the Brownian motion). The ideas were very innovative, and it took time for the physics community to fully appreciate them. Of course, the theory of relativity is the main achievement, a new world indeed. Nevertheless, it was for his discovery of the law of the photoeffect that Einstein was awarded the Nobel Prize.
So, the idea that the electromagnetic field can be considered as a set of quanta entered the community early enough, at the beginning of the twentieth century, while the idea that the electron also has wave properties appeared much later, after the de Broglie hypothesis, which was later implemented in such calculations as the Schrödinger equation. But if an electron has wave properties, which was absolutely reliably established in experiments, then what about the picture presented earlier? A photon is a quantum of the electromagnetic field. If an electron has wave properties, it must also be regarded as a quantum of a field. Regarding an electron as a quantum of the electron-positron field turned out to be very natural. The novelty is the following. When we talk about the electromagnetic field and its particles, photons, they can perfectly well be produced and annihilated without violating the law of conservation of charge, because they are not charged. As for electrons, the situation is more complicated. If there is a field corresponding to an electron and the electron is a quantum of this field, then, as with photons, there must be a possibility of production and annihilation of the electron. Because of the law of conservation of charge, an electron can be produced and annihilated only with a partner, an antiparticle. This antiparticle is the positron. Then let’s go further and say that each lepton and each quark is a quantum of a respective field.
If we agree with that, then in nature there will be nothing but fields, with particles being their quanta. What will the interaction of particles look like then? It will look like the interaction of some fields via other fields. Thus, in nature there are only interacting fields and their quanta. It is from this point of view that we will try to describe all the interactions we have talked about. At this stage I will talk dogmatically and without any proof, only for you to perceive that the Standard Model we are talking about does have some essential consistency. At present, we have a picture that looks very uniform for all such interactions, which are very different and have different characteristics. Let us dwell on this topic a bit. Once again, I’m not going to prove anything in this section. I will declare concepts that we will then prove and study.

Let’s start with electromagnetic interaction, our most frequent object, as in the case of colliding beams. How can one describe electromagnetic interaction? In classical science, going back to Faraday, a charge $q$ interacts with another charge not directly. Around the charge $q$ there is a field, which affects another charge. So, if we are talking about a charge, how does it interact with this field? The interaction of a charge with a field is described by the potential energy
$$U = q\varphi,$$
which is the product of the charge value $q$ and the field potential $\varphi$. In general, a potential is a function of $t$ and $\mathbf{r}$, or, simply put, a function of the 4-coordinate $x$; I won’t even write the indices. This is for a point charge. For distributed charges we will have a charge density $\rho(x)$, and the density of the potential energy is
$$\rho(t, \mathbf{r})\, \varphi(t, \mathbf{r}).$$
Again, these things are very well known. How is this transformed into a “normal” science? A “normal” science must be a relativistic one, and so we note that $\rho$ is the zero component of the 4-current. On the other hand, $\varphi$ is the zero component of the 4-vector potential.
So, the corresponding relativistic generalization should be something like this:

$$j^\mu(x)\,A_\mu(x) = c\rho(x)\,\phi(x) - \mathbf{j}(x)\cdot\mathbf{A}(x),$$

and the indices must be correspondingly contravariant and covariant. The vector potential is clear. How can we describe the current? It must be a flow of charged particles. If we recall quantum mechanics, a current of electrons looks as follows: the charge (with its sign), then the conjugate wave function $\Psi^*(x)$ multiplied by the velocity operator $\hat{\mathbf{v}}$, which is the momentum divided by the mass, $\hat{\mathbf{v}} = \hat{\mathbf{p}}/m$, and then $\Psi(x)$. In quantum mechanics this quantity must be real, so we have to add the complex conjugate expression. The momentum operator is also well known; it is $\hat{\mathbf{p}} = -i\hbar\nabla$. As a result,

$$\mathbf{j} = \frac{e}{2}\left(\Psi^*\,\hat{\mathbf{v}}\,\Psi + \text{c.c.}\right), \qquad \hat{\mathbf{v}} = \frac{-i\hbar\nabla}{m},$$

where “c.c.” means the complex conjugate expression. It is important that here we see the wave functions of the electron in the current. In relativistic theory, the electron is described by the Dirac equation. You may have heard about it; it will be discussed in our course in the near future. So, running ahead a little, in relativistic quantum mechanics the current is represented as follows. It includes the charge, as well as the Dirac-conjugate wave function $\bar\Psi(x)$ at the point $x$, which is multiplied by the wave function $\Psi(x)$. And then there must be a 4-vector. The 4-vector is determined by the Dirac matrices $\gamma^\mu$, which you will also soon be told about. Such is the current, and thus the electromagnetic interaction
Figure 1: Elementary QED process
Figure 2: Elementary weak process
Figure 3: Elementary QCD process
in question is defined by the following quantity: the charge $e$ times $\bar\Psi(x)\gamma^\mu\Psi(x)$, multiplied, as we remember, by $A_\mu(x)$:

$$j^\mu(x)\,A_\mu(x) = e\,\bar\Psi(x)\gamma^\mu\Psi(x)\,A_\mu(x).$$

All this originates from the statement we discussed earlier, that the interaction is given by the charge multiplied by the potential. Looking far ahead again, this construction is often depicted as follows. The initial electron is represented as a line going to some point $x$. The final electron is also at this point $x$, and all this interacts with the electromagnetic field. Thus, the interaction $e\,\bar\Psi(x)\gamma^\mu\Psi(x)A_\mu(x)$ describes processes (real or virtual) of the type shown in Fig. 1. This is the description of the electromagnetic interaction in terms of quantum field theory. What appears here is the interaction of quanta of the electron-positron field with photons, i.e. with the quanta of the electromagnetic field. More complicated processes can be built from these blocks. For example, how can the electron-electron interaction be described in these terms? An electron interacts with the electromagnetic field of the second electron, and electron-electron scattering occurs. With colliding electron-electron beams it was first studied here in Novosibirsk, at the VEP-1 accelerator. The Coulomb law follows from this interaction, described in such a complicated way.

Weak interaction

A typical example of the weak interaction is the similarly described interaction of an electron (a quantum of the electron-positron field) with the quanta of the carriers of the weak interaction, i.e. the $W$ and $Z$ bosons with masses $m_W c^2 = 80.4$ GeV and $m_Z c^2 = 91.2$ GeV. Let us take the interaction with the $Z$ boson as an example. Once again, I am not proving anything, just trying to give you a feeling of consistency in our reasoning.
If we write this down,

$$\frac{e}{\sin 2\theta_W}\,\bar\Psi(x)\,\gamma^\mu\left(g_V - g_A\gamma^5\right)\Psi(x)\,Z_\mu(x),$$

we have the same ingredients: a quantized electron-positron field $\Psi(x)$ corresponding to the initial electron, a quantized field $\bar\Psi(x)$ corresponding to the final electron, and the interaction with the field of the $Z$ boson, which is also described by a 4-vector; let’s denote it by $Z_\mu(x)$, in the same way as the photon was described by the 4-vector potential $A_\mu(x)$ (see Fig. 2). Here $\sin 2\theta_W = 0.84$, and $g_V$ and $g_A$ are dimensionless constants of the order of 1. It is interesting that the relevant interaction constant is proportional to the electric charge. There is a complication, however, because this coupling constant enters with some coefficient. What is the structure between the initial and final states of the electron? There is the familiar structure $\gamma^\mu$, which corresponds to the 4-vector $\bar\Psi\gamma^\mu\Psi$ contracted with the 4-vector $Z_\mu$. But it turns out that in the weak interaction there is one more structure, defined by the product of two Dirac matrices, $\gamma^\mu\gamma^5$. You know that electromagnetic interactions conserve parity. This is due to the fact that $\bar\Psi(x)\gamma^\mu\Psi(x)$ is a 4-vector. Here, in addition to the 4-vector, another object is present, $\bar\Psi\gamma^\mu\gamma^5\Psi$, which is a 4-pseudovector; its presence alongside the 4-vector leads to parity nonconservation. Parity nonconservation in atomic transitions was first demonstrated here in Novosibirsk by Barkov and Zolotorev. The two structures enter with coefficients ($g_V$ and $g_A$) of the order of 1. It turns out that the structure of the weak interaction is very similar to that of the electromagnetic interaction. But the former is more complicated: the carrier of the interaction is different and the interaction vertex is a little more complicated. However, the strength of the interaction, $\propto e$, is, roughly speaking, the same as in electromagnetism. These facts have led to the creation of the unified electroweak theory.

Strong interaction

It occurs, as you remember, between quarks. Let’s take a look at it. What are quarks? They are quanta of the corresponding quark-antiquark field. We’ll consider the process in which a quark interacts with the carriers of the strong interaction, gluons (see Fig. 3). They are often depicted with a “looped” line. The structure of this interaction is similar to the structure of the previous interactions:

$$g_s\,\bar\Psi^{i}(x)\,\gamma^\mu\,\Psi_{j}(x)\,(G_\mu)^{j}_{i}(x).$$

That is, we have the wave function of the initial quark $\Psi(x)$, the Dirac-conjugate field of the final quark $\bar\Psi(x)$, and a gluon field, which I will denote by the letter $G(x)$. It is also a vector field, like the photon field and similar to that of the $Z$ boson. It is closer to the photon case, because the gluon is also considered to be massless, just like the photon. In the strong interaction parity is conserved, so the vertex is the same as in electrodynamics. The charge is different: instead of $e$, we have a “strong” coupling constant $g_s$, which is of the order of 1. It is also important that quarks have an additional degree of freedom, colour. Therefore, they have to be assigned one more index, that of colour. The gluon, in turn, must provide their binding, so it also carries colour indices $i$ and $j$, which take the values 1, 2, 3. In principle, this is the language of Feynman diagrams, and we have to get used to this language gradually. I have outlined the whole of quantum field theory, at least its structure. Now we will study how all these equations arise and why the coefficients and the dependencies are such. What is depicted here is the essence of the Standard Model, which now describes a huge set of experiments. Nevertheless, there are already experiments that contradict this model, or at least show the limits of its applicability. Again, a huge number of experiments are described by the Standard Model, sometimes with unprecedented accuracy. Thus we should start learning this Standard Model. Of course, the easiest way would be to start with fields that have the simplest structure. Those are fields with spin 0. They are described by a single function. However, if we start with that, then there will be a lot of
questions. So let’s act in a different way. We will take not the simplest field, but the field that is most familiar to you. You see, it is very natural that you start understanding ideas when you get used to them. You will use these terms and write them down; then you will get a feeling that you understand them, and once you have done a number of problems in this course, the real understanding will come. So I suggest we start not with the simplest boson field with spin 0, but with the more complicated electromagnetic field. The advantage of the electromagnetic field is that you know a lot about it, and thus we can build on your prior knowledge.
§ 2. Quantization of the electromagnetic field

Let’s start talking about the electromagnetic field. Here are the subjects we will cover in the next few lectures. First, we will talk about the usual harmonic oscillator, which is a very convenient object of classical mechanics and is known in every detail. An electromagnetic field in a “box” without external charges can also be considered as a set of oscillators. This is a purely classical description, which you have already partly encountered. It will be extremely useful for the next step, the quantum field. It is easiest to start by studying the section of the same name in The Classical Theory of Fields, the second volume of the course by Landau and Lifshitz [5]. So, the first section will be the electromagnetic field as a set of oscillators. Please pay very close attention to this very important piece, which will be repeated at all further stages. So, we’ll start by describing a complex object, an electromagnetic field in some volume, as a set of oscillators. What comes next? We have mastered the quantization of oscillators, so when we proceed to quantizing the electromagnetic field, it will be a procedure we already know from quantum mechanics. And in the end, we will talk about the production and annihilation of field quanta. Such is the program of our lectures for the near future.
2.1. Electromagnetic field as a set of oscillators

Now we will talk about the linear, or harmonic, oscillator. Let’s recall the usual oscillator, which you have known since school. In the Hamiltonian approach, this system is described by the following Hamiltonian, which you hopefully know backwards and forwards:

$$H(x,p) = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2}.$$

There is no need to solve the corresponding equations, because the solution should be well known by now:

$$x(t) = b\cos(\omega t + \varphi), \qquad p(t) = -m\omega b\sin(\omega t + \varphi),$$

where $b$ is the amplitude and $\varphi$ is the initial phase. It is also well known that we can use not only $x(t)$ and $p(t)$, but also other generalized coordinates $Q(t)$ and momenta $P(t)$. It was especially emphasized that combinations of the following kind are very important: $m\omega x + ip$, a complex function. For certain reasons, it is often written with the factor $1/\sqrt{2m\hbar\omega}$:

$$a(t) = \frac{m\omega x + ip}{\sqrt{2m\hbar\omega}}, \qquad a^*(t) = \frac{m\omega x - ip}{\sqrt{2m\hbar\omega}}.$$

The factor even contains $\hbar$. In this case, however, $\hbar$ plays no physical role in the classical mechanics; it’s just a matter of normalization. Instead of the real $x(t)$ and $p(t)$, we can use two complex quantities, $a(t)$ and $a^*(t)$. In analytical mechanics it was proved that if we take $a(t)$ as the generalized coordinate $Q(t)$ and $i\hbar a^*(t)$ as the generalized momentum $P(t)$, then they will also be canonical variables, and the new Hamiltonian for them will be somewhat simpler than the old one:

$$H(Q,P) = -i\omega PQ = \hbar\omega\,a^* a.$$

We know explicit expressions for $x(t)$ and $p(t)$; let us substitute them and obtain

$$a(t) \propto b\,e^{-i(\omega t+\varphi)}, \qquad a^*(t) \propto b\,e^{+i(\omega t+\varphi)}.$$

In general, for a classical oscillator we do not gain much by switching from $x(t)$ and $p(t)$ to $Q(t)$ and $P(t)$. Indeed, we know what $x(t)$ and $p(t)$ are, so why complicate our life by introducing new variables? Still, these new variables are a little better. Better in which respect, you may ask? Even in classical mechanics, when you operate on $x(t)$, e.g. differentiate it, the cosine turns into a sine, while differentiation of $a(t)$ again gives an exponential. Even technically, $a(t)$ and $a^*(t)$ have a somewhat simpler dependence on time. The benefit is especially large when the oscillator is perturbed. Then switching to these new variables helps one to solve the problem, or at least to develop an effective perturbation theory. Let us show that the electromagnetic field in vacuum can be reduced to a set of oscillators described by variables $a(t)$ and $a^*(t)$, which obey the equations

$$\ddot a(t) + \omega^2 a(t) = 0, \qquad \ddot a^*(t) + \omega^2 a^*(t) = 0.$$

Once again, there are obvious advantages in beginning with the electromagnetic field. We know that an electromagnetic field is described by the electric field $\boldsymbol{\mathcal{E}}(x)$ and the magnetic field $\mathbf{B}(x)$. Moreover, we know the corresponding Maxwell equations:

$$\operatorname{div}\boldsymbol{\mathcal{E}} = 0, \qquad \operatorname{rot}\boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \operatorname{div}\mathbf{B} = 0, \qquad \operatorname{rot}\mathbf{B} = \frac{1}{c}\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t}.$$
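As a side check on the complex oscillator variables $a(t)$, $a^*(t)$ introduced earlier in this subsection, here is a minimal numerical sketch. The numerical values of the mass, frequency, amplitude and phase (and the choice of units with $\hbar = 1$) are assumptions made only for the illustration. It verifies that $\hbar\omega\,a^*a$ reproduces the usual energy and that $a(t)$ rotates as $e^{-i\omega t}$.

```python
import cmath
import math

# Illustrative values (assumptions, not taken from the lecture); units with hbar = 1.
m, omega, hbar = 1.0, 2.0, 1.0
b, phase = 0.7, 0.3  # amplitude and initial phase

def x(t):
    return b * math.cos(omega * t + phase)

def p(t):
    return -m * omega * b * math.sin(omega * t + phase)

def a(t):
    # a = (m*omega*x + i*p) / sqrt(2*m*hbar*omega)
    return (m * omega * x(t) + 1j * p(t)) / math.sqrt(2 * m * hbar * omega)

def energy_xp(t):
    # H(x, p) = p^2 / (2m) + m*omega^2*x^2 / 2
    return p(t) ** 2 / (2 * m) + m * omega ** 2 * x(t) ** 2 / 2

def energy_a(t):
    # H = hbar * omega * a^* a
    return hbar * omega * abs(a(t)) ** 2

for t in (0.0, 0.4, 1.3):
    print(t, energy_xp(t), energy_a(t), a(t), a(0.0) * cmath.exp(-1j * omega * t))
```

The two energy columns agree, and the last two columns agree as well, which is the statement that $a(t)$ has the simple exponential time dependence.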
Please note that the energy is denoted by the ordinary letter $E$, and the electric field by the letter $\boldsymbol{\mathcal{E}}$, which resembles a capital epsilon. This is a complete and adequate description; unfortunately, it has a lot of components: 3 electric field components and 3 magnetic field components. Instead of $\boldsymbol{\mathcal{E}}$ and $\mathbf{B}$, we can use the 4-vector potential $A^\mu(t,\mathbf{r}) = (\phi(t,\mathbf{r}),\ \mathbf{A}(t,\mathbf{r}))$, which contains only four components. This is very convenient, as we go from 6 quantities to 4. You are also well aware that the electric and magnetic fields are expressed as follows:

$$\boldsymbol{\mathcal{E}} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}.$$

Great, so we have 4 quantities, and this makes our life easier. We will continue along the path of simplification. We have some freedom in choosing these potentials; let’s use it. For instance, we can impose a very nice invariant condition, which is called the Lorentz condition:

$$\frac{\partial A^\mu}{\partial x^\mu} = \frac{\partial\phi}{c\,\partial t} + \operatorname{div}\mathbf{A} = 0.$$
Moreover, if we consider a field in vacuum without charges, we can also impose a condition on the scalar potential, $\phi(t,\mathbf{r}) = 0$. These are additional conditions that we impose using our freedom in the choice of potentials. Immediately life becomes very easy, because there are only 3 functions left instead of the initial 6. Besides, the Lorentz condition simplifies to $\operatorname{div}\mathbf{A} = 0$. It is better to write it as a scalar product of the gradient with $\mathbf{A}$: $\nabla\cdot\mathbf{A}(t,\mathbf{r}) = 0$. What does all this mean? How many independent quantities do we have now? Only two. By the way, it also means that plane electromagnetic waves have only two independent components, which are transverse to the direction of propagation. This is great. We began with 6 components and ended with two independent components. This is a very economical description. What equation do these $\mathbf{A}(x)$ obey? Let’s take the fourth Maxwell equation. The curl of $\mathbf{B}$ is the vector product of the gradient with $\mathbf{B}$, and $\mathbf{B}$ in turn is the vector product of the gradient with $\mathbf{A}$. Then we use the formula for the double vector product, $\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}\,(\mathbf{a}\cdot\mathbf{c}) - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c}$, and obtain

$$\operatorname{rot}\mathbf{B} = \nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \Delta\mathbf{A} = -\Delta\mathbf{A} = \frac{1}{c}\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t} = -\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}.$$
So, what have we finally got? We have got the wave equation

$$\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \Delta\mathbf{A} = 0.$$

This is the basic equation of electromagnetic waves. We want to present the object that describes the field, i.e. the two independent components of the vector $\mathbf{A}(t,\mathbf{r})$, as a set of oscillators. How will the oscillators of the field differ from one another? Generally speaking, they will have various frequencies, and different frequencies will correspond to different oscillators. Therefore, we expand our vector potential in plane waves, taking into account that the vector $\mathbf{A}(t,\mathbf{r})$ must be real:

$$\mathbf{A}(t,\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\left[\mathbf{A}_{\mathbf{k}}(t)\,e^{i\mathbf{kr}} + \mathbf{A}^*_{\mathbf{k}}(t)\,e^{-i\mathbf{kr}}\right]. \tag{2.1}$$

Now, these Fourier amplitudes, $\mathbf{A}_{\mathbf{k}}(t)$ and $\mathbf{A}^*_{\mathbf{k}}(t)$, should be somewhat similar to the quantities $a(t)$ and $a^*(t)$. But $a(t)$ and $a^*(t)$ depend on time in a very definite way, which follows from the oscillator equation. And indeed, the wave equation implies that the amplitudes $\mathbf{A}_{\mathbf{k}}(t)$ satisfy the usual oscillator equation:

$$\ddot{\mathbf{A}}_{\mathbf{k}}(t) + \omega_k^2\,\mathbf{A}_{\mathbf{k}}(t) = 0, \qquad \omega_k = c|\mathbf{k}|. \tag{2.2}$$

As a result, in each mode, i.e. for each wave vector $\mathbf{k}$, we have a harmonic oscillator and

$$\mathbf{A}_{\mathbf{k}}(t) \propto e^{-i\omega_k t}, \qquad \mathbf{A}^*_{\mathbf{k}}(t) \propto e^{i\omega_k t}. \tag{2.3}$$
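The statement that a single plane-wave mode with $\omega_k = c|\mathbf{k}|$ solves the wave equation can be checked numerically. The following is a minimal one-dimensional sketch; the values of $c$ and $k$, the step size and the sample points are arbitrary assumptions for the demonstration, and finite differences stand in for the exact derivatives.

```python
import math

c = 3.0               # wave speed in arbitrary units (assumption for the demo)
k = 2.0               # wave number of the mode
omega_k = c * abs(k)  # dispersion relation omega_k = c|k|

def A(t, x):
    # One real plane-wave mode, A ~ cos(k*x - omega_k*t)
    return math.cos(k * x - omega_k * t)

def wave_operator(t, x, h=1e-3):
    # Central second differences approximate (1/c^2) d^2A/dt^2 - d^2A/dx^2
    d2t = (A(t + h, x) - 2 * A(t, x) + A(t - h, x)) / h ** 2
    d2x = (A(t, x + h) - 2 * A(t, x) + A(t, x - h)) / h ** 2
    return d2t / c ** 2 - d2x

residual = max(abs(wave_operator(t, x)) for t in (0.0, 0.3) for x in (0.1, 1.7))
print(residual)  # small, limited only by the finite-difference error
```

Changing `omega_k` to anything other than `c * abs(k)` makes the residual of order one, which is the numerical counterpart of the dispersion relation.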
So, indeed, this vector potential describes a set of oscillators. However, the frequencies of these oscillators are continuously distributed, depending on the value of the vector $\mathbf{k}$. But for clarity, to make it look like just a set, and for dimensional reasons, we can switch from the continuous distribution to a discrete one. This is done in the following way. We assume that in the space $xyz$ we have a very big parallelepiped with sides $L_x$, $L_y$, and $L_z$. Thus the product $L_x L_y L_z$ is the volume in which our field is enclosed, $\mathcal{V} = L_x L_y L_z$. Since we will meet the perturbation operator $V$ very often, let’s denote the volume by the calligraphic letter $\mathcal{V}$, to distinguish it from the perturbation operator. We need to impose some conditions on the field at the boundaries of this volume: either the field vanishes on the boundary or, which is technically more convenient, the field repeats periodically, taking the same values at the beginning and at the end points of the volume. Therefore, upon acquiring the phase factor $e^{ik_x L_x}$, the field must equal the field at the zero point. This means that $k_x L_x$ must equal $2\pi$ multiplied by some integer $n_x$. Thus the wave vectors themselves are quantized; the $x$, $y$ and $z$ components together form a 3-vector of quantized values:

$$k_x = \frac{2\pi n_x}{L_x}, \qquad k_y = \frac{2\pi n_y}{L_y}, \qquad k_z = \frac{2\pi n_z}{L_z}, \qquad \omega_k = 2\pi c\sqrt{\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}}.$$

If so, then we can now turn from integration to summation. The relevant wave functions are $e^{i\mathbf{kr}}$, where $\mathbf{k}$ is the quantized vector. We will need their orthogonality condition, so let’s write it down separately:

$$\int_{\mathcal{V}} \left(e^{i\mathbf{kr}}\right)^* e^{i\mathbf{k}'\mathbf{r}}\,d^3r = \mathcal{V}\,\delta_{\mathbf{k},\mathbf{k}'}. \tag{2.4}$$
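The orthogonality of the plane waves with quantized wave vectors can be illustrated in one dimension. This is a minimal sketch under stated assumptions: the box length, the number of sample points and the helper names `k_mode` and `overlap` are chosen only for the demonstration, and a simple rectangle rule replaces the exact integral.

```python
import cmath
import math

L = 2.0  # box length along x in arbitrary units (assumption for the demo)

def k_mode(n):
    # Periodic boundary conditions: k * L = 2*pi*n with integer n
    return 2 * math.pi * n / L

def overlap(n, n_prime, samples=200):
    # Rectangle-rule estimate of the integral of conj(e^{ikx}) * e^{ik'x} over 0..L
    dx = L / samples
    total = 0j
    for j in range(samples):
        xj = j * dx
        total += cmath.exp(-1j * k_mode(n) * xj) * cmath.exp(1j * k_mode(n_prime) * xj)
    return total * dx

print(abs(overlap(3, 3)))   # close to L
print(abs(overlap(3, 5)))   # close to 0
```

In three dimensions the integral factorizes into three such one-dimensional integrals, which gives the volume times the Kronecker delta.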
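As a result, we now do have a discrete set of oscillators,

$$\mathbf{A}(t,\mathbf{r}) = \sum_{\mathbf{k}}\left[\mathbf{A}_{\mathbf{k}}(t)\,e^{i\mathbf{kr}} + \mathbf{A}^*_{\mathbf{k}}(t)\,e^{-i\mathbf{kr}}\right], \tag{2.5}$$

where the new amplitudes $\mathbf{A}_{\mathbf{k}}(t)$ satisfy the same relations (2.2) and (2.3). Moreover, sums similar to (2.5) can be written for the electric and magnetic fields as well. Due to the equations

$$\boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A},$$

the corresponding amplitudes are related to the amplitudes of the vector potential as

$$\boldsymbol{\mathcal{E}}_{\mathbf{k}} = \frac{i\omega_k}{c}\,\mathbf{A}_{\mathbf{k}}, \qquad \mathbf{B}_{\mathbf{k}} = i\mathbf{k}\times\mathbf{A}_{\mathbf{k}} = \frac{\mathbf{k}}{|\mathbf{k}|}\times\boldsymbol{\mathcal{E}}_{\mathbf{k}}. \tag{2.5a}$$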
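The two expressions for the magnetic amplitude in (2.5a) can be compared numerically. The sketch below is only an illustration: the wave vector is placed along $z$, the complex amplitude `A_k` is an arbitrary transverse example, and the helper functions `cross` and `dot` are written here for the demonstration.

```python
import math

def cross(u, v):
    # Ordinary vector product, valid for complex components as well
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    # Plain (unconjugated) scalar product
    return sum(ui * vi for ui, vi in zip(u, v))

c = 1.0                           # units with c = 1 (assumption for the demo)
k = (0.0, 0.0, 2.0)               # wave vector along z
k_abs = math.sqrt(dot(k, k))
omega_k = c * k_abs
A_k = (0.3 + 0.1j, -0.2j, 0.0)    # arbitrary transverse amplitude: k . A_k = 0

E_k = tuple(1j * omega_k / c * Ai for Ai in A_k)   # E_k = (i*omega_k/c) A_k
B_from_A = cross(tuple(1j * ki for ki in k), A_k)  # B_k = i k x A_k
k_hat = tuple(ki / k_abs for ki in k)
B_from_E = cross(k_hat, E_k)                       # B_k = (k/|k|) x E_k

print(B_from_A)
print(B_from_E)
```

Both lines print the same vector, and it has no component along `k`, in agreement with the transverse picture of the plane wave.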
Our analysis is not yet complete. As you remember, we started with six functions. We were “tricky” enough to reduce them to three functions, the three components of the vector $\mathbf{A}(x)$. But these three components are constrained by the condition $\operatorname{div}\mathbf{A}(t,\mathbf{r}) = 0$, so only two functions are independent. So, in fact, this set of oscillators turns out to be two sets of oscillators. Let us write this down explicitly. We take the vector $\mathbf{A}_{\mathbf{k}}$; its time dependence is known. Since $\mathbf{A}(x)$ must satisfy $\nabla\cdot\mathbf{A}(x) = 0$, its amplitudes must satisfy

$$\mathbf{k}\cdot\mathbf{A}_{\mathbf{k}} = 0. \tag{2.6a}$$
It means that the amplitude $\mathbf{A}_{\mathbf{k}}$ must be perpendicular to $\mathbf{k}$. In fact, this is a plane monochromatic wave whose vector potential necessarily lies in the plane perpendicular to the direction of propagation. Thus, we have reduced the three-component quantity $\mathbf{A}_{\mathbf{k}}(t)$ to two components in the plane perpendicular to $\mathbf{k}$. This is all that concerns the vector potential. You know from electrodynamics that the electric and magnetic fields of a plane wave lie in the plane perpendicular to $\mathbf{k}$. This can also be seen easily from Eq. (2.5a). In $\mathbf{k}$ space, the electric field is directed along the vector potential $\mathbf{A}_{\mathbf{k}}$, i.e. it lies in the plane perpendicular to $\mathbf{k}$. The magnetic field is perpendicular to $\mathbf{A}_{\mathbf{k}}$, and hence to the electric field. That is, we see the picture we are accustomed to. The monochromatic wave propagates along the direction of $\mathbf{k}$. If at some moment the electric field points in a certain direction, then the magnetic field is perpendicular to it in the same plane. The triple of vectors $\boldsymbol{\mathcal{E}}_{\mathbf{k}}$, $\mathbf{B}_{\mathbf{k}}$, and $\mathbf{k}$ is oriented in the same way as $x$, $y$, and $z$. Thus, if we direct, for instance, the $z$ axis along $\mathbf{k}$, then $\mathbf{A}_{\mathbf{k}}$, $\boldsymbol{\mathcal{E}}_{\mathbf{k}}$, and $\mathbf{B}_{\mathbf{k}}$ must lie in the $xy$ plane. And now, when they are in the $xy$ plane, we can expand them in two components. For example, we can choose $(\mathbf{A}_{\mathbf{k}})_x$ and $(\mathbf{A}_{\mathbf{k}})_y$ as the independent components. It is preferable, and it will be very convenient in the future, to choose circular polarization and introduce two unit circular polarization vectors

$$\mathbf{e}_{\mathbf{k}\lambda} = -\frac{\lambda}{\sqrt{2}}\,(1,\ i\lambda,\ 0) = -\mathbf{e}^*_{\mathbf{k},-\lambda},$$

where the index $\lambda = \pm 1$ corresponds to right (left) circular polarization. The upper sign corresponds to a wave that rotates counterclockwise (when the rotation goes from the $x$ axis to the $y$ axis), i.e. this is right-hand polarization. If the rotation is clockwise, we have left-hand polarization. Later on, we will often associate this $(+1)/(-1)$, right/left-hand polarization, with the photon helicity. In principle, there is no commonly agreed terminology on this topic.
For example, the courses at our university and the books by Landau and Lifshitz, The Classical Theory of Fields [5] and Quantum Electrodynamics [1], use this terminology. However, old classic books on optics preferred to look at a ray of light from the $(-z)$ side, not from the $(+z)$ side. Therefore, their concepts of right- and left-hand polarization are directly opposite. But we will adhere to the modern point of view. The polarization vectors satisfy the conditions of transversality,

$$\mathbf{k}\cdot\mathbf{e}_{\mathbf{k}\lambda} = 0, \tag{2.6b}$$

mutual orthogonality,

$$\mathbf{e}^*_{\mathbf{k}\lambda}\cdot\mathbf{e}_{\mathbf{k}\lambda'} = \delta_{\lambda\lambda'}, \tag{2.7}$$

and completeness,

$$\sum_\lambda (\mathbf{e}_{\mathbf{k}\lambda})_i\,(\mathbf{e}^*_{\mathbf{k}\lambda})_j = \delta_{ij} - \frac{k_i k_j}{\mathbf{k}^2} \tag{2.8}$$

(here $i$, $j$ denote components of the polarization vector; the right-hand side is the unit tensor in the plane orthogonal to the vector $\mathbf{k}$).
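The three properties of the circular polarization vectors can be verified directly for $\mathbf{k}$ along the $z$ axis. The following is a minimal sketch; the function names `polarization`, `hermitian_dot` and `completeness` are introduced here only for illustration.

```python
import math

K_HAT = (0.0, 0.0, 1.0)  # unit vector along k, chosen along z as in the text

def polarization(lam):
    # e_{k,lambda} = -(lambda / sqrt(2)) * (1, i*lambda, 0) for lambda = +1 or -1
    s = -lam / math.sqrt(2)
    return (complex(s), complex(0.0, s * lam), 0j)

def hermitian_dot(u, v):
    # Scalar product u^* . v
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def completeness(i, j):
    # Sum over lambda of (e_lambda)_i * (e_lambda^*)_j
    return sum(polarization(lam)[i] * polarization(lam)[j].conjugate()
               for lam in (1, -1))

for lam in (1, -1):
    e = polarization(lam)
    print(lam, sum(ki * ei for ki, ei in zip(K_HAT, e)),   # transversality
          hermitian_dot(e, polarization(lam)),             # normalization
          hermitian_dot(e, polarization(-lam)))            # orthogonality
```

The function `completeness(i, j)` returns the unit tensor in the $xy$ plane: 1 for the $xx$ and $yy$ components and 0 for all others, including $zz$.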
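Now we expand the amplitude $\mathbf{A}_{\mathbf{k}}(t)$ in two components:

$$\mathbf{A}_{\mathbf{k}}(t) = N_k \sum_\lambda a_{\mathbf{k}\lambda}(t)\,\mathbf{e}_{\mathbf{k}\lambda}.$$

The expansion coefficient $a_{\mathbf{k}\lambda}(t) \propto e^{-i\omega_k t}$ corresponds to the oscillator with frequency $\omega_k = c|\mathbf{k}|$ and polarization $\lambda$. We introduce a common coefficient $N_k$ in such a way that the total field energy

$$E = \int \frac{\boldsymbol{\mathcal{E}}^2 + \mathbf{B}^2}{8\pi}\,d^3r$$

is written as the sum of the energies of the individual oscillators:

$$E = \int \frac{\boldsymbol{\mathcal{E}}^2 + \mathbf{B}^2}{8\pi}\,d^3r = \sum_{\mathbf{k}\lambda} \hbar\omega_k\,a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda}. \tag{2.9}$$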
Our task of expansion has been completed. Indeed, now we have a representation of the vector potential as a set of oscillators. There remains only one technical detail: finding the right normalization coefficient $N_k$. This is a little more “tricky”, although it has no conceptual significance. It is a purely technical task: choosing the coefficient $N_k$ such that, when we convert this expansion into $\boldsymbol{\mathcal{E}}$ and $\mathbf{B}$ and integrate over the whole volume, we get the simple sum (2.9). Let’s deal with this technical detail now, keeping in mind, however, that we have already completed the fundamental task. Once again, it is crucial that you completely understand the first section of this lecture. If so, at the next stage, when all this is repeated for spinor or boson fields, you will be able to perceive it as a simple thing, without any difficulty. To avoid confusion with the coefficients, let’s first consider one separate term, without the coefficient $1/(8\pi)$. We have an integral of $\boldsymbol{\mathcal{E}}^2$ with respect to $d^3r$ over the entire volume. Each $\boldsymbol{\mathcal{E}}$ is a sum of the form

$$\boldsymbol{\mathcal{E}}(x) = \sum_{\mathbf{k}}\left[\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\,e^{i\mathbf{kr}} + \boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\,e^{-i\mathbf{kr}}\right].$$
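Once again, at this stage we should trace all the details. This expression implies a sum over $\lambda$ inside. But this is only one vector $\boldsymbol{\mathcal{E}}$; we have to multiply it by the second $\boldsymbol{\mathcal{E}}$, which involves a sum over another index, e.g. over $\mathbf{k}'$. Then we take the result of the multiplication,

$$\boldsymbol{\mathcal{E}}^2 = \sum_{\mathbf{k},\mathbf{k}'}\left[\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\,e^{i\mathbf{kr}} + \boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\,e^{-i\mathbf{kr}}\right]\cdot\left[\boldsymbol{\mathcal{E}}_{\mathbf{k}'}(t)\,e^{i\mathbf{k}'\mathbf{r}} + \boldsymbol{\mathcal{E}}^*_{\mathbf{k}'}(t)\,e^{-i\mathbf{k}'\mathbf{r}}\right],$$

and perform the integration over $\mathbf{r}$ using Eq. (2.4):

$$\int_{\mathcal{V}} \boldsymbol{\mathcal{E}}^2\,d^3r = \mathcal{V}\sum_{\mathbf{k}}\left[\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}_{-\mathbf{k}}(t) + \boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{-\mathbf{k}}(t) + 2\,\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\right].$$

The terms $\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}_{-\mathbf{k}}(t) \propto e^{-2i\omega_k t}$ and $\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{-\mathbf{k}}(t) \propto e^{2i\omega_k t}$ depend on time, and they cancel when we take into account the contribution of the magnetic field, $\int \mathbf{B}^2\,d^3r$. On the other hand, the term $2\,\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)$ does not depend on time, and it doubles when we take into account the contribution of the magnetic field. (By the way, this is a very good exercise, which I advise you to do yourself. It is very useful for getting used to this technique.) Therefore,

$$E = \frac{\mathcal{V}}{2\pi}\sum_{\mathbf{k}} \boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t).$$

Now we can use Eqs. (2.5a) and (2.7) and get

$$\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t) = \frac{\omega_k^2}{c^2}\,|N_k|^2 \sum_{\lambda\lambda'} \mathbf{e}^*_{\mathbf{k}\lambda}\cdot\mathbf{e}_{\mathbf{k}\lambda'}\;a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda'} = \frac{\omega_k^2}{c^2}\,|N_k|^2 \sum_\lambda a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda}.$$

As a result, we obtain

$$E = \frac{\mathcal{V}}{2\pi}\sum_{\mathbf{k}} \boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t) = \frac{\mathcal{V}}{2\pi}\sum_{\mathbf{k}\lambda}\frac{\omega_k^2}{c^2}\,|N_k|^2\,a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda}.$$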
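If we now choose the normalization factor as

$$N_k = \sqrt{\frac{2\pi\hbar c^2}{\omega_k\mathcal{V}}},$$

i.e. if we use the decomposition

$$\mathbf{A}(\mathbf{r},t) = \sum_{\mathbf{k}\lambda}\sqrt{\frac{2\pi\hbar c^2}{\omega_k\mathcal{V}}}\left[a_{\mathbf{k}\lambda}(t)\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{kr}} + a^*_{\mathbf{k}\lambda}(t)\,\mathbf{e}^*_{\mathbf{k}\lambda}\,e^{-i\mathbf{kr}}\right], \tag{2.10}$$

then the total field energy is the sum of the energies

$$E_{\mathbf{k}\lambda} = \hbar\omega_k\,a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda} \tag{2.11}$$

of the individual oscillators. Namely, the quantities $a_{\mathbf{k}\lambda}$ and $a^*_{\mathbf{k}\lambda}$ represent the coordinates and momenta of the corresponding oscillators. Analogously, one can show that the total field momentum

$$\mathbf{P} = \int \frac{\boldsymbol{\mathcal{E}}\times\mathbf{B}}{4\pi c}\,d^3r$$

can be reduced to the sum

$$\mathbf{P} = \sum_{\mathbf{k}\lambda}\hbar\mathbf{k}\,a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda} \tag{2.12}$$

of the corresponding momenta for each mode with a given polarization $\lambda$:

$$\hbar\mathbf{k}\,a^*_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda} = \frac{\mathbf{k}}{|\mathbf{k}|}\,\frac{E_{\mathbf{k}\lambda}}{c}.$$

Indeed, using Eq. (2.5a) we can write the integral in the expression for $\mathbf{P}$ as

$$\int_{\mathcal{V}} \boldsymbol{\mathcal{E}}\times\mathbf{B}\,d^3r = \mathcal{V}\sum_{\mathbf{k}}\frac{c}{\omega_k}\left[-\mathbf{k}\,\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}_{-\mathbf{k}}(t) - \mathbf{k}\,\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{-\mathbf{k}}(t) + 2\mathbf{k}\,\boldsymbol{\mathcal{E}}_{\mathbf{k}}(t)\cdot\boldsymbol{\mathcal{E}}^*_{\mathbf{k}}(t)\right].$$

Then we take into account that the first and second terms in the square brackets are odd functions of $\mathbf{k}$ and therefore vanish after summation over all values of $\mathbf{k}$. The field is thus represented as a set of oscillators differing in the wave vector $\mathbf{k}$ and the polarization $\lambda$. Here we have completed a very large piece, which involves significant technical difficulties. Now let’s deal with simpler things, which are nevertheless associated with the perception of new ideas. We were talking about the classical oscillator. How do we pass to the quantum oscillator in quantum mechanics?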
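The choice of the normalization factor can be checked with simple arithmetic: with $N_k = \sqrt{2\pi\hbar c^2/(\omega_k\mathcal{V})}$, the coefficient $(\mathcal{V}/2\pi)(\omega_k^2/c^2)|N_k|^2$ in front of $a^*a$ becomes $\hbar\omega_k$. A minimal sketch follows; the numerical values and the function name `mode_energy` are assumptions made only for the illustration.

```python
import math

hbar, c = 1.0, 1.0   # units chosen for the demo (assumption)
V = 8.0              # normalization volume, arbitrary
omega_k = 2.5        # mode frequency, arbitrary

# Normalization factor N_k = sqrt(2*pi*hbar*c^2 / (omega_k * V))
N_k = math.sqrt(2 * math.pi * hbar * c ** 2 / (omega_k * V))

def mode_energy(a_squared):
    # Energy of one (k, lambda) mode: (V / 2*pi) * (omega_k^2 / c^2) * |N_k|^2 * |a|^2
    return V / (2 * math.pi) * omega_k ** 2 / c ** 2 * N_k ** 2 * a_squared

print(mode_energy(3.0), hbar * omega_k * 3.0)
```

Both printed numbers coincide, whatever values of the volume and frequency are used.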
2.2. Quantization of the field

For an ordinary oscillator, once again, there is no problem: $x(t)$ and $p(t)$ are known, and that’s all we need in classical mechanics. In quantum mechanics, the situation is a little different. There, the classical quantities $a(t)$ and $a^*(t)$ are transformed into operators and acquire a very clear physical meaning: one of them becomes an annihilation operator and the other a creation operator. And thus these operators become extremely useful. We will carry over all these things from classical and quantum mechanics to field theory, to the electromagnetic field. Well, we have established that the electromagnetic field is a set of oscillators. Now we turn to the next topic, quantization. What was the aim of representing the field as a set of oscillators? The point is that we can handle the oscillator; it is one of the very simple and well-understood topics in quantum mechanics. In quantum mechanics the classical coordinate $x(t)$ turns into an operator $\hat{x}$, which is conventionally represented as multiplication by $x$, i.e. $\hat{x} = x$. A time-independent operator! The classical momentum turns into the momentum operator, which reduces to a derivative, $\hat{p} = -i\hbar\,\frac{d}{dx}$. And, again, a classical time-dependent quantity becomes a time-independent operator. The time dependence is hidden in the wave functions. This is the usual Schrödinger representation. The annihilation operator $\hat{a}$ and the creation operator $\hat{a}^+$ have the well-known commutator

$$[\hat{a}, \hat{a}^+] = 1. \tag{2.13}$$
And then our classical Hamiltonian becomes the Schrödinger operator

$$\hat{H} = \tfrac{1}{2}\hbar\omega\left(\hat{a}^+\hat{a} + \hat{a}\hat{a}^+\right).$$

Using relation (2.13), we transform the operator $\hat{H}$ to the form

$$\hat{H} = \hbar\omega\left(\hat{n} + \tfrac{1}{2}\right), \qquad \hat{n} = \hat{a}^+\hat{a}.$$

The solutions for the one-dimensional oscillator are well known. The wave functions $\Psi_n(x,t)$ are Hermite polynomials $H_n$ multiplied by exponentials:

$$\Psi_n(x,t) = \psi_n(x)\,e^{-iE_n t/\hbar}, \qquad \psi_n(x) = \frac{H_n(x/l)}{\pi^{1/4}\sqrt{n!\,2^n\,l}}\;e^{-x^2/(2l^2)}, \qquad l = \sqrt{\frac{\hbar}{m\omega}}.$$

In this case, $x$ is a one-dimensional coordinate, not a four-dimensional vector. Since

$$\hat{H}\,\psi_n(x) = \hbar\omega\left(n + \tfrac{1}{2}\right)\psi_n(x),$$

the energy levels are equal to $\hbar\omega(n + 1/2)$, where $n$ takes the values 0, 1, 2, ... It is crucially important that the minimum energy differs from zero and is equal to $\hbar\omega/2$. So, since the operator $\hat{H}$ has such a structure, what can we say about the action of the operator $\hat{n}$ on the wave function? It returns the same function with the additional factor $n$:

$$\hat{n}\,\Psi_n(x,t) = n\,\Psi_n(x,t).$$

So, we can call the combination $\hat{n} = \hat{a}^+\hat{a}$ the operator of the number of quanta, whose eigenvalues are the integers $n = 0, 1, 2, \dots$ What are these quanta? We were talking about only one particle in the oscillator potential. When we say “the number of quanta”, what is meant by “quanta”? We say that the energy in this state can be “imagined” as the energy of a certain number of quanta. When there are no quanta, the energy is equal to $\hbar\omega/2$, while with quanta present, $\hbar\omega$ multiplied by the number of quanta is added. But this is just a manner of speaking. When we said that this operator could be called the operator of the number of quanta, we were talking just about the labeling of the energy levels and nothing else; there were no “real quanta”. However, if it is a charged particle moving in an electromagnetic field, then a transition from level $n$ to level $n-1$ results in the emission of a photon with energy $E_n - E_{n-1} = \hbar\omega$. The classical quantities $a(t)$ and $a^*(t)$ are transformed into the annihilation operator $\hat{a}$ and the creation operator $\hat{a}^+$; these operators act on the function corresponding to the $n$-th energy level and transform it into a function corresponding to the $(n-1)$-th or the $(n+1)$-th energy level, respectively:

$$\hat{a}\,\psi_n(x) = \sqrt{n}\,\psi_{n-1}(x), \qquad \hat{a}^+\psi_n(x) = \sqrt{n+1}\,\psi_{n+1}(x).$$

Let’s, once again, turn to the quantum picture for the electromagnetic field. Here the quantities $a_{\mathbf{k}\lambda}(t)$ and $a^*_{\mathbf{k}\lambda}(t)$ become the annihilation operators $\hat{a}_{\mathbf{k}\lambda}$ and the creation operators $\hat{a}^+_{\mathbf{k}\lambda}$ of quanta corresponding to a photon with energy $\hbar\omega_k$, momentum $\hbar\mathbf{k}$ and polarization $\lambda$. Our vector potential turns into a time-independent operator, which depends only on $\mathbf{r}$:

$$\hat{\mathbf{A}}(\mathbf{r}) = \sum_{\mathbf{k}\lambda}\sqrt{\frac{2\pi\hbar c^2}{\omega_k\mathcal{V}}}\left(\hat{a}_{\mathbf{k}\lambda}\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{kr}} + \hat{a}^+_{\mathbf{k}\lambda}\,\mathbf{e}^*_{\mathbf{k}\lambda}\,e^{-i\mathbf{kr}}\right). \tag{2.14}$$

The electric field $\boldsymbol{\mathcal{E}}(t,\mathbf{r})$ and the magnetic field $\mathbf{B}(t,\mathbf{r})$ also become time-independent operators:

$$\hat{\boldsymbol{\mathcal{E}}}(\mathbf{r}) = \sum_{\mathbf{k}\lambda}\frac{i\omega_k}{c}\sqrt{\frac{2\pi\hbar c^2}{\omega_k\mathcal{V}}}\left(\hat{a}_{\mathbf{k}\lambda}\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{kr}} - \hat{a}^+_{\mathbf{k}\lambda}\,\mathbf{e}^*_{\mathbf{k}\lambda}\,e^{-i\mathbf{kr}}\right), \tag{2.15}$$

$$\hat{\mathbf{B}}(\mathbf{r}) = \sum_{\mathbf{k}\lambda}\sqrt{\frac{2\pi\hbar c^2}{\omega_k\mathcal{V}}}\;i\mathbf{k}\times\left(\hat{a}_{\mathbf{k}\lambda}\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{kr}} - \hat{a}^+_{\mathbf{k}\lambda}\,\mathbf{e}^*_{\mathbf{k}\lambda}\,e^{-i\mathbf{kr}}\right).$$
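Instead of the total energy and momentum of the field, we obtain sums of the Hamiltonians and momentum operators of the individual photons:

$$\hat{H} = \sum_{\mathbf{k}\lambda}\hat{H}_{\mathbf{k}\lambda}, \qquad \hat{H}_{\mathbf{k}\lambda} = \tfrac{1}{2}\hbar\omega_k\left(\hat{a}^+_{\mathbf{k}\lambda}\hat{a}_{\mathbf{k}\lambda} + \hat{a}_{\mathbf{k}\lambda}\hat{a}^+_{\mathbf{k}\lambda}\right), \qquad \hat{\mathbf{P}} = \sum_{\mathbf{k}\lambda}\frac{\mathbf{k}}{|\mathbf{k}|}\,\frac{\hat{H}_{\mathbf{k}\lambda}}{c}. \tag{2.16}$$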
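Using the commutation relations

$$[\hat{a}_{\mathbf{k}\lambda}, \hat{a}^+_{\mathbf{k}'\lambda'}] = \delta_{\lambda\lambda'}\,\delta_{\mathbf{k}\mathbf{k}'}, \qquad [\hat{a}_{\mathbf{k}\lambda}, \hat{a}_{\mathbf{k}'\lambda'}] = 0, \qquad [\hat{a}^+_{\mathbf{k}\lambda}, \hat{a}^+_{\mathbf{k}'\lambda'}] = 0, \tag{2.17}$$

we transform the operator $\hat{H}_{\mathbf{k}\lambda}$ to the form

$$\hat{H}_{\mathbf{k}\lambda} = \hbar\omega_k\left(\hat{n}_{\mathbf{k}\lambda} + \tfrac{1}{2}\right), \qquad \hat{n}_{\mathbf{k}\lambda} = \hat{a}^+_{\mathbf{k}\lambda}\hat{a}_{\mathbf{k}\lambda}, \tag{2.18}$$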
where $\hat{n}_{\mathbf{k}\lambda}$ is the operator of the number of quanta, whose eigenvalues are the integers $n_{\mathbf{k}\lambda} = 0, 1, 2, \dots$ It can be shown that right (left) circular polarization of the photon corresponds to helicity$^1$ equal to $\pm\hbar$.

$^1$ Recall that helicity is the projection of the total angular momentum of a particle onto the direction of its momentum.
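The algebra of a single oscillator mode can be illustrated with small matrices. This is only a sketch: the operators are truncated to the first few number states (the truncation size `DIM` is an assumption of the example), so the commutator equals 1 everywhere except in the last diagonal element, which is an artifact of cutting the infinite matrices. The Hamiltonian is written in units of $\hbar\omega$.

```python
import math

DIM = 6  # truncated number basis |0>, ..., |5> (an assumption of this sketch)

def annihilation(dim):
    # Matrix of a in the number basis: <n-1| a |n> = sqrt(n)
    return [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(dim)]
            for row in range(dim)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

a = annihilation(DIM)
a_dag = transpose(a)       # a is real, so its adjoint is just the transpose
n_op = matmul(a_dag, a)    # number operator a^+ a
a_a_dag = matmul(a, a_dag)

commutator = [[a_a_dag[i][j] - n_op[i][j] for j in range(DIM)] for i in range(DIM)]
# Hamiltonian in units of hbar*omega: (a^+ a + a a^+) / 2
hamiltonian = [[0.5 * (n_op[i][j] + a_a_dag[i][j]) for j in range(DIM)]
               for i in range(DIM)]

print([round(n_op[i][i], 12) for i in range(DIM)])
print([round(commutator[i][i], 12) for i in range(DIM)])
print([round(hamiltonian[i][i], 12) for i in range(DIM)])
```

Away from the truncation edge the diagonal of `hamiltonian` is $n + 1/2$, which is the matrix form of the passage from the symmetrized product to the number operator.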
2.3. Creation and annihilation of field quanta

Let’s talk about the production and annihilation of quanta of the electromagnetic field. For the usual oscillator, the picture looks like this: the operator $\hat{a}^+$ or $\hat{a}$ transforms the time-dependent function $\Psi_n(x,t)$ into a function that corresponds to a higher or lower energy level, and the important factor $\sqrt{n+1}$ or $\sqrt{n}$ is also there:

$$\hat{a}^+\Psi_n(x,t) = \sqrt{n+1}\;\Psi_{n+1}(x,t)\,e^{i\omega t}, \qquad \hat{a}\,\Psi_n(x,t) = \sqrt{n}\;\Psi_{n-1}(x,t)\,e^{-i\omega t}.$$

Accordingly, the operator $\hat{a}^+$ is called the creation operator and $\hat{a}$ is called the annihilation operator. The same reasoning can be applied to the electromagnetic field. Let’s take, for example, the creation operator $\hat{a}^+_{\mathbf{k}\lambda}$ for a well-defined quantum with a given wave vector $\mathbf{k}$ (and hence a given frequency $\omega_k = c|\mathbf{k}|$) and a given polarization $\lambda$, i.e. the operator of creation of a monochromatic plane wave of a certain polarization. This operator acts on a state of the field $|\dots, n_{\mathbf{k}\lambda}, \dots, t\rangle$ in which there are a number of quanta; that is, there may be other quanta, too. The state of the field should then be described by a set of quantum numbers corresponding to all values of $\mathbf{k}$ and $\lambda$. And we agreed that this operator is time-independent, while the wave function is time-dependent. As before, we have

$$\hat{a}^+_{\mathbf{k}\lambda}\,|\dots, n_{\mathbf{k}\lambda}, \dots, t\rangle = \sqrt{n_{\mathbf{k}\lambda}+1}\;|\dots, n_{\mathbf{k}\lambda}+1, \dots, t\rangle\,e^{i\omega_k t},$$

$$\hat{a}_{\mathbf{k}\lambda}\,|\dots, n_{\mathbf{k}\lambda}, \dots, t\rangle = \sqrt{n_{\mathbf{k}\lambda}}\;|\dots, n_{\mathbf{k}\lambda}-1, \dots, t\rangle\,e^{-i\omega_k t}.$$

All the other occupation numbers do not change. And we need to trace the time dependence carefully. Thus, the creation operator $\hat{a}^+$ and the annihilation operator $\hat{a}$ produce or annihilate quanta of the electromagnetic field, i.e. photons. While earlier this was largely a convenient terminology associated with transitions from level to level, here the levels are also levels of the field. If the field energy changes, quanta of this field accordingly appear or disappear. Such is the picture.
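Okay, we've established that the field is a quantized one. What happens when the operator $\hat{\mathbf A}(\mathbf r)$, or the operator $\hat{\mathbf E}(\mathbf r)$, or the operator $\hat{\mathbf B}(\mathbf r)$ acts? According to Eqs. (14) and (15), these operators contain both $\hat a$ and $\hat a^+$, and thus they can cause transitions to states with the number of quanta smaller or greater by 1. This corresponds to absorption or radiation of one photon. For example, the matrix elements of the operator $\hat{\mathbf A}(\mathbf r)$ are: for radiation of the photon

$$\langle n_{\mathbf{k}\lambda}+1, t\,|\,\hat{\mathbf A}(\mathbf r)\,|\,n_{\mathbf{k}\lambda}, t\rangle = \mathbf A^{\rm rad}_{fi}(\mathbf r)\,e^{i\omega_k t}, \qquad \mathbf A^{\rm rad}_{fi}(\mathbf r) = \sqrt{\frac{2\pi\hbar c^2}{\omega_k V}}\,\sqrt{n_{\mathbf{k}\lambda}+1}\;\mathbf e^{*}_{\mathbf{k}\lambda}\,e^{-i\mathbf{k}\mathbf{r}}, \tag{2.19}$$

for absorption of the photon

$$\langle n_{\mathbf{k}\lambda}-1, t\,|\,\hat{\mathbf A}(\mathbf r)\,|\,n_{\mathbf{k}\lambda}, t\rangle = \mathbf A^{\rm abs}_{fi}(\mathbf r)\,e^{-i\omega_k t}, \qquad \mathbf A^{\rm abs}_{fi}(\mathbf r) = \sqrt{\frac{2\pi\hbar c^2}{\omega_k V}}\,\sqrt{n_{\mathbf{k}\lambda}}\;\mathbf e_{\mathbf{k}\lambda}\,e^{i\mathbf{k}\mathbf{r}}. \tag{2.20}$$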
Likewise, since $\hat{\mathbf A}(\mathbf r)$ is related merely linearly to $\hat{\mathbf E}(\mathbf r)$ and $\hat{\mathbf B}(\mathbf r)$, the action of the operators $\hat{\mathbf E}(\mathbf r)$ and $\hat{\mathbf B}(\mathbf r)$ gives a result of the same kind with slightly different coefficients. As a result, for a field state with a given number of quanta, this number is increased or decreased by 1, nothing more.

Please pay attention to the factor $\sqrt{n_{\mathbf{k}\lambda}+1}$ in Eq. (19). I emphasize it because this factor, which is purely quantum-mechanical and has no counterpart in classical mechanics, is the fundamental basis of laser technology. The point is this. Suppose you have a medium in which the atoms are in inverted states, i.e. they are not at the ground level but at some excited one. A spontaneous transition from this level to the ground one can occur, with emission of quanta of different kinds. Of course, they will all have the same energy, but, generally speaking, they will have different $\mathbf k$ and possibly different $\lambda$. If this medium is placed in a resonator (e.g. in a transparent ruby crystal with polished translucent walls), the quanta that fly off in arbitrary directions will escape, whereas those quanta which propagate along the crystal axis will bounce back and forth between the walls. So we'll have a field state with several quanta of this kind, with definite $\mathbf k$ and $\lambda$ that correspond to motion along the crystal axis. Now, you know that such a transition takes place under the influence of the field. Moreover, you know that it occurs with a probability proportional to the factor $\left(\sqrt{n_{\mathbf{k}\lambda}+1}\right)^2 = n_{\mathbf{k}\lambda}+1$. If we accumulate a sufficient number of quanta, the probability increases many times. Moreover, this is the probability of emission of quanta of this kind only. All the quanta are identical, with the same wave vector $\mathbf k$ and polarization $\lambda$, i.e. this will be a coherent state. When a sufficient number of quanta has accumulated, they are released through the translucent walls. This is the basis of laser operation.
2.4. The Heisenberg representation versus the Schrödinger representation
So, we've developed a theory of quantization of the electromagnetic field, and we will apply this scheme in all further cases. In the case considered, we had the field coordinates and momenta $a(t)$ and $a^*(t)$. We started from the vector potential. In a general sense, the vector potential is a field coordinate. However, a usual coordinate $x, y, z$ corresponds to a given particle, whereas the coordinate $\mathbf A(\mathbf r)$ is a function. That is, instead of the coordinates of certain particles, which can be listed, there appears a continuous distribution, and in this sense the field is similar to a continuous medium in hydrodynamics. Just remember that we in fact associate $\mathbf A(\mathbf r)$ with the coordinates and momenta of the field itself. Apparently, we should do the same in other cases, too, when we begin describing fields via potentials or functions. These functions may differ from the vector function of the electromagnetic field. They may be scalars, which in turn can be real or complex. They may be spinors, again real or complex. The scheme will be the same: we'll expand the corresponding coordinates in a series of plane waves and choose the coefficients in the expansion in such a way that the total field energy, i.e. the corresponding Hamiltonian, looks like that of the electromagnetic field: it will be proportional to the relevant quantum number operator.

We could probably make this “golden dream” a little more convenient if we start to treat time and the coordinates on equal terms. Indeed, everything that we have done is normal, good, and strict, but rather inconvenient. While we were in the framework of nonrelativistic quantum mechanics, we did not really care whether the coordinates and time were on an equal footing. And they are not: in the Schrödinger equation the time and the coordinates do not enter symmetrically. The differentiation with respect to the coordinates is of second order, whereas the differentiation with respect to time is of first order. Sure, for nonrelativistic quantum mechanics this is natural, right and good, and the formalism developed there works very well. But we're going to deal with field theory, a fundamental aspect of which is the possibility of creation and annihilation of quanta. This requires a relativistic theory from the very beginning. In a relativistic theory, the coordinates and time must form a four-dimensional unity. This means that a description in which the operators depend on the coordinates, but not on the time, is inconvenient for the problems of relativistic quantum field theory. Such a description uses the Schrödinger picture of quantum mechanics. But you know about the Heisenberg picture, in which the time dependence of the states is transferred to the operators. This picture is more convenient for us. The transfer is easily done if you look at the matrix elements in Eqs. (19) and (20): let's perform the expansion in Eq. (14) not just in plane waves $e^{i\mathbf{k}\mathbf{r}}$, but in plane waves containing the time dependence, $e^{-i(\omega_k t - \mathbf{k}\mathbf{r})}$. This corresponds to the replacements

$$\hat a_{\mathbf{k}\lambda} \to \hat a_{\mathbf{k}\lambda}\,e^{-i\omega_k t}, \qquad \hat a^+_{\mathbf{k}\lambda} \to \hat a^+_{\mathbf{k}\lambda}\,e^{i\omega_k t}. \tag{2.21}$$

Now the vector potential itself depends on time:

$$\hat{\mathbf A}(x) = \sum_{\mathbf{k}\lambda} \sqrt{4\pi\hbar c^2}\left( \hat a_{\mathbf{k}\lambda}\,\mathbf e_{\mathbf{k}\lambda}\,\frac{e^{-ikx}}{\sqrt{2\omega_k V}} + \hat a^+_{\mathbf{k}\lambda}\,\mathbf e^{*}_{\mathbf{k}\lambda}\,\frac{e^{ikx}}{\sqrt{2\omega_k V}} \right), \qquad kx = \omega_k t - \mathbf{k}\mathbf{r}. \tag{2.22}$$
Here the coefficients $\hat a_{\mathbf{k}\lambda}$ at $e^{-ikx}$ and $\hat a^+_{\mathbf{k}\lambda}$ at $e^{ikx}$ satisfy the same commutation relations (17), and they are the annihilation and creation operators of the field quanta, photons, with energy $\hbar\omega_k$, momentum $\hbar\mathbf k$ and polarization $\lambda$. Moreover, the right or left circular polarization corresponds to helicity $\pm\hbar$. To show this, we would have to consider, in addition to the energy and momentum, the relativistic angular momentum. That is a somewhat more cumbersome matter, so we won't deal with it. But the very assertion that the right and left circular polarizations correspond to definite helicities appeals to facts you know from quantum mechanics. Note that the wave function

$$\sqrt{4\pi\hbar c^2}\;\mathbf e_{\mathbf{k}\lambda}\,\frac{e^{-ikx}}{\sqrt{2\omega_k V}} \tag{2.23}$$

corresponds to normalization to one particle in the whole volume $V$. We will use the same normalization for all other fields.

Problem 2.1. Consider the electromagnetic field in a homogeneous, transparent and nonmagnetic medium with the refraction index $n > 1$. Perform the quantization of this field, taking into account that the corresponding Maxwell equations are

$$\operatorname{div}\mathbf E = 0, \qquad \operatorname{rot}\mathbf E = -\frac{1}{c}\frac{\partial \mathbf B}{\partial t}, \qquad \operatorname{div}\mathbf B = 0, \qquad \operatorname{rot}\mathbf B = \frac{n^2}{c}\frac{\partial \mathbf E}{\partial t},$$

and the energy of the field is

$$\mathcal E = \int \frac{n^2\mathbf E^2 + \mathbf B^2}{8\pi}\,d^3r.$$
§3. The Lagrange approach in the field theory

We've finished everything associated with the electromagnetic field and are starting a new topic, which concerns other fields. We will consider scalar fields and then spinor fields. What would be a reasonable way to proceed in this case? Many issues in relation to the electromagnetic field were taken off the table due to the fact that we know a lot about the electromagnetic field. E.g. we know that classical electromagnetic fields obey the Maxwell equations. The vector potential obeys the wave equation; that was the most significant point for us. The wave equation, in turn, means that we had the link $\omega_k = c|\mathbf k|$ between the wave vector and the frequency of a plane wave or, as we now understand, between the energy $\hbar\omega_k$ and momentum $\hbar\mathbf k$ of a single quantum. Apparently, we should do the same in relation to other fields. We must also keep the equations of motion in mind. We must keep in mind the expressions for the energy and momentum (and the angular momentum, which I did not write, but which can be written). However, unlike the electromagnetic field (for which these issues were covered during your second year), we have no such information in relation to other fields. Let's begin by postulating the corresponding equations and the corresponding conservation laws. The validity of these postulates has been confirmed by experiment.

On the other hand, there is a common approach that allows deriving the equations of motion and the conservation laws. It is well known in classical mechanics. It is the Lagrange approach. It could also be applied to the electromagnetic field. This is precisely the approach used in the book The Classical Theory of Fields by Landau and Lifshitz [5]. In the beginning, a corresponding Lagrangian is postulated for the electromagnetic field, as well as the principle of least action, from which the Maxwell equations and the expressions for the energy, momentum, etc. are derived.

We did not use this approach, which is technically rather complicated, because we know everything about the electromagnetic field without it. But using this approach in relation to other fields may be justified. Let's recall how the Lagrange approach works in classical mechanics. Then we'll use it to handle the simplest fields, scalar fields. Such is our program for the nearest future.
3.1. The Lagrangian equations

So, let's recall how the classical picture is constructed in mechanics. You've completed the course of analytical mechanics, but you could have forgotten the formulas, so let's recall them (see, for example, Mechanics by Landau and Lifshitz [6]). Ideologically, our reasoning in the field theory will be the same, but with some complications. It would be good to recall the simple variants. How does the description in classical mechanics begin? It is assumed that we have the Lagrangian $L(q, \dot q)$. It is a function of some generalized coordinates $q_i$ and velocities $\dot q_i = \partial_0 q_i$. Generally speaking, it depends on the time too, but this is unimportant for us so far. In the Newton picture we begin with postulating the equations of motion. In the Lagrange approach we begin with postulating the principle of least action $\delta S = 0$, which says the following. Let's construct the action, which is the integral of the Lagrangian

$$S = \int_{t_1}^{t_2} L(q, \dot q)\,dt,$$
where the generalized coordinate and generalized velocity depend on time. Therefore, we can integrate this function from the point $t_1$ to the point $t_2$. We'll obtain a number, the action $S$. How is the number obtained? What is $t_1$? What is $t_2$? What is $q(t)$? The idea is that we look at a picture where a particle is at the point $q^{(1)}$ at the time $t_1$. Generally speaking, there can be a lot of $q$ coordinates, so let's keep the subscripts to enumerate these coordinates. E.g. the usual Cartesian coordinates $x, y, z$ will be $q_1, q_2, q_3$. This is the start moment, when the particle is at this point. We are interested in how the particle will move in order to come to the point $q^{(2)}$ in the $(q, t)$ plane at the time $t_2$. These are the start and end positions. By the way, different velocities at the start point will result in different trajectories, which could lead, in general, God knows where. Among them, we'll mark out the path that corresponds to the real motion, in which the particle that was initially at the point $q^{(1)}$ passes to the point $q^{(2)}$; this will be the function $q(t)$ we are looking for. The particle can pass because only the initial and final coordinates are fixed in the statement of the problem. The initial velocities are not fixed; they can be arbitrary.

Well, how do we find the real path? We consider the set of all possible curves that differ from the desired one by some addition. This addition, let's designate it as $\delta q(t)$, is an arbitrary function. There are certain conditions on its smoothness, but the varied curve necessarily begins at the point 1 and ends at the point 2. Thus, we should impose the conditions $\delta q_i(t_1) = \delta q_i(t_2) = 0$, i.e. all curves must begin and end at the same places where the real curve does. In such circumstances, what is the action $S$, when the number $S$ is determined by the varied function $q(t) + \delta q(t)$? Mathematicians call such an object a functional. So, this is a functional on the set of all possible $q(t) + \delta q(t)$. Then we search for an extremum of the functional, $\delta S = 0$, under the conditions $\delta q_i(t_1) = \delta q_i(t_2) = 0$ imposed on the variations of the coordinates. It is declared that this postulate replaces Newton's equations.
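The statement that the real path makes the action extremal among all curves with fixed end points can be illustrated numerically. The following is a minimal sketch under stated assumptions: a harmonic oscillator with unit mass, $L = \dot q^2/2 - \omega^2 q^2/2$, a time interval shorter than $\pi/\omega$ (so that the extremum is a minimum), and one particular variation $\delta q(t) = \eta \sin(\pi t/T)$ that vanishes at both ends. The values of `OMEGA`, `T`, `N` and `eta` are illustrative choices, not taken from the lectures.

```python
import math

OMEGA = 1.0      # oscillator frequency (assumed example value)
T = 1.0          # t2 - t1, chosen shorter than pi / OMEGA
N = 1000         # number of time steps in the discretized action

def action(path, dt):
    """Discretized S = sum of L(q, dq/dt) dt with L = q'^2/2 - OMEGA^2 q^2/2."""
    s = 0.0
    for k in range(len(path) - 1):
        velocity = (path[k + 1] - path[k]) / dt
        midpoint = 0.5 * (path[k + 1] + path[k])
        s += (0.5 * velocity**2 - 0.5 * OMEGA**2 * midpoint**2) * dt
    return s

dt = T / N
times = [k * dt for k in range(N + 1)]
true_path = [math.sin(OMEGA * t) for t in times]   # solves q'' = -OMEGA^2 q

def varied_path(eta):
    # delta q(t) = eta * sin(pi t / T) vanishes at t1 = 0 and t2 = T
    return [q + eta * math.sin(math.pi * t / T) for q, t in zip(true_path, times)]

s_true = action(true_path, dt)
s_plus = action(varied_path(0.1), dt)
s_minus = action(varied_path(-0.1), dt)
print(s_true, s_plus, s_minus)
```

Both varied paths give a larger action than the real one, and the increase is the same for $\eta$ and $-\eta$ up to discretization error, as expected when the first-order variation vanishes and only the second-order term remains.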
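From this postulate we can obtain the equations of motion

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, 2, \ldots, s.$$

It may be useful to briefly recall this derivation. Let's consider the variation of the functional

$$\delta S = \int_{t_1}^{t_2} \delta L(q, \dot q)\,dt, \qquad \delta L(q, \dot q) = \frac{\partial L}{\partial q}\,\delta q + \frac{\partial L}{\partial \dot q}\,\delta \dot q.$$

There follows a trick that is easy to demonstrate in this one-dimensional case and that we'll use later on in the field theory. The last term can be written in a different way:

$$\frac{\partial L}{\partial \dot q}\,\delta \dot q = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\delta q\right) - \left(\frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\delta q.$$

And then we require that this variation of the action be equal to zero:

$$\delta S = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\delta q\,dt + \int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\delta q\right)dt = 0.$$

The second term contains a total derivative. The total derivative integrated over time gives an expression which equals zero,

$$\int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\delta q\right)dt = \left.\frac{\partial L}{\partial \dot q}\,\delta q\,\right|_{t_1}^{t_2} = 0,$$

if we recall the requirement $\delta q_i(t_1) = \delta q_i(t_2) = 0$. Look at the result:

$$\delta S = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\delta q\,dt = 0.$$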
The integral equals zero. This, however, does not by itself imply that the integrand is equal to zero. For example, the integral

$$\int_0^{2\pi}\sin\varphi\,d\varphi$$

is equal to zero, whereas $\sin\varphi$, in general, is not equal to zero. However, here we have the additional condition that the integral vanishes for any function $\delta q(t)$. And the fundamental lemma of variational calculus states that if this is true for all variations $\delta q(t)$, then the factor

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}$$

is necessarily equal to zero. This results in the equation of motion.

We can perform the same procedure in classical field theory. We set the generalized coordinates, and then the generalized velocities are automatically known as the derivatives of the coordinates. The whole system is set when the Lagrangian is known. Once the Lagrangian is known, we can construct the action. And, putting forth the postulate of an extremum of the action under certain conditions, we obtain the equations of motion. In addition, from the action we'll obtain the integrals of motion. All this can be transferred to classical field theory. As earlier, we should start from the definition of the coordinates. But what do we mean by the coordinates? They will be $A_\mu(x)$ in electrodynamics, $\Phi(x)$ for a real scalar field, $\varphi(x)$ and $\varphi^*(x)$ for a complex scalar field, $\Psi_i(x)$ and $\overline{\Psi}_i(x)$ for a Dirac spinor field, and so on. E.g. in the case of the electromagnetic field, the coordinates are the vector potential $A_\mu(x)$ as a function of the 4-radius vector $x$. If we take a real scalar field, it must be a single function. We shall designate it as $\Phi(x)$. The treatment will be the same as for the electromagnetic field, and very close to what we were doing there. You remember that each component of $A_\mu$ was a real function. $\Phi(x)$ will also be a real function. If the scalar field is complex, then, generally speaking, there are two coordinates, $q_1(x)$ and $q_2(x)$. One of them will be the function $\varphi(x)$, and the other will be its conjugate $\varphi^*(x)$. They are, generally speaking, two independent functions. We could take the real and imaginary parts instead, but our choice is more convenient.
Finally, the coordinates can be $\Psi_i(x)$ and $\overline{\Psi}_i(x)$ for a spinor field, i.e. a Dirac field. A usual spin 1/2 has two components, whereas the Dirac field has four components. So the index $i$ runs over four values corresponding to the four components of the bispinor. It is important that this is a complex function, in fact four complex functions, so there will be eight coordinates. Let me say in advance what the coordinates are: they are functions of the 4-radius vector, one function or several functions. Instead of a usual Lagrangian, which depends on $q(t)$ and $\dot q(t)$, we have to use the Lagrangian density:

$$L \to \int \mathcal L\big(q(x), \partial_\mu q(x)\big)\,d^3r.$$

The Lagrangian density, designated with the calligraphic letter $\mathcal L$, is an analog of the usual Lagrangian. Therefore, the action is²

$$S = \int_\Omega \mathcal L(q, \partial_\mu q)\,d^4x,$$

where $\Omega$ is a part of 4-space between two space-like hypersurfaces, for example, between $t = t_1$ and $t = t_2$ (Fig. 4).

Figure 4: Region Ω

This action is a functional of the coordinates $q(x)$; we'll vary these coordinates and find the variation of the action. In accordance with the principle of least action, we demand that the variation of the action vanish,

$$\delta S = 0$$
under the condition that the variations $\delta q_i = 0$ on the boundary $\Sigma$ of the region $\Omega$. The requirements on the Lagrangian density are:

• Since the action is a number, the function $\mathcal L$ under the integral over $d^4x$ must be a relativistic invariant;

• $\mathcal L$ must be a real function, since we use it for obtaining the energy, momentum, etc.;

• The function $\mathcal L$ must depend on a finite number of derivatives. In mathematics it is stated that if you take an infinite set of derivatives, you can simulate with them an integral operator that corresponds to a nonlocal interaction. But in all calculations we will confine ourselves to the first derivatives in the Lagrangian density, like in classical mechanics, where the Lagrangian depends on the first derivatives with respect to time, while the equations of motion are second-order equations like Newton's equation.

The choice of the Lagrangian density $\mathcal L$ is ambiguous. Indeed, the replacement

$$\mathcal L \to \mathcal L' = \mathcal L + \partial_\mu f^\mu(q)$$

gives the same variation of the action:

$$\delta S' = \delta S + \delta\int_\Omega \partial_\mu f^\mu(q)\,d^4x = \delta S + \delta\oint_\Sigma f^\mu\,d\Sigma_\mu = \delta S.$$

Here we use the generalization of the usual 3-dimensional Gauss theorem

$$\int_V (\nabla\cdot\mathbf f)\,d^3r = \oint_S \mathbf f\cdot d\mathbf S$$

to a region in 4-space:

$$\int_\Omega \partial_\mu f^\mu\,d^4x = \oint_\Sigma f^\mu\,d\Sigma_\mu,$$

and take into account that the coordinates $q(x)$ do not vary on the hypersurface $\Sigma$.

² Here and below we use the relativistic units with $\hbar = 1$, $c = 1$.
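The same ambiguity already exists in classical mechanics, where adding a total time derivative to the Lagrangian does not change the equations of motion. As a short worked check (written for one coordinate $q$ and an arbitrary smooth function $f(q)$, which is introduced here only for illustration):

```latex
L' = L + \frac{d f(q)}{dt} = L + f'(q)\,\dot q ,
\qquad
\frac{d}{dt}\frac{\partial L'}{\partial \dot q} - \frac{\partial L'}{\partial q}
= \frac{d}{dt}\frac{\partial L}{\partial \dot q} + f''(q)\,\dot q
  - \frac{\partial L}{\partial q} - f''(q)\,\dot q
= \frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} .
```

The extra terms cancel identically, so $L$ and $L'$ lead to the same Lagrange equation; the 4-divergence $\partial_\mu f^\mu(q)$ plays the same role for the Lagrangian density.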
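The requirement $\delta S = 0$ leads to the expression

$$\delta S = \int_\Omega\left\{\frac{\partial\mathcal L}{\partial q}\,\delta q + \frac{\partial\mathcal L}{\partial(\partial_\mu q)}\,\delta(\partial_\mu q)\right\}d^4x = \int_\Omega\left\{\left[\frac{\partial\mathcal L}{\partial q} - \frac{\partial}{\partial x^\mu}\frac{\partial\mathcal L}{\partial(\partial_\mu q)}\right]\delta q + \frac{\partial}{\partial x^\mu}\left[\frac{\partial\mathcal L}{\partial(\partial_\mu q)}\,\delta q\right]\right\}d^4x = 0.$$

The last term can be transformed according to the Gauss theorem, and then it vanishes, since $\delta q|_\Sigma = 0$. As a result, we obtain the equations of motion for the fields:

$$\partial_\mu\frac{\partial\mathcal L}{\partial(\partial_\mu q_i)} - \frac{\partial\mathcal L}{\partial q_i} = 0, \qquad i = 1, 2, \ldots, s.$$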
This is the scheme of the Lagrange approach in classical field theory. The book The Classical Theory of Fields by Landau and Lifshitz [5], for example, starts with an explanation of the Lagrange approach for the case not of material points but of a continuous medium, such as the electromagnetic field.
3.2. Symmetry and the conservation laws

Employing the Lagrange approach, we can easily deal with the conservation laws, because they are obtained automatically in this approach.

3.2.1. The Nöther theorem

In classical mechanics, the conservation laws may be deduced from the Nöther theorem. Let's recall it. Suppose that we have an infinitesimal transformation of the coordinates and time

$$q \to q' = q + \delta q, \qquad t \to t' = t + \delta t, \tag{3.1a}$$

where $\delta q = \delta q(q, t)$ and $\delta t = \delta t(q, t)$, and suppose that under such a transformation the action stays unchanged³, i.e. $\delta S = 0$:

$$\int_{t_1}^{t_2} L\left(q, \frac{dq}{dt}\right)dt = \int_{t_1'}^{t_2'} L\left(q', \frac{dq'}{dt'}\right)dt' \tag{3.1b}$$

to first order in $\delta q$ and $\delta t$ inclusive. Then the quantity

$$E\,\delta t - p\,\delta q = \text{const}, \qquad \text{where} \quad p = \frac{\partial L}{\partial \dot q}, \quad E = \frac{\partial L}{\partial \dot q}\,\dot q - L,$$

is conserved. Or, in other words, the quantity

$$\delta\Theta = \left(\sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i - L\right)\delta t - \sum_i \frac{\partial L}{\partial \dot q_i}\,\delta q_i \tag{3.2a}$$

satisfies the equation

$$\frac{d\,\delta\Theta}{dt} = 0. \tag{3.3a}$$

³ This means that the left and right sides of Eq. (1b) contain the same function $L$, but of different arguments.
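As an illustration of Eq. (3.2a), consider a rotation in the plane, $\delta x = -\varepsilon y$, $\delta y = \varepsilon x$, $\delta t = 0$, for which $\delta\Theta = -\varepsilon\,(x p_y - y p_x)$, so the conserved quantity is the angular momentum. The sketch below checks this numerically under stated assumptions: a particle of unit mass in the central potential $U = (x^2 + y^2)^2/4$, integrated with the velocity-Verlet scheme. The potential, the initial data and the step size are illustrative choices only.

```python
def force(x, y):
    """Central force -grad U for the assumed potential U = (x^2 + y^2)^2 / 4."""
    r2 = x * x + y * y
    return -r2 * x, -r2 * y

def noether_charge(x, y, px, py):
    # -delta_Theta / eps for delta x = -eps * y, delta y = eps * x, delta t = 0
    return x * py - y * px

def evolve(x, y, px, py, dt, steps):
    """Velocity-Verlet integration of the equations of motion (unit mass, so p = velocity)."""
    fx, fy = force(x, y)
    for _ in range(steps):
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
        x += dt * px
        y += dt * py
        fx, fy = force(x, y)
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
    return x, y, px, py

state0 = (1.0, 0.0, 0.3, 0.8)
state1 = evolve(*state0, dt=1e-3, steps=5000)
print(noether_charge(*state0), noether_charge(*state1))
```

The two printed values agree to rounding accuracy although the position and momentum themselves change considerably along the trajectory.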
It can also be expressed in the following statement: if the mechanical system has a symmetry (1a)-(1b), then the conservation law (3a) holds. In fact, this statement is a modified and simplified Nöther theorem, adapted for the purposes of classical mechanics, whereas the original theorem of Emmy Nöther concerns precisely continuous systems, like the classical fields.

The Nöther theorem for the classical fields: Suppose that we have an infinitesimal transformation of the 4-coordinates $x^\mu \to x'^\mu = x^\mu + \delta x^\mu$ and fields $q_i \to q_i' = q_i + \delta q_i$, and suppose that under such a transformation the field action stays unchanged, i.e.

$$\delta S = 0. \tag{3.1c}$$

Then the quantity

$$\delta\Theta^\mu = T^{\mu\nu}\,\delta x_\nu - \sum_i \frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\,\delta q_i, \qquad T^{\mu\nu} = \sum_i \frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\,\partial^\nu q_i - g^{\mu\nu}\mathcal L \tag{3.2b}$$

satisfies the continuity equation

$$\frac{\partial\,\delta\Theta^\mu}{\partial x^\mu} = 0, \tag{3.3b}$$

from which the following conservation law follows:

$$\int \delta\Theta^0\,d^3r = \text{const}. \tag{3.3c}$$
Well, we got a fairly general statement which is not worth proving because that would be very long technically and very simple ideologically. It is the direct transfer from the classical mechanics to the field theory. Instead, we’ll proceed as follows. We’ll consider this proof by pieces: first, for the case when only the coordinates change and then for the case when only the fields change. And thus we’ll prove this theorem. In each separate case, technically this can be easily done, whereas in the general case the proof is quite cumbersome. If you look at the proof in the book Introduction to the Theory of Quantized Fields by Bogolyubov and Shirkov [2], it takes a few pages. So, let us consider two important examples.
3.2.2. The homogeneity of the space-time and conservation of momentum-energy

We know from classical mechanics that the homogeneity of space results in the law of conservation of the momentum of the whole system, and the uniformity of time results in the law of conservation of energy. We'll try to show this for a classical field. Since the coordinates and the time are on equal terms, the homogeneity of the four-dimensional space-time immediately leads to the law of conservation of the energy-momentum. Let our action be invariant with respect to shifts in the four-dimensional space $x^\mu \to x'^\mu = x^\mu + \varepsilon^\mu$, i.e. $\delta S = 0$ at $\delta x^\mu = \varepsilon^\mu$ and $\delta q = 0$. In this case, it follows from the Nöther theorem that the 4-vector $\delta\Theta^\mu = T^{\mu\nu}\varepsilon_\nu$, where

$$T^{\mu\nu} = \sum_i \frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\,\partial^\nu q_i - g^{\mu\nu}\mathcal L \tag{3.4}$$

is the density of the energy-momentum tensor, obeys Eq. (3b), i.e. the equation

$$\frac{\partial T^{\mu\nu}}{\partial x^\mu} = 0. \tag{3.5}$$

Therefore, the total 4-momentum of the field is conserved:

$$\int T^{0\mu}\,d^3r = P^\mu = \text{const}^\mu. \tag{3.6}$$
The Nöther theorem itself was not proved, just proclaimed, and thus its consequence is also a kind of proclamation. Let's try to prove it by a direct calculation. First, let us recall how that was done in classical mechanics. In classical mechanics, if nothing changes when some coordinate is shifted, then the invariance of the action means invariance of the corresponding Lagrangian. This means that the space is homogeneous. The physical essence of this statement is the following experimentally verifiable fact. Let us consider a system of material points; they have some initial conditions, from which the system develops. Let this experiment be done in empty space (there is no Earth, no Sun, no stars at all, no observer). If the same experiment is done at a different point in the empty space, then from the same initial conditions the system must develop exactly as it developed at the first point. This statement is understandable, unlike its consequences and proof. In the Lagrange approach in classical mechanics the following is proved: if the space is homogeneous in this sense, the corresponding generalized momentum is conserved. In the case of the usual space, the total usual momentum of the system must not change in time. The same is true for a shift in time. Let us consider this very case in detail. Namely, we consider the case when the action is invariant with respect to a shift in time, $t \to t' = t + \varepsilon$. This means that the Lagrangian depends on $q$ and $\dot q$ but not on time: $L = L(q, \dot q)$. Then the total time derivative of the Lagrangian reads

$$\frac{dL}{dt} = \frac{\partial L}{\partial q}\,\dot q + \frac{\partial L}{\partial \dot q}\,\frac{d\dot q}{dt}.$$
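Transforming the second term on the right-hand side,

$$\frac{\partial L}{\partial \dot q}\,\frac{d\dot q}{dt} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\dot q\right) - \dot q\,\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\right),$$

and using the Lagrange equations, we obtain

$$\frac{dL}{dt} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\dot q\right) \qquad \text{or} \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\dot q - L\right) = 0.$$

This is the law of conservation of the energy

$$E = \sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i - L = \text{const}.$$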
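The conservation of $E = \sum_i (\partial L/\partial\dot q_i)\,\dot q_i - L$ for a time-independent Lagrangian can be checked numerically. Below is a minimal sketch under stated assumptions: a pendulum of unit mass and length, $L = \dot q^2/2 - (1 - \cos q)$, integrated with the velocity-Verlet scheme, for which the energy is conserved up to a small error of order $dt^2$. The Lagrangian, the initial data and the step size are illustrative choices only.

```python
import math

def lagrangian(q, qdot):
    """Assumed example: pendulum with unit mass and length, L = qdot^2/2 - (1 - cos q)."""
    return 0.5 * qdot**2 - (1.0 - math.cos(q))

def energy(q, qdot):
    # E = (dL/dqdot) * qdot - L, with dL/dqdot = qdot for this Lagrangian
    p = qdot
    return p * qdot - lagrangian(q, qdot)

def evolve(q, qdot, dt, steps):
    """Velocity-Verlet integration of the Lagrange equation q'' = -sin q."""
    acc = -math.sin(q)
    for _ in range(steps):
        qdot += 0.5 * dt * acc
        q += dt * qdot
        acc = -math.sin(q)
        qdot += 0.5 * dt * acc
    return q, qdot

q0, qdot0 = 1.0, 0.0
q1, qdot1 = evolve(q0, qdot0, dt=1e-3, steps=10000)
print(energy(q0, qdot0), energy(q1, qdot1))
```

The coordinate and velocity at the end differ from the initial ones, while the combination $E$ stays the same within the accuracy of the integrator.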
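In the classical theory of fields we consider the case when the action is invariant with respect to shifts in 4-dimensional space, $x^\mu \to x'^\mu = x^\mu + \varepsilon^\mu$. This means that the Lagrangian density depends on $q$ and $\partial^\mu q$ but not explicitly on the 4-coordinates: $\mathcal L = \mathcal L(q, \partial^\mu q)$. Then we find the derivative

$$\frac{\partial\mathcal L}{\partial x^\mu} = \frac{\partial\mathcal L}{\partial q}\,\frac{\partial q}{\partial x^\mu} + \frac{\partial\mathcal L}{\partial(\partial_\nu q)}\,\frac{\partial(\partial_\nu q)}{\partial x^\mu}$$

and transform the second term on the right-hand side, using the identity

$$\frac{\partial(\partial_\nu q)}{\partial x^\mu} = \frac{\partial^2 q}{\partial x^\mu\,\partial x^\nu} = \frac{\partial^2 q}{\partial x^\nu\,\partial x^\mu} = \frac{\partial(\partial_\mu q)}{\partial x^\nu},$$

to the form

$$\frac{\partial\mathcal L}{\partial(\partial_\nu q)}\,\frac{\partial(\partial_\nu q)}{\partial x^\mu} = \frac{\partial}{\partial x^\nu}\left[\frac{\partial\mathcal L}{\partial(\partial_\nu q)}\,\frac{\partial q}{\partial x^\mu}\right] - \frac{\partial}{\partial x^\nu}\left[\frac{\partial\mathcal L}{\partial(\partial_\nu q)}\right]\frac{\partial q}{\partial x^\mu}.$$

Using the Lagrange equations we obtain

$$\frac{\partial\mathcal L}{\partial x^\mu} = \frac{\partial}{\partial x^\nu}\left[\frac{\partial\mathcal L}{\partial(\partial_\nu q)}\,\frac{\partial q}{\partial x^\mu}\right].$$

Then we rewrite the left-hand side as

$$\frac{\partial\mathcal L}{\partial x^\mu} = g^{\nu}{}_{\mu}\,\frac{\partial\mathcal L}{\partial x^\nu}$$

and obtain

$$\frac{\partial}{\partial x^\nu}\left\{\frac{\partial\mathcal L}{\partial(\partial_\nu q)}\,\frac{\partial q}{\partial x^\mu} - g^{\nu}{}_{\mu}\,\mathcal L\right\} = 0,$$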
which is just Eq. (5). Analogously, the total angular momentum of the field is conserved if the action is invariant under rotations of 4-space. In other words, the homogeneity of the space-time implies the law of conservation of energy and momentum, together with concrete expressions for these quantities. Similarly, we could say that the isotropy of 4-space implies the law of conservation of angular momentum, in its relativistic generalization. This is more cumbersome, so we will take it as a correct statement, which, nonetheless, looks exactly the same as in classical mechanics.

So, this portion of the Nöther theorem is proved. We'll use it further, i.e. we now know how expressions for the energy-momentum can be obtained. The system is defined in classical mechanics if the Lagrangian is known, and in the classical theory of fields if the Lagrangian density is known. Knowing the Lagrangian density, we can construct the tensor $T^{\mu\nu}$, which is called the energy-momentum tensor (4), via simple operations of addition and differentiation. Then it turns out that this tensor obeys the equation of continuity (5), which implies conservation of the energy-momentum 4-vector (6). So, the procedure for obtaining the energy-momentum has been automated. And if the space and time are uniform, the statement that this object is conserved is then also automated. This is a very useful thing, and we have proved it directly. The next episode is again associated with the Nöther theorem. Let's consider another useful statement, which concerns a transformation of the field coordinates.

3.2.3. Gauge transformations of the first kind and conservation of charge

We'll prove a particular case of invariance related to the complex scalar field $\varphi(x)$. So, let's consider the following option. Let's assume that $\mathcal L$ depends on $q_1 = \varphi(x)$, $q_2 = \varphi^*(x)$ and also on $\partial_\mu\varphi$, $\partial_\mu\varphi^*$ in such a way that $\mathcal L(q, \partial_\mu q)$ does not change under the transformation of the fields

$$\varphi(x) \to e^{i\alpha}\varphi(x), \qquad \varphi^*(x) \to e^{-i\alpha}\varphi^*(x),$$

where $\alpha$ is a real number. Certainly, the action does not change under this transformation either. Therefore, $\delta S = 0$ at

$$\delta x^\mu = 0, \qquad \delta\varphi(x) = i\varepsilon\,\varphi(x), \qquad \delta\varphi^*(x) = -i\varepsilon\,\varphi^*(x),$$

where $\varepsilon = \delta\alpha \to 0$. As a consequence,

$$\delta\Theta^\mu = \left[-\frac{\partial\mathcal L}{\partial(\partial_\mu\varphi)}\,\varphi + \frac{\partial\mathcal L}{\partial(\partial_\mu\varphi^*)}\,\varphi^*\right] i\varepsilon,$$

and it follows from the Nöther theorem that

$$\frac{\partial\,\delta\Theta^\mu}{\partial x^\mu} = 0.$$

Let us prove it by a direct calculation, taking into account that $\delta\mathcal L = 0$ under the above variations, i.e.

$$0 = \delta\mathcal L = \sum_i\left[\frac{\partial\mathcal L}{\partial q_i}\,\delta q_i + \frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\,\delta(\partial_\mu q_i)\right] = \sum_i\left[\frac{\partial\mathcal L}{\partial q_i} - \frac{\partial}{\partial x^\mu}\frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\right]\delta q_i + \sum_i\frac{\partial}{\partial x^\mu}\left[\frac{\partial\mathcal L}{\partial(\partial_\mu q_i)}\,\delta q_i\right].$$

The first square brackets on the right-hand side equal zero, $[\ldots] = 0$, due to the equations of motion; then the second term gives the needed result. Thus, if we introduce the 4-vector

$$j^\mu = -i\left(\frac{\partial\mathcal L}{\partial(\partial_\mu\varphi)}\,\varphi - \frac{\partial\mathcal L}{\partial(\partial_\mu\varphi^*)}\,\varphi^*\right), \tag{3.7}$$

then

$$\frac{\partial j^\mu}{\partial x^\mu} = 0 \tag{3.8}$$

and

$$\int j^0\,d^3r = Q = \text{const}. \tag{3.9}$$
The quantity $Q$ is a charge, and not necessarily an electric one. It may be, for example, the baryon charge, whose conservation has been tested experimentally even more stringently than that of the electric charge. So, we've proved the second part of the Nöther theorem in a particular case. You can see that the proof of the Nöther theorem, even of its parts, is cumbersome.
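The continuity equation (3.8) can be checked numerically for a concrete field. The sketch below rests on assumptions that are not part of the text above: it takes the standard Lagrangian density of a free complex scalar field, $\mathcal L = \partial_\mu\varphi^*\,\partial^\mu\varphi - m^2\varphi^*\varphi$, for which Eq. (3.7) gives $j^\mu = i(\varphi^*\partial^\mu\varphi - \varphi\,\partial^\mu\varphi^*)$, works in 1+1 dimensions with $\hbar = c = 1$, and uses a superposition of two plane waves with $\varepsilon = \sqrt{p^2 + m^2}$ as the solution of the field equation. The momenta and amplitudes in `MODES` are arbitrary illustrative values.

```python
import cmath
import math

M = 1.0  # field mass (assumed example value)
MODES = [(0.7, 1.0 + 0.5j), (-1.3, 0.4 - 0.2j)]  # (momentum p, complex amplitude c)

def energy(p):
    return math.sqrt(p * p + M * M)

def phi(t, x):
    return sum(c * cmath.exp(-1j * (energy(p) * t - p * x)) for p, c in MODES)

def dphi_dt(t, x):
    return sum(-1j * energy(p) * c * cmath.exp(-1j * (energy(p) * t - p * x)) for p, c in MODES)

def dphi_dx(t, x):
    return sum(1j * p * c * cmath.exp(-1j * (energy(p) * t - p * x)) for p, c in MODES)

def j0(t, x):
    # j^0 = i (phi* d_t phi - phi d_t phi*)
    f, ft = phi(t, x), dphi_dt(t, x)
    return (1j * (f.conjugate() * ft - f * ft.conjugate())).real

def j1(t, x):
    # j^1 = -i (phi* d_x phi - phi d_x phi*), since the contravariant derivative is -d/dx
    f, fx = phi(t, x), dphi_dx(t, x)
    return (-1j * (f.conjugate() * fx - f * fx.conjugate())).real

def divergence(t, x, h=1e-4):
    """Central finite-difference estimate of d j^0/dt + d j^1/dx."""
    dj0_dt = (j0(t + h, x) - j0(t - h, x)) / (2 * h)
    dj1_dx = (j1(t, x + h) - j1(t, x - h)) / (2 * h)
    return dj0_dt + dj1_dx

print(j0(0.3, -0.2), j1(0.3, -0.2), divergence(0.3, -0.2))
```

The charge density and the current are of order unity and vary in space and time because of the interference of the two modes, while their 4-divergence vanishes within the accuracy of the finite differences.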
§4. Real scalar field Φ(x) = Φ∗(x)

Let's start from the field that is most similar to the electromagnetic one. This is a real scalar field. To emphasize its similarity to the electromagnetic field, let's even use the capital letter $\Phi(x)$, in order to associate it with the capital letter $A_\mu(x)$ of the electromagnetic field. We'll assume that this field is real, $\Phi(x) = \Phi^*(x)$. So, we are interested, first of all, in the corresponding Lagrangian (the Lagrangian density), which must depend on the field $\Phi(x)$ and its derivative $\partial^\mu\Phi(x)$. Actually a Lagrangian, from the physicist's point of view, is just an instrument; the main thing is the equation of motion. So, if you have an idea of the required equations of motion, then you construct the Lagrangian for them. If there's no such idea, then we act in the opposite manner, i.e. we invent various Lagrangians from different considerations, and then look at what field equations arise.

We found that in the case of the electromagnetic field, the equation of motion is the wave equation $\partial^\mu\partial_\mu A(x) = 0$. If we introduce the operator of 4-momentum $\hat p^\mu = i\partial^\mu$, then this equation can be written down as follows: $\hat p_\mu\hat p^\mu A(x) = 0$. From it we got the statement that for plane waves the 4-vector of momentum is $k^\mu = (\omega, \mathbf k)$ and $k^\mu k_\mu = 0$, i.e. the dispersion law reads $\omega_k = |\mathbf k|$. What is our goal now? We want to obtain an equation from which the 4-momentum $p^\mu = (\varepsilon, \mathbf p)$ of a plane wave obeys the relation $p^\mu p_\mu = m^2$, i.e. the dispersion law reads $\varepsilon_p = \sqrt{\mathbf p^2 + m^2}$, where $m$ is the mass of a quantum of this field. This can be done with the Lagrangian density

$$\mathcal L(\Phi, \partial_\mu\Phi) = \frac{1}{2}\left(\partial_\mu\Phi\,\partial^\mu\Phi - m^2\Phi^2\right),$$

for which the Lagrange equation

$$\frac{\partial}{\partial x^\mu}\frac{\partial\mathcal L}{\partial(\partial_\mu\Phi)} - \frac{\partial\mathcal L}{\partial\Phi} = \partial_\mu\partial^\mu\Phi + m^2\Phi = 0$$

reads

$$\left(\hat p_\mu\hat p^\mu - m^2\right)\Phi(x) = 0.$$
This equation is called the Klein-Fock-Gordon (KFG) equation, although Vladimir Alexandrovich Fock is not mentioned in many textbooks⁴. The next step is to act as we did in the case of the electromagnetic field, with the following difference: the potentials of the electromagnetic field are related to well understood quantities. Indeed, the corresponding derivatives of the potentials are the strengths of the electric and magnetic fields, which can be directly measured with classical instruments, whereas the potential of the real scalar field has no such classical, measurable analog. Only when we quantize it will the quanta of this field be interpreted as particles with nonzero mass. And their coordinates and momenta can be measured with some accuracy. The main thing we managed to do was the transformation of the electromagnetic field into a set of oscillators. Let's do the same here, i.e. let's try to turn this field into a set of oscillators. We performed this procedure for the electromagnetic field, but the picture is a bit different here: the particles have masses, and the dispersion law is different from that for the electromagnetic field. So, it may be useful to repeat this procedure in every detail. Later on, when we proceed to the complex field with two field coordinates and to the spinor field with 8 field coordinates, we'll do this more quickly, because we'll have already grasped the main ideas. To start with this procedure, we likewise need to represent $\Phi(x)$ as a sum over plane waves:

$$\Phi(x) = \sum_{\mathbf p} N_{\mathbf p}\left[a_{\mathbf p}(t)\,e^{i\mathbf p\mathbf r} + a^*_{\mathbf p}(t)\,e^{-i\mathbf p\mathbf r}\right].$$
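As a quick numerical check of the dispersion law stated above, the sketch below takes a single real mode of such an expansion in 1+1 dimensions, $\Phi = a\,e^{i(px - \varepsilon t)} + \text{c.c.}$ with a real amplitude $a$, and evaluates the left-hand side of the equation $\partial_\mu\partial^\mu\Phi + m^2\Phi = 0$ by finite differences. The harmonic time dependence $e^{-i\varepsilon t}$ of the amplitude, the numerical values of $m$ and $p$ and the step `h` are assumptions made for this illustration.

```python
import math

def field(t, x, p, eps, amplitude=1.0):
    """One real mode a*exp(i(px - eps t)) + c.c. with real amplitude a (1+1 dimensions)."""
    return 2.0 * amplitude * math.cos(p * x - eps * t)

def kfg_residual(t, x, p, eps, m, h=1e-3):
    """Finite-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) Phi at the point (t, x)."""
    f0 = field(t, x, p, eps)
    d2t = (field(t + h, x, p, eps) - 2 * f0 + field(t - h, x, p, eps)) / h**2
    d2x = (field(t, x + h, p, eps) - 2 * f0 + field(t, x - h, p, eps)) / h**2
    return d2t - d2x + m**2 * f0

m, p = 1.0, 0.8
on_shell = kfg_residual(0.4, 0.3, p, math.sqrt(p**2 + m**2), m)   # eps^2 = p^2 + m^2
off_shell = kfg_residual(0.4, 0.3, p, 1.0, m)                     # eps^2 != p^2 + m^2
print(on_shell, off_shell)
```

With $\varepsilon = \sqrt{p^2 + m^2}$ the residual is zero up to the finite-difference error, whereas for another value of $\varepsilon$ it is of order unity, so the plane wave solves the equation only on the stated dispersion law.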
From equation of motion (ˆ pµ pˆµ − m2 ) Φ(x) = 0 we obtain an equation for the Fourier amplitudes a ¨p (t) + (p2 + m2 ) ap (t) = 0, from which the law of dispersion ε2 (p) = p2 + m2 and the dependence of the amplitudes 4
A personal remark. V.A. Fock wrote and published his work earlier than Gordon did, and that work was much more substantial than the one by Klein. That was a large work concerning a certain fivedimensional generalization of usual space-time, and this equation was a small piece in this work. The work was so complicated that most scientists somehow missed it and only later on, the tribute was paid to this remarkable scientist, including, for example, the Quantum Electrodynamics by Berestetskii, Lifshitz and Pitaevskii [1]. V.A. Fock was the head of the Chair of Theoretical Physics at the former Leningrad State University (now known as St. Petersburg State University). I was a student there, and when we started the course of quantum mechanics, the first chapters were delivered by academician Fock. It was an interesting sight. While his lectures was unemotional and somewhat dry, the personal interaction with that person was unforgettable. Immediately after a bell for a break or the end of the class, he was surrounded by students. And I remember one episode that made a strong impression on me. An overachieving student, Victor K., who had already read the beginning of quantum mechanics and its interpretation by de Broglie, began to attack Fock telling him: “You say one thing, while de Broglie is reasoning differently in his book.” Fock listened for some time and then said: “De Broglie is wrong.” That turned out to be the absolute truth; de Broglie was a terribly ignorant man. Moreover, when he wrote a really excellent work, in which he suggested considering particles as having wave properties, physicists in the quantum capital of the world, in G¨ottingen, did not pay attention to this work, for they already knew de Broglie as a real muddle-head. 
Earlier he had published a few bad articles on optics, and Max Born, who had been the head of G¨ottingen theorists and a specialist in optics, — you are probably familiar with the thick book Optics by Born and Wolf — knew that de Broglie was really a muddle-head and did not pay attention to this work. Einstein was the first to note this work and initiated the Schr¨odinger research.
38 ap (t) on time ap (t) ∝ e−iεp t ,
a∗p (t) ∝ eiεp t , εp = +
√ p 2 + m2
are derived. Let’s choose the coefficient Np in such a form Np = √
1 2εp V
that the expression for the total energy of field would look like just a set of oscillators, each with its own energy (which depends on the momentum p): ∑ E= εp a∗p ap . p
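So, to start, we need the energy density (see Eq. (3.4)):
$$T^{00} = \frac{\partial \mathcal L}{\partial(\partial_0 q)}\,\partial^0 q - g^{00}\mathcal L = \frac{1}{2}\left[\dot\Phi^2 + (\nabla\Phi)^2 + m^2\Phi^2\right].$$
It is very nice that the energy density turns out to be positive definite, $T^{00} \geq 0$. Then we write the total energy as an integral over the whole volume:
$$E = \int T^{00}\, d^3r = \frac{1}{2}\int d^3r \sum_{\mathbf p, \mathbf p'} N_{\mathbf p} N_{\mathbf p'} \Big\{ \left(a_{\mathbf p} e^{i\mathbf{pr}} - a^*_{\mathbf p} e^{-i\mathbf{pr}}\right)\left(a_{\mathbf p'} e^{i\mathbf p'\mathbf r} - a^*_{\mathbf p'} e^{-i\mathbf p'\mathbf r}\right)\left(-\varepsilon_{\mathbf p}\varepsilon_{\mathbf p'} - \mathbf p\mathbf p'\right) + m^2 \left(a_{\mathbf p} e^{i\mathbf{pr}} + a^*_{\mathbf p} e^{-i\mathbf{pr}}\right)\left(a_{\mathbf p'} e^{i\mathbf p'\mathbf r} + a^*_{\mathbf p'} e^{-i\mathbf p'\mathbf r}\right)\Big\}.$$
Now let's perform the integration, recalling the formula we have already met:
$$\int e^{i\mathbf{pr}}\, e^{-i\mathbf p'\mathbf r}\, d^3r = V\delta_{\mathbf p, \mathbf p'}.$$
As a result, we obtain
$$E = \sum_{\mathbf p} N_{\mathbf p}^2\, 2\varepsilon_{\mathbf p}^2 V\, a^*_{\mathbf p} a_{\mathbf p} = \sum_{\mathbf p} \varepsilon_{\mathbf p}\, a^*_{\mathbf p} a_{\mathbf p}.$$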
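The bookkeeping in this calculation is easy to get wrong, so a numerical cross-check may be useful. The following is a minimal sketch that is not part of the original lectures. It assumes a one-dimensional periodic box of length `box` (so that $V$ is replaced by the box length and $p = 2\pi n/\text{box}$), evaluates the field at $t = 0$, and compares the grid integral of $T^{00}$ with $\sum \varepsilon_{\mathbf p} a^*_{\mathbf p} a_{\mathbf p}$. The function names and the sample amplitudes are illustrative assumptions.

```python
import cmath
import math


def mode_energy(n, m, box):
    """Dispersion law eps_p = sqrt(p^2 + m^2) for p = 2*pi*n/box."""
    p = 2.0 * math.pi * n / box
    return math.sqrt(p * p + m * m)


def field_energy(amps, m, box, n_grid=64):
    """Integrate T^00 = (Phi_dot^2 + Phi_x^2 + m^2 Phi^2)/2 over the box at t = 0.

    amps maps the integer mode number n to the complex amplitude a_p(0).
    The integrand is a trigonometric polynomial, so a uniform grid sum is
    exact as long as n_grid exceeds twice the largest |n|.
    """
    dx = box / n_grid
    total = 0.0
    for j in range(n_grid):
        x = j * dx
        phi = phi_dot = phi_x = 0.0
        for n, a in amps.items():
            p = 2.0 * math.pi * n / box
            eps = mode_energy(n, m, box)
            norm = 1.0 / math.sqrt(2.0 * eps * box)  # N_p with V -> box
            wave = a * cmath.exp(1j * p * x)         # a_p e^{ipx}
            phi += norm * 2.0 * wave.real            # a e^{ipx} + c.c.
            phi_dot += norm * 2.0 * eps * wave.imag  # -i*eps*(a e^{ipx} - c.c.)
            phi_x -= norm * 2.0 * p * wave.imag      # i*p*(a e^{ipx} - c.c.)
        total += 0.5 * (phi_dot ** 2 + phi_x ** 2 + m * m * phi ** 2) * dx
    return total


def oscillator_energy(amps, m, box):
    """Sum over modes of eps_p * |a_p|^2."""
    return sum(mode_energy(n, m, box) * abs(a) ** 2 for n, a in amps.items())


if __name__ == "__main__":
    sample = {-2: 0.3 + 0.1j, 1: 1.0 - 0.5j, 3: -0.2j}
    print(field_energy(sample, 1.0, 2.0 * math.pi))
    print(oscillator_energy(sample, 1.0, 2.0 * math.pi))
```

The two printed numbers should agree to rounding error; in particular, the cross terms between the modes $\mathbf p$ and $-\mathbf p$ cancel between the kinetic, gradient and mass contributions.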
We got what we wanted, i. e. the energy as a sum of the contributions of individual oscillators. The idea is very simple and the algebra is straightforward; still, some care is required. I ask you to prove the following: take the total field momentum
$$P^n = \int T^{n0}\, d^3r, \quad n = x, y, z,$$
and perform a calculation similar to the previous one. Make sure that this expression reduces to a sum of the same products, but with the momentum of the individual oscillator instead of its energy:
$$\mathbf P = \sum_{\mathbf p} \mathbf p\, a^*_{\mathbf p} a_{\mathbf p}.$$
This exercise will help you scrutinize the previous calculations.

With this done, let's proceed to quantization of the field. What do we usually do in this case? The classical coordinate and momentum of the oscillator labelled by $\mathbf p$ turn into time-independent operators:
$$a_{\mathbf p}(t) \to \hat a_{\mathbf p}, \quad a^*_{\mathbf p}(t) \to \hat a^+_{\mathbf p}.$$
Then the field coordinate turns into the field operator, which depends only on the spatial coordinates. We can then repeat everything we did, and the energy of the field turns into the Hamiltonian operator in the Schrödinger representation:
$$E \to \hat H = \sum_{\mathbf p} \frac{1}{2}\,\varepsilon_{\mathbf p}\left(\hat a^+_{\mathbf p}\hat a_{\mathbf p} + \hat a_{\mathbf p}\hat a^+_{\mathbf p}\right).$$
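Next is the question of the behavior of the operators $\hat a_{\mathbf p}$ and $\hat a^+_{\mathbf p}$, i. e. of their algebra. If, without much thought, we continue as for the electromagnetic field, we have to require that these operators behave as expected for an ordinary oscillator. Operators belonging to the same oscillator (the same $\mathbf p$) obey the usual oscillator commutation relation, while operators belonging to different oscillators, with $\mathbf p$ and $\mathbf p'$, must behave as independent variables, i. e. they must commute. As a result, we have the quantization rules
$$[\hat a_{\mathbf p}, \hat a_{\mathbf p'}] = \left[\hat a^+_{\mathbf p}, \hat a^+_{\mathbf p'}\right] = 0, \quad \left[\hat a_{\mathbf p}, \hat a^+_{\mathbf p'}\right] = \delta_{\mathbf p \mathbf p'},$$
which lead to the result
$$\hat H = \sum_{\mathbf p} \varepsilon_{\mathbf p}\left(\hat n_{\mathbf p} + \frac{1}{2}\right), \quad \hat n_{\mathbf p} = \hat a^+_{\mathbf p}\hat a_{\mathbf p}.$$
If we count the energy from the infinite sum
$$\frac{1}{2}\sum_{\mathbf p} \varepsilon_{\mathbf p},$$
we obtain
$$\hat H = \sum_{\mathbf p} \varepsilon_{\mathbf p}\, \hat n_{\mathbf p}$$
and, analogously,
$$\hat{\mathbf P} = \sum_{\mathbf p} \mathbf p\, \hat n_{\mathbf p}.$$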
It is seen from here that $\hat n_{\mathbf p}$ has the meaning of the operator of the number of quanta with energy $\varepsilon_{\mathbf p}$ and momentum $\mathbf p$. Its eigenvalues are $n = 0, 1, 2, \ldots$ That is, the field energy naturally turns out to be the sum of the energies of the individual oscillators, and for a given 4-momentum $p^\mu$ the number of quanta can be 0, 1, 2, etc. This is an important point I would like to emphasize: in a given state, i. e. for a given plane wave, there can be any number of quanta, which corresponds to Bose-Einstein statistics.

Since we'll later meet the Dirac field, where Fermi-Dirac statistics applies, a question arises. We have acted in a natural way, as in quantum mechanics with an oscillator and as in the case of the electromagnetic field. But what is the basis of this naturalness? Nothing, just an analogy. Maybe it would be natural to check a different approach, which rejects Bose-Einstein statistics and adopts Fermi-Dirac statistics. In this case we would have not a commutator but an anticommutator:
$$\left\{\hat a_{\mathbf p}, \hat a^+_{\mathbf p'}\right\} \equiv \hat a_{\mathbf p}\hat a^+_{\mathbf p'} + \hat a^+_{\mathbf p'}\hat a_{\mathbf p} = \delta_{\mathbf p \mathbf p'}.$$
It leads to the wrong conclusion that the operator $\hat H$ reduces to a sum that does not depend on the number of quanta at all!
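The difference between the two quantization rules can be seen on a single mode with explicit matrices. The sketch below is an illustration under stated assumptions, not the author's construction: the Bose lowering operator is truncated to a finite number of levels (so the oscillator algebra fails only at the top level), the Fermi lowering operator is the standard 2 × 2 matrix, and all function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def symmetrized_energy(lowering, eps):
    """Matrix of 0.5 * eps * (a^+ a + a a^+) for a real lowering matrix a."""
    raising = transpose(lowering)  # real matrix: Hermitian conjugate = transpose
    left = mat_mul(raising, lowering)
    right = mat_mul(lowering, raising)
    n = len(lowering)
    return [[0.5 * eps * (left[i][j] + right[i][j]) for j in range(n)]
            for i in range(n)]


def bose_lowering(levels):
    """Truncated oscillator: a|n> = sqrt(n)|n-1> for n = 1 .. levels-1."""
    a = [[0.0] * levels for _ in range(levels)]
    for n in range(1, levels):
        a[n - 1][n] = math.sqrt(n)
    return a


# Two-level system with {a, a^+} = 1: occupation numbers 0 and 1 only.
FERMI_LOWERING = [[0.0, 1.0], [0.0, 0.0]]

if __name__ == "__main__":
    eps = 2.0
    bose = symmetrized_energy(bose_lowering(5), eps)
    fermi = symmetrized_energy(FERMI_LOWERING, eps)
    print([bose[n][n] for n in range(4)])  # eps*(n + 1/2) below the cutoff
    print([fermi[n][n] for n in range(2)])  # the same value for n = 0 and n = 1
```

With commutators the diagonal entries grow as $\varepsilon(n + 1/2)$, while with anticommutators the symmetrized expression is $\varepsilon/2$ times the unit matrix, independent of the occupation number.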
This is the point at which an assertion arises that is usually formulated as the Pauli theorem on the connection between spin and statistics. A real scalar field has spinless quanta, and this field must be quantized according to Bose-Einstein statistics. The same must be true for all particles with integer spin: 0, 1, 2, etc. For particles with half-integer spin, 1/2, 3/2, etc., the situation is the opposite: if we take Bose-Einstein statistics, the result makes no sense, and a reasonable result arises only with Fermi-Dirac statistics. So, although we acted by analogy, we have shown that our actions were reasonable.

This could actually finish the story of the real scalar field. The last thing we can do is to follow what we did for the electromagnetic field. All this is very simple, clear and familiar, but in fact we need to get used to another approach, because this description is not manifestly relativistically covariant. The operators here depend solely on the spatial coordinates and not on time. From the relativistic point of view, this is unsatisfactory. Therefore, for more clarity in the future, it is more convenient to turn to the Heisenberg representation. In the same way as for the electromagnetic field, this can be done as follows:
$$\hat\Phi(x) = \sum_{\mathbf p}\left(\hat a_{\mathbf p}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}} + \hat a^+_{\mathbf p}\,\frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}}\right), \quad px = \varepsilon_{\mathbf p} t - \mathbf{pr}, \quad \varepsilon_{\mathbf p} = +\sqrt{\mathbf p^2 + m^2}.$$
Let us note that the wave function
$$\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}}$$
corresponds to normalization to one particle in the whole volume $V$.
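§5. Complex scalar field $\varphi(x) \neq \varphi^*(x)$

The Lagrangian density, which is now a function of $\varphi$, $\varphi^*$, $\partial_\mu\varphi$, $\partial_\mu\varphi^*$, must still be relativistically covariant and real. Similarly to what was done in the previous section, we use the Lagrangian density
$$\mathcal L(\varphi, \varphi^*, \partial_\mu\varphi, \partial_\mu\varphi^*) = \partial_\mu\varphi^*\, \partial^\mu\varphi - m^2\varphi^*\varphi,$$
for which the Lagrange equations coincide with the KFG equations for the functions $\varphi(x)$ and $\varphi^*(x)$:
$$\frac{\partial}{\partial x^\mu}\frac{\partial \mathcal L}{\partial(\partial_\mu\varphi^*)} - \frac{\partial \mathcal L}{\partial\varphi^*} = \partial_\mu\partial^\mu\varphi + m^2\varphi = 0,$$
$$\frac{\partial}{\partial x^\mu}\frac{\partial \mathcal L}{\partial(\partial_\mu\varphi)} - \frac{\partial \mathcal L}{\partial\varphi} = \partial_\mu\partial^\mu\varphi^* + m^2\varphi^* = 0.$$
Then we find
$$T^{\mu\nu} = \partial^\mu\varphi^*\, \partial^\nu\varphi + \partial^\nu\varphi^*\, \partial^\mu\varphi - g^{\mu\nu}\mathcal L,$$
the energy density
$$T^{00} = \dot\varphi^*\dot\varphi + (\nabla\varphi^*)(\nabla\varphi) + m^2\varphi^*\varphi \geq 0,$$
the total field energy $E = \int T^{00}\, d^3r$, the total field momentum $P^n = \int T^{n0}\, d^3r$, the current
$$j_\mu = i\left[\varphi^*\partial_\mu\varphi - (\partial_\mu\varphi^*)\varphi\right],$$
and the total field charge
$$Q = \int j_0\, d^3r, \quad j_0 = i\left(\varphi^*\dot\varphi - \dot\varphi^*\varphi\right).$$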
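The field $\varphi(x)$ must be expanded in plane waves. The most important differences arise in this expansion, so instead of passing this point quickly, let's scrutinize it. Now we have a certain constant $N_{\mathbf p}$ and plane waves of two sets of oscillators:
$$\varphi(x) = \sum_{\mathbf p} N_{\mathbf p}\left(a_{\mathbf p}(t)\, e^{i\mathbf{pr}} + \tilde a_{\mathbf p}(t)\, e^{-i\mathbf{pr}}\right), \quad N_{\mathbf p} = \frac{1}{\sqrt{2\varepsilon_{\mathbf p} V}},$$
because $\tilde a_{\mathbf p}(t) \neq a^*_{\mathbf p}(t)$ though, as earlier, $a_{\mathbf p}(t) \propto e^{-i\varepsilon_{\mathbf p} t}$ and $\tilde a_{\mathbf p}(t) \propto e^{+i\varepsilon_{\mathbf p} t}$. Then, in addition to the expansion for $\varphi(x)$, we must have an expansion for $\varphi^*(x)$. Let's write it down:
$$\varphi^*(x) = \sum_{\mathbf p} N_{\mathbf p}\left(a^*_{\mathbf p}(t)\, e^{-i\mathbf{pr}} + \tilde a^*_{\mathbf p}(t)\, e^{i\mathbf{pr}}\right).$$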
So, we have two field coordinates, and it turns out that we have two sets of oscillators. One set is described by the generalized coordinate $a_{\mathbf p}(t)$ and the generalized momentum $a^*_{\mathbf p}(t)$, and the other set is described by the coordinate $\tilde a_{\mathbf p}(t)$ and the momentum $\tilde a^*_{\mathbf p}(t)$. Generally speaking, these are different oscillators. The difference will be found when we see how these quantities determine the energy, momentum and charge, which are related to measurable quantities.

In the end, we will turn this function into an operator. If we recall our interpretation, then $\hat a^+_{\mathbf p}$ will be the creation operator and $\hat a_{\mathbf p}$ the annihilation operator of quanta of a certain kind. Let's call them quanta of sort A. But then we must call the operator obtained from $\tilde a_{\mathbf p}(t)$ the creation operator $\hat b^+_{\mathbf p}$, and that obtained from $\tilde a^*_{\mathbf p}(t)$ the annihilation operator $\hat b_{\mathbf p}$, of quanta of sort B. Thus,
$$a_{\mathbf p}(t) \to \hat a_{\mathbf p}, \quad \tilde a_{\mathbf p}(t) \to \hat b^+_{\mathbf p}, \quad a^*_{\mathbf p}(t) \to \hat a^+_{\mathbf p}, \quad \tilde a^*_{\mathbf p}(t) \to \hat b_{\mathbf p}.$$
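As a result, the field operators in the Heisenberg representation have the form
$$\hat\varphi(x) = \sum_{\mathbf p}\left(\hat a_{\mathbf p}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}} + \hat b^+_{\mathbf p}\,\frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}}\right), \tag{5.1a}$$
$$\hat\varphi^+(x) = \sum_{\mathbf p}\left(\hat a^+_{\mathbf p}\,\frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}} + \hat b_{\mathbf p}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}}\right), \tag{5.1b}$$
and the operators of the energy, momentum and charge are
$$\hat H = \sum_{\mathbf p} \varepsilon_{\mathbf p}\left(\hat a^+_{\mathbf p}\hat a_{\mathbf p} + \hat b_{\mathbf p}\hat b^+_{\mathbf p}\right), \quad \hat{\mathbf P} = \sum_{\mathbf p} \mathbf p\left(\hat a^+_{\mathbf p}\hat a_{\mathbf p} + \hat b_{\mathbf p}\hat b^+_{\mathbf p}\right), \quad \hat Q = \sum_{\mathbf p}\left(\hat a^+_{\mathbf p}\hat a_{\mathbf p} - \hat b_{\mathbf p}\hat b^+_{\mathbf p}\right).$$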
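Here we also use the quantization rules corresponding to Bose-Einstein statistics:
$$\left[\hat a_{\mathbf p}, \hat a^+_{\mathbf p'}\right] = \delta_{\mathbf p \mathbf p'}, \quad \left[\hat b_{\mathbf p}, \hat b^+_{\mathbf p'}\right] = \delta_{\mathbf p \mathbf p'},$$
$$\left[\hat a_{\mathbf p}, \hat b_{\mathbf p'}\right] = \left[\hat a_{\mathbf p}, \hat b^+_{\mathbf p'}\right] = \left[\hat a^+_{\mathbf p}, \hat b_{\mathbf p'}\right] = \left[\hat a^+_{\mathbf p}, \hat b^+_{\mathbf p'}\right] = 0,$$
$$\left[\hat a_{\mathbf p}, \hat a_{\mathbf p'}\right] = \left[\hat b_{\mathbf p}, \hat b_{\mathbf p'}\right] = \left[\hat a^+_{\mathbf p}, \hat a^+_{\mathbf p'}\right] = \left[\hat b^+_{\mathbf p}, \hat b^+_{\mathbf p'}\right] = 0.$$
In this case we obtain (omitting the infinite constant)
$$\hat H = \sum_{\mathbf p} \varepsilon_{\mathbf p}\left(\hat n_{\mathbf p} + \hat{\bar n}_{\mathbf p}\right), \quad \hat{\mathbf P} = \sum_{\mathbf p} \mathbf p\left(\hat n_{\mathbf p} + \hat{\bar n}_{\mathbf p}\right), \quad \hat Q = \sum_{\mathbf p}\left(\hat n_{\mathbf p} - \hat{\bar n}_{\mathbf p}\right),$$
where $\hat n_{\mathbf p} = \hat a^+_{\mathbf p}\hat a_{\mathbf p}$ is the operator of the number of quanta of sort A, while $\hat{\bar n}_{\mathbf p} = \hat b^+_{\mathbf p}\hat b_{\mathbf p}$ is the operator of the number of quanta of sort B. It is seen from here that particles of sorts A and B have the same energy $\varepsilon_{\mathbf p}$ and the same momentum $\mathbf p$, but different charges, $(+1)$ and $(-1)$. Note once more that the charge does not necessarily mean an electric charge. It may be, for example, a baryon charge: neutrons and antineutrons are both electrically neutral particles, but they have different baryon charges. The particles of sort B are called antiparticles. In the expansion (5.1a) the wave function
$$\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p} V}}$$
corresponds to normalization to one particle in the whole volume $V$.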
§6. $C$, $P$, and $T$ transformations for the complex scalar field

In this section we consider the discrete symmetries related to the charge (or $C$) transformation, to the spatial (or $P$) inversion and to the time (or $T$) inversion. But before we start, let's recall the basic properties of the continuous proper Lorentz transformations.

The proper Lorentz transformation is defined as a linear transformation of the radius 4-vector, $x \to x' = \Lambda x$, which conserves the square of the vector:
$$x^\mu x_\mu = t^2 - \mathbf r^2 = x'^\mu x'_\mu = (t')^2 - (\mathbf r')^2.$$
In other words, it is a linear transformation
$$x'^\mu = \Lambda^\mu_{\ \nu}\, x^\nu,$$
where the matrix $\Lambda$ satisfies the equation
$$\Lambda^\alpha_{\ \mu}\, \Lambda_\beta^{\ \mu} = \delta^\alpha_\beta$$
and depends continuously on the parameters of the Lorentz group, which are six angles of rotation in six planes: $xy$, $yz$, $zx$, $tx$, $ty$, $tz$. The determinant of this matrix is equal to unity:
$$D \equiv \det\left(\Lambda^\mu_{\ \nu}\right) = +1.$$
In particular, under a rotation through the angle $\omega$ in the $xy$ plane, the radius 4-vector is transformed as
$$t' = t, \quad x' = x\cos\omega + y\sin\omega, \quad y' = -x\sin\omega + y\cos\omega, \quad z' = z, \tag{6.1}$$
and the matrix $\Lambda$ equals
$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\omega & \sin\omega & 0 \\ 0 & -\sin\omega & \cos\omega & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad D = \cos^2\omega + \sin^2\omega = 1.$$
Let us now consider the proper Lorentz transformation along the $x$ axis with velocity $V$:
$$t' = t\cosh\nu - x\sinh\nu, \quad x' = x\cosh\nu - t\sinh\nu, \quad y' = y, \quad z' = z, \tag{6.2}$$
where
$$\cosh\nu = \frac{1}{\sqrt{1 - V^2}}, \quad \sinh\nu = \frac{V}{\sqrt{1 - V^2}},$$
and the rapidity $\nu$ is determined by $\tanh\nu = V$. In this case the matrix $\Lambda$ is equal to
$$\Lambda = \begin{pmatrix} \cosh\nu & -\sinh\nu & 0 & 0 \\ -\sinh\nu & \cosh\nu & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad D = \cosh^2\nu - \sinh^2\nu = 1.$$
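The properties of the boost matrix quoted above (unit determinant, conservation of $t^2 - \mathbf r^2$) can be checked numerically, together with the fact that rapidities simply add when two boosts along the same axis are composed. The following is a small sketch with hypothetical helper names, using units with $c = 1$ as in the text.

```python
import math


def boost_x(rapidity):
    """Matrix of the boost along x acting on the column (t, x, y, z)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(row[k] * vector[k] for k in range(4)) for row in matrix]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def interval(vector):
    """Invariant square t^2 - x^2 - y^2 - z^2."""
    t, x, y, z = vector
    return t * t - x * x - y * y - z * z


def rapidity_from_velocity(velocity):
    """Rapidity nu defined by tanh(nu) = V, for |V| < 1."""
    return math.atanh(velocity)


if __name__ == "__main__":
    nu = rapidity_from_velocity(0.6)
    event = [2.0, 1.0, 3.0, -1.0]
    print(interval(event), interval(apply(boost_x(nu), event)))
    # Composition of two boosts: rapidities add, velocities do not.
    print(math.tanh(rapidity_from_velocity(0.6) + rapidity_from_velocity(0.5)))
```

The additivity of $\nu$ is the hyperbolic counterpart of the additivity of rotation angles in a fixed plane, in line with the replacement $\omega \to \pm i\nu$ discussed in the text.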
This transformation can be viewed as a hyperbolic rotation in the $xt$ plane; therefore, the transformation (6.2) of the radius 4-vector $x^\mu$ can be obtained from the rotation (6.1) in the $xy$ plane by the replacements $y \to \pm it$, $\omega \to \pm i\nu$.

Scalar functions are not changed under the proper Lorentz transformation:
$$\varphi(x) \to \varphi^\Lambda(x) = \varphi\left(\Lambda^{-1}x\right). \tag{6.3a}$$
In other words, the transformed function at the new point coincides with the untransformed function at the old point. This assertion is, in a sense, the definition of a scalar field.

Now let's look separately at the 4-inversion, or complete reflection of coordinates and time, i. e. the transformation in which $t$ and all three coordinates $\mathbf r$ change sign: $x \to x' = -x$. The corresponding determinant is equal to unity, $D = +1$. Therefore, from a purely mathematical point of view, it must act in the same way as a continuous transformation, and that is why this transformation reads just the same as Eq. (6.3a):
$$\varphi(t, \mathbf r) = +\varphi(-t, -\mathbf r). \tag{6.3b}$$
From here one obtains the following transformation for the field operators:
$$\hat a_{\mathbf p} \leftrightarrow \hat b^+_{\mathbf p}, \quad \hat b_{\mathbf p} \leftrightarrow \hat a^+_{\mathbf p}, \tag{6.3c}$$
i. e. this transformation interchanges particles and antiparticles.

The $P$ (parity) transformation is the transformation in which the time does not change, while the spatial coordinates change sign, $\mathbf r \to -\mathbf r$. The sign changes in all three components. This is an important point, because a change of sign in two components can be reduced to a rotation about the third axis. For example, reversing the directions of the $x$ and $y$ axes is equivalent to a rotation through 180 degrees about the $z$ axis. But the reflection of all three axes is a separate operation, which cannot be obtained by any rotation. Moreover, the determinant $D$ of this linear transformation is negative, $D = -1$, unlike the determinant of the proper Lorentz transformations, for which $D = +1$. In the case considered, the function $\varphi(x)$ must remain unchanged up to some phase factor $\eta_P$ of unit modulus, because under such a transformation the probabilities for a scalar particle must not change:
$$\varphi(t, \mathbf r) \to \varphi^P(t, \mathbf r) = \eta_P\, \varphi(t, -\mathbf r), \quad |\eta_P| = 1. \tag{6.4a}$$
Double application of the operation $P$ leads to $\eta_P^2\, \varphi(t, \mathbf r) = \varphi(t, \mathbf r)$, i. e. $\eta_P = \pm 1$. If $\eta_P = +1$, the field is called a scalar field; if $\eta_P = -1$, the field is called a pseudoscalar field. This is a feature inherent to a given field. It is regarded as the internal parity of the quanta of the field, i. e. of the corresponding particles.

This aspect was almost ignored in the discussion of wave functions in quantum mechanics. The reason is that particles do not disappear in quantum mechanics: the Schrödinger equation implies the conservation of the number of particles. If you have a particle and its wave function is normalized at some moment of time, the Schrödinger equation itself ensures that the normalization is preserved and the particle does not disappear. So, if it has some internal parity, plus or minus, that affects nothing: it was a particle in the initial state of some process, and it will be a particle in the final state of the same process, together with the quantum number of its internal parity.

Now we are entering a new area, quantum field theory, whose most striking difference from quantum mechanics is the possibility of the disappearance and appearance of particles. Here this quantum number, the internal parity, becomes of interest. Experiment must decide whether these quantum numbers are conserved or not. Experiment shows that in electromagnetic and strong interactions the parity of the initial state (which includes the internal parities of all initial particles) is indeed conserved, and therefore special attention should be paid to the internal parity.

The transformation of the field operators reads
$$\hat\varphi(t, \mathbf r) \to \hat\varphi^P(t, \mathbf r) = \eta_P \sum_{\mathbf p} \frac{1}{\sqrt{2\varepsilon_{\mathbf p} V}}\left(\hat a_{\mathbf p}\, e^{-i(\varepsilon_{\mathbf p} t + \mathbf{pr})} + \hat b^+_{\mathbf p}\, e^{i(\varepsilon_{\mathbf p} t + \mathbf{pr})}\right).$$
If we change the sign of the summation index $\mathbf p$, then
$$\hat\varphi^P(t, \mathbf r) = \eta_P \sum_{\mathbf p} \frac{1}{\sqrt{2\varepsilon_{\mathbf p} V}}\left(\hat a_{-\mathbf p}\, e^{-ipx} + \hat b^+_{-\mathbf p}\, e^{ipx}\right), \quad px = \varepsilon_{\mathbf p} t - \mathbf{pr},$$
i. e.
$$\hat a_{\mathbf p} \to \eta_P\, \hat a_{-\mathbf p}, \quad \hat b^+_{\mathbf p} \to \eta_P\, \hat b^+_{-\mathbf p}, \tag{6.4b}$$
and similarly
$$\hat a^+_{\mathbf p} \to \eta_P\, \hat a^+_{-\mathbf p}, \quad \hat b_{\mathbf p} \to \eta_P\, \hat b_{-\mathbf p}. \tag{6.4c}$$
Therefore, under a reflection of the spatial axes a particle remains a particle, but its momentum changes sign. From here one can see that the transformation law for the operators of particles and for the operators of antiparticles is exactly the same. Thus, we can draw an important physical conclusion: in the case of a complex scalar field the internal parities of particles and antiparticles are the same.

The $C$ (charge) transformation is defined by the relations
$$\hat a_{\mathbf p} \to \hat b_{\mathbf p}, \quad \hat b_{\mathbf p} \to \hat a_{\mathbf p}, \quad \hat a^+_{\mathbf p} \to \hat b^+_{\mathbf p}, \quad \hat b^+_{\mathbf p} \to \hat a^+_{\mathbf p}, \tag{6.5a}$$
i. e. particles are replaced by antiparticles and vice versa. It results in the following transformation for the field operators:
$$\hat\varphi(x) \to \hat\varphi^C(x) = \hat\varphi^+(x). \tag{6.5b}$$
The $T$ (time) transformation is the reversal of time, $t \to -t$. In quantum mechanics the Schrödinger equation $i\,\partial\Psi/\partial t = \hat H\Psi$ does not change its form if we take the complex conjugate of the wave function simultaneously with the reversal of time, i. e. $t \to -t$ and $\Psi \to \Psi^*$. Therefore, we use the same approach in our case:
$$\hat\varphi(t, \mathbf r) \to \hat\varphi^T(t, \mathbf r) = \eta_T\, \hat\varphi^+(-t, \mathbf r), \tag{6.6a}$$
$$\hat a_{\mathbf p} \to \eta_T\, \hat a^+_{-\mathbf p}, \quad \hat b^+_{\mathbf p} \to \eta_T\, \hat b_{-\mathbf p}. \tag{6.6b}$$
The annihilation operators for particles with momentum $\mathbf p$ are therefore replaced by creation operators for particles with momentum $(-\mathbf p)$. It means that in the matrix elements, time reversal not only changes motion with momentum $\mathbf p$ into motion with momentum $(-\mathbf p)$, but also interchanges the initial and final states.

If we further perform the $C$ and $P$ transformations, then
$$\hat\varphi(t, \mathbf r) \to \hat\varphi^{PCT}(t, \mathbf r) = \eta_P \eta_T\, \hat\varphi(-t, -\mathbf r),$$
i. e. it is, in fact, the 4-inversion $x \to -x$ with determinant $D = +1$. As we have just found out above, nothing changes under the 4-inversion. Therefore, we have obtained the following result:
$$\hat\varphi^{PCT}(t, \mathbf r) = \hat\varphi(t, \mathbf r). \tag{6.7}$$
Moreover, we can also conclude that $\eta_P \eta_T = +1$ and, therefore, $\eta_T = \eta_P = \pm 1$.

The only thing to add is that there is a general theorem stating that invariance with respect to the combined $C$, $P$, and $T$ transformation is a consequence of the general principles of quantum field theory. It is the so-called $CPT$ theorem of Lüders and Pauli. From this point of view, if the usual spatial parity is violated somewhere, this is necessarily accompanied by time parity violation. That was observed experimentally in weak decay processes.
§7. $C$, $P$, and $T$ transformations for the electromagnetic field

For completeness, let us present the almost evident formulas for the $C$, $P$, and $T$ transformations of the electromagnetic field:
$$A^C_\mu(x) = -A_\mu(x),$$
$$A^P_0(t, \mathbf r) = A_0(t, -\mathbf r), \quad \mathbf A^P(t, \mathbf r) = -\mathbf A(t, -\mathbf r),$$
$$A^T_0(t, \mathbf r) = A_0(-t, \mathbf r), \quad \mathbf A^T(t, \mathbf r) = -\mathbf A(-t, \mathbf r).$$
Therefore, the photon has negative $C$-parity. From here it follows, in particular, that the final states in the $\gamma\gamma \to$ hadrons transition have positive $C$-parity. For example, the $\eta_c(2980)$ meson is a bound state of $c\bar c$ quarks produced in the collision of two photons, and it has quantum numbers $J^{PC} = 0^{-+}$. On the contrary, if the final states in $e^+e^- \to$ hadrons annihilation are produced via a one-photon intermediate state, they have negative $C$-parity. For example, the famous $J/\psi(3097)$ meson, which was discovered as a very sharp resonance in $e^+e^-$ collisions, is also a bound state of $c\bar c$ quarks, but with negative $C$-parity: $J^{PC} = 1^{--}$.
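§8. The spinor Dirac field

8.1. Three-dimensional spinors (3-spinors)

Here we recall the known facts about the electron spin and three-dimensional spinors. Let $\hat{\mathbf s}$ be the spin operator. We define the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ by the relation $\hat{\mathbf s} = \frac{1}{2}\boldsymbol\sigma$; then
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Their properties are
$$\sigma_j \sigma_k = \sigma_0\, \delta_{jk} + i\varepsilon_{jkn}\sigma_n, \quad \mathrm{Tr}\, \sigma_j = 0, \quad \mathrm{Tr}\, \sigma_0 = 2,$$
where $\sigma_0$ is the unit matrix. Every $2 \times 2$ matrix $A$ can be presented as
$$A = a_0 \sigma_0 + \mathbf a \boldsymbol\sigma, \quad a_0 = \tfrac{1}{2}\mathrm{Tr}\, A, \quad \mathbf a = \tfrac{1}{2}\mathrm{Tr}\,(A\boldsymbol\sigma).$$
The magnetic moment $\hat{\boldsymbol\mu}_l$ of a charged particle related to its orbital motion is proportional to its orbital angular momentum $\hat{\mathbf l}$ (here and in the Pauli equation below the absolute Gaussian system of units is used):
$$\hat{\boldsymbol\mu}_l = \frac{e\hbar}{2mc}\,\hat{\mathbf l}.$$
The magnetic moment $\hat{\boldsymbol\mu}_s$ of a particle related to its spin degree of freedom is proportional to its spin $\hat{\mathbf s}$, but the corresponding coefficient depends on the sort of particle. In particular, for the electron, proton, and neutron this relation has the form
$$\hat{\boldsymbol\mu}_s = 2\mu_s \hat{\mathbf s} = \mu_s \boldsymbol\sigma,$$
$$\mu_e \approx -1.001\,\mu_B, \quad \mu_B = \frac{|e|\hbar}{2m_e c},$$
$$\mu_p \approx 2.79\,\mu_N, \quad \mu_n \approx -1.91\,\mu_N, \quad \mu_N = \frac{|e|\hbar}{2m_p c}.$$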
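The equation of motion of a particle with spin $s = 1/2$ and charge $e$ in an electromagnetic field reads (W. Pauli, 1927)
$$i\hbar\,\frac{\partial\Psi}{\partial t} = \hat H\Psi, \quad \hat H = \frac{1}{2m}\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2 + e\phi - \hat{\boldsymbol\mu}_s \mathbf B, \tag{8.1}$$
where the last term has the form of the potential energy of a magnetic dipole in an external magnetic field. The wave function $\Psi$ is a two-component spinor
$$\Psi = \begin{pmatrix} \Psi_1(t, \mathbf r) \\ \Psi_2(t, \mathbf r) \end{pmatrix}.$$
The density $\rho(t, \mathbf r)$ and the normalization condition are as follows:
$$\rho(t, \mathbf r) = \Psi^+\Psi \equiv |\Psi_1|^2 + |\Psi_2|^2, \quad \int \rho(t, \mathbf r)\, d^3r = 1.$$

Now we'll review the basic properties of 3-spinors. Let us consider the rotation through the angle $\omega$ about the axis characterized by the unit vector $\mathbf n$. In this case the radius vector is transformed as
$$\mathbf r \to \mathbf r' = R(\boldsymbol\omega)\, \mathbf r, \quad \boldsymbol\omega = \omega \mathbf n,$$
where $R(\boldsymbol\omega)$ is the $3 \times 3$ rotation matrix. The corresponding operator for a spinor wave function can be presented as the $2 \times 2$ matrix
$$U(\boldsymbol\omega) = e^{i\boldsymbol\sigma\mathbf n\, \omega/2}. \tag{8.2}$$
This matrix is unitary:
$$U^+(\boldsymbol\omega) = U^{-1}(\boldsymbol\omega).$$
Therefore, the transformation of a spinor under rotation reads
$$\Psi(t, \mathbf r) \to \Psi'(t, \mathbf r') = U(\boldsymbol\omega)\Psi(t, \mathbf r) = \left[\sigma_0 \cos(\omega/2) + i\boldsymbol\sigma\mathbf n \sin(\omega/2)\right]\Psi(t, \mathbf r), \tag{8.3}$$
where the state $\Psi'$ corresponds to the spin vector turned through the angle $(-\omega\mathbf n)$ with respect to the spin vector in the state $\Psi$. Let us show that the spin operator transforms under the rotation as a vector, i. e. the transformed spin $\frac{1}{2}\boldsymbol\sigma' = \frac{1}{2}U^+(\boldsymbol\omega)\,\boldsymbol\sigma\, U(\boldsymbol\omega)$ is expressed via the original spin $\frac{1}{2}\boldsymbol\sigma$ as
$$\boldsymbol\sigma' = U^+(\boldsymbol\omega)\,\boldsymbol\sigma\, U(\boldsymbol\omega) = R(\boldsymbol\omega)\,\boldsymbol\sigma. \tag{8.4}$$
Since an arbitrary rotation can be presented as a succession of three rotations (about the $z$-axis, then about the $y$-axis and again about the $z$-axis), it is sufficient to consider the spin transformation with respect to rotations about the $z$- and $y$-axes. Under a rotation through the angle $\omega$ about the $z$-axis, the radius vector is transformed as
$$x' = x\cos\omega + y\sin\omega, \quad y' = -x\sin\omega + y\cos\omega, \quad z' = z,$$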
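and the rotation operator is
$$U(\boldsymbol\omega) \equiv U_z(\omega) = \sigma_0 \cos(\omega/2) + i\sigma_z \sin(\omega/2).$$
Using the properties of the Pauli matrices, we obtain
$$U_z^+(\omega)\,\sigma_x\, U_z(\omega) = \left[\sigma_0\cos(\omega/2) - i\sigma_z\sin(\omega/2)\right]\sigma_x\left[\sigma_0\cos(\omega/2) + i\sigma_z\sin(\omega/2)\right] = \sigma_x\cos\omega + \sigma_y\sin\omega$$
and
$$U_z^+(\omega)\,\sigma_y\, U_z(\omega) = -\sigma_x\sin\omega + \sigma_y\cos\omega, \quad U_z^+(\omega)\,\sigma_z\, U_z(\omega) = \sigma_z,$$
i. e. in this case the spin operator transforms in the same way as the radius vector.

Let us now consider the rotation through the angle $\omega$ about the $y$-axis. In this case
$$x' = x\cos\omega - z\sin\omega, \quad z' = x\sin\omega + z\cos\omega, \quad y' = y,$$
while the spin transformation reads
$$U_y^+(\omega)\,\sigma_x\, U_y(\omega) = \left[\sigma_0\cos(\omega/2) - i\sigma_y\sin(\omega/2)\right]\sigma_x\left[\sigma_0\cos(\omega/2) + i\sigma_y\sin(\omega/2)\right] = \sigma_x\cos\omega - \sigma_z\sin\omega$$
and
$$U_y^+(\omega)\,\sigma_z\, U_y(\omega) = \sigma_x\sin\omega + \sigma_z\cos\omega, \quad U_y^+(\omega)\,\sigma_y\, U_y(\omega) = \sigma_y,$$
i. e. in this case the spin operator also transforms in the same way as the radius vector. As a result, in the general case the spin operator transforms under rotation as a vector.

In particular, the spinor
$$\Psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{8.5a}$$
corresponds to the mean value of the spin vector along the $z$ axis, i. e.
$$\Psi^+\boldsymbol\sigma\Psi = (0, 0, 1), \tag{8.5b}$$
while the spinor
$$\Psi_{\mathbf n} = U_z(-\varphi)\, U_y(-\theta)\,\Psi = \begin{pmatrix} \cos(\theta/2)\, e^{-i\varphi/2} \\ \sin(\theta/2)\, e^{i\varphi/2} \end{pmatrix} \tag{8.6a}$$
corresponds to the mean value of the spin vector along the unit vector
$$\mathbf n = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta) \tag{8.6b}$$
characterized by the polar angle $\theta$ and the azimuthal angle $\varphi$, i. e.
$$\Psi^+_{\mathbf n}\,\boldsymbol\sigma\,\Psi_{\mathbf n} = \mathbf n. \tag{8.6c}$$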
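The rotation formulas of this subsection are easy to verify with explicit $2 \times 2$ matrices. The sketch below uses only the definitions given in the text, $U = \sigma_0\cos(\omega/2) + i\sigma_n\sin(\omega/2)$ for a coordinate axis and the spinor (8.6a); the function names are hypothetical, chosen for illustration.

```python
import cmath
import math

S0 = [[1, 0], [0, 1]]
PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rotation(axis, omega):
    """U = sigma_0 cos(omega/2) + i sigma_axis sin(omega/2)."""
    c, s = math.cos(omega / 2.0), math.sin(omega / 2.0)
    sigma = PAULI[axis]
    return [[c * S0[i][j] + 1j * s * sigma[i][j] for j in range(2)]
            for i in range(2)]


def rotated_sigma(axis, omega, component):
    """U^+ sigma_component U for a rotation about the given axis."""
    u = rotation(axis, omega)
    return mat_mul(dagger(u), mat_mul(PAULI[component], u))


def spin_direction(theta, phi):
    """Mean value of sigma in the spinor (8.6a)."""
    psi = [math.cos(theta / 2.0) * cmath.exp(-0.5j * phi),
           math.sin(theta / 2.0) * cmath.exp(0.5j * phi)]
    return tuple(
        sum(psi[i].conjugate() * PAULI[k][i][j] * psi[j]
            for i in range(2) for j in range(2)).real
        for k in ("x", "y", "z"))


if __name__ == "__main__":
    print(rotated_sigma("z", 0.7, "x"))  # compare with sigma_x cos + sigma_y sin
    print(spin_direction(1.1, 2.3))      # compare with the unit vector (8.6b)
```

A check of this kind also distinguishes $\sigma_y\cos\omega$ from $\sigma_y\sin\omega$ in the transformed $\sigma_y$, a place where a misprint is easy to make.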
Using the four matrices $\sigma_0$ and $\boldsymbol\sigma$ with the above properties,
$$U^+(\boldsymbol\omega)\,\sigma_0\, U(\boldsymbol\omega) = \sigma_0, \quad U^+(\boldsymbol\omega)\,\boldsymbol\sigma\, U(\boldsymbol\omega) = R(\boldsymbol\omega)\,\boldsymbol\sigma, \tag{8.7a}$$
we find that the bilinear form of 3-spinors $\Psi^+\sigma_0\Psi$ is a scalar, while $\Psi^+\boldsymbol\sigma\Psi$ is a vector with respect to rotations:
$$(\Psi^+)'\,\sigma_0\,\Psi' = \Psi^+\sigma_0\Psi, \quad (\Psi^+)'\,\boldsymbol\sigma\,\Psi' = R(\boldsymbol\omega)\,\Psi^+\boldsymbol\sigma\Psi. \tag{8.7b}$$
Under the spatial inversion $\mathbf r' = -\mathbf r$, the spin components (as well as the components of the angular momentum $\mathbf M = \mathbf r \times \mathbf p$) do not change. It means that each spinor component transforms via itself only, i. e.
$$\hat P\,\Psi(t, \mathbf r) = \eta_P\,\Psi(t, -\mathbf r), \tag{8.8}$$
where $|\eta_P| = 1$. Under the double reflection we return to the original coordinate frame. If we define the double reflection as the identity transformation (or the rotation through the angle $\omega = 0$), then $\eta_P^2 = 1$ and $\eta_P = \pm 1$. But we can also define the double reflection as the rotation through the angle $\omega = 2\pi$. In this case the components of the spinor change sign according to Eq. (8.2): $\Psi' = -\Psi$ at $\omega = 2\pi$; therefore, $\eta_P^2 = -1$ and $\eta_P = \pm i$. As a result, the corresponding matrix is $U = \eta_P\sigma_0$, and the transformed operator coincides with the original one:
$$U^+\boldsymbol\sigma U = \boldsymbol\sigma. \tag{8.9}$$
Thus, we come to the conclusion that the spin operator is an axial vector. Let us note that the rotation operator (8.2) does not change under the spatial inversion, because the axis $\mathbf n$ is an axial vector, $\mathbf n \to \mathbf n' = +\mathbf n$, as is the operator $\boldsymbol\sigma$.
8.2. Four-dimensional spinors (4-spinors)

Let us now consider the proper Lorentz transformation (6.2) along the $x$ axis with velocity $V$. We have already mentioned that it can be viewed as a hyperbolic rotation in the $xt$ plane; therefore, this transformation can be obtained from the rotation (6.1) in the $xy$ plane by the replacements $y \to \pm it$, $\omega \to \pm i\nu$. Accordingly, the operator for the spinor transformation can be obtained by the replacement $\omega\mathbf n \to \pm i\nu\mathbf V/V$ in the rotation operator:
$$U(\boldsymbol\omega) \to B_{R,L}(\mathbf V) = e^{\mp\boldsymbol\sigma\mathbf n\,\nu/2} = \sigma_0\cosh(\nu/2) \mp \boldsymbol\sigma\mathbf n\sinh(\nu/2), \quad \mathbf n = \frac{\mathbf V}{V}. \tag{8.10}$$
Here the Hermitian matrices $B_{R,L}(\mathbf V)$ satisfy the relations
$$B^+_{R,L}(\mathbf V) = B_{R,L}(\mathbf V), \quad B^{-1}_{R,L}(\mathbf V) = B_{R,L}(-\mathbf V) = B_{L,R}(\mathbf V).$$
Thus, we have to deal with two types of 4-spinors:
$$\Psi'_R(x') = B_R(\mathbf V)\,\Psi_R(x), \quad \Psi'_L(x') = B_L(\mathbf V)\,\Psi_L(x). \tag{8.11}$$
The behaviour of 4-spinors under rotations is the same as that of 3-spinors. However, under the spatial inversion $\mathbf r \to -\mathbf r$, a 4-spinor of $\Psi_R$ (or $\Psi_L$) type is transformed into a 4-spinor of $\Psi_L$ (or $\Psi_R$) type. Indeed, the unit vector $\mathbf n = \mathbf V/V$ changes sign under inversion; therefore, $B_R(\mathbf V) \to B_L(\mathbf V)$, $B_L(\mathbf V) \to B_R(\mathbf V)$ and
$$\Psi_R(t, \mathbf r) \to \eta_P\,\Psi_L(t, -\mathbf r), \quad \Psi_L(t, \mathbf r) \to \eta_P\,\Psi_R(t, -\mathbf r). \tag{8.12}$$
Below we will use the combinations
$$\varphi(x) = \frac{1}{\sqrt 2}\left(\Psi_R(x) + \Psi_L(x)\right), \quad \chi(x) = \frac{1}{\sqrt 2}\left(\Psi_R(x) - \Psi_L(x)\right), \tag{8.13}$$
which are transformed under rotation according to (8.3) and under inversion as
$$\varphi(t, \mathbf r) \to +\eta_P\,\varphi(t, -\mathbf r), \quad \chi(t, \mathbf r) \to -\eta_P\,\chi(t, -\mathbf r). \tag{8.14}$$
Let us introduce two sets of four matrices:
$$\sigma^\mu_R = (\sigma_0, \boldsymbol\sigma), \quad \sigma^\mu_L = (\sigma_0, -\boldsymbol\sigma). \tag{8.15a}$$
It is not difficult to prove that $\sigma^\mu_{R,L}$ is a 4-vector operator with respect to the boost transformation given by $B_{R,L}(\mathbf V)$:
$$\sigma^\mu_R \to (\sigma^\mu_R)' = B^+_R\,\sigma^\mu_R\, B_R, \quad \sigma^\mu_L \to (\sigma^\mu_L)' = B^+_L\,\sigma^\mu_L\, B_L. \tag{8.15b}$$
For definiteness, we choose $\mathbf V = (V, 0, 0)$ and $\sigma^\mu_R$. Using the properties of the $\sigma$-matrices we find that
$$\sigma'_x = \sigma_x\cosh\nu - \sigma_0\sinh\nu, \quad \sigma'_0 = \sigma_0\cosh\nu - \sigma_x\sinh\nu, \quad \sigma'_y = \sigma_y, \quad \sigma'_z = \sigma_z,$$
in accordance with Eq. (6.2) for the 4-vector $x^\mu$. Similar expressions are also valid for $(\sigma^\mu_L)'$. It means that the differential operator $\sigma^\mu_{R,L}\,\hat p_\mu$ is a scalar with respect to rotations and the corresponding boost transformation. Similarly, it is not difficult to prove that $\sigma^\mu_R\,\hat p_\mu\Psi_R(x)$ is transformed under the boost as the 4-spinor $\Psi_L(x)$, while $\sigma^\mu_L\,\hat p_\mu\Psi_L(x)$ is transformed as $\Psi_R(x)$.
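The statement that $\sigma^\mu_R$ transforms as a 4-vector under $B_R$ can be confirmed numerically for the boost along $x$. This is a minimal sketch assuming $B_R = \sigma_0\cosh(\nu/2) - \sigma_x\sinh(\nu/2)$, i. e. the upper sign in (8.10) with $\mathbf n$ along $x$; the helper names are hypothetical.

```python
import math

S0 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(c1, m1, c2, m2):
    """Linear combination c1*m1 + c2*m2 of two 2x2 matrices."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(2)] for i in range(2)]


def boost_r(nu):
    """B_R for a boost along x: sigma_0 cosh(nu/2) - sigma_x sinh(nu/2)."""
    return combine(math.cosh(nu / 2.0), S0, -math.sinh(nu / 2.0), SX)


def boosted(sigma, nu):
    """B_R^+ sigma B_R; B_R is real and symmetric here, so B_R^+ = B_R."""
    b = boost_r(nu)
    return mat_mul(b, mat_mul(sigma, b))


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    nu = 0.9
    ch, sh = math.cosh(nu), math.sinh(nu)
    print(max_diff(boosted(SX, nu), combine(ch, SX, -sh, S0)))
    print(max_diff(boosted(S0, nu), combine(ch, S0, -sh, SX)))
    print(max_diff(boosted(SY, nu), SY), max_diff(boosted(SZ, nu), SZ))
```

All printed differences should vanish to rounding error, which mirrors the transformation (6.2) of $(t, x, y, z)$ with $\sigma_0$ playing the role of $t$ and $\sigma_x$ that of $x$.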
8.3. The Dirac equation

The Dirac equation has the form of linear differential equations for the two 4-spinors $\Psi_R(x)$ and $\Psi_L(x)$:
$$\sigma^\mu_R\,\hat p_\mu\,\Psi_R(x) = m\,\Psi_L(x), \tag{8.16a}$$
$$\sigma^\mu_L\,\hat p_\mu\,\Psi_L(x) = m\,\Psi_R(x). \tag{8.16b}$$
Due to the properties of $\Psi_{R,L}(x)$ and $\sigma^\mu_{R,L}$, this equation does not change its form under rotations and boosts, while the operator of spatial inversion converts Eq. (8.16a) into Eq. (8.16b) and vice versa. Therefore, the Dirac equation obeys the requirement of relativistic covariance with respect to the extended Lorentz group.

If we apply the operator $\sigma^\nu_L\,\hat p_\nu$ to the left- and right-hand sides of Eq. (8.16a), we find
$$\left(\sigma^\nu_L\,\hat p_\nu\right)\left(\sigma^\mu_R\,\hat p_\mu\right)\Psi_R(x) = \left(\hat p_0^2 - \hat{\mathbf p}^2\right)\Psi_R(x) = m\left(\sigma^\nu_L\,\hat p_\nu\right)\Psi_L(x).$$
Using further Eq. (8.16b) for the right-hand side, we obtain the KFG equation for $\Psi_R(x)$,
$$\left(\hat p_\mu\hat p^\mu - m^2\right)\Psi_R(x) = 0, \tag{8.17a}$$
and similarly for $\Psi_L(x)$:
$$\left(\hat p_\mu\hat p^\mu - m^2\right)\Psi_L(x) = 0. \tag{8.17b}$$
It means that the plane-wave solutions $\Psi_{R,L}(x) = \psi_{R,L}\, e^{-ipx}$ correspond to particles with mass $m$ and the ordinary relation between the components of the 4-momentum, $p_\mu p^\mu = \varepsilon^2 - \mathbf p^2 = m^2$, and obey the equations
$$\boldsymbol\sigma(\mathbf p/\varepsilon)\,\psi_R = +\psi_R - (m/\varepsilon)\,\psi_L, \tag{8.18a}$$
$$\boldsymbol\sigma(\mathbf p/\varepsilon)\,\psi_L = -\psi_L + (m/\varepsilon)\,\psi_R. \tag{8.18b}$$
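It is seen from these equations that in the ultrarelativistic limit $m/\varepsilon \ll |\mathbf p/\varepsilon| \approx 1$ the solution $\Psi_R(x)$ is a plane wave with helicity $\hat{\mathbf s}\mathbf p/|\mathbf p|$ equal to $(+1/2)$, while the solution $\Psi_L(x)$ is a plane wave with helicity equal to $(-1/2)$. In the nonrelativistic limit $|\mathbf p| \ll \varepsilon \approx m$ both solutions have the same form, $\Psi_R(x) = \Psi_L(x)$. On the contrary, the combinations (8.13) behave differently in this limit: $\varphi(x) \to \Psi_R(x) = \Psi_L(x)$, but $\chi(x) \to 0$.

The two 4-spinors $\Psi_R(x)$ and $\Psi_L(x)$ can be combined into one 4-bispinor
$$\Psi(x) = \begin{pmatrix} \Psi_R(x) \\ \Psi_L(x) \end{pmatrix}, \tag{8.19}$$
for which the Dirac equation in the spinor representation takes the form
$$\left(\gamma^\mu\hat p_\mu - mI\right)\Psi(x) = 0, \quad \gamma^0 = \gamma_0 = \begin{pmatrix} 0 & \sigma_0 \\ \sigma_0 & 0 \end{pmatrix}, \quad \boldsymbol\gamma = \begin{pmatrix} 0 & -\boldsymbol\sigma \\ \boldsymbol\sigma & 0 \end{pmatrix}, \tag{8.20}$$
where $I$ is the $4 \times 4$ unit matrix. Here the Dirac matrices $\gamma^\mu$ obey the basic relations
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu} I. \tag{8.21}$$
Below we will use the bispinors
$$\Psi(x) = \begin{pmatrix} \varphi(x) \\ \chi(x) \end{pmatrix} \tag{8.22a}$$
combined from the spinors (8.13). These bispinors satisfy the Dirac equation in the standard representation:
$$\left(\gamma^\mu\hat p_\mu - mI\right)\Psi(x) = 0, \quad \gamma^0 = \gamma_0 = \begin{pmatrix} \sigma_0 & 0 \\ 0 & -\sigma_0 \end{pmatrix}, \quad \boldsymbol\gamma = \begin{pmatrix} 0 & \boldsymbol\sigma \\ -\boldsymbol\sigma & 0 \end{pmatrix}. \tag{8.22b}$$
Certainly, these matrices also obey the basic relations (8.21). For the bispinors (8.22a) the rotation transformation reads
$$\Psi' = \exp\left(\tfrac{i}{2}\,\omega\,\boldsymbol\Sigma\mathbf n\right)\Psi = \left[I\cos(\omega/2) + i\boldsymbol\Sigma\mathbf n\sin(\omega/2)\right]\Psi, \tag{8.23}$$
while the Lorentz transformation has the form
$$\Psi' = \exp\left(-\tfrac{1}{2}\,\nu\,\boldsymbol\alpha\mathbf n\right)\Psi = \left[I\cosh(\nu/2) - \boldsymbol\alpha\mathbf n\sinh(\nu/2)\right]\Psi. \tag{8.24}$$
Here the Hermitian matrices
$$\boldsymbol\Sigma = \begin{pmatrix} \boldsymbol\sigma & 0 \\ 0 & \boldsymbol\sigma \end{pmatrix}, \quad \boldsymbol\alpha = \begin{pmatrix} 0 & \boldsymbol\sigma \\ \boldsymbol\sigma & 0 \end{pmatrix} \tag{8.25}$$
satisfy the relations
$$\Sigma_j\Sigma_k = I\,\delta_{jk} + i\varepsilon_{jkn}\Sigma_n, \quad \alpha_j\alpha_k + \alpha_k\alpha_j = 2I\,\delta_{jk}. \tag{8.26}$$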
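It is useful to note that the Dirac equation can be obtained as the Lagrange equation if we choose the Lagrangian density in the form
$$\mathcal L\left(\Psi(x), \bar\Psi(x), \partial_\mu \Psi(x), \partial_\mu \bar\Psi(x)\right) = \frac{1}{2}\left[\bar\Psi \gamma_\mu\, i\partial^\mu \Psi - (i\partial^\mu \bar\Psi)\,\gamma_\mu \Psi\right] - m \bar\Psi \Psi,$$
where $\bar\Psi(x) = \Psi^+(x)\,\gamma^0$ is the so-called Dirac conjugate of the wave function $\Psi(x)$. It satisfies the equation $\hat p_\mu \bar\Psi(x)\,\gamma^\mu + m \bar\Psi(x) = 0$. The probability 4-current can be obtained using Eq. (3.7):
$$j^\mu = -i\left(\frac{\partial \mathcal L}{\partial(\partial_\mu \Psi)}\,\Psi - \bar\Psi\,\frac{\partial \mathcal L}{\partial(\partial_\mu \bar\Psi)}\right) = \bar\Psi \gamma^\mu \Psi. \tag{8.27a}$$
It satisfies the continuity equation
$$\partial_\mu j^\mu(x) = 0. \tag{8.27b}$$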
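The probability density
$$\varrho(x) = j^0(x) = \bar\Psi(x)\gamma^0 \Psi(x) = \Psi^+(x)\Psi(x) = \varphi^+(x)\varphi(x) + \chi^+(x)\chi(x) \tag{8.27c}$$
is a positive-definite function. The probability 3-current reads
$$\mathbf j(x) = \bar\Psi(x)\,\boldsymbol\gamma\,\Psi(x) = \varphi^+(x)\,\boldsymbol\sigma\,\chi(x) + \chi^+(x)\,\boldsymbol\sigma\,\varphi(x). \tag{8.27d}$$
Using the properties of the 4-spinors $\varphi(x)$ and $\chi(x)$ we can easily show that the bilinear combination $\bar\Psi(x)\Psi(x)$ is a scalar with respect to the extended Lorentz group, while $\bar\Psi(x)\gamma^\mu \Psi(x)$ is a 4-vector. Analogously, one can show that $\bar\Psi(x)\gamma^\mu\gamma^\nu\Psi(x)$ is a 4-tensor of the second rank, $\bar\Psi(x)\gamma_5\Psi(x)$ is a pseudoscalar, and $\bar\Psi(x)\gamma_5\gamma^\mu\Psi(x)$ is an axial 4-vector. Here the matrix $\gamma_5$ is defined as
$$\gamma_5 = -i\gamma_0\gamma_x\gamma_y\gamma_z = \begin{pmatrix} 0 & -\sigma_0 \\ -\sigma_0 & 0 \end{pmatrix}. \tag{8.28}$$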
8.4. C, P, and T transformations for the Dirac spinor field

Let us briefly discuss transformations of the Dirac spinor field regarding the discrete symmetries. For definiteness, we consider the standard representation. Besides the Dirac equation for a free particle,
$$\gamma^\mu \hat p_\mu \Psi(x) = m\Psi(x), \tag{8.29}$$
we will also use the Dirac equation for a particle with the charge $e$ in the electromagnetic field $A_\mu(x)$:
$$\gamma^\mu\left(\hat p_\mu - eA_\mu(x)\right)\Psi(x) = m\Psi(x). \tag{8.30}$$
As for the spatial inversion, it corresponds to the transformation (see Eq. (14)):
$$\Psi_P(t, \mathbf r) = \eta_p\,\gamma_0\,\Psi(t, -\mathbf r). \tag{8.31}$$
Now we consider the properties of the Dirac equation with respect to the charge conjugation transformation. Let the function $\Psi(x)$ satisfy Eq. (30). It is easy to prove that the function
$$\Psi_C(x) = C\,\bar\Psi(x), \quad C = \gamma_y\gamma_0 = -\alpha_y \tag{8.32}$$
corresponds to the charge-conjugate particle. Indeed, this function satisfies the equation
$$\gamma^\mu\left(\hat p_\mu + eA_\mu(x)\right)\Psi_C(x) = m\,\Psi_C(x), \tag{8.32}$$
which differs from Eq. (30) for $\Psi(x)$ by the sign of the charge $e$ only. It is also easy to prove that if the function $\Psi(x)$ satisfies Eq. (29), then the functions
$$\Psi_T(t, \mathbf r) = U_T\,\bar\Psi(-t, \mathbf r), \quad U_T = i\gamma_z\gamma_x\gamma_0 \tag{8.33}$$
and
$$\Psi_{CPT}(t, \mathbf r) = i\gamma_5\,\Psi(-t, -\mathbf r) \tag{8.34}$$
satisfy the same Eq. (29).
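8.5. Hamiltonian form of the Dirac equation

Multiplying Eq. (30) by $\gamma_0$ from the left, we obtain the Dirac equation in the Hamiltonian form
$$i\,\frac{\partial\Psi(x)}{\partial t} = \hat H\,\Psi(x), \quad \hat H = \boldsymbol\alpha(\hat{\mathbf p} - e\mathbf A) + m\gamma_0 + eA_0 I, \quad \hat{\mathbf p} = -i\nabla. \tag{8.35}$$
In the central field (at $\mathbf A = 0$, $eA_0 = U(r)$), the orbital angular momentum $\hat{\mathbf l} = \mathbf r \times \hat{\mathbf p}$ and the spin
$$\hat{\mathbf s} = \frac{1}{2}\boldsymbol\Sigma = \frac{1}{2}\begin{pmatrix} \boldsymbol\sigma & 0 \\ 0 & \boldsymbol\sigma \end{pmatrix}$$
are not conserved separately:
$$\frac{d\hat{\mathbf l}}{dt} = i[\hat H, \hat{\mathbf l}] = \boldsymbol\alpha \times \hat{\mathbf p}, \quad \frac{d\hat{\mathbf s}}{dt} = i[\hat H, \hat{\mathbf s}] = -\boldsymbol\alpha \times \hat{\mathbf p}.$$
It is quite natural, however, that the total angular momentum $\hat{\mathbf j} = \hat{\mathbf l} + \hat{\mathbf s}$ is conserved:
$$\frac{d\hat{\mathbf j}}{dt} = i[\hat H, \hat{\mathbf j}] = 0.$$
Let us consider the free electron in the state with the definite momentum $\mathbf p$. In this case the Hamiltonian
$$\hat H = \boldsymbol\alpha\mathbf p + m\gamma_0,$$
generally speaking, does not commute with the spin operator:
$$[\hat H, \hat{\mathbf s}] = i\,\boldsymbol\alpha \times \mathbf p. \tag{8.36}$$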
However, the last equation points to two possible exceptions.

1. The right-hand side of Eq. (36) tends to zero if $\mathbf p \to 0$ (which takes place in the rest frame of the electron):
$$[\hat H, \hat{\mathbf s}] = 0 \quad \text{at } \mathbf p \to 0. \tag{8.37}$$
It means that one can describe the spin state of a free electron by the definite values $\sigma = \pm 1/2$ of the operator $\hat s_z$ in the rest frame of the electron.

2. If one multiplies Eq. (36) by the vector $\mathbf p$, then the right-hand side of the obtained equation becomes equal to zero. It proves that the helicity operator $\hat\Lambda$, i. e. the projection of the spin on the direction of the electron momentum, commutes with the Hamiltonian:
$$[\hat H, \hat\Lambda] = 0, \quad \hat\Lambda = \frac{\hat{\mathbf s}\cdot\mathbf p}{|\mathbf p|}. \tag{8.38}$$
The eigenvalues of this operator $\hat\Lambda$ equal $\lambda = \pm 1/2$, and the corresponding states are called the helicity states.
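Both statements about the free Hamiltonian can be checked numerically with explicit $4\times4$ matrices. This is only an illustrative sketch: the values of $m$ and $\mathbf p$ are arbitrary, and the helper functions are hypothetical names introduced here.

```python
import math

S0 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def block(a, b, c, d):
    """Build the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(coeffs, mats):
    """Return sum_k coeffs[k] * mats[k] for 4x4 matrices."""
    return [[sum(c * mat[i][j] for c, mat in zip(coeffs, mats))
             for j in range(4)] for i in range(4)]


def commutator(a, b):
    return lincomb([1, -1], [matmul(a, b), matmul(b, a)])


def max_abs(a):
    return max(abs(x) for row in a for x in row)


ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]                    # alpha of (8.25)
SPIN = [lincomb([0.5], [block(s, Z2, Z2, s)]) for s in SIGMA]   # s = Sigma / 2
GAMMA0 = block(S0, Z2, Z2, [[-1, 0], [0, -1]])

m = 1.0                      # arbitrary test values
p = [0.3, -0.4, 1.2]
p_abs = math.sqrt(sum(q * q for q in p))

H = lincomb(p + [m], ALPHA + [GAMMA0])                # H = alpha.p + m gamma_0
helicity = lincomb([q / p_abs for q in p], SPIN)      # Lambda = s.p / |p|

comm_spin_z = commutator(H, SPIN[2])
# z-component of alpha x p: alpha_x p_y - alpha_y p_x
alpha_cross_p_z = lincomb([p[1], -p[0]], ALPHA[:2])
residual = lincomb([1, -1j], [comm_spin_z, alpha_cross_p_z])

print("max |[H, s_z]|           =", max_abs(comm_spin_z))
print("max |[H, s_z] - i(a x p)_z| =", max_abs(residual))
print("max |[H, Lambda]|        =", max_abs(commutator(H, helicity)))
```

The first number is of order $|\mathbf p|$, while the other two vanish up to rounding errors, in agreement with Eqs. (36) and (38).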
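8.6. Plane wave solutions of the Dirac equation

The plane wave
$$\Psi(x) = u(p)\,e^{-ipx}, \quad px \equiv p_\mu x^\mu = Et - \mathbf p\mathbf r, \tag{8.39}$$
corresponds to the free motion of a particle with the definite 4-momentum $p^\mu = (E, \mathbf p)$. Here the bispinor
$$u(p) = u(E, \mathbf p) = \begin{pmatrix} \varphi(p) \\ \chi(p) \end{pmatrix}$$
satisfies the equation
$$(\gamma^\mu p_\mu - m I)\,u(p) = 0. \tag{8.40}$$
As a result, the two-component spinors $\varphi(p)$ and $\chi(p)$ obey the following system of equations:
$$(E - m)\,\varphi - \boldsymbol\sigma\mathbf p\,\chi = 0, \quad \boldsymbol\sigma\mathbf p\,\varphi - (E + m)\,\chi = 0.$$
This system has a nonzero solution if its determinant equals zero, i. e. if $E^2 = \mathbf p^2 + m^2$. Let us introduce the positive value
$$\varepsilon = +\sqrt{\mathbf p^2 + m^2}. \tag{8.41}$$
Now we can discuss two possibilities.

1. Energy is positive:
$$E = +\varepsilon, \quad \chi = \frac{\boldsymbol\sigma\mathbf p}{\varepsilon + m}\,\varphi.$$
Using the normalization $\varphi^+\varphi = 1$, $\bar u u = 2m$, we obtain the bispinor
$$u(\varepsilon, \mathbf p) \equiv u_{\mathbf p} = \begin{pmatrix} \sqrt{\varepsilon + m}\,\varphi \\ \hat A\,\varphi \end{pmatrix}, \quad \hat A = \frac{\boldsymbol\sigma\mathbf p}{\sqrt{\varepsilon + m}} = \sqrt{\varepsilon - m}\;\boldsymbol\sigma\mathbf n, \tag{8.42a}$$
where $\mathbf n = \mathbf p/|\mathbf p|$; besides,
$$\bar u_{\mathbf p} u_{\mathbf p} = 2m, \quad \bar u_{\mathbf p}\gamma^\mu u_{\mathbf p} = 2p^\mu. \tag{8.42b}$$

2. Energy is negative:
$$E = -\varepsilon, \quad u(-\varepsilon, \mathbf p) = \begin{pmatrix} -\hat A\,\chi \\ \sqrt{\varepsilon + m}\,\chi \end{pmatrix}, \quad \chi^+\chi = 1, \quad \bar u(-\varepsilon, \mathbf p)\,u(-\varepsilon, \mathbf p) = -2m. \tag{8.43}$$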
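The positive-energy solution (42a) can be verified directly in two-component form. The sketch below uses arbitrary test values of $m$, $\mathbf p$ and $\varphi$, and relies on the fact that in the standard representation $\bar u u$ is the squared norm of the upper components minus that of the lower ones; all function names are illustrative.

```python
import math

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def sigma_dot(v):
    """2x2 matrix sigma.v for a real 3-vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def apply(mat, spinor):
    return [sum(mat[i][j] * spinor[j] for j in range(2)) for i in range(2)]


def norm2(spinor):
    return sum(abs(c) ** 2 for c in spinor)


m = 1.0                         # arbitrary test values
p = [0.3, -0.4, 1.2]
eps = math.sqrt(sum(q * q for q in p) + m * m)

phi = [0.6, 0.8j]               # any spinor with phi^+ phi = 1
upper = [math.sqrt(eps + m) * c for c in phi]                        # sqrt(eps+m) phi
lower = [c / math.sqrt(eps + m) for c in apply(sigma_dot(p), phi)]   # A phi

# Residuals of the two-component system for E = +eps
res1 = [(eps - m) * a - b for a, b in zip(upper, apply(sigma_dot(p), lower))]
res2 = [a - (eps + m) * b for a, b in zip(apply(sigma_dot(p), upper), lower)]

ubar_u = norm2(upper) - norm2(lower)       # \bar u u, expected 2m
ubar_g0_u = norm2(upper) + norm2(lower)    # \bar u gamma^0 u = u^+ u, expected 2 eps

print(max(abs(c) for c in res1 + res2), ubar_u, ubar_g0_u)
```

The residuals vanish up to rounding, and the two bilinears reproduce $2m$ and $2\varepsilon$, i. e. the scalar and the time component of Eq. (42b).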
The four components of the wave function correspond to two spin states and two signs of the energy. It is impossible to exclude states with the negative energy because in quantum field theory there are transitions between states. Acting in the framework of quantum mechanics, Dirac postulated that all levels with the negative energy are occupied (the negative Dirac sea). Therefore, there are no transitions to these levels due to the Pauli exclusion principle. A hole in this Dirac sea is equivalent to a particle with the same mass as an electron, but with the opposite charge. Moreover, such a hole (i. e. the state without an electron with the negative energy $(-\varepsilon)$ and momentum $(-\mathbf p)$) corresponds to the state of the particle-hole with the positive energy $(+\varepsilon)$ and momentum $(+\mathbf p)$. In quantum field theory such a particle-hole can be interpreted as an antiparticle; therefore, the postulate about the Dirac sea becomes unnecessary. Such an antiparticle was indeed discovered (C. Anderson, 1932) and called the positron.

Below we collect the basic formulas about polarization states of free electrons and positrons in the two approaches mentioned in Section 8.5.

1. In the first approach, a free electron with the momentum $\mathbf p$, energy $\varepsilon = +\sqrt{\mathbf p^2 + m^2}$ and the spin projection $\sigma$ on the $z$-axis in the rest frame of the electron is described by the function
$$\Psi_{\mathbf p\sigma}(x) = N u_{\mathbf p\sigma}\,e^{-ipx}, \quad px = \varepsilon t - \mathbf p\mathbf r.$$
Here the bispinor $u_{\mathbf p\sigma}$ satisfies the equations
$$(\gamma^\mu p_\mu - m I)\,u_{\mathbf p\sigma} = 0, \quad \hat s_z u_{\mathbf p\sigma} = \sigma\,u_{\mathbf p\sigma} \ \text{ at } \varepsilon \to m$$
and the normalization condition
$$\bar u_{\mathbf p\sigma} u_{\mathbf p\sigma'} = 2m\,\delta_{\sigma\sigma'}. \tag{8.44}$$
Its explicit form reads
$$u_{\mathbf p\sigma} = \begin{pmatrix} \sqrt{\varepsilon + m}\;\varphi^{(\sigma)} \\ \sqrt{\varepsilon - m}\;\boldsymbol\sigma\mathbf n\,\varphi^{(\sigma)} \end{pmatrix}, \quad \mathbf n = \frac{\mathbf p}{|\mathbf p|}. \tag{8.45a}$$
Besides,
$$\bar u_{\mathbf p\sigma}\gamma^\mu u_{\mathbf p\sigma'} = 2p^\mu\,\delta_{\sigma\sigma'}. \tag{8.46}$$
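The two-component spinors $\varphi^{(\sigma)}$ satisfy the equations
$$\tfrac{1}{2}\sigma_z\,\varphi^{(\sigma)} = \sigma\,\varphi^{(\sigma)}, \quad \varphi^{(\sigma)+}\varphi^{(\sigma')} = \delta_{\sigma\sigma'};$$
their explicit forms can be the same as in the nonrelativistic case:
$$\varphi^{(\sigma=1/2)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \quad \varphi^{(\sigma=-1/2)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.45b}$$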
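The normalization factor $N$ equals
$$N = \frac{1}{\sqrt{2\varepsilon V}},$$
if we use the standard normalization for one particle in the volume $V$.

Now let us describe the spin states of a free positron in the first approach. We recall that in the Dirac picture the positron with the positive energy $(+\varepsilon)$, momentum $(+\mathbf p)$ and spin projection $(+\sigma)$ on the $z$-axis in the electron rest frame is the hole which corresponds to the electron state with the negative energy $(-\varepsilon)$, momentum $(-\mathbf p)$ and spin projection $(-\sigma)$ on the $z$-axis. Therefore, such an electron is described by the function
$$\Psi_{-\mathbf p\,-\sigma}(x) = C\,\bar\Psi_{\mathbf p\sigma}(x) = N v_{\mathbf p\sigma}\,e^{+ipx}, \quad px = \varepsilon t - \mathbf p\mathbf r.$$
Here the bispinor $v_{\mathbf p\sigma} = C\,\bar u_{\mathbf p\sigma}$ satisfies the equations
$$(-\gamma^\mu p_\mu - m I)\,v_{\mathbf p\sigma} = 0, \quad \hat s_z v_{\mathbf p\sigma} = -\sigma\,v_{\mathbf p\sigma} \ \text{ at } \varepsilon \to m$$
and the normalization condition
$$\bar v_{\mathbf p\sigma} v_{\mathbf p\sigma'} = -2m\,\delta_{\sigma\sigma'}. \tag{8.47}$$
Its explicit form reads
$$v_{\mathbf p\sigma} = \begin{pmatrix} \sqrt{\varepsilon - m}\;\boldsymbol\sigma\mathbf n\,\chi^{(-\sigma)} \\ \sqrt{\varepsilon + m}\;\chi^{(-\sigma)} \end{pmatrix}, \tag{8.48a}$$
where the two-component spinors are
$$\chi^{(-\sigma)} = -\sigma_y\,\varphi^{(\sigma)} = -2\sigma i\,\varphi^{(-\sigma)}. \tag{8.48b}$$
Besides,
$$\bar v_{\mathbf p\sigma}\gamma^\mu v_{\mathbf p\sigma'} = 2p^\mu\,\delta_{\sigma\sigma'}. \tag{8.49}$$
Certainly, the bispinors $u$ and $v$ are mutually orthogonal:
$$\bar v_{\mathbf p\sigma} u_{\mathbf p\sigma'} = \bar u_{\mathbf p\sigma} v_{\mathbf p\sigma'} = 0. \tag{8.50}$$
For future reference, let us note the useful equations:
$$\gamma^0 u_{-\mathbf p\sigma} = +u_{\mathbf p\sigma}, \quad \gamma^0 v_{-\mathbf p\sigma} = -v_{\mathbf p\sigma}. \tag{8.51}$$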
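2. In the second approach a free electron is described by the plane wave
$$\Psi_{\mathbf p\lambda}(x) = N u_{\mathbf p\lambda}\,e^{-ipx}, \quad px = \varepsilon t - \mathbf p\mathbf r,$$
where the bispinor $u_{\mathbf p\lambda}$ satisfies the equations
$$(\gamma^\mu p_\mu - m)\,u_{\mathbf p\lambda} = 0, \quad \hat\Lambda\,u_{\mathbf p\lambda} = \lambda\,u_{\mathbf p\lambda}$$
and the normalization condition
$$\bar u_{\mathbf p\lambda} u_{\mathbf p\lambda'} = 2m\,\delta_{\lambda\lambda'}. \tag{8.52}$$
Its explicit form reads
$$u_{\mathbf p\lambda} = \begin{pmatrix} \sqrt{\varepsilon + m}\;w^{(\lambda)}(\mathbf n) \\ 2\lambda\sqrt{\varepsilon - m}\;w^{(\lambda)}(\mathbf n) \end{pmatrix}. \tag{8.53a}$$
The two-component spinors $w^{(\lambda)}(\mathbf n)$ satisfy the equations
$$\tfrac{1}{2}(\boldsymbol\sigma\mathbf n)\,w^{(\lambda)}(\mathbf n) = \lambda\,w^{(\lambda)}(\mathbf n), \quad w^{(\lambda)+}(\mathbf n)\,w^{(\lambda')}(\mathbf n) = \delta_{\lambda'\lambda};$$
their explicit forms are
$$w^{(\lambda=1/2)}(\mathbf n) = \begin{pmatrix} e^{-i\varphi/2}\cos\frac{\theta}{2} \\ e^{i\varphi/2}\sin\frac{\theta}{2} \end{pmatrix}; \quad w^{(\lambda=-1/2)}(\mathbf n) = \begin{pmatrix} -e^{-i\varphi/2}\sin\frac{\theta}{2} \\ e^{i\varphi/2}\cos\frac{\theta}{2} \end{pmatrix}. \tag{8.53b}$$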
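As a quick check of (53b), one can verify numerically that these spinors are orthonormal eigenvectors of $\frac12\boldsymbol\sigma\mathbf n$. The sketch assumes that $\theta$ and $\varphi$ are the usual polar and azimuthal angles of $\mathbf n$, i. e. $\mathbf n = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$; the function names are illustrative.

```python
import cmath
import math


def helicity_spinor(lam, theta, phi):
    """Two-component spinor w^(lam)(n) of (8.53b), lam = +0.5 or -0.5."""
    em, ep = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    if lam > 0:
        return [em * math.cos(theta / 2), ep * math.sin(theta / 2)]
    return [-em * math.sin(theta / 2), ep * math.cos(theta / 2)]


def half_sigma_n(theta, phi):
    """The matrix (1/2) sigma.n for n = (sin t cos f, sin t sin f, cos t)."""
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    return [[nz / 2, (nx - 1j * ny) / 2], [(nx + 1j * ny) / 2, -nz / 2]]


def eigen_residual(lam, theta, phi):
    """Largest component of (1/2)(sigma.n) w - lam w."""
    w = helicity_spinor(lam, theta, phi)
    mat = half_sigma_n(theta, phi)
    image = [sum(mat[i][j] * w[j] for j in range(2)) for i in range(2)]
    return max(abs(a - lam * b) for a, b in zip(image, w))


def inner(w1, w2):
    """Scalar product w1^+ w2."""
    return sum(a.conjugate() * b for a, b in zip(w1, w2))


theta, phi = 0.7, 2.1     # arbitrary test direction
w_plus = helicity_spinor(+0.5, theta, phi)
w_minus = helicity_spinor(-0.5, theta, phi)

print(eigen_residual(+0.5, theta, phi), eigen_residual(-0.5, theta, phi))
print(abs(inner(w_plus, w_plus)), abs(inner(w_plus, w_minus)))
```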
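In the negative Dirac sea, the state without the electron with helicity $\lambda$ (i. e. the spin projection onto the direction of the electron momentum $(-\mathbf p)$) corresponds to the hole-positron with the same helicity $\lambda$ (i. e. the spin projection onto the direction of the hole momentum $(+\mathbf p)$). As a result, such a positron is described by the function
$$\Psi_{-\mathbf p\lambda}(x) = N v_{\mathbf p\lambda}\,e^{+ipx}, \quad px = \varepsilon t - \mathbf p\mathbf r,$$
where the bispinor $v_{\mathbf p\lambda} = C\,\bar u_{\mathbf p\lambda}$ satisfies the equations
$$(-\gamma^\mu p_\mu - m I)\,v_{\mathbf p\lambda} = 0, \quad \hat\Lambda' v_{\mathbf p\lambda} = \lambda\,v_{\mathbf p\lambda}, \quad \hat\Lambda' = \frac{\hat{\mathbf s}\cdot(-\mathbf p)}{|\mathbf p|}$$
and the normalization condition
$$\bar v_{\mathbf p\lambda} v_{\mathbf p\lambda'} = -2m\,\delta_{\lambda'\lambda}. \tag{8.54}$$
Its explicit form reads
$$v_{\mathbf p\lambda} = i\begin{pmatrix} \sqrt{\varepsilon - m}\;w^{(-\lambda)}(\mathbf n) \\ -2\lambda\sqrt{\varepsilon + m}\;w^{(-\lambda)}(\mathbf n) \end{pmatrix}. \tag{8.55}$$
Certainly, the bispinors $u_{\mathbf p\lambda}$ and $v_{\mathbf p\lambda}$ are mutually orthogonal:
$$\bar v_{\mathbf p\lambda} u_{\mathbf p\lambda'} = \bar u_{\mathbf p\lambda} v_{\mathbf p\lambda'} = 0. \tag{8.56}$$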
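8.7. Quantization of the Dirac spinor field

In the above section it was shown that the wave functions
$$\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,u_{\mathbf p\sigma} \quad \text{and} \quad \frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,v_{\mathbf p\sigma} \tag{8.57}$$
give us the complete set and that the normalization of these functions corresponds to one particle in the whole volume $V$. Expanding the operators of the spinor Dirac field over the functions (57) in the same way as we did for the complex scalar field, we obtain
$$\hat\Psi(x) = \sum_{\mathbf p\sigma}\left(\hat a_{\mathbf p\sigma}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,u_{\mathbf p\sigma} + \hat b^+_{\mathbf p\sigma}\,\frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,v_{\mathbf p\sigma}\right), \tag{8.58}$$
$$\hat{\bar\Psi}(x) = \sum_{\mathbf p\sigma}\left(\hat a^+_{\mathbf p\sigma}\,\frac{e^{ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,\bar u_{\mathbf p\sigma} + \hat b_{\mathbf p\sigma}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,\bar v_{\mathbf p\sigma}\right). \tag{8.59}$$
Here $\hat a^+_{\mathbf p\sigma}$ ($\hat a_{\mathbf p\sigma}$) is the creation (annihilation) operator of a free particle with the momentum $\mathbf p$, energy $\varepsilon_{\mathbf p} = +\sqrt{\mathbf p^2 + m^2}$ and polarization $\sigma$, while $\hat b^+_{\mathbf p\sigma}$ ($\hat b_{\mathbf p\sigma}$) is the creation (annihilation) operator of a free antiparticle with the momentum $\mathbf p$, energy $\varepsilon_{\mathbf p} = +\sqrt{\mathbf p^2 + m^2}$ and polarization $\sigma$.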
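It is convenient to calculate the field energy using the Hamiltonian form of the Dirac equation (see Section 8.5):
$$E = \int \Psi^+\,i\,\frac{\partial\Psi(x)}{\partial t}\,d^3r = \int \bar\Psi\gamma^0\,i\,\frac{\partial\Psi(x)}{\partial t}\,d^3r.$$
Then we obtain as usual
$$E \to \hat H = \sum_{\mathbf p\sigma}\varepsilon_{\mathbf p}\left(\hat a^+_{\mathbf p\sigma}\hat a_{\mathbf p\sigma} - \hat b_{\mathbf p\sigma}\hat b^+_{\mathbf p\sigma}\right).$$
To prove the last equation we use the orthonormalization of bispinors (see (44), (46), (47), (49)–(50)) and the relation (51), which give
$$\bar u_{\mathbf p\sigma}\gamma^0 u_{\mathbf p\sigma'} = 2\varepsilon_{\mathbf p}\,\delta_{\sigma\sigma'}, \quad \bar v_{\mathbf p\sigma}\gamma^0 v_{\mathbf p\sigma'} = 2\varepsilon_{\mathbf p}\,\delta_{\sigma\sigma'}, \quad \bar u_{\mathbf p\sigma}\gamma^0 v_{-\mathbf p\sigma'} = \bar v_{\mathbf p\sigma}\gamma^0 u_{-\mathbf p\sigma'} = 0.$$
To get a meaningful expression for $\hat H$, one should use quantization rules according to the Fermi-Dirac statistics:
$$\left\{\hat a_{\mathbf p\sigma}, \hat a^+_{\mathbf p\sigma'}\right\} = \delta_{\sigma\sigma'}, \quad \left\{\hat b_{\mathbf p\sigma}, \hat b^+_{\mathbf p\sigma'}\right\} = \delta_{\sigma\sigma'},$$
and all other pairs of the operators $\hat a$, $\hat a^+$, $\hat b$, $\hat b^+$ anticommute. In this case
$$\hat H = \sum_{\mathbf p\sigma}\varepsilon_{\mathbf p}\left(\hat n_{\mathbf p\sigma} + \hat{\bar n}_{\mathbf p\sigma} - 1\right), \quad \hat n_{\mathbf p\sigma} = \hat a^+_{\mathbf p\sigma}\hat a_{\mathbf p\sigma}, \quad \hat{\bar n}_{\mathbf p\sigma} = \hat b^+_{\mathbf p\sigma}\hat b_{\mathbf p\sigma}.$$
Similarly, for the charge of the field
$$Q = \int j^0(x)\,d^3r = \int \bar\Psi(x)\gamma^0\Psi(x)\,d^3r$$
we obtain
$$Q \to \hat Q = \sum_{\mathbf p\sigma}\left(\hat a^+_{\mathbf p\sigma}\hat a_{\mathbf p\sigma} + \hat b_{\mathbf p\sigma}\hat b^+_{\mathbf p\sigma}\right) = \sum_{\mathbf p\sigma}\left(\hat n_{\mathbf p\sigma} - \hat{\bar n}_{\mathbf p\sigma} + 1\right),$$
where $\hat n_{\mathbf p\sigma}$ is the operator of the number of particles and $\hat{\bar n}_{\mathbf p\sigma}$ is the operator of the number of antiparticles. It turns out that particles and antiparticles enter the charge operator with different signs. In the case of electrons and positrons, traditionally the negatively charged particles (the electrons) are called particles and the positively charged particles (the positrons) are called antiparticles. This completes the quantization of the spinor Dirac field.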
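The role of the Fermi-Dirac rules in the energy operator can be spelled out for a single mode $(\mathbf p, \sigma)$. The following short worked step uses only the anticommutators quoted above; the comparison with commutators is added here for contrast and is not part of the derivation in the text.

```latex
% One mode (p, sigma); indices are omitted.
% Anticommutator:  b b^+ + b^+ b = 1   =>   b b^+ = 1 - b^+ b
\varepsilon_{\mathbf p}\left(\hat a^{+}\hat a - \hat b\,\hat b^{+}\right)
  = \varepsilon_{\mathbf p}\left(\hat a^{+}\hat a + \hat b^{+}\hat b - 1\right)
  = \varepsilon_{\mathbf p}\left(\hat n + \hat{\bar n} - 1\right).
% With commutators instead,  b b^+ - b^+ b = 1  =>  b b^+ = 1 + b^+ b,
% the same term would become
\varepsilon_{\mathbf p}\left(\hat n - \hat{\bar n} - 1\right),
% i.e. each antiparticle would lower the energy and the spectrum
% would not be bounded from below.
```

The same substitution in the charge operator gives $\hat a^+\hat a + \hat b\,\hat b^+ = \hat n - \hat{\bar n} + 1$, which is the origin of the opposite signs for particles and antiparticles.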
§ 9. Interaction representation

Up to now we have studied free fields. We started with the description of free fields in the Schrödinger representation, well known from quantum mechanics. Then we found out that the most natural quantum description of these fields is within the Heisenberg representation. However, the main aim of our course is related to the interaction of various fields, which describes the interaction of elementary particles, the quanta of these fields. When we proceed to the interaction of particles, the most effective apparatus for calculations of their interaction, at least for today, is the perturbation theory. The perturbation theory is conveniently developed in a special picture which is called the interaction representation.

Let me recall some basic facts about these representations. In the Schrödinger representation we have a wave function (the vector of state) that depends on the coordinates and time. In the future, we'll be interested only in its development in time $t$, so we will shorten the notation to indicate only the time, and the coordinates are implied: $\Psi(t)$. This wave function obeys the Schrödinger equation
$$i\,\frac{\partial\Psi(t)}{\partial t} = \hat H\,\Psi(t).$$
As a rule, operators are considered to depend on the coordinates and momenta, but not on the time. For example, we studied the vector potential of the electromagnetic field. In the quantization it turned out to depend only on the coordinates, but not on the time. Let us find a development operator (usually called the evolution operator) which expresses the wave function at the time $t$ via the wave function at the initial moment $t = 0$. Let's designate this operator with the letter $\hat U(t)$; therefore, $\Psi(t) = \hat U(t)\Psi(0)$. It must depend on the time and must maintain the normalization of the wave function, and thus it must be unitary, i. e. $\hat U^+(t) = \hat U^{-1}(t)$. How can this operator be found? We can recall the following facts, which are well known in the Schrödinger picture. If we know the eigenvalues $E_n$ and eigenfunctions $\Psi_n(0)$ of the operator $\hat H$, i. e. $\hat H\Psi_n(0) = E_n\Psi_n(0)$, then any initial state $\Psi(0)$ can be represented as a superposition of these eigenfunctions:
$$\Psi(0) = \sum_n c_n\,\Psi_n(0). \tag{9.1a}$$
Then the transition to the time dependence is performed very simply: we add the factor $e^{-iE_n t}$ to each term in this sum:
$$\Psi(t) = \sum_n c_n\,\Psi_n(0)\,e^{-iE_n t}. \tag{9.1b}$$
Since
$$e^{-i\hat H t}\,\Psi_n(0) = e^{-iE_n t}\,\Psi_n(0),$$
it means that Eq. (1b) can be presented in the compact form
$$\Psi(t) = \hat U(t)\sum_n c_n\,\Psi_n(0) = \hat U(t)\,\Psi(0), \tag{9.2}$$
where
$$\hat U(t) = e^{-i\hat H t}. \tag{9.3}$$
Thus, we have found that the development operator is exactly that given by Eq. (3). As $\hat H$ is a Hermitian operator, the exponent with $(-i\hat H t)$ is a unitary operator. I have to tell the following: we, of course, made things look nice, but in fact the action of this operator on some initial state can actually be understood through the sum (1). That is, when creating such expressions, we always need to have a procedure to actually apply this operator.

Using the relation (2), we can rewrite the mean value of some operator $\hat A$,
$$\langle A(t)\rangle = \left\langle \Psi(t)\,|\hat A|\,\Psi(t)\right\rangle,$$
in the form
$$\langle A(t)\rangle = \left\langle \Psi(0)\,|\hat A_H(t)|\,\Psi(0)\right\rangle, \quad \hat A_H(t) = \hat U^{-1}(t)\,\hat A\,\hat U(t), \tag{9.4}$$
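The recipe (1) and the equivalence of the two pictures in (4) can be tried on the simplest toy system. The sketch below assumes a two-level Hamiltonian $\hat H = g\,\sigma_x$ (an illustrative choice, not taken from the lectures), whose eigenvalues are $\pm g$ with eigenvectors $(1, \pm 1)/\sqrt2$, and takes $\hat A = \sigma_z$.

```python
import cmath
import math

g = 0.7   # assumed coupling of the toy Hamiltonian H = [[0, g], [g, 0]]

# Eigenvalues and (real, normalized) eigenvectors of H
EIGEN = [(+g, [1 / math.sqrt(2), 1 / math.sqrt(2)]),
         (-g, [1 / math.sqrt(2), -1 / math.sqrt(2)])]


def evolve(psi0, t):
    """Psi(t) = sum_n c_n Psi_n e^{-i E_n t} with c_n = <Psi_n|Psi(0)>, as in (9.1)."""
    psi_t = [0j, 0j]
    for energy, vec in EIGEN:
        c_n = sum(v * x for v, x in zip(vec, psi0))   # eigenvectors are real here
        phase = cmath.exp(-1j * energy * t)
        psi_t = [acc + c_n * phase * v for acc, v in zip(psi_t, vec)]
    return psi_t


t = 1.3
psi0 = [1, 0]
psi_t = evolve(psi0, t)
norm = sum(abs(c) ** 2 for c in psi_t)

# Schroedinger picture: <Psi(t)| sigma_z |Psi(t)>
SZ = [1, -1]                                  # diagonal of sigma_z
mean_s = sum(SZ[k] * abs(psi_t[k]) ** 2 for k in range(2))

# Heisenberg picture: A_H = U^+ sigma_z U, where the columns of U are the
# evolved basis vectors; the mean value is taken in the initial state Psi(0)
cols = [evolve([1, 0], t), evolve([0, 1], t)]
U = [[cols[j][i] for j in range(2)] for i in range(2)]
A_H = [[sum(U[k][i].conjugate() * SZ[k] * U[k][j] for k in range(2))
        for j in range(2)] for i in range(2)]
mean_h = sum(psi0[i] * A_H[i][j] * psi0[j]
             for i in range(2) for j in range(2)).real

print(norm, mean_s, mean_h, math.cos(2 * g * t))
```

For this initial state the sum (1b) gives $\Psi(t) = (\cos gt,\ -i\sin gt)$, the norm stays equal to one, and both pictures give the same mean value $\cos 2gt$.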
in which the operator $\hat A_H(t)$ depends on time, but the vector of state $\Psi(0)$ does not. Such a description of the considered system is called the Heisenberg representation. In quantum mechanics, this picture is used relatively infrequently, mostly to emphasize the fact that the Heisenberg picture is a little bit closer to the description of classical mechanics than the Schrödinger picture. Of course, the Schrödinger representation is the most habitual in quantum mechanics. In the field theory, on the contrary, the Heisenberg picture is preferable, because it exhibits relativistic covariance most clearly, for example, for the operators of fields we considered.

These were just reminiscences from quantum mechanics. How is that connected with the subject of today's lecture? If our Hamiltonian happens to be divisible into two parts,
$$\hat H = \hat H_0 + \hat V,$$
one of which, $\hat H_0$, corresponds to simple and well-known solutions and the second, $\hat V$, is a small correction, then the perturbation theory can be developed. To this end we remember that we are dealing with the field theory, and the Heisenberg representation is preferable here. On the other hand, we can start from an expression without perturbation. Therefore, we need to make such a representation that would coincide with the Heisenberg one in the absence of perturbation. It can be done in the interaction representation as follows: we need to create a new development operator,
$$\hat U_0(t) = e^{-i\hat H_0 t}, \tag{9.5}$$
and a new vector of state
$$\Phi(t) = \hat U_0^{-1}(t)\,\Psi(t),$$
which corresponds to a Hamiltonian without interaction. Very often, by $\hat H_0$ we mean a Hamiltonian for free fields, which we know all about and of which we have a complete description, and we'll use these solutions further. This vector of state obeys the equation
$$i\,\frac{\partial\Phi(t)}{\partial t} = i\,\frac{\partial\hat U_0^{-1}(t)}{\partial t}\,\Psi(t) + i\,\hat U_0^{-1}(t)\,\frac{\partial\Psi(t)}{\partial t} = -\hat U_0^{-1}(t)\,\hat H_0\,\Psi(t) + \hat U_0^{-1}(t)\,\hat H\,\Psi(t)$$
or
$$i\,\frac{\partial\Phi(t)}{\partial t} = \hat V(t)\,\Phi(t), \tag{9.6}$$
where
$$\hat V(t) = \hat U_0^{-1}(t)\,\hat V\,\hat U_0(t).$$
The interaction representation is very convenient since:
• if the interaction is negligibly small, $\hat V(t) \to 0$, then we get to the Heisenberg picture, which is preferable. Moreover, we already know all the basic relations for noninteracting fields;
• the vector of state $\Phi(t)$ obeys the equation (6), whose right-hand side includes the small quantity $\hat V(t)$, and the perturbation theory can be based on it.
These are the main arguments in support of our work in the interaction representation.
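To see what the operator $\hat V(t)$ looks like in practice, it may help to write its matrix elements in the basis of eigenstates of $\hat H_0$. The short worked step below assumes a discrete spectrum $\hat H_0|n\rangle = E_n|n\rangle$, introduced here only for illustration.

```latex
% Basis of eigenstates of H_0 (assumed discrete):  H_0 |n> = E_n |n>
V_{mn}(t) \equiv \langle m|\,\hat U_0^{-1}(t)\,\hat V\,\hat U_0(t)\,|n\rangle
          = e^{iE_m t}\,\langle m|\hat V|n\rangle\,e^{-iE_n t}
          = e^{i(E_m - E_n)t}\,V_{mn}.
% Example: two levels E_1, E_2 and V = g sigma_x give
\hat V(t) = g\begin{pmatrix} 0 & e^{i(E_1 - E_2)t} \\ e^{-i(E_1 - E_2)t} & 0 \end{pmatrix}.
```

Each matrix element of the perturbation only acquires an oscillating phase with the unperturbed frequency $E_m - E_n$; all the nontrivial dynamics is carried by $\Phi(t)$ through Eq. (6).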
§ 10. Invariant perturbation theory

In the previous section we obtained the formal solution of the Schrödinger equation $i\,\partial\Psi(t)/\partial t = \hat H\Psi(t)$ in the compact form
$$\Psi(t) = e^{-i\hat H t}\,\Psi(0), \tag{10.1}$$
using the decomposition of $\Psi(0)$ over stationary states. Now we'd like to try to obtain the same result in another way. Not as an additional proof (this equation is already proved), but to map out our future path in the field theory. If we have the wave function at the initial time $t = 0$, how can we obtain it at the final time $t$? The answer is known, but we could act as follows: the interval from $0$ to $t$ is divided into $n$ small sections. Let us consider the small section from the time $t_\alpha$ up to $t_\alpha + \delta t_\alpha$. In this interval, we can apply the formula (1), too. Then this will be as follows: the wave function at the next moment is exactly the development operator for this small section, multiplied by the wave function $\Psi(t_\alpha)$ at the time when we started:
$$\Psi(t_\alpha + \delta t_\alpha) = e^{-i\hat H\,\delta t_\alpha}\,\Psi(t_\alpha).$$
It is clear. Now we can start repeating this procedure. As a result, the initial wave function will be under the action of the product of these development operators:
$$\Psi(t) = \prod_\alpha e^{-i\hat H\,\delta t_\alpha}\,\Psi(0).$$
We are lucky: all operators in this product are expressed via the same operator $\hat H$, and hence the operators $e^{-i\hat H\,\delta t_\alpha}$ and $e^{-i\hat H\,\delta t_\beta}$ commute with each other. And thus the product of the exponents will be reduced to an exponent with the sum. We already know into what this sum will turn in the limit:
$$\prod_\alpha e^{-i\hat H\,\delta t_\alpha} = e^{-i\hat H\sum_\alpha \delta t_\alpha} = e^{-i\hat H t} = \hat U(t). \tag{10.2}$$
So, using a totally different procedure, we get exactly the same answer. Nothing new, except that we have a new procedure. This new procedure can be applied to the case where we are dealing not with the Schrödinger equation, but with the equation $i\,\partial\Phi(t)/\partial t = \hat V(t)\,\Phi(t)$ for the wave function $\Phi(t)$ in the interaction representation. The interval from the initial time $t_i$ to the final time $t_f$ is divided into $n$ small sections. As before, we obtain
$$\Phi(t_\alpha + \delta t_\alpha) = e^{-i\hat V(t_\alpha)\,\delta t_\alpha}\,\Phi(t_\alpha).$$
Repeating this procedure, we get
$$\Phi(t_f) = \prod_\alpha e^{-i\hat V(t_\alpha)\,\delta t_\alpha}\,\Phi(t_i). \tag{10.3}$$
But now we have another equation, which is nastier. Why? It is very significant that in the previous case the development operator included a Hamiltonian which is time-independent. That is why the individual factors of the product (2) commute. Now we face a much more complex situation in which the operator depends on the time. For this reason, the operators $\hat V(t_\alpha)$ and $\hat V(t_\beta)$, taken at the different moments $t_\alpha$ and $t_\beta$, do not commute.
The further transition, however, becomes problematic: the step that was absolutely obvious in the previous case is impossible here. As usual, if we want to get a nice result, we can try to create some formal procedure. Then we'll need to tell what the procedure means, though. Let's create such a nice procedure, the time ordering procedure. Let us have some operators $\hat A(t_1)$ and $\hat A(t_2)$ at different times $t_1$ and $t_2$. These times are not necessarily in ascending or descending order. Let's create an operator that will automatically order them in a certain chronological sequence. Such a chronological operator is assumed to act on them as follows:
$$\hat T\{\hat A(t_1)\,\hat A(t_2)\} = \begin{cases} \hat A(t_1)\,\hat A(t_2) & \text{if } t_1 > t_2, \\ \hat A(t_2)\,\hat A(t_1) & \text{if } t_2 > t_1. \end{cases}$$
The definition, unlike its implementation, is very simple and clear. That will help us in the following: since the answer is always going to be the same for any arrangement of the operators $\hat A(t_k)$ under the sign of $\hat T$, we can start from any arrangement. This will be very helpful, since we can rearrange these factors under the sign of the operator $\hat T$ as much as we want. Using this property of the $\hat T$ operator, we can rewrite (3) in the form
$$\prod_\alpha e^{-i\hat V(t_\alpha)\,\delta t_\alpha} = \hat T e^{-i\sum_\alpha \hat V(t_\alpha)\,\delta t_\alpha} = \hat T e^{-i\int_{t_i}^{t_f}\hat V(t)\,dt} = \hat U(t_f, t_i).$$
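Why the ordering matters can be seen on a toy example with $2\times2$ matrices. The sketch below assumes a piecewise-constant perturbation, $\hat V = \sigma_x$ for $0 \le t < 1$ and $\hat V = \sigma_z$ for $1 \le t < 2$ (an illustrative choice, not from the lectures), and uses the identity $e^{-i\theta s} = \cos\theta - i s\sin\theta$, valid for any matrix with $s^2 = 1$.

```python
import math

ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_pauli(theta, s):
    """exp(-i*theta*s) for a 2x2 matrix s with s*s = 1."""
    return [[math.cos(theta) * ID[i][j] - 1j * math.sin(theta) * s[i][j]
             for j in range(2)] for i in range(2)]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Chronological product: the factor for the later interval stands on the left
ordered = matmul(exp_pauli(1.0, SZ), exp_pauli(1.0, SX))
# The same factors in the wrong order
wrong_order = matmul(exp_pauli(1.0, SX), exp_pauli(1.0, SZ))

# Naive exponent of the integral without time ordering:
# exp(-i (sx + sz)) = exp(-i sqrt(2) N), N = (sx + sz)/sqrt(2), N*N = 1
N = [[(SX[i][j] + SZ[i][j]) / math.sqrt(2) for j in range(2)] for i in range(2)]
naive = exp_pauli(math.sqrt(2), N)

print(max_diff(ordered, naive), max_diff(ordered, wrong_order))
```

Both differences are of order one, so neither the unordered exponent nor the reversed product reproduces the development operator; only the chronologically ordered product does.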
This is a rather formal expression. It can be given a constructive meaning in this way: we remember that $\hat V(t)$ is a small quantity, and thus this exponent can be expanded into a series which includes expressions like this:
$$\left(\int_{t_i}^{t_f}\hat V(t)\,dt\right)^n.$$
The integral to the power $n$ can be represented as the product of these integrals,
$$\hat U(t_f, t_i) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\;\hat T\int_{t_i}^{t_f}\hat V(t_1)\,dt_1 \ldots \int_{t_i}^{t_f}\hat V(t_n)\,dt_n,$$
and the operator $\hat T$ will put them in the correct order. And the integration is performed only so that these times are in the correct order. This expression looks nice already. We assume that the particles were at some initial moment in negative infinity and that after the interaction there is a final moment in positive infinity. At the initial and final stages, the particles will be separated by large distances, and therefore they will not interact and can be described as quanta of Heisenberg free fields. Then we can send $t_f$ to infinity and $t_i$ to negative infinity, and as a result we obtain the operator of development: the state in positive infinity can be expressed via the state in negative infinity using this operator. We'll call this development operator the $\hat S$ (scattering) operator:
$$\hat U(t_f, t_i) \to \hat S = \hat T e^{-i\int\hat V(t)\,dt} = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\;\hat T\int\hat V(t_1)\,dt_1 \ldots \int\hat V(t_n)\,dt_n. \tag{10.4}$$
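The statement that the integration is effectively performed over chronologically ordered times can be made explicit for the second-order term. The following worked step is a sketch that uses only the definition of $\hat T$ for two operators given above.

```latex
% Second-order term: split the square t_i < t_1, t_2 < t_f into two triangles
\hat T\int_{t_i}^{t_f}\!dt_1\int_{t_i}^{t_f}\!dt_2\,\hat V(t_1)\hat V(t_2)
 = \int_{t_i}^{t_f}\!dt_1\int_{t_i}^{t_1}\!dt_2\,\hat V(t_1)\hat V(t_2)
 + \int_{t_i}^{t_f}\!dt_2\int_{t_i}^{t_2}\!dt_1\,\hat V(t_2)\hat V(t_1)
 = 2\int_{t_i}^{t_f}\!dt_1\int_{t_i}^{t_1}\!dt_2\,\hat V(t_1)\hat V(t_2).
% Hence the n = 2 term of the series is
\frac{(-i)^2}{2!}\,\hat T\!\int\!\!\int \hat V(t_1)\hat V(t_2)\,dt_1 dt_2
 = (-i)^2\int_{t_i}^{t_f}\!dt_1\int_{t_i}^{t_1}\!dt_2\,\hat V(t_1)\hat V(t_2).
```

The two triangles give equal contributions after renaming the integration variables, so the factor $1/2!$ is cancelled by the number of orderings; in the $n$-th order the factor $1/n!$ is cancelled by the $n!$ orderings in the same way.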
Unfortunately, calculations according to this scheme will be a rather cumbersome procedure, i. e. we'll always need to integrate so that the next time is later than the previous one. Even in the case of the second order, there arise some difficulties. They can be overcome, but when it comes to the third or fourth order, it will be quite a cumbersome procedure. Despite this, the mode of action is quite clear. This operator must be understood in this sense. So, this is a nice expression, compact, unitary and so on. But its real meaning is that we have a sum, which is actually quite tricky. Moreover, when we analyze the first and the second orders, everything will be nice and smooth. However, when we move on to a higher order, we'll need to deal with the divergences occurring there. It will be a big problem. For the time being, the mode of action has been described and we've got a tool. Then an individual summand in this sum can be presented in a way facilitating memorization of the calculation rules. This presentation is best done with the so-called Feynman diagrams, which are one of the ultimate goals of our course. We need to learn to use the Feynman rules and the Feynman diagrams for the calculation of specific processes. But the essence is enclosed here. From this standpoint, the problem was solved in principle. However, the devil is in the details, and there will be a lot of devilish tricks.

Now we can proceed to the calculations. Let us have some initial state $|i\rangle$ associated with the field state $\Phi(t = -\infty)$, and let it go into a final state $|f\rangle$ associated with the field state $\Phi(t = +\infty)$. We need to act on the initial state with the development operator, and it will go into some state $\hat S|i\rangle$. To find out whether it went to a specific final state $|f\rangle$, we need to project the state $\hat S|i\rangle$ onto the final state. This will be the matrix element itself, and a set of such matrix elements makes up the $S$ matrix:
$$S_{fi} = \langle f|\,\hat S\,|i\rangle.$$
Let us consider the case of Quantum Electrodynamics (QED), which describes the interaction of the electron-positron Dirac field $\Psi(x)$ with the electromagnetic field $A_\mu(x)$.
The corresponding Lagrangian can be obtained from the Lagrange density of the free Dirac field (see §8.3) by the replacement $\hat p^\mu = i\partial^\mu \to i\partial^\mu - eA^\mu$, where $e = -|e|$ is the electron charge:
$$\mathcal L = \frac{1}{2}\left\{\bar\Psi\gamma_\mu\,(i\partial^\mu - eA^\mu)\,\Psi + \left[(-i\partial^\mu - eA^\mu)\,\bar\Psi\right]\gamma_\mu\Psi\right\} - m\bar\Psi\Psi.$$
Therefore, the electromagnetic interaction is described by the Lagrange density of the form
$$\mathcal L_I = -e\,\bar\Psi(x)\gamma_\mu\Psi(x)\,A^\mu(x).$$
But the perturbation operator $\hat V$ is a small addition to the Hamiltonian of free fields. As you know, the small additions in the Hamiltonian and Lagrangian differ in sign. As a result, the operator $\hat V(t)$ in QED reads
$$\hat V(t) = -\int\hat{\mathcal L}_I\,d^3r = \int\hat V(x)\,d^3r,$$
where
$$\hat V(x) = e\,\hat{\bar\Psi}(x)\,\gamma_\mu\,\hat\Psi(x)\,\hat A^\mu(x). \tag{10.5}$$
Therefore, the $\hat S$ operator
$$\hat S = \hat T e^{-i\int\hat V(x)\,d^4x} = \hat T e^{-ie\int\hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)\hat A^\mu(x)\,d^4x} \tag{10.6}$$
is a unitary, relativistically invariant operator. Since the constant of the electromagnetic interaction is small, $|e| = \sqrt{\alpha} \ll 1$, the perturbation theory is a very effective method for calculations in QED.
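To get a feeling for how small the expansion parameter is, one can simply evaluate it. The sketch below assumes the approximate value $\alpha \approx 1/137.036$ for the fine-structure constant, which is not quoted in the text above, and uses the fact visible from Eq. (6) that the $n$-th order term of the series for $\hat S$ carries the factor $e^n$.

```python
import math

ALPHA = 1 / 137.036          # assumed approximate fine-structure constant
e_abs = math.sqrt(ALPHA)     # |e| = sqrt(alpha) in the units used in the text


def amplitude_suppression(n):
    """Relative size e^n of the coupling factor in the n-th order term of S."""
    return e_abs ** n


for order in (1, 2, 3, 4):
    print(order, amplitude_suppression(order))
```

Each extra order suppresses the amplitude by $|e| \approx 0.085$, and each extra pair of orders by $\alpha$, which is why low-order calculations are already quite accurate in QED.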
This is one example we'll work with. It is useful to consider a couple of simple examples for a more concrete perception of the field theory.

1) The interaction of the complex scalar field $\varphi(x)$ and the real scalar field $\Phi(x)$. In QED we consider the interaction of the spinor Dirac field and the vector electromagnetic field. There is an option in which this looks much simpler; for example, if we take the scalars $\varphi(x)$ and $\varphi^*(x)$ instead of the spinors corresponding to charged particles. Instead of a neutral electromagnetic field, we can take a neutral field $\Phi(x)$ of scalar particles. Which quantity will then be an analogue of this interaction? It must be a scalar, and it must be constructed from these fields. It can be constructed in approximately the same way, i. e. the conjugate field $\varphi^*(x)$ is multiplied by the nonconjugate one $\varphi(x)$ and by $\Phi(x)$. The structure will be the same, i. e. two charged fields and one neutral field. In QED the operator $\hat V(x)$ was a scalar, because $\hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)$ is a 4-vector, $\hat A^\mu(x)$ is also a 4-vector, and the result is a scalar product of these 4-vectors. Now we also have a scalar, because all three fields themselves are scalars. The electron charge $e$ in QED can be replaced by some charge $g$ in this example:
$$\hat V(x) = g\,\hat\varphi^+(x)\,\hat\varphi(x)\,\hat\Phi(x). \tag{10.7}$$
It is a simplified model of electrodynamics. It is an easy starting point, because there are no complications associated with spinors or with the vector operator of the electromagnetic field. We'll study the basic ideas on this model. We can take a more intricate model, e.g. let the same set of $\Psi(x)$ and $\bar\Psi(x)$ describe a charged particle, but use a neutral scalar field $\Phi(x)$ instead of the vector electromagnetic field.

2) The interaction of the spinor Dirac field $\Psi(x)$ and the real scalar field $\Phi(x)$. Since we have a scalar $\Phi(x)$ here, it should also be multiplied by a scalar. The spinor fields $\bar\Psi(x)$ and $\Psi(x)$ make a scalar in only one way: the product of $\bar\Psi(x)$ by $\Psi(x)$. Therefore,
$$\hat V(x) = g\,\hat{\bar\Psi}(x)\,\hat\Psi(x)\,\hat\Phi(x), \tag{10.8}$$
where $g$ is another coupling constant. It will also be a simplified model of electrodynamics, but it is more complicated than the previous one, because of the spinors. And this model is quite realistic, because $\Phi(x)$ can describe the Higgs boson, which has been recently discovered, and $\Psi(x)$ can describe leptons, e.g. electrons and positrons. Then we can use this model, provided we know the constant $g$, to describe the Higgs boson decay into electrons and positrons or, on the contrary, the production of the Higgs boson in collisions of electrons and positrons. We'll do that and try to understand why colliding muon beams are preferable for this process to electron-positron ones.
§ 11. Probability amplitudes and transition probability

Now we have the calculation scheme. But we still need to connect our probability amplitudes, expressed in terms of the elements of the $S$ matrix, with the observable quantities. This section is purely kinematic, and if you understand it well, it will remain just a minor piece of kinematics. However, if you fail to comprehend this section, every time you calculate something you will stumble over this kinematic threshold and painfully recall the required mode of action.
11.1. The scattering amplitude

First, the overall picture (Fig. 5). Let us have an initial state $|i\rangle$ with some number of particles, which are free particles. Each of the particles is characterized by its own 4-momentum: $p_1, p_2, \ldots, p_n$. The total 4-momentum of the initial state is $P_i = \sum_i p_i$. They approach one another, collide, and interact. Some of them annihilate, and some are produced, and there arises a certain number of final particles with the total 4-momentum $P_f = \sum_f p'_f$, which form the final state $|f\rangle$. Their number is different; their momenta are different. Everything that happens is described, of course, by the development operator $\hat S$.

Figure 5: Transition $|i\rangle \to |f\rangle$

The probability amplitude for such a transition, $\langle f|\,\hat S\,|i\rangle = S_{fi}$, is the matrix element of the $\hat S$ operator, which has the known form in the perturbation theory
$$\hat S = \hat T e^{-i\int\hat V(x)\,d^4x},$$
where $\hat V(x)$ is the perturbation operator. We can have a transition from an initial state to the same state, which is quite trivial and uninteresting to us. For this reason, let's separate this contribution $\delta_{fi}$, corresponding to the absence of interaction in the scattering, from the total matrix element. Let's designate the remaining part with the letter $T_{fi}$. We know that we have the conservation law, i. e. the sum of all the 4-momenta over the initial particles $P_i$ shall equal the sum of all the 4-momenta over the final particles $P_f$. We take this law into account by introducing the delta function $\delta(P_i - P_f)$ with the factor $(2\pi)^4$. The imaginary unit is often introduced in the definition. As a result,
$$S_{fi} = \delta_{fi} + i\,(2\pi)^4\,\delta(P_i - P_f)\,T_{fi}. \tag{11.1}$$
The question is how one can calculate the corresponding transition probability. We have to take the probability amplitude squared, which is a very simple thing. If these states were discrete (from initial discrete states to final discrete states), this would give us the probability. In fact, the final states are associated with free particles which are characterized by a continuous distribution of their momenta. Therefore multiplication by the number of final states is required. For some particle with momentum $\mathbf p'_f$ the number of final states $dn_f$ is the usual volume $V$ multiplied by an element of the momentum space $d^3p'_f$ and divided by $(2\pi\hbar)^3$, where $\hbar = 1$. Then we need to multiply these phase volumes over all final particles. The result will be the probability of transition from an initial state to a final state. If we are interested in the case of real scattering, when the initial and final states are different, then the first summand $\delta_{fi}$ in (1) will disappear. Let the initial state correspond to the time $t_i$ and the final state to $t_f$ (then $t_i \to -\infty$, $t_f \to \infty$), and let $T = t_f - t_i$; then the transition probability per unit time (at $|i\rangle \neq |f\rangle$) is
$$d\dot W_{i\to f} = \frac{|S_{fi}|^2}{T}\prod_f dn_f = \left[(2\pi)^4\,\delta(P_i - P_f)\right]^2\frac{|T_{fi}|^2}{T}\prod_f\frac{V\,d^3p'_f}{(2\pi)^3}.$$
In this expression there is such a nasty thing as the square of the $\delta$ function. What is it? We can write it this way: one $\delta$ function times the second $\delta$ function. Then we have to require the argument of the second $\delta$ function to be necessarily zero (this requirement arises from the fact that we have the first $\delta$ function with the same argument):
$$(2\pi)^4\left[\delta(P_i - P_f)\right]^2 = \delta(P_i - P_f)\cdot(2\pi)^4\,\delta(0).$$
An ordinary $\delta$ function is $\int e^{-i(P_f - P_i)x}\,d^4x/(2\pi)^4$. If we want to have no infinity, let's try to say that the integration is over a finite volume $V$, as we have always done, and over a finite time interval $T$. All this is to be done when $P_i - P_f = 0$, but then the integrand is 1, and we get the answer
$$(2\pi)^4\,\delta(0) = \int_{t_i}^{t_f}dt\int_V d^3r\;e^{i(P_i - P_f)x}\Big|_{P_i = P_f} = VT.$$
This shows that our probability will increase in proportion to the time $T$, which is why we are interested in the probability per unit time. Finally, we get
$$d\dot W_{i\to f} = (2\pi)^4\,\delta(P_i - P_f)\,V\,|T_{fi}|^2\prod_f\frac{V\,d^3p'_f}{(2\pi)^3}.$$
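The number of states $dn_f = V\,d^3p'_f/(2\pi)^3$ is in accordance with the normalization of one particle per volume $V$. It corresponds to the wave function
$$\frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}} \ \text{ for a scalar particle}, \qquad \sqrt{4\pi}\;e^\mu_{\mathbf k\lambda}\,\frac{e^{-ikx}}{\sqrt{2\omega_{\mathbf k}V}} \ \text{ for a photon}, \qquad \frac{e^{-ipx}}{\sqrt{2\varepsilon_{\mathbf p}V}}\,u_{\mathbf p\sigma} \ \text{ for an electron}.$$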
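We can conclude from this expression that it is convenient to isolate the factors $1/\sqrt{2\varepsilon_p V}$ for all particles from the amplitude $T_{fi}$ and to introduce a new quantity $M_{fi}$, which is called the scattering amplitude:
$$
S_{fi} = \delta_{fi} + i\,(2\pi)^4\,\delta(P_i - P_f)\,M_{fi}\prod_{i,f}\frac{1}{\sqrt{2\varepsilon_p V}}, \tag{11.2}
$$
where the product runs over all initial and final particles. As a result,
$$
d\dot W_{i\to f} = (2\pi)^4\,\delta(P_i - P_f)\,V\,\frac{|M_{fi}|^2}{\prod_i 2\varepsilon_i V}\prod_f \frac{d^3p'_f}{2\varepsilon'_f\,(2\pi)^3}. \tag{11.3}
$$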
The auxiliary notion, the volume $V$ in which the process occurs, remains so far. It is clear beforehand that this volume should disappear from the physical quantities, but for the time being we have to keep it.

Additional remark. We need to stipulate the following, though. Of course, the volume $V$ will disappear if we consider an idealized set-up of the problem, i.e. the initial state is a set of plane waves, where a plane wave is a flow filling the entire space. The same idealization is made for the final states. Real experiments, however, take place in a confined space and within a limited time, and sometimes that needs to be taken into account, in particular when high orders of the perturbation theory are calculated and it is necessary to integrate over distances comparable with the size of our equipment. Thus, a specific volume and a specific formulation may need to be included in the final result. This became apparent here at the Budker Institute of Nuclear Physics in detailed research on the bremsstrahlung process in 1980–1981. The cross section for this process (an electron is scattered on a positron and emits a photon) has been known since 1934. And suddenly it became clear that it would be wrong to calculate this cross section as Bethe and Heitler (the discoverers of the corresponding formula) did, assuming plane waves only. In the region of small photon energies it is necessary to take into account that the initial beam is not a plane wave but a wave packet, which results in a noticeable difference of 30 % as compared with the plane-wave approach. We will discuss this problem further in §13.2.

So, we have obtained the result for arbitrary transition probabilities. Now let's consider the two most interesting cases: (i) when there is one particle in the beginning and it decays, and (ii) when two particles collide in the beginning and produce some final state. These cases are considered in the next two subsections.
11.2. The decay width

The simplest set-up of the problem is the one in which there is only one particle and it decays. Such a particle with 4-momentum $p$ decays into a number of particles with 4-momenta $p'_f$. Experimentalists are interested in the decay probability of the particle per unit time, i.e. the decay width $d\Gamma_{i\to f}$. So we adapt our formula to this formulation of the problem. We still have the transition probability per unit time, only now it is called the decay width:
$$
d\dot W_{i\to f} \equiv d\Gamma_{i\to f} = (2\pi)^4\,\delta(p - P_f)\,|M_{fi}|^2\,\frac{V}{2\varepsilon V}\prod_f \frac{d^3p'_f}{2\varepsilon'_f\,(2\pi)^3}. \tag{11.4}
$$
As we expected, the auxiliary notion, the volume $V$, disappears from this expression. Note that this is the decay width of the particle into this particular final state, because a decaying particle can decay through various channels. In the usual formulation this quantity has the dimension of probability per second, i.e. 1/s. In our relativistic units the same quantity has the dimension of energy, i.e. it can be measured either in 1/s or in energy units: eV, keV, MeV and so on.

Figure 6: Kinematics of decay

In the case (Fig. 6) when a particle of mass $m$ is initially at rest and then decays into two final particles with energies $\varepsilon'_1 + \varepsilon'_2 = m$ and momenta $\mathbf p'_1 = -\mathbf p'_2$, this formula simplifies. I ask you to try to do that by yourselves. It requires accuracy, but after that you'll get used to how four-dimensional δ functions are eliminated by integration over three-dimensional variables:
$$
d\Gamma_{i\to f} = \frac{|M_{fi}|^2}{32\pi^2 m^2}\,|\mathbf p'_1|\,d\Omega'_1. \tag{11.5}
$$
Later on we’ll need this formula for specific and interesting calculations. In particular, we’ll try to calculate the probability of the Higgs boson decay into two leptons.
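For readers who want to compare with their own attempt, the reduction of (11.4) to (11.5) can be sketched as follows. This is only a sketch in the rest frame of the decaying particle ($\varepsilon = m$, $\mathbf p = 0$); the final masses $m_1$, $m_2$ with $\varepsilon'_{1,2} = \sqrt{\mathbf p'^2_1 + m_{1,2}^2}$ are symbols introduced here for illustration, and $|M_{fi}|^2$ is treated as a smooth function evaluated at the point fixed by the δ functions.

```latex
\begin{aligned}
d\Gamma_{i\to f}
 &= (2\pi)^4\,\delta(m-\varepsilon'_1-\varepsilon'_2)\,
    \delta^{(3)}(\mathbf p'_1+\mathbf p'_2)\,
    \frac{|M_{fi}|^2}{2m}\,
    \frac{d^3p'_1\,d^3p'_2}{2\varepsilon'_1\,2\varepsilon'_2\,(2\pi)^6} \\[4pt]
 &\;\xrightarrow{\;\int d^3p'_2\;}\;
    \frac{|M_{fi}|^2}{32\pi^2\,m\,\varepsilon'_1\varepsilon'_2}\,
    \delta(m-\varepsilon'_1-\varepsilon'_2)\,
    |\mathbf p'_1|^2\,d|\mathbf p'_1|\,d\Omega'_1 , \\[4pt]
\frac{d(\varepsilon'_1+\varepsilon'_2)}{d|\mathbf p'_1|}
 &= \frac{|\mathbf p'_1|}{\varepsilon'_1}+\frac{|\mathbf p'_1|}{\varepsilon'_2}
  = \frac{|\mathbf p'_1|\,m}{\varepsilon'_1\varepsilon'_2}
 \quad\Longrightarrow\quad
 \int \delta(m-\varepsilon'_1-\varepsilon'_2)\,|\mathbf p'_1|^2\,d|\mathbf p'_1|
  = \frac{|\mathbf p'_1|\,\varepsilon'_1\varepsilon'_2}{m}, \\[4pt]
d\Gamma_{i\to f}
 &= \frac{|M_{fi}|^2}{32\pi^2 m^2}\,|\mathbf p'_1|\,d\Omega'_1 .
\end{aligned}
```

The three-dimensional δ function removes the integration over $\mathbf p'_2$, and the remaining energy δ function removes the integration over $|\mathbf p'_1|$ at the price of the Jacobian in the third line.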
11.3. Cross section

Let us now consider the case when a pair of colliding particles produces some number of final particles (Fig. 7). Of course, the formula for the probability is the same, but it has to be specified for the initial states. So, the transition probability per unit time for such a reaction looks like this:
$$
d\dot W_{i\to f} = (2\pi)^4\,\delta(p_1 + p_2 - P_f)\,\frac{|M_{fi}|^2\,V}{2\varepsilon_1 V\,2\varepsilon_2 V}\prod_f \frac{d^3p'_f}{2\varepsilon'_f\,(2\pi)^3}.
$$

Figure 7: Kinematics of scattering

However, experimentalists are mostly interested not in the probability of transition but in the cross section, which is defined as
$$
d\sigma = \frac{d\dot W_{i\to f}}{j},
$$
where $j$ is the flux density of the colliding particles. Why is that so? If flows of two particles collide, then the higher the intensity of the flow, the more events occur and the higher the probability per unit time. Therefore experimentalists prefer to divide the probability $d\dot W_{i\to f}$ by the flux density of the colliding particles; the resulting quantity no longer depends on the intensity of the flow but only on the properties of the interacting particles. So it reflects the primary information.

Let's try to calculate the flux. It is convenient to start in the centre-of-mass system, in which the momentum of the second particle is equal to the momentum of the first one with the minus sign, $\mathbf p_2 = -\mathbf p_1$. What is the flux in this case? Let us consider the first particle; its flux is, like in hydrodynamics, the velocity $v_1 = |\mathbf p_1|/\varepsilon_1$ multiplied by the density $1/V$ (one particle in the entire volume). Then we should add the same flux of the second particle flying in the opposite direction:
$$
j = \frac{v_1}{V} + \frac{v_2}{V} = \frac{1}{V}\left(\frac{|\mathbf p_1|}{\varepsilon_1} + \frac{|\mathbf p_2|}{\varepsilon_2}\right) = \frac{|\mathbf p_1|\,(\varepsilon_1 + \varepsilon_2)}{V\,\varepsilon_1\varepsilon_2}.
$$
As was expected, the volumes $V$ disappear from the cross section:
$$
d\sigma = (2\pi)^4\,\delta(p_1 + p_2 - P_f)\,\frac{|M_{fi}|^2}{4I}\prod_f \frac{d^3p'_f}{2\varepsilon'_f\,(2\pi)^3}, \tag{11.6}
$$
where $I = |\mathbf p_1|\,(\varepsilon_1 + \varepsilon_2)$. The quantity $I$ can be expressed via the scalar product of the two initial 4-momenta,
$$
I = |\mathbf p_1|\,(\varepsilon_1 + \varepsilon_2) = \sqrt{(p_1 p_2)^2 - m_1^2 m_2^2}, \tag{11.7}
$$
and it is called the Møller invariant. Eq. (6) also has a simplified form when two initial particles turn into two final particles, i.e. for a $2 \to 2$ reaction $p_1 + p_2 \to p'_1 + p'_2$. Processes of this type are of great interest; special Mandelstam variables are even introduced for them. However, all this will come a little bit later. For this specific case we have (in the centre-of-mass system)
$$
d\sigma = \left|\frac{M_{fi}}{8\pi\,(\varepsilon_1 + \varepsilon_2)}\right|^2 \frac{|\mathbf p'_1|}{|\mathbf p_1|}\,d\Omega'_1. \tag{11.8}
$$
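The identity (11.7) is easy to check numerically in the centre-of-mass frame. The following is a minimal sketch; the function name `moller_invariant_cm` and the sample values of momentum and masses are illustrative assumptions, not taken from the lectures.

```python
import math

def moller_invariant_cm(p_abs, m1, m2):
    """Return both sides of Eq. (11.7) in the centre-of-mass frame.

    p_abs is |p1| = |p2|; m1 and m2 are the masses of the colliding particles.
    """
    e1 = math.sqrt(p_abs**2 + m1**2)
    e2 = math.sqrt(p_abs**2 + m2**2)
    # 4-vector product p1.p2 = e1*e2 - (p1_vec . p2_vec), with p2_vec = -p1_vec
    p1p2 = e1 * e2 + p_abs**2
    lhs = p_abs * (e1 + e2)                         # |p1| (e1 + e2)
    rhs = math.sqrt(p1p2**2 - (m1 * m2)**2)         # sqrt((p1 p2)^2 - m1^2 m2^2)
    return lhs, rhs

# Hypothetical sample values in arbitrary energy units
lhs, rhs = moller_invariant_cm(p_abs=3.0, m1=0.5, m2=1.2)
print(lhs, rhs)
```

The right-hand side is built from an invariant, so the same number is obtained in any other frame, e.g. in the rest frame of the second particle, where it equals $m_2|\mathbf p_1|$.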
§12. The first order of the perturbation theory

So, the basic preparation for calculations within the invariant perturbation theory has been done: we know how the probability amplitude is calculated and how this probability amplitude is connected with the scattering amplitude. Finally, we know how the scattering amplitude is connected with the decay widths and cross sections. Let's start calculating. All the basic ideas of the calculation, and of its graphic expression, i.e. the Feynman diagrams, can be grasped on a very simple example, when we analyze the lowest nontrivial orders of the perturbation theory. So, let's start from the first order of the perturbation theory.
12.1. Interaction $g\,\hat\varphi^+\hat\varphi\,\hat\Phi$

To avoid complications associated with the spins of particles, it is convenient to start with scalars. In this sense, the easiest option is the following. We consider the interaction (10.7) of particles of a charged complex field $\varphi(x)$ and a neutral scalar field $\Phi(x)$. The $\hat S$ operator in the first order has the form
$$
\hat S^{(1)} = -ig\int d^4x\;\hat\varphi^+(x)\,\hat\varphi(x)\,\hat\Phi(x). \tag{12.1}
$$
Since there are no different times, we can omit the chronological operator $\hat T$. This simplification, though, reduces the opportunity to show the work of the perturbation theory in full. In the second order, however, we will see new features associated with the need for a full-scale treatment of the chronological operator.

Let us consider the field operator
$$
\hat\varphi(x) = \sum_{p}\left(\hat a_p\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_p V}} + \hat b^+_p\,\frac{e^{ipx}}{\sqrt{2\varepsilon_p V}}\right). \tag{12.2}
$$
What does this operator describe? What can be done with its help? This operator can annihilate a particle and produce an antiparticle, since it contains the operators $\hat a$ and $\hat b^+$. It is very convenient to invent a mnemonic rule for these actions. Let the time flow from left to right. As for the particle, the operator $\hat a$ corresponds to its annihilation, i.e. the particle existed and then it was annihilated. The exponent $e^{-ipx}$ corresponds to the usual quantum-mechanical plane wave. Thus, the term corresponding to annihilation of a particle with 4-momentum $p$ (the operator $\hat a_p$) can be portrayed as a line which ends at the point $x$.

[Diagram: a line labelled $p$ ending at the point $x$]

On the other hand, if we turn to the antiparticle with 4-momentum $p$ (the operator $\hat b^+_p$), how shall we portray it? This is a particle that appeared. It is produced at this point and propagates further in time; it appeared, not disappeared. On the other hand, the corresponding exponent $e^{+ipx}$ differs in sign from the previous one. Figuratively speaking, we can say that an antiparticle is a particle propagating backwards in time. So we'll designate it with an arrow labelled $(-p)$, to remember that this exponent has the opposite sign.

[Diagram: a line labelled $(-p)$ attached to the point $x$, with the arrow pointing backwards in time]
For the sake of brevity, let's designate these particles as $\pi^\mp$. Similarly to the choice of the electron in the more complex theory of electrodynamics, let's choose the negatively charged particle (which is described by the operators $\hat a^+$ and $\hat a$) to be the particle. Since it is a scalar, we designate it with the letter $\pi^-$. Respectively, the antiparticle, which is described by the operators $\hat b^+$ and $\hat b$, will be designated, with the same degree of conditionality, as $\pi^+$. The field
$$
\hat\varphi^+(x) = \sum_{p}\left(\hat a^+_p\,\frac{e^{ipx}}{\sqrt{2\varepsilon_p V}} + \hat b_p\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_p V}}\right) \tag{12.3}
$$
contains terms corresponding to creation of a particle with 4-momentum $p$ (the operator $\hat a^+_p$)

[Diagram: a line labelled $p$ starting at the point $x$]

or annihilation of an antiparticle (the operator $\hat b_p$).

[Diagram: a line labelled $(-p)$ attached to the point $x$]

Now our charged particles $\pi^-$ and $\pi^+$ interact with another scalar neutral particle $\pi^0$, which is described by a real scalar field $\hat\Phi(x)$. As usual, we decompose this field operator over plane waves with exponents $e^{-ikx}$ and $e^{ikx}$, accompanied, respectively, by the annihilation and creation operators. Until now we always designated them with the letters $\hat a_k$ and $\hat a^+_k$. Now, for the sake of clarity, we have either to say that $\hat a_p$ refers to $\pi^-$ while $\hat a_k$ refers to $\pi^0$, or to come up with another letter, say the letter $c$. Let $\hat c_k$ be the annihilation and $\hat c^+_k$ the creation operators for $\pi^0$:
$$
\hat\Phi(x) = \sum_{k}\left(\hat c_k\,\frac{e^{-ikx}}{\sqrt{2\varepsilon_k V}} + \hat c^+_k\,\frac{e^{ikx}}{\sqrt{2\varepsilon_k V}}\right). \tag{12.4}
$$
This field contains terms corresponding to annihilation of a neutral particle with 4-momentum $k$ (the operator $\hat c_k$)

[Diagram: a neutral-particle line labelled $k$ ending at the point $x$]

or creation of the same neutral particle with 4-momentum $k$ (the operator $\hat c^+_k$).

[Diagram: a neutral-particle line labelled $k$ starting at the point $x$]
We know that the real interaction among pions is strong, but here we discuss a perturbation theory. So all this is conditional and hypothetical; but if we want to understand the structure of quantum electrodynamics, let's mentally associate this scalar $\pi^0$ with a photon. We know that the masses of particles and antiparticles are the same, $m_+ = m_-$. The mass of $\pi^0$, as you know, is experimentally close to the mass of $\pi^\pm$, but since the name $\pi^0$ is conditional, we can consider the ratio of $m_0$ and $m_+ = m_-$ as arbitrary. As a result, the operator $\hat S^{(1)}$ can describe the processes
$$
\pi^\pm \to \pi^\pm + \pi^0,\qquad \pi^\pm + \pi^0 \to \pi^\pm,\qquad \pi^0 \to \pi^+ + \pi^-,\qquad \pi^+ + \pi^- \to \pi^0.
$$

Example 1. Let's consider the decay $\pi^- \to \pi^- + \pi^0$ (Fig. 8).

Figure 8: $\pi^- \to \pi^- + \pi^0$ decay

We have to say immediately that such a process is impossible. Why so? Because of the conservation law. Indeed, in this process a $\pi^-$ at rest turns into two particles. Even if they fly away with zero speeds, the initial energy will not equal the final energy, because of the nonzero mass of $\pi^0$ in addition to the mass of the final $\pi^-$. This is certainly true, and our perturbation theory shall confirm it. Furthermore, we'll also learn to perform calculations within the perturbation theory; we have to justify this form of the answer. The calculations will yield a corresponding δ function which forbids this process. But in doing so we'll find out the coefficient at this zero. This will help us when we start doing the calculations in a slightly trickier version, which is allowed by the perturbation theory.

To start, we should arrange the initial state corresponding to a $\pi^-$ with some 4-momentum $p$. We need to take the creation operator $\hat a^+_p$ of a charged particle and apply it to the vacuum state. That will be the state of one $\pi^-$ with the momentum $p$. Another $\pi^-$ with another 4-momentum $p'$ and a $\pi^0$ with 4-momentum $k$ will be the final state. For this we take, respectively, $\hat a^+_{p'}$ and act on the vacuum, and $\hat c^+_k$ will also act on the vacuum:
$$
|i\rangle = \hat a^+_p\,|0\rangle,\qquad |f\rangle = \hat c^+_k\,\hat a^+_{p'}\,|0\rangle.
$$
The matrix element of $\hat S$ reads
$$
S^{(1)}_{fi} = -ig\int d^4x\;\langle 0|\,\hat a_{p'}\,\hat c_k\;\hat\varphi^+(x)\,\hat\varphi(x)\,\hat\Phi(x)\;\hat a^+_p\,|0\rangle. \tag{12.5}
$$
Here we have only one 4-coordinate $x$, at which all the interaction takes place, i.e. we'll have to put a point at which the line of the charged particle continues without change, whereas the line of the neutral particle is absent at the beginning and present only at the end. The matrix element itself can be rewritten in the form
$$
S^{(1)}_{fi} = -ig\int F(x)\,f(x)\,d^4x, \tag{12.6}
$$
where the function $F(x)$ corresponds to the plane wave of the final $\pi^0$:
$$
F(x) = \langle 0|\,\hat c_k\,\hat\Phi(x)\,|0\rangle
= \langle 0|\,\hat c_k\sum_{k'}\left(\hat c_{k'}\,\frac{e^{-ik'x}}{\sqrt{2\varepsilon_{k'} V}} + \hat c^+_{k'}\,\frac{e^{ik'x}}{\sqrt{2\varepsilon_{k'} V}}\right)|0\rangle
= \frac{e^{ikx}}{\sqrt{2\varepsilon_k V}}. \tag{12.7}
$$
Here we take into account that
$$
\langle 0|\,\hat c_k\,\hat c_{k'}\,|0\rangle = 0,\qquad \langle 0|\,\hat c_k\,\hat c^+_{k'}\,|0\rangle = \delta_{kk'}. \tag{12.8}
$$
Sometimes all this is described as follows: the operator $\hat c_k$ and the operator $\hat\Phi(x)$, which have resulted in the desired exponent, are "contracted", and the corresponding procedure is denoted as
$$
\langle 0|\,\hat c_k\,\hat\Phi(x)\,|0\rangle = \langle 0|\,\underbrace{\hat c_k\,\hat\Phi(x)}\,|0\rangle.
$$
In other words, a pair of operators $\hat c_k$ and $\hat\Phi(x)$ is reduced to a pair of expressions (8) giving a nonzero exponent.

Analogously, the function
$$
f(x) = \langle 0|\,\hat a_{p'}\,\hat\varphi^+(x)\,\hat\varphi(x)\,\hat a^+_p\,|0\rangle
$$
corresponds to the plane waves of the initial and final $\pi^-$. Namely, if we pull the operator $\hat a_{p'}$ to the right, it can be contracted either with $\hat\varphi^+(x)$ (that gives the nonzero contribution) or with $\hat a^+_p$, but that gives zero contribution. Indeed, since the momenta $p$ and $p'$ are different, we obtain
$$
\hat a_{p'}\,\hat a^+_p\,|0\rangle = \left(\hat a^+_p\,\hat a_{p'} + \delta_{pp'}\right)|0\rangle = 0.
$$
As a result,
$$
f(x) = \langle 0|\,\underbrace{\hat a_{p'}\,\hat\varphi^+(x)}\;\underbrace{\hat\varphi(x)\,\hat a^+_p}\,|0\rangle
= \frac{e^{ip'x}}{\sqrt{2\varepsilon_{p'} V}}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_p V}}. \tag{12.9}
$$
After integration over $x$ we finally obtain
$$
S^{(1)}_{fi} = i\,(2\pi)^4\,\delta(p - p' - k)\,\frac{M^{(1)}_{fi}}{\sqrt{2\varepsilon_p V\,2\varepsilon_{p'} V\,2\varepsilon_k V}},\qquad M^{(1)}_{fi} = -g. \tag{12.10}
$$
The calculation turned out to be pretty simple, clear and understandable, and indeed it confirms that such a process is impossible, because the conservation law $p = p' + k$ must be fulfilled and the energy of the initial particle must be equal to the energy of the final particles. However, if we consider this decay in the rest frame of the initial particle, we find that $\varepsilon_p = m_- \neq \varepsilon_{p'} + \varepsilon_k \geq m_- + m_0$.

Example 2. What process would be possible under the conservation law? It is easy to see that it is the decay $\pi^0 \to \pi^+ + \pi^-$ (Fig. 9), for which, as before,
$$
M^{(1)}_{fi} = -g, \tag{12.11}
$$
but now we get the conservation law $k = p_+ + p_-$, which can be fulfilled if $m_0 > 2m_-$.

Figure 9: $\pi^0 \to \pi^+ + \pi^-$ decay
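The kinematic argument of Examples 1 and 2 can be made concrete with a small sketch that solves the energy conservation condition in the rest frame of the decaying particle. The helper `two_body_momentum` and the numerical masses below are hypothetical and chosen only so that $m_0 > 2m_-$.

```python
import math

def two_body_momentum(m_parent, m_a, m_b):
    """Momentum of the decay products in the parent rest frame.

    Returns None when m_parent < m_a + m_b, i.e. when energy conservation
    m_parent = sqrt(p^2 + m_a^2) + sqrt(p^2 + m_b^2) has no solution.
    """
    if m_parent < m_a + m_b:
        return None
    term_plus = m_parent**2 - (m_a + m_b)**2
    term_minus = m_parent**2 - (m_a - m_b)**2
    return math.sqrt(term_plus * term_minus) / (2.0 * m_parent)

# Hypothetical masses (arbitrary units) with m_0 > 2 m_-
m_charged, m_neutral = 1.0, 2.5

# Example 1: pi- -> pi- + pi0 is forbidden for any nonzero m_0
print(two_body_momentum(m_charged, m_charged, m_neutral))   # None

# Example 2: pi0 -> pi+ + pi- is allowed when m_0 > 2 m_-
print(two_body_momentum(m_neutral, m_charged, m_charged))   # 0.75
```

With these masses each final particle carries energy $\sqrt{0.75^2 + 1} = 1.25$, so the two energies add up to $m_0 = 2.5$, as the δ function requires.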
12.2. Interaction $g\,\hat{\bar\Psi}\hat\Psi\hat\Phi$. Decay of the Higgs boson

As a further option, we consider a process with some complication. Let the neutral particle remain as it is. Instead of the scalars $\pi^\pm$ described by the complex scalar field $\varphi(x)$, we take Dirac spinor particles described by bispinors $\Psi(x)$. Let their interaction be determined by the perturbation operator (10.8)
$$
\hat V(x) = g\,\hat{\bar\Psi}(x)\,\hat\Psi(x)\,\hat\Phi(x). \tag{12.12}
$$
The structure of the theory is the same, but the picture is considerably complicated by the fact that the spinor fields $\Psi(x)$ and $\bar\Psi(x)$ have different expansions over plane waves, (8.19)-(8.20), in which we should take into account their spins.

12.2.1. Interaction $g\,\hat{\bar\Psi}\hat\Psi\hat\Phi$ in the Standard Model

In the Standard Model the spinor field describes a lepton $l^\pm$ or a quark $q^\pm$ with mass $m$, and the real scalar field describes the Higgs boson $H$ with mass $m_H = 125$ GeV. The constant $g$ is considered to be given. We won't derive it, but in the Standard Model it has a very concrete expression,
$$
g = \sqrt{4\pi\alpha}\,\frac{m}{2m_W\sin\theta_W},\qquad m_W = 80.4\ \text{GeV},\qquad \sin^2\theta_W = 0.23.
$$
It is proportional to the mass $m$ of the lepton or quark.

Today, at last, we'll proceed from simple theoretical lessons (quantization of a free field and the first order of perturbation theory) to the basis of the Feynman approach and Feynman diagrams. At this point we can face things that correspond to calculated values related to experiments. Moreover, on this fairly primitive example, we'll even try to get acquainted with some exciting challenges of our time, which are, by the way, relevant to BINP.

Let us consider the process in which $l = e$. In this case the constant includes the electron mass. The constant itself is dimensionless, but it includes the ratio $m/m_W$ of the electron mass to the mass of the $W$ boson, which is of the order of $10^{-5}$. The parameter $\sin\theta_W$ is called the sine of the Weinberg angle, or the sine of the weak mixing angle; it is also well known from experiments. Usually the square of this value is quoted, which is equal to 0.23.

12.2.2. The transition $e^- \to e^- + H$

Let us consider the process (Fig. 10) $e^- \to e^- + H$. Repeating the calculations for the process $\pi^- \to \pi^- + \pi^0$, we obtain
$$
S^{(1)}_{fi} = -ig\int F(x)\,f(x)\,d^4x,
$$
where
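$$
F(x) = \langle 0|\,\underbrace{\hat c_k\,\hat\Phi(x)}\,|0\rangle = \frac{e^{ikx}}{\sqrt{2\varepsilon_k V}},\qquad
f(x) = \langle 0|\,\underbrace{\hat a_{p'\sigma'}\,\hat{\bar\Psi}(x)}\;\underbrace{\hat\Psi(x)\,\hat a^+_{p\sigma}}\,|0\rangle,
$$
and the index $\sigma$ ($\sigma'$) denotes the spin state of the initial (final) electron.

Figure 10: $e^- \to e^- + H$ transition

Then we take into account that
$$
\hat\Psi(x) = \sum_{p''\sigma''}\left(\hat a_{p''\sigma''}\,u_{p''\sigma''}\,\frac{e^{-ip''x}}{\sqrt{2\varepsilon_{p''} V}} + \hat b^+_{p''\sigma''}\,v_{p''\sigma''}\,\frac{e^{ip''x}}{\sqrt{2\varepsilon_{p''} V}}\right)
$$
and $\langle 0|\,\hat a_{p''\sigma''}\,\hat a^+_{p\sigma}\,|0\rangle = \delta_{pp''}\,\delta_{\sigma\sigma''}$ (and the similar relations for $\hat{\bar\Psi}$ and $\hat a_{p'\sigma'}$). It gives us
$$
f(x) = \bar u_{p'\sigma'}\,u_{p\sigma}\,\frac{e^{ip'x}}{\sqrt{2\varepsilon_{p'} V}}\,\frac{e^{-ipx}}{\sqrt{2\varepsilon_p V}}.
$$
As a result, after integration over $x$, we obtain the matrix element
$$
S^{(1)}_{fi} = i\,(2\pi)^4\,\delta(p - p' - k)\,\frac{M^{(1)}_{fi}}{\sqrt{2\varepsilon_p V\,2\varepsilon_{p'} V\,2\varepsilon_k V}},\qquad M^{(1)}_{fi} = -g\,\bar u_{p'\sigma'}\,u_{p\sigma}.
$$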
As you can see, this process is impossible since $m < m_H + m$.

12.2.3. Decay of the Higgs boson $H \to e^+e^-$

The Higgs boson with 4-momentum $k$ decays into an electron, considered as a particle, and a positron, considered as an antiparticle. The electron and the positron have 4-momenta $p_-$ and $p_+$, respectively. In our reasoning the positron must be a particle moving backwards in time, i.e. it is associated with the line labelled by the momentum $(-p_+)$, see Fig. 11.

Figure 11: $H \to e^+e^-$ decay

Certainly, this decay is allowed by the conservation law. The initial state is associated with a plane wave with 4-momentum $k$; the final states are associated with the plane waves with 4-momenta $p_+$ and $p_-$ multiplied by the spinors. As a result, after integration over $x$, we obtain the matrix element
$$
S^{(1)}_{fi} = i\,(2\pi)^4\,\delta(k - p_+ - p_-)\,\frac{M^{(1)}_{fi}}{\sqrt{2\varepsilon_{p_-} V\,2\varepsilon_{p_+} V\,2\varepsilon_k V}},\qquad M^{(1)}_{fi} = -g\,\bar u_{p_-\sigma_-}\,v_{p_+\sigma_+}.
$$
Now this process is possible, since $m_H > 2m$. We can remember this result via Fig. 11. This is the Feynman diagram, which we'll discuss in more detail afterwards. I want you to understand that this picture has no special content; all the content is in our calculations. The calculation gives a meaningful result, and the picture is a rule for remembering and reproducing this result.

Let us find the decay width
$$
\Gamma_{H\to e^+e^-} = \int \sum_{\sigma_\pm} |M_{fi}|^2\,|\mathbf p_-|\,\frac{d\Omega_-}{32\pi^2 m_H^2}.
$$
We perform the calculations in the Higgs boson rest frame and take the $z$ axis along the electron momentum. In this frame
$$
\varepsilon_- = \varepsilon_+ = \tfrac12 m_H,\qquad \mathbf p_- = (0,\,0,\,v_e\varepsilon_-) = -\mathbf p_+,\qquad |\mathbf p_-| = \tfrac12 v_e m_H,
$$
where $v_e = \sqrt{1 - 4m^2/m_H^2}$ is the electron velocity.
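It is instructive to perform the direct calculation of the scattering amplitude using the explicit forms of the bispinors for the electron and positron:
$$
u_{p_-\sigma_-} = \begin{pmatrix}\sqrt{\varepsilon_- + m}\;\varphi^{(\sigma_-)}\\[2pt] \sqrt{\varepsilon_- - m}\;(\boldsymbol\sigma\mathbf n_-)\,\varphi^{(\sigma_-)}\end{pmatrix}
= \begin{pmatrix}\sqrt{\varepsilon_- + m}\;\varphi^{(\sigma_-)}\\[2pt] 2\sigma_-\sqrt{\varepsilon_- - m}\;\varphi^{(\sigma_-)}\end{pmatrix},
$$
$$
v_{p_+\sigma_+} = C\,\bar u_{p_+\sigma_+} = -i\begin{pmatrix}\sqrt{\varepsilon_+ - m}\;\varphi^{(-\sigma_+)}\\[2pt] 2\sigma_+\sqrt{\varepsilon_+ + m}\;\varphi^{(-\sigma_+)}\end{pmatrix}.
$$
Here we take into account that $C = -\alpha_y$, $-\sigma_y\left(\varphi^{(\sigma)}\right)^* = -2i\sigma\,\varphi^{(-\sigma)}$ and
$$
(\boldsymbol\sigma\mathbf n_-)\,\varphi^{(\sigma_-)} = 2\sigma_-\,\varphi^{(\sigma_-)},\qquad
(\boldsymbol\sigma\mathbf n_+)\,\varphi^{(-\sigma_+)} = 2\sigma_+\,\varphi^{(-\sigma_+)},
$$
since $\mathbf n_- = -\mathbf n_+$. As a result, we obtain
$$
M^{(1)}_{fi} = -ig\,\sqrt{\varepsilon_-^2 - m^2}\;(1 - 4\sigma_-\sigma_+)\;\varphi^{(\sigma_-)*}\varphi^{(-\sigma_+)} = -ig\,m_H v_e\,\delta_{\sigma_-,-\sigma_+}.
$$
As you can see, the $z$ projection of the electron spin is opposite to the $z$ projection of the positron spin; in other words, the helicities of the leptons are equal to each other. This is a direct consequence of the conservation law for the $z$ projection of the total angular momentum. Now it is easy to find the sum over the spin states of the final leptons,
$$
\sum_{\sigma_\pm}\left|M^{(1)}_{fi}\right|^2 = 2g^2 v_e^2 m_H^2, \tag{12.1}
$$
and the width of the decay under discussion,
$$
\Gamma_{H\to e^+e^-} = \frac{\alpha}{8\sin^2\theta_W}\left(\frac{m_e}{m_W}\right)^2 v_e^3\,m_H.
$$
It is very small, since it is proportional to $(m_e/m_W)^2 \approx 4\cdot 10^{-11}$. Certainly, other lepton decays of the Higgs boson are possible: $H \to \mu^+\mu^-$ and $H \to \tau^+\tau^-$; the latter has the largest width. Among the quark decay modes $H \to q\bar q$ the largest width belongs to the $b\bar b$ mode:
$$
\Gamma_{H\to b\bar b} = N_C\,\frac{\alpha}{8\sin^2\theta_W}\left(\frac{m_b}{m_W}\right)^2 v_b^3\,m_H \approx 6\ \text{MeV}
$$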
(for $N_C = 3$ and $m_b = 5$ GeV). Just for comparison, let us mention that the decay of the $\phi(1020)$ meson has the width $\Gamma_\phi = 4.3$ MeV. The latest data on the decay modes of the Higgs boson can be found in the book Review of Particle Physics [7].

12.2.4. Summing over spin states

The calculation of the quantity (1) performed in section 12.2.3 is rather cumbersome. Here we present a simpler method. We start with the remark that the sum over spin states
$$
\Sigma = \sum_{\sigma_\pm}\left|\bar u_{p_-\sigma_-}\,v_{p_+\sigma_+}\right|^2 \tag{12.2}
$$
can be rewritten as $\Sigma = \mathrm{Tr}\,(P_+P_-)$, where the matrices $P_\pm$ are
$$
(P_+)_{jk} = \sum_{\sigma_+}\left(v_{p_+\sigma_+}\right)_j\left(\bar v_{p_+\sigma_+}\right)_k,\qquad
(P_-)_{jk} = \sum_{\sigma_-}\left(u_{p_-\sigma_-}\right)_j\left(\bar u_{p_-\sigma_-}\right)_k.
$$
The explicit forms of these matrices can be found from the following consideration. The matrix $P_\pm$ is a scalar quantity which depends only on the 4-vector $p_\pm$. Therefore, it can be presented in the form $(P_\pm)_{jk} = a_\pm\,p^\mu_\pm(\gamma_\mu)_{jk} + b_\pm\,\delta_{jk}$, where the constants $a_\pm$ and $b_\pm$ can be found from the evident equations
$$
\mathrm{Tr}\,P_\pm = \mp 4m,\qquad \left[p^\nu_\pm(\gamma_\nu)_{jk} \pm m\,\delta_{jk}\right](P_\pm)_{kl} = 0. \tag{12.3}
$$
It gives $a_\pm = 1$, $b_\pm = \mp m$ and, therefore,
$$
P_\pm = p^\mu_\pm\gamma_\mu \mp m, \tag{12.4}
$$
where for the sake of simplicity we replace the matrix $m\,\delta_{jk}$ with the simple form $m$. Thus, the sum over spin states $\Sigma$ reads
$$
\Sigma = \mathrm{Tr}\left[(p^\mu_-\gamma_\mu + m)(p^\nu_+\gamma_\nu - m)\right]. \tag{12.5}
$$
Then we can use the following identities,
$$
\mathrm{Tr}\,(\gamma_\mu) = 0,\qquad \mathrm{Tr}\,(\gamma_\mu\gamma_\nu) = 4g_{\mu\nu}, \tag{12.6}
$$
which can be derived from Eqs. (8.14) and (8.15). As a result, we obtain the final expression
$$
\sum_{\sigma_\pm}\left|M^{(1)}_{fi}\right|^2 = 4g^2\left(p_+p_- - m^2\right), \tag{12.7}
$$
which coincides with that of Eq. (1).

12.2.5. Decay of the Higgs boson $H \to \gamma\gamma$

We have studied one specific interaction, with decays into leptons and quarks. Other interactions may lead to other decays, including the decay into $W^+$ and $W^-$, which in turn decay further. A quite unusual decay into two photons, $H \to \gamma\gamma$, is also possible. Its probability is small, the corresponding branching ratio is about 0.23 %, but it has been observed at the LHC collider.

Figure 12: $H \to \gamma\gamma$ decay

The Higgs boson was discovered in collisions of protons at the Large Hadron Collider. Of course, it was not the Higgs boson itself that was observed, but its decay. The decay into quarks is the most probable one, but what makes it difficult to detect? In a decay into two quarks, each quark turns into a set of hadrons, which has to be separated from the real mess produced there. On the other hand, the decay into two photons has a much clearer signature. Here we must have two isolated photons, which can fly at large angles. Their invariant mass distribution must have a peak at the mass of the Higgs boson. The Higgs boson really was observed in this channel!

Actually it's a very interesting process, because when drawn with similar pictures it looks like this: the Higgs boson in the beginning and two photons in the end. We have no such interaction through a single vertex, but we have a trickier interaction through three vertices (Fig. 12). Here the Higgs boson decays into two charged particles, e.g. an electron and a positron, which annihilate and produce a pair of photons. Since the coupling at this vertex is small, we can say in advance that the contribution will be small. Heavier particles can also run in the loop, e.g. muons, tau leptons or quarks. The point is that since this process is a virtual one, there can be a $t$ quark here. The conservation law must be fulfilled for the initial and final particles but not necessarily for the intermediate ones.

This process is unique. Usually such processes (if they go through intermediate particles, and if those particles are heavy) are suppressed. In our case, if heavy charged particles are involved, the corresponding probability does not tend to zero. Why is this interesting? Let's suppose a charged particle $X^\pm$ is so heavy that it cannot be produced at an accelerator, and thus we cannot see it and study it. But it can be involved in this loop with a nonzero contribution. Therefore, such a process can give us a hint of things existing beyond the reach of our technology. We'll explore this process, calculate it within the Standard Model, compare it with the experiment, and if the calculated result does not coincide with the experimental one, a possible way out will be the existence of heavy charged particles that contribute here. For this reason this process is sometimes called "a counter of undiscovered particles".
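Returning for a moment to the spin sum of section 12.2.4: the trace in (12.5) and its agreement with both (12.7) and the direct result of section 12.2.3 can be checked numerically. The sketch below is written under stated assumptions: it uses the standard (Dirac) representation of the γ matrices, pure-Python 4×4 matrices, and an illustrative fermion mass of 5 GeV; the helper names are not from the lectures, and the common factor $g^2$ is dropped.

```python
import math

# 2x2 building blocks
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

# gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(m):
    return sum(m[i][i] for i in range(4))

def slash_plus_mass(p, mass):
    """Return p_mu gamma^mu + mass for p = (E, px, py, pz), metric (+,-,-,-)."""
    signs = [1, -1, -1, -1]
    return [[sum(signs[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             + (mass if i == j else 0) for j in range(4)] for i in range(4)]

def minkowski_dot(p, q):
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))

def spin_sum(p_minus, p_plus, mass):
    """Tr[(p_minus-slash + m)(p_plus-slash - m)], i.e. Eq. (12.5)."""
    return trace(matmul(slash_plus_mass(p_minus, mass),
                        slash_plus_mass(p_plus, -mass)))

# Higgs rest frame; the fermion mass is an illustrative choice
m_h, m = 125.0, 5.0
energy = m_h / 2
p_abs = math.sqrt(energy**2 - m**2)
p_minus = (energy, 0.0, 0.0, p_abs)
p_plus = (energy, 0.0, 0.0, -p_abs)
velocity = p_abs / energy

from_trace = spin_sum(p_minus, p_plus, m).real
from_12_7 = 4 * (minkowski_dot(p_plus, p_minus) - m**2)
from_12_1 = 2 * velocity**2 * m_h**2
print(from_trace, from_12_7, from_12_1)
```

All three numbers should agree up to rounding, which is the statement that (12.7) coincides with Eq. (1) of section 12.2.3.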
12.3. Production of the Higgs boson in $e^+e^-$, $\mu^+\mu^-$ and $\gamma\gamma$ collisions

Figure 13: Process $e^+e^- \to H$

Figure 14: Process $\mu^+\mu^- \to H$

We have examined the Higgs boson decay $H \to e^+ + e^-$. Of course, the reverse process $e^+ + e^- \to H$, in which an electron and a positron produce the Higgs boson, is also possible (Fig. 13). What does it resemble? It is similar to the research BINP has performed for many years: colliding $e^+$ and $e^-$ produce, for example, a $\rho$ meson, i.e. resonance production of the $\rho$ particle. The only problem is that the production probability is proportional to the decay width. While for a $\rho$ meson in collisions of $e^+$ and $e^-$ this probability is sufficiently large, for the Higgs boson it is very small, because it is proportional to the tiny factor $(m_e/m_W)^2$. So, the resonance production of the Higgs boson in $e^+e^-$ collisions is an unobservable phenomenon. We thus have to deal with processes of higher order, e.g. when an electron and a positron collide and produce a Higgs boson and some other particle. One can immediately say what kind of particle it will be. Since the total charge is 0 and the Higgs boson charge is 0, this particle must also be neutral, like a $Z$ boson. So, one of the most important tasks of the future International Linear Collider (ILC) will be to look for Higgs boson production in this very mode, $e^+ + e^- \to H + Z$. From this point of view, although the Higgs boson has already been discovered in proton collisions and will be explored there further, its study at the future linear collider will yield much additional information. At the LHC the background caused by collisions of strongly interacting particles (protons) is very large: in addition to the Higgs boson, which decays in such nontrivial ways, a huge number of other hadrons is produced. The situation at the ILC will be much cleaner, with much less background.

Besides, we can dream about the possibility of observing the Higgs boson at other accelerators that are being discussed now. Two types of accelerators that are regularly discussed at international conferences are accelerators with colliding high-energy muon beams and accelerators with colliding high-energy photon beams. Both of these projects were suggested in Novosibirsk. The muon beams were suggested in works by Budker (1969) and by Skrinsky and Parkhomchuk (1981). The colliding photon beams were suggested by Ginzburg, Kotkin, Serbo and Telnov (1981).

In the case of a muon collider we can discuss direct resonance production of the Higgs boson, $\mu^+ + \mu^- \to H$ (Fig. 14). The probabilities of the transitions $\mu^+\mu^- \to H$ and $e^+e^- \to H$ are proportional to the widths of the corresponding decays, and the ratio of these probabilities is
$$
\frac{W(\mu^+\mu^- \to H)}{W(e^+e^- \to H)} = \frac{\Gamma_{H\to\mu^+\mu^-}}{\Gamma_{H\to e^+e^-}} \approx \left(\frac{m_\mu}{m_e}\right)^2 \approx 40\,000;
$$
therefore, the probability to produce the Higgs boson in colliding $\mu^+\mu^-$ beams is 40 000 times larger than in colliding $e^+e^-$ beams of the same energy. For this reason, in spite of the great technical difficulties related to $\mu^+\mu^-$ beams, one of their most important tasks will be the detailed study of the Higgs boson produced there in the resonance mode $\mu^+ + \mu^- \to H$.

Colliding photon beams can be realized if the project of the international linear collider with $e^+$ and $e^-$ beams is implemented. The resonance production of the Higgs boson will be possible there. Although the Higgs boson has already been discovered in colliding proton beams, its examination in colliding $e^+$ and $e^-$ beams in the reaction $e^+ + e^- \to H + Z$ will yield more detailed information. Even so, it would be interesting to explore it in colliding photon beams, because this variant has a number of advantages. In $e^+e^-$ beams a pair of heavy particles is produced, and thus the initial energy must be large. In gamma-gamma collisions the production is resonant, $\gamma + \gamma \to H$, so the energy can be halved. Moreover, the corresponding cross section is one order of magnitude larger than that of the reaction $e^+ + e^- \to H + Z$. But the main thing is, as we said, the special interest in the vertex of Fig. 12, which can be studied in colliding photon beams.

You can see that fairly simple reasoning, which began with the first-order perturbation theory, touches upon the most topical problems of modern elementary particle physics.
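The widths quoted in section 12.2.3 and the ratio above are simple arithmetic and can be reproduced with a short sketch. The values of $\sin^2\theta_W$, $m_W$, $m_H$, $m_b$ and $N_C$ are those given in the text; the value of α and the lepton masses $m_e$, $m_\mu$ are inputs assumed here, and the function name is illustrative.

```python
import math

ALPHA = 1 / 137.036      # fine-structure constant (assumed input)
SIN2_THETA_W = 0.23      # from the text
M_W = 80.4               # GeV, from the text
M_H = 125.0              # GeV, from the text

def higgs_width_to_fermions(m_f, n_colors=1):
    """First-order width (in GeV) of H -> f fbar, using the formula of Sec. 12.2.3."""
    velocity = math.sqrt(1.0 - 4.0 * m_f**2 / M_H**2)
    return (n_colors * ALPHA / (8.0 * SIN2_THETA_W)
            * (m_f / M_W)**2 * velocity**3 * M_H)

M_E = 0.511e-3           # GeV (assumed input)
M_MU = 0.10566           # GeV (assumed input)
M_B = 5.0                # GeV, from the text

width_ee = higgs_width_to_fermions(M_E)
width_mumu = higgs_width_to_fermions(M_MU)
width_bb = higgs_width_to_fermions(M_B, n_colors=3)
ratio_mu_e = width_mumu / width_ee

print(f"Gamma(H -> e+ e-)   = {width_ee:.3e} GeV")
print(f"Gamma(H -> mu+ mu-) = {width_mumu:.3e} GeV")
print(f"Gamma(H -> b bbar)  = {width_bb * 1e3:.2f} MeV")
print(f"Gamma(mu mu) / Gamma(e e) = {ratio_mu_e:.0f}")
```

With these inputs the $b\bar b$ width comes out at about 5.7 MeV, consistent with the value of about 6 MeV quoted above, and the muon-to-electron ratio is about $4.3\cdot 10^4$, i.e. of the order of 40 000.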
12.4. QED interaction $e\,\hat{\bar\Psi}\gamma_\mu\hat\Psi\hat A^\mu$. The Feynman rules in QED

We are consistently studying increasingly sophisticated processes. Let us now consider the first order of the perturbation theory for the interaction corresponding to quantum electrodynamics (QED). We already know that in QED the constant $g$ is replaced with the charge $e$ of our charged particles. If the particle is an electron, it is a negative value. The interaction is also known to us, see Eqs. (10.5) and (10.6). It is a construction of Dirac bispinors, which has a vector structure, and a photon, which is described by the vector potential. The perturbation operator $\hat V(x)$ and the first-order $\hat S$ operator are
$$
\hat V(x) = e\,\hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)\hat A^\mu(x);\qquad
\hat S^{(1)} = -ie\int \hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)\hat A^\mu(x)\,d^4x. \tag{12.8}
$$
As a result, the operator $\hat S^{(1)}$ can describe the following processes:
$$
e^\pm \to e^\pm + \gamma,\qquad e^\pm + \gamma \to e^\pm,\qquad \gamma \to e^+ + e^-,\qquad e^+ + e^- \to \gamma.
$$
Strictly speaking, we know that the conservation laws prohibit all these processes, but it is interesting to study them, at least because such pieces can be found in more complex diagrams. Besides, there is an interesting example in which a first-order QED process is allowed, see Problem 12.1 below.

What changes, from this point of view, with respect to the previous interaction $g\,\hat{\bar\Psi}(x)\hat\Psi(x)\hat\Phi(x)$? As for the electron and positron, their description remains the same as it was. As for the neutral particle, a vector particle appears here instead of the scalar:
$$
\hat A^\mu(x) = \sum_{k\lambda}\left(\hat c_{k\lambda}\,\frac{e^{-ikx}}{\sqrt{2\omega_k V}}\,\sqrt{4\pi}\,e^\mu_{k\lambda} + \hat c^+_{k\lambda}\,\frac{e^{ikx}}{\sqrt{2\omega_k V}}\,\sqrt{4\pi}\,e^{\mu*}_{k\lambda}\right). \tag{12.9}
$$
Let us consider the process
$$
e^-(p) \to e^-(p') + \gamma(k);
$$
the corresponding scattering amplitude reads
$$
M^{(1)}_{fi} = -e\,\bar u_{p'\sigma'}\gamma_\mu u_{p\sigma}\,\sqrt{4\pi}\,e^{\mu*}_{k\lambda}. \tag{12.10}
$$
Important note. The gauge transformation of the 4-potential
$$
A^\mu(x) \to A^\mu(x) - \partial^\mu f(x) \tag{12.11}
$$
corresponds to the replacement
$$
e^\mu_{k\lambda} \to e^\mu_{k\lambda} + k^\mu\, i\tilde f(k), \tag{12.12}
$$
where $f(x)$ is an arbitrary function and $\tilde f(k)$ is its Fourier amplitude. This replacement does not change the scattering amplitude, since
$$
k^\mu\,\bar u'\gamma_\mu u = \left(p_\mu - p'_\mu\right)\bar u'\gamma^\mu u = 0, \tag{12.13}
$$
due to the Dirac equations $p_\mu\gamma^\mu u = mu$, $p'_\mu\,\bar u'\gamma^\mu = m\bar u'$ (here $u = u_{p\sigma}$ and $u' = u_{p'\sigma'}$). Note that Eq. (13) itself is a consequence of the current conservation $\partial_\mu\left(\bar\Psi(x)\gamma^\mu\Psi(x)\right) = 0$.
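The identity (12.13) can also be verified numerically. The sketch below assumes the standard (Dirac) representation and bispinors of the form $u = \left(\sqrt{\varepsilon+m}\,\varphi,\ \sqrt{\varepsilon-m}\,(\boldsymbol\sigma\mathbf n)\varphi\right)$, as in section 12.2.3; the momenta, the two-component spinors and all helper names are arbitrary illustrative choices. Since (12.13) uses only the Dirac equations, it holds for any on-shell $p$ and $p'$, whether or not $k = p - p'$ satisfies $k^2 = 0$.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def sigma_dot(vec, phi):
    """Apply (sigma . vec) to a two-component spinor phi."""
    return [sum(vec[i] * SIGMA[i][r][c] * phi[c]
                for i in range(3) for c in range(2)) for r in range(2)]

def bispinor(p_vec, mass, phi):
    """Return (energy, u) with u = (sqrt(e+m) phi, sqrt(e-m) (sigma.n) phi)."""
    p_abs = math.sqrt(sum(x * x for x in p_vec))
    energy = math.sqrt(p_abs**2 + mass**2)
    n = [x / p_abs for x in p_vec]
    lower = sigma_dot(n, phi)
    u = ([math.sqrt(energy + mass) * x for x in phi]
         + [math.sqrt(energy - mass) * x for x in lower])
    return energy, u

def slash_on(p4, u):
    """Apply p_mu gamma^mu = [[E, -sigma.p], [sigma.p, -E]] to a bispinor u."""
    energy, p_vec = p4[0], p4[1:]
    upper, lower = u[:2], u[2:]
    sp_upper = sigma_dot(p_vec, upper)
    sp_lower = sigma_dot(p_vec, lower)
    return ([energy * upper[i] - sp_lower[i] for i in range(2)]
            + [sp_upper[i] - energy * lower[i] for i in range(2)])

def bar_dot(u_left, w):
    """Compute ubar_left w, where ubar = u^dagger gamma^0, gamma^0 = diag(1, 1, -1, -1)."""
    g0 = [1, 1, -1, -1]
    return sum(g0[i] * u_left[i].conjugate() * w[i] for i in range(4))

mass = 0.511                                   # illustrative value
p_vec, p_prime_vec = [0.3, -1.1, 2.0], [-0.7, 0.4, 0.9]
e_p, u = bispinor(p_vec, mass, [1, 0])
e_pp, u_prime = bispinor(p_prime_vec, mass, [0.6, 0.8j])

p = [e_p] + p_vec
p_prime = [e_pp] + p_prime_vec
k = [a - b for a, b in zip(p, p_prime)]

gauge_term = bar_dot(u_prime, slash_on(k, u))  # k^mu ubar' gamma_mu u
print(abs(gauge_term))                         # zero up to rounding errors
```

The same helpers can be used to confirm the Dirac equation itself: `slash_on(p, u)` should reproduce `mass * u` component by component.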
In a similar way, for the process $\gamma(k) \to e^+(p_+) + e^-(p_-)$ the corresponding scattering amplitude reads

$$M^{(1)}_{fi} = -e\,\bar u_{p_-\sigma_-}\gamma_\mu v_{p_+\sigma_+}\sqrt{4\pi}\,e^{\mu}_{\mathbf k\lambda}. \tag{12.14}$$
From the above examples we can derive the Feynman rules for QED processes. We have electrons, positrons and photons. Their initial and final states enter the amplitude with the following factors:

| | The initial state | The final state |
|---|---|---|
| electron | $u_{p\sigma}$ | $\bar u_{p\sigma}$ |
| positron | $\bar v_{p\sigma}$ | $v_{p\sigma}$ |
| photon | $\sqrt{4\pi}\,e^{\mu}_{\mathbf k\lambda}$ | $\sqrt{4\pi}\,e^{\mu*}_{\mathbf k\lambda}$ |

The vertex is associated with the factor $-ie\gamma^\mu$. Bispinors are written in the scattering amplitude ($iM_{fi}$) from the end of an electron line to its beginning.

There are several ways to draw Feynman diagrams. We adhere to the manner tied to the European tradition, going back to the Greeks and Latins, of reading from left to right. There is another tradition, adopted in the book "Quantum electrodynamics" by Berestetskii, Lifshitz and Pitaevskii. It proceeds from the convenience of drawing diagrams so that they resemble the structure of the matrix element. As you remember, the initial state in the matrix element is on the right and the final state is on the left. So, diagrams in the above-mentioned book are inverted as compared with ours. In fact, this is tied to the Semitic tradition of reading from right to left.

One more remark concerns the case when a Feynman diagram with continuous electron lines running from the beginning to the end is replaced by the diagram with the corresponding positron lines. Let us consider two processes. The process $e^- \to e^- + \gamma$ with electron lines is related to the function
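$$f_\mu = \langle 0|\,\hat a_{p'\sigma'}\,\hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)\,\hat a^+_{p\sigma}\,|0\rangle = \bar u_{p'\sigma'}\gamma_\mu u_{p\sigma}\,\frac{e^{-i(p-p')x}}{\sqrt{2\varepsilon_p V\,2\varepsilon_{p'}V}},$$

where $\hat a_{p'\sigma'}$ is contracted with $\hat{\bar\Psi}(x)$ and $\hat\Psi(x)$ is contracted with $\hat a^+_{p\sigma}$, while the process $e^+ \to e^+ + \gamma$ with positron lines is related to the function

$$\bar f_\mu = \langle 0|\,\hat b_{p'\sigma'}\,\hat{\bar\Psi}(x)\gamma_\mu\hat\Psi(x)\,\hat b^+_{p\sigma}\,|0\rangle = -\bar v_{p\sigma}\gamma_\mu v_{p'\sigma'}\,\frac{e^{-i(p-p')x}}{\sqrt{2\varepsilon_p V\,2\varepsilon_{p'}V}},$$

with the additional factor $(-1)$ due to the anticommutation of fermion operators and a different set of contractions: here $\hat b_{p'\sigma'}$ is contracted with $\hat\Psi(x)$ and $\hat{\bar\Psi}(x)$ is contracted with $\hat b^+_{p\sigma}$.

Problem 12.1. This problem was considered by the Nobel Prize winner Vitaly L. Ginzburg in 1940 in his PhD thesis. An electron moves in a homogeneous, transparent and nonmagnetic medium with the refraction index $n > 1$. If the electron velocity $v > 1/n$, the decay $e(p) \to e(p') + \gamma(k)$ becomes possible, which results in the Vavilov-Cherenkov radiation. Find the spectral-angular distribution $d\Gamma/(d\omega\, d\Omega_\gamma)$ and the spectral distribution $d\Gamma/d\omega$ of the emitted radiation.

Hint: Use the following expressions for the decay width

$$d\Gamma = \frac{\left|M^{(1)}_{fi}\right|^2}{32\pi^2 n^2\,\varepsilon\varepsilon'\omega}\,\delta(p' + k - p)\,d^3k\,d^3p',\qquad M^{(1)}_{fi} = \sqrt{4\pi\alpha}\;\bar u_{p'\sigma'}\gamma_\mu u_{p\sigma}\,(e^{\mu}_{\mathbf k\lambda})^*,$$

with $\omega_{\mathbf k} = |\mathbf k|/n$ (see Problem 2.1 in §2).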
§ 13. The second order of the perturbation theory with interaction $g\hat\varphi^+\hat\varphi\hat\Phi$. The Mandelstam variables. Propagator of the scalar particle

We began the perturbation theory with the simplest model, which corresponds to an interaction of scalar particles of the form

$$\hat V(x) = g\,\hat\varphi^+(x)\hat\varphi(x)\hat\Phi(x), \tag{13.1}$$

i. e. a pair of charged particles and one neutral particle meet at one point. Once we had understood that, we gradually switched to more complex cases, in which the charged particles were replaced with spinor ones and the scalar neutral particle was replaced with the photon, i. e. we proceeded to quantum electrodynamics (QED).

Now we will deal with the second order of the perturbation theory. Again, we start from the simplest perturbation, because this model is very good for working out the basic ideas. Our strategy will be the same as in the previous case: we analyze the second order in this model, and when we proceed to QED, only technical complications associated with the spin of the particles will remain, whereas the ideas will already be well developed.

The $\hat S$ operator is known; it is the $T$-exponent with the interaction in the exponent. We even know how to handle this $T$-product: we perform a series expansion and take a multiple integral over the different variables. In particular, in the second order, the expansion of the exponent yields

$$\hat S^{(2)} = \frac{(-ig)^2}{2!}\int d^4x\,d^4x'\;\hat T\left[\hat\varphi^+(x)\hat\varphi(x)\hat\Phi(x)\,\hat\varphi^+(x')\hat\varphi(x')\hat\Phi(x')\right]. \tag{13.2}$$

So, let's see what such an operator can give. Here we have a pair of points, $x$ and $x'$, and the same operator $\hat V$ at each point.
Now these operators open up much more diverse possibilities. When we were dealing with a single vertex, we considered the various processes it allows. Let's recall that the operator $\hat\varphi$ was associated with charged particles (by which we conventionally meant $\pi^\pm$) and the operator $\hat\Phi$ was associated with neutral particles (for simplicity we call them $\pi^0$); the masses of these particles are unrelated to each other. There could be processes with $\pi^-$ turning into $\pi^- + \pi^0$ or vice versa, and $\pi^0$ turning into $\pi^+ + \pi^-$ or vice versa. Such processes have definite states of free particles in the beginning and in the end.

Here we have six field operators, so we can use up to 6 particles, distributing them somehow between the beginning and the end. If we take 6 particles in the initial and final states, the result reduces to a product of two first-order expressions, and nothing interesting is added. If we have 5 particles, generally speaking with different momenta, and if we start, as before, pairing the operators of the perturbation with the initial and final states, then one operator is left unpaired, and its vacuum average gives 0. So, this picture is uninteresting. Next is the situation with 4 particles. What is possible in this case? We can have a picture in which two particles come in and two particles go out, and something happens in between that is determined by the $\hat S^{(2)}$ operator. Thus, we have two initial states, two final states and some development between them. What processes are possible here? Given that we have an elementary vertex with two charged particles and a neutral particle, the following processes are possible: scattering of a particle on another particle or antiparticle,

$$\pi^-\pi^\pm \to \pi^-\pi^\pm,\qquad \pi^0\pi^\pm \to \pi^0\pi^\pm,\qquad \pi^+\pi^+ \to \pi^+\pi^+,$$

annihilation of charged particles,

$$\pi^+\pi^- \to \pi^0\pi^0,$$

and their production in collisions of two neutral particles,

$$\pi^0\pi^0 \to \pi^+\pi^-.$$

All these processes are so-called two-to-two processes. Such processes can be easily computed, but since they are among the basic processes, including those studied on colliding beams, a special, very convenient kinematic language was invented for them: the so-called Mandelstam variables. Let's talk about them in a little more detail.
13.1. The Mandelstam variables

13.1.1. The s channel: $1 + 2 \to 3 + 4$

So, we are talking about a reaction in which particle 1 collides with particle 2 and they turn into particles 3 and 4. In general, they are all different. In this case, we have the following picture (Fig. 15).

Figure 15: s channel: $1 + 2 \to 3 + 4$

Accordingly, we say that the particles have the momenta $p_1$, $p_2$, $p_3$, and $p_4$. Of course, the conservation law is valid, i. e. the sum of the initial four-momenta is equal to the sum of the final four-momenta, $p_1 + p_2 = p_3 + p_4$, and all these particles are on the mass shell, i. e. the squared four-momentum of each is equal to its mass squared, $p_i^2 = m_i^2$. Generally speaking, one can make 16 mutual products $p_i p_j$ from these 4-vectors. However, only two of them turn out to be independent. Instead of invariants such as scalar products, it is preferable to use the Mandelstam variables:

$$s = (p_1 + p_2)^2 = 2p_1p_2 + m_1^2 + m_2^2 = (p_3 + p_4)^2,$$
$$t = (p_1 - p_3)^2 = -2p_1p_3 + m_1^2 + m_3^2 = (p_2 - p_4)^2,$$
$$u = (p_1 - p_4)^2 = -2p_1p_4 + m_1^2 + m_4^2 = (p_2 - p_3)^2.$$

In fact, these three variables are not independent, since

$$s + t + u = 2p_1(p_2 - p_3 - p_4) + 3m_1^2 + m_2^2 + m_3^2 + m_4^2 = m_1^2 + m_2^2 + m_3^2 + m_4^2,$$

where we have used $p_2 - p_3 - p_4 = -p_1$. If so, what is the good of introducing them? They have a certain symmetry. Let's see how they can be used. It is convenient to switch to the center-of-mass system. Then the momentum of the first particle is equal and opposite to the momentum of the second particle and, respectively, the momentum of the third particle is equal and opposite to the momentum of the fourth particle: $\mathbf p_1 = -\mathbf p_2$, $\mathbf p_3 = -\mathbf p_4$.
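If we unwind the variable $s$ in this system, the total 4-momentum $p_1 + p_2$ has only the zero component and no spatial components:

$$p_1 + p_2 = (\varepsilon_1 + \varepsilon_2, 0, 0, 0).$$

Thus, the invariant $s$ looks like this:

$$s = (\varepsilon_1 + \varepsilon_2)^2 = (\varepsilon_3 + \varepsilon_4)^2,$$

i. e. this variable is just the square of the total energy in the center-of-mass system. The variables $t$ and $u$ in this system,

$$t = -2\varepsilon_1\varepsilon_3 + 2\mathbf p_1\mathbf p_3 + m_1^2 + m_3^2,$$
$$u = -2\varepsilon_1\varepsilon_4 + 2\mathbf p_1\mathbf p_4 + m_1^2 + m_4^2 = -2\varepsilon_1\varepsilon_4 - 2\mathbf p_1\mathbf p_3 + m_1^2 + m_4^2,$$

depend on the scattering angle $\theta$: $\mathbf p_1\mathbf p_3 = |\mathbf p_1|\,|\mathbf p_3|\cos\theta$. The cross section (11.8) reads

$$d\sigma = \left|\frac{M_{fi}}{8\pi(\varepsilon_1 + \varepsilon_2)}\right|^2\frac{|\mathbf p_3|}{|\mathbf p_1|}\,d\Omega_3 = \frac{|M_{fi}|^2}{64\pi I^2}\,d(-t)\,\frac{d\varphi}{2\pi}, \tag{13.3}$$

since

$$d\Omega_3 = \sin\theta\,d\theta\,d\varphi = d(-\cos\theta)\,d\varphi = \frac{d(-t)\,d\varphi}{2|\mathbf p_1|\,|\mathbf p_3|},\qquad |\mathbf p_1|(\varepsilon_1 + \varepsilon_2) = \sqrt{(p_1p_2)^2 - m_1^2m_2^2} = I.$$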
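The relations of this subsection are easy to verify numerically. The sketch below is only an illustration under stated assumptions: the masses, the total energy and the scattering angle are arbitrary sample values, and the function names (`cm_four_momenta`, `mandelstam`, and so on) are hypothetical helpers, not notation from the lecture. It builds the four-momenta in the center-of-mass system and checks $s = (\varepsilon_1+\varepsilon_2)^2$, the sum rule for $s+t+u$, the angular dependence of $t$, and the identity $|\mathbf p_1|(\varepsilon_1+\varepsilon_2) = I$.

```python
import math


def mdot(a, b):
    """Minkowski product a.b with the metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def combine(a, b, sign=1):
    """Return the 4-vector a + sign * b."""
    return tuple(x + sign * y for x, y in zip(a, b))


def cm_momentum(sqrt_s, ma, mb):
    """|p| of two particles with masses ma, mb and total CM energy sqrt_s."""
    s = sqrt_s ** 2
    return math.sqrt((s - (ma + mb) ** 2) * (s - (ma - mb) ** 2)) / (2 * sqrt_s)


def cm_four_momenta(sqrt_s, masses, theta):
    """Four-momenta of 1 + 2 -> 3 + 4 in the CM system; particle 1 moves along z."""
    m1, m2, m3, m4 = masses
    p_in = cm_momentum(sqrt_s, m1, m2)
    p_out = cm_momentum(sqrt_s, m3, m4)

    def energy(m, p):
        return math.sqrt(m * m + p * p)

    px, pz = p_out * math.sin(theta), p_out * math.cos(theta)
    p1 = (energy(m1, p_in), 0.0, 0.0, p_in)
    p2 = (energy(m2, p_in), 0.0, 0.0, -p_in)
    p3 = (energy(m3, p_out), px, 0.0, pz)
    p4 = (energy(m4, p_out), -px, 0.0, -pz)
    return p1, p2, p3, p4, p_in, p_out


def mandelstam(p1, p2, p3, p4):
    """Return (s, t, u) for the reaction 1 + 2 -> 3 + 4."""
    q_s, q_t, q_u = combine(p1, p2), combine(p1, p3, -1), combine(p1, p4, -1)
    return mdot(q_s, q_s), mdot(q_t, q_t), mdot(q_u, q_u)


masses = (0.14, 0.94, 0.5, 1.2)   # sample masses, arbitrary units
sqrt_s, theta = 3.0, 0.7          # sample total energy and scattering angle
p1, p2, p3, p4, p_in, p_out = cm_four_momenta(sqrt_s, masses, theta)
s, t, u = mandelstam(p1, p2, p3, p4)

sum_m2 = sum(m * m for m in masses)
invariant_i = math.sqrt(mdot(p1, p2) ** 2 - (masses[0] * masses[1]) ** 2)
t_from_angle = (masses[0] ** 2 + masses[2] ** 2 - 2 * p1[0] * p3[0]
                + 2 * p_in * p_out * math.cos(theta))

print(s, t, u, s + t + u - sum_m2, invariant_i - p_in * sqrt_s)
```

Because $s$, $t$, $u$ and $I$ are invariants, the same numbers would be obtained from the four-momenta written in any other reference frame.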
13.1.2. The t channel: $1 + \bar 3 \to \bar 2 + 4$

Besides the process $1 + 2 \to 3 + 4$, we can consider a process in which particles 2 and 3 are exchanged. Particles 1 and 4 remain the same. The third particle moves to the initial state and thus becomes the antiparticle $\bar 3$. The second particle moves to the final state and also becomes an antiparticle, $\bar 2$. So, we have the reaction $1 + \bar 3 \to \bar 2 + 4$ (Fig. 16).

Figure 16: t channel: $1 + \bar 3 \to \bar 2 + 4$

If we keep the notation $p_1$ and $p_4$ for the momenta of particles 1 and 4, then we have to write $p_{\bar 2}$ for the antiparticle $\bar 2$ and substitute $p_2 \to -p_{\bar 2}$. Analogously, the 4-momentum of the antiparticle $\bar 3$ is $p_{\bar 3}$, with the substitution $p_3 \to -p_{\bar 3}$. Therefore, the conservation of the total 4-momentum reads $p_1 + p_{\bar 3} = p_{\bar 2} + p_4$. In this channel the Mandelstam variables have the forms

$$t = (p_1 + p_{\bar 3})^2,\qquad s = (p_1 - p_{\bar 2})^2,\qquad u = (p_1 - p_4)^2.$$

The former variable $t$ turns into the total energy squared in the center-of-mass system of this channel:

$$t = (\varepsilon_1 + \varepsilon_{\bar 3})^2 = (\varepsilon_{\bar 2} + \varepsilon_4)^2.$$

Accordingly, the variables $s$ and $u$ depend on the scattering angle in this channel. Thus, in principle, we have the same variables, but with another domain of variation. For example, if all particles are identical, one can easily see that in the s channel the variable $s$ is positive. It is an invariant, so it remains positive in whatever system we calculate it. As for the variables $t$ and $u$, they are both negative, and this is true in any reference frame. Vice versa, in the t channel the variables $s$ and $u$ become negative, and $t$ is positive. So, from this point of view, we have a unified description of two different processes, described with the same variables but with different domains of their variation. The first process is referred to as the s channel of a generalized two-to-two reaction and the second process is referred to as the t channel of the same generalized reaction. Note that the second process can also be seen in Fig. 15 if the observer looks at it from above.
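The statement about the domains of variation can be illustrated for four identical particles of mass $m$. In the center-of-mass system of a given channel the squared total energy is $E^2 \ge 4m^2$, and the two remaining invariants are $-2\mathbf p^2(1 \mp \cos\theta)$ with $\mathbf p^2 = E^2/4 - m^2$. The sketch below is a hedged illustration: the function name `channel_invariants` and the sample numbers are assumptions, and relabelling the output as $(s, t, u)$ or $(t, s, u)$ is how the two channels are distinguished.

```python
def channel_invariants(e_cm, m, cos_theta):
    """Invariants for identical particles of mass m in the CM system of a channel.

    Returns (energy_sq, transfer_1, transfer_2): the squared total energy and
    the two momentum-transfer invariants.
    """
    energy_sq = e_cm ** 2
    p_sq = energy_sq / 4 - m * m
    if p_sq < 0:
        raise ValueError("total energy is below the threshold 2m")
    return energy_sq, -2 * p_sq * (1 - cos_theta), -2 * p_sq * (1 + cos_theta)


m, e_cm = 0.14, 1.0                                # sample values, arbitrary units
cosines = [-1 + 0.1 * i for i in range(21)]        # cos(theta) from -1 to 1

# s channel: energy_sq plays the role of s, the transfers are t and u.
s_channel = [channel_invariants(e_cm, m, c) for c in cosines]
# t channel: the same function, but now energy_sq is t and the transfers are s and u.
t_channel = [(s, t, u) for (t, s, u) in s_channel]

print(s_channel[5])
print(t_channel[5])
```

In the first list $s > 0$ while $t, u \le 0$; in the second list $t > 0$ while $s, u \le 0$; in both cases $s + t + u = 4m^2$.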
13.1.3. The u channel: $1 + \bar 4 \to 3 + \bar 2$

Accordingly, there can be the u channel, in which particles 1 and 3 remain the same, but particles 2 and 4 are exchanged. So, we have the process $1 + \bar 4 \to 3 + \bar 2$ (Fig. 17).

Figure 17: u channel: $1 + \bar 4 \to 3 + \bar 2$

In this case, the Mandelstam variables have the forms

$$u = (p_1 + p_{\bar 4})^2,\qquad s = (p_1 - p_{\bar 2})^2,\qquad t = (p_1 - p_3)^2,$$

the invariant $u$ coincides with the total energy squared in the center-of-mass system of this channel,

$$u = (\varepsilon_1 + \varepsilon_{\bar 4})^2 = (\varepsilon_3 + \varepsilon_{\bar 2})^2,$$

and the variables $s$ and $t$ depend on the scattering angle in this channel.

Example: the Compton effect

To make this more evident, let's consider an example of a certain reaction. Let the s channel be the reaction corresponding to the Compton effect, $\gamma e^- \to \gamma e^-$ (Fig. 18). This process was discovered in the early 20th century, in 1923-24, when Arthur Compton observed the scattering of X-rays on atoms.

Figure 18: The Compton scattering $\gamma e \to \gamma e$

Figure 19: The process $\gamma\gamma \to e^+e^-$

Figure 20: The process $\gamma e^+ \to \gamma e^+$

In the t channel, the third and the second particles are exchanged and we have the following picture: the photon turns into its antiparticle, which is again a photon, and the electron turns into the positron, so that the process $\gamma\gamma \to e^+e^-$ occurs (Fig. 19). This process was not observed directly, but the reverse process, $e^+e^- \to \gamma\gamma$, was observed; it is one of the first processes examined when any accelerator with colliding electron-positron beams starts operation. The cross section of $\gamma\gamma \to e^+e^-$ at relatively high energies is of the same order of magnitude as the cross section of $e^+e^- \to \gamma\gamma$, i. e. it is not small from this point of view. However, we have good sources of electrons and positrons: dense beams with a large number of particles. As for the photons, we do have dense photon beams, namely laser ones. Unfortunately, their energy mainly lies in the range of visible light, about 1 eV, while this reaction requires a total energy above 1 MeV. It is necessary to pass over the pair production threshold. The mass of each particle is 0.5 MeV; thus, for photons of equal energies, each photon must have an energy over 0.5 MeV. There are no such lasers so far. However, there are roundabout ways to produce such photons, including production of one photon in scattering off the electron and production of the second photon in scattering off the positron. Then we can observe the process

$$e^+e^- \to e^+e^- + (\gamma\gamma) \to e^+e^- + (\gamma\gamma \to e^+e^-).$$

Using such a scheme, this process was actually observed; on colliding beams this was done for the first time at the VEPP-2 collider at BINP.

Finally, in the u channel the second and fourth particles are exchanged. Then electron 2 turns into the antiparticle $\bar 2$, i. e. the positron, and electron 4 turns into another positron $\bar 4$, which means scattering of the photon on the positron (Fig. 20). This process was also observed, in collisions of laser photon beams with positron beams.
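The threshold argument can be made quantitative. For two photons with energies $\omega_1$, $\omega_2$ and angle $\alpha$ between their momenta, $s = 2\omega_1\omega_2(1 - \cos\alpha)$, and pair production requires $s \ge 4m_e^2$. The following sketch is an illustration only: it assumes the standard value $m_e \approx 0.511$ MeV (the text rounds it to 0.5 MeV), and the function name `min_partner_energy_ev` is a hypothetical helper.

```python
import math

M_E_EV = 0.511e6  # electron mass in eV (standard value, assumed here)


def min_partner_energy_ev(omega1_ev, angle=math.pi):
    """Minimal energy of the second photon for gamma gamma -> e+ e-.

    Uses s = 2 * omega1 * omega2 * (1 - cos(angle)) >= 4 * m_e^2;
    angle = pi corresponds to a head-on collision.
    """
    return 2 * M_E_EV ** 2 / (omega1_ev * (1 - math.cos(angle)))


# Two photons of equal energy colliding head-on: each needs about 0.511 MeV.
equal_energy = min_partner_energy_ev(M_E_EV)

# A visible-light laser photon of 1 eV needs a partner of about 261 GeV.
partner_for_laser_gev = min_partner_energy_ev(1.0) / 1e9

print(equal_energy, partner_for_laser_gev)
```

The second number shows why laser photons alone cannot produce pairs: a 1 eV photon has to meet a photon of hundreds of GeV.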
13.2. Scattering of charged particles

Let's consider a process in which a charged particle is scattered on another charged particle, $\pi^-\pi^- \to \pi^-\pi^-$. The initial and final states of this process are

$$|i\rangle = \hat a_1^+\hat a_2^+|0\rangle,\qquad |f\rangle = \hat a_3^+\hat a_4^+|0\rangle,\qquad \hat a_i^+ \equiv \hat a^+_{\mathbf p_i}. \tag{13.4}$$

The operator of the neutral particles $\hat\Phi$ commutes with the operators of the charged particles $\hat\varphi$ and $\hat\varphi^+$; therefore, the chronological operation $\hat T$ can be applied separately to $\hat\Phi(x)\hat\Phi(x')$ and to the other operators:

$$S^{(2)}_{fi} = (-ig)^2\int d^4x\,d^4x'\;iD(x - x')\,f(x, x'), \tag{13.5}$$

where the function

$$f(x, x') = \frac{1}{2!}\,\langle 0|\,\hat a_3\hat a_4\,\hat T\left[\hat\varphi^+(x)\hat\varphi(x)\hat\varphi^+(x')\hat\varphi(x')\right]\hat a_1^+\hat a_2^+\,|0\rangle \tag{13.6}$$

corresponds to the plane waves of the initial and final charged particles with different 4-momenta $p_i$. The propagator

$$iD(x - x') = \langle 0|\,\hat T\left[\hat\Phi(x)\hat\Phi(x')\right]|0\rangle \tag{13.7}$$

corresponds to propagation of a neutral particle from the point $x'$ to the point $x$. It is convenient to consider this function in the momentum representation:

$$D(x - x') = \int \tilde D(k)\,e^{-ik(x - x')}\,\frac{d^4k}{(2\pi)^4},$$

where $\tilde D(k)$ is the Fourier amplitude of the function $D(x - x')$. In what follows, we will often omit the tilde when this does not lead to misunderstanding. Note that here the 4-vector $k^\mu = (k_0, \mathbf k)$ is simply a four-dimensional integration variable and does not correspond to a real particle; therefore, there is no requirement $k_0^2 = \mathbf k^2 + m^2$, which is valid for real particles. This situation is sometimes expressed as follows: the 4-vector $k^\mu$ corresponds to a virtual particle, for which the identity $k^\mu k_\mu = m^2$ is not valid.

Let us consider the function $f(x, x')$. We assume that all 4-momenta $p_{1,2,3,4}$ are different; therefore, the operators $\hat a_{3,4}$ commute with the operators $\hat a^+_{1,2}$. As a result, the operators $\hat a^+_{1,2}$ can contract (and give a nonzero contribution) with $\hat\varphi(x)$ and $\hat\varphi(x')$ only. In a similar way, $\hat a_{3,4}$ can contract (and give a nonzero contribution) with $\hat\varphi^+(x)$ and $\hat\varphi^+(x')$ only. As a consequence, the chronological operator $\hat T$ does not matter here and can be omitted.

The further analysis is relatively simple, because what happens here is very similar to the first order. We act as follows: take, for example, the operator $\hat a_2^+$ and pull it to the left. If it reaches the vacuum state $\langle 0|$, its action yields zero. On the way it meets $\hat a_3$ and $\hat a_4$; their commutators with it are zero because, as we agreed, $p_2 \neq p_{3,4}$. So, it necessarily has to meet an operator $\hat a_{\mathbf p}$ with exactly the same momentum as $p_2$. Such an operator may be contained only in the field operator $\hat\varphi(x')$ or $\hat\varphi(x)$.
The field operator is a sum over many momenta, and, among other things, it contains a term with momentum equal to $p_2$. Let us consider the first possibility. As a result of the contraction with the operator $\hat\varphi(x')$ we get the initial plane wave with 4-momentum $p_2$. After that the operator $\hat a_1^+$ can give a nonzero contraction only with $\hat\varphi(x)$, which corresponds to the initial plane wave with 4-momentum $p_1$. The operator $\hat a_3$ is contracted either with $\hat\varphi^+(x)$ (while $\hat a_4$ is contracted with $\hat\varphi^+(x')$) or with $\hat\varphi^+(x')$ (while $\hat a_4$ is contracted with $\hat\varphi^+(x)$):

$$f(x, x') = \frac{1}{2!}\prod_{n=1}^{4}\frac{1}{\sqrt{2\varepsilon_nV}}\left\{\left[e^{-i(p_1 - p_3)x}\,e^{-i(p_2 - p_4)x'} + (p_3 \leftrightarrow p_4)\right] + (x \leftrightarrow x')\right\}.$$

As a result,

$$S^{(2)}_{fi} = \frac{(-ig)^2}{\prod_{n=1}^{4}\sqrt{2\varepsilon_nV}}\int\frac{d^4k}{(2\pi)^4}\,d^4x\,d^4x'\;iD(k)\times\frac{1}{2!}\left\{\left[e^{-i(p_1 + k - p_3)x}\,e^{-i(p_2 - k - p_4)x'} + (p_3 \leftrightarrow p_4)\right] + (x \leftrightarrow x')\right\}.$$

Now the integration over $x$ and $x'$ becomes trivial:

$$\int d^4x\,d^4x'\,\{\ldots\} = (2\pi)^8\,2\left[\delta(p_1 + k - p_3)\,\delta(p_2 - k - p_4) + (p_3 \leftrightarrow p_4)\right]; \tag{13.8}$$

the factor 2 appears here when we take into account the term $(x \leftrightarrow x')$, and it cancels the factor $1/2!$. The two delta functions correspond to conservation of the total 4-momentum at each of the vertices corresponding to the points $x$ and $x'$. After the integration over $k$, the delta function $\delta(p_1 + p_2 - p_3 - p_4)$ appears; it corresponds to conservation of the total 4-momentum in the process. The scattering amplitude of the process is expressed via the sum of two propagators in the momentum representation (see Figs. 21-22):

$$iM^{(2)}_{fi} = (-ig)^2\left[iD(p_2 - p_4) + iD(p_2 - p_3)\right]. \tag{13.9}$$

Figure 21: The process $\pi^-\pi^- \to \pi^-\pi^-$: $\pi^0$ exchange in the t channel

Figure 22: The process $\pi^-\pi^- \to \pi^-\pi^-$: $\pi^0$ exchange in the u channel
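To get a feeling for Eq. (13.9), one can evaluate it numerically. The sketch below is an illustration under assumptions: it takes the momentum-space scalar propagator in the form $D(k) = 1/(k^2 - m_{\pi^0}^2 + i0)$, which is derived in Section 13.3, drops the $i0$ because $t, u \le 0$ in the physical region, and uses arbitrary sample values for the coupling, the masses and the energy. The function name `amplitude` is hypothetical.

```python
def amplitude(sqrt_s, cos_theta, g, m_charged, m_neutral):
    """M_fi for pi- pi- -> pi- pi- from iM = (-ig)^2 [iD(t) + iD(u)].

    Assumes D(k^2) = 1 / (k^2 - m_neutral^2); the i0 term is irrelevant here
    because t and u are non-positive in the physical region of the s channel.
    """
    p_sq = sqrt_s ** 2 / 4 - m_charged ** 2      # CM momentum squared
    t = -2 * p_sq * (1 - cos_theta)              # t = (p2 - p4)^2
    u = -2 * p_sq * (1 + cos_theta)              # u = (p2 - p3)^2

    def propagator(k_sq):
        return 1.0 / (k_sq - m_neutral ** 2)

    return -g * g * (propagator(t) + propagator(u))


# Sample values: energies and masses in GeV, dimensionless toy coupling.
params = dict(sqrt_s=1.0, g=1.0, m_charged=0.1396, m_neutral=0.1350)
forward = amplitude(cos_theta=0.3, **params)
backward = amplitude(cos_theta=-0.3, **params)
print(forward, backward)
```

The two values coincide: the amplitude is symmetric under $t \leftrightarrow u$, i. e. under $\theta \to \pi - \theta$, as it must be for two identical particles in the final state.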
A small note related to these diagrams. Once again I want to emphasize an idea that will always be invisibly present in this course. In fact, the only thing physicists have here is the computation scheme. However, this computation scheme is the basis for a certain ideology that boils down to a qualitative and graphical interpretation of our findings. We get no new information. In fact, we get images that are convenient to memorize and convenient to talk about. Having drawn a Feynman diagram, we immediately get a mental image of the corresponding matrix element of the perturbation theory. Furthermore, there arises an interpretation which is not necessary but very useful. After some experience, we even get an insight into the subsequent calculations.

Virtual particles. The argument of the propagator in the first diagram (Fig. 21) is equal to the 4-momentum of the intermediate neutral particle, $k = p_2 - p_4 = p_3 - p_1$, thus $k^2 = t < 0$. Therefore, this diagram corresponds to the $\pi^0$ exchange in the t channel; analogously, the second diagram (Fig. 22) corresponds to the $\pi^0$ exchange in the u channel. For the initial and final particles the identities $p_n^2 = m^2_{\pi^-}$, $n = 1, \ldots, 4$, are valid. On the contrary, for the intermediate particle $k^2 \neq m^2_{\pi^0}$, and therefore $\varepsilon_{\mathbf k} = \sqrt{\mathbf k^2 + m^2_{\pi^0}} \neq k_0$. Such particles are called virtual particles or particles off the mass shell. The quantity $k^2 - m^2_{\pi^0}$ is called the virtuality of the given intermediate particle. The virtuality characterizes the deviation of a particle from the mass shell $k^2 = m^2_{\pi^0}$. The smaller the virtuality is, the closer the particle is to a real one; the larger the virtuality is, the farther the particle is from a usual real one.

In our case, this is a state of a particle that was produced at one point and disappeared at another point. That took some finite time, and thus the particle is associated with a finite distance it flew. The uncertainty relation tells us that this particle lives for about

$$\tau \sim \frac{1}{\sqrt{|k^2 - m^2_{\pi^0}|}}$$

and flies a distance of about

$$r \sim \frac{1}{\sqrt{|k^2 - m^2_{\pi^0}|}}.$$
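These estimates are written in natural units ($\hbar = c = 1$). To convert them into ordinary units one restores $\hbar c \approx 197.327$ MeV·fm and $\hbar \approx 6.582\cdot10^{-22}$ MeV·s. The sketch below is only an order-of-magnitude illustration: the sample values of $k^2$ and the function names are assumptions, and the estimates themselves hold up to factors of order unity.

```python
import math

HBAR_C_MEV_FM = 197.327   # hbar * c in MeV * fm
HBAR_MEV_S = 6.582e-22    # hbar in MeV * s


def virtuality(k_sq_mev2, m_mev):
    """Virtuality k^2 - m^2 in MeV^2."""
    return k_sq_mev2 - m_mev ** 2


def range_fm(k_sq_mev2, m_mev):
    """Estimate r ~ hbar*c / sqrt(|k^2 - m^2|) in femtometres."""
    v = abs(virtuality(k_sq_mev2, m_mev))
    return math.inf if v == 0 else HBAR_C_MEV_FM / math.sqrt(v)


def lifetime_s(k_sq_mev2, m_mev):
    """Estimate tau ~ hbar / sqrt(|k^2 - m^2|) in seconds."""
    v = abs(virtuality(k_sq_mev2, m_mev))
    return math.inf if v == 0 else HBAR_MEV_S / math.sqrt(v)


# Neutral pion (m ~ 135 MeV) exchanged with k^2 = -(100 MeV)^2.
r_pion = range_fm(-(100.0 ** 2), 135.0)
tau_pion = lifetime_s(-(100.0 ** 2), 135.0)

# Massless exchange with k^2 = -10 GeV^2, typical of deep inelastic scattering.
r_dis = range_fm(-1.0e7, 0.0)

print(r_pion, tau_pion, r_dis)
```

The first case gives a distance of about 1 fm, the second about 0.06 fm, well below the proton size; a particle exactly on the mass shell formally gets an infinite range.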
The larger the virtuality is, the smaller the distances such a particle can probe, see the experiments on deep inelastic electron-proton scattering. On the other hand, if virtual particles with very small virtuality participate in a process, then we should take into account effects related to large distances. Do not think of this as theorists' fiction. I'd like to support my words with an example from the practice of BINP.

Example of the MD effect. It concerns a slightly more complex problem, which is nevertheless directly related to our subject. I'm talking about a process that was studied on colliding beams when VEPP-4 was not yet VEPP-4M, i. e. the process of single bremsstrahlung on colliding beams: $e^+e^- \to e^+e^-\gamma$. The cross section of this process has been known since 1934, from the works by Bethe and Heitler. It seemed unlikely that anything new or interesting could be found in this process. The cross section was measured many times under normal conditions in experiments on a fixed target. Moreover, it proved to be very useful for experiments with colliding beams. On VEP-1 and VEPP-2, this process was sometimes used for determination of the luminosity. The cross section is large and easy to calculate; there are a lot of events; the photon is easy to measure in a detector, so one can calculate the luminosity $L$ from the measured number of events per unit time: $d\dot N = L\,d\sigma$. All was fine until experiments at the MD detector at VEPP-4 started in 1981-1983 to examine the process with better accuracy. The first colliders are known to have worked with rather poor accuracy. What was revealed? The cross section measured on colliding beams at low photon energies turned out to differ from the cross section well known since 1934. The difference was about 30 percent, which cannot be attributed to radiative corrections and the like.

The question was whether it was an artifact related to something unknown in the operation of the accelerator or the detector, or something fundamentally new. Yuriy Tikhonov researched this issue and put forward a hypothesis which was then confirmed. The essence was that in this process a virtual particle (a photon) can live a very long time and fly over a long distance. The corresponding diagram looks like this (Fig. 23).
Figure 23: Bremsstrahlung at $e^-e^+$ collisions

The electron and the positron exchange a virtual photon. This would be the usual scattering, which we will soon be able to calculate. Photon emission is also possible; for example, the electron emits a photon. As regards the intermediate particle, it is a virtual one, as described above. Its virtuality $(p_2 - p_4)^2$ may be very small if the final photon is detected in the low energy range. This means that this virtual photon can fly over very large distances.

Among other things, this means the following. Let a positron beam fly towards a beam of electrons, and let some individual positron interact with some individual electron. Their interaction can be considered as follows: the electromagnetic field flying beside the positron is a set of virtual photons that went far away from the positron. They go so far that the transverse disc of these photons, as estimated from the concept of virtuality, may reach 5 cm in size. The transverse size of the colliding beams, now as in the past, is about 30 microns, i. e. 3 orders of magnitude smaller than the distance our photons run away from the positron. Thus these photons cannot meet an electron, which results in a decrease in the cross section of the process under calculation.

An adequate description of such processes requires changes in our calculation scheme. The reason is as follows. In our scheme we have a plane wave corresponding to the electron, i. e. the intensity is uniform in the transverse direction. Then we have a plane wave of positrons. In reality there is nothing of the kind: we have a limited wave packet instead of a plane wave. The concept of a virtual particle helped to establish this fact. The discussed effect, which reduces the cross section, is called the MD effect or the effect of limitation of impact parameters.

Why was this effect from quantum electrodynamics so interesting? A significant reduction in the cross section affects the lifetime of particles in the beam. Indeed, as soon as an electron emits a photon and loses more than about one per cent of its energy, it immediately drops out of the acceleration regime and leaves the beam.

If this process is one of the major processes affecting the beam lifetime, which is common, taking the effect into account increases the calculated beam lifetime. The situation at the LEP collider was particularly interesting. When LEP started operation (50 by 50 GeV, electrons and positrons) and the Z boson was going to be studied with a high degree of precision, about a thousand bosons were to be produced per day. So, they wanted to know everything about the accelerator. Among other things, a program was written for determination of the beam lifetime. All known data were put into it. Comparison of the calculations with experiments led to the surprising discovery that the experimental beam lifetime exceeded the calculated one by 40 %. There was no explanation for a long time, until they learned about the MD effect discovered in Novosibirsk. When it had been taken into account, everything started making sense.

Summing up, we can say that what we got here ultimately reduces to two diagrams, in which $\pi^-$ is scattered on $\pi^-$ with exchange of $\pi^0$ either in the t channel or in the u channel.

Let us now consider the process $\pi^-\pi^+ \to \pi^-\pi^+$, for which the corresponding diagrams are given in Figs. 24-25. The scattering amplitude of this process,

$$iM^{(2)}_{fi} = (-ig)^2\left[iD\left(p'_- - p_-\right) + iD\left(p_- + p_+\right)\right], \tag{13.10}$$

corresponds to exchange of $\pi^0$ either in the t channel or in the s channel.
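Figure 24: The process $\pi^-\pi^+ \to \pi^-\pi^+$: $k^2 < 0$, $\pi^0$ exchange in the t channel

Figure 25: The process $\pi^-\pi^+ \to \pi^-\pi^+$: $k^2 > 0$, $\pi^0$ exchange in the s channel

13.3. Propagator of the scalar particle

Using the properties of the chronological operator $\hat T$,

$$\hat T\left[\hat\Phi(x)\hat\Phi(x')\right] = \begin{cases}\hat\Phi(x)\hat\Phi(x') & \text{at } t > t',\\ \hat\Phi(x')\hat\Phi(x) & \text{at } t' > t,\end{cases} \tag{13.11}$$

and the explicit forms of the field operators,

$$\hat\Phi(x) = \sum_{\mathbf k}\left(\hat c_{\mathbf k}\frac{e^{-ikx}}{\sqrt{2\varepsilon_{\mathbf k}V}} + \text{h. c.}\right),\qquad \hat\Phi(x') = \sum_{\mathbf k'}\left(\hat c_{\mathbf k'}\frac{e^{-ik'x'}}{\sqrt{2\varepsilon_{\mathbf k'}V}} + \text{h. c.}\right), \tag{13.12}$$

we rewrite the propagator in the form

$$iD(x - x') = \sum_{\mathbf k,\mathbf k'}\frac{1}{\sqrt{2\varepsilon_{\mathbf k}V\,2\varepsilon_{\mathbf k'}V}}\left[\theta(t - t')\,\langle 0|\hat c_{\mathbf k}\hat c^+_{\mathbf k'}|0\rangle\,e^{-ikx + ik'x'} + \theta(t' - t)\,\langle 0|\hat c_{\mathbf k'}\hat c^+_{\mathbf k}|0\rangle\,e^{-ik'x' + ikx}\right].$$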
The first term in square brackets corresponds to production of the neutral particle at the point $x'$ and its annihilation at the point $x$, while the second term corresponds to production of the neutral particle at the point $x$ and its annihilation at the point $x'$. Since $\langle 0|\hat c_{\mathbf k}\hat c^+_{\mathbf k'}|0\rangle = \delta_{\mathbf k,\mathbf k'}$, the double sum reduces to a single one. Moreover, to simplify the following calculations, we denote the difference $(x - x')$ by the single letter $x$ and restore the difference when necessary. As a result,

$$iD(x) = \frac{1}{V}\sum_{\mathbf k}\frac{1}{2\varepsilon_{\mathbf k}}\left[\theta(t)\,e^{-i\varepsilon_{\mathbf k}t + i\mathbf{kr}} + \theta(-t)\,e^{+i\varepsilon_{\mathbf k}t - i\mathbf{kr}}\right]. \tag{13.13}$$

If we make the replacement $\mathbf k \to -\mathbf k$ in the second term and pass from the sum to the integral over $\mathbf k$,

$$\frac{1}{V}\sum_{\mathbf k} \to \int\frac{d^3k}{(2\pi)^3},$$

we finally get

$$iD(x) = \int\frac{d^3k}{2\varepsilon_{\mathbf k}(2\pi)^3}\,e^{-i\varepsilon_{\mathbf k}|t| + i\mathbf{kr}}, \tag{13.14}$$
where the expression $|t|$ arises from $\theta(\pm t)$.

We could be satisfied with this: although this expression looks slightly wild and has no manifestly relativistic covariant form, we still know that when we perform the integration, we will get a relativistic covariant result, because we used relativistic covariance from the very beginning. Since the final result will be relativistic covariant, we know in advance what argument the function may depend on. Obviously, it can only be $x^\mu x_\mu \equiv x^2$. We will get this eventually, although it is not seen from the above expression.

This expression, however, is very instructive. You remember that when we were unwinding this, there was a sum over plane wave states. Those were normal and correct physical states with the 4-momentum $k^\mu = (\varepsilon_{\mathbf k}, \mathbf k)$ and the normal energy $\varepsilon_{\mathbf k} = \sqrt{\mathbf k^2 + m^2}$. This is good because it is customary. And here “the frog begins jumping into the water”, as my first professor Rumer used to say. Now we will try to get rid of this customary and standard image and introduce the concept of virtual particles.

Once again, we could have done nothing of this sort. After all, we can calculate this integral and get the answer. This answer will be, however, rather unpleasant. I'll give it in the end; it is a function of $x^2$ and includes delta functions, theta functions, Bessel functions, Neumann functions, Hankel functions, and so on. And yet, when we perform all the integrations (over $\mathbf k$ and over $x$ and $x'$), we finally get very simple expressions. However, if we start from the explicit form of the integral, we won't see them. Actually, it is convenient to take the Fourier transform of this function. To do this, it is better to represent this integral in an explicitly covariant form. To this end, we have to overcome a stereotype, according to which the summation or integration is performed over states of ordinary particles with the usual components of the 4-momentum $k^\mu$. Now we will try to introduce a new kind of particles, virtual particles, which have none of this.

First, let's do this in a formal way: this expression, besides the wonderful exponent $e^{i\mathbf{kr}}$, includes the rather unpleasant quantity $e^{-i\varepsilon_{\mathbf k}|t|}/(2\varepsilon_{\mathbf k})$.

I want to proclaim that this quantity can be formally represented as an integral with respect to some variable $k_0$ (which has nothing to do with the ordinary energy $\varepsilon_{\mathbf k}$):

$$\frac{e^{-i\varepsilon_{\mathbf k}|t|}}{2\varepsilon_{\mathbf k}} = J,\qquad J = i\int\limits_{-\infty}^{\infty}\frac{dk_0}{2\pi}\,\frac{e^{-ik_0t}}{k_0^2 - \varepsilon_{\mathbf k}^2 + i0}. \tag{13.15}$$
But you need to pay attention to the following obstacle. When we integrate with respect to $k_0$, we may stumble over the poles corresponding to the identity $k_0^2 - \varepsilon_{\mathbf k}^2 = 0$, and we have to indicate how we treat them. Feynman's prescription corresponds to adding a small imaginary quantity, denoted as $i0$; here 0 means an infinitesimal positive value. We are quite able to verify whether or not Eq. (13.15) is true. We take this integral in different ways depending on the sign of $t$. If we draw the complex plane of $k_0$ (see Fig. 26), we can see that the integrand has poles at the two points where the denominator is equal to 0, i. e. at the points

$$k_0 = \pm\sqrt{\varepsilon_{\mathbf k}^2 - i0} = \pm\varepsilon_{\mathbf k} \mp i0.$$

If $t > 0$, we close the contour in the lower half plane:

$$J = i\,(-2\pi i)\,\frac{1}{2\pi}\,\frac{e^{-i\varepsilon_{\mathbf k}t}}{2\varepsilon_{\mathbf k}} = \frac{e^{-i\varepsilon_{\mathbf k}t}}{2\varepsilon_{\mathbf k}}.$$

In a similar way, if $t < 0$ we close the contour in the upper half plane:

$$J = i\,(+2\pi i)\,\frac{1}{2\pi}\,\frac{e^{i\varepsilon_{\mathbf k}t}}{(-2\varepsilon_{\mathbf k})} = \frac{e^{-i\varepsilon_{\mathbf k}|t|}}{2\varepsilon_{\mathbf k}}.$$

As a result,

$$D(x) = \int\frac{e^{-ikx}}{k^2 - m^2 + i0}\,\frac{d^4k}{(2\pi)^4},\qquad kx = k_0t - \mathbf{kr},\qquad k_0 \neq \varepsilon_{\mathbf k} = \sqrt{\mathbf k^2 + m^2},\qquad m \equiv m_{\pi^0}, \tag{13.16}$$

therefore, its Fourier amplitude is

$$\tilde D(k) = \frac{1}{k^2 - m^2 + i0}. \tag{13.17}$$
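The representation (13.15) can also be checked by brute force. The sketch below is a numerical illustration under assumptions: the infinitesimal $i0$ is replaced by a small but finite $i\eta$, the integral over $k_0$ is cut off at a large $|k_0|$ and evaluated by the trapezoid rule, and the result is compared with the residue answer $e^{-iw|t|}/(2w)$, $w = \sqrt{\varepsilon_{\mathbf k}^2 - i\eta}$, which tends to $e^{-i\varepsilon_{\mathbf k}|t|}/(2\varepsilon_{\mathbf k})$ as $\eta \to 0$. The function names, the cutoff and the step are illustrative choices.

```python
import cmath
import math


def j_numeric(t, eps, eta, cutoff=200.0, step=0.005):
    """Trapezoid estimate of J = i * int dk0/(2 pi) e^{-i k0 t} / (k0^2 - eps^2 + i eta)."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for i in range(n + 1):
        k0 = -cutoff + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k0 * t) / (k0 * k0 - eps * eps + 1j * eta)
    return 1j * total * step / (2 * math.pi)


def j_residue(t, eps, eta):
    """Residue result: poles at k0 = +w (lower half plane) and k0 = -w (upper)."""
    w = cmath.sqrt(eps * eps - 1j * eta)   # Re w > 0, Im w < 0
    return cmath.exp(-1j * w * abs(t)) / (2 * w)


eps, eta = 1.0, 0.2
diff_positive_t = abs(j_numeric(1.5, eps, eta) - j_residue(1.5, eps, eta))
diff_negative_t = abs(j_numeric(-1.5, eps, eta) - j_residue(-1.5, eps, eta))

# For eta -> 0 the residue result approaches e^{-i eps |t|} / (2 eps).
limit_diff = abs(j_residue(1.5, eps, 1e-9) - cmath.exp(-1.5j * eps) / (2 * eps))

print(diff_positive_t, diff_negative_t, limit_diff)
```

All three printed differences are small, for both signs of $t$: the Feynman shift of the poles is exactly what turns $e^{-ik_0t}$ into $e^{-i\varepsilon_{\mathbf k}|t|}$.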
All this looks like pure mathematics, and we could have stopped here. Now let's try to interpret this mathematics. Look, we have integration with respect to the four components $(k_0, \mathbf k)$, and in the exponent we have a quantity that corresponds to the product of the four-vector $(k_0, \mathbf k)$ and the four-vector $(t, \mathbf r)$. Moreover, we have the four-vector $k$ squared in the denominator. As a result, we need to declare that the variable $k_0$ is the zero component of a new four-vector $k^\mu = (k_0, \mathbf k)$. Then everything finally makes sense: this is an integral over the four components of the 4-vector $k^\mu$, which obviously looks relativistic covariant.

Figure 26: The complex plane of $k_0$

On the other hand, this beautiful simplicity requires a lot of new things. Why? Because now we have to change our ideology and say that the new summation (or integration) over intermediate states of the neutral scalar particle runs over states in which, generally speaking, $k^2$ is not equal to $m^2$ and $k_0$ is not equal to the energy of a normal particle. Here $k_0$ can be both positive and negative. We have to get used to this. Particles whose squared four-momentum is not equal to the square of the mass are referred to as virtual particles or particles off the mass shell. Since

$$\left(\hat p_\mu\hat p^\mu - m^2\right)D(x) = \int\left(k^2 - m^2\right)\frac{e^{-ikx}}{k^2 - m^2 + i0}\,\frac{d^4k}{(2\pi)^4} = \delta(x), \tag{13.18}$$

the propagator $D(x)$ is the Green function of the Klein-Fock-Gordon (KFG) equation. The explicit form of $D(x)$ can be found in the book Introduction to the Theory of Quantized Fields by Bogolyubov and Shirkov [2] (Appendix 2):

$$D(x) = -\frac{\delta(\lambda)}{4\pi} + \frac{m}{8\pi\sqrt{\lambda}}\,\theta(\lambda)\left[J_1\!\left(\sqrt{m^2\lambda}\right) - iN_1\!\left(\sqrt{m^2\lambda}\right)\right] - \frac{im}{4\pi^2\sqrt{-\lambda}}\,\theta(-\lambda)\,K_1\!\left(\sqrt{-m^2\lambda}\right),$$

where $\lambda = x^2 = t^2 - \mathbf r^2$ and $J_1(z)$, $N_1(z)$, and $K_1(z)$ are the Bessel function, the Neumann function, and the Hankel function of an imaginary argument.
From this explicit expression one can derive asymptotics at |λ| ≫ 1/m2 where the propagator is rapidly decreases: √ 2 13/4 e− m |λ| , at λ < 0 |λ| D(x) = 1 , at λ > 0 λ3/4 In the neighborhood of the light cone (at |λ| ≪ 1/m2 ) it look like this √ m2 |λ| i im2 m2 δ(λ) + 2 − 2 ln + θ(λ). D(x) = − 4π 4π λ 8π 2 16π
As you can see, this propagator has a whole bunch of singularities on the light cone. In the second order of perturbation theory we manage to calculate the scattering amplitude to the end without any trouble. If we turn to higher orders, we may meet products of these singular functions. As you know, once we meet a product of generalized functions, we need to complete its definition. For example, we know that $\delta(x)$ is a functional that sets the rule for integrating it with some other, smoother, function. But what is the product of two delta functions at one point? This object is ill-defined and leads to infinities. So, this is the source of the difficulties at higher orders. We need to complete the definition of the product of these functions in the vicinity of the light cone. Ultimately, this reduces to renormalizations.
13.4. Processes π⁰π⁻ → π⁰π⁻ and π⁺π⁻ → π⁰π⁰

Let us consider the process $\pi^0\pi^- \to \pi^0\pi^-$ (Fig. 27) with the following initial and final states:

$$|i\rangle = \hat{a}_1^+ \hat{c}_1^+ |0\rangle, \qquad |f\rangle = \hat{a}_2^+ \hat{c}_2^+ |0\rangle.$$

Figure 27: The process π⁰π⁻ → π⁰π⁻

The matrix element of this process is

$$S^{(2)}_{fi} = \frac{(-ig)^2}{2!} \int d^4x\, d^4x'\, \langle 0|\, \hat{c}_2\, \hat{T}\!\left[\hat{\Phi}(x)\hat{\Phi}(x')\right] \hat{c}_1^+ |0\rangle \cdot \langle 0|\, \hat{a}_2\, \hat{T}\!\left[\hat{\varphi}^+(x)\hat{\varphi}(x)\,\hat{\varphi}^+(x')\hat{\varphi}(x')\right] \hat{a}_1^+ |0\rangle. \tag{13.19}$$
In the momentum representation the Feynman diagrams of Figs. 28–29 give the contribution to $iM^{(2)}_{fi}$.

Figure 28: The process π⁰π⁻ → π⁰π⁻: π⁻ exchange in the u channel

Figure 29: The process π⁰π⁻ → π⁰π⁻: π⁻ exchange in the s channel

Here $D(p)$ is the Fourier amplitude of the propagator $D(x)$ of the charged scalar particle:

$$iD(x - x') = \langle 0|\, \hat{T}\!\left[\hat{\varphi}(x)\hat{\varphi}^+(x')\right] |0\rangle = \sum_{\mathbf{p},\mathbf{p}'} \frac{1}{\sqrt{2\varepsilon_{\mathbf{p}} V}\sqrt{2\varepsilon_{\mathbf{p}'} V}} \left[\theta(t - t')\, \langle 0|\, \hat{a}_{\mathbf{p}} \hat{a}^+_{\mathbf{p}'} |0\rangle\, e^{-ipx + ip'x'} + \theta(t' - t)\, \langle 0|\, \hat{b}_{\mathbf{p}'} \hat{b}^+_{\mathbf{p}} |0\rangle\, e^{-ip'x' + ipx}\right].$$

The first term in square brackets corresponds to the production of the charged particle at the point $x'$ and its annihilation at the point $x$, while the second term corresponds to the antiparticle propagating in the opposite direction. The final result is the same as for the propagator of a neutral particle:

$$D(x) = \int \frac{d^4p}{(2\pi)^4}\, \frac{e^{-ipx}}{p^2 - \mu^2 + i0}, \qquad \mu \equiv m_{\pi^-}.$$

As a result, the scattering amplitude reads:

$$M^{(2)}_{fi} = -g^2 \left[\frac{1}{(p_1 - k_2)^2 - \mu^2 + i0} + \frac{1}{(p_1 + k_1)^2 - \mu^2 + i0}\right]. \tag{13.21}$$
Analogously, the process $\pi^-\pi^+ \to \pi^0\pi^0$ is described by the diagrams of Figs. 30–31:

Figure 30: The process π⁻π⁺ → π⁰π⁰: π⁻ exchange in the t channel

Figure 31: The process π⁻π⁺ → π⁰π⁰: π⁻ exchange in the u channel

$$M^{(2)}_{fi} = -g^2 \left[\frac{1}{(p_- - k_1)^2 - \mu^2} + \frac{1}{(p_- - k_2)^2 - \mu^2}\right]. \tag{13.20}$$
§ 14. The second order of the perturbation theory in QED. The photon propagator

In QED, the time-ordering procedure in $\hat{S}^{(2)}$ can be performed separately for the photon and the electron-positron operators:

$$\hat{S}^{(2)} = \frac{(-ie)^2}{2!} \int d^4x\, d^4x'\; \hat{T}\!\left[\hat{\bar{\Psi}}(x)\gamma_\mu \hat{\Psi}(x)\; \hat{\bar{\Psi}}(x')\gamma_\nu \hat{\Psi}(x')\right] \hat{T}\!\left[\hat{A}^\mu(x)\hat{A}^\nu(x')\right]. \tag{14.1}$$
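14.1. Scattering of electrons

At the first stage, we consider the process of electron-electron scattering, which was observed at the first accelerator, VEP-1. To calculate the cross section of the process $e^-e^- \to e^-e^-$, we perform the same steps as before in section 13.2:

$$|i\rangle = \hat{a}_2^+ \hat{a}_1^+ |0\rangle, \qquad |f\rangle = \hat{a}_4^+ \hat{a}_3^+ |0\rangle, \qquad \hat{a}_i^+ \equiv \hat{a}^+_{\mathbf{p}_i \sigma_i}, \tag{14.2}$$

$$S^{(2)}_{fi} = (-ie)^2 \int d^4x\, d^4x'\; iD^{\mu\nu}(x - x')\, f_{\mu\nu}(x, x'), \tag{14.3}$$

where

$$f_{\mu\nu}(x, x') = \frac{1}{2!}\, \langle 0|\, \hat{a}_3 \hat{a}_4\, \hat{T}\!\left[\hat{\bar{\Psi}}(x)\gamma_\mu \hat{\Psi}(x)\; \hat{\bar{\Psi}}(x')\gamma_\nu \hat{\Psi}(x')\right] \hat{a}_2^+ \hat{a}_1^+ |0\rangle, \tag{14.4}$$

and

$$iD^{\mu\nu}(x - x') = \langle 0|\, \hat{T}\!\left[\hat{A}^\mu(x)\hat{A}^\nu(x')\right] |0\rangle \tag{14.5}$$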
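is the photon propagator. When calculating $f_{\mu\nu}$, we act as in the scalar case (see Eqs. (13.8)–(13.9)), taking into account the anticommutation of the fermion operators and the spinor structure of the field:

$$\hat{\Psi}(x) = \sum_{\mathbf{p}\sigma} \hat{a}_{\mathbf{p}\sigma}\, u_{\mathbf{p}\sigma}\, \frac{e^{-ipx}}{\sqrt{2\varepsilon V}} + \ldots, \qquad \hat{\bar{\Psi}}(x) = \sum_{\mathbf{p}\sigma} \hat{a}^+_{\mathbf{p}\sigma}\, \bar{u}_{\mathbf{p}\sigma}\, \frac{e^{+ipx}}{\sqrt{2\varepsilon V}} + \ldots.$$

Then

$$f_{\mu\nu}(x, x') = \frac{1}{2!} \left\{ \left[\text{Fig. 32} - \text{Fig. 33}\right] + (x \leftrightarrow x') \right\} = \frac{1}{2!} \prod_{n=1}^{4} \frac{1}{\sqrt{2\varepsilon_n V}} \left\{ \left[ (\bar{u}_3 \gamma_\mu u_1)(\bar{u}_4 \gamma_\nu u_2)\, e^{-i(p_1 - p_3)x}\, e^{-i(p_2 - p_4)x'} - (p_3 \leftrightarrow p_4) \right] + (x \leftrightarrow x') \right\}, \tag{14.6}$$

where $u_n = u_{\mathbf{p}_n \sigma_n}$, $n = 1, \ldots, 4$.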
Figure 32: Contractions in fµν
Figure 33: Another variant of contractions in fµν
The subsequent integration over $x$ and $x'$ is standard; the result corresponds to the Feynman diagrams of Figs. 34–35 and reads (compare with Eq. (13.9)):

$$M^{(2)}_{fi} = (-ie)^2 \left[ (\bar{u}_3 \gamma_\mu u_1)\, D^{\mu\nu}(p_3 - p_1)\, (\bar{u}_4 \gamma_\nu u_2) - (\bar{u}_4 \gamma_\mu u_1)\, D^{\mu\nu}(p_4 - p_1)\, (\bar{u}_3 \gamma_\nu u_2) \right]. \tag{14.7}$$
Figure 34: Diagram with γ exchange in the t channel for the e− e− scattering
Figure 35: Diagram with γ exchange in the u channel for the e− e− scattering
Problem 14.1. Find the scattering amplitude for the process of electron-positron scattering e− e+ → e− e+ , which was observed at the accelerator VEPP-2.
14.2. The photon propagator

The photon propagator was defined above:

$$iD^{\mu\nu}(x - x') = \langle 0|\, \hat{T}\!\left[\hat{A}^\mu(x)\hat{A}^\nu(x')\right] |0\rangle. \tag{14.8}$$
We need to calculate this expression, which is much easier if we first establish the structure of the result in an arbitrary gauge. Up to now, we have worked only in the special Coulomb gauge, so a general analysis of the situation can help us. Here we deal with the symmetric second-rank tensor $D^{\mu\nu}(x)$, which depends on the 4-vector $x^\mu$. First, we can use the combination $g^{\mu\nu}$ multiplied by some function $D(x^2)$, which is symmetric and relativistically covariant. Second, we can use the combination $x^\mu x^\nu$ multiplied by some function $D^{(2)}(x^2)$, which is also symmetric and relativistically covariant. This combination can also be presented in the form $\partial^\mu \partial^\nu D^{(l)}(x^2)$. Therefore, the general form of the photon propagator reads

$$D^{\mu\nu}(x) = g^{\mu\nu} D\!\left(x^2\right) - \partial^\mu \partial^\nu D^{(l)}\!\left(x^2\right). \tag{14.9}$$

In the momentum representation it has the form:

$$\tilde{D}^{\mu\nu}(k) = g^{\mu\nu} \tilde{D}\!\left(k^2\right) + k^\mu k^\nu \tilde{D}^{(l)}\!\left(k^2\right), \tag{14.10}$$

where $\tilde{D}(k^2)$ and $\tilde{D}^{(l)}(k^2)$ are the Fourier transforms of the functions $D(x^2)$ and $D^{(l)}(x^2)$. We use the special index "(l)", which means "longitudinal", because the corresponding term contains components along the vector $k^\mu$. An important remark is in order here: the final physical result does not depend on this function, due to gauge invariance. According to this invariance, if we replace the potential $A_\mu(x)$ with another potential, equal to the previous one minus the derivative of an arbitrary function of $x$,

$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) - \partial_\mu f(x), \tag{14.11}$$
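the strengths of the electric and magnetic fields will not change, and therefore the physical result will not change either. This means that we can use an arbitrary function $\tilde{D}^{(l)}$ and nothing will depend on it. This fact is very useful: it means that we need to find only one invariant function $\tilde{D}(k^2)$ and nothing else. We can find this function in any gauge and then put it into the general form (14.10). To do this, let us use the Coulomb gauge, in which $\hat{A}^0 = 0$ and

$$\hat{\mathbf{A}}(x) = \sum_{\mathbf{k}\lambda} \frac{\sqrt{4\pi}}{\sqrt{2\omega_{\mathbf{k}} V}} \left( \hat{c}_{\mathbf{k}\lambda}\, \mathbf{e}_{\mathbf{k}\lambda}\, e^{-ikx} + \hat{c}^+_{\mathbf{k}\lambda}\, \mathbf{e}^*_{\mathbf{k}\lambda}\, e^{ikx} \right), \qquad kx = \omega_{\mathbf{k}} t - \mathbf{k}\mathbf{r}.$$

The important differences from the scalar field are: (i) the factor $\sqrt{4\pi}\, \mathbf{e}_{\mathbf{k}\lambda}$; (ii) a different dependence of the energy $\omega$ on the momentum $\mathbf{k}$, namely $\omega_{\mathbf{k}} = |\mathbf{k}|$. Since

$$\langle 0|\, \hat{c}_{\mathbf{k}\lambda} \hat{c}^+_{\mathbf{k}'\lambda'} |0\rangle = \delta_{\mathbf{k}\mathbf{k}'}\, \delta_{\lambda\lambda'}, \tag{14.12}$$

then, taking into account these differences and repeating the calculations of §13.3, we obtain

$$D^{mn}(x) = \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{-ikx}}{k^2 + i0}\; 4\pi \sum_\lambda (\mathbf{e}_{\mathbf{k}\lambda})^m (\mathbf{e}^*_{\mathbf{k}\lambda})^n, \tag{14.13}$$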
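where $m, n = 1, 2, 3$. Using the completeness of the vectors $\mathbf{e}_{\mathbf{k}\lambda}$,

$$\sum_\lambda (\mathbf{e}_{\mathbf{k}\lambda})^m (\mathbf{e}^*_{\mathbf{k}\lambda})^n = \delta^{mn} - \frac{k^n k^m}{\mathbf{k}^2}, \qquad \delta^{mn} = -g^{mn}, \tag{14.14}$$

we obtain (in the Coulomb gauge)

$$\tilde{D}^{mn}(k) = g^{mn} \tilde{D}\!\left(k^2\right) + k^m k^n \tilde{D}^{(l)}\!\left(k^2\right) = \frac{4\pi}{k^2 + i0} \left( \delta^{mn} - \frac{k^n k^m}{\mathbf{k}^2} \right), \tag{14.15}$$

from which it follows that

$$\tilde{D}\!\left(k^2\right) = \frac{-4\pi}{k^2 + i0}. \tag{14.16}$$

As a result, the general expression for the photon propagator reads

$$\tilde{D}^{\mu\nu}(k) = \frac{-4\pi}{k^2 + i0}\, g^{\mu\nu} + k^\mu k^\nu \tilde{D}^{(l)}\!\left(k^2\right). \tag{14.17}$$

In what follows we will use the Feynman gauge with $\tilde{D}^{(l)}(k^2) = 0$:

$$\tilde{D}^{\mu\nu}(k) = \frac{-4\pi}{k^2 + i0}\, g^{\mu\nu}. \tag{14.18}$$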
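The completeness relation (14.14) is easy to check numerically. The sketch below is only an illustration under stated assumptions: it uses real (linear) polarization vectors, builds them by a Gram-Schmidt step from an arbitrary helper axis, and the function names are hypothetical, not part of the lectures.

```python
import math

def transverse_polarizations(k):
    """Two real unit vectors orthogonal to k and to each other (one possible choice)."""
    norm_k = math.sqrt(sum(c * c for c in k))
    n = [c / norm_k for c in k]
    # Helper axis that is not (nearly) parallel to k.
    helper = [1.0, 0.0, 0.0] if abs(n[0]) < 0.9 else [0.0, 1.0, 0.0]
    overlap = sum(h * c for h, c in zip(helper, n))
    e1 = [h - overlap * c for h, c in zip(helper, n)]
    norm_e1 = math.sqrt(sum(c * c for c in e1))
    e1 = [c / norm_e1 for c in e1]
    # e2 = n x e1 completes the right-handed orthonormal triad (e1, e2, n).
    e2 = [n[1] * e1[2] - n[2] * e1[1],
          n[2] * e1[0] - n[0] * e1[2],
          n[0] * e1[1] - n[1] * e1[0]]
    return e1, e2

def polarization_sum(k):
    """Left-hand side of Eq. (14.14) for real polarization vectors."""
    e1, e2 = transverse_polarizations(k)
    return [[e1[m] * e1[n] + e2[m] * e2[n] for n in range(3)] for m in range(3)]

def transverse_projector(k):
    """Right-hand side of Eq. (14.14): delta^{mn} - k^m k^n / k^2."""
    k_squared = sum(c * c for c in k)
    return [[(1.0 if m == n else 0.0) - k[m] * k[n] / k_squared for n in range(3)]
            for m in range(3)]

def max_difference(a, b):
    return max(abs(a[m][n] - b[m][n]) for m in range(3) for n in range(3))

k = [1.0, 2.0, 3.0]
print(max_difference(polarization_sum(k), transverse_projector(k)))
```

The printed difference should be at the level of rounding errors for any nonzero $\mathbf{k}$, since the triad $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{k}/|\mathbf{k}|)$ is orthonormal.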
In other problems there may be other options. In some problems the so-called Landau gauge is used,

$$\tilde{D}^{\mu\nu}(k) = \frac{-4\pi}{k^2 + i0} \left( g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2} \right),$$

in which $\tilde{D}^{(l)}(k^2)$ is different from 0. In this gauge the propagator is orthogonal to the 4-momentum: $k_\mu \tilde{D}^{\mu\nu}(k) = 0$ and $k_\nu \tilde{D}^{\mu\nu}(k) = 0$.

Now we can present the scattering amplitude (14.7) for the process $e^-e^- \to e^-e^-$ in the final form (see Figs. 34–35)

$$M^{(2)}_{fi} = 4\pi\alpha \left[ \frac{(\bar{u}_3 \gamma_\mu u_1)(\bar{u}_4 \gamma^\mu u_2)}{t} - \frac{(\bar{u}_4 \gamma_\mu u_1)(\bar{u}_3 \gamma^\mu u_2)}{u} \right], \tag{14.19}$$

where $s = (p_1 + p_2)^2$, $t = (p_3 - p_1)^2$ and $u = (p_4 - p_1)^2$. The further detailed calculations can be found in §81 of Quantum Electrodynamics by Berestetskii, Lifshitz and Pitaevskii [1]. The final result for the differential cross section in the centre-of-mass reference frame and in the ultrarelativistic limit reads (C. Møller, 1932)

$$\frac{d\sigma_{e^-e^- \to e^-e^-}}{d\Omega} = \frac{\alpha^2}{s}\, \frac{\left(3 + \cos^2\theta\right)^2}{\sin^4\theta} \quad \text{at } s \gg m^2. \tag{14.19}$$

Analogously, the differential cross section for the process $e^+e^- \to e^+e^-$ reads (H. Bhabha, 1936)

$$\frac{d\sigma_{e^+e^- \to e^+e^-}}{d\Omega} = \cos^4(\theta/2)\, \frac{d\sigma_{e^-e^- \to e^-e^-}}{d\Omega} \quad \text{at } s \gg m^2. \tag{14.20}$$

Now that we have obtained everything related to the photon propagator, we can start exploring a variety of processes. First, we consider a simple process that helps us to understand the important relation between the well-known Coulomb law for the interaction of charged particles and the invariant perturbation theory developed here.
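The orthogonality property of the Landau-gauge propagator quoted in this section can be verified directly for a sample off-shell momentum. This is a minimal sketch, assuming the metric signature $(+,-,-,-)$, dropping the $i0$ prescription (the chosen $k$ has $k^2 \neq 0$), and using hypothetical function names.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g^{mu nu}, signature (+,-,-,-)

def k_squared(k):
    return sum(METRIC[m] * k[m] * k[m] for m in range(4))

def feynman_propagator(k):
    """D^{mu nu}(k) = -4 pi g^{mu nu} / k^2 for off-shell k (i0 dropped)."""
    k2 = k_squared(k)
    return [[-4 * math.pi / k2 * (METRIC[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]

def landau_propagator(k):
    """D^{mu nu}(k) = -4 pi / k^2 * (g^{mu nu} - k^mu k^nu / k^2) for off-shell k."""
    k2 = k_squared(k)
    return [[-4 * math.pi / k2 * ((METRIC[m] if m == n else 0.0) - k[m] * k[n] / k2)
             for n in range(4)] for m in range(4)]

def contract_with_k(k, propagator):
    """Return the four numbers k_mu D^{mu nu}, nu = 0..3 (k_mu has a lowered index)."""
    k_lower = [METRIC[m] * k[m] for m in range(4)]
    return [sum(k_lower[m] * propagator[m][n] for m in range(4)) for n in range(4)]

k = [2.0, 0.3, -0.5, 1.1]  # contravariant components k^mu, arbitrary off-shell values
print(contract_with_k(k, landau_propagator(k)))   # all entries ~ 0
print(contract_with_k(k, feynman_propagator(k)))  # equals -4 pi k^nu / k^2, not zero
```

The two gauges differ only in the longitudinal term, which is why the contraction with $k_\mu$ distinguishes them while physical amplitudes do not.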
14.3. The Feynman diagrams and the Coulomb law

Let us consider the elastic scattering of a nonrelativistic electron on a muon, $e^-\mu^\mp \to e^-\mu^\mp$. Since the muon mass $m_\mu \approx 200\, m_e$, the muon can be viewed as a pointlike source of the Coulomb field

$$U(r) = \pm \frac{e^2}{r}.$$

This field corresponds to repulsion for the $e^-\mu^-$ interaction and to attraction for the $e^-\mu^+$ interaction. We recall that in quantum mechanics the differential cross section for scattering of an electron $e(\mathbf{p}) \to e(\mathbf{p}')$ on the Coulomb potential is equal to

$$\frac{d\sigma}{d\Omega} = |f|^2, \qquad f = -\frac{m_e}{2\pi}\, U_{\mathbf{q}},$$

where $\mathbf{q} = \mathbf{p}' - \mathbf{p}$, $\varepsilon' = \varepsilon$ and $U_{\mathbf{q}}$ is the Fourier transform of the Coulomb field

$$U_{\mathbf{q}} = \int U(r)\, e^{-i\mathbf{q}\mathbf{r}}\, d^3r.$$

To calculate this expression we introduce the integral

$$I(\mu) = \int \frac{d^3r}{r}\, e^{-\mu r - i\mathbf{q}\mathbf{r}},$$

assuming that $\mu > 0$. This integral is easily calculated in spherical coordinates:

$$I(\mu) = 2\pi \int_0^\infty e^{-\mu r}\, r\, dr \int_{-1}^{1} e^{-i|\mathbf{q}| r \cos\theta}\, d\cos\theta = \frac{4\pi}{\mathbf{q}^2 + \mu^2}.$$

As a result,

$$U_{\mathbf{q}} = \pm e^2 I(0) = \pm \frac{4\pi e^2}{\mathbf{q}^2}, \qquad f = \mp \frac{2 m_e e^2}{\mathbf{q}^2}. \tag{14.21}$$

In QED, we first need to clarify the form of the perturbation operator. As you remember, in quantum electrodynamics the perturbation operator is

$$\hat{V}^{(e)}(x) = e \hat{A}^\alpha(x)\, \hat{\bar{\Psi}}^{(e)}(x)\, \gamma_\alpha\, \hat{\Psi}^{(e)}(x),$$

where the field $\hat{A}^\alpha(x)$ corresponds to the electromagnetic quanta (photons) and the field $\hat{\Psi}^{(e)}(x)$ to electrons and positrons. This interaction can be depicted as in Fig. 36. The muon is quite another particle; it differs in mass and lepton charge, but muons can also be involved in the electromagnetic interaction.

Figure 36: Vertex e → eγ

Figure 37: Vertex µ → µγ

Figure 38: Vertex e → µγ

Then we have to take into account the additional interaction

$$\hat{V}^{(\mu)}(x) = e \hat{A}^\alpha(x)\, \hat{\bar{\Psi}}^{(\mu)}(x)\, \gamma_\alpha\, \hat{\Psi}^{(\mu)}(x)$$
and the additional diagram of Fig. 37. And then we have to obey the rules of the game. If before by a particle we meant the electron $e^-$, here by a particle we have to mean $\mu^-$. Generally speaking, we could have a muon coupled to the electron field, as in Fig. 38: an electron turns into a muon, and an electromagnetic field is involved. In principle, this agrees with the electric charge conservation law, but it does not happen in nature, and we do not know why. Since this does not occur, let us come up with a special name for this law: let the particles differ in some new feature, for example, lepton charge. Let us assume that the electron has one leptonic charge and the muon has a different one. Let us require the law of leptonic charge conservation to be met; then the vertex of Fig. 38 is prohibited. However, you need to understand that if such processes are suddenly discovered tomorrow (and they have been vigorously sought), we will have to revise everything. So, we have introduced the law of conservation of leptonic charge, and only the following perturbation operator is left:

$$\hat{V}(x) = e \hat{A}^\alpha(x) \left[ \hat{\bar{\Psi}}^{(e)}(x)\, \gamma_\alpha\, \hat{\Psi}^{(e)}(x) + \hat{\bar{\Psi}}^{(\mu)}(x)\, \gamma_\alpha\, \hat{\Psi}^{(\mu)}(x) \right]. \tag{14.22}$$

The process $e^-\mu^- \to e^-\mu^-$ is described by the single diagram of Fig. 39:

$$iM_{fi} = (-ie)^2\, (\bar{u}_3 \gamma_\alpha u_1)\, iD^{\alpha\beta}(q)\, (\bar{u}_4 \gamma_\beta u_2).$$

Figure 39: The process e⁻µ⁻ → e⁻µ⁻

Figure 40: The process e⁻µ⁺ → e⁻µ⁺

In the rest frame of the initial $\mu^-$, we have

$$q = p_3 - p_1 = (0, \mathbf{q}), \qquad q^2 = -\mathbf{q}^2, \qquad D^{00} = \frac{-4\pi}{q^2} = +\frac{4\pi}{\mathbf{q}^2}.$$

Besides, all the bispinors $u_i \equiv u_{\mathbf{p}_i \sigma_i}$ have upper components only. Therefore, $(\bar{u}_3 \gamma_\alpha u_1) \neq 0$ only at $\alpha = 0$, and $(\bar{u}_3 \gamma_0 u_1) = u_3^+ u_1 = 2m_e\, \delta_{\sigma_1 \sigma_3}$. Analogously, $(\bar{u}_4 \gamma_\beta u_2) \neq 0$ only at $\beta = 0$, and $(\bar{u}_4 \gamma_0 u_2) = u_4^+ u_2 = 2m_\mu\, \delta_{\sigma_2 \sigma_4}$. As a result,

$$M_{fi} = -\frac{4\pi e^2}{\mathbf{q}^2} \left(u_4^+ u_2\right) \left(u_3^+ u_1\right) = -\frac{4\pi e^2}{\mathbf{q}^2}\, 2m_e\, \delta_{\sigma_1 \sigma_3}\, 2m_\mu\, \delta_{\sigma_2 \sigma_4}.$$

Taking into account that

$$\frac{d\sigma}{d\Omega} = |f|^2 = \left| \frac{M_{fi}}{8\pi (m_\mu + m_e)} \right|^2 \quad \text{or} \quad f = \frac{M_{fi}}{8\pi m_\mu},$$

we obtain a result which coincides with that of quantum mechanics, Eq. (14.21). For the process $e^-\mu^+ \to e^-\mu^+$, the scattering amplitude reads (see Fig. 40)

$$iM_{fi} = (-1)\, (-ie)^2\, (\bar{u}_3 \gamma_\alpha u_1)\, iD^{\alpha\beta}(q)\, (\bar{v}_2 \gamma_\beta v_4),$$
where the additional factor $(-1)$ is connected with the anticommutation of the fermion operators and the different set of contractions for $\mu^+$ as compared with that for $\mu^-$ (see §12.4). Besides, $(\bar{u}_3 \gamma_0 u_1) = u_3^+ u_1 = 2m_e\, \delta_{\sigma_1 \sigma_3}$, while $(\bar{v}_2 \gamma_\beta v_4) \neq 0$ only at $\beta = 0$, and $(\bar{v}_2 \gamma_0 v_4) = v_2^+ v_4 = 2m_\mu\, \delta_{\sigma_2 \sigma_4}$. As a result,

$$M_{fi} = +\frac{4\pi e^2}{\mathbf{q}^2}\, 2m_e\, \delta_{\sigma_1 \sigma_3}\, 2m_\mu\, \delta_{\sigma_2 \sigma_4}.$$

As a consequence, the Coulomb law arises from the exchange of a vector particle (the photon) between charged fermions.

Problem 14.1. Consider the interaction of the form

$$\hat{V}(x) = g\, \hat{\Phi}(x) \left[ \hat{\bar{\Psi}}^{(e)}(x)\, \hat{\Psi}^{(e)}(x) + \hat{\bar{\Psi}}^{(\mu)}(x)\, \hat{\Psi}^{(\mu)}(x) \right]$$

and find in the nonrelativistic limit the scattering amplitude $M^{(2)}_{fi}$ for the process $e^-\mu^\mp \to e^-\mu^\mp$ with the exchange of a neutral scalar particle corresponding to the field $\Phi(x)$. Show that it corresponds to the Yukawa potential energy

$$U_{\mathbf{q}} = -\frac{g^2}{\mathbf{q}^2 + m^2}, \qquad U(r) = -\frac{g^2/4\pi}{r}\, e^{-rm},$$

i. e. to attraction for the $e^-\mu^-$ as well as for the $e^-\mu^+$ interaction.
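The integral $I(\mu) = 4\pi/(\mathbf{q}^2 + \mu^2)$ of this section, which gives both the Coulomb ($\mu \to 0$) and the Yukawa Fourier transforms, can be checked numerically. After the angular integration it reduces to $I(\mu) = (4\pi/|\mathbf{q}|) \int_0^\infty e^{-\mu r} \sin(|\mathbf{q}| r)\, dr$, which the sketch below evaluates by Simpson's rule. The cutoff at $r = 60/\mu$, the sample values of $q$ and $\mu$, and the function names are assumptions made for illustration; natural units are used. The last two helpers compare the quantum-mechanical amplitude $f = -(m_e/2\pi) U_{\mathbf{q}}$ with $f = M_{fi}/(8\pi m_\mu)$ for the $e^-\mu^-$ case.

```python
import math

def radial_integral(q, mu, steps=200_000):
    """Simpson estimate of int_0^inf exp(-mu r) sin(q r) dr, truncated at r = 60/mu."""
    upper = 60.0 / mu  # the integrand is suppressed by exp(-60) beyond this point
    h = upper / steps  # steps must be even for Simpson's rule
    total = math.exp(-mu * upper) * math.sin(q * upper)  # the r = 0 endpoint contributes 0
    for i in range(1, steps):
        r = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-mu * r) * math.sin(q * r)
    return total * h / 3.0

def fourier_integral_numeric(q, mu):
    """I(mu) after the angular integration has been done analytically."""
    return 4.0 * math.pi / q * radial_integral(q, mu)

def fourier_integral_exact(q, mu):
    return 4.0 * math.pi / (q * q + mu * mu)

def f_quantum_mechanics(q, m_e, e2):
    """f = -(m_e / 2 pi) U_q with U_q = +4 pi e^2 / q^2 (e- mu- repulsion)."""
    return -m_e / (2.0 * math.pi) * 4.0 * math.pi * e2 / (q * q)

def f_from_feynman_amplitude(q, m_e, m_mu, e2):
    """f = M_fi / (8 pi m_mu) with M_fi = -(4 pi e^2 / q^2) (2 m_e)(2 m_mu)."""
    amplitude = -4.0 * math.pi * e2 / (q * q) * (2.0 * m_e) * (2.0 * m_mu)
    return amplitude / (8.0 * math.pi * m_mu)

print(fourier_integral_numeric(1.3, 0.5), fourier_integral_exact(1.3, 0.5))
```

Both amplitude helpers reduce algebraically to $-2 m_e e^2/\mathbf{q}^2$, which is the upper sign of Eq. (14.21).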
14.4. The annihilation processes e⁺e⁻ → µ⁺µ⁻ and e⁺e⁻ → τ⁺τ⁻

First, we consider the annihilation of a positron and an electron into muons (see Fig. 41), for which the conservation of 4-momenta reads $p_1 + p_2 = p_3 + p_4$. The Mandelstam variables of this process are

$$s = (p_1 + p_2)^2, \qquad t = (p_1 - p_3)^2, \qquad u = (p_1 - p_4)^2.$$

Besides, the following relations will be useful below:

$$s + t + u = 2m^2 + 2\mu^2, \qquad p_1^2 = p_2^2 = m^2, \qquad p_3^2 = p_4^2 = \mu^2$$

and

$$p_1 p_3 = p_2 p_4, \qquad p_1 p_4 = p_2 p_3.$$
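The scattering amplitude is equal to

$$M_{fi} = (-ie)^2\, (\bar{v}_1 \gamma_\alpha u_2)\, \frac{(-4\pi)\, g^{\alpha\beta}}{s}\, (\bar{u}_4 \gamma_\beta v_3) = \frac{4\pi\alpha}{s}\, F, \tag{14.23}$$

where

$$F = (\bar{v}_1 \gamma_\alpha u_2)\, (\bar{u}_4 \gamma^\alpha v_3). \tag{14.24}$$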
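If, as is usually the case, the initial electrons and positrons are not polarized and we do not detect the polarization of the muons, we need to calculate the modulus of $F$ squared, summed over the polarizations of the final particles and averaged over the polarizations of the initial particles. The corresponding quantity reads

$$\frac{1}{2} \cdot \frac{1}{2} \sum_{\sigma_{1,2,3,4}} |F|^2 = \frac{1}{2}\, \mathrm{Tr}\left\{ (\not{p}_1 - m)\, \gamma^\alpha\, (\not{p}_2 + m)\, \gamma^\beta \right\} \frac{1}{2}\, \mathrm{Tr}\left\{ (\not{p}_4 + \mu)\, \gamma_\alpha\, (\not{p}_3 - \mu)\, \gamma_\beta \right\} = \left\{ 2p_1^\alpha p_2^\beta + 2p_2^\alpha p_1^\beta - s g^{\alpha\beta} \right\} \left\{ 2p_{3\alpha} p_{4\beta} + 2p_{4\alpha} p_{3\beta} - s g_{\alpha\beta} \right\},$$

where for simplicity we introduce the notation $\not{p} \equiv p^\alpha \gamma_\alpha$ (very often $p^\alpha \gamma_\alpha$ is denoted as $p$ with a hat, $p^\alpha \gamma_\alpha = \hat{p}$, but we apply the hat only to operators, and thus we will not use it). As a result,

$$\frac{1}{2} \cdot \frac{1}{2} \sum_{\sigma_{1,2,3,4}} |F|^2 = 8\, (p_1 p_3)^2 + 8\, (p_1 p_4)^2 + 4s \left( \mu^2 + m^2 \right). \tag{14.25}$$

Figure 41: The process e⁺e⁻ → µ⁺µ⁻

In the centre-of-mass system we have

$$p_1 p_3 = \varepsilon_1^2 \left( 1 - v_e v_\mu \cos\theta \right), \qquad p_1 p_4 = \varepsilon_1^2 \left( 1 + v_e v_\mu \cos\theta \right),$$

where the scattering angle $\theta$ is the angle between the vectors $\mathbf{p}_3$ and $\mathbf{p}_1$ (i. e. the angle of the final $\mu^+$ with respect to the direction of the initial $e^+$), and the velocities of the electron and the muon are

$$v_e = \sqrt{1 - \frac{4m^2}{s}}, \qquad v_\mu = \sqrt{1 - \frac{4\mu^2}{s}}.$$

The final result for the differential cross section reads

$$|M_{fi}|^2 = (4\pi\alpha)^2 \left( 1 + \frac{4\mu^2 + 4m^2}{s} + v_e^2 v_\mu^2 \cos^2\theta \right),$$

$$\frac{d\sigma}{d\Omega} = \frac{|M_{fi}|^2}{64\pi^2 s}\, \frac{|\mathbf{p}_3|}{|\mathbf{p}_1|} = \frac{\alpha^2}{4s} \left( 1 + \frac{4\mu^2 + 4m^2}{s} + v_e^2 v_\mu^2 \cos^2\theta \right) \frac{v_\mu}{v_e}. \tag{14.26}$$

At $s \gg 4\mu^2$ this expression simplifies to

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4s} \left( 1 + \cos^2\theta \right). \tag{14.27}$$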
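The chain from Eq. (14.25) to Eq. (14.26) and then to the total cross section can be cross-checked numerically. The sketch below is an illustration, not part of the derivation: it assumes natural units, $\alpha = 1/137.036$, and sample masses in GeV, and it uses hypothetical function names. It verifies that (14.25) with the centre-of-mass scalar products equals $s^2$ times the bracket of (14.26), and that integrating (14.26) over the solid angle with the electron mass neglected gives $(4\pi\alpha^2/3s)(1 + 2\mu^2/s)\sqrt{1 - 4\mu^2/s}$.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (assumed numerical value)

def velocities(s, m, mu):
    return math.sqrt(1 - 4 * m * m / s), math.sqrt(1 - 4 * mu * mu / s)

def angular_bracket(s, m, mu, cos_theta):
    """The bracket of Eq. (14.26)."""
    v_e, v_mu = velocities(s, m, mu)
    return 1 + (4 * mu * mu + 4 * m * m) / s + (v_e * v_mu * cos_theta) ** 2

def averaged_f_squared(s, m, mu, cos_theta):
    """Eq. (14.25) with the centre-of-mass values of p1.p3 and p1.p4."""
    v_e, v_mu = velocities(s, m, mu)
    e1_squared = s / 4
    p1p3 = e1_squared * (1 - v_e * v_mu * cos_theta)
    p1p4 = e1_squared * (1 + v_e * v_mu * cos_theta)
    return 8 * p1p3 ** 2 + 8 * p1p4 ** 2 + 4 * s * (mu * mu + m * m)

def dsigma_domega(s, m, mu, cos_theta):
    """Eq. (14.26) in natural units."""
    v_e, v_mu = velocities(s, m, mu)
    return ALPHA ** 2 / (4 * s) * (v_mu / v_e) * angular_bracket(s, m, mu, cos_theta)

def sigma_total_numeric(s, m, mu, steps=2000):
    """Simpson integration over cos(theta); the azimuth contributes a factor 2 pi."""
    h = 2.0 / steps
    total = dsigma_domega(s, m, mu, -1.0) + dsigma_domega(s, m, mu, 1.0)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * dsigma_domega(s, m, mu, -1.0 + i * h)
    return 2 * math.pi * total * h / 3

def sigma_total_closed(s, mu):
    """Closed form with the electron mass neglected (v_e = 1)."""
    return 4 * math.pi * ALPHA ** 2 / (3 * s) * (1 + 2 * mu * mu / s) * math.sqrt(1 - 4 * mu * mu / s)

s, mu = 1.0, 0.10566  # s in GeV^2, muon mass in GeV (illustrative values)
print(sigma_total_numeric(s, 0.0, mu), sigma_total_closed(s, mu))
print(dsigma_domega(1e4, 0.0, mu, 1.0) / dsigma_domega(1e4, 0.0, mu, 0.0))  # close to 2
```

The last line illustrates the statement that at high energy twice as many muons go forward as at 90 degrees.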
The number of muons produced in the forward and backward directions is the same. The number of muons at 90 degrees is half the number of muons flying forward.

Figure 42: Angular distribution in the process e⁺e⁻ → µ⁺µ⁻

This agrees with the wide distribution of scattered particles in $e^+e^-$ annihilation. It is very important that the form of the angular distribution is almost independent of the energy. The total number of produced muons depends on the energy, whereas the form of the angular distribution does not. Even at very high energies, muons are produced over a wide range of angles (see Fig. 42). This is very convenient when the detector used for registration of muons does not cover the full solid angle. Then we can calculate the total cross section; the integration is easy. If the speed of the electron is taken to be unity, the total cross section is

$$\sigma(s) = \frac{4\pi\alpha^2}{3s} \left( 1 + \frac{2\mu^2}{s} \right) \sqrt{1 - \frac{4\mu^2}{s}}, \tag{14.28a}$$

where the last root is the muon speed. What does this cross section look like? It starts from the threshold, at $\sqrt{s}$ equal to twice the muon mass, and near the threshold the main dependence on $s$ is tied to the root $\sqrt{1 - 4\mu^2/s}$. This root increases, but at higher energies the cross section falls quite fast, quadratically, with increasing energy. At high energies, only

$$\sigma_0 = \frac{4\pi\alpha^2}{3s} \tag{14.28b}$$

is left. Since all our accelerators, like many others, operate considerably above the threshold of muon production, $\sigma_0$ is a certain standard value with which all other cross sections are conveniently compared. Hadrons are often detected together with muons, which enables measurement of the cross section of hadron production in units of the cross section of muon production. Then the quantity

$$\frac{\sigma_{e^+e^- \to \text{hadrons}}}{\sigma_0} \equiv R \tag{14.28c}$$
has a special name, $R$. We will talk about it in detail. Let me point out what the quantity $\sigma_0$ is. Converting $\sigma_0$ into usual units yields $10^{-33}$ cm², i. e. a nanobarn, times the number 87 divided by $s$ measured in GeV²:

$$\sigma_0 = \frac{4\pi\alpha^2}{3s} \approx \frac{87 \cdot 10^{-33}\ \text{cm}^2}{s\ [\text{GeV}^2]}. \tag{14.29}$$

So, if you are working, for example, on VEPP-4M with $\sqrt{s} = 10$ GeV, this cross section is only 0.87 nanobarn, because of the 100 in the denominator. If we proceed from the VEPP-4 energy (10 GeV) to the operating energy of LEP-1 (100 GeV), for example, the cross section decreases 100 times, to several picobarns. So, to have a similar number of events with muons you need to increase the luminosity by at least two orders of magnitude. Moreover, we will find the ratio $R$ to be of the order of unity, so this is significant not only for muons, which are still a background process, but also for more interesting things related to the production of hadrons. Therefore, when we turn from VEPP-4M to LEP-1 and all the more to LEP-2 (200 GeV), the cross section decreases 100 and 400 times, respectively, and it is necessary to increase the luminosity correspondingly. On the other hand, if we talk about the International Linear Collider (ILC), with a total collision energy of 500 GeV even at the first stage, by what factor does the cross section decrease as compared with that? By a factor of 2500, which means that the planned luminosity for the ILC should be at least 3-4 orders of magnitude larger than the present one. So, when we turn to this simple process, we touch upon the issues of planning future experiments.

And this is not the end yet. Let us now consider another interesting process, annihilation into tau leptons. There was a time when the $\tau$ lepton, which was discovered by chance, was of interest to nobody. But now it fits well into the Standard Model. One of the important characteristics of the $\tau$ lepton is the exact value of its mass, which can be found by studying the total cross section. What is the difference between the total cross section of $e^+e^- \to \tau^+\tau^-$ and the above one? The only difference is the mass. However, at high energies this cross section is independent of the masses of the electron and the final lepton. Therefore, we are looking into the region near the threshold, where the cross section depends rather strongly on the mass. Studying the cross section in this region, one can find the mass of the $\tau$.

I have some figures for you to look at. The first figure (see Fig. 43) is taken from the paper of the DELCO collaboration (W. Bacino et al., Physical Review Letters 41 (1978) 13). It shows the cross section of $\tau^+\tau^-$ production in comparison with the cross section of muon production in the region from about 3.6 to 4.4 GeV. It is a relatively small interval, about 1 GeV wide. In the case of muons, the cross section is quite a smooth function of $s$. In the case of the $\tau$, on the contrary, the mass-dependent factor is very important. There is another interesting thing: a point that lies below the production threshold. This means that the statistics are small. The outlying point overlaps its neighbors with its big error bars. That is one fact. Besides, a point that lies below the formal threshold may be caused by poor determination of the beam energy. Here you see the experimental data from 1978, when the value of the mass indicated there was obtained: 1782 MeV with an accuracy of +2 and −7 MeV. That was done on the basis of one of the first experiments. It turns out that very good knowledge of the beam energy in this region is critical for a very good determination of the mass of the $\tau$.

Figure 43: The ratio σ(e⁺e⁻ → τ⁺τ⁻)/σ(e⁺e⁻ → µ⁺µ⁻) of measured cross sections near the threshold for τ⁺τ⁻ pair production as measured by the DELCO collaboration.

Here is another variant for you. It relates to the VEPP-4M collider; see Fig. 44, taken from CERN Courier (December 2006). You can see the results of the first experiment, where
only 3 or 4 points are plotted. It is the beginning of the experiments; they go on now. I can only say that in these circumstances it is fundamentally important how accurately the beam energy can be determined.

Figure 44: The observed cross section σ(e⁺e⁻ → τ⁺τ⁻) as measured by KEDR (Novosibirsk)

You can see in the picture a strong peak corresponding to the production of $\psi(2S)$. This mass value was measured separately and could be used for normalization, because it is known with a high degree of accuracy. When the $\tau$ mass was measured, the beam energy was monitored during data-taking through Compton backscattering of infrared laser light with a precision of $5 \cdot 10^{-5}$. The beam energy was absolutely calibrated daily with a precision of $1 \cdot 10^{-5}$ using the resonant depolarization method. So, the accuracy was controlled in two independent ways. In the first picture, the $\tau$ mass was $m_\tau = 1782$ MeV with the accuracy +2 and −7 MeV, whereas the modern data give $m_\tau = 1776$ MeV with the accuracy ±0.16 MeV, i. e. the accuracy is at the level of $10^{-4}$. But we have not finished the analysis of the process under discussion. It also has direct relevance to the measurement of $R$ (see Eq. (14.28c)), which is one of the most important tasks for the VEPP-4M collider.
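The numerical estimate $\sigma_0 \approx 87\ \text{nb}/s[\text{GeV}^2]$ of Eq. (14.29) and the scaling between colliders discussed in this section can be reproduced in a few lines. This is a sketch under stated assumptions: $\alpha = 1/137.036$ and the conversion constant $(\hbar c)^2 \approx 0.389379$ mb·GeV² are supplied here and do not appear in the lectures, and the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.036        # fine-structure constant (assumed value)
HBARC2_MB_GEV2 = 0.389379    # (hbar c)^2 in mb * GeV^2 (assumed conversion constant)

def sigma0_nb(sqrt_s_gev):
    """sigma_0 = 4 pi alpha^2 / (3 s), converted from GeV^-2 to nanobarns."""
    s = sqrt_s_gev ** 2
    sigma_mb = 4 * math.pi * ALPHA ** 2 / (3 * s) * HBARC2_MB_GEV2
    return sigma_mb * 1e6    # 1 mb = 10^6 nb

# Total e+e- energies mentioned in the text for the four colliders.
for name, energy in [("VEPP-4M", 10.0), ("LEP-1", 100.0), ("LEP-2", 200.0), ("ILC", 500.0)]:
    print(f"{name:8s} sqrt(s) = {energy:5.0f} GeV   sigma_0 = {sigma0_nb(energy):.3e} nb")
```

Since $\sigma_0 \propto 1/s$, the ratio between any two machines is simply the squared ratio of their energies, e.g. $(500/10)^2 = 2500$ between VEPP-4M and the ILC.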
14.5. Processes e⁺e⁻ → q̄q and e⁺e⁻ → hadrons at high energies

Let us consider the annihilation of an electron and a positron into a pair of quarks. As we know, there are several types of quarks, so here we point out that this is annihilation into a concrete quark-antiquark pair $q_a \bar{q}_a$ with charges $\pm Q_a |e|$. If the relevant diagram is drawn, the beginning is exactly the same: the positron and electron annihilate into a virtual photon with the momentum $k = p_1 + p_2$, and then a pair of quarks is produced (see Fig. 45). Here we have a quark of sort $a$ and an antiquark of sort $a$.

Figure 45: The process e⁺e⁻ → q̄ₐqₐ

Let us see the difference between the process $e^+e^- \to \bar{q}_a q_a$ and the above process $e^+e^- \to \mu^+\mu^-$. Firstly, the vertex is different: for muon production we have the particle charge $e$, whereas for quark production we have the quark charge $Q_a |e|$, where $Q_a$ is a dimensionless quantity equal to the charge of the quark in units of the elementary charge. The second, also significant, difference is that, unlike muons, quarks have an additional quantum number, color, which does not show up here except for the fact that there are three such colors. Therefore, if we compare the cross section of the process $e^+e^- \to \bar{q}_a q_a$ with the standard cross section $\sigma_0$, the difference is due primarily to the squared quark charge $Q_a^2$. Secondly, for these final particles we have to sum over the colors, in addition to summing over the spins. The three colors give an additional factor of 3. As a result, at high energies, $s \gg 4m_a^2$, we have

$$\sigma_{e^+e^- \to \bar{q}_a q_a} = 3 Q_a^2\, \sigma_0. \tag{14.30}$$
Now let us discuss the process $e^+e^- \to$ hadrons. The production of hadrons starts near the threshold corresponding to the production of a pair of pions, that is, at about 280 MeV. It is easy to figure out that the corresponding distance is at the level of $10^{-13}$ cm. This distance is characteristic of strong interactions. On the one hand, we believe that even in this region all processes can be described by quantum chromodynamics (QCD). On the other hand, this is of little use, because we cannot solve the equations of QCD exactly, and, as the classics wrote, only in a state of deep despair can one use perturbation theory when the corresponding coupling constant is not small. Therefore, in these conditions, while we are near the threshold, we have to use a variety of models. For example, when talking of the production of the charged particles $\pi^+\pi^-$, at first everything looks very similar to quantum electrodynamics, i. e. in some approximation $\pi^\pm$ can be considered as point particles, but this is only an approximation. And if we go a little bit further, there arise catastrophic differences from quantum electrodynamics: resonances appear, and so on. Experimenters, including those working with colliding $e^+e^-$ beams, are trying to sort out this mess, this spectroscopic area, with utmost precision. Various models are used and are constantly refined. But when it comes to high energies, we can start using QCD, because everything happens at short distances. The production of hadrons (i. e. the process $e^+e^- \to$ hadrons at $s \gg 4m_a^2$) is a complex process, which can be described in the lowest order of perturbation theory as the production of a $\bar{q}_a q_a$ pair at small distances $\sim 1/\sqrt{s}$ with the subsequent transition of the quarks into hadrons. And we can try to do the calculation not only in the lowest approximation, but also including the subsequent terms of the perturbation theory of QCD.
The subsequent terms correspond to the possibility of gluon exchange between the produced quarks as well as the emission of additional gluons. All this is not easy, but we have the guiding thread of perturbation theory, and hence all these complications appear in the next order and can be considered as corrections. This is good as long as everything takes place at short distances. However, hadrons then appear, and the transformation of quarks into a set of hadrons again occurs at large distances, about which we can say almost nothing from first principles. However, we know how quarks are produced at small distances. And we can describe the transition of quarks into hadrons at large distances by a simple assertion: the quarks will certainly turn into hadrons, with 100 % probability. Therefore,

$$\frac{\sigma_{e^+e^- \to h}}{\sigma_0} \equiv R = 3 \sum_a Q_a^2, \tag{14.31}$$

and this result is true if $s$ is much larger than $4m_a^2$. Let us see what it equals. If the energies are such that we go a little bit beyond the resonance region, where we cannot directly derive anything in terms of the fundamental theory, then beyond that region, e. g. at 1.5 ÷ 2 GeV in the total energy, the up and down quarks of the first generation may participate, as well as the strange quark:

$$R = 3 \sum_{uds} Q_a^2 = 3 \left( \frac{4}{9} + \frac{1}{9} + \frac{1}{9} \right) = 2. \tag{14.32}$$

This is a kind of prediction for $R$. If the energy is such that this set of quarks is complemented with the charmed quark, then

$$R = 3 \sum_{udsc} Q_a^2 = 2 + 3 \cdot \frac{4}{9} = \frac{10}{3}. \tag{14.33}$$

This happens when we move beyond the threshold of production of $c$ quarks, which is about 3 GeV in energy. If we go beyond the threshold of production of the $b$ quark, we have, in addition to this set, the $b$ quark contribution, the same as the contribution of the $d$ or $s$ quark:

$$R = 3 \sum_{udscb} Q_a^2 = 2 + \frac{4}{3} + \frac{1}{3} = \frac{11}{3}. \tag{14.34}$$
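The three values of $R$ in Eqs. (14.32)–(14.34) follow from one line of arithmetic per threshold. The sketch below repeats it with exact fractions; the dictionary of quark charges in units of $|e|$ is standard, while the function name is hypothetical.

```python
from fractions import Fraction

# Quark electric charges in units of the elementary charge |e|.
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}

def r_ratio(flavors, colors=3):
    """Lowest-order R = (number of colors) * sum of squared charges of the open flavors."""
    return colors * sum(QUARK_CHARGES[flavor] ** 2 for flavor in flavors)

for flavors in ("uds", "udsc", "udscb"):
    print(flavors, r_ratio(flavors))
```

Setting `colors=1` shows what the prediction would be without the color factor, namely values three times smaller.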
Let us see how this theory describes the experimental results. The comparison is quite interesting. Of course, this is the simplest reasoning, because, apart from this process in the lowest order, there are other contributions in which gluon exchange occurs, gluons are emitted additionally, and so on. All this is taken into account in accurate calculations, and the resulting picture is very impressive. Let us look at the upper panel of Fig. 46 (this is Fig. 51.5 from the Review of Particle Properties 2016; see the Internet address www.pdg.lbl.gov and then "Reviews", "Kinematics, Cross-Section Formulae, and Plots", "Plots of cross sections and related quantities", or the direct address http://pdg.lbl.gov/2016/reviews/rpp2016-rev-cross-section-plots.pdf). It shows the cross section of hadron production in $e^+e^-$ collisions. The cross section in millibarns is plotted on the vertical axis. Let me remind you that the millibarn is $10^{-27}$ cm², because the barn is $10^{-24}$ cm² (a cross section considered as huge as a barn). What can we see from this picture? The value of $\sqrt{s}$ in GeV is plotted on the horizontal axis. If this cross section behaves as the muon cross section does, it decreases quadratically with increasing $\sqrt{s}$. Then, in the logarithmic scale, which is used both vertically and horizontally, it must be a straight line. This is clearly seen. First, there is the $\rho$ meson peak; then a narrow $\omega$ peak, an even narrower $\phi$ peak, and a $\rho'$ peak. As soon as we pass beyond $\rho'$, everything is arranged along roughly straight lines, within a huge interval. We see the production of the $J/\psi$ family, and then, before the $Z$ boson, there is the $\Upsilon$ (Upsilon) family. Everything lies on an almost straight line, which corresponds to the dependence $1/s$. And look how large the range is: from about 2 GeV to about 70 GeV; and near 90 GeV there is a "gift of nature", the $Z$ boson peak, which exceeds the background level by 3 orders of magnitude.
Meanwhile, the cross section decreases from 10−4 mbarn to 10−7 mbarn, i. e. by 3 orders. But this picture shows what the cross section looks like. The lower panel of the same Fig. 46 shows the ratio R. Again (in the same scale) these are almost straight horizontal lines. After ρ′ , there goes one straight line to ψ, small jump and straight line to Υ, and a small jump after Υ. In fact, it looks as if we pass from R = 2 to R = 2 + 4/3 and to R = 2 + 4/3 + 1/3. Of course, the scale is logarithmic, and it’s not seen very well, but what will you see in the next Fig. 47 (this is Fig. 51.6 from Review of Particle Properties 2016). The same picture, but in more detailed scale – see Fig. 47. Look
108
Figure 46: The cross section σ(e+ e− → hadrons) and ratio R = σ(e+ e− → hadrons)/σ0 at the very top: this is the area where, according to our calculation, R should be equal to 2. And what do we see? Experimental points in the range of 1.5 to 3 GeV: a lot of them near ρ′ , and then only separate points. By the way, VEPP-4M used to work well in this area. All of them are very well fitted by a straight horizontal line. The dotted line below corresponds to our prediction, which is called the “naive quark model”, in which a pair of quarks and nothing more is taken into account. The straight line above represents the refined quantum chromodynamics with corrections and so on. It is seen that quantum chromodynamics is in better agreement with experiment. The next piece includes the interval from 3 to 5 GeV. We see some fluctuations because of the rise of resonances associated with c¯ c bound quark states. But after that, again good agreement with the quantum chromodynamics predictions. But the dotted line is not constant, because of attempts to take the threshold effect into account. Finally, the next piece, approximately from 9 to 11 GeV. It is the area where production of b¯b bound quark states begins. We can see that except for the narrow resonances the experiment is in remarkably and amazingly good agreement with the
quantum chromodynamics predictions. Thus this science really has a very rational kernel, confirmed by experiment.

Figure 47: The ratio $R$ in the light-flavor, charm, and beauty threshold regions

That was about the total cross section of hadron production in $e^+e^-$ collisions. Further, we found that the angular distribution of muons at high energies is described by the simple distribution
$$\frac{d\sigma}{d\Omega} \propto 1 + \cos^2\theta .$$
By the way, this simple dependence is closely connected with the fact that the muon has spin 1/2. If a scalar particle were produced here, the distribution would be substantially different. Next, when it comes to production of quarks in the lowest order of quantum chromodynamics, the angular distribution of quarks must be the same. Of course, there will be complications because of corrections, but the main distribution will be the same: a general dependence of the type $1 + \cos^2\theta$. This is very good and useful, because the distribution is broad, and detectors that cannot operate at small angles still cover a significant part of this cross section. If quarks had another spin, they would have a different distribution.

Certainly, quarks are objects we cannot see directly. On the other hand, these quarks certainly turn into hadrons. At high energies, the hadrons appear as jets produced by the quarks. Indeed, at high energies the hadron distribution is, generally speaking, broad, but this broad distribution often has clearly visible jets. Examination of the angular distribution of these jets confirms the above dependence, and thus the jets came from particles with spin 1/2. This is one of the first examples of experimental evidence for the quark spin. Of course, what we are discussing is a prediction based on the lowest order of perturbation theory. Generally speaking, gluons can also be produced, and they must also turn into jets. Such three-jet events were indeed observed at sufficiently high energies of the $e^+e^-$ collision. Moreover, experimental research on hadronization revealed, in a sense, differences between quark and gluon jets. In these pictures, the gluon jet is slightly wider and has less energy than the quark jet. So, from this point of view, the study of lowest-order processes allows us to touch the fundamental problems of hadron physics.
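The remark above about detector coverage can be made quantitative. The following is a minimal sketch, assuming an idealized detector that accepts all polar angles between $\theta_{\min}$ and $180^\circ - \theta_{\min}$; the function name `acceptance_fraction` and the sample cut angles are illustrative choices, not taken from the lecture.

```python
import math

def acceptance_fraction(theta_min_deg):
    """Fraction of events with theta_min <= theta <= 180 deg - theta_min
    for an angular distribution proportional to 1 + cos^2(theta)."""
    c = math.cos(math.radians(theta_min_deg))
    # Integral of (1 + z^2) over -c <= z <= c is 2c + 2c^3/3;
    # over the full range -1 <= z <= 1 it is 8/3. The azimuthal factor cancels.
    return (3.0 * c + c**3) / 4.0

# Illustrative (assumed) blind cones around the beam axis
for theta_min in (10, 20, 30, 45):
    frac = acceptance_fraction(theta_min)
    print(f"theta_min = {theta_min:2d} deg -> accepted fraction = {frac:.3f}")
```

Even a detector that is blind within a few tens of degrees of the beam axis keeps most of the events, which is the point made in the text.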
To conclude this section, let me mention that, in addition to processes of this kind, which are studied directly in the annihilation of electrons and positrons, it would be possible, with appropriate accelerators at hand, to observe processes in which muons or quarks are produced in collisions of two photons, if we had colliding photon beams of sufficient energy. We will discuss this question later.
14.6. Process $e\mu \to e\mu$ and crossing symmetry

Let's consider the process of elastic electron-muon scattering $e\mu \to e\mu$, which involves the same particles as the annihilation process $e^+e^- \to \mu^+\mu^-$ (see § 14.4), but in another incarnation. For example, if in the latter reaction we transfer the positron into the final state and the $\mu^+$ into the initial state, then we have a process of scattering, not annihilation: an electron is scattered on a muon. Consideration of this process will allow us to talk about crossing symmetry. This is a very deep and very useful idea of quantum field theory, and it can be felt through this simple example.

Figure 48: The process $e\mu \to e\mu$

The Feynman diagram of the scattering process is shown in Fig. 48 with the corresponding 4-momenta; recall that
$$p^2 = (p')^2 = m^2, \quad P^2 = (P')^2 = \mu^2, \quad q = p' - p .$$
The scattering amplitude for this process reads
$$M_{fi} = \frac{4\pi\alpha}{q^2}\,\bar F , \tag{14.35}$$
where
$$\bar F = \left(\bar u_{p'\sigma_2}\gamma^\alpha u_{p\sigma_1}\right)\left(\bar u_{P'\sigma_4}\gamma_\alpha u_{P\sigma_3}\right) . \tag{14.36}$$
Let us compare
$$\frac12\,\frac12\sum_{\sigma_{1,2,3,4}} |\bar F|^2 = \frac12\,\mathrm{Tr}\left\{(\not{p}' + m)\,\gamma^\alpha(\not{p} + m)\,\gamma^\beta\right\}\,\frac12\,\mathrm{Tr}\left\{(\not{P}' + \mu)\,\gamma_\alpha(\not{P} + \mu)\,\gamma_\beta\right\} \tag{14.37}$$
with the corresponding expression (14.25) for the reaction $e^+e^- \to \mu^-\mu^+$ in § 14.4. We immediately observe that the above result can be obtained from that in § 14.4 by the substitutions
$$p_1 = -p', \quad p_2 = p, \quad p_3 = -P, \quad p_4 = P', \quad s = q^2 . \tag{14.38}$$
Note that there are no tricks here: no spinors, just a replacement of momenta and nothing more. This opportunity to describe related, or so to say crossed, processes by simple replacements of 4-momenta is one of the biggest advantages of our covariant perturbation theory. There is no need for a new calculation; it is sufficient to take the expression for the square of the matrix element of the initial process and replace the 4-momenta in it, and we get the square of the matrix element of the crossed process:
$$|M_{fi}|^2 = \left(\frac{4\pi\alpha}{q^2}\right)^2\left[8\,(pP)^2 + 8\,(p'P)^2 + 4q^2\left(m^2 + \mu^2\right)\right] . \tag{14.39}$$
However, the cross section is expressed not only via the square of the matrix element; it also includes the momenta of the initial and final particles, which are a little different in these two reactions. But that is a simple, purely kinematic change. In particular, the cross section reads (in the centre-of-mass frame)
$$\frac{d\sigma}{d\Omega} = \frac{|M_{fi}|^2}{64\pi^2(\varepsilon_e + \varepsilon_\mu)^2} .$$
Let us consider the case where the muon is at rest and the electron is scattered, the electron energy being much smaller than the muon mass, $\varepsilon_e \ll \mu$. Since the muon mass is about $200\,m_e$, this electron may nevertheless be relativistic. In this case the muon experiences very small recoil and
$$\varepsilon_e = \varepsilon_e', \quad q^2 = -\mathbf q^2 = -(\mathbf p - \mathbf p')^2 = -4\mathbf p^2\sin^2\frac{\theta}{2} .$$
Therefore, this can be considered as scattering of an electron (possibly a relativistic one) on an immovable center, the Coulomb field of the muon. For scattering on an immovable Coulomb center the answer is known: it is the Rutherford cross section
$$\frac{d\sigma_{\rm Ruth}}{d\Omega} = \frac{\alpha^2}{4v^2\mathbf p^2\sin^4\frac{\theta}{2}} , \tag{14.40}$$
where $v = |\mathbf p|/\varepsilon_e$ is the electron velocity. Such is the Rutherford cross section in classical mechanics. Our answer, obtained from the exact lowest-order QED result, looks like this:
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4v^2\mathbf p^2\sin^4\frac{\theta}{2}}\left(1 - v^2\sin^2\frac{\theta}{2}\right) ; \tag{14.41}$$
it contains the Rutherford cross section with the additional factor
$$1 - v^2\sin^2\frac{\theta}{2} . \tag{14.42}$$
In the nonrelativistic case, at $v \ll 1$, this factor is equal to unity and can be discarded, whereas in the relativistic case, which is also covered by this consideration, the factor (42) may be significant. How does this factor depend on the scattering angle $\theta$? In forward scattering it is exactly 1, and our quantum electrodynamic cross section does not differ from the classical Rutherford one. As the angle increases, $\theta \to \pi$, the factor tends to $(1 - v^2)$, and for a relativistic particle this may be very close to zero. That is, the classical Rutherford cross section already decreases rapidly with increasing angle, and for a relativistic electron there is additional suppression due to this factor.
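To get a feeling for the size of this suppression, the factor (42) can be tabulated for a few electron energies. This is a minimal sketch in natural units; the rounded electron mass, the sample energies and the helper names `velocity` and `mott_factor` are assumptions made for illustration only.

```python
import math

M_E = 0.511  # electron mass in MeV (rounded value, assumed here)

def velocity(energy_mev, mass_mev=M_E):
    """Velocity v = |p|/eps in units of c for a particle of total energy eps."""
    p = math.sqrt(energy_mev**2 - mass_mev**2)
    return p / energy_mev

def mott_factor(v, theta):
    """Additional factor (42): 1 - v^2 sin^2(theta/2)."""
    return 1.0 - v**2 * math.sin(theta / 2.0)**2

# Illustrative total energies, all far below the muon mass (about 106 MeV)
for energy in (0.52, 1.0, 10.0):
    v = velocity(energy)
    row = ", ".join(f"{mott_factor(v, math.radians(a)):.4f}" for a in (0, 90, 180))
    print(f"eps = {energy:5.2f} MeV, v = {v:.4f}: factor at 0, 90, 180 deg = {row}")
```

At $\theta = \pi$ the factor equals $1 - v^2 = (m_e/\varepsilon_e)^2$, so backward scattering of a relativistic electron is suppressed by the inverse square of its Lorentz factor.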
§ 15. The second order of the perturbation theory in QED. The electron propagator

15.1. The γe scattering

Let us consider Compton scattering, i.e. the process
$$\gamma(k_1) + e^-(p_1) \to \gamma(k_2) + e^-(p_2) , \tag{15.1}$$
for which the Mandelstam variables are
$$s = (k_1 + p_1)^2 , \quad t = (k_1 - k_2)^2 , \quad u = (p_1 - k_2)^2 . \tag{15.2}$$
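As a quick numerical illustration of the variables (2), the sketch below evaluates them for a photon scattered by an electron at rest. It assumes the standard Compton relation for the final photon energy, which follows from energy-momentum conservation, and works in units $m = 1$; the helper names are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def compton_mandelstam(omega1, theta, m=1.0):
    """Return (s, t, u) for a photon of energy omega1 hitting an electron at rest
    and scattering through the angle theta."""
    # Final photon energy fixed by 4-momentum conservation (assumed relation)
    omega2 = omega1 / (1.0 + (omega1 / m) * (1.0 - math.cos(theta)))
    k1 = (omega1, omega1, 0.0, 0.0)
    k2 = (omega2, omega2 * math.cos(theta), omega2 * math.sin(theta), 0.0)
    p1 = (m, 0.0, 0.0, 0.0)
    s = dot(add(k1, p1), add(k1, p1))
    t = dot(sub(k1, k2), sub(k1, k2))
    u = dot(sub(p1, k2), sub(p1, k2))
    return s, t, u

s, t, u = compton_mandelstam(omega1=0.3, theta=1.0)
print(f"s = {s:.6f}, t = {t:.6f}, u = {u:.6f}, s + t + u = {s + t + u:.6f}")
```

For this process $s + t + u = 2m^2$, and one finds $s - m^2 > 0$ while $u - m^2 < 0$ and $t \le 0$.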
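The initial and final states of this process are
$$|i\rangle = \hat c_1^+ \hat a_1^+ |0\rangle , \quad |f\rangle = \hat c_2^+ \hat a_2^+ |0\rangle , \quad \hat c_i \equiv \hat c_{k_i\lambda_i} , \quad \hat a_i \equiv \hat a_{p_i\sigma_i} ,$$
and the corresponding matrix element reads
$$S^{(2)}_{fi} = \frac{(-ie)^2}{2!}\int d^4x\, d^4x'\, F^{\mu\nu}(x,x')\, f_{\mu\nu}(x,x') ,$$
$$F^{\mu\nu}(x,x') = \langle 0|\, \hat c_2\, \hat T\left\{\hat A^\mu(x)\, \hat A^\nu(x')\right\} \hat c_1^+ \,|0\rangle ,$$
$$f_{\mu\nu}(x,x') = \langle 0|\, \hat a_2\, \hat T\left\{\left(\hat{\bar\Psi}(x)\,\gamma_\mu\,\hat\Psi(x)\right)\left(\hat{\bar\Psi}(x')\,\gamma_\nu\,\hat\Psi(x')\right)\right\} \hat a_1^+ \,|0\rangle .$$
It is convenient to compare this process with its scalar analogue, the process $\pi^0\pi^- \to \pi^0\pi^-$, which was considered in § 13.4 and whose matrix element is
$$S^{(2)}_{fi} = \frac{(-ig)^2}{2!}\int d^4x\, d^4x'\, F(x,x')\, f(x,x') ,$$
$$F(x,x') = \langle 0|\, \hat c_2\, \hat T\left\{\hat\Phi(x)\, \hat\Phi(x')\right\} \hat c_1^+ \,|0\rangle = \frac{1}{\sqrt{2\varepsilon_{k_1}V\, 2\varepsilon_{k_2}V}}\left[e^{-ik_1x + ik_2x'} + e^{-ik_1x' + ik_2x}\right] ,$$
$$f(x,x') = \langle 0|\, \hat a_2\, \hat T\left\{\hat\varphi^+(x)\,\hat\varphi(x)\,\hat\varphi^+(x')\,\hat\varphi(x')\right\} \hat a_1^+ \,|0\rangle .$$
In QED the complications are related to the spins of the particles; in particular,
$$F^{\mu\nu}(x,x') = \frac{4\pi}{\sqrt{2\omega_1V\, 2\omega_2V}}\left[e_1^\mu e_2^{\nu*}\, e^{-ik_1x + ik_2x'} + e_1^\nu e_2^{\mu*}\, e^{-ik_1x' + ik_2x}\right] , \quad e_i^\mu \equiv e^\mu_{k_i\lambda_i} .$$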
The propagator of the scalar particle
$$\langle 0|\, \hat T\left\{\hat\varphi(x)\, \hat\varphi^+(x')\right\} |0\rangle = iD(x - x') \tag{15.3}$$
entered the function $f(x,x')$ along with the contractions of the operators $\hat a_2$ and $\hat a_1^+$ with $\hat\varphi^+(x')$ and $\hat\varphi(x)$. Analogously, the propagator of the spinor particle
$$\langle 0|\, \hat T\left\{\hat\Psi_j(x)\, \hat{\bar\Psi}_k(x')\right\} |0\rangle = iG_{jk}(x - x') \tag{15.4}$$
now enters the function $f_{\mu\nu}(x,x')$ along with the contractions of the operators $\hat a_2$ and $\hat a_1^+$ with $\hat{\bar\Psi}(x')$ and $\hat\Psi(x)$. This propagator is a matrix in the spinor indices $j$ and $k$.

The process $\pi^0\pi^- \to \pi^0\pi^-$ is described by two Feynman diagrams, Figs. 28 and 29, corresponding to the exchange of a virtual $\pi^-$ in the $u$ and $s$ channels. As a result, the scattering amplitude for $\pi^0\pi^-$ scattering is
$$M_{fi} = -g^2\left[D(p_1 - k_2) + D(p_1 + k_1)\right] . \tag{15.5}$$
The process $\gamma e^- \to \gamma e^-$ is described by two Feynman diagrams, Figs. 49 and 50, corresponding to the exchange of a virtual $e^-$ in the $u$ and $s$ channels. As a result, the scattering amplitude for $\gamma e^-$ scattering reads
$$M_{fi} = -4\pi e^2\left[\bar u_2\,\gamma_\mu e_1^\mu\, G(p_1 - k_2)\, \gamma_\nu e_2^{\nu*}\, u_1 + \bar u_2\,\gamma_\mu e_2^{\mu*}\, G(p_1 + k_1)\, \gamma_\nu e_1^\nu\, u_1\right] . \tag{15.6}$$

Figure 49: Diagram with $e$ exchange in the $u$ channel for $\gamma e$ scattering

Figure 50: Diagram with $e$ exchange in the $s$ channel for $\gamma e$ scattering
15.2. The electron propagator

Let me remind you that last time we said that the scattering of charged particles requires the corresponding propagator of the charged particle, i.e. the vacuum expectation value of the chronological product of two operators of these charged particles, see Eqs. (3) and (4). We obtained a complete result for the scalar case, which looks particularly simple for the Fourier transform $D(p)$ of the function $D(x)$: $D(p) = 1/(p^2 - m^2 + i0)$. Besides, it was necessary to change the traditional paradigm of elementary particles. In the beginning, the field was represented as a set of plane waves whose squared momentum was equal to the squared mass of the particle, $p^2 = m^2$. Those were ordinary particles; the corresponding integrand included a double sum, because of the two operators. Using the rule for taking the chronological product, we reduced this double sum to a single one, the summing
being performed over states corresponding to usual real particles. Then it turned out that this result could be rewritten by changing the sum, or the integral over $d^3p$, to an integral over $d^4p$. In doing so, however, we gave up the ordinary condition $p^2 = m^2$: it is no longer met. The states over which we perform the expansion become virtual, and the quantity $p^2 - m^2$ is the measure of the virtuality.

We were talking about scalar particles. That was a process in which a neutral particle was scattered on a charged one, and the object $D(x - x')$ needed study. This object corresponds to the propagation of the charged particle from the point $x$ to the point $x'$ in coordinate space. Now we are going to consider an analogue of this process, in which a photon replaces the $\pi^0$ and an electron replaces the charged particle $\pi^-$. This is the Compton effect. In the scalar case we carried everything through to the end and gained a new understanding of particles, which included the concept of “virtual particles”. Besides, we revealed that the function $D(x)$, whose complete form we have found, is in fact the Green function of the equation these fields obey. The fields obey the Klein-Fock-Gordon equation, and $D(x)$ is the Green function of this equation. That is, application of the operator $(\hat p_\mu \hat p^\mu - m^2)$ to the field $\varphi(x)$ yields 0, and application of this operator to the propagator yields the delta function of $x$:
$$\left(\hat p_\mu \hat p^\mu - m^2\right) D(x) = \delta(x)$$
or, in the momentum representation,
$$\left(p^2 - m^2\right) D(p) = 1 .$$
This results in the final simple expression
$$D(p) = \frac{1}{p^2 - m^2 + i0} .$$
And we said that this last circumstance could be used very effectively in the future. Then we proceeded to the scattering of a photon on an electron. Now we are interested in the electron propagator $G_{jk}(x)$, which is defined by Eq. (4). Quite similarly, this propagator is the vacuum expectation value of the chronological product of the electron-positron fields $\Psi(x)$ and $\bar\Psi(x')$. This is the object we now study. We can act as before: (i) expand $\Psi(x)$ and $\bar\Psi(x')$ in sums over the real states; (ii) then, due to the property of the $T$-product, reduce the double sum to a single one; (iii) transform it to an integral over $d^3p$; (iv) then turn to an integral over $d^4p$, and so on. Everything is repeated as before, but with significant complications due to the fact that the fields $\Psi(x)$ and $\bar\Psi(x')$ are now bispinors, not scalars. These difficulties can be avoided by an analogy with the last observation: the function $D(x)$ is exactly the Green function of the equation that the field $\varphi(x)$ obeys. The field $\Psi(x)$ obeys the Dirac equation, which looks like this:
$$(\hat p_\mu \gamma^\mu - mI)_{ij}\, \Psi_j(x) = 0 . \tag{15.7}$$
Here $\hat p_\mu = i\,\partial/\partial x^\mu$ is the usual momentum operator, $(\gamma^\mu)_{ij}$ are the Dirac matrices, and the mass $m$ is multiplied by the identity matrix $I_{ij} = \delta_{ij}$, which is sometimes just implied and not written. So in this case we have a matrix equation. Since the field
$\Psi(x)$ satisfies Eq. (7), the electron propagator $G_{jk}(x)$ in turn must obey the equation for the corresponding Green function:
$$(\hat p_\mu \gamma^\mu - mI)_{ij}\, G_{jk}(x) = \delta(x)\,\delta_{ik} . \tag{15.8}$$
This gives (let me omit the spinor indices, which are implied)
$$(\hat p_\mu \gamma^\mu - mI)\int G(p)\, e^{-ipx}\, \frac{d^4p}{(2\pi)^4} = \int (p_\mu \gamma^\mu - mI)\, G(p)\, e^{-ipx}\, \frac{d^4p}{(2\pi)^4} = \delta(x)\, I = I \cdot \int e^{-ipx}\, \frac{d^4p}{(2\pi)^4} ,$$
or, in the momentum representation,
$$(p_\mu \gamma^\mu - mI)\, G(p) = I . \tag{15.9}$$
Formally, this means that
$$G(p) = (p_\mu \gamma^\mu - mI)^{-1} . \tag{15.10}$$
This is a formal answer, and it would be good to have an explicit form of the inverse matrix, especially since we know that there are very subtle aspects in this analysis, because the integration is performed over a region where $p^2$ can exceed $m^2$, be less than $m^2$, or be equal to $m^2$. The problem of handling this singularity led to the additional term $+i0$, which defines how the singularity at the point $p^2 = m^2$ is to be treated. So, we can multiply both sides of Eq. (9) from the left by $(p_\nu \gamma^\nu + mI)$ and take into account that $p_\nu \gamma^\nu\, p_\mu \gamma^\mu = p^2 I$. This leads to
$$(p_\nu \gamma^\nu + mI)\,(p_\mu \gamma^\mu - mI)\, G(p) = \left(p^2 - m^2\right) G(p) = p_\nu \gamma^\nu + mI .$$
As a result, we obtain
$$G(p) = \frac{p_\nu \gamma^\nu + mI}{p^2 - m^2 + i0} \tag{15.11}$$
or, in the reduced form,
$$G(p) = \frac{\not{p} + m}{p^2 - m^2 + i0} , \tag{15.12}$$
where
$$\not{p} \equiv p_\mu \gamma^\mu . \tag{15.13}$$
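The algebra behind Eqs. (9) to (12) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard (Dirac) representation of the $\gamma$ matrices and the metric $(+,-,-,-)$, picks an arbitrary off-shell 4-momentum so that the $+i0$ plays no role, and relies on small hand-written matrix helpers whose names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(coeffs, mats):
    """Linear combination sum_i coeffs[i] * mats[i] of equally sized matrices."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Build a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Dirac representation: gamma^0 = diag(I, -I), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]

def slash(p):
    """p-slash = p_mu gamma^mu for contravariant components p = (p0, px, py, pz)."""
    return lincomb([p[0], -p[1], -p[2], -p[3]], GAMMA)

def propagator(p, m):
    """G(p) = (p-slash + m) / (p^2 - m^2), valid off shell (p^2 != m^2)."""
    p2 = p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2
    numerator = lincomb([1, m], [slash(p), I4])
    return [[x / (p2 - m * m) for x in row] for row in numerator]

def max_deviation_from_identity(p, m):
    """Largest element of |(p-slash - m) G(p) - I|; should vanish by Eq. (9)."""
    operator = lincomb([1, -m], [slash(p), I4])
    product = matmul(operator, propagator(p, m))
    return max(abs(product[i][j] - I4[i][j]) for i in range(4) for j in range(4))

# Arbitrary illustrative off-shell momentum and mass
print(max_deviation_from_identity((1.7, 0.3, -0.4, 1.1), 0.511))
```

The printed deviation is at the level of rounding errors, which confirms that the matrix (12) is indeed the inverse required by Eq. (10) away from the mass shell.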
15.3. The Compton effect and its applications

15.3.1. Cross section of the Compton scattering

Taking into account Eqs. (6) and (12), we obtain the Compton scattering amplitude in the form
$$M^{(2)}_{fi} = -4\pi\alpha\left[\frac{\bar u_2\, \not{e}_1\, (\not{p}_1 - \not{k}_2 + m)\, \not{e}_2^{\,*}\, u_1}{u - m^2} + \frac{\bar u_2\, \not{e}_2^{\,*}\, (\not{p}_1 + \not{k}_1 + m)\, \not{e}_1\, u_1}{s - m^2}\right] . \tag{15.14}$$
Using this expression, we can qualitatively predict the behaviour of the differential cross section from the two propagators, one of which is proportional to $1/(u - m^2)$ (this quantity is always negative) and the other to $1/(s - m^2)$ (this quantity is always positive). Eq. (14) is the basis for the dynamics of the process: we know the matrix element; it must be squared, summed over the spin states of the final particles, and averaged over the spin states of the initial particles; then we get the cross section. This is tedious work, which yields the following result for the differential and total cross sections (the corresponding calculations can be found in §86 of Quantum Electrodynamics by Berestetskii, Lifshitz and Pitaevskii [1]):
$$\frac{d\sigma}{dy} = \frac{2\sigma_0}{x}\left\{\frac{1}{1-y} + 1 - y - \frac{4y}{x(1-y)}\left[1 - \frac{y}{x(1-y)}\right]\right\} , \tag{15.15}$$
$$\sigma = \frac{2\sigma_0}{x}\left[\left(1 - \frac{4}{x} - \frac{8}{x^2}\right)\ln(x+1) + \frac12 + \frac{8}{x} - \frac{1}{2(x+1)^2}\right] . \tag{15.16}$$
Here we introduce two invariant dimensionless variables
$$x = \frac{s - m^2}{m^2} = \frac{2p_1k_1}{m^2} , \quad y = \frac{k_1k_2}{p_1k_1} \tag{15.17}$$
and the quantity
$$\sigma_0 = \pi r_e^2 = \frac{\pi\alpha^2}{m^2} = 2.5\cdot 10^{-25}\ \text{cm}^2 .$$
It is easy to find that with the increase of $s$ (or $x$) the total cross section decreases from the Thomson cross section $(8/3)\sigma_0$ at $x \ll 1$ to the asymptotic value
$$\sigma = \frac{2\sigma_0}{x}\left(\ln x + \frac12\right) \quad \text{at } x \gg 1 .$$
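The consistency of Eqs. (15) and (16) and the two limits quoted above can be verified numerically. The following sketch works in units of $\sigma_0$ and assumes the kinematic range $0 \le y \le x/(x+1)$, which follows from the conservation laws; the function names and the simple midpoint integration are illustrative choices.

```python
import math

def total_cross_section(x):
    """Total Compton cross section, Eq. (16), in units of sigma_0."""
    return (2.0 / x) * ((1.0 - 4.0 / x - 8.0 / x**2) * math.log1p(x)
                        + 0.5 + 8.0 / x - 0.5 / (x + 1.0)**2)

def dsigma_dy(x, y):
    """Differential cross section d(sigma)/dy, Eq. (15), in units of sigma_0."""
    r = y / (x * (1.0 - y))
    return (2.0 / x) * (1.0 / (1.0 - y) + 1.0 - y - 4.0 * r * (1.0 - r))

def integrated_cross_section(x, n=20000):
    """Midpoint-rule integral of Eq. (15) over 0 <= y <= x/(x+1) (assumed range)."""
    y_max = x / (x + 1.0)
    h = y_max / n
    return h * sum(dsigma_dy(x, (i + 0.5) * h) for i in range(n))

for x in (1e-3, 0.092, 0.85, 4.8, 1e4):
    sigma = total_cross_section(x)
    print(f"x = {x:8.3g}: sigma/sigma0 = {sigma:.6g}, "
          f"integral of Eq. (15) = {integrated_cross_section(x):.6g}")

print("Thomson limit 8/3 =", 8.0 / 3.0)
print("asymptotic form at x = 1e4:", (2.0 / 1e4) * (math.log(1e4) + 0.5))
```

The integral of the differential cross section reproduces the total one, and the values at very small and very large $x$ approach $(8/3)\sigma_0$ and the logarithmic asymptotics, respectively.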
This effect is one of my favorites. Remarkably, the phenomenon was discovered as early as 1923-24. The cross section for Compton scattering was derived soon afterwards but, of course, without polarization effects. That was the Klein-Nishina formula (1929). A year later, I.E. Tamm derived this formula independently in the framework of the old quantum mechanics. Though the main calculations concerning the Compton effect were done in the late 1920s, the study continued until recently. For example, a full calculation that took into account the polarization of all initial and final particles as well as nonlinear effects was done only a few years ago, in a work by Ivanov, Kotkin and Serbo (2004). Since I was directly involved in that, I would like to tell you more than you can find in the textbooks.

Let us talk in more detail about kinematics. The kinematic part is very important. Why do I focus on this? Because this is a very unusual situation, quite different from that at colliding electron-positron beams. We are used to collisions in the center-of-mass system, especially at BINP, although at the B factory KEKB the electrons have an energy of 8.5 GeV and the energy of the positrons is smaller by almost a factor of two. As far as the Compton effect is concerned, one may have to study a situation in which the colliding particles differ greatly in energy; we have already talked about this a bit. The kinematics here is unfamiliar, and thus we need to discuss it in more detail. The kinematics begins with a conservation law: $k_1 + p_1 = k_2 + p_2$. In a collision of two particles with known
4-momenta, the energy of the emitted particle (an electron or a photon, because both are of interest) is strongly tied to the emission angle. For example, suppose we have a collision of a photon and an electron and are interested in the final photon, whereas the final electron is of no interest to us so far. Then it is very convenient to rewrite the above equation with the final electron left alone, $k_1 + p_1 - k_2 = p_2$, and then square it: $(p_1 + k_1 - k_2)^2 = (p_2)^2$. This gives $m^2 + 2p_1k_1 - 2p_1k_2 - 2k_1k_2 = m^2$. Thus the information about the final electron is reduced to its mass squared, and we get the final expression
$$k_2\,(p_1 + k_1) = p_1k_1 . \tag{15.18}$$
This is very convenient, because the energy $\omega_2$ and the angle $\theta$ of the second photon occur only on the left-hand side; therefore, the energy of the second photon can be expressed through its angle and the energies of the colliding electron, $\varepsilon_1$, and photon, $\omega_1$. Let us consider two very different possibilities.

15.3.2. Experiments of Arthur Compton (1923-1924)

X-rays with energies $\omega_1$ from a few keV up to tens of keV were scattered on atoms. Such X-rays were obtained in X-ray tubes; they have the wavelength $\lambda_1 = 2\pi/\omega_1 \sim 10^{-8}$ cm, which is comparable with the size of an atom. The atomic binding energy, i.e. the binding energy of the electrons, especially of the outer ones, is only a few eV, maybe ten eV. For example, the binding energy of the hydrogen atom is 13.6 eV. When a photon with an energy of a few keV collides with a weakly bound electron, the electron can be regarded as free under these conditions. At that time the picture of the atom looked like this: a nucleus and an electron bound by some quasi-elastic forces. A photon strikes the electron. What is a photon? At that time, X-rays were considered purely as a wave: an incident wave in which an electric field oscillates. There is, of course, a magnetic field as well, but the fields in question cause non-relativistic motion of the electron.
Since the magnetic part of the Lorentz force is smaller than the electric force by a factor of $v/c$, the magnetic field can be neglected. In such a field, the electron will oscillate and emit radiation. If the field oscillates with a frequency $\omega_1$, the electron will oscillate with the same frequency and emit radiation of the same frequency $\omega_1$, regardless of the direction of emission. What did Compton find? Contrary to this expectation, he observed a frequency that depended on the scattering angle. The frequency shifts were small (ten percent or less), but the dependence was reliably measured in experiments. An explanation was needed. A natural explanation was found in a picture where X-rays are considered as a set of photons: some balls (photons) collide with other balls (electrons), which in this case are at rest. Once again, the electron can be considered as free. Then, even non-relativistically, it is clear that under these conditions the final photon energy (and therefore frequency) must depend on the scattering angle. Let's prove it. So, we have a photon which, for example, flies along the $x$ axis and collides with an electron at rest, and then a final photon flies off with a frequency $\omega_2$ that depends on the angle $\theta$; therefore,
$$k_1 = \omega_1\,(1, 1, 0, 0) , \quad k_2 = \omega_2\,(1, \cos\theta, \sin\theta, 0) , \quad p_1 = (m_e, 0, 0, 0) .$$
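The conservation law (18) with these 4-vectors can also be solved for $\omega_2$ numerically, which shows the angular dependence directly. This is a minimal sketch: the X-ray energy of 17.4 keV is an illustrative value chosen here, not a number from the lecture, and the helper names are hypothetical.

```python
import math

def dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def scattered_photon_energy(omega1, theta, m_e=511.0):
    """Solve k2.(p1 + k1) = p1.k1, Eq. (18), for omega2 (energies in keV).

    Writing k2 = omega2 * n2 with n2 = (1, cos(theta), sin(theta), 0) makes
    Eq. (18) linear in omega2.
    """
    k1 = (omega1, omega1, 0.0, 0.0)
    p1 = (m_e, 0.0, 0.0, 0.0)
    n2 = (1.0, math.cos(theta), math.sin(theta), 0.0)
    total = tuple(a + b for a, b in zip(p1, k1))
    return dot(p1, k1) / dot(n2, total)

omega1 = 17.4  # keV, illustrative X-ray energy (assumption)
for angle_deg in (0, 45, 90, 135, 180):
    omega2 = scattered_photon_energy(omega1, math.radians(angle_deg))
    shift = 100.0 * (omega1 - omega2) / omega1
    print(f"theta = {angle_deg:3d} deg: omega2 = {omega2:.3f} keV, shift = {shift:.2f} %")
```

For such X-ray energies the final photon energy decreases monotonically with the angle, and the shift stays at the level of a few percent, in line with the small frequency shifts mentioned above.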
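If we insert these expressions into Eq. (18), we obtain the relation
$$\omega_2\,(m_e + \omega_1 - \omega_1\cos\theta) = \omega_1 m_e ,$$
from which we find
$$x = \frac{2\omega_1}{m} , \quad y = \frac{x\sin^2(\theta/2)}{1 + x\sin^2(\theta/2)}$$
and the change of the wavelength of X-rays scattered at the angle $\theta$:
$$\lambda_2 - \lambda_1 = \frac{4\pi}{m_e}\sin^2\frac{\theta}{2} , \tag{15.19}$$
where
$$\frac{1}{m_e} = 3.86\cdot 10^{-11}\ \text{cm} \tag{15.20}$$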
is the reduced Compton wavelength of the electron. Let us note that $\lambda_2 > \lambda_1$ and therefore $\omega_2 < \omega_1$. In the picture of colliding balls it is almost obvious that the electron ball, initially at rest, recoils and acquires some kinetic energy, which is taken from the initial photon energy, and thus the final photon energy must be less than the initial one. The experiment was in perfect agreement with this relationship, which was a very significant confirmation of the fact that light is not only emitted and absorbed in portions, as Planck had suggested; it also propagates in portions and transfers energy in portions. Of course, we need to say a few words about the quantity $1/m_e$, or $\hbar/(m_e c)$ in ordinary units. It equals 386 fm, or about $4\cdot 10^{-11}$ cm, i.e. almost 100 times the size of a nucleus. Therefore, if the initial wavelength was comparable with atomic dimensions, $\sim 10^{-8}$ cm, the shift in energy was a few percent. The angular distribution of the final photons corresponds to dipole radiation, i.e. they were emitted mainly in the direction perpendicular to the electric field. However, when we proceed to modern applications, the picture will be quite different.

15.3.3. Collisions of ultrarelativistic electrons and laser photons

At present there are a number of devices in which beams of high-energy electrons with energy $\varepsilon_1 \gg m_e \approx 0.5$ MeV collide head-on with bunches of laser photons of small energy $\omega_1 \sim 1$ eV. This process is used for the production of high-energy photons, because the final photons fly mainly backward, i.e. almost along the direction of the initial electrons, and take a considerable portion of the electron energy. To explain this qualitatively, we have to consider the dynamics of the process, which is determined by the two terms in Eq. (14). The most sensitive place in this case is the denominators of the propagators. They are proportional to either $u - m^2$ or $s - m^2$.
The quantity
$$s - m^2 = x\,m^2 = 2p_1k_1 = 2\varepsilon_1\omega_1(1 + v_1)$$
is fixed (here $v_1$ is the velocity of the initial electron). On the contrary, the quantity
$$m^2 - u = 2p_1k_2 = x(1-y)\,m^2 = 2\varepsilon_1\omega_2(1 - v_1\cos\theta) \tag{15.21}$$
depends strongly on the photon scattering angle $\theta$ with respect to the direction of the initial electron momentum. The Lorentz factor of the initial electron is large, $\gamma = \varepsilon_1/m_e \gg 1$, and $v_1 \approx 1 - 1/(2\gamma^2)$ is close to 1. Therefore, if the angle $\theta$ tends to 0, the quantity (21) becomes proportional to the difference $(1 - v_1)$, which is very small. But if this term $m^2 - u$ is
small, its contribution to the cross section will be large. Thus the dynamics dictates that small angles $\theta \sim 1/\gamma$ are very important in the distribution. For such a set-up, Eq. (18) takes the form
$$\omega_2\left[\varepsilon_1(1 - v_1\cos\theta) + \omega_1(1 + \cos\theta)\right] = \varepsilon_1\omega_1(1 + v_1) .$$
Taking into account that $\theta \ll 1$, $\cos\theta \approx 1 - \theta^2/2$, we find the energy of the final photon in the form
$$\omega_2(\theta) = y\,\varepsilon_1 = \frac{x}{x + 1 + (\gamma\theta)^2}\,\varepsilon_1 , \quad x = \frac{4\omega_1\varepsilon_1}{m_e^2} . \tag{15.22}$$
Therefore, the maximum photon energy is reached at $\theta = 0$ and equals $\max\{\omega_2\} = x\varepsilon_1/(x+1)$.

Take a look at just two examples. The ROKK facility (the name stands for backscattered Compton quanta) was used at BINP. Experiments with electrons of about 5 GeV energy were conducted at VEPP-4M in 1997. In the VEPP-4M beamline, only one electron beam was circulating; there was no positron beam. The laser photons were produced by several types of lasers, in particular infrared neodymium-glass lasers with a photon energy of 1.2 eV; the wavelength was approximately 1 micron. In this case the parameter $x = 0.092$ and the maximum energy turns out to be 0.42 GeV, i.e. in a collision the tiny laser photon takes a little less than 10 % of the electron energy, and the photon energy increases 350 million times. Moreover, in the BINP experiments even tagged photons were produced. The MD detector facility was equipped with a system for registration of scattered electrons (SRSE). In a collision, the electron is scattered at a very small angle. Nevertheless, it loses some energy and drops out of the acceleration mode; the central detector does not detect it, unlike the special SRSE. The latter is located far from the collision point, where the electrons, slightly deflected by the magnetic field, enter the system and are detected. If this energy is measured with high accuracy, as it is, we know everything about the final photon, i.e.
it is a tagged photon, provided that not too many photons are produced in one collision of bunches. The reason is that when detecting the scattered electrons we cannot identify which photon is tagged, so it is better if at most one photon is produced per collision. Remember this figure; we'll need it. This means that the photon target should not be too dense. Thus, the ROKK facility was a very effective device for the production of high-energy photons.

How can we use them? They are used in very interesting experiments that show the sharp contrast between quantum electrodynamics and classical electrodynamics. For example, according to the superposition principle of classical electrodynamics, if two fields with strengths $\mathbf E_1$ and $\mathbf E_2$ intersect, the total field in the intersection region is simply the sum $\mathbf E_1 + \mathbf E_2$. In particular, this means that if a photon with its own fields flies past a nucleus with its Coulomb field, nothing should happen in the region where the beam of light crosses the electric field of the nucleus, except for the vector addition of the corresponding field strengths. And what really happens? Using the perturbation theory of quantum electrodynamics and the explicit relevant terms of its expansion (the Feynman diagrams), we can say that the following may happen (see Fig. 51): the photon can turn for a short time into a virtual electron-positron pair, and then the pair interacts with the nucleus. As a result, scattered photons arise that can deviate from the initial direction. An informal name of this phenomenon is Delbrück scattering,
although Delbrück was not involved in the calculation of this phenomenon; it was his idea to examine this process. The phenomenon was calculated theoretically a long time ago, in the 1940s, and was also observed experimentally. BINP has added more accurate measurements.
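Returning to the kinematics of Eq. (22), the numbers quoted above for the ROKK set-up are easy to reproduce. The sketch below is an illustration that assumes a rounded electron mass of 0.511 MeV and the small-angle form of Eq. (22); the function names are hypothetical.

```python
M_E_EV = 0.511e6  # electron mass in eV (rounded value, assumed here)

def x_parameter(eps1_ev, omega1_ev):
    """Dimensionless parameter x = 4 * omega1 * eps1 / m_e^2 of Eq. (22)."""
    return 4.0 * omega1_ev * eps1_ev / M_E_EV**2

def photon_energy(eps1_ev, omega1_ev, theta=0.0):
    """Energy of the scattered photon, Eq. (22), valid for theta << 1 (radians)."""
    x = x_parameter(eps1_ev, omega1_ev)
    gamma = eps1_ev / M_E_EV
    return x / (x + 1.0 + (gamma * theta)**2) * eps1_ev

eps1, omega1 = 5.0e9, 1.2  # VEPP-4M electron energy and laser photon energy, eV
gamma = eps1 / M_E_EV
omega2_max = photon_energy(eps1, omega1)

print(f"x = {x_parameter(eps1, omega1):.3f}")
print(f"maximum photon energy = {omega2_max / 1e9:.3f} GeV")
print(f"fraction of electron energy = {omega2_max / eps1:.3f}")
print(f"energy gain omega2/omega1 = {omega2_max / omega1:.3g}")
print(f"photon energy at theta = 1/gamma: {photon_energy(eps1, omega1, 1.0 / gamma) / 1e9:.3f} GeV")
```

The last line illustrates how quickly the photon energy drops once the angle reaches $\theta \sim 1/\gamma$, which here is about $10^{-4}$ rad.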
Figure 51: The Delbrück scattering
Figure 52: The process of the photon splitting in the field of a nucleus
But the same quantum electrodynamics allows a more exotic process. Note that in the case just discussed, the electron-positron pair interacts with the nucleus via two photons. But this is not obligatory; another process is also possible, in which the electron-positron pair interacts with the nucleus via one photon, see Fig. 52. If the photon energies are not too high, the nucleus experiences almost no recoil and just provides an external electric field. What happens to our photon in this external electric field? It splits! Moreover, if the momentum transfer is very low, the approximate equality $\omega_1 = \omega_2 + \omega_3$ must hold. This process was also calculated a long time ago and was reported to be observed in an experiment in Darmstadt. But it soon became clear that the experiment had been completely wrong. So BINP was the first to make such an observation.

In 1996 some interesting experiments were carried out at the Stanford Linear Accelerator Center (SLAC, USA). In the 3-km linear accelerator, electrons with an energy of $\varepsilon_1 = 46$ GeV collided with laser photons. There were a few different lasers, including one with the photon energy $\omega_1 = 1.2$ eV. The electron energy here was nearly 10 times as large as at VEPP-4M, and hence the parameter $x$ was also about 10 times as large, of the order of 0.85. So the maximum energy of the scattered photon turned out to be as high as $\omega_2 = 21$ GeV. These 21 GeV mean that in a collision of this tiny photon with a high-energy electron, almost half of the electron energy is taken from it. There was another, more ambitious task in this experiment: it was used to verify effects of nonlinear quantum electrodynamics. The point is that the laser was very unusual, of very high power. Its flash carried a small amount of energy, about 1 joule, but the features of the laser enabled concentration of this energy in a very short period of time, about $10^{-12}$ s.
So, the flash power was 1 J divided by $10^{-12}$ s, i.e. about $10^{12}$ W, which exceeds the power of any hydropower station. (However, power stations run continuously, whereas the laser operates for a picosecond.) That was due to the concentration of this small energy in a very short time. When the flash was very well focused, the laser target was very dense, unlike that at BINP. That enabled observation of the nonlinear Compton effect, in addition to the ordinary Compton effect. In terms of the same Feynman diagrams, the former looks like this (see Fig. 53): the electron can be scattered on two or more photons simultaneously. That, of course, changes the final photon energy. The initial photons are identical and have equal energies, but the entire kinematics changes a little. The density of these photons was very large, at a level of $10^{28}$ cm$^{-3}$. What can it be compared with? All the electron bunch densities attainable in our accelerators are insignificant as compared with
this, i.e. the usual density in beams is that of a strongly rarefied plasma. Even the density of electrons in an ordinary solid (which is always considered to be large, at a level of $10^{22}$ cm$^{-3}$) is six orders of magnitude lower. That resulted in a nonzero probability for an electron to collide with one, two, three or even four photons at once. This is possible theoretically. How was it noticed? It was enough to measure the energy of the recoil electron. As we know, when the photon in the ordinary Compton effect has its maximum energy, the electron has its minimum energy, which can be calculated in advance. Since these electrons were detected and their minimum energy was measured, the experimentalists managed to find out that in some cases at least four photons were absorbed at once.

Figure 53: The nonlinear Compton scattering

15.3.4. Compton scattering as a basic process for $e \to \gamma$ conversion at future photon colliders

At present, the international community is developing the International Linear Collider (ILC), where new principles of acceleration are to be applied. Until now, all colliders, including the electron-positron, proton-proton, proton-antiproton, and electron-proton ones, were built according to a scheme in which two beams of charged particles are generated, injected into storage rings and brought into collision at the interaction point. A distinctive feature of this method, and perhaps the most significant one, is the storage rings, where a beam lives for hours, gradually losing its intensity via scattering on the residual gas in the beam line and via the interaction of the beams. Then, in a few hours, or maybe even in a day, it is replaced with a new particle beam. So, these are accelerators in which beams are used repeatedly for multiple collisions. The word “repeatedly” is the key one here. For example, at VEPP-4M beams collide millions of times per second for hours. The scheme is very good, but it has some limitations. What happens when we turn to higher energy, e.g.
from VEPP-4M with its 5 GeV to the ILC project, where the energy of the electrons and positrons in the first stage is to be 50 times as large? In storage rings, a light particle, the electron, circulates under the influence of magnetic fields and emits synchrotron radiation. As the particle energy grows, an increasing share of the energy goes into synchrotron radiation. So, how can this be avoided? As you know, even from non-relativistic mechanics, a particle moving in a circle must undergo centripetal acceleration. If the radius of motion is increased, the acceleration decreases, as does the synchrotron radiation. By how much? The circumference of VEPP-4M is 366 meters, and the circumference of the big 100 GeV accelerator, LEP-2, is already 27 km. The length of a circular storage ring grows quadratically with the energy. Therefore, for an energy of about 250 GeV or 500 GeV, the length of such a collider would be well over 100 km, which is unrealistic. That gave rise to the idea of using linear colliders instead of storage rings. It was realized long ago that
the problem of synchrotron radiation is absent in linear colliders. In this case the radiation power due to acceleration is negligible, because the power radiated under acceleration along the velocity is independent of the energy; therefore the length of linear colliders grows only linearly with the energy.

Linear colliders are schematically arranged as follows. A source of electrons, e.g. a photogun, is used. Beams of nonrelativistic electrons are injected into a small accelerating storage ring, where their initial state with a large spread of momenta is changed to a state in which all momenta are roughly equal. Then the beams are transported to an interim pre-accelerator, where they are brought to relativistic, but not very large, energies. Next is the main beam line, a linear accelerator in which a traveling electromagnetic field must increase the energy of the electrons to approximately 250 GeV. The length of the beam line of the linear collider is expected to reach approximately 15 km. Positrons are accelerated likewise, but first it is necessary to produce them. There is a suitable, though complex, scheme of positron production, which we omit here. In any case, another 15 km of linear acceleration are needed, and only then do the electrons and positrons collide and pass through one another. In each collision a rather small number of electrons and positrons perform some useful action: they produce new particles or cause a large number of events with known particles. The Higgs boson, for example, can be studied at these colliders with much greater accuracy and in more detail than at the Large Hadron Collider. After that the life of this electron bunch ends; it goes into a wall, where it disappears. The same happens to the positrons. Then it is necessary to prepare the next bunch for the next collision, which must be done fast enough for the number of collisions to be adequate.
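The scaling argument for rings versus linear machines can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude illustration: it assumes that the ring length scales exactly as the square of the energy from the LEP-2 reference point (27 km at 100 GeV) and that the linac length scales linearly from the 15 km per 250 GeV beam quoted above. The function names are hypothetical and introduced only for this example.

```python
# Rough scaling sketch; the reference points are the figures quoted in the text.
# Exact quadratic and linear scaling are simplifying assumptions.

def ring_length_km(energy_gev, ref_energy_gev=100.0, ref_length_km=27.0):
    """Storage-ring length, assuming it grows quadratically with beam energy."""
    return ref_length_km * (energy_gev / ref_energy_gev) ** 2


def linac_length_km(energy_gev, ref_energy_gev=250.0, ref_length_km=15.0):
    """Linear-accelerator length, assuming it grows linearly with beam energy."""
    return ref_length_km * energy_gev / ref_energy_gev


for energy in (250.0, 500.0):
    print(f"{energy:.0f} GeV: ring ~ {ring_length_km(energy):.0f} km, "
          f"linac ~ {linac_length_km(energy):.0f} km per beam")
```

Under these assumptions a 250 GeV ring would already be about 170 km long, in line with the "well over 100 km" estimate, while the linac length merely doubles when the energy doubles.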
Another feature is that the number of particles in these bunches will be smaller than the number of particles in bunches circulating in storage rings. The frequency of collisions, which is also determined by the rate at which these bunches are produced, will be lower. How can a sufficiently large number of useful events be achieved? Only by increasing the density of the bunches. However, with a fixed number of particles, which is generally smaller than the usual one, this means reducing the bunch sizes, first of all the transverse ones. The facility being planned now looks absolutely fantastic. Let’s recall the beams of VEPP-4M once again. Their vertical transverse size is about 30 microns. The horizontal size is 30 times as large, i. e. almost a millimeter. At linear colliders, the vertical size of the colliding bunches is planned to be at a level of 6 nanometers! This means a vertical compression by a factor of several thousand as compared with VEPP-4M (30 microns divided by 6 nanometers is 5000). A similar compression is needed horizontally. It is difficult to operate with such bunches, even to locate them. But all these problems are solvable. The project is being developed, and its technical details have been published.

A fundamental difference between this type of collider and the storage rings is that electron bunches are used only once, i. e. these are single-pass accelerators. A Novosibirsk group of Ginsburg, Kotkin, Serbo and Telnov (1981-1983) put forth the idea that these electrons could be converted into photons via the Compton effect. We know that this idea is realizable. It is easy to calculate that for an electron energy of $\varepsilon_1 = 250$ GeV and flashes from the same laser, the corresponding parameter is $x = 4.5$ and the maximum photon energy is $\omega_2 = \frac{x}{x+1}\,\varepsilon_1 = 0.82\,\varepsilon_1 = 205$ GeV, i. e. about 80 percent of the initial electron energy. So, in principle, instead of an electron, a photon can be produced with approximately the same energy, which differs from the electron energy by just 20 percent.
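The numbers quoted for the conversion can be reproduced with a short calculation. The sketch below assumes the standard head-on expression $x = 4\varepsilon_1\omega_1/m_e^2$ for the Compton parameter and a laser photon energy of about 1.17 eV (a typical value for a neodymium glass laser; this value is an assumption made for the example, as are the function names).

```python
# Minimal sketch of the e -> gamma conversion kinematics (head-on collision).
# Assumptions: x = 4 * E_electron * w_laser / m_e^2 and a laser photon
# energy of 1.17 eV; neither number is taken verbatim from the lecture.

M_E_EV = 0.511e6          # electron rest energy in eV
LASER_PHOTON_EV = 1.17    # assumed neodymium-laser photon energy in eV


def compton_x(electron_energy_ev, laser_photon_ev=LASER_PHOTON_EV):
    """Dimensionless Compton parameter x for a head-on collision."""
    return 4.0 * electron_energy_ev * laser_photon_ev / M_E_EV ** 2


def max_photon_energy_ev(electron_energy_ev, laser_photon_ev=LASER_PHOTON_EV):
    """Maximum energy of the scattered photon: x / (x + 1) * E_electron."""
    x = compton_x(electron_energy_ev, laser_photon_ev)
    return x / (x + 1.0) * electron_energy_ev


for energy_gev in (46.0, 250.0):
    energy_ev = energy_gev * 1e9
    print(f"E = {energy_gev:.0f} GeV: x = {compton_x(energy_ev):.2f}, "
          f"max photon energy = {max_photon_energy_ev(energy_ev) / 1e9:.1f} GeV")
```

With these inputs the 250 GeV case gives $x$ close to 4.5 and a maximum photon energy close to 205 GeV, while the 46 GeV case gives roughly 21 GeV.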
What will be the number of these photons? Calculations show that if the laser flash is well focused, then even with existing lasers almost every electron has a chance to collide with a laser photon. Thus, the number of photons produced will be of the order of the number of initial electrons. So, the use of the Compton effect in this case enables construction of colliders with colliding photons. Fig. 54, taken from a review in Physics Today 48 (March 1998) by Andrew Sessler (who was the president of the American Physical Society), explains our idea.

Figure 54: Scheme of the photon-photon collider

The idea is that somewhere near the collision point the electron beam undergoes its final focusing, from quite large dimensions to very small ones: a few nanometers vertically and tens or hundreds of nanometers horizontally. After the final lens, these electrons move freely and are focused into a tiny area at the collision point. The positrons fly from the other side. In fact, for our purposes, electrons from the other side would be enough; positrons are not required in this scheme. At a short distance from the collision point, in a so-called conversion area (see the yellow circle in Fig. 54), the electrons are illuminated with photons from a high-power laser, flying almost head-on. The red rays represent the infrared laser photons. If the laser photon target is dense, a sufficient quantity of high-energy photons can be produced.

The issue of focusing still remains, though. Electrons are easy to focus; they can be affected by electric and magnetic fields, and appropriate lenses focus them. But how can 200 GeV photons be focused? No lenses or diffractive plates can be used in such an energy range. The already mentioned feature of the Compton effect is helpful here. After a collision with an electron, even at an angle, not necessarily head-on, the scattered photons follow the path of the electrons. Accordingly, all photons fly to the collision point almost along the initial direction of the electrons. Thus, Compton scattering produces high-energy photons, and in addition they are focused automatically. This is a very beautiful feature of this scheme.
In this way, it is possible to generate focused beams of high-energy photons, which is the essence of gamma-gamma colliders. A magnetic field deflects both the electrons that have lost a significant amount of energy and those that have not interacted in the collision region. A better option is to use electron beams from both sides: the collective field of like-charged particles causes them to diverge in the collision, while the neutral photons continue their movement to the collision point. At the end of this section I should mention that the physics at gamma-gamma colliders will be no less interesting than the physics at electron-positron colliders.
15.4. Main characteristics of $e^+e^- \to \gamma\gamma$ and $\gamma\gamma \to e^+e^-$ processes at high energies

Let’s consider the process of two-photon annihilation $e^+e^- \to \gamma\gamma$. The two Feynman diagrams of this process are shown in Figs. 55 and 56.

Figure 55: Diagram with e exchange in the t channel for the $e^+e^- \to \gamma\gamma$ process

Figure 56: Diagram with e exchange in the u channel for the $e^+e^- \to \gamma\gamma$ process

This process is just a cross channel of Compton scattering, and its differential cross section can be easily obtained by the corresponding replacements. Let us denote by $t = (p_+ + p_-)^2 = (2\varepsilon_-)^2$ the square of the total $e^+e^-$ energy and by $\theta$ the scattering angle in the centre-of-mass system. In the high-energy region, which is of interest to us and is studied on colliding beams, the total cross section is

$$\sigma = \frac{2\pi\alpha^2}{t}\left(\ln\frac{t}{m^2} - 1\right) \quad \text{at } t \gg m^2, \qquad (15.23)$$

i. e. it is on the same level as the cross section for annihilation to hadrons. The differential cross section reads

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{t}\,\frac{1 + \cos^2\theta}{\sin^2\theta} \quad \text{at } t \gg m^2,\ \theta \gg m/\varepsilon_-. \qquad (15.24)$$
The dependence on the photon scattering angle in the region of large angles is very important. Indeed, we draw the relevant diagrams of quantum electrodynamics and obtain the result (15.24). But quantum electrodynamics need not be applicable at short distances. There may be modifications, e. g. the electron propagator, which we have studied well, may turn out to be somehow modified at small distances. This propagator corresponds to the s and u exchange. Therefore, to see how it is modified, we need to investigate the region where the variables s and u are large, and hence the corresponding distances are small. It is the region where the virtual electron has large virtuality and, correspondingly, probes small distances. Therefore, when any new $e^+e^-$ accelerator is commissioned, two-photon annihilation is one of the first processes to be studied. Let’s look at Fig. 57 from the experiment of the HRS collaboration (M. Derrick et al., Physical Review D 34 (1986) 3286). The differential cross section of the process is shown as a function of the cosine of the scattering angle. The energy is large and mostly large angles are taken. In this region the agreement with the QED calculation can be verified, since something unusual may occur there at small distances. Here you can see a rather steep dependence, which is easy to study experimentally. When the first electron-positron beams were generated here at BINP, two-photon annihilation was also among the first processes studied.
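To get a feeling for the size and the angular steepness of these cross sections, formulas (15.23) and (15.24) can be evaluated numerically at $\sqrt{t} = 29$ GeV. The sketch below uses natural units and converts to nanobarns with the standard factor $(\hbar c)^2 \approx 3.894 \times 10^5$ nb GeV$^2$; the function names are hypothetical, and the numbers are only an illustration of the asymptotic formulas, not a reproduction of the HRS analysis.

```python
import math

# Evaluate the high-energy formulas (15.23) and (15.24) in natural units
# and convert GeV^-2 to nanobarns.

ALPHA = 1.0 / 137.036          # fine-structure constant
M_E_GEV = 0.511e-3             # electron mass in GeV
GEV2_TO_NB = 3.894e5           # (hbar c)^2 in nb * GeV^2


def sigma_total_nb(sqrt_t_gev):
    """Total cross section (15.23), valid for t >> m^2."""
    t = sqrt_t_gev ** 2
    sigma = 2.0 * math.pi * ALPHA ** 2 / t * (math.log(t / M_E_GEV ** 2) - 1.0)
    return sigma * GEV2_TO_NB


def dsigma_domega_nb(sqrt_t_gev, cos_theta):
    """Differential cross section (15.24) in nb/sr, valid at large angles."""
    t = sqrt_t_gev ** 2
    sin2 = 1.0 - cos_theta ** 2
    return ALPHA ** 2 / t * (1.0 + cos_theta ** 2) / sin2 * GEV2_TO_NB


print(f"sigma_total(29 GeV) ~ {sigma_total_nb(29.0):.2f} nb")
for c in (0.0, 0.3, 0.6):
    print(f"cos(theta) = {c:.1f}: dsigma/dOmega ~ "
          f"{dsigma_domega_nb(29.0, c) * 1e3:.1f} pb/sr")
```

The factor $(1 + \cos^2\theta)/\sin^2\theta$ already grows by more than a factor of two between $\cos\theta = 0$ and $\cos\theta = 0.6$, which is the steep dependence mentioned above.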
Figure 57: The observed cross section $d\sigma(e^+e^- \to \gamma\gamma)/d\Omega$ at $\sqrt{t} = 29$ GeV as measured by the HRS collaboration. The solid line is the second-order theoretical prediction, Eq. (15.24)

And of course, alongside this process, we can immediately study the reverse one, $\gamma\gamma \to e^+e^-$. The same Feynman diagrams of Figs. 55 and 56 should be read from right to left. The corresponding Mandelstam variable $t = (k_1 + k_2)^2 = 2k_1k_2$ is just twice the product of the 4-momenta of the initial photons. There is a threshold: the energy of the colliding photons must exceed the rest energy of the electron and the positron, i. e. $2k_1k_2 > (2m_e)^2$. Above this threshold, the cross section becomes different from zero. It grows rapidly up to approximately $\sqrt{t} \sim 3m_e$ and then falls by approximately the same law as before. To observe this process, we need photon beams of appropriate energy. In the center-of-mass system, the energy of each of them must exceed 0.5 MeV. There are no appropriate lasers. Nevertheless, in a roundabout way, something like this process was actually observed. It’s an interesting story, and I’ll tell it briefly, as it belongs to modern physics.

As you remember, we considered a SLAC experiment where a 46 GeV electron collided with a photon from a neodymium glass laser. The emitted photon had a maximum possible energy $\omega_2 = 21$ GeV. It is necessary to understand the following: a laser bunch, even with a flash time of the order of $10^{-12}$ s, has a length of about 0.3 mm. If an electron collides with a laser photon and produces a high-energy photon, that high-energy photon keeps moving inside the laser beam, and photon-photon collisions of this high-energy photon with the laser photons can occur. It is easy to calculate that $t = 4\omega_1\omega_2$ was still less than $(2m_e)^2$. That energy was slightly insufficient. The energy $\omega_1$ can be doubled if the high-energy photon collides with two laser photons.
But even this doubling is still insufficient. If it collides with 3 photons, the energy will still be too small. But if it collides with 4 photons, that will be enough for production of an electron-positron pair, which was noticed experimentally by observation in the forward direction. Mostly photons and high-energy electrons that either did not react or gave away part of their energy are found there, but positrons are emitted too. That was observed experimentally. There were only about 100 events, but they certainly pointed to a reaction in which four laser photons collided with a high-energy photon and produced an $e^+e^-$ pair. All this took place due to the very high energy of the electrons and the high density of the laser photons.

We were talking about particle physics, accelerators, lasers etc. The beauty of our science lies in its relation to the entire Universe. Indeed, the process that we have just discussed in regard to something earthly has direct relevance to a number of astronomical problems. When here, on the Earth, we observe high-energy photons flying to us from afar, they may pass through regions with stars, but such encounters are very rare. The interstellar medium also has almost no effect on the propagation of photons. Yet there are also relict photons, which fill the Universe everywhere. Their density is small and their energy is tiny, but collisions of very high-energy photons with these relict ones may result in electron-positron pair production if the energy of the high-energy photons is larger than $10^{15}$ eV. These photons are also worth studying. They do not reach the Earth because their free path may turn out to be less than the distance from the source to the Earth. But we know that, as the energy of the photon and thus $t \sim 4\omega_1\omega_2$ increases, the cross section of the process $\gamma\gamma \to e^+e^-$ decreases and the environment becomes more and more transparent for high-energy photons. But this is not the end of the story yet. There is a process that I studied and published a few works on: two photons annihilate not into one pair, but into two pairs. Of course, its cross section is of a higher order, $\alpha^4$, but it does not decrease with the energy: $\sigma_{\gamma\gamma \to e^+e^-e^+e^-} \sim \alpha^4/m_e^2$. Quite unexpectedly I received a letter from astrophysicists about this very process. They had found that at very high energies this process becomes important.
Due to this process, photons of the highest energies from the edge of the Universe cannot reach the Earth.
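The energy scale at which relict photons start to absorb high-energy photons follows directly from the threshold condition $4\omega_1\omega_2 > (2m_e)^2$ for a head-on collision. The sketch below assumes a typical relict photon energy of about $6 \times 10^{-4}$ eV; this value and the function name are assumptions made for the illustration, so the result should be read as an order-of-magnitude estimate only.

```python
# Threshold for gamma gamma -> e+ e- on a soft background photon (head-on):
#   4 * w1 * w2 >= (2 m_e)^2   =>   w2 >= m_e^2 / w1
# The relict-photon energy below is an assumed typical value.

M_E_EV = 0.511e6            # electron rest energy in eV
RELICT_PHOTON_EV = 6.0e-4   # assumed typical relict photon energy in eV


def pair_threshold_ev(soft_photon_ev):
    """Minimum hard-photon energy for pair production in a head-on collision."""
    return M_E_EV ** 2 / soft_photon_ev


threshold = pair_threshold_ev(RELICT_PHOTON_EV)
print(f"threshold on relict photons ~ {threshold:.1e} eV")
```

Under this assumption the threshold comes out at a few times $10^{14}$ eV, so photons with energies of $10^{15}$ eV and above are indeed able to produce pairs on the relict background.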
Bibliography

[1] V.B. Berestetsky, E.M. Lifshitz, L.P. Pitaevsky. Quantum Electrodynamics (Pergamon Press, Oxford, 1982)
[2] N.N. Bogolyubov, D.V. Shirkov. Introduction to the Theory of Quantized Fields (John Wiley & Sons, 1980)
[3] N.N. Bogolyubov, D.V. Shirkov. Quantum Fields (Benjamin, 1983)
[4] M.E. Peskin, D.V. Schroeder. An Introduction to Quantum Field Theory (Perseus Books Publishing, 1995)
[5] L.D. Landau, E.M. Lifshitz. The Classical Theory of Fields (Butterworth-Heinemann, 1988)
[6] L.D. Landau, E.M. Lifshitz. Mechanics (Butterworth-Heinemann, 1976)
[7] Review of Particle Physics, see the Particle Data Group website: http://pdg.lbl.gov