English Pages [680] Year 2023
Quantum Mechanics A Graduate Course
Written for a two-semester graduate course in quantum mechanics, this comprehensive text helps the reader to develop the tools and formalism of quantum mechanics and its applications to physical systems. It will suit students who have taken some introductory quantum mechanics and modern physics courses at undergraduate level, but it is self-contained and does not assume any specific background knowledge beyond appropriate fluency in mathematics. The text takes a modern logical approach rather than a historical one and it covers standard material, such as the hydrogen atom and the harmonic oscillator, the WKB approximations, and Bohr–Sommerfeld quantization. Important modern topics and examples are also described, including Berry phase, quantum information, complexity and chaos, decoherence and thermalization, and nonstandard statistics, as well as more advanced material such as path integrals, scattering theory, multiparticles, and Fock space. Readers will gain a broad overview of quantum mechanics as a solid preparation for further study or research.

Horațiu Năstase is a Researcher at the Institute for Theoretical Physics, State University of São Paulo. He completed his PhD at Stony Brook with Peter van Nieuwenhuizen, co-discoverer of supergravity. While in Princeton as a postdoctoral fellow, in a 2002 paper with David Berenstein and Juan Maldacena he started the pp-wave correspondence, a sub-area of the AdS/CFT correspondence. He has written more than 100 scientific articles and five other books, including Introduction to the AdS/CFT Correspondence (2015), String Theory Methods for Condensed Matter Physics (2017), Classical Field Theory (2019), Introduction to Quantum Field Theory (2019), and Cosmology and String Theory (2019).
“There are a few dozen books on quantum mechanics. Who needs another one? Graduate students. Most of the existing books aim at the undergraduate level. Often, students of graduate courses of QM are all familiar with the motivation and its historical development but have a very varied technical background. This book skips the history of the field and enables all the students to reach a common needed level after going over the first chapters. This book spans a very wide manifold of QM topics. It provides the tools for further researching any quantum system. An important further value of the book is that it covers the forefront aspects of QM including the Berry phase, anyons, entanglement, quantum information, quantum complexity and chaos, and quantum thermalization. Readers understand a topic only if they are able to solve problems associated with it; the book includes at least 7 exercises for each of its 59 chapters.”
Professor Jacob Sonnenschein, Tel Aviv University

“Năstase’s Quantum Mechanics is another marvellous addition to his encyclopaedic collection of graduate-level courses that take both the dedicated student and the hardened researcher on a grand tour of contemporary theoretical physics. It will serve as an ideal bridge between a comprehensive undergraduate course in quantum mechanics and the frontiers of research in quantum systems.”
Professor Jeff Murugan, University of Cape Town
Quantum Mechanics A Graduate Course
HORAȚIU NĂSTASE
Universidade Estadual Paulista, São Paulo
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/highereducation/isbn/9781108838733
DOI: 10.1017/9781108976299

© Horațiu Năstase 2023

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2023

A catalogue record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Names: Năstase, Horațiu, 1972– author.
Title: Quantum mechanics : a graduate course / by Horatiu Nastase, Instituto de Fisica Teorica, UNESP, São Paulo, Brazil.
Description: Cambridge, United Kingdom ; New York, NY : Cambridge University Press, 2022. | Includes bibliographical references.
Identifiers: LCCN 2022010290 | ISBN 9781108838733 (hardback)
Subjects: LCSH: Quantum theory. | BISAC: SCIENCE / Physics / Quantum Theory
Classification: LCC QC174.12 .N38 2022 | DDC 530.12–dc23/eng20220517
LC record available at https://lccn.loc.gov/2022010290

ISBN 978-1-108-83873-3 Hardback

Additional resources for this publication at www.cambridge.org/Nastase

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
To the memory of my mother, who inspired me to become a physicist
Contents

Preface
Acknowledgements
Introduction

Part I Formalism and Basic Problems

Introduction: Historical Background
0.1 Experiments Point towards Quantum Mechanics
0.2 Quantized States: Matrix Mechanics, and Waves for Particles: Correspondence Principle
0.3 Wave Functions Governing Probability, and the Schrödinger Equation
0.4 Bohr–Sommerfeld Quantization and the Hydrogen Atom
0.5 Review: States of Quantum Mechanical Systems

1 The Mathematics of Quantum Mechanics 1: Finite-Dimensional Hilbert Spaces
1.1 (Linear) Vector Spaces V
1.2 Operators on a Vector Space
1.3 Dual Space, Adjoint Operation, and Dirac Notation
1.4 Hermitian (Self-Adjoint) Operators and the Eigenvalue Problem
1.5 Traces and Tensor Products
1.6 Hilbert Spaces

2 The Mathematics of Quantum Mechanics 2: Infinite-Dimensional Hilbert Spaces
2.1 Hilbert Spaces and Related Notions
2.2 Functions as Limits of Discrete Sets of Vectors
2.3 Integrals as Limits of Sums
2.4 Distributions and the Delta Function
2.5 Spaces of Functions
2.6 Operators in Infinite Dimensions
2.7 Hermitian Operators and Eigenvalue Problems
2.8 The Operator Dxx

3 The Postulates of Quantum Mechanics and the Schrödinger Equation
3.1 The Postulates
3.2 The First Postulate
3.3 The Second Postulate
3.4 The Third Postulate
3.5 The Fourth Postulate
3.6 The Fifth Postulate
3.7 The Sixth Postulate
3.8 Generalization of States to Ensembles: the Density Matrix

4 Two-Level Systems and Spin-1/2, Entanglement, and Computation
4.1 Two-Level Systems and Time Dependence
4.2 General Stationary Two-State System
4.3 Oscillations of States
4.4 Unitary Evolution Operator
4.5 Entanglement
4.6 Quantum Computation

5 Position and Momentum and Their Bases; Canonical Quantization, and Free Particles
5.1 Translation Operator
5.2 Momentum in Classical Mechanics as a Generator of Translations
5.3 Canonical Quantization
5.4 Operators in Coordinate and Momentum Spaces
5.5 The Free Nonrelativistic Particle

6 The Heisenberg Uncertainty Principle and Relations, and Gaussian Wave Packets
6.1 Gaussian Wave Packets
6.2 Time Evolution of Gaussian Wave Packet
6.3 Heisenberg Uncertainty Relations
6.4 Minimum Uncertainty Wave Packet
6.5 Energy–Time Uncertainty Relation

7 One-Dimensional Problems in a Potential V(x)
7.1 Set-Up of the Problem
7.2 General Properties of the Solutions
7.3 Infinitely Deep Square Well (Particle in a Box)
7.4 Potential Step and Reflection and Transmission of Modes
7.5 Continuity Equation for Probabilities
7.6 Finite Square Well Potential
7.7 Penetration of a Potential Barrier and the Tunneling Effect

8 The Harmonic Oscillator
8.1 Classical Set-Up and Generalizations
8.2 Quantization in the Creation and Annihilation Operator Formalism
8.3 Generalization
8.4 Coherent States
8.5 Solution in the Coordinate, |x⟩, Representation (Basis)
8.6 Alternative to |x⟩ Representation: Basis Change from |n⟩ Representation
8.7 Properties of Hermite Polynomials
8.8 Mathematical Digression (Appendix): Classical Orthogonal Polynomials

9 The Heisenberg Picture and General Picture; Evolution Operator
9.1 The Evolution Operator
9.2 The Heisenberg Picture
9.3 Application to the Harmonic Oscillator
9.4 General Quantum Mechanical Pictures
9.5 The Dirac (Interaction) Picture

10 The Feynman Path Integral and Propagators
10.1 Path Integral in Phase Space
10.2 Gaussian Integration
10.3 Path Integral in Configuration Space
10.4 Path Integral over Coherent States (in “Harmonic Phase Space”)
10.5 Correlation Functions and Their Generating Functional

11 The Classical Limit and Hamilton–Jacobi (WKB Method), the Ehrenfest Theorem
11.1 Ehrenfest Theorem
11.2 Continuity Equation for Probability
11.3 Review of the Hamilton–Jacobi Formalism
11.4 The Classical Limit and the Geometrical Optics Approximation
11.5 The WKB Method

12 Symmetries in Quantum Mechanics I: Continuous Symmetries
12.1 Symmetries in Classical Mechanics
12.2 Symmetries in Quantum Mechanics: General Formalism
12.3 Example 1. Translations
12.4 Example 2. Time Translation Invariance
12.5 Mathematical Background: Review of Basics of Group Theory

13 Symmetries in Quantum Mechanics II: Discrete Symmetries and Internal Symmetries
13.1 Discrete Symmetries: Symmetries under Discrete Groups
13.2 Parity Symmetry
13.3 Time Reversal Invariance, T
13.4 Internal Symmetries
13.5 Continuous Symmetry
13.6 Lie Groups and Algebras and Their Representations

14 Theory of Angular Momentum I: Operators, Algebras, Representations
14.1 Rotational Invariance and SO(n)
14.2 The Lie Groups SO(2) and SO(3)
14.3 The Group SU(2) and Its Isomorphism with SO(3) Mod Z2
14.4 Generators and Lie Algebras
14.5 Quantum Mechanical Version
14.6 Representations

15 Theory of Angular Momentum II: Addition of Angular Momenta and Representations; Oscillator Model
15.1 The Spinor Representation, j = 1/2
15.2 Composition of Angular Momenta
15.3 Finding the Clebsch–Gordan Coefficients
15.4 Sums of Three Angular Momenta, J1 + J2 + J3: Racah Coefficients
15.5 Schwinger’s Oscillator Model

16 Applications of Angular Momentum Theory: Tensor Operators, Wave Functions and the Schrödinger Equation, Free Particles
16.1 Tensor Operators
16.2 Wigner–Eckhart Theorem
16.3 Rotations and Wave Functions
16.4 Wave Function Transformations under Rotations
16.5 Free Particle in Spherical Coordinates

17 Spin and L + S
17.1 Motivation for Spin and Interaction with Magnetic Field
17.2 Spin Properties
17.3 Particle with Spin 1/2
17.4 Rotation of Spinors with s = 1/2
17.5 Sum of Orbital Angular Momentum and Spin, L + S
17.6 Time-Reversal Operator on States with Spin

18 The Hydrogen Atom
18.1 Two-Body Problem: Reducing to Central Potential
18.2 Hydrogenoid Atom: Set-Up of Problem
18.3 Solution: Sommerfeld Polynomial Method
18.4 Confluent Hypergeometric Function and Quantization of Energy
18.5 Orthogonal Polynomials and Standard Averages over Wave Functions

19 General Central Potential and Three-Dimensional (Isotropic) Harmonic Oscillator
19.1 General Set-Up
19.2 Types of Potentials
19.3 Diatomic Molecule
19.4 Free Particle
19.5 Spherical Square Well
19.6 Three-Dimensional Isotropic Harmonic Oscillator: Set-Up
19.7 Isotropic Three-Dimensional Harmonic Oscillator in Spherical Coordinates
19.8 Isotropic Three-Dimensional Harmonic Oscillator in Cylindrical Coordinates

20 Systems of Identical Particles
20.1 Identical Particles: Bosons and Fermions
20.2 Observables under Permutation
20.3 Generalization to N Particles
20.4 Canonical Commutation Relations
20.5 Spin–Statistics Theorem
20.6 Particles with Spin

21 Application of Identical Particles: He Atom (Two-Electron System) and H2 Molecule
21.1 Helium-Like Atoms
21.2 Ground State of the Helium (or Helium-Like) Atom
21.3 Approximation 3: Variational Method, “Light Version”
21.4 H2 Molecule and Its Ground State

22 Quantum Mechanics Interacting with Classical Electromagnetism
22.1 Classical Electromagnetism plus Particle
22.2 Quantum Particle plus Classical Electromagnetism
22.3 Application to Superconductors
22.4 Interaction with a Plane Wave
22.5 Spin–Magnetic-Field and Spin–Orbit Interaction

23 Aharonov–Bohm Effect and Berry Phase in Quantum Mechanics
23.1 Gauge Transformation in Electromagnetism
23.2 The Aharonov–Bohm Phase δ
23.3 Berry Phase
23.4 Example: Atoms, Nuclei plus Electrons
23.5 Spin–Magnetic Field Interaction, Berry Curvature, and Berry Phase as Geometric Phase
23.6 Nonabelian Generalization
23.7 Aharonov–Bohm Phase in Berry Form

24 Motion in a Magnetic Field, Hall Effect and Landau Levels
24.1 Spin in a Magnetic Field
24.2 Particle with Spin 1/2 in a Time-Dependent Magnetic Field
24.3 Particle with or without Spin in a Magnetic Field: Landau Levels
24.4 The Integer Quantum Hall Effect (IQHE)
24.5 Alternative Derivation of the IQHE
24.6 An Atom in a Magnetic Field and the Landé g-Factor

25 The WKB; a Semiclassical Approximation
25.1 Review and Generalization
25.2 Approximation and Connection Formulas at Turning Points
25.3 Application: Potential Barrier
25.4 The WKB Approximation in the Path Integral

26 Bohr–Sommerfeld Quantization
26.1 Bohr–Sommerfeld Quantization Condition
26.2 Example 1: Parity-Even Linear Potential
26.3 Example 2: Harmonic Oscillator
26.4 Example 3: Motion in a Central Potential
26.5 Example: Coulomb Potential (Hydrogenoid Atom)

27 Dirac Quantization Condition and Magnetic Monopoles
27.1 Dirac Monopoles from Maxwell Duality
27.2 Dirac Quantization Condition from Semiclassical Nonrelativistic Considerations
27.3 Contradiction with the Gauge Field
27.4 Patches and Magnetic Charge from Transition Functions
27.5 Dirac Quantization from Topology and Wave Functions
27.6 Dirac String Singularity and Obtaining the Dirac Quantization Condition from It

28 Path Integrals II: Imaginary Time and Fermionic Path Integral
28.1 The Forced Harmonic Oscillator
28.2 Wick Rotation to Euclidean Time and Connection with Statistical Mechanics Partition Function
28.3 Fermionic Path Integral
28.4 Gaussian Integration over the Grassmann Algebra
28.5 Path Integral for the Fermionic Harmonic Oscillator

29 General Theory of Quantization of Classical Mechanics and (Dirac) Quantization of Constrained Systems
29.1 Hamiltonian Formalism
29.2 Constraints in the Hamiltonian Formalism: Primary and Secondary Constraints, and First and Second Class Constraints
29.3 Quantization and Dirac Brackets
29.4 Example: Electromagnetic Field

Part IIa Advanced Foundations

30 Quantum Entanglement and the EPR Paradox
30.1 Entanglement: Spin 1/2 System
30.2 Entanglement: The General Case
30.3 Entanglement: Careful Definition
30.4 Entanglement Entropy
30.5 The EPR Paradox and Hidden Variables

31 The Interpretation of Quantum Mechanics and Bell’s Inequalities
31.1 Bell’s Original Inequality
31.2 Bell–Wigner Inequalities
31.3 CHSH Inequality (or Bell–CHSH Inequality)
31.4 Interpretations of Quantum Mechanics

32 Quantum Statistical Mechanics and “Tracing” over a Subspace
32.1 Density Matrix and Statistical Operator
32.2 Review of Classical Statistics
32.3 Defining Quantum Statistics
32.4 Bose–Einstein and Fermi–Dirac Distributions
32.5 Entanglement Entropy

33 Elements of Quantum Information and Quantum Computing
33.1 Classical Computation and Shannon Theory
33.2 Quantum Information and Computation, and von Neumann Entropy
33.3 Quantum Computation
33.4 Quantum Cryptography, No-Cloning, and Teleportation

34 Quantum Complexity and Quantum Chaos
34.1 Classical Computation Complexity
34.2 Quantum Computation and Complexity
34.3 The Nielsen Approach to Quantum Complexity
34.4 Quantum Chaos

35 Quantum Decoherence and Quantum Thermalization
35.1 Decoherence
35.2 Schrödinger’s Cat
35.3 Qualitative Decoherence
35.4 Quantitative Decoherence
35.5 Qualitative Thermalization
35.6 Quantitative Thermalization
35.7 Bogoliubov Transformation and Appearance of Temperature

Part IIb Approximation Methods

36 Time-Independent (Stationary) Perturbation Theory: Nondegenerate, Degenerate, and Formal Cases
36.1 Set-Up of the Problem: Time-Independent Perturbation Theory
36.2 The Nondegenerate Case
36.3 The Degenerate Case
36.4 General Form of Solution (to All Orders)
36.5 Example: Stark Effect in Hydrogenoid Atom

37 Time-Dependent Perturbation Theory: First Order
37.1 Evolution Operator
37.2 Method of Variation of Constants
37.3 A Time-Independent Perturbation Being Turned On
37.4 Continuous Spectrum and Fermi’s Golden Rule
37.5 Application to Scattering in a Collision
37.6 Sudden versus Adiabatic Approximation

38 Time-Dependent Perturbation Theory: Second and All Orders
38.1 Second-Order Perturbation Theory and Breit–Wigner Distribution (Energy Shift and Decay Width)
38.2 General Perturbation
38.3 Finding the Probability Coefficients b_n(t)

39 Application: Interaction with (Classical) Electromagnetic Field, Absorption, Photoelectric and Zeeman Effects
39.1 Particles and Atoms in an Electromagnetic Field
39.2 Zeeman and Paschen–Back Effects
39.3 Electromagnetic Radiation and Selection Rules
39.4 Absorption of Photon Energy by Atom
39.5 Photoelectric Effect

40 WKB Methods and Extensions: State Transitions and Euclidean Path Integrals (Instantons)
40.1 Review of the WKB Method
40.2 WKB Matrix Elements
40.3 Application to Transition Probabilities
40.4 Instantons for Transitions between Two Minima

41 Variational Methods
41.1 First Form of the Variational Method
41.2 Ritz Variational Method
41.3 Practical Variational Method
41.4 General Method
41.5 Applications

Part IIc Atomic and Nuclear Quantum Mechanics

42 Atoms and Molecules, Orbitals and Chemical Bonds: Quantum Chemistry
42.1 Hydrogenoid Atoms (Ions)
42.2 Multi-Electron Atoms and Shells
42.3 Couplings of Angular Momenta
42.4 Methods of Quantitative Approximation for Energy Levels
42.5 Molecules and Chemical Bonds
42.6 Adiabatic Approximation and Hierarchy of Scales
42.7 Details of the Adiabatic Approximation
42.8 Method of Molecular Orbitals
42.9 The LCAO Method
42.10 Application: The LCAO Method for the Diatomic Molecule
42.11 Chemical Bonds

43 Nuclear Liquid Droplet and Shell Models
43.1 Nuclear Data and Droplet Model
43.2 Shell Models 1: Single-Particle Shell Models
43.3 Spin–Orbit Interaction Correction
43.4 Many-Particle Shell Models

44 Interaction of Atoms with Electromagnetic Radiation: Transitions and Lasers
44.1 Two-Level System for Time-Dependent Transitions
44.2 First-Order Perturbation Theory for Harmonic Potential
44.3 The Case of Quantized Radiation
44.4 Planck Formula

Part IId Scattering Theory

45 One-Dimensional Scattering, Transfer and S-Matrices
45.1 Asymptotics and Integral Equations
45.2 Green’s Functions
45.3 Relations between Abstract States and Lippmann–Schwinger Equation
45.4 Physical Interpretation of Scattering Solution
45.5 S-Matrix and T-Matrix
45.6 Application: Spin Chains

46 Three-Dimensional Lippmann–Schwinger Equation, Scattering Amplitudes and Cross Sections
46.1 Potential for Relative Motion
46.2 Behavior at Infinity
46.3 Scattering Solution, Cross Sections, and S-Matrix
46.4 S-Matrix and Optical Theorem
46.5 Green’s Functions and Lippmann–Schwinger Equation

47 Born Approximation and Series, S-Matrix and T-Matrix
47.1 Born Approximation and Series
47.2 Time-Dependent Scattering Point of View
47.3 Higher-Order Terms and Abstract States, S- and T-Matrices
47.4 Validity of the Born Approximation

48 Partial Wave Expansion, Phase Shift Method, and Scattering Length
48.1 The Partial Wave Expansion
48.2 Phase Shifts
48.3 T-Matrix Element
48.4 Scattering Length
48.5 Jost Functions, Wronskians, and the Levinson Theorem

49 Unitarity, Optics, and the Optical Theorem
49.1 Unitarity: Review and Reanalysis
49.2 Application to Cross Sections
49.3 Radial Green’s Functions
49.4 Optical Theorem
49.5 Scattering on a Potential with a Finite Range
49.6 Hard Sphere Scattering
49.7 Low-Energy Limit

50 Low-Energy and Bound States, Analytical Properties of Scattering Amplitudes
50.1 Low-Energy Scattering
50.2 Relation to Bound States
50.3 Bound State from Complex Poles of S_l(k)
50.4 Analytical Properties of the Scattering Amplitudes
50.5 Jost Functions Revisited

51 Resonances and Scattering, Complex k and l
51.1 Poles and Zeroes in the Partial Wave S-Matrix S_l(k)
51.2 Breit–Wigner Resonance
51.3 Physical Interpretation and Its Proof
51.4 Review of Levinson Theorem
51.5 Complex Angular Momentum

52 The Semiclassical WKB and Eikonal Approximations for Scattering
52.1 WKB Review for One-Dimensional Systems
52.2 Three-Dimensional Scattering in the WKB Approximation
52.3 The Eikonal Approximation
52.4 Coulomb Scattering and the Semiclassical Approximation

53 Inelastic Scattering
53.1 Generalizing Elastic Scattering from Unitarity Loss
53.2 Inelastic Scattering Due to Target Structure
53.3 General Theory of Collisions: Inelastic Case and Multi-Channel Scattering
53.4 General Theory of Collisions
53.5 Multi-Channel Analysis
53.6 Scattering of Identical Particles

Part IIe Many Particles

54 The Dirac Equation
54.1 Naive Treatment
54.2 Relativistic Dirac Equation
54.3 Interaction with Electromagnetic Field
54.4 Weakly Relativistic Limit
54.5 Correction to the Energy of Hydrogenoid Atoms

55 Multiparticle States in Atoms and Condensed Matter: Schrödinger versus Occupation Number
55.1 Schrödinger Picture Multiparticle Review
55.2 Approximation Methods
55.3 Transition to Occupation Number
55.4 Schrödinger Equation in Occupation Number Space
55.5 Analysis for Fermions

56 Fock Space Calculation with Field Operators
56.1 Creation and Annihilation Operators
56.2 Occupation Number Representation for Fermions
56.3 Schrödinger Equation on Occupation Number States
56.4 Fock Space
56.5 Fock Space for Fermions
56.6 Schrödinger Equations for Fermions, and Generalizations
56.7 Field Operators
56.8 Example of Interaction: Coulomb Potential, as a Limit of the Yukawa Potential, for Spin 1/2 Fermions

57 The Hartree–Fock Approximation and Other Occupation Number Approximations
57.1 Set-Up of the Problem
57.2 Derivation of the Hartree–Fock Equation
57.3 Hartree and Fock Terms
57.4 Other Approximations in the Occupation Number Picture

58 Nonstandard Statistics: Anyons and Nonabelions
58.1 Review of Statistics and Berry Phase
58.2 Anyons in 2+1 Dimensions: Explicit Construction
58.3 Chern–Simons Action
58.4 Example: Fractional Quantum Hall Effect (FQHE)
58.5 Nonabelian Statistics

References
Index
Preface
There are so many books on quantum mechanics that one has to ask: why write another one? When teaching graduate quantum mechanics in either the US or Brazil (countries I am familiar with), as well as in other places, one is faced with a conundrum: how to address all possible backgrounds, while keeping the course both interesting and comprehensive enough to offer the graduate student a chance to compete in any area? Indeed, while graduate students tend to have very different backgrounds, there is usually a compulsory graduate quantum mechanics course, typically a two-semester one. The students will certainly have had some introductory undergraduate quantum mechanics and some modern physics, but some have studied these topics in detail, while others less so. For that reason, I believe there is little need for a long historical introduction, or a detailed explanation of why we need to define quantum mechanics in the way we do. In order to challenge students, we cannot simply repeat what they heard in the undergraduate course. So one needs a tighter presentation, with more emphasis on the building up of the formalism of quantum mechanics rather than its motivation, as well as a presentation that contains more interesting new developments besides the standard advanced concepts. On the other hand, I personally found that many (even very bright) students are caught between the two systems and miss out on important information: they have had an introductory quantum mechanics course that did not treat all the standard classic material (examples: the hydrogen atom in all detail, the harmonic oscillator in various treatments, the WKB and Bohr–Sommerfeld formalisms), yet the graduate course assumes that students have covered all these topics, and so they struggle in their subsequent research as a result. Since I have found no graduate book that meets these conditions of comprehensiveness to my satisfaction, I decided to write one, and this is the result.
The book is intended as a two-semester course, corresponding to the two parts, each chapter corresponding to one two-hour lecture, though sometimes an extended version of a lecture.
Acknowledgements
I should first thank all the people who have helped and guided me on my journey as a physicist, starting with my mother Ligia, who was the first example I had of a physicist, and from whom I first learned to love physics. My high school physics teacher, Iosif Sever Georgescu, helped to start me on the career of a physicist, and made me realize that it was something that I could do very well. My student exchange advisor at the Niels Bohr Institute, Poul Olesen, first showed me what string theory is, and started me in a career in this area, and my PhD advisor, Peter van Nieuwenhuizen, taught me the beauty of theoretical physics, of rigor and of long calculations, and an approach to teaching graduate physics that I still try to follow. With respect to teaching quantum mechanics, most of the credit goes to my teachers in the Physics Department of Bucharest University, since during my four undergraduate years there (1990–1994), many of the courses there taught me about various aspects of quantum mechanics. Some recent developments described here I learned from my research, so I thank all my collaborators, students, and postdocs for their help with understanding them. My wife Antonia I thank for her patience and understanding while I wrote this book, as well as for help with an accident during the writing. I would like to thank my present editor at Cambridge University Press, Vince Higgs, who helped me get this book published, as well as my outgoing (now retired) editor Simon Capelin, for his encouragement in starting me on the path to publishing books. To all the staff at CUP, thanks for making sure this book, as well as my previous ones, is as good as it can be.
Introduction
As described in the Preface, this book is intended for graduate students, and so I assume that readers will have had an introductory undergraduate course in quantum mechanics, which familiarized them with the historical motivation as well as the basic formalism. For completeness, however, I have included a brief historical background as an optional introductory chapter. I do assume a certain level of mathematical proficiency, as might be expected from a graduate student. Nevertheless, the first two chapters are dedicated to the mathematics of quantum mechanics; they are divided into finite- and infinite-dimensional Hilbert spaces, since these concepts are necessary for a smooth introduction. I start the discussion of quantum mechanics with its postulates and the Schrödinger equation, after which I develop the formalism. I mostly use the bra-ket notation for abstract states, as it is often cleaner, though for most of the standard systems I use wave functions. I introduce early on the notion of the path integral, since it is an important alternative way to describe quantization, through the sum over all paths, classical or otherwise, though I mostly use the operatorial formalism. I also introduce early on the notion of pictures, including the Heisenberg and interaction pictures, as alternatives for the description of time evolution. The important notion of angular momentum theory is presented within the larger context of symmetries in quantum mechanics, since this is the modern viewpoint, especially within the relativistic quantum field theory that extends quantum mechanics. With respect to topics, in Part I, on basic concepts, I consider both the older, but still very useful, topics such as WKB and Bohr–Sommerfeld quantization as well as more recent ones such as the Berry phase and Dirac quantization condition.
In Part II I consider scattering theory, variational methods, occupation number space, quantum entanglement and information, quantum complexity, quantum chaos, and thermalization; these are the more advanced topics. The advanced foundations of quantum mechanics (Part IIa) are discussed in Part II since they are newer and harder to understand, even though they could be said to belong to Part I as foundational issues. After each chapter, I summarize a set of “Important Concepts to Remember”, and present seven exercises whose solution is meant to clarify the concepts in the chapter.
PART I
FORMALISM AND BASIC PROBLEMS
Introduction: Historical Background
In this book, we will describe the formalism and applications of quantum mechanics, starting from the first principles and postulates, and expanding them logically into the fully developed system that we currently have. That means that we will start with those principles and postulates, assuming that the reader has had an undergraduate course in modern physics, as well as an undergraduate course in quantum mechanics. This implies a first interaction with the ideas of quantum mechanics, and the historical development and experiments that led to it. However, for the sake of consistency and completeness, in this chapter we will quickly review these historical developments and how we were led to the current formalism for quantum mechanics.

In classical mechanics and optics, there were deterministic laws, involving two types of objects: particles of matter, following well-defined classical paths, with their evolution defined by their Hamiltonian and the initial conditions, and waves for light, defined by wave functions (for the electric and magnetic fields E and B), and the observable intensities I defined by them, showing interference patterns in the sum of waves. But there was no overlap between the two descriptions, in terms of particles and waves. Notably, Newton had a rudimentary particle description for light (in his Opticks), but with the development of the classical wave picture of Huygens for optics, that description was forgotten for a long time.
0.1 Experiments Point towards Quantum Mechanics

Then, around the turn of the nineteenth century into the twentieth, a string of developments changed the classical picture, allowing for a complementary particle description for waves, and for a complementary wave description for particles. One of the first such developments was the discovery of radioactivity by Henri Becquerel in 1896, who, after Wilhelm Roentgen’s discovery of X-rays in 1895, showed that radioactive elements produce rays similar to X-rays in that they can go through matter and then leave a pointlike mark on a photographic plate, thus involving emission of highly energetic particles. Now we know that these emitted particles can be α (helium nuclei), β (electrons and positrons), but also γ (high-energy photons, so lightlike).

Next came what is believed to be the start of the modern quantum era, with the first theoretical idea of a quantum, specifically of light, used to describe blackbody radiation. In 1900, Planck managed to explain the blackbody emission spectrum with a simple but revolutionary idea. Assuming that the energy exchange between matter and the emitted radiation occurs not continuously (as assumed in classical physics), but only in given quanta of energy, proportional to the frequency ν of the radiation,

$$\Delta E = h\nu, \tag{0.1}$$

where the constant h is called the Planck constant, Planck managed to derive a formula for the spectral radiance:

$$B_\nu(\nu, T) = \frac{2h\nu^3}{c^2}\,\frac{1}{e^{h\nu/k_B T} - 1}, \tag{0.2}$$

which matches the emission spectrum of the radiation of a perfect black body, which was then known experimentally. Note that by integrating, we obtain the power per unit of emission area,

$$P = \int_0^\infty d\nu \int d\Omega\, B_\nu \cos\theta = \sigma T^4, \tag{0.3}$$

where we have used the differential of solid angle $d\Omega = \sin\theta\, d\theta\, d\phi$ and have defined the Stefan–Boltzmann constant

$$\sigma = \frac{2 k_B^4 \pi^5}{15 c^2 h^3}, \tag{0.4}$$

obtaining the (experimentally discovered) Stefan–Boltzmann law.

Planck himself didn’t quite believe in the physical reality of the quantum of energy (as existing independently of the phenomenon of emission of radiation), but thought of it as a useful trick that manages to provide the correct result. It was necessary to wait until Einstein’s explanation of the photoelectric effect in 1905 for the true start of the quantum era. Indeed, Einstein took Planck’s idea to its logical conclusion, and postulated that there is a quantum of energy, that he called a photon and that can interact with matter, such that its energy is $E = h\nu$. Such an energy quantum is absorbed in the photoelectric effect by the bound electrons with binding energy W, such that they are released with kinetic energy

$$E_{\rm kin} = \frac{mv^2}{2} = h\nu - W, \tag{0.5}$$

providing an electrical current; see Fig. 0.1a.

Continuing the idea of the interaction of electrons with quanta of light, i.e., photons, we also note the Compton effect, observed experimentally by Arthur Compton in 1923, in which a photon scatters off a free electron and changes its wavelength according to the law

$$\Delta\lambda = \frac{2h}{mc}\sin^2\frac{\theta}{2}. \tag{0.6}$$

The law is easily explained by assuming that the photon has a relativistic relation between the energy and momentum, $E = pc$, leading to a momentum

$$p = \frac{h\nu}{c} = \frac{h}{\lambda}. \tag{0.7}$$
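As a numerical cross-check of the integration leading from the Planck law (0.2) to the Stefan–Boltzmann law (0.3)–(0.4), the following minimal Python sketch integrates the spectral radiance over frequency and over the outgoing hemisphere. The function names, the midpoint rule, the cutoff `x_max`, and the temperature of 300 K are illustrative assumptions, not part of the text.

```python
import math

# Exact SI values of the constants (2019 redefinition)
h = 6.62607015e-34    # Planck constant, J s
k_B = 1.380649e-23    # Boltzmann constant, J/K
c = 2.99792458e8      # speed of light, m/s


def planck_radiance(nu, T):
    """Spectral radiance B_nu(nu, T) of Eq. (0.2)."""
    return 2.0 * h * nu**3 / c**2 / math.expm1(h * nu / (k_B * T))


def emitted_power(T, n_steps=20000, x_max=60.0):
    """Power per unit area, Eq. (0.3).

    The angular integral of cos(theta) over the outgoing hemisphere
    gives pi; the frequency integral is done with the midpoint rule
    in the dimensionless variable x = h nu / (k_B T), cut off at x_max.
    """
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += planck_radiance(x * k_B * T / h, T)
    d_nu = dx * k_B * T / h
    return math.pi * total * d_nu


sigma = 2.0 * k_B**4 * math.pi**5 / (15.0 * c**2 * h**3)  # Eq. (0.4)
T = 300.0  # illustrative temperature in kelvin
ratio = emitted_power(T) / (sigma * T**4)
print(f"sigma = {sigma:.6e} W m^-2 K^-4, P / (sigma T^4) = {ratio:.8f}")
```

The ratio should come out very close to 1 for any temperature, since the temperature dependence factors out once the integral is written in terms of x.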
Figure 0.1 Interactions of photons with electrons: (a) photoelectric effect; (b) Compton effect; (c) Balmer relation (transition to excited state).
Then relativistic energy and momentum conservation for the process $e^- + \gamma \to e^{-\prime} + \gamma'$ (see Fig. 0.1b), where the initial $e^-$ is (almost) at rest,

$$\vec p_\gamma = \vec p_\gamma{}' + \vec p_e{}', \qquad mc^2 + p_\gamma c = \sqrt{p_e'^2 c^2 + m^2 c^4} + p_\gamma' c, \tag{0.8}$$

leads to the Compton law (0.6).

Next, we can also consider the case where the absorbed or emitted energy quantum leaves the electron bound to an atom, just changing its state inside the atom; see Fig. 0.1c. Since not all possible energies for the photon can be absorbed in this way, this means that the allowed energy states for the electron inside the atom are also quantized, i.e., they can only take well-defined values. In fact, experimentally we have the Balmer relation, discovered by Johann Balmer in 1885, for the possible frequencies of the absorbed or emitted radiation for the hydrogen (H) atom, given by

$$\nu = R\left(\frac{1}{n^2} - \frac{1}{m^2}\right), \tag{0.9}$$

where R is called the Rydberg constant. Together with Planck’s and Einstein’s relation $\Delta E = h\nu$, we obtain the law that the possible energies of the electron inside the H atom are given by the energy levels

$$E_n = -\frac{hR}{n^2}, \tag{0.10}$$

where n is a natural nonzero number, i.e., n = 1, 2, 3, . . .

The fact that the states of the electron in the H atom (and, in fact, of any atoms or molecules, as we now know) are quantized suggests that it is possible to have more general quantization relations for energy states. That this is true, and that we can turn it into a concrete description for states in quantum mechanics, was proved in another important experiment, the Stern–Gerlach experiment of 1922 (by Otto Stern and Walter Gerlach). In this experiment, horizontal paramagnetic atomic beams are passed through a magnetic field varying as a function of the vertical position, $B = B(z)$, created by two magnets oriented on a vertical line, as in Fig. 0.2. The electrons are understood as “spinning”, like magnets with a magnetic moment $\vec\mu$, so the energy of the interaction of the “spinning” electrons with the magnetic field is given by

$$\Delta E = -\vec\mu\cdot\vec B. \tag{0.11}$$

But the magnetic moment is proportional to the “angular momentum” $\vec l$, $\vec\mu = m\vec l$. Classically, we expect any possible value for $\vec l$, and so any possible value for both the magnetic moment $\vec\mu$ and its projection $\mu_z$ on the z direction. Since the force in the z direction is

$$F_z = -\partial_z(\Delta E) = \partial_z(\vec\mu\cdot\vec B) \simeq \mu_z\,\partial_z B_z, \tag{0.12}$$

we expect the arbitrary value of $\mu_z$ to translate into an arbitrary deviation of the beam, experimentally detected on a screen transverse to the original beam. But in fact one observes only two possible deviations, implying only two possible values of $\mu_z$, symmetric about zero, which in fact can only be $+\mu$ and $-\mu$, with a fixed $\mu$. That in turn means that $l_z$ has only the possible values $+l$ and $-l$, with $l$ fixed.
Figure 0.2 The Stern–Gerlach experiment.
0.2 Quantized States: Matrix Mechanics, and Waves for Particles: Correspondence Principle

The conclusion is that the “spinning” states of the electron have a projection onto any axis (since z is a randomly chosen direction) that can have only two given values, which is another example of the quantization of states, involving the simplest possibility: two possible choices only, states “1” and “2”. In general, we expect more quantized values for the energy, therefore more, but usually countable, states. Observables in these states can be “diagonal”, meaning they map the state to itself (for instance, the energy of a state), or “off-diagonal”, meaning they map one state to another (like those related to the transition between matter states due to interaction with light). In total, we obtain a “matrix mechanics”, for matrix observables $M_{ij}$ acting on states $i$. This was defined by Werner Heisenberg, Max Born, and Pascual Jordan in 1925, in three papers (first by Heisenberg, then generalized and formalized by Born and Jordan, then by all three).

But one still had to understand what is the physical meaning of the states, and obtain an alternative to the classical idea of the path of a matter-particle. This came with the idea of a wave associated with any particle, and conversely, a particle associated with any wave, or particle–wave complementarity; see Fig. 0.3a. This is illustrated best in the classic double-slit experiment, which today is a tabletop experiment covered in undergraduate physics. Consider a plane wave, or a beam of particles, moving perpendicularly towards a screen with two slits in it, and observe the result on a detecting parallel screen behind it, as in Fig. 0.3b.

If we have classical waves, as in classical electromagnetism (the description of the macroscopic quantities of light), a traveling plane wave is described by a function ψ depending only on the propagation direction x and the time t, as

$$\psi(x, t) = A \exp\left[i\left(\frac{2\pi}{\lambda}x - \frac{2\pi}{T}t\right)\right], \tag{0.13}$$

where ψ is made up from $\vec E$ and $\vec B$. The general (spherical) form of the traveling wave is

$$\psi(r, t) = \frac{A}{r}\exp\left[i(kr - \omega t)\right], \tag{0.14}$$

and is the form that is relevant for the wave that comes out of a slit. One can however measure only the intensity, given by

$$I = |\psi|^2. \tag{0.15}$$

Figure 0.3 (a) Particle–wave complementarity (correspondence principle). (b) Two-slit interference of waves $\psi_1$, $\psi_2$ giving a screen image made up of individual points (particles), forming an interference pattern at large times.

The role of the double slits is to split the plane wave into two waves, 1 and 2, with origin in each slit, but converging on the same point on the detecting screen. Then one detects the total $I = I_{1+2}$, but what adds up is not I, but rather ψ, so $\psi_{1+2} = \psi_1 + \psi_2$, and

$$I_{1+2} = |\psi_1 + \psi_2|^2. \tag{0.16}$$

This leads to the interference pattern observed on the screen, with many local maxima of decaying magnitude.

On the other hand, for particles, the particle beam divides at the slits into two beams. The “intensity” of the beam is proportional to the number of particles per unit area (with the total number of particles classically a constant, since they cannot be created or destroyed), so for them one finds instead

$$I_{1+2} = I_1 + I_2. \tag{0.17}$$

This would lead to a pattern with a single maximum, situated at the midpoint of the screen, but this is not what is observed. In fact, as one knows from the experiment, if one decreases the intensity of the beam, such that, according to the previous description, we have only individual photons (quanta of light) coming through the slits, one can see the individual photons hitting the screen. However, their locations are such that, when sufficiently many of them have hit, they still create the same interference pattern of the waves. That means that, in a sense, a photon can “interfere with itself”. It also implies a correspondence principle, that classical physics corresponds to macroscopic effects, whereas microscopic effects are described by quantum mechanics. As soon as something becomes macroscopic (through many iterations of the microscopic, like for instance the large number of photons in the double-slit experiment), it becomes classical.

So the question arises, is there a wave associated with the photon itself? And what does it correspond to? And if this is true for a photon, could it be true for any particle, even a particle of matter? In fact, historically, this was first proposed by (the Marquis) Louis de Broglie in 1924, who said that there should be a wave number k associated with any particle of momentum p, given by

$$k = \frac{p}{\hbar}, \tag{0.18}$$

where $\hbar = h/(2\pi)$. Then it follows that the de Broglie wavelength of the particle is

$$\lambda = \frac{2\pi}{k} = \frac{h}{p}. \tag{0.19}$$
This gives the correct formula for a photon, since for a photon, p = E/c and λ = c/ν. But the formula is supposed to also apply for any matter particle, such as for instance an electron. For electrons, the formula was confirmed experimentally by Davisson and Germer in experiments performed between 1923 and 1927. Today, the principle implied in these experiments is used in the electron microscope, which uses electrons instead of photons in order to “see”. This allows it to have a better resolution, since the resolution is of the order of λ and the de Broglie λ for an electron is much smaller than that for a photon of light (since the momentum p is much larger).
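To put numbers on this comparison, here is a short Python sketch that evaluates the de Broglie wavelength (0.19) for an electron and compares it with a visible-light wavelength. The kinetic energy of 100 eV, the nonrelativistic relation $p = \sqrt{2 m E_{\rm kin}}$, and the 500 nm reference wavelength are illustrative assumptions, not values taken from the text.

```python
import math

h = 6.62607015e-34       # Planck constant, J s
m_e = 9.1093837015e-31   # electron mass, kg
eV = 1.602176634e-19     # one electronvolt in joules


def de_broglie_wavelength(kinetic_energy_joule, mass):
    """lambda = h / p, Eq. (0.19), with p = sqrt(2 m E_kin) (nonrelativistic)."""
    momentum = math.sqrt(2.0 * mass * kinetic_energy_joule)
    return h / momentum


# Assumed example: an electron accelerated through 100 V.
lambda_electron = de_broglie_wavelength(100.0 * eV, m_e)

# Assumed reference: green visible light.
lambda_photon = 500e-9

print(f"electron: {lambda_electron:.4e} m, photon: {lambda_photon:.1e} m")
print(f"ratio photon/electron: {lambda_photon / lambda_electron:.0f}")
```

For these assumed values the electron wavelength is of the order of an ångström, a few thousand times shorter than that of visible light, which is the origin of the resolution advantage quoted above.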
0.3 Wave Functions Governing Probability, and the Schrödinger Equation

Because the intensity of a beam of particles is $I = |\psi|^2$, if we allow for the possibility that particles have only probabilities of behaving in some way, we arrive at the conclusion that the intensity I must be proportional to this probability, and that as such the probability is given by $|\psi|^2$, where ψ is a wave function associated with any particle. This was the interpretation given by Erwin Schrödinger, who then went on to write an equation, now known as Schrödinger’s equation, for the wave function. The equation for ψ is in general

$$i\hbar\,\partial_t \psi = \hat H \psi, \tag{0.20}$$

where $\hat H$ is an operator associated with the classical Hamiltonian H, now acting on the wave function ψ.

This interpretation of quantum mechanics, of the Schrödinger equation for a wave function associated with probability, was complementary to the matrix mechanics of Heisenberg, Born, and Jordan. The two however are united into a single one by the interpretation of Dirac in terms of bra $\langle\psi|$ and ket $|\psi\rangle$ states, which we will develop. In it, thinking of various states $|\psi_i\rangle$ as the states $i$ of matrix mechanics, operators like $\hat H$ appear as matrices $H_{ij}$.
0.4 Bohr–Sommerfeld Quantization and the Hydrogen Atom

Quantization of the electrons in the H atom on the other hand is obtained from the Bohr–Sommerfeld quantization rules, which state that the total action over the domain of a state, such as a cycle of the motion of an electron around the H atom, is quantized in units of h,

$$\oint_{H=E} p \cdot dq = nh, \tag{0.21}$$

where E is the energy of the electron. More generally, this should be true for any variable $q_i$ and its canonically conjugate momentum $p_i$, i.e.,

$$\oint p_i\, dq_i = nh. \tag{0.22}$$

We will study this in more detail later, but in this review we will just recall the principal details. For the H atom, we have three simple quantizations. For $p_\phi$ (the momentum associated with the angular variable φ),

$$\oint p_\phi\, d\phi = lh \tag{0.23}$$

leads to quantization of the total angular momentum,

$$L = l\hbar. \tag{0.24}$$

For $p_r$ (the momentum associated with the radial variable r),

$$\oint p_r\, dr = kh \tag{0.25}$$

leads to the relation ($e_0^2 \equiv e^2/(4\pi\epsilon_0)$)

$$\sqrt{\frac{2\pi^2 m e_0^4}{-E}} - 2\pi L = kh. \tag{0.26}$$

The sum of the quantum numbers k and l is called the principal quantum number,

$$n = l + k. \tag{0.27}$$

This then leads to the derivation of the energy levels obtained from the Balmer law,

$$E_n = -\frac{m e_0^4}{2\hbar^2 n^2}. \tag{0.28}$$

Physically, one describes the original Bohr–Sommerfeld quantization as the particle trajectory being an integral number of de Broglie wavelengths, as in Fig. 0.4.

On the other hand, the projection of the angular momentum L on any direction z must also be quantized (as we deduced from the Stern–Gerlach experiment), so

$$\oint L_z\, d\phi = mh, \tag{0.29}$$

leading to

$$L_z = m\hbar. \tag{0.30}$$

Thus the H atom is described by the three quantum numbers (n, l, m).

Figure 0.4 Bohr–Sommerfeld quantization, physical interpretation.
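The chain (0.24)–(0.28) can be checked numerically. The Python sketch below evaluates the energy levels (0.28) and verifies that, with $L = l\hbar$, the left-hand side of (0.26) is indeed $k = n - l$ units of h. The function names are illustrative, the CODATA constants are quoted from memory, and the electron mass is used rather than the reduced mass; these are assumptions of the sketch, not statements from the text.

```python
import math

h = 6.62607015e-34        # Planck constant, J s
hbar = h / (2.0 * math.pi)
m_e = 9.1093837015e-31    # electron mass, kg
e = 1.602176634e-19       # elementary charge, C
eps0 = 8.8541878128e-12   # vacuum permittivity, F/m

e0_sq = e**2 / (4.0 * math.pi * eps0)  # e_0^2 as defined before Eq. (0.26)


def energy_level(n):
    """Energy levels of Eq. (0.28), in joules."""
    return -m_e * e0_sq**2 / (2.0 * hbar**2 * n**2)


def radial_action(E, L):
    """Left-hand side of Eq. (0.26) for energy E < 0 and angular momentum L."""
    return math.sqrt(2.0 * math.pi**2 * m_e * e0_sq**2 / (-E)) - 2.0 * math.pi * L


E1_eV = energy_level(1) / e
print(f"E_1 = {E1_eV:.4f} eV")

# For n = 3 and l = 1 the radial quantum number should be k = n - l = 2.
k_value = radial_action(energy_level(3), 1 * hbar) / h
print(f"k = {k_value:.6f}")
```

With these inputs the ground-state energy comes out close to the familiar −13.6 eV.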
0.5 Review: States of Quantum Mechanical Systems

To summarize, in the case of quantum mechanical systems we have the following possibilities for the states.

• We can have a finite number of discrete states, as in the case of spin systems. Quantization in this case means exactly that: instead of a continuum of possible states (for a classical angular momentum, in this case), now we have a discrete set of states. In particular, for spin s = 1/2 (as in the case of electrons, proven by the Stern–Gerlach experiment), we have a two-state system, the simplest possible, which will be analyzed first.
• We can have an infinite number of discrete states, as in the case of the hydrogen atom, considered above. In this case, quantization means the same: instead of a continuum of states, we have a discrete set, with an allowed set of energies and states, as well as other observables.
• We can also have a continuum of states, as in the case of free particles. In this case, quantization refers only to the types of states allowed; there is no restriction on the allowed number of energies and states associated with them.
Important Concepts to Remember

• Some states for some systems are quantized (quantized energy states, for instance), leading to matrix mechanics.
• There is always a wave associated with a particle and a particle with a wave, leading to a complementarity.
• There is always a (complex-valued) wave function ψ describing probabilities for a measurement via $P = |\psi|^2$. It satisfies the Schrödinger equation.
• We can always have a discrete infinite number of states, or even a continuous infinite number, but both are still described by matrix mechanics. In any case, there is a wave function.
Further Reading See, for instance, Messiah’s book [2]. It has a more thorough treatment of the historical background.
Exercises

(1) Derive the Stefan–Boltzmann law from the Planck law for the spectral radiance.
(2) In the photoelectric effect, if the incoming flux of light has frequency ν and the released electrons get thermalized through interaction, what is the resulting temperature? If ν is within the visible spectrum and W is comparable with the energy of the photon, what kind of temperature range do you expect?
(3) Derive the Compton effect Δλ from the relativistic collision of the photon with a free electron.
(4) Considering a relativistic traveling wave of frequency ν, calculate the distance between the local maxima (or local minima) in the interference pattern on a screen.
(5) Calculate the de Broglie wavelength for an electron at “room temperature”. How does it compare with a corresponding wavelength in the radiation spectrum? What does this imply for the feasibility of microscopes using electrons versus those using radiation?
(6) Can we apply Bohr–Sommerfeld quantization to a circular pendulum? Why, and would it be useful to do so?
(7) Do you think that the Schrödinger equation is applicable for light as well? Explain why.
1 The Mathematics of Quantum Mechanics 1: Finite-Dimensional Hilbert Spaces

Before we define quantum mechanics through its postulates, we will review relevant issues about the mathematics of vector spaces, formulating it in a way more easily applicable to the physical case in which we are interested, and in particular the vector spaces we will call Hilbert spaces, relevant for quantum mechanics. In this chapter we will consider the case of finite-dimensional spaces, which are easier to define and analyze, and in the next chapter we will consider the more involved, but more physically relevant, case of infinite-dimensional Hilbert spaces.
1.1 (Linear) Vector Spaces V

We start with the general definition of a vector space. A vector space is a generalization of the space of vectors familiar from classical mechanics. As such, it is a collection $\{|v\rangle\}$ of elements denoted by $|v\rangle$ (in the so-called “ket” notation defined by Dirac) together with an addition rule “+” and a multiplication rule “·”, and a set of axioms about them, as follows.

• There is an addition operation “+” that respects the space (a group property): $\forall\, |v\rangle, |w\rangle \in V$, $|v\rangle + |w\rangle \in V$.
• There is a multiplication by a scalar operation “·” that also respects the space: $\forall\, \alpha \in \mathbb{C}$ and $|v\rangle \in V$, $\alpha|v\rangle \in V$.
• The addition of scalars is distributive: $\forall\, \alpha, \beta \in \mathbb{C}$ and $|v\rangle \in V$, $(\alpha + \beta)|v\rangle = \alpha|v\rangle + \beta|v\rangle$.
• The addition of vectors is also distributive: $\forall\, \alpha \in \mathbb{C}$ and $|v\rangle, |w\rangle \in V$, $\alpha(|v\rangle + |w\rangle) = \alpha|v\rangle + \alpha|w\rangle$.
• The addition of vectors “+” is commutative and associative, and so is the multiplication by scalars “·”.
• There is a unique zero vector $|0\rangle$, such that $|v\rangle + |0\rangle = |v\rangle$.
• There is a unique inverse under vector addition: $\forall\, |v\rangle\ \exists\, |{-v}\rangle$ such that $|v\rangle + |{-v}\rangle = |0\rangle$.

Given a set of vectors $|v_1\rangle, |v_2\rangle, \ldots, |v_n\rangle$, we can form linear combinations of them. We say that the vectors are linearly dependent if there is a nontrivial linear combination of them that vanishes, i.e., if there exist complex numbers $\alpha_i \neq 0$ (at least two of them nonzero) such that

$$\sum_{i=1}^{n} \alpha_i |v_i\rangle = |0\rangle. \tag{1.1}$$

In this case, we can write one of the vectors in terms of the others. For instance, if $\alpha_n \neq 0$, we write

$$|v_n\rangle = -\sum_{i=1}^{n-1} \frac{\alpha_i}{\alpha_n} |v_i\rangle = \sum_{i=1}^{n-1} \alpha'_i |v_i\rangle, \qquad \alpha'_i \equiv -\frac{\alpha_i}{\alpha_n}. \tag{1.2}$$
We will then call a vector space V n-dimensional if there are n linearly independent vectors $|v_1\rangle, |v_2\rangle, \ldots, |v_n\rangle$, but any set of n + 1 vectors $|w_1\rangle, |w_2\rangle, \ldots, |w_{n+1}\rangle$ is linearly dependent. We call $\{|v_1\rangle, |v_2\rangle, \ldots, |v_n\rangle\}$ a basis for the space. We can then decompose any vector $|v\rangle \in V$ in this basis, i.e., there is a unique set of coefficients $\alpha_i$, such that

$$|v\rangle = \sum_{i=1}^{n} \alpha_i |v_i\rangle. \tag{1.3}$$

The fact that the set of coefficients is unique follows from the fact that if there were a second set $\tilde\alpha_i$ satisfying the same condition,

$$|v\rangle = \sum_{i=1}^{n} \tilde\alpha_i |v_i\rangle, \tag{1.4}$$

then by taking the difference of the two decompositions, we obtain

$$\sum_{i=1}^{n} (\alpha_i - \tilde\alpha_i)|v_i\rangle = 0. \tag{1.5}$$

Unless $\alpha_i = \tilde\alpha_i$ for all i, this would mean that the $|v_i\rangle$ are linearly dependent, which is a contradiction.
Subspaces

If there is a subset $V' \subset V$ such that, $\forall\, |v\rangle \in V'$ and $\forall\, \alpha \in \mathbb{C}$, $\alpha|v\rangle \in V'$ as well, and also $\forall\, |v\rangle, |w\rangle \in V'$, $|v\rangle + |w\rangle \in V'$, we call $V'$ a subspace of the vector space. In this case, the subspace will have dimension $k \leq n$ (n being the dimension of V), and there will be a basis $(|v_1\rangle, |v_2\rangle, \ldots, |v_k\rangle)$ for $V'$.
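Scalar Product and Orthonormality

On a vector space we can define a notion of a scalar product, which associates with any two vectors $|v\rangle$ and $|w\rangle$ a complex number $\langle w|v\rangle$ such that:

(1) $\langle w|v\rangle = \langle v|w\rangle^*$.
(2) The product is distributive in the second term, i.e., $\langle w|(\alpha|v\rangle + \beta|u\rangle) = \langle w|\alpha v + \beta u\rangle = \alpha\langle w|v\rangle + \beta\langle w|u\rangle$.
(3) The product of a vector with itself is nonnegative, $\langle v|v\rangle \geq 0$, and is zero only for the null vector,

$$\langle v|v\rangle = 0 \Leftrightarrow |v\rangle = |0\rangle. \tag{1.6}$$

Note that the first and second axioms imply also that the product is distributive (with complex-conjugated coefficients) in the first term,

$$\langle \alpha v + \beta u|w\rangle = (\alpha^*\langle v| + \beta^*\langle u|)|w\rangle = \alpha^*\langle v|w\rangle + \beta^*\langle u|w\rangle. \tag{1.7}$$

Given this scalar product, we can define orthogonality, namely when

$$\langle v|w\rangle = 0. \tag{1.8}$$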
We also deduce, using the second axiom, that the product of any vector with the null vector gives zero, since

$$\langle v|0\rangle = \langle v|(|w\rangle - |w\rangle) = \langle v|w\rangle - \langle v|w\rangle = 0. \tag{1.9}$$

We define the norm through the scalar product of a vector with itself, as

$$\|v\| \equiv \sqrt{\langle v|v\rangle}. \tag{1.10}$$

Next, we can define orthonormality as orthogonality with a unit norm for each vector, $\|v\| = 1$, $\forall\, |v\rangle$ in the set. If a basis vector does not have unit norm, we can always find a complex number c such that $\|cv\| = 1$, since

$$\langle cv|cv\rangle = cc^*\langle v|v\rangle \Rightarrow |c|^2 = \frac{1}{\|v\|^2}. \tag{1.11}$$
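We can consider a basis of orthonormal vectors $|i\rangle$, and decompose an arbitrary vector $|v\rangle$ using it, as

$$|v\rangle = \sum_i v_i |i\rangle. \tag{1.12}$$

Considering also another vector decomposed in this way,

$$|w\rangle = \sum_i w_i |i\rangle, \tag{1.13}$$

we obtain for the scalar product

$$\langle v|w\rangle = \sum_{i,j} v_i^* w_j \langle i|j\rangle. \tag{1.14}$$

If the basis is orthonormal, we have $\langle i|j\rangle = \delta_{ij}$, so

$$\langle v|w\rangle = \sum_i v_i^* w_i. \tag{1.15}$$

Then the norm is given by

$$\|v\|^2 = \langle v|v\rangle = \sum_i |v_i|^2. \tag{1.16}$$

By multiplying the decomposition of $|v\rangle$ by $\langle k|$ (considering the scalar product with $|k\rangle$), we obtain (using $\langle k|i\rangle = \delta_{ik}$)

$$\langle k|v\rangle = \sum_i v_i \langle k|i\rangle = v_k. \tag{1.17}$$

Since $v_i = \langle i|v\rangle$, the decomposition becomes

$$|v\rangle = \sum_i |i\rangle\langle i|v\rangle. \tag{1.18}$$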
The Gram–Schmidt theorem (orthonormalization)

Given a basis of linearly independent vectors $|v_i\rangle$, we can always find an orthonormal basis by making linear combinations of the basis vectors.
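The text states the theorem without the construction. One standard way to realize it is the classical Gram–Schmidt procedure, sketched below in plain Python for vectors given by their components in some orthonormal basis. The function names, the tolerance, and the three sample vectors are illustrative assumptions.

```python
def inner(v, w):
    """Scalar product <v|w> = sum_i v_i^* w_i, as in Eq. (1.15)."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same space as `vectors`.

    Each new vector has its components along the previously built
    orthonormal vectors subtracted, and is then normalized.
    """
    basis = []
    for v in vectors:
        w = [complex(x) for x in v]
        for e in basis:
            coeff = inner(e, w)                  # component <e|w>
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = abs(inner(w, w)) ** 0.5
        if norm < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


# Three linearly independent (but not orthogonal) vectors in C^3.
vs = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]
es = gram_schmidt(vs)

# Expansion of an arbitrary vector as in Eq. (1.18): |v> = sum_i |i><i|v>.
v = [2, -1j, 3 + 1j]
rebuilt = [sum(inner(e, v) * e[k] for e in es) for k in range(3)]
print(rebuilt)
```

The last two lines also illustrate the decomposition (1.18): summing the projections onto the orthonormal vectors reproduces the original components up to rounding.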
1.2 Operators on a Vector Space

An operator $\hat A$ is an action on the vector space that takes us back into the vector space, i.e., if $|v\rangle \in V$ then also $|w\rangle = \hat A|v\rangle \in V$. In other words, it associates with any vector another vector,

$$\hat A : |v\rangle \to |w\rangle. \tag{1.19}$$

That is, it is the generalization of the notion of a function for an action on a vector space. We will mostly be interested in linear operators, which means that

$$\hat A(\alpha|v\rangle + \beta|w\rangle) = \alpha\hat A|v\rangle + \beta\hat A|w\rangle. \tag{1.20}$$

Note however that there are also useful antilinear operators, relevant for the time-reversal invariance (T) operator,

$$\hat A(\alpha|v\rangle + \beta|w\rangle) = \alpha^*\hat A|v\rangle + \beta^*\hat A|w\rangle. \tag{1.21}$$

Nevertheless, at present we will consider only linear operators (except later on in the book, when we will consider the T symmetry).

In the space of operators (or “functions” on the vector space) there exists the notion of $\hat 0$ (a neutral element under addition) and $\hat 1$ (a neutral element under multiplication), defined by

$$\hat 0|v\rangle = |0\rangle, \qquad \hat 1|v\rangle = |v\rangle. \tag{1.22}$$

We also have the same definition of the product with a scalar (in general a complex number), and of addition, as in the vector space itself,

$$(\alpha\hat A)|v\rangle = \alpha(\hat A|v\rangle), \qquad (\alpha\hat A + \beta\hat B)|v\rangle = \alpha(\hat A|v\rangle) + \beta(\hat B|v\rangle). \tag{1.23}$$

Since the operator is the generalization of the notion of a function for a vector space, we can also define the product of operators as the generalization of the composition of functions, $\hat A\cdot\hat B \to A\circ B$, namely

$$\hat A\cdot\hat B|v\rangle \equiv \hat A(\hat B|v\rangle). \tag{1.24}$$

On the space of operators, obtained from the vector space, we can define a completeness relation, saying that an orthonormal basis of vectors $|i\rangle$ is also complete in the space, i.e., that we can expand any vector in the vector space in terms of them. As we saw, this expansion formula was

$$|v\rangle = \sum_i |i\rangle\langle i|v\rangle, \tag{1.25}$$

and by identifying this with $\hat 1|v\rangle$ (by the definition of $\hat 1$), we obtain the completeness relation

$$\sum_i |i\rangle\langle i| = \hat 1. \tag{1.26}$$

This “ket-bra” notation for operators is used more generally than just for the completeness relation, and it can be shown that we can write an operator $\hat A$ in the form

$$\hat A = \sum_{a\in A,\, b\in B} |a\rangle\langle b|, \tag{1.27}$$

where A, B are some sets of vectors, since by acting on an arbitrary vector $|v\rangle$ we obtain

$$\hat A|v\rangle = \sum_{a\in A,\, b\in B} |a\rangle\langle b|v\rangle = \sum_{a\in A,\, b\in B} v_b |a\rangle \equiv |w\rangle, \tag{1.28}$$

where $v_b \equiv \langle b|v\rangle$.
Matrix Representation of Operators

We have talked until now in an abstract way of vector spaces and operators acting on them, but we now show that we can represent operators as matrices acting on column vectors. Then in the case of quantum mechanics one obtains the representation of the “matrix mechanics” of Heisenberg.

Consider an (orthonormal) basis $|1\rangle, |2\rangle, \ldots, |n\rangle$ for the vector space. Then an arbitrary vector $|v\rangle$, decomposed as

$$|v\rangle = \sum_i |i\rangle v_i = \sum_i |i\rangle\langle i|v\rangle, \tag{1.29}$$

can be represented by the numbers $\langle i|v\rangle$ arranged as a column vector,

$$|v\rangle = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} \langle 1|v\rangle \\ \langle 2|v\rangle \\ \vdots \\ \langle n|v\rangle \end{pmatrix}, \tag{1.30}$$

where the first form is general (in terms of coefficients $v_i$ of the basis), and the second is for an orthonormal basis. In this representation, the ith basis vector corresponds to the column vector with only a 1 in the ith position,

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad |i\rangle = \begin{pmatrix} 0 \\ \vdots \\ 1_i \\ \vdots \\ 0 \end{pmatrix}. \tag{1.31}$$
Under this representation of the vectors, the operators $\hat A$ associating a vector with another vector are associated with a matrix. The matrix element of the operator in the basis $\{|i\rangle\}$ is

$$A_{ij} = \langle i|\hat A|j\rangle. \tag{1.32}$$

If moreover the set $\{|i\rangle\}$ is complete then the product of operators $\hat A\cdot\hat B$ is associated with the matrix product of matrices. Indeed, inserting the identity written as the completeness relation, we obtain

$$(\hat A\cdot\hat B)_{ij} = \langle i|\hat A\cdot\hat B|j\rangle = \sum_k \langle i|\hat A|k\rangle\langle k|\hat B|j\rangle = \sum_k A_{ik}B_{kj}. \tag{1.33}$$

Then it follows also that the operator inverse $\hat A^{-1}$, defined by

$$\hat A\cdot\hat A^{-1} = \hat A^{-1}\cdot\hat A = \hat 1, \tag{1.34}$$

is associated with the matrix inverse $(A^{-1})_{ij}$.
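Equations (1.32) and (1.33) can be illustrated with a small Python sketch: operators are represented as linear maps on component lists, their matrix elements are extracted by acting on the basis vectors (1.31), and the composition of the maps is compared with the product of the matrices. The two sample operators `op_a` and `op_b` are hypothetical and chosen only for illustration.

```python
def inner(v, w):
    """Scalar product <v|w> in an orthonormal basis, Eq. (1.15)."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


def basis_vector(i, n):
    """Column vector with a single 1 in position i, as in Eq. (1.31)."""
    return [1.0 if k == i else 0.0 for k in range(n)]


def matrix_elements(op, n):
    """A_ij = <i|A|j>, Eq. (1.32), for a linear map `op` on length-n lists."""
    return [[inner(basis_vector(i, n), op(basis_vector(j, n)))
             for j in range(n)] for i in range(n)]


# Two hypothetical linear operators on a two-dimensional space.
def op_a(v):
    return [2 * v[0] + 1j * v[1], v[0] - v[1]]


def op_b(v):
    return [v[1], 3 * v[0] + (1 + 1j) * v[1]]


n = 2
A = matrix_elements(op_a, n)
B = matrix_elements(op_b, n)

# Left side of Eq. (1.33): matrix elements of the composed operator A.B
AB_direct = matrix_elements(lambda v: op_a(op_b(v)), n)

# Right side of Eq. (1.33): sum over k of A_ik B_kj
AB_matrix = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]

print(AB_direct)
print(AB_matrix)
```

Both printed matrices should agree, which is the content of (1.33) for this example.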
1.3 Dual Space, Adjoint Operation, and Dirac Notation

We represented the vectors $|v\rangle$ as column vectors (1.30), where the basis vectors $|i\rangle$ correspond to each component of the column vector. But it follows that we can define the adjoint matrix of this column vector, i.e., the transposed and complex conjugate matrix, a row vector, defined as the adjoint of the column vector:

$$(|v\rangle)^\dagger = (v_1^*, v_2^*, \ldots, v_n^*). \tag{1.35}$$

Then we can also write the scalar product of two vectors in an orthonormal basis, which we saw was written as $\sum_i v_i^* u_i$, as the matrix product of row and column vectors,

$$\langle v|u\rangle = \sum_i v_i^* u_i = (v_1^*\ v_2^*\ \cdots\ v_n^*) \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}. \tag{1.36}$$

But that means that we can represent the adjoint of the vector as the formal “bra” vector,

$$\langle v| = (|v\rangle)^\dagger \equiv (v_1^*, v_2^*, \ldots, v_n^*). \tag{1.37}$$

The “bra” and “ket” notation is due to Dirac, and it is a splitting of the word “bracket”, which refers to the scalar product $\langle v|u\rangle$. Then $|u\rangle$, the vectors in the space V, are called ket vectors and $\langle v|$, the vectors in the “dual” (adjoint) space, denoted by $V^*$, are called bra vectors.

We can define the adjoint operation on vectors in a vector space a bit more generally, in order to better define $V^*$, by defining the adjoint operation under multiplication by scalars $a \in \mathbb{C}$ and under a sum of the basis vectors,

$$\langle av| = (|av\rangle)^\dagger = (a|v\rangle)^\dagger = a^*\langle v|, \qquad |v\rangle = \sum_i v_i|i\rangle \Rightarrow \langle v| = (|v\rangle)^\dagger = \sum_i v_i^*\langle i|, \tag{1.38}$$

where we have used that $|i\rangle^\dagger = \langle i|$ for the basis vectors.

Having defined the adjoint operation for the vectors in the space, we can also define the same for the operators relating two vectors, also via the adjoint matrix, i.e., the transpose and complex conjugate, for the basis vectors $|i\rangle$, so

$$(\hat A^\dagger)_{ij} \equiv \langle i|\hat A^\dagger|j\rangle = \langle j|\hat A|i\rangle^* = A_{ji}^*. \tag{1.39}$$
We can write this adjoint operation formally without reference to a basis of vectors. Since an operator associates a vector with another vector,

$$\hat A|v\rangle = |\hat A v\rangle, \tag{1.40}$$

it follows that

$$\langle \hat A v| = (\hat A|v\rangle)^\dagger = \langle v|\hat A^\dagger. \tag{1.41}$$

The matrix definition, or this abstract definition, also implies that the dagger operation acts in reverse on a product, as follows:

$$(\hat A\cdot\hat B)^\dagger = \hat B^\dagger\cdot\hat A^\dagger. \tag{1.42}$$
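A quick numerical illustration of (1.41) and (1.42), as a minimal Python sketch with matrices stored as nested lists. The helper names and the sample matrices and vectors are illustrative assumptions.

```python
def dagger(M):
    """Adjoint of a matrix: transpose and complex-conjugate."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[j][i]).conjugate() for j in range(rows)]
            for i in range(cols)]


def matmul(X, Y):
    """Matrix product (X.Y)_ij = sum_k X_ik Y_kj."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matvec(M, v):
    """Action of a matrix on a column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def inner(v, w):
    """Scalar product <v|w> = sum_i v_i^* w_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


A = [[1 + 2j, 3], [0, 1j]]
B = [[2, 1j], [1 - 1j, 4]]
v = [1, 2j]
w = [1j, 1 - 1j]

# Eq. (1.42): (A.B)^dagger = B^dagger . A^dagger
lhs = dagger(matmul(A, B))
rhs = matmul(dagger(B), dagger(A))

# Eq. (1.41): <Av|w> = <v|A^dagger|w>
bracket_left = inner(matvec(A, v), w)
bracket_right = inner(v, matvec(dagger(A), w))

print(lhs == rhs, bracket_left, bracket_right)
```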
Change of Basis

Consider a change of basis, from the orthonormal basis $\{|i\rangle\}$ to $\{|i'\rangle\}$. The fact that both are bases means that we can expand a vector in one basis in the other basis, as the linear combination

$$|i'\rangle = \sum_j U_{ji}|j\rangle. \tag{1.43}$$

Multiplying by the bra vector $\langle k|$ and using orthonormality, $\langle k|i\rangle = \delta_{ik}$, we obtain

$$U_{ki} = \langle k|i'\rangle. \tag{1.44}$$

We can consider the reverse mapping, and represent this matrix as an abstract operator $\hat U$, with matrix elements $U_{ki}$, i.e.,

$$U_{ki} = \langle k|\hat U|i\rangle = \langle k|\hat U i\rangle, \tag{1.45}$$

which implies that formally we have

$$|i'\rangle = \hat U|i\rangle = |\hat U i\rangle. \tag{1.46}$$

For the inverse relation, we can expand the vector $|i\rangle$ in the basis $|i'\rangle$, giving the linear combination

$$|i\rangle = \sum_j V_{ji}|j'\rangle. \tag{1.47}$$

Multiplying by the vector $\langle k'|$ and using the orthonormality relation $\langle k'|j'\rangle = \delta_{jk}$, we obtain

$$V_{ki} = \langle k'|i\rangle = \langle i|k'\rangle^* = U_{ik}^* = (\hat U^\dagger)_{ki}, \tag{1.48}$$

implying that formally we have $\hat V = \hat U^\dagger$. Then, applying the transformation twice, from $|i\rangle$ to $|i'\rangle$ and back to $|i\rangle$, we obtain

$$|i\rangle = \hat U^\dagger\hat U|i\rangle \Rightarrow \hat U^\dagger\hat U = \hat 1, \tag{1.49}$$

and similarly $\hat U\hat U^\dagger = \hat 1$, implying that

$$\hat U^{-1} = \hat U^\dagger. \tag{1.50}$$

Such an operator is called a unitary operator. Thus, operators that change basis in a vector space are unitary. Moreover, for such an operator, taking the determinant of the relation (mapped to matrices) $\hat U\hat U^\dagger = \hat 1$, and using the facts that the determinant of the matrix product is the product of the matrix determinants and that $\det U^T = \det U$, so that $\det U^\dagger = (\det U)^*$, we obtain

$$1 = \det(\hat U\cdot\hat U^\dagger) = \det U\cdot\det U^\dagger = |\det U|^2. \tag{1.51}$$
These unitary matrices U form a group G, which means that their multiplication is such that:

(a) $\forall\, \hat A, \hat B \in G$, $\hat A\cdot\hat B \in G$.
(b) $\exists$ a unit operator (matrix) $\hat 1$, such that $\hat A\cdot\hat 1 = \hat 1\cdot\hat A = \hat A$, $\forall\, \hat A \in G$.
(c) $\forall\, \hat A$, $\exists\, \hat A^{-1}$ such that $\hat A\cdot\hat A^{-1} = \hat A^{-1}\cdot\hat A = \hat 1$.

The group of such unitary n × n matrices is called the unitary group U(n).

Such a unitary transformation of basis $\hat U$ preserves the scalar product, meaning that if

$$|v'\rangle = U|v\rangle, \qquad |w'\rangle = U|w\rangle, \tag{1.52}$$

then the scalar product is invariant,

$$\langle w'|v'\rangle = \langle Uw|Uv\rangle = \langle w|U^\dagger U|v\rangle = \langle w|v\rangle. \tag{1.53}$$

The transformation of the basis is called an “active” transformation, $|v\rangle \to U|v\rangle$. Then the matrix elements of an operator $\hat A$ change as follows:

$$\langle i|\hat A|j\rangle \to \langle i'|\hat A|j'\rangle = \langle Ui|\hat A|Uj\rangle = \langle i|U^\dagger\hat A U|j\rangle. \tag{1.54}$$

Equivalently, describing this as a “passive” transformation, we can think of the transformation as acting on the operators instead, as

$$\hat A \to U^\dagger\hat A U = U^{-1}\hat A U. \tag{1.55}$$

If, however, we consider the matrix element as invariant, the change of basis $|i\rangle \to U|i\rangle$ is accompanied by a compensating inverse transformation on operators $\hat A$ given by $\hat A \to U\hat A U^{-1}$.
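The properties (1.49), (1.51), and (1.53) can be checked on an explicit example. The Python sketch below builds a 2 × 2 unitary matrix from three arbitrarily chosen angles (an illustrative parametrization, not one used in the text) and verifies that $U^\dagger U = 1$, that $|\det U| = 1$ even though $\det U \neq 1$, and that the scalar product of two sample vectors is preserved.

```python
import cmath
import math


def dagger(M):
    """Adjoint of a square matrix: transpose and complex-conjugate."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def inner(v, w):
    """Scalar product <v|w> = sum_i v_i^* w_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))


# Assumed sample angles; any real values give a unitary matrix.
theta, phi, alpha = 0.7, 0.3, 0.5
phase = cmath.exp(1j * alpha)
U = [[phase * math.cos(theta), -phase * cmath.exp(1j * phi) * math.sin(theta)],
     [phase * cmath.exp(-1j * phi) * math.sin(theta), phase * math.cos(theta)]]

UdU = matmul(dagger(U), U)                       # should be the identity
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]    # equals exp(2 i alpha)

v = [1 + 1j, 2]
w = [0.5j, -1]
before = inner(w, v)
after = inner(matvec(U, w), matvec(U, v))        # Eq. (1.53)

print(UdU, abs(det_U), before, after)
```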
1.4 Hermitian (Self-Adjoint) Operators and the Eigenvalue Problem We call an operator Hermitian or self-adjoint if ˆ Aˆ † = A,
(1.56)
ˆ Aˆ † = − A.
(1.57)
and anti-Hermitian if
The notion of Hermitian and anti-Hermitian operators corresponds to the notion of real or imaginary complex numbers. As for complex numbers (where we can decompose any complex number into a real and an imaginary part), we can decompose any operator into a Hermitian and an anti-Hermitian part,
$$\hat A = \frac{\hat A + \hat A^\dagger}{2} + \frac{\hat A - \hat A^\dagger}{2} \equiv \hat A_H + \hat A_{A\text{-}H}. \tag{1.58}$$
Indeed, $\hat A_H = (\hat A + \hat A^\dagger)/2$ is Hermitian and $\hat A_{A\text{-}H} = (\hat A - \hat A^\dagger)/2$ is anti-Hermitian. In matrix notation, a Hermitian (self-adjoint) operator satisfies
$$A_{ij} = \langle i|\hat A j\rangle = \langle i|\hat A|j\rangle = \langle j|\hat A^\dagger|i\rangle^* = \langle j|\hat A|i\rangle^* = A^*_{ji}. \tag{1.59}$$
We will be interested in eigenvalue problems for operators, especially Hermitian operators.
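The decomposition (1.58) can be tried out on a concrete matrix. The sketch below uses an arbitrary $2 \times 2$ complex matrix chosen here for illustration; the helper names `dagger`, `half_combination` and `close` are hypothetical.

```python
def dagger(m):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def half_combination(a, b, sign):
    """Return (a + sign * b) / 2, element by element."""
    n = len(a)
    return [[(a[i][j] + sign * b[i][j]) / 2 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

# Arbitrary sample matrix, neither Hermitian nor anti-Hermitian.
A = [[1 + 2j, 3 - 1j],
     [0.5j, -2 + 1j]]

A_H = half_combination(A, dagger(A), +1)    # Hermitian part
A_AH = half_combination(A, dagger(A), -1)   # anti-Hermitian part

minus_A_AH = [[-x for x in row] for row in A_AH]
total = [[A_H[i][j] + A_AH[i][j] for j in range(2)] for i in range(2)]

print(close(dagger(A_H), A_H), close(dagger(A_AH), minus_A_AH), close(total, A))
```

All three checks should print `True`: the first part equals its own adjoint, the second equals minus its adjoint, and the two parts add up to the original matrix.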
An eigenvalue problem amounts to finding a vector $|v\rangle$, called an eigenvector, on which the operator acts simply as multiplication by a complex number $\lambda$, called the eigenvalue. Thus we have
$$\hat A|v\rangle = \lambda|v\rangle = \lambda \hat 1|v\rangle, \tag{1.60}$$
or equivalently
$$(\hat A - \lambda \hat 1)|v\rangle = 0. \tag{1.61}$$
Multiplying this equation with a bra vector, we obtain an equation describable as a matrix equation (since there is a basis in which $|v\rangle$ is a basis vector). A nonzero solution $|v\rangle$ exists only if the matrix $\hat A - \lambda \hat 1$ is not invertible, so we obtain the eigenvalue condition
$$\det(\hat A - \lambda \hat 1) = 0. \tag{1.62}$$
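As mentioned previously, we are mostly interested in Hermitian operators. For them, we have the following:
Theorem The eigenvalues of a Hermitian operator are all real.
Proof We have
$$\hat A|v\rangle = \lambda|v\rangle \;\Rightarrow\; \langle v|\hat A|v\rangle = \lambda \|v\|^2. \tag{1.63}$$
On the other hand, for a Hermitian operator we also have
$$(\hat A|v\rangle)^\dagger = \langle v|\hat A^\dagger = \langle v|\hat A. \tag{1.64}$$
It then follows that we can rewrite the diagonal element of $\hat A^\dagger$ in two ways:
$$\langle v|\hat A^\dagger|v\rangle = \langle v|\hat A|v\rangle = \lambda \|v\|^2$$
$$\phantom{\langle v|\hat A^\dagger|v\rangle} = (\langle v|\hat A^\dagger)|v\rangle = \langle v|\lambda^*|v\rangle = \lambda^* \|v\|^2. \tag{1.65}$$
Therefore $\lambda = \lambda^*$, so $\lambda$ is real. q.e.d.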
We can use the eigenvectors of a Hermitian operator to build an orthonormal basis, according to the following:
Theorem For a Hermitian operator, $\hat A^\dagger = \hat A$, there is an orthonormal basis of eigenvectors for $V$. In it, if the $\lambda_i$ are all different,
$$\hat A = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \tag{1.66}$$
Then, the eigenvalue condition (1.62) corresponds to an $n$th-order polynomial in $\lambda$, $P_n(\lambda) = 0$, with $n$ solutions $\lambda_1, \lambda_2, \ldots, \lambda_n$. That means that we can diagonalize a Hermitian operator by a unitary transformation, and on the diagonal we have the eigenvalues of the operator.
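For a $2 \times 2$ matrix the eigenvalue condition (1.62) is a quadratic equation, $\lambda^2 - (\mathrm{Tr}\, A)\lambda + \det A = 0$, which can be solved directly. The following sketch does this for a sample Hermitian matrix chosen here for illustration and checks that the roots are real and the eigenvectors orthogonal. The function names are hypothetical, and the eigenvector formula assumes a nonzero off-diagonal element.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of det(m - lambda 1) = lambda^2 - tr(m) lambda + det(m) = 0."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # complex sqrt: reality is not presupposed
    return (trace - root) / 2, (trace + root) / 2

def eigenvector(m, lam):
    """Solve the first row of (m - lam 1) v = 0; assumes m[0][1] != 0."""
    return [m[0][1], lam - m[0][0]]

def braket(w, v):
    """Scalar product <w|v>, antilinear in the first argument."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

# Sample Hermitian matrix: real diagonal, complex-conjugate off-diagonal entries.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]

lam1, lam2 = eigenvalues_2x2(A)
v1, v2 = eigenvector(A, lam1), eigenvector(A, lam2)

print(lam1, lam2)        # eigenvalues 1 and 4, with vanishing imaginary parts
print(braket(v1, v2))    # vanishes: eigenvectors of distinct eigenvalues are orthogonal
```

Normalizing `v1` and `v2` and using them as the columns of a matrix gives a unitary transformation that brings `A` to the diagonal form (1.66).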
1.5 Traces and Tensor Products
We can define the trace of an operator formally, as corresponding to the trace of the matrix in its matrix representation,
$$\mathrm{Tr}\, \hat A = \sum_i \langle i|\hat A|i\rangle = \sum_i A_{ii}. \tag{1.67}$$
We can also define tensor products of vector spaces, and operator actions on them, in a way also obvious from the behavior of matrices. Consider the vector spaces $V$ and $V'$. Then define the tensor product $V \otimes V'$ as follows. If $|i\rangle \in V$ is a basis for $V$ and $|a\rangle \in V'$ is a basis for $V'$, we denote the basis for $V \otimes V'$ as the tensor product of the basis elements, i.e.,
$$|ia\rangle \equiv |i\rangle \otimes |a\rangle, \tag{1.68}$$
and any vector $|v\rangle \in V \otimes V'$ is expandable in this basis,
$$|v\rangle = \sum_{i,a} v_{ia} |ia\rangle. \tag{1.69}$$
Then the basic operators on the tensor product space are also tensor products, $\hat A = \hat A_V \otimes \hat A_{V'}$ (a general operator being a linear combination of such products), acting as tensor products of the corresponding matrices in the basis, i.e., acting independently in each space,
$$\hat A|v\rangle = \left(\hat A_V \otimes \hat A_{V'}\right) \sum_{i,a} v_{ia}\, |i\rangle \otimes |a\rangle = \sum_{i,a} v_{ia} \left(\hat A_V|i\rangle\right) \otimes \left(\hat A_{V'}|a\rangle\right). \tag{1.70}$$
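In components, the tensor product of matrices is the Kronecker product. The sketch below illustrates (1.70) on a product vector, checking that the product operator acts independently on each factor. The matrices, vectors, index ordering (first-space index running slowest) and helper names are assumptions made here for illustration.

```python
def kron_vec(v, w):
    """Components v_i w_a of |v> (x) |w>, ordered with i running slowest."""
    return [vi * wa for vi in v for wa in w]

def kron_mat(a, b):
    """Kronecker product: entry ((i,k),(j,l)) is a[i][j] * b[k][l]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def apply(m, v):
    """Action of a matrix on a column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

# Sample operators on a 2-dimensional and a 3-dimensional space.
A_V = [[0, 1],
       [1, 0]]
A_W = [[1, 0, 0],
       [0, 2j, 0],
       [0, 0, -1]]

v = [1, 2j]
w = [1, -1, 0.5]

lhs = apply(kron_mat(A_V, A_W), kron_vec(v, w))   # (A_V (x) A_W)(|v> (x) |w>)
rhs = kron_vec(apply(A_V, v), apply(A_W, w))      # (A_V|v>) (x) (A_W|w>)

print(lhs)
print(rhs)
```

The two printed lists should coincide; by linearity the same then holds for any vector of the form (1.69).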
1.6 Hilbert Spaces
Finally, we come to the definition of Hilbert spaces, which are the spaces needed for states in quantum mechanics. We will forgo a more precise and rigorous definition until the next chapter, where we will also generalize Hilbert spaces to infinite-dimensional spaces. Briefly, Hilbert spaces are vector spaces, with a notion of a scalar product and the norm derived from it, which contain only proper vectors, which can be normalized to unity, $\|v\| = 1$, and for which we have a notion of continuity and completeness of the space with respect to a metric. Physically, as mentioned above, the Hilbert space is the state space for quantum mechanics. In the finite-dimensional case, a Hilbert space can be thought of as a regular topological vector space, associated with the metric distance defined from the scalar product (and its norm). In the infinite-dimensional case more care is needed, since the space of functions will be included in it, so in the next chapter we will define the Hilbert space more carefully. Note that for vectors $|v\rangle$ in a Hilbert space, if they are normalizable, which means we can write $\langle v|v\rangle = 1$ and expand the vectors in an orthonormal basis $|i\rangle$, $\langle j|i\rangle = \delta_{ij}$, we have
$$|v\rangle = \sum_i v_i |i\rangle = \sum_i \langle i|v\rangle |i\rangle, \tag{1.71}$$
so that
$$1 = \langle v|v\rangle = \sum_{i,j} \langle j| v_j^* v_i |i\rangle = \sum_i |v_i|^2 = \sum_i |\langle i|v\rangle|^2. \tag{1.72}$$
Important Concepts to Remember
• Linear vector spaces are a collection of elements together with the addition of vectors and multiplication by scalars, and some axioms.
• Vector spaces have bases. A basis contains a maximal number of linearly independent vectors, such that any element in the space can be decomposed into it.
• A subspace is a subset of elements of the space, such that the addition and multiplication by scalars defined for the space apply also to its subspaces.
• The scalar product associates a complex scalar with two vectors. In Dirac’s bra-ket notation, the scalar product is $\langle v|w\rangle$.
• The norm is the square root of the scalar product of a vector with itself.
• An orthonormal basis is a basis of vectors of unit norm that are orthogonal to each other (the scalar product of two different basis vectors vanishes), i.e., $\langle v_i|v_j\rangle = \delta_{ij}$.
• An operator $\hat A$ on a vector space associates any vector with another. A linear operator respects linearity.
• Representing vectors as column vectors whose elements are the coefficients of the expansion into a basis, an operator is represented as a matrix acting on these column vectors.
• When changing the basis between two orthonormal bases, the corresponding operator $\hat U$ is unitary, $\hat U \in U(N)$, and the operators are changed as $\hat A \to \hat U \hat A \hat U^{-1}$.
• Hermitian operators, $\hat A^\dagger = \hat A$, admit an eigenvalue–eigenstate problem, $\hat A|v\rangle = \lambda|v\rangle$, with real eigenvalues $\lambda \in \mathbb{R}$.
• Traces of operators are associated with the traces of their associated matrices, while tensor products of vector spaces mean that for any element of one space, we associate with it a full copy of the other vector space.
• Hilbert spaces are vector spaces with scalar products and norms and vectors that can be normalized, and for which we have a notion of continuity and completeness of the space with respect to a metric. Physically, they are the state space for quantum mechanics.
Further Reading See, for instance, Shankar’s book [3] for more details on vector spaces, from the point of view of a physicist.
Exercises
(1) Write formally for a general vector space the triangle inequality for three vectors, generalized from vectors in a three-dimensional space, with $\vec c = \vec a + \vec b$.
(2) Considering three linearly independent vectors in a three-dimensional Euclidean space, $\vec a, \vec b, \vec c$, construct an orthonormal basis out of them.
(3) Using the bra-ket Dirac notation for operators, show that we can put the product $\hat A \cdot \hat B$ into the same form and that the product is associated with the matrix product of the associated matrices.
(4) Show that the trace of a product of Hermitian operators is real.
(5) What is the trace of a tensor product of two operators?
(6) Show that, modulo some discrete symmetries, $U(n)$ can be split up into $SU(n)$ times a group $U(1)$ of complex phases $e^{i\alpha}$ (and show that these phases do form a group).
(7) Find the two eigenvalues of an operator associated with a $2 \times 2$ matrix with arbitrary elements.
2
The Mathematics of Quantum Mechanics 2: Infinite-Dimensional Hilbert Spaces In this chapter we move to the more complicated case of an infinite-dimensional Hilbert space, in particular the vector space of functions. Before that, however, we will take a second, more rigorous, look at the definition of a Hilbert space and various related definitions.
2.1 Hilbert Spaces and Related Notions
We define a Hilbert space as a complex vector space with a scalar product, with which we associate a distance function (“metric”) for any two vectors $x$ and $y$ in the space,
$$d(x, y) = \|x - y\| = \sqrt{\langle x - y|x - y\rangle}, \tag{2.1}$$
and which is a complete metric space with respect to this distance function. A metric space is a set with a metric or distance $d(x, y)$ defined on it; a vector space whose metric comes from a scalar product in this way, without the requirement of completeness, is called a pre-Hilbert space. The distance $d(x, y)$ must obey the triangle inequality:
$$d(x, z) \le d(x, y) + d(y, z). \tag{2.2}$$
In the case (as for a Hilbert space) where this comes from a scalar product, the triangle inequality follows from the Cauchy–Schwarz inequality
$$|\langle x|y\rangle| \le \|x\|\, \|y\|. \tag{2.3}$$
A metric space $M$ is called complete (or a “Cauchy space”) if every so-called Cauchy sequence of points in $M$, that is, a sequence $v_n$ of points that become arbitrarily close to each other (in the sense of the metric or distance) as $n \to \infty$, has a limit, as $n \to \infty$, that is also in $M$. A counterexample to this, i.e., an incomplete metric space, is the space of rational numbers $\mathbb{Q}$, since there are irrationals such as $\sqrt{2}$ that are the limit of rational numbers (in a Cauchy sequence), yet they are not part of the space itself. A pre-Hilbert space that is complete is called a Hilbert space. Related to the Hilbert space is the notion of a Banach space, which means a vector space with an associated notion of a norm (which need not come from a scalar product), that is also complete. That means that any Hilbert space is Banach, but the converse is not true: there are Banach spaces that are not Hilbert. Any Hilbert space is also a topological vector space, which means a vector space that is also a topological space. A topological space is a space admitting a notion of continuity and having a uniform topological structure, i.e., a notion of convergence (for the definition of Cauchy sequences). Thus any Banach space is also a topological vector space.
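The incompleteness of $\mathbb{Q}$ can be made concrete with exact rational arithmetic. The sketch below generates a Cauchy sequence of rationals approaching $\sqrt{2}$; the choice of the Newton iteration $x_{n+1} = (x_n + 2/x_n)/2$ and the function name `newton_sqrt2` are made here for illustration only.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates for x^2 = 2, starting from x_0 = 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2   # stays an exact rational number
        seq.append(x)
    return seq

seq = newton_sqrt2(5)

# Distances between successive terms shrink: the sequence is Cauchy.
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]

# x^2 - 2 tends to zero but is never exactly zero for a rational x.
residuals = [abs(x * x - 2) for x in seq]

for x, r in zip(seq, residuals):
    print(x, float(r))
```

Every term is a rational number and the terms get arbitrarily close to each other, yet the would-be limit $\sqrt{2}$ is not rational, so the sequence has no limit inside $\mathbb{Q}$.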
2.2 Functions as Limits of Discrete Sets of Vectors
We are interested in the relevant and nontrivial case of infinite-dimensional Hilbert spaces. Indeed, quantum mechanical systems are usually of this type, so we need to generalize to this case. Moreover, in the quantum mechanical case we have functions $\psi(x)$ as vectors, and operators acting on them, so we need to generalize further to the case of infinite-dimensional vectors. Thus we need to find a way to think of the functions $\psi(x)$ as forming a vector space, in the same way as we do for the $x$’s themselves, and to define operators acting on them, $\hat A \psi(x)$. That is, we need to replace a set of vectors $|f_k\rangle$ with a function $f(x)$. A way to do that is to understand $f(x)$ as a limit of a function defined on discrete points, $f(x_k)$, though the details are a bit different. The first step is to define a basis $|x_i\rangle$ that is orthonormal,
$$\langle x_i|x_j\rangle = \delta_{ij}, \tag{2.4}$$
and complete in the space of $n$ vectors,
$$\sum_{i=1}^n |x_i\rangle\langle x_i| = \hat 1, \tag{2.5}$$
where $\hat 1$ is the unit operator, which means that any vector $|f\rangle$ can be expanded in this basis as
$$|f\rangle = \sum_{i=1}^n f(x_i) |x_i\rangle. \tag{2.6}$$
If we understand the components $f(x_i)$ as, by definition,
$$f(x_i) \equiv \langle x_i|f\rangle, \tag{2.7}$$
then, in view of the completeness relation, we obtain an identity,
$$|f\rangle = \sum_{i=1}^n |x_i\rangle\langle x_i|f\rangle. \tag{2.8}$$
Note that here we define $|x_i\rangle$ as the column vector with a 1 only in the $i$th position and zeroes in the other positions:
$$|x_i\rangle = \begin{pmatrix} 0 \\ \vdots \\ 1_i \\ \vdots \\ 0 \end{pmatrix}. \tag{2.9}$$
The decomposition of $|f\rangle$ in $|x_i\rangle$ and the fact that $f(x_i) = \langle x_i|f\rangle$ suggest that the scalar product can be understood in a similar way, as
$$\langle f|g\rangle = \sum_{i=1}^n \langle f|x_i\rangle\langle x_i|g\rangle = \sum_{i=1}^n f^*(x_i) g(x_i), \tag{2.10}$$
since
$$\langle f|x_i\rangle = (\langle x_i|f\rangle)^* = f^*(x_i). \tag{2.11}$$
Then we also obtain that the norm of a vector $|f\rangle$ is
$$\|f\|^2 = \langle f|f\rangle = \sum_{i=1}^n |f(x_i)|^2. \tag{2.12}$$
2.3 Integrals as Limits of Sums
In both the scalar product and the norm defined above we saw sums appearing, which we need to generalize to the continuum case. It seems obvious that sums should generalize to integrals. Indeed, the Riemann integral is defined as the limit of a Riemann sum. A very small (infinitesimal) constant step of integration $\Delta x_i = \epsilon \to 0$ becomes the differential $dx$, where $\epsilon = (b - a)/n$, $n$ is the number of points and $a, b$ the limits of integration. Then the sum, with each term weighted by the step $\epsilon$, tends to an integral in the limit $n \to \infty$,
$$\sum_{i=1}^n \epsilon\, f(x_i) \to \int_a^b dx\, f(x). \tag{2.13}$$
However, it turns out that a better limit of the sum is the more general Lebesgue integral, which allows for integration in the presence of distributions such as Dirac’s delta function, treated next. The limiting Lebesgue integral is
$$\int f\, d\mu = \int_0^\infty f^*(t)\, dt, \tag{2.14}$$
where
$$f^*(t) = \mu(\{x \mid f(x) > t\}) \tag{2.15}$$
is the measure of the set of values of $x$ for which $f(x)$ is greater than a given $t$ (here $f^*$ denotes this function of $t$, for a nonnegative $f$, and not a complex conjugate). With respect to the Lebesgue integral, we can define $L^2(X, \mu)$, the space of square integrable functions over the domain, which is a space $X$, with measure $\mu$. We say that $f$ belongs to $L^2(X, \mu)$ if
$$\int_X |f|^2\, d\mu < \infty. \tag{2.16}$$
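This means that the scalar product can be generalized from a sum to a Riemann integral (with the step $\Delta x_i$ absorbed into the normalization of the basis vectors $|x_i\rangle$), as
$$\langle f|g\rangle = \sum_{i=1}^n f^*(x_i) g(x_i) \to \int_a^b f^*(x) g(x)\, dx, \tag{2.17}$$
or more generally to a Lebesgue integral, as
$$\langle f|g\rangle = \int f^*(t) g(t) w(t)\, dt, \tag{2.18}$$
where the measure is
$$\mu(A) = \int_A w(t)\, dt. \tag{2.19}$$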
Thus the norm of a function can be written as a Riemann integral or as a Lebesgue integral,
$$\|f\|^2 = \int_a^b |f(x)|^2\, dx \to \int_A |f|^2\, d\mu. \tag{2.20}$$
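The passage from the discrete sums to the integrals for the scalar product and the norm can be checked numerically. The sketch below evaluates a Riemann sum, weighted by the step $(b - a)/n$, for two sample functions on $(0, 1)$. The midpoint sampling, the sample functions and the name `inner` are choices made here for illustration.

```python
import math

def inner(f, g, a, b, n):
    """Riemann-sum approximation of <f|g> = int_a^b f*(x) g(x) dx (midpoint rule)."""
    eps = (b - a) / n                  # constant step of integration
    total = 0
    for i in range(n):
        x = a + (i + 0.5) * eps        # sample point x_i
        total += complex(f(x)).conjugate() * g(x) * eps
    return total

def f(x):
    return math.sin(math.pi * x)

def g(x):
    return math.sin(2 * math.pi * x)

norm_sq = inner(f, f, 0.0, 1.0, 1000)   # exact value of the integral: 1/2
overlap = inner(f, g, 0.0, 1.0, 1000)   # exact value of the integral: 0

print(norm_sq, overlap)
```

Increasing `n` corresponds to the limit $n \to \infty$ in which the sum becomes the Riemann integral.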
2.4 Distributions and the Delta Function
Since the so-called Dirac delta function, which famously is not a function, but rather a distribution, will also make an appearance in infinite-dimensional Hilbert spaces, we need to define the theory of distributions, and specialize to the delta function. In short, a distribution $T$ is a linear and continuous map (a functional, not an ordinary function), defined on the space of test functions of compact support, and with complex (or real) values; thus $T : D \to \mathbb{C}$. The support of a function is the closure of the set of points where it is nonzero, which can be smaller than the full domain; the function vanishes outside its support. The space of test functions for distributions is then defined as the set of infinitely differentiable functions with compact support,
$$D \equiv \{\phi : \mathbb{R}^n \to \mathbb{C} \mid \mathrm{supp}(\phi)\ \text{compact},\ \phi \in C^\infty(\mathbb{R}^n)\}. \tag{2.21}$$
This space is a vector space, with convergence structure (so that it is a topological space) $\phi_n \to \phi$ in $D$. Distributions are in some sense generalizations of functions that allow us to make sense of quantities like the Dirac delta function. Indeed, for any locally integrable function, $f(x) \in L^1_{\rm loc}(A)$, we can associate a distribution $\tilde f$ with it, $\tilde f : D \to \mathbb{C}$, such that, for any test function $\phi$,
$$\langle \tilde f|\phi\rangle = \int_A f^*(x) \phi(x)\, dx. \tag{2.22}$$
The space of distributions $D^*$ is called the dual of the space of test functions $D$,
$$D^* = \{T : D \to \mathbb{C} \mid T\ \text{a distribution}\}. \tag{2.23}$$
It is also a vector space, and has a convergence structure (so it is a topological space): if the sequence $\{T_n\}_n \subset D^*$, then $T_n \to T$ (weak convergence) $\Leftrightarrow$ for any $\phi \in D$ we have $\langle T_n|\phi\rangle \to \langle T|\phi\rangle$. On this space, we can define multiplication of the distribution $T \in D^*$ by any infinitely differentiable function $a(x) \in C^\infty(\mathbb{R}^n)$, by
$$\langle aT|\phi\rangle \equiv \langle T|a^*(x)\phi(x)\rangle. \tag{2.24}$$
If there exists a function $f$ such that a distribution has a function associated with it, $T = \tilde f$, we call the distribution $T$ a regular distribution. If there is no such function, we call the distribution singular; for example, the Dirac delta function is a singular distribution:
$$\delta_{x_0} : D \to \mathbb{R}, \qquad \langle \delta_{x_0}|\phi\rangle = \phi(x_0). \tag{2.25}$$
On the other hand the Heaviside (step) distribution is regular. It is defined by
$$\langle H_{x_0}|\phi\rangle = \int_{x_0}^\infty \phi(x)\, dx. \tag{2.26}$$
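Then the Heaviside function associated with it is defined as
$$H_{x_0}(x) = \begin{cases} 1, & x > x_0 \\ 0, & x \le x_0. \end{cases} \tag{2.27}$$
The Heaviside function is discontinuous at $x_0$, and hence not differentiable there as a function. It is only differentiable as a distribution. We can define the derivative of a distribution $T \in D^*$ by a bracket with a test function $\phi \in D$, as
$$\left\langle \frac{\partial}{\partial x_j} T \,\Big|\, \phi \right\rangle \equiv -\left\langle T \,\Big|\, \frac{\partial \phi}{\partial x_j} \right\rangle. \tag{2.28}$$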
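It is a good definition, since it is self-consistent for regular distributions (those associated with functions). For $T = \tilde f$, the relation is written as
$$\left\langle \frac{\partial}{\partial x_j} \tilde f \,\Big|\, \phi \right\rangle \equiv -\left\langle \tilde f \,\Big|\, \frac{\partial \phi}{\partial x_j} \right\rangle. \tag{2.29}$$
But the right-hand side can be written as follows:
$$-\int_{\mathbb{R}^n} f^*(x) \frac{\partial \phi}{\partial x_j}\, dx = -\int_{\mathbb{R}^n} dx\, \frac{\partial}{\partial x_j}\left(f^* \cdot \phi(x)\right) + \int_{\mathbb{R}^n} \frac{\partial f^*}{\partial x_j} \phi(x)\, dx = \left\langle \widetilde{\frac{\partial f}{\partial x_j}} \,\Big|\, \phi \right\rangle, \tag{2.30}$$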
where in the first equality we have used partial integration, and in the second we used the fact that the test functions have compact support, so the boundary integral on $\mathbb{R}^n$ vanishes and the remaining term is the definition of the scalar product. Furthermore, as a distribution, we have that the delta function is the derivative of the Heaviside function,
$$H'_{x_0} = \delta_{x_0}. \tag{2.31}$$
To prove this, note that
$$\langle H'_{x_0}|\phi\rangle = -\langle H_{x_0}|\phi'\rangle = -\int_{x_0}^\infty \phi'(x)\, dx = \phi(x_0) - \phi(\infty) = \phi(x_0) = \langle \delta_{x_0}|\phi\rangle, \tag{2.32}$$
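The chain of equalities leading to $\langle H'_{x_0}|\phi\rangle = \phi(x_0)$ can be tested numerically on a concrete test function. The sketch below uses the standard smooth bump $\phi(x) = e^{-1/(1 - x^2)}$ for $|x| < 1$ (zero elsewhere), which has compact support; the choice of this test function, of $x_0 = 0.3$, of the midpoint quadrature and of all function names is made here for illustration.

```python
import math

def bump(x):
    """Smooth test function with compact support [-1, 1]."""
    if abs(x) >= 1:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Derivative of the bump, obtained by the chain rule."""
    if abs(x) >= 1:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_derivative_pairing(x0, n=20000):
    """-<H_{x0}|phi'> = -int_{x0}^infinity phi'(x) dx, cut off at the edge of the support."""
    upper = 1.0                      # phi and phi' vanish for x >= 1
    h = (upper - x0) / n
    integral = sum(bump_prime(x0 + (i + 0.5) * h) for i in range(n)) * h
    return -integral

x0 = 0.3
lhs = heaviside_derivative_pairing(x0)   # action of H'_{x0} on the test function
rhs = bump(x0)                           # action of delta_{x0} on the test function

print(lhs, rhs)
```

The two numbers should agree up to the quadrature error, reflecting the fact that the test function vanishes at the upper end of the integration range.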
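where we have used the fact that $\phi(\infty) = 0$ because of the compact support of $\phi$. We can also define a notion of the support of a distribution $T \in D^*$,
$$\mathrm{supp}(T) \equiv \{x \in \mathbb{R}^n \mid T \neq 0\ \text{in}\ X\}, \tag{2.33}$$
that is, the set of points that have no neighborhood on which $T$ vanishes. Then the support of the delta function is a single point,
$$\mathrm{supp}(\delta_{x_0}) = \{x_0\}. \tag{2.34}$$
Moreover, there is a theorem that, for any function $f \in L^1_{\rm loc}(A)$, we have $\mathrm{supp}(\tilde f) \subset \mathrm{supp}(f)$ and for any continuous function $f$, $\mathrm{supp}(\tilde f) = \mathrm{supp}(f)$.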
2.5 Spaces of Functions
We can now go back to defining the notion of a space of functions. As we saw, the scalar product of two functions $f, g$ on an interval $(a, b)$ is defined as
$$\langle f|g\rangle = \int_a^b f^*(x) g(x)\, dx. \tag{2.35}$$
We can consider the case when one of the function vectors is replaced by the $|x\rangle$ vector itself. It follows that we need to define the product of two such vectors, giving something like $\delta_{x,y}$. Of course, now we know that the correct thing to do is to use the delta function, or rather, distribution, $\delta(x - y)$, so
$$\langle x|y\rangle = \delta(x - y), \tag{2.36}$$
which is zero outside its support, the single point $y = x$, and gives 1 on integration,
$$\int_a^b \delta(x - y)\, dy = 1, \qquad a < x < b. \tag{2.37}$$
In fact, we can restrict the domain of integration to an infinitesimal region around the zero of the argument of $\delta$, and write
$$\int_a^b \delta(x - y) f(y)\, dy = f(x) = \int_{x - \epsilon}^{x + \epsilon} \delta(x - y) f(y)\, dy. \tag{2.38}$$
On the other hand, the completeness relation for $|x\rangle$, defined as a (Riemann) sum, is now generalized to an integral:
$$\sum_{i=1}^n |x_i\rangle\langle x_i| = \hat 1 \to \int_a^b dx\, |x\rangle\langle x| = \hat 1. \tag{2.39}$$
Then we can also generalize $\langle x_i|f\rangle \equiv f(x_i)$ to a continuous function $\langle x|f\rangle = f(x)$, since
$$f(x) = \langle x|f\rangle = \langle x|\hat 1|f\rangle = \int_a^b dx'\, \langle x|x'\rangle\langle x'|f\rangle = \int_a^b dx'\, f(x') \langle x|x'\rangle \tag{2.40}$$
only makes sense if $\langle x|x'\rangle = \delta(x - x')$. Note that the delta function is not a function (it is a distribution), but can be obtained as the limit when $\Delta \to 0$ of a Gaussian,
$$f_\Delta(x - y) = \frac{1}{\sqrt{\pi \Delta^2}}\, e^{-(x - y)^2/\Delta^2}. \tag{2.41}$$
The delta function $\delta(x - y)$ is even, owing to its definition as a distribution, and since it is the limit of $f_\Delta$, and also because
$$\delta(x - y) = \langle x|y\rangle = (\langle y|x\rangle)^* = \delta^*(y - x) = \delta(y - x). \tag{2.42}$$
From the fact that $\delta(x - y)$ is a distribution, its derivative with respect to its argument,
$$\delta'(x - y) = \frac{d}{dx} \delta(x - y) = -\frac{d}{dy} \delta(x - y), \tag{2.43}$$
acts in reality as
$$\delta(x - y) \frac{d}{dy} \tag{2.44}$$
(the proof is by partial integration, as we saw from the general derivative of a distribution). Note that here we have assumed that the space of integration for the test functions is in $y$, $\int f(y)\, dy$, which is the opposite convention to the usual one, but it makes sense if we want to think of the integration as a generalization of summation coming from the matrix product. This fact in particular means that $\delta'(x - y)$ is a kind of matrix in $(x, y)$ space (since it has indices).
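The statement that the delta function arises as the $\Delta \to 0$ limit of the Gaussian (2.41) can be illustrated by smearing a smooth function with $f_\Delta$ for decreasing widths. In the sketch below the sample function is $\cos y$, the integral is truncated to ten widths on either side of $x$ and evaluated with a midpoint rule; these choices and the names `f_delta` and `smear` are assumptions made here for illustration.

```python
import math

def f_delta(u, width):
    """Gaussian of (2.41): exp(-u^2 / width^2) / sqrt(pi * width^2)."""
    return math.exp(-(u / width) ** 2) / math.sqrt(math.pi * width ** 2)

def smear(f, x, width, n=4000):
    """Midpoint approximation of int f_delta(x - y) f(y) dy over [x - 10 width, x + 10 width]."""
    a, b = x - 10 * width, x + 10 * width
    h = (b - a) / n
    return sum(f_delta(x - (a + (i + 0.5) * h), width) * f(a + (i + 0.5) * h)
               for i in range(n)) * h

x = 0.5
widths = [0.5, 0.1, 0.02]
errors = [abs(smear(math.cos, x, w) - math.cos(x)) for w in widths]

for w, e in zip(widths, errors):
    print(w, e)
```

The error shrinks as the width decreases, consistent with $\int \delta(x - y) f(y)\, dy = f(x)$ in the limit. For this particular sample function the smeared value is known in closed form, $\cos(x)\, e^{-\Delta^2/4}$, which gives an independent check on the numerics.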
2.6 Operators in Infinite Dimensions
We saw that the space of functions $|f\rangle$ such that $\langle x|f\rangle = f(x)$ is a Hilbert space, which means from the general theory that we can define operators $\hat A$ on the functions, acting as $\hat A|f\rangle$. Among such operators we have the trivial ones, corresponding to the usual multiplication by a number. One nontrivial case is the differentiation operator, which can be thought of as a matrix too. It is defined as
$$\frac{d}{dx} : f(x) \to \frac{d}{dx} f(x) = f'(x), \tag{2.45}$$
but we should extend this definition to the abstract space $|f\rangle$ (without the basis $|x\rangle$) as $D$, where
$$D|f\rangle \equiv \left|\frac{df}{dx}\right\rangle. \tag{2.46}$$
Then we can also multiply by $\langle x|$, in order to obtain a matrix element,
$$\langle x|D|f\rangle = \left\langle x \,\Big|\, \frac{df}{dx} \right\rangle = \frac{df}{dx}(x). \tag{2.47}$$
On the other hand, introducing the completeness relation inside this expression, we obtain
$$\frac{df}{dx}(x) = \int dx'\, \langle x|D|x'\rangle\langle x'|f\rangle \equiv \int dx'\, D_{xx'} f(x'), \tag{2.48}$$
which means that the matrix element of $D$ is (note the integration over $x'$ in the delta function)
$$D_{xx'} = \delta'(x - x') = \delta(x - x') \frac{d}{dx'}. \tag{2.49}$$
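The idea that differentiation is a kind of matrix in $(x, x')$ space can be made concrete on a grid. The sketch below is one possible discretization, not the construction used in the text: it assumes a periodic grid on $[0, 2\pi)$ and a central-difference matrix, with the step $h$ absorbed into the matrix entries.

```python
import math

n = 200
h = 2 * math.pi / n
xs = [i * h for i in range(n)]

# Central-difference matrix: (D f)_i = (f_{i+1} - f_{i-1}) / (2 h), periodic in i.
D = [[0.0] * n for _ in range(n)]
for i in range(n):
    D[i][(i + 1) % n] = 1 / (2 * h)
    D[i][(i - 1) % n] = -1 / (2 * h)

# Discretized function f(x_i) = sin(x_i); the "integral over x'" becomes a matrix product.
f = [math.sin(x) for x in xs]
Df = [sum(D[i][j] * f[j] for j in range(n)) for i in range(n)]

max_err = max(abs(Df[i] - math.cos(xs[i])) for i in range(n))
antisymmetric = all(D[i][j] == -D[j][i] for i in range(n) for j in range(n))

print(max_err, antisymmetric)
```

The matrix reproduces the derivative up to a discretization error of order $h^2$, and it is antisymmetric under exchange of its two indices, the discrete counterpart of $\delta'(x - x')$ being odd in its argument.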
2.7 Hermitian Operators and Eigenvalue Problems
Consider a linear operator $\hat A : H \to H$ on a Hilbert space $H$. Then its adjoint $\hat A^\dagger$ is defined as before (in the finite-dimensional case), by
$$\langle \hat A^\dagger f|g\rangle \equiv \langle f|\hat A g\rangle. \tag{2.50}$$
Then a Hermitian (or self-adjoint) operator is one for which
$$\hat A = \hat A^\dagger. \tag{2.51}$$
One thing we have to be careful about is that in principle $f$ and $g$ live in different Hilbert spaces (the vector space of functions and its dual), but $\hat A = \hat A^\dagger$ is only meaningful if the domain (more precisely, the support) of the two operators is the same. This is a subtle issue, since there are operators that are formally self-adjoint, but act a priori on different spaces, and requiring that the spaces are identical imposes constraints. We have several important theorems about Hermitian operators:
Theorem For $\hat A : H \to H$ Hermitian, i.e., $\langle \hat A f|g\rangle = \langle f|\hat A g\rangle$ for all $f, g \in H$, it follows that $\hat A$ is bounded.
Theorem For $\hat A : H \to H$ ($\hat A \in L(H)$), there is a unique adjoint $\hat A^\dagger \in L(H)$, and it has the same norm, $\|\hat A^\dagger\| = \|\hat A\|$.
Theorem For $\hat A \in L(H)$ Hermitian, $\hat A^\dagger = \hat A$, the diagonal elements are real, $\langle \hat A f|f\rangle \in \mathbb{R}$.
Theorem For a Hermitian operator $\hat A \in L(H)$, $\hat A^\dagger = \hat A$, eigenvectors corresponding to different eigenvalues are orthogonal.
This leads to the same Gram–Schmidt diagonalization algorithm as that considered in the finite-dimensional case. We next consider the eigenvalue problem (or spectral problem) $\hat A f = \lambda f$. It follows that
$$(\hat A - \lambda \hat 1) f = 0. \tag{2.52}$$
Then we have another theorem:
Theorem (Hilbert–Schmidt) For a Hermitian operator $\hat A = \hat A^\dagger$, with a set of eigenvalues $\lambda_i$ and eigenvectors $f_i$, $\{\lambda_i, f_i\}_{i \in \mathbb{N}}$, the set of eigenvectors forms a complete basis, so
$$\hat A|f\rangle = \sum_{i \in \mathbb{N}} \lambda_i \langle f_i|f\rangle |f_i\rangle. \tag{2.53}$$
Theorem (Fredholm) For an operator $\hat A : H \to H$, consider the inhomogeneous and homogeneous eigenvalue equations,
$$(1): (\hat A - \lambda \hat 1) x = z, \qquad (2): (\hat A - \lambda \hat 1) x = 0. \tag{2.54}$$
Then we have two statements:
(a) (1) admits a unique solution $\Leftrightarrow$ $\lambda$ is not an eigenvalue.
(b) If $\lambda$ is an eigenvalue, then (1) admits a solution $\Leftrightarrow$ (2) admits a nontrivial solution, and $z \in H^\perp$ (the transverse subspace).
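A finite-dimensional toy version of the expansion (2.53) may help fix ideas. The sketch below uses a real symmetric $2 \times 2$ matrix, chosen here for illustration, whose orthonormal eigenvectors are known in closed form, and compares the direct action of the operator with the sum over eigenvalues and eigenvectors. The helper names are hypothetical.

```python
import math

def apply(m, v):
    """Action of a matrix on a column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def braket(w, v):
    """Scalar product <w|v>, antilinear in the first argument."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

# Sample Hermitian operator with eigenvalues 3 and 1.
A = [[2, 1],
     [1, 2]]
s = 1 / math.sqrt(2)
eigen = [(3, [s, s]),     # (eigenvalue, normalized eigenvector)
         (1, [s, -s])]

f = [1 + 1j, 2 - 3j]      # arbitrary vector to act on

direct = apply(A, f)
spectral = [sum(lam * braket(fi, f) * fi[k] for lam, fi in eigen)
            for k in range(2)]

print(direct)
print(spectral)
```

Both lists should agree: acting with the operator is the same as projecting onto each eigenvector, multiplying by the eigenvalue, and summing.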
2.8 The Operator $D_{xx'}$
We can ask, is the operator $D_{xx'}$ from (2.49) Hermitian? No, but from it we can make a potentially Hermitian operator, $-iD_{xx'}$, since
$$-iD_{xx'} = (-iD_{x'x})^* = +iD_{x'x}, \tag{2.55}$$
which can be rewritten as
$$-i\delta'(x - x') = +i\delta'(x' - x). \tag{2.56}$$
But if the domain of definition of the functions is actually finite (as opposed to just having a compact support), there can be an issue with boundary terms. Indeed, for a Hermitian operator $\hat A$, we have the equality
$$\langle g|\hat A|f\rangle = \langle g|\hat A f\rangle = \langle \hat A f|g\rangle^* = \langle f|\hat A|g\rangle^*, \tag{2.57}$$
and by introducing two completeness relations we can write the relations in terms of the $(x, x')$ matrix components of the operators:
$$\int_a^b dx \int_a^b dx'\, \langle g|x\rangle\langle x|\hat A|x'\rangle\langle x'|f\rangle = \left[\int_a^b dx \int_a^b dx'\, \langle f|x\rangle\langle x|\hat A|x'\rangle\langle x'|g\rangle\right]^*. \tag{2.58}$$
We want to see whether this can be satisfied by $A_{xx'} = -iD_{xx'}$. The left-hand side becomes
$$\int_a^b dx \int_a^b dx'\, g^*(x) \left(-i\delta(x - x') \frac{d}{dx'}\right) f(x') = \int_a^b dx\, g^*(x) \left(-i\frac{d}{dx}\right) f(x), \tag{2.59}$$
whereas the right-hand side gives, by partial integration,
$$\left[\int_a^b dx \int_a^b dx'\, f^*(x) \left(-i\delta(x - x') \frac{d}{dx'}\right) g(x')\right]^* = -i\int_a^b dx\, g^*(x) \frac{d}{dx} f(x) + i g^*(x) f(x)\Big|_a^b. \tag{2.60}$$
That means that we get equality only if the boundary term
$$+ i g^*(x) f(x)\Big|_a^b \tag{2.61}$$
vanishes. The eigenvalue problem for $\hat A = -iD_{xx'}$, defined as
$$\hat A|k\rangle = k|k\rangle, \tag{2.62}$$
gives
$$k\langle x|k\rangle = \langle x|\hat A|k\rangle = \int dx'\, \langle x|\hat A|x'\rangle\langle x'|k\rangle, \tag{2.63}$$
which in turn becomes (since $A_{xx'} = -iD_{xx'} = -i\delta(x - x')\, d/dx'$ and can be integrated, and defining $\langle x|k\rangle \equiv \psi_k(x)$)
$$-i\frac{d}{dx} \psi_k(x) = k\psi_k(x), \tag{2.64}$$
which is solved by
$$\psi_k(x) = A e^{ikx} = \frac{1}{\sqrt{2\pi}} e^{ikx}. \tag{2.65}$$
The normalization constant $A$ has been chosen so that $\langle k|k'\rangle = \delta(k - k')$. Note then that in this new $|k\rangle$ basis we have that
$$-iD_{kk'} = \langle k| -iD |k'\rangle = k'\langle k|k'\rangle = k'\delta(k - k') \tag{2.66}$$
is diagonal.
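The role of the boundary term (2.61) can be checked numerically with plane waves on a finite interval. In the sketch below, $f = e^{ikx}$ and $g = e^{iqx}$ are used (the normalization is dropped, since it plays no role here), and the two sides of the Hermiticity condition for $-i\, d/dx$ are compared. The midpoint quadrature, the sample values of $k$ and $q$, and the function names are assumptions made for illustration.

```python
import cmath
import math

def integrate(func, a, b, n=4000):
    """Midpoint-rule approximation of int_a^b func(x) dx."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h

def hermiticity_check(k, q, a, b):
    """Return (lhs, rhs, boundary) for A = -i d/dx, f = exp(ikx), g = exp(iqx)."""
    f = lambda x: cmath.exp(1j * k * x)
    g = lambda x: cmath.exp(1j * q * x)
    Af = lambda x: k * f(x)     # -i d/dx exp(ikx) = k exp(ikx)
    Ag = lambda x: q * g(x)
    lhs = integrate(lambda x: g(x).conjugate() * Af(x), a, b)              # <g|A f>
    rhs = integrate(lambda x: f(x).conjugate() * Ag(x), a, b).conjugate()  # <f|A g>*
    boundary = 1j * (g(b).conjugate() * f(b) - g(a).conjugate() * f(a))    # i g* f |_a^b
    return lhs, rhs, boundary

# Generic wave numbers on [0, 1]: the two sides differ by the boundary term.
lhs, rhs, boundary = hermiticity_check(1.0, 2.5, 0.0, 1.0)
print(rhs - lhs, boundary)

# Wave numbers compatible with periodic boundary conditions: the boundary term vanishes.
lhs_p, rhs_p, boundary_p = hermiticity_check(2 * math.pi, 6 * math.pi, 0.0, 1.0)
print(rhs_p - lhs_p, boundary_p)
```

In the first case the mismatch between the two sides equals the boundary term; in the second case both vanish, so on such functions $-i\, d/dx$ acts as a Hermitian operator.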
Important Concepts to Remember
• The distance on a pre-Hilbert space is $d(x, y) = \|x - y\|$, with the norm coming from a scalar product, and a Hilbert space is a pre-Hilbert space that is also complete, meaning that the limit of any Cauchy sequence in the space also belongs to the space.
• A Banach space is like a Hilbert space, except that the norm need not come from a scalar product.
• For quantum mechanics, we need infinite-dimensional Hilbert spaces in which the vectors are functions. To discretize, $f(x_i) = \langle x_i|f\rangle$ but, in the continuous case, $|x_i\rangle$ is replaced by the continuous variable $|x\rangle$.
• The scalar product on the Hilbert space of functions is $\langle f|g\rangle = \int_a^b f^*(x) g(x)\, dx$, but the completeness relation for the $|x\rangle$ basis vectors is $\int_a^b dx\, |x\rangle\langle x| = \hat 1$, reducing the previous definition of a scalar product to a trivial identity.
• Distributions are linear and continuous maps that act on the space of some test functions and give a complex number result. For instance, the distribution $T = \tilde f$ is defined such that $\langle \tilde f|\phi\rangle = \int_A f^*(x) \phi(x)\, dx$.
• The “delta function” distribution $\delta_{x_0}$ is defined by $\langle \delta_{x_0}|\phi\rangle = \phi(x_0)$, or, more commonly, by $\phi(x_0) = \int \delta(x - x_0) \phi(x)\, dx$.
• The derivative of a distribution is defined by $\langle \partial T/\partial x_j|\phi\rangle = -\langle T|\partial \phi/\partial x_j\rangle$.
• The Heaviside function is a function, but it can also be associated with a distribution. As a distribution, $H'_{x_0} = \delta_{x_0}$.
• The orthonormality relation of the $|x\rangle$ vectors is $\langle x|y\rangle = \delta(x - y)$.
• Nontrivial operators on the Hilbert space of functions are those that involve derivatives, in particular the operator $D$, with $D|f\rangle = |df/dx\rangle$ and with matrix element $D_{xx'} = \delta'(x - x') = \delta(x - x')\, d/dx'$.
• For Hermitian (or self-adjoint) operators $\hat A^\dagger = \hat A$ there are many solutions to the eigenvalue-eigenstate problem (or spectral problem) $(\hat A - \lambda \hat 1) f = 0$. In fact, the set of (linearly independent) eigenvectors forms a complete basis for the Hilbert space.
• The operator $-iD_{xx'}$, thought of as a distribution or as an operator on the space of functions, is Hermitian, but only if the Hilbert space (the “space of test functions for the distribution”) involves functions with compact support or otherwise if the boundary terms match for the matrix element with two functions.
Further Reading See, for instance, Shankar’s book [3] for more details about spaces of functions, from the point of view of a physicist.
Exercises
(1) Discretize the product of two functions, as compared to discretizing each function independently, and describe what that means in the language of kets.
(2) For a tensor product of kets, describe what the norm is in the abstract sense, and then in the function form (with integrals).
(3) Is the square of the “delta function” a distribution? If so, prove it using Dirac’s bra-ket notation.
(4) Show how $\delta''(x - y)$ (the second derivative with respect to $x$) acts as a distribution on functions.
(5) Is a real function of a Hermitian operator $\hat A$, $f(\hat A)$, also Hermitian? Give examples.
(6) Consider the unitary operator $e^{i\hat A}$, with $\hat A$ Hermitian and acting on function space. Can it be diagonalized? If so, write an expression for its diagonal elements.
(7) Diagonalize $e^{\sigma_1}$, where $\sigma_1$ is the first Pauli matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
3
The Postulates of Quantum Mechanics and the Schrödinger Equation After developing a somewhat lengthy mathematical background, we are now ready to define quantum mechanics from some first principles, or postulates. The word “postulates” implies some mathematical-type axiomatic system, but the situation is not quite so clear cut. There isn’t a mathematically rigorous collection of assumptions, only a set of rules that can be presented in many ways. The number, order, and content of the postulates varies, though the total physical content of the system is the same. Here I will present my own viewpoint on the postulates. We will start with the postulates themselves, and then we will explain them and add comments. The crucial difference from classical mechanics is that we don’t any longer have classical paths defined by a Hamiltonian H (pi , qi ) and initial conditions for the phase-space variables (pi , qi ) (giving a “state” at time t). Instead of that, we have probabilities for everything we can observe, including analogs of classical variables like (pi , qi ); but there are also new variables that have no classical counterpart. We thus have to define states, observables, probabilities for them, time evolution, and the postulates dealing with these concepts.
3.1 The Postulates
First postulate, P1
At every time $t$, the state of a physical system is defined by a ket $|\psi\rangle$ in a Hilbert space.
Second postulate, P2
For every observable described classically by a quantity $A$, there is a Hermitian operator $\hat A$ acting on the physical states $|\psi\rangle$ of the system. The fact that the operator is Hermitian means that its eigenvalues are real, which as we will shortly see means that the observables are real, as they should be.
Third postulate, P3
The only possible result of a measurement of an observable is an eigenvalue $\lambda$ of the operator corresponding to it:
$$\hat A|\psi_\lambda\rangle = \lambda|\psi_\lambda\rangle. \tag{3.1}$$
Note that if the spectrum of the operator $\hat A$ is discrete, there is a discrete number of values $\lambda_n$, explaining the “quantum” in “quantum mechanics”. Since the operators $\hat A$ corresponding to observables are Hermitian, it follows that the Hilbert space has a basis $|n\rangle$ (corresponding to $\lambda_n$) formed by eigenvectors of the operator $\hat A$, so every state can be expanded into it,
$$|\psi\rangle = \sum_n c_n |n\rangle. \tag{3.2}$$
Fourth postulate, P4
The probability of obtaining an eigenvalue $\lambda_n$ corresponding to the eigenstate $|v_n\rangle$ in a measurement of the observable $\hat A$ in a normalized state $|\psi\rangle$ is
$$P_n = |\langle v_n|\psi\rangle|^2, \tag{3.3}$$
where, besides $|\psi\rangle$, the $|v_n\rangle$ are also normalized states. Here we have to assume that the sum of all probabilities is one, $\sum_n P_n = 1$.
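As a small worked instance of the fourth postulate, the sketch below computes the probabilities $P_n = |\langle v_n|\psi\rangle|^2$ for a two-state system. The state, the eigenbasis (the normalized vectors $(1, \pm 1)/\sqrt{2}$) and the labels are illustrative choices made here, not taken from the text.

```python
import math

def braket(w, v):
    """Scalar product <w|v>, antilinear in the first argument."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

s = 1 / math.sqrt(2)
eigenbasis = {"plus": [s, s], "minus": [s, -s]}   # orthonormal eigenvectors |v_n>

psi = [0.6, 0.8]                                  # normalized state: 0.36 + 0.64 = 1

probabilities = {label: abs(braket(v_n, psi)) ** 2
                 for label, v_n in eigenbasis.items()}

print(probabilities)
print(sum(probabilities.values()))
```

Here $\langle v_\pm|\psi\rangle = (0.6 \pm 0.8)/\sqrt{2}$, so the two probabilities are $0.98$ and $0.02$, and they add up to one because the state is normalized and the basis is orthonormal and complete.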
Fifth postulate, P5
After a measurement of a variable corresponding to an operator $\hat A$, giving as a result the eigenvalue $\lambda_n$ corresponding to the eigenvector $|v_n\rangle$, the state of the system has changed from the original $|\psi\rangle$ to $|v_n\rangle$. This is the strangest of the postulates, which clashes most with our intuition since it implies a discontinuous, nonlinear, change. But it is experimentally found to be true. If we re-measure the same quantity right after the first measurement, we always find the same result $\lambda_n$.
Sixth postulate, P6
The time evolution of a state is given by
$$|\psi(t)\rangle = \hat U(t, t_0)|\psi(t_0)\rangle, \tag{3.4}$$
where the operator $\hat U$ is unitary, $\hat U^\dagger \hat U = \hat 1$, in order to preserve norms, and thus the total probability (which equals 1), i.e.,
$$\langle \psi(t)|\psi(t)\rangle = \langle \psi(t_0)|\psi(t_0)\rangle. \tag{3.5}$$
This evolution operator (also known as a “propagator”) is found by imposing the requirement that the time dependence of the state satisfies the Schrödinger equation,
$$i\hbar \frac{d}{dt}|\psi(t)\rangle = \hat H|\psi(t)\rangle. \tag{3.6}$$
Note that here we have assumed that states change with time but operators corresponding to observables do not, which is known as the “Schrödinger picture”, but there are other pictures, as we will see later on in the book. We next start to comment on and expand on the various postulates.
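The sixth postulate can be illustrated on a two-state system. The sketch below assumes a time-independent Hamiltonian $\hat H = \hbar\omega\, \sigma_1$, for which the standard solution is $\hat U(t, 0) = e^{-i\hat H t/\hbar} = \cos(\omega t)\, \hat 1 - i \sin(\omega t)\, \sigma_1$; this closed form, the units with $\hbar = 1$, and all names in the code are assumptions made here and are not taken from the text. The code checks that the norm is preserved and that the Schrödinger equation holds, with the time derivative approximated by a central difference.

```python
import math

HBAR = 1.0    # assumption: units with hbar = 1
OMEGA = 2.0   # assumption: sample frequency

SIGMA_1 = [[0, 1], [1, 0]]
H = [[HBAR * OMEGA * e for e in row] for row in SIGMA_1]

def U(t):
    """Evolution operator exp(-i H t / hbar) for H = hbar * omega * sigma_1."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s],
            [-1j * s, c]]

def apply(m, v):
    """Action of a matrix on a column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def norm_sq(v):
    """Squared norm <v|v>."""
    return sum(abs(x) ** 2 for x in v)

psi0 = [0.6, 0.8j]            # normalized initial state
t, dt = 0.37, 1e-5

psi_t = apply(U(t), psi0)

# Central-difference approximation to d|psi>/dt at time t.
psi_plus, psi_minus = apply(U(t + dt), psi0), apply(U(t - dt), psi0)
dpsi_dt = [(p - m) / (2 * dt) for p, m in zip(psi_plus, psi_minus)]

lhs = [1j * HBAR * d for d in dpsi_dt]   # i hbar d|psi>/dt
rhs = apply(H, psi_t)                    # H |psi(t)>

print(norm_sq(psi0), norm_sq(psi_t))
print(lhs, rhs)
```

The squared norm stays equal to one, and the two sides of the Schrödinger equation agree up to the finite-difference error.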
3.2 The First Postulate
As we have just seen, the state of a system is defined at a given time (in the Schrödinger picture), but it depends on time, with the time variation given by the Schrödinger equation. We described abstract states $|\psi\rangle$, but for states described by quantities that depend on spatial coordinates, such as, for instance, states corresponding to single particles with a classical position, we can define the function
$$\langle x|\psi\rangle = \psi(x), \tag{3.7}$$
known as the wave function. Since $|x\rangle$ is an eigenvector of the position operator $\hat X$, from postulate 4 it follows that if we measure the position through $\hat X$ then $|v_n\rangle \to |x\rangle$, so the probability (density) of obtaining the value $x$ is
$$|\langle v_n|\psi\rangle|^2 \to |\langle x|\psi\rangle|^2 = |\psi(x)|^2. \tag{3.8}$$
Thus it is this function that defines position probabilities through its modulus squared, justifying the name wave function.
3.3 The Second Postulate
One observation to make is that there is some ambiguity about the operator $\hat A$ corresponding to a classical mechanics quantity $A$: there is an issue concerning the order of the operators (such as $\hat X$, $\hat P$) involved in the expression for a composite operator. For instance, consider a Hamiltonian $H(p, q)$ that classically contains a term $pq^2$. Quantum mechanically, should we replace that with $\hat P \hat Q^2$, $\hat Q \hat P \hat Q$, $\hat Q^2 \hat P$, or some linear combination of the three possibilities?
The other observation is that for a Hermitian operator $\hat A$, the eigenvalues $\lambda_n$ in $\hat A|v_n\rangle = \lambda_n|v_n\rangle$ are real, and the $|v_n\rangle$ eigenstates form a complete basis for the Hilbert space, so any state $|\psi\rangle$ can be expanded in them:
$$|\psi\rangle = \sum_n c_n |v_n\rangle. \tag{3.9}$$
It could be, however (and this is the more generic case), that there are other observables that can also be measured, which could mean that there are several states of given eigenvalue $\lambda_n$ for $\hat A$, which are distinguished by some other index $b$ (corresponding perhaps to eigenvalues of some other observables), i.e.,
$$\hat A|v_n, b\rangle = \lambda_n|v_n, b\rangle. \tag{3.10}$$
But we can also have a linear combination of the states with index $b$, so
$$\hat A \left(\sum_b c_b |v_n, b\rangle\right) = \lambda_n \left(\sum_b c_b |v_n, b\rangle\right). \tag{3.11}$$
Then an arbitrary state can still be expanded in this set, but now we write
$$|\psi\rangle = \sum_{n,b} c_{n,b} |v_n, b\rangle. \tag{3.12}$$
3.4 The Third Postulate
As we saw, we measure only eigenvalues of operators, which can be discrete, i.e., indexed by a natural number, $\lambda_n$. This is the reason we talk about quantum mechanics.
(1) The eigenvalues, and the corresponding eigenstates associated with them, can be not only discrete but also finite in number, in which case we have a finite-dimensional Hilbert space. The standard example is the system with only two states that arises from a spin 1/2 particle: spin up $\uparrow$ or spin down $\downarrow$, or more precisely
\[ |s_z = +1/2\rangle \quad {\rm and} \quad |s_z = -1/2\rangle. \tag{3.13} \]
This system has no classical counterpart.
(2) Another possibility is a discrete spectrum for an operator that corresponds to a classical continuous variable, which nevertheless has a countable (indexable by $\mathbb{N}$) set of eigenvalues. As we said, in this case we talk about quantization of the observable. The standard example in this case is the energy of an electron in a hydrogen atom, $E_n$. It could also be, as in this case, that the energy spectrum is degenerate, meaning that there are other observables besides $\hat{H}$ (giving the energy) that add an index $b$ to the state,
\[ \hat{H}|\psi_n, b\rangle = E_n|\psi_n, b\rangle. \tag{3.14} \]
3.5 The Fourth Postulate
Since the probability of finding the eigenvalue $\lambda_i$ corresponding to eigenstate $|v_i\rangle$ of $\hat{A}$ is
\[ P_i = |\langle v_i|\psi\rangle|^2, \tag{3.15} \]
and we can expand the state in the complete and orthonormal ($\langle v_i|v_j\rangle = \delta_{ij}$) set of the eigenstates of $\hat{A}$, we write
\[ |\psi\rangle = \sum_i c_i|v_i\rangle, \tag{3.16} \]
so that the probability of measuring $\lambda_i$ is
\[ P_i = |c_i|^2. \tag{3.17} \]
Since the sum of all the probabilities is one, we obtain
\[ \sum_i P_i = \sum_i |c_i|^2 = 1, \tag{3.18} \]
which amounts (since the $|v_i\rangle$ are normalized states) to normalization of the state,
\[ \langle\psi|\psi\rangle = \sum_{i,j} c_i^* c_j \langle v_i|v_j\rangle = \sum_i |c_i|^2 = 1. \tag{3.19} \]
This also means that only normalizable states $|\psi\rangle$ can be physical (hence our definition for the Hilbert space), since normalizable states imply that we can construct a probability set that sums to one.
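As a small numerical illustration of (3.15)–(3.19), the sketch below turns a list of expansion coefficients into measurement probabilities. The function name `born_probabilities` and the sample coefficients are hypothetical, chosen only for illustration; the coefficients are normalized first, so the probabilities sum to one.

```python
import math

def born_probabilities(coeffs):
    """Return P_i = |c_i|^2 for a state expanded in an orthonormal eigenbasis.

    The coefficients are rescaled so that sum_i |c_i|^2 = 1 before squaring.
    """
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [abs(c / norm) ** 2 for c in coeffs]

# Hypothetical, unnormalized coefficients c_i = <v_i|psi>
coeffs = [1 + 1j, 2, -1j]
probs = born_probabilities(coeffs)
print(probs, sum(probs))
```

Only the moduli of the coefficients enter, so an overall phase of the state has no effect on the probabilities.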
In the case of degenerate states, $\hat{A}|v_n, b\rangle = \lambda_n|v_n, b\rangle$, instead of projecting onto a single state $\langle v_i|$, we project onto the set of states with the same eigenvalue $\lambda_n$, using the projector $P_{V_n}$, so onto the state
\[ (P_{V_n}|\psi\rangle)^\dagger. \tag{3.20} \]
A projector must satisfy the relations
\[ P_{V_n}^\dagger = P_{V_n}, \qquad P_{V_n}^2 = P_{V_n}, \tag{3.21} \]
the first stating that the projections onto the bra and ket are identical, and the second that projecting an already projected state changes nothing. It is easy to realize that the projector must be the part of the completeness relation involving only the states in question, so in our case
\[ P_{V_n} = \sum_a |v_n, a\rangle\langle v_n, a|. \tag{3.22} \]
Acting on the general state
\[ |\psi\rangle = \sum_{m,b} c_{m,b}|v_m, b\rangle \tag{3.23} \]
gives
\[ P_{V_n}|\psi\rangle = \sum_{a,m,b} |v_n, a\rangle\langle v_n, a|v_m, b\rangle c_{m,b} = \sum_a c_{n,a}|v_n, a\rangle, \tag{3.24} \]
where the last step uses the orthonormality $\langle v_n, a|v_m, b\rangle = \delta_{nm}\delta_{ab}$. Then we must replace $P_n = |\langle v_n|\psi\rangle|^2 = |c_n|^2$ by the more general
\[ P_n = \langle\psi|P_{V_n}|\psi\rangle = \langle P_{V_n}\psi|P_{V_n}\psi\rangle = \sum_a c_{n,a}^* c_{n,a}. \tag{3.25} \]
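The projector relations (3.21)–(3.25) can be sketched with the state stored as a dictionary of coefficients $c_{m,b}$ keyed by the pair $(m, b)$. The helper names and the sample state below are hypothetical; the sketch assumes the basis $|v_m, b\rangle$ is orthonormal, as in the text.

```python
def project(coeffs, n):
    """Apply P_{V_n}: keep only the components with eigenvalue index n."""
    return {key: (c if key[0] == n else 0.0) for key, c in coeffs.items()}

def degenerate_probability(coeffs, n):
    """P_n = sum_a |c_{n,a}|^2 for a normalized state, as in (3.25)."""
    return sum(abs(c) ** 2 for (m, _b), c in coeffs.items() if m == n)

# Hypothetical normalized state: level n = 1 is doubly degenerate (b = 0, 1)
coeffs = {(0, 0): 0.6, (1, 0): 0.48j, (1, 1): 0.64}

once = project(coeffs, 1)
twice = project(once, 1)          # P^2 = P: projecting again changes nothing
p_1 = degenerate_probability(coeffs, 1)
print(p_1)
```

The probability of the degenerate level is the squared norm of the projected state, which is why the two expressions in (3.25) agree.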
Finally, we come to the possibility of having a continuous spectrum for an operator. As we said, this is the case for the position operator, with matrix elements
\[ \langle x|\hat{X}|x'\rangle = x\,\delta(x - x'), \tag{3.26} \]
and the momentum operator, with matrix elements
\[ \langle x|\hat{P}|x'\rangle = -i\hbar\,\delta'(x - x'). \tag{3.27} \]
The position operator has eigenstates $|x\rangle$,
\[ \hat{X}|x\rangle = x|x\rangle, \tag{3.28} \]
which means we can define the wave function $\langle x|\psi\rangle \equiv \psi(x)$. In terms of it, the abstract relation for eigenvectors of operators,
\[ \hat{A}|\psi\rangle = \lambda|\psi\rangle \tag{3.29} \]
becomes, by inserting the identity as a completeness relation using the states $|x'\rangle$,
\[ \int dx'\, \langle x|\hat{A}|x'\rangle\langle x'|\psi\rangle = \lambda\langle x|\psi\rangle. \tag{3.30} \]
Writing this in terms of wave functions, we obtain
\[ \int dx'\, A_{xx'}\psi(x') = \lambda\psi(x). \tag{3.31} \]
In the case of the momentum operator, $\hat{A} = \hat{P}$, we obtain
\[ \int dx'\, \left(-i\hbar\,\delta'(x - x')\right)\psi(x') = p\,\psi(x), \tag{3.32} \]
and, since $\delta'(x - x')$ acts as $\delta(x - x')\,d/dx'$, we finally obtain
\[ -i\hbar\frac{d\psi(x)}{dx} = p\,\psi(x). \tag{3.33} \]
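Equation (3.33) can be checked numerically on a plane wave, which is the standard solution of this eigenvalue equation. The sketch below is only an illustration: it assumes units with $\hbar = 1$, uses a central finite difference in place of the derivative, and the names `momentum_on` and `plane_wave` are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1

def momentum_on(psi, x, h=1e-5):
    """Approximate (-i hbar d/dx) psi at x with a central finite difference."""
    return -1j * HBAR * (psi(x + h) - psi(x - h)) / (2 * h)

p = 2.5  # hypothetical momentum eigenvalue

def plane_wave(x):
    """psi(x) = exp(i p x / hbar), which solves -i hbar psi' = p psi."""
    return cmath.exp(1j * p * x / HBAR)

x0 = 0.3
ratio = momentum_on(plane_wave, x0) / plane_wave(x0)
print(ratio)  # numerically close to p, up to finite-difference error
```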
On the other hand, for the position operator $\hat{A} = \hat{X}$, since $\hat{X}_{xx'} = x\,\delta(x - x')$ we obtain just an identity, $x\psi(x) = x\psi(x)$.
We have presented three cases, of finite-dimensional Hilbert spaces (such as for electron spin), of discrete countable states (such as the energy states of the hydrogen atom), and of continuous states (such as position and momentum). But we can have a system that has eigenvalues corresponding to more than one observable, for instance both spin and position like the electron. In such a case, the total Hilbert space is a tensor product of the individual Hilbert spaces,
\[ \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2. \tag{3.34} \]
Example
For the electron, with spin 1/2, consider that $\mathcal{H}_1$ corresponds to spin 1/2, with basis $|s_z = +1/2\rangle$, $|s_z = -1/2\rangle$ (or $|+\rangle$, $|-\rangle$), and $\mathcal{H}_2$ corresponds to position, with $\langle x|\psi\rangle = \psi(x)$. Then we consider states of the type $|s\rangle \otimes |\psi\rangle$ and basis states of the type $\langle +| \otimes \langle x|$.
If we have two independent observables that can be diagonalized simultaneously (so we can measure their eigenvalues simultaneously), corresponding to operators $\hat{A}_1$, $\hat{A}_2$, so that
\[ \hat{A}_1|\lambda_1, \lambda_2\rangle = \lambda_1|\lambda_1, \lambda_2\rangle, \qquad \hat{A}_2|\lambda_1, \lambda_2\rangle = \lambda_2|\lambda_1, \lambda_2\rangle, \tag{3.35} \]
we can take the commutator of the two operators, and obtain
\[ (\hat{A}_1\hat{A}_2 - \hat{A}_2\hat{A}_1)|\lambda_1, \lambda_2\rangle = 0. \tag{3.36} \]
Then, more generally (since the vectors $|\lambda_1, \lambda_2\rangle$ form a basis for the Hilbert space), we can write a commutation condition for operators:
\[ [\hat{A}_1, \hat{A}_2] = 0. \tag{3.37} \]
Thus operators corresponding to independent observables commute.
From the fourth postulate we also obtain a result for the experimentally measured average value of an observable, which is sometimes included as part of the postulates:
\[ \langle A\rangle \equiv \langle\psi|\hat{A}|\psi\rangle = \sum_{n,m} c_n^* c_m\langle v_n|\hat{A}|v_m\rangle = \sum_n |c_n|^2 a_n = \sum_n P_n a_n, \tag{3.38} \]
where we have used $\hat{A}|v_n\rangle = a_n|v_n\rangle$ and the orthonormality of the eigenstates. Since the right-hand side ($\sum_n P_n a_n$) is the experimentally measured average value, it follows that the average value of $\hat{A}$ in the state $|\psi\rangle$ is consistently defined as
\[ \langle A\rangle \equiv \langle\psi|\hat{A}|\psi\rangle. \tag{3.39} \]
We also saw that the sum of probabilities must be 1, which means that states must be normalized, $\sum_n P_n = \sum_n |c_n|^2 = 1$. But this relation, the conservation of probability, must be preserved by any transformation, such as a change of basis by an operator $\hat{U}$, or time evolution through the operator $\hat{U}(t, t_0)$. That means in both cases that $\hat{U}$ must be unitary, since
\[ |\psi\rangle \to \hat{U}|\psi\rangle \;\Rightarrow\; \langle\psi|\psi\rangle \to \langle\psi|\hat{U}^\dagger\hat{U}|\psi\rangle = \langle\psi|\psi\rangle \tag{3.40} \]
implies $\hat{U}^\dagger\hat{U} = \hat{1}$.
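The two statements above, the average value (3.38) and the preservation of the norm under a unitary transformation (3.40), can be illustrated in a two-dimensional example. The observable, its eigenvectors, the state, and the rotation angle below are all hypothetical choices made for the sketch.

```python
import math

def inner(u, v):
    """<u|v> for vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M|v>."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

s = 1 / math.sqrt(2)
A = [[0.0, 1.0], [1.0, 0.0]]              # hypothetical observable, eigenvalues +1 and -1
eigen = [(+1.0, [s, s]), (-1.0, [s, -s])]  # pairs (a_n, |v_n>)

psi = [0.6, 0.8]                           # hypothetical normalized state

direct = inner(psi, apply(A, psi)).real                          # <psi|A|psi>
weighted = sum(a * abs(inner(v, psi)) ** 2 for a, v in eigen)    # sum_n P_n a_n

# A unitary change of basis (here a real rotation) preserves the norm
t = 0.7
U = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
rotated = apply(U, psi)
norm_after = inner(rotated, rotated).real
print(direct, weighted, norm_after)
```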
3.6 The Fifth Postulate
As we said, this is the weirdest postulate, the one that most contradicts our common intuition but is experimentally verified. On the one hand it looks very singular, since the state changes abruptly by measurement, which is an interaction with a classical apparatus, and on the other it seems to happen instantaneously. Moreover, it is a nonlinear process. Other quantum processes, as we just saw, are obtained by unitary evolution (in either time, or change of basis), but this one process is nonunitary as well as nonlinear. Since classical processes are supposed to appear in the limit of a large number of components (as a macroscopic object is formed of a very large number of atoms), they should also be related somehow to quantum mechanical processes. Yet measurements, interactions with a classical system, somehow give a nonlinear and nonunitary process. This is very puzzling, and its interpretation is still the subject of some debate.
3.7 The Sixth Postulate
We now consider the Schrödinger equation, and how it leads to the time evolution operator. The eigenstates $|E\rangle$, corresponding to eigenvalues, or energies $E$, of the Hamiltonian $\hat{H}$ form a complete set, so we introduce their completeness relation in the definition of a state $|\psi(t)\rangle$, to find
\[ |\psi(t)\rangle = \sum_E |E\rangle\langle E|\psi(t)\rangle \equiv \sum_E \psi_E(t)|E\rangle, \tag{3.41} \]
and substitute in the time-dependent Schrödinger equation
\[ i\hbar\frac{\partial}{\partial t}|\psi(t)\rangle = \hat{H}|\psi(t)\rangle, \tag{3.42} \]
while imposing the time-independent Schrödinger equation, or eigenvalue problem for the Hamiltonian,
\[ \hat{H}|E\rangle = E|E\rangle. \tag{3.43} \]
We obtain
\[ \left(i\hbar\frac{\partial}{\partial t} - \hat{H}\right)|\psi(t)\rangle = \sum_E \left(i\hbar\frac{d}{dt}\psi_E(t) - E\psi_E(t)\right)|E\rangle = 0, \tag{3.44} \]
with solution
\[ \psi_E(t) = e^{-iE(t - t_0)/\hbar}\psi_E(t_0), \tag{3.45} \]
leading to the full time-dependent state
\[ |\psi(t)\rangle = \sum_E e^{-iE(t - t_0)/\hbar}\psi_E(t_0)|E\rangle. \tag{3.46} \]
Then the time evolution operator is
\[ \hat{U}(t, t_0) = \sum_E |E\rangle\langle E|\,e^{-iE(t - t_0)/\hbar}. \tag{3.47} \]
For a degenerate energy spectrum, we will have a sum over states $|E, a\rangle$:
\[ |\psi(t)\rangle = \sum_{E,a} e^{-iE(t - t_0)/\hbar}\psi_{E,a}(t_0)|E, a\rangle, \qquad \hat{U}(t, t_0) = \sum_{E,a} |E, a\rangle\langle E, a|\,e^{-iE(t - t_0)/\hbar}. \tag{3.48} \]
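In the energy eigenbasis, (3.45)–(3.47) say that time evolution only multiplies each component by a phase. A minimal sketch, assuming units with $\hbar = 1$ and a hypothetical three-level spectrum (the name `evolve` is also hypothetical):

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1

def evolve(components, energies, t, t0=0.0):
    """psi_E(t) = exp(-i E (t - t0) / hbar) psi_E(t0), component by component."""
    return [cmath.exp(-1j * E * (t - t0) / HBAR) * c
            for c, E in zip(components, energies)]

energies = [0.5, 1.5, 2.5]   # hypothetical spectrum
psi0 = [0.6, 0.0, 0.8]       # hypothetical components psi_E(t0), normalized

psi_t = evolve(psi0, energies, t=1.3)
norm_t = sum(abs(c) ** 2 for c in psi_t)

# Composition: evolving t0 -> 1.0 -> 1.3 agrees with evolving t0 -> 1.3 directly
two_steps = evolve(evolve(psi0, energies, t=1.0), energies, t=1.3, t0=1.0)
print(norm_t)
```

Because each factor is a pure phase, the probabilities $|\psi_E(t)|^2$ of the individual energies do not change in time, and the norm is conserved.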
3.8 Generalization of States to Ensembles: the Density Matrix
Until now we have considered the “pure quantum mechanical” case, where the state of the system is defined by a unique state $|\psi\rangle$ in a Hilbert space, also known as a “pure state”. However, in practice, many important cases correspond to a combination of quantum mechanical and classical systems. Specifically, instead of having a pure quantum state, we have a “collection” or ensemble of states in which the system may be found, each one with its own classical probability $p_i$. We characterize this situation by a “density matrix” (though it is really an operator, and it should be thought of this way),
\[ \hat{\rho} = \sum_i p_i|i\rangle\langle i|. \tag{3.49} \]
Note that $|i\rangle$ can be any kind of state, not necessarily an eigenstate of some observable. Then we say that the average value of an observable associated with an operator $\hat{A}$ is the result of a double averaging: first a classical averaging (denoted with a line over the operator) giving probabilities $p_i$, then the standard quantum averaging over states $|i\rangle$,
\[ \overline{\langle A\rangle} = \sum_i p_i\langle i|\hat{A}|i\rangle. \tag{3.50} \]
This average is obtained from the quantity
\[ {\rm Tr}(\hat{\rho}\hat{A}) = \sum_n\sum_i p_i\langle n|i\rangle\langle i|\hat{A}|n\rangle = \sum_i p_i\sum_n |\langle i|n\rangle|^2\lambda_n = \sum_i p_i\langle A\rangle_i = \overline{\langle A\rangle}, \tag{3.51} \]
where in the first equality we have used the fact that there is an orthonormal basis $|n\rangle$ of eigenstates of $\hat{A}$, and we have taken the trace of the corresponding matrix, and in the second we have used the eigenvalue equation, $\hat{A}|n\rangle = \lambda_n|n\rangle$, and finally we have used that the quantum probability of finding the state $|i\rangle$ in the state $|n\rangle$ is $P_{i,n} = |\langle i|n\rangle|^2$. Note that the density matrix is normalized, i.e., we assume that the sum of all the probabilities is one, which amounts to (setting formally $\hat{A} = \hat{1}$, or $\lambda_n = 1$)
\[ {\rm Tr}\,\hat{\rho} = \sum_i p_i\sum_n |\langle i|n\rangle|^2 = \sum_i p_i\sum_n P_{i,n} = 1. \tag{3.52} \]
To come back to the original case, of a “pure state” (as opposed to a “mixed state”, i.e., an ensemble with a density matrix), or, more properly stated, a “pure ensemble”, we now take into account that $p_i = \delta_{i,\psi}$, thus the density matrix is
\[ \hat{\rho} = |\psi\rangle\langle\psi|. \tag{3.53} \]
In this case, the average reduces to the quantum average, since
\[ \overline{\langle A\rangle} = {\rm Tr}(\hat{\rho}\hat{A}) = \sum_n \langle n|\psi\rangle\langle\psi|\hat{A}|n\rangle = \sum_n \lambda_n|\langle n|\psi\rangle|^2 = \langle A\rangle_\psi, \tag{3.54} \]
the same as for a pure state.
The formalism of density matrices is relevant for the transition from quantum to classical regimes which, as we alluded to already, is a tricky issue. As we said, we have a combination of classical and quantum aspects, and in the presence of temperature we obtain something of this type. Temperature means a degree of “thermalization”, or interaction with a classical temperature bath, which means classicalization to some degree. These are, however, issues that will be discussed at more length later on, in Part IIa of this book.
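The relations (3.49)–(3.52) can be tried out on a small ensemble. In the sketch below the two-state ensemble and the observable are hypothetical; note that the ensemble states are deliberately not orthogonal, which the text allows.

```python
import math

def outer(v):
    """|v><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in v]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

s = 1 / math.sqrt(2)
# Hypothetical ensemble: (classical probability p_i, state |i>)
ensemble = [(0.25, [1.0, 0.0]), (0.75, [s, s])]
A = [[1.0, 0.0], [0.0, -1.0]]   # hypothetical observable

# rho = sum_i p_i |i><i|
rho = [[sum(p * outer(v)[i][j] for p, v in ensemble) for j in range(2)]
       for i in range(2)]

avg_from_trace = trace(matmul(rho, A))                      # Tr(rho A)
avg_double = sum(
    p * sum(v[i].conjugate() * A[i][j] * v[j] for i in range(2) for j in range(2))
    for p, v in ensemble
)                                                           # sum_i p_i <i|A|i>
print(trace(rho), avg_from_trace, avg_double)
```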
Important Concepts to Remember
• The postulates of quantum mechanics are just a convenient way to package the physical content of quantum mechanics, not mathematical axioms.
• The first postulate: at every time, the state of a physical system is a ket $|\psi\rangle$ in a Hilbert space.
• The second postulate: observables correspond to a Hermitian operator $\hat{A}$ acting on $|\psi\rangle$. The eigenvalues of $\hat{A}$ are real, as they must be in order to correspond to observables.
• The third postulate: the result of a measurement of an observable $A$ is an eigenvalue $\lambda_n$ of $\hat{A}$. If the $\lambda_n$ are discrete, we have quantization. We can expand a state $|\psi\rangle$ in terms of a basis consisting of the eigenstates $|v_n\rangle$ of $\hat{A}$.
• The fourth postulate: the probability of obtaining the eigenvalue $\lambda_n$ associated with $|v_n\rangle$ is $P_n = |\langle v_n|\psi\rangle|^2$. Since $|\psi\rangle$ and $|v_n\rangle$ are normalized, $\sum_n P_n = 1$.
• The fifth postulate: after measuring the variable $A$ and obtaining $\lambda_n$, the state of the system has changed from $|\psi\rangle$ to $|v_n\rangle$.
• The sixth postulate: the evolution in time of a state is governed by the Schrödinger equation $i\hbar\,d/dt|\psi(t)\rangle = \hat{H}|\psi(t)\rangle$.
• If we consider the position of a particle in a state $|\psi\rangle$, with vector $|x\rangle$, then the probability of finding the particle at position $x$ is $|\psi(x)|^2$.
• Independent observables can be diagonalized simultaneously and have commuting operators: $[\hat{A}, \hat{B}] = 0$.
• For an eigenvalue $E$ of the Hamiltonian of a system, we have the time-independent Schrödinger equation $\hat{H}|\psi_E\rangle = E|\psi_E\rangle$; then $|\psi_E(t)\rangle = e^{-iE(t - t_0)/\hbar}|\psi_E(t_0)\rangle$.
• The evolution operator $\hat{U}(t, t_0)$ is defined by $|\psi(t)\rangle = \hat{U}(t, t_0)|\psi(t_0)\rangle$.
• A classical ensemble of quantum states corresponds to a density matrix $\hat{\rho} = \sum_i p_i|i\rangle\langle i|$, normalized as ${\rm Tr}\,\hat{\rho} = 1$ and with average observable value $\overline{\langle A\rangle} = {\rm Tr}(\hat{\rho}\hat{A})$.
Further Reading
See, for instance, Messiah’s book [2] or Shankar’s book [3] for an alternative approach.
Exercises
(1) Suppose that the probability of finding some particle 1 at $x_1$ is a Gaussian around $x_1$, with standard deviation $\sigma_1$, and the probability of finding another particle 2 at $x_2$ is a Gaussian around $x_2$ with standard deviation $\sigma_2$. What is the condition on the wave functions of the two particles such that the probability to find either particle at the position mentioned is the sum of the two Gaussians?
(2) Consider the classical Hamiltonian
\[ H = \alpha p^2 + \beta q^2 + p f(q). \tag{3.55} \]
Write a quantum Hamiltonian that is ordering-symmetric with respect to $\hat{Q}$ and $\hat{P}$, taking into account that $[\hat{Q}, \hat{P}]$ is a constant.
(3) Consider infinite and discrete energy spectra $E_n$. If the spectrum extends by a finite amount, between $E_{\rm min}$ and $E_{\rm max}$, what conditions can you impose on the $E_n$ very close to either of these values? What if $E_{\rm max} = +\infty$ or $E_{\rm min} = -\infty$?
(4) Consider a (spinless) particle in a three-dimensional potential. How many commuting operators associated with observables are there?
(5) Consider the Hamiltonian
\[ H = x^2. \tag{3.56} \]
Solve the Schrödinger equation and find the time evolution operator.
(6) Consider the density matrix
\[ \hat{\rho} = \frac{1}{3}\left(|1\rangle\langle 1| + |2\rangle\langle 2| + |3\rangle\langle 3|\right), \tag{3.57} \]
where $|1\rangle$ and $|2\rangle \pm |3\rangle$ are eigenstates of some operator $\hat{A}$. Calculate the average value of $\hat{A}$.
(7) In exercise 6, if the states $|1\rangle$, $|2\rangle$, $|3\rangle$ are eigenstates of the Hamiltonian $\hat{H}$, write the evolution in time of the density matrix $\hat{\rho}$.
4 Two-Level Systems and Spin-1/2, Entanglement, and Computation
In this chapter we consider the simplest possible situation, that of a Hilbert space with dimension 2, i.e., a system with only two orthonormal states leading to operators represented as 2 × 2 matrices, also known as a two-level system. We will only analyze the time-independent case, leaving the time-dependent case for later. Then, we will consider the case of several such two-level systems coupled together, with a tensor product Hilbert space (as we described previously). The relevant new phenomenon that we will observe is called entanglement, related to the quantum coupling of the Hilbert spaces. Moreover, this will allow us to define a quantum version of classical computation, which here we will only touch upon; both entanglement and quantum computation will be addressed in more detail later on in the book.
4.1 Two-Level Systems and Time Dependence
We will motivate the study of general two-level systems by first considering the simplest such system, that of a fermion (such as an electron) with spin $|\vec{S}| = 1/2$ (more precisely, the spin is 1/2 times the quantum unit of angular momentum, which is $\hbar$, so $|\vec{S}| = \hbar/2$, but we will suppress the $\hbar$ for the moment). In this case, as we know experimentally, for instance from the Stern–Gerlach experiment (described in Chapter 0), the projection of the spin onto any direction, here denoted by $z$, $S_z$, can take only the values $+1/2$ and $-1/2$. In the Stern–Gerlach experiment, we put a magnetic field in the $z$ direction, splitting an electron beam passing through it transversally. Therefore, the independent states in this simplest system are $|S = 1/2, S_z = +1/2\rangle$, also called $|+\rangle$ or $|\uparrow\rangle$, and $|S = 1/2, S_z = -1/2\rangle$, also called $|-\rangle$ or $|\downarrow\rangle$. In a matrix notation for the Hilbert space and the operators on it, the states are (see Fig. 4.1)
\[ |\uparrow\rangle = |+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\downarrow\rangle = |-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{4.1} \]
Figure 4.1: A two-level system, with a common notation (upper level $+$, $\uparrow$, $E_2$; lower level $-$, $\downarrow$, $E_1$).
Operators acting on this space are then represented by 2 × 2 matrices acting on the above states. The most important operators are the spin operators themselves, or more precisely the spin projections onto the three axes $S_x$, $S_y$, $S_z$. We will describe the general theory of spin and angular momentum later, but for the moment we note that, since the spin projections are $\pm 1/2$, we want the matrices representing $S_x$, $S_y$, $S_z$ to satisfy
\[ S_x^2 = S_y^2 = S_z^2 = \frac{1}{4}. \tag{4.2} \]
Moreover, we want to have (we will explain this condition later)
\[ (S_x + iS_y)^2 = (S_x - iS_y)^2 = 0, \tag{4.3} \]
which leads to
\[ S_xS_y + S_yS_x = 0, \tag{4.4} \]
i.e., these matrices anticommute. The above conditions, together with the anticommutativity of $S_x$ and $S_y$ with $S_z$, are actually sufficient to define the matrices as $S_i = \frac{1}{2}\sigma_i$, with $\sigma_i$ the Pauli matrices,
\[ \sigma_1 = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.5} \]
More precisely, the Pauli matrices satisfy
\[ \sigma_i\sigma_j = \delta_{ij}\mathbb{1}_{2\times 2} + i\epsilon_{ijk}\sigma_k, \tag{4.6} \]
which we can check explicitly; we have used implicit Einstein summation over the index $k$, and the Levi–Civita tensor is $\epsilon_{ijk} = +1$ for $(ijk) = 123$ and for cyclic permutations thereof, $-1$ for the anticyclic permutations, and $0$ if any two indices coincide.
The energy due to the interaction of the spin of the electron, manifested through its magnetic moment, with the magnetic field $\vec{B}$ is
\[ \Delta H_{\rm spin} = -\vec{\mu}\cdot\vec{B}, \tag{4.7} \]
where $\vec{\mu}$ is the magnetic moment of the electron. Classically,
\[ \vec{\mu}_L = \frac{e}{2m}\vec{L}. \tag{4.8} \]
Quantum mechanically, as we will see, the quantum unit of $|\vec{L}|$ is $\hbar$. Then the quantum of $|\vec{\mu}|$ is called the Bohr magneton, and is
\[ \mu_B = \frac{e\hbar}{2m}. \tag{4.9} \]
On the other hand the electron, as a fermion, has half-integer spin (i.e., intrinsic angular momentum), and more precisely
\[ |\vec{S}| = \frac{\hbar}{2}. \tag{4.10} \]
Moreover, the quantum magnetic moment of the electron due to its spin has an extra factor, the Landé g-factor, so
\[ \vec{\mu}_S = g\frac{e}{2m}\vec{S}, \tag{4.11} \]
where $g \simeq 2$. The magnetic moment as an operator, or more precisely a matrix, acting on the two spin states, is then
\[ \vec{\mu}_S = \frac{ge\hbar}{4m}\vec{\sigma} \equiv \mu\vec{\sigma}, \tag{4.12} \]
so the spin energy, leading to spin splitting of the electron’s energy levels, is
\[ \Delta H_{\rm spin} = -\vec{\mu}_S\cdot\vec{B} = -\mu\,\vec{\sigma}\cdot\vec{B}. \tag{4.13} \]
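The Pauli-matrix product rule (4.6), which underlies the spin Hamiltonian (4.13), can be verified entry by entry. The sketch below indexes the matrices by 0, 1, 2 instead of 1, 2, 3, and the closed-form expression used for the Levi–Civita symbol is a convenience of the sketch, not something taken from the text.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]
IDENTITY = [[1, 0], [0, 1]]

def levi_civita(i, j, k):
    """+1 for cyclic, -1 for anticyclic permutations of (0, 1, 2), 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2

def pauli_product_rhs(i, j):
    """delta_ij * 1 + i * sum_k epsilon_ijk sigma_k, the right-hand side of (4.6)."""
    return [[(i == j) * IDENTITY[a][b]
             + 1j * sum(levi_civita(i, j, k) * SIGMA[k][a][b] for k in range(3))
             for b in range(2)] for a in range(2)]

algebra_holds = all(
    abs(matmul(SIGMA[i], SIGMA[j])[a][b] - pauli_product_rhs(i, j)[a][b]) < 1e-12
    for i in range(3) for j in range(3) for a in range(2) for b in range(2)
)
print(algebra_holds)
```

Setting $i = j$ in (4.6) gives $\sigma_i^2 = 1$, which is (4.2) for $S_i = \sigma_i/2$, and the antisymmetry of $\epsilon_{ijk}$ gives the anticommutation (4.4).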
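If $\vec{\mu} = \mu_z\vec{e}_z$ and $\vec{B}$ is time independent, then we finally obtain
\[ \Delta H_{\rm spin} = -\mu B_z\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.14} \]
Its eigenvectors are the basis vectors,
\[ |\uparrow\rangle = |+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\downarrow\rangle = |-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{4.15} \]
If we are measuring this energy, for instance as in the Stern–Gerlach experiment, a prototype of a quantum state that does not have a well-defined energy, but rather probabilities for energies, is the normalized linear combination
\[ |\psi_\pm\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\rangle \pm |\downarrow\rangle\right) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}. \tag{4.16} \]
In this state, the probabilities for the occurrence of each eigenstate $|i\rangle = |\pm\rangle$ are equal, and given by
\[ p_i = |c_i|^2 = \frac{1}{2}, \tag{4.17} \]
for both values of the measured energy,
\[ E_+ = E + \mu B_z \quad {\rm and} \quad E_- = E - \mu B_z. \tag{4.18} \]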
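We could in principle also add a time-dependent magnetic field with circular polarization in the $(x, y)$ directions,
\[ B_x = -b\cos\omega t, \qquad B_y = b\sin\omega t. \tag{4.19} \]
Then its interaction with the magnetic moment of the electron gives
\[ -(B_x\sigma_x + B_y\sigma_y) = b\begin{pmatrix} 0 & \cos\omega t + i\sin\omega t \\ \cos\omega t - i\sin\omega t & 0 \end{pmatrix} = b\begin{pmatrix} 0 & e^{i\omega t} \\ e^{-i\omega t} & 0 \end{pmatrix}, \tag{4.20} \]
leading to a total spin Hamiltonian
\[ \Delta H_{\rm spin} = -\mu B_z\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \mu b\begin{pmatrix} 0 & e^{i\omega t} \\ e^{-i\omega t} & 0 \end{pmatrix}. \tag{4.21} \]
However, this case is more complicated and will not be analyzed here, but later in the book.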
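4.2 General Stationary Two-State System
Consider a complete basis $|\phi_1\rangle$, $|\phi_2\rangle$ corresponding to some part of the Hamiltonian $\hat{H}_0$. In coordinate space, this would correspond to wave functions $\langle x|\phi_1\rangle = \phi_1(x)$ and $\langle x|\phi_2\rangle = \phi_2(x)$. As before, representing $|\phi_1\rangle$ by $|+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|\phi_2\rangle$ by $|-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, a general wave function can be decomposed as
\[ \psi(x, t) = c_1(t)\phi_1(x) + c_2(t)\phi_2(x), \tag{4.22} \]
which is written formally as
\[ |\psi(t)\rangle = c_1(t)|\phi_1\rangle + c_2(t)|\phi_2\rangle = \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix}. \tag{4.23} \]
Then the Schrödinger equation for the wave function in coordinate space is
\[ i\hbar\frac{d}{dt}\psi(x, t) = \hat{H}\psi(x, t), \tag{4.24} \]
and in matrix form for the formal case it is
\[ i\hbar\frac{d}{dt}\begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = \hat{H}\begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}\begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix}. \tag{4.25} \]
Here the matrix elements of the Hamiltonian are, by definition,
\[ H_{ij} = \langle\phi_i|\hat{H}|\phi_j\rangle = \int dx\int dy\,\langle\phi_i|x\rangle\langle x|\hat{H}|y\rangle\langle y|\phi_j\rangle = \int dx\int dy\,\phi_i^*(x)\hat{H}_{xy}\phi_j(y), \tag{4.26} \]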
where we have introduced two identities written using the completeness relation in order to express the Hamiltonian matrix elements in terms of wave functions.
As we saw in the general theory, in order to solve the Schrödinger equation we write an eigenvalue problem for the Hamiltonian, and in terms of it we write an ansatz for the time dependence as
\[ \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = e^{-iEt/\hbar}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \tag{4.27} \]
Here $c_1$, $c_2$ are constants and $E$ is an eigenvalue for the eigenvalue–eigenvector problem for the Hamiltonian, satisfying
\[ \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = E\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \tag{4.28} \]
As we know, the existence of the eigenvalue $E$ is equivalent to the vanishing of $\det(\hat{H} - E\hat{1})$,
\[ \begin{vmatrix} H_{11} - E & H_{12} \\ H_{21} & H_{22} - E \end{vmatrix} = 0. \tag{4.29} \]
Moreover, remembering that the Hamiltonian (like all observables) must be a Hermitian operator, $\hat{H} = \hat{H}^\dagger$, in matrix form it means that we must have
\[ H_{21} = H_{12}^*. \tag{4.30} \]
The above equation for the eigenvalues then becomes
\[ (E - H_{11})(E - H_{22}) - |H_{12}|^2 = E^2 - E(H_{11} + H_{22}) + H_{11}H_{22} - |H_{12}|^2 = 0, \tag{4.31} \]
with solutions
\[ E_\pm = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\frac{(H_{11} - H_{22})^2}{4} + |H_{12}|^2}. \tag{4.32} \]
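A quick way to gain confidence in (4.32) is to substitute the two roots back into the characteristic polynomial (4.31). The matrix elements used below are hypothetical, and the function names are introduced only for this sketch.

```python
import math

def two_level_energies(H11, H22, H12):
    """Return (E_minus, E_plus) from (4.32), assuming H21 = conj(H12)."""
    mean = (H11 + H22) / 2
    root = math.sqrt((H11 - H22) ** 2 / 4 + abs(H12) ** 2)
    return mean - root, mean + root

def char_poly(E, H11, H22, H12):
    """det(H - E 1) = (E - H11)(E - H22) - |H12|^2, as in (4.31)."""
    return (E - H11) * (E - H22) - abs(H12) ** 2

H11, H22, H12 = 1.0, 3.0, 0.5 - 1.2j   # hypothetical matrix elements
E_minus, E_plus = two_level_energies(H11, H22, H12)
print(E_minus, E_plus)
```

The two roots sum to the trace $H_{11} + H_{22}$ and multiply to the determinant $H_{11}H_{22} - |H_{12}|^2$, as they must for a 2 × 2 matrix.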
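Given these eigenvalues, the eigenstate equations are
\[ \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}\begin{pmatrix} c_1^\pm \\ c_2^\pm \end{pmatrix} = E_\pm\begin{pmatrix} c_1^\pm \\ c_2^\pm \end{pmatrix}. \tag{4.33} \]
Because of the eigenvalue condition, the two equations are degenerate (the determinant of the linear system for the variables $c_1^\pm$, $c_2^\pm$ vanishes), so only one of the two equations is independent, say, the second,
\[ H_{21}c_1^\pm + (H_{22} - E_\pm)c_2^\pm = 0, \tag{4.34} \]
while the second condition for the pair of variables is the normalization condition
\[ |c_1^\pm|^2 + |c_2^\pm|^2 = 1, \tag{4.35} \]
obtained from
\[ \langle\psi_+|\psi_+\rangle = \langle\psi_-|\psi_-\rangle = 1. \tag{4.36} \]
The solution of the two equations is
\[ \begin{pmatrix} c_1^\pm \\ c_2^\pm \end{pmatrix} = \frac{\eta_\pm}{\sqrt{1 + \left|\frac{H_{21}}{E_\pm - H_{22}}\right|^2}}\begin{pmatrix} 1 \\ \frac{H_{21}}{E_\pm - H_{22}} \end{pmatrix}, \tag{4.37} \]
where $|\eta_\pm|^2 = 1$, so $\eta_\pm$ is a phase, which we will choose to be 1. We can simplify the equations using the definitions
\[ \frac{H_{11} + H_{22}}{2} \equiv \bar{E}, \qquad \frac{H_{22} - H_{11}}{2} \equiv \Delta, \qquad H_{12} = H_{21}^* \equiv V, \qquad \sqrt{\Delta^2 + |V|^2} \equiv \frac{\Omega}{2}. \tag{4.38} \]
Then the eigenenergies are
\[ E_\pm = \bar{E} \pm \sqrt{\Delta^2 + |V|^2} = \bar{E} \pm \frac{\Omega}{2}, \tag{4.39} \]
the Hamiltonian is
\[ \hat{H} = \begin{pmatrix} \bar{E} - \Delta & V \\ V^* & \bar{E} + \Delta \end{pmatrix}, \tag{4.40} \]
and, since $E_\pm - H_{22} = -\Delta \pm \sqrt{\Delta^2 + |V|^2}$, the eigenstates are
\[ \begin{pmatrix} c_1^\pm \\ c_2^\pm \end{pmatrix} = \frac{\eta_\pm}{\sqrt{1 + \frac{|V|^2}{\left(-\Delta \pm \sqrt{\Delta^2 + |V|^2}\right)^2}}}\begin{pmatrix} 1 \\ \frac{V^*}{-\Delta \pm \sqrt{\Delta^2 + |V|^2}} \end{pmatrix} = \frac{\eta_\pm}{\sqrt{|V|^2 + \left(\mp\Delta + \sqrt{\Delta^2 + |V|^2}\right)^2}}\begin{pmatrix} \mp\Delta + \sqrt{\Delta^2 + |V|^2} \\ \pm V^* \end{pmatrix}. \tag{4.41} \]
We choose $\eta_\pm = 1$ and a real $V$ ($V = V^* \in \mathbb{R}$; we take $V > 0$, whereas for $V < 0$ the same formulas hold with $\eta_+ = -1$), and then we can define
\[ \begin{pmatrix} c_1^+ \\ c_2^+ \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}, \qquad \begin{pmatrix} c_1^- \\ c_2^- \end{pmatrix} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}. \tag{4.42} \]
Note that this definition is self-consistent since it requires $c_2^+ = c_1^-$, which gives
\[ \frac{V}{\sqrt{V^2 + \left(-\Delta + \sqrt{\Delta^2 + V^2}\right)^2}} = \frac{\Delta + \sqrt{\Delta^2 + V^2}}{\sqrt{V^2 + \left(\Delta + \sqrt{\Delta^2 + V^2}\right)^2}} \equiv \cos\theta, \tag{4.43} \]
as we can check by squaring the equation and multiplying with the denominators. Moreover, with $\sin\theta = c_2^- = -V/\sqrt{V^2 + (\Delta + \sqrt{\Delta^2 + V^2})^2}$, we obtain that
\[ \sin 2\theta = 2\sin\theta\cos\theta = -\frac{2V\left(\Delta + \sqrt{\Delta^2 + V^2}\right)}{V^2 + \left(\Delta + \sqrt{\Delta^2 + V^2}\right)^2} = -\frac{V}{\sqrt{\Delta^2 + V^2}}, \tag{4.44} \]
which also implies that
\[ \cos 2\theta = \frac{\Delta}{\sqrt{\Delta^2 + V^2}}. \tag{4.45} \]
In conclusion, the eigenstates corresponding to $E_\pm = \bar{E} \pm \Omega/2$ are
\[ |\psi_-\rangle = \cos\theta|\phi_1\rangle + \sin\theta|\phi_2\rangle, \qquad |\psi_+\rangle = -\sin\theta|\phi_1\rangle + \cos\theta|\phi_2\rangle, \tag{4.46} \]
so the general time-dependent state is given by
\[ |\psi(t)\rangle = c_-e^{-iE_-t/\hbar}|\psi_-\rangle + c_+e^{-iE_+t/\hbar}|\psi_+\rangle, \tag{4.47} \]
where, since $|\psi_\pm\rangle$ are orthonormal states, by multiplying with $\langle\psi_\pm|$ we obtain that the coefficients $c_\pm$ are given by
\[ c_\pm = \langle\psi_\pm|\psi(t = 0)\rangle. \tag{4.48} \]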
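The mixing-angle parametrization (4.44)–(4.46) can be checked directly against the Hamiltonian (4.40) for real $V$. In the sketch below the angle is obtained from $\sin 2\theta = -V/\sqrt{\Delta^2 + V^2}$ and $\cos 2\theta = \Delta/\sqrt{\Delta^2 + V^2}$ through a two-argument arctangent, which is a choice made for the sketch; the parameter values and function names are hypothetical.

```python
import math

def mixing(E_bar, Delta, V):
    """Return (theta, E_minus, E_plus) for H = [[E_bar - Delta, V], [V, E_bar + Delta]]."""
    S = math.hypot(Delta, V)               # sqrt(Delta^2 + V^2) = Omega / 2
    theta = 0.5 * math.atan2(-V, Delta)    # sin(2 theta) = -V/S, cos(2 theta) = Delta/S
    return theta, E_bar - S, E_bar + S

def apply(H, v):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [H[0][0] * v[0] + H[0][1] * v[1], H[1][0] * v[0] + H[1][1] * v[1]]

E_bar, Delta, V = 2.0, 0.3, 0.4            # hypothetical parameters, V real
H = [[E_bar - Delta, V], [V, E_bar + Delta]]
theta, E_minus, E_plus = mixing(E_bar, Delta, V)

psi_minus = [math.cos(theta), math.sin(theta)]     # |psi_-> of (4.46)
psi_plus = [-math.sin(theta), math.cos(theta)]     # |psi_+> of (4.46)

# Residuals of H|psi> - E|psi>, which should vanish up to rounding
residual_minus = max(abs(h - E_minus * c) for h, c in zip(apply(H, psi_minus), psi_minus))
residual_plus = max(abs(h - E_plus * c) for h, c in zip(apply(H, psi_plus), psi_plus))
print(theta, residual_minus, residual_plus)
```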
4.3 Oscillations of States
Choosing as initial state an eigenstate of the diagonal part of the Hamiltonian, for instance $|\phi_1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, but not of the full Hamiltonian, we will see that the system actually oscillates between the state $|\phi_1\rangle$ and the state $|\phi_2\rangle$. This phenomenon is actually the one responsible for neutrino oscillations, and what we have described here applies almost directly.
Consider then the state $|\psi(t = 0)\rangle = |\phi_1\rangle$ as the initial state. Inverting (4.46), we have
\[ |\phi_1\rangle = \cos\theta|\psi_-\rangle - \sin\theta|\psi_+\rangle, \qquad |\phi_2\rangle = \sin\theta|\psi_-\rangle + \cos\theta|\psi_+\rangle, \tag{4.49} \]
which means that from $c_\pm = \langle\psi_\pm|\psi(t = 0)\rangle$ we get
\[ c_- = \cos\theta, \qquad c_+ = -\sin\theta. \tag{4.50} \]
Substituting into the general state, we obtain
\[ |\psi(t)\rangle = \left(\cos^2\theta\,e^{-iE_-t/\hbar} + \sin^2\theta\,e^{-iE_+t/\hbar}\right)|\phi_1\rangle + \sin\theta\cos\theta\left(e^{-iE_-t/\hbar} - e^{-iE_+t/\hbar}\right)|\phi_2\rangle \equiv c_1(t)|\phi_1\rangle + c_2(t)|\phi_2\rangle, \tag{4.51} \]
which means that, from the point of view of the $|\phi_1\rangle$, $|\phi_2\rangle$ basis, the state oscillates between the basis states. This is indeed the phenomenon of neutrino oscillations, if we consider the neutrino states as $|\nu_1\rangle = |\phi_1\rangle$ and $|\nu_2\rangle = |\phi_2\rangle$ and that they are actually not eigenstates of the Hamiltonian.
The probability of oscillation, that is, of finding the system in the basis state $|\phi_2\rangle$, is
\[ p_2(t) = |c_2(t)|^2 = \sin^2\theta\cos^2\theta\left|e^{-iE_-t/\hbar} - e^{-iE_+t/\hbar}\right|^2 = (\sin 2\theta)^2\,\frac{1 - \cos\left((E_+ - E_-)t/\hbar\right)}{2} = \frac{V^2}{\Delta^2 + V^2}\sin^2\left(\frac{\sqrt{\Delta^2 + V^2}\,t}{\hbar}\right) = \frac{V^2}{2(\Delta^2 + V^2)}\left(1 - \cos\frac{\Omega t}{\hbar}\right), \tag{4.52} \]
where we have used $E_+ - E_- = 2\sqrt{\Delta^2 + V^2} = \Omega$. Then the probability of remaining in the original state is
\[ p_1(t) = |c_1(t)|^2 = 1 - p_2(t) = \frac{2\Delta^2 + V^2}{2(\Delta^2 + V^2)} + \frac{V^2}{2(\Delta^2 + V^2)}\cos\frac{\Omega t}{\hbar}, \tag{4.53} \]
as we can check explicitly. These probabilities oscillate with angular frequency $\Omega/\hbar$ around an average, as can be seen. Consider next two special cases:
• First, $\Delta = 0$, in which case $\Omega = 2|V|$ and
\[ p_1(t) = \frac{1}{2}\left(1 + \cos\frac{2|V|t}{\hbar}\right), \qquad p_2(t) = \frac{1}{2}\left(1 - \cos\frac{2|V|t}{\hbar}\right). \tag{4.54} \]
In this case, the oscillations happen around the average value of 1/2, but there are times where the system finds itself fully in the state $|\phi_2\rangle$, since $p_1 = 0$, $p_2 = 1$, and others where it is fully back at $|\phi_1\rangle$, since $p_1 = 1$, $p_2 = 0$.
• Second, consider $V = 0$, in which case we are back at the spin 1/2 case with $\vec{B} = B_z\vec{e}_z$, and $E_- = H_{11} = E_1$, $E_+ = H_{22} = E_2$. Then there is no oscillation, since we get $\sin 2\theta = 0$, and $p_2(t) = 0$. Moreover, in this case $|\psi_\pm\rangle$ coincide with the basis states $|\phi_{1,2}\rangle$. But for a general initial state $|\psi(t = 0)\rangle = c_-|\phi_1\rangle + c_+|\phi_2\rangle$, with $c_\pm$ arbitrary, we find
\[ |\psi(t)\rangle = e^{-iE_1t/\hbar}c_-|\phi_1\rangle + e^{-iE_2t/\hbar}c_+|\phi_2\rangle. \tag{4.55} \]
That means, however, that the probability of the system being in state 1 is $p_1(t) = |c_-|^2$ and of its being in state 2 is $p_2(t) = |c_+|^2$; these probabilities are time independent, so we have no oscillations.
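The oscillation probability can be computed in two ways and compared: from the amplitude $c_2(t)$ in (4.51), and from the closed form (4.52). The sketch below assumes units with $\hbar = 1$, a real $V$, and hypothetical parameter values; the mixing angle is fixed through $\sin 2\theta = -V/\sqrt{\Delta^2 + V^2}$ and $\cos 2\theta = \Delta/\sqrt{\Delta^2 + V^2}$.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def p2_from_amplitude(E_bar, Delta, V, t):
    """|c_2(t)|^2 computed from the amplitude in (4.51), for real V."""
    S = math.hypot(Delta, V)                   # sqrt(Delta^2 + V^2) = Omega / 2
    theta = 0.5 * math.atan2(-V, Delta)        # mixing angle
    E_minus, E_plus = E_bar - S, E_bar + S
    c2 = math.sin(theta) * math.cos(theta) * (
        cmath.exp(-1j * E_minus * t / HBAR) - cmath.exp(-1j * E_plus * t / HBAR))
    return abs(c2) ** 2

def p2_closed_form(Delta, V, t):
    """Closed form (4.52): V^2 / (2 (Delta^2 + V^2)) * (1 - cos(Omega t / hbar))."""
    S2 = Delta ** 2 + V ** 2
    Omega = 2 * math.sqrt(S2)
    return V ** 2 / (2 * S2) * (1 - math.cos(Omega * t / HBAR))

E_bar, Delta, V = 2.0, 0.3, 0.4                # hypothetical parameters
times = [0.0, 0.5, 1.0, 2.0, 5.0]
max_diff = max(abs(p2_from_amplitude(E_bar, Delta, V, t) - p2_closed_form(Delta, V, t))
               for t in times)

# Special case Delta = 0: complete transfer to |phi_2> at t = pi * hbar / (2 |V|)
p2_full = p2_closed_form(0.0, V, math.pi * HBAR / (2 * abs(V)))
print(max_diff, p2_full)
```

Note that $\bar{E}$ drops out of the probability: it contributes only an overall phase to the state.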
4.4 Unitary Evolution Operator
Consider the general initial state
\[ |\psi(t = 0)\rangle = b|\phi_1\rangle + a|\phi_2\rangle = \begin{pmatrix} b \\ a \end{pmatrix} = a\left(\sin\theta|\psi_-\rangle + \cos\theta|\psi_+\rangle\right) + b\left(\cos\theta|\psi_-\rangle - \sin\theta|\psi_+\rangle\right), \tag{4.56} \]
that is, with
\[ c_- = a\sin\theta + b\cos\theta, \qquad c_+ = a\cos\theta - b\sin\theta. \tag{4.57} \]
Then substituting into the general form of the time-dependent state (4.47), we obtain
\[ |\psi(t)\rangle = c_-|\psi_-\rangle e^{-iE_-t/\hbar} + c_+|\psi_+\rangle e^{-iE_+t/\hbar} = \left[(a\sin\theta\cos\theta + b\cos^2\theta)e^{-iE_-t/\hbar} + (-a\sin\theta\cos\theta + b\sin^2\theta)e^{-iE_+t/\hbar}\right]|\phi_1\rangle + \left[(a\sin^2\theta + b\sin\theta\cos\theta)e^{-iE_-t/\hbar} + (a\cos^2\theta - b\sin\theta\cos\theta)e^{-iE_+t/\hbar}\right]|\phi_2\rangle. \tag{4.58} \]
This can be written in the form of a matrix $U(t)$ acting on the initial state,
\[ |\psi(t)\rangle = U(t)\begin{pmatrix} b \\ a \end{pmatrix} = U(t)|\psi(t = 0)\rangle, \tag{4.59} \]
where $U(t)$ is given by
\[ U(t) = e^{-i\bar{E}t/\hbar}\begin{pmatrix} \cos^2\theta\,e^{i\Omega t/2\hbar} + \sin^2\theta\,e^{-i\Omega t/2\hbar} & 2i\sin\theta\cos\theta\sin(\Omega t/2\hbar) \\ 2i\sin\theta\cos\theta\sin(\Omega t/2\hbar) & \sin^2\theta\,e^{i\Omega t/2\hbar} + \cos^2\theta\,e^{-i\Omega t/2\hbar} \end{pmatrix}. \tag{4.60} \]
We can check explicitly that the matrix is unitary,
\[ U^\dagger(t)U(t) = \hat{1}. \tag{4.61} \]
This agrees with the general theory explained before, stating that the time evolution of a state, as well as any change of basis, must be a unitary operation in order for the sum of all probabilities to be equal to 1 (the conservation of probability). This also means that the time evolution is a reversible process: we can act with $U^{-1} = U^\dagger$ on the final state, and obtain the initial state. The only possible nonunitary process, in quantum theory, is the process of measurement, which is not reversible.
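The unitarity of the evolution matrix can also be confirmed numerically. The sketch below builds the matrix from the mixing angle, the mean energy $\bar{E}$, and the splitting $\Omega$, in units with $\hbar = 1$ (an assumption of the sketch), and then forms $U^\dagger U$; the parameter values and function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def evolution_matrix(theta, E_bar, Omega, t):
    """U(t) in the basis (|phi_1>, |phi_2>), built from theta, E_bar and Omega."""
    c, s = math.cos(theta), math.sin(theta)
    up = cmath.exp(1j * Omega * t / (2 * HBAR))       # phase of the E_minus branch
    down = cmath.exp(-1j * Omega * t / (2 * HBAR))    # phase of the E_plus branch
    off = 2j * s * c * math.sin(Omega * t / (2 * HBAR))
    phase = cmath.exp(-1j * E_bar * t / HBAR)
    return [[phase * (c * c * up + s * s * down), phase * off],
            [phase * off, phase * (s * s * up + c * c * down)]]

def dagger_times(U):
    """Return the matrix product U^dagger U."""
    return [[sum(U[k][i].conjugate() * U[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta, E_bar, Omega = 0.4, 2.0, 1.0    # hypothetical parameters
U = evolution_matrix(theta, E_bar, Omega, t=0.7)
UdU = dagger_times(U)
U0 = evolution_matrix(theta, E_bar, Omega, t=0.0)
print(UdU)
```

The modulus squared of the off-diagonal entry is $\sin^2 2\theta\,\sin^2(\Omega t/2\hbar)$, which is the oscillation probability for a system that starts in $|\phi_1\rangle$.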
4.5 Entanglement
We have described a single two-level system, for instance a spin 1/2 system, but we can also consider several such systems interacting with each other. In the simplest case, consider just two such systems, A and B. In this case, as we explained in the general theory, the total Hilbert space is a tensor product of the individual Hilbert spaces, so $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$.
But there is one possibility of interest that we will consider here, namely that the interaction between the systems is such that the state of one system influences the state of the other, in such a way that the state of the total system is not describable as the product of states in each system, i.e.,
\[ |\psi_{AB}\rangle \neq |\psi_A\rangle \otimes |\psi_B\rangle. \tag{4.62} \]
We call such a situation an entangled state, and the phenomenon is called entanglement. The quintessential example is one that is called “maximally entangled”, where the state of one system completely determines the state of the other. For instance, consider a system, such as an atom or nucleus, that has a total spin equal to zero but decays into two systems (decay products) that each have spin 1/2 (such as an electron and a “hole”, or ionized atom in the case of an atom). Then spin conservation means that the total spin of the decay products must still add up to zero, meaning that if system A has $S_z = +1/2$ then system B has $S_z = -1/2$, and vice versa (if A has $S_z = -1/2$ then B has $S_z = +1/2$); thus the state of system A completely determines the state of system B. In this case, the probabilities of the two spin situations are each equal to 1/2, so the total state of the system is
\[ |\psi_\pm\rangle_{AB} = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_A \otimes |\downarrow\rangle_B \pm |\downarrow\rangle_A \otimes |\uparrow\rangle_B\right) \equiv \frac{1}{\sqrt{2}}\left(|10\rangle \pm |01\rangle\right), \tag{4.63} \]
where in the last equality we have introduced a new notation for the two states of the two-level system: $|0\rangle$ and $|1\rangle$. We note that, as desired, the probabilities of the two cases are $p_i = |c_i|^2 = 1/2$ and that the state of the system is normalized,
\[ {}_{AB}\langle\psi_\pm|\psi_\pm\rangle_{AB} = 1, \tag{4.64} \]
as well as having orthogonal basis states, ${}_{AB}\langle\psi_+|\psi_-\rangle_{AB} = 0$.
Another way to analyze entangled states is to consider the density matrix associated with the state. Indeed, in a more general situation we could also consider the entanglement associated with a nontrivial density matrix, though we will not do it here. The density matrix of the above entangled state is
\[ \rho_{AB} = |\psi_\pm\rangle\langle\psi_\pm| = \frac{1}{2}\left(|\uparrow\downarrow\rangle \pm |\downarrow\uparrow\rangle\right)\left(\langle\uparrow\downarrow| \pm \langle\downarrow\uparrow|\right). \tag{4.65} \]
If we consider the trace of the density matrix operator over the Hilbert space of system B, corresponding to a sum over all the possibilities for the system B (for instance, assuming that we don’t know which possibility is correct), we obtain
\[ {\rm Tr}_B\,\rho_{AB} = {\rm Tr}_B\,|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm| = {}_B\langle\uparrow|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm|\uparrow\rangle_B + {}_B\langle\downarrow|\psi_\pm\rangle_{AB}\,{}_{AB}\langle\psi_\pm|\downarrow\rangle_B = \frac{1}{2}\left(|\downarrow\rangle_A\,{}_A\langle\downarrow| + |\uparrow\rangle_A\,{}_A\langle\uparrow|\right) = \frac{1}{2}\hat{1}_A, \tag{4.66} \]
where we have used the orthonormality of the eigenstates for the B system. The result is that after taking the trace of the density matrix over the B system, i.e., summing over its possibilities, we obtain a completely random state for system A: on measuring the spin we get the probability 1/2 of obtaining spin up, and probability 1/2 of obtaining spin down. This is another way to determine that we have maximum entanglement in the system.
There is much more that can be said about entanglement, but we will postpone this until the second part of the book.
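The partial trace in (4.66) can be carried out explicitly for a two-qubit pure state whose amplitudes are stored as a 2 × 2 table. The sketch below is illustrative: the function name `partial_trace_B` and the comparison with an unentangled product state are additions made for the example.

```python
import math

def partial_trace_B(psi):
    """Reduced density matrix of system A for a two-qubit pure state.

    psi[a][b] is the amplitude of |a>_A (x) |b>_B, with index 0 = up, 1 = down.
    """
    return [[sum(psi[a][b] * psi[a2][b].conjugate() for b in range(2))
             for a2 in range(2)] for a in range(2)]

s = 1 / math.sqrt(2)
entangled = [[0.0, s], [-s, 0.0]]     # (|up,down> - |down,up>) / sqrt(2), as in (4.63)
product = [[1.0, 0.0], [0.0, 0.0]]    # |up>_A (x) |up>_B, not entangled

rho_A_entangled = partial_trace_B(entangled)
rho_A_product = partial_trace_B(product)
print(rho_A_entangled, rho_A_product)
```

For the entangled state the reduced matrix is half the identity, so system A on its own looks completely random, whereas for the product state system A remains in the definite state $|\uparrow\rangle$.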
4.6 Quantum Computation We have introduced another notation for two-level states, of |0 and |1, which recalls the bits of classical computation, also denoted by states 0 and 1. This is not incidental: we will call a two-level system a “qubit”, or quantum version of the computer bit. In the entangled system, the two interacting spins (the two qubits) are now denoted by |a ⊗ |b, where a, b = 0 or 1. On individual two-level systems, i.e., qubits, as well as on the two-qubit states, i.e., tensor product states, in the absence of measurements the only possible operations allowed by quantum mechanics, time evolution and changes of basis, are unitary operations, as we said earlier. This allows one to define a quantum version of classical computation in a computer. In a classical computer, we encode the data in classical states of 0s and 1s, and act on them with “gates”, which are operators with a well-defined action on the products of two-state systems (taking a given combination of 0s and 1s to be well defined). Any classical computation can be reduced to a product of gates. In a quantum computer, similarly, we would encode the data in sets, or tensor products, of qubits. All qubits are then evolved in time with unitary evolution operations (coming from Hamiltonians), corresponding to any computation. Thus a quantum computation is a unitary evolution, which can also be reduced to a product of gates. Each gate will be a quantum unitary operator acting on a tensor product of qubits, usually taken to be an operator acting on the two-qubit state |a ⊗ |b. Unlike a classical computation however, now we can have a wave function containing components in many different individual qubit states, and the unitary evolution will act on all of them at once. This allows for a version of parallel computing, which however is usually more efficient than anything classical, thus sometimes allowing an exponentially faster solution than in the classical case. 
The quintessential case is an initial state of the type
$$|\psi_\pm\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle \pm |\downarrow\rangle\right) = \frac{1}{\sqrt{2}}\left(|0\rangle \pm |1\rangle\right), \tag{4.67}$$
or, in the case of the two-qubit state, an entangled state like $|\psi_\pm\rangle_{AB}$. The downside of such a calculation is that, when we “collapse the wave function” by making a measurement, which is the only nonunitary operation (and one that breaks the otherwise reversible character of the quantum calculation), we obtain only probabilities for individual states; so we need to make sure that (1) the result of the calculation can be extracted from a particular state and that (2) the probability of obtaining such a state is high enough that we can do this with a sufficiently small number of trials. There is much more to be said about quantum computation, and we will expand upon it in the second part of the book.
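As a small illustration of a gate acting on a qubit, the sketch below builds the states (4.67) from $|0\rangle$ and $|1\rangle$. It assumes the Hadamard gate, a standard single-qubit unitary that is not introduced in the text above; the names `apply_gate`, `measurement_probabilities`, and `HADAMARD` are hypothetical helpers chosen for this example.

```python
import math

def apply_gate(gate, state):
    """Apply a 2x2 gate (nested lists) to a single-qubit state [c0, c1]."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

def measurement_probabilities(state):
    """Probabilities of finding |0> and |1> when the qubit is measured."""
    return [abs(c) ** 2 for c in state]

S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]   # assumed gate: unitary and its own inverse
KET0 = [1.0, 0.0]
KET1 = [0.0, 1.0]

psi_plus = apply_gate(HADAMARD, KET0)    # (|0> + |1>)/sqrt(2)
psi_minus = apply_gate(HADAMARD, KET1)   # (|0> - |1>)/sqrt(2)
```

A measurement of either state returns 0 or 1 with equal probability, which is the sense in which only probabilities for individual states are obtained; applying the gate a second time undoes it, reflecting the reversibility of unitary evolution.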
Important Concepts to Remember
• The simplest quantum mechanical Hilbert space is one with two basis elements, i.e., one that is “two-level”, such as that of a spin $s = 1/2$ fermion.
• For the spin $s = 1/2$ case, the spin operators are $S_i = \frac{\hbar}{2}\sigma_i$, where $\sigma_i$ are the Pauli matrices, satisfying $\sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k$.
• Since $\Delta H_{\rm spin} = -\vec{\mu}\cdot\vec{B}$ and classically the magnetic moment is $\vec{\mu}_L = (e/2m)\vec{L}$, quantum mechanically $\vec{\mu}_S = g(e/2m)\vec{S}$, with $g \simeq 2$.
• Since $|\vec{S}| = \hbar/2$, $\Delta H_{\rm spin} = -\mu\,\vec{\sigma}\cdot\vec{B}$, with $\mu = g(e/2m)(\hbar/2)$.
• Then the eigenvalues (for a magnetic field in the z direction) are $\pm\mu B_z$, and the eigenstates are $|\psi_\pm\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle \pm |\downarrow\rangle)$.
• For a general two-state system Hamiltonian, with elements $H_{11}$, $H_{22}$, $H_{12}$ and $H_{21} = H_{12}^*$, the eigenenergies are $E_\pm = (H_{11} + H_{22})/2 \pm \hbar\Omega/2$, with $\hbar\Omega/2 = \sqrt{\Delta^2 + |V|^2}$, where $\Delta = (H_{22} - H_{11})/2$ and $V = H_{12}$.
• If the initial state is not one of the eigenstates $|\psi_\pm\rangle$, then the time-dependent state oscillates between $|\psi_+\rangle$ and $|\psi_-\rangle$ with frequency $\Omega$.
• If the original state is $|\uparrow\rangle$, the probability of switching spin is $p_2(t) = \frac{V^2}{2(\Delta^2 + V^2)}(1 - \cos\Omega t)$. This corresponds to the phenomenon of neutrino oscillation.
• For several two-level systems, for instance two such systems, we can have states of the coupled system that are not separable, $|\psi_{AB}\rangle \neq |\psi_A\rangle \otimes |\psi_B\rangle$, which are called entangled. The system is defined by a density matrix $\rho_{AB}$.
• The two-level system is the basis for the bits 0, 1 of the quantum version of computation.
• In quantum computation the wave function is evolved in time during a calculation, so we have to extract from it the correct result.
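The two-level formulas summarized above can be evaluated directly. The following is a minimal sketch, assuming units with $\hbar = 1$; the function names `two_level_spectrum` and `switch_probability` are hypothetical and introduced only for this example.

```python
import math

HBAR = 1.0  # assumption: natural units

def two_level_spectrum(h11, h22, v):
    """Return (E_minus, E_plus, Omega) for H = [[h11, v], [conj(v), h22]]."""
    delta = (h22 - h11) / 2
    half_gap = math.sqrt(delta ** 2 + abs(v) ** 2)   # equals hbar * Omega / 2
    mean = (h11 + h22) / 2
    return mean - half_gap, mean + half_gap, 2 * half_gap / HBAR

def switch_probability(h11, h22, v, t):
    """p_2(t) = |V|^2 / (2 (Delta^2 + |V|^2)) * (1 - cos(Omega t))."""
    delta = (h22 - h11) / 2
    denom = delta ** 2 + abs(v) ** 2
    if denom == 0:
        return 0.0   # degenerate, uncoupled levels: nothing oscillates
    _, _, omega = two_level_spectrum(h11, h22, v)
    return abs(v) ** 2 / (2 * denom) * (1 - math.cos(omega * t))
```

For degenerate levels ($\Delta = 0$) the switching probability reaches 1 at $\Omega t = \pi$, while for $\Delta \neq 0$ its maximum is reduced to $V^2/(\Delta^2 + V^2)$.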
Further Reading See Preskill’s Caltech Notes on Quantum Entanglement and Computation [12]. For more on quantum computation, see [13]–[15].
Exercises
(1) Using the Pauli matrices $\sigma_i$: $(\sigma_1, \sigma_2, \sigma_3)$, show that we can construct four matrices $\gamma_a$, $a = 1, 2, 3, 4$, as tensor products of Pauli matrices, $\gamma_a = \sigma_i \otimes \sigma_j$, such that $\gamma_a\gamma_b + \gamma_b\gamma_a = 2\delta_{ab}$. Find explicitly an example of such tensor products.
(2) Consider a system with two spin 1/2 electrons in a constant magnetic field parallel to the z direction, $B_z$. Assume there is no other degree of freedom (not even momentum). Solve the Schrödinger equation and find the eigenstates of the system.
(3) Consider the Hamiltonian for a two-level system
$$H = a_0\mathbb{1} + \sum_{i=1}^{3} a_i\sigma_i. \tag{4.68}$$
Calculate its eigenstates and the associated energies.
(4) (Neutrino oscillations). Consider a two-level system with eigenstates of the Hamiltonian $|\psi_1\rangle$ and $|\psi_2\rangle$, of energies $E_1$ and $E_2$, respectively ($E_2 > E_1$), corresponding to the mass (and, of course, momentum) eigenstates of two massive neutrinos. Consider also flavor eigenstates $|\phi_1\rangle = |\nu_\mu\rangle$ and $|\phi_2\rangle = |\nu_\tau\rangle$, rotated by a mixing angle $\theta$ with respect to the mass eigenstates, and an initial muon neutrino eigenstate, $|\psi(t = 0)\rangle = |\nu_\mu\rangle$, of energy $E$. If the neutrinos are ultra-relativistic ($E_1 \gg m_1$, $E_2 \gg m_2$), find the formula for the oscillation probability (the probability of finding the system in the tau neutrino flavor eigenstate) as a function of $\theta$, time, $E$ and $\Delta m^2 \equiv m_2^2 - m_1^2$.
(5) Find the unitary evolution operator corresponding to the previous case, that of neutrino oscillations.
(6) Consider the two-qubit state
$$|\phi\rangle = C\left[|1\rangle \otimes |1\rangle + |0\rangle \otimes |1\rangle + a|1\rangle \otimes |0\rangle + |0\rangle \otimes |0\rangle\right], \tag{4.69}$$
where $C$ is a normalization constant. Find $C$ as a function of $a$. When is the state entangled (at what values of $a$), and when is it not entangled?
(7) Calculate the density matrix of system A for the two-qubit state above (in exercise 6), when the trace is taken over system B, as a function of $a$. Find the maximum and minimum probabilities that system A is in state $|1\rangle$, independently of system B.
5
Position and Momentum and Their Bases; Canonical Quantization, and Free Particles

After analyzing the simplest discrete system, with just two states, in this chapter we move on to the simplest continuum systems, whose Hilbert space is infinite dimensional, with a continuous (uncountable) set of basis states. A classical particle is usually described in phase space, in terms of variables $x$, $p$ which are continuous. In quantum theory, these classical observables are promoted to quantum operators $\hat{X}$, $\hat{P}$, which will have continuous eigenvalues. Each operator will have, according to general theory, a complete set of eigenstates $|x\rangle$, $|p\rangle$ such that
$$\hat{X}|x\rangle = x|x\rangle, \qquad \hat{P}|p\rangle = p|p\rangle. \tag{5.1}$$
Both the basis $\{|x\rangle\}$ and the basis $\{|p\rangle\}$ are orthonormal. But the question is, how does the momentum operator $\hat{P}$ act on the $|x\rangle$ basis?
5.1 Translation Operator

To answer the above question, we first have to define the operator corresponding to translation by an infinitesimal amount $dx$, written as $\hat{T}(dx)$. It is defined as an operator that translates the eigenvalue $x$ by $dx$, so
$$\hat{T}(dx)|x\rangle = |x + dx\rangle. \tag{5.2}$$
Since this translation operator preserves the scalar product, we have
$$\langle x|x'\rangle = \langle x + dx|x' + dx\rangle = \delta(x - x'), \tag{5.3}$$
but on the other hand we obtain
$$\langle x|\hat{T}^\dagger(dx)\hat{T}(dx)|x'\rangle = \langle x|x'\rangle, \tag{5.4}$$
which implies that the translation operator is unitary,
$$\hat{T}^\dagger(dx)\hat{T}(dx) = \hat{1}. \tag{5.5}$$
Moreover, the translation operation obeys composition, so that
$$\hat{T}(dx)\hat{T}(dx') = \hat{T}(dx + dx'), \tag{5.6}$$
and the inverse operator must be
$$\hat{T}^{-1}(dx) = \hat{T}(-dx). \tag{5.7}$$
Finally, translation by zero must give the identity,
$$\hat{T}(dx \to 0) \to \hat{1}. \tag{5.8}$$
In fact, we can construct the action of the translation operator even for a finite translation by $a$ in the coordinate ($x$) representation, that is, on wave functions we have
$$\hat{T}^\dagger_{xx'}(a) = \hat{T}^{-1}_{xx'}(a) = \delta(x - x')\,e^{a\,d/dx}, \tag{5.9}$$
since
$$\langle x|\hat{T}^\dagger(a)|\psi\rangle = \langle x + a|\psi\rangle = \psi(x + a) = \int dx'\,\hat{T}^\dagger_{xx'}(a)\langle x'|\psi\rangle = e^{a\,d/dx}\psi(x). \tag{5.10}$$
But on the other hand, by the Taylor expansion,
$$\psi(x + a) = \psi(x) + a\frac{d}{dx}\psi(x) + \frac{a^2}{2}\frac{d^2}{dx^2}\psi(x) + \cdots = \sum_{n \geq 0}\frac{a^n}{n!}\frac{d^n}{dx^n}\psi(x) \equiv e^{a\,d/dx}\psi(x), \tag{5.11}$$
which proves the above identity. Then, for an infinitesimal translation, we have
$$\hat{T}_{xx'}(dx) \simeq \hat{1} - dx\frac{d}{dx}. \tag{5.12}$$
And, in abstract terms, the unitarity property means that
$$\hat{T}^\dagger(dx) = \hat{T}^{-1}(dx) \;\Rightarrow\; \hat{T}(dx) \simeq \hat{1} - i\,dx\,\hat{K}, \tag{5.13}$$
where $\hat{K}$ is Hermitian ($\hat{K}^\dagger = \hat{K}$). Indeed, in the $x$ representation, we find that
$$\hat{K}_{xx'} = \left(-i\frac{d}{dx}\right)_{xx'}, \tag{5.14}$$
which is indeed Hermitian, as we have proven earlier. Again in abstract terms, we find that
$$\hat{X}\hat{T}(dx')|x\rangle = \hat{X}|x + dx'\rangle = (x + dx')|x + dx'\rangle, \qquad \hat{T}(dx')\hat{X}|x\rangle = \hat{T}(dx')x|x\rangle = x|x + dx'\rangle, \tag{5.15}$$
meaning that in general, as a condition on operators (since it is valid on an orthonormal basis of the Hilbert space), we have to first order in $dx'$
$$[\hat{X}, \hat{T}(dx')] \simeq dx'\,\hat{1} \;\Rightarrow\; [\hat{X}, \hat{K}] = i. \tag{5.16}$$
This is of course satisfied by (5.14), its form in the coordinate representation. To continue, and to see that in fact $\hat{K}$ is $\hat{P}$ up to a real constant, we review a bit of classical mechanics.
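The Taylor-series identity (5.11) behind the finite translation can be checked exactly on polynomials, for which the series terminates. The sketch below is only an illustration; the polynomial representation (a list of coefficients) and the helper names are assumptions made for this example.

```python
import math

def derivative(coeffs):
    """Derivative of the polynomial sum_k coeffs[k] * x**k, as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def translate_by_series(coeffs, a, x):
    """Evaluate sum_n a**n / n! * (d/dx)**n psi at x, i.e. exp(a d/dx) psi(x)."""
    total, term, n = 0.0, list(coeffs), 0
    while term:                      # the derivative list becomes empty eventually
        total += a ** n / math.factorial(n) * evaluate(term, x)
        term = derivative(term)
        n += 1
    return total
```

For example, with `coeffs = [1, -2, 0.5, 3]`, the value `translate_by_series(coeffs, 1.3, 0.7)` agrees with `evaluate(coeffs, 0.7 + 1.3)` up to rounding, as (5.11) requires.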
5.2 Momentum in Classical Mechanics as a Generator of Translations

The formal definition of dynamics in classical mechanics is given in terms of the Hamilton equations on the phase space $(q_i, p_i)$. The Hamilton equations can be written more formally in terms of Poisson brackets $\{,\}_{P.B.}$, defined for functions $f$ and $g$ on the phase space as
$$\{f(q, p), g(q, p)\}_{P.B.} = \sum_i\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right). \tag{5.17}$$
This is an antisymmetric bracket,
$$\{f, g\}_{P.B.} = -\{g, f\}_{P.B.}, \tag{5.18}$$
that vanishes when one of its entries is a (complex) number, i.e., not a function on phase space,
$$\{f(q, p), c\}_{P.B.} = 0, \tag{5.19}$$
and satisfies the Jacobi identity
$$\{\{A, B\}, C\} + \{\{B, C\}, A\} + \{\{C, A\}, B\} = 0. \tag{5.20}$$
In terms of the Poisson brackets (5.17), the Hamilton equations
$$\frac{\partial H}{\partial p_i} = \dot{q}_i, \qquad \frac{\partial H}{\partial q_i} = -\dot{p}_i \tag{5.21}$$
become
$$\dot{q}_i = \{q_i, H\}_{P.B.}, \qquad \dot{p}_i = \{p_i, H\}_{P.B.}. \tag{5.22}$$
For an arbitrary function of phase space and time $f(q, p; t)$, we have the time evolution
$$\frac{df(q, p; t)}{dt} = \sum_i\left(\frac{\partial f}{\partial q_i}\dot{q}_i + \frac{\partial f}{\partial p_i}\dot{p}_i\right) + \frac{\partial f}{\partial t} = \{f, H\}_{P.B.} + \frac{\partial f}{\partial t}. \tag{5.23}$$
This means that the Hamiltonian, through its Poisson brackets, is the generator of time translations (the motion of the system in time). Next, consider an infinitesimal canonical transformation on phase space, from $(q, p)$ to $(Q, P)$, and take the active point of view, which is that this is a transformation that changes points in phase space from $(q, p)$ to $(Q, P) = (q + \delta q, p + \delta p)$. Then we have the generating function of the canonical transformation:
$$F(q, P, t) = qP + \epsilon\,G(q, P, t), \tag{5.24}$$
where $\epsilon$ is an infinitesimal parameter. It generates the canonical transformation through the equations
$$p \equiv \frac{\partial F}{\partial q} = P + \epsilon\frac{\partial G}{\partial q}, \qquad Q \equiv \frac{\partial F}{\partial P} = q + \epsilon\frac{\partial G}{\partial P} \simeq q + \epsilon\frac{\partial G}{\partial p}, \tag{5.25}$$
where in the last equation we used $P \simeq p$, and so the variations are
$$\delta p = -\epsilon\frac{\partial G}{\partial q} = \epsilon\{p, G\}_{P.B.}, \qquad \delta q = \epsilon\frac{\partial G}{\partial p} = \epsilon\{q, G\}_{P.B.}. \tag{5.26}$$
Doing the same thing for the Hamiltonian $H(p, q)$, we obtain
$$\delta H(q, p) = \epsilon\{H(q, p), G\}_{P.B.} - \epsilon\frac{\partial G}{\partial t} = -\epsilon\frac{dG}{dt}. \tag{5.27}$$
That means that if $dG/dt = 0$, we have $\delta H = 0$. This implies that a constant of motion (a function $G(q, p)$ that is independent of time) is a generating function of canonical transformations that leave the Hamiltonian $H$ invariant.
In particular, considering a cyclic coordinate $q_i$, i.e., $\partial H/\partial q_i = 0$, it follows from the Hamilton equations that the corresponding momentum is a constant of motion, $dp_i/dt = 0$. Take this constant of motion as the generating function $G$, i.e.,
$$G(q, p) = p_i. \tag{5.28}$$
Then the canonical transformation gives
$$\delta q_j = \epsilon\,\delta_{ij}, \qquad \delta p_j = 0, \tag{5.29}$$
which means that the momentum $p_i$ is the generator of translations in the coordinate $q_i$ canonically conjugate to it ($\delta_{{\rm can},p_i} q_i = \epsilon$).
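The Poisson bracket and the statement that momentum generates translations can be checked numerically for one degree of freedom. This is a minimal sketch using central finite differences; the function `poisson_bracket`, the step size `h`, and the harmonic-oscillator test Hamiltonian are assumptions made for illustration and do not come from the text.

```python
def poisson_bracket(f, g, q, p, h=1e-5):
    """Central-difference estimate of {f, g} at the phase-space point (q, p)."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

def coordinate(q, p):
    return q

def momentum(q, p):
    return p

def oscillator(q, p):
    """Assumed test Hamiltonian H = p^2/2 + q^2/2 (unit mass and frequency)."""
    return p ** 2 / 2 + q ** 2 / 2

def variation(g, q, p, eps):
    """(delta q, delta p) = eps * ({q, G}, {p, G}) for a generating function G."""
    return (eps * poisson_bracket(coordinate, g, q, p),
            eps * poisson_bracket(momentum, g, q, p))
```

With `g = momentum` the variation is `(eps, 0)`, a pure translation of the coordinate, and with the test Hamiltonian the brackets reproduce the Hamilton equations $\dot{q} = p$, $\dot{p} = -q$.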
5.3 Canonical Quantization

It follows that in the quantum case, where classical observables correspond to operators, we can equate the momentum operator $\hat{P}$ with the translation generator $\hat{K}$, at least up to a multiplicative real constant. In fact, the dimensions of $\hat{K}$ (inverse length) and $\hat{P}$ (momentum) are different, so the constant linking them must have dimensions of momentum times length, or in other words the dimensions of action. One such constant is $\hbar = h/(2\pi)$, so at least up to a number, we have
$$\hat{K} = \frac{\hat{P}}{\hbar}. \tag{5.30}$$
In fact, there is no further multiplicative number. A way to see this is to use de Broglie's assumption on the relation between the wavelength $\lambda$ and the momentum $p$ of a particle,
$$p = \frac{h}{\lambda} \;\Rightarrow\; \frac{p}{\hbar} = k = \frac{2\pi}{\lambda}, \tag{5.31}$$
so $p/\hbar$ is the wave number, and it is natural to associate it with the translation operator (for instance on a lattice). Finally, then, when acting on a wave function $\psi(x) = \langle x|\psi\rangle$, the momentum operator becomes
$$\hat{P}_{xx'} = \left(-i\hbar\frac{d}{dx}\right)_{xx'}. \tag{5.32}$$
In this $x$ representation, we have the commutator
$$[\hat{X}, \hat{P}] = \left[x, -i\hbar\frac{d}{dx}\right] = i\hbar, \tag{5.33}$$
which replaces the classical canonical Poisson bracket formula
$$\{q, p\}_{P.B.} = 1. \tag{5.34}$$
Moreover, the previous analysis of the Hamiltonian applies to any coordinates $q_i$ and the momenta $p_i$ canonically conjugate to them, so we obtain the general canonical quantization conditions
$$[\hat{X}_i, \hat{P}_j] = i\hbar\,\delta_{ij}, \tag{5.35}$$
replacing the classical canonical Poisson brackets
$$\{q_i, p_j\}_{P.B.} = \delta_{ij}. \tag{5.36}$$
In other words, in the general canonical quantization prescription, the Poisson brackets are replaced with the commutator divided by $i\hbar$,
$$\{,\}_{P.B.} \to \frac{1}{i\hbar}[,]. \tag{5.37}$$
The other canonical Poisson brackets are
$$\{q_i, q_j\}_{P.B.} = \{p_i, p_j\}_{P.B.} = 0, \tag{5.38}$$
and under canonical quantization, they become
$$[\hat{X}_i, \hat{X}_j] = [\hat{P}_i, \hat{P}_j] = 0. \tag{5.39}$$
These relations are indeed satisfied: for $i = j$ they are trivial and for $i \neq j$ the variables are independent (do not influence each other), so the corresponding operators commute.
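5.4 Operators in Coordinate and Momentum Spaces

Operators in Coordinate (x) Space

The eigenstates $|x\rangle$ of the coordinate operator $\hat{X}$ are orthonormal, $\langle x|x'\rangle = \delta(x - x')$, and in the coordinate representation we use the completeness relation written in terms of them, $\hat{1} = \int dx\,|x\rangle\langle x|$. Then the matrix element of the operator $\hat{A}$ between the states $|\psi\rangle$ and $|\chi\rangle$ is given by
$$\langle\chi|\hat{A}|\psi\rangle = \int dx\int dx'\,\langle\chi|x\rangle\langle x|\hat{A}|x'\rangle\langle x'|\psi\rangle = \int dx\int dx'\,\chi^*(x)A_{xx'}\psi(x'). \tag{5.40}$$
For an observable corresponding to a function of position, $\hat{A} = \hat{f}(\hat{X})$, we have
$$A_{xx'} = \langle x|\hat{f}(\hat{X})|x'\rangle = f(x')\langle x|x'\rangle = f(x')\delta(x - x'). \tag{5.41}$$
On the other hand, for $\hat{A} = \hat{P}$, we have
$$A_{xx'} = \left(-i\hbar\frac{d}{dx}\right)_{xx'} = -i\hbar\,\delta(x - x')\frac{d}{dx'}, \tag{5.42}$$
as we saw in Chapter 2, so
$$\langle\chi|\hat{P}|\psi\rangle = \int dx\int dx'\,\chi^*(x)\left(-i\hbar\,\delta(x - x')\right)\frac{d}{dx'}\psi(x') = \int dx\,\chi^*(x)\left(-i\hbar\frac{d}{dx}\psi(x)\right). \tag{5.43}$$

Operators in Momentum (p) Space

In momentum space we use momentum eigenstates $|p\rangle$, which are also orthonormal, $\langle p|p'\rangle = \delta(p - p')$, and the identity written as a completeness relation in terms of them, $\hat{1} = \int dp\,|p\rangle\langle p|$. Then the matrix element (5.40) of an operator $\hat{A}$ is
$$\langle\chi|\hat{A}|\psi\rangle = \int dp\int dp'\,\langle\chi|p\rangle\langle p|\hat{A}|p'\rangle\langle p'|\psi\rangle = \int dp\int dp'\,\chi^*(p)A_{pp'}\psi(p'), \tag{5.44}$$
where $\langle p|\psi\rangle = \psi(p)$ is the wave function in momentum space. However, to do this integral we have to calculate the transformation of a wave function from coordinate space to momentum space. Consider the matrix element of $\hat{P}$ between the $\langle x|$ and the $|p\rangle$ states, and introduce the completeness relation in coordinate space, giving
$$\langle x|\hat{P}|p\rangle = \int dx'\,\langle x|\hat{P}|x'\rangle\langle x'|p\rangle = \int dx'\,P_{xx'}\langle x'|p\rangle = \int dx'\left(-i\hbar\,\delta(x - x')\frac{d}{dx'}\right)\langle x'|p\rangle = -i\hbar\frac{d}{dx}\langle x|p\rangle. \tag{5.45}$$
On the other hand, by using $\hat{P}|p\rangle = p|p\rangle$, we also obtain $p\langle x|p\rangle$, so
$$-i\hbar\frac{d}{dx}\langle x|p\rangle = p\langle x|p\rangle, \tag{5.46}$$
with solution
$$\langle x|p\rangle = N e^{ipx/\hbar}. \tag{5.47}$$
The normalization constant $N$ is found from the normalization condition by inserting a completeness relation for $|p\rangle$,
$$\delta(x - x') = \langle x|x'\rangle = \int dp\,\langle x|p\rangle\langle p|x'\rangle. \tag{5.48}$$
Substituting the solution above into this equation, we find
$$\delta(x - x') = N^2\int dp\,e^{ipx/\hbar}e^{-ipx'/\hbar} = N^2\,2\pi\,\delta\!\left(\frac{x - x'}{\hbar}\right) = 2\pi\hbar N^2\,\delta(x - x'), \tag{5.49}$$
meaning that $N = 1/\sqrt{2\pi\hbar}$, and so
$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}e^{ipx/\hbar}. \tag{5.50}$$
That means that the transformation between the $x$ and the $p/\hbar$ basis (the latter having dimensions of inverse length) is just the Fourier transform,
$$\langle x|\psi\rangle = \int dp\,\langle x|p\rangle\langle p|\psi\rangle \;\Rightarrow\; \psi(x) = \frac{1}{\sqrt{2\pi\hbar}}\int dp\,e^{ipx/\hbar}\psi(p), \tag{5.51}$$
and the inverse relation (the inverse Fourier transform) is
$$\langle p|\psi\rangle = \int dx\,\langle p|x\rangle\langle x|\psi\rangle \;\Rightarrow\; \psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int dx\,e^{-ipx/\hbar}\psi(x). \tag{5.52}$$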
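The transformation (5.52) can be tried out numerically on a normalized Gaussian, whose momentum-space wave function is again a Gaussian. The following sketch assumes units with $\hbar = 1$ and a simple trapezoidal rule on a finite interval; the function names and the integration parameters `x_max` and `n` are choices made for this example.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def to_momentum_space(psi_x, p, x_max=20.0, n=4000):
    """Trapezoidal estimate of psi(p) = (2 pi hbar)^(-1/2) * int dx e^(-ipx/hbar) psi(x)."""
    dx = 2 * x_max / n
    total = 0j
    for k in range(n + 1):
        x = -x_max + k * dx
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-1j * p * x / HBAR) * psi_x(x)
    return total * dx / math.sqrt(2 * math.pi * HBAR)

def gaussian_x(x, d=1.0):
    """Normalized Gaussian exp(-x^2 / 2 d^2) / (pi d^2)^(1/4) in coordinate space."""
    return math.exp(-x ** 2 / (2 * d ** 2)) / (math.pi * d ** 2) ** 0.25

def gaussian_p(p, d=1.0):
    """Its analytic transform sqrt(d) exp(-p^2 d^2 / 2 hbar^2) / (pi hbar^2)^(1/4)."""
    return math.sqrt(d) * math.exp(-p ** 2 * d ** 2 / (2 * HBAR ** 2)) / (math.pi * HBAR ** 2) ** 0.25
```

The numerical transform `to_momentum_space(gaussian_x, p)` agrees with `gaussian_p(p)` to high accuracy, illustrating that a state narrow in $x$ (small $d$) is broad in $p$, and vice versa.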
5.5 The Free Nonrelativistic Particle

The Schrödinger equation for an arbitrary time-dependent state is
$$i\hbar\frac{d}{dt}|\psi\rangle = \hat{H}|\psi\rangle. \tag{5.53}$$
But for a free classical particle, $H(q, p) = p^2/2m + 0$, leading to the quantum Hamiltonian
$$\hat{H} = \frac{\hat{P}^2}{2m}. \tag{5.54}$$
We will solve the Schrödinger equation by the general procedure we described: first we consider the eigenvalue problem for the Hamiltonian,
$$\hat{H}|\psi_E\rangle = E|\psi_E\rangle, \tag{5.55}$$
and for the eigenstate we can solve the time evolution exactly,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}|\psi_E(t = 0)\rangle = e^{-iEt/\hbar}|\psi_E\rangle. \tag{5.56}$$
In our case, the Hamiltonian eigenvalue problem is
$$\hat{H}|\psi_E\rangle = \frac{\hat{P}^2}{2m}|\psi_E\rangle = E|\psi_E\rangle \;\Rightarrow\; \left(\frac{\hat{P}^2}{2m} - E\right)|\psi_E\rangle = 0. \tag{5.57}$$
This means that $|\psi_E\rangle = |p\rangle$ is a momentum eigenstate, with $p = \pm\sqrt{2mE}$. That in turn means that we have a degeneracy, with states
$$|E, +\rangle \equiv |p = +\sqrt{2mE}\rangle, \qquad |E, -\rangle \equiv |p = -\sqrt{2mE}\rangle. \tag{5.58}$$
Since the system is degenerate with respect to energy, we have that, in general,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}\left(\alpha|p = +\sqrt{2mE}\rangle + \beta|p = -\sqrt{2mE}\rangle\right). \tag{5.59}$$
Then the coordinate ($x$) space wave function for a given energy is
$$\langle x|\psi_E(t)\rangle = \psi_E(x, t) = \frac{e^{-iEt/\hbar}}{\sqrt{2\pi\hbar}}\left(\alpha e^{+ix\sqrt{2mE}/\hbar} + \beta e^{-ix\sqrt{2mE}/\hbar}\right). \tag{5.60}$$
Choosing a sign for the momentum (and so a state of given momentum), say $p > 0$, so that $\alpha = 1$, $\beta = 0$, means that the probability density of finding the free particle in this state at position $x$ is
$$\frac{dP(x)}{dx} = |\langle x|\psi_E(t)\rangle|^2 = \frac{1}{2\pi\hbar}, \tag{5.61}$$
which is constant. This means that the position $x$ is completely arbitrary once we fix the momentum $p$. We will see the implications of this fact in more detail in the next chapter. Note that the above relation is consistent with the normalization condition
$$1 = \int dx\frac{dP(x)}{dx} = \int dx\,|\langle x|\psi\rangle|^2 = \int dx\,\langle\psi|x\rangle\langle x|\psi\rangle = \langle\psi|\psi\rangle, \tag{5.62}$$
which is therefore satisfied.

We should comment on the fact that we want the variables $x$, $p$ to be approximately classical, that is, defined up to some error. The position is arbitrary in the above discussion because a perfectly defined momentum $p$ is an idealization. In reality, we need to consider so-called wave packets, linear combinations (superpositions) of momentum states with varying $p$. In this case, we can approximately localize the position $x$ around an average value, perhaps with a moving (time-dependent) average value. We will see in the next chapter that if we center the state of given momentum $p$ on some $x$ by adding a multiplicative Gaussian profile at time $t = 0$ then, as $t$ progresses, this wave packet will spread out in space, tending towards the same constant wave function with respect to $x$.
Important Concepts to Remember
• A constant of motion is a generating function of canonical transformations that leave the Hamiltonian invariant, and, in particular, the momentum $p_i$ is the generator of translations in the coordinate $q_i$ canonically conjugate to it.
• The generator of translations is $\hat{K} = \hat{P}/\hbar$, so $\hat{P}_{xx'} = (-i\hbar\,d/dx)_{xx'}$.
• The canonical quantization conditions are $[\hat{X}_i, \hat{P}_j] = i\hbar\,\delta_{ij}$ (and $[\hat{X}_i, \hat{X}_j] = [\hat{P}_i, \hat{P}_j] = 0$), replacing $\{q_i, p_j\}_{P.B.} = \delta_{ij}$ (and $\{q_i, q_j\}_{P.B.} = \{p_i, p_j\}_{P.B.} = 0$) in classical mechanics; so canonical quantization means that we have the replacement $\{,\}_{P.B.} \to (1/i\hbar)[,]$.
• In $x$ space, $\langle x|f(\hat{X})|x'\rangle = f(x')\delta(x - x')$ and $\langle\chi|\hat{P}|\psi\rangle = \int dx\,\chi^*(x)(-i\hbar\,d/dx)\psi(x)$.
• The product of coordinate and momentum space basis elements is $\langle x|p\rangle = e^{ipx/\hbar}/\sqrt{2\pi\hbar}$, so the wave function in momentum space is the Fourier transform $\psi(p) = (1/\sqrt{2\pi\hbar})\int dx\,e^{-ipx/\hbar}\psi(x)$.
• For a free particle, the time evolution is $|\psi_E(t)\rangle = e^{-iEt/\hbar}|\psi_E\rangle$, with $E|\psi_E\rangle = (\hat{P}^2/2m)|\psi_E\rangle$, and there is a degeneracy with respect to momentum, $|E, +\rangle = |p = +\sqrt{2mE}\rangle$, $|E, -\rangle = |p = -\sqrt{2mE}\rangle$.
• The state of perfectly defined momentum $p$ is an idealization. In reality we have wave packets, states given as linear combinations of momentum states, peaked around a given classical momentum.
Further Reading See any other fairly advanced quantum mechanics book, e.g., [2, 1, 4, 3].
Exercises
(1) Consider possible terms in the (quantum) Hamiltonian for a one-dimensional system,
$$H_1 = \int dx\,\psi^*(x)\,e^{a\,d/dx}\,\psi(x), \qquad H_2 = \int dx\,\psi^*(x)\frac{1}{d^2/dx^2}\psi(x), \tag{5.63}$$
where $\psi(x)$ is a continuous complex function of the spatial variable $x$, and $1/(d^2/dx^2)$ is understood as the inverse of an operator, acting on the function. Can these terms be understood to be coming from local interactions in $x$ (interactions defined at a single point $x$), and if so, why? Use a calculation to explain your reasoning.
(2) Consider the angular momentum of a classical particle, $\vec{L} = \vec{r} \times \vec{p}$, and a system for which it is invariant. Use it as a generating function to define the relevant canonical transformations.
(3) Consider a system with Lagrangian
$$L = \frac{\dot{q}_1^2}{2} + \dot{q}_1\dot{q}_2 + \frac{\dot{q}_2^2}{2} - k(q_1 + q_2). \tag{5.64}$$
Write down the Hamiltonian and Poisson brackets, and canonically quantize the system.
(4) Consider a system of $N$ free particles in one spatial dimension $x$. Write down the general eigenstate and eigenenergy, its degeneracy and time evolution.
(5) Consider a system with Hamiltonian
$$H = \frac{p^2}{2m} + \alpha p + \beta x^2 + \gamma x^3. \tag{5.65}$$
Write down its Schrödinger equation for wave functions in coordinate space. If $\beta = \gamma = 0$, write down its general eigenstate wave function and time evolution, for a given energy $E$.
(6) Consider a free particle in three spatial dimensions, and a system of coordinates relative to a point O not on the path of the particle. Write down the integrals of motion conjugate to Cartesian, polar, and spherical coordinates respectively, and the canonical quantization for each set of coordinates.
(7) Considering the free particle in three spatial dimensions from exercise 6, what is the probability density of finding a particle at a point $\vec{r}$, given a momentum $\vec{p}$?
6
The Heisenberg Uncertainty Principle and Relations, and Gaussian Wave Packets

In this chapter, following the explicit analysis of Gaussian wave packets, we will see that the product of the variances of canonically conjugate variables has a nonzero value. Then we will prove a theorem saying that such products in general obey a lower bound, called the Heisenberg uncertainty relation(s). Moreover, we will then show that the minimum is attained in the case of the same Gaussian wave packets.
6.1 Gaussian Wave Packets

As we said in the previous chapter, free particles of a given momentum are an abstraction, and moreover imply a coordinate that is completely arbitrary. In a realistic situation, a superposition of waves, or wave packet, represents a real particle, as in Fig. 6.1. The quintessential example is the Gaussian wave packet at $t = 0$, where we multiply a free wave of given momentum by a Gaussian in position space, centered on $x = 0$ and with width $\sigma = d$, so that
$$\langle x|\psi_p(t = 0)\rangle = N e^{ipx/\hbar}e^{-x^2/2\sigma^2}. \tag{6.1}$$
The probability density is a pure Gaussian (since the wave multiplying the Gaussian has a constant probability density):
$$\frac{dP}{dx} = |\langle x|\psi\rangle|^2 = N^2 e^{-x^2/d^2}. \tag{6.2}$$
Imposing normalization of the probability, $\int dx\,dP/dx = 1$, we obtain
$$N^2\int_{-\infty}^{+\infty}dx\,e^{-x^2/d^2} = N^2 d\sqrt{\pi} = 1 \;\Rightarrow\; N = \frac{1}{(\pi d^2)^{1/4}}, \tag{6.3}$$
so the Gaussian wave packet at $t = 0$ is
$$\langle x|\psi_p(t = 0)\rangle = \langle x|p\rangle\,e^{-x^2/2d^2}\frac{\sqrt{2\pi\hbar}}{(\pi d^2)^{1/4}}. \tag{6.4}$$
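Now we calculate the averages of the $x$, $x^2$, $p$, $p^2$ operators in this Gaussian wave packet state. First,
$$\langle x\rangle \equiv \langle\psi|\hat{X}|\psi\rangle = \int dx\,\langle\psi|x\rangle\langle x|\hat{X}|\psi\rangle = \int_{-\infty}^{+\infty}dx\,x|\psi(x)|^2 = 0, \tag{6.5}$$
by the symmetry in $x$ of the integral.

Figure 6.1: A wave packet, representing a particle. (a) A static wave packet, centered on a position in x space. (b) A time-dependent wave packet, where the central position in x moves with time.

The average of $x^2$ is nonzero, however:
$$\langle x^2\rangle \equiv \langle\psi|\hat{X}^2|\psi\rangle = \int_{-\infty}^{+\infty}dx\,x^2|\psi(x)|^2 = \int_{-\infty}^{+\infty}dx\,\frac{x^2}{d\sqrt{\pi}}e^{-x^2/d^2} = \frac{d^2}{\sqrt{\pi}}\Gamma(3/2) = \frac{d^2}{2}. \tag{6.6}$$
Next, the average of $p$ is also nonzero,
$$\langle p\rangle \equiv \langle\psi|\hat{P}|\psi\rangle = \int dx\int dx'\,\langle\psi|x\rangle\langle x|\hat{P}|x'\rangle\langle x'|\psi\rangle = \int dx\int dx'\,\psi^*(x)\left(-i\hbar\,\delta(x - x')\frac{d}{dx'}\right)\psi(x') = -i\hbar\int_{-\infty}^{+\infty}dx\,\psi^*(x)\frac{d}{dx}\psi(x) = \int_{-\infty}^{+\infty}dx\,|\psi(x)|^2\left(p + \frac{i\hbar x}{d^2}\right) = p. \tag{6.7}$$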
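And, finally, the average of $p^2$ is
$$\langle p^2\rangle \equiv \langle\psi|\hat{P}^2|\psi\rangle = (-i\hbar)^2\int dx\,\psi^*(x)\frac{d^2}{dx^2}\psi(x) = (-i\hbar)^2\int dx\,|\psi(x)|^2\left[-\frac{1}{d^2} + \frac{1}{(-i\hbar)^2}\left(p + \frac{i\hbar x}{d^2}\right)^2\right] = p^2 + \frac{\hbar^2}{d^2} - \frac{\hbar^2}{d^4}\langle x^2\rangle = p^2 + \frac{\hbar^2}{2d^2}. \tag{6.8}$$
Now we can calculate the deviations in $x$ and $p$,
$$(\Delta x)^2 = \langle(x - \langle x\rangle)^2\rangle = \langle x^2\rangle - \langle x\rangle^2 = \frac{d^2}{2}, \qquad (\Delta p)^2 = \langle(p - \langle p\rangle)^2\rangle = \langle p^2\rangle - \langle p\rangle^2 = \frac{\hbar^2}{2d^2}. \tag{6.9}$$
Multiplying them, we obtain
$$(\Delta x)^2(\Delta p)^2 = \frac{\hbar^2}{4}. \tag{6.10}$$
We will see that this actually saturates the lower bound coming from the general Heisenberg uncertainty relations.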
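The averages (6.5) to (6.10) can be reproduced by direct numerical integration. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, a trapezoidal rule on the interval $[-12d, 12d]$, and $\langle p^2\rangle$ computed as $\hbar^2\int dx\,|d\psi/dx|^2$, which follows from (6.8) by an integration by parts with vanishing boundary terms. The function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def packet(x, p0, d):
    """Gaussian wave packet (6.1) with N = (pi d^2)^(-1/4)."""
    norm = (math.pi * d ** 2) ** -0.25
    return norm * cmath.exp(1j * p0 * x / HBAR) * math.exp(-x ** 2 / (2 * d ** 2))

def packet_derivative(x, p0, d):
    """Analytic derivative d(psi)/dx = (i p0 / hbar - x / d^2) psi."""
    return (1j * p0 / HBAR - x / d ** 2) * packet(x, p0, d)

def integrate(f, x_max, n=6000):
    """Trapezoidal rule for a real integrand on [-x_max, x_max]."""
    dx = 2 * x_max / n
    total = 0.5 * (f(-x_max) + f(x_max))
    for k in range(1, n):
        total += f(-x_max + k * dx)
    return total * dx

def variances(p0, d):
    """Return ((Delta x)^2, (Delta p)^2) for the packet."""
    x_max = 12 * d
    density = lambda x: abs(packet(x, p0, d)) ** 2
    mean_x = integrate(lambda x: x * density(x), x_max)
    mean_x2 = integrate(lambda x: x ** 2 * density(x), x_max)
    mean_p = integrate(
        lambda x: (packet(x, p0, d).conjugate() * (-1j * HBAR)
                   * packet_derivative(x, p0, d)).real, x_max)
    mean_p2 = HBAR ** 2 * integrate(
        lambda x: abs(packet_derivative(x, p0, d)) ** 2, x_max)
    return mean_x2 - mean_x ** 2, mean_p2 - mean_p ** 2
```

For any choice of `p0` and `d`, the two returned values are close to $d^2/2$ and $\hbar^2/2d^2$, and their product is close to $\hbar^2/4$, independently of both parameters.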
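For completeness, we will write the Gaussian wave packet with momentum $p$ in momentum space (we set $a = i(p - p')d^2/\hbar$):
$$\langle p'|\psi_p\rangle = \psi_p(p') = \int dx\,\langle p'|x\rangle\langle x|\psi_p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\frac{1}{(\pi d^2)^{1/4}}\int_{-\infty}^{+\infty}dx\,e^{-ip'x/\hbar}\exp\left(+i\frac{px}{\hbar} - \frac{x^2}{2d^2}\right) = \frac{e^{+a^2/2d^2}}{\sqrt{2\pi\hbar}\,(\pi d^2)^{1/4}}\int_{-\infty}^{+\infty}dx\,e^{-(x - a)^2/2d^2} = \frac{\sqrt{d}}{(\pi\hbar^2)^{1/4}}e^{-(p - p')^2d^2/2\hbar^2}. \tag{6.11}$$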
6.2 Time Evolution of Gaussian Wave Packet

We have defined the Gaussian wave packet with momentum $p_0$ at $t = 0$, but in order to find its time evolution, we need to remember the time evolution of states of definite momentum,
$$|\psi_E(t)\rangle = e^{-iEt/\hbar}|p = \sqrt{2mE}\rangle, \tag{6.12}$$
which we could also write in the coordinate representation. But we need to calculate the time evolution for the more general wave function that we have. In order to do this, we use the evolution operator formalism, where
$$|\psi(t)\rangle = \hat{U}(t)|\psi(t = 0)\rangle \tag{6.13}$$
is seen to imply (given the relation (6.12), which defines the time evolution for $|p\rangle$ states, with $p = \sqrt{2mE}$) that
$$\hat{U}(t) = \int_{-\infty}^{+\infty}dp\,|p\rangle\langle p|\,e^{-iE(p)t/\hbar} = \int_{-\infty}^{+\infty}dp\,|p\rangle\langle p|\,e^{-ip^2t/2m\hbar}. \tag{6.14}$$
The matrix elements of this operator in the coordinate basis $|x\rangle$ are
$$U(x, x'; t) \equiv \langle x|\hat{U}(t)|x'\rangle = \int_{-\infty}^{+\infty}dp\,\langle x|p\rangle\langle p|x'\rangle\,e^{-ip^2t/2m\hbar} = \int dp\,\frac{e^{ip(x - x')/\hbar}}{2\pi\hbar}e^{-ip^2t/2m\hbar} = \sqrt{\frac{m}{2\pi i\hbar t}}\,e^{im(x - x')^2/2\hbar t}. \tag{6.15}$$
Taking the relation (6.13) in the coordinate representation, i.e., multiplying with $\langle x|$ and then introducing a completeness relation $\int dx'\,|x'\rangle\langle x'|$ on the right of the evolution operator, we get
$$\psi(x, t) = \int dx'\,U(x, x'; t)\,\psi(x', t = 0), \tag{6.16}$$
with $U(x, x'; t)$ as above and $\psi(x', t = 0)$ the Gaussian wave packet from the previous section:
$$\psi(x', t = 0) = \frac{e^{ip_0x'/\hbar}e^{-x'^2/2d^2}}{(\pi d^2)^{1/4}}. \tag{6.17}$$
On performing the Gaussian $x'$ integral, the result is
$$\langle x|\psi(t)\rangle = \frac{1}{\sqrt{\sqrt{\pi}\,(d + i\hbar t/md)}}\exp\left[-\frac{(x - p_0t/m)^2}{2d^2(1 + i\hbar t/md^2)}\right]\exp\left[\frac{ip_0}{\hbar}\left(x - \frac{p_0t}{2m}\right)\right], \tag{6.18}$$
which implies the probability density
$$\rho(x, t) \equiv \frac{dP(x, t)}{dx} = |\psi(x, t)|^2 = \frac{1}{\sqrt{\pi}\sqrt{d^2 + \hbar^2t^2/m^2d^2}}\exp\left[-\frac{(x - p_0t/m)^2}{d^2 + \hbar^2t^2/m^2d^2}\right]. \tag{6.19}$$
We observe two properties of this result:
• the mean value moves with momentum $p_0$ (velocity $p_0/m$), as the density is a function of $x - p_0t/m$;
• the width $\Delta x(t)$ of the Gaussian grows in time as
$$\Delta x = \frac{d}{\sqrt{2}}\sqrt{1 + \frac{\hbar^2t^2}{m^2d^4}}. \tag{6.20}$$
That is, the initial wave function spreads out until it becomes more like the wave associated with a free particle of given momentum.
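To get a feeling for the time scale of the spreading in (6.20), one can evaluate the width for an electron. The sketch below is only an illustration: the SI values of $\hbar$ and the electron mass are rounded constants inserted for this example, the initial width of 1 Å is an arbitrary choice, and the function names are hypothetical. The doubling time follows from setting $\hbar t/md^2 = \sqrt{3}$ in (6.20).

```python
import math

HBAR = 1.054571817e-34           # J s (assumed SI value)
ELECTRON_MASS = 9.1093837015e-31  # kg (assumed SI value)

def width(t, d, m):
    """Delta x(t) = d / sqrt(2) * sqrt(1 + (hbar t / (m d^2))^2), as in (6.20)."""
    return d / math.sqrt(2) * math.sqrt(1 + (HBAR * t / (m * d ** 2)) ** 2)

def doubling_time(d, m):
    """Time at which Delta x(t) = 2 Delta x(0), i.e. hbar t / (m d^2) = sqrt(3)."""
    return math.sqrt(3) * m * d ** 2 / HBAR

t_double = doubling_time(1e-10, ELECTRON_MASS)  # electron packet with d = 1 angstrom
```

For an electron localized to about 1 Å, `t_double` is of order $10^{-16}$ s, whereas for a macroscopic mass the same formula gives an enormous time, which is why the spreading is unobservable for classical objects.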
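6.3 Heisenberg Uncertainty Relations

We have seen that $\Delta x$ is correlated with $\Delta p$, and also that $[\hat{X}, \hat{P}] = i\hbar$. But how general is this correlation, in the case where two operators do not commute, $[\hat{A}, \hat{B}] \neq 0$? In the case where operators do commute, which means that the related observables are compatible, the states have simultaneous eigenvalues,
$$\hat{A}\hat{B}|a, b\rangle = \hat{A}b|a, b\rangle = ab|a, b\rangle = \hat{B}a|a, b\rangle = \hat{B}\hat{A}|a, b\rangle. \tag{6.21}$$
In the case where they do not commute, this would be impossible (we would obtain a contradiction from the above relation). For instance, $\hat{X}$ and $\hat{P}$ are incompatible and, as we saw, if we localize the momentum ($p$ is given), then space is completely delocalized ($x$ is arbitrary), and moreover if $\Delta x \neq 0$, we also have $\Delta p \neq 0$. In fact this is part of a general theory of noncommuting operators. We have the following:

Theorem If $[\hat{A}, \hat{B}] \neq 0$ and $\hat{A}$, $\hat{B}$ are associated with observables, so that they are Hermitian operators, then we have
$$(\Delta A)^2(\Delta B)^2 \geq \frac{1}{4}\left|\langle[\hat{A}, \hat{B}]\rangle\right|^2. \tag{6.22}$$
These relations are known as Heisenberg's uncertainty relations.

Proof First, note that if $[\hat{A}, \hat{B}] = i\hat{C}$ then $\hat{C}$ is Hermitian, $\hat{C} = \hat{C}^\dagger$. Indeed,
$$(i\hat{C})^\dagger = -i\hat{C}^\dagger = ([\hat{A}, \hat{B}])^\dagger = [\hat{B}^\dagger, \hat{A}^\dagger] = [\hat{B}, \hat{A}] = -[\hat{A}, \hat{B}] = -i\hat{C}. \tag{6.23}$$
Also, if $\{\hat{A}, \hat{B}\} = \hat{D}$ then $\hat{D}$ is Hermitian, $\hat{D}^\dagger = \hat{D}$:
$$\hat{D}^\dagger = (\{\hat{A}, \hat{B}\})^\dagger = \{\hat{B}^\dagger, \hat{A}^\dagger\} = \{\hat{B}, \hat{A}\} = \{\hat{A}, \hat{B}\} = \hat{D}. \tag{6.24}$$
The quantities in the Heisenberg uncertainty relations are defined as
$$\langle[\hat{A}, \hat{B}]\rangle \equiv \langle\psi|(\hat{A}\hat{B} - \hat{B}\hat{A})|\psi\rangle, \qquad (\Delta A)^2 \equiv \langle\psi|(\hat{A} - \langle A\rangle)^2|\psi\rangle, \qquad (\Delta B)^2 \equiv \langle\psi|(\hat{B} - \langle B\rangle)^2|\psi\rangle. \tag{6.25}$$
Then note that if $\hat{A}$, $\hat{B}$ are Hermitian, $\hat{A} - \langle A\rangle$ and $\hat{B} - \langle B\rangle$ are also Hermitian. Since $\hat{A}$, $\hat{B}$ are operators acting on a Hilbert space, consider the states $|\phi\rangle$, $|\chi\rangle$ in this Hilbert space, which, as we saw in Chapters 1 and 2, obey the Cauchy–Schwarz inequality,
$$|\langle\phi|\chi\rangle|^2 \leq \|\phi\|^2\|\chi\|^2 = \langle\phi|\phi\rangle\langle\chi|\chi\rangle, \tag{6.26}$$
with equality only if $|\chi\rangle \propto |\phi\rangle$. Now consider the states
$$|\phi\rangle = (\hat{A} - \langle A\rangle)|\psi\rangle, \qquad |\chi\rangle = (\hat{B} - \langle B\rangle)|\psi\rangle. \tag{6.27}$$
The norms of these states are (using the fact that $\hat{A} - \langle A\rangle$ and $\hat{B} - \langle B\rangle$ are Hermitian)
$$\|\phi\|^2 = \langle\psi|(\hat{A} - \langle A\rangle)^2|\psi\rangle = (\Delta A)^2, \qquad \|\chi\|^2 = \langle\psi|(\hat{B} - \langle B\rangle)^2|\psi\rangle = (\Delta B)^2. \tag{6.28}$$
Moreover, also using the Hermitian nature of $\hat{A} - \langle A\rangle$ and $\hat{B} - \langle B\rangle$, we find
$$|\langle\phi|\chi\rangle|^2 = |\langle\psi|(\hat{A} - \langle A\rangle)(\hat{B} - \langle B\rangle)|\psi\rangle|^2. \tag{6.29}$$
But then
$$(\hat{A} - \langle A\rangle)(\hat{B} - \langle B\rangle) = \frac{1}{2}[\hat{A} - \langle A\rangle, \hat{B} - \langle B\rangle] + \frac{1}{2}\{\hat{A} - \langle A\rangle, \hat{B} - \langle B\rangle\} = \frac{1}{2}[\hat{A}, \hat{B}] + \frac{1}{2}\{\hat{A} - \langle A\rangle, \hat{B} - \langle B\rangle\}, \tag{6.30}$$
and, as we saw, $[\hat{A}, \hat{B}] = i\hat{C}$ with $\hat{C}^\dagger = \hat{C}$, while the anticommutator above is a Hermitian operator. The expectation values of these two terms are therefore purely imaginary and purely real, respectively (remember that the expectation value of a Hermitian operator is real), so the modulus squared equals the real (Hermitian) part squared plus the imaginary (anti-Hermitian) part squared. In conclusion, writing $\Delta\hat{A} \equiv \hat{A} - \langle A\rangle$ and $\Delta\hat{B} \equiv \hat{B} - \langle B\rangle$, the Cauchy–Schwarz inequality becomes
$$(\Delta A)^2(\Delta B)^2 \geq |\langle\psi|\Delta\hat{A}\Delta\hat{B}|\psi\rangle|^2 = \left|\frac{1}{2}\langle\psi|[\hat{A}, \hat{B}]|\psi\rangle\right|^2 + \left|\frac{1}{2}\langle\psi|\{\Delta\hat{A}, \Delta\hat{B}\}|\psi\rangle\right|^2 \geq \frac{1}{4}\left|\langle\psi|[\hat{A}, \hat{B}]|\psi\rangle\right|^2. \tag{6.31}$$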
q.e.d.

We have thus proven the Heisenberg uncertainty relations in the general case. Specializing to the case of canonically conjugate operators, $\hat{A} = \hat{q}_i$ and $\hat{B} = \hat{p}_i$, we obtain first
$$[\hat{A}, \hat{B}] = [\hat{q}_i, \hat{p}_i] = i\hbar, \tag{6.32}$$
and, substituting in the Heisenberg uncertainty relations,
$$(\Delta q_i)^2(\Delta p_i)^2 \geq \frac{\hbar^2}{4}, \tag{6.33}$$
or, more simply,
$$\Delta q_i\,\Delta p_i \geq \frac{\hbar}{2}, \tag{6.34}$$
in terms of the standard deviations of $q_i$, $p_i$. For the $x$, $p$ observables themselves (rather than general canonically conjugate variables), this is the original form of Heisenberg's uncertainty relations. We also note the situation when we have equality in the uncertainty relations (i.e., when we “saturate” the inequality). There were two inequalities used, and each comes with its own condition for equality, so both must hold:
(1) For the first inequality, equality arises when $|\chi\rangle \propto |\phi\rangle$, so that
$$(\hat{B} - \langle B\rangle)|\psi\rangle = c\,(\hat{A} - \langle A\rangle)|\psi\rangle. \tag{6.35}$$
(2) For the second inequality, equality arises when the (average of the) anticommutator vanishes, so that
$$\langle\psi|\{\hat{A} - \langle A\rangle, \hat{B} - \langle B\rangle\}|\psi\rangle = 0. \tag{6.36}$$
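The general inequality (6.22) can be spot-checked on the simplest noncommuting observables, the Pauli matrices of the two-level system. The sketch below is a minimal illustration with hand-written 2×2 matrix helpers; the function names and the particular test states are assumptions made for this example.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(op, state):
    """<psi| op |psi> for a normalized two-component state."""
    return sum(state[i].conjugate() * op[i][j] * state[j]
               for i in range(2) for j in range(2))

def variance(op, state):
    """(Delta A)^2 = <A^2> - <A>^2 for a Hermitian operator."""
    mean = expectation(op, state).real
    return expectation(matmul(op, op), state).real - mean ** 2

def uncertainty_sides(a, b, state):
    """Return ((Delta A)^2 (Delta B)^2, |<[A, B]>|^2 / 4)."""
    ab, ba = matmul(a, b), matmul(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    lhs = variance(a, state) * variance(b, state)
    rhs = 0.25 * abs(expectation(comm, state)) ** 2
    return lhs, rhs

SPIN_UP = [1, 0]
TILTED = [math.cos(0.3), cmath.exp(0.7j) * math.sin(0.3)]
```

For `SPIN_UP` both sides of the inequality equal 1, so the bound is saturated, while for the `TILTED` state the left-hand side is strictly larger because the anticommutator term in (6.31) no longer vanishes.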
6.4 Minimum Uncertainty Wave Packet

We now find the wave packet for the free particle that saturates the standard inequality. Apply the Heisenberg uncertainty relations for $\hat{A} = \hat{X}$, $\hat{B} = \hat{P}$. Then equality arises when
(1)
$$(\hat{P} - \langle p\rangle)|\psi\rangle = c\,(\hat{X} - \langle x\rangle)|\psi\rangle, \tag{6.37}$$
(2)
$$\langle\psi|\left[(\hat{P} - \langle p\rangle)(\hat{X} - \langle x\rangle) + (\hat{X} - \langle x\rangle)(\hat{P} - \langle p\rangle)\right]|\psi\rangle = 0. \tag{6.38}$$
Applying these conditions in coordinate space, i.e., by acting with $\langle x|$ from the left on the conditions, we find from (1) that
$$\left(-i\hbar\frac{d}{dx} - \langle p\rangle\right)\psi(x) = c\,(x - \langle x\rangle)\psi(x) \;\Rightarrow\; \frac{d}{dx}\ln\psi(x) = \frac{i}{\hbar}\left[\langle p\rangle + c\,(x - \langle x\rangle)\right], \tag{6.39}$$
with the solution
$$\psi(x) = \psi(x = 0)\,e^{i\langle p\rangle x/\hbar}\exp\left[\frac{i\,c\,(x - \langle x\rangle)^2}{2\hbar}\right]. \tag{6.40}$$
Then, multiplying condition (1) with $\langle\psi|(\hat{X} - \langle x\rangle)$, considering the Hermitian conjugate of the resulting relation, summing the equation and its conjugate, and then using condition (2), we finally obtain
$$(c + c^*)\langle\psi|(\hat{X} - \langle x\rangle)^2|\psi\rangle = 0. \tag{6.41}$$
Since the expectation value cannot be zero independently of $|\psi\rangle$, we must have a purely imaginary coefficient,
$$c = i|c|, \tag{6.42}$$
which means that finally the wave function is
$$\psi(x) = \psi(x = 0)\,e^{i\langle p\rangle x/\hbar}\exp\left[-\frac{|c|}{2\hbar}(x - \langle x\rangle)^2\right]. \tag{6.43}$$
This is the Gaussian wave packet we used before (in which case, indeed, we found saturation of the Heisenberg uncertainty relations), with Gaussian variance
$$d^2 = \frac{\hbar}{|c|}. \tag{6.44}$$
6.5 Energy–Time Uncertainty Relation

We have found an $(x, p)$ uncertainty relation, but we can also have an $(E, t)$ (energy versus time) uncertainty relation,
$$\Delta E\,\Delta t \geq \frac{\hbar}{2}. \tag{6.45}$$
Since time $t$ does not correspond to an operator in quantum theory, this relation cannot strictly speaking be derived in the same way as the $(x, p)$ relation. However, we can argue for it because:
• Relativistically (i.e., in a relativistic theory), $E$ and $t$ are the zero components of the 4-vectors $p^\mu$ and $x^\mu$, so $\Delta E\,\Delta t$ is $\Delta p^0\,\Delta x^0$, meaning that by relativistic invariance we must have (6.45).
• We also know that $E$ is the eigenvalue (observable value) for the quantum Hamiltonian $\hat{H}$, which as we saw generates evolution in time $t$. But we also saw that the canonically conjugate momentum $\hat{P}$ generates translations in $x$, so the same Heisenberg uncertainty relation should be valid for the pair $(E, t)$.

The meaning of the uncertainty relation is however the same: if for instance we know the energy of a system with infinite precision, such as in the case when a system is confined to a discrete energy state (without transitions: say the state is a ground state), then the time it spends there is infinite, so $\Delta t = \infty$. Conversely, at a given time the quantum energy is arbitrary (cannot be determined). Another way to apply this relation is to say that, for an unstable particle, the lifetime $\Delta t$ of the particle and its uncertainty in energy $\Delta E$ are related by the uncertainty relation. This is what happens for instance for virtual particles (particles created from the vacuum for a short time) in a quantum theory: if the particles exist for a time $\Delta t$, then their energy is not well defined but has uncertainty $\Delta E$.
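As a rough numerical illustration of (6.45) applied to an unstable particle, the sketch below converts a lifetime into the minimum energy spread $\hbar/(2\Delta t)$. The value of $\hbar$ in eV s is a rounded constant inserted for this example, the lifetime of $2.2 \times 10^{-6}$ s (roughly that of the muon) is an illustrative input, and the function name is hypothetical.

```python
HBAR_EV_S = 6.582119569e-16  # hbar in eV s (assumed rounded value)

def min_energy_spread(lifetime_s):
    """Smallest Delta E (in eV) allowed by Delta E * Delta t >= hbar / 2."""
    return HBAR_EV_S / (2 * lifetime_s)

spread_ev = min_energy_spread(2.2e-6)  # about 1.5e-10 eV
```

The shorter the lifetime, the larger the energy spread: halving `lifetime_s` doubles the returned value.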
Important Concepts to Remember

• In a Gaussian wave packet in x space, at t = 0, $\psi(x, t=0) = N e^{ipx/\hbar} e^{-x^2/(2\sigma^2)}$ and we have $\Delta x^2\,\Delta p^2 = \hbar^2/4$.
• A Gaussian wave packet travels with momentum $p_0$, so the x dependence of the probability density is encoded in a function of $x - t p_0/m$ (but there is an extra t dependence), and the wave packet spreads out in time, with $\Delta x = (d/\sqrt{2})\sqrt{1 + \hbar^2 t^2/(m^2 d^4)}$.
• Heisenberg's uncertainty relation for incompatible operators ($[\hat A, \hat B] \neq 0$) comes from $(\Delta A)^2 (\Delta B)^2 \geq \frac{1}{4}|\langle[\hat A, \hat B]\rangle|^2$.
• The standard form of Heisenberg's uncertainty relation is $\Delta q_i\,\Delta p_i \geq \hbar/2$.
• A Gaussian wave packet minimizes Heisenberg's uncertainty relation.
• The energy–time uncertainty relation, $\Delta E\,\Delta t \geq \hbar/2$, is not derived as above, and it applies either to errors or to an energy difference and time of decay.
Further Reading See any other book on quantum mechanics, for instance [2] or [1].
Exercises

(1) Consider a Gaussian wave packet in momentum space, equation (6.11). Calculate $(\Delta x)^2$ and $(\Delta p)^2$ in this p space, and check again the saturation of Heisenberg's uncertainty relations.
(2) Do the integral for the Gaussian wave packet with evolution operator, to prove (6.18), and calculate $\Delta x(t)$ for it, to prove (6.20).
(3) Can we measure simultaneously the momentum and the angular momentum in three spatial dimensions and, if not, what are the Heisenberg uncertainty relations corresponding to them?
(4) Consider a system with Hamiltonian
$$H = \frac{p^2}{2m} + \alpha p x. \tag{6.46}$$
Can we measure simultaneously the energy and the momentum of the system?
(5) Consider two energy levels of an atomic system, $E_1 = E_*$ and $E_2 = E_* + 0.5$ eV. What is the minimum possible decay time from $E_2$ to $E_1$?
(6) Consider a superposition of two Gaussian wave packets with the same momentum $p_0$, initially (at time $t_0$) at the same position $x_0$, but with different variances, $\sigma_1 = d_1$ and $\sigma_2 = d_2$. Calculate $(\Delta x)^2$ and $(\Delta p)^2$, and check the Heisenberg uncertainty relation, at time $t_0$.
(7) For the situation in exercise 6, calculate the time dependence of the uncertainty in position, $\Delta x(t)$.
7
One-Dimensional Problems in a Potential V(x)
After analyzing two-state systems and free particles and wave packets, which are the simplest systems in the discrete-Hilbert-space and continuous-Hilbert-space cases, we consider the next simplest systems: particles in one dimension, with a standard nonrelativistic kinetic term and with a potential V (x).
7.1 Set-Up of the Problem

We want to solve the Schrödinger equation in position space, so we multiply by $\langle x|$ the equation
$$i\hbar\,\partial_t|\psi(t)\rangle = \hat H|\psi(t)\rangle, \tag{7.1}$$
where the quantum Hamiltonian is
$$\hat H = \frac{\hat P^2}{2m} + \hat V(\hat x). \tag{7.2}$$
We note that $\hat P$ and $\hat x$ are Hermitian operators, $\hat P^\dagger = \hat P$, $\hat x^\dagger = \hat x$, and so is $\hat V(\hat x)$, and therefore the Hamiltonian $\hat H$. However, note that this is actually not enough: one also needs to have the same domain for $\hat H$ and $\hat H^\dagger$, not just for them to act in the same way (which they do, since $\hat P$ and $\hat x$ do). This is a subtlety which is irrelevant in most cases, but implies some interesting counterexamples.

The kinetic and potential terms act as follows (note that $\hat P^2$ is the matrix product of two $\hat P$s, and involves a sum or integral over the middle indices, turning two delta functions into a single one):
$$\langle x|\frac{\hat P^2}{2m}|x'\rangle = \delta(x - x')\,\frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial x}\right)^2, \qquad \langle x|\hat V(\hat x)|x'\rangle = \delta(x - x')\,V(x). \tag{7.3}$$
Then the Schrödinger equation in coordinate space becomes (introducing a completeness relation via $\hat 1 = \int dx'\,|x'\rangle\langle x'|$)
$$i\hbar\,\partial_t\langle x|\psi(t)\rangle = \int dx'\,\langle x|\hat H|x'\rangle\langle x'|\psi(t)\rangle \;\Rightarrow\; i\hbar\,\partial_t\psi(x,t) = \int dx'\,H_{xx'}\,\psi(x',t). \tag{7.4}$$
Substituting the matrix elements of the kinetic and potential operators, and doing the $x'$ integral, we obtain
$$i\hbar\,\partial_t\psi(x,t) = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi(x,t). \tag{7.5}$$
This is solved, as earlier, by the separation of variables. We consider a wave function $\psi_E(x,t)$ that is an eigenfunction of the Hamiltonian, so that
$$i\hbar\,\partial_t\psi_E(x,t) = E\,\psi_E(x,t), \qquad \hat H\psi_E(x,t) = E\,\psi_E(x,t). \tag{7.6}$$
Then we have the stationary solution
$$\psi_E(x,t) = e^{-iEt/\hbar}\,\psi_E(x,t=0) = e^{-iEt/\hbar}\,\psi_E(x). \tag{7.7}$$
Note that for a stationary solution the probability density is independent of time,
$$\rho(x,t) \equiv \frac{dP(x,t)}{dx} = |\psi(x,t)|^2 = |\psi(x)|^2. \tag{7.8}$$
The eigenfunction problem for the Hamiltonian, also called the stationary (time-independent) Schrödinger equation, in our case is
$$\hat H\psi_E(x) = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi_E(x) = E\,\psi_E(x). \tag{7.9}$$
Defining the rescaled variables
$$\frac{2mE}{\hbar^2} \equiv \epsilon, \qquad \frac{2mV(x)}{\hbar^2} = U(x), \tag{7.10}$$
the Schrödinger equation becomes
$$\psi_E''(x) = -(\epsilon - U(x))\,\psi_E(x). \tag{7.11}$$
This is the equation we will study in this chapter. It is a real equation (with real coefficients), so it is enough to consider its real eigenfunctions, since complex eigenfunctions can be obtained from them. However, for simplicity sometimes we will consider complex eigenfunctions directly.
7.2 General Properties of the Solutions

Next we make a general description of the solutions, without considering specific potentials $U(x)$.

• The one-dimensional Schrödinger equation discussed above is a second-order linear (in the variable $\psi(x)$) differential equation, similar to the classical equation for the position of a particle $x(t)$ in a potential. As in that case, we could give the initial conditions at some point, $\psi(x_0)$ and $\psi'(x_0)$, like the values for $x$ and $p = m\dot x$ in the initial condition for the classical particle, and find the evolution in $x$ (corresponding to evolution in time for the particle). This would amount to “integrating” the differential equation, and it would clearly lead (via integration) to a continuous solution $\psi(x)$, so $\psi(x) \in C^0(\mathbb{R})$ (the space of continuous functions).
• Besides being continuous, $\psi(x)$ is also bounded as $x \to \pm\infty$, since otherwise the probability density $|\psi(x,t)|^2$ would not be normalizable to 1, as needed.
• If $U(x)$ doesn't have infinite discontinuities (jumps), or in other words if it doesn't have delta functions, then on integrating $\psi'' = -(\epsilon - U(x))\psi$ we obtain that $\psi'(x)$ is continuous, so $\psi \in C^1(\mathbb{R})$ (the space of once-differentiable functions with continuous derivative). Indeed, if $U(x)$ has a finite jump then $\psi''$ also has a finite jump, which means that $\psi'$ is still continuous.
• If we are in a classically allowed region, with energy greater than the potential, $\epsilon > U(x)$, so that $\epsilon - U(x) > 0$, then the equation $\psi'' = -(\epsilon - U(x))\psi$ in the $U(x)$ constant case has sinusoidal/cosinusoidal solutions or, taking a complex basis, $e^{ikx}$ and $e^{-ikx}$ solutions, with
$$k = \sqrt{\epsilon - U}. \tag{7.12}$$
• If on the other hand we are in a classically forbidden region, with energy smaller than the potential, $\epsilon - U(x) < 0$, the equation $\psi'' = +(U(x) - \epsilon)\psi$ in the $U(x)$ constant case has exponentially increasing and decreasing solutions, $\psi \sim e^{\kappa x}$ and $\sim e^{-\kappa x}$, with
$$\kappa = \sqrt{U - \epsilon}. \tag{7.13}$$
• The energy spectrum can be continuous (as for a free particle) or discrete (as for a H atom or the two-level system studied earlier). It can also be degenerate (with two or more states with the same energy) or nondegenerate.
• We will define $\lim_{x\to+\infty} U(x) \equiv U_+$ and $\lim_{x\to-\infty} U(x) \equiv U_-$ and will assume that $U_+ > U_-$; if not, we can set $x \to -x$, and retrieve this situation. Then as a function of the energy $\epsilon$, we have three possible cases:
(I) $\epsilon > U_+ > U_-$. In this case, there are two independent solutions (sin and cos, or $e^{ikx}$ and $e^{-ikx}$) at each end of the real domain. That means that we can define two solutions that are bounded everywhere, including at both ends of the domain, for any such energy $\epsilon$, which means that the spectrum is continuous, i.e., there is no restriction on $\epsilon$, and degenerate with degeneracy 2, since for each energy there are two solutions. These states are unbound states, like free particles, with kinetic energy $\epsilon - U$.
(II) $U_- < \epsilon < U_+$. In this case $\epsilon - U(x)$ is negative at $+\infty$ and positive at $-\infty$. The solution at $+\infty$ is exponentially increasing or decaying. Since the exponentially increasing solution is nonnormalizable, we must choose the exponentially decreasing solution. But then this solution, when continued to $-\infty$, will correspond to only a given linear combination of the two (sin and cos) solutions there. That means that there is a unique solution for every energy in this region, so the spectrum is still continuous but now nondegenerate, and we also have unbound states.
(III) $\epsilon < U_- < U_+$. In this case $\epsilon - U(x)$ is negative at both $+\infty$ and $-\infty$, which means there is a unique solution (exponentially decaying) at both ends. Consider the unique solution at $-\infty$, $f_1$, and continue it to $+\infty$, where it will become a linear combination of the exponentially decaying ($g_1$) and increasing ($g_2$) solutions, $f_1 = \alpha g_1 + \beta g_2$. But the coefficients $\alpha$, $\beta$ are functions of the energy $\epsilon$, so the condition that $\beta(\epsilon) = 0$ (so that we have a bounded solution at $+\infty$ also) gives a constraint on the possible energies, having as solutions a discrete (and also nondegenerate) spectrum. Moreover, because of the exponentially decaying function at both $\pm\infty$, the eigenfunction is confined in space to a finite region, and we say we have bound states. The existence or not of eigenvalues for the energy depends on the details of the potential $U(x)$ in the finite region in the middle, as does the total number of eigenvalues, which can be anything from 0 to infinity.
• The Wronskian theorem. This is a theorem for general linear second-order equations. Define the Wronskian of two functions $y_1$, $y_2$ as
$$W(y_1, y_2) = y_1 y_2' - y_2 y_1', \tag{7.14}$$
where $y_1$ and $y_2$ are solutions to the equations
$$y_1'' + f_1(x)\,y_1 = 0 \quad (1), \qquad y_2'' + f_2(x)\,y_2 = 0 \quad (2). \tag{7.15}$$
Here $f_1(x)$, $f_2(x)$ are real functions that are piecewise continuous in the interval $x \in (a, b)$, meaning they can have jumps at a finite number of points but are otherwise continuous. Then the Wronskian theorem states that
$$W(y_1, y_2)\big|_a^b = \int_a^b dx\,(f_1(x) - f_2(x))\,y_1 y_2. \tag{7.16}$$
Proof. Multiplying equation (1) above by $y_2$ and equation (2) by $y_1$, and subtracting the two, we get
$$y_2 y_1'' - y_1 y_2'' + (f_1(x) - f_2(x))\,y_1 y_2 = 0 \;\Rightarrow\; \frac{d}{dx}(y_2 y_1' - y_1 y_2') + (f_1(x) - f_2(x))\,y_1 y_2 = 0, \tag{7.17}$$
and by integration over x we get the result of the theorem. q.e.d.
• Applying the Wronskian theorem in our case, with $f(x) = \epsilon - U(x)$, for two energies $\epsilon_1$, $\epsilon_2$, we get $f_1(x) - f_2(x) = \epsilon_1 - \epsilon_2$. Then the theorem says that
$$(y_1 y_2' - y_2 y_1')\big|_a^b = W(y_1, y_2)\big|_a^b = (\epsilon_1 - \epsilon_2)\int_a^b dx\,y_1 y_2, \quad \forall a, b. \tag{7.18}$$
In particular, for $\epsilon_1 = \epsilon_2$, we obtain that $W(y_1, y_2)$ is independent of x.
• The Wronskian theorem in our case can be used to prove rigorously that indeed if $\epsilon > U(x)$ for $x > x_0$ we have two oscillatory solutions, and if $\epsilon < U(x)$ and $U(x) - \epsilon \geq M^2 > 0$ for $x > x_0$, there is a unique solution going faster than or as fast as $e^{-Mx}$ to zero at $+\infty$, and other solutions increase exponentially. It also follows that in a classically forbidden region ($\epsilon < U(x)$), the eigenfunction $\psi(x)$ can have at most one zero, since it is either increasing or decreasing exponentially.
• If $\epsilon < U(x)$ everywhere, there is no solution since it means that we have exponential solutions everywhere, but this would mean that the solution would have to increase at least on one side ($+\infty$ or $-\infty$), since the continuity of the derivative at possible jump points means that the increasing or decreasing property is conserved at jumps of $U(x)$ also. Therefore this solution will not be normalizable, so there is no solution to the eigenvalue problem.
• This also means that if $U(x)$ has a minimum $U_{\min}$ somewhere then the eigenvalues $\epsilon$ are necessarily larger than $U_{\min}$.
• The number of nodes (zeroes of the wave function) can be analyzed as follows. Consider two different energies $\epsilon_2 > \epsilon_1$. Then the Wronskian theorem for the case where a, b are two consecutive zeroes (or nodes) of $y_1$, so that $y_1(a) = y_1(b) = 0$, means that
$$y_2 y_1'\big|_a^b = (\epsilon_2 - \epsilon_1)\int_a^b dx\,y_1 y_2. \tag{7.19}$$
But if a, b are consecutive zeroes, it means that between them, in the interval (a, b), $y_1$ has the same sign, say $y_1 > 0$, which implies $y_1'(a) > 0$, $y_1'(b) < 0$. Thus $y_2$ must change sign in the interval (a, b), since otherwise the right-hand side of the Wronskian theorem has the same sign as $y_2$ (as $\epsilon_2 - \epsilon_1 > 0$), whereas the left-hand side has the opposite sign (since both $y_1'(b)$ and $-y_1'(a)$ are negative). That in turn means that $y_2$ has at least one zero in between the two consecutive zeroes of $y_1$, a and b. But since both $y_1$ and $y_2$ also vanish asymptotically at $+\infty$ and $-\infty$, besides the finite zeroes (or nodes), it follows that if $y_1$ has n nodes (so that there are n + 1 intervals (a, b), including $-\infty$ and $+\infty$ as boundaries) then $y_2$ has at least n + 1 nodes. In turn, that means that there are at least n − 1 nodes for the nth eigenfunction of the system.
• Finally, if we have a discrete spectrum, which implies that the eigenstates satisfy $y(\pm\infty) = 0$, then two eigenstates $y_1$, $y_2$ for two different eigenenergies ($\epsilon_1 \neq \epsilon_2$) are orthogonal, since the Wronskian theorem for $a = -\infty$, $b = +\infty$ implies that (dividing by the nonzero $\epsilon_1 - \epsilon_2$)
$$\int_{-\infty}^{+\infty} dx\,y_1 y_2 = 0. \tag{7.20}$$
7.3 Infinitely Deep Square Well (Particle in a Box)

For an infinitely high potential barrier, i.e., if $U(x) = \infty$ at some $x = x_0$, it is clear that we have $\psi(x_0) = 0$. Consider then the case where the potential is an infinitely deep square well (or box; see Fig. 7.1a),
$$U(x) = \begin{cases} 0, & |x| < L/2 \\ \infty, & |x| > L/2. \end{cases} \tag{7.21}$$
In this case, it is clear that $\psi(x) = 0$ for $|x| \geq L/2$. Note that if $U(x) = U_0$ for $|x| < L/2$, we can make the rescaling $\epsilon - U_0 \to \epsilon$, and so get back to this case.

Figure 7.1 One-dimensional potentials U(x). (a) Particle in a box (infinite square well): its potential U(x) and ground state wave function $\psi_1(x)$. (b) Potential step and wave components. (c) Finite square well. (d) Potential barrier and tunneling effect.

The differential equation in the only relevant region, $|x| \leq L/2$, is
$$\psi'' + \epsilon\psi = 0, \tag{7.22}$$
with the boundary conditions that $\psi(x) = 0$ at $x = \pm L/2$. This is basically a stationary violin string set-up, with discrete eigenmodes (harmonics). In the relevant region II, $|x| \leq L/2$, the solution is
$$\psi(x) = A e^{ikx} + B e^{-ikx}, \tag{7.23}$$
where
$$k = \sqrt{\epsilon} = \sqrt{\frac{2mE}{\hbar^2}}. \tag{7.24}$$
Imposing continuity of the wave function at both ends, i.e.,
$$\psi(-L/2) = \psi(+L/2) = 0, \tag{7.25}$$
implies the system of equations
$$A e^{+ikL/2} + B e^{-ikL/2} = 0, \qquad A e^{-ikL/2} + B e^{+ikL/2} = 0, \tag{7.26}$$
which has nontrivial solutions only if the determinant of coefficients vanishes,
$$\begin{vmatrix} e^{ikL/2} & e^{-ikL/2} \\ e^{-ikL/2} & e^{ikL/2} \end{vmatrix} = 0 \;\Rightarrow\; e^{ikL} = e^{-ikL}, \tag{7.27}$$
which means that the wave numbers must be quantized,
$$k = k_n = \frac{n\pi}{L} \;\Rightarrow\; e^{ik_n L} = (-1)^n. \tag{7.28}$$
Consequently the eigenenergies are
$$E_n = \frac{\hbar^2 n^2 \pi^2}{2mL^2}. \tag{7.29}$$
Here, naively, we would have $n \in \mathbb{Z}$, so that $n = 0, \pm 1, \pm 2, \pm 3, \ldots$, but in fact $n = 0$ is excluded, since that would imply that $\psi(x)$ is constant, which either contradicts $\psi(\pm L/2) = 0$ or means that the wave function vanishes identically. On the eigenenergies, we have $B = -A e^{ik_n L} = -(-1)^n A$, which means that the eigenfunctions are
$$\psi_n(x) = A\left(e^{ik_n x} - (-1)^n e^{-ik_n x}\right) = \begin{cases} 2iA\sin\dfrac{n\pi x}{L}, & n \text{ even} \\[2mm] 2A\cos\dfrac{n\pi x}{L}, & n \text{ odd.} \end{cases} \tag{7.30}$$
The remaining constant, A, is determined from the normalization condition for probability,
$$\int_{-L/2}^{+L/2} dx\,\frac{dP}{dx} = \int_{-L/2}^{+L/2} dx\,|\psi(x)|^2 = 1. \tag{7.31}$$
This gives
$$1 = 4|A|^2\int_{-L/2}^{+L/2} dx\,\sin^2\frac{n\pi x}{L} = 2|A|^2 L \;\Rightarrow\; |A| = \frac{1}{\sqrt{2L}}, \tag{7.32}$$
so the eigenfunctions are
$$\psi_n(x) = \begin{cases} \sqrt{\dfrac{2}{L}}\sin\dfrac{n\pi x}{L}, & n \text{ even} \\[2mm] \sqrt{\dfrac{2}{L}}\cos\dfrac{n\pi x}{L}, & n \text{ odd.} \end{cases} \tag{7.33}$$
We observe that the ground state energy of the quantum system is not the classical value E = 0, but is now
$$E_1 = \frac{\pi^2\hbar^2}{2mL^2}, \tag{7.34}$$
and the remaining states satisfy
$$E_n = n^2 E_1. \tag{7.35}$$
The fact that the energy of the ground state is nonzero, and its order of magnitude, can be understood from Heisenberg's uncertainty relations. Since the particle is confined to be in a box of size L, for $|x| \leq L/2$, the uncertainty in the position is of order $\Delta x = L/2$. From the Heisenberg uncertainty relation we obtain
$$\Delta p \geq \frac{\hbar/2}{L/2}. \tag{7.36}$$
Assuming equality,
$$p_{\min} = \Delta p = \frac{\hbar}{L}, \tag{7.37}$$
we find
$$E \geq \frac{p_{\min}^2}{2m} = \frac{\hbar^2}{2mL^2}. \tag{7.38}$$
This gives the right result except for a missing factor of $\pi^2$ (so the inequality is valid, just the possibility of equality is not).

The next observation is that in a bound state $\langle p\rangle = 0$. Having a stationary state means that $\langle p\rangle$ is time independent, and if the result were nonzero, it would mean that the particle translates on average and would eventually escape from the box, but that cannot happen, since the box has infinite sides. The average position can be calculated explicitly,
$$\langle x\rangle = \int_{-L/2}^{+L/2} dx\,x\,|\psi(x)|^2 = \frac{2}{L}\int_{-L/2}^{+L/2} dx\,x\,\sin^2\frac{n\pi x}{L} = 0 \tag{7.39}$$
by the $x \to -x$ symmetry. To calculate $\Delta x$, we find that, given $\langle x\rangle = 0$, we have
$$(\Delta x)^2_n = \langle x^2\rangle_n = \int_{-L/2}^{+L/2} dx\,x^2|\psi(x)|^2 = \frac{2}{L}\int_{-L/2}^{+L/2} dx\,x^2\,\frac{1 - \cos(2\pi n x/L)}{2} = \frac{L^2}{12} - \frac{L^2}{\pi^3 n^3}\int_{-\pi n/2}^{+\pi n/2} d\theta\,\theta^2\cos 2\theta = L^2\left(\frac{1}{12} - \frac{1}{2n^2\pi^2}\right), \tag{7.40}$$
which gives
$$\Delta x = L\sqrt{\frac{1}{12} - \frac{1}{2n^2\pi^2}}. \tag{7.41}$$
This tends, as $n \to \infty$, to $L/\sqrt{12}$, which is the classical result: indeed, the classical result for the average of $x^2$ is $(1/L)\int_{-L/2}^{+L/2} dx\,x^2 = L^2/12$.
We can also calculate $\Delta p$ in a similar way, considering that $\langle p\rangle = 0$:
$$\langle p^2\rangle_n = \int_{-L/2}^{+L/2} dx\,|\hat P\psi(x)|^2 = \hbar^2\int_{-L/2}^{+L/2} dx\,|\psi'(x)|^2 = \int_{-L/2}^{+L/2} dx\,\frac{2n^2\pi^2\hbar^2}{L^3}\,\frac{1 + \cos(2\pi n x/L)}{2} = \frac{\hbar^2 n^2\pi^2}{L^2} = 2mE_n. \tag{7.42}$$
It follows that we have the expected classical-like relation
$$(\Delta p)^2_n = \langle p^2\rangle_n = 2mE_n. \tag{7.43}$$
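The closed forms (7.40) and (7.42) can be cross-checked numerically. The sketch below is only an illustration: it assumes units with $\hbar = 1$ and $L = 1$, uses a simple midpoint rule, and all function names are hypothetical.

```python
import math


def psi(n: int, x: float, length: float = 1.0) -> float:
    """Normalized box eigenfunction (7.33) on |x| <= length/2."""
    arg = n * math.pi * x / length
    amp = math.sqrt(2.0 / length)
    return amp * (math.sin(arg) if n % 2 == 0 else math.cos(arg))


def dpsi(n: int, x: float, length: float = 1.0) -> float:
    """Derivative of the eigenfunction with respect to x."""
    arg = n * math.pi * x / length
    amp = math.sqrt(2.0 / length) * n * math.pi / length
    return amp * (math.cos(arg) if n % 2 == 0 else -math.sin(arg))


def integrate(f, a: float, b: float, steps: int = 4000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def x2_numeric(n: int, length: float = 1.0) -> float:
    """<x^2>_n by direct integration."""
    return integrate(lambda x: x * x * psi(n, x, length) ** 2,
                     -length / 2, length / 2)


def x2_formula(n: int, length: float = 1.0) -> float:
    """<x^2>_n from the closed form (7.40)."""
    return length ** 2 * (1.0 / 12.0 - 1.0 / (2.0 * n * n * math.pi ** 2))


def p2_numeric(n: int, length: float = 1.0, hbar: float = 1.0) -> float:
    """<p^2>_n = hbar^2 * integral of |psi'|^2, as in (7.42)."""
    return hbar ** 2 * integrate(lambda x: dpsi(n, x, length) ** 2,
                                 -length / 2, length / 2)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, x2_numeric(n), x2_formula(n), p2_numeric(n))
```

In these units $\langle p^2\rangle_n$ should approach $n^2\pi^2$, and $\langle x^2\rangle_n$ approaches the classical value $1/12$ as n grows.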
7.4 Potential Step and Reflection and Transmission of Modes

After considering the simplest stationary case, the particle in a box, we now move on to a different type of potential, one that allows for states at infinity, states propagating in time and reflecting and being transmitted through a “wall”. The simplest example in this class is the potential step, with two different values of the potential, changing at x = 0, as in Fig. 7.1b,
$$U(x) = \begin{cases} U_1, & x < 0 \\ U_2, & x > 0. \end{cases} \tag{7.44}$$
Consider for concreteness that $U_2 > U_1$, so that the step is an increase, looking a bit like a wall to an incoming wave.
(a) The case $U_2 > \epsilon > U_1$

We treat first the case where the energy of the particle is in between the two values of the potential. According to the general theory described at the beginning of the chapter, we expect a continuous and nondegenerate spectrum. For $x < 0$ we have $\epsilon > U_1$, so we have a sinusoidal solution, whereas for $x > 0$ we have $\epsilon < U_2$, so we have an exponentially decaying solution. Thus the solution can be parametrized as
$$\psi(x) = \begin{cases} A\sin(k_1 x + \phi), & x < 0 \\ B e^{-\kappa_2 x}, & x > 0, \end{cases} \tag{7.45}$$
where $k_1 = \sqrt{\epsilon - U_1}$ and $\kappa_2 = \sqrt{U_2 - \epsilon}$. We first impose the condition of continuity of $\psi(x)$ at x = 0,
$$A\sin\phi = B, \tag{7.46}$$
and thus A is a normalization constant. Next, we impose the continuity of the derivative $\psi'(x)$ at x = 0, or better, of the logarithmic derivative (under the log, the derivative gets rid of otherwise arbitrary multiplicative constants), i.e. $\psi'(x)/\psi(x)$ at x = 0,
$$\frac{\psi'(x + \tilde\epsilon)}{\psi(x + \tilde\epsilon)} = \frac{\psi'(x - \tilde\epsilon)}{\psi(x - \tilde\epsilon)}, \tag{7.47}$$
implying a solution for $\phi$:
$$-\kappa_2 = k_1\cot\phi \;\Rightarrow\; \phi = \arctan\left(-\frac{k_1}{\kappa_2}\right). \tag{7.48}$$
Substituting back, we can find the value of B/A from
$$B = A\sin\phi = -A\frac{k_1}{\sqrt{k_1^2 + \kappa_2^2}} = -A\sqrt{\frac{\epsilon - U_1}{U_2 - U_1}}. \tag{7.49}$$
Then A is found from the normalization condition
$$\int dx\,\psi_E^*(x)\,\psi_{E'}(x) = \delta(E - E'). \tag{7.50}$$
The full time-dependent solution is
$$A\sin(k_1 x + \phi)\,e^{-iEt/\hbar} = \tilde A\left[e^{i(k_1 x - Et/\hbar) + i\phi} - e^{-i(k_1 x + Et/\hbar) - i\phi}\right], \qquad \tilde A = \frac{A}{2i}, \tag{7.51}$$
which is the sum of an incident (original) traveling wave and a reflected wave, traveling in the opposite direction. The interference of the two waves gives the resulting solution.
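The matching conditions (7.46)–(7.49) are easy to check numerically. The following is a minimal sketch in the rescaled variables $\epsilon$, $U_1$, $U_2$ of (7.10); the function name `step_phase` and the sample numbers are assumptions made for illustration.

```python
import math


def step_phase(eps: float, u1: float, u2: float):
    """Return (k1, kappa2, phi, B/A) for U1 < eps < U2, following (7.48)-(7.49)."""
    if not (u1 < eps < u2):
        raise ValueError("requires u1 < eps < u2")
    k1 = math.sqrt(eps - u1)        # wave number for x < 0
    kappa2 = math.sqrt(u2 - eps)    # decay constant for x > 0
    phi = math.atan(-k1 / kappa2)   # phase from the logarithmic derivative
    b_over_a = math.sin(phi)        # continuity of psi at x = 0
    return k1, kappa2, phi, b_over_a


if __name__ == "__main__":
    k1, kappa2, phi, ratio = step_phase(1.0, 0.0, 4.0)
    # The two numbers printed on each line should agree.
    print(k1 / math.tan(phi), -kappa2)
    print(ratio, -math.sqrt((1.0 - 0.0) / (4.0 - 0.0)))
```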
(b) The case $\epsilon > U_2$

The second case is for an energy above both values of the potential. Again, according to the general theory described earlier, in this case we have a continuous and degenerate spectrum, with degeneracy 2. Both for $x > 0$ and for $x < 0$ we have sinusoidal solutions and, besides the incident wave for $x < 0$, we must have both a reflected wave (for $x < 0$) and a transmitted wave (for $x > 0$). Thus the wave function (where we have taken out an overall normalization constant A and replaced it with 1) is
$$\psi(x) = \begin{cases} e^{ik_1 x} + R e^{-ik_1 x}, & x < 0 \\ S e^{ik_2 x}, & x > 0, \end{cases} \tag{7.52}$$
where the wave with factor R is the reflected wave and the wave with factor S is the transmitted wave, and $k_1 = \sqrt{\epsilon - U_1} > k_2 = \sqrt{\epsilon - U_2}$. Continuity of the wave function at x = 0 gives
$$1 + R = S, \tag{7.53}$$
whereas continuity of the logarithmic derivative $\psi'(x)/\psi(x)$, also at x = 0, gives
$$ik_1\frac{1 - R}{1 + R} = ik_2 \;\Rightarrow\; R = \frac{k_1 - k_2}{k_1 + k_2} \;\Rightarrow\; S = 1 + R = \frac{2k_1}{k_1 + k_2} > 1. \tag{7.54}$$
Then, introducing the time dependence, we have an incident wave $\sim e^{ik_1 x - i\omega t}$, where $\omega = E/\hbar$, a reflected wave $\sim e^{-ik_1 x - i\omega t}$, and a transmitted wave $\sim e^{ik_2 x - i\omega t}$.
7.5 Continuity Equation for Probabilities

We can define the probability current density for continuous systems from
$$\langle\psi(t)|\psi(t)\rangle = \int \rho\,dV, \tag{7.55}$$
where the probability density is $\rho = dP/dV$ (written for the more general case of more than one dimension, with dV replacing dx), so we obtain in the coordinate representation
$$\rho = \frac{dP}{dV} = |\psi(\vec r, t)|^2. \tag{7.56}$$
Then writing the Schrödinger equation and its complex conjugate in the coordinate representation,
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\vec\nabla^2\psi + V\psi, \qquad -i\hbar\frac{\partial\psi^*}{\partial t} = -\frac{\hbar^2}{2m}\vec\nabla^2\psi^* + V\psi^*, \tag{7.57}$$
and then multiplying the first equation by $\psi^*$ and the second by $\psi$ and subtracting the two, we obtain
$$i\hbar(\psi^*\partial_t\psi + \psi\,\partial_t\psi^*) = -\frac{\hbar^2}{2m}(\psi^*\vec\nabla^2\psi - \psi\vec\nabla^2\psi^*) \;\Rightarrow\; i\hbar\,\partial_t(|\psi|^2) = i\hbar\,\partial_t\rho = -\frac{\hbar^2}{2m}\vec\nabla\cdot(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*), \tag{7.58}$$
which can be written in terms of a continuity equation for probability flow,
$$\partial_t\rho = -\vec\nabla\cdot\vec j, \tag{7.59}$$
where the probability current density is
$$\vec j \equiv \frac{\hbar}{2mi}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*). \tag{7.60}$$
Applying this formalism for a one-dimensional problem with an incident wave $\psi = A e^{ikx}$, we find the current density
$$\vec j = \frac{\hbar k}{m}|A|^2\,\vec e_x. \tag{7.61}$$
Even in one dimension, the vector $\vec e_x$ makes sense, taking the values ±1 depending on the direction of propagation. Consider then the conservation of probability at a point of reflection and refraction (i.e., transmission),
$$\vec j_{\rm inc} + \vec j_{\rm refl} = \vec j_{\rm transmitted}. \tag{7.62}$$
In our case, these currents become
$$\vec j_{\rm inc} = \frac{\hbar k_1}{m}\vec e_x, \qquad \vec j_{\rm refl} = \frac{\hbar k_1}{m}|R|^2(-\vec e_x), \qquad \vec j_{\rm trans} = \frac{\hbar k_2}{m}|S|^2\,\vec e_x, \tag{7.63}$$
which means that the conservation of probability becomes
$$k_1(1 - |R|^2) = k_2|S|^2 \;\Rightarrow\; 1 - |R|^2 = \frac{k_2}{k_1}|S|^2 \equiv T, \tag{7.64}$$
where the right-hand side, T, is the transmission coefficient (since $|R|^2$ is the probability of reflection and T is the probability of transmission, so that the two sum up to 1). We find specifically
$$T = \frac{4k_1 k_2}{(k_1 + k_2)^2}, \qquad |R|^2 = \frac{(k_1 - k_2)^2}{(k_1 + k_2)^2}. \tag{7.65}$$
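A short numerical sketch of (7.54), (7.64) and (7.65) makes the probability balance explicit. It works in the rescaled variables of (7.10); the function name `step_coefficients` and the sample values are illustrative assumptions.

```python
import math


def step_coefficients(eps: float, u1: float, u2: float):
    """Amplitudes R, S from (7.54) and probabilities |R|^2, T from (7.64)."""
    if eps <= max(u1, u2):
        raise ValueError("requires eps above both values of the potential")
    k1 = math.sqrt(eps - u1)
    k2 = math.sqrt(eps - u2)
    r_amp = (k1 - k2) / (k1 + k2)
    s_amp = 2.0 * k1 / (k1 + k2)
    reflection = r_amp ** 2               # |R|^2
    transmission = (k2 / k1) * s_amp ** 2  # T = (k2/k1)|S|^2
    return r_amp, s_amp, reflection, transmission


if __name__ == "__main__":
    r_amp, s_amp, refl, trans = step_coefficients(4.0, 0.0, 3.0)
    print(r_amp, s_amp, refl, trans, refl + trans)
```

Note that the amplitude S can exceed 1 while the transmission probability T stays below 1, because T carries the extra factor $k_2/k_1$ from the ratio of currents.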
7.6 Finite Square Well Potential

We next consider a square well potential, where the potential “dips” for a short while; for generality we consider the two sides to have different potentials, so that
$$U(x) = \begin{cases} U_1, & x < a & \text{(I)} \\ U_2, & a < x < b & \text{(II)} \\ U_3, & x > b & \text{(III)}, \end{cases} \tag{7.66}$$
where $U_2 < U_1 < U_3$, as in Fig. 7.1c. If $\epsilon < U_2$, there is no solution, since the solution must be exponentially decaying or increasing, and this implies that at least one side of the solution will increase exponentially, making it impossible to normalize the probability to one.
(a) The case $U_1 > \epsilon > U_2$

In this case, according to the general analysis, the spectrum is discrete, and represents bound states. The solution is exponentially decreasing to zero on both sides of the well, and sinusoidal inside the well, so we write
$$\psi(x) = \begin{cases} C e^{-\kappa_3 x}, & x > b \\ B\sin(k_2 x + \phi) = B_1\sin k_2 x + B_2\cos k_2 x, & a < x < b \\ A e^{+\kappa_1 x}, & x < a, \end{cases} \tag{7.67}$$
where
$$\kappa_1 = \sqrt{U_1 - \epsilon}, \qquad k_2 = \sqrt{\epsilon - U_2}, \qquad \kappa_3 = \sqrt{U_3 - \epsilon}. \tag{7.68}$$
Next, we impose continuity at x = a and x = b for $\psi(x)$, obtaining the equations
$$\psi(a+) = \psi(a-) \;\Rightarrow\; A e^{+\kappa_1 a} = B\sin(k_2 a + \phi), \qquad \psi(b+) = \psi(b-) \;\Rightarrow\; B\sin(k_2 b + \phi) = C e^{-\kappa_3 b}. \tag{7.69}$$
Finally, we impose the continuity of $\psi'(x)$ at the above points or, better still, considering that we have already imposed the continuity of $\psi(x)$, the continuity of the logarithmic derivative $\psi'(x)/\psi(x)$, giving
$$\frac{\psi'}{\psi}(a+) = \frac{\psi'}{\psi}(a-) \;\Rightarrow\; +\kappa_1 = k_2\cot(k_2 a + \phi), \qquad \frac{\psi'}{\psi}(b+) = \frac{\psi'}{\psi}(b-) \;\Rightarrow\; k_2\cot(k_2 b + \phi) = -\kappa_3. \tag{7.70}$$
The equations for the continuity of $\psi$ solve for B, C as a function of A (then A is found from the normalization condition), whereas one of the equations for the continuity of $\psi'/\psi$ can be solved for $\phi$, and then the other gives an equation for the eigenenergies $\epsilon_n$. We write these last two equations as two equations for the same $\phi$, whose consistency will give $\epsilon_n$:
$$\phi = \arctan\frac{k_2}{\kappa_1} - k_2 a = -\arctan\frac{k_2}{\kappa_3} - k_2 b + n\pi. \tag{7.71}$$
Equating the two values, we find
$$\arctan\frac{k_2}{\kappa_3} + \arctan\frac{k_2}{\kappa_1} + k_2(b - a) - n\pi = 0, \tag{7.72}$$
which is a transcendental equation for $\epsilon$, indexed by n (one solution per value of n), giving a discrete and finite sequence for $\epsilon_n$. Defining the quantities
$$\cos\gamma \equiv \sqrt{\frac{U_1 - U_2}{U_3 - U_2}}, \qquad \xi \equiv \sqrt{\frac{\epsilon - U_2}{U_1 - U_2}}, \qquad K = \sqrt{U_1 - U_2}, \qquad L = b - a, \tag{7.73}$$
we find the equation
$$\arcsin\xi + \arcsin(\xi\cos\gamma) = n\pi - \xi K L; \tag{7.74}$$
to find a solution for it, we need the condition
$$K L \geq (n - 1)\pi + \gamma, \tag{7.75}$$
which implies that there is a maximum value for n, given by
$$n \leq n_{\max} = 1 + \frac{KL - \gamma}{\pi}. \tag{7.76}$$
This means that, indeed, we have a finite spectrum, as promised. Moreover, n is the number of nodes or zeroes of $\psi$ (see our previous general discussion, equating this number with the index of the eigenenergy), since we have a factor $\sin(k_2 x + \phi)$ in the wave function, vanishing for the $n\pi$ term.

If $U_1 = U_3 = U$ and $U_2 = U_0$ (for a symmetric well), we find $\kappa_1 = \kappa_3$, which means $\cos\gamma = 1$ and thus $\gamma = 0$, and if moreover $L\sqrt{U - U_0} = KL \ll 1$ then there is a unique solution: the condition $KL \geq (n - 1)\pi + 0$ can only have n = 1 as a solution, given by the transcendental equation
$$2\arctan\frac{k_2}{\kappa} = \pi - k_2 L, \tag{7.77}$$
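from which we find $k_2/\kappa = \cot(k_2 L/2) \simeq 2/(k_2 L) \to \infty$, i.e., $\kappa \simeq K^2 L/2$, and the unique eigenenergy is
$$E = V - \frac{1}{4}(KL)^2(V - V_0), \tag{7.78}$$
and the angle $\phi \simeq \pi/2$.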
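The transcendental equation (7.74) has no closed-form solution in general, but its left-hand side plus $\xi KL$ increases monotonically in $\xi$, so each level can be bracketed and found by bisection. The sketch below is one possible way to do this, not a prescribed method; it works in the rescaled variables of (7.10), takes the integer part of the bound in (7.76), and the function name `well_levels` is a hypothetical choice.

```python
import math


def well_levels(u1: float, u2: float, u3: float, width: float):
    """Solve (7.74) for the bound-state energies eps_n with U2 < eps_n < U1.

    Assumes U2 < U1 <= U3 and width = b - a > 0. Returns a list of (n, eps_n).
    """
    if not (u2 < u1 <= u3) or width <= 0:
        raise ValueError("requires u2 < u1 <= u3 and width > 0")
    big_k = math.sqrt(u1 - u2)
    cos_gamma = math.sqrt((u1 - u2) / (u3 - u2))
    gamma = math.acos(cos_gamma)
    kl = big_k * width

    def lhs(xi: float) -> float:
        # arcsin(xi) + arcsin(xi cos(gamma)) + xi K L, increasing on [0, 1]
        return math.asin(xi) + math.asin(xi * cos_gamma) + xi * kl

    # Integer part of the bound (7.76); zero or negative means no bound state.
    n_max = math.floor(1.0 + (kl - gamma) / math.pi)
    levels = []
    for n in range(1, n_max + 1):
        lo, hi = 0.0, 1.0
        for _ in range(200):  # plain bisection on lhs(xi) = n pi
            mid = 0.5 * (lo + hi)
            if lhs(mid) < n * math.pi:
                lo = mid
            else:
                hi = mid
        xi = 0.5 * (lo + hi)
        levels.append((n, u2 + xi * xi * (u1 - u2)))
    return levels


if __name__ == "__main__":
    # Hypothetical symmetric well: U1 = U3 = 1, U2 = 0, K L = 5.
    for n, eps in well_levels(1.0, 0.0, 1.0, 5.0):
        print(n, eps)
```

For the symmetric example the bound (7.76) allows two levels, and both lie between $U_2$ and $U_1$ as expected for bound states.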
(b) The case $U_1 < \epsilon < U_3$

In this case, when the energy is in between the two potentials at $-\infty$ and $+\infty$, according to the general analysis we must find a continuous, nondegenerate spectrum. As for the finite step potential, in region I ($x < a$), where $\epsilon > U(x)$, we write the wave function as an incident wave and a reflected wave, while in region III ($x > b$), where $\epsilon < U(x)$, we find an exponentially decaying solution. All in all,
$$\psi(x) = \begin{cases} e^{ik_1 x} + e^{-ik_1 x}e^{2i\phi_1}, & x < a \\ 2A e^{i\phi_1}\sin(k_2 x + \phi_2), & a < x < b \\ 2B e^{i\phi_1}e^{-\kappa_3 x}, & x > b. \end{cases}$$
[…]

[…] $\epsilon > U(x)$ everywhere, meaning that the solution is sinusoidal everywhere, and we have two good solutions in every region, I, II, and III. This means that the spectrum is continuous and degenerate with degeneracy 2. As in the similar case with a single step, we write an incident and a reflected wave in region I, and a transmitted wave in region III. In region II, we also write a combination of a forward moving and a backward moving wave, so
$$\psi(x) = \begin{cases} e^{ik_1 x} + R e^{-ik_1 x}, & x < a \\ P e^{ik_2 x} + Q e^{-ik_2 x}, & a < x < b \\ S e^{ik_3 x}, & x > b. \end{cases}$$
[…]

[…] $\epsilon > U_0$

In this case, the particle can go “over” the barrier classically, as if it wasn't there, but quantum mechanically there will be reflection as well as transmission. The continuity conditions at x = 0 and x = L for $\psi(x)$ and $\psi'(x)/\psi(x)$ give
$$\begin{aligned} 1 + R &= C + D \\ C e^{ikL} + D e^{-ikL} &= S e^{i\sqrt{\epsilon}L} \\ i\sqrt{\epsilon}\,\frac{1 - R}{1 + R} &= ik\,\frac{C - D}{C + D} \\ ik\,\frac{C e^{ikL} - D e^{-ikL}}{C e^{ikL} + D e^{-ikL}} &= i\sqrt{\epsilon}, \end{aligned} \tag{7.92}$$
which similarly lead to a transmission coefficient
$$T = |S|^2 = \frac{1}{1 + \dfrac{U_0^2}{4\epsilon(\epsilon - U_0)}\sin^2 kL} < 1, \tag{7.93}$$
(and a reflection coefficient $|R|^2 = 1 - T$); note that now the transmission coefficient oscillates between 1 and $1/(1 + U_0^2/4\epsilon(\epsilon - U_0)) < 1$ as a function of k (classically, we expect only the value T = 1).
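The oscillation of T in (7.93) between 1 and its minimum can be seen with a few lines of code. This is a sketch under an explicit assumption: the wave number inside the barrier is taken to be $k = \sqrt{\epsilon - U_0}$ in the rescaled variables, which is the usual convention but is not spelled out in the passage above. The function name `barrier_transmission` is hypothetical.

```python
import math


def barrier_transmission(eps: float, u0: float, length: float) -> float:
    """Transmission coefficient (7.93) for eps > U0 > 0.

    Assumption: k = sqrt(eps - U0) is the wave number inside the barrier.
    """
    if not (eps > u0 > 0):
        raise ValueError("requires eps > u0 > 0")
    k = math.sqrt(eps - u0)
    ratio = u0 ** 2 / (4.0 * eps * (eps - u0))
    return 1.0 / (1.0 + ratio * math.sin(k * length) ** 2)


if __name__ == "__main__":
    eps, u0 = 2.0, 1.0  # sample values, so that k = 1
    print(barrier_transmission(eps, u0, math.pi))        # resonance, k L = pi
    print(barrier_transmission(eps, u0, math.pi / 2.0))  # minimum, k L = pi/2
```

Perfect transmission occurs when kL is a multiple of $\pi$, and the minimum $1/(1 + U_0^2/4\epsilon(\epsilon - U_0))$ is reached when $\sin^2 kL = 1$.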
Important Concepts to Remember

• For a one-dimensional particle in a potential, the wave function is $\psi(x) \in C^0(\mathbb{R})$ and, if the potential has no delta functions, $\psi(x) \in C^1(\mathbb{R})$ and is bounded at $x = \pm\infty$.
• The wave function is of sin/cos type in a classically allowed region, and of exponentially decaying/growing type in a classically forbidden region (being exactly sin/cos or exp only if the potential is constant in the region).
• For energies above the values of the potential at $+\infty$ and $-\infty$, we have a continuous spectrum, of unbound states of degeneracy 2, as for free particles.
• For energies between the values of the potential at $+\infty$ and $-\infty$, we have a continuous spectrum, of unbound states but nondegenerate.
• For energies below the values of the potential at $+\infty$ and $-\infty$, we have a discrete and nondegenerate spectrum of bound states, i.e., they are constrained to a finite region (exponentially decaying outside it).
• There are at least n − 1 nodes (zeroes) for the nth eigenfunction of the system.
• The infinitely deep square well has eigenenergies $E_n = \frac{\hbar^2\pi^2}{2mL^2}n^2$, with $n = \pm 1, \pm 2, \pm 3$, etc., and eigenfunctions $\sin n\pi x/L$ for n even and $\cos n\pi x/L$ for n odd.
• In the case of a step-type wall, we can define a transmitted wave and a reflected wave, $\psi(x) = e^{ik_1 x} + R e^{-ik_1 x}$ for $x < 0$ and $\psi(x) = S e^{ik_2 x}$ for $x > 0$.
• The continuity equation for probabilities is $\partial_t\rho + \vec\nabla\cdot\vec j = 0$, where $\rho = |\psi|^2$ is the probability density and $\vec j = \frac{\hbar}{2mi}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*)$ is the probability current.
• For both the step-function case and the case of a finite barrier, we can define the transmission coefficient T and the reflection coefficient R for the probability, with T + R = 1.
• For a potential barrier of length L and height $U_0$, for $\sqrt{U_0 - E}\,L \gg 1$, the tunneling probability is $T \propto e^{-2\sqrt{U_0 - E}\,L}$.
Further Reading See [2] for more details.
Exercises

(1) Consider a potential that depends only on the radial direction r in a three-dimensional space, V(r), such that $V(r \to \infty) = 0$ while $V(0) = -V_0 < 0$ and $V_{\max} = \max_r V(r) = V_1 > 0$, reached at $r = r_1$, and there is a unique solution $r_0$ to the equation $V(r_0) = 0$. Describe the spectrum of the system in the various energy regimes.
(2) Repeat the previous exercise in the case $V(r \to 0) \to -\infty$.
(3) Consider the delta function potential in one dimension,
$$V = -V_0\,\delta(x). \tag{7.94}$$
Calculate its bound state spectrum.
(4) Calculate $\langle x^m\rangle_n$ and $\langle p^m\rangle_n$ for a particle in a box (an infinitely deep square well) for arbitrary integers n, m > 0. In which limits do we have a classical result?
(5) Consider a periodic cosinusoidal potential,
$$V(x) = V_0\cos(ax). \tag{7.95}$$
Calculate the bound state spectrum, and the number of nodes for the corresponding eigenstates.
(6) For the finite square well potential, prove the formula for the transmission coefficient (7.83), calculate R, and prove that $|R|^2 + T = 1$.
(7) Consider an asymmetric potential barrier, with
$$U(x) = \begin{cases} 0, & \ldots \\ U_0 > U_1, & \ldots \\ U_1 > 0, & \ldots \end{cases}$$
[…] $> 0$). Around this minimum, we can do a Taylor expansion:
$$V(x) \simeq V(x_0) + V'(x_0)(x - x_0) + \frac{V''(x_0)}{2}(x - x_0)^2 + \cdots = V(x_0) + \frac{V''(x_0)}{2}(x - x_0)^2. \tag{8.4}$$
Thus, with $x - x_0 \equiv y$ and $V''(x_0) \equiv m\omega^2$, we obtain approximately the above harmonic oscillator.
For a generic multiparticle coupled system with variables $x_1, \ldots, x_n$ and potential $V(x_1, \ldots, x_n)$, and a stable minimum (so the Hessian matrix is positive definite) written as $\vec x_0 = (x_{1,0}, \ldots, x_{n,0})$, we obtain similarly
$$V(x_1, \ldots, x_n) \simeq V(\vec x_0) + \frac{1}{2}\sum_{i,j=1}^{n}\partial_i\partial_j V(\vec x_0)\,(x_i - x_{i,0})(x_j - x_{j,0}). \tag{8.5}$$
By writing $\delta x_i \equiv x_i - x_{i,0}$ and
$$\partial_i\partial_j V(\vec x_0) \equiv V_{ij} = V_{ji}, \tag{8.6}$$
we obtain the Hamiltonian
$$H = \frac{1}{2}\sum_{i,j=1}^{n} p_i\frac{\delta_{ij}}{m}p_j + \frac{1}{2}\sum_{i,j=1}^{n}\delta x_i\,V_{ij}\,\delta x_j. \tag{8.7}$$
This corresponds to a matrix in $(i, j)$ space, which can be diagonalized to find the eigenmodes, i.e., the eigenvectors and eigenvalues for $H_{ij}$. In the diagonalized form, the system reduces to a system of decoupled harmonic oscillators.

Finally, similarly considering a field $\phi(x, t)$ in Fourier space, it can be decomposed into an infinite set of approximately harmonic oscillators, as in the above finite-dimensional case. We will not do this here, however, since field theory is beyond the scope of this book.

The classical equations of motion of the harmonic oscillator Hamiltonian (Hamilton's equations) are:
$$\dot{x} = \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad \dot{p} = -\frac{\partial H}{\partial x} = -m\omega^2 x, \tag{8.8}$$
leading to a single equation for $x(t)$,
$$\ddot{x} + \omega^2 x = 0, \tag{8.9}$$
with independent solutions $e^{i\omega t}$ and $e^{-i\omega t}$, or $\sin\omega t$ and $\cos\omega t$, with general real solution
$$x(t) = A\sin(\omega t + \phi). \tag{8.10}$$
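The reduction of the quadratic Hamiltonian (8.7) to decoupled oscillators can be sketched numerically. The example below is only an illustration under stated assumptions: equal masses $m$, and a hypothetical Hessian $V_{ij}$ describing two particles, each bound by a spring of constant `k` and coupled to each other by a spring of constant `g`. The function name `normal_mode_frequencies` and these parameter values are not from the text.

```python
import numpy as np

def normal_mode_frequencies(V, m=1.0):
    """Diagonalize the symmetric Hessian V_ij of H = p.p/(2m) + dx.V.dx/2.

    Returns the eigenmode frequencies omega_q = sqrt(v_q / m), where v_q are
    the eigenvalues of V, together with the orthonormal eigenvectors (columns).
    """
    V = np.asarray(V, dtype=float)
    # eigh is appropriate for a real symmetric matrix; eigenvalues come out ascending.
    eigvals, eigvecs = np.linalg.eigh(V)
    if np.any(eigvals <= 0):
        raise ValueError("Hessian is not positive definite: not a stable minimum")
    return np.sqrt(eigvals / m), eigvecs

# Hypothetical coupled system (assumed values, for illustration only).
k, g = 1.0, 0.5
V = np.array([[k + g, -g],
              [-g, k + g]])
omegas, modes = normal_mode_frequencies(V)
print(omegas)  # frequencies of the symmetric and antisymmetric modes
```

For this choice the eigenvalues of $V_{ij}$ are $k$ and $k + 2g$, so the two decoupled oscillators have $\omega^2 = k/m$ and $\omega^2 = (k+2g)/m$.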
8.2 Quantization in the Creation and Annihilation Operator Formalism

We want to solve the Schrödinger equation for the quantum harmonic oscillator,
$$i\hbar\,\partial_t|\psi\rangle = \hat{H}|\psi\rangle, \tag{8.11}$$
where the quantum Hamiltonian is, as we saw above,
$$\hat{H} = \frac{\hat{P}^2}{2m} + \frac{m\omega^2}{2}\hat{X}^2. \tag{8.12}$$
We do the usual separation of variables, writing the eigenvalue problem for $\hat{H}$ (the time-independent Schrödinger equation),
$$\hat{H}|\psi\rangle = E|\psi\rangle. \tag{8.13}$$
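In order to do that, and to diagonalize the Hamiltonian, i.e., find its eigenstates, we consider a change of quantum operators from $(\hat{X}, \hat{P})$ to complex operators $(\hat{a}, \hat{a}^\dagger)$, defined by
$$\begin{aligned}
\hat{a} &= \frac{1}{\sqrt{2}}\left(\sqrt{\frac{m\omega}{\hbar}}\,\hat{X} + \frac{i}{\sqrt{m\omega\hbar}}\,\hat{P}\right) \equiv \frac{\hat{Q} + i\hat{P}}{\sqrt{2}},\\
\hat{a}^\dagger &= \frac{1}{\sqrt{2}}\left(\sqrt{\frac{m\omega}{\hbar}}\,\hat{X} - \frac{i}{\sqrt{m\omega\hbar}}\,\hat{P}\right) \equiv \frac{\hat{Q} - i\hat{P}}{\sqrt{2}},
\end{aligned} \tag{8.14}$$
where, after the $\equiv$ signs, $\hat{Q} = \sqrt{m\omega/\hbar}\,\hat{X}$ and $\hat{P}$ stands for the rescaled (dimensionless) momentum $\hat{P}/\sqrt{m\omega\hbar}$, so that $[\hat{Q}, \hat{P}] = i\hat{1}$. The inverse is
$$\hat{Q} = \frac{\hat{a} + \hat{a}^\dagger}{\sqrt{2}}, \qquad \hat{P} = \frac{\hat{a} - \hat{a}^\dagger}{i\sqrt{2}}. \tag{8.15}$$
Under this change, the canonical commutation relation $[\hat{X}, \hat{P}] = i\hbar\hat{1}$ becomes
$$[\hat{a}, \hat{a}^\dagger] = \frac{1}{2}[\hat{Q} + i\hat{P}, \hat{Q} - i\hat{P}] = +i[\hat{P}, \hat{Q}] = \hat{1}, \tag{8.16}$$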
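and of course we also have $[\hat{a}, \hat{a}] = [\hat{a}^\dagger, \hat{a}^\dagger] = 0$. The Hamiltonian then becomes
$$\begin{aligned}
\hat{H} &= \frac{m\omega\hbar}{2m}\,\frac{(\hat{a} - \hat{a}^\dagger)^2}{(-2)} + \frac{m\omega^2}{2}\,\frac{\hbar}{m\omega}\,\frac{(\hat{a} + \hat{a}^\dagger)^2}{2} = \frac{\hbar\omega}{2}(\hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a})\\
&= \hbar\omega\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right) \equiv \hbar\omega\left(\hat{N} + \frac{1}{2}\right),
\end{aligned} \tag{8.17}$$
where in the second line we have used the commutation relation $[\hat{a}, \hat{a}^\dagger] = 1$ and have defined the "number operator" (we will see shortly why it is called this)
$$\hat{N} = \hat{a}^\dagger\hat{a}, \tag{8.18}$$
which we can easily see is Hermitian, $\hat{N}^\dagger = \hat{a}^\dagger\hat{a} = \hat{N}$. This means that the eigenstates of $\hat{H}$ are also eigenstates of $\hat{N}$, so we will consider the eigenvalue problem for the latter, with the eigenvalue called $n$:
$$\hat{N}|n\rangle = n|n\rangle. \tag{8.19}$$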
Since we have a Hilbert space, the norms of the (nonzero) states $|n\rangle$ must be strictly positive, $\langle n|n\rangle > 0$. But now consider the expectation value
$$\langle n|\hat{N}|n\rangle = \langle n|\hat{a}^\dagger\hat{a}|n\rangle = \|\hat{a}|n\rangle\|^2 \ge 0, \qquad \langle n|\hat{N}|n\rangle = n\langle n|n\rangle, \tag{8.20}$$
leading to the fact that the eigenvalue $n$ is nonnegative, $n \ge 0$, and moreover
$$n = 0 \Leftrightarrow \hat{a}|n\rangle = 0. \tag{8.21}$$
That means that the ground state of the harmonic oscillator, the state of minimum energy and therefore also the minimum eigenvalue for $\hat{N}$, has $n = 0$ and so is denoted $|0\rangle$ and is called the "vacuum" state. It is defined by
$$\hat{a}|0\rangle = 0. \tag{8.22}$$
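Next we note that $n$ (the eigenvalue for $\hat{N}$) is increased by the action of $\hat{a}^\dagger$ and decreased by the action of $\hat{a}$, since
$$\begin{aligned}
[\hat{N}, \hat{a}] &= \hat{a}^\dagger\hat{a}\hat{a} - \hat{a}\hat{a}^\dagger\hat{a} = [\hat{a}^\dagger, \hat{a}]\hat{a} = -\hat{a},\\
[\hat{N}, \hat{a}^\dagger] &= \hat{a}^\dagger\hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a}^\dagger\hat{a} = \hat{a}^\dagger[\hat{a}, \hat{a}^\dagger] = \hat{a}^\dagger.
\end{aligned} \tag{8.23}$$
Indeed, then $\hat{N}|n\rangle = n|n\rangle$ implies that
$$\begin{aligned}
\hat{N}(\hat{a}|n\rangle) &= (\hat{a}\hat{N} - \hat{a})|n\rangle = (n - 1)(\hat{a}|n\rangle) \;\Rightarrow\; \hat{a}|n\rangle \propto |n - 1\rangle,\\
\hat{N}(\hat{a}^\dagger|n\rangle) &= (\hat{a}^\dagger\hat{N} + \hat{a}^\dagger)|n\rangle = (n + 1)(\hat{a}^\dagger|n\rangle) \;\Rightarrow\; \hat{a}^\dagger|n\rangle \propto |n + 1\rangle.
\end{aligned} \tag{8.24}$$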
This means that $\hat{a}^\dagger$ creates a quantum with energy $\hbar\omega$, since when acting on a state it increases the energy as $E \to E + \hbar\omega$, and $\hat{a}$ annihilates a quantum of energy $\hbar\omega$, decreasing the energy of a state as $E \to E - \hbar\omega$. Thus $\hat{a}$ is called an annihilation or lowering operator, and $\hat{a}^\dagger$ is called a creation or raising operator.

Since $n$ can jump by only one, up or down, and is bounded below by $n \ge 0$, repeated lowering must end on a state annihilated by $\hat{a}$, which by (8.21) has $n = 0$. It follows that $n \in \mathbb{N}$ is a natural number. The states are then indexed by this natural number $n$,
$$|0\rangle, \quad |1\rangle \propto \hat{a}^\dagger|0\rangle, \quad |2\rangle \propto \hat{a}^\dagger|1\rangle \propto (\hat{a}^\dagger)^2|0\rangle, \quad \ldots \tag{8.25}$$
and the eigenvalue for the state $|n\rangle$ is
$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega. \tag{8.26}$$
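We require orthonormal states, $\langle n|n\rangle = 1$, and, more generally,
$$\langle n|m\rangle = \delta_{nm}, \tag{8.27}$$
which means that the proportionality constants for the action of $\hat{a}$ and $\hat{a}^\dagger$, given by
$$\hat{a}|n\rangle = \alpha_n|n - 1\rangle, \qquad \hat{a}^\dagger|n\rangle = \beta_n|n + 1\rangle, \tag{8.28}$$
can be calculated. Indeed, from the relations found previously,
$$n\,\||n\rangle\|^2 = \|\hat{a}|n\rangle\|^2 = |\alpha_n|^2\,\||n - 1\rangle\|^2 \;\Rightarrow\; |\alpha_n| = \sqrt{n}. \tag{8.29}$$
Choosing the phase of $\alpha_n$ to be unity (the trivial choice), we obtain $\alpha_n = \sqrt{n}$. Similarly, from the previous relations,
$$\langle n|\hat{a}\hat{a}^\dagger|n\rangle = \|\hat{a}^\dagger|n\rangle\|^2 = \langle n|(\hat{a}^\dagger\hat{a} + 1)|n\rangle = (n + 1)\,\||n\rangle\|^2 = |\beta_n|^2\,\||n + 1\rangle\|^2 \;\Rightarrow\; |\beta_n| = \sqrt{n + 1}. \tag{8.30}$$
Then, choosing the phase of $\beta_n$ to also be unity, we obtain $\beta_n = \sqrt{n + 1}$. All in all,
$$\hat{a}|n\rangle = \sqrt{n}\,|n - 1\rangle, \qquad \hat{a}^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle. \tag{8.31}$$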
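Since $\langle n|m\rangle = \delta_{mn}$, it means that the matrix elements of $\hat{a}$ and $\hat{a}^\dagger$ are
$$\langle m|\hat{a}|n\rangle = \sqrt{n}\,\delta_{m,n-1}, \qquad \langle m|\hat{a}^\dagger|n\rangle = \sqrt{n + 1}\,\delta_{m,n+1}, \tag{8.32}$$
which means the operators $\hat{a}$ and $\hat{a}^\dagger$ are represented in the basis with $n = 0, 1, 2, \ldots$ by the matrices
$$\hat{a} = \begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & 0 & \cdots\\
0 & 0 & \sqrt{2} & 0 & 0 & \cdots\\
0 & 0 & 0 & \sqrt{3} & 0 & \cdots\\
0 & 0 & 0 & 0 & \sqrt{4} & \cdots\\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}, \qquad
\hat{a}^\dagger = \begin{pmatrix}
0 & 0 & 0 & 0 & \cdots\\
\sqrt{1} & 0 & 0 & 0 & \cdots\\
0 & \sqrt{2} & 0 & 0 & \cdots\\
0 & 0 & \sqrt{3} & 0 & \cdots\\
0 & 0 & 0 & \sqrt{4} & \cdots\\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \tag{8.33}$$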
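The matrix representation (8.33) can be checked numerically by truncating the occupation number basis to the first $N$ states. This is only a sketch: the truncation size `N = 6` is an arbitrary choice, and the truncation itself is an approximation that spoils $[\hat{a}, \hat{a}^\dagger] = 1$ in the last diagonal entry.

```python
import numpy as np

N = 6  # number of basis states kept, |0>, ..., |N-1> (illustrative choice)

# <m|a|n> = sqrt(n) delta_{m,n-1}: entries sqrt(1), ..., sqrt(N-1) on the superdiagonal.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
# <m|a^dagger|n> = sqrt(n+1) delta_{m,n+1}: the Hermitian conjugate of a.
adag = a.conj().T

# Commutator [a, a^dagger]: the identity, except for the last diagonal entry,
# which equals -(N-1) because the state |N> has been cut off.
comm = a @ adag - adag @ a

# Number operator and Hamiltonian in units of hbar*omega.
number = adag @ a
H_over_hbar_omega = number + 0.5 * np.eye(N)
print(np.diag(H_over_hbar_omega))  # n + 1/2 for n = 0, ..., N-1
```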
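Moreover, we obtain the general state as a repeated action with $\hat{a}^\dagger$, using the recursion relation $n$ times,
$$|n\rangle = \frac{\hat{a}^\dagger}{\sqrt{n}}|n - 1\rangle = \frac{(\hat{a}^\dagger)^2}{\sqrt{n(n - 1)}}|n - 2\rangle = \cdots = \frac{(\hat{a}^\dagger)^n}{\sqrt{n!}}|0\rangle. \tag{8.34}$$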
8.3 Generalization

We can easily generalize the single harmonic oscillator to the case of several oscillators, coming either from several particles with harmonic oscillator potentials, or from diagonalizing a total interacting Hamiltonian expanded around a stable minimum. Then we have
$$H = \sum_q H_q, \tag{8.35}$$
with each $H_q$ being an independent harmonic oscillator of $\omega = \omega_q$,
$$H_q = \hbar\omega_q\left(\hat{a}_q^\dagger\hat{a}_q + \frac{1}{2}\right). \tag{8.36}$$
The fact that the harmonic oscillators are independent means that we have
$$[\hat{a}_q, \hat{a}_{q'}^\dagger] = \delta_{q,q'}\hat{1} \tag{8.37}$$
(and the rest of the commutators vanish). Then the states of the total system are, as in the general theory described previously, tensor products of the states for each Hamiltonian $H_q$, meaning that the general state, with occupation numbers $n_1, n_2, \ldots, n_q, \ldots$ in each mode, is
$$|n_1, n_2, \ldots, n_q, \ldots\rangle = \frac{(\hat{a}_1^\dagger)^{n_1}}{\sqrt{n_1!}} \cdots \frac{(\hat{a}_q^\dagger)^{n_q}}{\sqrt{n_q!}} \cdots |0\rangle, \tag{8.38}$$
where the total vacuum state is the tensor product of the vacuum state of each oscillator,
$$|0\rangle \equiv |0\rangle_1 \otimes |0\rangle_2 \otimes \cdots \tag{8.39}$$
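8.4 Coherent States

We can define other bases than $\{|n\rangle\}$ (eigenstates of the Hamiltonian) in the Hilbert space. One such basis is the basis of coherent states $|\alpha\rangle$, defined as eigenstates of the annihilation operator $\hat{a}$,
$$\hat{a}|\alpha\rangle = \alpha|\alpha\rangle. \tag{8.40}$$
We find that the coherent states can be written in terms of the basis of $|n\rangle$ states, as
$$|\alpha\rangle = e^{\alpha\hat{a}^\dagger}|0\rangle \equiv \sum_{n \ge 0}\frac{\alpha^n}{n!}(\hat{a}^\dagger)^n|0\rangle. \tag{8.41}$$
Indeed, then
$$\hat{a}|\alpha\rangle = [\hat{a}, e^{\alpha\hat{a}^\dagger}]|0\rangle = \alpha e^{\alpha\hat{a}^\dagger}|0\rangle = \alpha|\alpha\rangle, \tag{8.42}$$
where in the first equality we have used $\hat{a}|0\rangle = 0$ and in the second we have used
$$\begin{aligned}
[\hat{a}, e^{\alpha\hat{a}^\dagger}] &= \sum_{n \ge 0}\frac{\alpha^n}{n!}\left([\hat{a}, \hat{a}^\dagger](\hat{a}^\dagger)^{n-1} + \hat{a}^\dagger[\hat{a}, \hat{a}^\dagger](\hat{a}^\dagger)^{n-2} + \cdots + (\hat{a}^\dagger)^{n-1}[\hat{a}, \hat{a}^\dagger]\right)\\
&= \alpha\sum_{n \ge 1}\frac{\alpha^{n-1}}{(n - 1)!}(\hat{a}^\dagger)^{n-1} = \alpha e^{\alpha\hat{a}^\dagger}.
\end{aligned} \tag{8.43}$$
Similarly, we define the complex conjugate state,
$$\langle\alpha^*| \equiv \langle 0|e^{\alpha^*\hat{a}}, \tag{8.44}$$
obeying the complex conjugate relation
$$\langle\alpha^*|\hat{a}^\dagger = \langle\alpha^*|\alpha^*. \tag{8.45}$$
The inner product of the bra and ket states is found to be
$$\langle\alpha^*|\alpha\rangle = \langle 0|e^{\alpha^*\hat{a}}|\alpha\rangle = e^{\alpha^*\alpha}\langle 0|\alpha\rangle = e^{\alpha^*\alpha}, \tag{8.46}$$
since we have (using $\langle 0|\hat{a}^\dagger = 0$)
$$\langle 0|\alpha\rangle = \langle 0|e^{\alpha\hat{a}^\dagger}|0\rangle = \langle 0|0\rangle = 1. \tag{8.47}$$
We also have a completeness relation for these bra and ket states,
$$\hat{1} = \int\frac{d\alpha\,d\alpha^*}{2\pi i}\,e^{-\alpha\alpha^*}|\alpha\rangle\langle\alpha^*|, \tag{8.48}$$
but we will leave its proof as an exercise.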
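The eigenvalue property (8.40) and the inner product (8.46) can be illustrated numerically in a truncated occupation number basis. This is a sketch under stated assumptions: the truncation size `N = 40` and the value of `alpha` are arbitrary choices, and the components $\langle n|\alpha\rangle = \alpha^n/\sqrt{n!}$ follow from (8.41) together with $|n\rangle = (\hat{a}^\dagger)^n|0\rangle/\sqrt{n!}$.

```python
import numpy as np
from math import exp, factorial, sqrt

N = 40                   # truncation size (illustrative choice)
alpha = 0.7 + 0.3j       # arbitrary complex eigenvalue

# Truncated annihilation operator in the |n> basis.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)

# Components <n|alpha> = alpha^n / sqrt(n!) of the unnormalized state e^{alpha a^dagger}|0>.
c = np.array([alpha ** n / sqrt(factorial(n)) for n in range(N)])

# a|alpha> - alpha|alpha> vanishes up to the (tiny) truncation error in the last component.
residual = a @ c - alpha * c

# <alpha^*|alpha> should approach exp(|alpha|^2) as N grows.
norm_sq = np.vdot(c, c).real
print(np.max(np.abs(residual)), norm_sq, exp(abs(alpha) ** 2))
```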
8.5 Solution in the Coordinate, $|x\rangle$, Representation (Basis)

Until now, we have worked abstractly with ket states, without choosing a representation. But we now want to obtain the probabilities of finding the harmonic oscillator in coordinate ($x$) space, so we need to calculate the wave functions $\psi_n(x) \equiv \langle x|n\rangle$.

According to the general theory defined in the previous chapter, the Schrödinger equation $\hat{H}|n\rangle = E_n|n\rangle$ becomes
$$\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2\right)\psi_n(x) = E_n\psi_n(x), \tag{8.49}$$
by inserting the completeness relation $\int dx\,|x\rangle\langle x| = \hat{1}$ and multiplying with $\langle x|$. Rewriting it as
$$\psi_n''(x) + \left[\epsilon_n - \left(\frac{m\omega x}{\hbar}\right)^2\right]\psi_n(x) = 0, \tag{8.50}$$
where we have defined as before
$$\epsilon = \frac{2m}{\hbar^2}E, \tag{8.51}$$
or, better still, defining the coordinate $y = x\sqrt{m\omega/\hbar}$ and
$$\tilde{\epsilon} = \frac{E}{\hbar\omega}, \tag{8.52}$$
we have
$$\psi_n''(y) + (2\tilde{\epsilon}_n - y^2)\psi_n(y) = 0. \tag{8.53}$$
In order to solve this differential equation and find its eigenvalues $\tilde{\epsilon}_n$, we use a method, called the Sommerfeld method, that can be used in more general situations. We first find the behavior of the equation, and the corresponding solutions, at the extremes of the region, $y \to \infty$ and $y \to 0$, and then we factor them out of $\psi(y)$, and write an equation for the reduced function.

(a) The limit $y \to \infty$

In this limit, equation (8.53) becomes
$$\psi'' - y^2\psi = 0, \tag{8.54}$$
on neglecting the constant in the second term. The solution is of the general form
$$\psi(y) = (Ay^m)\,e^{\pm y^2/2}, \tag{8.55}$$
where for completeness we have considered also the subleading polynomial factor $(Ay^m)$, but in fact this is not needed, since all we need to do is to factor out the leading behavior. Indeed, then
$$\psi''(y) = Ay^{m+2}e^{\pm y^2/2}\left[1 \pm \frac{2m + 1}{y^2} + \frac{m(m - 1)}{y^4}\right] \to y^2\psi(y), \tag{8.56}$$
independently of the polynomial in the $y \to \infty$ limit. From the condition that $\psi(y)$ must be a normalizable function (since its modulus squared gives probabilities), we can only have the negative sign in the exponent, $e^{-y^2/2}$.
(b) The limit $y \to 0$

In this case we can ignore the $y^2$ term with respect to the constant in the second term, obtaining
$$\psi''(y) + 2\tilde{\epsilon}\,\psi(y) = 0, \tag{8.57}$$
with the general solution
$$\psi(y) = \tilde{A}\sin\left(\sqrt{2\tilde{\epsilon}}\,y\right) + \tilde{B}\cos\left(\sqrt{2\tilde{\epsilon}}\,y\right). \tag{8.58}$$
However, since we are considering the $y \to 0$ limit, we can ignore order-$y^2$ terms in the solution (as in the equation), and write only
$$\psi(y) \to \tilde{A}\sqrt{2\tilde{\epsilon}}\,y + \tilde{B}. \tag{8.59}$$
Putting together the two limits, we factor out the leading behaviors at infinity ($e^{-y^2/2}$) and at zero (just 1, a constant), redefining
$$\psi(y) = e^{-y^2/2}H(y), \tag{8.60}$$
where from the general (including subleading) behavior at infinity, and the behavior at zero, we know that $H(y)$ should be a polynomial ($\sim Ay^m$ at $y \to \infty$ and $\sim \tilde{C}y + \tilde{B}$ at $y \to 0$). We calculate first
$$\psi''(y) = e^{-y^2/2}\left[H''(y) - 2yH'(y) + (y^2 - 1)H(y)\right], \tag{8.61}$$
which means that the equation for $H(y)$ is
$$H''(y) - 2yH'(y) + (2\tilde{\epsilon} - 1)H(y) = 0. \tag{8.62}$$
But from the fact that the good (i.e., normalizable) solution at $y \to \infty$ is $\sim e^{-y^2/2}$ and not $\sim e^{+y^2/2}$, which in general is a linear combination of the two possible solutions at zero ($\sin\sqrt{2\tilde{\epsilon}}\,y \sim \sqrt{2\tilde{\epsilon}}\,y$ and $\cos\sqrt{2\tilde{\epsilon}}\,y \sim 1$), while we want a nonzero solution at $y = 0$, we get a constraint on $\tilde{\epsilon} = \tilde{\epsilon}_n$, that is, a quantization condition for the energy. Indeed, this good behavior both at $y = 0$ and at $y = \infty$ of the solution will lead to the condition
$$2\tilde{\epsilon} = \frac{2E_n}{\hbar\omega} = 2n + 1, \quad n \in \mathbb{N}, \tag{8.63}$$
which will mean that the $H_n(y)$ are Hermite polynomials. To see this, we first write $H(y)$ as an infinite Taylor series,
$$H(y) = \sum_{n \ge 0}C_ny^n. \tag{8.64}$$
Then, substituting in the equation for $H(y)$, equation (8.62), we find
$$\sum_{n \ge 0}C_n\left[n(n - 1)y^{n-2} - 2ny^n + (2\tilde{\epsilon} - 1)y^n\right] = 0. \tag{8.65}$$
But we can redefine the sums above to have the same $y^n$ factor (and verify that in doing so we don't change the $n = 0$ and $n = 1$ terms):
$$\sum_{n \ge 0}y^n\left[(n + 2)(n + 1)C_{n+2} + C_n(2\tilde{\epsilon} - 1 - 2n)\right] = 0. \tag{8.66}$$
Since the $y^n$ are linearly independent functions, we can set all their coefficients to zero, obtaining a recurrence relation for $C_n$,
$$C_{n+2} = -C_n\,\frac{2\tilde{\epsilon} - 1 - 2n}{(n + 2)(n + 1)}. \tag{8.67}$$
Here we use the fact that equation (8.62) is a rewriting (by redefining $\psi(y)$ as $e^{-y^2/2}H(y)$) of (8.53), which we know has two possible solutions at infinity, $e^{\pm y^2/2}$; so (8.62) also has two possible solutions at infinity, one polynomial, and one $\sim e^{+y^2}$. Generically, then, the series $H(y) = \sum_{n \ge 0}C_ny^n$ would lead, if all $C_n$s were nonzero all the way to infinity, to $\sim e^{+y^2}$ behavior. To avoid that possibility, we require the Taylor series to terminate at a finite $n$, so that $H(y)$ is a Hermite polynomial $H_n(y)$. Given the above recurrence relation, this is only possible if $C_{n+2}/C_n = 0$ for some $n$, implying indeed that
$$\tilde{\epsilon}_n = n + \frac{1}{2}, \quad n \in \mathbb{N}. \tag{8.68}$$
For such an $n$, we obtain the eigenfunction
$$\psi_n(x) = A_n\,e^{-m\omega x^2/2\hbar}\,H_n\!\left(x\sqrt{\frac{m\omega}{\hbar}}\right), \tag{8.69}$$
where $A_n$ is a normalization constant, which can be found from the condition
$$\int_{-\infty}^{+\infty}dx\,|\psi_n(x)|^2 = 1, \tag{8.70}$$
leading to
$$A_n = \left[\frac{m\omega}{\pi\hbar\,2^{2n}(n!)^2}\right]^{1/4}. \tag{8.71}$$
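The termination argument can be made concrete by iterating the recurrence for the series coefficients. Collecting the coefficient of $y^k$ in equation (8.62) gives $(k+2)(k+1)C_{k+2} = (2k + 1 - 2\tilde{\epsilon})\,C_k$; the sketch below applies it with exact rational arithmetic. The function name `series_coefficients`, its arguments, and the cutoff `kmax` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def series_coefficients(two_eps, parity, kmax=12):
    """Coefficients C_k of H(y) = sum_k C_k y^k for H'' - 2y H' + (two_eps - 1) H = 0.

    two_eps is 2*epsilon-tilde (an int or Fraction). parity = 0 starts the even
    series from C_0 = 1; parity = 1 starts the odd series from C_1 = 1.
    """
    C = [Fraction(0)] * (kmax + 1)
    C[parity] = Fraction(1)
    for k in range(parity, kmax - 1, 2):
        # Coefficient of y^k: (k+2)(k+1) C_{k+2} = (2k + 1 - two_eps) C_k.
        C[k + 2] = C[k] * Fraction(2 * k + 1 - two_eps, (k + 2) * (k + 1))
    return C

# Quantized case 2*eps = 2n + 1 with n = 2: the even series stops after y^2,
# giving 1 - 2 y^2, which is proportional to H_2(y) = 4 y^2 - 2.
even_n2 = series_coefficients(5, parity=0)

# Quantized case n = 3: the odd series stops after y^3, giving y - (2/3) y^3,
# proportional to H_3(y) = 8 y^3 - 12 y.
odd_n3 = series_coefficients(7, parity=1)

# Generic (non-quantized) case 2*eps = 4: the series never terminates.
generic = series_coefficients(4, parity=0)
```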
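8.6 Alternative to $|x\rangle$ Representation: Basis Change from $|n\rangle$ Representation

As an alternative to the above derivation, we could go from the $|n\rangle$ to the $|x\rangle$ basis directly. First, we consider the ground state, defined by $\hat{a}|0\rangle = 0$. Introducing the completeness relation $\hat{1} = \int dx'\,|x'\rangle\langle x'|$ on the right-hand side of $\hat{a}$, multiplying with $\langle x|$, and defining the ground state wave function
$$\langle x|0\rangle \equiv \psi_0(x), \tag{8.72}$$
we obtain
$$0 = \int dx'\,\langle x|\hat{a}|x'\rangle\,\psi_0(x'). \tag{8.73}$$
Using the matrix element
$$\begin{aligned}
\langle x|\hat{a}|x'\rangle &= \frac{1}{\sqrt{2}}\langle x|\left(\sqrt{\frac{m\omega}{\hbar}}\,\hat{X} + \frac{1}{\sqrt{m\omega\hbar}}\,i\hat{P}\right)|x'\rangle = \frac{1}{\sqrt{2}}\,\delta(x - x')\left(\sqrt{\frac{m\omega}{\hbar}}\,x' + \sqrt{\frac{\hbar}{m\omega}}\,\frac{d}{dx'}\right)\\
&= \frac{\delta(x - x')}{\sqrt{2}}\left(y' + \frac{d}{dy'}\right),
\end{aligned} \tag{8.74}$$
where we have defined $y \equiv x\sqrt{m\omega/\hbar}$ as before, we find the equation
$$0 = \int dx'\,\frac{\delta(x - x')}{\sqrt{2}}\left(y' + \frac{d}{dy'}\right)\psi_0(x') = \frac{1}{\sqrt{2}}\left(y + \frac{d}{dy}\right)\psi_0(y). \tag{8.75}$$
Its solution is
$$\psi_0(x) = A_0\,e^{-m\omega x^2/2\hbar}. \tag{8.76}$$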
To proceed to the other states, we act with $\langle x|$ on
$$|n\rangle = \frac{(\hat{a}^\dagger)^n}{\sqrt{n!}}|0\rangle, \tag{8.77}$$
and again by inserting a completeness relation on the left of $|0\rangle$, we find
$$\begin{aligned}
\psi_n(x) &= \frac{1}{\sqrt{n!}}\int dx'\,\langle x|(\hat{a}^\dagger)^n|x'\rangle\,\psi_0(x') = \frac{A_0}{\sqrt{n!}}\left(\frac{y - d/dy}{\sqrt{2}}\right)^n e^{-y^2/2}\\
&= \frac{A_0}{\sqrt{n!}\,2^{n/2}}\,e^{-y^2/2}H_n(y),
\end{aligned} \tag{8.78}$$
since the Hermite polynomials can be defined by
$$H_n(y) \equiv e^{y^2/2}\left(y - \frac{d}{dy}\right)^n e^{-y^2/2}. \tag{8.79}$$
This result matches what we found before, including the normalization constant.
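As a short worked check of the operator definition (8.79), the first two nontrivial cases can be computed by hand; the results agree with the standard Hermite polynomials $H_1(y) = 2y$ and $H_2(y) = 4y^2 - 2$.

```latex
\begin{aligned}
H_1(y) &= e^{y^2/2}\left(y - \frac{d}{dy}\right)e^{-y^2/2}
        = e^{y^2/2}\left(y + y\right)e^{-y^2/2} = 2y,\\
H_2(y) &= e^{y^2/2}\left(y - \frac{d}{dy}\right)\left(2y\,e^{-y^2/2}\right)
        = e^{y^2/2}\left(2y^2 - 2 + 2y^2\right)e^{-y^2/2} = 4y^2 - 2.
\end{aligned}
```

In the same way, comparing (8.78) with (8.69) for $n = 0$ gives $A_0 = (m\omega/\pi\hbar)^{1/4}$, and then $A_0/(\sqrt{n!}\,2^{n/2})$ reproduces $A_n$ in (8.71).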
8.7 Properties of Hermite Polynomials

The Hermite polynomials form an orthogonal set with respect to the weight $e^{-y^2}$, and so have scalar product
$$\int_{-\infty}^{+\infty}H_n(y)H_m(y)\,e^{-y^2}\,dy = \delta_{nm}\,(2^n n!\sqrt{\pi}). \tag{8.80}$$
They can be defined by
$$H_n(x) = (-1)^n e^{x^2}\left(\frac{d}{dx}\right)^n e^{-x^2}, \tag{8.81}$$
have parity given by the order $n$,
$$H_n(-x) = (-1)^n H_n(x), \tag{8.82}$$
and satisfy the recurrence relations
$$H_{n+1}(x) - 2xH_n(x) + 2nH_{n-1}(x) = 0, \qquad H_n'(x) = 2nH_{n-1}(x). \tag{8.83}$$
They also form a complete set on the space of (doubly differentiable) functions on the interval $(-\infty, +\infty)$ with weight $e^{-x^2}$, $L^2_{e^{-x^2}}(\mathbb{R})$, since they satisfy
$$\sum_{n \ge 0}\psi_n(x)\psi_n(x') = \delta(x - x'). \tag{8.84}$$
They can be obtained by Taylor expansion from the generating function
$$e^{-s^2 + 2sz} = \sum_{n \ge 0}\frac{s^n}{n!}H_n(z). \tag{8.85}$$
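The scalar product (8.80) can be checked numerically with Gauss-Hermite quadrature, which integrates polynomials against the weight $e^{-y^2}$. The sketch below relies on NumPy's `hermgauss` and `hermval` from `numpy.polynomial.hermite` (physicists' convention); the helper name `hermite_overlap` and the quadrature order `deg = 20` are illustrative choices.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermgauss, hermval

def hermite_overlap(n, m, deg=20):
    """Approximate the integral of H_n(y) H_m(y) exp(-y^2) over the real line.

    Gauss-Hermite quadrature with `deg` nodes is exact for polynomial
    integrands of degree up to 2*deg - 1, so this is exact (to rounding)
    whenever n + m < 2*deg.
    """
    y, w = hermgauss(deg)
    cn = [0] * n + [1]   # coefficient vector selecting H_n
    cm = [0] * m + [1]   # coefficient vector selecting H_m
    return float(np.sum(w * hermval(y, cn) * hermval(y, cm)))

print(hermite_overlap(2, 2), 2 ** 2 * factorial(2) * sqrt(pi))  # both equal 8 sqrt(pi)
print(hermite_overlap(2, 3))                                     # vanishes up to rounding
```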
8.8 Mathematical Digression (Appendix): Classical Orthogonal Polynomials

A set of polynomials $\{P_n(x)\}$ constitutes a set of classical orthogonal polynomials with weight $\rho$ on the interval $(a, b)$ if
$$\int_a^b\rho(x)P_n(x)P_m(x)\,dx = \delta_{nm} \tag{8.86}$$
and $\rho$ satisfies
$$[\sigma(x)\rho(x)]' = \tau(x)\rho(x), \tag{8.87}$$
where
$$\sigma(x) = \begin{cases}
(x - a)(b - x), & a, b \text{ finite}\\
x - a, & b = \infty,\ a \text{ finite}\\
b - x, & a = -\infty,\ b \text{ finite},
\end{cases} \tag{8.88}$$
$\tau(x)$ is a linear function and
$$\rho(x) = \begin{cases}
(x - a)^\beta(b - x)^\alpha, \quad \alpha = -\dfrac{\tau(b)}{b - a} - 1,\ \beta = \dfrac{\tau(a)}{b - a} - 1, & a, b \text{ finite}\\[2mm]
(x - a)^\beta e^{x\tau'(x)}, \quad \beta = \tau(a) - 1, & b = \infty,\ a \text{ finite}\\[1mm]
(b - x)^\alpha e^{-x\tau'(x)}, \quad \alpha = -\tau(b) - 1, & a = -\infty,\ b \text{ finite}\\[1mm]
\exp\left(\int\tau(x)\,dx\right), & (a, b) = (-\infty, +\infty).
\end{cases} \tag{8.89}$$
Then, one can prove the limits
$$\lim_{x \to a}x^m\sigma(x)\rho(x) = 0, \qquad \lim_{x \to b}x^m\sigma(x)\rho(x) = 0. \tag{8.90}$$
One can take the general classical polynomials to their canonical forms, by making changes of variables for $x$ and $P_n$.

(1) The case where $(a, b)$ are finite

In this case, we can change variables to set $(a, b) \to (-1, 1)$ and obtain
$$\rho(t) = (1 - t)^\alpha(1 + t)^\beta, \qquad \sigma(t) = 1 - t^2, \qquad \tau(t) = -(\alpha + \beta + 2)t + \beta - \alpha. \tag{8.91}$$
The resulting classical orthogonal polynomials are the Jacobi polynomials. In the particular case of $\alpha = \beta = 0$ (so that $\rho(t) = 1$ is trivial), we obtain for $P_n(x)$ the Legendre polynomials. In the particular cases $\alpha = \beta = -1/2$ and $\alpha = \beta = +1/2$, we obtain the Chebyshev polynomials of the first kind, $T_n(x)$, and of the second kind, $U_n(x)$, respectively. In the cases $\alpha = \beta = \lambda - 1/2$, we obtain the Gegenbauer polynomials $G_n(x)$.

(2) The cases $(-\infty, b)$ and $(a, +\infty)$

In these cases, we change variables $(a, b)$ to $(0, +\infty)$. We obtain the Laguerre polynomials $L_n^\alpha(x)$, with
$$\rho(t) = t^\alpha e^{-t}, \qquad \sigma(t) = t, \qquad \tau(t) = -t + \alpha + 1. \tag{8.92}$$

(3) The case $(a, b) = (-\infty, +\infty)$

In this case, we can put
$$\rho(t) = e^{-t^2}, \qquad \sigma(t) = 1, \qquad \tau(t) = -2t, \tag{8.93}$$
and the resulting polynomials are the Hermite polynomials.
Properties of the Classical Orthogonal Polynomials

The only orthogonal polynomials whose derivatives are also orthogonal polynomials are the classical orthogonal polynomials, which we are describing in this appendix. They satisfy the eigenvalue equation
$$\frac{d}{dx}\left[\sigma(x)\rho(x)\frac{d}{dx}P_n(x)\right] + \lambda_n\rho(x)P_n(x) = 0, \tag{8.94}$$
where the eigenvalue is
$$\lambda_n = -n\left[\tau'(x) + \frac{n - 1}{2}\sigma''(x)\right]. \tag{8.95}$$
They are also given by Rodrigues' formula,
$$P_n(x) = A_n\,\frac{1}{\rho(x)}\frac{d^n}{dx^n}\left[\sigma^n(x)\rho(x)\right], \tag{8.96}$$
and they form a basis for the Hilbert space in $L^2_\rho(a, b)$, so that
$$\sum_{n \ge 0}P_n(x)P_n(x') = \delta(x - x'). \tag{8.97}$$
They satisfy the recurrence formulas
$$xP_n(x) = \alpha_nP_{n+1}(x) + \beta_nP_n(x) + \gamma_nP_{n-1}(x), \tag{8.98}$$
valid for a general orthogonal polynomial, and
$$\begin{aligned}
\sigma(x)P_n'(x) &= \alpha_n^1P_{n+1}(x) + (\beta_n^1 + \gamma_n^1x)P_n(x),\\
P_n(x) &= \alpha_n^2P_{n+1}'(x) + (\beta_n^2 + \gamma_n^2x)P_n'(x),
\end{aligned} \tag{8.99}$$
valid only for classical orthogonal polynomials.

The generating function $K(x, t)$ can be Taylor expanded to give the classical orthogonal polynomials,
$$K(x, t) = \sum_{n \ge 0}\frac{\tilde{P}_n(x)}{n!}\,t^n, \tag{8.100}$$
and it can be calculated from the formula
$$K(x, t) = \frac{\rho(z_1)}{\rho(x)}\,\frac{1}{1 - t\sigma'(z_1)}, \tag{8.101}$$
where $z_1$ is the closest zero to $x$ of the function
$$f(z) = z - x - \sigma(z)t. \tag{8.102}$$
For the Legendre polynomials $P_n(x)$, we have
$$P_n(x) = \frac{1}{2^nn!}\frac{d^n}{dx^n}\left[(x^2 - 1)^n\right], \qquad \frac{1}{\sqrt{1 - 2tx + t^2}} = \sum_{n \ge 0}P_n(x)\,t^n. \tag{8.103}$$
For the Hermite polynomials $H_n(x)$, we have
$$H_n(x) = (-1)^ne^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right), \qquad e^{2tx - t^2} = \sum_{n \ge 0}\frac{t^n}{n!}H_n(x). \tag{8.104}$$
For the Laguerre polynomials $L_n^\alpha(x)$, we have
$$K(x, t) = \frac{e^{tx/(t - 1)}}{(1 - t)^{\alpha + 1}}. \tag{8.105}$$
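As a consistency check of the general formulas, one can specialize (8.87), (8.94), and (8.95) to the Hermite data $\rho = e^{-x^2}$, $\sigma = 1$, $\tau = -2x$ of (8.93). The short computation below recovers the differential equation (8.62) for $H(y)$ with $2\tilde{\epsilon} - 1 = 2n$, i.e., the quantization condition (8.68).

```latex
\begin{aligned}
[\sigma\rho]' &= \left(e^{-x^2}\right)' = -2x\,e^{-x^2} = \tau\rho,\\
\lambda_n &= -n\left[\tau' + \frac{n-1}{2}\sigma''\right] = -n\,(-2 + 0) = 2n,\\
0 &= \frac{d}{dx}\left[e^{-x^2}H_n'(x)\right] + 2n\,e^{-x^2}H_n(x)
   = e^{-x^2}\left[H_n''(x) - 2xH_n'(x) + 2nH_n(x)\right].
\end{aligned}
```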
Important Concepts to Remember

• The harmonic oscillator is an important prototype for a quantum mechanical system, which can be used to test formalisms and to approximate systems to the quadratic order.
• One can define its quantum mechanics in terms of creation and annihilation operators $\hat{a}^\dagger, \hat{a}$, satisfying $[\hat{a}, \hat{a}^\dagger] = 1$, which create and annihilate quanta of frequency $\omega$. This is also called the occupation number basis.
• The states of $n$ quanta are created by acting with $\hat{a}^\dagger$ $n$ times on the vacuum $|0\rangle$, satisfying $\hat{a}|0\rangle = 0$, i.e., $|n\rangle = [(\hat{a}^\dagger)^n/\sqrt{n!}]|0\rangle$; their number is measured by $\hat{N} = \hat{a}^\dagger\hat{a}$, $\hat{N}|n\rangle = n|n\rangle$, and their energy is $E_n = (n + 1/2)\hbar\omega$.
• Coherent states $|\alpha\rangle$ are eigenstates of the annihilation operator $\hat{a}$, $\hat{a}|\alpha\rangle = \alpha|\alpha\rangle$, and are written as $|\alpha\rangle = e^{\alpha\hat{a}^\dagger}|0\rangle$.
• The solution of the harmonic oscillator in the $|x\rangle$ (coordinate) representation is obtained by the Sommerfeld method, by finding the behavior of the harmonic oscillator equation, and of its solution, at the limits of the domain (here $x \to 0$ and $x \to \infty$), and factoring this out of the general solution, which usually reduces the solution to a polynomial.
• In the Sommerfeld method, one usually obtains a quantization condition by imposing that the unique normalizable solution at one end of the domain matches the unique normalizable solution at the other end, so that an infinite series reduces to a polynomial, usually a classical orthogonal one.
• For the harmonic oscillator, the Sommerfeld method leads to the Hermite polynomials and so the solution $\psi(y) = e^{-y^2/2}H_n(y)$, with $y = x\sqrt{m\omega/\hbar}$.
• Classical orthogonal polynomials are polynomials that are orthogonal on an interval $(a, b)$, with a weight $\rho(x)$, whose derivatives are also orthogonal polynomials.
• In the case of $a, b$ finite, redefining the domain to $(-1, 1)$ we have the Jacobi polynomials; in the case of $(a, +\infty)$ or $(-\infty, b)$ we redefine the domain to $(0, +\infty)$, obtaining the Laguerre polynomials, and in the case $(-\infty, +\infty)$ we have the Hermite polynomials.
Further Reading See [2] for more details.
Exercises

(1) Consider the Lagrangian
$$L = \frac{\dot{q}_1^2}{2} + \frac{\dot{q}_2^2}{2} - \frac{\alpha_1}{2}\sin^2(\beta_1q_1) - \frac{\alpha_2}{2}\sin^2(\beta_2q_2) - \frac{\alpha_{12}}{2}\sin^2(\beta_{12}(q_1 + q_2)). \tag{8.106}$$
Approximate the system by two harmonic oscillators, and quantize it in the creation and annihilation operator (occupation number) representation.
(2) Consider the Hamiltonian for a (large) number $N$ of oscillators $\hat{a}_i, \hat{a}_i^\dagger$ (with $[\hat{a}_i, \hat{a}_j^\dagger] = \delta_{ij}$):
$$H = \sum_{i=1}^{N}\left(\hat{a}_i^\dagger\hat{a}_i + \alpha\,\hat{a}_{i+1}^\dagger\hat{a}_i + \text{h.c.}\right), \tag{8.107}$$
where $\hat{a}_{N+1} \equiv \hat{a}_1$, $\hat{a}_{N+1}^\dagger \equiv \hat{a}_1^\dagger$. Diagonalize it and find its eigenstates.
(3) Show that the completeness relation for coherent states is (8.48).
(4) Calculate (in terms of a single ket state and no operators)
$$\exp\left\{i\left[\hat{a}^\dagger\hat{a} + \beta(\hat{a} + \hat{a}^\dagger)^3\right]\right\}|\alpha\rangle. \tag{8.108}$$
(5) Calculate $\langle x^2\rangle_n$ and $\langle x^3\rangle_n$ in the state $|n\rangle$ of the harmonic oscillator.
(6) Use the Sommerfeld method for a particle in one dimension with a quartic potential, instead of a quadratic one, $V(x) = \lambda x^4$. What is the resulting reduced equation, and can you describe the quantization condition for bound states?
(7) Consider two harmonic oscillators, one with frequency $\omega$ and one with frequency $2\omega$. Calculate the symmetric wave function (in $x$ space) corresponding to the energy $E = (15/2)\hbar\omega$.
9 The Heisenberg Picture and General Picture; Evolution Operator

Until now, we have considered the Schrödinger equation as acting on states $|\psi_S(t)\rangle$ which are time dependent, with operators that are time independent. This is known as the Schrödinger picture. However, we can consider other "pictures" with which to describe quantum mechanics, for instance one in which the operators are time dependent but states are time independent; this is known as the Heisenberg picture.
9.1 The Evolution Operator

In order to define the Heisenberg picture, we remember that in the Schrödinger picture the time evolution was packaged in a single operator, the "time evolution" operator $\hat{U}(t, t_0)$, defined in the Schrödinger picture by
$$|\psi_S(t)\rangle = \hat{U}(t, t_0)|\psi_S(t_0)\rangle. \tag{9.1}$$
Indeed, in this usual, Schrödinger, picture we can define the evolution operator as follows. We can note that the evolution from $|\psi(t_0)\rangle$ to $|\psi(t)\rangle$ is a linear operation, preserving in time the linear superposition of states, so we can define this time evolution as the result of an operator $\hat{U}(t, t_0)$ as in the relation (9.1) above. The relation must be a unitary transformation, $\hat{U}(t, t_0)^\dagger = \hat{U}^{-1}(t, t_0)$, since it must preserve the norm of the states in time,
$$\langle\psi(t)|\psi(t)\rangle = \langle\psi(t_0)|\psi(t_0)\rangle, \tag{9.2}$$
because the norm is associated with probabilities, and conservation of the total probability must be imposed.

We also saw that, for a conservative system, for which $\hat{H} \neq \hat{H}(t)$, on eigenstates $|\psi_E(t)\rangle$ of the Hamiltonian,
$$\hat{H}|\psi_E(t)\rangle = E|\psi_E(t)\rangle, \tag{9.3}$$
the time evolution of states is obtained as
$$|\psi_E(t)\rangle = e^{-iE(t - t_0)/\hbar}|\psi_E(t_0)\rangle = e^{-i\hat{H}(t - t_0)/\hbar}|\psi_E(t_0)\rangle, \tag{9.4}$$
so the time evolution operator is
$$\hat{U}(t, t_0) = e^{-i\hat{H}(t - t_0)/\hbar}. \tag{9.5}$$
In the energy eigenstates basis, the completeness relation is written as $\hat{1} = \sum_{n,a}|E_{n,a}\rangle\langle E_{n,a}|$, so, inserting it in front of the above relation, the time evolution operator can be written also as
$$\hat{U}(t, t_0) = \sum_{n,a}|E_{n,a}\rangle\langle E_{n,a}|\,e^{-iE_n(t - t_0)/\hbar}. \tag{9.6}$$
Moreover, we can write a differential equation for this time evolution operator by acting on (9.5) with $d/dt$, obtaining
$$i\hbar\frac{d}{dt}\hat{U}(t, t_0) = \hat{H}\hat{U}(t, t_0). \tag{9.7}$$
The difference is that now we can define the time evolution operator to be that found from the above differential equation, together with the boundary condition $\hat{U}(t_0, t_0) = \hat{1}$, even in the case of a nonconservative system, with $\hat{H} = \hat{H}(t)$. In this latter case, we can "integrate" the equation over a very small interval $t - t_0$, and write
$$\hat{U}(t, t_0) = \hat{1} - \frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\hat{U}(t', t_0)\,dt'. \tag{9.8}$$
This is in fact equivalent to the action of the Schrödinger equation on states,
$$i\hbar\frac{d}{dt}|\psi(t)\rangle = \hat{H}|\psi(t)\rangle, \tag{9.9}$$
since by integrating it we obtain the same time evolution operator:
$$|\psi(t + dt)\rangle = \left(\hat{1} - \frac{i}{\hbar}\hat{H}\,dt\right)|\psi(t)\rangle = \hat{U}(t + dt, t)|\psi(t)\rangle \simeq e^{-i\hat{H}(t)\,dt/\hbar}|\psi(t)\rangle. \tag{9.10}$$
We note that in this expression, since we have a Hermitian Hamiltonian, $\hat{H}^\dagger = \hat{H}$, the evolution operator is indeed unitary, $\hat{U}^\dagger = \hat{U}^{-1}$, so that finite time evolution amounts to an infinite sequence of unitary infinitesimal operations, leading to a total unitary operation. The final state, after this sequence of operations, each one centered on a $t_k$ (with later times acting to the left), is
$$|\psi(t)\rangle = \prod_{k=1}^{n}\exp\left(-\frac{i}{\hbar}\hat{H}(t_k)\,dt_k\right)|\psi(t_0)\rangle. \tag{9.11}$$
We need to use this complicated formula since, in general, the Hamiltonians at different times need not commute,
$$[\hat{H}(t_1), \hat{H}(t_2)] \neq 0, \tag{9.12}$$
so if we were to write an integral $\int\hat{H}(t')\,dt'$ in the exponent instead of the product of infinitesimal exponentials, this would be in general different from the above product of exponentials. Therefore we have to define the integral in the exponential only with a time ordering operator, which puts products of operators in time order: $T(A(t_1)\cdots A(t_n)) = A(t_1)\cdots A(t_n)$ if $t_1 > t_2 > \cdots > t_n$. Then
$$|\psi(t)\rangle = T\exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt'\right\}|\psi(t_0)\rangle, \tag{9.13}$$
where the time-ordered exponentials are defined as
$$T\exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt'\right\} = \lim_{N \to \infty}\prod_{n=0}^{N-1}\exp\left(-\frac{i}{\hbar}\hat{H}(t_n)\,dt_n\right). \tag{9.14}$$
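The difference between the time-ordered exponential (9.14) and the exponential of the plain integral can be seen numerically for a two-level system. This is a sketch under stated assumptions: units with $\hbar = 1$, a hypothetical Hamiltonian $\hat{H}(t) = \sigma_z + t\,\sigma_x$ on $0 \le t \le 1$, midpoint sampling of each short interval, and the helper names `exp_minus_i` and `time_ordered_exp`, none of which come from the text.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def exp_minus_i(H, dt):
    """exp(-i H dt) for a Hermitian matrix H, via its eigendecomposition (hbar = 1)."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def time_ordered_exp(H_of_t, t0, t1, steps=2000):
    """Product of short-time factors exp(-i H(t_n) dt), with later times to the left."""
    dt = (t1 - t0) / steps
    U = np.eye(2, dtype=complex)
    for n in range(steps):
        t_mid = t0 + (n + 0.5) * dt          # sample H at the middle of each interval
        U = exp_minus_i(H_of_t(t_mid), dt) @ U
    return U

def H(t):
    return sz + t * sx                        # H at different times does not commute

U_ordered = time_ordered_exp(H, 0.0, 1.0)
U_naive = exp_minus_i(sz + 0.5 * sx, 1.0)     # exponential of the plain integral of H

def H_commuting(t):
    return (1.0 + t) * sz                     # commutes with itself at all times

U_c_ordered = time_ordered_exp(H_commuting, 0.0, 1.0)
U_c_naive = exp_minus_i(1.5 * sz, 1.0)

print(np.linalg.norm(U_ordered - U_naive))      # clearly nonzero
print(np.linalg.norm(U_c_ordered - U_c_naive))  # zero up to rounding
```

Both products are unitary by construction, since each factor is the exponential of an anti-Hermitian matrix; only in the commuting case does the ordering become irrelevant.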
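Noncommuting Exponentials

For such noncommuting exponentials, we have the following theorem, valid when $[\hat{A}, \hat{B}]$ commutes with both $\hat{A}$ and $\hat{B}$:
$$e^{\hat{A} + \hat{B}} = e^{\hat{A}}e^{\hat{B}}\exp\left(-\frac{[\hat{A}, \hat{B}]}{2}\right), \tag{9.15}$$
which can be proven as follows. Consider the function
$$f(x) = e^{\hat{A}x}e^{\hat{B}x}, \tag{9.16}$$
and consider its derivative,
$$\frac{df}{dx} = \hat{A}e^{\hat{A}x}e^{\hat{B}x} + e^{\hat{A}x}\hat{B}e^{\hat{B}x} = \left(\hat{A} + e^{\hat{A}x}\hat{B}e^{-\hat{A}x}\right)f(x). \tag{9.17}$$
On the other hand, we also find that
$$[\hat{B}, e^{-\hat{A}x}] = \sum_{n \ge 0}\frac{(-x)^n}{n!}[\hat{B}, \hat{A}^n] = -x\sum_{n \ge 1}\frac{(-x)^{n-1}}{(n - 1)!}\hat{A}^{n-1}[\hat{B}, \hat{A}] = -x\,e^{-\hat{A}x}[\hat{B}, \hat{A}], \tag{9.18}$$
so that $e^{\hat{A}x}\hat{B}e^{-\hat{A}x} = \hat{B} + [\hat{A}, \hat{B}]x$, allowing us to write
$$\frac{df}{dx} = \left(\hat{A} + \hat{B} + [\hat{A}, \hat{B}]x\right)f(x). \tag{9.19}$$
This differential equation can be integrated, with boundary condition $f(0) = 1$, to obtain
$$f(x) = e^{(\hat{A} + \hat{B})x}\,e^{x^2[\hat{A}, \hat{B}]/2}. \tag{9.20}$$
Putting $x = 1$, we arrive at the stated equation.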
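The theorem (9.15) can be verified exactly on a small matrix example in which $[\hat{A}, \hat{B}]$ commutes with both $\hat{A}$ and $\hat{B}$. The sketch below uses $3 \times 3$ strictly upper triangular matrices (a hypothetical choice, not from the text), for which the exponential series terminates, so a truncated Taylor sum in the helper `expm_series` is exact.

```python
import numpy as np

def expm_series(M, terms=20):
    """Matrix exponential by its Taylor series.

    Exact here because the matrices used below are nilpotent, so all terms
    beyond the second power vanish identically.
    """
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

A = np.zeros((3, 3))
A[0, 1] = 1.0
B = np.zeros((3, 3))
B[1, 2] = 1.0
C = A @ B - B @ A            # [A, B]; nonzero only in the (0, 2) entry

lhs = expm_series(A + B)
rhs = expm_series(A) @ expm_series(B) @ expm_series(-C / 2)
print(lhs)
print(rhs)
```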
9.2 The Heisenberg Picture

Having defined the evolution operator, it follows that we can define time-independent states, which by definition will be the states in the Heisenberg picture, by reversing the evolution operator, so these states, denoted by $|\psi_H\rangle$, will be
$$|\psi_H\rangle = \hat{U}^{-1}(t, t_0)|\psi_S(t)\rangle = |\psi_S(t_0)\rangle. \tag{9.21}$$
This change in the space of states can be thought of as a "unitary transformation", or change of basis in the Hilbert space, with the unitary operator $\hat{U}^{-1}(t, t_0) = \hat{U}^\dagger(t, t_0)$. Through it, the time dependence encoded in $\hat{U}(t, t_0)$ can be put onto the operators, since the change of basis implies that the operators are also changed from the Schrödinger picture operators $\hat{A}_S$ to the Heisenberg picture operators $\hat{A}_H$, as
$$\hat{A}_S \to \hat{A}_H(t) = \hat{U}^\dagger(t, t_0)\,\hat{A}_S\,\hat{U}(t, t_0), \tag{9.22}$$
in such a way that the matrix elements (or expectation values) are unchanged,
$$\langle\psi_S|\hat{A}_S|\chi_S\rangle = \langle\psi_H|\hat{A}_H|\chi_H\rangle. \tag{9.23}$$
Note therefore that when we write $\hat{U}(t, t_0)$, we mean the evolution operator in the Schrödinger picture, $\hat{U}_S(t, t_0)$. The time evolution of the new operators is found by using the differential equation for $\hat{U}(t, t_0)$ in equation (9.7), and its Hermitian conjugate, in the time derivative of the definition of $\hat{A}_H$ (acting on three terms), obtaining
$$\begin{aligned}
i\hbar\frac{d}{dt}\hat{A}_H &= -\hat{U}_S^\dagger(t, t_0)\hat{H}_S\hat{A}_S\hat{U}(t, t_0) + i\hbar\,\hat{U}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{U} + \hat{U}^\dagger\hat{A}_S\hat{H}_S\hat{U}\\
&= \hat{U}^\dagger[\hat{A}_S, \hat{H}_S]\hat{U} + i\hbar\,\hat{U}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{U}.
\end{aligned} \tag{9.24}$$
Defining the Hamiltonian in the Heisenberg representation,
$$\hat{H}_H = \hat{U}^\dagger\hat{H}_S\hat{U}, \tag{9.25}$$
in such a way that
$$\hat{U}^\dagger[\hat{A}_S, \hat{H}_S]\hat{U} = [\hat{A}_H, \hat{H}_H], \tag{9.26}$$
and if the operators have explicit time dependence, $\partial\hat{A}_S/\partial t \neq 0$, such that
$$\frac{\partial\hat{A}_H}{\partial t} = \hat{U}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{U}, \tag{9.27}$$
we obtain the equation for the time evolution of the Heisenberg operators,
$$i\hbar\frac{d\hat{A}_H}{dt} = [\hat{A}_H, \hat{H}_H] + i\hbar\frac{\partial}{\partial t}\hat{A}_H. \tag{9.28}$$
This Heisenberg equation of motion replaces the Schrödinger equation, since now operators have time dependence, not the states (now $\partial_t|\psi_H\rangle = 0$ always). This equation of motion is the quantum mechanical counterpart of the classical equation in the Hamiltonian formalism, in terms of Poisson brackets,
$$\frac{dA_{cl}}{dt} = \{A_{cl}, H_{cl}\}_{P.B.} + \frac{\partial A_{cl}}{\partial t}. \tag{9.29}$$
This is the generalization of the Hamilton equations written with Poisson brackets,
$$\frac{dq_i}{dt} = \{q_i, H\}_{P.B.} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = \{p_i, H\}_{P.B.} = -\frac{\partial H}{\partial q_i}. \tag{9.30}$$
For the quantum mechanical case, the above Hamilton equations become the Heisenberg equations of motion,
$$\frac{d\hat{q}_i(t)}{dt} = -\frac{i}{\hbar}[\hat{q}_i, \hat{H}], \qquad \frac{d\hat{p}_i(t)}{dt} = -\frac{i}{\hbar}[\hat{p}_i, \hat{H}]. \tag{9.31}$$
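9.3 Application to the Harmonic Oscillator

We will use as the simplest example our standard toy model, the harmonic oscillator. The quantum mechanical Heisenberg equations for $\hat{a}, \hat{a}^\dagger$, taking the role of the Hamilton equations, are
$$i\hbar\frac{d\hat{a}}{dt} = [\hat{a}, \hat{H}] = \hbar\omega\,\hat{a}, \qquad i\hbar\frac{d\hat{a}^\dagger}{dt} = [\hat{a}^\dagger, \hat{H}] = -\hbar\omega\,\hat{a}^\dagger, \tag{9.32}$$
where we have used $\hat{H} = \hbar\omega(\hat{a}^\dagger\hat{a} + 1/2)$ and $[\hat{a}, \hat{a}^\dagger] = 1$. The solution of these equations is
$$\hat{a}(t) = \hat{a}_0\,e^{-i\omega t}, \qquad \hat{a}^\dagger(t) = \hat{a}_0^\dagger\,e^{+i\omega t}, \tag{9.33}$$
which implies for the phase space variables $\hat{q}(t), \hat{p}(t)$ that
$$\hat{q}(t) = \sqrt{\frac{\hbar}{2m\omega}}\left(\hat{a}_0^\dagger e^{i\omega t} + \hat{a}_0e^{-i\omega t}\right), \qquad \hat{p}(t) = i\sqrt{\frac{m\omega\hbar}{2}}\left(\hat{a}_0^\dagger e^{i\omega t} - \hat{a}_0e^{-i\omega t}\right). \tag{9.34}$$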
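The solution (9.33) can be cross-checked against the definition (9.22) of a Heisenberg operator, using the truncated matrix of $\hat{a}$ in the occupation number basis. This is only a sketch: it assumes $\hbar = 1$, $t_0 = 0$, and arbitrary illustrative values for `N`, `omega`, and `t`.

```python
import numpy as np

N, omega, t = 8, 1.3, 0.7    # illustrative values; hbar = 1 and t0 = 0

# Schroedinger-picture annihilation operator in the truncated |n> basis.
a = np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)

# H is diagonal in this basis, E_n = omega (n + 1/2), so U = exp(-i H t) is diagonal too.
energies = omega * (np.arange(N) + 0.5)
U = np.diag(np.exp(-1j * energies * t))

# Heisenberg-picture operators A_H(t) = U^dagger A_S U.
a_H = U.conj().T @ a @ U
adag_H = U.conj().T @ a.conj().T @ U

print(np.allclose(a_H, a * np.exp(-1j * omega * t)))              # a(t) = a_0 e^{-i omega t}
print(np.allclose(adag_H, a.conj().T * np.exp(1j * omega * t)))   # a^dagger(t) = a_0^dagger e^{+i omega t}
```

The agreement is exact even in the truncated basis, because $\hat{a}$ only connects neighboring levels, whose energies differ by $\hbar\omega$.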
9.4 General Quantum Mechanical Pictures

The Schrödinger and Heisenberg pictures are the best known ones, but there are others. Here we write a general analysis for such pictures. To gain an understanding, we look to classical mechanics in the Hamiltonian and Poisson bracket formalism, as we did above for the Heisenberg picture case.

In classical mechanics, there are canonical transformations on phase space, changing $(p, q)$ to $(p', q')$, under which functions of phase space $F(q, p)$ and $G(q, p)$, with Poisson brackets
$$\{F(q, p), G(q, p)\}_{P.B.} = K(q, p) \tag{9.35}$$
transform to functions $F'(q', p')$ and $G'(q', p')$ that obey the same relation,
$$\{F'(q', p'), G'(q', p')\}_{P.B.} = K'(q', p'). \tag{9.36}$$
In quantum mechanics, the role of such canonical transformations on phase space is taken by unitary transformations on the Hilbert space,
$$|\psi'\rangle = \hat{U}|\psi\rangle, \tag{9.37}$$
with inverse $|\psi\rangle = \hat{U}^{-1}|\psi'\rangle$, and under which the operators also transform,
$$\hat{A}' = \hat{U}\hat{A}\hat{U}^{-1}. \tag{9.38}$$
Since quantization involves the replacement of $\{\,,\}_{P.B.}$ with $\frac{1}{i\hbar}[\,,]$, we must obtain for the commutator a transformation law similar to that for Poisson brackets in classical mechanics. Indeed, we find
$$[\hat{F}, \hat{G}] \to [\hat{F}', \hat{G}'] = \hat{U}[\hat{F}, \hat{G}]\hat{U}^{-1} = \hat{U}\hat{K}\hat{U}^{-1} = \hat{K}'. \tag{9.39}$$
Consider a unitary transformation by an operator $\hat{W}(t)$ that can depend explicitly on time,
$$|\psi_W\rangle = \hat{W}(t)|\psi\rangle, \qquad \hat{A}_W = \hat{W}(t)\hat{A}\hat{W}^{-1}(t). \tag{9.40}$$
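The Hamiltonian generating the new time dependence is found from the Schrödinger equation:
$$\begin{aligned}
i\hbar\,\partial_t|\psi\rangle = \hat{H}|\psi\rangle \;\Rightarrow\; i\hbar\,\partial_t|\psi_W\rangle &= i\hbar\,\partial_t(\hat{W}|\psi\rangle) = i\hbar\,\hat{W}\partial_t|\psi\rangle + i\hbar(\partial_t\hat{W})|\psi\rangle\\
&= (\hat{W}\hat{H}\hat{W}^{-1})|\psi_W\rangle + i\hbar(\partial_t\hat{W})\hat{W}^\dagger|\psi_W\rangle,
\end{aligned} \tag{9.41}$$
where in the last line we have used the Schrödinger equation for $|\psi\rangle$. Equating the final result to $\hat{H}'|\psi_W\rangle$ allows us to write $\hat{H}'$, the Hamiltonian generating the new time evolution, as
$$\hat{H}' = \hat{H}_W + i\hbar(\partial_t\hat{W})\hat{W}^\dagger. \tag{9.42}$$
The time evolution of the transformed operators $\hat{A}_W = \hat{W}\hat{A}\hat{W}^\dagger$ is found as before, by acting on each term:
$$\begin{aligned}
i\hbar\frac{\partial}{\partial t}\hat{A}_W &= i\hbar\,\hat{W}\frac{\partial\hat{A}}{\partial t}\hat{W}^\dagger + i\hbar(\partial_t\hat{W})\hat{A}\hat{W}^{-1} + i\hbar\,\hat{W}\hat{A}\,\partial_t\hat{W}^{-1}\\
&= i\hbar\left(\frac{\partial\hat{A}}{\partial t}\right)_W + i\hbar(\partial_t\hat{W})\hat{W}^\dagger\hat{A}_W - i\hbar\,\hat{A}_W(\partial_t\hat{W})\hat{W}^{-1},
\end{aligned} \tag{9.43}$$
which finally becomes
$$i\hbar\frac{\partial}{\partial t}\hat{A}_W = i\hbar(\partial_t\hat{A})_W + [i\hbar(\partial_t\hat{W})\hat{W}^\dagger, \hat{A}_W]. \tag{9.44}$$
From the above, we can also find the new time evolution operator. Since in the Schrödinger picture we had $|\psi(t)\rangle = \hat{U}(t, t_0)|\psi(t_0)\rangle$, in the new $W$ picture we find
$$|\psi_W(t)\rangle = \hat{W}(t)|\psi(t)\rangle = \hat{W}(t)\hat{U}(t, t_0)\hat{W}^\dagger(t_0)|\psi_W(t_0)\rangle, \tag{9.45}$$
from which we deduce that the time evolution operator in the $W$ picture is
$$\hat{U}'(t, t_0) = \hat{W}(t)\hat{U}(t, t_0)\hat{W}^\dagger(t_0) \neq \hat{U}_W. \tag{9.46}$$
Indeed, note that the new evolution operator is not simply the Schrödinger operator transformed into the $W$ picture, because we have $\hat{W}(t)$ to the left but $\hat{W}^\dagger(t_0)$ to the right (at a different time $t_0$).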
Application to the Heisenberg Picture

Note that we can define the Heisenberg picture by the condition that states are time independent, $\partial_t|\psi_H\rangle = 0$, so the new Hamiltonian must be trivial, $\hat{H}' = 0$. But that means that the Schrödinger Hamiltonian in the new picture is
$$\hat{H}_H = \hat{H}_S = -i\hbar(\partial_t\hat{W})\hat{W}^\dagger. \tag{9.47}$$
By choosing the two pictures, the $W$ picture and the Schrödinger picture, to be equal at time $t_0$, we finally obtain
$$\hat{W}(t) = \hat{U}_S^{-1}(t, t_0) = \hat{U}_S(t_0, t). \tag{9.48}$$
9.5 The Dirac (Interaction) Picture

The most interesting new example of a picture that remains is the Dirac, or interaction, picture. This is defined in the case where the Hamiltonian $\hat H$ can be split into a free particle part $\hat H_0$ and an interaction part $\hat H_1$,
$$\hat H = \hat H_0 + \hat H_1. \tag{9.49}$$
We can then use just the free part, $\hat H_0$, in order to go to a sort of Heisenberg picture for $\hat H_0$, i.e., using a unitary transformation $\hat W(t)$ defined by
$$\hat W(t) = \hat U_{S,0}^{-1}(t,t_0) = \left[\exp\left(-i\frac{\hat H_0(t-t_0)}{\hbar}\right)\right]^{-1} = \exp\left(+i\frac{\hat H_0(t-t_0)}{\hbar}\right). \tag{9.50}$$
In this case, we obtain that the interaction picture states $|\psi_I(t)\rangle$ and operators $\hat A_I(t)$ both depend on time, and their time dependences are given by
$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}|\psi_I(t)\rangle &= \hat H_{1,I}|\psi_I(t)\rangle \\
i\hbar\frac{\partial}{\partial t}\hat A_I(t) &= [\hat A_I(t), \hat H_0] + i\hbar(\partial_t\hat A_S)_I.
\end{aligned}
\tag{9.51}
$$
The evolution operator in the interaction picture is, according to the general theory,
$$\hat U_I(t,t_0) = \hat W(t)\hat U(t,t_0)\hat W^{-1}(t_0), \tag{9.52}$$
for the canonical transformation
$$\hat W(t) = e^{i\hat H_0(t-t_0)/\hbar}, \tag{9.53}$$
so that $\hat W(t_0) = \hat 1$. In the conservative case, of a time-independent Hamiltonian, $\hat U(t,t_0) = e^{-i\hat H(t-t_0)/\hbar}$, so that
$$\hat U_I(t,t_0) = e^{i\hat H_0(t-t_0)/\hbar}\,e^{-i\hat H(t-t_0)/\hbar}. \tag{9.54}$$
We can write a differential equation for it, using the Schrödinger picture equation derived before, $i\hbar\frac{d}{dt}\hat U(t,t_0) = \hat H\hat U(t,t_0)$. We find
$$
\begin{aligned}
i\hbar\frac{d}{dt}\hat U_I(t,t_0) &= i\hbar\frac{d}{dt}[\hat W(t)\hat U(t,t_0)] = \left[i\hbar\frac{d}{dt}\hat W(t)\right]\hat W^{-1}(t)\hat U_I(t,t_0) + \hat W(t)\,i\hbar\frac{d}{dt}\hat U(t,t_0) \\
&= [i\hbar(\partial_t\hat W(t))\hat W^\dagger(t)]\hat U_I(t,t_0) + \hat W(t)\hat H\hat W^{-1}(t)\hat U_I(t,t_0) = \hat H_I\hat U_I(t,t_0),
\end{aligned}
\tag{9.55}
$$
where $\hat H_I$ is the Hamiltonian giving the time evolution in the interaction picture.
Moreover, since
$$\hat U_I(t,t_0) = e^{i\hat H_0(t-t_0)/\hbar}\,e^{-i\hat H(t-t_0)/\hbar}, \tag{9.56}$$
it follows that by differentiation we get
$$i\hbar\,\partial_t\hat U_I(t,t_0) = e^{i\hat H_0(t-t_0)/\hbar}(\hat H - \hat H_0)e^{-i\hat H(t-t_0)/\hbar}, \tag{9.57}$$
where, when differentiating both factors, we have chosen to write the resulting Hamiltonian in the middle of the two exponentials. Then, using the fact that $\hat H - \hat H_0 = \hat H_1$ in the Schrödinger picture, and that in the interaction picture we have
$$\hat H_{1,I} = e^{i\hat H_0(t-t_0)/\hbar}\,\hat H_{1,S}\,e^{-i\hat H_0(t-t_0)/\hbar}, \tag{9.58}$$
where $\hat H_{1,I}$ is $\hat H_1$ in the interaction picture, we derive that
$$i\hbar\,\partial_t\hat U_I(t,t_0) = \hat H_{1,I}\,e^{i\hat H_0(t-t_0)/\hbar}e^{-i\hat H(t-t_0)/\hbar} = \hat H_{1,I}\hat U_I(t,t_0). \tag{9.59}$$
Comparing the two differential equations, we see that the Hamiltonian in the interaction picture equals $\hat H_1$ in the interaction picture,
$$\hat H_I = \hat H_{1,I}. \tag{9.60}$$
We can solve the differential equation (9.59) by analogy with the time-independent case, where the solution would be something like $\hat U(t,0) \sim e^{-i\hat H_I t/\hbar}$. More precisely, we can write the expansion of the exponential, taking care to integrate only over the time-ordered product of Hamiltonians:
$$\hat U_I(t,t_0) = 1 + \left(-\frac{i}{\hbar}\right)\int_{t_0}^{t}dt_1\,\hat H_{1,I}(t_1) + \left(-\frac{i}{\hbar}\right)^2\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,\hat H_{1,I}(t_1)\hat H_{1,I}(t_2) + \cdots \tag{9.61}$$
Calling the three terms zeroth-order (1), first-order, and second-order, we see that in the second-order term we are integrating only over a triangle, satisfying the condition of time ordering, $t_1 > t_2$ in $\hat H_{1,I}(t_1)\hat H_{1,I}(t_2)$, instead of dividing by 2, as would happen from the expansion of an exponential. When doing this, we find that, indeed,
$$
\begin{aligned}
i\hbar\,\partial_t(\text{first-order}) &= \hat H_{1,I}(t)\,(\text{zeroth-order}) \\
i\hbar\,\partial_t(\text{second-order}) &= \hat H_{1,I}(t)\,(\text{first-order}).
\end{aligned}
\tag{9.62}
$$
However, as we said, the integration over the triangle actually equals half the integration over the whole rectangle $\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2$. But, since in general the interaction Hamiltonians at different times don't commute, $[\hat H_{1,I}(t), \hat H_{1,I}(t')] \neq 0$, in order to get the correct result we must include a time ordering as well, so
$$\hat U_I(t,t_0) = 1 + \left(-\frac{i}{\hbar}\right)\int_{t_0}^{t}dt_1\,\hat H_{1,I}(t_1) + \frac{(-i/\hbar)^2}{2!}\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2\,T\{\hat H_{1,I}(t_1)\hat H_{1,I}(t_2)\} + \cdots$$
(9.63)
(9.64)
This pattern continues at higher orders, and we find that we must in fact put the time-ordering operator in front of all the terms coming from the exponential, so
$$\hat U_I(t,t_0) = T\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t}dt'\,\hat H_{1,I}(t')\right]. \tag{9.65}$$
This is the same formula as we obtained in the Schrödinger picture in (9.13) but now defined rigorously (and in the interaction picture).
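The role of the time ordering can be illustrated numerically. The following is a minimal sketch, assuming units with ħ = 1 and a hypothetical two-level system with H0 = (W0/2)σz and H1 = Gσx (these choices, and all function names, are illustrative and not taken from the text). It builds the time-ordered exponential as an ordered product of short-time factors, with later times to the left, and compares it with the closed form (9.54); it also evaluates the exponential of the integrated Hamiltonian without time ordering, which should not agree.

```python
import math

# Units with hbar = 1. Hypothetical two-level example (an assumption, not from the text):
# H0 = (W0/2) sigma_z, H1 = G sigma_x, so that H = H0 + H1.
W0, G = 1.0, 0.3

def pauli_exp(hx, hy, hz, t):
    """Return exp(-i t (hx sx + hy sy + hz sz)) as a 2x2 nested list."""
    r = math.sqrt(hx * hx + hy * hy + hz * hz)
    if r == 0.0:
        return [[1, 0], [0, 1]]
    c, s = math.cos(r * t), math.sin(r * t) / r
    # cos(r t) * 1 - i sin(r t) * (h . sigma) / r
    return [[c - 1j * s * hz, -1j * s * (hx - 1j * hy)],
            [-1j * s * (hx + 1j * hy), c + 1j * s * hz]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def h1_interaction(t):
    """Pauli components of H_{1,I}(t) = e^{i H0 t} H1 e^{-i H0 t}, as in (9.58) with t0 = 0."""
    return (G * math.cos(W0 * t), -G * math.sin(W0 * t), 0.0)

def u_exact(t):
    """U_I(t, 0) = e^{i H0 t} e^{-i H t}, as in (9.54)."""
    return matmul(pauli_exp(0.0, 0.0, W0 / 2, -t), pauli_exp(G, 0.0, W0 / 2, t))

def u_time_ordered(t, steps):
    """Ordered product of short-time factors, later times to the left."""
    dt = t / steps
    u = [[1, 0], [0, 1]]
    for k in range(steps):
        hx, hy, hz = h1_interaction((k + 0.5) * dt)
        u = matmul(pauli_exp(hx, hy, hz, dt), u)
    return u

def u_unordered(t):
    """exp(-i * integral of H_{1,I}), i.e. the exponential WITHOUT time ordering."""
    ix = G * math.sin(W0 * t) / W0
    iy = -G * (1.0 - math.cos(W0 * t)) / W0
    return pauli_exp(ix, iy, 0.0, 1.0)

T = 5.0
err_ordered = max_diff(u_time_ordered(T, 2000), u_exact(T))
err_unordered = max_diff(u_unordered(T), u_exact(T))
print(err_ordered, err_unordered)
```

For these parameters the ordered product should reproduce (9.54) up to a small discretization error, while the unordered exponential differs from it at the level of a few tenths, because the interaction Hamiltonians at different times do not commute.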
Important Concepts to Remember

• The evolution operator $\hat U(t,t_0)$ can be written as the solution of a Schrödinger-like equation, $i\hbar\frac{d}{dt}\hat U(t,t_0) = \hat H\hat U(t,t_0)$, with boundary condition $\hat U(t_0,t_0) = \hat 1$, leading to a solution as a time-ordered exponential, $T\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t}dt'\,\hat H(t')\right]$.
• Quantum mechanics can be described in different pictures, the usual Schrödinger picture being the one where operators are time independent and states are time dependent (and obey the Schrödinger equation).
• In the Heisenberg picture, states are time independent, $|\psi_H\rangle = \hat U^{-1}(t,t_0)|\psi(t)\rangle = |\psi_S(t_0)\rangle$, and operators are time dependent, $\hat A_H = \hat U^{-1}(t,t_0)\hat A_S\hat U(t,t_0)$, evolving with the Hamiltonian via the Heisenberg equation of motion, $i\hbar\,d\hat A_H/dt = [\hat A_H, \hat H_H] + i\hbar\,\partial_t\hat A_H$.
• A general picture is related to the Schrödinger picture by a canonical transformation, $|\psi_W\rangle = \hat W(t)|\psi\rangle$, $\hat A_W = \hat W(t)\hat A\hat W^{-1}(t)$, resulting in a new Hamiltonian for the time evolution of states, $i\hbar\,\partial_t|\psi_W\rangle = \hat H'|\psi_W\rangle$, and a new time evolution of operators, $i\hbar\,\partial_t\hat A_W = i\hbar(\partial_t\hat A)_W + [i\hbar(\partial_t\hat W)\hat W^{-1}, \hat A_W]$.
• In the Dirac, or interaction, picture, $\hat H = \hat H_0 + \hat H_1$ and $\hat W(t) = \hat U_{S,0}^{-1}(t,t_0)$, so it is a sort of Heisenberg picture for $\hat H_0$ only, in which states evolve with $\hat H_{1,I}$, $i\hbar\,\partial_t|\psi_I(t)\rangle = \hat H_{1,I}|\psi_I(t)\rangle$ (and $\hat H_I = \hat H_{1,I}$), but operators evolve with $\hat H_0$, $i\hbar\,\partial_t\hat A_I(t) = [\hat A_I(t), \hat H_0]$.
• The evolution operator in the interaction picture is also a time-ordered exponential, $\hat U_I(t,t_0) = T\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t}dt'\,\hat H_{1,I}(t')\right]$.
Further Reading See [2] and [1] for more details.
Exercises

(1) Write down the time evolution operator and the differential equation that it satisfies for a one-dimensional particle in a potential $V(x)$.
(2) Write down the time evolution equation for Heisenberg operators, calculating explicitly the evolution Hamiltonian, in the case of a free particle, and then calculate explicit expressions for $\hat x(t)$ and $\hat p(t)$.
(3) For a harmonic oscillator of frequency ω, calculate the explicit time-dependent $a(t)$, $a^\dagger(t)$ operators in the picture with $\hat W(t) = \exp\left[2i\omega\,\hat a^\dagger\hat a\,(t-t_0)\right]$, and the explicit differential equations for a general operator and state.
(4) Consider a harmonic oscillator perturbed by $H_1 = \lambda(a + a^\dagger)^3$. Write down the explicit evolution equations for states (the first two terms in the expansion in ω) and the $a$, $a^\dagger$ operators in the interaction picture.
(5) Calculate the evolution operator for the interaction picture in the case in exercise 4.
(6) Consider a Hamiltonian $\hat H(p)$ only (no $x$ dependence). What are the operators in the theory that evolve nontrivially in the Heisenberg picture? What about in a possible interaction picture, if $\hat H(p)$ can be separated into two parts?
(7) Can one have a picture in which operators don't evolve in time other than the Schrödinger picture?
10
The Feynman Path Integral and Propagators
Having defined the Heisenberg picture and evolution operators, we can now attack quantum mechanics in a different way, proposed by Feynman: instead of states and operators acting on them, we talk about something closer to the classical concept of a particle moving on a path. Instead of having a single path, Feynman thought about the fact that in quantum mechanics, we should be able to sum over all possible paths between two points, even discontinuous ones, with the weight given by the phase $e^{iS/\hbar}$, $S$ being the classical action for the particle. The resulting Feynman path integral formulation of quantum mechanics is actually equivalent to the Schrödinger formulation (or the Heisenberg, or Dirac, formulations for that matter). The path integral for a generalized coordinate $q(t)$ is
$$\int\mathcal Dq(t)\,e^{iS[q]/\hbar}, \tag{10.1}$$
and, in the case where it is possible to have a nearly classical motion, we will find that the result is approximated by
$$\sim e^{iS_{cl}[q_{cl}]/\hbar}\times(\text{corrections}), \tag{10.2}$$
where $q_{cl}(t)$ is the classical path, and $S_{cl}$ is the classical on-shell action (in the classical solution). We will derive this next.
10.1 Path Integral in Phase Space

The path integral formalism is applied to the propagator, that is, the evolution operator in the coordinate representation, defined as the product of two Heisenberg (bra and ket) states,
$${}_H\langle x',t'|x,t\rangle_H \equiv U(x',t';x,t) = \langle x'|\hat U(t',t)|x\rangle, \tag{10.3}$$
where, as we showed, in the case of a conservative system $\hat U(t',t) = e^{-i\hat H(t'-t)/\hbar}$; the propagator represents the transition amplitude between the initial and final points. The term "propagator" is related to the fact that we can write the evolution of a state in the coordinate representation (by multiplying with $\langle x'|$), and insert a completeness relation for $|x\rangle$, to get
$$|\psi(t)\rangle = \hat U(t,t_0)|\psi(t_0)\rangle \;\Rightarrow\; \psi(x',t) = \langle x'|\hat U(t,t_0)|\psi(t_0)\rangle = \int dx\,U(x',t;x,t_0)\,\psi(x,t_0). \tag{10.4}$$
We saw earlier that the evolution operator can be written in terms of the energy eigen-basis as
$$\hat U(t,t_0) = \sum_{n,a}|E_n,a\rangle\langle E_n,a|\,e^{-iE_n(t-t_0)/\hbar}. \tag{10.5}$$
Then the propagator at equal times becomes
$$\lim_{t'\to t}U(x',t';x,t) = \sum_{n,a}\langle x'|E_n,a\rangle\langle E_n,a|x\rangle = \langle x'|\hat 1|x\rangle = \delta(x-x'). \tag{10.6}$$
Moreover, the relation (10.4) implies that, since $\psi(x,t)$ satisfies the Schrödinger equation,
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) - i\hbar\frac{\partial}{\partial t}\right]\psi(x,t) = 0, \tag{10.7}$$
the propagator $U(x',t';x,t)$ also satisfies the Schrödinger equation for $t' > t$. Then, adding the definition that
$$U(x',t';x,t) = 0, \quad \text{for } t' < t, \tag{10.8}$$
we find that $U(x',t';x,t)$ becomes a Green's function for the Schrödinger operator. Indeed, because of the discontinuity at $t' \to t$, we find that
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x) - i\hbar\frac{\partial}{\partial t}\right]U(x,t;x_0,t_0) = -i\hbar\,\delta(x-x_0)\delta(t-t_0). \tag{10.9}$$
The Heisenberg state $|x,t\rangle_H$ is an eigenstate of the Heisenberg operator $\hat x_H(T)$ at time $T = t$, i.e.,
$$\hat x_H(T=t)|x,t\rangle_H = x(t)|x,t\rangle_H, \tag{10.10}$$
and is not an eigenstate at $T \neq t$. The relation to the Schrödinger states is given by
$$|x,t\rangle_H = e^{i\hat Ht/\hbar}|x\rangle \;\Rightarrow\; |x\rangle = e^{-i\hat Ht/\hbar}|x,t\rangle_H. \tag{10.11}$$
The Heisenberg operator is related to the Schrödinger operator by
$$\hat X(t) = e^{i\hat Ht/\hbar}\hat X_S\,e^{-i\hat Ht/\hbar}, \tag{10.12}$$
where $\hat X_S|x\rangle = x|x\rangle$. To write a path integral formulation for the propagator $U(x',t';x,t)$, we first divide the interval $(t,t')$ into $n+1$ intervals of length $\epsilon$,
$$\epsilon = \frac{t'-t}{n+1} \;\Rightarrow\; t_0 = t,\ t_1 = t+\epsilon,\ \ldots,\ t_{n+1} = t'. \tag{10.13}$$
At any fixed time $t_i$, the set of all possible Heisenberg states $\{|x_i,t_i\rangle \mid x_i \in \mathbb R\}$ is a complete set. The completeness relation is then
$$\int dx_i\,|x_i,t_i\rangle\langle x_i,t_i| = \hat 1. \tag{10.14}$$
Because of this, we can insert in $U(x',t';x,t)$ at each $t_i$, $i = 1,\ldots,n$, an operator $\hat 1$ written as the completeness relation for time $t_i$, obtaining
$$U(x',t';x,t) = \int dx_1\cdots dx_n\,\langle x',t'|x_n,t_n\rangle\langle x_n,t_n|x_{n-1},t_{n-1}\rangle\cdots|x_1,t_1\rangle\langle x_1,t_1|x,t\rangle. \tag{10.15}$$
In the above equation, $x_i = x(t_i)$ is a discretized path but it is not a classical path, in that we are integrating arbitrarily and independently over $x_i$ and $x_{i+1}$, which means that generically the distance between $x_i$ and $x_{i+1}$ is large, and does not go to zero (i.e., we do not have $x_{i+1} \to x_i$) if $\epsilon \to 0$. Instead of a classical path, we get a discontinuous quantum path, as in Fig. 10.1.
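Figure 10.1 [Plot of a discretized path $x(t)$ between $(t, x)$ and $(t', x')$, with an intermediate point $x_i$ at time $t_i$; vertical axis $x$, horizontal axis $t$.]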
For the definition of the quantum mechanical "path integral", we integrate over all possible discretized paths, not just the smooth ones (as would be suggested by classical paths). Indeed, we divide the path into a large number of discrete points, after which we integrate over the positions of all these points, independently of the positions of their nearby points. The product of the integrations at each intermediate point between $x$ and $x'$ is called the path integral, or integral over all possible quantum paths, and is denoted by
$$\mathcal Dx(t) = \lim_{n\to\infty}\prod_{i=1}^{n}dx(t_i). \tag{10.16}$$
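Since we can expand $|x\rangle$ in the $|p\rangle$ basis, and vice versa,
$$|x\rangle = \int dp\,|p\rangle\langle p|x\rangle, \qquad |p\rangle = \int dx\,|x\rangle\langle x|p\rangle, \tag{10.17}$$
and since we have (as we saw earlier)
$$\langle x|p\rangle = \frac{e^{ipx/\hbar}}{\sqrt{2\pi\hbar}} \;\Rightarrow\; \int dx\,\langle p'|x\rangle\langle x|p\rangle = \int dx\,\frac{e^{ix(p-p')/\hbar}}{2\pi\hbar} = \delta(p-p'), \tag{10.18}$$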
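we can write the generic scalar product of two Heisenberg states appearing in the path integral as
$$
\begin{aligned}
{}_H\langle x(t_i),t_i|x(t_{i-1}),t_{i-1}\rangle_H &= \langle x(t_i)|e^{-i\epsilon\hat H/\hbar}|x(t_{i-1})\rangle \\
&= \int dp(t_i)\,\langle x(t_i)|p(t_i)\rangle\langle p(t_i)|e^{-i\epsilon\hat H/\hbar}|x(t_{i-1})\rangle.
\end{aligned}
\tag{10.19}
$$
As a technical point, we now impose that $\hat H$ is to be ordered so that momenta $\hat p$ are to the left of coordinates $\hat x$. Then, to order $\epsilon$, we can write
$$
\begin{aligned}
\langle p(t_i)|\exp\left(-\frac{i}{\hbar}\epsilon\hat H\right)|x(t_{i-1})\rangle &\simeq \exp\left[-\frac{i}{\hbar}\epsilon H(p(t_i),x(t_{i-1}))\right]\langle p(t_i)|x(t_{i-1})\rangle \\
&= \exp\left[-\frac{i}{\hbar}\epsilon H(p(t_i),x(t_{i-1}))\right]\frac{e^{-ip(t_i)x(t_{i-1})/\hbar}}{\sqrt{2\pi\hbar}},
\end{aligned}
\tag{10.20}
$$
where in the first relation, we acted with $\hat P$ on the left and with $\hat X$ on the right, and neglected commutator corrections, which are of order $\epsilon^2$: for instance, in $\hat H(\hat P,\hat X)\hat H(\hat P,\hat X)$ we have $\hat P\hat X\hat P\hat X$ type terms, which would be more complicated.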
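All in all, using the formulas derived, we obtain
$$
\begin{aligned}
U(x',t';x,t) &= \int\prod_{i=1}^{n+1}dp(t_i)\prod_{j=1}^{n}dx(t_j)\,\langle x(t_{n+1})|p(t_{n+1})\rangle\langle p(t_{n+1})|e^{-i\epsilon\hat H/\hbar}|x(t_n)\rangle \\
&\qquad\cdots\langle x(t_1)|p(t_1)\rangle\langle p(t_1)|e^{-i\epsilon\hat H/\hbar}|x(t_0)\rangle \\
&= \int\mathcal Dp(t)\int\mathcal Dx(t)\exp\Big\{\frac{i}{\hbar}\big[p(t_{n+1})(x(t_{n+1})-x(t_n)) + \cdots + p(t_1)(x(t_1)-x(t_0)) \\
&\qquad - \epsilon\big(H(p(t_{n+1}),x(t_n)) + \cdots + H(p(t_1),x(t_0))\big)\big]\Big\} \\
&= \int\mathcal Dp(t)\int\mathcal Dx(t)\exp\left\{\frac{i}{\hbar}\int_{t_0=t}^{t_{n+1}=t'}dt\,[p(t)\dot x(t) - H(p(t),x(t))]\right\},
\end{aligned}
\tag{10.21}
$$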
where we have made the replacement $x(t_{i+1}) - x(t_i) \to dt\,\dot x(t)$. This is the path integral in phase space (in the Hamiltonian formulation). Note that this is a physicist's "rigorous" derivation (mathematically, of course, even the notion of path integration is not well defined, but once that is accepted, everything is rigorous). However, as we said at the beginning, we would be more interested in the path integral in configuration space, in terms of only $x(t)$. In order to obtain this, we must do a Gaussian integration over $p(t)$. For that, we need one more technical requirement on the Hamiltonian: it must be quadratic in the momenta $p$: $H(p,x) = p^2/2 + V(x)$. However, before we do the integration over $p(t)$ we must calculate some Gaussian integration formulas.
10.2 Gaussian Integration

We will have a mathematical interlude at this point, in order to find out how to do the integration we must perform. The basic Gaussian integral is
$$I = \int_{-\infty}^{+\infty}dx\,e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}}. \tag{10.22}$$
Squaring it (for $\alpha = 1$), we obtain
$$I^2 = \int dx\int dy\,e^{-(x^2+y^2)} = \int_0^{2\pi}d\phi\int_0^{\infty}r\,dr\,e^{-r^2} = \pi, \tag{10.23}$$
where in the second equality we transformed to polar coordinates in the $(x,y)$ plane. Generalizing this Gaussian integration formula to an $n$-dimensional space, we can say that
$$\int d^nx\,e^{-x_iA_{ij}x_j/2} = (2\pi)^{n/2}\frac{1}{\sqrt{\det A}}. \tag{10.24}$$
This formula can be proven by diagonalizing the (constant) matrix $A_{ij}$, in which case $\det A = \prod_i\alpha_i$, where $\alpha_i$ are the eigenvalues of the matrix $A_{ij}$. Then, defining a scalar that is identified with a quadratic "action" in a discretized form,
$$S = \frac{1}{2}x^T\cdot A\cdot x + b^T\cdot x, \tag{10.25}$$
and defining its "classical solution",
$$\frac{\partial S}{\partial x_i} = 0 \;\Rightarrow\; x_{cl} = -A^{-1}\cdot b, \tag{10.26}$$
we write the on-shell (classical) action as
$$S(x_{cl}) = -\frac{1}{2}b^T\cdot A^{-1}\cdot b, \tag{10.27}$$
and the quantum action (including fluctuations) is rewritten as
$$S = \frac{1}{2}(x-x_{cl})^T\cdot A\cdot(x-x_{cl}) - \frac{1}{2}b^T\cdot A^{-1}\cdot b. \tag{10.28}$$
Then the Gaussian integration over the vector $x$, shifted to $x - x_{cl}$, is
$$\int d^nx\,e^{-S(x)} = (2\pi)^{n/2}(\det A)^{-1/2}e^{-S(x_{cl})} = (2\pi)^{n/2}\frac{1}{\sqrt{\det A}}\exp\left[+\frac{1}{2}b^T\cdot A^{-1}\cdot b\right]. \tag{10.29}$$
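Formula (10.29) can be checked numerically in a small case. The following is a minimal sketch for n = 2, assuming a hypothetical real, positive-definite matrix A and vector b chosen only for illustration (they do not come from the text); it compares a brute-force midpoint-rule integral of exp(−S) with the closed-form right-hand side.

```python
import math

# Hypothetical test data (not from the text): a positive-definite 2x2 matrix A and a vector b.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [0.5, -1.0]

def action(x, y):
    """S(x) = x^T A x / 2 + b^T x, as in (10.25), for n = 2."""
    quad = 0.5 * (A[0][0] * x * x + 2.0 * A[0][1] * x * y + A[1][1] * y * y)
    return quad + B[0] * x + B[1] * y

def gaussian_numeric(half_width=10.0, n=400):
    """Midpoint-rule estimate of the integral of exp(-S) over a large square."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += math.exp(-action(x, y))
    return total * h * h

def gaussian_formula():
    """Right-hand side of (10.29): (2 pi)^{n/2} (det A)^{-1/2} exp(+ b^T A^{-1} b / 2)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # b^T A^{-1} b, using the explicit inverse of a symmetric 2x2 matrix
    bab = (A[1][1] * B[0] ** 2 - 2.0 * A[0][1] * B[0] * B[1] + A[0][0] * B[1] ** 2) / det
    return 2.0 * math.pi / math.sqrt(det) * math.exp(0.5 * bab)

numeric = gaussian_numeric()
exact = gaussian_formula()
print(numeric, exact)
```

The two printed numbers should agree to many digits. In the path integral the matrix A is imaginary, see (10.31), so the same formula is being used there by analytic continuation rather than as a convergent real integral.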
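10.3 Path Integral in Configuration Space

We are now ready to do the path integral over $\mathcal Dp(t)$, which explicitly (in the discrete version) is
$$\int\mathcal Dp(t)\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[p(\tau)\dot x(\tau) - \frac{1}{2}p^2(\tau)\right]\right\} = \prod_{i=1}^{n}\int\frac{dp(\tau_i)}{2\pi\hbar}\exp\left[\frac{i}{\hbar}\Delta\tau\left(p(\tau_i)\dot x(\tau_i) - \frac{1}{2}p^2(\tau_i)\right)\right]. \tag{10.30}$$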
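This means that, with respect to the above formula for Gaussian integration, we have the identifications
$$x_i \to p(t_i), \qquad A_{ij} \to \frac{i}{\hbar}\Delta\tau\,\delta_{ij}, \qquad b = -\frac{i}{\hbar}\Delta\tau\,\dot x(\tau), \tag{10.31}$$
so that the result of the path integration is
$$\int\mathcal Dp(t)\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[p(\tau)\dot x(\tau) - \frac{1}{2}p^2(\tau)\right]\right\} = \mathcal N\exp\left[\frac{i}{\hbar}\int_t^{t'}d\tau\,\frac{\dot x^2(\tau)}{2}\right], \tag{10.32}$$
where the normalization constant $\mathcal N$ contains factors of $i$, 2, $\pi$, $\Delta\tau$, all constant and so irrelevant once we normalize the probability to one. Finally, this means that the path integral for the propagator becomes a path integral only in configuration space,
$$
\begin{aligned}
U(x',t';x,t) &= \mathcal N\int\mathcal Dx(t)\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[\frac{\dot x^2(\tau)}{2} - V(x)\right]\right\} \\
&= \mathcal N\int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}\int_t^{t'}d\tau\,L(x(\tau),\dot x(\tau))\right] \\
&= \mathcal N\int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S[x]\right].
\end{aligned}
\tag{10.33}
$$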
This is indeed the formula we argued for at the beginning of the chapter, now derived rigorously from the phase space path integral, under the following two technical requirements: the Hamiltonian is to be ordered with $p$s on the left and $x$s on the right, and moreover the Hamiltonian is only quadratic in momenta. This formula, besides having a simple physical interpretation that generalizes the classical path (the amplitude for the transition probability (see equation (10.3)) is the sum over all paths weighted by the phase $e^{iS[x]/\hbar}$), also implies a simple classical limit. Indeed, the classical path is the extremum of the action $S[x]$, so we expand any action around its value on the classical solution (the "on-shell value"), up to at least quadratic order (and maybe higher),
$$S[x] \simeq S_{cl}[x_{cl}] + \frac{1}{2}(x-x_{cl})\cdot A\cdot(x-x_{cl}), \tag{10.34}$$
as before. Then the first approximation to the propagator is
$$U(x',t';x,t) \simeq e^{\frac{i}{\hbar}S_{cl}[x_{cl}]}, \tag{10.35}$$
while the second comes with a factor of $1/\sqrt{\det A}$, as we saw from the Gaussian integration formula. We can calculate exactly the propagator for free particles, though we will not do it here but will leave it for later. Also, we can show that the propagator $U(x',t';x,t)$, calculated as a Feynman path integral, does also satisfy the Schrödinger equation, so it is indeed a Green's function for the Schrödinger operator, meaning that any wave function evolved with it will also satisfy the Schrödinger equation. Thus the Feynman path integral representation of quantum mechanics is actually equivalent to the original Schrödinger representation.
10.4 Path Integral over Coherent States (in "Harmonic Phase Space")

Our prototype (toy model) for a simple quantum system is, as we have seen, the harmonic oscillator. A more complicated system can be described as a perturbed harmonic oscillator or a system of harmonic oscillators. Therefore it is of interest to see if there is any other way to describe the path integral for it. In fact, there is, using the so-called "harmonic phase space" instead of configuration space or phase space. This space is defined using the notion of coherent states of the harmonic oscillator. For a harmonic oscillator, the Hamiltonian can be written in terms of creation and annihilation operators as
$$\hat H = \hbar\omega\left(\hat a^\dagger\hat a + \tfrac{1}{2}\right). \tag{10.36}$$
But, while the original intention in introducing $\hat a$ and $\hat a^\dagger$ was to deal with eigenenergy states $|n\rangle$, whose index $n$ is changed by $\hat a$ and $\hat a^\dagger$ (hence the name creation and annihilation), we can also introduce eigenstates for these creation and annihilation operators, called coherent states. We can define $|\alpha\rangle$, as we showed in Chapter 8, as an eigenstate of $\hat a$,
$$|\alpha\rangle \equiv e^{\alpha\hat a^\dagger}|0\rangle \;\Rightarrow\; \hat a|\alpha\rangle = \alpha|\alpha\rangle \tag{10.37}$$
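and similarly $\langle\alpha^*|$ as an eigenstate of $\hat a^\dagger$,
$$\langle\alpha^*| \equiv \langle 0|e^{\alpha^*\hat a} \;\Rightarrow\; \langle\alpha^*|\hat a^\dagger = \langle\alpha^*|\alpha^*. \tag{10.38}$$
Moreover, they satisfy a completeness relation
$$\int\frac{d\alpha\,d\alpha^*}{2\pi i}\,e^{-\alpha\alpha^*}|\alpha\rangle\langle\alpha^*| = \hat 1. \tag{10.39}$$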
We can compute the propagator for these states, i.e., the scalar product of Heisenberg states constructed out of them,
$$U(\alpha^*,t';\alpha,t) \equiv {}_H\langle\alpha^*,t'|\alpha,t\rangle_H = \langle\alpha^*|\hat U(t',t)|\alpha\rangle, \tag{10.40}$$
where as always $\hat U(t',t) = e^{-i\hat H(t'-t)/\hbar}$. On the other hand, we find that the matrix element of $\hat H$ in the coherent state basis is
$$\langle\alpha^*|\hat H(\hat a^\dagger,\hat a)|\beta\rangle = H(\alpha^*,\beta)\langle\alpha^*|\beta\rangle = H(\alpha^*,\beta)\,e^{\alpha^*\beta}. \tag{10.41}$$
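The coherent-state relations (10.37) and (10.41) can be verified in a truncated number basis. The following is a minimal sketch, assuming a Fock-space cutoff N_MAX and two hypothetical complex parameters chosen only for illustration (none of these come from the text). It checks the eigenvalue property of the annihilation operator, the overlap $e^{\alpha^*\beta}$, and the matrix element of the normal-ordered operator $\hat a^\dagger\hat a$, which by (10.41) should equal $\alpha^*\beta\,e^{\alpha^*\beta}$.

```python
import cmath
import math

N_MAX = 60  # truncation of the Fock basis (an assumption of this sketch)

def coherent(alpha):
    """Components <n|alpha> of the unnormalized state |alpha> = exp(alpha a^dagger)|0>."""
    return [alpha ** n / math.sqrt(math.factorial(n)) for n in range(N_MAX)]

def annihilate(c):
    """Apply a: (a c)_n = sqrt(n+1) c_{n+1}; the top component is lost to truncation."""
    return [math.sqrt(n + 1) * c[n + 1] for n in range(N_MAX - 1)]

def overlap(alpha, beta):
    """<alpha^*|beta> = sum_n conj(<n|alpha>) <n|beta> in the truncated basis."""
    ca, cb = coherent(alpha), coherent(beta)
    return sum(x.conjugate() * y for x, y in zip(ca, cb))

def number_element(alpha, beta):
    """<alpha^*| a^dagger a |beta>, using that a^dagger a is diagonal with eigenvalue n."""
    ca, cb = coherent(alpha), coherent(beta)
    return sum(n * ca[n].conjugate() * cb[n] for n in range(N_MAX))

# Hypothetical parameters, chosen only for illustration.
ALPHA = 0.7 + 0.4j
BETA = -0.3 + 0.9j

c_alpha = coherent(ALPHA)
a_c = annihilate(c_alpha)
eigen_err = max(abs(a_c[n] - ALPHA * c_alpha[n]) for n in range(N_MAX - 1))

expected_overlap = cmath.exp(ALPHA.conjugate() * BETA)
overlap_err = abs(overlap(ALPHA, BETA) - expected_overlap)
number_err = abs(number_element(ALPHA, BETA) - ALPHA.conjugate() * BETA * expected_overlap)
print(eigen_err, overlap_err, number_err)
```

All three printed errors should be at the level of floating-point rounding, since for parameters of modulus below one the truncation at N_MAX components is negligible.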
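We define the same discretization of a path from $\alpha$ to $\alpha^*$ in $n+1$ intervals, with
$$\epsilon = \frac{t'-t}{n+1}, \qquad t_0 = t,\ t_1,\ \ldots,\ t_n,\ t_{n+1} = t'. \tag{10.42}$$
We follow the same steps as in phase space, inserting a completeness relation at each intermediate point, obtaining
$$
\begin{aligned}
U(\alpha^*,t';\alpha,t) &= \int\prod_i\left[\frac{d\alpha(t_i)\,d\alpha^*(t_i)}{2\pi i}e^{-\alpha^*(t_i)\alpha(t_i)}\right]\langle\alpha^*(t')|e^{-i\epsilon\hat H/\hbar}|\alpha(t_n)\rangle \\
&\quad\times\langle\alpha^*(t_n)|e^{-i\epsilon\hat H/\hbar}|\alpha(t_{n-1})\rangle\langle\alpha^*(t_{n-1})|\cdots|\alpha(t_1)\rangle\langle\alpha^*(t_1)|e^{-i\epsilon\hat H/\hbar}|\alpha(t)\rangle.
\end{aligned}
\tag{10.43}
$$
But the generic matrix element is, to order $\epsilon$,
$$\langle\alpha^*(t_{i+1})|\exp\left(-\frac{i}{\hbar}\epsilon\hat H\right)|\alpha(t_i)\rangle = \exp\left[-\frac{i}{\hbar}\epsilon H(\alpha^*(t_{i+1}),\alpha(t_i))\right]\exp[\alpha^*(t_{i+1})\alpha(t_i)], \tag{10.44}$$
if $\hat H$ is normally ordered (with $\hat a^\dagger$ to the left and $\hat a$ to the right). Then we obtain for the propagator
$$
\begin{aligned}
U(\alpha^*,t';\alpha,t) &= \int\prod_i\left[\frac{d\alpha(t_i)\,d\alpha^*(t_i)}{2\pi i}\right]\exp\Big[\alpha^*(t')\alpha(t_n) - \alpha^*(t_n)\alpha(t_n) + \alpha^*(t_n)\alpha(t_{n-1}) - \alpha^*(t_{n-1})\alpha(t_{n-1}) \\
&\qquad + \cdots + \alpha^*(t_1)\alpha(t) - \frac{i}{\hbar}\int_t^{t'}d\tau\,H(\alpha^*(\tau),\alpha(\tau))\Big] \\
&= \int\prod_i\left[\frac{d\alpha(t_i)\,d\alpha^*(t_i)}{2\pi i}\right]\exp\left\{\int_t^{t'}d\tau\left[\dot\alpha^*(\tau)\alpha(\tau) - \frac{i}{\hbar}H(\alpha^*(\tau),\alpha(\tau))\right] + \alpha^*(t)\alpha(t)\right\},
\end{aligned}
\tag{10.45}
$$
so that finally we can write it as
$$U(\alpha^*,t';\alpha,t) = \int\mathcal D\alpha(\tau)\,\mathcal D\alpha^*(\tau)\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[\frac{\hbar}{i}\dot\alpha^*(\tau)\alpha(\tau) - H\right] + \alpha^*(t)\alpha(t)\right\}. \tag{10.46}$$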
10.5 Correlation Functions and Their Generating Functional

We can consider a more general observable than a transition probability, namely one where we insert an operator $\hat X(t_a)$ in between Heisenberg states,
$${}_H\langle x',t'|\hat X(t_a)|x,t\rangle_H, \tag{10.47}$$
obtaining what is known as a correlation function, specifically a one-point function. This observable is harder to understand, but is of great theoretical interest, especially in that it can be generalized. To calculate it, we follow the same steps as before and discretize the path. The only constraint in doing so is to choose $t_a$ as one of the intermediate $t_i$s in the path integral (we can choose the division into $n+1$ steps in such a way that this is true). Then, if $t_a = t_i$, so that $x(t_a) = x(t_i)$, we find
$$\langle x_{i+1},t_{i+1}|\hat X(t_a)|x_i,t_i\rangle = x(t_a)\langle x_{i+1},t_{i+1}|x_i,t_i\rangle; \tag{10.48}$$
now following the same steps as before, we find the path integral
$$\langle x',t'|\hat X(t_a)|x,t\rangle = \int\mathcal Dx(t)\,e^{\frac{i}{\hbar}S[x]}\,x(t_a). \tag{10.49}$$
Next, we can define the two-point function (which is also a correlation function)
$$\langle x',t'|\hat X(t_b)\hat X(t_a)|x,t\rangle. \tag{10.50}$$
If $t_a < t_b$, we can work as before: choose $t_a = t_i$ and $t_b = t_j$, where $j > i$, and find
$$
\begin{aligned}
&\langle x_{j+1},t_{j+1}|\hat X(t_b)|x_j,t_j\rangle\cdots\langle x_{i+1},t_{i+1}|\hat X(t_a)|x_i,t_i\rangle \\
&\quad= x(t_b)\langle x_{j+1},t_{j+1}|x_j,t_j\rangle\cdots x(t_a)\langle x_{i+1},t_{i+1}|x_i,t_i\rangle,
\end{aligned}
\tag{10.51}
$$
so we obtain
$$\langle x',t'|\hat X(t_b)\hat X(t_a)|x,t\rangle = \int\mathcal Dx(t)\,e^{iS[x]/\hbar}\,x(t_b)x(t_a). \tag{10.52}$$
Conversely, the path integral gives the time-ordered product
$$\int\mathcal Dx(t)\,e^{iS[x]/\hbar}\,x(t_a)x(t_b) = \langle x',t'|T\{\hat X(t_a)\hat X(t_b)\}|x,t\rangle, \tag{10.53}$$
where the time ordering is defined as usual,
$$T\{\hat X(t_a)\hat X(t_b)\} = \begin{cases}\hat X(t_a)\hat X(t_b), & t_a > t_b \\ \hat X(t_b)\hat X(t_a), & t_a < t_b,\end{cases} \tag{10.54}$$
and generalized to
$$T\{\hat X(t_{a_1})\cdots\hat X(t_{a_n})\} = \hat X(t_{a_1})\cdots\hat X(t_{a_n}), \quad \text{if } t_{a_1} > t_{a_2} > \cdots > t_{a_n}. \tag{10.55}$$
Then we can also calculate $n$-point functions (general correlation functions) as a path integral,
$$G_n(t_{a_1},\ldots,t_{a_n}) = \langle x',t'|T\{\hat X(t_{a_1})\cdots\hat X(t_{a_n})\}|x,t\rangle = \int\mathcal Dx(t)\,e^{iS[x]/\hbar}\,x(t_{a_1})\cdots x(t_{a_n}). \tag{10.56}$$
Finally, we can write a generating function for all the $n$-point correlation functions. Indeed, for numbers $a_n$ we can find a generating function
$$f(x) = \sum_{n\geq 0}\frac{a_nx^n}{n!}, \tag{10.57}$$
and the coefficient $a_n$ is found from the $n$th order derivatives at zero,
$$a_n = \left.\frac{d^n}{dx^n}f(x)\right|_{x=0}. \tag{10.58}$$
In the present case, we can define a generating functional $Z[J]$ for all the Green's functions $G_n$, defined as
$$Z[J] = \sum_{n\geq 0}\frac{(i/\hbar)^n}{n!}\int dt_1\cdots dt_n\,G_n(t_1,\ldots,t_n)\,J(t_1)\cdots J(t_n). \tag{10.59}$$
Substituting the Green's functions as path integrals, we obtain
$$Z[J] = \int\mathcal Dx(t)\,e^{iS[x]/\hbar}\sum_{n\geq 0}\frac{1}{n!}\left[\frac{i}{\hbar}\int dt\,x(t)J(t)\right]^n, \tag{10.60}$$
which is seen to easily sum to
$$Z[J] = \int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S(x,J)\right] \equiv \int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S(x) + \frac{i}{\hbar}\int dt\,J(t)x(t)\right]; \tag{10.61}$$
then the Green's functions are obtained from it via
$$\left.\frac{\delta^n}{(i/\hbar)\delta J(t_1)\cdots(i/\hbar)\delta J(t_n)}Z[J]\right|_{J=0} = \int\mathcal Dx(t)\,e^{iS[x]/\hbar}\,x(t_1)\cdots x(t_n) = G_n(t_1,\ldots,t_n). \tag{10.62}$$
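The mechanism behind (10.57)–(10.62), that derivatives of a generating function with respect to the source produce the correlation functions, can be illustrated in a toy setting. The following is a minimal sketch, assuming a "zero-dimensional" analog with a single real variable x and a real, convergent weight exp(−x²/2 + Jx) in place of the oscillatory phase (this Euclidean toy model and all names are assumptions for illustration, not the construction in the text). It compares the second J-derivative of Z(J) at J = 0, estimated by a finite difference, with the directly computed second moment.

```python
import math

def integrate(f, half_width=12.0, n=4000):
    """Midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return h * sum(f(-half_width + (i + 0.5) * h) for i in range(n))

def z(j):
    """Toy generating function Z(J) = integral dx exp(-x^2/2 + J x)."""
    return integrate(lambda x: math.exp(-0.5 * x * x + j * x))

def moment(n):
    """Direct 'correlation function': integral dx exp(-x^2/2) x^n."""
    return integrate(lambda x: math.exp(-0.5 * x * x) * x ** n)

H = 1e-2  # finite-difference step in the source J
z0 = z(0.0)
d2 = (z(H) - 2.0 * z0 + z(-H)) / H ** 2   # central second difference at J = 0
g2_from_z = d2 / z0                        # normalized "two-point function" from Z
g2_direct = moment(2) / z0                 # same quantity from the weighted integral
print(g2_from_z, g2_direct)
```

The two values should agree up to the finite-difference error, which is of order H². In the actual path integral the derivatives are functional derivatives with respect to J(t) and carry the factors of i/ħ shown in (10.62).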
Important Concepts to Remember

• The Feynman path integral is another way to define quantum mechanics, as a sum over all possible paths, not necessarily continuous, between initial points and endpoints.
• The path integral is $\int\mathcal Dq(t)\exp\left[\frac{i}{\hbar}S[q(t)]\right]$, and is approximated (at a saddle point) by $\exp\left[\frac{i}{\hbar}S_{cl}[q_{cl}(t)]\right]$, with corrections coming from the Gaussian around the saddle point, as $1/\sqrt{\det A}$, where $S = S_{cl}[q_{cl}] + \frac{1}{2}(q-q_{cl})\cdot A\cdot(q-q_{cl})$.
• The path integral is derived in phase space as $U(x',t';x,t) = \int\mathcal Dx(t)\int\mathcal Dp(t)\exp\left[\frac{i}{\hbar}\int_t^{t'}dt\,(p\dot q - H)\right]$, under the technical requirement to have $p$ to the left of $x$ in $H$.
• From the path integral in phase space, we derive the path integral in coordinate space, $U(x',t';x,t) = \mathcal N\int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S[x(t)]\right]$, under the extra technical requirement that $H$ is quadratic in $p$, $H = p^2/2 + V(x)$.
• The best definition of the path integral is in harmonic phase space, related to coherent states of the harmonic oscillator, $\alpha$, $\alpha^*$, where $U(\alpha^*,t';\alpha,t) = \int\mathcal D\alpha^*(\tau)\int\mathcal D\alpha(\tau)\exp\left\{\frac{i}{\hbar}\int_t^{t'}d\tau\left[\frac{\hbar}{i}\dot\alpha^*(\tau)\alpha(\tau) - H\right] + \alpha^*(t)\alpha(t)\right\}$, and is valid for normal-ordered Hamiltonians.
• The correlation functions or $n$-point functions are expectation values of time-ordered products of position operators, rewritten as path integrals, $G_n(t_{a_1},\ldots,t_{a_n}) = \langle x',t'|T(\hat X(t_{a_1})\cdots\hat X(t_{a_n}))|x,t\rangle = \int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S[x(t)]\right]x(t_{a_1})\cdots x(t_{a_n})$.
• The generating functional for all the correlation functions is $Z[J] = \int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S[x] + \frac{i}{\hbar}\int dt\,J(t)x(t)\right]$, from which we obtain the correlation functions by taking derivatives.
Further Reading See [3] and [5] for more details (the latter for details on correlation functions).
Exercises

(1) We have used the idea of Gaussian integration around the classical solution for the action to argue that the first-order result for the path integral is $\exp\left[\frac{i}{\hbar}S_{cl}[x_{cl}(t)]\right]$. However, is Gaussian integration, and more generally the path integral itself, well defined for this particular exponential? How would you expect to modify the exponential in order to make sense of the Gaussian integration?
(2) For the path integral in phase space, we have assumed that the $P$s are always on the left of the $X$s in the Hamiltonian. Consider a case where this is not true, for instance having an extra term in the Hamiltonian of the type $\alpha(\hat P\hat X^2 + \hat X\hat P\hat X + \hat X^2\hat P)$. Redo the calculation, and see what you obtain for the path integral in phase space.
(3) Consider the case when $H = \frac{3}{4}p^{4/3} + V(x)$ and the path integral is in phase space. Is the resulting path integral in coordinate space approximated in any way by the usual expression $\int\mathcal Dx(t)\,e^{\frac{i}{\hbar}S[x(t)]}$?
(4) Consider the Hamiltonian, in terms of $a$, $a^\dagger$ (with $[a,a^\dagger] = 1$),
$$H = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right) + \lambda(a + a^\dagger)^3. \tag{10.63}$$
Derive the harmonic path integral in phase space for it (without calculating the path integral, which is not Gaussian).
(5) Consider a generating functional
$$Z[J(t)] = \mathcal N\exp\left[-\frac{1}{2}\int dt\int dt'\,J(t)\Delta(t,t')J(t')\right]. \tag{10.64}$$
Calculate the two-point, three-point, and four-point functions.
(6) Consider a harmonic oscillator, with Lagrangian $L = (\dot q^2 - \omega^2q^2)/2$. Using a naive generalization of the Gaussian integration, show that the generating functional is of the type given in exercise 5. Write a formal expression for $\Delta(t,t')$.
(7) Consider the generating functional
$$Z[J(t)] = \mathcal N\exp\left[-\frac{1}{2}\int dt\int dt'\,J(t)\Delta(t,t')J(t') + \lambda\int dt_1\,dt_2\,dt_3\,dt_4\,J(t_1)J(t_2)J(t_3)J(t_4)\right]. \tag{10.65}$$
Calculate the four-point function.
11
The Classical Limit and Hamilton–Jacobi (WKB Method), the Ehrenfest Theorem

Until now, we have developed the formalism of quantum mechanics from postulates, and we have shown how to match with experiments. We have not used classical mechanics at all, except sometimes as a guide. But we know that the macroscopic world is approximately classical, and most systems have a classical limit. We must therefore consider a way to obtain a classical limit, plus quantum corrections, from the quantum formalism we have developed so far. First we will show that the classical equations of motion play a role in the quantum theory itself, in the form of the Ehrenfest theorem, which however does not in itself suggest a classical limit. The classical limit is suggested by the Feynman path integral formalism, developed in the previous chapter, for transition amplitudes, perhaps with the insertion of an operator. The action $S[x]$ appearing in the path integral has as an extremum the classical action, the action on the classical path, for which $\delta S = 0 \Rightarrow x = x_{cl}(t)$; this allows us to expand the action around the classical on-shell action,
$$S \simeq S_{cl}[x_{cl}] + (\delta x)^2(\ldots) = S_{cl}[x_{cl}] + \text{fluctuations}, \tag{11.1}$$
and integrate over the fluctuations:
$$
\begin{aligned}
\langle x',t'|T\hat A(\{t_i\})|x,t\rangle &= \int\mathcal Dx(t)\,e^{iS[x]/\hbar}A(\{t_i\}) \simeq e^{iS_{cl}[x_{cl}]/\hbar}A_{cl}(\{t_i\})\times\text{fluctuations} \\
&\equiv e^{iS_{cl}[x_{cl}]/\hbar}A_{cl}(\{t_i\})\,e^{iS_{qu}/\hbar}.
\end{aligned}
\tag{11.2}
$$
Thus the classical action has some relevance to the classical limit of quantum mechanics and, as we will see, it is related to the classical mechanics Hamilton–Jacobi equation and formalism. But before analyzing that, we will prove the Ehrenfest theorem.
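11.1 Ehrenfest Theorem

A statement of the Ehrenfest theorem is: The classical mechanics equations of motion are valid for the quantum average over quantum states.

Proof. Consider a quantum operator $\hat A$ and its expectation value in the state $|\psi\rangle$,
$$\langle A\rangle_\psi \equiv \langle\psi|\hat A|\psi\rangle. \tag{11.3}$$
Its time evolution is calculated as
$$\frac{d}{dt}\langle A\rangle_\psi = \frac{d\langle\psi|}{dt}\hat A|\psi\rangle + \langle\psi|\frac{\partial\hat A}{\partial t}|\psi\rangle + \langle\psi|\hat A\frac{d}{dt}|\psi\rangle. \tag{11.4}$$
But we can use the Schrödinger equation acting on $|\psi\rangle$, namely
$$\frac{d}{dt}|\psi\rangle = \frac{1}{i\hbar}\hat H|\psi\rangle \;\Rightarrow\; \frac{d}{dt}\langle\psi| = -\frac{1}{i\hbar}\langle\psi|\hat H, \tag{11.5}$$
on the time evolution above, and obtain
$$\frac{d}{dt}\langle A\rangle_\psi = \frac{1}{i\hbar}\langle\psi|(\hat A\hat H - \hat H\hat A)|\psi\rangle + \langle\psi|\frac{\partial\hat A}{\partial t}|\psi\rangle = \frac{1}{i\hbar}\langle[\hat A,\hat H]\rangle_\psi + \left\langle\frac{\partial\hat A}{\partial t}\right\rangle_\psi. \tag{11.6}$$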
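This is the first version of the Ehrenfest theorem, which is the quantum mechanical equivalent of the classical mechanics evolution equation
$$\frac{d}{dt}A = \{A,H\}_{P.B.} + \frac{\partial A}{\partial t}. \tag{11.7}$$
Consider next a Hamiltonian on a general phase space, $H = H(q_i,p_i)$, where $p_i$ is canonically conjugate to the variable $q_i$. In this case, $\partial(\hat q_i)/\partial t = \partial(\hat p_i)/\partial t = 0$, so we obtain
$$
\begin{aligned}
\frac{d}{dt}\langle q_i\rangle_\psi &= \frac{1}{i\hbar}\langle[\hat q_i,\hat H]\rangle_\psi \\
\frac{d}{dt}\langle p_i\rangle_\psi &= \frac{1}{i\hbar}\langle[\hat p_i,\hat H]\rangle_\psi.
\end{aligned}
\tag{11.8}
$$
On the other hand, the commutators come from Poisson brackets, in particular the canonical commutators from the canonical Poisson brackets,
$$\frac{1}{i\hbar}[\hat q_i,\hat p_i] = 1 \;\leftarrow\; \{q_i,p_i\}_{P.B.} = 1, \tag{11.9}$$
which means that we can represent the coordinates by derivatives with respect to momenta, and momenta as derivatives with respect to coordinates, as
$$\frac{1}{i\hbar}\hat q_i = \frac{\partial}{\partial p_i}, \qquad \frac{1}{i\hbar}\hat p_i = -\frac{\partial}{\partial q_i}, \tag{11.10}$$
which allows us to replace them in the commutators on the right-hand side of the Ehrenfest theorem (11.6). Better still, we have that the specific commutators in the Ehrenfest theorem come from Poisson brackets that amount to derivatives of the Hamiltonian, which can be extended to operators as
$$
\begin{aligned}
\frac{1}{i\hbar}[\hat p_i,\hat H] &\leftarrow \{p_i,H\}_{P.B.} = -\frac{\partial H}{\partial q_i} \to -\frac{\partial\hat H}{\partial\hat q_i} \\
\frac{1}{i\hbar}[\hat q_i,\hat H] &\leftarrow \{q_i,H\}_{P.B.} = +\frac{\partial H}{\partial p_i} \to +\frac{\partial\hat H}{\partial\hat p_i}.
\end{aligned}
\tag{11.11}
$$
Finally, then, we obtain the Hamiltonian equations of motion as operator equations quantum averaged over a state $|\psi\rangle$,
$$
\begin{aligned}
\frac{d}{dt}\langle q_i\rangle_\psi &= \left\langle\frac{\partial H}{\partial p_i}\right\rangle_\psi \\
\frac{d}{dt}\langle p_i\rangle_\psi &= -\left\langle\frac{\partial H}{\partial q_i}\right\rangle_\psi,
\end{aligned}
\tag{11.12}
$$
which is the more common way to describe the Ehrenfest theorem.

q.e.d.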
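As an example, consider the case of a coordinate $\vec q$, and a Hamiltonian with a kinetic term and a potential term,
$$\hat H = \frac{\hat{\vec p}^{\,2}}{2m} + V(\hat{\vec q}). \tag{11.13}$$
In this case, the two Hamiltonian equations of motion can be turned into a single (Newtonian) equation of motion, both for the quantum average:
$$
\begin{aligned}
&\frac{d}{dt}\langle\vec q\rangle_\psi = \frac{1}{m}\langle\vec p\rangle_\psi, \quad \frac{d}{dt}\langle\vec p\rangle_\psi = -\langle\vec\nabla V\rangle_\psi \;\Rightarrow \\
&m\frac{d^2}{dt^2}\langle\vec q\rangle_\psi = -\langle\vec\nabla V\rangle_\psi = \langle\vec F\rangle_\psi.
\end{aligned}
\tag{11.14}
$$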
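The averaged equations (11.14) can be checked numerically for a simple case. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator in units with ħ = m = ω = 1 and a hypothetical initial state given by a superposition of the four lowest levels (the state and all names are illustrative assumptions, not taken from the text). The expectation values are computed from $\langle a\rangle$, and their time derivatives are estimated by finite differences.

```python
import cmath
import math

# Units hbar = m = omega = 1 (an assumption of this sketch), so H = p^2/2 + q^2/2.
# Hypothetical initial state: a normalized superposition of the lowest four levels |n>.
RAW = [1.0, 0.5 + 0.5j, -0.3j, 0.2]
NORM = math.sqrt(sum(abs(c) ** 2 for c in RAW))
C0 = [c / NORM for c in RAW]

def state(t):
    """Components c_n(t) = c_n(0) exp(-i (n + 1/2) t) from the Schrodinger equation."""
    return [c * cmath.exp(-1j * (n + 0.5) * t) for n, c in enumerate(C0)]

def mean_a(c):
    """<a> = sum_n sqrt(n+1) conj(c_n) c_{n+1}."""
    return sum(math.sqrt(n + 1) * c[n].conjugate() * c[n + 1] for n in range(len(c) - 1))

def mean_q(t):
    return math.sqrt(2.0) * mean_a(state(t)).real   # q = (a + a^dagger)/sqrt(2)

def mean_p(t):
    return math.sqrt(2.0) * mean_a(state(t)).imag   # p = (a - a^dagger)/(i sqrt(2))

def ddt(f, t, h=1e-5):
    """Central finite-difference derivative."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

T = 0.7
lhs_q, rhs_q = ddt(mean_q, T), mean_p(T)    # d<q>/dt  versus  <p>/m
lhs_p, rhs_p = ddt(mean_p, T), -mean_q(T)   # d<p>/dt  versus  -<dV/dq> = -<q>
print(lhs_q - rhs_q, lhs_p - rhs_p)
```

Both printed differences should vanish up to the finite-difference error. For the harmonic potential the force is linear in q, so the averages obey exactly the classical equations; for a general potential $\langle\vec\nabla V(\vec q)\rangle$ differs from $\vec\nabla V(\langle\vec q\rangle)$, which is why the theorem by itself does not give a classical limit.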
11.2 Continuity Equation for Probability In Chapter 7, we saw that the norm of a time-dependent state amounts to an integral of a probability density, dP ψ(t)|ψ(t) = ρdV = dV , (11.15) dV and in the coordinate representation this implies that ρ=
dP = |ψ(r , t)| 2 . dV
(11.16)
Moreover, the Schrödinger equation implies the continuity condition for probability,
$$\partial_t \rho = -\vec\nabla \cdot \vec j, \tag{11.17}$$
where $\vec j$ is the probability current density (when we think of probability as a fluid),
$$\vec j = \frac{\hbar}{2mi}\left(\psi^* \vec\nabla \psi - \psi \vec\nabla \psi^*\right). \tag{11.18}$$
But we can expand on this formalism, and write the wave function in coordinate space as a real normalization factor times a phase,
$$\psi(\vec r, t) = A\, e^{iS(\vec r, t)/\hbar}, \tag{11.19}$$
where $A \in \mathbb{R}$ and, from the above, we have
$$A = \sqrt{\rho(\vec r, t)}. \tag{11.20}$$
Moreover, the probability current density is
$$\vec j = A^2\, \frac{\hbar}{2mi}\, \frac{2i}{\hbar}\, \vec\nabla S = \frac{\rho}{m} \vec\nabla S. \tag{11.21}$$
Thinking of probability as a fluid, we can define its “velocity” $\vec v$ by writing the fluid equation $\vec j = \rho \vec v$, obtaining
$$\vec v = \frac{1}{m}\vec\nabla S. \tag{11.22}$$
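The equivalence of (11.18) and (11.21) can be checked numerically. The following is a minimal sketch in one dimension, assuming natural units ($\hbar = m = 1$) and an arbitrary illustrative choice of amplitude $A(x)$ and phase $S(x)$; none of these specific functions come from the text.

```python
import cmath
import math

HBAR = 1.0  # natural units: an assumption of this sketch
M = 1.0


def amplitude(x):
    """Real amplitude A(x); a Gaussian chosen only for illustration."""
    return math.exp(-0.5 * x * x)


def phase(x):
    """Phase function S(x); an arbitrary smooth choice."""
    return 2.0 * x + 0.3 * x * x


def psi(x):
    """Wave function in the form of Eq. (11.19): psi = A exp(i S / hbar)."""
    return amplitude(x) * cmath.exp(1j * phase(x) / HBAR)


def current_from_psi(x, h=1e-5):
    """Eq. (11.18) in 1D: j = (hbar / m) * Im(psi^* dpsi/dx), by central differences."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return (HBAR / M) * (psi(x).conjugate() * dpsi).imag


def current_from_phase(x, h=1e-5):
    """Eq. (11.21) in 1D: j = rho * (dS/dx) / m with rho = A^2."""
    ds = (phase(x + h) - phase(x - h)) / (2 * h)
    return amplitude(x) ** 2 * ds / M
```

Evaluating `current_from_psi(x)` and `current_from_phase(x)` at the same point should agree up to the finite-difference error, which is the content of (11.21).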
Since a wave function is written in terms of the propagator (the time evolution operator in the coordinate basis), which itself is a path integral that can be expanded in terms of the classical action plus a quantum correction, as in (11.2), we obtain
$$\psi(\vec r, t) = \int d^3r'\, U(\vec r, t; \vec r\,', 0)\,\psi(\vec r\,', 0) \sim \exp\left[\frac{i}{\hbar}\left(S_{cl}(\vec r_{cl}, t) + S_{qu}(\vec r, t)\right)\right]\psi(\vec r\,', 0), \tag{11.23}$$
which is consistent with the previous ansatz for the wave function (11.19).

Part of the time evolution of the ansatz (11.19), namely the factor $A = \sqrt{\rho}$, was defined from the continuity equation, which can now be written more explicitly,
$$\partial_t A^2 = \partial_t \rho = -\vec\nabla\cdot(\rho \vec v) = -\frac{1}{m}\vec\nabla\cdot\left(A^2 \vec\nabla S\right) \;\Rightarrow\; \partial_t A = -\frac{1}{m}\left(\vec\nabla A \cdot \vec\nabla S + \frac{1}{2} A \nabla^2 S\right). \tag{11.24}$$
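This continuity equation was derived from the Schrödinger equation, but we have used only part of it; the other part must be the equation for $S$. Indeed, from the Schrödinger equation in coordinate space,
$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\vec r)\right]\psi(\vec r, t) = i\hbar\,\partial_t \psi(\vec r, t), \tag{11.25}$$
which for the ansatz becomes
$$-\frac{\hbar^2}{2m} e^{iS/\hbar}\left[\nabla^2 A + \frac{2i}{\hbar}\vec\nabla A\cdot\vec\nabla S + \frac{i}{\hbar} A \nabla^2 S - \frac{1}{\hbar^2} A (\vec\nabla S)^2\right] + V(\vec r) A e^{iS/\hbar} = i\hbar\, e^{iS/\hbar}\left[\partial_t A + \frac{i}{\hbar} A\, \partial_t S\right], \tag{11.26}$$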
we obtain, using the continuity equation (11.24) to cancel terms involving $A$ between the left- and right-hand sides,
$$\frac{1}{2m}(\vec\nabla S)^2 + V(\vec r) + \partial_t S - \frac{\hbar^2}{2m}\frac{\nabla^2 A}{A} = 0. \tag{11.27}$$
We see that the quantum value $\hbar$ appears only in the last term, the other three terms being of the same, classical, order. That means that if we define the classical limit as the limit $\hbar \to 0$, in which for instance the canonical quantization condition $[\hat q_i, \hat p_j] = i\hbar\,\delta_{ij}$ becomes just the commutation of classical variables, $[q_i, p_j] = 0$, we obtain an equation involving just the first three terms, which turns out to be the classical Hamilton–Jacobi equation.
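To see that the last term of (11.27) is not automatically negligible, one can test the equation on a state known in closed form. The sketch below assumes the standard harmonic oscillator ground state in one dimension (with frequency $\omega$, a symbol introduced here for illustration), for which $S = -Et$ and $A$ is a Gaussian:

```latex
A = e^{-m\omega x^2/(2\hbar)}, \qquad S = -Et, \qquad V = \tfrac{1}{2} m\omega^2 x^2
\\[4pt]
\frac{\hbar^2}{2m}\frac{A''}{A}
  = \frac{\hbar^2}{2m}\left[\left(\frac{m\omega x}{\hbar}\right)^2 - \frac{m\omega}{\hbar}\right]
  = \tfrac{1}{2} m\omega^2 x^2 - \tfrac{1}{2}\hbar\omega
\\[4pt]
0 + V - E - \frac{\hbar^2}{2m}\frac{A''}{A} = \tfrac{1}{2}\hbar\omega - E = 0
\;\Rightarrow\; E = \tfrac{1}{2}\hbar\omega .
```

Here $\vec\nabla S = 0$ and the $\hbar^2$ term cancels the entire potential, so this state is as far from the classical regime as possible; the term can be dropped only when $A$ varies slowly.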
11.3 Review of the Hamilton–Jacobi Formalism

In order to continue, we quickly review the Hamilton–Jacobi formalism of classical mechanics. There is a more complete and rigorous derivation for it, but here we will show a somewhat shorter version, focusing on the physical interpretation. If we consider a canonical transformation between two representations of classical mechanics for the same system, the classical action $S$ must change by a total derivative, given that the endpoints of the action are fixed:
$$S \to S' = S - \int_{t_1}^{t_2} dt\, \frac{d\mathcal{S}(x, t)}{dt} \;\Rightarrow\; \int_{t_1}^{t_2} L\, dt \to \int_{t_1}^{t_2} (L\, dt - d\mathcal{S}) = \int_{t_1}^{t_2} L\, dt - \mathcal{S}(x(t_2), t_2) + \mathcal{S}(x(t_1), t_1), \tag{11.28}$$
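where $\mathcal{S}(x(t_{1,2}), t_{1,2})$ are constants and so do not change the physics. That means that a Lagrangian depending on general variables $q_i$ changes as follows:
$$L \to L' = L - \frac{d\mathcal{S}}{dt} = \sum_i \frac{m_i \dot q_i^2}{2} - V - \frac{\partial \mathcal{S}}{\partial t} - \sum_i \frac{\partial \mathcal{S}}{\partial q_i}\dot q_i = \sum_i \frac{1}{2m_i}\left(p_i - \frac{\partial \mathcal{S}}{\partial q_i}\right)^2 - V - \frac{\partial \mathcal{S}}{\partial t} - \sum_i \frac{1}{2m_i}\left(\frac{\partial \mathcal{S}}{\partial q_i}\right)^2. \tag{11.29}$$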
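The extremum of the new Lagrangian is obtained when we have both
$$p_i = \frac{\partial \mathcal{S}}{\partial q_i} \tag{11.30}$$
and
$$\frac{\partial \mathcal{S}}{\partial t} + \sum_i \frac{1}{2m_i}\left(\frac{\partial \mathcal{S}}{\partial q_i}\right)^2 + V(q, t) = 0. \tag{11.31}$$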
This latter equation (11.31) is the Hamilton–Jacobi equation, and $\mathcal{S}(q, t)$ is Hamilton’s principal function. The usefulness of the Hamilton–Jacobi equation comes when, after the canonical transformation, $H' = 0$, which means that $L' = P_i \dot Q_i$, and this is actually zero, since $P_i$ and $Q_i$ are constant, as we will see shortly. Thus $\mathcal{S}$ is the action of the theory. Since $H' = 0$, one finds that the new momenta $P_i$ are constants of motion, which will be called $\alpha_i$, and the principal function will depend on these constants as well: $\mathcal{S} = \mathcal{S}(t, q_i, \alpha_i)$. Then, moreover, the derivatives with respect to these constants $\alpha_i$ give the new coordinates $Q_i$ and are new constants of motion, called $\beta_i$,
$$\frac{\partial \mathcal{S}}{\partial \alpha_i} = \beta_i = Q_i. \tag{11.32}$$
This, together with $P_i = \alpha_i$, defines the constants of motion, also known as integrals of motion. There is no algorithmic solution to the Hamilton–Jacobi equation, but often one can solve it by the separation of variables. Indeed, if the potential $V(\vec r)$ is (explicitly) time independent, the equation admits a solution that separates the variables,
$$\mathcal{S}(t, q_i) = f(t) + s(q_i). \tag{11.33}$$
Then, substituting into (11.31), we find
$$\dot f(t) + \sum_i \frac{1}{2m_i}\left(\frac{\partial s}{\partial q_i}\right)^2 + V(q_i) = 0, \tag{11.34}$$
which means that a function of time alone equals (minus) a function of the $q_i$ alone, so each must separately be a constant:
$$\dot f(t) = \text{constant} \equiv -E \;\Rightarrow\; f(t) = -Et + \text{constant}, \tag{11.35}$$
and substituting into (11.34) we find the time-independent Hamilton–Jacobi equation
$$\sum_i \frac{1}{2m_i}\left(\frac{\partial s}{\partial q_i}\right)^2 + V(q_i) = E. \tag{11.36}$$
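For Hamilton’s principal function we obtain
$$\mathcal{S}(t, q_i, E, \alpha_i) = s(q_i, E, \alpha_i) - Et, \tag{11.37}$$
and the integrals of motion satisfy
$$\frac{\partial \mathcal{S}}{\partial \alpha_i} = \frac{\partial s}{\partial \alpha_i} = \beta_i, \qquad \frac{\partial \mathcal{S}}{\partial E} = \frac{\partial s(q_i, E, \alpha_i)}{\partial E} - t = t_0. \tag{11.38}$$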
We can also separate more variables, in the case where the Hamiltonian is independent of other variables, but we will not describe that here.
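As a minimal worked case of (11.36)–(11.38), consider a free particle in one dimension ($V = 0$, a single coordinate $q$, and the positive root chosen for $s$; these choices are assumptions made here for illustration):

```latex
\frac{1}{2m}\left(\frac{ds}{dq}\right)^2 = E
\;\Rightarrow\; s(q, E) = \sqrt{2mE}\, q
\;\Rightarrow\; \mathcal{S}(t, q, E) = \sqrt{2mE}\, q - Et
\\[4pt]
\frac{\partial \mathcal{S}}{\partial E} = \sqrt{\frac{m}{2E}}\, q - t = t_0
\;\Rightarrow\; q(t) = \sqrt{\frac{2E}{m}}\,(t + t_0),
\qquad p = \frac{\partial \mathcal{S}}{\partial q} = \sqrt{2mE} .
```

The integral of motion $t_0$ fixes the origin of time, and the solution is uniform motion with velocity $\sqrt{2E/m} = p/m$, as expected.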
11.4 The Classical Limit and the Geometrical Optics Approximation

We see then that, in the $\hbar \to 0$ limit, the quantum version of the Hamilton–Jacobi equation, (11.27), reduces to the classical Hamilton–Jacobi equation, (11.31). That means that $S(\vec r, t)$ appearing in $\psi = A e^{iS/\hbar}$ is identified with Hamilton’s principal function $\mathcal{S}(\vec r, t)$, which in turn equals the action if $H' = 0$, and thus $L' = P\dot Q = 0$. Indeed, then, from the path integral formalism, we have
$$\psi(\vec r, t) \sim e^{iS_{cl}(\vec r, t)/\hbar} \times (\cdots). \tag{11.39}$$
Considering then the time-independent (stationary) case for $S(\vec r, t)$ as in the Hamilton–Jacobi equation, we write
$$S(\vec r, t) = W(\vec r) - Et, \tag{11.40}$$
which leads to the following equation for $W$:
$$\frac{1}{2m}(\vec\nabla W)^2 + V(\vec r) - E - \frac{\hbar^2}{2m}\frac{\nabla^2 A}{A} = 0. \tag{11.41}$$
But we still haven’t defined the classical limit physically or defined rigorously the smallness of the extra term in (11.27). Physically, we see that the limit is a generalization of the geometrical optics approximation of classical wave mechanics. In it, a wave is replaced by “paths of light rays”, corresponding now to replacing the integral over all paths with just the motion of classical particles. Using the particle–wave duality defined by de Broglie, the wave associated with a particle of momentum $p$ has wavelength $\lambda = h/p$. In the presence of a potential $V(\vec r)$, $p^2/(2m) = E - V(\vec r)$, so the position-dependent de Broglie wavelength is
$$\lambda = \frac{h}{p} = \frac{h}{\sqrt{2m(E - V(\vec r))}}. \tag{11.42}$$
Replacing this in (11.41), we can rewrite it as
$$(\vec\nabla W)^2 = \frac{h^2}{\lambda^2}\left[1 + \frac{\lambda^2}{(2\pi)^2}\frac{\nabla^2 A}{A}\right]. \tag{11.43}$$
Then it is indeed clear that, neglecting the term with $(\lambda/(2\pi))^2$ as being small, we obtain the geometrical optics approximation. Indeed, the resulting equation is
$$(\vec\nabla W)^2 = \frac{h^2}{\lambda^2}, \tag{11.44}$$
which is the equation of a wave front in geometrical optics. The surfaces of constant $W$ are surfaces of constant phase, i.e., wave fronts. If we have $V(\vec r) = 0$, $\lambda$ is constant, so the solution of the equation is
$$W = \vec p \cdot \vec r + \text{constant}, \tag{11.45}$$
where $|\vec p| = h/\lambda$, and so wave fronts are planes perpendicular to the direction of $\vec p$, whereas light rays are in a direction parallel to $\vec p$.
11.5 The WKB Method

A related way to expand the wave function semiclassically is the WKB method, found by Wentzel, Kramers, and Brillouin; it is a more general method, but roughly speaking can be understood here as an expansion in $\hbar$ and so as a “semiclassical” expansion of the wave function. Specifically, we saw that
$$\psi(\vec r, t) = e^{iS(\vec r, t)/\hbar} = \exp\left[\frac{i}{\hbar}\left(W(\vec r) - Et\right)\right] \;\Rightarrow\; \psi(\vec r) = e^{iW(\vec r)/\hbar}, \tag{11.46}$$
and moreover we write
$$W(\vec r) = s(\vec r) + \frac{\hbar}{i} T(\vec r), \tag{11.47}$$
so that
$$\psi(\vec r) = e^{T(\vec r)} e^{is(\vec r)/\hbar} \equiv A\, e^{is(\vec r)/\hbar}, \tag{11.48}$$
where we can uniquely define $s(\vec r)$ and $T(\vec r)$ by saying that they are both even in $\hbar$. We can moreover expand them in $\hbar$, and keep only the lowest order in the expansion. The Schrödinger equation for $\psi(\vec r, t)$ becomes an equation for $S(\vec r, t)$,
$$\frac{\partial S}{\partial t} + \sum_i \frac{1}{2m_i}\left[\left(\frac{\partial S}{\partial q_i}\right)^2 + \frac{\hbar}{i}\frac{\partial^2 S}{\partial q_i^2}\right] + V(q_i) = 0. \tag{11.49}$$
Neglecting the term in $\hbar$, specifically because
$$\hbar\left|\frac{\partial^2 S}{\partial q_i^2}\right| \ll \left(\frac{\partial S}{\partial q_i}\right)^2, \tag{11.50}$$
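we obtain again the classical Hamilton–Jacobi equation. As we saw, this limit is a geometrical optics approximation, and, remembering that $\partial S/\partial q_i = p_i$, the above condition becomes
$$\hbar\left|\frac{\partial p_i}{\partial q_i}\right| \ll p_i^2 \;\Rightarrow\; \left|\frac{\partial(\hbar/p_i)}{\partial q_i}\right| \ll 1. \tag{11.51}$$
Remembering that $\hbar/p_i = \lambda_i/(2\pi) \equiv \bar\lambda_i$ is the reduced de Broglie wavelength, we write the condition as
$$|\vec\nabla \bar\lambda| \ll 1, \tag{11.52}$$
or, considering the variation of $\bar\lambda$ over a distance of order $\bar\lambda$, so $\delta\bar\lambda \simeq (\vec\nabla\bar\lambda)\,\delta x$ for $\delta x = \bar\lambda$, we obtain the condition
$$\left|\frac{\delta_{\bar\lambda}\bar\lambda}{\bar\lambda}\right| \ll 1, \tag{11.53}$$
which is the precise form of the geometrical optics approximation.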
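The condition (11.52) can be evaluated pointwise for a given potential. The sketch below does this in one dimension for a harmonic oscillator, assuming natural units ($\hbar = m = \omega = 1$); the potential, the function names, and the threshold used to stand in for “$\ll 1$” are all illustrative assumptions.

```python
import math

HBAR = 1.0   # natural units: an assumption of this sketch
M = 1.0
OMEGA = 1.0


def potential(x):
    """Harmonic oscillator potential, chosen only as an example."""
    return 0.5 * M * OMEGA ** 2 * x ** 2


def potential_slope(x):
    """dV/dx for the example potential."""
    return M * OMEGA ** 2 * x


def reduced_wavelength_gradient(x, energy):
    """d(hbar/p)/dx = hbar * m * V'(x) / (2m(E - V))^(3/2), valid where E > V."""
    kinetic = energy - potential(x)
    if kinetic <= 0:
        raise ValueError("x is at or beyond a classical turning point")
    return HBAR * M * potential_slope(x) / (2 * M * kinetic) ** 1.5


def wkb_is_valid(x, energy, tolerance=0.1):
    """Crude test of Eq. (11.52); `tolerance` is an arbitrary stand-in for '<< 1'."""
    return abs(reduced_wavelength_gradient(x, energy)) < tolerance
```

For an energy well above the ground state the condition holds near the center of the well and fails close to the turning points, where $p \to 0$ and the local wavelength diverges.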
WKB in One Dimension

The simplest application of the WKB method is for a one-dimensional system. Putting $\hbar \to 0$ in (11.41), and restricting to one dimension, we obtain
$$\frac{W'^2}{2m} \simeq E - V(x), \tag{11.54}$$
with approximate solution
$$W(x) \simeq \pm\int^x dx' \sqrt{2m(E - V(x'))} \equiv \pm\int^x dx'\, p(x'), \tag{11.55}$$
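where $p(x) = \sqrt{2m(E - V(x))}$ is the (space-dependent) momentum. Then the wave function is
$$\psi(x, t) = A \exp\left[\pm\frac{i}{\hbar}\int_{x_0}^x dx'\, p(x') - \frac{iEt}{\hbar}\right]. \tag{11.56}$$
The equation for $A$ is, as we have seen already, the continuity equation (coming from $\rho = A^2$), now written as
$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}\left(\frac{\rho}{m}\frac{\partial S}{\partial x}\right) = 0. \tag{11.57}$$
Since $\partial\rho/\partial t = 0$ (the stationary case) and $\partial S/\partial x = dW/dx$, we obtain that $\rho W'(x)$ is constant. But on the other hand, from (11.55), $\rho W' = \pm\rho\sqrt{2m(E - V(x))}$, meaning we obtain
$$A = \sqrt{\rho} = \frac{\text{constant}}{[E - V(x)]^{1/4}}. \tag{11.58}$$
Finally, then, the WKB solution to the one-dimensional problem is
$$\psi(x, t) = \frac{\text{constant}}{[E - V(x)]^{1/4}} \exp\left[\pm\frac{i}{\hbar}\int_{x_0}^x dx'\, p(x') - \frac{iEt}{\hbar}\right]. \tag{11.59}$$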
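Formula (11.59) is straightforward to evaluate numerically in the classically allowed region. The following is a minimal sketch, assuming natural units ($\hbar = m = 1$) and a simple trapezoidal rule for the phase integral; the function names and default arguments are illustrative choices, not part of the text.

```python
import cmath
import math

HBAR = 1.0  # natural units: an assumption of this sketch
M = 1.0


def momentum(x, energy, potential):
    """Local momentum p(x) = sqrt(2m(E - V(x))); only defined where E > V(x)."""
    kinetic = energy - potential(x)
    if kinetic <= 0:
        raise ValueError("WKB form (11.59) breaks down at or beyond a turning point")
    return math.sqrt(2 * M * kinetic)


def phase_integral(x0, x, energy, potential, steps=2000):
    """Trapezoidal estimate of the integral of p(x') from x0 to x."""
    h = (x - x0) / steps
    total = 0.5 * (momentum(x0, energy, potential) + momentum(x, energy, potential))
    for k in range(1, steps):
        total += momentum(x0 + k * h, energy, potential)
    return total * h


def wkb_psi(x, t, energy, potential, x0=0.0, sign=1, norm=1.0):
    """Eq. (11.59): norm / (E - V)^(1/4) * exp(+-(i/hbar) int p dx' - i E t / hbar)."""
    amp = norm / (energy - potential(x)) ** 0.25
    phase = sign * phase_integral(x0, x, energy, potential) / HBAR - energy * t / HBAR
    return amp * cmath.exp(1j * phase)
```

For $V = 0$ the result reduces to a plane wave with $p = \sqrt{2mE}$, and for any allowed $x$ the product $|\psi|^2 p(x)$ is independent of $x$, which is the statement that $\rho W'$ is constant.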
Important Concepts to Remember

• The statement of the Ehrenfest theorem is that the classical equations of motion in the Hamiltonian formulation are valid for the quantum average, for quantum states.
• The wave function can be written as $\psi(\vec r, t) = \sqrt{\rho(\vec r, t)}\, e^{iS(\vec r, t)/\hbar}$, so that the “velocity of the probability fluid” is $\vec v = \frac{1}{m}\vec\nabla S$.
• Then, from the Schrödinger equation (and the continuity equation for probability), the function $S(\vec r, t)$ satisfies a quantum-corrected Hamilton–Jacobi equation, with a term of order $\hbar^2$.
• In the Hamilton–Jacobi formalism, $\mathcal{S}(q, t)$ is Hamilton’s principal function, and its derivatives with respect to the constants of motion $\alpha_i$, the values of $P_i$, are other constants of motion $\beta_i$, the values of $Q_i$.
• There is no algorithmic way to solve the Hamilton–Jacobi equation; one generally uses some separation of variables, at least for time, leading to the time-independent Hamilton–Jacobi equation.
• The classical limit, ignoring the $\hbar^2$ term in the Hamilton–Jacobi equation, is a generalization of the geometrical optics approximation, for the case where the reduced de Broglie wavelength $\bar\lambda = \lambda/(2\pi)$ satisfies $|\delta_{\bar\lambda}\bar\lambda/\bar\lambda| \ll 1$.
• An equivalent expression for the wave function is $\psi(\vec r, t) = \exp\left[\frac{i}{\hbar}(W(\vec r) - Et)\right]$, where $W(\vec r) = s(\vec r) - i\hbar\, T(\vec r)$.
• The WKB method in one dimension is a first-order correction to the classical solution $A e^{is_{cl}/\hbar}$, in which $W(x) = \pm\int^x dx' \sqrt{2m(E - V(x'))} - i\hbar \ln A$ and $A$ is no longer constant but equals $\text{const}/[E - V(x)]^{1/4}$.
Further Reading

See [2] and [3] for more details.
Exercises

(1) Consider the Hamiltonian
$$H = \frac{p^2}{2} + \lambda x^4 \tag{11.60}$$
and the observable
$$A = x^3 + p^3. \tag{11.61}$$
Find the equation of motion for $A$ in the Hamiltonian formalism and the corresponding quantum version of this equation of motion.
(2) Consider a radial (central) potential for motion in three dimensions, $V(r)$. Write the quantum version of the Hamilton–Jacobi equation, and reduce it to a single equation for the radial motion (in $r$).
(3) (Review from classical mechanics) Consider the classical Hamilton–Jacobi formalism, for the case of the motion of a particle in three spatial dimensions, in a central potential $V(r) = -B/r$. Solve the Hamilton–Jacobi equation for the motion of a particle coming in from infinity and being deflected, and find the deflection angle.
(4) In the parametrization $\psi = A \exp\left[\frac{i}{\hbar}W(\vec r) - \frac{i}{\hbar}Et\right]$, what is the equation of motion for $A$? What happens to it in the geometrical optics approximation? If also $V(\vec r) = 0$, solve the equation for $A$.
(5) Consider a one-dimensional harmonic oscillator. Can one apply the WKB approximation to it? If so, why, and when?
(6) Write down the WKB approximation for the one-dimensional potential $V = -B/x$, $B > 0$, and find the domain of validity for it.
(7) For the case in exercise 6, write down the explicit equations for the wave function outside the geometrical optics approximation, and find a way to introduce the next-order corrections to the WKB approximation that are based on the geometrical optics approximation.
12 Symmetries in Quantum Mechanics I: Continuous Symmetries

We now begin the analysis of symmetries in quantum mechanics. First, in this chapter, we will analyze simple continuous symmetries, such as translation invariance. But before that, we will review symmetries in classical mechanics. Then we will generalize the results to quantum mechanics, and show that the symmetries of quantum mechanics form groups. In the next chapter, we will analyze discrete symmetries, such as parity invariance, and “internal” symmetries. After that, in the following chapters, we will consider rotational invariance and the theory of angular momentum.
12.1 Symmetries in Classical Mechanics

Consider an infinitesimal transformation of the variables $q_i$ of a system with Lagrangian $L(q_i, \dot q_i, t)$, $q_i \to q_i + \delta q_i$, where $\delta q_i$ is a specific change. Then we have a symmetry of the system if the Lagrangian is invariant, $\delta L/\delta q_i = 0$, or more precisely, if the action is invariant,
$$\frac{\delta S}{\delta q_i} = 0. \tag{12.1}$$
Note that therefore the action $S = \int dt\, L$ is invariant if $L$ varies by at most the total derivative of a function $f$, i.e., $df/dt$. We can consider two simple cases:

• We can have the invariance of $L$ under any translation by $\delta q_i$, so that $\partial L/\partial q_i = 0$. Then the Lagrange equations of motion,
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = 0, \tag{12.2}$$
where the canonically conjugate momentum is $p_i \equiv \partial L/\partial \dot q_i$, imply that this conjugate momentum is conserved, i.e., constant in time,
$$\frac{dp_i}{dt} = 0. \tag{12.3}$$
This is the simplest form of the Noether theorem, which states that for any symmetry of the theory there is a conserved charge (i.e., a charge that is constant in time). Here the conserved charge is the canonically conjugate momentum $p_i$. Note that usually the Noether theorem is formulated within classical field theory, which is outside our scope, but here we present the simple (nonrelativistic) classical-mechanics form.

• We can also have a symmetry corresponding to invariance under a transformation that is continuous (so that there exists an infinitesimal form) and linear (so that it is proportional to the $q_i$ itself) and is of the type
$$\delta q_i = \sum_a \epsilon^a (iT_a)_i{}^j q_j, \tag{12.4}$$
where the $\epsilon^a$ are arbitrary (real or complex) infinitesimal parameters and $(iT_a)_i{}^j$, for $a = 1, \ldots, N$, are constant matrices called the “generators of the symmetry”. In this more general case, consider a Lagrangian that (for simplicity) has no explicit time dependence, so that $L = L(q_i, \dot q_i)$. Then the variation of the action under the symmetry is
$$\delta S = \int dt\left[\frac{\partial L}{\partial q_i}\delta q_i + \frac{\partial L}{\partial \dot q_i}\delta \dot q_i\right] = \int dt\left[\delta q_i\left(\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}\right) + \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\delta q_i\right)\right], \tag{12.5}$$
where in the second form we have used $\delta \dot q_i = \frac{d}{dt}(\delta q_i)$ (since $[\partial_t, \delta] = 0$) and partial integration. If the transformation above is a symmetry then the action is invariant, $\delta S = 0$. Moreover, using the classical Lagrange equations of motion (so, “on-shell”), we find that
$$0 = \int_{t_1}^{t_2} dt\, \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\delta q_i\right) = \sum_a \epsilon^a \left[\frac{\partial L}{\partial \dot q_i}(iT_a)_i{}^j q_j\right]_{t_1}^{t_2}, \tag{12.6}$$
which vanishes for any $\epsilon^a$, allowing us to peel off (factorize out) the $\epsilon^a$, and for any $t_1, t_2$, which means that the quantity, a priori time dependent,
$$Q_a \equiv \frac{\partial L}{\partial \dot q_i}(iT_a)_i{}^j q_j(t), \tag{12.7}$$
is actually time independent, and is known as a “conserved charge” associated with the symmetry (12.4). This is the more common version of the Noether theorem in classical mechanics.

The Noether theorem allows for one more generalization, for the case where the Lagrangian is not invariant, but only the action is invariant. The Lagrangian can change by a total derivative, so
$$\delta S = \int_{t_1}^{t_2} dt\, \frac{d}{dt}\left(\epsilon^a \tilde Q_a\right) = \epsilon^a \tilde Q_a\Big|_{t_1}^{t_2}, \tag{12.8}$$
which vanishes only because of the boundary conditions on $\tilde Q_a$ at $t_1, t_2$. Then, actually instead of $Q_a$, it is
$$(Q_a - \tilde Q_a)(t) \tag{12.9}$$
that is independent of time.

We can make a number of observations about this analysis:

• Observation 1. The total new variable, after the transformation, is
$$q_i' = q_i + \delta q_i = \left[\delta_i{}^j + \epsilon^a (iT_a)_i{}^j\right] q_j \equiv M_i{}^j q_j, \tag{12.10}$$
where the matrix $M_i{}^j$ defines the linear transformation of $q_i$.

• Observation 2. Consider the time translation, $t \to t + \delta t$; under it $q_i$ varies by $\delta q_i = \dot q_i\, \delta t$. If time translation is an invariance (so that the action is invariant under it), the infinitesimal conserved charge and the variation of the Lagrangian are
$$\epsilon^a Q_a = \frac{\partial L}{\partial \dot q_i}\dot q_i\, \delta t, \qquad \delta L = \partial_t L\, \delta t, \tag{12.11}$$
and from the last relation we deduce that $\partial_t L = \partial_t \tilde Q \Rightarrow L = \tilde Q$. Substituting $p_i = \partial L/\partial \dot q_i$, we find the conservation law
$$0 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\dot q_i - L\right) = \frac{d}{dt}(p_i \dot q_i - L) = \frac{d}{dt}H, \tag{12.12}$$
meaning that the Hamiltonian, or energy, is the Noether charge associated with time translations.

• Observation 3. We can write the linear and continuous variation of the $q_i$ as a Poisson bracket for the charge. Indeed, we find that
$$Q_a = \frac{\partial L}{\partial \dot q_i}(iT_a)_i{}^j q_j = p_i (iT_a)_i{}^j q_j, \tag{12.13}$$
so the variation of $q_k$ is the Poisson bracket
$$\delta q_k = -\{\epsilon^a Q_a, q_k\}_{P.B.} = \delta_{ik}\, \epsilon^a (iT_a)_i{}^j q_j = \epsilon^a (iT_a)_k{}^j q_j, \tag{12.14}$$
as written before. Here we have used the fundamental Poisson bracket $\{q_k, p_i\} = \delta_{ik}$.

• Observation 4. We can generalize the formalism to discrete symmetries (meaning that there is no infinitesimal version), by writing the finite transformation
$$q_i \to q_i' = M_i{}^j q_j, \tag{12.15}$$
instead of $M = 1 + O(\epsilon)$. Then, if $S$ changes to $S' = S$ or, more restrictively, if $L$ changes to $L' = L$, the matrix $M$ defines the action of the symmetry on the variables $q_i$.
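The charge (12.13) can be evaluated along an explicit trajectory to see that it is constant. The sketch below assumes a two-dimensional isotropic harmonic oscillator with $m = \omega = 1$ (an example chosen here, not taken from the text), whose Lagrangian is invariant under rotations in the $(q_1, q_2)$ plane; the corresponding matrix $(iT)_i{}^j$ is the generator of those rotations, and the charge is the angular momentum.

```python
import math

# Rotation generator in the (q1, q2) plane: delta q_i = eps * (iT)_i^j q_j
# gives delta q1 = -eps * q2 and delta q2 = +eps * q1.
I_T = [[0.0, -1.0],
       [1.0, 0.0]]


def trajectory(t, a=1.5, b=0.7):
    """Exact solution of the m = omega = 1 isotropic oscillator (assumed example)."""
    q = [a * math.cos(t), b * math.sin(t)]
    p = [-a * math.sin(t), b * math.cos(t)]  # p_i = m * dq_i/dt with m = 1
    return q, p


def noether_charge(q, p):
    """Q = p_i (iT)_i^j q_j as in Eq. (12.13), summed over i and j."""
    return sum(p[i] * I_T[i][j] * q[j] for i in range(2) for j in range(2))
```

Along this solution `noether_charge(*trajectory(t))` equals $q_1 p_2 - q_2 p_1 = ab$ for every $t$, even though $q$ and $p$ themselves change.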
12.2 Symmetries in Quantum Mechanics: General Formalism

In quantum mechanics, the classical matrix $M_i{}^j$ becomes the operator $\hat M_i{}^j$ and the charge $Q_a$ becomes the charge operator $\hat Q_a$, which, as we saw, can be defined to be a function of the phase space variables, now operators, $\hat q_i, \hat p_i$. The linear transformation, written as a Poisson bracket, with the quantization rule becomes ($1/(i\hbar)$ times) the commutator,
$$\delta q_k = -\{\epsilon^a Q_a, q_k\}_{P.B.} \;\to\; -\frac{1}{i\hbar}[\epsilon^a \hat Q_a, \hat q_k]. \tag{12.16}$$
This is a natural relation in canonical quantization. In classical mechanics, we focused on symmetries in the Lagrangian formalism, when the action (or sometimes the Lagrangian) is invariant, $\delta S = 0$. However, in the Hamiltonian formalism, useful for quantum mechanics, the equivalent statement is the invariance of the Hamiltonian, $\delta H(q_i, p_i) = 0$. In the classical mechanics version, this means that
$$\delta H(q_i, p_i) = -\{\epsilon^a Q_a, H\}_{P.B.} = 0, \tag{12.17}$$
which translates in quantum mechanics into
$$\frac{1}{i\hbar}[\epsilon^a \hat Q_a, \hat H] = 0 \;\Rightarrow\; [\hat Q_a, \hat H] = 0, \tag{12.18}$$
i.e., that the transformed Hamiltonian operator is the same as the original one,
$$\hat H' \equiv \hat Q_a \hat H \hat Q_a^{-1} = \hat H. \tag{12.19}$$
More importantly, the Ehrenfest theorem, stating that the quantum averages on states should have the same property as the classical values, must hold. In the case of symmetry transformations, we can have two points of view.

(1) The active point of view, where a transformation changes the state of the quantum system,
$$|\psi\rangle \to |\psi'\rangle = \hat U|\psi\rangle, \tag{12.20}$$
but not the operators. Under this change, the energy, or quantum average of the Hamiltonian, the equivalent of the classical quantity, should be invariant. Thus
$$\langle\psi'|\hat H|\psi'\rangle = \langle\psi|\hat U^\dagger \hat H \hat U|\psi\rangle = \langle\psi|\hat H|\psi\rangle, \tag{12.21}$$
while the norms of the states should also be invariant,
$$\langle\psi'|\psi'\rangle = \langle\psi|\hat U^\dagger \hat U|\psi\rangle = \langle\psi|\psi\rangle, \tag{12.22}$$
implying that the operator $\hat U$ should be unitary, $\hat U^\dagger = \hat U^{-1}$. Then the invariance of the quantum average of the Hamiltonian above, for any state $|\psi\rangle$, means that the transformed operator equals the original one,
$$\hat U^{-1}\hat H\hat U = \hat H \;\Rightarrow\; [\hat U, \hat H] = 0, \tag{12.23}$$
so that the Hamiltonian commutes with the operator $\hat U$ representing the symmetry transformation on states.

(2) The passive point of view, where the transformation acts on operators but not on states,
$$\hat A \to \hat A' = \hat U^{-1}\hat A\hat U, \qquad |\psi\rangle \to |\psi\rangle. \tag{12.24}$$
This point of view should lead to the same results as the active one: we can see that the quantum average transforms in the same way.

But the change from an active to a passive point of view is more general than its application above to invariances of the Hamiltonian. It is in fact valid simply for transformations of the states or operators, corresponding to classical transformations. By the Ehrenfest theorem, the quantum version of a classical transformation of an observable $A$ is given by the transformed quantum average of the quantum operator $\hat A$,
$$\langle A'\rangle = \langle\psi'|\hat A|\psi'\rangle \quad\text{or}\quad \langle\psi|\hat A'|\psi\rangle, \tag{12.25}$$
where $\hat A' = \hat U^{-1}\hat A\hat U$.

Note that the invariance of the quantum Hamiltonian $\hat H$ under a symmetry transformation with operator $\hat Q_a$ implies a degeneracy of the eigenstates of the Hamiltonian. Indeed, if $|\psi\rangle$ is an eigenstate of $\hat H$ then so is $\hat Q_a|\psi\rangle$, since
$$\hat H(\hat Q_a|\psi\rangle) = \hat Q_a(\hat H|\psi\rangle) = E_\psi(\hat Q_a|\psi\rangle). \tag{12.26}$$
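12.3 Example 1. Translations

A classical translation of the coordinate $q_i$ acts on classical phase space, and the corresponding quantum averages, as
$$q_i \to q_i + a_i \;\Rightarrow\; \langle q_i\rangle \to \langle q_i\rangle' = \langle q_i\rangle + a_i, \qquad p_i \to p_i \;\Rightarrow\; \langle p_i\rangle \to \langle p_i\rangle. \tag{12.27}$$
But in Chapter 5 we saw that a translation by $\delta q$ is represented by an operator
$$\hat T(\delta q) = \hat 1 - i\,\delta q\,\hat K, \tag{12.28}$$
where $\hat K$ is Hermitian, $\hat K^\dagger = \hat K$, and acts on wave functions as
$$\hat K \to -i\frac{d}{dq}. \tag{12.29}$$
More precisely, since, as we saw, $\hbar/i$ appears naturally in the transition from classical mechanics, we write
$$\hat T(\epsilon) = \hat 1 - \frac{i}{\hbar}\epsilon\hat K. \tag{12.30}$$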
Then, indeed, for a finite transformation by $a$ instead of $\epsilon$,
$$\hat T(a) = e^{-i(a/\hbar)\hat K} \equiv \sum_{n\geq 0}\frac{1}{n!}\left(-\frac{ia}{\hbar}\right)^n \hat K^n. \tag{12.31}$$
This is so since for wave functions, when $\hat K = -i\hbar\, d/dq$, this is just the Taylor expansion of $\psi(q)$. By the general definition,
$$\hat T(a)|q\rangle = |q + a\rangle, \tag{12.32}$$
and this means that for the wave function in the coordinate ($q$) representation,
$$\langle q|\hat T(a)|\psi\rangle = \psi(q + a) = \hat T(a)\psi(q), \tag{12.33}$$
as required.

In the passive point of view, the action of translation on the operator $\hat q$ is
$$\hat q' = \hat T^{-1}(\epsilon)\,\hat q\,\hat T(\epsilon) = \hat q + \epsilon\hat 1, \tag{12.34}$$
since then, indeed, the action on the quantum averages is
$$\langle q\rangle_\psi \to \langle q'\rangle_\psi = \langle q\rangle_\psi + \epsilon. \tag{12.35}$$
12.4 Example 2. Time Translation Invariance

As we saw, at the classical level, time translation invariance leads to a Hamiltonian that is invariant in time (conserved), $dH/dt = 0$, which by the Ehrenfest theorem becomes in quantum mechanics
$$\left\langle\frac{d\hat H}{dt}\right\rangle_\psi = 0. \tag{12.36}$$
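This means that if we start with a given state at $t = 0$ or $t = \tau$ and evolve in time by $t$, we should obtain the same state, up to a phase $e^{i\alpha(t,\tau)}$ at the most, for any state:
$$|\psi(t)\rangle = \hat U(t, 0)|\psi(0)\rangle = e^{i\alpha(t,\tau)}|\psi'(t + \tau)\rangle = e^{i\alpha(t,\tau)}\hat U(t + \tau, \tau)|\psi'(\tau)\rangle, \tag{12.37}$$
where $|\psi'(\tau)\rangle = |\psi(0)\rangle = |\psi_0\rangle$ is the same state. That in turn implies that in fact we have a relation between evolution operators,
$$\hat U(t, 0) = e^{i\alpha(t,\tau)}\hat U(t + \tau, \tau), \tag{12.38}$$
for any $t$ and $\tau$. But, since in the infinitesimal form $\hat U \simeq \hat 1 - \frac{i}{\hbar}\hat H\, dt$, we obtain
$$\hat 1 - \frac{i}{\hbar}\hat H(0)\,dt = \left(1 + i f(\tau)\,dt\right)\left(\hat 1 - \frac{i}{\hbar}\hat H(\tau)\,dt\right), \tag{12.39}$$
so that finally
$$\hat H(\tau) = \hat H(0) + \hbar f(\tau)\hat 1. \tag{12.40}$$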
However, in any case the function $f(\tau)$ coming from the expansion of the phase $e^{i\alpha(t,\tau)}$ is trivial and can be absorbed in redefinitions, so we can just drop it, resulting in the fact that the Hamiltonian operator is time independent, as expected from the classical analysis. Equivalently, and dropping the phase $e^{i\alpha}$ from the beginning, we have
$$|\psi(t_1 + dt)\rangle \simeq \left(\hat 1 - \frac{i\,dt}{\hbar}\hat H(t_1)\right)|\psi_0\rangle, \qquad |\psi(t_2 + dt)\rangle \simeq \left(\hat 1 - \frac{i\,dt}{\hbar}\hat H(t_2)\right)|\psi_0\rangle, \tag{12.41}$$
so we obtain
$$\frac{d\hat H}{dt} = 0, \tag{12.42}$$
which in particular implies also that
$$\left\langle\frac{d\hat H}{dt}\right\rangle_\psi = 0, \quad \forall|\psi\rangle, \tag{12.43}$$
as needed.
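12.5 Mathematical Background: Review of Basics of Group Theory

General Linear and Continuous Symmetry

In general, we can write for the transformation operator,
$$\hat M = \hat 1 - \frac{i}{\hbar}\epsilon^a \hat Q_a. \tag{12.44}$$
Then the transformed operator is
$$\hat q_i' = \hat M^{-1}\hat q_i \hat M = \left(\hat 1 + \frac{i}{\hbar}\epsilon^a \hat Q_a\right)\hat q_i\left(\hat 1 - \frac{i}{\hbar}\epsilon^a \hat Q_a\right) \simeq \hat q_i - \frac{1}{i\hbar}[\epsilon^a \hat Q_a, \hat q_i]. \tag{12.45}$$
Thus we can represent the operator $\hat Q_a$ as
$$\hat Q_a = \sum_{i,j}\hat p_i (iT_a)_i{}^j \hat q_j \;\Rightarrow\; -\frac{1}{i\hbar}[\epsilon^a \hat Q_a, \hat q_k] = \epsilon^a (iT_a)_k{}^j \hat q_j = \delta\hat q_k. \tag{12.46}$$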
Groups and Invariance under Groups

Symmetry transformations form a group. We say we have a group $G$ if we have a set of elements, together with a multiplication between them, represented as $G \times G \to G$, such that we have the following properties:

(a) The multiplication respects the group, i.e., $\forall f, g \in G$, $h = f \cdot g \in G$.
(b) The multiplication is associative, i.e., $\forall f, g, h \in G$, $(f \cdot g)\cdot h = f\cdot(g\cdot h)$.
(c) There is an element $e$ called the identity, such that $e\cdot f = f\cdot e = f$, $\forall f \in G$.
(d) There is an element called the inverse, $f^{-1}$, associated with any $f \in G$, such that $f\cdot f^{-1} = f^{-1}\cdot f = e$.

Indeed, in the case of symmetries $q_i \to M_i{}^j q_j$, with $M \in G$, $\forall M_1, M_2 \in G$, $M_1\cdot M_2 = M_3 \in G$. Also, matrix multiplication is associative and admits an inverse. As examples, we can consider both time translation and translation of some $q_i$ by $a$, so that the operator is $T(a)$. Then indeed
$$T(a_1)\cdot T(a_2) = T(a_1 + a_2). \tag{12.47}$$
Example of a Group: $\mathbb{Z}_2$

The simplest group is the group $\mathbb{Z}_2$, which has just two elements, $a = e$ and $b$. It can be represented on $\mathbb{R}$ as the numbers $a = e = +1$ and $b = -1$. This implies that the multiplication table for the group is
$$a\cdot a = a, \quad b\cdot b = a, \quad a\cdot b = b\cdot a = b. \tag{12.48}$$
It also means that the group is Abelian, so that $g_1\cdot g_2 = g_2\cdot g_1$, $\forall g_1, g_2 \in G$. Abelian groups are named after the Norwegian mathematician Niels Henrik Abel, whose name is associated with the most important prize in mathematics, the Abel prize. To have invariance of a system under a group $G$ in classical mechanics means that we need to have invariance under the transformations $q_i \to g q_i$, $\forall g \in G$. Thus $L(g q_i) = L(q_i)$, or at least $S[g q_i] = S[q_i]$ (invariant action).
Example of an Invariant System under $\mathbb{Z}_2$ Acting on $q$

As a simple example of invariance under $\mathbb{Z}_2$, consider a system with one real coordinate $q \in \mathbb{R}$, but with a potential depending only on its modulus, $V = V(|q|)$, so that
$$L = m\frac{\dot q^2}{2} - V(|q|). \tag{12.49}$$
Then $q' = gq$ for $g = a$ or $b$. For the case $g = a = +1$, we obtain $q' = q$, so this is trivially a symmetry. For the case $g = b = -1$, we need $L(-q) = L(q)$. But, indeed, $\dot q^2 = (-\dot q)^2$ and $|-q| = |q|$, implying $L(-q) = L(q)$. Then also $S[q] = S[-q]$ (the action is invariant), though the reverse is not true in general (invariance of the action doesn’t imply invariance of the Lagrangian).
Example of a System with Invariance of the Action but not of the Lagrangian

We can modify the above example in such a way that the action is still invariant, $S[-q] = S[q]$, but the Lagrangian is not. Consider then the new Lagrangian
$$L = m\frac{\dot q^2}{2} - V(|q|) + \alpha\frac{d}{dt}q, \tag{12.50}$$
together with the boundary condition $q(t_2) = q(t_1)$. Then we have explicitly that $L(-q) \neq L(q)$, since the new term is odd under $q \to -q$, but the action is invariant, since the new term gives
$$\alpha\int_{t_1}^{t_2} dt\, \frac{d}{dt}q = \alpha\left(q(t_2) - q(t_1)\right) = 0, \tag{12.51}$$
because of the boundary condition.
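This distinction can be checked numerically on a concrete path. The sketch below assumes $m = 1$, $\alpha = 0.8$, the potential $V(|q|) = |q|^3$, and the closed path $q(t) = \sin(2\pi t)$ on $[t_1, t_2] = [0, 1]$; all of these are illustrative choices, and the action is estimated with a simple trapezoidal rule.

```python
import math

MASS = 1.0    # illustrative values, not taken from the text
ALPHA = 0.8


def potential(q):
    """Any function of |q| would do; |q|^3 is an arbitrary choice."""
    return abs(q) ** 3


def lagrangian(q, qdot):
    """Eq. (12.50): L = m qdot^2 / 2 - V(|q|) + alpha * dq/dt."""
    return 0.5 * MASS * qdot ** 2 - potential(q) + ALPHA * qdot


def path(t, sign=1.0):
    """q(t) = sign * sin(2 pi t) and its time derivative; q(0) = q(1) holds."""
    q = sign * math.sin(2 * math.pi * t)
    qdot = sign * 2 * math.pi * math.cos(2 * math.pi * t)
    return q, qdot


def action(sign=1.0, steps=4000):
    """Trapezoidal estimate of S = integral of L dt over [0, 1] along the path."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * lagrangian(*path(k * h, sign))
    return total * h
```

Comparing `lagrangian(*path(t, 1.0))` with `lagrangian(*path(t, -1.0))` at a generic time shows that $L$ changes under $q \to -q$, while `action(1.0)` and `action(-1.0)` agree to numerical precision.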
Generalization: Cyclic Groups

The next (“cyclic”) group in terms of the number of elements is $\mathbb{Z}_3$, made up of three elements, $\{e, a, b\}$, that can be represented in $\mathbb{C}$ as complex numbers, equal to the third roots of unity, i.e., $x \in \mathbb{C}$ such that $x^3 = 1$. We have then
$$e = 1 = e^0, \quad a = e^{2\pi i/3}, \quad b = e^{4\pi i/3}, \tag{12.52}$$
which implies the multiplication table
$$a^2 = b, \quad b^2 = a, \quad ab = ba = e. \tag{12.53}$$
This is also an Abelian group, as we can easily see.
As an example of a classical Lagrangian that is invariant under the above group, consider a generalization of the previous one. Consider a complex variable $q = q_1 + iq_2$, $q_1, q_2 \in \mathbb{R}$, and
$$L(q) = m\frac{|\dot q|^2}{2} - V(q^3). \tag{12.54}$$
Then, under $q \to q' = gq$, we have $|\dot q'| = |g\dot q| = |\dot q|$ and
$$q^3 \to q'^3 = (gq)^3 = q^3, \tag{12.55}$$
since $g^3 = 1$ for all $g \in \mathbb{Z}_3$.

Our final generalization is to the $N$-element cyclic group $\mathbb{Z}_N$, with elements $\{e, a_1, \ldots, a_{N-1}\}$ that can be represented in the complex numbers as the $N$th roots of unity, $g$, such that $g^N = 1$, specifically
$$e = 1, \quad a_1 = e^{2\pi i/N}, \quad \ldots, \quad a_{N-1} = e^{2\pi i(N-1)/N}. \tag{12.56}$$
An example of a Lagrangian invariant under this group is a further generalization of the same Lagrangian as before, for a complex variable $q$,
$$L(q, \dot q) = m\frac{|\dot q|^2}{2} - V(q^N), \tag{12.57}$$
since then indeed
$$q^N \to q'^N = (gq)^N = q^N. \tag{12.58}$$
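The roots-of-unity representation (12.56) is easy to explore numerically. The following minimal sketch builds $\mathbb{Z}_N$ as complex numbers, computes its multiplication table in terms of element indices ($0$ for $e$, $k$ for $a_k$), and can be used to check the invariance (12.58); the function names are illustrative.

```python
import cmath


def cyclic_group(n):
    """Elements of Z_N as the Nth roots of unity, Eq. (12.56)."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def index_of(element, group, tol=1e-9):
    """Position of `element` in `group`; raises if it is not a group element."""
    for k, g in enumerate(group):
        if abs(element - g) < tol:
            return k
    raise ValueError("product left the group: closure would be violated")


def multiplication_table(n):
    """Entry [k][l] is the index of the product a_k * a_l."""
    group = cyclic_group(n)
    return [[index_of(g1 * g2, group) for g2 in group] for g1 in group]
```

For $N = 3$ the table reproduces (12.53), and its symmetry under transposition is the statement that the group is Abelian.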
$\mathbb{Z}_N$-Invariant System in Quantum Mechanics

We can translate the invariance of the above classical system into a quantum mechanical invariance. For that, we define the quantum Hamiltonian
$$\hat H = m\frac{|\hat{\dot q}|^2}{2} + V(\hat q^N). \tag{12.59}$$
Then we can check that the Hamiltonian is invariant under $\hat g \in \mathbb{Z}_N$, acting as $\hat q \to \hat g^{-1}\hat q\hat g$, since
$$\hat H' = \hat g^{-1}\hat H\hat g = m\frac{|\hat g^{-1}\hat{\dot q}\hat g|^2}{2} + V\left((\hat g^{-1}\hat q\hat g)^N\right) = \hat H. \tag{12.60}$$
Representations of Groups

So far, we have defined the groups $\mathbb{Z}_N$ by the complex numbers that define their multiplication table. But it is important to understand that we have different kinds of representations of the group, in terms of different kinds of objects. For a representation $R$, we say that the element $g$ is represented by $D_R(g)$. In the case of the fundamental (defining) representation of the $\mathbb{Z}_N$ defined above, since all the elements $g$ are complex numbers, we have a one-dimensional complex vector space. But for $\mathbb{Z}_N$ with $N \geq 3$, there is (at least) one other representation, called the regular representation. In the case of $\mathbb{Z}_3$, we define it in terms of $3\times 3$ matrices,
$$D(e) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}, \quad D(a) = \begin{pmatrix} 0 & 0 & 1\\ 1 & 0 & 0\\ 0 & 1 & 0\end{pmatrix}, \quad D(b) = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0\end{pmatrix}, \tag{12.61}$$
so it permutes the three elements of the vector space among each other. This is a three-dimensional representation, meaning that these are matrix operators acting on a three-dimensional vector space $(x, y, z)^T$. Consider a Cartesian basis for this vector space, and denote its components according to the three elements of the group, as
$$|e\rangle = \begin{pmatrix}1\\0\\0\end{pmatrix} \equiv |e_1\rangle, \quad |a\rangle = \begin{pmatrix}0\\1\\0\end{pmatrix} \equiv |e_2\rangle, \quad |b\rangle = \begin{pmatrix}0\\0\\1\end{pmatrix} \equiv |e_3\rangle. \tag{12.62}$$
The notation has been chosen in such a way that we have
$$D(g_1)|g_2\rangle = |g_1 g_2\rangle \equiv |h\rangle, \tag{12.63}$$
as we can check explicitly. Moreover, since the states are orthonormal, $\langle e_i|e_j\rangle = \delta_{ij}$, we find that the matrix elements of $D(g)$ are given by
$$(D(g))_{ij} = \langle e_i|D(g)|e_j\rangle = \langle e_i|D(g)e_j\rangle. \tag{12.64}$$
This regular representation is for the same group as the group defined by the roots of unity, but it is inequivalent to it, since the vector space has a different dimension (one versus three).
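The defining property (12.63) fixes the matrices of the regular representation completely, so they can be generated rather than written by hand. The sketch below does this for $\mathbb{Z}_3$, labelling the elements by strings in the order of the basis (12.62); the helper names are illustrative.

```python
ELEMENTS = ["e", "a", "b"]  # same order as the basis |e1>, |e2>, |e3>


def product(g1, g2):
    """Z_3 multiplication with b = a^2: exponents of a add modulo 3."""
    return ELEMENTS[(ELEMENTS.index(g1) + ELEMENTS.index(g2)) % 3]


def regular_matrix(g):
    """D(g)_{ij} = 1 if g * g_j = g_i and 0 otherwise, so that D(g)|g_j> = |g g_j>."""
    return [[1 if product(g, gj) == gi else 0 for gj in ELEMENTS] for gi in ELEMENTS]


def matmul(x, y):
    """Plain 3x3 matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

The generated `regular_matrix("a")` and `regular_matrix("b")` coincide with $D(a)$ and $D(b)$ in (12.61), and `matmul(regular_matrix(g1), regular_matrix(g2))` equals `regular_matrix(product(g1, g2))`, i.e., the matrices multiply like the group elements.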
Equivalent Representations

Two representations are equivalent if, first, their spaces have the same dimension, and second, there is a similarity transformation $S$, defining a change of basis, under which we get from one representation to the other, i.e.,
$$D(g) \to D'(g) = S^{-1}D(g)S. \tag{12.65}$$
Indeed, in this case, the matrix elements are the same,
$$(D'(g))_{ij} \equiv \langle e_i|D'(g)|e_j\rangle = \langle e_i|S^{-1}D(g)S|e_j\rangle = \langle e_i'|D(g)|e_j'\rangle = (D(g))_{ij}, \tag{12.66}$$
where in the second equality we have used the change of basis $|e_i'\rangle = S|e_i\rangle$ and the unitarity of $S$, $S^{-1} = S^\dagger$, which is true for $\mathbb{Z}_N$ and for all the classical Lie groups.
Reducible Representations

If there is an invariant subspace $H$ of the representation vector space, i.e.,
$$\forall |h\rangle \in H, \quad D(g)|h\rangle \in H, \quad \forall g \in G, \tag{12.67}$$
we say that we have a reducible representation. If not, we say we have an irreducible one.
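For the unitary representations considered here, a reducible representation can always be brought, by a suitable choice of basis, to block-diagonal form,

$$D(g) = \begin{pmatrix} D(h) & 0 & 0 & \cdots \\ 0 & D(\tilde h) & 0 & \cdots \\ 0 & 0 & D(\tilde{\tilde h}) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \tag{12.68}$$

Indeed, in this case, the vector space splits as

$$\begin{pmatrix} |h\rangle \\ |\tilde h\rangle \\ |\tilde{\tilde h}\rangle \\ \vdots \end{pmatrix}, \tag{12.69}$$

such that, for each subspace, $D(g)|h\rangle \in H$, etc. An irreducible representation is a representation that is not reducible.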
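As an illustration of equivalence and reducibility together, the following sketch brings the three-dimensional regular representation of $\mathbb{Z}_3$ to diagonal (hence block-diagonal) form by a unitary change of basis. The choice of $S$ as the discrete Fourier matrix is an assumption made for this example and is not taken from the text; the helper names are hypothetical.

```python
import cmath

N = 3
omega = cmath.exp(2j * cmath.pi / N)   # primitive cube root of unity

def regular_matrix(g, n=N):
    """D(g) for Z_n, with group elements labeled 0..n-1 (addition mod n)."""
    return [[1 if row == (g + col) % n else 0 for col in range(n)]
            for row in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Hermitian conjugate of a square complex matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

# Assumed change of basis: column k of S has components omega^(k h) / sqrt(N).
S = [[omega ** (k * h) / N ** 0.5 for k in range(N)] for h in range(N)]

def transformed(g):
    """D'(g) = S^{-1} D(g) S, using S^{-1} = S^dagger for unitary S."""
    return matmul(dagger(S), matmul(regular_matrix(g), S))

def is_diagonal(A, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j]) < tol
               for i in range(n) for j in range(n) if i != j)
```

In this basis every `transformed(g)` is diagonal, and its diagonal entries are cube roots of unity, so the regular representation splits into three one-dimensional representations.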
Important Concepts to Remember

• A symmetry is a transformation on the variables of the Lagrangian that leaves the action invariant.
• The Noether theorem says that for any symmetry of the theory there is a conserved charge.
• For a continuous linear symmetry, $\delta q_i = \epsilon^a (T_a)_i{}^j q_j$, the conserved charge (invariant in time) is $Q_a = \frac{\partial L}{\partial \dot q_i} (T_a)_i{}^j q_j$.
• The momenta $p_i$ are the charges associated with invariance under translations in $q_i$, and the Hamiltonian is the charge associated with time translation invariance.
• Linear transformations can be written as $\delta q_k = -\{\epsilon^a Q_a, q_k\}_{P.B.}$.
• Invariance in quantum mechanics means that $[\hat Q_a, \hat H] = 0$, corresponding to $\{Q_a, H\}_{P.B.} = 0$.
• Symmetry in quantum mechanics can be understood either as an active transformation on states, $|\psi\rangle \to \hat U|\psi\rangle$, but not on operators, or as a passive transformation on operators, $\hat A \to \hat A' = \hat U^{-1} \hat A \hat U$, but not on states. Either way, the expectation values transform in the same manner.
• Symmetry transformations form a group.
• The simplest groups are the discrete cyclic groups $\mathbb{Z}_N$; these can be represented by the $N$th roots of unity and are Abelian groups, i.e., $g_1 g_2 = g_2 g_1$, $\forall g_1, g_2 \in G$.
• Groups have different representations; for $\mathbb{Z}_N$, we have the fundamental representation as $N$th roots of unity with a multiplication table, and the regular representation as real matrices acting on the group elements as elements in an $N$-dimensional vector space, etc.
• Equivalent representations can be mapped to each other; reducible representations have block-diagonal matrices and so can be reduced (split) into lower-dimensional representations; irreducible representations cannot.
Further Reading See [8] for more on group theory for quantum mechanics.
Exercises

(1) Consider the Lagrangian ($q, \tilde q \in \mathbb{C}$)

$$L = \frac{m}{2}\left(|\dot q|^2 + |\dot{\tilde q}|^2\right) - V\left((q^2 + \tilde q^2)^2\right). \tag{12.70}$$

What are its symmetries? What are the representations of the group in which $q, \tilde q$ belong?

(2) Calculate the Noether charge for the continuous symmetry(ies) in exercise 1, and check that the infinitesimal variations of $q, \tilde q$ are indeed generated by the Noether charge via the Poisson bracket with $q, \tilde q$.

(3) Consider two harmonic oscillators of the same mass and frequency, with Hamiltonian

$$H = \frac{1}{2m}(p_1^2 + p_2^2) + \frac{k}{2}(x_1^2 + x_2^2). \tag{12.71}$$

Find the continuous symmetries of the system and write down the resulting conserved charges as a function of the phase space variables. Then quantize the system, and show that the charges do indeed commute with the quantum Hamiltonian.

(4) For the system in exercise 3, write down the equation of motion for $x_1^2 + x_2^2 \equiv r^2$ and then the corresponding Ehrenfest theorem equation and its transformed version under the continuous symmetries, in both the active and the passive sense.

(5) Consider the Lagrangian for $q \in \mathbb{C}$

$$L = \frac{m}{2}|\dot q|^2 - V(q), \tag{12.72}$$

in the cases

$$\begin{aligned} V &= \frac{1}{q^3} + \frac{1}{\bar q^3} + \frac{1}{q^5} + \frac{1}{\bar q^5} & \text{(I)} \\ V &= \frac{1}{|q|^3} + \frac{1}{|q|^5} & \text{(II)} \\ V &= \frac{1}{q^3} & \text{(III)}. \end{aligned} \tag{12.73}$$

What are the symmetries in each case?

(6) Consider the matrices

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}. \tag{12.74}$$

(a) Do they form a representation of $\mathbb{Z}_2$? Why? (b) If so, is the representation reducible? (c) If so, is this a regular representation? (d) If so, is this equivalent with the roots of unity representation?

(7) Write down the generalization of the regular representation for $\mathbb{Z}_3$ to the $\mathbb{Z}_N$ case, for a cyclic permutation by one step of the basis elements.
13
Symmetries in Quantum Mechanics II: Discrete Symmetries and Internal Symmetries

After analyzing mostly continuous spacetime symmetries (spatial and temporal translations), as well as considering the general theory of symmetries and of discrete groups, we move on to applications of discrete symmetries in quantum mechanics.
13.1 Discrete Symmetries: Symmetries under Discrete Groups

We start with $\mathbb{Z}_2$-type symmetries (symmetries that can be described in terms of the group $\mathbb{Z}_2$). We consider symmetries that are present at the classical level also:

• Parity symmetry P, or spatial inversion (like mirror reflection, but for all spatial coordinates instead of just one). By definition, all spatial coordinates get a minus sign. But inverting the direction of space also inverts the direction of momentum, so we have

$$\vec x \to \vec x' = -\vec x, \quad \vec p \to \vec p' = -\vec p. \tag{13.1}$$

In the presence of spin (understood as rotation around the direction of the momentum), spin does not get inverted; see Fig. 13.1a.

• Time reversal invariance T, inverting the direction of time,

$$t \to -t. \tag{13.2}$$

Of course, unlike parity, which could be thought of as a simple change of reference frame, inverting the direction of time is not physically possible. Moreover, we know that there are phenomena such as the increase in entropy that define the arrow of time. But in this case, we simply mean "filming" the evolution in time of a system and rolling it backwards. In other words, consider the dynamics of the system and change $t$ into $-t$ in its equations.

• "Internal" symmetries, i.e., symmetries that are not associated with spacetime but rather with some internal degrees of freedom. Examples are "charge conjugation", which changes particles into antiparticles, R-parity in supersymmetric theories, isospin parity in the Standard Model of particle physics, etc.

Another example of a discrete symmetry is lattice translation (translation by a fixed amount $a$) in condensed matter, but we will not analyze it here.
13 Discrete Symmetries in Quantum Mechanics and Internal Symmetries
Figure 13.1: (a) Parity is understood as inversion in a mirror: momentum gets inverted, spin (rotation around the direction of the momentum) does not. (b) Time reversal invariance: both momentum and spin get inverted.
13.2 Parity Symmetry

Corresponding to the classical parity symmetry, which acts on coordinates and momenta by inverting them, we define a parity operator $\hat\pi$ that acts on states and/or operators. Specifically, for eigenstates of $\hat X$ and $\hat P$, in the active point of view,

$$\hat\pi|x\rangle = |-x\rangle, \quad \hat\pi|p\rangle = |-p\rangle. \tag{13.3}$$

Equivalently, in the passive point of view, with the parity operator acting on operators,

$$\hat X' = \hat\pi^{-1} \hat X \hat\pi = -\hat X, \quad \hat P' = \hat\pi^{-1} \hat P \hat\pi = -\hat P. \tag{13.4}$$

This means that $[\hat\pi, \hat X] \neq 0$, $[\hat\pi, \hat P] \neq 0$. Either way, we find the correct transformation law for the quantum averages of operators, which by the Ehrenfest theorem are the quantum equivalents of classical transformations,

$$\begin{aligned} \langle x\rangle_{\psi'} &= \langle\psi|\hat\pi^{-1} \hat X \hat\pi|\psi\rangle = -\langle\psi|\hat X|\psi\rangle = -\langle x\rangle_\psi, \\ \langle p\rangle_{\psi'} &= \langle\psi|\hat\pi^{-1} \hat P \hat\pi|\psi\rangle = -\langle\psi|\hat P|\psi\rangle = -\langle p\rangle_\psi. \end{aligned} \tag{13.5}$$

Moreover, we can find the action of parity on wave functions $\psi(x) = \langle x|\psi\rangle$ by introducing the completeness relation for $|x\rangle$ states in $\langle x|\hat\pi|\psi\rangle$,

$$\langle x|\hat\pi|\psi\rangle = \int dx'\, \langle x|\hat\pi|x'\rangle\langle x'|\psi\rangle = \int dx'\, \langle x|-x'\rangle\langle x'|\psi\rangle = \langle -x|\psi\rangle = \psi(-x), \tag{13.6}$$

where we have used the orthonormality relation $\langle x'|x\rangle = \delta(x - x')$.

Some observations about the parity operator: carrying out the classical parity operation twice, we get back to the same situation. Correspondingly, we find that the parity operator applied twice also gives the identity,

$$\hat\pi^2|x\rangle = |x\rangle \Rightarrow \hat\pi^2 = \hat 1 \Rightarrow \hat\pi^{-1} = \hat\pi. \tag{13.7}$$

Since parity is a symmetry transformation, $\hat\pi$ is also unitary, according to the general theory, so we have $\hat\pi^\dagger = \hat\pi^{-1}$.
Since $\hat\pi^2 = \hat 1$, the eigenvalues of $\hat\pi$ are $\pm 1$, so for parity eigenstates we have

$$\hat\pi|\psi\rangle = \pm|\psi\rangle. \tag{13.8}$$

Then, for the wave functions of these parity eigenstates we have

$$\psi(-x) = \pm\psi(x), \tag{13.9}$$

which are called even parity and odd parity, respectively.

But not all states are parity eigenstates. In particular, eigenstates of $\hat X$, $\hat P$ are not, since, as we saw, $\hat\pi|x\rangle = |-x\rangle$ and $\hat\pi|p\rangle = |-p\rangle$. Also, as we saw, $[\hat\pi, \hat X] \neq 0$, $[\hat\pi, \hat P] \neq 0$, so we cannot have simultaneously an eigenstate of $\hat X$ or $\hat P$ and of $\hat\pi$. However, we can have a Hamiltonian that is invariant under parity,

$$\hat\pi \hat H \hat\pi^{-1} = \hat H \Rightarrow [\hat\pi, \hat H] = 0, \tag{13.10}$$

which means that such a Hamiltonian can have eigenstates in common with $\hat\pi$. Then, if $[\hat\pi, \hat H] = 0$ and $|n\rangle$ is an eigenstate of $\hat H$, $\hat H|n\rangle = E_n|n\rangle$, and $|n\rangle$ is nondegenerate, it follows that $|n\rangle$ is also a $\hat\pi$ eigenstate.
Example: Harmonic Oscillator

For the harmonic oscillator, the ground state $|0\rangle$ is parity even, since the wave function is symmetric around 0. Since

$$a^\dagger = \alpha \hat X + \beta \hat P, \quad \alpha, \beta \in \mathbb{C}, \tag{13.11}$$

which is parity odd (since both $\hat X$ and $\hat P$ are parity odd), it follows that the first excited state is parity odd, since

$$|1\rangle \propto a^\dagger|0\rangle, \tag{13.12}$$

and, more generally,

$$|n\rangle \propto (a^\dagger)^n|0\rangle \Rightarrow \hat\pi|n\rangle = (-1)^n|n\rangle, \tag{13.13}$$

so the parity of the state $|n\rangle$ is $(-1)^n$.
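The parity $(-1)^n$ of the oscillator eigenfunctions can be checked numerically. The sketch below assumes the standard position-space form $\psi_n(x) \propto H_n(x)\, e^{-x^2/2}$ in units where $m\omega/\hbar = 1$ (a form not derived in this section), and the function names are hypothetical.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x          # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def psi(n, x):
    """Unnormalized oscillator eigenfunction (assumed units: m*omega/hbar = 1)."""
    return hermite(n, x) * math.exp(-x * x / 2.0)

def has_parity(n, points=(0.3, 1.1, 2.5), tol=1e-12):
    """Check psi_n(-x) = (-1)^n psi_n(x) at a few sample points."""
    return all(abs(psi(n, -x) - (-1) ** n * psi(n, x)) < tol for x in points)
```

Calling `has_parity(n)` for the first few values of `n` tests the relation $\psi_n(-x) = (-1)^n \psi_n(x)$ implied by (13.9) and (13.13).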
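Parity Selection Rules

We can also consider selection rules, meaning rules for a matrix element to be nonzero. If $|\alpha\rangle$, $|\beta\rangle$ are parity eigenstates, $|\alpha\rangle$ has parity $\epsilon_\alpha$, and $|\beta\rangle$ has parity $\epsilon_\beta$, then (since $\hat X$, $\hat P$ are odd under parity)

$$\langle\beta|\hat X|\alpha\rangle = 0 = \langle\beta|\hat P|\alpha\rangle, \tag{13.14}$$

unless $\epsilon_\alpha = -\epsilon_\beta$ (if not, the states $|\beta\rangle$ and $\hat X|\alpha\rangle$, $\hat P|\alpha\rangle$ are orthogonal, corresponding to different eigenvalues for the Hermitian operator $\hat\pi$).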
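The selection rule can also be obtained in one line from the passive transformation law. The following short derivation is a sketch that uses only $\hat\pi^{-1}\hat X\hat\pi = -\hat X$, $\hat\pi^\dagger = \hat\pi^{-1}$, and the eigenvalue equations $\hat\pi|\alpha\rangle = \epsilon_\alpha|\alpha\rangle$, $\hat\pi|\beta\rangle = \epsilon_\beta|\beta\rangle$ with $\epsilon_{\alpha,\beta} = \pm 1$:

```latex
\begin{aligned}
\langle\beta|\hat X|\alpha\rangle
  &= -\langle\beta|\hat\pi^{-1}\hat X\hat\pi|\alpha\rangle
   = -\epsilon_\beta\,\epsilon_\alpha\,\langle\beta|\hat X|\alpha\rangle \\
\Rightarrow\quad
(1 + \epsilon_\alpha\epsilon_\beta)\,\langle\beta|\hat X|\alpha\rangle &= 0 .
\end{aligned}
```

The matrix element must therefore vanish unless $\epsilon_\alpha\epsilon_\beta = -1$, and the same steps apply with $\hat X$ replaced by $\hat P$.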
13.3 Time Reversal Invariance, T

As we mentioned, the time reversal operator corresponds to the reversal of motion, meaning running backwards the dynamics of the theory, which includes inverting the momentum. This leads to the classical action of T,

$$\vec x \to \vec x' = \vec x, \quad \vec p \to \vec p' = -\vec p. \tag{13.15}$$

In classical mechanics, if $\vec x(t)$ is a solution to the Newtonian equation of motion,

$$m \ddot{\vec x} = -\vec\nabla V, \tag{13.16}$$

then $\vec x(-t)$ is also a solution, since making the replacement $t \to -t$ amounts to replacing $d^2/dt^2$ by $d^2/d(-t)^2$ in the equation, which leaves it invariant. Defining this as a new solution $\vec x'(t) \equiv \vec x(-t)$, we indeed find that

$$\frac{d^2}{dt^2}\vec x'(t) = \frac{d^2}{d(-t)^2}\vec x(-t). \tag{13.17}$$

On the other hand, in quantum mechanics, the Schrödinger equation,

$$i\hbar\frac{\partial}{\partial t}\psi = \left(-\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r)\right)\psi, \tag{13.18}$$

is not invariant under time inversion, because of the presence of a single time derivative: if $\psi(\vec x, t)$ is a solution, then $\psi(\vec x, -t)$ is not a solution, since the time reversed equation is

$$-i\hbar\frac{\partial}{\partial t}\psi = \left(-\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r)\right)\psi. \tag{13.19}$$

However, $\psi^*(\vec x, -t)$ is a solution to (13.18). Indeed, the complex conjugate of the Schrödinger equation is

$$-i\hbar\frac{\partial}{\partial t}\psi^* = \left(-\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r)\right)\psi^*, \tag{13.20}$$

which is the same as the time reversed equation, except that $\psi$ is replaced by $\psi^*$. Then, for an energy eigenfunction, the wave function is stationary, and we have

$$\psi(\vec x, t) = \psi_n(\vec x)e^{-iE_n t/\hbar} \Rightarrow \psi^*(\vec x, -t) = \psi_n^*(\vec x)e^{-iE_n t/\hbar}. \tag{13.21}$$

Considering a fixed time, such as $t = 0$, we obtain

$$\psi_n(\vec x) = \langle\vec x|n\rangle \xrightarrow{T} \psi_n^*(\vec x) = (\langle\vec x|n\rangle)^* = \langle n|\vec x\rangle. \tag{13.22}$$
This suggests that the $\hat T$ operator in quantum mechanics is an antilinear and unitary operator, or anti-unitary operator, so (according to the mathematics reviewed in the first few chapters)

$$\hat T(c|\psi\rangle) = c^*\hat T(|\psi\rangle), \quad \forall c \in \mathbb{C}. \tag{13.23}$$

By definition, in the passive point of view the action of $\hat T$ on the operators $\hat X$, $\hat P$ is the equivalent of the classical action on $x$, $p$, i.e.,

$$\hat T \hat X \hat T^{-1} = \hat X, \quad \hat T \hat P \hat T^{-1} = -\hat P. \tag{13.24}$$
Therefore

$$\hat T^2 = \hat 1 \Rightarrow \hat T^{-1} = \hat T, \tag{13.25}$$

though only in the case without spin. As we will see later, on states with spin, the action of $\hat T$ is different.

Having $\hat T$ as an anti-unitary operator solves another issue. Indeed, for a time-reversal invariant system, evolving a state by an infinitesimal time $dt$ after the action of $\hat T$ should be equivalent to the action of $\hat T$ after a time evolution by $-dt$. Since in the infinitesimal case $\hat U(dt, 0) \simeq \hat 1 - i\hat H dt/\hbar$, we obtain

$$\left(\hat 1 - i\hat H\frac{dt}{\hbar}\right)\hat T|\psi\rangle = \hat T\left(\hat 1 - i\hat H\frac{(-dt)}{\hbar}\right)|\psi\rangle \Rightarrow -i\hat H\hat T|\psi\rangle = \hat T\left(i\hat H|\psi\rangle\right). \tag{13.26}$$

In the above, we have divided by $dt/\hbar$; both quantities in the ratio are real numbers, so the ratio can be taken out of both linear and antilinear operators. However, we cannot divide by $i$, as for a linear operator, and conclude that $\hat H\hat T = -\hat T\hat H$, which would be nonsensical, since it would imply that

$$\hat H\hat T|E_n\rangle = -\hat T\hat H|E_n\rangle = -E_n\hat T|E_n\rangle, \tag{13.27}$$

which would give negative eigenvalues of the Hamiltonian, and this is physically impossible. Instead, if we accept that $\hat T$ is antilinear, which is consistent with the previously derived action on wave functions $\psi_n(\vec x) \xrightarrow{T} \psi_n^*(\vec x)$, when dividing out $i$ from $\hat T$, it becomes $-i$, so we obtain

$$\hat H(\hat T|\psi\rangle) = \hat T(\hat H|\psi\rangle), \tag{13.28}$$

which can be stated as an action on the Hamiltonian,

$$\hat H\hat T = \hat T\hat H \Rightarrow \hat T\hat H\hat T^{-1} = \hat H, \tag{13.29}$$

saying that the Hamiltonian is time-reversal invariant, which is consistent with classical physics and our assumption. We note that in the above analysis, we didn't need to define the action of $\hat T$ on a "bra" state, $\langle\chi|\hat T$, which would be hard to do since the notation of bra and ket states was defined for linear operators.

For a general observable, associated with a Hermitian operator $\hat A$, if the observable has a well-defined action under T, we have

$$\hat T\hat A\hat T^{-1} = \pm\hat A, \tag{13.30}$$

which means $\hat A$ is even or odd under time reversal. Then, for quantum averages, we have

$$\langle\psi|\hat A|\chi\rangle = \pm\langle\hat T\psi|\hat A|\hat T\chi\rangle^*, \tag{13.31}$$

which is consistent with the Ehrenfest theorem (the complex conjugation comes from the antilinearity of $\hat T$ and drops out for expectation values, $\psi = \chi$).

Coming back to the action of $\hat T$ on wave functions, we note that the antilinear property of $\hat T$ implies that action as well: since

$$\hat T(c|x\rangle) = c^*\hat T|x\rangle = c^*|x\rangle, \tag{13.32}$$

inserting the $|x\rangle$ completeness relation in front of a state $|\psi\rangle$,

$$|\psi\rangle = \int dx\,|x\rangle\langle x|\psi\rangle = \int dx\,|x\rangle\psi(x), \tag{13.33}$$
implies the action of $\hat T$ on it as

$$\hat T|\psi\rangle = \int dx\,\psi^*(x)|x\rangle, \tag{13.34}$$

so we re-obtain the previously found action $\psi(x) \xrightarrow{T} \psi^*(x)$.

We note two more properties of $\hat T$:

• If $\hat H$ is invariant under $\hat T$, and the energy eigenstates $|n\rangle$ are nondegenerate, then $\langle x|n\rangle$ is real (up to a constant overall phase), since

$$T: \langle x|n\rangle \to \langle x|n\rangle^* = \langle x|n\rangle. \tag{13.35}$$

• If a system has spin $\vec S$, the spin is odd under time reversal, $\vec S \xrightarrow{T} -\vec S$. The reason is that we can think of the spin as a sort of rotation around the axis of the momentum, and running time backwards means the rotation changes direction; see Fig. 13.1b. In quantum theory, this means that

$$\hat T\hat{\vec S}\hat T^{-1} = -\hat{\vec S}. \tag{13.36}$$

A more precise notion has to wait for a better description of spin; here we merely note that for a state of spin $j$, we have

$$\hat T^2|\text{spin } j\rangle = (-1)^{2j}|\text{spin } j\rangle. \tag{13.37}$$

For spinless states ($j = 0$), we recover $\hat T^2 = \hat 1$.
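The antilinearity (13.23) and the sign in (13.37) can be illustrated on small vectors of amplitudes. In the sketch below, the spinless $\hat T$ is modeled as complex conjugation of position-basis amplitudes, as in (13.34). For spin 1/2 the form $\hat T = (i\sigma_y)K$, with $K$ complex conjugation, is the commonly used choice but is an assumption here, since the text defers the description of spin. All function names are hypothetical.

```python
def conj_vec(v):
    """Complex conjugate of each amplitude."""
    return [complex(z).conjugate() for z in v]

def T_spinless(v):
    """Time reversal on position-basis amplitudes: psi(x) -> psi*(x)."""
    return conj_vec(v)

def T_spin_half(v):
    """Assumed form for spin 1/2: T = (i sigma_y) K, with i sigma_y = [[0, 1], [-1, 0]]."""
    up, down = conj_vec(v)
    return [down, -up]

def scale(c, v):
    return [c * z for z in v]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

state = [0.6 + 0.2j, -0.3 + 0.7j]   # arbitrary sample amplitudes
c = 2.0 - 1.5j                      # arbitrary complex coefficient

antilinear = close(T_spinless(scale(c, state)),
                   scale(c.conjugate(), T_spinless(state)))
square_spinless = close(T_spinless(T_spinless(state)), state)                 # T^2 = +1
square_spin_half = close(T_spin_half(T_spin_half(state)), scale(-1, state))   # T^2 = -1
```

With these definitions the three flags test $\hat T(c|\psi\rangle) = c^*\hat T|\psi\rangle$, $\hat T^2 = +\hat 1$ for $j = 0$, and $\hat T^2 = -\hat 1$ for $j = 1/2$, in agreement with $(-1)^{2j}$.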
13.4 Internal Symmetries

Having dealt with discrete spacetime symmetries, we turn to discrete internal symmetries, having to do with internal degrees of freedom. Consider our prototype, the $\mathbb{Z}_N$ group, specifically $\mathbb{Z}_2$, for the case where there is no spacetime interpretation (unlike for $\hat\pi$, $\hat T$).

We considered the example of a $\mathbb{Z}_N$-invariant system with complex variable $\hat q$ and Hamiltonian

$$\hat H = m\frac{|\dot{\hat q}|^2}{2} + V(\hat q^N). \tag{13.38}$$

Then, indeed,

$$\hat g\hat H\hat g^{-1} = \hat H, \quad \forall g \in \mathbb{Z}_N. \tag{13.39}$$

Since therefore $[\hat g, \hat H] = 0$, we obtain a degeneracy of states, with $\hat g|E_n\rangle$ having the same energy $E_n$ as $|E_n\rangle$ for all $g \in \mathbb{Z}_N$, meaning that there are $N$ related states of the same energy. However, since in this $q$ space the $g_k$ are represented as complex phases, $g_k = e^{2\pi ik/N}$, the states all represent the same physical state.

Another example is better suited to show the relevant degeneracy: consider another representation of $\mathbb{Z}_N$. In the case of $\mathbb{Z}_2$, we can choose the representation in a two-dimensional vector space permuting the group elements,

$$D(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad D(b) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{13.40}$$
where $e$ and $b$ are elements of $\mathbb{Z}_2$. This representation can be used to construct a relevant case of a $\mathbb{Z}_2$-invariant Hamiltonian, with two real variables $q_1$ and $q_2$,

$$\hat H = \frac{m}{2}\left(\dot{\hat q}_1^2 + \dot{\hat q}_2^2\right) + V(\hat q_1 \cdot \hat q_2). \tag{13.41}$$

Indeed, the Hamiltonian is invariant under the above representation of $\mathbb{Z}_2$, with the only nontrivial element $D(b)$ acting on the vector space $(q_1, q_2)$ by interchanging them. Interchanging $q_1$ and $q_2$ leaves $\hat H$ invariant. Then, in this case also, $|E_n\rangle$ and $D(b)|E_n\rangle$ are states degenerate in energy, but now this means something nontrivial, as $D(b)$ interchanges $q_1$ and $q_2$, so one obtains a different energy eigenstate.

We have given an example of $\mathbb{Z}_2$ invariance, where there are two group elements and correspondingly two states (degeneracy 2), but for a finite discrete group with $N$ elements we have $N$ states, so the degeneracy is $N$.
13.5 Continuous Symmetry

Above we have considered only discrete internal symmetries, but we can also have continuous symmetries that can be both internal (considered later on in the book) and spacetime, e.g., rotations, which will be considered in the next chapter. As an example of continuous symmetry, consider the SO(2) rotation, with group element given by the matrix

$$g = g(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}, \quad \forall\alpha \in [0, 2\pi]. \tag{13.42}$$

This matrix acts on a two-dimensional vector $(x, y)^T$ (for a rotational symmetry in the plane defined by Cartesian coordinates $x$ and $y$) or $(q_1, q_2)^T$ (for an internal symmetry acting on two real variables as before). Consider the case of an internal symmetry for a system with two variables $q_1$ and $q_2$,

$$\begin{aligned} L &= \frac{m}{2}(\dot q_1^2 + \dot q_2^2) - V(q_1^2 + q_2^2) = T - V \Rightarrow \\ \hat H &= \frac{m}{2}(\dot{\hat q}_1^2 + \dot{\hat q}_2^2) + V(\hat q_1^2 + \hat q_2^2) = \hat T + \hat V. \end{aligned} \tag{13.43}$$

But then, under the SO(2) internal symmetry, the terms in the Hamiltonian transform as

$$\begin{aligned} q_1^2 + q_2^2 &\to (q_1\cos\alpha + q_2\sin\alpha)^2 + (-q_1\sin\alpha + q_2\cos\alpha)^2 = q_1^2 + q_2^2, \\ \dot q_1^2 + \dot q_2^2 &\to (\dot q_1\cos\alpha + \dot q_2\sin\alpha)^2 + (-\dot q_1\sin\alpha + \dot q_2\cos\alpha)^2 = \dot q_1^2 + \dot q_2^2, \end{aligned} \tag{13.44}$$

which means that both $T$ and $V$ are independently invariant.
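The invariance in (13.44) is easy to confirm numerically. The following minimal sketch applies the matrix $g(\alpha)$ of (13.42) to a pair of real numbers; the function names and sample values are illustrative assumptions.

```python
import math

def rotate(alpha, q1, q2):
    """Apply g(alpha) of (13.42) to the pair (q1, q2)."""
    return (q1 * math.cos(alpha) + q2 * math.sin(alpha),
            -q1 * math.sin(alpha) + q2 * math.cos(alpha))

def radius_sq(q1, q2):
    """The invariant combination q1^2 + q2^2 entering V (and, for velocities, T)."""
    return q1 * q1 + q2 * q2

# Sample point and angle, chosen arbitrarily for the check.
q = (1.2, -0.5)
q_rot = rotate(0.7, *q)
invariant = abs(radius_sq(*q_rot) - radius_sq(*q)) < 1e-12
```

Since the velocities $(\dot q_1, \dot q_2)$ transform with the same matrix, the same check covers the kinetic term.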
13.6 Lie Groups and Algebras and Their Representations

The group SO(2) considered above is an example of a Lie group, which is a continuous group, depending on continuous parameter(s) $\alpha^a$, so that $g = g(\alpha)$. It is also Abelian; in fact, up to isomorphism, it is the only compact connected one-dimensional Lie group.

For a Lie group, by convention we have $g(\alpha^a = 0) = e$. Then, Taylor expanding in $\alpha^a$ around $\alpha = 0$, we have

$$D(g(\alpha^a)) \simeq \hat 1 + i\,d\alpha^a X_a + \cdots, \tag{13.45}$$

which means that the constant elements $X_a$, called the generators of the Lie group, are found as

$$X_a \equiv -i\left.\frac{\partial}{\partial\alpha^a}D(g(\alpha^a))\right|_{\alpha^a = 0}. \tag{13.46}$$

The factor $i$ is conventional (there is another convention without it), but it is chosen because if $D(g)$ is unitary, so that $[D(g)]^\dagger = [D(g)]^{-1}$, then $X_a$ is Hermitian ($X_a^\dagger = X_a$). Lie groups are named after Sophus Lie, who showed that the generators above can be defined independently of the particular representation and also defined the Lie algebra, which will be described shortly.

For a finite group transformation by a parameter $\alpha^a$, we can consider infinitesimal pieces $d\alpha^a = \alpha^a/k$, as $k \to \infty$, and then

$$D(g(\alpha)) = \lim_{k\to\infty}\left(1 + \frac{i\alpha^a X_a}{k}\right)^k = \exp(i\alpha^a X_a). \tag{13.47}$$

The simplest example of a Lie group is the group of unitary numbers (complex phases), U(1), which is equivalent to the SO(2) group. Indeed, consider

$$D(g(\alpha)) = e^{i\alpha}, \tag{13.48}$$
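when acting on a complex number $q = q_1 + iq_2$. Then

$$q \to e^{i\alpha}q = (\cos\alpha\, q_1 - \sin\alpha\, q_2) + i(q_1\sin\alpha + q_2\cos\alpha), \tag{13.49}$$

meaning that we have an action

$$\begin{pmatrix} q_1 \\ q_2 \end{pmatrix} \to \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} q_1 \\ q_2 \end{pmatrix}, \tag{13.50}$$

equivalent to the SO(2) action above by a permutation of elements.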
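Both statements above, the limit (13.47) and the equivalence of the U(1) phase with the matrix action (13.50), can be checked with a short numerical sketch. The function names are hypothetical, and the limit is tested for the one-dimensional case $X = 1$ only.

```python
import cmath
import math

def u1_action(alpha, q1, q2):
    """Multiply q = q1 + i q2 by the phase e^{i alpha}; return (Re, Im)."""
    q = cmath.exp(1j * alpha) * complex(q1, q2)
    return q.real, q.imag

def so2_action(alpha, q1, q2):
    """Matrix of (13.50) acting on the column (q1, q2)."""
    return (math.cos(alpha) * q1 - math.sin(alpha) * q2,
            math.sin(alpha) * q1 + math.cos(alpha) * q2)

def compound_phase(alpha, k):
    """(1 + i alpha / k)^k, the finite-k version of (13.47) for X = 1."""
    return (1 + 1j * alpha / k) ** k

# The finite-k product approaches e^{i alpha} as k grows.
error_small_k = abs(compound_phase(0.8, 10) - cmath.exp(0.8j))
error_large_k = abs(compound_phase(0.8, 10 ** 6) - cmath.exp(0.8j))
```

The error of the finite product decreases roughly like $\alpha^2/(2k)$, so `error_large_k` is far smaller than `error_small_k`.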
Lie Algebra

A Lie algebra is the algebra (defined by a commutator) satisfied by the generators $X_a$, $X_a \in L(G)$. The commutator acts as a product on the Lie algebra space $L(G)$, $[\,,]: L(G) \times L(G) \to L(G)$. Because of the group property, $\forall g, h \in G$,

$$g = e^{i\alpha^a X_a}, \quad h = e^{i\beta^a X_a} \Rightarrow g \cdot h = e^{i\gamma^a X_a}. \tag{13.51}$$

Then, after a somewhat simple analysis, one can show that this implies that we have the "algebra"

$$[X_a, X_b] = f_{ab}{}^c X_c, \tag{13.52}$$

which defines the constants $f_{ab}{}^c$, known as "structure constants". From its definition, we find the antisymmetry property

$$f_{ab}{}^c = -f_{ba}{}^c. \tag{13.53}$$
In fact, the structure constants are completely antisymmetric in $a, b, c$, which can be shown by considering the Jacobi identity, an identity (of the type $0 = 0$) described as

$$[X_a, [X_b, X_c]] + \text{cyclic permutations of } (a, b, c) = 0, \tag{13.54}$$

which implies, by calculating both the commutators in (13.54) in terms of structure constants,

$$f_{bc}{}^d f_{ad}{}^e + f_{ab}{}^d f_{cd}{}^e + f_{ca}{}^d f_{bd}{}^e = 0. \tag{13.55}$$
Representations for Lie Groups

Lie groups can be represented as matrices, e.g., SO(n) can be represented as special (of determinant 1) and orthogonal matrices, SU(N) as special and unitary matrices, etc. This is called the fundamental representation. But we have other representations denoted by $R$, in which the generators $T_a$ are $(T_a^{(R)})_{ij}$. Another important representation is the adjoint one, defined by

$$(T_a)_b{}^c = -i f_{ab}{}^c. \tag{13.56}$$

It is indeed a representation, since we have

$$([T_a, T_b])_c{}^e = i f_{ab}{}^d (T_d)_c{}^e, \tag{13.57}$$

which is true owing to the Jacobi identity, as we can easily check.
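The check mentioned above can be carried out explicitly in a simple case. The sketch below assumes the structure constants $f_{abc} = \epsilon_{abc}$ (the Levi-Civita symbol, appropriate for the rotation algebra; this choice is an illustration, not taken from this section), builds the adjoint matrices of (13.56), and verifies (13.57). All helper names are hypothetical.

```python
def epsilon(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def adjoint_generator(a):
    """(T_a)_{bc} = -i f_{abc}, with the assumed choice f_{abc} = epsilon_{abc}."""
    return [[-1j * epsilon(a, b, c) for c in range(3)] for b in range(3)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

T = [adjoint_generator(a) for a in range(3)]

def algebra_holds(tol=1e-12):
    """Check [T_a, T_b] = i f_{abd} T_d, as in (13.57), for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for i in range(3):
                for j in range(3):
                    rhs = sum(1j * epsilon(a, b, d) * T[d][i][j] for d in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True
```

For this choice of structure constants the adjoint representation is three-dimensional, and `algebra_holds()` confirms that its matrices close into the same algebra.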
Important Concepts to Remember

• Discrete symmetries are: parity, time reversal invariance, lattice translation, and internal.
• Parity eigenstates, $\hat\pi|\psi\rangle = \pm|\psi\rangle$, have even parity (the $+$ eigenvalue) or odd parity (the $-$ eigenvalue), but not all states are parity eigenstates in a parity-invariant theory.
• For the harmonic oscillator (which is parity invariant), the parity of the state $|n\rangle$ is $(-1)^n$, but $|x\rangle$ and $|p\rangle$ are not parity eigenstates.
• A selection rule for some particular symmetry is a rule for a matrix element to be nonzero on the basis of the symmetry.
• The Schrödinger equation is not time-reversal invariant; one needs to take the complex conjugate also of $\psi$. A related feature is that $\hat T$ (the time reversal operator in quantum mechanics) is antilinear and unitary, or anti-unitary.
• In a time-reversal-invariant theory, $\hat T\hat H\hat T^{-1} = \hat H$, leading to $\langle x|n\rangle \in \mathbb{R}$.
• The action of time reversal on spin is $\hat T\hat{\vec S}\hat T^{-1} = -\hat{\vec S}$, and $\hat T^2|\text{spin } j\rangle = (-1)^{2j}|\text{spin } j\rangle$.
• Lie groups are continuous groups depending on continuous parameters, with generators $X_a$ that are $(-i)$ times the derivative of the group element with respect to the parameter, at zero parameter, so $D(g(\alpha)) = \exp(i\alpha^a X_a)$.
• The Lie algebra is the algebra of the generators of the Lie group, $[X_a, X_b] = f_{ab}{}^c X_c$, with $f_{ab}{}^c$ the structure constants, satisfying the Jacobi identity, $f_{bc}{}^d f_{ad}{}^e + \text{cyclic permutations of } (a, b, c) = 0$.
• The group SO(2), for the rotation of two real objects, is equivalent to the group U(1) with group element $e^{i\alpha}$ rotating a complex element.
• For the classical Lie groups we have the fundamental representation, on which the generators act as matrices on a space, and the adjoint representation, for which $(T_a)_b{}^c = -i f_{ab}{}^c$, and other representations.
Further Reading See [8] for more about groups and quantum mechanics.
Exercises

(1) Consider a three-dimensional harmonic oscillator (a harmonic oscillator with the same mass and frequency in each direction). Is it parity invariant? If so, what is the parity of a generic state?

(2) Consider the probability density of the $n$th state of the one-dimensional harmonic oscillator. Is it invariant under time reversal? How about under parity?

(3) Consider a system with two degrees of freedom in one dimension, $q_1$ and $q_2$, and Hamiltonian

$$H = \frac{p_1^2}{2} + \frac{p_2^2}{2} + V(|q_1|) + V(|q_2|) + V_{12}(|q_1 - q_2|). \tag{13.58}$$

Does it have any continuous internal symmetries?

(4) Consider the algebra for three generators $A, B, C$,

$$[A, B] = C, \quad [B, C] = A, \quad [C, A] = B. \tag{13.59}$$

Is it a Lie algebra? If so, write its adjoint representation in terms of matrices.

(5) Consider a Hamiltonian for two spins,

$$H = \alpha(\vec S_1^2 + \vec S_2^2) + \beta\,\vec S_1 \cdot \vec S_2. \tag{13.60}$$

Is the system time-reversal invariant? What about parity invariant?

(6) What is the dimension of the adjoint representation of SU(N), $N > 2$? Is this adjoint representation equivalent to the fundamental representation?

(7) Consider the following Lagrangian for $N$ degrees of freedom $q_i$,

$$L = \sum_{i=1}^{N}\frac{|\dot q_i|^2}{2} - V\left(\sum_{i=1}^{N}|q_i|^2\right). \tag{13.61}$$

If $q_i \in \mathbb{R}$, what is the internal continuous symmetry group? What about if $q_i \in \mathbb{C}$?
14
Theory of Angular Momentum I: Operators, Algebras, Representations

In this chapter we start the analysis of angular momentum. We first analyze the rotation group SO(3) and its equivalence to the group SU(2). We then define representations of both groups in terms of values for angular momentum (or "spin").
14.1 Rotational Invariance and SO(n)

A rotation is a linear transformation on a system: a transformation either of coordinates or of the system, depending on whether one takes a passive or active point of view, that leaves the lengths $|\Delta\vec r_{ij}| = |\vec r_i - \vec r_j|$ of objects (the distances between points) invariant. Considering a rotation around a point O with vector $\vec r_O$, with linear transformation

$$(\vec r\,' - \vec r_O)_a = \Lambda_a{}^b(\vec r - \vec r_O)_b, \tag{14.1}$$

where $a, b = 1, 2, 3$ (these are spatial coordinates), by subtracting the relation for two points $i$ and $j$ we find that indeed

$$(\vec r\,'_i - \vec r\,'_j)_a = \Lambda_a{}^b(\vec r_i - \vec r_j)_b. \tag{14.2}$$

Imposing invariance, so that $|\vec r\,'_i - \vec r\,'_j| = |\vec r_i - \vec r_j|$, we find that

$$|\vec r\,'_i - \vec r\,'_j|^2 = \Lambda_a{}^b(\vec r_i - \vec r_j)_b\,\Lambda_a{}^c(\vec r_i - \vec r_j)_c = (\vec r_i - \vec r_j)_b(\Lambda^T\Lambda)^{bc}(\vec r_i - \vec r_j)_c = (\vec r_i - \vec r_j)_b\,\delta^{bc}(\vec r_i - \vec r_j)_c, \tag{14.3}$$

which means that the (real) matrix $\Lambda$ defining the linear transformation obeys

$$\Lambda^T\Lambda = 1 \Rightarrow \Lambda^T = \Lambda^{-1}, \tag{14.4}$$

and thus is an orthogonal matrix. These orthogonal matrices form a group, since if $\Lambda_1^T = \Lambda_1^{-1}$ and $\Lambda_2^T = \Lambda_2^{-1}$ then

$$(\Lambda_1\Lambda_2)^T = \Lambda_2^T\Lambda_1^T = \Lambda_2^{-1}\Lambda_1^{-1} = (\Lambda_1\Lambda_2)^{-1}, \tag{14.5}$$

so their product is also orthogonal. The group of orthogonal matrices is called the orthogonal group and is denoted by O(3) in the case of $3 \times 3$ matrices acting on three-dimensional vectors (spatial vectors). Moreover, using the fact that $\det(A \cdot B) = \det A \cdot \det B$ and $\det A^T = \det A$, we find that

$$\det(\Lambda^T\Lambda) = \det\Lambda\,\det\Lambda^T = (\det\Lambda)^2 = \det 1 = 1, \tag{14.6}$$
implying that $\det\Lambda = \pm 1$. But that is easily understood: O(3) contains also the parity operation $P$, which as we saw acts as $P\vec r = \vec r\,' = -\vec r$, which is not an operation continuously connected with the identity as is needed for a Lie group. This means that, in order to obtain a Lie group we need to eliminate $P = -1$ from O(3) by imposing the special condition, $\det\Lambda = +1$. This condition respects the group property since if $\det\Lambda_1 = \det\Lambda_2 = +1$ then

$$\det\Lambda_1\Lambda_2 = \det\Lambda_1\det\Lambda_2 = 1. \tag{14.7}$$

We thus define the special orthogonal group, here SO(3), for $3 \times 3$ matrices. We have

$$SO(3) = O(3)/\mathbb{Z}_2, \tag{14.8}$$

where $\mathbb{Z}_2 = \{+1, -1\}$, so O(3) corresponds to two copies of the Lie group SO(3).

The analysis above trivially generalizes to $n$ dimensions, for $n \times n$ matrices acting on $n$-dimensional vectors, forming the group O(n) with Lie subgroup $SO(n) = O(n)/\mathbb{Z}_2$. The Lie algebra of SO(n) will be denoted by lower case letters, as so(n).
14.2 The Lie Groups SO(2) and SO(3)

We have that SO(2), the Abelian Lie group studied before, is a subgroup: $SO(2) \subset SO(3)$. As we saw, the matrix (which we can easily see is orthogonal and special)

$$M = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{14.9}$$

acts on two-dimensional spatial vectors $(x, y)^T$ as

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = M\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}. \tag{14.10}$$

Then we have rotational invariance, $x'^2 + y'^2 = x^2 + y^2$, and the action of this SO(2) matrix is equivalent to the action of a unitary number (a complex phase) on a complex number, thus to the action of

$$M = e^{i\theta} = \cos\theta + i\sin\theta \tag{14.11}$$

on $z = x + iy$, such that

$$Mz = z' = x' + iy', \tag{14.12}$$

as we can easily check, so $SO(2) \simeq U(1)$ (where U(1) is the group of unitary numbers, i.e., phases). This $SO(2) \simeq U(1)$ Abelian group (the compact connected one-dimensional Lie group) is a subgroup of any compact Lie group, in particular of any SO(n) for $n \geq 2$. The reason is that we can always pick two coordinates out of the $n$ on which SO(n) acts to rotate, hence defining $SO(2) \subset SO(n)$.
Figure 14.1: Euler angles parametrizing a rotation in three-dimensional space: two angles, θ and ψ, relate to the Cartesian coordinate system, and one, φ, corresponds to a rotation around the axis itself.
The Group SO(3)

Considering now rotations in three-dimensional space, we can always pick a planar (SO(2)) rotation of angle $\phi$ around a fixed axis $\hat z$. Choosing the coordinate system in such a way as to have this axis $\hat z$ in the third direction, $\vec e_z = (0, 0, 1)^T$, we then have

$$R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{14.13}$$

However, in general the vector $\hat z$, in a fixed coordinate system with Cartesian directions $\hat 1, \hat 2, \hat 3$, is defined by two angles, $\theta$ and $\psi$. Consider $\theta$ to be the angle made by $\hat z$ with $\hat 3$, and project $\hat z$ onto the plane defined by $(\hat 1, \hat 2)$. Then $\hat z$ makes an angle $\pi/2 - \theta$ with its projection onto the $(\hat 1, \hat 2)$ plane, and the projection makes an angle $\psi$ with $\hat 2$, as in Fig. 14.1. The angles $\phi, \theta, \psi$ are known as the Euler angles.

To obtain the general rotation, we first rotate by $\psi$ around $\hat 3$, thus aligning the projection of $\hat z$ with $\hat 2$, and then rotate around $\hat 1$ by $\theta$, aligning $\hat z$ with $\hat 3$, and finally rotate by $\phi$ around it. Then, the general three-dimensional rotation parametrized by the Euler angles is

$$g(\phi, \theta, \psi) = g(\phi)g(\theta)g(\psi) = R_3(\phi)R_1(\theta)R_3(\psi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{14.14}$$

To obtain the most general rotation axis $\hat z$ in three-dimensional space, the rotation range of $\psi$ is $[0, 2\pi)$, and then of $\theta$ is $[0, \pi)$. Finally, the rotation by $\phi$ has the standard $[0, 2\pi)$ range.
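The Euler-angle composition (14.14) can be sketched numerically, together with a check that the result is special orthogonal, in contrast to the parity matrix $P = -1$ of Section 14.1. The function names and the sample angles in the usage below are assumptions made for illustration.

```python
import math

def R3(angle):
    """Rotation by `angle` around the 3-axis, as in (14.13)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def R1(angle):
    """Rotation by `angle` around the 1-axis, the middle factor of (14.14)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler(phi, theta, psi):
    """g(phi, theta, psi) = R3(phi) R1(theta) R3(psi)."""
    return matmul(R3(phi), matmul(R1(theta), R3(psi)))

def is_orthogonal(M, tol=1e-12):
    """Check M^T M = 1 entry by entry."""
    return all(abs(sum(M[k][i] * M[k][j] for k in range(3)) - (i == j)) < tol
               for i in range(3) for j in range(3))

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

parity = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]   # orthogonal, but det = -1
```

For any angles, `euler(phi, theta, psi)` passes `is_orthogonal` and has `det3` equal to $+1$ up to rounding, while `det3(parity)` is $-1$, so parity lies in O(3) but not in SO(3).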
There are three $U(1) \simeq SO(2)$ Abelian subgroups in SO(3), corresponding to the three Euler angles, and three ways to pick two coordinates for the rotation plane from the three coordinates.

To find the Lie algebra of SO(3), we define its three generators $J_i$ as being associated with the three Euler angles. Each corresponds to rotation in a plane. For instance, $R_3(\phi)$ corresponds to rotation in the (1, 2) plane, and, for an infinitesimal $\phi$, we obtain

$$R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \simeq \begin{pmatrix} 1 & -\phi & 0 \\ \phi & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + O(\phi^2) = 1 + \phi\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + O(\phi^2). \tag{14.15}$$

Then the generator for this rotation around the 3-direction is

$$iT_3 = iJ_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{14.16}$$

and we can do a similar calculation for the other directions. The group elements for finite rotations by angles $\theta_i$, $i = 1, 2, 3$, are

$$\exp(i\theta^i J_i) \equiv \exp(i\epsilon_{ijk}\omega_{jk}J_i). \tag{14.17}$$

Moreover, a unitary infinitesimal matrix means that

$$(1 + \theta^i\, iJ_i)^\dagger = 1 + \theta^i(iJ_i)^\dagger = (1 + i\theta^i J_i)^{-1} = 1 - \theta^i\, iJ_i, \tag{14.18}$$

and for real $iJ_i$ we find an antisymmetric matrix, as above for $iJ_3$.
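As a check that the generator (14.16) exponentiates to the finite rotation (14.13), the following sketch sums the power series of $\exp(\phi\, iJ_3)$ with plain Python lists. The truncation at 30 terms and the helper names are assumptions made for this illustration.

```python
import math

# i J_3 from (14.16): a real antisymmetric matrix.
A3 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, t, terms=30):
    """Truncated power series sum_{k < terms} (t A)^k / k! for a small matrix A."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (t A)^k / k! from the previous (t A)^(k-1) / (k-1)!
        term = [[t / k * x for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def rotation_about_3(phi):
    """Closed form R_3(phi) of (14.13)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(len(A)) for j in range(len(A)))
```

For moderate angles, `expm_series(A3, phi)` agrees with `rotation_about_3(phi)` to rounding accuracy, which is the matrix form of $R_3(\phi) = \exp(i\phi J_3)$.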
Rotationally Invariant Systems

For a rotationally (SO(3)) invariant Lagrangian $L$ (and thus Hamiltonian $\hat H$), for instance

$$L = \sum_i\frac{m_i}{2}\dot{\vec r}_i^{\,2} - V(|\vec r_i - \vec r_j|) = \sum_i\frac{m_i}{2}\left[\frac{d}{dt}(\vec r_i - \vec r_O)\right]^2 - V\left((\vec r_i - \vec r_j)^2\right), \tag{14.19}$$

the variables in the Lagrangian of the system must be in a representation of the rotation group SO(3). Here, as we can see, we have the fundamental representation, where the rotation matrices act on the three-dimensional position vectors $\vec r$. However, the state of the system, $|\psi\rangle$, need not be in the same representation, but can be in an arbitrary representation (since it is not related to $\vec x$ as a vector space). Before discussing general representations, we consider the equivalence of SO(3) and SU(2) in the Lie algebra.
14.3 The Group SU(2) and Its Isomorphism with SO(3) Mod Z2 Consider the group of unitary matrices, i.e., the complex matrices U satisfying U † = U −1 . Then, indeed, if U1† = U1−1 and U2† = U2−1 , (U1U2 ) † = U2†U1† = U2−1U1−1 = (U1U2 ) −1 ,
(14.20)
163
14 Angular Momentum I: Operators, Algebras, Representations which means that unitarity respects the group property. The group of unitary n × n matrices is called U (n) and acts on an n-dimensional complex vector. For a unitary matrix U, det(U †U) = (det U) ∗ det U = det 1 = 1 ⇒ | det U | = 1 ⇒ det U = eiα ,
(14.21)
which means that the determinant of a unitary matrix takes values in a U(1) group. Imposing the special condition $\det U = +1$, which respects the group property (since $\det(U_1U_2) = \det U_1 \det U_2$), we obtain the special unitary group, SU(n). Since the determinant forms a U(1) group, we have $U(n) \simeq SU(n) \times U(1)$, modulo topological issues. In fact,
$$ U(n) \simeq (SU(n) \times U(1))/\mathbb{Z}_n. \tag{14.22} $$
In Lie algebra, $u(n) = su(n) \times u(1)$. In the relevant case of n = 2, we have also
$$ SO(3) \simeq SU(2)/\mathbb{Z}_2, \tag{14.23} $$
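which means that SU(2) winds around SO(3) twice. Indeed, $-1_{2\times 2} \in SU(2)$, but $-1_{3\times 3}$ is not in SO(3). As the $SO(2) \subset SO(3)$ rotation equals $U(1) \subset SU(2)$, we can identify the rotation in the plane (1, 2) (around $\hat 3$) as an SU(2) matrix with diagonal elements that have opposite phases:
$$ R_3(\phi) \to \begin{pmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{pmatrix} \equiv \tilde R_3(\phi), \tag{14.24} $$
acting on the two-dimensional complex vector $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$. The matrix is unitary, since
$$ \tilde R_3^\dagger = \begin{pmatrix} e^{-i\phi/2} & 0 \\ 0 & e^{i\phi/2} \end{pmatrix} = \tilde R_3^{-1}, \tag{14.25} $$
and $\det \tilde R_3 = 1$. The rotation around $\hat 1$ can be represented by a different unitary matrix,
$$ R_1(\theta) \to \begin{pmatrix} \cos\theta/2 & i\sin\theta/2 \\ i\sin\theta/2 & \cos\theta/2 \end{pmatrix} \equiv \tilde R_1(\theta). \tag{14.26} $$
We can then check that
$$ \tilde R_1^\dagger = \begin{pmatrix} \cos\theta/2 & -i\sin\theta/2 \\ -i\sin\theta/2 & \cos\theta/2 \end{pmatrix} = \tilde R_1^{-1}, \tag{14.27} $$
and $\det \tilde R_1 = +1$. Then the general element of SU(2) is represented via the Euler angles $\phi, \theta, \psi$ as
$$ g(\phi, \theta, \psi) = \tilde R_3(\phi)\tilde R_1(\theta)\tilde R_3(\psi) = \begin{pmatrix} \cos\theta/2\; e^{i(\phi+\psi)/2} & i\sin\theta/2\; e^{i(\phi-\psi)/2} \\ i\sin\theta/2\; e^{-i(\phi-\psi)/2} & \cos\theta/2\; e^{-i(\phi+\psi)/2} \end{pmatrix}. \tag{14.28} $$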
But, as we said, $-1_{2\times 2} \in SU(2)$, so we need to obtain ± times the above generic matrix, which was defined over the original range of the (SO(3)) Euler angles. Since $\cos(\pi + \alpha) = -\cos\alpha$ and $\sin(\pi + \alpha) = -\sin\alpha$, we can achieve that by noting that
$$ \tilde R_3(\phi)\tilde R_1(\theta + 2\pi)\tilde R_3(\psi) = -\tilde R_3(\phi)\tilde R_1(\theta)\tilde R_3(\psi), \tag{14.29} $$
that is, by doubling the range of θ, thus proving the statement from before, that SU (2) winds twice around SO(3) (it is a double cover of SO(3)). On the other hand, that also means that the Lie algebras of SU (2) and SO(3) are the same, su(2) = so(3), which means that their representations are equivalent. In particular, it means that we can represent SO(3) via complex 2 × 2 matrices also! We will see that this is the minimum representation space (of “spin j = 1/2”) that we can have for SO(3).
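The double-cover statement can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`r3`, `r1`, `g`, `close`) and the sample angles are illustrative choices, not notation from the text. It builds the matrices of (14.24), (14.26), and (14.28) and tests unitarity, unit determinant, and the sign flip (14.29).

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def r3(phi):
    """SU(2) image of a rotation around axis 3, eq. (14.24)."""
    return [[cmath.exp(1j * phi / 2), 0j], [0j, cmath.exp(-1j * phi / 2)]]

def r1(theta):
    """SU(2) image of a rotation around axis 1, eq. (14.26)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[complex(c), 1j * s], [1j * s, complex(c)]]

def g(phi, theta, psi):
    """General SU(2) element in terms of Euler angles, eq. (14.28)."""
    return matmul(matmul(r3(phi), r1(theta)), r3(psi))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Arbitrary sample angles (an assumption made only for this check).
phi, theta, psi = 0.7, 1.1, -0.4
u = g(phi, theta, psi)
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

is_unitary = close(matmul(dagger(u), u), identity)
has_unit_det = abs(det(u) - 1) < 1e-12
# Shifting theta by 2*pi is the identity in SO(3) but gives -g in SU(2).
minus_u = [[-x for x in row] for row in u]
flips_sign = close(g(phi, theta + 2 * math.pi, psi), minus_u)
```

All three flags should come out true for any choice of angles, since the sign flip only uses $\cos(\pi+\alpha) = -\cos\alpha$ and $\sin(\pi+\alpha) = -\sin\alpha$.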
Example for SU(2)
An example of such a representation, for a system with “internal” SU(2) symmetry, is a system defined by four real coordinates, $x_1, x_2, y_1, y_2 \in \mathbb{R}$, combined into two complex coordinates $q_1 = x_1 + iy_1$ and $q_2 = x_2 + iy_2$, and further into the vector space with two complex dimensions, $q = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}$, with Lagrangian
$$ L = \frac{m|\dot q|^2}{2} - V(|q|^2) \equiv \frac{m}{2}\left(|\dot q_1|^2 + |\dot q_2|^2\right) - V(|q_1|^2 + |q_2|^2). \tag{14.30} $$
Then, for $q \to q' = gq$, with $g$ a constant SU(2) matrix, we obtain
$$ |q|^2 \to |q'|^2 = |gq|^2 = q^\dagger g^\dagger g q = q^\dagger q = |q|^2, \qquad |\dot q|^2 \to |\dot q'|^2 = |g\dot q|^2 = \dot q^\dagger g^\dagger g \dot q = \dot q^\dagger \dot q = |\dot q|^2, \tag{14.31} $$
so the Lagrangian is invariant.
Again, just because the (classical or quantum) system is in the two-dimensional representation of SU(2), it doesn’t mean that the state is in that representation, since the state is not related to $q_1, q_2$ as a vector space. States $|\psi\rangle$ can be in a different representation of SU(2).
14.4 Generators and Lie Algebras
From the Noether theorem, we saw in Chapter 12 that, for a symmetry with generators $T_a$, we have a conserved charge associated with it,
$$ Q_a = \frac{\partial L}{\partial \dot q_i}(iT_a)_{ij}\, q_j = p_i (iT_a)_{ij}\, q_j. \tag{14.32} $$
Considering the symmetry to be a rotation, specifically around $z = x_3$, and $q_i = x_i = (x_1, x_2, x_3)$, we found that
$$ iT_3 = \begin{pmatrix} 0 & -1 & 0 \\ +1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{14.33} $$
and similarly (just permuting the vector space indices 1, 2, 3) for $iT_1$, $iT_2$. We find
$$ Q_3 = p \cdot iT_3 \cdot q = x_1 p_2 - x_2 p_1. \tag{14.34} $$
Moreover, for the other two charges, we find
$$ Q_2 = x_3 p_1 - x_1 p_3, \qquad Q_1 = x_2 p_3 - x_3 p_2, \tag{14.35} $$
or, written together,
$$ Q_i = \sum_{j,k} \epsilon_{ijk}\, x_j p_k, \tag{14.36} $$
using the Levi-Civita antisymmetric symbol $\epsilon_{ijk}$, for which $\epsilon_{123} = +1$ and which is totally antisymmetric (so that, for instance, $\epsilon_{132} = -\epsilon_{123} = -1$). However, $\epsilon_{ijk}$ defines the three-dimensional vector product, so we have in fact
$$ \vec Q = \vec r \times \vec p = \vec L, \tag{14.37} $$
which is the angular momentum of a particle! Therefore the conserved charge associated with rotational invariance is the angular momentum. Moreover, denoting iTa = i Ja , we see that Q a = L a is the charge associated with the generator Ja , so the generators of SO(3) rotations are (associated with) angular momenta. This is similar to the fact that the generators of translations K a are (associated with) the momenta Pa . Classically, we can calculate the Poisson brackets of angular momenta, using the canonical expressions {x i , p j } P.B. = δi j or, equivalently, the definitions of the Poisson brackets: {L i , L j } P.B. = ikl jmn {x k pl , x m pn } P.B. = ilk jmk x m pl + ikl jmn x k pn = −i jk kmn x m pn = −i jk L k .
(14.38)
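This is the same algebra as the algebra of generators $iJ_i$ for SO(3) (a Lie algebra), as we can directly check:
$$ [iJ_1, iJ_2] = \left[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \right] = -iJ_3 = -\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{14.39} $$
so we can identify the generators of rotations with the angular momenta. Finally, we will see in the next chapter that for wave functions $\psi(\vec r)$ the rotation acts according to the operator $\hat{\vec L}$:
$$ \hat L_i \psi(\vec x) = \sum_{jk} \epsilon_{ijk} \hat X_j \hat P_k\, \psi(\vec x) = -i\hbar \sum_{jk} \epsilon_{ijk}\, x_j \frac{\partial}{\partial x_k} \psi(\vec x). \tag{14.40} $$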
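The matrix commutator in (14.39) is easy to verify by direct multiplication. The sketch below uses exact integer arithmetic and the three real matrices $iJ_1$, $iJ_2$, $iJ_3$ exactly as they are written in (14.39); the dictionary name `iJ` and the helper functions are illustrative. It also checks the two cyclic partners of (14.39), which are not written out in the text.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def negate(a):
    return [[-x for x in row] for row in a]

# The real antisymmetric matrices i*J_1, i*J_2, i*J_3 as printed in (14.39).
iJ = {
    1: [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    2: [[0, 0, -1], [0, 0, 0], [1, 0, 0]],
    3: [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
}

# [iJ_a, iJ_b] = -iJ_c for (a, b, c) a cyclic permutation of (1, 2, 3).
checks = {
    (1, 2): commutator(iJ[1], iJ[2]) == negate(iJ[3]),
    (2, 3): commutator(iJ[2], iJ[3]) == negate(iJ[1]),
    (3, 1): commutator(iJ[3], iJ[1]) == negate(iJ[2]),
}
all_hold = all(checks.values())
```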
14.5 Quantum Mechanical Version
In quantum mechanics, the Poisson bracket $\{\,,\}_{P.B.}$ is replaced by $(1/i\hbar)[\,,]$, so we obtain the algebra
$$ [\hat L_i, \hat L_j] = i\hbar \sum_k \epsilon_{ijk} \hat L_k. \tag{14.41} $$
Equivalently, and more directly, $x_i$ and $p_i$ become the operators $\hat X_i$ and $\hat P_i$, so we have
$$ \hat L_1 = \hat X_2 \hat P_3 - \hat X_3 \hat P_2, \ \text{etc.} \ \Rightarrow \ \hat L_i = \sum_{j,k} \epsilon_{ijk} \hat X_j \hat P_k, \tag{14.42} $$
and, using the canonical commutation relations $[\hat X_i, \hat P_j] = i\hbar\delta_{ij}$, we find again
$$ [\hat L_i, \hat L_j] = i\hbar \sum_k \epsilon_{ijk} \hat L_k. \tag{14.43} $$
Moreover, constructing the square of the total angular momentum,
$$ \hat{\vec L}^2 = \hat L_1^2 + \hat L_2^2 + \hat L_3^2, \tag{14.44} $$
we then obtain
$$ [\hat L_i, \hat{\vec L}^2] = \sum_j \left( \hat L_j [\hat L_i, \hat L_j] + [\hat L_i, \hat L_j] \hat L_j \right) = i\hbar \sum_{j,k} \epsilon_{ijk} (\hat L_j \hat L_k + \hat L_k \hat L_j) = 0, \tag{14.45} $$
since $\epsilon_{ijk}$ is antisymmetric in $(j, k)$ while the expression in brackets is symmetric. Thus a general “angular momentum”, for either rotation SO(3) generators or generators of an internal SO(3), has the commutation relations
$$ [\hat J_i, \hat J_j] = i\hbar \sum_k \epsilon_{ijk} \hat J_k, \qquad [\hat{\vec J}^2, \hat J_i] = 0, \tag{14.46} $$
where the $\hat J_i$ are Hermitian.
14.6 Representations
Finally, we can construct the representations of the rotation group $SU(2) \simeq SO(3)$ corresponding to the possible Hilbert spaces for the rotationally invariant systems. We define the operators
$$ J_+ = J_1 + iJ_2, \qquad J_- = J_1 - iJ_2, \qquad J_z = J_3, \tag{14.47} $$
with $(J_+)^\dagger = J_-$ and $(J_-)^\dagger = J_+$. Their algebra is
$$ [J_+, J_-] = 2i[J_2, J_1] = 2\hbar J_z, \qquad [J_z, J_+] = \hbar J_+, \qquad [J_z, J_-] = -\hbar J_-. \tag{14.48} $$
From the fact that $[\vec J^2, J_z] = 0$, it follows that there is a complete set of simultaneous eigenstates for both $\vec J^2$ and $J_z$, denoted $|\lambda^2, m, \ldots\rangle$ by their eigenvalues,
$$ \vec J^2 |\lambda^2, m, \ldots\rangle = \lambda^2\hbar^2 |\lambda^2, m, \ldots\rangle, \qquad J_z |\lambda^2, m, \ldots\rangle = m\hbar |\lambda^2, m, \ldots\rangle. \tag{14.49} $$
But, since $J_i$ is Hermitian, we find that
$$ \lambda^2\hbar^2 = \langle\lambda^2, m, \ldots| \sum_i J_i^2 |\lambda^2, m, \ldots\rangle = \sum_i \big\| J_i |\lambda^2, m, \ldots\rangle \big\|^2 \geq 0 \ \Rightarrow \ \lambda^2 \geq 0, \tag{14.50} $$
where we have used the fact that the norm of the eigenstates is one (or in any case, positive and nonzero). Thus $\lambda^2 \geq 0$, so $\lambda$ is real (which shows that it was a good definition). In a similar manner, using the Hermiticity of $J_1, J_2$ we find
$$ \langle\lambda^2, m, \ldots|(J_1^2 + J_2^2)|\lambda^2, m, \ldots\rangle = \langle\lambda^2, m, \ldots|(\vec J^2 - J_z^2)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2)\hbar^2 = \big\|J_1|\lambda^2, m, \ldots\rangle\big\|^2 + \big\|J_2|\lambda^2, m, \ldots\rangle\big\|^2 \geq 0 \ \Rightarrow \ \lambda^2 - m^2 \geq 0. \tag{14.51} $$
On the other hand, $J_+$ acts as a raising operator for $m$ and $J_-$ as a lowering operator, since (using $[J_z, J_+] = \hbar J_+$ and $[J_z, J_-] = -\hbar J_-$) we find
$$ \begin{aligned} J_z (J_+ |\lambda^2, m, \ldots\rangle) &= J_+ (J_z |\lambda^2, m, \ldots\rangle) + \hbar J_+ |\lambda^2, m, \ldots\rangle = (m + 1)\hbar\, (J_+ |\lambda^2, m, \ldots\rangle), \\ J_z (J_- |\lambda^2, m, \ldots\rangle) &= J_- (J_z |\lambda^2, m, \ldots\rangle) - \hbar J_- |\lambda^2, m, \ldots\rangle = (m - 1)\hbar\, (J_- |\lambda^2, m, \ldots\rangle). \end{aligned} \tag{14.52} $$
Since $m^2 \leq \lambda^2$, it follows that there is a maximum value and a minimum value of $m$, such that
$$ J_+ |\lambda^2, m_{\max}, \ldots\rangle = 0 \ \Rightarrow \ |\lambda^2, m_{\max} + 1, \ldots\rangle = 0, \qquad J_- |\lambda^2, m_{\min}, \ldots\rangle = 0 \ \Rightarrow \ |\lambda^2, m_{\min} - 1, \ldots\rangle = 0. \tag{14.53} $$
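Also, because the step in $m$ is 1, it follows that a representation of given $\lambda^2$ has a finite number of states. Moreover, as we have
$$ J_- J_+ = (J_1 - iJ_2)(J_1 + iJ_2) = J_1^2 + J_2^2 + i[J_1, J_2] = \vec J^2 - J_z^2 - \hbar J_z \tag{14.54} $$
and $J_- = (J_+)^\dagger$, we find that
$$ \langle\lambda^2, m, \ldots| J_- J_+ |\lambda^2, m, \ldots\rangle = \langle\lambda^2, m, \ldots|(\vec J^2 - J_z^2 - \hbar J_z)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2 - m)\hbar^2 = \big\|J_+ |\lambda^2, m, \ldots\rangle\big\|^2 \geq 0 \ \Rightarrow \ \lambda^2 - m^2 - m \geq 0. \tag{14.55} $$
Similarly, as we have
$$ J_+ J_- = (J_1 + iJ_2)(J_1 - iJ_2) = J_1^2 + J_2^2 - i[J_1, J_2] = \vec J^2 - J_z^2 + \hbar J_z \tag{14.56} $$
and $J_+ = (J_-)^\dagger$, we obtain
$$ \langle\lambda^2, m, \ldots| J_+ J_- |\lambda^2, m, \ldots\rangle = \langle\lambda^2, m, \ldots|(\vec J^2 - J_z^2 + \hbar J_z)|\lambda^2, m, \ldots\rangle = (\lambda^2 - m^2 + m)\hbar^2 = \big\|J_- |\lambda^2, m, \ldots\rangle\big\|^2 \geq 0 \ \Rightarrow \ \lambda^2 - m^2 + m \geq 0. \tag{14.57} $$
Further, the equality to zero happens, for $m_{\max}$, when the norm of the state is zero, since the state itself vanishes, meaning when
$$ J_+ |\lambda^2, m_{\max}, \ldots\rangle = 0 \ \Rightarrow \ \lambda^2 - m_{\max}^2 - m_{\max} = 0, \tag{14.58} $$
and for $m_{\min}$ when similarly
$$ J_- |\lambda^2, m_{\min}, \ldots\rangle = 0 \ \Rightarrow \ \lambda^2 - m_{\min}^2 + m_{\min} = 0. \tag{14.59} $$
However, subtracting (14.58) from (14.59) we find
$$ m_{\max}(m_{\max} + 1) = m_{\min}(m_{\min} - 1), \tag{14.60} $$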
whose only solution compatible with $m_{\max} \geq m_{\min}$ is
$$ m_{\max} = -m_{\min}. \tag{14.61} $$
Indeed, (14.60) factorizes as $(m_{\max} + m_{\min})(m_{\max} - m_{\min} + 1) = 0$, and the second root, $m_{\max} = m_{\min} - 1$, is excluded. In particular, there is no solution if both $m_{\max}$ and $m_{\min}$ are positive, or both are negative. Finally, we have
$$ -m_{\max} = m_{\min} \leq m \leq m_{\max}, \tag{14.62} $$
and since $m$ changes by ±1 only (through $J_\pm$), it follows that
$$ 2m_{\max} = m_{\max} - m_{\min} = n \in \mathbb{N}, \tag{14.63} $$
so
$$ m_{\max} = \frac{n}{2} \equiv j. \tag{14.64} $$
This “quantum number” $j$ is thus half-integer, $j \in \mathbb{N}/2$ (integer values included). Finally, (14.58) implies that
$$ \lambda^2 = j(j + 1). \tag{14.65} $$
Assuming that there are no other quantum numbers (so we have only angular momentum as a variable), the state is $|jm\rangle$ and
$$ J_z |jm\rangle = m\hbar |jm\rangle, \qquad \vec J^2 |jm\rangle = j(j + 1)\hbar^2 |jm\rangle. \tag{14.66} $$
The interpretation is that the index $j$ describes possible representations of SO(3), or of angular momentum (since $j$ defines the total angular momentum), and $m$ describes the states in the representation.
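Matrix Elements
Finally, we compute the matrix elements of the generators in the $|jm\rangle$ representation. From (14.55),
$$ \big\|J_+ |jm\rangle\big\|^2 = \hbar^2 [j(j + 1) - m^2 - m], \tag{14.67} $$
whereas from (14.52),
$$ J_+ |jm\rangle \propto |j, m + 1\rangle, \tag{14.68} $$
which finally gives (choosing the phase so that the coefficient is real and positive)
$$ J_+ |jm\rangle = \hbar\sqrt{(j - m)(j + m + 1)}\; |j, m + 1\rangle \equiv \hbar\,\alpha_{j,m+1} |j, m + 1\rangle, \tag{14.69} $$
where
$$ \alpha_{jm} \equiv \sqrt{j(j + 1) - m(m - 1)} = \sqrt{(j + m)(j - m + 1)}. \tag{14.70} $$
As this is a matrix element in $|jm\rangle$ space, we have
$$ (J_+)_{jm',jm} = \hbar\,\alpha_{j,m+1}\, \delta_{m',m+1}. \tag{14.71} $$
Similarly, from (14.57),
$$ \big\|J_- |jm\rangle\big\|^2 = \hbar^2 [j(j + 1) - m^2 + m], \tag{14.72} $$
whereas from (14.52)
$$ J_- |jm\rangle \propto |j, m - 1\rangle, \tag{14.73} $$
which finally gives
$$ J_- |jm\rangle = \hbar\sqrt{(j + m)(j - m + 1)}\; |j, m - 1\rangle \equiv \hbar\,\alpha_{jm} |j, m - 1\rangle. \tag{14.74} $$
As a matrix element in $|jm\rangle$ space, this means that
$$ (J_-)_{jm',jm} = \hbar\,\alpha_{jm}\, \delta_{m',m-1}. \tag{14.75} $$
Finally, since $J_1 = (J_+ + J_-)/2$ and $J_2 = (J_+ - J_-)/(2i)$, we find also
$$ (J_1)_{jm',jm} = \frac{\hbar}{2}\left(\alpha_{j,m+1}\,\delta_{m',m+1} + \alpha_{jm}\,\delta_{m',m-1}\right), \qquad (J_2)_{jm',jm} = \frac{\hbar}{2i}\left(\alpha_{j,m+1}\,\delta_{m',m+1} - \alpha_{jm}\,\delta_{m',m-1}\right). \tag{14.76} $$
To these, we add the diagonal elements
$$ (J_3)_{jm',jm} = m\hbar\,\delta_{m'm}, \qquad (\vec J^2)_{jm',jm} = j(j + 1)\hbar^2\,\delta_{m'm}. \tag{14.77} $$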
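The matrix elements (14.69) to (14.77) can be turned into explicit matrices for any $j$. The sketch below is one possible implementation under stated assumptions: units with $\hbar = 1$, basis ordering $m = j, j-1, \ldots, -j$, and illustrative function names (`spin_matrices`, `check_algebra`). It then checks $[J_1, J_2] = iJ_3$ and $\vec J^2 = j(j+1)\,1$ for $j = 1/2, \ldots, 3$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spin_matrices(two_j):
    """Return (J1, J2, J3) for spin j = two_j / 2, in units hbar = 1.

    Row/column index k corresponds to m = j - k, so index 0 is m = j.
    """
    j = two_j / 2
    dim = two_j + 1
    ms = [j - k for k in range(dim)]
    j_plus = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # (14.69): J+ |j m> = sqrt((j - m)(j + m + 1)) |j, m + 1>
        j_plus[col - 1][col] = complex(math.sqrt((j - m) * (j + m + 1)))
    # J- is the Hermitian conjugate of J+.
    j_minus = [[j_plus[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    j1 = [[(j_plus[r][c] + j_minus[r][c]) / 2 for c in range(dim)]
          for r in range(dim)]
    j2 = [[(j_plus[r][c] - j_minus[r][c]) / (2 * 1j) for c in range(dim)]
          for r in range(dim)]
    j3 = [[complex(ms[r]) if r == c else 0j for c in range(dim)]
          for r in range(dim)]
    return j1, j2, j3

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[r][c] - b[r][c]) for r in range(n) for c in range(n))

def check_algebra(two_j):
    """Largest violation of [J1, J2] = i J3 and of J^2 = j(j+1) * identity."""
    j1, j2, j3 = spin_matrices(two_j)
    dim, j = two_j + 1, two_j / 2
    p, q = matmul(j1, j2), matmul(j2, j1)
    comm = [[p[r][c] - q[r][c] for c in range(dim)] for r in range(dim)]
    i_j3 = [[1j * x for x in row] for row in j3]
    squares = [matmul(mat, mat) for mat in (j1, j2, j3)]
    casimir = [[sum(s[r][c] for s in squares) for c in range(dim)]
               for r in range(dim)]
    target = [[j * (j + 1) if r == c else 0 for c in range(dim)]
              for r in range(dim)]
    return max(max_abs_diff(comm, i_j3), max_abs_diff(casimir, target))

worst_violation = max(check_algebra(two_j) for two_j in range(1, 7))
```

Passing `two_j` rather than `j` keeps the argument an integer for half-integer spins; `worst_violation` should be zero up to floating-point rounding.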
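We now specialize to the $j = 1/2$ case (which will be described in much more detail later on, as it corresponds in particular to the electron spin 1/2 case), and note that the above matrices are now 2 × 2 matrices, acting on the two-dimensional vector space $|jm\rangle = |1/2, \pm 1/2\rangle$. Then, as a matrix,
$$ J_3 = \hbar \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}. \tag{14.78} $$
This indeed matches with the generators of SU(2), since for infinitesimal Euler angle parameters, we find
$$ \begin{aligned} \tilde R_3(\phi) &= \begin{pmatrix} 1 + i\phi/2 & 0 \\ 0 & 1 - i\phi/2 \end{pmatrix} + O(\phi^2) = 1 + i\phi \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix} + O(\phi^2) \ \Rightarrow \ J_3 = \hbar \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}, \\ \tilde R_1(\theta) &= \begin{pmatrix} 1 & i\theta/2 \\ i\theta/2 & 1 \end{pmatrix} + O(\theta^2) = 1 + i\theta \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix} + O(\theta^2) \ \Rightarrow \ J_1 = \hbar \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}. \end{aligned} \tag{14.79} $$
The last formula is consistent with the matrices we found for the spin 1/2 representation, since we find
$$ \begin{aligned} J_+ |1/2, +1/2\rangle &= 0, & J_+ |1/2, -1/2\rangle &= \hbar\, |1/2, +1/2\rangle, \\ J_- |1/2, +1/2\rangle &= \hbar\, |1/2, -1/2\rangle, & J_- |1/2, -1/2\rangle &= 0 \end{aligned} \ \Rightarrow \ J_1 = \frac{J_+ + J_-}{2} = \frac{\hbar}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{14.80} $$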
Important Concepts to Remember
• The special orthogonal group SO(n) is the group of matrices that are orthogonal, $\Lambda^T = \Lambda^{-1}$, and special, $\det \Lambda = 1$, which correspond to rotations in n-dimensional space. Globally, $SO(n) = O(n)/\mathbb{Z}_2$.
• The special unitary group SU(n) is the group of matrices that are unitary, $U^\dagger = U^{-1}$, and special, $\det U = 1$, with $U(n) \simeq (SU(n) \times U(1))/\mathbb{Z}_n$.
• For n = 2 we also have $SO(3) \simeq SU(2)/\mathbb{Z}_2$, so SU(2) winds twice around SO(3): SO(3) is generated by three Euler angles for rotations around three axes (in a plane; with the z axis towards the plane; and around the final z axis), while SU(2) has twice the range of the angles, so that $-1 \in SU(2)$.
• The charges associated with SO(3) rotations of coordinates are the angular momenta $L_i = \epsilon_{ijk} x_j p_k$, which classically satisfy the Poisson brackets $\{L_i, L_j\}_{P.B.} = -\epsilon_{ijk} L_k$.
• Quantum mechanically, the $\hat L_i$ give the Lie algebra $[\hat L_i, \hat L_j] = i\hbar\epsilon_{ijk} \hat L_k$, with $[\hat{\vec L}^2, \hat L_k] = 0$, which is the Lie algebra of SU(2), with generators generally denoted by $\hat J_i$.
• The representations of SU(2) are indexed by a half-integer $j \in \mathbb{N}/2$ and within a representation, the states are indexed by $m$, $-j \leq m \leq j$, and thus by $|jm\rangle$, with $\vec J^2 |jm\rangle = j(j + 1)\hbar^2 |jm\rangle$ and $J_z |jm\rangle = m\hbar |jm\rangle$.
• The matrix elements within the representation are $J_+ |jm\rangle = \hbar\alpha_{j,m+1} |j, m + 1\rangle$, $J_- |jm\rangle = \hbar\alpha_{j,m} |j, m - 1\rangle$, and $J_3 |jm\rangle = m\hbar |jm\rangle$, with $\alpha_{j,m} = \sqrt{(j + m)(j - m + 1)}$.
• The spin 1/2 ($j = 1/2$) generators $J_i$ correspond to the generators of SU(2) via the Euler angles.
Further Reading See [2], [1], [3] for more information.
Exercises
(1) Write explicitly the matrices of the spin 1 representation and the adjoint representation of SU(2), and relate them.
(2) Write explicitly the 2 × 2 matrices $g = e^{i\alpha_i \sigma_i/2}$, where $\sigma_i$ are the Pauli matrices, and compare with $g(\theta, \phi, \psi)$ for SO(3), to find an explicit map between $\alpha_i$ and the Euler angles $(\theta, \phi, \psi)$.
(3) For a general matrix
$$ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{14.81} $$
belonging to SU(2), i.e., such that $A^\dagger = A^{-1}$ and $\det A = 1$, find the Euler angles in terms of $a, b, c, d$.
(4) Consider the complex degrees of freedom $q_1, q_2 \in \mathbb{C}$ forming a doublet $q = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}$, the matrix $A = \sum_{i=1}^{3} a_i \sigma_i$ (with $a_i \in \mathbb{C}$), where $\sigma_i$ are the Pauli matrices, and the Lagrangian
$$ L = \dot q^\dagger \dot q + \det e^{A} - q^\dagger A \dot q. \tag{14.82} $$
Show that $L$ is invariant under SU(2).
(5) Write explicitly the matrices for the generators of the group SU(2) in the $j = 3/2$ representation, and check that they satisfy the Lie algebra of SU(2).
(6) In the classical limit for the angular momentum, i.e., for large $j$, we expect to see quantum averages $\langle J_1\rangle, \langle J_2\rangle, \langle J_3\rangle$ close to the classical values, as well as a small quantum fluctuation for these angular momenta. Show that this is the case.
(7) Consider a three-dimensional rotationally invariant harmonic oscillator. What quantum numbers describe a state of the oscillator, viewed as a system with a potential $V = V(r)$?
15
Theory of Angular Momentum II: Addition of Angular Momenta and Representations; Oscillator Model
In the previous chapter, we saw that the angular momentum of a system acts as a generator of the SO(3) rotation group, and that the values of the angular momentum $j$ define the representation of SO(3) in which the states of the system belong. Classically, energy and mass are scalars under rotations, meaning they do not change, while momentum $\vec p$ and angular momentum $\vec J$ are vectors, and so do transform under the fundamental (or, defining) representation of SO(3). Quantum mechanically, $\vec p$ and $\vec J$ no longer characterize states (which are in a given representation) but rather are operators; in particular they are generators of the translation and rotation (SO(3)) groups, respectively, and as such are in fixed representations of the groups themselves. On the other hand, states are in different representations, unrelated to those of the operators: they are in representations defined by an eigenvalue $j$:
• for $j = 0$, we have the “scalar” representation, with a single state, $|0, 0\rangle$;
• for $j = 1/2$, we have the “spinor” representation, with two states, either $|1/2, -1/2\rangle$ and $|1/2, +1/2\rangle$, or $|1/2 \downarrow\rangle$ and $|1/2 \uparrow\rangle$;
• for $j = 1$, we have the “vector” representation, with three states, $|1, -1\rangle$, $|1, 0\rangle$, and $|1, +1\rangle$;
• etc.
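15.1 The Spinor Representation, j = 1/2
The simplest nontrivial representation, with angular momentum $j = 1/2$, has generators $J_i$ in this representation given by the following matrices (from the general formulas of the last chapter):
$$ J_1^{1/2} = \frac{\hbar}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad J_2^{1/2} = \frac{\hbar}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad J_3^{1/2} = \frac{\hbar}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{15.1} $$
We take out the prefactor,
$$ J_i^{1/2} \equiv \frac{\hbar}{2}\sigma_i, \tag{15.2} $$
thus defining the Pauli matrices $\sigma_i$,
$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{15.3} $$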
We can check that these matrices satisfy a number of properties:
(1) $\sigma_i^2 = 1$ (matrix normalization);
(2) $\{\sigma_i, \sigma_j\} = 0$ for $i \neq j$ (anticommutation);
(3) $\sigma_1 \sigma_2 = i\sigma_3$ and cyclic permutations thereof: $\sigma_2 \sigma_3 = i\sigma_1$, $\sigma_3 \sigma_1 = i\sigma_2$ (commutation); since $\sigma_1 \sigma_2 = -\sigma_2 \sigma_1$, etc. (anticommutation), we obtain the algebra
$$ [\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k. \tag{15.4} $$
Putting together properties (1) and (3), we have
$$ \sigma_i \sigma_j = \delta_{ij} 1 + i\epsilon_{ijk}\sigma_k. \tag{15.5} $$
Further properties of the Pauli matrices are as follows:
(4) $\mathrm{Tr}[\sigma_i] = 0$ (tracelessness);
(5) $\mathrm{Tr}[\sigma_i^2] = 2$ (matrix normalization);
(6) using properties (3)–(5), we find also $\mathrm{Tr}[\sigma_i \sigma_j] = 0$ for $i \neq j$, so we have a trace orthonormalization condition
$$ \mathrm{Tr}[\sigma_i \sigma_j] = 2\delta_{ij}; \tag{15.6} $$
(7) (completeness of $(1, \sigma_i)$) in the space of complex 2 × 2 matrices, we can decompose any matrix $M$ in terms of the four matrices $\sigma_i$ and 1, as
$$ M = \alpha_0 1 + \sum_{i=1,2,3} \alpha_i \sigma_i; \tag{15.7} $$
indeed, multiplying with 1 or $\sigma_j$ and taking the trace, and then using the trace orthonormalization condition at property (6), we find
$$ \alpha_0 = \frac{1}{2}\mathrm{Tr}[M], \qquad \alpha_i = \frac{1}{2}\mathrm{Tr}[M\sigma_i]; \tag{15.8} $$
(8) $\sigma_i^\dagger = \sigma_i$ (Hermiticity);
(9) putting together (1) and (8), we find also that $\sigma_i^\dagger = \sigma_i^{-1}$ (unitarity).
Finally, we obtain that the spin 1/2 representation is the fundamental representation of SU(2) (which is equal in Lie algebra to the rotation group SO(3), as we saw), since the $\sigma_i = (2/\hbar) J_i^{1/2}$ are 2 × 2 complex Hermitian matrices and generators, so that $g = e^{i\vec\alpha\cdot\vec J/\hbar}$ is unitary: $g^\dagger = g^{-1}$. Moreover, property (4) (the tracelessness of $\sigma_i$) means that the matrices $g$ are also special. Indeed, we have
$$ \det M = e^{\mathrm{Tr}\ln M}, \tag{15.9} $$
which can be proven as follows. Diagonalize the matrix $M$, by writing it as $M = S^{-1}\tilde M S$, with $\tilde M$ diagonal with diagonal elements $\lambda_i$, and use $\det(S^{-1}\tilde M S) = \det S^{-1} \det \tilde M \det S = \det \tilde M$ and $\mathrm{Tr}[S^{-1}\tilde M S] = \mathrm{Tr}\,\tilde M$, so that
$$ \mathrm{Tr}[M] = \sum_i \lambda_i \ \Rightarrow \ \det M = \prod_i \lambda_i = e^{\sum_i \ln\lambda_i} = e^{\mathrm{Tr}\ln M}. \tag{15.10} $$
Further, writing $M = e^A$, we obtain
$$ \det e^A = e^{\mathrm{Tr} A}, \tag{15.11} $$
which means that if $\mathrm{Tr}\, A = 0$ then $\det e^A = 1$. In our case, $A$ is proportional to $\vec\alpha\cdot\vec\sigma$, which is traceless, and $e^A = g$, so $\det g = 1$.
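Properties (1) to (7) of the Pauli matrices lend themselves to a quick numerical check. The sketch below, with illustrative names (`SIGMA`, `eps`, `decompose`, `rebuild`) and an arbitrary test matrix chosen only for the example, verifies the product rule (15.5) for all index pairs and the trace decomposition (15.7)–(15.8).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
]

def product_identity_holds(tol=1e-12):
    """Check sigma_i sigma_j = delta_ij 1 + i eps_ijk sigma_k, eq. (15.5)."""
    for i in range(3):
        for j in range(3):
            lhs = matmul(SIGMA[i], SIGMA[j])
            for r in range(2):
                for c in range(2):
                    rhs = (i == j) * IDENTITY[r][c] + sum(
                        1j * eps(i, j, k) * SIGMA[k][r][c] for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True

def trace(a):
    return a[0][0] + a[1][1]

def decompose(m):
    """Coefficients alpha_0 and [alpha_1, alpha_2, alpha_3] from eq. (15.8)."""
    return trace(m) / 2, [trace(matmul(m, s)) / 2 for s in SIGMA]

def rebuild(alpha0, alphas):
    """Reassemble M = alpha_0 1 + sum_i alpha_i sigma_i, eq. (15.7)."""
    return [[alpha0 * IDENTITY[r][c]
             + sum(a * s[r][c] for a, s in zip(alphas, SIGMA))
             for c in range(2)] for r in range(2)]

# Arbitrary complex 2x2 matrix, used only to exercise the decomposition.
m = [[1 + 2j, 3], [-1j, 0.5]]
alpha0, alphas = decompose(m)
rebuilt = rebuild(alpha0, alphas)
round_trip_ok = all(abs(rebuilt[r][c] - m[r][c]) < 1e-12
                    for r in range(2) for c in range(2))
```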
On the other hand, the adjoint representation of SO(3) is defined by the structure constants of SO(3), $f_{abc} = \epsilon_{abc}$,
$$ (T_a)_{bc} = -i f_{abc} = -i\epsilon_{abc}. \tag{15.12} $$
This means specifically that
$$ T_1 = -i \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \qquad T_2 = -i \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad T_3 = -i \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{15.13} $$
which is the same set of matrices, up to a change of basis, as $J_i^{1}/\hbar$ (the generators, or angular momenta in the $j = 1$ representation), as can be checked using the explicit formulas from the last chapter. With regard to the representations of $j \geq 3/2$, we can obtain them from the previous representations by composition, which will be discussed next.
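As a small consistency check on (15.12)–(15.13), the sketch below builds $(T_a)_{bc} = -i\epsilon_{abc}$ and tests two facts that follow from the definition: the matrices close under $[T_a, T_b] = i\epsilon_{abc}T_c$, and $\sum_a T_a^2 = 2 \cdot 1$, which is the value $j(j+1)$ for $j = 1$. The names `T`, `eps`, and the helper functions are illustrative.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Adjoint generators (T_a)_{bc} = -i eps_{abc}, eq. (15.12); index 0 is T_1.
T = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)]
     for a in range(3)]

def lie_algebra_holds(tol=1e-12):
    """Check [T_a, T_b] = i eps_{abc} T_c for all a, b."""
    for a in range(3):
        for b in range(3):
            ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
            for r in range(3):
                for c in range(3):
                    rhs = sum(1j * eps(a, b, k) * T[k][r][c] for k in range(3))
                    if abs(ab[r][c] - ba[r][c] - rhs) > tol:
                        return False
    return True

# Quadratic Casimir: sum_a T_a T_a, expected to equal 2 * identity (j = 1).
squares = [matmul(t, t) for t in T]
casimir = [[sum(s[r][c] for s in squares) for c in range(3)] for r in range(3)]
casimir_is_two = all(abs(casimir[r][c] - (2 if r == c else 0)) < 1e-12
                     for r in range(3) for c in range(3))
```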
15.2 Composition of Angular Momenta
Consider a composite system, such as for instance an atom, that contains several objects (such as electrons, etc.), each with angular momentum. Thus we have eigenstates of $\vec J^2$ and $J_z$ for each object, but we must have also eigenstates of the system for the total angular momentum $\vec J = \sum_i \vec J_i$.
Two Angular Momenta, $\vec J = \vec J_1 + \vec J_2$
It is in fact enough to consider only two momenta, since we can add up two momenta, then add another to the sum of the first two, etc., until we have done the whole sum. Since for $\vec J^2$, $J_z$ we have eigenstates $|jm\rangle$, with $j$ defining the representation and $m$ an index for the states in the representation, for two such momenta $\vec J_1$, $\vec J_2$ we have a tensor product total state, written as
$$ |j_1 m_1\rangle \otimes |j_2 m_2\rangle \equiv |j_1 j_2; m_1 m_2\rangle. \tag{15.14} $$
The algebra of the angular momenta is
$$ \begin{aligned} [J_{1i}, J_{1j}] &= i\hbar\epsilon_{ijk} J_{1k}, \\ [J_{2i}, J_{2j}] &= i\hbar\epsilon_{ijk} J_{2k}, \\ [J_{1i}, J_{2j}] &= 0, \\ [\vec J_1^2, J_{1z}] &= 0, \\ [\vec J_2^2, J_{2z}] &= 0. \end{aligned} \tag{15.15} $$
For fixed $j_1, j_2$, we have that $m_1, m_2$ must define the system, as a tensor product representation. But this representation is in fact too large (that is, reducible), considered as a representation of SO(3). In fact, we already know that all the representations of SO(3) are defined by a fixed $j$ and indexed by $m$ values. This means that the basis of states above, while complete in the Hilbert space, is too large, and we must impose an extra condition that $\vec J = \vec J_1 + \vec J_2$ is also an angular momentum,
$$ [J_i, J_j] = i\hbar\epsilon_{ijk} J_k, \tag{15.16} $$
with quantum numbers $j, m$ defining the representation of the total system.
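We note that on the space of states $|j_1 m_1\rangle \otimes |j_2 m_2\rangle$, we can act with the matrices in the corresponding representations for each of the two Hilbert subspaces, thus with
$$ \begin{aligned} D^{R_1}(g) \otimes D^{R_2}(g) &= D^{R_1}\!\left(e^{-i\vec\alpha\cdot\vec J_1/\hbar}\right) \otimes D^{R_2}\!\left(e^{-i\vec\alpha\cdot\vec J_2/\hbar}\right) \simeq \left(1 - \frac{i\vec\alpha\cdot\vec J_1}{\hbar}\right) \otimes \left(1 - \frac{i\vec\alpha\cdot\vec J_2}{\hbar}\right) \\ &\simeq 1 \otimes 1 - \frac{i}{\hbar}\vec\alpha\cdot\left(\vec J_1 \otimes 1_2 + 1_1 \otimes \vec J_2\right) \equiv 1 - \frac{i}{\hbar}\vec\alpha\cdot\vec J, \end{aligned} \tag{15.17} $$
meaning that more precisely we have
$$ \vec J = \vec J_1 \otimes 1_2 + 1_1 \otimes \vec J_2. \tag{15.18} $$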
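In the basis $|j_1 j_2; m_1 m_2\rangle$, we have the eigenvalues
$$ \begin{aligned} \vec J_1^2 |j_1 j_2; m_1 m_2\rangle &= j_1(j_1 + 1)\hbar^2 |j_1 j_2; m_1 m_2\rangle, \\ \vec J_2^2 |j_1 j_2; m_1 m_2\rangle &= j_2(j_2 + 1)\hbar^2 |j_1 j_2; m_1 m_2\rangle, \\ J_{1z} |j_1 j_2; m_1 m_2\rangle &= m_1\hbar |j_1 j_2; m_1 m_2\rangle, \\ J_{2z} |j_1 j_2; m_1 m_2\rangle &= m_2\hbar |j_1 j_2; m_1 m_2\rangle. \end{aligned} \tag{15.19} $$
But since $j_1, j_2$ define the representation, they must be kept in an alternative basis, while we can replace $m_1, m_2$ (the eigenvalues of $J_{1z}, J_{2z}$) with the eigenvalues of $\vec J^2$ and $J_z$, represented by $j$ and $m$. Indeed, the set $\vec J_1^2, \vec J_2^2, \vec J^2, J_z$ is mutually commuting, so we can have simultaneous eigenvalues, whereas adding $J_{1z}, J_{2z}$ spoils the mutual commutativity. Indeed, we can calculate
$$ \begin{aligned} [\vec J^2, \vec J_1^2] &= [\vec J_1^2 + \vec J_2^2 + 2\vec J_1\cdot\vec J_2, \vec J_1^2] = [\vec J_1^2 + \vec J_2^2 + 2J_{1z}J_{2z} + J_{1+}J_{2-} + J_{1-}J_{2+}, \vec J_1^2] = 0, \\ [\vec J^2, \vec J_2^2] &= [\vec J_1^2 + \vec J_2^2 + 2\vec J_1\cdot\vec J_2, \vec J_2^2] = [\vec J_1^2 + \vec J_2^2 + 2J_{1z}J_{2z} + J_{1+}J_{2-} + J_{1-}J_{2+}, \vec J_2^2] = 0, \\ [J_z, \vec J_1^2] &= [J_{1z} + J_{2z}, \vec J_1^2] = 0, \\ [J_z, \vec J_2^2] &= [J_{1z} + J_{2z}, \vec J_2^2] = 0, \end{aligned} \tag{15.20} $$
but on the other hand
$$ [\vec J^2, J_{1z}] \neq 0 \ (\text{since } [J_{1z}, J_{1\pm}] \neq 0), \qquad [\vec J^2, J_{2z}] \neq 0 \ (\text{since } [J_{2z}, J_{2\pm}] \neq 0). \tag{15.21} $$
Therefore the new basis of states is $|j_1 j_2; jm\rangle$, with eigenvalues
$$ \begin{aligned} \vec J_1^2 |j_1 j_2; jm\rangle &= j_1(j_1 + 1)\hbar^2 |j_1 j_2; jm\rangle, \\ \vec J_2^2 |j_1 j_2; jm\rangle &= j_2(j_2 + 1)\hbar^2 |j_1 j_2; jm\rangle, \\ \vec J^2 |j_1 j_2; jm\rangle &= j(j + 1)\hbar^2 |j_1 j_2; jm\rangle, \\ J_z |j_1 j_2; jm\rangle &= m\hbar |j_1 j_2; jm\rangle, \end{aligned} \tag{15.22} $$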
and this basis is also complete in the Hilbert space of states of the system. Thus we can expand the elements of one basis in terms of the other basis. In particular, by using the completeness relation
$$ 1 = \sum_{m_1, m_2} |j_1 j_2; m_1 m_2\rangle\langle j_1 j_2; m_1 m_2|, \tag{15.23} $$
we can write
$$ |j_1 j_2; jm\rangle = \sum_{m_1, m_2} |j_1 j_2; m_1 m_2\rangle\langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle. \tag{15.24} $$
The matrix elements $\langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle$ are known as the Clebsch–Gordan coefficients.
Properties of the Clebsch–Gordan coefficients:
(1) The first property derives from acting with $J_z = J_{1z} + J_{2z}$ on states, specifically taking matrix elements between the two basis states,
$$ 0 = \langle j_1 j_2; m_1 m_2|(J_z - J_{1z} - J_{2z})|j_1 j_2; jm\rangle = (m - m_1 - m_2)\hbar\, \langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle, \tag{15.25} $$
where we have acted with $J_z$ on the right, and with $J_{1z}, J_{2z}$ on the left. The equation above means that the Clebsch–Gordan coefficients can only be nonzero for $m = m_1 + m_2$ (otherwise they vanish, according to the equation).
(2) The second property gives a range for the total angular momentum $j$:
$$ |j_1 - j_2| \leq j \leq j_1 + j_2. \tag{15.26} $$
As a first check, we note that at large angular momenta $j_1, j_2$ (in the classical regime), when $j_i^2 \simeq \vec J_i^2/\hbar^2$, (15.26) is obviously true:
$$ \vec J^2 = (\vec J_1 + \vec J_2)^2 = \vec J_1^2 + \vec J_2^2 + 2\vec J_1\cdot\vec J_2 \ \Rightarrow \ (j_1 - j_2)^2 \leq \frac{\vec J^2}{\hbar^2} \leq (j_1 + j_2)^2. \tag{15.27} $$
But (15.26) is actually true in quantum mechanics as well. We first note that the maximum projection of $J_z$ is $m_{\max} = m_{1,\max} + m_{2,\max} = j_1 + j_2$, which means that
$$ j_{\max} = j_1 + j_2. \tag{15.28} $$
The representation with this $j_{\max}$ has $2j_{\max} + 1$ states, and the whole representation can be obtained from the state with $m_{\max} = j_1 + j_2$ by acting with $J_-$ (which lowers $m$ by one unit). For the next highest value of $m$, $m_{\max} - 1$, we have more than one state: the state $J_- |j_1 j_2; j_{\max} m_{\max}\rangle$ and another state, with $j = j_{\max} - 1$. From the latter state we generate another representation with $j = j_{\max} - 1$ by acting with $J_-$, for a total of $2(j_{\max} - 1) + 1$ states. Then, at $m_{\max} - 2$, we have two states coming from the previous two representations, plus one more, with $j = j_{\max} - 2$, etc. In total, assuming that indeed $j_{\min} = |j_1 - j_2|$, and assuming for concreteness that $j_1 \geq j_2$, we find that the total number of states is
$$ \sum_{j = j_1 - j_2}^{j_1 + j_2} (2j + 1) = \frac{2}{2}\left[(j_1 + j_2 + 1)(j_1 + j_2) - (j_1 - j_2 - 1)(j_1 - j_2)\right] + 2j_2 + 1 = (2j_1 + 1)(2j_2 + 1), \tag{15.29} $$
which is indeed the total number of states in the basis with j1 , j2 , m1 , m2 at fixed j1 , j2 , so we have found all the states in that basis, confirming the correctness of the hypothesis of jmin = | j1 − j2 |.
Decomposition of Product of Representations
We have thus arrived at the following logic: when we take a product of two representations of SO(3), one of spin $j_1$ (dimension $2j_1 + 1$) and the other of spin $j_2$ (dimension $2j_2 + 1$), we can decompose the resulting states into representations of SO(3) of varying $j$, between $|j_1 - j_2|$ and $j_1 + j_2$, formally:
$$ j_1 \otimes j_2 = |j_1 - j_2| \oplus (|j_1 - j_2| + 1) \oplus \cdots \oplus (j_1 + j_2). \tag{15.30} $$
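The state counting in (15.29) and the decomposition (15.30) can be spot-checked for many pairs $(j_1, j_2)$. The sketch below uses exact rational arithmetic so that half-integer spins are handled without rounding; the function names `allowed_j` and `dimensions_match` are illustrative.

```python
from fractions import Fraction

def allowed_j(j1, j2):
    """Total spins in j1 (x) j2: |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2."""
    j = abs(j1 - j2)
    values = []
    while j <= j1 + j2:
        values.append(j)
        j += 1
    return values

def dimensions_match(j1, j2):
    """Check sum over j of (2j + 1) against (2 j1 + 1)(2 j2 + 1), eq. (15.29)."""
    total = sum(2 * j + 1 for j in allowed_j(j1, j2))
    return total == (2 * j1 + 1) * (2 * j2 + 1)

# Spins 0, 1/2, 1, ..., 4 as exact fractions.
spins = [Fraction(n, 2) for n in range(9)]
all_match = all(dimensions_match(a, b) for a in spins for b in spins)

# Example: spin 1 combined with spin 1/2 gives j = 1/2 and j = 3/2.
example = allowed_j(Fraction(1), Fraction(1, 2))
```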
15.3 Finding the Clebsch–Gordan Coefficients
The Clebsch–Gordan coefficients are the matrix elements for a change of basis in the Hilbert space of the system, from $|j_1 j_2; m_1 m_2\rangle$ to $|j_1 j_2; jm\rangle$, which by the general theory of Hilbert spaces form a unitary matrix. In general, a unitary matrix has complex elements, but since we can multiply all the elements of both bases by arbitrary phases without changing the physics of the system, we can choose a convention where the Clebsch–Gordan coefficients are all real (the phases turn all complex elements into real ones). Moreover, we can also choose a convention such that $\langle j_1 j_2; m_1 = j_1, m_2|j_1 j_2; j, m = j\rangle \in \mathbb{R}_+$ is not only real, but positive. Then, since we have unitary transformations on the Hilbert space, these matrices obey orthonormality conditions,
$$ \begin{aligned} \sum_{j,m} \langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle\langle j_1 j_2; m_1' m_2'|j_1 j_2; jm\rangle &= \delta_{m_1 m_1'}\delta_{m_2 m_2'}, \\ \sum_{m_1, m_2} \langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle\langle j_1 j_2; m_1 m_2|j_1 j_2; j'm'\rangle &= \delta_{jj'}\delta_{mm'}. \end{aligned} \tag{15.31} $$
For $j = j'$, $m = m'$ in the second relation, we obtain a normalization condition for the Clebsch–Gordan coefficients. Instead of the Clebsch–Gordan coefficients, one sometimes uses a set of symbols related to them by constants, the Wigner 3j symbols, defined by
$$ \langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle = (-1)^{j_1 - j_2 + m}\sqrt{2j + 1} \begin{pmatrix} j_1 & j_2 & j \\ m_1 & m_2 & -m \end{pmatrix}. \tag{15.32} $$
Recursion Relations
We can find recursion relations between the Clebsch–Gordan coefficients by acting with $J_\pm = J_{1\pm} + J_{2\pm}$ on the basis decomposition relation, since successive actions of $J_\pm$ construct the $j$ representation, of $J_{1\pm}$ construct the $j_1$ representation, and of $J_{2\pm}$ construct the $j_2$ representation:
$$ J_\pm |j_1 j_2; jm\rangle = (J_{1\pm} + J_{2\pm}) \sum_{m_1, m_2} |j_1 j_2; m_1 m_2\rangle\langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle. \tag{15.33} $$
Using the formulas from the previous chapter for the actions of $J_\pm$ on $|jm\rangle$, we obtain the following relation between states:
$$ \begin{aligned} \sqrt{(j \mp m)(j \pm m + 1)}\; |j_1 j_2; j, m \pm 1\rangle = \sum_{m_1, m_2} \Big[ &\sqrt{(j_1 \mp m_1)(j_1 \pm m_1 + 1)}\; |j_1 j_2; m_1 \pm 1, m_2\rangle \\ + &\sqrt{(j_2 \mp m_2)(j_2 \pm m_2 + 1)}\; |j_1 j_2; m_1, m_2 \pm 1\rangle \Big] \langle j_1 j_2; m_1 m_2|j_1 j_2; jm\rangle. \end{aligned} \tag{15.34} $$
However, defining
$$ (m_1' \equiv m_1 \pm 1,\ m_2' = m_2) \quad \text{and} \quad (m_1' = m_1,\ m_2' = m_2 \pm 1), \tag{15.35} $$
for the two states, respectively, multiplying from the left with $\langle j_1 j_2; m_1' m_2'|$, and then dropping the primes, we find that
$$ \begin{aligned} \sqrt{(j \mp m)(j \pm m + 1)}\; \langle j_1 j_2; m_1 m_2|j_1 j_2; j, m \pm 1\rangle = &\sqrt{(j_1 \mp m_1 + 1)(j_1 \pm m_1)}\; \langle j_1 j_2; m_1 \mp 1, m_2|j_1 j_2; jm\rangle \\ + &\sqrt{(j_2 \mp m_2 + 1)(j_2 \pm m_2)}\; \langle j_1 j_2; m_1, m_2 \mp 1|j_1 j_2; jm\rangle. \end{aligned} \tag{15.36} $$
These recursion relations, together with the normalization condition and the phase and sign convention, are enough to specify the Clebsch–Gordan coefficients completely, though we will not give examples here.
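As an illustration of how repeated lowering fixes the coefficients, the sketch below constructs only the top multiplet $j = j_1 + j_2$: it starts from $|j_1 j_2; m_1 = j_1, m_2 = j_2\rangle$ and applies $J_- = J_{1-} + J_{2-}$ repeatedly, dividing by the factor from (14.74). This is a simplified procedure, not the full recursion (15.36); it uses $\hbar = 1$, and the names `lower_total` and `top_multiplet` are illustrative.

```python
import math
from fractions import Fraction

def lower_total(state, j1, j2):
    """Apply J_- = J_{1-} + J_{2-} (hbar = 1) to {(m1, m2): amplitude}."""
    out = {}
    for (m1, m2), amp in state.items():
        for j, m, key in ((j1, m1, (m1 - 1, m2)), (j2, m2, (m1, m2 - 1))):
            # (14.74): J- |j m> = sqrt((j + m)(j - m + 1)) |j, m - 1>
            factor = math.sqrt((j + m) * (j - m + 1))
            if factor > 0:
                out[key] = out.get(key, 0.0) + amp * factor
    return out

def top_multiplet(j1, j2):
    """Map m -> {(m1, m2): <j1 j2; m1 m2 | j1 j2; j m>} for j = j1 + j2."""
    j = j1 + j2
    m = j
    state = {(j1, j2): 1.0}  # highest-weight state, coefficient chosen positive
    table = {m: state}
    while m > -j:
        lowered = lower_total(state, j1, j2)
        # The same lowering factor, now for the total spin j.
        norm = math.sqrt((j + m) * (j - m + 1))
        state = {key: amp / norm for key, amp in lowered.items()}
        m -= 1
        table[m] = state
    return table

# Spin 1 combined with spin 1/2: the j = 3/2 multiplet.
table = top_multiplet(Fraction(1), Fraction(1, 2))
coeffs = table[Fraction(1, 2)]
```

For this example the $m = 1/2$ state comes out as $\sqrt{2/3}\,|m_1 = 0, m_2 = 1/2\rangle + \sqrt{1/3}\,|m_1 = 1, m_2 = -1/2\rangle$, and every entry satisfies $m = m_1 + m_2$, in line with property (1). The lower multiplets would additionally require orthogonalization against the states already found.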
15.4 Sums of Three Angular Momenta, $\vec J_1 + \vec J_2 + \vec J_3$: Racah Coefficients
In the case where three angular momenta are to be added, we have the following starting basis for the Hilbert space:
$$ |j_1 m_1\rangle \otimes |j_2 m_2\rangle \otimes |j_3 m_3\rangle \equiv |j_1 j_2 j_3; m_1 m_2 m_3\rangle. \tag{15.37} $$
To sum the angular momenta, and decompose these states into representations (of given $j$) of the rotation group SO(3), we can first add two momenta, and then add the remaining one to the sum of the first two. This can be done in three different ways.
(1) We can first set $\vec J_1 + \vec J_2 = \vec J_{12}$, and then set $\vec J_{12} + \vec J_3 = \vec J$. In the first step, we replace $m_1$ and $m_2$ with $j_{12}$ and $m_{12}$, and in the second, we replace $m_3$ and $m_{12}$ with $j$ and $m$ (i.e., total angular momentum and its projection onto z), for a total basis state of $|j_1 j_2 j_{12} j_3; jm\rangle$. Decomposing this basis state in terms of the original basis, using the two steps above, leads to
$$ |j_1 j_2 j_{12} j_3; jm\rangle = \sum_{m_1, m_2, m_{12}, m_3} |j_1 j_2 j_3; m_1 m_2 m_3\rangle\langle j_1 j_2; m_1 m_2|j_{12} m_{12}\rangle\langle j_{12} j_3; m_{12} m_3|jm\rangle, \tag{15.38} $$
where
$$ \begin{aligned} \langle j_1 j_2; m_1 m_2|j_{12} m_{12}\rangle &\equiv \langle j_1 j_2 j_3; m_1 m_2 m_3|j_1 j_2 j_{12} j_3; m_{12} m_3\rangle, \\ \langle j_{12} j_3; m_{12} m_3|jm\rangle &\equiv \langle j_1 j_2 j_{12} j_3; m_{12} m_3|j_1 j_2 j_{12} j_3; jm\rangle. \end{aligned} \tag{15.39} $$
(2) Alternatively, we can first set $\vec J_2 + \vec J_3 = \vec J_{23}$, and then $\vec J_{23} + \vec J_1 = \vec J$. In the first step, we replace $m_2$ and $m_3$ with $j_{23}$ and $m_{23}$, and in the second, we replace $m_1$ and $m_{23}$ with $j$ and $m$. The new decomposition is
$$ |j_1 j_2 j_3 j_{23}; jm\rangle = \sum_{m_1, m_2, m_3, m_{23}} |j_1 j_2 j_3; m_1 m_2 m_3\rangle\langle j_2 j_3; m_2 m_3|j_{23} m_{23}\rangle\langle j_1 j_{23}; m_1 m_{23}|jm\rangle. \tag{15.40} $$
(3) The third possibility is to set $\vec J_1 + \vec J_3 = \vec J_{13}$, and then $\vec J_{13} + \vec J_2 = \vec J$. The corresponding last decomposition is
$$ |j_1 j_3 j_{13} j_2; jm\rangle = \sum_{m_1, m_2, m_3, m_{13}} |j_1 j_2 j_3; m_1 m_2 m_3\rangle\langle j_1 j_3; m_1 m_3|j_{13} m_{13}\rangle\langle j_2 j_{13}; m_2 m_{13}|jm\rangle. \tag{15.41} $$
All three methods are equivalent, and we should arrive at equivalent bases. This means that, in particular, the bases obtained in points (1) and (2) are related by a unitary transformation. The matrix elements between them can be written (by taking out real factors) in terms of the Racah W coefficients, or equivalently the Wigner 6j symbols,
$$ \begin{aligned} \langle j_1 j_2 j_{12} j_3; jm|j_1 j_2 j_3 j_{23}; jm\rangle &\equiv \sqrt{(2j_{12} + 1)(2j_{23} + 1)}\; W(j_1, j_2, j, j_3; j_{12}, j_{23}) \\ &\equiv (-1)^{j_1 + j_2 + j_3 + j}\sqrt{(2j_{12} + 1)(2j_{23} + 1)} \begin{Bmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{Bmatrix}. \end{aligned} \tag{15.42} $$
These matrices are rather complicated and tedious to obtain, though Racah showed that the W coefficients have workable expressions.
15.5 Schwinger’s Oscillator Model
Finally, we note a very useful way to obtain the representations of the Lie algebra of $SU(2) \simeq SO(3)$, by building its generators from the algebra of harmonic oscillators. It turns out that we need two harmonic oscillators, one with operators $a_+, a_+^\dagger$, one with $a_-, a_-^\dagger$. The two oscillators will be associated with the positive and negative values of $m = J_z$, respectively. We start with a single harmonic oscillator, with $a, a^\dagger$ and number operator $N = a^\dagger a$. Thus its algebra is
$$ \begin{aligned} [a, a^\dagger] &= 1, \\ [N, a^\dagger] &= a^\dagger a a^\dagger - a^\dagger a^\dagger a = a^\dagger [a, a^\dagger] = +a^\dagger, \\ [N, a] &= a^\dagger a a - a a^\dagger a = -[a, a^\dagger] a = -a. \end{aligned} \tag{15.43} $$
These commutation relations of $N$ indeed look like the commutation relations
$$ \left[\frac{J_z}{\hbar}, \frac{J_\pm}{\hbar}\right] = \pm\frac{J_\pm}{\hbar}, \tag{15.44} $$
but the remaining one is not quite right, since it is
$$ \left[\frac{J_+}{\hbar}, \frac{J_-}{\hbar}\right] = \frac{2J_z}{\hbar}, \tag{15.45} $$
which doesn’t match $[a^\dagger, a] = -1$. This is why we actually need two oscillators in order to construct the Lie algebra, commuting one with the other, so with simultaneous eigenvalues,
$$ [a_+, a_-^\dagger] = [a_-, a_+^\dagger] = 0. \tag{15.46} $$
180
15 Angular Momentum II: Addition, Representations; Oscillators
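For this system, we can construct simultaneous eigenstates, in the usual manner, as a tensor product of the states of the two oscillators,
$$|n_+, n_-\rangle = |n_+\rangle \otimes |n_-\rangle = \frac{(a_+^\dagger)^{n_+} (a_-^\dagger)^{n_-}}{\sqrt{n_+!}\sqrt{n_-!}} |0, 0\rangle. \tag{15.47}$$
Then we can construct the generators of the Lie algebra in terms of these two oscillators,
$$J_+ \equiv \hbar\, a_+^\dagger a_-, \quad J_- \equiv \hbar\, a_-^\dagger a_+, \quad 2J_z \equiv \hbar (a_+^\dagger a_+ - a_-^\dagger a_-) = \hbar (N_+ - N_-). \tag{15.48}$$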
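We can check that this indeed has the right commutation relations. For instance, the commutation relation that did not work with a single oscillator now does work:
$$\left[\frac{J_+}{\hbar}, \frac{J_-}{\hbar}\right] = [a_+^\dagger a_-, a_-^\dagger a_+] = a_+^\dagger [a_-, a_-^\dagger] a_+ - a_-^\dagger [a_+, a_+^\dagger] a_- = N_+ - N_- = \frac{2J_z}{\hbar}. \tag{15.49}$$
Moreover, the action of the angular momentum operators on the basis consisting of occupation number states $|n_+, n_-\rangle$ is
$$J_+ |n_+, n_-\rangle = \hbar\, a_+^\dagger a_- |n_+, n_-\rangle = \hbar \sqrt{n_- (n_+ + 1)}\, |n_+ + 1, n_- - 1\rangle$$
$$J_- |n_+, n_-\rangle = \hbar\, a_-^\dagger a_+ |n_+, n_-\rangle = \hbar \sqrt{n_+ (n_- + 1)}\, |n_+ - 1, n_- + 1\rangle \tag{15.50}$$
$$2J_z |n_+, n_-\rangle = \hbar (N_+ - N_-) |n_+, n_-\rangle = \hbar (n_+ - n_-) |n_+, n_-\rangle.$$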
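The relations (15.49) and (15.50) can be checked numerically. The following is a minimal sketch, assuming $\hbar = 1$ and restricting to the sector of fixed total occupation number $n_+ + n_- = 2j$; the function names and the basis ordering are illustrative choices, not part of the text.

```python
import numpy as np

def schwinger_matrices(j):
    """Matrices of J+, J-, Jz (hbar = 1) on the states |n+, n-> with n+ + n- = 2j.

    Basis ordering (an arbitrary choice): index n+ = 0, 1, ..., 2j.
    """
    dim = int(round(2 * j)) + 1
    jp = np.zeros((dim, dim))
    for n_plus in range(dim - 1):
        n_minus = dim - 1 - n_plus
        # J+ |n+, n-> = sqrt(n- (n+ + 1)) |n+ + 1, n- - 1>, as in (15.50)
        jp[n_plus + 1, n_plus] = np.sqrt(n_minus * (n_plus + 1))
    jm = jp.T  # J- is the Hermitian conjugate of J+ (the matrices are real)
    # 2 Jz = N+ - N-
    jz = np.diag([(n_plus - (dim - 1 - n_plus)) / 2 for n_plus in range(dim)])
    return jp, jm, jz

def check_su2(j):
    """True if the sector matrices satisfy the su(2) algebra and J^2 = j(j+1)."""
    jp, jm, jz = schwinger_matrices(j)
    dim = jz.shape[0]
    casimir = jz @ jz + 0.5 * (jp @ jm + jm @ jp)
    return bool(
        np.allclose(jp @ jm - jm @ jp, 2 * jz)
        and np.allclose(jz @ jp - jp @ jz, jp)
        and np.allclose(casimir, j * (j + 1) * np.eye(dim))
    )
```

Calling `check_su2(j)` for integer and half-integer `j` tests both the commutators and the value of the Casimir operator in each sector.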
We see that we can identify the occupation states for the oscillators, $|n_+, n_-\rangle$, with the states in the representation of angular momentum $j$, by making the identifications
$$n_+ = j + m, \quad n_- = j - m. \tag{15.51}$$
Defining the total occupation number, $N = N_+ + N_-$, we obtain that
$$n_+ + n_- \equiv N = 2j. \tag{15.52}$$
Substituting (15.51) in the general state $|n_+, n_-\rangle$, we obtain the states in the spin $j$ representation of the SO(3) rotation group as oscillator states, with
$$|jm\rangle = \frac{(a_+^\dagger)^{j+m} (a_-^\dagger)^{j-m}}{\sqrt{(j+m)!}\sqrt{(j-m)!}} |0\rangle. \tag{15.53}$$
In this way, we can construct all the states in the representation of angular momentum $j$: we note that $m_{\max} = j$ (so that $j - m = 0$) and $m_{\min} = -j$ (so that $j + m = 0$), otherwise the above state (15.53) does not make sense. We also note that the state with $m = j$ is
$$|jj\rangle = \frac{(a_+^\dagger)^{2j}}{\sqrt{(2j)!}} |0\rangle, \tag{15.54}$$
which can be interpreted as being composed of $2j$ elements with angular momentum 1/2 each, all with “spin up”, $j_z = +1/2$. Thus the total angular momentum can be thought of as a sum of angular momenta each having the minimum value, 1/2:
$$\vec{j} = \sum_{i=1}^{2j} \overrightarrow{(1/2)}_i. \tag{15.55}$$
Thus, as we mentioned at the beginning, we can indeed construct all the representations of SO(3) from just sums of the simplest, “spinor” representation. In this Schwinger oscillator model, $a_+^\dagger$ creates a spin 1/2 state with spin up, and $a_-^\dagger$ creates one with spin down, each state having $N = 2j$ oscillators, $j + m$ with spin up and $j - m$ with spin down.
The Schwinger construction presented here for $SO(3) \simeq SU(2)$ is actually much more general. In fact, any Lie algebra can be constructed in terms of harmonic oscillators $a_i, a_i^\dagger$, which in turn allows us, as above, to explicitly construct its representations. Indeed, in Lie group theory, one can show that we can reduce the Lie algebra to a kind of product of SU(2) factors, after which we can just follow the above analysis. We will not do this here, but the result is that this quantum mechanical construction is useful for solving the issue of representations of general (Lie) groups. Thus we can say that quantum mechanics helps not just with physics, but also with mathematics, in terms of representations of Lie groups!
Important Concepts to Remember
• The angular momentum $\vec{J}$ is the generator of the SO(3) group of spatial rotations, and the values $j$ of the angular momentum correspond to representations of SO(3) in which physical $|\psi\rangle$ states belong.
• The spinor representation of $SO(3) \simeq SU(2)/\mathbb{Z}_2$, $j = 1/2$, has generators given by the Pauli matrices, $J_i = \frac{\hbar}{2}\sigma_i$, and is the fundamental representation of SU(2).
• The Pauli matrices $\sigma_i$ satisfy $\sigma_i \sigma_j = \delta_{ij} \mathbb{1} + i\epsilon_{ijk}\sigma_k$; they are Hermitian and traceless and, since $\sigma_i^2 = \mathbb{1}$, are also unitary.
• The matrices $(\mathbb{1}, \sigma_i)$ are complete in the space of 2 × 2 matrices.
• The $j = 1$ representation is the adjoint representation of SO(3).
• We can compose two angular momenta to give $\vec{J} = \vec{J}_1 + \vec{J}_2$, in which case the states can be described either as a tensor product of the original states, $|j_1 m_1\rangle \otimes |j_2 m_2\rangle \equiv |j_1 j_2; m_1 m_2\rangle$, or as states of $\vec{J}$, with $j_1, j_2$ defined also, thus as $|j_1 j_2; jm\rangle$, since we have two mutually commuting sets, $(\vec{J}_1^2, \vec{J}_2^2, J_{1z}, J_{2z})$ and $(\vec{J}_1^2, \vec{J}_2^2, \vec{J}^2, J_z)$.
• The relation between the two possible bases is given as $|j_1 j_2; jm\rangle = \sum_{m_1, m_2} |j_1 j_2; m_1 m_2\rangle \langle j_1 j_2; m_1 m_2 | j_1 j_2; jm\rangle$; the coefficients $\langle j_1 j_2; m_1 m_2 | j_1 j_2; jm\rangle$ are the Clebsch–Gordan coefficients.
• In the Clebsch–Gordan coefficients we have (providing that the coefficient is nonzero) $m = m_1 + m_2$ and $|j_1 - j_2| \le j \le j_1 + j_2$, which means that we have the decomposition of the product of representations into a sum of representations, as $j_1 \otimes j_2 = |j_1 - j_2| \oplus (|j_1 - j_2| + 1) \oplus \cdots \oplus (j_1 + j_2)$.
• The Clebsch–Gordan coefficients are the same as the Wigner 3j symbols, up to a rescaling, $\begin{pmatrix} j_1 & j_2 & j \\ m_1 & m_2 & -m \end{pmatrix}$.
• When composing three angular momenta, $\vec{J}_1 + \vec{J}_2 + \vec{J}_3 = \vec{J}$, we can do this by first adding 1 and 2, or first adding 2 and 3 (or first adding 1 and 3), which are equivalent procedures, leading to coefficients for the transition, $\langle j_1 j_2 j_{12} j_3; jm | j_1 j_2 j_3 j_{23}; jm\rangle$, which up to some rescaling are the Racah coefficients $W(j_1, j_2, j_{13}; j_{12}, j_{23})$ or the Wigner 6j symbols $\begin{Bmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{Bmatrix}$.
• In Schwinger's oscillator model, we can construct the generators of SU(2) or SO(3), the angular momenta, in terms of two sets of oscillators $a_+, a_+^\dagger, a_-, a_-^\dagger$, as $J_+/\hbar = a_+^\dagger a_-$, $J_-/\hbar = a_-^\dagger a_+$, $2J_z/\hbar = N_+ - N_-$, and the states in a representation in terms of the creation operators, $|jm\rangle = \frac{(a_+^\dagger)^{j+m} (a_-^\dagger)^{j-m}}{\sqrt{(j+m)!}\sqrt{(j-m)!}} |0\rangle$, so $a_+^\dagger$ creates a spin 1/2 up and $a_-^\dagger$ a spin 1/2 down.
• The Schwinger oscillator model is much more general; any Lie algebra can be represented in terms of harmonic oscillators, and the states in a representation in terms of the creation operators acting on a vacuum.
Further Reading See [2], [1], [3] for more on angular momenta, and [8] for more on mathematical constructions for SU (2) representations and for other Lie algebras.
Exercises
(1) Calculate
$$\mathrm{Tr}\, e^{\alpha_i \sigma_i / 2}, \tag{15.56}$$
where $\sigma_i$ are the Pauli matrices.
(2) Decompose the product of two spin 1 representations into (irreducible) representations of SU(2), and list explicitly the relation between the basis elements for the two bases in terms of Clebsch–Gordan coefficients.
(3) For the case in exercise 2, find all the Clebsch–Gordan coefficients for the lowest value of the total spin $j$, using the recursion relations and the normalization conditions.
(4) For the sum of three spin 1 representations, in the case where the final spin is 2, write explicitly the three decompositions of the final basis in terms of the tensor product basis (with a product of two Clebsch–Gordan coefficients) and also write explicitly the resulting Racah coefficients.
(5) Rewrite the relation between the two bases in exercise 2 in terms of Schwinger oscillators.
(6) Consider the Lie algebra for $J_i, K_i$, $i = 1, 2, 3$,
$$[J_i, J_j] = i\epsilon_{ijk} J_k, \quad [K_i, K_j] = i\epsilon_{ijk} J_k, \quad [J_i, K_j] = i\epsilon_{ijk} K_k. \tag{15.57}$$
Write it in terms of Schwinger oscillators, and give the general representation in terms of Schwinger oscillators acting on a vacuum.
(7) Decompose the product of four spin 1 representations of SU(2), $1 \otimes 1 \otimes 1 \otimes 1$, into irreducible SU(2) representations.
16
Applications of Angular Momentum Theory: Tensor Operators, Wave Functions and the Schrödinger Equation, Free Particles
After having described how to construct representations, and how to compose them (and to decompose the products of representations into sums of representations, or $\vec{J}_1 + \vec{J}_2$ into $\vec{J}$), we now apply these notions to physics. First, we learn how to write operators, associated with observables, transforming in a given representation under rotation, i.e., “tensor operators”. Then, we find how wave functions transform under rotations, and how the Schrödinger equation decomposes under them. Then, we apply this formalism to the simplest case, that of free particles.
16.1 Tensor Operators
Consider classical measurable vectors, e.g., $\vec{r}$, $\vec{p}$, $\vec{L}$, collectively called $V_i$, that transform according to the rule
$$V_i \to V_i' = \sum_j R_{ij} V_j, \tag{16.1}$$
where $R_{ij}$ is an SO(3) matrix, i.e., a rotation matrix in the defining $j = 1$, or vector, representation (with orthogonal 3 × 3 matrices). Then, also classically, one considers tensors as objects transforming as products of the basic vector representation,
$$T_{i_1 i_2 \ldots i_n} \to T'_{i_1 i_2 \ldots i_n} = R_{i_1 i_1'} R_{i_2 i_2'} \cdots R_{i_n i_n'} T_{i_1' i_2' \ldots i_n'}. \tag{16.2}$$
However, as we saw in the previous chapter, this representation, the product of vector representations, is not irreducible. It can in fact be reduced, i.e., decomposed into irreducible representations of SO(3), which are all of angular momentum $j$. From the previous chapter, we can formally say that we have
$$\sum_i \vec{1}_i = \vec{j}, \tag{16.3}$$
where $j$ has several values that can be calculated, or, in a different notation, where we denote the representation by its value of $j$, and denote the effect of the angular momentum sum on states (tensor product, decomposed as a sum),
$$1 \otimes 1 \otimes \cdots \otimes 1 = \cdots \oplus (n - 1) \oplus n. \tag{16.4}$$
Note that we have a tensor product of $n$ quantities on the left and that on the right the possible $j$'s take only integer values, since we are summing integer values (of $j = 1$) many times. In yet another notation, we can denote the representation by its multiplicity (the number of states in it), for instance $j = 1$ as 3, $j = 2$ as 5, etc., so that
$$3 \otimes 3 \otimes \cdots \otimes 3 = \cdots \oplus (2(n - 1) + 1) \oplus (2n + 1). \tag{16.5}$$
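The content of the decompositions (16.4) and (16.5) can be generated by applying the rule $j \otimes 1 = |j - 1| \oplus \cdots \oplus (j + 1)$ repeatedly. The following is a minimal sketch; the function name and the choice to return a dictionary of multiplicities are illustrative assumptions.

```python
from collections import Counter

def decompose_vector_power(n):
    """Multiplicity of each integer j in 1 (x) 1 (x) ... (x) 1 (n factors)."""
    content = Counter({0: 1})  # start from the trivial representation j = 0
    for _ in range(n):
        new_content = Counter()
        for j, mult in content.items():
            # j (x) 1 = |j - 1| (+) ... (+) (j + 1)
            for j_total in range(abs(j - 1), j + 2):
                new_content[j_total] += mult
        content = new_content
    return dict(sorted(content.items()))

def total_dimension(content):
    """Sum of (2j + 1) over all irreducible pieces, counted with multiplicity."""
    return sum(mult * (2 * j + 1) for j, mult in content.items())
```

For any `n`, `total_dimension(decompose_vector_power(n))` should reproduce $3^n$, which is the counting of states expressed by (16.5).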
This relation would be true as numbers also, i.e., $3 \times 3 \times \cdots \times 3 = \cdots + (2(n - 1) + 1) + (2n + 1)$. In particular, for a sum of only two vectors (giving a tensor product of representations) and thus for the tensor $T_{ij}$, we have
$$1 \otimes 1 = 0 \oplus 1 \oplus 2, \tag{16.6}$$
meaning that, for $j_1 = j_2 = 1$, we have $0 = |1 - 1| \le j \le 1 + 1 = 2$, so $j = 0, 1, 2$. Equivalently, denoting the representation by its multiplicity, we have
$$3 \otimes 3 = 1 \oplus 3 \oplus 5, \tag{16.7}$$
which is also true as regular products and sums, $3 \times 3 = 1 + 3 + 5 = 9$, giving the total number of states in the system.
Quantum mechanically, we have operators $\hat{V}_i$ corresponding to the observables $V_i$, for instance $\hat{r}_i$, $\hat{p}_i$, $\hat{L}_i$. Then the transformed operators $\hat{V}_i'$ have the same transformation rules as the classical observables. This is so, since if the states $|\psi\rangle$ in the Hilbert space transform in some representation with angular momentum $j$ under a rotation with rotation matrix $R$,
$$|\psi\rangle \to |\psi'\rangle \equiv D^{(j)}(R) |\psi\rangle, \tag{16.8}$$
where $D^{(j)}(R)$ is the rotation matrix in the $(j)$ representation, then the quantum averages of the operators $\hat{V}_i$, which are classically observable, must transform in the same way as the classical objects, i.e.,
$$\langle\psi| \hat{V}_i |\psi\rangle \to \langle\psi| D^{\dagger(j)}(R) \hat{V}_i D^{(j)}(R) |\psi\rangle = \sum_k R_{ik} \langle\psi| \hat{V}_k |\psi\rangle, \tag{16.9}$$
implying the transformation for operators
$$\hat{V}_i \to \hat{V}_i' = D^{\dagger(j)}(R) \hat{V}_i D^{(j)}(R) = \sum_k R_{ik} \hat{V}_k. \tag{16.10}$$
The infinitesimal form of the transformation law can be obtained by remembering the general form of a rotation matrix in a representation $(j)$, in terms of the generators $J_i$ in this representation. In the infinitesimal case, we have
$$D^{(j)}(R) = \exp\left(-\frac{i}{\hbar} \alpha^i (J_i)^{(j)}\right) \simeq 1 - \frac{i}{\hbar} \alpha^i (J_i)^{(j)}, \tag{16.11}$$
and in the defining, or vector representation, which for SO(3) happens to be also the adjoint one, we have
$$R_{ij} = \left[\exp\left(-\frac{i}{\hbar} \alpha^k J_k\right)\right]_{ij} \simeq \mathbb{1}_{ij} - \frac{i}{\hbar} \alpha^k (J_k)_{ij} = \delta_{ij} - \alpha^k \epsilon_{kij}, \tag{16.12}$$
where we have used $(J_k)_{ij} = -i\hbar f_{ijk} = -i\hbar \epsilon_{ijk}$.
Then the transformation law $D^\dagger \hat{V}_i D = \sum_l R_{il} \hat{V}_l$ implies
$$-\frac{i\alpha^k}{\hbar} [\hat{V}_i, (J_k)]^{(j)} = -\frac{i\alpha^k}{\hbar} (-i\hbar \epsilon_{kil} \hat{V}_l)^{(j)} \Rightarrow [\hat{V}_i, J_k]^{(j)} = i\hbar \epsilon_{ikl} (\hat{V}_l)^{(j)}, \tag{16.13}$$
which can be taken as an alternative definition of a vector operator.
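The commutator condition (16.13) can be tested on explicit matrices. The following is a minimal sketch, assuming $\hbar = 1$ and using the vector-representation generators $(J_k)_{ij} = -i\epsilon_{ijk}$ from (16.12); the helper names are illustrative. The angular momentum itself should pass the test, since it is a vector operator.

```python
import numpy as np

def levi_civita():
    """The totally antisymmetric symbol eps_{ijk} as a 3 x 3 x 3 array."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

EPS = levi_civita()
# Generators in the vector representation, (J_k)_{ij} = -i eps_{ijk}, with hbar = 1.
J = [-1j * EPS[:, :, k] for k in range(3)]

def vector_operator_residual(V):
    """Largest entry of [V_i, J_k] - i eps_{ikl} V_l over all i, k."""
    worst = 0.0
    for i in range(3):
        for k in range(3):
            lhs = V[i] @ J[k] - J[k] @ V[i]
            rhs = sum(1j * EPS[i, k, l] * V[l] for l in range(3))
            worst = max(worst, float(np.abs(lhs - rhs).max()))
    return worst
```

A residual close to zero for `V = J` confirms the vector character of the generators, while a triple of identity matrices, which is rotationally invariant rather than a vector, gives a residual of order one.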
Thus a quantum mechanical tensor operator is an object $\hat{T}_m^{(j)}$, in a representation of integer (not half-integer!) angular momentum $j = k \in \mathbb{N}$, where $m$ is the standard index in the representation, $m = -j, \ldots, j$, that therefore transforms according to the rotation matrices in the representation $j$ under a transformation of the states by matrices in a different representation $\tilde{j}$,
$$D^{\dagger(\tilde{j})}(R)\, \hat{T}_m^{(j)}\, D^{(\tilde{j})}(R) = \sum_{m' = -j}^{j} D^{(j)*}_{mm'}(R)\, \hat{T}_{m'}^{(j)}, \tag{16.14}$$
where we have used $D^T(R^{-1}) = D^*(R)$ to write the complex conjugate of the rotation matrix.
The infinitesimal form of the above transformation law for tensors is similar to the one for vectors, except that now we must use the (complex conjugate of the) matrix of the generator $J_k$ in the $j$ representation,
$$(J_k)^*_{mm'} \equiv \langle jm| J_k |jm'\rangle^* = \langle jm'| J_k |jm\rangle, \tag{16.15}$$
to obtain
$$[\hat{T}_m^{(j)}, \hat{J}_k] = -(J_k)^*_{mm'}\, \hat{T}_{m'}^{(j)}, \tag{16.16}$$
or
$$[\hat{J}_k, \hat{T}_m^{(j)}] = \hat{T}_{m'}^{(j)} \langle jm'| J_k |jm\rangle. \tag{16.17}$$
16.2 Wigner–Eckart Theorem
This is a very useful theorem, which states that the matrix elements of the tensor operator $\hat{T}_m^{(j)}$ with respect to $|\alpha \tilde{j} \tilde{m}\rangle$ eigenstates ($\alpha$ stands for all other indices) are given by Clebsch–Gordan coefficients, times a function that is independent of $m$ or, denoting the tensor operator by $\hat{T}_q^{(k)}$ in order to match standard notation and to avoid confusing $j$ and $\tilde{j}$,
$$\langle \alpha', j'm'| \hat{T}_q^{(k)} |\alpha, jm\rangle = \langle jk; mq | jk; j'm'\rangle \frac{\langle \alpha', j' \| T^{(k)} \| \alpha, j\rangle}{\sqrt{2j + 1}}, \tag{16.18}$$
where the double verticals indicate that the matrix element is independent of $m$.
Proof. To prove the relation, we consider the action of the tensor operator on a state,
$$\hat{T}_q^{(k)} |\alpha, jm\rangle, \tag{16.19}$$
and construct a linear combination with coefficients given by the Clebsch–Gordan coefficients,
$$|\tau, \tilde{j}\tilde{m}\rangle \equiv \sum_{q, m} \hat{T}_q^{(k)} |\alpha, jm\rangle \langle jk; mq | jk; \tilde{j}\tilde{m}\rangle. \tag{16.20}$$
Using the orthogonality property of the Clebsch–Gordan coefficients, we reverse the relation and write
$$\hat{T}_q^{(k)} |\alpha, jm\rangle = \sum_{\tilde{j}, \tilde{m}} |\tau, \tilde{j}\tilde{m}\rangle \langle jk; mq | jk; \tilde{j}\tilde{m}\rangle. \tag{16.21}$$
Using (16.17) and (16.21), we find that $\hat{J}_k$ acts on $|\tau \tilde{j}\tilde{m}\rangle$ as on $|\tilde{j}\tilde{m}\rangle$. The proof of that is left as an exercise. Thus
$$\langle \alpha' j'm' | \tau \tilde{j}\tilde{m}\rangle \propto \delta_{j'\tilde{j}}\, \delta_{m'\tilde{m}}, \tag{16.22}$$
which implies the statement of the theorem. q.e.d.
As an application of the theorem, we see that the matrix elements of $\hat{T}_q^{(k)}$ are zero, unless
$$q = m' - m, \qquad |j - j'| \le k \le j + j'. \tag{16.23}$$
This selection rule is very useful for quantum processes involving matrix elements for transition between states, as we will see later on in the book.
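The selection rule (16.23) is simple enough to encode directly. The following is a minimal sketch; the function names are hypothetical, and a result of `True` only means that the matrix element is not excluded by (16.23), not that it is nonzero.

```python
def wigner_eckart_allowed(j, m, k, q, j_prime, m_prime):
    """True if <j' m'| T^(k)_q |j m> is not excluded by the selection rule (16.23)."""
    return (q == m_prime - m) and (abs(j - j_prime) <= k <= j + j_prime)

def allowed_final_j(j, k):
    """Integer-step values of j' compatible with |j - j'| <= k <= j + j'."""
    values = []
    j_prime = abs(j - k)
    while j_prime <= j + k:
        values.append(j_prime)
        j_prime += 1
    return values
```

For a vector operator ($k = 1$), `allowed_final_j(1, 1)` returns `[0, 1, 2]`, while `wigner_eckart_allowed(0, 0, 1, 0, 0, 0)` is `False`, so a vector operator has no matrix elements between two $j = 0$ states.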
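16.3 Rotations and Wave Functions
For quantum observables, one can measure the eigenvalues of the corresponding operators. But we saw that the classical angular momentum,
$$L_i = \epsilon_{ijk} x_j p_k \quad (\vec{L} = \vec{r} \times \vec{p}), \tag{16.24}$$
corresponds to a quantum operator
$$\hat{L}_i = \epsilon_{ijk} \hat{X}_j \hat{P}_k, \tag{16.25}$$
acting as a generator of the rotation group SO(3); hence the eigenvalues $\hbar^2 j(j + 1)$ of its square, defined by a given $j$, are something that one can measure. Therefore measurements will correspond to a given $j$, and so a given representation of SO(3), a quantization condition for angular momentum that will be reflected in the wave functions.
In the coordinate basis, angular momenta have matrix elements
$$\langle \vec{x}| \hat{L}_i |\vec{x}'\rangle = \epsilon_{ijk} \langle \vec{x}| \hat{X}_j \hat{P}_k |\vec{x}'\rangle = \epsilon_{ijk} x_j \langle \vec{x}| \hat{P}_k |\vec{x}'\rangle = -i\hbar \epsilon_{ijk} x_j \frac{\partial}{\partial x_k} \delta(\vec{x} - \vec{x}'). \tag{16.26}$$
When acting on a wave function in coordinate, $|\vec{x}\rangle$, space, the usual manipulations lead to
$$\langle \vec{x}| \hat{L}_i |\psi\rangle = \int d\vec{x}'\, \langle \vec{x}| \hat{L}_i |\vec{x}'\rangle \langle \vec{x}'|\psi\rangle = -i\hbar \epsilon_{ijk} x_j \frac{\partial}{\partial x_k} \psi(\vec{x}) = -i\hbar \left(\vec{r} \times \vec{\nabla}\psi\right)_i. \tag{16.27}$$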
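It is useful to go to spherical coordinates $r, \theta, \phi$, defined from the Cartesian coordinates $x_1, x_2, x_3$ by
$$x_1 = r \sin\theta \cos\phi$$
$$x_2 = r \sin\theta \sin\phi \tag{16.28}$$
$$x_3 = r \cos\theta,$$
as in Fig. 16.1.
Figure 16.1 Coordinate system (Cartesian axes $x_1$, $x_2$, $x_3$ with the angles $\theta$ and $\phi$).
The new system of coordinates has basis vectors $\vec{e}_r$, $\vec{e}_\theta$, and $\vec{e}_\phi$, which obey the relations
$$\vec{e}_r = \vec{e}_\theta \times \vec{e}_\phi, \quad \vec{e}_\theta = \vec{e}_\phi \times \vec{e}_r, \quad \vec{e}_\phi = \vec{e}_r \times \vec{e}_\theta, \tag{16.29}$$
and then the $\vec{\nabla}$ operator becomes in spherical coordinates
$$\vec{\nabla} = \vec{e}_r \frac{\partial}{\partial r} + \vec{e}_\theta \frac{1}{r} \frac{\partial}{\partial\theta} + \vec{e}_\phi \frac{1}{r \sin\theta} \frac{\partial}{\partial\phi}. \tag{16.30}$$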
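This allows us to write the form of the angular momentum operator acting on wave functions in spherical coordinates:
$$\vec{L} = -i\hbar\, \vec{r} \times \vec{\nabla} = -i\hbar \left(\vec{e}_\phi \frac{\partial}{\partial\theta} - \vec{e}_\theta \frac{1}{\sin\theta} \frac{\partial}{\partial\phi}\right). \tag{16.31}$$
By using the decomposition of the spherical basis vectors in the Cartesian basis,
$$\vec{e}_r = \sin\theta \cos\phi\, \vec{e}_1 + \sin\theta \sin\phi\, \vec{e}_2 + \cos\theta\, \vec{e}_3$$
$$\vec{e}_\theta = \cos\theta \cos\phi\, \vec{e}_1 + \cos\theta \sin\phi\, \vec{e}_2 - \sin\theta\, \vec{e}_3 \tag{16.32}$$
$$\vec{e}_\phi = -\sin\phi\, \vec{e}_1 + \cos\phi\, \vec{e}_2,$$
we can also decompose $\vec{L}$, written above in spherical coordinates, into $L_1, L_2, L_3$, as follows:
$$L_1 = -i\hbar \left(-\sin\phi \frac{\partial}{\partial\theta} - \cos\phi \cot\theta \frac{\partial}{\partial\phi}\right)$$
$$L_2 = -i\hbar \left(\cos\phi \frac{\partial}{\partial\theta} - \sin\phi \cot\theta \frac{\partial}{\partial\phi}\right) \tag{16.33}$$
$$L_3 = -i\hbar \frac{\partial}{\partial\phi}.$$
Then we can also construct operators that change $m$,
$$\hat{L}_+ = \hat{L}_1 + i\hat{L}_2 = \hbar e^{i\phi} \left(\frac{\partial}{\partial\theta} + i \cot\theta \frac{\partial}{\partial\phi}\right)$$
$$\hat{L}_- = \hat{L}_1 - i\hat{L}_2 = \hbar e^{-i\phi} \left(-\frac{\partial}{\partial\theta} + i \cot\theta \frac{\partial}{\partial\phi}\right). \tag{16.34}$$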
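The differential operators (16.33) and (16.34) can be tried out on simple angular functions. The following is a minimal sketch, assuming $\hbar = 1$ and central finite differences with an illustrative step size; the test functions $\sin\theta\, e^{i\phi}$ and $\cos\theta$ are taken as unnormalized $l = 1$ angular functions with $m = 1$ and $m = 0$, which is an input of the sketch rather than something derived here.

```python
import numpy as np

H = 1e-5  # finite-difference step (an illustrative choice)

def d_theta(f, theta, phi):
    return (f(theta + H, phi) - f(theta - H, phi)) / (2 * H)

def d_phi(f, theta, phi):
    return (f(theta, phi + H) - f(theta, phi - H)) / (2 * H)

def L3(f, theta, phi):
    # L_3 = -i d/dphi, from (16.33) with hbar = 1
    return -1j * d_phi(f, theta, phi)

def L_plus(f, theta, phi):
    # L_+ = e^{i phi} (d/dtheta + i cot(theta) d/dphi), from (16.34)
    return np.exp(1j * phi) * (d_theta(f, theta, phi)
                               + 1j / np.tan(theta) * d_phi(f, theta, phi))

def L_minus(f, theta, phi):
    # L_- = e^{-i phi} (-d/dtheta + i cot(theta) d/dphi), from (16.34)
    return np.exp(-1j * phi) * (-d_theta(f, theta, phi)
                                + 1j / np.tan(theta) * d_phi(f, theta, phi))

def y11(theta, phi):
    return np.sin(theta) * np.exp(1j * phi)  # unnormalized, l = 1, m = 1

def y10(theta, phi):
    return np.cos(theta)                     # unnormalized, l = 1, m = 0
```

At a generic point, `L3` returns the function `y11` itself (eigenvalue $m = 1$), `L_plus` annihilates it, and `L_minus` maps it to a multiple of `y10`, as expected for operators that raise and lower $m$.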
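We want to write wave functions in radial coordinates to match the symmetry of the relevant angular momentum operators, which means that we must consider basis coordinate states of the type
$$|\vec{x}\rangle \equiv |r, \vec{n}\rangle = |r\rangle \otimes |\vec{n}\rangle, \tag{16.35}$$
where $\vec{n} = \vec{n}(\theta, \phi)$ is a unit vector defined by the angles $\theta$ and $\phi$, and corresponding factorized states
$$|\psi\rangle = |\psi_r\rangle \otimes |\psi_n\rangle. \tag{16.36}$$
Then the wave functions factorize as well, or in other words, we can consider the separation of variables in order to be able to act with $\hat{L}_i$ on them and obtain eigenvalues,
$$\langle r, \vec{n}| \hat{L}_i |\psi\rangle = \langle r|\psi_r\rangle \langle \vec{n}| \hat{L}_i |\psi_n\rangle \equiv -i\hbar\, \psi_r(r) \left(\vec{e}_\phi \frac{\partial}{\partial\theta} - \vec{e}_\theta \frac{1}{\sin\theta} \frac{\partial}{\partial\phi}\right)_i \langle \vec{n}|\psi_n\rangle. \tag{16.37}$$
Here we have defined the radial wave function
$$\langle r|\psi_r\rangle \equiv \psi_r(r). \tag{16.38}$$
We also define
$$\langle \vec{n}(\theta, \phi)|\psi_n\rangle \equiv \psi(\theta, \phi). \tag{16.39}$$
Since, as we said, we are interested in eigenfunctions of $\vec{L}$, it means that $|\psi_n\rangle$ is a state $|lm\rangle$. Moreover, we will see shortly that in fact we are interested only in $l \in \mathbb{N}$ (so $m \in \mathbb{Z}$), in which case the angular wave function is a spherical harmonic,
$$Y_{lm}(\theta, \phi) \equiv \langle \vec{n}(\theta, \phi)|lm\rangle. \tag{16.40}$$
Then, the $Y_{lm}(\theta, \phi)$ are eigenfunctions of $\vec{L}^2$ and $L_z$, namely
$$\vec{L}^2 Y_{lm}(\theta, \phi) = \hbar^2 l(l + 1) Y_{lm}(\theta, \phi)$$
$$L_z Y_{lm}(\theta, \phi) = m\hbar\, Y_{lm}(\theta, \phi). \tag{16.41}$$
But, since we saw that in spherical coordinates $L_z = -i\hbar\, \partial/\partial\phi$, we obtain that
$$Y_{lm}(\theta, \phi) = e^{im\phi} y_{lm}(\theta). \tag{16.42}$$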
We will give a better definition of these spherical harmonics later; for the moment this is all that we need to say about them.
16.4 Wave Function Transformations under Rotations
Under rotations, since the $Y_{lm}(\theta, \phi)$ describe states $|\psi\rangle \propto |lm\rangle$, they must transform under the $l$ representation of SO(3). Under $\vec{n}' = R\vec{n}$, these states transform under the rotation matrix $D(R)$,
$$|\vec{n}'\rangle = D(R) |\vec{n}\rangle, \tag{16.43}$$
and correspondingly wave functions transform with the rotation matrix in the $l$ representation, since then $\langle \vec{n}'|lm\rangle = \langle \vec{n}| D(R)^\dagger |lm\rangle \Rightarrow$
$$Y_{lm}(\vec{n}') \equiv \langle \vec{n}'|lm\rangle = \langle \vec{n}| D(R)^\dagger |lm\rangle = \sum_{m'} D^{(l)*}_{mm'}(R) \langle \vec{n}|lm'\rangle = \sum_{m'} D^{(l)*}_{mm'}(R)\, Y_{lm'}(\vec{n}). \tag{16.44}$$
We now make an important observation. Since we have found that
$$Y_{lm}(\theta, \phi) \propto e^{im\phi}, \tag{16.45}$$
it means that indeed, for periodicity of $Y_{lm}(\theta, \phi)$ we must have only $m \in \mathbb{Z}$, so that $l \in \mathbb{N}$, as we have assumed before. However, it also means that if we considered instead $m \in \mathbb{Z}/2$, for half-integer spin, in particular for $j = 1/2$, we would obtain
$$Y_{lm}(\theta, 2\pi) = e^{i\pi} Y_{lm}(\theta, 0) \Rightarrow \psi(r, \theta, 2\pi) = -\psi(r, \theta, 0), \tag{16.46}$$
and only a rotation by $4\pi$ would bring us back to the original state. Equivalently, consider the formula from Chapter 14 concerning a rotation with Euler angle $\phi$ around the third axis, corresponding to the $j = 1$ representation, i.e., the vector or fundamental representation,
$$R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{16.47}$$
for which $R_3(2\pi) = \mathbb{1}$, as expected for $j = 1$ (integer). On the other hand, the same rotation acting on the space of $j = 1/2$ states, or “spinor” representation, i.e., acting as an SU(2) matrix, gives
$$\tilde{R}_3(\phi) = \begin{pmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{pmatrix}, \tag{16.48}$$
so that
$$\tilde{R}_3(2\pi) = -\mathbb{1} \Rightarrow \tilde{R}_3(4\pi) = \mathbb{1}. \tag{16.49}$$
This condition in general gives a good definition of a spinor: it is a representation in which only a rotation by $4\pi$, but not by $2\pi$, gives back the same state. Indeed, in general we find that a rotation by $\phi$ in a representation of angular momentum $j$ has a group element
$$g \propto \exp\left(\frac{i}{\hbar} \phi J_3^{(j)}\right). \tag{16.50}$$
Thus, if $J_3$ has eigenvalue $\hbar j$, and $\phi = 2\pi$, we obtain
$$g(2\pi) = e^{2\pi i j} g(0) = (-1)^{2j} g(0). \tag{16.51}$$
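The sign $(-1)^{2j}$ in (16.51) can be seen directly from the matrices. The following is a minimal sketch, assuming $\hbar = 1$ and a basis in which $J_3$ is diagonal with entries $j, j - 1, \ldots, -j$, with the sign convention of (16.48) and (16.50); the function name is an illustrative choice.

```python
import numpy as np

def rotation_about_z(j, phi):
    """exp(i phi J_3) in the spin-j representation, J_3 = diag(j, ..., -j), hbar = 1."""
    m_values = np.arange(j, -j - 1, -1)
    return np.diag(np.exp(1j * phi * m_values))

def sign_after_full_turn(j):
    """The factor multiplying the identity in the rotation by 2 pi."""
    return rotation_about_z(j, 2 * np.pi)[0, 0].real
```

For `j = 0.5` the rotation by $2\pi$ is minus the identity and only the rotation by $4\pi$ is the identity, whereas for integer `j` the rotation by $2\pi$ is already the identity.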
We might ask: is it not contradictory to have a rotation by 2π that changes the state? The answer is that it is not, as long as the change is unobservable, which it is: we will see that a single spinor is unobservable; only two spinors are observable, and, for a state of two spinors, nothing changes. Finally, we note that rotation matrices in the j representation can be found using the Schwinger method of representing generators using oscillators. We could calculate them explicitly and obtain the so-called Wigner formula (so, in this case, the method is also known as the Wigner method), though we will not do this here.
16.5 Free Particle in Spherical Coordinates
Our first application of the formalism that we have developed will be to a free particle in spherical coordinates, as a warm-up for later, more involved, cases. We choose, instead of the $|\vec{r}\rangle = |x, y, z\rangle$ coordinate basis, the spherical coordinate basis $|r, \theta, \phi\rangle$.
Just as we had before ket states $|\vec{p}\rangle = |p_x p_y p_z\rangle$ corresponding to the eigenvalues of three commuting observables $\hat{P}_x, \hat{P}_y, \hat{P}_z$, so now we choose instead three other commuting observables, related to the rotational invariance. That means that $\vec{L}^2$ and $L_z$ are certainly involved, but we must choose a third. The simplest choice is the Hamiltonian $\hat{H}$, giving the energy $E$.
Indeed, for a free particle, the Hamiltonian is
$$\hat{H} = \frac{\hat{\vec{P}}^2}{2m}, \tag{16.52}$$
with eigenvalue
$$E = \frac{\vec{p}^2}{2m}. \tag{16.53}$$
Since
$$\hat{L}_i = \epsilon_{ijk} \hat{X}_j \hat{P}_k, \tag{16.54}$$
we can check that $[\hat{H}, \hat{L}_z] = 0$ and also $[\hat{H}, \hat{\vec{L}}^2] = 0$.
The Schrödinger equation in a coordinate basis,
$$\langle \vec{x}| \hat{H} |\psi\rangle = -\frac{\hbar^2}{2m} \vec{\nabla}^2 \psi(\vec{x}) = E\psi(\vec{x}), \tag{16.55}$$
can be rewritten in terms of $\vec{L}^2$, since (using $\sum_i \epsilon_{ijk} \epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$, which can be proven by analyzing all possibilities)
$$\hat{L}_i \hat{L}_i = \epsilon_{ijk} \epsilon_{ilm} \hat{X}_j \hat{P}_k \hat{X}_l \hat{P}_m = \hat{X}_j \hat{P}_k \hat{X}_j \hat{P}_k - \hat{X}_j \hat{P}_k \hat{X}_k \hat{P}_j$$
$$= \hat{X}_j \hat{X}_j \hat{P}_k \hat{P}_k - \hat{X}_j \hat{P}_j \hat{X}_k \hat{P}_k - i\hbar \hat{X}_j \hat{P}_j, \tag{16.56}$$
where in the second line we have used the canonical commutation relations. Then, we rewrite (16.56) in spherical coordinates,
$$\hat{L}_i \hat{L}_i = -i\hbar\, \hat{r} \hat{p}_r - \hat{r} \hat{p}_r \hat{r} \hat{p}_r + \hat{r}^2 \hat{\vec{p}}^2 = \hat{r}^2 (\hat{\vec{p}}^2 - \hat{p}_r^2), \tag{16.57}$$
where we have used $[\hat{r}, \hat{p}_r] = i\hbar$. We see that therefore
$$\hat{\vec{p}}^2 = \hat{p}_r^2 + \frac{1}{r^2} \hat{\vec{L}}^2. \tag{16.58}$$
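We can also prove this directly in the coordinate basis, since then
$$p_r = -i\hbar \frac{1}{r} \frac{\partial}{\partial r} r \tag{16.59}$$
and
$$-\frac{\vec{L}^2}{\hbar^2} = \Delta_{\theta,\phi} = \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left(\sin\theta \frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2}, \tag{16.60}$$
and finally
$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \Delta_{\theta,\phi}, \tag{16.61}$$
leading to
$$\vec{p}^2 = -\hbar^2 \Delta = -\hbar^2 \vec{\nabla}^2 = p_r^2 + \frac{\vec{L}^2}{r^2}, \tag{16.62}$$
for $r \neq 0$. The Schrödinger equation becomes
$$\frac{\vec{p}^2}{2m} \psi(r, \theta, \phi) = \frac{1}{2m} \left(p_r^2 + \frac{\vec{L}^2}{r^2}\right) \psi(r, \theta, \phi) = E\psi(r, \theta, \phi), \tag{16.63}$$
and can be solved by separation of variables, writing the wave function as a product of a radial function and a spherical harmonic,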
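$$\psi_{lm}(r, \theta, \phi) = f_l(r) Y_{lm}(\theta, \phi), \tag{16.64}$$
where the spherical harmonics are eigenfunctions of $\vec{L}^2$ and $L_z$,
$$\vec{L}^2 Y_{lm}(\theta, \phi) = l(l + 1)\hbar^2\, Y_{lm}(\theta, \phi)$$
$$L_z Y_{lm}(\theta, \phi) = m\hbar\, Y_{lm}(\theta, \phi). \tag{16.65}$$
The Schrödinger equation then reduces to an eigenvalue equation for the radial wave function $f_l(r)$,
$$\left[\frac{p_r^2}{2m} + \frac{l(l + 1)\hbar^2}{2mr^2}\right] f_l(r) = E f_l(r) \Rightarrow$$
$$-\frac{\hbar^2}{2m} \left[\left(\frac{1}{r} \frac{\partial}{\partial r} r\right)^2 - \frac{l(l + 1)}{r^2} + \frac{2mE}{\hbar^2}\right] f_l(r) = 0. \tag{16.66}$$
Defining
$$k^2 \equiv \frac{2mE}{\hbar^2} \tag{16.67}$$
and $\rho = kr$, we find the equation
$$\left[\frac{\partial^2}{\partial\rho^2} + \frac{2}{\rho} \frac{\partial}{\partial\rho} + 1 - \frac{l(l + 1)}{\rho^2}\right] f_l(\rho) = 0. \tag{16.68}$$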
This is in fact the equation for the spherical Bessel functions $j_l(\rho)$, meaning that the solution to the Schrödinger equation in spherical coordinates is
$$\langle r, \theta, \phi|lm\rangle = \psi_{lm}(r, \theta, \phi) = Y_{lm}(\theta, \phi)\, j_l(kr). \tag{16.69}$$
In fact, the solution to the Schrödinger equation in Cartesian coordinates,
$$\langle \vec{x}|\vec{k}\rangle = e^{i\vec{k}\cdot\vec{r}}, \tag{16.70}$$
can be expanded in the basis of the solutions in spherical coordinates as
$$e^{i\vec{k}\cdot\vec{r}} = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm}(k)\, Y_{lm}(\theta, \phi)\, j_l(kr). \tag{16.71}$$
We will continue this analysis in Chapter 19; for the moment we just note that the final result can be rewritten as
$$e^{ikr\cos\theta} = \sum_{l=0}^{\infty} (2l + 1)\, i^l j_l(kr) P_l(\cos\theta) = 4\pi \sum_{l=0}^{\infty} \sum_{m=-l}^{l} i^l j_l(kr)\, Y^*_{lm}(\vec{n}_1) Y_{lm}(\vec{n}_2), \tag{16.72}$$
where in the last form we used two directions $\vec{n}_1$, $\vec{n}_2$, for the spherical harmonics, that have an angle $\theta$ between them, together with the addition theorem $P_l(\cos\theta) = \frac{4\pi}{2l + 1} \sum_{m} Y^*_{lm}(\vec{n}_1) Y_{lm}(\vec{n}_2)$.
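The first form of the expansion (16.72) can be checked numerically by truncating the sum over $l$. The following is a minimal sketch, in which $j_l$ is evaluated from its standard power series and $P_l$ from the three-term recurrence; both evaluation methods, the truncation parameters and the function names are illustrative choices.

```python
import cmath
import math

def spherical_jn(l, x, terms=80):
    """Spherical Bessel function j_l(x) from its power series."""
    # j_l(x) = x^l * sum_k (-x^2/2)^k / (k! (2l + 2k + 1)!!)
    term = x ** l
    for odd in range(1, 2 * l + 2, 2):
        term /= odd  # divide by (2l + 1)!!
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2) / (k * (2 * l + 2 * k + 1))
        total += term
    return total

def legendre(l, x):
    """Legendre polynomial P_l(x) from (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def plane_wave_partial_sum(kr, cos_theta, l_max):
    """Sum over l = 0..l_max of (2l + 1) i^l j_l(kr) P_l(cos theta)."""
    return sum((2 * l + 1) * (1j ** l) * spherical_jn(l, kr) * legendre(l, cos_theta)
               for l in range(l_max + 1))
```

Comparing `plane_wave_partial_sum(kr, cos_theta, l_max)` with `cmath.exp(1j * kr * cos_theta)` shows how quickly the partial sums converge once `l_max` exceeds `kr`.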
Important Concepts to Remember
• The decomposition of products of spin $j$ representations, in particular spin 1 representations, into irreducible representations is denoted equivalently as $\sum_i \vec{1}_i = \vec{j}$, as $1 \otimes 1 \otimes \cdots \otimes 1 = \cdots \oplus (n - 1) \oplus n$, or (using the dimension of the representation) as $3 \otimes 3 \otimes \cdots \otimes 3 = \cdots \oplus (2(n - 1) + 1) \oplus (2n + 1)$, which is also true for products of numbers.
• The classical tensor transformation law under a rotation with matrix $R_{ij}$, $T_{i_1 i_2 \ldots i_n} \to T'_{i_1 i_2 \ldots i_n} = R_{i_1 i_1'} R_{i_2 i_2'} \cdots R_{i_n i_n'} T_{i_1' i_2' \ldots i_n'}$, becomes the transformation law for states $|\psi\rangle \to |\psi'\rangle \equiv D^{(j)}(R)|\psi\rangle$ and, for operators $\hat{T}_m^{(j)}$ in a representation $j \in \mathbb{N}$, $\hat{T}_m^{(j)} \to D^{\dagger(\tilde{j})}(R)\, \hat{T}_m^{(j)}\, D^{(\tilde{j})}(R) = \sum_{m'=-j}^{j} D^{(j)*}_{mm'}(R)\, \hat{T}_{m'}^{(j)}$ or $[\hat{J}_k, \hat{T}_m^{(j)}] = \hat{T}_{m'}^{(j)} \langle jm'| J_k |jm\rangle$ (for vectors, $[\hat{V}_i, \hat{J}_k]^{(j)} = i\hbar \epsilon_{ikj} (\hat{V}_j)^{(j)}$).
• The Wigner–Eckart theorem states that the matrix elements (in $|\alpha jm\rangle$) of tensor operators $\hat{T}_q^{(k)}$ are given by Clebsch–Gordan coefficients for summing $\vec{j} + \vec{k} = \vec{j}'$, times a function independent of $m$: $\langle \alpha', j'm'| \hat{T}_q^{(k)} |\alpha, jm\rangle = \langle jk; mq | jk; j'm'\rangle \frac{\langle \alpha', j' \| T^{(k)} \| \alpha, j\rangle}{\sqrt{2j + 1}}$.
• For a given representation of the SO(3) rotation group, i.e., a given integer angular momentum $\vec{L}^2$, we have $|\psi\rangle = |\psi_r\rangle \otimes |\psi_n\rangle$, or, for wave functions, $\psi(r, \theta, \phi) = \psi_r(r) Y_{lm}(\theta, \phi)$, where $Y_{lm}(\theta, \phi)$ is the spherical harmonic $e^{im\phi} y_{lm}(\theta)$.
• We see that single-valued spherical harmonics require $l \in \mathbb{N}$, whereas for half-integer $j$ we have spinors, which under a $2\pi$ rotation acquire a minus sign, $\psi(r, \theta, 2\pi) = -\psi(r, \theta, 0)$, and return to themselves under a $4\pi$ rotation.
• For a free particle, since $-\hbar^2 \Delta_{\theta,\phi} = \vec{L}^2$, we have a centrifugal potential $\hbar^2 l(l + 1)/2mr^2$, so the solution to the Schrödinger equation in spherical coordinates is $Y_{lm}(\theta, \phi) j_l(kr)$ ($j_l$ is the spherical Bessel function), and indeed $e^{i\vec{k}\cdot\vec{r}}$ decomposes in these solutions as basis functions with coefficients $a_{lm}(k)$.
Further Reading See [2], [3], [1] for more details.
Exercises
(1) Consider the symmetric traceless operator
$$\hat{X}_i \hat{X}_j - \frac{1}{3} \hat{\vec{X}}^2 \delta_{ij}, \tag{16.73}$$
where $\hat{X}_i$ corresponds to the position of a particle. Is it a tensor operator? Justify your answer with a calculation.
(2) Consider a free particle. Rewrite the matrix elements
$$\langle E', l'm'| \hat{P}_i \hat{P}_j - \frac{1}{3} \hat{\vec{P}}^2 \delta_{ij} |E, lm\rangle \tag{16.74}$$
in terms of Clebsch–Gordan coefficients and other quantities.
(3) Write down the differential equation satisfied by the reduced spherical harmonic $y_{lm}(\theta)$.
(4) Express the Schrödinger equation of the free particle in polar coordinates $(r, \theta, z)$, making use of $L_z$.
(5) Write down the relevant separation of variables for exercise 4.
(6) Consider two free particles, and write down the wave function (in coordinate space) of this system (of distinguishable particles) for the case where the two particles coincide, in terms of the total angular momentum $\vec{L}_1 + \vec{L}_2$.
(7) Find the normalization condition for orthonormal free particle solutions $A j_l(kr) Y_{lm}(\theta, \phi)$.
17
Spin and $\vec{L} + \vec{S}$

Until now, we have assumed that $\vec{J} = \vec{L}$ is an orbital angular momentum, which would mean that states belong to representations of an arbitrary $j$, depending on the wave function. This reasoning was based on the classical theory. However, in relativistic theory, SO(3) rotational invariance is enhanced to Lorentz SO(3, 1) invariance and, further, to Poincaré invariance under the group ISO(3, 1). The latter, however, leads to the existence of an intrinsic representation for an SO(3) group, called “spin”, that is associated with the type of particle, i.e., a given particle has a given spin $s$, that cannot be changed. Moreover, the statistics of identical particles, Bose–Einstein or Fermi–Dirac (to be studied further later on in the book), are associated with integer spin and half-integer spin, respectively, a statement known as the “spin–statistics theorem”, which is however beyond the scope of this book (it is a statement in quantum field theory).
Therefore, a theory of spin arises from a relativistic theory, quantum field theory. For the electron, the simplest theory arises from the relativistic Dirac equation, to be studied towards the end of the book. At this point, however, we use nonrelativistic theory, which is based on classical intuition. Therefore we can think of the spin as an intrinsic “angular momentum” related to a rotation around an axis, i.e., to some precessing “currents”. However, unlike $\vec{L}$, $\vec{S}$ is intrinsic, i.e., it is always the same for a given particle, whereas $\vec{L}$ varies.
For the electron, the spin has $s = 1/2$, so $\vec{S}^2 = \hbar^2 s(s + 1) = (3/4)\hbar^2$, which corresponds to a two-dimensional Hilbert space for the simple two-state system described towards the beginning of the book. The states, as there, are written as
$$|1/2, +1/2\rangle = |+\rangle, \quad |1/2, -1/2\rangle = |-\rangle. \tag{17.1}$$
17.1 Motivation for Spin and Interaction with Magnetic Field
To motivate the study of spin, we look at the interaction of angular momenta with a magnetic field and in particular the Stern–Gerlach experiment described in Chapter 0.
Consider the coupling of a charged particle, say an electron, to an electromagnetic field. This is done through “minimal coupling”, which replaces derivatives acting on states of the particle as follows:
$$\partial_i \to \partial_i - i\frac{q}{\hbar} A_i \Rightarrow \vec{\nabla} \to \vec{\nabla} - i\frac{q}{\hbar} \vec{A}, \tag{17.2}$$
where $\vec{A}$ is the vector potential of electromagnetism and $q$ the electric charge of the particle. Then, in quantum mechanics, the canonical momentum operator acting on a wave function is replaced as follows:
$$\hat{\vec{p}} = \frac{\hbar}{i} \vec{\nabla} \to \hat{\vec{p}} - q\vec{A}. \tag{17.3}$$
The Hamiltonian operator of a free particle therefore is also changed by the electromagnetic interaction:
$$\hat{H} = \frac{\hat{\vec{p}}^2}{2m} \to \frac{1}{2m} \left(\hat{\vec{p}} - q\hat{\vec{A}}\right)^2 = \frac{\hat{\vec{p}}^2}{2m} + \frac{q^2 \hat{\vec{A}}^2}{2m} - \frac{q}{m} \frac{\hat{\vec{p}} \cdot \hat{\vec{A}} + \hat{\vec{A}} \cdot \hat{\vec{p}}}{2}. \tag{17.4}$$
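If the electromagnetic potential $\vec{A}$ corresponds to a constant magnetic field $\vec{B}$ then in the Coulomb gauge we have
$$\vec{\nabla} \times \vec{A} = \vec{B}, \quad \vec{\nabla} \cdot \vec{A} = 0, \tag{17.5}$$
and we can choose a vector potential
$$\vec{A} = \frac{1}{2} \vec{B} \times \vec{r}. \tag{17.6}$$
Indeed, using
$$\sum_k \epsilon_{ijk} \epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}, \tag{17.7}$$
we obtain
$$\left(\vec{\nabla} \times \left(\tfrac{1}{2} \vec{B} \times \vec{r}\right)\right)_i = \frac{1}{2} \epsilon_{ijk} \epsilon_{klm} \partial_j (B_l x_m) = \frac{1}{2} (B_i \delta_{jj} - B_j \delta_{ij}) = B_i. \tag{17.8}$$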
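The result (17.8) can be confirmed numerically for an arbitrary constant field. The following is a minimal sketch, using central finite differences with an illustrative step size; the function names are hypothetical.

```python
import numpy as np

def vector_potential(B, r):
    """A = (1/2) B x r, the choice (17.6) for a constant magnetic field B."""
    return 0.5 * np.cross(B, r)

def jacobian(field, r, h=1e-5):
    """Central-difference matrix jac[i, j] = d field_i / d x_j at the point r."""
    jac = np.zeros((3, 3))
    for j in range(3):
        step = np.zeros(3)
        step[j] = h
        jac[:, j] = (field(r + step) - field(r - step)) / (2 * h)
    return jac

def curl(field, r):
    jac = jacobian(field, r)
    return np.array([jac[2, 1] - jac[1, 2],
                     jac[0, 2] - jac[2, 0],
                     jac[1, 0] - jac[0, 1]])

def divergence(field, r):
    return float(np.trace(jacobian(field, r)))
```

For any constant `B`, the curl of `vector_potential` reproduces `B` and its divergence vanishes, which are the two conditions in (17.5).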
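Then the (quantum) interaction Hamiltonian is
$$\hat{H}_{\rm int} = -\frac{q}{4m} \left[\hat{\vec{p}} \cdot (\vec{B} \times \hat{\vec{r}}) + (\vec{B} \times \hat{\vec{r}}) \cdot \hat{\vec{p}}\right] = -\frac{q}{4m} \epsilon_{ijk} \left[\hat{P}_i B_j \hat{X}_k + B_i \hat{X}_j \hat{P}_k\right] = -\frac{q}{2m} \vec{B} \cdot \hat{\vec{L}} + \text{constant}. \tag{17.9}$$
On the other hand, in general the interaction Hamiltonian for a closed current interacting with a magnetic field is
$$H_{\rm int} = -\vec{\mu} \cdot \vec{B}, \tag{17.10}$$
where the magnetic moment $\vec{\mu}$ of a loop of current $I$ surrounding a disk with area $A = \pi r^2$ and with unit vector $\vec{e}$ is
$$\vec{\mu} = I A \vec{e}. \tag{17.11}$$
The current through the loop is
$$I = \frac{qv}{2\pi r}, \tag{17.12}$$
which means that the magnetic moment is
$$\mu = \frac{qv}{2\pi r} \pi r^2 = \frac{q}{2m} mvr = \frac{q}{2m} L \Rightarrow \vec{\mu} = \frac{q}{2m} \vec{L}. \tag{17.13}$$
We see that then, substituting this $\vec{\mu}$ in the interaction Hamiltonian $-\vec{\mu} \cdot \vec{B}$, we obtain the form of the quantum Hamiltonian given in (17.9).
Moreover, the angular momentum $\vec{L}$ suffers a precession around $\vec{B}$, situated at an angle $\theta$ to $\vec{L}$, of
$$\frac{d\vec{L}}{dt} = \vec{\mu} \times \vec{B} = \frac{q}{2m} L B \sin\theta\, \vec{e}_\phi, \tag{17.14}$$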
but on the other hand we have (since $d\vec{L}$ moves on a circle of radius $L \sin\theta$)
$$\frac{|d\vec{L}|}{L \sin\theta\, dt} = \frac{d\phi}{dt} \equiv \omega, \tag{17.15}$$
the angular precession speed is therefore
$$\vec{\omega} = -\frac{q}{2m} \vec{B}. \tag{17.16}$$
The proportionality between the magnetic moment and the angular momentum in quantum mechanics leads to a quantized projection of the magnetic moment onto any direction $Oz$,
$$\vec{\mu} = \frac{q}{2m} \vec{L} \Rightarrow \mu_z = \frac{q}{2m} L_z, \tag{17.17}$$
and, since $L_z = m_l \hbar$, we obtain
$$\mu_z = \frac{e\hbar}{2m} m_l \equiv \mu_B m_l, \tag{17.18}$$
where $\mu_B$ is called the Bohr magneton, and $m_l = -l, \ldots, +l$.
Thus, when constructing an electron beam traversing a constant $\vec{B}$ field (in the $Oz$ direction) area, the electrons in the beam will have no orbital angular momentum, and yet in the Stern–Gerlach experiment we see a split of the beam into two, corresponding to a two-state system, therefore indicating that the electrons are in an angular momentum $j = 1/2$ state. This means that there must be an electron spin of $s = 1/2$, with $m_s = \pm 1/2$ and $S_z = m_s \hbar$.
Moreover, then we can measure $H_{\rm int}$ and therefore $\mu_z$, the projection of $\vec{\mu}$ along the direction of $\vec{B}$. We find
$$\mu_z = g \frac{e\hbar}{4m} (2m_s), \tag{17.19}$$
where $2m_s = \pm 1$, so that
$$\mu_z = \pm g \frac{\mu_B}{2}, \tag{17.20}$$
and we find that $g \simeq 2$. In fact, from the Dirac equation we find that $g = 2$ exactly, whereas in QED there are small corrections that take us slightly away from $g = 2$. This underscores the fact that spin is really a different type of angular momentum, unlike the orbital angular momentum. This was to be expected, since it is an intrinsic property of the particle. For particles having a different spin, such as $s = 1$ (these are called “vector particles”, photons are an example), we have different values for $g$ but again the Hilbert space (the representation) can be described in the same way as for the angular momentum states.
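To get a feeling for the numbers in (17.18) to (17.20), one can evaluate the Bohr magneton and the allowed projections $\mu_z$. The following is a minimal sketch in SI units; the constants are the CODATA 2018 values, and the function names and the default $g = 2$ are illustrative assumptions.

```python
E_CHARGE = 1.602176634e-19     # elementary charge in C (exact in SI)
HBAR = 1.054571817e-34         # reduced Planck constant in J s
M_ELECTRON = 9.1093837015e-31  # electron mass in kg (CODATA 2018)

# Bohr magneton, mu_B = e hbar / (2 m), as defined in (17.18); units of J/T.
MU_B = E_CHARGE * HBAR / (2 * M_ELECTRON)

def orbital_mu_z(l):
    """Allowed orbital projections mu_z = mu_B m_l for m_l = -l, ..., +l."""
    return [MU_B * m_l for m_l in range(-l, l + 1)]

def spin_mu_z(g=2.0):
    """Spin projections mu_z = g (mu_B / 2)(2 m_s) for m_s = -1/2, +1/2, as in (17.19)."""
    return [g * (MU_B / 2) * two_ms for two_ms in (-1, 1)]
```

With $g = 2$, the two spin projections are $\pm\mu_B$, so a spin 1/2 with no orbital angular momentum still gives two beams, while an orbital $l$ would give $2l + 1$ values.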
17.2 Spin Properties

For any particle that has both orbital angular momentum $\vec{L}$ and spin $\vec{S}$, we can consider states characterized by both, as well as by the total angular momentum $\vec{J} = \vec{L} + \vec{S}$. If we have several particles composing the system, for instance an atom, with the nucleus, composed of nucleons, and the electrons around it, we can add up the angular momenta, as follows:

$$\vec{L}_{\rm total} = \sum_i \vec{L}_i, \qquad \vec{S}_{\rm total} = \sum_i \vec{S}_i, \qquad \vec{J}_{\rm total} = \vec{L}_{\rm total} + \vec{S}_{\rm total}. \tag{17.21}$$
In the case of an atom, both the nucleons and the electrons are particles of spin $s = 1/2$, but we have several possibilities for $\vec{S}_{\rm total}$, by the general vector addition theory for angular momenta. For one particle, we can consider eigenstates of $\vec{L}^2$, $L_z$, together with $\vec{S}^2$, $S_z$, as well as other observables, with eigenvalues generically called $\alpha$:

$$|\alpha, lm\rangle \otimes |s m_s\rangle. \tag{17.22}$$

A generic state $|\psi\rangle$ can be described in a coordinate and spin basis $\langle\vec{r}, m_s|$ by a wave function

$$\langle\vec{r}\, m_s|\psi\rangle = \psi_{m_s}(\vec{r}), \quad\text{or}\quad \psi(\vec{r}, m_s). \tag{17.23}$$

In the case of $s = 1/2$, we describe the states by wave functions $\psi_\pm(\vec{r})$, corresponding to $m_s = \pm 1/2$. We can represent these wave functions only in the coordinate basis $\langle\vec{r}|$, so states in the spin Hilbert space are given by

$$\langle\vec{r}|\psi\rangle = \psi_+(\vec{r})|+\rangle + \psi_-(\vec{r})|-\rangle \quad\text{or}\quad \psi = \begin{pmatrix}\psi_+(\vec{r})\\ \psi_-(\vec{r})\end{pmatrix}, \tag{17.24}$$

which is a (two-component) spinor field. We can also represent just the spin state (in the spin Hilbert space) as

$$|\alpha\rangle = |+\rangle\langle+|\alpha\rangle + |-\rangle\langle-|\alpha\rangle \equiv C_+|+\rangle + C_-|-\rangle = \begin{pmatrix}C_+\\ C_-\end{pmatrix}, \tag{17.25}$$

where we have used a completeness relation in the spin Hilbert space,

$$|+\rangle\langle+| + |-\rangle\langle-| = 1, \tag{17.26}$$

and defined $C_\pm \equiv \langle\pm|\alpha\rangle$. Further, we can factorize the state of the particle with spin, $\langle\vec{r}|\psi\rangle$ (see (17.24)), and find

$$\psi_+(\vec{r}) = C_+\psi(\vec{r}), \qquad \psi_-(\vec{r}) = C_-\psi(\vec{r}). \tag{17.27}$$
17.3 Particle with Spin 1/2

Since the spin and orbital angular momentum components commute,

$$[\hat{L}_i, \hat{S}_j] = 0, \tag{17.28}$$

just as for $[J_{1i}, J_{2j}]$ in the general theory of addition of angular momenta, we can have simultaneous eigenstates for spin and orbital angular momentum, as we have seen already. Thus, considering a particle in a rotationally invariant system, we can use as observables $\vec{L}^2$, $L_z$, $\vec{S}^2$, $S_z$. In terms of eigenstates for these operators, we have found that we can separate variables in the wave function for the system without spin,

$$\psi(\vec{r}) \to \psi_l(r)Y_{lm}(\theta, \phi), \tag{17.29}$$

so in the presence of spin we can write states also characterized by $s$, $m_s$:

$$\psi_{lm,m_s}(r, \theta, \phi) = \psi_l(r)Y_{lm}(\theta, \phi)\begin{pmatrix}C_+\\ C_-\end{pmatrix}. \tag{17.30}$$
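To continue, we give more details on the spherical harmonics introduced in the previous chapter. As we saw, the spherical harmonics also factorize in terms of the dependences on $\theta$ and $\phi$:

$$Y_{lm}(\theta, \phi) = P_{l,m}(\cos\theta)\Phi_m(\phi), \tag{17.31}$$

where the normalized eigenfunctions for the $\phi$ dependence are

$$\Phi_m(\phi) = \frac{1}{\sqrt{2\pi}}e^{im\phi}, \tag{17.32}$$

and the $P_{l,m}$ are the associated Legendre functions of the second degree, defined for $m \geq 0$ by the formula

$$P_{l,m}(w) = (-1)^m\sqrt{\frac{(l-m)!}{(l+m)!}\,\frac{2l+1}{2}}\;\frac{1}{l!\,2^l}(1-w^2)^{m/2}\left(\frac{d}{dw}\right)^{l+m}(w^2-1)^l, \tag{17.33}$$

which take real values; for negative values of $m$ they are defined by imposing

$$Y_{l,-m}(\theta, \phi) = (-1)^m Y^*_{lm}(\theta, \phi). \tag{17.34}$$

For $m = 0$, the spherical harmonics reduce to the Legendre polynomials,

$$Y_{l,0}(\theta, \phi) = \sqrt{\frac{2l+1}{4\pi}}\,P_l(\cos\theta). \tag{17.35}$$

The Legendre polynomials are defined by the relation

$$P_l(w) = \frac{1}{l!\,2^l}\left(\frac{d}{dw}\right)^l(w^2-1)^l, \tag{17.36}$$

and satisfy the differential equation

$$(1-w^2)P_l''(w) - 2wP_l'(w) + l(l+1)P_l(w) = 0; \tag{17.37}$$

their orthogonality condition is

$$\int_{-1}^{1}dw\,P_l(w)P_{l'}(w) = \frac{2}{2l+1}\delta_{ll'}. \tag{17.38}$$

For $Y_{lm}$, when $w = \cos\theta$ the differential equation (17.37) becomes

$$\left[\frac{d^2}{d\theta^2} + \cot\theta\frac{d}{d\theta} + l(l+1)\right]P_l(\cos\theta) = 0. \tag{17.39}$$

This equation is obtained from the eigenvalue equation

$$\frac{\vec{L}^2}{\hbar^2}Y_{lm}(\theta, \phi) = l(l+1)Y_{lm}(\theta, \phi), \tag{17.40}$$

when we take into account that (using the formulas for $L_z$, $L_+$, $L_-$ from the previous chapter)

$$\begin{aligned}
\frac{\hat{\vec{L}}^2}{\hbar^2} &= \frac{1}{\hbar^2}\left[\hat{L}_3^2 + \frac{\hat{L}_+\hat{L}_- + \hat{L}_-\hat{L}_+}{2}\right]\\
&= -\frac{\partial^2}{\partial\phi^2} + \frac{1}{2}\left\{e^{i\phi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\right),\; e^{-i\phi}\left(-\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\right)\right\}\\
&= -\left[\frac{\partial^2}{\partial\theta^2} + \cot\theta\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right].
\end{aligned} \tag{17.41}$$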
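As a quick numerical sanity check of the Rodrigues formula (17.36), the differential equation (17.37), and the orthogonality condition (17.38), one can work directly with polynomial coefficients. The sketch below assumes NumPy is available; the helper names `legendre`, `ode_residual`, and `overlap` are introduced here for illustration only.

```python
from math import factorial

from numpy.polynomial import Polynomial


def legendre(l):
    """P_l(w) built from the Rodrigues formula (17.36)."""
    poly = Polynomial([-1.0, 0.0, 1.0]) ** l   # (w^2 - 1)^l
    for _ in range(l):                          # apply d/dw a total of l times
        poly = poly.deriv()
    return poly / (factorial(l) * 2 ** l)


def ode_residual(l):
    """Largest coefficient of (1 - w^2) P'' - 2 w P' + l(l+1) P, cf. (17.37)."""
    p = legendre(l)
    residual = (Polynomial([1.0, 0.0, -1.0]) * p.deriv(2)
                - Polynomial([0.0, 2.0]) * p.deriv(1)
                + l * (l + 1) * p)
    return max(abs(c) for c in residual.coef)


def overlap(l1, l2):
    """Integral of P_l1(w) P_l2(w) over [-1, 1], done exactly on the polynomial, cf. (17.38)."""
    antiderivative = (legendre(l1) * legendre(l2)).integ()
    return antiderivative(1.0) - antiderivative(-1.0)


if __name__ == "__main__":
    print("residual of (17.37) for l = 4:", ode_residual(4))
    print("overlap(2, 3):", overlap(2, 3))
    print("overlap(3, 3) vs 2/(2l+1):", overlap(3, 3), 2.0 / 7.0)
```

Working with coefficients rather than sampled values keeps the check independent of any quadrature rule; the only error left is floating-point rounding.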
Other properties of the spherical harmonics are as follows. From the completeness of the $|lm\rangle$ states in the Hilbert space for integer $l$, we find

$$\sum_{l=0}^{\infty}\sum_{m=-l}^{l}|lm\rangle\langle lm| = 1. \tag{17.42}$$

Multiplying from the left with $\langle\vec{n}|$ and from the right with $|\vec{n}'\rangle$, where $\vec{n}$ is a unit vector characterized by angles $\theta$, $\phi$, so that $|\vec{n}\rangle$ is a coordinate basis for angles, we obtain the completeness relation for spherical harmonics,

$$\sum_{l=0}^{\infty}\sum_{m=-l}^{l}Y_{lm}(\vec{n})Y^*_{lm}(\vec{n}') = \delta(\vec{n} - \vec{n}') = \delta(\cos\theta - \cos\theta')\,\delta(\phi - \phi'). \tag{17.43}$$

Also, the orthonormality of $|lm\rangle$,

$$\langle lm|l'm'\rangle = \delta_{ll'}\delta_{mm'}, \tag{17.44}$$

leads, by the insertion of the completeness relation of $|\vec{n}\rangle$,

$$\int d\Omega_n\,|\vec{n}\rangle\langle\vec{n}| = 1, \tag{17.45}$$

to the orthonormality relation for spherical harmonics,

$$\int d\Omega_n\,Y^*_{l'm'}(\vec{n})Y_{lm}(\vec{n}) = \delta_{ll'}\delta_{mm'}. \tag{17.46}$$

Besides these relations, we also have the addition formula for spherical harmonics,

$$\sum_{m=-l}^{l}\langle\vec{n}|lm\rangle\langle lm|\vec{n}'\rangle = \sum_{m=-l}^{l}Y_{lm}(\vec{n})Y^*_{lm}(\vec{n}') = \frac{2l+1}{4\pi}P_l(\vec{n}\cdot\vec{n}'). \tag{17.47}$$
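The addition formula (17.47) can be checked numerically for any pair of directions. The following Python sketch builds $Y_{lm}$ from (17.31)–(17.34) and compares both sides of (17.47); it assumes NumPy, and the function names (`rodrigues`, `sph_harm`, `addition_formula_sides`) are illustrative choices, not library routines.

```python
from math import factorial

import numpy as np
from numpy.polynomial import Polynomial


def rodrigues(l, order):
    """(d/dw)^order (w^2 - 1)^l / (l! 2^l), returned as a Polynomial in w."""
    poly = Polynomial([-1.0, 0.0, 1.0]) ** l
    for _ in range(order):
        poly = poly.deriv()
    return poly / (factorial(l) * 2 ** l)


def sph_harm(l, m, theta, phi):
    """Y_lm(theta, phi) from (17.31)-(17.33); negative m via (17.34)."""
    if m < 0:
        return (-1) ** (-m) * np.conj(sph_harm(l, -m, theta, phi))
    w = np.cos(theta)
    norm = np.sqrt(factorial(l - m) / factorial(l + m) * (2 * l + 1) / 2.0)
    p_lm = (-1) ** m * norm * (1.0 - w * w) ** (m / 2) * rodrigues(l, l + m)(w)
    return p_lm * np.exp(1j * m * phi) / np.sqrt(2.0 * np.pi)


def addition_formula_sides(l, theta1, phi1, theta2, phi2):
    """Return (left side, right side) of the addition formula (17.47)."""
    lhs = sum(sph_harm(l, m, theta1, phi1) * np.conj(sph_harm(l, m, theta2, phi2))
              for m in range(-l, l + 1))
    # Cosine of the angle between the two unit vectors n and n'.
    cos_gamma = (np.sin(theta1) * np.sin(theta2) * np.cos(phi1 - phi2)
                 + np.cos(theta1) * np.cos(theta2))
    rhs = (2 * l + 1) / (4.0 * np.pi) * rodrigues(l, l)(cos_gamma)
    return lhs, rhs


if __name__ == "__main__":
    left, right = addition_formula_sides(3, 0.7, 1.9, 2.1, -0.4)
    print("sum over m:", left)
    print("(2l+1)/(4 pi) P_l(n.n'):", right)
```

Setting $\vec{n} = \vec{n}'$ in the same function gives $\sum_m |Y_{lm}|^2 = (2l+1)/4\pi$, a useful special case.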
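Finally, the recursion relation for finding $Y_{l,m-1}$ from $Y_{l,m}$ is obtained from the recursion relation for $|lm\rangle$, where one uses $\hat{L}_-$, in the $\langle\vec{n}|$ basis, so that

$$\langle\vec{n}|l, m-1\rangle = \frac{\langle\vec{n}|\hat{L}_-/\hbar|lm\rangle}{\sqrt{(l+m)(l-m+1)}}, \tag{17.48}$$

from which we obtain the formula

$$Y_{l,m-1}(\vec{n}) = \frac{1}{\sqrt{(l+m)(l-m+1)}}\,e^{-i\phi}\left(-\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\right)Y_{lm}(\vec{n}). \tag{17.49}$$

Using it, we find the formulas already quoted, (17.31) and (17.33).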
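17.4 Rotation of Spinors with s = 1/2

We can also find the effect of rotating spinors in three-dimensional space. A rotation by an angle $\phi$ around an arbitrary direction defined by $\vec{n}$ is generated by $\vec{\sigma}\cdot\vec{n}/2$, since the quantities $\vec{\sigma}/2$ (the $\sigma_i$ are the Pauli matrices) are the generators of SO(3) in the spinor ($s = 1/2$) representation. Then the group element (acting on Hilbert spaces) is

$$g = \exp\left(-i\,\frac{\vec{\sigma}\cdot\vec{n}}{2}\,\phi\right). \tag{17.50}$$

However, since $\sigma_i^2 = 1$ and $\vec{n}^2 = 1$, it follows that also $(\vec{\sigma}\cdot\vec{n})^2 = 1$, which means that

$$(\vec{\sigma}\cdot\vec{n})^m = \begin{cases} 1, & m = 2k\\ \vec{\sigma}\cdot\vec{n}, & m = 2k+1.\end{cases} \tag{17.51}$$

Then the group element for rotations in the spin 1/2 representation becomes

$$\begin{aligned}
g = e^{-i\vec{\sigma}\cdot\vec{n}\,\phi/2} &= \sum_{m=0}^{\infty}\frac{(-i\phi/2)^m}{m!}(\vec{\sigma}\cdot\vec{n})^m = \sum_{k=0}^{\infty}\frac{(-i\phi/2)^{2k}}{(2k)!} + \vec{\sigma}\cdot\vec{n}\sum_{k=0}^{\infty}\frac{(-i\phi/2)^{2k+1}}{(2k+1)!}\\
&= \cos\frac{\phi}{2}\,\mathbb{1} - i\sin\frac{\phi}{2}\,(\vec{\sigma}\cdot\vec{n}).
\end{aligned} \tag{17.52}$$

Since $\vec{\sigma}$ is a vector operator, we have

$$\sigma_i' = g^{-1}\sigma_i g = \sum_j R_{ij}\sigma_j, \tag{17.53}$$

where $R_{ij}$ is the rotation matrix (in the vector representation). We can check this, for instance, for a rotation of $\sigma_1$ around the third direction:

$$\begin{aligned}
\sigma_1' = e^{i\sigma_3\phi/2}\sigma_1 e^{-i\sigma_3\phi/2} &= \left(\cos\frac{\phi}{2} + i\sin\frac{\phi}{2}\,\sigma_3\right)\sigma_1\left(\cos\frac{\phi}{2} - i\sin\frac{\phi}{2}\,\sigma_3\right)\\
&= \sigma_1\cos^2\frac{\phi}{2} + \sin^2\frac{\phi}{2}\,\sigma_3\sigma_1\sigma_3 + i\sin\frac{\phi}{2}\cos\frac{\phi}{2}\,[\sigma_3, \sigma_1]\\
&= \sigma_1\cos\phi - \sigma_2\sin\phi,
\end{aligned} \tag{17.54}$$

where we have used

$$\sigma_1\sigma_3 = -\sigma_3\sigma_1 = -i\sigma_2. \tag{17.55}$$

The resulting transformation is indeed a rotation of $\sigma_i$, according to the vector operator formula.

We can use the same rotation formula to construct the eigenstate of the operator $\vec{\sigma}\cdot\vec{n}$, for a general direction $\vec{n}$, with spin projection $+1/2$ onto the direction of $\vec{n}$. We start with the eigenstate of $\sigma_3$ with spin projection $+1/2$ along the third direction, $|+\rangle = \begin{pmatrix}1\\ 0\end{pmatrix}$. Then we rotate the system by an angle $\theta$ around direction 2 and after that by $\phi$ around direction 3. The resulting state is

$$\begin{aligned}
|\psi\rangle = e^{-i\sigma_3\phi/2}e^{-i\sigma_2\theta/2}|+\rangle &= \left(\cos\frac{\phi}{2} - i\sin\frac{\phi}{2}\,\sigma_3\right)\left(\cos\frac{\theta}{2} - i\sin\frac{\theta}{2}\,\sigma_2\right)\begin{pmatrix}1\\ 0\end{pmatrix}\\
&= \begin{pmatrix}\cos\frac{\theta}{2}\,e^{-i\phi/2}\\ \sin\frac{\theta}{2}\,e^{i\phi/2}\end{pmatrix}.
\end{aligned} \tag{17.56}$$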
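The statements of this section lend themselves to a direct numerical check with $2\times 2$ matrices. The sketch below (assuming NumPy; all function names are illustrative) compares the closed form (17.52) with a truncated power series of the exponential, verifies the rotation (17.54) of $\sigma_1$, and confirms that the state (17.56) is an eigenstate of $\vec{\sigma}\cdot\vec{n}$ with eigenvalue $+1$.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3 stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(n):
    """The 2x2 matrix sigma . n for a 3-vector n."""
    return np.einsum("i,ijk->jk", np.asarray(n, dtype=complex), SIGMA)


def rotation(n, angle):
    """Closed form (17.52): cos(angle/2) 1 - i sin(angle/2) sigma.n (n must be a unit vector)."""
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * sigma_dot(n)


def rotation_series(n, angle, terms=40):
    """The same group element from the truncated power series of exp(-i sigma.n angle/2)."""
    generator = -0.5j * angle * sigma_dot(n)
    total = np.eye(2, dtype=complex)
    term = np.eye(2, dtype=complex)
    for k in range(1, terms):
        term = term @ generator / k      # generator^k / k!
        total = total + term
    return total


def spin_up_along(theta, phi):
    """State (17.56): rotate |+> by theta about axis 2, then by phi about axis 3."""
    up = np.array([1, 0], dtype=complex)
    return rotation([0, 0, 1], phi) @ rotation([0, 1, 0], theta) @ up


if __name__ == "__main__":
    theta, phi = 0.8, 2.3
    n = [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
    state = spin_up_along(theta, phi)
    print("state:", state)
    print("(sigma.n) state:", sigma_dot(n) @ state)
```

Note that `rotation(n, 2 * np.pi)` equals minus the identity: a spinor changes sign under a full turn, as the half angle in (17.52) makes explicit.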
17.5 Sum of Orbital Angular Momentum and Spin, L + S

In order to sum orbital angular momentum and spin, we follow the general theory of addition of angular momenta, with

$$\vec{J} = \vec{L}\otimes\mathbb{1} + \mathbb{1}\otimes\vec{S}. \tag{17.57}$$

We can use a basis of eigenstates of commuting operators (observables) $\vec{L}^2$, $L_z$, $\vec{S}^2$, $S_z$,

$$|lm\rangle\otimes|s m_s\rangle, \tag{17.58}$$

or a new basis of eigenstates of $\vec{L}^2$, $\vec{S}^2$, $\vec{J}^2$, $J_z$, namely

$$|l s j m_j\rangle. \tag{17.59}$$

In the case of $s = 1/2$, by multiplying the state with $\langle\vec{n}|\langle\alpha|$ we obtain

$$Y_{lm}(\vec{n})\begin{pmatrix}C_+\\ C_-\end{pmatrix}, \tag{17.60}$$

leading to states $\psi_{l,1/2,j,m_j}$. Whether $|lm\,s m_s\rangle$ or $|l s j m_j\rangle$ are the more appropriate basis states depends on whether the interaction Hamiltonian $\hat{H}_{\rm int}$ has spin–orbit coupling terms $\vec{L}\cdot\vec{S}$ or not.
17.6 Time-Reversal Operator on States with Spin

Finally, we consider the effect of the time-reversal operator $T$ on particles with spin (before, we only considered its effect on spinless particles). We first note that the angular momentum, thought of as the generator of SO(3), is odd under $T$,

$$TJ_iT^{-1} = -J_i. \tag{17.61}$$

We can see that this is true by, for instance, observing that group elements $g \in$ SO(3) should commute with $T$ (rotations and time reversals are independent), even for infinitesimal $g \simeq 1 - i\vec{\alpha}\cdot\vec{J}$, so that

$$[e^{-i\vec{\alpha}\cdot\vec{J}}, T] = 0 \;\Rightarrow\; T(iJ_i)T^{-1} = iJ_i. \tag{17.62}$$

But, since $T$ is an anti-unitary operator (antilinear and unitary), $T(iJ_i)T^{-1} = -iT(J_i)T^{-1}$, giving the stated relation. Moreover, as we saw, $T$ acts on wave functions by complex conjugation. Applying this to the spherical harmonics, we obtain

$$T: Y_{lm}(\theta, \phi) \to Y^*_{lm}(\theta, \phi) = (-1)^m Y_{l,-m}(\theta, \phi). \tag{17.63}$$

This means that for orbital angular momentum states,

$$T|lm\rangle = (-1)^m|l, -m\rangle, \tag{17.64}$$

at least for integer $m$. But one can generalize to say (this involves a phase convention) that the same formula applies to eigenstates of $\vec{J} = \vec{L} + \vec{S}$,

$$T|jm\rangle = (-1)^m|j, -m\rangle, \quad m \in \mathbb{Z}. \tag{17.65}$$

Then we can generalize this further to any $j$, integer or half-integer, by writing it as

$$T|jm\rangle = i^{2m}|j, -m\rangle. \tag{17.66}$$

Applying $T$ twice gives

$$T^2|jm\rangle = \begin{cases} +|jm\rangle, & j \in \mathbb{Z}\\ -|jm\rangle, & j \in \mathbb{Z} + 1/2.\end{cases} \tag{17.67}$$

Note that this is quite unexpected, since reversing time twice gives the same system, and it is not clear why we should have a phase $\pm 1$. The above action of $T$ on a state with general $j$ can be proven more rigorously and generally, but we will not do it here.

Next, we consider the action of $T$ on spin, the intrinsic angular momentum of a particle. From the above analysis, $\vec{S}$ should also be odd under time reversal. This is consistent with the semiclassical idea of an intrinsic "spinning" around the direction of the momentum. Indeed, on reversing the direction of time, this spinning would reverse as well, thus changing the sign of the spin, $\vec{S} \to -\vec{S}$, and so $S_z \to -S_z$ for any direction $z$. Taking into account also the action of $T$ on $\vec{r}$, which is to leave it invariant (even under $T$, consistent with the idea that $\vec{r}$ is not related to time evolution), and on $\vec{p}$, $\vec{p} \to -\vec{p}$ (odd under $T$, consistent with the idea that changing the direction of time changes the direction of motion), we can conclude that the time-reversal operator corresponds to a rotation by $\pi$ around the $y$ axis (given that the $z$ axis is the direction of momentum), plus the complex conjugation operator $K_0$ (the basic antilinear operator),

$$T = e^{-i\pi S_y/\hbar}K_0. \tag{17.68}$$

Indeed we could check that the correct transformation rules are obtained, though we leave it as an exercise. Here we just point out that a rotation by $\pi$ around $y$, perpendicular to the momentum (which is in the $z$ direction), reverses any vector in the $z$ direction.
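For $s = 1/2$ the operator (17.68) can be written out explicitly and tested numerically. The sketch below assumes NumPy and uses $S_y = (\hbar/2)\sigma_y$, so that $e^{-i\pi S_y/\hbar} = -i\sigma_y$; the names `U`, `time_reverse`, and `time_reverse_operator` are illustrative. It checks that the Pauli matrices are odd under $T$ and that $T^2 = -1$ on spinors, as in (17.67). With this explicit form, $T|+\rangle = |-\rangle$ and $T|-\rangle = -|+\rangle$, which agrees with (17.66) only up to an overall phase; such a phase is a matter of convention and does not affect the checks made here.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# exp(-i pi sigma_y / 2) = cos(pi/2) 1 - i sin(pi/2) sigma_y = -i sigma_y, a real matrix.
U = -1j * SIGMA[1]


def time_reverse(spinor):
    """Apply T = U K0 to a two-component spinor: conjugate first, then rotate by pi about y."""
    return U @ np.conj(np.asarray(spinor, dtype=complex))


def time_reverse_operator(op):
    """T op T^{-1} for a 2x2 matrix op, i.e. U op^* U^{-1}."""
    return U @ np.conj(op) @ np.linalg.inv(U)


if __name__ == "__main__":
    psi = np.array([0.6, 0.8j])
    print("T psi      :", time_reverse(psi))
    print("T (T psi)  :", time_reverse(time_reverse(psi)))   # equals -psi
    for i, s in enumerate(SIGMA, start=1):
        print(f"T sigma_{i} T^-1 = -sigma_{i}:", np.allclose(time_reverse_operator(s), -s))
```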
Important Concepts to Remember

• Besides the orbital angular momentum $\vec{L}$, there is an intrinsic spin angular momentum $\vec{S}$, which comes from quantum field theory (Poincaré invariant theory), and is associated with the type of particle (thus constant for each type of particle). We have the spin–statistics theorem relating integer spin with Bose–Einstein statistics and half-integer spin with Fermi–Dirac statistics.
• A charged particle in a magnetic field has a magnetic moment $\vec{\mu}$, such that $H_{\rm int} = -\vec{\mu}\cdot\vec{B}$, and classically $\vec{\mu} = \frac{q}{2m}\vec{L}$, so quantum mechanically we expect $\mu_z = \frac{q\hbar}{2m}m_l$. For the electron, $\frac{e\hbar}{2m} = \mu_B$ is the Bohr magneton.
• For the electron, $\mu_z = g\mu_B m_s$ with $g \simeq 2$ (2 from the Dirac equation, and small corrections from QED), and for other particles we have different values of $g$ that still differ from 1.
• For a particle with spin, the state is $|\alpha lm\rangle\otimes|s m_s\rangle$, with wave function in the $|\vec{r}\, m_s\rangle$ basis $\psi_{m_s}(\vec{r}) = \psi(\vec{r}, m_s)$, or $\langle\vec{r}|\psi\rangle = \psi_+(\vec{r})|+\rangle + \psi_-(\vec{r})|-\rangle = \begin{pmatrix}\psi_+(\vec{r})\\ \psi_-(\vec{r})\end{pmatrix}$ with $\psi_+(\vec{r}) = C_+\psi(\vec{r})$, $\psi_-(\vec{r}) = C_-\psi(\vec{r})$, or otherwise $\psi_{lm m_s}(r, \theta, \phi) = \psi_l(r)Y_{lm}(\theta, \phi)\begin{pmatrix}C_+\\ C_-\end{pmatrix}$.
• The spherical harmonics are $Y_{lm}(\theta, \phi) = P_{lm}(\cos\theta)e^{im\phi}/\sqrt{2\pi}$, with $P_{lm}$ the associated Legendre functions of second degree, orthonormal and complete; the spherical harmonics satisfy the addition formula $\sum_{m=-l}^{l}Y_{lm}(\vec{n})Y^*_{lm}(\vec{n}') = \frac{2l+1}{4\pi}P_l(\vec{n}\cdot\vec{n}')$.
• A spinor (which describes a spin 1/2 particle), initially of spin $+1/2$ in the third direction, rotated to the direction $\vec{n}(\theta, \phi)$, becomes $\begin{pmatrix}\cos\frac{\theta}{2}\,e^{-i\phi/2}\\ \sin\frac{\theta}{2}\,e^{+i\phi/2}\end{pmatrix}$.
• Under time reversal angular momentum is odd, $TJ_iT^{-1} = -J_i$, and states transform as $T|jm\rangle = i^{2m}|j, -m\rangle$.
Further Reading See [2], [1], [3] for more details.
Exercises

(1) For an electron inside an atom, with angular momentum $l = 1$ (and spin 1/2), in a magnetic field, how many energy levels (from spectral lines) do we see? Assume there is a single energy level for $l$, $s$ fixed.
(2) Write down the wave function for a free oscillating neutrino (using general parameters) in Cartesian coordinates.
(3) Write down the wave function for a free oscillating neutrino (using general parameters) in spherical coordinates.
(4) Prove the addition formula for spherical harmonics, (17.47).
(5) Consider an electron (with spin 1/2) in a magnetic field $\vec{B}$. Classically, the spin has a precession motion around $\vec{B}$ (Larmor precession). Prove this. Then calculate the quantum mechanical wave function $|\psi(t)\rangle$ corresponding to this classical motion.
(6) Consider an electron with orbital angular momentum $l = 2$. Write down its possible states in the $\vec{J}$ basis.
(7) Prove that the time-reversal operator can be written as $T = e^{-i\pi S_y/\hbar}K_0$.
18 The Hydrogen Atom
In this chapter, we consider the simplest nontrivial case of a central potential, the negative 1/r potential, valid for the hydrogen atom and hydrogen-like atoms (those with a nucleus of charge +Ze and an electron of charge −e around it), one of the first and most important cases to be analyzed by quantum mechanics. It is also a very simple system, forming a central-potential substitute for the one-dimensional harmonic oscillator. This system teaches most of the important methods for solving a quantum system.
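18.1 Two-Body Problem: Reducing to Central Potential

The first thing we will show is that for a two-body problem with a two-body potential we can factorize out the center of mass behavior, and we are then left with a reduced system in a central potential, as in classical mechanics. We will do this for a general two-body (central) potential $V(|\vec{r}_{ij}|)$. The quantum two-body Hamiltonian acting on a wave function is

$$\hat{H} = -\frac{\hbar^2}{2m_1}\vec{\nabla}_1^2 - \frac{\hbar^2}{2m_2}\vec{\nabla}_2^2 + V(|\vec{r}_1 - \vec{r}_2|). \tag{18.1}$$

Defining

$$\vec{r} \equiv \vec{r}_1 - \vec{r}_2, \qquad \vec{R} \equiv \frac{m_1\vec{r}_1 + m_2\vec{r}_2}{m_1 + m_2}, \qquad M \equiv m_1 + m_2, \qquad \frac{1}{\mu} \equiv \frac{1}{m_1} + \frac{1}{m_2}, \tag{18.2}$$

where $\mu$ is known as the reduced mass, we obtain

$$\partial_i^{(1,2)} = (\partial_i^{(1,2)}r_j)\,\partial_{r_j} + (\partial_i^{(1,2)}R_j)\,\partial_{R_j} = \pm\partial_{r_i} + \frac{m_{1,2}}{M}\,\partial_{R_i}, \tag{18.3}$$

leading to an alternative split of the two-body kinetic terms, into center of mass and relative motion terms:

$$\begin{aligned}
\frac{1}{m_1}\vec{\nabla}_1^2 + \frac{1}{m_2}\vec{\nabla}_2^2 &= \frac{1}{m_1}\left(\vec{\nabla}_r + \frac{m_1}{M}\vec{\nabla}_R\right)^2 + \frac{1}{m_2}\left(\vec{\nabla}_r - \frac{m_2}{M}\vec{\nabla}_R\right)^2\\
&= \frac{1}{\mu}\vec{\nabla}_r^2 + \frac{1}{M}\vec{\nabla}_R^2.
\end{aligned} \tag{18.4}$$

Then the Schrödinger equation reduces to

$$\left[-\frac{\hbar^2}{2M}\vec{\nabla}_R^2 - \frac{\hbar^2}{2\mu}\vec{\nabla}_r^2 + V(r)\right]\psi(\vec{r}, \vec{R}) = E\psi(\vec{r}, \vec{R}). \tag{18.5}$$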
We can separate the variables so that the wave function becomes a function of the center of mass position $\vec{R}$ times a function of the relative position $\vec{r}$,

$$\psi(\vec{r}, \vec{R}) = \psi(\vec{r})\phi(\vec{R}). \tag{18.6}$$

Then the Schrödinger equation splits into a part depending only on $\vec{r}$, plus a part depending only on $\vec{R}$:

$$\left[-\frac{\hbar^2}{2M}\vec{\nabla}_R^2\phi(\vec{R}) - E_{\rm CM}\phi(\vec{R})\right]\psi(\vec{r}) + \left[-\frac{\hbar^2}{2\mu}\vec{\nabla}_r^2\psi(\vec{r}) + V(r)\psi(\vec{r})\right]\phi(\vec{R}) = (E - E_{\rm CM})\psi(\vec{r})\phi(\vec{R}). \tag{18.7}$$

This allows us to solve the two parts independently:

$$\begin{aligned}
-\frac{\hbar^2}{2M}\vec{\nabla}_R^2\phi(\vec{R}) &= E_{\rm CM}\phi(\vec{R}),\\
\left[-\frac{\hbar^2}{2\mu}\vec{\nabla}_r^2 + V(r)\right]\psi(\vec{r}) &= (E - E_{\rm CM})\psi(\vec{r}).
\end{aligned} \tag{18.8}$$

The first equation corresponds to the center of mass motion, and is of free particle type. The second equation is the equation for the relative motion, with relative energy $E_{\rm rel} = E - E_{\rm CM}$, with a central potential $V(r)$.
18.2 Hydrogenoid Atom: Set-Up of Problem

Consider a "hydrogenoid" atom, with a negative Coulomb potential corresponding to the interaction of a nucleus of charge $Q = Ze$ and an electron of charge $q = -e$ around it. The reduced mass $\mu$ is approximately equal to the electron mass $m_e$, since $m_e \ll m_n$ (with $m_n$ the nucleus mass). Then

$$V(r) = -\frac{|Q||q|}{4\pi\epsilon_0}\frac{1}{r} \equiv -\frac{\tilde{Q}^2}{r}. \tag{18.9}$$

In the Schrödinger equation for this relative motion with central potential $V(r)$,

$$\left[-\frac{\hbar^2}{2\mu}\vec{\nabla}_r^2 + V(r)\right]\psi(\vec{r}) = E_{\rm rel}\psi(\vec{r}), \tag{18.10}$$

where from now on we will replace $E_{\rm rel}$ with $E$, we can separate variables further. First, we consider the equation in spherical coordinates, $\psi(\vec{r}) \to \psi(r, \theta, \phi)$, and then, remembering that

$$\vec{\nabla}_r^2 = \Delta_r = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Delta_{\theta,\phi}, \tag{18.11}$$

where the angular Laplacian is given by the square of the angular momentum,

$$\Delta_{\theta,\phi} = -\frac{\vec{L}^2}{\hbar^2}, \tag{18.12}$$

we write

$$\psi(r, \theta, \phi) = R(r)Y_{lm}(\theta, \phi). \tag{18.13}$$
The spherical harmonics $Y_{lm}(\theta, \phi)$ are eigenfunctions of the angular Laplacian $\Delta_{\theta,\phi}$, or the angular momentum squared,

$$\Delta_{\theta,\phi}Y_{lm}(\theta, \phi) = -\frac{\vec{L}^2}{\hbar^2}Y_{lm}(\theta, \phi) = -l(l+1)Y_{lm}(\theta, \phi), \tag{18.14}$$

implying that the radial function $R(r)$ satisfies the following radial equation:

$$\left[-\frac{\hbar^2}{2\mu}\left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{l(l+1)}{r^2}\right) + V(r)\right]R(r) = ER(r). \tag{18.15}$$

The solution of this equation will depend on the parameters $E$ and $l$ only, so

$$R(r) = R_{E,l}(r) \;\Rightarrow\; \psi = \psi_{Elm}(r, \theta, \phi) = R_{El}(r)Y_{lm}(\theta, \phi). \tag{18.16}$$

We see that, with respect to the Schrödinger equation in one dimension, in the radial equation for $R(r)$ we have a term with a first derivative, $r^{-1}d/dr$. But we can reduce (18.15) to the same type of problem as in one dimension by redefining the radial wave function as

$$R(r) = \frac{\chi(r)}{r}, \tag{18.17}$$

which implies that

$$\frac{d^2\chi(r)}{dr^2} + \frac{2\mu}{\hbar^2}\left[E - V(r) - \frac{\hbar^2}{2\mu}\frac{l(l+1)}{r^2}\right]\chi(r) = 0; \tag{18.18}$$

this is indeed of the form of the one-dimensional Schrödinger equation, with an "effective potential"

$$V_{\rm eff}(r) = V(r) + \frac{\hbar^2}{2\mu}\frac{l(l+1)}{r^2}. \tag{18.19}$$

The only difference is that here $r \geq 0$, unlike in one dimension. In our particular case the effective potential is

$$V_{\rm eff}(r) = -\frac{\tilde{Q}^2}{r} + \frac{\hbar^2}{2\mu}\frac{l(l+1)}{r^2}, \tag{18.20}$$

and at $r \to 0$ we have $V_{\rm eff}(r) \to +\infty$ (for $l > 0$), so it is as though we have an infinite "wall" at $r = 0$, allowing us to have a complete analogy with the one-dimensional system. The potential is of the type depicted in Figs. 19.1b and 19.2a, with a negative minimum at a positive value for $r$,

$$r_{\rm min} = \frac{l(l+1)\hbar^2}{\tilde{Q}^2\mu}; \tag{18.21}$$

the potential goes to zero at infinity, from below.
Boundary Conditions

Since at $r = 0$ we have an infinite "wall" for the motion, in $\chi(r)$, as in the one-dimensional case, we have the boundary condition

$$\chi(r = 0) = 0. \tag{18.22}$$

At $r \to \infty$, we must have at least $\chi(r \to \infty) \to 0$. Actually, though, due to the condition of (three-dimensional) normalizability, we have

$$\int_0^\infty |\psi|^2 r^2\,dr\,d\Omega < \infty \;\Rightarrow\; \int_0^\infty |\chi(r)|^2\,dr < \infty, \tag{18.23}$$

which means that at infinity the "one-dimensional wave function" $\chi(r)$ must decrease faster than $1/\sqrt{r}$,

$$|\chi(r)| < \frac{1}{\sqrt{r}}. \tag{18.24}$$

Case 1: E > 0

In this case, since for $r \geq r_0$ we have $V(r) < 0$, then, according to the general analysis of one-dimensional systems, the state behaves at infinity sinusoidally,

$$\chi(r) \sim e^{\pm ikr}, \quad r \to \infty, \tag{18.25}$$

since $V(r) \simeq 0$ there, and thus

$$\frac{\hbar^2k^2}{2\mu} = E - V(r) \simeq E. \tag{18.26}$$

This sinusoidal behavior is not square integrable at infinity, as in the one-dimensional case, and corresponds to scattering states, to be studied in the second part of the book.

Case 2: E < 0

This is the more interesting case for us at present. By the general one-dimensional analysis we expect to find bound states, valid for a set of discrete values for the energy, and so for a quantization condition, obtained from the vanishing of a coefficient $A$ of a possible solution:

$$A = 0 \;\Rightarrow\; E = E_n. \tag{18.27}$$
18.3 Solution: Sommerfeld Polynomial Method

To solve the radial equation, we use a variant of the method used for the one-dimensional harmonic oscillator. First, we rescale the equation, both variables and parameters, in order to have a dimensionless equation. Define

$$\rho = 2\kappa r, \qquad \kappa \equiv \sqrt{\frac{2\mu|E|}{\hbar^2}}, \qquad \lambda \equiv \tilde{Q}^2\sqrt{\frac{\mu}{2|E|\hbar^2}}, \tag{18.28}$$

so that the equation becomes (note that now we set $E = -|E|$)

$$\frac{d^2\chi(\rho)}{d\rho^2} - \frac{l(l+1)}{\rho^2}\chi + \left(\frac{\lambda}{\rho} - \frac{1}{4}\right)\chi = 0. \tag{18.29}$$

In order to solve this equation, we first find the behaviors at $r \to 0$ and $r \to \infty$ of the solution, and then factor them out, leaving us with an easier (and hopefully recognizable) equation. At $r \to 0$, writing

$$\chi \sim A\rho^\alpha, \tag{18.30}$$

we find

$$\rho^{\alpha-2}[\alpha(\alpha-1) - l(l+1)] - \frac{1}{4}\rho^\alpha + \lambda\rho^{\alpha-1} = 0. \tag{18.31}$$

Setting to zero the coefficient of the leading power in order to solve the equation for the leading behavior, we find

$$\alpha(\alpha-1) = l(l+1), \tag{18.32}$$

which has two solutions,

$$\alpha = l+1 \;\text{ or }\; -l \;\Rightarrow\; \chi \sim \rho^{l+1} \;\text{ or }\; \rho^{-l}. \tag{18.33}$$

However, the behavior $\chi \sim \rho^{-l}$ is not normalizable at zero, so this solution is excluded. At $r \to \infty$, from the general theory, since $V(r \to \infty) \to 0$ the solution is exponential with parameter $\kappa$,

$$\chi(\rho) \sim Be^{\pm\kappa r} = Be^{\pm\rho/2}. \tag{18.34}$$

However, as in the one-dimensional case, the positive exponential is not normalizable but the negative one, $\sim e^{-\rho/2}$, is normalizable. According to the general procedure outlined earlier, first we factorize this behavior at infinity by redefining the radial wave function with it:

$$\chi(\rho) = e^{-\rho/2}H(\rho). \tag{18.35}$$

Then we have

$$\begin{aligned}
\chi' &= -\frac{1}{2}e^{-\rho/2}H + e^{-\rho/2}H' \;\Rightarrow\\
\chi'' &= e^{-\rho/2}\left(H'' - H' + \frac{1}{4}H\right),
\end{aligned} \tag{18.36}$$

so the radial equation becomes

$$H'' - H' - \left[\frac{l(l+1)}{\rho^2} - \frac{\lambda}{\rho}\right]H = 0. \tag{18.37}$$

We next factorize the behavior at $r \to 0$, again redefining the radial wave function:

$$H(\rho) = \rho^{l+1}F(\rho). \tag{18.38}$$

Since

$$\begin{aligned}
H' &= (l+1)\rho^lF + \rho^{l+1}F',\\
H'' &= l(l+1)\rho^{l-1}F + 2(l+1)\rho^lF' + \rho^{l+1}F'',
\end{aligned} \tag{18.39}$$

the radial equation becomes finally

$$F'' + \left(\frac{2l+2}{\rho} - 1\right)F' - (l+1-\lambda)\frac{F}{\rho} = 0. \tag{18.40}$$
But this is the equation for the confluent hypergeometric function ${}_1F_1(a, b; z)$, namely

$$F'' + \left(\frac{b}{z} - 1\right)F' - \frac{a}{z}F = 0, \tag{18.41}$$

with general solution

$$F = A\,{}_1F_1(a, b; z) + Bz^{1-b}\,{}_1F_1(a-b+1, 2-b; z). \tag{18.42}$$

In our case, since $a = l+1-\lambda$ and $b = 2l+2$, the general solution for the function $F$ in the radial equation is

$$F = A\,{}_1F_1(l+1-\lambda, 2l+2; \rho) + B\rho^{-2l-1}\,{}_1F_1(-\lambda-l, -2l; \rho). \tag{18.43}$$

However, since ${}_1F_1$ equals 1 at $z = \rho = 0$, the solution with coefficient $B$ has the wrong behavior at $\rho = 0$, $F \sim \rho^{-2l-1}$, so that $\chi \sim \rho^{l+1}\rho^{-2l-1} = \rho^{-l}$, which has already been excluded on physical (boundary condition) grounds. That means that only the solution with coefficient $A$ is good at $\rho = 0$. On the other hand, at $\rho \to \infty$,

$${}_1F_1(a, b; \rho) \sim \alpha\rho^n + \beta e^{+\rho}, \tag{18.44}$$

so that the leading behavior of $\chi$ is $\chi \sim \alpha e^{-\rho/2} + \beta e^{\rho/2}$. Again, we know that the behavior with $e^{+\rho/2}$ is not normalizable, so it must be excluded. That imposes a condition $\beta = 0$, which is a (quantization) condition on the parameters $a$, $b$ (and so on $l$ and on $\lambda$, and therefore on the energy $E$) of the function ${}_1F_1(a, b; \rho)$.

The confluent hypergeometric function can be defined as an infinite series,

$${}_1F_1(a, b; \rho) = \sum_{k=0}^{\infty}\frac{1}{k!}\,\frac{a(a+1)\cdots(a+k-1)}{b(b+1)\cdots(b+k-1)}\,\rho^k. \tag{18.45}$$

We can check this by writing the solution to this equation as a series in $z$ (or $\rho$) and then finding a recursion relation for the coefficients from the equation. Thus, write

$$F = \sum_{k=0}^{\infty}C_k\rho^k, \tag{18.46}$$

and substitute in the defining equation, to find

$$\sum_{k=0}^{\infty}\left[k(k-1)C_k\rho^{k-2} + bkC_k\rho^{k-2} - kC_k\rho^{k-1} - aC_k\rho^{k-1}\right] = 0. \tag{18.47}$$
We can rewrite this by redefining the sums to have the same $\rho^{k-1}$ factor in all terms, in other words, redefine $k$ as $k+1$ in the first two terms in the sum. This only affects the $k = 0$ term, which becomes $k = 1$, but then the new $k = 0$ term still vanishes owing to the $k$ prefactor. Thus we get

$$\sum_{k=0}^{\infty}\rho^{k-1}\left[C_{k+1}(k+1)(k+b) - C_k(a+k)\right] = 0. \tag{18.48}$$

This implies the recursion relation

$$\frac{C_{k+1}}{C_k} = \frac{a+k}{(k+1)(k+b)}, \tag{18.49}$$

which is indeed satisfied by (18.45).
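The recursion relation (18.49) is easy to iterate, which makes the termination of the series for negative integer $a$ very concrete. The following Python sketch uses exact rational arithmetic from the standard library and assumes integer $a$ and $b$; the function name `series_coefficients` is illustrative.

```python
from fractions import Fraction


def series_coefficients(a, b, kmax):
    """Return [C_0, ..., C_kmax] of 1F1(a, b; rho) from the recursion (18.49), with C_0 = 1.

    a and b are assumed to be integers (with b > 0), so that the arithmetic stays exact.
    """
    coeffs = [Fraction(1)]
    for k in range(kmax):
        coeffs.append(coeffs[-1] * Fraction(a + k, (k + 1) * (k + b)))
    return coeffs


if __name__ == "__main__":
    # a = -2: the factor (a + k) vanishes at k = 2, so C_3 = C_4 = ... = 0 (a polynomial).
    print(series_coefficients(-2, 2, 5))
    # a = b = 1: C_k = 1/k!, i.e. the series of exp(rho), which never terminates.
    print(series_coefficients(1, 1, 5))
```

The second call illustrates the generic case: without termination the coefficients fall off like $1/k!$, which is the exponential growth $e^{+\rho}$ appearing in (18.44).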
18.4 Confluent Hypergeometric Function and Quantization of Energy

Since (18.45) is written as an infinite sum of powers of $\rho$, the only way to avoid an exponential blow-up, and thus to put $\beta = 0$ in (18.44), is for the infinite series to terminate at a given $k = n_r \in \mathbb{N}$, so that $C_{n_r+1}/C_{n_r} = 0$. Then ${}_1F_1$ becomes a polynomial of order $n_r$ for

$$a + n_r = 0. \tag{18.50}$$

This is a quantization condition, which implies that

$$a = l+1-\lambda = -n_r \;\Rightarrow\; \lambda = n_r + l + 1 \equiv n = \tilde{Q}^2\sqrt{\frac{\mu}{2|E|\hbar^2}} = \frac{Ze^2}{4\pi\epsilon_0}\sqrt{\frac{\mu}{2|E|\hbar^2}}, \tag{18.51}$$

where we have defined the total energy quantum number $n = n_r + l + 1$, with $n_r$ the radial quantum number, which takes values $0, 1, 2, \ldots$ That means that the energy levels are quantized, as

$$E_n = -\frac{(Ze^2/4\pi\epsilon_0)^2\mu}{2\hbar^2n^2} = -\mu c^2\frac{(\alpha Z)^2}{2n^2}, \tag{18.52}$$

where

$$\alpha \equiv \frac{e^2}{4\pi\epsilon_0\hbar c} \tag{18.53}$$

is the fine structure constant, approximately equal to 1/137. Note that, since $l = 0, 1, 2, \ldots$ and $n_r = 0, 1, 2, \ldots$, it follows that $n = 1, 2, 3, \ldots$ Conversely, at fixed $n$, since $l = n - n_r - 1$, we have $l = 0, 1, 2, \ldots, n-1$. We can also rewrite (18.52) as

$$E_n = -\frac{Z^2e_0^2}{2a_0}\frac{1}{n^2}, \tag{18.54}$$

where

$$e_0^2 \equiv \frac{e^2}{4\pi\epsilon_0} \tag{18.55}$$

and we have defined the Bohr radius

$$a_0 \equiv \frac{4\pi\epsilon_0\hbar^2}{e^2\mu} = \frac{\hbar^2}{\mu e_0^2}. \tag{18.56}$$

The energy can be put into Rydberg's form, i.e., in terms of the Rydberg constant $R$,

$$E = -\frac{Z^2R}{n^2} \;\Rightarrow\; R = \frac{\mu e_0^4}{2\hbar^2}. \tag{18.57}$$
Then we have also (see (18.28))

$$\kappa_n = \sqrt{\frac{2\mu|E_n|}{\hbar^2}} = \frac{Z}{na_0}, \tag{18.58}$$

and the minimum of the effective potential is

$$r_{\rm min} = \frac{l(l+1)\hbar^2}{Ze_0^2\mu} = \frac{l(l+1)}{Z}a_0 \propto a_0. \tag{18.59}$$

Therefore we can write the radial wave function as

$$R_{n,l}(r) = N_{n,l}\,e^{-\kappa_nr}(2\kappa_nr)^l\,{}_1F_1(-n+l+1, 2l+2; 2\kappa_nr), \tag{18.60}$$

where the normalization constant $N_{n,l}$ is found to be (by integrating over the square of the wave function)

$$N_{n,l} = \frac{1}{(2l+1)!}\sqrt{\frac{(n+l)!}{(n-l-1)!\,2n}}\,(2\kappa_n)^{3/2}. \tag{18.61}$$
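To connect the formulas of this section with familiar numbers, one can evaluate the Bohr radius (18.56), the energy levels (18.52) and (18.54), and $\kappa_n$ from (18.58) for hydrogen. The Python sketch below is only illustrative: the physical constants are CODATA 2018 values supplied here as an assumption (the text does not quote them), and the function names are ours.

```python
import math

# CODATA 2018 values in SI units (assumed inputs, not taken from the text).
HBAR = 1.054571817e-34          # J s
E_CHARGE = 1.602176634e-19      # C
EPSILON_0 = 8.8541878128e-12    # F/m
C_LIGHT = 299792458.0           # m/s
M_ELECTRON = 9.1093837015e-31   # kg
M_PROTON = 1.67262192369e-27    # kg

E0_SQUARED = E_CHARGE ** 2 / (4 * math.pi * EPSILON_0)   # e_0^2 of (18.55)


def reduced_mass(m1, m2):
    """1/mu = 1/m1 + 1/m2, cf. (18.2)."""
    return m1 * m2 / (m1 + m2)


MU_HYDROGEN = reduced_mass(M_ELECTRON, M_PROTON)


def bohr_radius(mu=MU_HYDROGEN):
    """a_0 = hbar^2 / (mu e_0^2), cf. (18.56), in meters."""
    return HBAR ** 2 / (mu * E0_SQUARED)


def energy_level(n, Z=1, mu=MU_HYDROGEN):
    """E_n = -Z^2 e_0^2 / (2 a_0 n^2), cf. (18.54), in joules."""
    return -Z ** 2 * E0_SQUARED / (2 * bohr_radius(mu) * n ** 2)


def energy_level_alpha(n, Z=1, mu=MU_HYDROGEN):
    """E_n = -mu c^2 (alpha Z)^2 / (2 n^2), cf. (18.52), in joules."""
    alpha = E0_SQUARED / (HBAR * C_LIGHT)   # fine structure constant (18.53)
    return -mu * C_LIGHT ** 2 * (alpha * Z) ** 2 / (2 * n ** 2)


def kappa(n, Z=1, mu=MU_HYDROGEN):
    """kappa_n = Z / (n a_0), cf. (18.58), in 1/m."""
    return Z / (n * bohr_radius(mu))


def to_ev(joules):
    return joules / E_CHARGE


if __name__ == "__main__":
    print("a_0 [m]:", bohr_radius())
    for n in (1, 2, 3):
        print(n, "E_n [eV]:", to_ev(energy_level(n)))
```

With these inputs the ground state energy comes out close to $-13.6$ eV; replacing `MU_HYDROGEN` by `M_ELECTRON` shows the small shift due to the finite nuclear mass.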
18.5 Orthogonal Polynomials and Standard Averages over Wave Functions

When the confluent hypergeometric function terminates at a finite order $n$ it becomes a polynomial, and since the constituent functions are orthonormal, it is an orthogonal polynomial. In fact, it is a classical orthogonal polynomial, the Laguerre polynomial $L_n^b(z)$ (defined in Chapter 8), up to a constant,

$${}_1F_1(-n, b+1; z) = \frac{n!\,\Gamma(b+1)}{\Gamma(b+n+1)}L_n^b(z). \tag{18.62}$$

The Laguerre polynomials for integer parameters can be defined by

$$\begin{aligned}
L_m^0(z) &= \frac{e^z}{m!}\frac{d^m}{dz^m}(e^{-z}z^m),\\
L_m^k(z) &= (-1)^k\frac{d^k}{dz^k}L_{m+k}^0(z),
\end{aligned} \tag{18.63}$$

and obey the orthonormality condition

$$\int_0^\infty e^{-z}z^kL_m^k(z)L_n^k(z)\,dz = \frac{(m+k)!}{m!}\delta_{mn}. \tag{18.64}$$

Thus the radial wave function can be written in terms of the Laguerre polynomials:

$$R_{n,l} = \tilde{N}_{n,l}\,e^{-\kappa_nr}(2\kappa_nr)^lL_{n-l-1}^{2l+1}(2\kappa_nr), \tag{18.65}$$

where the new normalization factor is

$$\tilde{N}_{n,l} = \sqrt{\frac{(n-l-1)!}{2n\,(n+l)!}}\,(2\kappa_n)^{3/2}. \tag{18.66}$$

We can use the square of the radial wave function, giving the probability as a function of radius, to calculate average powers of the radius:

$$\langle r^k\rangle = \int_0^\infty r^2dr\,r^kR_{n,l}(r)^2, \tag{18.67}$$
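to obtain

$$\begin{aligned}
\langle r\rangle_{nl} &= \frac{a_0}{2Z}[3n^2 - l(l+1)],\\
\langle r^2\rangle_{nl} &= \frac{a_0^2}{2Z^2}n^2[5n^2 + 1 - 3l(l+1)],\\
\left\langle\frac{1}{r}\right\rangle_{nl} &= \frac{1}{n^2}\frac{Z}{a_0}.
\end{aligned} \tag{18.68}$$

We see that the average position of the electron goes as $a_0n^2/Z$, increasing as $n$ becomes larger, so in the higher energy levels (higher $n$), the electron is, on average, further away from the nucleus. Also, the average potential energy is

$$\langle V\rangle_{nl} = -\tilde{Q}^2\left\langle\frac{1}{r}\right\rangle_{nl} = -\frac{\tilde{Q}^2Z}{n^2a_0} = 2E_n, \tag{18.69}$$

as is somewhat expected from the equipartition of energy.

For the maximum value of $l$ for fixed $n$, $l_{\rm max} = n-1$, the classical orbit would be the least eccentric. In this case, we find

$$\begin{aligned}
\langle r^2\rangle_{n,n-1} &= \frac{a_0^2}{Z^2}n^2(n+1/2)(n+1),\\
\langle r\rangle_{n,n-1} &= \frac{a_0}{Z}(n^2 + n/2),
\end{aligned} \tag{18.70}$$

implying that the relative error in the position vanishes at large $n$,

$$\Delta r \equiv \sqrt{\langle r^2\rangle - \langle r\rangle^2} = \frac{na_0}{2Z}\sqrt{2n+1} = \frac{\langle r\rangle_{n,n-1}}{\sqrt{2n+1}} \;\Rightarrow\; \frac{\Delta r}{\langle r\rangle} \to 0 \;\text{ as }\; n \to \infty, \tag{18.71}$$

so the electron becomes more classical in this limit.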
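The averages in (18.68) can be verified exactly, without knowing the normalization constant, by taking ratios of integrals. In units where $a_0/Z = 1$ we have $\kappa_n = 1/n$ and $\rho = 2r/n$, so that $R_{n,l}^2 \propto e^{-\rho}\rho^{2l}F(\rho)^2$ with $F$ the terminating series of ${}_1F_1(-n+l+1, 2l+2; \rho)$, and every integral reduces to $\int_0^\infty\rho^pe^{-\rho}d\rho = p!$. The Python sketch below follows this route with exact fractions; the function names are illustrative, and the choice of units is an assumption made for convenience.

```python
from fractions import Fraction
from math import factorial


def radial_polynomial(n, l):
    """Coefficients of 1F1(-n+l+1, 2l+2; rho), generated by the recursion (18.49)."""
    a, b = -n + l + 1, 2 * l + 2
    coeffs = [Fraction(1)]
    for k in range(n - l - 1):
        coeffs.append(coeffs[-1] * Fraction(a + k, (k + 1) * (k + b)))
    return coeffs


def weighted_integral(coeffs, power):
    """Integral over [0, inf) of rho^power e^{-rho} F(rho)^2, using int rho^p e^{-rho} = p!."""
    return sum(ci * cj * factorial(power + i + j)
               for i, ci in enumerate(coeffs)
               for j, cj in enumerate(coeffs))


def radial_moment(n, l, k):
    """<r^k> for the state (n, l) in units of (a_0/Z)^k; needs 2l + 2 + k >= 0."""
    coeffs = radial_polynomial(n, l)
    ratio = (weighted_integral(coeffs, 2 * l + 2 + k)
             / weighted_integral(coeffs, 2 * l + 2))
    return Fraction(n, 2) ** k * ratio        # r = n rho / 2


if __name__ == "__main__":
    for n, l in [(1, 0), (2, 0), (2, 1), (3, 1)]:
        print((n, l), "<r> =", radial_moment(n, l, 1),
              " <r^2> =", radial_moment(n, l, 2),
              " <1/r> =", radial_moment(n, l, -1))
```

Because the arithmetic is exact, the output can be compared with (18.68) as equalities of rational numbers rather than to within a numerical tolerance.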
Important Concepts to Remember

• The hydrogen, or hydrogenoid, atom is a very important and simple case that teaches most of the methods involved in solving a quantum system in three spatial dimensions.
• We can reduce the motion of such an atom, from the motion of the nucleus of charge $Ze$ and the electron of charge $-e$, to a free center of mass motion and a relative motion in the central potential, with reduced mass $\mu$.
• Separating variables so that $\psi_{nlm}(r, \theta, \phi) = R_{nl}(r)Y_{lm}(\theta, \phi)$, and with $R(r) = \chi(r)/r$, we get a one-dimensional Schrödinger equation for $\chi(r)$ in an effective potential $V_{\rm eff}(r) = V(r) + \frac{\hbar^2}{2\mu}\frac{l(l+1)}{r^2}$.
• The effective potential has a wall at $r = 0$, goes to zero at infinity, and has a minimum at $r_{\rm min} > 0$, leading to bound states for $E < 0$ and scattering states for $E > 0$.
• The solution of the Schrödinger equation is found by the Sommerfeld polynomial method: we factor out the behavior at zero and at infinity, and thus obtain an equation for a confluent hypergeometric function ${}_1F_1(a, b; z)$, with good behavior at zero; however the behavior at infinity has two components, a polynomial and a rising exponential (giving normalizable and nonnormalizable behaviors for the wave function).
• Then we find a quantization condition on the parameters $a$, $b$ by imposing that the wave function at infinity is normalizable, or equivalently, that ${}_1F_1$ reduces to a polynomial (so in the recursion condition for coefficients we have $C_{k+1}/C_k = 0$ at some finite $k$); this turns out to be a Laguerre classical orthogonal polynomial.
• The resulting quantization condition for the energy is $E_n = -\mu c^2\frac{(\alpha Z)^2}{2n^2} = -\frac{Z^2e_0^2}{2a_0n^2} = -\frac{RZ^2}{n^2}$, with $a_0$ the Bohr radius, $R$ the Rydberg constant, and $n = l + n_r + 1$, $n_r = 0, 1, 2, \ldots$ so $n = 1, 2, \ldots$
• The average position is $\langle r\rangle = \frac{3}{2}\frac{a_0}{Z}[n^2 - l(l+1)/3]$, and so increases for higher $n$, while $\Delta r/\langle r\rangle \to 0$ for $n \to \infty$.
Further Reading See [2] for more details.
Exercises

(1) Consider the three-body problem, with the same two-body potential $V_{ij}(r_{ij})$ for all three pairs. Can we separate variables to factor out anything in this general case?
(2) Set up the problem for the same hydrogenoid atom, but living in two spatial dimensions instead of three (yet with the $1/r$ potential of three dimensions).
(3) Use the same Sommerfeld polynomial method to solve the problem in exercise 2.
(4) Prove that the normalization constant $N_{nl}$ is given by equation (18.61).
(5) Prove the relations (18.68).
(6) Calculate the average radial momentum of the reduced orbiting particle in the state $n$, $l$ of the hydrogenoid atom.
(7) If we introduce a small constant magnetic field $\vec{B}$ for the hydrogenoid atom, what happens to the energy levels $E_n$? Give an approximate quantitative description of how this happens, based on the quantization of energy at $\vec{B} = 0$.
19 General Central Potential and Three-Dimensional (Isotropic) Harmonic Oscillator

After having examined the most important example of a central potential, the Coulomb potential, and the corresponding hydrogen atom, we now consider the general case of a central potential, and apply it to a few other examples: a free particle, a radial square well, and finally the three-dimensional isotropic harmonic oscillator.
19.1 General Set-Up

We have basically derived the general formulas for a central potential in the previous chapter, in the process of applying them to the Coulomb potential, but we review them here. The Schrödinger equation for the wave function in a general central potential,

$$\left[-\frac{\hbar^2}{2M}\vec{\nabla}^2 + V(r)\right]\psi(\vec{r}) = E\psi(\vec{r}), \tag{19.1}$$

reduces in spherical coordinates to

$$\left[-\frac{\hbar^2}{2M}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{\vec{L}^2/\hbar^2}{r^2}\right) + (V(r) - E)\right]\psi(\vec{r}) = 0. \tag{19.2}$$

This means that we can separate the variables in the equation,

$$\psi(\vec{r}) = R_{El}(r)Y_{lm}(\theta, \phi), \tag{19.3}$$

where $Y_{lm}$ are eigenfunctions of the angular momentum,

$$\vec{L}^2Y_{lm}(\theta, \phi) = l(l+1)\hbar^2Y_{lm}(\theta, \phi), \tag{19.4}$$

to obtain

$$\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{l(l+1)}{r^2} + \frac{2m}{\hbar^2}(E - V(r))\right]R_{El}(r) = 0. \tag{19.5}$$

Further, to reduce this equation to a one-dimensional Schrödinger-type equation, we need to get rid of the $(2/r)\,d/dr$ term, by defining

$$R_{El}(r) = \frac{\chi_{El}(r)}{r}, \tag{19.6}$$

obtaining

$$\left[\frac{d^2}{dr^2} + \frac{2m}{\hbar^2}\left(E - V(r) - \frac{\hbar^2l(l+1)}{2mr^2}\right)\right]\chi_{El}(r) = 0. \tag{19.7}$$

We see that indeed the equation is the one-dimensional Schrödinger equation for an effective potential

$$V_{\rm eff}(r) = V(r) + \frac{\hbar^2l(l+1)}{2mr^2}. \tag{19.8}$$
Normalization Conditions

In order to have normalizable solutions, we need to have a finite integral over the square of the wave function:

$$\int|\psi|^2d^3r = \int|\psi|^2r^2dr\,d\Omega = \int|\chi|^2dr\,d\Omega. \tag{19.9}$$

Then the condition at infinity,

$$\int^\infty|\chi(r)|^2dr < \infty, \tag{19.10}$$

implies that, as $r \to \infty$, the function $\chi(r)$ vanishes faster than $1/r^{1/2}$,

$$|\chi(r)| < \frac{1}{r^{1/2}}. \tag{19.11}$$

The condition at zero,

$$\int_0|\chi(r)|^2dr < \infty, \tag{19.12}$$

means that at $r \to 0$ the function $\chi(r)$ either vanishes or blows up more slowly than $1/r^{1/2}$,

$$|\chi(r)| < \frac{1}{r^{1/2}}. \tag{19.13}$$

In the case where we have a Laurent expansion for $\chi(r)$, this would mean that $|\chi(r)| \sim r^k$, $k \geq 0$, at $r \to 0$.
19.2 Types of Potentials

Case I

We will consider first the case with
$$\lim_{r\to\infty} V(r) = {\rm constant} \equiv 0, \tag{19.14}$$
where we have set the constant to zero, since we can absorb it into the energy. This condition is true in many cases of interest, but not in all.

Case II

In particular, the three-dimensional harmonic oscillator is not of the above type, and we will describe it as having case II behavior.

Case I can be split into subcases, depending on the behavior at $r \to 0$. These subcases will be listed and then discussed separately.
Figure 19.1: The effective potential $V_{\rm eff}(r)$ in four panels. (a) Case Ia, with $\alpha > 0$. (b) Case Ia, with $\alpha < 0$, if the behavior at 0 and at $\infty$ is the same. (c) Cases Ib with $\alpha > 0$ and Ic with $\beta > 0$. (d) Cases Ib with $\alpha < 0$ and Ic with $\beta < 0$.

Case Ia. When
$$V(r \to 0) \sim \frac{\alpha}{r^s}, \quad s < 2, \tag{19.15}$$
the one-dimensional effective potential is dominated by the $l(l+1)/r^2$ term and thus blows up at zero,
$$V_{\rm eff}(r \to 0) \to +\infty. \tag{19.16}$$
In this case, we have an infinite "wall" at zero, as in Figs. 19.1a and b, which means that we have the boundary condition $\chi(0) = 0$.

Case Ib. When
$$V(r \to 0) \sim \frac{\alpha}{r^s}, \quad s > 2, \tag{19.17}$$
we obtain that the one-dimensional effective potential is dominated by the potential itself and thus blows up at zero, with a sign depending on the sign of $\alpha$, as in Figs. 19.1c and d,
$$V_{\rm eff}(r \to 0) \to {\rm sgn}(\alpha)\,\infty. \tag{19.18}$$
The more interesting case is, however, when $\alpha < 0$, so that $V_{\rm eff} \to -\infty$.

Case Ic. Finally, the potential could have the same behavior as the angular momentum (centrifugal) term,
$$V(r \to 0) \sim \frac{\alpha}{r^2}. \tag{19.19}$$
Case Ia

We consider first case Ia.

(1) $E > 0$. For positive energy, we have an infinite wall at $r = 0$ and $E > V(r)$ as $r \to \infty$, so, according to the general theory in Chapter 7, we have unbound states, with imaginary exponentials (or sines and cosines),
$$\chi(r) \sim A e^{ikr} + B e^{-ikr}, \tag{19.20}$$
where
$$k = \sqrt{\frac{2mE}{\hbar^2}}. \tag{19.21}$$
The function (19.20) is not square integrable at infinity. On the other hand, at zero, as we said previously, we need $\chi(0) = 0$, which imposes a constraint, thus selecting a single solution and leading to a continuous nondegenerate spectrum. Indeed, at $r \to 0$, the equation for $\chi$ is approximately
$$\frac{d^2\chi}{dr^2} = \frac{l(l+1)}{r^2}\chi, \tag{19.22}$$
which has solutions $r^\alpha$ with $\alpha = l + 1$ or $\alpha = -l$ (here $\alpha$ denotes the exponent, not the strength of the potential); thus
$$\chi = C r^{l+1} + D r^{-l}. \tag{19.23}$$
Because of the boundary condition at zero, we need to impose $D/C = 0$, which is the constraint we mentioned earlier.

(2) $E < 0$. For negative energy, we have bound states, if there is a region where $V(r) < E < 0$, by the same general analysis from Chapter 7. At $r \to 0$, we have the solution in (19.23) again. On the other hand, at $r \to \infty$ we have exponential solutions,
$$\chi \sim A e^{-\kappa r} + B e^{+\kappa r}, \tag{19.24}$$
where
$$\kappa \equiv \sqrt{\frac{2m|E|}{\hbar^2}}. \tag{19.25}$$
Imposing $D/C = 0$ in (19.23) selects a single solution, but then imposing $B/A = 0$ gives a quantization condition, selecting only discrete energies $E_n$.

In this case, we can extend Sommerfeld's polynomial method. Defining $\rho \equiv 2\kappa r$, we find the equation for $\chi$,
$$\frac{d^2\chi}{d\rho^2} + \left[-\frac{l(l+1)}{\rho^2} - \frac{1}{4} - \frac{2m}{\hbar^2}\frac{V(\rho)}{4\kappa^2}\right]\chi(\rho) = 0. \tag{19.26}$$
Then, at $\rho \to 0$, the equation becomes $d^2\chi/d\rho^2 \simeq l(l+1)\chi/\rho^2$, which as we saw has the physical solution
$$\chi \sim \rho^{l+1}, \tag{19.27}$$
whereas at $\rho \to \infty$ the equation is $d^2\chi/d\rho^2 \simeq \chi(\rho)/4$, with physical solution
$$\chi \sim e^{-\rho/2}. \tag{19.28}$$
We then factor out the two behaviors, at zero and infinity, write
$$\chi = e^{-\rho/2} H(\rho), \quad H(\rho) = \rho^{l+1} F(\rho), \tag{19.29}$$
and find an equation for $F(\rho)$ whose solution is a power series at zero; we impose that it remains a polynomial at infinity, which turns into a quantization condition.

Further, in this case, with $s < 2$ and an attractive potential $V \simeq -\alpha/r^s$ (with $\alpha > 0$ here), consider a wave function that is only nonzero for $r < r_0$, so that $\Delta r = r_0$. Then, by Heisenberg's uncertainty relation,
$$p \sim \Delta p \sim \frac{\hbar}{\Delta r} = \frac{\hbar}{r_0}. \tag{19.30}$$
The energy of the state with this wave function is
$$E \sim \frac{\hbar^2}{2m r_0^2} - \frac{\alpha}{r_0^s}, \tag{19.31}$$
and it is always positive for small enough $r_0$. This means that we cannot have $E < 0$ for states localized arbitrarily close to the center; the bound states must start at some minimum distance from $r = 0$, and we cannot have arbitrarily negative energies: the ground state has a finite $E$ and is at a finite $r$. That means that the particle cannot fall into $r = 0$ quantum mechanically in this case.

On the other hand, if the potential behaves in the same way, $\sim -\alpha/r^s$ with $s < 2$, also at $r \to \infty$, then for a wave function centered on a large $r_0$, with spread $\Delta r \ll r_0$ that grows in proportion to $r_0$, the energy
$$E \sim \frac{\hbar^2}{2m(\Delta r)^2} - \frac{\alpha}{r_0^s} \tag{19.32}$$
becomes negative at large enough $r_0$, since the kinetic term falls off as $1/r_0^2$, faster than the potential term. Thus there are bound states, stationary states of negative energy, situated at large distances from the origin, which implies they have arbitrarily small negative energy. In particular, this implies that there is an infinite number of (bound) energy levels, becoming infinitely dense at $E = 0$.
Case Ib

In this case, if $\alpha < 0$, $V_{\rm eff}(r \to 0) \to -\infty$, which suggests that we would have a large probability at $r = 0$: $\chi(0)$ need not be zero and in fact $\chi$ can even be non-normalizable. The energy of a state bounded by $r_0 \to 0$,
$$E \sim \frac{\hbar^2}{2m r_0^2} - \frac{|\alpha|}{r_0^s}, \tag{19.33}$$
is negative for small enough $r_0$, meaning that there are states of arbitrarily large $|E|$ for $E < 0$. The interpretation is that the quantum particle can "fall" to $r = 0$, where it has $E = -\infty$ (which is an eigenvalue).

We can see this directly too, since at $r \to 0$ the Schrödinger equation is approximately (absorbing the factor $2m/\hbar^2$ into a now positive $\alpha$)
$$\frac{d^2\chi}{dr^2} = \left[-\frac{\alpha}{r^s} + \frac{l(l+1)}{r^2}\right]\chi(r), \tag{19.34}$$
with solution
$$\chi \sim \frac{A e^{i f(r)} + B e^{-i f(r)}}{r^b}. \tag{19.35}$$
The second derivative is
$$\chi'' \sim -(f'(r))^2 \chi + 0 + \frac{b(b+1)}{r^2}\chi, \tag{19.36}$$
and the leading term in this equation of motion implies that $(f'(r))^2 = \alpha/r^s$; the zero in the middle term amounts to the vanishing of the imaginary part of the solution, and fixes $B/A$; and the matching of the subleading term gives $b(b+1) = l(l+1)$, selecting $b = l$ in this case, i.e., a non-normalizable solution at zero. Therefore the probability of finding the particle outside $r = 0$ is negligible, and the particle falls into $r = 0$ in quantum mechanics, as expected.

If the potential behaves in the same way, $V \sim -\alpha/r^s$ with $s > 2$, also at $r \to \infty$, then the energy is
$$E \sim \frac{\hbar^2}{2m(\Delta r)^2} - \frac{\alpha}{r_0^s}, \tag{19.37}$$
where $s > 2$ and $r_0 \to \infty$, implying that $E > 0$ at large enough $r_0$. Thus, there is no state of arbitrarily small negative energy and large $r_0$. The spectrum terminates at a finite and nonzero negative energy.
Case Ic

When the potential is $V \simeq -\alpha/r^2$, with $\alpha > 0$, the effective potential is
$$V_{\rm eff}(r) = \frac{\hbar^2 l(l+1) - 2m\alpha}{2mr^2} \equiv \frac{\hbar^2 \beta}{2mr^2}. \tag{19.38}$$
The Schrödinger equation at $r \to 0$ is
$$\frac{d^2\chi}{dr^2} = \frac{\beta}{r^2}\chi, \tag{19.39}$$
with solution $\chi \sim r^a$, which implies $a(a - 1) = \beta$, i.e.,
$$a_{1,2} = \frac{1 \pm \sqrt{1 + 4\beta}}{2}. \tag{19.40}$$

(1) If $\beta > -1/4$, both solutions $a_{1,2}$ are real but, for reasons of normalization, we must choose the solution that is least singular at $r \to 0$, namely
$$\chi(r) \sim r^{a_1} = r^{\frac{1 + \sqrt{1 + 4\beta}}{2}}, \tag{19.41}$$
for which $\int_0 |\chi(r)|^2\, dr < \infty$.

(2) If $\beta < -1/4$, both solutions have imaginary parts and, with $R = \chi/r$, we find
$$R(r) \sim \frac{1}{r^{1/2}}\left[A e^{\frac{i}{2}\sqrt{-1-4\beta}\,\ln r} + B e^{-\frac{i}{2}\sqrt{-1-4\beta}\,\ln r}\right]. \tag{19.42}$$
Imposing the condition that the solution must be real, we find one that has an infinite number of zeroes,
$$R(r) \sim \frac{1}{r^{1/2}}\cos\left(\tfrac{1}{2}\sqrt{-1-4\beta}\,\ln r + \delta\right). \tag{19.43}$$
But the ground state cannot have zeroes, which means that the wave function (19.43) cannot describe the ground state, and in the ground state the particle is actually at $r = 0$ (one can produce an argument that is a bit more rigorous for this statement, but the result is the same).

On the other hand, if the potential has the same form, i.e., $V = -\alpha/r^2$, at $r \to \infty$, then:

• For $\beta > -1/4$, we have (at most) a finite number of negative energy levels: the solutions $\chi \sim r^{a_{1,2}}$ are valid also for $r \to \infty$, but these solutions have no zeroes. Thus there can be zeroes only for $r$ less than some $r_0$, implying a finite number of energy levels, each associated with an interval between zeroes, according to the general one-dimensional analysis.
• For $\beta < -1/4$, the solutions $R \sim \cos\left(\tfrac{1}{2}\sqrt{-1-4\beta}\,\ln r + \delta\right)/r^{1/2}$ are valid also at $r \to \infty$, and have an infinite number of zeroes, thus an infinite number of energy levels.
19.3 Diatomic Molecule

Consider a molecule made up of two atoms. Reducing the Schrödinger equation to the relative motion of the atoms with reduced mass $\mu$, we find a central potential that has a minimum at some $r_0$. If the molecule is stable then $V''(r_0) > 0$, so the minimum is stable, as in Fig. 19.2a. This means that we can expand the potential around $r_0$,
$$V(r) \simeq V(r_0) + V''(r_0)\frac{(r - r_0)^2}{2}, \tag{19.44}$$
and we define
$$V''(r_0) \equiv \mu\omega^2. \tag{19.45}$$
Also expanding the angular momentum term,
$$\frac{\hbar^2 l(l+1)}{2\mu r^2} \simeq \frac{\hbar^2 l(l+1)}{2\mu r_0^2} + \cdots, \tag{19.46}$$
and defining the moment of inertia $I \equiv \mu r_0^2$ and the displacement $\rho \equiv r - r_0$, we find the following Schrödinger equation:
$$\left[\frac{d^2}{dr^2} + \frac{2\mu}{\hbar^2}\left(E - V(r_0) - \frac{\hbar^2 l(l+1)}{2I} - \frac{\mu\omega^2}{2}\rho^2\right)\right]\chi(\rho) = 0. \tag{19.47}$$
However, this is nothing other than the Schrödinger equation for a linear harmonic oscillator, of shifted energy $E - V(r_0) - \hbar^2 l(l+1)/2I$. Thus the eigenenergies are
$$E_n = V(r_0) + \frac{\hbar^2 l(l+1)}{2I} + \hbar\omega\left(n + \frac{1}{2}\right). \tag{19.48}$$
Figure 19.2: $V$ as a function of $r$. (a) Potential for diatomic molecule; (b) Spherical square well potential.
19.4 Free Particle

We have already looked at the free particle in spherical coordinates, but we review it here for completeness. For $E > 0$, the Schrödinger equation in spherical coordinates reduces (with $\rho = 2kr$) to
$$\frac{d^2\chi}{d\rho^2} + \left[\frac{1}{4} - \frac{l(l+1)}{\rho^2}\right]\chi(\rho) = 0. \tag{19.49}$$
The equation above has a unique solution that is everywhere bounded, the spherical Bessel function $j_l(\rho/2) = j_l(kr)$ for $R = \chi/r$, meaning that the full wave function is
$$\psi(\vec r) = j_l(kr)\, Y_{lm}(\theta, \phi). \tag{19.50}$$
Then the known solution in Cartesian coordinates, $e^{i\vec k\cdot\vec r}$, can be expanded in terms of the spherical solutions above,
$$e^{i\vec k\cdot\vec r} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} a_{lm}(\vec k)\, j_l(kr)\, Y_{lm}(\theta, \phi). \tag{19.51}$$
Choosing the momentum $\vec k$ to be in the direction $Oz$, and defining $\cos\theta \equiv u$ (and since $\rho/2 = kr$),
$$e^{i\rho u/2} = e^{ikr\cos\theta} = \sum_{l=0}^{\infty} c_l\, j_l(\rho/2)\, P_l(u). \tag{19.52}$$
From the recurrence relations obtained for the spherical Bessel functions $j_l$ and the Legendre polynomials $P_l$, we find that
$$c_l = (2l + 1)\, i^l\, c_0, \tag{19.53}$$
and with the normalization choice $c_0 = 1$ we find that
$$e^{ikr\cos\theta} = \sum_{l=0}^{\infty} (2l + 1)\, i^l\, j_l(kr)\, P_l(\cos\theta). \tag{19.54}$$
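The partial-wave expansion of the plane wave in (19.54) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes NumPy and SciPy are available, and the function name `plane_wave_partial_sum` and the sample values of $kr$ and $\cos\theta$ are illustrative choices.

```python
import numpy as np
from scipy.special import spherical_jn, eval_legendre


def plane_wave_partial_sum(kr, cos_theta, l_max):
    """Partial sum of (2l+1) i^l j_l(kr) P_l(cos theta) for l = 0..l_max."""
    ls = np.arange(l_max + 1)
    terms = (
        (2 * ls + 1)
        * (1j ** ls)
        * spherical_jn(ls, kr)          # spherical Bessel functions j_l(kr)
        * eval_legendre(ls, cos_theta)  # Legendre polynomials P_l(cos theta)
    )
    return terms.sum()


# Illustrative sample point (assumed values, not from the text).
kr, cos_theta = 3.7, 0.4
exact = np.exp(1j * kr * cos_theta)
approx = plane_wave_partial_sum(kr, cos_theta, l_max=25)
print("truncation error:", abs(exact - approx))
```

Because $j_l(kr)$ decays rapidly once $l$ exceeds $kr$, a modest cutoff `l_max` is enough for moderate $kr$; larger $kr$ would need a larger cutoff.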
19.5 Spherical Square Well

Consider the important case where the potential is piecewise constant, specifically smaller for $r \leq r_0$ than outside it, as in Fig. 19.2b,
$$V(r) = \begin{cases} -V_0, & r \leq r_0 \\ 0, & r > r_0. \end{cases} \tag{19.55}$$
For negative energy, $-V_0 < E < 0$, by the general theory we have bound states (since $E > V(r)$ over a finite region). For $r < r_0$ we have a "free wave" solution like that above, $R_l(r) = j_l(kr)$, but with
$$k = \sqrt{\frac{2m(E + V_0)}{\hbar^2}}. \tag{19.56}$$
For $r > r_0$, we have an exponentially decaying solution (the only one that vanishes at infinity),
$$\chi_l(\kappa r) \sim e^{-\kappa r}, \tag{19.57}$$
where
$$\kappa = \sqrt{\frac{2m|E|}{\hbar^2}}. \tag{19.58}$$
Then, the condition of continuity of the logarithmic derivative (and so of the function and of its derivative) at $r = r_0$,
$$\left.\frac{d}{dr}\ln\frac{\chi_l(\kappa r)}{r}\right|_{r=r_0} = \left.\frac{d}{dr}\ln j_l(kr)\right|_{r=r_0}, \tag{19.59}$$
yields a quantization condition that selects a finite number of energy levels $E_n$.
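For $l = 0$ the matching condition becomes explicit: inside the well $\chi = \sin(kr)$, outside $\chi = e^{-\kappa r}$, and continuity of the logarithmic derivative gives $k\cot(kr_0) = -\kappa$. The sketch below solves this condition numerically. It is only an illustration under stated assumptions: units with $\hbar = m = 1$, hypothetical well parameters $V_0 = 10$ and $r_0 = 2$, and a helper name `s_wave_levels` that does not come from the text.

```python
import numpy as np
from scipy.optimize import brentq


def s_wave_levels(V0, r0, hbar=1.0, m=1.0, n_grid=2000):
    """Bound-state energies (l = 0) of the spherical square well.

    Roots of f(E) = k cos(k r0) + kappa sin(k r0), which is the condition
    k cot(k r0) = -kappa multiplied by sin(k r0) to avoid the poles of cot.
    """
    def f(E):
        k = np.sqrt(2 * m * (E + V0)) / hbar
        kappa = np.sqrt(2 * m * (-E)) / hbar
        return k * np.cos(k * r0) + kappa * np.sin(k * r0)

    # Scan the open interval (-V0, 0) for sign changes, then refine each one.
    energies = np.linspace(-V0, 0.0, n_grid + 2)[1:-1]
    values = f(energies)
    levels = []
    for a, b, fa, fb in zip(energies[:-1], energies[1:], values[:-1], values[1:]):
        if fa * fb < 0:
            levels.append(brentq(f, a, b))
    return levels


levels = s_wave_levels(V0=10.0, r0=2.0)
print(len(levels), "bound s-wave levels:", levels)
```

The number of roots is finite because $kr_0$ is bounded by $\sqrt{2mV_0}\,r_0/\hbar$: a new $l = 0$ level appears each time this quantity passes an odd multiple of $\pi/2$.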
19.6 Three-Dimensional Isotropic Harmonic Oscillator: Set-Up

This is an example of Case II, where the potential grows without bound as $r \to \infty$. Consider a harmonic oscillator in $d$ dimensions, with a priori different frequencies in each dimension,
$$\hat H = \sum_{i=1}^{d} \hat H_i, \quad \hat H_i = \frac{\hat p_i^2}{2m} + m\frac{\omega_i^2 \hat q_i^2}{2}. \tag{19.60}$$
Then the Hilbert space is a tensor product of the harmonic oscillator Hilbert spaces for each dimension,
$$\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_d, \tag{19.61}$$
so the states (in the "Fock space") are tensor products of states of occupation numbers $n_i$,
$$|\{n_i\}\rangle = |n_1\rangle \otimes \cdots \otimes |n_d\rangle; \tag{19.62}$$
in particular, the vacuum is a tensor product of the vacua for each dimension,
$$|0\rangle \equiv |0\rangle_1 \otimes \cdots \otimes |0\rangle_d. \tag{19.63}$$
For these states the Hamiltonian for each dimension is diagonal,
$$\hat H_i |n_i\rangle = \left(n_i + \frac{1}{2}\right)\hbar\omega_i |n_i\rangle, \tag{19.64}$$
so that
$$\hat H |\{n_i\}\rangle = \sum_{i=1}^{d} \hbar\omega_i\left(n_i + \frac{1}{2}\right)|\{n_i\}\rangle. \tag{19.65}$$
In the case of an isotropic harmonic oscillator, with the same frequency in all directions, $\omega_i = \omega$, we have
$$\hat H = \frac{\hat{\vec p}^{\,2}}{2m} + \frac{m\omega^2}{2}\hat{\vec r}^{\,2}. \tag{19.66}$$
In terms of the number operators $\hat N_i = \hat a_i^\dagger \hat a_i$, we have
$$\hat N = \sum_{i=1}^{d} \hat N_i, \tag{19.67}$$
so the total Hamiltonian is
$$\hat H = \left(\hat N + \frac{d}{2}\right)\hbar\omega, \tag{19.68}$$
and the eigenenergies are
$$\hat H |\{n_i\}\rangle = \left(\sum_{i=1}^{d} n_i + \frac{d}{2}\right)\hbar\omega\, |\{n_i\}\rangle = \left(n + \frac{d}{2}\right)\hbar\omega\, |\{n_i\}\rangle. \tag{19.69}$$
The degeneracy of these energies equals the number of different sets $\{n_i\}$ that give the same $n = \sum_i n_i$. We are mostly interested in $d = 3$, in which case the degeneracy is
$$g_n = \frac{(n + 1)(n + 2)}{2}. \tag{19.70}$$
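The counting behind (19.70) can be reproduced by brute force: enumerate all occupation numbers $(n_1, \ldots, n_d)$ and keep those that sum to $n$. This is a small illustrative sketch; the function name `cartesian_degeneracy` is an assumption introduced here, not notation from the text.

```python
from itertools import product


def cartesian_degeneracy(n, d=3):
    """Number of occupation-number tuples (n_1, ..., n_d) with n_1 + ... + n_d = n."""
    return sum(1 for occ in product(range(n + 1), repeat=d) if sum(occ) == n)


for n in range(6):
    # Compare the enumeration with the closed form (n + 1)(n + 2)/2 for d = 3.
    print(n, cartesian_degeneracy(n), (n + 1) * (n + 2) // 2)
```

The same function with `d=2` gives $n + 1$ states per level, the two-dimensional analogue of the counting.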
19.7 Isotropic Three-Dimensional Harmonic Oscillator in Spherical Coordinates

Instead of solving each dimension independently, as above (each as a harmonic oscillator), we can go to spherical coordinates, noting that the central potential is
$$V(r) = \frac{m\omega^2}{2} r^2. \tag{19.71}$$
Then the Sommerfeld polynomial method works as before. The Schrödinger equation becomes
$$\frac{d^2\chi}{dr^2} + \left[\frac{2mE}{\hbar^2} - \frac{m^2\omega^2}{\hbar^2} r^2 - \frac{l(l+1)}{r^2}\right]\chi(r) = 0. \tag{19.72}$$
Define
$$\alpha = \sqrt{\frac{m\omega}{\hbar}}, \quad \frac{2E}{\hbar\omega} = \lambda, \quad \xi = \alpha r, \tag{19.73}$$
leading to the equation
$$\frac{d^2\chi}{d\xi^2} + \left[\lambda - \frac{l(l+1)}{\xi^2} - \xi^2\right]\chi = 0. \tag{19.74}$$
Then, the equation at $r \to 0$ or $\xi \to 0$ is
$$\frac{d^2\chi}{d\xi^2} = \frac{l(l+1)}{\xi^2}\chi, \tag{19.75}$$
which has the solution
$$\chi = A\xi^{l+1} + B\xi^{-l}; \tag{19.76}$$
this implies that the normalizable solution has $B = 0$. The equation at $r \to \infty$ or $\xi \to \infty$ is
$$\frac{d^2\chi}{d\xi^2} = \xi^2 \chi, \tag{19.77}$$
with the leading, normalizable, solution at infinity
$$\chi = e^{-\xi^2/2} \times (\text{subleading terms}). \tag{19.78}$$
We now factor out the behaviors at infinity and zero. First, we write
$$\chi = e^{-\xi^2/2} H(\xi), \tag{19.79}$$
which implies
$$\chi'' = e^{-\xi^2/2}\left[H'' - 2\xi H' + (\xi^2 - 1)H\right], \tag{19.80}$$
so that the equation for $H$ is (dividing by $e^{-\xi^2/2}$)
$$H'' - 2\xi H' + \left[\lambda - 1 - \frac{l(l+1)}{\xi^2}\right]H = 0. \tag{19.81}$$
Next, we write
$$H = \xi^{l+1} F(\xi), \tag{19.82}$$
which implies
$$H' = \xi^{l+1}F' + (l+1)\xi^l F, \quad H'' = \xi^{l+1}F'' + 2(l+1)\xi^l F' + l(l+1)\xi^{l-1}F, \tag{19.83}$$
so that the equation for $F$ is (dividing by $\xi^{l+1}$)
$$F'' + 2\left(\frac{l+1}{\xi} - \xi\right)F' + (\lambda - 2l - 3)F = 0. \tag{19.84}$$
Finally, change variables by setting $z = \xi^2$, so that
$$\frac{dF}{d\xi} = 2\xi\frac{dF}{dz}, \quad \frac{d^2F}{d\xi^2} = 2\frac{dF}{dz} + 4z\frac{d^2F}{dz^2}, \tag{19.85}$$
and we obtain the equation
$$z\frac{d^2F}{dz^2} + \left(l + \frac{3}{2} - z\right)\frac{dF}{dz} + \frac{\lambda - 2l - 3}{4}F = 0. \tag{19.86}$$
But this is the same equation as that for the confluent hypergeometric function ${}_1F_1(a, b; z)$ from the previous chapter, with
$$-a = \frac{\lambda - 2l - 3}{4}, \quad b = l + \frac{3}{2}. \tag{19.87}$$
Therefore, as in the previous chapter, we find the solution
$$F(z) = {}_1F_1\left(\frac{2l + 3 - 2E/(\hbar\omega)}{4},\, l + \frac{3}{2};\, z\right), \tag{19.88}$$
with quantization condition
$$\frac{2l + 3 - 2E/(\hbar\omega)}{4} = -n_r, \quad n_r \in \mathbb{N}. \tag{19.89}$$
Finally then, the eigenenergies are
$$\frac{E_n}{\hbar\omega} = l + \frac{3}{2} + 2n_r \equiv n + \frac{3}{2}, \tag{19.90}$$
where we have defined $n = 2n_r + l$. This is the same formula as that obtained from the Cartesian form. Moreover, the degeneracy is again
$$g_n = \frac{(n + 1)(n + 2)}{2}, \tag{19.91}$$
meaning that the possible states are indeed the same. In spherical coordinates, the radial functions are (putting together the formulas derived before)
$$R_{n,l}(r) = N_{n_r,l}\, e^{-\alpha^2 r^2/2}\, (\alpha r)^l\, {}_1F_1(-n_r,\, l + 3/2;\, \alpha^2 r^2), \tag{19.92}$$
where
$$N_{n_r,l} = \frac{\alpha\sqrt{2\alpha}}{\Gamma(l + 3/2)}\left[\frac{\Gamma(n_r + l + 3/2)}{n_r!}\right]^{1/2}. \tag{19.93}$$
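The radial functions of (19.92) can be checked numerically for orthonormality, $\int_0^\infty R_{n,l} R_{n',l}\, r^2\, dr = \delta_{nn'}$. The sketch below is an illustration under explicit assumptions: it takes the normalization in the form $N_{n_r,l} = \alpha\sqrt{2\alpha}\,[\Gamma(n_r + l + 3/2)/n_r!]^{1/2}/\Gamma(l + 3/2)$, sets $\alpha = 1$ by default, truncates the integral at $r = 12/\alpha$ where the Gaussian factor is negligible, and uses helper names (`radial`, `overlap`) that are not from the text.

```python
import math

import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, hyp1f1


def radial(r, n_r, l, alpha=1.0):
    """Radial function N exp(-a^2 r^2/2) (a r)^l 1F1(-n_r, l + 3/2; a^2 r^2)."""
    norm = (
        alpha * math.sqrt(2 * alpha) / gamma(l + 1.5)
        * math.sqrt(gamma(n_r + l + 1.5) / math.factorial(n_r))
    )
    z = (alpha * r) ** 2
    return norm * np.exp(-z / 2) * (alpha * r) ** l * hyp1f1(-n_r, l + 1.5, z)


def overlap(n1, n2, l, alpha=1.0):
    """Integral of R_{n1,l} R_{n2,l} r^2 dr on [0, 12/alpha] (tail is negligible)."""
    integrand = lambda r: r**2 * radial(r, n1, l, alpha) * radial(r, n2, l, alpha)
    value, _ = quad(integrand, 0.0, 12.0 / alpha)
    return value


print(overlap(0, 0, 0), overlap(2, 2, 1), overlap(0, 1, 2))
```

The diagonal overlaps should come out close to 1 and the off-diagonal ones close to 0, within the quadrature tolerance.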
19.8 Isotropic Three-Dimensional Harmonic Oscillator in Cylindrical Coordinates

For completeness, we now show how to solve the same oscillator in cylindrical coordinates. Then the Hamiltonian is
$$\hat H = -\frac{\hbar^2}{2m_0}\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}\right] + \frac{m_0\omega^2}{2}(\rho^2 + z^2) = \hat H_z + \hat H_\rho + \frac{\hat L_z^2}{2m_0\rho^2}, \tag{19.94}$$
where we have denoted the mass by $m_0$ in order not to confuse it with $m$, the eigenvalue of $L_z/\hbar$. We can easily check that $H$, $H_z$, $L_z$ form a complete set of mutually compatible observables,
$$0 = [\hat H, \hat L_z] = [\hat H, \hat H_z] = [\hat L_z, \hat H_z]. \tag{19.95}$$
Then we separate variables by writing the wave function as
$$\psi_{n_z, n_\rho, |m|}(z, \phi, \rho) = Z_{n_z}(z)\,\Phi_m(\phi)\,R_{n_\rho, |m|}(\rho). \tag{19.96}$$
The equation of motion for $Z_{n_z}(z)$ is that for a harmonic oscillator in one dimension,
$$H_z Z_{n_z} = \left[-\frac{\hbar^2}{2m_0}\frac{\partial^2}{\partial z^2} + \frac{m_0\omega^2 z^2}{2}\right]Z_{n_z} = E_z Z_{n_z}, \tag{19.97}$$
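with the standard solution
$$Z_{n_z} = N(\alpha)\, e^{-\alpha^2 z^2/2}\, H_{n_z}(\alpha z), \quad E_z = \hbar\omega(n_z + 1/2). \tag{19.98}$$
The equation of motion for $\Phi_m(\phi)$ is even simpler,
$$L_z \Phi_m(\phi) = m\hbar\,\Phi_m(\phi), \tag{19.99}$$
so that the eigenfunction is
$$\Phi_m(\phi) = \frac{1}{\sqrt{2\pi}} e^{im\phi}. \tag{19.100}$$
Then, finally, the equation for the two-dimensional radial function (in the plane) is
$$\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} + \left\{-\frac{m^2}{\rho^2} + \frac{2m_0}{\hbar^2}\left[E - \hbar\omega\left(n_z + \frac{1}{2}\right)\right] - \frac{m_0^2\omega^2}{\hbar^2}\rho^2\right\}R = 0. \tag{19.101}$$
Redefining the variable $R$, in order to get rid of the $dR/d\rho$ term, by setting
$$R = \frac{\chi(\rho)}{\sqrt{\rho}}, \tag{19.102}$$
so that
$$\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} = \frac{1}{\sqrt{\rho}}\left[\frac{d^2\chi}{d\rho^2} + \frac{1}{4\rho^2}\chi\right], \tag{19.103}$$
we find the equation
$$\frac{d^2\chi}{d\rho^2} + \left\{-\frac{m^2 - 1/4}{\rho^2} + \frac{2m_0}{\hbar^2}\left[E - \hbar\omega\left(n_z + \frac{1}{2}\right)\right] - \left(\frac{m_0\omega}{\hbar}\right)^2\rho^2\right\}\chi = 0. \tag{19.104}$$
This is the same equation as (19.72), with the following changes,
$$E \to E - \hbar\omega\left(n_z + \frac{1}{2}\right), \quad l(l+1) \to m^2 - \frac{1}{4}, \tag{19.105}$$
implying
$$l^2 + l - m^2 + \frac{1}{4} = 0 \;\Rightarrow\; l = -\frac{1}{2} \pm |m|, \tag{19.106}$$
and eigenenergies
$$\frac{E}{\hbar\omega} - \left(n_z + \frac{1}{2}\right) = l + \frac{3}{2} + 2n_\rho = 1 \pm |m| + 2n_\rho \;\Rightarrow\; E = \hbar\omega\left(n_z + 2n_\rho \pm |m| + \frac{3}{2}\right) \equiv \hbar\omega\left(n + \frac{3}{2}\right). \tag{19.107}$$
We see that again we obtain the same spectrum, and in fact also the same degeneracy, as expected.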
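The claim that the degeneracy is unchanged can be verified by counting states in the cylindrical and spherical labelings. The sketch below assumes the regular branch $l = -1/2 + |m|$ in (19.106), so that $n = n_z + 2n_\rho + |m|$, and counts $m = \pm|m|$ separately; the function names are illustrative and not taken from the text.

```python
def cylindrical_degeneracy(n):
    """Count states (n_z, n_rho, m) with n = n_z + 2 n_rho + |m| (assumed + branch)."""
    count = 0
    for n_z in range(n + 1):
        for n_rho in range((n - n_z) // 2 + 1):
            abs_m = n - n_z - 2 * n_rho
            count += 1 if abs_m == 0 else 2  # m = +|m| and m = -|m|
    return count


def spherical_degeneracy(n):
    """Count states (n_r, l, m) with n = 2 n_r + l, each l contributing 2l + 1."""
    return sum(2 * l + 1 for l in range(n % 2, n + 1, 2))


for n in range(6):
    print(n, cylindrical_degeneracy(n), spherical_degeneracy(n), (n + 1) * (n + 2) // 2)
```

All three columns agree, matching the $(n + 1)(n + 2)/2$ counting obtained in Cartesian coordinates.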
Important Concepts to Remember

• For motion in a central potential in three dimensions, we can reduce the Schrödinger equation to an equation in one dimension, for $\chi(r)$, in an effective potential $V_{\rm eff}(r) = V(r) + \hbar^2 l(l+1)/(2mr^2)$, thus including the centrifugal term.
• For $V \sim \alpha/r^s$ with $s < 2$, we can have bound states if there is a region where $V(r) < E < 0$. If $V \sim -|\alpha|/r^s$ with $s < 2$ at $r \to \infty$, we have an infinite number of bound states, of vanishingly small energy.
• For $V \sim -|\alpha|/r^s$ with $s > 2$, we can have bound states of arbitrarily large $|E|$, meaning there is an arbitrarily large probability for the particle to be at $r = 0$, implying that the particle "falls into $r = 0$ quantum mechanically".
• For $V \sim -|\alpha|/r^2$, with $\beta = l(l+1) - 2m|\alpha|/\hbar^2 < -1/4$, the ground state is at $r = 0$.
• The diatomic molecule has a potential with a minimum at $r_0$, approximated by a shifted harmonic oscillator, so $E = V(r_0) + \hbar^2 l(l+1)/2I + \hbar\omega(n + 1/2)$.
• The spherical square well has a finite number of bound states for $E < 0$.
• The three-dimensional isotropic harmonic oscillator is equivalent to three one-dimensional harmonic oscillators, with tensor product states, giving an $(n + 1)(n + 2)/2$ degeneracy.
• Equivalently, in spherical coordinates, we have a central potential $m\omega^2 r^2/2$, which can be solved by the Sommerfeld polynomial method in terms of the same ${}_1F_1$, giving a quantized energy $E_n = \hbar\omega(l + 2n_r + 3/2) \equiv \hbar\omega(n + 3/2)$, with the same $(n + 1)(n + 2)/2$ degeneracy.
• In cylindrical coordinates, we obtain a one-dimensional harmonic oscillator in $z$ times a one-dimensional radial harmonic oscillator for planar motion, with the same energy levels and degeneracy.
Further Reading

See [4] for more details on the various central potential cases and also [2].

Exercises

(1) Consider the central potential $V(r) = (\alpha/r)\cos\beta r$. How many bound states are there?
(2) Consider the central potential $V(r) = (|\alpha|/r^2)\ln(r/r_*)$. In what region of space are the bound states confined?
(3) Consider the attractive Yukawa central potential, $V(r) = -(|\alpha|/r)\,e^{-mr}$. Is the bound state spectrum finite or infinite? Can we use the same approximation as that used for the diatomic molecule, and if so, under what conditions?
(4) For the free particle in spherical coordinates, prove relation (19.53).
(5) Complete the proof of the fact that the number of bound states of the spherical square well is finite.
(6) Show that the normalization constant for the isotropic harmonic oscillator in spherical coordinates is given by (19.93).
(7) Solve the isotropic harmonic oscillator in two spatial dimensions, using polar coordinates.
20
Systems of Identical Particles
In this chapter we start considering a more physical case, with several particles. According to the general theory discussed earlier in the book, if the particles are independent, so that the observable operators for different particles commute (in particular the Hamiltonians, $\hat H = \sum_{i=1}^{N}\hat H_i$, with $[\hat H_i, \hat H_j] = 0$), we can write the Hilbert space as a tensor product,
$$\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_N. \tag{20.1}$$
Correspondingly, for a state with particles at positions $x_1, x_2, \ldots, x_N$, we have
$$|\psi\rangle = |x_1\rangle \otimes \cdots \otimes |x_N\rangle. \tag{20.2}$$
We have seen something similar before, in the case where we had several systems. For instance, in the previous chapter we considered a $d$-dimensional harmonic oscillator, with the oscillation in each dimension acting as an independent harmonic oscillator, so that $\hat H = \sum_{i=1}^{d}\hat H_i$ and the general state in the Cartesian occupation number picture was $|n_1\rangle \otimes \cdots \otimes |n_d\rangle$.
20.1 Identical Particles: Bosons and Fermions

However, we want to ask what happens when the particles are not just similar, but identical. In classical mechanics that is not a problem because we can always track the position and momentum of a particle, so we can always distinguish it from another particle. In quantum mechanics, however, the particles have probabilities of being anywhere, including at the same point, so we cannot distinguish them by position: the particles are indistinguishable.

Consider first a state with only two particles, and specifically the state with one particle at $x_1$ and one at $x_2$, denoted $|x_1 x_2\rangle$. Because of what we have just said, this state should be indistinguishable from the state $|x_2 x_1\rangle$, which means the two states should be proportional,
$$|x_2 x_1\rangle = C_{12}|x_1 x_2\rangle. \tag{20.3}$$
Define the permutation operator, which exchanges particles 1 and 2, by
$$\hat P_{12}|x_1 x_2\rangle = |x_2 x_1\rangle. \tag{20.4}$$
Then the two previous equations imply that $\hat P_{12}$ must be diagonal, and its eigenvalue is $C_{12}$.

Moreover, it seems obvious that we must have $\hat P_{12} = \hat P_{21}$ and $\hat P_{12}\hat P_{34} = \hat P_{34}\hat P_{12}$, i.e., the permutations are Abelian (commutative), and $\hat P_{12}^2 = 1$, meaning that two permutations will take us back to the original (unpermuted) case. This latest consideration implies that
$$C_{12}^2 = 1 \;\Rightarrow\; C_{12} = \pm 1. \tag{20.5}$$
Now, the conditions that permutations are Abelian (commutative), and that two permutations take us back to the original case, sound harmless and obvious, but in fact they are not. These are assumptions that are broken in some possible cases, leading to what are known as "non-Abelions" and "anyons", which will be analyzed at the end of this book.

For the moment however, we note that we have only two possible symmetries, associated with the statistics describing particles: $C_{12} = +1$ leads to Bose–Einstein statistics, for bosons, while $C_{12} = -1$ leads to Fermi–Dirac statistics, for fermions. Since
$$\hat P_{12}|x_1\rangle \otimes |x_2\rangle = |x_2\rangle \otimes |x_1\rangle \neq |x_1\rangle \otimes |x_2\rangle, \tag{20.6}$$
it follows that states such as $|x_1\rangle \otimes |x_2\rangle$ are actually not satisfactory states for identical particles. We must have either symmetric (for bosons) or antisymmetric (for fermions) combinations of the above states and their permutations,
$$|x_1 x_2; S/A\rangle = \frac{1}{\sqrt 2}\left(|x_1\rangle \otimes |x_2\rangle \pm |x_2\rangle \otimes |x_1\rangle\right), \tag{20.7}$$
corresponding to $C_{12} = \pm 1$ (bosons/fermions). The factor $1/\sqrt 2$ appears because of normalization, as the two states $|x_1\rangle \otimes |x_2\rangle$ and $|x_2\rangle \otimes |x_1\rangle$ are orthogonal, so their sums or differences must have a coefficient $1/\sqrt 2$.

More generally, for states $a, b$ characterizing particles 1 and 2, instead of $x_1, x_2$, for bosons/fermions we have
$$|ab\rangle = \frac{1}{\sqrt 2}\left(|a\rangle \otimes |b\rangle \pm |b\rangle \otimes |a\rangle\right). \tag{20.8}$$
In the case of fermions, when $a = b$ (the same state for particles 1 and 2), the two-particle state is
$$|aa\rangle = \frac{1}{\sqrt 2}\left(|a\rangle \otimes |a\rangle - |a\rangle \otimes |a\rangle\right) = 0. \tag{20.9}$$
This is a mathematical expression of the Pauli exclusion principle, stating that we cannot have two fermions occupying the same state, unlike for bosons or classical particles.

Next, consider symmetric or antisymmetric states $|\psi_{S/A}\rangle$ in the coordinate representation, which means that we have to take the scalar product with the states $|x_1 x_2; S/A\rangle$. The wave function is thus
$$\psi_{S/A}(x_1, x_2) \equiv \langle x_1 x_2; S/A|\psi_{S/A}\rangle. \tag{20.10}$$
If the state $|\psi_{S/A}\rangle$ corresponds to eigenstates for observables $\hat\Omega$ (with eigenvalues $\omega$), measured for particles 1 and 2, then we must have
$$|\psi_{S/A}\rangle = |\omega_1\omega_2; S/A\rangle = \frac{1}{\sqrt 2}\left(|\omega_1\rangle \otimes |\omega_2\rangle \pm |\omega_2\rangle \otimes |\omega_1\rangle\right). \tag{20.11}$$
Then, in the antisymmetric (A) or fermion case, we have the wave function
$$\psi_A(x_1, x_2) = \frac{1}{\sqrt 2}\begin{vmatrix} \psi_{\omega_1}(x_1) & \psi_{\omega_2}(x_1) \\ \psi_{\omega_1}(x_2) & \psi_{\omega_2}(x_2) \end{vmatrix} = \frac{1}{\sqrt 2}\left[\psi_{\omega_1}(x_1)\psi_{\omega_2}(x_2) - \psi_{\omega_2}(x_1)\psi_{\omega_1}(x_2)\right]. \tag{20.12}$$
This is known as a Slater determinant.
20.2 Observables under Permutation

Consider (generalized) coordinates $\hat q_1, \hat q_2$ for particles 1 and 2, and let an observable depending on these coordinates be represented by the operator $\hat A(\hat q_1, \hat q_2)$. Then the fact that particles are indistinguishable means that the operators $\hat A(\hat q_1, \hat q_2)$ and $\hat A(\hat q_2, \hat q_1)$ must be equal (while the states can differ by a phase).

On the other hand, consider separate operator observables associated with the two particles, $\hat A_1$ and $\hat A_2$, which generalize $\hat q_1, \hat q_2$, and consider eigenstates for the two operators, $|a_1\rangle \otimes |a_2\rangle$. Then
$$\hat A_1 |a_1\rangle \otimes |a_2\rangle = a_1 |a_1\rangle \otimes |a_2\rangle, \quad \hat A_2 |a_1\rangle \otimes |a_2\rangle = a_2 |a_1\rangle \otimes |a_2\rangle. \tag{20.13}$$
Acting with $\hat P_{12}\hat A_1$ on the states, and introducing $1 = \hat P_{12}^{-1}\hat P_{12}$ before the states, we obtain (by calculating in two different ways)
$$\hat P_{12}\hat A_1\hat P_{12}^{-1}\hat P_{12}|a_1\rangle \otimes |a_2\rangle = a_1\hat P_{12}|a_1\rangle \otimes |a_2\rangle = a_1 |a_2\rangle \otimes |a_1\rangle = (\hat P_{12}\hat A_1\hat P_{12}^{-1})|a_2\rangle \otimes |a_1\rangle. \tag{20.14}$$
Since this relation is valid for any states $|a_1\rangle \otimes |a_2\rangle$ (and the eigenstates of observables make a complete set in Hilbert space), we have also the operatorial equation
$$\hat P_{12}\hat A_1\hat P_{12}^{-1} = \hat A_2. \tag{20.15}$$
In particular, the same is true when $\hat A_i = \hat q_i$.

Applying relation (20.15) to the operator observable $\hat A(\hat q_1, \hat q_2)$, and calculating in two ways (the second using the fact that the operator for indistinguishable particles is the same when we switch them), we obtain
$$\hat P_{12}\hat A(\hat q_1, \hat q_2)\hat P_{12}^{-1} = \hat A(\hat P_{12}\hat q_1\hat P_{12}^{-1}, \hat P_{12}\hat q_2\hat P_{12}^{-1}) = \hat A(\hat q_2, \hat q_1) = \hat A(\hat q_1, \hat q_2), \tag{20.16}$$
thus leading to
$$\hat P_{12}\hat A(\hat q_1, \hat q_2)\hat P_{12}^{-1} = \hat A(\hat q_1, \hat q_2), \tag{20.17}$$
or $[\hat P_{12}, \hat A] = 0$ (multiparticle observables commute with permutations).

Another proof of the same relation is obtained by acting with $\hat A\hat P_{12}$ on states,
$$\hat A(\hat q_2, \hat q_1)\hat P_{12}|q_1\rangle \otimes |q_2\rangle = \hat A(\hat q_2, \hat q_1)|q_2\rangle \otimes |q_1\rangle = A(q_1, q_2)|q_2\rangle \otimes |q_1\rangle = \hat P_{12}\left(A(q_2, q_1)|q_1\rangle \otimes |q_2\rangle\right) = \hat P_{12}\hat A(\hat q_2, \hat q_1)|q_1\rangle \otimes |q_2\rangle, \tag{20.18}$$
and since it is true for states, it is also true as an operator relation.

In particular, the Hamiltonian operator must be of this type (it must commute with $\hat P_{12}$, i.e., be symmetric). However, in general we have
$$\hat H = \frac{\hat P_1^2}{2m_1} + \frac{\hat P_2^2}{2m_2} + V(\hat X_1, \hat X_2). \tag{20.19}$$
(a) One possibility is that the two particles don't interact at all, corresponding to a separable system, i.e.,
$$V(\hat X_1, \hat X_2) = V(\hat X_1) + V(\hat X_2) \;\Rightarrow\; \hat H = \hat H_1 + \hat H_2. \tag{20.20}$$
In this case, the solutions to the Schrödinger equation (the states) are separable,
$$|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle, \tag{20.21}$$
leading to separable wave functions (as we saw),
$$\psi(x_1, x_2) = \psi_1(x_1)\psi_2(x_2). \tag{20.22}$$
(b) Another possibility is that the potential depends only on the relative distance of the particles,
$$V(\hat X_1, \hat X_2) = V(|\hat X_1 - \hat X_2|), \tag{20.23}$$
which, as we saw, means that the Schrödinger equation reduces to separate equations for the center of mass motion and for the relative motion, with the relative motion corresponding to a central potential. In this case, we do indeed have $\hat P_{12}\hat H\hat P_{12}^{-1} = \hat H$, as the relative motion is the same when we interchange the particles.
20.3 Generalization to N Particles

In the case of $N$ particles, we must consider the behavior under a generic permutation,
$$P = \begin{pmatrix} 1 & 2 & \cdots & N \\ i_1 & i_2 & \cdots & i_N \end{pmatrix}. \tag{20.24}$$
But consider in particular just the transposition of particles $i$ and $j$,
$$P_{ij} \equiv \begin{pmatrix} 1 & \cdots & i & \cdots & j & \cdots & N \\ 1 & \cdots & j & \cdots & i & \cdots & N \end{pmatrix}. \tag{20.25}$$
In particular, the action of $P_{ij}$ on a state is
$$\hat P_{ij}|12\ldots i\ldots j\ldots N\rangle = |12\ldots j\ldots i\ldots N\rangle. \tag{20.26}$$
However, in this case, we are back to the previous case (since only two particles are interchanged, the others are left untouched), so by the same argument we must also have
$$\hat P_{ij}|12\ldots i\ldots j\ldots N\rangle = C_{ij}|12\ldots i\ldots j\ldots N\rangle, \tag{20.27}$$
and again we must have $C_{ij}^2 = 1$ (see (20.3)), so
$$C_{ij} = \pm 1, \tag{20.28}$$
corresponding as before to either Bose–Einstein or Fermi–Dirac statistics. Moreover, the same argument leads to the Pauli exclusion principle, saying that for fermions we can have at most one particle per state.

The above analysis must be true for any $i, j$, which means that in fact, the state $|12\ldots N\rangle$ must be totally symmetric or antisymmetric, corresponding respectively to bosons and fermions. Thus, for any generic permutation $P$, obtained as a product of transpositions $P_{ij}$, the same analysis as above must hold.

For bosons, we have sums over permutations, whereas for fermions, we sum with a sign equal to the sign of the permutation $P$ (which is given by the product of the minus signs corresponding to each transposition that composes $P$). We say that
$$|12\ldots N\rangle = \frac{1}{\sqrt{N_{\rm perms}}}\sum_{{\rm perms.}\ P}({\rm sgn}(P))\,\hat P\,|1\rangle \otimes |2\rangle \otimes \cdots \otimes |N\rangle, \tag{20.29}$$
where the factor ${\rm sgn}(P)$ is present only for fermions, and the normalization reduces to the $1/\sqrt 2$ of (20.7) for $N = 2$. Wave functions in the coordinate representation are defined by
$$\psi_{S/A}(x_1, \ldots, x_N) = \langle x_1 \ldots x_N; S/A|\psi_{S/A}\rangle, \tag{20.30}$$
and thus have the same symmetry properties. They are totally symmetric for bosons and totally antisymmetric for fermions.

For a state that corresponds to individual states $a_i$ for each particle $i$, we have the multiparticle state
$$|\psi_{S/A}\rangle = \frac{1}{\sqrt{N_{\rm perms}}}\sum_{{\rm perms.}\ P}({\rm sgn}(P))\,\hat P\,|a_1\rangle \otimes \cdots \otimes |a_N\rangle, \tag{20.31}$$
which means that, for fermions, the wave functions can be written as a determinant generalizing the two-particle case (for bosons the same sum appears without the signs),
$$\psi_{A}(x_1, \ldots, x_N) = \frac{1}{\sqrt{N_{\rm perms}}}\begin{vmatrix} \psi_{a_1}(x_1) & \cdots & \psi_{a_N}(x_1) \\ \vdots & & \vdots \\ \psi_{a_1}(x_N) & \cdots & \psi_{a_N}(x_N) \end{vmatrix}. \tag{20.32}$$
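The equivalence between the signed sum over permutations and the determinant form can be illustrated numerically. The sketch below is a hedged example: the normalization $1/\sqrt{N!}$ (taking $N_{\rm perms} = N!$), the unnormalized sample orbitals $1, x, x^2$, and the names `perm_sign` and `many_body_amplitude` are all assumptions made for illustration.

```python
import math
from itertools import permutations

import numpy as np


def perm_sign(p):
    """Sign of a permutation given as a tuple, via counting transpositions."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]  # one transposition puts value j in place
            sign = -sign
    return sign


def many_body_amplitude(orbitals, xs, fermions=True):
    """(Anti)symmetrized product of single-particle orbitals evaluated at xs."""
    n = len(xs)
    total = 0.0
    for p in permutations(range(n)):
        weight = perm_sign(p) if fermions else 1
        total += weight * math.prod(orbitals[p[i]](xs[i]) for i in range(n))
    return total / math.sqrt(math.factorial(n))


# Hypothetical (unnormalized) orbitals and sample coordinates.
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x**2]
xs = [0.3, -1.1, 2.0]

slater = np.array([[phi(x) for phi in orbitals] for x in xs])  # rows: x_i, columns: a_j
print(many_body_amplitude(orbitals, xs), np.linalg.det(slater) / math.sqrt(6))
```

Swapping two coordinates flips the sign of the fermionic amplitude, and repeating an orbital makes it vanish, which is the Pauli exclusion principle in this representation.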
20.4 Canonical Commutation Relations

For bosons, as we saw, we had commuting fundamental (canonical) operators, so in particular
$$[\hat q_i, \hat q_j] = [\hat p_i, \hat p_j] = 0, \quad [\hat q_i, \hat p_j] = i\hbar\delta_{ij}. \tag{20.33}$$
But if we have (extended) phase space variables $\psi_i$ that are truly fermionic, i.e., anticommuting, their canonical anticommutation relations will be
$$\{\psi_i, \psi_j\} = \{p_{\psi_i}, p_{\psi_j}\} = 0, \quad \{\psi_i, p_{\psi_j}\} = i\hbar\delta_{ij}. \tag{20.34}$$
The simplest system will still be the harmonic oscillator, but now a fermionic version of it whose Hamiltonian can be written in terms of the above phase space variables,
$$\hat H = \frac{\hat p_\psi^2}{2m} + \frac{m\omega^2}{2}\hat\psi^2, \tag{20.35}$$
which again can be written in terms of creation and annihilation operators $b, b^\dagger$ but now obeying anticommutation relations,
$$\{b, b\} = \{b^\dagger, b^\dagger\} = 0, \quad \{b, b^\dagger\} = 1. \tag{20.36}$$
Then the quantum Hamiltonian can be written in terms of them as
$$\hat H = \hbar\omega\,\frac{-b b^\dagger + b^\dagger b}{2} = \hbar\omega\left(b^\dagger b - \frac{1}{2}\right). \tag{20.37}$$
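The algebra (20.36) has a two-dimensional matrix representation, which makes the two-level spectrum of (20.37) explicit. The sketch below is illustrative: it assumes the basis ordering $(|0\rangle, |1\rangle)$, with $b|1\rangle = |0\rangle$ and $b|0\rangle = 0$, and works in units of $\hbar\omega$; the helper name `anticommutator` is introduced here only for the example.

```python
import numpy as np

# Basis ordering (|0>, |1>): b lowers |1> to |0> and annihilates |0>.
b = np.array([[0.0, 1.0],
              [0.0, 0.0]])
b_dag = b.T


def anticommutator(x, y):
    """{x, y} = x y + y x for square matrices."""
    return x @ y + y @ x


# Hamiltonian in units of hbar * omega, written in both forms of the text.
H_antisym = 0.5 * (b_dag @ b - b @ b_dag)
H_number = b_dag @ b - 0.5 * np.eye(2)

print(anticommutator(b, b_dag))        # identity matrix
print(np.linalg.eigvalsh(H_antisym))   # eigenvalues -1/2 and +1/2
```

Since $b^2 = 0$, the occupation number $b^\dagger b$ can only be 0 or 1, so the spectrum consists of the two levels $\mp\hbar\omega/2$.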
Observations:

• We note that there exists a unique totally symmetric or antisymmetric wave function, a fact that should be obvious since we can start from one state and symmetrize it. Also, the totally symmetric or antisymmetric state is in a given representation of the permutation group $S_N$, so is uniquely defined by group theory.
• The other important observation is that we have considered only $N$ particles in order to (anti-)symmetrize, but we can ask: why not consider all the particles in the Universe? The point is that the particles considered must have some kind of interaction, even if a small one, in which case we are forced to use the symmetrization. But if we can ignore the coupling to an "exterior" made up of the rest of the Universe, we can ignore symmetrization with respect to it, and consider separation of variables instead:
$$\psi = \psi_{\rm system}\,\psi_{\rm rest}. \tag{20.38}$$
Then the result is the same as if we had not considered the exterior at all, since the probability is given by
$$P = |\psi|^2 = |\psi_{\rm system}|^2\, |\psi_{\rm rest}|^2, \tag{20.39}$$
and by summing (integrating) over the rest of the Universe, with $\int |\psi_{\rm rest}|^2 = 1$, we obtain that
$$P_{\rm system} = |\psi_{\rm system}|^2, \tag{20.40}$$
the same result as if we had just ignored the rest of the Universe.
20.5 Spin–Statistics Theorem

Now, a natural question can be asked: how do we know whether a particle is a boson or a fermion, other than by measuring its statistics (i.e., its behavior under measurement)? The answer is that the statistics obeyed by a particle is related to its spin: for integer spin the particle is a boson, whereas for half-integer spin, $j \in (\mathbb{N}/2)\setminus\mathbb{N}$, the particle is a fermion. This statement goes under the name of the spin–statistics theorem; however, it is not so much a single mathematical proof for this statement as it is a set of different proofs, of various degrees of rigor. Some of the more rigorous proofs are based on quantum field theory, and will not be reproduced here, but only stated:

(1) Lorentz invariance of the S-matrix for the scattering of particles implies the need to correlate spin with statistics, in order not to break the symmetry at the quantum level.

(2) Vacuum stability: if we consider the wrong statistics for the fields associated with particles with spin, then we find that there are contributions of negative energy, implying an arbitrarily negative potential. Then the vacuum is unstable, which cannot hold for a true vacuum, implying that the assumption about the statistics was wrong.

A third proof based on quantum field theory will be sketched, being reasonably simple to understand (though some facts will have to be taken for granted by the reader):

(3) Causality criterion: For a spacelike separation between two points, $(x - y)^2 > 0$, such that there is no causal contact between the points and we can make a Lorentz transformation to a reference system where the two points are at the same time $t$, the observables measured at points $x$ and $y$
should be independent. But in quantum mechanics, that amounts to the operators associated with them commuting,

$$[\mathcal{O}(x), \mathcal{O}(y)] = [\mathcal{O}(\vec x, t), \mathcal{O}(\vec y, t)] = 0. \tag{20.41}$$

In quantum field theory, however, a quantum field is (up to a normalization factor $N_k$) a sum over modes described by $e^{i\vec k\cdot\vec x}$ and associated creation and annihilation operators,

$$\phi(\vec x, t) = \int d^3k \left[ N_k\, a(\vec k)\, e^{-i\omega t + i\vec k\cdot\vec x} + \bar N_k\, a^\dagger(\vec k)\, e^{+i\omega t - i\vec k\cdot\vec x} \right], \tag{20.42}$$

where $\omega = \sqrt{\vec k^2 + m^2}$. The commutation relation for the creation and annihilation operators is

$$[a(\vec k), a^\dagger(\vec k')] = \delta^3(\vec k - \vec k'), \tag{20.43}$$

leading to a commutation relation between the fields themselves at different points (that are spacelike separated),

$$[\phi(\vec x, t=0), \phi(\vec y, t=0)] = \int d^3k\, |N_k|^2 \left[ e^{i\vec k\cdot(\vec x - \vec y)} - e^{-i\vec k\cdot(\vec x - \vec y)} \right] = 0, \tag{20.44}$$

where in the last equality we have used the fact that we can make the change $\vec k \to -\vec k$ in the integration. This is indeed what we expect, and since this relation is true for the basic operators (the fields), it will also be true for composite operators (arbitrary observables) made up from them. Note, however, that if we didn't have commutation relations for $a, a^\dagger$, but anticommutation relations instead (as for $b, b^\dagger$), we would get a nonzero result, thus proving the relation between spin (zero in this case) and statistics (bosonic, in this case). The relation can be proven case by case, though we do not have a general proof valid for all spins.

(4) However, the simplest (yet least rigorous) proof of the spin–statistics theorem relies on just the quantum mechanics we have defined so far (though it becomes slightly more rigorous when using a relativistic formulation, which will not be given here). The generator of rotations around $Oz$ is $J_3/\hbar$, as we saw in Chapter 16, so the matrix element for these rotations (with an angle $\theta$) is

$$g = e^{i\theta J_3/\hbar}. \tag{20.45}$$
But the interchange of two particles can be obtained as follows: we can make a rotation by $\pi$ in the plane perpendicular to $Oz$, and then a translation (depending on where the origin is, with respect to the midline between the particles). This amounts to a rotation by $\pi$ of each of the two particles, which is the same as a rotation by $2\pi$ of a single particle. However, a particle of spin $j$, with $J_3 = j\hbar$ (its maximum value), rotated by $\theta = 2\pi$, gives

$$g(2\pi) = e^{2\pi i j} = (-1)^{2j}, \tag{20.46}$$

which is indeed the formula we were looking for: symmetric ($g = +1$) for $j \in \mathbb{N}$, and antisymmetric ($g = -1$) for $j \in (\mathbb{N}/2)\setminus\mathbb{N}$ under this exchange of particles. For $j = 1/2$, and considering a single particle, we must rotate by $\theta = \pi$, so that

$$g = e^{i\pi\sigma_3/2} = \cos(\pi/2)\,\mathbb{1} + i \sin(\pi/2)\,\sigma_3 = i\sigma_3 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}. \tag{20.47}$$

Thus, for a single particle in a state $|\psi_1\rangle$, we have

$$\hat g(\pi)|\psi_1\rangle = \pm i\,|\psi_1\rangle, \tag{20.48}$$
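and for a two-particle state,

$$\hat g(\pi)\,|\psi_1\rangle \otimes |\psi_2\rangle = (\pm i)^2\, |\psi_1\rangle \otimes |\psi_2\rangle = -|\psi_1\rangle \otimes |\psi_2\rangle = |\psi_2\rangle \otimes |\psi_1\rangle, \tag{20.49}$$

as promised. Moreover, for $j = (2k+1)/2$ and $\theta = \pi$, we obtain

$$g^2(\pi) = g(2\pi) = (-1)^{2j}, \tag{20.50}$$

which is what we obtained from our first way of deriving the interchange factor.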
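The phase (20.46) can be tabulated for a few spins as a quick sanity check. This is only a numerical restatement of $e^{2\pi i j} = (-1)^{2j}$; the function name `exchange_factor` is illustrative.

```python
import cmath
from fractions import Fraction


def exchange_factor(spin):
    """g(2*pi) = exp(2*pi*i*j) for a state with J_3 = j*hbar."""
    return cmath.exp(1j * 2 * cmath.pi * float(spin))


spins = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)]
for spin in spins:
    g = exchange_factor(spin)
    expected = (-1) ** int(2 * spin)  # (-1)^(2j)
    kind = "boson" if expected == 1 else "fermion"
    print(f"j = {spin}: Re g(2pi) = {g.real:+.3f}, (-1)^(2j) = {expected:+d} -> {kind}")
```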
20.6 Particles with Spin

Next, we consider what happens when the identical particles have a spin degree of freedom. Consider two electrons, with spin $s = 1/2$. Then the two-particle states are characterized by the individual positions and spin projections on $Oz$, with Hilbert space basis

$$|\vec x_1, m_{s1}\rangle \otimes |\vec x_2, m_{s2}\rangle \to |\vec x_1, m_{s1}; \vec x_2, m_{s2}\rangle. \tag{20.51}$$

If the total Hamiltonian commutes with the total spin squared, $[\hat{\vec S}{}^2_{\rm tot}, \hat H] = 0$, i.e., if the Hamiltonian is independent of the total spin, then we can write the states as the product of separate states for position and spin,

$$|\psi\rangle = |\phi_{\rm position}\rangle \otimes |\chi_{\rm spin}\rangle. \tag{20.52}$$

The wave functions in the coordinate–spin basis are

$$\psi(\vec x_1, \vec x_2; m_{s1}, m_{s2}) = \langle \vec x_1, m_{s1}; \vec x_2, m_{s2}|\psi\rangle, \tag{20.53}$$

and they are separable into pure coordinate and pure spin wave functions,

$$\psi(\vec x_1, \vec x_2; m_{s1}, m_{s2}) = \phi(\vec x_1, \vec x_2)\,\chi(m_{s1}, m_{s2}). \tag{20.54}$$
Since the addition of the two angular momenta (of the electrons with spin $s = 1/2$) gives

$$\tfrac{1}{2} \otimes \tfrac{1}{2} = 1 \oplus 0, \tag{20.55}$$

meaning the four states of the product space decompose into three states of total spin 1 (the triplet representation), $|jm\rangle = |1,-1\rangle, |1,0\rangle, |1,1\rangle$, plus a state of total spin zero (the singlet representation), $|jm\rangle = |0,0\rangle$, it follows that we can construct spin wave functions that correspond to these representations,

$$\chi(m_{s1}, m_{s2}) = \begin{cases} \chi_{++} \\ \frac{1}{\sqrt 2}(\chi_{+-} + \chi_{-+}) \\ \chi_{--} \\ \frac{1}{\sqrt 2}(\chi_{+-} - \chi_{-+}), \end{cases} \tag{20.56}$$
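The exchange symmetry of the four spin wave functions in (20.56) can be checked directly. The sketch below assumes the component ordering $(++, +-, -+, --)$ for $(m_{s1}, m_{s2})$; the names `exchange`, `states`, and `parities` are illustrative.

```python
from math import sqrt

# Assumed basis order for (m_s1, m_s2): ++, +-, -+, --


def exchange(chi):
    """Swap the two spins: the +- and -+ components trade places."""
    return [chi[0], chi[2], chi[1], chi[3]]


s = 1 / sqrt(2)
states = {
    "triplet m=+1": [1, 0, 0, 0],
    "triplet m=0": [0, s, s, 0],
    "triplet m=-1": [0, 0, 0, 1],
    "singlet": [0, s, -s, 0],
}

# +1 for a symmetric spin function, -1 for an antisymmetric one
parities = {}
for name, chi in states.items():
    swapped = exchange(chi)
    if swapped == chi:
        parities[name] = +1
    elif swapped == [-c for c in chi]:
        parities[name] = -1
```

The three triplet states come out even and the singlet odd under the exchange.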
where the first three states are in the triplet representation, and the last in the singlet representation. We note that indeed the triplet states are even under permutations (symmetric), as expected, since the triplet implies that the electron angular momenta are parallel, whereas the singlet state is odd under permutations (antisymmetric), since it implies that the electron angular momenta are antiparallel. In the state where both spin projections are positive, a permutation must give back the same state,

$$\hat P_{12}\,\chi_{++} = +\chi_{++}, \tag{20.57}$$

and the other two states in the triplet representation are obtained by acting with the total spin lowering operator $S_- = S_{1,-} + S_{2,-}$. Since we also have $[\hat P_{12}, S_-] = 0$, the property of being symmetric is maintained by acting with $S_-$. Because the wave function factorizes (into separate variables), the permutation operator also factorizes,

$$\hat P_{12} = \hat P_{12}^{\rm space} \cdot \hat P_{12}^{\rm spin}. \tag{20.58}$$

Since we only need the total wave function to be antisymmetric under the permutation (as is required for a fermion),

$$\hat P_{12}\,\psi(x_1, x_2) = -\psi(x_1, x_2) = \psi(x_2, x_1), \tag{20.59}$$
it means we have two possibilities:

• We can have a symmetric spatial wave function, $\phi(\vec x_1, \vec x_2) = \phi(\vec x_2, \vec x_1)$, and an antisymmetric spin wave function, $\chi(m_{s1}, m_{s2}) = -\chi(m_{s2}, m_{s1})$. This is the case when the spins are antiparallel, and so corresponds to the singlet representation.

• We can have an antisymmetric spatial wave function, $\phi(\vec x_1, \vec x_2) = -\phi(\vec x_2, \vec x_1)$, and a symmetric spin wave function, $\chi(m_{s1}, m_{s2}) = +\chi(m_{s2}, m_{s1})$. This is the case when the spins are parallel, and so corresponds to the triplet representation.

In the case of a separable Hamiltonian, $\hat H = \hat H_1 + \hat H_2$, we can further separate the spatial wave function $\phi(\vec x_1, \vec x_2)$, which thus has solutions of the type

$$\phi(\vec x_1, \vec x_2) = \phi_A(\vec x_1)\,\phi_B(\vec x_2), \tag{20.60}$$
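where $\phi_A(\vec x_1)$ corresponds to $\hat H_1$ and $\phi_B(\vec x_2)$ corresponds to $\hat H_2$. But then we can write down a spatial symmetric or antisymmetric wave function,

$$\phi(\vec x_1, \vec x_2) = \frac{1}{\sqrt 2}\left[\phi_A(\vec x_1)\phi_B(\vec x_2) \pm \phi_A(\vec x_2)\phi_B(\vec x_1)\right], \tag{20.61}$$

corresponding to the singlet (upper sign, symmetric in space) and triplet (lower sign, antisymmetric in space) representations, respectively. In that case, the probability density of finding one electron at $\vec x_1$ and the other at $\vec x_2$ is

$$|\phi(\vec x_1, \vec x_2)|^2 = \frac{1}{2}\left\{ |\phi_A(\vec x_1)|^2 |\phi_B(\vec x_2)|^2 + |\phi_B(\vec x_1)|^2 |\phi_A(\vec x_2)|^2 \pm 2\,{\rm Re}\left[\phi_A(\vec x_1)\phi_B(\vec x_2)\phi_A^*(\vec x_2)\phi_B^*(\vec x_1)\right] \right\}. \tag{20.62}$$

The last term in the above is called the exchange probability density. We note that for the same state, $A = B$, when $\vec x_1 = \vec x_2$ we find that the antisymmetric spatial wave function (spin triplet) gives $|\phi(\vec x_1, \vec x_2)|^2 = 0$, which is a statement of the Pauli exclusion principle. On the other hand, for the symmetric spatial wave function (spin singlet), the probability when $A = B$ for $\vec x_1 = \vec x_2$ is maximal.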
Important Concepts to Remember

• In quantum mechanics identical particles are indistinguishable, which means the result of acting with the permutation operator on a state with two particles must be related to the unpermuted state by a phase, $|x_2 x_1\rangle = C_{12}|x_1 x_2\rangle$.
• Excluding the nontrivial cases of anyons and nonabelions, that means we can either have Bose–Einstein statistics (bosons) for $C_{12} = +1$, or Fermi–Dirac statistics (fermions), $C_{12} = -1$.
• For fermions, this leads to the Pauli exclusion principle, which states that we cannot have two fermions in the same state: $|x x\rangle = 0$.
• Multiparticle operators commute with the permutation operator; for the Hamiltonian, it is either separable and symmetric, or has a potential depending only on the relative distances of the particles.
• A multiparticle state is either totally symmetric or totally antisymmetric. In the antisymmetric case, this leads to a Slater determinant for the wave function.
• Fermionic phase space variables have anticommutation relations rather than commutation relations, indicated by $\{\,,\}$. In particular, the fermionic harmonic oscillator has operators $b, b^\dagger$ with $\{b, b^\dagger\} = 1$.
• The spin–statistics theorem says that integer spin particles are bosons, and half-integer spin particles are fermions; this can be understood from the rotation matrix $g = e^{i\theta J_3/\hbar}$, implying that $g(2\pi) = (-1)^{2j}$.
• For particles with spin, the total wave function is a product of the coordinate wave function times the spinor wave function, both of which need to be totally symmetric or totally antisymmetric; only the product wave function is governed by the spin–statistics theorem.
• Thus a fermion has either a symmetric spatial wave function and an antisymmetric spin wave function, or an antisymmetric spatial wave function and a symmetric spin wave function.
• For a separable Hamiltonian, the probability density has a direct term and an exchange term.
Further Reading See [2], [1], [3] for more details.
Exercises

(1) Write down an antisymmetric (fermionic) state for three particles, the energy eigenstates $E_1, E_2, E_3$, and the corresponding Slater determinant.

(2) Consider a three-body problem with two-body interactions, i.e., with the potential

$$\hat V = \hat V_{\rm int}(|\hat X_1 - \hat X_2|) + \hat V_{\rm int}(|\hat X_1 - \hat X_3|) + \hat V_{\rm int}(|\hat X_2 - \hat X_3|). \tag{20.63}$$

Is this quantum potential suitable for indistinguishable particles?

(3) Consider a state with $N$ indistinguishable particles. What is the phase that relates $|12\ldots N\rangle$ with $|N\ldots 21\rangle$?

(4) Write down a general state for a fermionic harmonic oscillator. Also write down a general state for $N$ identical fermionic harmonic oscillators.
(5) We saw in the text that $[\phi(\vec x, t), \phi(\vec y, t)] = 0$ for a bosonic field. Show also that $\{\psi(\vec x, t), \psi(\vec y, t)\} = 0$ for a fermion (Hint: think about the fact that $\psi$ and $\psi^\dagger$ are conjugate for fermions, unlike for bosons). What relation will we have for the product $\phi(\vec x, t)\psi(\vec y, t)$?

(6) Consider three particles of spin 1/2. Write the possible spin wave functions for the three-particle states, and construct the possible total wave functions, for the spatial wave function times the spin wave function.

(7) For the case in exercise 6, write down the probability and identify the exchange terms.
21 Application of Identical Particles: He Atom (Two-Electron System) and H2 Molecule

In this chapter, we will apply the formalism of the previous chapter to two cases with two electrons: the He atom (and helium-like atoms, with an arbitrary $Z$ but two electrons only) and the H$_2$ molecule. In order to do these calculations, we will introduce approximation methods that will be formalized and generalized later in the book.
21.1 Helium-Like Atoms

Helium-like atoms have a nucleus of charge $+Ze$, around which two electrons move. Of course, in the case of helium $Z = 2$, but we can consider general $Z$. The nucleus will be considered as approximately fixed. As in the case of the hydrogen atom, the center of mass motion factorizes, and the relative motions are the only ones of importance: since the mass of the nucleus is much larger than that of the electrons, for the relative motions of the electrons we can consider a fixed nucleus situated at the origin, around which the electrons move (with reduced mass approximately equal to the electronic mass). The Hamiltonian for the system contains the potentials for the interactions of the two electrons with the nucleus and with each other, thus being given by

$$\hat H = \frac{\hat{\vec p}{}_1^{\,2}}{2m} + \frac{\hat{\vec p}{}_2^{\,2}}{2m} - \frac{Ze_0^2}{r_1} - \frac{Ze_0^2}{r_2} + \frac{e_0^2}{r_{12}} = \hat H_1 + \hat H_2 + \hat H_{12}, \tag{21.1}$$

where $e_0^2 = e^2/(4\pi\epsilon_0)$, as before. As in the previous chapter, the state of the system factorizes into a spin state and a coordinate state, and correspondingly the wave function is the product of a wave function for positions and one for spins,

$$\psi(\vec x_1, \vec x_2; m_{s1}, m_{s2}) = \phi(\vec x_1, \vec x_2)\,\chi(m_{s1}, m_{s2}), \tag{21.2}$$

where the spin wave function $\chi(m_{s1}, m_{s2})$ is either in the triplet representation,

$$\chi_{++} = \chi_{1+}\chi_{2+}; \quad \frac{1}{\sqrt 2}(\chi_{+-} + \chi_{-+}) = \frac{1}{\sqrt 2}(\chi_{1+}\chi_{2-} + \chi_{1-}\chi_{2+}); \quad \chi_{--} = \chi_{1-}\chi_{2-}, \tag{21.3}$$

or in the singlet representation

$$\frac{1}{\sqrt 2}(\chi_{+-} - \chi_{-+}) = \frac{1}{\sqrt 2}(\chi_{1+}\chi_{2-} - \chi_{1-}\chi_{2+}). \tag{21.4}$$
The position wave function $\phi(\vec x_1, \vec x_2)$ is either antisymmetric, corresponding to the triplet spin wave function, or symmetric, corresponding to the singlet spin wave function. The case with triplet spin (when the spins of the individual electrons are parallel) and antisymmetric $\phi(\vec x_1, \vec x_2)$ is a state that is known as "ortho" (so in this case, we have ortho-helium), while the case with singlet spin (when the electron spins are antiparallel) and symmetric $\phi(\vec x_1, \vec x_2)$ is a state known as "para" (so in this case, we have para-helium). To proceed, we will make some approximations.
Approximation 1

We neglect the interaction between the electrons, $\hat H_{12} = e_0^2/r_{12}$, for the wave functions, and so consider eigenfunctions of the noninteracting Hamiltonian,

$$\hat H \simeq \hat H_1 + \hat H_2, \tag{21.5}$$

and then, as in the noninteracting case, we can separate variables, obtaining eigenfunctions

$$\psi(\vec r_1, \vec r_2) = \psi(\vec r_1)\,\psi(\vec r_2). \tag{21.6}$$

For these eigenfunctions, the eigenenergies are

$$E \simeq E^{(1)} + E^{(2)}, \tag{21.7}$$

and $(\psi(\vec r_1), E^{(1)})$, $(\psi(\vec r_2), E^{(2)})$ are solutions of a hydrogen-like (noninteracting) Schrödinger equation,

$$\left(-\frac{\hbar^2}{2m}\Delta_{\vec r_s} - \frac{Ze_0^2}{r_s}\right)\psi^{(s)}(\vec r_s) = E^{(s)}\,\psi^{(s)}(\vec r_s). \tag{21.8}$$

Then the individual electron wave functions are

$$\psi^{(s)}(\vec r_s) = \psi_{n_s l_s m_s}(\vec r_s) \equiv \psi_{q_s}(\vec r_s), \tag{21.9}$$

and their energies are

$$E^{(s)} = E_{n_s} = -\frac{Z^2 e_0^2}{2a_0}\frac{1}{n_s^2} = -\frac{mZ^2 e_0^4}{2\hbar^2}\frac{1}{n_s^2}. \tag{21.10}$$

The corresponding symmetric (spin singlet) and antisymmetric (spin triplet) coordinate wave functions are

$$\begin{aligned} \phi_s(\vec r_1, \vec r_2) &= \frac{1}{\sqrt 2}\left[\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) + \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)\right] \\ \phi_a(\vec r_1, \vec r_2) &= \frac{1}{\sqrt 2}\left[\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) - \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)\right]. \end{aligned} \tag{21.11}$$
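Approximation 2

We now suppose that the energy of the interaction-less case above is corrected by the interaction term $\hat H_{12}$ evaluated in the above (noninteracting, S or A) state,

$$\begin{aligned} E &= E^{(1)} + E^{(2)} + \Delta E \\ \Delta E &= \left\langle \phi_{S/A}\right| \frac{e_0^2}{r_{12}} \left|\phi_{S/A}\right\rangle = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, |\phi_{S/A}(\vec r_1, \vec r_2)|^2. \end{aligned} \tag{21.12}$$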
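The probability density in the integral can be written as

$$\begin{aligned} |\phi_{s,a}(\vec r_1, \vec r_2)|^2 &= \left| \frac{\psi_{q_1}(\vec r_1)\psi_{q_2}(\vec r_2) \pm \psi_{q_2}(\vec r_1)\psi_{q_1}(\vec r_2)}{\sqrt 2} \right|^2 \\ &= \frac{1}{2}\left\{ |\psi_{q_1}(\vec r_1)|^2 |\psi_{q_2}(\vec r_2)|^2 + |\psi_{q_2}(\vec r_1)|^2 |\psi_{q_1}(\vec r_2)|^2 \pm 2\,{\rm Re}\left[\psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1)\right] \right\}. \end{aligned} \tag{21.13}$$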
The last term is called the exchange probability density, as we said in the previous chapter. The integrals involving the terms above can be rewritten in a useful way. Consider the integrals of the first two terms:

$$\begin{aligned} C &\equiv \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, \frac{1}{2}\left[ |\psi_{q_1}(\vec r_1)|^2 |\psi_{q_2}(\vec r_2)|^2 + |\psi_{q_2}(\vec r_1)|^2 |\psi_{q_1}(\vec r_2)|^2 \right] \\ &= \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, |\psi_{q_1}(\vec r_1)|^2 |\psi_{q_2}(\vec r_2)|^2 \\ &= \int d^3r_1 \int d^3r_2\, \frac{\rho_{q_1}(\vec r_1)\,\rho_{q_2}(\vec r_2)}{4\pi\epsilon_0\, r_{12}}, \end{aligned} \tag{21.14}$$

where in the second line we have made the redefinition $\vec r_1 \leftrightarrow \vec r_2$ in the integral of the second term, in order to show that it equals the first term, and in the third line we have defined the charge density as the charge times the probability density, $\rho_q(\vec r) = e|\psi_q(\vec r)|^2 = e_0\sqrt{4\pi\epsilon_0}\,|\psi_q(\vec r)|^2$. As we can see, this integral is a Coulomb-type integral; it is just the integral of the Coulomb potential for charge densities, hence the name $C$ for the contribution. Next consider the last term in (21.13) (with $\pm 2\,{\rm Re}$ in front):

$$\begin{aligned} A &\equiv \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, \frac{1}{2}\left[ \psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1) + \psi_{q_1}(\vec r_2)\psi_{q_1}^*(\vec r_1)\psi_{q_2}(\vec r_1)\psi_{q_2}^*(\vec r_2) \right] \\ &= \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, \psi_{q_1}(\vec r_1)\psi_{q_1}^*(\vec r_2)\psi_{q_2}(\vec r_2)\psi_{q_2}^*(\vec r_1), \end{aligned} \tag{21.15}$$

where we have used again that the two terms (which are otherwise complex conjugates of each other) are equal, as can be seen by making the redefinition $\vec r_1 \leftrightarrow \vec r_2$. This means that the term is actually real, as needed since it is a term in the energy. It also means that we can define $f(\vec r) \equiv \psi_{q_1}(\vec r)\psi_{q_2}^*(\vec r)$, and write

$$A = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, f(\vec r_1)\, f^*(\vec r_2). \tag{21.16}$$

Here $A$ is the exchange integral or exchange energy, and the initial comes from the German "Austausch", meaning exchange. Finally then, the energy is

$$E = E_{q_1} + E_{q_2} + C \pm A. \tag{21.17}$$

Moreover, we can show that $A$ is not only real, but positive. Indeed, we can see that the largest contribution to the integral comes from $r_{12} \to 0$, when the two particles are close to each other. But $A$ is an integral of $f(\vec r_1) f^*(\vec r_2)/r_{12}$, and if $\vec r_1, \vec r_2$ are close together then, by the continuity of the function $f$, $f(\vec r_1) \simeq f(\vec r_2)$ (or, for real $f$, at the very least they have the same sign), meaning that the product $f(\vec r_1) f^*(\vec r_2) \simeq |f(\vec r_1)|^2$ is positive, and so this leading contribution to $A$ is positive.
But that means that the energy of the ortho-helium state (antisymmetric position wave function) is smaller than the energy of the para-helium state (symmetric position wave function), meaning that it is more stable. Thus we have a tendency to have parallel spins, i.e., triplet states, even though we don’t have any interaction that depends on spin (this tendency follows simply from the statistics of the fermions). Also note that when q1 = q2 (both electrons have the same quantum numbers n, l, m), the analysis is special (the previous analysis does not apply). Then there is no antisymmetric (ortho-helium, or spin triplet) state, only a symmetric one (para-helium, or spin singlet), corresponding to antiparallel spins for the electrons.
21.2 Ground State of the Helium (or Helium-Like) Atom

We will calculate only the ground state of the helium (or helium-like) atom, so we need only consider the ground states of hydrogen-like atoms as individual wave functions. Then we have

$$\psi_q = \psi_{100}(\vec r) = R_{10}(r)\, Y_{00}(\theta, \phi). \tag{21.18}$$

But the ground state spherical harmonic is just a normalization constant, since the state has no angular momentum, and this means that it is spherically symmetric,

$$Y_{00}(\theta, \phi) = \frac{1}{\sqrt{4\pi}}, \tag{21.19}$$

so that $\int d\Omega\, |Y_{00}|^2 = 1$. In orbital notation, the ground state of a helium-like atom is $(1s)^2$, meaning that $n = 1$, $l = 0$ (denoted as an "s" orbital) and the index 2 refers to there being two electrons in this state (necessarily of opposite spins by the Pauli exclusion principle, since they are in the same state for coordinates). The radial wave function is $R_{10}(r) \sim e^{-kr}$, where $k = Z/a_0$, and the normalization constant for it comes from

$$\int_0^\infty e^{-2kr}\, r^2\, dr = \frac{1}{4k^3}, \tag{21.20}$$

leading to

$$\psi_q = \left(\frac{Z}{a_0}\right)^{3/2} \frac{2}{\sqrt{4\pi}}\, e^{-Zr/a_0}. \tag{21.21}$$

Then the para (spin singlet) state of a helium-like atom is

$$\psi_s = \psi_q(\vec r_1)\,\psi_q(\vec r_2)\,\chi_{\rm singlet} = \frac{Z^3}{\pi a_0^3}\, e^{-Z(r_1 + r_2)/a_0}\,\chi_{\rm singlet}, \tag{21.22}$$

and the electron interaction energy is

$$\Delta E = \left\langle \psi_s \right| \frac{e_0^2}{r_{12}} \left| \psi_s \right\rangle = \int d^3r_1 \int d^3r_2\, \frac{Z^6}{\pi^2 a_0^6}\, e^{-2k(r_1 + r_2)}\, \frac{e_0^2}{r_{12}}. \tag{21.23}$$
To do this integral, we first define

$$\Phi(\vec r_1) \equiv \int d^3r_2\, \frac{e^{-2kr_2}}{r_{12}}, \tag{21.24}$$
and note that it has the form of a Coulomb potential at the point $\vec r_1$ for a distribution of charge with density $e^{-2kr_2}$. It then satisfies the Poisson equation

$$\Delta_{\vec r_1} \Phi(\vec r_1) = -4\pi\, e^{-2kr_1}. \tag{21.25}$$

Because of the spherical symmetry of the equation, $\vec L^2 \Phi = 0$, we obtain just the radial equation,

$$\left( \frac{d^2}{dr_1^2} + \frac{2}{r_1}\frac{d}{dr_1} \right) \Phi(r_1) = -4\pi\, e^{-2kr_1} \tag{21.26}$$

or

$$\frac{d^2}{dr_1^2}\left( r_1 \Phi(r_1) \right) = -4\pi\, r_1\, e^{-2kr_1}. \tag{21.27}$$

We see that it has a particular solution of the form $r_1\Phi(r_1) = (a r_1 + b)\, e^{-2kr_1}$ and, by substituting it into (21.27), we obtain

$$\left[ (2k)^2 (a r_1 + b) - 4ka \right] e^{-2kr_1} = -4\pi\, r_1\, e^{-2kr_1}, \tag{21.28}$$

which fixes the coefficients as follows:

$$a = -\frac{\pi}{k^2}, \qquad b = \frac{a}{k} = -\frac{\pi}{k^3}. \tag{21.29}$$
However, the general solution of a second-order differential equation with a source (the Poisson equation) is equal to the particular solution plus the general solution of the homogeneous (without source) equation, which is $r_1\Phi(r_1) = C_1 r_1 + C_2$. To fix the coefficients $C_1, C_2$, we impose the physical conditions that the potential $\Phi(r_1)$ should vanish at infinity, $\Phi(r_1 \to \infty) \to 0$, which gives $C_1 = 0$, and that it is nonsingular at $r_1 \to 0$, since in the original definition (21.24) there is nothing special about $r_1 = 0$. This last condition implies that the coefficient of $1/r_1$ must vanish at $r_1 = 0$, leading to $C_2 = -b = \pi/k^3$. Finally, then, we have

$$\Phi(r_1) = -\pi \left[ \frac{1}{k^2}\, e^{-2kr_1} + \frac{1}{k^3}\frac{e^{-2kr_1}}{r_1} - \frac{1}{k^3}\frac{1}{r_1} \right]. \tag{21.30}$$

Integrating over $\vec r_1$ as well, we get

$$\begin{aligned} \int \Phi(r_1)\, e^{-2kr_1}\, d^3r_1 &= \int_0^\infty \Phi(r_1)\, e^{-2kr_1}\, 4\pi r_1^2\, dr_1 \\ &= -4\pi^2 \left[ \frac{1}{k^2}\int_0^\infty e^{-4kr_1} r_1^2\, dr_1 + \frac{1}{k^3}\int_0^\infty e^{-4kr_1} r_1\, dr_1 - \frac{1}{k^3}\int_0^\infty e^{-2kr_1} r_1\, dr_1 \right] \\ &= -4\pi^2 \left[ \frac{1}{k^2}\frac{1}{32k^3} + \frac{1}{k^3}\frac{1}{16k^2} - \frac{1}{k^3}\frac{1}{4k^2} \right] \\ &= +\frac{5}{8}\frac{\pi^2}{k^5}. \end{aligned} \tag{21.31}$$

Thus the correction $\Delta E$ to the energy is

$$\Delta E = \frac{e_0^2 Z^6}{\pi^2 a_0^6}\, \frac{5\pi^2}{8k^5} = \frac{5}{8}\frac{Ze_0^2}{a_0}, \tag{21.32}$$
and the total energy is the sum of two ground state energies for a hydrogen-like atom, plus the above correction,

$$E = 2E_0 + \Delta E = -\frac{Z^2 e_0^2}{a_0} + \frac{5}{8}\frac{Ze_0^2}{a_0} = -\frac{Ze_0^2}{a_0}\left( Z - \frac{5}{8} \right). \tag{21.33}$$

Using $Z = 2$ for helium, we obtain $E \simeq -74.8$ eV, which may be compared with the experimental value of $E_{\rm exp} = -78.8$ eV. That is quite good, but we can in fact do better.
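The radial integral (21.31) and the number quoted for helium can be cross-checked numerically. The sketch below uses a plain trapezoid rule in units where $a_0 = 1$, and assumes $e_0^2/a_0 \approx 27.2114$ eV (the Hartree energy, a value not quoted in the text); the function names are illustrative.

```python
from math import exp, pi

HARTREE_EV = 27.2114  # assumed value of e_0^2 / a_0 in eV


def phi(r, k):
    """Potential (21.30) generated by the charge density exp(-2 k r)."""
    if r == 0.0:
        return pi / k**2  # finite limit of (21.30) as r -> 0
    return -pi * (exp(-2 * k * r) / k**2 + (exp(-2 * k * r) - 1.0) / (k**3 * r))


def double_integral(k, r_max_in_units=40.0, steps=20000):
    """Trapezoid rule for the radial integral in (21.31)."""
    h = r_max_in_units / k / steps
    total = 0.0
    for n in range(steps + 1):
        r = n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * phi(r, k) * exp(-2 * k * r) * 4 * pi * r * r
    return total * h


def helium_like_energy(z):
    """First-order estimate (21.33), in eV."""
    return -HARTREE_EV * z * (z - 5.0 / 8.0)


k = 2.0  # k = Z / a_0 with Z = 2, in units where a_0 = 1
numeric = double_integral(k)
exact = 5 * pi**2 / (8 * k**5)  # right-hand side of (21.31)
energy_he = helium_like_energy(2.0)
```

The numerical integral should agree with $5\pi^2/(8k^5)$ to within the accuracy of the quadrature, and `energy_he` reproduces the estimate of about $-74.8$ eV.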
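21.3 Approximation 3: Variational Method, "Light Version"

We can use a variational method to account for the fact that, when considering the interaction of the individual electrons with the nucleus, the electrons do not see the full charge $+Ze$ of the nucleus but, rather, a smaller effective charge, owing to the partial screening of the nuclear charge by the charge density cloud created by the second electron. To find this effective charge $Z_{\rm eff}$, we use a variational method (studied in detail later, in Chapter 41). Namely, we consider a modified wave function, with $Z \to Z_{\rm eff}$, so that

$$\tilde\psi_s = \frac{Z_{\rm eff}^3}{\pi a_0^3}\, e^{-Z_{\rm eff}(r_1 + r_2)/a_0}, \tag{21.34}$$

and evaluate the Hamiltonian in this effective state,

$$E = \left\langle \tilde\psi_s \right| \frac{\hat{\vec p}{}_1^{\,2}}{2m} \left| \tilde\psi_s \right\rangle + \left\langle \tilde\psi_s \right| \frac{\hat{\vec p}{}_2^{\,2}}{2m} \left| \tilde\psi_s \right\rangle - \left\langle \tilde\psi_s \right| \frac{Ze_0^2}{r_1} \left| \tilde\psi_s \right\rangle - \left\langle \tilde\psi_s \right| \frac{Ze_0^2}{r_2} \left| \tilde\psi_s \right\rangle + \left\langle \tilde\psi_s \right| \frac{e_0^2}{r_{12}} \left| \tilde\psi_s \right\rangle. \tag{21.35}$$

But we saw in Chapter 18 that

$$\begin{aligned} \left\langle \frac{1}{r} \right\rangle_n = \frac{Z}{a_0}\frac{1}{n^2} \quad &\Rightarrow \quad \left\langle \tilde\psi_s \right| \frac{1}{r} \left| \tilde\psi_s \right\rangle = \frac{Z_{\rm eff}}{a_0} \\ \left\langle \frac{\hat{\vec p}{}^{\,2}}{2m} \right\rangle_n = \frac{Z^2 e_0^2}{2a_0}\frac{1}{n^2} \quad &\Rightarrow \quad \left\langle \tilde\psi_s \right| \frac{\hat{\vec p}{}^{\,2}}{2m} \left| \tilde\psi_s \right\rangle = \frac{Z_{\rm eff}^2 e_0^2}{2a_0}, \end{aligned} \tag{21.36}$$

so that the energy functional to be minimized is

$$E = 2\,\frac{Z_{\rm eff}^2 e_0^2}{2a_0} - 2\,\frac{Z Z_{\rm eff}\, e_0^2}{a_0} + \frac{5}{8}\frac{Z_{\rm eff}\, e_0^2}{a_0}. \tag{21.37}$$

Minimizing this over $Z_{\rm eff}$ at fixed $Z$ by $\delta E/\delta Z_{\rm eff} = 0$, we obtain

$$Z_{\rm eff} = Z - \frac{5}{16}. \tag{21.38}$$

Replacing it back in (21.37), we find the minimized energy as

$$E = -\frac{e_0^2}{a_0}\left( Z - \frac{5}{16} \right)^2 = -\frac{Z_{\rm eff}^2 e_0^2}{a_0}. \tag{21.39}$$

An equivalent way to obtain this result is to note that (as seen in Chapter 18)

$$\left\langle \frac{1}{r} \right\rangle_n = \frac{Z}{a_0}\frac{1}{n^2} \quad \Rightarrow \quad \left\langle \frac{1}{r_{12}} - \frac{5}{16}\frac{1}{r_1} - \frac{5}{16}\frac{1}{r_2} \right\rangle = 0, \tag{21.40}$$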
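and use this to rewrite the Hamiltonian in the form

$$\hat H = \frac{\hat{\vec p}{}_1^{\,2}}{2m} + \frac{\hat{\vec p}{}_2^{\,2}}{2m} - \left( Z - \frac{5}{16} \right)\frac{e_0^2}{r_1} - \left( Z - \frac{5}{16} \right)\frac{e_0^2}{r_2} + e_0^2\left( \frac{1}{r_{12}} - \frac{5}{16}\frac{1}{r_1} - \frac{5}{16}\frac{1}{r_2} \right), \tag{21.41}$$

where the last term has zero quantum average, so can be dropped. But then the remaining terms are the same as for two decoupled electrons in the potential of a nucleus with $Z_{\rm eff} = Z - 5/16$, so the total energy is

$$E = -2\,\frac{e_0^2}{2a_0}\left( Z - \frac{5}{16} \right)^2. \tag{21.42}$$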
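The minimization leading to (21.38) and (21.39) can be reproduced with a crude numerical scan. The sketch below works in units of $e_0^2/a_0$ and converts to eV assuming $e_0^2/a_0 \approx 27.2114$ eV (a value not quoted in the text); the scan range $[Z-1, Z]$ and the function names are choices made only for this illustration.

```python
HARTREE_EV = 27.2114  # assumed value of e_0^2 / a_0 in eV


def energy(z_eff, z):
    """Energy functional (21.37) in units of e_0^2 / a_0."""
    return z_eff**2 - 2.0 * z * z_eff + 5.0 / 8.0 * z_eff


def minimize_scan(z, steps=10000):
    """Brute-force scan of Z_eff over [z - 1, z]; deliberately crude."""
    candidates = (z - 1.0 + n / steps for n in range(steps + 1))
    return min(candidates, key=lambda t: energy(t, z))


z = 2.0  # helium
z_eff_numeric = minimize_scan(z)
z_eff_exact = z - 5.0 / 16.0                       # (21.38)
e_min_ev = HARTREE_EV * energy(z_eff_exact, z)     # (21.39) in eV
e_first_order_ev = -HARTREE_EV * z * (z - 5.0 / 8.0)  # (21.33) in eV
```

For helium the scan lands on $Z_{\rm eff} \approx 27/16$, and the variational energy lies below the first-order estimate (21.33), as expected for an improved upper bound.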
21.4 H2 Molecule and Its Ground State

Another two-electron system is the hydrogen molecule H$_2$, with two protons (nuclei), called A and B, and two electrons, called 1 and 2. The Hamiltonian corresponds to the Coulomb interactions between the electrons and each of the nuclei, between themselves, and between the two nuclei, so is given by

$$\hat H = -\frac{\hbar^2}{2m}\Delta_1 - \frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{1B}} - \frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{2B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R}, \tag{21.43}$$

where $R = |\vec r_A - \vec r_B|$ is the distance between the nuclei. It is clear that in the ground state, at least for $R \gg a_0$, one electron is near nucleus A, and the other is near nucleus B. Indeed, consider the opposite situation, with both electrons near one nucleus; thus we have two ionized hydrogen atoms, H$^+$ and H$^-$. Here H$^+$ is just a bare nucleus, so doesn't contribute in the first approximation, whereas H$^-$ can be thought of as a helium-like atom (with two electrons), but with $Z = 1$. Then we can apply the helium-like energy formulas, obtaining that the electron energy of the system is

$$E_e = -\frac{e_0^2}{a_0}\left( 1 - \frac{5}{16} \right)^2 > -\frac{e_0^2}{a_0} = 2E_{H,0}, \tag{21.44}$$

which is larger than the energy of the electrons in the configuration where we have two neutral hydrogen atoms. This means that for the ground state in the $R \gg a_0$ case we have one electron near nucleus A and the other near nucleus B, with two neutral hydrogen atoms, H + H. For this configuration, we have two eigenstates, depending on which electron is near which nucleus,

$$\begin{aligned} \psi_1 &= \psi(\vec r_{1A})\,\psi(\vec r_{2B}) \\ \psi_2 &= \psi(\vec r_{1B})\,\psi(\vec r_{2A}). \end{aligned} \tag{21.45}$$

But while these two states are normalized, they are not orthogonal because they admit a nonzero product,

$$\langle \psi_1|\psi_2\rangle = \int d^3r_1\, \psi^*(\vec r_{1A})\psi(\vec r_{1B}) \int d^3r_2\, \psi(\vec r_{2A})\psi^*(\vec r_{2B}) = |S|^2, \tag{21.46}$$
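where we have defined the "overlap integral"

$$S \equiv \int d^3r\, \psi^*(\vec r_A)\,\psi(\vec r_B). \tag{21.47}$$

Since we have the products

$$\langle \psi_1 \pm \psi_2|\psi_1 \pm \psi_2\rangle = \langle\psi_1|\psi_1\rangle + \langle\psi_2|\psi_2\rangle \pm 2\langle\psi_1|\psi_2\rangle = 2(1 \pm |S|^2), \tag{21.48}$$

the normalized symmetric and antisymmetric coordinate wave functions for the two-electron system are

$$\psi_{s/a} = \frac{\psi_1 \pm \psi_2}{\sqrt 2\,\sqrt{1 \pm |S|^2}} = \frac{\psi(\vec r_{1A})\psi(\vec r_{2B}) \pm \psi(\vec r_{1B})\psi(\vec r_{2A})}{\sqrt 2\,\sqrt{1 \pm |S|^2}}. \tag{21.49}$$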
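Then the matrix elements of the Hamiltonian are

$$\begin{aligned} \langle\psi_1|\hat H|\psi_1\rangle &= \left\langle\psi_1\right| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \left|\psi_1\right\rangle + \left\langle\psi_1\right| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \left|\psi_1\right\rangle \\ &\quad + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(\vec r_{1A})|^2 |\psi(\vec r_{2B})|^2 \\ &= 2E_0 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(\vec r_{1A})|^2 |\psi(\vec r_{2B})|^2 \\ \langle\psi_2|\hat H|\psi_2\rangle &= \left\langle\psi_2\right| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1B}} \left|\psi_2\right\rangle + \left\langle\psi_2\right| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2A}} \left|\psi_2\right\rangle \\ &\quad + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{2B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(\vec r_{2A})|^2 |\psi(\vec r_{1B})|^2 \\ &= 2E_0 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{1A}} - \frac{e_0^2}{r_{2B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(\vec r_{2A})|^2 |\psi(\vec r_{1B})|^2 \\ \langle\psi_1|\hat H|\psi_2\rangle &= \left\langle\psi_1\right| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \left|\psi_2\right\rangle + \left\langle\psi_1\right| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \left|\psi_2\right\rangle \\ &\quad + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(\vec r_{1A})\psi(\vec r_{1B})\psi(\vec r_{2A})\psi^*(\vec r_{2B}) \\ &= 2E_0 |S|^2 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(\vec r_{1A})\psi(\vec r_{1B})\psi(\vec r_{2A})\psi^*(\vec r_{2B}) \\ \langle\psi_2|\hat H|\psi_1\rangle &= \left\langle\psi_2\right| -\frac{\hbar^2}{2m}\Delta_1 - \frac{e_0^2}{r_{1A}} \left|\psi_1\right\rangle + \left\langle\psi_2\right| -\frac{\hbar^2}{2m}\Delta_2 - \frac{e_0^2}{r_{2B}} \left|\psi_1\right\rangle \\ &\quad + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi(\vec r_{1A})\psi^*(\vec r_{1B})\psi^*(\vec r_{2A})\psi(\vec r_{2B}) \\ &= 2E_0 |S|^2 + \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi(\vec r_{1A})\psi^*(\vec r_{1B})\psi^*(\vec r_{2A})\psi(\vec r_{2B}). \end{aligned} \tag{21.50}$$

Here, in each matrix element, the two hydrogen-like operators are chosen so that they act on eigenfunctions with energy $E_0$: for $\psi_2 = \psi(\vec r_{1B})\psi(\vec r_{2A})$, electron 1 is paired with nucleus B and electron 2 with nucleus A.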
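Then, renaming the integration variables by exchanging $\vec r_1$ and $\vec r_2$, we find that

$$\begin{aligned} \langle\psi_1|\hat H|\psi_1\rangle &= \langle\psi_2|\hat H|\psi_2\rangle \\ \langle\psi_1|\hat H|\psi_2\rangle &= \langle\psi_2|\hat H|\psi_1\rangle. \end{aligned} \tag{21.51}$$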
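Finally then, the energies of the symmetric (spin singlet) and antisymmetric (spin triplet) H$_2$ states are

$$\begin{aligned} E_\pm(R) &= \left\langle \frac{\psi_1 \pm \psi_2}{\sqrt{2(1 \pm |S|^2)}} \right| \hat H \left| \frac{\psi_1 \pm \psi_2}{\sqrt{2(1 \pm |S|^2)}} \right\rangle \\ &= \frac{1}{2(1 \pm |S|^2)} \left[ \langle\psi_1|\hat H|\psi_1\rangle + \langle\psi_2|\hat H|\psi_2\rangle \pm \langle\psi_1|\hat H|\psi_2\rangle \pm \langle\psi_2|\hat H|\psi_1\rangle \right] \\ &= \frac{1}{1 \pm |S|^2} \left[ \langle\psi_1|\hat H|\psi_1\rangle \pm \langle\psi_2|\hat H|\psi_1\rangle \right] \\ &= 2E_0 + \frac{C \pm A}{1 \pm |S|^2}, \end{aligned} \tag{21.52}$$

where we have defined, as in the helium case, the Coulomb and exchange integrals,

$$\begin{aligned} C &= \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) |\psi(\vec r_{1A})|^2 |\psi(\vec r_{2B})|^2 \\ A &= \int d^3r_1 \int d^3r_2 \left( -\frac{e_0^2}{r_{2A}} - \frac{e_0^2}{r_{1B}} + \frac{e_0^2}{r_{12}} + \frac{e_0^2}{R} \right) \psi^*(\vec r_{1A})\psi(\vec r_{1B})\psi(\vec r_{2A})\psi^*(\vec r_{2B}). \end{aligned} \tag{21.53}$$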
We can calculate numerically and plot the two energy functions $E_-(R)$ and $E_+(R)$. We find that $E_-(R) - 2E_0$ is positive and decreases monotonically to zero at infinity, whereas $E_+(R) - 2E_0$ decreases to a negative minimum, reached at $R = R_0$, then stays negative and tends to zero from below at infinity. We also note that the van der Waals interaction between two hydrogen atoms at a large distance, $\sim e_0^4/r^6$, is obtained when calculating $E_+(R)$ at large $R$, but this is in second-order perturbation theory (which will be introduced later on in the book).
Important Concepts to Remember

• For helium (two electrons and a nucleus), we can have an ortho- state, for triplet spin and antisymmetric spatial wave function, and a para- state, for singlet spin and symmetric spatial wave function.
• As an approximation, we calculate the states and wave functions of the electrons as if there were no interaction between the electrons, and then evaluate the interaction Hamiltonian for these states.
• We find that the extra energy depends on the probability of finding the electrons near one or the other atom, so has a direct part, and an exchange part.
• The direct energy becomes
$$C = \int d^3r_1 \int d^3r_2\, \frac{\rho_{q_1}(\vec r_1)\,\rho_{q_2}(\vec r_2)}{4\pi\epsilon_0\, r_{12}},$$
and the exchange part of the energy is
$$A = \int d^3r_1 \int d^3r_2\, \frac{e_0^2}{r_{12}}\, f(\vec r_1)\, f^*(\vec r_2),$$
where $f(\vec r) = \psi_{q_1}(\vec r)\psi_{q_2}^*(\vec r)$.
• Using the above approximation, the ground state of helium is found from $\Delta E = \frac{5}{8} Ze_0^2/a_0$, which gives a good approximation.
• A better approximation is found using a variational method, which in this (simplified) case means writing a modified wave function, in which we make the change $Z \to Z_{\rm eff}$, while keeping $Z$ in the Hamiltonian, and then minimizing over $Z_{\rm eff}$ at fixed $Z$. We find $Z_{\rm eff} = Z - 5/16$, and $E_0 = -Z_{\rm eff}^2 e_0^2/a_0$.
• For the H$_2$ molecule (two protons and two electrons), if $R \gg a_0$, where $R$ is the distance between the protons, we can prove that in the minimum energy configuration each electron is around a proton.
• One finds
$$E_\pm(R) = 2E_0 + \frac{C \pm A}{1 \pm |S|^2},$$
where $C$ is the direct contribution and $A$ is the exchange contribution, for $r_{1A}$ and $r_{2B}$, while $S = \int d^3r\, \psi^*(\vec r_A)\psi(\vec r_B)$.
Further Reading See [1] for more details.
Exercises

(1) Consider a helium atom with one electron in the $n = 1$, $l = 0$ state, and another in the $n = 2$, $l = 0$ state. Calculate the energy of the ortho-helium state (the lowest energy state for these quantum numbers), using the two approximations in the text (Approximation 1 and Approximation 2).

(2) Consider a helium atom with two electrons in the $n = 2$, $l = 0$ state. Calculate their ground state, using the two approximations in the text.

(3) Apply the variational method (Approximation 3) to the case in exercise 1.

(4) Apply the variational method (Approximation 3) to the case in exercise 2.

(5) Argue that, for the H$_2$ molecule, $E_+(R) - 2E_0$ tends to zero at infinity from below, and $E_-(R) - 2E_0$ tends to zero at infinity from above.

(6) Consider a potential H$_3$ molecule, in an equilateral triangle of side $R$. Give an argument why this molecule would be unstable.

(7) If we have four H atoms, write down the Hamiltonian, and describe two potentially stable configurations.
22 Quantum Mechanics Interacting with Classical Electromagnetism
In this chapter we will learn how to deal with quantum mechanics in the presence of a classical electromagnetic field. We started this analysis in Chapter 17, but here we treat everything methodically.
22.1 Classical Electromagnetism plus Particle
To start with, we consider classical mechanics for a particle interacting with a classical electromagnetic field. We already mentioned in Chapter 17 that in quantum mechanics the electromagnetic interaction is obtained by a "minimal coupling" that amounts to a replacement of derivatives with derivatives plus fields,
\[ \partial_i \to \partial_i - i\frac{q}{\hbar}A_i, \qquad \partial_0 \to \partial_0 - i\frac{q}{\hbar}A_0. \tag{22.1} \]
In the Lagrangian of classical mechanics, correspondingly, the replacements are
\[ \vec p = m\vec v \to m\vec v + q\vec A, \qquad p_0 = mc^2 \to mc^2 - qA_0, \tag{22.2} \]
to be substituted into
\[ L = -mc^2 + \frac{m\vec v^{\,2}}{2}, \tag{22.3} \]
thus leading to (the electric potential is $-A_0$)
\[ L = -mc^2 + qA_0 + \frac{(m\vec v + q\vec A)^2}{2m}. \tag{22.4} \]
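Then the Lagrange equations,
\[ \frac{d}{dt}\frac{\partial L}{\partial v_r} = \frac{\partial L}{\partial x_r}, \tag{22.5} \]
become (neglecting the term quadratic in the fields, $q^2\vec A^{\,2}/2m$)
\[ \frac{d}{dt}(mv_r + qA_r) = q\,\partial_r A_0 + q\,v_i\,\partial_r A_i, \tag{22.6} \]
and, since we have (using the definition of $\vec E$, $\vec B$ in terms of the vector potential $\vec A$ and the scalar potential $A_0$)
\[ \begin{aligned} \frac{d}{dt}A_r &= \frac{\partial A_r}{\partial t} + v_s\,\partial_s A_r \\ \partial_r A_s - \partial_s A_r &= \epsilon_{rsk}B_k \\ \partial_r A_0 - \partial_0 A_r &= E_r, \end{aligned} \tag{22.7} \]
we obtain
\[ m\dot v_r = q\left[\partial_r A_0 - \partial_0 A_r + (\partial_r A_s - \partial_s A_r)v_s\right] = q(E_r + \epsilon_{rsk}v_s B_k) \;\Rightarrow\; m\dot{\vec v} = q(\vec E + \vec v\times\vec B). \tag{22.8} \]
This is the Lorentz force law, so the Lagrangian we wrote down was indeed correct. From this Lagrangian, we can find the canonically conjugate momentum
\[ p_r = \frac{\partial L}{\partial v_r} = mv_r + qA_r. \tag{22.9} \]
Then the Hamiltonian is (again neglecting the term $q^2\vec A^{\,2}/2m$)
\[ H = \sum_r p_r v_r - L = \frac{1}{2m}\left(\vec p - q\vec A\right)^2 + mc^2 - qA_0 = \frac{m\vec v^{\,2}}{2} - qA_0 + mc^2. \tag{22.10} \]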
22.2 Quantum Particle plus Classical Electromagnetism
Next, we consider how to use quantum mechanics for the particle, while leaving the electromagnetism classical. This means that we replace $\vec p$ with the quantum operator $\hat{\vec p}$, while the vector and scalar potentials remain scalar functions times the identity operator, $\vec A\,\mathbb{1}$ and $A_0\,\mathbb{1}$. The Hamiltonian operator is then
\[ \hat H = \frac{1}{2m}\left(\hat{\vec p} - q\vec A\,\mathbb{1}\right)^2 - qA_0\,\mathbb{1}. \tag{22.11} \]
Further, it is still the canonically conjugate momentum $\vec p$ that is replaced by $\frac{\hbar}{i}\vec\nabla$, since we still have $[x_i, p_j] = i\hbar\,\delta_{ij}$, where $p_j$ are the canonically conjugate momenta. That means that we still have the replacement
\[ \hat{\vec p} = \frac{\hbar}{i}\vec\nabla \;\to\; \hat{\vec p} - q\vec A = \frac{\hbar}{i}\vec\nabla - q\vec A, \tag{22.12} \]
which implies the "minimal coupling" expressed in (22.1). Then the Schrödinger equation $i\hbar\,\partial_t|\psi\rangle = \hat H|\psi\rangle$ becomes, acting on the wave function,
\[ i\hbar\,\partial_t\psi(t,\vec r) = \left[\frac{1}{2m}\left(\hat{\vec p} - q\vec A\,\mathbb{1}\right)^2 - qA_0\,\mathbb{1}\right]\psi(t,\vec r). \tag{22.13} \]
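This Hamiltonian contains an interaction part, for the interaction between the particle (via its operator $\hat{\vec p}$) and the classical electromagnetic field,
\[ \hat H_{\rm int} = -\frac{q}{m}\,\hat{\vec p}\cdot\vec A. \tag{22.14} \]
As we saw in Chapter 17, in the gauge $\vec\nabla\cdot\vec A = 0$, and for a constant magnetic field $\vec B$, given that $\vec B = \vec\nabla\times\vec A$, we can choose (considering also $\partial_t\vec A = 0$, a stationary field)
\[ \vec A = \frac{1}{2}\,\vec B\times\vec r, \tag{22.15} \]
leading to an interaction Hamiltonian
\[ \hat H_{\rm int} = -\frac{q}{2m}\,\vec B\cdot\hat{\vec L} = -\vec B\cdot\hat{\vec\mu}, \tag{22.16} \]
where the magnetic moment can be written as an integral involving the current density $\vec j$, or the current differential $d\vec I = \vec j\,d^3r$, as
\[ \vec\mu = \frac{1}{2}\int d^3r\;\vec r\times\vec j = \int\frac{\vec r\times d\vec I}{2}. \tag{22.17} \]
Note that in this relation, both $\vec\mu$ and $\vec j$ are understood as expectation values, quantities that can be measured in experiments, and not as quantum operators. Indeed, $\vec j$ is constructed out of a wave function, and so is $\vec\mu$.
We next show that a change in the wave function by a phase amounts to a change of gauge for the gauge field $(A_0, A_i)$. The redefinition of the wave function by a phase,
\[ \psi = \psi'\,e^{-iq\Lambda/\hbar}, \tag{22.18} \]
implying
\[ \begin{aligned} \frac{\hbar}{i}\partial_t\psi &= \left[\frac{\hbar}{i}\partial_t\psi' - q(\partial_t\Lambda)\psi'\right]e^{-iq\Lambda/\hbar} \\ \hat p_i\psi = \frac{\hbar}{i}\partial_i\psi &= \left[\frac{\hbar}{i}\partial_i\psi' - q(\partial_i\Lambda)\psi'\right]e^{-iq\Lambda/\hbar}, \end{aligned} \tag{22.19} \]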
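means that the Schrödinger equation for $\psi$, written as
\[ \left\{\frac{\hbar}{i}\partial_t + \frac{1}{2m}\left(\frac{\hbar}{i}\partial_i - qA_i\right)^2 - qA_0\right\}\psi = 0, \tag{22.20} \]
can be rewritten in terms of $\psi'$ as
\[ \left\{\frac{\hbar}{i}\partial_t + \frac{1}{2m}\left(\frac{\hbar}{i}\partial_i - q(A_i + \partial_i\Lambda)\right)^2 - q(A_0 + \partial_t\Lambda)\right\}\psi' = 0. \tag{22.21} \]
This is the Schrödinger equation for the gauge transformed fields
\[ A_i' = A_i + \partial_i\Lambda, \qquad A_0' = A_0 + \partial_t\Lambda, \tag{22.22} \]
proving the statement made above (22.18). However, given the fact that
\[ [p_r, A_s] = \frac{\hbar}{i}[\partial_r, A_s] \neq 0, \tag{22.23} \]
we have to be careful about the order of operators. Expanding the Schrödinger equation, we obtain
\[ \frac{\hbar}{i}\partial_t\psi - \frac{\hbar^2}{2m}\Delta\psi - \frac{q\hbar}{mi}\vec A\cdot\vec\nabla\psi - \frac{q\hbar}{2mi}(\vec\nabla\cdot\vec A)\psi + \frac{q^2}{2m}\vec A^{\,2}\psi - qA_0\psi = 0, \tag{22.24} \]
with complex conjugate
\[ -\frac{\hbar}{i}\partial_t\psi^* - \frac{\hbar^2}{2m}\Delta\psi^* + \frac{q\hbar}{mi}\vec A\cdot\vec\nabla\psi^* + \frac{q\hbar}{2mi}(\vec\nabla\cdot\vec A)\psi^* + \frac{q^2}{2m}\vec A^{\,2}\psi^* - qA_0\psi^* = 0. \tag{22.25} \]
Multiplying the Schrödinger equation (22.24) by $\psi^*$ and its complex conjugate by $\psi$, and subtracting the two, we obtain
\[ \frac{\hbar}{i}\partial_t(|\psi|^2) - \frac{\hbar^2}{2m}(\psi^*\Delta\psi - \psi\Delta\psi^*) - \frac{q\hbar}{mi}\vec A\cdot(\psi^*\vec\nabla\psi + \psi\vec\nabla\psi^*) - \frac{q\hbar}{mi}(\vec\nabla\cdot\vec A)|\psi|^2 = 0, \tag{22.26} \]
which can be rewritten as
\[ \partial_t(|\psi|^2) + \vec\nabla\cdot\left[\frac{\hbar}{2mi}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*) - \frac{q}{m}\vec A\,|\psi|^2\right] = 0. \tag{22.27} \]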
This has the form of a continuity equation for the probability density, adding the interaction with the electromagnetic field to the previously derived equation from Chapter 7. Indeed, defining the probability density
\[ \rho(\vec r, t) = |\psi(\vec r, t)|^2, \tag{22.28} \]
we see that we can define the probability current density in the presence of the vector potential $\vec A$ as
\[ \vec j = \frac{\hbar}{2mi}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*) - \frac{q}{m}\vec A\,|\psi|^2 \equiv \vec j_0 + \vec j_1, \tag{22.29} \]
so that we have the standard continuity equation
\[ \partial_t\rho + \vec\nabla\cdot\vec j = 0. \tag{22.30} \]
We also define the current $\vec j_0$ as the current in the absence of $\vec A$,
\[ \vec j_0 = \frac{\hbar}{2mi}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*), \tag{22.31} \]
as well as the new contribution $\vec j_1$, depending on $\vec A$,
\[ \vec j_1 = -\frac{q}{m}\vec A\,|\psi|^2 = -\frac{q\rho}{m}\vec A. \tag{22.32} \]
Note that the probability density $\rho$ and probability current density $\vec j$ are related to the charge density and current density by multiplying with the individual charge of the particle,
\[ \rho_{\rm charge} = q\rho, \qquad \vec j_{\rm charge} = q\vec j. \tag{22.33} \]
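We can then split the (expectation value for the) magnetic moment according to the contribution of each current density individually. First, for the current density in the absence of $\vec A$, the magnetic moment, denoted $\vec\mu_0$, is
\[ \begin{aligned} \vec\mu_0 &= \frac{1}{2}\int d^3r\;\vec r\times\vec j_{0,\rm charge} \\ &= \frac{q}{4m}\int d^3r\left[\psi^*\,\vec r\times\frac{\hbar}{i}\vec\nabla\psi + \left(\vec r\times\frac{\hbar}{i}\vec\nabla\psi\right)^*\psi\right] \\ &= \frac{q}{4m}\left[\langle\psi|(\hat{\vec L}|\psi\rangle) + (\langle\psi|\hat{\vec L})|\psi\rangle\right] = \frac{q}{2m}\langle\psi|\hat{\vec L}|\psi\rangle, \end{aligned} \tag{22.34} \]
where in the second equality we have used the fact that $\hat{\vec L} = \hat{\vec r}\times\hat{\vec p} = \hat{\vec r}\times\frac{\hbar}{i}\vec\nabla$, and in the last equality we have used the fact that $\hat{\vec L}$ is a self-adjoint operator, like all observables. Finally, then, since we have a relation valid for any state $|\psi\rangle$, we obtain the same relation between an abstract operator (an operator that is not in a particular representation) describing the magnetic moment $\hat{\vec\mu}$ and the angular momentum operator $\hat{\vec L}$ as at the classical level,
\[ \hat{\vec\mu} = \frac{q}{2m}\hat{\vec L}. \tag{22.35} \]
However, the same is not true for the new term in $\vec\mu$, called $\vec\mu_1$, that comes from the new term in the current density, $\vec j_1$. For it, we obtain
\[ \vec\mu_1 = \frac{1}{2}\int d^3r\;\vec r\times\vec j_{1,\rm charge} = -\frac{q^2}{2m}\int d^3r\;\psi^*(\vec r\times\vec A)\psi = -\frac{q^2}{2m}\langle\psi|(\hat{\vec r}\times\vec A)|\psi\rangle. \tag{22.36} \]
Again, though, the relation is valid for any state $|\psi\rangle$, so we can write a relation valid for abstract operators,
\[ \hat{\vec\mu}_1 = -\frac{q^2}{2m}(\hat{\vec r}\times\vec A). \tag{22.37} \]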
22.3 Application to Superconductors
As an application of the previous formalism, we will consider the quantum theory of superconductors. In a superconductor there are superconducting electrons (which means electrons that don't experience collisions with the nuclei that can reduce their speed) described by a wave function $\psi(\vec r)$ and with a local density equal to the probability density, $n_s = |\psi(\vec r)|^2$. The only difference with respect to the previous analysis is that instead of the mass $m$ of the electrons, we have an "effective mass" $m^*$, an effective description of the mutual interaction of the electrons. Indeed, in a "Fermi liquid", describing electrons in most of these materials, the effect of the interactions between the electrons is just to replace the mass $m$ of the free electrons (described as a "Fermi gas") by the renormalized mass $m^*$, which can be 10%–50% larger. Actually, in heavy fermions it can even be the case that $m^* \gg m$. Therefore, the electric current density of the superconducting electrons, $\vec j_s = q\vec j$, is
\[ \vec j_s(\vec r) = -i\frac{q\hbar}{2m^*}(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*) - \frac{q^2}{m^*}\psi^*\psi\,\vec A. \tag{22.38} \]
Deep inside the superconductor, the wave function of the superconducting electrons approaches a constant absolute value (so that the probability density is constant) that minimizes the thermodynamic free energy. Calling it $\psi_0$ ($\in\mathbb{R}$), we have that
\[ \psi \simeq \psi_0\,e^{i\alpha}. \tag{22.39} \]
Then the electric current density of the superconducting electrons is
\[ \vec j_s \simeq -\frac{q^2}{m^*}\psi_0^2\,\vec A. \tag{22.40} \]
Further, the superconductor is described in the London–London model (named after H. London and F. London) as a sum of two components: a normal electron component $\vec j_n$, which experiences collisions with nuclei, so has a finite conductivity $\sigma$ and current density $\vec j_n = \sigma\vec E$, and a superconducting component $\vec j_s$ as above. The normal electrons have density $n_n$ and velocity $\vec v_n$, and the superconducting ones have density $n_s = |\psi|^2 \simeq \psi_0^2$ and velocity $\vec v_s$, so we have a total density $n = n_n + n_s$, and
\[ \vec j_n = -e\,n_n\vec v_n, \qquad \vec j_s = -e\,n_s\vec v_s, \qquad \vec j = \vec j_n + \vec j_s. \tag{22.41} \]
The superconducting electrons have mass $m^*$, and on them only the electric force $-e\vec E$ acts, without any collisions to slow them down, so it increases their velocity by
\[ -e\vec E = m^*\frac{d\vec v_s}{dt}. \tag{22.42} \]
But that in turn means that the superconducting electric current density changes as
\[ \frac{d\vec j_s}{dt} = \frac{n_s e^2}{m^*}\vec E = -\frac{n_s e^2}{m^*}\frac{d\vec A}{dt}, \tag{22.43} \]
which is the first London equation. It really should be postulated, not derived, as are the usual Maxwell's equations, since it is based on observations but doesn't quite follow from what we already know about (normal) electrons.
The second London equation is also postulated, though it is based on the observation we made that, in the superconductor, $\vec j_s \simeq -(q^2/m^*)\psi_0^2\,\vec A$, where $n_s = |\psi|^2 \simeq \psi_0^2$, so by taking the cross product of $\vec\nabla$ with it we get
\[ \vec\nabla\times\vec j_s = -\frac{n_s e^2}{m^*}\vec\nabla\times\vec A = -\frac{n_s e^2}{m^*}\vec B. \tag{22.44} \]
However, once it is postulated, we have to further postulate that $\vec j_s \propto \vec A$ in a simply connected superconductor (we will see in the next chapter why this is needed), in order to obtain
\[ \vec j_s = -\frac{n_s e^2}{m^*}\vec A \equiv -\frac{1}{\mu_0\lambda_L^2}\vec A, \tag{22.45} \]
where the last equality defines $\lambda_L$. On the other hand, from the Maxwell equation $\vec\nabla\times\vec B = \mu_0\vec j$ and using $\vec\nabla^2\vec C = \vec\nabla(\vec\nabla\cdot\vec C) - \vec\nabla\times(\vec\nabla\times\vec C)$ (true for any vector $\vec C$) and that $\vec\nabla\cdot\vec B = 0$, we obtain
\[ \vec\nabla^2\vec B = -\vec\nabla\times(\vec\nabla\times\vec B) = -\mu_0\vec\nabla\times\vec j_s = \mu_0\frac{n_s e^2}{m^*}\vec B = \frac{1}{\lambda_L^2}\vec B. \tag{22.46} \]
The solution of this equation is an exponentially decaying magnetic field inside the superconductor, when going inside from the boundary,
\[ B(r) = B(0)\,e^{-r/\lambda_L}. \tag{22.47} \]
This is the mathematical description of the observed phenomenon that superconductors expel magnetic field, known as the Meissner effect: here we see that the magnetic field only penetrates a distance of order $\lambda_L$ inside the material.
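To get a feeling for the scale of the penetration depth, the definition $1/(\mu_0\lambda_L^2) = n_s e^2/m^*$ from (22.45) can be evaluated numerically. The following is only a minimal sketch: the density $n_s = 10^{28}\,{\rm m}^{-3}$ and the choice $m^* = m_e$ are illustrative assumptions, not values taken from the text, and the function names are hypothetical.

```python
import math

# SI constants (rounded CODATA values)
MU_0 = 4 * math.pi * 1e-7     # vacuum permeability in H/m (approximate)
E_CHARGE = 1.602176634e-19    # elementary charge in C
M_E = 9.1093837015e-31        # electron mass in kg

def london_depth(n_s, m_eff=M_E):
    """Penetration depth from 1/(mu_0 lambda_L^2) = n_s e^2 / m*, Eq. (22.45)."""
    return math.sqrt(m_eff / (MU_0 * n_s * E_CHARGE**2))

def field_inside(b_surface, depth, lam):
    """Exponential decay B(r) = B(0) exp(-r / lambda_L), Eq. (22.47)."""
    return b_surface * math.exp(-depth / lam)

# Assumed, purely illustrative density of superconducting electrons
lam = london_depth(1e28)
print(f"lambda_L = {lam * 1e9:.1f} nm")
print(f"B(5 lambda_L)/B(0) = {field_inside(1.0, 5 * lam, lam):.4f}")
```

With these assumed inputs the depth comes out at a few tens of nanometres, and the field is already below one percent of its surface value at a depth of $5\lambda_L$.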
22.4 Interaction with a Plane Wave
As a first step towards quantizing the electromagnetic field, we consider a simple space dependence for $\vec A$, and promote $\vec x$ to be a quantum operator. This is a simple generalization of the previously considered case of constant $\vec B$, with $\vec A = (\vec B\times\vec r)/2$, where we also promoted $\vec r$ to be an operator. It is also an obvious thing to do since, in the coordinate representation for wave functions, $\vec x$ acts trivially (since it is replaced by its eigenvalue).
Consider a vector potential that is a plane wave, with sinusoidal form,
\[ \vec A = 2A^{(0)}\vec\epsilon\,\cos\left(\frac{\omega}{c}\vec n\cdot\vec x - \omega t\right), \tag{22.48} \]
where $\vec\epsilon$ is a polarization vector, transverse to the propagation direction $\vec n$, so that $\vec\epsilon\cdot\vec n = 0$. We can rewrite (22.48) as the sum of two exponentials,
\[ \vec A = A^{(0)}\vec\epsilon\left[\exp\left(i\frac{\omega}{c}\vec n\cdot\vec x - i\omega t\right) + \exp\left(-i\frac{\omega}{c}\vec n\cdot\vec x + i\omega t\right)\right], \tag{22.49} \]
where, as we will see in the second part of the book, the first term is associated with absorption of light and the second with stimulated emission. This is substituted into the interaction Hamiltonian,
\[ \hat H_{\rm int} = -\frac{q}{m}\hat{\vec P}\cdot\vec A, \tag{22.50} \]
in order to calculate its matrix element $\langle n|\hat H_{\rm int}|m\rangle$ between eigenstates $|n\rangle$ of the free Hamiltonian, for which $\hat H_0|n\rangle = E_n|n\rangle$. Defining the frequency of transition between two states,
\[ \omega_{nm} = \frac{E_n - E_m}{\hbar}, \tag{22.51} \]
and using the fact that for the $x$ component of the kinetic momentum we have
\[ \hat P_x = \frac{m}{i\hbar}[\hat X, \hat H_0], \tag{22.52} \]
where $\hat X$ is the position operator, we find that
\[ \langle n|\hat P_x|m\rangle = \frac{m}{i\hbar}\langle n|(\hat X\hat H_0 - \hat H_0\hat X)|m\rangle = i m\,\omega_{nm}\langle n|\hat X|m\rangle. \tag{22.53} \]
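The relation (22.53) can be checked numerically in a concrete case. The sketch below assumes, purely for illustration, that $\hat H_0$ is a one-dimensional harmonic oscillator (a choice not made in the text), works in units $\hbar = m = \omega = 1$, and truncates the number basis to a small size; all variable names are hypothetical.

```python
import numpy as np

hbar, m, omega = 1.0, 1.0, 1.0   # natural units, chosen for illustration
N = 8                            # truncated basis size (assumption)

# Lowering operator in the number basis: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
X = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)
P = 1j * np.sqrt(m * hbar * omega / 2) * (a.T - a)

# Eigenvalues of H0 and the transition frequencies omega_nm = (E_n - E_m)/hbar
E = hbar * omega * (np.arange(N) + 0.5)
omega_nm = (E[:, None] - E[None, :]) / hbar

# Element-wise comparison of <n|P|m> with i m omega_nm <n|X|m>
rhs = 1j * m * omega_nm * X
print(np.allclose(P, rhs))
```

Because $\hat H_0$ is diagonal in this basis, the comparison is element by element, so the truncation of the basis does not spoil the check.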
Moreover, as a first approximation, we can replace the exponentials in the expansion of $\vec A$ (the plane wave) with their leading term, 1. That amounts, of course, to having just a constant $\vec A$; however, it must be thought of as the leading approximation to a plane wave, and gives
\[ \langle n|\hat H_{\rm int}|m\rangle = -qA^{(0)}\,i\omega_{nm}\langle n|\hat X|m\rangle. \tag{22.54} \]
22.5 Spin–Magnetic-Field and Spin–Orbit Interaction
Finally, we consider an interaction term in the Hamiltonian that really comes from the relativistic theory (quantum field theory), but, like the electron spin itself, has a simple nonrelativistic description, as an interaction between spin and orbital angular momentum.
First, we consider the interaction of the electron spin with the magnetic field. We remember that, for an electron, the spin generates a magnetic moment with $g = 2$. Considering also that $q = -e$, we have
\[ \vec\mu_S = -\frac{e}{m_e}\vec S. \tag{22.55} \]
But the spin (intrinsic angular momentum) is given by $\vec S = \hbar\vec\sigma/2$, where $\sigma_i$ are the Pauli matrices, acting on the two-dimensional spin vector space. As we showed earlier, the Pauli matrices satisfy
\[ \sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k. \tag{22.56} \]
Multiplying this with $\hat P_i\hat P_j$, where $\hat P_i$ is the kinetic momentum, we obtain
\[ (\vec\sigma\cdot\hat{\vec P})(\vec\sigma\cdot\hat{\vec P}) = \hat{\vec P}\cdot\hat{\vec P} + i\vec\sigma\cdot(\hat{\vec P}\times\hat{\vec P}). \tag{22.57} \]
Now using the fact that $\hat P_i$ is related to the conjugate momentum $\hat p_i$ by
\[ \hat{\vec P} = \hat{\vec p} - q\vec A = \hat{\vec p} + e\vec A, \tag{22.58} \]
we obtain (since $\hat{\vec p} = \frac{\hbar}{i}\vec\nabla$)
\[ (\vec\sigma\cdot\hat{\vec P})^2 = \left(\hat{\vec p} + e\vec A\right)^2 + e\hbar\,\vec\sigma\cdot\vec B. \tag{22.59} \]
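The Pauli-matrix identity (22.56), contracted with two vectors, reads $(\vec\sigma\cdot\vec a)(\vec\sigma\cdot\vec b) = \vec a\cdot\vec b + i\vec\sigma\cdot(\vec a\times\vec b)$. As a quick sanity check, this can be verified numerically for ordinary (commuting) vectors; the sketch below uses randomly drawn $\vec a$, $\vec b$ and a hypothetical helper `sigma_dot`. For the operator-valued $\hat{\vec P}$ of the text the cross product $\hat{\vec P}\times\hat{\vec P}$ does not vanish, which is exactly what produces the $\vec\sigma\cdot\vec B$ term in (22.59); the numerical check only covers the algebraic identity itself.

```python
import numpy as np

# The three Pauli matrices stacked as an array of shape (3, 2, 2)
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-component vector v."""
    return np.einsum("i,ijk->jk", v, sigma)

rng = np.random.default_rng(0)
a = rng.normal(size=3)
b = rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
print(np.allclose(lhs, rhs))
```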
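On the other hand, the Hamiltonian in the absence of $\vec A$ can be rewritten using (22.56), multiplied by $\hat p_i\hat p_j$, as
\[ \hat H = \frac{\hat{\vec p}\cdot\hat{\vec p}}{2m} - eA_0 = \frac{1}{2m}\left[\hat{\vec p}\cdot\hat{\vec p} + i\vec\sigma\cdot(\hat{\vec p}\times\hat{\vec p})\right] - eA_0 = \frac{1}{2m}(\vec\sigma\cdot\hat{\vec p})^2 - eA_0, \tag{22.60} \]
and in the last form we can again replace $\hat{\vec p}$ by $\hat{\vec P}$, obtaining that the Hamiltonian is that previously derived for interaction with a classical electromagnetic field, plus a term giving the interaction of the spin with the magnetic field,
\[ \hat H = \frac{1}{2m}(\vec\sigma\cdot\hat{\vec P})^2 - eA_0 = \frac{1}{2m}\left[\left(\hat{\vec p} + e\vec A\right)^2 + e\hbar\,\vec\sigma\cdot\vec B\right] - eA_0. \tag{22.61} \]
To obtain the interaction between spin and orbital angular momentum (the spin–orbit interaction), we have to consider the relativistic Dirac equation and expand it in $1/c^2$, obtaining the interaction term
\[ H_{\rm int,LS} = -\frac{e\hbar}{4m_0^2c^2}\,\vec\sigma\cdot(\vec\nabla A_0\times\hat{\vec p}). \tag{22.62} \]
We will not derive this here, though we will do so towards the end of the book, when considering the Dirac equation. For spherically symmetric systems, where $\vec\nabla A_0 = \frac{1}{r}\frac{dA_0}{dr}\vec r$, we find
\[ H_{\rm int,LS} = -\frac{e\hbar}{4m_0^2c^2}\frac{1}{r}\frac{dA_0}{dr}\,\vec\sigma\cdot(\vec r\times\hat{\vec p}) = -\frac{e}{2m_0^2c^2}\frac{1}{r}\frac{dA_0}{dr}\,\vec S\cdot\vec L. \tag{22.63} \]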
Important Concepts to Remember
• Minimal coupling of a particle to an electromagnetic field amounts to the replacement of the momentum $\vec p$ (usually both the kinetic momentum, and that conjugate to $\vec x$) by $\vec p - q\vec A \equiv \vec P$, where $\vec p$ is now the canonically conjugate momentum, and $\vec P$ is the kinetic momentum.
• At the quantum level for the particle (while $\vec A$ may be kept classical), we replace the canonically conjugate momentum $\vec p$ with $\frac{\hbar}{i}\vec\nabla$, so $\vec P = \vec p - q\vec A$ is replaced with $\frac{\hbar}{i}\vec\nabla - q\vec A\,\mathbb{1}$.
• The resulting interaction Hamiltonian, in the gauge $\vec\nabla\cdot\vec A = 0$, for constant $\vec B$ and stationary $\vec A$, is $-\vec\mu\cdot\vec B$, where the magnetic moment is $\vec\mu = \int\vec r\times d\vec I/2$.
• A gauge transformation of $A_\mu = (A_0, A_i)$ by $\Lambda$ corresponds to multiplying the wave function by the phase $e^{iq\Lambda/\hbar}$.
• The gauge transformation of $\psi$ does not change the probability density $\rho$, but does change the current of probability $\vec j$ by the addition of $\vec j_1 = -\frac{q\rho}{m}\vec A$, and the magnetic moment operator by the addition of $\hat{\vec\mu}_1 = -\frac{q^2}{2m}(\hat{\vec r}\times\vec A)$, while $\vec\mu_0 = \frac{q}{2m}\langle\psi|\hat{\vec L}|\psi\rangle$.
• In a superconductor, we have electrons of effective mass $m^*$ and of two types (the London–London model): normal, with $n_n$ and $\vec j_n = -e\,n_n\vec v_n = \sigma\vec E$, and superconducting, with $n_s$ and $\vec j_s = -e\,n_s\vec v_s$, satisfying
\[ \frac{d\vec j_s}{dt} = -\frac{n_s e^2}{m^*}\frac{d\vec A}{dt} \quad\text{and}\quad \vec\nabla\times\vec j_s = -\frac{n_s e^2}{m^*}\vec\nabla\times\vec A. \]
• A superconductor expels magnetic field, meaning it penetrates the superconductor only a distance $\lambda_L = \sqrt{m^*/(\mu_0 n_s e^2)}$, with $B(x) = B_0\,e^{-x/\lambda_L}$. This is the Meissner effect.
• For a sinusoidal electromagnetic wave, with the position variable replaced by the $\hat X$ operator, we find that the matrix elements of the interaction Hamiltonian are related to the transition frequency $\omega_{mn} = (E_m - E_n)/\hbar$ and the matrix elements of $\hat X$ as $H_{{\rm int},mn} = -qA^{(0)}\,i\omega_{mn}X_{mn}$.
• Minimal coupling in the presence of spin leads to a spin–magnetic field interaction, and from the Dirac equation we also find a spin–orbit interaction $-\frac{e}{2m_0^2c^2}\frac{1}{r}\frac{dA_0}{dr}\,\vec S\cdot\vec L$.
Further Reading See [2] and [3] for more details.
Exercises
(1) The relativistic Lagrangian for a particle is $L = -mc^2\sqrt{1 - v^2/c^2}$. Does minimal coupling at the classical level based on it work? Justify.
(2) Write down formally the quantum level version for exercise 1, without bothering about what $\psi$ means now.
(3) Consider a conductor forming a closed loop $C$ in a magnetic field, with nonzero flux $\Phi = \int_S\vec B\cdot d\vec S$ through it. Show that there is a nonzero probability current loop $\oint_C d\vec l\cdot\vec I$.
(4) In the London–London theory, show that by writing down the energy of the magnetic field and the kinetic term for the superconducting electrons, we obtain the free energy
\[ f = \frac{1}{2\mu_0}\int_0^\infty\left(\vec B^{\,2} + \lambda_L^2\mu_0^2\,\vec j^{\,2}\right)2\pi r\,dr, \tag{22.64} \]
and from it we obtain the London equation $\lambda_L^2\vec\nabla^2\vec B = \vec B$.
(5) Explain why a small permanent magnet brought down along the (vertical) axis of a superconducting ring levitates (i.e., it doesn't fall under gravity).
(6) Consider a plane electromagnetic wave incident in a perpendicular direction onto a planar material containing electrons bound in it. Does the wave induce (leading-order) transitions in the electronic states?
(7) Calculate the spin–orbit interaction for a hydrogen atom in the ground state.
23 Aharonov–Bohm Effect and Berry Phase in Quantum Mechanics
In this chapter, we consider relevant geometric phases that appear in the wave function: first, in the coupling to electromagnetic fields, the Aharonov–Bohm phase, and then, in general, the Berry phase.
23.1 Gauge Transformation in Electromagnetism
In the previous chapter, we saw that the canonically conjugate momentum $\hat{\vec p}$ is represented by $\frac{\hbar}{i}\vec\nabla$ as usual, but when the particle couples to electromagnetic fields, the kinetic momentum is represented by
\[ \hat{\vec p}_{\rm kin} = \hat{\vec p} - q\vec A\,\mathbb{1}, \tag{23.1} \]
and is the operator appearing in the Hamiltonian,
\[ \hat H = \frac{\hat{\vec p}^{\,2}_{\rm kin}}{2m} - qA_0\,\mathbb{1} + \cdots = \frac{\left(\hat{\vec p} - q\vec A\,\mathbb{1}\right)^2}{2m} - qA_0\,\mathbb{1} + \cdots \tag{23.2} \]
But this means that the canonically conjugate momentum acts on wave functions as (a constant times) a translation operator,
\[ \hat{\vec p}\,\psi = \frac{\hbar}{i}\frac{\partial}{\partial\vec x}\psi, \tag{23.3} \]
which in turn means that in the wave functions there is a phase factor
\[ \exp\left[\frac{i}{\hbar}\int_P q\vec A\cdot d\vec x\right], \tag{23.4} \]
where $P$ is a path in coordinate space. Indeed, in terms of the kinetic momentum, the translational phase (going along $P$ from $\vec x$ to $\vec x'$ in order to find the wave function at $\vec x'$ from that defined at $\vec x$) is
\[ \exp\left[\frac{i}{\hbar}\int_{P:\vec x}^{\vec x'}\vec p\cdot d\vec x\right] = \exp\left[\frac{i}{\hbar}\int_{P:\vec x}^{\vec x'}\left(\vec p_{\rm kin} + q\vec A\right)\cdot d\vec x\right] \equiv \exp\left[\frac{i}{\hbar}\int_{P:\vec x}^{\vec x'}\vec p_{\rm kin}\cdot d\vec x\right]e^{i\delta}, \tag{23.5} \]
where the kinetic momentum is taken to be gauge invariant (since, classically, it corresponds to $m\vec v$), which means that we have a gauge-dependent phase
\[ \delta = \frac{q}{\hbar}\int_{P:\vec x}^{\vec x'}\vec A\cdot d\vec x. \]
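More precisely, we saw in the previous chapter that the gauge transformation acts on the solution of the Schrödinger equation as follows:
\[ A_i \to A_i' = A_i + \partial_i\Lambda, \qquad A_0 \to A_0' = A_0 + \partial_0\Lambda \;\Rightarrow\; \psi = \psi'\,e^{-iq\Lambda/\hbar}, \tag{23.6} \]
so that
\[ \psi' = e^{+iq\Lambda/\hbar}\psi, \tag{23.7} \]
which is consistent with the phase factor $e^{i\delta}$ in the wave function $\psi$. Another way to obtain the same result is to require on physical grounds that, under a gauge transformation on the wave function given by $|\psi\rangle \to |\psi'\rangle = \hat U|\psi\rangle$, we have:
• The norm of the state is invariant,
\[ \langle\psi'|\psi'\rangle = \langle\psi|\psi\rangle \;\Rightarrow\; \hat U^\dagger = \hat U^{-1}, \tag{23.8} \]
so $\hat U$ is unitary.
• The expectation value of $\hat X$ is invariant,
\[ \langle\psi'|\hat X|\psi'\rangle = \langle\psi|\hat X|\psi\rangle \;\Rightarrow\; \hat U^\dagger\hat X\hat U = \hat X. \tag{23.9} \]
• The kinetic momentum is gauge invariant,
\[ \langle\psi'|\left[\hat{\vec p} - q(\vec A + \vec\nabla\Lambda)\mathbb{1}\right]|\psi'\rangle = \langle\psi|\left(\hat{\vec p} - q\vec A\,\mathbb{1}\right)|\psi\rangle \;\Rightarrow\; \hat U^\dagger\left[\hat{\vec p} - q(\vec A + \vec\nabla\Lambda)\right]\hat U = \hat{\vec p} - q\vec A. \tag{23.10} \]
We then see that the only solution for the unitary operator $\hat U$ is
\[ \hat U = e^{iq\Lambda/\hbar}, \tag{23.11} \]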
which is the same as that found before. Yet another way to obtain the same result is to consider the path integral. In the propagator, we have the path integral expression
\[ U(t', t) = \int\mathcal{D}q(t)\,\exp\left(\frac{iS}{\hbar}\right) = \int\mathcal{D}q(t)\,\exp\left[\frac{i}{\hbar}\int_t^{t'}L\,dt\right]. \tag{23.12} \]
But in the previous chapter, we saw that the classical Lagrangian, appearing in the path integral exponent, is
\[ L = \frac{1}{2}m\vec v^{\,2} + q\vec A\cdot\vec v + qA_0 + \cdots \tag{23.13} \]
Thus, there is a phase factor
\[ e^{i\delta} = \exp\left[\frac{iq}{\hbar}\int_t^{t'}\vec A\cdot\frac{d\vec r}{dt}\,dt\right] = \exp\left[\frac{iq}{\hbar}\int_P\vec A\cdot d\vec r\right], \tag{23.14} \]
the same as before.
23.2 The Aharonov–Bohm Phase δ
Now, in general a phase factor in the state $|\psi\rangle$ or wave function is irrelevant, since we can always redefine it by a phase factor, while keeping the probability, $|\psi|^2 = \langle\psi|\psi\rangle$, invariant. However, we cannot redefine the phase difference between the phases of two different particles. This is the same as saying that the phase on a closed path (formed by two different paths in between the same initial and final points, as in Fig. 23.1a) cannot be redefined away.
Consider a double slit experiment, but with an electromagnetic field in between the two paths of the particles, as in Fig. 23.1b. That is, there is a single source of particles (they have to be charged particles, not photons, in order to couple with electromagnetism), and two slits in a screen, and then we measure the interference at a single point $P$ behind the screen. The two generic (quantum) particle paths, going through the two slits, are called $P_1$ and $P_2$. We can measure the phase difference through the interference between the waves traveling along paths $P_1$ and $P_2$. The corresponding time dependent states are
\[ \begin{aligned} |\psi_1(t)\rangle &= \exp\left[\frac{iq}{\hbar}\int_{P_1}\vec A\cdot d\vec r\right]|\psi_1\rangle \\ |\psi_2(t)\rangle &= \exp\left[\frac{iq}{\hbar}\int_{P_2}\vec A\cdot d\vec r\right]|\psi_2\rangle \equiv e^{i\delta}\exp\left[\frac{iq}{\hbar}\int_{P_1}\vec A\cdot d\vec r\right]|\psi_2\rangle, \end{aligned} \tag{23.15} \]
where $|\psi_2\rangle \sim |\psi_1\rangle$ times perhaps an extra phase factor (independent of the electromagnetic field).
Figure 23.1 (a) For two different paths in between the same initial and final points, there is a closed path $C = P_1 - P_2$, with a magnetic field $\vec B$ inside it. (b) The same for a double slit experiment; the region with $\vec B$ (into the page) is in between the slits. (c) The surface $S$ bounded by the closed path $C$, with the region with $\vec B$ inside it. There is a nonzero field $\vec A$ around $C$, even though $\vec B$ is 0. (d) A physical construction: a conducting cylinder, with a field $\vec B$ only in the void inside the cylinder; the path $C$ is inside the conductor.
Then we can measure the phase difference
\[ e^{i\delta} = \exp\left[\frac{iq}{\hbar}\int_{P_1 - P_2}\vec A\cdot d\vec r\right] = \exp\left[\frac{iq}{\hbar}\oint_C\vec A\cdot d\vec r\right], \tag{23.16} \]
where $C = P_1 - P_2$ is a closed loop. Then, if $C = \partial S$ is the boundary of a surface $S$, by Stokes' theorem we find
\[ e^{i\delta} = \exp\left[\frac{iq}{\hbar}\int_S\vec B\cdot d\vec S\right] = \exp\left[\frac{iq}{\hbar}\Phi_m\right], \tag{23.17} \]
where $\Phi_m$ is the magnetic flux going through the surface $S$.
The implication seems to be that we are observing $\vec A$, and not $\vec B$ itself. In favor of this interpretation, consider the case where there is no field $\vec B$ on the paths $P_1$ and $P_2$, therefore none on $C = P_1 - P_2$, see Fig. 23.1c, yet we observe nontrivial interference between the two paths, due to the phase $\delta$. However, there is certainly a field $\vec A$ on it, so $\oint_C\vec A\cdot d\vec r \neq 0$. As a more precise set-up for this case, consider a cylindrical sheet of metal (a larger cylinder with a smaller cylinder in the middle), and electrons moving in the material, around the center. Consider also that there is a magnetic field parallel to the axis of the cylinder, but only in the void in the middle; see Fig. 23.1d. So there is a magnetic flux going through the surface bounded by any closed contour $C$ in the section of the material (around the axis), and then there is also a field $\vec A$ on the path $C$ itself. But there is no $\vec B$ on $C$ itself, so it seems that we can measure $\vec A$, not $\vec B$.
However, this doesn't mean that we are breaking gauge invariance, since, as we saw above, $\delta$ depends only on the gauge invariant $\vec B$, through $\Phi_m$, so it is a gauge invariant phase. This leads to a sort of philosophical debate: is this nonlocality, whereby $\delta$ depends on the full closed contour $C$, or the fact that $\delta$ depends on the value of $\vec A$ on $C$, more important for the interpretation? In the path integral formalism, it appears that the nonlocality is more important.
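Indeed, consider the path integral for the wave function in the double slit experiment (or, equivalently, in the cylinder set-up, with electron paths going either clockwise or anticlockwise around the circle, between two points on it), and write it as a sum over paths $P_1$ going through one slit, and paths $P_2$ going through the other,
\[ \begin{aligned} \psi(\vec r, t') &= \int\mathcal{D}q(t)\,\exp\left[\frac{i}{\hbar}\int_0^{t'}L\,dt\right]\psi(t = 0) = \sum_{{\rm paths}\ P}\exp\left[\frac{iq}{\hbar}\int_P\vec A\cdot d\vec r + \frac{i}{\hbar}S(\vec A = 0)\right]\psi(t = 0) \\ &= \int_{{\rm above}=\{P_1\}}\mathcal{D}q(t)\,e^{iS_0(P_1)/\hbar}\exp\left[\frac{iq}{\hbar}\int_{P_1}\vec A\cdot d\vec r\right]\psi_1(\vec r) \\ &\quad + \int_{{\rm below}=\{P_2\}}\mathcal{D}q(t)\,e^{iS_0(P_2)/\hbar}\exp\left[\frac{iq}{\hbar}\int_{P_2}\vec A\cdot d\vec r\right]\psi_2(\vec r) \\ &= \exp\left[\frac{iq}{\hbar}\int_{P_1}\vec A\cdot d\vec r\right]\left[\psi_1(\vec r) + e^{i\delta}\psi_2(\vec r)\right]. \end{aligned} \tag{23.18} \]
Again, we obtain interference due to the phase factor $e^{i\delta}$, but now we see explicitly that the nonlocality associated with the contour $C$ is the key issue.
Finally, note that if the phase $\delta$ is a multiple of $2\pi$ then the phase factor $e^{i\delta} = 1$ is unobservable. This happens for a magnetic flux through the surface $S$ bounded by $C$ that is quantized,
\[ \Phi[C] = n\Phi_0, \qquad \Phi_0 = \frac{2\pi\hbar}{q} = \frac{h}{q}, \tag{23.19} \]
where $\Phi_0$ is called a flux quantum.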
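The statement that the Aharonov–Bohm phase factor is trivial for quantized flux can be illustrated with a few lines of arithmetic. The sketch below is only an illustration, assuming a particle of charge $q = e$ and SI values for $h$ and $e$; the function names are hypothetical. It evaluates $\Phi_0 = h/q$ and the phase factor $\exp(iq\Phi_m/\hbar)$ of (23.17) for an integer and a half-integer number of flux quanta.

```python
import cmath
import math

H_PLANCK = 6.62607015e-34     # Planck constant in J s
E_CHARGE = 1.602176634e-19    # elementary charge in C

def flux_quantum(q=E_CHARGE):
    """Flux quantum Phi_0 = h / q, Eq. (23.19)."""
    return H_PLANCK / q

def ab_phase_factor(flux, q=E_CHARGE):
    """Aharonov-Bohm phase factor exp(i q Phi_m / hbar), Eq. (23.17)."""
    hbar = H_PLANCK / (2 * math.pi)
    return cmath.exp(1j * q * flux / hbar)

phi0 = flux_quantum()
print(f"Phi_0 = {phi0:.4e} Wb")
print("integer flux:     ", ab_phase_factor(3 * phi0))    # close to +1
print("half-integer flux:", ab_phase_factor(0.5 * phi0))  # close to -1
```

For a half-integer number of flux quanta the two paths acquire a relative sign, so the interference pattern is shifted by half a fringe, while an integer number leaves it unchanged.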
23.3 Berry Phase
The Aharonov–Bohm phase factor $e^{i\delta}$ can be generalized to a much more important case, the observable geometric phase, valid for any quantum mechanical system undergoing adiabatic changes. This was derived in a seminal paper by Michael Berry in 1983.
Indeed, suppose that the Hamiltonian depends on a (set of) parameter(s) $K$ that change slowly in time, $K = K(t)$. However, suppose that the change is slow enough (adiabatic) that the system, initially in an eigenstate of the initial Hamiltonian, continues to be in an eigenstate $|n(K(t))\rangle$ of the Hamiltonian $\hat H(t)$ at any later time $t$, so that
\[ \hat H(K(t))|n(K(t))\rangle = E_n(K(t))|n(K(t))\rangle. \tag{23.20} \]
But the Schrödinger equation is (considering the state $|n(K_0), t_0\rangle$ at time $t_0$)
\[ \hat H(K(t))|n(K_0), t_0; t\rangle = i\hbar\frac{d}{dt}|n(K_0), t_0; t\rangle, \tag{23.21} \]
which means that its time-dependent solution is of the form
\[ |\psi(t)\rangle = \exp\left[-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right]e^{i\gamma_n(t)}|n(K(t))\rangle. \tag{23.22} \]
This is obvious except for the phase factor $e^{i\gamma_n(t)}$, but this phase factor must exist since it compensates the fact that $d/dt$ can also act on $K(t)$. Indeed, taking this derivative into account in the Schrödinger equation, we obtain
\[ 0 = i\frac{d}{dt}\left[e^{i\gamma_n(t)}|n(K(t))\rangle\right] = e^{i\gamma_n(t)}\left[-\frac{d\gamma_n}{dt}|n(K(t))\rangle + i\frac{d}{dt}|n(K(t))\rangle\right], \tag{23.23} \]
which implies that (multiplying with $\langle n(K(t))|$ from the left and using the normalization of states)
\[ \frac{d\gamma_n(t)}{dt} = i\langle n(K(t))|\nabla_K|n(K(t))\rangle\frac{dK}{dt}, \tag{23.24} \]
which can be integrated to give the phase $\gamma_n$ as
\[ \gamma_n = i\int\langle n(K(t))|\nabla_K|n(K(t))\rangle\,dK. \tag{23.25} \]
This phase $\gamma_n$ is the Berry phase, or geometric phase. We can express it in a more familiar way by defining the Berry "connection" (or, generalized gauge field)
\[ A_n(K) \equiv i\langle n(K(t))|\nabla_K|n(K(t))\rangle. \tag{23.26} \]
Then, if $K$ is a vector, such as a position $\vec R$ (more on that later), we have, more precisely (considering the factors of $\hbar$ as well in order to have a more precise analog of a gauge field),
\[ \vec A_n(\vec R) \equiv i\hbar\langle n(\vec R(t))|\vec\nabla_{\vec R}|n(\vec R(t))\rangle \;\Rightarrow\; \gamma_n = \frac{1}{\hbar}\int\vec A_n(\vec R)\cdot d\vec R. \tag{23.27} \]
Moreover, in general, as for a gauge field, we have a gauge transformation that leaves the system invariant. Indeed, as in the case of the Aharonov–Bohm phase, we realize that we can change the state $|\psi(t)\rangle$ by an arbitrary phase $e^{i\chi(K)}$ without affecting the observables and probabilities, so
\[ |\tilde n(K(t))\rangle = e^{i\chi(K)}|n(K(t))\rangle \tag{23.28} \]
is as good as $|n(K(t))\rangle$ in describing the system. Therefore
\[ i\langle\tilde n(K(t))|\frac{d}{dt}|\tilde n(K(t))\rangle = i\langle n(K(t))|\frac{d}{dt}|n(K(t))\rangle - \frac{d\chi(t)}{dt} = \left[A_n(K) - \nabla_K\chi\right]\frac{dK}{dt} \tag{23.29} \]
also serves as a Berry connection term; thus the Berry connection transforms as
\[ A_n(K) \to A_n(K) - \nabla_K\chi, \tag{23.30} \]
exactly like a gauge field. In the case $K = \vec R$, we have the standard vector potential transformation law,
\[ \vec A_n(\vec R) \to \vec A_n(\vec R) - \vec\nabla_{\vec R}\chi(\vec R(t)). \tag{23.31} \]
We note that $\gamma_n$ must be a phase, since it is real. Indeed, differentiating the normalization relation of the time-dependent state $|n(K(t))\rangle$, we have
\[ \frac{d}{dt}\left[\langle n(K(t))|n(K(t))\rangle = 1\right] \;\Rightarrow\; 0 = \langle n(K(t))|\frac{d}{dt}|n(K(t))\rangle + \left(\frac{d}{dt}\langle n(K(t))|\right)|n(K(t))\rangle = 2\,{\rm Re}\,\langle n(K(t))|\frac{d}{dt}|n(K(t))\rangle, \tag{23.32} \]
where we have used that
\[ \left(\frac{d}{dt}\langle n(K(t))|\right)|n(K(t))\rangle = \left(\langle n(K(t))|\frac{d}{dt}|n(K(t))\rangle\right)^*, \]
so we obtain that the phase is real (note the extra $i$ in the phase),
\[ \frac{d\gamma_n}{dt} = i\langle n(K(t))|\frac{d}{dt}|n(K(t))\rangle \in \mathbb{R}. \tag{23.33} \]
But, if the Hamiltonian is periodic, i.e., it returns to the initial point after a time $T$,
\[ \hat H(K(T)) = \hat H(K(0)), \tag{23.34} \]
then this means that the gauge transformation must also be single valued, $e^{i\chi(T)} = e^{i\chi(0)}$, in order to have a well-defined physical system, so
\[ \chi(K(T)) = \chi(K(0)) + 2\pi m, \tag{23.35} \]
where $m \in \mathbb{Z}$. Therefore the Berry phase on a closed path $C$, modulo $2\pi m$, is invariant, and cannot be removed by a gauge transformation:
\[ \tilde\gamma_n(C) + 2\pi m = \oint_C\tilde A_n(K)\,dK + \chi(T) - \chi(0) = \oint_C A_n(K)\,dK = \gamma_n(C), \tag{23.36} \]
where $\tilde A_n = A_n - \nabla_K\chi$ and $\tilde\gamma_n$ are the gauge-transformed connection and phase.
23.4 Example: Atoms, Nuclei plus Electrons
A standard example, and one that was implicit when we said that the parameters $K$ can be a vector $\vec R$ for the position of some quantity, is the case of several atoms interacting (close by), composed of nuclei and electrons. The nuclei are slow variables, since the time variations of their positions are small, whereas the electrons are fast variables, since the time variations of their positions are large.
Consider for simplicity a diatomic molecule, with distance $\vec R$ between the nuclei (determining the relative motion of the nuclei). Factoring out the center of mass motion, the total Hamiltonian for the system then splits into a Hamiltonian for the nuclei depending on $\vec R$, $H_N(\vec R)$, a Hamiltonian for the electrons, $H_{\rm el}(\vec r_1, \ldots, \vec r_N)$, and a potential depending also on the positions of all of the electrons, $V(\vec R, \vec r_1, \ldots, \vec r_N)$,
\[ \hat H_{\rm tot} = \hat H_N(\vec R) + \hat H_{\rm el}(\vec r_1, \ldots, \vec r_N) + V(\vec R, \vec r_1, \ldots, \vec r_N). \tag{23.37} \]
This is a slight generalization of the hydrogen molecule H2, considered in Chapter 21. There we implicitly used the adiabatic approximation $\vec R \simeq$ constant: the nuclei have only a small motion (relative oscillation) with respect to the distance $R$ between them. This is called the Born–Oppenheimer approximation, after the people who considered it first.
We can then assume, adiabatically, that the positions of the nuclei are approximately constant as the distance $\vec R$ changes in time, as far as the electrons are concerned. Then, for the electrons, we can write down the eigenenergy equation with electron eigenstate $|n(\vec R)\rangle$, where $\vec R$ appears as just a parameter,
\[ \left[H_{\rm el}(\vec r_1, \ldots, \vec r_N) + V(\vec R, \vec r_1, \ldots, \vec r_N)\right]|n(\vec R)\rangle = U_n(\vec R)|n(\vec R)\rangle. \tag{23.38} \]
We see then that this is of the type considered in the Berry phase above.
For the total system, the state is approximately (assuming the separation of variables) the tensor product of a state for the nuclei and the electron state,
\[ |\psi\rangle = |\psi_N\rangle\otimes|n(\vec R)\rangle. \tag{23.39} \]
Neglecting the back reaction of the electrons on the nuclei equation, we obtain the simple separated-variable equation
\[ \left[H_N(\vec R) + U_n(\vec R)\right]|\psi_N\rangle = E_n|\psi_N\rangle, \tag{23.40} \]
and the corresponding time-dependent Schrödinger equation, whose solution classically corresponds to the motion of the relative position of the nuclei, $\vec R = \vec R(t)$; this could be plugged back into the adiabatic eigenenergy equation for the electrons and a Berry phase obtained.
We can ask: why is it then that Born and Oppenheimer missed this Berry phase in their analysis? One answer is that the total Hamiltonian in this case is real, $\hat H \in \mathbb{R}$, so by the general theory we can choose wave functions that are real as well (sines and cosines), so we can neglect phases; this is what Born and Oppenheimer did and therefore missed the Berry phase.
23.5 Spin–Magnetic Field Interaction, Berry Curvature, and Berry Phase as Geometric Phase

We next aim to understand why the Berry phase is also called a geometric phase. We will examine the property of this phase of being defined by geometry for another classic case, that of a spin interacting with a magnetic field that depends slowly on time. Then, we can again consider an adiabatic approximation, and write the interaction of the spin with $\vec B(t)$ as giving a
time-dependent Hamiltonian, in which the magnetic field appears as a time-dependent parameter. The interaction term is
$$\hat H_{int}(\vec B) = -\frac{g\mu_B}{\hbar}\,\vec S\cdot\vec B(t). \tag{23.41}$$
Then the Berry connection, defined with respect to the parameters $\vec B(t)$, is
$$\vec A_{\vec B} = i\langle n(\vec B(t))|\vec\nabla_{\vec B}|n(\vec B(t))\rangle. \tag{23.42}$$
However, associated with it, we can define a "Berry curvature", meaning the field strength of the Berry connection (the gauge field), in the usual way. Writing it for the general case (not necessarily for $\vec K = \vec B$, but for general parameters $\vec K$) we have
$$\vec F_n(\vec K) = \vec\nabla_{\vec K}\times\vec A_n(\vec K), \tag{23.43}$$
which means that in components we have
$$F^n_{ij} = \partial_i A^n_j - \partial_j A^n_i \equiv \frac{\partial}{\partial K_i}A_j - \frac{\partial}{\partial K_j}A_i, \tag{23.44}$$
as for a regular gauge field, and then we can define the "magnetic field" $\vec F_n(\vec K)$ in the usual way (as for electromagnetism), by
$$F^n_{ij} = \epsilon_{ijk}F^n_k. \tag{23.45}$$
Then, moreover, using Gauss's (Stokes') law for this gauge field, we have that (as for the Aharonov–Bohm phase)
$$\oint_{C=\partial S}\vec A_n(\vec K)\cdot d\vec K = \int_S \vec B_n(\vec K)\cdot d^2\vec K. \tag{23.46}$$
The field strength of the gauge field can be rewritten as
$$F^n_{ij} = i\left[\partial_{K_i}\langle n(K(t))|\partial_{K_j}|n(K(t))\rangle - \partial_{K_j}\langle n(K(t))|\partial_{K_i}|n(K(t))\rangle\right] = i\left[\langle\partial_{K_i}n(K(t))|\partial_{K_j}n(K(t))\rangle - \langle\partial_{K_j}n(K(t))|\partial_{K_i}n(K(t))\rangle\right]. \tag{23.47}$$
Now we can insert the identity, rewritten using the completeness relation for the adiabatic states,
$$1 = \sum_m |m(K(t))\rangle\langle m(K(t))|, \tag{23.48}$$
inside the scalar products, obtaining
$$F^n_{ij} = i\sum_{m\neq n}\left[\langle\partial_{K_i}n(K(t))|m(K(t))\rangle\langle m(K(t))|\partial_{K_j}|n\rangle - \langle\partial_{K_j}n(K(t))|m(K(t))\rangle\langle m(K(t))|\partial_{K_i}|n\rangle\right]. \tag{23.49}$$
For $m\neq n$, due to the fact that $\hat H|m\rangle = E_m|m\rangle$ and $\langle n|m\rangle = 0$, which is still valid when $K = K(t)$, we take $\vec\nabla_K$ and obtain
$$\vec\nabla_K\langle n(K)|\hat H|m(K)\rangle = 0. \tag{23.50}$$
Moreover, also using the fact that $\vec\nabla_K\langle n(K(t))|m(K(t))\rangle = 0$ (in the adiabatic case), which implies that
$$\langle\vec\nabla_K n(K(t))|m(K(t))\rangle = -\langle n(K(t))|\vec\nabla_K m(K(t))\rangle, \tag{23.51}$$
and acting with $\vec\nabla_K$ on both states $\langle n|$ and $|m\rangle$, and on $\hat H$ in (23.50), we obtain
$$0 = \langle n(K(t))|(\vec\nabla_K\hat H)|m(K(t))\rangle - (E_m - E_n)\langle n(K(t))|\vec\nabla_K m(K(t))\rangle. \tag{23.52}$$
This implies that
$$\langle n(K(t))|\vec\nabla_K|m(K(t))\rangle = \frac{\langle n(K(t))|(\vec\nabla_K\hat H)|m(K(t))\rangle}{E_m - E_n}. \tag{23.53}$$
Substituting this in (23.49), we obtain that the Berry curvature is expressed as
$$F^n_{ij} = i\sum_{m\neq n}\left[\frac{\langle n(K(t))|(\partial_i\hat H)|m(K(t))\rangle\langle m(K(t))|(\partial_j\hat H)|n(K(t))\rangle}{(E_m - E_n)^2} - \frac{\langle n(K(t))|(\partial_j\hat H)|m(K(t))\rangle\langle m(K(t))|(\partial_i\hat H)|n(K(t))\rangle}{(E_m - E_n)^2}\right]. \tag{23.54}$$
Finally, we can use this formula in the case of the interaction between a spin and a time-dependent magnetic field $\vec B(t)$, acting as parameters $\vec K(t)$. Then, we have
$$\partial_i\hat H = \frac{\partial}{\partial B_i}\hat H = -g\mu_B\frac{\hat S_i}{\hbar}\ (= -\mu_B\sigma_i), \tag{23.55}$$
where in parenthesis we have written an expression for the right-hand side in terms of the Pauli matrices acting on a two-dimensional Hilbert space for spin 1/2 with projection $m\hbar/2$ ($m = \pm 1$) for the spin in the $z$ direction, using $S_i = \hbar\sigma_i/2$ and $g = 2$. Then, after a calculation (left as an exercise), we find
$$B^n_i = -m\frac{\hat B_i}{B^2}, \tag{23.56}$$
where $\hat B_i$ stands for the $i$th component of the unit vector in the direction of the magnetic field. Then the Berry phase is (using the fact that $K_i = B_i$ and Gauss's, or Stokes', law)
$$\gamma_n = \oint_{C=\partial S}A^n_i\,dB_i = \int_{S(C)}B^n_i\,d^2B_i = -m\int_{S(C)}\frac{\hat B_i}{B^2}\,d^2B_i = -m\int_{S(C)}d\Omega. \tag{23.57}$$
Here in the second equality we have used Stokes' law to convert the integral over the closed path $C = \partial S$ in $B_i$ space to an integral over the surface $S(C)$ bounded by $C$ in $B_i$ space, and in the last equality we have noted that $(\hat B_i/B^2)\,d^2B_i$ is a solid angle $d\Omega$ on the surface $S$ bounded by the contour $C$, with respect to the origin $O$ in $\vec B$ space; see Fig. 23.2. Then we see that
$$\gamma_n = -m\,\Delta\Omega \tag{23.58}$$
is a solid angle in parameter ($\vec B$) space, thus being entirely geometric in nature, as we set out to show.
Figure 23.2 A surface $S$ in $\vec B$ space, bounded by a contour $C$ and making a solid angle $\Omega$ from the point of view of the origin $O$.
23.6 Nonabelian Generalization

We can also define a nonabelian generalization of the Berry connection. Indeed, now suppose that we have $N$ degenerate eigenstates $|n\rangle$, with $n = 1,\ldots,N$, of the Hamiltonian, which depends on time-dependent parameters $K(t)$, so we denote the eigenstates by $|n,K\rangle$. Then we can still define the Abelian Berry connection as the sum of the diagonal matrix elements,
$$\vec A(K) = i\sum_n\langle n,K(t)|\vec\nabla_K|n,K(t)\rangle, \tag{23.59}$$
but now we can define also the nonabelian Berry connection as the matrix element in the space of the same energy,
$$\vec A^{(n,m)}(K) = i\langle n,K(t)|\vec\nabla_K|m,K(t)\rangle, \tag{23.60}$$
which takes values in the Lie algebra of the unitary group $U(N)$ (it is a Hermitian $N\times N$ matrix), so that $\vec A(K)$ is its trace,
$$\vec A(K) = \sum_{n=1}^{N}\vec A^{(n,n)}(K), \tag{23.61}$$
which is known to be the Abelian component in the decomposition $U(N) \simeq (U(1)\times SU(N))/\mathbb{Z}_N$.
23.7 Aharonov–Bohm Phase in Berry Form

We can rewrite the Aharonov–Bohm phase $\delta$ in the Berry phase form as follows. Consider (electron) states at a point $\vec R$, translated along the curve $C$ (for example, a curve inside a conductor) until $\vec r$. Then the position-space wave function at point $\vec r$ is
$$\langle\vec r|n(\vec R)\rangle = \exp\left[\frac{iq}{\hbar}\int_{\vec R}^{\vec r}\vec A(\vec r\,')\cdot d\vec r\,'\right]\psi_n(\vec R), \tag{23.62}$$
which leads to
$$\langle n(\vec R)|\vec\nabla_{\vec R}|n(\vec R)\rangle = -\frac{iq}{\hbar}\vec A(\vec R), \tag{23.63}$$
meaning we have rewritten the vector potential $\vec A(\vec R)$ as a Berry connection, and thus the Aharonov–Bohm phase as a Berry phase.
Important Concepts to Remember

• Since the canonically conjugate momentum $\vec p = \vec p_{kin} - q\vec A$ translates in $\vec x$, so is equal to $\frac{\hbar}{i}\partial_{\vec x}$, we find that when we go along a path $P$ from $\vec x$ to $\vec x\,'$ we acquire a phase $e^{i\delta}$, with $\delta = \frac{q}{\hbar}\int_{P:\vec x\to\vec x\,'}\vec A\cdot d\vec x$.
• In turn, this means that on a closed path $C$, we have a gauge-invariant phase proportional to the magnetic flux, since $\delta = \frac{q}{\hbar}\oint_{C=\partial S}\vec A\cdot d\vec x = \frac{q}{\hbar}\int_S\vec B\cdot d\vec S = \frac{q}{\hbar}\Phi_m$. This is the Aharonov–Bohm phase.
• The Aharonov–Bohm phase would appear to mean that $\vec A$ is observable (we can have $\vec B = 0$ on $C$, but $\vec A \neq 0$); the phase itself is unobservable ($e^{i\delta} = 1$) for integer flux, $\Phi_n = n\Phi_0$, $\Phi_0 = h/q$.
• If the Hamiltonian depends on a set of parameters that change slowly in time, $K = K(t)$, with adiabatic energy eigenstates $|n(K(t))\rangle$, we can define a Berry connection $\vec A_n(K) = i\langle n(K(t))|\vec\nabla_K|n(K(t))\rangle$, and the states have an additional Berry phase $e^{i\gamma_n}$, with $\gamma_n = \int\vec A_n(K)\cdot d\vec K$.
• The Berry phase is real and gauge invariant (under generalized gauge transformations for $\vec A_n(K)$) modulo $2\pi m$, so is physically observable.
• For molecules (nuclei plus electrons) in the Born–Oppenheimer approximation (nuclei with small motion), we have a Berry phase in terms of $\vec R$ (the nuclear separation).
• The Berry curvature, the field strength of the Berry connection, can be written in terms of the matrix elements of the derivatives of the Hamiltonian.
• In the case of an interaction between a spin 1/2 and a time-dependent magnetic field $\vec B(t)$, with projection $m\hbar/2$ of the spin on $\vec B$, the Berry phase is a geometric phase in parameter space ($\vec B$ space), specifically $-m$ times the solid angle bounded by a moving direction $\vec B(t)/|\vec B(t)|$ along a closed path.
• In the case of $N$ degenerate eigenstates of the Hamiltonian, $|n\rangle$, we can define a nonabelian Berry connection, $\vec A^{(n,m)}(K) = i\langle n,K(t)|\vec\nabla_K|m,K(t)\rangle$, valued in the Lie algebra of $U(N)$, while the Abelian Berry connection is its trace, $\sum_n\vec A^{(n,n)}$.
• The Aharonov–Bohm phase is a particular example of the Berry phase, since it can be put in the form of the latter.
Further Reading See [1] and the original paper of Michael Berry [9] for more details.
Exercises

(1) Consider a figure-eight loop made of conductor, mostly in a single plane (with just enough nonplanarity for the loop to not cross itself, but to slightly avoid it), and a constant magnetic field perpendicular to this plane. Do we have an Aharonov–Bohm phase for the electrons in the conducting loop?
(2) What happens to the wave function of the electrons moving around the loop in exercise 1 (or to the interference pattern if we shoot electrons at a point on the loop in opposite directions on the loop and measure their interference when they cross again) as we undo the figure-eight into a circle in the same plane?
(3) Consider a system depending on $N$ time-dependent vectors $\vec R_i$, $i = 1, 2, \ldots, N$, for instance a molecule with $N$ nuclei with such positions. How do we construct its generic Berry phase?
(4) Consider an H2 molecule moving slowly, in the Born–Oppenheimer approximation, in a constant electric field $\vec E$. What Berry connection does one define, and how can we have a nontrivial Berry phase?
(5) Consider a molecule with fixed dipole $\vec d$ in a time-dependent electric field $\vec E(t)$. Calculate the Berry connection and Berry magnetic field in this case. Do we have a geometrical phase in this case?
(6) Prove that for a spin 1/2 particle in a magnetic field the Berry curvature is given by (23.56).
(7) How would you define the Berry curvature in the nonabelian case, in such a way that it transforms covariantly under $U(N)$, i.e., with $U$ acting from the left and $U^{-1}$ from the right?
24 Motion in a Magnetic Field, Hall Effect and Landau Levels

In this chapter we consider particles with and without spin, and later atoms, in a magnetic field and calculate the resulting quantum effects. For a particle, we find that there are "Landau levels" for the states, and for particles in a conductor these lead to a quantum version of the classical Hall effect. Finally, for the atom, we find a first-order form for the Zeeman effect.
24.1 Spin in a Magnetic Field

As we saw in previous chapters, for the interaction of a spin with a magnetic field, we have the Hamiltonian
$$\hat H(B) = \hat H_0 - \frac{g\mu_B}{\hbar}\,\vec B\cdot\vec S. \tag{24.1}$$
In the case of a spin 1/2 particle, we can rewrite this as
$$\hat H(B) = \hat H_0 - \mu_B\,\vec B\cdot\vec\sigma. \tag{24.2}$$
For a magnetic field in the $z$ direction,
$$\vec B\cdot\vec\sigma = B_z\sigma_3 = \begin{pmatrix} +B_z & 0\\ 0 & -B_z\end{pmatrix}, \tag{24.3}$$
which means that the time-independent Schrödinger equation $\hat H(B)|\psi\rangle = E(B)|\psi\rangle$ gives
$$E(B) = E_0 \mp \mu_B B_z, \tag{24.4}$$
which is the result of the Stern–Gerlach experiment for $S_z = \pm\hbar/2$ (it is the spin-1/2-particle splitting energy as a function of $S_z$ in a transverse magnetic field).
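As a small numerical illustration of (24.2)–(24.4), the sketch below returns the two energy levels of a spin 1/2 in a uniform field. It is not from the text: the function name `spin_half_levels` and the hard-coded CODATA value of the Bohr magneton are assumptions made for the example. It also assumes a field in an arbitrary direction, using the fact that $\vec B\cdot\vec\sigma$ is traceless with $(\vec B\cdot\vec\sigma)^2 = |\vec B|^2$, so its eigenvalues are $\pm|\vec B|$; for a field along $z$ this reduces to $\pm B_z$.

```python
import math

MU_B = 9.2740100783e-24  # Bohr magneton in J/T (CODATA 2018 value, assumed here)


def spin_half_levels(e0, bx, by, bz, mu_b=MU_B):
    """Return (lower, upper) eigenvalues of H = E0 - mu_B * B.sigma.

    B.sigma is a traceless Hermitian 2x2 matrix whose square is |B|^2,
    so its eigenvalues are +|B| and -|B| for any field direction.
    """
    b = math.sqrt(bx * bx + by * by + bz * bz)
    return (e0 - mu_b * b, e0 + mu_b * b)


if __name__ == "__main__":
    lower, upper = spin_half_levels(0.0, 0.0, 0.0, 1.0)  # 1 tesla along z
    print("splitting 2*mu_B*B in joules:", upper - lower)
```

The splitting between the two levels is $2\mu_B|\vec B|$, independent of $E_0$.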
24.2 Particle with Spin 1/2 in a Time-Dependent Magnetic Field

We consider a magnetic field $\vec B = \vec B(t)$, i.e., constant in space, but time dependent. In this case, we can separate variables and write a wave function that is a tensor product of a position state and a spin state (as in Chapters 17 and 20),
$$|\psi\rangle = \psi(\vec r,t)\,|s(t)\rangle. \tag{24.5}$$
The Schrödinger equation splits into independent Schrödinger equations. The solution for $|\psi\rangle$ is unique, though the separation into parts is not. We can add a time dependence with a function $F(t)$, split between the two equations,
$$\left[\hat H_0 - F(t)\right]\psi'(\vec r,t) + \frac{\hbar}{i}\frac{\partial}{\partial t}\psi'(\vec r,t) = 0$$
$$\frac{\hbar}{i}\frac{d}{dt}|s'\rangle + \left(-\mu_B\vec B(t)\cdot\vec\sigma + F(t)\right)|s'\rangle = 0. \tag{24.6}$$
The solution of the two equations factorizes the dependence on $F(t)$,
$$\psi'(\vec r,t) = \psi(\vec r,t)\exp\left[\frac{i}{\hbar}\int_0^t F(t')\,dt'\right]$$
$$|s'(t)\rangle = |s(t)\rangle\exp\left[-\frac{i}{\hbar}\int_0^t F(t')\,dt'\right]. \tag{24.7}$$
But, as we said, the total state is independent of $F(t)$,
$$|\psi'\rangle = \psi'(\vec r,t)\,|s'(t)\rangle = \psi(\vec r,t)\,|s(t)\rangle = |\psi\rangle. \tag{24.8}$$
So we can use the solution without $F(t)$ (i.e., put $F(t) = 0$), which is unique. However, as explained in the previous chapter, in this situation there is also a Berry phase that is actually geometrical; we will not repeat the argument here.
24.3 Particle with or without Spin in a Magnetic Field: Landau Levels

We saw in the previous chapter that the coupling of a particle with a vector potential describing an electromagnetic field is given by
$$\hat H = \frac{1}{2m}\left(\hat{\vec p} - q\vec A\right)^2 - qA_0. \tag{24.9}$$
Consider a magnetic field in the $z$ direction, $\vec B = B\vec e_z$, with no electric field, so we can put the gauge $A_0 = 0$. Then we have the vector potential
$$\vec A = Bx\,\vec e_y, \tag{24.10}$$
which indeed gives $F_{xy} = \epsilon_{xyz}B_z = B$. Then the Hamiltonian becomes
$$\hat H = \frac{1}{2m}\left[\hat p_x^2 + \left(\hat p_y - qBx\right)^2 + \hat p_z^2\right]. \tag{24.11}$$
To solve the time-independent Schrödinger equation, we write a solution with separated variables, where in the $y, z$ directions we have just the free particle ansatz
$$\psi(x,y,z) = u_{n,p_y,p_z}(x,y,z) = X_n(x)\,e^{i(yp_y + zp_z)/\hbar}. \tag{24.12}$$
The time-independent Schrödinger equation becomes
$$\hat H\psi = E\psi = -\frac{\hbar^2}{2m}\left[\partial_x^2 + \left(\partial_y - \frac{iq}{\hbar}Bx\right)^2 + \partial_z^2\right]\psi = -\frac{\hbar^2}{2m}\left[\partial_x^2 + \left(\frac{ip_y}{\hbar} - \frac{iq}{\hbar}Bx\right)^2 - \frac{p_z^2}{\hbar^2}\right]X_n(x)\,e^{i(yp_y + zp_z)/\hbar}. \tag{24.13}$$
If the particle also has a spin, say spin 1/2, then we saw that there is an extra term $-\mu_B\vec B\cdot\vec\sigma$ in $\hat H$, corresponding to a term $\mp\mu_B B$ in the energy. With this extra term, the equation for a particle with spin 1/2 in a magnetic field is given by
$$X_n''(x) + \frac{2m}{\hbar^2}\left[E \pm \mu_B B - \frac{p_z^2}{2m} - \frac{q^2B^2}{2m}\left(x - \frac{p_y}{qB}\right)^2\right]X_n(x) = 0. \tag{24.14}$$
Then, defining
$$x_0 = \frac{p_y}{qB}, \qquad \frac{q^2B^2}{m^2} \equiv \omega^2(B) \Rightarrow \omega(B) = \omega_c = \frac{|q|B}{m}, \tag{24.15}$$
where $\omega_c$ is called the cyclotron frequency and is the angular frequency for the motion of a particle of mass $m$ in a field $B$, we find
$$X_n''(x) + \frac{2m}{\hbar^2}\left[E \pm \mu_B B - \frac{p_z^2}{2m} - \frac{m\omega_c^2}{2}(x - x_0)^2\right]X_n(x) = 0. \tag{24.16}$$
This is the equation for a harmonic oscillator, with coordinate $x - x_0$ and frequency $\omega_c$, so the $x$-dependent wave function is
$$X_n(x) = N_n\,e^{-\alpha^2(x - x_0)^2/2}\,H_n(\alpha(x - x_0)), \tag{24.17}$$
and the eigenenergies are
$$E_{n;|p_z|,m_s} = \frac{p_z^2}{2m} \mp \mu_B B + \hbar\omega_c\left(n + \frac{1}{2}\right). \tag{24.18}$$
We note that
$$\mu_B B = \frac{|e|\hbar B}{2m} = \frac{\hbar\omega_c}{2} = |m_s|\hbar\omega_c, \tag{24.19}$$
where $m_s = \pm 1/2$ gives the values of the spin projection onto the $z$ direction. Thus the eigenenergies can be rewritten as
$$E_{n;|p_z|,m_s} = \frac{p_z^2}{2m} + \hbar\omega_c\left(n + m_s + \frac{1}{2}\right). \tag{24.20}$$
The parameter $\alpha$ in the harmonic oscillator is rewritten as
$$\alpha = \sqrt{\frac{m\omega_c}{\hbar}} = \sqrt{\frac{|q|B}{\hbar}} \equiv \frac{1}{l}, \tag{24.21}$$
where the distance $l$ is approximately $250\,\text{Å}/\sqrt{B(\text{tesla})}$, and the position $x_0$ is
$$x_0(p_y) = \pm\frac{p_y l^2}{\hbar} = \pm\frac{p_y}{m\omega_c}. \tag{24.22}$$
The harmonic oscillator index $n$ (listing the energy levels) is now called a Landau level. We note that the harmonic oscillation:
• is shifted by $x_0(p_y)$ and rescaled by $l$;
• is a motion in a plane perpendicular to $\vec B$, in this case the plane defined by $(x, y)$.
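To get a feeling for the scales in (24.15), (24.20), and (24.21), the following sketch evaluates the cyclotron frequency, the magnetic length $l$, and the Landau level energies for an electron. It is only an illustration under stated assumptions: the helper names are hypothetical, the constants are CODATA values typed in by hand, and the energy formula (24.20) is used in the form valid for an electron ($|q| = |e|$, with $m$ the electron mass), for which $\mu_B B = \hbar\omega_c/2$.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg


def cyclotron_frequency(b, q=E_CHARGE, m=M_E):
    """omega_c = |q| B / m, as in (24.15)."""
    return abs(q) * b / m


def magnetic_length(b, q=E_CHARGE):
    """l = sqrt(hbar / (|q| B)), the inverse of alpha in (24.21)."""
    return math.sqrt(HBAR / (abs(q) * b))


def landau_energy(n, b, pz=0.0, ms=0.5, q=E_CHARGE, m=M_E):
    """E = pz^2/(2m) + hbar*omega_c*(n + ms + 1/2), as in (24.20)."""
    omega_c = cyclotron_frequency(b, q, m)
    return pz ** 2 / (2.0 * m) + HBAR * omega_c * (n + ms + 0.5)


if __name__ == "__main__":
    b = 1.0  # tesla
    print("omega_c [rad/s]:", cyclotron_frequency(b))
    print("l [angstrom]:", magnetic_length(b) * 1e10)
    print("E(n=0, ms=+1/2) [J]:", landau_energy(0, b))
```

At $B = 1$ T the magnetic length comes out at about 257 Å, consistent with the rounded estimate $250\,\text{Å}/\sqrt{B(\text{tesla})}$, and $l$ scales as $1/\sqrt{B}$.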
To compare the behavior obtained in quantum mechanics with that in classical mechanics, note that in classical mechanics we have a circular motion around the direction of $\vec B$. The electron is under the influence of the Lorentz force $\vec F = q(\vec E + \vec v\times\vec B)$, with $\vec E = 0$ and a velocity perpendicular to $\vec B$. Since the Lorentz force acts as a centripetal force, or balances the centrifugal force, we obtain
$$F = |q|vB = F_{\rm centrifugal} = mv\omega, \tag{24.23}$$
leading to a motion with the cyclotron angular frequency
$$\omega = \frac{|q|B}{m} = \omega_c. \tag{24.24}$$
The particle thus describes circles of radius
$$r = \frac{v}{\omega_c} = \frac{p_y}{m\omega(B)} = |x_0(p_y)|. \tag{24.25}$$
Thus in the quantum case we have a shifted oscillator instead of circular motion, though the parameters are the same as in the classical case. We also note that the energy is independent of $p_y$, so there is a degeneracy of the levels, corresponding to all possible $p_y$.

In a physical situation, the area of the system in the $(x, y)$ plane is finite. Consider a rectangle in this plane, with length $L_x$ in $x$ and $L_y$ in $y$, so with area $S = L_xL_y$. Then, the momenta $p_y$ are quantized as $p_{y,m}$, since we need to have periodicity of the wave function (which should vanish at the boundaries of the sample), $\exp\left(\frac{i}{\hbar}p_{y,m}L_y\right) = e^{2\pi im} = 1$, implying
$$p_{y,m} = \frac{2\pi\hbar m}{L_y}. \tag{24.26}$$
But, since for each $p_y$ the level is shifted by $x_0(p_y)$, there is a maximum value for $p_y$, $p_{y,\max}$, such that we shift all the way to the end of the system, $x_0(p_{y,\max}) = L_x$,
$$\frac{p_{y,\max}}{qB} = \frac{2\pi\hbar\,m_{\max}}{qBL_y} = L_x, \tag{24.27}$$
which implies that there is a maximum number of states (degeneracy) in a Landau level,
$$N_{\max} = m_{\max} = \frac{qBL_yL_x}{h} = \frac{L_xL_y}{2\pi l^2}. \tag{24.28}$$
Since $S = L_xL_y$, the magnetic flux is $\Phi = BL_xL_y$, and thus we have
$$N_{\max} = \frac{\Phi}{\Phi_0}, \tag{24.29}$$
where, as before, $\Phi_0 = h/e$ is the flux quantum. Therefore the maximum number of electron states in a Landau level is independent of the level $n$. And the maximum number of electron states per unit area $L_xL_y$ is
$$n_B = \frac{N_{\max}}{L_xL_y} = \frac{eB}{h} = \frac{1}{2\pi l^2}. \tag{24.30}$$
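The counting in (24.26)–(24.30) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the constants are exact SI values entered by hand. It computes the flux quantum $\Phi_0 = h/e$, the degeneracy $N_{\max} = \Phi/\Phi_0$ of one Landau level for a rectangular sample, and the density of states per unit area $n_B = eB/h$.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def flux_quantum(q=E_CHARGE):
    """Phi_0 = h / |q|."""
    return H_PLANCK / abs(q)


def landau_degeneracy(b, lx, ly, q=E_CHARGE):
    """N_max = Phi / Phi_0 with Phi = B * Lx * Ly, as in (24.29)."""
    return b * lx * ly / flux_quantum(q)


def states_per_area(b, q=E_CHARGE):
    """n_B = |q| B / h, as in (24.30)."""
    return abs(q) * b / H_PLANCK


def allowed_py(m_index, ly):
    """Quantized momentum p_{y,m} = 2*pi*hbar*m / Ly = h*m / Ly, as in (24.26)."""
    return H_PLANCK * m_index / ly


if __name__ == "__main__":
    b, lx, ly = 1.0, 1e-3, 1e-3  # 1 tesla, 1 mm x 1 mm sample
    print("flux quantum [Wb]:", flux_quantum())
    print("states per Landau level:", landau_degeneracy(b, lx, ly))
    print("states per m^2:", states_per_area(b))
```

For a millimeter-sized sample at 1 T each Landau level holds of order $10^8$ states, which is why the levels can be treated as macroscopically degenerate.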
24.4 The Integer Quantum Hall Effect (IQHE)

The Landau levels have a very important application to the physics of conducting materials, the Hall effect.

Figure 24.1 The classical Hall effect: a current in the $x$ direction and magnetic field in the $z$ direction result in a Hall potential in the $y$ direction.

The classical Hall effect is as follows. Consider a quasi-two-dimensional sample of conducting material of size $L_x = L$ and $L_y = W$ and magnetic field $B$ in the $z$ direction. Add a voltage $V_L$ in the $x$ direction, implying an electric field $E_x$, so that $V_L = LE_x$; see Fig. 24.1. Conduction electrons move sideways in the material under the Lorentz force until there is a compensating electric field $E_y$, generating a "Hall voltage" $V_H = L_yE_y$. The equilibrium of the Hall voltage generating $E_y$ with the Lorentz force gives, from $\vec F = -e(\vec E + \vec v\times\vec B) = 0$,
$$E_y = v_xB_z. \tag{24.31}$$
This is the classical Hall effect, which we will quantify next. Conduction electrons move in the material by accelerating between collisions, which stop them and bring the velocity down to zero. The time between collisions is the "relaxation time" $\tau$. This means that the average velocity in the conduction direction $x$ is
$$v_x = a\tau = -\frac{eE_x\tau}{m}. \tag{24.32}$$
Then, considering the current density per unit length $j_x$ (note that we have a quasi-two-dimensional sample, with a transverse area that is quasi-one-dimensional, so the appropriate density is given per unit length in the $y$ direction instead of per unit area) generated by a number of electrons per unit area $n_e$, each of charge $-e$, we have
$$j_x = -n_e e v_x = \frac{e^2\tau n_e}{m}E_x, \tag{24.33}$$
which means that the compensating electric field in the $y$ direction is
$$E_y = v_xB_z = -\frac{eB}{m}\tau E_x = -\omega_c\tau E_x. \tag{24.34}$$
We define the transverse, or Hall, resistance $R_H$ as the ratio of the Hall voltage $V_H$ (in the $y$ direction) and the applied current $I$ (in the $x$ direction),
$$R_H = \frac{V_H}{I} = \frac{L_yE_y}{L_yj_x} = -\frac{m\omega_c\tau}{e^2\tau n_e} = -\frac{m\omega_c}{n_e e^2} = -\frac{B}{n_e e}. \tag{24.35}$$
Here the minus sign is conventional, to emphasize that the Hall voltage is a compensating effect that opposes the applied current.
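The classical relations (24.31)–(24.35) translate directly into a few lines of code. The following is a sketch under stated assumptions, not part of the original derivation: the function names are hypothetical, the carriers are taken to be electrons of charge $-e$ and mass $m_e$, and `n_e` is the number of electrons per unit area.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg


def drift_velocity(e_x, tau, m=M_E):
    """Average velocity v_x = -e E_x tau / m, as in (24.32)."""
    return -E_CHARGE * e_x * tau / m


def hall_field(b, e_x, tau, m=M_E):
    """Compensating field E_y = v_x B_z = -omega_c tau E_x, as in (24.34)."""
    return drift_velocity(e_x, tau, m) * b


def hall_resistance(b, n_e):
    """Classical Hall resistance R_H = -B / (n_e e), as in (24.35)."""
    return -b / (n_e * E_CHARGE)


if __name__ == "__main__":
    # Illustrative numbers: B = 1 T, n_e = 1e15 electrons per m^2
    print("R_H [ohm]:", hall_resistance(1.0, 1e15))
    print("E_y [V/m]:", hall_field(1.0, 10.0, 1e-12))
```

Note that the relaxation time $\tau$ drops out of $R_H$, which depends only on $B$ and the carrier density; this is why the classical $R_H$ is linear in $B$.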
The Hall conductance is defined as the inverse of the resistance,
$$\sigma_H = \frac{1}{R_H} = -\frac{n_e e}{B}. \tag{24.36}$$

Figure 24.2 $R_H$, or the Hall resistivity $\rho_{xy}$, versus magnetic field $B$ has plateaux, while $\rho_{xx} \simeq 0$ on the plateaux.
However, in 1980, Klaus von Klitzing measured experimentally the Hall effect in a two-dimensional electron gas confined at the interface between an oxide, SiO2, and the Si semiconductor MOSFET (metal-oxide-semiconductor field-effect transistor, a standard electronic device). He found that in this case, the $B$ dependence of the Hall voltage is not linear, as suggested by the above relation, but rather has plateaux, whose position is independent of the sample (i.e., is universal), as in Fig. 24.2. He also found that actually the Hall conductance $\sigma_H$ is quantized as
$$\sigma_H = -n\frac{e^2}{h} = n\sigma_0. \tag{24.37}$$
Moreover, von Klitzing found that at the plateaux the longitudinal (normal) resistance vanishes, $R_L = V_L/I = 0$. This behavior appears in various $(2+1)$-dimensional strongly coupled systems (systems where the effective interactions between the degrees of freedom are strong).

The above phenomenon is called the integer quantum Hall effect, or IQHE, and von Klitzing got a Nobel Prize for it in 1985 (which is a record time between research and Nobel Prize). Actually, in 1982 Daniel Tsui and Horst Störmer had found that, at much higher $B$ and much lower temperature $T$, for similar systems we have a fractional quantum Hall effect (FQHE), with
$$\sigma_H = \nu\frac{e^2}{h}, \qquad \nu = \frac{k}{2p+1}. \tag{24.38}$$
The FQHE was (partially) explained by Robert Laughlin in terms of a "Laughlin wave function", but we will not address it here. For this explanation, he (together with Tsui and Störmer) also got a Nobel Prize, in 1998.

Coming back to the IQHE, and applying the Landau level theory, if we have a maximum number $n_B$ of states per unit area in a Landau level, we can have $n$ fully occupied levels, so the number of (conducting) electron states per unit area is given by
$$\nu = \frac{n_e}{n_B} = n \in \mathbb{N}. \tag{24.39}$$
Then the Hall conductance is, as given earlier,
$$\sigma_H = -\frac{n_e e}{B} = -\frac{n\,n_B e}{B} = -\frac{ne^2}{h}. \tag{24.40}$$
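A short numerical check of (24.39)–(24.40) may help here. The sketch below uses hypothetical function names and hand-entered exact SI constants; it computes the filling factor $\nu = n_e/n_B$ with $n_B = eB/h$, and verifies that when exactly $n$ levels are filled the classical expression $-n_e e/B$ coincides with the quantized value $-ne^2/h$. The inverse of $e^2/h$ is about 25.8 kΩ.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def filling_factor(n_e, b):
    """nu = n_e / n_B with n_B = e B / h, as in (24.39)."""
    return n_e * H_PLANCK / (E_CHARGE * b)


def classical_hall_conductance(n_e, b):
    """sigma_H = -n_e e / B, as in (24.36)."""
    return -n_e * E_CHARGE / b


def quantized_hall_conductance(n):
    """sigma_H = -n e^2 / h, as in (24.40)."""
    return -n * E_CHARGE ** 2 / H_PLANCK


if __name__ == "__main__":
    b = 5.0                              # tesla
    n_e = 3 * E_CHARGE * b / H_PLANCK    # density that fills exactly 3 levels
    print("filling factor:", filling_factor(n_e, b))
    print("classical sigma_H:", classical_hall_conductance(n_e, b))
    print("quantized sigma_H:", quantized_hall_conductance(3))
    print("h/e^2 [ohm]:", -1.0 / quantized_hall_conductance(1))
```

The agreement holds only at the special densities where $\nu$ is an integer; the plateaux themselves are not explained by this counting alone.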
24.5 Alternative Derivation of the IQHE

We will now consider an alternative derivation of the IQHE, which also explains an important property of the quantum Hall effect, the fact that at the plateaux we have $R_L = 0$ (zero normal, or longitudinal, resistance).

As we said, at equilibrium the Lorentz force vanishes, $\vec v\times\vec B = -\vec E$, so
$$v_iB_k = E_j\epsilon_{ijk}. \tag{24.41}$$
If we take $\vec B$ to be in the $z$ direction, and $k = z$, we find
$$v_i = \frac{\epsilon_{ij}E_j}{B_z}. \tag{24.42}$$
If we have $n$ fully occupied Landau levels, therefore each with $N_{\max}$ states in them, so that $Q = nN_{\max}$, for electrons ($q = -e$) we find the current density
$$j_i = \frac{eQ}{L_xL_y}v_i = \frac{eQ}{BL_xL_y}\epsilon_{ij}E_j \equiv \sigma_{ij}E_j, \tag{24.43}$$
where we have defined a conductance tensor by the standard relation $j_i = \sigma_{ij}E_j$. Then we have
$$\sigma_{ij} = \frac{eQ}{\Phi}\epsilon_{ij} = \frac{e\,nN_{\max}}{N_{\max}h/e}\epsilon_{ij} = n\frac{e^2}{h}\epsilon_{ij} = \sigma_H\epsilon_{ij}, \tag{24.44}$$
so the conductance matrix is
$$\sigma = \begin{pmatrix} 0 & ne^2/h\\ -ne^2/h & 0\end{pmatrix}. \tag{24.45}$$
We see that $\sigma_{xx}$, the diagonal or normal (longitudinal) conductance, vanishes. If there were no off-diagonal (Hall) conductance, but only a longitudinal conductance, $\sigma_{xx} = 0$ would imply infinite resistance, $R_{xx} = \infty$. However, now $\sigma$ is a matrix, so has to be inverted as a matrix, giving the resistance matrix
$$R = \sigma^{-1} = \begin{pmatrix} 0 & -1/\sigma_{xy}\\ 1/\sigma_{xy} & 0\end{pmatrix}, \tag{24.46}$$
and we see that now in fact $R_{xx} = 0$! This means that we have zero longitudinal resistance on the plateaux, the other important experimental observation about the integer quantum Hall effect.
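The matrix inversion leading to (24.46) is easy to verify explicitly. The sketch below is a minimal illustration with hypothetical helper names: it builds the purely off-diagonal conductance matrix of (24.45) and inverts it with the standard formula for a $2\times 2$ inverse, showing that the diagonal entries of $R$ vanish even though those of $\sigma$ do too.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def invert_2x2(matrix):
    """Invert [[a, b], [c, d]] using the adjugate divided by the determinant."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def hall_conductance_matrix(n):
    """Conductance matrix of (24.45) for n filled Landau levels."""
    s = n * E_CHARGE ** 2 / H_PLANCK
    return [[0.0, s], [-s, 0.0]]


if __name__ == "__main__":
    sigma = hall_conductance_matrix(1)
    resistance = invert_2x2(sigma)
    print("R_xx [ohm]:", resistance[0][0])
    print("R_yx [ohm]:", resistance[1][0])
```

A matrix with only a diagonal entry $\sigma_{xx}$ and no Hall part would instead be singular when $\sigma_{xx} = 0$, which is the infinite-resistance case mentioned in the text.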
24.6 An Atom in a Magnetic Field and the Landé g-Factor

Consider now the next case in terms of complexity of the system, a multi-electron atom in a magnetic field. The electrons have spins that add up to a total electronic spin $\vec S = \sum_{i=1}^N\vec s_i$ (for $N$ electrons), and these electrons interact with $\vec B$ through both the vector potential $\vec A$ and the spin–magnetic field interaction $\vec B\cdot\vec S$. Specifically, the Hamiltonian will be
$$\hat H = \frac{1}{2m}\sum_{i=1}^N\left(\hat{\vec p}_i + |e|\vec A(\vec r_i)\right)^2 + U(\vec r_1,\ldots,\vec r_N) + \frac{2\mu_B}{\hbar}\vec B\cdot\vec S, \tag{24.47}$$
where $U(\vec r_1,\ldots,\vec r_N)$ is the interaction potential between the electrons and nucleus and between the electrons themselves, and $\mu_B = |e|\hbar/(2m)$. Expanding, we get
$$\hat H = \hat H(\vec A = 0) + \frac{|e|}{m}\sum_{i=1}^N\vec A(\vec r_i)\cdot\hat{\vec p}_i + \frac{e^2}{2m}\sum_{i=1}^N\vec A^2(\vec r_i) + \frac{|e|}{m}\vec B\cdot\vec S. \tag{24.48}$$
If the magnetic field is constant, we can choose a gauge where the vector potential is $\vec A = (\vec B\times\vec r)/2$, so the Hamiltonian is then rewritten as
$$\hat H = \hat H(\vec A = 0) + \frac{|e|}{2m}\vec B\cdot\sum_{i=1}^N(\vec r_i\times\hat{\vec p}_i) + \frac{e^2}{8m}\sum_{i=1}^N(\vec B\times\vec r_i)^2 + \frac{|e|}{m}\vec B\cdot\vec S = \hat H(\vec A = 0) + \frac{\mu_B}{\hbar}(\vec L + 2\vec S)\cdot\vec B + \frac{e^2}{8m}\sum_{i=1}^N(\vec r_i\times\vec B)^2, \tag{24.49}$$
where we have used the individual orbital angular momentum $\vec L_i = \vec r_i\times\vec p_i$ and the total orbital angular momentum $\vec L = \sum_{i=1}^N\vec L_i$. The last term is independent of $\vec L$, $\vec S$, and the middle term is $\vec\mu_{tot}\cdot\vec B$, where (since $\vec L$ is the total orbital angular momentum)
$$\vec\mu_{tot} = \frac{\mu_B}{\hbar}(\vec L + 2\vec S). \tag{24.50}$$
For the electrons, we can define the total angular momentum
$$\vec J = \vec L + \vec S, \tag{24.51}$$
and similarly for the projections onto the $z$ direction, $J_z = L_z + S_z$. The projection $J_z$ is defined by an eigenvalue $M_J$, so the average for a state is $\langle J_z\rangle = M_J\hbar$. Moreover, we also have $\langle L_z\rangle = M_L\hbar$ and $\langle S_z\rangle = M_S\hbar$. Then the energy split due to the angular momenta is
$$\Delta E = \frac{\mu_BB}{\hbar}\langle L_z + 2S_z\rangle = \frac{\mu_BB}{\hbar}\langle J_z + S_z\rangle, \tag{24.52}$$
and we can write it in terms of a total Landé, or gyromagnetic, factor $g$ for the whole atom, as
$$\Delta E = g\mu_BBM_J. \tag{24.53}$$
This is the Zeeman effect, of the splitting of the atomic levels under a magnetic field. Note that there are further terms, coming from the nonrelativistic expansion for the Dirac equation, including the spin–orbit coupling, but we will not consider them here.

In order to calculate $g$, we first square the relation $\vec J = \vec L + \vec S$, to obtain
$$2\vec L\cdot\vec S = \vec J^2 - \vec L^2 - \vec S^2. \tag{24.54}$$
Then, we take the quantum average over a state of given $J, L, S$, for which $\langle\vec J^2\rangle = J(J+1)\hbar^2$, $\langle\vec L^2\rangle = L(L+1)\hbar^2$ and $\langle\vec S^2\rangle = S(S+1)\hbar^2$, and use $\vec S\cdot\vec J = \vec L\cdot\vec S + \vec S^2$, to obtain
$$\langle\vec S\cdot\vec J\rangle = \frac{\hbar^2}{2}\left[J(J+1) - L(L+1) + S(S+1)\right]. \tag{24.55}$$
Next, we note that, for a spin–orbit coupling that aligns the spin and total angular momentum, we obtain
$$\langle S_z\rangle = \frac{\langle\vec J\cdot\vec S\rangle}{\langle\vec J^2\rangle}\langle J_z\rangle = M_J\hbar\,\frac{J(J+1) - L(L+1) + S(S+1)}{2J(J+1)}. \tag{24.56}$$
We finally write
$$gM_J\hbar = \langle J_z + S_z\rangle = M_J\hbar\left[1 + \frac{J(J+1) - L(L+1) + S(S+1)}{2J(J+1)}\right], \tag{24.57}$$
allowing us to write the Landé factor as
$$g = 1 + \frac{J(J+1) - L(L+1) + S(S+1)}{2J(J+1)}. \tag{24.58}$$
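The Landé factor is simple to tabulate. The sketch below is illustrative and uses a hypothetical function name; it assumes the standard form of the result, $g = 1 + [J(J+1) - L(L+1) + S(S+1)]/[2J(J+1)]$, which follows from $\vec S\cdot\vec J = \vec L\cdot\vec S + \vec S^2$, and checks the two limiting cases: pure spin ($L = 0$) gives $g = 2$ and pure orbital motion ($S = 0$) gives $g = 1$.

```python
import math


def lande_g(j, l, s):
    """Lande g factor for given quantum numbers J, L, S (J must be nonzero)."""
    if j == 0:
        raise ValueError("g is undefined for J = 0")
    numerator = j * (j + 1) - l * (l + 1) + s * (s + 1)
    return 1.0 + numerator / (2.0 * j * (j + 1))


def zeeman_shift(j, l, s, m_j, mu_b_times_b):
    """Energy shift g * mu_B * B * M_J, as in (24.53)."""
    return lande_g(j, l, s) * mu_b_times_b * m_j


if __name__ == "__main__":
    print("L=0, S=1/2, J=1/2:", lande_g(0.5, 0, 0.5))
    print("L=1, S=0,   J=1:  ", lande_g(1, 1, 0))
    print("L=1, S=1/2, J=3/2:", lande_g(1.5, 1, 0.5))
```

For the $L = 1$, $S = 1/2$, $J = 3/2$ state this gives $g = 4/3$, between the pure orbital and pure spin values.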
Important Concepts to Remember

• For a quantum particle (perhaps with spin 1/2) in a constant magnetic field, we find that it behaves as a shifted and rescaled harmonic oscillator with cyclotron frequency $\omega_c = |q|B/m$ in a direction perpendicular to $\vec B$, leading to Landau levels $n$, $E_n = p_z^2/(2m) + \hbar\omega_c(n + m_s + 1/2)$.
• Classically, we have a motion at $\omega_c$ on a circle of radius $r = p_y/(m\omega_c)$ (with momentum perpendicular to the radial direction), while quantum mechanically we have motion in a direction $x$ in the same plane as the circle, shifted by $x_0(p_y) = r$ (the $x$ direction is perpendicular to the $p_y$ direction), as that of a one-dimensional harmonic oscillator with the same $\omega_c$, and rescaled by $l = \sqrt{\hbar/(|q|B)}$.
• In a physical situation, for each Landau level we have a degeneracy determined by the quantized momentum $p_y$, which is bounded because it leads to a shift in the perpendicular direction. This leads to a maximum number of states (degeneracy) of each Landau level equal to $\Phi/\Phi_0$, where $\Phi_0 = h/e$ is the flux quantum, and to $n_B = eB/h$ states per unit area in a Landau level.
• The classical Hall effect is as follows: if we have a $B$ field in the $z$ direction, then an $E$ field in the $y$ direction is correlated with the current in the $x$ direction (either the current induces $E_y$, i.e., the Hall voltage $V_H$, or $E_y$ induces $I_x$), leading to a Hall conductivity $\sigma_H = -n_e e/B$.
• The integer quantum Hall effect (IQHE) is the phenomenon whereby, for certain $(2+1)$-dimensional systems, the relation between $V_H$ and $B$ (which classically is $V_H = I/\sigma_H \propto B$) is not linear, but has plateaux, whose position is universal, and we also have an integer conductivity, $\sigma_H = -ne^2/h$, and $R_L = 0$ (zero longitudinal resistance).
• The fractional quantum Hall effect (FQHE) is the phenomenon whereby, at much higher $B$ fields and lower temperature $T$, the conductivity is fractional, $\sigma_H = \nu e^2/h$, with $\nu = k/(2p+1)$, (partially) explained by Laughlin in terms of a Laughlin quantum wave function.
• The Hall conductance $\sigma_H$ of IQHE comes from $n$ fully occupied Landau levels, with $n_e/n_B = n$, which also means $\sigma_{ij} = \sigma_H\epsilon_{ij}$, such that $\sigma_{xx} = 0$, yet because $R = \sigma^{-1}$ is a matrix inverse we also have $R_{xx} = 0$ (zero longitudinal resistance).
• For a multi-electron atom, the Landé $g$ factor is found from writing $\Delta E = \frac{\mu_BB}{\hbar}\langle J_z + S_z\rangle$ as $\frac{\mu_BB}{\hbar}g\langle J_z\rangle$.
Further Reading See for instance [6] for more details on the Hall effects (IQHE and FQHE), and [2] for more on the Land´e g factor.
Exercises

(1) Consider a stationary gauge transformation $\Lambda(x, y)$ in the plane transverse to a magnetic field $B_z$. Calculate the effect it has on the Schrödinger equation and its solutions.
(2) For the case in exercise 1, specialize to an infinitesimal gauge transformation that rotates $\vec A$ in its plane (transverse to $\vec B$), and interpret this in terms of the classical motion.
(3) For a $(2+1)$-dimensional system of sides $L_x$ and $L_y$ under the IQHE, calculate the length of the plateau in $R_H$ as $B$ is increased.
(4) In the FQHE, how many occupied states can there be in a Landau level? What do you deduce about the description of the states of the electron?
(5) In some more complicated materials, we can have both a Hall conductivity and a longitudinal, usually isotropic, conductivity. Calculate the resistivity matrix $R_{ij}$. Consider a transformation that acts by exchanging the values of the electric field components $E_i$ and those of the current densities $j_i$, and write it as a transformation on the complex value $\sigma \equiv \sigma_{xy} + i\sigma_{xx}$.
(6) For a multi-electron atom, with total angular momentum $L > 1$ and total spin 1, what are the possible values of the Landé $g$ factor? Specialize to the classical limit of large $L$.
(7) Consider a multi-electron atom, with spin $S$, orbital angular momentum $L$, and total angular momentum $J$ in a slowly varying magnetic field $\vec B(t)$ moving on a closed curve $C$. Calculate the Berry phase of the quantum state of the atom.
25
The WKB; a Semiclassical Approximation
We now go back to the WKB method, which first appeared in Chapter 11, where it was related to the classical limit as an expansion around the Hamilton–Jacobi formulation of classical theory. The WKB method is a method of approximation, and as such we will also go back to it in Part II of the book, but here we make a second pass at it since it will lead, in the next chapter, to the Bohr–Sommerfeld quantization, which was one of the first formulations of quantum mechanics.
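25.1 Review and Generalization

The wave function solution of the Schrödinger equation was written as
$$\psi(\vec r,t) = e^{iS(\vec r,t)/\hbar}, \qquad S(\vec r,t) = W(\vec r) - Et, \tag{25.1}$$
which means that the time-independent wave function is
$$\psi(\vec r) = e^{iW(\vec r)/\hbar}. \tag{25.2}$$
Moreover, we wrote $W(\vec r)$ as
$$W(\vec r) = s(\vec r) + \frac{\hbar}{i}T(\vec r), \tag{25.3}$$
where $s(\vec r)$ and $T(\vec r)$ are uniquely defined as even functions in $\hbar$. Then
$$\psi(\vec r) = e^{T(\vec r)}e^{is(\vec r)/\hbar} \equiv A\,e^{is(\vec r)/\hbar}. \tag{25.4}$$
We can expand the even functions $s, T$ in $\hbar$,
$$s = s_0 + \left(\frac{\hbar}{i}\right)^2s_2 + \left(\frac{\hbar}{i}\right)^4s_4 + \cdots$$
$$T = t_0 + \left(\frac{\hbar}{i}\right)^2t_2 + \left(\frac{\hbar}{i}\right)^4t_4 + \cdots. \tag{25.5}$$
If we define $t_n = s_{n+1}$, it means that $W(\vec r)$ has an expansion in all powers of $\hbar/i$, with coefficients $s_n$. Then, in one spatial dimension, the Schrödinger equation becomes
$$\left(\frac{dW}{dx}\right)^2 - 2m(E - V(x)) + \frac{\hbar}{i}\frac{d^2W}{dx^2} = 0, \tag{25.6}$$
and expanding it in powers of $\hbar/i$ we obtain
$$\left[\sum_{n\geq 0}\frac{ds_n}{dx}\left(\frac{\hbar}{i}\right)^n\right]^2 - 2m(E - V(x)) + \sum_{n\geq 0}\left(\frac{\hbar}{i}\right)^{n+1}\frac{d^2s_n}{dx^2} = 0. \tag{25.7}$$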
We can set to zero the coefficients of each power of $\hbar/i$, obtaining an infinite set of coupled equations for $s_n$. Then, as we have seen, keeping only $s_0$ and $s_1 = t_0$ the WKB solution in one dimension is
$$\psi(x,t) = \frac{\text{const.}}{\sqrt{p(x)}}\exp\left[\pm\frac{i}{\hbar}\int_{x_0}^x dx'\,p(x') - \frac{iEt}{\hbar}\right], \tag{25.8}$$
where, if $E > V(x)$,
$$p(x) = \sqrt{2m(E - V(x))}. \tag{25.9}$$
However, for the solution in the classically forbidden region, $E < V(x)$, if we write
$$p(x) = i\chi(x) = i\sqrt{2m(V(x) - E)} \tag{25.10}$$
then the solution is
$$\psi(x,t) = \frac{\text{const.}}{\sqrt{\chi(x)}}\exp\left[\pm\frac{1}{\hbar}\int_{x_0}^x dx'\,\chi(x') - \frac{iEt}{\hbar}\right]. \tag{25.11}$$
The condition for the validity of this WKB approximation is $|\delta_\lambda\lambda/\lambda| \ll 1$, or
$$|\delta_\lambda\lambda| \ll \frac{\hbar}{\sqrt{2m|V(x) - E|}} \quad\Leftrightarrow\quad \frac{\hbar}{\sqrt{2m|V(x) - E|}} \ll \frac{2|E - V(x)|}{|dV/dx|}, \tag{25.12}$$
where the last expression is the characteristic distance for the variation of the potential. However, we note that, while being valid mostly everywhere, the above condition is certainly not valid near the turning points for the potential, where $E = V(x)$. Indeed, there we have $\psi \to \infty$, and
$$\frac{\hbar}{\sqrt{2m|V(x) - E|}} \to \infty, \qquad \frac{2|E - V(x)|}{|dV/dx|} \to 0. \tag{25.13}$$
Instead, we must use a different approximation.
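The breakdown of the WKB condition near a turning point can be made concrete numerically. The sketch below is an illustration under stated assumptions, with a hypothetical function name: it evaluates the ratio of the reduced local wavelength $\hbar/|p(x)|$ to the characteristic length $2|E - V(x)|/|dV/dx|$, which equals $\hbar m|V'(x)|/(2m|E - V(x)|)^{3/2}$ and must be much smaller than 1 for the approximation to hold. The example uses a harmonic potential in units with $\hbar = m = \omega = 1$.

```python
import math


def wkb_validity_ratio(energy, potential, dpotential, x, mass=1.0, hbar=1.0):
    """Return hbar*m*|V'(x)| / (2m|E - V(x)|)^(3/2).

    This is (hbar/|p|) divided by 2|E - V|/|V'|; the WKB form is
    trustworthy only where the ratio is much less than 1.
    """
    p_squared = 2.0 * mass * abs(energy - potential(x))
    if p_squared == 0.0:
        return math.inf  # exactly at a turning point
    return hbar * mass * abs(dpotential(x)) / p_squared ** 1.5


if __name__ == "__main__":
    def v(x):
        return 0.5 * x * x  # harmonic potential, m = omega = 1

    def dv(x):
        return x

    energy = 10.5  # the n = 10 level; turning points at x = +/- sqrt(21)
    for x in (0.0, 2.0, 4.0, 4.5):
        print(x, wkb_validity_ratio(energy, v, dv, x))
```

The ratio vanishes at the bottom of the well, stays small through most of the classically allowed region, and grows without bound as $x$ approaches the turning point, which is where the connection formulas are needed.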
25.2 Approximation and Connection Formulas at Turning Points

We consider first the case when $V(x)$ decreases through $E$ at $x = x_1$, so the barrier (the classically forbidden region) is on the left-hand side of the potential diagram. Then we can approximate the function
$$f(x) = \frac{2m}{\hbar^2}(E - V(x)) \tag{25.14}$$
by a Taylor expansion to the first order, i.e., a linear approximation,
$$f(x) = f'(x_1)(x - x_1) = -\frac{2m}{\hbar^2}V'(x_1)(x - x_1). \tag{25.15}$$
This can then be analytically continued through the complex-$x$ plane, in order to avoid the point $x_1$, where $f(x_1) = 0$. One can make a rigorous analysis in this way, but it is rather complicated. Instead, there is a much simpler shortcut, where we replace the function $f(x)$ with a step function:
$$f(x) = \begin{cases} -\alpha^2, & x < x_1\\ +\alpha^2, & x > x_1.\end{cases} \tag{25.16}$$
284
25 The WKB and Semiclassical Approximation
This function is discontinuous, but that is not a problem as we have already dealt with step functions in the general chapter on one-dimensional problems. With this f (x), the time-independent Schr¨odinger equation becomes ψ− − α 2 ψ− = 0,
x < x1
ψ+
x > x1.
+ α ψ+ = 0, 2
(25.17)
The solutions $\psi_\pm$ on the different segments can be glued using the usual joining conditions, of continuity at $x_1$ for $\psi(x)$ and $\psi'(x)$. The solution in the region $x < x_1$ that goes to zero at $x \to -\infty$ is $e^{-\alpha(x_1 - x)}$; introducing a factor $1/\sqrt{\alpha}$ for consistency with the WKB approximation, we obtain

$$\psi_-(x) = \frac{1}{\sqrt{\alpha}}e^{-\alpha(x_1 - x)} = \frac{1}{\sqrt{\alpha}}e^{-\int_x^{x_1} dx'\,\alpha}, \tag{25.18}$$

so for this solution, we have

$$\psi_-(x_1) = \frac{1}{\sqrt{\alpha}}, \qquad \psi_-'(x_1) = \sqrt{\alpha}. \tag{25.19}$$

On the other hand, the general solution in the region $x > x_1$ is

$$\psi_+(x) = \frac{1}{\sqrt{\alpha}}A\sin[\alpha(x - x_1) + \delta], \tag{25.20}$$

so for this solution, we have

$$\psi_+(x_1) = \frac{A\sin\delta}{\sqrt{\alpha}}, \qquad \psi_+'(x_1) = A\sqrt{\alpha}\cos\delta. \tag{25.21}$$

The joining conditions are

$$\begin{aligned} \psi_+(x_1) = \psi_-(x_1) &\Rightarrow A\sin\delta = 1 \\ \psi_+'(x_1) = \psi_-'(x_1) &\Rightarrow A\cos\delta = 1. \end{aligned} \tag{25.22}$$

The ratio of the conditions gives $\tan\delta = 1$, so $\delta = \pi/4$, and thus we obtain $A = \sqrt{2}$. Then, since

$$\alpha = \sqrt{\frac{2m}{\hbar^2}|E - V(x)|}, \tag{25.23}$$
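we obtain the condition for replacement (when going from the left to the right of the point) at the turning point:

$$\begin{aligned} \frac{1}{[V(x) - E]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_x^{x_1} dx'\sqrt{2m[V(x') - E]}\right] &\to \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x_1}^{x} dx'\sqrt{2m[E - V(x')]} + \frac{\pi}{4}\right] \\ &= \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\cos\left[\frac{1}{\hbar}\int_{x_1}^{x} dx'\sqrt{2m[E - V(x')]} - \frac{\pi}{4}\right]. \end{aligned} \tag{25.24}$$

Similarly, for the opposite turning point, for an increasing $V(x)$ (thus for the barrier, or classically forbidden region, on the right-hand side), we obtain the condition for replacement (from the right to the left of $x_2$),

$$\begin{aligned} \frac{1}{[V(x) - E]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{2m[V(x') - E]}\right] &\to \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x}^{x_2} dx'\sqrt{2m[E - V(x')]} + \frac{\pi}{4}\right] \\ &= \frac{\sqrt{2}}{[E - V(x)]^{1/4}}\cos\left[-\frac{1}{\hbar}\int_{x}^{x_2} dx'\sqrt{2m[E - V(x')]} + \frac{\pi}{4}\right]. \end{aligned} \tag{25.25}$$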
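The step-function matching can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `match_at_turning_point` and the sample value of `alpha` are illustrative assumptions. It solves the two joining conditions and confirms that value and slope agree at $x_1$, and that the sine and cosine forms of the connection formula coincide.

```python
import math

def match_at_turning_point(alpha=1.0):
    """Solve A sin(delta) = 1, A cos(delta) = 1 and return the mismatches at x1."""
    delta = math.atan2(1.0, 1.0)   # tan(delta) = 1 with sin(delta), cos(delta) > 0
    A = math.hypot(1.0, 1.0)       # A^2 = 1^2 + 1^2
    # Value and slope of the decaying piece at x = x1
    psi_minus = 1.0 / math.sqrt(alpha)
    dpsi_minus = math.sqrt(alpha)
    # Value and slope of the oscillating piece at x = x1
    psi_plus = A * math.sin(delta) / math.sqrt(alpha)
    dpsi_plus = A * math.sqrt(alpha) * math.cos(delta)
    return delta, A, psi_plus - psi_minus, dpsi_plus - dpsi_minus

delta, A, value_gap, slope_gap = match_at_turning_point(alpha=2.5)

# The two equivalent forms: sin(t + pi/4) = cos(t - pi/4) for any phase t
phase = 0.37
form_gap = math.sin(phase + math.pi / 4) - math.cos(phase - math.pi / 4)
```

Both gaps vanish to rounding error for any positive `alpha`, since the factors of $\sqrt{\alpha}$ cancel in the joining conditions.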
25.3 Application: Potential Barrier

Consider a potential well with depth $-V_0$ until $x = x_1$, after which $V(x)$ jumps up to $+V_{\max}$, and then decreases monotonically to zero at infinity, as in Fig. 25.1. The relevant case is that of the potential in nuclear α-decay, meaning nuclear fragmentation. Then the energy $E$ of the decaying nucleus is higher than zero, representing the energy of the fragmented pieces. The potential well at $x < x_1$ corresponds to the nuclear force, while the monotonically decreasing barrier for $x > x_1$ is a Coulomb potential. We define region I for $x < x_1$; setting $E = V(x)$ at $x = x_2 > x_1$, we have region II for $x_1 \leq x \leq x_2$, and region III for $x > x_2$. We further write, as usual,

$$\begin{aligned} \sqrt{\frac{2m(E - V(x))}{\hbar^2}} &\equiv k(x), \qquad x \in \text{III} \\ \sqrt{\frac{2m(V(x) - E)}{\hbar^2}} &\equiv \kappa(x), \qquad x \in \text{II} \\ \sqrt{\frac{2m(E + V_0)}{\hbar^2}} &\equiv k, \qquad x \in \text{I}. \end{aligned} \tag{25.26}$$

Then the wave function in region III is

$$\begin{aligned} \psi_{\rm III}(x) &= \frac{2}{\sqrt{k(x)}}\cos\left[\int_{x_2}^{x} k(x')\,dx' - \frac{\pi}{4}\right] \\ &= \frac{1}{\sqrt{k(x)}}\left\{\exp\left[i\int_{x_2}^{x} k(x')\,dx' - i\frac{\pi}{4}\right] + \exp\left[-i\int_{x_2}^{x} k(x')\,dx' + i\frac{\pi}{4}\right]\right\}. \end{aligned} \tag{25.27}$$

Figure 25.1: Potential barrier for α decay ($V$ versus $x$, with regions I, II, and III marked).
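According to the previous WKB rules for continuity at the turning point, we have for the wave function in region II (except that the wave function should now be increasing away from the transition point, not decreasing)

$$\begin{aligned} \psi_{\rm II}(x) &= \frac{1}{\sqrt{\kappa(x)}}\exp\left[+\int_{x}^{x_2} dx'\,\kappa(x')\right] \\ &= \frac{1}{\sqrt{\kappa(x)}}\,e^{\tau}\exp\left[-\int_{x_1}^{x} dx'\,\kappa(x')\right], \end{aligned} \tag{25.28}$$

where we have defined

$$\tau = \int_{x_1}^{x_2} dx'\,\kappa(x'). \tag{25.29}$$

Finally, in region I, where the potential is constant, $V(x) = -V_0$, we can solve the Schrödinger equation exactly (rigorously, without a WKB approximation):

$$\psi_{\rm I}(x) = A\sin[k(x - x_1) + \beta] = A\,\frac{e^{i[k(x - x_1) + \beta]} - e^{-i[k(x - x_1) + \beta]}}{2i}. \tag{25.30}$$

The conditions for the continuity of $\psi(x)$ and $\psi'(x)$ at $x = x_1$ are

$$\begin{aligned} \psi_{\rm I}(x_1) = \psi_{\rm II}(x_1) &\Rightarrow A\sin\beta = \frac{e^{\tau}}{\sqrt{\kappa(x_1)}} \\ \frac{\psi_{\rm I}'(x_1)}{\psi_{\rm I}(x_1)} = \frac{\psi_{\rm II}'(x_1)}{\psi_{\rm II}(x_1)} &\Rightarrow k\cot\beta = -\kappa(x_1). \end{aligned} \tag{25.31}$$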
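But, given that we have written the wave functions $\psi_{\rm I}$ and $\psi_{\rm III}$ in terms of exponentials $e^{i\theta(x)}$ in (25.30) and (25.27), we can use the general formulas from Chapter 7 for the incident and transmitted currents,

$$\begin{aligned} \vec j_{\rm inc} &= \frac{\hbar}{m}k\left|\frac{A}{2}\right|^2\vec e_x \\ \vec j_{\rm trans} &= \frac{\hbar}{m}k(x)\frac{1}{\left(\sqrt{k(x)}\right)^2}\vec e_x = \frac{\hbar}{m}\vec e_x. \end{aligned} \tag{25.32}$$

Then the transmission coefficient is

$$T = \frac{|\vec j_{\rm trans}|}{|\vec j_{\rm inc}|} = \frac{4}{k|A|^2}. \tag{25.33}$$

But from (25.31), we obtain

$$\begin{aligned} \sin^2\beta &= \frac{1}{1 + \cot^2\beta} = \frac{1}{1 + \kappa^2(x_1)/k^2} \;\Rightarrow \\ |A|^2 &= \frac{e^{2\tau}}{\kappa(x_1)}\frac{1}{\sin^2\beta} = \frac{e^{2\tau}}{\kappa(x_1)}\left[1 + \frac{\kappa^2(x_1)}{k^2}\right], \end{aligned} \tag{25.34}$$

so the transmission coefficient becomes

$$T = 4e^{-2\tau}\frac{k\kappa(x_1)}{k^2 + \kappa^2(x_1)} = 4e^{-2\tau}\frac{\sqrt{(V_{\max} - E)(E + V_0)}}{V_{\max} + V_0}. \tag{25.35}$$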
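As a rough numerical illustration of (25.29) and (25.35), the sketch below evaluates $\tau$ by a midpoint rule for a Coulomb-like tail and compares it with the closed form of the same integral. The helper names (`wkb_tau`, `transmission`), the barrier shape $V(x) = V_{\max}x_1/x$, the units $m = \hbar = 1$, and all numerical values are assumptions chosen only for illustration.

```python
import math

def wkb_tau(V, E, x1, x2, m=1.0, hbar=1.0, steps=200_000):
    """tau = int_{x1}^{x2} kappa(x) dx, kappa = sqrt(2m(V - E))/hbar, midpoint rule."""
    h = (x2 - x1) / steps
    total = 0.0
    for i in range(steps):
        x = x1 + (i + 0.5) * h
        total += math.sqrt(max(0.0, 2.0 * m * (V(x) - E))) / hbar
    return total * h

def transmission(tau, E, V0, Vmax):
    """Eq. (25.35): T = 4 exp(-2 tau) sqrt((Vmax - E)(E + V0)) / (Vmax + V0)."""
    return 4.0 * math.exp(-2.0 * tau) * math.sqrt((Vmax - E) * (E + V0)) / (Vmax + V0)

# Illustrative numbers (assumptions): barrier V(x) = Vmax * x1 / x for x > x1
x1, Vmax, V0, E = 1.0, 4.0, 2.0, 1.0
x2 = Vmax * x1 / E                       # outer turning point, where V(x2) = E
tau_num = wkb_tau(lambda x: Vmax * x1 / x, E, x1, x2)

# Closed form of the same integral for a 1/x barrier, with m = hbar = 1
s = x1 / x2
tau_exact = math.sqrt(2.0 * E) * x2 * (math.acos(math.sqrt(s)) - math.sqrt(s * (1.0 - s)))

T = transmission(tau_num, E, V0, Vmax)
```

The exponential factor $e^{-2\tau}$ dominates $T$, so small changes in $E$ (and hence in $x_2$ and $\tau$) change the transmission by orders of magnitude, which is the familiar sensitivity of α-decay lifetimes to the decay energy.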
25.4 The WKB Approximation in the Path Integral

The wave function is written in terms of the propagator (evolution operator),

$$\psi(x,t) = U(t,t_0)\psi(x,t_0), \tag{25.36}$$

and the propagator can be written as a path integral,

$$U(t,t_0) = \int \mathcal{D}x(t)\,e^{iS[x(t)]/\hbar}. \tag{25.37}$$

In terms of the path integral, the expansion in $\hbar$ of the exponent in $U(t,t_0) = e^{iW/\hbar}$ is related to the expansion of the action around its classical value. To a first approximation, the action is equal to the classical value plus a quadratic fluctuation around it,

$$S \simeq S_{\rm cl}[x_{\rm cl}(t)] + \int_0^t dt'\,\frac{(x(t') - x_{\rm cl}(t'))^2}{2}\frac{\delta^2 S}{\delta x^2}[x_{\rm cl}], \tag{25.38}$$

which corresponds to the expansion of $W(x) \simeq s_0(x) + \frac{\hbar}{i}s_1(x)$, which leads to $\psi(x) = e^{s_1}e^{is_0/\hbar}$. The classical part does not need to be integrated (since it is constant), while the path integration acts on the quadratic fluctuation part. Since (putting $t_0 = 0$, and using the fact that $L = p\dot x - H$)

$$S_{\rm cl} = \int_0^t dt'\left[p(x)\frac{dx}{dt'} - E\right] = \int_{x_0}^{x_f} dx\,p(x) - Et, \tag{25.39}$$

and since

$$S = \int_0^t dt'\left[\frac{m\dot x^2}{2} - V(x)\right] \;\Rightarrow\; \frac{p^2(x)}{2m} + V(x) = E, \tag{25.40}$$

we find that the zeroth-order term in the propagator becomes

$$\exp\left[\frac{i}{\hbar}S_{\rm cl}\right] = \exp\left[\frac{i}{\hbar}\int dx\sqrt{2m(E - V(x))} - \frac{i}{\hbar}Et\right], \tag{25.41}$$

which is the zeroth-order term ($e^{is_0/\hbar}$) in the WKB approximation. On the other hand, the quadratic fluctuation part of $S$ can be path integrated, as it is of the Gaussian integral type,

$$\int_{-\infty}^{+\infty} dx\,e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}}. \tag{25.42}$$

Since we can write

$$S \simeq S_{\rm cl}[x_{\rm cl}] + \frac{1}{2}\int_0^t dt'\,(x(t') - x_{\rm cl}(t'))\left[-m\frac{d^2}{dt'^2} + (E - V(x))\right](x(t') - x_{\rm cl}(t')), \tag{25.43}$$

the Gaussian integration over the quadratic fluctuation part gives the prefactor

$$e^{s_1} = \left[\det\left(-m\frac{d^2}{dt^2} - (V(x) - E)\right)\right]^{-1/2} \equiv [\det A]^{-1/2}. \tag{25.44}$$
Now define $V(x) \equiv m\omega^2$, and factor out the $m$ prefactor from the operator $A$, since it gives an overall constant (independent of $\omega$).

To calculate the determinant of the operator $A$, we must choose a basis of eigenfunctions for it, as follows. In general, it is hard to obtain the WKB approximation term $s_1$ from the path integral calculation, and so from $[\det A]^{-1/2}$. Instead, we can obtain the eigenenergies, e.g., for periodic paths, as in the harmonic oscillator case. In this case we choose $x(t_f) = x(t_i) = x_0$ and then we integrate over it, $\int dx_0$. On these periodic paths, with period $T = t_f - t_i$, a good (complete) basis of eigenfunctions is $\sin\frac{n\pi t}{T}$. For this basis we have

$$\left(-\frac{d^2}{dt^2} - \omega^2\right)\sin\frac{n\pi t}{T} = \left(\frac{n^2\pi^2}{T^2} - \omega^2\right)\sin\frac{n\pi t}{T}. \tag{25.45}$$

Then, the determinant of the operator equals the product of the eigenvalues,

$$\det O \equiv \det\frac{A}{m} = \prod_{n=1}^{\infty}\left(\frac{n^2\pi^2}{T^2} - \omega^2\right) \equiv K(T)\prod_{n=1}^{\infty}\left(1 - \frac{\omega^2T^2}{n^2\pi^2}\right) = K(T)\frac{\sin\omega T}{\omega T}, \tag{25.46}$$

where we have factored out another kinematic, i.e., $\omega$-independent, factor, $K(T)$. Now one can do the integral over $x_0$ for the combined classical and first quantum terms. After a calculation that will not be repeated here (it can be found for instance in [10], Chapter 6.1), we obtain

$$\int dx_0\,e^{iS_{\rm cl}/\hbar}\frac{1}{\sqrt{\det O}} = \tilde K(T)\sum_{n=0}^{\infty}e^{-i(n+1/2)\omega T} = \tilde K(T)\sum_{n=0}^{\infty}e^{-iE_nT/\hbar}, \tag{25.47}$$

where $\tilde K(T)$ is another constant that is independent of the dynamics (i.e., of $\omega$). The above result is the correct propagator for the harmonic oscillator.
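The infinite product used in (25.46) is Euler's product for the sine and can be checked numerically. This is a minimal sketch in which the function name `truncated_product`, the cutoff, and the sample value of $\omega T$ are assumptions made for illustration.

```python
import math

def truncated_product(omega_T, n_max):
    """prod_{n=1}^{n_max} (1 - (omega T)^2 / (n pi)^2), the omega-dependent part of det O."""
    prod = 1.0
    for n in range(1, n_max + 1):
        prod *= 1.0 - (omega_T / (n * math.pi)) ** 2
    return prod

omega_T = 1.3
approx = truncated_product(omega_T, 100_000)
exact = math.sin(omega_T) / omega_T
relative_error = abs(approx - exact) / abs(exact)
```

The truncation error falls off only like $1/n_{\max}$, so the product converges slowly; the divergent $\omega$-independent factors are what the text absorbs into $K(T)$.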
Important Concepts to Remember

• In the expansion $\psi(\vec r,t) = \exp\left[\frac{i}{\hbar}(W(\vec r) - Et)\right]$, we write $W(\vec r) = s(\vec r) + \frac{\hbar}{i}T(\vec r)$ and expand $s(\vec r)$ and $T(\vec r)$ in $(\hbar/i)^2$, obtaining a series expansion of the Schrödinger equation that corresponds to quantum corrections to the Hamilton–Jacobi equation.
• The WKB solution (the solution in the WKB approximation) in one dimension is still formally a solution in the classically forbidden region $E - V(x) < 0$, by writing $p(x) = i\chi(x)$ (where $p(x) = \sqrt{2m(E - V(x))}$), but the solution (the WKB approximation) is not valid near the turning points in the potential, $E = V(x)$.
• At the turning points in the potential, we can write connection formulas, which connect a solution in terms of $p(x)$ to a solution in terms of $\chi(x)$.
• In the path integral, the WKB approximation amounts to the quadratic approximation for the action $S$ in the exponent $e^{iS/\hbar}$ around its classical (minimum) value $S_{\rm cl}$. We are left with an integral over $x_0$, in the case of periodic paths that begin and end at the same $x$, $x_f = x_i = x_0$, as for the harmonic oscillator, obtaining $\sum_{n=0}^{\infty}e^{-iE_nT/\hbar}$, where $E_n$ are the eigenenergies of the harmonic oscillator.
Further Reading

See [3], [2] for more details on WKB in one dimension, and [10] for more details on WKB in the path integral.

Exercises

(1) Write down the equation for $s_2$ in terms of $s_0$ and $s_1 = t_0$, and substitute into it the values of $s_0$ and $s_1$ from the WKB solution in one dimension.
(2) Write down the connection formulas at the turning points for a harmonic oscillator in one dimension.
(3) Write down the connection formulas at the turning points for a potential $V = V_0\cos(ax)$ ($V_0 > 0$), for $E < V_0$.
(4) Consider the potential barrier given by $V_{\rm eff}(r)$ for the radial motion in a potential $V = -|\alpha|/r^3$ with a positive energy smaller than the barrier. Write down the connection formulas for the turning points in this case.
(5) In the case in exercise 4, calculate the transmission coefficient.
(6) Consider the potential $V = -V_0\cos(ax)$ ($V_0 > 0$) with a small initial condition ($x_0 < \pi/(2a)$). Calculate the WKB approximation in the path integral.
(7) Can we truncate the sum over $n$ in the propagator (25.47) to a finite order, as an approximation?
26
Bohr–Sommerfeld Quantization
In this chapter we use the WKB method on bound states to derive a modified version of the quantization condition put forward initially by Bohr and Sommerfeld, which was one of the first calculational methods of quantum mechanics. We then apply this modified Bohr–Sommerfeld quantization condition to a set of examples, including the harmonic oscillator and the hydrogen atom, in order to derive the eigenenergies, and wave functions for the states.
26.1 Bohr–Sommerfeld Quantization Condition

We consider a potential well, where for $x < x_1$ we have $V(x) > E$, called region I (classically forbidden), for $x_1 \leq x \leq x_2$ we have $V(x) \leq E$, called region II (the classically allowed region, the potential well), and for $x > x_2$ we also have $V(x) > E$, called region III, as in Fig. 26.1. We can now use the connection formulas at the turning points $x_1$, $x_2$ derived in the previous chapter. In region I, we have the WKB solution

$$\psi_{\rm I}(x) = \frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_x^{x_1} dx'\sqrt{2m(V(x') - E)}\right], \tag{26.1}$$

that transitions into the solution in region II,

$$\psi_{\rm II}(x) = \frac{\sqrt{2}}{[2m(E - V(x))]^{1/4}}\cos\left[\frac{1}{\hbar}\int_{x_1}^{x} dx'\sqrt{2m(E - V(x'))} - \frac{\pi}{4}\right]. \tag{26.2}$$

On the other hand, the correct WKB solution in region III is

$$\psi_{\rm III}(x) = \frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{2m(V(x') - E)}\right], \tag{26.3}$$

which transitions into the solution in region II,

$$\tilde\psi_{\rm II}(x) = \frac{\sqrt{2}}{[2m(E - V(x))]^{1/4}}\cos\left[-\frac{1}{\hbar}\int_{x}^{x_2} dx'\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right]. \tag{26.4}$$

In order for the two solutions in region II to be the same, $\psi_{\rm II}(x) = \tilde\psi_{\rm II}(x)$, we must have the quantization condition

$$\frac{1}{\hbar}\int_{x_1}^{x_2} dx'\sqrt{2m(E - V(x'))} - \frac{\pi}{2} = n\pi. \tag{26.5}$$

Figure 26.1: Potential regions ($V$ versus $x$, with the energy level $E$ marked).

Note that the arguments of the cosines in $\psi_{\rm II}$ and $\tilde\psi_{\rm II}$ then differ by $n\pi$, so the two functions coincide for even $n$ and differ only by an overall sign for odd $n$, which can be absorbed into the normalization constant. Then the quantization condition becomes

$$\int_{x_1}^{x_2} dx'\sqrt{2m(E - V(x'))} = \left(n + \frac{1}{2}\right)\pi\hbar. \tag{26.6}$$

Considering the integral over a closed path, $\int_{x_1}^{x_2} + \int_{x_2}^{x_1} = \oint_C$, corresponding to a full oscillation between $x_1$ and $x_2$, and generalizing $x$ to a general variable $q$, and $p(x)$ to its canonically conjugate momentum $p$, we obtain

$$\oint_C dx\,p(x) = \oint_C p\,dq = \left(n + \frac{1}{2}\right)2\pi\hbar = \left(n + \frac{1}{2}\right)h, \tag{26.7}$$

where $n = 0, 1, 2, \ldots$ We note that, except for the constant (1/2 instead of 1), this is the same as the original Bohr–Sommerfeld quantization condition, which was postulated (not derived),

$$\oint_C p\,dq = (n + 1)h, \tag{26.8}$$

where again $n = 0, 1, 2, \ldots$ We have thus obtained an improved version of Bohr–Sommerfeld quantization that includes the classical and first semiclassical terms, and thus is expected to give the correct result either in very simple cases (such as the harmonic oscillator and the hydrogen atom), or for large quantum numbers $n$, when we are closer to classicality. We will thus apply it to find the eigenenergies $E_n$ and eigenfunctions $\psi_n$, and we expect in general to find very good agreement at large $n$.

Finally, another observation is that the WKB wave function has $n$ nodes for the quantum number $n$, just as in the general (exact) analysis of one-dimensional problems. To see this, note the form of the function $\psi_{\rm II}(x)$ in the classically allowed region: it is a cosine whose argument runs from $-\pi/4$ at $x_1$ to $n\pi + \pi/4$ at $x_2$, so it vanishes exactly $n$ times.
26.2 Example 1: Parity-Even Linear Potential

We consider the potential

$$V(x) = k|x|, \tag{26.9}$$

which applies, for instance, to the $q\bar q$ potential in QCD: when pulling the quark and antiquark apart, there is a constant force pulling them back together. This example can also be solved exactly, so we can check how well the approximation works. The turning points are given by

$$E - V(x_{1,2}) = 0 \;\Rightarrow\; x_{1,2} = \mp\frac{E}{k}. \tag{26.10}$$

Then, integrating between them and noting that $x_1 = -x_2$ for an even integrand, we obtain

$$\begin{aligned} \int_{x_1}^{x_2} dx'\sqrt{2m(E - V(x'))} &= 2\int_0^{x_2} dx'\sqrt{2m(E - kx')} \\ &= 2\frac{\sqrt{2mE}\,E}{k}\int_0^1 dy\sqrt{1 - y} \\ &= \frac{4}{3}\frac{E\sqrt{2mE}}{k}, \end{aligned} \tag{26.11}$$

where we have used $\int_0^1 dy\sqrt{1 - y} = 2/3$. Since this integral equals $(n + 1/2)\pi\hbar$ by the quantization condition, we obtain the eigenenergies

$$E_n = \left[\frac{3\pi\hbar k(n + 1/2)}{4\sqrt{2m}}\right]^{2/3}. \tag{26.12}$$

But in this case, the exact eigenenergies are known, and are given by

$$E_n = \lambda_n\left[\frac{\hbar k}{\sqrt{2m}}\right]^{2/3}, \tag{26.13}$$

where the $\lambda_n$ are fixed by the zeroes of the Airy function,

$$\mathrm{Ai}(-\lambda_n) = 0 \tag{26.14}$$

for the parity-odd states (odd $n$), and by the zeroes of its derivative, $\mathrm{Ai}'(-\lambda_n) = 0$, for the parity-even states (even $n$, including the ground state).

Then we find excellent agreement, starting at $n = 1$, where the error is about 1%, after which it gets better. The only poor agreement is for the ground state, $E_0$ (for $n = 0$). So in this case, the eigenenergies improve at large $n$, as expected. Also as expected, the wave functions of these states improve at large $n$, except near the turning points, where we must use the approximation from the previous chapter.
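The quoted accuracy can be reproduced with a few lines of arithmetic. The sketch below works in units of $(\hbar k/\sqrt{2m})^{2/3}$ and compares (26.12) with tabulated Airy-function data. The numbers in `exact` are standard tabulated zeros (of $\mathrm{Ai}'$ for even $n$ and of $\mathrm{Ai}$ for odd $n$), quoted here as an assumption from mathematical tables rather than taken from the text; the names `wkb_energy` and `errors` are illustrative.

```python
import math

def wkb_energy(n):
    """Eq. (26.12) in units of (hbar k / sqrt(2m))^(2/3)."""
    return (3.0 * math.pi * (n + 0.5) / 4.0) ** (2.0 / 3.0)

# Tabulated values of lambda_n: -lambda_n is a zero of Ai' for even n, of Ai for odd n
exact = [1.0187930, 2.3381074, 3.2481976, 4.0879494]

# Relative error of the WKB energy for n = 0, 1, 2, 3
errors = [abs(wkb_energy(n) - lam) / lam for n, lam in enumerate(exact)]
```

With these inputs the relative error is close to 10% for the ground state and drops below 1% from $n = 1$ onward, decreasing further as $n$ grows.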
26.3 Example 2: Harmonic Oscillator

Consider next the harmonic oscillator potential,

$$V(x) = \frac{m\omega^2}{2}x^2. \tag{26.15}$$

In this case, the integral appearing in the quantization condition is (note that now we also have $x_1 = -x_2$, and at the turning points we have $2mE - m^2\omega^2x_{1,2}^2 = 0$)

$$\begin{aligned} I &= \int_{x_1}^{x_2} dx'\sqrt{2m(E - V(x'))} = \int_{x_1}^{x_2} dx\sqrt{2mE - m^2\omega^2x^2} \\ &= \left. x\sqrt{2mE - m^2\omega^2x^2}\,\right|_{x_1}^{x_2} + \int_{x_1}^{x_2} x\,dx\,\frac{m^2\omega^2x}{\sqrt{2mE - m^2\omega^2x^2}} \\ &= -\int_{x_1}^{x_2} dx\sqrt{2mE - m^2\omega^2x^2} + 2mE\int_{x_1}^{x_2}\frac{dx}{\sqrt{2mE - m^2\omega^2x^2}}, \end{aligned} \tag{26.16}$$

which leads to

$$2I = 2mE\int_{x_1}^{x_2}\frac{dx}{\sqrt{2mE - m^2\omega^2x^2}} = \frac{2mE}{m\omega}\pi = \frac{2\pi E}{\omega}. \tag{26.17}$$

Therefore the quantization condition is

$$\frac{E}{\hbar\omega} = n + \frac{1}{2} \;\Rightarrow\; E = \hbar\omega\left(n + \frac{1}{2}\right). \tag{26.18}$$

This is in fact the exact quantization condition for the harmonic oscillator, so it works at all $n$. This result is consistent with the path integral result from the end of the previous chapter, where we saw that the WKB propagator contained the exact eigenenergies for the harmonic oscillator,

$$U(T) = \sum_{n=0}^{\infty}e^{-iE_nT/\hbar}. \tag{26.19}$$
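The quantization condition (26.6) can also be solved numerically for a given well. The following is a minimal sketch under stated assumptions: the well is symmetric, the caller supplies the turning point as a function of $E$, and the units are $\hbar = m = \omega = 1$. The function names `action` and `bohr_sommerfeld_energy` and the bisection bounds are illustrative choices, not a prescribed method.

```python
import math

def action(V, E, x_turn, m=1.0, steps=4000):
    """int_{-x_turn}^{x_turn} sqrt(2m(E - V(x))) dx by the midpoint rule (symmetric well)."""
    h = 2.0 * x_turn / steps
    total = 0.0
    for i in range(steps):
        x = -x_turn + (i + 0.5) * h
        total += math.sqrt(max(0.0, 2.0 * m * (E - V(x))))
    return total * h

def bohr_sommerfeld_energy(n, V, turning_point, hbar=1.0, m=1.0, e_lo=1e-6, e_hi=50.0):
    """Bisect on E until action(E) = (n + 1/2) pi hbar, Eq. (26.6)."""
    target = (n + 0.5) * math.pi * hbar
    for _ in range(60):
        e_mid = 0.5 * (e_lo + e_hi)
        if action(V, e_mid, turning_point(e_mid), m) < target:
            e_lo = e_mid      # action too small: the level lies higher
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)

# Harmonic oscillator with hbar = m = omega = 1 (illustrative units)
V_ho = lambda x: 0.5 * x * x
turn_ho = lambda E: math.sqrt(2.0 * E)
levels = [bohr_sommerfeld_energy(n, V_ho, turn_ho) for n in range(4)]
```

For the oscillator the computed levels land on $n + 1/2$ up to the integration error, in line with (26.18). The same routine can be pointed at other symmetric wells by changing `V_ho` and `turn_ho`.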
Wave Functions

The wave functions in this case are not satisfactory at intermediate values of $x$, but become better at large $x$ ($x \to \pm\infty$), which means in the classically forbidden region. In this classically forbidden region, the WKB wave function is (for instance, for $x > x_2$, $x \to \infty$)

$$\psi(x) = \frac{1}{[m^2\omega^2x^2 - 2mE]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{m^2\omega^2x'^2 - 2mE}\right]. \tag{26.20}$$

However, at large $x$,

$$\sqrt{m^2\omega^2x^2 - 2mE} = m\omega x\sqrt{1 - \frac{2E}{m\omega^2x^2}} \simeq m\omega x - \frac{E}{\omega x}, \tag{26.21}$$

so the exponent in the wave function is

$$-\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{m^2\omega^2x'^2 - 2mE} \simeq \text{const} - \frac{m\omega x^2}{2\hbar} + \frac{E}{\hbar\omega}\ln x + O\left(\frac{1}{x^2}\right). \tag{26.22}$$

Since $E/\hbar\omega = n + 1/2$, we obtain the approximate wave function

$$\begin{aligned} \psi(x) &\simeq \frac{\text{const.}}{[m^2\omega^2x^2 - 2mE]^{1/4}}\exp\left[-\frac{m\omega x^2}{2\hbar} + \left(n + \frac{1}{2}\right)\ln x\right] \\ &\simeq \frac{\text{const}}{\sqrt{m\omega x}}\,x^{n+1/2}\exp\left[-\frac{m\omega x^2}{2\hbar}\right] = \text{const}\times x^n\exp\left[-\frac{m\omega x^2}{2\hbar}\right]. \end{aligned} \tag{26.23}$$

But these are indeed the leading (exponential) and first subleading (power law) terms in the exact wave functions, since $x^n$ is the leading power in the Hermite polynomial $P_n(x)$.
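26.4 Example 3: Motion in a Central Potential

Consider a central potential in three dimensions, $V = V(r)$. Then the time-independent Schrödinger equation is

$$-\hbar^2\Delta\psi - 2m(E - V(r))\psi = 0. \tag{26.24}$$

Writing the time-independent wave function as the usual exponential (since $S(\vec r,t) = W(\vec r) - Et$),

$$\psi(\vec r) = e^{iW(\vec r)/\hbar}, \tag{26.25}$$

we obtain an equation for $W$ that is a quantum-corrected version of the Hamilton–Jacobi equation,

$$(\vec\nabla W)^2 - 2m(E - V(r)) + \frac{\hbar}{i}\Delta W = 0. \tag{26.26}$$

But, in spherical coordinates, we have

$$\begin{aligned} (\vec\nabla W)^2 &= \left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial\theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial W}{\partial\phi}\right)^2 \\ \Delta W &= \frac{\partial^2W}{\partial r^2} + \frac{2}{r}\frac{\partial W}{\partial r} + \frac{1}{r^2}\left[\frac{\partial^2W}{\partial\theta^2} + \cot\theta\frac{\partial W}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2W}{\partial\phi^2}\right]. \end{aligned} \tag{26.27}$$

Then, as we saw in Chapter 11, expanding $W$ in $\hbar$ in the usual way,

$$W = s_0 + \frac{\hbar}{i}s_1 + \left(\frac{\hbar}{i}\right)^2s_2 + \cdots, \tag{26.28}$$

and substituting in the Schrödinger equation, we obtain an equation that is also an infinite series in $\hbar$, where setting to zero each coefficient leads to an independent equation. Keeping only the leading and first subleading terms, we obtain

$$\begin{aligned} &\left(\frac{\partial s_0}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial s_0}{\partial\theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial s_0}{\partial\phi}\right)^2 - 2m(E - V(r)) \\ &\quad + \frac{\hbar}{i}\left[2\frac{\partial s_0}{\partial r}\frac{\partial s_1}{\partial r} + \frac{2}{r^2}\frac{\partial s_0}{\partial\theta}\frac{\partial s_1}{\partial\theta} + \frac{2}{r^2\sin^2\theta}\frac{\partial s_0}{\partial\phi}\frac{\partial s_1}{\partial\phi} + \Delta s_0\right] = 0. \end{aligned} \tag{26.29}$$

The leading (order-one) term in the equation is just the Hamilton–Jacobi equation,

$$\left(\frac{\partial s_0}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial s_0}{\partial\theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial s_0}{\partial\phi}\right)^2 - 2m(E - V(r)) = 0. \tag{26.30}$$

We separate variables, as usual in the Hamilton–Jacobi equation, by writing in spherical coordinates

$$s_0(r,\theta,\phi) = R_0(r) + \Theta_0(\theta) + \Phi_0(\phi), \tag{26.31}$$

which leads to three independent equations, one for each separated function,

$$\begin{aligned} \left(\frac{d\Phi_0}{d\phi}\right)^2 &= L_3^2 \\ \left(\frac{d\Theta_0}{d\theta}\right)^2 + \frac{L_3^2}{\sin^2\theta} &= L^2 \\ \left(\frac{dR_0}{dr}\right)^2 + \frac{L^2}{r^2} - 2m[E - V(r)] &= 0. \end{aligned} \tag{26.32}$$

The first equation is solved directly as

$$\Phi_0(\phi) = L_3\phi, \tag{26.33}$$

the second is integrated as

$$\Theta_0(\theta) = \int d\theta\sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}}, \tag{26.34}$$

and the third is also integrated as

$$R_0(r) = \int dr\sqrt{2m(E - V(r)) - \frac{L^2}{r^2}}. \tag{26.35}$$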
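The equation for $s_1$ is obtained by setting to zero the coefficient for $\hbar/i$ in (26.29), which, after substituting the separated $s_0(r,\theta,\phi)$, gives

$$2\left[\frac{dR_0}{dr}\frac{\partial s_1}{\partial r} + \frac{1}{r^2}\frac{d\Theta_0}{d\theta}\frac{\partial s_1}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{d\Phi_0}{d\phi}\frac{\partial s_1}{\partial\phi}\right] + \Delta s_0 = 0. \tag{26.36}$$

We can separate variables also in $s_1$,

$$s_1 = R_1(r) + \Theta_1(\theta) + \Phi_1(\phi). \tag{26.37}$$

Since $\Delta s_0$ contains also $d^2\Phi_0/d\phi^2 = 0$, we do not have any $\phi$ dependence in the equation for $s_1$, which means that we can choose to put $\Phi_1(\phi) = 0$, in which case the equation becomes

$$2\frac{dR_0}{dr}\frac{dR_1}{dr} + \frac{2}{r^2}\frac{d\Theta_0}{d\theta}\frac{d\Theta_1}{d\theta} + \frac{d^2R_0}{dr^2} + \frac{2}{r}\frac{dR_0}{dr} + \frac{1}{r^2}\left[\frac{d^2\Theta_0}{d\theta^2} + \cot\theta\frac{d\Theta_0}{d\theta}\right] = 0. \tag{26.38}$$

But the separation of variables gives the equations

$$\begin{aligned} 2\frac{dR_0}{dr}\frac{dR_1}{dr} + \frac{d^2R_0}{dr^2} + \frac{2}{r}\frac{dR_0}{dr} &= 0 \\ 2\frac{d\Theta_0}{d\theta}\frac{d\Theta_1}{d\theta} + \frac{d^2\Theta_0}{d\theta^2} + \cot\theta\frac{d\Theta_0}{d\theta} &= 0, \end{aligned} \tag{26.39}$$

or equivalently (denoting by a prime the derivative with respect to the coordinate, no matter what that is)

$$\begin{aligned} \Theta_1'(\theta) + \frac{1}{2}\left[\frac{\Theta_0''(\theta)}{\Theta_0'(\theta)} + \cot\theta\right] &= 0 \\ R_1'(r) + \frac{1}{2}\left[\frac{R_0''(r)}{R_0'(r)} + \frac{2}{r}\right] &= 0. \end{aligned} \tag{26.40}$$

The solutions of these equations are given by

$$\begin{aligned} \Theta_1(\theta) &= \text{const} - \frac{1}{2}\ln\left[\Theta_0'(\theta)\sin\theta\right] \\ R_1(r) &= \text{const} - \frac{1}{2}\ln\left[r^2R_0'(r)\right], \end{aligned} \tag{26.41}$$

which implies for the WKB wave function the solution

$$\psi(r,\theta,\phi) = \frac{e^{iR_0(r)/\hbar}}{r\sqrt{R_0'(r)}}\,\frac{e^{i\Theta_0(\theta)/\hbar}}{\sqrt{\Theta_0'(\theta)\sin\theta}}\,e^{iL_3\phi/\hbar}. \tag{26.42}$$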
The requirement for $2\pi$ periodicity in the angle $\phi$ means that we need to have

$$L_3 = n_\phi\hbar. \tag{26.43}$$

Substituting the integral formula for $\Theta_0(\theta)$ into the WKB wave function, the $\theta$ dependence in it is

$$\frac{1}{[L^2\sin^2\theta - L_3^2]^{1/4}}\exp\left[\pm\frac{i}{\hbar}\int d\theta\sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}}\right]. \tag{26.44}$$

Considering it as a WKB wave function in one dimension, for which we have the WKB quantization condition, or, equivalently, taking a Bohr–Sommerfeld quantization condition modified by the addition of 1/2, we obtain the condition

$$\int_{\theta_1}^{\theta_2} d\theta\sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} = \left(n_\theta + \frac{1}{2}\right)\pi\hbar, \tag{26.45}$$

where $\theta_{1,2}$ are the turning points,

$$|\sin\theta_{1,2}| = \frac{|L_3|}{L}. \tag{26.46}$$

To obtain the quantization condition, we will use an integral that we can calculate:

$$I(a,b,c) = \int_{x_1}^{x_2}\frac{dx}{\sqrt{a + 2bx - c^2x^2}}, \tag{26.47}$$

where $x_{1,2}$ are the zeroes of the square root in the denominator (the turning points). It can be rewritten by shifting the integral over $x$ in such a way that the endpoints $x_{1,2}$ become symmetric with respect to the origin, namely $\pm|d|/|c|$,

$$I(a,b,c) = \int_{-|d|/|c|}^{+|d|/|c|}\frac{dx}{\sqrt{d^2 - c^2x^2}} = \int_{-\pi/2}^{+\pi/2} d\theta\,\frac{|d/c|\cos\theta}{\sqrt{d^2(1 - \sin^2\theta)}} = \frac{1}{|c|}\int_{-\pi/2}^{+\pi/2} d\theta = \frac{\pi}{|c|}, \tag{26.48}$$

where we have used the change of variables $x = |d/c|\sin\theta$, meaning that the integral from $-|d|/|c|$ to $+|d|/|c|$ becomes an integral from $-\pi/2$ to $+\pi/2$.
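Then we can calculate the integral in the quantization condition, as

$$\begin{aligned} \int_{\theta_1}^{\theta_2} d\theta\sqrt{L^2 - \frac{L_3^2}{\sin^2\theta}} &= \int_{\theta_1}^{\theta_2} d\theta\,\frac{L^2}{\sqrt{L^2 - L_3^2/\sin^2\theta}} - \int_{\theta_1}^{\theta_2}\frac{d\theta}{\sin^2\theta}\frac{L_3^2}{\sqrt{L^2 - L_3^2/\sin^2\theta}} \\ &= -\int_{\theta_1}^{\theta_2} d(\cos\theta)\,\frac{L^2}{\sqrt{L^2 - L_3^2 - L^2\cos^2\theta}} + \int_{\theta_1}^{\theta_2}\frac{L_3^2\,d(\cot\theta)}{\sqrt{L^2 - L_3^2 - L_3^2\cot^2\theta}} \\ &= \pi L - \pi|L_3|, \end{aligned} \tag{26.49}$$

where we have used $d(\cot\theta) = -d\theta/\sin^2\theta$, and then the formula (26.48) in both terms (i.e., for the terms in $\cos\theta$, and in $\cot\theta$). Finally then, the quantization condition is

$$L - |L_3| = \left(n_\theta + \frac{1}{2}\right)\hbar, \tag{26.50}$$

and, since $L_3 = n_\phi\hbar$, we find

$$L = \left(n_\theta + |n_\phi| + \frac{1}{2}\right)\hbar \equiv \left(l + \frac{1}{2}\right)\hbar, \tag{26.51}$$

where $l \geq |n_\phi|$. But then we have, in the WKB approximation,

$$L^2 = \hbar^2\left(l + \frac{1}{2}\right)^2, \tag{26.52}$$

which can be compared with the exact solution,

$$L^2 = \hbar^2l(l + 1) = \hbar^2\left[\left(l + \frac{1}{2}\right)^2 - \frac{1}{4}\right]. \tag{26.53}$$

Note that, strictly speaking, since $\theta \in [0,\pi]$ we cannot use the previous WKB analysis on $\mathbb{R}$. But the condition we used was that the function does not blow up at $\theta = 0$ or $\pi$, and that is indeed true. Using the connection formulas at a turning point, for $0 \leq \theta \leq \theta_1$, we have the angular wave function

$$\frac{1}{[L_3^2 - L^2\sin^2\theta]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{\theta}^{\theta_1} d\theta'\sqrt{\frac{L_3^2}{\sin^2\theta'} - L^2}\right]. \tag{26.54}$$

At $\theta \to 0$, this becomes

$$\frac{1}{\sqrt{|L_3|}}\exp\left[-\frac{|L_3|}{\hbar}\int_{\theta}\frac{d\theta'}{\sin\theta'}\right] = \frac{1}{\sqrt{|L_3|}}\exp\left[|n_\phi|\ln\tan\frac{\theta}{2}\right] = \frac{1}{\sqrt{|L_3|}}\left(\tan\frac{\theta}{2}\right)^{|n_\phi|} \to 0, \tag{26.55}$$

so it means we actually have the correct boundary conditions, despite this not being obvious a priori.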
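The closed-form result (26.49) and the comparison between (26.52) and (26.53) can be checked numerically. This is a minimal sketch, assuming units with $\hbar = 1$; the function name `angular_action` and the sample values of `L` and `L3` are illustrative.

```python
import math

def angular_action(L, L3, steps=200_000):
    """int_{theta1}^{theta2} sqrt(L^2 - L3^2 / sin^2(theta)) d(theta), midpoint rule."""
    theta1 = math.asin(abs(L3) / L)     # turning points from Eq. (26.46)
    theta2 = math.pi - theta1
    h = (theta2 - theta1) / steps
    total = 0.0
    for i in range(steps):
        th = theta1 + (i + 0.5) * h
        total += math.sqrt(max(0.0, L * L - (L3 / math.sin(th)) ** 2))
    return total * h

L, L3 = 3.5, 1.0
numeric = angular_action(L, L3)
closed_form = math.pi * (L - abs(L3))   # right-hand side of Eq. (26.49)

# Ratio of the WKB value (l + 1/2)^2 to the exact l(l + 1), for l = 1..5
ratios = [(l + 0.5) ** 2 / (l * (l + 1)) for l in range(1, 6)]
```

The ratio tends to 1 as $l$ grows, since the two expressions differ only by the constant $\hbar^2/4$.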
Radial Wave Function

Finally we move on to the radial wave function in $\psi(r,\theta,\phi)$, which is

$$\frac{1}{r\left[2m(E - V(r)) - L^2/r^2\right]^{1/4}}\exp\left[\frac{i}{\hbar}\int_{r_1}^{r} dr'\sqrt{2m(E - V(r')) - \frac{L^2}{r'^2}}\right]. \tag{26.56}$$

However, the radial quantization condition,

$$\int_{r_1}^{r_2} dr\sqrt{2m(E - V(r)) - \frac{L^2}{r^2}} = \pi\hbar\left(n_r + \frac{1}{2}\right), \tag{26.57}$$

depends on the potential $V(r)$, so we must choose a potential in order to obtain a WKB result and compare with the exact result.
26.5 Example: Coulomb Potential (Hydrogenoid Atom)

The potential for the hydrogenoid atom is the usual

$$V(r) = -\frac{Ze_0^2}{r}, \tag{26.58}$$

where $e_0^2 = e^2/(4\pi\epsilon_0)$. Also, the energy of bound states is negative, so $E = -|E|$. Then the quantization condition is written as

$$J = \int_{r_1}^{r_2} dr\sqrt{-2m|E| + \frac{2mZe_0^2}{r} - \frac{L^2}{r^2}} = \pi\hbar\left(n_r + \frac{1}{2}\right), \tag{26.59}$$

and the integral $J$ is calculated by integrating by parts,

$$\begin{aligned} J &= \left. r\sqrt{-2m|E| + \frac{2mZe_0^2}{r} - \frac{L^2}{r^2}}\,\right|_{r_1}^{r_2} - \int_{r_1}^{r_2} r\,dr\,\frac{-2mZe_0^2/r^2 + 2L^2/r^3}{2\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}} \\ &= -\int_{r_1}^{r_2} dr\,\frac{-mZe_0^2/r + L^2/r^2}{\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}} \\ &= \int_{r_1}^{r_2} dr\,\frac{mZe_0^2}{\sqrt{-L^2 + 2mZe_0^2r - 2m|E|r^2}} - L^2\int_{r_1}^{r_2}\frac{dr}{r^2\sqrt{-2m|E| + 2mZe_0^2/r - L^2/r^2}} \\ &= \frac{mZe_0^2}{\sqrt{2m|E|}}\pi + L^2\int_{1/r_1}^{1/r_2}\frac{dz}{\sqrt{-2m|E| + 2mZe_0^2z - L^2z^2}} \\ &= \frac{mZe_0^2}{\sqrt{2m|E|}}\pi - \pi L, \end{aligned} \tag{26.60}$$

where in the fourth equality we have used the integral (26.48) in the first term and defined $1/r = z$ in the second, and in the last line we have used the integral (26.48) in the second term (the $z$ integral runs from the larger root $1/r_1$ to the smaller root $1/r_2$, which produces the minus sign). Then, the WKB quantization means that

$$J = \pi\hbar\left(n_r + \frac{1}{2}\right), \tag{26.61}$$

and, using the fact that $L = \hbar(l + 1/2)$, as we have seen before, we obtain the eigenenergies

$$E = -|E| = -\frac{mZ^2e_0^4}{2\hbar^2(n_r + l + 1)^2}, \tag{26.62}$$

which are in fact the exact eigenenergies (at all $n = n_r + l + 1$, independently of whether $n$ is large or small). For the wave functions, we have the same observation that we made for the $\theta$ dependence: the radial direction is only on half the real line, $0 \leq r < \infty$, so technically speaking the WKB analysis would not apply. Nevertheless, in the classically forbidden region $0 \leq r \leq r_1$, we have the wave function

$$\begin{aligned} \psi(r) &\simeq \frac{1}{r\left[L^2/r^2\right]^{1/4}}\exp\left[-\frac{L}{\hbar}\int_r^{r_1}\frac{dr'}{r'}\right] = \frac{1}{\sqrt{Lr}}\exp\left[-(l + 1/2)\ln\frac{r_1}{r}\right] \\ &= \frac{(r/r_1)^{l+1/2}}{\sqrt{Lr}} \sim r^l \to 0, \end{aligned} \tag{26.63}$$

where we have used the fact that $L/\hbar = l + 1/2$. So, as before, this is the correct boundary condition at $r = 0$ (finiteness of the wave function), even though the wave function doesn't extend to $r = -\infty$, so the WKB analysis actually applies.
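The radial integral (26.60) can be verified numerically for one bound state. The following is a sketch only, assuming units $\hbar = m = e_0^2 = 1$ and the sample quantum numbers $n_r = 1$, $l = 1$; the function name `radial_action` and the variable names are illustrative.

```python
import math

def radial_action(E_abs, L, Z=1.0, m=1.0, e0_sq=1.0, steps=200_000):
    """J = int_{r1}^{r2} sqrt(-2m|E| + 2mZ e0^2/r - L^2/r^2) dr by the midpoint rule."""
    a, b, c = 2.0 * m * E_abs, m * Z * e0_sq, L * L
    # Turning points: roots of -a r^2 + 2 b r - c = 0
    disc = math.sqrt(b * b - a * c)
    r1, r2 = (b - disc) / a, (b + disc) / a
    h = (r2 - r1) / steps
    total = 0.0
    for i in range(steps):
        r = r1 + (i + 0.5) * h
        total += math.sqrt(max(0.0, -a + 2.0 * b / r - c / (r * r)))
    return total * h

n_r, l = 1, 1
E_abs = 1.0 / (2.0 * (n_r + l + 1) ** 2)     # |E| from Eq. (26.62) with Z = 1
L = l + 0.5                                   # WKB value L = hbar (l + 1/2)
J_numeric = radial_action(E_abs, L)
J_closed = math.pi * (1.0 / math.sqrt(2.0 * E_abs) - L)   # last line of Eq. (26.60)
J_target = math.pi * (n_r + 0.5)              # right-hand side of Eq. (26.61)
```

With the energy taken from (26.62), the numerical integral, the closed form, and the quantization target agree, which is a consistency check of the algebra rather than an independent derivation.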
Important Concepts to Remember

• The Bohr–Sommerfeld quantization condition (the version modified by 1/2 from its original form) is obtained by applying the WKB approximation, and the resulting turning-point connection formulas, to a closed path, $\int_{x_1}^{x_2} + \int_{x_2}^{x_1} = \oint_C$. This results in $\oint_C dx\,p(x) = \oint_C dq\,p = (n + 1/2)2\pi\hbar = (n + 1/2)h$, with $n = 0, 1, 2, \ldots$ (compared with the original, for which the right-hand side is $(n + 1)h$).
• Applying Bohr–Sommerfeld quantization to some simple cases, we can either get an exact result or obtain the exact result at large $n$ (in the classical limit).
• For $V(x) = k|x|$, we get a wrong result for the energies only for $n = 0$; from $n = 1$ onward we get less than 1% error.
• For the harmonic oscillator, we get the correct eigenenergies, but only the leading result at (small $x$ and) large $x$ for the wave functions is correct.
• For the motion in a central potential, writing as usual $\psi = e^{iS/\hbar}$ with $S = W(\vec r) - Et$ and expanding $W(\vec r)$ in powers of $\hbar/i$, we get the quantum-corrected Hamilton–Jacobi equation, which can be solved by the separation of variables, $s_0(r,\theta,\phi) = R_0(r) + \Theta_0(\theta) + \Phi_0(\phi)$, $s_1 = R_1(r) + \Theta_1(\theta) + \Phi_1(\phi)$, etc.
• One thus obtains a WKB solution (by directly solving for $R_0$, $\Theta_0$, $\Phi_0$ and then $R_1$, $\Theta_1$, $\Phi_1$ in terms of them); the Bohr–Sommerfeld quantization conditions for $\theta \in [0,\pi]$ and $r \in [0,\infty)$ (given that $L_3 = n_\phi\hbar$) lead to $L = \hbar(l + 1/2)$ with $l = n_\theta + |n_\phi|$ and another quantization condition, depending on $V(r)$.
• For the hydrogenoid atom, $n_r$ quantization (for $R(r)$) gives the correct eigenenergies and the correct boundary conditions for the wave function at $\theta = 0$ and $r = 0$, though not the exact wave function.
Further Reading

See [2] for more details.
Exercises

(1) Consider the Lagrangian

$$L = \frac{\dot q^4}{4} + \alpha\frac{\dot q^2}{2} - \lambda q^4. \tag{26.64}$$

Write down the explicit (improved) Bohr–Sommerfeld quantization condition for periodic motion in this Lagrangian.
(2) For the potential $V = k|x|$ in Bohr–Sommerfeld quantization, calculate the leading and first subleading terms for the eigenfunctions at large $x$ and small $x$.
(3) Apply the Bohr–Sommerfeld quantization method to a particle in a box (an infinitely deep square well) to find the eigenenergies and eigenfunctions, and compare with the exact results.
(4) For the wave function of the harmonic oscillator in the Bohr–Sommerfeld quantization method, what happens in the classical, large-$n$, limit? Is there any sense in which we can consider that the result matches the exact result?
(5) For the case of motion in a central potential, write down the equations for $R_2$ and $\Theta_2$.
(6) Calculate the Bohr–Sommerfeld quantization condition for $R(r)$ for motion in the central potential $V = -\alpha/r^2$.
(7) For the hydrogenoid atom in Bohr–Sommerfeld quantization, compare the wave function as $r \to \infty$ (the leading and first subleading terms) with the exact wave function. Do they match? Is there any sense in which the wave function matches the exact one in the classical limit $n \to \infty$?
27

Dirac Quantization Condition and Magnetic Monopoles
In this chapter we consider a quantization condition found by Dirac that gives an argument for the existence of magnetic monopoles. The quantization condition is found in several different ways, including a semiclassical way. We also give an argument for the existence of magnetic monopoles based on a duality symmetry of the Maxwell equations.
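27.1 Dirac Monopoles from Maxwell Duality

The Maxwell equations in vacuum are

$$\begin{aligned} \vec\nabla\times\vec E &= -\frac{\partial\vec B}{\partial t}, &\qquad \vec\nabla\cdot\vec E &= 0 \\ \vec\nabla\times\vec B &= +\frac{1}{c^2}\frac{\partial\vec E}{\partial t}, &\qquad \vec\nabla\cdot\vec B &= 0, \end{aligned} \tag{27.1}$$

or, in relativistically covariant form, writing

$$\begin{aligned} E_i &= E^i = F^{0i} = -F_{0i} \\ B_i &= B^i = \frac{1}{2}\epsilon^{ijk}F_{jk}, \end{aligned} \tag{27.2}$$

the second and third Maxwell equations become the equation of motion

$$\partial_\mu F^{\mu\nu} = 0, \tag{27.3}$$

while the first and the fourth Maxwell equations become the Bianchi identities,

$$\partial_{[\mu}F_{\nu\rho]} = 0, \tag{27.4}$$

where, because of the total antisymmetrization, this is equivalent to

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{27.5}$$

with $A_\mu = (-\phi, \vec A)$ or $A^\mu = (\phi, \vec A)$. It can be seen that the Maxwell equations have a symmetry, called Maxwell duality, on making the exchanges

$$\vec E \to \vec B, \qquad \vec B \to -\vec E, \tag{27.6}$$

or, in relativistic form,

$$F_{\mu\nu} \to {*F}_{\mu\nu} \equiv \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}, \tag{27.7}$$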
which takes $\partial_\mu F^{\mu\nu} = 0$ into $\partial_{[\mu}F_{\nu\rho]} = 0$. Note that $*^2 = -1$, so (as we can also see from the vector components), applying the duality symmetry twice gives $-1$, so we obtain an overall sign for the fields. Adding sources means adding terms on the right-hand side of the second and third Maxwell equations,

$$\begin{aligned} \vec\nabla\cdot\vec E &= \frac{\rho}{\epsilon_0} \\ \vec\nabla\times\vec B - \frac{1}{c^2}\frac{\partial\vec E}{\partial t} &= \mu_0\vec j, \end{aligned} \tag{27.8}$$

or, in relativistic notation, the equation of motion acquires a source term,

$$\partial_\mu F^{\mu\nu} + j^\nu = 0. \tag{27.9}$$

Here the 4-current is

$$j^\mu = \left(\frac{\rho}{\epsilon_0}, \mu_0\vec j\right). \tag{27.10}$$

But now we have a problem, since there is no magnetic source, so Maxwell duality is broken. We note, however, that it would be fixed if we also introduce a magnetic 4-current, made up of a magnetic charge density and current,

$$k^\mu = \left(\mu_0\rho_m, \frac{\vec j_m}{\epsilon_0}\right). \tag{27.11}$$

Then the Bianchi identity has a source term,

$$\frac{1}{2}\epsilon^{\mu\nu\rho\sigma}\partial_\mu F_{\rho\sigma} + k^\nu = \partial_\mu{*F}^{\mu\nu} + k^\nu = 0. \tag{27.12}$$

In terms of 3-vectors, we have the modified Maxwell equations

$$\begin{aligned} \vec\nabla\cdot\vec B &= \mu_0\rho_m \\ \vec\nabla\times\vec E + \frac{\partial\vec B}{\partial t} &= \frac{\vec j_m}{\epsilon_0}. \end{aligned} \tag{27.13}$$

Thus we again have Maxwell duality, if we extend it to act on sources as well, as

$$j^\mu \leftrightarrow k^\mu, \qquad \rho_e \to \rho_m, \qquad q = \int d^3x\,\rho_e \leftrightarrow g = \int d^3x\,\rho_m. \tag{27.14}$$

Considering a pointlike electric source, an electron with $\rho_e = q\delta^3(x)$, we have the Maxwell equation

$$\vec\nabla\cdot\vec E = \frac{q}{\epsilon_0}\delta^3(x). \tag{27.15}$$

But applying Maxwell duality to it, we find a pointlike magnetic source with $\rho_m = g\delta^3(x)$, and a Maxwell equation

$$\vec\nabla\cdot\vec B = \mu_0g\,\delta^3(x). \tag{27.16}$$
Then, using a magnetic Gauss's law, namely integrating over a sphere, or rather a ball, centered on the charge, and converting the integral over the volume to an integral over the 2-sphere of radius $R$, $S_R^2$, we obtain

$$\int_{B_R} d^3x\,\vec\nabla\cdot\vec B = \oint_{S_R^2}\vec B\cdot d\vec S = \mu_0g, \tag{27.17}$$

and, since the $\vec B$ field must be radial, we finally obtain the magnetic field of the pointlike magnetic source,

$$\vec B = \frac{\mu_0g}{4\pi}\frac{\hat r}{r^2}. \tag{27.18}$$

This magnetic pointlike source is called a "Dirac magnetic monopole". Usually, a magnetic field has only a dipole mode, or higher (quadrupole, etc.). The reason is that we have not yet found a magnetic monopole in nature; if there were one, we would say that the magnetic field also starts at the monopole, as for an electric field. In the presence of magnetic sources, we can also act with Maxwell duality on the Lorentz force law, in order to find the full law, at $q \neq 0$, $g \neq 0$. Since $\vec E \to \vec B$, $\vec B \to -\vec E$, $q \to g$, we find the general law

$$\vec F = q(\vec E + \vec v\times\vec B) + g(\vec B - \vec v\times\vec E). \tag{27.19}$$

In relativistic notation, since $F^{\mu\nu} \to {*F}^{\mu\nu}$, we have

$$\frac{dp^\mu}{d\tau} = \left(qF^{\mu\nu} + g\,{*F}^{\mu\nu}\right)u_\nu. \tag{27.20}$$

We also note that, for a Dirac magnetic monopole, the energy density diverges at $r \to 0$ since the magnetic field has density

$$\mathcal{E} = \frac{\vec B^2}{2\mu_0} = \frac{g^2\mu_0}{32\pi^2r^4} \propto \frac{1}{r^4} \to \infty. \tag{27.21}$$

Moreover, also the total energy of the magnetic field of the Dirac magnetic monopole diverges,

$$E = \int d^3x\,\mathcal{E} = \int 4\pi r^2dr\,\mathcal{E} \sim \left.\frac{g^2\mu_0}{8\pi}\frac{1}{r}\right|_{r\to 0} \to \infty. \tag{27.22}$$

But this is just the same as in the case of the electron. For the electron, the particle is fundamental, but there are quantum corrections to the field at $r \to 0$ which cut off the divergence in $E$. For the monopole, the particle is usually a large-$r$ approximation of a nonabelian monopole (i.e., a 't Hooft monopole), which also cuts off the divergence in energy, though in a different way.
27.2 Dirac Quantization Condition from Semiclassical Nonrelativistic Considerations

The existence of the Dirac monopole implies, at the quantum level, an important quantization condition known as the Dirac quantization condition. It can be derived in different ways, but (given the analysis in the previous chapters) we will start with a semiclassical nonrelativistic method.
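Consider a nonrelativistic system of an electric charge $q$ in the magnetic field of a magnetic charge $g$. Indeed, the magnetic charge has the fields
$$\vec E_{\rm mon} = 0, \qquad \vec B_{\rm mon} = \frac{\mu_0 g}{4\pi}\frac{\hat r}{r^2}. \tag{27.23}$$
Then the Lorentz force on the electric charge is
$$\dot{\vec p} = m\ddot{\vec r} = q\,\dot{\vec r}\times\vec B = \frac{\mu_0 q g}{4\pi}\frac{\dot{\vec r}\times\vec r}{r^3}. \tag{27.24}$$
The time derivative of the orbital angular momentum is
$$\frac{d\vec L}{dt} = \frac{d}{dt}\left(m\,\vec r\times\dot{\vec r}\right) = m\,\vec r\times\ddot{\vec r} = \frac{\mu_0 q g}{4\pi r^3}\,\vec r\times(\dot{\vec r}\times\vec r) = \frac{d}{dt}\left(\frac{\mu_0 q g}{4\pi}\frac{\vec r}{r}\right), \tag{27.25}$$
where we have used
$$\frac{1}{r^3}\left[\vec r\times(\dot{\vec r}\times\vec r)\right]_i = \frac{1}{r^3}\epsilon_{ijk}x_j\left(\epsilon_{klm}v_l x_m\right) = \frac{1}{r^3}\left(\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\right)x_j v_l x_m = \frac{v_i}{r} - \frac{x_i}{r^3}(\vec r\cdot\dot{\vec r}) = \left[\frac{\dot{\vec r}}{r} - \frac{\vec r\,(\vec r\cdot\dot{\vec r})}{r^3}\right]_i = \frac{d}{dt}\left(\frac{\vec r}{r}\right)_i. \tag{27.26}$$
We finally obtain the conservation law
$$\frac{d}{dt}\left(\vec L - \frac{\mu_0 q g}{4\pi}\frac{\vec r}{r}\right) \equiv \frac{d\vec J}{dt} = 0. \tag{27.27}$$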
Since the first term is the orbital angular momentum, the full expression (the one that is conserved) must be a total angular momentum, which is why we called it $\vec J$. But the total angular momentum is quantized in units of $\hbar/2$ so, now considering the opposite regime, when $\vec L = 0$, we obtain the quantization condition
$$\frac{\mu_0 q g}{4\pi} = \frac{N\hbar}{2} \;\Rightarrow\; \mu_0 q g = 2\pi N\hbar = N h. \tag{27.28}$$
This is the Dirac quantization condition. Note then that, if there is a single magnetic charge $g$ anywhere in the Universe, the Dirac quantization condition means that electric charge is quantized, which we know experimentally to be true. The above argument is the only available theoretical explanation for the quantization of the electric charge, which is a strong indirect argument for the existence of magnetic charges somewhere in our Universe even if we have not observed them experimentally yet.
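The conservation law (27.27) can be checked numerically. The following is a minimal sketch, assuming units in which $m = 1$ and $\kappa \equiv \mu_0 q g/4\pi = 1$, a fourth-order Runge-Kutta integrator, and arbitrary initial data; the function names (`deriv`, `rk4_step`, `total_J`, `run`) are illustrative choices, not part of the text.

```python
import math

def cross(a, b):
    """Cartesian cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

def deriv(state, kappa, m):
    """d/dt of (r, v) for m r'' = kappa (v x r) / r^3, as in eq. (27.24)."""
    r, v = state[:3], state[3:]
    rn = norm(r)
    acc = tuple(kappa * f / (m * rn**3) for f in cross(v, r))
    return v + acc  # tuple concatenation: (dr/dt, dv/dt)

def rk4_step(state, dt, kappa, m):
    """One classical Runge-Kutta step (an assumed, generic integrator)."""
    def shifted(k, c):
        return tuple(s + c*ki for s, ki in zip(state, k))
    k1 = deriv(state, kappa, m)
    k2 = deriv(shifted(k1, dt/2), kappa, m)
    k3 = deriv(shifted(k2, dt/2), kappa, m)
    k4 = deriv(shifted(k3, dt), kappa, m)
    return tuple(s + dt/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def orbital_L(state, m):
    """Orbital angular momentum L = m r x v."""
    return tuple(m*x for x in cross(state[:3], state[3:]))

def total_J(state, kappa, m):
    """Total angular momentum J = L - kappa r/|r|, as in eq. (27.27)."""
    r = state[:3]
    rn = norm(r)
    return tuple(Li - kappa*ri/rn for Li, ri in zip(orbital_L(state, m), r))

def run(steps=5000, dt=1e-3, kappa=1.0, m=1.0):
    """Integrate the orbit and return the drifts |L(T)-L(0)| and |J(T)-J(0)|."""
    state = (1.0, 0.0, 0.0, 0.0, 0.5, 0.3)  # arbitrary initial (r, v)
    L0, J0 = orbital_L(state, m), total_J(state, kappa, m)
    for _ in range(steps):
        state = rk4_step(state, dt, kappa, m)
    L1, J1 = orbital_L(state, m), total_J(state, kappa, m)
    dL = norm(tuple(a - b for a, b in zip(L1, L0)))
    dJ = norm(tuple(a - b for a, b in zip(J1, J0)))
    return dL, dJ

if __name__ == "__main__":
    dL, dJ = run()
    print("change in L:", dL, " change in J:", dJ)
```

In such a run the orbital angular momentum changes by an amount of order $\kappa$, while $\vec J$ should stay constant up to the integrator error.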
27.3 Contradiction with the Gauge Field

When we wrote down the duality-invariant Maxwell equations, with both electric and magnetic sources, by introducing the magnetic sources we obtained a contradiction with the existence of the gauge field $A_\mu = (-\phi, \vec A)$. Indeed, since we define $\vec B = \vec\nabla\times\vec A$, we can use Gauss’s law, or rather its generalization, Stokes’ law, twice, and find
$$\mu_0 Q_m = \int_{M^3} d^3x\,\vec\nabla\cdot\vec B = \oint_{\partial M^3 = S^2}\vec B\cdot d\vec S = \oint_{S^2\,({\rm closed})}(\vec\nabla\times\vec A)\cdot d\vec S = \oint_{\partial S^2\,({\rm closed})}\vec A\cdot d\vec l = 0. \tag{27.29}$$
Indeed, note that the 2-sphere $S^2$ is a closed surface, so it has no boundary, which means that the “line integral over the boundary” is zero. But this relation is a contradiction, since it implies that the magnetic charge must be zero. In relativistically covariant notation the contradiction is easier to understand, since $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ means that $\partial_{[\mu}F_{\nu\rho]} = 0$, but this is in contradiction with the Bianchi identity with sources,
$$\frac{1}{2}\epsilon^{\mu\nu\rho\sigma}\partial_\mu F_{\rho\sigma} + k^\nu = 0. \tag{27.30}$$
The solution of this contradiction is that $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ or, in the $\phi = 0$ gauge, $\vec B = \vec\nabla\times\vec A$, is valid only on patches. That is, it is valid only on open surfaces that do not intersect the magnetic charges, so it is not valid on the closed surfaces $S$ that surround the magnetic charge.
27.4 Patches and Magnetic Charge from Transition Functions

In order to use patches for the closed surface $S^2$, we divide it and similar closed surfaces into two overlapping patches $O_\alpha$ and $O_\beta$, such that $O_{\alpha\beta} = O_\alpha\cap O_\beta$ and $O_\alpha\cup O_\beta = S^2$, as in Fig. 27.1a. Then, we define $A_\mu$ on each patch, i.e., $A_\mu^{(\alpha)}$ and $A_\mu^{(\beta)}$, such that
$$F_{\mu\nu} = \partial_\mu A_\nu^{(\alpha)} - \partial_\nu A_\mu^{(\alpha)} = \partial_\mu A_\nu^{(\beta)} - \partial_\nu A_\mu^{(\beta)}. \tag{27.31}$$
But this means that on the common patch $O_{\alpha\beta}$, the two gauge fields must differ by a gauge transformation,
$$A_\mu^{(\alpha)} = A_\mu^{(\beta)} + \partial_\mu\lambda^{(\alpha\beta)}, \tag{27.32}$$
where the gauge parameter $\lambda^{(\alpha\beta)}$ is called a transition function. In the $A_0 = 0$ gauge, we have
$$\vec B = \vec\nabla\times\vec A^{(\alpha)} = \vec\nabla\times\vec A^{(\beta)}, \tag{27.33}$$
so that the vector potentials differ by a gauge transformation,
$$\vec A^{(\alpha)} = \vec A^{(\beta)} + \vec\nabla\lambda^{(\alpha\beta)}. \tag{27.34}$$
We can consider explicit patches as in Fig. 27.1c, i.e.,
$$(\alpha):\ S^2 - \text{north pole},\ \theta = \pi; \qquad (\beta):\ S^2 - \text{south pole},\ \theta = 0. \tag{27.35}$$
In the gauge $A_0 = 0$, on the patch $(\alpha)$ the vector potential is
$$\vec A^{(\alpha)} = \frac{\mu_0 g}{4\pi r}\frac{(-1+\cos\theta)}{\sin\theta}\,\vec e_\phi, \tag{27.36}$$
where $\vec e_\phi$ is the unit vector in the direction of increasing $\phi$, in Cartesian coordinates:
$$\vec e_\phi = (-\sin\phi, \cos\phi, 0). \tag{27.37}$$
The formula for $\vec A^{(\alpha)}$ is singular at $\sin\theta = 0$ but $\cos\theta \neq 1$, which means at $\theta = \pi$. Thus, indeed, $\vec A^{(\alpha)}$ is defined only on $(\alpha)$.
Figure 27.1 (a) Two overlapping patches $O_a$ and $O_b$ for the 2-sphere $S^2$, with common patch $O_{ab}$. (b) The case when the overlap $O_{ab}$ is an equator, $C$. (c) The case when $O_a = S^2$ less the north pole, and $O_b = S^2$ less the south pole. The Dirac string singularity is a line going to infinity either at the north or the south pole.

We can check that the formula for $\vec A^{(\alpha)}$ is correct, since we have, in spherical coordinates (note that we have replaced $\theta$ with $\pi - \theta$ with respect to the usual definition),
$$\vec\nabla\times\vec A = \frac{1}{r\sin\theta}\left[-\frac{\partial}{\partial\theta}(A_\phi\sin\theta) + \frac{\partial A_\theta}{\partial\phi}\right]\hat r - \frac{1}{r}\left[\frac{1}{\sin\theta}\frac{\partial A_r}{\partial\phi} - \frac{\partial}{\partial r}(r A_\phi)\right]\vec e_\theta + \frac{1}{r}\left[-\frac{\partial}{\partial r}(r A_\theta) + \frac{\partial A_r}{\partial\theta}\right]\vec e_\phi. \tag{27.38}$$
On the patch $(\beta)$, the vector potential (again in the gauge $A_0 = 0$) is
$$\vec A^{(\beta)} = \frac{\mu_0 g}{4\pi r}\frac{(+1+\cos\theta)}{\sin\theta}\,\vec e_\phi. \tag{27.39}$$
This potential is singular at $\sin\theta = 0$ but $\cos\theta \neq -1$, so at $\theta = 0$. Therefore it is indeed defined only on $(\beta)$. Then, moreover, the difference between the two potentials is a gauge transformation,
$$\vec A^{(\alpha)} - \vec A^{(\beta)} = \frac{\mu_0}{2\pi}\frac{-g}{r\sin\theta}\,\vec e_\phi = \vec\nabla\lambda^{(\alpha\beta)}, \tag{27.40}$$
where the transition function is
$$\lambda^{(\alpha\beta)} = -\frac{\mu_0 g}{2\pi}\phi. \tag{27.41}$$
This is correct since in spherical coordinates
$$\vec\nabla = \frac{\partial}{\partial r}\hat r + \frac{1}{r}\frac{\partial}{\partial\theta}\vec e_\theta + \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\vec e_\phi. \tag{27.42}$$
Then the gauge transformation $\vec\nabla\lambda^{(\alpha\beta)}$ is single valued around the circle parametrized by $\phi\in[0,2\pi]$, but the gauge parameter $\lambda^{(\alpha\beta)}$ is not. Indeed, around it we have
$$\lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0) = -\mu_0 g. \tag{27.43}$$
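As a quick consistency check of (27.36), (27.39), (27.41) and (27.43), the sketch below integrates $\vec A^{(\alpha)} - \vec A^{(\beta)}$ around a circle of fixed $r$ and $\theta$ and compares the result with the jump of the transition function. It assumes units in which $\mu_0 g = 1$; the function names are hypothetical.

```python
import math

def a_phi_alpha(r, theta, mu0_g=1.0):
    """phi-component of A^(alpha), eq. (27.36)."""
    return mu0_g * (-1.0 + math.cos(theta)) / (4*math.pi * r * math.sin(theta))

def a_phi_beta(r, theta, mu0_g=1.0):
    """phi-component of A^(beta), eq. (27.39)."""
    return mu0_g * (+1.0 + math.cos(theta)) / (4*math.pi * r * math.sin(theta))

def transition_function(phi, mu0_g=1.0):
    """lambda^(alpha beta)(phi), eq. (27.41)."""
    return -mu0_g * phi / (2*math.pi)

def loop_integral(r, theta, n=1000, mu0_g=1.0):
    """Riemann sum of (A^(alpha) - A^(beta)) . dl, with dl = r sin(theta) dphi e_phi."""
    dphi = 2*math.pi / n
    total = 0.0
    for _ in range(n):
        diff = a_phi_alpha(r, theta, mu0_g) - a_phi_beta(r, theta, mu0_g)
        total += diff * r * math.sin(theta) * dphi
    return total

if __name__ == "__main__":
    jump = transition_function(2*math.pi) - transition_function(0.0)
    print("loop integral:", loop_integral(2.0, 1.0), " jump of lambda:", jump)
```

Both numbers should equal $-\mu_0 g$, independently of the radius and polar angle of the chosen circle.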
In general, consider a common one-dimensional patch consisting of an equator: $O_{\alpha\beta} = C$, $O_C^\alpha\cup O_C^\beta = S^2$, and $O_C^\alpha\cap O_C^\beta = C$, as in Fig. 27.1b. Then we can again use Stokes’ law (or the generalized Gauss’s law) twice, but now on the sum of patches, so
$$\mu_0 Q_m = \oint_{S^2}\vec B\cdot d\vec S = \int_{O_C^\alpha}\left(\vec\nabla\times\vec A^{(\alpha)}\right)\cdot d\vec S + \int_{O_C^\beta}\left(\vec\nabla\times\vec A^{(\beta)}\right)\cdot d\vec S = \oint_{\partial O_C^\alpha = +C}\vec A^{(\alpha)}\cdot d\vec l + \oint_{\partial O_C^\beta = -C}\vec A^{(\beta)}\cdot d\vec l = \oint_C\left(\vec A^{(\alpha)} - \vec A^{(\beta)}\right)\cdot d\vec l. \tag{27.44}$$
Here we have used the fact that $O_C^\alpha$ has an outward normal, associated by the right-hand screw rule with a direction on $C$, and $O_C^\beta$ has an outward normal associated by the same rule with the opposite direction on $C$; hence, when using the same orientation, i.e., the same integral on $C$, we obtain the difference of the integrands. Finally, we obtain that the magnetic charge is the integral of the gauge transformation, or the non-single-valuedness of the gauge parameter,
$$\mu_0 Q_m = \oint_C\left(\vec\nabla\lambda^{(\alpha\beta)}\right)\cdot d\vec l = \lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0). \tag{27.45}$$
27.5 Dirac Quantization from Topology and Wave Functions

There is an alternative derivation of the Dirac quantization condition that builds upon the formalism of patches described above. If there are electrically charged particles in the system then we have complex fields, in particular through the complex wave functions, that are minimally coupled to the electromagnetic potential $\vec A$ with electric charge $q$, as we saw in Chapter 23,
$$\vec D\psi = \left(\vec\nabla - \frac{iq}{\hbar}\vec A\right)\psi. \tag{27.46}$$
But we saw that gauge transformations with parameters $\lambda$ act on the wave functions as
$$\psi' = \psi\,\exp\left(-i\frac{q\lambda}{\hbar}\right). \tag{27.47}$$
We must then impose that the gauge transformation on the wave function is single-valued around the circle $C$ in Fig. 27.1, i.e., that
$$\exp\left(-\frac{iq}{\hbar}\lambda^{(\alpha\beta)}(\phi = 2\pi)\right) = \exp\left(-\frac{iq}{\hbar}\lambda^{(\alpha\beta)}(\phi = 0)\right), \tag{27.48}$$
which means that we must have
$$\frac{q}{\hbar}\left[\lambda^{(\alpha\beta)}(\phi = 2\pi) - \lambda^{(\alpha\beta)}(\phi = 0)\right] = 2\pi n. \tag{27.49}$$
Substituting the non-single-valuedness of $\lambda$ (for a magnetic charge $Q_m = g$), we obtain
$$\frac{2\pi n\hbar}{q} = \mu_0 g \;\Rightarrow\; \mu_0 g q = 2\pi n\hbar = h n, \tag{27.50}$$
which is again the Dirac quantization condition.
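To get a feeling for the size of the magnetic charge allowed by (27.50), the following sketch evaluates $g = n h/(\mu_0 q)$ for $q = e$ in SI units and verifies that the corresponding gauge-transformation phase is single valued. The numerical constants are the standard SI/CODATA values, quoted here as an assumption of the example; the function names are illustrative.

```python
import cmath
import math

H_PLANCK = 6.62607015e-34      # Planck constant in J s (fixed by the 2019 SI)
E_CHARGE = 1.602176634e-19     # elementary charge in C (fixed by the 2019 SI)
MU0 = 1.25663706212e-6         # vacuum permeability in N/A^2 (CODATA 2018 value)

def monopole_charge(n=1, q=E_CHARGE):
    """Magnetic charge g solving mu0 * q * g = n * h, in A m."""
    return n * H_PLANCK / (MU0 * q)

def transition_phase(q, g):
    """exp(-i q [lambda(2 pi) - lambda(0)] / hbar) with the jump -mu0 g of eq. (27.43)."""
    hbar = H_PLANCK / (2*math.pi)
    jump = -MU0 * g
    return cmath.exp(-1j * q * jump / hbar)

if __name__ == "__main__":
    g1 = monopole_charge()
    print("minimal magnetic charge for q = e:", g1, "A m")
    print("phase around the circle:", transition_phase(E_CHARGE, g1))
```

For $n = 1$ this gives a magnetic charge of about $3.29\times 10^{-9}$ A m in the convention $\vec\nabla\cdot\vec B = \mu_0 g\,\delta^3(x)$ used here, and the phase equals 1 up to rounding.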
27.6 Dirac String Singularity and Obtaining the Dirac Quantization Condition from It

There is yet another way to derive the Dirac quantization condition, which will expose another important feature in the presence of magnetic monopoles: the Dirac string.

For the Dirac monopole, we found that (on both patches), the vector potential $\vec A$ is defined everywhere except on a line. In the case of $\vec A^{(\alpha)}$, there was a singularity on the north pole $\theta = \pi$ on $S^2$, which is a “string” at $\theta = \pi$ extending from the monopole ($r = 0$) to infinity ($r = \infty$). Similarly, $\vec A^{(\beta)}$ was defined everywhere except on the “string” on the south pole $\theta = 0$, extending from the monopole to infinity; see Fig. 27.1c. This singularity is known as the Dirac string singularity. We see that by gauge transformations (like that taking us between $\vec A^{(\alpha)}$ and $\vec A^{(\beta)}$) we can move the Dirac string singularity around, which means that it is not physical.

Dirac’s interpretation of this singularity was that we can find a regular, i.e., nonsingular, magnetic field $\vec B_{\rm reg}$, that can be written in terms of an everywhere-defined gauge field (vector potential) $\vec A$. This magnetic field is formed by the monopole field $\vec B_{\rm mon}$ plus the field of the Dirac string singularity,
$$\vec B_{\rm string} = \mu_0 g\,\theta(-z)\,\delta(x)\,\delta(y)\,\hat z, \tag{27.51}$$
so that
$$\vec B_{\rm reg} = \vec B_{\rm mon} + \vec B_{\rm string}. \tag{27.52}$$
The above field $\vec B_{\rm string}$ is for the $(\beta)$ patch, when the singularity is on the south pole, from $r = 0$ to $r = \infty$, i.e., at $z < 0$, hence we have the Heaviside function $\theta(-z)$. The regular field $\vec B_{\rm reg}$ obeys the usual rules, so that
$$\vec\nabla\cdot\vec B_{\rm reg} = 0 \;\Rightarrow\; \vec B_{\rm reg} = \vec\nabla\times\vec A. \tag{27.53}$$
The Dirac quantization condition now arises by imposing that the Aharonov–Bohm effect from a contour encircling the Dirac string is trivial, as it should be for an unphysical singularity: this means that the Dirac string must be unobservable. Since the Dirac string is associated with a vector potential $\vec A$ generating the delta function magnetic field $\vec B_{\rm string}$, we have
$$\oint_{C = \partial S}\vec A\cdot d\vec l = \int_S\vec B\cdot d\vec S \neq 0, \tag{27.54}$$
yet it must be unobservable in an Aharonov–Bohm effect. Therefore, when going around a circle encircling the Dirac string, the change in the wave function, its “monodromy”,
$$\psi \to \exp\left(\frac{iq}{\hbar}\oint_{C = \partial S}\vec A\cdot d\vec l\right)\psi, \tag{27.55}$$
must be trivial, i.e.,
$$\exp\left(\frac{iq}{\hbar}\oint_{C = \partial S}\vec A_{\rm string}\cdot d\vec l\right) = \exp\left(\frac{iq}{\hbar}\int_S\vec B_{\rm string}\cdot d\vec S\right) = 1. \tag{27.56}$$
This means that $e^{i\mu_0 q g/\hbar} = 1$, so we obtain again the Dirac quantization condition,
$$\mu_0 q g = 2\pi N\hbar = N h. \tag{27.57}$$
Important Concepts to Remember

• Maxwell duality refers to the invariance of the Maxwell equations without source under $\vec E \to \vec B$, $\vec B \to -\vec E$. If we add a magnetic current source $k^\mu = (\mu_0\rho_m, \vec j_m/\epsilon_0)$ similar to the electric current source $j^\mu = (\rho_e/\epsilon_0, \mu_0\vec j_e)$, we can have Maxwell duality even with sources.
• A magnetic current source of delta function type (like the electron in relation to the electric current) gives a Dirac magnetic monopole, $\vec B = \dfrac{\mu_0 g}{4\pi}\dfrac{\hat r}{r^2}$.
• We can obtain the Dirac quantization condition $\mu_0 q g = 2\pi N\hbar = hN$ from the conservation of the total angular momentum, such that $N$ is an angular momentum quantum number.
• If there is a single magnetic monopole with charge $g$ in the Universe, we obtain the quantization of electric charge, which is an experimental fact but otherwise theoretically unexplained. This is good indirect evidence for the existence of magnetic monopoles.
• In the presence of magnetic charges, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ cannot be valid everywhere (in particular, not on a closed surface enclosing the charge), but must be true only on patches.
• The magnetic charge is equal to the monodromy on $C$ (the difference in the field when going around a closed loop) of the transition function (the gauge parameter for a transformation) between two patches with $C$ as a common loop.
• Dirac quantization can also be obtained from the single-valuedness of the gauge transformation defined by the transition function on the wave function minimally coupled to the electromagnetic potential.
• The Dirac string singularity is a spurious singularity in the vector potential $\vec A$ of a monopole, starting at the monopole and going to infinity. It can be moved around by gauge transformations. The Dirac quantization condition also arises from the condition that the Aharonov–Bohm effect from a contour encircling the Dirac string is unobservable (trivial).
Further Reading

See [7] for more details.
Exercises

(1) Can Maxwell duality be derived from the classical action for electromagnetism (plus electric and magnetic sources)? Is there a simple way to generalize this duality to the quantum theory, if there are particles minimally coupled to the electromagnetic fields? Explain.
(2) Consider an electron and a monopole at the same point at a distance $R$ from a perfectly conducting infinite plane. What are $\vec E$ and $\vec B$ on the plane at the minimum-distance point from the charges?
(3) In the presence of so-called dyons, particles that carry both electric and magnetic charges, the Dirac quantization condition for a particle with charges $(q_1, g_1)$ and another particle with charges $(q_2, g_2)$ is generalized to the Dirac–Schwinger–Zwanziger quantization condition
$$q_1 g_2 - q_2 g_1 = 2\pi\hbar N, \quad N \in \mathbb{N}. \tag{27.58}$$
Prove this relation using a generalization of the argument for the quantization of the total angular momentum of a system of two particles.
(4) Consider two dyons satisfying the Dirac–Schwinger–Zwanziger quantization condition given in exercise 3, the minimum value for the integer, $N = 1$, occurring when one of the dyons has $q_1 = e$, $g_1 = h/e$. Calculate the total relative force between the dyons.
(5) Suppose that the magnetic field for a particle at the point 0 is a delta function,
$$B(\vec x) = \frac{2\theta}{e}\,\delta(\vec x). \tag{27.59}$$
Take two such particles and rotate one around the other. Calculate the Aharonov–Bohm phase $\exp\left(i\oint_C\vec A\cdot d\vec l\right)$ of the moving particle. Since the particles are identical, how would you interpret this result?
(6) Consider two monopoles of opposite charge (monopole and antimonopole) situated at a distance $2R$ from each other, and a sphere of radius $2R$ centered on the midpoint between the monopole and antimonopole. Is there a unique vector potential $\vec A$ that is valid over the whole sphere?
(7) Suppose in our Universe there is only a monopole and antimonopole pair, situated at a distance $d$ from each other that is much smaller than the distance to any other atom in the Universe. Can we still infer that the electron charge is quantized? Explain.
28
Path Integrals II: Imaginary Time and Fermionic Path Integral

In this chapter we will extend the quantum mechanical formalism of path integrals, in order to describe finite temperature, and fermions. We will first study path integrals in Euclidean space, which, as we will see, are better defined than those in Minkowski space, and also give their relation to statistical mechanics at finite temperature. Moreover, the usual case is defined as a limit of the Euclidean space path integral. Finally, we will also define path integrals for fermionic variables.
28.1 The Forced Harmonic Oscillator

In Chapter 10, we obtained the path integral for the transition amplitude (for the propagator)
$$M(q',t';q,t) \equiv {}_H\langle q',t'|q,t\rangle_H = \langle q'|\exp\left(-\frac{i}{\hbar}\hat H(t'-t)\right)|q\rangle = \int\mathcal{D}p(t)\,\mathcal{D}q(t)\,\exp\left\{\frac{i}{\hbar}\int_{t_i}^{t_f}dt\,\left[p(t)\dot q(t) - H(q(t),p(t))\right]\right\} \tag{28.1}$$
and, for a Hamiltonian quadratic in momenta, $H = p^2/2 + V(q)$, we obtained the path integral in terms of the Lagrangian,
$$M(q',t';q,t) = \mathcal{N}\int\mathcal{D}q\,\exp\left(\frac{i}{\hbar}S[q]\right). \tag{28.2}$$
We also saw that we can define $N$-point correlation functions, and calculate them in terms of path integrals, as
$$G_N(\bar t_1,\ldots,\bar t_N) = {}_H\langle q',t'|T\{\hat q(\bar t_1)\cdots\hat q(\bar t_N)\}|q,t\rangle_H = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q(t)\,\exp\left(\frac{i}{\hbar}S[q]\right)q(\bar t_1)\cdots q(\bar t_N), \tag{28.3}$$
where we have the boundary conditions $q(t) = q$, $q(t') = q'$. These correlation functions are obtained from their generating functional, which is
$$Z[J] = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q\,\exp\left(\frac{i}{\hbar}S[q;J]\right) = \int_{q(t)=q,\,q(t')=q'}\mathcal{D}q\,\exp\left(\frac{i}{\hbar}S[q] + \frac{i}{\hbar}\int dt\,J(t)q(t)\right), \tag{28.4}$$
by means of multiple derivatives at zero,
$$G_N(\bar t_1,\ldots,\bar t_N) = \left.\frac{\hbar}{i}\frac{\delta}{\delta J(\bar t_1)}\cdots\frac{\hbar}{i}\frac{\delta}{\delta J(\bar t_N)}Z[J]\right|_{J=0}. \tag{28.5}$$
But, while $Z[J]$, called the partition function, is just a generating functional (a mathematical construct), we note that $J(t)$ can be interpreted as a source term for the classical variable $q(t)$. In the case of a harmonic oscillator, this would be an external driving force, giving rise to a forced (driven) harmonic oscillator, as seen in the equation of motion,
$$0 = \frac{\delta S[q;J]}{\delta q(t)} = \frac{\delta S[q]}{\delta q(t)} + J(t), \tag{28.6}$$
so $Z[J]$ and its derivatives make sense also at nonzero $J(t)$. The action of the driven harmonic oscillator is then
$$S[q;J] = \int dt\,\left[\frac{1}{2}\left(\dot q^2 - \omega^2 q^2\right) + J(t)q(t)\right]. \tag{28.7}$$
In this case, the path integral is still Gaussian, since we are just adding a linear term to the quadratic one. But, in order to calculate it, we still have one issue: we need to understand the boundary conditions for $q(t)$. In the absence of boundary terms, we can partially integrate and obtain
$$S[q;J] = \int dt\,\left\{-\frac{1}{2}q(t)\left[\frac{d^2}{dt^2} + \omega^2\right]q(t) + J(t)q(t)\right\}. \tag{28.8}$$
This is schematically of the type
$$\frac{i}{\hbar}S = -\frac{1}{2\hbar}\,q\cdot\Delta^{-1}\cdot q + \frac{i}{\hbar}J\cdot q,$$
where
$$\Delta^{-1}q(t) \equiv i\left[\frac{d^2}{dt^2} + \omega^2\right]q(t). \tag{28.9}$$
In this case, the Gaussian path integral can be calculated as
$$Z[J] = \mathcal{N}\int\mathcal{D}q\,\exp\left(-\frac{1}{2\hbar}\,q\cdot\Delta^{-1}\cdot q + \frac{i}{\hbar}J\cdot q\right) \tag{28.10}$$
$$\equiv \int d^n x\,\exp\left(-\frac{1}{2}x^T\cdot A\cdot x + b^T\cdot x\right) = (2\pi)^{n/2}(\det A)^{-1/2}\exp\left(\frac{1}{2}b^T\cdot A^{-1}\cdot b\right), \tag{28.11}$$
where $b = \frac{i}{\hbar}J(t)$ and $A = \Delta^{-1}/\hbar$, so we finally obtain
$$Z[J] = \mathcal{N}\exp\left(-\frac{1}{2\hbar}J\cdot\Delta\cdot J\right), \tag{28.12}$$
where $\mathcal{N}$ contains constants (expressions independent of $J$), including $(\det\Delta)^{-1/2}$. In the above, $\Delta$ is called the propagator, which seems unrelated to the previous notion of a propagator, as $U(t,t')$ relating $|\psi(t')\rangle$ to $|\psi(t)\rangle$. In the case of the driven harmonic oscillator, the propagator is actually related to the two-point correlation function since then, from (28.12), we have
$$G_2(t_1,t_2) = -\hbar^2\left.\frac{\delta^2 Z[J]}{\delta J(t_1)\delta J(t_2)}\right|_{J=0} = \hbar\left.\frac{\delta}{\delta J(t_1)}\left[(J\cdot\Delta)(t_2)\exp\left(-\frac{1}{2\hbar}J\cdot\Delta\cdot J\right)\right]\right|_{J=0} = \hbar\,\Delta(t_1,t_2). \tag{28.13}$$
Also from (28.12), we find
$$\Delta(t,t') = i\int\frac{dp}{2\pi}\frac{e^{-ip(t-t')}}{p^2 - \omega^2}, \tag{28.14}$$
since then
$$\Delta^{-1}\Delta(t,t') = i\left[\frac{d^2}{dt^2} + \omega^2\right]i\int\frac{dp}{2\pi}\frac{e^{-ip(t-t')}}{p^2 - \omega^2} = -\int\frac{dp}{2\pi}\frac{-p^2 + \omega^2}{p^2 - \omega^2}e^{-ip(t-t')} = \int\frac{dp}{2\pi}e^{-ip(t-t')} = \delta(t-t'). \tag{28.15}$$
However, the expression we have found for $\Delta$ is ill defined, since we have a singularity at $p^2 = \omega^2$, where the denominator vanishes, and we must somehow avoid this singularity. Its avoidance is related to the question of how to invert $\Delta^{-1}$: this depends, as for any operator, on the space of functions on which the operator acts. In quantum mechanics, this is the Hilbert space of states and, in this continuous case, must be better defined. The relevant issue is that the Hilbert space for $\Delta^{-1}$ has zero modes, where
$$\Delta^{-1}q_0(t) = i\left[\frac{d^2}{dt^2} + \omega^2\right]q_0(t) = 0. \tag{28.16}$$
Having zero modes (eigenstates of zero eigenvalue), the operator on the full Hilbert space is not invertible (think of a matrix with zero eigenvalues, therefore with zero determinant). Moreover, these are not some pathological states but rather are the solutions of the classical equations of motion of the harmonic oscillator. To obtain an invertible operator, we must therefore find a way to exclude these classical states from the Hilbert space, by imposing some boundary conditions that contradict them.

We will find that the correct result is described in terms of an integral over slightly complex momenta $p$, by avoiding the pole just slightly in the complex plane, according to the formula
$$\Delta_F(t,t') = i\int\frac{dp}{2\pi}\frac{e^{-ip(t-t')}}{p^2 - \omega^2 + i\epsilon}. \tag{28.17}$$
Here $F$ stands for Feynman; this is the Feynman propagator.

If an operator $A$ has eigenstates $|q\rangle$ with eigenvalues $a_q$,
$$A|q\rangle = a_q|q\rangle, \tag{28.18}$$
and the states are orthonormal,
$$\langle q|q'\rangle = \delta_{qq'}, \tag{28.19}$$
then the operator can be written as
$$A = \sum_q a_q|q\rangle\langle q|. \tag{28.20}$$
Thus the inverse operator is
$$A^{-1} = \sum_q\frac{1}{a_q}|q\rangle\langle q|. \tag{28.21}$$
In our case, if $\Delta_F^{-1}$ has orthonormal eigenfunctions $\{q_i(t)\}$,
$$\int dt\,q_i^*(t)q_j(t) = \delta_{ij}, \tag{28.22}$$
with eigenvalues $\lambda_i$, then
$$\Delta_F^{-1}(t,t') = \sum_i\lambda_i\,q_i(t)q_i^*(t'), \tag{28.23}$$
so the inverse is
$$\Delta_F(t,t') = \sum_i\frac{1}{\lambda_i}q_i(t)q_i^*(t'). \tag{28.24}$$
In order to get (28.17), we can make the identifications $q_i(t) \sim e^{-ipt}$, $\sum_i \sim \int dp/2\pi$, and $\lambda_i \sim (p^2 - \omega^2 + i\epsilon)$.

We can be more precise than the above, however. Since in $\Delta_F(t,t')$ there are poles in the complex plane at $p = \pm(\omega - i\epsilon)$, we can calculate the integral with the residue theorem on the complex plane. In order to do that, we must form a closed contour from the integral over the real line, by adding a piece of contour whose integral vanishes. Such a contour is a semicircle at infinity, provided the factor $e^{-ip(t-t')}$ is exponentially small. Thus if $t > t'$, by considering ${\rm Im}(p) < 0$, we obtain a factor that vanishes exponentially, $e^{-|{\rm Im}(p)|(t-t')} \to 0$, on the semicircle at infinity in the lower half of the complex plane. We can thus add, without affecting the result, the integral over this contour, obtaining a closed total contour $C$ such that the pole $p_1 = +(\omega - i\epsilon)$ is inside $C$, as in Fig. 28.1. By the residue theorem, we obtain
$$\Delta_F(t,t') = \frac{1}{2\omega}e^{-i\omega(t-t')}. \tag{28.25}$$

Figure 28.1 The contour for the Feynman propagator avoids $-\omega$ from below and $+\omega$ from above. It is closed in the lower half-plane (see the dotted contour “at infinity”), for $t > t'$.

Similarly, if $t < t'$, by considering ${\rm Im}(p) > 0$, we obtain an exponentially vanishing factor $e^{-{\rm Im}(p)(t'-t)} \to 0$ on the semicircle at infinity in the upper half of the complex plane. Adding again without affecting the result the integral over this contour, we obtain a closed total contour $C'$ that encircles the pole $p_2 = -(\omega - i\epsilon)$, so by the residue theorem we get
$$\Delta_F(t,t') = \frac{1}{2\omega}e^{-i\omega(t'-t)}. \tag{28.26}$$
Putting together the two cases, we have at general $t, t'$,
$$\Delta_F(t,t') = \frac{1}{2\omega}e^{-i\omega|t-t'|}. \tag{28.27}$$
Now we can calculate the boundary conditions, since from (28.24) we see that $\Delta_F$ has the same boundary conditions for $t$ as $q(t)$. Since at $t \to +\infty$, $\Delta_F \sim e^{-i\omega t}$ and at $t \to -\infty$, $\Delta_F \sim e^{+i\omega t}$, it follows that, in order to obtain (28.17) we need to impose the boundary conditions
$$q(t) \sim e^{-i\omega t},\ t \to +\infty; \qquad q(t) \sim e^{+i\omega t},\ t \to -\infty. \tag{28.28}$$
We can use these boundary conditions to calculate more precisely the Gaussian path integral, and find the boundary term correction to the partition function,
$$Z[J] = \mathcal{N}\exp\left(-\frac{1}{2\hbar}J\cdot\Delta\cdot J + \frac{i}{\hbar}q_{\rm cl}(t)\cdot J\right), \tag{28.29}$$
though we will not show the details. The same result can also be obtained, a bit more rigorously, from the harmonic phase-space path integral of Chapter 10. Moreover, the harmonic phase-space path integral gives the same boundary conditions for the functions $q_i(t)$.
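The residue calculation leading to (28.25)–(28.27) can be cross-checked numerically at small but finite $\epsilon$. The sketch below is an illustration under stated assumptions: it uses a plain trapezoid rule on a truncated momentum interval, writes the time difference $t - t'$ simply as `t`, and compares with the closed form $e^{-i\Omega|t|}/(2\Omega)$, $\Omega = \sqrt{\omega^2 - i\epsilon}$, which is what the residue theorem gives when $\epsilon$ is kept finite. All function names and numerical parameters are hypothetical choices.

```python
import cmath
import math

def feynman_numeric(t, omega=1.0, eps=0.1, cutoff=200.0, n=200_000):
    """Trapezoid estimate of i * int dp/(2 pi) exp(-i p t) / (p^2 - omega^2 + i eps)."""
    h = 2*cutoff / n
    total = 0j
    for k in range(n + 1):
        p = -cutoff + k*h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-1j*p*t) / (p*p - omega**2 + 1j*eps)
    return 1j * total * h / (2*math.pi)

def feynman_exact(t, omega=1.0, eps=0.1):
    """Residue-theorem result at finite eps: exp(-i Omega |t|) / (2 Omega)."""
    big_omega = cmath.sqrt(omega**2 - 1j*eps)   # root with negative imaginary part
    return cmath.exp(-1j*big_omega*abs(t)) / (2*big_omega)

if __name__ == "__main__":
    for t in (1.0, -1.0):
        print(t, feynman_numeric(t), feynman_exact(t))
```

As $\epsilon \to 0$ the closed form reduces to $e^{-i\omega|t-t'|}/(2\omega)$, in agreement with (28.27); the numerical integral should agree with it to within the truncation error of the momentum cut-off.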
28.2 Wick Rotation to Euclidean Time and Connection with Statistical Mechanics Partition Function

We have shown above how to calculate the partition function in the case of a quadratic action, by doing a kind of Gaussian integral. However, the calculation is not very rigorous, since the Gaussian is of an imaginary kind while the result is strictly speaking only correct in the real case, $\int_{-\infty}^{+\infty}dx\,e^{-\alpha x^2}$. Indeed, in the case of an imaginary integral, $\int dx\,e^{-i\alpha x}$, the integrand oscillates rapidly, which makes it impossible for it to converge at infinity. If we take a cut-off for the integral at a large $\Lambda$, the difference between $\int_{-\Lambda}^{+\Lambda}dx\,e^{-i\alpha x}$ and $\int_{-\Lambda-C}^{+\Lambda+C}dx\,e^{-i\alpha x}$ is $2\int_0^C dx\,e^{-i\alpha x}$, which is finite and oscillatory for finite $C$, so we cannot even define the limit properly when $\Lambda \to \infty$.

A simple solution would be to replace $e^{iS/\hbar}$ with $e^{-S/\hbar}$, in which case the exponential at infinity would be decaying instead of oscillatory, and we could define the integral $\int_{-\infty}^{+\infty}dx$ without any problem. This is what happens when we go to Euclidean time (i.e., work in Euclidean spacetime).

For a Hamiltonian $\hat H$ that is independent of time, and has a complete set $\{|n\rangle\}$ of eigenstates (so that $1 = \sum_n|n\rangle\langle n|$), with positive eigenenergies $E_n > 0$, we can write
$${}_H\langle q',t'|q,t\rangle_H = \langle q'|e^{-i\hat H(t'-t)/\hbar}|q\rangle = \sum_n\sum_m\langle q'|n\rangle\langle n|e^{-i\hat H(t'-t)/\hbar}|m\rangle\langle m|q\rangle$$
$$= \sum_n\langle q'|n\rangle\langle n|q\rangle e^{-iE_n(t'-t)/\hbar} = \sum_n\psi_n(q')\psi_n^*(q)e^{-iE_n(t'-t)/\hbar}. \tag{28.30}$$
In the second line, we have used the fact that the matrix element on energy eigenstates is
$$\langle n|e^{-i\hat H(t'-t)/\hbar}|m\rangle = \delta_{nm}e^{-iE_n(t'-t)/\hbar}, \tag{28.31}$$
as well as the definition $\psi_n(q) \equiv \langle q|n\rangle$. At this point, we can make the required analytical continuation to Euclidean time, also known as a “Wick rotation”, where we just make the replacement $\Delta t \to -i\beta$. We then obtain
$$\langle q',\beta|q,0\rangle = \sum_n\psi_n(q')\psi_n^*(q)e^{-\beta E_n}. \tag{28.32}$$
If moreover $q' = q$ and we integrate over it, we obtain an expression for the statistical mechanics partition function of a quantum system at temperature $T$, with $k_B T = 1/\beta$. Indeed, then
$$\int dq\,\langle q,\beta|q,0\rangle = \int dq\sum_n|\psi_n(q)|^2 e^{-\beta E_n} = {\rm Tr}[e^{-\beta\hat H}] \equiv Z[\beta], \tag{28.33}$$
where we sum over the Boltzmann factors $e^{-\beta E_n}$ times the probability $|\psi_n(q)|^2$ of the state, and then sum over $q$ and $n$. The procedure above implies that we are considering closed (periodic) paths in Euclidean time, of period $\beta = 1/k_B T$, with $q' = q(t_E = \beta) = q(t_E = 0) = q$.

For a Lagrangian in the usual (Minkowski) spacetime, with the canonical kinetic term,
$$L(q,\dot q) = \frac{1}{2}\left(\frac{dq}{dt}\right)^2 - V(q), \tag{28.34}$$
the action can be rewritten in terms of the Euclidean time $t_E$ ($t = -it_E$) as
$$iS[q] = i\int_0^{t_E = \beta}(-i\,dt_E)\left[\frac{1}{2}\left(\frac{dq}{d(-it_E)}\right)^2 - V(q)\right] \equiv -S_E[q], \tag{28.35}$$
where we have defined the Euclidean action $S_E[q]$, obtaining
$$S_E[q] = \int_0^\beta dt_E\left[\frac{1}{2}\left(\frac{dq}{dt_E}\right)^2 + V(q)\right] \tag{28.36}$$
$$= \int_0^\beta dt_E\,L_E(q,\dot q). \tag{28.37}$$
In this way, we obtain the Feynman–Kac formula, relating the statistical mechanics partition function to the partition function from the path integral in Euclidean space,
$$Z(\beta) = {\rm Tr}[e^{-\beta\hat H}] = \int_{q(t_E+\beta) = q(t_E)}\mathcal{D}q\,\exp\left(-\frac{1}{\hbar}S_E[q]\right). \tag{28.38}$$
We can also introduce nonzero sources $J(t)$ into the Euclidean-time partition function,
$$Z[\beta,J] = \int\mathcal{D}q\,\exp\left(-\frac{1}{\hbar}S_E[\beta] + \frac{1}{\hbar}\int_0^\beta d\tau\,J_E(\tau)q_E(\tau)\right), \tag{28.39}$$
where the source term is the analytical continuation of the Minkowski-time source term,
$$i\int dt\,J(t)q(t) = i\int d(-it_E)\,J(-it_E)q(-it_E) = \int dt_E\,J_E(t_E)q_E(t_E). \tag{28.40}$$
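The Feynman–Kac formula (28.38) can be illustrated with a discretized Euclidean path integral. The sketch below is one possible discretization, assumed for illustration only: units $\hbar = m = 1$, the harmonic potential $V(q) = \omega^2 q^2/2$, a uniform grid in $q$, and a symmetric short-time kernel, so that the periodic path integral becomes the trace of a power of a transfer matrix. The grid size, the number of time slices, and all function names are hypothetical.

```python
import math

def transfer_matrix(a, omega=1.0, n_grid=61, q_max=5.0):
    """Short-time Euclidean kernel <q_i| e^{-a H} |q_j> dq on a uniform grid.

    Uses exp(-a V(q_i)/2) * free kernel * exp(-a V(q_j)/2), with hbar = m = 1.
    """
    dq = 2*q_max / (n_grid - 1)
    qs = [-q_max + i*dq for i in range(n_grid)]
    pref = dq / math.sqrt(2*math.pi*a)
    def potential(q):
        return 0.5 * omega**2 * q*q
    return [[pref * math.exp(-(qi - qj)**2/(2*a) - 0.5*a*(potential(qi) + potential(qj)))
             for qj in qs] for qi in qs]

def matmul(A, B):
    """Plain-Python product of two square matrices stored as lists of rows."""
    B_cols = list(zip(*B))
    return [[sum(x*y for x, y in zip(row, col)) for col in B_cols] for row in A]

def partition_function_path_integral(beta=2.0, doublings=6, omega=1.0):
    """Z(beta) as Tr(T^N), N = 2**doublings time slices, periodic paths."""
    n_steps = 2**doublings
    T = transfer_matrix(beta/n_steps, omega)
    for _ in range(doublings):
        T = matmul(T, T)          # repeated squaring: T -> T^2 -> T^4 -> ...
    return sum(T[i][i] for i in range(len(T)))

def partition_function_trace(beta=2.0, omega=1.0):
    """Tr exp(-beta H) for the oscillator levels E_n = omega (n + 1/2)."""
    return 1.0 / (2.0*math.sinh(0.5*beta*omega))

if __name__ == "__main__":
    print(partition_function_path_integral(), partition_function_trace())
```

With these parameters the two numbers should agree to a few parts in $10^4$; the residual difference comes from the finite time step, not from the periodic boundary condition, which is imposed exactly by taking the trace.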
From the partition function (28.39) we can calculate correlation functions as before, for instance the two-point function,
$$\frac{\hbar^2}{Z(\beta)}\frac{\delta^2 Z[\beta;J]}{\delta J(\tau_1)\delta J(\tau_2)} = \frac{1}{Z(\beta)}\int\mathcal{D}q(\tau)\,q(\tau_1)q(\tau_2)\,e^{-S_E(\beta)/\hbar}, \tag{28.41}$$
which now is also a statistical mechanics trace,
$$\frac{1}{Z(\beta)}{\rm Tr}\left[e^{-\beta\hat H}\,T\left(\hat q(-i\tau_1)\hat q(-i\tau_2)\right)\right], \tag{28.42}$$
where the Heisenberg operators in Minkowski time become the analytical continuation in Euclidean time,
$$\hat q(t) = e^{i\hat H t/\hbar}\,\hat q\,e^{-i\hat H t/\hbar} \;\Rightarrow\; \hat q(-i\tau) = e^{\hat H\tau/\hbar}\,\hat q\,e^{-\hat H\tau/\hbar}. \tag{28.43}$$
However, the path integrals are still taken over periodic Euclidean paths. To go back to the regular case in Minkowski space, from the Euclidean formulation, we must both undo the Wick rotation, and take the limit $\beta \to \infty$ (infinite periodicity), corresponding to $T \to 0$ in statistical mechanics (zero temperature). In this limit, inside the trace remains only the vacuum contribution, for $E_0$, as
$$\sum_n\psi_n(q')\psi_n^*(q)e^{-\beta E_n} \to \psi_0(q')\psi_0^*(q)e^{-\beta E_0}. \tag{28.44}$$
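The dominance of the vacuum term in (28.44) can be seen in a simple numerical example. The sketch assumes the harmonic oscillator spectrum $E_n = \omega(n + 1/2)$ with $\hbar = 1$, truncates the sum at a finite number of levels, and checks that $-\ln Z(\beta)/\beta$ approaches $E_0$ as $\beta$ grows; the function names are illustrative.

```python
import math

def log_Z(beta, omega=1.0, n_levels=2000):
    """ln Tr exp(-beta H) for E_n = omega (n + 1/2), truncated at n_levels states."""
    return math.log(sum(math.exp(-beta*omega*(n + 0.5)) for n in range(n_levels)))

def free_energy(beta, omega=1.0):
    """F(beta) = -ln Z / beta, which tends to the ground state energy as beta -> infinity."""
    return -log_Z(beta, omega) / beta

if __name__ == "__main__":
    for beta in (1.0, 5.0, 20.0, 50.0):
        print(beta, free_energy(beta))
```

The printed values should decrease monotonically towards $E_0 = \omega/2$, since the excited states are suppressed by the additional factors $e^{-\beta(E_n - E_0)}$.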
Harmonic Oscillator Example

We will apply the Euclidean time formalism to the simplest example: the forced harmonic oscillator. Its Euclidean partition function is
$$Z_E[J] = \int\mathcal{D}q\,\exp\left\{-\frac{1}{2\hbar}\int dt\,\left[\left(\frac{dq}{dt}\right)^2 + \omega^2 q^2\right] + \frac{1}{\hbar}\int dt\,J(t)q(t)\right\}$$
$$= \int\mathcal{D}q\,\exp\left\{-\frac{1}{2\hbar}\int dt\,q(t)\left[-\frac{d^2}{dt^2} + \omega^2\right]q(t) + \frac{1}{\hbar}\int dt\,J(t)q(t)\right\}$$
$$= \mathcal{N}\exp\left\{\frac{1}{2\hbar}\int dt\int dt'\,J(t)\Delta_E(t,t')J(t')\right\}, \tag{28.45}$$
where in the second line we have used partial integration, now without boundary terms, since the Euclidean-time path integral is over periodic paths (without boundary), and in the third line the Gaussian integration is now well defined, as it is a real integral, of the type $\int dx\,e^{-S(x)}$, and not an oscillatory imaginary one, and the Euclidean-time propagator is defined as before through a Fourier transform in (Euclidean) energy $E_E$,
$$\Delta_E(t,t') = \left(-\frac{d^2}{dt^2} + \omega^2\right)^{-1}(t,t') = \int\frac{dE_E}{2\pi}\frac{e^{-iE_E(t-t')}}{E_E^2 + \omega^2}. \tag{28.46}$$
We finally note that there are no poles in this expression, since, for real $E_E$, $E_E^2 + \omega^2 > 0$. We have thus solved all three problems that we found with the result in Minkowski space: there are no boundary terms, the Gaussian integration is well defined, and there are no poles in the expression for the propagator.
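Since the integrand in (28.46) has no poles on the real axis, the Euclidean propagator can be evaluated by direct numerical integration. The sketch below assumes a trapezoid rule with a finite energy cut-off and compares with the closed form $e^{-\omega|\tau|}/(2\omega)$, $\tau = t - t'$, which follows from closing the contour on the pole at $E_E = \mp i\omega$; this closed form is a standard result, not quoted in the text, and the function names are hypothetical.

```python
import math

def delta_E_numeric(tau, omega=1.0, cutoff=200.0, n=200_000):
    """Trapezoid estimate of int dE/(2 pi) exp(-i E tau) / (E^2 + omega^2).

    Only the cosine part is summed: the sine part is odd in E and cancels.
    """
    h = 2*cutoff / n
    total = 0.0
    for k in range(n + 1):
        energy = -cutoff + k*h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.cos(energy*tau) / (energy*energy + omega**2)
    return total * h / (2*math.pi)

def delta_E_exact(tau, omega=1.0):
    """Closed form exp(-omega |tau|) / (2 omega)."""
    return math.exp(-omega*abs(tau)) / (2*omega)

if __name__ == "__main__":
    for tau in (-0.5, 1.0, 2.0):
        print(tau, delta_E_numeric(tau), delta_E_exact(tau))
```

The result is real and decays exponentially in $|\tau|$, in contrast with the oscillating Feynman propagator in Minkowski time.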
When Wick-rotating back to Minkowski time, via $Et = E_E t_E$, so that (since $t = -it_E$) $E_E = -iE$, we can consider the procedure as a rotation in the complex $E$ plane by $\pi/2$. However, for a Wick rotation of a full $\pi/2$ ($E_E = -iE$), we would be back to having poles in the integrand for $\Delta(t,t')$, since $E_E^2 + \omega^2 \to -E^2 + \omega^2$. To avoid obtaining a pole, we must instead do a rotation by $\pi/2 - \epsilon$, so
$$E_E \to e^{-i(\pi/2 - \epsilon)}E = -i(E + i\epsilon). \tag{28.47}$$
Then the propagator becomes
$$\Delta_E(t_E = it) = -i\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\frac{e^{-iEt}}{-E^2 + \omega^2 - i\epsilon} = \Delta_F(t), \tag{28.48}$$
so it turns exactly into the Feynman propagator from before.
28.3 Fermionic Path Integral

Until now, we have considered path integrals appropriate for bosons, as in the case of the particle with position $x(t)$, but that is not the only possibility. Indeed, we have seen that the path integral for a harmonic oscillator is best defined as a harmonic phase-space path integral, in terms of $a, a^\dagger$ instead of $q, p$. However, for fermions, we saw in Chapter 20 that we have anticommuting variables, with canonical quantization conditions
$$\{\psi_i,\psi_j\} = \{p_{\psi_i},p_{\psi_j}\} = 0, \qquad \{\psi_i,p_{\psi_j}\} = i\hbar\,\delta_{ij}, \tag{28.49}$$
and we can define fermionic raising and lowering operators $b, b^\dagger$, satisfying anticommutation relations
$$\{b,b\} = \{b^\dagger,b^\dagger\} = 0, \qquad \{b,b^\dagger\} = 1. \tag{28.50}$$
In this case, we can define a sort of “classical limit” for the fermions, by taking $\hbar \to 0$. This is not quite a classical limit, since strictly speaking there is no classical fermion: for fermions there is a single particle per state, and we need a (macroscopically) large number of particles in the same state to obtain a classical limit. But one defines the limit in the abstract, and leaves the interpretation for later. In this case, remembering that (like $a, a^\dagger$) we have defined $b, b^\dagger$ in terms of $q, p$ with coefficients that have $1/\sqrt{\hbar}$ in front, so in fact we have $b = \tilde b/\sqrt{\hbar}$, $b^\dagger = \tilde b^\dagger/\sqrt{\hbar}$, and then
$$\{\tilde b,\tilde b^\dagger\} = \hbar \to 0. \tag{28.51}$$
Together with the fact that $\{b,b\} = \{b^\dagger,b^\dagger\} = 0$, we see that the classical limit does not give the usual commuting functions but rather a Grassmann algebra, for anticommuting objects, with complex coefficients. This is defined in terms of regular, commuting objects (complex numbers), called “even” (or bosonic) elements, and anticommuting objects such as $b, b^\dagger$, the “odd” (or fermionic) part of the algebra. We then have the usual product relations, bose $\times$ bose = bose, fermi $\times$ fermi = bose and bose $\times$ fermi = fermi $\times$ bose = fermi. For example, the product of two odd objects is even, so that $[bb^\dagger, bb^\dagger] = 0$.

Having defined the Grassmann algebra for fermionic objects, we must next define the path integral over it.
Definitions

We consider a Grassmann algebra with $N$ objects $x_i$, $i = 1,\ldots,N$, $\{x_i,x_j\} = 0$, together with the identity 1, which commutes with the rest, $[x_i,1] = 0$. As we said, it is an algebra over the complex numbers, which means that the coefficients $c \in \mathbb{C}$. Since $(x_i)^2 = 0$, the Taylor expansion in these $x_i$ will end after $N$ terms,
$$f(\{x_i\}) = f^{(0)} + \sum_i f_i^{(1)}x_i + \sum_{i < j}f_{ij}^{(2)}x_i x_j + \sum_{i < j < k}f_{ijk}^{(3)}x_i x_j x_k + \cdots + f_{12\cdots N}^{(N)}x_1\cdots x_N. \tag{28.52}$$
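The defining relations of the Grassmann algebra are easy to model on a computer. The following is a minimal sketch, assuming a representation chosen only for illustration: an element is a dictionary mapping a sorted tuple of generator indices (a monomial $x_{i_1}\cdots x_{i_k}$ with $i_1 < \cdots < i_k$) to its complex coefficient, with the empty tuple standing for the identity. The names `gmul` and `gadd` are hypothetical.

```python
from itertools import product

def gadd(a, b):
    """Sum of two Grassmann elements; vanishing coefficients are dropped."""
    out = dict(a)
    for key, coeff in b.items():
        out[key] = out.get(key, 0) + coeff
    return {k: v for k, v in out.items() if v != 0}

def gmul(a, b):
    """Product of two Grassmann elements, using x_i x_j = - x_j x_i and x_i^2 = 0."""
    out = {}
    for (ia, ca), (ib, cb) in product(a.items(), b.items()):
        if set(ia) & set(ib):
            continue                      # a repeated generator gives x_i^2 = 0
        merged = list(ia + ib)
        sign = 1
        # Bubble sort into increasing order; each swap of neighbours costs a sign.
        for i in range(len(merged)):
            for j in range(len(merged) - 1 - i):
                if merged[j] > merged[j + 1]:
                    merged[j], merged[j + 1] = merged[j + 1], merged[j]
                    sign = -sign
        key = tuple(merged)
        out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}

if __name__ == "__main__":
    x1, x2, x3, x4 = ({(i,): 1} for i in (1, 2, 3, 4))
    print("x1 x2        =", gmul(x1, x2))
    print("x2 x1        =", gmul(x2, x1))
    print("{x1, x2}     =", gadd(gmul(x1, x2), gmul(x2, x1)))
    print("x1 x1        =", gmul(x1, x1))
    even_a, even_b = gmul(x1, x2), gmul(x3, x4)
    print("even elements commute:", gmul(even_a, even_b) == gmul(even_b, even_a))
```

In this representation a general element has at most $2^N$ coefficients, one for each subset of the $N$ generators, which is the counting behind the terminating expansion (28.52).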
i< j 1 then the state is called entangled, or nonseparable, whereas if n = 1 then the state is called unentangled, or separable. For the separable state |ψ = |ψ A ⊗ |ψ B ,
(30.24)
the density matrices of the subsystems are still of the pure state type, ρ A = |ψ Aψ A |, ρ B = |ψ B ψ B |.
(30.25)
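Returning to the spin 1/2 example, the entangled state has quantum correlations, since for instance in the $|\psi\rangle$ states, $|\uparrow_A\rangle$ implies $|\uparrow_B\rangle$ and $|\downarrow_A\rangle$ implies $|\downarrow_B\rangle$. But these quantum correlations cannot be created locally, in A or B independently, but rather have to involve interactions between A and B. On the other hand, $|\uparrow_A\rangle \otimes |\uparrow_B\rangle$ can be prepared by communicating classically (and separately) between A and B. In terms of a quantum transformation, however, we can entangle states by a unitary transformation $U_{AB}$ in $\mathcal{H}$ that is not of the product type $U_A \otimes U_B$, with $U_A$ and $U_B$ independent unitary transformations in $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. Indeed, we can act on the separable states $|1\rangle = |\uparrow_A \uparrow_B\rangle$ and $|2\rangle = |\downarrow_A \downarrow_B\rangle$ by rotating them,
$$ U_{AB} \begin{pmatrix} |1\rangle \\ |2\rangle \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} |1\rangle + |2\rangle \\ |1\rangle - |2\rangle \end{pmatrix} = \begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \end{pmatrix}, \tag{30.26} $$
in the same way that we act on individual up and down states in each subsystem to obtain
$$ U_A \begin{pmatrix} |\uparrow_A\rangle \\ |\downarrow_A\rangle \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} |\uparrow_A\rangle + |\downarrow_A\rangle \\ |\uparrow_A\rangle - |\downarrow_A\rangle \end{pmatrix}, \qquad U_B \begin{pmatrix} |\uparrow_B\rangle \\ |\downarrow_B\rangle \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} |\uparrow_B\rangle + |\downarrow_B\rangle \\ |\uparrow_B\rangle - |\downarrow_B\rangle \end{pmatrix}. \tag{30.27} $$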
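The distinction between the entangling rotation (30.26) and a local product rotation of the type (30.27) can be checked numerically. The following is a minimal sketch, assuming NumPy and a two-qubit basis ordering $|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle$; the helper name `schmidt_number` is introduced here for illustration only. It counts the nonzero Schmidt coefficients as the nonzero singular values of the coefficient matrix.

```python
import numpy as np

def schmidt_number(state, dim_a=2, dim_b=2, tol=1e-12):
    """Count the nonzero Schmidt coefficients of a bipartite pure state."""
    # Reshape the amplitudes c_{ab} into a dim_a x dim_b matrix; its
    # singular values are the Schmidt coefficients sqrt(p_i).
    coeffs = np.linalg.svd(state.reshape(dim_a, dim_b), compute_uv=False)
    return int(np.sum(coeffs > tol))

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

ket1 = np.kron(up, up)      # |1> = |up_A up_B>, separable
ket2 = np.kron(down, down)  # |2> = |down_A down_B>, separable

# Result of the non-product rotation U_AB of (30.26).
psi1 = (ket1 + ket2) / np.sqrt(2)
psi2 = (ket1 - ket2) / np.sqrt(2)

# Result of a product rotation U_A (x) U_B, each factor acting as in (30.27).
h = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
local = np.kron(h, h) @ ket1

for name, state in [("|1>", ket1), ("psi_1", psi1), ("psi_2", psi2), ("local", local)]:
    print(name, "Schmidt number:", schmidt_number(state))
```

Under these assumptions, the product rotation leaves the Schmidt number at 1, while the states $|\psi_{1,2}\rangle$ have Schmidt number 2.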
The maximally entangled states are $|\psi_{1,2}\rangle$ or $|\phi_{1,2}\rangle$, in the following sense. We can calculate the density matrix $\rho_A$, and find
$$ \rho_A = \mathrm{Tr}_B\, |\psi_{1,2}\rangle\langle\psi_{1,2}| = |a|^2\, |\uparrow_A\rangle\langle\uparrow_A| + |b|^2\, |\downarrow_A\rangle\langle\downarrow_A| = \frac{1}{2}\left( |\uparrow_A\rangle\langle\uparrow_A| + |\downarrow_A\rangle\langle\downarrow_A| \right), \tag{30.28} $$
and similarly for the $|\phi\rangle$ states,
$$ \rho_A = \mathrm{Tr}_B\, |\phi_{1,2}\rangle\langle\phi_{1,2}| = |a|^2\, |\uparrow_A\rangle\langle\uparrow_A| + |b|^2\, |\downarrow_A\rangle\langle\downarrow_A| = \frac{1}{2}\left( |\uparrow_A\rangle\langle\uparrow_A| + |\downarrow_A\rangle\langle\downarrow_A| \right), \tag{30.29} $$
and moreover $\rho_A = \rho_B$ for all these pure states, which means that
$$ \rho_A = \rho_B = \frac{1}{d}\,\mathbb{1}, \tag{30.30} $$
where d is the number of basis states in the Hilbert space of A or B (the dimension of the Hilbert space); here d = 2. Obtaining the above density matrices for the subsystems A or B, with equal probabilities in all states, is the definition of a maximally entangled state: namely, we find a maximally mixed state by taking the trace over a subsystem B.
The Schmidt basis decomposition is not unique, and, in order to define it uniquely, we would need more information. In the case of a maximally entangled state, with $\rho_A = \rho_B = (1/d)\mathbb{1}$, with probabilities $p_i = 1/d$ in each state, rotating the basis states $|i\rangle$ and $|j\rangle$ with a unitary matrix $U_{ij}$ in the Schmidt basis decomposition,
$$ |\psi\rangle_{AB} = \frac{1}{\sqrt{d}} \sum_{i,j} |i\rangle_A\, U_{ij}\, |j\rangle_B, \tag{30.31} $$
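gives the same $\rho_A = \rho_B$ density matrices. Moreover, rotating $|i\rangle$ and $|j\rangle$ with complex conjugate (so opposite) unitary matrices,
$$ |i\rangle_A = \sum_a |a\rangle_A V_{ai}, \qquad |i\rangle_B = \sum_b |b\rangle_B V^*_{bi}, \tag{30.32} $$
leads to the same state in the total Hilbert space,
$$ |\psi\rangle_{AB} = \frac{1}{\sqrt{d}} \sum_i |i\rangle_A \otimes |i\rangle_B = \frac{1}{\sqrt{d}} \sum_{i,a,b} |a\rangle_A V_{ai} \otimes |b\rangle_B V^\dagger_{ib} = \frac{1}{\sqrt{d}} \sum_a |a\rangle_A \otimes |a\rangle_B . \tag{30.33} $$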
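The invariance in (30.33) can be illustrated numerically. This is a small sketch, assuming NumPy, an arbitrarily chosen dimension d = 3, and a random unitary V generated by a QR decomposition (all of these are choices made here, not taken from the text). It checks that $V \otimes V^*$ leaves the maximally entangled state unchanged and that the reduced density matrix is $(1/d)\mathbb{1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # illustrative Hilbert space dimension for A and for B

# Random unitary V from the QR decomposition of a complex Gaussian matrix.
z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
v, _ = np.linalg.qr(z)

# |psi>_AB = (1/sqrt d) sum_i |i>_A (x) |i>_B, stored as a d*d vector.
psi = np.eye(d, dtype=complex).reshape(d * d) / np.sqrt(d)

# Rotate A with V and B with the complex conjugate V*, as in (30.32).
rotated = np.kron(v, v.conj()) @ psi

# Reduced density matrix of A: rho_A = M M^dagger with M the d x d coefficient matrix.
m = psi.reshape(d, d)
rho_a = m @ m.conj().T

print("state unchanged:", np.allclose(rotated, psi))
print("rho_A maximally mixed:", np.allclose(rho_a, np.eye(d) / d))
```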
30.4 Entanglement Entropy
Entanglement between the two subsystems of a system is associated with a kind of entropic measure, corresponding to the classical notion of entropy as relating to many possible outcomes, with various probabilities. The entropy in the classical case of various states i, arising with classical probabilities $p_i$, was defined by Gibbs as
$$ S = -k_B \sum_i p_i \ln p_i . \tag{30.34} $$
In this formula for the Gibbs entropy, the probabilities are distributed according to the Boltzmann distribution,
$$ p_i = \frac{1}{Z} \exp\left( -\frac{\epsilon_i}{k_B T} \right), \tag{30.35} $$
for energies $\epsilon_i$ for the states, so
$$ S = -k_B \sum_i p_i \left( -\frac{\epsilon_i}{k_B T} - \ln Z \right) = \sum_i \frac{p_i \epsilon_i}{T} + k_B \ln Z, \tag{30.36} $$
which varies as
$$ \delta S = \sum_i \frac{\delta p_i\, \epsilon_i}{T} + k_B \sum_i \delta p_i \ln Z = \sum_i \frac{\delta p_i\, \epsilon_i}{T}, \tag{30.37} $$
since $\sum_i p_i = 1$ is preserved by the variation, and this implies that $\sum_i \delta p_i = 0$. In the case of a quantum system, we can define an entropy that extends the classical Gibbs entropy, the von Neumann entropy (for $k_B = 1$)
$$ S = -\mathrm{Tr}\, \rho \ln \rho, \tag{30.38} $$
where ρ is the density matrix, and ln ρ is understood as the infinite Taylor series applied for a matrix (the same as for the exponential function). For a density matrix that is diagonal,
$$ \rho = \sum_i p_i\, |i\rangle\langle i|, \tag{30.39} $$
the von Neumann entropy reduces to the Gibbs formula,
$$ S = -\sum_i p_i \ln p_i . \tag{30.40} $$
In particular, for a pure state $|\psi\rangle$ (a single quantum state), thus for
$$ \rho = |\psi\rangle\langle\psi|, \tag{30.41} $$
the probabilities are $p_i = \delta_{i1}$, so the entropy vanishes, S = 0, as expected. This is the minimum value of S. On the other hand, for the density matrix corresponding to the most entropic situation, with all states having equal probabilities,
$$ \rho = \frac{1}{d}\,\mathbb{1}, \tag{30.42} $$
we obtain the maximum value of the entropy,
$$ S_{\max} = -\sum_i p_i \ln p_i = -d\, \frac{1}{d} \ln \frac{1}{d} = \ln d . \tag{30.43} $$
In particular, for a two-state system, with d = 2, we have $S_{\max} = \ln 2 \simeq 0.69$.
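The two limiting cases of the von Neumann entropy can be reproduced with a few lines of code. The following is a minimal sketch, assuming NumPy; the function name `von_neumann_entropy` and the eigenvalue cutoff `tol` (which implements the convention $0 \ln 0 = 0$) are choices made here for illustration.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S = -Tr(rho ln rho), computed from the eigenvalues of a Hermitian rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]  # drop zero eigenvalues: 0 ln 0 is taken as 0
    return float(-np.sum(evals * np.log(evals)))

d = 2
pure = np.zeros((d, d))
pure[0, 0] = 1.0        # rho = |psi><psi| with p_i = delta_{i1}
mixed = np.eye(d) / d   # rho = (1/d) 1

s_pure = von_neumann_entropy(pure)
s_max = von_neumann_entropy(mixed)
print("pure state entropy is zero:", abs(s_pure) < 1e-12)
print("maximally mixed entropy equals ln d:", np.isclose(s_max, np.log(d)))
```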
We can now specialize to the case when we have a total system that can be divided into two systems A and B, $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, whether the division is physical (some sort of separation, either a wall, or a distance) or not. Then we define as before
$$ \rho_A = \mathrm{Tr}_B\, \rho_{\rm tot}, \qquad \rho_B = \mathrm{Tr}_A\, \rho_{\rm tot}, \tag{30.44} $$
and we define the entanglement entropy of A in the system as the von Neumann entropy of $\rho_A$,
$$ S_A = -\mathrm{Tr}\, \rho_A \ln \rho_A . \tag{30.45} $$
This, however, is not the usual extensive entropy, since for instance for a pure state $|\psi\rangle_{AB}$, with total density matrix $\rho_{\rm tot} = |\psi\rangle_{AB}\; {}_{AB}\langle\psi|$, the entanglement entropy is the same for A and B,
$$ \rho_A = \rho_B \;\Rightarrow\; S_A = S_B, \tag{30.46} $$
contradicting the extensive property. Going back to the generic von Neumann entropy, it has the following properties:
• It is invariant under unitary transformations, $|i\rangle \to U|i\rangle$, implying $\rho \to U\rho U^\dagger = U\rho U^{-1}$. Indeed, then
$$ S(\rho) \to -\mathrm{Tr}\left[ U\rho U^{-1} \ln\left( U\rho U^{-1} \right) \right] = S(\rho), \tag{30.47} $$
where we have used the fact that the natural log is written as an infinite Taylor series, leading to $\ln(U\rho U^{-1}) = U (\ln\rho)\, U^{-1}$.
• It is additive for independent systems $\rho_A$, $\rho_B$ (it is not additive for interacting subsystems of a larger system, as we have seen):
$$ S(\rho_A \otimes \rho_B) = S(\rho_A) + S(\rho_B). \tag{30.48} $$
• It has strong subadditivity. Considering three systems A, B, C, we have
$$ S_{ABC} + S_B \le S_{AB} + S_{BC}. \tag{30.49} $$
This is less obvious, and the proof of this property is quite involved. We also obtain that
$$ S_A + S_C \le S_{AB} + S_{BC}. \tag{30.50} $$
Putting B = 0 (a trivial system B) in the first relation, we obtain the standard subadditivity relation,
$$ S_{AC} \le S_A + S_C. \tag{30.51} $$
Since the entanglement entropy is the von Neumann entropy of a subsystem, it also satisfies the strong subadditivity inequality, as the inequality does not depend on S A being calculated from a subsystem of a system.
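The inequalities (30.49)–(30.51) can be spot-checked numerically. The sketch below assumes NumPy and uses a randomly chosen four-qubit pure state, with the fourth qubit acting only as an environment so that the state of A, B, C (qubits 0, 1, 2) is mixed; the helper names and the random state are illustrative choices, and a single numerical check is of course not a proof.

```python
import numpy as np

def entropy(rho, tol=1e-12):
    """Von Neumann entropy -Tr(rho ln rho) from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]
    return float(-np.sum(evals * np.log(evals)))

def reduced_density_matrix(state, keep, n_qubits):
    """Trace out every qubit not listed in `keep` from a pure n-qubit state."""
    psi = state.reshape([2] * n_qubits)
    traced = [q for q in range(n_qubits) if q not in keep]
    # Put the kept qubits first, then flatten to a (kept, traced) matrix.
    psi = np.transpose(psi, list(keep) + traced).reshape(2 ** len(keep), -1)
    return psi @ psi.conj().T

rng = np.random.default_rng(1)
n = 4  # qubits 0, 1, 2 play the roles of A, B, C; qubit 3 is an environment
state = rng.normal(size=2 ** n) + 1j * rng.normal(size=2 ** n)
state /= np.linalg.norm(state)

def s(*keep):
    return entropy(reduced_density_matrix(state, list(keep), n))

s_a, s_b, s_c = s(0), s(1), s(2)
s_ab, s_bc, s_ac, s_abc = s(0, 1), s(1, 2), s(0, 2), s(0, 1, 2)

eps = 1e-12  # numerical tolerance
print("strong subadditivity (30.49):", s_abc + s_b <= s_ab + s_bc + eps)
print("second form (30.50):", s_a + s_c <= s_ab + s_bc + eps)
print("subadditivity (30.51):", s_ac <= s_a + s_c + eps)
```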
30.5 The EPR Paradox and Hidden Variables
The EPR paradox was put forward in the 1935 Einstein–Podolsky–Rosen paper, which described a “paradoxical situation”, with correlations over potentially infinite distances. The original paper dealt with measurements of p and x, but the “paradox” is easier to understand in the situation of measuring spins 1/2, in line with what we have already done in this chapter. This description was put forward by David Bohm.

Figure 30.1 The EPR paradox from entanglement. What is different in quantum mechanics is that the measurement of the spin is in any direction. “OR” means that the upper and lower parts of the figure are alternatives. The encircled cross indicates that the spins are connected via a tensor product.

We consider a physical situation in which a particle of total spin 0 decays (or disintegrates) into two particles of spin 1/2, in which case the total final spin, the vector sum of the two spins 1/2, must vanish. If the particles continue to move apart, even after they are at a large (macroscopic) distance, we still need to have the total spin zero, since the total state is entangled. An example of the above decay is the decay of the spin zero η particle into two oppositely charged muons,
$$ \eta \to \mu^+ + \mu^- . \tag{30.52} $$
The result of the decay is then a state of the type $|\phi\rangle$,
$$ |\phi_{1,2}\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow_A \downarrow_B\rangle \pm |\downarrow_A \uparrow_B\rangle \right), \tag{30.53} $$
which means that if Alice measures the spin projection, she knows for sure what Bob would measure if later he measured the spin projection in the same direction; see Fig. 30.1. However, if this were all that there was to say, the situation would not be quantum mechanical in nature, since in a classical measurement we could get the same thing. Consider a measurement that corresponds to Alice picking out at random a ball from a hat containing a black ball (spin up) and a white ball (spin down). Then, once Alice sees (measures) which ball (spin) she has, she knows what Bob will measure when he subsequently pulls the remaining ball (spin) from the hat. What is different about quantum mechanics, though, is that we have total spin zero for the spin projection in any direction, for instance in the x and z directions. But, on the other hand, the states of given spin in x can be decomposed as linear combinations of the states of given spin in z,
$$ |S_x; \pm\rangle = \frac{1}{\sqrt{2}} \left( |S_z +\rangle \pm |S_z -\rangle \right), \tag{30.54} $$
with inverse
$$ |S_z; \pm\rangle = \frac{1}{\sqrt{2}} \left( |S_x +\rangle \pm |S_x -\rangle \right). \tag{30.55} $$
Then the entangled spin-singlet total state for spin measured along the z direction,
$$ |\phi\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |S_z +\rangle_A \otimes |S_z -\rangle_B - |S_z -\rangle_A \otimes |S_z +\rangle_B \right), \tag{30.56} $$
can be rewritten in terms of the spin-singlet total state for spin measured along the x direction,
$$ |\phi\rangle_{AB} = -\frac{1}{\sqrt{2}} \left( |S_x +\rangle_A \otimes |S_x -\rangle_B - |S_x -\rangle_A \otimes |S_x +\rangle_B \right). \tag{30.57} $$
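The change of basis from (30.56) to (30.57), including the overall minus sign, can be verified directly. The following is a minimal sketch, assuming NumPy and the column-vector representation $|S_z +\rangle = (1, 0)$, $|S_z -\rangle = (0, 1)$; the function name `singlet` is introduced here for illustration. It also computes the probability 1/2 for Bob to find spin up along x after Alice has found spin up along z.

```python
import numpy as np

z_up = np.array([1.0, 0.0])
z_down = np.array([0.0, 1.0])

# States of given spin along x, as in (30.54).
x_up = (z_up + z_down) / np.sqrt(2)
x_down = (z_up - z_down) / np.sqrt(2)

def singlet(plus, minus):
    """(|+>_A |->_B - |->_A |+>_B) / sqrt(2) for a given pair of basis states."""
    return (np.kron(plus, minus) - np.kron(minus, plus)) / np.sqrt(2)

phi_z = singlet(z_up, z_down)   # right-hand side of (30.56)
phi_x = singlet(x_up, x_down)   # bracket of (30.57), without the sign

print("phi_z equals -phi_x:", np.allclose(phi_z, -phi_x))

# If Alice finds spin up along z, Bob's spin is left in |S_z ->;
# his probability to then find spin up along x is |<S_x +|S_z ->|^2.
p_bob_x_up = abs(np.dot(x_up, z_down)) ** 2
print("P(Bob finds x up):", p_bob_x_up)
```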
The minus sign in front is irrelevant, since it represents the same state. Then the expression above shows the ambiguity of the Schmidt basis for a maximally entangled state such as $|\phi\rangle_{AB}$. And the physical consequence is that, unlike in the classical case, Alice and Bob can measure the spin in different directions, and still obtain probabilistic results. For instance, Alice can measure the spin on z, and afterwards Bob can measure the spin on x, in which case Bob will obtain spin up or down with the same probability 1/2. This is then definitely not a classical result!
The measurement by Alice of a spin in A, which dictates what Bob would find if he then measures the same spin in the B system, seems to suggest an instantaneous (with v > c) travel of information, so is the information transmitted at superluminal speeds? The answer is NO. The point is that it is not a message (information) being transmitted, since we have no way of determining whether Bob actually made that measurement (which would amount to information about Bob’s actions). It would seem as though we would need something like a simultaneous measurement on x and z, which is not quite possible.
To generalize the previous case, consider an arbitrary direction $\vec n$; changing $|\phi\rangle$ states to $|\psi\rangle$ states for simplicity, we find, by changing the states on A with the unitary matrix V, and the states on B with $V^\dagger$,
$$ |\psi\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |S_z +\rangle_A \otimes |S_z +\rangle_B + |S_z -\rangle_A \otimes |S_z -\rangle_B \right) = \frac{1}{\sqrt{2}} \left( |S_n +\rangle_A \otimes |S_n +\rangle_B + |S_n -\rangle_A \otimes |S_n -\rangle_B \right). \tag{30.58} $$
Given this transformation of the entangled state between the directions of spin measured, we might hope that we could use the relation to send signals, with the message encoded in whichever direction has been used for measurement by Bob (for instance, x or z). However, in fact there is no way to do this, as we can check. The only way to do it is to use extra information, which needs to be passed between Alice and Bob classically, so it must travel at subluminal speeds. In conclusion, entanglement does not imply a violation of causality, as information cannot be exchanged at superluminal speeds. In this sense, the EPR experiment is not paradoxical. Einstein knew this, as shown in the EPR paper. However, he felt that there was a paradox nevertheless, since he defined a theory as something with a complete description of physical reality, which would need to satisfy the stronger condition later called Einstein locality, or local realism, namely that: An action on A must not modify the description of B. Clearly the EPR experiment involves a paradox, as it breaks the Einstein locality criterion: while information does not travel superluminally, measurement of A changes the description of B by collapsing the state of the system to a state of given spin for B. Einstein then devised possible descriptions of the EPR paradox experiment that mimic quantum mechanics but satisfy Einstein locality, by having some classical description, with some hidden variables (that cannot be known by experiment). The description is classical, but the appearance of quantum probabilities is due to our not knowing some degrees of freedom, the hidden variables, which can for instance be some λ ∈ (0, 1). It seemed as though this was a solution, and quantum mechanics would not be needed.
But Bell showed that we can distinguish experimentally between a hidden variable theory and quantum mechanics, by calculating the satisfaction or not of some inequalities, called Bell’s inequalities, which will be studied in the next chapter.
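The statement that entanglement cannot be used to send signals can be illustrated with the singlet state: whatever direction Alice measures along, Bob's density matrix, averaged over Alice's unknown outcomes, remains $\frac{1}{2}\mathbb{1}$. The following is a minimal sketch, assuming NumPy and projective measurements with the projectors $(1 \pm \vec n \cdot \vec\sigma)/2$; the function names are hypothetical and introduced only for this illustration.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z
]

def projector(n, sign):
    """Projector (1 + sign * n.sigma)/2 onto spin up/down along the unit vector n."""
    n_dot_sigma = sum(n_i * s_i for n_i, s_i in zip(n, SIGMA))
    return 0.5 * (np.eye(2) + sign * n_dot_sigma)

def bob_state_after_alice(rho, n):
    """Bob's density matrix after Alice measures along n, summed over her outcomes."""
    bob = np.zeros((2, 2), dtype=complex)
    for sign in (+1, -1):
        e = np.kron(projector(n, sign), np.eye(2))
        post = e @ rho @ e  # unnormalized post-measurement state for this outcome
        # Partial trace over A: indices are (a, b, a', b'), sum over a = a'.
        bob += np.einsum("abad->bd", post.reshape(2, 2, 2, 2))
    return bob

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
rho = np.outer(singlet, singlet.conj())

bob_if_z = bob_state_after_alice(rho, (0.0, 0.0, 1.0))
bob_if_x = bob_state_after_alice(rho, (1.0, 0.0, 0.0))
print("same state for Bob:", np.allclose(bob_if_z, bob_if_x))
print("maximally mixed:", np.allclose(bob_if_z, np.eye(2) / 2))
```

Since Bob's local statistics do not depend on Alice's choice of axis, he cannot infer that choice without additional classical communication.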
Important Concepts to Remember
• In the case of two (sub)systems A and B, a separable state is a state of tensor product form, $|\psi\rangle_A \otimes |\chi\rangle_B$, while an entangled state is a state in $\mathcal{H}_A \otimes \mathcal{H}_B$ that cannot be written as such a product (is not separable).
• The formal definition of entanglement comes from the Schmidt decomposition of the bipartite state $|\psi\rangle_{AB}$, as $\sum_i \sqrt{p_i}\, |i\rangle_A \otimes |i\rangle_B$. If the Schmidt number n (the number of nonzero eigenvalues $p_i$) is > 1, we have an entangled state, and if n = 1, a separable state.
• We can entangle states by a unitary transformation $U_{AB}$ that is not a product of unitaries $U_A \otimes U_B$.
• A maximally entangled state is a state that, when traced over either A or B, leads to a constant (and equal) density matrix, $\rho_A = \rho_B = \frac{1}{d}\mathbb{1}$.
• The quantum mechanical von Neumann entropy, $S_{vN} = -\mathrm{Tr}\,\rho \ln \rho$, extends the classical Gibbs entropy $S = -k_B \sum_i p_i \ln p_i$.
• For a system formed by two subsystems, we define the entanglement entropy as the von Neumann entropy of A, $S_E = -\mathrm{Tr}\,\rho_A \ln \rho_A$, where $\rho_A = \mathrm{Tr}_B\, \rho_{\rm tot}$ and $\rho_B = \mathrm{Tr}_A\, \rho_{\rm tot}$.
• The EPR “paradox” arises when we make a measurement in A of an entangled state, thus modifying what B can see. In the quantum spin 1/2 case, unlike the classical case, knowing the spin in one direction means a given state for the spin in another direction, thus still implies something nontrivial for a different measurement.
• In the EPR experiment information does not travel at superluminal speeds, but a measurement of A changes the description of B, thus we violate Einstein locality, or local realism.
Further Reading See Preskill’s lecture notes on quantum information [12].
Exercises
(1) Consider the state $|\chi_{1,2}\rangle = |\psi_{1,2}\rangle + a|\phi_{1,2}\rangle$, for an arbitrary $a \in \mathbb{R}$. When is it separable? Calculate $\rho_A$ and $\rho_B$ to check that you find the correct results.
(2) Apply Schmidt decomposition to the general case in exercise 1.
(3) Consider the same general state as that in exercise 1, in the generic entangled case. What is the unitary matrix that disentangles them?
(4) Consider the density matrix
$$ \rho = C \left( |1\rangle\langle 2| + |2\rangle\langle 1| + a|2\rangle\langle 2| + |2\rangle\langle 3| + |3\rangle\langle 2| \right), \tag{30.59} $$
where $|1\rangle, |2\rangle, |3\rangle$ are orthonormal states. Calculate C, and thus the von Neumann entropy of this mixed state.
(5) For the generic state in exercise 1, calculate the entanglement entropy.
(6) Consider three spin 1/2 systems, A, B, and C, and a state
$$ C \left( |\uparrow\uparrow\uparrow\rangle + |\downarrow\uparrow\downarrow\rangle + |\uparrow\downarrow\uparrow\rangle \right). \tag{30.60} $$
Check that the strong subadditivity condition is satisfied.
(7) Consider an EPR state $|\phi\rangle_{AB}$; Alice measures the spin on z, then Bob measures it on x, and then Alice measures it again on z. Classify the possible answers for the second measurement of Alice, with their probabilities, depending on the initial measurement of Alice (and without knowledge of the measurement of Bob). What do you deduce?
31 The Interpretation of Quantum Mechanics and Bell’s Inequalities
In this chapter we describe the Bell inequalities that apply for hidden variable theories in EPR paradox-type situations; these inequalities are violated by the quantum mechanical results, thus providing an experimental test, whose outcome has confirmed quantum mechanics as the correct description in all cases. Finally, we describe some of the leading interpretations of quantum mechanics.
31.1 Bell’s Original Inequality
As we described at the end of the previous chapter, Einstein considered a deterministic theory, a “hidden variable” theory, in which there are some variables that cannot be observed that generate the observed statistical results in an a priori deterministic model. The assumption is that the hidden variable theory will reproduce the quantum mechanical results perfectly. But, in his seminal paper, John Bell showed that such a hidden variable model would lead to an inequality that could be violated by quantum mechanics, and thus could be experimentally distinguished from it.
Consider measuring the reduced spin, $\vec\sigma \equiv 2\vec S/\hbar$, in two different directions, defined by unit vectors $\vec a$ and $\vec b$, so we have $(\vec\sigma \cdot \vec a)$ for system A, i.e., measured by Alice, and $(\vec\sigma \cdot \vec b)$ for system B, i.e., measured by Bob. According to the hidden variable model, with hidden variable λ, Alice measures $A(\vec a; \lambda)$ and Bob measures $B(\vec b; \lambda)$, obtaining possible values A = ±1 and B = ±1. In the case of Bell’s original inequality, applied to David Bohm’s (now standard) version of the EPR paradox, with a system of total spin 0, meaning that if $\vec a = \vec b$ (Alice and Bob measure spin in the same direction) then A = −B (and if $\vec a = -\vec b$, then A = B), we thus get
$$ A(\vec a; \lambda) = -B(\vec a; \lambda). \tag{31.1} $$
The hidden variable λ appears (since we cannot measure it) as a statistical ensemble, with probability distribution ρ(λ), normalized to one,
$$ \int_\Lambda d\lambda\, \rho(\lambda) = 1. \tag{31.2} $$
Then define, in this hidden variable theory, the correlation function of the measurements of Alice and Bob,
$$ C_{\rm hid}(\vec a, \vec b) = \int_\Lambda d\lambda\, \rho(\lambda)\, A(\vec a; \lambda) B(\vec b; \lambda). \tag{31.3} $$
Given (31.1), we rewrite this as
$$ C_{\rm hid}(\vec a, \vec b) = -\int_\Lambda d\lambda\, \rho(\lambda)\, A(\vec a; \lambda) A(\vec b; \lambda). \tag{31.4} $$
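In order to obtain a relevant inequality, we consider a situation involving spin measurements in three directions $\vec a, \vec b, \vec c$. We then calculate
$$ C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c) = -\int_\Lambda d\lambda\, \rho(\lambda) \left[ A(\vec a; \lambda) A(\vec b; \lambda) - A(\vec a; \lambda) A(\vec c; \lambda) \right] = \int_\Lambda d\lambda\, \rho(\lambda)\, A(\vec a; \lambda) A(\vec b; \lambda) \left[ A(\vec b; \lambda) A(\vec c; \lambda) - 1 \right], \tag{31.5} $$
where in the second equality we have used the fact that A = ±1, so $[A(\vec b; \lambda)]^2 = 1$, which we introduced in the second term. Since A = ±1, this means that also $A(\vec a; \lambda) A(\vec b; \lambda) = \pm 1$. Then we obtain an inequality,
$$ |C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c)| = \left| \int_\Lambda d\lambda\, \rho(\lambda)\, A(\vec a; \lambda) A(\vec b; \lambda) \left[ A(\vec b; \lambda) A(\vec c; \lambda) - 1 \right] \right| \le \int_\Lambda d\lambda\, \rho(\lambda) \left| \left[ A(\vec b; \lambda) A(\vec c; \lambda) - 1 \right] A(\vec a; \lambda) A(\vec b; \lambda) \right| = \int_\Lambda d\lambda\, \rho(\lambda) \left| A(\vec b; \lambda) A(\vec c; \lambda) - 1 \right| . \tag{31.6} $$
But ρ(λ) ≥ 0 and (since A = ±1)
$$ 1 - A(\vec b; \lambda) A(\vec c; \lambda) \ge 0, \tag{31.7} $$
which means we obtain
$$ |C_{\rm hid}(\vec a, \vec b) - C_{\rm hid}(\vec a, \vec c)| \le \int_\Lambda d\lambda\, \rho(\lambda) \left[ 1 - A(\vec b; \lambda) A(\vec c; \lambda) \right] = 1 + C_{\rm hid}(\vec b, \vec c), \tag{31.8} $$
where in the second equality we used the normalization $\int_\Lambda d\lambda\, \rho(\lambda) = 1$ and the definition of $C_{\rm hid}$. The resulting inequality is called Bell’s (original) inequality.
On the other hand, in quantum mechanics, we define the corresponding correlation function of the measurements of Alice and Bob. Since in this case $A = (\vec\sigma \cdot \vec a)^{(A)}$ and $B = (\vec\sigma \cdot \vec b)^{(B)}$, we obtain the expectation value of the product AB, in a state $|\psi\rangle$ for the total Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$,
$$ C_{QM}(\vec a, \vec b) = \langle\psi| (\vec\sigma \cdot \vec b)^{(B)} (\vec\sigma \cdot \vec a)^{(A)} |\psi\rangle. \tag{31.9} $$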
This correlation function will violate the Bell inequality, showing that quantum mechanics is different from the hidden variable theory. We choose an entangled state of total spin 0, meaning a $|\phi\rangle$ state, in the nomenclature of the previous chapter, and one that is antisymmetric in the opposite spins for A and B, uniquely identifying the state as $|\phi_2\rangle$,
$$ |\psi\rangle \equiv |\phi_-\rangle = |\phi_2\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow_A \downarrow_B\rangle - |\downarrow_A \uparrow_B\rangle \right). \tag{31.10} $$
But, as we have seen, for such a state the density matrix obtained for the subsystem A by finding the trace over B is $\rho_A = \frac{1}{2}\mathbb{1}$. Then we obtain for the quantum mechanical correlation function
$$ C_{QM}(\vec a, \vec b) = \langle\psi| (\vec\sigma \cdot \vec b)^{(B)} (\vec\sigma \cdot \vec a)^{(A)} |\psi\rangle = -\langle\psi| (\vec\sigma \cdot \vec b)^{(A)} (\vec\sigma \cdot \vec a)^{(A)} |\psi\rangle = -a_i b_j \langle\psi| \sigma_j^{(A)} \sigma_i^{(A)} |\psi\rangle = -a_i b_j\, \mathrm{Tr}_A \left( \rho_A\, \sigma_j^{(A)} \sigma_i^{(A)} \right) = -\vec a \cdot \vec b = -\cos\theta(\vec a, \vec b). \tag{31.11} $$
In the fourth equality, we have used the fact that, since there is no operator acting on the B subsystem, the correlation function equals the result obtained from the expectation value in A for the density matrix of the subsystem A. In the fifth equality we have used the fact that
$$ \mathrm{Tr}_A \left( \rho_A\, \sigma_j^{(A)} \sigma_i^{(A)} \right) = \frac{1}{2} \mathrm{Tr}_A \left( \sigma_j^{(A)} \sigma_i^{(A)} \right) = \delta_{ij}, \tag{31.12} $$
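and in the last equality we have defined $\theta(\vec a, \vec b)$ as the angle between the two unit vectors. In order to easily see that this correlation function violates the Bell inequalities, we consider the simpler case of small angles. First, if
$$ |\theta(\vec b, \vec c)| \ll 1, \tag{31.13} $$
then the right-hand side of the Bell inequality in the quantum mechanics case is
$$ 1 + C_{QM}(\vec b, \vec c) = 1 - \cos\theta(\vec b, \vec c) \simeq \frac{1}{2}\theta^2(\vec b, \vec c). \tag{31.14} $$
Then, if
$$ |\theta(\vec a, \vec b)| \ll 1, \qquad |\theta(\vec a, \vec c)| \ll 1 \tag{31.15} $$
as well, we obtain that the quantum mechanical version of the Bell inequality becomes
$$ \frac{1}{2} \left| \theta^2(\vec a, \vec b) - \theta^2(\vec a, \vec c) \right| \le \frac{1}{2} \theta^2(\vec b, \vec c). \tag{31.16} $$

Figure 31.1 Set-up for violation of the Bell inequalities: coplanar unit vectors $\vec a, \vec b, \vec c$, with the angles $\theta(\vec a, \vec b)$, $\theta(\vec b, \vec c)$, and $\theta(\vec a, \vec c)$ indicated.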
However, this relation can be easily violated. Consider coplanar unit vectors $\vec a, \vec b, \vec c$; see Fig. 31.1. Then $\theta(\vec a, \vec c) = \theta(\vec a, \vec b) + \theta(\vec b, \vec c)$, so the inequality becomes
$$ \left| \left( \theta(\vec a, \vec b) + \theta(\vec b, \vec c) \right)^2 - \theta^2(\vec a, \vec b) \right| = \theta^2(\vec b, \vec c) + 2\theta(\vec b, \vec c)\theta(\vec a, \vec b) \le \theta^2(\vec b, \vec c), \tag{31.17} $$
which is clearly violated.
In principle, this Bell inequality can be tested experimentally. Alice measures A and gets ±, and Bob measures B and gets ±. Since the correlation function is a sum of products AB weighted by probabilities (so sum over events, and divide by the total number of events), or statistical expectation value, then A = +, B = + and A = −, B = − both contribute with a plus in the sum, whereas A = +, B = − and A = −, B = + both contribute with a minus in the sum, so the experimental correlation function is (here the Ns are numbers of events)
$$ C_{\rm exp}(\vec a, \vec b) = \frac{N_{++} + N_{--} - N_{+-} - N_{-+}}{N_{++} + N_{--} + N_{+-} + N_{-+}} . \tag{31.18} $$
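Bell's original inequality (31.8) and its quantum mechanical violation can be made concrete with a short numerical check that does not rely on the small-angle expansion. The sketch below assumes only the Python standard library; the function names `bell_gap` and `c_exp`, the coplanar angles of π/3, and the event counts in the last line are illustrative choices, not data from the text. For each fixed λ, a hidden variable theory assigns A(a), A(b), A(c) = ±1 with B = −A, so it is enough to enumerate the eight deterministic assignments.

```python
import itertools
import math

def bell_gap(c_ab, c_ac, c_bc):
    """(1 + C(b,c)) - |C(a,b) - C(a,c)|; a negative value violates (31.8)."""
    return (1 + c_bc) - abs(c_ab - c_ac)

# Hidden variables: at fixed lambda the correlations are C(x,y) = -A(x) A(y).
deterministic_gaps = [
    bell_gap(-a * b, -a * c, -b * c)
    for a, b, c in itertools.product([+1, -1], repeat=3)
]
print("smallest hidden-variable gap:", min(deterministic_gaps))

# Quantum mechanics: C(x,y) = -cos(theta(x,y)), coplanar a, b, c with
# theta(a,b) = theta(b,c) = pi/3 and theta(a,c) = 2 pi/3 (an illustrative choice).
theta = math.pi / 3
qm_gap = bell_gap(-math.cos(theta), -math.cos(2 * theta), -math.cos(theta))
print("quantum gap is negative:", qm_gap < 0)

def c_exp(n_pp, n_mm, n_pm, n_mp):
    """Experimental correlation function (31.18) from the four event counts."""
    return (n_pp + n_mm - n_pm - n_mp) / (n_pp + n_mm + n_pm + n_mp)

# Hypothetical counts, chosen only to exercise the formula.
print("C_exp for made-up counts:", c_exp(25, 25, 75, 75))
```

Because any probability distribution ρ(λ) is an average over such assignments, a non-negative gap for every assignment implies (31.8) for the whole ensemble.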
We could then see if the Bell inequality is experimentally violated or not. Unfortunately, the case of perfectly anti-aligned spins for A and B is hard to measure, so we also need other Bell inequalities that are easier to measure. Before that, however, we will derive other, even simpler to understand, Bell inequalities, though they are less relevant for the possible hidden variable theories (they eliminate fewer hidden variable theories).
31.2 Bell–Wigner Inequalities
We now consider the Bell inequalities in a simple model by Wigner from 1970 [17]. This is also described in Modern Quantum Mechanics by Sakurai and Tuan [1], and a variant of the same was presented by Preskill in his lecture notes [12].
Consider the case of a hidden variable class of models, considered by Wigner, in which we have classically random emitters of spins, which emit spins of a given value in several directions at the same time (though it is impossible to measure the spin in several directions simultaneously, consistent with experimental results that agree with quantum mechanics). Suppose that in the direction of unit vector $\vec a$ we get the possible values $\sigma_a = \pm$, and in the direction of unit vector $\vec b$ we get the possible values $\sigma_b = \pm$. All are pre-ordained, but not knowable. For the EPR experiment, Alice and Bob must always measure opposite spins in a given direction. Then consider states that have given spins in two directions, $(\vec a\,\sigma_a, \vec b\,\sigma_b)$, or in three directions $(\vec a\,\sigma_a, \vec b\,\sigma_b, \vec c\,\sigma_c)$, etc. The EPR assumption means that Alice and Bob measure totally opposite states. For spins in three directions, we obtain Table 31.1 for the various possibilities, each with an associated number $N_i$ of events, thus with probability
$$ P_i = \frac{N_i}{\sum_k N_k} . \tag{31.19} $$
We denote by P(a σa , bσb ) the probability for Alice to measure σa on a and for Bob to measure σb on b.
Table 31.1 Possible measurements by Alice and Bob, and the corresponding numbers of events.

Numbers of events    Alice (system A): σa σb σc    Bob (system B): σa σb σc
N1                   + + +                         − − −
N2                   + + −                         − − +
N3                   + − +                         − + −
N4                   + − −                         − + +
N5                   − + +                         + − −
N6                   − + −                         + − +
N7                   − − +                         + + −
N8                   − − −                         + + +
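Then, by inspection of Table 31.1, we see that $P(\vec a+, \vec b+)$ corresponds to $N_3$ and $N_4$, $P(\vec a+, \vec c+)$ corresponds to $N_2$ and $N_4$, and $P(\vec c+, \vec b+)$ corresponds to $N_3$ and $N_7$, so we obtain
$$ P(\vec a+, \vec b+) = P_3 + P_4 = \frac{N_3 + N_4}{\sum_k N_k}, \qquad P(\vec a+, \vec c+) = P_2 + P_4 = \frac{N_2 + N_4}{\sum_k N_k}, \qquad P(\vec c+, \vec b+) = P_3 + P_7 = \frac{N_3 + N_7}{\sum_k N_k} . \tag{31.20} $$
But then, since
$$ N_3 + N_4 \le (N_2 + N_4) + (N_3 + N_7), \tag{31.21} $$
it means that we obtain the inequality
$$ P(\vec a+, \vec b+) \le P(\vec a+, \vec c+) + P(\vec c+, \vec b+), \tag{31.22} $$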
which is one form of the Bell–Wigner inequality. Other inequalities are possible. One example is obtained by defining the probability to obtain the same result for the same observer (say, Alice) in a measurement of two directions i, j, chosen among $\vec a, \vec b, \vec c$, the probability being denoted by $P_{\rm same}(i, j)$. Note that this probability corresponds to having opposite spins (as in the EPR paradox) measured by Alice and Bob on the same pair of directions. Then, again by a simple inspection of Table 31.1 (selecting the rows in which Alice's two values coincide), we obtain that
$$ P_{\rm same}(\vec a, \vec b) = \frac{N_1 + N_2 + N_7 + N_8}{\sum_k N_k}, \qquad P_{\rm same}(\vec a, \vec c) = \frac{N_1 + N_3 + N_6 + N_8}{\sum_k N_k}, \qquad P_{\rm same}(\vec b, \vec c) = \frac{N_1 + N_4 + N_5 + N_8}{\sum_k N_k} . \tag{31.23} $$
Summing the three probabilities, we obtain
$$ P_{\rm same}(\vec a, \vec b) + P_{\rm same}(\vec a, \vec c) + P_{\rm same}(\vec b, \vec c) = \frac{\sum_k N_k + 2N_1 + 2N_8}{\sum_k N_k} \ge 1. \tag{31.24} $$
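Both Bell–Wigner relations follow from simple counting in Table 31.1, which can be reproduced by enumerating its eight rows. The following is a minimal sketch, assuming NumPy only for generating arbitrary event numbers; the helper names `p_joint` and `p_same` and the random counts are illustrative and not part of the text.

```python
import itertools
import numpy as np

# Rows N1..N8 of Table 31.1: Alice's preordained (sigma_a, sigma_b, sigma_c);
# Bob's values are the opposite ones.
ROWS = list(itertools.product([+1, -1], repeat=3))
AXIS = {"a": 0, "b": 1, "c": 2}

def p_joint(counts, alice_axis, alice_val, bob_axis, bob_val):
    """P(alice_axis alice_val, bob_axis bob_val) from the event numbers N_k."""
    hits = sum(
        n for row, n in zip(ROWS, counts)
        if row[AXIS[alice_axis]] == alice_val and -row[AXIS[bob_axis]] == bob_val
    )
    return hits / sum(counts)

def p_same(counts, i, j):
    """Probability that Alice's preordained values on axes i and j coincide."""
    hits = sum(n for row, n in zip(ROWS, counts) if row[AXIS[i]] == row[AXIS[j]])
    return hits / sum(counts)

rng = np.random.default_rng(2)
counts = rng.integers(1, 100, size=8).tolist()  # arbitrary event numbers N1..N8
total = sum(counts)

lhs = p_joint(counts, "a", +1, "b", +1)
rhs = p_joint(counts, "a", +1, "c", +1) + p_joint(counts, "c", +1, "b", +1)
print("first Bell-Wigner inequality (31.22) holds:", lhs <= rhs)

same_sum = p_same(counts, "a", "b") + p_same(counts, "a", "c") + p_same(counts, "b", "c")
expected = (total + 2 * counts[0] + 2 * counts[7]) / total
print("sum of P_same matches (31.24):", abs(same_sum - expected) < 1e-12)
```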
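This is the new type of Bell–Wigner inequality. We next consider the results corresponding to the Bell–Wigner inequalities in quantum mechanics. In quantum mechanics, in order to calculate the probability $P(\vec a+, \vec b+)$, we need to insert the projection operator along the spin up or spin down direction on $\vec n = \vec a, \vec b$, i.e.,
$$ E(\vec n, \pm) = \frac{1 \pm \vec n \cdot \vec\sigma}{2} . \tag{31.25} $$
Specifically, we obtain
$$ P(\vec a+, \vec b+) = \langle\phi_-| E^{(A)}(\vec a, +) E^{(B)}(\vec b, +) |\phi_-\rangle. \tag{31.26} $$
But, substituting the expression for $E(\vec n, \pm)$, and using the fact that
$$ \langle\phi_-| \mathbb{1} |\phi_-\rangle = 1, \qquad \langle\phi_-| \vec n \cdot \vec\sigma |\phi_-\rangle = 0, \qquad \langle\phi_-| (\vec\sigma \cdot \vec a)(\vec\sigma \cdot \vec b) |\phi_-\rangle = -\cos\theta(\vec a, \vec b), \tag{31.27} $$
we get
$$ P(\vec a+, \vec b+) = \frac{1}{4} \left( 1 - \cos\theta(\vec a, \vec b) \right) = \frac{1}{2} \sin^2\frac{\theta(\vec a, \vec b)}{2} . \tag{31.28} $$
Similarly, we obtain for the other three possibilities for measurement,
$$ P(\vec a-, \vec b-) = \langle\phi_-| E^{(A)}(\vec a, -) E^{(B)}(\vec b, -) |\phi_-\rangle = \frac{1}{4} \left( 1 - \cos\theta(\vec a, \vec b) \right), $$
$$ P(\vec a+, \vec b-) = \langle\phi_-| E^{(A)}(\vec a, +) E^{(B)}(\vec b, -) |\phi_-\rangle = \frac{1}{4} \left( 1 + \cos\theta(\vec a, \vec b) \right), $$
$$ P(\vec a-, \vec b+) = \langle\phi_-| E^{(A)}(\vec a, -) E^{(B)}(\vec b, +) |\phi_-\rangle = \frac{1}{4} \left( 1 + \cos\theta(\vec a, \vec b) \right). \tag{31.29} $$
Then, we see that indeed we have probabilities normalized to one, since
$$ P(\vec a+, \vec b+) + P(\vec a-, \vec b-) + P(\vec a+, \vec b-) + P(\vec a-, \vec b+) = 1. \tag{31.30} $$
The (first) Bell–Wigner inequality in the quantum mechanical version becomes
$$ \frac{1}{4} \left( 1 - \cos\theta(\vec a, \vec c) + 1 - \cos\theta(\vec c, \vec b) \right) \ge \frac{1}{4} \left( 1 - \cos\theta(\vec a, \vec b) \right) \;\Rightarrow\; \frac{1}{2} \left( \sin^2\frac{\theta(\vec a, \vec c)}{2} + \sin^2\frac{\theta(\vec c, \vec b)}{2} \right) \ge \frac{1}{2} \sin^2\frac{\theta(\vec a, \vec b)}{2} . \tag{31.31} $$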
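But this inequality is easily violated if $\vec a, \vec c, \vec b$ are coplanar and ordered in this way, so
$$ \theta(\vec a, \vec b) = \theta(\vec a, \vec c) + \theta(\vec c, \vec b) \tag{31.32} $$
is small, as before, so that
$$ \theta^2(\vec a, \vec b) = \left( \theta(\vec a, \vec c) + \theta(\vec c, \vec b) \right)^2 = \theta^2(\vec a, \vec c) + \theta^2(\vec c, \vec b) + 2\theta(\vec a, \vec c)\theta(\vec c, \vec b) > \theta^2(\vec a, \vec c) + \theta^2(\vec c, \vec b). \tag{31.33} $$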
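The quantum probabilities (31.28)–(31.30) and the violation of the first Bell–Wigner inequality can also be obtained directly from the projectors (31.25), without a small-angle expansion. The sketch below assumes NumPy, unit vectors in the x–z plane parametrized by a polar angle, and the illustrative coplanar choice of angles 0, π/3, 2π/3 for $\vec a, \vec c, \vec b$; all function names are introduced here for illustration.

```python
import numpy as np

SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])
SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def projector(theta, sign):
    """E(n, +-) = (1 +- n.sigma)/2 for n = (sin theta, 0, cos theta)."""
    n_dot_sigma = np.sin(theta) * SIGMA_X + np.cos(theta) * SIGMA_Z
    return 0.5 * (np.eye(2) + sign * n_dot_sigma)

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
phi_minus = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def p_joint(theta_a, sign_a, theta_b, sign_b):
    """<phi_-| E^(A)(a, sign_a) E^(B)(b, sign_b) |phi_->, as in (31.26)."""
    op = np.kron(projector(theta_a, sign_a), projector(theta_b, sign_b))
    return float(phi_minus @ op @ phi_minus)

# Coplanar directions a, c, b at angles 0, pi/3, 2 pi/3 (illustrative choice).
th_a, th_c, th_b = 0.0, np.pi / 3, 2 * np.pi / 3

p_ab = p_joint(th_a, +1, th_b, +1)
p_ac = p_joint(th_a, +1, th_c, +1)
p_cb = p_joint(th_c, +1, th_b, +1)
norm = sum(p_joint(th_a, s1, th_b, s2) for s1 in (+1, -1) for s2 in (+1, -1))

print("matches (31.28):", np.isclose(p_ab, 0.25 * (1 - np.cos(th_b - th_a))))
print("normalized as in (31.30):", np.isclose(norm, 1.0))
print("inequality (31.22) violated:", p_ab > p_ac + p_cb)
```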
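For the other Bell–Wigner inequality, we first note that the same result for Alice on two directions corresponds to opposite results for Alice and Bob, so in quantum mechanics we have
$$ P_{\rm same}(\vec a, \vec b) = P(\vec a+, \vec b-) + P(\vec a-, \vec b+) = \frac{1}{2} \left( 1 + \cos\theta(\vec a, \vec b) \right) = \cos^2\frac{\theta(\vec a, \vec b)}{2} . \tag{31.34} $$
Then the left-hand side of the Bell–Wigner inequality is
$$ P_{\rm same}(\vec a, \vec b) + P_{\rm same}(\vec b, \vec c) + P_{\rm same}(\vec a, \vec c) = \cos^2\frac{\theta(\vec a, \vec b)}{2} + \cos^2\frac{\theta(\vec b, \vec c)}{2} + \cos^2\frac{\theta(\vec a, \vec c)}{2}, \tag{31.35} $$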
which can be easily arranged to be less than 1. However, again, as in the case of the original Bell’s inequality, it is experimentally difficult to arrange since we again have opposite spins for Alice and Bob. This means that we need a generalization of the Bell inequality that can be tested experimentally.
31.3 CHSH Inequality (or Bell–CHSH Inequality)
A generalization of the Bell inequality that can be tested experimentally is provided by the inequality derived by John Clauser, Michael Horne, Abner Shimony and R. A. Holt: the CHSH inequality. We consider the situation where Alice can measure two variables, $a$ and $a'$, that take values in {±1}, and Bob measures two other variables, $b$ and $b'$, that also take values in {±1}. In a hidden variable theory, depending on the hidden variables λ, the measurements of Alice are $A(a, \lambda) \equiv A$ and $A'(a', \lambda) \equiv A'$, and the measurements of Bob are $B(b, \lambda) \equiv B$ and $B'(b', \lambda) \equiv B'$. Note that $a, a', b, b'$ can be any variables, not necessarily spins in some direction. We define the correlation function in the hidden variable theory as before, by
$$ C_{\rm hid}(a, b) = \int_\Lambda d\lambda\, \rho(\lambda)\, A(a, \lambda) B(b, \lambda) \equiv \langle AB \rangle, \tag{31.36} $$
where the final notation is understood as the statistical (classical) average. Note that this definition has no assumption that $B(a, \lambda) = -A(a, \lambda)$, as for spins in the EPR experiment. Then, since $A, A' = \pm 1$, it follows that
$$ A + A' = 0,\; A - A' = \pm 2, \quad \text{or} \quad A - A' = 0,\; A + A' = \pm 2. \tag{31.37} $$
This implies that we have
$$ (A + A')B + (A - A')B' = \pm 2 = AB + A'B + AB' - A'B' \equiv M. \tag{31.38} $$
But then we obtain an inequality by using the statistical average and modulus,
$$ \left| \int_\Lambda d\lambda\, \rho(\lambda) M(\lambda) \right| = |\langle M \rangle| \le \langle |M| \rangle = \int_\Lambda d\lambda\, \rho(\lambda) |M(\lambda)| = 2 \int_\Lambda d\lambda\, \rho(\lambda) = 2, \tag{31.39} $$
where in the last equality we have used the normalization of the probability distribution ρ(λ).
Now replacing M by its definition in (31.38), we obtain
$$ |\langle AB \rangle + \langle A'B \rangle + \langle AB' \rangle - \langle A'B' \rangle| \le 2 \;\Rightarrow\; |C_{\rm hid}(a, b) + C_{\rm hid}(a', b) + C_{\rm hid}(a, b') - C_{\rm hid}(a', b')| \le 2. \tag{31.40} $$
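This is the CHSH, or Bell–CHSH inequality. To calculate its equivalent in quantum mechanics, we choose the variables to be spins in various directions,
$$ a = \vec\sigma^{(A)} \cdot \vec a, \qquad a' = \vec\sigma^{(A)} \cdot \vec a\,', \qquad b = \vec\sigma^{(B)} \cdot \vec b, \qquad b' = \vec\sigma^{(B)} \cdot \vec b\,'. \tag{31.41} $$
But since, as we saw, we have
$$ \langle\phi_-| (\vec\sigma^{(A)} \cdot \vec a)(\vec\sigma^{(B)} \cdot \vec b) |\phi_-\rangle = -\vec a \cdot \vec b = -\cos\theta(\vec a, \vec b), \tag{31.42} $$
we can choose coplanar unit vectors, arranged as before in rotational order (when the origin is the same), $\vec a\,', \vec b, \vec a, \vec b\,'$, at successive π/4 intervals of angle. Then
$$ \langle AB \rangle = \langle A'B \rangle = \langle AB' \rangle = -\cos\frac{\pi}{4} = -\frac{1}{\sqrt{2}}, \qquad \langle A'B' \rangle = -\cos\frac{3\pi}{4} = \frac{1}{\sqrt{2}} . \tag{31.43} $$
Thus the equivalent of the CHSH inequality in quantum mechanics is
$$ 4 \times \frac{1}{\sqrt{2}} = 2\sqrt{2} \le (?)\; 2, \tag{31.44} $$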
which is clearly violated. In fact, one can show that this is the maximal violation of the classical CHSH inequality.
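Both sides of the CHSH comparison can be evaluated explicitly. The following is a minimal sketch, assuming NumPy, spin directions in the x–z plane, and the ordering $\vec a\,', \vec b, \vec a, \vec b\,'$ at angles 0, π/4, π/2, 3π/4 (the ordering that reproduces the values in (31.43)); the function names are illustrative. The classical part enumerates all sixteen assignments of A, A′, B, B′ = ±1, and the quantum part evaluates the correlators in the singlet state.

```python
import itertools
import numpy as np

# Classical (hidden variable) side: M = AB + A'B + AB' - A'B' for all +-1 values.
classical_values = {
    a * b + a2 * b + a * b2 - a2 * b2
    for a, a2, b, b2 in itertools.product([+1, -1], repeat=4)
}
print("possible classical values of M:", sorted(classical_values))

SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])
SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def spin(theta):
    """sigma . n for the unit vector n = (sin theta, 0, cos theta)."""
    return np.sin(theta) * SIGMA_X + np.cos(theta) * SIGMA_Z

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
phi_minus = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def correlation(theta_a, theta_b):
    """<phi_-| (sigma^(A).a)(sigma^(B).b) |phi_->, equal to -cos of the angle."""
    return float(phi_minus @ np.kron(spin(theta_a), spin(theta_b)) @ phi_minus)

# Coplanar directions in the order a', b, a, b' at successive pi/4 intervals.
th_a2, th_b, th_a, th_b2 = 0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4

chsh = (correlation(th_a, th_b) + correlation(th_a2, th_b)
        + correlation(th_a, th_b2) - correlation(th_a2, th_b2))
print("|CHSH| equals 2 sqrt(2):", np.isclose(abs(chsh), 2 * np.sqrt(2)))
```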
31.4 Interpretations of Quantum Mechanics We end this chapter with a short analysis of the main possible interpretations of quantum mechanics. There are very many such interpretations, proof that, more than a hundred years after its start, quantum mechanics is still a mystery to us, but most of these possibilities are shaky and less standard, so we will stay with the leading ones.
The Standard, “Copenhagen”, Interpretation This is the most common interpretation of quantum mechanics, the one that has been implicit in most of the book so far. It was developed by Niels Bohr and Werner Heisenberg around 1927 in Copenhagen, extending the previous work of Max Born. In this view of quantum mechanics, we have a probabilistic interpretation for the outcome of measurements based on wave functions evolving in time, i.e., we simply have probabilities of obtaining various results. On the other hand, questions about non-measured things, such as the paths previous to one measurement experiment, are meaningless. We have only a wave function and measurements that have meaning. In particular, the interaction of a system with the observer in obtaining a measurement collapses the wave function.
The “Many Worlds” Interpretation In this interpretation, there is no wave function collapse associated with measurements. Measurements occur via “decoherence”, where the system “interacts with itself” without the need of an observer, and moreover at each measurement, the Universe splits into multiple, mutually unobservable Universes, or “alternative histories”. These splittings occur at each moment in time corresponding to some measurement, leading to a “Multiverse”, a distribution of Universes with slightly different histories at each point, splitting further as time goes by. Note that this interpretation of quantum mechanics is the basis for a story device in science fiction movies and books that is quite common yet gets things mostly wrong. In the Sci-Fi version one can travel between Universes in the Multiverse (that can never happen, an essential part of this interpretation being that the Universes are mutually unobservable); the people are the same, but the events are slightly different (there is no difference between what happens to sentient people and objects or events with respect to branching: everything is slightly different, so the people should be too).
The “Consistent Histories” Interpretation This is an interpretation that is somewhat related to the formulation of quantum mechanics in terms of path integrals, which are sums over quantum (generically nonsmooth) histories for the propagator. In this interpretation, we view the wave function as a sum over consistent histories for the particles, with some probabilities. This interpretation is useful for dealing with the problem of quantum cosmology. Quantum cosmology refers to the quantum mechanical evolution of the whole Universe near the Big Bang (the time origin of the Universe), when all things are quantum mechanical, and the whole Universe is one big system. In that case, we have a problem with the standard Copenhagen interpretation, since there is a single Universe, there is no “ensemble of Universes” to measure, and there is only one experiment, the evolution of our Universe. Thus we have to find a way to deal with the quantum mechanics of a single experiment. Of course, the many worlds interpretation could also be useful for quantum cosmology, since in it we imagine a Multiverse (a collection of alternative Universes) instead of ensembles, even in regular experiments on Earth so even more so for quantum cosmology. The unobservability of the Multiverse is part of the point: the ensemble is always an ensemble of Universes, not of successive experiments.
Ensemble Interpretation The ensemble interpretation is the most minimalist, even more so than the Copenhagen interpretation. The quantum mechanics probabilistic interpretation is valid only for ensembles, not for single particles, or single Universes (so it definitely tells us we cannot apply quantum mechanics to cosmology, i.e., to the Universe itself). This is very agnostic, and restrictive, so it is perhaps going too far. There are many other interpretations of quantum mechanics, but these are less standard, and more controversial, so we will not mention them here.
31 Interpretation of Quantum Mechanics, Bell’s Inequalities
Important Concepts to Remember
• All or most of the variations of Bell’s inequalities refer to three consecutive measurements by Alice and Bob, in the EPR paradox, relative to three directions a, b, c, and in a classical, hidden variable, theory we obtain an inequality violated by the quantum mechanical analog of the same.
• The original Bell inequality is hard to measure, since it requires the measurement of exactly antialigned spins for Alice and Bob (assuming that the total spin is zero). The Wigner model, in which we have preordained numbers for each classical possibility, leading to Bell–Wigner inequalities, is also hard to measure, for the same reason.
• The Bell–CHSH inequalities are better experimentally, since we do not assume total spin zero, and we consider four arbitrary directions of measurement.
• The Copenhagen interpretation of quantum mechanics is the most common: it is a probabilistic interpretation based on wave functions and measurements (involving interaction with an observer and so leading to collapse of the wave function).
• In the many worlds interpretation there is no collapse of the wave function, and at each (self-) interaction, the Universe splits into multiple, mutually unobservable Universes, corresponding to each possibility. In the consistent histories interpretation, the wave function arises as the sum over all consistent histories (as in path integral), and allows us to deal with the quantum cosmology of a single Universe (i.e., a single experiment). In the ensemble interpretation, the most restrictive, we can deal only with ensembles, not with a single particle or Universe (it is the most agnostic description, i.e., it is the description that is maximal concerning what we cannot know).
Further Reading For Bell’s inequalities, see Preskill’s lecture notes on quantum information [12], as well as the analysis of Sakurai and Tuan in [1]. The original articles cited here are [16] and [17].
Exercises (1) Explore whether the original Bell inequality can be violated at large angles as well. (2) Analyzing Table 1, find another example of a Bell–Wigner inequality that is violated in quantum mechanics. (3) Find an example of (angles corresponding to) another violation of the Bell–CHSH inequality by quantum mechanics, which is not maximal but is more generic. (4) Instead of the Bell–CHSH inequality in the text, consider an inequality obtained in the same way from M˜ = ( A + A )B + ( A − A )B instead of M. Is it still violated by quantum mechanics? (5) If we consider five measurements instead of the four in the Bell–CHSH inequality, namely a, a , a for Alice and b, b for Bob, can we extend the Bell–CHSH inequality to include this case, such that the inequality is still violated by quantum mechanics?
(6) In quantum cosmology, one can define a “wave function of the Universe” $\Psi[a(t)]$, whose variable is the expanding scale factor of the Universe, $a(t)$, and which satisfies a general relativity (Einstein) version of the Schrödinger equation, called the Wheeler–DeWitt equation. Yet in the Copenhagen interpretation of quantum mechanics, there is only one “experiment” (our expanding Universe, since our Big Bang) that we can see, and no “outside observer”. How would you slightly extend this interpretation to make sense of the results of the Wheeler–DeWitt equation (there is much debate, so there is no unique good answer to this question at present)?
(7) In some TV shows, instigated by the “many worlds interpretation” of quantum mechanics, one sees an “evil parallel Universe”, where things have gone very bad in some sense, and the same characters behave in an evil way. Leaving aside philosophical speculations on good and evil, and the possibility of accessing (even just to see) this Universe, explain why this is inconsistent with the “many worlds interpretation” of quantum mechanics.
32
Quantum Statistical Mechanics and “Tracing” over a Subspace

In this chapter, we will describe the basics of quantum statistical mechanics as a natural extension of the most general formalism, for a density matrix (i.e., a mixed state) instead of a pure state. A density matrix is a classical distribution of quantum states, and, as such, it allows for a satisfactory statistical mechanics interpretation, since classical statistical mechanics deals with a classical distribution of classical states. On the other hand, we have also seen that a nontrivial $\hat\rho$ appears when we take the trace over a subspace in a total space. Thus we can interpret mixed states in two possible ways:
• Perhaps we can ignore (meaning, we cannot measure) some part of a Hilbert space for a total isolated system: we have a “macroscopic” description only, not a microscopic description. This is in fact at the core of the statistical mechanics interpretation, even in the classical case: statistics comes from ignorance of (or “averaging over”) the unknown parts of the state of the total system.
• Alternatively, perhaps we have a system in contact with a reservoir, schematically denoted as $S \cup R$. Then, in a similar manner, there is only a total state for the system plus reservoir, $S \cup R$, but not for $S$ or $R$ independently, since there is no such thing as a state, or wave function, for only a part of a system, except for very special states (product, or separable, states).
32.1 Density Matrix and Statistical Operator

Either way (in both of the above interpretations), the system $S$ is in a state chosen randomly from the set $\{|\psi_j\rangle\}$, with classical probability $w_j$, which is the definition of a mixed state. Consider a complete and orthonormal set $\{|\psi_\alpha\rangle\}$ of eigenstates of a complete set of compatible observables. That means that any state of the system can be decomposed in the $|\psi_\alpha\rangle$ states,
$$|\psi\rangle = \sum_\alpha c_\alpha |\psi_\alpha\rangle. \tag{32.1}$$
Multiplying by $\langle\psi_\beta|$ from the left and using orthonormality, we obtain as usual $c_\beta = \langle\psi_\beta|\psi\rangle$, i.e., $c_\alpha = \langle\psi_\alpha|\psi\rangle$. Applying the decomposition to the states in the mixed state set, we have
$$|\psi_j\rangle = \sum_\alpha c_\alpha^{(j)} |\psi_\alpha\rangle. \tag{32.2}$$
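The expectation value of an observable $A$ in the state $|\psi\rangle$ is found to be
$$\langle A\rangle_\psi = \langle\psi|\hat A|\psi\rangle = \sum_{\alpha,\beta}\langle\psi|\psi_\alpha\rangle\langle\psi_\alpha|\hat A|\psi_\beta\rangle\langle\psi_\beta|\psi\rangle = \sum_{\alpha,\beta} c_\alpha^{*} c_\beta A_{\alpha\beta}, \tag{32.3}$$
where we have inserted two completeness relations $1 = \sum_\alpha |\psi_\alpha\rangle\langle\psi_\alpha|$. But then, for the classical statistics of the quantum states $|\psi_j\rangle$ with probabilities $w_j$, we can calculate the average value of an observable $A$ by taking first a quantum average, and then the classical statistical average over states:
$$\langle A\rangle \equiv \big\langle\langle A\rangle_{\rm qu}\big\rangle_{\rm stat} = \sum_j w_j \langle A\rangle_{\psi_j} = \sum_{\alpha,\beta}\sum_j w_j c_\alpha^{(j)*} c_\beta^{(j)} A_{\alpha\beta} = \sum_{\alpha,\beta}\rho_{\beta\alpha}A_{\alpha\beta} = {\rm Tr}(\hat\rho\hat A). \tag{32.4}$$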
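Here we have defined the density matrix
$$\rho_{\beta\alpha} = \sum_j w_j c_\beta^{(j)} c_\alpha^{(j)*} = \sum_j w_j \langle\psi_\beta|\psi_j\rangle\langle\psi_j|\psi_\alpha\rangle = \langle\psi_\beta|\Big(\sum_j w_j|\psi_j\rangle\langle\psi_j|\Big)|\psi_\alpha\rangle \equiv \langle\psi_\beta|\hat\rho|\psi_\alpha\rangle, \tag{32.5}$$
where
$$\hat\rho = \sum_j w_j |\psi_j\rangle\langle\psi_j| \tag{32.6}$$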
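is an operator that, at least in the context of quantum statistical mechanics, will be called the statistical operator. The normalization of the statistical operator and its associated density matrix is found as follows:
$$ {\rm Tr}\,\hat\rho = \sum_\alpha \langle\psi_\alpha|\Big(\sum_j w_j|\psi_j\rangle\langle\psi_j|\Big)|\psi_\alpha\rangle = \sum_j w_j \langle\psi_j|\Big(\sum_\alpha|\psi_\alpha\rangle\langle\psi_\alpha|\Big)|\psi_j\rangle = \sum_j w_j = 1, \tag{32.7}$$
where in the third equality we have used the completeness relation of $|\psi_\alpha\rangle$ and the normalization of $|\psi_j\rangle$. Considering the statistical operator for a pure state $|\psi\rangle$ as a particular example,
$$\hat\rho = |\psi\rangle\langle\psi|, \tag{32.8}$$
the density matrix is
$$\rho_{\beta\alpha} = \langle\psi_\beta|\hat\rho|\psi_\alpha\rangle = c_\alpha^{*} c_\beta. \tag{32.9}$$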
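As a numerical illustration of (32.4)–(32.7), the following minimal sketch builds a statistical operator for a spin-1/2 system and checks that the two-step average (quantum, then classical) agrees with the trace formula. The particular states, weights, and the choice of $\sigma_x$ as the observable are assumptions made only for this example; they do not come from the text.

```python
import numpy as np

# Hypothetical two-member ensemble for a spin-1/2 system (illustrative values).
up = np.array([1.0, 0.0], dtype=complex)
plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
states = [up, plus]        # the |psi_j>: normalized, but not mutually orthogonal
weights = [0.25, 0.75]     # classical probabilities w_j, summing to 1

# Statistical operator rho = sum_j w_j |psi_j><psi_j|, as in (32.6).
rho = sum(w * np.outer(psi, psi.conj()) for w, psi in zip(weights, states))

# Observable A: the Pauli matrix sigma_x, chosen only as an example.
A = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)

# Two-step average: quantum expectation value in each |psi_j>, then classical average.
two_step = sum(w * np.vdot(psi, A @ psi) for w, psi in zip(weights, states))

# One-step average via the trace formula (32.4).
trace_formula = np.trace(rho @ A)

print("Tr rho       =", np.trace(rho).real)
print("two-step <A> =", two_step.real)
print("Tr(rho A)    =", trace_formula.real)
```

Note that the $|\psi_j\rangle$ need not be orthogonal for the normalization ${\rm Tr}\,\hat\rho = 1$ to hold; only their individual normalization and $\sum_j w_j = 1$ are used.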
The formalism of mixed states applies to states that are not completely known, so that there are classical probabilities $w_j$, in which case the mixed states, defined by John von Neumann, appear in a macroscopic description. Indeed, in Chapter 30 we saw that, for a pure total state, either a system plus reservoir, $S \cup R$, or an observed state plus an unobservable state, we sum over diagonal components in the unobservable Hilbert space, i.e., we take the trace, with the result
$$\rho_B = {\rm Tr}_A\,\rho_{\rm tot} = {\rm Tr}_A\,|\psi\rangle_{AB}\,{}_{AB}\langle\psi| = \sum_i p_i\,|i\rangle_B\,{}_B\langle i|, \qquad \rho_A = {\rm Tr}_B\,\rho_{\rm tot} = \sum_i p_i\,|i\rangle_A\,{}_A\langle i|. \tag{32.10}$$
Note that in this case, the resulting state for a reduced system (i.e., either system $A$ or system $B$ but not both) is not of the type $\sum_j w_j|\psi_j\rangle$, but simply is not a (pure) state $|\psi\rangle$ at all!
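The partial trace in (32.10) can be sketched numerically as follows. The helper `reduced_density_matrices` is a hypothetical name introduced here, and the two test states (a maximally entangled two-qubit state and a product state) are illustrative choices. The purity ${\rm Tr}\,\rho_A^2$ is used as a simple diagnostic: it equals 1 for a pure reduced state and is smaller than 1 for a mixed one.

```python
import numpy as np

def reduced_density_matrices(psi_ab, dim_a, dim_b):
    """Return (rho_A, rho_B) for a pure state of A x B given as a flat vector."""
    c = np.asarray(psi_ab, dtype=complex).reshape(dim_a, dim_b)  # c[a, b]
    rho_a = c @ c.conj().T     # trace over B: sum_b c[a, b] c*[a', b]
    rho_b = c.T @ c.conj()     # trace over A: sum_a c[a, b] c*[a, b']
    return rho_a, rho_b

# Entangled example: (|00> + |11>)/sqrt(2).
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
rho_a, rho_b = reduced_density_matrices(bell, 2, 2)
purity_entangled = np.trace(rho_a @ rho_a).real

# Product example: |0> x (|0> + |1>)/sqrt(2).
product = np.kron([1.0, 0.0], np.array([1.0, 1.0]) / np.sqrt(2.0))
rho_a_prod, _ = reduced_density_matrices(product, 2, 2)
purity_product = np.trace(rho_a_prod @ rho_a_prod).real

print("entangled state: Tr rho_A^2 =", purity_entangled)
print("product state:   Tr rho_A^2 =", purity_product)
```

For the entangled state the reduced matrix is proportional to the identity, so no pure state $|\psi\rangle$ of subsystem $A$ alone reproduces it, in line with the remark above.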
32.2 Review of Classical Statistics

Before we turn to the description of quantum statistical mechanics, we review the classical version, in order to see how it can be generalized. We start with the definition of a statistical ensemble, due to Boltzmann and Gibbs. It is a collection of systems identical to the real system $S$, in the same conditions as the real one, independent of each other, with each system being in any of the possible states (that are compatible with the external conditions on the system). We also define the distribution function of the statistical ensemble,
$$\rho = \frac{dN}{N\,d\Gamma}, \tag{32.11}$$
where $N$ refers to the number of systems in the ensemble and $d\Gamma$ is the differential of the total number of states with energy $\leq E$,
$$\Gamma(E) = \int_{H\leq E} d\Gamma. \tag{32.12}$$
We also define the number of states in a region between $E$ and $E + \Delta E$,
$$\Omega(E,\Delta E) = \int_{E\leq H\leq E+\Delta E} d\Gamma, \tag{32.13}$$
and the energy distribution,
$$\omega(E) = \frac{\partial\Gamma(E)}{\partial E}, \tag{32.14}$$
so that
$$\Omega(E,\Delta E) \simeq \omega(E)\,\Delta E. \tag{32.15}$$
All quantities, including $\Gamma$ and $\rho$, are functions of the phase space $(p, q)$ of the total number of particles (so there is a very large number of variables, giving a very large dimension of the phase space), so we have $\rho(p, q)$ and $d\Gamma(p, q)$. Then the ensemble average of an observable $A$ is
$$\langle A\rangle = \int_\Gamma A\,\rho\,d\Gamma. \tag{32.16}$$
The Ergodic Hypothesis The ergodic hypothesis is due to Boltzmann. It states that for an isolated system, a point in the phase space of the system goes through all the points in the phase space on an isoenergetic (E = constant) hypersurface. But it is not valid. A better variant is the quasi-ergodic hypothesis: that a point in the phase space of the system goes arbitrarily close to every point on the isoenergetic hypersurface. However, it is not valid and/or provable in general. That means that we need a set of postulates about observables in the system, to replace any need for explanations or proofs.
Postulate 1 The first such postulate states that the observed expectation value of a quantity $A$ equals the temporal average,
$$A_{\rm obs} = \bar A = \lim_{\tau\to\infty}\frac{1}{\tau}\int_0^\tau A(t)\,dt. \tag{32.17}$$
Postulate 2 The second postulate, called the ergodic postulate, or Gibbs–Tolman postulate, states that the statistical average (the average over the ensemble) equals the time average,
$$\langle A\rangle = \bar A. \tag{32.18}$$
Postulate 3 The third postulate, of a priori equal probabilities, known as the Tolman postulate, says that for an isolated system, the probability density in phase space is constant for all the accessible states (i.e., those consistent with the external boundary conditions). Mathematically, we write
$$\rho = \begin{cases} {\rm constant} & \text{in } D \\ 0 & \text{outside it.} \end{cases} \tag{32.19}$$
32.3 Defining Quantum Statistics In the quantum mechanical case, we use classical statistics over quantum states, plus quantum statistics over a state. Moreover, states of the system are now (generically) discrete, labeled by an index n for the energy plus a degeneracy gn for different states of the same energy. Now we define the total number of states with energy less than E, similarly to the classical case, Γ(E) = 1= gn , (32.20) n,gn (En 2), (the term with λ k in the equation), we obtain ( Hˆ 0 − En(0) )| n˜ (k) , α + ( Hˆ 1 − En(1) )| n˜ (k−1) , α − En(2) | n˜ (k−2) , α − · · · − En(k) | n˜ (0) , α = 0.
(36.8)
36.2 The Nondegenerate Case

We start with the simpler nondegenerate case, when there is no index $\alpha$, so $|\tilde n,\alpha\rangle$ is replaced by just $|n\rangle$. The eigenstates of $\hat H_0$ are assumed to be orthonormal,
$$\langle n^{(0)}|m^{(0)}\rangle = \delta_{mn}. \tag{36.9}$$
The zeroth-order equation is satisfied, as we have seen. The first-order equation becomes now simply
$$(\hat H_0 - E_n^{(0)})|n^{(1)}\rangle + (\hat H_1 - E_n^{(1)})|n^{(0)}\rangle = 0. \tag{36.10}$$
We multiply it with the bra of the free Hamiltonian, $\langle m^{(0)}|$, from the left. In the case $m = n$, we obtain
$$\langle n^{(0)}|\hat H_0|n^{(1)}\rangle - E_n^{(0)}\langle n^{(0)}|n^{(1)}\rangle + \langle n^{(0)}|\hat H_1|n^{(0)}\rangle - E_n^{(1)} = 0. \tag{36.11}$$
Acting with $\hat H_0$ to the left (on the bra state) in the first term, we obtain that the first and second terms cancel, and we are left with a formula for the first correction to the energy,
$$E_n^{(1)} = \langle n^{(0)}|\hat H_1|n^{(0)}\rangle \equiv H_{1,nn}. \tag{36.12}$$
In the case $m \neq n$, we obtain
$$\langle m^{(0)}|\hat H_0|n^{(1)}\rangle - E_n^{(0)}\langle m^{(0)}|n^{(1)}\rangle + \langle m^{(0)}|\hat H_1|n^{(0)}\rangle = 0. \tag{36.13}$$
Again acting with $\hat H_0$ to the left in the first term, we now obtain
$$(E_n^{(0)} - E_m^{(0)})\langle m^{(0)}|n^{(1)}\rangle = \langle m^{(0)}|\hat H_1|n^{(0)}\rangle \equiv H_{1,mn}. \tag{36.14}$$
But we can now use the second assumption, of the completeness of the states $|n^{(0)}\rangle$ in the Hilbert space of the full Hamiltonian $\hat H(\lambda)$, in order to expand $|n^{(1)}\rangle$ in $|m^{(0)}\rangle$,
$$|n^{(1)}\rangle = \sum_m C_m^{n(1)}|m^{(0)}\rangle. \tag{36.15}$$

36 Time-Independent Perturbation Theory

Multiplying from the left with $\langle p^{(0)}|$ gives $C_p^{n(1)} = \langle p^{(0)}|n^{(1)}\rangle$, so substituting this into the first-order equation for $m \neq n$, we get
$$C_m^{n(1)} = \frac{H_{1,mn}}{E_n^{(0)} - E_m^{(0)}}. \tag{36.16}$$
Note that this equation is valid only for $m \neq n$. At nonzero $\lambda$, we will assume that there is no contribution to $|n^{(0)}\rangle$ itself for $|n\rangle$, $\langle n^{(0)}|n^{(1)}\rangle = 0$, which is a consistent choice, amounting to just a normalization condition. Then, substituting $C_m^{n(1)}$ into the expansion of $|n^{(1)}\rangle$, we obtain
$$|n^{(1)}\rangle = \sum_{m\neq n}\frac{H_{1,mn}}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle. \tag{36.17}$$
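The second-order equation is now simply
$$(\hat H_0 - E_n^{(0)})|n^{(2)}\rangle + (\hat H_1 - E_n^{(1)})|n^{(1)}\rangle - E_n^{(2)}|n^{(0)}\rangle = 0. \tag{36.18}$$
Again we multiply it with the free bra $\langle m^{(0)}|$ from the left. In the $m = n$ case, using $\langle n^{(0)}|n^{(1)}\rangle = 0$, we obtain
$$\langle n^{(0)}|\hat H_0|n^{(2)}\rangle - E_n^{(0)}\langle n^{(0)}|n^{(2)}\rangle + \langle n^{(0)}|\hat H_1|n^{(1)}\rangle - E_n^{(2)} = 0, \tag{36.19}$$
and, by acting with $\hat H_0$ to the left in the first term, we see that the first two terms cancel. But we can use the first-order result (the expansion of $|n^{(1)}\rangle$) to calculate
$$\langle n^{(0)}|\hat H_1|n^{(1)}\rangle = \sum_{m\neq n}\frac{\langle n^{(0)}|\hat H_1|m^{(0)}\rangle H_{1,mn}}{E_n^{(0)} - E_m^{(0)}}. \tag{36.20}$$
Substituting in the second-order equation, we obtain
$$E_n^{(2)} = \sum_{m\neq n}\frac{H_{1,nm}H_{1,mn}}{E_n^{(0)} - E_m^{(0)}} = \sum_{m\neq n}\frac{|H_{1,nm}|^2}{E_n^{(0)} - E_m^{(0)}}. \tag{36.21}$$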
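The first- and second-order energy formulas (36.12) and (36.21) can be checked against exact diagonalization for a small matrix model. This is only a sketch: the function name `perturbative_energies`, the unperturbed levels, the perturbation matrix, and the value of $\lambda$ are all hypothetical choices made for illustration.

```python
import numpy as np

def perturbative_energies(e0, h1, lam):
    """Energies through second order in lam for nondegenerate levels e0."""
    e0 = np.asarray(e0, dtype=float)
    h1 = np.asarray(h1, dtype=complex)
    first = np.real(np.diag(h1))                 # E_n^(1) = H_{1,nn}
    second = np.zeros_like(e0)
    for n in range(len(e0)):
        for m in range(len(e0)):
            if m != n:                           # sum over m != n, as in (36.21)
                second[n] += abs(h1[n, m]) ** 2 / (e0[n] - e0[m])
    return e0 + lam * first + lam**2 * second

# Illustrative nondegenerate spectrum and Hermitian perturbation.
e0 = np.array([0.0, 1.0, 2.5])
h1 = np.array([[0.2, 0.5, 0.1],
               [0.5, -0.3, 0.4],
               [0.1, 0.4, 0.6]])
lam = 0.01

approx = perturbative_energies(e0, h1, lam)
exact = np.linalg.eigvalsh(np.diag(e0) + lam * h1)   # ascending order

print("second-order PT:", approx)
print("exact:          ", exact)
print("max deviation:  ", np.max(np.abs(approx - exact)))
```

The remaining deviation is expected to scale as $\lambda^3$, which can be verified by rerunning the sketch with a few different values of `lam`.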
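In the $m \neq n$ case, we obtain
$$\langle m^{(0)}|(\hat H_0 - E_n^{(0)})|n^{(2)}\rangle + \langle m^{(0)}|(\hat H_1 - E_n^{(1)})|n^{(1)}\rangle = (E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(2)}\rangle + \langle m^{(0)}|\hat H_1|n^{(1)}\rangle - E_n^{(1)}\langle m^{(0)}|n^{(1)}\rangle = 0. \tag{36.22}$$
Now we can use the first-order result, with $E_n^{(1)} = H_{1,nn}$ and $|n^{(1)}\rangle = \sum_m C_m^{n(1)}|m^{(0)}\rangle$, and obtain
$$(E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(2)}\rangle + \sum_{p\neq n}\langle m^{(0)}|\hat H_1|p^{(0)}\rangle\frac{H_{1,pn}}{E_n^{(0)} - E_p^{(0)}} - H_{1,nn}\frac{H_{1,mn}}{E_n^{(0)} - E_m^{(0)}} = 0. \tag{36.23}$$
Again we expand in the complete set of zeroth-order states, as in the first order:
$$|n^{(2)}\rangle = \sum_m C_m^{n(2)}|m^{(0)}\rangle. \tag{36.24}$$
Then, we find $C_m^{n(2)} = \langle m^{(0)}|n^{(2)}\rangle$, so
$$C_m^{n(2)} = \langle m^{(0)}|n^{(2)}\rangle = \frac{1}{E_n^{(0)} - E_m^{(0)}}\left[\sum_{p\neq n}\frac{H_{1,mp}H_{1,pn}}{E_n^{(0)} - E_p^{(0)}} - \frac{H_{1,nn}H_{1,mn}}{E_n^{(0)} - E_m^{(0)}}\right]. \tag{36.25}$$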
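36.3 The Degenerate Case

We next consider the degenerate case, where, as we said, in general we expect the basis at $\lambda \to 0$ for the degenerate Hilbert space $\mathcal{H}_n^{(0)}$ of given unperturbed energy to be different from the basis used for $\hat H_0$,
$$|\tilde n,\alpha;\lambda\rangle \xrightarrow{\lambda\to 0} |\tilde n^{(0)},\alpha\rangle \equiv \sum_\beta C^{(0)}_{n,\alpha\beta}\,|n^{(0)},\beta\rangle. \tag{36.26}$$

First Order

For the first perturbation order, in the $m = n$ case, multiply (36.6) from the left with the bra $\langle\tilde n^{(0)},\beta|$, obtaining
$$\langle\tilde n^{(0)},\beta|\hat H_0|\tilde n^{(1)},\alpha\rangle - E_n^{(0)}\langle\tilde n^{(0)},\beta|\tilde n^{(1)},\alpha\rangle + \langle\tilde n^{(0)},\beta|\hat H_1|\tilde n^{(0)},\alpha\rangle - E_n^{(1)}\delta_{\alpha\beta} = 0. \tag{36.27}$$
Acting with $\hat H_0$ from the left on the first term leads to the cancellation of the first two terms, giving
$$H_{1,n\beta\alpha} \equiv \langle\tilde n^{(0)},\beta|\hat H_1|\tilde n^{(0)},\alpha\rangle = E_n^{(1)}\delta_{\alpha\beta}. \tag{36.28}$$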
However, we see that this equation does not have a solution if $H_{1,n\beta\alpha} \neq 0$ for $\alpha \neq \beta$. In order for it to have a solution, we consider states that are linear combinations of the basis states,
$$|\psi_n\rangle = \sum_{\alpha=1}^{g_m} C_{n\alpha}|\tilde n,\alpha\rangle, \tag{36.29}$$
where both $|\psi_n\rangle$ and $|\tilde n,\alpha\rangle$ admit expansions in $\lambda$, or, more precisely, the above relation is true at each order in $\lambda$,
$$|\psi_n^{(s)}\rangle = \sum_{\alpha=1}^{g_m} C_{n\alpha}^{(s)}|\tilde n^{(s)},\alpha\rangle. \tag{36.30}$$
Then we replace $|\tilde n^{(s)},\alpha\rangle$ with $|\psi_n^{(s)}\rangle$ in the original perturbation theory equations, starting with (36.6). Then, again multiplying it with $\langle\tilde n^{(0)},\beta|$, we obtain
$$\sum_{\alpha=1}^{g_m} H_{1,n\beta\alpha}C_{n\alpha}^{(0)} = E_n^{(1)}C_{n\beta}^{(0)}. \tag{36.31}$$
This is a matrix equation for diagonalization, $H_1\cdot C = EC$, so the possible eigenvalues, for $E_n^{(1)}$, are solutions to the equation
$$\det(H_{1,n\beta\alpha} - E_n^{(1)}\delta_{\beta\alpha}) = 0. \tag{36.32}$$
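Next we consider the $m \neq n$ case. Again replacing $|\tilde n^{(s)},\alpha\rangle$ with $|\psi_n^{(s)}\rangle$ in (36.6), but multiplying from the left with $\langle\tilde m^{(0)},\beta|$, we obtain
$$\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\hat H_0|\tilde n^{(1)},\alpha\rangle - E_n^{(0)}\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\tilde n^{(1)},\alpha\rangle + \sum_\alpha C_{n\alpha}^{(0)}\langle\tilde m^{(0)},\beta|\hat H_1|\tilde n^{(0)},\alpha\rangle = 0. \tag{36.33}$$
Acting with $\hat H_0$ from the left on the first term, we obtain
$$(E_m^{(0)} - E_n^{(0)})\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\tilde n^{(1)},\alpha\rangle + \sum_\alpha C_{n\alpha}^{(0)}H_{1,m\beta,n\alpha} = 0, \tag{36.34}$$
where we have defined
$$H_{1,m\beta,n\alpha} \equiv \langle\tilde m^{(0)},\beta|\hat H_1|\tilde n^{(0)},\alpha\rangle \tag{36.35}$$
and, expanding $|\psi_n^{(1)}\rangle$ in the complete set of $\hat H_0$ eigenstates $|\tilde n^{(0)},\alpha\rangle$,
$$|\psi_n^{(1)}\rangle = \sum_\alpha C_{n\alpha}^{(1)}|\tilde n^{(1)},\alpha\rangle = \sum_{m,\beta}\tilde C_{m\beta}^{(1)}|\tilde m^{(0)},\beta\rangle, \tag{36.36}$$
we first find that (by multiplication with $\langle\tilde m^{(0)},\beta|$)
$$\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\tilde n^{(1)},\alpha\rangle = \tilde C_{m\beta}^{(1)}, \tag{36.37}$$
in which case the perturbation equation is solved by
$$\tilde C_{m\beta}^{(1)} = \frac{\sum_\alpha H_{1,m\beta,n\alpha}C_{n\alpha}^{(0)}}{E_n^{(0)} - E_m^{(0)}}. \tag{36.38}$$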
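Substituting back in the definition of $|\psi_n^{(1)}\rangle$, we find
$$|\psi_n^{(1)}\rangle = \sum_\alpha C_{n\alpha}^{(1)}|\tilde n^{(1)},\alpha\rangle = \sum_{m,\beta}\frac{\sum_\alpha H_{1,m\beta,n\alpha}C_{n\alpha}^{(0)}}{E_n^{(0)} - E_m^{(0)}}\,|\tilde m^{(0)},\beta\rangle. \tag{36.39}$$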
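Second Order

For the second perturbation order, replacing $|\tilde n^{(s)},\alpha\rangle$ with $|\psi_n^{(s)}\rangle$ in (36.7), we find
$$(\hat H_0 - E_n^{(0)})\sum_\alpha C_{n\alpha}^{(2)}|\tilde n^{(2)},\alpha\rangle + (\hat H_1 - E_n^{(1)})\sum_\alpha C_{n\alpha}^{(1)}|\tilde n^{(1)},\alpha\rangle - E_n^{(2)}\sum_\alpha C_{n\alpha}^{(0)}|\tilde n^{(0)},\alpha\rangle = 0. \tag{36.40}$$
The $m = n$ case. Multiplying from the left with the bra $\langle\tilde n^{(0)},\beta|$, we find
$$\sum_\alpha C_{n\alpha}^{(2)}\langle\tilde n^{(0)},\beta|\hat H_0|\tilde n^{(2)},\alpha\rangle - E_n^{(0)}\sum_\alpha C_{n\alpha}^{(2)}\langle\tilde n^{(0)},\beta|\tilde n^{(2)},\alpha\rangle + \sum_\alpha C_{n\alpha}^{(1)}\langle\tilde n^{(0)},\beta|\hat H_1|\tilde n^{(1)},\alpha\rangle - E_n^{(1)}\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde n^{(0)},\beta|\tilde n^{(1)},\alpha\rangle - E_n^{(2)}C_{n\beta}^{(0)} = 0. \tag{36.41}$$
Acting with $\hat H_0$ from the left on the first term, the first two terms cancel, and we are left with
$$\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde n^{(0)},\beta|\hat H_1|\tilde n^{(1)},\alpha\rangle - E_n^{(1)}\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde n^{(0)},\beta|\tilde n^{(1)},\alpha\rangle = E_n^{(2)}C_{n\beta}^{(0)}. \tag{36.42}$$
However, from the first-order analysis, replacing the form of $|\psi_n^{(1)}\rangle$, we have
$$\begin{aligned}
\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\hat H_1|\tilde n^{(1)},\alpha\rangle &= \langle\tilde m^{(0)},\beta|\hat H_1|\psi_n^{(1)}\rangle = \sum_\alpha\sum_{p,\gamma} C_{n\alpha}^{(0)}\frac{H_{1,m\beta,p\gamma}H_{1,p\gamma,n\alpha}}{E_n^{(0)} - E_p^{(0)}}, \\
\sum_\alpha C_{n\alpha}^{(1)}\langle\tilde m^{(0)},\beta|\tilde n^{(1)},\alpha\rangle &= \langle\tilde m^{(0)},\beta|\psi_n^{(1)}\rangle = 0.
\end{aligned} \tag{36.43}$$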
Using these identities, we obtain that the second-order equation for $m = n$ becomes
$$E_n^{(2)}C_{n\beta}^{(0)} = \langle\tilde m^{(0)},\beta|\hat H_1|\psi_n^{(1)}\rangle = \sum_\alpha\sum_{p,\gamma} C_{n\alpha}^{(0)}\frac{H_{1,m\beta,p\gamma}H_{1,p\gamma,n\alpha}}{E_n^{(0)} - E_p^{(0)}}. \tag{36.44}$$
Then defining the matrix element of an abstract operator $\hat K^{(2)}$ by
$$\langle\tilde n^{(0)},\beta|\hat K^{(2)}|\tilde n^{(0)},\alpha\rangle \equiv \sum_{p,\gamma}\frac{\langle\tilde m^{(0)},\beta|\hat H_1|\tilde p^{(0)},\gamma\rangle\langle\tilde p^{(0)},\gamma|\hat H_1|\tilde n^{(0)},\alpha\rangle}{E_n^{(0)} - E_p^{(0)}}, \tag{36.45}$$
the equation for $E_n^{(2)}$ again becomes an eigenvalue problem for $\hat K^{(2)}$,
$$\sum_\alpha\langle\tilde n^{(0)},\beta|\hat K^{(2)}|\tilde n^{(0)},\alpha\rangle C_{n\alpha}^{(0)} = E_n^{(2)}C_{n\beta}^{(0)}. \tag{36.46}$$
For the case $m \neq n$ we could again find the second-order wave function, but the analysis is more complicated and will be skipped.
36.4 General Form of Solution (to All Orders)

Now we will show the main points of a different approach, which works to all orders in perturbation theory. Consider the eigenstate of the total Hamiltonian $\hat H$, $|\tilde n,\alpha,\lambda\rangle$, and denote it by $|n,\alpha\rangle$. Define the projector onto the Hilbert subspace of given energy $E_n$ (the space of $|n,\alpha\rangle$ states for any $\alpha$),
$$\hat P_n = \sum_\alpha |n,\alpha\rangle\langle n,\alpha|. \tag{36.47}$$
Since it is a projector, it satisfies
$$\hat P_n^2 = \hat P_n, \quad \hat P_n\hat P_m = 0 \ (m\neq n) \;\Rightarrow\; \hat P_n\hat P_m = \delta_{mn}\hat P_n, \qquad \sum_n\hat P_n = 1, \tag{36.48}$$
as we can check. Then the Schrödinger equation $\hat H|n,\alpha\rangle = E_n|n,\alpha\rangle$ becomes
$$\hat H\hat P_n = E_n\hat P_n. \tag{36.49}$$
Define the resolvent of this equation,
$$\hat G(z) \equiv (z - \hat H)^{-1} \overset{\rm not.}{=} \frac{1}{z - \hat H}, \tag{36.50}$$
where the label above the equals sign indicates that the notation with fractions will be used for convenience, and where $z$ is a complex variable; the resolvent can be thought of as a sum of power laws,
$$\hat G(z) = \frac{1}{z}\sum_{n=0}^{\infty}\left(\frac{\hat H}{z}\right)^n. \tag{36.51}$$
Then, using the Schrödinger equation, we obtain
$$\hat G(z)\hat P_n = \frac{\hat P_n}{z - E_n}. \tag{36.52}$$
Summing over $n$, and using the fact that $\sum_n\hat P_n = 1$, we obtain
$$\hat G(z) = \sum_n\frac{\hat P_n}{z - E_n} = \sum_n\frac{\sum_\alpha |n\alpha\rangle\langle n\alpha|}{z - E_n}. \tag{36.53}$$
This means that we can use the residue theorem in the complex plane to identify a projector $\hat P_n$:
$$\hat P_n = \frac{1}{2\pi i}\oint_{\Gamma_n}\hat G(z)\,dz, \tag{36.54}$$
where $\Gamma_n$ is a contour in the complex $z$ plane around the real pole at $z = E_n$. Now we remember that $\hat H = \hat H_0 + \lambda\hat H_1$, which means that
$$\hat G(z) = \frac{1}{z - \hat H_0 - \lambda\hat H_1}, \tag{36.55}$$
and define the same resolvent for the free theory (just with $\hat H_0$),
$$\hat G_0(z) = \frac{1}{z - \hat H_0}. \tag{36.56}$$
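The contour formula (36.54) can be tested numerically for a finite-dimensional Hamiltonian by discretizing a circle around one eigenvalue. The sketch below is an illustration under stated assumptions: `projector_from_resolvent` is a hypothetical helper, the $3\times 3$ matrix is arbitrary, and the circle is chosen with radius equal to half the gap to the next level so that it encloses only the lowest pole.

```python
import numpy as np

def projector_from_resolvent(h, center, radius, n_points=400):
    """Approximate (1/(2 pi i)) * contour integral of (z - h)^(-1) on a circle."""
    dim = h.shape[0]
    identity = np.eye(dim)
    total = np.zeros((dim, dim), dtype=complex)
    for k in range(n_points):
        theta = 2.0 * np.pi * k / n_points
        z = center + radius * np.exp(1j * theta)
        # dz = i * radius * exp(i theta) * d(theta) along the circle
        dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / n_points)
        total += np.linalg.inv(z * identity - h) * dz
    return total / (2.0j * np.pi)

# Arbitrary Hermitian example matrix (not taken from the text).
h = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.2],
              [0.0, 0.2, 4.0]])
energies, vectors = np.linalg.eigh(h)

# Circle around the lowest eigenvalue, excluding all other poles.
gap = energies[1] - energies[0]
p_contour = projector_from_resolvent(h, energies[0], gap / 2.0)

# Direct projector |n><n| built from the eigenvector, for comparison.
p_direct = np.outer(vectors[:, 0], vectors[:, 0].conj())

print("max difference:", np.max(np.abs(p_contour - p_direct)))
```

Enlarging the circle so that it also encloses the second eigenvalue would instead return the sum of the two projectors, which is the property exploited later when the contour is taken around both $E_n^{(0)}$ and $E_n$.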
Then we can write the identities
$$\begin{aligned}
\hat G(z) = \frac{1}{z - \hat H_0 - \lambda\hat H_1} &= \frac{1}{z - \hat H_0}\big[(z - \hat H_0 - \lambda\hat H_1) + \lambda\hat H_1\big]\frac{1}{z - \hat H_0 - \lambda\hat H_1} \\
&= \frac{1}{z - \hat H_0} + \frac{1}{z - \hat H_0}\,\lambda\hat H_1\,\frac{1}{z - \hat H_0 - \lambda\hat H_1} \\
&= \hat G_0(z)\big[1 + \lambda\hat H_1\hat G(z)\big].
\end{aligned} \tag{36.57}$$
We have thus obtained a self-consistent equation tailor-made for perturbation theory,
$$\hat G(z) = \hat G_0(z)\big[1 + \lambda\hat H_1\hat G(z)\big], \tag{36.58}$$
since the term with $\hat G(z)$ on the right-hand side is proportional to $\lambda$. This means that we can iterate the equation, by putting $\hat G = \hat G_0$ on the right-hand side (so $\hat G^{(0)} = \hat G_0$), and then we have $\hat G$ up to the first order on the left-hand side. Next we put this on the right-hand side of the equation, obtaining $\hat G$ up to the second order on the left-hand side. We continue this procedure ad infinitum, thus obtaining the perturbative solution to all orders in perturbation theory,
$$\hat G = \sum_{n=0}^{\infty}\lambda^n\,\hat G_0(\hat H_1\hat G_0)^n. \tag{36.59}$$
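The iteration leading to (36.59) can be illustrated with matrices: the partial sums of the series should approach the exact resolvent when $\lambda\hat H_1\hat G_0(z)$ is small. The following sketch assumes an arbitrary $3\times 3$ example, a complex $z$ away from the spectrum, and a hypothetical helper `resolvent_series`; none of these choices come from the text.

```python
import numpy as np

def resolvent_series(h0, h1, lam, z, order):
    """Partial sum of sum_n lam^n G0 (H1 G0)^n for n = 0..order."""
    g0 = np.linalg.inv(z * np.eye(h0.shape[0]) - h0)
    term = g0.copy()                 # n = 0 term: G0
    total = g0.copy()
    for _ in range(order):
        term = lam * term @ h1 @ g0  # multiply by lam * H1 * G0 on the right
        total = total + term
    return total

h0 = np.diag([0.0, 1.0, 2.5])
h1 = np.array([[0.2, 0.5, 0.1],
               [0.5, -0.3, 0.4],
               [0.1, 0.4, 0.6]])
lam = 0.05
z = 0.5 + 1.0j                       # off the real axis, away from all poles

exact = np.linalg.inv(z * np.eye(3) - h0 - lam * h1)
errors = [np.linalg.norm(resolvent_series(h0, h1, lam, z, k) - exact)
          for k in range(7)]

for k, err in enumerate(errors):
    print(f"order {k}: |G_series - G_exact| = {err:.3e}")
```

Each additional order multiplies the error by roughly the size of $\lambda\hat H_1\hat G_0(z)$, so the series is useful only where that product is small; close to an unperturbed eigenvalue the expansion in this naive form breaks down.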
Define a contour in the complex plane $\tilde\Gamma_n$ that encircles both $E_n^{(0)}$ (eigenvalue of $\hat H_0$) and the real values $E_n$. Then we have
$$\hat P_n = \frac{1}{2\pi i}\oint_{\tilde\Gamma_n}\hat G(z)\,dz \tag{36.60}$$
as before, but now we can also use the solution (36.59) inside the equation, and then use the same equation for $\hat P_{n,0}$ in terms of $\hat G_0$ to find
$$\hat P_n = \hat P_{n,0} + \sum_{n=1}^{\infty}\lambda^n A^{(n)}, \tag{36.61}$$
where
$$A^{(n)} = \frac{1}{2\pi i}\oint_{\tilde\Gamma_n}\hat G_0(\hat H_1\hat G_0)^n\,dz. \tag{36.62}$$
One can also find expressions for the expansions of $\hat P_n$ and $\hat H\hat P_n$ in $\lambda^n$ from the above, but we will not do it here. Then, acting with $\hat H\hat P_n$ on the eigenstates of $\hat H_0$, $|\tilde n^{(0)},\alpha\rangle$, the operator $\hat P_n$ projects them onto the states of energy $E_n$ for $\hat H$, so we can act with $\hat H$, and obtain
$$\hat H\hat P_n|\tilde n^{(0)},\alpha\rangle = E_n\hat P_n|\tilde n^{(0)},\alpha\rangle. \tag{36.63}$$
But then, without altering anything, we can put $\hat P_{n,0}$ in front of $|\tilde n^{(0)},\alpha\rangle$ and multiply the equation from the left by $\hat P_{n,0}$. Defining the following operators, which are Hermitian, as we can easily check,
$$\hat H_n = \hat P_{n,0}\hat H\hat P_n\hat P_{n,0}, \qquad \hat K_n = \hat P_{n,0}\hat P_n\hat P_{n,0}, \tag{36.64}$$
the resulting equation becomes
$$\hat H_n|\tilde n^{(0)},\alpha\rangle = E_n\hat K_n|\tilde n^{(0)},\alpha\rangle. \tag{36.65}$$
In order for this equation to have a solution, the energy $E_n$ must satisfy a secular equation,
$$\det(\hat H_n - E_n\hat K_n) = 0. \tag{36.66}$$
The expansion in λ of Hˆ n and Kˆ n can be derived from the previous expansion of Hˆ Pˆn and Pˆn , resulting in the eigenvalue equation for En in a λ expansion. We will not continue it here, however.
36.5 Example: Stark Effect in Hydrogenoid Atom Consider the Stark effect, which is the interaction of a system having electric dipole moment d with a constant electric field E in the z direction, with interaction Hamiltonian Hˆ 1 = dEz = dEr cos θ.
(36.67)
Moreover, suppose that the unperturbed Hamiltonian Hˆ 0 is that for the hydrogenoid atom. For perturbation theory to work, we need that the interaction energy Enucleus .
We first apply perturbation theory to the ground state, with energy quantum number $n = 1$, starting with the first order. For $n = 1$, $g_1 = 1$, and the only state $(nlm)$ is $(100)$. But since $z = r\cos\theta$ is directional, whereas the $(100)$ state is spherically symmetric, we have that
$$(H_1)_{100,100} = 0. \tag{36.68}$$
This means that for the first-order correction to the ground state energy $E_1$ we have
$$E_1^{(1)} = (H_1)_{100,100} = 0. \tag{36.69}$$
The first nonzero contribution is from the second order, which in this case is found from the general expression
$$E_1^{(2)} = \sum_{(nlm)\neq(100)}\frac{|H_{1,100,nlm}|^2}{E_1 - E_n}. \tag{36.70}$$
We cannot find a closed form for the solution; instead we must calculate term by term in $(nlm)$ and then sum, so we will not go further in evaluation of this formula. Next we apply perturbation theory to the first excited state, with $n = 2$. In this case $g_2 = 2^2 = 4$. The four states in this Hilbert subspace are $|2\alpha\rangle = |2lm\rangle$, specifically denoted by
$$|2;1\rangle \equiv |2,0,0\rangle, \quad |2;2\rangle \equiv |2,1,0\rangle, \quad |2;3\rangle \equiv |2,1,1\rangle, \quad |2;4\rangle \equiv |2,1,-1\rangle. \tag{36.71}$$
Then the matrix elements of the perturbation are
$$\langle nl_1m_1|\hat H_1|nl_2m_2\rangle = dE\,\langle l_1m_1|\cos\theta|l_2m_2\rangle\,r_{nl_1,nl_2}. \tag{36.72}$$
In order for the result to be nonzero, we must have $m_1 = m_2$ and $l_1 = l_2 \pm 1$. Moreover, we find that
$$\langle l,m|\cos\theta|l-1,m\rangle = \langle l-1,m|\cos\theta|l,m\rangle = \sqrt{\frac{l^2 - m^2}{4l^2 - 1}}, \tag{36.73}$$
which we leave as an exercise for the reader to prove. Therefore in our case (with states 1, 2, 3, 4) the only nonzero matrix element is $H_{1,12} \neq 0$, with
$$\langle 2;1|\cos\theta|2;2\rangle = \frac{1}{\sqrt{3}}. \tag{36.74}$$
Then the matrix element is
$$H_{1,12} = -dE\,\frac{r_{20,21}}{\sqrt{3}} = -3dE\,\frac{a_0}{Z}. \tag{36.75}$$
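The angular matrix element (36.73) can be checked numerically in the special case $m = 0$, where $Y_{l0}(\theta) = \sqrt{(2l+1)/4\pi}\,P_l(\cos\theta)$ and the integral reduces to one over Legendre polynomials. This is a numerical sanity check rather than the proof requested in the exercise, and the helper name `cos_theta_element_m0` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def cos_theta_element_m0(l, n_nodes=64):
    """<l,0| cos(theta) |l-1,0> by Gauss-Legendre quadrature in x = cos(theta)."""
    x, w = legendre.leggauss(n_nodes)
    p_l = legendre.legval(x, [0.0] * l + [1.0])            # P_l(x)
    p_lm1 = legendre.legval(x, [0.0] * (l - 1) + [1.0])    # P_{l-1}(x)
    # Normalizations of Y_{l0} and Y_{l-1,0}; the phi integral contributes 2 pi.
    norm = 0.5 * np.sqrt((2 * l + 1) * (2 * l - 1))
    return norm * np.sum(w * p_l * x * p_lm1)

for l in range(1, 6):
    numeric = cos_theta_element_m0(l)
    formula = np.sqrt(l**2 / (4.0 * l**2 - 1.0))           # (36.73) with m = 0
    print(f"l = {l}: quadrature = {numeric:.12f}, formula = {formula:.12f}")
```

For $l = 1$ both values reduce to $1/\sqrt{3}$, the number quoted in (36.74).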
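Next we need to diagonalize the matrix
$$\begin{pmatrix} 0 & H_{1,12} & 0 & 0 \\ H_{1,12} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = H_{1,12}\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{36.76}$$
The secular equation for the eigenvalues is (writing the eigenvalue as $E_2^{(1)} = \lambda$)
$$\lambda^2\begin{vmatrix} -\lambda & H_{1,12} \\ H_{1,12} & -\lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2(\lambda^2 - H_{1,12}^2) = 0, \tag{36.77}$$
with solutions
$$E_2^{(1)} = H_{1,12}\,(0, 0, +1, -1). \tag{36.78}$$
The eigenstates are also found from the diagonalization, namely
$$\begin{aligned}
|\psi_3\rangle &= |2;3\rangle = |2,1,1\rangle \\
|\psi_4\rangle &= |2;4\rangle = |2,1,-1\rangle \\
|\psi_1\rangle &= \frac{1}{\sqrt{2}}(|2;1\rangle + |2;2\rangle) = \frac{1}{\sqrt{2}}(|2,0,0\rangle + |2,1,0\rangle) \\
|\psi_2\rangle &= \frac{1}{\sqrt{2}}(|2;1\rangle - |2;2\rangle) = \frac{1}{\sqrt{2}}(|2,0,0\rangle - |2,1,0\rangle).
\end{aligned} \tag{36.79}$$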
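The diagonalization in (36.76)–(36.79) is small enough to verify directly. In the sketch below the matrix element is expressed in units of $dE\,a_0/Z$, so that $H_{1,12} = -3$ as in (36.75); the ordering of the basis follows (36.71). The variable names are illustrative only.

```python
import numpy as np

h12 = -3.0   # H_{1,12} in units of d*E*a0/Z, following (36.75)

# Perturbation in the n = 2 subspace, basis order |2;1>, |2;2>, |2;3>, |2;4>.
h1 = np.zeros((4, 4))
h1[0, 1] = h1[1, 0] = h12

evals, evecs = np.linalg.eigh(h1)   # eigenvalues returned in ascending order
print("first-order energy shifts:", evals)

# Combinations quoted in (36.79).
psi1 = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2.0)
psi2 = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2.0)

print("H1 psi1 = +H12 psi1:", np.allclose(h1 @ psi1, h12 * psi1))
print("H1 psi2 = -H12 psi2:", np.allclose(h1 @ psi2, -h12 * psi2))
```

The states $|2,1,\pm 1\rangle$ are untouched at this order, so a twofold degeneracy survives the first-order Stark splitting.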
Important Concepts to Remember
• Time-independent perturbation theory arises for a time-independent Hamiltonian $\hat H(\lambda) = \hat H_0 + \lambda\hat H_1$, where $\hat H_0$ is the “free” part, with large eigenvalues, and $\lambda\hat H_1$ is a perturbation, with small eigenvalues.
• In the nondegenerate case, to first order we find $E_n^{(1)} = H_{1,nn}$ and $C_m^{n(1)} = H_{1,mn}/(E_n^{(0)} - E_m^{(0)})$, and to second order $E_n^{(2)} = \sum_{m\neq n}|H_{1,nm}|^2/(E_n^{(0)} - E_m^{(0)})$.
• In the degenerate case, the basis $|\tilde n^{(0)},\alpha\rangle$ at $\lambda \to 0$ is generically different from the free basis, $|n^{(0)},\alpha\rangle$. Moreover, the states are combinations of the basis states, $|\psi_n\rangle = \sum_\alpha C_{n\alpha}|\tilde n,\alpha\rangle$.
• The equation for $E_n^{(1)}$ is a secular equation in the basis states, $\det(H_{1,n\beta\alpha} - E_n^{(1)}\delta_{\beta\alpha}) = 0$, which solves $\sum_\alpha H_{1,n\beta\alpha}C_{n\alpha}^{(0)} - E_n^{(1)}C_{n\beta}^{(0)} = 0$. Then $\tilde C_{m\beta}^{(1)} = \sum_\alpha H_{1,m\beta,n\alpha}C_{n\alpha}^{(0)}/[E_n^{(0)} - E_m^{(0)}]$.
• In the general case of perturbation theory, defining the resolvent $\hat G(z) = \dfrac{1}{z - \hat H} = \sum_n\Big[\sum_\alpha |n\alpha\rangle\langle n\alpha|/(z - E_n)\Big]$ and the free resolvent, $\hat G_0(z) = \dfrac{1}{z - \hat H_0}$, we have the self-consistent equation $\hat G(z) = \hat G_0(z)(1 + \lambda\hat H_1\hat G(z))$, solved by perturbation theory via iteration.
• The Stark effect for a hydrogenoid atom, $\hat H_1 = dEz$, is trivial in first-order perturbation theory, and is nontrivial at second order.
Further Reading See [2], [3], [1].
Exercises
(1) Consider a one-dimensional harmonic oscillator perturbed by a linear potential, $\lambda\hat H_1 = \lambda x$. Calculate the first-order perturbation theory corrections to the energy and wave functions, and compare with the exact solution for $\hat H_0 + \lambda\hat H_1$ (which is trivial to find in this case).
(2) Consider a hydrogenoid atom perturbed by a decaying exponential potential $\lambda\hat H_1 = \lambda e^{-\mu r}$. Calculate the first-order perturbation theory corrections to the energy and ground state wave function.
(3) Write down an explicit formula for the second-order perturbation contribution to the ground state energy in exercise 2, without calculating the terms and summing them.
(4) Calculate the first-order perturbation contribution to the energy of the first excited state of the hydrogenoid atom in exercise 2.
(5) Formally, to obtain the perturbative solution (36.59), we did not need the self-consistent equation (36.58), we just needed to expand the definition of $\hat G(z)$ in $\lambda$: show this.
(6) Prove the relation (36.73), used in the analysis of the Stark effect.
(7) Use first-order perturbation theory to find the Stark effect splittings for the second excited state of the hydrogenoid atom, $n = 3$.
37
Time-Dependent Perturbation Theory: First Order
In this chapter, we will consider the time dependence of the Schrödinger equation, and solve it in perturbation theory. We will concentrate on the first order, leaving the second-order case and the general expansion for the next chapter. Consider then the time-dependent Hamiltonian $\hat H$ split into a nonperturbed, time-independent part $\hat H_0$ and a time-dependent perturbation,
$$\hat H(t) = \hat H_0 + \lambda\hat H_1(t). \tag{37.1}$$
As before, for this to be a satisfactory perturbation theory, we need $\lambda \ll 1$, or rather that the matrix elements of $\lambda\hat H_1(t)$ are much smaller than the matrix elements of $\hat H_0$. Thus, we will attempt to solve the time-dependent Schrödinger equation,
$$i\hbar\,\partial_t|\psi(t)\rangle = \hat H(t)|\psi(t)\rangle, \tag{37.2}$$
as an expansion in $\lambda$.
37.1 Evolution Operator

We defined the evolution operator $\hat U(t_2, t_1)$ by
$$|\psi(t_2)\rangle = \hat U(t_2, t_1)|\psi(t_1)\rangle, \tag{37.3}$$
which implies that the Schrödinger equation is now
$$i\hbar\,\partial_t\hat U(t, t_0) = \hat H(t)\hat U(t, t_0). \tag{37.4}$$
If the Hamiltonian is time independent, $\hat H(t) = \hat H$, we obtain $\hat U(t, t_0) = e^{-i\hat H(t - t_0)/\hbar}$. Defining the eigenstates of $\hat H$ as $|\psi_n\rangle$ with eigenvalues $E_n$, and if the state at $t = 0$ is
$$|\psi(t = 0)\rangle = \sum_n a_n|\psi_n(0)\rangle, \tag{37.5}$$
then the state at time $t$ is
$$|\psi(t)\rangle = \sum_n a_n e^{-iE_n t/\hbar}|\psi_n(0)\rangle. \tag{37.6}$$
However, in general we have $\hat H(t)$ (a time-dependent Hamiltonian), so choosing at an initial time $t_i$ a state among the eigenstates $|\psi_n\rangle$, namely
$$|\psi(t = 0)\rangle = |\psi_i\rangle, \tag{37.7}$$
we find that, at a final time, the state has become a superposition of all the $|\psi_n\rangle$ states,
$$|\psi(t = t_f)\rangle = \hat U(t_f, t_i)|\psi_i\rangle = \sum_n a_n(t_f)|\psi_n\rangle. \tag{37.8}$$
Multiplying from the left with the bra $\langle\psi_m|$, we find that
$$a_m(t_f) = \langle\psi_m|\hat U(t_f, t_i)|\psi_i\rangle \equiv U_{mi}(t_f, t_i). \tag{37.9}$$
Then the transition probability (during the period from $t_i$ to $t_f$) between $|\psi_i\rangle$ and $|\psi_f\rangle$ is
$$P_{fi} = |U_{fi}(t_f, t_i)|^2. \tag{37.10}$$
37.2 Method of Variation of Constants

Consider the eigenvalue problem for the unperturbed Hamiltonian $\hat H_0$,
$$\hat H_0|\psi_n\rangle = E_n|\psi_n\rangle, \tag{37.11}$$
and an initial condition
$$|\psi\rangle = \sum_n c_n|\psi_n\rangle. \tag{37.12}$$
As in the previous analysis of the evolution operator, the time-dependent interaction $\hat H_1(t)$ can be used to calculate probabilities for transitions between the states $|\psi_n\rangle$. For the solution of the time-dependent Schrödinger equation for $\hat H(t)$, we write an ansatz by lifting to time dependence the constants $c_n$, so that
$$|\psi(t)\rangle = \sum_n c_n(t)|\psi_n\rangle. \tag{37.13}$$
Comparing with (37.8), we see that
$$a_n(t) = c_n(t) = U_{ni}(t, t_i). \tag{37.14}$$
Substituting this ansatz in the Schrödinger equation, we obtain
$$i\hbar\sum_n\dot c_n(t)|\psi_n\rangle = \hat H(t)\sum_n c_n(t)|\psi_n\rangle. \tag{37.15}$$
Multiplying from the left with the bra $\langle\psi_m|$, we obtain the system of equations
$$i\hbar\,\dot c_m(t) = \sum_n c_n(t)\langle\psi_m|\hat H(t)|\psi_n\rangle \equiv \sum_n c_n(t)H_{mn}(t). \tag{37.16}$$
This system of equations is equivalent to the original Schrödinger equation. Again, we must use the assumption that $\{|\psi_n\rangle\}$ is a complete basis for the full $\hat H(t)$ as well. Then, we expand the $c_n(t)$ coefficients in $\lambda$,
$$c_n = c_n(t, \lambda) = \sum_{s=0}^{\infty} c_n^{(s)}\lambda^s. \tag{37.17}$$
Taking matrix elements of $\hat H(t) = \hat H_0 + \lambda\hat H_1(t)$ in the basis $|\psi_n\rangle$ of $\hat H_0$, we obtain
$$H_{mn}(t) = E_n\delta_{mn} + \lambda H_{1,mn}(t). \tag{37.18}$$
Moreover, we can factorize the time dependence of the unperturbed Hamiltonian from the coefficients cn (t), cn (t) = bn (t)e−iEn t/ ,
(37.19)
and therefore find an expansion in λ of bn (t) as well, bn (t) =
∞
bn(s) λ s .
(37.20)
s=0
Then the Schr¨odinger equation for cn (t) becomes one for bn (t) that is simpler, ib˙ m (t) = λH1,mn (t)ei(Em −En )t/ = λH1,mn (t)eiω mn t , n
(37.21)
n
where we define the transition frequency Em − En .
ωmn ≡
(37.22)
The boundary conditions for the equations for $b_n(t)$ are that $b_n^{(0)}$ is constant, and defines the initial (unperturbed) wave function. To solve the equations, we identify the coefficient of $\lambda^{s+1}$ on both sides of the equation, resulting in
\[ i\hbar\,\dot b_m^{(s+1)}(t) = \sum_n H_{1,mn}(t)\,b_n^{(s)}(t)\,e^{i\omega_{mn}t}. \tag{37.23} \]
This results in an iterative solution of the equation, given the boundary conditions: insert $b_n^{(0)}$ on the right-hand side and find $b_n^{(1)}$ from the left-hand side; then insert $b_n^{(1)}$ on the right-hand side and find $b_n^{(2)}$ from the left-hand side, etc.

Consider the boundary conditions given by an initial state $|\psi_i\rangle$, so that
\[ b_n^{(0)}(t) = \delta_{ni}, \qquad b_n^{(s)}(t=0) = 0, \quad s>0. \tag{37.24} \]
The second condition means that, at the initial time, there is no component with $n\neq i$ at any order in $\lambda$. Then, the equation at first order in $\lambda$ is
\[ i\hbar\,\dot b_m^{(1)}(t) = H_{1,mi}(t)\,e^{i\omega_{mi}t}, \tag{37.25} \]
with solution
\[ i\hbar\,b_m^{(1)}(\tau) = \int_0^{\tau} dt\, e^{i\omega_{mi}t}H_{1,mi}(t). \tag{37.26} \]
Finally, the time-dependent state to first order is
\[ |\psi(t)\rangle = \sum_n \left(\delta_{ni} + b_n^{(1)}(t)\right)e^{-iE_nt/\hbar}|\psi_n\rangle. \tag{37.27} \]
Then the transition probability for the system to go from a state with energy $E_i$ to a state with energy $E_n$ (with $n\neq i$) during the time $t$ is
\[ P_{ni}(t) = |b_n^{(1)}(t)|^2. \tag{37.28} \]
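As a small numerical illustration of (37.26) and (37.28), the following Python sketch evaluates the first-order amplitude by a midpoint rule and compares it with the closed form for a constant matrix element. The function names, the choice $\hbar = 1$ and the sample values are assumptions made only for this illustration.

```python
import cmath
import math


def first_order_amplitude(h1_mi, omega_mi, tau, hbar=1.0, steps=20000):
    """Midpoint-rule estimate of b_m^(1)(tau) from (37.26):
    i*hbar*b = int_0^tau dt exp(i*omega_mi*t) * H1_mi(t)."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += cmath.exp(1j * omega_mi * t) * h1_mi(t) * dt
    return total / (1j * hbar)


def constant_perturbation_amplitude(v, omega_mi, tau, hbar=1.0):
    """Closed form of the same integral when H1_mi(t) = v is constant."""
    return v * (1 - cmath.exp(1j * omega_mi * tau)) / (hbar * omega_mi)


def transition_probability(amplitude):
    """First-order transition probability, as in (37.28)."""
    return abs(amplitude) ** 2


if __name__ == "__main__":
    v, omega, tau = 0.05, 2.0, 3.0  # illustrative values, hbar = 1
    b_num = first_order_amplitude(lambda t: v, omega, tau)
    b_exact = constant_perturbation_amplitude(v, omega, tau)
    print(transition_probability(b_num), transition_probability(b_exact))
    # For a constant perturbation the probability is 4 v^2 sin^2(omega*tau/2) / omega^2
    print(4 * v**2 * math.sin(omega * tau / 2) ** 2 / omega**2)
```

The same routine accepts any callable `h1_mi(t)`, so a time-dependent matrix element can be inserted without changing the integration loop.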
37.3 A Time-Independent Perturbation Being Turned On

Consider a time-independent perturbation turned on at time $t = 0$, so that ⎧ t 0)
\[ \lim_{a\to\infty}\int_y^z dx\,\frac{1-e^{-iax}}{x}\,f(x) = i\pi f(0) + \lim_{\epsilon\to 0}\left[\int_y^{-\epsilon} dx\,\frac{f(x)}{x} + \int_{+\epsilon}^{z} dx\,\frac{f(x)}{x}\right] = i\pi f(0) + P\int_y^z \frac{f(x)}{x}\,dx. \]
Here we have defined the principal part $P$ of an integral as the integral minus its pole,
\[ P\int_y^z \frac{f(x)}{x}\,dx = \int_y^{-\epsilon} \frac{f(x)}{x}\,dx + \int_{+\epsilon}^{z} \frac{f(x)}{x}\,dx. \]
(38.7)
(38.8)
(38.9)
To prove the above relation, valid for distributions, we first note that
\[ \frac{1-e^{-iax}}{x} = \frac{i\sin ax}{x} + \frac{1-\cos ax}{x}, \tag{38.10} \]
and, from the previous chapter, we know that if $a\to\infty$, then the first term goes to $i\pi\delta(x)$. Moreover,
\[ \int_y^z dx\,\frac{1-\cos ax}{x}\,f(x) = \left[\int_y^{-\epsilon} + \int_{-\epsilon}^{+\epsilon} + \int_{+\epsilon}^{z}\right] dx\,\frac{1-\cos ax}{x}\,f(x), \tag{38.11} \]
and, since around $x=0$ we have $(1-\cos ax)/x \simeq a^2x/2$, we have that
\[ \int_{-\epsilon}^{+\epsilon} dx\,\frac{1-\cos ax}{x}\,f(x) \to 0, \tag{38.12} \]
which leaves only the principal part of the integral. Moreover, for the principal part (excluding the region near $x=0$), we have
\[ \lim_{a\to\infty} P\int \frac{\cos ax}{x}\,f(x)\,dx = 0, \tag{38.13} \]
since the integrand oscillates very fast, averaging to zero. Then finally we find
\[ \lim_{a\to\infty}\int_y^z \frac{1-e^{-iax}}{x}\,f(x)\,dx = i\pi f(0) + P\int_y^z \frac{f(x)}{x}\,dx. \tag{38.14} \]
q.e.d.
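Substituting into (38.6), with $a = t/\hbar$ and $x = E_n - E_i$, we get
\[ \frac{\dot b_i(t)}{b_i(t)} \simeq \frac{i}{\hbar}\sum_n |H_{1,in}|^2\left[i\pi\delta(E_n-E_i) + P\frac{1}{E_n-E_i}\right]. \tag{38.15} \]
More precisely, we have $\sum_n$ as a sum over final states, so
\[ \begin{aligned}
\frac{\dot b_i(t)}{b_i(t)} &\simeq \frac{1}{\hbar}\sum_{f_n}\int df_2\cdots\int dE_n\,\rho(E_n,f_2,\ldots,f_n)\,|H_{1,in}|^2\left[-\pi\delta(E_n-E_i) + iP\frac{1}{E_n-E_i}\right] \\
&= -\frac{\pi}{\hbar}\sum_{f_n}\int df_2\cdots\rho(E_i,f_2,\ldots,f_n)\,|H_{1,ni}|^2 \\
&\quad + \frac{i}{\hbar}P\int dE_n\sum_{f_n}\int df_2\cdots\rho(E_n,f_2,\ldots,f_n)\,\frac{|H_{1,ni}|^2}{E_n-E_i}.
\end{aligned} \tag{38.16} \]
The term on the second line is defined to be $-\Gamma/2$ and the term on the third line is defined to be $-(i/\hbar)\Delta E_i$, and in total the result is defined to be equal to
\[ -\gamma \equiv -\frac{\Gamma}{2} - \frac{i}{\hbar}\Delta E_i. \tag{38.17} \]
Thus
\[ \begin{aligned}
\Gamma &= \frac{2\pi}{\hbar}\sum_{f_n}\int df_2\cdots|H_{1,ni}|^2\,\rho(E_i,f_2,\ldots,f_n), \\
\Delta E_i &= -P\int dE_n\sum_{f_n}\int df_2\cdots\frac{|H_{1,ni}|^2}{E_n-E_i}\,\rho(E_n,f_2,\ldots,f_n),
\end{aligned} \tag{38.18} \]
and the decay of the initial state is given by
\[ \frac{\dot b_i(t)}{b_i(t)} \simeq -\gamma \;\Rightarrow\; b_i(t) = e^{-\gamma t} = e^{-\Gamma t/2}\,e^{-i\Delta E_i t/\hbar}. \tag{38.19} \]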
Note that the coefficient $b_i(t)$ calculated here, $1 - \Gamma t/2 - (i/\hbar)\Delta E_i t + \cdots$, amounts to a sum of the zeroth-order $b_i^{(0)} = 1$ and the second-order one, proportional to $-\Gamma/2 - (i/\hbar)\Delta E_i$, plus an infinite series of contributions from higher orders. Thus the coefficients of the initial eigenstate of $\hat H_0$ are
\[ c_i(t) = b_i(t)\exp\left(-\frac{i}{\hbar}E_i t\right) \simeq e^{-\Gamma t/2}\,e^{-i(E_i+\Delta E_i)t/\hbar}, \tag{38.20} \]
which means that $\Gamma$ is a "decay width", whereas $\Delta E_i$ is an energy shift. Indeed, the probability of finding the particle in the same state as the initial state is
\[ P_i(t) = |c_i(t)|^2 \simeq e^{-\Gamma t}, \tag{38.21} \]
so $\Gamma$ is indeed a decay width. Moreover, the lifetime of the particle is given by
\[ \langle t\rangle = \int_0^{\infty} dt\, t\left(-\frac{dP}{dt}\right) = \int_0^{\infty} dt\, t\left(-\frac{d|c_i|^2}{dt}\right) = \int_0^{\infty} dt\, t\,\Gamma e^{-\Gamma t} = \frac{1}{\Gamma}. \tag{38.22} \]
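The lifetime integral above can be checked numerically. The sketch below is only an illustration: the function names, the truncation of the upper limit at a finite number of lifetimes, and the sample value of $\Gamma$ are assumptions, not part of the text.

```python
import math


def survival_probability(t, gamma):
    """P_i(t) = exp(-Gamma * t), the exponential decay law of (38.21)."""
    return math.exp(-gamma * t)


def mean_lifetime(gamma, t_max_in_lifetimes=60.0, steps=200000):
    """Midpoint-rule estimate of <t> = int_0^inf dt t * Gamma * exp(-Gamma t).

    The upper limit is truncated at t_max_in_lifetimes / gamma, where the
    integrand is negligibly small.
    """
    t_max = t_max_in_lifetimes / gamma
    dt = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += t * gamma * survival_probability(t, gamma) * dt
    return total


if __name__ == "__main__":
    gamma = 0.5  # illustrative decay rate
    print(mean_lifetime(gamma), 1.0 / gamma)
```

The numerical value agrees with $1/\Gamma$ to the accuracy of the quadrature, which is the statement of (38.22).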
Since we found that in (38.19) there are zeroth-order plus second-order contributions, substituting it into the right-hand side of (38.1), we obtain a sum of first-order and third-order terms for $b_m$. Moreover, the terms in the sum with $n\neq i$ are negligible compared with those with $n = i$, so only the term with $n = i$ remains, and we get
\[ i\hbar\,\dot b_m^{(1)+(3)}(t) = H_{1,mi}\,b_i^{(0)+(2)}(t)\,e^{i\omega_{mi}t} = H_{1,mi}\,e^{-\gamma t}e^{i\omega_{mi}t}. \tag{38.23} \]
Figure 38.1 The Breit–Wigner distribution is a Lorentzian curve: $P_m$ as a function of $E_m$, peaked at $E_i + \Delta E_i$, with width $\Gamma$.
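Integrating this equation, we obtain
\[ b_m^{(1)+(3)}(\tau) = \frac{1}{i\hbar}H_{1,mi}\int_0^{\tau} dt\, e^{(i\omega_{mi}-\gamma)t} = -H_{1,mi}\,\frac{e^{i(E_m-E_i)\tau/\hbar}e^{-\gamma\tau} - 1}{E_m - E_i + i\hbar\gamma}. \tag{38.24} \]
At large times, $\tau\to\infty$, $e^{-\gamma\tau}\to 0$, so we obtain
\[ b_m^{(1)+(3)}(\tau\to\infty) \to \frac{H_{1,mi}}{E_m - E_i + i\hbar\gamma} = \frac{H_{1,mi}}{E_m - E_i - \Delta E_i + i\hbar\Gamma/2}. \tag{38.25} \]
Therefore the probability of finding the state in the $m$th eigenstate at large times is
\[ |c_m^{(1)+(3)}|^2 = |b_m^{(1)+(3)}|^2 = \frac{|H_{1,mi}|^2}{(E_m - E_i - \Delta E_i)^2 + (\hbar\Gamma/2)^2}, \tag{38.26} \]
which is the Breit–Wigner distribution in energies, a Lorentzian curve, as in Fig. 38.1. Indeed, here $\hbar\Gamma$ is the full width at half maximum of the distribution in energy. Notice also that this distribution implies the energy–time uncertainty relation, since the uncertainty in time is $\Delta t \sim \langle t\rangle = 1/\Gamma$ and the uncertainty in energy is $\Delta E \sim \hbar\Gamma$, so
\[ \Delta t\,\Delta E \sim \hbar. \tag{38.27} \]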
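The statement that the width parameter of the Lorentzian is its full width at half maximum can be verified directly. In the sketch below the function name and argument names are assumptions introduced for illustration; `width` denotes the width parameter of the Breit–Wigner curve in energy units.

```python
def breit_wigner(e_m, e_i, delta_e, width, h1_sq=1.0):
    """Lorentzian |H1_mi|^2 / ((E_m - E_i - dE_i)^2 + (width/2)^2).

    `width` is the width parameter in energy units and `h1_sq` stands for
    |H1_mi|^2, taken constant here for illustration.
    """
    return h1_sq / ((e_m - e_i - delta_e) ** 2 + (width / 2.0) ** 2)


if __name__ == "__main__":
    e_i, delta_e, width = 1.0, 0.1, 0.4  # illustrative values
    center = e_i + delta_e
    peak = breit_wigner(center, e_i, delta_e, width)
    # Half a width away from the peak on either side, the curve drops to peak/2
    left = breit_wigner(center - width / 2, e_i, delta_e, width)
    right = breit_wigner(center + width / 2, e_i, delta_e, width)
    print(peak, left, right)
```

The two half-maximum points are separated by exactly `width`, and the peak sits at the shifted energy $E_i + \Delta E_i$.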
38.2 General Perturbation

The general theory of perturbations (to all orders) is best described in the interaction picture, defined in Chapter 9. We review it here. As before, we consider an unperturbed Hamiltonian $\hat H_0$ and an interaction $\hat H_1$, so $\hat H = \hat H_0 + \hat H_1$. Then we go to the Heisenberg picture with the free Hamiltonian $\hat H_0$ only, so we change states and operators with respect to the Schrödinger picture,
\[ |\psi_W\rangle = \hat W|\psi\rangle, \qquad \hat A_W = \hat W\hat A\hat W^{-1}, \tag{38.28} \]
where
\[ \hat W(t) = \left(\hat U_S^{(0)}(t,t_0)\right)^{-1} = \left[\exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right)\right]^{-1} = \exp\left(+\frac{i}{\hbar}\hat H_0(t-t_0)\right), \tag{38.29} \]
and $\hat U_S^{(0)}(t,t_0)$ is the Schrödinger-picture evolution operator for propagation with $\hat H_0$. Then
\[ \begin{aligned}
|\psi_I(t)\rangle &= \exp\left(\frac{i}{\hbar}\hat H_0(t-t_0)\right)|\psi_S(t)\rangle, \\
\hat A_I &= \exp\left(\frac{i}{\hbar}\hat H_0(t-t_0)\right)\hat A_S\exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right).
\end{aligned} \tag{38.30} \]
Thus, now both the state $|\psi_I(t)\rangle$ and the operator $\hat A_I(t)$ are time dependent, with
\[ \begin{aligned}
i\hbar\frac{\partial}{\partial t}|\psi_I(t)\rangle &= \hat H_{1,I}|\psi_I(t)\rangle, \\
i\hbar\frac{\partial}{\partial t}\hat A_I(t) &= [\hat A_I(t),\hat H_{0,I}],
\end{aligned} \tag{38.31} \]
where the index $I$ means that the operator is in the interaction picture. The evolution in the interaction picture is given by
\[ |\psi_I(t)\rangle = \hat U_I(t,t_0)|\psi_I(t_0)\rangle = \hat U_I(t,t_0)|\psi_S(t_0)\rangle, \tag{38.32} \]
since all the pictures must give the same result at time $t_0$. Then we find
\[ \hat U_I(t,t_0) = \hat W(t)\hat U_S(t,t_0)\hat W^{-1}(t_0) = \exp\left(\frac{i}{\hbar}\hat H_0(t-t_0)\right)\hat U_S(t,t_0), \tag{38.33} \]
where we have used the fact that $\hat W(t_0) = 1$. This interaction-picture operator then satisfies the same equation as $|\psi_I(t)\rangle$, namely
\[ i\hbar\frac{\partial}{\partial t}\hat U_I(t,t_0) = \hat H_{1,I}\hat U_I(t,t_0). \tag{38.34} \]
This equation is solved perturbatively, as
\[ \begin{aligned}
\hat U_I(t,t_0) &= 1 + \left(\frac{-i}{\hbar}\right)\int_{t_0}^{t} dt_1\, H_{1,I}(t_1) + \left(\frac{-i}{\hbar}\right)^2\int_{t_0}^{t} dt_1\int_{t_0}^{t_1} dt_2\, H_{1,I}(t_1)H_{1,I}(t_2) + \cdots \\
&= 1 + \left(\frac{-i}{\hbar}\right)\int_{t_0}^{t} dt_1\, H_{1,I}(t_1) + \frac{1}{2!}\left(\frac{-i}{\hbar}\right)^2\int_{t_0}^{t} dt_1\int_{t_0}^{t} dt_2\, T\left\{H_{1,I}(t_1)H_{1,I}(t_2)\right\} + \cdots \\
&= T\exp\left(-\frac{i}{\hbar}\int_{t_0}^{t} dt'\, H_{1,I}(t')\right),
\end{aligned} \tag{38.35} \]
where $T$ is the time-ordering operator.
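To make the time-ordered exponential concrete, the following Python sketch builds $\hat U_I(t,0)$ for a two-level system as an ordered product of short-time propagators (later times multiplied on the left) and compares the off-diagonal element with the first-order term of the series. The model is an assumption chosen for illustration: $\hbar = 1$, $\hat H_0 = \mathrm{diag}(0,\omega)$, and a constant Schrödinger-picture perturbation $g\,\sigma_x$; all function names are hypothetical.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def h1_interaction(t, g, omega):
    """H1_I(t)_{mn} = exp(i (E_m - E_n) t) H1_{mn}, with E_0 = 0, E_1 = omega."""
    return [[0j, g * cmath.exp(-1j * omega * t)],
            [g * cmath.exp(1j * omega * t), 0j]]


def step_propagator(t_mid, dt, g, omega):
    """exp(-i H1_I(t_mid) dt); exact for this model because (H1_I / g)^2 = 1."""
    c, s = math.cos(g * dt), math.sin(g * dt)
    n = h1_interaction(t_mid, 1.0, omega)  # H1_I / g
    return [[c - 1j * s * n[0][0], -1j * s * n[0][1]],
            [-1j * s * n[1][0], c - 1j * s * n[1][1]]]


def time_ordered_exponential(t, g, omega, steps=4000):
    """Approximate T exp(-i int_0^t H1_I) as a product of short-time steps."""
    u = [[1 + 0j, 0j], [0j, 1 + 0j]]
    dt = t / steps
    for k in range(steps):
        # Each new (later-time) factor multiplies from the left: time ordering.
        u = matmul2(step_propagator((k + 0.5) * dt, dt, g, omega), u)
    return u


def first_order_b1(t, g, omega):
    """First-order term: -i int_0^t dt' g exp(i omega t') = -g (e^{i omega t} - 1)/omega."""
    return -g * (cmath.exp(1j * omega * t) - 1) / omega


if __name__ == "__main__":
    t, g, omega = 2.0, 0.01, 1.5  # weak coupling, illustrative values
    u = time_ordered_exponential(t, g, omega)
    print(u[1][0], first_order_b1(t, g, omega))
```

For weak coupling the matrix element $\langle 1|\hat U_I|0\rangle$ agrees with the first-order term up to corrections of third order in $g$, since the second-order term is diagonal in this model.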
38.3 Finding the Probability Coefficients $b_n(t)$

Consider the initial interaction-picture state (which equals the Schrödinger-picture state) to be an eigenket of $\hat H_0$ of index $i$, as before, so that
\[ |\psi_I(t)\rangle = \hat U_I(t,t_0)|\psi_I(t_0)\rangle = \hat U_I(t,t_0)|\psi_i\rangle. \tag{38.36} \]
Then the Schrödinger-picture state is
\[ |\psi_S(t)\rangle = \exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right)|\psi_I(t)\rangle = \exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right)\hat U_I(t,t_0)|\psi_i\rangle. \tag{38.37} \]
In order to calculate transition probabilities, we insert on the left the identity expanded in a complete set of the eigenkets of $\hat H_0$, $1 = \sum_n|\psi_n\rangle\langle\psi_n|$. We obtain
\[ \begin{aligned}
|\psi_S(t)\rangle &= \sum_n|\psi_n\rangle\langle\psi_n|\exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right)\hat U_I(t,t_0)|\psi_i\rangle \\
&= \sum_n\exp\left(-\frac{i}{\hbar}E_n(t-t_0)\right)|\psi_n\rangle\langle\psi_n|\hat U_I(t,t_0)|\psi_i\rangle,
\end{aligned} \tag{38.38} \]
where we have acted with $\exp\left(-\frac{i}{\hbar}\hat H_0(t-t_0)\right)$ on the left, on $\langle\psi_n|$. Identifying the result with the general expansion in terms of $b_n(t)$ coefficients, we find that
\[ \begin{aligned}
b_n(t) &= \langle\psi_n|\hat U_I(t,t_0)|\psi_i\rangle \\
&= \langle\psi_n|\sum_{s=0}^{\infty}\frac{(-i/\hbar)^s}{s!}\int_{t_0}^{t} dt_1\cdots\int_{t_0}^{t} dt_s\, T\left\{H_{1,I}(t_1)\cdots H_{1,I}(t_s)\right\}|\psi_i\rangle \\
&= \delta_{ni} + \sum_{s=1}^{\infty}\frac{(-i/\hbar)^s}{s!}\int_{t_0}^{t} dt_1\cdots\int_{t_0}^{t} dt_s\,\langle\psi_n|T\left\{H_{1,I}(t_1)\cdots H_{1,I}(t_s)\right\}|\psi_i\rangle.
\end{aligned} \tag{38.39} \]
Now insert the identity written as a complete set over $m_k$, $1 = \sum_{m_k}|\psi_{m_k}\rangle\langle\psi_{m_k}|$, after each Hamiltonian at $t_k$. Then, in
\[ H_{1,I}(t_k) = \exp\left(\frac{i}{\hbar}\hat H_0(t_k-t_0)\right)H_{1,S}(t_k)\exp\left(-\frac{i}{\hbar}\hat H_0(t_k-t_0)\right), \tag{38.40} \]
we can act with the operator on the left on the left-hand $\langle\psi_{m_{k-1}}|$ and with the operator on the right on the right-hand $|\psi_{m_k}\rangle$, obtaining
\[ \begin{aligned}
b_n(t) = \delta_{ni} + \sum_{s=1}^{\infty}\frac{(-i/\hbar)^s}{s!}\int_{t_0}^{t} dt_1\cdots\int_{t_0}^{t} dt_s\sum_{\{m_k\}} T\Big\{ &e^{\frac{i}{\hbar}E_n(t_1-t_0)}H_{1,S,nm_1}(t_1)\,e^{-\frac{i}{\hbar}E_{m_1}(t_1-t_0)} \\
\times\, &e^{+\frac{i}{\hbar}E_{m_1}(t_2-t_0)}H_{1,S,m_1m_2}(t_2)\,e^{-\frac{i}{\hbar}E_{m_2}(t_2-t_0)}\cdots \\
\times\, &e^{+\frac{i}{\hbar}E_{m_{s-1}}(t_s-t_0)}H_{1,S,m_{s-1}i}(t_s)\,e^{-\frac{i}{\hbar}E_i(t_s-t_0)}\Big\}.
\end{aligned} \tag{38.41} \]
This expansion can be identified with the expansion in $\lambda$ of the coefficients,
\[ b_n(t) = \sum_{s=0}^{\infty} b_n^{(s)}(t), \tag{38.42} \]
with $b_n^{(0)}(t) = \delta_{ni}$. We have therefore re-obtained the zeroth-order, first-order, and second-order terms, and generalized to all orders, as promised.
Important Concepts to Remember

• In the second order of time-dependent perturbation theory, more precisely summing the zeroth-order, second-order, and higher-order contributions, we find, for the initial-state coefficient, $c_i(t) = b_i(t)\exp\left(-\frac{i}{\hbar}E_it\right) = e^{-\frac{\Gamma}{2}t}\exp\left(-\frac{i}{\hbar}(E_i+\Delta E_i)t\right)$, where $\Gamma$ is the decay width and $\Delta E_i$ is the energy shift.
• This leads to the decay of the original state, $P_i(t)\simeq e^{-\Gamma t}$, and to the Breit–Wigner distribution for the other states, $|c_m^{(1)+(3)}|^2 = |b_m^{(1)+(3)}|^2 = |H_{1,mi}|^2/[(E_m-E_i-\Delta E_i)^2 + (\hbar\Gamma/2)^2]$.
• For a general perturbation, we can use the formalism of the evolution operator in the interaction picture, which is $\hat U_I(t,t_0) = T\exp\left(-\frac{i}{\hbar}\int_{t_0}^{t} dt'\,H_{1,I}(t')\right)$, and note that $b_n(t) = \langle\psi_n|\hat U_I(t,t_0)|\psi_i\rangle$.

Further Reading

See [1] and [2].

Exercises

(1) In general, can we have a zero energy shift if we have a nonzero decay width (finite lifetime)? How about a zero decay width (infinite lifetime) for a nonzero energy shift?
(2) In a transition, if the density of final states $\rho(E_n)$ is an increasing exponential, $\rho(E_n) = Ae^{\alpha E_n}$, $\alpha > 0$, what does the (physical) condition of a finite energy shift imply for the transition Hamiltonian matrix $H_{1,in}$, if $E_i > 0$?
(3) If $H_{ni}$ is independent of $E_n = E_i$, what physical condition can we impose on $\rho(E_n)$?
(4) Argue for the energy–time uncertainty relation, $\Delta E\,\Delta t\sim\hbar$, on the basis of the Breit–Wigner distribution.
(5) Calculate the interaction picture operator $\hat U_I(t,t_0)$ and $b_n(t)$ for the interaction in the sudden approximation.
(6) Show that the first-order formula for time-dependent perturbation theory can be recovered from (38.41).
(7) Show that the second-order formula for time-dependent perturbation theory can be recovered from (38.41).
39 Application: Interaction with (Classical) Electromagnetic Field, Absorption, Photoelectric and Zeeman Effects

In this chapter we will apply the formalisms of time-independent and time-dependent perturbation theory to the case of the interaction of electrons and atoms with an electromagnetic field. In particular, we will consider the Zeeman effect for bound electrons, and the absorption and photoelectric effects for atoms.

39.1 Particles and Atoms in an Electromagnetic Field

In Chapter 22, we described the interaction of a quantum mechanical particle with a classical electromagnetic field. We found that the Hamiltonian is
\[ \hat H = \frac{1}{2m}\left(\vec p - q\vec A\right)^2 + V(\vec r) - qA_0 = \frac{\vec p^{\,2}}{2m} - \frac{q}{m}\vec A\cdot\vec p - qA_0(\vec r) + V(\vec r) + \frac{q^2\vec A^2}{2m}. \tag{39.1} \]
Here $V(\vec r)$ is the non-electromagnetic part of the potential (more precisely, the part not due to the external electromagnetic field), so the total potential energy is
\[ -qA_0(\vec r) + V(\vec r), \tag{39.2} \]
as usual. In the gauge $\vec\nabla\cdot\vec A = 0$, and with $\vec p = \frac{\hbar}{i}\vec\nabla$, we have
\[ \hat H = -\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r) + \frac{i\hbar q}{m}\vec A\cdot\vec\nabla - qA_0 + \frac{q^2}{2m}\vec A^2. \tag{39.3} \]
Otherwise, there would be a $+\frac{i\hbar q}{2m}(\vec\nabla\cdot\vec A)$ term. Moreover, we can ignore the last term in $\hat H$. In a hydrogenoid atom, where $V(\vec r)$ is the interaction of the electron with the nucleus, we have
\[ \hat H_0 = -\frac{\hbar^2}{2m}\vec\nabla^2 + V(\vec r), \tag{39.4} \]
and the other terms make up $\hat H_1$.

If $\vec A, A_0$ constitute a radiation field, we can also put $A_0 = 0$ as a gauge choice, obtaining the radiation gauge. Neglecting the $\vec A^2$ term (as being small, of second order), we obtain
\[ \hat H_1 = \frac{i\hbar q}{m}\vec A\cdot\vec\nabla. \tag{39.5} \]
39.2 Zeeman and Paschen–Back Effects

Consider an atom with strong spin–orbit coupling in a magnetic field. In Chapter 24, we found that in the presence of a magnetic field in the $z$ direction, $\vec B = B\hat z$, the interaction Hamiltonian between the electrons in the atom and the external magnetic field is
\[ \hat H_{B\text{-int}} = -\vec\mu_{\rm tot}\cdot\vec B = -\mu_B(\vec L + 2\vec S)\cdot\vec B = -\mu_BB(L_z + 2S_z) = -\mu_BB(J_z + S_z). \tag{39.6} \]
When continuing the analysis, the implicit assumption (not stated there) was that this $\hat H_{B\text{-int}}$ is a perturbation. Then, we can use time-independent perturbation theory in first order, which does not change the states, so that
\[ |n^{(0)}\rangle = |E_nJM\rangle. \tag{39.7} \]
Therefore, the energy shift in first-order perturbation theory becomes
\[ \Delta E_n^{(1)} = H_{1,nn} = \langle n^{(0)}|\hat H_1|n^{(0)}\rangle = -\mu_BB\,\langle E_nJM|J_z + S_z|E_nJM\rangle. \tag{39.8} \]
But, by the Wigner–Eckart theorem (from Chapter 16), $\vec S$ is a vector operator like $\vec J$, so within a multiplet of fixed $J$ their matrix elements are proportional, and so we have
\[ S_z = \frac{\langle\vec J\cdot\vec S\rangle}{\langle\vec J^{\,2}\rangle}\,J_z. \tag{39.9} \]
Using this relation, we found that
\[ \langle E_nJM'|J_z + S_z|E_nJM\rangle = gM\,\delta_{MM'}, \tag{39.10} \]
where $g$ is the Landé $g$ factor, calculated from the above relation. Then, finally, we found
\[ \Delta E_n^{(1)} = -gM\mu_BB. \tag{39.11} \]
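As an illustration of how $g$ follows from (39.9), one can use $\vec J\cdot\vec S = (\vec J^{\,2} + \vec S^{\,2} - \vec L^{\,2})/2$, which gives the standard closed form $g = 1 + [j(j+1) + s(s+1) - l(l+1)]/[2j(j+1)]$. The sketch below evaluates it with exact fractions; the function names and the choice of expressing the shifts in units of $\mu_BB$ are assumptions made for this example, and the sign follows the convention of (39.11).

```python
from fractions import Fraction


def lande_g(j, l, s):
    """Lande g factor g = 1 + <J.S>/<J^2>, with J.S = (J^2 + S^2 - L^2)/2."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("g is undefined for j = 0 (no linear Zeeman effect)")
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


def zeeman_shifts(j, l, s):
    """Linear Zeeman shifts -g*M in units of mu_B*B, for M = -j, ..., +j."""
    j = Fraction(j)
    g = lande_g(j, l, s)
    m_values = [-j + k for k in range(int(2 * j) + 1)]
    return [(m, -g * m) for m in m_values]


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(lande_g(half, 0, half))             # s-state, j = 1/2
    print(lande_g(Fraction(3, 2), 1, half))   # p-state, j = 3/2
    print(zeeman_shifts(Fraction(3, 2), 1, half))
```

Using `Fraction` keeps half-integer quantum numbers exact, so the familiar values $g = 2$, $4/3$ and $2/3$ come out without rounding.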
We must make one observation, however. This linear Zeeman effect in a hydrogenoid atom is valid only in the case where $j\neq 0$, so either $l\neq 0$ or $s\neq 0$. But, if $l = s = 0$ (so $j = 0$), then we need to consider the second-order term in $\hat H$,
\[ \hat H_2 = \frac{e^2\vec A^2}{2m} = \frac{m\omega_L^2}{2}\,r^2\sin^2\theta, \tag{39.12} \]
where
\[ \omega_L = \frac{eB}{2m} \tag{39.13} \]
is the Larmor frequency. Then the energy shift in first-order perturbation theory, using this $\hat H_2$, is
\[ \Delta E_n^{(1)} = \frac{m}{2}\left(\frac{eB}{2m}\right)^2\left(\frac{a_0}{Z}\right)^2\frac{1}{3}\,n^2(5n^2+1), \tag{39.14} \]
where we have used the fact that, for a hydrogenoid atom,
\[ \langle r^2\sin^2\theta\rangle_n = \left(\frac{a_0}{Z}\right)^2\frac{1}{3}\,n^2(5n^2+1). \tag{39.15} \]
Paschen–Back Effect

This effect occurs in the opposite limit to the (linear) Zeeman effect, in which the LS (spin–orbit) coupling (described in Chapter 22) is $\ll\mu_BB$. This means that the spin–orbit interaction can be treated perturbatively, and does not couple $\vec L$ with $\vec S$ into $\vec J$ in this limit. Then the correct quantum states to consider are $|E_nLSM_LM_S\rangle$, instead of $|E_nLSJM_J\rangle$. Therefore the energy shift in first-order time-independent perturbation theory is
\[ \Delta E_n^{(1)} = \langle E_nLSM_LM_S|\hat H_1|E_nLSM_LM_S\rangle = -\mu_BB(M_L + 2M_S). \tag{39.16} \]
However, we note that this $\hat H_1$ is diagonal in this basis, so it is actually a part of $\hat H_0$. Thus here the perturbation $\hat H_1$ is the spin–orbit coupling, which is a small effect.
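The level pattern implied by (39.16) can be enumerated directly. The following sketch lists the shifts $-(M_L + 2M_S)$ in units of $\mu_BB$ for given $l$ and $s$; the function name and the choice of units are assumptions made for the illustration.

```python
from fractions import Fraction


def paschen_back_shifts(l, s):
    """Map (M_L, M_S) -> energy shift -(M_L + 2 M_S), in units of mu_B * B."""
    l, s = Fraction(l), Fraction(s)
    m_l_values = [-l + k for k in range(int(2 * l) + 1)]
    m_s_values = [-s + k for k in range(int(2 * s) + 1)]
    return {(m_l, m_s): -(m_l + 2 * m_s)
            for m_l in m_l_values for m_s in m_s_values}


if __name__ == "__main__":
    shifts = paschen_back_shifts(1, Fraction(1, 2))
    for state, shift in sorted(shifts.items()):
        print(state, shift)
    # Distinct shifts: several (M_L, M_S) pairs can share the same energy
    print(sorted(set(shifts.values())))
```

For $l = 1$, $s = 1/2$ the six states give five distinct shifts, because $(M_L, M_S) = (1, -1/2)$ and $(-1, +1/2)$ are degenerate at this order.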
39.3 Electromagnetic Radiation and Selection Rules

Before considering the absorption and emission of photons, we will consider the emission of classical electromagnetic radiation from an electric current, calculate the interaction Hamiltonian, and find the resulting selection rules for the transition between initial and final states interacting with the electromagnetic radiation.

We consider the same radiation gauge, $A_0 = 0$ and $\vec\nabla\cdot\vec A = 0$. Then the interaction Hamiltonian of the classical electromagnetic radiation with an electric current $\vec j$ is
\[ H_i = \int d^3r\,\vec j\cdot\vec A. \tag{39.17} \]
In the radiation gauge, the equation of motion is the wave equation $\Box\vec A = 0$, with traveling wave solution (a retarded wave, with delayed time)
\[ \vec A = \vec A\left(t - \frac{\vec r\cdot\vec n}{c}\right), \tag{39.18} \]
where $\vec n$ is the direction of propagation,
\[ \vec n = \frac{\vec k}{k} = \frac{c\vec k}{\omega_k}. \tag{39.19} \]
The Fourier-transformed wave in the delayed time is $\vec A(\omega)$, and satisfies the condition
\[ \vec n\cdot\vec A(\omega) = 0, \tag{39.20} \]
which comes from the gauge condition $\vec\nabla\cdot\vec A = 0$. The Fourier-transformed vector potential is written in terms of a polarization vector $\vec\epsilon(\vec k)$, as
\[ \vec A(\omega) = \vec\epsilon(\vec k)A(\omega) = A_1(\omega)\vec\epsilon_1 + A_2(\omega)\vec\epsilon_2, \tag{39.21} \]
where $\vec\epsilon_1$ and $\vec\epsilon_2$ are unit vectors perpendicular to $\vec n$ and to each other, and
\[ A(\omega) = \sqrt{1+a^2}\,A_1(\omega), \qquad \vec\epsilon(\omega) = \frac{\vec\epsilon_1 + ae^{i\theta}\vec\epsilon_2}{\sqrt{1+a^2}}, \qquad \frac{A_2(\omega)}{A_1(\omega)} = ae^{i\theta}. \tag{39.22} \]
However, usually we consider a monochromatic wave, i.e., a wave with a single $\omega$, so that instead of a Fourier-transformed wave, we have simply
\[ \vec A(\vec r,t) = A_0\,\vec\epsilon(\vec k)\,2\cos\left(\frac{\omega}{c}\vec n\cdot\vec x - \omega t\right) = A_0\,\vec\epsilon(\vec k)\left[e^{i(\vec k\cdot\vec x-\omega t)} + e^{-i(\vec k\cdot\vec x-\omega t)}\right], \tag{39.23} \]
where $\vec k = \omega\vec n/c$. Then, the interaction Hamiltonian between the radiation and the electric current presented by an atom, valid for both emission and absorption of electromagnetic radiation, is
\[ H_i = A_0\int d^3r\,\vec\epsilon(\vec k)\cdot\vec j\left[e^{i(\vec k\cdot\vec x-\omega t)} + e^{-i(\vec k\cdot\vec x-\omega t)}\right]. \tag{39.24} \]
We note that the current conservation law is
\[ \frac{\partial\rho}{\partial t} + \vec\nabla\cdot\vec j = 0. \tag{39.25} \]
Next, we can proceed either classically or quantum mechanically, and obtain basically the same result.

Classically, we consider that the electric current generating or absorbing the radiation oscillates in synchronization with it, meaning the charge density $\rho = \rho(\omega)$ oscillates with the same monochromatic frequency $\omega$ as the wave, so that
\[ \frac{\partial\rho}{\partial t} = i\omega\rho. \tag{39.26} \]
Quantum mechanically, the charge density is an operator, $\rho = \hat\rho$, and its oscillation in synchronization with the radiation induces a transition between atomic states, $|\psi_n\rangle\to|\psi_m\rangle$. Then, using the Heisenberg equation of motion for the operator $\hat\rho$, we obtain
\[ \left\langle\psi_m\left|\frac{\partial\hat\rho}{\partial t}\right|\psi_n\right\rangle = \frac{i}{\hbar}\langle\psi_m|[\hat H_{\rm atom},\hat\rho]|\psi_n\rangle = \frac{i}{\hbar}(E_m-E_n)\langle\psi_m|\hat\rho|\psi_n\rangle = i\omega_{mn}\langle\psi_m|\hat\rho|\psi_n\rangle. \tag{39.27} \]
This is essentially the same result as in the classical case, except that it is a matrix element between two states, and the frequency $\omega$ is generically quantized, $\omega = \omega_{mn}$. Thus, either way we obtain
\[ \rho\propto e^{i\omega t} \;\Rightarrow\; i\omega\rho + \vec\nabla\cdot\vec j = 0, \tag{39.28} \]
meaning we can replace $\vec\nabla\cdot\vec j$ with $-i\omega\rho$. But we can also use the relation
\[ \begin{aligned}
\vec\nabla\cdot\left[(\vec\epsilon\cdot\vec r)\,\vec j\right] &= \partial_i[\epsilon_jx_jj_i] = \delta_{ij}\epsilon_jj_i + \epsilon_jx_j\partial_ij_i \\
&= \vec\epsilon\cdot\vec j + (\vec\epsilon\cdot\vec r)\,\vec\nabla\cdot\vec j = \vec\epsilon\cdot\vec j - (\vec\epsilon\cdot\vec r)\,i\omega\rho.
\end{aligned} \tag{39.29} \]
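Integrating over space, the left-hand side gives a boundary term, which is assumed to vanish, so we obtain
\[ \int d^3r\,\vec\epsilon\cdot\vec j = i\omega\int d^3r\,\rho\,(\vec\epsilon\cdot\vec r). \tag{39.30} \]
We can now consider the term with $e^{i\vec k\cdot\vec x}$ in $\hat H_i$, namely the first term, and expand it as follows:
\[ e^{i\vec k\cdot\vec x}\simeq 1 + i\vec k\cdot\vec x + \cdots \tag{39.31} \]
Considering only the zeroth-order term, i.e., 1, we have the interaction Hamiltonian at leading order,
\[ H_i^{(1)} = A_0\int d^3r\,\vec\epsilon(\vec k)\cdot\vec j = i\omega A_0\,\vec\epsilon(\vec k)\cdot\int d^3r\,\rho\,\vec r, \tag{39.32} \]
where we have used the relation deduced above. But the integral
\[ \vec P\equiv\int d^3r\,\rho\,\vec r \tag{39.33} \]
is the electric dipole moment, so
\[ H_i^{(1)} = i\omega A_0\,\vec\epsilon(\vec k)\cdot\vec P \tag{39.34} \]
gives rise to electric dipole radiation.

Next, consider the first-order term in $e^{i\vec k\cdot\vec x}$, resulting in the next-to-leading order (NLO) Hamiltonian,
\[ H_i^{(2)} = A_0\int d^3r\,(\vec\epsilon(\vec k)\cdot\vec j)(\vec k\cdot\vec r). \tag{39.35} \]
There is now a similar relation to (39.29) that we can use to rewrite the NLO interaction Hamiltonian,
\[ \begin{aligned}
\vec\nabla\cdot\left[(\vec\epsilon\cdot\vec r)(\vec k\cdot\vec r)\,\vec j\right] &= \partial_i[\epsilon_jx_jk_kx_kj_i] \\
&= \delta_{ij}\epsilon_jk_kx_kj_i + \epsilon_jx_j\delta_{ik}k_kj_i + \epsilon_jx_jk_kx_k\partial_ij_i \\
&= (\vec k\cdot\vec r)(\vec\epsilon\cdot\vec j) + (\vec\epsilon\cdot\vec r)(\vec k\cdot\vec j) + (\vec\epsilon\cdot\vec r)(\vec k\cdot\vec r)\,\vec\nabla\cdot\vec j \\
&= (\vec k\cdot\vec r)(\vec\epsilon\cdot\vec j) + (\vec\epsilon\cdot\vec r)(\vec k\cdot\vec j) - i\omega\rho\,(\vec\epsilon\cdot\vec r)(\vec k\cdot\vec r),
\end{aligned} \tag{39.36} \]
where in the last line we used current conservation. Then, integrating over $\vec r$ as before, the left-hand side gives a boundary term, assumed to vanish, so that
\[ \int d^3r\left[(\vec k\cdot\vec r)(\vec\epsilon\cdot\vec j) + (\vec\epsilon\cdot\vec r)(\vec k\cdot\vec j) - i\omega\rho\,(\vec\epsilon\cdot\vec r)(\vec k\cdot\vec r)\right] = 0. \tag{39.37} \]
But we still need to rewrite the middle term in (39.36). For that, we need to use yet another relation (with $\varepsilon_{ijk}$ the Levi-Civita symbol and $\epsilon_i$ the components of the polarization vector),
\[ \begin{aligned}
(\vec\epsilon\times\vec k)\cdot(\vec r\times\vec j) &= \varepsilon_{ijk}\epsilon_ik_j\,\varepsilon_{lmk}x_lj_m = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\,\epsilon_ik_jx_lj_m \\
&= (\vec\epsilon\cdot\vec r)(\vec k\cdot\vec j) - (\vec\epsilon\cdot\vec j)(\vec k\cdot\vec r).
\end{aligned} \tag{39.38} \]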
39 Applications
This allows us to rewrite the middle term resulting in a relation between integral quantities iω (r × j) d 3 r (k · r )( · j) = d 3 r ρ( · r )(k · r ) − ( × k) d 3r , (39.39) 2 2 so that the interaction Hamiltonian is
Hi(2) = A0
iω . i k j Qi j − ( × k) · μ 2
Here we have defined the electric quadrupole moment Qi j ≡ d 3 r ρx i x j ,
(39.40)
(39.41)
and the magnetic dipole moment ≡ μ
d 3r
(r × j) . 2
(39.42)
In general, at lth subleading order, we have an electric 2l+1 th-polar perturbation and a magnetic 2 th-polar perturbation. These are all tensors of SO(3) SU (2), which means that they are quantities with a given angular momentum. The electric dipole Pi is a vector, the magnetic dipole μi is also a vector (both with j = 1), while the electric quadrupole Qi j is a two-index symmetric tensor (with j = 2). In general, for higher-order terms, we also obtain tensors, Tq(k) . In quantum theory, these will be quantum operators, the same as those that were analyzed for the Wigner–Eckhart theorem in Chapter 16. The theorem states that l
α j m|Tˆq(k) |α, jm = j, k; m, q| j, k; j , m
α , j T (k) α, j , 2j + 1
(39.43)
and the matrix elements are nonzero only for q = m − m | j − j | ≤ k ≤ j + j .
(39.44)
These are then selection rules for the matrix element to be nonzero, resulting in a nonzero transition probability via first-order perturbation theory. In the case of the leading, electric dipole, approximation, we have a vector operator $P_i$, with $k = 1$, so $q = m' - m = 0,\pm 1$, but $q = 0$ corresponds to a longitudinal mode (which is unphysical, in the radiation region). Otherwise, we have
\[ m' - m = \pm 1, \qquad |j-j'|\leq 1\leq j+j', \tag{39.45} \]
which means that
\[ j - j'\leq 1\Rightarrow j'\geq j-1, \qquad j' - j\leq 1\Rightarrow j'\leq j+1, \tag{39.46} \]
finally resulting in
\[ j' = j-1,\ j,\ j+1, \quad\text{and}\quad m' = m\pm 1. \tag{39.47} \]
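The electric dipole selection rules derived above, namely the triangle condition $|j-j'|\leq 1\leq j+j'$ together with $m' = m\pm 1$, can be written as a small predicate. The function name and argument order below are assumptions made for this sketch, and the $q = 0$ case is excluded, following the text's treatment of the longitudinal mode.

```python
from fractions import Fraction


def dipole_allowed(j_i, m_i, j_f, m_f):
    """Electric dipole selection rules of the text:
    |j_i - j_f| <= 1 <= j_i + j_f and m_f - m_i = +1 or -1."""
    j_i, m_i, j_f, m_f = (Fraction(x) for x in (j_i, m_i, j_f, m_f))
    # The magnetic quantum numbers must be valid for the given j values
    if abs(m_i) > j_i or abs(m_f) > j_f:
        return False
    triangle = abs(j_i - j_f) <= 1 <= j_i + j_f
    return triangle and abs(m_f - m_i) == 1


if __name__ == "__main__":
    print(dipole_allowed(0, 0, 1, 1))   # j: 0 -> 1, m: 0 -> 1
    print(dipole_allowed(0, 0, 0, 0))   # j = 0 -> j' = 0 violates 1 <= j + j'
    print(dipole_allowed(1, 0, 3, 1))   # |j - j'| = 2 is not allowed
```

Note that the condition $1\leq j+j'$ automatically forbids $j = 0\to j' = 0$ transitions, which is why no separate check is needed for that case.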
39.4 Absorption of Photon Energy by Atom

The direct application of the formalism is to the absorption of electromagnetic energy. The first way to derive the transition amplitude is to use the interaction in the particle picture,
\[ \hat H_1 = -\frac{q}{m}\vec A\cdot\vec p = -\frac{q}{m}A_0\,\vec\epsilon(\vec k)\,e^{i\vec k\cdot\vec x}\cdot\vec p, \tag{39.48} \]
where the momentum operator is understood as $\frac{\hbar}{i}\vec\nabla$ acting on a wave function. In the electric dipole approximation, the exponential is replaced by 1, so
\[ \hat H_1 = -\frac{q}{m}A_0\,\vec\epsilon(\vec k)\cdot\vec p. \tag{39.49} \]
The matrix element between an initial and a final state is
\[ H_{1,fi} = \langle\psi_f|\hat H_1|\psi_i\rangle = -\frac{q}{m}A_0\,\vec\epsilon(\vec k)\cdot\langle\psi_f|\vec p\,|\psi_i\rangle = -\frac{q}{m}A_0\,\vec\epsilon(\vec k)\cdot\int d^3r\,\psi_f^*(\vec r)\,\frac{\hbar}{i}\vec\nabla\psi_i(\vec r). \tag{39.50} \]
But, since for atoms we have $\hat H_0 = \vec p^{\,2}/2m + V(\vec r)$,
\[ \vec p = \frac{m}{i\hbar}[\vec r,\hat H_0]. \tag{39.51} \]
Then we can replace $\vec p$ in the matrix element, and find
\[ \begin{aligned}
H_{1,fi} &= -\frac{q}{m}A_0\,\vec\epsilon(\vec k)\cdot\langle\psi_f|\vec p\,|\psi_i\rangle = -\frac{q}{i\hbar}A_0\,\vec\epsilon(\vec k)\cdot\langle\psi_f|[\vec r,\hat H_0]|\psi_i\rangle \\
&= iqA_0\,\frac{E_i-E_f}{\hbar}\,\langle\psi_f|\vec\epsilon(\vec k)\cdot\vec r\,|\psi_i\rangle = iqA_0\,\omega_{if}\int d^3r\,\psi_f^*(\vec r)\,\vec\epsilon(\vec k)\cdot\vec r\,\psi_i(\vec r).
\end{aligned} \tag{39.52} \]
But, since $\rho$ is the electric charge density, it equals $q$ times the probability density $\psi^*\psi$, which means that the matrix element of $\rho$, $\rho_{fi}$, is given by
\[ \rho_{fi}(\vec r) = q\,\psi_f^*(\vec r)\psi_i(\vec r), \tag{39.53} \]
so that the interaction Hamiltonian can be rewritten as
\[ H_{1,fi} = iA_0\,\omega_{if}\,\vec\epsilon\cdot\int d^3r\,\rho_{fi}\,\vec r, \tag{39.54} \]
which is the same formula as that obtained from (39.32) (which is true in the electric dipole approximation), taken between the initial and final states.

We must also give a better definition of the electric dipole approximation for hydrogenoid atoms. In this case, the transition frequency $\omega_{if}$ is of the order of the ground state energy (corresponding to the initial state being the ground state), so
\[ \hbar\omega_{1f}\sim E_1\sim\frac{Ze_0^2}{r_1}\sim\frac{Ze_0^2}{a_0/Z}, \tag{39.55} \]
where the ground state radius is $r_1\sim a_0/Z$. Moreover, $e_0^2/(\hbar c)\sim 1/137$, so the wavelength $\lambda$ associated with $\omega$ is given by
\[ \frac{\lambda}{2\pi} = \frac{c}{\omega}\sim\frac{\hbar c\,a_0}{Z^2e_0^2}\sim\frac{137\,a_0/Z}{Z}. \tag{39.56} \]
But we require that the wavelength is much greater than the size of the atom, $\lambda\gg 2\pi r_1$, in order to use the dipole approximation, so
\[ \frac{r_1}{\lambda/(2\pi)}\sim\frac{a_0/Z}{\lambda/(2\pi)}\sim\frac{Z}{137}\ll 1, \tag{39.57} \]
which is valid only if
\[ Z\ll 137, \tag{39.58} \]
i.e., for light hydrogenoid atoms.

Next, we use Fermi's golden rule for absorption, when the initial state has the energy $E_i$ for the atom and a photon of energy $\hbar\omega$ is to be absorbed by the atom, so the delta function for energy is $\delta(E_f - E_i - \hbar\omega)$, giving the probability density (in time)
\[ \frac{dP}{dt} = \frac{2\pi}{\hbar}|H_{1,fi}|^2\int df_2\cdots\sum_{f_r}\rho(E_f,f_2,\ldots,f_r)\,dE_f\,\delta(E_f-E_i-\hbar\omega). \tag{39.59} \]
But we can use (39.50) to replace the matrix element in the above, obtaining
\[ \frac{dP}{dt} = \frac{2\pi}{\hbar}\frac{q^2}{m^2}|A_0|^2\,|\langle\psi_f|\vec\epsilon(\vec k)\cdot\vec p\,|\psi_i\rangle|^2\int df_2\cdots\sum_{f_r}\rho(E_f,f_2,\ldots,f_r)\,dE_f\,\delta(E_f-E_i-\hbar\omega). \tag{39.60} \]
The absorption cross section is defined as the ratio of absorbed energy per unit time divided by the incident flux (the incident energy per unit time and unit transverse area),
\[ \sigma_{\rm abs} = \frac{dE/dt\,({\rm abs})}{dE_{\rm inc}/dt\,dS} = \frac{\hbar\omega\,dP/dt\,({\rm abs})}{F}, \tag{39.61} \]
where the absorbed energy per unit time equals the photon energy $\hbar\omega$ times the probability per unit time, and the incident flux $F$ is written in terms of the incident energy density $\mathcal E$ as $F = \mathcal E c$, so
\[ F = \frac{dE_{\rm inc}}{dt\,dS} = \frac{c}{2}\left(\epsilon_0\vec E^2 + \frac{\vec B^2}{\mu_0}\right), \tag{39.62} \]
and the electric and magnetic fields are written in terms of the traveling wave potential $\vec A$ as
\[ \begin{aligned}
\vec B &= \vec\nabla\times\vec A(\vec r,t) = i\vec k\times\vec\epsilon(\vec k)\,A_0\,e^{i(\vec k\cdot\vec x-\omega t)}, \\
\vec E &= -\partial_t\vec A(\vec r,t) = i\omega\,\vec\epsilon(\vec k)\,A_0\,e^{i(\vec k\cdot\vec x-\omega t)},
\end{aligned} \tag{39.63} \]
so
\[ F = \frac{ck^2|A_0|^2}{2} = \frac{\omega^2|A_0|^2}{2c}. \tag{39.64} \]
Thus the absorption cross section is
\[ \sigma_{\rm abs} = \frac{4\pi\alpha}{m^2\omega_{fi}}\,|\langle\psi_f|\vec\epsilon(\vec k)\cdot\vec p\,|\psi_i\rangle|^2\int df_2\cdots\sum_{f_r}\rho(E_f,f_2,\ldots,f_r)\,dE_f\,\delta(E_f-E_i-\hbar\omega), \tag{39.65} \]
where we have used $\alpha = q^2/(\hbar c)$. But we can also relate the matrix elements of $\vec p$ to those of $\vec r$, $\langle\psi_f|\vec p\,|\psi_i\rangle = im\omega_{fi}\langle\psi_f|\vec r\,|\psi_i\rangle$, as in (39.52), obtaining
\[ \sigma_{\rm abs} = 4\pi\alpha\,\omega_{fi}\,|\langle\psi_f|\vec\epsilon(\vec k)\cdot\vec r\,|\psi_i\rangle|^2\int df_2\cdots\sum_{f_r}\rho(E_f,f_2,\ldots,f_r)\,dE_f\,\delta(E_f-E_i-\hbar\omega). \tag{39.66} \]
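As a rough numerical illustration of the validity condition for the electric dipole approximation used in this section, the sketch below reproduces the order-of-magnitude estimate $r_1/(\lambda/2\pi)\sim Z/137$. The function names are assumptions, the numerical values of $\alpha$ and $a_0$ are the usual approximate ones, and all order-one factors are dropped, as in the text's estimate.

```python
ALPHA = 1.0 / 137.036          # fine structure constant (approximate value)
BOHR_RADIUS_M = 5.29177e-11    # Bohr radius a_0 in meters (approximate value)


def ground_state_radius(z):
    """r_1 ~ a_0 / Z for a hydrogenoid atom of nuclear charge Z."""
    return BOHR_RADIUS_M / z


def reduced_wavelength(z):
    """lambda / (2 pi) = c / omega, with hbar*omega ~ Z^2 e_0^2 / a_0,
    i.e. lambda / (2 pi) ~ a_0 / (Z^2 alpha)."""
    return BOHR_RADIUS_M / (z**2 * ALPHA)


def dipole_expansion_parameter(z):
    """Ratio r_1 / (lambda / 2 pi) ~ Z * alpha; it must be << 1."""
    return ground_state_radius(z) / reduced_wavelength(z)


if __name__ == "__main__":
    for z in (1, 10, 80):
        print(z, dipole_expansion_parameter(z))
```

For hydrogen the parameter is below one percent, while for heavy hydrogenoid ions it approaches order one, which is the content of the condition $Z\ll 137$.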
39.5 Photoelectric Effect

The photoelectric effect corresponds to an ionized final state of the atom, so the process is the absorption of a photon by the atom and the emission of a free electron, in a direction $\vec n$, with momentum $\vec p$. Then the final state is $|\psi_f\rangle = |\vec k_f\rangle$, a free electron with energy
\[ E = \frac{\hbar^2\vec k_f^2}{2m_e}. \tag{39.67} \]
Thus the matrix element is
\[ \langle\vec k_f|\vec\epsilon\cdot\vec p\,e^{i\vec k\cdot\vec x}|\psi_i\rangle = \vec\epsilon\cdot\int d^3r\,\psi_{\vec k_f}^*(\vec r)\,e^{i\vec k\cdot\vec r}\,\frac{\hbar}{i}\vec\nabla\psi_i(\vec r), \tag{39.68} \]
where $\psi_i(\vec r)$ corresponds to an electron inside the hydrogenoid atom, whereas $\psi_{\vec k_f}(\vec r)$ is a free wave function, normalized over energy. In the electric dipole approximation we replace $e^{i\vec k\cdot\vec x}$ with 1, so the matrix element reduces to
\[ \langle\psi_f|\vec\epsilon\cdot\vec p\,|\psi_i\rangle = im\omega_{fi}\,\langle\psi_f|\vec\epsilon\cdot\vec r\,|\psi_i\rangle. \tag{39.69} \]
As in Chapter 37, where we analyzed the scattering wave collision, we have $f_1 = E$, $f_2 = \vec n$, so $df_2 = d\Omega$. The differential cross section is then
\[ \frac{d\sigma}{d\Omega} = \frac{\hbar\omega\,dP/dt\,({\rm abs})}{F}\int dE_f\,\rho(E_f)\,\delta(E_f-E_i-\hbar\omega), \tag{39.70} \]
but if the state $\psi_f$ is normalized in energy, as in Chapter 37, then $\rho(E_f) = 1$. Therefore we finally obtain
\[ \frac{d\sigma}{d\Omega} = \frac{4\pi\alpha\,\omega_{fi}}{(m\omega_{fi})^2}\,|\langle\psi_f|\vec\epsilon\cdot\vec p\,|\psi_i\rangle|^2. \tag{39.71} \]
In this formula, we would need to calculate the matrix element for a given state $|\psi_i\rangle$ of the hydrogenoid atom. Therefore, since the differential cross section depends on $|\psi_i\rangle$, we will not compute it further here.
Important Concepts to Remember

• The linear Zeeman effect corresponds to the linear part of the interaction Hamiltonian for interaction with electromagnetism, $\hat H_1 = \frac{i\hbar q}{m}\vec A\cdot\vec\nabla$, plus the interaction with the spin, leading to $\hat H_1 = -\mu_B(\vec J + \vec S)\cdot\vec B$, and energy shift $\Delta E = -gM\mu_BB$. For $l = s = 0$, we must consider the quadratic part of $\hat H_1$ instead.
• The Paschen–Back effect corresponds to the opposite limit, in which the LS coupling is $\ll\mu_BB$, so the correct quantum states are $|E_nLSM_LM_S\rangle$, and $\Delta E_n = -\mu_BB(M_L + 2M_S)$.
• For electromagnetic radiation, $\vec A = \vec A(t - \vec n\cdot\vec r/c)$, or in particular $\vec A = A_0\,\vec\epsilon(\vec k)[e^{i(\vec k\cdot\vec x-\omega t)} + e^{-i(\vec k\cdot\vec x-\omega t)}]$, in which case the charge density $\rho$ also oscillates with $\omega$, and $i\omega\rho + \vec\nabla\cdot\vec j = 0$.
• When expanding $e^{i\vec k\cdot\vec x}$ into $1 + i\vec k\cdot\vec x + \cdots$, the first-order interaction Hamiltonian depends on the electric dipole moment $\vec P$ as $H_i^{(1)} = i\omega A_0\,\vec\epsilon(\vec k)\cdot\vec P$, and the second-order interaction Hamiltonian depends on the electric quadrupole moment $Q_{ij}$ and the magnetic dipole moment $\vec\mu$ as $H_i^{(2)} = A_0[(i\omega/2)\epsilon_ik_jQ_{ij} - \varepsilon_{ijk}\epsilon_ik_j\mu_k]$.
• The Wigner–Eckart theorem leads to selection rules for the matrix elements for electromagnetic transition probabilities. In particular, the leading electric dipole approximation leads to $m' = m\pm 1$ and $j' = j-1, j, j+1$.
• For absorption of a photon by an atom, we obtain the matrix element $H_{1,fi} = iA_0\,\omega_{if}\,\vec\epsilon\cdot\int d^3r\,\rho_{fi}\,\vec r$, and we can use it in Fermi's golden rule. For a hydrogenoid atom, this electric dipole approximation is valid for $Z\ll 137 = 1/\alpha$.
• For the photoelectric effect, we would apply the formalism for absorption of a photon by an atom, with the final state an emitted free electron.

Further Reading

See [2], [1], [3], [4].

Exercises

(1) Consider the first relativistic correction to the energy of a particle, and introduce the coupling to the electromagnetic field. Write down the resulting extra terms that are linear and quadratic in $\vec A$.
(2) For the terms considered in exercise 1, find expressions for the first relativistic corrections to the linear and quadratic Zeeman effects (ignore any extra couplings to spin).
(3) Show that the splittings of energy levels according to the Zeeman or Paschen–Back effects lead to the same number of lines.
(4) Calculate the transition element $H_{1,fi}$ for the interaction of a hydrogenoid atom with a monochromatic electromagnetic wave, in the electric dipole approximation, from the $n = 1$, $l = 0$ state to the $n = 2$, $l = 1$ state.
(5) Consider a hydrogenoid atom in the state $n = 3$, $l = 2$, $m = 0$. What are the possible transitions induced by a monochromatic electromagnetic wave, to both first and second order?
(6) Calculate the absorption cross section for the transition in exercise 4.
(7) Calculate the differential cross section for the photoelectric effect for a monochromatic electromagnetic wave of energy of 14 eV incident on a hydrogen atom in the ground state.
40 WKB Methods and Extensions: State Transitions and Euclidean Path Integrals (Instantons)

In this chapter we pick up the previously described WKB (semiclassical) methods, and extend them for transitions between states in (time-dependent) perturbation theory, and we define "instanton" calculations in a Euclidean version of the path integral formalism.

40.1 Review of the WKB Method

The Schrödinger equation in one dimension, in the presence of a potential $V(x)$, with the wave function ansatz
\[ \psi(x,t) = \exp\left[\frac{i}{\hbar}\left(W(x) - Et\right)\right], \tag{40.1} \]
is
\[ \left(\frac{dW}{dx}\right)^2 - 2m(E - V(x)) + \frac{\hbar}{i}\frac{d^2W}{dx^2} = 0, \tag{40.2} \]
and has the quasiclassical solution
\[ \psi(x) = \frac{\rm const}{\sqrt{p(x)}}\exp\left[\pm\frac{i}{\hbar}\int_{x_0}^{x} dx'\,p(x') - \frac{i}{\hbar}Et\right] \tag{40.3} \]
for $E > V(x)$, where
\[ p(x) = \sqrt{2m(E - V(x))}, \tag{40.4} \]
and the solution
\[ \psi(x) = \frac{\rm const}{\sqrt{\chi(x)}}\exp\left[\pm\frac{1}{\hbar}\int_{x_0}^{x} dx'\,\chi(x') - \frac{i}{\hbar}Et\right] \tag{40.5} \]
for $E < V(x)$, where
\[ \chi(x) = \sqrt{2m(V(x) - E)}. \tag{40.6} \]
These solutions are valid only asymptotically, meaning not very close to the "turning points" $x_i$, where $E - V(x_i) = 0$. When going through these turning points, we match one type of solution to the other. If we have a decreasing potential $V(x)$, and we transition from left to right (from higher $V$ to lower $V$), we are translating
\[ \frac{1}{[V(x)-E]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x}^{x_1} dx'\sqrt{2m(V(x')-E)}\right] \tag{40.7} \]
into
\[ \frac{2}{[E-V(x)]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x_1}^{x} dx'\sqrt{2m(E-V(x'))} + \frac{\pi}{4}\right] = \frac{2}{[E-V(x)]^{1/4}}\cos\left[\frac{1}{\hbar}\int_{x_1}^{x} dx'\sqrt{2m(E-V(x'))} - \frac{\pi}{4}\right]. \tag{40.8} \]
On the other hand, for an increasing potential, and a transition from right to left (from higher $V$ to lower $V$), we need to translate
\[ \frac{1}{[V(x)-E]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{2m(V(x')-E)}\right] \tag{40.9} \]
into
\[ -\frac{2}{[E-V(x)]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{2m(E-V(x'))} - \frac{\pi}{4}\right] = \frac{2}{[E-V(x)]^{1/4}}\cos\left[\frac{1}{\hbar}\int_{x_2}^{x} dx'\sqrt{2m(E-V(x'))} + \frac{\pi}{4}\right]. \tag{40.10} \]
In the path integral formalism, the propagator
\[ U(t,t_0) = \int\mathcal Dx(t)\exp\left[\frac{i}{\hbar}S[x(t)]\right], \tag{40.11} \]
with boundary conditions $x(t_0) = x_0$ and $x(t) = x$, is approximated in the "saddle point approximation" by the action in the classical solution times a Gaussian integration for the quadratic fluctuation giving a quasi-classical determinant,
\[ \begin{aligned}
U(t,t_0) &\simeq \exp\left[\frac{i}{\hbar}S[x_{\rm cl}(t)]\right][\det(\text{kinetic operator})]^{-1/2} \\
&= \exp\left[\frac{i}{\hbar}\int_{x_0}^{x} dx'\sqrt{2m(E-V(x'))} - \frac{i}{\hbar}Et\right]\times\left[\det\left(-m\frac{d^2}{dt^2} - (V(x)-E)\right)\right]^{-1/2}.
\end{aligned} \tag{40.12} \]
After integrating over the initial position $x_0$, made to be identical with the final position $x$, we obtain the correct propagator $U(t,t_0)$ for the harmonic oscillator case.
40.2 WKB Matrix Elements

We want to calculate matrix elements

$$A_{12} \equiv \langle\psi_1|\hat A|\psi_2\rangle, \tag{40.13}$$

where $|\psi_1\rangle$, $|\psi_2\rangle$ are eigenstates of the Hamiltonian $\hat H$ with energies $E_1$, $E_2$. Assume for concreteness that $E_2 > E_1$. The analysis below follows the work of Landau (described in the book by Landau and Lifshitz).

We consider only operators $\hat A$ that are functions in the $x$ representation, as for instance a potential $V(x)$. Then

$$A_{12} = \int dx\int dy\,\langle\psi_1|x\rangle\langle x|\hat A|y\rangle\langle y|\psi_2\rangle = \int dx\int dy\,\psi_1^*(x)\,\psi_2(y)\,A_{xy}, \tag{40.14}$$

and $A_{xy} = \delta(x - y) f(x)$. Assuming also that $\psi_1(x)$ and $\psi_2(x)$ are real, we have

$$A_{12} = f_{12} = \int dx\,\psi_1(x) f(x)\psi_2(x). \tag{40.15}$$
Consider a potential $V(x)$ that is decreasing, and consider a turning point $a_1$ for $E_1$ such that $V(a_1) = E_1$, and a turning point $a_2$ for $E_2$ such that $V(a_2) = E_2$, implying $a_1 > a_2$. Define further

$$p_1(x) = p_{E_1}(x),\quad \chi_1(x) = \chi_{E_1}(x),\quad p_2(x) = p_{E_2}(x),\quad \chi_2(x) = \chi_{E_2}(x). \tag{40.16}$$
Then, according to the general theory reviewed before, away from the turning point $a_1$ the wave function $\psi_1(x)$ corresponding to $E_1$ is

$$\psi_1 = \frac{C_1}{\sqrt{2}\sqrt{\chi_1(x)}}\exp\left[-\frac{1}{\hbar}\int_{a_1}^{x}\chi_1(x')\,dx'\right],\quad x < a_1,$$
$$\psi_1 = \frac{C_1}{\sqrt{p_1(x)}}\cos\left[\frac{1}{\hbar}\int_{a_1}^{x}p_1(x')\,dx' - \frac{\pi}{4}\right],\quad x > a_1, \tag{40.17}$$

and similarly for $\psi_2(x)$. However, we cannot use both these solutions to calculate $f_{12}$. First, the region near the turning points has a large contribution to the integral, since the wave function blows up there. Second, the turning point of one wave function is not a turning point for the other, so it encroaches on the region in which the other solution was valid.

Instead, we write $\psi_2 = \psi_2^+ + \psi_2^-$, where $\psi_2^+$ is complex, and $\psi_2^- = (\psi_2^+)^*$. Then, for the solution at $x > a_2$, we write $2\cos[\ldots]$ as $e^{i[\ldots]} + e^{-i[\ldots]}$, which is a sum of complex conjugate terms, corresponding to $\psi_2^+$ and $\psi_2^-$. But now we can take advantage of an alternative correspondence rule through the turning points, for complex exponential solutions, namely (from $x > a_2$ to $x < a_2$)

$$\frac{C}{\sqrt{p(x)}}\exp\left[\frac{i}{\hbar}\int_{a}^{x}p(x')\,dx' + i\frac{\pi}{4}\right] \to \frac{C}{\sqrt{|p(x)|}}\exp\left[\frac{1}{\hbar}\int_{a}^{x}p(x')\,dx'\right], \tag{40.18}$$

thus defining $\psi_2^+$ also for $x < a_2$. We obtain (note that $-i = e^{-i\pi/2}$)

$$\psi_2^+ = \frac{-iC_2}{\sqrt{2}\sqrt{\chi_2(x)}}\exp\left[\frac{1}{\hbar}\int_{a_2}^{x}p_2(x')\,dx'\right],\quad x < a_2,$$
$$\psi_2^+ = \frac{C_2}{2\sqrt{p_2(x)}}\exp\left[\frac{i}{\hbar}\int_{a_2}^{x}p_2(x')\,dx' - i\frac{\pi}{4}\right],\quad x > a_2. \tag{40.19}$$

Having defined $\psi_2^+$, and $\psi_2^- = (\psi_2^+)^*$, we can also split $f_{12}$ similarly as $f_{12} = f_{12}^+ + f_{12}^-$, with

$$f_{12}^+ = \int_{-\infty}^{+\infty}\psi_1(x) f(x)\psi_2^+(x)\,dx. \tag{40.20}$$
Next, we consider moving $x$ from the real line into the complex plane, specifically the upper half-plane (positive imaginary part). In that case, if the exponent is a function $g(x)$ that maps the upper half-plane to itself, $e^{ig(x)} \to e^{-{\rm Im}\,g(x)}$ goes to zero at infinity, so the integrand is better behaved as $|x| \to \infty$. Then, we consider an integration contour $C$ for $x$ in the upper half-plane that is slightly above the real line, thus avoiding the turning points at real $a_1$, $a_2$ and so being well defined. For that, however, we must define, in the whole half-plane but away from $a_1$, $a_2$,

$$\psi_1(x) = \frac{C_1}{\sqrt{2}\sqrt{\chi_1(x)}}\exp\left[\frac{1}{\hbar}\int_{a_1}^{x}\chi_1(x')\,dx'\right],$$
$$\psi_2^+(x) = \frac{-iC_2}{\sqrt{2}\sqrt{\chi_2(x)}}\exp\left[-\frac{1}{\hbar}\int_{a_2}^{x}\chi_2(x')\,dx'\right]. \tag{40.21}$$

Then we can calculate the amplitude $f_{12}^+$ as

$$f_{12}^+ = \frac{-iC_1C_2}{2}\int_C dx\,\frac{f(x)}{\sqrt{\chi_1(x)\chi_2(x)}}\exp\left[\frac{1}{\hbar}\left(\int_{a_1}^{x}dx'\,\chi_1(x') - \int_{a_2}^{x}dx'\,\chi_2(x')\right)\right]. \tag{40.22}$$

The exponent in the above formula has an extremum $x_0$, i.e., a zero derivative with respect to $x$, only for $V(x_0) = \infty$ if $E_1 \neq E_2$. Indeed,

$$\frac{d}{dx}\left[\int_{a_1}^{x}dx'\,\chi_1(x') - \int_{a_2}^{x}dx'\,\chi_2(x')\right] = 0 \;\Rightarrow\; \sqrt{2m(V(x) - E_1)} - \sqrt{2m(V(x) - E_2)} \simeq \frac{m(E_2 - E_1)}{\sqrt{2m(V(x) - E_1)}} = 0 \tag{40.23}$$

has only the solution $V(x) \to \infty$.

Assume that there is a single pole for $V(x)$ in the upper half-plane, i.e., $V(x_0) \to \infty$. Then, the integral over the contour $C$ (slightly above the real line) is equal to the integral over a contour $C'$ that goes vertically down from infinity to $x_0$, then encircles it from below (very close to it), then goes vertically back up again. Indeed, we can add for free two quarter-circles at infinity in the upper half-plane (since, as we just said, the contribution at infinity vanishes in the upper half-plane) that connect $C$ with $-C'$ at infinity, and there is no pole inside the resulting total contour, as in Fig. 40.1. We can thus use the residue theorem of complex analysis to say that the integral over the whole contour is zero (the integral equals the sum of residues at the poles inside the contour), or, since the added integral at infinity vanishes anyway, that the integral over $C$ is equal to the integral over $C'$.
Figure 40.1 Integration over the contour $C$, slightly above the real axis in $x$, can be reduced to integration over the contour $C'$, encircling $x_0$, down and up from infinity, indicated by the dashed curve.
Then $f_{12}^+$ is found to be the same as (40.22), but integrated over $C'$, and can thus be approximated by the value of the exponential near the minimum $x_0$ (the further we go up in the imaginary direction, the more we add to the exponent in $e^{-{\rm Im}[\ldots]}$, making the contribution exponentially small). Moreover, then

$$f_{12} = 2\,{\rm Re}\, f_{12}^+ \propto 2\,{\rm Im}\exp[\ldots], \tag{40.24}$$

so that finally we have the approximation (note that $f(x_0)$ is a subleading factor, which can be neglected compared with the exponential)

$$f_{12} \sim \exp\left[-\frac{1}{\hbar}{\rm Im}\left(\int^{x_0}dx'\,p_2(x') - \int^{x_0}dx'\,p_1(x')\right)\right]. \tag{40.25}$$

If the two energies $E_1$ and $E_2$ are close to each other, i.e.,

$$E_{1,2} \equiv E \mp \frac{1}{2}\hbar\omega_{21}, \tag{40.26}$$

with $\hbar\omega_{21} \ll E$, we can also make the approximations

$$p_2(x) \simeq p_E(x) + \frac{\partial p_E(x)}{\partial E}(E)\,\frac{\hbar\omega_{21}}{2},\qquad p_1(x) \simeq p_E(x) - \frac{\partial p_E(x)}{\partial E}(E)\,\frac{\hbar\omega_{21}}{2}. \tag{40.27}$$

Then, since

$$\frac{\partial p_E(x)}{\partial E}(E) = \sqrt{\frac{m}{2(E - V(x))}}, \tag{40.28}$$

we find that the transition amplitude matrix element is approximated by

$$f_{12} \sim \exp\left[-\omega_{21}\,{\rm Im}\int^{x_0}dx\,\sqrt{\frac{m}{2(E - V(x))}}\right] = e^{-\omega_{21}\,{\rm Im}\,\tau}, \tag{40.29}$$

where we have defined the complex time

$$\tau \equiv \int^{x_0}\frac{dx}{v(x)}, \tag{40.30}$$

and where we have also defined the complex velocity

$$v(x) \equiv \sqrt{\frac{2(E - V(x))}{m}}. \tag{40.31}$$
Thus we have a transition amplitude (a matrix element) that is approximated by a decay factor in complex time, found as an extremum over the complex plane of the action, i.e., the exponent of the WKB approximation.
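The linearization in (40.27)-(40.29) can be checked numerically at a real point in the classically allowed region. The following is a minimal sketch, assuming natural units $\hbar = m = 1$ and illustrative values of $E$, $\omega_{21}$ and $V$ that are not taken from the text; the function names are hypothetical. It compares $(p_2 - p_1)/\hbar$ with the integrand $\omega_{21}/v(x)$ of $\omega_{21}\tau$.

```python
import math

HBAR = 1.0  # natural units: an assumption made for this sketch
MASS = 1.0


def momentum(energy, potential):
    """Quasiclassical momentum p_E = sqrt(2m(E - V)) in the allowed region E > V."""
    return math.sqrt(2.0 * MASS * (energy - potential))


def exponent_density_exact(e_mid, omega21, potential):
    """(p_2 - p_1)/hbar with E_1 = E - hbar*omega21/2 and E_2 = E + hbar*omega21/2."""
    e1 = e_mid - 0.5 * HBAR * omega21
    e2 = e_mid + 0.5 * HBAR * omega21
    return (momentum(e2, potential) - momentum(e1, potential)) / HBAR


def exponent_density_linear(e_mid, omega21, potential):
    """omega21 / v(x) with v = sqrt(2(E - V)/m), the integrand of omega21 * tau."""
    velocity = math.sqrt(2.0 * (e_mid - potential) / MASS)
    return omega21 / velocity


# Illustrative numbers: hbar*omega21 = 0.1 is small compared with E - V = 8.
exact = exponent_density_exact(10.0, 0.1, 2.0)
linear = exponent_density_linear(10.0, 0.1, 2.0)
relative_error = abs(exact - linear) / abs(linear)
print(exact, linear, relative_error)
```

The two expressions agree up to corrections of relative order $(\hbar\omega_{21}/(E - V))^2$, which is why the approximation requires $\hbar\omega_{21}$ to be small.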
40.3 Application to Transition Probabilities

The transition probability relations contain, as we saw in previous chapters, the modulus squared of transition amplitudes for the interaction Hamiltonian,

$$|\langle\psi_m|\hat H_1|\psi_i\rangle|^2, \tag{40.32}$$

and usually the interaction Hamiltonian $\hat H_1$ is a certain function of $x$, $f(x)$.

Then, time-dependent perturbation theory, in the sudden approximation (the interaction Hamiltonian jumps from 0 to a constant at $t_0$), gives

$$|b_m(t)|^2 = t\,\frac{2\pi}{\hbar}\,|\langle\psi_m|\hat H_1|\psi_i\rangle|^2\,\delta(E_f - E_i), \tag{40.33}$$

or Fermi's golden rule,

$$P(i \to m) = t\,\frac{2\pi}{\hbar}\int df_2\cdots\int df_r\,\rho(E_i, f_2, \ldots, f_r)\,|\langle\psi_m|\hat H_1|\psi_i\rangle|^2 \sim |\langle\psi_m|\hat H_1|\psi_i\rangle|^2. \tag{40.34}$$

But then, we have for the amplitude

$$\langle\psi_m|\hat H_1|\psi_i\rangle \sim \exp\left[-\frac{1}{\hbar}{\rm Im}\left(\int_{x_1}^{x_0}dx'\,p_m(x') - \int_{x_2}^{x_0}dx'\,p_i(x')\right)\right]. \tag{40.35}$$

Since, however, the integral in the exponent is the action, according to the Bohr–Sommerfeld quantization discussed in Chapter 26,

$$\int_{x_1}^{x_0}dx\,p(x) = \text{action} = S(x_1, x_0), \tag{40.36}$$

we obtain that the leading factor in the transition probability is

$$P(i \to m) \sim \exp\left\{-\frac{2}{\hbar}{\rm Im}\,[S(x_1, x_0) + S(x_0, x_2)]\right\}. \tag{40.37}$$

This means that we have an action in the exponent for a path in the complex plane, in terms of the variables (complex) time $\tau$ and (complex) position $x$, together giving the (complex) path $x(\tau)$, which starts at $x_1$ and goes to $x_0$, the extremum of the action $S$ in complex time, then goes to $x_2$. Here $x_1$ and $x_2$ are positions corresponding to the two states $\psi_i$ and $\psi_m$ for which we are calculating the transition probability.
40.4 Instantons for Transitions between Two Minima

The previous explanation motivates a more precise analysis for the case of a transition between two local minima. For this, however, we need to define path integrals in Euclidean space first.

Path Integrals in Euclidean Space

In Chapter 28 we introduced path integrals in Euclidean (imaginary) time, though the motivation there was to connect with statistical mechanics. Here, the main motivation is to have a formalism that allows for a natural "classical path" in imaginary time. Thus we consider the propagator written as a path integral,

$$U(t, t_0) = \int_{x_i = x(t_i)}^{x_f = x(t_f)}\mathcal{D}x(t)\,e^{iS[x(t)]/\hbar}, \tag{40.38}$$

and it can be Wick rotated to Euclidean (imaginary) time by setting $t_E = it$ (so we are considering $t = -it_E$, where now $t_E \in \mathbb{R}$), which changes the Minkowski metric of spacetime to the Euclidean one,

$$-c^2dt^2 + d\vec{x}^{\,2} = +c^2dt_E^2 + d\vec{x}^{\,2}. \tag{40.39}$$

We define the exponent in the path integral, $iS[x]$, as $-S_E[x]$,

$$iS[x] = i\int(-i\,dt_E)\left[\frac{1}{2}\left(\frac{dx}{d(-it_E)}\right)^2 - V(x)\right] \equiv -S_E[x], \tag{40.40}$$

which defines the Euclidean action as

$$S_E[x] = \int dt_E\left[\frac{1}{2}\left(\frac{dx}{dt_E}\right)^2 + V(x)\right] = \int dt_E\,L_E(x, \dot x). \tag{40.41}$$

This allows the propagator to be better defined, as

$$U(t, t_0) = \int\mathcal{D}x(t_E)\,\exp\left\{-\frac{1}{\hbar}S_E[x(t_E)]\right\}, \tag{40.42}$$
but more importantly, it still takes real values for real Euclidean time $t_E$ instead of real normal time $t$.

Assume that $V(x)$ has two minima, $x_1$ and $x_2$, with the same value $V_0$, thus $V'(x_{1,2}) = 0$, with $V''(x_{1,2}) > 0$, which necessarily means that there is a local maximum $x_0$ in between them, $V'(x_0) = 0$ with $V''(x_0) < 0$. These extrema of the potential are then also extrema of the Euclidean action $S_E[x]$, namely, $x(t) = x_1$ and $x(t) = x_2$ satisfy the classical equations of motion for the Euclidean action,

$$0 = \frac{\delta S_E}{\delta x(t_E)} = -\frac{d^2x}{dt_E^2} + \frac{dV}{dx}. \tag{40.43}$$

Comparing with the classical equations of motion for the normal action,

$$\frac{d^2x}{dt^2} + \frac{dV}{dx} = 0, \tag{40.44}$$

we see that we have effectively an inverted potential,

$$V_E(x) = -V(x). \tag{40.45}$$
Consider then the classical solution in this inverted potential $V_E(x)$, i.e., the classical solution of the Euclidean action, with the given boundary conditions; it is called an instanton. Note that $V_E(x)$ has two maxima at $x_1$ and $x_2$, and a local minimum $x_0$ in between them, as in Fig. 40.2.

Figure 40.2 Instanton solution in between two maxima, $x_1$, $x_2$ of $V_E$, passing through the local minimum $x_0$ (equivalently, two minima of $V$, with the solution passing through the local maximum $x_0$).
Then we can consider an instanton solution that asymptotically goes to $x_1$ at $t_i \to -\infty$ and to $x_2$ at $t_f \to +\infty$, by passing through $x_0$ at a finite time. Physically, this motion in the inverted potential $V_E(x)$ corresponds to a particle initially (at $t_i \to -\infty$) sitting at the maximum (an unstable point) at $x_1$, and receiving an infinitesimal kick, after which it goes down the hill, through the local minimum at $x_0$, and on to the maximum at $x_2$, reached asymptotically when $t_f \to +\infty$.

If $t_i, t_f \in \mathbb{R}$ and are asymptotically at infinity, they would correspond to $t_{E,i}, t_{E,f} \in i\mathbb{R}$, which is not what we want. However, in the complex $t$ plane this would correspond to integration over the contour $C$ and, as we saw, this is equivalent to integration over the contour $C'$, where $x_0$ is not a pole of the potential but, more generally now, the extremum of the action appearing in the exponent, situated in between the initial and final points, meaning the same $x_0$ as was defined just above. Then $x(t)$ is the same complex path as that described in the WKB matrix element analysis.

In this case, we can now use the general formulas in Chapter 38 relating transition amplitudes with the propagator,

$$b_n(t) = \langle\psi_n|U_I(t, t_0)|\psi_i\rangle. \tag{40.46}$$

Here we take $|\psi_i\rangle$ to be the quantum state corresponding to the position $x_1$ at $t_i \to -\infty$ and $|\psi_m\rangle$ the quantum state corresponding to position $x_2$ at $t_f \to +\infty$. Then we obtain for the transition amplitude coefficient

$$b_m(t) \sim \int\mathcal{D}x(t)\,\exp\left\{\frac{i}{\hbar}S[x(t)]\right\} = \int\mathcal{D}x(t_E)\,\exp\left\{-\frac{1}{\hbar}S_E[x(t_E)]\right\}. \tag{40.47}$$

A saddle point approximation for the path integral gives just the exponent corresponding to the minimum of the Euclidean action $S_E$, meaning the Euclidean action evaluated on the classical solution (which extremizes $S_E$), i.e., the instanton, thus

$$b_m(t) \sim \exp\left\{-\frac{1}{\hbar}S_E[x_{\rm cl}(t_E)]\right\} = \exp\left\{-\frac{1}{\hbar}\big(S_E[x_1, x_0] + S_E[x_0, x_2]\big)\right\} = \exp\left\{\frac{i}{\hbar}\big(S[x_1, x_0] + S[x_0, x_2]\big)\right\}, \tag{40.48}$$

where we have split the classical instanton solution into two halves, one going from $x_1$ to $x_0$ and the other from $x_0$ to $x_2$, and we have gone back to the normal action by setting $-S_E = iS$. Finally then, the probability of transition is

$$P(i \to m) = |b_m(t)|^2 \sim \exp\left\{-\frac{2}{\hbar}{\rm Im}\big(S[x_1, x_0] + S[x_0, x_2]\big)\right\}, \tag{40.49}$$

which is the same formula as that obtained above from the WKB method on matrix elements.

We can also consider the next correction to the above approximate formula, a further "quasiclassical approximation" around the classical Euclidean value. This corresponds to considering fluctuations around the classical path in complex space, i.e., around the instanton. We write the quadratic fluctuations around the instanton as follows:

$$S_E \simeq S_E[x_{\rm cl}(t_E)] + \frac{1}{2}\frac{\delta^2S_E}{\delta x^2(t_E)}[x_{\rm cl}(t_E)]\,(x - x_{\rm cl}(t_E))^2, \tag{40.50}$$

and then, doing Gaussian integration over $x$, we obtain the subleading fluctuations determinant contribution as well,

$$e^{-S_{E,0}}\left\{\det\left[-\frac{d^2}{dt_E^2} + V''(x_{\rm cl}(t_E))\right]\right\}^{-1/2}. \tag{40.51}$$
We can also consider other contributions to the path integral, such as multi-instanton solutions and fluctuations around them. In the simple case considered here, with only two minima of the potential, we only have an instanton solution, going from x 1 to x 2 in Euclidean time, and an anti-instanton solution, going from x 2 to x 1 . We could have a quasi-solution that goes from x 1 to x 2 in a large, but not infinite time, and then back to x 1 in the same large time, which would be an instanton–anti-instanton solution. However, if instead of the potential with two minima we had a periodic potential, with an infinite number of minima, we could consider a multi-instanton (quasi-)solution, where we go from one minimum to the next, then to the next, etc. In this case, in the path integral we must sum over all the possible multi-instantons, and the fluctuations around them, but we will not explore this possibility further here.
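As an illustration of the leading one-instanton factor, the following is a minimal numerical sketch. It assumes unit mass, as in (40.41), and a potential normalized so that $V(x_{1,2}) = 0$. Energy conservation in the inverted potential then gives $\frac{1}{2}(dx/dt_E)^2 = V(x)$ along the instanton, so that $S_E = \int_{x_1}^{x_2}dx\,\sqrt{2V(x)}$. The double-well potential, its parameter values and all function names are illustrative choices, not taken from the text.

```python
import math


def simpson(func, lower, upper, intervals=2000):
    """Composite Simpson rule on [lower, upper] with an even number of intervals."""
    if intervals % 2:
        intervals += 1
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for k in range(1, intervals):
        weight = 4.0 if k % 2 else 2.0
        total += weight * func(lower + k * step)
    return total * step / 3.0


def instanton_action(potential, x_start, x_end):
    """S_E = integral of sqrt(2 V(x)) dx for a zero-energy instanton with m = 1."""
    return simpson(lambda x: math.sqrt(max(2.0 * potential(x), 0.0)), x_start, x_end)


# Illustrative double well V = alpha (x^2 - a^2)^2 with minima at x = -a and x = +a.
ALPHA = 0.5
A = 1.5


def double_well(x):
    return ALPHA * (x * x - A * A) ** 2


action_numeric = instanton_action(double_well, -A, A)
# Closed form for this potential: sqrt(2 alpha) * (4/3) a^3.
action_exact = math.sqrt(2.0 * ALPHA) * 4.0 * A ** 3 / 3.0
hbar = 1.0  # natural units, an assumption of this sketch
suppression = math.exp(-2.0 * action_numeric / hbar)
print(action_numeric, action_exact, suppression)
```

The number `suppression` is the exponential factor $e^{-2S_E/\hbar}$ only; the fluctuation determinant and multi-instanton contributions discussed above are not included.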
Important Concepts to Remember

• We can use the WKB method to calculate matrix elements $f_{12} \equiv \langle\psi_1|\hat A|\psi_2\rangle$ between two eigenstates of the Hamiltonian, by first going to the complex plane (more specifically, the imaginary line in the upper half-plane) for $x$; consequently, for the exponent in the WKB approximation for $f_{12}$, we have $e^{ig(x)} \to e^{-{\rm Im}\,g(x)}$.
• Then, minimizing this exponent over the complex plane, we obtain $f_{12} \sim \exp\left[-\frac{E_2 - E_1}{\hbar}{\rm Im}\,\tau\right]$, with $\tau = \int^{x_0}dx/v(x)$, where $x_0$ is the (unique) point in the upper half-plane where $V(x_0) \to \infty$, the integral only goes near it, and $v(x) = \sqrt{2(E - V(x))/m}$.
• Using Bohr–Sommerfeld quantization, the transition probability is found as $P(i \to m) \sim \exp\left\{-\frac{2}{\hbar}{\rm Im}[S(x_1, x_0) + S(x_0, x_2)]\right\}$, for a path in the complex plane for imaginary $x$ and $\tau$, from $x_1$ to $x_2$ via $x_0$, the extremum of the action in the complex plane.
• In a more precise formalism, we define path integrals in Euclidean (imaginary) time, with propagator $U(t, t_0) = \int\mathcal{D}x(t_E)\exp\left\{-\frac{1}{\hbar}S_E[x(t_E)]\right\}$.
• Motion in Euclidean time is motion in the inverted potential, $V_E(x) = -V(x)$, so motion between two minima of $V(x)$ via a maximum $x_0$ in between them becomes motion between two maxima of $V_E(x)$ via a minimum $x_0$ in between them, called an instanton.
• Since $b_m(t) = \langle\psi_m|U(t, t_0)|\psi_i\rangle$, the probability becomes the modulus squared of the Euclidean path integral for the instanton, which at first order gives the previous formula, $P(i \to m) \sim \exp\left\{-\frac{2}{\hbar}S_E[x_{\rm cl}(t_E)]\right\}$, in terms of the instanton action. Fluctuations around the instanton, as well as multi-instanton contributions, give corrections to $P(i \to m)$.
Further Reading For more on WKB methods for matrix elements, see [4], and for more on instantons, in both quantum mechanics and quantum field theory, see [10].
Exercises

(1) Use the WKB method, and Bohr–Sommerfeld quantization, to estimate the matrix element $\langle 2|\exp(-a\partial_x)|1\rangle$ for a harmonic oscillator.
(2) Consider the potential $V = -A/(x^2 + a^2)$, and states $|1\rangle$, $|2\rangle$, the first two eigenstates in this potential. Using the methods in the text, estimate the transition matrix element $f_{12} \equiv \langle 1|x|2\rangle$.
(3) Can the instanton be associated with the motion of a real particle, albeit nonclassical (perhaps a quantum path)?
(4) Consider a theory with only two minima of the potential, at $x_1$ and $x_2$. Sum the series of noninteracting multi-instantons that start at $x_1$ and end at $x_2$, having been through a continuous motion.
(5) Consider the Higgs-type potential $V = \alpha(x^2 - a^2)^2$, and a transition between the two vacua (minima of the potential) $x_1 = -a$ and $x_2 = +a$. Calculate the transition probability in the classical one-instanton approximation.
(6) Consider the periodic potential $V = A\cos^2(ax)$, and a transition between the first two vacua (minima) of $x > 0$. Calculate the transition probability in the classical one-instanton approximation.
(7) Calculate the (formal) fluctuation correction to the instanton calculation in exercise 6.
41 Variational Methods

In this chapter, we will study another type of approximation method, variational methods. They give an approximation to eigenenergies and eigenstates by considering variations over a certain subset of functions representing the states.
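41.1 First Form of the Variational Method

The first form of the variational method applies to finding the energy of the ground state of a system.

Theorem This method starts from a simple theorem, expressed as follows. The energy of a state is always equal to or larger than that of the ground state:

$$E_\psi \equiv \frac{\langle\psi|\hat H|\psi\rangle}{\langle\psi|\psi\rangle} \geq E_0,\quad \forall|\psi\rangle. \tag{41.1}$$

Proof Consider an orthonormal basis of exact eigenstates $|n\rangle$ of the Hamiltonian $\hat H$,

$$\hat H|n\rangle = E_n|n\rangle, \tag{41.2}$$

among which $E_0$ is the smallest, i.e., it is the ground state energy. Since this is a basis, it is a complete set,

$$\sum_n|n\rangle\langle n| = \hat 1. \tag{41.3}$$

Then expand the state $|\psi\rangle$ in this basis,

$$|\psi\rangle = \sum_n c_n|n\rangle = \sum_n|n\rangle\langle n|\psi\rangle. \tag{41.4}$$

Since the states $|n\rangle$ are an orthonormal set of eigenstates,

$$\langle m|\hat H|n\rangle = E_n\delta_{mn}. \tag{41.5}$$

Then it follows that

$$E_\psi = \frac{\sum_{m,n}\langle\psi|m\rangle\langle m|\hat H|n\rangle\langle n|\psi\rangle}{\sum_n\langle\psi|n\rangle\langle n|\psi\rangle} = \frac{\sum_n|\langle n|\psi\rangle|^2E_n}{\sum_n|\langle n|\psi\rangle|^2} \geq \frac{\sum_n|\langle n|\psi\rangle|^2E_0}{\sum_n|\langle n|\psi\rangle|^2} = E_0. \tag{41.6}$$

q.e.d.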
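From this theorem, it follows that we can choose any basis $|\alpha\rangle$ for the Hilbert space (not just an eigenbasis of $\hat H$), and expand in it,

$$|\psi\rangle = \sum_\alpha a_\alpha|\alpha\rangle = \sum_\alpha|\alpha\rangle\langle\alpha|\psi\rangle, \tag{41.7}$$

and then the ground state energy can be found as a minimum over the whole space of $\{a_\alpha\}$ coefficients,

$$E_\psi = \frac{\sum_{\alpha,\beta}a_\alpha^*a_\beta H_{\alpha\beta}}{\sum_\alpha a_\alpha^*a_\alpha} \geq E_0 = \frac{\langle 0|\hat H|0\rangle}{\langle 0|0\rangle}. \tag{41.8}$$

The minimum conditions,

$$\frac{\partial E_\psi}{\partial a_\alpha} = 0,\quad \forall\alpha, \tag{41.9}$$

will then select the ground state, $E_{\psi,\min} = E_0 \Rightarrow |\psi\rangle = |0\rangle$.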
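The inequality (41.1) can be illustrated on the smallest nontrivial case. The following is a minimal sketch for a real symmetric $2\times 2$ Hamiltonian with arbitrarily chosen matrix elements (an illustrative choice, not from the text): the energy $E_\psi$ of many random real trial vectors is compared with the exact lowest eigenvalue, which for a $2\times 2$ matrix is known in closed form.

```python
import math
import random

# Illustrative real symmetric Hamiltonian in an orthonormal basis (values are arbitrary).
H = [[1.0, 0.5],
     [0.5, 2.0]]


def energy_of_state(hamiltonian, coeffs):
    """E_psi = sum_ab a_a H_ab a_b / sum_a a_a^2 for real coefficients a_a, as in (41.8)."""
    size = len(coeffs)
    numerator = sum(coeffs[i] * hamiltonian[i][j] * coeffs[j]
                    for i in range(size) for j in range(size))
    denominator = sum(c * c for c in coeffs)
    return numerator / denominator


def ground_energy_2x2(hamiltonian):
    """Exact lowest eigenvalue of a real symmetric 2x2 matrix."""
    mean = 0.5 * (hamiltonian[0][0] + hamiltonian[1][1])
    half_gap = 0.5 * (hamiltonian[0][0] - hamiltonian[1][1])
    return mean - math.sqrt(half_gap ** 2 + hamiltonian[0][1] ** 2)


random.seed(0)
trial_states = [[random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
                for _ in range(1000)]
trial_energies = [energy_of_state(H, state) for state in trial_states
                  if any(abs(c) > 1e-12 for c in state)]

e0 = ground_energy_2x2(H)
lowest_trial = min(trial_energies)
print(e0, lowest_trial)
```

Every trial energy lies at or above `e0`, and the lowest one approaches `e0` from above as more of the coefficient space is sampled, which is the content of the minimum conditions (41.9).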
41.2 Ritz Variational Method

However, the above is in general impractical, since in most cases the Hilbert space is either infinite dimensional, or of a very high dimension, and moreover one has to orthonormalize the basis states, making full minimization extremely hard if not impossible.

One particular variational method that is practical is the Ritz method, which amounts to considering a subset of states $\{|\psi_i\rangle\}_i$, and not even necessarily an orthonormal subset (a sub-basis), and defining

$$|\psi\rangle \equiv \sum_i c_i|\psi_i\rangle, \tag{41.10}$$

where $c_i$ are arbitrary complex coefficients. Then we define also

$$E_\psi = \frac{\sum_{i,k}c_k^*c_i\langle\psi_k|\hat H|\psi_i\rangle}{\sum_{i,k}c_k^*c_i\langle\psi_k|\psi_i\rangle} = \frac{\sum_{i,k}c_k^*c_iH_{ki}}{\sum_{i,k}c_k^*c_i\Delta_{ki}}, \tag{41.11}$$

where we have denoted

$$\langle\psi_k|\hat H|\psi_i\rangle = H_{ki},\quad \langle\psi_k|\psi_i\rangle = \Delta_{ki}. \tag{41.12}$$

Treat the $c_k$ and $c_k^*$ as independent variables (this is always possible, being equivalent to writing $c_i = a_i + ib_i$ with $a_i$ and $b_i$ independent real variables). Then the stationarity (minimum) equations, with variables $c_k^*$, are

$$\frac{\partial E_\psi}{\partial c_k^*} = 0, \tag{41.13}$$

which gives

$$\frac{1}{\sum_{i,k}c_k^*c_i\Delta_{ki}}\left[\sum_i c_iH_{ki} - E_\psi\sum_i c_i\Delta_{ki}\right] = 0, \tag{41.14}$$

or equivalently

$$\sum_i c_i\left(H_{ki} - E_\psi\Delta_{ki}\right) = 0, \tag{41.15}$$

where $E_\psi$ is $E$ at the minimum, leading to an eigenvalue equation for $E$,

$$\det(H_{ki} - E\Delta_{ki}) = 0. \tag{41.16}$$
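To make (41.16) concrete, here is a minimal sketch with two nonorthogonal trial states. It assumes a harmonic oscillator in the dimensionless variable $y$, with $\hat H = \frac{\hbar\omega}{2}(-d^2/dy^2 + y^2)$ and unnormalized Gaussians $e^{-a y^2}$ as the subset $\{|\psi_i\rangle\}$; the exponents $0.3$ and $1.0$ and the function names are illustrative choices. For such Gaussians the overlap is $\Delta_{ab} = \sqrt{\pi/(a+b)}$ and $H_{ab} = \frac{1}{2}\left[\frac{2ab}{a+b} + \frac{1}{2(a+b)}\right]\Delta_{ab}$ in units of $\hbar\omega$, and for two states the determinant condition is a quadratic equation in $E$.

```python
import math


def overlap(a, b):
    """Delta_ab = <a|b> for unnormalized Gaussians exp(-a y^2) and exp(-b y^2)."""
    return math.sqrt(math.pi / (a + b))


def hamiltonian(a, b):
    """H_ab = <a| (1/2)(-d^2/dy^2 + y^2) |b>, in units of hbar*omega."""
    total = a + b
    return 0.5 * (2.0 * a * b / total + 1.0 / (2.0 * total)) * overlap(a, b)


def ritz_lowest_energy(a1, a2):
    """Lowest root of det(H_ki - E Delta_ki) = 0 for the two-state subset {a1, a2}."""
    h11, h22, h12 = hamiltonian(a1, a1), hamiltonian(a2, a2), hamiltonian(a1, a2)
    d11, d22, d12 = overlap(a1, a1), overlap(a2, a2), overlap(a1, a2)
    # det = qa*E^2 + qb*E + qc, with qa > 0 by the Cauchy-Schwarz inequality.
    qa = d11 * d22 - d12 ** 2
    qb = -(h11 * d22 + h22 * d11 - 2.0 * h12 * d12)
    qc = h11 * h22 - h12 ** 2
    discriminant = qb * qb - 4.0 * qa * qc
    return (-qb - math.sqrt(discriminant)) / (2.0 * qa)


exponents = (0.3, 1.0)  # illustrative; neither equals the exact value 1/2
ritz_energy = ritz_lowest_energy(*exponents)
best_single = min(hamiltonian(a, a) / overlap(a, a) for a in exponents)
print(ritz_energy, best_single)
```

The Ritz energy is lower than the energy of either trial state on its own, yet stays above the exact ground state energy $\hbar\omega/2$, as the theorem (41.1) requires.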
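Lemma The good thing about this method is that, even if we incur a significant error in finding the correct state (in this case $|0\rangle$), say

$$|\psi\rangle - |0\rangle \sim \mathcal{O}(\epsilon), \tag{41.17}$$

the error in finding the correct energy is much smaller,

$$E - E_0 \sim \mathcal{O}(\epsilon^2). \tag{41.18}$$

Proof We first trivially rewrite the form of the energy of the state as

$$E_\psi = \frac{\sum_n|\langle n|\psi\rangle|^2(E_n - E_0)}{\sum_n|\langle n|\psi\rangle|^2} + E_0. \tag{41.19}$$

We also use normalized states,

$$|\tilde\psi\rangle \equiv \frac{|\psi\rangle}{\sqrt{\langle\psi|\psi\rangle}} = \frac{|\psi\rangle}{\sqrt{\sum_n|\langle n|\psi\rangle|^2}}. \tag{41.20}$$

Then, if we have

$$\langle n|\tilde\psi\rangle \sim \mathcal{O}(\epsilon) \tag{41.21}$$

for $n \neq 0$, then

$$E_\psi - E_0 = \sum_n|\langle n|\tilde\psi\rangle|^2(E_n - E_0) \sim \mathcal{O}(\epsilon^2). \tag{41.22}$$

q.e.d.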
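The quadratic scaling of the energy error can be seen directly from (41.19). The following is a minimal sketch with a hypothetical three-level spectrum, written in the exact eigenbasis: the trial state is $|0\rangle$ plus an admixture of size $\epsilon$ of each excited state, and halving $\epsilon$ should reduce the energy error by roughly a factor of four.

```python
# Illustrative spectrum E_0 < E_1 < E_2 (values are arbitrary).
LEVELS = [0.0, 1.0, 2.5]


def energy_expectation(levels, amplitudes):
    """E_psi = sum_n |<n|psi>|^2 E_n / sum_n |<n|psi>|^2, in the eigenbasis."""
    norm = sum(abs(c) ** 2 for c in amplitudes)
    return sum(abs(c) ** 2 * e for c, e in zip(amplitudes, levels)) / norm


def energy_error(eps):
    """E_psi - E_0 for the trial state |0> + eps|1> + eps|2> (not normalized)."""
    trial = [1.0, eps, eps]
    return energy_expectation(LEVELS, trial) - LEVELS[0]


error_coarse = energy_error(0.1)
error_fine = energy_error(0.05)
ratio = error_coarse / error_fine
print(error_coarse, error_fine, ratio)
```

A ratio close to 4, rather than 2, is the signature of an $\mathcal{O}(\epsilon^2)$ error in the energy for an $\mathcal{O}(\epsilon)$ error in the state.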
41.3 Practical Variational Method

An even more practical version of the variational method is written explicitly in terms of wave functions, as opposed to states. Consider a wave function depending on parameters $a_1, \ldots, a_n$,

$$\psi(\vec x) \equiv \langle\vec x|\psi\rangle = \psi(\vec x; a_1, \ldots, a_n). \tag{41.23}$$

Then minimize the energy of the state $E_\psi$ over the parameters $a_1, \ldots, a_n$ (that is, over the parameters of a function, instead of over the coefficients of states or of given functions in an expansion):

$$\frac{\partial E_\psi}{\partial a_i} = 0,\quad i = 1, \ldots, n. \tag{41.24}$$

Thus, as in the case of the Ritz method, we obtain an approximation for the eigenenergy that is better, namely $\mathcal{O}(\epsilon^2)$, than the approximation for the eigenstates, which is $\mathcal{O}(\epsilon)$.
41.4 General Method

We can now obtain the most general form of the variational method, which is in fact valid for any Hermitian operator $\hat A$, even though it is generally used for $\hat H$. As before, we define

$$E_\psi = \langle\tilde\psi|\hat H|\tilde\psi\rangle = \frac{\langle\psi|\hat H|\psi\rangle}{\langle\psi|\psi\rangle}. \tag{41.25}$$

This general method is also based on a theorem.

Theorem With the above definitions, and using normalized states $|\tilde\psi\rangle$, the theorem is written formally as

$$\frac{\delta E_\psi}{\delta|\tilde\psi\rangle} = 0 \;\Leftrightarrow\; |\tilde\psi\rangle = |n\rangle, \tag{41.26}$$

i.e., the energy of the state is stationary, meaning there are no $\mathcal{O}(\epsilon)$ terms in it, only $\mathcal{O}(\epsilon^2)$ terms, if and only if the normalized state is an eigenstate of $\hat H$, with energy $E = E_n = E_\psi$. Note that this is true for all the eigenstates, not just for the ground state, though the condition is one of stationarity, not of a minimum (more on that later).

Proof in a basis. One way to prove it is to consider a basis, i.e., a complete orthonormal set $|r\rangle$ of states, and expand the normalized state in it,

$$|\tilde\psi\rangle = \sum_r c_r|r\rangle. \tag{41.27}$$

We want to keep the normalization condition fixed, i.e.,

$$\langle\tilde\psi|\tilde\psi\rangle = 1 \;\Leftrightarrow\; \sum_r c_r^*c_r = 1. \tag{41.28}$$

This means that we need to extremize $E_\psi$, understood as an action, while keeping the constraint, which is added to the "action" via a Lagrange multiplier called $E$,

$$\delta\left[E_\psi - E\left(\sum_r c_r^*c_r - 1\right)\right] = 0. \tag{41.29}$$

We rewrite it as

$$\delta\left[\sum_{r,s}c_r^*H_{rs}c_s - E\sum_{r,s}c_r^*\delta_{rs}c_s + E\right] = 0 \;\Rightarrow\; \delta\left[\sum_{r,s}c_r^*(H_{rs} - E\delta_{rs})c_s\right] = 0. \tag{41.30}$$

Now we treat $\delta c_r^*$ and $\delta c_s$ as arbitrary and independent variations, just as in the Ritz variational method. Thus we rewrite the above condition as

$$\sum_{r,s}\delta c_r^*(H_{rs} - E\delta_{rs})c_s + \sum_{r,s}c_r^*(H_{rs} - E\delta_{rs})\delta c_s = 0, \tag{41.31}$$

but since the variations are arbitrary and independent, we obtain two equations,

$$\sum_s(H_{rs} - E\delta_{rs})c_s = 0,\qquad \sum_r(H_{rs} - E\delta_{rs})c_r^* = 0. \tag{41.32}$$

However, since $\hat H$ is Hermitian, $H_{rs} = H_{sr}^*$, and moreover the eigenenergies are real, $E = E^*$, the second equation becomes

$$\sum_r(H_{sr}^* - E^*\delta_{sr})c_r^* = 0, \tag{41.33}$$

which is just the complex conjugate of the first. But this is just the eigenstate condition,

$$\sum_s H_{rs}c_s = Ec_r. \tag{41.34}$$

q.e.d.

We also note that here for the first time we used hermiticity, and this is the only property of $\hat H$ that was used, which means that indeed, the result is valid for any Hermitian operator.
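Proof in the general case In fact we don't need to expand in a basis in order to prove the theorem. We vary the energy of the state, with the normalization condition imposed with Lagrange multiplier $E$, as before (but without expanding in a basis),

$$\delta\left[\langle\tilde\psi|\hat H|\tilde\psi\rangle - E\langle\tilde\psi|\tilde\psi\rangle + E\right] = 0. \tag{41.35}$$

Then the variation becomes

$$\delta\left[\langle\tilde\psi|(\hat H - E\hat 1)|\tilde\psi\rangle\right] = 0 \;\Rightarrow\; \langle\delta\tilde\psi|(\hat H - E\hat 1)|\tilde\psi\rangle + \langle\tilde\psi|(\hat H - E\hat 1)|\delta\tilde\psi\rangle = 0. \tag{41.36}$$

But then if we treat $\langle\delta\tilde\psi|$ and $|\delta\tilde\psi\rangle$ as arbitrary and independent variations, we obtain

$$(\hat H - E\hat 1)|\tilde\psi\rangle = 0,\qquad \langle\tilde\psi|(\hat H - E\hat 1) = 0 \;\Rightarrow\; (\hat H^\dagger - E^*\hat 1)|\tilde\psi\rangle = 0. \tag{41.37}$$

Since $\hat H$ is Hermitian, $\hat H^\dagger = \hat H$ and $E^* = E$, so we obtain the same equation as the first, and moreover it is the eigenstate equation,

$$\hat H|\tilde\psi\rangle = E|\tilde\psi\rangle. \tag{41.38}$$

q.e.d.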
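Moreover, again we can show that the energy error is $\mathcal{O}(\epsilon^2)$. Consider an error $|\delta\tilde\psi_n\rangle$, and divide it into a part parallel to $|n\rangle$, and a part $|\delta\tilde\psi_\perp\rangle$ orthogonal to it,

$$|\tilde\psi_n\rangle = |n\rangle + |\delta\tilde\psi_n\rangle = a|n\rangle + b|\delta\tilde\psi_\perp\rangle, \tag{41.39}$$

where $\langle n|\delta\tilde\psi_\perp\rangle = 0$. Then normalization means that

$$\langle\tilde\psi_n|\tilde\psi_n\rangle = 1 \;\Rightarrow\; |a|^2 + |b|^2 = 1. \tag{41.40}$$

But then if $b \sim \mathcal{O}(\epsilon)$, $|a| \sim 1 - \mathcal{O}(\epsilon^2)$. That leads to the following error in the energy,

$$E_\psi = \left(a^*\langle n| + b^*\langle\delta\tilde\psi_\perp|\right)\hat H\left(a|n\rangle + b|\delta\tilde\psi_\perp\rangle\right) = |a|^2E_n + |b|^2\langle\delta\tilde\psi_\perp|\hat H|\delta\tilde\psi_\perp\rangle \simeq E_n + \mathcal{O}(\epsilon^2), \tag{41.41}$$

where in the second equality we have used $\hat H|n\rangle = E_n|n\rangle$ and $\langle\delta\tilde\psi_\perp|n\rangle = 0$, and in the last equality we have used $|a|^2 = 1 - \mathcal{O}(\epsilon^2)$ and $b \sim \mathcal{O}(\epsilon)$. Note however that now the sign of the $\epsilon^2$ correction is arbitrary, meaning that $E_\psi$ reaches either a minimum or a maximum. Only for $E_n = E_0$ (the ground state energy) do we always have a minimum (since the sign of $E - E_0$ is always plus).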
41.5 Applications

We now apply the variational methods to some relevant cases. We first point out the pros and cons of the variational method.

Pro: The method gives a very good approximation to the energy, with $\mathcal{O}(\epsilon^2)$ error, even when the $\mathcal{O}(\epsilon)$ error in the state (or wave function) is relatively large.

Con: The method does not give us the error itself, since all we know is that we are minimizing over some functions, not how these functions relate to the exact ones (i.e., what the value of $\epsilon$ is).

But we can improve on the cons as follows:
(1) We can choose a wave function for $|\psi\rangle$ that has the same number of nodes as the state $|n\rangle$ which we want to reproduce (which is known by general theory, without knowing the state itself).
(2) We can impose the same asymptotic behavior at the extremes, $r = 0$ and/or $r = \infty$ in three dimensions.
(3) We can use eigenfunctions of operators that commute with $\hat H$, e.g., the angular momentum $\hat{\vec L}$.
The Hydrogenoid Atom

In this case, we actually know exactly the wave functions of the eigenstates, but we can pretend we don't, and try to use the variational method. Using point (3) above, we can write the ansatz for the wave function in three dimensions,

$$\Phi = R(r)Y_{lm}(\theta, \phi). \tag{41.42}$$

Then we use point (2) above, and at $r = 0$, $R(r) \to$ const, whereas at $r \to \infty$, $R(r) \to 0$ really fast. For the ground state energy $E_0$, with spherical symmetry, we can then try (using the practical variational method) the wave function

$$R(r) = Ae^{-r/a}, \tag{41.43}$$

depending on the parameters $A$ and $a$, and minimize the energy $E_\psi$ over them. At the minimum, we find $a = a_0$ and $A = a_0^{-3/2}$, leading to the ground state wave function $|\psi_0\rangle$, and the corresponding ground state energy $E_0$. Thus in this way, we find the exact energy $E_0$.

But we could also try another function satisfying the same boundary conditions at $r = 0$ and $r = \infty$,

$$R(r) = \frac{A}{b^2 + r^2}, \tag{41.44}$$

depending on the parameters $A$ and $b$, and minimize the energy over them. At the minimum, we find $E_{\min} = -0.81|E_0|$, which is pretty close to the true value, even though the wave function is way off.

For the energy of an excited state $E_n$, $n > 0$, we must try wave functions that have the correct symmetry properties, the right number of nodes, etc. Otherwise, if we know the energy levels $E_m$ and states for $m < n$, we then try functions that are orthogonal to them, leading to $E_n > E_m$.
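The one-parameter minimization for the exponential trial function (41.43) can be sketched numerically. With lengths in units of $a_0$ and energies in units of $|E_0|$, the normalized trial function $e^{-r/a}$ gives $E(a)/|E_0| = 1/a^2 - 2/a$ (kinetic plus Coulomb term). For comparison, the sketch also includes a Gaussian trial function $e^{-\alpha r^2}$, which is not one of the functions used in the text; its standard energy functional is $E(\alpha)/|E_0| = 3\alpha - 4\sqrt{2\alpha/\pi}$. The golden-section minimizer, the search intervals and all function names are illustrative choices.

```python
import math


def golden_minimize(func, lower, upper, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    left = upper - inv_phi * (upper - lower)
    right = lower + inv_phi * (upper - lower)
    while abs(upper - lower) > tol:
        if func(left) < func(right):
            upper = right
        else:
            lower = left
        left = upper - inv_phi * (upper - lower)
        right = lower + inv_phi * (upper - lower)
    best = 0.5 * (lower + upper)
    return best, func(best)


def energy_exponential(a):
    """E(a)/|E_0| for R(r) ~ exp(-r/a), with a in units of a_0."""
    return 1.0 / a ** 2 - 2.0 / a


def energy_gaussian(alpha):
    """E(alpha)/|E_0| for R(r) ~ exp(-alpha r^2), alpha in units of a_0^-2 (comparison only)."""
    return 3.0 * alpha - 4.0 * math.sqrt(2.0 * alpha / math.pi)


a_best, e_exponential = golden_minimize(energy_exponential, 0.2, 5.0)
alpha_best, e_gaussian = golden_minimize(energy_gaussian, 0.01, 5.0)
print(a_best, e_exponential)
print(alpha_best, e_gaussian)
```

The exponential family contains the exact ground state, so the minimum sits at $a = a_0$ with energy $-|E_0|$. The Gaussian family does not, and its minimum, $-\frac{8}{3\pi}|E_0| \approx -0.85|E_0|$, lies above the exact value, in line with the theorem (41.1).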
Previously Studied Example: the Helium Atom (or Helium-Like Atom)

Helium-like atoms were studied in Chapter 21, and the variational method was used without much explanation, so here we return to it and streamline the presentation. Consider a trial wave function for the electrons in a helium-like atom that is just a product of the states for a single electron, i.e., the zeroth-order approximation in the perturbation theory for the interaction between the electrons,

$$\Phi(r_1, r_2; a) = f(r_1; a)f(r_2; a) = \frac{1}{\pi a^3}\exp\left(-\frac{r_1 + r_2}{a}\right), \tag{41.45}$$

but where $a$ is now a free parameter, and not the value $a = a_0/Z$ that would be appropriate for a helium-like atom of charge $Z$. Equivalently, write

$$a = \frac{a_0}{Z'}, \tag{41.46}$$

where $Z'$ is an arbitrary effective charge felt by the electrons, and is not equal to $Z$. Then, as shown in Chapter 21, the first-order (time-independent) perturbation theory correction, the interaction potential (between the electrons) $V_{12}$, averaged in the zeroth-order state $|\Phi\rangle$, only now with charge $Z'$ instead of $Z$, gives

$$\langle\Phi|\hat V_{12}|\Phi\rangle = \frac{5}{4}Z'|E_0|, \tag{41.47}$$

where $|E_0| = e_0^2/(2a_0)$ is the ground state energy of the hydrogen atom (the energy of a single electron in the field of a nucleus of charge one).

Then the energy functional of the helium-like atom is twice the energy of an electron with wave function parameter $Z'$ in the field of a nucleus of charge $Z$, plus the above interaction energy, so

$$E(a) = 2|E_0|\left[Z'^2 - 2Z'\left(Z - \frac{5}{16}\right)\right]. \tag{41.48}$$

Minimizing this over $a$, i.e., finding $\delta E(a)/\delta a = 0$, or equivalently minimizing over $Z'$ by finding $\delta E/\delta Z' = 0$, we obtain

$$Z' = Z - \frac{5}{16}, \tag{41.49}$$

and the energy at this minimum is

$$E_{\Phi,\min} = -2\left(Z - \frac{5}{16}\right)^2|E_0|. \tag{41.50}$$

This is a better approximation than the first-order (time-independent) perturbation theory result, which is

$$\langle\Phi|\hat V_{12}|\Phi\rangle = \frac{5}{4}Z|E_0|, \tag{41.51}$$

leading to the total energy

$$E^{(0+1)} = -2|E_0|Z^2 + \frac{5}{4}Z|E_0| = E_{\Phi,\min} + \frac{25}{128}|E_0|, \tag{41.52}$$

which is larger (and therefore a worse approximation) than $E_{\Phi,\min}$.

The Ritz method will not be exemplified here, but rather will be revisited in the next chapter, where it will be used to work with atomic and molecular orbitals.
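The numbers in (41.48)-(41.52) are easy to evaluate. The following is a minimal sketch for helium itself, $Z = 2$, with all energies in units of $|E_0|$; the function names are illustrative.

```python
def helium_energy(z_eff, z_nuc):
    """E(Z')/|E_0| = 2[Z'^2 - 2 Z'(Z - 5/16)], the functional (41.48)."""
    return 2.0 * (z_eff ** 2 - 2.0 * z_eff * (z_nuc - 5.0 / 16.0))


def variational_minimum(z_nuc):
    """Optimal effective charge (41.49) and the energy at the minimum (41.50)."""
    z_eff = z_nuc - 5.0 / 16.0
    return z_eff, helium_energy(z_eff, z_nuc)


def first_order_energy(z_nuc):
    """First-order perturbation theory result E^(0+1)/|E_0| from (41.52)."""
    return -2.0 * z_nuc ** 2 + 1.25 * z_nuc


Z_HELIUM = 2.0
z_best, e_variational = variational_minimum(Z_HELIUM)
e_perturbative = first_order_energy(Z_HELIUM)
difference = e_perturbative - e_variational
print(z_best, e_variational, e_perturbative, difference)
```

For $Z = 2$ the effective charge is $Z' = 27/16$, the variational energy is about $-5.70|E_0|$, and the first-order result $-5.5|E_0|$ lies above it by exactly $25/128$, independently of $Z$.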
Important Concepts to Remember

• The energy of a state is always larger than or equal to that of the ground state, so we can find the ground state by minimizing the energy $\sum_{\alpha,\beta}a_\alpha^*a_\beta H_{\alpha\beta}/\left(\sum_\alpha|a_\alpha|^2\right)$ over states expanded in an arbitrary basis, $|\psi\rangle = \sum_\alpha a_\alpha|\alpha\rangle$.
• In the Ritz variational method, we minimize over a subset of states in the Hilbert space, not necessarily orthonormal, $E_\psi = \sum_{k,i}c_k^*c_iH_{ki}/\left(\sum_{k,i}c_k^*c_i\Delta_{ki}\right)$, $\Delta_{ki} = \langle\psi_k|\psi_i\rangle$, with $|\psi\rangle = \sum_i c_i|\psi_i\rangle$. If the error in the state is $|\psi\rangle - |0\rangle = \mathcal{O}(\epsilon)$, then the error in energy is $E_\psi - E_0 = \mathcal{O}(\epsilon^2)$.
• In a practical variational method, one chooses wave functions depending on parameters, $\psi(\vec x) = \langle\vec x|\psi\rangle = \psi(\vec x; a_1, \ldots, a_n)$, and minimizes the energy $E_\psi$ over the parameters, again finding a better error in the energy, $E_\psi - E_0 = \mathcal{O}(\epsilon^2)$, than in the state, $\psi - \psi_0 = \mathcal{O}(\epsilon)$.
• The most general variational method (valid for any Hermitian operator $\hat A$, not only for $\hat H$) makes the energy stationary (no $\mathcal{O}(\epsilon)$ terms, only $\mathcal{O}(\epsilon^2)$ terms) over normalized states, i.e., $\delta E_\psi/\delta|\psi\rangle = 0$ if and only if we have an eigenstate, $|\psi\rangle = |n\rangle$, for all states. For the ground state, stationarity implies a minimum.
• Variational methods give a small error in the energy, though we do not then know its value; but we can improve them by choosing wave functions with the correct number of nodes and behavior at infinity or zero, and that are eigenfunctions of operators that commute with $\hat H$.
• For a helium-like atom, the variational method is better than first-order time-independent perturbation theory.
Further Reading See [2] and [1].
Exercises

(1) Consider a two-level system, with basis $|1\rangle$, $|2\rangle$, and in this basis, a Hamiltonian with elements $\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}$. Use the first form of the variational method to find the ground state, and then check by finding the exact eigenstates.
(2) Use the Ritz variational method for the harmonic oscillator, with trial wave functions $\psi_1(x) = e^{-y^2/2}$, $\psi_2(x) = e^{-y^2}$, $\psi_3(x) = e^{-2y^2}$, where $y = x\sqrt{m\omega/\hbar}$, in order to find the best approximation to the ground state energy.
(3) Use the practical variational method for the same harmonic oscillator ground state energy, with trial wave function $\psi_a(x) = e^{-ay^2}$.
(4) Repeat exercise 3 for the third state (second excited state) $|3\rangle$ of the harmonic oscillator, namely, for $\psi_a(x) = y^2e^{-ay^2}$.
(5) Consider the potential $V = kx^2 + \alpha|x|^3$, and a trial wave function $\psi(x; a, b) = |y|^ae^{-by^2}$. Find an estimate of the ground state energy (write down the equations for the minimum).
(6) Fill in the details in the text for minimization with the trial wave function $R(r) = Ae^{-r/a}$ for the hydrogenoid atom.
(7) Fill in the details in the text for minimization with the trial wave function $R(r) = A/(r^2 + b^2)$.
PART IIc
ATOMIC AND NUCLEAR QUANTUM MECHANICS
42 Atoms and Molecules, Orbitals and Chemical Bonds: Quantum Chemistry

In this chapter, we will build up from hydrogenoid atoms to multi-electron atoms, and onward to molecules, by defining atomic orbitals, molecular orbitals, the hybridization of orbitals, and the resulting chemical bonds, defining the field of quantum chemistry.
42.1 Hydrogenoid Atoms (Ions)

We start with a quick review of the hydrogenoid atoms, i.e., ions with a nucleus of charge Z and only one electron. The electronic quantum numbers are $(n, l, m, m_s)$, where
\[ l = 0, 1, \ldots, n-1, \qquad m = -l, -l+1, \ldots, l-1, l, \qquad m_s = \pm 1/2. \tag{42.1} \]
There is a spin–orbit ($\vec l \cdot \vec s$) coupling, meaning that there is such a term in the interaction Hamiltonian, where $\vec l$ is the orbital angular momentum and $\vec s$ is the spin angular momentum. Then, at least if the interaction energy is sufficiently large, we must add the angular momenta to give the total angular momentum, $\vec j = \vec l + \vec s$, which means that in terms of quantum numbers we have $j = l \pm 1/2$. Moreover, the values of $m_j$ are $-j, -j+1, \ldots, j-1, j$. We can then use the quantum numbers $(n, l, j, m_j)$, which are more appropriate if the spin–orbit interaction energy is large enough. The wave functions depend on the position $\vec r$ and the quantum numbers, and can be expanded as products:
\[ \psi_{nlm}(r, \theta, \phi, m_s) = R_{nl}(r)\, Y_{lm}(\theta, \phi)\, \chi(m_s). \tag{42.2} \]
Since $Y_{lm}(\theta, \phi) = P_{lm}(\cos\theta)\, e^{im\phi}$, the probability is
\[ P_{nlmm_s}(\vec r) = |\psi_{nlm}(r, \theta, \phi, m_s)|^2 = |R_{nl}(r) P_{lm}(\cos\theta)|^2\, |\chi(m_s)|^2, \tag{42.3} \]
which means that it actually depends on $r$, $\theta$ and $(n, l, m, m_s)$. This probability profile, more precisely the shape that contains most of it (say, 99% for instance), is defined to be an orbital, for a given $(n, l, m, m_s)$ as above or a given $(n, l, j, m_j)$. For the orbitals, we have the spectroscopic notation, where we denote an orbital by $nl^d$, where d is the multiplicity of the state. More precisely, instead of the value of l, we use the notation where l = 0 is called s, l = 1 is called p, l = 2 is d and l = 3 is f. From l = 4 onwards, we have alphabetic notation, so l = 4, 5, 6, 7 are denoted by g, h, i, k, . . . (the letter j is conventionally skipped). The l = 0 or s orbital has spherical symmetry, so no θ dependence at all; the l = 1 or p orbital has axial (or dipole) symmetry, meaning one axis; the l = 2 or d orbital has quadrupole symmetry (or 2 axes); etc.
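The above-defined hydrogenoid orbitals, i.e., atomic orbitals for hydrogenoid atoms, meaning the single-electron wave functions, are specifically (here $\rho = r/a_0$ as always)
\[
\begin{aligned}
1s:&\quad \phi_{100}(\rho, \theta, \phi) = C e^{-\rho/2} \\
2s:&\quad \phi_{200}(\rho, \theta, \phi) = \frac{C}{\sqrt{32}}\,(2 - \rho)\, e^{-\rho/2} \\
2p:&\quad \phi_{21i}(\rho, \theta, \phi) = \frac{C}{\sqrt{32}}\,\rho\, e^{-\rho/2}
\begin{pmatrix} \cos\theta \\ \sin\theta\cos\phi \\ \sin\theta\sin\phi \end{pmatrix}
\propto
\begin{pmatrix} Y_{1,0} \\ \dfrac{Y_{1,-1} - Y_{1,1}}{\sqrt 2} \\ i\,\dfrac{Y_{1,-1} + Y_{1,1}}{\sqrt 2} \end{pmatrix}
\propto \frac{x_i}{r} \\
3s:&\quad \phi_{300}(\rho, \theta, \phi) = \frac{C}{\sqrt{972}}\,(6 - 6\rho + \rho^2)\, e^{-\rho/2} \\
3p:&\quad \phi_{31i}(\rho, \theta, \phi) = \frac{C}{\sqrt{972}}\,(4\rho - \rho^2)\, e^{-\rho/2}
\begin{pmatrix} \cos\theta \\ \sin\theta\cos\phi \\ \sin\theta\sin\phi \end{pmatrix}
\propto
\begin{pmatrix} Y_{1,0} \\ \dfrac{Y_{1,-1} - Y_{1,1}}{\sqrt 2} \\ i\,\dfrac{Y_{1,-1} + Y_{1,1}}{\sqrt 2} \end{pmatrix}
\propto \frac{x_i}{r}.
\end{aligned}
\tag{42.4}
\]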
We make two observations:
• For the s orbitals, $\phi(\rho = 0) \neq 0$, so there is a direct interaction between the electrons and the nucleus, called a Fermi contact interaction.
• As we saw before, the average value of the radius is
\[ \langle r \rangle = \frac{a_0}{2Z}\,[3n^2 - l(l+1)], \tag{42.5} \]
so it decreases with the charge Z of the nucleus, meaning that the hydrogenoid atom gets compressed for higher Z, rather than growing.
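As a quick numerical illustration of (42.5), the following minimal sketch evaluates the average radius in units of $a_0$ for a few states. The function name `mean_radius` is our own choice, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def mean_radius(n, l, Z=1):
    """Average radius <r> in units of the Bohr radius a0, from Eq. (42.5)."""
    if not (n >= 1 and 0 <= l <= n - 1 and Z >= 1):
        raise ValueError("need n >= 1, 0 <= l <= n - 1 and Z >= 1")
    return Fraction(3 * n * n - l * (l + 1), 2 * Z)


# Compare hydrogen (Z = 1) with a helium-like ion (Z = 2).
for Z in (1, 2):
    for n, l in [(1, 0), (2, 0), (2, 1)]:
        print(f"Z={Z}, n={n}, l={l}: <r> = {float(mean_radius(n, l, Z))} a0")
```

For fixed (n, l) the result scales as 1/Z, which is the compression with increasing nuclear charge noted above.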
42.2 Multi-Electron Atoms and Shells

We next move on to multi-electron atoms. In this case,
• the other electrons screen the nucleus somewhat, by an amount σ; they are described in terms of a hydrogenoid atom, but with an effective charge,
\[ Z_{\rm eff} = Z - \sigma. \tag{42.6} \]
• due to the other electrons, which are not spherically distributed (unless we only have s, i.e., spherically symmetric, orbitals for the other electrons), the potential seen by an electron is no longer central.

We have a shell model, where we fill up atomic orbitals, which are extensions of the hydrogenoid orbitals, with electrons, in order of increasing energy. We call a shell the set of energy levels with a given n, and a sub-shell the set of energy levels of a given n and l; see Fig. 42.1.
Figure 42.1: The first sub-shells in the shell model (1s, 2s, 2p, 3s, 3p, in order of increasing energy), ignoring the spin ($m_s$) degree of freedom.

For a sub-shell, of given n, l and varying $(m, m_s)$ or $(j, m_j)$, if it is full, meaning all the energy levels in the sub-shell are occupied with electrons, then we have a spherical distribution of probability, and the total angular momenta vanish, $\vec L_{\rm tot} = 0 = \vec S_{\rm tot}$.
Hund Rules for the Atomic Orbitals:
(1) The electrons avoid being in the same orbitals if there is more than one orbital of equal l.
(2) Two electrons in equivalent (same value of l), but different, orbitals have parallel (↑↑) spins in the ground state of the atom.

If we have a fully filled shell (all the energy levels of a given n are occupied by electrons), then for all calculations referring to further electrons, it is as if we have a hydrogenoid atom, with an effective charge $Z_{\rm eff} = Z - \sigma$.
42.3 Couplings of Angular Momenta

In an atom, in general we have many electrons, each with an orbital angular momentum $\vec l_i$ and a spin $\vec s_i$. We have various possibilities for coupling of these angular momenta, but we have two ideal situations:

(a) LS coupling (normal, or Russell–Saunders). In this case, we first couple the orbital angular momenta to each other, and the spins to each other, and then we couple the resulting $\vec L$ and $\vec S$:
\[ \vec L = \sum_i \vec l_i, \qquad \vec S = \sum_i \vec s_i, \qquad \vec J = \vec L + \vec S. \tag{42.7} \]
This means that the possible values of J are
\[ |L - S| \le J \le L + S. \tag{42.8} \]
The notation in this coupling is $^{2S+1}L_J$, or rather $n\,^{2S+1}L_J$, where usually 2S + 1 is the multiplicity of states of given L, S. However, if S > L then the multiplicity is 2L + 1, replacing 2S + 1 in the notation, and we say that the multiplicity is not fully developed. An example of the notation is the $^2P_{3/2}$ state, which is read as “doublet P three halves”. This LS coupling is predominant in the light atoms: electrostatic forces between the electrons couple the $\vec l_i$ into $\vec L$ and the $\vec s_i$ into $\vec S$, and then the smaller magnetic spin–orbit coupling couples $\vec L$ and $\vec S$ into $\vec J$.

(b) jj coupling (spin–orbit). In this other idealized case, we first couple the orbital and spin angular momenta of each electron into their total angular momentum, and then the total angular momenta of the electrons to each other:
\[ \vec j_i = \vec l_i + \vec s_i, \qquad \vec J = \sum_i \vec j_i. \tag{42.9} \]
This case is predominantly used for heavy atoms, with large Z. In it, the magnetic spin–orbit coupling happens first, since it is due to the nuclear charge Z, which is large. Only after that do we couple the $\vec j_i$ to each other by the electrostatic forces between the electrons, which are Z-independent.

In general, a particular coupling lies somewhere in between the two idealized cases above. But the particular type of coupling does not modify the total number of levels, nor J, but rather changes the energy gaps between the various levels. This splitting of energy levels due to the coupling of the angular momenta of the electrons is called the “fine structure”. But there is also a “hyperfine structure” due to the nuclear angular momentum $\vec I$, more precisely its interaction with $\vec J$, leading to the total atomic angular momentum,
\[ \vec F = \vec J + \vec I. \tag{42.10} \]
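The range (42.8) of allowed J values in LS coupling is easy to enumerate. The sketch below is only illustrative: the helper names `allowed_j` and `term_symbols` are hypothetical, and the superscript is always written as 2S + 1, which equals the number of J levels only when S ≤ L.

```python
from fractions import Fraction

# Spectroscopic letters for L = 0, 1, 2, ... (the letter J is skipped).
L_LETTERS = "SPDFGHIK"


def allowed_j(L, S):
    """All J with |L - S| <= J <= L + S, in unit steps, as in Eq. (42.8)."""
    L, S = Fraction(L), Fraction(S)
    j = abs(L - S)
    values = []
    while j <= L + S:
        values.append(j)
        j += 1
    return values


def term_symbols(L, S):
    """Plain-text term symbols (2S+1)(letter)(J) for integer L and given S."""
    multiplicity = int(2 * Fraction(S) + 1)
    return [f"{multiplicity}{L_LETTERS[L]}{j}" for j in allowed_j(L, S)]


print(term_symbols(1, Fraction(1, 2)))  # doublet P levels
print(term_symbols(1, 1))               # triplet P levels
```

For L = 1, S = 1/2 this reproduces the doublet containing the $^2P_{3/2}$ state quoted above, together with its $^2P_{1/2}$ partner.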
42.4 Methods of Quantitative Approximation for Energy Levels

There is an approximation method of introducing a self-consistent field, called the Hartree–Fock method, but we will leave it for later, when analyzing multiparticle states under some generality.

Method of Atomic Orbitals

The method we will describe here is an application of the Ritz variational method to atoms and their orbitals, called the atomic orbital method. It is a way to introduce some interaction, usually between electrons. We split the Hamiltonian into free plus interaction parts,
\[ \hat H = \hat H_0 + \hat H_{\rm int}. \tag{42.11} \]
We use orthonormal eigenstates of $\hat H_0$, called $|\Phi_k\rangle$, which we can usually calculate (we choose $\hat H_0$ such that we can). If we consider all of them, for k = 1 to ∞, we obtain a basis of the Hilbert space of the system. But, rather than do that, we restrict to n terms only, for the expansion of a trial state,
\[ |\psi_a\rangle = \sum_{k=1}^{n} c_k |\Phi_k\rangle, \tag{42.12} \]
and use the full Hamiltonian $\hat H$ on it. Then, from the Ritz variational method, we want to obtain the eigenstates and eigenenergies for this system,
\[ \sum_{k=1}^{n} c_k (\hat H - E)|\Phi_k\rangle = 0. \tag{42.13} \]
Multiplying with $\langle\Phi_i|$ from the left, we have
\[ \sum_{k=1}^{n} c_k (H_{ik} - E\delta_{ik}) = 0. \tag{42.14} \]
Then we can find the secular equation for the eigenenergies,
\[ \det(H_{ik} - E\delta_{ik}) = 0. \tag{42.15} \]
From it, we obtain n approximate values $E_1, \ldots, E_n$ for the eigenvalues of $\hat H$.
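To make the secular equation (42.15) concrete, here is a minimal sketch for the smallest nontrivial truncation, n = 2, with a real symmetric matrix $H_{ik}$. The closed-form roots and the function names are our own illustration, not part of the text; the matrix elements in the example are arbitrary numbers.

```python
import math


def ritz_two_state(h11, h22, h12):
    """Roots of det(H_ik - E delta_ik) = 0 for a real symmetric 2x2 matrix."""
    mean = 0.5 * (h11 + h22)
    half_gap = 0.5 * (h11 - h22)
    root = math.hypot(half_gap, h12)
    return mean - root, mean + root


def ground_coefficients(h11, h22, h12):
    """Normalized (c1, c2) solving Eq. (42.14) for the lower root."""
    e_low, _ = ritz_two_state(h11, h22, h12)
    if h12 == 0.0:
        # No mixing: the lower diagonal element is already the ground state.
        return (1.0, 0.0) if h11 <= h22 else (0.0, 1.0)
    # The i = 1 row of (42.14) reads c1 (H11 - E) + c2 H12 = 0.
    c1, c2 = h12, e_low - h11
    norm = math.hypot(c1, c2)
    return c1 / norm, c2 / norm


e_low, e_high = ritz_two_state(1.0, 1.0, 1.0)
print("approximate eigenvalues:", e_low, e_high)
print("ground-state coefficients:", ground_coefficients(1.0, 1.0, 1.0))
```

For larger n one would hand the matrix $H_{ik}$ to a numerical eigenvalue routine instead; the structure of the calculation is the same.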
42.5 Molecules and Chemical Bonds

We next move on to molecules, which are several atoms bound together. The chemical bonds between the atoms are mostly between two atoms within a molecule, though there are exceptions such as the aromatic bonds, for instance in C$_6$H$_6$, explained briefly at the end of the chapter. Chemical bonds usually fall into one of three categories:
• Electrovalent, or ionic, bonds, leading to a heteropolar molecule (meaning it has two different poles). In it, the electrons move between the two atoms, so one atom loses an electron (or more, or less; we are talking about probabilities in quantum mechanics, so fractional parts make sense) and the other gains it. For instance, in the ionic bond between an alkali element (an element that has a single electron outside filled shells) and a halogen (an element for which a single electron is needed to have only filled shells), the alkali element loses one electron, and the halogen gains it. The standard example of this is the ionic salt molecule, Na$^+$Cl$^-$; see Fig. 42.2a.
• Metallic bonds, in which the atoms form a crystalline structure within which some electrons are delocalized. This means that the very many equal energy levels of the individual atoms split infinitesimally, creating almost continuous “bands” of electrons instead of discrete levels; see Fig. 42.2b.
• Covalent bonds, between neutral atoms, leading to a homopolar molecule (meaning with two like poles). The standard example is the H$_2$ molecule; see Fig. 42.2c.

A molecule is a stable structure, meaning it is a minimum of the energy (be it classical or quantum), associated with a given spatial structure. That in turn means that the nuclei are approximately fixed, and only the electrons move (or, rather, move fast compared with the nuclei).
Figure 42.2: (a) Ionic bond: Na$^+$Cl$^-$. (b) Metallic bond: energy bands. (c) Covalent bond: H$_2$.
42.6 Adiabatic Approximation and Hierarchy of Scales

Since electrons are much lighter than nuclei,
\[ \frac{m_e}{M_N} \ll 1, \tag{42.16} \]
as stated above we can consider that the nuclei are approximately fixed to their stable structure, and the electrons move fast within it, i.e., they “follow” the nuclei adiabatically; so, at all times we are able to treat the nuclei positions as mere parameters as far as the electrons are concerned. Consider d as the characteristic distance between two atoms, or rather two nuclei, in the molecule. Then the electrons are delocalized over a distance $\Delta x \sim d$, meaning the electron momentum is
\[ p_{e^-} \sim \frac{\hbar}{d}. \tag{42.17} \]
Then the energy of an electron is about twice the kinetic energy, giving
\[ E_e \sim 2T_e \sim \frac{p^2}{m} \sim \frac{\hbar^2}{m d^2}. \tag{42.18} \]
On the other hand, the rotational energy associated with the molecule, or rather with the nuclei that compose it, is
\[ E_{\rm rot} \sim \frac{L^2}{I} \sim \frac{\hbar^2}{M d^2}, \tag{42.19} \]
where I is the moment of inertia of the nuclei. Thus
\[ E_e \gg E_{\rm rot}. \tag{42.20} \]
Moreover, there is also a vibrational energy, associated with the vibrations of the nuclei around the stable structure. This means that the potential energy, approximately that of a harmonic oscillator,
\[ E_{\rm pot} \sim \frac{M\omega^2 d^2}{2}, \tag{42.21} \]
should be balanced by the kinetic energy of an electron,
\[ T_e \sim \frac{\hbar^2}{2 m d^2}, \tag{42.22} \]
so that
\[ \omega \sim \frac{\hbar}{d^2\sqrt{Mm}} \;\Rightarrow\; E_{\rm vibr} \sim \hbar\omega \sim \frac{\hbar^2}{d^2\sqrt{Mm}}. \tag{42.23} \]
Then finally the hierarchy of energy scales for electronic, vibrational, and rotational modes is
\[ E_{\rm rot} \ll E_{\rm vibr} \ll E_e. \tag{42.24} \]
While Ee is an atomic quantity, the rotational and vibrational modes are intrinsically molecular, have no atomic analogue, and are small. Concretely, the rotational energy Erot ∼ k B T ∼ 25 meV, the vibrational energy Evibr ∼ 0.5 eV, and the electronic energy is of a few eV.
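The order-of-magnitude estimates (42.18), (42.19) and (42.23) can be evaluated directly. The sketch below is only a rough illustration: the choices d = 1 Å and M equal to the proton mass are our own assumptions, and all O(1) factors are dropped exactly as in the text.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_ELECTRON = 9.1093837e-31   # electron mass, kg
M_PROTON = 1.67262192e-27    # proton mass, kg
EV = 1.602176634e-19         # one electronvolt in joules


def molecular_scales(d, M, m=M_ELECTRON):
    """Return (E_e, E_vibr, E_rot) in eV from Eqs. (42.18), (42.23), (42.19)."""
    e_el = HBAR**2 / (m * d**2)
    e_vib = HBAR**2 / (d**2 * math.sqrt(M * m))
    e_rot = HBAR**2 / (M * d**2)
    return e_el / EV, e_vib / EV, e_rot / EV


# Assumed inputs: internuclear distance of 1 angstrom, nuclear mass of one proton.
e_el, e_vib, e_rot = molecular_scales(d=1.0e-10, M=M_PROTON)
print(f"E_e ~ {e_el:.3g} eV, E_vibr ~ {e_vib:.3g} eV, E_rot ~ {e_rot:.3g} eV")
print("E_vibr / E_e =", e_vib / e_el, " sqrt(m/M) =", math.sqrt(M_ELECTRON / M_PROTON))
```

With these inputs the electronic scale comes out at a few eV, with the vibrational and rotational scales each suppressed by a further factor $\sqrt{m/M}$, so that $E_{\rm vibr}$ is the geometric mean of $E_e$ and $E_{\rm rot}$.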
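The Schrödinger equation for a molecule is written as
\[ \Big[ -\sum_\alpha \frac{\hbar^2}{2M_\alpha}\Delta_\alpha - \sum_i \frac{\hbar^2}{2m_i}\Delta_i + \sum_{\alpha,i} V_{\alpha i} + \sum_{\alpha,\beta} V_{\alpha\beta} + \sum_{i,j} V_{ij} \Big] \psi_{n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}) = E_n\, \psi_{n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}). \tag{42.25} \]
Here $m_i$ are the electron masses and $\vec r_i$ their positions, $M_\alpha$ are the masses of the nuclei and $\vec R_\alpha$ their positions, n is a generic index for energy states whereas {s} is a degeneracy index. Moreover, $T_N$ is the nuclear kinetic energy and $T_e$ the electronic kinetic energy, where
\[ T_N = -\sum_\alpha \frac{\hbar^2}{2M_\alpha}\Delta_\alpha, \qquad T_e = -\sum_i \frac{\hbar^2}{2m_i}\Delta_i, \tag{42.26} \]
and V is the total potential energy, composed of a nuclear–electronic part $V_{Ne}$, a nucleus–nucleus part $V_{NN}$, and an electron–electron part $V_{ee}$, where
\[ V_{Ne} = \sum_{\alpha,i} V_{\alpha i}, \qquad V_{NN} = \sum_{\alpha,\beta} V_{\alpha\beta}, \qquad V_{ee} = \sum_{i,j} V_{ij}, \qquad V = V_{Ne} + V_{NN} + V_{ee}. \tag{42.27} \]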
42.7 Details of the Adiabatic Approximation

We can consider a perturbation in which the unperturbed part $\hat H_0$ is the purely electronic part,
\[ \hat H_0 = \hat T_e + \hat V, \tag{42.28} \]
neglecting $T_N$, so considering fixed nuclei. We can perhaps also neglect $V_{NN}$, which would mean neglecting the vibrational and rotational modes, thus assuming fixed nuclei at the positions in the stable structure. Then the $\vec R_\alpha$ are just parameters, as in the practical variational method, and we can minimize over them. Further, writing
\[ \hat H = \hat H_0 + \hat T_N, \tag{42.29} \]
we can write down a solution that is not fully separated as an eigenfunction for the Schrödinger equation for $\hat H$,
\[ \psi_{n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}) = \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\})\, \psi_N(\{\vec R_\alpha\}), \tag{42.30} \]
where the $\psi_e$ are eigenfunctions of $\hat H_0$,
\[ \hat H_0\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}) = E_n^{(0)}\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}), \tag{42.31} \]
that depend on the positions $\vec R_\alpha$ of the nuclei only as parameters.

On the other hand, $\hat T_N$ acts on both $\psi_e$ and $\psi_N$, so we can write
\[ \big[\hat T_N (+\hat V_{NN})\big]\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\})\, \psi_N(\{\vec R_\alpha\}) = E_{N,n}\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\})\, \psi_N(\{\vec R_\alpha\}), \tag{42.32} \]
which is multiplied by the electron wave function $\psi_e^*$ and integrated, leading to the kinetic energy plus an extra part,
\[ \int \prod_i d^3 r_i\; \psi^*_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\})\, \big[\hat T_N (+\hat V_{NN})\big]\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\})\, \psi_N(\{\vec R_\alpha\}) = (T_N + \Delta E_N)\, \psi_N(\{\vec R_\alpha\}). \tag{42.33} \]
Then the free (electronic) Schrödinger equation is written, more explicitly, as
\[ \big[\hat T_e + \hat V_{Ne} + \hat V_{ee} (+\hat V_{NN})\big]\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}) = E_{e,n}\, \psi_{e,n\{s\}}(\{\vec r_i\}, \{\vec R_\alpha\}). \tag{42.34} \]
The total energies split into electronic and nuclear parts,
\[ E_n = E_{N,n} + E_{e,n} \equiv E_{N,n} + E_n^{(0)}. \tag{42.35} \]
The quantized vibrational energy is
\[ E_{\rm vibr} = \Big(v + \frac{1}{2}\Big)\hbar\omega_0, \tag{42.36} \]
and the quantized rotational energy is
\[ E_{\rm rot} = \frac{\vec J^2}{2I} = \frac{\hbar^2 J(J+1)}{2I}. \tag{42.37} \]
In order to find the stable structure, i.e., the molecular shape, we can apply the practical variational method for $\psi_{n\{s\}} = \psi_{e,n\{s\}}\,\psi_N$ and minimize the energy over $\{\vec R_\alpha\}$ as parameters.
42.8 Method of Molecular Orbitals

This method is an application of the Ritz variational method, similar to the method of atomic orbitals. We use the interaction between the electrons as a perturbation, $\hat H_1 = \hat H_{ee}$. We first find an electronic wave function $\phi_{a_i}(\vec r_i)$ that is decoupled, in which one electron is in the field of all the nuclei, so the Schrödinger equation for it is
\[ \Big[ -\frac{\hbar^2}{2m}\Delta_i - \sum_{\alpha=1}^{N} \frac{Z_\alpha e^2}{4\pi\epsilon_0\, r_{\alpha i}} \Big] \phi_{a_i}(\vec r_i) = E^{(i)}_{a_i}\, \phi_{a_i}(\vec r_i), \tag{42.38} \]
where $i = 1, \ldots, n_e$ is an index for the electrons. Next, we write a molecular orbital as a product of individual wave functions for the electrons $\phi_{a_i}(\vec r_i)$,
\[ \phi_r(\vec r_1, \ldots, \vec r_{n_e}) \equiv \phi_{a_1}(\vec r_1) \cdots \phi_{a_{n_e}}(\vec r_{n_e}), \tag{42.39} \]
where the total index r is made up of the $\{a_i\}$ indices, and we use the practical variational method and minimize over the parameters $\{\vec R_\alpha\}$.

Finally, we write the wave functions of the molecule as linear combinations of these molecular orbitals,
\[ \psi_e(\{\vec r_i\}; \{\vec R_\alpha\}) = \sum_r c_r\, \phi_r(\{\vec r_i\}; \{\vec R_\alpha\}), \tag{42.40} \]
and use the Ritz variational method: we restrict to a finite subset of $r = 1, \ldots, k$. Then we obtain the eigenstate–eigenvalue equation,
\[ \sum_{r=1}^{k} c_r (H_{sr} - E_e \delta_{rs}) = 0, \tag{42.41} \]
leading to the secular equation for the electronic energy,
\[ \det(H_{sr} - E_e \delta_{sr}) = 0. \tag{42.42} \]
42.9 The LCAO Method

A variant of the above method is the linear combination of atomic orbitals (LCAO) method. In order to find the decoupled electronic wave function $\phi_{a_i}(\vec r_i)$, we use a type of Ritz method. We consider atomic orbitals for different nuclei inside the molecule, but the same electron, meaning the same position $\vec r_i$, $\chi_r(\vec r_i; \vec R_r)$, where $\vec R_r$ is a parameter representing the position of the rth nucleus. Then we make linear combinations of the orbitals for each nucleus, one combination per electron,
\[ \phi^i_{a_i}(\vec r_i) = \sum_{r=1}^{N} c_{ir}\, \chi_r(\vec r_i; \vec R_r), \tag{42.43} \]
where N is the number of nuclei in the molecule, i is an index for the electron, and $a_i$ is the particular molecular orbital.

As an example, we can consider the NH molecular orbital, where the N nucleus has position $\vec R_1$ and its atomic orbital is P (L = 1), and the H nucleus has position $\vec R_2$ and its atomic orbital is S (L = 0).

We minimize over $c_{ir}$ (in the Ritz-like variational method) to find the molecular orbitals $\phi^i_{a_i}$. The total energy is
\[ E = \sum_{i=1}^{n_e} E_i, \tag{42.44} \]
where the individual electron energies are
\[ E_i = \frac{\sum_{r,s} c^*_{ir} H_{rs} c_{is}}{\sum_{r,s} c^*_{ir} \Delta_{rs} c_{is}}, \tag{42.45} \]
and, as before, we use the notation
\[ H_{rs} = \langle\chi_r|\hat H|\chi_s\rangle, \qquad \Delta_{rs} = \langle\chi_r|\chi_s\rangle. \tag{42.46} \]
The eigenenergy–eigenstate equation is
\[ \sum_{r=1}^{N} c_{ir} (H_{rs} - E_i \Delta_{rs}) = 0, \tag{42.47} \]
leading to a secular equation for the energies $E_i$,
\[ \det(H_{rs} - E_i \Delta_{rs}) = 0. \tag{42.48} \]
42.10 Application: The LCAO Method for the Diatomic Molecule

Consider two nuclei only, with distance $\vec R$ between them, and electrons with positions $\vec r_1$ with respect to nucleus 1, and $\vec r_2$ with respect to nucleus 2, and generic position (with respect to some arbitrary origin) $\vec r$. Then the molecular orbital is a linear combination of the atomic orbitals (MO = LCAO), meaning
\[ \phi(\vec r) = A\phi_1(\vec r_1) + B\phi_2(\vec r_2). \tag{42.49} \]
The energy in this state is
\[ E_\phi = \frac{A^2 H_{11} + B^2 H_{22} + 2AB H_{12}}{A^2 + B^2 + 2ABS}, \tag{42.50} \]
where
\[ S = \langle\phi_1|\phi_2\rangle = \int \phi_1^* \phi_2. \tag{42.51} \]
We minimize it with respect to A and B,
\[ \frac{\partial E_\phi}{\partial A} = \frac{\partial E_\phi}{\partial B} = 0. \tag{42.52} \]
If the two atomic levels are equal, $H_{11} = H_{22}$, then at the minimum we find $A = \pm B$ and
\[ E_{1,2} = \frac{H_{11} \pm H_{12}}{1 \pm S}, \tag{42.53} \]
meaning the level splits in two. If $H_{11} \neq H_{22}$, but S is negligible ($S \ll 1$), then the split energy levels are
\[ E_{1,2} = \frac{H_{11} + H_{22}}{2} \pm \frac{H_{11} - H_{22}}{2} \sqrt{1 + \Big(\frac{2H_{12}}{H_{11} - H_{22}}\Big)^2}. \tag{42.54} \]
We note that $E_1 < H_{11}, H_{22}$ and $E_2 > H_{11}, H_{22}$, and now $|B| \neq |A|$.
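The two limiting results (42.53) and (42.54) are special cases of the 2×2 secular equation $\det(H_{rs} - E\Delta_{rs}) = 0$ with overlap S, which is a quadratic in E. The following sketch solves it for real matrix elements; the function names and the numerical values of $H_{11}$, $H_{22}$, $H_{12}$, S are arbitrary illustrations, not data from the text.

```python
import math


def lcao_levels(h11, h22, h12, s):
    """Roots of (H11 - E)(H22 - E) - (H12 - E S)^2 = 0, lowest first."""
    if not abs(s) < 1.0:
        raise ValueError("the overlap must satisfy |S| < 1")
    a = 1.0 - s * s
    b = -(h11 + h22 - 2.0 * s * h12)
    c = h11 * h22 - h12 * h12
    disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)


def energy(A, B, h11, h22, h12, s):
    """Energy functional of Eq. (42.50) for real coefficients A, B."""
    num = A * A * h11 + B * B * h22 + 2.0 * A * B * h12
    den = A * A + B * B + 2.0 * A * B * s
    return num / den


# Equal levels, H11 = H22: compare with (H11 +- H12) / (1 +- S), Eq. (42.53).
h11, h12, s = -1.0, -0.5, 0.2
print(lcao_levels(h11, h11, h12, s))
print((h11 + h12) / (1.0 + s), (h11 - h12) / (1.0 - s))
print(energy(1.0, 1.0, h11, h11, h12, s), energy(1.0, -1.0, h11, h11, h12, s))
```

Setting S = 0 with unequal diagonal elements reproduces (42.54) in the same way.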
42.11 Chemical Bonds

Chemical bonds fall into two cases:
• Purely covalent bond, as in a homopolar molecule. It corresponds to $|A| = |B|$ ($A = \pm B$), and so to the first case above, with the MO being the symmetric LCAO,
\[ \phi = a(\phi_1 \pm \phi_2). \tag{42.55} \]
In it, we see that the electron is evenly distributed between the two nuclei (or, rather, atoms).
• An ionic bond, as in a heteropolar molecule. It corresponds to $|A| \neq |B|$, which is the second case above, with
\[ \phi = a(\phi_1 + \lambda\phi_2), \tag{42.56} \]
but where $\lambda \neq \pm 1$, so the electron is found mostly near one or the other nucleus.
Figure 42.3: (a) s orbital. (b) p orbital. (c) σ bond in C–C between two 2p orbitals or sp$^x$ orbitals. (d) π bond in C–C between two 2p orbitals or sp$^x$ orbitals. (e) Hybridization: sp$^x$ orbitals (sp, sp$^2$, sp$^3$). (f) C$_2$H$_6$, C$_2$H$_4$ and C$_2$H$_2$ structure, orbitals, and bonds. (g) Aromatic bond in C$_6$H$_6$, the naive structure and the correct structure. (h) Aromatic bond in C$_6$H$_6$, the p orbitals that make up the aromatic bond.

More precisely, describing the bonds between two atoms in a molecule (i.e., the chemical bonds), we first note that there is a preferred axis uniting the two nuclei. This means that $\vec L_{\rm tot}$ is not conserved; only its projection onto the axis, $L_{z,{\rm tot}}$, is conserved. The values are, as usual, $L_z = 0, 1, 2, \ldots$, denoted by σ, π, δ, . . . instead of s, p, d, f, . . . These are molecular orbitals, which only have one axis of symmetry, not the full rotational invariance. See Fig. 42.3a, b, c, d for the spatial form of the s, p atomic orbitals and σ, π bonds, respectively (exemplified for carbon).
Hybridization

A very important example is the carbon atom, which has four free electrons (outside full shells), two electrons in S states, with opposite spins ↑↓, and two electrons in P states. Specifically, the electron structure is 1s(↑↓), 2s(↑↓), 2p(↑), 2p(↑), 2p( ). But in molecules that include a carbon atom, the split in energy between the S and the P states is much smaller than the binding energy of the molecule. That means that the relevant atomic orbitals of carbon are “hybridized”, or effective, orbitals that are combinations of the S and P orbitals.
The particular hybridization depends on the (symmetry of the) molecule under study. In Fig. 42.3e we see the spatial form of the hybridized sp$^x$ orbitals (with x = 1, 2, 3).

We now consider the CH$_4$ (methane) molecule. In it, each hydrogen has a bond with the carbon, all these bonds being equivalent. This means that we need to hybridize all four orbitals of carbon, obtaining sp$^3$ orbitals instead of the s and p orbitals. The sp$^3$ hybridization defines a linear-combination wave function,
\[ \phi_{sp^3} = c_1\phi_s + c_2\phi_{p_x} + c_3\phi_{p_y} + c_4\phi_{p_z} = \frac{1}{\sqrt{1 + c^2}}\,\big[\phi_s + c\,(u_1\phi_{p_x} + u_2\phi_{p_y} + u_3\phi_{p_z})\big]. \tag{42.57} \]
We have to define four sp$^3$ orbitals replacing the s and three p orbitals, which means that there are four different unit vectors $\vec u$: $\vec u^{(1)}, \vec u^{(2)}, \vec u^{(3)}, \vec u^{(4)}$. The orbitals have to be equivalent, so we need to have the same c for all of them. Moreover, the states need to be orthonormal, so
\[ \langle\phi^{(1)}_{sp^3}|\phi^{(2)}_{sp^3}\rangle = \frac{1}{1 + c^2}\,\big[1 + c^2\, \vec u^{(1)} \cdot \vec u^{(2)}\big] = 0, \quad \text{etc.} \tag{42.58} \]
But $\vec u^{(i)} \cdot \vec u^{(j)} = \cos\theta_{ij}$, thus the equation becomes
\[ c^2 \cos\theta_{ij} = -1, \tag{42.59} \]
and since the angles between any two of the four vectors must be the same, the vectors end on the vertices of a regular tetrahedron, meaning
\[ \cos\theta_{ij} = -\frac{1}{3} \;\Rightarrow\; c^2 = 3. \tag{42.60} \]
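The orthonormality argument leading to $c^2 = 3$ can be checked numerically. In the sketch below each orbital is represented by its coefficients in the basis $(\phi_s, \phi_{p_x}, \phi_{p_y}, \phi_{p_z})$, assumed orthonormal; the particular tetrahedral directions chosen for the $\vec u^{(i)}$ are one convenient orientation, not something fixed by the text.

```python
import math

C = math.sqrt(3.0)            # from c^2 = 3, Eq. (42.60)
S3 = 1.0 / math.sqrt(3.0)

# Four unit vectors pointing to the vertices of a regular tetrahedron
# (an assumed orientation; any rotated set works equally well).
U = [(S3, S3, S3), (S3, -S3, -S3), (-S3, S3, -S3), (-S3, -S3, S3)]


def sp3_coefficients(u, c=C):
    """Coefficients of Eq. (42.57) in the basis (s, p_x, p_y, p_z)."""
    norm = math.sqrt(1.0 + c * c)
    return (1.0 / norm, c * u[0] / norm, c * u[1] / norm, c * u[2] / norm)


def overlap(a, b):
    """Inner product of two orbitals expanded in an orthonormal basis."""
    return sum(x * y for x, y in zip(a, b))


orbitals = [sp3_coefficients(u) for u in U]
gram = [[overlap(a, b) for b in orbitals] for a in orbitals]

for row in gram:
    print(["%.3f" % abs(x) for x in row])
print("cos(theta_12) =", overlap(U[0], U[1]))
```

The Gram matrix comes out as the identity, and the angle between any two directions satisfies $\cos\theta_{ij} = -1/3$, i.e., the tetrahedral angle of about 109.5°.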
Each sp$^3$ orbital will join with a hydrogen atom and orbital, forming the CH$_4$ molecule.

Next we consider the C$_2$H$_6$ (ethane) molecule, H$_3$C–CH$_3$. In it, each of the two carbons has three bonds, each with a hydrogen atom, and there is one bond between the two carbons. In it, the same sp$^3$ hybridization works, and three such orbitals are used to bond with the hydrogen atoms. The last sp$^3$ orbital of each of the two carbons joins to form a σ ($L_z = 0$) bond. In it, the spins of the two electrons (from the two carbons) are antiparallel, ↑↓, giving a bond (i.e., negative energy). See Fig. 42.3f for the structure, orbitals, and bonds of C$_2$H$_6$.

Next we consider the C$_2$H$_4$ (ethylene) molecule, H$_2$C=CH$_2$. In it, for each carbon atom, the s orbital and two of the p orbitals hybridize into three sp$^2$ orbitals, leaving a single p orbital unhybridized. The two bonds between the carbons are different. There is a σ bond ($L_z = 0$), with the two sp$^2$ orbitals parallel to the axis between the atoms and with the electrons in it antiparallel, as before. The two unhybridized p orbitals, each with wave function
\[ \psi_0 = \vec u^0 \cdot \vec\phi = u^0_1\phi_{p_x} + u^0_2\phi_{p_y} + u^0_3\phi_{p_z}, \tag{42.61} \]
join into a π ($L_z = 1$) orbital. The p orbitals are transverse to the axis between the carbons, and the wave function is normal to the sp$^2$ hybridized orbitals.

Next we consider the C$_2$H$_2$ (acetylene) molecule, HC≡CH. In it, for each carbon atom, the s orbital and one p orbital hybridize into two sp orbitals, while the remaining two p’s remain untouched. One sp orbital joins with a hydrogen, while the other joins with the orbital of the other carbon and forms a σ bond. The two remaining untouched p orbitals of each carbon join with those from the other atom, and form two π bonds. In all, we have one σ and two π bonds between the carbons. See Fig. 42.3f for the structure, orbitals, and bonds of C$_2$H$_4$ and C$_2$H$_2$.
Finally, we consider the C6 H6 (benzene) molecule, which corresponds to a complex bond. The carbons sit at the vertices of a regular hexagon, and bond with one hydrogen each, and have a σ bond with each of the two immediate neighbors, leaving one electron unbonded. Therefore, one s orbital and two p orbitals hybridize into three sp2 orbitals, one joining with the hydrogen and the other two joining with the immediate neighbors to make two σ bonds. The remaining electrons from each carbon pool together to make a circular motion around the hexagon that is Bohr quantized (like in a hydrogenoid atom) on the circle, meaning we have delocalized bound electrons. We represent these electrons by a circle inside the hexagon. In Fig. 42.3g we represent the naive structure and the actual structure (with the aromatic bond) of C6 H6 , and in Fig. 42.3h we represent the creation of the aromatic bond out of p orbitals.
Important Concepts to Remember
• In a hydrogenoid atom, the states are described by $(n, l, m, m_s)$ or (because of the spin–orbit interaction) as $(n, l, j, m_j)$, and the spectroscopic notation $nl^d$.
• In shell models, we fill atomic orbitals, extensions of the hydrogenoid orbitals; the Hund rules state that electrons avoid being in the same orbitals if other orbitals of same l are available, and that electrons in equivalent, but different orbitals, have parallel spins in the ground state.
• In LS coupling (normal, or Russell–Saunders) first the $\vec l_i$ couple into $\vec L$ and the $\vec s_i$ into $\vec S$, then $\vec L$ and $\vec S$ into $\vec J$, while in jj, or spin–orbit, coupling, first $\vec l_i$ and $\vec s_i$ couple into $\vec j_i$, then the $\vec j_i$ into $\vec J$; most atoms are in between these two extremes. Note that the coupling doesn’t change the number of energy levels, only the energy gaps.
• In the method of atomic orbitals, we use a finite number of unperturbed eigenfunctions for the total Hamiltonian, and minimize.
• Chemical bond types are electrovalent or ionic, metallic (electrons delocalized within the crystalline structure), and covalent, between neutral atoms.
• Energy scales in molecules are: $E_{\rm rot} \ll E_{\rm vibr} \ll E_e$, with usually $E_{\rm rot} \sim 25$ meV, $E_{\rm vibr} \sim 0.5$ eV and $E_e \sim$ a few eV.
• A molecular orbital is a product of individual electronic orbitals, $\phi_r(\vec r_1, \ldots, \vec r_{n_e}) = \phi_{a_1}(\vec r_1) \cdots \phi_{a_{n_e}}(\vec r_{n_e})$, and in the method of molecular orbitals one uses linear combinations of these, and the Ritz variational method for them.
• In the LCAO method, one defines a molecular orbital as linear combinations of atomic orbitals (for a single electron with respect to each of the nuclei) and uses a Ritz-like variational method for the linear coefficients.
• In molecules, atomic orbitals hybridize when the difference in energy between them is smaller than the binding energy of the molecule: e.g., sp$^3$ in CH$_4$ and C$_2$H$_6$, sp$^2$ in C$_2$H$_4$, sp in C$_2$H$_2$.
• Hybridized orbitals can join to form bonds: σ ($L_z = 0$), for two such orbitals (sp$^3$, sp$^2$, or sp) that are parallel to the common axis and of opposite spin, and 2p orbitals can form π ($L_z = 1$) bonds, etc. One particular example is the aromatic bond of C$_6$H$_6$ with the electron delocalized and quantized on a circle.
Further Reading See [2].
Exercises
(1) Write down explicitly the 4s and 4p atomic orbital wave functions.
(2) Use the shell model and the Hund rules to show how the orbitals are populated with electrons in the elements C, O, and Mg.
(3) Consider the element O (oxygen). Write down the LS and jj couplings for the electrons, and show explicitly that the splitting of energy levels is the same for each type of coupling.
(4) Consider the element O. Write down explicitly the method of atomic orbitals for it (for electron–electron interactions), using the basis of only the filled orbitals.
(5) Write down explicitly the method of molecular orbitals for the O$_2$ molecule.
(6) Write down explicitly the LCAO method for the O$_2$ molecule, applied to each of the atomic orbitals in O.
(7) Write down the Bohr quantization for the common electron in benzene, and find the energy levels as a function of the “radius” of the benzene molecule (the distance between the center and a C atom).
43
Nuclear Liquid Droplet and Shell Models
In this chapter, we study nuclear models and approximations, the counterpart to the analysis of atoms in the previous chapter.
43.1 Nuclear Data and Droplet Model

In the nucleus we have Z protons (where Z is the electric charge of the nucleus, in units of e), and N neutrons. Both of them are nucleons; the total number of nucleons is known as the mass number A, so A = N + Z. The proton and neutron masses are approximately the same, $m_p \simeq m_n$, together called $m_N$, and the mass number A is then approximately the total nucleus mass divided by $m_N$. More precisely,
\[ M_{\rm nucleus} \simeq A\, m_p\, (1 - 8/1000), \tag{43.1} \]
which means that the binding energy per nucleon is very small (about 0.8% of the nucleon rest energy, i.e., roughly 8 MeV) and is approximately constant as A increases. Moreover, for most nuclei, we have the radius R versus A law:
\[ R = r_0 A^{1/3}, \tag{43.2} \]
where $r_0 \simeq 1.3 \times 10^{-15}$ m = 1.3 fm (1 fm = $10^{-15}$ m is a femtometer, or one fermi). This law and the previous relation ($M \propto A$) imply that both the energy and the volume of the nucleus are proportional to A, which is the property of a liquid. We say then that we have “nuclear matter”, forming something like a droplet of liquid, hence we have a “liquid droplet model”.

In the ground state, we have a static, spherically symmetric fluid droplet. In an excited state, we can have waves propagating in the fluid, deformations of the droplet that make it nonspherical, etc. Inside this fluid, the individual nucleons are like particles of matter, and their motion is either thermal or Brownian.

However, the law $E \propto A$ is only approximate (we mentioned it is valid for most nuclei; in fact, there are fluctuations around it for some cases). In fact, there is some structure besides linearity: there are nuclei that have better stability, meaning higher binding energy. The nuclei for which this happens have either N or Z as one of the “magic numbers”,
\[ 2,\ 8,\ 20,\ 28,\ 50,\ 82,\ 126. \tag{43.3} \]
For instance, tin (chemically Sn, for Stannum in Latin) has Z = 50, and it has 10 stable isotopes, the most of any element. Also, if both N and Z are magic numbers, a situation called “double magic”, we have nuclei that are even more stable. One example of this is helium-4, He$^4$, with N = Z = 2; others are oxygen-16, O$^{16}$, with N = Z = 8, and lead-208, Pb$^{208}$, with Z = 82 and N = 126.
The situation above, with magic numbers for Z or/and N, is similar to the stability of atomic inert gases, which have full atomic shells, in the shell model of the atom. In fact, more quantitatively, for electrons in a hydrogenoid atom, with a Coulomb potential, the degeneracy of a shell of given n, for electrons characterized by $(n, l, m, m_s)$, is ($m_s$ takes two values and m takes 2l + 1 values, while l goes from 0 to n − 1)
\[ d_n = \sum_{l=0}^{n-1} 2(2l + 1) = 2n^2, \tag{43.4} \]
giving, for n = 1, 2, 3, 4, 5, 6,
\[ 2,\ 8,\ 18,\ 32,\ 50,\ 72. \tag{43.5} \]
However, the magic number is supposed to be the total number of electrons in all the full shells, so
\[ \sum_{m=1}^{n} d_m = \sum_{m=1}^{n} 2m^2 = \frac{n(n+1)(2n+1)}{3}, \tag{43.6} \]
giving, for n = 1, 2, 3, 4, 5, 6,
\[ 2,\ 10,\ 28,\ 60,\ 110,\ 182, \ldots \tag{43.7} \]
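The counting in (43.4)–(43.7) can be reproduced in a few lines, which also makes explicit how far the Coulomb shell closures are from the nuclear magic numbers (43.3). The function names below are our own.

```python
def shell_degeneracy(n):
    """d_n of Eq. (43.4): sum over l of 2 (2l + 1) states."""
    return sum(2 * (2 * l + 1) for l in range(n))


def closed_shell_total(n):
    """Total number of states in all shells up to n, Eq. (43.6)."""
    return sum(shell_degeneracy(m) for m in range(1, n + 1))


NUCLEAR_MAGIC = [2, 8, 20, 28, 50, 82, 126]   # Eq. (43.3)

degeneracies = [shell_degeneracy(n) for n in range(1, 7)]
totals = [closed_shell_total(n) for n in range(1, 7)]

print("d_n          :", degeneracies)
print("cumulative   :", totals)
print("also magic   :", sorted(set(totals) & set(NUCLEAR_MAGIC)))
```

Only two of the first six Coulomb closures coincide with nuclear magic numbers, which is why a different central potential is needed for the nucleus.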
43.2 Shell Models 1: Single-Particle Shell Models

From the previous analysis it follows that one possibility for a nuclear model is to consider a shell model similar to that for an atom, but with another central potential V(r) rather than the Coulomb potential, and where the shell corresponds to a given energy (principal quantum number n). The wave function for a central potential is, as we saw in Chapter 19,
\[ \psi_{nlm}(\vec r) = \frac{\chi_{nl}(r)}{r}\, Y_{lm}(\theta, \phi). \tag{43.8} \]
The form of the reduced radial wave function $\chi_{nl}(r)$ is found from the Schrödinger equation, for which we need the potential V(r). The true potential is something like a rounded-out finite square well, with a depth of $V_0$ and width R (the maximum radius for the well) and 0 at infinity, as in Fig. 43.1a. Instead of the true potential, we must use an analytic form that we can solve.

A first approximation is a spherical square well, of depth $V_0$ and radius R. The potential is $V = -V_0$ for $0 \le r \le R$ and, in order to be closer to reality, we would also need V = 0 for r > R; see Fig. 43.1b.

A second possible approximation is a spherical (three-dimensional) harmonic oscillator, starting at $V(r = 0) = -V_0$, and reaching V = 0 at r = R. To be closer to reality, again we would need to put V = 0 for r > R; see Fig. 43.1c.

Since the potential is central, for $r \in (0, \infty)$, and not one-dimensional (for $x \in (-\infty, +\infty)$), the conditions that one needs to impose are finiteness and integrability at 0 and ∞, as we saw in Chapter 19.
485
43 Nuclear Liquid Droplet and Shell Models
V(r)
V(r)
V(r)
r R
(a)
Figure 43.1
r R
r R
(b)
(c)
(a) Real nuclear potential. (b) Approximation 1: spherical square well. (c) Approximation 2: three-dimensional harmonic oscillator.
Approximation 1 (Spherical Square Well)

The spherical square well has been analyzed in Chapter 19 already, where we found that, for $r < R \equiv r_0$,

$$\frac{\chi_{nl}(r)}{r} = N_{nl}\, j_l(kr), \tag{43.9}$$

where

$$k = \sqrt{\frac{2m(E + V_0)}{\hbar^2}}. \tag{43.10}$$
For $r > R$, the correct thing to do would be to put $V = 0$, but the simpler thing to do is to consider $V = \infty$, for an infinitely deep square well. We are interested only in the low-lying energy states (for $V_0 - |V| \ll V_0$): there are only seven magic numbers, so presumably we go up to at most $n = 7$, and for these states the difference in the calculation required is minimal. Then the wave function should vanish for $r \ge R$, meaning that we need to have the quantization condition

$$j_l(kR) = 0 \;\Rightarrow\; k_n R = j^0_{n,l}, \tag{43.11}$$

where $j^0_{n,l}$ is the $n$th zero of $j_l$. It then also follows that

$$N_{nl}^2 = \frac{2}{R^3}\, \frac{1}{j_{l+1}^2(j^0_{n,l})}. \tag{43.12}$$
If we consider, as is more appropriate, V = 0 for r > R, the analysis of the boundary conditions, giving the quantization condition and normalization, is more complicated.
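To see which closed-shell totals the infinitely deep well actually gives, one can locate the zeros of $j_l$ in (43.11) numerically and fill the levels in order of increasing $kR$. The sketch below uses only the standard library; the recurrence-based `spherical_bessel`, the grid-plus-bisection root finder, and the cutoff `x_max = 9.5` are illustrative choices, not something taken from the text.

```python
import math

SPECTROSCOPIC = "spdfghi"


def spherical_bessel(l, x):
    """Spherical Bessel function j_l(x) for x > 0, by upward recurrence."""
    j_prev = math.sin(x) / x
    if l == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for order in range(1, l):
        j_prev, j_curr = j_curr, (2 * order + 1) / x * j_curr - j_prev
    return j_curr


def bessel_zeros(l, x_max, step=0.01):
    """Zeros of j_l in (1, x_max): bracket on a grid, then refine by bisection."""
    zeros = []
    for i in range(round((x_max - 1.0) / step)):
        a, b = 1.0 + i * step, 1.0 + (i + 1) * step
        if spherical_bessel(l, a) * spherical_bessel(l, b) < 0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                if spherical_bessel(l, a) * spherical_bessel(l, mid) <= 0:
                    b = mid
                else:
                    a = mid
            zeros.append(0.5 * (a + b))
    return zeros


def square_well_levels(x_max=9.5, l_max=6):
    """Levels (kR, label, degeneracy) of the infinite well, sorted by kR."""
    levels = []
    for l in range(l_max + 1):
        for index, zero in enumerate(bessel_zeros(l, x_max), start=1):
            levels.append((zero, f"{index}{SPECTROSCOPIC[l]}", 2 * (2 * l + 1)))
    return sorted(levels)


levels = square_well_levels()
closed_shell_totals = []
running_total = 0
for k_times_r, label, degeneracy in levels:
    running_total += degeneracy
    closed_shell_totals.append(running_total)
    print(f"{label}: kR = {k_times_r:.4f}, degeneracy {degeneracy}, total {running_total}")
```

In this sketch the level energy is ordered by $kR$ alone, since $E + V_0 = \hbar^2 k^2/2m$ from (43.10). Among the totals produced this way, only 2, 8 and 20 coincide with magic numbers.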
Approximation 2 (Spherical Harmonic Oscillator)

A better approximation is, however, the spherical harmonic oscillator, starting at $V = -V_0$, in which case the potential for the nucleus is

$$V_N(r) = -V_0\left[1 - \left(\frac{r}{R}\right)^2\right] = -V_0 + \frac{1}{2} m\omega^2 r^2, \tag{43.13}$$

where we have defined

$$\frac{V_0}{R^2} \equiv \frac{1}{2} m\omega^2, \tag{43.14}$$

and $m$ is the nucleon mass, $m = m_N \simeq m_n \simeq m_p$. Again, really we should stop the increase of $V$ at $r = R$, when $V = 0$, and put $V = 0$ for $r > R$ instead. However, if we are interested only in the low-lying states (and we saw that we have only seven magic numbers, so we should stop at most at $n = 7$), the difference is minimal. The solution of the radial Schrödinger equation is

$$\frac{\chi_{nl}(r)}{r} = N_{n,l}\, e^{-\alpha^2 r^2/2}\, (\alpha r)^l\, {}_1F_1(-n_r, l + 3/2; (\alpha r)^2), \tag{43.15}$$

where

$${}_1F_1(-n_r, l + 3/2; (\alpha r)^2) = L^{l+1/2}_{n_r - 1}(\alpha^2 r^2) \tag{43.16}$$

are Laguerre polynomials,

$$\alpha = \sqrt{\frac{m\omega}{\hbar}}, \tag{43.17}$$

the normalization constant is

$$N_{n,l} = \frac{\alpha\sqrt{2\alpha}}{\Gamma(l + 3/2)} \left[\frac{\Gamma(n_r + l + 3/2)}{n_r!}\right]^{1/2}, \tag{43.18}$$

and $n_r = 0, 1, 2, \ldots$ is the number of nodes of the radial wave function (less the node at infinity). The energy (principal) quantum number is, as we saw,

$$n = 2n_r + l = 0, 1, 2, \ldots \tag{43.19}$$

From the quantization condition, we found the eigenenergies

$$E_n = \hbar\omega\left(n + \frac{3}{2}\right). \tag{43.20}$$
The degeneracy of the above energy levels comes from the two values of the spin projection $m_s$ and the sum over $l$ from 0 to $n$,

$$d_n = 2\sum_{l=0}^{n} (l + 1) = 2\, \frac{(n+1)(n+2)}{2} = (n+1)(n+2). \tag{43.21}$$
But the number in which we are interested, to be tested against the magic numbers, is the total number of states in the fully filled shells, i.e.,

$$\sum_{m=0}^{n} d_m = \sum_{m=0}^{n} (m^2 + 3m + 2) = \frac{n(n+1)(2n+1)}{6} + 3\, \frac{n(n+1)}{2} + 2(n+1) = \frac{(n+1)(n+2)(n+3)}{3}, \tag{43.22}$$

which gives, for $n = 0, 1, 2, 3, 4, 5, 6, 7$,

$$2,\ 8,\ 20,\ 40,\ 70,\ 112,\ 168,\ 240,\ \ldots \tag{43.23}$$

Out of these, the first three numbers match the magic numbers, although the others do not.
For completeness, note that

$$d_n = (n+1)(n+2) = 2,\ 6,\ 12,\ 20,\ 30,\ 42,\ 56,\ 72,\ \ldots \tag{43.24}$$
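The oscillator counting can be verified in the same brute-force way. In the sketch below we count states directly, assuming (from $n = 2n_r + l$) that a shell $n$ contains $l = n, n-2, \ldots$ down to 0 or 1, each with $2l+1$ values of $m$ and two spin projections; the names are illustrative.

```python
MAGIC_NUMBERS = [2, 8, 20, 28, 50, 82, 126]


def oscillator_shell_degeneracy(n):
    """Number of states in oscillator shell n, counting l = n, n-2, ... explicitly."""
    return sum(2 * (2 * l + 1) for l in range(n % 2, n + 1, 2))


def closed_shell_total(n):
    """Total number of states in all filled shells 0..n."""
    return sum(oscillator_shell_degeneracy(m) for m in range(n + 1))


degeneracies = [oscillator_shell_degeneracy(n) for n in range(8)]
totals = [closed_shell_total(n) for n in range(8)]
matches = [t for t in totals if t in MAGIC_NUMBERS]
print(degeneracies)  # [2, 6, 12, 20, 30, 42, 56, 72]
print(totals)        # [2, 8, 20, 40, 70, 112, 168, 240]
print(matches)       # [2, 8, 20]
```

The explicit count reproduces both (43.24) and (43.23), and confirms that only the first three closed-shell totals are magic numbers.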
Since only the first three numbers match, we need a better theory but it is clear what it should be, since the same thing happens for the atomic energy levels. At high Z, as for the atomic levels, the magnetic spin–orbit coupling becomes large (whereas the electrostatic energy responsible for energy levels does not increase).
43.3 Spin–Orbit Interaction Correction

In order to understand the nuclear spin–orbit interaction, we review the atomic spin–orbit interaction first. The interaction term in the Hamiltonian comes from the Dirac equation, in the expansion around the nonrelativistic result (see Chapter 54), and one finds

$$H_{\text{spin–orbit}} = \frac{e}{2m_0^2 c^2}\, \frac{1}{r} \frac{dA_0}{dr}\, \vec s \cdot \vec l. \tag{43.25}$$

But then

$$eA_0 = -V_e(r) \tag{43.26}$$

is the Coulomb potential, so the spin–orbit interaction Hamiltonian is rewritten as

$$H_{\text{spin–orbit}} = -\text{const} \times \frac{1}{r} \frac{dV_e(r)}{dr}\, (2\vec s \cdot \vec l). \tag{43.27}$$
We will assume that the same formula still holds for the nucleus, except that instead of the Coulomb potential $V_e(r)$ we use the nuclear potential (the potential for the nucleus states) $V_N(r)$. We will use the second approximation, the spherically symmetric harmonic oscillator, which gives better results, and obtain that

$$\frac{1}{r} \frac{dV_N(r)}{dr} = m\omega^2 = \frac{2V_0}{R^2} = \text{const}. \tag{43.28}$$

We also find that $V_0 \sim E_{\text{binding}} \propto A$ increases for larger nuclei. Moreover,

$$(2\vec s \cdot \vec l) = (\vec l + \vec s)^2 - \vec l^{\,2} - \vec s^{\,2} = \vec j^{\,2} - \vec l^{\,2} - \vec s^{\,2} = \hbar^2 \left[ j(j+1) - l(l+1) - s(s+1) \right]. \tag{43.29}$$

Since all the nucleons are spin 1/2 fermions, just like electrons, $s = 1/2$ implies that $j = l \pm 1/2$. In this case, for the two values we find (in units of $\hbar^2$)

$$(2\vec s \cdot \vec l) = \begin{cases} (l + 1/2)(l + 3/2) - l(l+1) - 3/4 = l, & \text{for } j = l + 1/2 \\ (l - 1/2)(l + 1/2) - l(l+1) - 3/4 = -l - 1, & \text{for } j = l - 1/2. \end{cases} \tag{43.30}$$

Then the spin–orbit interaction Hamiltonian is

$$H_{\text{spin–orbit}} = \begin{cases} -\text{const} \times l < 0, & j = l + 1/2 \\ +\text{const} \times (l + 1) > 0, & j = l - 1/2. \end{cases} \tag{43.31}$$
This means that the $j = l - 1/2$ states are shifted upwards in energy, and are shifted more as $A$ increases (so as either $N$ or $Z$ increases), since the constant is roughly proportional to $A$. Similarly, the $j = l + 1/2$ states are shifted downwards in energy, and shifted more as $A$ increases. Moreover, the shift is proportional to $l + 1$ in the first case and to $l$ in the second, so the shift is smaller for small $l$ and larger for large $l$. To describe the nuclear states, we use a variant of the spectroscopic notation, with

$$n_r + 1 = n_r' = \frac{n - l}{2} + 1, \tag{43.32}$$

the number of nodes, including infinity, appearing before the notation for $l$, so that

$$(n_r + 1)\, l_j = n_r'\, l_j. \tag{43.33}$$
For the first three shells, which as we saw already match the magic numbers without considering the spin–orbit shifts, we have the sub-shells

$$\begin{aligned} n = 0 &: \ 1s_{1/2} \\ n = 1 &: \ 1p_{3/2},\ 1p_{1/2} \\ n = 2 &: \ 1d_{5/2},\ 2s_{1/2},\ 1d_{3/2}, \end{aligned} \tag{43.34}$$

in the exact order of increasing energy written above, that is

$$(1s_{1/2})_{n=0},\quad (1p_{3/2},\ 1p_{1/2})_{n=1},\quad (1d_{5/2},\ 2s_{1/2},\ 1d_{3/2})_{n=2}. \tag{43.35}$$

For the next shells, we have the following sub-shells over $n_r'$, $l$:

$$\begin{aligned} n = 3 &: \ 1f,\ 2p \\ n = 4 &: \ 1g,\ 2d,\ 3s \\ n = 5 &: \ 1h,\ 2f,\ 3p \\ n = 6 &: \ 1i,\ 2g,\ 3d,\ 4s, \end{aligned} \tag{43.36}$$

where the $n = 3$ shell splits over $j$ into

$$1f_{7/2},\ 1f_{5/2},\ 2p_{3/2},\ 2p_{1/2}, \tag{43.37}$$
but the first sub-shell, $1f_{7/2}$, for $n = 3$, $l = 3$, has a spin–orbit interaction term that is sufficiently large to separate it downwards in energy from the $n = 3$ shell, but not enough for it to join the $n = 2$ shell, meaning that it acts as a separate shell all by itself. Its degeneracy is $2j + 1 = 8$. Next, for $n = 4$, the shell splits into

$$1g_{9/2},\ 1g_{7/2},\ 2d_{5/2},\ 2d_{3/2},\ 3s_{1/2}. \tag{43.38}$$

But now $1g_{9/2}$ has $n = l = 4$ and the spin–orbit interaction is large enough to take it down into the $n = 3$ shell. Finally, for $n = 5$, the shell splits into

$$1h_{11/2},\ 1h_{9/2},\ 2f_{7/2},\ 2f_{5/2},\ 3p_{3/2},\ 3p_{1/2}. \tag{43.39}$$
The sub-shell $1h_{11/2}$ has $n = l = 5$ and the spin–orbit interaction is large enough to take it down into the $n = 4$ shell. Moreover, from the $n = 6$ shell, the element with $n = l = 6$, namely $1i_{13/2}$, also has a spin–orbit interaction large enough to take it down into the $n = 5$ shell. Then, finally, the sub-shells, ordered by increasing energy into shells (except the $n = 0, 1, 2$ shells, which are already matched), are as follows (where we also give the degeneracy $d_n$ of the shell and the magic number $k_n = \sum_{m=1}^{n} d_m$):

$$\begin{array}{lcc} n_r'\, l_j & d_n & k_n \\ \hline (1f_{7/2})_{n=3} & 8 & 28 \\ (2p_{3/2},\ 1f_{5/2},\ 2p_{1/2})_{n=3},\ (1g_{9/2})_{n=4} & 22 & 50 \\ (2d_{5/2},\ 1g_{7/2},\ 3s_{1/2},\ 2d_{3/2})_{n=4},\ (1h_{11/2})_{n=5} & 32 & 82 \\ (2f_{7/2},\ 1h_{9/2},\ 2f_{5/2},\ 3p_{3/2},\ 3p_{1/2})_{n=5},\ (1i_{13/2})_{n=6} & 44 & 126 \end{array} \tag{43.40}$$

We see that we have managed to reproduce all the magic numbers.
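The bookkeeping behind (43.40) is just a sum of sub-shell degeneracies $2j + 1$. As a minimal sketch (the `SHELLS` list below transcribes the groupings of (43.35) and (43.40); the label parsing is an illustrative convenience), one can tally them as follows.

```python
from fractions import Fraction
from itertools import accumulate

# Sub-shells grouped into shells after the spin-orbit rearrangement.
SHELLS = [
    ["1s1/2"],
    ["1p3/2", "1p1/2"],
    ["1d5/2", "2s1/2", "1d3/2"],
    ["1f7/2"],
    ["2p3/2", "1f5/2", "2p1/2", "1g9/2"],
    ["2d5/2", "1g7/2", "3s1/2", "2d3/2", "1h11/2"],
    ["2f7/2", "1h9/2", "2f5/2", "3p3/2", "3p1/2", "1i13/2"],
]


def subshell_degeneracy(label):
    """Degeneracy 2j + 1, with j read from the label after the radial number and letter."""
    j = Fraction(label[2:])  # e.g. "1f7/2" -> Fraction(7, 2)
    return int(2 * j + 1)


shell_degeneracies = [sum(subshell_degeneracy(s) for s in shell) for shell in SHELLS]
magic_numbers = list(accumulate(shell_degeneracies))
print(shell_degeneracies)  # [2, 6, 12, 8, 22, 32, 44]
print(magic_numbers)       # [2, 8, 20, 28, 50, 82, 126]
```

The order of sub-shells inside a shell does not affect the totals; only the assignment of the intruder states $1g_{9/2}$, $1h_{11/2}$ and $1i_{13/2}$ to the shell below matters.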
43.4 Many-Particle Shell Models

Until now we have considered only single-particle shell models, where the contribution from the other nucleons is assumed to be included in the one-particle potential $V_N(r)$, but in reality we must consider the interactions of the nucleons, at least for those outside the fully filled shells, called (as in the atomic case) valence particles. As for atoms, the fully filled shells correspond to an inert core (spherically symmetric, without multipole interactions), contributing only through a “screening” of the nuclear charge and thus modifying $V_0$ only, but not creating a multiparticle potential.

The simplest correction to the above single-particle shell models is to consider a two-particle potential for the valence particles. Denoting by $\vec r_1$, $\vec r_2$ the positions of the two valence particles relative to the nuclear center, and by $\theta_{12}$ the angle between them, the potential is

$$V = V(\vec r_1, \vec r_2) = V(r_1, r_2, \cos\theta_{12}). \tag{43.41}$$

Moreover, we can expand it in terms of Legendre polynomials for the angular dependence,

$$V = \sum_k f_k(r_1, r_2)\, P_k(\cos\theta_{12}). \tag{43.42}$$

Conversely (using the orthogonality of the Legendre polynomials), we have

$$f_k = \frac{2k+1}{2} \int_{-1}^{1} P_k(\cos\theta_{12})\, V(r_1, r_2, \cos\theta_{12})\, d\cos\theta_{12}. \tag{43.43}$$

The simplest model for the two-particle potential is a contact (delta function) potential,

$$V = -g\, \delta(\vec r_1 - \vec r_2) = -g\, \delta(1 - \cos\theta_{12})\, \frac{\delta(r_1 - r_2)}{\pi r_1 r_2}, \tag{43.44}$$

where $\pi r_1 r_2$ is the Jacobian for the transformation between the Cartesian coordinates and the radial coordinates. In this case, we obtain

$$f_k(r_1, r_2) = -\frac{g}{4\pi}\, (2k+1)\, \frac{\delta(r_1 - r_2)}{r_1 r_2}. \tag{43.45}$$
We consider states that are tensor products of the single-particle states (in zeroth-order perturbation theory for the interaction potential $V$, in which we have only single-particle potentials and shells), defined in terms of the quantum numbers $(nljm)$ for each particle (the spin–orbit interaction for each valence nucleon produces the coupling $\vec j_i = \vec l_i + \vec s_i$ which, as for atomic $jj$ coupling, happens at large $Z$, for several fully filled shells, since then the spin–orbit interaction is larger), so

$$|\phi_{n_1 l_1 j_1 m_1}\rangle \otimes |\phi_{n_2 l_2 j_2 m_2}\rangle. \tag{43.46}$$

We want to calculate the interaction energy coming out of this two-particle interaction potential using first-order time-independent perturbation theory, so we will evaluate the potential $\hat V$ for the zeroth-order states. The total angular momentum couples the two individual ones, $\vec j_1 + \vec j_2 = \vec J$, so we replace $|j_1 m_1 j_2 m_2\rangle$ by $|j_1 j_2 J M\rangle$ via Clebsch–Gordan coefficients.

We could continue with a full description, but we will just show a simple application instead. Consider the case when $j_1 = j_2 = j$, so that there are two identical particles in the same shell. Then these particles can “pair up” into parallel but opposite (“antiparallel”) angular momenta, ↑↓, i.e., $J = 0 = M$. In this case, the matrix element in the correct states $|j_1 j_2 J M\rangle = |jj00\rangle$ can be trivially related to the matrix elements for the $|{\uparrow}\rangle \otimes |{\downarrow}\rangle$ states (products of one-particle states), and then the sum over $m_j$ gives the degeneracy $d_j = 2j + 1$, so

$$\langle jj00|\hat V|jj00\rangle = (2j+1) \int |\phi_{n_1 l_1}(r_1)|^2\, |\phi_{n_2 l_2}(r_2)|^2\, f_k(r_1, r_2)\, dr_1\, dr_2. \tag{43.47}$$

Substituting (43.45), we obtain

$$\langle jj00|\hat V|jj00\rangle = -(2j+1)\, \frac{g}{8\pi} \int |\phi_{n_1 l_1}(r)|^2\, |\phi_{n_2 l_2}(r)|^2\, \frac{dr}{r^2}. \tag{43.48}$$

This is the pairing energy in first-order time-independent perturbation theory for the two-particle shell model. The next logical step would be to continue on to multiparticle shell models. But this leads to a self-consistent solution, the “Hartree–Fock approximation”, which will be treated in Chapter 57, in Part IIe, concerning multiparticle calculations.
Important Concepts to Remember

• A nucleus, defined by N and Z (and A = N + Z), can be roughly approximated as a liquid droplet of nuclear matter, since its radius is $R = r_0 A^{1/3}$ and $M_{\text{nucleus}}/(m_p A) \simeq$ const and < 1.
• However, we have magic numbers of stability in N or Z, 2, 8, 20, 28, 50, 82, 126, explained by analogy with the atomic shell model as being due to having a full nuclear shell (so the magic number is $\sum_m d_m$), for the quantum numbers arising from quantization in the nuclear potential.
• The nuclear potential can be approximated either as a spherical square well or as a spherical (three-dimensional) harmonic oscillator starting at $-V_0$, naturally with V = 0 for r > R, but in practice V = ∞ for r > R gives good results for low quantum numbers.
• For the spherical harmonic oscillator with V = 0 for r > R, we find the first three magic numbers, but the rest do not match. For them, we need to take into account the spin–orbit interaction correction, and make some assumptions about its parameters.
• To obtain a better result for the energy levels, one has to consider many-particle interactions. A two-particle delta function interaction gives the pairing energy, but better corrections are obtained via the self-consistent Hartree–Fock approximation.
Further Reading

See more about nuclear models in the book [27].
Exercises

(1) If a nucleus is spinning around a given axis, assuming the liquid droplet model how would you modify the law $R = r_0 A^{1/3}$ from the static case?
(2) Calculate the eccentricity, $\Delta R/R$, of the spinning nucleus in exercise 1, in a classical physics approximation.
(3) If the (effective) central potential is replaced with an (effective) azimuthal potential, $V(r, z)$, depending independently on the polar radius $r$ in a plane and on the height $z$ along the axis transverse to the plane, is it possible to have a shell model? Why? If so, consider the harmonic oscillator potential with different constants in the direction $z$ and in the polar radial direction $r$, and find the (first few) magic numbers.
(4) In the case of a spherical square well with V = 0 for r ≥ R, write down the solutions in the regions r ≤ R and r ≥ R, the gluing (continuity) conditions at r = R, and the resulting quantization condition.
(5) Consider the nuclear central potential (with $R_2$ larger than $R_1$, but not by too large a factor)

$$V_N(r) = -V_0 \left(1 - \frac{r^2}{R_1^2}\right) e^{-r/R_2}. \tag{43.49}$$

What can you (qualitatively) infer about the modification of the energy levels with respect to the levels of the spherical harmonic oscillator approximation in the text?
(6) In the case in exercise 5, calculate the spin–orbit interaction correction. What can you (qualitatively) infer about the modification of the various states from the uncorrected case?
(7) Calculate the pairing energy for the two-particle delta function potential, for two nucleons in the ground state, $1s_{1/2}$.
44
Interaction of Atoms with Electromagnetic Radiation: Transitions and Lasers

In this chapter, we first study the exact time-dependent solution for a two-level system in a harmonic potential, after which we study the general first-order perturbation theory in a harmonic potential for interaction with an external electromagnetic field. Finally, we consider a quantized electromagnetic field, and the interaction with the resulting photons, and derive the Planck formula that started quantum mechanics.
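44.1 Two-Level System for Time-Dependent Transitions

We start with the generic two-level system considered in Chapter 4, but now with a time-dependent potential for transition, namely

$$i\hbar \frac{d}{dt} \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = \hat H \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix}, \tag{44.1}$$

where $H_{12} = V(t)$ and $H_{21} = H_{12}^* = V^*(t)$, since the Hamiltonian is Hermitian. Moreover, as before,

$$E = \frac{H_{11} + H_{22}}{2}, \qquad \Delta = \frac{H_{22} - H_{11}}{2}, \tag{44.2}$$

where the nonperturbed energies (at $V(t) = 0$) are

$$H_{11} = E_1^{(0)} = E - \Delta, \qquad H_{22} = E_2^{(0)} = E + \Delta > E_1^{(0)}. \tag{44.3}$$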
A useful formalism for describing the evolution of the states is in terms of the density matrix,

$$\rho = |\psi(t)\rangle\langle\psi(t)| = \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} \begin{pmatrix} c_1^*(t) & c_2^*(t) \end{pmatrix} = \begin{pmatrix} |c_1(t)|^2 & c_1(t) c_2^*(t) \\ c_1^*(t) c_2(t) & |c_2(t)|^2 \end{pmatrix}. \tag{44.4}$$

Define first the difference in occupation numbers (probabilities) in the two states,

$$N(t) = \rho_{22} - \rho_{11} = |c_2(t)|^2 - |c_1(t)|^2 = 2|c_2(t)|^2 - 1, \tag{44.5}$$

where we have used that $|c_1(t)|^2 + |c_2(t)|^2 = 1$, since we only have these two states, so the probability for the system to be in either state is one. Next, we note that $\rho_{21} = \rho_{12}^*$, and then define

$$\rho_{21} = c_2(t) c_1^*(t) = \frac{Q(t) + iP(t)}{2}, \tag{44.6}$$

so that

$$Q = \rho_{21} + \rho_{21}^* = \rho_{21} + \rho_{12}, \qquad P = \frac{\rho_{21} - \rho_{21}^*}{i} = \frac{\rho_{21} - \rho_{12}}{i}. \tag{44.7}$$
Then the Schrödinger equation is equivalent to the Bloch equations

$$\begin{aligned} \frac{d}{dt} N(t) &= -\frac{1}{i\hbar} \left[ Q(t)(V(t) - V^*(t)) + iP(t)(V(t) + V^*(t)) \right] \\ \frac{d}{dt} Q(t) &= \omega_0 P(t) + \frac{1}{i\hbar} N(t)(V(t) - V^*(t)) \\ \frac{d}{dt} P(t) &= -\omega_0 Q(t) + \frac{1}{\hbar} N(t)(V(t) + V^*(t)), \end{aligned} \tag{44.8}$$

where we have defined

$$\omega_0 \equiv \frac{2\Delta}{\hbar} = \frac{E_2 - E_1}{\hbar}, \tag{44.9}$$

previously called $\omega_{21}$. The most relevant case, which can be solved exactly, is that of a harmonic potential,

$$V(t) = V_0 e^{i\omega t}. \tag{44.10}$$
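We can then check that the solution for $N(t)$ is (the method for finding it is somewhat lengthy)

$$N(t) = N(0) - \frac{2V_0}{\hbar\Omega} P(0) \sin\Omega t + \frac{2V_0}{\hbar\Omega} \left[ \frac{2V_0}{\hbar\Omega} N(0) - \frac{\omega_0 - \omega}{\Omega} Q(0) \right] (\cos\Omega t - 1), \tag{44.11}$$

where

$$\Omega \equiv \sqrt{(\omega - \omega_0)^2 + \left(\frac{2V_0}{\hbar}\right)^2}. \tag{44.12}$$

The solution for $P(t)$, $Q(t)$ is given as

$$\begin{aligned} Q(t) &= Q'(t) \cos\omega t + P'(t) \sin\omega t \\ P(t) &= P'(t) \cos\omega t - Q'(t) \sin\omega t \\ P'(t) &= P(0) \cos\Omega t + \left[ \frac{2V_0}{\hbar\Omega} N(0) - \frac{\omega_0 - \omega}{\Omega} Q(0) \right] \sin\Omega t \\ Q'(t) &= Q(0) + \frac{\omega_0 - \omega}{\Omega} P(0) \sin\Omega t - \frac{\omega_0 - \omega}{\Omega} \left[ \frac{2V_0}{\hbar\Omega} N(0) - \frac{\omega_0 - \omega}{\Omega} Q(0) \right] (\cos\Omega t - 1). \end{aligned} \tag{44.13}$$

Considering that $N(t) = 2|c_2(t)|^2 - 1$, and that the most relevant case is when all particles are in the ground state initially, at $t = 0$, we have $|c_2(0)|^2 = 0$, which means $N(0) = -1$. Moreover, in this case $c_2(0) = 0$, so $c_2^*(0) = 0$ also, and thus $Q(0) = P(0) = 0$. In this case, substituting into (44.11) and (44.13), we obtain

$$\begin{aligned} Q(t) &= -\frac{2(\omega_0 - \omega)V_0}{\hbar\Omega^2} (1 - \cos\Omega t) \cos\omega t - \frac{2V_0}{\hbar\Omega} \sin\Omega t \sin\omega t \\ P(t) &= -\frac{2V_0}{\hbar\Omega} \sin\Omega t \cos\omega t + \frac{2(\omega_0 - \omega)V_0}{\hbar\Omega^2} (1 - \cos\Omega t) \sin\omega t \\ N(t) &= -1 + \left(\frac{2V_0}{\hbar\Omega}\right)^2 (1 - \cos\Omega t) = -1 + 2\left(\frac{2V_0}{\hbar\Omega}\right)^2 \sin^2\frac{\Omega t}{2}. \end{aligned} \tag{44.14}$$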
Therefore we obtain that the probability for the system to be in the upper energy level is

$$P_2(t) = |c_2(t)|^2 = \frac{1 + N(t)}{2} = \left(\frac{2V_0}{\hbar\Omega}\right)^2 \sin^2\frac{\Omega t}{2}, \tag{44.15}$$

which is known as the Rabi formula. We also have $P_1(t) = 1 - P_2(t)$.
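Since the derivation of the Rabi formula is lengthy, a direct numerical check is useful. The sketch below integrates the two-level Schrödinger equation (44.1) with $H_{12} = V_0 e^{i\omega t}$ using a hand-written fourth-order Runge–Kutta step and compares $|c_2(t)|^2$ with (44.15). It assumes units with $\hbar = 1$ and a real $V_0$; the function names, the step count, and the parameter values are illustrative choices.

```python
import cmath
import math


def rabi_probability(t, v0, omega, omega0, hbar=1.0):
    """Upper-level probability from the Rabi formula (44.15)."""
    big_omega = math.sqrt((omega - omega0) ** 2 + (2 * v0 / hbar) ** 2)
    return (2 * v0 / (hbar * big_omega)) ** 2 * math.sin(big_omega * t / 2) ** 2


def upper_level_probability(t_final, v0, omega, e1, e2, steps=20000, hbar=1.0):
    """|c2|^2 at t_final from RK4 integration of (44.1), starting in the lower level."""

    def derivative(t, c1, c2):
        v = v0 * cmath.exp(1j * omega * t)  # H12 = V(t), H21 = V*(t)
        dc1 = (e1 * c1 + v * c2) / (1j * hbar)
        dc2 = (v.conjugate() * c1 + e2 * c2) / (1j * hbar)
        return dc1, dc2

    dt = t_final / steps
    c1, c2 = 1.0 + 0j, 0j
    for step in range(steps):
        t = step * dt
        k1 = derivative(t, c1, c2)
        k2 = derivative(t + dt / 2, c1 + dt / 2 * k1[0], c2 + dt / 2 * k1[1])
        k3 = derivative(t + dt / 2, c1 + dt / 2 * k2[0], c2 + dt / 2 * k2[1])
        k4 = derivative(t + dt, c1 + dt * k3[0], c2 + dt * k3[1])
        c1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return abs(c2) ** 2


# Off-resonance example: E1 = 0, E2 = 1 (so omega0 = 1), omega = 1.2, V0 = 0.3.
numeric = upper_level_probability(5.0, v0=0.3, omega=1.2, e1=0.0, e2=1.0)
formula = rabi_probability(5.0, v0=0.3, omega=1.2, omega0=1.0)
print(numeric, formula)
```

For this form of $V(t)$ no rotating-wave approximation is needed, so the two numbers should agree up to the integration error.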
Figure 44.1: Probability $|c_2(t)|^2$ as a function of time $t$, showing the absorption and emission parts of the cycle.

From the form of $\Omega$, we see that the amplitude is a maximum when $\Omega$ is a minimum, which happens when

$$\omega = \omega_0 = \frac{E_2 - E_1}{\hbar}, \tag{44.16}$$

which is known as the resonance condition. At resonance,

$$\Omega = \Omega_0 = \frac{2V_0}{\hbar}. \tag{44.17}$$

Note also that the amplitude is Lorentzian in shape, since

$$\left(\frac{2V_0}{\hbar\Omega}\right)^2 = \frac{(2V_0)^2}{(2V_0)^2 + [\hbar(\omega - \omega_0)]^2}, \tag{44.18}$$
and it becomes 1 at the resonance. Moreover, when averaging over time, we obtain

$$\left\langle \sin^2\frac{\Omega t}{2} \right\rangle = \frac{1}{2}, \tag{44.19}$$

so $\langle P_2 \rangle = 1/2$ at resonance, and is smaller otherwise. At resonance then, we have a cycling between the two energy levels. From $t = 0$ until $\Omega t/2 = \pi/2$, the system absorbs energy from the potential $V(t)$, since $E_2 > E_1$, and the probability $P_2(t) = |c_2(t)|^2$ increases from 0 to a maximum during that time. Then, from $\pi/2$ to $\pi$, $P_2(t)$ decreases, so the system is losing energy, by emission of radiation. The cycles of absorption and emission, each for a time $\Omega t/2$ equal to $\pi/2$, repeat ad infinitum; see Fig. 44.1.

The above two-level system in a harmonic potential is relevant for the case of the laser, whose name is the acronym for “light amplification by stimulated emission of radiation” (LASER). In this case, there is radiation in a resonant cavity, with walls made of atoms with two levels. Then the field gives energy to the atom through absorption, followed by stimulated emission, after which the radiation gets reflected and comes back and gets absorbed, etc. The light is amplified coherently and is then led out of the cavity as a laser.
44.2 First-Order Perturbation Theory for Harmonic Potential

The above case was exact, but the two-level system itself is an approximation. In reality, the atom is a multi-level system and interacts with the electromagnetic field, oscillating at some frequency, which generates transitions between the states of the system (so the interaction with the electromagnetic field couples the states indirectly). Thus we can consider the generic interaction operator as the sum of two terms (so that it is Hermitian),

$$\hat H_1(t) = \hat V(t) = \hat V_0 e^{i\omega t} + \hat V_0^\dagger e^{-i\omega t}, \tag{44.20}$$
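for $t > 0$ (so, in a sudden approximation). For example, we found in Chapter 39 that the interaction Hamiltonian with an electromagnetic field is

$$\hat H_1 = \int d^3x\, \vec A \cdot \vec j = A_0 \int d^3x\, (\vec\epsilon(\vec k) \cdot \vec j) \left[ e^{i(\vec k \cdot \vec x - \omega t)} + e^{-i(\vec k \cdot \vec x - \omega t)} \right], \tag{44.21}$$

and we took the matter current $\vec j$ to have matrix elements. For comparison, for absorption from a constant field, we have the interaction Hamiltonian matrix elements

$$H_{1,fi} = iqA_0\omega_0 \int d^3r\, \psi_f^*(\vec r)\, (\vec\epsilon(\vec k) \cdot \vec r)\, \psi_i(\vec r). \tag{44.22}$$

Then, using the first-order time-dependent perturbation theory formula from Chapter 37, integrated over time, we have

$$c_n^{(1)}(t) = \frac{1}{i\hbar} \int_0^t dt'\, H_{1,ni}\, e^{i\omega_{ni} t'}. \tag{44.23}$$

Substituting $H_1(t)$, we find

$$\begin{aligned} c_n^{(1)}(t) &= \frac{1}{i\hbar} \int_0^t dt' \left[ V_{0,ni}\, e^{i\omega t'} + V^\dagger_{0,ni}\, e^{-i\omega t'} \right] e^{i\omega_{ni} t'} \\ &= \frac{1}{\hbar} \left[ \frac{1 - e^{i(\omega + \omega_{ni})t}}{\omega + \omega_{ni}}\, V_{0,ni} + \frac{1 - e^{i(\omega_{ni} - \omega)t}}{-\omega + \omega_{ni}}\, V^\dagger_{0,ni} \right]. \end{aligned} \tag{44.24}$$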
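The time integral in (44.24) is elementary, but signs and factors of $i$ are easy to get wrong, so a numerical cross-check can help. The following sketch compares the closed form with a Simpson-rule evaluation of the integral. It assumes $\hbar = 1$; the matrix elements `v_ni` and `v_dagger_ni` are arbitrary illustrative complex numbers, not values from the text.

```python
import cmath


def amplitude_closed_form(t, v_ni, v_dagger_ni, omega, omega_ni, hbar=1.0):
    """First-order amplitude c_n^(1)(t) from the closed form in (44.24)."""
    emission = (1 - cmath.exp(1j * (omega + omega_ni) * t)) / (omega + omega_ni) * v_ni
    absorption = (1 - cmath.exp(1j * (omega_ni - omega) * t)) / (omega_ni - omega) * v_dagger_ni
    return (emission + absorption) / hbar


def amplitude_quadrature(t, v_ni, v_dagger_ni, omega, omega_ni, intervals=2000, hbar=1.0):
    """Same amplitude from Simpson's rule applied to the time integral (intervals must be even)."""

    def integrand(tp):
        v = v_ni * cmath.exp(1j * omega * tp) + v_dagger_ni * cmath.exp(-1j * omega * tp)
        return v * cmath.exp(1j * omega_ni * tp)

    h = t / intervals
    total = integrand(0.0) + integrand(t)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / (1j * hbar)


params = dict(v_ni=0.2 + 0.1j, v_dagger_ni=0.15 - 0.05j, omega=0.9, omega_ni=1.0)
closed = amplitude_closed_form(10.0, **params)
numeric = amplitude_quadrature(10.0, **params)
print(closed, numeric)
```

With $\omega$ close to $\omega_{ni} > 0$, the second term has the small denominator $\omega_{ni} - \omega$ and dominates, which is the absorption channel.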
Then, as in Chapter 37, we continue on to find the Fermi golden rule, but now with the replacement

$$\omega_{ni} = \frac{E_n - E_i}{\hbar} \;\to\; \omega_{ni} \pm \omega = \frac{E_n - E_i \pm \hbar\omega}{\hbar}, \tag{44.25}$$

where the ± refers to stimulated emission or absorption. The modified Fermi golden rule is then

$$\frac{dP_{ni}}{dt} = \frac{2\pi}{\hbar} \begin{Bmatrix} |V_{0,ni}|^2 \\ |V^\dagger_{0,ni}|^2 \end{Bmatrix} \int dE \cdots \rho(E_n, \ldots)\, \delta(E_n - E_i \pm \hbar\omega). \tag{44.26}$$

This formula could have been obtained intuitively, for the stimulated emission or absorption of a photon with energy $\hbar\omega$.

We note that the Hamiltonian $\hat H$ is Hermitian, so $|V^\dagger_{0,ni}|^2 = |V_{0,ni}|^2$, which comes from $\langle i|V_0^\dagger|n\rangle = \langle n|V_0|i\rangle^*$.
44.3 The Case of Quantized Radiation

To improve upon the previous analysis, we must also quantize the electromagnetic field. Strictly speaking, this should be done in the relativistically invariant quantum electrodynamics (QED) formalism, which is beyond the scope of this book, but we can make some shortcuts and “guess” the result.

Briefly, radiation is made up of quanta of a given $\omega$, just as for a harmonic oscillator, where we saw that $E = (N + 1/2)\hbar\omega$ is built up from the vacuum by adding $N$ quanta of energy $\hbar\omega$. Now, the vector potential $\vec A$ contains a polarization vector $\vec\epsilon_j(\vec k)$ (where $j$ is the index for the two polarizations transverse to the momentum), a wave factor $e^{i(\vec k \cdot \vec r - \omega t)}$, and a normalization constant $K$, plus a Hermitian conjugate term to ensure the reality of the potential (all of which are present in the classical theory), but now the wave factor is multiplied, as for a harmonic oscillator, by a creation or annihilation operator for respectively creating or annihilating a photon. All in all,

$$\vec A(\vec r, t) = K \sum_{\vec k, j} \vec\epsilon_j(\vec k) \left[ a_j(\vec k)\, e^{i(\vec k \cdot \vec r - \omega t)} + a_j^\dagger(\vec k)\, e^{-i(\vec k \cdot \vec r - \omega t)} \right]. \tag{44.27}$$

As we saw when discussing the harmonic oscillator, the only nonzero matrix elements of the creation and annihilation operators are

$$\langle N - 1|a|N\rangle = \sqrt{N}, \qquad \langle N + 1|a^\dagger|N\rangle = \sqrt{N + 1}. \tag{44.28}$$

We see that, with respect to the classical analysis of the electromagnetic field given above, we replace $A_0$ with $K a_j(\vec k)$ in one term and with $K a_j^\dagger(\vec k)$ in the other, which are operators in both cases. This means that now we need to consider states not only for the atom but for the radiation itself, and now the total Hilbert space is a product $\mathcal H_{\text{total}} = \mathcal H_{\text{atom}} \otimes \mathcal H_{\text{radiation}}$. Then the initial and final states become

$$|I\rangle = |i\rangle \otimes |\tilde i\rangle, \qquad |F\rangle = |f\rangle \otimes |\tilde f\rangle, \tag{44.29}$$

where $|i\rangle$, $|f\rangle$ are atomic states and $|\tilde i\rangle$, $|\tilde f\rangle$ are radiation states. Therefore now the matrix elements of the interaction potential are

$$V^\dagger_{0,FI} = K \sum_{\vec k, j} (a_j(\vec k))_{\tilde f \tilde i}\, M_{fi}, \tag{44.30}$$

where

$$M_{fi} = \int d^3r\, \vec\epsilon_j(\vec k) \cdot (\vec j)_{fi}\, e^{i\vec k \cdot \vec r}. \tag{44.31}$$

But, as we just saw, the only nonzero matrix element $(a_j(\vec k))_{\tilde f \tilde i}$ is for $|\tilde i\rangle = |N\rangle$ and $|\tilde f\rangle = |N - 1\rangle$, in which case the matrix element is $\sqrt{N_j(\vec k)}$. As we can see, in this situation one photon is absorbed, so the term with $V^\dagger_{0,FI}$ is for absorption, as in the classical radiation case. Similarly,

$$V_{0,FI} = K \sum_{\vec k, j} (a_j^\dagger(\vec k))_{\tilde f \tilde i}\, M_{fi}^* \tag{44.32}$$

is only nonzero if $|\tilde i\rangle = |N\rangle$ and $|\tilde f\rangle = |N + 1\rangle$, in which case the matrix element is $\sqrt{N_j(\vec k) + 1}$. This is then the situation of emission of a photon into the radiation field, so the $V_{0,FI}$ term corresponds to emission, again as in the classical radiation case. Thus, the transition rate in this quantum radiation case is (from the Fermi golden rule)

$$\frac{dP(I \to F)}{dt} = \frac{2\pi}{\hbar} \begin{cases} K^2 |M_{fi}|^2 \sum_{\vec k, j} N_j(\vec k) \int dE \cdots \delta(E_f - E_i - \hbar\omega) & \text{abs.} \\ K^2 |M_{fi}|^2 \sum_{\vec k, j} [N_j(\vec k) + 1] \int dE \cdots \delta(E_f - E_i + \hbar\omega) & \text{em.} \end{cases} \tag{44.33}$$

for the absorption and emission cases, respectively.
We note that the transition rate in the emission case is proportional to $N_j(\vec k) + 1$, so it can be split into two terms: a term proportional to $N_j(\vec k)$, which is only nonzero when there are already photons in the radiation field, so this is a stimulated emission term (emission stimulated by photons), and a term equal to 1, which is independent of the existence of photons, so it is a spontaneous emission term. We also note that the absorption rate equals the stimulated emission rate, as in the previous (classical) case. Only the spontaneous emission is different. In the resulting spontaneous emission rate, the right-hand side is called the “Einstein coefficient”.

We can turn the sum over $\vec k$ into an integral,

$$d^3k = k^2\, dk\, d\Omega = \frac{\omega^2\, d\omega}{c^3}\, d\Omega. \tag{44.34}$$

Moreover, the normalization constant $K$ is proportional to $1/\sqrt{\omega}$, so we define $K = K'/\sqrt{\omega}$. Then we finally obtain, for the emission rate in the cases of stimulated or spontaneous emission,

$$\frac{dP_{ni}}{dt} = \frac{2\pi}{\hbar c^3}\, |M_{ni}|^2 K^2 \omega^2\, d\omega\, d\Omega \begin{Bmatrix} N_j(\vec k) \\ 1 \end{Bmatrix} \delta(E_f - E_i + \hbar\omega). \tag{44.35}$$
44.4 Planck Formula

We can use the above formalism to calculate the Planck formula for thermal radiation, which was what started quantum theory. (Of course, Planck did not use this derivation, but a much simplified one with only a single assumption, that energy comes in quanta $h\nu$; here we present the correct proof in quantum mechanics.)

We start from the observation that, for a given momentum $\vec k$ (thus angular frequency $\omega$ for the radiation), we have the following ratio of the absorption to spontaneous emission rates:

$$\frac{dP_{\text{abs.}}(\vec k)}{dP_{\text{sp.em.}}(\vec k)} = N(\vec k). \tag{44.36}$$

Next, we calculate the energy absorbed in an interval $d\omega$ of angular frequency and in a solid angle $d\Omega$. Since we are absorbing photons, we have a factor $\hbar\omega$ for the energy of the photon times a factor $N(\vec k)$ for how many photons there are in the field. Further, in the continuum, we have

$$\frac{1}{V} \sum_{\vec k} \;\to\; \frac{1}{(2\pi)^3} \int d^3k = \frac{1}{(2\pi)^3}\, \frac{\omega^2\, d\omega\, d\Omega}{c^3}, \tag{44.37}$$

where the correspondence comes from $k_x = 2\pi n_x/L_x$, so $dk_x = 2\pi/L_x$, and a product over $dk_x$, $dk_y$, $dk_z$. Then we get

$$dE(\vec k) = \hbar\omega\, \frac{N(\vec k)}{(2\pi)^3}\, \frac{\omega^2\, d\omega\, d\Omega}{c^3}. \tag{44.38}$$

This allows us to rewrite

$$\frac{dP_{\text{abs.}}(\vec k)}{dP_{\text{sp.em.}}(\vec k)} = N(\vec k) = \frac{dE(\omega)}{d\omega\, d\Omega}\, \frac{(2\pi c)^3}{\hbar\omega^3}. \tag{44.39}$$
Consider now a two-level system, with lower level $E_1$ and upper level $E_2$, in equilibrium with the radiation. The probability of finding the system in $E_2$ changes by $dP_2$ via absorption, which takes the atom from the $E_1$ state to the $E_2$ state with rate $dP_{\text{abs.}}$; this change is equal to $dP_{\text{abs.}}$ times the probability $P_1$ of finding the atom in state $E_1$,

$$dP_2 = P_1\, dP_{\text{abs.}} \tag{44.40}$$

On the other hand, the probability of finding the system in $E_1$ increases owing to emission from the state $E_2$, with rate $dP_{\text{em.}}$ times the probability $P_2$ of finding the atom in state $E_2$, so

$$dP_1 = P_2\, dP_{\text{em.}} \tag{44.41}$$

But the emission is either stimulated or spontaneous, and the stimulated rate equals the absorption rate, so

$$dP_{\text{em.}} = dP_{\text{st.em.}} + dP_{\text{sp.em.}} = dP_{\text{abs.}} + dP_{\text{sp.em.}}, \tag{44.42}$$

so

$$dP_1 = P_2\, (dP_{\text{abs.}} + dP_{\text{sp.em.}}). \tag{44.43}$$

But, at equilibrium (and since we only have two levels for the atom in this picture),

$$dP_2 = dP_1 \;\Rightarrow\; P_1\, dP_{\text{abs.}} = P_2\, (dP_{\text{abs.}} + dP_{\text{sp.em.}}). \tag{44.44}$$

Then we obtain

$$\frac{dP_{\text{abs.}}}{dP_{\text{sp.em.}}} = \frac{P_2}{P_1 - P_2} = \frac{1}{P_1/P_2 - 1}. \tag{44.45}$$
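However, at equilibrium, for a temperature $T$, the Boltzmann distribution of the atomic states is

$$P_2 = \frac{1}{Z} e^{-E_2/k_B T}, \qquad P_1 = \frac{1}{Z} e^{-E_1/k_B T}, \tag{44.46}$$

so we obtain

$$\frac{dE(\omega)}{d\omega\, d\Omega}\, \frac{(2\pi c)^3}{\hbar\omega^3} = \frac{dP_{\text{abs.}}}{dP_{\text{sp.em.}}} = \frac{1}{e^{(E_2 - E_1)/k_B T} - 1}. \tag{44.47}$$

Finally, after integrating over $d\Omega$ and getting $4\pi$, we have

$$\frac{dE(\omega)}{d\omega} = 4\pi\, \frac{\hbar\omega^3}{(2\pi c)^3}\, \frac{1}{e^{(E_2 - E_1)/k_B T} - 1}. \tag{44.48}$$

Alternatively, using the frequency $\nu = \omega/(2\pi)$, given that $E_2 - E_1 = \hbar\omega$, and having an extra factor of 2 from the two polarizations of light, we obtain the Planck formula

$$\frac{dE}{d\nu} = \frac{8\pi\nu^2}{c^3}\, \frac{h\nu}{\exp\left(\frac{h\nu}{k_B T}\right) - 1}. \tag{44.49}$$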
Important Concepts to Remember

• For a two-level system with a time-dependent interaction between states, $H_{12} = V(t)$, specifically for the harmonic $V(t) = V_0 e^{i\omega t}$, we find the Rabi formula, with an oscillating probability of being in the upper state, $P_2(t) = |c_2(t)|^2 \propto \sin^2(\Omega t/2)$, and a Lorentzian-shaped amplitude with a maximum at resonance, $\omega = \omega_0 = (E_2 - E_1)/\hbar$.
• A laser refers to radiation in a resonant cavity, with two-level atoms and coherent amplification at resonance, which is then led out of the cavity as the laser.
• From the quantum mechanics of the atoms and Fermi’s golden rule we obtain stimulated emission and absorption, whereas when quantizing the radiation, the previous results are proportional to the number of photons, but we also obtain the possibility of spontaneous emission, which is not.
• The Planck formula is obtained from the energy balance for emission and absorption of photons, and the Boltzmann distribution for the atoms.
Further Reading

Read more about the interaction between matter and electromagnetism in [1] and [2].
Exercises

(1) Show that (44.11) is the solution to the Bloch equations (44.8).
(2) Show that $P(t)$, $Q(t)$ in (44.13) are the corresponding solutions to the Bloch equations.
(3) Consider an exponentially decaying and oscillating potential, $\hat V(t) = (\hat V_0 e^{i\omega t} + \hat V_0 e^{-i\omega t}) e^{-\gamma t}$, for the interaction with classical radiation. What is the equivalent of Fermi’s golden rule in this case?
(4) Find a limit in which the first-order perturbation theory for a harmonic potential is consistent with the exact, but two-level, calculation of Section 44.1 (giving the same result).
(5) Consider a possible quantum term, of the type $\alpha \vec A^2$, for light–light interaction in the case of quantized radiation. Analyze and interpret the resulting terms in the potential $V(t)$ in terms of photons, as was done in the text for the linear term.
(6) If there are no photons (no radiation field), $N_j(\vec k) = 0$, how do we interpret the rate of spontaneous emission (what is the physical situation it describes, and what are its limitations)?
(7) In deriving the Planck formula, we used the classical Boltzmann distribution for the atoms. Why is this allowed, given that we are considering quantum interactions?
PART IId
SCATTERING THEORY
45
One-Dimensional Scattering, Transfer and S-Matrices
In Part IId, we give an introduction to the rather large field of scattering theory. In this chapter, we revisit one-dimensional scattering in a potential, restating it in a form that can be generalized to three dimensions and introducing various useful quantities along the way, such as transfer and S-matrices. The one-dimensional case is interesting in several ways; an application we will consider at the end of the chapter is to discrete one-dimensional systems, or “spin chains” (where the continuous line is replaced by a series of sites or points).

We start with a short review of the set-up for one-dimensional systems from Chapter 7. The one-dimensional Schrödinger equation is

$$-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x), \tag{45.1}$$

and can be rewritten as

$$\psi''(x) = -(\epsilon - U(x))\psi(x), \tag{45.2}$$

where

$$U(x) \equiv \frac{2mV(x)}{\hbar^2}, \qquad \epsilon \equiv \frac{2mE}{\hbar^2} \equiv k^2. \tag{45.3}$$
In Chapter 7, we presented a more general Wronskian theorem, but, for applications here, we consider two solutions $\psi_1(x)$, $\psi_2(x)$ of the same Schrödinger equation (meaning, with the same potential and energy). Then their Wronskian,

$$W(y_1, y_2) = y_1 y_2' - y_2 y_1', \tag{45.4}$$

is independent of $x$,

$$\frac{dW}{dx} = 0. \tag{45.5}$$

Moreover, $W \neq 0$ implies that the two solutions are linearly independent, in which case an arbitrary solution can be written as a linear combination of the two,

$$\psi(x) = a\psi_1(x) + b\psi_2(x). \tag{45.6}$$

If we give the initial condition data at $x_0$, namely $\psi(x_0)$, $\psi'(x_0)$, then the determinant of the linear equations for $a$ and $b$,

$$\begin{aligned} \psi(x_0) &= a\psi_1(x_0) + b\psi_2(x_0) \\ \psi'(x_0) &= a\psi_1'(x_0) + b\psi_2'(x_0), \end{aligned} \tag{45.7}$$

is $W$.
Figure 45.1: Generic scattering situation.
45.1 Asymptotics and Integral Equations
Having in mind applications to three-dimensional systems, we assume that the potential $V(x)$ is bounded in space (is "local", or has a finite domain), as in Fig. 45.1. More precisely, we impose
\[ \int_{-\infty}^{+\infty} dx\, |V(x)| < \infty. \tag{45.8} \]
Under this assumption, it follows that at $x \to \pm\infty$ we have a free particle solution, meaning a linear combination of $e^{\pm ikx}$, which defines this as a solution with momentum $k$. Then we must have
\[ \psi(x \to \pm\infty) = a_\pm e^{ikx} + b_\pm e^{-ikx} \;\Rightarrow\; \psi'(x \to \pm\infty) = a_\pm ik\, e^{ikx} - b_\pm ik\, e^{-ikx}. \tag{45.9} \]
Generalizing this behavior to the domain of the potential (the "inside") means that $a$ and $b$ must become functions,
\[ \begin{aligned} \psi(x) &= a(x)e^{ikx} + b(x)e^{-ikx} \\ \psi'(x) &= a(x)\,ik\,e^{ikx} - b(x)\,ik\,e^{-ikx}. \end{aligned} \tag{45.10} \]
This ansatz is by definition good, since we are writing two arbitrary functions $\psi(x)$ and $\psi'(x)$ in terms of two other arbitrary functions $a(x)$ and $b(x)$. The Schrödinger equation (a second-order differential equation) is equivalent to two first-order differential equations,
\[ \begin{aligned} \frac{d}{dx}\psi(x) &= \psi'(x) \\ \frac{d}{dx}\psi'(x) &= (U(x) - k^2)\psi(x). \end{aligned} \tag{45.11} \]
Substituting the ansatz for $\psi(x)$ and $\psi'(x)$ into the first equation above, we obtain
\[ a'(x)e^{ikx} + b'(x)e^{-ikx} = 0, \tag{45.12} \]
while substituting it into the second equation, we obtain
\[ a'(x)\,ik\,e^{ikx} - b'(x)\,ik\,e^{-ikx} = U(x)\psi(x). \tag{45.13} \]
In this way we have obtained two equations for the unknowns $a'(x)$ and $b'(x)$. Adding the first equation to the second divided by $ik$, we obtain
\[ a'(x) = \frac{U(x)\psi(x)}{2ik}\, e^{-ikx} = \frac{U(x)}{2ik}\left[a(x) + b(x)e^{-2ikx}\right]. \tag{45.14} \]
Replacing the sum with the difference, we obtain
\[ b'(x) = -\frac{U(x)\psi(x)}{2ik}\, e^{ikx} = -\frac{U(x)}{2ik}\left[b(x) + a(x)e^{2ikx}\right]. \tag{45.15} \]
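We define $\psi_1(x)$ as the solution that has only the term $e^{ikx}$ at $-\infty$, so $a(-\infty) = 1$, $b(-\infty) = 0$. Then integrating the equations for $a'(x)$, $b'(x)$ from $-\infty$ to $x$, we obtain
\[ \begin{aligned} a(x) &= 1 + \frac{1}{2ik}\int_{-\infty}^{x} dx'\, e^{-ikx'} U(x')\psi_1(x') \\ b(x) &= -\frac{1}{2ik}\int_{-\infty}^{x} dx'\, e^{ikx'} U(x')\psi_1(x'). \end{aligned} \tag{45.16} \]
Substituting these into (45.10), we obtain
\[ \begin{aligned} \psi_1(x) &= e^{ikx} + \frac{1}{2ik}\int_{-\infty}^{x} dx' \left[e^{ik(x-x')} - e^{-ik(x-x')}\right] U(x')\psi_1(x') \\ &= e^{ikx} + \frac{1}{k}\int_{-\infty}^{x} dx'\, \sin[k(x-x')]\, U(x')\psi_1(x') \\ \psi_1'(x) &= ik\,e^{ikx} + \frac{1}{2}\int_{-\infty}^{x} dx' \left[e^{ik(x-x')} + e^{-ik(x-x')}\right] U(x')\psi_1(x') \\ &= ik\,e^{ikx} + \int_{-\infty}^{x} dx'\, \cos[k(x-x')]\, U(x')\psi_1(x'). \end{aligned} \tag{45.17} \]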
We have obtained an integral equation for $\psi_1(x)$. It can be solved iteratively, as an infinite perturbative sum,
\[ \psi_1(x) = \sum_{n=0}^{\infty} \psi_1^{(n)}(x), \tag{45.18} \]
where the zeroth-order term is a free wave,
\[ \psi_1^{(0)}(x) = e^{ikx}. \tag{45.19} \]
Then, we substitute it into the right-hand side of the equation, thus defining the first-order term, $\psi_1^{(1)}(x)$, then repeat the procedure to find $\psi_1^{(2)}$, etc. In general, then,
\[ \psi_1^{(n+1)}(x) = \frac{1}{k}\int_{-\infty}^{x} dx'\, \sin[k(x-x')]\, U(x')\psi_1^{(n)}(x'). \tag{45.20} \]
One can show that the series is uniformly convergent (though we will not give the proof here) if
\[ \frac{1}{k}\int_{-\infty}^{+\infty} dx'\, V(x') < 1. \tag{45.21} \]
But since the integral is finite by assumption, the relation is true for a sufficiently high $k$, $k > k_0$, with $k_0$ the integral in (45.21). Conversely, though, it is not true for sufficiently small $k$, $k < k_0$.

We can make a similar analysis for the solution $\psi_2(x)$ that is pure $e^{ikx}$ at $+\infty$, so that $a(+\infty) = 1$, $b(+\infty) = 0$. In this case, integrating the differential equations for $a'(x)$ and $b'(x)$, we obtain
\[ \begin{aligned} a(x) &= 1 - \frac{1}{2ik}\int_{x}^{+\infty} dx'\, e^{-ikx'} U(x')\psi_2(x') \\ b(x) &= +\frac{1}{2ik}\int_{x}^{+\infty} dx'\, e^{ikx'} U(x')\psi_2(x'). \end{aligned} \tag{45.22} \]
Note the change of sign in front of the integral, since now $x$ is the lower limit instead of the upper limit. Substituting these expressions for $a(x)$, $b(x)$ into (45.10), we obtain
\[ \begin{aligned} \psi_2(x) &= e^{ikx} - \frac{1}{2ik}\int_{x}^{+\infty} dx' \left[e^{ik(x-x')} - e^{-ik(x-x')}\right] U(x')\psi_2(x') \\ &= e^{ikx} - \frac{1}{k}\int_{x}^{+\infty} dx'\, \sin[k(x-x')]\, U(x')\psi_2(x') \\ \psi_2'(x) &= ik\,e^{ikx} - \frac{1}{2}\int_{x}^{+\infty} dx' \left[e^{ik(x-x')} + e^{-ik(x-x')}\right] U(x')\psi_2(x') \\ &= ik\,e^{ikx} - \int_{x}^{+\infty} dx'\, \cos[k(x-x')]\, U(x')\psi_2(x'). \end{aligned} \tag{45.23} \]
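The iteration (45.18)–(45.20) can be tried out numerically. The following is a minimal sketch, not the book's method: it assumes a hypothetical Gaussian profile $U(x) = 0.5\,e^{-x^2}$, a uniform grid, and a simple Riemann sum for the integral (the helper names `born_series` and `asymptotic_coefficients` are illustrative). It sums the iterates, checks that the sum satisfies the integral equation (45.17), and reads off $a(+\infty)$, $b(+\infty)$ from (45.16).

```python
import numpy as np

def born_series(U, x, k, n_terms=30):
    """Sum the iterates psi_1^(n) of (45.20) on a uniform grid x."""
    dx = x[1] - x[0]
    # kernel[i, j] = sin[k (x_i - x_j)] / k * dx for x_j <= x_i, and 0 otherwise
    kernel = np.tril(np.sin(k * (x[:, None] - x[None, :])) / k) * dx
    term = np.exp(1j * k * x)            # zeroth-order free wave (45.19)
    psi = term.copy()
    for _ in range(n_terms):
        term = kernel @ (U * term)       # next order from the previous one
        psi += term
    return psi, kernel

def asymptotic_coefficients(psi, U, x, k):
    """a(+inf), b(+inf) from (45.16), with the upper limit sent to +infinity."""
    dx = x[1] - x[0]
    a = 1 + np.sum(np.exp(-1j * k * x) * U * psi) * dx / (2j * k)
    b = -np.sum(np.exp(1j * k * x) * U * psi) * dx / (2j * k)
    return a, b

x = np.linspace(-6.0, 6.0, 1201)
k = 2.0
U = 0.5 * np.exp(-x**2)                  # assumed smooth, localized U(x)
psi1, kernel = born_series(U, x, k)
alpha, beta = asymptotic_coefficients(psi1, U, x, k)

# Residual of the integral equation (45.17) for the summed series
residual = np.max(np.abs(psi1 - np.exp(1j * k * x) - kernel @ (U * psi1)))
print("residual:", residual)
print("|a|^2 - |b|^2 at +infinity:", abs(alpha)**2 - abs(beta)**2)
```

For this choice the left-hand side of (45.21), evaluated with $U$, is about 0.44, so the series converges quickly; the printed combination $|a|^2 - |b|^2$ should be close to 1, up to discretization error, as expected from the constancy of the Wronskian.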
45.2 Green's Functions
We can rewrite the above integral equations as
\[ \begin{aligned} \psi_1(x) &= e^{ikx} + \int_{-\infty}^{+\infty} dx'\, G_1(x,x')\,U(x')\psi_1(x') \\ \psi_2(x) &= e^{ikx} + \int_{-\infty}^{+\infty} dx'\, G_2(x,x')\,U(x')\psi_2(x'), \end{aligned} \tag{45.24} \]
where we have defined the Green's functions
\[ \begin{aligned} G_1(x,x') &= \frac{\sin[k(x-x')]}{k}\,\theta(x-x') \\ G_2(x,x') &= -\frac{\sin[k(x-x')]}{k}\,\theta(x'-x), \end{aligned} \tag{45.25} \]
where $\theta(x)$ is the Heaviside function. As a reminder, $\theta(x) = 1$ for $x \geq 0$ and $\theta(x) = 0$ for $x < 0$. Also, we remember that $(d/dx)\theta(x) = \delta(x)$ as a distribution. The minus sign in $G_2$ matches the sign of the integral in (45.23). These Green's functions have a number of properties:
• $G_1$, $G_2$ are continuous and equal to zero at $x = x'$, since there the prefactor of the Heaviside function vanishes, $\sin[k(x-x')] = 0$.
• The derivatives of $G_1$, $G_2$ are discontinuous, however. Indeed,
\[ \begin{aligned} \partial_x G_1(x,x') &= \cos[k(x-x')]\,\theta(x-x') + \frac{\sin[k(x-x')]}{k}\,\theta'(x-x') \\ &= \cos[k(x-x')]\,\theta(x-x'). \end{aligned} \tag{45.26} \]
In the second equality above we have used that $\theta'(x-x') = \delta(x-x')$, which is zero everywhere except at $x = x'$; however, there the prefactor vanishes, so the whole second term does too (we explained this in Chapter 2). Then we also find
\[ \partial_x G_2(x,x') = -\cos[k(x-x')]\,\theta(x'-x). \tag{45.27} \]
• These $G_1$, $G_2$ are indeed Green's functions for the free Schrödinger equation. Indeed, calculating the second derivative, we obtain
\[ \begin{aligned} \partial_x^2 G_1(x,x') &= -k\sin[k(x-x')]\,\theta(x-x') + \cos[k(x-x')]\,\theta'(x-x') \\ &= -k^2 G_1(x,x') + \delta(x-x'). \end{aligned} \tag{45.28} \]
Here we have used the fact that we can replace the function multiplying $\theta'(x-x') = \delta(x-x')$ with its value at $x = x'$, which is 1 in this case. Then it follows that $G_1(x,x')$ solves the Green's function equation for the Schrödinger operator in the free case,
\[ (\partial_x^2 + k^2)\,G_1(x,x') = \delta(x-x'). \tag{45.29} \]
We can rewrite the full Schrödinger equation as
\[ \left(\frac{d^2}{dx^2} + k^2\right)\psi(x) = U(x)\psi(x) \equiv f(x), \tag{45.30} \]
and if we consider $f(x)$ as a given function (which, in fact, it is not, as we want to solve for $\psi(x)$), we can formally solve for $\psi(x)$ by convolving the Green's function with $f$. Indeed, multiplying the Green's function equation by $f(x')$ and integrating over $x'$ between $-\infty$ and $x$, we obtain
\[ \int_{-\infty}^{x} dx'\, f(x')\left(\frac{d^2}{dx^2} + k^2\right)G_1(x,x') = \int_{-\infty}^{x} dx'\, \delta(x-x')\,f(x') = f(x), \tag{45.31} \]
or more generally
\[ \left(\frac{d^2}{dx^2} + k^2\right)\left[\int_{-\infty}^{x} dx'\, G(x,x')\,f(x') + \psi_0(x)\right] = f(x), \tag{45.32} \]
where $\psi_0(x)$ is a general solution of the free (homogeneous) equation, $(d^2/dx^2 + k^2)\psi_0(x) = 0$. Then it follows that the formal solution is
\[ \psi(x) = \psi_0(x) + \int_{-\infty}^{x} dx'\, G_1(x,x')\,f(x'). \tag{45.33} \]
However, as we noted, in reality $f(x)$ is not given but is equal to $U(x)\psi(x)$. Replacing it in the formal solution above, we actually obtain the integral equations. We note also that a general Green's function is a sum of the above special Green's function $G_1(x,x')$ and a general solution $g(x,x')$ of the homogeneous equation
\[ \left(\frac{d^2}{dx^2} + k^2\right)g(x,x') = 0, \tag{45.34} \]
so that
\[ \tilde G_1(x,x') = G_1(x,x') + g(x,x'). \tag{45.35} \]
45.3 Relations between Abstract States and Lippmann–Schwinger Equation
Until now, we have worked in the $x$ (space) representation. But we can use an abstract approach as well, without choosing a representation. In fact, the formal solution above can be written as
\[ |\psi\rangle = |\psi_0\rangle + \hat G_0 |f\rangle, \tag{45.36} \]
where $\hat G_0$ stands in for $G_1(x,x')$ without introducing a given representation, and the index 0 is there to show that the Green's operator is for the free Schrödinger operator $\hat D_0$, equal to $d^2/dx^2 + k^2$ in the $x$ representation. In other words, we are solving
\[ \hat D_0 |\psi\rangle = |f\rangle = \hat U |\psi\rangle. \tag{45.37} \]
Equivalently, by multiplying with $\hbar^2/(2m)$, we can solve
\[ (-\hat H_0 + E)|\psi\rangle = \hat V |\psi\rangle, \tag{45.38} \]
where $\hat H_0$ is the Hamiltonian of the free particle. Then the Green's function equation is
\[ (E - \hat H_0)\,\hat G_0 = \frac{\hbar^2}{2m}\,\mathbb{1}, \tag{45.39} \]
where $\mathbb{1}$ is the identity operator, so that $\langle x|\mathbb{1}|x'\rangle = \delta(x-x')$. Finally, this means that we can solve the Green's function equation formally as
\[ \hat G_0 = \frac{\hbar^2}{2m}\,\frac{1}{E - \hat H_0}. \tag{45.40} \]
However, we see that this operator is not well defined when acting on eigenvectors $|k\rangle$ of the free Hamiltonian, since $(E - \hat H_0)|k\rangle = 0$. To make it well defined, we must make the energy slightly complex, $E \to E + i\epsilon$ (we can also add $-i\epsilon$ instead, but that gives another Green's function, not so useful), so that we obtain
\[ \hat G_0 = \lim_{\epsilon \to 0} \frac{\hbar^2}{2m}\,\frac{1}{E + i\epsilon - \hat H_0}. \tag{45.41} \]
The integral equation is written formally as
\[ |\psi\rangle = |k\rangle + \hat G_0 \hat V |\psi\rangle = |k\rangle + \frac{1}{E + i\epsilon - \hat H_0}\,\hat V |\psi\rangle, \tag{45.42} \]
and is known as the Lippmann–Schwinger equation. To see that this solves the Schrödinger equation formally, we multiply by $E - \hat H_0$, obtaining (neglecting $i\epsilon$ for simplicity)
\[ (E - \hat H_0)|\psi\rangle = (E - \hat H_0)|k\rangle + \hat V|\psi\rangle = \hat V|\psi\rangle \;\Rightarrow\; (\hat H_0 + \hat V)|\psi\rangle = E|\psi\rangle. \tag{45.43} \]
An iterative solution is obtained as before, by substituting the $n$th-order solution into the right-hand side of the Lippmann–Schwinger equation, to obtain the $(n+1)$th-order solution,
\[ \begin{aligned} |\psi\rangle &= |k\rangle + \hat G_0 \hat V|k\rangle + (\hat G_0 \hat V)^2|k\rangle + \cdots + (\hat G_0 \hat V)^n|k\rangle + \cdots \\ &= |\psi^{(0)}\rangle + |\psi^{(1)}\rangle + |\psi^{(2)}\rangle + \cdots + |\psi^{(n)}\rangle + \cdots. \end{aligned} \tag{45.44} \]
We can obtain this same perturbative solution in a different way. Substitute $|\psi\rangle = |k\rangle + |\phi\rangle$ into the full Schrödinger equation $(E - \hat H_0 - \hat V)|\psi\rangle = 0$. Using the free Schrödinger equation, we obtain
\[ (E - \hat H_0 - \hat V)|\phi\rangle = \hat V|k\rangle. \tag{45.45} \]
This is solved by
\[ \begin{aligned} |\phi\rangle &= \frac{1}{E - \hat H_0 - \hat V}\,\hat V|k\rangle \\ &= (\hat G_0 + \hat G_0 \hat V \hat G_0 + \hat G_0 \hat V \hat G_0 \hat V \hat G_0 + \cdots)\,\hat V|k\rangle, \end{aligned} \tag{45.46} \]
where we have defined
\[ \hat G = \frac{1}{E - \hat H_0 - \hat V} = \frac{1}{\hat G_0^{-1} - \hat V}, \tag{45.47} \]
and used the following relation for two arbitrary operators $\hat A$ and $\hat B$,
\[ \frac{1}{\hat A - \hat B} = \frac{1}{\hat A} + \frac{1}{\hat A}\hat B\frac{1}{\hat A} + \frac{1}{\hat A}\hat B\frac{1}{\hat A}\hat B\frac{1}{\hat A} + \cdots, \tag{45.48} \]
which we leave as a simple exercise to prove. We note that, indeed, we have obtained the same perturbative solution of the full Schrödinger equation. We can now define the operator $\hat T$ by factoring out $\hat G_0$ on the right-hand side of the perturbative solution, obtaining
\[ \hat G_0(\hat V + \hat V \hat G_0 \hat V + \hat V \hat G_0 \hat V \hat G_0 \hat V + \cdots) \equiv \hat G_0 \hat T. \tag{45.49} \]
But then we notice that the operator $\hat T$ appears again inside its own definition, as
\[ \hat T = \hat V + \hat V \hat G_0(\hat V + \hat V \hat G_0 \hat V + \cdots) = \hat V + \hat V \hat G_0 \hat T. \tag{45.50} \]
This equation for $\hat T$ is in fact an equivalent form of the Lippmann–Schwinger equation.
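The operator relations (45.47)–(45.50) can be checked with finite matrices standing in for the operators. The sketch below is only an illustration under stated assumptions: `H0` and `V` are hypothetical toy matrices (a diagonal "free Hamiltonian" and a small random symmetric "potential"), the factor $\hbar^2/2m$ is set to 1 so that $\hat G_0 = (E - \hat H_0)^{-1}$ as in (45.47), and the energy is chosen away from the spectrum so that no $i\epsilon$ is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
H0 = np.diag(np.arange(1.0, n + 1))       # toy "free Hamiltonian", spectrum 1..n
V = 0.02 * rng.standard_normal((n, n))
V = (V + V.T) / 2                          # small real symmetric toy "potential"
E = 0.3                                    # energy away from the spectrum of H0
identity = np.eye(n)

G0 = np.linalg.inv(E * identity - H0)      # free Green's operator
G = np.linalg.inv(E * identity - H0 - V)   # full Green's operator, as in (45.47)

def partial_sums(first, step, n_terms=80):
    """Return first + first@step + first@step@step + ... (n_terms terms)."""
    total = np.zeros_like(first)
    term = first.copy()
    for _ in range(n_terms):
        total += term
        term = term @ step
    return total

G_series = partial_sums(G0, V @ G0)        # G0 + G0 V G0 + ...      (45.48)
T = partial_sums(V, G0 @ V)                # V + V G0 V + ...        (45.49)

print(np.allclose(G_series, G))            # series reproduces the full G
print(np.allclose(T, V + V @ G0 @ T))      # T = V + V G0 T          (45.50)
print(np.allclose(G @ V, G0 @ T))          # G V = G0 T
```

The series converges here because the toy potential is small compared with the distance of `E` from the spectrum of `H0`; for a larger `V` the partial sums need not converge even though `G` itself still exists.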
45.4 Physical Interpretation of Scattering Solution
We have defined $\psi_1(x)$ as constituting a pure solution at $x \to -\infty$,
\[ \psi_1(x) \sim e^{ikx} \;\Rightarrow\; \psi_1^*(x) \sim e^{-ikx}, \tag{45.51} \]
while $\psi_2(x)$ has a pure solution at $x \to +\infty$,
\[ \psi_2(x) \sim e^{ikx} \;\Rightarrow\; \psi_2^*(x) \sim e^{-ikx}. \tag{45.52} \]
Thus at $x \to -\infty$ the two solutions $\psi_1(x)$ and $\psi_1^*(x)$ are linearly independent, with Wronskian
\[ W_1(\psi_1, \psi_1^*) = \psi_1^* \psi_1' - \psi_1 \psi_1^{*\prime} = 2ik \neq 0, \tag{45.53} \]
so that $(\psi_1(x), \psi_1^*(x))$ form a basis in the space of eigenfunctions near $-\infty$. Similarly, at $x \to +\infty$, the two solutions $\psi_2(x)$ and $\psi_2^*(x)$ are linearly independent, with Wronskian
\[ W_2(\psi_2, \psi_2^*) = \psi_2^* \psi_2' - \psi_2 \psi_2^{*\prime} = 2ik \neq 0, \tag{45.54} \]
meaning that $(\psi_2(x), \psi_2^*(x))$ is a basis in the space of eigenfunctions near $+\infty$. Therefore we can expand the basis elements near $-\infty$ in terms of the basis elements near $+\infty$,
\[ \psi_1(x) = \alpha(k)\psi_2(x) + \beta(k)\psi_2^*(x) \;\Rightarrow\; \psi_1^*(x) = \beta^*(k)\psi_2(x) + \alpha^*(k)\psi_2^*(x). \tag{45.55} \]
Applying this relation near $+\infty$, where $\psi_2(x) \sim e^{ikx}$, we obtain
\[ \begin{aligned} \psi_1(x \to +\infty) &\to \alpha(k)e^{ikx} + \beta(k)e^{-ikx} \\ \psi_1^*(x \to +\infty) &\to \beta^*(k)e^{ikx} + \alpha^*(k)e^{-ikx}. \end{aligned} \tag{45.56} \]
Substituting the relation between basis elements into the Wronskian $W_1$, we obtain
\[ W_1 = \psi_1^* \psi_1' - \psi_1 \psi_1^{*\prime} = \left[|\alpha(k)|^2 - |\beta(k)|^2\right](\psi_2^* \psi_2' - \psi_2 \psi_2^{*\prime}) = \left[|\alpha(k)|^2 - |\beta(k)|^2\right] W_2. \tag{45.57} \]
However, since $W_1$ and $W_2$ are independent of $x$, and we have calculated that $W_1 = 2ik = W_2$, it follows that
\[ |\alpha(k)|^2 - |\beta(k)|^2 = 1. \tag{45.58} \]
A generic eigenfunction can be expanded in the basis at $-\infty$,
\[ \psi(x) = a\psi_1(x) + b\psi_1^*(x), \tag{45.59} \]
which means that near $x \to -\infty$
\[ \psi(x) \to a e^{ikx} + b e^{-ikx}. \tag{45.60} \]
We can then use the transformation law between bases, and apply it near $x \to +\infty$, to obtain
\[ \psi(x) \to (a\alpha(k) + b\beta^*(k))e^{+ikx} + (a\beta(k) + b\alpha^*(k))e^{-ikx} \equiv a' e^{+ikx} + b' e^{-ikx}. \tag{45.61} \]
It follows that in effect we are acting with a matrix on the state $(a, b)$, namely
\[ \begin{pmatrix} a \\ b \end{pmatrix} \to \begin{pmatrix} a' \\ b' \end{pmatrix} = K \begin{pmatrix} a \\ b \end{pmatrix}, \tag{45.62} \]
where we define the transfer matrix $K$:
\[ K \equiv \begin{pmatrix} \alpha(k) & \beta^*(k) \\ \beta(k) & \alpha^*(k) \end{pmatrix}. \tag{45.63} \]
Outside the domain of the potential, the time-dependent solution of the Schrödinger equation is
\[ \psi_k(x,t) = \psi_k(x)\,e^{-iEt}, \tag{45.64} \]
and it corresponds to a propagating wave. For $\psi_k(x) \sim e^{ikx}$, we have a wave propagating to the right, while for $\psi_k(x) \sim e^{-ikx}$, we have a wave propagating to the left. We can now connect with the analysis in Chapter 7. Indeed, there is a solution
\[ \psi_+(x) = \psi_1(x) - \frac{\beta}{\alpha^*}\,\psi_1^*(x), \tag{45.65} \]
that behaves as
\[ \psi_+(x) \sim \frac{1}{\alpha^*}\,e^{ikx} \tag{45.66} \]
at $x \to +\infty$, and as
\[ \psi_+(x) \sim e^{ikx} - \frac{\beta}{\alpha^*}\,e^{-ikx} \tag{45.67} \]
at $x \to -\infty$. This is a wave propagating from the left ($\propto e^{ikx}$) which, after encountering the potential domain, has a reflected part $\propto e^{-ikx}$ and a transmitted part $\propto e^{ikx}$. Similarly, there is a solution
\[ \psi_-(x) = \frac{1}{\alpha^*}\,\psi_1^*(x) \tag{45.68} \]
that behaves as
\[ \psi_-(x) \sim \frac{1}{\alpha^*}\,e^{-ikx} \tag{45.69} \]
at $x \to -\infty$, and as
\[ \psi_-(x) \sim \frac{\beta^*}{\alpha^*}\,e^{ikx} + e^{-ikx} \tag{45.70} \]
at $x \to +\infty$. This is a wave propagating from the right ($\propto e^{-ikx}$) which, after encountering the potential domain, has a reflected part $\propto e^{ikx}$ and a transmitted part $\propto e^{-ikx}$. We define a wave as being an "in" wave if it is going towards the potential, and as an "out" wave if it is going away from the potential.
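45.5 S-Matrix and T-Matrix
Define abstract states $|k\pm\rangle$ by
\[ e^{ikx} = \langle x|k+\rangle, \qquad e^{-ikx} = \langle x|k-\rangle. \tag{45.71} \]
Then the transmitted ($t$) and reflected ($r$) coefficients in the waves are
\[ t_+ = \frac{1}{\alpha^*}, \quad r_+ = -\frac{\beta}{\alpha^*}, \qquad t_- = \frac{1}{\alpha^*}, \quad r_- = \frac{\beta^*}{\alpha^*}, \tag{45.72} \]
which are normalized correctly,
\[ |t_+|^2 + |r_+|^2 = |t_-|^2 + |r_-|^2 = 1. \tag{45.73} \]
Then the free "in" states turn into a combination of free "out" states, as follows:
\[ \begin{aligned} |k+\rangle_{\rm in} &\to t_+ |k+\rangle_{\rm out} + r_+ |k-\rangle_{\rm out} \\ |k-\rangle_{\rm in} &\to t_- |k-\rangle_{\rm out} + r_- |k+\rangle_{\rm out}. \end{aligned} \tag{45.74} \]
This can be written as a matrix action,
\[ \begin{pmatrix} |k+\rangle_{\rm in} \\ |k-\rangle_{\rm in} \end{pmatrix} \to \hat S \begin{pmatrix} |k+\rangle_{\rm out} \\ |k-\rangle_{\rm out} \end{pmatrix} = \begin{pmatrix} t_+ & r_+ \\ r_- & t_- \end{pmatrix} \begin{pmatrix} |k+\rangle_{\rm out} \\ |k-\rangle_{\rm out} \end{pmatrix}, \tag{45.75} \]
where we have defined the matrix
\[ \hat S = \begin{pmatrix} \langle k+|\hat S|k+\rangle & \langle k-|\hat S|k+\rangle \\ \langle k+|\hat S|k-\rangle & \langle k-|\hat S|k-\rangle \end{pmatrix} = \begin{pmatrix} t_+ & r_+ \\ r_- & t_- \end{pmatrix} = \begin{pmatrix} 1/\alpha^* & -\beta/\alpha^* \\ \beta^*/\alpha^* & 1/\alpha^* \end{pmatrix}, \tag{45.76} \]
called the S-matrix. It is a unitary matrix, since, as we can easily check using (45.58),
\[ \hat S \hat S^\dagger = \mathbb{1}. \tag{45.77} \]
Thus we can write it in terms of a Hermitian ("real") matrix $\hat\Delta$, as
\[ \hat S = e^{i\hat\Delta}. \tag{45.78} \]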
Equivalently, we can define an S-operator as the evolution operator between $-\infty$ and $+\infty$,
\[ \hat S = \hat U(+\infty, -\infty) = \lim_{t_2 \to +\infty,\ t_1 \to -\infty} \hat U(t_2, t_1), \tag{45.79} \]
and the S-matrix as the matrix whose elements are the matrix elements of the $\hat S$ operator between the free states $|k\pm\rangle$. This definition also gives a transition between in states (at $t \to -\infty$) and out states (at $t \to +\infty$).

We can define the T-matrix from the previously defined T-operator $\hat T$. The perturbative solution of the Schrödinger equation was written as
\[ |\psi\rangle = |k\rangle + |\phi\rangle = |k\rangle + \hat G_0 \hat T|k\rangle = (1 + \hat G_0 \hat T)|k\rangle. \tag{45.80} \]
Acting with $\hat V$ from the left, we obtain
\[ \hat V|\psi\rangle = (\hat V + \hat V \hat G_0 \hat T)|k\rangle = \hat T|k\rangle, \tag{45.81} \]
where in the last equality we have used the Lippmann–Schwinger equation. This then gives an alternative form of the Lippmann–Schwinger equation,
\[ |\psi\rangle = |k\rangle + \hat G_0 \hat V|\psi\rangle. \tag{45.82} \]
Moreover, considering the two $|k\pm\rangle$ free-wave solutions, we find
\[ |\psi\pm\rangle = |k\pm\rangle + \hat G_0 \hat V|\psi\pm\rangle = |k\pm\rangle + \hat G_0 \hat T|k\pm\rangle. \tag{45.83} \]
This equation also defines an S-matrix, though one in which we have separated an identity term $\mathbb{1}$, giving the first, free-wave, part of the state.
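As a concrete check of the transfer matrix and S-matrix relations, consider a delta-function profile $U(x) = g\,\delta(x)$, an example that is not treated in the text and is assumed here only for illustration. Since $\psi_1(x) = e^{ikx}$ for $x < 0$ and $\psi_1$ is continuous, (45.16) gives $\alpha = a(+\infty) = 1 + g/(2ik)$ and $\beta = b(+\infty) = -g/(2ik)$. The sketch below builds $K$ from (45.63) and an S-matrix with rows $(t_+, r_+)$ and $(r_-, t_-)$, an ordering that is itself an assumption of this sketch, and checks (45.58), (45.73) and unitarity numerically.

```python
import numpy as np

def delta_coefficients(g, k):
    """alpha, beta for the assumed profile U(x) = g * delta(x), from (45.16)."""
    alpha = 1 + g / (2j * k)
    beta = -g / (2j * k)
    return alpha, beta

def transfer_matrix(alpha, beta):
    """Transfer matrix K of (45.63)."""
    return np.array([[alpha, np.conj(beta)],
                     [beta, np.conj(alpha)]])

def s_matrix(alpha, beta):
    """S-matrix built from the coefficients (45.72); rows (t+, r+) and (r-, t-)."""
    t_plus = 1 / np.conj(alpha)
    r_plus = -beta / np.conj(alpha)
    t_minus = 1 / np.conj(alpha)
    r_minus = np.conj(beta) / np.conj(alpha)
    return np.array([[t_plus, r_plus],
                     [r_minus, t_minus]])

g, k = 1.7, 0.9                      # arbitrary illustrative values
alpha, beta = delta_coefficients(g, k)
K = transfer_matrix(alpha, beta)
S = s_matrix(alpha, beta)

print(abs(alpha)**2 - abs(beta)**2)              # (45.58): should be 1
print(abs(S[0, 0])**2 + abs(S[0, 1])**2)         # (45.73): |t+|^2 + |r+|^2 = 1
print(np.allclose(S @ S.conj().T, np.eye(2)))    # (45.77): unitarity
```

The determinant of `K` equals $|\alpha|^2 - |\beta|^2$, so `np.linalg.det(K)` gives an independent check of (45.58).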
45.6 Application: Spin Chains
As an important application of the one-dimensional scattering formalism, consider a useful one-dimensional system: on a discrete line, made up of a set of points ("sites") $i = 1, 2, \ldots, L$, each site has a "spin" variable. In the simplest case, of $s = 1/2$, the spin variable can have two states, $|\!\uparrow\rangle$ and $|\!\downarrow\rangle$. We can define the state $|x_1, x_2, \ldots, x_k\rangle$ as the state with $|\!\uparrow\rangle$ at sites $x_1, \ldots, x_k \in \{1, 2, \ldots, L\}$ and with $|\!\downarrow\rangle$ at the rest of the sites; see Fig. 45.2. Further, we can define a state with one excitation, one spin up moving along a sea of spins down (called a "magnon") with momentum $p$,
\[ |\psi(p)\rangle = \sum_{x=1}^{L} e^{ipx}|x\rangle \equiv \sum_{x=1}^{L} \psi_p(x)|x\rangle. \tag{45.84} \]
Figure 45.2: Spin chain.
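The one-magnon state (45.84) can be written out explicitly as a vector of $L$ amplitudes. The following is a small sketch under assumptions that are not taken from the passage above: the chain is closed (site $L+1$ is identified with site 1) and the momentum takes the values $p = 2\pi n/L$. Under these assumptions the state should be an eigenvector of the one-site shift, with eigenvalue $e^{-ip}$.

```python
import numpy as np

def magnon_state(L, p):
    """Amplitudes psi_p(x) = e^{ipx} of (45.84) on sites x = 1..L."""
    sites = np.arange(1, L + 1)
    return np.exp(1j * p * sites)

L, n = 8, 3                         # illustrative chain length and mode number
p = 2 * np.pi * n / L               # assumed quantization on a closed chain
psi = magnon_state(L, p)

# Shift |x> -> |x+1>: the amplitude at site x becomes the old amplitude at x - 1
shifted = np.roll(psi, 1)
print(np.allclose(shifted, np.exp(-1j * p) * psi))
```

For a momentum that is not a multiple of $2\pi/L$ the check fails at the site where the chain closes, which is one way to see why the momentum is quantized on a finite closed chain.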
Similarly, the state with two excitations (two magnons) is defined in terms of a two-particle wave function as |ψ(p1 , p2 ) = ψ(x 1 , x 2 )|x 1 , x 2 . (45.85) 1≤x1 L/2, and an energy E ≤ V0 . Calculate ψ1 (x), ψ2 (x), and the corresponding transfer matrix K. ˆ with Sˆ = eiΔˆ . In the case in exercise 4, calculate the S-matrix and the Hermitian matrix Δ, Substitute the two-magnon ansatz into the Schr¨odinger equation for the Heisenberg spin 1/2 spin chain, and find that the energy of the two-magnon state is just the sum of the energies of the single magnons, and that 2
\[ S(p_1, p_2) = \frac{\phi(p_1) - \phi(p_2) + i}{\phi(p_1) - \phi(p_2) - i}, \qquad \phi(p) \equiv \frac{1}{2}\cot\frac{p}{2}. \tag{45.99} \]
(7) Solve the Bethe ansatz equations for two magnons for the case n = 0 in the quantization condition (45.91), and write down the corresponding explicit wave functions.
46
Three-Dimensional Lippmann–Schwinger Equation, Scattering Amplitudes and Cross Sections
In this chapter, we start the analysis of three-dimensional scattering. We will consider (except in the last chapter of this part) only elastic scattering, meaning the colliding particles do not change their states during the collision. As we have already seen, this can be reduced to scattering off a potential, where the scattering is in terms of the relative motion. We begin with a quick review of motion in a potential.

46.1 Potential for Relative Motion
Since we are considering elastic scattering, it is enough to have only two colliding particles, A and B. In that case, the Hamiltonian for the system is
\[ \hat H = \hat H_A + \hat H_B + \hat V = -\frac{\hbar^2}{2m_A}\Delta_A - \frac{\hbar^2}{2m_B}\Delta_B + \hat V(\vec r_A - \vec r_B). \tag{46.1} \]
Defining the relative position $\vec r = \vec r_A - \vec r_B$, the center of mass position $\vec R$, the relative mass $m = m_A m_B/(m_A + m_B)$, and the total mass $M = m_A + m_B$, we rewrite the above equation as
\[ \hat H = \hat H_{\rm CM} + \hat H_{\rm rel} = -\frac{\hbar^2}{2M}\Delta_R - \frac{\hbar^2}{2m}\Delta_r + V(\vec r). \tag{46.2} \]
Then the center of mass and relative variables separate in the solution. The center of mass Schrödinger equation is trivial, and the relative-motion Schrödinger equation is
\[ \left[-\frac{\hbar^2}{2m}\Delta_r + \hat V(\vec r)\right]\psi = E\psi. \tag{46.3} \]
In the absence of a potential (for instance, when we are far from the domain of the potential), we have a free-wave solution,
\[ \langle \vec r|\vec p\rangle = \phi_{\vec p}(\vec r) = \frac{1}{(2\pi\hbar)^{3/2}}\,e^{i\vec p\cdot\vec r/\hbar}. \tag{46.4} \]
Defining the wave vector $\vec k = \vec p/\hbar$, we can expand the exponential in terms of spherical harmonics $Y_{lm}(\theta, \phi)$,
\[ e^{i\vec k\cdot\vec r} = \sum_{l,m} A_{lm}(\vec n_k)\, j_l(kr)\, Y_{lm}(\vec n_r), \tag{46.5} \]
where the unit vector in the momentum direction is $\vec n_k = \vec k/k$ and in the position direction is $\vec n_r = \vec r/r$, and the coefficients are
\[ A_{lm}(\vec n_k) = 4\pi i^l\, Y_{lm}^*(\vec n_k). \tag{46.6} \]
The position wave function at given $l$, $m$ is then
\[ u_{lm}(k; \vec n_r) = j_l(kr)\, Y_{lm}(\vec n_r). \tag{46.7} \]
46.2 Behavior at Infinity
To describe scattering in a potential, we need to understand first the boundary conditions, just as we did in the one-dimensional case, and in the case of eigenfunctions in a central potential, in Chapters 18 and 19. We saw there that the boundary conditions at $r = \infty$ and $r = 0$ restrict the wave function, in a similar way to that discussed in Chapter 7, in the one-dimensional case.

If we have negative energy, $E < 0$, we saw that we obtain bound states, and that leads to the quantization of the energy levels. Indeed, we have two normalizability conditions, one at $r = 0$ and one at $r = \infty$, for the two independent solutions of the second-order differential equation.

On the other hand, if we have positive energy, $E > 0$, we have asymptotic states (states that "escape" to infinity), and there are no quantization conditions. At $r \to \infty$, both these states, with behavior $rR(r) = \chi(r) \sim e^{\pm ikr}$ (as we saw), are well behaved. As we will see, these two behaviors lead to outgoing and incoming waves, respectively, for scattering states $|\psi\pm\rangle$ (as opposed to bound states, like those in which we were interested in Chapters 18 and 19). In conclusion, then, the boundary conditions at infinity imply the scattering solution that we want.

For the time being we will consider a potential with a finite range, i.e., a potential $V(r)$ (for a spherically symmetric case) that decays faster than $1/r$ at $r \to \infty$. In that case, at infinity, the free particle (which experiences no potential) is a good approximation. The Schrödinger equation there (with $E = \hbar^2 k^2/(2m)$) is
\[ \frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2}\Delta_{\theta,\phi}\psi + k^2\psi = 0. \tag{46.8} \]
It can be rewritten as
\[ \frac{\partial^2}{\partial r^2}(r\psi) + \frac{1}{r}\Delta_{\theta,\phi}\psi + k^2(r\psi) = 0. \tag{46.9} \]
But, since we are at $r \to \infty$, the middle term is $O(1/r)$, so it goes to zero when compared with the other two terms and can be neglected. Then the equation at infinity is
\[ \frac{\partial^2}{\partial r^2}(r\psi) + k^2(r\psi) = 0, \tag{46.10} \]
which has the solutions
\[ r\psi = Ae^{\pm ikr}. \tag{46.11} \]
To be more precise, the prefactors $A$ also have some dependence,
\[ r\psi = A(k, \pm\vec n_r)\,e^{\pm ikr}, \tag{46.12} \]
which means that we have an "out" (outgoing) solution
\[ \psi_1 = A(k, \vec n_r)\,\frac{e^{ikr}}{r} \tag{46.13} \]
and an "in" (incoming) solution
\[ \psi_2 = B(k, -\vec n_r)\,\frac{e^{-ikr}}{r}, \tag{46.14} \]
where the minus sign in the dependence of $B$ is conventional, to remind us that the wave goes towards $r = 0$. The most general solution then is a combination of both: $\psi = \psi_1 + \psi_2$. In particular, we show that the free wave is also of this form, at $r \to \infty$, with given $A$ and $B$,
\[ e^{i\vec k\cdot\vec r} \simeq \frac{2\pi}{ik}\frac{e^{ikr}}{r}\,\delta(\vec n_k - \vec n_r) - \frac{2\pi}{ik}\frac{e^{-ikr}}{r}\,\delta(\vec n_k + \vec n_r), \tag{46.15} \]
so that
\[ A(k, \vec n_r) = \frac{2\pi}{ik}\,\delta(\vec n_k - \vec n_r), \qquad B(k, -\vec n_r) = -\frac{2\pi}{ik}\,\delta(\vec n_k + \vec n_r). \tag{46.16} \]
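Proof. To prove this relation, we calculate the integral over a sphere of the wave multiplied by an arbitrary function,
\[ I = \int d^2\vec n_r\, f(\vec n_r)\,e^{i\vec k\cdot\vec r} = \int_0^{\pi}\sin\theta\, d\theta \int_0^{2\pi} d\phi\, e^{ikr\cos\theta} f(\theta, \phi). \tag{46.17} \]
Defining $x = \cos\theta$, we calculate
\[ \begin{aligned} I &= \int_0^{2\pi} d\phi \int_{-1}^{1} dx\, e^{ikrx} f(x, \phi) = \int_0^{2\pi} \frac{d\phi}{ikr} \int_{-1}^{1} f(x, \phi)\, d(e^{ikrx}) \\ &= \int_0^{2\pi} \frac{d\phi}{ikr}\left[f(1, \phi)e^{ikr} - f(-1, \phi)e^{-ikr} - \int_{-1}^{1} e^{ikrx}\frac{d}{dx}f(x, \phi)\, dx\right], \end{aligned} \tag{46.18} \]
where in the second line we have integrated by parts. Since at $\cos\theta = 1$ we have $\vec n_r = \vec n_k$, and at $\cos\theta = -1$ we have $\vec n_r = -\vec n_k$, it follows that $f(\pm 1, \phi) = f(\pm\vec n_k)$. Moreover, at $kr \gg 1$, the integrand of $\int_{-1}^{1} e^{ikrx}(df/dx)\,dx$ is highly oscillatory, and therefore the integral is close to zero, or rather it is much less than the first two terms. These remaining terms are independent of $\phi$, since $\vec n_k$ is a fixed direction in terms of $\vec r$, which means that we can trivially obtain $\int_0^{2\pi} d\phi = 2\pi$. Then finally, we have
\[ I \simeq \frac{2\pi}{ik}\left[f(\vec n_k)\frac{e^{ikr}}{r} - f(-\vec n_k)\frac{e^{-ikr}}{r}\right] \tag{46.19} \]
when $kr \gg 1$. We note then that this integral $I$ gives the same result as when replacing $e^{i\vec k\cdot\vec r}$ with
\[ \frac{2\pi}{ik}\left[\delta(\vec n_k - \vec n_r)\frac{e^{ikr}}{r} - \delta(\vec n_k + \vec n_r)\frac{e^{-ikr}}{r}\right] \tag{46.20} \]
in the original integral, thus proving the relation. q.e.d.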
46.3 Scattering Solution, Cross Sections, and S-Matrix
After seeing the behavior at infinity of the solution of the Schrödinger equation for a finite-range potential, we need to construct a physical set-up that will describe the scattering of particles, from the stationary (scattered wave) point of view.

Figure 46.1: Generic scattering situation in three dimensions with a potential $V$ (the arbitrary shape with solid lines).

Certainly at $r \to \infty$ (away from the potential), we want to have an incident free particle wave, $e^{i\vec k\cdot\vec r}$. We have already seen that at infinity this wave decomposes into an in part and an out part, as for any wave. Yet, since we are in the stationary-wave point of view (with the time-independent Schrödinger equation), the wave should have a scattered component but that component must only be an outgoing wave (a diverging spherical wave), since it is generated at the moment of scattering; see Fig. 46.1. Then we have the following ansatz for the stationary-wave function at $r \to \infty$:
\[ u^+_{\vec k}(\vec r) \simeq e^{i\vec k\cdot\vec r} + f_{\vec k}(\vec n_r)\,\frac{e^{ikr}}{r} = u_{\rm inc}(\vec r) + u_{\rm scatt}(\vec r). \tag{46.21} \]
We saw earlier that the probability current is
\[ \vec j = \frac{\hbar}{2im}\left(\psi^*\vec\nabla\psi - \psi\vec\nabla\psi^*\right). \tag{46.22} \]
When substituting the above wave function, we obtain an incident current, a scattered current, and an interference term,
\[ \vec j^{\,+}_{\vec k} \simeq \vec j_{\rm inc} + \vec j_{\rm scatt} + \vec j_{\rm interf} \tag{46.23} \]
when $r \to \infty$. Specifically, we find
\[ \vec j_{\rm inc} = \frac{\hbar\vec k}{m}, \qquad \vec j_{\rm scatt} = \vec n_r\,\frac{\hbar k}{m}\frac{|f|^2}{r^2}, \tag{46.24} \]
where $\vec n_r = \vec r/r$. We note that $|u_{\rm scatt}|^2 = |f|^2/r^2$ is the probability density. Then the interference current term is
\[ \vec j_{\rm interf} = -\frac{4\pi\hbar}{mr^2}\frac{\vec k}{k}\,({\rm Im}\, f)\,\delta(\vec n_k - \vec n_r), \tag{46.25} \]
which means that we can neglect it except in the forward direction. We saw in Chapter 37 that the definition of the differential cross section is
\[ d\sigma = \frac{dP_{\rm scatt}(d\Omega, dt)/dt}{j_{\rm inc}}, \tag{46.26} \]
and the incident current is $j_{\rm inc} = \hbar k/m$, while the probability current (flow rate) through the surface $dS = r^2 d\Omega$ is
\[ dP_{\rm scatt}(d\Omega, dt)/dt = j_{\rm scatt}\, dS = j_{\rm scatt}\, r^2 d\Omega = \frac{\hbar k}{m}|f|^2 d\Omega. \tag{46.27} \]
Thus we obtain
\[ \frac{d\sigma}{d\Omega} = |f|^2. \tag{46.28} \]
The conservation of probability equation is
\[ \vec\nabla\cdot\vec j + \frac{\partial|\psi|^2}{\partial t} = 0. \tag{46.29} \]
In the stationary (time-independent) case we have $\partial|\psi|^2/\partial t = 0$, so after integrating over a volume $V$ bounded by $\Sigma_\infty$, we obtain
\[ 0 = \int_V d^3r\, \vec\nabla\cdot\vec j = \oint_{\Sigma_\infty} d\vec S\cdot\vec j = \oint_{\Sigma_\infty} dS\, j_r = \oint_{\Sigma_\infty} r^2 d\Omega\, \frac{\hbar}{2im}\left(\psi^*\partial_r\psi - \psi\partial_r\psi^*\right). \tag{46.30} \]
We can choose the whole of the $\Sigma_\infty$ boundary to be within the $r \to \infty$ limit, in which case we can replace $\psi$ with its general asymptotic form in terms of $e^{\pm ikr}/r$ with coefficients $A$ and $B$. Then the conservation of probability becomes
\[ \int d\Omega\, r^2\, \frac{\hbar}{2im}\frac{2ik}{r^2}\left[A^*A - B^*B\right] = 0 \;\Rightarrow\; \int d\Omega\, |A|^2 = \int d\Omega\, |B|^2. \tag{46.31} \]
This means that knowing one of the coefficients $A$, $B$ implies a value for the average of the other, and thus both coefficients must be nonzero. Moreover, it means that there exists a linear relation between them. Indeed, if we have several waves expanded in this way,
\[ \psi_n \simeq A_n\frac{e^{ikr}}{r} + B_n\frac{e^{-ikr}}{r}, \tag{46.32} \]
then a linear combination of the waves means linear combinations for the coefficients,
\[ \sum_n C_n\psi_n \simeq \left(\sum_n C_n A_n\right)\frac{e^{ikr}}{r} + \left(\sum_n C_n B_n\right)\frac{e^{-ikr}}{r}, \tag{46.33} \]
which in turn implies that
\[ \int d\Omega\, \Big|\sum_n C_n A_n\Big|^2 = \int d\Omega\, \Big|\sum_n C_n B_n\Big|^2. \tag{46.34} \]
This is only possible if the out coefficient is an operator times the in coefficient,
\[ A = \hat S\cdot B, \tag{46.35} \]
which more precisely leads to
\[ A(\vec n) = \int d^2\vec n'\, S(\vec n, \vec n')\,B(\vec n'), \tag{46.36} \]
where we have generalized from $\vec n = \vec n_r$, $\vec n' = -\vec n_r$ to general $\vec n$, $\vec n'$.
However, to satisfy the law of conservation of probability $\int d\Omega\, |A|^2 = \int d\Omega\, |B|^2$, we also require that $\hat S$ is a unitary operator, $\hat S^\dagger = \hat S^{-1}$ (obtained by substituting $A = \hat S\cdot B$ into the conservation law). This $\hat S$ refers to the S-operator, or S-matrix. We note that this is a generalization of the one-dimensional case, where there are only two states $|k\pm\rangle$, to the case of an infinity of states, called "channels", for the in and out states.

46.4 S-Matrix and Optical Theorem
We now expand the free wave $e^{i\vec k\cdot\vec r}$ in the scattering stationary wave (46.21) into in and out components, changing the notation as follows, $f_{\vec k}(\vec n_r) = f_k(\vec n_r, \vec n_k)$ and $u^+_{\vec k}(\vec r) = u^+_k(\vec n_r, \vec n_k; r)$, in order to emphasize the dependence on angles. We obtain
\[ u^+_k(\vec n_r, \vec n_k; r) \simeq \left[\frac{2\pi}{ik}\,\delta(\vec n_r - \vec n_k) + f_k(\vec n_r, \vec n_k)\right]\frac{e^{ikr}}{r} + \frac{2\pi}{ik}\,\delta(-\vec n_r - \vec n_k)\,\frac{e^{-ikr}}{r}. \tag{46.37} \]
We generalize $\vec n_r$ to $\vec n$ and $-\vec n_r$ to $\vec n'$ as before, leading to
\[ \begin{aligned} A(\vec n) &= \frac{2\pi}{ik}\,\delta(\vec n - \vec n_k) + f_k(\vec n, \vec n_k) \\ B(\vec n') &= \frac{2\pi}{ik}\,\delta(\vec n' - \vec n_k). \end{aligned} \tag{46.38} \]
Then the S-matrix relation $A = \hat S\cdot B$ becomes
\[ \begin{aligned} A(\vec n) &= \frac{2\pi}{ik}\,\delta(\vec n - \vec n_k) + f_k(\vec n, \vec n_k) \\ &= \int d^2\vec n'\, S(\vec n, \vec n')\,\frac{2\pi}{ik}\,\delta(\vec n' - \vec n_k) = \frac{2\pi}{ik}\,S(\vec n, \vec n_k). \end{aligned} \tag{46.39} \]
Therefore the S-matrix contains a trivial (identity) part, plus a nontrivial part,
\[ S(\vec n_r, \vec n_k) = \delta(\vec n_r - \vec n_k) + \frac{ik}{2\pi}\,f_k(\vec n_r, \vec n_k), \tag{46.40} \]
or, in operator terms,
\[ \hat S = \mathbb{1} + \frac{ik}{2\pi}\,\hat f. \tag{46.41} \]
Here $S$ and $f$ are amplitudes for scattering, with $S$ containing also the trivial (not the interaction) part. We will see that $\hat f$ is related to the T-operator defined in the previous chapter.

Since $\hat S$ is a unitary operator, as we saw before, meaning that it conserves probability as it relates the in and out states, we find
\[ \begin{aligned} \mathbb{1} = \hat S\hat S^\dagger &= \left(\mathbb{1} + \frac{ik}{2\pi}\hat f\right)\left(\mathbb{1} - \frac{ik}{2\pi}\hat f^\dagger\right) \\ &= \mathbb{1} + \frac{ik}{2\pi}(\hat f - \hat f^\dagger) + \left(\frac{k}{2\pi}\right)^2 \hat f\hat f^\dagger. \end{aligned} \tag{46.42} \]
Finally, we obtain
\[ \frac{\hat f - \hat f^\dagger}{2i} = \frac{k}{4\pi}\,\hat f\hat f^\dagger. \tag{46.43} \]
Changing from operators to matrices,
\[ \hat f = f(\vec n, \vec n_k) \;\Rightarrow\; \hat f^\dagger = f^*(\vec n_k, \vec n), \tag{46.44} \]
where $\vec n = \vec n_r$, and considering forward scattering, so that $\vec n_r = \vec n_k$, we obtain
\[ {\rm Im}\, f(\vec n_k, \vec n_k) = \frac{k}{4\pi}\int d^2\vec n_r\, |f(\vec n_r, \vec n_k)|^2 = \frac{k}{4\pi}\,\sigma_{\rm tot}, \tag{46.45} \]
where in the last equality we have used the fact that $d^2\vec n_r = d\Omega$ and $d\sigma/d\Omega = |f|^2$. The above form is the most common (though not the most general) form of the optical theorem for scattering.
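The optical theorem (46.45) can be tested on a model amplitude. The sketch below assumes the standard partial-wave parametrization $f(\theta) = \frac{1}{k}\sum_l (2l+1)\,e^{i\delta_l}\sin\delta_l\,P_l(\cos\theta)$ with real phase shifts $\delta_l$; this form is not derived in this section and is used here only as an example of an amplitude compatible with unitarity. The phase-shift values are arbitrary. The total cross section is obtained by integrating $|f|^2$ over the sphere with Gauss–Legendre quadrature.

```python
import numpy as np
from numpy.polynomial import legendre

def amplitude(cos_theta, k, deltas):
    """Assumed partial-wave amplitude with real phase shifts deltas[l]."""
    coeffs = np.array([(2 * l + 1) * np.exp(1j * d) * np.sin(d) / k
                       for l, d in enumerate(deltas)])
    # Evaluate real and imaginary parts of the Legendre series separately
    return (legendre.legval(cos_theta, coeffs.real)
            + 1j * legendre.legval(cos_theta, coeffs.imag))

def total_cross_section(k, deltas, n_nodes=32):
    """sigma_tot = 2 pi * int_{-1}^{1} d(cos theta) |f|^2, using (46.28)."""
    nodes, weights = legendre.leggauss(n_nodes)
    return 2 * np.pi * np.sum(weights * np.abs(amplitude(nodes, k, deltas))**2)

k = 1.3
deltas = [0.7, -0.4, 0.15]           # arbitrary illustrative phase shifts
forward = amplitude(1.0, k, deltas)  # forward direction, cos(theta) = 1
sigma_tot = total_cross_section(k, deltas)

print(forward.imag, k * sigma_tot / (4 * np.pi))   # the two sides of (46.45)
```

If one of the phase shifts is given an imaginary part (absorption), the two printed numbers no longer agree, which reflects the fact that (46.45) with the elastic cross section on the right-hand side relies on unitarity of the elastic S-matrix.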
46.5 Green's Functions and Lippmann–Schwinger Equation
We now turn to equations for the wave functions, specifically the generalization to three dimensions of the Lippmann–Schwinger equation. Now the relations we will prove are valid even for the case of a potential with an infinite range, as in the Coulomb case.

We trivially generalize the one-dimensional case from the previous chapter to define the Green's functions in an abstract form
\[ (E - \hat H_0)\,\hat G_0 = \frac{\hbar^2}{2m}\,\mathbb{1}. \tag{46.46} \]
However, the coordinate representation is also trivially generalized, since
\[ \langle\vec r|\hat G_0|\vec r\,'\rangle = G_0(\vec r, \vec r\,'), \qquad \langle\vec r|\mathbb{1}|\vec r\,'\rangle = \delta^3(\vec r - \vec r\,'). \tag{46.47} \]
Continuing to use the abstract form, we find
\[ \hat G_0 = \frac{\hbar^2}{2m}\,\frac{1}{E - \hat H_0}, \tag{46.48} \]
or, more precisely, since we need to avoid the singularities appearing in the solutions of the (free) Schrödinger equation, $(E - \hat H_0)|\psi_k\rangle = 0$, we need to add $\pm i\epsilon$ to the energy, leading to two Green's functions,
\[ \hat G_0^{(\pm)} = \lim_{\epsilon \to 0} \frac{\hbar^2}{2m}\,\frac{1}{E \pm i\epsilon - \hat H_0}. \tag{46.49} \]
Also as in the one-dimensional case (trivially generalized in the abstract form) we find the Lippmann–Schwinger equation,
\[ |\psi\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi\pm\rangle, \tag{46.50} \]
where $|\vec k\rangle$ is the free particle state of wave vector $\vec k$, $(E - \hat H_0)|\vec k\rangle = 0$. Note here that we have defined $|\psi\pm\rangle$ as the state corresponding to $\hat G_0^{(\pm)}$, though we will soon see that it also corresponds to having an outgoing or incoming scattered wave, meaning that $\psi_1 = \psi_+$, $\psi_2 = \psi_-$ are the solutions at infinity.
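Proof. To prove the Lippmann–Schwinger equation, we multiply it by $(E \pm i\epsilon - \hat H_0)$, obtaining
\[ \begin{aligned} (E \pm i\epsilon - \hat H_0)|\psi\pm\rangle &= (E \pm i\epsilon - \hat H_0)|\vec k\rangle + (E \pm i\epsilon - \hat H_0)\,\hat G_0\,\frac{2m}{\hbar^2}\,\hat V|\psi\pm\rangle \\ \Rightarrow\; (E \pm i\epsilon - \hat H_0 - \hat V)|\psi\pm\rangle &= 0, \end{aligned} \tag{46.51} \]
where we have used the fact that the first term in the first line vanishes, by the free particle Schrödinger equation, and in the second, we have used the definition of $\hat G_0$ to write it as $\hat V|\psi\pm\rangle$. q.e.d.

Note that we can define $\hat G_0$ fully on the complex plane, not just an infinitesimal distance away from the real line,
\[ \hat G_0(z) = \frac{\hbar^2}{2m}\,\frac{1}{z - \hat H_0}, \tag{46.52} \]
and then the Lippmann–Schwinger equation in this more general case is
\[ |\psi(z)\rangle = |\vec k_z\rangle + \frac{2m}{\hbar^2}\,\hat G_0(z)\hat V|\psi(z)\rangle. \tag{46.53} \]
Then in coordinate space, i.e., multiplying with $\langle\vec r|$, we get
\[ \psi_z(\vec r) = \phi_z(\vec r) + \frac{2m}{\hbar^2}\int d^3r'\, G_0(z; \vec r, \vec r\,')\,V(\vec r\,')\,\psi_z(\vec r\,'). \tag{46.54} \]
We go back now and specialize to $z = E \pm i\epsilon$, which means that
\[ \psi_\pm(\vec r) = \phi_E(\vec r) + \frac{2m}{\hbar^2}\int d^3r'\, G_0^{(\pm)}(\vec r, \vec r\,')\,V(\vec r\,')\,\psi_\pm(\vec r\,'). \tag{46.55} \]
The (free) Green's function in coordinate space,
\[ \frac{2m}{\hbar^2}\,G_0^{(\pm)}(\vec r, \vec r\,') = \Big\langle\vec r\,\Big|\frac{1}{E \pm i\epsilon - \hat H_0}\Big|\vec r\,'\Big\rangle, \tag{46.56} \]
is not diagonal, but in momentum space it is diagonal, since
\[ \frac{2m}{\hbar^2}\,G_0^{(\pm)}(\vec p, \vec p\,') = \Big\langle\vec p\,\Big|\frac{1}{E \pm i\epsilon - \hat H_0}\Big|\vec p\,'\Big\rangle = \frac{\langle\vec p|\vec p\,'\rangle}{E \pm i\epsilon - p^2/2m} = \frac{1}{E \pm i\epsilon - p^2/2m}\,\delta^3(\vec p - \vec p\,'). \tag{46.57} \]
This means that we can obtain a simpler form for the Lippmann–Schwinger equation in momentum space,
\[ \begin{aligned} \psi_\pm(\vec p) &= \phi_E(\vec p) + \frac{2m}{\hbar^2}\int d^3p'\, G_0^{(\pm)}(\vec p, \vec p\,')\,V(\vec p\,', \psi_\pm) \\ &= \phi_E(\vec p) + \frac{1}{E \pm i\epsilon - \frac{p^2}{2m}}\,V(\vec p, \psi_\pm), \end{aligned} \tag{46.58} \]
where
\[ V(\vec p, \psi_\pm) \equiv \langle\vec p|\hat V|\psi\pm\rangle. \tag{46.59} \]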
524
46 Three-Dimensional Lippman–Schwinger Equation
To obtain an explicit form for the Lippmann–Schwinger equation in coordinate space, we must first find the (free) Green’s function in coordinate space. We insert momentum space completeness relations in order to relate it to the momentum space Green’s function, obtaining (±) 3 d p d 3 pr |p p | Gˆ 0(±) |p p |r G0 (r , r ) = =
2 2m
d3 p
d 3 p
d3 p ei p ·(r −r )/ (2π) 3 E ± i − p2 /2m ∞ 2π π eiq |r −r | cos θ 1 2 = q dq dφ dθ sin θ 2 , (2π) 3 0 k − q2 ± i 0 0
2 = 2m
ei p ·r / δ3 (p − p ) e−i p ·r / 2 3/2 (2π) E ± i − (p /2m) (2π) 3/2
(46.60)
where the θ, φ angles refer to the r − r vector relative to the p vector, E = 2 k 2 /(2m), and p = q. 2π π 1 Integrating using 0 dφ = 2π and 0 dθ sin θ = −1 d(cos θ), we find ∞ 1 eiq |r −r | − e−iq |r −r | 2 G0(±) (r , r ) = q dq 4π 2 0 iq|r − r |(k 2 − q2 ± i) (46.61) +∞ eiq |r −r | − e−iq |r −r | 1 =− 2 qdq , (q2 − k 2 ∓ i) 8π i|r − r | −∞ +∞ ∞ where in the last equality we have used 12 −∞ qdq = 0 qdq. The integral has poles at √ q = ± k 2 ± i = ±k (1 ± i ). (46.62) To calculate the above integral, we use the standard residue theorem on the complex plane. We first extend the integral over the complex plane, and then close the contour of integration over the real line with a semicircle at infinity in the upper half-plane for the term with eiq |r −r | , since if |q| → ∞ in the upper half-plane, it becomes ∼ e−(Im q) |r −r | → 0. We close the contour with a semicircle at infinity in the lower half plane for the term with e−iq |r −r | , since if |q| → ∞ in the lower half-plane, the term becomes ∼ e+(Im q) |r −r | → 0. This means that addition of the semicircles does not change the final result; however, now that the contour is closed we can use the residue theorem, and find that the integral is equal to 2πi times the residue(s) inside the contour, with a plus sign for counterclockwise contour integration and a minus for clockwise contour integration. 
That means that the integral over the real line times $1/(2\pi i)$ gives, in the cases of $G_0^{(\pm)}$, the following results:
$$\begin{aligned}
G_0^{(+)} &\to \frac{k}{2k}\,e^{ik|\vec r-\vec r'|} - (-)\frac{(-k)}{2(-k)}\,e^{-i(-k)|\vec r-\vec r'|} = e^{ik|\vec r-\vec r'|}\\
G_0^{(-)} &\to \frac{-k}{-2k}\,e^{i(-k)|\vec r-\vec r'|} - (-)\frac{k}{2k}\,e^{-ik|\vec r-\vec r'|} = e^{-ik|\vec r-\vec r'|},
\end{aligned}\tag{46.63}$$
where in the first line the first term comes from the $+k+i\epsilon$ residue and has a counterclockwise contour and the second term comes from the $-k-i\epsilon$ residue and has a clockwise contour; in the second line the first term comes from the $-k+i\epsilon$ residue and has a counterclockwise contour and the second term comes from the $+k-i\epsilon$ residue and has a clockwise contour, as in Fig. 46.2. Then we find the coordinate space Green's function
$$G_0^{(\pm)}(\vec r,\vec r') = -\frac{1}{4\pi|\vec r-\vec r'|}\,e^{\pm ik|\vec r-\vec r'|}. \tag{46.64}$$
Figure 46.2 Complex poles relevant for $G_0^{+}$ and $G_0^{-}$.

We note that this is the same Green's function as for the equation
$$(\Delta + k^2)G = \delta^3(\vec r-\vec r'). \tag{46.65}$$
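As a quick numerical sanity check of (46.64) and (46.65), the following minimal sketch verifies that the Green's function solves the Helmholtz equation away from the source point. It assumes the spherically symmetric form of the Laplacian, $\Delta G = (1/R)\,d^2(RG)/dR^2$ with $R = |\vec r-\vec r'|$, and uses a central finite difference; the function names and the step `h` are illustrative choices, not part of the text.

```python
import cmath
import math

def green_free(R, k, sign=+1):
    """Free Green's function G0^(+/-) of (46.64) as a function of R = |r - r'|."""
    return -cmath.exp(sign * 1j * k * R) / (4.0 * math.pi * R)

def helmholtz_residual(R, k, sign=+1, h=1e-3):
    """(Laplacian + k^2) G at R > 0, using Laplacian G = (1/R) d^2(R G)/dR^2."""
    u = lambda x: x * green_free(x, k, sign)
    d2u = (u(R + h) - 2.0 * u(R) + u(R - h)) / h**2  # central second difference
    return d2u / R + k**2 * green_free(R, k, sign)

if __name__ == "__main__":
    k = 1.3
    for R in (0.5, 1.0, 4.0):
        for sign in (+1, -1):
            # The residual should vanish up to the finite-difference error.
            print(R, sign, abs(helmholtz_residual(R, k, sign)))
```

The delta function on the right-hand side of (46.65) is not probed by this check, since it only samples points with $R > 0$.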
Now we are ready to find an explicit form for the coordinate-space Lippmann–Schwinger equation. Substituting $G_0^{(\pm)}(\vec r,\vec r')$ into (46.55) and substituting $\phi_E(\vec r) = e^{i\vec k\cdot\vec r}$, we find
$$\psi_\pm(\vec r) = e^{i\vec k\cdot\vec r} - \frac{2m}{4\pi\hbar^2}\int d^3r'\,\frac{e^{\pm ik|\vec r-\vec r'|}}{|\vec r-\vec r'|}\,V(\vec r')\psi_\pm(\vec r'). \tag{46.66}$$
Specializing to a potential of finite range, so that $\vec r'$ belongs to the finite volume $V$ that is the domain of the potential, it follows that for $r\to\infty$ we have
$$|\vec r-\vec r'| \simeq \sqrt{r^2-2rr'\cos\alpha} \simeq r\left(1-\frac{r'}{r}\cos\alpha\right) = r - \vec n_r\cdot\vec r', \tag{46.67}$$
where $\alpha$ is the angle between $\vec r$ and $\vec r'$, and $\vec n_r = \vec r/r$. Then at $r\to\infty$ we keep in the exponential both the infinite phase and the finite phase (the first two terms in the expansion) since the latter is integrated over,
$$e^{\pm ik|\vec r-\vec r'|} \simeq e^{\pm ikr}\,e^{\mp ik\vec n_r\cdot\vec r'}, \tag{46.68}$$
and in the denominator we can replace $|\vec r-\vec r'|$ with the leading term $r$, since we are only interested in the leading behavior of the integral. Then we obtain
$$\psi_\pm(\vec r) \simeq e^{i\vec k\cdot\vec r} - \frac{e^{\pm ikr}}{r}\,\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{\mp i\vec k'\cdot\vec r'}\,V(\vec r')\psi_\pm(\vec r'), \tag{46.69}$$
where we have defined
$$\vec k' \equiv k\vec n_r. \tag{46.70}$$
We note that we have reached the general scattering wave function in the stationary point of view if we make the identification
$$f_k^{\pm}(\vec n_r,\vec n_k) \equiv f^{(\pm)}(\vec k',\vec k) \equiv -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{\mp i\vec k'\cdot\vec r'}\,V(\vec r')\psi_\pm(\vec r'), \tag{46.71}$$
corresponding to an outgoing or incoming wave $\psi_\pm$: that is, the state $|\psi_\pm\rangle$, previously defined by the term $\pm i\epsilon$ in the Green's function, actually corresponds to the scattered wave function term $e^{\pm ikr}f^{(\pm)}/r$. Then it follows that, since in most cases we are interested only in the outgoing case $|\psi_+\rangle$, we are more interested in the Green's function $G_0^{(+)}$.
Important Concepts to Remember
• For three-dimensional scattering, at $r\to\infty$ we have two asymptotic solutions, outgoing or "out", $\psi_1 = A(k,\vec n_r)e^{ikr}/r$, and incoming or "in", $\psi_2 = B(k,-\vec n_r)e^{-ikr}/r$.
• A free wave $e^{i\vec k\cdot\vec r}$ is expanded at infinity in the in and out waves, with $A(k,\vec n_r) = (2\pi/ik)\delta(\vec n_k-\vec n_r)$ and $B(k,-\vec n_r) = -(2\pi/ik)\delta(\vec n_k+\vec n_r)$.
• In the case of scattering in a potential, we have at infinity the stationary wave function ansatz $u_+(\vec r) = e^{i\vec k\cdot\vec r} + f_k(\vec n_r)e^{ikr}/r = u_{\rm inc}(\vec r) + u_{\rm scatt}(\vec r)$, and differential cross section $d\sigma/d\Omega = |f_k|^2$.
• For a generic in plus out wave, $\psi = Ae^{ikr}/r + Be^{-ikr}/r$, we have $A = \hat S\cdot B$, or $A(\vec n) = \int d\vec n'\,S(\vec n,\vec n')B(\vec n')$, where $\hat S$ is the unitary S-matrix or S-operator.
• Then we have $\hat S = 1 + (ik/2\pi)\hat f$, or $S(\vec n_r,\vec n_k) = \delta(\vec n_r-\vec n_k) + (ik/2\pi)f(\vec n_r,\vec n_k)$, and the optical theorem, $\mathrm{Im}\,f(\vec n_r,\vec n_k) = (k/4\pi)\int d\vec n_r\,|f|^2 = (k/4\pi)\sigma_{\rm tot}$.
• The Green's functions and Lippmann–Schwinger equations in the abstract form are trivially generalized from one dimension, and we can further generalize to the complex plane, $\hat G_0(z) = \dfrac{\hbar^2}{2m}\,\dfrac{1}{z-\hat H_0}$.
• In the coordinate representation, the Lippmann–Schwinger equation becomes $\psi_\pm(\vec r) = \phi_E(\vec r) + \dfrac{2m}{\hbar^2}\int d^3r'\,G_0^{\pm}(\vec r,\vec r')V(\vec r')\psi_\pm(\vec r')$, with $G_0^{\pm}(\vec r,\vec r') = -\dfrac{1}{4\pi|\vec r-\vec r'|}\,e^{\pm ik|\vec r-\vec r'|}$.
Further Reading See [2] and [1].
Exercises
(1) Consider a central potential of the spherical-step type, $V = V_0 > 0$ for $r\le R$ and $V = 0$ for $r > R$, and a solution with energy $E > V_0$ and given angular momentum $l > 0$. If the solution is assumed to be square integrable at $r = 0$, calculate the coefficients $A$ and $B$ at infinity.
(2) The decomposition of the free wave at infinity, (46.15), seems counterintuitive since the left-hand side is certainly nonzero (in fact, naively, it is of order 1!) if $\vec k$ is other than parallel or antiparallel to $\vec r$. In what sense should we understand this relation, then?
(3) If the $f_k(\vec n_r)$ in (46.21) were independent of $\vec n_r$, would that contradict unitarity or not?
(4) For the case in exercise 1, calculate the total cross section and the S-matrix.
(5) Consider the wave $\psi = Ae^{ikr}/r + Be^{-ikr}/r$, with $A, B = (2\pi/ik)\delta(\vec n_k\pm\vec n_r) + a, b$, where $a, b$ are real constants. Can it be understood as a scattering solution? If so, calculate the total cross section. What if $a, b$ are imaginary constants?
(6) Calculate the generalization of $G_0^{\pm}(\vec r,\vec r')$ on the complex plane for energy, $G_0(z;\vec r,\vec r')$.
(7) Calculate the equivalent of (46.69) for the generalization $G_0(z;\vec r,\vec r')$ in exercise 6.
47
Born Approximation and Series, S-Matrix and T-Matrix
In this chapter, we define the Born approximation and series of higher-order approximations, and connect with the time-dependent point of view for scattering, where we define the S- and T-matrices.
47.1 Born Approximation and Series
To solve the Lippmann–Schwinger integral equation, we use the same procedure as in the one-dimensional case. We define the zeroth-order solution, namely the free (noninteracting) wave,
$$\psi_\pm^{(0)}(\vec r) = \phi_E(\vec r) = e^{i\vec k\cdot\vec r}. \tag{47.1}$$
Then we substitute this solution into the right-hand side of the Lippmann–Schwinger equation to find the first-order interacting solution,
$$f^{(1,\pm)}(\vec k',\vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{i\vec r'\cdot(\vec k\mp\vec k')}\,V(\vec r') = -\frac{2m}{4\pi\hbar^2}\,V(\vec k'\mp\vec k), \tag{47.2}$$
where we have defined the Fourier-transformed potential $V(\vec q)$, where $\vec q = \vec k'-\vec k$ is the momentum transfer. That means that the first-order term in the wave function solution is, at large $r$,
$$\psi_\pm^{(1)}(\vec r) \simeq -\frac{2m}{4\pi\hbar^2}\,\frac{e^{\pm ikr}}{r}\int d^3r'\,e^{i\vec r'\cdot(\vec k\mp\vec k')}\,V(\vec r'), \qquad \psi_\pm^{(1)}(\vec r) = \frac{2m}{\hbar^2}\int d^3r'\,G_0^{(\pm)}(\vec r,\vec r')\,e^{i\vec k\cdot\vec r'}\,V(\vec r'). \tag{47.3}$$
We can then put this into the right-hand side of the Lippmann–Schwinger equation to find the second-order term in the wave function solution, etc. Generally, though restricting to the physical $+$ solution (and dropping the index when doing so), we find the recursion relation for the $(n+1)$th-order term,
$$\psi^{(n+1)}(\vec r) = \frac{2m}{\hbar^2}\int d^3r'\,G_0(\vec r,\vec r')\,\psi^{(n)}(\vec r')\,V(\vec r'). \tag{47.4}$$
This recursion relation defines the Born series. The first-order term is the Born approximation. We apply the Born approximation to a spherically symmetric potential, $V(\vec r) = V(r)$. Since $|\vec k'| = |\vec k| = k$, and defining $\theta$ as the angle between $\vec n_r$ and $\vec n_k$, we have (see Fig. 47.1)
$$|\vec k'-\vec k| = 2k\sin\frac{\theta}{2} = q. \tag{47.5}$$
Figure 47.1 Geometry of scattering.
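The first-order solution is then
$$\begin{aligned}
f^{(1)} &= -\frac{2m}{4\pi\hbar^2}\int_0^\infty r'^2\,dr'\int_0^{2\pi}d\phi\int_{-1}^{1}d\cos\theta\;e^{iqr'\cos\theta}\,V(r')\\
&= -\frac{2m}{\hbar^2}\int_0^\infty r'^2\,dr'\,V(r')\,\frac{e^{iqr'}-e^{-iqr'}}{2iqr'}\\
&= -\frac{2m}{4\pi\hbar^2}\,\frac{4\pi}{q}\int_0^\infty r'\,dr'\,V(r')\sin(qr')\\
&= -\frac{2m}{4\pi\hbar^2}\,V(q),
\end{aligned}\tag{47.6}$$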
where in the last line we have defined $V(q)$, which matches the first-order term (47.2), specialized to the spherically symmetric case, as well as the same formula in the derivation of the Born approximation through the Fermi golden rule, in Chapter 37. However, note that there we had a different normalization, with a $1/(2\pi\hbar)^{3/2}$ factor. We will take an example, the Yukawa potential
$$V = V_0\,\frac{e^{-\mu r}}{r}, \tag{47.7}$$
as in Chapter 37. There, we left as an exercise the proof of
$$V(q) = \frac{4\pi V_0}{\mu^2+q^2}, \tag{47.8}$$
which leads to the Born approximation for the scattering amplitude,
$$f^{(1)} = -\frac{2mV_0}{\hbar^2}\,\frac{1}{\mu^2+q^2}. \tag{47.9}$$
Then the Born approximation for the differential cross section,
$$\frac{d\sigma}{d\Omega} = |f|^2 = \left(\frac{m}{2\pi\hbar^2}\right)^2|V(q)|^2, \tag{47.10}$$
matches the Chapter 37 calculation using Fermi's golden rule.
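The radial formula in (47.6) can be checked numerically against the closed form (47.8). The sketch below is only illustrative: it assumes units with $\hbar = m = 1$, uses a plain midpoint rule with a finite cutoff `r_max`, and all function names and parameter values are hypothetical choices rather than anything fixed by the text.

```python
import math

def yukawa(r, V0, mu):
    """Yukawa potential (47.7)."""
    return V0 * math.exp(-mu * r) / r

def born_potential_transform(V, q, r_max, n=40000):
    """V(q) = (4 pi / q) * int_0^inf r V(r) sin(q r) dr, as in (47.6), midpoint rule."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoints avoid the r = 0 endpoint
        total += r * V(r) * math.sin(q * r)
    return 4.0 * math.pi / q * total * h

def yukawa_transform_exact(q, V0, mu):
    """Closed form (47.8)."""
    return 4.0 * math.pi * V0 / (mu**2 + q**2)

def born_amplitude(Vq, m=1.0, hbar=1.0):
    """First Born amplitude f^(1) = -(2m / (4 pi hbar^2)) V(q)."""
    return -2.0 * m / (4.0 * math.pi * hbar**2) * Vq

if __name__ == "__main__":
    V0, mu, q = 0.7, 0.9, 1.2
    numeric = born_potential_transform(lambda r: yukawa(r, V0, mu), q, r_max=40.0)
    print(numeric, yukawa_transform_exact(q, V0, mu))
    print(born_amplitude(numeric), -2.0 * V0 / (mu**2 + q**2))  # compare with (47.9)
```

The same `born_potential_transform` helper can be pointed at any other short-range central potential, as long as `r_max` is chosen well beyond its range.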
47.2 Time-Dependent Scattering Point of View
From the Fermi's golden rule calculation of the Born approximation, which used first-order time-dependent perturbation theory, we realize that we can use a time-dependent point of view for scattering. We review this calculation (though within our new formalism) for completeness. We define the S-matrix as in the one-dimensional case in Chapter 45,
$$\hat S = \hat U_I(+\infty,-\infty), \tag{47.11}$$
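which however is still the same matrix as that defined in the previous chapter, $A_{\rm out} = \hat S\cdot B_{\rm in}$, and then the probability of transition is
$$\text{Probab.}(\vec p_{\rm in}\to d\Omega\ \text{around}\ \vec p_f) = d\Omega\,|\langle\vec p_f|\hat S|\vec p_i\rangle|^2. \tag{47.12}$$
However, in time-dependent perturbation theory, at first order,
$$\hat U_I(+\infty,-\infty) = 1 - \frac{i}{\hbar}\int_{-\infty}^{+\infty}dt\,\hat V_I(t), \tag{47.13}$$
as we saw in Chapter 38. Then we have
$$\frac{dP}{dt} = \frac{2\pi}{\hbar}\,|\langle\vec p_f|\hat V|\vec p_i\rangle|^2\,\rho(E_i,\vec n)\,d\Omega, \tag{47.14}$$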
and the analysis in Chapter 37 follows. In the case of scattering in a Coulomb potential, obtained as the $\mu\to 0$ limit of the Yukawa potential, so that
$$V(q) = \frac{4\pi V_0}{q^2}, \tag{47.15}$$
we obtain the differential cross section
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2\left(\frac{4\pi V_0}{q^2}\right)^2 = \frac{4m^2V_0^2}{\hbar^4q^4} = \frac{m^2V_0^2}{4\hbar^4k^4\sin^4(\theta/2)}, \tag{47.16}$$
which is the Rutherford formula for V0 = Z ze02 , derived classically by Rutherford (so the quantum mechanics approach to the first-order Born approximation doesn’t improve upon it).
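The chain of equalities in (47.16) and the $\mu\to 0$ limit can be checked with a few lines of arithmetic. This is a minimal sketch assuming units with $\hbar = m = 1$ by default; the function names are illustrative.

```python
import math

def yukawa_born_dsigma(theta, k, V0, mu, m=1.0, hbar=1.0):
    """Born cross section (47.10) with the Yukawa V(q) of (47.8), q = 2k sin(theta/2)."""
    q = 2.0 * k * math.sin(theta / 2.0)
    Vq = 4.0 * math.pi * V0 / (mu**2 + q**2)
    return (m / (2.0 * math.pi * hbar**2))**2 * Vq**2

def rutherford_dsigma(theta, k, V0, m=1.0, hbar=1.0):
    """Last form of (47.16)."""
    return m**2 * V0**2 / (4.0 * hbar**4 * k**4 * math.sin(theta / 2.0)**4)

if __name__ == "__main__":
    theta, k, V0 = 0.8, 1.5, 0.3
    for mu in (1.0, 0.1, 0.001, 0.0):
        # As mu decreases, the Yukawa result approaches the Rutherford value.
        print(mu, yukawa_born_dsigma(theta, k, V0, mu), rutherford_dsigma(theta, k, V0))
```

At fixed $\mu > 0$ the two expressions differ most at small angles, where $q \lesssim \mu$ and the screening in the Yukawa potential cuts off the forward divergence of the Rutherford formula.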
47.3 Higher-Order Terms and Abstract States, S- and T-Matrices
We saw in the previous chapter that the differential cross section is
$$\frac{d\sigma}{d\Omega} = |f_+|^2 \tag{47.17}$$
and, from the Lippmann–Schwinger equation,
$$f_+(\vec k',\vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{-i\vec k'\cdot\vec r'}\,V(\vec r')\psi_+(\vec r'). \tag{47.18}$$
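In terms of abstract operators and states, we have
$$f_+(\vec k',\vec k) = -\frac{m}{2\pi\hbar^2}\,\langle\vec k'|\hat V|\psi_+\rangle, \tag{47.19}$$
so
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2\left|\langle\vec k'|\hat V|\psi_+\rangle\right|^2, \tag{47.20}$$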
where $\langle\vec k'| = \langle\vec p_f|$ is a final momentum state, so the formula above generalizes the Born approximation formula (47.10), which can be written as
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2\left|\langle\vec p_f|\hat V|\vec p_i\rangle\right|^2. \tag{47.21}$$
However, $\langle\vec k'|\hat V|\psi_+\rangle$ is not a matrix element in the Hilbert space basis, since we have two different kinds of states on the left and on the right. In order to have a matrix element in the Hilbert space basis, we need to define, as in the one-dimensional case in Chapter 45, the operator $\hat T$:
$$\hat V|\psi_+\rangle \equiv \hat T|\vec k\rangle, \tag{47.22}$$
i.e., we need to replace the action on an interacting state by an action on the free state $|\vec k\rangle$. Then
$$\langle\vec k'|\hat V|\psi_+\rangle = \langle\vec k'|\hat T|\vec k\rangle \tag{47.23}$$
is a matrix element in the Hilbert space basis. But since the density of states is, as we saw in Chapter 37,
$$\rho(E) = \frac{m\hbar k}{(2\pi\hbar)^3}, \tag{47.24}$$
and since the velocity is $v = \hbar k/m$, we can rewrite the differential cross section in terms of the transition matrix, or T-matrix (the matrix element of the T-operator),
$$\frac{d\sigma}{d\Omega} = \frac{2\pi}{\hbar v}\,\rho(E)\,|\langle\vec k'|\hat V|\psi_+\rangle|^2 = \frac{2\pi}{\hbar v}\,\rho(E)\,|\langle\vec k'|\hat T|\vec k\rangle|^2. \tag{47.25}$$
Then we can identify the amplitude for scattering, $f_+$, with the T-matrix, up to a constant,
$$f_+(\vec k',\vec k) = -\frac{m}{2\pi\hbar^2}\,\langle\vec k'|\hat T|\vec k\rangle, \tag{47.26}$$
or, for operators,
$$\hat f = -\frac{m}{2\pi\hbar^2}\,\hat T. \tag{47.27}$$
Moreover, from (46.41), we can relate the S-operator to the T-operator (and the S-matrix to the T-matrix), since
$$\hat S = 1 + \frac{ik}{2\pi}\,\hat f = 1 - \frac{imk}{(2\pi)^2\hbar^2}\,\hat T. \tag{47.28}$$
As in the one-dimensional case (a trivial generalization), the abstract Lippmann–Schwinger equation, and its "solution" through the T-operator, is
$$|\psi_\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi_\pm\rangle = |\vec k\rangle + \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat T|\vec k\rangle. \tag{47.29}$$
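Also as in the one-dimensional case, the full Born series is defined by the recursion relation
$$|\psi_\pm^{(n+1)}\rangle = \frac{2m}{\hbar^2}\,\hat G_0^{(\pm)}\hat V|\psi_\pm^{(n)}\rangle. \tag{47.30}$$
Absorbing $2m/\hbar^2$ into $\hat G_0^{(\pm)}$ for simplicity, the full Born series becomes
$$\begin{aligned}
|\psi_\pm\rangle &= |\vec k\rangle + \hat G_0^{(\pm)}\hat V|\vec k\rangle + (\hat G_0^{(\pm)}\hat V)^2|\vec k\rangle + \cdots\\
&= |\psi_\pm^{(0)}\rangle + |\psi_\pm^{(1)}\rangle + |\psi_\pm^{(2)}\rangle + \cdots\\
&= |\vec k\rangle + \hat G_0^{(\pm)}\left(\hat V + \hat V\hat G_0^{(\pm)}\hat V + \cdots\right)|\vec k\rangle.
\end{aligned}\tag{47.31}$$
Comparing with the Lippmann–Schwinger equation in terms of $\hat T$, we obtain
$$\hat T = \hat V + \hat V\hat G_0\hat V + \hat V\hat G_0\hat V\hat G_0\hat V + \cdots = \hat V + \hat V\hat G_0\hat T. \tag{47.32}$$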
The physical interpretation of the full Born series is obtained once we go to coordinate space, since then the $n$th-order term is
$$\int d^3r_1\cdots\int d^3r_n\;G_0(\vec r,\vec r_1)V(\vec r_1)G_0(\vec r_1,\vec r_2)V(\vec r_2)\cdots G_0(\vec r_{n-1},\vec r_n)V(\vec r_n)\phi(\vec r_n). \tag{47.33}$$
Then we see that the incoming wave hits the potential at $\vec r_n$ (the last integration variable, the variable of the free-wave function $\phi(\vec r_n)$), interacts with the potential, and then propagates with the propagator (Green's function) $G_0$ until $\vec r_{n-1}$, where it interacts again, now with $V(\vec r_{n-1})$, etc., until after the last propagation we hit the point where we are "measuring" the wave function, $\vec r$. We can also define the full Green's function,
$$\begin{aligned}
\hat G &\equiv \frac{1}{E-\hat H_0-\hat V} = \frac{1}{\hat G_0^{-1}-\hat V}\\
&= \hat G_0 + \hat G_0\hat V\hat G_0 + \hat G_0\hat V\hat G_0\hat V\hat G_0 + \cdots\\
&= \hat G_0(1+\hat T\hat G_0).
\end{aligned}\tag{47.34}$$
Acting with $\hat G^{-1}$ on the difference between the full state and the free state, we get
$$(E-\hat H_0-\hat V)(|\psi\rangle-|\vec k\rangle) = \hat V|\vec k\rangle, \tag{47.35}$$
where we have used the full Schrödinger equation for $|\psi\rangle$ and the free Schrödinger equation for $|\vec k\rangle$. Then we have
$$|\psi\rangle-|\vec k\rangle = \frac{1}{E-\hat H_0-\hat V}\,\hat V|\vec k\rangle = \hat G\hat V|\vec k\rangle = \hat G_0(1+\hat T\hat G_0)\hat V|\vec k\rangle, \tag{47.36}$$
which defines the full Born series once again.
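The operator identities (47.32) and (47.34) are purely algebraic, so they can be illustrated in a finite-dimensional toy model. The sketch below is not a scattering calculation: it assumes that $\hat G_0$ and $\hat V$ are replaced by arbitrary small 2×2 real matrices (with $2m/\hbar^2$ absorbed into `G0`, as in the text), chosen so that the series converges. All names and numbers are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def matsub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

def t_born_series(V, G0, order):
    """Partial sum V + V G0 V + V G0 V G0 V + ... with `order` insertions of G0."""
    term, total = V, V
    for _ in range(order):
        term = matmul(matmul(V, G0), term)  # prepend one more factor of V G0
        total = matadd(total, term)
    return total

I2 = [[1.0, 0.0], [0.0, 1.0]]
G0 = [[0.5, 0.1], [0.1, -0.3]]    # toy "free Green's function"
V = [[0.2, 0.05], [0.05, 0.1]]    # toy "potential"

T_exact = matmul(inv2(matsub(I2, matmul(V, G0))), V)      # solves T = V + V G0 T
G_full = inv2(matsub(inv2(G0), V))                        # 1 / (G0^{-1} - V)
G_from_T = matadd(G0, matmul(matmul(G0, T_exact), G0))    # G0 (1 + T G0)

if __name__ == "__main__":
    for order in (1, 3, 10):
        print(order, max_diff(t_born_series(V, G0, order), T_exact))
    print(max_diff(G_full, G_from_T))
```

In this toy model the partial sums converge geometrically because the matrix `V G0` has small eigenvalues; for a real potential, convergence of the Born series is exactly the question addressed by the validity conditions of the next section.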
47.4 Validity of the Born Approximation
Until now we have assumed that the Born approximation is valid, but we have not stated a condition for this to be true. We require that, for $r = 0$, meaning at the maximum of the interaction (where the potential is centered), the extra term is small with respect to the free term, which at $r = 0$ is 1, thus
$$|\psi_\pm^{(1)}(0)| = \left|\frac{2m}{\hbar^2}\int d^3r'\,G_0(0,\vec r')\,e^{i\vec k\cdot\vec r'}\,V(\vec r')\right| = \frac{2m}{4\pi\hbar^2}\left|\int d^3r'\,\frac{e^{ikr'}}{r'}\,V(\vec r')\,e^{i\vec k\cdot\vec r'}\right| \ll 1. \tag{47.37}$$
Note that the second equality is exact, obtained by just substituting $G_0(0,\vec r')$ into the first expression. For the particular case of the Yukawa potential,
$$V(r) = V_0\,\frac{e^{-\mu r}}{r}, \tag{47.38}$$
and at low energy, $kr'\ll 1$, we obtain the condition
$$\frac{2m}{4\pi\hbar^2}\,V_0\int d^3r'\,\frac{e^{-\mu r'}}{r'^2} = \frac{2mV_0}{\hbar^2\mu} \ll 1, \tag{47.39}$$
where we have used $d^3r'/r'^2 = dr'\,d\Omega$, integrated over $r'$, $\int_0^\infty dr'\,e^{-\mu r'} = 1/\mu$, and $\int d\Omega = 4\pi$. On the other hand, for a potential where $V\sim V_0$ for a range $r\le r_0$, and at low energy $kr'\ll 1$, we have
$$\int d^3r'\,\frac{V(r')}{r'} \simeq V_0\int\frac{d^3r'}{r'} \simeq 4\pi V_0\,\frac{r_0^2}{2}, \tag{47.40}$$
which means that the Born approximation validity condition is
$$\frac{mV_0}{\hbar^2}\,r_0^2 \ll 1. \tag{47.41}$$
One can make a more precise analysis of the validity condition, in particular one that interpolates to high energy, but we will not do it here. Here we just note that the region of validity of the Born approximation is not, in general, a low-energy one, and that it also differs from that of the WKB approximation.
Important Concepts to Remember
• The Born approximation is the first-order approximation to the Lippmann–Schwinger equation, where we put the zeroth-order solution $e^{i\vec k\cdot\vec x}$ on the right-hand side, and the Born series is obtained as successive terms in the approximation.
• The Born approximation gives the same result as a reinterpretation of the scattering as a time-dependent process, with Fermi's golden rule in first order, giving $dP/dt = (2\pi/\hbar)|\langle\vec p_f|V|\vec p_i\rangle|^2\rho\,d\Omega$.
• The Lippmann–Schwinger equation implies $\dfrac{d\sigma}{d\Omega} = \left(\dfrac{m}{2\pi\hbar^2}\right)^2|\langle\vec k'|\hat V|\psi_+\rangle|^2$, with Born approximation $\dfrac{d\sigma}{d\Omega} = \left(\dfrac{m}{2\pi\hbar^2}\right)^2|\langle\vec p_f|\hat V|\vec p_i\rangle|^2$.
• One can define the T-matrix by $\hat V|\psi_+\rangle = \hat T|\vec k\rangle$, so that the action on the interacting state is that of the T-matrix on a free state, and then $\hat f = -\dfrac{m}{2\pi\hbar^2}\,\hat T$.
• The Born approximation is valid in a region different from that of the WKB approximation, and this region is not the low-energy region either. At low energy, for a Yukawa potential $V_0e^{-\mu r}/r$, we have $mV_0/(\mu\hbar^2)\ll 1$ and for a constant potential, $V\sim V_0$, in a range $r_0$, we have $mV_0r_0^2/\hbar^2\ll 1$.
Further Reading See [2] and [1].
Exercises
(1) Calculate the first two terms in the Born series for a delta function potential, $V(r) = -V_0\delta^3(\vec r)$.
(2) Calculate the differential cross section for scattering in the Born approximation for a potential $V(r) = A/r^2$.
(3) Describe physically how it is possible that the Born approximation to quantum mechanical scattering in a Coulomb potential gives the classical-scattering Rutherford formula.
(4) Calculate the first two terms in the Born series for a potential $V(r) = A/(r^2+a^2)$, with $a$ = constant.
(5) Write down explicitly the Lippmann–Schwinger equation for the T-matrix in the Yukawa case of $V(r) = V_0e^{-\mu r}/r$.
(6) Write down and solve the Lippmann–Schwinger equation for the T-matrix in the case of the delta function potential $V = -V_0\delta^3(\vec r)$.
(7) Is there a domain of validity of the Born approximation in the case of the Coulomb potential?
48
Partial Wave Expansion, Phase Shift Method, and Scattering Length
In this chapter we will consider spherically symmetric potentials, in which case we need to describe states and wave functions of a given $l$. To do that, we define an expansion in angular momentum $l$, the partial wave expansion, and the associated notions and methods of phase shift and scattering length.
48.1 The Partial Wave Expansion
In the spherically symmetric case, the complete set of mutually commuting variables is $\{H, L^2, L_z\}$, so we have basis states that are eigenstates of those operators, $|Elm\rangle$, and which are orthonormal (including on the continuous energy variable),
$$\langle E'l'm'|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E'). \tag{48.1}$$
In coordinate space, the wave function of a basis state is
$$\langle\vec r|Elm\rangle = u_{Elm}(\vec r) = R_{El}(r)Y_{lm}(\vec n_r). \tag{48.2}$$
We saw in Chapters 18 and 19 that, for a free particle, the solution for the radial wave function is
$$R_{El}(r) = C_l\,j_l(kr), \tag{48.3}$$
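where $j_l$ is the spherical Bessel function. Moreover, the plane wave solution, $e^{i\vec k\cdot\vec r}$, where $\vec k = \vec p/\hbar$, is expanded in the above basis as
$$\begin{aligned}
e^{i\vec k\cdot\vec r} = e^{ikr\cos\theta} &= \sum_{l=0}^\infty(2l+1)\,i^l\,j_l(kr)\,P_l(\cos\theta)\\
&= \sum_{l=0}^\infty\sum_{m=-l}^{l}a_{lm}\,j_l(kr)\,Y_{lm}(\theta,\phi),
\end{aligned}\tag{48.4}$$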
where $(2l+1)i^l\equiv a_l$, the first line is expanded in terms of only $\theta$, since the expanded function is a function only of $\cos\theta$, whereas the second line is the general expansion of a wave function in the $\langle\vec r|Elm\rangle$ basis. The relation between the two expansions is provided by the equality (from Chapter 17)
$$P_l(\cos\theta) = \sqrt{\frac{4\pi}{2l+1}}\,Y_{l,0}(\theta,\phi), \tag{48.5}$$
which means that
$$a_{lm} = \delta_{m,0}\sqrt{\frac{4\pi}{2l+1}}\,a_l. \tag{48.6}$$
The expansion in terms of $\theta$ (on the first line) follows from the relation
$$j_l(kr) = \frac{1}{2i^l}\int_{-1}^{+1}d(\cos\theta)\,e^{ikr\cos\theta}\,P_l(\cos\theta). \tag{48.7}$$
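Relation (48.7) and the first line of (48.4) can be checked numerically with a short script. The sketch below evaluates the integral in (48.7) with Simpson's rule, compares the result with the closed forms $j_0(x) = \sin x/x$ and $j_1(x) = \sin x/x^2 - \cos x/x$ (standard results, quoted here as an assumption since they are not written out in this section), and then resums the truncated partial wave series. Function names, the grid size, and the truncation `l_max` are illustrative choices.

```python
import cmath
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) via (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def spherical_jl(l, x, n=4000):
    """j_l(x) from (48.7), using Simpson's rule in u = cos(theta); n must be even."""
    h = 2.0 / n
    total = 0.0 + 0.0j
    for i in range(n + 1):
        u = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * cmath.exp(1j * x * u) * legendre(l, u)
    return (total * h / 3.0 / (2.0 * 1j ** l)).real

def plane_wave_partial_sum(kr, cos_theta, l_max):
    """Truncated sum over l of (2l+1) i^l j_l(kr) P_l(cos theta), first line of (48.4)."""
    return sum((2 * l + 1) * 1j ** l * spherical_jl(l, kr) * legendre(l, cos_theta)
               for l in range(l_max + 1))

if __name__ == "__main__":
    x = 2.0
    print(spherical_jl(0, x), math.sin(x) / x)
    print(spherical_jl(1, x), math.sin(x) / x**2 - math.cos(x) / x)
    kr, c = 3.0, 0.4
    print(plane_wave_partial_sum(kr, c, 14), cmath.exp(1j * kr * c))
```

Since $j_l(kr)$ falls off rapidly once $l$ exceeds $kr$, a truncation at `l_max` somewhat above $kr$ already reproduces the plane wave to good accuracy; this is the numerical counterpart of the statement that only low partial waves matter at low energy.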
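Since at large values of the argument the spherical Bessel function becomes
$$j_l(kr) \simeq \frac{e^{ikr}\,i^{-l}-e^{-ikr}\,i^{l}}{2ikr}, \tag{48.8}$$
this means that, at $r\to\infty$, the free plane wave becomes
$$e^{i\vec k\cdot\vec r} \simeq \frac{e^{ikr}}{r}\,\frac{1}{2ik}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta) - \frac{e^{-ikr}}{r}\,\frac{1}{2ik}\sum_{l=0}^\infty(2l+1)\,i^{2l}\,P_l(\cos\theta). \tag{48.9}$$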
Then, in terms of the general expansion at infinity, the coefficients of the free plane wave are
$$\begin{aligned}
A(k,\vec n_r) &= \frac{1}{2ik}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta) = \frac{2\pi}{ik}\,\delta(\vec n_k-\vec n_r)\\
B(k,-\vec n_r) &= \frac{1}{2ik}\sum_{l=0}^\infty(2l+1)\,i^{2l}\,P_l(\cos\theta) = \frac{2\pi}{ik}\,\delta(\vec n_k+\vec n_r).
\end{aligned}\tag{48.10}$$
Then, when we have scattering in a central (spherically symmetric) potential, we saw that we can make the replacement
$$\begin{aligned}
A(k,\vec n_r) = A_k(\vec n') &= \frac{2\pi}{ik}\,\delta(\vec n'-\vec n_k) = \frac{1}{2ik}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\\
&\to \frac{2\pi}{ik}\,\delta(\vec n'-\vec n_k) + f_k(\vec n',\vec n_k),
\end{aligned}\tag{48.11}$$
where $\vec k' = k\vec n'$ generalizes $k\vec n_r$. However, given the expansion in $\cos\theta$ of the coefficient $A$ before the replacement, it means that the added term should also have an expansion, where we replace $1/(2ik)$ with a general coefficient $a_l(k)$, i.e., we define
$$f_k(\vec n',\vec n_k) = f(\theta) = \sum_{l=0}^\infty(2l+1)\,a_l(k)\,P_l(\cos\theta), \tag{48.12}$$
where $\theta$ is the angle between $\vec n'$ and $\vec n_k$. Here $a_l(k)$ is called the $l$th partial wave amplitude. Given the above expansion, the scattering solution for a potential with a finite range, i.e., a free plane wave and a diverging spherical wave, expands into partial waves as follows:
$$\psi_+(\vec r) = e^{i\vec k\cdot\vec r} + f_k(\vec n_r,\vec n_k)\,\frac{e^{ikr}}{r} \simeq \frac{1}{2ik}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\left\{\frac{e^{ikr}}{r}\,\bigl(1+2ik\,a_l(k)\bigr) - \frac{e^{-ikr}}{r}\,i^{2l}\right\}. \tag{48.13}$$
48.2 Phase Shifts
The boundary conditions for the wave function in the spherically symmetric scattering case are imposed on $R_{El}(r)$, and define the abstract state $|Elm\rangle$ in coordinate space. But the physical situation we have now, unlike the previous derivations (in Chapters 18 and 19) of bound states in the case of, say, the hydrogen atom, is that of a scattering solution, which means an incoming plane wave plus an outgoing (or incoming) spherical wave giving the state $|\psi_\pm\rangle$. So, really, it is more useful to write $|Elm\pm\rangle$ for the abstract state and $R_{El}^{\pm}(r)$ to refer to the two outgoing or incoming solutions. If we do not use the $\pm$ index, it means that we are considering a general linear combination of the two solutions. Then the general solution of the Schrödinger equation $|\psi\rangle$ is expanded in terms of some basis $|Elm\rangle$ (a linear combination of the $|Elm\pm\rangle$ states) as
$$\psi_k(\vec r) = \sum_{l=0}^\infty\sum_{m=-l}^{l}K_{lm}\,R_{El}(r)\,Y_{lm}(\theta,\phi) = \sum_{l=0}^\infty\sum_{m=-l}^{l}K_{lm}\,u_{Elm}(\vec r), \tag{48.14}$$
where $R_{El}(r)$ is a linear combination of $R_{El}^{\pm}(r)$. At $r\to\infty$, we have the Helmholtz equation
$$(\Delta+k^2)R = 0, \tag{48.15}$$
so its general solution is either a linear combination of the spherical Bessel function $j_l(kr)$ and the spherical Neumann function $n_l(kr)$ (also written $y_l(kr)$), or a linear combination of the spherical Hankel functions of the first and second kinds, $h_l^{(1)}(kr)$ and $h_l^{(2)}(kr)$,
$$R_{El}(r) \sim C_l\,j_l(kr) + D_l\,n_l(kr) = A_l\,h_l^{(1)}(kr) + B_l\,h_l^{(2)}(kr), \tag{48.16}$$
where, since $h_l^{(1,2)} = j_l\pm in_l$, we have
$$C_l = A_l+B_l, \qquad D_l = iA_l-iB_l. \tag{48.17}$$
But, at large values of the argument, the spherical Hankel functions behave as
$$h_l^{(1)}(kr) \sim \frac{e^{ikr}\,i^{-l}}{ikr}, \qquad h_l^{(2)}(kr) \sim -\frac{e^{-ikr}\,i^{l}}{ikr}, \tag{48.18}$$
which means that if at $r\to\infty$ $B_l = A_l^*$ then $R_{El}(r)$ is real (since $h_l^{(1)*} = h_l^{(2)}$). Moreover, if also $A_l$ is real, the solution has only a $j_l(kr)$ component, as for the free wave, but that is a very special case. Then the real solution for the radial wave function is
$$R_{El}(r) \sim \frac{A_l\,e^{i(kr-l\pi/2)}-A_l^*\,e^{-i(kr-l\pi/2)}}{ikr}, \tag{48.19}$$
and is therefore the sum $R_{El}(r) = R_{El}^{(+)}(r)+R_{El}^{(-)}(r)$, where the first term contains $e^{ikr}$ and the second $e^{-ikr}$. We define the phase of $A_l$ as $e^{i\delta_l}$, i.e.,
$$A_l = A_l^0\,e^{i\delta_l} \;\Rightarrow\; A_l^* = A_l^0\,e^{-i\delta_l}. \tag{48.20}$$
Then $\delta_l$ is called the phase shift, and the real solution for the wave function becomes
$$R_{El}(r) \sim \frac{2A_l^0}{kr}\,\sin\left(kr-\frac{l\pi}{2}+\delta_l\right). \tag{48.21}$$
Since
$$C_l = A_l+A_l^* = 2A_l^0\cos\delta_l, \qquad D_l = iA_l-iA_l^* = -2A_l^0\sin\delta_l, \tag{48.22}$$
it follows that we obtain the phase shift from the expansion in terms of $j_l$ and $n_l$,
$$\tan\delta_l = -\frac{D_l}{C_l}. \tag{48.23}$$
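The bookkeeping between $(A_l, A_l^*)$, $(C_l, D_l)$ and the phase shift in (48.19) to (48.23) is easy to get wrong by a sign, so a small numerical check may help. The sketch below assumes a real, positive modulus $A_l^0$ (so that `atan2` selects the right branch); the function names are illustrative.

```python
import cmath
import math

def phase_shift_from_coefficients(C, D):
    """delta_l from tan(delta_l) = -D_l / C_l, (48.23), for real C_l, D_l and A_l^0 > 0."""
    return math.atan2(-D, C)

def radial_asymptotic_hankel(A, k, r, l):
    """Right-hand side of (48.19) with B_l = A_l^*."""
    phase = k * r - l * math.pi / 2.0
    return (A * cmath.exp(1j * phase) - A.conjugate() * cmath.exp(-1j * phase)) / (1j * k * r)

def radial_asymptotic_sine(A0, delta, k, r, l):
    """Right-hand side of (48.21)."""
    return 2.0 * A0 / (k * r) * math.sin(k * r - l * math.pi / 2.0 + delta)

if __name__ == "__main__":
    A0, delta, k, r, l = 1.7, 0.6, 1.2, 25.0, 2
    A = A0 * cmath.exp(1j * delta)
    C = (A + A.conjugate()).real                # C_l = A_l + A_l^*
    D = (1j * A - 1j * A.conjugate()).real      # D_l = i A_l - i A_l^*
    print(phase_shift_from_coefficients(C, D), delta)
    print(radial_asymptotic_hankel(A, k, r, l), radial_asymptotic_sine(A0, delta, k, r, l))
```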
We then define the wave function solution of the Schrödinger equation,
$$\psi_k(r,\theta) = \sum_{l=0}^\infty K_l\,R_{El}(r)\,P_l(\cos\theta), \tag{48.24}$$
which is a particular linear combination of the $|Elm\rangle$ solutions, and is also of the scattering-solution type (since the partial wave expansion (48.12) is of the same type). The linear combination of solutions, just like the plane wave free solution (48.4), is a particular linear combination of $j_l(kr)Y_{lm}(\theta,\phi)$, with $a_{lm}\propto\delta_{m,0}$. Here $\sum_{m=-l}^{l}K_{lm}Y_{lm}(\theta,\phi)$ reduces to $K_lP_l(\cos\theta)$ when $K_{lm}\propto\delta_{m,0}$. We can define outgoing and incoming solutions $|\psi_\pm\rangle$ from the decomposition of the radial wave functions (48.19), as
$$\begin{aligned}
|\psi_+\rangle \equiv |\vec k+\rangle &= \sum_{l=0}^\infty\sum_{m=-l}^{l}A_{lm}^0\,e^{i\delta_l}\,|Elm+\rangle = \sum_{l=0}^\infty\sum_{m=-l}^{l}A_l^0K_{lm}\,e^{i\delta_l}\,|Elm+\rangle\\
|\psi_-\rangle \equiv |\vec k-\rangle &= \sum_{l=0}^\infty\sum_{m=-l}^{l}A_{lm}^0\,e^{-i\delta_l}\,|Elm-\rangle = \sum_{l=0}^\infty\sum_{m=-l}^{l}A_l^0K_{lm}\,e^{-i\delta_l}\,|Elm-\rangle,
\end{aligned}\tag{48.25}$$
where $A_{lm}^0\equiv A_l^0K_{lm}$, and the states $|Elm\pm\rangle$ are defined with real coefficients (times $e^{\pm ikr}$) at infinity. Then the asymptotics at $r\to\infty$ of the linear combination solution defined above is
$$\psi_k(\vec r) \simeq \sum_{l=0}^\infty A_l^0\,P_l(\cos\theta)\,\frac{e^{i(kr-l\pi/2+\delta_l)}-e^{-i(kr-l\pi/2+\delta_l)}}{2ikr}, \tag{48.26}$$
where we have absorbed $K_l$ into $A_l^0$. We note that the above becomes
$$\langle\vec r|\bigl(|\vec k+\rangle-|\vec k-\rangle\bigr). \tag{48.27}$$
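The $|\vec k\pm\rangle$ are orthonormal,
$$\langle\vec k'+|\vec k+\rangle = \delta^3(\vec k'-\vec k) = \langle\vec k'-|\vec k-\rangle, \tag{48.28}$$
as are the spherical states $|Elm\rangle$,
$$\langle E'l'm'|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E'). \tag{48.29}$$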
However, the product of the in and out states defined in (46.69) is nontrivial,
$$\begin{aligned}
\langle\vec k'-|\vec k+\rangle &= \int d^3r\,\psi_{\vec k'}^{(-)*}(\vec r)\,\psi_{\vec k}^{(+)}(\vec r) = \delta(\vec k'-\vec k) + \frac{ik}{2\pi}\,\delta(k'-k)\,f_k(\vec n',\vec n_k)\\
&= \delta(k'-k)\,\langle\psi'-|\psi+\rangle = \langle\vec k'|\hat S|\vec k\rangle\\
&= \delta(k'-k)\,S(\vec n',\vec n_k),
\end{aligned}\tag{48.30}$$
where, compared with the in and out states defined before in (48.25), we have used states with an extra $k$ modulus tensored in, $|\vec k\pm\rangle = |\psi_\pm\rangle\otimes|k\rangle$, so that in the product we have an extra $\langle k'|k\rangle = \delta(k'-k)$. Now we need to put the solution (48.24) with asymptotics (48.26) into a scattering form. In that form, the incoming wave part ($\propto e^{-ikr}/r$) is only from the plane wave ($e^{i\vec k\cdot\vec r}$) part, so we can identify $A_l^0$ by comparing with the plane wave asymptotics in (48.9), obtaining
$$B(k,\vec n_r) = \sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\,\frac{i^{2l}}{2ik} = \sum_{l=0}^\infty A_l^0\,P_l(\cos\theta)\,\frac{i^{l}\,e^{-i\delta_l(k)}}{2ik}, \tag{48.31}$$
which means that $A_l^0$ is given by
$$A_l^0 = (2l+1)\,e^{i(\delta_l+l\pi/2)}. \tag{48.32}$$
Substituting back into the full $\psi_k(\vec r)$ in (48.24), we obtain first
$$\psi_k(r,\theta) = \sum_{l=0}^\infty K_l\,R_{El}(r)\,P_l(\cos\theta). \tag{48.33}$$
But $K_l$ was absorbed into $A_l^0$ so, up to an irrelevant normalization, we can identify them, implying
$$\psi_k(r,\theta) = \sum_{l=0}^\infty(2l+1)\,e^{i(\delta_l+l\pi/2)}\,R_{El}(r)\,P_l(\cos\theta). \tag{48.34}$$
Then, substituting $A_l^0$ also in asymptotic ($r\to\infty$) form into (48.26), we obtain
$$\begin{aligned}
\psi_k(\vec r) &\simeq \frac{1}{2ikr}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\left[e^{ikr}\,e^{2i\delta_l}-e^{-i(kr-l\pi)}\right]\\
&= e^{i\vec k\cdot\vec r} + \frac{e^{ikr}}{r}\left[\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\,\frac{e^{2i\delta_l}-1}{2ik}\right],
\end{aligned}\tag{48.35}$$
where in the last line we have recreated $e^{i\vec k\cdot\vec r}$ from the $\cos\theta$ expansion. Then we find
$$f_k(\vec n',\vec n_k) = \sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\,\frac{e^{2i\delta_l}-1}{2ik}, \tag{48.36}$$
and, now that we have the asymptotic scattering form, we can identify $a_l(k)$ as
$$a_l(k) = \frac{e^{2i\delta_l(k)}-1}{2ik} = \frac{e^{i\delta_l(k)}\sin\delta_l(k)}{k} = \frac{1}{k\cot\delta_l(k)-ik}. \tag{48.37}$$
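The three equivalent forms of the partial wave amplitude in (48.37) can be confirmed numerically for any nonzero phase shift. The following is a minimal sketch; the function names and the sample values of $\delta_l$ and $k$ are arbitrary.

```python
import cmath
import math

def amplitude_exponential(delta, k):
    """a_l = (e^{2 i delta_l} - 1) / (2 i k)."""
    return (cmath.exp(2j * delta) - 1.0) / (2j * k)

def amplitude_sine(delta, k):
    """a_l = e^{i delta_l} sin(delta_l) / k."""
    return cmath.exp(1j * delta) * math.sin(delta) / k

def amplitude_cotangent(delta, k):
    """a_l = 1 / (k cot(delta_l) - i k); requires sin(delta_l) != 0."""
    return 1.0 / (k / math.tan(delta) - 1j * k)

if __name__ == "__main__":
    k = 1.9
    for delta in (0.1, 0.7, -1.2, 2.5):
        print(delta, amplitude_exponential(delta, k),
              amplitude_sine(delta, k), amplitude_cotangent(delta, k))
```

The cotangent form is the one usually used for low-energy expansions, since $k\cot\delta_l$ has a smooth limit as $k\to 0$ while $\delta_l$ itself vanishes.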
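Moreover, with $\vec n' = \vec n_r$, and $\vec n_r\cdot\vec n_k = \cos\theta$, we have also an expansion in spherical harmonics,
$$f_k(\vec n_r,\vec n_k) = \sum_{l=0}^\infty(2l+1)P_l(\vec n_r\cdot\vec n_k)\,a_l(k) = \sum_{l=0}^\infty\sum_{m=-l}^{l}4\pi\,a_l(k)\,Y_{lm}^*(\vec n_k)\,Y_{lm}(\vec n_r). \tag{48.38}$$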
Considering the scattering solution in (48.13), and comparing with the free case in (48.9), we see that the diverging wave ($\propto e^{ikr}/r$) is just multiplied by the factor
$$1+2ik\,a_l(k) = e^{2i\delta_l(k)} \equiv S_l(k), \tag{48.39}$$
called the $l$th partial wave S-matrix element. Indeed, the S-operator is related to the $f$-operator by
$$\hat S = 1+\frac{ik}{2\pi}\,\hat f, \tag{48.40}$$
so that, by multiplication with $\langle\vec n_r|$ from the left and with $|\vec n_k\rangle$ from the right, we have
$$S(\vec n_r,\vec n_k) = \delta^2(\vec n_r-\vec n_k)+\frac{ik}{2\pi}\,f_k(\vec n_r,\vec n_k), \tag{48.41}$$
which expands into
$$S(\vec n_r,\vec n_k) = \frac{1}{4\pi}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)+\frac{1}{4\pi}\sum_{l=0}^\infty(2l+1)P_l(\cos\theta)\,2ik\,a_l(k), \tag{48.42}$$
and it follows that
$$S_l = 1+2ik\,a_l(k) = e^{2i\delta_l(k)}. \tag{48.43}$$
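Since $S_l = e^{2i\delta_l}$ is a pure phase for real $\delta_l$, unitarity holds partial wave by partial wave, and the optical theorem quoted in Chapter 46, $\mathrm{Im}\,f(\theta=0) = (k/4\pi)\sigma_{\rm tot}$, should follow from (48.12) and (48.37). The sketch below checks this numerically for an arbitrary, made-up set of phase shifts, integrating $|f|^2$ over the sphere with Simpson's rule; nothing here is tied to a specific potential.

```python
import cmath
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def partial_wave_s(delta, k):
    """S_l = 1 + 2 i k a_l with a_l from (48.37)."""
    a_l = (cmath.exp(2j * delta) - 1.0) / (2j * k)
    return 1.0 + 2j * k * a_l

def amplitude(cos_theta, k, deltas):
    """f(theta) of (48.12); deltas[l] is the phase shift of the l-th partial wave."""
    return sum((2 * l + 1) * (cmath.exp(2j * d) - 1.0) / (2j * k) * legendre(l, cos_theta)
               for l, d in enumerate(deltas))

def total_cross_section(k, deltas, n=400):
    """sigma_tot = 2 pi * int_{-1}^{1} |f|^2 d(cos theta), Simpson's rule (n even)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        u = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * abs(amplitude(u, k, deltas)) ** 2
    return 2.0 * math.pi * total * h / 3.0

if __name__ == "__main__":
    k, deltas = 1.5, [0.8, 0.3, 0.05]  # hypothetical phase shifts for l = 0, 1, 2
    print([abs(partial_wave_s(d, k)) for d in deltas])
    print(amplitude(1.0, k, deltas).imag, k / (4.0 * math.pi) * total_cross_section(k, deltas))
```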
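48.3 T-Matrix Element
The $f$-operator is related to the T-operator by
$$\hat f = -\frac{m}{2\pi\hbar^2}\,\hat T, \tag{48.44}$$
which means that their matrix elements are related too, by
$$f(\vec k',\vec k) = \langle\vec k'|\hat f|\vec k\rangle = -\frac{m}{2\pi\hbar^2}\,\langle\vec k'|\hat T|\vec k\rangle. \tag{48.45}$$
Then the T-matrix element is
$$\begin{aligned}
\langle\vec k'|\hat T|\vec k\rangle &= \langle\vec k'|\hat V|\vec k+\rangle\\
&= -\frac{2\pi\hbar^2}{m}\,\delta(k-k')\sum_{l=0}^\infty(2l+1)\,a_l(k)\,P_l(\vec n'\cdot\vec n_k)\\
&\equiv \delta(k-k')\,T_k(\vec n',\vec n_k),
\end{aligned}\tag{48.46}$$
where the overall minus sign follows from (48.45), and where again the $\delta(k-k')$ factor appears because we have modified the states, to $|\vec k+\rangle$ from $|\psi_+\rangle$. Taking into consideration (48.30), and the above T-matrix, we obtain for the S-matrix
$$\begin{aligned}
\langle\vec k'-|\vec k+\rangle = \langle\vec k'|\hat S|\vec k\rangle &= \left\langle\vec k'\left|\left(1-\frac{imk}{(2\pi)^2\hbar^2}\,\hat T\right)\right|\vec k\right\rangle\\
&= \delta^3(\vec k-\vec k')-\frac{imk}{(2\pi)^2\hbar^2}\,\delta(k'-k)\,T_k(\vec n',\vec n_k).
\end{aligned}\tag{48.47}$$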
Then the S- and T-operators are related by
$$\hat S = 1-\frac{imk}{(2\pi)^2\hbar^2}\,\delta(k'-k)\,\hat T_k = 1-\frac{ik^2}{(2\pi)^2}\,\delta(E'-E)\,\hat T_k. \tag{48.48}$$
The S-matrix elements in the spherical, $|Elm\rangle$, basis, are given by
$$\langle E'l'm'|\hat S|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E')\,S_l, \tag{48.49}$$
because of the $\cos\theta$ expansion of the S-matrix $S(\vec n_r,\vec n_k)$ in (48.42). But then the T-matrix element in the same basis is
$$\langle E'l'm'|\hat T|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E-E')\,T_l, \tag{48.50}$$
or, taking out the $\delta(E-E')$ factor,
$$\langle E'l'm'|\hat T_k|Elm\rangle = \delta_{ll'}\delta_{mm'}\,T_l. \tag{48.51}$$
Now multiplying (48.48) by $\langle E'l'm'|$ from the left and by $|Elm\rangle$ from the right, we find (see footnote 1)
$$S_l = 1-2\pi i\,T_l, \tag{48.52}$$
which means that $T_l$ is related to $a_l$ by
$$T_l(k) = -\frac{k}{\pi}\,a_l(k). \tag{48.53}$$
The matrix element relation generalizes to an operatorial relation,
$$\hat S = 1-2\pi i\,\delta(E-E')\,\hat T. \tag{48.54}$$
48.4 Scattering Length
At low energies, $k\to 0$, we will see later that, for a finite range, $\delta_l(k)\propto k^{2l+1}\to 0$, and moreover only $l = 0$ contributes, so $\delta_0(k)\to 0$, and it dominates the other $\delta_l(k)$. But then, the $a_l(k)$ are given by
$$a_l(k) \simeq \frac{\sin\delta_l(k)}{k} \simeq \frac{\tan\delta_l(k)}{k} \simeq \frac{\delta_l(k)}{k}. \tag{48.55}$$
The leading contribution is then
$$a_0(k) \simeq \frac{\delta_0(k)}{k}, \tag{48.56}$$
but it is actually negative, so we define the finite and positive quantity
$$a \equiv -\lim_{k\to 0}a_0(k) = -\lim_{k\to 0}\frac{\delta_0(k)}{k} \tag{48.57}$$
called the scattering length.
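To make the definition (48.57) concrete, the sketch below evaluates it for a hard sphere of radius $R$. The s-wave phase shift of a hard sphere, $\delta_0(k) = -kR$, is a standard textbook result that is assumed here rather than derived in this section; with it, the scattering length comes out equal to the radius. Function names and the small value of $k$ are illustrative.

```python
import cmath
import math

def hard_sphere_delta0(k, R):
    """Assumed s-wave phase shift of a hard sphere of radius R: delta_0 = -k R."""
    return -k * R

def s_wave_amplitude(k, delta0):
    """a_0(k) = e^{i delta_0} sin(delta_0) / k, from (48.37)."""
    return cmath.exp(1j * delta0) * math.sin(delta0) / k

def scattering_length(delta0_of_k, k_small=1e-4):
    """Approximate a = -lim_{k->0} delta_0(k)/k, (48.57), by evaluating at a small k."""
    return -delta0_of_k(k_small) / k_small

if __name__ == "__main__":
    R = 2.0
    for k in (1.0, 0.1, 0.001):
        # a_0(k) tends to -R as k -> 0, so that a = R is positive.
        print(k, s_wave_amplitude(k, hard_sphere_delta0(k, R)))
    print(scattering_length(lambda k: hard_sphere_delta0(k, R)))
```

In this example the low-energy cross section, dominated by $l = 0$, is $4\pi|a_0|^2\to 4\pi a^2$, which is why $a$ can be read as an effective size of the scatterer.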
1
There is an extra (2π) 3 , coming from the, now different, normalization of states (with a 1/(2π) 3/2 factor), and there is also an extra 1/k 2 factor, which appears because of the states being defined as k nk | rather than as nk |, giving a factor of 1/k in the normalization. We take into account this change in normalization because the relation between S and T is usually defined as in the following.
48.5 Jost Functions, Wronskians, and the Levinson Theorem
We have seen before that, for a finite-range potential, i.e., for a potential decaying faster than the Coulomb potential,
$$\lim_{r\to\infty}rV(r) \to 0, \tag{48.58}$$
the radial wave function solution
$$\frac{\chi_{kl}(r)}{r} = R_{kl}(r) \sim \frac{2A_l^0}{kr}\,\sin\left(kr-\frac{l\pi}{2}+\delta_l\right) \tag{48.59}$$
is a real solution, meaning it contains both an outgoing and an incoming part: $R_{kl}^{+}(r)+R_{kl}^{-}(r)$. If we also impose
$$\lim_{r\to 0}r^2V(r) = 0, \tag{48.60}$$
it means that there is a discrete spectrum for $E < 0$ (since we are now imposing two normalizability conditions, at infinity and at zero, on two independent solutions of the Schrödinger equation), but the spectrum is still continuous for $E\ge 0$. The behavior of $\chi_{kl}(r)$ at $r\to 0$ is (as we saw in Chapter 19) $\sim r^{-l}$ or $\sim r^{l+1}$, so that $R_{kl}(r)\sim r^{-l-1}$ or $\sim r^{l}$. We then define the regular (normalizable-at-zero) solution $\chi_l = \phi_l$, which means the solution that has the behavior $\sim r^{l+1}$. Moreover, we choose the normalization constant such that
$$\lim_{r\to 0}r^{-l-1}\phi_l = 1. \tag{48.61}$$
Note that this result is valid whether the energy is positive (a continuous-spectrum, scattering solution) or negative (a discrete-spectrum, bound-state solution). The physically normalized solution is multiplied by a constant $N_{kl}$,
$$\chi_{kl} = N_{kl}\,\phi_l(k;r). \tag{48.62}$$
We then extend $\phi_l(k;r)$ to the full complex plane for $k$ using analytic continuation, which means that $\phi_l$ must be an analytic function of $k$. Its properties are as follows.
(1) $\phi_l(-k;r) = \phi_l(k;r)$, which is part of the definition of the function: on the real axis, $k < 0$ is defined from the physical case, $k > 0$.
(2) $\phi_l(k;r) = \phi_l^*(k^*;r)$, which is a result of the analyticity imposed on $\phi_l$.
(3) If $k\in\mathbb{R}$ then $\phi_l$ is an eigenfunction of the Hamiltonian, but if $k$ is not real (so that $E^*\neq E$), then $\phi_l$ is not an eigenfunction of $H$.
Jost Functions
For $k$ real, we define uniquely the Schrödinger equation solutions $f_l^{\pm}$ that behave at $r\to\infty$ as incoming or outgoing, respectively, i.e.,
$$\lim_{r\to\infty}e^{\pm ikr}f_l^{\pm}(k,r) = 1, \tag{48.63}$$
where again we choose the normalization constant such that we have 1 on the right-hand side. Therefore
$$\chi_l = f_l^{\pm} \simeq e^{\mp ikr} \;\Rightarrow\; \psi\propto\frac{e^{\mp ikr}}{r}. \tag{48.64}$$
Then, at $r\to 0$, the solutions $f_l^{\pm}$ are a linear combination of regular ($\sim r^{l+1}$) and irregular, or nonnormalizable ($\sim r^{-l}$), solutions, with the irregular solution dominating. This means that, at $r\to 0$,
$$f_l^{\pm} \sim C_l^{\pm}\,r^{-l}, \tag{48.65}$$
where $C_l^{\pm}$ is well defined (the subleading, regular, component is $\sim D_l^{\pm}r^{l+1}$). When $k\in\mathbb{C}$, we have that:
$f_l^{+}$ is well defined for $\mathrm{Im}\,k < 0$.
$f_l^{-}$ is well defined for $\mathrm{Im}\,k > 0$.
This implies that, at $r\to\infty$,
$$|f_l^{\pm}(k;r)| \propto e^{\pm(\mathrm{Im}\,k)r}\to 0. \tag{48.66}$$
In one dimension, and therefore also in three dimensions but for the radial direction, we defined the Wronskian theorem. We need only to apply it for the same potential $V(r)$, and the same energy, which means that the Wronskian of two solutions $\psi_1$ and $\psi_2$,
$$W(\psi_1,\psi_2) = \psi_1\psi_2'-\psi_2\psi_1', \tag{48.67}$$
is constant as a function of $r$, so
$$\frac{dW(\psi_1,\psi_2)}{dr} = 0. \tag{48.68}$$
Then in particular we can calculate it at $r\to\infty$. Choosing the two solutions to be $f_l^{\pm}$, in which case $f_l^{\pm}\simeq e^{\mp ikr}$, we find
$$W(f_l^{+},f_l^{-}) = 2ik. \tag{48.69}$$
Thus, for all $k$ we have three solutions, $\phi_l$, $f_l^+$, $f_l^-$, of a second-order equation that admits only two linearly independent solutions. They are therefore linearly dependent, so $\phi_l$ is a linear combination of $f_l^+$ and $f_l^-$:
$$\phi_l = C_1 f_l^+ + C_2 f_l^-. \tag{48.70}$$
In this case, we define the Jost functions $F_l^\pm(k)$ as
$$F_l^\pm(k) \equiv W(f_l^\pm, \phi_l). \tag{48.71}$$
We can calculate the Wronskian (which is again $r$-independent, as we have the same potential and the same energy) either at infinity or at zero. We find (from the Wronskian at infinity)
$$\phi_l = \frac{1}{2ik}\left[-F_l^-(k) f_l^+(k; r) + F_l^+(k) f_l^-(k; r)\right], \tag{48.72}$$
where (from the Wronskian at zero)
$$F_l^\pm(k) = (2l+1) C_l^\pm = (2l+1) \lim_{r\to 0} r^l f_l^\pm(k; r), \tag{48.73}$$
so, at $r \to 0$,
$$f_l^\pm(k; r) \simeq \frac{F_l^\pm(k)}{(2l+1) r^l}. \tag{48.74}$$
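As a quick sanity check of the definitions (48.69) to (48.73), the sketch below evaluates them for the simplest case one can write in closed form: the free particle with $l = 0$, for which we assume $\phi_0 = \sin(kr)/k$ and $f_0^\pm = e^{\mp ikr}$. The function names are hypothetical and chosen only for this illustration. In this case both Jost functions should equal 1 at any radius, and the right-hand side of (48.72) should reproduce $\phi_0$.

```python
import cmath

def phi0(k, r):
    """Regular free l = 0 solution, normalized so that phi0 ~ r as r -> 0."""
    return cmath.sin(k * r) / k

def dphi0(k, r):
    """Derivative of phi0 with respect to r."""
    return cmath.cos(k * r)

def f(sign, k, r):
    """Free l = 0 Jost solution: f^+ = exp(-ikr) (sign=+1), f^- = exp(+ikr) (sign=-1)."""
    return cmath.exp(-sign * 1j * k * r)

def df(sign, k, r):
    """Derivative of the Jost solution with respect to r."""
    return -sign * 1j * k * cmath.exp(-sign * 1j * k * r)

def wronskian(u, du, v, dv):
    """W(u, v) = u v' - v u', as in (48.67)."""
    return u * dv - v * du

def jost(sign, k, r):
    """Jost function F^{+/-}(k) = W(f^{+/-}, phi0), evaluated at radius r."""
    return wronskian(f(sign, k, r), df(sign, k, r), phi0(k, r), dphi0(k, r))

def phi_from_jost(k, r):
    """Right-hand side of (48.72) for the free l = 0 case."""
    Fp, Fm = jost(+1, k, r), jost(-1, k, r)
    return (-Fm * f(+1, k, r) + Fp * f(-1, k, r)) / (2j * k)

if __name__ == "__main__":
    k = 1.3
    for r in (0.2, 5.0):
        print(r, jost(+1, k, r), jost(-1, k, r), phi_from_jost(k, r) - phi0(k, r))
```

Evaluating the Wronskians at two different radii illustrates their $r$-independence; with an interaction present the same check would need numerically integrated solutions instead of closed forms.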
The properties of $f_l^\pm$ and the Jost functions $F_l^\pm$ are:
(1) $f_l^-(k; r) = f_l^+(-k; r)$, since at infinity we have $f_l^\pm \simeq e^{\mp ikr}$.
(2) $[f_l^\pm(-k^*; r)]^* = f_l^\pm(k; r)$, again proven from the behavior at infinity and analyticity.
(3) From the two previous properties, we also obtain $f_l^\pm(k; r) = [f_l^\mp(k^*; r)]^*$. In particular, if $k \in \mathbb{R}$, then $f_l^- = (f_l^+)^*$, so
$$F_l^\pm(-k) = F_l^\mp(k). \tag{48.75}$$
(4) This is generalized through analyticity to the complex-plane relation
$$[F_l^\pm(-k^*)]^* = F_l^\pm(k). \tag{48.76}$$
But if we go back to $k \in \mathbb{R}$, this reduces to a more general relation than the previous one,
$$F_l^+(-k) = (F_l^-(k))^*. \tag{48.77}$$
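As $\phi_l$ is $\chi_l$, from the expansion of $\phi_l$ into $f_l^\pm$, we obtain
$$S_l(k) = e^{2i\delta_l(k)} = \frac{F_l^+(k)}{F_l^-(k)} e^{il\pi} = \frac{F_l^+(k)}{F_l^+(-k)} e^{il\pi}. \tag{48.78}$$
Substituting this into the behavior at $r \to \infty$ of (48.72), we find
$$\phi_l(k; r) \sim \frac{F_l^-(k)}{k} e^{i\delta_l(k)} e^{-il\pi/2} \sin\left(kr - \frac{l\pi}{2} + \delta_l\right). \tag{48.79}$$
If $k \in \mathbb{R}$, then
$$F_l^+(k) = |F_l^+(k)| e^{i\alpha_l}, \qquad F_l^-(k) = |F_l^+(k)| e^{-i\delta_l}. \tag{48.80}$$
This, together with the relation $e^{2i\delta_l} = F_l^+(k) e^{il\pi}/F_l^-(k)$, implies that
$$\delta_l = \alpha_l + l\pi \pmod{2\pi}, \qquad \delta_l(-k) = -\delta_l(k). \tag{48.81}$$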
In this analysis of Jost functions, we have assumed that k ∈ C, though without much justification. We will come back to this in Chapter 50, where we will define more thoroughly the analysis for complex k.
Levinson Theorem

This theorem was proven in 1949 by Levinson, but here we will just present a statement of it, without a proof. We will consider a not very rigorous proof later on.

(a) The first statement regarding $\delta_l(k)$, specifically for the difference between low energy and high energy, is:
$$\delta_0(0) - \delta_0(\infty) = \begin{cases} n_0^b \pi & \text{if } F_0^+(0) \neq 0 \\ (n_0^b + 1/2)\pi & \text{if } F_0^+(0) = 0. \end{cases} \tag{48.82}$$
Note that $F_0^+(0) \neq 0$ means that $\phi_l$ contains both $f_l^+$ and $f_l^-$ components, whereas the $F_0^+(0) = 0$ condition means that $\phi_l \propto f_l^+$, giving a quantization condition (but not two, as in the case of obtaining a discrete level, i.e., a bound state), meaning an extra 1/2 term is added to $n_0^b$. Here $n_0^b$ is the number of energy levels (number of bound states) of given angular momentum $l = 0$, so it is simple to generalize to $n_l^b$ for angular momentum $l$. The assumption of this statement is that if $F_0^+(0) = 0$ then $F_0^+(k) \sim ak$ as $k \to 0$.

(b) The second statement is:
$$\delta_l(0) - \delta_l(\infty) = n_l^b \pi \quad \text{if } F_l^+(0) = 0 \text{ and } F_l^+(k) \sim Ak^2. \tag{48.83}$$
In both cases, an extra assumption is that the potential has not just finite range but also vanishes sufficiently fast at infinity,
$$\lim_{r\to\infty} r^3 V(r) = 0. \tag{48.84}$$
Important Concepts to Remember

• The partial wave expansion is $f_k(\vec n_r, \vec n_k) = \sum_{l=0}^\infty (2l+1) a_l(k) P_l(\cos\theta)$, in terms of the $l$th partial wave amplitude $a_l(k)$, and follows the same expansion as that of the other term, $(2\pi/ik)\delta(\vec n_r - \vec n_k)$, in $A_k(\vec n_r)$, which has $1/(2ik)$ instead of $a_l(k)$.
• For a central potential going to 0 at infinity, the real radial wave function at infinity is $R_{El}(r) = [A_l e^{i(kr - l\pi/2)} - A_l^* e^{-i(kr - l\pi/2)}]/(ikr)$, with $A_l = A_l^0 e^{i\delta_l}$, or $R_{El}(r) = (2A_l^0/kr) \sin(kr - l\pi/2 + \delta_l)$, with $\delta_l$ the phase shift.
• The partial wave amplitude is related to the phase shift by $a_l(k) = (e^{2i\delta_l(k)} - 1)/(2ik) = [e^{i\delta_l(k)} \sin\delta_l(k)]/k = [k\cot\delta_l(k) - ik]^{-1}$, and to the $l$th partial wave S-matrix element by $S_l(k) = e^{2i\delta_l(k)} = 1 + 2ik\,a_l(k)$, for the same partial wave expansion of $S(\vec n_r, \vec n_k)$.
• The S-matrix and T-matrix are related by $S_l = 1 - 2\pi i T_l$ and $\hat S = 1 - 2\pi i \delta(E - E')\hat T$.
• At low energies, $k \to 0$, $a_l(k) \simeq \delta_l(k)/k$, and $a \equiv -\lim_{k\to 0} a_0(k) > 0$ is the scattering length.
• We can define radial wave solutions $R_{kl}(r)$ analytically continued to the complex $k$ plane. Defining, for real $k$, $\phi_l$ via $\lim_{r\to 0} r^{-l-1}\phi_l(r) = 1$ and $f_l^\pm(k; r)$ via $\lim_{r\to\infty} e^{\pm ikr} f_l^\pm(k; r) = 1$, we define the Jost functions as the Wronskians $F_l^\pm(k) = W(f_l^\pm, \phi_l)$.
• The Levinson theorem states that $\delta_0(0) - \delta_0(\infty) = n_0^b\pi$ if $F_0^+(0) \neq 0$, where $n_l^b$ is the number of bound states with angular momentum $l$, and $\delta_0(0) - \delta_0(\infty) = (n_0^b + 1/2)\pi$ if $F_0^+(0) = 0$, while $\delta_l(0) - \delta_l(\infty) = n_l^b\pi$ if $F_l^+(0) = 0$ and $F_l^+(k) \sim Ak^2$.
Further Reading For partial waves, phase shifts, and scattering length, see [1] and [2]. For the Levinson theorem, see Levinson’s paper [28].
Exercises

(1) Consider scattering onto a delta function potential, $V = -V_0\delta^3(\vec r)$, in the Born approximation. Calculate the partial wave amplitudes $a_l(k)$.
(2) Consider a spherical well potential $V = -V_0$ for $r \leq R$, and $V = 0$ for $r > R$, in the Born approximation, and waves with $E > 0$. Calculate the phase shifts $\delta_l(k)$ for scattering.
(3) In the case in exercise 2, calculate $S_l(k)$ and the differential cross section.
(4) If $\delta_l(k)$ is real, what do you deduce about $T_l(k)$?
(5) Calculate the scattering length for the case in exercise 1.
(6) Is relation (48.72) well defined for complex $k$? Why?
(7) If $\lim_{k\to 0} F_l^+(k)/k^2$ is constant for $l$ even, find the $k \to 0$, $r \to \infty$ behavior with $k$ of $\phi_l(k; r)$ for even $l$.
49
Unitarity, Optics, and the Optical Theorem
In this chapter, we will revisit the issue of unitarity and the optical theorem for scattering. An example of the partial wave formalism, for a hard sphere, puts us on the track of quantum mechanical scattering as optics.
49.1 Unitarity: Review and Reanalysis

To understand the application of unitarity, we first recall what unitarity means: in quantum mechanics, the conservation of probability implies that the time evolution is unitary, namely that the evolution operator is unitary, $\hat U^\dagger = \hat U^{-1}$. But the S-matrix is related to the evolution operator by $\hat S = \hat U_I(+\infty, -\infty)$, so $\hat S^\dagger = \hat S^{-1}$. Then unitarity of the S-matrix is equivalent to the conservation of probability. On the other hand, in the $|Elm\rangle$ basis, which means in the partial wave formalism, the S-operator $\hat S$ is (as we saw) represented by $S_l = e^{2i\delta_l}$, and so
$$\langle E'l'm'|\hat S|Elm\rangle = \delta_{ll'}\delta_{mm'}\delta(E - E') S_l. \tag{49.1}$$
The diagonalization comes from the fact that the time evolution operator $\hat S = \hat U_I(+\infty, -\infty)$ has common eigenfunctions with $\hat H$ (with eigenvalue $E$), $\hat{\vec L}^2$ (with eigenvalue related to $l$), and $\hat L_z$ (with eigenvalue $m$). This follows from the fact that the time evolution leaves invariant the energy $E$, the angular momentum $\vec L^2$, and the angular momentum projection $L_z$. Classically, this means that
$$\partial_t H = \partial_t \vec L^2 = \partial_t L_z = 0, \tag{49.2}$$
while in quantum mechanics, invariance under time translation means that the commutator with $\hat H$ vanishes, $\partial_t \cdots = 0 \to [\hat H, \ldots] = 0$ (the generator of time translation is $\hat H$). The time evolution invariance is thus
$$[\hat H, \hat H] = [\hat H, \hat{\vec L}^2] = [\hat H, \hat L_z] = 0, \tag{49.3}$$
which is true in a spherically symmetric system. But then unitarity, $\hat S^\dagger = \hat S^{-1}$, implies $S_l^* = S_l^{-1}$ and, since $S_l = e^{2i\delta_l(k)}$, this means that $\delta_l^*(k) = \delta_l(k)$, so that the phase shift is real. But this was what we (implicitly) assumed when defining the phase shift. Equivalently, the unitarity condition is $|S_l(k)| = 1$, which means that the only change in the outgoing wave with respect to the incoming one is a phase, not an attenuation ($e^{-|\mathrm{Im}\,\delta_l|}$), which would lead to probability decay (in a physical case, probability decay could only mean that the system is not complete, and the probability "leaks" somewhere else, into another system coupled to the one we are analyzing).

Since
$$k a_l = \frac{S_l - 1}{2i} = \frac{i}{2} - \frac{i e^{2i\delta_l}}{2} = \frac{i}{2} + \frac{e^{-i\pi/2 + 2i\delta_l}}{2}, \tag{49.4}$$
this means that, as $\delta_l$ varies over real values, $k a_l$ traces out a circle in the complex plane, centered on $i/2$ and of radius $1/2$, called the "unitarity circle".
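The statement about the unitarity circle can be checked numerically. The following is a minimal sketch (function names are hypothetical) that evaluates $k a_l = (S_l - 1)/(2i)$ for a given phase shift and measures its distance from the center $i/2$. For real $\delta_l$ the distance is exactly $1/2$; for a complex phase shift with $\mathrm{Im}\,\delta_l > 0$, used here only to model a "probability leak", the point falls inside the circle.

```python
import cmath

def k_times_a(delta):
    """k a_l = (S_l - 1) / (2i), with S_l = exp(2 i delta_l), as in (49.4)."""
    S = cmath.exp(2j * delta)
    return (S - 1) / 2j

def distance_from_center(delta):
    """Distance of k a_l from the center i/2 of the unitarity circle."""
    return abs(k_times_a(delta) - 0.5j)

if __name__ == "__main__":
    for delta in (0.0, 0.4, 1.5, -2.0):
        print(delta, distance_from_center(delta))   # 0.5 for every real delta
    print(distance_from_center(0.4 + 0.2j))          # < 0.5: inside the circle
```

Algebraically the distance is $|S_l|/2$, so the circle is reached exactly when $|S_l| = 1$.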
49.2 Application to Cross Sections

To see the effect of unitarity on physical measurements, we need to look at cross sections. As we saw, in the partial wave formalism,
$$\frac{d\sigma}{d\Omega} = |f_k(\theta)|^2. \tag{49.5}$$
But since (see (48.12))
$$f_k(\theta) = \sum_{l=0}^\infty (2l+1) P_l(\cos\theta)\, a_l(k), \qquad a_l(k) = \frac{e^{2i\delta_l} - 1}{2ik} = \frac{S_l - 1}{2ik}, \tag{49.6}$$
it follows that
$$\frac{d\sigma}{d\Omega} = \frac{1}{4k^2} \sum_{l=0}^\infty \sum_{l'=0}^\infty (2l+1)(2l'+1)\left(e^{2i\delta_l(k)} - 1\right)\left(e^{-2i\delta_{l'}(k)} - 1\right) P_l(\cos\theta) P_{l'}(\cos\theta), \tag{49.7}$$
where we have used the fact that $P_l(\cos\theta)$ is real. Then the total cross section is found by integrating over $d\Omega$,
$$\sigma_{\rm tot}(k) = \int d\Omega \frac{d\sigma}{d\Omega} = \frac{1}{4k^2} \sum_{l=0}^\infty \sum_{l'=0}^\infty \int_0^{2\pi} d\phi \int_{-1}^{+1} d(\cos\theta)\, P_l(\cos\theta) P_{l'}(\cos\theta)\, (2l+1)(2l'+1)\left(e^{2i\delta_l(k)} - 1\right)\left(e^{-2i\delta_{l'}(k)} - 1\right). \tag{49.8}$$
Now we use the orthogonality condition of the Legendre polynomials (see (17.38)),
$$\int_{-1}^{+1} d(\cos\theta)\, P_l(\cos\theta) P_{l'}(\cos\theta) = \frac{2\delta_{ll'}}{2l+1}, \tag{49.9}$$
so that we obtain
$$\sigma_{\rm tot}(k) = \frac{\pi}{k^2} \sum_{l=0}^\infty (2l+1) \left|e^{2i\delta_l(k)} - 1\right|^2 = \frac{4\pi}{k^2} \sum_{l=0}^\infty (2l+1) \sin^2\delta_l(k). \tag{49.10}$$
This means we can define a cross section $\sigma_l$ through
$$\sigma_{\rm tot}(k) = \sum_{l=0}^\infty \sigma_l(k), \tag{49.11}$$
and, identifying the two expansions, we obtain
$$\sigma_l(k) = \frac{4\pi}{k^2}(2l+1)\sin^2\delta_l(k). \tag{49.12}$$
A more general formula applies also to the case when unitarity is "violated", meaning that $\delta_l(k)$ is complex (which implies that the system under analysis interacts with other systems, having a "probability leak", so the scattering is "inelastic"). This formula is
$$\sigma_l(k) = \frac{\pi}{k^2}(2l+1)\left|e^{2i\delta_l(k)} - 1\right|^2. \tag{49.13}$$
In the case of unitary evolution ($\delta_l(k) \in \mathbb{R}$), since $\sin^2\delta_l(k) \leq 1$ we have a "unitarity bound",
$$\sigma_l(k) \leq \frac{4\pi}{k^2}(2l+1) \equiv \sigma_l^{\rm max}. \tag{49.14}$$
We will come back to this bound later but, for the moment, we just note that the bound is saturated at $\delta_l = \pi/2$, which means that when $\delta_l \simeq \pi/2$, the $l$th partial wave has a maximal effect. However, we need to calculate $\delta_l(k)$. To do that, we note that $a_l(k)$ is found from (49.6) by integration with $P_l(\cos\theta)\, d\cos\theta$, so
$$\int_{-1}^{1} d(\cos\theta)\, P_l(\cos\theta) f_k(\theta) = 2a_l(k) = \frac{e^{2i\delta_l(k)} - 1}{ik} = 2e^{i\delta_l(k)}\frac{\sin\delta_l}{k}. \tag{49.15}$$
But the Lippmann–Schwinger equation implies (see (46.71))
$$f_k(\theta) = -\frac{2m}{4\pi\hbar^2} \int d^3r'\, e^{\mp i\vec k'\cdot\vec r'}\, V(\vec r')\, \psi_\pm(\vec k, \vec r'), \tag{49.16}$$
where we have substituted $G_0^{(+)}(\vec r, \vec r')$, and in the spherically symmetric case the $l$-expansion of the wave function is
$$\psi_+(k; r, \theta) = \sum_{l=0}^\infty (2l+1) e^{i(\delta_l + l\pi/2)} R_l(k; r) P_l(\cos\theta). \tag{49.17}$$
Instead of continuing like this, we can use a spherical expansion of the Green's function,
$$G_0^{(+)}(\vec r, \vec r') = -ik \sum_{l,m} Y_{lm}(\vec n_r) Y_{lm}^*(\vec n_{r'})\, j_l(kr_<)\, h_l^{(1)}(kr_>), \tag{49.18}$$
which implies, via the $l$-expansion of the wave function, the Lippmann–Schwinger equation for the radial wave function,
$$R_l(k; r) = j_l(kr) - \frac{2mik}{\hbar^2} \int_0^\infty dr'\, r'^2\, j_l(kr_<)\, h_l^{(1)}(kr_>)\, V(r')\, R_l(k; r'). \tag{49.19}$$
Since the first term is the expansion of the plane wave, the second term is the expansion of $f_k(\theta)$, which gives $a_l(k)$, so that
$$a_l(k) = \frac{e^{i\delta_l(k)}\sin\delta_l(k)}{k} = -\frac{2m}{\hbar^2} \int_0^\infty r^2\, dr\, V(r)\, j_l(kr)\, R_l(k; r). \tag{49.20}$$
49.3 Radial Green's Functions

We can also expand the Green's functions and the Lippmann–Schwinger equation using a partial wave expansion. The Green's function partial wave expansion is
$$G(z; \vec r, \vec r') = G(z; r, r', \theta) = \sum_{l=0}^\infty \frac{2l+1}{4\pi}\, \tilde g_l(z; r, r')\, P_l(\vec n_r\cdot\vec n_{r'}) = \sum_{l=0}^\infty \sum_{m=-l}^{l} \tilde g_l(z; r, r')\, Y_{lm}(\vec n_r) Y_{lm}^*(\vec n_{r'}), \tag{49.21}$$
where $\tilde g_l(z; r, r')$ is known as the radial Green's function. It satisfies
$$\left[z + \frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{d}{dr} r^2 \frac{d}{dr} - \frac{l(l+1)}{r^2}\right) - V(r)\right] \tilde g_l(z; r, r') = \frac{\delta(r - r')}{r^2}, \tag{49.22}$$
and has the spectral decomposition
$$\tilde g_l(z; r, r') = \sum_n \frac{R_{nl}(r) R_{nl}^*(r')}{z - E_n}, \tag{49.23}$$
representing in radial space the general formula
$$\hat G(z) = \sum_n \frac{|n\rangle\langle n|}{z - E_n}. \tag{49.24}$$
In the free particle case, the radial Green's function is
$$\tilde g_l^{(+)0}(E; r, r') = -\frac{2mi}{\hbar^2}\, q\, j_l(qr_<)\, h_l^{(1)}(qr_>), \tag{49.25}$$
where
$$E = \frac{\hbar^2 q^2}{2m} \tag{49.26}$$
and $r_< = \min(r, r')$, $r_> = \max(r, r')$. It appears in the radial Lippmann–Schwinger equation,
$$R_l(k; r) = j_l(kr) + \int dr'\, r'^2\, \tilde g_l^{(+)0}(E; r, r')\, V(r')\, R_l(k; r'), \tag{49.27}$$
which leads to the equation (49.19). It also leads to the Lippmann–Schwinger equation for the full Green's function $\tilde g_l$,
$$\tilde g_l(z; r, r') = \tilde g_l^{(+)0}(z; r, r') + \int dr''\, r''^2\, \tilde g_l^{(+)0}(z; r, r'')\, V(r'')\, \tilde g_l(z; r'', r'). \tag{49.28}$$
As an example, consider the radial delta function potential,
$$V(r) = -\lambda\delta(r - a). \tag{49.29}$$
Then the Lippmann–Schwinger equation for the full radial Green's function has the solution
$$\tilde g_l(z; r, r') = \tilde g_l^{(+)0}(z; r, r') - \lambda a^2\, \frac{\tilde g_l^{(+)0}(z; r, a)\, \tilde g_l^{(+)0}(z; a, r')}{1 + \lambda a^2\, \tilde g_l^{(+)0}(z; a, a)}. \tag{49.30}$$
This means that the radial Green's function gives
$$R_l(k; r) = j_l(kr) - \lambda a^2\, \frac{\tilde g_l^{(+)0}(E; r, a)\, j_l(ka)}{1 + \lambda a^2\, \tilde g_l^{(+)0}(E; a, a)}. \tag{49.31}$$
The Jost solutions are
$$f_l^\mp(k; r) = \pm ikr\, h_l^{(1,2)}(kr). \tag{49.32}$$
The details are left as an exercise.
49.4 Optical Theorem

We have already proven a version of the optical theorem in Chapter 46, but here we will revisit it. First, we review the proof given in Chapter 46. Unitarity means $\hat S\hat S^\dagger = 1$, but since
$$\hat S = 1 + \frac{ik}{2\pi}\hat f, \tag{49.33}$$
it follows that
$$\frac{\hat f - \hat f^\dagger}{2i} = \frac{k}{4\pi}\hat f\hat f^\dagger. \tag{49.34}$$
When we represent this abstract statement in a basis, and take $\vec n_r = \vec n_k$, we obtain the total cross section, since
$$\mathrm{Im}\, f(\vec n_k, \vec n_k) = \frac{k}{4\pi} \int d^2 n_r\, |f(\vec n_r, \vec n_k)|^2 = \frac{k}{4\pi}\sigma_{\rm tot}, \tag{49.35}$$
and on the left-hand side we have the forward direction,
$$f(\theta = 0) = f(\vec n_k, \vec n_k). \tag{49.36}$$
But we want to describe this proof in physical terms, which means in terms of the interaction with a potential $V$. To do that, we relate the above calculation to the T-matrix, by
$$\hat f = -\frac{m}{2\pi\hbar^2}\hat T. \tag{49.37}$$
Then, when taking the matrix element in the $|\vec k\rangle$ basis, we obtain
$$f(\theta = 0) = f(\vec n_k, \vec n_k) = -\frac{m}{2\pi\hbar^2}\langle\vec k|\hat T|\vec k\rangle. \tag{49.38}$$
But the T-matrix element is related to the potential $\hat V$ in between two states, so
$$\mathrm{Im}\,\langle\vec k|\hat T|\vec k\rangle = \mathrm{Im}\,\langle\vec k|\hat V|\psi^+\rangle. \tag{49.39}$$
From the Lippmann–Schwinger equation, we obtain
$$\langle\vec k|\hat V|\psi^+\rangle = \langle\psi^+|\hat V|\psi^+\rangle - \left\langle\psi^+\left|\hat V\frac{1}{E - \hat H_0 - i\epsilon}\hat V\right|\psi^+\right\rangle. \tag{49.40}$$
On the other hand, from a simple complex analysis formula extended to operators,
$$\hat G_0^{(+)} = \frac{1}{E - \hat H_0 - i\epsilon} = P\frac{1}{E - \hat H_0} + i\pi\delta(E - \hat H_0), \tag{49.41}$$
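where $P$ refers to the principal part. But when taking the imaginary part, we obtain a zero if we have a diagonal matrix element of a Hermitian ("real") operator, so
$$\mathrm{Im}\,\langle\psi^+|\hat V|\psi^+\rangle = 0, \qquad \mathrm{Im}\left\langle\psi^+\left|\hat V P\frac{1}{E - \hat H_0}\hat V\right|\psi^+\right\rangle = 0. \tag{49.42}$$
Then out of the three terms in the imaginary part of the T-matrix element, only one contributes (and is actually purely imaginary), and we obtain
$$\begin{aligned}
\mathrm{Im}\,\langle\vec k|\hat V|\psi^+\rangle &= -\pi\langle\psi^+|\hat V\delta(E - \hat H_0)\hat V|\psi^+\rangle = -\pi\langle\vec k|\hat T^\dagger\delta(E - \hat H_0)\hat T|\vec k\rangle \\
&= -\pi\int d^3\tilde k\, \langle\vec k|\hat T^\dagger|\tilde{\vec k}\rangle\, \delta\!\left(E - \frac{\hbar^2\tilde k^2}{2m}\right) \langle\tilde{\vec k}|\hat T|\vec k\rangle \\
&= -\frac{\pi m}{\hbar^2 k}\int d\Omega\, k^2\, \left|\langle\vec k'|\hat T|\vec k\rangle\right|^2,
\end{aligned} \tag{49.43}$$
where in the last step we have used
$$\delta\!\left(E - \frac{\hbar^2\tilde k^2}{2m}\right) = \frac{\delta(k - \tilde k)}{\hbar^2\tilde k/m}. \tag{49.44}$$
Then also (since $d^2n' = d^2n_{k'} = d\Omega$)
$$\int d^3\tilde k\, \delta(k - \tilde k) = \int d^2n_{k'}\, k^2 \times (|\tilde{\vec k}\rangle \to |\vec k'\rangle) = \int d\Omega\, k^2 \times (|\tilde{\vec k}\rangle \to |\vec k'\rangle). \tag{49.45}$$
Finally, then, we obtain
$$\begin{aligned}
\mathrm{Im}\, f(\theta = 0) &= -\frac{m}{2\pi\hbar^2}\mathrm{Im}\,\langle\vec k|\hat V|\psi^+\rangle \\
&= \frac{m^2 k}{2\hbar^4}\int d\Omega\, \left|\langle\vec k'|\hat T|\vec k\rangle\right|^2 \\
&= \frac{k}{4\pi}\int d\Omega\, \left|-\frac{m}{2\pi\hbar^2}\langle\vec k'|\hat T|\vec k\rangle\right|^2 \\
&= \frac{k}{4\pi}\int d\Omega\, |f(\theta)|^2 = \frac{k}{4\pi}\sigma_{\rm tot}.
\end{aligned} \tag{49.46}$$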
This completes the physical proof of the optical theorem. While the result is the same, and we have taken a longer route to prove it, we still have gained something. The use of the Lippmann– Schwinger equation, which can be solved perturbatively through the Born series, means that we can define the perturbative series on both sides of the equation for the optical theorem, and consider a term-by-term analysis (though we will not do so here). Moreover, in the previous proof, we assumed that the S-matrix is unitary, but that means assuming that the quantum mechanical construction of scattering is fully consistent (and we have not proved this until now), whereas here the proof is explicitly written in terms of the potential.
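In the partial wave language of Section 49.2, the optical theorem can be verified directly for any set of real phase shifts. The sketch below (hypothetical function names, arbitrary illustrative phase shifts) builds the forward amplitude $f(\theta = 0) = \sum_l (2l+1) a_l(k)$, using $P_l(1) = 1$ and $a_l = e^{i\delta_l}\sin\delta_l/k$, and compares its imaginary part with $k\sigma_{\rm tot}/4\pi$ computed from (49.10).

```python
import cmath
import math

def forward_amplitude(k, deltas):
    """f(theta = 0) = sum_l (2l+1) a_l(k), with a_l = exp(i delta_l) sin(delta_l) / k.

    deltas[l] is the (real) phase shift of the l-th partial wave; P_l(1) = 1.
    """
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) / k
               for l, d in enumerate(deltas))

def total_cross_section(k, deltas):
    """sigma_tot = (4 pi / k^2) sum_l (2l+1) sin^2(delta_l), as in (49.10)."""
    return 4 * math.pi / k**2 * sum((2 * l + 1) * math.sin(d) ** 2
                                    for l, d in enumerate(deltas))

if __name__ == "__main__":
    k = 1.9
    deltas = [0.7, -0.3, 0.1]          # illustrative values, not from a real potential
    lhs = forward_amplitude(k, deltas).imag
    rhs = k * total_cross_section(k, deltas) / (4 * math.pi)
    print(lhs, rhs)                    # the two numbers agree
```

The agreement holds wave by wave because $\mathrm{Im}(e^{i\delta_l}\sin\delta_l) = \sin^2\delta_l$ for real $\delta_l$; with complex phase shifts (inelastic scattering) the elastic cross section alone no longer saturates the identity.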
49.5 Scattering on a Potential with a Finite Range

We now give an example of how to calculate $\delta_l$ and solve the partial wave problem. Consider the case of a potential that is strictly vanishing outside a spherical shell,
$$V(r) = 0 \quad \text{for } r \geq r_0. \tag{49.47}$$
In such a case (the case of a potential with a strict finite range), outside the spherical shell the solution of the Schrödinger equation is a free spherical wave. Throughout the space (both outside and inside the spherical shell), we found (see (48.34)) that the wave function solution for a spherically symmetric potential is
$$\psi_k(r, \theta) = \sum_{l=0}^\infty (2l+1) e^{i(\delta_l + l\pi/2)} \tilde R_{El}(r) P_l(\cos\theta) = \sum_{l=0}^\infty R_{El}(r) P_l(\cos\theta). \tag{49.48}$$
Furthermore, we found an asymptotic solution, (48.16), for which the potential was vanishing (the interaction was turning off). But in our case, the potential is really zero, so the solution is exact,
$$R_{El}(r) = C_l\, j_l(kr) + D_l\, n_l(kr), \tag{49.49}$$
where
$$C_l = A_l + A_l^* = A_l^0\, 2\cos\delta_l, \qquad D_l = iA_l - iA_l^* = A_l^0(-2\sin\delta_l) \tag{49.50}$$
and
$$A_l^0 = \frac{R_{El}(r)}{\tilde R_{El}(r)} = (2l+1) e^{i(\delta_l + l\pi/2)}. \tag{49.51}$$
In order to find the wave function, as we saw in the one-dimensional case, at the place where the potential jumps, we need to impose continuity of the logarithmic derivative of the wave function. Calculating it at $r_0 + \epsilon$, so we can use the above formulas, we obtain
$$q_{0l} \equiv \left.\frac{r}{R_{El}(r)}\frac{dR_{El}(r)}{dr}\right|_{r = r_0} = kr_0\, \frac{j_l'(kr_0)\cos\delta_l - n_l'(kr_0)\sin\delta_l}{j_l(kr_0)\cos\delta_l - n_l(kr_0)\sin\delta_l}. \tag{49.52}$$
Then we calculate $q_{0l}$ at $r_0 - \epsilon$, in terms of the wave function solution at $r \leq r_0$; equating it to the above gives the full solution, in terms of the calculated $\delta_l$. Indeed, inverting the formula, we find
$$\tan\delta_l = \frac{kr_0\, j_l'(kr_0) - q_{0l}\, j_l(kr_0)}{kr_0\, n_l'(kr_0) - q_{0l}\, n_l(kr_0)}. \tag{49.53}$$
To find the solution inside the shell, we must impose regularity (normalizability) of the solution at $r = 0$, which, as we saw in Chapter 19, means that $\chi_{El}(r = 0) = r R_{El}(r)|_{r=0} = 0$.
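To make (49.53) concrete, here is a minimal sketch for $l = 0$ only, where $j_0(x) = \sin x/x$ and $n_0(x) = -\cos x/x$ can be written in closed form. As an illustrative interior solution (an assumption for this example, not a case worked out above) we take a constant potential inside $r_0$ with interior wave number `k_in`, so that the regular solution is $R \propto \sin(k_{\rm in} r)/r$ and $q_{00} = k_{\rm in} r_0 \cot(k_{\rm in} r_0) - 1$. All function names are hypothetical.

```python
import math

def j0(x):
    return math.sin(x) / x

def n0(x):
    return -math.cos(x) / x

def dj0(x):
    """Derivative of j0 with respect to its argument."""
    return math.cos(x) / x - math.sin(x) / x**2

def dn0(x):
    """Derivative of n0 with respect to its argument."""
    return math.sin(x) / x + math.cos(x) / x**2

def tan_delta0(k, r0, q0):
    """Equation (49.53) for l = 0, given the logarithmic derivative q0 at r0."""
    x = k * r0
    return (x * dj0(x) - q0 * j0(x)) / (x * dn0(x) - q0 * n0(x))

def q0_constant_potential(k_in, r0):
    """q_00 = r R'/R at r0 for the regular interior solution R = sin(k_in r)/r."""
    return k_in * r0 / math.tan(k_in * r0) - 1.0

if __name__ == "__main__":
    k, k_in, r0 = 0.5, 1.2, 1.0
    q0 = q0_constant_potential(k_in, r0)
    print(tan_delta0(k, r0, q0))
    # Equivalent closed form for this case: tan(k r0 + delta0) = (k / k_in) tan(k_in r0)
    print(math.tan(math.atan((k / k_in) * math.tan(k_in * r0)) - k * r0))
```

Taking `q0` very large reproduces the hard-sphere limit $\tan\delta_0 = j_0(kr_0)/n_0(kr_0) = -\tan(kr_0)$; for general $l$ one would replace the closed forms by spherical Bessel routines.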
49.6 Hard Sphere Scattering

The simplest example of the calculation inside radius $r = r_0$ is for a "hard sphere", with infinite potential, $V = \infty$, for $r \leq r_0$. Then the boundary condition at $r = r_0$ is simply that the radial wave function vanishes, $R_{El}(r_0) = 0$, which means
$$j_l(kr_0)\cos\delta_l - n_l(kr_0)\sin\delta_l = 0, \tag{49.54}$$
leading to
$$\tan\delta_l = \frac{j_l(kr_0)}{n_l(kr_0)}. \tag{49.55}$$
Equivalently, we can take the limit $q_{0l} \to \infty$ (since the denominator in the definition of $q_{0l}$ has $R_{El}(r_0) = 0$, as the wave function does when it hits an infinite wall) in the general formula for $V(r < r_0)$, and we obtain the same result. For $l = 0$, and at $r \geq r_0$, we find
$$\tan\delta_0 = \frac{j_0(kr_0)}{n_0(kr_0)} = \frac{\sin(kr_0)/(kr_0)}{-\cos(kr_0)/(kr_0)} = -\tan(kr_0) = \tan(-kr_0). \tag{49.56}$$
Then we identify the arguments of the tangent, since both of them are bounded,
$$\delta_0 = -kr_0. \tag{49.57}$$
The radial wave function (for $r \geq r_0$) is
$$R_{El}(r) = 2A_l^0\left(j_l(kr)\cos\delta_l - n_l(kr)\sin\delta_l\right), \tag{49.58}$$
which means that for $l = 0$ we obtain
$$R_{E0}(r) = 2A_0^0\left(\frac{\sin kr}{kr}\cos\delta_0 + \frac{\cos kr}{kr}\sin\delta_0\right) = \frac{2A_0^0}{kr}\sin(kr + \delta_0) = \frac{2A_0^0}{kr}\sin[k(r - r_0)], \tag{49.59}$$
a sinusoid shifted by $r_0$, i.e., starting at $r_0$ instead of zero. Then, we calculate the scattering length,
$$a = -\lim_{k\to 0}\frac{\delta_0(k)}{k} = r_0. \tag{49.60}$$
Note that now we do not actually need to take the low-energy limit, since $\delta_0(k)/k = -r_0$ for all $k$.
49.7 Low-Energy Limit

We now consider the low-energy limit for the hard sphere potential. For spherical Bessel functions at small argument, $x = kr_0 \ll 1$,
$$j_l(kr_0) \simeq \frac{(kr_0)^l}{(2l+1)!!}, \qquad n_l(kr_0) \simeq -\frac{(2l-1)!!}{(kr_0)^{l+1}}. \tag{49.61}$$
In the $k \to 0$ limit, we saw that $\delta_l \simeq \tan\delta_l$, so
$$\delta_l(k) \simeq \tan\delta_l(k) \simeq -\frac{(kr_0)^{2l+1}}{(2l+1)!!\,(2l-1)!!}. \tag{49.62}$$
For the partial wave amplitude, in the low-energy limit, we obtain
$$a_l(k) \simeq \frac{\delta_l(k)}{k} \propto k^{2l}. \tag{49.63}$$
This means that $\delta_0$ dominates strongly, giving an isotropic ($l = 0$) result. The $l$th partial wave cross section is
$$\sigma_l = 4\pi(2l+1)\frac{\sin^2\delta_l(k)}{k^2} \propto k^{4l}, \tag{49.64}$$
which means all these cross sections go to zero, except $\sigma_0$, which stays constant. Then
$$\sigma_0 = 4\pi\lim_{k\to 0}\frac{\sin^2\delta_0(k)}{k^2} = 4\pi a^2, \qquad a = -\lim_{k\to 0}\frac{\delta_0(k)}{k} = r_0, \tag{49.65}$$
so the total cross section is
$$\sigma_{\rm tot} \simeq \sigma_0 \simeq 4\pi a^2 = 4\pi r_0^2, \tag{49.66}$$
which is four times the geometrical cross section for radius $r_0$, $\pi r_0^2$. Moreover, the differential cross section is also constant,
$$\frac{d\sigma}{d\Omega} = |f|^2 \simeq |a_0(k) P_0(\cos\theta)|^2 = \lim_{k\to 0}\left(\frac{\delta_0(k)}{k}\right)^2 = a^2 = r_0^2, \tag{49.67}$$
where we have used $P_0(\cos\theta) = 1$. This is consistent with the previous result, as the integration over the solid angle just gives a factor equal to the total solid angle, $4\pi$.

Since we have four times the geometrical cross section (which would be the classical result), we have to ask, why do we have a larger result? One answer is that at low energy the wavelengths of the particles are larger than the range of the potential, leading to an extreme quantum regime, the opposite of the classical limit. On the other hand, at high energy, $kr_0 \gg 1$, we still find a result higher than the classical one, even though now the wavelengths are going to zero. The result is (the derivation can be found in [1])
$$\sigma_{\rm tot} = 2\pi r_0^2 = \sigma_{\rm reflection} + \sigma_{\rm shadow}, \tag{49.68}$$
where $\sigma_{\rm reflection} = \pi r_0^2$ is the classical result, and $\sigma_{\rm shadow}$ is the Fraunhofer diffraction result in optics. This is consistent with the geometric optics approximation of the quantum mechanical wave function (semiclassical), which is a short-wavelength approximation.
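The hard-sphere results of Sections 49.6 and 49.7 lend themselves to a short numerical illustration. The sketch below is only a minimal implementation under stated assumptions: the spherical Bessel functions are generated by the standard upward recurrence $z_{l+1}(x) = \frac{2l+1}{x} z_l(x) - z_{l-1}(x)$, which is adequate for the few low partial waves used here but loses accuracy for $j_l$ when $l \gg x$. The function names are hypothetical. The code uses $\sin^2\delta_l = j_l^2/(j_l^2 + n_l^2)$, which follows from $\tan\delta_l = j_l(kr_0)/n_l(kr_0)$.

```python
import math

def spherical_jn_lists(lmax, x):
    """Return ([j_0..j_lmax], [n_0..n_lmax]) at x via upward recurrence."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    n = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, lmax):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        n.append((2 * l + 1) / x * n[l] - n[l - 1])
    return j[:lmax + 1], n[:lmax + 1]

def hard_sphere_tan_delta(l, k, r0):
    """tan(delta_l) = j_l(k r0) / n_l(k r0), equation (49.55)."""
    j, n = spherical_jn_lists(max(l, 1), k * r0)
    return j[l] / n[l]

def hard_sphere_sigma_tot(k, r0, lmax=2):
    """Sum of sigma_l = (4 pi / k^2)(2l+1) sin^2(delta_l) for l = 0..lmax."""
    j, n = spherical_jn_lists(max(lmax, 1), k * r0)
    return 4 * math.pi / k**2 * sum(
        (2 * l + 1) * j[l] ** 2 / (j[l] ** 2 + n[l] ** 2) for l in range(lmax + 1))

def double_factorial(m):
    """m!! with the convention (-1)!! = 0!! = 1."""
    return 1 if m <= 0 else m * double_factorial(m - 2)

def low_energy_tan_delta(l, k, r0):
    """Leading small-k r0 behavior, equation (49.62)."""
    return -(k * r0) ** (2 * l + 1) / (double_factorial(2 * l + 1) * double_factorial(2 * l - 1))

if __name__ == "__main__":
    r0 = 1.0
    print(math.atan(hard_sphere_tan_delta(0, 0.3, r0)))         # close to -k r0 = -0.3
    print(hard_sphere_sigma_tot(0.01, r0) / (4 * math.pi * r0**2))  # close to 1
    print(hard_sphere_tan_delta(1, 0.1, r0), low_energy_tan_delta(1, 0.1, r0))
```

At high energy, many more partial waves (up to roughly $l \sim kr_0$) contribute, and a stable Bessel routine would be needed to see the approach to $2\pi r_0^2$.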
Important Concepts to Remember

• Unitarity means that $S_l^* = S_l^{-1}$, so that $\delta_l$ is real, $|S_l| = 1$, and $k a_l = (S_l - 1)/(2i)$, meaning that $k a_l$ lies on a circle in the complex plane called the unitarity circle.
• The total cross section expands as $\sigma_{\rm tot} = \sum_l \sigma_l$, with $\sigma_l = (4\pi/k^2)(2l+1)\sin^2\delta_l$, so $\sigma_l \leq (4\pi/k^2)(2l+1)$, called the unitarity bound.
• We can write a Lippmann–Schwinger equation for the radial wave function $R_l(k; r)$, $R_l(k; r) = j_l(kr) - \frac{2mik}{\hbar^2}\int_0^\infty dr'\, r'^2\, j_l(kr_<)\, h_l^{(1)}(kr_>)\, V(r')\, R_l(k; r')$, and the Green's function also has a decomposition in spherical harmonics $Y_{lm}(\vec n_r) Y_{lm}^*(\vec n_{r'})$, with coefficients given by the radial Green's function, $\tilde g_l(z; r, r') = \sum_n R_{nl}(r) R_{nl}^*(r')/(z - E_n)$, satisfying a similar L–S equation.
• The optical theorem can be proven physically, without assuming $\hat S\hat S^\dagger = 1$, and in the proof we use the Lippmann–Schwinger equation, which can be solved perturbatively via the Born series so that we can reduce it to a term-by-term equation.
• When scattering on a hard sphere of radius $r_0$, the scattering length (without actually employing a low-energy limit) is $a = r_0$, and the low-energy differential cross section is constant, $d\sigma/d\Omega = r_0^2$, so we get four times the geometric cross section, owing to a Fraunhofer diffraction contribution.
Further Reading See [2] and [1] for more details.
Exercises

(1) Use the Lippmann–Schwinger equation for $\psi$ to write an expression for $\sigma_l$ in terms of integrals involving the potential $V(r)$ and the wave function.
(2) Show the details of going from (49.19) to (49.20).
(3) In the case of the radial delta function potential, prove that the radial wave function $R_l(k; r)$ is (49.31) and the Jost solutions are (49.32).
(4) Expand the optical theorem first in angular momentum $l$ and then in the Born series, for both $f(\theta = 0)$ and $\langle\vec k'|\hat T|\vec k\rangle$.
(5) For a potential with finite range, consider $V = V_0 > 0$ inside $r < r_0$, and find the solution in this region, in the case $E > V_0$, imposing normalizability at $r = 0$.
(6) For the hard sphere, calculate $\sigma_l$ at general $k$.
(7) Calculate the high-energy limit ($k \to \infty$) for the partial wave cross section $\sigma_l$ for a hard sphere, and their relative weight in $\sigma_{\rm tot}$.
50
Low-Energy and Bound States, Analytical Properties of Scattering Amplitudes

In this chapter we will consider the relation between low-energy scattering (meaning $E > 0$) and bound states (with $E < 0$), and we will find that, on considering complex values of $k$ (or $E$), the same result can be obtained by analysis on the complex plane. This leads us to consider more general analytical properties of the scattering amplitudes.
50.1 Low-Energy Scattering

We have seen that the phase shift $\delta_l$ is calculated from the potential via
$$k a_l(k) = e^{i\delta_l}\sin\delta_l = -\frac{2mk}{\hbar^2}\int_0^\infty dr\, r^2\, V(r)\, j_l(kr)\, R_l(k; r). \tag{50.1}$$
Moreover, at low energy, $k \to 0$ implies $\delta_l \to 0$, and we have
$$k a_l(k) \simeq \delta_l \propto k^{2l+1} \to 0, \tag{50.2}$$
called threshold behavior, which means that the $l = 0$ mode dominates the phase shifts. Physically, we can understand the dominance of $l = 0$ from the effective potential
$$V_{\rm eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2m r^2}. \tag{50.3}$$
At low energy, since the second (centrifugal) term in $V_{\rm eff}$ blows up for $r \to 0$ if $l \neq 0$, a particle with $l \neq 0$ never comes close to $r = 0$, where this term dominates the effective potential, and so it does not probe the short-range potential $V(r)$. This means that only the $l = 0$ term contributes, for which $V_{\rm eff} = V(r)$.
Step Potential

We next consider the simplest finite-range potential, the one that is constant inside its range,
$$V = \begin{cases} V_0, & r < r_0 \\ 0, & r \geq r_0. \end{cases} \tag{50.4}$$
If $V_0 > 0$ we have a repulsive potential, and if $V_0 < 0$ we have an attractive one. At low energy, $kr_0 \ll 1$, $l = 0$ dominates and the equation (48.34) implies
$$\psi_k(r, \theta) \simeq e^{i\delta_0} R_{E,l=0} = e^{i\delta_0}\left(j_0(kr)\cos\delta_0 - n_0(kr)\sin\delta_0\right) \simeq e^{i\delta_0}\frac{\sin(kr + \delta_0)}{kr}, \tag{50.5}$$
where in the first equality we have used the dominance of $l = 0$, in the second $P_0(\cos\theta) = 1$, and in the third $r \to \infty$.

For the reduced radial wave function, we obtain (for $r \geq r_0$)
$$\chi = r R_{E,l=0}(r) = {\rm const} \times \sin(kr + \delta_0), \tag{50.6}$$
where we have chosen a different normalization, such that the constant is 1. Within the potential range, we impose regularity at $r \to 0$, $\chi(0) = 0$. Since $V = V_0$ is constant over the range, the wave function is a free particle one, but with energy shifted by $V_0$,
$$E - V_0 = \frac{\hbar^2\tilde k^2}{2m}, \tag{50.7}$$
so the real solution for $l = 0$ (the spherically symmetric case) or at large $r$ is
$$\chi(r) = N\frac{e^{i\tilde k r} - e^{-i\tilde k r}}{2i} = N\sin\tilde k r, \tag{50.8}$$
where $N$ is a normalization constant. Of course, this is only true if $E > V_0$. If $E < V_0$, we write $\tilde k = i\kappa$, so
$$V_0 - E = \frac{\hbar^2\kappa^2}{2m}, \tag{50.9}$$
leading to the wave function
$$\chi(r) = \frac{N}{i}\frac{e^{\kappa r} - e^{-\kappa r}}{2} = \tilde N\sinh\kappa r. \tag{50.10}$$
We need to join the inside and outside solutions at $r = r_0$. We saw in the previous chapter that among the $l$th partial wave cross sections
$$\sigma_l = \frac{4\pi}{k^2}(2l+1)\sin^2\delta_l, \tag{50.11}$$
at low energy the $l = 0$ term dominates,
$$\sigma_0 = 4\pi\frac{\sin^2\delta_0(k)}{k^2} \to 4\pi a^2. \tag{50.12}$$
However, the dominance of $\sigma_0$ requires only $\sin^2\delta_0 \to 0$ as $k \to 0$, which happens not only at $\delta_0 \to 0$ but also at $\delta_0 = \pi$. If $-V_0 = |V_0|$ is large then $\tilde k$ is large (even though $k$ is small). Thus, eventually we reach $\tilde k r_0 = \pi$, so
$$\sin\tilde k r_0 = \sin\pi = 0, \tag{50.13}$$
even though $kr_0 \simeq 0$. But the continuity of the inside and outside solutions means that
$$\sin(kr_0 + \delta_0) = N\sin(\tilde k r_0) = 0, \tag{50.14}$$
so $\delta_0 = \pi$, since $\delta_0$ has increased continuously as $|V_0|$ was increased from 0, which corresponds to the free particle (when $\delta_0 = 0$).
50.2 Relation to Bound States

The outside ($r \geq r_0$) wave function solution for $k \to 0$ can be rewritten as
$$\chi(r) = \sin(kr + \delta_0) = \sin\left[k\left(r + \frac{\delta_0}{k}\right)\right] \simeq k\left(r + \frac{\delta_0}{k}\right). \tag{50.15}$$
However, since we are in the regime $E \simeq 0$, $l = 0$, $r > r_0$ so that $V = 0$, the Schrödinger equation is
$$\frac{d^2(rR)}{dr^2} = \frac{d^2\chi}{dr^2} = 0, \tag{50.16}$$
with solution
$$\chi \simeq C(r - a), \tag{50.17}$$
where $C$, $a$ are constants. But matching with the first form in the $k \to 0$ limit, we obtain that
$$a = -\lim_{k\to 0}\frac{\delta_0}{k}, \tag{50.18}$$
meaning $a$ is the scattering length. But the condition of matching the inside and outside solutions means the logarithmic derivative must be continuous at $r_0$. We define it slightly differently, in terms of $\chi(r)$, as
$$\tilde q_0 \equiv \frac{\chi'(r_0)}{\chi(r_0)}. \tag{50.19}$$
In terms of the outside function, its value at arbitrary $r$ is
$$\tilde q_0 = \frac{k\cos(kr + \delta_0)}{\sin(kr + \delta_0)} = k\cot\left[k\left(r + \frac{\delta_0}{k}\right)\right] \simeq k\,\frac{1}{k(r + \delta_0/k)} = \frac{1}{r - a}. \tag{50.20}$$
If we choose $r = 0$, even though this is outside the domain of validity of the function ($r \geq r_0$), we obtain
$$\tilde q_{\rm outside}(r \to 0) = k\cot\delta_0 = -\frac{1}{a}, \tag{50.21}$$
consistent with the definition of the scattering length. Its significance follows from the fact that $\tilde q$ is continuous at $r_0$: we have
$$\chi_{\rm outside}(r) \simeq \sin[k(r - a)] = 0 \tag{50.22}$$
for $r = a$, regardless of whether $r = a \geq r_0$ or not, which means that the scattering length $r = a$ is the intercept of the outside wave function; see Fig. 50.1.

If $\delta_0 < 0$, coming from the repulsive potential $V_0 > 0$, then $a > 0$. If $\delta_0 > 0$, coming from the attractive potential $V_0 < 0$, then $a < 0$. However, in the attractive-potential case $V_0 < 0$, if $|V_0|$ is large, meaning large $\tilde k$, then the wave function has a shorter period inside the well, so it drops faster at $r > r_0$ and its first zero after $r_0$ corresponds to a different scattering length, $a > 0$. Indeed, the low-energy ($E \simeq 0$) outside solution at $r > r_0$ is
$$\chi \simeq C(r - a). \tag{50.23}$$
Figure 50.1 Wave function and scattering length. (Plot of $\chi(r)$ versus $r$, with the points $-a$, $r_0$, and $+a$ marked on the $r$ axis.)
This is a scattering solution for $E > 0$ (so $E = 0+$), though it can be written also as
$$\chi \simeq Be^{-\kappa r}, \tag{50.24}$$
where $\kappa \simeq 0$ (very small) and
$$B = -Ca, \qquad -B\kappa = C < 0. \tag{50.25}$$
After this rewriting, the wave function looks also like a bound state ($E < 0$), though very slightly so, $E = 0-$. In both the scattering ($E = 0+$) and bound state ($E = 0-$) solutions,
$$\frac{\hbar^2\tilde\kappa^2}{2m} = E - V_0 \simeq |V_0|. \tag{50.26}$$
Therefore in both cases the inside wave function is
$$\chi \simeq \sin\tilde\kappa r. \tag{50.27}$$
Then the logarithmic derivative inside is (in both cases)
$$\tilde q_0^{\rm inside} = \frac{-\kappa e^{-\kappa r_0}}{e^{-\kappa r_0}} = -\kappa = \tilde\kappa\cot\tilde\kappa r_0, \tag{50.28}$$
to be equated to the outside value,
$$\tilde q_0^{\rm outside} = \frac{1}{r_0 - a} \simeq -\frac{1}{a}, \tag{50.29}$$
in this larger-scattering-length case, $r_0 \ll a$. Thus the binding energy of the bound state is obtained by equating the cases $E = 0+$ and $E = 0-$,
$$I = -E_{\rm bound\ state} = \frac{\hbar^2\kappa^2}{2m} = \frac{\hbar^2}{2ma^2}. \tag{50.30}$$
This is a relation between the binding energy $I$ of a mode close to zero and the (large value of the) scattering length $a$ for scattering on an attractive potential of large depth.
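The relation (50.30) can be tested numerically for the attractive step potential. The sketch below makes several assumptions that are not in the text: units with $\hbar = m = 1$, a well of depth $|V_0| = K_0^2/2$ with $\pi/2 < K_0 r_0 < \pi$ (so that exactly one $l = 0$ bound state exists and it is shallow when $K_0 r_0$ is just above $\pi/2$), and hypothetical function names. The zero-energy scattering length follows from equating (50.28) and (50.29), $1/(r_0 - a) = \tilde\kappa\cot\tilde\kappa r_0$, while the exact binding energy is found by bisection on the matching condition $\tilde\kappa\cot\tilde\kappa r_0 = -\kappa$ at finite $I$.

```python
import math

def scattering_length(K0, r0):
    """Zero-energy matching 1/(r0 - a) = K0 cot(K0 r0), solved for a."""
    return r0 - math.tan(K0 * r0) / K0

def binding_energy_estimate(K0, r0):
    """I = 1/(2 a^2): equation (50.30) in units hbar = m = 1."""
    return 1.0 / (2.0 * scattering_length(K0, r0) ** 2)

def binding_energy_exact(K0, r0, iterations=200):
    """Solve K cot(K r0) = -kappa by bisection, with K^2 = K0^2 - 2I, kappa^2 = 2I.

    Assumes pi/2 < K0 r0 < pi, so that a single l = 0 bound state exists.
    """
    def g(I):
        K = math.sqrt(K0**2 - 2.0 * I)
        return K / math.tan(K * r0) + math.sqrt(2.0 * I)

    lo = 0.0
    hi = 0.5 * (K0**2 - (math.pi / (2.0 * r0)) ** 2)  # here K r0 = pi/2, g > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    r0 = 1.0
    K0 = math.pi / 2 + 0.05            # barely deep enough to bind
    print(scattering_length(K0, r0))   # large and positive, a >> r0
    print(binding_energy_estimate(K0, r0), binding_energy_exact(K0, r0))
```

The two energies agree up to corrections of relative order $r_0/a$, which is why the relation is only asserted in the large-scattering-length regime $r_0 \ll a$.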
50.3 Bound State from Complex Poles of Sl (k) The partial wave S-matrix Sl = e2iδ l (k) appears in the ratio of the eikr /r and e−ikr /r modes of the real wave function solution. From (48.19), the radial wave function is i(kr−lπ/2) −i(kr−lπ/2) 0 iδ l e −iδ l e REl (r) Al e −e r r (50.31) i(kr−lπ/2) −i(kr−lπ/2) e 1 e 0 = A˜ l − 2iδ (k) , r r e l where e2iδ l (k) = Sl (k) appears in the denominator of the incoming spherical wave, and we have redefined Al0 by a factor eiδ l (k) .
Dropping $A'_l$ and setting $l = 0$, the most relevant case (although the calculation is valid for any $l$), from (48.34) we obtain
\[
\psi_k^{+}(r,\theta) \simeq e^{i\delta_0} R_{E,l=0} = A'_0\left[S_0(k)\frac{e^{ikr}}{r} - \frac{e^{-ikr}}{r}\right]
= \tilde A'_0\left[\frac{e^{ikr}}{r} - \frac{1}{S_0(k)}\frac{e^{-ikr}}{r}\right]. \tag{50.32}
\]
At this point we generalize to complex $k$, in order to connect to the description of bound states. Writing $k = i\kappa$, now with $\kappa > 0$, the scattering solution transforms into
\[
\psi_k^{+}(r,\theta) = \tilde A'_0\left[\frac{e^{-\kappa r}}{r} - \frac{1}{S_0(i\kappa)}\frac{e^{+\kappa r}}{r}\right], \tag{50.33}
\]
which matches the bound state wave function solution provided that the second term vanishes, which means $S_0(i\kappa) \to \infty$ for $\kappa \in \mathbb{R}_+$. Therefore the function $S_0(k)$ has a pole on the imaginary line in the upper half-plane, at $k = i\kappa$. This means that, starting from scattering with $k \in \mathbb{R}$, we generalize to the complex plane, $k \in \mathbb{C}$, and then the poles $k = i\kappa$ correspond to bound states.

If there are no singularities other than a single bound state $k = i\kappa$, with $\kappa > 0$, the s wave S-matrix $S_0(k)$ can be constructed. This implies that there are no singularities even at infinity; otherwise, we need to consider $k$ not too large (if this is not true, then $S_0(k) \to 0$ at infinity for a physical case). Besides the pole condition for $S_0(k)$, we have also the following conditions: (1) $|S_0(k)| = 1$ for $k \in \mathbb{R}_+$, derived from unitarity, and (2) $S_0 = 1$ for $k = 0$, since then $S_l = e^{2i\delta_l} \simeq e^{iCk^{2l+1}} \to 1$. The simplest solution for the conditions on $S_0(k)$ is
\[
S_0(k) = -\frac{k + i\kappa}{k - i\kappa} = e^{2i\delta_0(k)}. \tag{50.34}
\]
This means that the partial wave amplitude is
\[
a_{l=0}(k) = \frac{S_{l=0} - 1}{2ik} = -\frac{1}{i(k - i\kappa)} = \frac{1}{-\kappa - ik}. \tag{50.35}
\]
Equating it with the definition
\[
a_{l=0}(k) = \frac{1}{k\cot\delta_{l=0}(k) - ik} \tag{50.36}
\]
gives
\[
k\cot\delta_{l=0}(k) = -\kappa \;\Rightarrow\; -\frac{1}{a} = \lim_{k\to 0} k\cot\delta_{l=0}(k) = -\kappa, \tag{50.37}
\]
the same relation between bound states and scattering length as we had before. Thus extending $\delta_l(k)$ and $S_l(k)$ to the complex-$k$ plane is useful for obtaining relations between different physical regimes.
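The three conditions on $S_0(k)$ and the result (50.37) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the value of `kappa` are illustrative assumptions, in arbitrary units.

```python
def s0_single_pole(k, kappa):
    """s-wave S-matrix with a single bound-state pole at k = i*kappa, as in (50.34)."""
    return -(k + 1j * kappa) / (k - 1j * kappa)


def k_cot_delta(k, kappa):
    """Invert a_0 = (S_0 - 1)/(2ik) = 1/(k cot(delta_0) - ik) for k cot(delta_0)."""
    a0 = (s0_single_pole(k, kappa) - 1) / (2j * k)
    return (1 / a0 + 1j * k).real


kappa = 0.3  # hypothetical bound-state parameter (assumption, arbitrary units)

# Unitarity on the real axis, |S_0(k)| = 1, and k cot(delta_0) = -kappa for all k.
for k in (0.05, 0.5, 2.0):
    print(k, abs(s0_single_pole(k, kappa)), k_cot_delta(k, kappa))

# Threshold condition S_0(0) = 1.
print(s0_single_pole(0.0, kappa))
```

For this single-pole form, $k\cot\delta_0 = -\kappa$ holds at every real $k$, not only in the limit $k \to 0$, so the scattering length is $a = 1/\kappa$.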
50.4 Analytical Properties of the Scattering Amplitudes

Therefore, we will set up the analytical properties of the functions $\delta_l(k)$ and $S_l(k)$, extended over the complex-$k$ plane. If $k \in \mathbb{C}$, the energy is also complex, $E \in \mathbb{C}$. The exceptions are $k \in \mathbb{R}$, leading to $E \in \mathbb{R}_+$, and $k \in i\mathbb{R}$ (imaginary $k$), leading to $E \in \mathbb{R}_-$. In general, since
\[
E = \frac{\hbar^2 k^2}{2m}, \tag{50.38}
\]
we obtain
\[
\mathrm{Re}\,E = \frac{\hbar^2}{2m}\left[(\mathrm{Re}\,k)^2 - (\mathrm{Im}\,k)^2\right], \qquad
\mathrm{Im}\,E = \frac{\hbar^2}{2m}\, 2\,\mathrm{Re}\,k\,\mathrm{Im}\,k. \tag{50.39}
\]
Thus if we consider $k$ over the upper half-plane ($\mathrm{Im}\,k > 0$), it sweeps all of the complex $E$ plane (both the real and imaginary parts reach arbitrarily large positive and large negative values). However, this implies that this complex $E$ plane is not a regular plane (over which functions are single valued), but rather a Riemann sheet of the domain of $E$, called the physical sheet. Indeed, when $r \to \infty$, the reduced radial wave function in the scattering regime, with $k \in \mathbb{R}$, is
\[
\chi(r) \simeq A(E)e^{ikr} + B(E)e^{-ikr}, \tag{50.40}
\]
where
\[
-\frac{A(E)}{B(E)} = e^{2i\delta_0(E)} = S_0(E). \tag{50.41}
\]
But if $k = i\kappa$ instead, meaning
\[
E = -\frac{\hbar^2\kappa^2}{2m} < 0, \tag{50.42}
\]
then a real $\chi(r)$ for $E < 0$ means real $A(E)$, $B(E)$. The relation between $E > 0$ and $E < 0$ is by analytical continuation through the upper half-plane in $k$ ($\mathrm{Im}\,k > 0$), so $k = i\kappa$. But at an arbitrary point on the upper half of the $k$ plane, we have
\[
A(E^*) = A^*(E), \qquad B(E^*) = B^*(E). \tag{50.43}
\]
Indeed, $\mathrm{Im}\,E$ does change sign, but, since $\mathrm{Im}\,k > 0$, in fact only $\mathrm{Re}\,k$ actually changes sign, and
\[
e^{ikr} = e^{i\,\mathrm{Re}\,k\,r}e^{-\mathrm{Im}\,k\,r} \;\leftrightarrow\; e^{-i\,\mathrm{Re}\,k\,r}e^{-\mathrm{Im}\,k\,r} = e^{-i\,\mathrm{Re}\,k\,r}e^{-\mathrm{Re}\,\kappa\,r}, \tag{50.44}
\]
the outgoing wave function $e^{ikr}$ becomes the bound state wave function $e^{-\kappa r}$, and the coefficient is its complex conjugate, so
\[
A(E^*) = A^*(E), \tag{50.45}
\]
and similarly for $B(E^*) = B^*(E)$. Then the physical Riemann sheet for $E$ is defined by
\[
\mathrm{Im}\,k = \mathrm{Re}\,\kappa > 0, \tag{50.46}
\]
and we need to consider a cut for $E \in \mathbb{R}_+$ (from $E = 0$ to infinity), taking us through to a different Riemann sheet. The analyticity conditions $A(E^*) = A^*(E)$ and $B(E^*) = B^*(E)$ also define the $E$ plane, or physical Riemann sheet.
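The map from the complex $k$ plane to the complex $E$ plane can be illustrated with a few sample points. This is a minimal sketch in units $\hbar = m = 1$ (an assumption made for the example); the function names and the sample values of $k$ are hypothetical.

```python
def energy(k, hbar=1.0, m=1.0):
    """E = hbar^2 k^2 / (2m) for complex k, as in (50.38)."""
    return hbar**2 * k**2 / (2.0 * m)


def energy_parts(k, hbar=1.0, m=1.0):
    """Real and imaginary parts of E written out as in (50.39)."""
    pref = hbar**2 / (2.0 * m)
    return pref * (k.real**2 - k.imag**2), pref * 2.0 * k.real * k.imag


k = complex(1.2, 0.4)            # sample point in the upper half-plane (assumption)
E = energy(k)
E_mirror = energy(-k.conjugate())  # Re k flips sign, Im k does not
E_bound = energy(0.5j)             # purely imaginary k = i*kappa

print(E, energy_parts(k))
print(E_mirror)   # complex conjugate of E
print(E_bound)    # real and negative, as for a bound state
```

The reflection $k \to -k^*$ keeps the point in the upper half-plane while sending $E \to E^*$, which is the operation used in the analyticity conditions above.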
50.5 Jost Functions Revisited

To continue the analysis of analyticity properties, we go back to the Jost functions
\[
F_l^{\pm}(k) = W(f_l^{\pm}, \phi_l), \tag{50.47}
\]
where the regular solution is
\[
\phi_l = \frac{1}{2ik}\left[-F_l^{-}(k) f_l^{+}(k;r) + F_l^{+}(k) f_l^{-}(k;r)\right], \tag{50.48}
\]
and the Jost solutions are defined by
\[
\lim_{r\to\infty} e^{\pm ikr} f_l^{\pm}(k;r) = 1. \tag{50.49}
\]
But then the coefficients $B$ and $-A$ are matched to $F_l^{\pm}(k)$,
\[
\lim_{r\to\infty} f_l^{+}(k;r) = e^{-ikr} \;\Rightarrow\; F_l^{+}(k) = B(E), \qquad
\lim_{r\to\infty} f_l^{-}(k;r) = e^{+ikr} \;\Rightarrow\; -F_l^{-}(k) = A(E). \tag{50.50}
\]
Then, since $k \to -k^*$ means that $\mathrm{Re}\,k$ changes sign but $\mathrm{Im}\,k$ does not, this corresponds to $E^*$, so from the analyticity of $A$ and $B$ we obtain
\[
F_l^{\pm}(-k^*) = \pm B(E^*)/A(E^*) = \pm\left[B(E)/A(E)\right]^* = \left[F_l^{\pm}(k)\right]^*, \tag{50.51}
\]
(where $B/A$ stands for $B$ in the upper-sign case and $A$ in the lower-sign case), which is indeed one of the Jost function's properties. The other relevant one is
\[
F_l^{\pm}(-k) = F_l^{\mp}(k), \tag{50.52}
\]
implying also
\[
\left[F_l^{\pm}(k^*)\right]^* = F_l^{\mp}(k). \tag{50.53}
\]
Then the $l$th partial wave S-matrix is written in terms of the Jost functions as
\[
S_l(k) = e^{2i\delta_l(k)} = \frac{F_l^{+}(k)}{F_l^{-}(k)}e^{il\pi} = \frac{F_l^{+}(k)}{F_l^{+}(-k)}e^{il\pi}. \tag{50.54}
\]
This means that the analytical properties of $S_l(k)$ can be derived from the analytical properties of the Jost functions $F_l^{\pm}(k)$. Indeed, we have a theorem.

The zeroes of the Jost functions $F_l^{+}(k)$ in the lower half-plane (meaning $\mathrm{Im}\,k < 0$) lie on the imaginary line $i\mathbb{R}$, correspond to bound states, and are simple zeroes (with multiplicity one).

To prove this, we see that $F_l^{+}(-k) = 0$ implies $S_l(k) \to \infty$, meaning a pole of $S_l(k)$. But we have already seen that a pole of $S_0(k)$ is a bound state $k = i\kappa_n$. Moreover, $S_l(k) \to \infty \Leftrightarrow F_l^{+}(-k) = 0$, so $F_l^{+}(k)$ has zeroes at $k = -i\kappa_n$ corresponding to bound states. q.e.d.

We then also obtain easily that $S_l(k)$ has zeroes at $k = -i\kappa_n$ and that $F_l(k)$ has poles at $k = +i\kappa_n$; see Fig. 51.1 in the next chapter. For bound states of $F_l^{-}(k)$, the same analysis follows, except now we have zeroes in the upper half-plane ($\mathrm{Im}\,k > 0$), since
\[
F_l^{-}(k) = F_l^{+}(-k). \tag{50.55}
\]
Because $F_l^{+}(k) = W(f_l^{+}, \phi_l)$, the reduced radial wave function for a physical bound state is just the analytically continued regular solution with a normalization constant in front,
\[
\chi_l(k_n = -i\kappa_n; r) = N_{nl}\,\phi_l(k_n = -i\kappa_n; r), \tag{50.56}
\]
where
\[
N_{nl}^2 = \left.\frac{-4\kappa_n^2}{F_l^{-}(-i\kappa_n)\, dF_l^{+}(-i\kappa)/d\kappa}\right|_{\kappa=\kappa_n}. \tag{50.57}
\]
The proof of the normalization constant is left as an exercise. Since
\[
S_l(k) = e^{2i\delta_l(k)} = \frac{F_l^{+}(k)}{F_l^{-}(k)}e^{il\pi}, \tag{50.58}
\]
the analytical property $[F_l^{\pm}(k^*)] = [F_l^{\mp}(k)]^*$ implies that
\[
S_l(k^*) = \frac{F_l^{+}(k^*)}{F_l^{-}(k^*)}e^{il\pi} = \frac{[F_l^{-}(k)]^*}{[F_l^{+}(k)]^*}e^{il\pi}
\;\Rightarrow\;
S_l^*(k^*) = \frac{F_l^{-}(k)}{F_l^{+}(k)}e^{-il\pi} = S_l^{-1}(k), \tag{50.59}
\]
so $S_l(k)$ satisfies
\[
S_l(k)\,S_l^*(k^*) = 1. \tag{50.60}
\]
On the other hand, the analytical property $F_l^{\pm}(-k) = F_l^{\mp}(k)$ implies that
\[
S_l(-k) = \frac{F_l^{+}(-k)}{F_l^{-}(-k)}e^{il\pi} = \frac{F_l^{-}(k)}{F_l^{+}(k)}e^{il\pi}, \tag{50.61}
\]
so $S_l(k)$ satisfies
\[
S_l(-k)\,S_l(k) = e^{2il\pi}, \tag{50.62}
\]
where the right-hand side equals 1 if $l$ is integer.
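As an illustration, the two relations (50.60) and (50.62) can be tested on the single-pole s-wave form (50.34), for which $l = 0$. This is only a sketch: the sample values of `kappa` and `k` are arbitrary assumptions, and the check covers this one example rather than the general statement.

```python
def s0_single_pole(k, kappa):
    """Single-pole s-wave S-matrix of (50.34), continued to complex k."""
    return -(k + 1j * kappa) / (k - 1j * kappa)


kappa = 0.3               # hypothetical bound-state parameter (assumption)
k = complex(0.8, 0.25)    # arbitrary complex wave number away from the pole

# (50.60): S(k) S*(k*) = 1
unitarity_product = s0_single_pole(k, kappa) * s0_single_pole(k.conjugate(), kappa).conjugate()

# (50.62) with l = 0: S(-k) S(k) = 1
reflection_product = s0_single_pole(-k, kappa) * s0_single_pole(k, kappa)

print(unitarity_product, reflection_product)
```

Both products equal 1 up to rounding, so the pole at $k = i\kappa$ is accompanied by a zero at $k = -i\kappa$.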
Important Concepts to Remember

• For scattering at low energy, $a_0$ and $\sigma_0$ dominate; this is known as threshold behavior.
• For a finite-range negative step potential, $V = V_0 < 0$ for $r \le r_0$, we can relate scattering ($E = 0+$) and bound state ($E = 0-$) solutions by $-E_{\rm bound\,state} \equiv \hbar^2\kappa^2/2m = \hbar^2/(2ma^2)$ (so $\kappa = 1/a$), with $a$ the (large) scattering length.
• Generalizing $S_l(k)$ to the complex $k$ plane, the poles $k = i\kappa$ of $S_0(k)$ correspond to bound states. For a single pole, $S_0(k) = -(k + i\kappa)/(k - i\kappa)$, giving $a = 1/\kappa$.
• Expanding the reduced radial wave function at infinity, $\chi(r) \simeq A(E)e^{ikr} + B(E)e^{-ikr}$, and extending to $k \in \mathbb{C}$, the coefficients are Jost functions, $B(E) = F_l^{+}(k)$ and $A(E) = -F_l^{-}(k)$, and $S_l(k) = [F_l^{+}(k)/F_l^{-}(k)]e^{il\pi}$, with $S_l(-k)S_l(k) = e^{2il\pi}$.
Further Reading For more on the analytical properties of wave functions, see [4].
Exercises

(1) For a step potential, with energy $E < V_0$, connect the solution at $r < r_0$ with the solution at $r > r_0$ (in the general case).
(2) Find the first correction to the approximate relation (50.30) between the binding energy $I$ of the bound state close to zero and the large scattering length $a$.
(3) Find the equivalent of the relation replacing (50.30) if we still have $r_0 \ll a$, but $E = V_0/2 > 0$ and very small ($kr_0 \ll 1$).
(4) Extend the analysis in complex space leading to $S_l(k)$, $a_l(k)$, $\delta_l(k)$ and coming from the existence of a bound state, from the similar analysis for $l = 0$.
(5) In the complex $k$ plane do we still have real $\delta_l(k)$?
(6) Write down the Jost functions $F_l^{+}(k)$ and $F_l^{-}(k)$ in the case of a single bound state.
(7) Derive the normalization constant (50.57).
51
Resonances and Scattering, Complex k and l
In this chapter we consider resonances in scattering and their interpretation and analytical properties in the complex k plane, where k is the wave number. We end with a few remarks about continuing l into the complex plane.
51.1 Poles and Zeroes in the Partial Wave S-Matrix $S_l(k)$

We saw that $k_n = -i\kappa_n$ is a zero of $F_l^{+}(k)$, corresponding to a bound state. Then $k_n = +i\kappa_n$ is a pole for $S_l(k)$. But then the question arises: can one not have poles outside the imaginary axis in the complex $k$ plane? If there are extra poles, they cannot be on the real axis, since for $k \in \mathbb{R}$, $|S_0(k)| = 1$, whereas we want $S_0(k) = \infty$. Thus, we require
\[
k_{\rm pole} = k_1 + ik_2. \tag{51.1}
\]
Since $S_0(k) = \infty$, from the previous chapter we know that this corresponds to having only the "out" wave at $r \to \infty$,
\[
\chi(r) \sim e^{ikr} \;\Rightarrow\; \chi^* \sim e^{-ik^* r}. \tag{51.2}
\]
Then the Schrödinger equations for $\chi$ and for $\chi^*$ are
\[
\frac{d^2}{dr^2}\chi(r) + k^2\chi(r) - V(r)\chi(r) = 0 \;\Rightarrow\;
\frac{d^2}{dr^2}\chi^*(r) + k^{*2}\chi^*(r) - V(r)\chi^*(r) = 0. \tag{51.3}
\]
Multiplying the first equation by $\chi^*$ and subtracting the second, multiplied by $\chi$, we obtain
\[
\frac{d}{dr}\left[\chi^*\frac{d}{dr}\chi - \chi\frac{d}{dr}\chi^*\right] + (k^2 - k^{*2})\chi^*\chi = 0. \tag{51.4}
\]
Integrating $\int_0^r dr'$, we obtain
\[
\left[\chi^*\frac{d}{dr}\chi - \chi\frac{d}{dr}\chi^*\right]_0^r + (k^2 - k^{*2})\int_0^r dr'\,|\chi(r')|^2 = 0. \tag{51.5}
\]
Regularity of the solution at $r = 0$ means that $\chi(0) = \chi^*(0) = 0$. Since $r \to \infty$, we can use $\chi(r) \sim e^{ikr}$, $\chi^*(r) \sim e^{-ik^* r}$, obtaining
\[
(k + k^*)\left[i e^{-2r\,\mathrm{Im}\,k} + 2i\,\mathrm{Im}\,k\int_0^r dr'\,|\chi(r')|^2\right] = 0. \tag{51.6}
\]
Then we can have either $k + k^* = 0$, which means that the poles lie on the imaginary line (the bound states), or
\[
\mathrm{Im}\,k = -\frac{e^{-2r\,\mathrm{Im}\,k}}{2\int_0^r dr'\,|\chi(r')|^2} < 0. \tag{51.7}
\]
This would mean that there are poles in the lower half-plane, $k = k_1 - ik_2$, where $k_2 \in \mathbb{R}_+$. We have analyzed the poles for $S_0(k)$ because we started with (50.32), but that was actually valid for any $l$, so all the analysis above can be generalized to $S_l(k)$ for any $l$. Since
\[
S_l(k) = \frac{F_l^{+}(k)}{F_l^{-}(k)}e^{il\pi} = \frac{F_l^{+}(k)}{F_l^{+}(-k)}e^{il\pi}, \tag{51.8}
\]
the $k = k_1 - ik_2$ pole for $S_l(k)$ $\Leftrightarrow$ a pole for $F_l^{+}(k)$ or a zero of $F_l^{+}(-k) = F_l^{-}(k)$ $\Leftrightarrow$ a zero for $[F_l^{-}(k)]^* = F_l^{+}(k^*)$ $\Leftrightarrow$ the $k = k_1 + ik_2$ zero for $S_l(k)$,
\[
k_{\rm zero} = k_1 + ik_2. \tag{51.9}
\]
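Therefore for every pole in the lower half-plane, we have a corresponding zero in the upper half-plane at the complex conjugate point. However, we also have $F_l^{\pm}(-k^*) = [F_l^{\pm}(k)]^*$. Thus if $k = -k_1 + ik_2$ is a zero for $F_l^{+}(k)$ $\Leftrightarrow$ $k = k_1 - ik_2$ is a zero for $F_l^{-}(k)$ ($\Leftrightarrow$ pole for $S_l(k)$) $\Leftrightarrow$ also $-k_1 - ik_2$ is a zero for $F_l^{-}$ $\Leftrightarrow$ $k_1 + ik_2$ is a zero of $F_l^{+}(k)$. To summarize, we have (see Fig. 51.1)
\[
\begin{aligned}
\text{zero for } F_l^{-}(k):&\quad k = \pm k_1 - ik_2 \;\leftrightarrow\; \text{pole for } S_l(k),\\
\text{zero for } F_l^{+}(k):&\quad k = \pm k_1 + ik_2 \;\leftrightarrow\; \text{zero for } S_l(k).
\end{aligned} \tag{51.10}
\]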
However, in the physical case, $k$ is real. If the poles and zeroes are close to the real line, i.e., if $k_2 \ll k_1$, we can approximate the form of $S_l(k)$. Then
\[
S_l(k) = e^{2i\delta_l(k)} = \tilde S_l(k)\frac{k - k_{\rm zero}}{k - k_{\rm pole}} = \tilde{\tilde S}_l(E)\frac{E - E_{\rm zero}}{E - E_{\rm pole}}, \tag{51.11}
\]
where $\tilde S_l(k)$ varies very little. For physical values of $k$, we have two cases. For bound states, $k$ is on the imaginary axis, and if also $l = 0$ then $\tilde S_0(k) = -1$. But for the poles or zeroes near the real axis, $\tilde S_l(k) \simeq 1$ since this corresponds to scattering, and if $k$ is far from the pole or zero, then $S_l(k) \simeq \tilde S_l(k) \simeq 1$ is close to the unperturbed case. Then
\[
S_l(k) \simeq \frac{k - k_1 - ik_2}{k - k_1 + ik_2}. \tag{51.12}
\]
This $S_l(k)$ is valid close to the pole or zero.

[Figure 51.1: Zeroes and poles of (a) $S_l(k)$; (b) $F_l^{+}(k)$; (c) $F_l^{-}(k)$, each shown in the complex $k$ plane. Zeroes are denoted by a circle, and poles by a cross.]
51.2 Breit–Wigner Resonance

From the formula $S_l(k) = e^{2i\delta_l(k)}$ above, we can calculate the partial wave amplitude by substituting it into the general formula,
\[
a_l(k) = \frac{S_l(k) - 1}{2ik} \simeq \frac{1}{k}\frac{-k_2}{(k - k_1) + ik_2} = \frac{1}{k}\frac{1}{\dfrac{k - k_1}{-k_2} - i}
= \frac{e^{i\delta_l(k)}\sin\delta_l(k)}{k} = \frac{1}{k\cot\delta_l(k) - ik}. \tag{51.13}
\]
Identifying the term with $\cot\delta_l$ with the similar term in the first line, we obtain
\[
\cot\delta_l(k) = \frac{k - k_1}{-k_2}. \tag{51.14}
\]
Moreover, the $l$th partial wave cross section is
\[
\sigma_l(k) = \frac{4\pi}{k^2}(2l+1)\sin^2\delta_l(k) = 4\pi(2l+1)|a_l(k)|^2 \simeq \frac{4\pi}{k^2}(2l+1)\frac{k_2^2}{(k - k_1)^2 + k_2^2}. \tag{51.15}
\]
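But this formula has a maximum at $k = k_1$, which is called a resonance. We have seen that the zero of $S_l(k)$ is at $k_{\rm zero} = k_1 + ik_2$, and correspondingly there is a pole at $k_{\rm pole} = k_1 - ik_2$. Then
\[
E_{\rm zero/pole} = \frac{\hbar^2}{2m}k_{\rm zero/pole}^2 = \frac{\hbar^2}{2m}(k_1^2 - k_2^2 \pm 2ik_1k_2) = E_{\rm res} \pm i\frac{\Gamma}{2}, \tag{51.16}
\]
where we have defined the resonance energy $E_{\rm res}$ and the width $\Gamma$ by
\[
E_{\rm res} = \frac{\hbar^2}{2m}(k_1^2 - k_2^2), \qquad \frac{\Gamma}{2} = \frac{\hbar^2}{2m}\,2k_1k_2. \tag{51.17}
\]
If $k$ is close to $k_1$ ($k \simeq k_1$), and correspondingly the real energy $E$ is close to the real energy $E_1$, related by
\[
E = \frac{\hbar^2k^2}{2m} \in \mathbb{R}, \qquad E_1 = \frac{\hbar^2k_1^2}{2m} \in \mathbb{R}, \tag{51.18}
\]
we obtain
\[
E - E_1 = \frac{\hbar^2}{2m}(k + k_1)(k - k_1) \simeq \frac{\hbar^2}{2m}\,2k_1(k - k_1), \tag{51.19}
\]
as well as (since $k_1 \gg k_2$)
\[
E_{\rm res} \simeq E_1. \tag{51.20}
\]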
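[Figure 51.2: Breit–Wigner resonance. The cross section $\sigma_l(E)$ is plotted against $E$, with a peak at $E_1$ of width $\Gamma$.]

Then from (51.14) we have
\[
\tan\delta_l(k) = \frac{-k_2}{k - k_1} = \frac{-\Gamma/2}{E - E_1}, \tag{51.21}
\]
which in turn means that
\[
S_l(k) = e^{2i\delta_l(k)} = \frac{1 + i\tan\delta_l(k)}{1 - i\tan\delta_l(k)} = \frac{E - E_1 - i\Gamma/2}{E - E_1 + i\Gamma/2}. \tag{51.22}
\]
Moreover, now we have
\[
(k - k_1)^2 = \frac{(E - E_1)^2}{\left(\frac{\hbar^2}{2m}2k_1\right)^2}, \qquad k_2^2 = \frac{(\Gamma/2)^2}{\left(\frac{\hbar^2}{2m}2k_1\right)^2}, \tag{51.23}
\]
which we substitute in $\sigma_l(k)$ to obtain
\[
\sigma_l(k) \simeq \frac{4\pi}{k^2}(2l+1)\frac{(\Gamma/2)^2}{(E - E_1)^2 + (\Gamma/2)^2}. \tag{51.24}
\]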
This is the Breit–Wigner formula for resonance in energy, giving a bell-like shape for the cross section; see Fig. 51.2. The maximum of $\sigma_l(k)$ is at $E = E_1 \simeq E_{\rm res}$, which is therefore the resonance energy. On the other hand $\Gamma$ is the width at half maximum value for $\sigma_l$. Indeed, if $E - E_1 = \pm\Gamma/2$, $\sigma_l(E)/\sigma_l(E_1) = 1/2$.

We can write a slightly more general formula. Until now, we have assumed that we can directly invert (51.14), so that
\[
\delta_l(k) = \tan^{-1}\left(\frac{-\Gamma/2}{E - E_1}\right), \tag{51.25}
\]
and replace it in $S_l(k) = e^{2i\delta_l(k)}$ to find the Breit–Wigner formula for $\sigma_l(k)$. But it could be that there is a "background" value (a constant) for $\delta_l$, called $\xi_l$, i.e.,
\[
\delta_l(k) \to \tilde\delta_l(k) = \delta_l(k) + \xi_l = \tan^{-1}\left(\frac{-\Gamma/2}{E - E_1}\right) + \xi_l \simeq \tan^{-1}\left(\frac{-\Gamma/2}{E - E_{\rm res}}\right) + \xi_l. \tag{51.26}
\]
If $\xi_l = n\pi$, that does not change the value of $\tan\delta_l(k)$, but otherwise it does. Then, defining
\[
\tan\delta_l = \frac{-\Gamma/2}{E - E_1} \equiv -\frac{1}{\epsilon}, \qquad \tan\xi_l \equiv -\frac{1}{q}, \tag{51.27}
\]
from the partial wave S-matrix
\[
S_l = e^{2i\delta_l} \to e^{2i\tilde\delta_l} = e^{2i\delta_l}e^{2i\xi_l}, \tag{51.28}
\]
we obtain the partial wave cross section
\[
\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{(q + \epsilon)^2}{(1 + q^2)(1 + \epsilon^2)}. \tag{51.29}
\]
In the normal case $\xi_l = 0$, implying $q \to \infty$, we obtain
\[
\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{1}{1 + \epsilon^2} = \frac{4\pi}{k^2}(2l+1)\frac{(\Gamma/2)^2}{(E - E_{\rm res})^2 + (\Gamma/2)^2}. \tag{51.30}
\]
In the extreme non-normal case, $q = 0 \leftrightarrow \xi_l = \pi/2$, we obtain a new formula,
\[
\tilde\sigma_l = \epsilon^2\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{(E - E_{\rm res})^2}{(E - E_{\rm res})^2 + (\Gamma/2)^2}. \tag{51.31}
\]
A final observation is that
\[
\cot\delta_l \simeq \frac{E - E_{\rm res}}{-\Gamma/2} \tag{51.32}
\]
means that at resonance $\cot\delta_l = 0$, and a little away from it, the above formula gives the first term in the Taylor expansion.
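The line shapes of this section can be compared numerically. The sketch below strips off the common factor $(4\pi/k^2)(2l+1)$ and works only with the dimensionless shape $\sin^2\tilde\delta_l$; the function names and the sample values of `E_res`, `gamma` and `xi` are illustrative assumptions, not taken from the text.

```python
import math


def breit_wigner_shape(E, E_res, gamma):
    """(Gamma/2)^2 / ((E - E_res)^2 + (Gamma/2)^2), the shape in (51.24)."""
    return (gamma / 2) ** 2 / ((E - E_res) ** 2 + (gamma / 2) ** 2)


def shifted_shape(E, E_res, gamma, xi):
    """sin^2(delta + xi), with tan(delta) = -(Gamma/2)/(E - E_res) as in (51.26)."""
    delta = math.atan2(-gamma / 2, E - E_res)
    return math.sin(delta + xi) ** 2


def q_epsilon_shape(E, E_res, gamma, xi):
    """(q + eps)^2 / ((1 + q^2)(1 + eps^2)), the shape in (51.29); needs tan(xi) != 0."""
    eps = (E - E_res) / (gamma / 2)
    q = -1.0 / math.tan(xi)
    return (q + eps) ** 2 / ((1 + q**2) * (1 + eps**2))


E_res, gamma = 1.0, 0.1   # hypothetical resonance energy and width (assumptions)

print(breit_wigner_shape(E_res + gamma / 2, E_res, gamma))  # half maximum
print(shifted_shape(1.03, E_res, gamma, 0.0), breit_wigner_shape(1.03, E_res, gamma))
print(shifted_shape(1.03, E_res, gamma, 0.7), q_epsilon_shape(1.03, E_res, gamma, 0.7))
```

With $\xi_l = 0$ the phase-shift form reduces to the Breit–Wigner shape, and for a generic background phase it agrees with the $(q, \epsilon)$ form, which is asymmetric about $E_{\rm res}$.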
51.3 Physical Interpretation and Its Proof

Now that we have defined the poles of $S_l(k)$ and the related Breit–Wigner resonance formulas analytically, we need to understand the physical interpretation of this mathematical result.

The first observation is that the poles are near the real axis, and they influence the behavior on the real axis, where physical scattering is located. But how are these poles created, or equivalently, what do the poles represent? The poles that we considered previously were bound states. But in bound states the system stays bound for an infinite amount of time, i.e., there is an infinite lifetime. The infinite lifetime refers to the particle being inside a potential well, with the energy of the bound state satisfying $E_n < V$ on both sides of the potential well, and with $E_n < 0$, so that there are no asymptotic states (the particle cannot escape to infinity). Bound states are also stationary, which means that their time dependence is just a phase,
\[
\psi(\vec r, t) \propto e^{-iE_nt/\hbar}, \tag{51.33}
\]
where $E_n = -|E_n|$. Since this corresponds to an imaginary $k$ pole, $k_n = i\kappa_n$, the "out" wave turns into a decaying exponential, $e^{ik_nr} = e^{-\kappa_nr}$, but under the transition the energy stays real, just changing sign,
\[
E_n = E(k_n) = \frac{\hbar^2k_n^2}{2m} = -\frac{\hbar^2\kappa_n^2}{2m} = -|E_n|. \tag{51.34}
\]
Turning to the resonance scattering, $k \simeq k_{\rm pole} = k_1 - ik_2$, we are still close to the real line, so actually $k \sim k_1$, meaning the "out" wave stays out, $e^{ikr} \simeq e^{ik_1r}$. On the other hand, the energy now becomes complex, $E \simeq E_{\rm pole} = E_{\rm res} - i\Gamma/2$, leading to a time dependence of the wave function that is not just a phase, but also a decaying exponential,
\[
\psi(\vec r, t) \propto e^{-iEt/\hbar} \simeq e^{-iE_{\rm res}t/\hbar}e^{-\Gamma t/2\hbar}. \tag{51.35}
\]
Then the probability density decays in time,
\[
|\psi(\vec r, t)|^2 \propto e^{-\Gamma t/\hbar}, \tag{51.36}
\]
with a finite lifetime
\[
\tau = \frac{\hbar}{\Gamma} \tag{51.37}
\]
for the state. This means we now have a quasi-bound state. With respect to the (true) bound state, we still have $E_n < V$ on both sides of the potential well, but now $E_n > 0$, which means that there are asymptotics at infinity. This means that the well has a potential barrier on the side at large $r$, and we can tunnel out of it, see Fig. 51.3.

[Figure 51.3: Scattering by entering, and then tunneling out of, a potential barrier around a quasi-bound state. The plot shows $V_{\rm eff}(r)$ against $r$, with the level $E_n$ marked.]

In conclusion, the physical picture is the following: we have a particle coming in from infinity, it gets captured into the quasi-bound state, and then it tunnels out of the potential well, or "leaks out".

To prove this picture, we consider the radial wave function extended on the complex $k$ plane,
\[
R_l(k; r) = C\left[S_l(k)\frac{e^{ikr}}{r} - \frac{e^{-ikr}}{r}\right] \simeq C\left[\frac{k - k_1 - ik_2}{k - k_1 + ik_2}\frac{e^{ikr}}{r} - \frac{e^{-ikr}}{r}\right]. \tag{51.38}
\]
Choosing the constant
\[
C = \frac{1}{k - k_1 - ik_2}, \tag{51.39}
\]
the reduced radial wave function becomes
\[
\chi_l(k; r) \simeq \frac{e^{ikr}}{k - k_1 + ik_2} - \frac{e^{-ikr}}{k - k_1 - ik_2}. \tag{51.40}
\]
From it, we construct a wave packet by integrating $\chi_l$, multiplied by the stationary phase, near the pole,
\[
\psi(r, t) = \int dk\,\chi_l(k; r)e^{-iEt/\hbar}. \tag{51.41}
\]
Since we are integrating near the pole, we write $k = k_1 + \delta k$ and integrate over $\delta k$, with $dk = d\delta k$. In this case, the energy is
\[
E = \frac{\hbar^2k^2}{2m} \simeq \frac{\hbar^2k_1^2}{2m} + \frac{\hbar^2}{m}k_1\delta k = E_1 + \hbar v_1\delta k, \tag{51.42}
\]
where we have defined the velocity of the pole,
\[
v_1 \equiv \frac{\hbar k_1}{m}. \tag{51.43}
\]
Then the reduced radial wave function is
\[
\chi_l(k; r) \simeq \frac{e^{ik_1r}e^{i\delta k\,r}}{\delta k + ik_2} - \frac{e^{-ik_1r}e^{-i\delta k\,r}}{\delta k - ik_2}, \tag{51.44}
\]
and the wave packet is
\[
\psi(r, t) = e^{i(k_1r - E_1t/\hbar)}\int d\delta k\,\frac{e^{i\delta k(r - v_1t)}}{\delta k + ik_2} - e^{-i(k_1r + E_1t/\hbar)}\int d\delta k\,\frac{e^{-i\delta k(r + v_1t)}}{\delta k - ik_2}. \tag{51.45}
\]
Since most of the integral comes from the region near the pole ($\delta k \sim 0$), we can extend the integration region to the whole real line, $\int_{-\infty}^{+\infty} d\delta k$, with minimal change. To calculate this integral over the real line, we must close the contour of integration in the complex plane, with a half-circle at infinity.

In the first integral:

If $r - v_1t > 0$ then in order for the contribution of the half-circle to vanish, we need to have $\mathrm{Im}\,\delta k > 0$ (so that the exponential is decaying), meaning the integration is in the upper half-plane. Then we have a closed contour $C$ with a counterclockwise integration, meaning $\frac{1}{2\pi i}\oint_C$ gives the sum of residues inside $C$, i.e., the residues in the upper half-plane, of which there are none.

If $r - v_1t < 0$ then for the contribution of the half-circle to vanish, we need to have $\mathrm{Im}\,\delta k < 0$, i.e., integration in the lower half-plane. But now we have a closed contour $C$ with clockwise integration, so $-\frac{1}{2\pi i}\oint_C$ gives the sum of residues in the lower half-plane. That means the residue at $\delta k = -ik_2$, giving finally
\[
\int d\delta k\,\frac{e^{i\delta k(r - v_1t)}}{\delta k + ik_2} = -2\pi i\,\mathrm{Res}(\delta k = -ik_2)\,\theta(v_1t - r) = -2\pi i\,e^{k_2(r - v_1t)}\theta(v_1t - r). \tag{51.46}
\]
In the second integral:

If $r + v_1t > 0$ then for the contribution of the half-circle to vanish, we need $\mathrm{Im}\,\delta k < 0$, which means the integration is in the lower half-plane. This gives a clockwise integration, and $-\frac{1}{2\pi i}\oint_C$ gives the residues in the lower half-plane, of which there are none.

If $r + v_1t < 0$ then we need $\mathrm{Im}\,\delta k > 0$, so integration is in the upper half-plane. This gives a counterclockwise integration, with $\frac{1}{2\pi i}\oint_C$ giving the sum of the residues in the upper half-plane, meaning $\delta k = ik_2$, giving finally
\[
\int d\delta k\,\frac{e^{-i\delta k(r + v_1t)}}{\delta k - ik_2} = 2\pi i\,\mathrm{Res}(\delta k = ik_2)\,\theta(-v_1t - r) = 2\pi i\,e^{k_2(r + v_1t)}\theta(-v_1t - r). \tag{51.47}
\]
Then the wave packet time dependence splits into three regions:

If $t < -r/v_1$, we have the in wave. This time region corresponds to the incoming wave propagating towards the center, since at $r \to \infty$ we have
\[
\psi(r, t) \simeq -2\pi i\,e^{-i(k_1r + E_1t/\hbar)}e^{k_2(r + v_1t)}, \tag{51.48}
\]
which means that the wave comes in from $r = \infty$ with velocity $-v_1$.
51 Resonances and Scattering, Complex k and l If −r/v1 < t < r/v1 , in the asymptotic region r → ∞ we have ψ(r, t) 0,
(51.49)
which means the wave is concentrated near r = 0. Thus the particle has been captured in a metastable or quasi-bound state. If t > r/v1 , we have the out wave. This region corresponds to the outgoing wave propagating from the center, since at r → ∞ we have ψ(r, t) −2πiei(k1 r−E1 t/) ek2 (r−v1 t) ,
(51.50)
where the last exponential is decaying, since r − v1 t < 0. Since Γ 2 = 2k1 k2 = v1 k2 , 2 2m we obtain that the probability density at fixed r → ∞ in this region is |ψ(r, t)| 2 ∝ e−2k2 v1 t = e−Γt/ = e−t/τ ,
(51.51)
(51.52)
which is what we wanted to prove. Indeed then, at large times, the wave function implies that there is a probability of tunneling out of the metastable state.
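The chain from the pole position to the lifetime can be made concrete with numbers. The following sketch works in units $\hbar = m = 1$ (an assumption), with a hypothetical function name and hypothetical pole coordinates $k_1$, $k_2$.

```python
def resonance_parameters(k1, k2, hbar=1.0, m=1.0):
    """From a pole at k = k1 - i*k2, return (E_res, Gamma, v1, tau)."""
    k_pole = complex(k1, -k2)
    E_pole = hbar**2 * k_pole**2 / (2.0 * m)   # equals E_res - i*Gamma/2, see (51.16)
    E_res = E_pole.real
    gamma = -2.0 * E_pole.imag
    v1 = hbar * k1 / m                          # velocity of the pole, (51.43)
    tau = hbar / gamma                          # lifetime, (51.37)
    return E_res, gamma, v1, tau


k1, k2 = 2.0, 0.1   # hypothetical pole close to the real axis (k2 << k1)
E_res, gamma, v1, tau = resonance_parameters(k1, k2)

print(E_res, gamma, v1, tau)
print(2 * k2 * v1, 1 / tau)   # decay rate of |psi|^2 in (51.52), two ways
```

The decay rate $2k_2v_1$ read off from the out wave coincides with $\Gamma/\hbar = 1/\tau$, as in (51.51) and (51.52).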
51.4 Review of Levinson Theorem

Now that we have understood how to obtain $\delta_l(k)$, we will come back to the Levinson theorem, to present a simple proof. The theorem can be stated as
\[
\delta_l(0) - \delta_l(\infty) = n_l^b\pi + \phi, \tag{51.53}
\]
where $\phi = 0$ or $\pi/2$.

We start with a small potential, namely a potential that is small with respect to the energy associated with a de Broglie wavelength of the order of the range of the potential,
\[
|V(r)| \ll \frac{\hbar^2}{2mr_0^2}. \tag{51.54}
\]
In this case, the Born approximation is valid at all energies, including $E = \infty$, so $\delta_l(E) \ll 1$, and $\delta_l(\infty) \simeq 0$. On the other hand, $\delta_l(0) = \phi$ (which is also zero, except for a background value). Since the potential is shallow, we have no bound states, $n_l^b = 0$.

Then consider $E = \epsilon$, a small and fixed energy, and follow $\delta_l(\epsilon) - \delta_l(\infty)$ as $V(r)$ deepens. Each new bound state appears with energy close to $\epsilon$ as $V(r)$ deepens, and its radial wave function contains a new node with respect to the previous bound state (since the bound state with energy close to $\epsilon$ has the largest energy among bound states, i.e., it is $E_{n_{\max}}$, for which we have $n_{\max} + 1$ total nodes). But since $E = \epsilon$ is small, this last node is at large $r$ (since the periodicity is inversely proportional to $k$), for which the wave function is
\[
R_{El}(r) \sim \sin\left(kr - \frac{l\pi}{2} + \delta_l\right), \tag{51.55}
\]
which means that $kr - l\pi/2 + \delta_l$ changes by $\pi$ at the new node's appearance, so $\delta_l(\epsilon)$ changes by $\pi$ when each new node appears at fixed $r$. That means that
\[
\delta_l(\epsilon \to 0) - \delta_l(\infty) = n_l^b\pi + \phi, \tag{51.56}
\]
and after we take the limit $\epsilon \to 0$, we obtain the Levinson theorem. q.e.d.
51.5 Complex Angular Momentum

Until now, we have learned new information from considering $k$ in the complex plane (and so also the energy $E$ in the complex plane), but we have kept the angular momentum as a natural number, $l \in \mathbb{N}$. However, we can also learn from an extension to a complex angular momentum, $l \in \mathbb{C}$, where now the energy is kept real, $E \in \mathbb{R}$. The result is called Regge theory.

The reduced radial wave function in the bound state case is
\[
\chi_{El}(r) = rR_{El}(r) = A(l, E)e^{-\kappa r} + B(l, E)e^{+\kappa r}, \tag{51.57}
\]
where $\kappa = \sqrt{-2mE}/\hbar$. We can relate it to the scattering solution through analytical continuation through the lower half-plane (different from the upper half-plane relevant to complex $k$). The scattering solution is (the proof is just a modification of the continuation of the energy dependence through the upper half-plane)
\[
\chi_{El}(r) = A^*(l, E)e^{-ikr} + B^*(l, E)e^{ikr}. \tag{51.58}
\]
This means that
\[
A^*(l, E) = -F_l^{-}(k) = -(F_l^{+}(k))^*, \qquad B^*(l, E) = F_l^{+}(k) = (F_l^{-}(k))^*. \tag{51.59}
\]
If $l \in \mathbb{R}$ and $E > 0$, then, identifying the scattering $\chi_{El}$ with the one analytically continued from the bound state, we obtain an analyticity constraint of $A(l, E)$, namely
\[
A(l, E) = B^*(l, E). \tag{51.60}
\]
But analytically continuing to the complex plane, $l \in \mathbb{C}$, while still keeping $E \in \mathbb{R}_+$, we obtain
\[
A(l^*, E) = B^*(l, E). \tag{51.61}
\]
The general solution of the Schrödinger equation, even in the case of complex $l$, has the following behavior near $r \to 0$,
\[
R_{El}(r) \sim \alpha r^l + \beta r^{-l-1}. \tag{51.62}
\]
In the case of $l \in \mathbb{N}$, we need to put $\beta = 0$ so that we have only $R_{El} \sim r^l$. If $l \in \mathbb{C}$, we have an alternative: we can define $R_{El}$ by the $r^l$ behavior for more general $l$, provided it is subdominant to $r^{-l-1}$. Namely, we require $r^l \ll r^{-l-1}$ as $r \to 0$, which means $\mathrm{Re}\,l \ge \mathrm{Re}(-l - 1)$, so
\[
\mathrm{Re}(2l + 1) > 0, \quad\text{i.e.,}\quad \mathrm{Re}(l + 1/2) > 0. \tag{51.63}
\]
Since
\[
S(l, E) = e^{2i\delta(l,E)} = e^{i\pi l}\frac{A(l, E)}{B(l, E)}, \tag{51.64}
\]
we can generalize to
\[
\frac{A(E)}{B(E)} = \frac{F_l^{+}(k)}{F_l^{-}(k)} \to \frac{F^{+}(l, k)}{F^{-}(l, k)} = \frac{A(l, E)}{B(l, E)}. \tag{51.65}
\]
Then we obtain
\[
S^*(l, E) = e^{-i\pi l^*}\frac{A^*(l, E)}{B^*(l, E)}, \qquad S(l^*, E) = e^{+i\pi l^*}\frac{A(l^*, E)}{B(l^*, E)}, \tag{51.66}
\]
leading to the analyticity condition
\[
S^*(l, E)\,S(l^*, E) = 1. \tag{51.67}
\]
The poles of $S(l, E)$ are then the zeroes of $B(l, E)$ in the complex $l$ plane, called Regge poles. These line up as points on curves
\[
l = \alpha_i(E) \tag{51.68}
\]
called Regge trajectories. We will not continue with the analysis of Regge theory, since it is beyond the scope of this book.
Important Concepts to Remember

• For $S_l(k)$ we have poles in the lower half-plane, $k_{\rm pole} = k_1 - ik_2$, and for each such pole a corresponding zero in the upper half-plane, $k_{\rm zero} = k_1 + ik_2$.
• For poles close to $\mathbb{R}$, $k_1 \gg k_2$, near the pole
\[
S_l(k) \simeq \frac{k - k_1 - ik_2}{k - k_1 + ik_2},
\]
and we have a resonance,
\[
\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{k_2^2}{(k - k_1)^2 + k_2^2},
\]
with $E_{\rm zero/pole} = E_{\rm res} \pm i\Gamma/2$, and with $\tan\delta_l = (-\Gamma/2)/(E - E_1)$.
• The resonance satisfies the Breit–Wigner formula,
\[
\sigma_l = \frac{4\pi}{k^2}(2l+1)\frac{(\Gamma/2)^2}{(E - E_1)^2 + (\Gamma/2)^2},
\]
though with a background value for $\delta_l$, $\delta_l \to \delta_l + \xi_l$, we can change it.
• The resonance has the physical interpretation that there is a quasi-bound state, with a potential barrier but with $E > 0$, so we have an in wave for a particle going towards the center of the potential, then a metastable (quasi-bound) state, then an out wave for a particle tunneling out (decaying) with probability $\propto e^{-t/\tau}$, $\tau = \hbar/\Gamma$.
• We can also consider complex angular momentum $l$, with real energy, leading to Regge theory, in which case $S(l, E)$ has poles at points $l = \alpha_i(E)$ forming curves known as Regge trajectories.
Further Reading For more on the analytical properties of wave functions, see [4]. See also [2] and [1].
Exercises

(1) Write down the approximate value for the partial wave S-matrix $S_l(k)$ for the case of two resonances and two bound states.
(2) Consider the case where among the partial wave cross sections $\sigma_l$ only $\sigma_1$ is at (or very near) a resonance. What can you say about $\sigma_{\rm tot}$?
(3) Consider the case where $S_l(k)$ has a single pole (resonance), but $k_2$ is comparable with $k_1$. Calculate $\tan\delta_l$ and the equivalent of the Breit–Wigner formula in this more general case, $\sigma_l(E)$.
(4) If we have two resonances (complex poles) for $\sigma_l$ close to each other, and we are at resonance on the real $k$ line, write down the wave function in terms of the physical interpretation of time-dependent scattering with in and out waves, as in the text.
(5) In the case in exercise 4, calculate the asymptotic wave function in the metastable region of time, and in the out region.
(6) In the case of the Levinson theorem, for $n_l^b$, do we need to count only the poles of $S_l(k)$ that are exactly on the imaginary line, or can they be slightly away from the imaginary line?
(7) Consider that $S_l(k)$ for all $l \in \mathbb{N}$ has a single resonance pole, very close to the real line. If we consider now complex momentum instead, what can we learn about the Regge trajectories?
52
The Semiclassical WKB and Eikonal Approximations for Scattering

In this chapter we study semiclassical approximations for scattering. We start with the WKB approximation, extended to three dimensions but then reduced to effectively one dimension. Then we give an alternative approach, the eikonal approximation, where the geometrical optics approximation is further refined to a straight line. We end with a special application, the Coulomb potential scattering, where the semiclassical approximation is actually exact.
52.1 WKB Review for One-Dimensional Systems

Before the extension of the WKB analysis to three dimensions, we review the one-dimensional case. Constructing the semiclassical wave function starts with the redefinition of the time-dependent but stationary wave function,
\[
\psi(\vec r, t) = e^{iS(\vec r, t)/\hbar} = \psi(\vec r)e^{-iEt/\hbar}, \tag{52.1}
\]
where the exponent separates the time dependence as
\[
S(\vec r, t) = s(\vec r) - Et. \tag{52.2}
\]
Then the time-independent wave function is
\[
\psi(\vec r) = e^{is(\vec r)/\hbar}. \tag{52.3}
\]
This already shows that we can apply this formalism to scattering, if $\psi(\vec r)$ corresponds to the scattering wave function $\psi^{(+)}(\vec r)$. Note that we applied the WKB method before to bound states, leading to Bohr–Sommerfeld quantization.

The expansion of the function $s(\vec r)$ in $\hbar$,
\[
s(\vec r) = s_0(\vec r) + \hbar s_1(\vec r) + \hbar^2 s_2(\vec r) + \cdots \tag{52.4}
\]
leads to the semiclassical approximation if we keep only the first two terms, $s_0$ (classical) and $s_1$ (semiclassical). The zeroth-order term is the classical on-shell action between the initial condition $\vec r_0(t_0)$ and the final point $\vec r(t)$,
\[
s_0(\vec r) = S_0[\vec r_0(t_0) \to \vec r(t)]. \tag{52.5}
\]
This arises as the extremum of the path integral for the propagator,
\[
U(t, t_0) = \int \mathcal{D}\vec r(t)\,\exp\left\{\frac{i}{\hbar}S[\vec r_0(t_0) \to \vec r(t)]\right\}. \tag{52.6}
\]
One Dimension

Specializing to one dimension, the Schrödinger equation implies the equation for $s(x)$,
\[
\left(\frac{ds(x)}{dx}\right)^2 - 2m(E - V(x)) + \frac{\hbar}{i}\frac{d^2}{dx^2}s(x) = 0. \tag{52.7}
\]
It is the quantum-corrected version of the Hamilton–Jacobi equation for the classical action. The solution of the equation to the first order in $\hbar$ (i.e., $s_0 + \hbar s_1$) gives, for the wave function in the classically allowed region $E > V(x)$,
\[
\psi(x) = \frac{1}{[2m(E - V(x))]^{1/4}}\exp\left[\pm\frac{i}{\hbar}\int_{x_0}^{x}dx'\sqrt{2m(E - V(x'))}\right], \tag{52.8}
\]
where the exponent is $\frac{i}{\hbar}s_0(x)$, where $s_0(x)$ is the on-shell action (the classical action on the classical trajectory).

We use connection formulas to transition from a solution in a classically allowed region to a solution in a forbidden one (and vice versa), through the classical turning point of the trajectory. The transition formula from the classically allowed region $x > x_1$ to the classically forbidden region $x < x_1$ is
\[
\begin{aligned}
&\frac{2}{[2m(E - V(x))]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x_1}^{x}dx'\sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right], \quad x > x_1\\
&\quad\to \frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x}^{x_1}dx'\sqrt{2m(V(x') - E)}\right], \quad x < x_1.
\end{aligned} \tag{52.9}
\]
For the case where the classically allowed region is to the left, $x < x_2$, and the forbidden region is to the right, $x > x_2$, we have
\[
\begin{aligned}
&\frac{1}{[2m(V(x) - E)]^{1/4}}\exp\left[-\frac{1}{\hbar}\int_{x_2}^{x}dx'\sqrt{2m(V(x') - E)}\right], \quad x > x_2\\
&\quad\to \frac{2}{[2m(E - V(x))]^{1/4}}\sin\left[\frac{1}{\hbar}\int_{x}^{x_2}dx'\sqrt{2m(E - V(x'))} - \frac{\pi}{4}\right], \quad x < x_2.
\end{aligned} \tag{52.10}
\]
52.2 Three-Dimensional Scattering in the WKB Approximation

We now apply the formalism to three-dimensional scattering. Then the quantum-corrected Hamilton–Jacobi equation arises from the Schrödinger equation, in terms of $s(\vec r)$,
\[
\frac{(\vec\nabla s)^2}{2m} + V(\vec r) - E - \frac{i\hbar}{2m}\Delta s = 0. \tag{52.11}
\]
But we are interested in the $\hbar$-expansion of $s(\vec r)$, whose zeroth-order term is $s_0(\vec r)$. To obtain the equation for $s_0$ we drop the last term in (52.11), the only one depending on $\hbar$, and obtain the classical Hamilton–Jacobi equation,
\[
\frac{(\vec\nabla s_0)^2}{2m} + V(\vec r) = E \equiv \frac{\hbar^2k^2}{2m}, \tag{52.12}
\]
which defines $k$.
Then we reduce the system in the spherically symmetric case to one dimension, namely to the radial direction. We first reduce to the radial wave function by writing
\[
\psi(\vec r) = R(r)Y_{lm}(\vec n_r), \tag{52.13}
\]
and then to the reduced radial wave function $\chi(r) = rR(r)$. But the radial variable has a domain equal to half the real line, $r \in (0, \infty)$, so we have to redefine the variable to $x = \ln(kr)$ so that $x \in (-\infty, +\infty)$. The equation for $\chi(r)$ is (19.7), namely
\[
\frac{d^2}{dr^2}\chi(r) + \frac{2m}{\hbar^2}[E - V_{\rm eff}(r)]\chi(r) = 0, \tag{52.14}
\]
where the effective potential has the centrifugal term added,
\[
V_{\rm eff}(r) = V(r) + \frac{\hbar^2l(l+1)}{2mr^2}. \tag{52.15}
\]
In terms of $x = \ln(kr)$, meaning $dr = e^x dx/k$, we obtain
\[
\frac{d^2}{dx^2}\chi(x) - \frac{d}{dx}\chi(x) + \frac{2m}{\hbar^2}\frac{e^{2x}}{k^2}[E - V_{\rm eff}(x)]\chi(x) = 0. \tag{52.16}
\]
But to get rid of the term with the first derivative, we need to further set
\[
\chi(x) = e^{x/2}W(x). \tag{52.17}
\]
Then the equation of motion becomes
\[
\frac{d^2}{dx^2}W(x) + Q^2(x)W(x) = 0, \tag{52.18}
\]
where
\[
Q^2(x) \equiv \frac{2m}{\hbar^2}r^2[E - V_{\rm eff}(r)] - \frac{1}{4} = r^2\left[\frac{2m}{\hbar^2}(E - V(r)) - \frac{(l + 1/2)^2}{r^2}\right], \tag{52.19}
\]
so that $Q^2$ takes the place of $E - V(x)$ in the one-dimensional case.

We can easily see that if $E > 0$ (meaning, for scattering solutions) and $V(r) \to 0$ for $r \to \infty$, then $Q^2(x) \to +\infty$ at $r \to \infty$ ($x \to +\infty$). Then, if $V(r)$ doesn't blow up faster than $1/r^2$ at $r \to 0$, $Q^2 < 0$ at $r \to 0$ ($x \to -\infty$), so there is a single turning point $x_0$, at which $Q^2(x_0) = 0$.

Since $Q^2$ represents $E - V(x)$ in the one-dimensional case, we consider $-Q^2$ as $V(x) - E$. At infinity, $-Q^2 \to -\infty$ and at $r = 0$, $-Q^2 > 0$, with a classical turning point, $E - V(x) = 0$, in the middle. The effective one-dimensional quantum-corrected Hamilton–Jacobi equation is obtained by further redefining
\[
W(x) = e^{i\tilde s(x)/\hbar}, \tag{52.20}
\]
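which means
\[
\psi(\vec r) = e^{is(\vec r)/\hbar} = \sqrt{\frac{k}{r}}\,e^{i\tilde s(x)/\hbar}\,Y_{lm}(\vec n_r). \tag{52.21}
\]
The quantum-corrected Hamilton–Jacobi equation is
\[
\left(\frac{d\tilde s}{dx}\right)^2 - \hbar^2Q^2(x) = i\hbar\frac{d^2\tilde s(x)}{dx^2}. \tag{52.22}
\]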
580
Expanding in $\hbar$,
$$\tilde s(x) = \tilde s_0(x) + \hbar\,\tilde s_1(x) + \hbar^2\,\tilde s_2(x) + \cdots, \tag{52.23}$$
the equation of the leading term $\tilde s_0(x)$ is obtained by dropping the right-hand side, which is proportional to $\hbar$, leaving just the classical Hamilton–Jacobi equation, with solution
$$\tilde s_0(x) = \pm\hbar\int_{x_0}^{x} dx'\,Q(x'). \tag{52.24}$$
The semiclassical solution corresponds to $\tilde s_1(x)$, as usual:
$$\tilde s_1(x) = A + \frac{i}{2}\ln|Q(x)|, \tag{52.25}$$
where $A$ is a constant, so we have the usual one-dimensional WKB solution in terms of $W(x)$,
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{k(x)}}\left[C_1\exp\left(i\int_{x_0}^{x}dx'\,k(x')\right) + C_2\exp\left(-i\int_{x_0}^{x}dx'\,k(x')\right)\right], \tag{52.26}$$
where $Q^2(x) \equiv k^2(x) > 0$, so it is in the classically allowed region, including the asymptotic region $r \to \infty$ ($x \to +\infty$). For the classically forbidden region $Q^2(x) \equiv -\kappa^2(x) < 0$, inside the range of the potential, we find
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{\kappa(x)}}\left[D_1\exp\left(-\int_{x_0}^{x}dx'\,\kappa(x')\right) + D_2\exp\left(+\int_{x_0}^{x}dx'\,\kappa(x')\right)\right]. \tag{52.27}$$
At the effective one-dimensional turning point of the classical trajectory, $Q^2(x) = 0$, we have the continuity formula, transitioning from the inside region (inside the range of the potential) to the outside region (including the asymptotic region),
$$W_{\rm WKB}(x) = \frac{1}{\sqrt{\kappa(x)}}\exp\left(-\int_{x}^{x_0}dx'\,\kappa(x')\right) \;\to\; \frac{2}{\sqrt{k(x)}}\sin\left(\frac{\pi}{4} + \int_{x_0}^{x}dx'\,k(x')\right). \tag{52.28}$$
To describe scattering, we need to consider physical solutions in the $r \to \infty$ limit, which are in the classically allowed region. For the reduced radial wave function $\chi(r) = \sqrt{kr}\,W$, we go back to the $r$ dependence, $dx = dr/r$, implying $Q(x) = k(x) \to k(r)/r = Q(r)/r$, obtaining
$$\chi_{\rm WKB}(r) = \left[1 - \frac{2mV(r)}{\hbar^2k^2} - \frac{(l+1/2)^2}{r^2k^2}\right]^{-1/4} \sin\left[\frac{\pi}{4} + \int_{r_0}^{r}dr'\sqrt{k^2 - \frac{2m}{\hbar^2}V(r') - \frac{(l+1/2)^2}{r'^2}}\,\right]. \tag{52.29}$$
But at $r \to \infty$, which means that the integral in the argument of the sine is extended to $\int_{r_0}^{\infty}$, the argument of the sine is $kr - l\pi/2 + \delta_l^{\rm WKB}$, leading to a phase shift in the WKB approximation,
$$\delta_l^{\rm WKB}(k) = \frac{\pi}{4} + \frac{l\pi}{2} - kr_0 + \int_{r_0}^{\infty}dr\left[\sqrt{k^2 - \frac{2m}{\hbar^2}V(r) - \frac{(l+1/2)^2}{r^2}} - k\right]. \tag{52.30}$$
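As a consistency check on the WKB phase shift formula (52.30), which is not part of the original text, one can verify that it gives zero for a free particle. The sketch below assumes $V = 0$, so that the turning point is at $kr_0 = l + 1/2 \equiv \lambda$ (the symbol $\lambda$ here is introduced only for this check and is not the de Broglie wavelength).

$$\int_{r_0}^{R}dr\,\sqrt{k^2 - \frac{\lambda^2}{r^2}} = \sqrt{k^2R^2 - \lambda^2} - \lambda\arccos\frac{\lambda}{kR} \;\xrightarrow{R\to\infty}\; kR - \frac{\lambda\pi}{2},$$
$$\int_{r_0}^{\infty}dr\left[\sqrt{k^2 - \frac{\lambda^2}{r^2}} - k\right] = \lim_{R\to\infty}\left[kR - \frac{\lambda\pi}{2} - k(R - r_0)\right] = kr_0 - \frac{(l+1/2)\pi}{2},$$
$$\delta_l^{\rm WKB}\Big|_{V=0} = \frac{\pi}{4} + \frac{l\pi}{2} - kr_0 + kr_0 - \frac{(l+1/2)\pi}{2} = 0.$$

The cancellation relies on the combination $(l+1/2)^2$ rather than $l(l+1)$ in the centrifugal term, which is what the substitution $\chi = e^{x/2}W$ produces.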
52.3 The Eikonal Approximation
We now consider another version of the WKB approximation called the eikonal approximation. Its region of validity is a subset of the region of validity of the WKB approximation, which is the region where $V(x)$ varies little over the de Broglie wavelength $\lambda$. This implies the WKB condition,
$$\frac{\delta_\lambda \lambda}{\lambda} \ll 1. \tag{52.31}$$
We further refine the domain of the approximation by saying that $E \gg |V|$ (the actual value of the potential, not just the variation of it, is small with respect to the energy). This means we are at high energy, which is different from the case for the Born approximation, which is valid for both low and high energies.

However, as a WKB approximation, we are within geometrical optics. That means that a wave is replaced by a light ray path, i.e., the integral over all paths (the path integral) is restricted to just the classical path. Indeed, that restricts $s(\vec r)$ to the classical on-shell action $s_0(\vec r)$, where $x(t)$ is on the classical trajectory, which solves the classical Hamilton–Jacobi equation. Instead of reducing the three-dimensional problem to a one-dimensional problem as in the WKB method above, we now make a further approximation: that the classical trajectory is approximately a straight path, which is true at high energies, since then there is only a small deflection. This makes the problem calculable, since actually finding the classical trajectory as a solution of the Hamilton–Jacobi equation is potentially very difficult.

The approximation defined above is called the eikonal approximation, coming from the Greek “eikon”, meaning icon or image, since in the geometric optics case the light rays remaining straight through the interaction means that we obtain an undistorted “image” of an emitting object, after the interaction.

Since the classical trajectory is a straight line, we can define it as the $Oz$ axis, where the origin $O$ is the projection onto the axis of the central point of the spherically symmetric potential; see Fig. 52.1. The distance of the projection from the central point to the origin $O$ is the impact parameter $b$.

Figure 52.1 Classical trajectory in the eikonal approximation, and the impact parameter.

Since $r = |\vec r| = \sqrt{b^2 + z^2}$, the Hamilton–Jacobi equation reduces to a one-dimensional equation for the parameter $z$ of the classical trajectory, giving
$$\frac{1}{\hbar^2}\left(\frac{ds_0}{dz}\right)^2 = k^2 - \frac{2m}{\hbar^2}V(r) = k^2 - \frac{2m}{\hbar^2}V\!\left(\sqrt{b^2+z^2}\right). \tag{52.32}$$
It integrates to
$$\frac{s_0}{\hbar} = \int_{-\infty}^{z}dz'\,\sqrt{k^2 - \frac{2m}{\hbar^2}V\!\left(\sqrt{b^2+z'^2}\right)} + C, \tag{52.33}$$
where $C$ is a constant, which we can choose such that $s_0/\hbar \to kz$ when we turn off the potential, $V \to 0$. Then, since
$$\sqrt{k^2 - \frac{2mV}{\hbar^2}} = k\sqrt{1 - \frac{V}{E}} \simeq k\left(1 - \frac{V}{2E}\right) \simeq k - \frac{mV}{\hbar^2 k}, \tag{52.34}$$
we find
$$\frac{s_0}{\hbar} = kz + \int_{-\infty}^{z}dz'\left[\sqrt{k^2 - \frac{2m}{\hbar^2}V\!\left(\sqrt{b^2+z'^2}\right)} - k\right] \simeq kz - \frac{m}{\hbar^2k}\int_{-\infty}^{z}dz'\,V\!\left(\sqrt{b^2+z'^2}\right). \tag{52.35}$$
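Then we obtain a scattering wave function for the argument $\vec r = \vec b + z\vec e_z$ in the eikonal approximation,
$$\psi^+(\vec r) = \psi(\vec b + z\vec e_z) = e^{ikz}\exp\left[-\frac{im}{\hbar^2k}\int_{-\infty}^{z}dz'\,V\!\left(\sqrt{b^2+z'^2}\right)\right], \tag{52.36}$$
where $e^{ikz} = e^{i\vec k\cdot\vec r}$. To obtain the scattering amplitude $f$ we substitute the above $\psi^+$ into the right-hand side of (46.71), obtained by substituting the free Green's function $G_0^{(+)}(\vec r, \vec r')$ in the Lippmann–Schwinger equation, namely
$$f^{+}_{\vec k}(\vec n_r, \vec n_k) = f^{(+)}(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{-i\vec k'\cdot\vec r'}\,V(\vec r')\,\psi^+(\vec r')$$
$$= -\frac{2m}{4\pi\hbar^2}\int d^3r'\,e^{-i\vec k'\cdot\vec r'}\,V\!\left(\sqrt{b^2+z'^2}\right)e^{i\vec k\cdot\vec r'}\exp\left[-\frac{im}{\hbar^2k}\int_{-\infty}^{z'}d\tilde z\,V\!\left(\sqrt{b^2+\tilde z^2}\right)\right], \tag{52.37}$$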
where if we replace the last exponential $\exp[\ldots]$ with 1, we have the Born amplitude $f^{(1)}(\vec k', \vec k)$. This means that the eikonal approximation resums some interactions (each having potential $V$), replacing $V \times 1$ with $V \times \exp[\ldots V] \sim V \times \sum_n(\ldots V)^n$.

To do the integral, we integrate over cylindrical coordinates $z$, $b$ and the angle $\phi$ that rotates the trajectory around the center of the potential, so $d^3r = dz\,db\,b\,d\phi$. The deflection turns $\vec k$ into $\vec k'$, but the deflection angle $\theta$ is very small, and the modulus of $\vec k$ is unchanged, which means that $\vec k - \vec k'$ is perpendicular to $\vec k$, which is proportional to $\vec e_z$. Thus, since $(\vec k - \vec k')\cdot\vec e_z \simeq 0$, we find
$$(\vec k - \vec k')\cdot\vec r = (\vec k - \vec k')\cdot(\vec b + z\vec e_z) \simeq (\vec k - \vec k')\cdot\vec b. \tag{52.38}$$
However, $\vec k - \vec k'$ has modulus $k\theta$ (since $\theta \ll 1$), and its direction makes an angle $\phi$ with the impact parameter vector $\vec b$ (see Fig. 52.2), so finally
$$(\vec k - \vec k')\cdot\vec r \simeq k\theta\,(b\cos\phi). \tag{52.39}$$
Then the scattering amplitude is
$$f^{(+)}(\vec k', \vec k) = -\frac{2m}{4\pi\hbar^2}\int_0^{\infty}db\,b\int_0^{2\pi}d\phi\,e^{-ikb\theta\cos\phi}\int_{-\infty}^{+\infty}dz\,V\exp\left[-\frac{im}{\hbar^2k}\int_{-\infty}^{z}dz'\,V\right]. \tag{52.40}$$
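Figure 52.2 Geometry of scattering by a small angle $\theta$, with impact parameter $b$.

Using the formulas
$$\int_0^{2\pi}d\phi\,e^{-ikb\theta\cos\phi} = 2\pi J_0(kb\theta),$$
$$\int_{-\infty}^{+\infty}dz\,V\exp\left[-\frac{im}{\hbar^2k}\int_{-\infty}^{z}dz'\,V\right] = \int_{-\infty}^{+\infty}dz\,\frac{d}{dz}\left\{\frac{i\hbar^2k}{m}\exp\left[-\frac{im}{\hbar^2k}\int_{-\infty}^{z}dz'\,V\right]\right\}, \tag{52.41}$$
we obtain
$$f^{(+)}(\vec k', \vec k) = -ik\int_0^{\infty}db\,b\,J_0(kb\theta)\left[e^{2i\Delta(b)} - 1\right], \tag{52.42}$$
where
$$\Delta(b) \equiv -\frac{m}{2\hbar^2k}\int_{-\infty}^{+\infty}dz\,V\!\left(\sqrt{b^2+z^2}\right). \tag{52.43}$$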
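The eikonal phase $\Delta(b)$ of (52.43) is a plain line integral along the straight trajectory, so it is easy to evaluate numerically. The following is a minimal sketch, not taken from the text: it assumes units with $\hbar = m = 1$, a Gaussian potential $V(r) = V_0 e^{-r^2/a^2}$ chosen only because its integral is known in closed form, and hypothetical function names.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: units with m = 1

def eikonal_phase(potential, b, k, z_max=20.0, steps=4000):
    """Delta(b) = -(m / (2 hbar^2 k)) * int dz V(sqrt(b^2 + z^2)), by Simpson's rule.

    The infinite z range is truncated to [-z_max, z_max]; steps must be even.
    """
    h = 2.0 * z_max / steps
    total = 0.0
    for i in range(steps + 1):
        z = -z_max + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * potential(math.hypot(b, z))
    integral = total * h / 3.0
    return -MASS * integral / (2.0 * HBAR**2 * k)

# Illustrative potential (an assumption, not from the text): V(r) = V0 exp(-r^2/a^2)
V0, A = 0.5, 1.0

def gaussian(r):
    return V0 * math.exp(-((r / A) ** 2))

def eikonal_phase_gaussian_exact(b, k):
    """Closed form for the Gaussian: int dz V = V0 * a * sqrt(pi) * exp(-b^2/a^2)."""
    line_integral = V0 * A * math.sqrt(math.pi) * math.exp(-((b / A) ** 2))
    return -MASS * line_integral / (2.0 * HBAR**2 * k)
```

For a repulsive potential ($V_0 > 0$) the phase is negative and decays quickly with $b$, which is why the $b$ integral in (52.42) effectively has a finite range.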
If we have a finite-range potential, with range $r_0$, then for $b > r_0$ we have $\Delta(b) = 0$, so the integrand is zero, leading to
$$f^{(+)}(\theta) = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)\left[e^{2i\Delta(b)} - 1\right]. \tag{52.44}$$
This amplitude corresponds to the optical theorem in an interesting way. To see this, we take the imaginary part of the forward amplitude, obtaining
$$\operatorname{Im} f^{(+)}(\theta = 0) = -k\operatorname{Re}\int_0^{r_0}db\,b\,J_0(0)\left[e^{2i\Delta(b)} - 1\right] = 2k\int_0^{r_0}db\,b\,\sin^2\Delta(b). \tag{52.45}$$
For the optical theorem to be true, the total cross section must be
$$\sigma_{\rm tot} = \int_0^{r_0}db\,(4\pi b)\,2\sin^2\Delta(b). \tag{52.46}$$
This can be obtained from the partial wave expansion. To see that, we first consider the fact that the eikonal approximation is a high-energy approximation, therefore the maximum angular momentum is large. But the angular momentum is $\hbar l = bp$, and since $p = \hbar k$, we find $l = bk$. Then $l_{\max} = kb_{\max} = kr_0 \gg 1$, and thus the sum over $l$ becomes an integral over $kb$,
$$\sum_{l=0}^{l_{\max}} \leftrightarrow \int_0^{r_0}k\,db, \tag{52.47}$$
and the phase shift turns into $\Delta(b)$,
$$\delta_l(k) \leftrightarrow \Delta(b)\big|_{b=l/k}. \tag{52.48}$$
Thus if $b > b_{\max} = r_0$, $\delta_l = 0 \leftrightarrow \Delta(b) = 0$. Since also $l \gg 1$ and $\theta \ll 1$, we have
$$P_l(\cos\theta) \simeq J_0(l\theta) = J_0(kb\theta), \tag{52.49}$$
and also $2l + 1 \simeq 2l \leftrightarrow 2kb$. Then the amplitude, from the partial wave expansion formula, is, since $a_l(k) = (e^{2i\delta_l(k)} - 1)/(2ik)$,
$$f(\theta) \leftrightarrow k\int_0^{r_0}db\,\frac{2kb}{2ik}\left[e^{2i\Delta(b)} - 1\right]J_0(kb\theta) = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)\left[e^{2i\Delta(b)} - 1\right], \tag{52.50}$$
which is the eikonal approximation formula, obtained from the partial wave expansion. The total cross section is now obtained similarly,
$$\sigma_{\rm tot} = \frac{4\pi}{k^2}\sum_{l=0}^{l_{\max}}(2l+1)\sin^2\delta_l(k) \leftrightarrow \frac{4\pi}{k^2}\int_0^{r_0}k\,db\,2bk\,\sin^2\Delta(b), \tag{52.51}$$
and this formula is consistent with the optical theorem.
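The replacement $P_l(\cos\theta) \simeq J_0(l\theta)$ in (52.49) can be checked numerically for large $l$ and small $\theta$. This is a minimal sketch with hypothetical helper names: the Legendre polynomial is built from the standard three-term recurrence, and $J_0$ from its integral representation (the same one that appears in (52.41)), so only the standard library is assumed.

```python
import math

def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def bessel_j0(x, steps=2000):
    """J_0(x) = (1/pi) * int_0^pi cos(x sin t) dt, evaluated with the midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi

def legendre_bessel_gap(l, theta):
    """Absolute difference |P_l(cos theta) - J_0(l theta)|."""
    return abs(legendre(l, math.cos(theta)) - bessel_j0(l * theta))
```

For example, `legendre_bessel_gap(200, 0.01)` compares the two sides at $l\theta = 2$. The residual is of relative order $1/l$, the same order as the replacement $2l + 1 \simeq 2l$ made in the text.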
52.4 Coulomb Scattering and the Semiclassical Approximation
We will try to apply the formalism of the semiclassical WKB approximation to the case of Coulomb scattering. Since
$$\sqrt{k^2 - \frac{2mV(r)}{\hbar^2} - \frac{(l+1/2)^2}{r^2}} \simeq k\left[1 - \frac{V(r)}{2E} - \frac{\hbar^2(l+1/2)^2}{4mEr^2}\right], \tag{52.52}$$
we find
$$\int_{r_0}^{r}dr'\sqrt{k^2 - \frac{2mV(r')}{\hbar^2} - \frac{(l+1/2)^2}{r'^2}} \simeq k\left[\int_{r_0}^{r}dr' - \frac{1}{2E}\int_{r_0}^{r}dr'\,V(r') - \frac{\hbar^2(l+1/2)^2}{4mE}\int_{r_0}^{r}\frac{dr'}{r'^2}\right]. \tag{52.53}$$
But the limit $r \to \infty$ does not make sense in the Coulomb case $V(r) = A/r$, since the integral of $V$ diverges logarithmically; it makes sense only if $V = A/r^\alpha$ with $\alpha > 1$.

Instead of using the WKB approximation for the radial reduction, for the Coulomb potential $V = e_0^2/r$ (for the interaction of two electrons, i.e., negative charges, unlike in the hydrogenoid atom), we start directly from the Schrödinger equation,
$$\Delta\psi + \left(k^2 - \frac{2m}{\hbar^2}\frac{e_0^2}{r}\right)\psi = 0. \tag{52.54}$$
Making the change of variables
$$\psi(\vec r) = e^{ikr}F(\vec r), \tag{52.55}$$
the Schrödinger equation becomes
$$\Delta F + 2ik\,\frac{\vec r}{r}\cdot\vec\nabla F + \left(\frac{2ik}{r} - \frac{2me_0^2}{\hbar^2 r}\right)F = 0. \tag{52.56}$$
However, instead of reducing the problem to the radial coordinate, we will use Cartesian coordinates $(x, y, z)$ to reduce the problem to the variable $u = k(r - z)$. Therefore our ansatz for the scattering solution of the Schrödinger equation is $F = F(u)$. Then we find
$$\Delta F = \frac{2ku}{r}\frac{d^2F}{du^2} + \frac{2k}{r}\frac{dF}{du}, \qquad \frac{\vec r}{r}\cdot\vec\nabla F = k\left(1 - \frac{z}{r}\right)\frac{dF}{du} = \frac{u}{r}\frac{dF}{du}. \tag{52.57}$$
This reduces the Schrödinger equation to
$$\frac{2k}{r}\,u\,\frac{d^2F}{du^2} + \frac{2k}{r}(1 + iu)\frac{dF}{du} + \frac{2k}{r}\left(i - \frac{me_0^2}{\hbar^2k}\right)F = 0. \tag{52.58}$$
Defining $v = -iu$, the equation becomes
$$v\,\frac{d^2F}{dv^2} + (1 - v)\frac{dF}{dv} - \left(1 + i\,\frac{me_0^2}{\hbar^2k}\right)F = 0, \tag{52.59}$$
which is the equation for the confluent hypergeometric function ${}_1F_1(a, b; v)$ (see (18.41)),
$$v\,\frac{d^2F}{dv^2} + (b - v)\frac{dF}{dv} - aF = 0; \tag{52.60}$$
this means that the solution has $b = 1$, $a = 1 + i\alpha$, where
$$\alpha = \frac{me_0^2}{\hbar^2k}. \tag{52.61}$$
But the confluent hypergeometric function of an imaginary variable has the asymptotics
$${}_1F_1(a, b; -iu) \simeq \frac{\Gamma(b)}{\Gamma(a)}\,\frac{e^{i\pi(b-a)/2}}{u^{b-a}}\,e^{-iu} + \frac{\Gamma(b)}{\Gamma(b-a)}\,e^{-i\pi a/2}\,\frac{1}{u^{a}}. \tag{52.62}$$
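In our case, we have
$${}_1F_1(1 + i\alpha, 1; -iu) \simeq \frac{e^{\pi\alpha/2}}{\Gamma(1+i\alpha)}\,\frac{e^{-iu}}{u^{-i\alpha}} + \frac{1}{\Gamma(-i\alpha)}\,\frac{e^{-i\pi(1+i\alpha)/2}}{u^{1+i\alpha}} = \frac{e^{\pi\alpha/2}}{\Gamma(1+i\alpha)}\left[e^{-iu+i\alpha\ln u} + e^{-i\pi/2}\,\frac{\Gamma(1+i\alpha)}{\Gamma(-i\alpha)}\,\frac{e^{-i\alpha\ln u}}{u}\right], \tag{52.63}$$
where we have used $1/u^{-i\alpha} = e^{+i\alpha\ln u}$ and $1/u^{1+i\alpha} = u^{-1}e^{-i\alpha\ln u}$. Therefore the full wave function is (since $\psi = e^{ikr}\,{}_1F_1(\ldots)$)
$$\psi = \frac{e^{\pi\alpha/2}}{\Gamma(1+i\alpha)}\left[e^{ikz+i\alpha\ln[k(r-z)]} - i\,\frac{\Gamma(1+i\alpha)}{\Gamma(-i\alpha)}\,\frac{e^{ikr-i\alpha\ln[k(r-z)]}}{k(r-z)}\right]. \tag{52.64}$$
In the first term, $e^{ikz} = e^{i\vec k\cdot\vec r}$ is the incoming plane wave, and there is an extra slowly varying phase.

In the second term, using $\Gamma(1 + i\alpha) = i\alpha\Gamma(i\alpha)$ and $r - z = r(1 - \cos\theta) = 2r\sin^2\theta/2$ and factorizing $e^{ikr}/r$, we obtain the scattering amplitude
$$f(\theta) = \frac{\Gamma(i\alpha)}{\Gamma(-i\alpha)}\,\frac{\alpha\,e^{-i\alpha\ln[2kr\sin^2\theta/2]}}{2k\sin^2\theta/2}. \tag{52.65}$$
However, since $[\Gamma(i\alpha)]^* = \Gamma(-i\alpha)$, the first factor is a phase and the differential cross section is
$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 = \left(\frac{\alpha}{2k\sin^2\theta/2}\right)^2 = \left(\frac{e_0^2\,m}{2\hbar^2k^2\sin^2\theta/2}\right)^2 = \frac{e_0^4}{4m^2v^4}\,\frac{1}{\sin^4\theta/2}. \tag{52.66}$$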
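The last two forms in (52.66) are related by $\hbar k = mv$, which is easy to confirm numerically. The sketch below is only an arithmetic illustration under the assumption $\hbar = 1$; the function names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1

def rutherford_from_alpha(theta, k, mass, e0_sq):
    """|f|^2 = (alpha / (2 k sin^2(theta/2)))^2 with alpha = m e0^2 / (hbar^2 k)."""
    alpha = mass * e0_sq / (HBAR**2 * k)
    return (alpha / (2.0 * k * math.sin(theta / 2.0) ** 2)) ** 2

def rutherford_classical(theta, k, mass, e0_sq):
    """Classical form e0^4 / (4 m^2 v^4 sin^4(theta/2)) with v = hbar k / m."""
    v = HBAR * k / mass
    return e0_sq**2 / (4.0 * mass**2 * v**4 * math.sin(theta / 2.0) ** 4)
```

Both functions diverge as $\theta \to 0$, reflecting the infinite range of the Coulomb potential, so a total cross section cannot be defined without a cutoff.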
This is nothing other than the classical Rutherford formula, even though we have obtained an exact formula for (nonrelativistic) quantum scattering, since we haven’t used any approximation. We have seen before that in the Born approximation we can find the classical result. So any semiclassical corrections to the classical result vanish, and in fact all quantum corrections vanish. Thus, Coulomb scattering is a very special case.
Important Concepts to Remember
• We can use the WKB approximation also for three-dimensional scattering, using as before the transition formulas from the classically allowed region to the classically forbidden region, and reducing the three-dimensional Schrödinger equation to a one-dimensional Schrödinger equation with domain $\mathbb{R}$, by writing $x = \ln(kr)$, $\chi(x) = e^{x/2}W(x)$, and the equation $\frac{d^2}{dx^2}W(x) + Q^2(x)W(x) = 0$.
• The WKB approximation for $W(x) = \exp[i\tilde s(x)/\hbar]$ is obtained from the quantum-corrected Hamilton–Jacobi equation for $\tilde s(x)$, leading to an expression for $\chi_{\rm WKB}(r)$ that at infinity goes like $\sin(kr - l\pi/2 + \delta_l^{\rm WKB})$, giving a formula for the phase shift in the WKB approximation.
• The WKB approximation is a geometrical optics approximation, of classical ray paths, but one can add to it an eikonal approximation, of straight ray paths, since the classical paths (solutions of the Hamilton–Jacobi equation) are difficult to obtain.
• The eikonal approximation, a high-energy approximation, resums some Born series terms, and gives, for a finite-range potential with range $r_0$, $f^+ = -ik\int_0^{r_0}db\,b\,J_0(kb\theta)[e^{2i\Delta(b)} - 1]$, with $\Delta(b) = -(m/2\hbar^2k)\int_{-\infty}^{+\infty}dz\,V(\sqrt{b^2+z^2})$ and $\sigma_{\rm tot} = \int_0^{r_0}db\,(4\pi b)\,2\sin^2\Delta(b)$.
• The exact Coulomb scattering calculation reproduces the Rutherford formula, so there are no WKB approximation corrections, in fact no quantum corrections of any kind, in this nonrelativistic case.

Further Reading
See [2] and [1].
Exercises
(1) Consider a one-dimensional system with potential $V(x) = V_0/(x^2 + a^2)$. Write the WKB approximation for the wave function for scattering, for energy $0 < E < V_0/a^2$.
(2) Consider a three-dimensional system with central potential $V(r) = V_0/(r^2 + a^2)$. Write the WKB approximation for the wave function for scattering of given angular momentum $l$, for energy $0 < E < V_0/a^2$.
(3) For the case in exercise 2, calculate the cross section as a formal sum over $l$.
(4) For the same potential as above, $V(r) = V_0/(r^2 + a^2)$, calculate $\Delta(b)$ for the eikonal approximation. If the potential turns off ($V(r) = 0$) for $r \geq r_0$, calculate $\Delta(b)$ and, if $r_0$ is large, calculate an approximate value for the scattering amplitude in the eikonal approximation.
(5) For a Yukawa potential, $V(r) = V_0e^{-\mu r}/r$, write an approximate value for the cross section in the eikonal approximation.
(6) If we have a Coulomb potential that turns off ($V(r) = 0$) for a large $r \geq r_0$, calculate the total (approximate) cross section in the eikonal approximation, and compare with the exact value from the Rutherford formula.
(7) Are there any resonances for Coulomb scattering? Check that the analytical properties of $S_l(k)$ in this case are satisfied.
53 Inelastic Scattering
In this chapter we study inelastic scattering, which means there are energy, momentum, and maybe even particles leaking out of the system into another system (or “sink”). A standard example is the scattering of a fundamental particle (such as an electron) off a compound particle (such as an atom), which can absorb energy and get excited to a different state. After analyzing this from the point of view of an unknown other system, and that of a known other system, we generalize to scattering in the multi-channel case, and also the scattering of identical particles.
53.1 Generalizing Elastic Scattering from Unitarity Loss
In the first generalization we abandon unitarity, which means that there is a probability leak into an unknown other system. We previewed this in Chapter 49. Then, everything follows via the partial wave expansion, but the S-matrix is not unitary, $\hat S^\dagger\hat S \neq 1$. In terms of partial waves,
$$S_l^*S_l \neq 1, \tag{53.1}$$
which means that $S_l$ is not a phase factor. Or, defining in the same way $S_l = e^{2i\delta_l}$, it means that $\delta^* \neq \delta$, i.e., $\delta$ has a nonzero imaginary part. Since $S_l$ is the ratio of the “out” wave to the “in” wave,
$$\psi \sim S_l(k)\,\frac{e^{ikr}}{r} + \frac{e^{-ikr}}{r}, \tag{53.2}$$
and the probability leaks into something else between the in and the out wave functions, we have $|S_l(k)| < 1$, or
$$\operatorname{Im}\delta_l \geq 0 \;\Rightarrow\; |S_l| = e^{-2\operatorname{Im}\delta_l(k)} \leq 1. \tag{53.3}$$
In the eikonal approximation,
$$\delta_l(k) \leftrightarrow \Delta(b)\big|_{b=l/k}, \qquad S_l(k) = e^{2i\delta_l(k)} \leftrightarrow e^{2i\Delta(b)}, \tag{53.4}$$
where $\operatorname{Im}\Delta(b) \geq 0$. The total cross section expands into partial waves,
$$\sigma_{\rm tot} = \sum_{l=0}^{l_{\max}}\sigma_l, \tag{53.5}$$
and for $\sigma_l$ we have a more general formula, previewed in (49.13),
$$\sigma_l = \frac{\pi}{k^2}(2l+1)\left|e^{2i\delta_l(k)} - 1\right|^2 = \frac{\pi}{k^2}(2l+1)\left|S_l(k) - 1\right|^2, \tag{53.6}$$
corresponding in the eikonal approximation to
$$\frac{\pi}{k}\,2b\left|e^{2i\Delta(b)} - 1\right|^2. \tag{53.7}$$
The sum over $l$ turns into an integral, $\sum_{l=0}^{l_{\max}} \to \int_0^{r_0}k\,db$, so
$$\sigma_{\rm tot} = 2\pi\int_0^{r_0}db\,b\left|e^{2i\Delta(b)} - 1\right|^2. \tag{53.8}$$
We can then define a maximally inelastic scattering, the “black disk” eikonal, which means that inside a disk of radius $r_0$ we have complete absorption, $S_l = 0$, i.e., a “black” region. This is achieved by setting
$$\operatorname{Im}\Delta(b) = +\infty, \quad\text{for } b < r_0, \tag{53.9}$$
and, as always, $S_l = 1$ (there is no interaction or deflection), or $\Delta(b) = 0$, for $b > r_0$. Then the total cross section of the black disk eikonal is
$$\sigma_{\rm tot} = \pi r_0^2, \tag{53.10}$$
which is just the classical, geometrical, cross section. Note that this is the total inelastic cross section.

On the other hand, we can construct the amplitude as we did in the previous chapter. Defining $\vec q = \vec k' - \vec k$, with modulus $q = |\vec q| \simeq k\theta$, we write $-i(k\theta)b\cos\phi$ as $-i\vec q\cdot\vec b$, where the vector parameter $\vec b$ is in the two-dimensional plane transverse to the initial direction $\vec k$. Then the eikonal amplitude becomes
$$f(\theta, r_0) = -\frac{ik}{2\pi}\int_0^{r_0}db\,b\int_0^{2\pi}d\phi\,e^{-i\vec q\cdot\vec b}\left(e^{2i\Delta(b)} - 1\right), \tag{53.11}$$
where
$$\int_0^{2\pi}d\phi\,e^{-i\vec q\cdot\vec b} = 2\pi J_0(qb) = 2\pi J_0(k\theta b). \tag{53.12}$$
Taking the imaginary part of the forward ($\theta = 0$) amplitude for the black disk eikonal ($e^{2i\Delta(b\leq r_0)} = 0$), we obtain
$$\operatorname{Im} f(0, r_0) = k\int_0^{r_0}db\,b = \frac{kr_0^2}{2}. \tag{53.13}$$
If the optical theorem is to be satisfied, the total cross section must be
$$\sigma_{\rm tot} = \frac{4\pi}{k}\operatorname{Im} f = 2\pi r_0^2. \tag{53.14}$$
However, if we consider only inelastic scattering then unitarity is violated, as we said, since $|S_l| < 1$, and thus the optical theorem (which follows from unitarity) is also violated. But the total cross section from the optical theorem is larger. In fact, the eikonal approximation is at high energy, where we have already seen that the result is $2\pi r_0^2$, comprising the geometrical cross section and the shadow one,
$$\sigma_{\rm tot} = 2\pi r_0^2 = \pi r_0^2 + \pi r_0^2 = \sigma_{\rm geometric} + \sigma_{\rm shadow}. \tag{53.15}$$
The inelastic cross section is the geometric part of the above.
The eikonal amplitude can be further rewritten as
$$f(\theta, r_0) = -\frac{ik}{2\pi}\int_0^{r_0}db\,b\int_0^{2\pi}d\phi\,e^{-i\vec q\cdot\vec b}\left(e^{2i\Delta(b)} - 1\right) = -\frac{ik}{2\pi}\int d^2b\;e^{-i\vec q\cdot\vec b}\left(e^{2i\Delta(b)} - 1\right). \tag{53.16}$$
Calculating the black disk ($e^{2i\Delta(b\leq r_0)} = 0$) eikonal amplitude at arbitrary $\theta$, we obtain
$$f(\theta, r_0) = \frac{ik}{2\pi}\int_0^{r_0}db\,b\,2\pi J_0(qb) = \frac{ik}{q^2}\int_0^{qr_0}dx\,x\,J_0(x) = \frac{ikr_0}{q}\,J_1(qr_0), \tag{53.17}$$
where we have used $\int_0^{a}dx\,x\,J_0(x) = aJ_1(a)$. Taking the limit $q \to 0$ by using the limit $J_1(x) \simeq x/2$ for $x \to 0$, we obtain
$$f(\theta \to 0, r_0) = \frac{ikr_0^2}{2}. \tag{53.18}$$
In order to obtain the correct inelastic cross section, we use the optical theorem but divide by an extra 2, since the optical theorem gives the total cross section, which is in equal parts inelastic and shadow, as explained above:
$$\sigma_{\rm tot,inelastic} = \frac12\,\frac{4\pi}{k}\operatorname{Im} f(0, r_0) = \pi r_0^2. \tag{53.19}$$
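The black disk results (53.17) and (53.18) can be checked by doing the $b$ integral numerically. This is a minimal sketch, not from the text: the Bessel functions are evaluated from the integral representation $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$ so that only the standard library is needed, and all function names are hypothetical.

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) for integer n via (1/pi) int_0^pi cos(n t - x sin t) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def black_disk_amplitude_numeric(q, k, r0, steps=2000):
    """f = i k int_0^r0 db b J0(q b), by Simpson's rule in b (steps must be even)."""
    h = r0 / steps
    total = 0.0
    for i in range(steps + 1):
        b = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * b * bessel_j(0, q * b, steps=200)
    return 1j * k * total * h / 3.0

def black_disk_amplitude_closed(q, k, r0):
    """Closed form (53.17): f = i k r0 J1(q r0) / q, valid for q != 0."""
    return 1j * k * r0 * bessel_j(1, q * r0) / q
```

Evaluating `black_disk_amplitude_closed` at very small `q` approaches $ikr_0^2/2$, so that $(4\pi/k)\operatorname{Im} f$ tends to $2\pi r_0^2$, twice the geometric area, as in (53.14).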
53.2 Inelastic Scattering Due to Target Structure
One possible way to have inelastic scattering is for the target to have structure, leading to excited states that can be reached by absorbing energy from the projectile. That is, the projectile is particle A, and the target is system B, with internal structure. The standard example is the scattering of an electron (particle A) off an atom (system B) with excited internal states. Then the state of the combined system AB (electron plus atom) is the product of the momentum state of the particle A and the state of the target B,
$$|\vec k\,n_0\rangle = |\vec k\rangle_A \otimes |n_0\rangle_B, \tag{53.20}$$
with the wave function factorizing into a free particle wave function for A times a multiparticle wave function for the $N$ components of B,
$$\psi \sim e^{i\vec k\cdot\vec x}\,\psi_{n_0}(\vec r_1, \ldots, \vec r_N). \tag{53.21}$$
In the case of the atom we have $N = Z$, the electric charge number, which is the same as the number of electrons in the atom. The transition between the initial state and the final state is
$$|\vec k, n_0\rangle = |\vec k\rangle_A \otimes |n_0\rangle_B \;\to\; |\vec k'\,n\rangle = |\vec k'\rangle \otimes |n\rangle_B. \tag{53.22}$$
The final state wave function is
$$\psi \sim e^{i\vec k'\cdot\vec x}\,\psi_n(\vec r_1, \ldots, \vec r_N). \tag{53.23}$$
The elastic case corresponds to $n = n_0$, otherwise we have inelastic scattering. In Chapter 46, we found the differential cross section as
$$\frac{d\sigma}{d\Omega} = \frac{r^2|\vec j_{\rm scatt}|}{|\vec j_{\rm inc}|} = \frac{r^2(\hbar k/m)(|f|^2/r^2)}{(\hbar k/m)} = |f|^2. \tag{53.24}$$
We want to generalize the final formula by calculating the change in the initial and scattered currents. In the scattered current the energy, and therefore the wave vector $\vec k$, of the projectile is changed by exciting the internal state of the target, so $k' \neq k$. But the projectile itself does not change, so $m$ stays fixed. The more general formula is then
$$\frac{d\sigma}{d\Omega}(|n_0\rangle \to |n\rangle) = \frac{r^2 j_{\rm scatt}}{j_{\rm inc}} = \frac{k'}{k}\,|f|^2. \tag{53.25}$$
If we restrict to the Born approximation, namely first-order time-dependent perturbation theory (Fermi's golden rule), the amplitude, calculated in Chapter 47 as
$$f^{(1)}(\vec k', n; \vec k, n_0) = \frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0\rangle, \tag{53.26}$$
leads to the first-order differential cross section,
$$\frac{d\sigma^{(1)}}{d\Omega} = \frac{k'}{k}\left|\frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0\rangle\right|^2. \tag{53.27}$$
In general (outside the Born approximation), the amplitude is written as
$$f(\vec k', n; \vec k, n_0) = \frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0+\rangle, \tag{53.28}$$
and the differential cross section is
$$\frac{d\sigma}{d\Omega} = \frac{k'}{k}\left|\frac{2m}{4\pi\hbar^2}\,\langle\vec k', n|\hat V|\vec k, n_0+\rangle\right|^2. \tag{53.29}$$
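The potential contains the interaction of the projectile A with all the components of the target B. In the particular case of the atom, we have the interaction of the incoming electron with the nucleus of charge $+Ze$ situated at $\vec r = 0$, and the $N = Z$ electrons, situated at $\vec r_i$,
$$V = -\frac{Ze_0^2}{r} + \sum_{i=1}^{N}\frac{e_0^2}{|\vec r - \vec r_i|}. \tag{53.30}$$
In the Born approximation, the amplitude for the elastic scattering becomes explicitly
$$f^{(1,+)}(\vec k', \vec k) = \frac{2m}{4\pi\hbar^2}\int d^3r\;e^{i\vec r\cdot(\vec k - \vec k')}\,V(\vec r), \tag{53.31}$$
where $\vec k - \vec k' = -\vec q$. In the inelastic generalization, this becomes
$$f^{(1,+)}(\vec k', n; \vec k, n_0) = \frac{2m}{4\pi\hbar^2}\int d^3r\;e^{-i\vec q\cdot\vec r}\,\langle n|V(\vec r)|n_0\rangle$$
$$= \frac{2m}{4\pi\hbar^2}\int d^3r\;e^{-i\vec q\cdot\vec r}\int\prod_{i=1}^{N}d^3r_i\;\psi_n^*(\vec r_1, \ldots, \vec r_N)\,V(\vec r, \vec r_1, \ldots, \vec r_N)\,\psi_{n_0}(\vec r_1, \ldots, \vec r_N). \tag{53.32}$$
But we have that
$$\int d^3r\sum_{i=1}^{N}\frac{e^{-i\vec q\cdot\vec r}}{|\vec r - \vec r_i|} = \int d^3r\sum_{i=1}^{N}\frac{e^{-i\vec q\cdot(\vec r + \vec r_i)}}{r} = \sum_{i=1}^{N}\frac{4\pi}{q^2}\,e^{-i\vec q\cdot\vec r_i} = \frac{4\pi}{q^2}\int d^3r'\,\rho(\vec r')\,e^{-i\vec q\cdot\vec r'}, \tag{53.33}$$
where the target's electron density is
$$\rho = \sum_{i=1}^{N}\delta^3(\vec r - \vec r_i). \tag{53.34}$$
We define the form factors (in momentum space)
$$F_{n,n_0}(\vec q) = \frac{1}{Z}\,\langle n|\sum_{i=1}^{N}e^{-i\vec q\cdot\vec r_i}|n_0\rangle. \tag{53.35}$$
Then the integral in the amplitude is
$$\int d^3r\;e^{-i\vec q\cdot\vec r}\,\langle n|\left[-\frac{Ze_0^2}{r} + \sum_{i=1}^{N}\frac{e_0^2}{|\vec r - \vec r_i|}\right]|n_0\rangle = Ze_0^2\,\frac{4\pi}{q^2}\left[-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right], \tag{53.36}$$
leading to the differential cross section
$$\frac{d\sigma}{d\Omega} = \frac{k'}{k}\,\frac{4m^2}{\hbar^4}\,\frac{Z^2e_0^4}{q^4}\left|-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right|^2 = \frac{k'}{k}\,\frac{4Z^2}{a_0^2q^4}\left|-\delta_{n,n_0} + F_{n,n_0}(\vec q)\right|^2. \tag{53.37}$$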
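To make the form factor (53.35) concrete, consider elastic scattering ($n = n_0$, $k' = k$) off hydrogen in its ground state, an example that is not worked out in the text. The sketch assumes $Z = N = 1$, the standard 1s density $\rho(r) = e^{-2r/a_0}/(\pi a_0^3)$, lengths measured in units of $a_0$, and hypothetical function names. The angular integral gives $\int d\Omega\,e^{-i\vec q\cdot\vec r} = 4\pi\sin(qr)/(qr)$, leaving a radial integral that is compared with the known closed form $F(q) = [1 + (qa_0/2)^2]^{-2}$.

```python
import math

A0 = 1.0  # assumption: lengths in units of the Bohr radius a0

def form_factor_1s_numeric(q, r_max=40.0, steps=8000):
    """F(q) = int d^3r rho(r) e^{-i q.r} for the hydrogen 1s density (Z = 1, q != 0).

    After the angular integral this is int_0^inf dr 4 r^2 e^{-2r/a0}/a0^3 * sin(qr)/(qr),
    evaluated with Simpson's rule on [0, r_max]; the integrand vanishes at r = 0.
    """
    h = r_max / steps
    total = 0.0
    for i in range(1, steps + 1):
        r = i * h
        weight = 1 if i == steps else (4 if i % 2 else 2)
        radial = 4.0 * r * r * math.exp(-2.0 * r / A0) / A0**3
        total += weight * radial * math.sin(q * r) / (q * r)
    return total * h / 3.0

def form_factor_1s_exact(q):
    """Closed form of the 1s form factor: 1 / (1 + (q a0 / 2)^2)^2."""
    return 1.0 / (1.0 + (q * A0 / 2.0) ** 2) ** 2

def elastic_cross_section(q):
    """(53.37) with n = n0, k' = k, Z = 1: 4 / (a0^2 q^4) * |-1 + F(q)|^2."""
    return 4.0 / (A0**2 * q**4) * (form_factor_1s_exact(q) - 1.0) ** 2
```

Unlike the bare Coulomb result, this cross section stays finite as $q \to 0$ (it tends to $a_0^2$), because the electron cloud screens the nuclear charge at large distances.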
53.3 General Theory of Collisions: Inelastic Case and Multi-Channel Scattering
After considering a simplified treatment of inelastic scattering, from the point of view of a fundamental projectile, we consider the general theory of collisions. In this general case, both projectile and target can have structure, as for instance in atom-on-atom collision, or nucleus-on-nucleus collision, as in the RHIC (Relativistic Heavy Ion Collider) and LHC (Large Hadron Collider) experiments.

In the most general case, then, consider $N$ fundamental particles (particles that cannot be separated into sub-parts, at least not at the available energies), like for instance electrons and nucleons. They can form different fragments, as for instance nuclei or atoms, in different ways, if there are bound states (if there is a nucleus, or an atom, in the given conditions). Examples include nuclear reactions through scattering, in which we collide two nuclei, but after the scattering interaction other nuclei appear.

A division of all the $N$ fundamental particles into fragments is called a channel. We note that changing the state or the quantum numbers (as for instance $|n_0\rangle$ to $|n\rangle$ for an atom) of a fragment remains within the channel (though we can formally consider it as being in a different channel, if we want); the term channel is reserved for different combinations into fragments. For scattering, we will have both an “in” channel (before the collision) and an “out” channel (after the collision). Also, if conservation of energy, momentum, angular momentum, etc., prevents the appearance in the final state of a fragment related to a given channel, then we say that the “channel is closed”.

The energy $E_i$ of fragment $i$ is given by the kinetic energy of the center of mass motion of the fragment, $\epsilon_i$, subtracting from it the binding energy $w_i$,
$$E_i = \epsilon_i - w_i. \tag{53.38}$$
The simplest nontrivial example of channels corresponds to a system of $N = 3$ fundamental particles, called, say, $a$, $b$, $c$. Then the three two-particle bound states correspond to fragments $(ab)$, $(bc)$, and $(ac)$. To admit the bound states of the fragments, we need to have the binding energies $w_{ab}, w_{bc}, w_{ac} > 0$. Then there are four channels, with the total energy divided among the various fragments,
$$\begin{aligned}
&{\rm I},\ a + (bc): & E &= \epsilon_a + \epsilon_{bc} - w_{bc}\\
&{\rm II},\ b + (ac): & E &= \epsilon_b + \epsilon_{ac} - w_{ac}\\
&{\rm III},\ c + (ab): & E &= \epsilon_c + \epsilon_{ab} - w_{ab}\\
&{\rm IV},\ a + b + c: & E &= \epsilon_a + \epsilon_b + \epsilon_c.
\end{aligned} \tag{53.39}$$
Putting the fragments in the order of their binding energies, $w_{bc} > w_{ac} > w_{ab} > 0$, and having the middle channel, II, as the in channel, we will analyze the case of the out channel that is the least bound, III, i.e., $c + (ab)$. Energy conservation for the collision, between the in and the out channels, gives
$$\epsilon_b + \epsilon_{ac} - w_{ac} = \epsilon'_c + \epsilon'_{ab} - w_{ab}. \tag{53.40}$$
Then we obtain
$$\epsilon_b + \epsilon_{ac} = w_{ac} - w_{ab} + \epsilon'_c + \epsilon'_{ab} > w_{ac} - w_{ab} > 0. \tag{53.41}$$
This is the condition for the $c + (ab)$ channel to be open. If the total incoming kinetic energy, $\epsilon_b + \epsilon_{ac}$, is less than $w_{ac} - w_{ab}$, we have a closed channel.
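The threshold condition can be phrased as a small bookkeeping function. This is a minimal sketch with hypothetical names and arbitrary energy units: it solves the energy conservation relation (53.40) for the total outgoing kinetic energy and declares the out channel open when that energy is positive, which is the content of (53.41).

```python
def channel_is_open(kinetic_in, binding_in, binding_out):
    """Return (is_open, kinetic_out) for a two-fragment in channel and out channel.

    Energy conservation: kinetic_in - binding_in = kinetic_out - binding_out,
    so the out channel is open when kinetic_in > binding_in - binding_out.
    """
    kinetic_out = kinetic_in - binding_in + binding_out
    return kinetic_out > 0.0, kinetic_out

# Illustrative numbers (assumptions): w_ac = 3.0 for channel II, w_ab = 2.0 for channel III
below_threshold = channel_is_open(0.5, 3.0, 2.0)
above_threshold = channel_is_open(1.5, 3.0, 2.0)
```

With these numbers the threshold is $w_{ac} - w_{ab} = 1$, so the first call describes a closed channel and the second an open one.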
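53.4 General Theory of Collisions
Before considering channels further, we describe a general, abstract-state, model of scattering. Suppose that we have orthonormal states $|E\alpha\rangle$,
$$\langle E\alpha|E'\alpha'\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}, \tag{53.42}$$
of eigenvalue energy $E$ of some unperturbed Hamiltonian $\hat H_0$,
$$\hat H_0|E\alpha\rangle = E|E\alpha\rangle. \tag{53.43}$$
The scattering states are
$$|E\alpha\pm\rangle = |E\alpha\rangle + \hat G^{\pm}(E)\hat V|E\alpha\rangle, \tag{53.44}$$
where $\hat G^{\pm}(E)$ are the full Green's functions. The states have the same norm as the free states,
$$\langle E\alpha\pm|E'\alpha'\pm\rangle = \langle E\alpha|E'\alpha'\rangle, \tag{53.45}$$
and they obey the Lippmann–Schwinger equation
$$|E\alpha\pm\rangle = |E\alpha\rangle + \hat G_0^{\pm}(E)\hat V|E\alpha\pm\rangle. \tag{53.46}$$
The full Green's function also satisfies the Lippmann–Schwinger equation:
$$\hat G^{\pm} = \hat G_0^{\pm} + \hat G_0^{\pm}\hat V\hat G^{\pm}. \tag{53.47}$$
The S-matrix is defined by
$$\langle E\alpha-|E'\alpha'+\rangle = \langle E\alpha|\hat S|E'\alpha'\rangle = \delta(E - E')\,\delta_{\alpha\alpha'} - 2\pi i\,\delta(E - E')\,T_{\alpha\alpha'}(E), \tag{53.48}$$
where the T-matrix is
$$T_{\alpha\alpha'}(E) = \langle E\alpha-|\hat V|E'\alpha'\rangle_{E=E'} = \langle E\alpha|\hat V|E'\alpha'+\rangle_{E=E'}. \tag{53.49}$$
In a spherical basis, this S-matrix reduces to the partial wave S-matrix,
$$S_{Elm,E'l'm'} = \tilde S_l(E)\,\delta(E - E')\,\delta_{ll'}\,\delta_{mm'}, \tag{53.50}$$
leading to
$$S(\vec k_1, \vec k_2) = e^{2i\delta_l}\,\frac{\pi}{2k_1^2}\,\delta(k_1 - k_2)\,\delta_{l_1l_2}\,\delta_{m_1m_2}. \tag{53.51}$$
Define the operators $\Omega^{(\pm)}$ that create the scattering states from the free states,
$$\Omega^{(\pm)}|E\alpha\rangle = |E\alpha\pm\rangle, \tag{53.52}$$
specifically
$$\Omega^{(\pm)} = \lim_{t\to\mp\infty}U_I(0, t), \tag{53.53}$$
and giving the S-operator as
$$\hat S = \Omega^{(-)\dagger}\,\Omega^{(+)}. \tag{53.54}$$
If we specialize to a single channel, yet with target structure (electron–atom scattering), the initial (free) state is
$$|E\alpha\rangle = |E_A, \vec n_A\rangle \otimes |\epsilon_{B_0}, \beta_0\rangle, \tag{53.55}$$
and the final state is
$$|E_a, \vec n'_A\rangle \otimes |\epsilon_B, \beta\rangle. \tag{53.56}$$
Conservation of energy leads to
$$E = E_A + \epsilon_{B_0} = E_a + \epsilon_B. \tag{53.57}$$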
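The Lippmann–Schwinger equation is
$$|E\alpha+\rangle = |E; \vec n_A, \beta_0+\rangle = |E_A, \vec n_A\rangle \otimes |\epsilon_{B_0}, \beta_0\rangle + \hat G_0^{+}(E)\hat V|E; \vec n_A, \beta_0+\rangle. \tag{53.58}$$
The corresponding scattering wave function for projectile position $\vec r$ and center of mass target position $\vec R$, as $r \to \infty$, is
$$\left(\langle\vec r| \otimes \langle\vec R|\right)|E\alpha+\rangle = v^+(\vec r, \vec R) \sim e^{i\vec k\cdot\vec r}\,u_{n_0}(\vec R) + \sum_n\frac{e^{ik'r}}{r}\,f_{nn_0}(\vec k', \vec k)\,u_n(\vec R). \tag{53.59}$$
We expand the scattering solution in terms of the target states,
$$v^+(\vec r, \vec R) = \sum_n F_n^+(\vec r)\,u_n(\vec R). \tag{53.60}$$
Then at infinity for the projectile, $r \to \infty$, we find
$$F_n^+(\vec r) \sim e^{i\vec k\cdot\vec r}\,\delta_{nn_0} + \frac{e^{ik'r}}{r}\,f_{nn_0}(\vec k', \vec k) \tag{53.61}$$
for an open channel, and
$$F_n^+(\vec r) \sim 0 \tag{53.62}$$
for a closed channel. Then the differential cross section becomes
$$\frac{d\sigma_{B_0,\beta_0\to B\beta}}{d\Omega_a} = \frac{p_a}{p_A}\,|f|^2, \tag{53.63}$$
which splits between the elastic ($n = n_0$) and inelastic ($n \neq n_0$) cases,
$$\frac{d\sigma^{\rm elastic}_{n_0}}{d\Omega} = |f_{n_0n_0}|^2, \qquad \frac{d\sigma^{\rm inelastic}_{n}}{d\Omega} = \frac{k'}{k}\,|f_{nn_0}|^2. \tag{53.64}$$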
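53.5 Multi-Channel Analysis
We now generalize to the multi-channel case. As we said, the presence of different channels means dividing the fundamental particles and their interaction into fragments. Denoting a channel by $\Gamma$, we have
$$\hat H = \hat H_0^\Gamma + \hat V^\Gamma. \tag{53.65}$$
The free states $|E\alpha\Gamma\rangle$ correspond to eigenvalues of the free Hamiltonian of the channel,
$$\hat H_0^\Gamma|E\alpha; \Gamma\rangle = E|E\alpha; \Gamma\rangle. \tag{53.66}$$
These states are orthonormal within the channel,
$$\langle E\alpha; \Gamma|E'\alpha'; \Gamma\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}, \tag{53.67}$$
but there is a nonzero overlap between channels,
$$\langle E\alpha; \Gamma|E'\alpha'; \Gamma'\rangle \neq 0 \tag{53.68}$$
for $\Gamma \neq \Gamma'$. The scattering states are defined as
$$|E\alpha; \Gamma\pm\rangle \equiv |E\alpha; \Gamma\rangle + \hat G(E)\hat V^\Gamma|E\alpha; \Gamma\rangle \tag{53.69}$$
and satisfy the Lippmann–Schwinger equation,
$$|E\alpha; \Gamma\pm\rangle = |E\alpha; \Gamma\rangle + \hat G_0(E)\hat V^\Gamma|E\alpha; \Gamma\pm\rangle. \tag{53.70}$$
The orthogonality conditions of the states are as follows:
(1) The scattering states in different channels are orthogonal,
$$\langle E\alpha\Gamma\pm|E'\alpha'\Gamma'\pm\rangle = 0 \tag{53.71}$$
for $\Gamma \neq \Gamma'$.
(2) The free states within a single channel are not complete,
$$\sum_\alpha\int_0^\infty dE\,|E\alpha\Gamma\rangle\langle E\alpha\Gamma| \neq 1, \tag{53.72}$$
while the scattering states of a single channel give just the projector to the scattering states of the channel,
$$\sum_\alpha\int_0^\infty dE\,|E\alpha\Gamma\pm\rangle\langle E\alpha\Gamma\pm| \equiv P^{\Gamma\pm}. \tag{53.73}$$
(3) The sum of the projectors onto different channels, added to the projector $\Lambda_b$ onto bound states, is complete,
$$\sum_\Gamma P^{\Gamma\pm} + \Lambda_b = 1. \tag{53.74}$$
The operators turning free states into scattering states are defined for each channel, by projecting onto the (free) states of the channel with $\Lambda^\Gamma$:
$$\Omega^{(\pm)\Gamma} = \hat U^\Gamma(0, \mp\infty)\,\Lambda^\Gamma. \tag{53.75}$$
Within the channel these operators are unitary,
$$[\Omega_\Gamma^{(\pm)}]^\dagger\,\Omega_\Gamma^{(\pm)} = \Lambda_\Gamma. \tag{53.76}$$
The S-operator is defined for arbitrary in and out channels,
$$S_{\Gamma'\Gamma} = \Omega_{\Gamma'}^{(-)\dagger}\,\Omega_\Gamma^{(+)}. \tag{53.77}$$
The S-matrix is
$$\langle E'\alpha'\Gamma'|S_{\Gamma'\Gamma}|E\alpha\Gamma\rangle = \langle E'\alpha'\Gamma'-|E\alpha\Gamma+\rangle = \delta(E - E')\,\delta_{\alpha\alpha'}\,\delta_{\Gamma\Gamma'} - 2\pi i\,\delta(E - E')\,T_{\alpha'\Gamma',\alpha\Gamma}(E), \tag{53.78}$$
where the T-matrix is given by
$$T_{\alpha'\Gamma',\alpha\Gamma}(E) = \langle E\alpha'\Gamma'|\hat V_{\Gamma'}|E\alpha\Gamma+\rangle = \langle E\alpha'\Gamma'-|\hat V_\Gamma|E\alpha\Gamma\rangle. \tag{53.79}$$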
If both the in and the out channels have two fragments, a small one, $A$ (the projectile), transformed into $A'$, and a large one, $B$ (the target), transformed into $B'$, the inelastic differential cross section has an extra factor coming from the mass of the projectile,
$$\frac{d\sigma_{\rm inelastic}}{d\Omega} = \frac{k'\,m_A}{k\,m_{A'}}\,\left|f_{\alpha',\Gamma';\alpha,\Gamma}\right|^2. \tag{53.80}$$
53.6 Scattering of Identical Particles
The relevant case of electron scattering off an atom, which itself has $N$ electrons, reveals a new feature: identical particles scatter (the scattering electron versus the atom's electrons). But identical particles need to be symmetrized, as we have seen (their wave functions are either totally symmetric (S), or totally antisymmetric (A)). We assume that the target (the atom, containing electrons) is already symmetrized,
$$\hat H_0[\hat P|E\alpha\rangle] = E\,\hat P|E\alpha\rangle, \tag{53.81}$$
where $\hat P$ is the symmetrization operator. But then we add to the system the projectile A, which is also an electron, so we need to consider states that are symmetric or antisymmetric in all the $N + 1$ electrons. That means we need to symmetrize $j = 1$, corresponding to the projectile, with $j = 2, \ldots, N + 1$, for the electrons of the atom. The symmetrized system is obtained by the action of the operator $\Lambda$, symmetrizing $j = 1$ with all the other $j$'s,
$$\Lambda|E\alpha\rangle = \frac{1}{N+1}\left[|E\alpha\rangle \pm \sum_{j=2}^{N+1}P_{1j}|E\alpha\rangle\right]. \tag{53.82}$$
The correctly normalized symmetric/antisymmetric states are then
$$|E\alpha\rangle_{S/A} = \sqrt{N+1}\,\Lambda|E\alpha\rangle = \frac{1}{\sqrt{N+1}}\left[|E\alpha\pm\rangle \pm \sum_{j=2}^{N+1}P_{1j}|E\alpha\pm\rangle\right]. \tag{53.83}$$
The S-matrix is
$$S_{E'\alpha',E\alpha} = \langle E'\alpha'-|E\alpha+\rangle \pm N\langle E'\alpha'-|P_{12}|E\alpha+\rangle = \delta(E - E')\,\delta_{\alpha\alpha'} - 2\pi i\,\delta(E - E')\,T_{\alpha'\alpha}(E), \tag{53.84}$$
where the T-matrix is given by
$$T_{\alpha'\alpha}(E) = \langle E'\alpha'-|\hat V|E\alpha\rangle \pm N\langle E'\alpha'-|P_{12}\hat V|E\alpha\rangle = \langle E'\alpha'|\hat V|E\alpha+\rangle \pm N\langle E'\alpha'|P_{12}\hat V|E\alpha+\rangle. \tag{53.85}$$
It splits into a direct term (index d) and an exchange term (index exch),
$$T_{\alpha'\alpha}(E) = T^{\rm d}_{\alpha'\alpha}(E) \pm N\,T^{\rm exch}_{\alpha'\alpha}(E). \tag{53.86}$$
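Applying this to the simplest system, electron scattering off a hydrogen atom ($N = 1$), we replace
\[ T^{d}_{\alpha'\alpha}(E) \to t^{d}_{\vec p\,' n' l' m_l';\, \vec p, n, l, m_l}\, \delta_{m_1 m_1'}\delta_{m_2 m_2'}, \qquad T^{\rm exch}_{\alpha'\alpha}(E) \to t^{\rm exch}_{\vec p\,' n' l' m_l';\, \vec p, n, l, m_l}\, \delta_{m_1 m_2'}\delta_{m_2 m_1'}. \tag{53.87} \]
Then the differential cross section is
\[ \frac{d\sigma_{m_1' m_2'; m_1, m_2}}{d\Omega} = (\ldots)\, \left| t^{d}\, \delta_{m_1 m_1'}\delta_{m_2 m_2'} \pm t^{\rm exch}\, \delta_{m_1 m_2'}\delta_{m_2 m_1'} \right|^2 . \tag{53.88} \]
Considering $m_i = \pm$ (the states of the spin projection), we obtain
\[ \frac{d\sigma_{++,++}}{d\Omega} = \frac{d\sigma_{--,--}}{d\Omega} = |t^{d} - t^{\rm exch}|^2, \qquad \frac{d\sigma_{+-,+-}}{d\Omega} = \frac{d\sigma_{-+,-+}}{d\Omega} = |t^{d}|^2, \qquad \frac{d\sigma_{+-,-+}}{d\Omega} = \frac{d\sigma_{-+,+-}}{d\Omega} = |t^{\rm exch}|^2, \tag{53.89} \]
and the rest of the scattering cross sections vanish. That means that the unpolarized differential cross section is
\[ \frac{d\sigma^{\rm unpolarized}}{d\Omega} = \frac{3}{4}|t^{d} - t^{\rm exch}|^2 + \frac{1}{4}|t^{d} + t^{\rm exch}|^2 . \tag{53.90} \]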
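The step from the spin-resolved cross sections to the unpolarized one is an average over the four initial spin states and a sum over the final ones. The following Python sketch checks numerically that this average reproduces the 3/4 and 1/4 weights. The function names are illustrative, the common prefactor $(\ldots)$ is dropped, and the amplitudes are arbitrary complex numbers rather than computed t-matrix elements.

```python
import numpy as np

def unpolarized_from_channels(t_d, t_exch):
    """Average over the 4 initial spin states, sum over final states."""
    same_spin = abs(t_d - t_exch) ** 2   # ++ -> ++ and -- -> --
    direct = abs(t_d) ** 2               # +- -> +- and -+ -> -+
    exchange = abs(t_exch) ** 2          # +- -> -+ and -+ -> +-
    return (2 * same_spin + 2 * direct + 2 * exchange) / 4

def unpolarized_weighted(t_d, t_exch):
    """The combination 3/4 |t_d - t_exch|^2 + 1/4 |t_d + t_exch|^2."""
    return 0.75 * abs(t_d - t_exch) ** 2 + 0.25 * abs(t_d + t_exch) ** 2

# Arbitrary complex test amplitudes (not physical values).
rng = np.random.default_rng(0)
max_diff = 0.0
for a, b, c, d in rng.normal(size=(5, 4)):
    t_d, t_exch = complex(a, b), complex(c, d)
    diff = abs(unpolarized_from_channels(t_d, t_exch) - unpolarized_weighted(t_d, t_exch))
    max_diff = max(max_diff, diff)

print("largest mismatch:", max_diff)
```

Both expressions reduce to $|t^d|^2 + |t^{\rm exch}|^2 - {\rm Re}(t^d t^{{\rm exch}*})$, which is why the mismatch is at the level of rounding error.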
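To calculate the transition amplitude, we use the Born approximation for $|E\alpha\pm\rangle \to |E\alpha\rangle$, and $\hat V = \hat V_1 + \hat V_{12}$, giving
\[ T_{\alpha'\alpha}(E) = \langle E'\alpha'|(\hat V_1 + \hat V_{12})|E\alpha\rangle \pm N\langle E'\alpha'|P_{12}(\hat V_1 + \hat V_{12})|E\alpha\rangle. \tag{53.91} \]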
Important Concepts to Remember

• Inelastic scattering means scattering when energy, momentum, and/or particles can leak into a “sink”, namely the case in which one or both of the particles (projectile and target) have internal structure and can become excited.
• In inelastic scattering $\hat S$ is not unitary, so $S_l$ is not a phase, i.e., $\delta_l$ is not real: $|S_l| < 1$, or ${\rm Im}\,\delta_l > 0$.
• For a “black disk” eikonal, ${\rm Im}\,\Delta(b) = +\infty$ for $b < r_0$ and $\Delta = 0$ for $b > r_0$, so the total inelastic cross section is $\sigma_{\rm tot} = \sigma_{\rm geometrical} = \pi r_0^2$, while the total cross section is $2\pi r_0^2$.
• In inelastic scattering with a target structure (so that the target can absorb energy and momentum),
\[ \frac{d\sigma}{d\Omega}(|n_0\rangle \to |n\rangle) = \frac{k'}{k}|f|^2 . \]
For an atom with $N$ electrons,
\[ \frac{d\sigma}{d\Omega} = \frac{k'}{k}\frac{4Z^2}{a_0^2 q^4}\left| -\delta_{n,n_0} + F_{n,n_0}(\vec q) \right|^2 , \]
with $F_{n,n_0}(\vec q)$ the form factors.
• In general, both projectile and target can have structure, and that structure can change during the collision (as in nuclear reactions occurring through collision), so we have different divisions, named channels, of the total number of fundamental particles into fragments.
• In the case of several channels $\Gamma$ (different ones for initial and final states), we have
\[ \frac{d\sigma}{d\Omega} = \frac{k'\, m_A}{k\, m_{A'}}\, |f_{\alpha',\Gamma';\alpha,\Gamma}|^2 . \]
• When scattering involves identical fundamental particles, organized into fragments, as for instance an electron scattering off an atom with electrons, the total state needs to be (anti)symmetrized, leading to a direct term and an exchange term in the amplitude, so $d\sigma/d\Omega \sim |t^{d}\,\delta\cdots + t^{\rm exch}\,\delta\cdots|^2$.
Further Reading See [2] and [1].
Exercises

(1) Consider the mapping of the black disk eikonal into the general, $\delta_l(k)$, representation. Calculate the total inelastic cross section in this representation.
(2) Calculate the differential cross section $d\sigma(\theta)/d\Omega$ for the black disk eikonal, integrate it, and compare with the total inelastic cross section.
(3) Consider the scattering of a projectile $A$ (nonidentical to the target components) off a hydrogenoid atom in the Born approximation, involving a jump of the electron from the ground state to the first excited state. Calculate the differential cross section.
(4) Consider a system of $N = 4$ fundamental particles. Write down the channels for the scattering of the various fragments. Taking the in channel as one where each of the two fragments has two particles, find the condition for the out channel to be one in which the fragments have respectively one and three particles.
(5) Describe the general theory of scattering (as in the text) for the case when the target is a single fundamental particle (and the projectile is composite), so both in and out channels have a single-particle target.
(6) How do you write the inelastic differential cross section for the case where the in channel has two parallel projectiles, and the out channel has a single outgoing projectile (fragment) of general composition (neither of the initial projectiles, nor their sum)?
(7) Specialize the scattering of the identical particle formalism for the case of an electron scattering off a helium atom to describe the differential cross sections of various spin projections in terms of direct and exchange T-matrices.
PART IIe
MANY PARTICLES
The Dirac Equation
54
In this chapter we consider relativistic corrections to the electron (fermion) theory, in the form of the Dirac equation. The correct treatment of the Dirac equation is in quantum field theory, but here we will deal with only the first relativistic corrections, in which case we can ignore all quantum field corrections. In some treatments in the literature, one talks about building a relativistic quantum mechanics. However, there is no relativistic quantum mechanics; joining relativity with quantum mechanics leads to quantum field theory, as we will show.
54.1 Naive Treatment

In basic nonrelativistic quantum mechanics we have the Schrödinger equation, which in the free case is
\[ -i\hbar\,\partial_t\psi - \frac{\hbar^2}{2m}\Delta\psi = 0, \tag{54.1} \]
and is invariant under the Galilei transformations,
\[ \vec x\,' = \vec x - \vec v t. \tag{54.2} \]
In the momentum representation, the free Schrödinger equation is
\[ i\hbar\,\partial_t\psi = H\psi = \frac{\vec p^{\,2}}{2m}\psi, \tag{54.3} \]
which is the nonrelativistic energy relation. We would like to build a replacement for the Schrödinger equation that is invariant under the Lorentz transformation. We will replace the energy relation with a relativistic one. Acting twice with the same operator on the wave function we obtain
\[ -\hbar^2\partial_t^2\psi = H^2\psi, \tag{54.4} \]
where
\[ E^2 \to H^2 = \vec p^{\,2}c^2 + m^2c^4. \tag{54.5} \]
Then we have a replacement of the Schrödinger equation defined by
\[ -\hbar^2\partial_t^2\psi = (\vec p^{\,2}c^2 + m^2c^4)\psi \;\Rightarrow\; [-\hbar^2\partial_t^2 + \hbar^2c^2\vec\nabla^2 - m^2c^4]\psi = 0 \;\Rightarrow\; \left[\vec\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \left(\frac{mc}{\hbar}\right)^2\right]\psi = 0. \tag{54.6} \]
This is the Klein–Gordon equation, but it is not what we wanted. The equation acts on a wave function $\psi$ that is in a scalar representation of the Lorentz group. In fact, we need to find an equation acting on the spinor representation (spin 1/2, for the electron) of the Lorentz group. Moreover, we need to have a first-order equation in $\partial_t$, like the usual Schrödinger equation, and correspondingly (because of Lorentz invariance) also in $\vec p = \frac{\hbar}{i}\vec\nabla$.

Putting together these two requirements, we find that we need to take the “square root” of the Klein–Gordon equation, in terms of matrices (the coefficients are matrices). Then we have
\[ c^2\vec p^{\,2} + (mc^2)^2 = (c\,\vec\alpha\cdot\vec p + \beta mc^2)^2 \equiv H^2, \tag{54.7} \]
meaning that the Hamiltonian is
\[ H = c\,\vec\alpha\cdot\vec p + \beta mc^2, \tag{54.8} \]
with matrix coefficients $\vec\alpha$ and $\beta$, satisfying (from the defining equation, by equating terms with $p_ip_j$, or $p_i$, or no $p_i$ at all, on both sides of it),
\[ (\alpha_i)^2 = \beta^2 = 1, \qquad \alpha_i\alpha_j + \alpha_j\alpha_i = \{\alpha_i,\alpha_j\} = 0 \;\; (i \neq j), \qquad \alpha_i\beta + \beta\alpha_i = \{\alpha_i,\beta\} = 0. \tag{54.9} \]
Moreover, since the Hamiltonian is Hermitian, the coefficient matrices are also Hermitian, so $\alpha_i^\dagger = \alpha_i$ and $\beta^\dagger = \beta$. They are also traceless and have eigenvalues $\pm 1$ (since $\alpha_i^2 = \beta^2 = 1$).

The matrices must be $4\times 4$, which we can prove as follows. The dimension must be even, since we know that the functions on which we act with them must eventually depend on spin, which takes two values ($\pm 1/2$). And for $2\times 2$ matrices, the complete set of matrices is $(1, \vec\sigma)$, so the full set of independent matrices that anticommute comprises the three Pauli matrices $\sigma_i$; yet we need four, for the $\alpha_i$ and $\beta$. That leaves the next even dimension up, namely $4\times 4$ matrices. The matrices are also not unique; after a unitary transformation, with matrix $S$ ($S^\dagger = S^{-1}$), we have another solution for the matrices:
\[ \vec\alpha \to S^{-1}\vec\alpha\, S, \qquad \beta \to S^{-1}\beta S. \tag{54.10} \]
The Dirac equation, replacing the Schrödinger equation, is then
\[ i\hbar\frac{\partial}{\partial t}\psi = (c\,\vec\alpha\cdot\vec p + \beta mc^2)\psi = \left(c\,\frac{\hbar}{i}\vec\alpha\cdot\vec\nabla + \beta mc^2\right)\psi. \tag{54.11} \]
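The matrix “square root” can be checked numerically for one concrete $4\times 4$ choice. The Python sketch below assumes the standard representation, with $\alpha_i$ built from the Pauli matrices in the off-diagonal blocks and $\beta = {\rm diag}(1, 1, -1, -1)$; the numerical values of $\vec p$, $m$ and $c$ are arbitrary test inputs, and the helper names are illustrative.

```python
import numpy as np

# Pauli matrices
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
one2 = np.eye(2, dtype=complex)
one4 = np.eye(4)

# Assumed concrete choice of the 4x4 matrices (standard representation).
alpha = [np.block([[zero2, s], [s, zero2]]) for s in sigma]
beta = np.block([[one2, zero2], [zero2, -one2]])

def anticommutator(a, b):
    return a @ b + b @ a

def dirac_hamiltonian(p, m, c):
    """H = c alpha.p + beta m c^2 for a numerical momentum 3-vector p."""
    return c * sum(p_i * a_i for p_i, a_i in zip(p, alpha)) + beta * m * c**2

# Check the algebra: {alpha_i, alpha_j} = 2 delta_ij, {alpha_i, beta} = 0, beta^2 = 1.
alpha_ok = all(np.allclose(anticommutator(alpha[i], alpha[j]), 2 * (i == j) * one4)
               for i in range(3) for j in range(3))
beta_ok = all(np.allclose(anticommutator(a, beta), 0) for a in alpha)
algebra_ok = alpha_ok and beta_ok and np.allclose(beta @ beta, one4)

# Check that H^2 is the relativistic energy squared times the identity.
p, m, c = np.array([0.3, -1.2, 0.5]), 0.7, 2.0
H = dirac_hamiltonian(p, m, c)
energy = np.sqrt(c**2 * (p @ p) + (m * c**2) ** 2)
square_ok = np.allclose(H @ H, energy**2 * one4)
eigenvalues = np.linalg.eigvalsh(H)  # H is Hermitian

print(algebra_ok, square_ok, eigenvalues)
```

The eigenvalues come out as $\pm E$, each doubly degenerate, which is the numerical counterpart of having both signs of the energy and two spin states for each.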
One useful choice for the matrices $\vec\alpha$, $\beta$ (reachable from another by a unitary transformation) is
\[ \vec\alpha = \begin{pmatrix} 0 & \vec\sigma \\ \vec\sigma & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{54.12} \]
Since the $\vec\alpha$, $\beta$ coefficients are $4\times 4$ matrices, they act on four-dimensional column vectors $\psi$, not just on a single wave function. However, we have already encountered a two-component vector as the spin 1/2 wave function ($m_s = +1/2$ and $m_s = -1/2$ components). Here we have two two-component vectors, which actually correspond to both the electron $e^-$ and its antiparticle $e^+$ (though the separation of the two is related to some specific choice for $\vec\alpha$, $\beta$).

The correct treatment for producing the Dirac equation and its solutions is in quantum field theory. In the literature, there are statements about the existence of a relativistic quantum mechanics, but that is misleading; the correct relativistic treatment, joining relativity with quantum mechanics, is
quantum field theory. The fact that a strictly relativistic quantum mechanics does not exist can be understood from three points of view:

(1) For any particle, there is an antiparticle. For spin 1/2 particles, the two are different (for real fields, such as real scalars, the particles are their own antiparticles). Thus if the energy is $E > m_pc^2 + m_{\bar p}c^2$, we can create a particle and antiparticle pair, which means that the number of particles is not conserved in quantum field theory, which describes real situations. But in usual quantum mechanics, particles follow quantum paths that never end or begin, so the number of particles is conserved.
(2) Even if $E < m_pc^2 + m_{\bar p}c^2$, we can create a particle–antiparticle pair for a short time. Indeed, the Heisenberg uncertainty principle regarding $E$ and $t$ implies that $\Delta E\cdot\Delta t \sim \hbar$, so for low enough $\Delta t$, we can have $E + \Delta E > m_pc^2 + m_{\bar p}c^2$. Therefore we can always create a virtual pair of particles. There are no asymptotic particles (at large space and time separation) owing to energy and momentum conservation, but we have quantum fluctuations. Thus, even for low energies, we can always have quantum paths of particles that begin and end, violating the conservation of particle number.
(3) In the usual quantum mechanics, we also violate causality, even if we have Lorentz invariance (a relativistic theory), with $E = \sqrt{\vec p^{\,2}c^2 + m^2c^4}$. We will show that under a time evolution that violates causality we still have a nontrivial result, contrary to what we should have in a good theory.

The propagator matrix element between $\vec x_0$ at time zero and $\vec x$ at time $t$ is
\[ U(t) = \langle\vec x|e^{-i\hat Ht/\hbar}|\vec x_0\rangle = \int d^3p \int d^3p'\, \langle\vec x|\vec p\rangle\langle\vec p|e^{-i\hat Ht/\hbar}|\vec p\,'\rangle\langle\vec p\,'|\vec x_0\rangle = \frac{1}{(2\pi\hbar)^3}\int d^3p\, \exp\left(-\frac{i}{\hbar}t\sqrt{\vec p^{\,2}c^2 + m^2c^4}\right) e^{i\vec p\cdot(\vec x - \vec x_0)/\hbar}. \tag{54.13} \]
But
\[ \int d^3p\, e^{i\vec p\cdot\vec x/\hbar} = \int_0^\infty p^2dp\, 2\pi\int_{-1}^{+1} d(\cos\theta)\, e^{ipx\cos\theta/\hbar} = \int_0^\infty p^2dp\, \frac{2\pi\hbar}{ipx}\left(e^{ipx/\hbar} - e^{-ipx/\hbar}\right) = \int_0^\infty p^2dp\, \frac{4\pi\hbar}{px}\sin\left(\frac{px}{\hbar}\right). \tag{54.14} \]
Then the propagator matrix element is
\[ U(t) = \frac{1}{2\pi^2\hbar^2|\vec x - \vec x_0|}\int_0^\infty p\,dp\, \sin\left(\frac{p|\vec x - \vec x_0|}{\hbar}\right)\exp\left(-\frac{i}{\hbar}t\sqrt{p^2c^2 + m^2c^4}\right). \tag{54.15} \]
To approximate it, we use the saddle point approximation of an integral, by Taylor expanding around the extremum in an exponent,
\[ I = \int dx\, e^{f(x)} \simeq e^{f(x_0)}\int d\delta x\, e^{f''(x_0)\delta x^2/2} = e^{f(x_0)}\sqrt{\frac{2\pi}{-f''(x_0)}}. \tag{54.16} \]
If we consider a separation much outside the causal cone, $x^2 \gg c^2t^2$, the saddle point (extremum) condition implies
\[ \frac{1}{\hbar}\frac{d}{dp}\left(ipx - it\sqrt{p^2c^2 + m^2c^4}\right) = 0 \;\Rightarrow\; x = \frac{t\,pc^2}{\sqrt{p^2c^2 + m^2c^4}} \;\Rightarrow\; p = p_0 = \frac{imcx}{\sqrt{x^2 - c^2t^2}}. \tag{54.17} \]
Thus the saddle point evaluation of the propagator matrix element is
\[ U(t) \propto \exp\left[\frac{i}{\hbar}\left(p_0x - t\sqrt{p_0^2c^2 + m^2c^4}\right)\right] \sim \exp\left(-\frac{mc}{\hbar}\sqrt{x^2 - c^2t^2}\right) \neq 0. \tag{54.18} \]
This means that there is a breakdown of causality even well outside the causal cone, albeit with an exponentially small value. In quantum field theory causality is recovered, but we will not explain how, since it requires information beyond the scope of this book.
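The saddle point algebra is easy to check numerically with complex arithmetic. The Python sketch below, with arbitrary test values and $\hbar = 1$ (both are assumptions made only for the check), verifies that $p_0 = imcx/\sqrt{x^2 - c^2t^2}$ makes the exponent stationary and that the exponent evaluated there equals $-mc\sqrt{x^2 - c^2t^2}$. The function names are illustrative.

```python
import cmath

def saddle_momentum(x, t, m, c):
    """p0 = i m c x / sqrt(x^2 - c^2 t^2), for a point outside the light cone (x > c t)."""
    return 1j * m * c * x / cmath.sqrt(x**2 - (c * t) ** 2)

def phase(p, x, t, m, c):
    """p x - t sqrt(p^2 c^2 + m^2 c^4); the exponent is (i / hbar) times this."""
    return p * x - t * cmath.sqrt(p**2 * c**2 + m**2 * c**4)

def phase_derivative(p, x, t, m, c):
    """d/dp of the phase; it vanishes at the saddle point."""
    return x - t * p * c**2 / cmath.sqrt(p**2 * c**2 + m**2 * c**4)

# Arbitrary spacelike test point, in units with hbar = 1.
x, t, m, c = 3.0, 1.0, 1.0, 1.0
p0 = saddle_momentum(x, t, m, c)
stationarity = abs(phase_derivative(p0, x, t, m, c))
exponent = 1j * phase(p0, x, t, m, c)
expected_exponent = -m * c * (x**2 - (c * t) ** 2) ** 0.5

print(p0, stationarity, exponent, expected_exponent)
```

The exponent is real and negative, so the amplitude is exponentially suppressed outside the light cone but does not vanish.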
54.2 Relativistic Dirac Equation

Our previous treatment of the Dirac equation was based on the idea of a relativistic version of quantum mechanics. But, as we have just seen, that is not consistent, so we need a derivation based on quantum field theory.

We need to consider a field associated with spin 1/2 particles. These are “spinor” fields, encompassing the Hilbert space acted upon by so-called gamma matrices $\gamma^\mu$. These are objects satisfying the “Clifford algebra”, objects which when squared give the identity, and which anticommute among each other, just like $\vec\alpha$ and $\beta$. Together, these conditions make the Clifford algebra
\[ \{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}\,\mathbb{1}, \tag{54.19} \]
where $g^{\mu\nu}$ is the metric, in our case the Minkowski metric $g^{\mu\nu} = \eta^{\mu\nu} = {\rm diag}(-1, +1, +1, +1)$. If we replace $\gamma^0$ with $i\gamma^0$, the Clifford algebra will have the Euclidean metric $\tilde g^{\mu\nu} = \delta^{\mu\nu}$. In three dimensions, the Pauli matrices $\sigma_i$ satisfy this Euclidean Clifford algebra.

The Hilbert space for the Clifford algebra has a basis that is changed by a unitary transformation, $\psi \to S\psi$, which amounts to a transformation on the gamma matrices, $\gamma^\mu \to S\gamma^\mu S^{-1}$, meaning the gamma matrices are not unique, just like $\vec\alpha$ and $\beta$.

A choice of $\gamma^\mu$ corresponds to a representation of the Clifford algebra. A useful representation is the Weyl representation,
\[ \gamma^0 = -i\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = -i\begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}, \tag{54.20} \]
satisfying $(\gamma^0)^2 = -1$, $(\gamma^i)^2 = 1$. Together, the gamma matrices in the Weyl representation are written as
\[ \gamma^\mu = -i\begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}, \tag{54.21} \]
where
\[ \sigma^\mu = (1, \sigma_i), \qquad \bar\sigma^\mu = (1, -\sigma_i). \tag{54.22} \]
Then the Dirac equation is the Lorentz-invariant $4\times 4$ matrix equation that is linear in $\partial_t$ and has rest energy $mc^2$. That uniquely gives, in the “theorist's units” with $c = \hbar = 1$, the equation
\[ (\gamma^\mu\partial_\mu + m)\psi = 0. \tag{54.23} \]
Reinstating $\hbar$ and $c$, we find
\[ \gamma^0\frac{\hbar}{c}\frac{\partial}{\partial t}\psi + \hbar\gamma^i\partial_i\psi + mc\,\psi = 0. \tag{54.24} \]
Multiplying by $i\gamma^0c$, we obtain
\[ i\hbar\frac{\partial}{\partial t}\psi = i\hbar c\,\gamma^0\gamma^i\partial_i\psi + imc^2\gamma^0\psi, \tag{54.25} \]
which means that we have
\[ \vec\alpha = -\gamma^0\vec\gamma, \qquad \beta = i\gamma^0. \tag{54.26} \]
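As a consistency check, the Weyl representation can be built explicitly and tested against the Clifford algebra with $\eta^{\mu\nu} = {\rm diag}(-1, +1, +1, +1)$, and the matrices $\vec\alpha = -\gamma^0\vec\gamma$, $\beta = i\gamma^0$ can be tested for Hermiticity and for the anticommutation relations of the Hamiltonian form. The Python sketch below does this with NumPy; the variable names are illustrative.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
one2 = np.eye(2, dtype=complex)
one4 = np.eye(4)

# Weyl representation: gamma^0 = -i [[0, 1], [1, 0]], gamma^i = -i [[0, s_i], [-s_i, 0]]
gamma0 = -1j * np.block([[zero2, one2], [one2, zero2]])
gammas = [gamma0] + [-1j * np.block([[zero2, s], [-s, zero2]]) for s in sigma]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def anticommutator(a, b):
    return a @ b + b @ a

# Clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu}
clifford_ok = all(
    np.allclose(anticommutator(gammas[mu], gammas[nu]), 2 * eta[mu, nu] * one4)
    for mu in range(4) for nu in range(4))

# alpha_i = -gamma^0 gamma^i and beta = i gamma^0
alpha = [-gamma0 @ g for g in gammas[1:]]
beta = 1j * gamma0

hermitian_ok = all(np.allclose(a, a.conj().T) for a in alpha + [beta])
alpha_ok = all(np.allclose(anticommutator(alpha[i], alpha[j]), 2 * (i == j) * one4)
               for i in range(3) for j in range(3))
beta_ok = all(np.allclose(anticommutator(a, beta), 0) for a in alpha)
hamiltonian_algebra_ok = alpha_ok and beta_ok and np.allclose(beta @ beta, one4)

print(clifford_ok, hermitian_ok, hamiltonian_algebra_ok)
```

In this representation $\beta$ is off-diagonal and the $\alpha_i$ are block-diagonal, which differs from the choice with diagonal $\beta$ only by a unitary change of basis.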
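54.3 Interaction with Electromagnetic Field

To describe the interaction of the spin 1/2 fermions (electrons) with the electromagnetic field, we use minimal coupling, replacing $\vec p$ with $\vec p - q\vec A$, and $H$ with $H + q\phi$. Then the Hamiltonian with the interaction with the electromagnetic field is
\[ H = c\,\vec\alpha\cdot(\vec p - q\vec A) + \beta mc^2 + q\phi. \tag{54.27} \]
Substituting into the Schrödinger equation
\[ i\hbar\frac{\partial}{\partial t}\psi = \hat H\psi, \tag{54.28} \]
we obtain the Dirac equation including the interaction with the electromagnetic field,
\[ \left[\frac{1}{c}\left(\frac{\hbar}{i}\frac{\partial}{\partial t} + q\phi\right) + \vec\alpha\cdot\left(\frac{\hbar}{i}\vec\nabla - q\vec A\right) + \beta mc\right]\psi = 0. \tag{54.29} \]
In the explicitly relativistic invariant notation, with the same normalization as in (54.24), we have
\[ \left[\gamma^\mu\left(\hbar\,\partial_\mu - iqA_\mu\right) + mc\right]\psi = 0. \tag{54.30} \]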
54.4 Weakly Relativistic Limit

The main reason to consider the Dirac equation in this book on quantum mechanics, when it is really part of a quantum field theory treatment, is to calculate the first nontrivial relativistic corrections to the nonrelativistic result. This is usually called the weakly relativistic limit.

Interaction with a Magnetic Field

The first application is to the case of $\phi = 0$, leading to interaction with just a magnetic field. In the stationary case, $\psi(t) = \psi e^{-iEt/\hbar}$, the Dirac equation for interaction with $\vec A$ reduces to
\[ E\psi = \hat H\psi = (c\,\vec\alpha\cdot\vec p_{\rm kin} + \beta mc^2)\psi, \tag{54.31} \]
where the kinetic momentum is
\[ \vec p_{\rm kin} = \vec p - q\vec A. \tag{54.32} \]
Dividing the 4-vector $\psi$ into two two-component vectors $\chi$ and $\phi$, we find the matrix equation (using $\vec\alpha$ and $\beta$ in (54.12))
\[ \begin{pmatrix} E - mc^2 & -c\,\vec\sigma\cdot\vec p_{\rm kin} \\ -c\,\vec\sigma\cdot\vec p_{\rm kin} & E + mc^2 \end{pmatrix}\begin{pmatrix} \chi \\ \phi \end{pmatrix} = 0, \tag{54.33} \]
which divides into two equations,
\[ (E - mc^2)\chi = c\,\vec\sigma\cdot\vec p_{\rm kin}\,\phi, \qquad (E + mc^2)\phi = c\,\vec\sigma\cdot\vec p_{\rm kin}\,\chi. \tag{54.34} \]
In the nonrelativistic limit, $\vec\sigma\cdot\vec p \sim mv$, and $E + mc^2 \simeq 2mc^2$, meaning that (from the second relation)
\[ \frac{\phi}{\chi} \sim \frac{mvc}{2mc^2} = \frac{1}{2}\frac{v}{c} \ll 1. \tag{54.35} \]
Thus the two-component vector $\phi$ is small, while $\chi$ is large. Substituting
\[ \phi \simeq \frac{\vec\sigma\cdot\vec p_{\rm kin}}{2mc}\chi, \tag{54.36} \]
obtained from the second Dirac equation, into the first equation, we obtain
\[ (E - mc^2)\chi = c\,\vec\sigma\cdot\vec p_{\rm kin}\,\phi = \frac{(\vec\sigma\cdot\vec p_{\rm kin})^2}{2m}\chi, \tag{54.37} \]
which is called the Pauli equation, for the large two-component vector $\chi$. Using the relation between Pauli matrices,
\[ \sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k, \tag{54.38} \]
we find
\[ \frac{(\vec\sigma\cdot\vec p_{\rm kin})^2}{2m} = \frac{1}{2m}\sigma_i\sigma_j(p_i - qA_i)(p_j - qA_j) = \frac{(\vec p - q\vec A)^2}{2m} + \frac{i}{2m}\epsilon_{ijk}\left(\frac{\hbar}{i}\partial_i - qA_i\right)\left(\frac{\hbar}{i}\partial_j - qA_j\right)\sigma_k = \frac{(\vec p - q\vec A)^2}{2m} - \frac{q\hbar}{2m}\vec\sigma\cdot\vec B, \tag{54.39} \]
where we have used $\frac{1}{2}\epsilon_{ijk}(-\partial_iA_j + \partial_jA_i) = -B_k$.

Then the Dirac equation for $\chi$ including the interaction with $\vec A$ (the Pauli equation) is
\[ (E - mc^2)\chi = \left[\frac{(\vec p - q\vec A)^2}{2m} - \frac{q\hbar}{2m}\vec\sigma\cdot\vec B\right]\chi. \tag{54.40} \]
Here $E - mc^2$ is the energy appearing in the nonrelativistic Schrödinger equation, and the $\vec\sigma\cdot\vec B$ term is the spin interaction with the magnetic field, with the Landé factor $g = 2$, as we have seen before.
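The Pauli matrix relation used above is equivalent to the identity $(\vec\sigma\cdot\vec a)(\vec\sigma\cdot\vec b) = \vec a\cdot\vec b + i\vec\sigma\cdot(\vec a\times\vec b)$ for ordinary (commuting) vectors. The Python sketch below checks it for random real vectors; the helper name `sigma_dot` is illustrative. For the operator $\vec p_{\rm kin}$ the components do not commute, and it is precisely the surviving cross-product term that produces the $\vec\sigma\cdot\vec B$ coupling.

```python
import numpy as np

# Pauli matrices stacked in an array of shape (3, 2, 2)
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a numerical 3-vector v."""
    return np.einsum('i,ijk->jk', v, sigma)

rng = np.random.default_rng(1)
a, b = rng.normal(size=3), rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
identity_ok = np.allclose(lhs, rhs)

# For a = b the cross product vanishes and (sigma . a)^2 = |a|^2.
square_ok = np.allclose(sigma_dot(a) @ sigma_dot(a), np.dot(a, a) * np.eye(2))

print(identity_ok, square_ok)
```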
Interaction with an Electric Field

Interaction with a central electric potential $\phi(r)$,
\[ V(r) = e\phi(r), \tag{54.41} \]
gives two coupled Dirac equations for $\chi$ and $\phi$ with an extra term proportional to the identity,
\[ (E - V - mc^2)\chi = c\,\vec\sigma\cdot\vec p_{\rm kin}\,\phi, \qquad (E - V + mc^2)\phi = c\,\vec\sigma\cdot\vec p_{\rm kin}\,\chi. \tag{54.42} \]
Solving the second equation as before,
\[ \phi = \frac{c\,\vec\sigma\cdot\vec p_{\rm kin}}{E - V + mc^2}\chi, \tag{54.43} \]
and substituting into the first equation, we get an equation for the large two-component vector $\chi$,
\[ (E - mc^2 - V)\chi = c^2(\vec\sigma\cdot\vec p_{\rm kin})\frac{1}{E - V + mc^2}(\vec\sigma\cdot\vec p_{\rm kin})\chi. \tag{54.44} \]
This equation is now expanded in the small parameter
\[ \frac{E - mc^2 - V}{2mc^2} \ll 1. \tag{54.45} \]
The zeroth-order term is
\[ (E - mc^2 - V)\chi \simeq c^2\frac{(\vec\sigma\cdot\vec p_{\rm kin})^2}{2mc^2}\chi, \tag{54.46} \]
which becomes (moving $V$ to the right-hand side) the usual interaction with a magnetic field and an electric potential (the Pauli equation with $V$ in it)
\[ (E - mc^2)\chi = \left[\frac{(\vec\sigma\cdot\vec p_{\rm kin})^2}{2m} + V\right]\chi. \tag{54.47} \]
Next, we consider the first-order corrections in the expansion parameter,
\[ (E - mc^2 - V)\chi = c^2\frac{(\vec\sigma\cdot\vec p_{\rm kin})}{2mc^2}\left[1 + \frac{E - mc^2 - V}{2mc^2}\right]^{-1}(\vec\sigma\cdot\vec p_{\rm kin})\chi \simeq \left[\frac{(\vec\sigma\cdot\vec p_{\rm kin})^2}{2m} - (\vec\sigma\cdot\vec p_{\rm kin})\frac{E - mc^2 - V}{4m^2c^2}(\vec\sigma\cdot\vec p_{\rm kin})\right]\chi. \tag{54.48} \]
At this point, we want to consider the pure electric case, with $\vec A = 0$, and $\vec p_{\rm kin} = \vec p$. Then, using
\[ (E - mc^2 - V)\,\vec\sigma\cdot\vec p\,\chi = \vec\sigma\cdot\vec p\,(E - mc^2 - V)\chi + \vec\sigma\cdot[E - mc^2 - V, \vec p\,]\chi, \tag{54.49} \]
we find
\[ (E - mc^2)\chi = \left\{\frac{\vec p^{\,2}}{2m} + V - \frac{\vec p^{\,4}}{8m^3c^2} - \frac{(\vec\sigma\cdot\vec p)(\vec\sigma\cdot[\vec p, V])}{4m^2c^2}\right\}\chi. \tag{54.50} \]
The first two terms on the right-hand side are the nonrelativistic result, while the third term is the relativistic correction to the energy coming from the expansion of $E = \sqrt{p^2c^2 + m^2c^4}$,
\[ E \simeq mc^2 + \frac{p^2}{2m} - \frac{p^4}{8m^3c^2}. \tag{54.51} \]
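The quality of this expansion can be checked numerically. The Python sketch below compares the exact relativistic energy with the three-term expansion in units with $m = c = 1$ (an assumption made only for the check). Since the first neglected term is of order $p^6$, halving $p$ should reduce the error by a factor close to $2^6 = 64$.

```python
import math

def exact_energy(p, m, c):
    """E = sqrt(p^2 c^2 + m^2 c^4)."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)

def expanded_energy(p, m, c):
    """m c^2 + p^2 / (2m) - p^4 / (8 m^3 c^2)."""
    return m * c**2 + p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)

m, c = 1.0, 1.0                      # illustrative units
momenta = (0.1, 0.05)                # weakly relativistic values, p << m c
errors = [abs(exact_energy(p, m, c) - expanded_energy(p, m, c)) for p in momenta]
ratio = errors[0] / errors[1]

print(errors, ratio)
```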
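Decomposing $\sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k$, we obtain the corrected Schrödinger equation
\[ (E - mc^2)\chi = \left[\frac{\vec p^{\,2}}{2m} + V - \frac{\vec p^{\,4}}{8m^3c^2} - i\frac{\vec\sigma\cdot(\vec p\times[\vec p, V])}{4m^2c^2} - \frac{\vec p\cdot[\vec p, V]}{4m^2c^2}\right]\chi, \tag{54.52} \]
where there are three relativistic corrections to the Hamiltonian. But the second is
\[ -i\frac{\vec\sigma\cdot(\vec p\times[\vec p, V])}{4m^2c^2} = -i\frac{\vec\sigma\cdot(\vec p\times[-i\hbar\vec\nabla, V])}{4m^2c^2}, \tag{54.53} \]
where
\[ [-i\hbar\vec\nabla, V] = -i\hbar(\vec\nabla V) = -i\hbar\frac{\vec r}{r}\frac{dV}{dr}. \tag{54.54} \]
However, $-\vec p\times\vec r = \vec L$ is the angular momentum and $\hbar\vec\sigma/2 = \vec S$ is the spin, so the second relativistic correction is
\[ +\frac{1}{2m^2c^2}(\vec S\cdot\vec L)\frac{1}{r}\frac{dV}{dr}, \tag{54.55} \]
which is the spin–orbit interaction, giving Thomas precession.

Finally, the last term gives the so-called Darwin term. But the term is not Hermitian,
\[ (\vec p\cdot[\vec p, V])^\dagger = [V, \vec p\,]\cdot\vec p = -[\vec p, V]\cdot\vec p \neq \vec p\cdot[\vec p, V]. \tag{54.56} \]
This means that the conservation of probability is broken (the conservation of probability implies unitary evolution, through a Hermitian Hamiltonian). We need an extra term in the Hamiltonian to compensate for the probability loss and restore Hermiticity.

The probability loss is due to the fact that we dropped the small two-component vector $\phi$. Indeed, the normalization of probability is given by
\[ \langle\psi|\psi\rangle = \langle\chi|\chi\rangle + \langle\phi|\phi\rangle = 1. \tag{54.57} \]
Therefore we must replace the resulting norm of $\phi$ with a term involving only $\chi$. Using (54.43), approximated as
\[ |\phi\rangle \simeq \frac{\vec\sigma\cdot\vec p}{2mc}|\chi\rangle, \tag{54.58} \]
to relate the two vectors, we obtain
\[ \langle\phi|\phi\rangle = \langle\chi|\left(\frac{\vec\sigma\cdot\vec p}{2mc}\right)^2|\chi\rangle, \tag{54.59} \]
to be added to $\langle\chi|\chi\rangle$, i.e., to replace it by
\[ \langle\chi|\left[1 + \left(\frac{\vec\sigma\cdot\vec p}{2mc}\right)^2\right]|\chi\rangle. \tag{54.60} \]
Thus we are replacing $|\chi\rangle$ by
\[ |\tilde\chi\rangle = \sqrt{1 + \left(\frac{\vec\sigma\cdot\vec p}{2mc}\right)^2}\,|\chi\rangle \simeq \left[1 + \frac{(\vec\sigma\cdot\vec p)^2}{8m^2c^2}\right]|\chi\rangle, \tag{54.61} \]
inverted as
\[ |\chi\rangle \simeq \left[1 - \frac{(\vec\sigma\cdot\vec p)^2}{8m^2c^2}\right]|\tilde\chi\rangle. \tag{54.62} \]
Evaluating the energy in the Schrödinger equation $E - mc^2$ in the $|\tilde\chi\rangle$ state, we obtain (since $(\vec\sigma\cdot\vec p)^2 = \vec p^{\,2}$)
\[ (E - mc^2)|\tilde\chi\rangle = \left(1 + \frac{\vec p\cdot\vec p}{8m^2c^2}\right)(E - mc^2)|\chi\rangle = \left(1 + \frac{\vec p\cdot\vec p}{8m^2c^2}\right)\hat H\left(1 - \frac{\vec p\cdot\vec p}{8m^2c^2}\right)|\tilde\chi\rangle \simeq \left(\hat H + \frac{[\vec p\cdot\vec p, \hat H]}{8m^2c^2}\right)|\tilde\chi\rangle, \tag{54.63} \]
where in the second equality we have used $(E - mc^2)|\chi\rangle = \hat H|\chi\rangle$, and replaced $|\chi\rangle$ with its expression in terms of $|\tilde\chi\rangle$. The only term in $\hat H$ with a nontrivial commutator with $\vec p\cdot\vec p$ is $V$; the rest of the terms in $\hat H$ commute trivially. Therefore we have an extra term in the Hamiltonian:
\[ (E - mc^2)|\tilde\chi\rangle = \left(\hat H + \frac{[\vec p\cdot\vec p, V]}{8m^2c^2}\right)|\tilde\chi\rangle. \tag{54.64} \]
Then the Darwin term, including the last relativistic correction and the term coming from the normalization, is
\[ H_D = \frac{1}{8m^2c^2}\left(-2\,\vec p\cdot[\vec p, V] + [\vec p\cdot\vec p, V]\right) = -\frac{1}{8m^2c^2}[\vec p, \cdot[\vec p, V]] = +\frac{\hbar^2}{8m^2c^2}\Delta V. \tag{54.65} \]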
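The operator identity behind the Darwin term, $-2\,\vec p\cdot[\vec p, V] + [\vec p\cdot\vec p, V] = \hbar^2\Delta V$, can be checked in a one-dimensional toy setting by letting $p = -i\hbar\, d/dx$ act on polynomials. The Python sketch below uses `numpy.polynomial.Polynomial` for exact differentiation. The potential and test function are arbitrary polynomials, $\hbar$ is set to 1, and the overall factor $1/(8m^2c^2)$ is dropped; all of these are assumptions made only for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

HBAR = 1.0  # units with hbar = 1 (assumption for this sketch)

def p_op(f):
    """Momentum operator p = -i hbar d/dx acting on a polynomial."""
    return -1j * HBAR * f.deriv()

def commutator_p_V(V, f):
    """[p, V] f = p(V f) - V p(f)."""
    return p_op(V * f) - V * p_op(f)

def commutator_p2_V(V, f):
    """[p^2, V] f = p(p(V f)) - V p(p(f))."""
    return p_op(p_op(V * f)) - V * p_op(p_op(f))

# Arbitrary polynomial potential and test function (not physical choices).
V = Polynomial([0.3, -1.0, 0.5, 2.0])
f = Polynomial([1.0, 0.2, -0.7, 0.0, 0.4])

darwin_combination = -2 * p_op(commutator_p_V(V, f)) + commutator_p2_V(V, f)
expected = HBAR**2 * V.deriv(2) * f   # hbar^2 V'' f in one dimension

xs = np.linspace(-1.0, 1.0, 7)
max_diff = np.max(np.abs(darwin_combination(xs) - expected(xs)))

print(max_diff)
```

The terms with a derivative acting on the test function cancel between the two commutators, leaving a purely multiplicative (and therefore Hermitian) operator.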
54.5 Correction to the Energy of Hydrogenoid Atoms

The relativistic correction to the Schrödinger equation is given by
\[ H' = -\frac{(\vec p^{\,2})^2}{8m^3c^2} + \frac{1}{2m^2c^2}\frac{1}{r}\frac{dV}{dr}\,\vec L\cdot\vec S + \frac{\hbar^2}{8m^2c^2}\Delta V = H_1 + H_2 + H_3, \tag{54.66} \]
where $H_1$ is the relativistic correction to the energy of a particle, $H_2$ is the spin–orbit coupling, and $H_3$ is the Darwin term. We will apply this to the hydrogenoid atom, with
\[ V = -\frac{Ze_0^2}{r}. \tag{54.67} \]
Since we have spin–orbit coupling, it is convenient to add the orbital and spin angular momenta together to give the total angular momentum, $\vec L + \vec S = \vec J$. The eigenstates in the central potential are then eigenstates $|nljm_j\rangle$ of the complete set $(\hat H, \vec L^2, \vec J^2, J_z)$. In the coordinate representation,
\[ \langle\vec r|nljm_j\rangle = R_{nl}(r)\langle\theta,\phi|ljm_j\rangle = R_{nl}(r)\left[C_{ljm_j}Y_{l,m_j-1/2}(\theta,\phi)|\xi\rangle + D_{ljm_j}Y_{l,m_j+1/2}(\theta,\phi)|\eta\rangle\right], \tag{54.68} \]
where $|\xi\rangle = |\!\uparrow\rangle$ and $|\eta\rangle = |\!\downarrow\rangle$.

The first-order time-independent perturbation theory for the energy is given by
\[ W^{(1)} = \langle H'\rangle_{nljm_j}, \tag{54.69} \]
where the matrix element is diagonal,
\[ \langle nljm_j|H'|nl'j'm_j'\rangle = \delta_{ll'}\delta_{jj'}\delta_{m_jm_j'}\langle H'\rangle_{nljm_j}. \tag{54.70} \]
The corrected energy is
\[ W_{nj} = E_n + W^{(1)}_{nj}. \tag{54.71} \]
Using the matrix elements of powers of momentum and radius, one finds for the first term in (54.66),
\[ \langle H_1\rangle_{nljm_j} = -\frac{1}{2}mc^2\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{l + 1/2} - \frac{3}{4n}\right), \tag{54.72} \]
for the spin–orbit coupling
\[ \langle H_2\rangle_{nljm_j} = \begin{cases} \pm\dfrac{1}{4}mc^2\dfrac{(\alpha Z)^4}{n^3}\dfrac{1}{(l + 1/2 \pm 1/2)(l + 1/2)}, & \text{for } j = l \pm 1/2,\ l \neq 0, \\[2mm] 0, & \text{for } l = 0, \end{cases} \tag{54.73} \]
and for the Darwin term,
\[ \langle H_3\rangle_{nljm_j} = \frac{1}{2}mc^2\frac{(\alpha Z)^4}{n^3}\delta_{l0}. \tag{54.74} \]
Since at $l \gg 1$, we have
\[ \frac{1}{l + 1/2 \pm 1/2} \simeq \frac{1}{l + 1/2}\left(1 \mp \frac{1}{2(l + 1/2)}\right), \tag{54.75} \]
this means that at large $l$ or $j$, we find the total relativistic correction as
\[ W^{(1)}_{nj} = -\frac{1}{2}mc^2\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{j + 1/2} - \frac{3}{4n}\right). \tag{54.76} \]
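The way the three contributions combine into a function of $j$ only can be checked with exact rational arithmetic. The Python sketch below works in units of $mc^2(\alpha Z)^4$ and assumes the standard closed forms: the kinetic term $-\frac{1}{2n^3}\left(\frac{1}{l+1/2} - \frac{3}{4n}\right)$, the spin–orbit term $\pm\frac{1}{4n^3}\frac{1}{(l+1/2\pm1/2)(l+1/2)}$ for $j = l \pm 1/2$ with $l \neq 0$, and the Darwin term $\frac{1}{2n^3}\delta_{l0}$. The function names are illustrative. With these inputs the sum agrees with the $j$-dependent formula for every allowed $(n, l, j)$ tested, not only at large $l$.

```python
from fractions import Fraction

def h1(n, l):
    """Kinetic correction, in units of m c^2 (alpha Z)^4."""
    return -Fraction(1, 2) / n**3 * (1 / Fraction(2 * l + 1, 2) - Fraction(3, 4 * n))

def h2(n, l, two_j):
    """Spin-orbit correction for j = two_j / 2 (zero for l = 0)."""
    if l == 0:
        return Fraction(0)
    sign = 1 if two_j == 2 * l + 1 else -1
    l_half = Fraction(2 * l + 1, 2)                      # l + 1/2
    return sign * Fraction(1, 4) / n**3 / ((l_half + Fraction(sign, 2)) * l_half)

def h3(n, l):
    """Darwin correction, nonzero only for l = 0."""
    return Fraction(1, 2) / n**3 if l == 0 else Fraction(0)

def total_formula(n, two_j):
    """-(1/2n^3) (1/(j + 1/2) - 3/(4n)), in the same units."""
    return -Fraction(1, 2) / n**3 * (1 / Fraction(two_j + 1, 2) - Fraction(3, 4 * n))

mismatches = []
for n in range(1, 5):
    for l in range(n):
        for two_j in (2 * l - 1, 2 * l + 1):
            if two_j <= 0:
                continue  # j = l - 1/2 does not exist for l = 0
            total = h1(n, l) + h2(n, l, two_j) + h3(n, l)
            if total != total_formula(n, two_j):
                mismatches.append((n, l, two_j))

print("mismatches:", mismatches)
```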
Important Concepts to Remember

• The Dirac equation comes from the quantum field theory treatment of the electron, though it is often mistakenly described as a relativistic version of quantum mechanics.
• From the Hamiltonian becoming relativistic energy, we get the Klein–Gordon equation, $[\vec\nabla^2 - c^{-2}\partial_t^2 - (mc/\hbar)^2]\psi = 0$, and the Dirac equation is a sort of matrix square root of it, $i\hbar\,\partial_t\psi = (-i\hbar c\,\vec\alpha\cdot\vec\nabla + \beta mc^2)\psi$.
• There is no relativistic quantum mechanics since: the particle number is not conserved, owing at least to particle–antiparticle annihilation (and more); for a short time we can have virtual particles that can also change the particle number; in quantum mechanics we can violate causality.
• The relativistic form of the Dirac equation is $(\hbar\gamma^\mu\partial_\mu + mc)\psi = 0$ (where $\partial_t$ has a factor $1/c$ in front), with $\gamma^\mu$ the gamma matrices satisfying the Clifford algebra, $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$, $\psi$ a spinor field, and $\alpha_i = -\gamma^0\gamma^i$ and $\beta = i\gamma^0$.
• The coupling to electromagnetism is via the minimal coupling, $\vec p \to \vec p - q\vec A$, relativistically $\frac{\hbar}{i}\partial_\mu \to \frac{\hbar}{i}\partial_\mu - qA_\mu$.
• The resulting coupling to a magnetic field is $[(\vec p - q\vec A)^2/2m - (q\hbar/2m)\vec\sigma\cdot\vec B]$, and the coupling to an electric field is the spin–orbit interaction (giving Thomas precession),
\[ \frac{1}{2m^2c^2}(\vec S\cdot\vec L)\frac{1}{r}\frac{dV}{dr}, \]
and the Darwin term,
\[ \frac{\hbar^2}{8m^2c^2}\Delta V. \]
• The relativistic corrections to the hydrogenoid atom come from: the first relativistic correction to the energy, $-p^4/(8m^3c^2)$, the spin–orbit interaction, and the Darwin term, leading to
\[ -\frac{mc^2}{2}\frac{(\alpha Z)^4}{n^3}\left(\frac{1}{j + 1/2} - \frac{3}{4n}\right). \]
Further Reading See [2], [1] and [3].
Exercises

(1) Does a wave function satisfy the Klein–Gordon equation? What can you deduce about the sign of the rest energy $mc^2$ in the Dirac equation?
(2) Find the Dirac equation in 1+1 dimensions (one space dimension) and find a representation for the matrices involved.
(3) Show that if we consider $\gamma_5 = i\gamma_0\gamma_1\gamma_2\gamma_3$ together with $\gamma_0$ and $\gamma_i$, they form a Clifford algebra in five dimensions.
(4) Write down the Klein–Gordon equation with coupling to electromagnetism.
(5) Consider a shell model for a nucleus, with the potential for a nucleon being approximated by the Yukawa potential. Calculate the first relativistic correction to the Hamiltonian for the nucleon.
(6) Check the missing steps in the relativistic corrections to the hydrogenoid atom.
(7) Calculate the relativistic corrections to a hydrogenoid atom in a magnetic field.
Multiparticle States in Atoms and Condensed Matter: Schrödinger versus Occupation Number
55
In this chapter we return to multiparticle states, from the point of view of atomic physics and condensed matter physics, with a large number of identical particles. After reviewing the Schrödinger representation and the approximations needed, we lay the foundations of the occupation number picture, which will be continued in the next chapter in terms of Fock states and second quantization.
55.1 Schrödinger Picture Multiparticle Review

The abstract-state Schrödinger equations, in the time-dependent and time-independent varieties, are
\[ i\hbar\,\partial_t|\psi\rangle = \hat H|\psi\rangle \;\Rightarrow\; \hat H|\psi\rangle = E|\psi\rangle. \tag{55.1} \]
This equation applies to both single-particle and multiparticle states. Applying it to $N$ identical particles, the coordinate basis is $|\vec r_1\ldots\vec r_N\rangle$, and the basis has exchange symmetry under exchange of $\vec r_i$ with $\vec r_j$ (particle $i$ with particle $j$),
\[ |\vec r_1\ldots\vec r_i\ldots\vec r_j\ldots\vec r_N\rangle = \pm|\vec r_1\ldots\vec r_j\ldots\vec r_i\ldots\vec r_N\rangle, \tag{55.2} \]
where the $\pm$ correspond to Bose–Einstein and Fermi–Dirac statistics, respectively. If the particles are all different, we have no symmetry. If only some of them are identical, the exchange symmetry is valid for them only.

The completeness relation on the $N$-particle Hilbert space is
\[ \int d^3r_1\cdots\int d^3r_N\, |\vec r_1\ldots\vec r_N\rangle\langle\vec r_1\ldots\vec r_N| = 1. \tag{55.3} \]
We can use, alternatively, the unsymmetrized basis $|\vec r_1\rangle\otimes\cdots\otimes|\vec r_N\rangle$ in the identical case also (since in the case of all different particles it is a basis by definition), with completeness relation
\[ \int d^3r_1\cdots\int d^3r_N\, |\vec r_1\rangle\langle\vec r_1|\otimes\cdots\otimes|\vec r_N\rangle\langle\vec r_N| = 1. \tag{55.4} \]
Inserting the symmetrized basis, we find the Schrödinger equation acting on wave functions,
\[ i\hbar\,\partial_t\langle\vec r_1\ldots\vec r_N|\psi\rangle = \int d^3r_1'\cdots\int d^3r_N'\, \langle\vec r_1\ldots\vec r_N|\hat H|\vec r_1\,'\ldots\vec r_N\,'\rangle\langle\vec r_1\,'\ldots\vec r_N\,'|\psi\rangle, \tag{55.5} \]
or
\[ i\hbar\,\partial_t\psi_N(\vec r_1,\ldots,\vec r_N) = H_{\vec r_1,\ldots,\vec r_N}\psi_N(\vec r_1,\ldots,\vec r_N) \;\Rightarrow\; H_{\vec r_1,\ldots,\vec r_N}\psi_N(\vec r_1,\ldots,\vec r_N) = E\psi_N(\vec r_1,\ldots,\vec r_N). \tag{55.6} \]
The symmetry properties of $N$ independent but identical particles imply the general invariant states
\[ |12\ldots N\rangle = \frac{1}{\sqrt{N_{\rm perms}}}\sum_{{\rm perms}\ P}{\rm sgn}(P)\,\hat P\,|1\rangle\otimes|2\rangle\otimes\cdots\otimes|N\rangle, \tag{55.7} \]
where in the Bose–Einstein case ${\rm sgn}(P)$ is replaced by 1, leading to the wave function $\psi_{S/A}(\vec r_1,\ldots,\vec r_N)$. In the Fermi–Dirac (antisymmetric) case, we have
\[ \psi_A(\vec r_1,\ldots,\vec r_N) = \begin{vmatrix} \psi_{a_1}(\vec r_1) & \cdots & \psi_{a_N}(\vec r_1) \\ \vdots & & \vdots \\ \psi_{a_1}(\vec r_N) & \cdots & \psi_{a_N}(\vec r_N) \end{vmatrix}, \tag{55.8} \]
known as the Slater determinant. The symmetrized wave function satisfies
\[ \psi_{N,S/A}(\vec r_1,\ldots,\vec r_i,\ldots,\vec r_j,\ldots,\vec r_N) = \pm\psi_{N,S/A}(\vec r_1,\ldots,\vec r_j,\ldots,\vec r_i,\ldots,\vec r_N). \tag{55.9} \]
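The antisymmetry of the Slater determinant follows from the fact that exchanging two coordinates exchanges two rows of the matrix. The Python sketch below illustrates this with a hypothetical set of one-dimensional orbitals, $x^a e^{-x^2/2}$ for $a = 0, 1, 2$, chosen only because they are linearly independent (they are not normalized eigenfunctions of any particular Hamiltonian). It also shows the Pauli principle: the wave function vanishes when two coordinates coincide.

```python
import numpy as np

def orbital(a, x):
    """Hypothetical 1D orbital x^a exp(-x^2 / 2), labelled by a = 0, 1, 2, ..."""
    return x**a * np.exp(-x**2 / 2)

def slater(levels, xs):
    """Determinant of M[i, k] = psi_{a_k}(x_i): rows are particles, columns are orbitals."""
    matrix = np.array([[orbital(a, x) for a in levels] for x in xs])
    return np.linalg.det(matrix)

levels = [0, 1, 2]
xs = [0.3, -0.8, 1.1]

psi = slater(levels, xs)
psi_swapped = slater(levels, [xs[1], xs[0], xs[2]])   # exchange particles 1 and 2
psi_coincident = slater(levels, [0.3, 0.3, 1.1])      # two particles at the same point

print(psi, psi_swapped, psi_coincident)
```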
We consider a Hamiltonian containing a free part that acts on a single particle, namely the kinetic part for each particle, and a potential that acts on several particles,
\[ \hat H = \hat H_0 + \hat V(\vec r_1,\ldots,\vec r_N), \qquad \hat H_0 = \hat T = \sum_{i=1}^N\frac{\vec p_i^{\,2}}{2m_i}, \qquad \hat V(\vec r_1,\ldots,\vec r_N) = \sum_{i=1}^N\hat V(\vec r_i) + \sum_{i,j=1}^N\hat V(\vec r_i,\vec r_j) + \sum_{i,j,k=1}^N\hat V(\vec r_i,\vec r_j,\vec r_k) + \cdots \tag{55.10} \]
Here the one-particle potential $\sum_{i=1}^N\hat V(\vec r_i)$ can be a Coulomb potential from a fixed source, in which case it could be added to the free one-particle Hamiltonian $\hat H_0$, while the three-particle potential $\hat V(\vec r_i,\vec r_j,\vec r_k)$ and any higher $N$-particle potentials come from quantum field theory corrections to the classical potential. The only relevant term in the potential is the two-particle potential $\hat V(\vec r_i,\vec r_j) = \hat V(|\vec r_i - \vec r_j|)$ (the Coulomb potential between the particles, for instance).

Figure 55.1 (a) The 2-point potential can come from a quantum mechanical Feynman diagram, but without loops (since it is classical). (b) The three-point potential can only come from loop corrections, and so from quantum field theory.
55.2 Approximation Methods

The multiparticle Schrödinger equation is hard to solve exactly, so in the relevant cases for atomic or molecular physics and condensed matter physics, one can use approximation methods to get a workable wave function.

A useful example for this chapter is the LCAO (linear combination of atomic orbitals) approximation to the multi-electron wave function for the atom or molecule. Consider the basis of factorized wave functions
\[ \phi_r(\vec r_1,\ldots,\vec r_N) \equiv \phi_{a_1}(\vec r_1)\cdots\phi_{a_N}(\vec r_N), \tag{55.11} \]
where $r$ is an index encompassing the relevant subset of all the $a_1,\ldots,a_N$, or its symmetrized alternative, in the fermionic case a Slater determinant.

In the case of the molecular wave function, we must also add a dependence on the positions $\vec R_\alpha$ of the nuclei in the molecule, giving $\phi_r(\{\vec r_i\},\{\vec R_\alpha\})$.

Then the wave function is expanded in a finite subset of the total set of basis states,
\[ \psi(\{\vec r_i\},\{\vec R_\alpha\}) = \sum_r C_r\,\phi_r(\{\vec r_i\},\{\vec R_\alpha\}), \tag{55.12} \]
and we minimize the energy over the coefficients $C_r$.
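Minimizing the energy expectation value over the coefficients of a finite linear combination is the standard linear variational (Ritz) problem: it reduces to the generalized eigenvalue equation $H C = E\, S\, C$, with $H_{rs} = \langle\phi_r|\hat H|\phi_s\rangle$ and overlap matrix $S_{rs} = \langle\phi_r|\phi_s\rangle$. The Python sketch below solves it for a two-function basis. The matrix entries are made-up illustrative numbers, not computed integrals, and the function name `lcao_energies` is hypothetical.

```python
import numpy as np

def lcao_energies(H, S):
    """Solve H C = E S C via Cholesky orthogonalization of the overlap matrix."""
    L = np.linalg.cholesky(S)                 # S = L L^T
    L_inv = np.linalg.inv(L)
    H_orth = L_inv @ H @ L_inv.T              # Hamiltonian in an orthonormalized basis
    energies, vectors = np.linalg.eigh(H_orth)
    coefficients = L_inv.T @ vectors          # columns: C_r in the original basis
    return energies, coefficients

# Illustrative 2x2 matrices for two overlapping "atomic orbitals" (made-up values).
H = np.array([[-1.0, -0.5],
              [-0.5, -1.0]])
S = np.array([[1.0, 0.3],
              [0.3, 1.0]])

energies, coefficients = lcao_energies(H, S)
ground = coefficients[:, 0]
rayleigh = (ground @ H @ ground) / (ground @ S @ ground)

print(energies, ground, rayleigh)
```

For this symmetric example the lowest state is the symmetric combination of the two orbitals, with energy $(H_{11} + H_{12})/(1 + S_{12})$, lower than either diagonal element.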
55.3 Transition to Occupation Number

The approximation method amounts to using a subset of the complete set of states, but if we keep all the states, we will get an exact statement. The basis wave functions are denoted
\[ \phi_{a_1\ldots a_N}(\vec r_1,\ldots,\vec r_N) = \phi_{a_1}(\vec r_1)\phi_{a_2}(\vec r_2)\cdots\phi_{a_N}(\vec r_N). \tag{55.13} \]
The exact expansion in this basis is with coefficients in all the basis states, i.e.,
\[ \psi(\vec r_1,\ldots,\vec r_N; t) = \sum_{a_1,\ldots,a_N} C_{a_1\ldots a_N}(t)\,\phi_{a_1\ldots a_N}(\vec r_1,\ldots,\vec r_N). \tag{55.14} \]
In the LCAO approximation, one takes a subset of states and minimizes the energy over the coefficients.

Alternatively, to obtain the same expansion, we can use successive expansions in each particle. That is, first expand in wave functions of $\vec r_1$,
\[ \psi(\vec r_1,\ldots,\vec r_N; t) = \sum_{a_1} C_{a_1}(\vec r_2,\ldots,\vec r_N; t)\,\phi_{a_1}(\vec r_1), \tag{55.15} \]
and then the coefficients are expanded in the wave functions for $\vec r_2$,
\[ C_{a_1}(\vec r_2,\ldots,\vec r_N; t) = \sum_{a_2} C_{a_1a_2}(\vec r_3,\ldots,\vec r_N; t)\,\phi_{a_2}(\vec r_2), \tag{55.16} \]
etc., until we are back to
\[ \psi(\vec r_1,\ldots,\vec r_N; t) = \sum_{a_1,\ldots,a_N} C_{a_1\cdots a_N}(t)\,\phi_{a_1}(\vec r_1)\cdots\phi_{a_N}(\vec r_N). \tag{55.17} \]
617
55 Multiparticle States: Schrödinger vs. Occupation Number
The symmetry properties of the wave function are transmitted to symmetry properties of the coefficients, Ca1 ...ai ...a j ...a N (t) = ±Ca1 ...a j ...ai ...a N (t).
(55.18)
A relevant case occurs when the indices ai are indexing energies, so that ai → Ei . The energies can even be approximately continuous, though not exactly continuous. In this case, we will not consider degeneracies: the energy indexes the available states. From now on we will replace ai with Ei everywhere. Note then that Ei takes value in the set {1, 2, 3, . . . , ∞}, ordered by increasing energy. In the following we will mostly follow Bose–Einstein (boson) statistics, while Fermi–Dirac (fermion) statistics will be treated later. Then, if index 1 (the lowest-energy state) occurs n1 times among the row of indexes (E1 , . . . , E N ), index 2 occurs n2 in the same row of indexes, etc., until ∞ occurs n∞ times in the row, we have that the total number of occurences of any value equals N, ∞
ni = N.
(55.19)
i=1
In that case, the coefficients can have the dependence on (E1 , . . . , E N ) replaced with dependence on the occupation numbers n1 , n2 , . . . , n∞ . The numerical equality is then related to a different functional dependence, C˜ instead of C, CE1 ...E N (t) = C˜n1 ...n∞ (t). Now we can translate the normalization condition in the indices Ei , |CE1 ...E N (t)| 2 = 1,
(55.20)
(55.21)
E1 ...E N
into a normalization condition in the indices ni . But the sum over Ei ’s equals the sum over ni , with a combinatorial factor for choosing n1 objects of type 1, n2 of type 2, etc. out of a total of N, CnN1 ...n∞ =
N! . n1 ! · · · n∞ !
(55.22)
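The normalization condition in $\tilde C$ is now
$$1 = \sum_{n_1, \ldots, n_\infty} |\tilde C_{n_1 \ldots n_\infty}(t)|^2\, C^N_{n_1 \ldots n_\infty} = \sum_{n_1, \ldots, n_\infty} \left| \sqrt{\frac{N!}{n_1! \cdots n_\infty!}}\; \tilde C_{n_1 \ldots n_\infty}(t) \right|^2. \tag{55.23}$$
This then allows for the definition of new coefficients, appropriately normalized:
$$\sqrt{\frac{N!}{n_1! \cdots n_\infty!}}\; \tilde C_{n_1 \ldots n_\infty}(t) = C'_{n_1 \ldots n_\infty}(t). \tag{55.24}$$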
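The counting behind (55.22) can be checked by brute force for small systems. The following is a minimal sketch, assuming a finite number of one-particle states (the names `multinomial`, `count_by_occupation`, and `num_states` are illustrative, not from the text): it groups all index rows $(E_1, \ldots, E_N)$ by their occupation numbers and compares the size of each group with $N!/(n_1! \cdots n_\infty!)$.

```python
from itertools import product
from math import factorial
from collections import Counter

def multinomial(N, occupations):
    """Combinatorial factor N! / (n_1! n_2! ...) of eq. (55.22)."""
    denom = 1
    for n in occupations:
        denom *= factorial(n)
    return factorial(N) // denom

def count_by_occupation(N, num_states):
    """Group every index row (E_1, ..., E_N) by its occupation numbers.

    Assumption for illustration: only `num_states` one-particle states exist.
    """
    counts = Counter()
    for labels in product(range(num_states), repeat=N):
        occ = tuple(labels.count(s) for s in range(num_states))
        counts[occ] += 1
    return counts

N, num_states = 4, 3
counts = count_by_occupation(N, num_states)
# Each occupation pattern is reached by exactly N!/(n_1!...n_k!) index rows.
all_match = all(c == multinomial(N, occ) for occ, c in counts.items())
```

Since the coefficients depend on the indices only through the occupation numbers, this multiplicity is exactly the weight that appears in the normalization condition (55.23).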
55.4 Schrödinger Equation in Occupation Number Space

We now consider the Schrödinger equation with a Hamiltonian that contains a kinetic term that is a one-particle operator, perhaps including also the one-particle potential $V(\vec r_i)$, and a potential that is a two-particle operator,
$$\hat H = \sum_{i=1}^{N} T(\vec r_i) + \sum_{i,j=1}^{N} \hat V(\vec r_i, \vec r_j) + \cdots \tag{55.25}$$
We set to zero the three-particle and higher parts of the potential (since they are small, and come from quantum field theory corrections to the classical potential). We will transition from the Schrödinger equation in the coordinate representation to a representation in terms of occupation numbers.
One-Particle Operator Acting on Wave Functions

We start with the action of the (generalized) kinetic operator, meaning the one-particle part of the Hamiltonian. That means we calculate
$$\sum_{i=1}^{N} T(\vec r_i)\, \psi(\vec r_1, \ldots, \vec r_N; t) = \sum_{i=1}^{N} T(\vec r_i) \sum_{E'_1 \ldots E'_N} C_{E'_1 \ldots E'_N}(t)\, \phi_{E'_1}(\vec r_1) \cdots \phi_{E'_N}(\vec r_N). \tag{55.26}$$
More precisely, we want to peel off the $N-1$ extra wave functions on the right-hand side. To do that, we calculate
$$T_{(1)} \equiv \int d^3r_1 \cdots \int d^3r_N\, \phi^*_{E_1}(\vec r_1) \cdots \phi^*_{E_N}(\vec r_N) \sum_{i=1}^{N} T(\vec r_i)\, \psi(\vec r_1, \ldots, \vec r_N; t). \tag{55.27}$$
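Using the orthonormality of the wave functions,
$$\int d^3r_j\, \phi^*_{E_j}(\vec r_j)\, \phi_{E'_j}(\vec r_j) = \delta_{E_j E'_j}, \tag{55.28}$$
for all the wave functions except for $\phi_{E'_i}(\vec r_i)$, on which $T(\vec r_i)$ acts, we obtain
$$T_{(1)} = \sum_{i=1}^{N} \sum_{E'_i} \int d^3r_i\, \phi^*_{E_i}(\vec r_i)\, T(\vec r_i)\, \phi_{E'_i}(\vec r_i)\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t) = \sum_{i=1}^{N} \sum_{E'_i} \langle \phi_{E_i} | \hat T_i | \phi_{E'_i} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t). \tag{55.29}$$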
This means that in the coefficients we replace $E_i$ with $E'_i$ and sum over it. Next we change to the occupation number indices:
$$C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t) = \tilde C_{n_1 \ldots n_{E_i}-1, \ldots n_{E'_i}+1, \ldots n_\infty}(t) = \sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_{E_i}-1, \ldots n_{E'_i}+1, \ldots n_\infty}(t). \tag{55.30}$$
In the first equality we note that we have removed one occupied state from the value $E_i$ and added it at $E'_i$. In the second equality, we have just redefined the coefficients in terms of the correctly normalized ones. When going to occupation number space, the sum over particles translates into a sum over states, weighted by the occupation numbers,
$$\sum_{i=1}^{N} \langle \phi_{E_i} | = \sum_{E_i} n_{E_i} \langle \phi_{E_i} |. \tag{55.31}$$
At this point, we can simplify the notation, and replace $|\phi_{E_i}\rangle$ with $|i\rangle$ and $n_{E_i}$ with $n_i$ everywhere. Then we get the simple one-particle operator action
$$T_{(1)} = \sum_{i,i'=1}^{\infty} n_i\, \langle i | \hat T | i' \rangle \sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_\infty}(t), \tag{55.32}$$
where the factor $n_i$ is the multiplicity from (55.31), and the factorials inside the square root carry the same occupation numbers as the coefficient $C'$.
Two-Particle Operator Acting on Wave Functions

Next, we consider the action of the two-particle operator, i.e., the two-particle potential
$$\sum_{i>j=1}^{N} V(\vec r_i, \vec r_j) = \frac{1}{2} \sum_{i,j=1,\, i \neq j}^{N} V(\vec r_i, \vec r_j), \tag{55.33}$$
and define the same kind of coefficient coming from it as for $T_{(1)}$,
$$V_{(2)} = \int d^3r_1 \cdots \int d^3r_N\, \phi^*_{E_1}(\vec r_1) \cdots \phi^*_{E_N}(\vec r_N)\, \frac{1}{2} \sum_{i,j=1,\, i \neq j}^{N} V(\vec r_i, \vec r_j)\, \psi(\vec r_1, \ldots, \vec r_N; t). \tag{55.34}$$
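Again using the orthonormality of the one-particle wave functions, except for those involving $\vec r_i$, $\vec r_j$, we find
$$\begin{aligned}
V_{(2)} &= \frac{1}{2} \sum_{i,j=1,\, i \neq j}^{N} \sum_{E'_i E'_j} \int d^3r_i\, d^3r_j\, \phi^*_{E_i}(\vec r_i)\, \phi^*_{E_j}(\vec r_j)\, V(\vec r_i, \vec r_j)\, \phi_{E'_i}(\vec r_i)\, \phi_{E'_j}(\vec r_j) \\
&\quad \times C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_{j-1} E'_j E_{j+1} \ldots E_N}(t) \\
&= \frac{1}{2} \sum_{i,j=1,\, i \neq j}^{N} \sum_{E'_i E'_j} \langle \phi_{E_i} \phi_{E_j} | \hat V | \phi_{E'_i} \phi_{E'_j} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_{j-1} E'_j E_{j+1} \ldots E_N}(t).
\end{aligned} \tag{55.35}$$
We then change to the occupation number representation, now with two changes in the indices,
$$C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_{j-1} E'_j E_{j+1} \ldots E_N}(t) = \tilde C_{n_1 \ldots n_{E_i}-1, \ldots n_{E'_i}+1, \ldots n_{E_j}-1, \ldots n_{E'_j}+1, \ldots n_\infty}(t) = \sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_{E_i}-1, \ldots n_{E'_i}+1, \ldots n_{E_j}-1, \ldots n_{E'_j}+1, \ldots n_\infty}(t), \tag{55.36}$$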
where the factorials in the square root have the same occupation numbers as the coefficients $C'$. Now, besides replacing the single sum over particles with a sum over states weighted by the occupation numbers,
$$\sum_{i=1}^{N} \langle \phi_{E_i} | = \sum_{E_i} n_{E_i} \langle \phi_{E_i} |, \tag{55.37}$$
we have to account for the double sum over distinct particles, in which, if the two energies are equal, the second sum has one state fewer,
$$\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \langle \phi_{E_i} \phi_{E_j} | = \sum_{E_j \neq E_i} n_{E_i} n_{E_j} \langle \phi_{E_i} \phi_{E_j} | + \sum_{E_i} n_{E_i} (n_{E_i} - 1) \langle \phi_{E_i} \phi_{E_i} |. \tag{55.38}$$
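The pair counting in (55.38) is easy to confirm numerically. The sketch below assumes a small, arbitrarily chosen row of state labels (the tuple `labels` and the helper `ordered_pair_counts` are illustrative, not from the text) and checks that the number of ordered pairs of distinct particles in states $(a, b)$ equals $n_a (n_b - \delta_{ab})$.

```python
from itertools import permutations
from collections import Counter

def ordered_pair_counts(labels):
    """Count ordered pairs (i, j), i != j, of particles by their state labels."""
    return Counter(
        (labels[i], labels[j])
        for i, j in permutations(range(len(labels)), 2)
    )

# Hypothetical row (E_1, ..., E_N): three particles in state 1,
# two in state 2, one in state 5.
labels = (1, 1, 1, 2, 2, 5)
occupation = Counter(labels)
pairs = ordered_pair_counts(labels)

# n_a * n_b for a != b, and n_a * (n_a - 1) for a == b.
formula_ok = all(
    pairs[(a, b)] == occupation[a] * (occupation[b] - (a == b))
    for a in occupation
    for b in occupation
)
```

The same weight $n_i (n_j - \delta_{ij})$ is the prefactor of the two-particle matrix element once the sum over particles is traded for a sum over states.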
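Relabeling $|\phi_{E_i}\rangle$ as $|i\rangle$ and $n_{E_i}$ as $n_i$ as before, we find the two-particle potential operator:
$$\begin{aligned}
V_{(2)} &= \frac{1}{2} \sum_{E_i E_j} \sum_{E'_i E'_j} n_{E_i} (n_{E_j} - \delta_{E_i E_j}) \langle \phi_{E_i} \phi_{E_j} | \hat V_{(2)} | \phi_{E'_i} \phi_{E'_j} \rangle \\
&\quad \times \sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_{E_i}-1, \ldots n_{E'_i}+1, \ldots n_{E_j}-1, \ldots n_{E'_j}+1, \ldots n_\infty}(t) \\
&= \frac{1}{2} \sum_{i,j,i',j'} n_i (n_j - \delta_{ij}) \langle i j | \hat V_{(2)} | i' j' \rangle \\
&\quad \times \sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_j-1, \ldots n_{j'}+1, \ldots n_\infty}(t),
\end{aligned} \tag{55.39}$$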
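where again the factorials inside the square root have the same occupation numbers as the coefficients $C'$. Then, the Schrödinger equation, multiplied by the wave functions and integrated over position, gives
$$\int d^3r_1 \int d^3r_2 \cdots \int d^3r_N\, \phi^*_{E_1}(\vec r_1) \cdots \phi^*_{E_N}(\vec r_N) \times i\hbar\, \partial_t \psi(\vec r_1, \ldots, \vec r_N; t) = T_{(1)} + V_{(2)}. \tag{55.40}$$
Since the left-hand side gives $i\hbar\, \partial_t C_{E_1 \ldots E_N}(t)$, we obtain the Schrödinger equation for the coefficients, first in terms of the original $C_{E_1 \ldots E_N}(t)$ coefficients,
$$\begin{aligned}
i\hbar\, \partial_t C_{E_1 \ldots E_N}(t) &= \sum_{i=1}^{N} \sum_{E'_i} \langle \phi_{E_i} | \hat T_i | \phi_{E'_i} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t) \\
&\quad + \frac{1}{2} \sum_{i,j=1,\, i \neq j}^{N} \sum_{E'_i E'_j} \langle \phi_{E_i} \phi_{E_j} | \hat V | \phi_{E'_i} \phi_{E'_j} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_{j-1} E'_j E_{j+1} \ldots E_N}(t),
\end{aligned} \tag{55.41}$$
and finally in terms of the $C'_{n_1 \ldots n_\infty}(t)$ coefficients,
$$\begin{aligned}
\sqrt{\frac{n_1! \cdots n_\infty!}{N!}}\; i\hbar\, \partial_t C'_{n_1 \ldots n_\infty}(t) &= \sum_{i,i'} n_i\, \langle i | \hat T | i' \rangle \sqrt{\frac{n_1! \cdots (n_i - 1)! \cdots (n_{i'} + 1)! \cdots n_\infty!}{N!}}\; C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_\infty}(t) \\
&\quad + \frac{1}{2} \sum_{i \neq j,\, i', j'} n_i n_j\, \langle i j | \hat V_{(2)} | i' j' \rangle \sqrt{\frac{n_1! \cdots (n_i - 1)! \cdots (n_{i'} + 1)! \cdots (n_j - 1)! \cdots (n_{j'} + 1)! \cdots n_\infty!}{N!}} \\
&\qquad \times C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_j-1, \ldots n_{j'}+1, \ldots n_\infty}(t) \\
&\quad + \frac{1}{2} \sum_{i = j,\, i', j'} n_i (n_i - 1)\, \langle i i | \hat V_{(2)} | i' j' \rangle \sqrt{\frac{n_1! \cdots (n_i - 2)! \cdots (n_{i'} + 1)! \cdots (n_{j'} + 1)! \cdots n_\infty!}{N!}} \\
&\qquad \times C'_{n_1 \ldots n_i-2, \ldots n_{i'}+1, \ldots n_{j'}+1, \ldots n_\infty}(t).
\end{aligned} \tag{55.42}$$
Here the factor $n_i$ in the kinetic term is the multiplicity from (55.31).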
55.5 Analysis for Fermions

In the analysis so far, we have considered bosonic particles. To analyze the case of fermions, we must replace the factorized basis (products of the one-particle wave functions) with the Slater determinant, with combinatorial coefficient
$$[C^N_{n_1 \ldots n_\infty}]^{-1/2} \begin{vmatrix} \phi_{E_1}(\vec r_1) & \cdots & \phi_{E_N}(\vec r_1) \\ \vdots & & \vdots \\ \phi_{E_1}(\vec r_N) & \cdots & \phi_{E_N}(\vec r_N) \end{vmatrix}. \tag{55.43}$$
But this combinatorial coefficient is the same as before,
$$[C^N_{n_1 \ldots n_\infty}]^{-1/2} = \sqrt{\frac{n_1! \cdots n_\infty!}{N!}} = \frac{1}{\sqrt{N!}}, \tag{55.44}$$
since the occupation numbers $n_i$ for fermions are either 0 or 1, in both cases giving $n_i! = 1$. The one-particle and two-particle operators act on particles with the same coefficients $C_{E_1 \ldots E_N}(t)$, where all the $E_i$ are different (since for fermions we have at most one particle per state). Then the coefficients can be rewritten in terms of $\tilde C_{n_1 \ldots n_\infty}(t)$ and $C'_{n_1 \ldots n_\infty}(t)$,
$$C_{E_1 \ldots E_N}(t) = \tilde C_{n_1 \ldots n_\infty}(t) = \frac{1}{\sqrt{N!}}\, C'_{n_1 \ldots n_\infty}(t). \tag{55.45}$$
The one-particle operator acting on the Slater determinant gives
$$\sum_{i=1}^{N} \sum_{E'_i} \langle \phi_{E_i} | \hat T_i | \phi_{E'_i} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t), \tag{55.46}$$
where the term with coefficient $C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t)$ acts by replacing the Slater determinant with
$$\begin{vmatrix} \phi_{E_1}(\vec r_1) & \cdots & \phi_{E'_i}(\vec r_1) & \cdots & \phi_{E_N}(\vec r_1) \\ \vdots & & \vdots & & \vdots \\ \phi_{E_1}(\vec r_N) & \cdots & \phi_{E'_i}(\vec r_N) & \cdots & \phi_{E_N}(\vec r_N) \end{vmatrix}. \tag{55.47}$$
Important Concepts to Remember
• The multiparticle Schrödinger equation has one-particle, two-body, three-body, etc., potentials, where only the one-particle and two-body potentials have a classical counterpart, the others coming from quantum field theory.
• We can expand the multiparticle wave function in a basis of factorized wave functions, $\phi_r(\vec r_1, \ldots, \vec r_N) = \phi_{a_1}(\vec r_1) \cdots \phi_{a_N}(\vec r_N)$, and in the LCAO approximation we keep only a finite number of these basis functions.
• The coefficients $C_{a_1 \ldots a_N}(t)$ have the correct symmetry properties (for bosons or fermions). It is useful to consider $a_i \equiv E_i$ (energy eigenstates), and transition to occupation number states for all the energy eigenstates, $C_{E_1 \ldots E_N}(t) \equiv \tilde C_{n_1 \ldots n_\infty}(t)$.
• The one-particle (kinetic and potential) operator in occupation number is $\sum_i \sum_{E'_i} \langle \phi_{E_i} | \hat T_i | \phi_{E'_i} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_N}(t)$, and the two-particle operator is $\frac{1}{2} \sum_{i \neq j} \sum_{E'_i, E'_j} \langle \phi_{E_i} \phi_{E_j} | \hat V | \phi_{E'_i} \phi_{E'_j} \rangle\, C_{E_1 \ldots E_{i-1} E'_i E_{i+1} \ldots E_{j-1} E'_j E_{j+1} \ldots E_N}(t)$.
• We can write a Schrödinger equation for the coefficients $C_{E_1 \ldots E_N}(t) \equiv \tilde C_{n_1 \ldots n_\infty}(t)$.
• For fermions the coefficients multiply changed Slater determinants.
Further Reading More details are given in the book [27].
Exercises
(1) How do translational invariance and rotational symmetry constrain the three-body potential coming from quantum field theory?
(2) Consider the LCAO approximation for an atom with three electrons. Write the expansion, restricting yourself to four basis elements.
(3) Write the coefficients of the expansion in exercise 2 as coefficients in terms of energy, $C_{E_1 \ldots}$, show explicitly their symmetry properties, and then rewrite them as coefficients in the occupation number picture.
(4) Consider five bosons with a one-body potential that can be approximated as a harmonic oscillator. Write the wave function expansion in terms of harmonic oscillator states with $n \leq 3$.
(5) For the case in exercise 4, write the one-particle (kinetic) operator in the Hamiltonian, in the occupation number basis.
(6) For the case in exercise 4, consider a two-particle potential of the Coulomb ($\propto 1/r$) type. Write explicitly the first six nontrivial terms (those having an occupation number change in the two-particle operator in the Hamiltonian), in the occupation number basis.
(7) Consider three fermions with a one-body potential that can be approximated as a harmonic oscillator potential. Write the first four one-particle terms in the Hamiltonian that acts on the Slater determinant (for a wave function with arbitrary coefficients).
56
Fock Space Calculation with Field Operators
In this chapter we continue with the occupation number picture, and build the Fock space corresponding to it. Then we build field operators, and operators constructed from them. This is the beginning of the quantum field theory formalism, though in its nonrelativistic version, with $v \ll c$, and no antiparticles. Many-body physics is a whole field of study, of which we describe only the foundation.
56.1 Creation and Annihilation Operators

Assume, as in the previous chapter, that there are one-particle states, which means that there are (quasi-)particles. The states are indexed as $(1, 2, \ldots, \infty)$, and each multiparticle state has occupation numbers $(n_1, n_2, \ldots, n_\infty)$ for the one-particle states. Consider basis vectors for the multiparticle Hilbert space that are tensor products of the single-particle occupation states,
$$|n_1 n_2 \ldots n_\infty\rangle = |n_1\rangle \otimes |n_2\rangle \otimes \cdots \otimes |n_\infty\rangle. \tag{56.1}$$
In the coordinate representation for the $n$-particle Hilbert space, we have
$$\begin{aligned}
\langle \vec r_1, \ldots, \vec r_{n_1} | n_1 \rangle &= \phi_1(\vec r_1) \cdots \phi_1(\vec r_{n_1}), \\
\langle \vec r_{n_1+1}, \ldots, \vec r_{n_1+n_2} | n_2 \rangle &= \phi_2(\vec r_{n_1+1}) \cdots \phi_2(\vec r_{n_1+n_2}),
\end{aligned} \tag{56.2}$$
etc. The occupation number basis is complete in the multiparticle Hilbert space, so
$$\sum_{n_1 \ldots n_\infty} |n_1 \ldots n_\infty\rangle \langle n_1 \ldots n_\infty| = 1. \tag{56.3}$$
We now define creation and annihilation operators, first in the bosonic case, $b_i^\dagger$, $b_i$. These operators act in the same way as for the harmonic oscillator, just on the occupation number states. This means that on the single-particle state $i$, with occupation number $n_i$, the actions of $b_i$ and $b_i^\dagger$ are
$$b_i |n_i\rangle = \sqrt{n_i}\, |n_i - 1\rangle, \qquad b_i^\dagger |n_i\rangle = \sqrt{n_i + 1}\, |n_i + 1\rangle. \tag{56.4}$$
We can also define the number operator in the state $i$,
$$N_i = b_i^\dagger b_i \;\Rightarrow\; N_i |n_i\rangle = b_i^\dagger b_i |n_i\rangle = n_i |n_i\rangle. \tag{56.5}$$
For the states with $n_i = 0$ or $n_i = 1$, we have, in particular,
$$b|0\rangle = 0, \qquad b|1\rangle = |0\rangle, \qquad b^\dagger|0\rangle = |1\rangle, \qquad b^\dagger|1\rangle = \sqrt{2}\, |2\rangle. \tag{56.6}$$
The creation and annihilation operators satisfy the commutation rules of independent harmonic oscillators,
$$[b_i, b_j^\dagger] = \delta_{ij}, \qquad [b_i, b_j] = [b_i^\dagger, b_j^\dagger] = 0. \tag{56.7}$$
We can then finally extend, trivially, the actions of $b_i$ and $b_i^\dagger$ to the full multiparticle Hilbert space, with its basis $|n_1 \ldots n_i \ldots n_\infty\rangle$.
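The rules (56.4) and (56.7) can be tried out directly on occupation number states. The following is a minimal sketch, assuming a state is stored as a dictionary from occupation tuples to amplitudes (this representation and the names `b`, `b_dag`, `commutator_on` are illustrative choices, not from the text).

```python
from math import sqrt, isclose

def b(i, state):
    """Apply b_i: b_i|n_i> = sqrt(n_i)|n_i - 1>, acting on mode i only."""
    out = {}
    for occ, amp in state.items():
        if occ[i] == 0:
            continue  # b_i annihilates an empty mode
        new = occ[:i] + (occ[i] - 1,) + occ[i + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[i]) * amp
    return out

def b_dag(i, state):
    """Apply b_i^dagger: b_i^dagger|n_i> = sqrt(n_i + 1)|n_i + 1>."""
    out = {}
    for occ, amp in state.items():
        new = occ[:i] + (occ[i] + 1,) + occ[i + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[i] + 1) * amp
    return out

def commutator_on(i, j, state):
    """Return (b_i b_j^dagger - b_j^dagger b_i)|state>."""
    first = b(i, b_dag(j, state))
    second = b_dag(j, b(i, state))
    keys = set(first) | set(second)
    return {k: first.get(k, 0.0) - second.get(k, 0.0) for k in keys}

# Hypothetical three-mode basis state |n_1 n_2 n_3> = |2 0 3>.
basis_state = {(2, 0, 3): 1.0}
same_mode = commutator_on(0, 0, basis_state)   # expect 1 * |2 0 3>
other_mode = commutator_on(0, 1, basis_state)  # expect the zero vector
```

Acting on one mode leaves all other occupation numbers untouched, which is the "trivial" extension to the full basis $|n_1 \ldots n_i \ldots n_\infty\rangle$ mentioned above.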
56.2 Occupation Number Representation for Fermions

We can similarly define the occupation number states (and representation) for fermions. The fermionic creation and annihilation operators $a_i^\dagger$, $a_i$ for fermionic harmonic oscillators were defined in Chapters 20 and 28 via their anticommutation rules, owing to the antisymmetric statistics. The anticommutators are
$$\{a_i, a_j^\dagger\} = \delta_{ij}, \qquad \{a_i, a_j\} = \{a_i^\dagger, a_j^\dagger\} = 0. \tag{56.8}$$
We can import the anticommutators from the fermionic harmonic oscillator to the single-particle states $i$. Since we have fermions, the single-particle states can be either empty (unoccupied) or full (occupied), so the states are $|0\rangle$ or $|1\rangle$. Then we must solve the anticommutation rules for the actions of the operators $a_i$ and $a_i^\dagger$ on the states $|0\rangle$ and $|1\rangle$. The solution is
$$a|0\rangle = 0, \qquad a|1\rangle = |0\rangle, \qquad a^\dagger|0\rangle = |1\rangle, \qquad a^\dagger|1\rangle = 0. \tag{56.9}$$
With respect to the bosonic case, we only need to change the last action: since there is no $|2\rangle$ state, it must be replaced with 0, $|2\rangle \equiv 0$. In terms of the occupation numbers $n_i = 0, 1$, the relations above can be summarized as
$$a_i |n_i\rangle = \sqrt{n_i}\, |n_i - 1\rangle, \qquad a_i^\dagger |n_i\rangle = \sqrt{1 - n_i}\, |n_i + 1\rangle. \tag{56.10}$$
Equivalently, we can just use the same definition as for the bosonic case but with the identification $|2\rangle = 0$. We can also define fermionic number operators
$$N_i = a_i^\dagger a_i \;\Rightarrow\; N_i |n_i\rangle = a_i^\dagger a_i |n_i\rangle = n_i |n_i\rangle, \tag{56.11}$$
for $n_i = 0, 1$. Finally, the extension of the actions of $(a_i^\dagger, a_i)$ to the basis $|n_1 \ldots n_\infty\rangle$ of the full multiparticle Hilbert space will be explained later, in the definition of Fock space.
56.3 Schrödinger Equation on Occupation Number States

Define the abstract state
$$|\psi(t)\rangle = \sum_{n_1 \ldots n_\infty} C'_{n_1 \ldots n_\infty}(t)\, |n_1 n_2 \ldots n_\infty\rangle, \tag{56.12}$$
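with a general wave function, i.e., coefficient $C'_{n_1 \ldots n_\infty}$. We go back to the Schrödinger equation for the coefficients of occupation number states (55.42) and rewrite it by dividing by the square root of the factorials on the left-hand side and isolating the $i = i'$ kinetic term, in which $n_{i'} = n_i - 1$, so that $\sqrt{n_i (n_{i'} + 1)} = n_i$. We obtain
$$\begin{aligned}
i\hbar\, \partial_t C'_{n_1 \ldots n_\infty}(t) &= \sum_i \langle i | \hat T | i \rangle\, n_i\, C'_{n_1 \ldots n_i \ldots n_\infty}(t) \\
&\quad + \sum_{i \neq i'} \langle i | \hat T | i' \rangle \sqrt{n_i} \sqrt{n_{i'} + 1}\; C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_\infty}(t) \\
&\quad + \sum_{i \neq j,\, i', j'} \langle i j | \hat V_{(2)} | i' j' \rangle\, \frac{1}{2} \sqrt{n_i} \sqrt{n_j} \sqrt{n_{i'} + 1} \sqrt{n_{j'} + 1} \\
&\qquad \times C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_j-1, \ldots n_{j'}+1, \ldots n_\infty}(t) \\
&\quad + \sum_{i = j,\, i', j'} \langle i i | \hat V_{(2)} | i' j' \rangle\, \frac{1}{2} \sqrt{n_i (n_i - 1)} \sqrt{n_{i'} + 1} \sqrt{n_{j'} + 1} \\
&\qquad \times C'_{n_1 \ldots n_i-2, \ldots n_{i'}+1, \ldots n_{j'}+1, \ldots n_\infty}(t).
\end{aligned} \tag{56.13}$$
Multiplying by the occupation number states, we obtain the Schrödinger equation for general states,
$$\begin{aligned}
i\hbar\, \partial_t |\psi(t)\rangle &= \sum_{n_1 \ldots n_\infty;\, \sum_i n_i = N} i\hbar\, \partial_t C'_{n_1 \ldots n_\infty}(t)\, |n_1 n_2 \ldots n_\infty\rangle \\
&= \sum_{n_1 \ldots n_\infty;\, \sum_i n_i = N} \Bigg\{ \sum_i \langle i | \hat T | i \rangle\, C'_{n_1 \ldots n_i \ldots n_\infty}(t)\, n_i\, |n_1 \ldots n_i \ldots n_\infty\rangle \\
&\quad + \sum_{i \neq i'} \langle i | \hat T | i' \rangle\, C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_\infty}(t) \sqrt{n_i} \sqrt{n_{i'} + 1}\; |n_1 \ldots n_i, \ldots n_{i'}, \ldots n_\infty\rangle \\
&\quad + \frac{1}{2} \sum_{i \neq j,\, i', j'} \langle i j | \hat V_{(2)} | i' j' \rangle\, C'_{n_1 \ldots n_i-1, \ldots n_{i'}+1, \ldots n_j-1, \ldots n_{j'}+1, \ldots n_\infty}(t) \\
&\qquad \times \sqrt{n_i} \sqrt{n_j} \sqrt{n_{i'} + 1} \sqrt{n_{j'} + 1}\; |n_1 \ldots n_i, \ldots n_{i'}, \ldots n_j, \ldots n_{j'}, \ldots n_\infty\rangle \\
&\quad + \frac{1}{2} \sum_{i = j,\, i', j'} \langle i i | \hat V_{(2)} | i' j' \rangle\, C'_{n_1 \ldots n_i-2, \ldots n_{i'}+1, \ldots n_{j'}+1, \ldots n_\infty}(t) \\
&\qquad \times \sqrt{n_i (n_i - 1)} \sqrt{n_{i'} + 1} \sqrt{n_{j'} + 1}\; |n_1 \ldots n_i, \ldots n_{i'}, \ldots n_{j'}, \ldots n_\infty\rangle \Bigg\}.
\end{aligned} \tag{56.14}$$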
In the second term (in the first term no redefinition is needed), we redefine
$$n_i - 1 = \tilde n_i, \qquad n_{i'} + 1 = \tilde n_{i'}. \tag{56.15}$$
In the third term, we redefine
$$n_i - 1 = \tilde n_i, \qquad n_{i'} + 1 = \tilde n_{i'}, \qquad n_j - 1 = \tilde n_j, \qquad n_{j'} + 1 = \tilde n_{j'}, \tag{56.16}$$
and in the fourth term, we redefine
$$n_i - 2 = \tilde n_i, \qquad n_{i'} + 1 = \tilde n_{i'}, \qquad n_{j'} + 1 = \tilde n_{j'}. \tag{56.17}$$
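Then the states and prefactors multiplying the matrix elements and coefficients $C'$ become, under the redefinitions,
$$\begin{aligned}
n_i\, |n_1 \ldots n_i \ldots n_\infty\rangle &= b_i^\dagger b_i\, |n_1 \ldots n_i \ldots n_\infty\rangle, \\
\sqrt{\tilde n_i + 1} \sqrt{\tilde n_{i'}}\; |n_1 \ldots \tilde n_i + 1, \ldots \tilde n_{i'} - 1, \ldots n_\infty\rangle &= b_i^\dagger b_{i'}\, |n_1 \ldots \tilde n_i \ldots \tilde n_{i'} \ldots n_\infty\rangle, \\
\sqrt{\tilde n_i + 1} \sqrt{\tilde n_{i'}} \sqrt{\tilde n_j + 1} \sqrt{\tilde n_{j'}}\; |n_1 \ldots \tilde n_i + 1, \ldots \tilde n_{i'} - 1, \ldots \tilde n_j + 1, \ldots \tilde n_{j'} - 1, \ldots n_\infty\rangle &= b_i^\dagger b_{i'} b_j^\dagger b_{j'}\, |n_1 \ldots \tilde n_i \ldots \tilde n_{i'} \ldots \tilde n_j \ldots \tilde n_{j'} \ldots n_\infty\rangle, \\
\sqrt{(\tilde n_i + 1)(\tilde n_i + 2)} \sqrt{\tilde n_{i'}} \sqrt{\tilde n_{j'}}\; |n_1 \ldots \tilde n_i + 2, \ldots \tilde n_{i'} - 1, \ldots \tilde n_{j'} - 1, \ldots n_\infty\rangle &= b_i^\dagger b_i^\dagger b_{i'} b_{j'}\, |n_1 \ldots \tilde n_i \ldots \tilde n_{i'} \ldots \tilde n_{j'} \ldots n_\infty\rangle,
\end{aligned} \tag{56.18}$$
and we can replace the sum over $n_i$ with a sum over $\tilde n_i$, etc.; the sum is unchanged because the prefactor vanishes for the extra terms in the sum. Then, in all the terms, the original basis vectors are reobtained, and from them the full $|\psi(t)\rangle$. Finally, the Schrödinger equation for general states is
$$\begin{aligned}
i\hbar\, \partial_t |\psi(t)\rangle &= \sum_i \langle i | \hat T | i \rangle\, b_i^\dagger b_i\, |\psi(t)\rangle + \sum_{i \neq i'} \langle i | \hat T | i' \rangle\, b_i^\dagger b_{i'}\, |\psi(t)\rangle \\
&\quad + \frac{1}{2} \sum_{i \neq j,\, i', j'} \langle i j | \hat V_{(2)} | i' j' \rangle\, b_i^\dagger b_j^\dagger b_{i'} b_{j'}\, |\psi(t)\rangle + \frac{1}{2} \sum_{i = j,\, i', j'} \langle i i | \hat V_{(2)} | i' j' \rangle\, b_i^\dagger b_i^\dagger b_{i'} b_{j'}\, |\psi(t)\rangle.
\end{aligned} \tag{56.19}$$
The right-hand side of the equation can be identified with $\hat H |\psi(t)\rangle$ in occupation number states, which means that the Hamiltonian acting on the occupation number states is
$$\hat H = \sum_{i,i'} b_i^\dagger\, \langle i | \hat T | i' \rangle\, b_{i'} + \frac{1}{2} \sum_{i,j,i',j'} b_i^\dagger b_j^\dagger\, \langle i j | \hat V_{(2)} | i' j' \rangle\, b_{i'} b_{j'}, \tag{56.20}$$
where the factor $1/2$ in the interaction term is the one carried over from (56.19).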
56.4 Fock Space

We can define the total number operator,
$$\hat N = \sum_i \hat N_i = \sum_i b_i^\dagger b_i, \tag{56.21}$$
which has as eigenvalue the total number of particles/occupied states
$$N = \sum_i n_i. \tag{56.22}$$
We could consider the space to be of given total number $N$, since in (nonrelativistic) quantum mechanics the number of particles does not change. However, as the individual operators $b_i$, $b_i^\dagger$ change the total number to $N - 1$ and $N + 1$, respectively, in order to define their action we must take as the total Hilbert space the space of all values of $N$, called Fock space. In this space, the normalized basis states are
$$|n_1 \ldots n_\infty\rangle = \frac{(b_1^\dagger)^{n_1}}{\sqrt{n_1!}} \cdots \frac{(b_\infty^\dagger)^{n_\infty}}{\sqrt{n_\infty!}}\, |0\rangle. \tag{56.23}$$
Indeed, the normalization constant $C$ for the state
$$|n\rangle = C\, (b^\dagger)^n |0\rangle \tag{56.24}$$
comes from the normalization condition
$$1 = \langle n | n \rangle = |C|^2 \langle 0 | b^n (b^\dagger)^n | 0 \rangle = |C|^2\, n\, \langle 0 | b^{n-1} (b^\dagger)^{n-1} | 0 \rangle = \cdots = |C|^2\, n!\, \langle 0 | 0 \rangle = |C|^2\, n!, \tag{56.25}$$
where in the iteration step we have commuted the last $b$ past the $n$ factors of $b^\dagger$, leading to $n (b^\dagger)^{n-1}$.
56.5 Fock Space for Fermions

For fermions, the definition of the occupied states $|n\rangle$ is
$$|n\rangle = (a^\dagger)^n |0\rangle, \tag{56.26}$$
where
$$a|n\rangle = \sqrt{n}\, |n - 1\rangle, \qquad a^\dagger|n\rangle = \sqrt{n + 1}\, |n + 1\rangle, \tag{56.27}$$
and where $n = 0$ or 1 only and $|2\rangle \equiv 0$. Note that then we also have $\sqrt{n!} = 1$. The multiparticle states are defined in the same way,
$$|n_1 \ldots n_\infty\rangle = (a_1^\dagger)^{n_1} \cdots (a_i^\dagger)^{n_i} \cdots (a_\infty^\dagger)^{n_\infty}\, |0\rangle. \tag{56.28}$$
However, because the operators anticommute for $i \neq j$,
$$\{a_i, a_j^\dagger\} = 0, \tag{56.29}$$
the action of the annihilation operator $a_i$ on the Fock space states differs from its action on a single state by a sign: before reaching $(a_i^\dagger)^{n_i}$ in this string, we have to anticommute $a_i$ past the creation operators to its left, creating a sign,
$$a_i |n_1 \ldots n_i \ldots n_\infty\rangle = (-1)^S \sqrt{n_i}\, |n_1 \ldots n_i - 1, \ldots n_\infty\rangle, \tag{56.30}$$
where
$$S = n_1 + n_2 + \cdots + n_{i-1}. \tag{56.31}$$
Similarly, the creation operator acts on the Fock space to produce the same sign,
$$a_i^\dagger |n_1 \ldots n_i \ldots n_\infty\rangle = (-1)^S \sqrt{n_i + 1}\, |n_1 \ldots n_i + 1, \ldots n_\infty\rangle. \tag{56.32}$$
Finally, the number operator acts without a sign,
$$N_i |n_1 \ldots n_i \ldots n_\infty\rangle = a_i^\dagger a_i |n_1 \ldots n_i \ldots n_\infty\rangle = n_i |n_1 \ldots n_i \ldots n_\infty\rangle. \tag{56.33}$$
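The role of the sign $(-1)^S$ can be made concrete with a small check. The sketch below assumes a finite number of modes and stores a state as a dictionary from occupation tuples to integer amplitudes (the helper names `a`, `a_dag`, `add`, and `anticommutator_holds` are illustrative, not from the text). It implements the sign rule with $|2\rangle \equiv 0$ and verifies $\{a_i, a_j^\dagger\} = \delta_{ij}$ on every basis state.

```python
from itertools import product

def a(i, state):
    """Apply a_i with the sign (-1)^S, S = n_1 + ... + n_{i-1}."""
    out = {}
    for occ, amp in state.items():
        if occ[i] == 0:
            continue  # nothing to annihilate
        sign = (-1) ** sum(occ[:i])
        new = occ[:i] + (0,) + occ[i + 1:]
        out[new] = out.get(new, 0) + sign * amp
    return out

def a_dag(i, state):
    """Apply a_i^dagger with the same sign; an occupied mode gives |2> = 0."""
    out = {}
    for occ, amp in state.items():
        if occ[i] == 1:
            continue
        sign = (-1) ** sum(occ[:i])
        new = occ[:i] + (1,) + occ[i + 1:]
        out[new] = out.get(new, 0) + sign * amp
    return out

def add(s1, s2):
    """Add two states, dropping components with zero amplitude."""
    keys = set(s1) | set(s2)
    total = {k: s1.get(k, 0) + s2.get(k, 0) for k in keys}
    return {k: v for k, v in total.items() if v != 0}

def anticommutator_holds(num_modes):
    """Check {a_i, a_j^dagger} = delta_ij on all 2**num_modes basis states."""
    for occ in product((0, 1), repeat=num_modes):
        state = {occ: 1}
        for i in range(num_modes):
            for j in range(num_modes):
                lhs = add(a(i, a_dag(j, state)), a_dag(j, a(i, state)))
                expected = state if i == j else {}
                if lhs != expected:
                    return False
    return True

check = anticommutator_holds(3)
```

Dropping the sign factor in `a` and `a_dag` makes the check fail for $i \neq j$, which is one way to see that the sign is what enforces the anticommutation relations across different modes.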
56.6 Schrödinger Equations for Fermions, and Generalizations

The analysis of the Schrödinger equation for fermions follows that for bosons, since we have the same definition of the action of creation and annihilation operators, but with the addition of the condition $|2\rangle = 0$ and with the sign $(-1)^S$ in intermediate equations. But at the end, when $|\psi(t)\rangle$ is re-formed, the sign disappears (we leave the details as an exercise). Then the Schrödinger equation for fermions has the same form as for bosons,
$$i\hbar\, \partial_t |\psi(t)\rangle = \hat H |\psi(t)\rangle, \tag{56.34}$$
where the Hamiltonian in the occupation number space is
$$\hat H = \sum_{i,i'} a_i^\dagger\, \langle i | \hat T | i' \rangle\, a_{i'} + \frac{1}{2} \sum_{i,j,i',j'} a_i^\dagger a_j^\dagger\, \langle i j | \hat V_{(2)} | i' j' \rangle\, a_{i'} a_{j'}. \tag{56.35}$$
We can then use a formalism where we treat both bosons and fermions together, with annihilation operator $c_i = (b_i, a_i)$ according to whether we have bosons or fermions, and similarly for the creation operators. Then the more general one-particle operator
$$\hat A = \sum_{i=1}^{N} \hat A^{(i)} \tag{56.36}$$
becomes in the occupation number picture
$$\hat A = \sum_{i,i'} c_i^\dagger\, \langle i | \hat A | i' \rangle\, c_{i'}. \tag{56.37}$$
Similarly the more general two-particle operator
$$\hat B = \frac{1}{2} \sum_{i,j=1}^{N} \hat B^{(ij)} \tag{56.38}$$
becomes in the occupation number picture
$$\hat B = \frac{1}{2} \sum_{i,j,i',j'} c_i^\dagger c_j^\dagger\, \langle i j | \hat B | i' j' \rangle\, c_{i'} c_{j'}. \tag{56.39}$$
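As a concrete illustration of (56.37) in the bosonic case $c_i = b_i$, the sketch below applies $\sum_{i,i'} b_i^\dagger \langle i|\hat A|i'\rangle b_{i'}$ to an occupation number state. The two-mode matrix `A`, the dictionary representation of states, and the names `hop` and `one_body` are assumptions made for this example only.

```python
from math import sqrt, isclose

def hop(i, ip, state):
    """Apply b_i^dagger b_{i'} to a bosonic state {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if occ[ip] == 0:
            continue
        factor = sqrt(occ[ip])        # b_{i'} removes a particle from i'
        mid = list(occ)
        mid[ip] -= 1
        factor *= sqrt(mid[i] + 1)    # b_i^dagger adds a particle to i
        mid[i] += 1
        key = tuple(mid)
        out[key] = out.get(key, 0.0) + factor * amp
    return out

def one_body(A, state):
    """Apply sum_{i,i'} b_i^dagger A[i][i'] b_{i'} to the state."""
    out = {}
    for i, row in enumerate(A):
        for ip, element in enumerate(row):
            if element == 0:
                continue
            for key, amp in hop(i, ip, state).items():
                out[key] = out.get(key, 0.0) + element * amp
    return out

# Hypothetical matrix elements <i|A|i'> for two one-particle states.
A = [[1.0, 0.5],
     [0.5, 2.0]]
state = {(2, 1): 1.0}
result = one_body(A, state)
```

The diagonal part returns the original state weighted by $\sum_i n_i \langle i|\hat A|i\rangle$, while each off-diagonal element moves one particle between modes; every component of `result` has the same total particle number as the input.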
56.7 Field Operators

The occupation number representation leads to so-called second quantization, which is regarded as the quantization of the wave function: instead of taking values in the complex numbers, the wave function takes values in operators (numbers are replaced with operators). However, that is actually a misnomer, since the correct procedure is to build quantum field theory, where the wave function is replaced by a quantum field. But here we are dealing with nonrelativistic quantum mechanics, so $v \ll c$, and there are no antiparticles in the theory, only particles. This means that we have annihilation and creation operators only for particles, $c_i$ and $c_i^\dagger$, but not the corresponding operators for antiparticles, as in relativistic quantum field theory. We build the field operators by multiplying the annihilation or creation operators with the single-particle wave functions $\phi_i(\vec r)$ and summing over states, i.e.,
$$\hat\psi(\vec r) = \sum_i \phi_i(\vec r)\, c_i \;\Rightarrow\; \hat\psi^\dagger(\vec r) = \sum_i \phi_i^*(\vec r)\, c_i^\dagger. \tag{56.40}$$
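More precisely, the above definitions are only for the boson case, $c_i = b_i$. In the case of fermions, there is an extra index for spin, $\sigma = \pm 1/2$ or $\sigma = 1, 2$,
$$\hat\psi_\sigma(\vec r) = \sum_i \phi_{i\sigma}(\vec r)\, a_{i,\sigma} \;\Rightarrow\; \hat\psi_\sigma^\dagger(\vec r) = \sum_i \phi_{i\sigma}^*(\vec r)\, a_{i,\sigma}^\dagger. \tag{56.41}$$
We can also write field operators involving spin wave functions,
$$\hat\psi(\vec r, s) = \sum_{\sigma=1,2} \hat\psi_\sigma(\vec r)\, \chi_\sigma(s), \qquad \hat\psi^\dagger(\vec r, s) = \sum_{\sigma=1,2} \hat\psi_\sigma^\dagger(\vec r)\, \chi_\sigma(s). \tag{56.42}$$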
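The field operators are operators in the occupation number Hilbert space, through $c_i$, $c_i^\dagger$, and, as such, they obey commutation and anticommutation relations derived from those for $c_i$ and $c_i^\dagger$,
$$[\hat\psi_\sigma(\vec r), \hat\psi_{\sigma'}^\dagger(\vec r\,')]_\mp = \delta_{\sigma\sigma'}\, \delta^3(\vec r - \vec r\,'), \qquad [\hat\psi_\sigma(\vec r), \hat\psi_{\sigma'}(\vec r\,')]_\mp = [\hat\psi_\sigma^\dagger(\vec r), \hat\psi_{\sigma'}^\dagger(\vec r\,')]_\mp = 0. \tag{56.43}$$
In the above, we have used the indices $\sigma$ appropriate for fermions. For bosons, we just drop all the $\sigma$ indices. Using the orthonormality and completeness of the single-particle wave functions $\phi_{i\sigma}(\vec r)$, we can rewrite the result of acting with one-particle operators in the occupation number picture as an integral with the field operators,
$$\hat A = \sum_{i,i'} c_i^\dagger\, \langle i | \hat A | i' \rangle\, c_{i'} = \int d^3r \sum_\sigma \hat\psi_\sigma^\dagger(\vec r)\, A(\vec r)\, \hat\psi_\sigma(\vec r). \tag{56.44}$$
Similarly, the two-particle operators in the occupation number picture also become an integral with field operators,
$$\hat B = \frac{1}{2} \sum_{i,j,i',j'} c_i^\dagger c_j^\dagger\, \langle i j | \hat B | i' j' \rangle\, c_{i'} c_{j'} = \frac{1}{2} \sum_{\sigma,\sigma'} \int d^3r \int d^3r'\, \hat\psi_\sigma^\dagger(\vec r)\, \hat\psi_{\sigma'}^\dagger(\vec r\,')\, B(\vec r, \vec r\,')\, \hat\psi_\sigma(\vec r)\, \hat\psi_{\sigma'}(\vec r\,'). \tag{56.45}$$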
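We could further generalize to operators that are nontrivial in their spin action, $\hat A_{\sigma\sigma'}$, $\hat B_{\sigma_1, \sigma_2, \sigma'_1, \sigma'_2}$, so that the relation becomes a matrix relation in $\sigma$ space. Remaining in the case of trivial dependence on spin, the Hamiltonian becomes
$$\hat H = \sum_\sigma \int d^3r\, \hat\psi_\sigma^\dagger(\vec r)\, T(\vec r)\, \hat\psi_\sigma(\vec r) + \frac{1}{2} \sum_{\sigma,\sigma'} \int d^3r \int d^3r'\, \hat\psi_\sigma^\dagger(\vec r)\, \hat\psi_{\sigma'}^\dagger(\vec r\,')\, V(\vec r, \vec r\,')\, \hat\psi_\sigma(\vec r)\, \hat\psi_{\sigma'}(\vec r\,'). \tag{56.46}$$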
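56.8 Example of Interaction: Coulomb Potential, as a Limit of the Yukawa Potential, for Spin 1/2 Fermions

Consider the example of the Yukawa interaction,
$$V(\vec r, \vec r\,') = e_0^2\, \frac{e^{-\mu |\vec r - \vec r\,'|}}{|\vec r - \vec r\,'|}, \tag{56.47}$$
which leads to the Coulomb potential in the limit $\mu \to 0$. Consider then the state labeled by $\vec k = \vec p/\hbar$ and spin $\sigma$, with spin wave functions
$$\chi_\sigma = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{56.48}$$
Then the kinetic energy in these states in momentum space is
$$\langle \vec k_1 \sigma_1 | T | \vec k'_1 \sigma'_1 \rangle = \frac{\hbar^2 k_1^2}{2m}\, \delta_{\sigma_1 \sigma'_1}\, \delta_{\vec k_1, \vec k'_1}. \tag{56.49}$$
The potential in the same states is
$$\langle \vec k_1 \sigma_1, \vec k_2 \sigma_2 | \hat V | \vec k'_1 \sigma'_1, \vec k'_2 \sigma'_2 \rangle = \frac{e_0^2}{V}\, \delta_{\sigma_1 \sigma'_1}\, \delta_{\sigma_2 \sigma'_2}\, \delta_{\vec k_1 + \vec k_2, \vec k'_1 + \vec k'_2}\, \frac{4\pi}{(\vec k_1 - \vec k'_1)^2 + \mu^2}. \tag{56.50}$$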
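In terms of the matrix elements, the Hamiltonian is
$$\begin{aligned}
\hat H &= \sum_{\vec k_1 \sigma_1} \sum_{\vec k'_1 \sigma'_1} \langle \vec k_1 \sigma_1 | T | \vec k'_1 \sigma'_1 \rangle\, a^\dagger_{\vec k_1 \sigma_1} a_{\vec k'_1 \sigma'_1} \\
&\quad + \sum_{\vec k_1 \sigma_1, \vec k_2 \sigma_2, \vec k'_1 \sigma'_1, \vec k'_2 \sigma'_2} a^\dagger_{\vec k_1 \sigma_1} a^\dagger_{\vec k_2 \sigma_2}\, \langle \vec k_1 \sigma_1, \vec k_2 \sigma_2 | \hat V | \vec k'_1 \sigma'_1, \vec k'_2 \sigma'_2 \rangle\, a_{\vec k'_1 \sigma'_1} a_{\vec k'_2 \sigma'_2}.
\end{aligned} \tag{56.51}$$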
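Taking the limit $\mu \to 0$ and substituting the matrix element, with the replacements $\vec k'_1 \to \vec k$, $\vec k'_2 \to \vec k'$, and $\vec k_1 - \vec k'_1 \to \vec q$, we get
$$H = \sum_{\vec k, \sigma} \frac{\hbar^2 k^2}{2m}\, a^\dagger_{\vec k \sigma} a_{\vec k \sigma} + \frac{e_0^2}{V} \sum_{\sigma, \sigma'} \sum_{\vec k, \vec k', \vec q} \frac{4\pi}{q^2}\, a^\dagger_{\vec k + \vec q, \sigma}\, a^\dagger_{\vec k' - \vec q, \sigma'}\, a_{\vec k', \sigma'}\, a_{\vec k, \sigma}. \tag{56.52}$$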
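The momentum-space factor $4\pi/(q^2 + \mu^2)$ in (56.50) is the Fourier transform of $e^{-\mu r}/r$, and its $\mu \to 0$ limit gives the Coulomb factor $4\pi/q^2$ in (56.52). The sketch below is a numerical sanity check under stated assumptions: units with $e_0^2 = 1$, the angular integration already done analytically so that only the radial integral $(4\pi/q)\int_0^\infty e^{-\mu r} \sin(qr)\, dr$ remains, and a finite cutoff `r_max`. The function names are illustrative.

```python
from math import exp, sin, pi

def yukawa_ft_numeric(q, mu, r_max=80.0, steps=200_000):
    """(4 pi / q) * integral_0^r_max exp(-mu r) sin(q r) dr, by Simpson's rule.

    `steps` must be even; the cutoff r_max is assumed large enough that
    exp(-mu * r_max) is negligible.
    """
    h = r_max / steps

    def f(r):
        return exp(-mu * r) * sin(q * r)

    total = f(0.0) + f(r_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return 4 * pi / q * total * h / 3

def yukawa_ft_exact(q, mu):
    """Closed form 4 pi / (q^2 + mu^2) appearing in eq. (56.50)."""
    return 4 * pi / (q * q + mu * mu)

q, mu = 1.3, 0.5
difference = abs(yukawa_ft_numeric(q, mu) - yukawa_ft_exact(q, mu))
```

For $\mu = 0$ the radial integral does not converge absolutely, which is one practical reason to work with the Yukawa form and take $\mu \to 0$ only at the end.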
Important Concepts to Remember
• If we have one-particle states in a multiparticle system, in the occupation number picture we can define creation and annihilation operators, $b_i^\dagger$, $b_i$, that act as for a harmonic oscillator.
• Similarly, for fermions, in the occupation number (equal to 0 or 1) picture, we can define creation and annihilation operators $a_i^\dagger$, $a_i$ that act as for a fermionic harmonic oscillator.
• In terms of $b_i$, $b_i^\dagger$, the occupation number picture Hamiltonian is $\hat H = \sum_{i,i'} b_i^\dagger \langle i | \hat T | i' \rangle b_{i'} + \frac{1}{2} \sum_{i,j,i',j'} b_i^\dagger b_j^\dagger \langle i j | \hat V_{(2)} | i' j' \rangle b_{i'} b_{j'}$.
• Fock space is the Hilbert space in the occupation number picture, for all possible values of the total number $N$.
• In the fermionic Fock space, $a_i |n_1 \ldots n_\infty\rangle = (-1)^S \sqrt{n_i}\, |n_1 \ldots n_{i-1}, n_i - 1, n_{i+1} \ldots n_\infty\rangle$, with $S = n_1 + \cdots + n_{i-1}$ and the same sign for $a_i^\dagger$.
• By analogy with (relativistic) quantum field theory, in fact as a nonrelativistic version of the same, we can construct field operators, as $\hat\psi(\vec r) = \sum_i \phi_i(\vec r)\, \hat c_i$, $c_i = (b_i, a_i)$, where for $c_i = a_i$ we actually have also a spin index, so that $\hat\psi_\sigma(\vec r) = \sum_i \phi_{i,\sigma}(\vec r)\, a_{i,\sigma}$ and we can write $\hat\psi(\vec r; s) = \sum_\sigma \hat\psi_\sigma(\vec r)\, \chi_\sigma(s)$.
• For the Coulomb potential, as a limit of the Yukawa potential, we find $\hat H = \sum_{\vec k, \sigma} \frac{\hbar^2 k^2}{2m}\, \hat a^\dagger_{\vec k, \sigma} \hat a_{\vec k, \sigma} + \frac{e_0^2}{V} \sum_{\sigma, \sigma'} \sum_{\vec k, \vec k', \vec q} \frac{4\pi}{q^2}\, \hat a^\dagger_{\vec k + \vec q, \sigma}\, \hat a^\dagger_{\vec k' - \vec q, \sigma'}\, a_{\vec k', \sigma'}\, a_{\vec k, \sigma}$.
Further Reading More details can be found in the book [27].
Exercises
(1) Find the eigenvalue of the operator $e^{\alpha b}$ on single-particle states.
(2) If the Hamiltonian acting on the occupation number states of a system can be written as $\hat H = \sum_i t_i\, b_i^\dagger b_i + V \sum_{i<j} b_i^\dagger b_j^\dagger b_{i+1} b_{j-1}$, how would you interpret this physically?
(3) Consider a system with seven one-particle states, and three bosons in the system. How many states are there in the Fock space? How many multiparticle states are possible?
(4) Find the eigenvalue of the fermionic operator $\exp\left[\left(\sum_i a_i\right)\alpha\right]$ in the fermionic Fock space.
(5) Show the details of the proof of the Schrödinger equation for the occupation number states for fermions, and show the resulting absence of a sign.
(6) Calculate the occupation number Hamiltonian for a delta function two-body interaction $V(\vec r_i - \vec r_j) = V_0\, \delta(\vec r_i - \vec r_j)$.
(7) Explain physically the Coulomb occupation number Hamiltonian (56.52) in terms of Feynman diagrams.
57
The Hartree–Fock Approximation and Other Occupation Number Approximations

In this chapter, we will consider a self-consistent approximation method for multiparticle systems, the Hartree–Fock approximation. It arises from the multiparticle picture, either the original one or, better, the occupation number picture. We will make connections with the original derivation of Hartree and Fock. Then we sketch the general ideas of the other approximations in the occupation number picture: the usual perturbation theory in the style of nonrelativistic quantum field theory, and the self-consistent Bethe–Salpeter equation.
57.1 Set-Up of the Problem

We would like to solve perturbatively for the energies of multiparticle states. The most relevant case is that of multi-electron states, meaning we have fermionic one-particle states. The perturbative solution is via a self-consistent equation, where the Hamiltonian contains field operators for the rest of the fermions (electrons) other than the one whose interaction we are analyzing. We will start directly with the occupation number Hamiltonian derived in the previous chapter,
$$\begin{aligned}
\hat H &= \hat H_0 + \hat H_1, \\
\hat H_0 &= \sum_{i,i'} \langle i | \hat T_{(1)} | i' \rangle\, a_i^\dagger a_{i'}, \\
\hat H_1 &= -\frac{1}{2} \sum_{i,j,i',j'} a_i^\dagger a_j^\dagger\, \langle i j | \hat V_{(2)} | i' j' \rangle\, a_{i'} a_{j'}.
\end{aligned} \tag{57.1}$$
Note that the minus sign in front of the expression for $\hat H_1$ is there because the particles are fermions (the general formula for bosons or fermions with the above ordering has $\pm$ in front). Because of the Fermi–Dirac antisymmetric statistics, we can rewrite $\hat H_1$ equivalently by reordering $a_{i'} a_{j'}$ as $-a_{j'} a_{i'}$, followed by a relabeling of the summation variables, $i'$ as $j'$ and $j'$ as $i'$. Averaging over the two equivalent forms, we substitute in $\hat H_1$
$$\langle i j | \hat V_{(2)} | i' j' \rangle \to \frac{1}{2} \left( \langle i j | \hat V_{(2)} | i' j' \rangle - \langle i j | \hat V_{(2)} | j' i' \rangle \right). \tag{57.2}$$
Equivalently, consider in the original formulation the basis $N$-particle state
$$|\psi\rangle = |E_1\rangle_1 \otimes |E_2\rangle_2 \otimes \cdots \otimes |E_N\rangle_N, \tag{57.3}$$
and antisymmetrize it to obtain the Slater determinant,
$$|\bar\psi\rangle = \frac{1}{\sqrt{N!}} \sum_{\text{perms. } P} \operatorname{sgn}(P)\, P |\psi\rangle. \tag{57.4}$$
We consider the average of $\hat H_1$ for the Slater determinant state $|\bar\psi\rangle$. Then we define $\hat V^{(ij)}$ as the two-particle interaction potential, acting on the left on the $i$th single-particle Hilbert space, and on the right on the $j$th single-particle Hilbert space, so that
$$\langle \bar\psi | \hat H_1 | \bar\psi \rangle = \frac{1}{2} \sum_{i,j} \langle \bar\psi | \hat V^{(ij)} | \bar\psi \rangle. \tag{57.5}$$
Then, for $|\bar\psi\rangle$ containing states with $E_i$ and $E_j$, this results (after using the orthonormality of the remaining one-particle states in $|\psi\rangle$ and $|\bar\psi\rangle$, $\langle i | j \rangle = \delta_{ij}$) in a matrix element for two-particle states,
$$\langle \bar\psi | \hat H_1 | \bar\psi \rangle = \frac{1}{2} \sum_{i,j} \left( \langle E_i E_j | \hat V^{(ij)} | E_i E_j \rangle - \langle E_i E_j | \hat V^{(ij)} | E_j E_i \rangle \right). \tag{57.6}$$
This is the same result as that from the occupation number method above. On the other hand, transitioning from the above original representation result, by writing
$$|\psi\rangle = \ldots a_i^\dagger \ldots a_j^\dagger \ldots |0\rangle, \tag{57.7}$$
means that we can relate the $N$-particle matrix element to the two-particle one in the two-particle operator,
$$\langle \bar\psi | \hat V^{(ij)} | \bar\psi \rangle = \frac{1}{2} \left( \langle i j | \hat V_{(2)} | i j \rangle - \langle i j | \hat V_{(2)} | j i \rangle \right). \tag{57.8}$$
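57.2 Derivation of the Hartree–Fock Equation

We want to solve the time-independent Schrödinger equation, $\hat H |\psi\rangle = E |\psi\rangle$, which means that we need to consider
$$\langle \psi | (\hat H - E) | \psi \rangle = 0. \tag{57.9}$$
Varying the matrix equation, we obtain
$$\langle \delta\psi | (\hat H - E) | \psi \rangle = 0. \tag{57.10}$$
If the state is
$$|\psi\rangle = a_{i_1}^\dagger a_{i_2}^\dagger \cdots a_{i_N}^\dagger |0\rangle, \tag{57.11}$$
then its variation is
$$|\delta\psi\rangle = \delta a_{i_1}^\dagger a_{i_2}^\dagger \cdots a_{i_N}^\dagger |0\rangle + a_{i_1}^\dagger \delta a_{i_2}^\dagger \cdots a_{i_N}^\dagger |0\rangle + \cdots + a_{i_1}^\dagger \cdots a_{i_{N-1}}^\dagger \delta a_{i_N}^\dagger |0\rangle. \tag{57.12}$$
The variation occurs through a unitary transformation of the single-particle states by $\hat U$, with $\hat U \hat U^\dagger = 1$, such that
$$\hat U \simeq 1 + i \hat K \;\Rightarrow\; \hat K^\dagger = \hat K. \tag{57.13}$$
The transformation $\hat U$ acts on states, and therefore on the annihilation and creation operators, mixing the various $i$'s, so that
$$a_i \to U_{ij}\, a_j, \qquad a_i^\dagger \to U^\dagger_{ij}\, a_j^\dagger. \tag{57.14}$$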
For an infinitesimal transformation,
$$\delta a_i^\dagger = -i \sum_j K_{ji}\, a_j^\dagger. \tag{57.15}$$
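We can have two types of transformation:
(a) If $\delta a_i^\dagger \propto a_i^\dagger$ only, it means that $|\delta\psi\rangle \propto |\psi\rangle$, so (57.10) becomes just the usual Schrödinger equation,
$$\langle \psi | \hat H | \psi \rangle = E \langle \psi | \psi \rangle. \tag{57.16}$$
(b) If $\delta a_i^\dagger$ contains other $a_j^\dagger$, meaning $K_{ji} \neq 0$ for $i \neq j$, then $\langle \delta\psi | \psi \rangle = 0$ and we obtain
$$\langle \delta\psi | \hat H | \psi \rangle = 0. \tag{57.17}$$
The interpretation of this result is as follows. In the sector of states $|\psi\rangle$ of the form (57.11) in which only one creation operator $a_i^\dagger$ changes from $i$ to $j$ (in each term of $|\delta\psi\rangle$ only one factor $\delta a_i^\dagger \propto \sum_j K_{ji}\, a_j^\dagger$ appears, the other creation operators being untouched), the matrix element of $\hat H$ vanishes. In particular, for relevant combinations of $(ij, i'j')$,
$$\langle \delta\psi | \hat H_{(2)} | \psi \rangle = \langle 0 | \ldots a_i \ldots a_j \ldots | \hat H_{(2)} | \ldots a_{i'}^\dagger \ldots a_{j'}^\dagger \ldots | 0 \rangle = \langle i j | \hat H_{(2)} | i' j' \rangle. \tag{57.18}$$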
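This means that the terms in the occupation number Hamiltonian $\hat H$ that have only one $a_i^\dagger$-change, that is (by the previous equation), those for which $\langle i j | \hat H_{(2)} | i' j' \rangle$ has an index on the left equal to one on the right, must vanish. These terms are
$$\begin{aligned}
&-\frac{1}{2} \sum_{i,j,j'} \langle i j | \hat V_{(2)} | i j' \rangle\, a_i^\dagger a_j^\dagger a_i a_{j'} - \frac{1}{2} \sum_{i,j,i'} \langle i j | \hat V_{(2)} | i' i \rangle\, a_i^\dagger a_j^\dagger a_{i'} a_i \\
&-\frac{1}{2} \sum_{i,j,j'} \langle i j | \hat V_{(2)} | j j' \rangle\, a_i^\dagger a_j^\dagger a_j a_{j'} - \frac{1}{2} \sum_{i,j,i'} \langle i j | \hat V_{(2)} | i' j \rangle\, a_i^\dagger a_j^\dagger a_{i'} a_j \\
&= -\sum_{k,i,i'} \Big[ -\frac{1}{2} \langle k i | \hat V_{(2)} | k i' \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} + \frac{1}{2} \langle k i | \hat V_{(2)} | i' k \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} \\
&\qquad\qquad + \frac{1}{2} \langle i k | \hat V_{(2)} | k i' \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} - \frac{1}{2} \langle i k | \hat V_{(2)} | i' k \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} \Big] \\
&= -\sum_{k,i,i'} \Big[ -\frac{1}{2} \langle i k | \hat V_{(2)} | i' k \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} + \frac{1}{2} \langle i k | \hat V_{(2)} | k i' \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} \\
&\qquad\qquad + \frac{1}{2} \langle i k | \hat V_{(2)} | k i' \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} - \frac{1}{2} \langle i k | \hat V_{(2)} | i' k \rangle\, a_i^\dagger a_k^\dagger a_k a_{i'} \Big],
\end{aligned} \tag{57.19}$$
where in the first equality we have anticommuted $a^\dagger$ past $a^\dagger$ or $a$ past $a$ (with different indices), and then relabeled the summation indices as the common set $(k i i')$, and in the second equality we have used $\langle i j | \hat V_{(2)} | k l \rangle = \langle j i | \hat V_{(2)} | l k \rangle$ in the first two terms. Then the relevant terms in the Hamiltonian are
$$\sum_{i i'} a_i^\dagger \left[ \sum_k a_k^\dagger a_k\, \langle i k | \hat V_{(2)} | i' k \rangle - \sum_k a_k^\dagger a_k\, \langle i k | \hat V_{(2)} | k i' \rangle \right] a_{i'}, \tag{57.20}$$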
and since $a_k^\dagger a_k = N_k$ is the occupation number in mode $k$, which is 1 for an occupied state and 0 for an unoccupied state, we can replace it by changing the sum over all $k$ to a sum over occupied states $k$,
$$\sum_{ii'} a_i^\dagger\Big[\sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|i'k\rangle - \sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|ki'\rangle\Big]a_{i'} \equiv \sum_{ii'} a_i^\dagger\,\langle i|\hat V_{\rm H–F}|i'\rangle\, a_{i'}, \tag{57.21}$$
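where we have defined the matrix element of the Hartree–Fock potential $\hat V_{\rm H–F}$,
$$\langle i|\hat V_{\rm H–F}|i'\rangle = \sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|i'k\rangle - \sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|ki'\rangle. \tag{57.22}$$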
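As a minimal sketch of how (57.22) is assembled in practice, the following Python fragment builds the matrix $\langle i|\hat V_{\rm H–F}|i'\rangle$ from a table of two-particle matrix elements. The function names, the storage convention `v2[i][j][k][l]` for $\langle ij|\hat V^{(2)}|kl\rangle$, and the toy interaction are all assumptions made for illustration; they are not taken from the text.

```python
# Toy illustration of (57.22); all numbers and names here are hypothetical.

def hartree_fock_matrix(v2, occupied):
    """Return <i|V_HF|i'> = sum_{k occ} ( <ik|V|i'k> - <ik|V|ki'> ).

    v2[i][j][k][l] stores the two-particle matrix element <ij|V|kl>.
    """
    n = len(v2)
    vhf = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for ip in range(n):
            direct = sum(v2[i][k][ip][k] for k in occupied)    # Hartree part
            exchange = sum(v2[i][k][k][ip] for k in occupied)  # Fock part
            vhf[i][ip] = direct - exchange
    return vhf


def toy_interaction(n):
    """Hypothetical matrix elements <ij|V|kl> = f(i,k) * f(j,l)."""
    def f(a, b):
        return 1.0 / (1.0 + abs(a - b))
    return [[[[f(i, k) * f(j, l) for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]


V2 = toy_interaction(3)
VHF = hartree_fock_matrix(V2, occupied=[0, 1])
```

With a single occupied state, the diagonal element for that state vanishes because the direct and exchange contributions cancel, which is the familiar absence of self-interaction in the Hartree–Fock potential.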
Thus the condition for the terms in the Hamiltonian with only one transition of $a_i$ to vanish is
$$\langle i|\hat T^{(1)}|j\rangle + \langle i|\hat V_{\rm H–F}|j\rangle = 0, \quad\text{for } i\neq j. \tag{57.23}$$
Therefore $\hat T^{(1)} + \hat V_{\rm H–F}$ is diagonal, or in other words the one-particle states are eigenstates for it,
$$(\hat T^{(1)} + \hat V_{\rm H–F})|i\rangle = E_i|i\rangle. \tag{57.24}$$
This is the Hartree–Fock equation for the single-particle Hamiltonian ($\hat V_{\rm H–F}$ is a single-particle operator), which is to be solved iteratively for the eigenstates $|i\rangle$ and the eigenenergies $E_i$. So the new Hamiltonian in occupation number space is
$$\hat H_{\rm H–F} = \sum_{i,i'}\langle i|\hat T^{(1)}|i'\rangle\, a_i^\dagger a_{i'} + \sum_{i,i'}\langle i|\hat V_{\rm H–F}|i'\rangle\, a_i^\dagger a_{i'}. \tag{57.25}$$
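57.3 Hartree and Fock Terms

Writing the Schrödinger equation for wave functions, we obtain
$$\Big[-\frac{\hbar^2}{2m}\Delta_{\vec r} + V(\vec r)\Big]\phi_i(\vec r) + V^{(1)}_{\rm H–F}(\vec r)\phi_i(\vec r) - \int d^3r'\, V^{(2,\rm exch)}_{\rm H–F}(\vec r,\vec r')\phi_i(\vec r') = E_i\phi_i(\vec r). \tag{57.26}$$
In this equation, Hartree considered the first term in $V_{\rm H–F}$, and Fock added the second term. The equation is obtained as follows. The first term,
$$\sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|i'k\rangle = \langle i|\hat V^{(1)}_{\rm H–F}|i'\rangle, \tag{57.27}$$
becomes, in wave functions, on introducing the completeness relation for $|\vec r\rangle$,
$$\int d^3r\, \phi_i^*(\vec r)V^{(1)}_{\rm H–F}\phi_{i'}(\vec r) = \sum_{k\ \mathrm{occ.}}\int d^3r\, \phi_i^*(\vec r)\int d^3r'\, \phi_k^*(\vec r')V^{(2)}(\vec r,\vec r')\phi_k(\vec r')\phi_{i'}(\vec r), \tag{57.28}$$
which defines the Hartree potential,
$$V_{\rm H}(\vec r) = V^{(1)}_{\rm H–F}(\vec r) = \sum_{k\ \mathrm{occ.}}\int d^3r'\, |\phi_k(\vec r')|^2\, V^{(2)}(\vec r,\vec r') = \int d^3r'\, \rho(\vec r')V^{(2)}(\vec r,\vec r'). \tag{57.29}$$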
We have defined the probability density $\rho(\vec r') = \sum_{k\ \mathrm{occ.}}|\phi_k(\vec r')|^2$ of the electrons. Multiplying by $e$, to give $\rho_e = e\rho$, we obtain the electric charge density of the electrons, which means that this first term has a classical interpretation. The second term is called the exchange term; it is a purely quantum term, without a classical analog. In it,
$$\sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|ki'\rangle = \langle i|\hat V^{(2)}_{\rm H–F}|i'\rangle \tag{57.30}$$
becomes, in wave functions,
$$\int d^3r\, \phi_i^*(\vec r)V^{(2)}_{\rm H–F}\phi_{i'}(\vec r) = \sum_{k\ \mathrm{occ.}}\int d^3r\, \phi_i^*(\vec r)\int d^3r'\, \phi_k^*(\vec r')V^{(2)}(\vec r,\vec r')\phi_{i'}(\vec r')\phi_k(\vec r), \tag{57.31}$$
which defines the Fock “exchange” potential acting on a wave function,
$$V_{\rm F}\phi_i(\vec r) = V^{(2)}_{\rm H–F}\phi_i(\vec r) = \sum_{k\ \mathrm{occ.}}\int d^3r'\, \phi_k^*(\vec r')\phi_k(\vec r)\, V^{(2)}(\vec r,\vec r')\phi_i(\vec r') \equiv \int d^3r'\, V^{(2,\rm exch)}_{\rm H–F}(\vec r,\vec r')\phi_i(\vec r'). \tag{57.32}$$
Here we have defined the two-particle exchange term of the potential,
$$V^{(2,\rm exch)}_{\rm H–F}(\vec r,\vec r') = \sum_{k\ \mathrm{occ.}}\phi_k^*(\vec r')\phi_k(\vec r)V^{(2)}(\vec r,\vec r'). \tag{57.33}$$
This completes our definition of the Hartree–Fock equation for wave functions. We now specialize to the case of the Coulomb interaction,
$$V = \frac{e_0^2}{|\vec r - \vec r'|}. \tag{57.34}$$
Then keeping only the first of the Hartree–Fock terms, we obtain the Hartree equation,
$$\Big[-\frac{\hbar^2}{2m}\Delta + V(\vec r) + V_{\rm H}(\vec r)\Big]\phi_i(\vec r) = E_i\phi_i(\vec r). \tag{57.35}$$
This has a semiclassical interpretation, in which the Hartree term $V_{\rm H}$ is a classical potential with a quantum charge density,
$$V_{\rm H}(\vec r) = \int d^3r'\, \rho(\vec r')V(\vec r - \vec r') = \int d^3r'\, \rho_e(\vec r')\frac{e_0}{|\vec r - \vec r'|}, \tag{57.36}$$
where $\rho_e(\vec r') = e_0\rho(\vec r')$ is the electric charge density. On the other hand, the second term, the Fock term, is due to the Coulomb exchange potential
$$V^{(2,\rm exch)}_{\rm H–F}(\vec r,\vec r') = e_0\sum_{k\ \mathrm{occ.}}\phi_k^*(\vec r')\phi_k(\vec r)\frac{e_0}{|\vec r - \vec r'|}, \tag{57.37}$$
leading to
$$V_{\rm F}\phi_i(\vec r) = \int d^3r'\, V^{(2,\rm exch)}_{\rm H–F}(\vec r,\vec r')\phi_i(\vec r'). \tag{57.38}$$
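To make the Hartree term (57.36) concrete, here is a rough numerical sketch for a single spherically symmetric density. It assumes units in which $e_0 = 1$ and the Bohr radius is 1, and it takes the hydrogen-like 1s density $e^{-2r}/\pi$ as an illustrative input; the function names, grid sizes and the cutoff `r_max` are choices made here, not part of the text.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for step in range(1, n):
        total += f(a + step * h)
    return total * h


def density_1s(r):
    """Assumed density |phi_1s|^2 = exp(-2r)/pi (Bohr radius set to 1)."""
    return math.exp(-2.0 * r) / math.pi


def hartree_potential(r, r_max=30.0, n=20000):
    """V_H(r) = int d^3r' rho(r') / |r - r'| for a spherical density, e_0 = 1.

    For spherically symmetric rho the angular integral gives
    V_H(r) = 4 pi [ (1/r) int_0^r rho r'^2 dr' + int_r^inf rho r' dr' ].
    """
    inner = trapezoid(lambda s: density_1s(s) * s * s, 0.0, r, n)
    outer = trapezoid(lambda s: density_1s(s) * s, r, r_max, n)
    return 4.0 * math.pi * (inner / r + outer)


def hartree_exact(r):
    """Closed form for the same density, used only as a cross-check."""
    return 1.0 / r - (1.0 + 1.0 / r) * math.exp(-2.0 * r)
```

At large $r$ the result approaches $1/r$, the potential of a unit point charge, which is the classical behavior one expects from (57.36).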
57.4 Other Approximations in the Occupation Number Picture

Besides the Hartree–Fock approximation, in which the Hartree–Fock equations are solved iteratively, we can also employ perturbation theory, but not for single-particle wave functions. Instead, we use a perturbation theory in terms of the field operators for the occupation number. It is in fact the nonrelativistic version of the formalism of quantum field theory. Here we will give just a rough outline of the method. In order to define perturbation theory, we use the Dirac picture, with respect to the free Hamiltonian, in the occupation number version,
$$\hat H_0 = \sum_i \hbar\omega_i\, \hat c_i^\dagger\hat c_i, \tag{57.39}$$
where
$$\hbar\omega_i\,\delta_{ii'} = \langle i|\hat T^{(1)}|i'\rangle, \tag{57.40}$$
and $\hat T^{(1)}$ contains the kinetic energy and the one-particle potential. The operators in the Dirac (interaction) picture are (in terms of the Schrödinger picture operator $\hat A_S$)
$$\hat A_I(t) = e^{i\hat H_0 t/\hbar}\hat A_S e^{-i\hat H_0 t/\hbar}, \tag{57.41}$$
while the operators in the Heisenberg picture are
$$\hat A_H(t) = e^{i\hat H t/\hbar}\hat A_S e^{-i\hat H t/\hbar}. \tag{57.42}$$
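Because $\hat H_0$ in (57.39) is diagonal in the occupation number basis, the interaction picture (57.41) only attaches phases to matrix elements. The short sketch below illustrates this for one fermionic mode; it assumes units with $\hbar = 1$, and the helper name `interaction_picture` and the numerical values are made up for the example.

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1


def interaction_picture(a_s, energies, t):
    """A_I(t) = exp(i H0 t/hbar) A_S exp(-i H0 t/hbar) for diagonal H0.

    In the eigenbasis of H0 each matrix element just picks up a phase:
    (A_I)_{ij} = exp(i (E_i - E_j) t / hbar) * (A_S)_{ij}.
    """
    n = len(energies)
    return [[cmath.exp(1j * (energies[i] - energies[j]) * t / HBAR) * a_s[i][j]
             for j in range(n)] for i in range(n)]


# One fermionic mode with basis (|0>, |1>) and H0 = hbar * omega * c^dagger c.
omega = 2.0
energies = [0.0, HBAR * omega]
c_s = [[0.0, 1.0], [0.0, 0.0]]  # annihilation operator: c|1> = |0>
c_i = interaction_picture(c_s, energies, t=0.3)
```

The only nonzero element of `c_i` carries the phase $e^{-i\omega t}$, i.e. $\hat c_{I}(t) = \hat c\, e^{-i\omega t}$ for this mode.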
Then we can define the creation and annihilation operators in the Dirac picture, $\hat c^\dagger_{i,I}(t)$, $\hat c_{i,I}(t)$. A crucial object in the theory is the Green’s function, defined as
$$G_{\sigma\sigma'}(\vec r,t;\vec r',t') \equiv -i\,\frac{\langle\psi_0|T\{\hat\psi_{H,\sigma}(\vec r,t)\hat\psi^\dagger_{H,\sigma'}(\vec r',t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}. \tag{57.43}$$
This will have an expansion similar to the single-particle Green’s function, generalizing
$$\hat G(z) \sim \sum_i \frac{|i\rangle\langle i|}{z - E_i}, \tag{57.44}$$
implying that the Green’s function has poles at the energies of the bound states $E_i$. Moreover, the Green’s function is part of a more general correlation function of generic operators $\hat A$, $\hat B$ in the Heisenberg picture; for the ground state wave function $|\psi_0\rangle$ we have
$$\frac{\langle\psi_0|T\{\hat A_H(t)\hat B_H(t')\}|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle}. \tag{57.45}$$
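The pole structure in (57.44) can be checked directly on a small matrix. The sketch below, for an illustrative real symmetric $2\times 2$ Hamiltonian, compares the resolvent $(z-\hat H)^{-1}$ with the spectral sum $\sum_i |i\rangle\langle i|/(z-E_i)$; the function names and the sample matrix are assumptions made here.

```python
import math


def resolvent(h, z):
    """G(z) = (z - H)^(-1) for a 2x2 matrix H, by explicit inversion."""
    a, b = z - h[0][0], -h[0][1]
    c, d = -h[1][0], z - h[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def spectral_sum(h, z):
    """Sum_i |i><i| / (z - E_i) for a real symmetric 2x2 matrix H."""
    mean = 0.5 * (h[0][0] + h[1][1])
    diff = 0.5 * (h[0][0] - h[1][1])
    gap = math.hypot(diff, h[0][1])
    angle = 0.5 * math.atan2(h[0][1], diff)  # mixing angle of the eigenvectors
    pairs = [(mean + gap, (math.cos(angle), math.sin(angle))),
             (mean - gap, (-math.sin(angle), math.cos(angle)))]
    g = [[0.0, 0.0], [0.0, 0.0]]
    for energy, vec in pairs:
        for r in range(2):
            for s in range(2):
                g[r][s] += vec[r] * vec[s] / (z - energy)
    return g


H = [[1.0, 0.5], [0.5, 2.0]]
z = 0.3 + 0.1j
G_direct = resolvent(H, z)
G_spectral = spectral_sum(H, z)
```

Evaluating `resolvent` at a point very close to one of the eigenvalues gives a large matrix element, which is the numerical signature of the pole at $z = E_i$.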
The “Feynman theorem” will relate these correlation functions to those of the interaction picture operators $\hat A_I(t)$, $\hat B_I(t)$ and $|\phi_0\rangle$, the ground state of $\hat H_0$. Then the Green’s functions are related to observables. For one-particle operators,
$$\hat A_{\sigma\sigma'} = \int d^3r\, \hat a_{\sigma\sigma'}(\vec r), \tag{57.46}$$
the average in the ground state is
$$\langle a(\vec r)\rangle \equiv \langle\psi_0|\hat a_{\sigma\sigma'}(\vec r)|\psi_0\rangle = -i\lim_{\vec r'\to\vec r,\ t'\to t}\sum_{\sigma,\sigma'}A_{\sigma,\sigma'}G_{\sigma',\sigma}(\vec r,t;\vec r',t') = -i\,\mathrm{Tr}[AG]. \tag{57.47}$$
Similarly, the average in the ground state of the two-particle operator is
$$\langle\hat B\rangle = -\frac12\sum_{\sigma_1\sigma_2\sigma_1'\sigma_2'}\int d^3r\, d^3r'\, B_{\sigma_1\sigma_1'\sigma_2\sigma_2'}(\vec r,\vec r')\lim_{t'\to t}G_{\sigma_2\sigma_2',\sigma_1\sigma_1'}(\vec r,t;\vec r',t'). \tag{57.48}$$

Figure 57.1 (a) Hartree–Fock equation for Green’s functions. (b) Bethe–Salpeter equation for four-point Green’s functions.
Hartree–Fock versus Bethe–Salpeter Equation

We can consider a self-consistent equation for the standard perturbation theory approximation, the Hartree–Fock equation for the Green’s functions,
$$G^{(2)} \simeq G_0^{(2)} + G_0^{(2)}\cdot V^{(2)}_{\rm H–F}\cdot G^{(2)}, \tag{57.49}$$
which is depicted pictorially in Fig. 57.1a. As can be seen, the equation is for two-point Green’s functions, i.e., propagators. Alternatively, we can write a Bethe–Salpeter equation, or Dyson equation, for the four-point Green’s functions $G^{(4)}$. This equation for $G^{(4)}$ is
$$G^{(4)} = G^{(2)}_{0,1}G^{(2)}_{0,2} + K^{(4)}_{\rm B–S}\cdot G^{(2)}_{0,1}G^{(2)}_{0,2}\cdot G^{(4)}, \tag{57.50}$$
depicted pictorially in Fig. 57.1b. However, as we saw, the Green’s function has poles at the corrected bound states $E_i$, and near the pole at $i$, the Green’s function becomes
$$\hat G(z) \simeq \frac{|i\rangle\langle i|}{z - E_i}, \tag{57.51}$$
which can be generalized to the case of several particles, in particular to the propagation of two particles, described by the four-point Green’s functions $\hat G^{(4)}$. Near a pole it has a similar form,
$$\hat G^{(4)}(E) \sim \frac{|(ij)\rangle\langle(ij)|}{E - E_i}, \tag{57.52}$$
which means that the Bethe–Salpeter equation for the two-particle wave function $|(ij)\rangle$ is
$$|(ij)\rangle \simeq K^{(4)}_{\rm B–S}\cdot G^{(2)}_{0,1}G^{(2)}_{0,2}\,|(ij)\rangle. \tag{57.53}$$
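The self-consistent structure of (57.49) and (57.50), in which the full Green’s function appears on both sides, can be illustrated with a scalar toy model: one level at energy `e0` and a constant “potential” `v`. This is only a sketch under those simplifying assumptions (all names and numbers are hypothetical); in the real equations the products are convolutions over coordinates and spins.

```python
def dyson_iterate(g0, v, steps):
    """Iterate G <- G0 + G0 * V * G, starting from G = G0 (scalar toy model)."""
    g = g0
    for _ in range(steps):
        g = g0 + g0 * v * g
    return g


def dyson_closed_form(g0, v):
    """Solve G = G0 + G0 V G directly: G = G0 / (1 - G0 V)."""
    return g0 / (1.0 - g0 * v)


# Hypothetical numbers: one level at energy e0, constant potential v.
e0, v, z = 1.0, 0.2, 3.0 + 0.5j
g0 = 1.0 / (z - e0)
g_iter = dyson_iterate(g0, v, steps=60)
g_exact = dyson_closed_form(g0, v)  # equals 1 / (z - e0 - v)
```

The iteration resums the geometric series $G_0 + G_0VG_0 + G_0VG_0VG_0 + \dots$, and the closed form shows the effect of the resummation: the pole moves from $z = e_0$ to $z = e_0 + v$. The iteration converges only when $|G_0 V| < 1$, which holds for the values chosen here.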
Important Concepts to Remember

• The Hartree–Fock approximation is an approximation for the energy of an electron in a multielectron state, using the occupation number picture, via a self-consistent equation with field operators for the other electrons.
• The Hartree–Fock potential $\hat V_{\rm H–F}$ is defined by $\langle i|\hat V_{\rm H–F}|i'\rangle = \sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|i'k\rangle - \sum_{k\ \mathrm{occ.}}\langle ik|\hat V^{(2)}|ki'\rangle$.
• The Hartree–Fock equation states that then the Hamiltonian is diagonal, so $(\hat T^{(1)} + \hat V_{\rm H–F})|i\rangle = E_i|i\rangle$, and the mean-field Hartree–Fock Hamiltonian in occupation number space is $\hat H_{\rm H–F} = \sum_{i,i'}\langle i|\hat T^{(1)}|i'\rangle a_i^\dagger a_{i'} + \sum_{i,i'}\langle i|\hat V_{\rm H–F}|i'\rangle a_i^\dagger a_{i'}$.
• On wave functions, from $\hat V_{\rm H–F}$ we obtain two terms, the direct term or Hartree potential, $V^{(1)}_{\rm H–F}\phi_i(\vec r)$, with $V_{\rm H}(\vec r) = V^{(1)}_{\rm H–F}(\vec r) = \int d^3r' \sum_{k\ \mathrm{occ.}}|\phi_k(\vec r')|^2 V^{(2)}(\vec r,\vec r') = \int d^3r'\, \rho(\vec r')V^{(2)}(\vec r,\vec r')$, and the exchange or Fock term, $V_{\rm F}\phi_i(\vec r) = V^{(2)}_{\rm H–F}\phi_i(\vec r) = \int d^3r'\, V^{(2,\rm exch.)}_{\rm H–F}(\vec r,\vec r')\phi_i(\vec r')$, with $V^{(2,\rm exch.)}_{\rm H–F}(\vec r,\vec r') = \sum_{k\ \mathrm{occ.}}\phi_k^*(\vec r')\phi_k(\vec r)V^{(2)}(\vec r,\vec r')$.
• The Hartree equation contains only the Hartree potential, as a semiclassical contribution from the charge of the other electrons, $\int d^3r'\, \rho(\vec r')V(\vec r - \vec r')$, while the Fock term contains the purely quantum exchange term.
• In the occupation number picture, we can also introduce a nonrelativistic version of quantum field theory, via field operators. Interaction picture operators, $\hat A_I(t) = e^{i\hat H_0 t/\hbar}\hat A_S e^{-i\hat H_0 t/\hbar}$, are used for perturbation theory, and Heisenberg picture operators, $\hat A_H(t) = e^{i\hat H t/\hbar}\hat A_S e^{-i\hat H t/\hbar}$, are used for definitions.
• Green’s functions $G_{\sigma\sigma'}(\vec r,t;\vec r',t') \equiv -i\,\langle\psi_0|T\{\hat\psi_{H,\sigma}(\vec r,t)\hat\psi^\dagger_{H,\sigma'}(\vec r',t')\}|\psi_0\rangle/\langle\psi_0|\psi_0\rangle$ and more general correlation functions $\langle\psi_0|T\{\hat A_H(t)\hat B_H(t')\}|\psi_0\rangle/\langle\psi_0|\psi_0\rangle$ can be expanded, via the Feynman theorem, using the perturbation theory of interaction picture operators; one-particle operators $\hat A = \int d^3r\, a_{\sigma\sigma'}(\vec r)$ have an average in the ground state $\langle a\rangle = -i\,\mathrm{Tr}[AG]$, and similar formulas hold for operators for higher numbers of particles.
• One can write a Hartree–Fock equation for (two-point) Green’s functions, $G^{(2)} \simeq G_0^{(2)} + G_0^{(2)}\cdot V^{(2)}_{\rm H–F}\cdot G^{(2)}$, or alternatively a Dyson equation, or version of the Bethe–Salpeter equation, for the four-point Green’s function, $G^{(4)} = G^{(2)}_{0,1}G^{(2)}_{0,2} + K^{(4)}_{\rm B–S}\cdot G^{(2)}_{0,1}G^{(2)}_{0,2}\cdot G^{(4)}$.

Further Reading

More details are given in the books [2] and [27].
Exercises

(1) Consider bosons instead of fermions, and set up the corresponding problem for a self-consistent approximation.
(2) Continue with this method to find the equivalent of the Hartree–Fock potential $\hat V_{\rm H–F}$ for bosons.
(3) Given the Hartree–Fock equation, rewrite the Hartree–Fock Hamiltonian.
(4) Calculate the Hartree potential for the helium atom in its ground state.
(5) Calculate the two-particle exchange term in the Hartree–Fock approximation for the lithium atom in its ground state.
(6) Write explicitly the Hartree–Fock equation for the 2-point Green’s function, in the coordinate ($\vec r$ and spin $\sigma$) representation.
(7) Write explicitly, in the coordinate ($\vec r$ and spin $\sigma$) representation, the Dyson or Bethe–Salpeter equation.
58
Nonstandard Statistics: Anyons and Nonabelions
In this chapter we consider statistics other than Bose–Einstein and Fermi–Dirac, namely, statistics involving multiplication with an abelian phase $e^{i\alpha}$ (different from $\pm 1$), called anyonic statistics, or with a nonabelian factor $U$, called nonabelian anyonic statistics. The corresponding particles, anyons and nonabelions, live in 2+1 dimensions and are associated with the abelian and nonabelian Berry phase and connection. Therefore, we will review these concepts first. The implementation in physical materials of the new statistics is associated with the Chern–Simons action, which we will also introduce.
58.1 Review of Statistics and Berry Phase

In Chapter 20 we considered the permutation of particles within a quantum state,
$$|\psi\rangle = |x_1x_2\rangle \to \hat P_{12}|\psi\rangle = |x_2x_1\rangle. \tag{58.1}$$
We found that $\hat P_{12}$ has eigenvalues $C_{12}$. Then we imposed that the application of the interchange operator twice returns the situation to the initial state, $\hat P_{12}^2 = 1$, implying that the eigenvalue is $C_{12} = \pm 1$. This behavior was associated with Bose–Einstein (bosons) and Fermi–Dirac (fermions) statistics. But it is not necessary to have $\hat P_{12}^2 = 1$. In fact, we can “interchange” particles continuously, through a path around each other, leading to a Berry phase, as considered in Chapter 20. If the Hamiltonian is time dependent, $\hat H = \hat H(t)$, via the adiabatic time dependence of parameters $\vec K = \vec K(t)$, the time dependence of the state that at $t = 0$ is $|n\rangle$ and has energy $E_n$ has an extra phase, called the Berry phase $\gamma_n$, so that
$$|\psi(t)\rangle = \exp\Big(-\frac{i}{\hbar}\int_0^t E_n(t')dt'\Big)e^{i\gamma_n(t)}|n(\vec K(t))\rangle. \tag{58.2}$$
Here the Berry connection is
$$\vec A_n(\vec K) = i\langle n(\vec K(t))|\vec\nabla_{\vec K}|n(\vec K(t))\rangle. \tag{58.3}$$
If $\vec K$ corresponds to a rescaled position $\vec R/(\hbar c)$, then the Berry phase is
$$\gamma_n = \int_C \frac{\vec A_n(\vec R)}{\hbar c}\cdot d\vec R, \tag{58.4}$$
and is an Aharonov–Bohm-like phase. We saw that the Berry phase on a closed contour $C$, modulo $2\pi m$, is invariant and so cannot be removed by a gauge transformation since, under it,
$$\gamma_n'(C) = \gamma_n(C) + 2\pi m. \tag{58.5}$$
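As a small numerical illustration of (58.3) and (58.4), the following sketch computes the Berry phase of a spin-1/2 state dragged once around a cone of opening angle $\theta$, with the azimuthal angle $\phi$ playing the role of the parameter $\vec K$. The choice of system, the gauge of `spin_state`, and the discretization by a product of overlaps are assumptions made here for illustration. In this gauge $A_\phi = i\langle n|\partial_\phi n\rangle = -\sin^2(\theta/2)$, so one expects $\gamma = -\pi(1-\cos\theta)$ modulo $2\pi$.

```python
import cmath
import math


def spin_state(theta, phi):
    """Spin-1/2 state aligned with the direction (theta, phi) on the sphere."""
    return (math.cos(theta / 2.0), cmath.exp(1j * phi) * math.sin(theta / 2.0))


def overlap(u, v):
    """Inner product <u|v> of two-component states."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def berry_phase_on_cone(theta, steps=2000):
    """Discretized closed-loop Berry phase for phi going from 0 to 2 pi.

    Each link contributes <n_k|n_{k+1}> ~ exp(-i A dphi), so the phase of the
    product of all links is -gamma (defined modulo 2 pi).
    """
    product = 1.0 + 0.0j
    for k in range(steps):
        phi_a = 2.0 * math.pi * k / steps
        phi_b = 2.0 * math.pi * (k + 1) / steps
        product *= overlap(spin_state(theta, phi_a), spin_state(theta, phi_b))
    return -cmath.phase(product)
```

The product of overlaps is unchanged if each intermediate state is multiplied by an arbitrary phase, which mirrors the statement that the closed-contour Berry phase cannot be removed by a gauge transformation.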
We also defined a nonabelian generalization, in the case when the state with $E_n$ has degeneracy $N$ with index $|a\rangle$, $a = 1, \ldots, N$. Then the adiabatic Hamiltonian $\hat H(\vec K(t))$ has states $|n, \vec K\rangle$ depending on time through the parameters. In this case, the nonabelian Berry connection is
$$\vec A^{ab}(\vec K) = i\langle a, \vec K(t)|\vec\nabla_{\vec K}|b, \vec K(t)\rangle, \tag{58.6}$$
with values in the adjoint representation of the group $U(N)$ of unitary transformations. The nonabelian Berry phase is
$$\gamma^{(ab)} = \oint_C \vec A^{(ab)}(\vec K)\cdot d\vec K. \tag{58.7}$$
The Berry phase factor is path ordered along $C$,
$$P e^{i\gamma^{(ab)}}, \tag{58.8}$$
implying that the expansion of the exponential in powers is ordered by the value of $\vec K$ along $C$: the later in the path, the further to the right.
58.2 Anyons in 2+1 Dimensions: Explicit Construction

We now construct explicitly an anyon system, where the particles are neither bosons nor fermions and under the permutation of two particles we obtain a phase $e^{i\alpha}$, different from $\pm 1$. The possibility of an abelian phase $e^{i\alpha}$ is restricted to 2+1 dimensions, which means that the system has an infinitesimal extent in a third spatial dimension. Consider the following construction of an anyon system. We start with a system of $N$ particles, for instance electrons, interacting with an electromagnetic field $A_\mu = (A_0, \vec A)$ since they are charged. The Hamiltonian is
$$\hat H = \sum_{i=1}^N\frac{|\vec p_i - q\vec A(\vec r_i)|^2}{2m} + \sum_{i<j}v(\vec r_i - \vec r_j) + \sum_{i=1}^N qA_0(\vec r_i), \tag{58.9}$$
where $\vec r_i$ are the positions of the particles, and $v(\vec r_i - \vec r_j)$ is a two-particle potential. The multiparticle wave function $\psi(\vec r_1, \ldots, \vec r_N)$ obeys the Schrödinger equation,
$$\hat H\psi(\vec r_1, \ldots, \vec r_N) = E\psi(\vec r_1, \ldots, \vec r_N). \tag{58.10}$$
Now we make a canonical transformation (a unitary transformation) of the wave function from $\psi$ to $\phi = U\psi$ as follows:
$$\phi(\vec r_1, \ldots, \vec r_N) = U\psi(\vec r_1, \ldots, \vec r_N) = \exp\Big\{-i\frac{\theta}{\pi}\sum_{i<j}\alpha(\vec r_i - \vec r_j)\Big\}\psi(\vec r_1, \ldots, \vec r_N), \tag{58.11}$$
where $\alpha(\vec r_i - \vec r_j)$ is the angle made by the relative position $\vec r_i - \vec r_j$ of the particles with respect to a fixed axis. Under $U$ the Hamiltonian transforms as $\hat H \to U^{-1}\hat HU$. We can check easily that
$$U^{-1}(\vec p_i - q\vec A(\vec r_i))U = \vec p_i - q\vec A(\vec r_i) - q\vec a(\vec r_i), \tag{58.12}$$
where we have defined an emergent (or statistical) gauge field $\vec a$ by
$$q\vec a(\vec r_i) = \frac{\theta}{\pi}\sum_{j\neq i}\vec\nabla_i\alpha(\vec r_i - \vec r_j) = \frac{\theta}{\pi}\sum_{j\neq i}\frac{\hat z\times(\vec r_i - \vec r_j)}{|\vec r_i - \vec r_j|^2}. \tag{58.13}$$
Here $\hat z$ is the unit vector in the direction perpendicular to the two-dimensional material (the third spatial direction). Note that, since $\alpha(\vec r_i - \vec r_j)$ is the angle of the relative position with the fixed axis,
$$\alpha(\vec r_i - \vec r_j) = \alpha(\vec r_j - \vec r_i) + \pi. \tag{58.14}$$
Substituting this in the definition of $\phi = U\psi$, when we exchange the two particles $i$ and $j$, after the canonical transformation by $U$ we obtain an extra $e^{i\theta}$, besides the original $\pm 1$ valid for $\psi$:
$$\phi(\ldots, \vec r_j, \ldots, \vec r_i, \ldots) = \pm e^{i\theta}\phi(\ldots, \vec r_i, \ldots, \vec r_j, \ldots). \tag{58.15}$$
Then, if $\theta = (2k+1)\pi$, the statistics is changed from Bose–Einstein to Fermi–Dirac and from Fermi–Dirac to Bose–Einstein and, for more general $\theta$, we change to fractional, or anyonic, statistics. The field $a_\mu$ is called the statistical gauge field, since it changes the statistics. In the new representation, the Hamiltonian is
$$\hat H = \sum_{i=1}^N\frac{|\vec p_i - q\vec A(\vec r_i) - q\vec a(\vec r_i)|^2}{2m} + \sum_{i<j}v(\vec r_i - \vec r_j) + \sum_{i=1}^N qA_0(\vec r_i), \tag{58.16}$$
where $\vec a$ is added to $\vec A$, and its field strength (due to the magnetic field) is
$$f_{12}(\vec r_i) \equiv b(\vec r_i) = \vec\nabla\times\vec a(\vec r_i) = \frac{2\theta}{q}\sum_{j\neq i}\delta(\vec r_i - \vec r_j) = \frac{2\theta}{q^2}\rho_{\rm charge}(\vec r_i), \tag{58.17}$$
and where $\rho_{\rm charge}(\vec r_i) = q\sum_{j\neq i}\delta(\vec r_i - \vec r_j)$ is the charge density of the anyon system. Note that in 2+1 dimensions the magnetic field is a scalar $B$, not a vector, and the magnetic flux associated with it is just $\Phi = \int B\,dS$. Since $\vec A \to \vec A + \vec a$ after the canonical (unitary) transformation, if $B = F_{12}(\vec r) = 0$ before the transformation then the total magnetic field after it is
$$\tilde B = \tilde F_{12} = \vec\nabla\times(\vec A + \vec a)(\vec r_i) = \frac{2\theta}{q}\sum_{j\neq i}\delta(\vec r_i - \vec r_j). \tag{58.18}$$
This means that we have a delta function magnetic flux associated with the particle (at its position),
$$\Phi = \int_S\tilde F_{12}\,dS = \int_S f_{12}\,dS = \frac{2\theta}{q}, \tag{58.19}$$
where $S$ is a small surface around the particle. Therefore, we have attached a magnetic flux at the position of the particle, turning it into an anyon. The explicit construction above was through the canonical (unitary) transformation, leading to the magnetic field $B$ for the statistical gauge field, but we can define $B$ as a delta function for any gauge field, including the electromagnetic field itself, and still obtain anyon behavior. Indeed, the Aharonov–Bohm phase obtained when one of the anyons is interchanged with another, by continuously moving it around the other, is
$$\exp\Big(iq\oint_C\vec A\cdot d\vec r\Big) = \exp\big(iq\Phi[C]\big) = e^{2i\theta}. \tag{58.20}$$
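The flux attachment in (58.19) can be checked numerically: the circulation of the statistical field (58.13) of a single particle around any contour enclosing it should equal $2\theta/q$, independent of the contour shape. The sketch below places one particle at the origin and integrates around a square; the function names, the contour and the midpoint discretization are choices made here for illustration.

```python
import math


def statistical_field(x, y, theta, q):
    """a = (theta / (pi q)) * (z_hat x r) / |r|^2 for one particle at the origin."""
    r2 = x * x + y * y
    pref = theta / (math.pi * q)
    return (-pref * y / r2, pref * x / r2)


def circulation(theta, q, half_side=1.0, n=4000):
    """Line integral of a . dl, anticlockwise, around a square about the origin."""
    corners = [(half_side, -half_side), (half_side, half_side),
               (-half_side, half_side), (-half_side, -half_side)]
    total = 0.0
    for idx in range(4):
        (x0, y0), (x1, y1) = corners[idx], corners[(idx + 1) % 4]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for step in range(n):
            # midpoint rule on each small segment of the side
            xm, ym = x0 + (step + 0.5) * dx, y0 + (step + 0.5) * dy
            ax, ay = statistical_field(xm, ym, theta, q)
            total += ax * dx + ay * dy
    return total
```

Changing `half_side` leaves the result unchanged, as expected for a field whose curl is concentrated in a delta function at the particle position.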
However, rotating one anyon around another returns it to the original position, and therefore corresponds to two interchanges. That means that the factor for a single interchange is
$$C_{12} = e^{i\theta}. \tag{58.21}$$
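The relation between a full rotation and two interchanges can be made explicit by tracking the angle $\alpha$ continuously along the path of the relative vector, instead of using (58.14) at the endpoints only. In the sketch below the factor picked up by $U = \exp(-i\theta\alpha/\pi)$ is computed for a half turn and a full turn. The sense of rotation that reproduces $e^{i\theta}$ (taken here as clockwise, i.e. $\Delta\alpha = -\pi$) is a convention assumed for the example, as are all function names.

```python
import cmath
import math


def accumulated_angle(path):
    """Continuously tracked change of the polar angle along a list of 2D points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # signed angle between consecutive relative vectors
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return total


def rotate_relative_vector(turn, steps=200):
    """Relative vector r_i - r_j rotated by the angle `turn` (negative = clockwise)."""
    return [(math.cos(turn * k / steps), math.sin(turn * k / steps))
            for k in range(steps + 1)]


def statistical_factor(theta, delta_alpha):
    """Factor picked up by U = exp(-i (theta/pi) alpha) when alpha changes."""
    return cmath.exp(-1j * theta / math.pi * delta_alpha)


theta = 0.4
half = statistical_factor(theta, accumulated_angle(rotate_relative_vector(-math.pi)))
full = statistical_factor(theta, accumulated_angle(rotate_relative_vector(-2 * math.pi)))
```

Here `half` is the single-interchange factor and `full` is its square; reversing the sense of rotation gives the complex conjugate factors, which coincide with the originals only for bosons and fermions.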
Moreover, the same argument can be used for the more general Berry phase $\gamma$, associated with a Berry connection $\vec A$. Then the Berry phase for one particle going around the other in parameter space gives
$$e^{2i\theta} = e^{i\gamma} \Rightarrow \theta = \frac{\gamma}{2}. \tag{58.22}$$
58.3 Chern–Simons Action

We have obtained the relation between the magnetic field (here the total magnetic field) and the electric charge density,
$$F_{12} = \frac{2\theta}{q}\rho_{\rm charge} \equiv \frac{2\theta}{q}J_0, \tag{58.23}$$
where $\rho_{\rm charge}$ is the zeroth component of the relativistic current $J_\mu$ (in 2+1 dimensions), $\rho_{\rm charge} = J_0$. The above equation is the (12) component of a relativistically covariant equation,
$$F_{\mu\nu} = \frac{2\pi}{k}\epsilon_{\mu\nu\rho}J^\rho, \tag{58.24}$$
with a “Chern–Simons quantization level” $k$, where
$$k = \pi\frac{q}{\theta}, \tag{58.25}$$
and $\epsilon_{\mu\nu\rho}$ is the totally antisymmetric Levi–Civita tensor, with $\epsilon^{012} = +1$. The action from which this relativistically covariant equation of motion comes is
$$
\begin{aligned}
S_{\rm CS} &= \frac{k}{4\pi}\int_M d^{2+1}x\, \epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho\\
&= \frac{k}{4\pi}\int_M d^{2+1}x\,\big[-A_1\dot A_2 + A_2\dot A_1 + A_0\partial_1A_2 - A_2\partial_1A_0 - A_0\partial_2A_1 + A_1\partial_2A_0\big]\\
&= \frac{k}{2\pi}\int_M d^{2+1}x\,[A_2\dot A_1 + A_0(\partial_1A_2 - \partial_2A_1)]\\
&= \frac{k}{2\pi}\int_M d^{2+1}x\,[A_2\dot A_1 + A_0F_{12}],
\end{aligned}
\tag{58.26}
$$
where we have used partial integration in the third equality, assuming that the boundary term at infinity vanishes. Adding a source term,
$$S_{\rm source} = \int d^{2+1}x\, J^\mu A_\mu = \int d^{2+1}x\,[\vec J\cdot\vec A + J^0A_0], \tag{58.27}$$
the total action,
$$S = S_{\rm CS} + S_{\rm source} = \int d^{2+1}x\Big[\frac{k}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + J^\mu A_\mu\Big], \tag{58.28}$$
has the equations of motion
$$\frac{k}{4\pi}\epsilon^{\mu\nu\rho}F_{\nu\rho} = J^\mu \Rightarrow F_{\mu\nu} = \frac{2\pi}{k}\epsilon_{\mu\nu\rho}J^\rho, \tag{58.29}$$
as required. Alternatively, the equation of motion for $A_0$ (considering $\vec A$ as fixed) is just the relation (without the relativistically covariant generalization)
$$F_{12} = \frac{2\theta}{q}J^0. \tag{58.30}$$
Moreover, the Chern–Simons level $k$ is quantized. Its quantization is related to Dirac quantization, as we will now show. The gauge transformation is in general
$$\delta\vec A = \vec\nabla\lambda, \qquad \delta A_0 = \dot\lambda, \tag{58.31}$$
but we restrict to transformations that depend only on time, $\lambda = \lambda(t)$. Then the Chern–Simons action transforms as
$$\delta S_{\rm CS} = \frac{k}{2\pi}\int_M d^{2+1}x\,\dot\lambda F_{12} = \frac{k}{2\pi}\int_{t_1}^{t_2}dt\int_M dS\, F_{12}\dot\lambda = k\Big[\frac{\int_M B\,dS}{2\pi}\Big](\lambda(t_2) - \lambda(t_1)). \tag{58.32}$$
We now consider periodicity in time, which is related to finite temperature as we saw in Chapter 28 (the relation to finite temperature appearing after the Wick rotation to imaginary, or Euclidean, time), $t_2 - t_1$ being the periodicity. Since $\lambda$ is the gauge parameter, under the gauge transformation the wave function of the anyon changes by (see Dirac quantization in Chapter 27)
$$\psi \to e^{iq\lambda/\hbar}\psi, \tag{58.33}$$
and in order for the gauge transformation to be well defined as we go around the periodicity cycle, we need to have the periodicity
$$\lambda(t_2) - \lambda(t_1) = \frac{2\pi\hbar}{q}m, \qquad m\in\mathbb{Z}. \tag{58.34}$$
We substitute this in the variation of the Chern–Simons action $\delta S_{\rm CS}$, together with the fact that the magnetic flux is quantized in units of $\Phi_0 = h/q$, as seen in Chapter 24 concerning the Hall effect,
$$\frac{\int B\,dS}{2\pi} = \frac{\Phi}{2\pi} = \frac{p}{2\pi}\frac{h}{q} = p\frac{\hbar}{q}, \qquad p\in\mathbb{Z}, \tag{58.35}$$
and define the product of the integers $m$ and $p$ as an integer $n$, obtaining
$$\delta S_{\rm CS} = kn\frac{2\pi\hbar}{q}\frac{\hbar}{q}. \tag{58.36}$$
Since the action appears in the path integral for quantum calculations in the form $e^{iS/\hbar}$, the variation under the gauge transformation of $e^{iS/\hbar}$ must be trivial, $e^{i\delta S/\hbar} = 1$, and we obtain
$$\delta S_{\rm CS} = 2\pi\hbar N \Rightarrow \frac{k\hbar}{q^2}\in\mathbb{Z}. \tag{58.37}$$
It is usual to rescale the gauge field by the charge, $A_\mu \to A_\mu/q$, such that the electromagnetic kinetic term has $1/q^2$ in front,
$$S_{\rm em,kin} = -\frac{1}{4q^2}\int F_{\mu\nu}F^{\mu\nu}; \tag{58.38}$$
then the quantization condition is simply that $k$ is an integer in units of $1/\hbar$, or $\hbar k\in\mathbb{Z}$. Then the Chern–Simons action $S_{\rm CS}$ generates anyons, and
$$\theta = \frac{\pi}{k}, \qquad \hbar k\in\mathbb{Z}. \tag{58.39}$$
58.4 Example: Fractional Quantum Hall Effect (FQHE)

The Hall effect refers to the current density $\vec j$ ($j_y$) induced in a direction perpendicular to the applied electric field $\vec E$ ($E_x$) and a constant magnetic field $\vec B$ ($B_z$), with $j_y = \sigma_H E_x$. Then the integer quantum Hall effect (IQHE) refers to a quantized value of the Hall conductivity, $\sigma_H = n\sigma_0$, whereas the fractional quantum Hall effect (FQHE) refers to fractional values of the same, $\sigma_H = (q/r)\sigma_0$, with $q, r\in\mathbb{Z}$. We will not consider further the theory of the FQHE, first encountered in Chapter 24, but will just say that for the particular case where
$$\sigma = \frac{1}{r}\sigma_0, \tag{58.40}$$
the action that describes both the Hall effect on the conductivity, and the particles responsible for this effect, is
$$S_{\rm eff} = \int_M d^{2+1}x\Big[\frac{1}{2\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu a_\rho - \frac{r}{4\pi}\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho\Big]. \tag{58.41}$$
The classical equations of motion of the action (for $a_\mu$) are
$$f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu = \frac{1}{r}F_{\mu\nu} = \frac{1}{r}(\partial_\mu A_\nu - \partial_\nu A_\mu) \Rightarrow a_\mu = \frac{1}{r}A_\mu. \tag{58.42}$$
Then the on-shell action (solved for $a_\mu$) in terms of the electromagnetic field $A_\mu$ is
$$S_{\rm eff} = \frac{1}{r}\frac{1}{4\pi}\int_M d^{2+1}x\, \epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho, \tag{58.43}$$
and from it we obtain the fractional Hall conductivity $\sigma_H = \sigma_0/r$ (though we will not consider it further here). But if we do not solve for $a_\mu$, we obtain the kinetic term for the statistical field interacting with anyons. Indeed, the source term for electromagnetism, $eJ^\mu A_\mu$, in the form of charged particles with charge $e$, implies that there should also be a source term for the statistical gauge field $a_\mu$, in the form of “quasiparticles” of charge $q$ under $a_\mu$,
$$\int d^{2+1}x\, qJ^\mu a_\mu = q\int dt\, a_0(\vec x, t). \tag{58.44}$$
Adding this source term to $S_{\rm eff}$, the equation of motion for $a_\mu$ becomes
$$\frac{F_{12}}{2\pi} - \frac{r f_{12}}{2\pi} + q\,\delta(\vec x - \vec x(t)) = 0. \tag{58.45}$$
This implies that we have a delta function flux either in $F_{12}$ (the electromagnetic field) or in $f_{12}$ (the statistical field). But having a delta function flux in the electromagnetic field is an idealization that does not sit well with reality, since the flux lines going up through the third direction should return back down somewhere else. Instead, we can have a delta function flux in the statistical gauge field, that is,
$$\frac{f_{12}}{2\pi} = \frac{q}{r}\delta(\vec x - \vec x(t)). \tag{58.46}$$
Comparing this with the relation between the magnetic field and the charge density for anyons, we obtain that the anyon statistics parameter $\theta$ is
$$\theta = q^2\frac{\pi}{r}, \tag{58.47}$$
and, as we said, the factor $q^2$ can be absorbed into a redefinition of the gauge fields.
58.5 Nonabelian Statistics

Until now we have considered abelian phase statistics, but we can also have nonabelian statistics. Moore and Read have shown that, even in the FQHE, we can have nonabelian statistics. That means that under the interchange of particles, the possibilities for the change in the wave function do not belong to the permutation group $S_N$ but rather to the braid group $B_N$. We can think of the anyons as abelian representations of the braid group $B_N$. But there are nonabelian representations of the braid group, and the particles charged under these representations are called nonabelian anyons, or nonabelions. Under the interchange (braid), the wave function changes as
$$\psi_{p:\{i_1,\ldots,i_r,\ldots,i_s,\ldots,i_N\}}(z_1,\ldots,z_{i_r},\ldots,z_{i_s},\ldots,z_N) = \sum_q B_{pq}[i_1,\ldots,i_N]\,\psi_{q:\{i_1,\ldots,i_N\}}(z_1,\ldots,z_N), \tag{58.48}$$
where $p$ is a shorthand for the set of indices $\{i_1,\ldots,i_N\}$, and $z\in\mathbb{C}$ is the two-dimensional spatial coordinate. The Moore–Read wave function for $N$ “electrons without spin” is
$$\psi_{\rm Pf}(z_1,\ldots,z_N) = \mathrm{Pfaff}\Big(\frac{1}{z_i - z_j}\Big)\prod_{i<j}(z_i - z_j)^m\exp\Big(-\frac14\sum_j|z_j|^2\Big), \tag{58.49}$$
where $z_i$ is the position of a “vortex” and the Pfaffian is a kind of “square root” of the determinant, when $N$ is even (there is no space to explain this further). It is rather hard to show that a bound state of four excitations obeys nonabelian statistics. Equivalently, and also hard to show, we have a nonabelian Berry connection, leading to a nonabelian Berry phase, also with nonabelian statistics.

Figure 58.1 A basic braiding move.

An example of the action of the nonabelian braid on bound states of excitations is as follows. Consider fermionic oscillators with annihilation and creation operators $c_i$ and $c_i^\dagger$, with anticommutation relations
$$\{c_i, c_j^\dagger\} = \delta_{ij}, \qquad \{c_i, c_j\} = \{c_i^\dagger, c_j^\dagger\} = 0. \tag{58.50}$$
Then define the Majorana fermion zero modes (modes of zero energy) $\gamma_i$ as
$$\gamma_i = c_i + c_i^\dagger, \tag{58.51}$$
satisfying the Clifford algebra
$$\{\gamma_i, \gamma_j\} = 2\delta_{ij}, \qquad i = 1, \ldots, 2n. \tag{58.52}$$
Then the complex fermion made up from the Majorana fermions is
$$\psi_k = \frac{\gamma_{2k+1} + i\gamma_{2k}}{2}, \qquad k = 1, \ldots, n, \tag{58.53}$$
and has a $2^n$-dimensional Hilbert space. For each vortex, we have one Majorana zero mode, but the relevant excitations forming the nonabelion are nonlocal (they come from several vortices). An example of braiding, in fact the one that generates the braid group, is where the $i$th vortex is braided with the $(i+1)$th vortex, in an anticlockwise direction:
$$R: \{\gamma_i \to \gamma_{i+1},\ \gamma_{i+1} \to -\gamma_i,\ \gamma_j \to \gamma_j,\ j\neq i, i+1\}. \tag{58.54}$$
A graphical representation of a basic braiding move is given in Fig. 58.1.
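The nonabelian character of the braiding rule (58.54) can be seen already at the level of its action on the Majorana operators, where each elementary braid is a signed permutation. The sketch below encodes this action for three modes and composes braids of neighboring pairs; the list-based encoding and the function names are assumptions made for illustration, and the unitary operators acting on the Hilbert space itself are not constructed here.

```python
def braid(i, n):
    """Elementary braid R_i on n Majorana modes (0-based index i).

    The map is stored as a list: entry j is (target, sign), meaning
    gamma_j -> sign * gamma_target. Here gamma_i -> gamma_{i+1} and
    gamma_{i+1} -> -gamma_i, with all other modes untouched.
    """
    mapping = [(j, 1) for j in range(n)]
    mapping[i] = (i + 1, 1)
    mapping[i + 1] = (i, -1)
    return mapping


def compose(first, second):
    """Apply `first`, then `second`, to every gamma_j (maps act linearly)."""
    result = []
    for target, sign in first:
        target2, sign2 = second[target]
        result.append((target2, sign * sign2))
    return result


N = 3
R1, R2 = braid(0, N), braid(1, N)
identity = [(j, 1) for j in range(N)]
R1_squared = compose(R1, R1)
```

Composing `R1` and `R2` in the two possible orders gives different maps, while the triple products `R1 R2 R1` and `R2 R1 R2` agree, which is the defining relation of the braid group. Applying `R1` twice flips the sign of both braided modes rather than returning to the identity, in contrast with an ordinary permutation.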
Important Concepts to Remember

• Instead of the Bose–Einstein and Fermi–Dirac statistics, in 2+1 dimensions we can also have particles whose statistics are defined by interchange by a phase $e^{i\theta}$, called anyons, or by a nonabelian matrix, called nonabelions.
• We can construct anyons with $e^{i\theta}$ by a unitary transformation from the wave function of a system of ordinary particles (for instance electrons), obtaining an extra coupling to a statistical gauge field $qa_\mu$, $q\vec a = \frac{\theta}{\pi}\sum_{j\neq i}\vec\nabla_i\alpha(\vec r_i - \vec r_j)$.
• To construct the anyon we attach a flux at the position of the particle, $\tilde B = \tilde F_{12} = \vec\nabla\times(\vec A + \vec a)(\vec r_i) = \frac{2\theta}{q}\sum_{j\neq i}\delta(\vec r_i - \vec r_j)$.
• The constructed anyon satisfies the equation of motion for the Chern–Simons action with a source, $\int d^{2+1}x\big[\frac{k}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + J^\mu A_\mu\big]$, with quantized level $k$, quantized via Dirac quantization.
• Anyons and the associated Chern–Simons action appear also in the theoretical description of the fractional quantum Hall effect (FQHE), for $\sigma_H = \sigma_0/r$.
• A nonabelian statistics is also possible, with interchange under the braid group, exemplified on Majorana fermion zero modes, as found from the Moore–Read wave function.

Further Reading

More details about the FQHE and nonabelian statistics can be found in the review by David Tong [29], and the Moore–Read paper [30].
Exercises

(1) The nonabelian Berry phase, by its very definition, changes the wave function, even though we return to the initial state in terms of $\vec K(t)$ (in the abelian case, the wave function changes by a phase, so $|\psi|^2$ doesn’t change). Can you give an example where this is consistent?
(2) Show the steps in the proof of the relation (58.12), for constructing an anyon from a statistical gauge field.
(3) Consider the (01) component of the equation of motion $F_{\mu\nu} = \frac{2\pi}{k}\epsilon_{\mu\nu\rho}J^\rho$, coming from the relativistic Chern–Simons action with a source. What physical property does it describe? Note that this is now related to the existence of anyons.
(4) Show that the Chern–Simons action (without a source) is equal to a (3 + 1)-dimensional action of a gauge invariant, topological (metric independent, and with discrete values) type.
(5) When describing the FQHE in the text, we gave the option of solving for the statistical gauge field $a_\mu$ and obtaining a Chern–Simons-like action for the electromagnetic field $A_\mu$, or of adding a source term and obtaining an anyon. But why did we not consider the equation of motion for $A_\mu$, either in the first or in the second case?
(6) When discussing the FQHE anyon as a statistical gauge delta function flux added to the quasiparticles, we argued that for the electromagnetic field $F_{\mu\nu}$ to be a delta function is unphysical because of flux conservation. But why then is it OK for the statistical $f_{\mu\nu}$ to be a delta function?
(7) Show that the factor $(z_i - z_j)^m$ in the Moore–Read wave function (also present in the phenomenological “Laughlin wave function” for a theoretical description of the FQHE) implies there are $m$ “vortices” at each position. A vortex ansatz is $f(r)e^{i\alpha}$, where $r$, $\alpha$ are the radius and the polar angle in the plane, respectively, and one can impose a consistency condition on the ansatz (and so find it).
References
[1] J.J. Sakurai and San Fu Tuan, Modern Quantum Mechanics, revised edition, Addison–Wesley, 1994.
[2] Albert Messiah, Quantum Mechanics, Dover Publications, 2014.
[3] R. Shankar, Principles of Quantum Mechanics, second edition, Springer, 1994.
[4] L.D. Landau and E.M. Lifshitz, Quantum Mechanics (Non-Relativistic Theory), Butterworth-Heinemann, 1981.
[5] H. Nastase, Introduction to Quantum Field Theory, Cambridge University Press, 2019.
[6] H. Nastase, String Theory Methods for Condensed Matter Physics, Cambridge University Press, 2017.
[7] H. Nastase, Classical Field Theory, Cambridge University Press, 2019.
[8] Howard Georgi, Lie Algebras in Particle Physics: From Isospin to Unified Theories, Westview Press, 1999.
[9] Michael V. Berry, “Quantal phase factors accompanying adiabatic changes”, Proc. Roy. Soc. Lond. A 392 (1984) 45.
[10] R. Rajaraman, Solitons and Instantons: An Introduction to Solitons and Instantons in Quantum Field Theory, first edition, North-Holland Personal Library, Vol. 15, 1987.
[11] Paul A.M. Dirac, Lectures in Quantum Mechanics, Dover Publications, 2001.
[12] John Preskill, Caltech Lecture Notes on Quantum Information, http://theory.caltech.edu/~preskill/ph219/.
[13] E. Knill, R. Laflamme, H. Barnum, et al., “Introduction to quantum information processing”, arXiv:quant-ph/0207171 [quant-ph].
[14] Richard Jozsa, “An introduction to measurement-based quantum computation”, arXiv:quant-ph/0508124 [quant-ph].
[15] Daniel Gottesman, “An introduction to quantum error correction and fault-tolerant quantum computation”, arXiv:0904.2557 [quant-ph].
[16] J.S. Bell, “On the Einstein–Podolsky–Rosen paradox”, Physics 1 (3) (1964) 195.
[17] E.P. Wigner, American Journal of Physics 38 (1970) 1005.
[18] J. Maldacena, S.H. Shenker, and D. Stanford, “A bound on chaos”, J. High-Energy Phys. 08 (2016) 106, arXiv:1503.01409 [hep-th].
[19] R. Jefferson and R.C. Myers, “Circuit complexity in quantum field theory”, J. High-Energy Phys. 10 (2017) 107, arXiv:1707.08570 [hep-th].
[20] M.A. Nielsen, “A geometric approach to quantum circuit lower bounds”, Quantum Inform. Comput. 6 (2006) 213 [quant-ph/0502070].
[21] A.R. Brown and L. Susskind, “Second law of quantum complexity”, Phys. Rev. D 97 (8) (2018) 086015, arXiv:1701.01107 [hep-th].
[22] S. Xu and B. Swingle, “Accessing scrambling using matrix product operators”, Nature Phys. 16 (2) (2019) 199, arXiv:1802.00801 [quant-ph].
[23] M. Srednicki, “Chaos and quantum thermalization”, cond-mat/9403051.
[24] R. Nandkishore and D. A. Huse, “Many-body localization and thermalization in quantum statistical mechanics”, Ann. Rev. Cond. Matter Phys. 6 (2015) 15.
[25] W.H. Zurek, “Decoherence, Einselection, and the quantum origins of the classical”, Rev. Mod. Phys. 75 (2003) 715, arXiv:quant-ph/0105127 [quant-ph].
[26] Marcos Rigol, Vanja Dunjko, and Maxim Olshanii, “Thermalization and its mechanism for generic isolated quantum systems”, Nature 452 (2008) 854, arXiv:0708.1324 [cond-mat].
[27] A.L. Fetter and J.D. Walecka, Quantum Theory of Many-Particle Systems, Dover, 2003.
[28] N. Levinson, “Determination of the potential from the asymptotic phase”, Phys. Rev. 75 (1949) 1445.
[29] David Tong, “Lectures on the quantum Hall effect”, arXiv:1606.06687 [hep-th].
[30] G.W. Moore and N. Read, “Nonabelions in the fractional quantum Hall effect”, Nucl. Phys. B 360 (1991) 362.
Index
adiabatic approximation, 426 Aharonov–Bohm effect, 260, 308 Aharonov–Bohm phase, 641 angular momenta, composition, 174 angular momentum, 137, 159 complex, 566, 574 total, 194 antilinear operator, 15, 152, 153 anyons, 641, 642 arrow of time, 149 atomic orbital method, 472 Balmer relation, 5 Banach space, 24 Bell inequalities, 348 Bell inequality, original, 350 Bell–CHSH inequality, 356 Bell–Wigner inequalities, 353 Berry connection, 264, 641 nonabelian, 269 Berry curvature, 267 Berry phase, 260, 264, 641 Bessel function, spherical, 537 Bethe ansatz, 513 equations, 513 Bethe roots, 513 Bethe–Salpeter equation, 632, 638 blackbody radiation, 3 Bloch equations, 493 Bogoliubov transformation, 393, 400 Bohr magneton, 46, 196 Bohr radius, 210 Bohr–Sommerfeld quantization, 8, 282, 290 Boltzmann distribution, 344 Born approximation, 528 elastic scattering, 424 Born series, 528 Born–Oppenheimer approximation, 266 Bose–Einstein statistics, 230 boson, 229 bound states, 557, 558 braid group, 647 Breit–Wigner distribution, 429, 432 Breit–Wigner formula, 569 butterfly velocity, 389
canonical quantization, 60 conditions, 60 Cauchy–Schwarz inequality, 70 central potential, 214 cases, 215 channel, scattering, 593 chaos, quantum, 384, 388 Chebyshev polynomials, 102 chemical bond, 473 Chern–Simons action, 644 level, 644 Clebsch–Gordan coefficients, 176, 177, 185 Clifford algebra, 606, 648 coherent states, 96 completeness relation, 15 complexity classical, 384 quantum, 384, 385 quantum, Nielsen, 386 Compton effect, 4 computation, quantum, 54 conductance tensor, 278 confluent hypergeometric function, 209 constraints first-class, 326, 327 primary, 326 second-class, 326, 327 secondary, 326 continuity equation, probability, 83 correlation function, 123 n-point function, 123 correspondence principle, 7 cost (Finsler) function, 387 Coulomb scattering, WKB approximation, 584 creation/annihilation operators, 93 cross section, 516 cryptography, quantum, 380 cyclotron frequency, 274 Darwin term, 610 Davisson–Germer experiment, 8 de Broglie particle–wave duality, 132 de Broglie wavelength definition, 8, 132 decay width, 431 decoherence, 380 quantum, 393
delta function, Dirac, 26, 27 density matrix, 42, 361, 362 diatomic molecule, 220 dipole moment electric, 440 magnetic, 441 Dirac bra-ket notation, 12 Dirac brackets, 329 Dirac equation, 257, 603 Dirac quantization condition, 301, 303 Dirac quantization, constrained systems, 325 Dirac string, 308 dissipation time, 390 distribution, 27 Bose–Einstein, 370 Fermi–Dirac, 370 Maxwell, 368 double-slit experiment, 6 dual space, 17 Dyson equation, 638 effective mass of electron, 254 effective potential, radial Schrödinger, 206 Ehrenfest theorem, 127, 140, 153 eigenstate thermalization hypothesis (ETH), 398 eigenvalue problem, 19 eikonal approximation, 577, 581 einselection, 395 Einstein coefficient, 497 Einstein locality, 347 energy shift, 431 ensemble canonical, 368 grand canonical, 369 quantum, 42 quasi-microcanonical, 367 statistical, 363 entangled state, 339 maximally, 53, 343 entanglement, 52, 339 entropy entanglement, 343, 345, 371 Gibbs, 344 Renyi, 371 statistical, 368 thermal entanglement, 371 von Neumann, 344, 368, 377 EPR paradox, 339, 345, 350 ergodic hypothesis, 364, 398 Euler angles, 161 exchange density probability, 237, 242 exchange integral, 242 Fermi golden rule, 422, 495, 529 Fermi–Dirac statistics, 230 fermion, 229 Feynman–Kac formula, 316 Feynman theorem, 637
field operators, 623, 628 fine structure, 472 flux quantum, 263, 275 Fock space, 623, 626 form factors, 592 Fredholm theorem, 31 free particle, 63 spherical coordinates, 190 gamma matrices, 606 gate AND, 384 INPUT, 384 NOT, 384 OR, 384 quantum Hadamard, 385 quantum NOT, 385 quantum phase, 385 quantum Toffoli, 386 XOR, 385 gauge field, statistical, 643 Gaussian integration formula, 119 Gaussian integration in Grassmann algebra, 320 Gegenbauer polynomials, 102 generalized Gibbs ensemble (GGE), 398 generating functional, 123, 124 generating function, canonical transformation, 59 geometric phase, 264 geometrical optics, 581 geometrical optics approximation, 132 Gram–Schmidt theorem, 14 Grassmann algebra, 318 Green’s function, 506 Green’s function, radial, 550 group abelian, 144 cyclic, 144 definition, 143 H2 molecule, 240, 246 Hall effect, 272 Hall voltage, 276 Hamilton–Jacobi equation, 130, 131, 578 formalism, 127, 130, 282 Hamiltonian extended, 328 total, 327 Hamilton’s principal function, 131 Hankel function, spherical, 537 harmonic oscillator, 91 fermionic, 233 forced, 311 isotropic, 214, 222 isotropic, cylindrical coordinates, 225 isotropic, spherical coordinates, 223 Hartree–Fock approximation, 490, 632 Hartree–Fock method, 472
Hartree–Fock potential, 635 Heisenberg spin 1/2 Hamiltonian, 513 helium atom, 240 Helmholtz equation, 537 Hermite polynomials, 98, 102 hidden variables, 345, 350 Hilbert space, 21 Hilbert–Schmidt theorem, 31 Hund rules, 471 hybridization, 479 hydrogen atom, 204 hyperfine structure, 472 identical particles, 229 inelastic scattering, 588 instanton, 447, 453 integral Lebesgue, 26 Riemann, 26 interaction with magnetic field, 194 Jacobi identity, 157, 328 Jacobi polynomials, 101 Jost function, 542, 563 Klein–Gordon equation, 604 Laguerre polynomials, 102, 211 Landau levels, 272, 273 Landé g factor, 46, 278, 437 Larmor frequency, 437 Larmor precession, 203 laser, 492 Laughlin wave function, 277 LCAO approximation, 616 LCAO method, 477 Legendre functions, associated, 198 Legendre polynomials, 101, 198 Levi–Civita tensor, 46 Levinson theorem, 542, 544, 573 Lie algebra, 155, 164 Lie group, 155 classical, 146 linear operator, 15 Liouville–von Neumann equation, 366 Lippmann–Schwinger equation, 507 three-dimensional, 516 liquid droplet model, nuclear, 483 local realism, 347 London equations, 255 London–London model, 255 Lorentz force, 251 LS coupling (Russell–Saunders), 471 Lyapunov exponent, 388 quantum, 390 magnetic moment, 195 magnetic monopole, 301 Dirac, 303 magnon, 512
Majorana fermion, 648 many-body localization (MBL), 398 many-body physics, 623 matrix mechanics, 6 Maxwell duality, 301 measurement, quantum, 41 Meissner effect, 255 metric space, complete (Cauchy), 24 minimal coupling, 194, 250 mixed state, 361 monodromy, 308 monopole, ’t Hooft, 303 Moore–Read wave function, 647 multiparticle states, 614 multiplication table, 143 mutual information, 376 Neumann function, spherical, 537 no-cloning principle, 380 Noether theorem, 137 nonabelions, 641, 647 number operator, 93 occupation number, 617 picture, 614 on-shell value, 121 operator adjoint, 17 Hermitian, 19 in infinite dimensions, 30 matrix representation, 16 unitary, 18 optical theorem, 521, 547, 551 orbital, 469 molecular, 476, 479 ortho-helium, 243 orthogonal polynomials, classical, 101, 211 orthonormality, 13 oscillation states, 50 oscillations, neutrino, 50 OTOC, 388 pairing energy, 490 para-helium, 243 parity, 137, 149 partial wave amplitude, 536 partial wave expansion, 535 particle creation, 401 particle–wave complementarity, 6 partition function, connection with statistical mechanics, 315 thermodynamical, 368 Paschen–Back effect, 437 path integral, 118 configuration space, 120 fermionic, 311 Feynman, 116 harmonic phase space, 121 imaginary time, 311 phase space, 119 Pauli exclusion principle, 230
Pauli matrices, 46, 172 permutation operator, 229 perturbation theory stationary (time-independent), 407 time-dependent, 418 time-dependent, second-order, 429 Pfaffian, 647 phase shift, 535, 537 WKB approximation, 580 photoelectric effect, 436, 444 photon in photoelectric effect, 4 picture Dirac (interaction), 112 Heisenberg, 106, 108 Schrödinger, 106 Planck constant, definition, 4 Planck formula, 497 postulates of quantum mechanics, 35 premeasurement, 395 projector, 39 propagator, 116 Feynman, 313 free correlator, 312 quadrupole moment, electric, 441 quantum chemistry, 469 quantum computation, 378 quantum computing, 375 quantum Hall effect fractional (FQHE), 277, 646 integer (IQHE), 275 quantum information, 375 quantum supremacy, 380 quantum teleportation, 381 qubit, 54, 378 Rabi formula, 493 Racah coefficients, 178 reflection coefficient, 87 Regge pole, 575 Regge theory, 574 representation adjoint, 157 fundamental, 145 reducible, 146 regular, 145 resonance, Breit–Wigner, 568 resonances, 566 Riemann sheet, energy domain, 562 Rindler space, 400 Rodrigues, formula, 102 rotation matrix in the (j) representation, 184 Rutherford formula, 425, 530 Rydberg constant, 5, 210 S-matrix, 503, 521, 528 S-matrix element partial wave, 540 scalar product, 13 scattering amplitude, 516
analytical properties, 557 scattering length, 535, 541 scattering, one-dimensional, 503 Schmidt decomposition, 342 Schrödinger equation, 8, 36 Schrödinger’s cat, 393 Schwinger oscillator model, 179 scrambling time, 390 selection rules, 151, 438, 441 semiclassical approximation, 282 separable state, 340 Shannon entropy, 376 conditional, 376 Shannon theory, 375 shell model, 470 nuclear, 483 Slater determinant, 230, 615, 632 SO(n), 159 Sommerfeld polynomial method, 97, 207, 217 spectroscopic notation, 469 spherical harmonics, 188 addition formula, 199 spin, 194 spin chains, 512 spin–orbit coupling, 472 spin–orbit interaction, 257, 610 spin–statistics theorem, 234 spinor rotation, 199 square well finite, 84 infinitely deep, 78 Stark effect, 414 state mixed, 43 pure, 43 statistical operator, 362 statistics, nonabelian, 647 Stefan–Boltzmann constant, 4 Stern–Gerlach experiment, 5, 45, 194, 272 Stirling approximation, 376 SU(2), 162 structure constants, 156 sudden approximation, 421, 425 superconductor, 254 symmetry active, 140 continuous, 137, 143 discrete, 149 internal, 137, 149, 154 passive, 140 T-matrix, 528, 531 tensor operators, 183 tensor product, 21 thermalization, quantum, 393, 397 thermofield double state, 372 Thomas precession, 610 threshold behavior, 557 time-ordering operator, 107 time-reversal invariance, 149, 152
time-reversal operator states with spin, 201 trace, 21 transfer matrix, 503, 510 transition frequency, 420 transition functions, 305 translation operator, 57 translation, lattice, 149 transmission coefficient, 84, 87 tunneling effect, 87 tunneling probability, 88 two-body problem, 204 two-level system, 45 uncertainty principle, Heisenberg, 66 uncertainty relation, energy–time, 72 uncertainty relations, Heisenberg, 69, 218 unitarity, 547 unitarity bound, 549 van der Waals interaction, 248 variance, 66 Gaussian, 72 variational method, 245, 457 Ritz, 458
vector space, 12 topological, 24 vortex, 647 wave function collapse, 54 definition, 8 wave packet, 64 Gaussian, 66, 72 Wick rotation, 315 Wigner 3j symbol, 177 Wigner 6j symbol, 179 Wigner–Eckart theorem, 185, 437 WKB approximation, 282 path integral, 287 scattering, 577 WKB method, 127, 133 state transitions, 447 transition formulas, 284 WKB solution in one dimension, 283 Wronskian, 542 Wronskian theorem, 76, 503 Zeeman effect, 272, 279, 436