Maurice A. de Gosson Quantum Harmonic Analysis
Advances in Analysis and Geometry
Editor-in-Chief
Jie Xiao, Memorial University, Canada

Editorial Board
Der-Chen Chang, Georgetown University, USA
Goong Chen, Texas A&M University, USA
Andrea Colesanti, University of Florence, Italy
Robert McCann, University of Toronto, Canada
De-Qi Zhang, National University of Singapore, Singapore
Kehe Zhu, University at Albany, USA
Volume 4
Maurice A. de Gosson
Quantum Harmonic Analysis
An Introduction
Mathematics Subject Classification 2010
Primary: 32A50, 51A50, 81Qxx; Secondary: 35Q40, 35S05, 47G30

Author
Prof. Dr. Maurice A. de Gosson
University of Vienna
Faculty of Mathematics
Oskar-Morgenstern-Platz 1
1090 Vienna
Austria
[email protected]
ISBN 978-3-11-072261-1
e-ISBN (PDF) 978-3-11-072277-2
e-ISBN (EPUB) 978-3-11-072290-1
ISSN 2511-0438

Library of Congress Control Number: 2021934872

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2021 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck
www.degruyter.com
To Charlyne and to our children Serge, Corinne, Samantha, and Sven With all my love
Contents

Preface
Introduction

1 Preliminaries
1.1 Vector calculus: notation
1.1.1 Vectors, matrices, and more
1.1.2 Multi-index notation
1.2 Hilbert spaces
1.2.1 Example: the Sobolev spaces $H^s$
1.2.2 The projection theorem
1.2.3 Tensor products of Hilbert spaces
1.2.4 Compact operators
1.2.5 Positive operators
1.3 Tempered distributions
1.3.1 The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ of test functions
1.3.2 The dual space $\mathcal{S}'(\mathbb{R}^n)$
1.4 The Fourier transform
1.4.1 Definition
1.4.2 Plancherel’s theorem
1.4.3 Fourier transform and convolution
1.5 The symplectic group
1.5.1 Classical Lie groups
1.5.2 Definition of Sp(n)
1.5.3 Symplectic block-matrices
1.5.4 The eigenvalues of a symplectic matrix
1.5.5 Williamson’s symplectic diagonalization
1.6 Quantum mechanics
1.6.1 The axioms of quantum mechanics
1.6.2 Quantization
1.6.3 Superposition and entanglement
1.6.4 Mixed quantum states; the density matrix
1.6.5 Entanglement: spooky actions at a distance
1.6.6 Variable ℏ

2 Displacements and reflections
2.1 The Heisenberg displacement operator
2.1.1 Definition and motivation
2.1.2 Properties
2.2 The Grossmann–Royer reflection operators
2.3 The symplectic Fourier transform
2.4 An analytic relationship between $\hat{D}(z)$ and $\hat{R}(z)$
2.5 The Schrödinger representation
2.6 Comments and references

3 The cross-Wigner transform
3.1 Definition and properties
3.1.1 Definitions
3.1.2 Elementary properties
3.1.3 Hudson’s theorem
3.1.4 Displacing cross-Wigner transforms
3.2 The Moyal identity
3.2.1 Statement and proof
3.2.2 An extension result for the cross-Wigner transform
3.2.3 A reconstruction result
3.3 The cross-ambiguity function
3.3.1 Definition
3.3.2 Relations with the cross-Wigner transform
3.3.3 Moyal identity for the cross-ambiguity function
3.4 Dependence of the Wigner transform on h
3.5 Statistical interpretation of the Wigner transform
3.6 Comments and references

4 Gaussians and Hermite functions
4.1 The Wigner transform of a Gaussian
4.1.1 The cross-Wigner transform of a pair of Gaussians
4.1.2 Fermi’s trick
4.1.3 Cross-ambiguity function of a Gaussian
4.2 The case of Hermite functions
4.2.1 Hermite functions
4.2.2 Laguerre polynomials and functions
4.2.3 The cross-Wigner transform of Hermite functions
4.3 Comments and references

5 The Weyl transform
5.1 Definitions of the Weyl transform
5.1.1 First definition
5.1.2 Definition using the Wigner transform
5.1.3 Pseudo-differential definition
5.2 Main properties
5.2.1 The distributional kernel of a Weyl operator
5.2.2 The adjoint and the transpose of a Weyl operator
5.3 Shubin’s symbol classes
5.4 Composition formulas
5.5 Comments and references

6 The Cohen class
6.1 Definition of the Cohen class
6.2 A Moyal identity
6.3 The marginal properties
6.4 The τ-Wigner transform
6.4.1 Definition
6.4.2 Properties
6.5 The Born–Jordan kernel
6.6 Comments and references

7 Born–Jordan quantization
7.1 Physical origins
7.2 Algebraic motivation
7.2.1 Weyl, Shubin, and Born–Jordan
7.2.2 A remark on the quantization of monomials
7.3 Born–Jordan operators: definition(s)
7.3.1 Definition using the Cohen class
7.3.2 Pseudodifferential expression
7.4 Non-invertibility of Born–Jordan quantization
7.4.1 A first non-injectivity result
7.4.2 A general result
7.5 Comments and references

8 Metaplectic operators
8.1 The symplectic group Sp(n)
8.2 The metaplectic representation
8.2.1 Fourier integral operators with quadratic phases
8.2.2 Definition and properties of Mp(n)
8.3 The projection $\pi^{\mathrm{Mp}}$
8.3.1 Definition of the covering projection
8.3.2 Construction of $\pi^{\mathrm{Mp}}$
8.4 Comments and references

9 The property of symplectic covariance
9.1 Symplectic covariance of the cross-Wigner transform
9.1.1 The Heisenberg displacement operators
9.1.2 The Grossmann–Royer reflections
9.1.3 Symplectic covariance of W(ψ, ϕ) and A(ψ, ϕ)
9.2 Application: the action of Mp(n) on Gaussians
9.2.1 A first result
9.2.2 Pre-Iwasawa factorization of a symplectic matrix
9.2.3 Application to Gaussians
9.3 Symplectic covariance of Weyl operators
9.4 Maximality of the symplectic group
9.4.1 A technical lemma
9.4.2 The maximality property for the Wigner transform
9.4.3 Maximal covariance of Weyl operators
9.4.4 Born–Jordan operators
9.5 Comments and references

10 The Feichtinger algebra
10.1 Definition and first properties
10.1.1 Definition of the Feichtinger algebra
10.1.2 First properties
10.1.3 The Banach algebra property
10.2 The metaplectic invariance of $S_0(\mathbb{R}^n)$
10.3 The dual space $S_0'(\mathbb{R}^n)$
10.4 The modulation spaces $M^1_s(\mathbb{R}^n)$
10.5 Comments and references

11 Hilbert–Schmidt operators
11.1 Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$
11.1.1 First definition
11.1.2 Second definition
11.1.3 Equivalence of the definitions; the Hilbert–Schmidt norm
11.2 Further properties of the Hilbert–Schmidt operators
11.2.1 Compactness of Hilbert–Schmidt operators
11.2.2 Ideal property
11.3 Hilbert–Schmidt and Weyl operators
11.4 Comments and references

12 The trace class
12.1 Definitions of trace class operators
12.1.1 Statement of the definitions
12.1.2 Proof of the equivalence of the definitions
12.1.3 Definition of the trace
12.2 Properties of the trace class
12.2.1 Trace class operators form an ideal
12.2.2 The trace class norm
12.3 Trace formulas
12.3.1 A product formula
12.3.2 Relationship between the trace and the Weyl symbol
12.4 Comments and references

13 The quantum Bochner theorem
13.1 The positivity question
13.2 Bochner’s theorem
13.3 The quantum case
13.3.1 The KLM conditions
13.3.2 The main result; statement and proof
13.4 The case of Gaussian distributions
13.4.1 Two lemmas
13.4.2 The condition on Σ
13.5 Positive trace-class operators
13.6 Comments and references

14 The density operator
14.1 Mixed quantum states: heuristics
14.2 Density operator: definition and first properties
14.2.1 Definition of a density operator
14.2.2 First properties
14.2.3 The convexity property
14.2.4 The spectral theorem for density operators
14.3 The Wigner distribution as a Weyl symbol
14.3.1 The density operator as a Weyl operator
14.3.2 Application to the statistical interpretation of $\hat{\rho}$
14.4 Quantum condition on the covariance matrix
14.4.1 The covariance matrix of a density operator
14.4.2 The quantum condition
14.5 The Narcowich–Wigner spectrum
14.6 Comments and references

15 The uncertainty principle
15.1 Feichtinger states
15.2 The Robertson–Schrödinger inequalities
15.3 Gromov’s symplectic non-squeezing theorem
15.3.1 Statement of Gromov’s theorem
15.3.2 Proof of Gromov’s theorem in the linear case
15.3.3 The Gromov width
15.4 The covariance ellipsoid
15.5 Uncertainty and quantum polarity
15.5.1 Orthogonal projections of the covariance ellipsoid
15.5.2 Polar duality in convex geometry
15.5.3 The fundamental property of $\Omega_X$ and $\Omega_P$
15.6 Comments and references

16 Separability and entanglement
16.1 The reduced density operator
16.1.1 Physical motivation
16.1.2 Traditional Hilbert space approach
16.1.3 The reduced density operator in harmonic analysis
16.2 Separability and the PPT condition
16.2.1 Separable and entangled quantum states
16.2.2 The PPT theorem
16.2.3 The Schur complement
16.2.4 Orthogonal projections of the covariance ellipsoid
16.3 Werner and Wolf’s condition
16.3.1 Statement and equivalent formulation
16.3.2 A geometric consequence
16.4 Comments and references

17 Separability of Gaussian states
17.1 The purity of a Gaussian state
17.1.1 A general purity formula for Gaussian states
17.1.2 Pure Gaussian states
17.2 Separability of Gaussian quantum states
17.2.1 Generalities, a sufficient condition for separability
17.2.2 Reduced states of a Gaussian density operator
17.2.3 Back to the Werner and Wolf conditions
17.3 Disentanglement of Gaussian states
17.3.1 A diagonalization result for positive symplectic matrices
17.3.2 The disentanglement result
17.4 The case of Gaussians
17.5 Comments and references

Bibliography
Index
Preface

In the early 1970s, one of my analysis teachers, Jean Dieudonné (who was, incidentally, one of the founding members of the “Bourbaki group”), declared, during a visit to the Mathematisches Kolloquium in Vienna (organized at that time by Hans Reiter), that “harmonic analysis was off-stream” and that one … “should no longer expect any interesting developments in this area”. While Dieudonné is well known for his provocative and unfair statements (he had also severely criticized Henri Poincaré in A History of Algebraic and Differential Topology for his alleged lack of rigor), he should have known better, especially since he had himself contributed to the area! It is an ironic answer to Dieudonné that harmonic analysis has become over the years a thriving area of both pure and applied mathematics under the impetus of scientists working in several related areas, for instance, in time-frequency analysis with the study of modulation spaces initiated by Hans Feichtinger.

Quantum mechanics is arguably one of the most successful scientific theories ever; by quantum mechanics, we mean here quantum physics, quantum chemistry, or quantum optics, which refer to the same thing, even if in a few circumstances there could be a subtle difference between these topics. Historically, the story goes as follows: In the 1920s, physicists (Schrödinger, Heisenberg, Born, and many others) developed mathematical tools that described the quantum behavior of subatomic particles. At this point, physicists began calling the new field “quantum mechanics” on the model of the phrase “classical mechanics”. One could expect from such a successful scientific theory that it is based on a set of well-defined and unambiguous axioms (or postulates), as is the case for classical mechanics, which is governed by Newton’s laws. Unfortunately, there seems to be no consensus among physicists about which system of axioms should be used.
The difficulty (for physicists) is that quantum mechanics is still plagued with severe interpretational issues: One does not really know how to interpret the results one is dealing with (the “Wigner’s friend” paradox is emblematic of these difficulties). For instance, does the wavefunction have any physical reality? And how should its collapse be interpreted? Questions like these are numerous, and one sees new and often contradictory recipes or interpretations appear every day in specialized journals, or in preprint repositories like arXiv. Physicists have developed, with the help of philosophers, a whole industry based on the spreading of unending discussions; it has also given birth to countless papers and books popularizing quantum mechanics, often in weird or even wrong ways. The reader can get a taste of these issues by reading the entry https://plato.stanford.edu/entries/qt-issues/ of the Stanford Encyclopedia of Philosophy. As the physicist Richard Feynman put it, somewhat jokingly, “Philosophy of science is about as useful to scientists as ornithology is to birds.”
For a mathematician, this situation is of course not tenable, and we will view quantum mechanics in this book as a branch of mathematics based on a well-defined and unambiguous system of axioms. There are many possible choices of (equivalent, or non-equivalent) such systems. We will apply the principle of parsimony and propose the simplest (and therefore the most general) set of axioms.

It is my pleasure and duty to thank the following mathematicians and friends for fruitful conversations and encouragement. In alphabetical and geographical order: Nuno Dias and Joao Prata (Lisbon), Basil Hiley and Glen Dennis (London), Leon Cohen (New York), Jean-Pierre Gazeau (Paris), Franz Luef (Trondheim), Paolo Boggiatto, Elena Cordero, Fabio Nicola, Luigi Rodino (Turin), Markus Faulhuber, Hans Feichtinger, and Karlheinz Gröchenig (Vienna). My deepest gratitude goes to Charlyne de Gosson for having read the manuscript and pointed out several typos and errors.
Acknowledgement This work has been financed by the Grant number P 33447 of the Austrian Research Agency FWF (Fonds zur Förderung der wissenschaftlichen Forschung).
Introduction

This book is about mathematical techniques relevant to the study of theoretical quantum mechanics. As its title indicates, it is an introduction to the vast topic of quantum harmonic analysis, and not a treatise on general harmonic analysis. We therefore do not aim at completeness; for reasons of time and space, several topics have been omitted; for instance, we do not discuss the important notions of quantum channels and completely positive maps, nor do we give physical applications (this is after all a book on mathematics, not physics!). However, we discuss in some depth a few difficult subjects that are still being actively investigated in current research; for instance, the questions of positivity for density operators and the separability and entanglement of quantum states, which are notoriously difficult. These are advanced notions where open mathematical problems still abound.

The book is structured as a series of lecture notes, consisting of rather short chapters, each of which can (ideally) be taught during a 90-minute session. The prerequisites are rather modest: The reader is supposed to have a basic knowledge of linear algebra and analysis (advanced calculus). The basic notions of functional analysis (mainly the theory of Hilbert spaces and distributions) are reviewed in a preliminary chapter, mostly without proofs. Some familiarity with elementary quantum mechanics would certainly be helpful, but is not required. More precisely:

– Chapter 1 is a review of elementary properties from analysis and geometry. This chapter is meant to be a reference tool for the reader; we recall the basic notions from multivariate calculus, function spaces, and distribution theory, while fixing the notation we will use. If a reader thinks he is sufficiently familiar with these topics, he is encouraged to skip this chapter and directly proceed to the core of the book, beginning in the next chapter.
– Chapters 2 and 3 set the foundations for harmonic analysis in quantum mechanics: We introduce two elementary mathematical objects, the Heisenberg displacement operator and the Grossmann–Royer reflection operator, and use them to construct the Wigner transform and its symplectic Fourier transform, the ambiguity function. This approach to the Wigner formalism is not quite conventional, but has many advantages (conceptual and practical). Perhaps one of the most useful formulas derived in Chapter 3 is the Moyal identity. It is, among many other things, a bridge between orthogonality in $L^2(\mathbb{R}^n)$ and orthogonality in $L^2(\mathbb{R}^{2n})$.
– In Chapter 4, we take a short break from theory and use the Wigner formalism previously introduced to calculate explicitly the cross-Wigner transforms of Gaussian and Hermite functions, thus obtaining a useful collection of formulas that are otherwise found scattered throughout the literature. It gives us the opportunity to quickly review some notions from the theory of special functions and polynomials.
– In Chapter 5, we are back to work. We define and study the properties of arguably the most important “quantization” procedure, the Weyl transform. Its genuine mathematical interest (besides being a theory of pseudodifferential operators) comes from Schwartz’s distributional kernel theorem, which allows one to prove that every continuous linear operator from the space of Schwartz test functions to the space of tempered distributions is a Weyl operator, in the sense that one can associate to it (in a unique way) a symbol, which is a function (or tempered distribution) defined on phase space. (In the language of quantum mechanics, one would say that to every Weyl operator corresponds a unique “classical observable”, and vice versa.)
– Chapter 6 is about the so-called Cohen class. It is the key to the alternative quantization scheme due to Born and Jordan that we study in the next chapter. Elements of the Cohen class are the convolutions of the Wigner transform with suitable tempered distributions (“Cohen kernels”). They are generalizations of the Wigner transform, and one recovers the latter by choosing the Dirac measure as Cohen kernel. The importance of the Cohen class has been growing over recent years, especially in time-frequency analysis and in the study and damping of interference effects. In our case, it will be instrumental for the definition of Born–Jordan quantization as an alternative to the Weyl transform.
– In Chapter 7, which is devoted to a quantization procedure originally due (in a physical form) to Max Born and Pascual Jordan (and Werner Heisenberg), we begin by discussing (as an appetizer) some ordering issues for the quantization of monomials, and thereafter define the Born–Jordan quantization of arbitrary symbols. Born–Jordan quantization might very well be the physically correct way of associating operators to classical observables in quantum theory. It is constructed using an averaging procedure involving Shubin’s τ-pseudodifferential operators.
– Chapter 8 is devoted to the metaplectic machinery, which is in principle well known (at least in its rudimentary forms) to the harmonic analysis community. Quite abstractly, the metaplectic group is a unitary representation on the square-integrable functions of the double cover of the symplectic group. Its importance comes from the fact that to every symplectic matrix the metaplectic representation associates two unitary operators, and these operators represent in a sense quantizations of the corresponding symplectic matrices.
– In Chapter 9, we use the metaplectic representation of the symplectic group to study the so-called symplectic covariance property of Weyl quantization. This is an essential property, characteristic of the Weyl transform, and is extremely useful in all applications since it allows one to simplify many problems considerably. Roughly speaking, symplectic covariance means that, if one makes a linear symplectic change of variables S in a symbol, then the corresponding Weyl operator is conjugated by either of the metaplectic operators $\pm\hat{S}$ arising from that change of variables.
– Chapter 10 is devoted to a short study of the so-called Feichtinger algebra and to its most elementary weighted extension. The Feichtinger algebra is the simplest of all modulation spaces, a class of functional spaces introduced in the mid-1980s by Hans Feichtinger. While they are usually defined using the Gabor transform, in our approach the properties of the Wigner transform are used; this not only makes their theory explicitly symplectically covariant, but is also much better adapted to the study of quantum mechanical objects (for instance, the density operator). The importance of the Feichtinger algebra and its extensions comes in our context from the fact that they provide us with the right framework for studying density operators and the associated mixed quantum states.
– Chapters 11 and 12 are about two very well-known and related classes of operators, the Hilbert–Schmidt and trace-class operators. In addition to their intrinsic interest, these chapters are necessary preparation for a rigorous mathematical study of the density operator of quantum mechanics.
– Chapter 13 is a technical chapter. It is about a rather difficult topic, that of the positivity of trace-class operators. Positivity questions have been of great concern since the very beginning of the theory of pseudodifferential operators, and many problems are still open. We give a new proof of the so-called quantum Bochner theorem, which is the most important known necessary and sufficient condition for the positivity of a trace-class operator.
– Chapter 14 is about mixed states and the corresponding density operator. A mixed quantum state is the datum of a set of pairs $\{(\psi_j, \lambda_j)\}$ where $\psi_j$ is a (normalized square-integrable) pure state and $\lambda_j$ a classical probability; these probabilities sum up to one: $\sum_j \lambda_j = 1$. The study of these objects is at the heart of quantum mechanics, both for theoretical and practical reasons. One could say that quantum mechanics ultimately is the study of density operators.
– In Chapter 15, we analyze the relationship between the positivity properties of density operators and the quantum uncertainty principle from a rigorous point of view, where the usual Heisenberg and Robertson–Schrödinger inequalities play only an accessory role. We also propose a new approach to quantum uncertainty using convex geometry (the property of polar duality).
– Chapters 16 and 17 are devoted to the study of the notions of entanglement and separability of quantum states. These notions play a privileged role in quantum mechanics and are at the heart of modern research. While not much is known about the general characterization of entanglement and separability of arbitrary mixed states, the situation for Gaussian states is somewhat more satisfactory. Not only are these states somewhat easier to study analytically than general states, but they are also omnipresent in many applications (quantum optics and computing, for instance).
1 Preliminaries

We collect in this introductory chapter some basic reference material about Hilbert spaces, distribution spaces, and symplectic geometry. The properties are given mostly without proof (an exception being the proof of Williamson’s symplectic diagonalization theorem). For a detailed study of the symplectic group and its properties (and, more generally, of symplectic geometry), we refer to our book [30]; for functional analysis at an elementary level, [6, 57] are good readings.
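1.1 Vector calculus: notation

1.1.1 Vectors, matrices, and more

In matrix calculations, elements of $\mathbb{R}^n$ should be viewed as column vectors; for instance
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]
However, for typographic economy, we will usually write $x = (x_1, \dots, x_n)$ in the text. The Euclidean scalar product $\cdot$ and norm $|\cdot|$ on $\mathbb{R}^n$ are defined by
\[
x \cdot y = x^T y = \sum_{j=1}^n x_j y_j, \qquad |x| = \sqrt{x \cdot x}.
\]
We will often write $x \cdot x = x^2$ when no confusion is likely to arise. The gradient operator in the variables $x_1, \dots, x_n$ is
\[
\partial_x = \begin{pmatrix} \partial/\partial x_1 \\ \vdots \\ \partial/\partial x_n \end{pmatrix},
\]
or, with the same abuse of notation as above, $\partial_x = (\partial/\partial x_1, \dots, \partial/\partial x_n)$. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}$ be differentiable functions; in matrix form, the chain rule is
\[
\partial(g \circ f)(x) = (Df(x))^T \, \partial g(f(x)) \tag{1.1}
\]
where $Df(x)$ is the Jacobian matrix of $f$: if $f = (f_1, \dots, f_m)$ is a differentiable mapping $\mathbb{R}^m \to \mathbb{R}^m$, then
\[
Df = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_m} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_m} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_m}
\end{pmatrix}. \tag{1.2}
\]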
Let $y = f(x)$; we will indifferently use the notations
\[
Df(x), \qquad \frac{\partial y}{\partial x}, \qquad \frac{\partial(y_1, \dots, y_m)}{\partial(x_1, \dots, x_m)}
\]
for the Jacobian matrix. If $f$ is invertible, the inverse function theorem says that
\[
D(f^{-1})(y) = [Df(x)]^{-1}. \tag{1.3}
\]
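If $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function, its Hessian calculated at a point $x$ is the symmetric matrix of second derivatives
\[
D^2 f(x) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}. \tag{1.4}
\]
Notice that the Jacobian and Hessian matrices are related by the formula
\[
D(\partial f)(x) = D^2 f(x). \tag{1.5}
\]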
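The chain rule (1.1) and the relation (1.5) lend themselves to a quick numerical sanity check. The following is only a sketch using central finite differences: the maps `f` and `g`, the base point `x0`, and the step sizes are hypothetical choices made for illustration, and `g` is taken scalar-valued so that its gradient, evaluated at `f(x)`, makes sense.

```python
import math

def gradient(g, x, h=1e-5):
    """Central-difference approximation of the gradient of a scalar function g at x."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((g(xp) - g(xm)) / (2 * h))
    return grad

def jacobian(f, x, h=1e-5):
    """Central-difference Jacobian J[i][j] = d f_i / d x_j, laid out as in (1.2)."""
    n = len(x)
    columns = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

def f(x):
    # hypothetical differentiable map R^2 -> R^2
    return [x[0] * x[1], math.sin(x[0]) + x[1] ** 2]

def g(y):
    # hypothetical scalar function on R^2
    return math.exp(y[0]) + y[0] * y[1]

x0 = [0.3, -0.7]

# Chain rule: gradient of g∘f at x0 versus Df(x0)^T applied to the gradient of g at f(x0).
lhs = gradient(lambda x: g(f(x)), x0)
J = jacobian(f, x0)
grad_g = gradient(g, f(x0))
rhs = [sum(J[i][j] * grad_g[i] for i in range(2)) for j in range(2)]
chain_rule_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# Relation (1.5): the Jacobian of the gradient of g is the Hessian of g.
hessian = jacobian(lambda y: gradient(g, y), x0, h=1e-4)
exact_hessian = [[math.exp(x0[0]), 1.0], [1.0, 0.0]]
hessian_error = max(abs(hessian[i][j] - exact_hessian[i][j])
                    for i in range(2) for j in range(2))
```

Both errors should be small compared with the entries involved; the Hessian is obtained by differencing a finite-difference gradient, so it is less accurate than the first-order quantities.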
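Also note the following useful formulae:
\[
A\partial_x \cdot \partial_x \, e^{-\frac{1}{2} Mx \cdot x} = \big(MAMx \cdot x - \operatorname{Tr}(AM)\big)\, e^{-\frac{1}{2} Mx \cdot x} \tag{1.6}
\]
\[
Bx \cdot \partial_x \, e^{-\frac{1}{2} Mx \cdot x} = -(MBx \cdot x)\, e^{-\frac{1}{2} Mx \cdot x} \tag{1.7}
\]
where $A$, $B$, and $M$ are symmetric $n \times n$ matrices. (Both follow from $\partial_x e^{-\frac{1}{2} Mx \cdot x} = -Mx \, e^{-\frac{1}{2} Mx \cdot x}$; in dimension one, (1.7) reduces to $x \frac{d}{dx} e^{-x^2/2} = -x^2 e^{-x^2/2}$.)

1.1.2 Multi-index notation

By convention, the set of all non-negative integers is denoted by $\mathbb{N}$: $\mathbb{N} = \{0, 1, 2, \dots\}$ (this convention is not universal; in some countries 0 is not an element of $\mathbb{N}$). By definition, an ($n$-)multi-index is an element $\alpha = (\alpha_1, \dots, \alpha_n)$ of $\mathbb{N}^n$. We call the integer $|\alpha| = \alpha_1 + \cdots + \alpha_n$ the “length” of the multi-index $\alpha$ and define, for $x \in \mathbb{R}^n$, the “power” $x^\alpha$ by
\[
x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \qquad \partial_x^\alpha = \partial_{x_1}^{\alpha_1} \cdots \partial_{x_n}^{\alpha_n}.
\]
Likewise,
\[
\alpha! = \alpha_1! \cdots \alpha_n!, \qquad \binom{\alpha}{\beta} = \frac{\alpha!}{\beta!\,(\alpha - \beta)!}
\]
when $\alpha \ge \beta$, that is, when $\alpha_j \ge \beta_j$ for $1 \le j \le n$.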
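The multi-index conventions translate directly into a few small functions. This is a minimal sketch in which multi-indices are represented as tuples of non-negative integers; the function names and the sample values are hypothetical and chosen only for illustration.

```python
from math import factorial, prod

def length(alpha):
    """|alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def power(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    return prod(xj ** aj for xj, aj in zip(x, alpha))

def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!."""
    return prod(factorial(a) for a in alpha)

def multi_binomial(alpha, beta):
    """Binomial coefficient alpha! / (beta! (alpha - beta)!), defined when beta <= alpha componentwise."""
    if any(b > a for a, b in zip(alpha, beta)):
        raise ValueError("beta must satisfy beta_j <= alpha_j for every j")
    diff = tuple(a - b for a, b in zip(alpha, beta))
    return multi_factorial(alpha) // (multi_factorial(beta) * multi_factorial(diff))

alpha = (2, 0, 3)
beta = (1, 0, 2)
x = (2.0, 5.0, -1.0)

alpha_length = length(alpha)             # 2 + 0 + 3
x_to_alpha = power(x, alpha)             # 2^2 * 5^0 * (-1)^3
alpha_fact = multi_factorial(alpha)      # 2! * 0! * 3!
binom = multi_binomial(alpha, beta)      # product of the one-variable binomial coefficients
```

Note that the multi-index binomial coefficient factorizes as the product of the ordinary binomial coefficients of the components, which is why integer division is exact here.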
1.2 Hilbert spaces

Hilbert spaces (and their cousins, Banach spaces) are the best-known functional spaces in elementary functional analysis. By definition, a complex Hilbert space is a vector space $H$ having the following properties:
– $H$ is endowed with an inner product (also called a scalar product) $(\cdot,\cdot)_H$, that is, a sesquilinear form $(\cdot,\cdot)_H : H \times H \to \mathbb{C}$ such that (i) $(\psi,\psi)_H > 0$ for all $\psi \in H$, $\psi \neq 0$; (ii) $(\psi,\phi)_H = 0$ for all $\phi \in H$ implies that $\psi = 0$; (iii) $(\psi,\phi)_H = \overline{(\phi,\psi)_H}$. We suppose, by convention, that the anti-linearity takes place in the second term of the scalar product, i.e., $(\lambda\psi,\phi)_H = \lambda(\psi,\phi)_H$ and $(\psi,\lambda\phi)_H = \overline{\lambda}(\psi,\phi)_H$ for $\lambda \in \mathbb{C}$;
– Let $\|\cdot\|_H$ be the norm associated with $(\cdot,\cdot)_H$: $\|\psi\|_H = (\psi,\psi)_H^{1/2}$. The space $H$ is complete for the topology associated with this norm; i.e., every Cauchy sequence $(\psi_j)_j$ is convergent.

The standard example (and perhaps the most useful in harmonic analysis and quantum mechanics) is the space $L^2(\mathbb{R}^n)$ of all square-integrable functions equipped with the scalar product
\[
(\psi|\phi)_{L^2} = \int_{\mathbb{R}^n} \psi(x)\,\overline{\phi(x)}\,dx.
\]
We mention that the scalar product $(\psi|\phi)_{L^2}$ is related to the physicist’s “bra-ket” product $\langle\psi|\phi\rangle$ by conjugation:
\[
\langle\psi|\phi\rangle = \overline{(\psi|\phi)_{L^2}},
\]
which reflects the fact that physicists mostly use the opposite convention for sesquilinearity of the scalar product of functions.

1.2.1 Example: the Sobolev spaces $H^s$

The Sobolev spaces $H^s$ are well-known functional spaces, much used in the theory of partial differential operators and in microlocal analysis.

Definition 1. Let $s \in \mathbb{R}$. The Sobolev space $H^s(\mathbb{R}^n)$ consists of all $\psi \in \mathcal{S}'(\mathbb{R}^n)$ whose Fourier transforms $\hat{\psi} = F\psi$ are functions satisfying
\[
\|\psi\|_{H^s} = \left( \int_{\mathbb{R}^n} |\hat{\psi}(p)|^2 (1 + |p|^2)^s \, dp \right)^{1/2} < \infty.
\]

In particular, $H^0(\mathbb{R}^n) = L^2(\mathbb{R}^n)$, and we have $H^s(\mathbb{R}^n) \subset H^{s'}(\mathbb{R}^n)$ if $s' \le s$. It is clear that $H^s(\mathbb{R}^n)$ is a vector space, and one proves that $\psi \mapsto \|\psi\|_{H^s}$ is a norm on $H^s(\mathbb{R}^n)$ that endows it with a Banach space topology. Moreover, one has the inclusions
\[
H^s(\mathbb{R}^n) \subset C^k(\mathbb{R}^n) \quad \text{for } s > \frac{n}{2} + k
\]
and
\[
\mathcal{S}(\mathbb{R}^n) \subset \bigcap_{s \ge 0} H^s(\mathbb{R}^n) \subset C^\infty(\mathbb{R}^n).
\]
We also have $\delta \in H^s(\mathbb{R}^n)$ for $s < -n/2$, and, in fact, $H^s(\mathbb{R}^n)$ is a Hilbert space for the scalar product
\[
(\psi|\phi)_{H^s} = \int_{\mathbb{R}^n} \hat{\psi}(p)\,\overline{\hat{\phi}(p)}\,(1 + |p|^2)^s \, dp.
\]
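Returning to the sesquilinearity convention fixed at the beginning of Section 1.2, the difference between the scalar product used here and the physicists' bra-ket is easy to see on $\mathbb{C}^n$, the finite-dimensional analogue of $L^2$. The sketch below is illustrative only; the function names and the sample vectors are hypothetical.

```python
def book_inner(psi, phi):
    """(psi|phi): linear in psi, anti-linear in phi (the convention of Section 1.2)."""
    return sum(a * b.conjugate() for a, b in zip(psi, phi))

def braket(psi, phi):
    """<psi|phi>: anti-linear in psi, linear in phi (the physicists' convention)."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

psi = [1 + 2j, 0.5j, -1 + 0j]
phi = [2 - 1j, 3 + 0j, 1j]
lam = 0.5 - 2j

value = book_inner(psi, phi)

# The two conventions differ by complex conjugation.
conjugation_gap = abs(braket(psi, phi) - value.conjugate())

# Linearity in the first slot, anti-linearity in the second slot.
first_slot_gap = abs(book_inner([lam * a for a in psi], phi) - lam * value)
second_slot_gap = abs(book_inner(psi, [lam * b for b in phi]) - lam.conjugate() * value)

# (psi|psi) is real and positive, so it defines a norm.
norm_squared = book_inner(psi, psi)
```

With either convention the induced norm is the same, since $(\psi|\psi)$ is real; only off-diagonal products pick up a conjugation.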
1.2.2 The projection theorem

The main virtue of Hilbert spaces in quantum mechanics is that one can define unambiguously orthogonal projections on closed convex subsets of that space (hence, in particular, on closed linear subspaces). This property is summarized in the following theorem.

Theorem 2. Let $\mathcal{C}$ be a non-empty closed convex subset of the Hilbert space $H$. For every $\psi \in H$, there exists a unique $\psi_0 \in \mathcal{C}$ such that
\[
\|\psi - \psi_0\|_H \le \|\psi - \phi\|_H \quad \text{for all } \phi \in \mathcal{C},
\]
that is, $\|\psi - \psi_0\|_H = d(\psi, \mathcal{C})$ ($d$ the distance function). The vector $\psi_0$ is the only element of $\mathcal{C}$ satisfying
\[
\operatorname{Re}(\psi - \psi_0|\phi - \psi_0)_H \le 0 \quad \text{for all } \phi \in \mathcal{C}.
\]

Let $(\psi_j)_j$ be an orthonormal basis of $H$: $(\psi_j|\psi_k)_H = \delta_{jk}$. Then the Bessel identity (also known as Parseval's identity) holds:
\[
\|\psi\|_H^2 = \sum_j |(\psi|\psi_j)_H|^2 \tag{1.8}
\]
for every $\psi \in H$. This property is the main motivation for the use of Hilbert spaces in quantum mechanics because it allows analyzing the reduction of wavefunctions.

1.2.3 Tensor products of Hilbert spaces

Let $H_1$ and $H_2$ be two Hilbert spaces and $(\psi_1, \psi_2) \in H_1 \times H_2$. The tensor product $\psi_1 \otimes \psi_2$ is the form $H_1 \times H_2 \to \mathbb{C}$ defined by
\[
(\psi_1 \otimes \psi_2)(\phi_1, \phi_2) = (\psi_1|\phi_1)_{H_1} (\psi_2|\phi_2)_{H_2}.
\]
Let $F$ be the set of all finite linear combinations $f = \sum_{i=1}^k \lambda_i (\psi_{1,i} \otimes \psi_{2,i})$ of such forms. It has a natural vector space structure. We equip $F$ with the inner product
\[
\Big( \sum_i \lambda_i (\psi_{1,i} \otimes \psi_{2,i}) \,\Big|\, \sum_j \lambda'_j (\psi'_{1,j} \otimes \psi'_{2,j}) \Big)_F = \sum_{i,j} \lambda_i \overline{\lambda'_j}\, (\psi_{1,i}|\psi'_{1,j})_{H_1} (\psi_{2,i}|\psi'_{2,j})_{H_2}.
\]
It is easy to show that this definition does not depend on the choice of linear combinations representing the elements of $F$ and that we have $(f|f)_F > 0$ for every $f \in F$, $f \neq 0$.

Definition 3. The tensor product $H_1 \otimes H_2$ of the two Hilbert spaces $H_1$ and $H_2$ is the completion of the vector space $F$ equipped with the inner product $(\cdot|\cdot)_F$.

Associated with the tensor product of Hilbert spaces is the tensor product of operators. We limit ourselves here to the case of bounded operators $A \in \mathcal{B}(H_1)$ and $B \in \mathcal{B}(H_2)$.

Definition 4. The tensor product $A \otimes B$ is the bounded operator on $H_1 \otimes H_2$ defined by
\[
(A \otimes B)\Big( \sum_i \lambda_i\, \psi_{1,i} \otimes \psi_{2,i} \Big) = \sum_i \lambda_i\, A\psi_{1,i} \otimes B\psi_{2,i}.
\]
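In finite dimensions, the tensor product of vectors and of operators can be realized by the Kronecker product in a product basis. The following sketch (with hypothetical helper names and sample data, and with $H_1 = \mathbb{C}^2$, $H_2 = \mathbb{C}^3$ chosen only for illustration) checks the two defining properties: the inner product of elementary tensors factorizes, and $A \otimes B$ acts factor by factor.

```python
def inner(psi, phi):
    """Scalar product on C^n, anti-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(psi, phi))

def kron_vec(u, v):
    """Coordinates of u ⊗ v in the product basis e_i ⊗ f_k (i major, k minor)."""
    return [a * b for a in u for b in v]

def kron_mat(A, B):
    """Matrix of A ⊗ B in the same product basis."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical sample vectors in C^2 and C^3.
psi1, phi1 = [1 + 1j, 2 + 0j], [1 + 0j, -1j]
psi2, phi2 = [1j, 1 + 0j, -1 + 0j], [2 + 0j, 1 + 1j, 0j]

# (psi1 ⊗ psi2 | phi1 ⊗ phi2) = (psi1|phi1)(psi2|phi2)
factorization_gap = abs(
    inner(kron_vec(psi1, psi2), kron_vec(phi1, phi2))
    - inner(psi1, phi1) * inner(psi2, phi2)
)

# Hypothetical operators on C^2 and C^3.
A = [[1, 2j], [0, 1]]
B = [[0, 1, 0], [1, 0, 0], [0, 0, 2]]

# (A ⊗ B)(psi1 ⊗ psi2) = (A psi1) ⊗ (B psi2)
lhs = matvec(kron_mat(A, B), kron_vec(psi1, psi2))
rhs = kron_vec(matvec(A, psi1), matvec(B, psi2))
operator_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The ordering of the product basis is a convention; what matters is that `kron_vec` and `kron_mat` use the same one.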
1.2.4 Compact operators ̂ : ℬ(H) → ℬ(H). We denote by ℬ(H) the vector space of all bounded linear operators A ̂ ̂ We have A ∈ ℬ(H) if and only if A is continuous at 0, which is equivalent to saying that ̂ H ≤ ‖A‖ ̂ ‖ψ‖H where ‖Aψ‖ ̂ = sup{‖Aψ‖ ̂ H : ‖ψ‖H ≤ 1}. ‖A‖ ̂ ∈ ℬ(H) is said to be compact if, whenever (ψj ) is a Definition 5. An operator A ̂ j ) contains a convergent subsequence. Equivabounded sequence in H, then (Aψ ̂ lently, A is compact if and only if it takes every bounded subset of H to a relatively compact subset of H. We assumed in the definition from the beginning that a compact is bounded, so this condition is actually redundant because every operator on a Hilbert space satisfying the conditions above is automatically compact. Compact operators form a sub̂ j ) is a sequence space 𝒦(H) of ℬ(H) and that subspace is closed in ℬ(H): Assume that (A ̂ ̂ ̂ ̂ ∈ 𝒦(H). in 𝒦(H) such that there exists A ∈ ℬ(H) with limj→∞ ‖Aj − A‖ = 0. Then A ∗ ̂ ̂ It immediately follows from the definition that A ∈ 𝒦(H) if and only if A ∈ 𝒦(H). Also: ̂ ∈ 𝒦(H) is the limit (in the operator Proposition 6. Assume that H is separable. Every A norm) of a sequence of finite rank operators.
(Recall that an operator $\hat{A}$ on H is of finite rank if and only if the range of $\hat{A}$ is finite-dimensional.) Of particular interest are the spectral properties of compact self-adjoint operators. The following theorem is essential:

Theorem 7 (Spectral Theorem). Assume that $\hat{A} \in \mathcal{K}(H)$ is self-adjoint: $\hat{A} = \hat{A}^*$. Then: (i) The set $(\lambda_j)$ of eigenvalues of $\hat{A}$ consists of real numbers and is at most countable, and, if infinite, $\lim_{j\to\infty} \lambda_j = 0$; (ii) The multiplicity of every nonzero eigenvalue $\lambda_j$ is finite; (iii) If $\hat{A} \ne 0$, there exists an at most countable set $(\psi_j)$ of orthonormal eigenvectors of $\hat{A}$, with $\psi_j$ an eigenvector for the eigenvalue $\lambda_j$, such that
\[
\hat{A}\psi = \sum_j \lambda_j (\psi|\psi_j)_H\, \psi_j \tag{1.9}
\]
for every $\psi \in H$.
The spectral theorem for compact operators plays a fundamental role in the study of the density operator in quantum mechanics.
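Every operator on a finite-dimensional space is compact, so formula (1.9) can be illustrated with a Hermitian matrix. The sketch below assumes $H = \mathbb{C}^5$ and uses NumPy's `np.linalg.eigh`; it is an illustration of the statement, not a substitute for the infinite-dimensional theorem.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Random Hermitian matrix: a self-adjoint operator on C^n.
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (X + X.conj().T) / 2

# eigh returns real eigenvalues and orthonormal eigenvectors (as columns).
eigenvalues, eigenvectors = np.linalg.eigh(A)

def inner(psi, phi):
    """Inner product (psi|phi), linear in psi and antilinear in phi."""
    return np.vdot(phi, psi)

psi = rng.normal(size=n) + 1j * rng.normal(size=n)

# Spectral formula (1.9): A psi = sum_j lambda_j (psi|psi_j) psi_j.
spectral_sum = sum(
    lam * inner(psi, eigenvectors[:, j]) * eigenvectors[:, j]
    for j, lam in enumerate(eigenvalues)
)

print(np.allclose(A @ psi, spectral_sum))
```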
1.2.5 Positive operators
An operator $\hat{A} \in \mathcal{B}(H)$ is said to be positive semidefinite (written $\hat{A} \ge 0$) if we have $(\hat{A}\psi|\psi)_H \ge 0$ for every $\psi \in H$. Assuming that H is a complex Hilbert space, we have:

Proposition 8. Let $\hat{A} \in \mathcal{B}(H)$; we have $\hat{A} \ge 0$ if and only if either of the following two equivalent statements holds: (i) There exists $\hat{B} \in \mathcal{B}(H)$ such that $\hat{A} = \hat{B}^*\hat{B}$; (ii) There exists a unique $\hat{C} \in \mathcal{B}(H)$, $\hat{C} \ge 0$, such that $\hat{A} = \hat{C}^2$.

In particular, a positive semidefinite bounded operator on a complex Hilbert space is self-adjoint. The operator $\hat{C}$ defined by (ii) is called the square root of $\hat{A}$ and denoted by $\hat{A}^{1/2}$ (or $\sqrt{\hat{A}}$). By definition, the operator $|\hat{A}| = (\hat{A}^*\hat{A})^{1/2}$ is the modulus (or absolute value) of $\hat{A}$.

Proposition 9. Every self-adjoint $\hat{A} \in \mathcal{B}(H)$ can be written (in a unique way) as $\hat{A} = \hat{A}_+ - \hat{A}_-$ where $\hat{A}_+ \ge 0$ and $\hat{A}_- \ge 0$ belong to $\mathcal{B}(H)$ and satisfy $\hat{A}_+\hat{A}_- = 0$.

When $\hat{A}$ is in addition compact, one way to prove this is to use the spectral formula (1.9): then
\[
\hat{A}_+\psi = \sum_j \lambda_j^+ (\psi|\psi_j)_H\, \psi_j, \qquad \hat{A}_-\psi = \sum_j \lambda_j^- (\psi|\psi_j)_H\, \psi_j
\]
where $\lambda_j^+ = \max(\lambda_j, 0)$ and $\lambda_j^- = \max(-\lambda_j, 0)$ are the positive and negative parts of the eigenvalues of $\hat{A}$.
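The constructions of Section 1.2.5 (positive and negative parts, square root, modulus) can be sketched for a Hermitian matrix by acting on its eigenvalues. The helper `from_spectrum` below is a hypothetical name introduced for this illustration; the matrix size and random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (X + X.conj().T) / 2          # self-adjoint, in general indefinite

lam, V = np.linalg.eigh(A)

def from_spectrum(values):
    """Operator with the eigenvectors of A and the given eigenvalues."""
    return (V * values) @ V.conj().T

A_plus = from_spectrum(np.maximum(lam, 0.0))     # positive part
A_minus = from_spectrum(np.maximum(-lam, 0.0))   # negative part (itself >= 0)
modulus = from_spectrum(np.abs(lam))             # |A| = (A* A)^{1/2}
sqrt_A_plus = from_spectrum(np.sqrt(np.maximum(lam, 0.0)))

print(np.allclose(A, A_plus - A_minus))
print(np.allclose(modulus @ modulus, A.conj().T @ A))
print(np.allclose(sqrt_A_plus @ sqrt_A_plus, A_plus))
```

For a self-adjoint matrix the modulus is simply $\hat{A}_+ + \hat{A}_-$, which the code reproduces by taking absolute values of the eigenvalues.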
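1.3 Tempered distributions
The vector space of k times continuously differentiable functions $\mathbb{R}^m \to \mathbb{C}$ is denoted by $C^k(\mathbb{R}^m)$; here k is an integer ≥ 1 or ∞. The subspace of $C^k(\mathbb{R}^m)$ consisting of the compactly supported functions is denoted by $C_0^k(\mathbb{R}^m)$.

1.3.1 The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ of test functions
$\mathcal{S}(\mathbb{R}^n)$ is the Schwartz space of rapidly decreasing functions: $\psi \in \mathcal{S}(\mathbb{R}^n)$ if and only if $\psi \in C^\infty(\mathbb{R}^n)$ and, for every pair (α, β) of multi-indices, there exists $K_{\alpha\beta} > 0$ such that
\[
|x^\alpha \partial_x^\beta \psi(x)| \le K_{\alpha\beta} \quad \text{for all } x \in \mathbb{R}^n. \tag{1.10}
\]
In particular, every $C^\infty$ function on $\mathbb{R}^n$ vanishing outside a bounded set is in $\mathcal{S}(\mathbb{R}^n)$: $C_0^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$. Taking the best constants $K_{\alpha\beta}$ in (1.10), we obtain a family of seminorms on $\mathcal{S}(\mathbb{R}^n)$,
\[
\|\psi\|_{\alpha,\beta} = \sup_{x \in \mathbb{R}^n} |x^\alpha \partial_x^\beta \psi(x)|,
\]
and one shows that $\mathcal{S}(\mathbb{R}^n)$ is a Fréchet space for the topology defined by these seminorms. We have continuous inclusions $C_0^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$. Using the generalized Leibniz formula, one shows that an equivalent system of seminorms on $\mathcal{S}(\mathbb{R}^n)$ is given by
\[
\|\psi\|_{\alpha,\beta} = \sup_{x \in \mathbb{R}^n} |\partial_x^\alpha (x^\beta \psi)(x)|. \tag{1.11}
\]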
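1.3.2 The dual space $\mathcal{S}'(\mathbb{R}^n)$
The topological dual of $\mathcal{S}(\mathbb{R}^n)$ is called the space of tempered distributions and is denoted by $\mathcal{S}'(\mathbb{R}^n)$. The distributional pairing between $\psi \in \mathcal{S}'(\mathbb{R}^n)$ and $\phi \in \mathcal{S}(\mathbb{R}^n)$ is denoted by $\langle \psi, \phi \rangle$. When ψ and ϕ are both in $\mathcal{S}(\mathbb{R}^n)$, then
\[
\langle \psi, \phi \rangle = \int_{\mathbb{R}^n} \psi(x)\phi(x)\,dx.
\]
The natural inclusions
\[
\mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n)
\]
are continuous, and $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$.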
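Every polynomial in the x variables defines a tempered distribution. If P(x) is such a polynomial and $\psi \in \mathcal{S}'(\mathbb{R}^n)$, then $P\psi \in \mathcal{S}'(\mathbb{R}^n)$ is defined by $\langle P\psi, \phi \rangle = \langle \psi, P\phi \rangle$ for $\phi \in \mathcal{S}(\mathbb{R}^n)$. More generally, P can be replaced with any $C^\infty$ function of x such that $|\partial_x^\alpha P(x)|$ is dominated by a polynomial for every $\alpha \in \mathbb{N}^n$. The space $\mathcal{S}'(\mathbb{R}^n)$ is invariant under the Fourier transform: if $\psi \in \mathcal{S}'(\mathbb{R}^n)$, then $F\psi \in \mathcal{S}'(\mathbb{R}^n)$ is defined by $\langle F\psi, \phi \rangle = \langle \psi, F\phi \rangle$.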
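1.4 The Fourier transform
1.4.1 Definition
Let $\psi \in L^1(\mathbb{R}^n)$. By definition, the Fourier transform of ψ is the function defined by the absolutely convergent integral
\[
F\psi(p) = \hat{\psi}(p) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2} \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar} p \cdot x}\, \psi(x)\,dx;
\]
here ℏ is any positive number (it plays the role of a parameter and is identified in physics with h/2π where h is Planck’s constant). We have $F\psi \in C^0(\mathbb{R}^n)$ and $\lim_{|p|\to\infty} F\psi(p) = 0$ (Riemann–Lebesgue lemma); in fact, Fψ is uniformly continuous. For $\psi \in \mathcal{S}(\mathbb{R}^n)$, the following formulas hold:
\[
F(x^\alpha \psi) = (i\hbar\partial_p)^\alpha F\psi, \qquad F((-i\hbar\partial_x)^\alpha \psi) = p^\alpha F\psi.
\]
Assume that both $\psi \in L^1(\mathbb{R}^n)$ and $\hat{\psi} \in L^1(\mathbb{R}^n)$; then the Fourier inversion formula holds:
\[
\psi(x) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar} p \cdot x}\, \hat{\psi}(p)\,dp.
\]
It follows that F is an automorphism $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ that can be extended into an automorphism $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ by defining the extension of F, also denoted by F, by duality: $\langle F\psi, \phi \rangle = \langle \psi, F\phi \rangle$ for $\psi \in \mathcal{S}'(\mathbb{R}^n)$ and $\phi \in \mathcal{S}(\mathbb{R}^n)$.

1.4.2 Plancherel’s theorem
An essential property of the Fourier transform is the Plancherel theorem: If $\psi \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, then $\|F\psi\|_{L^2} = \|\psi\|_{L^2}$.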
Plancherel’s formula allows extending the Fourier transform to a unitary operator on $L^2(\mathbb{R}^n)$: $(F\psi|F\phi)_{L^2} = (\psi|\phi)_{L^2}$. Note that, if $\psi \in L^2(\mathbb{R}^n)$ and $\psi \notin L^1(\mathbb{R}^n)$, then Fψ cannot be calculated pointwise from the integral, and one has to use some approximation procedure. One can proceed as follows (see K. Gröchenig’s book Foundations of Time-Frequency Analysis): since $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$, we may choose a sequence $(\psi_j)_j$ in $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ such that $\lim_{j\to\infty} \|\psi - \psi_j\|_{L^2} = 0$. Then each $F\psi_j$ is well defined, and by Plancherel’s theorem $\|F\psi_j - F\psi_k\|_{L^2} = \|\psi_j - \psi_k\|_{L^2}$, so that $(F\psi_j)_j$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$. The latter being complete, the sequence $(F\psi_j)_j$ has a unique limit in $L^2(\mathbb{R}^n)$. We then define the Fourier transform of $\psi \in L^2(\mathbb{R}^n)$ by $F\psi = \lim_{j\to\infty} F\psi_j$.

1.4.3 Fourier transform and convolution
The Fourier transform F (and its inverse $F^{-1}$) takes, under suitable conditions, convolutions to ordinary products (up to a constant factor):
\[
F(\psi * \phi) = (2\pi\hbar)^{n/2}\, F\psi\, F\phi, \qquad F(\psi\phi) = (2\pi\hbar)^{-n/2}\, F\psi * F\phi.
\]
These formulas make sense if, for instance, ψ and ϕ are in $\mathcal{S}(\mathbb{R}^n)$. The first formula also holds if $\psi \in L^1(\mathbb{R}^n)$ and $\phi \in L^1(\mathbb{R}^n)$, since $L^1(\mathbb{R}^n)$ is a convolution algebra.
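The ℏ-dependent normalization is easy to get wrong, so a numerical check can be useful. The sketch below assumes n = 1, an arbitrary value ℏ = 0.7, and a Riemann-sum approximation of the integral on a truncated grid; the function name `fourier` and the grid sizes are choices made for this illustration. It compares the transform of a normalized Gaussian with its closed form, then checks Plancherel’s formula and the first convolution formula.

```python
import numpy as np

hbar = 0.7                                   # arbitrary positive parameter
x = np.linspace(-12.0, 12.0, 4001)           # odd number of points, symmetric grid
dx = x[1] - x[0]
p = np.linspace(-8.0, 8.0, 401)
dp = p[1] - p[0]

def fourier(values):
    """hbar-dependent Fourier transform of samples on the x grid (Riemann sum)."""
    kernel = np.exp(-1j * np.outer(p, x) / hbar)
    return (kernel @ values) * dx / np.sqrt(2 * np.pi * hbar)

# Normalized Gaussian; its transform is the same Gaussian in the variable p.
psi = (np.pi * hbar) ** -0.25 * np.exp(-x**2 / (2 * hbar))
F_psi = fourier(psi)
F_exact = (np.pi * hbar) ** -0.25 * np.exp(-p**2 / (2 * hbar))

# Plancherel: the L2 norms of psi and F psi agree.
norm_x = np.sum(np.abs(psi) ** 2) * dx
norm_p = np.sum(np.abs(F_psi) ** 2) * dp

# Convolution formula: F(psi * psi) = (2 pi hbar)^{1/2} (F psi)^2 for n = 1.
conv = np.convolve(psi, psi, mode="same") * dx
F_conv = fourier(conv)

print(np.allclose(F_psi, F_exact, atol=1e-8))
print(abs(norm_x - norm_p) < 1e-8)
print(np.allclose(F_conv, np.sqrt(2 * np.pi * hbar) * F_psi**2, atol=1e-8))
```

The symmetric grid with an odd number of points is what makes `mode="same"` return the convolution sampled on the same x grid.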
1.5 The symplectic group
1.5.1 Classical Lie groups
In what follows, 𝕂 = ℝ or ℂ.
M(m, 𝕂) is the algebra of all m × m matrices with entries in 𝕂.
GL(m, 𝕂) is the general linear group. It consists of all invertible matrices in M(m, 𝕂).
SL(m, 𝕂) is the special linear group: it is the subgroup of GL(m, 𝕂) consisting of all the matrices with determinant equal to one.
Sym(m, 𝕂) is the vector space of all symmetric matrices in M(m, 𝕂); it has dimension $\tfrac{1}{2} m(m + 1)$.
U(n, ℂ) is the unitary group; it is the multiplicative group of all unitary isomorphisms of $\mathbb{C}^n$, that is, of all $u \in M(n, \mathbb{C})$ such that $uu^* = u^*u = I_{n \times n}$ (here $u^* = \bar{u}^T$ is the adjoint of u).
U(n) is the image in GL(2n, ℝ) of U(n, ℂ) by the monomorphism
\[
A + iB \longmapsto \begin{pmatrix} A & -B \\ B & A \end{pmatrix} \tag{1.12}
\]
where A and B are real n × n matrices.

1.5.2 Definition of Sp(n)
The symplectic group Sp(n) deserves a special mention since it will be ubiquitous in this book. Sp(n) is the group of all (linear) automorphisms of $\mathbb{R}^{2n} \equiv \mathbb{R}^n \times \mathbb{R}^n$ leaving the symplectic form
\[
\sigma(z, z') = \sum_{j=1}^{n} (p_j x'_j - p'_j x_j), \qquad z = (x, p),\ z' = (x', p'),
\]
invariant: $S \in Sp(n) \iff \sigma(Sz, Sz') = \sigma(z, z')$ for all $(z, z') \in \mathbb{R}^{2n} \times \mathbb{R}^{2n}$. Choosing once and for all a symplectic basis (for instance, the canonical basis of $\mathbb{R}^n \times \mathbb{R}^n$), Sp(n) can be identified with a matrix group, in fact, with the subgroup of GL(2n, ℝ) consisting of all real 2n × 2n matrices S such that $S^T J S = S J S^T = J$ where J is the “standard symplectic matrix” defined by
\[
J = \begin{pmatrix} 0_{n \times n} & I_{n \times n} \\ -I_{n \times n} & 0_{n \times n} \end{pmatrix}.
\]
Since $\det(S^T J S) = (\det S)^2 \det J = \det J$, it follows that det S can, a priori, take either of the two values ±1. It turns out, however, that $S \in Sp(n) \Rightarrow \det S = 1$. Here is an algebraic proof of this important property. Recall that, to every antisymmetric matrix A, one associates a polynomial Pf(A) in the entries of A (“the Pfaffian of A”). The Pfaffian has the following properties:
\[
\operatorname{Pf}(S^T A S) = (\det S) \operatorname{Pf}(A), \qquad \operatorname{Pf}(J) = 1.
\]
Choose now A = J and S ∈ Sp(n). Since $S^T J S = J$, we have $\operatorname{Pf}(J) = \operatorname{Pf}(S^T J S) = (\det S)\operatorname{Pf}(J)$, and hence det S = 1, which was to be proven.
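The defining relations can be checked numerically. The sketch below builds a symplectic matrix as a product of three elementary ones (a lower block shear with symmetric P, a block scaling by an invertible L, and J itself); this particular construction, the function name `random_symplectic`, and the dimension n = 3 are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])              # standard symplectic matrix

def sigma(z, zp):
    """Symplectic form sigma(z, z') = p.x' - p'.x with z = (x, p), z' = (x', p')."""
    x, p = z[:n], z[n:]
    xp, pp = zp[:n], zp[n:]
    return p @ xp - pp @ x

def random_symplectic():
    """Product of three elementary symplectic matrices (illustrative construction)."""
    P = rng.normal(size=(n, n))
    P = P + P.T                                        # symmetric block
    L = np.eye(n) + 0.1 * rng.normal(size=(n, n))      # close to I, hence invertible here
    shear = np.block([[I, O], [P, I]])
    scaling = np.block([[L, O], [O, np.linalg.inv(L).T]])
    return shear @ scaling @ J

S = random_symplectic()
z, zp = rng.normal(size=2 * n), rng.normal(size=2 * n)

print(np.allclose(S.T @ J @ S, J), np.allclose(S @ J @ S.T, J))
print(np.isclose(sigma(S @ z, S @ zp), sigma(z, zp)))
print(np.linalg.det(S))   # close to 1 up to rounding
```

With this convention the form can also be written $\sigma(z, z') = Jz \cdot z'$, which is how the matrix condition $S^T J S = J$ arises.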
1.5.3 Symplectic block-matrices
It is often helpful to write symplectic matrices in the block form
\[
S = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \tag{1.13}
\]
where A, B, C, D are n × n matrices. The condition S ∈ Sp(n) is then equivalent to either of the two following equivalent sets of conditions:
\[
A^T C,\ B^T D \text{ symmetric, and } A^T D - C^T B = I, \tag{1.14}
\]
\[
A B^T,\ C D^T \text{ symmetric, and } A D^T - B C^T = I, \tag{1.15}
\]
and it follows from the second of these sets of conditions that the inverse of S is
\[
S^{-1} = \begin{pmatrix} D^T & -B^T \\ -C^T & A^T \end{pmatrix}. \tag{1.16}
\]
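Conditions (1.14), (1.15) and the inverse formula (1.16) can be verified on an example. The sketch below again assembles S from elementary symplectic factors (two block shears and a block scaling), which is an illustrative assumption rather than a general parametrization.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2
I, O = np.eye(n), np.zeros((n, n))

P = rng.normal(size=(n, n)); P = P + P.T     # symmetric
Q = rng.normal(size=(n, n)); Q = Q + Q.T     # symmetric
L = np.eye(n) + 0.1 * rng.normal(size=(n, n))

# Product of a lower shear, a block scaling, and an upper shear.
S = (np.block([[I, O], [P, I]])
     @ np.block([[L, O], [O, np.linalg.inv(L).T]])
     @ np.block([[I, Q], [O, I]]))

A, B = S[:n, :n], S[:n, n:]
C, D = S[n:, :n], S[n:, n:]

# Inverse according to (1.16), built from the transposed blocks only.
S_inv_formula = np.block([[D.T, -B.T], [-C.T, A.T]])

print(np.allclose(S_inv_formula @ S, np.eye(2 * n)))
print(np.allclose(A.T @ D - C.T @ B, I), np.allclose(A @ D.T - B @ C.T, I))
```

Formula (1.16) makes inverting a symplectic matrix a matter of rearranging blocks, with no linear solve involved.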
Note that the monomorphism (1.12) is an isomorphism of the unitary group U(n, ℂ) onto the subgroup U(n) of Sp(n) consisting of all symplectic matrices
\[
U = \begin{pmatrix} A & -B \\ B & A \end{pmatrix}. \tag{1.17}
\]
Proposition 10. The group monomorphism ι : GL(n, ℂ) → GL(2n, ℝ) defined by
\[
\iota(A + iB) = \begin{pmatrix} A & -B \\ B & A \end{pmatrix} \tag{1.18}
\]
restricts to a monomorphism U(n, ℂ) → Sp(n) such that $\iota(u^*) = \iota(u)^T$. That monomorphism identifies the unitary group U(n, ℂ) with the subgroup U(n) = ι(U(n, ℂ)) of Sp(n).

Proof. It follows from conditions (1.14), (1.15) that
\[
\iota(u) = U = \begin{pmatrix} A & -B \\ B & A \end{pmatrix} \tag{1.19}
\]
is in Sp(n) if and only if
\[
A B^T = B A^T, \qquad A A^T + B B^T = I, \tag{1.20}
\]
or, equivalently,
\[
A^T B = B^T A, \qquad A^T A + B^T B = I. \tag{1.21}
\]
These are precisely the conditions $uu^* = I$ and $u^*u = I$ written out for u = A + iB. The equivalence of conditions (1.20) and (1.21) is proved by noting that U ∈ U(n) if and only if $U^T \in U(n)$, which follows from the fact that the monomorphism ι satisfies $\iota(u^*) = \iota(u)^T$.
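Proposition 10 can be illustrated numerically. In the sketch below, unitary matrices are obtained from the QR factorization of random complex matrices; the function names `iota` and `random_unitary` and the dimension n = 3 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

def iota(u):
    """Realification (1.18): A + iB -> [[A, -B], [B, A]]."""
    A, B = u.real, u.imag
    return np.block([[A, -B], [B, A]])

def random_unitary():
    """Unitary matrix from the QR factorization of a random complex matrix."""
    X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(X)
    return Q

u, v = random_unitary(), random_unitary()
U = iota(u)

print(np.allclose(U.T @ J @ U, J))                    # iota(u) is symplectic
print(np.allclose(iota(u.conj().T), U.T))             # iota(u*) = iota(u)^T
print(np.allclose(iota(u @ v), iota(u) @ iota(v)))    # iota is a homomorphism
```

The matrices in U(n) are both symplectic and orthogonal, which is why the adjoint on the complex side becomes the transpose on the real side.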
1.5.4 The eigenvalues of a symplectic matrix
The following result is well-known:

Proposition 11. Let S ∈ Sp(n). (i) If λ is an eigenvalue of S, then so are $\bar{\lambda}$ and 1/λ (and hence also $1/\bar{\lambda}$); (ii) If the eigenvalue λ of S has multiplicity k, then so has 1/λ; (iii) S and $S^{-1}$ have the same eigenvalues.

Proof. Let us show that the characteristic polynomial $P_S(\lambda) = \det(S - \lambda I)$ of S satisfies the reflexivity relationship
\[
P_S(\lambda) = \lambda^{2n} P_S(1/\lambda). \tag{1.22}
\]
Property (i) will follow since, for real matrices, eigenvalues appear in conjugate pairs. Since $S^T J S = J$, we have $S = -J (S^T)^{-1} J$, and hence, using $J^2 = -I$ and det S = 1,
\[
P_S(\lambda) = \det(-J (S^T)^{-1} J - \lambda I) = \det(-(S^T)^{-1} + \lambda I) = \det(-I + \lambda S) = \lambda^{2n} \det(S - \lambda^{-1} I),
\]
which is precisely (1.22). (ii) Let $P_S^{(j)}$ be the j-th derivative of the polynomial $P_S$. If $\lambda_0$ has multiplicity k, then $P_S^{(j)}(\lambda_0) = 0$ for 0 ≤ j ≤ k − 1 and $P_S^{(k)}(\lambda_0) \ne 0$. In view of (1.22), we also have $P_S^{(j)}(1/\lambda_0) = 0$ for 0 ≤ j ≤ k − 1 and $P_S^{(k)}(1/\lambda_0) \ne 0$. Property (iii) immediately follows from (ii).
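Relation (1.22) says that the coefficients of the characteristic polynomial read the same in both directions. The sketch below checks this, and property (iii), for a symplectic matrix built from two block shears; the construction and the use of `np.poly` to obtain the coefficients are choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2
I, O = np.eye(n), np.zeros((n, n))

P = rng.normal(size=(n, n)); P = 0.5 * (P + P.T)   # symmetric
Q = rng.normal(size=(n, n)); Q = 0.5 * (Q + Q.T)   # symmetric
S = np.block([[I, O], [P, I]]) @ np.block([[I, Q], [O, I]])

# Coefficients of det(lambda I - S); this equals det(S - lambda I) since 2n is even.
coeffs = np.poly(S)
coeffs_inverse = np.poly(np.linalg.inv(S))

palindromic = np.allclose(coeffs, coeffs[::-1])       # relation (1.22)
same_spectrum = np.allclose(coeffs, coeffs_inverse)   # property (iii)

print(palindromic, same_spectrum)
```

Comparing characteristic polynomials rather than sorted eigenvalue lists avoids ordering ambiguities among complex eigenvalues.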
1.5.5 Williamson’s symplectic diagonalization
Every real symmetric matrix can be diagonalized using an orthogonal transformation. When M is a 2n × 2n positive definite symmetric matrix, M can also be diagonalized using a symplectic matrix. This is Williamson’s famous symplectic diagonalization theorem. It has many applications throughout quantum mechanics and quantum information theory. We first remark that the eigenvalues of JM are the same as those of the antisymmetric matrix $M^{1/2} J M^{1/2}$; they are thus of the type $\pm i\lambda_j$, where $\lambda_j > 0$. The set of positive numbers $\{\lambda_j : 1 \le j \le n\}$ is called the symplectic spectrum of the matrix M. The symplectic spectrum is generally not related to the set of eigenvalues of M in any simple way.

Proposition 12. Let M be a real symmetric positive definite 2n × 2n matrix. There exists S ∈ Sp(n) such that $M = S^T D S$, where D is a diagonal matrix of the type
\[
D = \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda \end{pmatrix},
\]
the diagonal entries of Λ being the positive numbers $\lambda_j$ such that $\pm i\lambda_j$ is an eigenvalue of JM.

Proof. Let $\langle \cdot, \cdot \rangle_M$ be the scalar product on $\mathbb{C}^{2n}$ defined by $\langle z, z' \rangle_M = \langle Mz, z' \rangle$. Since both $\langle \cdot, \cdot \rangle_M$ and the symplectic form σ are non-degenerate, we can find a unique invertible matrix K of order 2n such that $\langle z, Kz' \rangle_M = \sigma(z, z')$ for all z, z′; that matrix satisfies $K^T M = J = -MK$. Since σ is antisymmetric, we must have $K = -K^M$, where $K^M = M^{-1} K^T M$ is the transpose of K with respect to $\langle \cdot, \cdot \rangle_M$; it follows that the eigenvalues of $K = -M^{-1} J$ are of the type $\pm i\lambda_j$, $\lambda_j > 0$, and so are those of $J M^{-1}$. The corresponding complex eigenvectors occur in conjugate pairs $e_j \pm i f_j$; we thus obtain a $\langle \cdot, \cdot \rangle_M$-orthonormal basis $\{e_i, f_j\}_{1 \le i,j \le n}$ of $\mathbb{R}^{2n}$ such that $K e_i = \lambda_i f_i$ and $K f_j = -\lambda_j e_j$. It follows from these relationships that we have $K^2 e_i = -\lambda_i^2 e_i$ and $K^2 f_j = -\lambda_j^2 f_j$ and that the vectors of the basis $\{e_i, f_j\}_{1 \le i,j \le n}$ satisfy the relationships
\[
\sigma(e_i, e_j) = \langle e_i, K e_j \rangle_M = \lambda_j \langle e_i, f_j \rangle_M = 0,
\]
\[
\sigma(f_i, f_j) = \langle f_i, K f_j \rangle_M = -\lambda_j \langle f_i, e_j \rangle_M = 0,
\]
\[
\sigma(f_i, e_j) = \langle f_i, K e_j \rangle_M = \lambda_j \langle f_i, f_j \rangle_M = \lambda_i \delta_{ij}.
\]
Setting $e'_i = \lambda_i^{-1/2} e_i$ and $f'_j = \lambda_j^{-1/2} f_j$, the basis $\{e'_i, f'_j\}_{1 \le i,j \le n}$ is symplectic. Let S be the element of Sp(n) mapping the canonical symplectic basis to $\{e'_i, f'_j\}_{1 \le i,j \le n}$. The $\langle \cdot, \cdot \rangle_M$-orthogonality of $\{e'_i, f'_j\}_{1 \le i,j \le n}$ implies $M = S^T D S$ with $D = \operatorname{diag}(\Lambda, \Lambda)$, $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.
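The symplectic spectrum is straightforward to compute from the eigenvalues of JM. The sketch below does not carry out the diagonalization itself; it starts from a prescribed Λ and an illustrative symplectic S, forms $M = S^T D S$, and recovers Λ. The function name `symplectic_spectrum` and the chosen values are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

def symplectic_spectrum(M):
    """Positive numbers lambda_j such that +/- i lambda_j are the eigenvalues of J M."""
    eigenvalues = np.linalg.eigvals(J @ M)
    return np.sort(eigenvalues.imag[eigenvalues.imag > 0])

# Prescribed symplectic spectrum and an illustrative symplectic matrix S.
Lambda = np.array([0.5, 2.0])
D = np.diag(np.concatenate([Lambda, Lambda]))
P = rng.normal(size=(n, n)); P = P + P.T
L = np.eye(n) + 0.1 * rng.normal(size=(n, n))
S = np.block([[I, O], [P, I]]) @ np.block([[L, O], [O, np.linalg.inv(L).T]])

M = S.T @ D @ S                       # symmetric positive definite by construction
recovered = symplectic_spectrum(M)

print(recovered)                      # symplectic spectrum of M
print(np.linalg.eigvalsh(M))          # ordinary eigenvalues of M, generally different
```

The recovery works because $J S^T = S^{-1} J$ for symplectic S, so JM is similar to JD.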
1.6 Quantum mechanics For George Mackey, quantum mechanics is a refinement of classical mechanics in its Hamiltonian form. It is thus viewed as a “super theory” containing classical mechanics if certain limiting conditions are satisfied (for instance when Planck’s constant can be neglected). This is the point of view we have adopted in our book Emergence of the Quantum from the Classical: Mathematical Aspects of Quantum Processes (World Scientific, 2017), where the point of view is also maintained that there is a form of emergence of the quantum from the classical. Quantum mechanics has a double status. While it has its roots in physics, it has become (and this rather rapidly since its inception) a branch of both pure and applied mathematics. That applied mathematics should play a role is rather obvious since there are many practical and numerical issues to be solved, but parts of the problems that are posed by quantum mechanics definitely belong to more abstract areas (this is also true, of course, of Hamiltonian mechanics, which has led to important developments in symplectic topology and geometry; Gromov’s non-squeezing theorem is a well-known example). Anyway, it is
clear that mathematics has largely contributed to the growth and understanding of quantum mechanics, and this from the very beginning.
1.6.1 The axioms of quantum mechanics
In the title of this section, we use the term “axiom” instead of “postulate”. This is because we want to carefully avoid the quagmire of defining the rules governing quantum mechanics as a physical theory; we leave this thankless task to physicists and philosophers of science. The debate is open and is certainly not about to be closed. The first six axioms are rather standard and are essentially those of the “Copenhagen School” (they are actually due to John von Neumann). For a nice discussion, see for instance Paul Thierry [61]:

Axiom 13. A pure quantum state of a physical system S is an element ψ ≠ 0 of a separable complex Hilbert space $H_S$, called the state space of S. Two normalized elements ψ and ϕ of $H_S$ define the same state if there exists c ∈ ℂ (c ≠ 0) such that ψ = cϕ. Equivalently, a quantum state is a ray (= one-dimensional subspace) ℂψ in $H_S$. A finite sum of states $\psi_1 + \cdots + \psi_m$, with $\psi_j \in H_S$, 1 ≤ j ≤ m, is also a state, called the superposition of the states $\psi_1, \ldots, \psi_m$.

Axiom 14. A (mixed) quantum state in S is the datum of a set $\{(\psi_j, \alpha_j) \in H_S \times \mathbb{R}_+ : j \in F\}$ with $\|\psi_j\|_{H_S} = 1$, where F is a countable set and $\sum_{j \in F} \alpha_j = 1$; $\alpha_j$ is the probability that the quantum state is the pure state $\psi_j$.

Axiom 15. An observable of a quantum system S with state space $H_S$ is a (possibly unbounded) self-adjoint operator $\hat{A}$ on $H_S$.

This axiom has to be complemented by the following statistical hypothesis, called “Born’s rule” (the validity of which is being regularly questioned by physicists):

Axiom 16. Born’s rule: if $\lambda_j \in \mathbb{R}$ and a normalized $\psi_j \in H_S$ are such that $\hat{A}\psi_j = \lambda_j \psi_j$, then, if the system S is in the normalized state ψ, the observable $\hat{A}$ has the probability $|(\psi|\psi_j)_{H_S}|^2$ of taking the value $\lambda_j$.

So far, the listed axioms define “standard” quantum mechanics. We will complete these axioms to enlarge the picture to “quantum mechanics in phase space”. For this purpose, we assume that $H_S = L^2(\mathbb{R}^n)$.

Axiom 17. The datum of a mixed quantum state $\{(\psi_j, \alpha_j) \in L^2(\mathbb{R}^n) \times \mathbb{R}_+ : j \in F\}$ is equivalent to that of its Wigner distribution $\rho = \sum_{j \in F} \alpha_j W\psi_j$.

Axiom 18. Suppose the quantum system S is in the mixed state $\{(\psi_j, \alpha_j) \in L^2(\mathbb{R}^n) \times \mathbb{R}_+ : j \in F\}$, and let $\hat{A}$ be a quantum observable with symbol a. The expectation (or: average) value of $\hat{A}$ in this mixed state is given, when defined, by the integral $\int_{\mathbb{R}^{2n}} a(z)\rho(z)\,dz$.
1.6.2 Quantization
In this book, we will not really use the quantum mechanical axioms just listed; we will actually be focusing on another related issue, that of the “quantization of observables”. Mathematically speaking, a quantization is a procedure a ↦ Op(a) that associates to a symbol $a \in \mathcal{S}'(\mathbb{R}^{2n})$ an operator $\hat{A} = \operatorname{Op}(a)$ (in general a pseudodifferential operator $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$). This procedure should satisfy the following requirements to qualify as a quantization:
1. Continuity: Op is a continuous linear mapping
\[
\mathcal{S}'(\mathbb{R}^{2n}) \to \mathcal{L}(\mathcal{S}(\mathbb{R}^n), \mathcal{S}'(\mathbb{R}^n));
\]
2. Triviality: Op should satisfy
\[
\operatorname{Op}(b \otimes 1)u = bu, \qquad \operatorname{Op}(1 \otimes f)u = \mathcal{F}^{-1}(f \mathcal{F}u) \tag{1.23}
\]
for $u \in \mathcal{S}(\mathbb{R}^n)$ and $b \in \mathcal{S}(\mathbb{R}^n)$;
3. Self-adjointness:
\[
\operatorname{Op}(a) \text{ is self-adjoint} \iff a \text{ is real}; \tag{1.24}
\]
4. “Dirac’s dream”:
\[
[\operatorname{Op}(a), \operatorname{Op}(b)] = i\hbar \operatorname{Op}(\{a, b\}) \tag{1.25}
\]
where [⋅, ⋅] is the commutator and {a, b} the Poisson bracket associated with the standard symplectic form, i. e., $\{a, b\} = \partial_x a \cdot \partial_p b - \partial_x b \cdot \partial_p a$.

While the first three conditions are satisfied by all known useful quantizations, the fourth property, “Dirac’s dream”, is not. Actually, the impossibility of constructing a full quantization satisfying all four properties is the famous result of Groenewold and van Hove, which is a “no-go” result. Mathematically, it says (in its strong form) that one cannot quantize the Poisson algebra of polynomials on $\mathbb{R}^{2n}$ beyond those of degree ≤ 2.

1.6.3 Superposition and entanglement
The typically “quantum” notions of superposition and entanglement are both surrounded by an almost mystical aura in nonscientific circles. However, both notions are perfectly well-defined mathematically. Here is a short and very elementary description, mainly addressed to mathematicians who might not be familiar with these concepts. A pure classical state is a point in some phase space, and observables are functions defined on this phase space. In quantum mechanics, a pure state is a point in a
Hilbert space, and observables are self-adjoint operators acting on that Hilbert space. In quantum physics, these observables are called “Hermitian operators”; it should be noted that in the physics literature the distinction between operators that are merely symmetric and operators that are actually self-adjoint is generally glossed over. Why does one insist that pure states should belong to a Hilbert space? It is because in Hilbert spaces we can define unambiguously orthogonal projections on subspaces (or more generally, on closed convex subsets); it is the completeness property of a Hilbert space that allows us to construct these projections by approximating them with Cauchy sequences. This property is essential when one wants to discuss the collapse (or: reduction) of the state function. Suppose indeed that we are studying a quantum observable $\hat{A}$ on the Hilbert space $L^2(\mathbb{R}^n)$; we denote the scalar product of two elements ψ, ϕ of $L^2(\mathbb{R}^n)$ by $(\psi|\phi)_{L^2}$. We assume $\hat{A}$ has a discrete spectrum $\lambda_1, \lambda_2, \ldots$ with corresponding orthonormal eigenfunctions $\psi_1, \psi_2, \ldots$ of $L^2(\mathbb{R}^n)$. If measurements are performed on the state, then the only values that will ever be observed are the eigenvalues $\lambda_j$. However, this does not mean that the system was initially in the pure state $\psi_j$; the most general pure state corresponding to that observable is a superposition $\psi = \sum_j (\psi|\psi_j)_{L^2} \psi_j$. For normalized ψ, Parseval’s identity gives $\sum_j |(\psi|\psi_j)_{L^2}|^2 = 1$, which allows us to view the numbers $|(\psi|\psi_j)_{L^2}|^2$ as probabilities: According to all standard interpretations of quantum mechanics, $|(\psi|\psi_j)_{L^2}|^2$ is the probability that the measurement performed on the state will yield the value $\lambda_j$. Now comes the subtle point: After this measurement, the new state will be a linear combination of the eigenfunctions $\psi_j^{(1)}, \ldots, \psi_j^{(m)}$ corresponding to $\lambda_j$ (m being the multiplicity of $\lambda_j$); the measurement thus orthogonally projects (“collapses”) the unknown state ψ onto the subspace spanned by the set of orthonormal vectors $\{\psi_j^{(1)}, \ldots, \psi_j^{(m)}\}$.

1.6.4 Mixed quantum states; the density matrix
In practice, it is usually not possible to assign a pure state to the system being studied because of uncertainties of a statistical nature. It is here that the notion of mixed state intervenes. In classical mechanics, a mixed state is just a probability distribution (“density of states”), i. e., a function ρ ≥ 0 on phase space whose integral is one with respect to some measure defined on that phase space. This function may be compactly supported, or not. In quantum mechanics, the situation is not so much different. A mixed quantum state is represented by a quasi-probability distribution ρ defined on phase space (the “Wigner distribution”), the prefix “quasi” meaning that ρ can (and usually does) take negative values. For all practical purposes, this function
ρ is used to calculate statistical averages exactly as if it were a bona fide probability distribution. Suppose indeed the quantum mixed state consists of a collection of pairs $\{(\psi_j, \alpha_j)\}$ where the $\psi_j \in L^2(\mathbb{R}^n)$ are pure states, and the $\alpha_j$ are positive numbers summing up to one. Physically this corresponds to representing the degree of knowledge of the state: the probability that it is in the pure state $\psi_j$ is $\alpha_j$. A mixed state is thus a statistical mixture; it is different from a linear combination as just considered. A mixed state $\{(\psi_j, \alpha_j)\}$ can be represented advantageously in two different but mathematically equivalent ways. The first is to identify $\{(\psi_j, \alpha_j)\}$ with its Wigner distribution ρ: by definition $\rho(x, p) = \sum_j \alpha_j W\psi_j(x, p)$ where $W\psi_j$ is the Wigner transform of $\psi_j$:
\[
W\psi_j(x, p) = \Big(\frac{1}{2\pi\hbar}\Big)^{n} \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar} p \cdot y}\, \psi_j(x + \tfrac{1}{2}y)\, \overline{\psi_j(x - \tfrac{1}{2}y)}\,dy. \tag{1.26}
\]
The second way to represent $\{(\psi_j, \alpha_j)\}$ is to note that, since the datum of a pure state $\psi_j$ is equivalent to the datum of the orthogonal projection $\hat{\Pi}_j \psi = (\psi|\psi_j)_{L^2}\, \psi_j$, we can identify the mixed state with the convex sum of projectors $\hat{\rho} = \sum_j \alpha_j \hat{\Pi}_j$. The operator $\hat{\rho}$ is called the density operator (or matrix) of the mixed state $\{(\psi_j, \alpha_j)\}$. It is a bounded, positive, and self-adjoint operator on $L^2(\mathbb{R}^n)$; it moreover has trace $\operatorname{Tr}(\hat{\rho}) = 1$. It turns out that both representations are equivalent because $\hat{\rho}$ is just the Weyl operator whose symbol is $(2\pi\hbar)^n \rho$, that is
\[
\hat{\rho}\psi(x) = \int_{\mathbb{R}^{2n}} e^{\frac{i}{\hbar} p \cdot (x - y)}\, \rho(\tfrac{1}{2}(x + y), p)\, \psi(y)\,dy\,dp.
\]
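Formula (1.26) can be evaluated numerically for n = 1. The sketch below assumes ℏ = 1, a Riemann-sum approximation of the integral in y, and two test states (the standard Gaussian and the first Hermite function); the function names and grid sizes are choices made for this illustration. It shows that the Wigner distribution of a mixture integrates to one and that the Wigner transform of the second state is negative at the origin.

```python
import numpy as np

hbar = 1.0
y = np.linspace(-20.0, 20.0, 4001)
dy = y[1] - y[0]

def psi0(t):
    """Normalized Gaussian."""
    return (np.pi * hbar) ** -0.25 * np.exp(-t**2 / (2 * hbar))

def psi1(t):
    """Normalized first Hermite function."""
    return np.sqrt(2 / hbar) * t * psi0(t)

def wigner(psi, x, p):
    """Wigner transform (1.26) for n = 1 on the grid x-by-p (Riemann sum in y)."""
    x, p = np.atleast_1d(x), np.atleast_1d(p)
    product = psi(x[:, None] + y / 2) * np.conj(psi(x[:, None] - y / 2))
    phase = np.exp(-1j * np.outer(y, p) / hbar)
    return ((product @ phase) * dy / (2 * np.pi * hbar)).real

x = np.linspace(-6.0, 6.0, 121)
p = np.linspace(-6.0, 6.0, 121)
cell = (x[1] - x[0]) * (p[1] - p[0])

W0 = wigner(psi0, x, p)
W0_exact = np.exp(-(x[:, None] ** 2 + p[None, :] ** 2) / hbar) / (np.pi * hbar)

# Wigner distribution of the mixed state {(psi0, 0.7), (psi1, 0.3)}.
rho = 0.7 * W0 + 0.3 * wigner(psi1, x, p)
total = rho.sum() * cell

print(np.allclose(W0, W0_exact, atol=1e-8))
print(total)                               # close to 1
print(wigner(psi1, 0.0, 0.0)[0, 0])        # negative value at the origin
```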
1.6.5 Entanglement: spooky actions at a distance
Assume that we are dealing with two quantum systems, labelled A and B, whose Hilbert spaces are $H^A$ and $H^B$. To each of these systems, we assign an “observer”: Alice for A, and Bob for B (physicists like Alice and Bob; if there is a third observer, he is called Charles). The tasks of these observers will be to perform measurements, either on $H^A$ or on $H^B$, or on both. Suppose first that $\dim H^A = \dim H^B = 1$; then the state of A is a pure state $\psi_A$ and, similarly, the state of B is a pure state $\psi_B$. We now consider the union of the two systems A and B; it is a new quantum system A ⊗ B with one-dimensional Hilbert space $H^A \otimes H^B$ consisting of all functions $c_{AB}(\psi_A \otimes \psi_B)$ with $c_{AB} \in \mathbb{C}$. If now Alice observes the part A of the state A ⊗ B, she will collapse $\psi_A \otimes \psi_B$ to $\psi_A$, and Bob will, similarly, collapse $\psi_A \otimes \psi_B$ to $\psi_B$. Nothing surprising here, of course. But suppose now $\dim H^A = \dim H^B = 2$, and choose orthonormal bases $(\psi_A, \phi_A)$ of $H^A$ and $(\psi_B, \phi_B)$ of $H^B$. The system A is thus in some normalized state $a_A \psi_A + c_A \phi_A$, and the system B in some normalized state $a_B \psi_B + c_B \phi_B$. Consider again the global system A ⊗ B; its Hilbert space $H^A \otimes H^B$ is four-dimensional, and has an orthonormal basis $(\psi_A \otimes \psi_B, \psi_A \otimes \phi_B, \phi_A \otimes \psi_B, \phi_A \otimes \phi_B)$, so the most general state of A ⊗ B is here an arbitrary normalized linear combination of these vectors. Consider in particular the state
\[
\psi = \frac{1}{\sqrt{2}} (\psi_A \otimes \phi_B + \phi_A \otimes \psi_B) \tag{1.27}
\]
(such a state is called a “Bell state” in the physical literature). We now let Alice perform a measurement on the subsystem A of A ⊗ B. There are two possible outcomes, each having the same probability 50 %: Either (i) Alice measures $\psi_A$ and hence ψ collapses to $\psi_A \otimes \phi_B$, and in any subsequent measurement Bob will find B in the state $\phi_B$, or (ii) she measures $\phi_A$, in which case ψ collapses to $\phi_A \otimes \psi_B$, and Bob will automatically measure $\psi_B$. The point here is that Alice’s measurement is random, but her act of observation determines the outcome Bob will observe. This very non-classical aspect of quantum mechanics was considered to be shocking; it led to the famous EPR paper [24] by Einstein, Podolsky, and Rosen, and also led Einstein to speak about “spooky actions at a distance”.
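The collapse described above can be reproduced with 4-component vectors. The sketch below assumes the orthonormal bases are the standard bases of $\mathbb{C}^2$ and models Alice’s measurement as the orthogonal projection onto (outcome) ⊗ $H^B$; the function name `alice_measures` is hypothetical.

```python
import numpy as np

# Orthonormal bases of H^A and H^B, taken here to be the standard bases of C^2.
psi_A, phi_A = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi_B, phi_B = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Bell state (1.27).
bell = (np.kron(psi_A, phi_B) + np.kron(phi_A, psi_B)) / np.sqrt(2)

def alice_measures(outcome_A, state):
    """Project onto outcome_A (x) H^B; return probability and normalized collapsed state."""
    projector = np.kron(np.outer(outcome_A, outcome_A.conj()), np.eye(2))
    collapsed = projector @ state
    probability = np.vdot(collapsed, collapsed).real
    return probability, collapsed / np.sqrt(probability)

prob_psi, after_psi = alice_measures(psi_A, bell)
prob_phi, after_phi = alice_measures(phi_A, bell)

# A product state would give a rank-one coefficient matrix; the Bell state does not.
schmidt_rank = np.linalg.matrix_rank(bell.reshape(2, 2))

print(prob_psi, prob_phi, schmidt_rank)
```

After either outcome the collapsed vector is a product state, so Bob’s result is then determined.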
1.6.6 Variable ℏ
In quantum physics, it does not make sense to speak about the exact value of h; it is a constant of nature, not a mathematical constant like π, e, or √2. Its value depends not only on the choice of units, but also on measurements. Fixing once and for all a system of units, all we can say is that h is an experimental value situated in some real interval $[h_0 - \varepsilon, h_0 + \varepsilon]$ (this is equally true of any other physical constant). For instance, on 20 May 2019, the SI unit of mass (the kilogram) was redefined by the BIPM by fixing arbitrarily the value of Planck’s constant as being
\[
h = 6.626070150 \times 10^{-34}\ \mathrm{J\,s}. \tag{1.28}
\]
This choice is, however, ad hoc, and was meant to make the kilogram fit with its best known values. However, measurements given by the National Institute of Standards and Technology (NIST) in 2014 yield the interval of confidence
\[
h = 6.626070150(81) \times 10^{-34}\ \mathrm{J\,s}; \tag{1.29}
\]
more recent measurements, also performed at NIST, yield the different result
\[
h = 6.626070040(81) \times 10^{-34}\ \mathrm{J\,s}. \tag{1.30}
\]
A few historical remarks. Paul Dirac [21] suggested in 1937 in his “Large Numbers Hypothesis” that some constants of nature could vary in space-time. Since then, the topic has remained a subject of fascination which has motivated numerous theoretical and experimental studies. The variability of physical parameters (we are avoiding the oxymoron “nonconstant constants”) is a possibility that cannot be ruled out and that has been actively studied for a long time by many physicists. Dirac had speculated that physical constants, such as the gravitational constant or the fine-structure constant, might be subject to change over time. However, testing the constancy of a physical parameter means going to extraordinary lengths in terms of precision measurements and is intimately related to choices of unit systems. The history actually started in a quite romantic way, with the story of the Oklo natural nuclear reactor found in a uranium mine in Central Africa in 1972. The measurements that were made in Oklo give limits on the variation of the fine-structure constant over the period since the reactor was running, ca. 1.8 billion years ago, which is much less than the estimated age of the universe. In 1999, a team of astronomers headed by the astrophysicist John Webb reported that measurements of light absorbed by very distant quasars suggest that the value of the fine-structure constant was once slightly different from what it is today. These experiments, made using the very sophisticated Keck telescopes in Hawaii and the VLT in Chile, put an upper bound on the relative change at roughly $10^{-17}$ per year.
2 Displacements and reflections In this chapter, we introduce the most basic tools from quantum harmonic analysis: the Heisenberg displacement operator and the Grossmann–Royer reflection operator. They are the building blocks of the operator theories we will study in the forthcoming chapters.
https://doi.org/10.1515/9783110722772-002

2.1 The Heisenberg displacement operator
2.1.1 Definition and motivation
In what follows, ψ is a square integrable complex function: $\psi \in L^2(\mathbb{R}^n)$. We recall that the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$, so in many proofs it is sufficient to assume $\psi \in \mathcal{S}(\mathbb{R}^n)$, and then to conclude using a continuity and density argument.

Definition 19. Let $z_0 = (x_0, p_0)$ be in $\mathbb{R}^{2n} \equiv \mathbb{R}^n \times \mathbb{R}^n$. The unitary operator $\hat{D}(z_0) : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ defined by
\[
\hat{D}(z_0)\psi(x) = e^{\frac{i}{\hbar}(p_0 \cdot x - \frac{1}{2} p_0 \cdot x_0)}\, \psi(x - x_0) \tag{2.1}
\]
is called the (Heisenberg) displacement operator determined by $z_0$.

Another common name for $\hat{D}(z_0)$ is “Heisenberg–Weyl operator”; physicists sometimes even use the shortened terminology “Weyl operator”, but this is confusing because the notion of Weyl operator has a very different meaning in mathematics. Admittedly, this definition seems to be ad hoc and somewhat mysterious. Here is a dynamical motivation. Let us begin with some rather trite considerations. The phase space translation operators $T(z_0) : z \mapsto z + z_0$ can be viewed as the time-one map of a Hamiltonian flow, namely that determined by the Hamiltonian function
\[
H(z) = \sigma(z, z_0) = p \cdot x_0 - p_0 \cdot x. \tag{2.2}
\]
In fact, the solutions of the associated Hamilton equations $\dot{z} = J\partial_z H(z)$ are given by $z(t) = z + tz_0$. Consider now the “quantized” version of H; it is the operator
\[
\hat{H} = x_0 \cdot (-i\hbar\partial_x) - p_0 \cdot x.
\]
One can check by a straightforward calculation that the solution ψ of the associated Schrödinger equation
\[
i\hbar\partial_t \psi = \hat{H}\psi, \qquad \psi(\cdot, 0) = \psi_0
\]
is given by the formula
\[
\psi(x, t) = e^{\frac{i}{\hbar}(t p_0 \cdot x - \frac{1}{2} t^2 p_0 \cdot x_0)}\, \psi_0(x - t x_0), \tag{2.3}
\]
that is $\psi(\cdot, t) = \hat{D}(tz_0)\psi_0$. The Heisenberg displacement operator is thus the time-one propagator for the Schrödinger equation associated with the displacement Hamiltonian. This justifies the notation
\[
\hat{D}(z_0) = e^{\frac{i}{\hbar}\sigma(z_0, \hat{z})}
\]
that is often used in the literature.

2.1.2 Properties
The displacement operators are linear: $\hat{D}(z_0)(\lambda\psi + \mu\phi) = \lambda\hat{D}(z_0)\psi + \mu\hat{D}(z_0)\phi$, and unitary: the inverse of $\hat{D}(z_0)$ is
\[
\hat{D}(z_0)^{-1} = \hat{D}(-z_0) = \hat{D}(z_0)^*. \tag{2.4}
\]
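In fact, setting $x' = x - x_0$,
\[
\begin{aligned}
(\hat{D}(z_0)\psi|\phi) &= \int e^{\frac{i}{\hbar}(p_0 \cdot x - \frac{1}{2} p_0 \cdot x_0)}\, \psi(x - x_0)\, \overline{\phi(x)}\,d^n x \\
&= \int \psi(x - x_0)\, \overline{e^{-\frac{i}{\hbar}(p_0 \cdot x - \frac{1}{2} p_0 \cdot x_0)}\, \phi(x)}\,d^n x \\
&= \int \psi(x')\, \overline{e^{\frac{i}{\hbar}(-p_0 \cdot x' - \frac{1}{2} p_0 \cdot x_0)}\, \phi(x' + x_0)}\,d^n x' = (\psi|\hat{D}(-z_0)\phi).
\end{aligned}
\]
We leave the proof of the following property to the reader.

Proposition 20. The restriction of $\hat{D}(z_0)$ to the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is a continuous automorphism $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ that extends by duality into a continuous automorphism of $\mathcal{S}'(\mathbb{R}^n)$.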
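Definition (2.1) translates directly into code when functions are represented as callables. The sketch below assumes n = 1 and ℏ = 1; the names `displace` and `sigma` are illustrative. It checks norm preservation and the inverse relation (2.4), and it also measures the phase by which two displacements fail to commute, which turns out to be $e^{\frac{i}{\hbar}\sigma(z_0, z_1)}$.

```python
import numpy as np

hbar = 1.0
x = np.linspace(-15.0, 15.0, 3001)
dx = x[1] - x[0]

def sigma(z, zp):
    """Symplectic form for n = 1: sigma(z, z') = p x' - p' x."""
    (x0, p0), (x1, p1) = z, zp
    return p0 * x1 - p1 * x0

def displace(z0, psi):
    """Return the function D(z0) psi of formula (2.1), with psi given as a callable."""
    x0, p0 = z0
    return lambda t: np.exp(1j * (p0 * t - 0.5 * p0 * x0) / hbar) * psi(t - x0)

def psi(t):
    """Normalized Gaussian test function."""
    return (np.pi * hbar) ** -0.25 * np.exp(-t**2 / (2 * hbar))

z0, z1 = (1.0, -0.5), (-0.3, 0.8)
minus_z0 = (-z0[0], -z0[1])
z_sum = (z0[0] + z1[0], z0[1] + z1[1])

# Unitarity: the L2 norm is preserved and D(-z0) inverts D(z0).
norm = np.sum(np.abs(displace(z0, psi)(x)) ** 2) * dx
round_trip = displace(minus_z0, displace(z0, psi))(x)

# Non-commutativity: the two orderings differ by a constant phase.
d01 = displace(z0, displace(z1, psi))(x)
d10 = displace(z1, displace(z0, psi))(x)
phase = np.exp(1j * sigma(z0, z1) / hbar)
d_sum = displace(z_sum, psi)(x)

print(abs(norm - 1.0) < 1e-8, np.allclose(round_trip, psi(x)))
print(np.allclose(d01, phase * d10))
print(np.allclose(d_sum, np.exp(-0.5j * sigma(z0, z1) / hbar) * d01))
```

Working with callables avoids interpolating shifted samples, so the identities hold to rounding error.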
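The displacement operators do not commute; in fact, they satisfy the following remarkable properties:

Proposition 21. The displacement operators satisfy
\[
\hat{D}(z_0)\hat{D}(z_1) = e^{\frac{i}{\hbar}\sigma(z_0, z_1)}\, \hat{D}(z_1)\hat{D}(z_0) \tag{2.5}
\]
\[
\hat{D}(z_0 + z_1) = e^{-\frac{i}{2\hbar}\sigma(z_0, z_1)}\, \hat{D}(z_0)\hat{D}(z_1) \tag{2.6}
\]
for all $z_0, z_1 \in \mathbb{R}^{2n}$.

Proof. Denoting by $T(z_1)$ the translation $\psi \mapsto \psi(\cdot - x_1)$, we have
\[
\hat{D}(z_0)\hat{D}(z_1) = \hat{D}(z_0)\big(e^{\frac{i}{\hbar}(p_1 \cdot x - \frac{1}{2} p_1 \cdot x_1)}\, T(z_1)\big) = e^{\frac{i}{\hbar}(p_0 \cdot x - \frac{1}{2} p_0 \cdot x_0)}\, e^{\frac{i}{\hbar}(p_1 \cdot (x - x_0) - \frac{1}{2} p_1 \cdot x_1)}\, T(z_0 + z_1)
\]
and, similarly,
\[
\hat{D}(z_1)\hat{D}(z_0) = e^{\frac{i}{\hbar}(p_1 \cdot x - \frac{1}{2} p_1 \cdot x_1)}\, e^{\frac{i}{\hbar}(p_0 \cdot (x - x_1) - \frac{1}{2} p_0 \cdot x_0)}\, T(z_0 + z_1).
\]
Setting
\[
\Phi = p_0 \cdot x - \tfrac{1}{2} p_0 \cdot x_0 + p_1 \cdot (x - x_0) - \tfrac{1}{2} p_1 \cdot x_1,
\]
\[
\Phi' = p_1 \cdot x - \tfrac{1}{2} p_1 \cdot x_1 + p_0 \cdot (x - x_1) - \tfrac{1}{2} p_0 \cdot x_0,
\]
we have
\[
\hat{D}(z_0)\hat{D}(z_1) = e^{\frac{i}{\hbar}(\Phi - \Phi')}\, \hat{D}(z_1)\hat{D}(z_0),
\]
and $\Phi - \Phi' = p_0 \cdot x_1 - p_1 \cdot x_0 = \sigma(z_0, z_1)$, which proves (2.5). Let us next prove formula (2.6). We have
\[
\hat{D}(z_0 + z_1) = e^{\frac{i}{\hbar}\Phi''}\, T(z_0 + z_1)
\]
with
\[
\Phi'' = (p_0 + p_1) \cdot x - \tfrac{1}{2}(p_0 + p_1) \cdot (x_0 + x_1).
\]
On the other hand, we have seen previously that
\[
\hat{D}(z_0)\hat{D}(z_1) = e^{\frac{i}{\hbar}\Phi}\, T(z_0 + z_1),
\]
so that
\[
\hat{D}(z_0 + z_1) = e^{\frac{i}{\hbar}(\Phi'' - \Phi)}\, \hat{D}(z_0)\hat{D}(z_1).
\]
A straightforward algebraic calculation shows that
\[
\Phi'' - \Phi = \tfrac{1}{2} p_1 \cdot x_0 - \tfrac{1}{2} p_0 \cdot x_1 = -\tfrac{1}{2}\sigma(z_0, z_1),
\]
hence formula (2.6).

Proposition 22. (i) The mapping $z_0 \mapsto \hat{D}(z_0)$ is strongly continuous on $\mathcal{S}(\mathbb{R}^n)$, that is
\[
\lim_{z \to z_0} \|\hat{D}(z)\psi - \hat{D}(z_0)\psi\|_{\alpha,\beta} = 0
\]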
for every ψ ∈ 𝒮 (ℝn ), and α, β ∈ ℕn where ‖ ⋅ ‖α,β ((α, β) ∈ ℕn × ℕn ) is the seminorm ‖ψ‖α,β = sup𝜕xα x β ψ on 𝒮 (ℝn ); (ii) The mapping z0 → ̂ D(z0 ) is weakly ∗-continuous on 𝒮 (ℝn ), that is lim ⟨̂ D(z)ψ, ϕ⟩ = ⟨̂ D(z0 )ψ, ϕ⟩
z→z0
for all ψ ∈ 𝒮 (ℝn ) and ϕ ∈ 𝒮 (ℝn ). Proof. (i) Let us assume that z0 = 0 and prove that lim 𝜕xα xβ (̂ D(z)ψ − ψ)∞ = 0.
|z|→0
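Writing $z=(x_0,p_0)$ and noting that, by Leibniz' formula for the derivatives of a product, we can write
$$\partial_x^\alpha x^\beta\hat D(z)\psi=\sum_{\gamma\le\alpha,\,\delta\le\beta}c_{\alpha\beta\gamma\delta}\,x_0^\delta p_0^\gamma\,\hat D(z)(\partial_x^{\alpha-\gamma}x^{\beta-\delta}\psi) \tag{2.7}$$
where the $c_{\alpha\beta\gamma\delta}$ are complex constants (the term $\gamma=\delta=0$ having coefficient 1) and $\gamma\le\alpha$ means $\gamma_j\le\alpha_j$ for $j=1,2,\dots,n$, we have
$$\|\partial_x^\alpha x^\beta(\hat D(z)\psi-\psi)\|_\infty\le\|\hat D(z)(\partial_x^\alpha x^\beta\psi)-(\partial_x^\alpha x^\beta\psi)\|_\infty+\sum_{(\gamma,\delta)\neq(0,0)}|c_{\alpha\beta\gamma\delta}|\,\|x_0^\delta p_0^\gamma\hat D(z)(\partial_x^{\alpha-\gamma}x^{\beta-\delta}\psi)\|_\infty.$$

[…]

… if and only if $\psi=\hat D(z_0)\psi_M$ for some $z_0\in\mathbb R^{2n}$.

We omit the proof of Hudson's theorem; see the subsequent comments and references section. We mention the following refinement of Hudson's theorem due to J. Toft:

Proposition 35. Let $\psi,\phi\in L^2(\mathbb R^n)$. We have $W(\psi,\phi)>0$ if and only if $\psi=T(z_0)\psi_M$ and $\phi=\lambda\psi$ for some $\lambda>0$.

Thus, we cannot expect the cross-Wigner transform of two distinct states to be positive.

3.1.4 Displacing cross-Wigner transforms

The following important result describes the behavior of the cross-Wigner transform under displacements: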
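Proposition 36. For every pair $(\psi,\phi)$ of functions in $\mathcal S(\mathbb R^n)$, we have
$$W(\hat D(z_0)\psi,\hat D(z_1)\phi)(z)=e^{-\frac{i}{\hbar}[\sigma(z,z_0-z_1)+\frac{1}{2}\sigma(z_0,z_1)]}W(\psi,\phi)(z-\langle z\rangle) \tag{3.8}$$
where $\langle z\rangle=\frac{1}{2}(z_0+z_1)$. In particular:
$$W(\hat D(z_0)\psi,\hat D(z_0)\phi)(z)=W(\psi,\phi)(z-z_0) \tag{3.9}$$
$$W(\hat D(z_0)\psi)(z)=W\psi(z-z_0) \tag{3.10}$$
$$W(\hat D(z_0)\psi,\phi)(z)=e^{-\frac{i}{\hbar}\sigma(z,z_0)}W(\psi,\phi)(z-\tfrac{1}{2}z_0). \tag{3.11}$$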
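Proof. To prove formula (3.8), it is sufficient to assume that $\psi,\phi\in\mathcal S(\mathbb R^n)$. We will set $\langle x\rangle=\frac{1}{2}(x_0+x_1)$ and $\langle p\rangle=\frac{1}{2}(p_0+p_1)$. By definition of the Heisenberg–Weyl operator, we have
$$\hat D(z_0)\psi(x+\tfrac{1}{2}y)=e^{\frac{i}{\hbar}[p_0\cdot(x+\frac{1}{2}y)-\frac{1}{2}p_0\cdot x_0]}\psi(x-x_0+\tfrac{1}{2}y)$$
$$\hat D(z_1)\phi(x-\tfrac{1}{2}y)=e^{\frac{i}{\hbar}[p_1\cdot(x-\frac{1}{2}y)-\frac{1}{2}p_1\cdot x_1]}\phi(x-x_1-\tfrac{1}{2}y),$$
and hence
$$\hat D(z_0)\psi(x+\tfrac{1}{2}y)\,\overline{\hat D(z_1)\phi(x-\tfrac{1}{2}y)}=e^{\frac{i}{\hbar}\delta(z_0,z_1)}e^{\frac{i}{\hbar}\langle p\rangle\cdot y}\psi(x-x_0+\tfrac{1}{2}y)\overline{\phi(x-x_1-\tfrac{1}{2}y)}$$
with
$$\delta(z_0,z_1)=(p_0-p_1)\cdot x-\tfrac{1}{2}(p_0\cdot x_0-p_1\cdot x_1).$$
It follows that we have
$$W(\hat D(z_0)\psi,\hat D(z_1)\phi)(z)=\Big(\frac{1}{2\pi\hbar}\Big)^ne^{\frac{i}{\hbar}\delta(z_0,z_1)}\int_{\mathbb R^n}e^{-\frac{i}{\hbar}(p-\langle p\rangle)\cdot y}\psi(x-x_0+\tfrac{1}{2}y)\overline{\phi(x-x_1-\tfrac{1}{2}y)}\,dy.$$
Performing the change of variables $y'=x_1-x_0+y$ in the integral, this equality becomes
$$W(\hat D(z_0)\psi,\hat D(z_1)\phi)(z)=\Big(\frac{1}{2\pi\hbar}\Big)^ne^{\frac{i}{\hbar}\Delta}\int_{\mathbb R^n}e^{-\frac{i}{\hbar}(p-\langle p\rangle)\cdot y'}\psi(x-\langle x\rangle+\tfrac{1}{2}y')\overline{\phi(x-\langle x\rangle-\tfrac{1}{2}y')}\,dy'$$
where the phase $\Delta$ is given by
$$\Delta=(p_0-p_1)\cdot x-(x_0-x_1)\cdot p+\tfrac{1}{2}(p_1\cdot x_0-p_0\cdot x_1)=-\sigma(z,z_0-z_1)-\tfrac{1}{2}\sigma(z_0,z_1),$$
hence formula (3.8). The formulas (3.9), (3.10), and (3.11) immediately follow from (3.8).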
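Formula (3.8) lends itself to a direct numerical check for $n=1$. The sketch below is not part of the original argument: it assumes $\hbar=1$, uses two arbitrary Schwartz test functions and three arbitrary phase-space points, and approximates the integral defining $W(\psi,\phi)$ by the trapezoidal rule. The helper names (`sigma`, `displace`, `cross_wigner`) are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumed units; any positive value should work


def sigma(z, zp):
    """Standard symplectic form sigma(z, z') = p*x' - p'*x for n = 1."""
    (x, p), (xp, pp) = z, zp
    return p * xp - pp * x


def displace(z0, psi):
    """Return D(z0)psi, where D(z0)psi(x) = exp(i(p0*x - p0*x0/2)/hbar) * psi(x - x0)."""
    x0, p0 = z0
    return lambda x: cmath.exp(1j * (p0 * x - 0.5 * p0 * x0) / HBAR) * psi(x - x0)


def cross_wigner(psi, phi, z, half_width=20.0, steps=4000):
    """Trapezoidal approximation of W(psi, phi)(z) for n = 1."""
    x, p = z
    h = 2.0 * half_width / steps
    total = 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * cmath.exp(-1j * p * y / HBAR)
                  * psi(x + 0.5 * y) * phi(x - 0.5 * y).conjugate())
    return total * h / (2.0 * math.pi * HBAR)


def psi(x):
    return (1.0 + 0.5 * x) * cmath.exp(-0.5 * x * x)


def phi(x):
    return cmath.exp(-0.4 * x * x + 0.3j * x)


z, z0, z1 = (0.3, -0.2), (0.7, 0.4), (-0.5, 1.1)
mean = (0.5 * (z0[0] + z1[0]), 0.5 * (z0[1] + z1[1]))
diff = (z0[0] - z1[0], z0[1] - z1[1])

# Left- and right-hand sides of (3.8).
lhs = cross_wigner(displace(z0, psi), displace(z1, phi), z)
phase = cmath.exp(-1j * (sigma(z, diff) + 0.5 * sigma(z0, z1)) / HBAR)
rhs = phase * cross_wigner(psi, phi, (z[0] - mean[0], z[1] - mean[1]))
error_38 = abs(lhs - rhs)
```

Choosing `z1 = z0` or `z1 = (0.0, 0.0)` in the same script gives the special cases (3.9) and (3.11).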
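3.2 The Moyal identity

3.2.1 Statement and proof

The main result of this section is formula (3.12), whose importance cannot be overestimated.

Proposition 37. The cross-Wigner transform satisfies the "Moyal identity"
$$(W(\psi,\phi)|W(\psi',\phi'))_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^n(\psi|\psi')_{L^2}\overline{(\phi|\phi')_{L^2}} \tag{3.12}$$
for all $(\psi,\phi)\in L^2(\mathbb R^n)\times L^2(\mathbb R^n)$ and $(\psi',\phi')\in L^2(\mathbb R^n)\times L^2(\mathbb R^n)$. In particular,
$$\|W(\psi,\phi)\|_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|\psi\|_{L^2}\|\phi\|_{L^2} \tag{3.13}$$
and
$$\|W\psi\|_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|\psi\|^2_{L^2}.$$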
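Proof. Let us set
$$A=(2\pi\hbar)^{2n}(W(\psi,\phi)|W(\psi',\phi'))_{L^2(\mathbb R^{2n})}.$$
In view of the integral definition (3.4) of the cross-Wigner transform, we have
$$A=\int_{\mathbb R^{4n}}e^{-\frac{i}{\hbar}p\cdot(y-y')}\psi(x+\tfrac{1}{2}y)\overline{\psi'(x+\tfrac{1}{2}y')}\;\overline{\phi(x-\tfrac{1}{2}y)}\phi'(x-\tfrac{1}{2}y')\,dy\,dy'\,dx\,dp.$$
The integral in $p$ can be viewed as an inverse Fourier transform, yielding
$$\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot(y-y')}\,dp=(2\pi\hbar)^n\delta(y-y'),$$
and hence
$$A=(2\pi\hbar)^n\int_{\mathbb R^{2n}}\psi(x+\tfrac{1}{2}y)\overline{\psi'(x+\tfrac{1}{2}y)}\;\overline{\phi(x-\tfrac{1}{2}y)}\phi'(x-\tfrac{1}{2}y)\,dy\,dx. \tag{3.14}$$
Setting $x'=x+\frac{1}{2}y$ and $y'=x-\frac{1}{2}y$, we have $dx'dy'=dxdy$, and hence
$$A=(2\pi\hbar)^n\Big(\int_{\mathbb R^n}\psi(x')\overline{\psi'(x')}\,dx'\Big)\Big(\int_{\mathbb R^n}\overline{\phi(y')}\phi'(y')\,dy'\Big)$$
which proves Moyal's identity (3.12). Formula (3.13) immediately follows. Notice that the Moyal identity can be written, using distributional brackets, as
$$\langle W(\psi,\phi),\overline{W(\psi',\phi')}\rangle=\Big(\frac{1}{2\pi\hbar}\Big)^n\langle\psi,\overline{\psi'}\rangle\overline{\langle\phi,\overline{\phi'}\rangle}. \tag{3.15}$$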
3.2.2 An extension result for the cross-Wigner transform

The Moyal identity implies the following continuity result for the cross-Wigner transform:

Proposition 38. The cross-Wigner transform is a continuous mapping
$$W:L^2(\mathbb R^n)\times L^2(\mathbb R^n)\to L^2(\mathbb R^{2n}). \tag{3.16}$$

Proof. Choosing $\psi=\psi'$ and $\phi=\phi'$ in Moyal's identity (3.12), we get
$$\|W(\psi,\phi)\|_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|\psi\|_{L^2}\|\phi\|_{L^2},$$
hence the sesquilinear mapping $(\psi,\phi)\mapsto W(\psi,\phi)$ is a continuous mapping $L^2(\mathbb R^n)\times L^2(\mathbb R^n)\to L^2(\mathbb R^{2n})$.

Recall that $\phi_0$ is the standard Gaussian on $\mathbb R^n$ defined by
$$\phi_0(x)=(\pi\hbar)^{-n/4}e^{-|x|^2/2\hbar}.$$

Corollary 39. Let $\phi_0\in L^2(\mathbb R^n)$ be such that $\|\phi_0\|_{L^2}\neq0$. The set $\{\hat D(z)\phi_0:z\in\mathbb R^{2n}\}$ spans a dense subspace $X$ of $L^2(\mathbb R^n)$.

Proof. Let $\psi\in L^2(\mathbb R^n)$. Let us prove that $(\psi|\hat D(z)\phi_0)_{L^2}=0$ for all $z\in\mathbb R^{2n}$ implies that $\psi=0$; the result will follow. We have (see (3.18) and (3.22) below)
$$(\psi|\hat D(z)\phi_0)_{L^2}=(\pi\hbar)^nW(\psi,\phi_0^\vee)(\tfrac{1}{2}z),\qquad \phi_0^\vee(x)=\phi_0(-x),$$
hence $(\psi|\hat D(z)\phi_0)_{L^2}=0$ for all $z$ is equivalent to $W(\psi,\phi_0^\vee)(z)=0$ for all $z$. In view of Moyal's identity, we then have $\|\psi\|_{L^2}\|\phi_0\|_{L^2}=0$, and hence $\psi=0$.

Note that this result says that, if $W(\psi,\phi)=0$ for all $\phi$, then $\psi=0$. A similar result holds for the cross-ambiguity function: if $A(\psi,\phi)=0$ for all $\phi$, then $\psi=0$.
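3.2.3 A reconstruction result

Moyal's identity allows us in addition to prove an inversion formula for the Wigner function.

Proposition 40. Let $(\phi,\gamma)\in L^2(\mathbb R^n)\times L^2(\mathbb R^n)$ be such that $(\gamma|\phi)_{L^2}\neq0$. For every $\psi\in\mathcal S(\mathbb R^n)$, we have
$$\psi=\frac{2^n}{(\gamma|\phi)_{L^2}}\int_{\mathbb R^{2n}}W(\psi,\phi)(z)\hat R(z)\gamma\,dz \tag{3.17}$$
almost everywhere.

Proof. Let us denote by $f$ the right-hand side of (3.17). This function is well-defined since $W(\psi,\phi)\in L^2(\mathbb R^{2n})$ in view of Moyal's identity. For any $\alpha\in\mathcal S(\mathbb R^n)$, we have
$$(f|\alpha)_{L^2}=\frac{2^n}{(\gamma|\phi)_{L^2}}\int_{\mathbb R^{2n}}W(\psi,\phi)(z)(\hat R(z)\gamma|\alpha)_{L^2}\,dz.$$
Recalling that by definition
$$W(\psi,\phi)(z)=\Big(\frac{1}{\pi\hbar}\Big)^n(\hat R(z)\psi|\phi)_{L^2},$$
we thus have the sequence of equalities
$$\begin{aligned}
(f|\alpha)_{L^2}&=\frac{(2\pi\hbar)^n}{(\gamma|\phi)_{L^2}}\int_{\mathbb R^{2n}}W(\psi,\phi)(z)W(\gamma,\alpha)(z)\,dz\\
&=\frac{(2\pi\hbar)^n}{(\gamma|\phi)_{L^2}}\int_{\mathbb R^{2n}}W(\psi,\phi)(z)\overline{W(\alpha,\gamma)(z)}\,dz\\
&=\frac{(2\pi\hbar)^n}{(\gamma|\phi)_{L^2}}(W(\psi,\phi)|W(\alpha,\gamma))_{L^2}.
\end{aligned}$$
Applying Moyal's identity (3.12) to the product $(W(\psi,\phi)|W(\alpha,\gamma))_{L^2}$, we get
$$(f|\alpha)_{L^2}=\frac{1}{(\gamma|\phi)_{L^2}}(\psi|\alpha)_{L^2}\overline{(\phi|\gamma)_{L^2}}=(\psi|\alpha)_{L^2}.$$
Since this identity holds for all $\alpha\in\mathcal S(\mathbb R^n)$, we have $f=\psi$ almost everywhere, hence (3.17).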
3.3 The cross-ambiguity function

3.3.1 Definition

Closely related to the cross-Wigner transform is the ambiguity function, since they are symplectic Fourier transforms of each other. We begin by giving a definition in terms of the Heisenberg displacement operator.

Definition 41. The cross-ambiguity function on $\mathbb R^n$ is the sesquilinear mapping $A:L^2(\mathbb R^n)\times L^2(\mathbb R^n)\to L^\infty(\mathbb R^{2n})$ defined by
$$A(\psi,\phi)(z)=\Big(\frac{1}{2\pi\hbar}\Big)^n(\psi|\hat D(z)\phi)_{L^2}. \tag{3.18}$$
The function $A\psi=A(\psi,\psi)$ is called the auto-ambiguity function:
$$A\psi(z)=\Big(\frac{1}{2\pi\hbar}\Big)^n(\psi|\hat D(z)\psi)_{L^2}. \tag{3.19}$$

The cross-ambiguity function has the property that (cf. (3.2))
$$A:\mathcal S(\mathbb R^n)\times\mathcal S(\mathbb R^n)\to\mathcal S(\mathbb R^{2n}). \tag{3.20}$$
Using the definition of $\hat D(z)$, it is easily seen that the cross-ambiguity function of $\psi,\phi\in L^2(\mathbb R^n)$ is explicitly given by the formula
$$A(\psi,\phi)(z)=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot y}\psi(y+\tfrac{1}{2}x)\overline{\phi(y-\tfrac{1}{2}x)}\,dy; \tag{3.21}$$
as for the cross-Wigner transform, it is immediately checked, using the Cauchy–Schwarz inequality, that
$$|A(\psi,\phi)(z)|\le\Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi\|_{L^2}\|\phi\|_{L^2}.$$

3.3.2 Relations with the cross-Wigner transform

There are two ways to relate the cross-Wigner and ambiguity functions. One is algebraic, and the other is analytical.

Proposition 42. We have
$$A(\psi,\phi)(z)=2^{-n}W(\psi,\phi^\vee)(\tfrac{1}{2}z) \tag{3.22}$$
where $\phi^\vee(x)=\phi(-x)$.
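Proof. By definition of $W(\psi,\phi^\vee)$, we have
$$W(\psi,\phi^\vee)(\tfrac{1}{2}z)=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^n}e^{-\frac{i}{2\hbar}p\cdot y}\psi(\tfrac{1}{2}x+\tfrac{1}{2}y)\overline{\phi^\vee(\tfrac{1}{2}x-\tfrac{1}{2}y)}\,dy;$$
setting $x'=\frac{1}{2}y$, this is
$$W(\psi,\phi^\vee)(\tfrac{1}{2}z)=\Big(\frac{1}{\pi\hbar}\Big)^n\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot x'}\psi(x'+\tfrac{1}{2}x)\overline{\phi^\vee(\tfrac{1}{2}x-x')}\,dx',$$
hence (3.22) in view of (3.21), since $\phi^\vee(\frac{1}{2}x-x')=\phi(x'-\frac{1}{2}x)$.

Many of the properties of the ambiguity function follow from the relation above; for instance, we have the following analogue of Proposition 33:

Proposition 43. Let $\psi,\phi\in\mathcal S(\mathbb R^n)$. The mapping $z\mapsto A(\psi,\phi)(z)$ is continuous on $\mathbb R^{2n}$.

The relationship (3.22) is purely algebraic; the following result is analytical:

Proposition 44. Let $\psi$ and $\phi$ be in $\mathcal S(\mathbb R^n)$. Let $F_\sigma$ be the symplectic Fourier transform. We have
$$A(\psi,\phi)=F_\sigma(W(\psi,\phi)),\qquad W(\psi,\phi)=F_\sigma(A(\psi,\phi)). \tag{3.23}$$
In particular, $A\psi=F_\sigma(W\psi)$, and $W\psi=F_\sigma(A\psi)$.

Proof. It is sufficient to show that $A(\psi,\phi)=F_\sigma W(\psi,\phi)$: since the symplectic Fourier transform is involutive, both formulas (3.23) are equivalent. It is moreover sufficient to assume that $\psi$ and $\phi$ are in the Schwartz space $\mathcal S(\mathbb R^n)$. Setting $f=(2\pi\hbar)^{2n}F_\sigma W(\psi,\phi)$, we have, by definition of $F_\sigma$ and $W(\psi,\phi)$,
$$\begin{aligned}
f(z)&=\int_{\mathbb R^{3n}}e^{-\frac{i}{\hbar}[\sigma(z,z')+p'\cdot y]}\psi(x'+\tfrac{1}{2}y)\overline{\phi(x'-\tfrac{1}{2}y)}\,dp'\,dx'\,dy\\
&=\int_{\mathbb R^{3n}}e^{-\frac{i}{\hbar}p'\cdot(y-x)}e^{-\frac{i}{\hbar}p\cdot x'}\psi(x'+\tfrac{1}{2}y)\overline{\phi(x'-\tfrac{1}{2}y)}\,dp'\,dx'\,dy.
\end{aligned}$$
Using the Fourier inversion formula, informally written as
$$\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p'\cdot(y-x)}\,dp'=(2\pi\hbar)^n\delta(x-y),$$
we have
$$\begin{aligned}
f(z)&=(2\pi\hbar)^n\int_{\mathbb R^{2n}}\delta(x-y)e^{-\frac{i}{\hbar}p\cdot x'}\psi(x'+\tfrac{1}{2}y)\overline{\phi(x'-\tfrac{1}{2}y)}\,dx'\,dy\\
&=(2\pi\hbar)^n\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot x'}\psi(x'+\tfrac{1}{2}x)\overline{\phi(x'-\tfrac{1}{2}x)}\,dx',
\end{aligned}$$
hence $f=(2\pi\hbar)^{2n}A(\psi,\phi)$ in view of (3.21), which proves our assertion.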
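The algebraic relation (3.22) can be tested numerically for $n=1$, computing the left-hand side from the definition (3.18) and the right-hand side from the integral defining the cross-Wigner transform. This is a rough sketch only: $\hbar=1$, the test functions, the evaluation point and the helper names (`integrate`, `cross_wigner`, `cross_ambiguity`) are assumptions made for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumed units


def integrate(f, half_width=20.0, steps=4000):
    """Trapezoidal rule for a complex-valued f over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0j
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(-half_width + k * h)
    return total * h


def cross_wigner(psi, phi, x, p):
    """W(psi, phi)(x, p) for n = 1."""
    integrand = lambda y: (cmath.exp(-1j * p * y / HBAR)
                           * psi(x + 0.5 * y) * phi(x - 0.5 * y).conjugate())
    return integrate(integrand) / (2.0 * math.pi * HBAR)


def cross_ambiguity(psi, phi, x, p):
    """A(psi, phi)(z) = (2 pi hbar)^(-1) (psi | D(z) phi), following (3.18)."""
    displaced = lambda t: cmath.exp(1j * (p * t - 0.5 * p * x) / HBAR) * phi(t - x)
    return integrate(lambda t: psi(t) * displaced(t).conjugate()) / (2.0 * math.pi * HBAR)


def psi(t):
    return cmath.exp(-0.3 * t * t)


def phi(t):
    # deliberately not symmetric, so that phi and its reflection differ
    return (1.0 + 0.2 * t) * cmath.exp(-0.5 * (t - 0.3) ** 2 + 0.4j * t)


def phi_reflected(t):
    return phi(-t)


x, p = 0.6, -0.9
lhs = cross_ambiguity(psi, phi, x, p)
rhs = 0.5 * cross_wigner(psi, phi_reflected, 0.5 * x, 0.5 * p)  # 2^(-n) with n = 1
error_322 = abs(lhs - rhs)
```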
3.3.3 Moyal identity for the cross-ambiguity function

This result has as an immediate consequence the fact that the ambiguity function also satisfies a Moyal identity:

Proposition 45. The cross-ambiguity function satisfies the Moyal identity
$$(A(\psi,\phi)|A(\psi',\phi'))_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^n(\psi|\psi')_{L^2}\overline{(\phi|\phi')_{L^2}} \tag{3.24}$$
for all $\psi,\phi,\psi',\phi'\in L^2(\mathbb R^n)$. In particular,
$$\|A\psi\|_{L^2(\mathbb R^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|\psi\|^2_{L^2}. \tag{3.25}$$

Proof. We have, taking the Moyal identity in Proposition 37 and the formula $A(\psi,\phi)=F_\sigma(W(\psi,\phi))$ into account,
$$(A(\psi,\phi)|A(\psi',\phi'))_{L^2(\mathbb R^{2n})}=(F_\sigma W(\psi,\phi)|F_\sigma W(\psi',\phi'))_{L^2(\mathbb R^{2n})},$$
hence formula (3.24) since $F_\sigma$ is a unitary operator on $L^2(\mathbb R^{2n})$.

The Moyal identity has the following interesting application to the convolution of Wigner transforms:

Proposition 46. Let $(\psi,\phi)\in L^2(\mathbb R^n)\times L^2(\mathbb R^n)$. We have
$$W\psi*W\phi=(2\pi\hbar)^n|A(\phi,\psi^\vee)|^2 \tag{3.26}$$
where $\psi^\vee(x)=\psi(-x)$ and $A(\cdot,\cdot)$ is the cross-ambiguity function.

Proof. We have, since $W\phi$ is a real function,
$$(W\psi*W\phi)(z)=\int_{\mathbb R^{2n}}W\psi(z-z_0)W\phi(z_0)\,dz_0=\int_{\mathbb R^{2n}}W(\hat D(z)\psi^\vee)(z_0)W\phi(z_0)\,dz_0,$$
hence, by the Moyal identity,
$$(W\psi*W\phi)(z)=\Big(\frac{1}{2\pi\hbar}\Big)^n|(\hat D(z)\psi^\vee|\phi)_{L^2}|^2.$$
Formula (3.26) follows in view of the definition (3.18) of the cross-ambiguity transform.
3.4 Dependence of the Wigner transform on ℏ

Let us now address the following question: for given $\psi\in L^2(\mathbb R^n)$, can we find $\phi$ such that $W_\eta\phi=W\psi$ for $\eta\neq\hbar$? The answer is negative. We will use the Fourier transform $F_\eta$ obtained from $F$ by replacing $\hbar$ with $\eta$:
$$F_\eta\psi(p)=\Big(\frac{1}{2\pi\eta}\Big)^{n/2}\int_{\mathbb R^n}e^{-\frac{i}{\eta}p\cdot x}\psi(x)\,dx,$$
and, similarly, denote by $W_\eta$ the $\eta$-dependent Wigner transform:
$$W_\eta(\psi,\phi)(z)=\Big(\frac{1}{2\pi\eta}\Big)^n\int_{\mathbb R^n}e^{-\frac{i}{\eta}p\cdot y}\psi(x+\tfrac{1}{2}y)\overline{\phi(x-\tfrac{1}{2}y)}\,dy$$
and $W_\eta\psi=W_\eta(\psi,\psi)$. When $\eta=\hbar$, we will write $F$, $W\psi$, etc. as usual.

Proposition 47. Let $\psi\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$ ($\psi\neq0$) and $\eta>0$. Then:
(i) There does not exist any $\phi\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$ such that $W_\eta\phi=W\psi$ if $\eta\neq\hbar$;
(ii) Assume that there exists a sequence $(\psi_j)_j$ of functions $\psi_j\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$, with $\|\psi_j\|_{L^2}=1$ for every $j$, and a sequence $(\alpha_j)_j\in\ell^1(\mathbb N)$ of nonnegative real numbers such that
$$W\psi=\sum_j\alpha_jW_\eta\psi_j. \tag{3.27}$$
Then we must have $\eta\le\hbar$.

Proof. (i) Assume that $W_\eta\phi=W\psi$; then, since $\phi\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$, the marginal property (3.29)
$$|F_\eta\phi(p)|^2=\int_{\mathbb R^n}W_\eta\phi(x,p)\,dx$$
applies, and we thus have, since $W_\eta\phi=W\psi$,
$$|F_\eta\phi(p)|^2=\int_{\mathbb R^n}W\psi(x,p)\,dx=|F_\hbar\psi(p)|^2,$$
hence, by Parseval's equality, $\phi$ and $\psi$ must have the same $L^2$-norm: $\|\phi\|=\|\psi\|$. On the other hand, using the Moyal identity, the equality $W\psi=W_\eta\phi$ implies that
$$\int_{\mathbb R^{2n}}W\psi(z)^2\,dz=\Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi\|^4,\qquad\int_{\mathbb R^{2n}}W_\eta\phi(z)^2\,dz=\Big(\frac{1}{2\pi\eta}\Big)^n\|\phi\|^4,$$
hence we must have $\eta=\hbar$.
(ii) Observe that the series is absolutely convergent in $L^2(\mathbb R^{2n})$. Proceeding in the same way, we get, using again the first marginal property and Parseval's equality,
$$\|\psi\|^2=\sum_j\alpha_j\|\psi_j\|^2=\sum_j\alpha_j. \tag{3.28}$$
On the other hand, squaring $W\psi$ we get
$$(W\psi)^2=\sum_{j,k}\alpha_j\alpha_kW_\eta\psi_jW_\eta\psi_k,$$
hence, integrating and using the Moyal identity for $(W\psi)^2$ and $W_\eta\psi_jW_\eta\psi_k$, together with the Cauchy–Schwarz inequality, we get, since the $\alpha_j$ are nonnegative,
$$\begin{aligned}
\Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi\|^4&=\Big(\frac{1}{2\pi\eta}\Big)^n\sum_{j,k}\alpha_j\alpha_k|(\psi_j|\psi_k)|^2\\
&\le\Big(\frac{1}{2\pi\eta}\Big)^n\sum_{j,k}\alpha_j\alpha_k\|\psi_j\|^2\|\psi_k\|^2\\
&=\Big(\frac{1}{2\pi\eta}\Big)^n\Big(\sum_j\alpha_j\Big)^2.
\end{aligned}$$
Taking the equality (3.28) into account, we obtain
$$\Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi\|^4\le\Big(\frac{1}{2\pi\eta}\Big)^n\|\psi\|^4$$
which implies $\eta\le\hbar$ as claimed.
3.5 Statistical interpretation of the Wigner transform

The Wigner transform $W\psi$ of $\psi\in L^2(\mathbb R^n)$ is a real function. Suppose that $\|\psi\|_{L^2}=1$. Then, by formula (3.6) in Proposition 31,
$$\int_{\mathbb R^{2n}}W\psi(z)\,dz=1.$$
This formula, together with the fact that $W\psi$ always is real, suggests that the Wigner transform might be viewed as a "quasi-probability distribution" on $\mathbb R^{2n}$. This idea is actually confirmed by the following result, which shows that $W\psi(z)$ satisfies the expected marginal properties.

Proposition 48. Assume that $\psi\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$ and $F\psi\in L^1(\mathbb R^n)\cap L^2(\mathbb R^n)$. Then
$$\int_{\mathbb R^n}W\psi(z)\,dp=|\psi(x)|^2,\qquad\int_{\mathbb R^n}W\psi(z)\,dx=|F\psi(p)|^2. \tag{3.29}$$

Proof. Let us prove that, more generally,
$$\int_{\mathbb R^n}W(\psi,\phi)(z)\,dp=\psi(x)\overline{\phi(x)} \tag{3.30}$$
$$\int_{\mathbb R^n}W(\psi,\phi)(z)\,dx=F\psi(p)\overline{F\phi(p)} \tag{3.31}$$
when $\psi$ and $\phi$ both satisfy the stated conditions. Formulas (3.29) will follow taking $\psi=\phi$. Writing the Fourier inversion formula in distributional form as
$$\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot y}\,dp=(2\pi\hbar)^n\delta(y),$$
we have
$$\int_{\mathbb R^n}W(\psi,\phi)(z)\,dp=\int_{\mathbb R^n}\delta(y)\psi(x+\tfrac{1}{2}y)\overline{\phi(x-\tfrac{1}{2}y)}\,dy=\int_{\mathbb R^n}\delta(y)\psi(x)\overline{\phi(x)}\,dy=\psi(x)\overline{\phi(x)}$$
as claimed. This proves formula (3.30). Let us prove (3.31). Making the change of variables $(x,y)\mapsto(x',y')=(x+\frac{1}{2}y,x-\frac{1}{2}y)$, which has Jacobian of absolute value one, on the right-hand side of the equality
$$\int_{\mathbb R^n}W(\psi,\phi)(z)\,dx=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}e^{-\frac{i}{\hbar}p\cdot y}\psi(x+\tfrac{1}{2}y)\overline{\phi(x-\tfrac{1}{2}y)}\,dx\,dy,$$
we get, using Fubini's theorem,
$$\begin{aligned}
\int_{\mathbb R^n}W(\psi,\phi)(z)\,dx&=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}e^{-\frac{i}{\hbar}p\cdot x'}\psi(x')\,\overline{e^{-\frac{i}{\hbar}p\cdot y'}\phi(y')}\,dx'\,dy'\\
&=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot x'}\psi(x')\,dx'\;\overline{\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot y'}\phi(y')\,dy'}\\
&=F\psi(p)\overline{F\phi(p)}
\end{aligned}$$
which is formula (3.31). The two formulas (3.29) trivially follow taking $\psi=\phi$.

Notice that, by integrating with respect to $x$ (resp. $p$) the first (resp. second) formula (3.29), we recover the relationship
$$\int_{\mathbb R^{2n}}W\psi(z)\,dz=\|\psi\|^2_{L^2}.$$
In the quasi-probability interpretation, one treats the Wigner transform as if it were a genuine phase-space probability distribution; for instance, given a "quantum observable" $\hat A$ (i.e., a self-adjoint bounded operator) viewed as the quantization of a function $a$ ("classical observable"), one defines the average value (or mean) of $\hat A$ by the formula
$$\langle\hat A\rangle_\psi=\int_{\mathbb R^{2n}}a(z)W\psi(z)\,dz.$$
We will come back to this interpretation in a more rigorous way in the forthcoming chapters.
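The first marginal property in (3.29) can be illustrated numerically for $n=1$. The sketch below is only an illustration under assumptions that are not in the text: $\hbar=1$, a particular normalized chirped Gaussian as test state, a single evaluation point, and trapezoidal quadrature on truncated intervals. The function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumed units


def psi(x):
    """Normalized chirped Gaussian, an arbitrary test state."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * (1.0 - 0.8j) * x * x)


def wigner(x, p, half_width=16.0, steps=800):
    """Trapezoidal approximation of W psi(x, p) for n = 1 (real up to rounding)."""
    h = 2.0 * half_width / steps
    total = 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * cmath.exp(-1j * p * y / HBAR)
                  * psi(x + 0.5 * y) * psi(x - 0.5 * y).conjugate())
    return (total * h / (2.0 * math.pi * HBAR)).real


def position_marginal(x, p_max=12.0, steps=240):
    """Approximate the integral of W psi(x, p) over p by the trapezoidal rule."""
    h = 2.0 * p_max / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * wigner(x, -p_max + k * h)
    return total * h


x = 0.7
marginal = position_marginal(x)   # left-hand side of the first formula (3.29)
density = abs(psi(x)) ** 2        # right-hand side |psi(x)|^2
```

The mean value $\langle\hat A\rangle_\psi$ could be approximated in the same spirit by a double quadrature of $a(z)W\psi(z)$.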
3.6 Comments and references

The Wigner transform $W\psi$ was introduced by Eugene Wigner in his celebrated 1932 paper [75]. It is unclear what led Wigner (who won the Nobel Prize in physics in 1963) to "guess" his eponymous transform. There are (unconfirmed) speculations that he might have discussed the topic with the physicist Leo Szilard. The importance of the Wigner transform and the associated objects and variants cannot be overestimated. It is at the origin of phase-space quantum mechanics and its formal variant, time-frequency analysis. The reader interested in a detailed exposition of the properties of the Wigner transform and various developments is invited to consult our monograph [32]. For a slightly different point of view and with different priorities, see the classical treatise by G. Folland [28], which has been seminal to many developments in time-frequency analysis. Hudson's theorem on the positivity of the Wigner transform was first proven in [47].
4 Gaussians and Hermite functions

https://doi.org/10.1515/9783110722772-004

4.1 The Wigner transform of a Gaussian

We will use the following classical formula giving the Fourier transform of a Gaussian
$$\phi_Q(x)=e^{-\frac{1}{2\hbar}Qx^2}$$
where $Q$ is a symmetric complex $n\times n$ matrix such that $\operatorname{Re}Q>0$. This condition ensures that $\phi_Q\in\mathcal S(\mathbb R^n)$ and hence also $F\phi_Q\in\mathcal S(\mathbb R^n)$:
$$F\phi_Q(x)=(\det Q)^{-1/2}\phi_{Q^{-1}}(x) \tag{4.1}$$
where
$$(\det Q)^{1/2}=\lambda_1^{1/2}\cdots\lambda_n^{1/2},$$
the numbers $\lambda_1^{1/2},\dots,\lambda_n^{1/2}$ being the square roots with positive real part of the eigenvalues $\lambda_1,\dots,\lambda_n$ of $Q$.

4.1.1 The cross-Wigner transform of a pair of Gaussians

We will henceforth denote by $\psi_M$ the Gaussian function
$$\psi_M=\Big(\frac{1}{\pi\hbar}\Big)^{n/4}(\det\operatorname{Re}M)^{1/4}\phi_M,$$
that is, explicitly:
$$\psi_M(x)=\Big(\frac{1}{\pi\hbar}\Big)^{n/4}(\det X)^{1/4}e^{-\frac{1}{2\hbar}Mx^2} \tag{4.2}$$
where we have set $M=X+iY$, $X=\operatorname{Re}M>0$. The prefactor in (4.2) is chosen so that $\psi_M$ is $L^2$-normalized: $\|\psi_M\|_{L^2}=1$.

Proposition 49. Let $\psi_M$ and $\psi_{M'}$ be two Gaussian functions of the type above. We have
$$W(\psi_M,\psi_{M'})(z)=\Big(\frac{1}{\pi\hbar}\Big)^nC_{M,M'}e^{-\frac{1}{\hbar}Fz\cdot z} \tag{4.3}$$
where $C_{M,M'}$ is a constant:
$$C_{M,M'}=(\det XX')^{1/4}\det\big[\tfrac{1}{2}(M+\overline{M'})\big]^{-1/2} \tag{4.4}$$
and $F$ is the complex matrix given by
$$F=\begin{pmatrix}2\overline{M'}(M+\overline{M'})^{-1}M&-i(M-\overline{M'})(M+\overline{M'})^{-1}\\-i(M+\overline{M'})^{-1}(M-\overline{M'})&2(M+\overline{M'})^{-1}\end{pmatrix}. \tag{4.5}$$
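Proof. We have
$$W(\psi_M,\psi_{M'})(z)=C(X,X')\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot y}e^{-\frac{1}{2\hbar}\Phi(x,y)}\,dy$$
with
$$C(X,X')=2^{-n}\Big(\frac{1}{\pi\hbar}\Big)^{3n/2}(\det XX')^{1/4},\qquad\Phi(x,y)=M(x+\tfrac{1}{2}y)^2+\overline{M'}(x-\tfrac{1}{2}y)^2.$$
Let us calculate the integral
$$I(z)=\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p\cdot y}e^{-\frac{1}{2\hbar}\Phi(x,y)}\,dy.$$
We have
$$\Phi(x,y)=(M+\overline{M'})x^2+\tfrac{1}{4}(M+\overline{M'})y^2+(M-\overline{M'})x\cdot y,$$
and hence,
$$I(z)=e^{-\frac{1}{2\hbar}(M+\overline{M'})x^2}\int_{\mathbb R^n}e^{-\frac{i}{\hbar}[p-\frac{i}{2}(M-\overline{M'})x]\cdot y}e^{-\frac{1}{8\hbar}(M+\overline{M'})y^2}\,dy.$$
Using the Fourier transformation formula (4.1), we get
$$I(z)=(2\pi\hbar)^{n/2}\det\big[\tfrac{1}{4}(M+\overline{M'})\big]^{-1/2}\exp\Big(-\frac{1}{2\hbar}\Big[(M+\overline{M'})x^2+4(M+\overline{M'})^{-1}\big(p-\tfrac{i}{2}(M-\overline{M'})x\big)^2\Big]\Big).$$
A straightforward calculation shows that
$$\tfrac{1}{2}\Big[(M+\overline{M'})x^2+4(M+\overline{M'})^{-1}\big(p-\tfrac{i}{2}(M-\overline{M'})x\big)^2\Big]=Fz\cdot z$$
where $F$ is the matrix
$$\begin{pmatrix}K&-i(M-\overline{M'})(M+\overline{M'})^{-1}\\-i(M+\overline{M'})^{-1}(M-\overline{M'})&2(M+\overline{M'})^{-1}\end{pmatrix}$$
where the left upper block is
$$K=\tfrac{1}{2}\big[M+\overline{M'}-(M-\overline{M'})(M+\overline{M'})^{-1}(M-\overline{M'})\big]. \tag{4.6}$$
Using the identity
$$M+\overline{M'}-(M-\overline{M'})(M+\overline{M'})^{-1}(M-\overline{M'})=4\overline{M'}(M+\overline{M'})^{-1}M, \tag{4.7}$$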
the matrix (4.6) is given by (4.5). The result follows by collecting the constants and simplifying the obtained expression.

Suppose that $M=M'$. Then $M+\overline M=2X$ and $M-\overline M=2iY$, hence (4.5) becomes $F=G$ with
$$G=\begin{pmatrix}X+YX^{-1}Y&YX^{-1}\\X^{-1}Y&X^{-1}\end{pmatrix}, \tag{4.8}$$
and we have $C_{M,M}=1$, so formula (4.3) becomes
$$W\psi_M(z)=\Big(\frac{1}{\pi\hbar}\Big)^ne^{-\frac{1}{\hbar}Gz^2}. \tag{4.9}$$
The following observation is essential for the study of Gaussian quantum states: besides being obviously symmetric, the matrix $G$ is in addition symplectic: $G\in\operatorname{Sp}(n)$. This follows from the obvious factorization $G=S^TS$ where
$$S=\begin{pmatrix}X^{1/2}&0\\X^{-1/2}Y&X^{-1/2}\end{pmatrix}\in\operatorname{Sp}(n). \tag{4.10}$$
That $W\psi_M$ is normalized to unity is seen by making the change of variables $z\mapsto Sz$ in the integral of formula (4.9). In particular, when $\psi_M$ is the standard Gaussian
$$\phi_0(x)=\Big(\frac{1}{\pi\hbar}\Big)^{n/4}e^{-\frac{1}{2\hbar}|x|^2}, \tag{4.11}$$
we immediately get from (4.9)
$$W\phi_0(z)=\Big(\frac{1}{\pi\hbar}\Big)^ne^{-\frac{1}{\hbar}|z|^2}. \tag{4.12}$$
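For $n=1$ the matrices $X$ and $Y$ are real numbers, and formulas (4.8) to (4.10) can be checked directly. The sketch below compares a numerically computed Wigner transform of $\psi_M$ with the closed form (4.9), and verifies $G=S^TS$ and $\det G=1$ (for $n=1$ the symplectic group is the group of $2\times2$ real matrices with determinant one). The values of $\hbar$, $X$, $Y$, the evaluation point and the function names are arbitrary illustrative choices, not taken from the text.

```python
import cmath
import math

HBAR = 0.7          # assumed value, chosen different from 1 on purpose
X, Y = 1.3, -0.6    # M = X + iY with X > 0 (n = 1)


def psi_M(x):
    """Generalized Gaussian (4.2) for n = 1."""
    return (X / (math.pi * HBAR)) ** 0.25 * cmath.exp(-0.5 * complex(X, Y) * x * x / HBAR)


def wigner(psi, x, p, half_width=20.0, steps=4000):
    """Trapezoidal approximation of W psi(x, p) for n = 1."""
    h = 2.0 * half_width / steps
    total = 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * cmath.exp(-1j * p * y / HBAR)
                  * psi(x + 0.5 * y) * psi(x - 0.5 * y).conjugate())
    return (total * h / (2.0 * math.pi * HBAR)).real


# The matrices G of (4.8) and S of (4.10), written out for n = 1.
G = ((X + Y * Y / X, Y / X), (Y / X, 1.0 / X))
S = ((math.sqrt(X), 0.0), (Y / math.sqrt(X), 1.0 / math.sqrt(X)))


def closed_form(x, p):
    """Right-hand side of (4.9)."""
    quad = G[0][0] * x * x + 2.0 * G[0][1] * x * p + G[1][1] * p * p
    return math.exp(-quad / HBAR) / (math.pi * HBAR)


x, p = 0.4, -0.3
wigner_error = abs(wigner(psi_M, x, p) - closed_form(x, p))

det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]
StS = tuple(tuple(sum(S[k][i] * S[k][j] for k in range(2)) for j in range(2))
            for i in range(2))
factorization_error = max(abs(StS[i][j] - G[i][j]) for i in range(2) for j in range(2))
```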
4.1.2 Fermi's trick

In the case $n=1$, the standard Gaussian is a solution of the differential equation
$$\tfrac{1}{2}(-\hbar^2\partial_x^2+x^2)\phi_0=\tfrac{1}{2}\hbar\phi_0 \tag{4.13}$$
(when $\hbar=1$ this is the "Hermite equation"). A natural question to ask is whether the generalized Gaussians (4.2) also are the solutions of a second-order (partial) differential equation. We will use the following elementary result, the proof of which is purely computational, and therefore left to the reader:

Lemma 50. Let $\psi\in C^2(\mathbb R^n)$. Writing $\psi(x)=R(x)e^{i\Phi(x)/\hbar}$ where $R\ge0$ and $\Phi$ are real, that function satisfies the equation $\hat H\psi=0$ at all points $x$ where $R(x)>0$, where $\hat H$ is the partial differential operator
$$\hat H=(-i\hbar\nabla_x-\partial_x\Phi)^2+\hbar^2\frac{\Delta R}{R}. \tag{4.14}$$

Choosing in particular $\psi=\psi_M$, we have

Proposition 51. The generalized Gaussian function
$$\psi_M(x)=\Big(\frac{1}{\pi\hbar}\Big)^{n/4}(\det X)^{1/4}e^{-\frac{1}{2\hbar}Mx^2}$$
($M=X+iY$, $X>0$) satisfies the eigenvalue problem
$$\hat H_M\psi_M=(\hbar\operatorname{Tr}X)\psi_M \tag{4.15}$$
where $\hat H_M$ is the partial differential operator
$$\hat H_M=(-i\hbar\partial_x+Yx)^2+X^2x\cdot x$$
and $\operatorname{Tr}X>0$ is the trace of the matrix $X$.

Proof. Applying Lemma 50 with $\Phi(x)=-\frac{1}{2}Yx\cdot x$ and $R(x)=e^{-\frac{1}{2\hbar}Xx\cdot x}$, we get
$$\partial_x\Phi(x)=-Yx,\qquad\frac{\Delta R(x)}{R(x)}=-\frac{1}{\hbar}\operatorname{Tr}X+\frac{1}{\hbar^2}X^2x\cdot x, \tag{4.16}$$
hence the operator $\hat H$ defined by (4.14) is given by
$$\hat H=(-i\hbar\partial_x+Yx)^2-\hbar\operatorname{Tr}X+X^2x\cdot x;$$
formula (4.15) follows since the Weyl symbol of $\hat H$ is the function $H_M-\hbar\operatorname{Tr}X$.

4.1.3 Cross-ambiguity function of a Gaussian

The calculation of cross-ambiguity functions of Gaussians is straightforward if one uses the algebraic relation (3.22) in Proposition 42:
$$A(\psi,\phi)(z)=2^{-n}W(\psi,\phi^\vee)(\tfrac{1}{2}z)$$
where $\phi^\vee(x)=\phi(-x)$.
Proposition 52. Let $\psi_M$ and $\psi_{M'}$ be generalized Gaussians (4.2). We have
$$A(\psi_M,\psi_{M'})(z)=\Big(\frac{1}{2\pi\hbar}\Big)^nC_{M,M'}e^{-\frac{1}{4\hbar}Fz^2} \tag{4.17}$$
where the constant $C_{M,M'}$ and the matrix $F$ are again given by (4.4) and (4.5). In particular,
$$A\psi_M(z)=\Big(\frac{1}{2\pi\hbar}\Big)^ne^{-\frac{1}{4\hbar}Gz^2}$$
where the symplectic matrix $G$ is given by (4.8).

Proof. It immediately follows from formula (3.22) in Proposition 42.

Let for instance $\phi_0(x)=(\pi\hbar)^{-n/4}e^{-|x|^2/2\hbar}$ be the standard Gaussian. Then:
$$A\phi_0(z)=(2\pi\hbar)^{-n}e^{-\frac{1}{4\hbar}|z|^2}.$$
4.2 The case of Hermite functions

4.2.1 Hermite functions

The following formulas are to be found widespread in the literature (see the subsequent comments and references section). The $N$-th Hermite polynomial $h_N$ is defined by "Rodrigues' formula":
$$h_N(x)=(-1)^Ne^{x^2}\frac{d^N}{dx^N}e^{-x^2}. \tag{4.18}$$
These functions satisfy Hermite's differential equation
$$\frac{d^2}{dx^2}h_N-2x\frac{d}{dx}h_N+2Nh_N=0 \tag{4.19}$$
and the Hermite recurrence relationships
$$\frac{d}{dx}h_N=2Nh_{N-1},\qquad h_{N+1}=2xh_N-2Nh_{N-1}, \tag{4.20}$$
which allow them to be easily calculated stepwise.

Definition 53. The $N$-th Hermite function is given by
$$H_N(x)=\frac{1}{\sqrt{2^NN!}}\Big(\frac{1}{\pi}\Big)^{1/4}e^{-x^2/2}h_N(x). \tag{4.21}$$
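The stepwise calculation mentioned above can be sketched as follows, using the second recurrence in (4.20) with the starting values $h_0=1$ and $h_1(x)=2x$, and then the definition (4.21). The function names and the crude quadrature used to test orthonormality are illustrative assumptions.

```python
import math


def hermite_polynomial(n, x):
    """Evaluate h_n(x) via h_{k+1} = 2x h_k - 2k h_{k-1}, with h_0 = 1 and h_1 = 2x."""
    previous, current = 1.0, 2.0 * x
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, 2.0 * x * current - 2.0 * k * previous
    return current


def hermite_function(n, x):
    """Evaluate the Hermite function H_n(x) of (4.21)."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * math.exp(-0.5 * x * x) * hermite_polynomial(n, x)


def overlap(m, n, half_width=12.0, steps=2400):
    """Riemann-sum approximation of the integral of H_m * H_n over the real line."""
    h = 2.0 * half_width / steps
    return h * sum(hermite_function(m, -half_width + k * h)
                   * hermite_function(n, -half_width + k * h)
                   for k in range(steps + 1))


h3_at_1_5 = hermite_polynomial(3, 1.5)   # h_3(x) = 8x^3 - 12x
norm_3 = overlap(3, 3)                   # should be close to 1
cross_23 = overlap(2, 3)                 # should be close to 0
```

The recurrence is numerically stable for moderate $N$; for large $N$ one would normally rescale the polynomials to avoid overflow.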
Setting $N=0$, we have $h_0(x)=1$, hence
$$H_0(x)=\pi^{-1/4}e^{-x^2/2}$$
is the standard Gaussian corresponding to the choice $\hbar=1$. The Hermite functions satisfy the parity relationship
$$H_N(-x)=(-1)^NH_N(x), \tag{4.22}$$
and they are the eigenfunctions of the "Hermite operator" $\hat H=\frac{1}{2}(-\partial_x^2+x^2)$:
$$\tfrac{1}{2}(-\partial_x^2+x^2)H_N(x)=(N+\tfrac{1}{2})H_N(x) \tag{4.23}$$
for $N=0,1,2,\dots$

In quantum mechanics it is customary to use a rescaled variant of the Hermite functions. Consider the operator
$$\hat H=\frac{1}{2m}(-\hbar^2\partial_x^2+m^2\omega^2x^2);$$
it is the quantization of the Hamiltonian function
$$H=\frac{1}{2m}(p^2+m^2\omega^2x^2) \tag{4.24}$$
which represents a one-dimensional harmonic oscillator with mass $m$ and frequency $\omega$. The operator $\hat H$ has the eigenvalues $\lambda_N=(N+\frac{1}{2})\hbar\omega$, and its eigenfunctions $\Phi_N$ are given by
$$\Phi_N(x)=\sqrt{\alpha}\,H_N(\alpha x),\qquad\alpha=\sqrt{\frac{m\omega}{\hbar}},$$
that is, in view of (4.21),
$$\Phi_N(x)=\frac{1}{\sqrt{2^NN!}}\Big(\frac{m\omega}{\pi\hbar}\Big)^{1/4}e^{-m\omega x^2/2\hbar}h_N\Big(\sqrt{\frac{m\omega}{\hbar}}\,x\Big). \tag{4.25}$$
4.2.2 Laguerre polynomials and functions

Let us recall the basics of the theory of Laguerre functions.

Definition 54. The $N$-th Laguerre polynomial is given by
$$L_N(x)=\frac{e^x}{N!}\frac{d^N}{dx^N}(x^Ne^{-x}). \tag{4.26}$$
More generally, for $k\in\mathbb R$, the function $L_N^{(k)}:\mathbb R\to\mathbb R$ defined by
$$L_N^{(k)}(x)=\frac{e^xx^{-k}}{N!}\frac{d^N}{dx^N}(x^{N+k}e^{-x}) \tag{4.27}$$
is called the $N$-th generalized Laguerre polynomial.

Applying Leibniz's rule for the differentiation of a product, we have the explicit expressions
$$L_N(x)=\sum_{j=0}^N\binom{N}{j}\frac{(-x)^j}{j!},\qquad L_N^{(k)}(x)=\sum_{j=0}^N\frac{1}{j!}\binom{N+k}{N-j}(-x)^j.$$
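The explicit sums translate directly into code. The sketch below restricts $k$ to nonnegative integers so that the binomial coefficient can be computed with `math.comb`; this restriction and the function name are assumptions of the example, not of the text, where $k$ is an arbitrary real number.

```python
import math


def generalized_laguerre(n, k, x):
    """Evaluate L_n^{(k)}(x) for integer k >= 0 using the explicit sum above."""
    return sum(math.comb(n + k, n - j) * (-x) ** j / math.factorial(j)
               for j in range(n + 1))


def laguerre(n, x):
    """Simple Laguerre polynomial L_n = L_n^{(0)}."""
    return generalized_laguerre(n, 0, x)


value_l2 = laguerre(2, 3.0)                   # L_2(x) = 1 - 2x + x^2/2
value_l2_1 = generalized_laguerre(2, 1, 2.0)  # L_2^{(1)}(x) = 3 - 3x + x^2/2
```

For non-integer $k$ the binomial coefficient would have to be expressed through the Gamma function instead.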
The Laguerre polynomials satisfy the Laguerre equation
$$x\frac{d^2}{dx^2}L_N^{(k)}+(k+1-x)\frac{d}{dx}L_N^{(k)}+NL_N^{(k)}=0, \tag{4.28}$$
and we have the following orthogonality relationships:
$$\int_0^\infty x^ke^{-x}L_M^{(k)}(x)L_N^{(k)}(x)\,dx=\frac{\Gamma(N+k+1)}{N!}\delta_{M,N}. \tag{4.29}$$
Taking $k=0$, we have in particular:
$$\int_0^\infty e^{-x}L_M(x)L_N(x)\,dx=\delta_{M,N}. \tag{4.30}$$
The following relationship between Hermite and Laguerre functions is very useful: if $M\le N$, then
$$\int_{-\infty}^\infty e^{-x^2}h_M(x+u)h_N(x+v)\,dx=2^N\sqrt{\pi}\,M!\,v^{N-M}L_M^{(N-M)}(-2uv); \tag{4.31}$$
choosing $M=N$, we get in particular
$$\int_{-\infty}^\infty e^{-x^2}h_N(x+u)h_N(x+v)\,dx=2^N\sqrt{\pi}\,N!\,L_N(-2uv). \tag{4.32}$$
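4.2.3 The cross-Wigner transform of Hermite functions

Proposition 55. The Wigner transform of the $N$-th Hermite function $\Phi_N$ is given by
$$W\Phi_N(z)=\frac{(-1)^N}{\pi\hbar}e^{-\frac{2}{\hbar\omega}H(z)}L_N\Big(\frac{4}{\hbar\omega}H(z)\Big) \tag{4.33}$$
where $z=(x,p)$ and $L_N$ is the $N$-th simple Laguerre polynomial, $H$ being defined by (4.24).

Proof. We have, by definition of the Wigner transform ($\Phi_N$ being real),
$$W\Phi_N(z)=\frac{1}{2\pi\hbar}\int_{-\infty}^\infty e^{-\frac{i}{\hbar}py}\Phi_N(x+\tfrac{1}{2}y)\Phi_N(x-\tfrac{1}{2}y)\,dy,$$
that is, in view of (4.25) and after the change of variables $y\mapsto2y$,
$$\begin{aligned}
W\Phi_N(z)&=\tfrac{1}{2}C_N\int_{-\infty}^\infty e^{-\frac{i}{\hbar}py}e^{-\alpha^2(x^2+\frac{1}{4}y^2)}h_N(\alpha(x+\tfrac{1}{2}y))h_N(\alpha(x-\tfrac{1}{2}y))\,dy\\
&=C_Ne^{-\alpha^2x^2}\int_{-\infty}^\infty e^{-\frac{2i}{\hbar}py}e^{-\alpha^2y^2}h_N(\alpha(x+y))h_N(\alpha(x-y))\,dy
\end{aligned}$$
where $C_N$ and $\alpha$ are constants:
$$C_N=\frac{1}{\pi\hbar}\frac{1}{2^NN!}\Big(\frac{m\omega}{\pi\hbar}\Big)^{1/2},\qquad\alpha=\sqrt{\frac{m\omega}{\hbar}}.$$
Completing the squares, we have
$$\alpha^2y^2+\frac{2i}{\hbar}py=\alpha^2\Big(y+\frac{ip}{\alpha^2\hbar}\Big)^2+\frac{p^2}{\alpha^2\hbar^2},$$
and hence
$$W\Phi_N(z)=C_Ne^{-\alpha^2x^2}e^{-\frac{p^2}{\alpha^2\hbar^2}}\int_{-\infty}^\infty e^{-\alpha^2(y+\frac{ip}{\alpha^2\hbar})^2}h_N(\alpha(x+y))h_N(\alpha(x-y))\,dy.$$
Setting
$$u=\alpha\Big(y+\frac{ip}{\alpha^2\hbar}\Big)\quad\text{and}\quad\beta=\frac{ip}{\alpha\hbar},$$
we get, after simplification and using the parity relationship (4.22),
$$W\Phi_N(z)=K_N(z)\int_{-\infty}^\infty e^{-u^2}h_N(u+\alpha x-\beta)h_N(u-\alpha x-\beta)\,du$$
where
$$K_N(z)=C_N\frac{(-1)^N}{\alpha}e^{-(\alpha^2x^2-\beta^2)}=\frac{(-1)^N}{\pi\hbar}\frac{\pi^{-1/2}}{2^NN!}e^{-\frac{2}{\hbar\omega}H(z)}. \tag{4.34}$$
Using formula (4.32), one finds that
$$\int_{-\infty}^\infty e^{-u^2}h_N(u+\alpha x-\beta)h_N(u-\alpha x-\beta)\,du=2^N\sqrt{\pi}\,N!\,L_N\Big(\frac{4}{\hbar\omega}H(z)\Big)$$
where $L_N$ is the Laguerre polynomial (4.26), hence, after some simplifications,
$$W\Phi_N(z)=\frac{(-1)^N}{\pi\hbar}e^{-\frac{2}{\hbar\omega}H(z)}L_N\Big(\frac{4}{\hbar\omega}H(z)\Big)$$
which is formula (4.33).

If we choose $m=\omega=1$, then $H=\frac{1}{2}(p^2+x^2)$, and (4.33) becomes
$$W\Phi_N(x,p)=\frac{(-1)^N}{\pi\hbar}e^{-\frac{1}{\hbar}|z|^2}L_N\Big(\frac{2}{\hbar}|z|^2\Big). \tag{4.35}$$
Taking in addition $\hbar=1$ we get the standard Hermite function $H_N$ and
$$WH_N(x,p)=\frac{(-1)^N}{\pi}e^{-|z|^2}L_N(2|z|^2), \tag{4.36}$$
while the choice $\hbar=1/2\pi$ leads to
$$W\Phi_N(x,p)=2(-1)^Ne^{-2\pi|z|^2}L_N(4\pi|z|^2). \tag{4.37}$$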
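Formula (4.36) can be compared with a direct numerical evaluation of the Wigner integral. The sketch below does this for $N=0,\dots,3$ at one arbitrary phase-space point, with $\hbar=m=\omega=1$; the Hermite polynomials are generated by the recurrence (4.20) and the Laguerre polynomials by their explicit sum. The point, the quadrature parameters and all function names are illustrative assumptions.

```python
import cmath
import math


def hermite_polynomial(n, x):
    """h_n(x) from the recurrence h_{k+1} = 2x h_k - 2k h_{k-1}."""
    previous, current = 1.0, 2.0 * x
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, 2.0 * x * current - 2.0 * k * previous
    return current


def hermite_function(n, x):
    """Hermite function H_n(x) of (4.21)."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * math.exp(-0.5 * x * x) * hermite_polynomial(n, x)


def laguerre(n, x):
    """Simple Laguerre polynomial L_n(x) from its explicit sum."""
    return sum(math.comb(n, j) * (-x) ** j / math.factorial(j) for j in range(n + 1))


def wigner_hermite(n, x, p, half_width=24.0, steps=4800):
    """Trapezoidal approximation of W H_n(x, p) with hbar = 1."""
    h = 2.0 * half_width / steps
    total = 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * cmath.exp(-1j * p * y)
                  * hermite_function(n, x + 0.5 * y) * hermite_function(n, x - 0.5 * y))
    return (total * h / (2.0 * math.pi)).real


def closed_form(n, x, p):
    """Right-hand side of (4.36)."""
    r2 = x * x + p * p
    return (-1) ** n / math.pi * math.exp(-r2) * laguerre(n, 2.0 * r2)


errors = [abs(wigner_hermite(n, 0.6, -0.4) - closed_form(n, 0.6, -0.4)) for n in range(4)]
```

Note that `closed_form(1, 0.0, 0.0)` is negative, which illustrates that the Wigner transform of an excited state is not a genuine probability density.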
The main result of this section is:

Proposition 56. Let $M\le N$. The cross-Wigner transform of $\Phi_M$ and $\Phi_N$ is given by
$$W(\Phi_M,\Phi_N)(z)=C_{M,N}\Big(x+\frac{ip}{m\omega}\Big)^{N-M}e^{-\frac{2}{\hbar\omega}H(z)}L_M^{(N-M)}\Big(\frac{4}{\hbar\omega}H(z)\Big) \tag{4.38}$$
where the constant factor $C_{M,N}$ is given by
$$C_{M,N}=\frac{(-1)^M}{\pi\hbar}\sqrt{\frac{2^NM!}{2^MN!}}\Big(\frac{m\omega}{\hbar}\Big)^{(N-M)/2}.$$

Proof. In the course of the proof of Proposition 55, we showed that
$$W\Phi_N(z)=K_N(z)\int_{-\infty}^\infty e^{-u^2}h_N(u+\alpha x-\beta)h_N(u-\alpha x-\beta)\,du$$
where $\alpha=\sqrt{m\omega/\hbar}$, $\beta=ip/\alpha\hbar$, and
$$K_N(z)=\frac{(-1)^N}{\pi\hbar}\frac{\pi^{-1/2}}{2^NN!}e^{-\frac{2}{\hbar\omega}H(z)}. \tag{4.39}$$
In the present case, a similar argument leads to the formula
$$W(\Phi_M,\Phi_N)(z)=K_{M,N}(z)I_{M,N}(z)$$
where the factor $K_{M,N}(z)$ is given by
$$K_{M,N}(z)=\frac{(-1)^N}{\pi\hbar}\frac{\pi^{-1/2}}{\sqrt{2^{M+N}M!N!}}e^{-\frac{2}{\hbar\omega}H(z)}$$
and
$$I_{M,N}(z)=\int_{-\infty}^\infty e^{-u^2}h_M(u+\alpha x-\beta)h_N(u-\alpha x-\beta)\,du.$$
In view of formula (4.31), we have
$$I_{M,N}(z)=2^N\sqrt{\pi}\,M!\,(-1)^{N-M}(\alpha x+\beta)^{N-M}L_M^{(N-M)}(2(\alpha^2x^2-\beta^2));$$
using the relationships
$$\alpha x+\beta=\sqrt{\frac{m\omega}{\hbar}}\Big(x+\frac{ip}{m\omega}\Big),\qquad2(\alpha^2x^2-\beta^2)=\frac{4}{\hbar\omega}H(z),$$
the function $I_{M,N}(z)$ is given by
$$I_{M,N}(z)=2^N\sqrt{\pi}\,M!\,(-1)^{N-M}\Big(\frac{m\omega}{\hbar}\Big)^{(N-M)/2}\Big(x+\frac{ip}{m\omega}\Big)^{N-M}L_M^{(N-M)}\Big(\frac{4}{\hbar\omega}H(z)\Big),$$
hence formula (4.38) after a few simplifications.

If we choose $m=\omega=1$ and $\hbar=1/2\pi$, formula (4.38) can be written
$$W(\Phi_M,\Phi_N)(z)=C_{M,N}\,\zeta^{N-M}e^{-2\pi|\zeta|^2}L_M^{(N-M)}(4\pi|\zeta|^2) \tag{4.40}$$
where $\zeta$ is the complex variable $\zeta=x+ip$ and the constant $C_{M,N}$ is given by
$$C_{M,N}=2^{N-M+1}(-1)^M\pi^{(N-M)/2}\sqrt{\frac{M!}{N!}}.$$
The following formulas for the cross-ambiguity function of $(\Phi_M,\Phi_N)$ easily follow from the preceding result:
Corollary 57. Let $M\le N$. We have
$$A(\Phi_M,\Phi_N)(z)=D_{M,N}\Big(x+\frac{ip}{m\omega}\Big)^{N-M}e^{-\frac{1}{2\hbar\omega}H(z)}L_M^{(N-M)}\Big(\frac{1}{\hbar\omega}H(z)\Big) \tag{4.41}$$
where
$$D_{M,N}=\frac{(-1)^{M+N}}{2\pi\hbar}2^{M-N}\sqrt{\frac{2^NM!}{2^MN!}}\Big(\frac{m\omega}{\hbar}\Big)^{(N-M)/2}.$$
In particular,
$$A\Phi_N(z)=\frac{1}{2\pi\hbar}e^{-\frac{1}{2\hbar\omega}H(z)}L_N\Big(\frac{1}{\hbar\omega}H(z)\Big). \tag{4.42}$$

Proof. The relationship (3.22) in Proposition 42 becomes here
$$A(\psi,\phi)(z)=\tfrac{1}{2}W(\psi,\phi^\vee)(\tfrac{1}{2}z). \tag{4.43}$$
Using the parity formula (4.22), we have $\Phi_N^\vee(x)=(-1)^N\Phi_N(x)$, hence
$$A(\Phi_M,\Phi_N)(z)=\tfrac{1}{2}(-1)^NW(\Phi_M,\Phi_N)(\tfrac{1}{2}z).$$

4.3 Comments and references

Hudson showed in [47] that the Wigner transform of a function on $\mathbb R$ is positive if and only if that function is a generalized Gaussian. Janssen generalized in [49] Hudson's result to the multidimensional case under more general hypotheses. See Folland [28], Chap. 1, Gröchenig [40], Soto and Claverie [67], Toft [70] for independent proofs. We refer to the monumental treatise [39] by Gradshteyn and Ryzhik for material on Hermite and Laguerre functions. The cross-Wigner transform of Hermite functions has been studied by Agorram et al. [3]; also see our monograph [32], which we have been following in our presentation. One can find variants of the proof of the formula for the Wigner transform of Hermite functions in many texts.
5 The Weyl transform

https://doi.org/10.1515/9783110722772-005

5.1 Definitions of the Weyl transform

The Weyl transform associates to every symbol $a\in\mathcal S'(\mathbb R^{2n})$ an operator on $\mathcal S(\mathbb R^n)$. The properties of the Weyl transform make it into a possible quantization procedure.

5.1.1 First definition

In Proposition 179, we expressed the relationship between a density operator $\hat\rho$ and its Wigner distribution $\rho$ using the harmonic decomposition
$$\hat\rho\psi=\int_{\mathbb R^{2n}}\rho_\sigma(z)\hat D(z)\psi\,dz \tag{5.1}$$
where $\psi\in L^2(\mathbb R^n)$, $\hat D(z)$ is the Heisenberg displacement operator, and $\rho_\sigma=F_\sigma\rho$ is the symplectic Fourier transform of $\rho$. Equivalently, if one uses the Grossmann–Royer reflection operator,
$$\hat\rho\psi=2^n\int_{\mathbb R^{2n}}\rho(z)\hat R(z)\psi\,dz. \tag{5.2}$$
This suggests the following definition.

Definition 58. Let $a\in\mathcal S'(\mathbb R^{2n})$. The Weyl operator $\operatorname{Op}_W(a)$ with symbol $a$ is the mapping $\mathcal S(\mathbb R^n)\to\mathcal S'(\mathbb R^n)$ defined by
$$\operatorname{Op}_W(a)\psi=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}a_\sigma(z)\hat D(z)\psi\,dz, \tag{5.3}$$
and the mapping $a\mapsto\operatorname{Op}_W(a)$ is called the "Weyl transform"; formula (5.3) is equivalent to
$$\operatorname{Op}_W(a)\psi=\Big(\frac{1}{\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}a(z)\hat R(z)\psi\,dz. \tag{5.4}$$
[Some authors use the notation $a^w(x,D)$ instead of $\operatorname{Op}_W(a)$.]

The proof of the equivalence of the definitions (5.3) and (5.4) is mutatis mutandis similar to that of Proposition 179 using the symplectic Fourier transform. These formulas should be understood in the distributional sense: for every $\phi\in\mathcal S(\mathbb R^n)$, we have
$$\langle\operatorname{Op}_W(a)\psi,\phi\rangle=\Big(\frac{1}{2\pi\hbar}\Big)^n\langle a_\sigma,\langle\hat D(\cdot)\psi,\phi\rangle\rangle$$
$$\langle\operatorname{Op}_W(a)\psi,\phi\rangle=\Big(\frac{1}{\pi\hbar}\Big)^n\langle a,\langle\hat R(\cdot)\psi,\phi\rangle\rangle$$
where $\langle\hat D(\cdot)\psi,\phi\rangle\in\mathcal S(\mathbb R^{2n})$ and $\langle\hat R(\cdot)\psi,\phi\rangle\in\mathcal S(\mathbb R^{2n})$.

Remark 59. The density operator $\hat\rho$ defined by (5.1) is thus a Weyl operator with symbol $(2\pi\hbar)^n\rho$.

Admittedly, this definition is a little vague since one does not immediately see how to interpret the integral in (5.3) except in a weak sense, nor is it clear what the domain (and the range) of $\operatorname{Op}_W(a)$ could be. Also, the presence of the prefactor $(2\pi\hbar)^{-n}$, which does not appear in (5.1), has to be motivated. So, a few words of explanation are necessary.

Proposition 60. (i) The Weyl transform $\operatorname{Op}_W(a)$ is a well-defined operator $\mathcal S(\mathbb R^n)\to\mathcal S'(\mathbb R^n)$ for all $a\in\mathcal S'(\mathbb R^{2n})$; (ii) Let $b\in\mathcal S'(\mathbb R^n)$; then $\operatorname{Op}_W(b\otimes1)\psi(x)=b(x)\psi(x)$, and, in particular, $\operatorname{Op}_W(1)$ is the identity operator $I_d$.

Proof. (i) Let $\psi\in\mathcal S(\mathbb R^n)$. For every $\phi\in\mathcal S(\mathbb R^n)$, the distributional bracket $\langle\hat D(z)\psi,\phi\rangle$ is equal to $(\pi\hbar)^nA(\psi,\phi)(z)$ ($A$ the cross-ambiguity function) and hence, since $A(\psi,\phi)\in\mathcal S(\mathbb R^{2n})$ (see (3.20)),
$$\langle\operatorname{Op}_W(a)\psi,\phi\rangle=2^{-n}\int_{\mathbb R^{2n}}a_\sigma(z)A(\psi,\phi)(z)\,dz=2^{-n}\langle a_\sigma,A(\psi,\phi)\rangle,$$
hence our claim since $a\in\mathcal S'(\mathbb R^{2n})$ if and only if $a_\sigma\in\mathcal S'(\mathbb R^{2n})$. (ii) It is sufficient to give the proof for $b\in\mathcal S(\mathbb R^n)$. We have, using the equality $F_n1=(2\pi\hbar)^{n/2}\delta$ ($F_n$ the $n$-dimensional Fourier transform) and writing $\hat b=F_nb$,
$$\begin{aligned}
\operatorname{Op}_W(b\otimes1)\psi(x')&=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\int_{\mathbb R^{2n}}\hat b(p)\delta(x)\hat D(z)\psi(x')\,dz\\
&=\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\int_{\mathbb R^n}e^{\frac{i}{\hbar}p\cdot x'}\hat b(p)\psi(x')\,dp\\
&=b(x')\psi(x').
\end{aligned}$$
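In the case where the symbol is the Dirac distribution $\delta$ on $\mathbb R^{2n}$, we have $a_\sigma=(2\pi\hbar)^{-n}$, and hence
$$\begin{aligned}
\operatorname{Op}_W(\delta)\psi(x)&=\Big(\frac{1}{2\pi\hbar}\Big)^{2n}\int_{\mathbb R^n}\Big(\int_{\mathbb R^n}e^{\frac{i}{\hbar}(p_0\cdot x-\frac{1}{2}p_0\cdot x_0)}\,dp_0\Big)\psi(x-x_0)\,dx_0\\
&=\Big(\frac{1}{2\pi\hbar}\Big)^n\int_{\mathbb R^n}\delta(x-\tfrac{1}{2}x_0)\psi(x-x_0)\,dx_0\\
&=\Big(\frac{1}{\pi\hbar}\Big)^n\psi(-x)
\end{aligned}$$
so that $\operatorname{Op}_W(\delta)$ is, up to the factor $(\pi\hbar)^{-n}$, the reflection operator $\hat R=\hat R(0)$. Another consequence of this definition is: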
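Proposition 61. The operator with Weyl symbol $z\mapsto e^{-\frac{i}{\hbar}\sigma(z,z_0)}$ is the Heisenberg displacement operator $\hat D(z_0)$.

Proof. Let us write $a_0(z)=e^{-\frac{i}{\hbar}\sigma(z,z_0)}$, and let $\hat A_0=\operatorname{Op}_W(a_0)$. Denoting the integration variable by $z'=(x',p')$ and using formula (5.4) together with $\hat R(z')\psi(x)=e^{\frac{2i}{\hbar}p'\cdot(x-x')}\psi(2x'-x)$, we have
$$\begin{aligned}
\hat A_0\psi(x)&=\Big(\frac{1}{\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}e^{-\frac{i}{\hbar}\sigma(z',z_0)}e^{\frac{2i}{\hbar}p'\cdot(x-x')}\psi(2x'-x)\,dp'\,dx'\\
&=\Big(\frac{1}{\pi\hbar}\Big)^n\int_{\mathbb R^{2n}}e^{\frac{i}{\hbar}p_0\cdot x'}e^{-\frac{i}{\hbar}p'\cdot(x_0+2x'-2x)}\psi(2x'-x)\,dp'\,dx'\\
&=\Big(\frac{1}{\pi\hbar}\Big)^n\int_{\mathbb R^n}e^{\frac{i}{\hbar}p_0\cdot x'}\Big[\int_{\mathbb R^n}e^{-\frac{i}{\hbar}p'\cdot(x_0+2x'-2x)}\,dp'\Big]\psi(2x'-x)\,dx'\\
&=2^n\int_{\mathbb R^n}e^{\frac{i}{\hbar}p_0\cdot x'}\delta(x_0+2x'-2x)\psi(2x'-x)\,dx'.
\end{aligned}$$
Setting $y=2x'$, we get
$$\hat A_0\psi(x)=e^{\frac{i}{\hbar}(p_0\cdot x-\frac{1}{2}p_0\cdot x_0)}\psi(x-x_0)=\hat D(z_0)\psi(x).$$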
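5.1.2 Definition using the Wigner transform

It is a variant of the previous definition.

Definition 62. Let $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$, and $a \in \mathcal{S}'(\mathbb{R}^{2n})$. Then the Weyl operator $\operatorname{Op}_W(a)$ is the only operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ such that
\[
\langle \operatorname{Op}_W(a)\psi,\phi\rangle = \langle a, W(\psi,\phi)\rangle. \tag{5.5}
\]
An operator satisfying (5.5) (if it exists) is unique: if $\widehat{A} : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ is another such operator, then
\[
\langle (\widehat{A} - \operatorname{Op}_W(a))\psi,\phi\rangle = 0
\]
for all $\phi$, hence $(\widehat{A} - \operatorname{Op}_W(a))\psi = 0$ for all $\psi$, which is only possible if $\widehat{A} = \operatorname{Op}_W(a)$. Let us show that such an operator $\widehat{A}$ exists; for this it suffices to check that the operator defined by
\[
\widehat{A}\psi = \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z)\,\widehat{R}(z)\psi\,dz
\]
for $\psi \in \mathcal{S}(\mathbb{R}^{n})$ satisfies (5.5). Letting $\phi \in \mathcal{S}(\mathbb{R}^{n})$, we have, using definition (3.1) of the cross-Wigner transform,
\[
\langle \widehat{A}\psi,\phi\rangle = \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z)\,\langle \widehat{R}(z)\psi,\phi\rangle\,dz = \int_{\mathbb{R}^{2n}} a(z)\,W(\psi,\phi)(z)\,dz
\]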
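[the integrals being viewed as distributional brackets on $\mathbb{R}^{2n}$] which proves our claim.

An immediate consequence of this definition is:

Proposition 63. If $a \in L^2(\mathbb{R}^{2n})$, then $\operatorname{Op}_W(a) : L^2(\mathbb{R}^{n}) \to L^2(\mathbb{R}^{n})$ continuously.

Proof. Definition (5.5) implies that
\[
(\operatorname{Op}_W(a)\psi|\phi)_{L^2} = (a|W(\psi,\phi))_{L^2(\mathbb{R}^{2n})} \tag{5.6}
\]
for all $(\psi,\phi) \in L^2(\mathbb{R}^{n}) \times L^2(\mathbb{R}^{n})$, since we have in this case $W(\psi,\phi) \in L^2(\mathbb{R}^{2n})$ (Proposition 38). In view of the Cauchy–Schwarz inequality together with the Moyal identity (3.12), we have
\[
|(a|W(\psi,\phi))_{L^2(\mathbb{R}^{2n})}| \le \|a\|_{L^2(\mathbb{R}^{2n})}\,\|W(\psi,\phi)\|_{L^2(\mathbb{R}^{2n})} = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|a\|_{L^2(\mathbb{R}^{2n})}\,\|\psi\|_{L^2}\,\|\phi\|_{L^2},
\]
that is, taking $\psi = \phi$ in (5.6),
\[
|(\operatorname{Op}_W(a)\psi|\psi)_{L^2}| \le \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|a\|_{L^2(\mathbb{R}^{2n})}\,\|\psi\|_{L^2}^{2},
\]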
hence the result.

Here is another application. One consequence of Moyal's identity is that, if $(\psi_j)$ is an orthonormal set of vectors in $L^2(\mathbb{R}^{n})$, then the functions $W(\psi_j,\psi_k)$ are orthogonal as well. More precisely:

Proposition 64. Let $(\psi_j)_j$ and $(\phi_j)_j$ be orthonormal bases of $L^2(\mathbb{R}^{n})$. The family of functions $(\Phi_{j,k})_{j,k}$ with $\Phi_{j,k} = (2\pi\hbar)^{n/2}\,W(\psi_j,\phi_k)$ is an orthonormal basis of $L^2(\mathbb{R}^{2n})$.

Proof. In view of the Moyal identity (3.12) for the cross-Wigner transform, we have
\[
(\Phi_{j,k}|\Phi_{j',k'})_{L^2(\mathbb{R}^{2n})} = (\psi_j|\psi_{j'})_{L^2}\,\overline{(\phi_k|\phi_{k'})_{L^2}},
\]
hence the $\Phi_{j,k}$ form an orthonormal system in $L^2(\mathbb{R}^{2n})$. To show that $(\Phi_{j,k})_{j,k}$ is a basis of $L^2(\mathbb{R}^{2n})$, we must show that the vectors $\Phi_{j,k}$ span $L^2(\mathbb{R}^{2n})$. For this, it is sufficient to show that, if $a \in L^2(\mathbb{R}^{2n})$ is orthogonal to each $\Phi_{j,k}$, then $a = 0$. Assume that $(a|\Phi_{j,k})_{L^2(\mathbb{R}^{2n})} = 0$ for all indices $j, k$. Since we have (formula (5.6))
\[
(a|\Phi_{j,k})_{L^2(\mathbb{R}^{2n})} = (2\pi\hbar)^{n/2}\,(a|W(\psi_j,\phi_k))_{L^2(\mathbb{R}^{2n})} = (2\pi\hbar)^{n/2}\,(\operatorname{Op}_W(a)\phi_k|\psi_j)_{L^2(\mathbb{R}^{n})},
\]
the condition $(a|\Phi_{j,k})_{L^2(\mathbb{R}^{2n})} = 0$ for all $j, k$ is equivalent to $(\operatorname{Op}_W(a)\phi_k|\psi_j)_{L^2(\mathbb{R}^{n})} = 0$ for all indices $j, k$, and this implies that $\operatorname{Op}_W(a) = 0$ since $(\psi_j)_j$ and $(\phi_k)_k$ are bases of $L^2(\mathbb{R}^{n})$. In view of the uniqueness of the Weyl symbol of an operator, this implies in turn that $a = 0$, as we set out to prove.

5.1.3 Pseudo-differential definition

The pseudo-differential definition of the Weyl transform is the one most commonly used in introductory texts; it is obtained by taking formula (5.2) literally:

Definition 65. Let $a \in \mathcal{S}(\mathbb{R}^{2n})$. The Weyl operator with symbol $a$ is defined by
\[
\operatorname{Op}_W(a)\psi(x) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}\times\mathbb{R}^{n}} e^{\frac{i}{\hbar}p\cdot(x-y)}\,a\big(\tfrac12(x+y),p\big)\,\psi(y)\,dp\,dy \tag{5.7}
\]
for $\psi \in \mathcal{S}(\mathbb{R}^{n})$.

This definition makes immediately visible that Weyl operators are Fourier integral operators of a particular type. Its main drawback is that it is not immediately obvious from formula (5.7) how to extend the definition to symbols that are not decreasing fast enough to make the integral absolutely convergent. Things become more convincing if one remarks that (5.7) is the integral form of the identity (5.4). In fact, using the explicit expression (2.9) of the Grossmann–Royer reflection operator, we get
\[
\begin{aligned}
\operatorname{Op}_W(a)\psi(x') &= \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z)\,\widehat{R}(z)\psi(x')\,dz \\
&= \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(x,p)\,e^{\frac{2i}{\hbar}p\cdot(x'-x)}\,\psi(2x-x')\,dp\,dx;
\end{aligned}
\]
setting $y = 2x - x'$, we get
\[
\operatorname{Op}_W(a)\psi(x') = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a\big(\tfrac12(x'+y),p\big)\,e^{\frac{i}{\hbar}p\cdot(x'-y)}\,\psi(y)\,dp\,dy,
\]
which is (5.7).

There are several ways to reinterpret the integral in (5.7) when it is not absolutely convergent; perhaps the simplest is to use a compactly supported cut-off function $\chi \in C_0^\infty(\mathbb{R}^{3n})$ such that $\chi(0,0,0) = 1$ and to define
\[
\operatorname{Op}_W(a)\psi(x) = \lim_{\varepsilon\to 0} I_\varepsilon(x) \tag{5.8}
\]
where
\[
I_\varepsilon(x) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} e^{\frac{i}{\hbar}p\cdot(x-y)}\,\chi(\varepsilon x,\varepsilon y,\varepsilon p)\,a\big(\tfrac12(x+y),p\big)\,\psi(y)\,dp\,dy.
\]
The integral $I_\varepsilon(x)$ is absolutely convergent provided that, for instance, $\chi a \in L^1(\mathbb{R}^{2n})$. Both definitions (5.7) and (5.8) coincide when $a \in \mathcal{S}(\mathbb{R}^{2n})$ and $\psi \in \mathcal{S}(\mathbb{R}^{n})$; in this case, the integrals in (5.7) and (5.8) (which we denote respectively $I(x)$ and $I_\varepsilon(x)$) are absolutely convergent, and we have
\[
|I(x) - I_\varepsilon(x)| \le \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} |1 - \chi(\varepsilon x,\varepsilon y,\varepsilon p)|\,\big|a\big(\tfrac12(x+y),p\big)\big|\,|\psi(y)|\,dp\,dy,
\]
hence $\lim_{\varepsilon\to 0}|I(x) - I_\varepsilon(x)| = 0$ by Lebesgue's dominated convergence theorem. In particular, the choice of the cut-off function $\chi$ is irrelevant.
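5.2 Main properties

5.2.1 The distributional kernel of a Weyl operator

Let $\widehat{A} : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ be an operator. If there exists a distribution $K \in \mathcal{S}'(\mathbb{R}^{n}\times\mathbb{R}^{n})$ such that
\[
\langle \widehat{A}\psi,\phi\rangle = \langle K, \phi\otimes\psi\rangle
\]
for all $\phi, \psi \in \mathcal{S}(\mathbb{R}^{n})$, then $K$ is called the (distributional, or Schwartz) kernel of $\widehat{A}$. We will also write informally
\[
\widehat{A}\psi(x) = \int_{\mathbb{R}^{n}} K(x,y)\,\psi(y)\,dy
\]
where the integral is to be understood in the distributional sense. It turns out that, in fact, every continuous operator $\widehat{A} : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ has a kernel $K \in \mathcal{S}'(\mathbb{R}^{n}\times\mathbb{R}^{n})$ (this is Schwartz's kernel theorem). We are going to see that every continuous operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ is a Weyl operator $\operatorname{Op}_W(a)$ for some symbol $a \in \mathcal{S}'(\mathbb{R}^{2n})$.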
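Proposition 66. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$. (i) The kernel of $\operatorname{Op}_W(a)$ is given by
\[
K(x,y) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{\frac{i}{\hbar}p\cdot(x-y)}\,a\big(\tfrac12(x+y),p\big)\,dp \tag{5.9}
\]
interpreted as a Fourier transform in the variable $p$. (ii) If conversely $\widehat{A}$ is a continuous operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ with kernel $K$, then $\widehat{A} = \operatorname{Op}_W(a)$ with
\[
a(x,p) = \int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,K\big(x+\tfrac12 y, x-\tfrac12 y\big)\,dy \tag{5.10}
\]
interpreted as the inverse Fourier transform in $y$.

Proof. The result is trivial using definition (5.7) of a Weyl operator and the relationship
\[
K\big(x+\tfrac12 y, x-\tfrac12 y\big) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{\frac{i}{\hbar}p\cdot y}\,a(x,p)\,dp. \tag{5.11}
\]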
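The round trip between (5.9) and (5.10) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes n = 1 and ℏ = 1, picks the Gaussian symbol a(x, p) = exp(−(x² + p²)) (not a symbol used in the text), and replaces the integrals by truncated trapezoid sums. The names `symbol`, `kernel`, `symbol_from_kernel` and `trapezoid` are hypothetical helpers.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1, dimension n = 1


def symbol(x, p):
    """Illustrative Schwartz symbol a(x, p) = exp(-(x^2 + p^2))."""
    return math.exp(-(x * x + p * p))


def trapezoid(f, lo, hi, steps):
    """Composite trapezoid rule for a complex-valued integrand."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for j in range(1, steps):
        total += f(lo + j * h)
    return total * h


def kernel(x, y):
    """Formula (5.9): K(x, y) = (2 pi hbar)^-1 int e^{i p (x - y)/hbar} a((x + y)/2, p) dp."""
    mid = 0.5 * (x + y)
    integrand = lambda p: cmath.exp(1j * p * (x - y) / HBAR) * symbol(mid, p)
    return trapezoid(integrand, -8.0, 8.0, 320) / (2.0 * math.pi * HBAR)


def symbol_from_kernel(x, p):
    """Formula (5.10): a(x, p) = int e^{-i p y/hbar} K(x + y/2, x - y/2) dy."""
    integrand = lambda y: cmath.exp(-1j * p * y / HBAR) * kernel(x + 0.5 * y, x - 0.5 * y)
    return trapezoid(integrand, -12.0, 12.0, 240)


# Recover the symbol at one phase-space point from its kernel.
recovered = symbol_from_kernel(0.3, -0.7)
error = abs(recovered - symbol(0.3, -0.7))
```

For a Gaussian integrand the truncated trapezoid rule is very accurate, so `error` should be far below the size of the symbol itself; for symbols that decay slowly the truncation bounds chosen here would have to be revisited.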
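5.2.2 The adjoint and the transpose of a Weyl operator

Assume that $\widehat{A}$ is an operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$. The transposed operator $\widehat{A}^{T}$ is defined by the formula
\[
\langle \widehat{A}\psi,\phi\rangle = \langle \psi, \widehat{A}^{T}\phi\rangle \tag{5.12}
\]
for $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$, and the adjoint $\widehat{A}^{*}$ by
\[
(\widehat{A}^{*}\psi|\phi)_{L^2} = (\psi|\widehat{A}\phi)_{L^2}.
\]
Proposition 67. (i) The transpose $\widehat{A}^{T}$ of $\widehat{A} = \operatorname{Op}_W(a)$ is the Weyl operator $\widehat{A}^{T} = \operatorname{Op}_W(b)$ where the symbol $b$ is obtained from $a$ by the partial reflection $p \mapsto -p$, that is,
\[
b(z) = a(\overline{z}) \quad\text{with } \overline{z} = (x,-p) \text{ for } z = (x,p). \tag{5.13}
\]
(ii) The adjoint $\widehat{A}^{*}$ is obtained by complex conjugation of the Weyl symbol:
\[
\widehat{A}^{*} = \operatorname{Op}_W(\overline{a}). \tag{5.14}
\]
Proof. (i) By formula (5.4), we have, taking the transposition formula (2.10) for the reflection operator into account,
\[
\begin{aligned}
\langle \widehat{A}\psi,\phi\rangle &= \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z)\,\langle \widehat{R}(z)\psi,\phi\rangle\,dz \\
&= \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z)\,\langle \psi, \widehat{R}(\overline{z})\phi\rangle\,dz \\
&= \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(\overline{z})\,\langle \widehat{R}(z)\phi,\psi\rangle\,dz,
\end{aligned}
\]
where the last step uses the change of variables $z \mapsto \overline{z}$ and $d\overline{z} = dz$. Hence
\[
\widehat{A}^{T}\phi = \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(\overline{z})\,\widehat{R}(z)\phi\,dz
\]
which proves (5.13). (ii) Formula (5.14) is proven by a similar argument, replacing the distributional brackets with the scalar product on $L^2(\mathbb{R}^{n})$.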
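5.3 Shubin's symbol classes

For $z \in \mathbb{R}^{2n}$, we set $\langle z\rangle = (1+|z|^2)^{1/2}$.

Definition 68. Let $m \in \mathbb{R}$ and $\rho \in\, ]0,1]$. The Shubin symbol class $\Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ consists of all complex functions $a \in C^\infty(\mathbb{R}^{2n})$ such that for every $\alpha \in \mathbb{N}^{2n}$ there exists a constant $C_\alpha \ge 0$ with
\[
|\partial_z^{\alpha}a(z)| \le C_\alpha\,\langle z\rangle^{m-\rho|\alpha|} \quad\text{for } z \in \mathbb{R}^{2n}. \tag{5.15}
\]
We denote by $G^{m}_{\rho}(\mathbb{R}^{n})$ the vector space of all Weyl operators $\operatorname{Op}_W(a)$ with $a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n})$.

Obviously $\Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ is a complex vector space for the usual operations of addition and multiplication by complex numbers. Using the generalized Leibniz rule for the derivatives of a product of functions, one easily checks that:
\[
a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n}) \text{ and } b \in \Gamma^{m'}_{\rho}(\mathbb{R}^{2n}) \;\Longrightarrow\; ab \in \Gamma^{m+m'}_{\rho}(\mathbb{R}^{2n});
\]
\[
a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n}) \text{ and } \alpha \in \mathbb{N}^{2n} \;\Longrightarrow\; \partial_z^{\alpha}a \in \Gamma^{m-\rho|\alpha|}_{\rho}(\mathbb{R}^{2n}).
\]
One also easily verifies that
\[
\Gamma^{-\infty}_{\rho}(\mathbb{R}^{2n}) = \bigcap_{m\in\mathbb{R}} \Gamma^{m}_{\rho}(\mathbb{R}^{2n}) = \mathcal{S}(\mathbb{R}^{2n}).
\]
The symbol class $\Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ is preserved by linear changes of variables: if $a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ and $M$ is a linear automorphism of $\mathbb{R}^{2n}$, then $a \circ M$ is also in $\Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ (the elementary proof is left to the reader). The elements of $G^{m}_{\rho}(\mathbb{R}^{n})$ have many interesting regularity properties; for instance: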
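Proposition 69. Every Weyl operator $\operatorname{Op}_W(a) \in G^{m}_{\rho}(\mathbb{R}^{n})$ is a continuous operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n})$ extending into a continuous operator $\mathcal{S}'(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$.

Proof. That $\widehat{A} = \operatorname{Op}_W(a)$ extends into a continuous operator $\mathcal{S}'(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ follows from the first statement by duality since the transpose $\widehat{A}^{T}$ is also in $G^{m}_{\rho}(\mathbb{R}^{n})$. In fact, the condition $a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ implies that $b \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n})$ where $b(z) = a(\overline{z})$ (Proposition 67), hence $\widehat{A}^{T} : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n})$. One can then define the extension of $\widehat{A}$ (also denoted $\widehat{A}$) by the formula $\langle \widehat{A}\psi,\phi\rangle = \langle \psi, \widehat{A}^{T}\phi\rangle$ for $\psi \in \mathcal{S}'(\mathbb{R}^{n})$. The continuity $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n})$ is easily verified using the estimates on the semi-norms together with the polynomial boundedness of the derivatives of the symbol $a \in \Gamma^{m}_{\rho}(\mathbb{R}^{2n})$.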
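As a small illustration of the estimate (5.15), the following sketch checks the case |α| = 1 numerically for the model symbol a(z) = ⟨z⟩^m on ℝ², which belongs to the class with ρ = 1. Everything here is an assumption made for the example: the choice of symbol, the finite grid, and the finite-difference step; a grid check can only suggest, never prove, a global bound.

```python
import math


def bracket(x, p):
    """Japanese bracket <z> = (1 + |z|^2)^(1/2) for z = (x, p) in R^2."""
    return math.sqrt(1.0 + x * x + p * p)


def symbol(x, p, m):
    """Model symbol a(z) = <z>^m (an illustrative choice, not taken from the text)."""
    return bracket(x, p) ** m


def dx_symbol(x, p, m, h=1e-4):
    """Central finite-difference approximation of the x-derivative of a."""
    return (symbol(x + h, p, m) - symbol(x - h, p, m)) / (2.0 * h)


def worst_ratio(m, rho=1.0):
    """Largest value of |d_x a(z)| / <z>^(m - rho) over a coarse grid in R^2."""
    worst = 0.0
    for x in range(-50, 51, 5):
        for p in range(-50, 51, 5):
            ratio = abs(dx_symbol(x, p, m)) / bracket(x, p) ** (m - rho)
            worst = max(worst, ratio)
    return worst


# For a = <z>^3 one has d_x a = 3 x <z>, so the ratio is 3|x|/<z> and stays below 3.
ratio_m3 = worst_ratio(3.0)
```

The computed ratio stays below the constant C = |m|, in line with (5.15); repeating the experiment with a smaller exponent in the denominator (a smaller class) would show the ratio growing with |z|.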
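5.4 Composition formulas

We now assume that the Weyl operators
\[
\widehat{A} = \operatorname{Op}_W(a) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a_\sigma(z)\,\widehat{D}(z)\,dz, \qquad
\widehat{B} = \operatorname{Op}_W(b) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} b_\sigma(z)\,\widehat{D}(z)\,dz
\]
can be composed. The product $\widehat{C} = \widehat{A}\widehat{B}$ is, for instance, always defined when
\[
\operatorname{Op}_W(a) : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n}), \qquad \operatorname{Op}_W(b) : \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n})
\]
or when
\[
\operatorname{Op}_W(a) : L^2(\mathbb{R}^{n}) \to L^2(\mathbb{R}^{n}), \qquad \operatorname{Op}_W(b) : L^2(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{n}).
\]
Writing the product in Weyl form
\[
\widehat{C} = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} c_\sigma(z)\,\widehat{D}(z)\,dz,
\]
we set out to determine the symbol $c$ (or, equivalently, its symplectic Fourier transform $c_\sigma$).

Proposition 70. Assume the product $\widehat{C} = \widehat{A}\widehat{B}$ is well-defined and write $\widehat{C} = \operatorname{Op}_W(c)$. (i) The symplectic Fourier transform of $c$ is given by
\[
c_\sigma(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z,z')}\,a_\sigma(z-z')\,b_\sigma(z')\,dz' \tag{5.16}
\]
and (ii) the symbol $c$ is given by
\[
c(z) = \Big(\frac{1}{4\pi\hbar}\Big)^{2n}\int_{\mathbb{R}^{4n}} e^{\frac{i}{2\hbar}\sigma(z',z'')}\,a\big(z+\tfrac12 z'\big)\,b\big(z-\tfrac12 z''\big)\,dz'\,dz''. \tag{5.17}
\]
Proof. (i) Writing the operators $\widehat{A}$ and $\widehat{B}$ in the usual form
\[
\widehat{A} = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a_\sigma(z_0)\,\widehat{D}(z_0)\,dz_0, \qquad
\widehat{B} = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} b_\sigma(z_1)\,\widehat{D}(z_1)\,dz_1,
\]
we have, using the property (2.6) of displacement operators,
\[
\begin{aligned}
\widehat{D}(z_0)\widehat{B} &= \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} b_\sigma(z_1)\,\widehat{D}(z_0)\widehat{D}(z_1)\,dz_1 \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z_0,z_1)}\,b_\sigma(z_1)\,\widehat{D}(z_0+z_1)\,dz_1,
\end{aligned}
\]
and hence
\[
\widehat{A}\widehat{B} = \Big(\frac{1}{2\pi\hbar}\Big)^{2n}\int_{\mathbb{R}^{4n}} e^{\frac{i}{2\hbar}\sigma(z_0,z_1)}\,a_\sigma(z_0)\,b_\sigma(z_1)\,\widehat{D}(z_0+z_1)\,dz_0\,dz_1.
\]
Setting $z = z_0 + z_1$ and $z' = z_1$ (so that $\sigma(z_0,z_1) = \sigma(z-z',z') = \sigma(z,z')$), this can be written
\[
\widehat{A}\widehat{B} = \Big(\frac{1}{2\pi\hbar}\Big)^{2n}\int_{\mathbb{R}^{2n}}\Big(\int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z,z')}\,a_\sigma(z-z')\,b_\sigma(z')\,dz'\Big)\widehat{D}(z)\,dz,
\]
hence (5.16). (ii) Formula (5.17) is proven in a similar way, writing
\[
\widehat{A} = \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a(z_0)\,\widehat{R}(z_0)\,dz_0, \qquad
\widehat{B} = \Big(\frac{1}{\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} b(z_1)\,\widehat{R}(z_1)\,dz_1
\]
and using the properties of the reflection operators $\widehat{R}(z)$.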
5.5 Comments and references

There is a multitude of books and treatises on the Weyl transform, so it is impossible to cite them all. Here are a few of our personal preferences: Folland [28], whose approach is particularly well-suited to an audience with a serious background in time-frequency analysis; K. Gröchenig's book [40] is also designed for a TFA public, but is very complete, and nevertheless easy to read. Historically, the first mathematically rigorous treatment was given by Hörmander in [45], who developed the ideas contained in his earlier paper [46]. The Shubin symbol classes were introduced by Shubin in [66]. Their advantage over the standard Hörmander symbol classes is that they are global, in the sense that the coordinates x and p are placed on equal levels, while the Hörmander classes privilege the growth properties of the symbols in the p coordinates. This difference of treatment is due to the specificity of the problems addressed by these authors. While Hörmander had in mind the study of partial differential equations
\[
\sum_{|\alpha|\le m} a_\alpha(x)\,D_x^{\alpha}u(x) = f(x),
\]
Shubin wanted to develop a tool more suitable to quantum operators, which usually arise from Hamiltonian functions where x and p are placed on an equal footing. For a proof of Schwartz's kernel theorem, see for instance Gröchenig [40].
6 The Cohen class

6.1 Definition of the Cohen class

Definition 71. Let $Q : \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{2n})$ be a sesquilinear form. We say that $Q$ belongs to the Cohen class if there exists a tempered distribution $\theta \in \mathcal{S}'(\mathbb{R}^{2n})$ (the "Cohen kernel") such that
\[
Q(\psi,\phi) = W(\psi,\phi) * \theta \tag{6.1}
\]
for all $\psi, \phi \in \mathcal{S}(\mathbb{R}^{n})$. When $\psi = \phi$, one writes $Q\psi = Q(\psi,\psi)$.

Observe that it follows from formula (3.23) that the definition (6.1) is equivalent to
\[
F_\sigma Q(\psi,\phi) = (2\pi\hbar)^{n}\,A(\psi,\phi)\,F_\sigma\theta \tag{6.2}
\]
where $F_\sigma$ is the symplectic Fourier transform and $A(\psi,\phi)$ the cross-ambiguity function. Choosing $\theta = \delta$, we have $Q(\psi,\phi) = W(\psi,\phi)$, hence the cross-Wigner transform trivially belongs to the Cohen class. Choosing
\[
\theta(z) = (\pi\hbar)^{-n}\,e^{-\frac{1}{\hbar}|z|^{2}},
\]
we get the so-called Husimi distribution.

The following result gives a sufficient condition for a sesquilinear form $Q$ to belong to Cohen's class when it takes its values in $\mathcal{S}(\mathbb{R}^{2n})$.

Proposition 72. Let $Q : \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}(\mathbb{R}^{2n})$ be a sesquilinear form such that
\[
Q(\widehat{D}(z_0)\psi)(z) = Q\psi(z-z_0) \tag{6.3}
\]
\[
|Q(\psi,\phi)(0,0)| \le \|\psi\|\,\|\phi\| \tag{6.4}
\]
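for all $\psi, \phi$ in $\mathcal{S}(\mathbb{R}^{n})$; then $Q$ belongs to the Cohen class.

Proof. The condition (6.4) means that the sesquilinear form $Q$ is bounded. It follows, using Riesz's representation theorem, that there exists a bounded operator $\widehat{A}$ on $L^2(\mathbb{R}^{n})$ such that
\[
Q(\psi,\phi)(0,0) = (\widehat{A}\psi|\phi)_{L^2}.
\]
Using the translation property (6.3), we then have
\[
Q\psi(z_0) = Q(\widehat{D}(-z_0)\psi)(0) = (\widehat{A}\widehat{D}(-z_0)\psi\,|\,\widehat{D}(-z_0)\psi)_{L^2}.
\]
In view of Schwartz's kernel theorem, there exists a distribution $K \in \mathcal{S}'(\mathbb{R}^{n}\times\mathbb{R}^{n})$ such that
\[
(\phi|\widehat{A}\psi)_{L^2} = \langle K, \psi\otimes\overline{\phi}\rangle
\]
for all $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$.

https://doi.org/10.1515/9783110722772-006

We thus have
\[
Q\psi(z_0) = \langle K, \widehat{D}(-z_0)\psi\otimes\overline{\widehat{D}(-z_0)\psi}\rangle,
\]
which we write formally in integral form as
\[
Q\psi(z_0) = \int_{\mathbb{R}^{2n}} K(x,y)\,\widehat{D}(-z_0)\psi(x)\,\overline{\widehat{D}(-z_0)\psi(y)}\,dx\,dy.
\]
By the definition of $\widehat{D}(z_0)$, we have
\[
\widehat{D}(-z_0)\psi(x) = e^{\frac{i}{\hbar}(-p_0\cdot x-\frac12 p_0\cdot x_0)}\,\psi(x+x_0),
\]
and hence
\[
Q\psi(z_0) = \int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}p_0\cdot(x-y)}\,K(x,y)\,\psi(x+x_0)\,\overline{\psi(y+x_0)}\,dx\,dy. \tag{6.5}
\]
On the other hand, for every $\theta \in \mathcal{S}(\mathbb{R}^{2n})$, we have
\[
(W\psi*\theta)(z_0) = \int_{\mathbb{R}^{2n}} W\psi(z_0-z')\,\theta(z')\,dz',
\]
hence, in view of the definition of the Wigner transform,
\[
(W\psi*\theta)(z_0) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{3n}} e^{-\frac{i}{\hbar}(p_0-p')\cdot y'}\,\psi\big(x_0-x'+\tfrac12 y'\big)\,\overline{\psi\big(x_0-x'-\tfrac12 y'\big)}\,\theta(x',p')\,dp'\,dx'\,dy'.
\]
Calculating the integral in $p'$, we get
\[
(W\psi*\theta)(z_0) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\int_{\mathbb{R}^{2n}} F_2^{-1}\theta(x',y')\,e^{-\frac{i}{\hbar}p_0\cdot y'}\,\psi\big(x_0-x'+\tfrac12 y'\big)\,\overline{\psi\big(x_0-x'-\tfrac12 y'\big)}\,dx'\,dy'
\]
where $F_2^{-1}$ is the inverse Fourier transform in $y'$. Making the change of variables $x' = -\tfrac12(x+y)$ and $y' = x-y$, we have $dx'\,dy' = dx\,dy$, so the equality above becomes
\[
(W\psi*\theta)(z_0) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\int_{\mathbb{R}^{2n}} F_2^{-1}\theta(x,x-y)\,e^{-\frac{i}{\hbar}p_0\cdot(x-y)}\,\psi(x+x_0)\,\overline{\psi(y+x_0)}\,dx\,dy.
\]
Comparison with (6.5) shows that $Q\psi = W\psi*\theta$ where $\theta$ is defined by the partial inverse Fourier transform
\[
K(x,y) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}F_2^{-1}\theta(x,x-y),
\]
and hence the functions
\[
\theta(x,p) = (2\pi\hbar)^{n/2}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,K(x,x-y)\,dy
\]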
and Fθ(0, p) are not constant (they are Gaussians).
6.2 A Moyal identity

We are going to show that a Moyal-type identity is satisfied by $Q$ provided that the Fourier transform of the Cohen kernel has a constant modulus.

Proposition 73. Let $Q$ be an element of the Cohen class. We have the Moyal identity
\[
(Q(\psi,\phi)|Q(\psi',\phi'))_{L^2(\mathbb{R}^{2n})} = \Big(\frac{1}{2\pi\hbar}\Big)^{n}(\psi|\psi')_{L^2}\,\overline{(\phi|\phi')_{L^2}} \tag{6.6}
\]
if and only if the Fourier transform $F\theta$ satisfies
\[
F\theta(z) = (2\pi\hbar)^{-n}\,e^{i\varphi(z)} \tag{6.7}
\]
for some real function $\varphi$.

Proof. In view of Moyal's identity for the cross-Wigner transform, it is equivalent to prove that
\[
(Q(\psi,\phi)|Q(\psi',\phi'))_{L^2(\mathbb{R}^{2n})} = (W(\psi,\phi)|W(\psi',\phi'))_{L^2(\mathbb{R}^{2n})} \tag{6.8}
\]
for all $\psi, \phi, \psi', \phi'$ in $\mathcal{S}(\mathbb{R}^{n})$ if and only if $|F\theta(z)| = (2\pi\hbar)^{-n}$. Applying Plancherel's formula to both sides of (6.8), this equality is in fact equivalent to
\[
(FQ(\psi,\phi)|FQ(\psi',\phi'))_{L^2(\mathbb{R}^{2n})} = (FW(\psi,\phi)|FW(\psi',\phi'))_{L^2(\mathbb{R}^{2n})}. \tag{6.9}
\]
Since $Q(\psi,\phi) = W(\psi,\phi)*\theta$, we have $FQ(\psi,\phi) = (2\pi\hbar)^{n}\,FW(\psi,\phi)\,F\theta$, hence (6.9) is equivalent to
\[
(2\pi\hbar)^{2n}\,(FW(\psi,\phi)F\theta\,|\,FW(\psi',\phi')F\theta)_{L^2(\mathbb{R}^{2n})} = (FW(\psi,\phi)|FW(\psi',\phi'))_{L^2(\mathbb{R}^{2n})},
\]
and this equality can hold if and only if $|F\theta(z)|^{2} = (2\pi\hbar)^{-2n}$ for all $z$.

Since condition (6.7) implies the conditions (6.11), any element of the Cohen class that satisfies Moyal's identity also satisfies the marginal conditions (6.12).
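6.3 The marginal properties

Let us discuss the existence of marginal properties similar to those (3.29) satisfied by the Wigner transform. We will say that an element $Q$ of the Cohen class satisfies the marginal conditions if we have, for $\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$ and $F\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$,
\[
\int_{\mathbb{R}^{n}} Q\psi(z)\,dp = |\psi(x)|^{2} \quad\text{and}\quad \int_{\mathbb{R}^{n}} Q\psi(z)\,dx = |\widehat{\psi}(p)|^{2}. \tag{6.10}
\]
Note that when these conditions are satisfied, we have
\[
\int_{\mathbb{R}^{2n}} Q\psi(z)\,dz = 1
\]
when $\|\psi\|_{L^2} = 1$. Here is a necessary and sufficient condition:

Proposition 74. Let $\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$ and $F\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$. Then $Q\psi = W\psi*\theta$ satisfies the marginal properties
\[
\int_{\mathbb{R}^{n}} Q\psi(z)\,dp = |\psi(x)|^{2}, \qquad \int_{\mathbb{R}^{n}} Q\psi(z)\,dx = |F\psi(p)|^{2}
\]
if and only if the Fourier transform $F\theta$ exists and is such that
\[
F\theta(x,0) = F\theta(0,p) = (2\pi\hbar)^{-n}. \tag{6.11}
\]
Proof. (i) Let us prove the more general intermediary result: if the partial mappings $x \mapsto \theta(x,p)$ and $p \mapsto \theta(x,p)$ are integrable, then
\[
\int_{\mathbb{R}^{n}} Q\psi(z)\,dp = (|\psi|^{2}*\alpha)(x), \qquad \int_{\mathbb{R}^{n}} Q\psi(z)\,dx = (|F\psi|^{2}*\beta)(p) \tag{6.12}
\]
where the functions $\alpha$ and $\beta$ are defined by
\[
\alpha(x) = \int_{\mathbb{R}^{n}} \theta(x,p)\,dp, \qquad \beta(p) = \int_{\mathbb{R}^{n}} \theta(x,p)\,dx.
\]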
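In view of the first marginal property (3.30) satisfied by the Wigner distribution, we have
\[
\int_{\mathbb{R}^{n}} W\psi(x-x',p-p')\,dp = |\psi(x-x')|^{2},
\]
and hence, using Fubini's theorem,
\[
\begin{aligned}
\int_{\mathbb{R}^{n}} Q\psi(z)\,dp &= \int_{\mathbb{R}^{n}}\Big(\int_{\mathbb{R}^{2n}} W\psi(z-z')\,\theta(z')\,dz'\Big)dp \\
&= \int_{\mathbb{R}^{2n}}\Big(\int_{\mathbb{R}^{n}} W\psi(z-z')\,dp\Big)\theta(z')\,dz' \\
&= \int_{\mathbb{R}^{n}} |\psi(x-x')|^{2}\Big(\int_{\mathbb{R}^{n}} \theta(z')\,dp'\Big)dx'
\end{aligned}
\]
which yields the first formula (6.12). The second formula is proven in a similar way using the second marginal property (3.31) of the Wigner transform.

(ii) It suffices to show that the conditions (6.11) are equivalent to $\alpha(x) = \delta(x)$ and $\beta(p) = \delta(p)$. Let $F\theta$ be the usual Fourier transform of the kernel $\theta$; we have
\[
\begin{aligned}
F\theta(x,0) &= \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}x\cdot x'}\,\theta(x',p')\,dp'\,dx' \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}x\cdot x'}\Big(\int_{\mathbb{R}^{n}} \theta(x',p')\,dp'\Big)dx' \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}x\cdot x'}\,\alpha(x')\,dx',
\end{aligned}
\]
hence the condition $F\theta(x,0) = (2\pi\hbar)^{-n}$ is equivalent to $\alpha(x) = \delta(x)$. Similarly, the condition $F\theta(0,p) = (2\pi\hbar)^{-n}$ is equivalent to $\beta(p) = \delta(p)$.

Remark 75. In particular, if $Q$ satisfies a Moyal identity, then it also satisfies the marginal conditions.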
6.4 The τ-Wigner transform

Here we study a very particular member of the Cohen class. It will be instrumental in our study of Born–Jordan quantization later in this book.

6.4.1 Definition

In what follows, $\tau$ is a real parameter.

Definition 76. The element $W_\tau$ of the Cohen class corresponding to the Cohen kernel
\[
\theta_\tau(z) =
\begin{cases}
\Big(\dfrac{1}{2\pi\hbar}\Big)^{n}\dfrac{2^{n}}{|2\tau-1|^{n}}\,e^{\frac{2i}{\hbar(2\tau-1)}p\cdot x} & \text{for } \tau \ne \tfrac12 \\[2mm]
\delta(z) & \text{for } \tau = \tfrac12
\end{cases}
\tag{6.13}
\]
is called the τ-Wigner transform. We call $\theta_\tau$ the Shubin τ-kernel.

Of course $W_{1/2}(\psi,\phi)$ is the usual Wigner function $W(\psi,\phi)$. Let us give an explicit description of $W_\tau(\psi,\phi)$ for $\tau \ne \tfrac12$.

Proposition 77. Let $(\psi,\phi) \in L^2(\mathbb{R}^{n}) \times L^2(\mathbb{R}^{n})$. We have, for $\tau \ne \tfrac12$,
\[
W_\tau(\psi,\phi)(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,\psi(x+\tau y)\,\overline{\phi(x-(1-\tau)y)}\,dy. \tag{6.14}
\]
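Proof. It suffices to give the proof for $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$; the expression (6.14) extends to the general case as for the usual cross-Wigner transform. By definition of the convolution product, $W(\psi,\phi)*\theta_\tau$ is given by the absolutely convergent integral
\[
W(\psi,\phi)*\theta_\tau(z) = \int_{\mathbb{R}^{2n}} W(\psi,\phi)(z-z')\,\theta_\tau(z')\,dz'.
\]
Expanding the integrand using formula (3.4) for the cross-Wigner transform, we get
\[
W(\psi,\phi)(z-z')\,\theta_\tau(z') = \Big(\frac{1}{2\pi\hbar}\Big)^{2n}\frac{2^{n}}{|2\tau-1|^{n}}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,e^{\frac{i}{\hbar}p'\cdot(y+\frac{2x'}{2\tau-1})}\,\psi\big(x-x'+\tfrac12 y\big)\,\overline{\phi\big(x-x'-\tfrac12 y\big)}\,dy.
\]
Using the identity
\[
\int_{\mathbb{R}^{n}} e^{\frac{i}{\hbar}p'\cdot(y+\frac{2x'}{2\tau-1})}\,dp' = (2\pi\hbar)^{n}\,\delta\Big(y+\frac{2x'}{2\tau-1}\Big)
\]
and integrating with respect to $p'$ yields
\[
\int_{\mathbb{R}^{n}} W(\psi,\phi)(z-z')\,\theta_\tau(z')\,dp' = \frac{2^{n}}{|2\tau-1|^{n}}\Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,\delta\Big(y+\frac{2x'}{2\tau-1}\Big)\,\psi\big(x-x'+\tfrac12 y\big)\,\overline{\phi\big(x-x'-\tfrac12 y\big)}\,dy,
\]
that is,
\[
\int_{\mathbb{R}^{n}} W(\psi,\phi)(z-z')\,\theta_\tau(z')\,dp' = \frac{2^{n}}{|2\tau-1|^{n}}\Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,\delta\Big(y+\frac{2x'}{2\tau-1}\Big)\,\psi(x+\tau y)\,\overline{\phi(x-(1-\tau)y)}\,dy
\]
where we have used the equality
\[
\delta\Big(y+\frac{2x'}{2\tau-1}\Big)\,\psi\big(x-x'+\tfrac12 y\big)\,\overline{\phi\big(x-x'-\tfrac12 y\big)} = \delta\Big(y+\frac{2x'}{2\tau-1}\Big)\,\psi(x+\tau y)\,\overline{\phi(x-(1-\tau)y)}.
\]
Integrating now in $x'$ and taking the identity
\[
\int_{\mathbb{R}^{n}} \delta\Big(y+\frac{2x'}{2\tau-1}\Big)\,dx' = \Big(\frac{|2\tau-1|}{2}\Big)^{n}
\]
into account, we finally get
\[
W(\psi,\phi)*\theta_\tau(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p\cdot y}\,\psi(x+\tau y)\,\overline{\phi(x-(1-\tau)y)}\,dy
\]
which is (6.14).

Notice that, putting $\tau = \tfrac12$ on the right-hand side of (6.14), one recovers the standard cross-Wigner transform $W(\psi,\phi)$. This can be seen as a limiting case of the definition since $\theta_\tau \to \delta$ when $\tau \to \tfrac12$. If $\tau = 0$, we get
\[
W_0(\psi,\phi)(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}e^{-\frac{i}{\hbar}p\cdot x}\,\psi(x)\,\overline{F\phi(p)}.
\]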
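$W_0$ is called the Rihaczek–Kirkwood distribution in time-frequency analysis and signal theory. For $\tau = 1$, one gets the so-called dual Rihaczek–Kirkwood distribution. It is easily verified using (6.14) that we have the complex conjugation formula
\[
\overline{W_\tau(\phi,\psi)} = W_{1-\tau}(\psi,\phi).
\]

6.4.2 Properties

Proposition 78. (i) The Fourier transform of the Shubin τ-kernel is given for $\tau \ne \tfrac12$ by
\[
F\theta_\tau(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}e^{-\frac{i(2\tau-1)}{2\hbar}p\cdot x}. \tag{6.15}
\]
(ii) The cross τ-Wigner transform satisfies the Moyal identity
\[
(W_\tau(\psi,\phi)|W_\tau(\psi',\phi'))_{L^2(\mathbb{R}^{2n})} = \Big(\frac{1}{2\pi\hbar}\Big)^{n}(\psi|\psi')_{L^2}\,\overline{(\phi|\phi')_{L^2}} \tag{6.16}
\]
for all $(\psi,\phi) \in L^2(\mathbb{R}^{n}) \times L^2(\mathbb{R}^{n})$. (iii) The τ-Wigner transform satisfies the marginal conditions
\[
\int_{\mathbb{R}^{n}} W_\tau\psi(z)\,dp = |\psi(x)|^{2}, \qquad \int_{\mathbb{R}^{n}} W_\tau\psi(z)\,dx = |F\psi(p)|^{2} \tag{6.17}
\]
for all $\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$ such that $F\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$.

Proof. Let us prove formula (6.15); it implies that $|F\theta_\tau(z)| = (2\pi\hbar)^{-n}$, and properties (ii) and (iii) will then follow in view of Proposition 73 and Remark 75 following the proof of Proposition 74. The Fourier transform of $\theta_\tau$ is given by
\[
F\theta_\tau(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{2n}\frac{2^{n}}{|2\tau-1|^{n}}\int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}z\cdot z'}\,e^{\frac{2i}{\hbar(2\tau-1)}p'\cdot x'}\,dp'\,dx'.
\]
We have, since $z\cdot z' = x\cdot x' + p\cdot p'$,
\[
\begin{aligned}
\int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}z\cdot z'}\,e^{\frac{2i}{\hbar(2\tau-1)}p'\cdot x'}\,dz' &= \int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}x\cdot x'}\Big[\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}p'\cdot(p-\frac{2}{2\tau-1}x')}\,dp'\Big]dx' \\
&= (2\pi\hbar)^{n}\int_{\mathbb{R}^{n}} e^{-\frac{i}{\hbar}x\cdot x'}\,\delta\Big(p-\frac{2}{2\tau-1}x'\Big)\,dx' \\
&= \frac{|2\tau-1|^{n}}{2^{n}}\,(2\pi\hbar)^{n}\,e^{-\frac{i(2\tau-1)}{2\hbar}p\cdot x},
\end{aligned}
\]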
hence formula (6.15).

It is remarkable that Definition 71 of the Cohen class can be reformulated using the τ-Wigner transform for any fixed value of $\tau$:

Proposition 79. Let $\tau \in [0,1]$. Then every $Q(\psi,\phi) = W(\psi,\phi)*\theta$ in the Cohen class can be written
\[
Q(\psi,\phi) = W_\tau(\psi,\phi)*f_\tau \tag{6.18}
\]
for a suitable $f_\tau \in \mathcal{S}'(\mathbb{R}^{2n})$.

Proof. In view of formula (6.15), we have $F\theta_\tau\,F\theta_{1-\tau} = (2\pi\hbar)^{-2n}$, hence
\[
F(\theta_\tau*\theta_{1-\tau}) = (2\pi\hbar)^{n}\,F\theta_\tau\,F\theta_{1-\tau} = (2\pi\hbar)^{-n}
\]
so that, by the Fourier inversion formula, $\theta_\tau*\theta_{1-\tau} = \delta$. It follows that $Q(\psi,\phi) = W(\psi,\phi)*\theta$ can be written
\[
Q(\psi,\phi) = (W(\psi,\phi)*\theta)*(\theta_\tau*\theta_{1-\tau}) = (W(\psi,\phi)*\theta_\tau)*(\theta*\theta_{1-\tau})
\]
whence (6.18) with $f_\tau = \theta*\theta_{1-\tau}$. For this construction to make sense, we must however prove that $f_\tau$ exists and that $f_\tau \in \mathcal{S}'(\mathbb{R}^{2n})$, which is not a priori clear. Since
\[
\theta*\theta_{1-\tau} = (2\pi\hbar)^{n}\,F^{-1}(F\theta\,F\theta_{1-\tau}),
\]
it is sufficient to show that $F\theta\,F\theta_{1-\tau} \in \mathcal{S}'(\mathbb{R}^{2n})$. Now $F\theta \in \mathcal{S}'(\mathbb{R}^{2n})$ and, in view of (6.15),
\[
F\theta_{1-\tau}(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}e^{\frac{i(2\tau-1)}{2\hbar}p\cdot x},
\]
which is a $C^\infty$ function with polynomially bounded derivatives; therefore, $F\theta\,F\theta_{1-\tau} \in \mathcal{S}'(\mathbb{R}^{2n})$.
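The explicit formula (6.14) and the first marginal condition in (6.17) can be illustrated numerically. The sketch below is a hedged example only: it assumes n = 1 and ℏ = 1, uses the real normalized Gaussian ψ(x) = π^(−1/4) exp(−x²/2) as test function (so the complex conjugate in (6.14) is invisible), and approximates both integrals by truncated trapezoid sums. The helper names `psi`, `tau_wigner` and `p_marginal` are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1, dimension n = 1


def psi(x):
    """Real, L2-normalized Gaussian test function (illustrative choice)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)


def tau_wigner(x, p, tau, y_max=12.0, steps=480):
    """Trapezoid approximation of (6.14) for W_tau(psi, psi)(x, p); psi is real."""
    h = 2.0 * y_max / steps
    total = 0j
    for j in range(steps + 1):
        y = -y_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * p * y / HBAR) * psi(x + tau * y) * psi(x - (1.0 - tau) * y)
    return total * h / (2.0 * math.pi * HBAR)


def p_marginal(x, tau, p_max=10.0, steps=200):
    """Trapezoid approximation of the integral of W_tau(psi, psi)(x, p) over p."""
    h = 2.0 * p_max / steps
    total = 0j
    for j in range(steps + 1):
        p = -p_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * tau_wigner(x, p, tau)
    return total * h


# The p-marginal should reproduce |psi(x)|^2 for every tau, not only tau = 1/2.
marginal_tau_03 = p_marginal(0.4, 0.3)
marginal_tau_0 = p_marginal(0.4, 0.0)
```

Note that for τ ≠ 1/2 the values of W_τψ are in general complex even when ψ is real; only the marginals are guaranteed to be non-negative.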
6.5 The Born–Jordan kernel

The choice $\theta = \delta$ for the Cohen kernel is the standard one when one deals with the usual Wigner formalism. It leads, as we will see, to Weyl quantization of symbols. Another very important choice, which at first sight seems to be pulled out of thin air, is the following:

Definition 80. The Cohen kernel $\theta_{BJ}$ obtained by averaging the Shubin τ-kernels over $\tau \in [0,1]$, that is
\[
\theta_{BJ} = \int_0^1 \theta_\tau\,d\tau, \tag{6.19}
\]
is called the "Born–Jordan kernel".

We shall not attempt to calculate the integral (6.19) here, but rather describe $\theta_{BJ}$ by giving its (symplectic) Fourier transform.

Proposition 81. The symplectic Fourier transform of the Born–Jordan kernel is given by
\[
F_\sigma\theta_{BJ}(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\frac{\sin(px/2\hbar)}{px/2\hbar} \tag{6.20}
\]
with the convention $\sin(0)/0 = 1$.

Proof. In view of formula (6.15), the Fourier transform of the Shubin τ-kernel $\theta_\tau$ is given by
\[
F\theta_\tau(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}e^{-\frac{i(2\tau-1)}{2\hbar}p\cdot x}.
\]
Integrating both sides of this equality in $\tau \in [0,1]$ yields
\[
\int_0^1 F\theta_\tau(z)\,d\tau = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_0^1 e^{-\frac{i(2\tau-1)}{2\hbar}p\cdot x}\,d\tau = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\frac{\sin(px/2\hbar)}{px/2\hbar}. \tag{6.21}
\]
Formula (6.20) follows since $F_\sigma\theta_{BJ}(x,p) = F\theta_{BJ}(p,-x)$.

Formula (6.20) can be rewritten for short
\[
F_\sigma\theta_{BJ}(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\operatorname{sinc}(px/2\hbar)
\]
where $\operatorname{sinc}\alpha = \sin\alpha/\alpha$ for $\alpha \ne 0$ is the sinus cardinalis function well-known from sampling theory.

The element of the Cohen class associated with the Born–Jordan kernel, $W_{BJ}(\psi,\phi) = W(\psi,\phi)*\theta_{BJ}$, satisfies the marginal properties in Proposition 74 because $F\theta_{BJ}(x,0) = F\theta_{BJ}(0,p) = (2\pi\hbar)^{-n}$. Therefore, if $\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$ and $F\psi \in L^1(\mathbb{R}^{n}) \cap L^2(\mathbb{R}^{n})$, we will have
\[
\int_{\mathbb{R}^{n}} W_{BJ}\psi(z)\,dp = |\psi(x)|^{2} \tag{6.22}
\]
\[
\int_{\mathbb{R}^{n}} W_{BJ}\psi(z)\,dx = |\widehat{\psi}(p)|^{2}. \tag{6.23}
\]
Notice that the Moyal identity does not hold for $W_{BJ}(\psi,\phi)$ since we have $|F\theta_{BJ}(z)| \ne (2\pi\hbar)^{-n}$ (Proposition 73).
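The τ-average behind (6.21) is easy to check numerically. In the sketch below, the variable `u` stands for the dimensionless quantity px/2ℏ; the midpoint rule, the number of steps, and the helper names `averaged_phase` and `sinc` are assumptions made for this illustration.

```python
import cmath
import math


def averaged_phase(u, steps=2000):
    """Midpoint-rule approximation of int_0^1 exp(-i (2 tau - 1) u) d tau."""
    h = 1.0 / steps
    total = 0j
    for j in range(steps):
        tau = (j + 0.5) * h
        total += cmath.exp(-1j * (2.0 * tau - 1.0) * u)
    return total * h


def sinc(u):
    """sin(u)/u with the convention sinc(0) = 1, as in Proposition 81."""
    return 1.0 if u == 0 else math.sin(u) / u


# Compare the tau-average of the Shubin phases with the sinc profile of (6.20).
deviations = [abs(averaged_phase(u) - sinc(u)) for u in (0.0, 0.5, 2.0, 5.0)]
```

The averaged phase is real up to rounding, and its modulus drops below 1 as soon as u ≠ 0, which is the numerical counterpart of the failure of the Moyal identity for W_BJ noted above.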
6.6 Comments and references The notion of quasi-distribution Q goes back to the seminal work of Leon Cohen [14, 15]. The terminology “Cohen class” was apparently introduced by Gröchenig in [40]. In the proof of Proposition 72, we have been following ([31, 40]) with some minor modifications. Boggiatto and his collaborators have studied the τ-dependent transforms Wτ (ψ, ϕ). Proposition 79 is due to Boggiatto, De Donno, and Oliaro [7].
7 Born–Jordan quantization

https://doi.org/10.1515/9783110722772-007

7.1 Physical origins

To understand what Born–Jordan operators are about, one has to go back to the early years of quantum mechanics (around 1925), when the notion of quantization was first discussed in depth by the physicists Max Born and Pascual Jordan following ideas of Werner Heisenberg. These investigations quickly led to the so-called "ordering problem" for operators. In its most basic form the question was the following: if we accept that the quantum version of the variables $x$ and $p$ should be the non-commuting operators $\widehat{x}$ and $\widehat{p}$, where $\widehat{x}$ is multiplication by $x$ and $\widehat{p} = -i\hbar\partial_x$, what should then be the quantization rule for the product $xp$? Should it be $\widehat{x}\widehat{p}$ or $\widehat{p}\widehat{x}$? Or should it be something else? Consensus that the "right" choice should be the average $\tfrac12(\widehat{x}\widehat{p}+\widehat{p}\widehat{x})$ was quickly reached, using symmetry considerations (Erwin Schrödinger had already proposed this choice in previous work related to his eponymous equation). Born and Jordan thereafter proposed a more general rule for "quantizing" any monomial in the variables $x$ and $p$: they suggested associating to the function $a(x,p) = x^{m}p^{\ell}$ the differential operator
\[
\widehat{A}_{BJ} = \frac{1}{\ell+1}\sum_{k=0}^{\ell} \widehat{p}^{\,\ell-k}\,\widehat{x}^{m}\,\widehat{p}^{\,k}. \tag{7.1}
\]
Their definition seemed to be pulled out of thin air. It is actually a rigorous consequence of Born and Jordan's attempt to "quantize" Hamiltonian mechanics, and to replace the Hamilton equations of motion
\[
\frac{dx}{dt} = \frac{\partial H}{\partial p}(x,p), \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial x}(x,p)
\]
with a "quantum" counterpart, postulated to be
\[
\frac{d\widehat{x}}{dt} = \frac{\partial H}{\partial \widehat{p}}(\widehat{x},\widehat{p}), \qquad \frac{d\widehat{p}}{dt} = -\frac{\partial H}{\partial \widehat{x}}(\widehat{x},\widehat{p}).
\]
For this reason it might very well be the physically correct quantization scheme, as opposed to the Weyl quantization we studied in Chapter 5; the latter leads to the quantization rule
\[
\widehat{A}_{\mathrm{Weyl}} = \frac{1}{2^{\ell}}\sum_{k=0}^{\ell}\binom{\ell}{k}\,\widehat{p}^{\,\ell-k}\,\widehat{x}^{m}\,\widehat{p}^{\,k} \tag{7.2}
\]
for the monomial $a(x,p) = x^{m}p^{\ell}$; we have $\widehat{A}_{BJ} \ne \widehat{A}_{\mathrm{Weyl}}$ as soon as $m \ge 2$ and $\ell \ge 2$. It turns out that Weyl quantization quickly superseded Born and Jordan's rule for historical reasons. The attractiveness of Weyl quantization has several origins; it enjoys the symplectic covariance property, which in a sense reproduces the canonical invariance of Hamilton's equations. We are going to see that the ideas of Born and Jordan can be very easily and elegantly interpreted in terms of a certain element of the Cohen class, defined by what we called the Born–Jordan kernel in the last chapter.
7.2 Algebraic motivation

7.2.1 Weyl, Shubin, and Born–Jordan

Let $a$ be, as above, the monomial $a(x,p) = x^{m}p^{\ell}$ where $m$ and $\ell$ are non-negative integers, and let $\widehat{x}$ and $\widehat{p}$ be differential operators $\mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$; we choose as above $\widehat{x}$ = multiplication by $x$ and $\widehat{p} = -i\hbar\partial_x$. Let $\tau$ be a real parameter and define the Shubin ordering by associating to $a$ the differential operator
\[
\widehat{A}_\tau = \sum_{k=0}^{\ell}\binom{\ell}{k}(1-\tau)^{k}\tau^{\ell-k}\,\widehat{p}^{\,k}\,\widehat{x}^{m}\,\widehat{p}^{\,\ell-k}.
\]
Setting $\tau = 0$ we get the operator $\widehat{A}_0 = \widehat{p}^{\,\ell}\widehat{x}^{m}$ and, setting $\tau = 1$, $\widehat{A}_1 = \widehat{x}^{m}\widehat{p}^{\,\ell}$ (these orderings are called in the literature "anti-normal ordering" and "normal ordering", respectively). More interesting is the choice $\tau = \tfrac12$, which leads to the Weyl ordering (7.2):
\[
\widehat{A}_{1/2} = \widehat{A}_{\mathrm{Weyl}} = \frac{1}{2^{\ell}}\sum_{k=0}^{\ell}\binom{\ell}{k}\,\widehat{p}^{\,\ell-k}\,\widehat{x}^{m}\,\widehat{p}^{\,k}.
\]
While no choice of $\tau$ leads to the Born–Jordan ordering (7.1), the latter is however obtained by averaging the mapping $\tau \mapsto \widehat{A}_\tau$ over the interval $[0,1]$. In fact, using the identity
\[
\int_0^1 (1-\tau)^{k}\tau^{\ell-k}\,d\tau = \frac{k!\,(\ell-k)!}{(\ell+1)!}
\]
familiar from the theory of the beta function, we get
\[
\binom{\ell}{k}\int_0^1 (1-\tau)^{k}\tau^{\ell-k}\,d\tau = \frac{1}{\ell+1}
\]
so that we have
\[
\widehat{A}_{BJ} = \int_0^1 \widehat{A}_\tau\,d\tau
\]
as claimed. In the rest of this chapter we will use this "averaging" procedure to define and study Born–Jordan operators in full generality.
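The averaging step above can be verified in exact arithmetic. The sketch below does not use the beta-function identity itself; instead it integrates the polynomial (1 − τ)^k τ^(ℓ−k) term by term after a binomial expansion, and then checks that every weight of the averaged Shubin ordering equals 1/(ℓ + 1). The function names `beta_integral` and `averaged_shubin_weights` are hypothetical.

```python
from fractions import Fraction
from math import comb


def beta_integral(k, l):
    """Exact value of int_0^1 (1 - t)^k t^(l - k) dt via binomial expansion of (1 - t)^k."""
    return sum(Fraction((-1) ** i * comb(k, i), l - k + i + 1) for i in range(k + 1))


def averaged_shubin_weights(l):
    """Weights of p^k x^m p^(l - k) in the tau-average of the Shubin ordering."""
    return [comb(l, k) * beta_integral(k, l) for k in range(l + 1)]


# For l = 4 all five weights should equal 1/5, the Born-Jordan weight 1/(l + 1).
weights = averaged_shubin_weights(4)
```

Because the computation uses `Fraction`, the equality with 1/(ℓ + 1) is exact rather than approximate.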
7.2.2 A remark on the quantization of monomials

Our search for quantizations of the monomials $a(x,p) = x^{m}p^{\ell}$ can be placed in a more axiomatic setting. Let $\mathbb{C}[x,p]$ be the complex vector space of all polynomial functions. We define a quantization of $\mathbb{C}[x,p]$ as being a linear mapping
\[
\operatorname{Op} : \mathbb{C}[x,p] \to \mathbb{C}[\widehat{x},\widehat{p}]
\]
such that:
(A1) $\operatorname{Op}(1) = 1$ (the identity), $\operatorname{Op}(x) = \widehat{x}$ and $\operatorname{Op}(p) = \widehat{p}$;
(A2) $\operatorname{Op}(a)$ is self-adjoint if $a \in \mathbb{C}[x,p]$ is a real polynomial;
(A3) The restricted Dirac correspondence
\[
[\widehat{x},\operatorname{Op}(a)] = i\hbar\operatorname{Op}(\{x,a\}) \tag{7.3}
\]
\[
[\widehat{p},\operatorname{Op}(a)] = i\hbar\operatorname{Op}(\{p,a\}) \tag{7.4}
\]
holds for every $a \in \mathbb{C}[x,p]$; in particular $[\widehat{x},\widehat{p}] = i\hbar$.

Proposition 82. Let $\operatorname{Op} : \mathbb{C}[x,p] \to \mathbb{C}[\widehat{x},\widehat{p}]$ be a quantization of monomials in the sense above. Then there exists a real function $\chi \in C^\infty(\mathbb{R}^2)$ with $\chi(0) = 1$ such that
\[
\operatorname{Op}(x^{r}p^{s}) = \sum_{\ell=0}^{\min(r,s)} \hbar^{\ell}\,\ell!\,f^{(\ell)}(0)\binom{s}{\ell}\binom{r}{\ell}\,\widehat{p}^{\,s-\ell}\,\widehat{x}^{\,r-\ell} \tag{7.5}
\]
where $f^{(\ell)}(0)$ is the $\ell$-th derivative of the function $f(x) = e^{ix/2}\chi(x)$ at $x = 0$.

Suppose for instance that $\chi(x) = 1$ for all $x$; then $f(x) = e^{ix/2}$ and $f^{(j)}(0) = (i/2)^{j}$, and it is easy to show, using repeatedly the commutation relation $[\widehat{x},\widehat{p}] = i\hbar$, that in this case $\operatorname{Op}$ is just Weyl quantization: $\operatorname{Op} = \operatorname{Op}_W$. Similarly, the choice $\chi(x) = \operatorname{sinc}(x/2)$ leads to $f^{(\ell)}(0) = i^{\ell}/(\ell+1)$ and $\operatorname{Op}$ is the Born–Jordan quantization: $\operatorname{Op} = \operatorname{Op}_{BJ}$. We notice that the Shubin ordering does not correspond to any monomial quantization in the sense above. For details and references see the last section of this chapter.
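Formula (7.5) can be compared with the orderings (7.1) and (7.2) in exact arithmetic. The sketch below is an illustration resting on one standard consequence of [x̂, p̂] = iℏ, stated here as an assumption: x̂^r p̂^k = Σ_j (iℏ)^j j! C(r,j) C(k,j) p̂^(k−j) x̂^(r−j). With it, each ordering Σ_k w_k p̂^(s−k) x̂^r p̂^k is reduced to coefficients of (iℏ)^j p̂^(s−j) x̂^(r−j). All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def from_ordering(weights, r, s):
    """Coefficients c[j] of (i*hbar)^j p^(s-j) x^(r-j) for sum_k weights[k] * p^(s-k) x^r p^k."""
    return [
        sum(w * comb(k, j) for k, w in enumerate(weights)) * comb(r, j) * factorial(j)
        for j in range(min(r, s) + 1)
    ]


def from_formula_7_5(scaled_derivative, r, s):
    """Same coefficients read off from (7.5); scaled_derivative(j) stands for f^(j)(0) / i^j."""
    return [
        factorial(j) * scaled_derivative(j) * comb(s, j) * comb(r, j)
        for j in range(min(r, s) + 1)
    ]


def weyl_weights(s):
    """Weights 2^-s C(s, k) of the Weyl ordering (7.2)."""
    return [Fraction(comb(s, k), 2 ** s) for k in range(s + 1)]


def born_jordan_weights(s):
    """Equal weights 1/(s + 1) of the Born-Jordan ordering (7.1)."""
    return [Fraction(1, s + 1)] * (s + 1)


def weyl_derivative(j):
    """f^(j)(0) / i^j for chi = 1, that is f^(j)(0) = (i/2)^j."""
    return Fraction(1, 2 ** j)


def born_jordan_derivative(j):
    """f^(j)(0) / i^j for chi = sinc(x/2), that is f^(j)(0) = i^j / (j + 1)."""
    return Fraction(1, j + 1)


weyl_22 = from_ordering(weyl_weights(2), 2, 2)
born_jordan_22 = from_ordering(born_jordan_weights(2), 2, 2)
```

For x²p² the two coefficient lists first differ in the (iℏ)² term, which is why the Weyl and Born–Jordan quantizations of a monomial x^r p^s can only disagree when both r and s are at least 2.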
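7.3 Born–Jordan operators: definition(s)

It turns out that we have already developed much of the material we need to extend the algebraic considerations above to the case of arbitrary symbols.

7.3.1 Definition using the Cohen class

Let $a$ be a function or distribution intended to serve as a symbol. Recall (formula (5.5)) that the Weyl operator $\operatorname{Op}_W(a)$ can be defined in the weak sense, for $a \in \mathcal{S}'(\mathbb{R}^{2n})$, by the condition
\[
\langle \operatorname{Op}_W(a)\psi,\phi\rangle = \langle a, W(\psi,\phi)\rangle
\]
where $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$.

Definition 83. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$. The Born–Jordan operator $\operatorname{Op}_{BJ}(a)$ is the only operator $\mathcal{S}(\mathbb{R}^{n}) \to \mathcal{S}'(\mathbb{R}^{n})$ such that
\[
\langle \operatorname{Op}_{BJ}(a)\psi,\phi\rangle = \langle a, W_{BJ}(\psi,\phi)\rangle \tag{7.6}
\]
for all $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^{n}) \times \mathcal{S}(\mathbb{R}^{n})$. Here $W_{BJ}(\psi,\phi) = W(\psi,\phi)*\theta_{BJ}$ where $\theta_{BJ}$ is the Born–Jordan kernel (Definition 80).

We will write $\widehat{A}_{BJ} = \operatorname{Op}_{BJ}(a)$. Recall that the symplectic Fourier transform of the Born–Jordan kernel is given by
\[
F_\sigma\theta_{BJ}(z) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\frac{\sin(px/2\hbar)}{px/2\hbar}. \tag{7.7}
\]
While formula (7.6) indeed defines $\operatorname{Op}_{BJ}(a)$ in the weak sense, we must check the uniqueness statement. Assume that a second operator $\widehat{A}'_{BJ}$ satisfies (7.6); then we have
\[
\langle (\widehat{A}_{BJ} - \widehat{A}'_{BJ})\psi,\phi\rangle = 0
\]
for all $(\psi,\phi)$ and hence $\widehat{A}_{BJ} - \widehat{A}'_{BJ} = 0$.

Proposition 84. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$. (i) We have
\[
\operatorname{Op}_{BJ}(a) = \operatorname{Op}_W(a*\theta_{BJ}) \tag{7.8}
\]
where $\theta_{BJ}$ is the Born–Jordan kernel. (ii) The operator $\operatorname{Op}_{BJ}(a)$ has the harmonic decomposition
\[
\widehat{A}_{BJ}\psi(x) = \Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}} a_\sigma(z_0)\,\widehat{D}_{BJ}(z_0)\psi(x)\,dz_0 \tag{7.9}
\]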
where ̂ DBJ (z0 ) is the operator defined by p x ̂ DBJ (z0 ) = ̂ D(z0 ) sinc( 0 0 ). 2ℏ Proof. (i) In integral notation definition (7.6) reads ̂ BJ ψ(x)ϕ(x)dx = ∫ a(z)WBJ (ψ, ϕ)(z)dz ; ∫A ℝn
ℝ2n
(7.10)
7.3 Born–Jordan operators: definition(s)
| 87
using the Plancherel formula for the symplectic Fourier transform Fσ this can be rewritten ̂ BJ ψ(x)ϕ(x)dx = ∫ Fσ a(z)Fσ WBJ (ψ, ϕ)(−z)dz. ∫A ℝn
ℝ2n
In view of the formula Fσ (f ∗ g) = (2πℏ)n Fσ fFσ g we have Fσ WBJ (ψ, ϕ) = (2πℏ)n Fσ W(ψ, ϕ)Fσ θBJ and hence ̂ BJ ψ(x)ϕ(x)dx = (2πℏ)n ∫ Fσ a(z)Fσ W(ψ, ϕ)(−z)Fσ θBJ (−z)dz ∫A ℝn
ℝ2n
that is, since Fσ θBJ (−z) = Fσ θBJ (z), ̂ BJ ψ(x)ϕ(x)dx = (2πℏ)n ∫ Fσ a(z)Fσ θBJ (z)Fσ W(ψ, ϕ)(−z)dz ∫A ℝn
ℝ2n
= ∫ Fσ (a ∗ θBJ )(z)Fσ W(ψ, ϕ)(−z)dz. ℝ2n
Using again Plancherel’s formula we get ̂ BJ ψ(x)ϕ(x)dx = ∫ (a ∗ θBJ )(z)W(ψ, ϕ)(z)dz ∫A ℝn
ℝ2n
hence formula (7.8). (ii) Taking definition (7.7) into account we have Fσ WBJ (ψ, ϕ)(z) =
sin(px/2ℏ) Fσ W(ψ, ϕ)(z). px/2ℏ
(Notice that Fσ W(ψ, ϕ) is just the cross-ambiguity function A(ψ, ϕ)). It follows from the calculations above that we have the identity (7.9) where ̂ D(z0 ) is the usual Heisenberg displacement operator. Notice that the operator ̂ DBJ (z0 ) defined by (7.10) is not unitary on L2 (ℝn ).
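7.3.2 Pseudodifferential expression
Here is an alternative definition of $\mathrm{Op}_{\mathrm{BJ}}(a)$ which links the results above to the averaging procedure introduced in the beginning of this chapter.

Recall that we defined in the last chapter (formula (6.13)) the $\tau$-Wigner transform ($\tau$ a real parameter) as the element of the Cohen class associated to the Shubin $\tau$-kernel given by
\[
\theta_\tau(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} \frac{2^n}{|2\tau - 1|^n}\, e^{\frac{2i}{\hbar(2\tau - 1)}\, p\cdot x}
\]
for $\tau \neq \tfrac{1}{2}$, and $\theta_{1/2}(z) = \delta(z)$. We thereafter defined (formula (6.19)) the Born–Jordan kernel $\theta_{\mathrm{BJ}}$ by averaging the Shubin $\tau$-kernels over the real interval $[0,1]$:
\[
\theta_{\mathrm{BJ}} = \int_0^1 \theta_\tau\, d\tau. \tag{7.11}
\]

Definition 85. Let $\tau \in \mathbb{R}$ and $a \in \mathcal{S}'(\mathbb{R}^{2n})$. The Shubin $\tau$-pseudodifferential operator $\hat{A}_\tau = \mathrm{Op}_\tau(a)$ is the only operator $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ such that
\[
\langle \mathrm{Op}_\tau(a)\psi, \phi\rangle = \langle a, W_\tau(\psi,\phi)\rangle \tag{7.12}
\]
for all $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^n) \times \mathcal{S}(\mathbb{R}^n)$. Here $W_\tau(\psi,\phi) = W(\psi,\phi) * \theta_\tau$ is the $\tau$-Wigner transform studied in the last chapter (Definition 76).

Proposition 86. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$ and $\psi \in \mathcal{S}(\mathbb{R}^n)$. (i) The Shubin operator $\hat{A}_\tau = \mathrm{Op}_\tau(a)$ is given by
\[
\hat{A}_\tau\psi(x) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{2n}} e^{\frac{i}{\hbar}p(x-y)}\, a((1-\tau)x + \tau y, p)\, \psi(y)\, dy\, dp \tag{7.13}
\]
and (ii) the Born–Jordan operator $\hat{A}_{\mathrm{BJ}} = \mathrm{Op}_{\mathrm{BJ}}(a)$ is given by
\[
\hat{A}_{\mathrm{BJ}} = \int_0^1 \hat{A}_\tau\, d\tau. \tag{7.14}
\]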
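Proof. (i) In view of formula (6.14) of the last chapter we have
\[
W_\tau(\psi,\phi)(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar}py}\, \psi(x + \tau y)\, \phi(x - (1-\tau)y)\, dy
\]
and hence
\[
\langle a, W_\tau(\psi,\phi)\rangle = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{3n}} e^{-\frac{i}{\hbar}py}\, a(x,p)\, \psi(x + \tau y)\, \phi(x - (1-\tau)y)\, dy\, dp\, dx. \tag{7.15}
\]
Performing the change of variables $x' = x - (1-\tau)y$ and $y' = x + \tau y$ this is
\[
\langle a, W_\tau(\psi,\phi)\rangle = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{3n}} e^{-\frac{i}{\hbar}p(x'-y')}\, a((1-\tau)x' + \tau y', p)\, \psi(y')\, \phi(x')\, d^n y'\, d^n p\, d^n x'
\]
hence the equality (7.13). (ii) Formula (7.14) follows.

Notice that a Born–Jordan operator can be written in the form of a Fourier integral operator: setting
\[
b(x,y,p) = \int_0^1 a((1-\tau)x + \tau y, p)\, d\tau
\]
we have
\[
\hat{A}_{\mathrm{BJ}}\psi(x) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{2n}} e^{\frac{i}{\hbar}p(x-y)}\, b(x,y,p)\, \psi(y)\, dy\, dp.
\]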
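For a monomial symbol $a(x,p) = x^r p^s$ the averaged amplitude is $b(x,y,p) = p^s \int_0^1 ((1-\tau)x + \tau y)^r\, d\tau$, and the integral can be evaluated in closed form. The sketch below, given for illustration only, computes the coefficient of each $x^{r-k}y^k$ exactly with rational arithmetic; the function name `averaged_power_coeffs` is an assumption of this example.

```python
from fractions import Fraction
from math import comb, factorial

def averaged_power_coeffs(r):
    """Coefficients c_k of x**(r-k) * y**k in the integral over [0, 1]
    of ((1 - t) * x + t * y)**r with respect to t."""
    coeffs = []
    for k in range(r + 1):
        # Euler Beta integral: int_0^1 (1-t)^(r-k) t^k dt = k! (r-k)! / (r+1)!
        beta = Fraction(factorial(k) * factorial(r - k), factorial(r + 1))
        coeffs.append(comb(r, k) * beta)
    return coeffs

coeffs_r4 = averaged_power_coeffs(4)
```

Every coefficient comes out equal to $1/(r+1)$, so that $b(x,y,p) = \frac{p^s}{r+1}\sum_{k=0}^{r} x^{r-k}y^k$; this appears consistent with the equally weighted ordering that the Born–Jordan rule assigns to monomials.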
The following result in a sense “trivializes” Born–Jordan quantization; it shows that if we choose the symbol in one of the Shubin classes, then $\mathrm{Op}_{\mathrm{BJ}}(a)$ is essentially a Shubin pseudodifferential operator:

Proposition 87. (i) If $\hat{A}_{\mathrm{BJ}} = \mathrm{Op}_{\mathrm{BJ}}(a)$ with $a \in \Gamma^m_\rho(\mathbb{R}^{2n})$ there exists, for every $\tau \in \mathbb{R}$, a symbol $a_\tau$ belonging to the same symbol class $\Gamma^m_\rho(\mathbb{R}^{2n})$ such that $\hat{A}_{\mathrm{BJ}} = \mathrm{Op}_\tau(a_\tau)$. (ii) Conversely, for any given symbol $a_\tau \in \Gamma^m_\rho(\mathbb{R}^{2n})$ there exists a symbol $a \in \Gamma^m_\rho(\mathbb{R}^{2n})$ such that
\[
\mathrm{Op}_{\mathrm{BJ}}(a) = \mathrm{Op}_\tau(a_\tau) + R
\]
where $R$ is an operator with integral kernel in $\mathcal{S}(\mathbb{R}^{2n})$.

In particular, taking $\tau = \tfrac{1}{2}$, the operator $\hat{A}_{\mathrm{BJ}} = \mathrm{Op}_{\mathrm{BJ}}(a)$ is a Weyl pseudodifferential operator with symbol in the same Shubin class as $a$.
7.4 Non-invertibility of Born–Jordan quantization
The Weyl transform, viewed as a quantization procedure, is invertible in the following sense: to every continuous linear operator $\hat{A} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ corresponds a unique distribution $a \in \mathcal{S}'(\mathbb{R}^{2n})$ such that $\hat{A} = \mathrm{Op}_{\mathrm{W}}(a)$; conversely, given a symbol $a \in \mathcal{S}'(\mathbb{R}^{2n})$ we know how to construct the operator $\mathrm{Op}_{\mathrm{W}}(a)$. Thus,
\[
\mathrm{Op}_{\mathrm{W}} : \mathcal{S}'(\mathbb{R}^{2n}) \to \mathcal{L}(\mathcal{S}(\mathbb{R}^n), \mathcal{S}'(\mathbb{R}^n))
\]
is a bijection (in fact a linear isomorphism). We are going to see that the Born–Jordan quantization
\[
\mathrm{Op}_{\mathrm{BJ}} : \mathcal{S}'(\mathbb{R}^{2n}) \to \mathcal{L}(\mathcal{S}(\mathbb{R}^n), \mathcal{S}'(\mathbb{R}^n))
\]
is surjective, but not injective: there are non-zero symbols $a \in \mathcal{S}'(\mathbb{R}^{2n})$ which are mapped to the zero operator. The physical meaning of this particularity is not yet fully understood.

7.4.1 A first non-injectivity result
We prove here a non-uniqueness result; the general question of invertibility, which is much more subtle, will be addressed below.

Proposition 88. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$ and $a_{z_0}(z) = e^{-i\sigma(z,z_0)/\hbar}$. We have
\[
\mathrm{Op}_{\mathrm{BJ}}(a) = \mathrm{Op}_{\mathrm{BJ}}(a + a_{z_0}) \tag{7.16}
\]
for all $z_0 = (x_0,p_0)$ such that $F_\sigma\theta_{\mathrm{BJ}}(z_0) = 0$, that is $p_0 x_0 = 2N\pi\hbar$ ($N \in \mathbb{Z}$), $N \neq 0$.

Proof. By linearity it suffices to prove that $\mathrm{Op}_{\mathrm{BJ}}(a_{z_0}) = 0$ if $p_0 x_0 = 2N\pi\hbar$ with $N \neq 0$. We have, by formula (7.9),
\[
\mathrm{Op}_{\mathrm{BJ}}(a_{z_0})\psi = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{2n}} F_\sigma a_{z_0}(z)\, \hat{D}_{\mathrm{BJ}}(z)\psi\, dz. \tag{7.17}
\]
Now,
\[
F_\sigma a_{z_0}(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}\sigma(z - z_0, z')}\, dz' = (2\pi\hbar)^n \delta(z - z_0)
\]
and hence
\[
\mathrm{Op}_{\mathrm{BJ}}(a_{z_0}) = \int_{\mathbb{R}^{2n}} \delta(z - z_0)\, \operatorname{sinc}\!\left(\frac{px}{2\hbar}\right) \hat{D}(z)\, dz = \operatorname{sinc}\!\left(\frac{p_0 x_0}{2\hbar}\right) \hat{D}(z_0)
\]
so that $\mathrm{Op}_{\mathrm{BJ}}(a_{z_0}) = 0$ if $p_0 x_0 \in 2\pi\hbar\mathbb{Z}$ with $p_0 x_0 \neq 0$.

Observe that the set $\Sigma$ of all $z$ such that $\chi_{\mathrm{BJ}}(z) = 0$ consists of a family of concentric $(2n-1)$-dimensional sheets in phase space (when $n = 1$ they are just hyperbolas in the phase plane). The distance of the set $\Sigma$ to the origin is easily calculated and one gets $\operatorname{dist}(0,\Sigma) = \sqrt{4\pi\hbar}$.

It follows from (7.16) that we have $\mathrm{Op}_{\mathrm{BJ}}(a) = 0$ for all symbols of the type
\[
a(z) = \sum_{z_0 \in \Lambda} \lambda(z_0)\, e^{-i\sigma(z,z_0)/\hbar} \tag{7.18}
\]
where $\Lambda$ is any finite lattice of points $z_0 = (x_0,p_0)$ in $\mathbb{R}^{2n}$ such that $p_0 x_0/2\pi\hbar \in \mathbb{Z}\setminus\{0\}$ and $\lambda : \Lambda \to \mathbb{C}$.

However, the Born–Jordan quantization of monomials is invertible. Assuming that $n = 1$, we denote by $\mathbb{C}[x,p]$ the polynomial ring generated by the real variables $x$ and $p$, and by $\mathbb{C}[\hat{x},\hat{p}]$ the algebra generated by $\hat{x}$ and $\hat{p}$. Then:

Proposition 89. The quantization $\mathrm{Op}_{\mathrm{BJ}}$ is an isomorphism of vector spaces
\[
\mathrm{Op}_{\mathrm{BJ}} : \mathbb{C}[x,p] \to \mathbb{C}[\hat{x},\hat{p}].
\]
We omit the (quite technical) proof here and refer to the end of the chapter for references.
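The geometry of the zero set $\Sigma$ described in this subsection can be explored numerically for $n = 1$. The sketch below is only an illustration: the value $\hbar = 1$, the scan range, and the function names are assumptions of the example. It evaluates $\operatorname{sinc}(px/2\hbar)$ and minimises $|z|$ along the first hyperbola $px = 2\pi\hbar$.

```python
import math

HBAR = 1.0  # assumed value, chosen for illustration only

def chi_bj(x, p, hbar=HBAR):
    """sinc(p*x / (2*hbar)), with the convention sinc(0) = 1."""
    t = p * x / (2.0 * hbar)
    return 1.0 if t == 0.0 else math.sin(t) / t

def distance_to_first_zero_sheet(hbar=HBAR, samples=200001):
    """Minimise |z| over the branch p = 2*pi*hbar / x, x > 0, of the zero set."""
    best = float("inf")
    for k in range(1, samples):
        x = 10.0 * k / samples          # scan 0 < x < 10
        p = 2.0 * math.pi * hbar / x    # point on the hyperbola p*x = 2*pi*hbar
        best = min(best, math.hypot(x, p))
    return best

nearest = distance_to_first_zero_sheet()
```

The minimum is attained near $x = p = \sqrt{2\pi\hbar}$ and should agree with $\sqrt{4\pi\hbar}$ up to the grid resolution; inside the ball of that radius the function `chi_bj` stays strictly positive.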
7.4.2 A general result
Let us recall a few basic concepts from distribution theory and the theory of complex functions. The Fourier transform of a compactly supported function (or distribution) is an entire analytic function (Paley–Wiener theorem). We denote by $B^{2n}(r)$ the closed ball in $\mathbb{R}^{2n}$ centered at the origin and with radius $r$, and by $\mathcal{E}'(\mathbb{R}^{2n})$ the space of compactly supported distributions on $\mathbb{R}^{2n}$; we have the inclusion $\mathcal{E}'(\mathbb{R}^{2n}) \subset \mathcal{S}'(\mathbb{R}^{2n})$. For $r \geq 0$ we denote by $A_r(2n)$ the subspace of $\mathcal{S}'(\mathbb{R}^{2n})$ consisting of all tempered distributions $a$ whose symplectic Fourier transform $F_\sigma a = a_\sigma$ has support $\operatorname{supp}(a_\sigma) \subset B^{2n}(r)$. Equivalently, $a$ satisfies an estimate
\[
|a(\zeta)| \leq C(1 + |\zeta|)^N e^{\frac{r}{\hbar}|\operatorname{Im}\zeta|} \tag{7.19}
\]
for some constants $C > 0$, $N > 0$. Obviously $A_0(2) = \mathbb{C}[x,p]$, the space of polynomials in the real variables $x$ and $p$. More generally, $A_0(2n)$ is the space of polynomials in the variables $x_1,\dots,x_n$ and $p_1,\dots,p_n$. The following inversion result generalizes Proposition 89:

Proposition 90. The linear mapping $\mathcal{S}'(\mathbb{R}^{2n}) \to \mathcal{S}'(\mathbb{R}^{2n})$ defined by
\[
a \longmapsto \left(\frac{1}{2\pi\hbar}\right)^{n} a * \theta_{\mathrm{BJ}} \tag{7.20}
\]
restricts to an automorphism of $A_r(2n)$ if and only if
\[
0 \leq r < \sqrt{4\pi\hbar}. \tag{7.21}
\]
That is, the equation $b_\sigma = a_\sigma F_\sigma\theta_{\mathrm{BJ}}$ admits, for every $b \in A_r(2n)$, a unique solution $a \in A_r(2n)$ if and only if condition (7.21) holds.

Proof. Let us first prove the sufficiency of condition (7.21). Assume that $0 \leq r < \sqrt{4\pi\hbar}$; it follows from the equality (7.18) that the ball $B^{2n}(r)$ does not contain any zero of $F_\sigma\theta_{\mathrm{BJ}}$, hence the equation $b_\sigma = a_\sigma F_\sigma\theta_{\mathrm{BJ}}$ admits the solution $a_\sigma = b_\sigma/F_\sigma\theta_{\mathrm{BJ}}$ for every $b \in A_r(2n)$, and it is clear that $a_\sigma \in \mathcal{S}'(\mathbb{R}^{2n})$. Since $\operatorname{supp}(b_\sigma) \subset B^{2n}(r)$ we also have $\operatorname{supp}(a_\sigma) \subset B^{2n}(r)$, hence $a \in A_r(2n)$. Condition (7.21) is also necessary: assume in fact that $r \geq \sqrt{4\pi\hbar}$ and choose $z_0 = (x_0,p_0)$ such that $F_\sigma\theta_{\mathrm{BJ}}(z_0) = 0$. Then $\hat{D}_{\mathrm{BJ}}(z_0) = 0$ and $\operatorname{supp}(b_\sigma)$ thus contains the point $z_0$; hence the mapping (7.20) is not injective.

As for the announced surjectivity, we have the following result:

Proposition 91. The mapping $a \longmapsto (2\pi\hbar)^{-n} a * \theta_{\mathrm{BJ}}$ is a linear surjection $\mathcal{S}'(\mathbb{R}^{2n}) \to \mathcal{S}'(\mathbb{R}^{2n})$: for every $b \in \mathcal{S}'(\mathbb{R}^{2n})$ there exists a (non-unique) $a \in \mathcal{S}'(\mathbb{R}^{2n})$ such that
\[
b = \left(\frac{1}{2\pi\hbar}\right)^{n} a * \theta_{\mathrm{BJ}}. \tag{7.22}
\]
Its proof is highly non-trivial in spite of its apparent simplicity. The difficulty comes from the fact that we are confronted with a division problem for distributions. See the Comments and References section.
7.5 Comments and references
Born–Jordan quantization was proposed in the papers [11, 12] by Pascual Jordan, Max Born, and Werner Heisenberg. The idea of averaging $\tau$-Wigner transforms (which is the key to Born–Jordan quantization) is originally due to Paolo Boggiatto and his collaborators [8–10]. Proposition 82 on the orderings of monomials is due to Domingo and Galapon [22]. The section addressing the (non-)invertibility and surjectivity of Born–Jordan quantization originates in joint work with Elena Cordero and Fabio Nicola (in particular the proof of Propositions 90 and 91); see [16]. Also see our monograph [33] for details and comments. The proof of Proposition 87 can be found in our joint work [38] with Fabio Nicola.

The debate is still open whether Born–Jordan quantization is the “right” quantization procedure as opposed to the Weyl transformation. The two theories are hard to distinguish, because they both quantize “physical Hamiltonians” of the type
\[
H(x,p,t) = \frac{1}{2m}(p - A(x,t))^2 + V(x,t)
\]
in the same way, and lead to the operator
\[
\hat{H} = \frac{1}{2m}(-i\hbar\partial_x - A(x,t))^2 + V(x,t)
\]
already “guessed” by Erwin Schrödinger. One can prove (see [38]) that Born–Jordan quantization satisfies the Dirac prescription (1.25) for all Hamiltonians of the type above.
8 Metaplectic operators

8.1 The symplectic group Sp(n)
The symplectic group $\mathrm{Sp}(n)$ is generated by the set of all symplectic matrices
\[
S = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad \det B \neq 0 ; \tag{8.1}
\]
symplectic matrices of this type are called free symplectic matrices. The notion depends on the choice of symplectic basis and is therefore not intrinsic. Condition (8.1) is actually equivalent to the transversality condition
\[
S(0 \times \mathbb{R}^n) \cap (0 \times \mathbb{R}^n) = 0.
\]
We have the following remarkable result (proof omitted):

Proposition 92. Every $S \in \mathrm{Sp}(n)$ can be written (non-uniquely) as the product of exactly two free symplectic matrices.

Let $L$ and $P$ be two real $n \times n$ matrices, $P$ symmetric and $L$ invertible. We set
\[
V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}, \qquad M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix}.
\]
These matrices are trivially symplectic, and one verifies by a straightforward calculation that every free symplectic matrix (8.1) factorizes as
\[
S = V_{-DB^{-1}}\, M_{B^{-1}}\, J\, V_{-B^{-1}A}. \tag{8.2}
\]
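The factorization (8.2) is easy to test in the simplest case $n = 1$, where $A, B, C, D$ are numbers with $AD - BC = 1$ and $B \neq 0$. The following sketch is an illustration under that assumption, with $J$ taken to be the standard symplectic matrix with rows $(0, 1)$ and $(-1, 0)$; the helper names are hypothetical. It multiplies the four factors with exact rational arithmetic.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def V_minus(P):
    """V_{-P} = [[1, 0], [P, 1]] for n = 1."""
    return [[Fraction(1), Fraction(0)], [Fraction(P), Fraction(1)]]

def M(L):
    """M_L = [[1/L, 0], [0, L]] for n = 1 (L nonzero)."""
    L = Fraction(L)
    return [[1 / L, Fraction(0)], [Fraction(0), L]]

# Standard symplectic matrix (assumed convention).
J = [[Fraction(0), Fraction(1)], [Fraction(-1), Fraction(0)]]

def factorize_free(S):
    """Rebuild S = V_{-D/B} M_{1/B} J V_{-A/B} from its entries."""
    (a, b), (c, d) = S
    assert a * d - b * c == 1 and b != 0, "S must be free symplectic"
    product = V_minus(d / b)
    for factor in (M(1 / b), J, V_minus(a / b)):
        product = matmul(product, factor)
    return product

S = [[Fraction(2), Fraction(3)], [Fraction(1), Fraction(2)]]
S_rebuilt = factorize_free(S)
```

For this sample matrix the product of the four factors reproduces `S` exactly, the lower-left entry being recovered through the relation $AD - BC = 1$.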
It follows from Proposition 92 and the factorization (8.2) that:

Proposition 93. The symplectic group $\mathrm{Sp}(n)$ is generated by the set
\[
G = \{V_{-P}, M_L, J : P = P^T, \det L \neq 0\}.
\]
One verifies immediately that these generators satisfy the relations
\[
V_{-P}V_{-P'} = V_{-(P+P')}, \qquad M_L M_{L'} = M_{L'L} \tag{8.3}
\]
\[
M_L V_{-P} = V_{-L^T P L}\, M_L, \qquad V_{-P} M_L = M_L V_{-(L^{-1})^T P L^{-1}} \tag{8.4}
\]
and
\[
(V_{-P} M_L)^{-1} = V_{(L^{-1})^T P L^{-1}}\, M_{L^{-1}}. \tag{8.5}
\]
An immediate consequence of this is that the determinant of a symplectic matrix is always equal to $+1$ (while the defining relation $S^T J S = J$ of a symplectic matrix only allows one to conclude that $\det S = \pm 1$).
To each free symplectic matrix (8.1), we associate a homogeneous quadratic polynomial in the variables $(x,x')$:
\[
W(x,x') = \tfrac{1}{2} DB^{-1}x^2 - B^{-1}x \cdot x' + \tfrac{1}{2} B^{-1}Ax'^2.
\]
This polynomial is called the generating function of $S$, and we have
\[
(x,p) = S(x',p') \iff \begin{cases} p = \partial_x W(x,x') \\ p' = -\partial_{x'} W(x,x'). \end{cases}
\]
Conversely, to every quadratic polynomial of the type
\[
W(x,x') = \tfrac{1}{2} Px^2 - Lx \cdot x' + \tfrac{1}{2} Qx'^2, \qquad P = P^T,\ Q = Q^T,\ \det L \neq 0, \tag{8.6}
\]
we can associate a free symplectic matrix, namely
\[
S_W = \begin{pmatrix} L^{-1}Q & L^{-1} \\ PL^{-1}Q - L^T & PL^{-1} \end{pmatrix}. \tag{8.7}
\]
The latter can be factorized as
\[
S_W = V_{-P}\, M_L\, J\, V_{-Q}. \tag{8.8}
\]
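The correspondence between a generating function and its free symplectic matrix can be illustrated for $n = 1$. In the sketch below (an illustration only; the sample values of $P$, $L$, $Q$ and the function names are assumptions) the momenta are computed from $W(x,x') = \tfrac12 Px^2 - Lxx' + \tfrac12 Qx'^2$ and then compared with the image of $(x',p')$ under the matrix $S_W$ of (8.7).

```python
from fractions import Fraction

def S_W(P, L, Q):
    """Free symplectic matrix (n = 1) attached to
    W(x, x') = P x^2 / 2 - L x x' + Q x'^2 / 2, with L nonzero."""
    P, L, Q = Fraction(P), Fraction(L), Fraction(Q)
    return [[Q / L, 1 / L], [P * Q / L - L, P / L]]

def momenta_from_W(P, L, Q, x, x_prime):
    """Return (p, p') with p = dW/dx and p' = -dW/dx'."""
    p = P * x - L * x_prime
    p_prime = L * x - Q * x_prime
    return p, p_prime

# Arbitrary sample data (hypothetical values).
P, L, Q = Fraction(3), Fraction(2), Fraction(5)
x, x_prime = Fraction(7, 3), Fraction(-1, 4)

p, p_prime = momenta_from_W(P, L, Q, x, x_prime)
S = S_W(P, L, Q)

# Image of (x', p') under S_W; it should coincide with (x, p).
image = (S[0][0] * x_prime + S[0][1] * p_prime,
         S[1][0] * x_prime + S[1][1] * p_prime)
determinant = S[0][0] * S[1][1] - S[0][1] * S[1][0]
```

With exact rational arithmetic `image` equals `(x, p)` and the determinant of `S` equals 1, as expected of a symplectic matrix.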
8.2 The metaplectic representation

8.2.1 Fourier integral operators with quadratic phases
The metaplectic group can be defined in several ways. Our approach highlights the notion of generating function of a symplectic matrix, which allows one to define the metaplectic group in terms of a class of Fourier integral operators with quadratic phase.

To every free symplectic matrix $S_W$, we associate an operator $\hat{S}_{W,m}$ by setting, for $\psi \in \mathcal{S}(\mathbb{R}^n)$,
\[
\hat{S}_{W,m}\psi(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(x,x')}\psi(x')\, dx' ; \tag{8.9}
\]
here $\arg i = \pi/2$ and the prefactor $\Delta(W)$ is defined by
\[
\Delta(W) = i^m \sqrt{|\det L|} ; \tag{8.10}
\]
the integer $m$ (“Maslov index”) corresponds to a choice of $\arg\det L$:
\[
m\pi \equiv \arg\det L \mod 2\pi. \tag{8.11}
\]
(Thus to each $S_W$ correspond two operators: $\hat{S}_{W,m}$ and $\hat{S}_{W,m+2} = -\hat{S}_{W,m}$.) We will call the operator $\hat{S}_{W,m}$ the “quadratic Fourier transform” associated with the free symplectic matrix $S_W$.

The generating function of the standard symplectic matrix $J$ is $W(x,x') = -x \cdot x'$. We denote by
\[
\hat{J}\psi(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar}x \cdot x'}\psi(x')\, dx' = i^{-n/2} F\psi(x) \tag{8.12}
\]
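the operator corresponding to the choice $m = 0$ for $L = I_{n \times n}$; here $\psi \in \mathcal{S}(\mathbb{R}^n)$ and $F$ is the usual unitary Fourier transform on $\mathbb{R}^n$.

Let us define operators $\hat{V}_{-P}$ and $\hat{M}_{L,m}$ by
\[
\hat{V}_{-P}\psi(x) = e^{\frac{i}{2\hbar}Px \cdot x}\psi(x), \qquad \hat{M}_{L,m}\psi(x) = i^m \sqrt{|\det L|}\, \psi(Lx). \tag{8.13}
\]
These are obviously unitary operators on $L^2(\mathbb{R}^n)$. Let $W$ be the quadratic form (8.6). We have the factorization
\[
\hat{S}_{W,m} = \hat{V}_{-P}\, \hat{M}_{L,m}\, \hat{J}\, \hat{V}_{-Q}. \tag{8.14}
\]
This immediately follows from the definition of $\hat{J}$, noting that
\[
\hat{M}_{L,m}\hat{J}\psi(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} i^m \sqrt{|\det L|} \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar}Lx \cdot x'}\psi(x')\, dx'.
\]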
Proposition 94. The operators $\hat{S}_{W,m}$ extend to unitary operators $L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$, and the inverse of $\hat{S}_{W,m}$ is given by
\[
\hat{S}_{W,m}^{-1} = \hat{S}_{W^*,m^*}, \qquad W^*(x,x') = -W(x',x), \qquad m^* = n - m.
\]

Proof. We obviously have
\[
(\hat{V}_{-P})^{-1} = \hat{V}_P \quad\text{and}\quad (\hat{M}_{L,m})^{-1} = \hat{M}_{L^{-1},-m},
\]
and $\hat{J}^{-1}$ is given by the Fourier inversion formula:
\[
\hat{J}^{-1}\psi(x) = \left(\frac{i}{2\pi\hbar}\right)^{n/2} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}x \cdot x'}\psi(x')\, dx'.
\]
In view of the identity (8.14), we have
\[
\hat{S}_{W,m}^{-1} = \hat{V}_Q\, \hat{J}^{-1}\, \hat{M}_{L^{-1},-m}\, \hat{V}_P, \tag{8.15}
\]
and the inversion formula follows from the fact that
\[
\begin{aligned}
\hat{J}^{-1}\hat{M}_{L^{-1},-m}\psi(x) &= \left(\frac{i}{2\pi\hbar}\right)^{n/2} i^{-m}\sqrt{|\det L^{-1}|} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}x \cdot x'}\psi(L^{-1}x')\, dx' \\
&= \left(\frac{1}{2\pi i\hbar}\right)^{n/2} i^{-m+n}\sqrt{|\det L|} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}L^T x \cdot x'}\psi(x')\, dx' \\
&= \hat{M}_{-L^T,n-m}\hat{J}\psi(x).
\end{aligned}
\]

8.2.2 Definition and properties of Mp(n)
As a consequence of Proposition 94, the operators $\hat{S}_{W,m}$ form a subset of the group $\mathcal{U}(L^2(\mathbb{R}^n))$ of unitary operators acting on $L^2(\mathbb{R}^n)$, and this subset is closed under the operation of inversion. This motivates the following definition:

Definition 95. The subgroup of $\mathcal{U}(L^2(\mathbb{R}^n))$ generated by the quadratic Fourier transforms $\hat{S}_{W,m}$ is called the “metaplectic group” and is denoted by $\mathrm{Mp}(n)$. The elements of $\mathrm{Mp}(n)$ are called “metaplectic operators”.

The following result considerably simplifies many arguments; it is the metaplectic analogue of Proposition 92, which states that every symplectic matrix can be written as the product of two free symplectic matrices.

Proposition 96. (i) Every $\hat{S} \in \mathrm{Mp}(n)$ can be written as a product of exactly two quadratic Fourier transforms: $\hat{S} = \hat{S}_{W,m}\hat{S}_{W',m'}$. (ii) The metaplectic group $\mathrm{Mp}(n)$ is generated by the operators $\hat{V}_{-P}$, $\hat{M}_{L,m}$, and $\hat{J}$.

Proof. (i) The result follows from the existence of a natural projection $\mathrm{Mp}(n) \to \mathrm{Sp}(n)$, which will be established in the next section, and the fact that every $S \in \mathrm{Sp}(n)$ can be written as a product $S_W S_{W'}$. (See the Comments at the end of the chapter.) (ii) It follows from the definition of $\mathrm{Mp}(n)$, together with the fact that each $\hat{S}_{W,m}$ is a product $\hat{V}_{-P}\hat{M}_{L,m}\hat{J}\hat{V}_{-Q}$ (formula (8.14)).

The factorization $\hat{S} = \hat{S}_{W,m}\hat{S}_{W',m'}$, however, is not unique. For instance, the identity operator can be written $\hat{S}_{W,m}\hat{S}_{W^*,m^*}$ for every generating function $W$.
8.3 The projection $\pi^{\mathrm{Mp}}$

8.3.1 Definition of the covering projection
We are going to show that $\mathrm{Mp}(n)$ is a double covering group of $\mathrm{Sp}(n)$ by defining a covering mapping $\pi^{\mathrm{Mp}} : \mathrm{Mp}(n) \to \mathrm{Sp}(n)$.

Proposition 97. The mapping $\hat{S}_{W,m} \longmapsto S_W$ extends into a unique group epimorphism $\pi^{\mathrm{Mp}} : \mathrm{Mp}(n) \to \mathrm{Sp}(n)$, with kernel $\ker(\pi^{\mathrm{Mp}}) = \{-I, +I\}$ and satisfying
\[
\pi^{\mathrm{Mp}}(\hat{J}) = J, \qquad \pi^{\mathrm{Mp}}(\hat{M}_{L,m}) = M_L, \qquad \pi^{\mathrm{Mp}}(\hat{V}_P) = V_P. \tag{8.16}
\]

Proof. The construction of $\pi^{\mathrm{Mp}}$ will be given later; let us prove the formulas (8.16) assuming that $\pi^{\mathrm{Mp}}(\hat{S}\hat{S}') = \pi^{\mathrm{Mp}}(\hat{S})\pi^{\mathrm{Mp}}(\hat{S}')$. The formula $\pi^{\mathrm{Mp}}(\hat{J}) = J$ is obvious since $\hat{J}$ is a quadratic Fourier transform with $W(x,x') = -x \cdot x'$. We have $\hat{M}_{L,m} = \hat{J}^{-1}(\hat{J}\hat{M}_{L,m})$ and $\hat{J}\hat{M}_{L,m} = \hat{S}_{W,m}$ with $W(x,x') = -(L^T)^{-1}x \cdot x'$. Since $\pi^{\mathrm{Mp}}(\hat{J}^{-1}) = J^{-1} = -J$, it follows that
\[
\pi^{\mathrm{Mp}}(\hat{M}_{L,m}) = \pi^{\mathrm{Mp}}(\hat{J}^{-1})\, \pi^{\mathrm{Mp}}(\hat{J}\hat{M}_{L,m}) = M_L.
\]
The formula $\pi^{\mathrm{Mp}}(\hat{V}_P) = V_P$ is proven using a similar argument.

8.3.2 Construction of $\pi^{\mathrm{Mp}}$
Let us now give an explicit construction of the covering projection $\pi^{\mathrm{Mp}}$. For notational simplicity, we take $\hbar = 1$. Let us denote the elements of the dual $(\mathbb{R}^{2n})^*$ of $\mathbb{R}^{2n}$ by the Latin letters $a$, $b$, $c$, etc. Thus $a(z) = a(x,p)$ is the value of the linear form $a$ at the point $z = (x,p)$. To every $a \in (\mathbb{R}^{2n})^*$, we associate a first-order linear partial differential operator $A$ obtained by formally replacing $p$ in $a(x,p)$ by $D_x$:
\[
A = a(x,D_x), \qquad D_x = -i\partial_x ;
\]
thus, if $a(x,p) = \alpha \cdot x + \beta \cdot p$ for $\alpha = (\alpha_1,\dots,\alpha_n)$, $\beta = (\beta_1,\dots,\beta_n)$ in $\mathbb{R}^n$, then
\[
A = \alpha \cdot x + \beta \cdot D_x = \alpha \cdot x - i\beta \cdot \partial_x. \tag{8.17}
\]
Obviously the sum of two operators of this type is an operator of the same type, and so is the product of such an operator by a scalar. It follows that these operators form a $2n$-dimensional vector space, which we denote by $\mathrm{Diff}^{(1)}(n)$. The vector spaces $\mathbb{R}^{2n}$, $(\mathbb{R}^{2n})^*$ and $\mathrm{Diff}^{(1)}(n)$ are isomorphic since they all have the same dimension $2n$. The following result explicitly describes three canonical isomorphisms between these spaces:
Lemma 98. (i) The linear mappings
\[
\varphi_1 : \mathbb{R}^{2n} \to (\mathbb{R}^{2n})^*, \quad \varphi_1 : z_0 \longmapsto a
\]
\[
\varphi_2 : (\mathbb{R}^{2n})^* \to \mathrm{Diff}^{(1)}(n), \quad \varphi_2 : a \longmapsto A
\]
where $a$ is the unique linear form on $\mathbb{R}^{2n}$ such that $a(z) = \sigma(z,z_0)$, are isomorphisms, hence so is their composition
\[
\varphi = \varphi_2 \circ \varphi_1 : \mathbb{R}^{2n} \to \mathrm{Diff}^{(1)}(n);
\]
the latter associates with $z_0 = (x_0,p_0)$ the operator
\[
A = \varphi(z_0) = p_0 \cdot x - x_0 \cdot D_x.
\]
(ii) Let $[A,B] = AB - BA$ be the commutator of $A, B \in \mathrm{Diff}^{(1)}(n)$; we have
\[
[\varphi(z_1), \varphi(z_2)] = -i\sigma(z_1,z_2) \tag{8.18}
\]
for all $z_1, z_2 \in \mathbb{R}^{2n}$.

Proof. (i) Since the vector spaces $\mathbb{R}^{2n}$, $(\mathbb{R}^{2n})^*$, and $\mathrm{Diff}^{(1)}(n)$ have the same dimension, it suffices to show that $\ker(\varphi_1)$ and $\ker(\varphi_2)$ are zero. Now, $\varphi_1(z_0) = 0$ is equivalent to the condition $\sigma(z,z_0) = 0$ for all $z$, and hence to $z_0 = 0$ since a symplectic form is non-degenerate. If $\varphi_2(a) = 0$, then
\[
A\psi = \varphi_2(a)\psi = 0 \quad\text{for all } \psi \in \mathcal{S}(\mathbb{R}^n)
\]
which implies $A = 0$ and thus $a = 0$. (ii) Let $z_1 = (x_1,p_1)$, $z_2 = (x_2,p_2)$. We have
\[
\varphi(z_1) = p_1 \cdot x - x_1 \cdot D_x, \qquad \varphi(z_2) = p_2 \cdot x - x_2 \cdot D_x
\]
and hence
\[
[\varphi(z_1), \varphi(z_2)] = i(x_1 \cdot p_2 - x_2 \cdot p_1),
\]
which is precisely the commutation formula (8.18).

We are next going to show that the metaplectic group $\mathrm{Mp}(n)$ acts by conjugation on $\mathrm{Diff}^{(1)}(n)$.

Lemma 99. For $z_0 = (x_0,p_0) \in \mathbb{R}^{2n}$, define $\hat{A} \in \mathrm{Diff}^{(1)}(n)$ by
\[
\hat{A} = \varphi(z_0) = p_0 \cdot x - x_0 \cdot D_x.
\]
(i) We have:
\[
\hat{J}\hat{A}\hat{J}^{-1} = -x_0 \cdot x - p_0 \cdot D_x = \varphi(Jz_0) \tag{8.19}
\]
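\[
\hat{M}_{L,m}\hat{A}(\hat{M}_{L,m})^{-1} = L^T p_0 \cdot x - L^{-1}x_0 \cdot D_x = \varphi(M_L z_0) \tag{8.20}
\]
\[
\hat{V}_P\hat{A}(\hat{V}_P)^{-1} = (p_0 - Px_0) \cdot x - x_0 \cdot D_x = \varphi(V_P z_0). \tag{8.21}
\]
(ii) If $\hat{A} \in \mathrm{Diff}^{(1)}(n)$ and $\hat{S} \in \mathrm{Mp}(n)$, then $\hat{S}\hat{A}\hat{S}^{-1} \in \mathrm{Diff}^{(1)}(n)$. (iii) For every $\hat{S} \in \mathrm{Mp}(n)$, the mapping
\[
\Phi_{\hat{S}} : \mathrm{Diff}^{(1)}(n) \to \mathrm{Diff}^{(1)}(n), \qquad \hat{A} \longmapsto \hat{S}\hat{A}\hat{S}^{-1}
\]
is a vector space automorphism.

Proof. (i) Using the properties of the Fourier transform, one verifies immediately that
\[
(x_0 \cdot D_x)\psi = \hat{J}^{-1}(x_0 \cdot x)\hat{J}\psi, \qquad (p_0 \cdot x)\psi = -\hat{J}^{-1}(p_0 \cdot D_x)\hat{J}\psi
\]
for $\psi \in \mathcal{S}(\mathbb{R}^n)$, hence (8.19). To prove (8.20), it suffices to remark that
\[
\hat{M}_{L,m}(p_0 \cdot x)(\hat{M}_{L,m})^{-1}\psi(x) = (p_0 \cdot Lx)\psi(x)
\]
and
\[
\hat{M}_{L,m}(x_0 \cdot D_x)(\hat{M}_{L,m})^{-1}\psi(x) = x_0 \cdot (L^{-1})^T D_x\psi(x).
\]
Let us prove formula (8.21). Recalling that by definition (with $\hbar = 1$)
\[
\hat{V}_{-P}\psi(x) = e^{\frac{i}{2}Px \cdot x}\psi(x),
\]
we have, since $P$ is symmetric,
\[
(x_0 \cdot D_x)\hat{V}_{-P}\psi(x) = \hat{V}_{-P}\big[(Px_0 \cdot x)\psi(x) + (x_0 \cdot D_x)\psi(x)\big],
\]
and hence
\[
\hat{V}_P\hat{A}(\hat{V}_{-P}\psi)(x) = [(p_0 - Px_0) \cdot x]\psi(x) - (x_0 \cdot D_x)\psi(x)
\]
which is (8.21). (ii) This property immediately follows since $\hat{S}$ is a product of operators $\hat{J}$, $\hat{M}_{L,m}$, $\hat{V}_P$. (iii) The mapping $\Phi_{\hat{S}}$ is trivially a linear mapping $\mathrm{Diff}^{(1)}(n) \to \mathrm{Diff}^{(1)}(n)$. If $\hat{B} = \hat{S}\hat{A}\hat{S}^{-1} \in \mathrm{Diff}^{(1)}(n)$, then we also have $\hat{A} = \hat{S}^{-1}\hat{B}\hat{S} \in \mathrm{Diff}^{(1)}(n)$ since $\hat{A} = \hat{S}^{-1}\hat{B}(\hat{S}^{-1})^{-1}$. It follows that $\Phi_{\hat{S}}$ is surjective and hence bijective.

Since the operators $\hat{J}$, $\hat{M}_{L,m}$, $\hat{V}_P$ generate $\mathrm{Mp}(n)$, the Lemma just stated shows that, for every $\hat{S} \in \mathrm{Mp}(n)$, there exists a linear automorphism $S$ of $\mathbb{R}^{2n}$ such that $\Phi_{\hat{S}}(\hat{A}) = \widehat{a \circ S}$, that is
\[
\Phi_{\hat{S}}(\varphi(z_0)) = \varphi(Sz_0). \tag{8.22}
\]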
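Let us show that the automorphism $S$ preserves the symplectic form. For $z, z' \in \mathbb{R}^{2n}$, we have, in view of the commutation formula (8.18),
\[
\begin{aligned}
\sigma(Sz,Sz') &= i[\varphi(Sz), \varphi(Sz')] \\
&= i[\Phi_{\hat{S}}\varphi(z), \Phi_{\hat{S}}\varphi(z')] \\
&= i[\hat{S}\varphi(z)\hat{S}^{-1}, \hat{S}\varphi(z')\hat{S}^{-1}] \\
&= i\hat{S}[\varphi(z), \varphi(z')]\hat{S}^{-1} = \sigma(z,z'),
\end{aligned}
\]
hence $S \in \mathrm{Sp}(n)$ as claimed.

We are now able to describe explicitly the natural projection of $\mathrm{Mp}(n)$ onto $\mathrm{Sp}(n)$.

Definition 100. The covering projection $\pi^{\mathrm{Mp}} : \mathrm{Mp}(n) \to \mathrm{Sp}(n)$ is the mapping which to $\hat{S} \in \mathrm{Mp}(n)$ associates the element $\pi^{\mathrm{Mp}}(\hat{S}) = S \in \mathrm{Sp}(n)$ defined by (8.22), that is
\[
S = \varphi^{-1}\Phi_{\hat{S}}\varphi. \tag{8.23}
\]

That the mapping $\pi^{\mathrm{Mp}}$ indeed is a covering mapping follows from:

Proposition 101. (i) The mapping $\pi^{\mathrm{Mp}}$ is a continuous group epimorphism $\mathrm{Mp}(n) \to \mathrm{Sp}(n)$, such that
\[
\pi^{\mathrm{Mp}}(\hat{J}) = J, \qquad \pi^{\mathrm{Mp}}(\hat{M}_{L,m}) = M_L, \qquad \pi^{\mathrm{Mp}}(\hat{V}_P) = V_P, \tag{8.24}
\]
and hence
\[
\pi^{\mathrm{Mp}}(\hat{S}_{W,m}) = S_W. \tag{8.25}
\]
(ii) We have $\ker(\pi^{\mathrm{Mp}}) = \{-I, +I\}$; hence $\pi^{\mathrm{Mp}} : \mathrm{Mp}(n) \to \mathrm{Sp}(n)$ is a two-fold covering map.

Proof. (i) Let us first show that $\pi^{\mathrm{Mp}}$ is a group homomorphism. In view of the obvious identity $\Phi_{\hat{S}}\Phi_{\hat{S}'} = \Phi_{\hat{S}\hat{S}'}$ we have
\[
\begin{aligned}
\pi^{\mathrm{Mp}}(\hat{S}\hat{S}') &= \varphi^{-1}\Phi_{\hat{S}\hat{S}'}\varphi \\
&= (\varphi^{-1}\Phi_{\hat{S}}\varphi)(\varphi^{-1}\Phi_{\hat{S}'}\varphi) \\
&= \pi^{\mathrm{Mp}}(\hat{S})\, \pi^{\mathrm{Mp}}(\hat{S}').
\end{aligned}
\]
Let us next prove that $\pi^{\mathrm{Mp}}$ is surjective. Recall that the matrices $J$,
\[
M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix} \quad\text{and}\quad V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}
\]
generate $\mathrm{Sp}(n)$ when $L$ and $P$ range over, respectively, the invertible and symmetric real matrices of order $n$. It is thus sufficient to show that formulae (8.24) hold. Now, using (8.19), (8.20), and (8.21), we have
\[
\varphi^{-1}\Phi_{\hat{J}}\varphi = J, \qquad \varphi^{-1}\Phi_{\hat{M}_{L,m}}\varphi = M_L, \qquad \varphi^{-1}\Phi_{\hat{V}_{-P}}\varphi = V_{-P},
\]
hence (8.16). Formula (8.25) follows since every quadratic Fourier transform $\hat{S}_{W,m}$ can be factorized as
\[
\hat{S}_{W,m} = \hat{V}_{-P}\, \hat{M}_{L,m}\, \hat{J}\, \hat{V}_{-Q}
\]
in view of formula (8.14). To establish the continuity of the mapping $\pi^{\mathrm{Mp}}$, we first remark that the isomorphism $\varphi : \mathbb{R}^{2n} \to \mathrm{Diff}^{(1)}(n)$ defined in Lemma 98 is trivially continuous, and so is its inverse. Since $\Phi_{\hat{S}\hat{S}'} = \Phi_{\hat{S}}\Phi_{\hat{S}'}$ it suffices to show that for every $\hat{A} \in \mathrm{Diff}^{(1)}(n)$, $\Phi_{\hat{S}}(\hat{A})$ has $\hat{A}$ as limit when $\hat{S} \to \hat{I}$ in $\mathrm{Mp}(n)$. Now, $\mathrm{Mp}(n)$ is a group of continuous automorphisms of $\mathcal{S}(\mathbb{R}^n)$; hence, when $\hat{S} \to \hat{I}$, then $\hat{S}^{-1}\psi \to \psi$ for every $\psi \in \mathcal{S}(\mathbb{R}^n)$, that is $\hat{A}\hat{S}^{-1}\psi \to \hat{A}\psi$ and also $\hat{S}\hat{A}\hat{S}^{-1}\psi \to \hat{A}\psi$. (ii) Suppose that $\varphi^{-1}\Phi_{\hat{S}}\varphi = I$. Then $\hat{S}\hat{A}\hat{S}^{-1} = \hat{A}$ for every $\hat{A} \in \mathrm{Diff}^{(1)}(n)$, and this is only possible if $\hat{S}$ is multiplication by a constant $c$ with $|c| = 1$, and thus $\ker(\pi^{\mathrm{Mp}}) \subset S^1$. In view of Proposition 96, we have $\hat{S} = \hat{S}_{W,m}\hat{S}_{W',m'}$ for some choice of $(W,m)$ and $(W',m')$, hence the condition $\hat{S} \in \ker(\pi^{\mathrm{Mp}})$ is equivalent to
\[
\hat{S}_{W',m'} = c(\hat{S}_{W,m})^{-1} = c\hat{S}_{W^*,m^*}
\]
which is only possible if $c = \pm 1$, hence $\hat{S} = \pm\hat{I}$ as claimed.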
8.4 Comments and references
See Jean Leray [54], Chap. 1, or de Gosson [29]. According to Leray, the approach to the metaplectic group using the quadratic Fourier integral operators is due to Vladimir Buslaev. An excellent treatment of the metaplectic group, much in the spirit of Leray, is given by Hans Reiter [63]. For other explicit formulas giving the action of the metaplectic group on Gaussians, see Folland [28].

9 The property of symplectic covariance
The property of symplectic covariance is of fundamental importance in harmonic analysis. This property is characteristic of Weyl quantization and considerably simplifies many proofs and the statement of many properties.

9.1 Symplectic covariance of the cross-Wigner transform

9.1.1 The Heisenberg displacement operators
The phase-space translation operators $T(z_0)$ satisfy the intertwining formula $ST(z_0)S^{-1} = T(Sz_0)$ for every $S \in \mathrm{Sp}(n)$. It is therefore perhaps not so surprising that we have a similar formula for the Heisenberg displacement operators. We are going to prove a symplectic covariance formula for these operators; this result is fundamental because it implies the symplectic covariance of the Wigner transform and Weyl calculus.

Proposition 102. Let $\hat{S} \in \mathrm{Mp}(n)$ and $S = \pi^{\mathrm{Mp}}(\hat{S})$. We have
\[
\hat{S}\hat{D}(z_0)\hat{S}^{-1} = \hat{D}(Sz_0) \tag{9.1}
\]
for every $z_0 \in \mathbb{R}^{2n}$.

Proof. It is sufficient to assume that $\hat{S}$ is a quadratic Fourier transform $\hat{S}_{W,m}$ since every $\hat{S} \in \mathrm{Mp}(n)$ is a product of two such operators. Suppose indeed that we have shown that
\[
\hat{D}(S_W z_0) = \hat{S}_{W,m}\hat{D}(z_0)\hat{S}_{W,m}^{-1} ; \tag{9.2}
\]
writing an arbitrary element $\hat{S}$ of $\mathrm{Mp}(n)$ as a product $\hat{S}_{W,m}\hat{S}_{W',m'}$ we have
\[
\begin{aligned}
\hat{D}(Sz_0) &= \hat{D}(S_W S_{W'} z_0) \\
&= \hat{S}_{W,m}\hat{D}(S_{W'}z_0)\hat{S}_{W,m}^{-1} \\
&= \hat{S}_{W,m}(\hat{S}_{W',m'}\hat{D}(z_0)\hat{S}_{W',m'}^{-1})\hat{S}_{W,m}^{-1} \\
&= \hat{S}_{W,m}\hat{S}_{W',m'}\hat{D}(z_0)(\hat{S}_{W,m}\hat{S}_{W',m'})^{-1} \\
&= \hat{S}\hat{D}(z_0)\hat{S}^{-1}.
\end{aligned}
\]
Let us set out to prove that
\[
\hat{D}(z_0)\hat{S}_{W,m} = \hat{S}_{W,m}\hat{D}(S_W^{-1}z_0). \tag{9.3}
\]
For $\psi \in \mathcal{S}(\mathbb{R}^n)$ set $g(x) = \hat{D}(z_0)\hat{S}_{W,m}\psi(x)$.
By definition of $\hat{S}_{W,m}$ and $\hat{D}(z_0)$, we have
\[
g(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W)\, e^{-\frac{i}{2\hbar}p_0 \cdot x_0} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}(W(x - x_0, x') + p_0 \cdot x)}\psi(x')\, dx'.
\]
It is straightforward to verify that the function
\[
W_0(x,x') = W(x - x_0, x') + p_0 \cdot x \tag{9.4}
\]
is a generating function of the free affine symplectomorphism $T(z_0)S_W$, that is,
\[
(x,p) = T(z_0)S_W(x',p') \iff \begin{cases} p = \partial_x W_0(x,x') \\ p' = -\partial_{x'} W_0(x,x'), \end{cases}
\]
hence we have just shown that
\[
\hat{D}(z_0)\hat{S}_{W,m} = e^{-\frac{i}{2\hbar}p_0 \cdot x_0}\hat{S}_{W_0,m} \tag{9.5}
\]
where $\hat{S}_{W_0,m}$ is one of the metaplectic operators associated to $W_0$. Let us now set
\[
h(x) = \hat{S}_{W,m}\hat{D}(S_W^{-1}z_0)\psi(x) \quad\text{and}\quad (x_0',p_0') = S_W^{-1}(x_0,p_0);
\]
we have
\[
h(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(x,x')}\, e^{-\frac{i}{2\hbar}p_0' \cdot x_0'}\, e^{\frac{i}{\hbar}p_0' \cdot x'}\, \psi(x' - x_0')\, dx'
\]
that is, performing the change of variables $x' \longmapsto x' + x_0'$:
\[
h(x) = \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(x,x' + x_0')}\, e^{\frac{i}{2\hbar}p_0' \cdot x_0'}\, e^{\frac{i}{\hbar}p_0' \cdot x'}\, \psi(x')\, dx'.
\]
We will thus have $h(x) = g(x)$ as claimed, if we show that
\[
W(x,x' + x_0') + \tfrac{1}{2}p_0' \cdot x_0' + p_0' \cdot x' = W_0(x,x') - \tfrac{1}{2}p_0 \cdot x_0,
\]
that is
\[
W(x,x' + x_0') + \tfrac{1}{2}p_0' \cdot x_0' + p_0' \cdot x' = W(x - x_0, x') + p_0 \cdot x - \tfrac{1}{2}p_0 \cdot x_0.
\]
Replacing $x$ by $x + x_0$, this amounts to proving that
\[
W(x + x_0, x' + x_0') + \tfrac{1}{2}p_0' \cdot x_0' + p_0' \cdot x' = W(x,x') + \tfrac{1}{2}p_0 \cdot x_0 + p_0 \cdot x,
\]
which is straightforward to verify by a direct calculation.
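The direct calculation closing the proof can be spot-checked for $n = 1$. The sketch below is illustrative only: it assumes a quadratic generating function $W(x,x') = \tfrac12 Px^2 - Lxx' + \tfrac12 Qx'^2$ with arbitrary sample coefficients, writes $(x_0', p_0')$ for the point mapped to $(x_0,p_0)$ by $S_W$, and compares $W(x + x_0, x' + x_0') + \tfrac12 p_0' x_0' + p_0' x'$ with $W(x,x') + \tfrac12 p_0 x_0 + p_0 x$ on a small grid using exact rationals.

```python
from fractions import Fraction

# Arbitrary sample coefficients of the generating function (L must be nonzero).
P, L, Q = Fraction(3), Fraction(2), Fraction(5)

def W(x, x_prime):
    """Quadratic generating function W(x, x') for n = 1."""
    return P * x * x / 2 - L * x * x_prime + Q * x_prime * x_prime / 2

def apply_S_W(x_prime, p_prime):
    """(x, p) = S_W (x', p'): solve p' = -dW/dx' for x, then set p = dW/dx."""
    x = (p_prime + Q * x_prime) / L
    p = P * x - L * x_prime
    return x, p

# (x0', p0') is a hypothetical sample point; (x0, p0) is its image under S_W.
x0_prime, p0_prime = Fraction(1, 2), Fraction(-4, 3)
x0, p0 = apply_S_W(x0_prime, p0_prime)

def lhs(x, x_prime):
    return (W(x + x0, x_prime + x0_prime)
            + p0_prime * x0_prime / 2 + p0_prime * x_prime)

def rhs(x, x_prime):
    return W(x, x_prime) + p0 * x0 / 2 + p0 * x

samples = [(Fraction(a), Fraction(b, 3)) for a in range(-2, 3) for b in range(-2, 3)]
identity_holds = all(lhs(x, xp) == rhs(x, xp) for x, xp in samples)
```

The check rests on two facts: the mixed terms produced by expanding $W(x + x_0, x' + x_0')$ are $p_0 x - p_0' x'$, and Euler's identity for the homogeneous quadratic $W$ gives $W(x_0,x_0') = \tfrac12(p_0 x_0 - p_0' x_0')$.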
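9.1.2 The Grossmann–Royer reflections
Recall that the Grossmann–Royer reflection operator, which is explicitly given by the formula
\[
\hat{R}(z_0)\psi(x) = e^{\frac{2i}{\hbar}p_0 \cdot (x - x_0)}\psi(2x_0 - x),
\]
was defined in terms of the displacement operator as
\[
\hat{R}(z_0) = \hat{D}(z_0)\hat{R}\hat{D}(z_0)^{-1}, \qquad \hat{R}\psi(x) = \psi(-x). \tag{9.6}
\]
Proposition 102 implies:

Corollary 103. Let $\hat{S} \in \mathrm{Mp}(n)$ and $S = \pi^{\mathrm{Mp}}(\hat{S})$. We have
\[
\hat{S}\hat{R}(z_0)\hat{S}^{-1} = \hat{R}(Sz_0). \tag{9.7}
\]

Proof. Applying the symplectic covariance formula (9.1) to (9.6), we get, since $\hat{D}(z_0)^{-1} = \hat{D}(-z_0)$,
\[
\begin{aligned}
\hat{S}\hat{R}(z_0)\hat{S}^{-1} &= \hat{S}\hat{D}(z_0)\hat{R}\hat{D}(-z_0)\hat{S}^{-1} \\
&= (\hat{S}\hat{D}(z_0)\hat{S}^{-1})\, \hat{S}\hat{R}\hat{S}^{-1}\, (\hat{S}\hat{D}(-z_0)\hat{S}^{-1}) \\
&= \hat{D}(Sz_0)(\hat{S}\hat{R}\hat{S}^{-1})\hat{D}(-Sz_0).
\end{aligned}
\]
Formula (9.7) now follows since $\hat{S}\hat{R}\hat{S}^{-1} = \hat{R}$, because metaplectic operators commute with reflections. It suffices to prove this claim when $\hat{S} = \hat{S}_{W,m}$: for $\psi \in \mathcal{S}(\mathbb{R}^n)$, we have
\[
\begin{aligned}
\hat{R}\hat{S}_{W,m}\psi(x) &= \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(-x,x')}\psi(x')\, dx' \\
&= \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(-x,-x')}\psi(-x')\, dx' \\
&= \left(\frac{1}{2\pi i\hbar}\right)^{n/2} \Delta(W) \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}W(x,x')}\hat{R}\psi(x')\, dx'
\end{aligned}
\]
since $W(-x,-x') = W(x,x')$; the last expression is $\hat{S}_{W,m}\hat{R}\psi(x)$.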
9.1.3 Symplectic covariance of $W(\psi,\phi)$ and $A(\psi,\phi)$
From the results just stated, we immediately get the desired symplectic covariance result for the cross-Wigner and ambiguity functions:

Proposition 104. Let $\hat{S} \in \mathrm{Mp}(n)$, $S = \pi^{\mathrm{Mp}}(\hat{S})$, and $(\psi,\phi) \in L^2(\mathbb{R}^n) \times L^2(\mathbb{R}^n)$. We have
\[
W(\hat{S}\psi, \hat{S}\phi) = W(\psi,\phi) \circ S^{-1} \quad\text{and}\quad A(\hat{S}\psi, \hat{S}\phi) = A(\psi,\phi) \circ S^{-1}.
\]

Proof. The cross-Wigner and cross-ambiguity functions are defined in terms of the reflection and displacement operators as (formulas (3.1) and (3.18))
\[
W(\psi,\phi)(z) = \left(\frac{1}{\pi\hbar}\right)^{n} (\hat{R}(z)\psi|\phi)_{L^2}, \qquad A(\psi,\phi)(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} (\psi|\hat{D}(z)\phi)_{L^2}.
\]
We have
\[
W(\hat{S}\psi, \hat{S}\phi) = \left(\frac{1}{\pi\hbar}\right)^{n} (\hat{R}(\cdot)\hat{S}\psi|\hat{S}\phi)_{L^2},
\]
that is, taking the unitarity of $\hat{S}$ into account and using formula (9.7),
\[
\begin{aligned}
W(\hat{S}\psi, \hat{S}\phi)(z) &= \left(\frac{1}{\pi\hbar}\right)^{n} (\hat{S}^{-1}\hat{R}(z)\hat{S}\psi|\phi)_{L^2} \\
&= \left(\frac{1}{\pi\hbar}\right)^{n} (\hat{R}(S^{-1}z)\psi|\phi)_{L^2} \\
&= W(\psi,\phi)(S^{-1}z).
\end{aligned}
\]
The symplectic covariance formula for the cross-ambiguity function is proven similarly, using (9.1).
9.2 Application: the action of Mp(n) on Gaussians

9.2.1 A first result
An immediate consequence of the symplectic covariance of the Wigner transform is the following very useful result, which allows one to study the action of the metaplectic group on Gaussians up to a phase factor:

Proposition 105. Let $\hat{S} \in \mathrm{Mp}(n)$ have projection $S = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ on $\mathrm{Sp}(n)$. Let
\[
\phi_0(x) = (\pi\hbar)^{-n/4} e^{-|x|^2/2\hbar}
\]
be the standard Gaussian on $\mathbb{R}^n$. We have
\[
\hat{S}\phi_0(x) = e^{i\gamma}\left(\frac{1}{\pi\hbar}\right)^{n/4} (\det X)^{1/4}\, e^{-\frac{1}{2\hbar}(X + iY)x \cdot x} \tag{9.8}
\]
where the phase $\gamma$ is a real constant and $X$ and $Y$ are the real symmetric matrices
\[
X = (AA^T + BB^T)^{-1} \tag{9.9}
\]
\[
Y = -(CA^T + DB^T)(AA^T + BB^T)^{-1}. \tag{9.10}
\]

Proof. We have $W(\hat{S}\phi_0)(z) = W\phi_0(S^{-1}z)$. Since
\[
W\phi_0(z) = \left(\frac{1}{\pi\hbar}\right)^{n} e^{-\frac{1}{\hbar}|z|^2},
\]
we have
\[
W(\hat{S}\phi_0)(z) = \left(\frac{1}{\pi\hbar}\right)^{n} e^{-\frac{1}{\hbar}(S^{-1})^T S^{-1}z \cdot z}. \tag{9.11}
\]
The inverse of $S$ being given by
\[
S^{-1} = \begin{pmatrix} D^T & -B^T \\ -C^T & A^T \end{pmatrix},
\]
an immediate calculation yields
\[
(S^{-1})^T S^{-1} = \begin{pmatrix} CC^T + DD^T & -DB^T - CA^T \\ -BD^T - AC^T & AA^T + BB^T \end{pmatrix}.
\]
Comparison with formulas (4.8) and (4.9) yields
\[
X + YX^{-1}Y = CC^T + DD^T, \qquad YX^{-1} = -DB^T - CA^T
\]
\[
X^{-1}Y = -BD^T - AC^T, \qquad X^{-1} = AA^T + BB^T.
\]
Solving this system of matrix equations yields the solutions (9.9) and (9.10).

9.2.2 Pre-Iwasawa factorization of a symplectic matrix
Recall that every free symplectic matrix
\[
S = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{Sp}(n), \qquad \det B \neq 0
\]
can be factorized as $S = V_{-P}M_L J V_{-Q}$ where
\[
V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}, \qquad M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix}.
\]
The pre-Iwasawa factorization is a variant of this result; it states that every $S \in \mathrm{Sp}(n)$ can be written as a product of an element of a certain subgroup $\mathrm{Sp}_0(n)$ of $\mathrm{Sp}(n)$ and of a symplectic rotation $U \in U(n)$.

Definition 106. We will denote by $\mathrm{Sp}_0(n)$ the subgroup of $\mathrm{Sp}(n)$ generated by the symplectic matrices $V_{-P}$ and $M_L$. The subgroup $\mathrm{Sp}_0(n)$ is called the local symplectic group. The subgroup $\mathrm{Mp}_0(n) = (\pi^{\mathrm{Mp}})^{-1}(\mathrm{Sp}_0(n))$ of $\mathrm{Mp}(n)$ is called the local metaplectic group.

The group $\mathrm{Sp}_0(n)$ is the isotropy subgroup (= stabilizer) of the subspace $0 \oplus \mathbb{R}^n$ of $\mathbb{R}^{2n}$. It thus consists of all symplectic block matrices with upper corner $B = 0$, that is of matrices of the type
\[
V_{-P}M_L = \begin{pmatrix} L^{-1} & 0 \\ PL^{-1} & L^T \end{pmatrix}, \qquad M_L V_{-P} = \begin{pmatrix} L^{-1} & 0 \\ L^T P & L^T \end{pmatrix}.
\]
The local metaplectic group $\mathrm{Mp}_0(n)$ consists of all products $\hat{V}_{-P}\hat{M}_{L,m}$ (or $\hat{M}_{L,m}\hat{V}_{-P}$); they are local operators (hence the terminology) in the sense that they do not increase the supports of the functions (or distributions) to which they are applied; intuitively, local operators do not contain any Fourier transform.

Proposition 107 (Iwasawa factorization). For every $S \in \mathrm{Sp}(n)$, there exist a unique $S_0 \in \mathrm{Sp}_0(n)$ and a unique $U \in U(n)$ such that $S = S_0 U$; more precisely, there exist unique matrices $P = P^T$ and $L = L^T > 0$ such that
\[
S_0 = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}\begin{pmatrix} L^{-1} & 0 \\ 0 & L \end{pmatrix}, \qquad U = \begin{pmatrix} X & Y \\ -Y & X \end{pmatrix}
\]
where $P$, $L$, $X$, $Y$ are given by:
\[
P = (CA^T + DB^T)(AA^T + BB^T)^{-1} = P^T \tag{9.12}
\]
\[
L = (AA^T + BB^T)^{-1/2} = L^T > 0 \tag{9.13}
\]
\[
X = (AA^T + BB^T)^{-1/2}A, \qquad Y = (AA^T + BB^T)^{-1/2}B. \tag{9.14}
\]

Proof. It is purely computational: one writes
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}\begin{pmatrix} L^{-1} & 0 \\ 0 & L \end{pmatrix}\begin{pmatrix} X & Y \\ -Y & X \end{pmatrix} \tag{9.15}
\]
and determines the matrices $P$, $L$, $X$, $Y$ in terms of $A$, $B$, $C$, $D$.
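The computation behind this factorization can be followed in the scalar case $n = 1$, where $S$ has real entries $a, b, c, d$ with $ad - bc = 1$ and the matrix $AA^T + BB^T$ reduces to $a^2 + b^2$. The sketch below is only an illustration under these assumptions; the function names `pre_iwasawa` and `rebuild` are hypothetical. It evaluates the expressions for $P$, $L$, $X$, $Y$ and multiplies the three factors back together.

```python
import math

def pre_iwasawa(a, b, c, d):
    """Return (P, L, X, Y) for S = [[a, b], [c, d]] with a*d - b*c = 1 (n = 1)."""
    r2 = a * a + b * b              # A A^T + B B^T in the scalar case
    P = (c * a + d * b) / r2
    L = 1.0 / math.sqrt(r2)
    X, Y = L * a, L * b
    return P, L, X, Y

def rebuild(P, L, X, Y):
    """Multiply V_{-P} M_L U back together."""
    s0 = [[1.0 / L, 0.0], [P / L, L]]   # V_{-P} M_L for n = 1
    u = [[X, Y], [-Y, X]]               # symplectic rotation
    return [[sum(s0[i][k] * u[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c, d = 2.0, 3.0, 1.0, 2.0         # sample matrix with determinant 1
P, L, X, Y = pre_iwasawa(a, b, c, d)
S_rebuilt = rebuild(P, L, X, Y)
```

In this example $X^2 + Y^2 = 1$, so $U$ is a rotation, and the rebuilt matrix agrees with $S$ up to rounding; note that, unlike the factorization of free symplectic matrices, no condition $b \neq 0$ is needed.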
9.2.3 Application to Gaussians

Let us return to formula (9.11),
$$W(\hat S\phi_0)(z) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{1}{\hbar}(S^{-1})^T S^{-1} z\cdot z}, \tag{9.16}$$
giving the Wigner transform of $\hat S \in \operatorname{Mp}(n)$ applied to the standard Gaussian
$$\phi_0(x) = (\pi\hbar)^{-n/4} e^{-|x|^2/2\hbar}.$$
Let us focus on the product $(S^{-1})^T S^{-1} = (SS^T)^{-1}$. If we replace $S$ with any other symplectic matrix $S'$ such that $S'S'^T = SS^T$, we can replace $W(\hat S\phi_0)$ with $W(\hat S'\phi_0)$, and the functions $\hat S\phi_0$ and $\hat S'\phi_0$ will thus be the same (up to an innocuous factor with modulus one). If we now use the Iwasawa factorization $S = S_0 U$, we have $SS^T = S_0 S_0^T$. Since
$$S_0 = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}\begin{pmatrix} L^{-1} & 0 \\ 0 & L \end{pmatrix} = \begin{pmatrix} L^{-1} & 0 \\ PL^{-1} & L \end{pmatrix},$$
we have
$$S_0 S_0^T = \begin{pmatrix} L^{-2} & L^{-2}P \\ PL^{-2} & PL^{-2}P + L^2 \end{pmatrix}.$$
More interesting is that we can directly obtain $\hat S\phi_0$ (up to a factor with modulus one) using this procedure: we have $\hat S\phi_0 = \hat S_0\phi_0$ where $\hat S_0 = \hat V_{-P}\hat M_{L,m}$, that is, explicitly,
$$\hat S_0\phi_0(x) = e^{\frac{i}{2\hbar}Px\cdot x}\, i^m \sqrt{|\det L|}\,\phi_0(Lx).$$
Since $L > 0$, we have $\det L > 0$ and $m \in \{0, 2\}$, hence
$$\hat S_0\phi_0(x) = \pm e^{\frac{i}{2\hbar}Px\cdot x}\sqrt{\det L}\,\phi_0(Lx),$$
that is,
$$\hat S_0\phi_0(x) = \pm(\pi\hbar)^{-n/4}\sqrt{\det L}\; e^{-\frac{1}{2\hbar}(L^2 - iP)x\cdot x}$$
with
$$P = (CA^T + DB^T)(AA^T + BB^T)^{-1}, \tag{9.17}$$
$$L = (AA^T + BB^T)^{-1/2}. \tag{9.18}$$
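As a consistency check, the quadratic form $(S^{-1})^T S^{-1}$ in (9.16) can be compared with the matrix of the Wigner transform of the Gaussian $e^{-\frac{1}{2\hbar}(X+iY)x\cdot x}$, whose block form is the one quoted as (9.36) later in this chapter. The sketch below assumes NumPy and takes $X = L^2$, $Y = -P$ with $P$, $L$ given by (9.17) and (9.18); the function names are illustrative only.

```python
import numpy as np

def gaussian_parameters(S):
    """X = L^2 and Y = -P built from the blocks of a symplectic matrix S."""
    n = S.shape[0] // 2
    A, B = S[:n, :n], S[:n, n:]
    C, D = S[n:, :n], S[n:, n:]
    K = A @ A.T + B @ B.T                        # K = L^{-2}
    P = (C @ A.T + D @ B.T) @ np.linalg.inv(K)   # (9.17)
    return np.linalg.inv(K), -P                  # X = L^2 by (9.18), Y = -P

def wigner_quadratic_form(X, Y):
    """Block matrix G of (9.36) attached to the Gaussian with matrix X + iY."""
    Xi = np.linalg.inv(X)
    return np.block([[X + Y @ Xi @ Y, Y @ Xi], [Xi @ Y, Xi]])

# n = 1: every real 2x2 matrix with determinant one is symplectic.
S = np.array([[2.0, 1.0], [3.0, 2.0]])
X, Y = gaussian_parameters(S)
G = wigner_quadratic_form(X, Y)
G_expected = np.linalg.inv(S @ S.T)   # (S^{-1})^T S^{-1} from (9.16)
```

For this particular $S$ one finds $K = 5$ and $P = 8/5$ by hand, so both `G` and `G_expected` should equal the matrix with rows $(13, -8)$ and $(-8, 5)$.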
9.3 Symplectic covariance of Weyl operators

Weyl operators are the only pseudo-differential operator calculus for which symplectic covariance holds. See the Comments and References at the end of the chapter.

We have just proven that $\hat D(z_0)$ and $\hat R(z_0)$ satisfy the symplectic covariance formulas
$$\hat S\hat D(z)\hat S^{-1} = \hat D(Sz), \qquad \hat S\hat R(z)\hat S^{-1} = \hat R(Sz)$$
for every $z \in \mathbb{R}^{2n}$. Also, for all $(\psi, \phi) \in L^2(\mathbb{R}^n) \times L^2(\mathbb{R}^n)$, for $\hat S \in \operatorname{Mp}(n)$ and $S = \pi^{\mathrm{Mp}}(\hat S)$, the cross-Wigner distribution satisfies
$$W(\hat S\psi, \hat S\phi)(z) = W(\psi, \phi)(S^{-1}z) \tag{9.19}$$
and, similarly, for the cross-ambiguity function,
$$A(\hat S\psi, \hat S\phi)(z) = A(\psi, \phi)(S^{-1}z). \tag{9.20}$$
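Proposition 108. Let $S \in \operatorname{Sp}(n)$ and let $\hat S \in \operatorname{Mp}(n)$ be any of the two metaplectic operators with $S = \pi^{\mathrm{Mp}}(\hat S)$. For every Weyl operator $\operatorname{Op}_W(a)$, we have
$$\operatorname{Op}_W(a \circ S) = \hat S^{-1}\operatorname{Op}_W(a)\hat S. \tag{9.21}$$

Proof. Let us denote by $\hat B$ the Weyl operator with symbol $a \circ S$. We have
$$\hat B\psi = \int_{\mathbb{R}^{2n}} a_\sigma(Sz)\hat D(z)\psi\, dz,$$
that is, performing the change of variables $Sz \to z$ and taking into account the fact that $\det S = 1$,
$$\hat B\psi = \int_{\mathbb{R}^{2n}} a_\sigma(z)\hat D(S^{-1}z)\psi\, dz.$$
Since $\hat S^{-1}\hat D(z)\hat S = \hat D(S^{-1}z)$, we have
$$\hat B\psi = \int_{\mathbb{R}^{2n}} a_\sigma(z)\hat S^{-1}\hat D(z)\hat S\psi\, dz = \hat S^{-1}\left(\int_{\mathbb{R}^{2n}} a_\sigma(z)\hat D(z)\, dz\right)\hat S\psi,$$
which is (9.21).

Notice that this result could be proven as well using the alternative definition
$$\operatorname{Op}_W(a) = \left(\frac{1}{\pi\hbar}\right)^n \int_{\mathbb{R}^{2n}} a(z_0)\hat R(z_0)\, dz_0,$$
expressing the Weyl correspondence in terms of the Grossmann–Royer operators.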
9.4 Maximality of the symplectic group

We are going to show that the symplectic group is in a sense "almost" maximal for covariance properties. More precisely, we will show that if $M \in GL(2n, \mathbb{R})$, then, for every $\psi \in \mathcal{S}(\mathbb{R}^n)$ there exists $\psi' \in \mathcal{S}(\mathbb{R}^n)$ such that $W\psi \circ M = W\psi'$ if and only if $M$ is either symplectic or antisymplectic (i.e., $M^T J M = -J$).

9.4.1 A technical lemma

Recall that an automorphism $M$ of $\mathbb{R}^{2n}$ is antisymplectic if $\sigma(Mz, Mz') = -\sigma(z, z')$ for all $z, z' \in \mathbb{R}^{2n}$; in matrix notation, $M^T J M = -J$. Equivalently, $CM \in \operatorname{Sp}(n)$ where $C = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$.

Lemma 109. Let $M \in GL(2n, \mathbb{R})$, and assume that $M^T G M \in \operatorname{Sp}(n)$ for every
$$G = \begin{pmatrix} X & 0 \\ 0 & X^{-1} \end{pmatrix} \in \operatorname{Sp}^+(n). \tag{9.22}$$
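Then $M$ is either symplectic or antisymplectic.

Proof. We first remark that, taking $G = I$ in the condition $M^T G M \in \operatorname{Sp}(n)$, we have $M^T M \in \operatorname{Sp}(n)$. Next, we can write $M = HP$ where $H = M(M^T M)^{-1/2}$ is orthogonal and $P = (M^T M)^{1/2} \in \operatorname{Sp}^+(n)$ (polar decomposition theorem). It follows that the condition $M^T G M \in \operatorname{Sp}(n)$ is equivalent to $P(H^T G H)P \in \operatorname{Sp}(n)$; since $P$ is symplectic, so is $P^{-1}$, and hence $H^T G H \in \operatorname{Sp}(n)$ for all $G$ of the form (9.22). Let us now make the following particular choice for $G$ by taking
$$G = \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$$
with $\lambda_j > 0$ for $1 \le j \le n$. We thus have
$$H^T \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix} H \in \operatorname{Sp}^+(n)$$
for every $\Lambda$ of this form. Let $U \in U(n)$ be a symplectic rotation such that
$$H^T \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix} H = U^T \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix} U;$$
such a $U$ exists because $H^T G H \in \operatorname{Sp}^+(n)$ and the eigenvalues of $H^T G H$ are those of $G$, since $H$ is orthogonal. Setting $R = HU^T$, the previous equality is equivalent to
$$\begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix} R = R \begin{pmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{pmatrix}. \tag{9.23}$$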
Writing $R = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$, we get the conditions
$$\Lambda A = A\Lambda, \qquad \Lambda B = B\Lambda^{-1}, \qquad \Lambda^{-1}C = C\Lambda, \qquad \Lambda^{-1}D = D\Lambda^{-1}$$
for all $\Lambda$. It follows that $A$ and $D$ must themselves be diagonal: $A = \operatorname{diag}(a_1, \dots, a_n)$, $D = \operatorname{diag}(d_1, \dots, d_n)$. On the other hand, choosing $\Lambda = \lambda I$, $\lambda \neq 1$, we get $B = C = 0$. Hence, taking into account the fact that $R \in O(2n, \mathbb{R})$, we must have
$$R = \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix}, \qquad A^2 = D^2 = I. \tag{9.24}$$
Conversely, if $R$ is of the form (9.24), then (9.23) holds for any positive-definite diagonal $\Lambda$. We conclude that $M$ has to be of the form $M = RUP$ where $R$ is of the form (9.24). Since $UP \in \operatorname{Sp}(n)$, the relationship $M^T G M \in \operatorname{Sp}(n)$ implies $RGR \in \operatorname{Sp}(n)$. For each pair $i, j$ with $1 \le i < j \le n$, we now choose the following matrix $X$ in (9.22):
$$X^{(ij)} = I + \tfrac{1}{2}E^{(ij)} \tag{9.25}$$
where $E^{(ij)}$ is the symmetric matrix whose entries are all zero except the ones on the $i$-th row and $j$-th column and on the $j$-th row and $i$-th column, which are equal to one. A simple calculation then shows that
$$RGR = \begin{pmatrix} AX^{(ij)}A & 0 \\ 0 & D(X^{(ij)})^{-1}D \end{pmatrix}, \tag{9.26}$$
and if we impose the condition $RGR \in \operatorname{Sp}(n)$, we obtain
$$AX^{(ij)}AD(X^{(ij)})^{-1}D = I \iff X^{(ij)}AD = ADX^{(ij)}. \tag{9.27}$$
In other words, the matrix $AD$ commutes with every real positive-definite $n \times n$ matrix $X^{(ij)}$ of the form (9.25). Let us write $AD = \operatorname{diag}(c_1, \dots, c_n)$ with $c_j = a_j d_j$ for $1 \le j \le n$. Then if we apply (9.27) to (9.25) for $i < j$, we conclude that $c_i = c_j$, which means that the entries of the matrix $AD$ are all equal, that is, either $AD = I$ or $AD = -I$, or equivalently $A = D$ or $A = -D$. In the first case, $R$ is symplectic and so is $M$. In the second case, $R$ is antisymplectic; but then $M$ is also antisymplectic.
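The last step of the proof can be illustrated numerically for $n = 2$. The sketch below assumes NumPy; the names `G_of` and `R_of` are illustrative helpers for (9.22) and (9.24). It tests whether $RGR$ is symplectic for the three possible patterns $A = D$, $A = -D$, and a mixed choice with $AD \neq \pm I$, using the matrix $X^{(12)}$ of (9.25).

```python
import numpy as np

def J(n):
    """Standard symplectic matrix."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def is_symplectic(M):
    n = M.shape[0] // 2
    return bool(np.allclose(M.T @ J(n) @ M, J(n)))

def G_of(X):
    """G = diag(X, X^{-1}) as in (9.22)."""
    Z = np.zeros_like(X)
    return np.block([[X, Z], [Z, np.linalg.inv(X)]])

def R_of(a, d):
    """R = diag(A, D) with A = diag(a), D = diag(d), entries +-1, as in (9.24)."""
    n = len(a)
    Z = np.zeros((n, n))
    return np.block([[np.diag(a), Z], [Z, np.diag(d)]])

X12 = np.array([[1.0, 0.5], [0.5, 1.0]])    # X^{(12)} = I + E^{(12)}/2
G = G_of(X12)

candidates = {
    "A = D": R_of([1.0, -1.0], [1.0, -1.0]),
    "A = -D": R_of([1.0, -1.0], [-1.0, 1.0]),
    "mixed": R_of([1.0, -1.0], [1.0, 1.0]),  # AD = diag(1, -1)
}
results = {name: is_symplectic(R @ G @ R) for name, R in candidates.items()}
```

Only the first two choices should pass the test, which is the dichotomy $AD = \pm I$ obtained from (9.27).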
9.4.2 The maximality property for the Wigner transform

The considerations above allow us to prove the main result of this section:

Proposition 110. Let $M \in GL(2n, \mathbb{R})$. (i) Assume that $M$ is antisymplectic: $S = CM \in \operatorname{Sp}(n)$ where $C = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$; then for every $\psi \in \mathcal{S}'(\mathbb{R}^n)$,
$$W\psi(Mz) = W(\hat S^{-1}\overline{\psi})(z) \tag{9.28}$$
where $\hat S$ is any of the two elements of the metaplectic group $\operatorname{Mp}(n)$ covering $S$. (ii) Conversely, assume that for any $\psi \in \mathcal{S}(\mathbb{R}^n)$ there exists $\psi' \in \mathcal{S}(\mathbb{R}^n)$ such that
$$W\psi(Mz) = W\psi'(z). \tag{9.29}$$
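Then $M$ is either symplectic or antisymplectic.

Proof. (i) It is sufficient to assume that $\psi \in \mathcal{S}(\mathbb{R}^n)$. Writing $z = (x, p)$, we have
$$W\psi(Cz) = \left(\frac{1}{2\pi\hbar}\right)^n \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}p\cdot y}\,\psi(x + \tfrac{1}{2}y)\overline{\psi(x - \tfrac{1}{2}y)}\, dy = \left(\frac{1}{2\pi\hbar}\right)^n \int_{\mathbb{R}^n} e^{-\frac{i}{\hbar}p\cdot y}\,\psi(x - \tfrac{1}{2}y)\overline{\psi(x + \tfrac{1}{2}y)}\, dy = W\overline{\psi}(z).$$
It follows that $W\psi(Mz) = W\psi(CSz) = W\overline{\psi}(Sz)$, hence formula (9.28) by the symplectic covariance of the Wigner transform.

(ii) Choosing for $\psi$ a Gaussian
$$\psi_X(x) = \left(\frac{1}{\pi\hbar}\right)^{n/4} (\det X)^{1/4} e^{-\frac{1}{2\hbar}Xx\cdot x} \tag{9.30}$$
($X$ real symmetric and positive definite) we have
$$W\psi_X(z) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{1}{\hbar}Gz\cdot z} \tag{9.31}$$
where
$$G = \begin{pmatrix} X & 0 \\ 0 & X^{-1} \end{pmatrix} \tag{9.32}$$
is positive definite and belongs to $\operatorname{Sp}(n)$. Condition (9.29) implies that we must have
$$W\psi'(z) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{1}{\hbar}M^T G M z\cdot z}.$$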
The Wigner transform of a function is a Gaussian if and only if the function itself is a Gaussian (see [30, 31]). This can be seen in the following way: the matrix $M^T G M$ being symmetric and positive definite, we can use a Williamson diagonalization: there exists $S \in \operatorname{Sp}(n)$ such that
$$S^T(M^T G M)S = \Delta = \begin{pmatrix} \Sigma & 0 \\ 0 & \Sigma \end{pmatrix}, \tag{9.33}$$
and hence
$$W\psi'(Sz) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{1}{\hbar}\Delta z\cdot z}.$$
In view of the symplectic covariance of the Wigner transform, we have
$$W\psi'(Sz) = W\psi''(z), \qquad \psi'' = \hat S^{-1}\psi'$$
where $\hat S \in \operatorname{Mp}(n)$ is one of the two elements of the metaplectic group covering $S$. We now show that the equality
$$W\psi''(z) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{1}{\hbar}\Delta z\cdot z} \tag{9.34}$$
implies that $\psi''$ must be a Gaussian of the form (4.2), and hence $W\psi''$ must be of the type (9.31), (9.32). That $\psi''$ must be a Gaussian follows from $W\psi'' \ge 0$ and Hudson's theorem (see e.g. [28]). If $\psi''$ were of the more general type
$$\psi_{X,Y}(x) = \left(\frac{1}{\pi\hbar}\right)^{n/4} (\det X)^{1/4} e^{-\frac{1}{2\hbar}(X + iY)x\cdot x} \tag{9.35}$$
($X$, $Y$ real and symmetric, $X$ positive definite), the matrix $G$ in (9.31) would be
$$G = \begin{pmatrix} X + YX^{-1}Y & YX^{-1} \\ X^{-1}Y & X^{-1} \end{pmatrix}, \tag{9.36}$$
which is only compatible with (9.34) if $Y = 0$. In addition, due to the parity of $W\psi''$, the function $\psi''$ must be even, hence Gaussians more general than $\psi_{X,Y}$ are excluded. It follows from these considerations that we have
$$\Delta = \begin{pmatrix} \Sigma & 0 \\ 0 & \Sigma \end{pmatrix} = \begin{pmatrix} X & 0 \\ 0 & X^{-1} \end{pmatrix},$$
so that $\Sigma = \Sigma^{-1}$. Since $\Sigma > 0$, this implies that we must have $\Sigma = I$, and hence, using formula (9.33), $S^T(M^T G M)S = I$. It follows that we must have $M^T G M \in \operatorname{Sp}(n)$ for every $G = \begin{pmatrix} X & 0 \\ 0 & X^{-1} \end{pmatrix} \in \operatorname{Sp}^+(n)$. In view of Lemma 109, the matrix $M$ must then be either symplectic or antisymplectic.
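Both ingredients of this proof, the identity $W\psi(Cz) = W\overline{\psi}(z)$ and the Gaussian formula (9.31), can be tested numerically for $n = 1$. The following is a minimal sketch assuming NumPy, $\hbar = 1$, and a plain Riemann sum for the $y$-integral; the test function `psi` is an arbitrary non-real choice made for illustration.

```python
import numpy as np

HBAR = 1.0
Y_GRID = np.linspace(-20.0, 20.0, 4001)   # symmetric grid for the y-integral

def wigner(psi, x, p):
    """Riemann-sum approximation of W psi(x, p) for n = 1."""
    y = Y_GRID
    dy = y[1] - y[0]
    integrand = (np.exp(-1j * p * y / HBAR)
                 * psi(x + y / 2) * np.conj(psi(x - y / 2)))
    return integrand.sum() * dy / (2 * np.pi * HBAR)

def psi(t):
    """A non-real test function: chirped Gaussian times a linear factor."""
    return (1 + 0.3 * t) * np.exp(-(1 + 0.7j) * t**2 / (2 * HBAR))

def psi_bar(t):
    return np.conj(psi(t))

def psi_X(t, X=2.0):
    """The Gaussian (9.30) for n = 1 with X = 2."""
    return (X / (np.pi * HBAR)) ** 0.25 * np.exp(-X * t**2 / (2 * HBAR))

x, p = 0.4, -0.9
lhs = wigner(psi, x, -p)        # W psi(Cz), with Cz = (x, -p)
rhs = wigner(psi_bar, x, p)     # W conj(psi)(z)
gauss_num = wigner(psi_X, x, p)
gauss_exact = np.exp(-(2.0 * x**2 + p**2 / 2.0) / HBAR) / (np.pi * HBAR)
```

Here `lhs` and `rhs` should agree, and `gauss_num` should match the closed form (9.31) with $G = \operatorname{diag}(X, X^{-1})$, up to the quadrature error of the chosen grid.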
9.4.3 Maximal covariance of Weyl operators

The previous result implies the following maximal covariance result for Weyl operators:

Corollary 111. Let $M \in GL(2n, \mathbb{R})$. Assume that there exists a unitary operator $\hat M : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ such that
$$\hat M \operatorname{Op}_W(a)\hat M^{-1} = \operatorname{Op}_W(a \circ M^{-1}) \tag{9.37}$$
for all $a \in \mathcal{S}(\mathbb{R}^{2n})$. Then $M$ is either symplectic or antisymplectic.

Proof. Suppose that (9.37) holds; then
$$(\hat M \operatorname{Op}_W(a)\hat M^{-1}\psi|\phi)_{L^2} = \langle a \circ M^{-1}, W(\psi, \phi)\rangle = \langle a, W(\psi, \phi) \circ M\rangle.$$
On the other hand, using the unitarity of $\hat M$,
$$(\hat M \operatorname{Op}_W(a)\hat M^{-1}\psi|\phi)_{L^2} = (\operatorname{Op}_W(a)\hat M^{-1}\psi|\hat M^{-1}\phi)_{L^2} = \langle a, W(\hat M^{-1}\psi, \hat M^{-1}\phi)\rangle.$$
It follows that we must have
$$\langle a, W(\psi, \phi) \circ M\rangle = \langle a, W(\hat M^{-1}\psi, \hat M^{-1}\phi)\rangle$$
for all $(\psi, \phi) \in \mathcal{S}(\mathbb{R}^n) \times \mathcal{S}(\mathbb{R}^n)$, and hence, in particular, taking $\psi = \phi$:
$$\langle a, W\psi \circ M\rangle = \langle a, W(\hat M^{-1}\psi)\rangle$$
for all $\psi \in \mathcal{S}(\mathbb{R}^n)$. Since $a$ is arbitrary, this implies that we must have $W\psi \circ M = W(\hat M^{-1}\psi)$. In view of Proposition 110, the automorphism $M$ must be either symplectic or antisymplectic.

Summarizing:

Proposition 112. Let $a \mapsto \operatorname{Op}(a)$ be a continuous linear mapping from $\mathcal{S}'(\mathbb{R}^{2n})$ to the space $\mathcal{L}(\mathcal{S}(\mathbb{R}^n), \mathcal{S}'(\mathbb{R}^n))$. Assume that: (i) if $a$ only depends on $x \in \mathbb{R}^n$ and $a \in L^\infty(\mathbb{R}^n)$, then $\operatorname{Op}(a)$ is multiplication by $a(x)$; (ii) if $S \in \operatorname{Sp}(n)$, then $\operatorname{Op}(a \circ S) = \hat S^{-1}\operatorname{Op}(a)\hat S$. Then $a \mapsto \operatorname{Op}(a)$ is the Weyl correspondence: $\operatorname{Op}(a) = \operatorname{Op}_W(a)$.

9.4.4 Born–Jordan operators

The Born–Jordan correspondence $a \mapsto \operatorname{Op}_{BJ}(a)$ is not identical to the Weyl correspondence $a \mapsto \operatorname{Op}_W(a)$, hence we cannot expect it to enjoy full symplectic covariance
since the latter is a characteristic property of Weyl quantization. We have however the following partial result. Recall that the metaplectic group $\operatorname{Mp}(n)$ is generated by the modified Fourier transform $\hat J = i^{-n/2}F$, the multiplication operators
$$\hat V_{-P}\psi(x) = e^{\frac{i}{2\hbar}Px\cdot x}\psi(x) \qquad (P = P^T),$$
and the unitary scaling operators
$$\hat M_{L,m}\psi(x) = i^m\sqrt{|\det L|}\,\psi(Lx) \qquad (\det L \neq 0,\ m\pi = \arg\det L).$$
The projections of these operators on $\operatorname{Sp}(n)$ are, respectively, the symplectic matrix $J$ and
$$V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}, \qquad M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix}.$$
Proposition 113. Let $a \in \mathcal{S}'(\mathbb{R}^{2n})$. We have
$$\hat S\operatorname{Op}_{BJ}(a)\hat S^{-1} = \operatorname{Op}_{BJ}(a \circ S^{-1}) \tag{9.38}$$
for every $\hat S \in \operatorname{Mp}(n)$ which is a product of a finite number of operators $\hat J$ and $\hat M_{L,m}$.

Proof. It suffices to prove formula (9.38) for $\hat S = \hat J$ and $\hat S = \hat M_{L,m}$. Let first $\hat S$ be an arbitrary element of $\operatorname{Mp}(n)$; we have
$$\hat S\operatorname{Op}_{BJ}(a) = \left(\frac{1}{2\pi\hbar}\right)^n \int a_\sigma(z)\Theta(z)\hat S\hat T(z)\, d^{2n}z = \left[\left(\frac{1}{2\pi\hbar}\right)^n \int a_\sigma(z)\Theta(z)\hat T(Sz)\, d^{2n}z\right]\hat S$$
where the second equality follows from the usual symplectic covariance property $\hat S\hat T(z) = \hat T(Sz)\hat S$ of the Heisenberg operators. Making the change of variables $z' = Sz$ in the integral, we get, since $\det S = 1$,
$$\int a_\sigma(z)\Theta(z)\hat T(Sz)\, d^{2n}z = \int a_\sigma(S^{-1}z)\Theta(S^{-1}z)\hat T(z)\, d^{2n}z.$$
Now, by definition of the symplectic Fourier transform, we have
$$a_\sigma(S^{-1}z) = \left(\frac{1}{2\pi\hbar}\right)^n \int e^{-\frac{i}{\hbar}\sigma(S^{-1}z, z')}a(z')\, d^{2n}z' = (a \circ S^{-1})_\sigma(z).$$
Choosing $\hat S = \hat M_{L,m}$ we have
$$\Theta(M_L^{-1}z) = \frac{\sin(2\pi Lp\cdot(L^T)^{-1}x)}{2\pi Lp\cdot(L^T)^{-1}x} = \Theta(z);$$
similarly $\Theta(J^{-1}z) = \Theta(z)$, hence in both cases
$$\hat S\operatorname{Op}_{BJ}(a) = \left(\left(\frac{1}{2\pi\hbar}\right)^n \int (a \circ S^{-1})_\sigma(z)\Theta(z)\hat T(z)\, d^{2n}z\right)\hat S = \operatorname{Op}_{BJ}(a \circ S^{-1})\hat S,$$
whence formula (9.38).

This proof shows that the essential step consists in noting that $\Theta(S^{-1}z) = \Theta(z)$ when $S = J$ or $S = M_L$. It is clear that this property fails if one takes any operator $S = V_P$ with $P \neq 0$, so we cannot expect to have full symplectic covariance for Born–Jordan operators since the symplectic group $\operatorname{Sp}(n)$ is generated by the set of all matrices $J$, $M_L$ and $V_P$.
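The invariance of $\Theta$ that drives this proof only uses the fact that $\Theta$ is an even function of $p \cdot x$. The sketch below assumes NumPy and, as a labeled assumption, the normalization $\Theta(z) = \sin(u)/u$ with $u = p\cdot x/2\hbar$ (any other even function of $p \cdot x$ behaves the same way); the convention $V_P = \begin{pmatrix} I & 0 \\ -P & I \end{pmatrix}$ is likewise assumed.

```python
import numpy as np

HBAR = 1.0

def theta(z):
    """Assumed Born-Jordan factor sin(u)/u with u = p.x/(2*hbar), z = (x, p)."""
    n = z.size // 2
    u = z[n:] @ z[:n] / (2 * HBAR)
    return np.sinc(u / np.pi)          # np.sinc(t) = sin(pi t)/(pi t)

n = 2
I, Z = np.eye(n), np.zeros((n, n))
L = np.array([[2.0, 1.0], [0.0, 1.0]])
P = np.eye(n)
M_L = np.block([[np.linalg.inv(L), Z], [Z, L.T]])
J = np.block([[Z, I], [-I, Z]])
V_P = np.block([[I, Z], [-P, I]])

z = np.array([1.0, 0.5, 0.3, -0.2])    # z = (x, p) with p.x = 0.2
vals = {name: theta(np.linalg.inv(S) @ z)
        for name, S in [("M_L", M_L), ("J", J), ("V_P", V_P)]}
reference = theta(z)
```

With these choices `vals["M_L"]` and `vals["J"]` should coincide with `reference`, while `vals["V_P"]` differs, because $V_P^{-1}$ changes $p \cdot x$ into $p \cdot x + Px \cdot x$.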
9.5 Comments and references

The proof of the maximality results in Section 9.4 was first published in a joint work [19] with Nuno Dias and Joao Prata (the proof of Lemma 109 is due to Dias and Prata). That the Weyl transform is the only quantization that is fully symplectically covariant was proven by Stein in his treatise [68]; Stein's argument has been detailed by Wong [76].
10 The Feichtinger algebra

The Feichtinger algebra $S_0(\mathbb{R}^n)$ is a Banach algebra of functions that can be defined in terms of the Wigner transform. It is a good substitute for the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ as long as one is not interested in differentiation properties. It is moreover the smallest Banach space containing $\mathcal{S}(\mathbb{R}^n)$ and invariant under the action of the metaplectic group and of the Heisenberg displacements. The dual space $S_0'(\mathbb{R}^n)$ of $S_0(\mathbb{R}^n)$ contains many basic distributions such as the Dirac distribution $\delta$ and its translates. These properties, together with the fact that Banach spaces are mathematically easier to deal with than Fréchet spaces, make the Feichtinger algebra a tool of choice in harmonic analysis and its applications to quantum mechanics.
10.1 Definition and first properties

10.1.1 Definition of the Feichtinger algebra

In what follows, $\phi$ will be a non-zero element of $\mathcal{S}(\mathbb{R}^n)$ with $L^2$ norm equal to one, and referred to as a "window".

Definition 114. The Feichtinger algebra $S_0(\mathbb{R}^n)$ consists of all functions $\psi \in L^2(\mathbb{R}^n)$ such that $W(\psi, \phi) \in L^1(\mathbb{R}^{2n})$ for every window $\phi$. The number
$$\|\psi\|_{\phi, S_0} = \|W(\psi, \phi)\|_{L^1(\mathbb{R}^{2n})} = \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|\, dz \tag{10.1}$$
is called the norm of $\psi$ relative to the window $\phi$.

Admittedly, this definition is not very practical, because it means that we should in principle verify an uncountable set of conditions. We will see later that it actually suffices to verify that $W(\psi, \phi) \in L^1(\mathbb{R}^{2n})$ for only one window $\phi$. That $S_0(\mathbb{R}^n)$ is a vector space is clear. That $\|\cdot\|_{\phi, S_0}$ indeed is a norm on $S_0(\mathbb{R}^n)$ for each $\phi \in \mathcal{S}(\mathbb{R}^n)$, $\phi \neq 0$, is also obvious. For instance, $\|\psi\|_{\phi, S_0} = 0$ is equivalent to $W(\psi, \phi) = 0$, that is, to $\psi = 0$ since $\phi \neq 0$; the triangle inequality follows from the sesquilinearity of $W$, and so does the property $\|\lambda\psi\|_{\phi, S_0} = |\lambda|\,\|\psi\|_{\phi, S_0}$. Here is an elementary example of an element of $S_0(\mathbb{R})$: the continuous but not differentiable function
$$\psi(x) = \begin{cases} 1 - |x| & \text{if } |x| \le 1, \\ 0 & \text{if } |x| > 1. \end{cases}$$
That the definition of the Feichtinger algebra is independent of the choice of window $\phi$, as the definition suggests, is not quite obvious. Let us prove this property in detail.

https://doi.org/10.1515/9783110722772-010
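Proposition 115. (i) Let $\psi \in L^2(\mathbb{R}^n)$. We have $\psi \in S_0(\mathbb{R}^n)$ if and only if there exists one window $\phi$ such that $W(\psi, \phi) \in L^1(\mathbb{R}^{2n})$. (ii) The norms $\|\cdot\|_{\phi, S_0}$ are all equivalent when $\phi$ ranges over the windows in $\mathcal{S}(\mathbb{R}^n)$ and thus define the same topology on $S_0(\mathbb{R}^n)$.

Proof. Let us express $\psi$ in terms of $W(\psi, \phi)$ using the reconstruction formula (3.17):
$$\psi(x) = \frac{2^n}{(\gamma|\phi)_{L^2}} \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z_0)\hat R(z_0)\gamma(x)\, dz_0 \tag{10.2}$$
where $\gamma \in \mathcal{S}(\mathbb{R}^n)$ is such that $(\gamma|\phi)_{L^2} \neq 0$. Applying $\hat R(z')$ to both sides of this equality, we get, in view of the product formula (2.11) for reflection operators,
$$\hat R(z')\psi = \frac{2^n}{(\gamma|\phi)_{L^2}} \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)\, e^{\frac{2i}{\hbar}\sigma(z, z')}\hat D(2z - 2z')\gamma\, dz.$$
Let $\phi'$ be a second window; by definition of the cross-Wigner transform, we have
$$(\hat R(z')\psi|\phi')_{L^2} = (\pi\hbar)^n W(\psi, \phi')(z')$$
and by definition (3.18) of the cross-ambiguity function
$$(\hat D(2z - 2z')\gamma|\phi')_{L^2} = (2\pi\hbar)^n A(\gamma, \phi')(2z' - 2z) = (\pi\hbar)^n W(\gamma, \phi'^\vee)(z' - z)$$
where $\phi'^\vee(x) = \phi'(-x)$, the second equality following from formula (3.22) relating cross-ambiguity and cross-Wigner transforms. Formula (10.2) thus yields
$$W(\psi, \phi')(z') = \frac{2^n}{(\gamma|\phi)_{L^2}} \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)\, e^{\frac{2i}{\hbar}\sigma(z, z')} W(\gamma, \phi'^\vee)(z' - z)\, dz,$$
and hence
$$|W(\psi, \phi')(z')| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|} \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|\,|W(\gamma, \phi'^\vee)(z' - z)|\, dz,$$
that is,
$$|W(\psi, \phi')| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, |W(\psi, \phi)| * |W(\gamma, \phi'^\vee)|. \tag{10.3}$$
Integrating both sides of this inequality with respect to $z'$ yields
$$\|\psi\|_{\phi', S_0} \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, \|\psi\|_{\phi, S_0}\,\|\gamma\|_{\phi'^\vee, S_0} \tag{10.4}$$
where we have used the inequality $\|F * G\|_{L^1} \le \|F\|_{L^1}\|G\|_{L^1}$, valid for any integrable functions $F$ and $G$. Interchanging the roles of $\phi$ and $\phi'$ gives the reverse estimate, and the equivalence of the norms $\|\cdot\|_{\phi, S_0}$ and $\|\cdot\|_{\phi', S_0}$ follows.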
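The convolution inequality $\|F * G\|_{L^1} \le \|F\|_{L^1}\|G\|_{L^1}$ used in the last step can be observed on sampled functions. The following is a minimal one-dimensional sketch assuming NumPy; the two functions and the grid are arbitrary illustrative choices, and integrals are replaced by Riemann sums.

```python
import numpy as np

def l1_norm(f, h):
    """Riemann-sum approximation of the L^1 norm on a grid of spacing h."""
    return np.abs(f).sum() * h

h = 0.01
t = np.arange(-8.0, 8.0 + h / 2, h)
F = np.exp(-t**2) * np.exp(3j * t)       # oscillating integrable function
G = (1.0 - np.abs(t)).clip(min=0.0)      # triangle function, compact support
conv = np.convolve(F, G) * h             # samples of F * G on the enlarged grid

lhs = l1_norm(conv, h)
rhs = l1_norm(F, h) * l1_norm(G, h)
```

For the discretized functions the bound `lhs <= rhs` holds exactly (it is the triangle inequality applied to each sum), and the oscillating factor in `F` makes the inequality strict here.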
10.1.2 First properties

We are going to show that the window $\phi$ used in the definition of $S_0(\mathbb{R}^n)$ can itself be chosen in $S_0(\mathbb{R}^n)$. As a consequence, we can redefine the Feichtinger algebra in terms of the Wigner transform of its elements: a square integrable function $\psi$ is in $S_0(\mathbb{R}^n)$ if and only if $W\psi \in L^1(\mathbb{R}^{2n})$. To prove this important result, we will need the following technical lemma:

Lemma 116. Let $\phi \in \mathcal{S}(\mathbb{R}^n)$, $\phi \neq 0$. The following properties are equivalent: (i) $\psi \in \mathcal{S}(\mathbb{R}^n)$; (ii) $W(\psi, \phi) \in \mathcal{S}(\mathbb{R}^{2n})$; (iii) for every $N \ge 0$, there exists $C_N \ge 0$ such that
$$|W(\psi, \phi)(z)| \le C_N(1 + |z|)^{-N}.$$

Proof. It is clear, by definition of $\mathcal{S}(\mathbb{R}^n)$ and $\mathcal{S}(\mathbb{R}^{2n})$, that (i)⇒(ii)⇒(iii). Let us prove that (iii)⇒(i). It is easily verified that the function $\chi$ defined by
$$\chi(x) = \frac{2^n}{\|\phi\|_{L^2}^2} \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)\hat R(z)\phi(x)\, dz$$
is in $\mathcal{S}(\mathbb{R}^n)$; but then $\chi = \psi$ in view of the proof of Proposition 40, since (iii) implies in particular that $W(\psi, \phi) \in L^2(\mathbb{R}^{2n})$. Hence $\psi \in \mathcal{S}(\mathbb{R}^n)$ as we set out to prove.

Proposition 117. Let $(\psi, \phi) \in L^2(\mathbb{R}^n) \times L^2(\mathbb{R}^n)$. (i) If $W(\psi, \phi) \in L^1(\mathbb{R}^{2n})$, then both $\psi$ and $\phi$ are in $S_0(\mathbb{R}^n)$; (ii) we have $\psi \in S_0(\mathbb{R}^n)$ if and only if $W(\psi, \phi) \in L^1(\mathbb{R}^{2n})$ for one (and hence every) $\phi \in S_0(\mathbb{R}^n)$; (iii) a function $\psi \in L^2(\mathbb{R}^n)$ belongs to $S_0(\mathbb{R}^n)$ if and only if $W\psi \in L^1(\mathbb{R}^{2n})$.

Proof. Property (ii) immediately follows from (i). Let us prove property (i). The condition that $\psi, \phi \in L^2(\mathbb{R}^n)$ implies that $W(\psi, \phi)$ is a square-integrable and continuous function. Recall that, in the course of the proof of Proposition 115, we proved the inequality (10.3):
$$|W(\psi, \phi')| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, |W(\psi, \phi)| * |W(\gamma, \phi'^\vee)|.$$
Choosing $\gamma = \phi'^\vee$, this inequality becomes
$$|W(\psi, \phi')| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, |W(\psi, \phi)| * |W\phi'^\vee|,$$
hence, integrating both sides and using $\|F * G\|_{L^1} \le \|F\|_{L^1}\|G\|_{L^1}$,
$$\|\psi\|_{\phi', S_0} \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, \|W(\psi, \phi)\|_{L^1}\,\|W\phi'^\vee\|_{L^1} < \infty,$$
which shows that $\psi \in S_0(\mathbb{R}^n)$. Swapping $\psi$ and $\phi$, this inequality becomes
$$\|\phi\|_{\phi', S_0} \le \frac{2^n}{|(\gamma|\psi)_{L^2}|}\, \|W(\phi, \psi)\|_{L^1}\,\|W\phi'^\vee\|_{L^1} < \infty,$$
hence we also have $\phi \in S_0(\mathbb{R}^n)$. (iii) In view of (i), the condition $W\psi \in L^1(\mathbb{R}^{2n})$ implies that $\psi \in S_0(\mathbb{R}^n)$. If conversely $\psi \in S_0(\mathbb{R}^n)$, then $W\psi \in L^1(\mathbb{R}^{2n})$ in view of (ii).

The Feichtinger algebra $S_0(\mathbb{R}^n)$ is a subspace of several usual spaces of functions:

Proposition 118. We have the chain of inclusions
$$S_0(\mathbb{R}^n) \subset C^0(\mathbb{R}^n) \cap L^1(\mathbb{R}^n) \cap F(L^1(\mathbb{R}^n)). \tag{10.5}$$
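Proof. We will use again the reconstruction formula
$$\psi(x) = \frac{2^n}{(\gamma|\phi)_{L^2}} \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z_0)\hat R(z_0)\gamma(x)\, dz_0$$
which is valid for all $\gamma \in \mathcal{S}(\mathbb{R}^n)$ such that $(\gamma|\phi)_{L^2} \neq 0$. Putting $\Delta\psi(x) = \psi(x + \Delta x) - \psi(x)$, we have
$$|\Delta\psi(x)| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|} \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z_0)|\,\bigl|\hat R(z_0)(\gamma(x + \Delta x) - \gamma(x))\bigr|\, dz_0$$
$$\le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, \|W(\psi, \phi)\|_{L^1} \sup_{z_0}\bigl|\hat R(z_0)(\gamma(x + \Delta x) - \gamma(x))\bigr|$$
$$= \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, \|W(\psi, \phi)\|_{L^1} \sup_{x_0}\bigl|\gamma(2x_0 - x - \Delta x) - \gamma(2x_0 - x)\bigr|$$
where the last equality follows from the definition of $\hat R(z)$, from which it readily follows that $\lim_{\Delta x \to 0}\Delta\psi = 0$; hence $S_0(\mathbb{R}^n) \subset C^0(\mathbb{R}^n)$.

Let us next show that $S_0(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$. Let $\psi \in S_0(\mathbb{R}^n)$. Using again the reconstruction formula, we get
$$|\psi(x)| = \frac{2^n}{|(\gamma|\phi)_{L^2}|}\left|\int_{\mathbb{R}^{2n}} W(\psi, \phi)(z_0)\hat R(z_0)\gamma(x)\, dz_0\right| \le \frac{2^n}{|(\gamma|\phi)_{L^2}|} \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z_0)|\,|\gamma(2x_0 - x)|\, dz_0,$$
and hence, integrating in $x$,
$$\|\psi\|_{L^1} \le \frac{2^n}{|(\gamma|\phi)_{L^2}|}\, \|\psi\|_{\phi, S_0}\,\|\gamma\|_{L^1} < \infty$$
so that $\psi \in L^1(\mathbb{R}^n)$.

To prove the inclusion $S_0(\mathbb{R}^n) \subset F(L^1(\mathbb{R}^n))$, it suffices to note that $S_0(\mathbb{R}^n)$ is invariant under the Fourier transform: we have $\psi \in S_0(\mathbb{R}^n)$ if and only if $W(\psi, F^{-1}\phi) \in L^1(\mathbb{R}^{2n})$ for every window $\phi$, since $F$ is an automorphism $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. Now,
$$W(F^{-1}\psi, F^{-1}\phi)(z) = W(\psi, \phi)(Jz) \tag{10.6}$$
in view of the symplectic covariance property of the cross-Wigner transform, so that $F^{-1}\psi \in S_0(\mathbb{R}^n)$. It follows from the inclusion $S_0(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$ that we have $F^{-1}\psi \in L^1(\mathbb{R}^n)$, and hence $\psi \in F(L^1(\mathbb{R}^n))$ as claimed.

The space $\mathcal{S}(\mathbb{R}^n)$ is dense in $S_0(\mathbb{R}^n)$:

Proposition 119. We have $\mathcal{S}(\mathbb{R}^n) \subset S_0(\mathbb{R}^n)$, and $\mathcal{S}(\mathbb{R}^n)$ is dense in $S_0(\mathbb{R}^n)$.

Proof. Let us show that $\mathcal{S}(\mathbb{R}^n) \subset S_0(\mathbb{R}^n)$. Let $\psi \in \mathcal{S}(\mathbb{R}^n)$; for every window $\phi$, we have $W(\psi, \phi) \in \mathcal{S}(\mathbb{R}^{2n})$, hence for every $N > 0$ there exists $C_N > 0$ such that
$$|W(\psi, \phi)(z)| \le C_N(1 + |z|)^{-N}.$$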
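It follows, by definition of the norm $\|\cdot\|_{\phi, S_0}$, that
$$\|\psi\|_{\phi, S_0} \le C_N \int_{\mathbb{R}^{2n}} (1 + |z|)^{-N}\, dz,$$
and hence $\|\psi\|_{\phi, S_0} < \infty$ if we choose $N > 2n$.

To prove the density of $\mathcal{S}(\mathbb{R}^n)$ in $S_0(\mathbb{R}^n)$, let us choose an exhaustive sequence $(K_j)$ of compact subsets of $\mathbb{R}^{2n}$ (i.e., $K_j \subset \mathring{K}_{j+1}$ and $\mathbb{R}^{2n} = \bigcup_j K_j$) and set $\Psi_j = W(\psi, \phi)\chi_j$ where $\chi_j$ is a $C^\infty$ function such that $0 \le \chi_j \le 1$, equal to one on $K_j$ and supported in $K_{j+1}$. Recalling that we have, by the reconstruction formula (3.17), and since $\|\phi\|_{L^2} = 1$,
$$\psi = 2^n \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)\hat R(z)\phi\, dz,$$
we define an approximating function by
$$\psi_j = 2^n \int_{\mathbb{R}^{2n}} \Psi_j(z)\hat R(z)\phi\, dz.$$
Since $W(\psi, \phi) \in \mathcal{S}(\mathbb{R}^{2n})$, it is easy to see, using Leibniz's differentiation rule for products, that $\psi_j \in \mathcal{S}(\mathbb{R}^n)$. We thus have
$$\psi - \psi_j = 2^n \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)(1 - \chi_j(z))\hat R(z)\phi\, dz. \tag{10.7}$$
Let us show that
$$\lim_{j \to \infty} \|\psi - \psi_j\|_{\phi, S_0} = 0,$$
which will prove our assertion. By definition of the cross-Wigner transform, we have
$$W(\psi - \psi_j, \phi)(z') = \left(\frac{1}{\pi\hbar}\right)^n (\hat R(z')(\psi - \psi_j)|\phi)_{L^2(\mathbb{R}^n)} = \left(\frac{1}{\pi\hbar}\right)^n (\psi - \psi_j|\hat R(z')\phi)_{L^2(\mathbb{R}^n)}$$
where we have used the fact that $\hat R(z')^* = \hat R(z')$; using (10.7), we get
$$W(\psi - \psi_j, \phi)(z') = \left(\frac{2}{\pi\hbar}\right)^n \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)(1 - \chi_j(z))(\hat R(z)\phi|\hat R(z')\phi)_{L^2(\mathbb{R}^n)}\, dz.$$
Integrating with respect to $z'$, we get, by definition (10.1) of the norm $\|\cdot\|_{\phi, S_0}$,
$$\|\psi - \psi_j\|_{\phi, S_0} = \|W(\psi - \psi_j, \phi)\|_{L^1(\mathbb{R}^{2n})} \le \left(\frac{2}{\pi\hbar}\right)^n \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|(1 - \chi_j(z))\,|(\hat R(z)\phi|\hat R(z')\phi)_{L^2(\mathbb{R}^n)}|\, dz.$$
By the Cauchy–Schwarz inequality, we have, since the operators $\hat R(\cdot)$ are unitary and $\|\phi\|_{L^2} = 1$,
$$|(\hat R(z)\phi|\hat R(z')\phi)_{L^2(\mathbb{R}^n)}| \le 1,$$
so that
$$\|\psi - \psi_j\|_{\phi, S_0} \le \left(\frac{2}{\pi\hbar}\right)^n \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|(1 - \chi_j(z))\, dz.$$
Since
$$\int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|(1 - \chi_j(z))\, dz \le \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|\, dz < \infty,$$
we conclude, using Lebesgue's dominated convergence theorem, that
$$\lim_{j \to \infty} \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|(1 - \chi_j(z))\, dz = 0,$$
hence also $\lim_{j \to \infty} \|\psi - \psi_j\|_{\phi, S_0} = 0$.

10.1.3 The Banach algebra property

We are going to see that the Feichtinger algebra is a Banach algebra; in addition, we will prove that this algebra is invariant under the action of the metaplectic group and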
that it enjoys a characteristic minimality property for the action of the Heisenberg–Weyl operators. We have seen that the topology defined on $S_0(\mathbb{R}^n)$ using the norm
$$\|\psi\|_{\phi, S_0} = \|W(\psi, \phi)\|_{L^1(\mathbb{R}^{2n})}$$
is independent of the choice of window $\phi$. Let us prove that the normed space $S_0(\mathbb{R}^n)$ is complete.

Proposition 120. The Feichtinger algebra $S_0(\mathbb{R}^n)$ is a Banach space for the norm $\|\cdot\|_{\phi, S_0}$.

Proof. Let $(\psi_j)_j$ be a Cauchy sequence in $S_0(\mathbb{R}^n)$; then $(\Psi_j)_j = (W(\psi_j, \phi))_j$ is a Cauchy sequence in $L^1(\mathbb{R}^{2n})$. In fact, $W(\psi_j, \phi) - W(\psi_k, \phi) = W(\psi_j - \psi_k, \phi)$, and hence, using Moyal's identity (3.13), we have
$$\|W(\psi_j, \phi) - W(\psi_k, \phi)\|_{L^2(\mathbb{R}^{2n})} = \left(\frac{1}{2\pi\hbar}\right)^{n/2} \|\psi_j - \psi_k\|_{L^2}\,\|\phi\|_{L^2},$$
proving that $(W(\psi_j, \phi))_j$ is a Cauchy sequence if and only if $(\psi_j)_j$ is. The space $L^1(\mathbb{R}^{2n})$ being complete, there exists $\Psi \in L^1(\mathbb{R}^{2n})$ such that
$$\lim_{j \to \infty} \|\Psi - W(\psi_j, \phi)\|_{L^1(\mathbb{R}^{2n})} = 0.$$
Defining $\psi$ by the formula
$$\psi(x) = \frac{2^n}{\|\phi\|_{L^2}^2} \int_{\mathbb{R}^{2n}} \Psi(z)\hat R(z)\phi(x)\, dz, \tag{10.8}$$
one thereafter shows that $\psi \in S_0(\mathbb{R}^n)$ and that
$$\lim_{j \to \infty} \|\psi - \psi_j\|_{\phi, S_0} = \lim_{j \to \infty} \|W(\psi - \psi_j, \phi)\|_{L^1(\mathbb{R}^{2n})} = \lim_{j \to \infty} \|\Psi - W(\psi_j, \phi)\|_{L^1(\mathbb{R}^{2n})} = 0,$$
hence $S_0(\mathbb{R}^n)$ is complete as claimed.

We next prove a convolution result:

Proposition 121. Suppose that $\psi \in L^1(\mathbb{R}^n)$ and $\psi' \in S_0(\mathbb{R}^n)$. Then $\psi * \psi' \in S_0(\mathbb{R}^n)$, and we have
$$\|\psi * \psi'\|_{\phi, S_0} \le \|\psi\|_{L^1}\,\|\psi'\|_{\phi, S_0} \tag{10.10}$$
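for every window $\phi \in \mathcal{S}(\mathbb{R}^n)$. Thus $L^1(\mathbb{R}^n) * S_0(\mathbb{R}^n) \subset S_0(\mathbb{R}^n)$.

Proof. Recall that the cross-Wigner transform is defined by
$$W(\psi, \phi)(z) = \left(\frac{1}{\pi\hbar}\right)^n (\hat R(z)\psi|\phi)_{L^2}$$
where $\hat R(z_0)$ is the Grossmann–Royer reflection operator. This yields the formula
$$W(\psi, \phi)(z_0) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{2i}{\hbar}p_0\cdot x_0} \int_{\mathbb{R}^n} \psi(2x_0 - x)\phi_{p_0}(x)\, dx$$
with $\phi_{p_0}(x) = e^{\frac{2i}{\hbar}p_0\cdot x}\,\overline{\phi(x)}$, that is,
$$W(\psi, \phi)(z_0) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{2i}{\hbar}p_0\cdot x_0}\,(\psi * \phi_{p_0})(2x_0). \tag{10.11}$$
It follows, in particular, that
$$\|\psi\|_{\phi, S_0} = \left(\frac{1}{2\pi\hbar}\right)^n \int_{\mathbb{R}^n} \|\psi * \phi_{p_0}\|_{L^1}\, dp_0. \tag{10.12}$$
Formula (10.11) now shows that
$$W(\psi * \psi', \phi)(z_0) = \left(\frac{1}{\pi\hbar}\right)^n e^{-\frac{2i}{\hbar}p_0\cdot x_0}\,(\psi * \psi' * \phi_{p_0})(2x_0),$$
and hence, by (10.12),
$$\|\psi * \psi'\|_{\phi, S_0} = \left(\frac{1}{2\pi\hbar}\right)^n \int_{\mathbb{R}^n} \|\psi * (\psi' * \phi_{p_0})\|_{L^1}\, dp_0.$$
Since $L^1(\mathbb{R}^n)$ is a convolution algebra, we have
$$\|\psi * (\psi' * \phi_{p_0})\|_{L^1} \le \|\psi\|_{L^1}\,\|\psi' * \phi_{p_0}\|_{L^1},$$
and we obtain the inequality
$$\|\psi * \psi'\|_{\phi, S_0} \le \left(\frac{1}{2\pi\hbar}\right)^n \|\psi\|_{L^1} \int_{\mathbb{R}^n} \|\psi' * \phi_{p_0}\|_{L^1}\, dp_0,$$
that is, using again (10.12),
$$\|\psi * \psi'\|_{\phi, S_0} \le \|\psi\|_{L^1}\,\|\psi'\|_{\phi, S_0},$$
which we set out to prove.

Corollary 122. The Banach space $S_0(\mathbb{R}^n)$ is an algebra for both pointwise multiplication and convolution: if $\psi$ and $\psi'$ are in $S_0(\mathbb{R}^n)$, then $\psi\psi' \in S_0(\mathbb{R}^n)$ and $\psi * \psi' \in S_0(\mathbb{R}^n)$.

Proof. Since $\psi\psi'$ and $\psi * \psi'$ are interchanged by the Fourier transform $F$, and $S_0(\mathbb{R}^n)$ is invariant under $F$ in view of Proposition 123 (iii), it is sufficient to show that $\psi * \psi' \in S_0(\mathbb{R}^n)$ if $\psi \in S_0(\mathbb{R}^n)$ and $\psi' \in S_0(\mathbb{R}^n)$. This follows from the inequality (10.10) since $S_0(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$ in view of Proposition 118.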
10.2 The metaplectic invariance of $S_0(\mathbb{R}^n)$

Feichtinger's algebra is invariant (as are $L^2(\mathbb{R}^n)$ and $\mathcal{S}(\mathbb{R}^n)$) under the action of metaplectic operators and displacements:

Proposition 123. Let $\psi \in S_0(\mathbb{R}^n)$, $\hat S \in \operatorname{Mp}(n)$, and $z_0 \in \mathbb{R}^{2n}$. We have: (i) $\hat S\psi \in S_0(\mathbb{R}^n)$; (ii) $\hat D(z_0)\psi \in S_0(\mathbb{R}^n)$; (iii) in particular, $\psi \in S_0(\mathbb{R}^n)$ if and only if $F\psi \in S_0(\mathbb{R}^n)$.

Proof. We have $\psi \in S_0(\mathbb{R}^n)$ if and only if $W\psi \in L^1(\mathbb{R}^{2n})$. In view of the symplectic covariance property of the Wigner function, we have
$$W(\hat S\psi)(z) = W\psi(S^{-1}z)$$
where $S \in \operatorname{Sp}(n)$ is the projection of $\hat S$. Since $\det S = 1$, we have
$$\int_{\mathbb{R}^{2n}} |W(\hat S\psi)(z)|\, dz = \int_{\mathbb{R}^{2n}} |W\psi(S^{-1}z)|\, dz = \int_{\mathbb{R}^{2n}} |W\psi(z)|\, dz,$$
hence $W(\hat S\psi) \in L^1(\mathbb{R}^{2n})$ if and only if $W\psi \in L^1(\mathbb{R}^{2n})$. On the other hand, by formula (3.8),
$$W(\hat D(z_0)\psi, \phi)(z) = e^{\frac{i}{\hbar}\sigma(z, z_0)}\, W(\psi, \phi)(z - \tfrac{1}{2}z_0),$$
hence, for every window $\phi \in \mathcal{S}(\mathbb{R}^n)$,
$$\int_{\mathbb{R}^{2n}} |W(\hat D(z_0)\psi, \phi)(z)|\, dz = \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z - \tfrac{1}{2}z_0)|\, dz = \int_{\mathbb{R}^{2n}} |W(\psi, \phi)(z)|\, dz$$
so that $\hat D(z_0)\psi \in S_0(\mathbb{R}^n)$ if and only if $\psi \in S_0(\mathbb{R}^n)$. (iii) This follows from (i) and the fact that the Fourier transform $F$ is related to the generator $\hat J$ of $\operatorname{Mp}(n)$ by the formula $F = i^{n/2}\hat J$.

Corollary 124. Every $\psi \in S_0(\mathbb{R}^n)$ is bounded, and we have $\lim_{|x| \to \infty}\psi(x) = 0$.

Proof. This follows from property (iii) in the previous proposition: since $\psi$ is continuous, it suffices to prove that $\lim_{|x| \to \infty}\psi(x) = 0$. We have $F^{-1}\psi \in S_0(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$, hence $\psi = F(F^{-1}\psi)$ has limit 0 at infinity in view of the Riemann–Lebesgue lemma.
10.3 The dual space $S_0'(\mathbb{R}^n)$

In this section, we will study the Banach Gelfand triple $(S_0(\mathbb{R}^n), L^2(\mathbb{R}^n), S_0'(\mathbb{R}^n))$ where $S_0'(\mathbb{R}^n)$ is the dual Banach space of the Feichtinger algebra $S_0(\mathbb{R}^n)$, that is, the space of all bounded linear functionals on $S_0(\mathbb{R}^n)$. Since $S_0(\mathbb{R}^n)$ is the smallest Banach space isometrically invariant under the action of the affine metaplectic group (and hence under the Heisenberg–Weyl operators), its dual is essentially the largest space of distributions with this property. The following result characterizes the distribution space $S_0'(\mathbb{R}^n)$:

Proposition 125. The Banach space $S_0'(\mathbb{R}^n)$ consists of all $\psi \in \mathcal{S}'(\mathbb{R}^n)$ such that $W(\psi, \phi) \in L^\infty(\mathbb{R}^{2n})$ for one (and hence all) window $\phi \in S_0(\mathbb{R}^n)$; the duality bracket is given by the pairing
$$(\psi, \psi') = \int_{\mathbb{R}^{2n}} W(\psi, \phi)(z)\,\overline{W(\psi', \phi)(z)}\, dz, \tag{10.13}$$
and the formula
$$\|\psi\|_{\phi, S_0'(\mathbb{R}^n)} = \sup_{z \in \mathbb{R}^{2n}} |W(\psi, \phi)(z)| \tag{10.14}$$
defines a norm on $S_0'(\mathbb{R}^n)$.

Proof. We omit the proof of this result; see the comments and references at the end of the chapter.

It readily follows from this characterization that:

Proposition 126. The Dirac distribution $\delta$ is in $S_0'(\mathbb{R}^n)$; more generally, $\delta_a \in S_0'(\mathbb{R}^n)$ where $\delta_a(x) = \delta(x - a)$.

Proof. By translation, it is sufficient to assume $a = 0$. We have, by definition of the cross-Wigner transform,
$$W(\delta, \phi)(z_0) = \left(\frac{1}{\pi\hbar}\right)^n (\hat R(z_0)\delta|\phi)_{L^2}$$
and also
$$\hat R(z_0)\delta(x) = e^{\frac{2i}{\hbar}p_0\cdot(x - x_0)}\delta(2x_0 - x) = e^{\frac{2i}{\hbar}p_0\cdot x_0}\delta(2x_0 - x).$$
It follows that
$$W(\delta, \phi)(z_0) = \left(\frac{1}{\pi\hbar}\right)^n e^{\frac{2i}{\hbar}p_0\cdot x_0}\,\overline{\phi(2x_0)},$$
and hence
$$|W(\delta, \phi)(z_0)| \le \left(\frac{1}{\pi\hbar}\right)^n \|\phi\|_\infty.$$
It follows that $\delta \in S_0'(\mathbb{R}^n)$.

Recall that a Banach Gelfand triple is a triple $(\mathcal{B}, H, \mathcal{B}')$ consisting of a Banach space $\mathcal{B}$ that is continuously and densely embedded into a Hilbert space $H$, which in
turn is $w^*$-continuously and densely embedded into the dual Banach space $\mathcal{B}'$. One identifies $H$ with its dual $H^*$, and the scalar product on $H$ thus extends in a natural way into a pairing between $\mathcal{B} \subset H$ and $\mathcal{B}' \supset H$. A typical example is the triple $(\mathcal{S}(\mathbb{R}^n), L^2(\mathbb{R}^n), \mathcal{S}'(\mathbb{R}^n))$.

Given a Gelfand triple $(\mathcal{B}, H, \mathcal{B}')$, every $\phi \in \mathcal{B}$ has an expansion with respect to the generalized eigenvectors $\psi_\alpha$ which generalizes the usual expansion with respect to a basis of eigenvectors. A classical example is the following: consider the Gelfand triple $(S_0(\mathbb{R}^n), L^2(\mathbb{R}^n), S_0'(\mathbb{R}^n))$, and choose $\hat A = -i\hbar\partial_{x_j}$. The generalized eigenvectors of $\hat A$ are the functions $\chi_p(x) = e^{ip\cdot x/\hbar}$ ($p \in \mathbb{R}^n$), and the corresponding expansion can be written as the Fourier inversion formula
$$\psi(x) = \left(\frac{1}{2\pi\hbar}\right)^{n/2} \int_{\mathbb{R}^n} e^{\frac{i}{\hbar}p\cdot x}F\psi(p)\, dp \tag{10.15}$$
(see Feichtinger et al. in [27] for a detailed discussion of the Fourier transform within the context of the Banach Gelfand triple $(S_0(\mathbb{R}^n), L^2(\mathbb{R}^n), S_0'(\mathbb{R}^n))$).

An important feature of Gelfand triples is the existence of a kernel theorem, which is much more useful both for theoretical and practical purposes than the usual kernel theorem of Schwartz. We denote as usual by $((\cdot, \cdot))_{L^2}$ the distributional bracket for distributions on $\mathbb{R}^{2n}$.

Proposition 127. The following properties hold: (i) Every linear bounded operator $\hat A : S_0(\mathbb{R}^n) \to S_0'(\mathbb{R}^n)$ has a kernel $K_A \in S_0'(\mathbb{R}^n \times \mathbb{R}^n)$, that is,
$$(\hat A\psi, \phi)_{L^2} = ((K_A, \phi \otimes \psi))_{L^2}$$
for $\psi$ and $\phi$ in $S_0(\mathbb{R}^n)$. (ii) Conversely, every $K \in S_0'(\mathbb{R}^n \times \mathbb{R}^n)$ defines by the previous formula a bounded operator $S_0(\mathbb{R}^n) \to S_0'(\mathbb{R}^n)$.

Formally we can thus write
$$\hat A\psi(x) = \int_{\mathbb{R}^n} K(x, y)\psi(y)\, dy$$
for some $K \in S_0'(\mathbb{R}^n \times \mathbb{R}^n)$ when $\hat A : S_0(\mathbb{R}^n) \to S_0'(\mathbb{R}^n)$ is a continuous operator. This result was announced by Feichtinger in [25] and proven in [26]. See [40], § 11.4 for a detailed proof, comments, and various extensions.
10.4 The modulation spaces $M^1_s(\mathbb{R}^n)$
Closely related to the Feichtinger algebra are the modulation spaces $M^1_s(\mathbb{R}^n)$. Let us introduce the following notation. For $z\in\mathbb{R}^{2n}$, we set $\langle z\rangle=(1+|z|^2)^{1/2}$. The function $z\mapsto\langle z\rangle$ is the Weyl symbol of the pseudodifferential operator $(1-\Delta)^{1/2}$ where $\Delta$ is the Laplacian in the $z$ variables. For $s\ge 0$, we denote by $L^1_s(\mathbb{R}^{2n})$ the weighted $L^1$ space defined by
$$L^1_s(\mathbb{R}^{2n})=\{\rho:\mathbb{R}^{2n}\to\mathbb{C} : \langle\cdot\rangle^s\rho\in L^1(\mathbb{R}^{2n})\}.$$
Thus, a function $\rho$ is in $L^1_s(\mathbb{R}^{2n})$ if and only if
$$\|\rho\|_{L^1_s(\mathbb{R}^{2n})}=\int_{\mathbb{R}^{2n}}|\rho(z)|\langle z\rangle^s\,dz<\infty.$$
$L^1_s(\mathbb{R}^{2n})$ is clearly a complex vector space on which the mapping $\rho\mapsto\|\rho\|_{L^1_s(\mathbb{R}^{2n})}$ is a norm; $L^1_s(\mathbb{R}^{2n})$ is complete for that norm and is thus a Banach space.
Definition 128. A function $\psi$ belongs to the modulation space $M^1_s(\mathbb{R}^n)$ if and only if $W(\psi,\phi)\in L^1_s(\mathbb{R}^{2n})$ for every window $\phi\in\mathcal{S}(\mathbb{R}^n)$.
Of course, $M^1_0(\mathbb{R}^n)=S_0(\mathbb{R}^n)$, and more generally $M^1_s(\mathbb{R}^n)\subset S_0(\mathbb{R}^n)$. Hence, in view of the inclusions (10.5), $M^1_s(\mathbb{R}^n)\subset C^0(\mathbb{R}^n)\cap L^1(\mathbb{R}^n)\cap F(L^1(\mathbb{R}^n))$. Exactly as in the case of Feichtinger's algebra (Proposition 117), it suffices to check the definition for one window:
Proposition 129. (i) We have $\psi\in M^1_s(\mathbb{R}^n)$ if and only if $W(\psi,\phi)\in L^1_s(\mathbb{R}^{2n})$ for one window $\phi$. (ii) The mappings $\psi\mapsto\|\psi\|_{\phi,M^1_s}$ defined by
$$\|\psi\|_{\phi,M^1_s}=\|W(\psi,\phi)\|_{L^1_s(\mathbb{R}^{2n})}=\int_{\mathbb{R}^{2n}}|W(\psi,\phi)(z)|\langle z\rangle^s\,dz$$
form a family of equivalent norms, and the topology on $M^1_s(\mathbb{R}^n)$ thus defined makes it into a Banach space.
In fact, most of the properties we have proven for $S_0(\mathbb{R}^n)$ are also true for the modulation spaces $M^1_s(\mathbb{R}^n)$ and are proven quite similarly. The density result Proposition 119 for $S_0(\mathbb{R}^n)$ extends to $M^1_s(\mathbb{R}^n)$:
Proposition 130. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is a dense subspace of each of the modulation spaces $M^1_s(\mathbb{R}^n)$.
We also have the metaplectic invariance property extending Proposition 123:
Proposition 131. The modulation space $M^1_s(\mathbb{R}^n)$ is invariant under the action of the metaplectic group $\mathrm{Mp}(n)$: if $\widehat{S}\in\mathrm{Mp}(n)$, then $\widehat{S}\psi\in M^1_s(\mathbb{R}^n)$ if and only if $\psi\in M^1_s(\mathbb{R}^n)$. In particular, $M^1_s(\mathbb{R}^n)$ is invariant under the Fourier transform: $\psi\in M^1_s(\mathbb{R}^n)$ if and only if $F\psi\in M^1_s(\mathbb{R}^n)$.
The modulation spaces $M^1_s(\mathbb{R}^n)$ behave well under the action of the Heisenberg displacement operator:
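Proposition 132. Each space $M^1_s(\mathbb{R}^n)$ is invariant under the action of the Heisenberg displacement operator $\widehat{D}(z)$, and there exists a constant $C>0$ such that
$$\|\widehat{D}(z_0)\psi\|_{\phi,M^1_s}\le C\langle z_0\rangle^s\|\psi\|_{\phi,M^1_s} \tag{10.16}$$
for every $z_0\in\mathbb{R}^{2n}$.
Proof. The cross-Wigner transform satisfies
$$W(\widehat{T}(z_0)\psi,\phi)(z)=e^{-\frac{i}{\hbar}\sigma(z,z_0)}W(\psi,\phi)(z-\tfrac{1}{2}z_0)$$
(property (3.11)), hence it suffices to show that $L^1_s(\mathbb{R}^{2n})$ is invariant under phase space translations $T(z_0):z\mapsto z+z_0$. The weight function $\langle\cdot\rangle^s$ is submultiplicative up to a constant: since $1+|z+z_0|^2\le 2(1+|z|^2)(1+|z_0|^2)$, we have $\langle z+z_0\rangle^s\le 2^{s/2}\langle z\rangle^s\langle z_0\rangle^s$ (Peetre's inequality). It follows that
$$\|T(z_0)\rho\|_{L^1_s(\mathbb{R}^{2n})}=\int_{\mathbb{R}^{2n}}|\rho(z-z_0)|\langle z\rangle^s\,dz=\int_{\mathbb{R}^{2n}}|\rho(z)|\langle z+z_0\rangle^s\,dz\le 2^{s/2}\langle z_0\rangle^s\int_{\mathbb{R}^{2n}}|\rho(z)|\langle z\rangle^s\,dz,$$
hence (10.16) and $\widehat{D}(z)\psi\in M^1_s(\mathbb{R}^n)$.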
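The role of the constant $C$ in (10.16) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `bracket` and `peetre_ratio`, the sampling grid, and the choice $s=2$ are illustrative assumptions. It evaluates the ratio $\langle z+z_0\rangle^s/(\langle z\rangle^s\langle z_0\rangle^s)$ on a grid in $\mathbb{R}^2$ and compares it with the bound $2^{s/2}$ given by Peetre's inequality.

```python
import math

def bracket(z):
    """Weight <z> = (1 + |z|^2)^(1/2) for a point z given as a sequence of reals."""
    return math.sqrt(1.0 + sum(t * t for t in z))

def peetre_ratio(z, z0, s):
    """Ratio <z + z0>^s / (<z>^s <z0>^s); Peetre's inequality bounds it by 2^(s/2)."""
    shifted = [a + b for a, b in zip(z, z0)]
    return bracket(shifted) ** s / (bracket(z) ** s * bracket(z0) ** s)

s = 2.0  # illustrative choice of the weight exponent
grid = [k / 4.0 for k in range(-20, 21)]  # sample points in [-5, 5]

# Largest ratio found over the grid, for a few shifts z0 = (x0, 0).
worst = 0.0
for x in grid:
    for p in grid:
        for x0 in (0.5, 1.0, 3.0):
            worst = max(worst, peetre_ratio((x, p), (x0, 0.0), s))

# At z = z0 = (1, 0) the ratio exceeds 1, so a constant is needed in general.
ratio_at_equal_points = peetre_ratio((1.0, 0.0), (1.0, 0.0), s)
print(worst, ratio_at_equal_points, 2 ** (s / 2))
```

The sampled ratio stays below $2^{s/2}$ but exceeds 1 at some points, which is consistent with a constant $C$ appearing in the estimate.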
10.5 Comments and references
We recommend Jakobsen's recent review [48] of the Feichtinger algebra. The most detailed reference is still Gröchenig's treatise [40]. The Feichtinger algebra $S_0(\mathbb{R}^n)$ and its dual $S_0'(\mathbb{R}^n)$ were introduced by Hans Feichtinger in the early 1980s. It is a particular case of the class of modulation spaces $M^{p,q}_v(\mathbb{R}^n)$ that play an important role in many theoretical and practical questions in analysis. Modulation spaces were originally designed to study phase-space concentration problems in time-frequency analysis. In the traditional approaches (see e.g. Gröchenig [40]), the Feichtinger algebra is defined not in terms of the cross-Wigner transform, as we do, but rather using the short-time Fourier transform (also called the Gabor transform)
$$V_\phi\psi(z)=\int_{\mathbb{R}^n}e^{-2\pi ip\cdot x'}\psi(x')\overline{\phi(x'-x)}\,dx'. \tag{10.17}$$
The latter is related to the cross-Wigner transform by the formula
$$W(\psi,\phi)(z)=\Big(\frac{2}{\pi\hbar}\Big)^{n/2}e^{\frac{2i}{\hbar}p\cdot x}\,V_{\phi^\vee_{\sqrt{2\pi\hbar}}}\psi_{\sqrt{2\pi\hbar}}\Big(z\sqrt{\frac{2}{\pi\hbar}}\Big) \tag{10.18}$$
where $\psi_{\sqrt{2\pi\hbar}}(x)=\psi(x\sqrt{2\pi\hbar})$ and $\phi^\vee(x)=\phi(-x)$. We have seen (Proposition 123) that Feichtinger's algebra is invariant under displacements and metaplectic operators. It is in fact the smallest Banach algebra in $\mathcal{S}'(\mathbb{R}^n)$ having this property; see Gröchenig [40]. For a proof of Proposition 125 see [40], § 11.3. The use of the Gelfand triple $(S_0(\mathbb{R}^n),L^2(\mathbb{R}^n),S_0'(\mathbb{R}^n))$ not only offers a better description of self-adjoint operators, but also makes it possible to simplify many proofs. Here is a typical situation. Given a Gelfand triple $(\mathcal{B},H,\mathcal{B}')$, one proves that every self-adjoint operator $A:\mathcal{B}\to\mathcal{B}$ has a complete family of generalized eigenvectors $(\psi_\alpha)_\alpha=\{\psi_\alpha\in\mathcal{B}':\alpha\in\mathbb{A}\}$ ($\mathbb{A}$ an index set), defined as follows: For every $\alpha\in\mathbb{A}$, there exists $\lambda_\alpha\in\mathbb{C}$ such that
$$(\psi_\alpha|A\phi)=\lambda_\alpha(\psi_\alpha|\phi)\quad\text{for every }\phi\in\mathcal{B}.$$
A basic example, in the case $n=1$, is the operator $\widehat{x}$ of multiplication by $x$. This operator has no eigenfunctions in $L^2(\mathbb{R})$, but since $\widehat{x}\delta_a(x)=x\delta(x-a)=a\delta_a(x)$, every $a\in\mathbb{R}$ is a generalized eigenvalue (with associated generalized eigenfunction $\delta_a\in S_0'(\mathbb{R})$).
11 Hilbert–Schmidt operators
Hilbert–Schmidt operators play an essential role in harmonic and functional analysis. One of the reasons is that every density operator (and, more generally, every trace class operator) is the product of two Hilbert–Schmidt operators. The presentation in this chapter is quite conventional: We begin by defining and studying Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$, which are the operators with square integrable distributional kernels.
11.1 Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$
11.1.1 First definition
We begin by giving a simple working definition, which is actually sufficient for most of our needs since we will be, in practice, dealing almost exclusively with the Hilbert space $L^2(\mathbb{R}^n)$ in our study of quantum states in the forthcoming chapters.
Definition 133. A continuous operator $\widehat{A}:\mathcal{S}(\mathbb{R}^n)\to\mathcal{S}'(\mathbb{R}^n)$ is called a Hilbert–Schmidt operator (or an integral Hilbert–Schmidt operator) if its kernel $K$ is square integrable, that is, if there exists $K\in L^2(\mathbb{R}^n\times\mathbb{R}^n)$ such that we have
$$\widehat{A}\psi(x)=\int_{\mathbb{R}^n}K(x,y)\psi(y)\,dy \tag{11.1}$$
for all $\psi\in\mathcal{S}(\mathbb{R}^n)$. We denote by $\mathcal{L}_2(L^2(\mathbb{R}^n))$ the complex vector space of all Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$.
Remark 134. This definition can be extended to any measure space.
That $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a vector space is clear: If $\widehat{A}$ has a kernel $K$ and $\widehat{B}$ a kernel $L$, then the kernel of $\widehat{A}+\widehat{B}$ is simply $K+L$, which is square integrable if $K$ and $L$ are. Similarly, for $\lambda\in\mathbb{C}$, $\lambda\neq 0$, the kernel of $\lambda\widehat{A}$ is $\lambda K$, which is square integrable if and only if $K$ is. It is easy to check, using formula (11.1), that if $\widehat{A}$ is a Hilbert–Schmidt operator, then so is its adjoint $\widehat{A}^*$ and
$$\widehat{A}^*\psi(x)=\int_{\mathbb{R}^n}\overline{K(y,x)}\psi(y)\,dy. \tag{11.2}$$
A Hilbert–Schmidt operator $\widehat{A}$ can be uniquely extended into a continuous bounded operator $L^2(\mathbb{R}^n)\to L^2(\mathbb{R}^n)$, also denoted $\widehat{A}$: For $\psi\in\mathcal{S}(\mathbb{R}^n)$, we have, by the Cauchy–Schwarz inequality,
$$|\widehat{A}\psi(x)|^2\le\int_{\mathbb{R}^n}|K(x,y)|^2\,dy\int_{\mathbb{R}^n}|\psi(y)|^2\,dy,$$
and hence, integrating with respect to $x$,
$$\|\widehat{A}\psi\|^2_{L^2}\le\|K\|^2_{L^2(\mathbb{R}^{2n})}\|\psi\|^2_{L^2}. \tag{11.3}$$
This inequality extends to all $\psi\in L^2(\mathbb{R}^n)$ by density and continuity, hence $\widehat{A}$ is a continuous operator $L^2(\mathbb{R}^n)\to L^2(\mathbb{R}^n)$. It also shows that the operator norm $\|\widehat{A}\|$ of a Hilbert–Schmidt operator $\widehat{A}$ satisfies
$$\|\widehat{A}\|=\sup_{\|\psi\|_{L^2}=1}\|\widehat{A}\psi\|_{L^2}\le\|K\|_{L^2(\mathbb{R}^{2n})}. \tag{11.4}$$
https://doi.org/10.1515/9783110722772-011
The previous inequality motivates the following definition:
Definition 135. The Hilbert–Schmidt norm $\|\cdot\|_2$ on $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is defined by
$$\|\widehat{A}\|_2=\|K\|_{L^2(\mathbb{R}^{2n})}=\Big(\int_{\mathbb{R}^{2n}}|K(x,y)|^2\,dxdy\Big)^{1/2} \tag{11.5}$$
where $K$ is the (distributional) kernel of $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$.
The verification that $\|\cdot\|_2$ is indeed a norm is a trivial exercise left to the reader.
11.1.2 Second definition
An alternative definition of Hilbert–Schmidt operators (which can be extended to any separable Hilbert space) is the following:
Definition 136. An operator $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$ is called a Hilbert–Schmidt operator if there exists an orthonormal basis $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$ such that
$$\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty. \tag{11.6}$$
Equivalently,
$$\sum_j(\widehat{A}^*\widehat{A}\psi_j|\psi_j)_{L^2}<\infty. \tag{11.7}$$
The choice of basis $(\psi_j)_j$ for which (11.6) holds is irrelevant: If the condition (11.6) holds for one orthonormal basis, then it holds for all, and the sum $\sum_j\|\widehat{A}\psi_j\|^2_{L^2}$ moreover does not depend on the choice of basis. Let us prove this essential property.
Proposition 137. Let $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$. Let $(\psi_j)_j$ and $(\phi_j)_j$ be orthonormal bases of $L^2(\mathbb{R}^n)$, and assume that $\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty$. Then
$$\sum_j\|\widehat{A}\phi_j\|^2_{L^2}=\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty.$$
Proof. The Parseval identity
$$\|\psi\|^2_{L^2}=\sum_k|(\psi|\phi_k)_{L^2}|^2 \tag{11.8}$$
implies that
$$\|\widehat{A}\psi_j\|^2_{L^2}=\sum_k|(\widehat{A}\psi_j|\phi_k)_{L^2}|^2.$$
Summing over $j$, we get
$$\sum_j\|\widehat{A}\psi_j\|^2_{L^2}=\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|^2=\sum_{j,k}|(\psi_j|\widehat{A}^*\phi_k)_{L^2}|^2.$$
Applying again the Parseval identity, we have
$$\sum_j|(\psi_j|\widehat{A}^*\phi_k)_{L^2}|^2=\|\widehat{A}^*\phi_k\|^2_{L^2},$$
and hence
$$\sum_j\|\widehat{A}\psi_j\|^2_{L^2}=\sum_k\|\widehat{A}^*\phi_k\|^2_{L^2}<\infty.$$
Taking $(\phi_j)_j=(\psi_j)_j$, we have $\sum_k\|\widehat{A}^*\psi_k\|^2_{L^2}=\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty$, hence the adjoint of $\widehat{A}$ is also a Hilbert–Schmidt operator; we may thus replace $\widehat{A}$ by $\widehat{A}^*$ in the previous formula, which yields $\sum_k\|\widehat{A}\phi_k\|^2_{L^2}=\sum_j\|\widehat{A}^*\psi_j\|^2_{L^2}=\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty$ as claimed.
Remark 138. In the course of the previous proof, we have shown that $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$ is a Hilbert–Schmidt operator if and only if its adjoint $\widehat{A}^*$ is. This is already clear if one uses Definition 133, since the kernel of the adjoint is $(x,y)\mapsto\overline{K(y,x)}$, which is square integrable if and only if $(x,y)\mapsto K(x,y)$ is.
11.1.3 Equivalence of the definitions; the Hilbert–Schmidt norm
It remains to prove that the definitions based on (11.1) and (11.6) are equivalent. This will enable us to give an alternative definition of the Hilbert–Schmidt norm.
Proposition 139. (i) An operator $\widehat{A}$ on $L^2(\mathbb{R}^n)$ satisfies $\sum_j\|\widehat{A}\psi_j\|^2_{L^2}<\infty$ for one (and hence every) orthonormal basis $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$ if and only if it has a kernel $K\in L^2(\mathbb{R}^n\times\mathbb{R}^n)$. (ii) The Hilbert–Schmidt norm of $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$ is given by
$$\|\widehat{A}\|_2=\Big(\sum_j\|\widehat{A}\psi_j\|^2_{L^2}\Big)^{1/2}=\Big(\sum_j(\widehat{A}^*\widehat{A}\psi_j|\psi_j)_{L^2}\Big)^{1/2}. \tag{11.9}$$
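(iii) The absolutely convergent series
$$(\widehat{A}|\widehat{B})_2=\sum_j(\widehat{A}\psi_j|\widehat{B}\psi_j)_{L^2}$$
defines a scalar product on $\mathcal{L}_2(L^2(\mathbb{R}^n))$ such that $\|\widehat{A}\|_2=((\widehat{A}|\widehat{A})_2)^{1/2}$, and $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a Hilbert space for this scalar product.
Proof. (i) Let $\widehat{A}$ be a Hilbert–Schmidt operator on $L^2(\mathbb{R}^n)$ in the sense of Definition 136, and choose an orthonormal basis $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$. The family $(\psi_j\otimes\overline{\psi_i})_{i,j}$ of tensor products is an orthonormal basis of $L^2(\mathbb{R}^n\times\mathbb{R}^n)$. Consider the kernel
$$K(x,y)=\sum_{i,j}(\widehat{A}\psi_i|\psi_j)_{L^2}\,\psi_j(x)\overline{\psi_i(y)}. \tag{11.10}$$
We have
$$\int_{\mathbb{R}^{2n}}|K(x,y)|^2\,dxdy\le\sum_{i,j}|(\widehat{A}\psi_i|\psi_j)_{L^2}|^2\,\|\psi_j\otimes\overline{\psi_i}\|^2_{L^2(\mathbb{R}^{2n})}=\sum_{i,j}|(\widehat{A}\psi_i|\psi_j)_{L^2}|^2<\infty$$
(the last sum equals $\sum_i\|\widehat{A}\psi_i\|^2_{L^2}$ by the Parseval identity), hence $K\in L^2(\mathbb{R}^n\times\mathbb{R}^n)$. By definition (11.10) of $K$, we have
$$\int_{\mathbb{R}^n}K(x,y)\psi(y)\,dy=\sum_{i,j}(\widehat{A}\psi_i|\psi_j)_{L^2}\,\psi_j(x)\int_{\mathbb{R}^n}\psi(y)\overline{\psi_i(y)}\,dy=\sum_{i,j}(\widehat{A}\psi_i|\psi_j)_{L^2}\,\psi_j(x)(\psi|\psi_i)_{L^2}.$$
Since $\psi=\sum_i(\psi|\psi_i)_{L^2}\psi_i$ and $\widehat{A}\psi_i=\sum_j(\widehat{A}\psi_i|\psi_j)_{L^2}\psi_j$, we also have
$$\widehat{A}\psi(x)=\sum_i(\psi|\psi_i)_{L^2}\widehat{A}\psi_i(x)=\sum_{i,j}(\psi|\psi_i)_{L^2}(\widehat{A}\psi_i|\psi_j)_{L^2}\,\psi_j(x),$$
and hence
$$\widehat{A}\psi(x)=\int_{\mathbb{R}^n}K(x,y)\psi(y)\,dy. \tag{11.11}$$
Assume conversely that the kernel $K$ of $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$ belongs to $L^2(\mathbb{R}^n\times\mathbb{R}^n)$. We can then find complex numbers $c_{ij}$ such that $\sum_{i,j}|c_{ij}|^2<\infty$ and
$$K(x,y)=\sum_{i,j}c_{ij}\,\psi_j(x)\otimes\overline{\psi_i(y)}. \tag{11.12}$$
The operator $\widehat{A}$ defined by
$$\widehat{A}\psi=\int_{\mathbb{R}^n}K(x,y)\psi(y)\,dy$$
is thus given by the convergent series
$$\widehat{A}\psi=\sum_{i,j}c_{ij}(\psi|\psi_i)_{L^2}\psi_j.$$
Applying $\widehat{A}$ to the basis vectors $\psi_k$, we get, in particular, since the vectors $\psi_j$ are orthonormal,
$$\widehat{A}\psi_k=\sum_{i,j}c_{ij}(\psi_k|\psi_i)_{L^2}\psi_j=\sum_jc_{kj}\psi_j,$$
and hence
$$\|\widehat{A}\psi_k\|^2_{L^2}=(\widehat{A}\psi_k|\widehat{A}\psi_k)_{L^2}=\sum_j|c_{kj}|^2$$
so that
$$\sum_k\|\widehat{A}\psi_k\|^2_{L^2}=\sum_{j,k}|c_{kj}|^2<\infty, \tag{11.13}$$
and $\widehat{A}$ is thus a Hilbert–Schmidt operator. (ii) We have to show that
$$\int_{\mathbb{R}^{2n}}|K(x,y)|^2\,dxdy=\sum_j\|\widehat{A}\psi_j\|^2_{L^2}.$$
In view of the identities (11.12) and (11.13), we have
$$\int_{\mathbb{R}^{2n}}|K(x,y)|^2\,dxdy=\sum_{i,j}|c_{ij}|^2=\sum_k\|\widehat{A}\psi_k\|^2_{L^2}.$$
(iii) We must show that $\sum_j|(\widehat{A}\psi_j|\widehat{B}\psi_j)_{L^2}|<\infty$ and that the sum of the series is independent of the choice of orthonormal basis $(\psi_j)_j$. In view of the Cauchy–Schwarz inequality, we have
$$\sum_j|(\widehat{A}\psi_j|\widehat{B}\psi_j)_{L^2}|\le\sum_j\|\widehat{A}\psi_j\|_{L^2}\|\widehat{B}\psi_j\|_{L^2}\le\frac{1}{2}\sum_j\big(\|\widehat{A}\psi_j\|^2_{L^2}+\|\widehat{B}\psi_j\|^2_{L^2}\big)<\infty,$$
hence the first assertion. That $(\widehat{A}|\widehat{B})_2$ is independent of the choice of orthonormal basis follows from the observation that, since $\|\widehat{A}\|_2=((\widehat{A}|\widehat{A})_2)^{1/2}$, we must have (Jordan and von Neumann's theorem)
$$\|\widehat{A}+\widehat{B}\|^2_2+\|\widehat{A}-\widehat{B}\|^2_2=2(\|\widehat{A}\|^2_2+\|\widehat{B}\|^2_2),$$
and the inner product $(\widehat{A}|\widehat{B})_2$ is uniquely determined by this requirement since it implies that
$$2\operatorname{Re}(\widehat{A}|\widehat{B})_2=\|\widehat{A}\|^2_2+\|\widehat{B}\|^2_2-\|\widehat{A}-\widehat{B}\|^2_2$$
(the imaginary part is obtained in the same way after replacing $\widehat{B}$ with $i\widehat{B}$). The proof of the completeness of $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is omitted.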
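The basis independence in (11.9) and the identification of the Hilbert–Schmidt norm with the $L^2$ norm of the kernel can be illustrated in a finite-dimensional toy model, where the operator is a matrix and the kernel is its array of entries. The sketch below is only an illustration under that assumption: the $2\times 2$ matrix `A`, the helper names, and the rotation and phase parameters are arbitrary choices, not taken from the text.

```python
import math

def matvec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def norm_sq(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(c) ** 2 for c in v)

def hs_norm_sq(A, basis):
    """Sum of ||A psi_j||^2 over the vectors psi_j of an orthonormal basis."""
    return sum(norm_sq(matvec(A, psi)) for psi in basis)

# Arbitrary 2x2 complex matrix playing the role of the operator.
A = [[1 + 2j, 0.5], [-1j, 3.0]]

# Three orthonormal bases of C^2: standard, rotated, rotated with a phase.
standard = [[1.0, 0.0], [0.0, 1.0]]
theta = 0.7
rotated = [[math.cos(theta), math.sin(theta)],
           [-math.sin(theta), math.cos(theta)]]
phase = complex(math.cos(0.3), math.sin(0.3))
twisted = [[phase * c for c in rotated[0]], rotated[1]]

# Discrete analogue of the squared L^2 norm of the kernel in (11.5).
entries_sq = sum(abs(a) ** 2 for row in A for a in row)

values = [hs_norm_sq(A, b) for b in (standard, rotated, twisted)]
print(entries_sq, values)
```

All three sums agree with the sum of the squared moduli of the matrix entries, up to rounding.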
11.2 Further properties of the Hilbert–Schmidt operators
11.2.1 Compactness of Hilbert–Schmidt operators
Let $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$. The absolute value $|\widehat{A}|$ of $\widehat{A}$ is defined by $|\widehat{A}|=(\widehat{A}^*\widehat{A})^{1/2}$. Let us show that the absolute value of a Hilbert–Schmidt operator is also a Hilbert–Schmidt operator.
Proposition 140. We have $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$ if and only if $|\widehat{A}|\in\mathcal{L}_2(L^2(\mathbb{R}^n))$.
Proof. Let $(\psi_j)_j$ be an orthonormal basis of $L^2(\mathbb{R}^n)$. Assume that $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$. We have
$$\sum_j\||\widehat{A}|\psi_j\|^2_{L^2}=\sum_j((\widehat{A}^*\widehat{A})^{1/2}\psi_j|(\widehat{A}^*\widehat{A})^{1/2}\psi_j)_{L^2}=\sum_j(\widehat{A}^*\widehat{A}\psi_j|\psi_j)_{L^2}=\sum_j(\widehat{A}\psi_j|\widehat{A}\psi_j)_{L^2}<\infty$$
in view of definition (11.6). The sequence of these equalities can be reversed, so we conclude that, if $|\widehat{A}|\in\mathcal{L}_2(L^2(\mathbb{R}^n))$, then we also have $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$.
Hilbert–Schmidt operators are bounded. They are in fact compact operators:
Proposition 141. Every $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a compact operator on $L^2(\mathbb{R}^n)$.
Proof. It suffices to show that $\widehat{A}$ is the limit (in the operator norm) of a sequence of finite rank operators, and this will prove the result since the limit (in the operator norm) of a sequence of compact operators is compact and finite-rank operators are compact. Let $(\psi_j)_j$ be an orthonormal basis of $L^2(\mathbb{R}^n)$. Define now an operator $\widehat{A}_N$ on $L^2(\mathbb{R}^n)$ by
$$\widehat{A}_N\psi=\sum_{j=1}^N(\psi|\psi_j)_{L^2}\widehat{A}\psi_j;$$
its range is contained in the span of $\widehat{A}\psi_1,\dots,\widehat{A}\psi_N$, so it has finite rank. Noting that $\widehat{A}\psi$ is given by the convergent series
$$\widehat{A}\psi=\sum_{j=1}^\infty(\psi|\psi_j)_{L^2}\widehat{A}\psi_j,$$
we have
$$(\widehat{A}-\widehat{A}_N)\psi=\sum_{j>N}(\psi|\psi_j)_{L^2}\widehat{A}\psi_j,$$
and hence, by the Cauchy–Schwarz inequality,
$$\|(\widehat{A}-\widehat{A}_N)\psi\|^2_{L^2}\le\sum_{j>N}|(\psi|\psi_j)_{L^2}|^2\sum_{j>N}\|\widehat{A}\psi_j\|^2_{L^2}\le\|\psi\|^2_{L^2}\sum_{j>N}\|\widehat{A}\psi_j\|^2_{L^2}.$$
This implies the inequality
$$\|\widehat{A}-\widehat{A}_N\|^2\le\sum_{j>N}\|\widehat{A}\psi_j\|^2_{L^2};$$
the series $\sum_j\|\widehat{A}\psi_j\|^2_{L^2}$ being convergent (its sum is the square of the Hilbert–Schmidt norm $\|\widehat{A}\|_2$), this implies that $\lim_{N\to\infty}\|\widehat{A}-\widehat{A}_N\|=0$.
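The tail estimate obtained in the proof of Proposition 141 can be made concrete on a simple model. The sketch below is illustrative only and rests on assumptions not in the text: it takes a diagonal operator with eigenvalues $1/j$ in an orthonormal basis, cut off at a finite size `M` so that the sums are computable, and compares the operator-norm error of the finite-rank truncation with the bound $(\sum_{j>N}\|\widehat{A}\psi_j\|^2)^{1/2}$.

```python
import math

M = 2000  # cut-off used only to make the sums finite (an assumption of this sketch)
eigs = [1.0 / j for j in range(1, M + 1)]  # A psi_j = psi_j / j, so ||A psi_j|| = 1/j

def op_norm_error(N):
    """Operator norm of A - A_N in the diagonal model: largest remaining eigenvalue."""
    return max(eigs[N:]) if N < M else 0.0

def hs_tail(N):
    """Square root of sum_{j>N} ||A psi_j||^2, the bound obtained in the proof."""
    return math.sqrt(sum(e * e for e in eigs[N:]))

rows = [(N, op_norm_error(N), hs_tail(N)) for N in (1, 10, 100, 1000)]
for N, err, bound in rows:
    print(N, err, bound)
```

In this model both quantities decrease to zero as $N$ grows, and the operator-norm error never exceeds the Hilbert–Schmidt tail.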
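11.2.2 Ideal property
We are going to show that $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a left and right $\mathcal{B}(L^2(\mathbb{R}^n))$-module, in fact a two-sided ideal in $\mathcal{B}(L^2(\mathbb{R}^n))$.
Proposition 142. The vector space $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a two-sided ideal in the algebra $\mathcal{B}(L^2(\mathbb{R}^n))$ of bounded operators on $L^2(\mathbb{R}^n)$: If $\widehat{A}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$ and $\widehat{B}\in\mathcal{B}(L^2(\mathbb{R}^n))$, then $\widehat{A}\widehat{B}$ and $\widehat{B}\widehat{A}$ are both in $\mathcal{L}_2(L^2(\mathbb{R}^n))$, and we have
$$\|\widehat{A}\widehat{B}\|_2\le\|\widehat{A}\|_2\|\widehat{B}\|,\qquad\|\widehat{B}\widehat{A}\|_2\le\|\widehat{B}\|\|\widehat{A}\|_2 \tag{11.14}$$
where $\|\widehat{B}\|$ is the operator norm of $\widehat{B}$. In particular, $\mathcal{L}_2(L^2(\mathbb{R}^n))$ is a normed algebra.
Proof. Since Hilbert–Schmidt operators are compact (hence bounded), it is clear that $\widehat{A}\widehat{B}$ and $\widehat{B}\widehat{A}$ are both in $\mathcal{B}(L^2(\mathbb{R}^n))$. Let us show that they are in $\mathcal{L}_2(L^2(\mathbb{R}^n))$. For this, it suffices to prove the inequalities (11.14). Let $(\psi_j)_j$ be an orthonormal basis of $L^2(\mathbb{R}^n)$; we have $\|\widehat{B}\widehat{A}\psi_j\|_{L^2}\le\|\widehat{B}\|\|\widehat{A}\psi_j\|_{L^2}$; squaring and summing over $j$ yields $\|\widehat{B}\widehat{A}\|_2\le\|\widehat{B}\|\|\widehat{A}\|_2$. The inequality $\|\widehat{A}\widehat{B}\|_2\le\|\widehat{A}\|_2\|\widehat{B}\|$ follows, writing $\widehat{A}\widehat{B}=(\widehat{B}^*\widehat{A}^*)^*$ and using the fact that passage to the adjoint preserves both the operator norm and the Hilbert–Schmidt norm.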
11.3 Hilbert–Schmidt and Weyl operators
The Schwartz kernel theorem tells us that every continuous operator $\widehat{A}:\mathcal{S}(\mathbb{R}^n)\to\mathcal{S}'(\mathbb{R}^n)$ can be represented in the weak sense as $\langle\widehat{A}\psi,\phi\rangle=\langle K,\phi\otimes\psi\rangle$ where $K\in\mathcal{S}'(\mathbb{R}^n\times\mathbb{R}^n)$. Hilbert–Schmidt operators are, by definition, exactly those operators that are represented by a kernel $K\in L^2(\mathbb{R}^n\times\mathbb{R}^n)$. We also know that the relationship between the kernel of an operator and its Weyl symbol is given by the Fourier transform (formula (5.10))
$$a(x,p)=\int_{\mathbb{R}^n}e^{-\frac{i}{\hbar}p\cdot y}K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)\,dy \tag{11.15}$$
or, equivalently,
$$K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)=\Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^n}e^{\frac{i}{\hbar}p\cdot y}a(x,p)\,dp. \tag{11.16}$$
Proposition 143. Assume that $\widehat{A}=\mathrm{Op}_W(a)$ is a Hilbert–Schmidt operator. Then $a\in L^2(\mathbb{R}^{2n})$ and $a_\sigma\in L^2(\mathbb{R}^{2n})$; moreover:
$$\|a\|_{L^2(\mathbb{R}^{2n})}=\|a_\sigma\|_{L^2(\mathbb{R}^{2n})}=(2\pi\hbar)^{n/2}\|K\|_{L^2(\mathbb{R}^n\times\mathbb{R}^n)}, \tag{11.17}$$
that is
$$\|\widehat{A}\|_2=(2\pi\hbar)^{-n/2}\|a\|_{L^2(\mathbb{R}^{2n})}. \tag{11.18}$$
Proof. The equality $\|a\|_{L^2(\mathbb{R}^{2n})}=\|a_\sigma\|_{L^2(\mathbb{R}^{2n})}$ is obvious since the symplectic Fourier transform is unitary. Let us prove that $\|a\|_{L^2(\mathbb{R}^{2n})}=(2\pi\hbar)^{n/2}\|K\|_{L^2}$ when $K\in\mathcal{S}(\mathbb{R}^n\times\mathbb{R}^n)$; the proposition will follow using the density of $\mathcal{S}(\mathbb{R}^n\times\mathbb{R}^n)$ in $L^2(\mathbb{R}^n\times\mathbb{R}^n)$. In view of formula (11.15), the symbol $a$ is, for fixed $x$, $(2\pi\hbar)^{n/2}$ times the Fourier transform of the function $y\mapsto K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)$, hence, by Plancherel's formula,
$$\int_{\mathbb{R}^n}|a(x,p)|^2\,dp=(2\pi\hbar)^n\int_{\mathbb{R}^n}|K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)|^2\,dy$$
and, integrating with respect to $x$,
$$\int_{\mathbb{R}^{2n}}|a(z)|^2\,dz=(2\pi\hbar)^n\int_{\mathbb{R}^n}\Big(\int_{\mathbb{R}^n}|K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)|^2\,dy\Big)dx=(2\pi\hbar)^n\int_{\mathbb{R}^n\times\mathbb{R}^n}|K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)|^2\,dxdy \tag{11.19}$$
where we have applied Fubini's theorem (the integrals are absolutely convergent since $(x,y)\mapsto K(x+\tfrac{1}{2}y,x-\tfrac{1}{2}y)$ is in $\mathcal{S}(\mathbb{R}^n\times\mathbb{R}^n)$ because $K$ is). Now set $x'=x+\tfrac{1}{2}y$ and $y'=x-\tfrac{1}{2}y$; we have $dx'dy'=dxdy$, hence
$$\int_{\mathbb{R}^{2n}}|a(z)|^2\,dz=(2\pi\hbar)^n\int_{\mathbb{R}^n\times\mathbb{R}^n}|K(x',y')|^2\,dx'dy'$$
which we set out to prove. Formula (11.18) follows by the definition of the Hilbert–Schmidt norm (Definition 135).
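As a sanity check of (11.18), one can consider the orthogonal projection onto the standard coherent state. The worked computation below is a sketch under stated assumptions: the Gaussian $\phi_0$, the expression of its Wigner function, and the fact that the Weyl symbol of a rank-one projection is $(2\pi\hbar)^n$ times the Wigner function are standard facts taken for granted here rather than derived in this section.

```latex
\begin{align*}
\phi_0(x) &= (\pi\hbar)^{-n/4} e^{-|x|^2/2\hbar}, \qquad
W\phi_0(z) = (\pi\hbar)^{-n} e^{-|z|^2/\hbar},\\
a(z) &= (2\pi\hbar)^n W\phi_0(z) = 2^n e^{-|z|^2/\hbar}
  \quad\text{(Weyl symbol of } \widehat{A}\psi = (\psi|\phi_0)_{L^2}\,\phi_0\text{)},\\
\|a\|^2_{L^2(\mathbb{R}^{2n})} &= 4^n \int_{\mathbb{R}^{2n}} e^{-2|z|^2/\hbar}\,dz
  = 4^n \Big(\frac{\pi\hbar}{2}\Big)^{n} = (2\pi\hbar)^n,\\
\|\widehat{A}\|_2 &= (2\pi\hbar)^{-n/2}\,\|a\|_{L^2(\mathbb{R}^{2n})} = 1 .
\end{align*}
```

The value 1 agrees with the basis definition (11.9): choosing an orthonormal basis whose first vector is $\phi_0$, only one term of the sum $\sum_j\|\widehat{A}\psi_j\|^2_{L^2}$ is nonzero and it equals 1.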
11.4 Comments and references
The theory of Hilbert–Schmidt operators (and of the related trace class operators) is a well-established topic in functional analysis. It can be found in almost any introductory text on functional analysis. A good reference at the undergraduate level is MacCluer [57]; another is the monograph by Blanchard and Brüning [6]. In our presentation, we have deliberately emphasized the operators acting on the square-integrable functions, thus privileging the model Hilbert space $L^2(\mathbb{R}^n)$. We have however always given alternative definitions in terms of orthonormal bases that can be immediately extended to arbitrary (separable) Hilbert spaces.
12 The trace class
Trace class operators are essential for the study of mixed quantum states and their density operators.
12.1 Definitions of trace class operators
12.1.1 Statement of the definitions
Let us give two definitions of trace class operators; we will thereafter show that they are equivalent. The first definition expresses these operators directly in terms of Hilbert–Schmidt operators:
Definition 144. The space $\mathcal{L}_1(L^2(\mathbb{R}^n))$ of trace class operators on $L^2(\mathbb{R}^n)$ consists of all operators $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$ that are the product of two Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$. That is, $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ if and only if there exist $\widehat{B}$ and $\widehat{C}$ in $\mathcal{L}_2(L^2(\mathbb{R}^n))$ such that $\widehat{A}=\widehat{C}^*\widehat{B}$.
The notation $\widehat{A}=\widehat{C}^*\widehat{B}$ instead of, say, $\widehat{A}=\widehat{B}\widehat{C}$, might seem cumbersome: It will however make many arguments clearer as we will soon see. Note that this definition immediately implies the inclusion
$$\mathcal{L}_1(L^2(\mathbb{R}^n))\subset\mathcal{L}_2(L^2(\mathbb{R}^n))$$
in view of the algebra property of $\mathcal{L}_2(L^2(\mathbb{R}^n))$ (Proposition 142). The compactness of trace class operators follows, so we have the following spectral decomposition result:
Proposition 145. Let $\widehat{A}$ be a self-adjoint trace class operator on $L^2(\mathbb{R}^n)$. Then: (i) The set $(\lambda_j)$ of eigenvalues $\lambda_j$ of $\widehat{A}$ is at most countable and consists of real numbers such that $\sum_j|\lambda_j|<\infty$, and we have $\lim_{j\to\infty}\lambda_j=0$ if $(\lambda_j)$ is infinite.
Our second definition is more in the spirit of that of Hilbert–Schmidt operators because it uses orthonormal bases in $L^2(\mathbb{R}^n)$:
Definition 146. A bounded linear operator $\widehat{A}$ on $L^2(\mathbb{R}^n)$ is of trace class if and only if there exists a pair of orthonormal bases $(\psi_j)_j$ and $(\phi_k)_k$ of $L^2(\mathbb{R}^n)$ such that
$$\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|<\infty \tag{12.1}$$
where the series is absolutely convergent.
It immediately follows from this definition that $\mathcal{L}_1(L^2(\mathbb{R}^n))$ is a vector space: If $\sum_j|(\widehat{A}_1\phi_j|\chi_j)_{L^2}|<\infty$ and $\sum_j|(\widehat{A}_2\phi_j|\chi_j)_{L^2}|<\infty$, then we have, using the triangle inequality,
$$\sum_j|((\widehat{A}_1+\widehat{A}_2)\phi_j|\chi_j)_{L^2}|\le\sum_j|(\widehat{A}_1\phi_j|\chi_j)_{L^2}|+\sum_j|(\widehat{A}_2\phi_j|\chi_j)_{L^2}|<\infty$$
so that $\widehat{A}_1+\widehat{A}_2\in\mathcal{L}_1(L^2(\mathbb{R}^n))$; that $\lambda\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ if $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\lambda\in\mathbb{C}$ is clear.
https://doi.org/10.1515/9783110722772-012
12.1.2 Proof of the equivalence of the definitions
Let us show that the two definitions 144 and 146 of trace class operators are equivalent. We will need the following useful result about absolute values of trace class operators:
Proposition 147. The operator $\widehat{A}$ is of trace class if and only if its absolute value $|\widehat{A}|=(\widehat{A}^*\widehat{A})^{1/2}$ is of trace class.
Proof. In view of the polar decomposition theorem, there exists a unitary operator $\widehat{U}$ on $L^2(\mathbb{R}^n)$ such that $\widehat{A}=\widehat{U}(\widehat{A}^*\widehat{A})^{1/2}$. Supposing $\widehat{A}$ is of trace class, then
$$\sum_{j,k}|(\widehat{U}(\widehat{A}^*\widehat{A})^{1/2}\psi_j|\phi_k)_{L^2}|<\infty$$
for some orthonormal bases $(\psi_j)_j$ and $(\phi_k)_k$, that is
$$\sum_{j,k}|((\widehat{A}^*\widehat{A})^{1/2}\psi_j|\phi'_k)_{L^2}|<\infty$$
where $\phi'_k=\widehat{U}^{-1}\phi_k$. Since $\widehat{U}$ is unitary, $(\phi'_k)_k$ is also an orthonormal basis, hence $|\widehat{A}|=(\widehat{A}^*\widehat{A})^{1/2}$ is of trace class. If, conversely, $|\widehat{A}|$ is of trace class, then so is $\widehat{A}=\widehat{U}(\widehat{A}^*\widehat{A})^{1/2}$, replacing again $(\phi_k)_k$ with $(\widehat{U}^{-1}\phi_k)_k$.
Proposition 148. Definitions 144 and 146 define the same class of operators.
Proof. In what follows, $(\psi_j)_j$ and $(\phi_j)_j$ are two arbitrary orthonormal bases of $L^2(\mathbb{R}^n)$. Let us first show that Definition 144 ⇒ Definition 146. Suppose that $\widehat{A}=\widehat{C}^*\widehat{B}$ where $\widehat{B}$ and $\widehat{C}$ are both Hilbert–Schmidt operators. Since $(\widehat{A}\psi_j|\phi_k)_{L^2}=(\widehat{B}\psi_j|\widehat{C}\phi_k)_{L^2}$, we have, using successively the triangle and the Cauchy–Schwarz inequalities, together with the trivial inequality $bc\le\tfrac{1}{2}(b^2+c^2)$,
$$\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|\le\sum_{j,k}|(\widehat{B}\psi_j|\widehat{C}\phi_k)_{L^2}|\le\sum_{j,k}\|\widehat{B}\psi_j\|_{L^2}\|\widehat{C}\phi_k\|_{L^2}\le\frac{1}{2}\Big(\sum_j\|\widehat{B}\psi_j\|^2_{L^2}+\sum_k\|\widehat{C}\phi_k\|^2_{L^2}\Big)=\frac{1}{2}\big(\|\widehat{B}\|^2_2+\|\widehat{C}\|^2_2\big)<\infty,$$
the last equality since $\widehat{B}$ and $\widehat{C}$ are both Hilbert–Schmidt operators. We have thus proven that $\widehat{A}$ satisfies the condition in Definition 146. Let us next show that, conversely, Definition 146 ⇒ Definition 144. We choose orthonormal bases $(\psi_j)_j$ and $(\phi_j)_j$ such that
$$\sum_{j,k}|(|\widehat{A}|\psi_j|\phi_k)_{L^2}|<\infty \tag{12.2}$$
(recall from Proposition 147 that $\widehat{A}$ is of trace class if and only if $|\widehat{A}|$ is). Using the polar decomposition theorem, we can write $\widehat{A}=\widehat{U}(\widehat{A}^*\widehat{A})^{1/2}=\widehat{U}|\widehat{A}|$ where $\widehat{U}$ is unitary on $L^2(\mathbb{R}^n)$. Setting $\widehat{C}^*=\widehat{U}(\widehat{A}^*\widehat{A})^{1/4}$ and $\widehat{B}=(\widehat{A}^*\widehat{A})^{1/4}$, we have $\widehat{A}=\widehat{C}^*\widehat{B}$; let us show that $\widehat{C}$ and $\widehat{B}$ are both Hilbert–Schmidt operators; the proof will follow since $\widehat{A}$ will then satisfy the condition in Definition 144. To show that $\widehat{C}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$, it suffices to prove that $\widehat{C}^*\in\mathcal{L}_2(L^2(\mathbb{R}^n))$, that is
$$\sum_j|(\widehat{C}^*\psi_j|\widehat{C}^*\phi_j)_{L^2}|<\infty.$$
We have, since the absolute values of Hilbert–Schmidt operators also are Hilbert–Schmidt (Proposition 140),
$$\sum_{j,k}|(\widehat{C}^*\psi_j|\widehat{C}^*\phi_k)_{L^2}|=\sum_{j,k}|(\widehat{C}\widehat{C}^*\psi_j|\phi_k)_{L^2}|=\sum_{j,k}|((\widehat{A}^*\widehat{A})^{1/2}\psi_j|\phi_k)_{L^2}|=\sum_{j,k}|(|\widehat{A}|\psi_j|\phi_k)_{L^2}|<\infty,$$
hence $\widehat{C}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$ as claimed. To show that we also have $\widehat{B}\in\mathcal{L}_2(L^2(\mathbb{R}^n))$, the argument is exactly the same, noting that $\widehat{B}^*\widehat{B}=\widehat{C}\widehat{C}^*=|\widehat{A}|$.
Let us next show that in Definition 146 the choice of orthonormal bases in (12.1) has no importance: if condition (12.1) holds for one particular choice of orthonormal bases, then it holds for all. This will in addition allow us to define unambiguously the trace of a trace class operator.
Proposition 149. (i) An operator $\widehat{A}\in\mathcal{B}(L^2(\mathbb{R}^n))$ is of trace class if and only if
$$\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|<\infty \tag{12.3}$$
for every pair $((\psi_j)_j,(\phi_j)_j)$ of orthonormal bases of $L^2(\mathbb{R}^n)$. (ii) When this is the case, we have
$$\sum_j(\widehat{A}\psi_j|\psi_j)_{L^2}=\sum_j(\widehat{A}\phi_j|\phi_j)_{L^2}<\infty \tag{12.4}$$
for all orthonormal bases $(\psi_j)_j$ and $(\phi_j)_j$ of $L^2(\mathbb{R}^n)$ and where both series are absolutely convergent.
Proof. (i) Let $((\psi_j)_j,(\phi_j)_j)$ and $((\psi'_j)_j,(\phi'_j)_j)$ be two pairs of orthonormal bases of $L^2(\mathbb{R}^n)$. Writing Fourier expansions
$$\psi'_i=\sum_j(\psi'_i|\psi_j)_{L^2}\psi_j,\qquad\phi'_m=\sum_k(\phi'_m|\phi_k)_{L^2}\phi_k,$$
we have
$$(\widehat{A}\psi'_i|\phi'_m)_{L^2}=\sum_{j,k}(\psi'_i|\psi_j)_{L^2}\overline{(\phi'_m|\phi_k)_{L^2}}(\widehat{A}\psi_j|\phi_k)_{L^2}, \tag{12.5}$$
and hence
$$|(\widehat{A}\psi'_i|\phi'_m)_{L^2}|\le\sum_{j,k}|(\psi'_i|\psi_j)_{L^2}||(\phi'_m|\phi_k)_{L^2}||(\widehat{A}\psi_j|\phi_k)_{L^2}|.$$
Summing with respect to the indices $i,m$, this inequality yields
$$\sum_{i,m}|(\widehat{A}\psi'_i|\phi'_m)_{L^2}|\le\sum_{j,k}\Big(\sum_{i,m}|(\psi'_i|\psi_j)_{L^2}||(\phi'_m|\phi_k)_{L^2}|\Big)|(\widehat{A}\psi_j|\phi_k)_{L^2}|$$
(the interchange of summations is justified since the series consist of positive terms), and hence
$$\sum_{i,m}|(\widehat{A}\psi'_i|\phi'_m)_{L^2}|\le\frac{1}{2}\sum_{j,k}\Big(\sum_i|(\psi'_i|\psi_j)_{L^2}|^2+\sum_m|(\phi'_m|\phi_k)_{L^2}|^2\Big)|(\widehat{A}\psi_j|\phi_k)_{L^2}|.$$
Now,
$$\sum_i|(\psi'_i|\psi_j)_{L^2}|^2=\|\psi_j\|^2_{L^2}=1,\qquad\sum_m|(\phi'_m|\phi_k)_{L^2}|^2=\|\phi_k\|^2_{L^2}=1,$$
and hence the inequality becomes
$$\sum_{i,m}|(\widehat{A}\psi'_i|\phi'_m)_{L^2}|\le\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|.$$
Hence condition (12.1) holds for the bases $((\psi'_j)_j,(\phi'_j)_j)$ if it holds for $((\psi_j)_j,(\phi_j)_j)$. (ii) In view of the first part of the proposition, if $\widehat{A}$ is of trace class, then
$$\Big|\sum_j(\widehat{A}\psi_j|\psi_j)_{L^2}\Big|\le\sum_j|(\widehat{A}\psi_j|\psi_j)_{L^2}|\le\sum_{j,k}|(\widehat{A}\psi_j|\psi_k)_{L^2}|<\infty,$$
so the series $\sum_j(\widehat{A}\psi_j|\psi_j)_{L^2}$ is absolutely convergent. The equality (12.5) implies, in particular, that
$$(\widehat{A}\phi_i|\phi_i)_{L^2}=\sum_{j,k}(\phi_i|\psi_j)_{L^2}\overline{(\phi_i|\psi_k)_{L^2}}(\widehat{A}\psi_j|\psi_k)_{L^2},$$
and hence
$$\sum_i(\widehat{A}\phi_i|\phi_i)_{L^2}=\sum_{j,k}\Big(\sum_i(\phi_i|\psi_j)_{L^2}\overline{(\phi_i|\psi_k)_{L^2}}\Big)(\widehat{A}\psi_j|\psi_k)_{L^2}.$$
Since we have
$$\sum_i(\phi_i|\psi_j)_{L^2}\overline{(\phi_i|\psi_k)_{L^2}}=(\psi_k|\psi_j)_{L^2}=\delta_{jk},$$
the equality becomes
$$\sum_i(\widehat{A}\phi_i|\phi_i)_{L^2}=\sum_j(\widehat{A}\psi_j|\psi_j)_{L^2},$$
which was to be proven.
12.1.3 Definition of the trace
In many texts, one finds the following definition: Let $\widehat{A}$ be a positive semidefinite operator on some infinite-dimensional separable Hilbert space $H$, and let $(\psi_j)_j$ be an orthonormal basis of $H$. Then
$$\operatorname{Tr}(\widehat{A})=\sum_j(\widehat{A}\psi_j|\psi_j)_H\le\infty$$
is called the trace of $\widehat{A}$, and $\operatorname{Tr}(\widehat{A})$ does not depend on the choice of $(\psi_j)_j$. Doing so, one has to do thereafter some juggling to define the trace of an arbitrary operator (when it exists). We prefer the following approach:
Definition 150. Let $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ be a trace class operator. By definition, the trace of $\widehat{A}$ is the complex number
$$\operatorname{Tr}(\widehat{A})=\sum_j(\widehat{A}\psi_j|\psi_j)_{L^2} \tag{12.6}$$
whose value is independent of the choice of orthonormal basis $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$.
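It immediately follows, from this definition, that the trace satisfies the two following properties: If $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$, then
$$\operatorname{Tr}(\widehat{A}^*)=\overline{\operatorname{Tr}(\widehat{A})},\qquad\widehat{A}\ge 0\Rightarrow\operatorname{Tr}(\widehat{A})\ge 0 \tag{12.7}$$
(in particular $\operatorname{Tr}(\widehat{A})$ is a real number if $\widehat{A}^*=\widehat{A}$). Also note that, following the spectral theorem for compact operators, we have
$$\operatorname{Tr}(\widehat{A})=\sum_j\lambda_j \tag{12.8}$$
where the $\lambda_j$ are the eigenvalues of $\widehat{A}$.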
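12.2 Properties of the trace class
12.2.1 Trace class operators form an ideal
Let us first show that the trace class is invariant under conjugation with unitary operators:
Proposition 151. Let $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\widehat{U}$ be a unitary operator on $L^2(\mathbb{R}^n)$. Then, $\widehat{U}^*\widehat{A}\widehat{U}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and has the same trace as $\widehat{A}$:
$$\operatorname{Tr}(\widehat{U}^*\widehat{A}\widehat{U})=\operatorname{Tr}(\widehat{A}). \tag{12.9}$$
Proof. The operator $\widehat{A}$ is of trace class if and only if $\sum_j|(\widehat{A}\psi_j|\psi_j)_{L^2}|<\infty$ for one (and, hence, every) orthonormal basis $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$. Since $(\widehat{U}^*\widehat{A}\widehat{U}\psi_j|\psi_j)_{L^2}=(\widehat{A}\widehat{U}\psi_j|\widehat{U}\psi_j)_{L^2}$ and $(\widehat{U}\psi_j)_j$ also is an orthonormal basis of $L^2(\mathbb{R}^n)$, it follows that $\widehat{U}^*\widehat{A}\widehat{U}$ is of trace class; formula (12.9) now follows: We have
$$\operatorname{Tr}(\widehat{U}^*\widehat{A}\widehat{U})=\sum_j(\widehat{U}^*\widehat{A}\widehat{U}\psi_j|\psi_j)_{L^2}=\sum_j(\widehat{A}\widehat{U}\psi_j|\widehat{U}\psi_j)_{L^2}=\operatorname{Tr}(\widehat{A}).$$
Note that formula (12.9) reflects the freedom we have in choosing the orthonormal basis in which the trace is calculated (formula (12.6)). Trace class operators form an ideal in the bounded operators:
Proposition 152. The set $\mathcal{L}_1(L^2(\mathbb{R}^n))$ of all trace class operators on $L^2(\mathbb{R}^n)$ is a two-sided ideal in $\mathcal{B}(L^2(\mathbb{R}^n))$: If $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\widehat{B}\in\mathcal{B}(L^2(\mathbb{R}^n))$, then $\widehat{A}\widehat{B}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\widehat{B}\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$, and we have
$$\operatorname{Tr}(\widehat{A}\widehat{B})=\operatorname{Tr}(\widehat{B}\widehat{A}). \tag{12.10}$$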
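Proof. Let us show that $\widehat{B}\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ if $\widehat{A}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\widehat{B}$ is a bounded operator on $L^2(\mathbb{R}^n)$. Recall that the boundedness of $\widehat{B}$ is equivalent to the existence of a number $C\ge 0$ such that $\|\widehat{B}\psi\|\le C\|\psi\|$ for all $\psi\in L^2(\mathbb{R}^n)$. Let now $(\psi_j)_j$ and $(\phi_j)_j$ be two orthonormal bases of $L^2(\mathbb{R}^n)$; writing $(\widehat{B}\widehat{A}\psi_j|\phi_j)_{L^2}=(\widehat{A}\psi_j|\widehat{B}^*\phi_j)_{L^2}$ and applying Bessel's equality to $(\widehat{A}\psi_j|\widehat{B}^*\phi_j)_{L^2}$, we get
$$(\widehat{B}\widehat{A}\psi_j|\phi_j)_{L^2}=\sum_k(\widehat{A}\psi_j|\phi_k)_{L^2}\overline{(\widehat{B}^*\phi_j|\phi_k)_{L^2}}. \tag{12.11}$$
Using the Cauchy–Schwarz inequality, we have, since $\|\widehat{B}^*\phi_j\|\le C$,
$$|(\widehat{B}^*\phi_j|\phi_k)_{L^2}|\le\|\widehat{B}^*\phi_j\|_{L^2}\|\phi_k\|_{L^2}\le C,$$
and hence
$$|(\widehat{B}\widehat{A}\psi_j|\phi_j)_{L^2}|\le\sum_k|(\widehat{A}\psi_j|\phi_k)_{L^2}||(\widehat{B}^*\phi_j|\phi_k)_{L^2}|\le C\sum_k|(\widehat{A}\psi_j|\phi_k)_{L^2}|.$$
Summing this inequality with respect to the index $j$ yields, since $\widehat{A}$ is of trace class,
$$\sum_j|(\widehat{B}\widehat{A}\psi_j|\phi_j)_{L^2}|\le C\sum_{j,k}|(\widehat{A}\psi_j|\phi_k)_{L^2}|<\infty,$$
hence $\widehat{B}\widehat{A}$ is of trace class as claimed. That $\widehat{A}\widehat{B}$ also is of trace class is immediate, noting that we can write $\widehat{A}\widehat{B}=(\widehat{B}^*\widehat{A}^*)^*$ and using the fact that the trace class is preserved by passage to the adjoint. It remains to prove the trace equality (12.10). Choosing $(\psi_j)_j=(\phi_j)_j$, the Bessel equality (12.11) yields
$$(\widehat{B}\widehat{A}\psi_j|\psi_j)_{L^2}=\sum_k(\widehat{A}\psi_j|\psi_k)_{L^2}\overline{(\widehat{B}^*\psi_j|\psi_k)_{L^2}},$$
and hence, summing over $j$,
$$\operatorname{Tr}(\widehat{B}\widehat{A})=\sum_j(\widehat{B}\widehat{A}\psi_j|\psi_j)_{L^2}=\sum_{j,k}(\widehat{A}\psi_j|\psi_k)_{L^2}\overline{(\widehat{B}^*\psi_j|\psi_k)_{L^2}}. \tag{12.12}$$
Similarly,
$$(\widehat{A}\widehat{B}\psi_j|\psi_j)_{L^2}=\sum_k(\widehat{B}\psi_j|\psi_k)_{L^2}\overline{(\widehat{A}^*\psi_j|\psi_k)_{L^2}}=\sum_k(\widehat{A}\psi_k|\psi_j)_{L^2}\overline{(\widehat{B}^*\psi_k|\psi_j)_{L^2}},$$
hence
$$\operatorname{Tr}(\widehat{A}\widehat{B})=\sum_j(\widehat{A}\widehat{B}\psi_j|\psi_j)_{L^2}=\sum_{j,k}(\widehat{A}\psi_k|\psi_j)_{L^2}\overline{(\widehat{B}^*\psi_k|\psi_j)_{L^2}}. \tag{12.13}$$
Comparing (12.12) and (12.13), which differ only by the names of the summation indices, we have $\operatorname{Tr}(\widehat{A}\widehat{B})=\operatorname{Tr}(\widehat{B}\widehat{A})$ as we set out to prove.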
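The identities (12.9) and (12.10) already hold for matrices, where every operator is of trace class, and can be checked directly. The following is a minimal sketch in that finite-dimensional setting; the matrices `A`, `B`, `U` and the helper functions are arbitrary illustrative choices, not taken from the text.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries, i.e. sum_j (A e_j | e_j) in the standard basis."""
    return sum(A[i][i] for i in range(len(A)))

def dagger(A):
    """Conjugate transpose (adjoint) of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

A = [[1 + 1j, 2.0], [0.5j, -1.0]]
B = [[0.0, 1j], [3.0, 2 - 1j]]

# A unitary matrix: its columns (1, i)/sqrt(2) and (i, 1)/sqrt(2) are orthonormal.
r = 1.0 / math.sqrt(2.0)
U = [[r, r * 1j], [r * 1j, r]]

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_conj = trace(matmul(dagger(U), matmul(A, U)))
print(tr_AB, tr_BA, tr_conj, trace(A))
```

Here `tr_AB` and `tr_BA` coincide although the products `AB` and `BA` are different matrices, and conjugation by `U` leaves the trace of `A` unchanged.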
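12.2.2 The trace class norm
Let us define the trace class norm:
Definition 153. The trace class norm is the mapping $\|\cdot\|_1:\mathcal{L}_1(L^2(\mathbb{R}^n))\to\mathbb{R}_+$ defined by $\|\widehat{A}\|_1=\operatorname{Tr}(|\widehat{A}|)$ where $|\widehat{A}|=(\widehat{A}^*\widehat{A})^{1/2}$ is the modulus of $\widehat{A}$.
Clearly, $\|\widehat{A}\|_1\ge 0$ and $\|\alpha\widehat{A}\|_1=|\alpha|\|\widehat{A}\|_1$ for $\alpha\in\mathbb{C}$. Also, $\|\widehat{A}\|_1=0$ implies that $|\widehat{A}|=0$, hence $\widehat{A}=0$. It remains to prove the triangle inequality
$$\|\widehat{A}+\widehat{B}\|_1\le\|\widehat{A}\|_1+\|\widehat{B}\|_1. \tag{12.14}$$
For this, we will need the formula
$$\|\widehat{A}\|_1=\sup\{|\operatorname{Tr}(\widehat{X}\widehat{A})|:\widehat{X}\in\mathcal{L}_1(L^2(\mathbb{R}^n)),\ \|\widehat{X}\|\le 1\}, \tag{12.15}$$
which follows from the double inequality
$$|\operatorname{Tr}(\widehat{X}\widehat{A})|\le\|\widehat{X}\widehat{A}\|_1\le\|\widehat{X}\|\|\widehat{A}\|_1$$
(we leave the proof of (12.15) to the reader as a pleasant exercise). Using (12.15), we get
$$\|\widehat{A}+\widehat{B}\|_1=\sup_{\|\widehat{X}\|\le 1}|\operatorname{Tr}(\widehat{X}\widehat{A})+\operatorname{Tr}(\widehat{X}\widehat{B})|\le\sup_{\|\widehat{X}\|\le 1}\big(|\operatorname{Tr}(\widehat{X}\widehat{A})|+|\operatorname{Tr}(\widehat{X}\widehat{B})|\big)\le\sup_{\|\widehat{Y}\|\le 1,\,\|\widehat{Z}\|\le 1}\big(|\operatorname{Tr}(\widehat{Y}\widehat{A})|+|\operatorname{Tr}(\widehat{Z}\widehat{B})|\big).$$
We can rewrite the last inequality as
$$\|\widehat{A}+\widehat{B}\|_1\le\sup_{\|\widehat{Y}\|\le 1}|\operatorname{Tr}(\widehat{Y}\widehat{A})|+\sup_{\|\widehat{Z}\|\le 1}|\operatorname{Tr}(\widehat{Z}\widehat{B})|=\|\widehat{A}\|_1+\|\widehat{B}\|_1,$$
which ends the proof of the triangle inequality for the trace norm.
Proposition 154. $\mathcal{L}_1(L^2(\mathbb{R}^n))$ is a Banach algebra for the trace class norm.
We omit the proof of this property here; see the comments and references section.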
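To see how the trace norm differs from the trace itself, it may help to compute both for a rank-one operator. The following worked example is only a sketch: the operator $\widehat{A}\psi=(\psi|\phi)_{L^2}\chi$ with fixed nonzero $\phi,\chi\in L^2(\mathbb{R}^n)$ is an illustrative choice, and the scalar product is taken linear in its first argument, as in the rest of the text.

```latex
\begin{align*}
\widehat{A}\psi &= (\psi|\phi)_{L^2}\,\chi, \qquad
\widehat{A}^*\eta = (\eta|\chi)_{L^2}\,\phi,\\
\widehat{A}^*\widehat{A}\psi &= \|\chi\|^2_{L^2}\,(\psi|\phi)_{L^2}\,\phi
  = \|\chi\|^2_{L^2}\|\phi\|^2_{L^2}\,\widehat{P}_\phi\psi,
  \qquad \widehat{P}_\phi\psi = \frac{(\psi|\phi)_{L^2}}{\|\phi\|^2_{L^2}}\,\phi,\\
|\widehat{A}| &= (\widehat{A}^*\widehat{A})^{1/2} = \|\chi\|_{L^2}\|\phi\|_{L^2}\,\widehat{P}_\phi,
  \qquad \|\widehat{A}\|_1 = \operatorname{Tr}(|\widehat{A}|) = \|\chi\|_{L^2}\|\phi\|_{L^2},\\
\operatorname{Tr}(\widehat{A}) &= \sum_j (\psi_j|\phi)_{L^2}(\chi|\psi_j)_{L^2} = (\chi|\phi)_{L^2},
  \qquad |\operatorname{Tr}(\widehat{A})| \le \|\widehat{A}\|_1 .
\end{align*}
```

The last inequality is just the Cauchy–Schwarz inequality; when $\chi$ is orthogonal to $\phi$ the trace vanishes although the trace norm does not.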
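12.3 Trace formulas
12.3.1 A product formula
Recall that every trace class operator is the product of two Hilbert–Schmidt operators.
Proposition 155. Let $\widehat{A}=\mathrm{Op}_W(a)$ and $\widehat{B}=\mathrm{Op}_W(b)$ be Hilbert–Schmidt operators. The trace of $\widehat{A}\widehat{B}\in\mathcal{L}_1(L^2(\mathbb{R}^n))$ is given by
$$\operatorname{Tr}(\widehat{A}\widehat{B})=\Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}}a(z)b(z)\,dz. \tag{12.16}$$
Proof. Since $\widehat{A}$ and $\widehat{B}$ are Hilbert–Schmidt operators, we have $a\in L^2(\mathbb{R}^{2n})$ and $b\in L^2(\mathbb{R}^{2n})$. Let $(\psi_j)_j$ be an orthonormal basis of $L^2(\mathbb{R}^n)$; we have
$$\operatorname{Tr}(\widehat{A}\widehat{B})=\sum_{j=1}^\infty(\widehat{A}\widehat{B}\psi_j|\psi_j)_{L^2}=\sum_{j=1}^\infty(\widehat{B}\psi_j|\widehat{A}^*\psi_j)_{L^2}.$$
Expanding $\widehat{B}\psi_j$ and $\widehat{A}^*\psi_j$ in the basis $(\psi_j)_j$, we have
$$\widehat{B}\psi_j=\sum_{k=1}^\infty(\widehat{B}\psi_j|\psi_k)_{L^2}\psi_k,\qquad\widehat{A}^*\psi_j=\sum_{\ell=1}^\infty(\psi_j|\widehat{A}\psi_\ell)_{L^2}\psi_\ell,$$
and hence
$$(\widehat{B}\psi_j|\widehat{A}^*\psi_j)_{L^2}=\sum_{k=1}^\infty(\widehat{B}\psi_j|\psi_k)_{L^2}(\widehat{A}\psi_k|\psi_j)_{L^2}.$$
Observing that, since $\overline{W(\psi_j,\psi_k)}=W(\psi_k,\psi_j)$, we have
$$(\widehat{A}\psi_k|\psi_j)_{L^2}=\int_{\mathbb{R}^{2n}}a(z)W(\psi_k,\psi_j)(z)\,dz=(a|W(\psi_j,\psi_k))_{L^2(\mathbb{R}^{2n})},$$
$$(\widehat{B}\psi_j|\psi_k)_{L^2}=\int_{\mathbb{R}^{2n}}b(z)W(\psi_j,\psi_k)(z)\,dz=\overline{(\overline{b}|W(\psi_j,\psi_k))_{L^2(\mathbb{R}^{2n})}},$$
the previous equality can be rewritten as
$$(\widehat{B}\psi_j|\widehat{A}^*\psi_j)_{L^2}=\sum_{k=1}^\infty(a|W(\psi_j,\psi_k))_{L^2(\mathbb{R}^{2n})}\overline{(\overline{b}|W(\psi_j,\psi_k))_{L^2(\mathbb{R}^{2n})}}.$$
Recalling (Proposition 64) that, if $(\psi_j)_j$ is an orthonormal basis of $L^2(\mathbb{R}^n)$, then the vectors $\Phi_{j,k}=(2\pi\hbar)^{n/2}W(\psi_k,\psi_j)$ form an orthonormal basis of $L^2(\mathbb{R}^{2n})$, we have
$$\operatorname{Tr}(\widehat{A}\widehat{B})=\sum_j(\widehat{B}\psi_j|\widehat{A}^*\psi_j)_{L^2(\mathbb{R}^n)}=\Big(\frac{1}{2\pi\hbar}\Big)^{n}\sum_{j,k}(a|\Phi_{k,j})_{L^2(\mathbb{R}^{2n})}\overline{(\overline{b}|\Phi_{k,j})_{L^2(\mathbb{R}^{2n})}}=\Big(\frac{1}{2\pi\hbar}\Big)^{n}(a|\overline{b})_{L^2(\mathbb{R}^{2n})}=\Big(\frac{1}{2\pi\hbar}\Big)^{n}\int_{\mathbb{R}^{2n}}a(z)b(z)\,dz,$$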
and this proves formula (12.16).

Remark 156. Let $\hat A$ and $\hat B$ be two Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$; the formula
$$(\hat A|\hat B)_{\mathcal{L}_2} = \operatorname{Tr}(\hat A^*\hat B)$$
defines a scalar product on $\mathcal{L}_2(L^2(\mathbb{R}^n))$.

12.3.2 Relationship between the trace and the Weyl symbol

Let $\hat A$ be a self-adjoint trace class operator on $L^2(\mathbb{R}^n)$. In view of the spectral theorem for compact operators, there exists an at most countable set $(\psi_j)$ of orthonormal eigenvectors of $\hat A$ with eigenvalues $\lambda_j$ such that, if $\psi_j$ is an eigenvector for $\lambda_j$, then
$$\hat A\psi = \sum_j \lambda_j(\psi|\psi_j)_{L^2}\,\psi_j \tag{12.17}$$
where $(\lambda_j)_j \in \ell^1(\mathbb{N})$. As a consequence, the Weyl symbol of $\hat A$ is given by the absolutely convergent series
$$a(z) = (2\pi\hbar)^n \sum_j \lambda_j W\psi_j(z). \tag{12.18}$$
We omit the proof of the following result due to Shubin:

Proposition 157. Let $a$ be in the Shubin symbol class $\Gamma^m_\rho(\mathbb{R}^{2n})$ with $m < -2n$. Then the Weyl operator $\hat A = \operatorname{Op}_W(a)$ is of trace class and has trace
$$\operatorname{Tr}(\hat A) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} a(z)\,dz. \tag{12.19}$$
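If one assumes from the beginning that $\operatorname{Op}_W(a)$ is of trace class, one has the following stronger result:

Proposition 158. Let $\hat A = \operatorname{Op}_W(a)$ be a trace class operator. If $a \in L^1(\mathbb{R}^{2n})$, then
$$\operatorname{Tr}(\hat A) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} a(z)\,dz. \tag{12.20}$$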
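Proof. We notice that the trace formula (12.20) can be rewritten as
$$\operatorname{Tr}(\hat A) = a_\sigma(0)$$
where $a_\sigma = F_\sigma a$ is the symplectic Fourier transform of $a \in L^1(\mathbb{R}^{2n})$. Let us write $\hat A = \hat B\hat C$ where $\hat B$ and $\hat C$ are Hilbert–Schmidt operators. Using formula (12.16) in Proposition 155, we have
$$\operatorname{Tr}(\hat A) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} b(z)c(z)\,dz.$$
Let us show that
$$a_\sigma(0) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} b(z)c(z)\,dz;$$
formula (12.20) will follow. We have, in view of the composition formula (5.16),
$$a_\sigma(z) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z,z')}\,b_\sigma(z-z')\,c_\sigma(z')\,dz',$$
and hence
$$a_\sigma(0) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} (b_\sigma)^\vee(z)\,c_\sigma(z)\,dz = \Big(\frac{1}{2\pi\hbar}\Big)^n \big((b_\sigma)^\vee\,\big|\,\overline{c_\sigma}\big)_{L^2(\mathbb{R}^{2n})}$$
with $(b_\sigma)^\vee(z) = b_\sigma(-z)$. Noting that $(b_\sigma)^\vee = (b^\vee)_\sigma$ and $\overline{c_\sigma} = (\overline{c}^{\,\vee})_\sigma$, we thus have, since the symplectic Fourier transform is unitary,
$$\begin{aligned}
a_\sigma(0) &= \Big(\frac{1}{2\pi\hbar}\Big)^n \big((b^\vee)_\sigma\,\big|\,(\overline{c}^{\,\vee})_\sigma\big)_{L^2(\mathbb{R}^{2n})} \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^n \big(b^\vee\,\big|\,\overline{c}^{\,\vee}\big)_{L^2(\mathbb{R}^{2n})} \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} b(z)c(z)\,dz,
\end{aligned}$$
which was to be proven.

Remark 159. One can show that, if $a \in L^1(\mathbb{R}^{2n})$ but $a \notin L^2(\mathbb{R}^{2n})$, then $\hat A$ is not a trace class operator.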
12.4 Comments and references

There are many equivalent ways to address the topic of trace class operators, whose importance will become clear in the next chapter, where we explain the density operator formalism. We have been loosely following the approach in [31], with corrections, modifications, and complements. The elementary theory of Hilbert–Schmidt and trace class operators as studied here is not difficult (it is essentially linear algebra in infinite-dimensional vector spaces), but it involves various tricks and juggling with cascades of indices. An elegant (but very concise) description that apparently avoids these calculations can be found in Hörmander [46], § 19.1, or in Shubin [66] (Appendix 3). Again, MacCluer [57] and Blanchard and Brüning [6] are good recent introductory texts. For the proof of Proposition 157, see Shubin [66], § 27.1 (Proposition 27.2(2)). The proof of Proposition 158 is due to Du and Wong [23] (Theorem 2.4). The proof of the triangle inequality (12.14) and of formula (12.15) for the trace norm is due to Shubin [66], § A.3.4, in the form presented here; one also finds in this reference a full proof of the completeness of the trace class.
13 The quantum Bochner theorem

The quantum Bochner theorem deals with positivity questions for trace class operators. It is a difficult topic, which is still being investigated in the research literature.
13.1 The positivity question

Recall that every continuous linear operator $\hat A : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ can be expressed in terms of the symplectic Fourier transform of its Weyl symbol and the Heisenberg displacement operator as
$$\hat A\psi = \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} a_\sigma(z)\hat D(z)\psi\,dz \tag{13.1}$$
for $\psi \in \mathcal{S}(\mathbb{R}^n)$ and, equivalently, in terms of the reflection operator,
$$\hat A\psi = \Big(\frac{1}{\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} a(z)\hat R(z)\psi\,dz. \tag{13.2}$$
In this chapter, we investigate which conditions $a$ or $a_\sigma$ should satisfy for a trace class operator $\hat A$ to be positive semidefinite, i.e.,
$$(\hat A\psi|\psi) \ge 0 \quad \text{for all } \psi \in L^2(\mathbb{R}^n).$$
This question is actually much more subtle than it seems; the main obstacle is the difficulty of giving tractable conditions on the symbol $a$. Much literature has been devoted to the subject (see the comments and references section at the end of the chapter).
13.2 Bochner’s theorem

A famous theorem of Bochner says that a (complex-valued) function $f$ on $\mathbb{R}^n$, continuous at the origin and such that $f(0) = 1$, is the Fourier transform of a probability density on $\mathbb{R}^n$ if and only if it is of positive type, that is, if for all choices of points $z_1,\dots,z_N \in \mathbb{R}^n$, the $N \times N$ matrix
$$F_{(N)} = (f(z_j - z_k))_{1\le j,k\le N} \tag{13.3}$$
is positive semidefinite (that is, the eigenvalues of $F_{(N)}$ are all $\ge 0$). Let us introduce a modification of the symplectic Fourier transform $F_\sigma$.

https://doi.org/10.1515/9783110722772-013

Definition 160. The reduced symplectic Fourier transform $F_\diamond$ is defined, for $a \in L^1(\mathbb{R}^{2n})$, by
$$a_\diamond(z) = F_\diamond a(z) = \int e^{-i\sigma(z,z')}a(z')\,d^{2n}z'. \tag{13.4}$$
The functions $a_\diamond$ and $a_\sigma = F_\sigma a$ are related by the formula
$$a_\diamond(z) = (2\pi\hbar)^n a_\sigma(\hbar z). \tag{13.5}$$
With this notation, Bochner’s theorem on Fourier transforms of probability measures can be restated in the following way: A real function $\rho$ on $\mathbb{R}^{2n}$ is a probability density if and only if its reduced symplectic Fourier transform $\rho_\diamond$ is continuous, $\rho_\diamond(0) = 1$, and, for all choices of $z_1,\dots,z_N \in \mathbb{R}^{2n}$, the $N \times N$ matrix $\Lambda$ whose entries are the complex numbers $\rho_\diamond(z_j - z_k)$ is positive semidefinite:
$$\Lambda = (\rho_\diamond(z_j - z_k))_{1\le j,k\le N} \ge 0. \tag{13.6}$$
When condition (13.6) is satisfied, one says that the reduced symplectic Fourier transform $\rho_\diamond$ is of positive type.
13.3 The quantum case

13.3.1 The KLM conditions

To deal with the quantum case, we need the following definition:

Definition 161 (KLM conditions). Let $a \in L^1(\mathbb{R}^{2n})$; we say that $a_\diamond$ is of ℏ-positive type if for every integer $N$ the $N \times N$ matrix $\Lambda_{(N)}$ with entries
$$\Lambda_{jk} = e^{-\frac{i\hbar}{2}\sigma(z_j,z_k)}\,a_\diamond(z_j - z_k)$$
is positive semidefinite for all choices of $(z_1, z_2, \dots, z_N) \in (\mathbb{R}^{2n})^N$:
$$\Lambda_{(N)} = (\Lambda_{jk})_{1\le j,k\le N} \ge 0. \tag{13.7}$$
The condition (13.7) is equivalent to the polynomial inequalities
$$\sum_{1\le j,k\le N} \zeta_j\overline{\zeta_k}\,e^{-\frac{i\hbar}{2}\sigma(z_j,z_k)}\,a_\diamond(z_j - z_k) \ge 0 \tag{13.8}$$
for all $N \in \mathbb{N}$, $\zeta_j, \zeta_k \in \mathbb{C}$, and $z_j, z_k \in \mathbb{R}^{2n}$. It is easy to see that this implies $a_\diamond(-z) = \overline{a_\diamond(z)}$, and therefore $a$ must be real-valued.
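The KLM matrix of Definition 161 can be tested numerically for small $N$. The following is a minimal sketch for $n = 1$ and $N = 3$, using only the Python standard library. The helper names (`sigma`, `klm_matrix`, `is_psd_3x3`, `gaussian`), the sign convention chosen for $\sigma$, the three sample points, and the test function $a_\diamond(z) = e^{-s|z|^2/2}$ are all assumptions made for illustration; they are not taken from the text.

```python
import cmath
import math


def sigma(z, w):
    """Standard symplectic form on R^2: sigma((x, p), (x', p')) = p*x' - p'*x."""
    return z[1] * w[0] - w[1] * z[0]


def klm_matrix(a_diamond, points, hbar):
    """Matrix with entries exp(-(i*hbar/2) * sigma(z_j, z_k)) * a_diamond(z_j - z_k)."""
    return [
        [
            cmath.exp(-0.5j * hbar * sigma(zj, zk))
            * a_diamond((zj[0] - zk[0], zj[1] - zk[1]))
            for zk in points
        ]
        for zj in points
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def is_psd_3x3(m, tol=1e-12):
    """A Hermitian 3x3 matrix is positive semidefinite iff all principal minors are >= 0."""
    diagonal = all(m[i][i].real >= -tol for i in range(3))
    pairs = all(
        (m[i][i] * m[j][j] - m[i][j] * m[j][i]).real >= -tol
        for i, j in ((0, 1), (0, 2), (1, 2))
    )
    return diagonal and pairs and det3(m).real >= -tol


def gaussian(s):
    """Hypothetical test function a_diamond(z) = exp(-s |z|^2 / 2)."""
    return lambda z: math.exp(-0.5 * s * (z[0] ** 2 + z[1] ** 2))


HBAR = 1.0
POINTS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # illustrative sample points

passes_s_half = is_psd_3x3(klm_matrix(gaussian(0.5), POINTS, HBAR))
passes_s_quarter = is_psd_3x3(klm_matrix(gaussian(0.25), POINTS, HBAR))
print(passes_s_half, passes_s_quarter)
```

For these three points and ℏ = 1, the matrix is positive semidefinite when s = 0.5 and has a negative determinant when s = 0.25. A passing test on finitely many points is only a necessary condition: (13.7) requires positivity for every $N$ and every choice of points.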
Remark 162. The conditions (13.7) are often called the “KLM conditions” in the literature, following the seminal work of Kastler, Loupias, and Miracle-Sole (see the comments and references at the end of this chapter).

Remark 163. One can prove that, if the function $a : \mathbb{R}^{2n} \to \mathbb{C}$ is of ℏ-positive type, then $a \in L^2(\mathbb{R}^{2n})$.

13.3.2 The main result; statement and proof

We will need the following classical lemma, due to Schur. Recall that the Hadamard (or Schur) product of two matrices $M'_{(N)} = (M'_{jk})_{1\le j,k\le N}$ and $M''_{(N)} = (M''_{jk})_{1\le j,k\le N}$ is defined by entrywise multiplication:
$$M'_{(N)} \circ M''_{(N)} = (M'_{jk}M''_{jk})_{1\le j,k\le N}.$$

Lemma 164. Let $M_{(N)} = (M_{jk})_{1\le j,k\le N}$ be the Hadamard product $M'_{(N)} \circ M''_{(N)}$ of the matrices $M'_{(N)}$ and $M''_{(N)}$. If $M'_{(N)}$ and $M''_{(N)}$ are positive semidefinite, then so is $M_{(N)}$.
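Lemma 164 is easy to check by hand on small matrices. The sketch below does so for two Hermitian 2 × 2 matrices, using the elementary fact that a Hermitian 2 × 2 matrix is positive semidefinite exactly when its diagonal entries and its determinant are non-negative. The matrices `M1`, `M2` and the helper names are illustrative choices, not taken from the text.

```python
def hadamard(a, b):
    """Entrywise (Hadamard, or Schur) product of two matrices of the same shape."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def det2(m):
    """Determinant of a Hermitian 2x2 matrix (real up to rounding)."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real


def is_psd_2x2(m, tol=1e-12):
    """Hermitian 2x2 matrix: PSD iff both diagonal entries and the determinant are >= 0."""
    return m[0][0].real >= -tol and m[1][1].real >= -tol and det2(m) >= -tol


# Two Hermitian positive semidefinite matrices (illustrative values).
M1 = [[2, 1 + 1j], [1 - 1j, 1]]   # determinant 2 - |1 + i|^2 = 0 (rank one)
M2 = [[1, 0.5j], [-0.5j, 1]]      # determinant 1 - 0.25 = 0.75

M = hadamard(M1, M2)              # [[2, -0.5 + 0.5j], [-0.5 - 0.5j, 1]]
print(is_psd_2x2(M1), is_psd_2x2(M2), is_psd_2x2(M), det2(M))
```

The product has determinant 2 − 0.5 = 1.5, so it is again positive semidefinite, as the lemma predicts.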
Let us now prove the quantum Bochner theorem:

Theorem 165 (Quantum Bochner). Let $\hat A = \operatorname{Op}_W(a)$ be a self-adjoint trace-class operator on $L^2(\mathbb{R}^n)$ with symbol $a \in L^1(\mathbb{R}^{2n})$. We have $\hat A \ge 0$ if and only if $a_\diamond$ is of ℏ-positive type.

Proof. Notice that the self-adjointness of $\hat A$ implies that the symbol $a$ is real. If $\hat A \ge 0$, then we have (formula (12.18))
$$a = (2\pi\hbar)^n \sum_j \alpha_j W\psi_j \tag{13.9}$$
for an orthonormal basis $(\psi_j)_j$ in $L^2(\mathbb{R}^n)$, the coefficients $\alpha_j$ being $\ge 0$ and $(\alpha_j)_j \in \ell^1(\mathbb{N})$. It is thus sufficient to show that the Wigner transform $W\psi$ of an arbitrary $\psi \in L^2(\mathbb{R}^n)$ is of ℏ-positive type. This amounts to showing that
$$I_N(\psi) = \sum_{1\le j,k\le N} \zeta_j\overline{\zeta_k}\,e^{-\frac{i}{2\hbar}\sigma(z_j,z_k)}\,F_\sigma W\psi(z_j - z_k) \ge 0 \tag{13.10}$$
for every complex vector $(\zeta_1,\dots,\zeta_N) \in \mathbb{C}^N$ and every sequence $(z_1,\dots,z_N) \in (\mathbb{R}^{2n})^N$. Since the Wigner distribution $W\psi$ and the ambiguity function $A\psi$ are obtained from each other by the symplectic Fourier transform $F_\sigma$, we have
$$I_N(\psi) = \sum_{1\le j,k\le N} \zeta_j\overline{\zeta_k}\,e^{-\frac{i}{2\hbar}\sigma(z_j,z_k)}\,A\psi(z_j - z_k).$$
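Let us prove that
$$I_N(\psi) = \Big(\frac{1}{2\pi\hbar}\Big)^n \Big\|\sum_{1\le j\le N}\zeta_j\hat D(-z_j)\psi\Big\|^2_{L^2}; \tag{13.11}$$
the inequality (13.10) will follow. Taking into account the fact that $\hat D(-z_k)^* = \hat D(z_k)$ and using the relationship (2.6), we have
$$\hat D(z_k - z_j) = e^{-\frac{i}{2\hbar}\sigma(z_j,z_k)}\,\hat D(z_k)\hat D(-z_j), \tag{13.12}$$
hence, expanding the square in the right-hand side of (13.11),
$$\begin{aligned}
\Big\|\sum_{1\le j\le N}\zeta_j\hat D(-z_j)\psi\Big\|^2_{L^2} &= \sum_{1\le j,k\le N}\zeta_j\overline{\zeta_k}\,(\hat D(-z_j)\psi|\hat D(-z_k)\psi)_{L^2} \\
&= \sum_{1\le j,k\le N}\zeta_j\overline{\zeta_k}\,(\hat D(z_k)\hat D(-z_j)\psi|\psi)_{L^2} \\
&= \sum_{1\le j,k\le N}\zeta_j\overline{\zeta_k}\,e^{\frac{i}{2\hbar}\sigma(z_j,z_k)}\,(\hat D(z_k - z_j)\psi|\psi)_{L^2} \\
&= (2\pi\hbar)^n \sum_{1\le j,k\le N}\zeta_j\overline{\zeta_k}\,e^{\frac{i}{2\hbar}\sigma(z_j,z_k)}\,A\psi(z_j - z_k),
\end{aligned}$$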
and this proves the equality (13.11).

Assuming that, conversely, $a_\diamond$ is of ℏ-positive type, let us show that $(\hat A\psi|\psi)_{L^2} \ge 0$ for all $\psi \in L^2(\mathbb{R}^n)$. Since the operator $\hat A$ is bounded on $L^2(\mathbb{R}^n)$, it is sufficient to prove this for $\psi \in \mathcal{S}(\mathbb{R}^n)$, or equivalently
$$\int_{\mathbb{R}^{2n}} a(z)W\psi(z)\,dz \ge 0 \tag{13.13}$$
for $\psi \in \mathcal{S}(\mathbb{R}^n)$. Let us now write
$$M_{jk} = A\psi(z_k - z_j)\,a_\sigma(z_j - z_k);$$
we claim that the matrix $M_{(N)} = (M_{jk})_{1\le j,k\le N}$ is positive semidefinite. In fact, $M_{(N)}$ is the Hadamard product of the positive semidefinite matrices $M'_{(N)} = (M'_{jk})_{1\le j,k\le N}$ and $M''_{(N)} = (M''_{jk})_{1\le j,k\le N}$ where
$$M'_{jk} = e^{\frac{i}{2\hbar}\sigma(z_j,z_k)}\,A\psi(-(z_j - z_k)), \qquad M''_{jk} = e^{-\frac{i}{2\hbar}\sigma(z_j,z_k)}\,a_\sigma(z_j - z_k),$$
and Lemma 164 implies that $M_{(N)}$ is also positive semidefinite. Hence, for the function $b$ defined by
$$b_\sigma(z) = A\psi(-z)\,a_\sigma(z),$$
we have that $b_\diamond$ is of positive type. In particular, $|b_\sigma(z)| \le b_\sigma(0)$, which implies $b_\sigma(0) > 0$ or $b_\sigma(z) = 0$ for every $z$. Moreover, $a_\sigma \in FL^1(\mathbb{R}^{2n})$ and $A\psi \in \mathcal{S}(\mathbb{R}^{2n})$, which implies that $b_\sigma \in L^1(\mathbb{R}^{2n}) \cap FL^1(\mathbb{R}^{2n})$, and similarly for $b$ itself. If $b_\sigma(0) > 0$, it follows from Bochner’s theorem that $b$ is, up to a positive constant, a probability density, hence $b(z) \ge 0$ for almost every $z \in \mathbb{R}^{2n}$, and therefore for every $z \in \mathbb{R}^{2n}$, because $b$ is continuous. The same holds of course if $b_\sigma(z) = 0$ for every $z$. In either case, integrating the equality, we get, using the fact that $a$ is real (because $\hat A$ is self-adjoint) and the Plancherel formula for the symplectic Fourier transform,
$$\begin{aligned}
(2\pi\hbar)^n b(0) &= \int_{\mathbb{R}^{2n}} A\psi(-z)\,a_\sigma(z)\,dz \\
&= \int_{\mathbb{R}^{2n}} A\psi(z)\,a_\sigma(-z)\,dz \\
&= \int_{\mathbb{R}^{2n}} A\psi(z)\,\overline{a_\sigma(z)}\,dz \\
&= \int_{\mathbb{R}^{2n}} W\psi(z)\,a(z)\,dz,
\end{aligned}$$
hence the inequality (13.13) since $b(0) \ge 0$, which completes the proof.
13.4 The case of Gaussian distributions

Let $\Sigma$ be a positive definite symmetric (real) $2n \times 2n$ matrix and consider the Gaussian
$$\rho(z) = \Big(\frac{1}{2\pi}\Big)^n \frac{1}{\sqrt{\det\Sigma}}\,e^{-\frac{1}{2}\Sigma^{-1}z^2}. \tag{13.14}$$
Let us find for which matrices $\Sigma$ the function $\rho$ is of ℏ-positive type.
13.4.1 Two lemmas

We begin by stating the multidimensional Hardy uncertainty principle:

Lemma 166. Let $A$ and $B$ be two real positive definite matrices and $\psi \in L^2(\mathbb{R}^n)$, $\psi \ne 0$. Assume that
$$|\psi(x)| \le Ce^{-\frac{1}{2}Ax^2} \quad\text{and}\quad |F\psi(p)| \le Ce^{-\frac{1}{2}Bp^2} \tag{13.15}$$
for a constant $C > 0$. Then: (i) The eigenvalues $\lambda_j$, $j = 1,\dots,n$, of the matrix $AB$ are all $\le 1/\hbar^2$; and (ii) if $\lambda_j = 1/\hbar^2$ for all $j$, then $\psi(x) = Ce^{-\frac{1}{2}Ax^2}$ for some constant $C$.
We will also need the following lemma:

Lemma 167. If $R$ is a symmetric positive semidefinite $2n \times 2n$ matrix, then
$$P_{(N)} = (Rz_j \cdot z_k)_{1\le j,k\le N} \tag{13.16}$$
is a symmetric positive semidefinite $N \times N$ matrix for all $z_1,\dots,z_N \in \mathbb{R}^{2n}$.

Proof. There exists a matrix $L$ such that $R = L^*L$ (Cholesky decomposition). Denoting by $\langle z|z'\rangle = z \cdot \overline{z'}$ the inner product on $\mathbb{C}^{2n}$, we have, since the $z_j$ are real vectors,
$$L^*z_j \cdot z_k = \langle L^*z_j|z_k\rangle = \langle z_j|Lz_k\rangle = z_j \cdot Lz_k,$$
hence $Rz_j \cdot z_k = Lz_j \cdot Lz_k$. It follows that for all complex $\zeta_j$ we have
$$\sum_{1\le j,k\le N}\zeta_j\overline{\zeta_k}\,Rz_j \cdot z_k = \Big(\sum_{1\le j\le N}\zeta_jLz_j\Big)\cdot\overline{\Big(\sum_{1\le j\le N}\zeta_jLz_j\Big)} \ge 0,$$
hence our claim.
13.4.2 The condition on Σ

We now have the tools needed to give a complete characterization of the ℏ-positivity of Gaussians (13.14). Recall that the symplectic eigenvalues of a positive definite symmetric matrix $\Sigma$ are the numbers $\lambda_j > 0$ such that $\pm i\lambda_j$ is an eigenvalue of $J\Sigma$. One has Williamson’s symplectic diagonalization result $\Sigma = S^TDS$ where $D$ is a diagonal matrix of the type
$$D = \begin{pmatrix}\Lambda & 0\\ 0 & \Lambda\end{pmatrix},$$
the diagonal entries of $\Lambda$ being the positive numbers $\lambda_j$ (see Chapter 1).

Proposition 168. The Gaussian function (13.14) is ℏ-positive if and only if
$$\hbar \le 2\lambda_{\min} \tag{13.17}$$
where $\lambda_{\min}$ is the smallest symplectic eigenvalue of $\Sigma$; equivalently
$$\Sigma + \frac{i\hbar}{2}J \ge 0. \tag{13.18}$$
Proof. Let us first show that the conditions (13.17) and (13.18) are equivalent. Let $\Sigma = S^TDS$ be a symplectic diagonalization of $\Sigma$. Since $S^TJS = J$, condition (13.18) is equivalent to $D + \frac{i\hbar}{2}J \ge 0$. Let $z = (x,p)$ be an eigenvector of $D + \frac{i\hbar}{2}J$; the corresponding eigenvalue $\lambda$ is real and $\ge 0$. The equation $(D + \frac{i\hbar}{2}J)z = \lambda z$ is easily seen to be equivalent to the equations $(\Lambda - \lambda)^2x = \frac{1}{4}\hbar^2x$ and $(\Lambda - \lambda)^2p = \frac{1}{4}\hbar^2p$. Since $z \ne 0$, we must have $(\lambda_j - \lambda)^2 = \frac{1}{4}\hbar^2$, that is $\lambda_j - \lambda = \pm\frac{1}{2}\hbar$ for all $j = 1,2,\dots,n$; since $\lambda \ge 0$, this is equivalent to $\lambda_j \ge \frac{1}{2}\hbar$ for all $j$, that is to (13.17).

Let us now show that the condition (13.17) is necessary for the function
$$a(z) = (2\pi)^{-n}\sqrt{\det\Sigma^{-1}}\,e^{-\frac{1}{2}\Sigma^{-1}z^2} \tag{13.19}$$
to be the ℏ-Wigner transform of a positive trace class operator. Let $\hat A = \operatorname{Op}_W(a)$ and $\hat S \in \operatorname{Mp}(n)$; the operator $\hat A$ is of trace class if and only if $\hat S\hat A\hat S^{-1}$ is, in which case $\operatorname{Tr}(\hat A) = \operatorname{Tr}(\hat S\hat A\hat S^{-1})$. Choose $\hat S$ with projection $S \in \operatorname{Sp}(n)$ such that $\Sigma = S^TDS$ is a symplectic diagonalization of $\Sigma$. This choice reduces the proof to the case $\Sigma = D$, that is, to
$$a(z) = (2\pi)^{-n}(\det\Lambda^{-1})\,e^{-\frac{1}{2}(\Lambda^{-1}x^2 + \Lambda^{-1}p^2)}. \tag{13.20}$$
Suppose now that $\hat A$ is of trace class; then by the spectral theorem (Proposition 145), there exists an orthonormal basis of functions $(\psi_j)_j$ of $L^2(\mathbb{R}^n)$ such that
$$\rho(z) = \sum_j \alpha_j W\psi_j(z)$$
where the $\alpha_j$ are real numbers summing up to one. Integrating with respect to the $p$ and $x$ variables, respectively, the marginal conditions satisfied by the Wigner transform imply that we have
$$\sum_j \alpha_j|\psi_j(x)|^2 = (2\pi)^{-n/2}(\det\Lambda)^{1/2}e^{-\frac{1}{2}\Lambda^{-1}x^2},$$
$$\sum_j \alpha_j|F\psi_j(p)|^2 = (2\pi)^{-n/2}(\det\Lambda)^{1/2}e^{-\frac{1}{2}\Lambda^{-1}p^2}.$$
In particular, since $\alpha_j \ge 0$ for every $j$,
$$|\psi_j(x)| \le C_je^{-\frac{1}{4}\Lambda^{-1}x^2}, \qquad |F\psi_j(p)| \le C_je^{-\frac{1}{4}\Lambda^{-1}p^2}$$
with $C_j = (2\pi)^{-n/4}(\det\Lambda)^{1/4}/\alpha_j^{1/2}$. Applying Lemma 166 with $A = B = \frac{\hbar}{2}\Lambda^{-1}$, we must have $\hbar \le 2\lambda_j$ for all $j = 1,\dots,n$, which is condition (13.17); this establishes the necessity statement.

(iii) Let us finally show that, conversely, the condition (13.18) is sufficient. It is again no restriction to assume that $\Sigma$ is the diagonal matrix $D = \begin{pmatrix}\Lambda & 0\\ 0 & \Lambda\end{pmatrix}$; the symplectic Fourier transform of $\rho$ is easily calculated, and one finds that $\rho_\diamond(z) = e^{-\frac{1}{4}Dz^2}$. Let $\Lambda_{(N)} = (\Lambda_{jk})_{1\le j,k\le N}$ with
$$\Lambda_{jk} = e^{-\frac{i\hbar}{2}\sigma(z_j,z_k)}\,\rho_\diamond(z_j - z_k);$$
a simple algebraic calculation shows that we have
$$\Lambda_{jk} = e^{-\frac{1}{4}Dz_j^2}\,e^{\frac{1}{2}(D+i\hbar J)z_j\cdot z_k}\,e^{-\frac{1}{4}Dz_k^2},$$
and hence $\Lambda_{(N)} = \Delta_{(N)}\Gamma_{(N)}\Delta^*_{(N)}$ where $\Delta_{(N)} = \operatorname{diag}(e^{-\frac{1}{4}Dz_1^2},\dots,e^{-\frac{1}{4}Dz_N^2})$ and $\Gamma_{(N)} = (\Gamma_{jk})_{1\le j,k\le N}$ with $\Gamma_{jk} = e^{\frac{1}{2}(D+i\hbar J)z_j\cdot z_k}$. The matrix $\Lambda_{(N)}$ is thus positive semidefinite if and only if $\Gamma_{(N)}$ is, but this is the case in view of Lemma 167.
13.5 Positive trace-class operators

We are now going to give an integral description of the positivity of a trace-class operator on $L^2(\mathbb{R}^n)$. We begin with a general result:

Proposition 169. Let $\hat A = \operatorname{Op}_W(a)$ be a trace-class operator on $L^2(\mathbb{R}^n)$. We have $\hat A \ge 0$ if and only if $\operatorname{Tr}(\hat A\hat B) \ge 0$ for every positive operator $\hat B \in \mathcal{L}_1(L^2(\mathbb{R}^n))$.

Proof. Since $\mathcal{L}_1(L^2(\mathbb{R}^n))$ is an algebra, the product $\hat A\hat B$ is indeed of trace class, so the condition $\operatorname{Tr}(\hat A\hat B) \ge 0$ makes sense; setting $\hat B = \operatorname{Op}_W(b)$, we have
$$b(z) = (2\pi\hbar)^n \sum_j \beta_j W\psi_j$$
where $(\beta_j)_j \in \ell^1(\mathbb{N})$ with $\beta_j \ge 0$ and $(\psi_j)$ an orthonormal basis for $L^2(\mathbb{R}^n)$. Observing that trace-class operators are also Hilbert–Schmidt operators, we also have, since $a, b \in L^2(\mathbb{R}^{2n})$,
$$\operatorname{Tr}(\hat A\hat B) = \int_{\mathbb{R}^{2n}} a(z)b(z)\,dz, \tag{13.21}$$
and hence
$$\operatorname{Tr}(\hat A\hat B) = (2\pi\hbar)^n \sum_j \beta_j \int_{\mathbb{R}^{2n}} a(z)W\psi_j(z)\,dz$$
(the interchange of integral and series is justified by Fubini’s theorem). Assume that $\operatorname{Tr}(\hat A\hat B) \ge 0$. It is enough to check the positivity of $\hat A$ on unit vectors $\psi$ in $L^2(\mathbb{R}^n)$. Choosing all the $\beta_j = 0$ except $\beta_1$ and setting $\psi_1 = \psi$, we have
$$\int_{\mathbb{R}^{2n}} a(z)W\psi(z)\,dz \ge 0; \tag{13.22}$$
since we can choose $\psi \in L^2(\mathbb{R}^n)$ arbitrarily, this means that we have $\hat A \ge 0$. If, conversely, we have $\hat A \ge 0$, then (13.22) holds for all $\psi_j$, hence $\operatorname{Tr}(\hat A\hat B) \ge 0$.

Let us next prove:
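Proposition 170. Let $\hat A = \operatorname{Op}_W(a)$ be a trace-class operator. We have $\hat A \ge 0$ if and only if
$$\int_{\mathbb{R}^{2n}} F_\sigma a(z)\Big(\int_{\mathbb{R}^{2n}} e^{-\frac{i}{2\hbar}\sigma(z,z')}\,\overline{c(z'-z)}\,c(z')\,dz'\Big)dz \ge 0 \tag{13.23}$$
for all $c \in L^2(\mathbb{R}^{2n})$.

Proof. In view of Proposition 169, we have $\hat A \ge 0$ if and only if $\operatorname{Tr}(\hat A\hat B) \ge 0$ for all positive $\hat B = \operatorname{Op}_W(b) \in \mathcal{L}_1(L^2(\mathbb{R}^n))$, that is (formula (13.21)),
$$\int_{\mathbb{R}^{2n}} a(z)b(z)\,dz = (a|b)_{L^2(\mathbb{R}^{2n})} \ge 0. \tag{13.24}$$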
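(Recall that Weyl symbols of self-adjoint operators are real.) Using Plancherel’s theorem,
$$(a|b)_{L^2(\mathbb{R}^{2n})} = \Big(\frac{1}{2\pi\hbar}\Big)^n (F_\sigma a|F_\sigma b)_{L^2(\mathbb{R}^{2n})}. \tag{13.25}$$
Since $\hat B \ge 0$, there exists $\hat C \in \mathcal{L}_2(L^2(\mathbb{R}^n))$ such that $\hat B = \hat C^*\hat C$, and hence, setting $\hat C = \operatorname{Op}_W(c)$ (recall that $\hat C^* = \operatorname{Op}_W(\overline{c})$), by the composition formula for Weyl operators,
$$\begin{aligned}
F_\sigma b(z) &= \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z,z')}\,F_\sigma\overline{c}(z-z')\,F_\sigma c(z')\,dz' \\
&= \Big(\frac{1}{2\pi\hbar}\Big)^n \int_{\mathbb{R}^{2n}} e^{\frac{i}{2\hbar}\sigma(z,z')}\,\overline{F_\sigma c(z'-z)}\,F_\sigma c(z')\,dz'.
\end{aligned}$$
Vice versa, taking any function $c \in L^2(\mathbb{R}^{2n})$, the operator $\hat C = \operatorname{Op}_W(c)$ is a Hilbert–Schmidt operator, and $\hat B = \hat C^*\hat C$ is a positive operator. Hence, using that the operator $F_\sigma$ is a topological isomorphism on $L^2(\mathbb{R}^{2n})$, conditions (13.24) and (13.25) are equivalent to
$$\int_{\mathbb{R}^{2n}} F_\sigma a(z)\Big(\int_{\mathbb{R}^{2n}} e^{-\frac{i}{2\hbar}\sigma(z,z')}\,\overline{c(z'-z)}\,c(z')\,dz'\Big)dz \ge 0 \tag{13.26}$$
for every $c \in L^2(\mathbb{R}^{2n})$, as we set out to prove.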
13.6 Comments and references

The topic of positivity, which was quite active in the late 1980s, has not evolved much since; many questions remain open. It is a notoriously difficult part of functional analysis that has been tackled by many authors, but there have been few decisive advances since the pioneering work of Kastler [50] and Loupias and Miracle-Sole [55, 56]; see however Dias and Prata [18] and our joint paper [17] with Cordero and Nicola. For a proof of Bochner’s theorem, see Katznelson [51]. The proof of the quantum Bochner theorem goes back to the seminal work of Kastler [50] and Loupias and Miracle-Sole [55, 56]. While these authors use the theory of $C^*$-algebras and hard functional analysis, the conceptually simpler proof we have given here was published in our joint work with Elena Cordero and Fabio Nicola [17] already mentioned. In the literature, the conditions (13.8) in Definition 161 are often called the “KLM conditions” as a tribute to Kastler, Loupias, and Miracle-Sole. (Theorem 4). Narcowich [58] was the first to study the positivity of Gaussians using the approach in Kastler’s paper [50]; the proof we have given in Proposition 168 is simpler. For a proof of Lemma 166, see de Gosson and Luef [37].
14 The density operator

Quantum mechanics in its modern form is the study of density operators and of their properties, both geometric and analytical. The density operator (or density matrix, as it is called in physics) describes a statistical ensemble of pure quantum states. Its theory is far from being complete; for instance, the study of entangled mixed states (which we study later on in this book) contains many open problems.
14.1 Mixed quantum states: heuristics

A quantum state is, mathematically speaking, an operator that provides a probability distribution for the outcomes of each possible measurement on a system. A quantum system is said to be in a pure state if we have complete knowledge about that system. Physicists usually denote a pure state by |ψ⟩ (Dirac’s ket notation). Mathematically, the state is identified with the linear span ℂψ = {λψ : λ ∈ ℂ} of a non-zero function ψ belonging to some Hilbert space H that we will usually identify with L²(ℝⁿ). On a slightly more abstract level, this pure state can be identified with the orthogonal projection $\hat\rho_\psi$ on the ray ℂψ. This projection, which is of rank one, is denoted by |ψ⟩⟨ψ| in quantum mechanics. However, in practice we have only partial knowledge of the quantum system under consideration. When this is the case, we say that the system is in a mixed state. Mixed states are classical probabilistic mixtures of pure states; however, different distributions of pure states can generate physically indistinguishable mixed states. Technically speaking, a quantum mixed state can be viewed as the datum of a set of pairs $\{(\psi_j,\lambda_j)\}$ where $\psi_j$ is a (normalized square integrable) pure state and $\lambda_j$ ($0 \le \lambda_j \le 1$) a classical probability, these probabilities summing up to one: $\sum_j \lambda_j = 1$. One identifies the state $\{(\psi_j,\lambda_j)\}$ with the convex sum of projectors
$$\hat\rho = \sum_j \lambda_j\hat\rho_j$$
where $\hat\rho_j = \hat\rho_{\psi_j}$ is the orthogonal projection on the ray $\mathbb{C}\psi_j$. The operator $\hat\rho$ is called a density operator (or density matrix in physics). It is a trace class operator with trace $\operatorname{Tr}(\hat\rho) = 1$. The Wigner distribution of the density operator $\hat\rho$ (or of the state $\{(\psi_j,\lambda_j)\}$) is the function
$$\rho = \sum_j \lambda_j W\psi_j,$$
and this function plays the role of a probability distribution on $\mathbb{R}^{2n}$ (although it can take negative values). The density operator describes a preparation procedure for an ensemble of quantum systems whose statistical properties correspond to the given preparation procedure.

https://doi.org/10.1515/9783110722772-014
14.2 Density operator: definition and first properties

14.2.1 Definition of a density operator

Recapitulating from our informal discussion:

Definition 171. A quantum state is a set of pairs $(\psi_j,\lambda_j) \in L^2 \times \mathbb{R}_+$ indexed by a discrete set $F$ and where $\|\psi_j\|_{L^2} = 1$ for all $j \in F$ and $\sum_{j\in F}\lambda_j = 1$. The linear operator $\hat\rho$ on $L^2(\mathbb{R}^n)$ defined by
$$\hat\rho\psi = \sum_{j\in F}\lambda_j(\psi|\psi_j)_{L^2}\,\psi_j \tag{14.1}$$
is the density operator associated with that state; the Wigner distribution of that state is
$$\rho = \sum_{j\in F}\lambda_j W\psi_j. \tag{14.2}$$
It is customary in quantum mechanics to identify a quantum state with the density operator it determines. Definition (14.1) can be rewritten
$$\hat\rho = \sum_{j\in F}\lambda_j\hat\rho_j$$
where $\hat\rho_j$ is the orthogonal projector on the ray $\mathbb{C}\psi_j$. Notice that this series is absolutely convergent in $\mathcal{B}(L^2(\mathbb{R}^n))$ since $\|\hat\rho_j\|_{\mathcal{B}(L^2(\mathbb{R}^n))} = 1$ for every $j$ and
$$\sum_j \|\lambda_j\hat\rho_j\|_{\mathcal{B}(L^2(\mathbb{R}^n))} \le \sum_j |\lambda_j| < \infty.$$
We will usually skip the reference to the index set $F$ and write simply $\hat\rho = \sum_j \lambda_j\hat\rho_j$.
14.2.2 First properties

The first (hardly surprising) result is:

Proposition 172. The operator $\hat\rho = \sum_j \lambda_j\hat\rho_j$ is a self-adjoint positive semidefinite operator on $L^2(\mathbb{R}^n)$, and we have $\hat\rho \in \mathcal{L}_1(L^2(\mathbb{R}^n))$ and $\operatorname{Tr}(\hat\rho) = 1$.

Proof. That $\hat\rho \ge 0$ follows from the fact that (formula (14.1))
$$(\hat\rho\psi|\psi)_{L^2} = \sum_j \lambda_j(\psi|\psi_j)_{L^2}(\psi_j|\psi)_{L^2} = \sum_j \lambda_j|(\psi|\psi_j)_{L^2}|^2 \ge 0.$$
Notice that it follows from the Cauchy–Schwarz inequality that the series is absolutely convergent (it is also easy to verify directly) and that $(\hat\rho\psi|\psi)_{L^2} \le \|\psi\|^2_{L^2}$. The self-adjointness is clear since we are dealing with a complex Hilbert space, so it remains to be shown that $\hat\rho$ is of trace class and $\operatorname{Tr}(\hat\rho) = 1$. Let $(\phi_k)_k$ be an orthonormal basis of $L^2(\mathbb{R}^n)$; we have $\hat\rho_j\phi_k = (\phi_k|\psi_j)_{L^2}\psi_j$, hence $(\hat\rho_j\phi_k|\phi_k)_{L^2} = |(\phi_k|\psi_j)_{L^2}|^2$, and hence, using Parseval’s identity,
$$\begin{aligned}
\sum_k(\hat\rho\phi_k|\phi_k)_{L^2} &= \sum_{j,k}\lambda_j|(\phi_k|\psi_j)_{L^2}|^2 \\
&= \sum_j \lambda_j\Big(\sum_k|(\phi_k|\psi_j)_{L^2}|^2\Big) \\
&= \sum_j \lambda_j\|\psi_j\|^2_{L^2} < \infty,
\end{aligned}$$
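hence $\hat\rho$ is of trace class, and $\operatorname{Tr}(\hat\rho) = \sum_j \lambda_j = 1$ since $\|\psi_j\|_{L^2} = 1$ for all $j$.

For the Wigner distribution, we have:

Proposition 173. The series $\rho = \sum_j \lambda_j W\psi_j$, $\lambda_j \ge 0$, $\sum_j \lambda_j = 1$, converges in $L^2(\mathbb{R}^{2n})$, and we have $\rho \in L^2(\mathbb{R}^{2n}) \cap L^\infty(\mathbb{R}^{2n})$.

Proof. In view of Moyal’s identity (3.12), we have
$$\|\lambda_j W\psi_j\|_{L^2} = \lambda_j\Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\|\psi_j\|^2_{L^2} = \Big(\frac{1}{2\pi\hbar}\Big)^{n/2}\lambda_j,$$
and hence $\|\rho\|_{L^2} \le (2\pi\hbar)^{-n/2}$ since the $\lambda_j$ are $\ge 0$ and sum up to one. We therefore have absolute convergence in $L^2(\mathbb{R}^{2n})$ of the series $\rho$. On the other hand, by the Cauchy–Schwarz inequality (recall that $\|\psi_j\|^2_{L^2} = 1$),
$$|W\psi_j(z)| \le \Big(\frac{2}{\pi\hbar}\Big)^n$$
for all $z \in \mathbb{R}^{2n}$, hence
$$\|\rho\|_\infty \le \sum_j \lambda_j\|W\psi_j\|_\infty \le \Big(\frac{2}{\pi\hbar}\Big)^n,$$
and the series $\rho$ is absolutely convergent in $L^\infty(\mathbb{R}^{2n})$.

Remark 174. Notice that in general $\rho$ is not in $L^1(\mathbb{R}^{2n})$. Let for instance $\rho = W\psi$; if $\rho \in L^1(\mathbb{R}^{2n})$, then $\psi$ must be in the Feichtinger algebra $S_0(\mathbb{R}^n)$ (Proposition 117), which is smaller than $L^1(\mathbb{R}^n)$. We will come back to this in a moment.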
14.2.3 The convexity property

Here is an interesting geometric result about the set of density operators.

Proposition 175. (i) The density operators on a Hilbert space $H$ form a convex subset $\operatorname{Dens}(H)$ of the space $\mathcal{B}(H)$ of bounded operators on $H$. (ii) Let $\hat\rho_\psi$ be the orthogonal projection on the ray $\mathbb{C}\psi$ ($\|\psi\|_H = 1$). If there exist $\hat\rho_1, \hat\rho_2 \in \operatorname{Dens}(H)$ such that $\hat\rho_\psi = \lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ with $\lambda \ne 0$ and $\lambda \ne 1$, then $\hat\rho_\psi = \hat\rho_1 = \hat\rho_2$. That is, the extreme points of $\operatorname{Dens}(H)$ are the rank-one projections that correspond to the pure states on $H$.

Proof. (i) To say that the set of density matrices is convex means that, if $\hat\rho_1$ and $\hat\rho_2$ are in $\operatorname{Dens}(H)$, then so is $\lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ for all real numbers $\lambda$ such that $0 \le \lambda \le 1$. Let $\{(\psi_j,\alpha_j) : j \in F\}$ and $\{(\phi_j,\beta_j) : j \in G\}$ be two mixed states. Relabeling the indices if necessary, we may assume that the sets $F$ and $G$ are disjoint. The corresponding density matrices are
$$\hat\rho_1 = \sum_{k\in F}\alpha_k\hat\rho_{\psi_k}, \qquad \hat\rho_2 = \sum_{\ell\in G}\beta_\ell\hat\rho_{\phi_\ell}$$
where $\hat\rho_{\psi_k}$ and $\hat\rho_{\phi_\ell}$ are the orthogonal projections on $\mathbb{C}\psi_k$ and $\mathbb{C}\phi_\ell$, respectively, and hence we have by linearity
$$\lambda\hat\rho_1 + (1-\lambda)\hat\rho_2 = \sum_{j\in F\cup G}\gamma_j\hat\rho_{\chi_j}$$
where $\chi_j = \psi_j$, $\gamma_j = \lambda\alpha_j$ if $j \in F$ and $\chi_j = \phi_j$, $\gamma_j = (1-\lambda)\beta_j$ if $j \in G$. That $\hat\rho = \lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ is a mixed state now follows from the obvious equality
$$\sum_{j\in F\cup G}\gamma_j = \lambda\sum_{j\in F}\alpha_j + (1-\lambda)\sum_{j\in G}\beta_j = 1.$$
(ii) That a pure state can never be represented as a mixed state is easily seen using the following algebraic argument: Assume that $\hat\rho_\psi$ can be rewritten as a sum $\sum_j \alpha_j\hat\rho_{\psi_j}$ with $\alpha_j \ge 0$ and $\sum_j \alpha_j = 1$. Discarding the terms with $\alpha_j = 0$, we may assume that $\alpha_j > 0$ for all indices $j$. Let now $(\mathbb{C}\psi)^\perp$ be the subspace of $H$ orthogonal to the ray $\mathbb{C}\psi$: It consists of all vectors $\phi$ in $H$ such that $(\psi|\phi)_H = 0$. For every $\phi \in (\mathbb{C}\psi)^\perp$, we have $(\hat\rho_\psi\phi|\phi)_H = |(\phi|\psi)_H|^2 = 0$, and hence also, if we assume that $\hat\rho_\psi = \sum_j \alpha_j\hat\rho_{\psi_j}$,
$$(\hat\rho_\psi\phi|\phi)_H = \sum_j \alpha_j|(\phi|\psi_j)_H|^2 = 0.$$
This equality implies that we must have $(\phi|\psi_j)_H = 0$ for every $\phi \in (\mathbb{C}\psi)^\perp$ and every $\psi_j$, hence $\psi_j \in ((\mathbb{C}\psi)^\perp)^\perp = \mathbb{C}\psi$. Since $\psi$ and $\psi_j$ have norms equal to one, we must have $\psi_j = \lambda_j\psi$ for some complex number $\lambda_j$ with $|\lambda_j| = 1$; the vectors $\psi_j$ thus define the state $\psi$. This means that, if $\hat\rho_\psi$ is a pure state density matrix, then the relationship $\hat\rho_\psi = \lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ with $\lambda \ne 0$ and $\lambda \ne 1$ implies $\hat\rho_\psi = \hat\rho_1 = \hat\rho_2$. That this is the case immediately follows from the previous argument.
14.2.4 The spectral theorem for density operators

We now have most of the tools needed for a functional study of the density operator
$$\hat\rho\psi = \sum_{j\in F}\lambda_j(\psi|\psi_j)_{L^2}\,\psi_j. \tag{14.3}$$
Let us now state the spectral theorem for density operators. It is just a slight reformulation of the result for arbitrary trace class operators.

Theorem 176 (Spectral Theorem). Let $\hat\rho$ be a density operator on $L^2(\mathbb{R}^n)$. Then: (i) The set $(\lambda_j)$ of eigenvalues $\lambda_j$ of $\hat\rho$ is at most countable and consists of non-negative numbers $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_j \ge \cdots \ge 0$ such that $\sum_j \lambda_j = 1$, and we have $\lim_{j\to\infty}\lambda_j = 0$ if $(\lambda_j)$ is infinite; (ii) the multiplicity of every non-zero eigenvalue $\lambda_j$ is finite; (iii) there exists an at most countable set $(\psi_j)$ of orthonormal eigenvectors of $\hat\rho$ with eigenvalues $\lambda_j$ such that, if $\psi_j$ is an eigenvector for $\lambda_j$, then
$$\hat\rho\psi = \sum_j \lambda_j(\psi|\psi_j)_{L^2}\,\psi_j. \tag{14.4}$$
In particular, the Wigner distribution of $\hat\rho$ is the sum
$$\rho = \sum_j \lambda_j W\psi_j \tag{14.5}$$
of pairwise orthogonal Wigner transforms $W\psi_j$.

Proof. It is an immediate consequence of the spectral theorem for compact operators (Theorem 7). That the eigenvalues are $\ge 0$ follows from the positive semidefiniteness of $\hat\rho$, and the relationship $\sum_j \lambda_j = 1$ follows from $\operatorname{Tr}(\hat\rho) = 1$. That the $W\psi_j$ are pairwise orthogonal follows from Moyal’s identity.

This result makes clear that different mixed states can give rise to the same operator: In our original definition (Definition 171) of the density matrix, there was no orthogonality requirement for the pure states composing the mixed state; the theorem shows that one can always find a system of orthonormal states defining that operator. The physical significance of this ambiguity is not well understood and deserves further study. Let us just say that the statistical information we have about a mixed state is encoded in the density operator, so that the predictions we can make about the state do not depend on the way it is decomposed. The spectral theorem allows us to study the purity of a quantum state, which is defined as follows:

Definition 177. Let $\hat\rho$ be a density operator on $L^2(\mathbb{R}^n)$. The purity of $\hat\rho$ is the positive real number
$$\mu(\hat\rho) = \operatorname{Tr}(\hat\rho^2). \tag{14.6}$$
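The definition makes sense because $\hat\rho^2 \in \mathcal{L}_1(L^2(\mathbb{R}^n))$ in view of the algebra property of the trace class. Purity is a measure of the “mixedness” of a quantum state:

Proposition 178. (i) We have
$$\mu(\hat\rho) = \sum_j \lambda_j^2 \tag{14.7}$$
where the $\lambda_j$ are the eigenvalues of $\hat\rho$. (ii) We have $0 \le \mu(\hat\rho) \le 1$ and $\mu(\hat\rho) = 1$ if and only if $\hat\rho$ is a pure state, that is, if there exists $\psi \in L^2(\mathbb{R}^n)$ such that $\hat\rho = \hat\rho_\psi$, the orthogonal projection on the ray $\mathbb{C}\psi$.

Proof. (i) Since $\hat\rho = (2\pi\hbar)^n\operatorname{Op}_W(\rho)$, the trace formula (12.16) yields
$$\operatorname{Tr}(\hat\rho^2) = (2\pi\hbar)^n \int_{\mathbb{R}^{2n}} \rho(z)^2\,dz. \tag{14.8}$$
Writing $\rho$ as the sum (14.5) of mutually orthogonal Wigner transforms, the cross terms $\int W\psi_j W\psi_k\,dz$, $j \ne k$, vanish and we have
$$\int_{\mathbb{R}^{2n}} \rho(z)^2\,dz = \sum_j \lambda_j^2 \int_{\mathbb{R}^{2n}} (W\psi_j(z))^2\,dz,$$
hence
$$\operatorname{Tr}(\hat\rho^2) = (2\pi\hbar)^n \sum_j \lambda_j^2 \int_{\mathbb{R}^{2n}} (W\psi_j(z))^2\,dz.$$
In view of the Moyal identity (3.14) for the Wigner transform, we have
$$\int_{\mathbb{R}^{2n}} (W\psi_j(z))^2\,dz = \Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi_j\|^4_{L^2} = \Big(\frac{1}{2\pi\hbar}\Big)^n,$$
hence formula (14.7). (ii) Since $\sum_j \lambda_j = 1$ and $\lambda_j \ge 0$, we have $\mu(\hat\rho) = \sum_j \lambda_j^2 \le 1$. The equality $\mu(\hat\rho) = 1$ can occur only if all the $\lambda_j$ except one are equal to zero.

We will see later on that the purity of a Gaussian state with the Wigner distribution
$$\rho(z) = \Big(\frac{1}{2\pi}\Big)^n(\det\Sigma)^{-1/2}e^{-\frac{1}{2}\Sigma^{-1}z\cdot z}$$
is given by
$$\mu(\hat\rho) = \Big(\frac{\hbar}{2}\Big)^n\det(\Sigma^{-1/2}).$$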
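Formula (14.7) and the Gaussian purity formula quoted above are simple enough to evaluate directly. The following sketch computes the purity from a list of eigenvalues and, separately, from $\det\Sigma$ for a Gaussian state; the function names and the sample values are illustrative assumptions.

```python
import math


def purity(eigenvalues):
    """Purity mu = sum of squared eigenvalues, formula (14.7)."""
    return sum(lam * lam for lam in eigenvalues)


def gaussian_purity(det_sigma, hbar, n=1):
    """Purity (hbar/2)^n * det(Sigma)^(-1/2) of a Gaussian state, as quoted in the text."""
    return (hbar / 2.0) ** n / math.sqrt(det_sigma)


print(purity([1.0]))                 # a pure state
print(purity([0.5, 0.5]))            # equal mixture of two orthogonal states
print(purity([0.7, 0.2, 0.1]))       # a generic mixed state
print(gaussian_purity(0.25, 1.0))    # Sigma = (hbar/2) I with hbar = 1, n = 1
```

A single eigenvalue equal to one gives purity 1, an equal mixture of two orthogonal states gives 1/2, and the Gaussian with $\Sigma = \frac{\hbar}{2}I$ (the case $\det\Sigma = \hbar^2/4$ for $n = 1$) has purity 1, consistent with its being a pure state.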
14.3 The Wigner distribution as a Weyl symbol

14.3.1 The density operator as a Weyl operator

While the definition of the Wigner distribution of a density operator seems at first sight to be ad hoc, it is canonically related to $\hat\rho$ by the Weyl transform:

Proposition 179. The density operator $\hat\rho$ can be expressed in terms of its Wigner distribution $\rho$ by the formula
$$\hat\rho = (2\pi\hbar)^n\operatorname{Op}_W(\rho). \tag{14.9}$$
Equivalently, using the displacement and reflection operators $\hat D(z)$ and $\hat R(z)$, we have the formulas
$$\hat\rho\psi = 2^n \int_{\mathbb{R}^{2n}} \rho(z)\hat R(z)\psi\,dz \tag{14.10}$$
and
$$\hat\rho\psi = \int_{\mathbb{R}^{2n}} \rho_\sigma(z)\hat D(z)\psi\,dz, \tag{14.11}$$
valid for $\psi \in L^2(\mathbb{R}^n)$.

Proof. Let $(\psi,\phi) \in \mathcal{S}(\mathbb{R}^n) \times \mathcal{S}(\mathbb{R}^n)$. We have
$$(\hat\rho\psi|\phi)_{L^2} = \sum_j \lambda_j(\psi|\psi_j)_{L^2}(\psi_j|\phi)_{L^2} = (2\pi\hbar)^n \sum_j \lambda_j(W(\psi,\phi)|W\psi_j)_{L^2(\mathbb{R}^{2n})}$$
where the second equality follows from Moyal’s identity (3.12). In view of the definition (3.1) of $W(\psi,\phi)$, this can be rewritten
$$(\hat\rho\psi|\phi)_{L^2} = 2^n \int_{\mathbb{R}^{2n}} (\psi|\hat R(z)\phi)_{L^2}\,\rho(z)\,dz = 2^n \int_{\mathbb{R}^{2n}} \rho(z)(\hat R(z)\psi|\phi)_{L^2}\,dz$$
where we have used the fact that the reflection operator $\hat R(z)$ is involutive. Formula (14.10) follows. To prove formula (14.11), we first note that (14.10) can be rewritten, using Plancherel’s formula, as
$$\hat\rho\psi = 2^n \int_{\mathbb{R}^{2n}} F_\sigma\rho(z)\,F_\sigma(\hat R(\cdot)\psi)(-z)\,dz.$$
174 | 14 The density operator Recalling (formula (2.19) in Proposition 26) that ̂ R(z)ψ = 2−n Fσ [̂ D(⋅)ψ](−z), we also have, since Fσ is an involution of 𝒮 (ℝ2n ), ̂ Fσ (R(⋅)ψ)(z) = 2n ̂ D(−z)ψ hence formula (14.11). ̂ 14.3.2 Application to the statistical interpretation of ρ We have seen in Section 3.5 that the Wigner transform of a normalized function ψ ∈ L2 (ℝn ) can be viewed as a quasi-probability function for pure states. The generalization to the density operator is rather straightforward. ̂ be a self-adjoint operator on L2 (ℝn ); we assume for convenience that A ̂ is Let A 2 n ̂ ∈ ℬ(L (ℝ )). We will refer to A ̂ as an “observable”. bounded: A Definition 180. Let ρ̂ be a density operator on L2 (ℝn ). The average (or mean) value of ̂ in the state ρ̂ is the real number the observable A ̂ ρ̂ = Tr(A ̂ ρ̂). ⟨A⟩
(14.12)
̂ ρ̂ is well-defined: Recall that the trace class operators form an The number ⟨A⟩ 2 n ̂ ρ̂ ∈ ℒ1 (L2 (ℝn )). That Tr(A ̂ ρ̂) is real follows from ideal ℒ1 (L (ℝ )) in ℬ(L2 (ℝn )), hence A the observation that the trace of a self-adjoint operator is real and that ̂ ρ̂) = Tr((A ̂ ρ̂)∗ ) = Tr(ρ̂A) ̂ = Tr(A ̂ ρ̂). Tr(A The following result connects the definition above to the Wigner formalism: ̂ = OpW (a). We have Proposition 181. Assume that A ̂ ρ̂ = ∫ a(z)ρ(z)dz ⟨A⟩
(14.13)
ℝ2n
where ρ is the Wigner distribution of the density operator ρ̂. Proof. Formula (14.13) immediately follows from the product formula (12.16) since ρ̂ = (2πℏ)n OpW (ρ). Using the decomposition ρ = ∑ λj Wψj j
in a direct sum, we see that ̂ ρ̂ = ∑ λj ∫ a(z)Wψj (z)dz = ∑ λj ⟨A⟩ ̂ ψ. ⟨A⟩ j j
ℝ2n
j
(14.14)
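A finite-dimensional analogue of (14.12) and (14.14) can be checked numerically. The sketch below is only an illustration under stated assumptions: matrices on $\mathbb{C}^4$ stand in for operators on $L^2(\mathbb{R}^n)$, the weights and random seed are arbitrary, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Orthonormal vectors psi_j (columns of Q) and weights lambda_j summing to 1.
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Q, _ = np.linalg.qr(X)
lams = np.array([0.4, 0.3, 0.2, 0.1])

# Density matrix rho = sum_j lambda_j |psi_j><psi_j|.
rho = (Q * lams) @ Q.conj().T

# A Hermitian matrix playing the role of a bounded observable.
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = (Y + Y.conj().T) / 2

# Mean value as a trace, and as a weighted sum of pure-state mean values.
mean_trace = np.trace(A @ rho)
mean_sum = sum(lam * np.vdot(Q[:, j], A @ Q[:, j]) for j, lam in enumerate(lams))
```

Both quantities agree up to rounding, and the imaginary part of the trace vanishes up to rounding, as the argument following Definition 180 predicts.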
14.4 Quantum condition on the covariance matrix

14.4.1 The covariance matrix of a density operator

Let $\rho$ be a real-valued function defined on $\mathbb{R}^{2n}$ satisfying the normalization condition
$$\int_{\mathbb{R}^{2n}} \rho(z)\,dz = 1 \tag{14.15}$$
($\rho$ is not necessarily the Wigner distribution of a density matrix $\hat\rho$ at this point). We will call $\rho$ a quasi-distribution on $\mathbb{R}^{2n}$. We assume in addition that $\rho$ decreases sufficiently fast at infinity to guarantee that
$$\int_{\mathbb{R}^{2n}} |x^\alpha p^\beta \rho(x,p)|\,dp\,dx < \infty \quad \text{if } |\alpha| + |\beta| \le 2. \tag{14.16}$$
In particular, we have $\rho \in L^1(\mathbb{R}^{2n})$. Let us write $z = (x,p)$ as $z = (z_1,\dots,z_{2n})$, that is, $z_\alpha = x_\alpha$ if $1 \le \alpha \le n$ and $z_\alpha = p_{\alpha-n}$ if $n+1 \le \alpha \le 2n$.

Definition 182. Let $\rho$ be defined as just stated. The covariance matrix associated with $\rho$ is the real symmetric $2n \times 2n$ matrix defined by
$$\Sigma = \int_{\mathbb{R}^{2n}} (z - \langle z\rangle)(z - \langle z\rangle)^T \rho(z)\,dz \tag{14.17}$$
where $\langle z\rangle$, the average (or expectation) vector, is given by
$$\langle z\rangle = \int_{\mathbb{R}^{2n}} z\rho(z)\,dz. \tag{14.18}$$
(All vectors $z$ are viewed as column matrices in this definition.)

There is a multitude of objects associated with the covariance matrix; they are defined following the lines of classical statistics. For instance,
$$\langle z_\alpha\rangle_\rho = \int_{\mathbb{R}^{2n}} z_\alpha \rho(z)\,dz$$
is called the $\alpha$-th expectation of $\rho$. The number
$$\Sigma_{\alpha\beta} = \int_{\mathbb{R}^{2n}} (z_\alpha - \langle z_\alpha\rangle_\rho)(z_\beta - \langle z_\beta\rangle_\rho)\rho(z)\,dz$$
is called the $(\alpha,\beta)$ covariance of $\rho$, so that
$$\Sigma = (\Sigma_{\alpha\beta})_{1\le\alpha,\beta\le 2n}. \tag{14.19}$$
When $\alpha = \beta$, the number $\Sigma_\alpha^2 = \Sigma_{\alpha\alpha}$ is called the $\alpha$-th variance of $\rho$. It will often be convenient to write the covariance matrix in block form as
$$\Sigma = \begin{pmatrix} \Sigma_{XX} & \Sigma_{XP} \\ \Sigma_{PX} & \Sigma_{PP} \end{pmatrix}, \qquad \Sigma_{PX} = \Sigma_{XP}^T,$$
where the $n \times n$ blocks $\Sigma_{XX}$, $\Sigma_{XP}$, $\Sigma_{PP}$ are given by $\Sigma_{XX} = (\Sigma_{\alpha\beta})_{1\le\alpha,\beta\le n}$, etc.
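Definition 182 can be illustrated numerically for $n = 1$ by replacing the integrals (14.17) and (14.18) with Riemann sums on a uniform grid. This is a minimal sketch under assumptions of my own: the helper `covariance_on_grid`, the grid size, and the Gaussian test function with covariance `Sigma_true` are illustrative choices, not taken from the text.

```python
import numpy as np

def covariance_on_grid(rho, x, p):
    """Approximate <z> and Sigma of (14.17)-(14.18) for n = 1 by Riemann sums.

    rho[i, k] holds the value rho(x[i], p[k]) on a uniform grid.
    """
    dx, dp = x[1] - x[0], p[1] - p[0]
    Xg, Pg = np.meshgrid(x, p, indexing="ij")
    w = rho * dx * dp                      # quadrature weights times rho
    mean = np.array([np.sum(Xg * w), np.sum(Pg * w)])
    dX, dP = Xg - mean[0], Pg - mean[1]
    Sigma = np.array([[np.sum(dX * dX * w), np.sum(dX * dP * w)],
                      [np.sum(dP * dX * w), np.sum(dP * dP * w)]])
    return mean, Sigma

# Test function: a normalized Gaussian with known mean and covariance.
Sigma_true = np.array([[1.0, 0.3], [0.3, 0.5]])
m_true = np.array([0.5, -0.2])
x = np.linspace(-10.0, 10.0, 401)
p = np.linspace(-10.0, 10.0, 401)
Xg, Pg = np.meshgrid(x, p, indexing="ij")
Sinv = np.linalg.inv(Sigma_true)
ux, up = Xg - m_true[0], Pg - m_true[1]
quad = Sinv[0, 0] * ux**2 + 2 * Sinv[0, 1] * ux * up + Sinv[1, 1] * up**2
rho = np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma_true)))

mean, Sigma = covariance_on_grid(rho, x, p)
```

The recovered `mean` and `Sigma` reproduce the parameters of the Gaussian to high accuracy, and the off-diagonal entries are equal, reflecting $\Sigma_{PX} = \Sigma_{XP}^T$.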
14.4.2 The quantum condition

We recall (Theorem 185) that a necessary condition for a trace class operator $\hat\rho$ on $L^2(\mathbb{R}^n)$ to be positive semi-definite (and hence a density operator) is that its covariance matrix $\Sigma$ satisfies (when defined) the quantum condition
$$\Sigma + \frac{i\hbar}{2}J \ge 0. \tag{14.20}$$
Also recall that this condition is equivalent to
$$\hbar \le 2\lambda_{\min} \tag{14.21}$$
where $\lambda_{\min}$ is the smallest symplectic eigenvalue of $\Sigma$. We now address the sensitivity of this condition to the parameter $\hbar$. We begin by noticing the following rather straightforward result:

Proposition 183. If the quantum condition (14.20) holds, then it holds for every $\eta < \hbar$:
$$\Sigma + \frac{i\hbar}{2}J \ge 0 \;\Rightarrow\; \Sigma + \frac{i\eta}{2}J \ge 0 \quad \text{for } \eta < \hbar.$$

Proof. Setting $\eta = r\hbar$ with $0 < r \le 1$, we have
$$\Sigma + \frac{i\eta}{2}J = (1-r)\Sigma + r\left(\Sigma + \frac{i\hbar}{2}J\right) \ge 0$$
since the sum of two positive semidefinite matrices is positive semidefinite; here $(1-r)\Sigma \ge 0$ because the quantum condition implies $\Sigma \ge 0$ (evaluate the Hermitian form on real vectors).

Note that this property does not allow us to conclude that a quantum state remains a quantum state if we decrease $\hbar$, because the quantum condition (14.20) is necessary but not sufficient to ensure that we have a bona fide density operator (except for Gaussian states).

We are going to prove a very important result connecting the covariance matrix formalism with the positivity issues studied in Chapter 13. The proof relies on the following technical lemma:
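Lemma 184. If the function $a : \mathbb{R}^{2n} \to \mathbb{C}$ is continuous and twice continuously differentiable near 0 and of $\hbar$-positive type, then we have
$$-2\hbar a''(0) + iJ \ge 0 \tag{14.22}$$
where $a''(0) = D^2a(0)$ is the Hessian matrix of $a$ at 0.

Proof. For $(\lambda_1,\dots,\lambda_m) \in \mathbb{C}^m$, points $z_1,\dots,z_m \in \mathbb{R}^{2n}$, and $\varepsilon \in \mathbb{R}$, let us define
$$R(\varepsilon) = \sum_{j,k=1}^m \lambda_j\overline{\lambda_k}\, e^{-\frac{i\varepsilon^2}{2\hbar}\sigma(z_j,z_k)}\, a(\varepsilon(z_j - z_k)).$$
If $a$ is of $\hbar$-positive type, we have $R(\varepsilon) \ge 0$ for every $\varepsilon$; choose now the $\lambda_j$ such that $\sum_j \lambda_j = 0$; then $R(0) = 0$, so that $R$ has a minimum at $\varepsilon = 0$ and $R''(0) \ge 0$. An elementary calculation shows that
$$R''(0) = u^T(-2a''(0) + i\hbar^{-1}J)u$$
where $u = \sum_j \lambda_j z_j \in \mathbb{C}^{2n}$. The $\lambda_j$ and $z_j$ being arbitrary we thus have $-2a''(0) + i\hbar^{-1}J \ge 0$, proving the inequality (14.22).

Let us now prove:

Theorem 185 (Quantum condition). Let $\rho$ be the Wigner distribution of a Feichtinger state $\hat\rho$. (i) The covariance matrix $\Sigma$ satisfies the “quantum condition”
$$\Sigma + \frac{i\hbar}{2}J \ge 0 \tag{14.23}$$
i.e., the Hermitian matrix $\Sigma + \frac{i\hbar}{2}J$ is positive semidefinite; (ii) the matrix $\Sigma$ is positive: $\Sigma > 0$ (and hence invertible); (iii) condition (14.23) is also sufficient for $\hat\rho$ to be a Feichtinger state when $\rho$ is a Gaussian
$$\rho(z) = (2\pi)^{-n}\sqrt{\det\Sigma^{-1}}\; e^{-\frac12\Sigma^{-1}z^2}. \tag{14.24}$$

Proof. (i) That $\Sigma + \frac{i\hbar}{2}J$ is Hermitian is clear: Since the adjoint of $J$ is $-J$, we have
$$\left(\Sigma + \frac{i\hbar}{2}J\right)^* = \Sigma^* + \left(\frac{i\hbar}{2}J\right)^* = \Sigma + \frac{i\hbar}{2}J.$$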
We next remark that $\Sigma = \Sigma_0$ where $\Sigma_0$ is the covariance matrix of $\rho_0(z) = \rho(z + \langle z\rangle_\rho)$: We have $\langle z\rangle_{\rho_0} = 0$, and hence it is sufficient to assume that the density operator $\hat\rho$ is centered at 0. We are going to use Lemma 184 with the function $F_\sigma\rho_0 = \rho_{0,\sigma}$. Observing that the symplectic Fourier transform $\rho_\sigma$ is twice continuously differentiable (see Proposition 191), we have
$$\hbar^2\rho_\sigma''(0) = (2\pi\hbar)^{-n}\begin{pmatrix} -\Sigma_{PP} & \Sigma_{XP} \\ \Sigma_{PX} & -\Sigma_{XX}\end{pmatrix},$$
and hence
$$\hbar^2\rho_\sigma''(0) = \left(\frac{1}{2\pi\hbar}\right)^n J\Sigma J. \tag{14.25}$$
Since $\hat\rho$ is a density matrix, we have $M = -2\hbar^{-1}J\Sigma J + iJ \ge 0$; the condition $M \ge 0$ being equivalent to $J^TMJ \ge 0$, the inequality (14.23) follows.

(ii) Let us show that $\Sigma > 0$. Suppose in contrario that $\Sigma$ has a negative eigenvalue $\lambda$, and let $z_\lambda$ be a real eigenvector for $\lambda$. Since $Jz_\lambda\cdot z_\lambda = 0$, we have
$$\left(\Sigma + \frac{i\hbar}{2}J\right)z_\lambda\cdot z_\lambda = \Sigma z_\lambda\cdot z_\lambda = \lambda|z_\lambda|^2 < 0$$
which contradicts the assumption $\Sigma + \frac{i\hbar}{2}J \ge 0$. There remains to show that 0 cannot be an eigenvalue of $\Sigma$. Suppose that 0 is an eigenvalue, and let $z_0$ be a corresponding real eigenvector. For real $\varepsilon \ne 0$, set $z(\varepsilon) = (I + i\varepsilon J)z_0$. Using the relationships $\Sigma z_0 = 0$ and $Jz_0\cdot z_0 = \sigma(z_0,z_0) = 0$, we get, after a few calculations,
$$\left(\Sigma + \frac{i\hbar}{2}J\right)z(\varepsilon)\cdot z(\varepsilon) = \varepsilon\,\tfrac12\hbar|z_0|^2 + \varepsilon^2\,\Sigma(Jz_0)\cdot Jz_0.$$
Choose now $\varepsilon$ opposite in sign to $\frac12\hbar$; then $\varepsilon\,\frac12\hbar|z_0|^2 < 0$ and, if $|\varepsilon|$ is small enough, the term of order $\varepsilon^2$ is dominated by the linear one, so that $(\Sigma + \frac{i\hbar}{2}J)z(\varepsilon)\cdot z(\varepsilon) < 0$, which again contradicts the assumption $\Sigma + \frac{i\hbar}{2}J \ge 0$.

(iii) The sufficiency of the quantum condition is an immediate consequence of Proposition 168 since the $\hbar$-positivity of $\rho$ implies that $\hat\rho \ge 0$, hence $\hat\rho$ is a density matrix.

Remark 186. We will show later (Proposition 200) that the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ has a geometric reformulation in terms of the symplectic eigenvalues of the covariance matrix.

We stress that the condition (14.23) is necessary but not sufficient for an arbitrary phase-space function to be the Wigner distribution of a quantum state (except when $\rho$ is a Gaussian (14.24)). Let us check this on the following counterexample. Consider the function $f(x,p)$ defined for $n = 1$ by
$$f(x,p) = \left(1 - \tfrac12 ax^2 - \tfrac12 bp^2\right)e^{-(a^2x^4 + b^2p^4)}$$
where $a$ and $b$ are positive constants such that $ab \ge \frac14\hbar^2$. Define now
$$\rho(x,p) = \left(\frac{1}{2\pi}\right)^2\int_{\mathbb{R}^2} e^{-i(xx'+pp')}f(x',p')\,dp'\,dx';$$
since $f$ is an even function, $\rho$ is real. In view of the Fourier inversion formula, we have
$$f(x,p) = \int_{\mathbb{R}^2} e^{i(xx'+pp')}\rho(x',p')\,dp'\,dx',$$
and hence
$$\int_{\mathbb{R}^2}\rho(x,p)\,dp\,dx = f(0,0) = 1$$
so that $\rho(x,p)$ is a candidate for being the Wigner function of some density matrix. Calculating the covariance matrix $\Sigma$ associated with $\rho$, one finds after some tedious but straightforward calculations that $\Sigma + \frac{i\hbar}{2}J \ge 0$. However, $\rho$ cannot be the Wigner function of a density matrix $\hat\rho$ because, if this were the case, we would have $\langle p^4\rangle_{\hat\rho} \ge 0$; but by definition of $\rho$,
$$\langle p^4\rangle_{\hat\rho} = \int_{\mathbb{R}^2} p^4\rho(x,p)\,dp\,dx = \frac{\partial^4 f}{\partial p^4}(0,0) = -24b^2 < 0.$$
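The quantum condition (14.20) and its reformulation (14.21) are easy to test numerically for a given matrix. The sketch below is an illustration only: the helper names are hypothetical, the coordinates are ordered as $(x_1,\dots,x_n,p_1,\dots,p_n)$ with $J = \begin{pmatrix} 0 & I \\ -I & 0\end{pmatrix}$, and the symplectic eigenvalues are computed, for $\Sigma > 0$, as the moduli of the eigenvalues $\pm i\lambda_j$ of $J\Sigma$.

```python
import numpy as np

def standard_J(n):
    """Standard symplectic matrix for the ordering (x_1..x_n, p_1..p_n)."""
    I, O = np.eye(n), np.zeros((n, n))
    return np.block([[O, I], [-I, O]])

def satisfies_quantum_condition(Sigma, hbar=1.0, tol=1e-12):
    """Test Sigma + (i hbar / 2) J >= 0 via the smallest Hermitian eigenvalue."""
    n = Sigma.shape[0] // 2
    M = Sigma + 0.5j * hbar * standard_J(n)   # Hermitian since J^T = -J
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def symplectic_eigenvalues(Sigma):
    """Symplectic eigenvalues of Sigma > 0: moduli of the eigenvalues of J Sigma."""
    n = Sigma.shape[0] // 2
    ev = np.linalg.eigvals(standard_J(n) @ Sigma)   # pairs +i*lam_j, -i*lam_j
    return np.sort(np.abs(ev))[::2]

# Illustrative diagonal covariance matrix for n = 2.
Sigma = np.diag([1.0, 4.0, 1.0, 1.0])
lam = symplectic_eigenvalues(Sigma)                  # sqrt(1*1), sqrt(4*1)
ok_small = satisfies_quantum_condition(Sigma, hbar=1.0)
ok_edge = satisfies_quantum_condition(Sigma, hbar=2.0)
ok_large = satisfies_quantum_condition(Sigma, hbar=2.1)
```

For this matrix $\lambda_{\min} = 1$, so the condition holds exactly for $\hbar \le 2$; decreasing the parameter never destroys it, in agreement with Proposition 183. As the counterexample above shows, passing this test does not by itself make a function the Wigner distribution of a state.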
14.5 The Narcowich–Wigner spectrum

Let $\rho \in L^2(\mathbb{R}^{2n})$; we assume that $\rho$ is real and that
$$\int_{\mathbb{R}^{2n}} \rho(z)\,dz = 1. \tag{14.26}$$
We also assume that the symplectic $\eta$-Fourier transform
$$\rho_{\sigma,\eta}(z) = \left(\frac{1}{2\pi\eta}\right)^n \int_{\mathbb{R}^{2n}} e^{-\frac{i}{\eta}\sigma(z,z')}\rho(z')\,dz'$$
($\eta \in \mathbb{R}$, $\eta \ne 0$) is continuous.

Definition 187. The Narcowich–Wigner spectrum $\Sigma_{NW}(\rho)$ consists of all $\eta \in \mathbb{R}$, $\eta \ne 0$, such that $\rho$ is the $\eta$-Wigner distribution of a density operator $\hat\rho_\eta$.

The Narcowich–Wigner spectrum has the following elementary properties (where one allows the parameter $\eta$ to take negative values):

Proposition 188. Let $\rho$ be as just stated. (i) We have $\eta \in \Sigma_{NW}(\rho)$ if and only if $-\eta \in \Sigma_{NW}(\rho)$; (ii) the Narcowich–Wigner spectrum of $\rho$ is a bounded subset of $\mathbb{R}$: There exists $\eta_0 \in \mathbb{R}$ such that $\Sigma_{NW}(\rho) \subset [-\eta_0, 0) \cup (0, \eta_0]$.

The following result was proved by Dias and Prata:

Proposition 189. Assume that $\rho$ is not a Gaussian distribution. Then $\Sigma_{NW}(\rho) = \{-\hbar, \hbar\}$.

For a proof of this result, see the comments and references section. The result is intriguing because one can wonder why, as soon as a state becomes non-Gaussian, its Narcowich–Wigner spectrum collapses in such a spectacular way. A clue might be the extremality properties of Gaussian states that single them out among all other states.
14.6 Comments and references

The quantum condition (Theorem 185) has a long and rather chaotic history. The first to use the positivity condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ on the covariance matrix were probably Arvind, Dutta, Mukunda, and Simon in [4]. They did so in order to reformulate the uncertainty principle (which we study in Chapter 15). It has become one of the main tools in quantum optics and quantum information theory because it serves as a test replacing the Robertson–Schrödinger inequalities, which are difficult to handle in practice. The main contributions to the proof of Theorem 185 are due to Narcowich and his collaborators [58–60]. The Narcowich–Wigner spectrum $\Sigma_{NW}(\rho)$ was introduced by Narcowich in [58] and further studied by Bröcker and Werner [13]. Mathematically speaking, the topic of variable values for $\hbar$ is very much open, and very little is known about the Narcowich–Wigner spectrum.
15 The uncertainty principle

https://doi.org/10.1515/9783110722772-015

We pursue the study of the covariance matrix of a density matrix and examine its relationship with the uncertainty principle of quantum mechanics, and show that it can be expressed in terms of a deep property of symplectic topology. We also suggest a new, more general geometric version of the uncertainty principle that involves the use of polarity between convex sets and appears naturally when one studies the orthogonal projections of the covariance ellipsoid on the $x$ and $p$ subspaces.

15.1 Feichtinger states

Let us introduce the following terminology:

Definition 190. A mixed quantum state $\{(\psi_j,\lambda_j)\}$ in $L^2(\mathbb{R}^n)$ is called a Feichtinger (quantum) state of order $s$ if each $\psi_j$ belongs to the modulation space $M_s^1(\mathbb{R}^n)$. Identifying the state $\{(\psi_j,\lambda_j)\}$ with the density operator
$$\hat\rho = \sum_j \lambda_j\hat\rho_{\psi_j},$$
we will also call $\hat\rho$ a Feichtinger state of order $s$.

One needs to be somewhat careful at this point. Each mixed quantum state $\{(\psi_j,\lambda_j)\}$ defines a unique density operator $\hat\rho$, but the converse of this property is not true because mixed quantum states can usually be decomposed in infinitely many ways. For instance, the spectral theorem tells us that $\hat\rho$ can be written
$$\hat\rho = \sum_j \lambda_j'\hat\rho_{\psi_j'}$$
where the $\psi_j'$ form an orthonormal system; now, the definition of a Feichtinger state does not imply that $\psi_j' \in M_s^1(\mathbb{R}^n)$ if $\psi_j \in M_s^1(\mathbb{R}^n)$. The definition thus makes use of one particular decomposition of the density operator.

Recall that the condition $\psi \in M_2^1(\mathbb{R}^n)$ means that $\psi \in L^2(\mathbb{R}^n)$ and
$$\int_{\mathbb{R}^{2n}} |W\psi(z)|(1+|z|^2)\,dz < \infty. \tag{15.1}$$
It follows that the Wigner distribution
$$\rho(z) = \sum_j \lambda_j W\psi_j(z)$$
of a Feichtinger state satisfies
$$\int_{\mathbb{R}^{2n}} (1+|z|^2)|\rho(z)|\,dz < \infty. \tag{15.2}$$
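An immediate consequence is that the conditions (14.16) ensuring the existence of $\Sigma$ are satisfied, hence the covariance matrix of a Feichtinger state is always well-defined. Here is an interesting differentiability property satisfied by the Wigner distributions of Feichtinger states:

Proposition 191. The Fourier transforms $F\rho$ and $\rho_\sigma = F_\sigma\rho$ of the Wigner distribution of a Feichtinger state with $s \ge 2$ are twice continuously differentiable: $F\rho \in C^2(\mathbb{R}^{2n})$.

Proof. It suffices to assume that $s = 2$. Since $\rho_\sigma(z) = F\rho(Jz)$, it suffices to prove the result for the ordinary Fourier transform $F$, that is
$$F\rho(z) = \left(\frac{1}{2\pi\hbar}\right)^n \int_{\mathbb{R}^{2n}} e^{-\frac{i}{\hbar}z\cdot z'}\rho(z')\,dz'.$$
Since (15.2) is satisfied, we may differentiate twice under the integration sign, and we get
$$\partial_{z_\alpha}F\rho = -\frac{i}{\hbar}F(z_\alpha\rho), \qquad \partial_{z_\alpha}\partial_{z_\beta}F\rho = \left(-\frac{i}{\hbar}\right)^2 F(z_\alpha z_\beta\rho),$$
and hence
$$|\partial_{z_\alpha}F\rho(z)| \le \frac{1}{\hbar}\left(\frac{1}{2\pi\hbar}\right)^n\int_{\mathbb{R}^{2n}} |z_\alpha'\rho(z')|\,dz' < \infty,$$
$$|\partial_{z_\alpha}\partial_{z_\beta}F\rho(z)| \le \left(\frac{1}{\hbar}\right)^2\left(\frac{1}{2\pi\hbar}\right)^n\int_{\mathbb{R}^{2n}} |z_\alpha' z_\beta'\rho(z')|\,dz' < \infty.$$
These derivatives are Fourier transforms of integrable functions and are therefore continuous.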
15.2 The Robertson–Schrödinger inequalities

The Robertson–Schrödinger inequalities are closely related to the quantum condition in Theorem 185. This connection is after all hardly surprising, as shown by the following result:

Proposition 192. Let $\Sigma$ be a real symmetric $2n \times 2n$ matrix. Suppose that $\Sigma$ satisfies the quantum condition
$$\Sigma + \frac{i\hbar}{2}J \ge 0. \tag{15.3}$$
Then the inequalities
$$(\Delta X_j)^2(\Delta P_j)^2 \ge \Delta(X_j,P_j)^2 + \tfrac14\hbar^2 \tag{15.4}$$
hold for $1 \le j \le n$.

Proof. Recall that the matrix $\Sigma + \frac{i\hbar}{2}J$ is Hermitian and that $\Sigma > 0$. The non-negativity of $\Sigma + \frac{i\hbar}{2}J$ implies that of its principal $2\times 2$ submatrices
$$\Sigma_j = \begin{pmatrix} (\Delta X_j)^2 & \Delta(X_j,P_j) + \frac{i\hbar}{2} \\ \Delta(P_j,X_j) - \frac{i\hbar}{2} & (\Delta P_j)^2 \end{pmatrix}$$
which are non-negative provided that $\Sigma + \frac{i\hbar}{2}J$ is. Since
$$\operatorname{Tr}(\Sigma_j) = (\Delta X_j)^2 + (\Delta P_j)^2 \ge 0,$$
we have $\Sigma_j \ge 0$ if and only if
$$\det\Sigma_j = (\Delta X_j)^2(\Delta P_j)^2 - \Delta(X_j,P_j)^2 - \tfrac14\hbar^2 \ge 0$$
which is equivalent to the inequality (15.4).

The inequalities (15.4) are called the Robertson–Schrödinger inequalities in the physical literature; notice that they imply the textbook “Heisenberg uncertainty principle”
$$(\Delta X_j)(\Delta P_j) \ge \tfrac12\hbar, \quad 1 \le j \le n. \tag{15.5}$$
At this point, it is appropriate to notice that the condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ is not in general equivalent to the uncertainty inequalities (15.4); it is in fact a stronger condition, except for $n = 1$. In this case, the covariance matrix is just
$$\Sigma = \begin{pmatrix} \Delta X^2 & \Delta(X,P) \\ \Delta(P,X) & \Delta P^2 \end{pmatrix},$$
and since
$$\operatorname{Tr}\left(\Sigma + \frac{i\hbar}{2}J\right) = \Delta X^2 + \Delta P^2 \ge 0,$$
the condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ is equivalent to $\det(\Sigma + \frac{i\hbar}{2}J) \ge 0$, that is, to
$$\Delta X^2\Delta P^2 - \Delta(X,P)^2 \ge \tfrac14\hbar^2$$
which is precisely (15.4) for $n = 1$. For $n > 1$, however, the inequalities (15.4) only involve the $n$ principal $2\times 2$ submatrices $\Sigma_j$, and the equivalence no longer holds.

The following result is well-known and proven in elementary books using statistical methods and the Cauchy–Schwarz inequality:

Corollary 193. Let $\hat\rho$ be a Feichtinger state on $L^2(\mathbb{R}^n)$ and $\Sigma$ its covariance matrix. Then the associated covariances satisfy the Robertson–Schrödinger inequalities
$$(\Delta X_j)^2(\Delta P_j)^2 \ge \Delta(X_j,P_j)^2 + \tfrac14\hbar^2.$$

Proof. It is an immediate consequence of Theorem 185 and of Proposition 192.

Notice that the assumption that $\hat\rho$ is a Feichtinger state is essential since it guarantees the existence of the covariance matrix.
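That the quantum condition is strictly stronger than the Robertson–Schrödinger inequalities for $n > 1$ can be seen on a small numerical example. The matrix below is a hypothetical choice made for illustration (it is not taken from the text): it is real symmetric and positive semidefinite but singular, with coordinates ordered as $(x_1,x_2,p_1,p_2)$, and NumPy is assumed to be available.

```python
import numpy as np

hbar = 1.0
n = 2
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

# Illustrative matrix: perfectly anticorrelated positions, uncorrelated momenta.
Sigma = np.array([[ 1.0, -1.0, 0.0, 0.0],
                  [-1.0,  1.0, 0.0, 0.0],
                  [ 0.0,  0.0, 1.0, 0.0],
                  [ 0.0,  0.0, 0.0, 1.0]])

def robertson_schroedinger_holds(Sigma, hbar):
    """Check (15.4) for every conjugate pair (x_j, p_j)."""
    n = Sigma.shape[0] // 2
    return all(
        Sigma[j, j] * Sigma[n + j, n + j] >= Sigma[j, n + j] ** 2 + hbar ** 2 / 4
        for j in range(n)
    )

rs_ok = robertson_schroedinger_holds(Sigma, hbar)

# Smallest eigenvalue of the Hermitian matrix Sigma + (i hbar / 2) J.
min_eig = np.linalg.eigvalsh(Sigma + 0.5j * hbar * J).min()
```

Here the inequalities (15.4) hold for $j = 1, 2$, yet `min_eig` is negative, so the quantum condition fails. This is consistent with Theorem 185 (ii): the matrix is singular, whereas the quantum condition forces $\Sigma > 0$.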
15.3 Gromov’s symplectic non-squeezing theorem

15.3.1 Statement of Gromov’s theorem

We will study in the next section the uncertainty principle from a geometric point of view. The key to our main result (Theorem 201) is a very deep property from symplectic topology due to Mikhail Gromov. Before we state his result, let us introduce the following notation: For $R > 0$, we denote by
$$B^{2n}(R) = \{z \in \mathbb{R}^{2n} : |z| \le R\}$$
the phase space ball with center 0 and radius $R$, and by
$$Z_j^{2n}(R) = \{z = (x,p) \in \mathbb{R}^{2n} : x_j^2 + p_j^2 \le R^2\}$$
the cylinder with radius $R$ centered at 0 and based on the $x_j, p_j$ plane. When centered at an arbitrary point $z_0 \in \mathbb{R}^{2n}$, these sets are denoted by $B^{2n}(z_0,R)$ and $Z_j^{2n}(z_0,R)$. Recall that a symplectomorphism $f$ of $(\mathbb{R}^{2n},\sigma)$ is a diffeomorphism of $\mathbb{R}^{2n}$ respecting the symplectic structure: $f^*\sigma = \sigma$ (i.e., the Jacobian matrix $Df(z) \in \mathrm{Sp}(n)$ for every $z \in \mathbb{R}^{2n}$).

Theorem 194 (Gromov). There exists a symplectomorphism $f$ of $(\mathbb{R}^{2n},\sigma)$ such that $f(B^{2n}(z_0,R)) \subset Z_j^{2n}(z_0,r)$ ($j = 1,\dots,n$) if and only if $R \le r$.

The choice of index $j$ for the cylinders $Z_j^{2n}(z_0,r)$ is irrelevant because $Z_j^{2n}(z_0,r)$ can always be taken to another cylinder $Z_k^{2n}(z_0,r)$ using a trivial linear symplectic transformation. All the known proofs (direct or indirect) of Gromov’s theorem are notoriously difficult; Gromov’s original proof makes use of the theory of pseudoholomorphic curves. We will actually only need its linear variant, which is somewhat easier to prove.

15.3.2 Proof of Gromov’s theorem in the linear case

We will need the following elementary lemma from linear algebra:

Lemma 195. Let $S = \begin{pmatrix} A & B \\ C & D\end{pmatrix}$ be a $2n \times 2n$ symplectic matrix, and let $(a,b) = (a_1,\dots,a_n,b_1,\dots,b_n)$ and $(c,d) = (c_1,\dots,c_n,d_1,\dots,d_n)$ be its $j$-th row and $(n+j)$-th row, respectively. We have $a\cdot d - b\cdot c = 1$.
Proof. Since $S$ is symplectic, $A$, $B$, $C$ and $D$ satisfy the condition $AD^T - BC^T = I$ (see (1.14)). The equality $a\cdot d - b\cdot c = 1$ follows by reading off the $j$-th diagonal entry.

Let us now prove the linear Gromov theorem:

Proposition 196. There exists $S \in \mathrm{Sp}(n)$ such that $S(B^{2n}(R)) \subset Z_j^{2n}(r)$ if and only if $R \le r$.

Proof. Since $S$ is linear, it is sufficient, by homogeneity, to assume that $R = 1$. It is moreover no restriction to assume $j = 1$. It is thus sufficient to prove the implication
$$\left[\,x_1^2 + p_1^2 \le r^2 \text{ for all } (x,p) \in S(B^{2n}(1))\,\right] \;\Rightarrow\; 1 \le r. \tag{15.6}$$
Denoting by $a, b, c, d$ the first rows of the matrices $A, B, C, D$, respectively, the first position and momentum coordinates of $Sz$, $z = (x,p)$, are
$$x_1 = a\cdot x + b\cdot p, \qquad p_1 = c\cdot x + d\cdot p.$$
Setting $u = (a,b)$ and $v = (c,d)$, the condition $x_1^2 + p_1^2 \le r^2$ on $S(B^{2n}(1))$ is thus equivalent to
$$(u\cdot z)^2 + (v\cdot z)^2 \le r^2$$
for all $z = (x,p)$ with $|z| \le 1$. In particular, choosing respectively $z = u/|u|$ and $z = v/|v|$, we must have both $|u|^2 \le r^2$ and $|v|^2 \le r^2$, that is
$$a\cdot a + b\cdot b \le r^2 \quad\text{and}\quad c\cdot c + d\cdot d \le r^2. \tag{15.7}$$
Let us show that these inequalities imply that we must have $1 \le r^2$; the result will follow. In view of Lemma 195, it follows by the Cauchy–Schwarz inequality that
$$1 = a\cdot d - b\cdot c \le (|a|^2 + |b|^2)^{1/2}(|c|^2 + |d|^2)^{1/2} \le r^2, \tag{15.8}$$
which implies that $r \ge 1$.

15.3.3 The Gromov width

Associated with Gromov’s symplectic non-squeezing theorem comes the notion of Gromov width. We again limit ourselves to the linear case.

Definition 197. The (linear) Gromov width of a subset $\Omega \subset \mathbb{R}^{2n}$ is the (possibly infinite) number
$$w_G(\Omega) = \sup_{S\in\mathrm{Sp}(n),\, z_0\in\mathbb{R}^{2n}} \{\pi R^2 : S(B^{2n}(z_0,R)) \subset \Omega\}.$$
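Lemma 195 and the linear non-squeezing statement of Proposition 196 can be probed numerically. The sketch below rests on assumptions of my own: the symplectic matrix is built as a product of elementary symplectic factors (one convenient construction among many), and the radius of the smallest centered disc containing the projection of $S(B^{2n}(1))$ on the $x_j, p_j$ plane is obtained as the largest singular value of the $2 \times 2n$ matrix with rows $u$ and $v$.

```python
import numpy as np

def standard_J(n):
    I, O = np.eye(n), np.zeros((n, n))
    return np.block([[O, I], [-I, O]])

def random_symplectic(n, rng):
    """Product of elementary symplectic matrices (an illustrative construction)."""
    I, O = np.eye(n), np.zeros((n, n))
    P = rng.normal(size=(n, n)); P = (P + P.T) / 2       # symmetric
    Qm = rng.normal(size=(n, n)); Qm = (Qm + Qm.T) / 2   # symmetric
    L = rng.normal(size=(n, n)) + 3 * np.eye(n)          # assumed invertible
    V = np.block([[I, O], [P, I]])
    W = np.block([[I, Qm], [O, I]])
    D = np.block([[L, O], [O, np.linalg.inv(L).T]])
    return V @ D @ standard_J(n) @ W

def lemma_195_value(S, j):
    """a.d - b.c for the rows (a, b) and (c, d) of index j and n + j."""
    n = S.shape[0] // 2
    a, b = S[j, :n], S[j, n:]
    c, d = S[n + j, :n], S[n + j, n:]
    return float(a @ d - b @ c)

def min_cylinder_radius(S, j):
    """Smallest r with S(B(1)) inside the cylinder of radius r on the x_j, p_j plane."""
    n = S.shape[0] // 2
    u, v = S[j], S[n + j]
    G = np.array([[u @ u, u @ v], [u @ v, v @ v]])
    return float(np.sqrt(np.linalg.eigvalsh(G).max()))

rng = np.random.default_rng(1)
n = 3
S = random_symplectic(n, rng)
J = standard_J(n)
values = [lemma_195_value(S, j) for j in range(n)]
radii = [min_cylinder_radius(S, j) for j in range(n)]
```

In this experiment every entry of `values` equals 1 up to rounding, and no entry of `radii` falls below 1: the unit ball cannot be squeezed linearly and symplectically into a thinner cylinder.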
It is obvious that $w_G(B^{2n}(z_0,R)) = \pi R^2$. It follows from Gromov’s (linear) theorem just proven that we also have $w_G(Z_j^{2n}(z_0,R)) = \pi R^2$. In view of the obvious implication
$$\Omega' \subset \Omega \;\Rightarrow\; w_G(\Omega') \le w_G(\Omega),$$
we thus have
$$B^{2n}(z_0,R) \subset \Omega \subset Z_j^{2n}(z_0,R) \;\Rightarrow\; w_G(\Omega) = \pi R^2.$$
The Gromov width is a particular case of the more general notion of symplectic capacity (see the comments and references section).

15.4 The covariance ellipsoid

The notion of covariance ellipsoid will allow us to give geometric interpretations of the uncertainty principle. Let us introduce the following terminology:

Definition 198. Let $\Sigma$ be a quantum covariance matrix (hence $\Sigma > 0$). The phase space ellipsoid
$$\Omega = \{z \in \mathbb{R}^{2n} : \tfrac12\Sigma^{-1}z^2 \le 1\}$$
is called the covariance ellipsoid associated with $\Sigma$. One also sometimes uses the term “Wigner ellipsoid” for $\Omega$.

We are going to prove a very important geometric reformulation of the uncertainty principle in terms of concepts from symplectic geometry. Let us first give the following definition:

Definition 199. (i) Let $B^{2n}(R)$ be the closed phase space ball with radius $R$ centered at 0. The image $S(B^{2n}(R))$ by $S \in \mathrm{Sp}(n)$ is called a symplectic ball. (ii) When $R = \sqrt{\hbar}$, the symplectic ball $S(B^{2n}(\sqrt{\hbar}))$ is called a “quantum blob”.

Here is a very useful reformulation of the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ in terms of the symplectic eigenvalues of the covariance matrix:

Proposition 200. The quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ is equivalent to the inequality
$$\hbar \le 2\lambda_{\min} \tag{15.9}$$
where $\lambda_{\min}$ is the smallest symplectic eigenvalue of $\Sigma$.

Proof. Let us show that the conditions (15.9) and (15.3) indeed are equivalent. Let $\Sigma = S^TDS$ be a Williamson diagonalization of $\Sigma$, that is, $S \in \mathrm{Sp}(n)$ and $D = \begin{pmatrix}\Lambda & 0 \\ 0 & \Lambda\end{pmatrix}$ where $\Lambda$ is the diagonal matrix whose diagonal entries are the symplectic eigenvalues $\lambda_1,\dots,\lambda_n$ of $\Sigma$. Since $S^TJS = J$, condition (15.3) is equivalent to $D + \frac{i\hbar}{2}J \ge 0$; the characteristic polynomial of $D + \frac{i\hbar}{2}J$ is the determinant
$$P(\lambda) = \begin{vmatrix} \Lambda - \lambda I_n & \frac{i\hbar}{2}I_n \\ -\frac{i\hbar}{2}I_n & \Lambda - \lambda I_n \end{vmatrix} = \det\left[(\Lambda - \lambda I_n)^2 - \tfrac14\hbar^2 I_n\right];$$
the matrix $\Lambda$ being diagonal, the zeroes $\lambda$ of $P(\lambda)$ are the solutions of the $n$ equations $(\lambda_j - \lambda)^2 - \frac14\hbar^2 = 0$, that is, $\lambda_j - \lambda = \pm\frac12|\hbar|$. Since the condition $D + \frac{i\hbar}{2}J \ge 0$ requires $\lambda \ge 0$ for all these zeroes, we must have $\lambda_j \ge \frac12|\hbar|$ for all $j$, hence (15.9).

We now have the material for proving our geometric reformulation of the uncertainty principle. We give it the status of a theorem because of its great usefulness.

Theorem 201. (i) A covariance matrix $\Sigma$ satisfies the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ if and only if the associated covariance ellipsoid $\Omega$ contains a quantum blob. (ii) This condition is equivalent to the inequality $w_G(\Omega) \ge \pi\hbar$.

Proof. (i) As in the proof of Proposition 200, a Williamson diagonalization $\Sigma = SDS^T$ (note the order of the factors: here $S$ is written first and $S^T$ last) brings the quantum condition into the form $D + \frac{i\hbar}{2}J \ge 0$ where
$$D = \begin{pmatrix}\Lambda & 0 \\ 0 & \Lambda\end{pmatrix}, \qquad \Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_n),$$
the $\lambda_j$ being the symplectic eigenvalues of $\Sigma$. The condition $\frac12\Sigma^{-1}z\cdot z \le 1$ is equivalent to
$$\tfrac12 (S^T)^{-1}D^{-1}S^{-1}z\cdot z \le 1$$
so that $\Omega = S(\Omega_D)$ where $\Omega_D$ is the ellipsoid defined by $\frac12 D^{-1}z^2 \le 1$; in $x_j, p_j$ coordinates,
$$\Omega_D : \sum_{j=1}^n \frac{1}{2\lambda_j}(x_j^2 + p_j^2) \le 1.$$
Assume now that $S'(B^{2n}(\sqrt{\hbar})) \subset \Omega_D$ for some $S' \in \mathrm{Sp}(n)$. In view of Gromov’s non-squeezing theorem, the areas of the orthogonal projections of $S'(B^{2n}(\sqrt{\hbar}))$ on every $x_j, p_j$ plane must be at least $\pi\hbar$, but the condition $S'(B^{2n}(\sqrt{\hbar})) \subset \Omega_D$ is equivalent to saying that the sections $x_j^2 + p_j^2 \le 2\lambda_j$ of $\Omega_D$ by these planes must have area $2\pi\lambda_j \ge \pi\hbar$, for all $j = 1,\dots,n$. This is equivalent to $\hbar \le 2\lambda_{\min}$, hence the result in view of the inequality (15.9) in Proposition 200, since the condition $S'(B^{2n}(\sqrt{\hbar})) \subset \Omega_D$ means that $SS'(B^{2n}(\sqrt{\hbar})) \subset S(\Omega_D) = \Omega$. (ii) Suppose that $S(B^{2n}(\sqrt{\hbar})) \subset \Omega$ for some $S \in \mathrm{Sp}(n)$; by the monotonicity and symplectic invariance of the Gromov width, we have
$$\pi\hbar = w_G(B^{2n}(\sqrt{\hbar})) = w_G(S(B^{2n}(\sqrt{\hbar}))) \le w_G(\Omega)$$
so that $w_G(\Omega) \ge \pi\hbar$. If, conversely, $w_G(\Omega) \ge \pi\hbar$, then the supremum of all radii $R$ such that $S(B^{2n}(R)) \subset \Omega$ for some $S \in \mathrm{Sp}(n)$ is at least $\sqrt{\hbar}$, hence $S(B^{2n}(\sqrt{\hbar})) \subset \Omega$ for some $S \in \mathrm{Sp}(n)$.

15.5 Uncertainty and quantum polarity

15.5.1 Orthogonal projections of the covariance ellipsoid

The projection of an ellipsoid on any subspace is an ellipsoid. We are going to focus here on the orthogonal projections of the covariance ellipsoid on the position and momentum subspaces $\mathbb{R}^n_x \equiv \mathbb{R}^n \times 0$ and $\mathbb{R}^n_p \equiv 0 \times \mathbb{R}^n$ of phase space $\mathbb{R}^{2n}$. We will denote these projections by
$$\Pi_X : \mathbb{R}^{2n} \to \mathbb{R}^n_x, \quad \Pi_X(x,p) = x,$$
$$\Pi_P : \mathbb{R}^{2n} \to \mathbb{R}^n_p, \quad \Pi_P(x,p) = p.$$
We begin with the following general result:

Lemma 202. Let $\Omega = \{z : Mz\cdot z \le \hbar\}$, $M > 0$. The orthogonal projections $\Omega_X = \Pi_X\Omega$ and $\Omega_P = \Pi_P\Omega$ are the ellipsoids
$$\Omega_X = \{x \in \mathbb{R}^n_x : (M/M_{PP})x^2 \le \hbar\} \tag{15.10}$$
$$\Omega_P = \{p \in \mathbb{R}^n_p : (M/M_{XX})p^2 \le \hbar\} \tag{15.11}$$
where
$$M/M_{PP} = M_{XX} - M_{XP}M_{PP}^{-1}M_{PX} \tag{15.12}$$
$$M/M_{XX} = M_{PP} - M_{PX}M_{XX}^{-1}M_{XP} \tag{15.13}$$
are the Schur complements in $M$ of $M_{PP}$ and $M_{XX}$, respectively.

Proof. Let us set $Q(z) = Mz^2 - \hbar$; the boundary $\partial\Omega$ is the hypersurface $Q(z) = 0$, which is defined by
$$M_{XX}x^2 + 2M_{PX}x\cdot p + M_{PP}p^2 = \hbar. \tag{15.14}$$
A point $x$ belongs to the boundary $\partial\Omega_X$ of $\Omega_X$ if and only if the normal vector to $\partial\Omega$ at the point $z = (x,p)$ is parallel to $\mathbb{R}^n_x \times 0$, hence we get the constraint $\nabla_zQ(z) = 2Mz \in \mathbb{R}^n_x \times 0$; this is equivalent to saying that $M_{PX}x + M_{PP}p = 0$, that is, to $p = -M_{PP}^{-1}M_{PX}x$. Inserting this value of $p$ in the equation (15.14) shows that $\partial\Omega_X$ is the set of all $x$ such that $(M/M_{PP})x^2 = \hbar$, which yields (15.10). Formula (15.11) is proven in the same way, swapping the subscripts $X$ and $P$.
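Lemma 202 can be checked numerically by sampling boundary points of $\Omega$ and projecting them. This is a minimal sketch with illustrative choices of my own (a random positive definite $M$, the helper name `schur_complements`, and a Cholesky factor used to generate points with $Mz\cdot z = \hbar$); NumPy is assumed to be available.

```python
import numpy as np

def schur_complements(M):
    """Return (M/M_PP, M/M_XX) for a 2n x 2n matrix in (X, P) block form."""
    n = M.shape[0] // 2
    Mxx, Mxp = M[:n, :n], M[:n, n:]
    Mpx, Mpp = M[n:, :n], M[n:, n:]
    M_over_Mpp = Mxx - Mxp @ np.linalg.solve(Mpp, Mpx)   # formula (15.12)
    M_over_Mxx = Mpp - Mpx @ np.linalg.solve(Mxx, Mxp)   # formula (15.13)
    return M_over_Mpp, M_over_Mxx

rng = np.random.default_rng(2)
n, hbar = 2, 1.0
G = rng.normal(size=(2 * n, 2 * n))
M = G @ G.T + np.eye(2 * n)            # symmetric positive definite
SX, SP = schur_complements(M)

# Boundary points of Omega: z = sqrt(hbar) * C^{-T} w with M = C C^T and |w| = 1.
C = np.linalg.cholesky(M)
W = rng.normal(size=(2 * n, 2000))
W /= np.linalg.norm(W, axis=0)
Z = np.sqrt(hbar) * np.linalg.solve(C.T, W)

# Values of the quadratic forms (M/M_PP) x.x and (M/M_XX) p.p on the projections.
x_vals = np.einsum("ik,ij,jk->k", Z[:n], SX, Z[:n])
p_vals = np.einsum("ik,ij,jk->k", Z[n:], SP, Z[n:])
```

All projected points satisfy the inequalities (15.10) and (15.11). The sketch also relies on the standard block-inverse identity $(M/M_{PP})^{-1} = (M^{-1})_{XX}$, which gives a second way of computing the projected ellipsoid.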
15.5.2 Polar duality in convex geometry

Let $X$ be a convex body in configuration space $\mathbb{R}^n_x$ (a convex body in a Euclidean space is a compact convex set with non-empty interior). We assume in addition that $X$ contains 0 in its interior. This is the case if, for instance, $X$ is symmetric: $X = -X$.

Definition 203. The (quantum) polar dual of $X$ is the subset
$$X^\hbar = \{p \in \mathbb{R}^n_p : px \le \hbar \text{ for all } x \in X\} \tag{15.15}$$
of the dual space $\mathbb{R}^n_p \equiv (\mathbb{R}^n_x)^*$.

Notice that it trivially follows from the definition that $X^\hbar$ is convex. In the mathematical literature, one usually chooses $\hbar = 1$, in which case one writes $X^o$ for the polar dual; we have $X^\hbar = \hbar X^o$. Here is an intuitive interpretation of the polar dual: $X$ being convex, it is the intersection of a (possibly infinite) family of half spaces (bounded by the “supporting hyperplanes” of $X$). Therefore, the polar of $X$ can be seen as the convex hull of a (possibly infinite) set of points, coming from all of the supporting hyperplanes. The following properties of the polar dual are obvious:

Biduality: $(X^\hbar)^\hbar = X$; (15.16)

Antimonotonicity: $X \subset Y \Rightarrow Y^\hbar \subset X^\hbar$; (15.17)

Scaling: $\det L \ne 0 \Rightarrow (LX)^\hbar = (L^T)^{-1}X^\hbar$. (15.18)
The “smaller” $X$ is, the larger $X^\hbar$ is. For instance, if $X = \{0\}$ (corresponding to a perfectly localized system), then $X^\hbar = \mathbb{R}^n_p$, the whole momentum space. This property, reminiscent of the uncertainty principle and of the duality of the support of a function and that of its Fourier transform, becomes particularly visible when one studies the polar duals of ellipsoids. Here are a few useful results:

Proposition 204. Let $\mathcal{B}_X^n(R)$ (resp. $\mathcal{B}_P^n(R)$) be the ball $\{x : |x| \le R\}$ in $\mathbb{R}^n_x$ (resp. $\{p : |p| \le R\}$ in $\mathbb{R}^n_p$). (i) We have
$$\mathcal{B}_X^n(R)^\hbar = \mathcal{B}_P^n(\hbar/R). \tag{15.19}$$
In particular,
$$\mathcal{B}_X^n(\sqrt{\hbar})^\hbar = \mathcal{B}_P^n(\sqrt{\hbar}). \tag{15.20}$$
(ii) Let $A = A^T$ be a positive definite (hence invertible) $n \times n$ matrix. We have
$$\{x : Ax^2 \le R^2\}^\hbar = \{p : A^{-1}p^2 \le (\hbar/R)^2\}, \tag{15.21}$$
and hence
$$\{x : Ax^2 \le \hbar\}^\hbar = \{p : A^{-1}p^2 \le \hbar\}. \tag{15.22}$$
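Proof. (i) Let us show that $\mathcal{B}_X^n(R)^\hbar \subset \mathcal{B}_P^n(\hbar/R)$. Let $p \in \mathcal{B}_X^n(R)^\hbar$, $p \ne 0$, and set $x = (R/|p|)p$; we have $|x| = R$, and hence $px \le \hbar$, that is, $R|p| \le \hbar$ and $p \in \mathcal{B}_P^n(\hbar/R)$. To prove the opposite inclusion, choose $p \in \mathcal{B}_P^n(\hbar/R)$. We have $|p| \le \hbar/R$, and hence, by the Cauchy–Schwarz inequality, $px \le |x||p| \le \hbar|x|/R$, that is, $px \le \hbar$ for all $x$ such that $|x| \le R$; this means that $p \in \mathcal{B}_X^n(R)^\hbar$. (ii) The ellipsoid $\{x : Ax^2 \le R^2\}$ is the image of $\mathcal{B}_X^n(R)$ by the automorphism $A^{-1/2}$; it follows from the scaling property (15.18) and formula (15.19) that
$$\{x : Ax^2 \le R^2\}^\hbar = A^{1/2}\mathcal{B}_X^n(R)^\hbar = A^{1/2}\mathcal{B}_P^n(\hbar/R)$$
which is equivalent to (15.21).

We have assumed that the convex body $X$ contains the origin 0 in its interior. The definitions and results just listed extend without difficulty to the general case by choosing an arbitrary $x_0 \in X$ and replacing $X$ with $X_0 = -x_0 + X$.

15.5.3 The fundamental property of $\Omega_X$ and $\Omega_P$

We are going to show that the orthogonal projections $\Omega_X = \Pi_X\Omega$ and $\Omega_P = \Pi_P\Omega$ of the covariance ellipsoid form a polar dual pair in the sense just stated, provided that the covariance matrix $\Sigma$ satisfies the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$.

Theorem 205. Assume that the covariance ellipsoid $\Omega$ satisfies the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ (resp. $M^{-1} + iJ \ge 0$). Then, the orthogonal projections $\Omega_X = \Pi_X\Omega$ and $\Omega_P = \Pi_P\Omega$ of $\Omega$ on $\mathbb{R}^n_x$ and $\mathbb{R}^n_p$, respectively, form a dual quantum pair: we have $\Omega_X^\hbar \subset \Omega_P$.

Proof. It suffices to prove that there exist $Y \subset \Omega_X$ and $Q \subset \Omega_P$ such that $Y^\hbar \subset Q$; indeed, the antimonotonicity property (15.17) then yields $\Omega_X^\hbar \subset Y^\hbar \subset Q \subset \Omega_P$. For this purpose, we recall that the quantum condition $\Sigma + (i\hbar/2)J \ge 0$ is equivalent to the existence of $S \in \mathrm{Sp}(n)$ such that $S(\mathcal{B}^{2n}(\sqrt{\hbar})) \subset \Omega$; it is therefore sufficient to show that the projections $Y = \Pi_X(S(\mathcal{B}^{2n}(\sqrt{\hbar})))$ and $Q = \Pi_P(S(\mathcal{B}^{2n}(\sqrt{\hbar})))$ form a quantum pair, that is, $Y^\hbar \subset Q$. The ellipsoid $S(\mathcal{B}^{2n}(\sqrt{\hbar}))$ consists of all $z \in \mathbb{R}^{2n}$ such that $Rz^2 \le \hbar$ where $R = (SS^T)^{-1}$. Since $R$ is symmetric and positive definite, we can write it in block-matrix form as
$$R = \begin{pmatrix} A & B \\ B^T & D\end{pmatrix}$$
with $A > 0$, $D > 0$, and the projections $Y$ and $Q$ are given by formulas (15.10) and (15.11), which read here
$$Y = \{x : (R/D)x^2 \le \hbar\}, \qquad Q = \{p : (R/A)p^2 \le \hbar\};$$
we have $R/D > 0$ and $R/A > 0$ [77]. In view of formula (15.22), we have $Y^\hbar = \{p : (R/D)^{-1}p^2 \le \hbar\}$, hence the condition $Y^\hbar \subset Q$ is equivalent to
$$(R/D)^{-1} \ge R/A. \tag{15.23}$$
Let us prove that this inequality holds. The conditions $R \in \mathrm{Sp}(n)$, $R = R^T$ being equivalent to $RJR = J$, we have
$$AB^T = BA, \qquad B^TD = DB \tag{15.24}$$
$$AD - B^2 = I_{n\times n}. \tag{15.25}$$
These relationships imply that the Schur complements $R/D$ and $R/A$ are
$$R/D = (AD - B^2)D^{-1} = D^{-1} \tag{15.26}$$
$$R/A = A^{-1}(AD - B^2) = A^{-1}, \tag{15.27}$$
and hence the inequality (15.23) holds if and only if $D \ge A^{-1}$. This condition is in turn equivalent to $AD \ge I_{n\times n}$. In fact, the inequality $D \ge A^{-1}$ is equivalent to $A^{1/2}DA^{1/2} \ge I_{n\times n}$; now, $A^{1/2}DA^{1/2}$ and $AD$ have the same eigenvalues, hence $AD \ge I_{n\times n}$. If conversely $AD \ge I_{n\times n}$, then $D^{1/2}AD^{1/2} \ge I_{n\times n}$, hence $A \ge D^{-1}$, that is, $D \ge A^{-1}$. Now (15.25) implies that $AD = I_{n\times n} + B^2$, hence we will have $AD \ge I_{n\times n}$ if $B^2 \ge 0$. To prove that $B^2 \ge 0$, we note that since $AB^T = BA$ (first formula (15.24)), we have $B^T = A^{-1}BA$ so that $B$ and $B^T = B^*$ have the same eigenvalues, and these must be real. It follows that the eigenvalues of $B^2$ are $\ge 0$, hence $B^2 \ge 0$ as claimed.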
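Proposition 204 and Theorem 205 can be illustrated numerically. The sketch below is only an illustration under assumptions of my own: membership in the polar dual of an ellipsoid is tested through its support function $\sup_{Ax\cdot x\le\hbar} p\cdot x = \sqrt{\hbar\,A^{-1}p\cdot p}$, and the inclusion $\Omega_X^\hbar \subset \Omega_P$ is rewritten, using Lemma 202 with $M = \frac{\hbar}{2}\Sigma^{-1}$ and the block-inverse identity $(M/M_{PP})^{-1} = (M^{-1})_{XX}$, as the matrix inequality $\frac{2}{\hbar}\Sigma_{XX} \ge \frac{\hbar}{2}\Sigma_{PP}^{-1}$. The function names are hypothetical.

```python
import numpy as np

def in_polar_dual_of_ellipsoid(p, A, hbar=1.0, tol=1e-12):
    """Is p in the polar dual of {x : Ax.x <= hbar}? Uses the support function."""
    support = np.sqrt(hbar * (p @ np.linalg.solve(A, p)))   # sup of p.x over the ellipsoid
    return bool(support <= hbar + tol)

def projections_form_quantum_pair(Sigma, hbar=1.0, tol=1e-9):
    """Test the inclusion of the polar dual of Omega_X in Omega_P."""
    n = Sigma.shape[0] // 2
    Sxx, Spp = Sigma[:n, :n], Sigma[n:, n:]
    gap = (2.0 / hbar) * Sxx - (hbar / 2.0) * np.linalg.inv(Spp)
    return bool(np.linalg.eigvalsh(gap).min() >= -tol)

hbar = 1.0

# Ball of radius R = 2 written as {x : Ax.x <= hbar}; its polar dual has radius hbar/R.
A = (hbar / 4.0) * np.eye(2)
inside = in_polar_dual_of_ellipsoid(np.array([0.5, 0.0]), A, hbar)
outside = in_polar_dual_of_ellipsoid(np.array([0.6, 0.0]), A, hbar)

# Covariance matrix of a quantum blob: Sigma = (hbar/2) S S^T with S symplectic (n = 1).
S = np.array([[1.0, 0.0], [1.5, 1.0]])
Sigma_blob = 0.5 * hbar * S @ S.T
pair_blob = projections_form_quantum_pair(Sigma_blob, hbar)

# A matrix violating the quantum condition: variances too small for hbar = 1.
pair_small = projections_form_quantum_pair(np.diag([0.1, 0.1]), hbar)
```

The first two checks reproduce (15.19) for $R = 2$. The last two show the inclusion holding for a quantum blob and failing for a matrix that violates the quantum condition.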
15.6 Comments and references

The definition of the notion of Feichtinger state is new and due to the author. That the quantum condition $\Sigma + \frac{i\hbar}{2}J \ge 0$ is not equivalent to the Robertson–Schrödinger inequalities (15.4) is easily seen on counterexamples; see [31], § 6.1.1. This fact can be stated in a lapidary way by saying that “the uncertainty principle does not characterize the quantumness of a state”; see de Gosson and Luef [36, 37] and the references therein. The notion of polarity is very well-known from convex geometry; see for instance the review paper [5]; for a detailed account of the relation between quantum uncertainty and polar duality, see our paper [35].
16 Separability and entanglement A topic of great interest and importance in quantum mechanics is that of separability and of its antinomy, entanglement. Separability and entanglement is the key to many applications (e. g., quantum optics and computing, quantum information, to name a few).
16.1 The reduced density operator 16.1.1 Physical motivation Suppose we have some information on a mixed state {(ψj , λj )} that allows us to declare that, although the state is not known with precision, it must be, with probability λj , in a tensor product state ψAj ⊗ ψBj where ψAj is defined on ℝnA and ψBj on ℝnB where nA and nB are non-zero integers such that nA + nB = n. The state {(ψAj ⊗ ψBj , λj )} is then said to be separable; if it is not the case, i. e., if at least one of the pure states ψj cannot be written as a tensor product ψAj ⊗ψBj , then the state is said to be entangled. To decide whether a given mixed state is separable or not is a very difficult question that has not yet been fully resolved at the time of writing. Intimately associated with the notion of separability is that of reduced density operator. Imagine a quantum system SA with Hilbert space HA (for instance L2 (ℝnA )) and density operator ρ̂A . At some initial time (say t0 = 0), the system SA is coupled to another quantum system SB with Hilbert space HB represented by a density operator ρ̂B on ℋB so that the total density operator is now ρ̂A ⊗ ρ̂B . Assuming that the time evolution of the coupled system is governed by ̂t ) of transformations of the Hilbert space H = HA ⊗ HB , the density a unitary group (U ̂ ∗ (ρ̂A ⊗ ρ̂B )U ̂t at time t. Let us now focus on the operator of the total system will be U t original system SA , thus ignoring the coupled system SA ∪ SB . If there had been no interaction between both systems SA and SB in the course of time, the unitary operators ̂t would be tensor products U ̂ A ⊗U ̂ B with U ̂ A and U ̂ B acting on HA and HB , respectively, U t t t t A ∗ A ̂ ) ρ̂ U ̂ A . However, in the general case, we so the system SA would be in the state (U t t cannot decouple both systems since they are correlated and remain so in time. So, we ̂ ∗ (ρ̂A ⊗ ρ̂B )U ̂t so that must in some sense “project” or “restrict” the density operator U t it acts on the Hilbert HA only. 
This operation is called “partial tracing” in quantum mechanics, and the density operator on $H_A$ obtained by this procedure is denoted by
\[ \hat\rho^A(t)=\operatorname{Tr}_B\bigl(\hat U_t^*(\hat\rho^A\otimes\hat\rho^B)\hat U_t\bigr). \]
https://doi.org/10.1515/9783110722772-016
16.1.2 Traditional Hilbert space approach
Let us briefly describe the approach to partial tracing common in books on quantum mechanics. It has the advantage of working in the general setting of arbitrary Hilbert spaces. In what follows, we consider two separable Hilbert spaces $(H_A,(\cdot|\cdot)_A)$ and $(H_B,(\cdot|\cdot)_B)$ and denote their tensor product by $(H,(\cdot|\cdot))$: $H=H_A\otimes H_B$.
Definition 206. Let $\hat\rho\in\mathcal{L}_1(H)$ be a density operator. The partial trace of $\hat\rho$ with respect to $H_B$ is the operator $\hat\rho^A\in\mathcal{L}_1(H_A)$ defined by
\[ (\hat\rho^A\psi_A|\psi_A)_{H_A}=\sum_j\bigl(\hat\rho(\psi_A\otimes\phi_{B,j})\,\big|\,\psi_A\otimes\phi_{B,j}\bigr)_H \tag{16.1} \]
where $(\phi_{B,j})_j$ is an arbitrary orthonormal basis of $H_B$. Note that this definition can be written more compactly as
\[ \hat\rho^A=\sum_j\epsilon_{B,j}^*\,\hat\rho\,\epsilon_{B,j} \tag{16.2} \]
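where $\epsilon_{B,j}:H_A\to H_A\otimes H_B$ is the extension map defined by $\epsilon_{B,j}\psi_A=\psi_A\otimes\phi_{B,j}$; its adjoint $\epsilon_{B,j}^*:H_A\otimes H_B\to H_A$ is given by
\[ \epsilon_{B,j}^*(\psi_A\otimes\psi_B)=(\psi_B|\phi_{B,j})_{H_B}\,\psi_A. \]
One easily shows that each operator $\hat\rho^A_j=\epsilon_{B,j}^*\,\hat\rho\,\epsilon_{B,j}$, defined by
\[ (\hat\rho^A_j\psi_A|\psi_A)_{H_A}=\bigl(\hat\rho(\psi_A\otimes\phi_{B,j})\,\big|\,\psi_A\otimes\phi_{B,j}\bigr)_H, \]
is a positive semidefinite trace class operator, that the sum $\hat\rho^A=\sum_j\hat\rho^A_j$ is a density operator that does not depend on the choice of the orthonormal basis $(\phi_{B,j})_j$ of $H_B$, and that:
Proposition 207. The partial trace $\hat\rho^A=\operatorname{Tr}_B(\hat\rho)$ of $\hat\rho$ is the unique operator in $\mathcal{L}_1(H_A)$ such that
\[ \operatorname{Tr}(\hat\rho^A\hat Y)=\operatorname{Tr}\bigl(\hat\rho(\hat Y\otimes\hat I_B)\bigr) \tag{16.3} \]
for all $\hat Y\in\mathcal{B}(H_A)$; here $\operatorname{Tr}(\hat\rho^A\hat Y)$ is the trace of $\hat\rho^A\hat Y\in\mathcal{L}_1(H_A)$ and $\hat I_B$ is the identity operator on $H_B$.
Notice that the left-hand side of (16.3) is well-defined since $\mathcal{L}_1(H_A)$ is an ideal in $\mathcal{B}(H_A)$. Similarly, $\operatorname{Tr}(\hat\rho(\hat Y\otimes\hat I_B))$ is also well-defined since $\hat Y\otimes\hat I_B\in\mathcal{B}(H)$. The uniqueness of a trace class operator $\hat\rho^A$ satisfying (16.3) is clear: if there were two such operators, their difference $\hat D\in\mathcal{L}_1(H_A)$ would satisfy $\operatorname{Tr}(\hat D\hat Y)=0$ for all $\hat Y\in\mathcal{B}(H_A)$; taking $\hat Y=\hat D^*$ gives $\operatorname{Tr}(\hat D\hat D^*)=0$, hence $\hat D=0$. The verification of the equality (16.3) is left to the reader.
16.1.3 The reduced density operator in harmonic analysis
We now assume $H_A=L^2(\mathbb{R}^{n_A})$ and $H_B=L^2(\mathbb{R}^{n_B})$, where $n_A$, $n_B$ are two positive integers such that $n=n_A+n_B$, so that we may identify the Hilbert space $H=H_A\otimes H_B$ with $L^2(\mathbb{R}^n)$.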
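Definition 206 and Proposition 207 can be tried out by hand in finite dimensions. The following is a minimal sketch under the assumption $H_A=H_B=\mathbb{C}^2$ (the text works with general separable Hilbert spaces); the function names and the chosen state are illustrative and not taken from the text.

```python
import math

def kron(a, b):
    """Kronecker product of square matrices a (dA x dA) and b (dB x dB)."""
    da, db = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(da) for l in range(db)]
            for i in range(da) for k in range(db)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def partial_trace_B(rho, da, db):
    """Matrix form of (16.1): (rho_A)[i][j] = sum_k rho[(i,k)][(j,k)]."""
    return [[sum(rho[i * db + k][j * db + k] for k in range(db))
             for j in range(da)] for i in range(da)]

# Assumed example: the pure state psi = (e0 x e0 + e1 x e1)/sqrt(2) on C^2 x C^2.
psi = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
rho = [[x * y for y in psi] for x in psi]   # rho = |psi><psi| (real entries)
rho_A = partial_trace_B(rho, 2, 2)

# Check the characterization (16.3) for one arbitrary test operator Y.
Y = [[1.0, 2.0], [3.0, 4.0]]
I_B = [[1.0, 0.0], [0.0, 1.0]]
lhs = trace(matmul(rho_A, Y))
rhs = trace(matmul(rho, kron(Y, I_B)))
```

For this particular state the partial trace is half the identity on $\mathbb{C}^2$: the total state is pure, while its reduced state is not.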
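We identify the direct sum $\mathbb{R}^{2n_A}\oplus\mathbb{R}^{2n_B}$ with $\mathbb{R}^{2n}$ and the symplectic form $\sigma$ on $\mathbb{R}^{2n}$ with $\sigma_A\oplus\sigma_B$, where $\sigma_A$ (resp. $\sigma_B$) is the standard symplectic form on $\mathbb{R}^{2n_A}$ (resp. $\mathbb{R}^{2n_B}$). Let $\hat\rho$ be a density operator on $L^2(\mathbb{R}^n)$ with Wigner distribution $\rho$. Assume that the trace formula
\[ \operatorname{Tr}(\hat\rho)=\int_{\mathbb{R}^{2n}}\rho(z)\,dz=1 \]
holds (which is the case if, for example, $\rho\in\Gamma^m_\rho(\mathbb{R}^{2n})$ with $m<-2n$, see Proposition 157, or if $\hat\rho$ is a Feichtinger state). We then give the following definition:
Definition 208. The reduced density operator $\hat\rho^A$ is
\[ \hat\rho^A=(2\pi\hbar)^{n_A}\operatorname{Op}_W(\rho_A) \tag{16.4} \]
where $\rho_A$ is the function on $\mathbb{R}^{2n_A}$ defined by
\[ \rho_A(z_A)=\int_{\mathbb{R}^{2n_B}}\rho(z_A,z_B)\,dz_B. \tag{16.5} \]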
Here is an existence result:
Proposition 209. Assume that $\rho\in\Gamma^m_\rho(\mathbb{R}^{2n})$ with $m<-2n$. Then $\rho_A\in\Gamma^{m_A}_\rho(\mathbb{R}^{2n_A})$ for every $m_A<-2n_A$. The operator $\hat\rho^A$ is a density operator on $L^2(\mathbb{R}^{n_A})$ with Wigner distribution $\rho_A$.
Proof. The integral (16.5) is convergent in view of the trivial inequality $(1+|z|^2)^m\le(1+|z_B|^2)^m$. Choosing $m_A<-2n_A$ and $m_B<-2n_B$ such that $m=m_A+m_B$, we have $\langle z\rangle^{m-|\alpha|}\le\langle z_A\rangle^{m_A-|\alpha|}\langle z_B\rangle^{m_B}$, as follows from the inequality
\[ (1+|z|^2)^{m-|\alpha|}\le(1+|z_A|^2)^{m_A-|\alpha|}(1+|z_B|^2)^{m_B}. \]
Using the Shubin estimates (5.15), we thus have
\[ \bigl|\partial^\alpha_{z_A}\rho_A(z_A)\bigr|=\Bigl|\int_{\mathbb{R}^{2n_B}}\partial^\alpha_{z_A}\rho(z_A,z_B)\,dz_B\Bigr|\le C_\alpha\langle z_A\rangle^{m_A-|\alpha|}\int_{\mathbb{R}^{2n_B}}\langle z_B\rangle^{m_B}\,dz_B, \]
and hence $\rho_A\in\Gamma^{m_A}(\mathbb{R}^{2n_A})$, since the integral over $\mathbb{R}^{2n_B}$ is convergent in view of the inequality $m_B<-2n_B$. It follows from Proposition 157 that $\hat\rho^A$ is a trace class operator whose trace is
\[ \operatorname{Tr}_A(\hat\rho^A)=\int_{\mathbb{R}^{2n_A}}\rho_A(z_A)\,dz_A=1. \tag{16.6} \]
There remains to be shown that $\hat\rho^A\ge0$. In view of Proposition 165, it is sufficient to prove that the Fourier transform $(\rho_A)_\diamond$ is continuous and satisfies $\Lambda^A_{(N)}\ge0$ for every integer $N>0$, where $\Lambda^A_{(N)}=(\Lambda^A_{jk})_{j,k}$ with
\[ \Lambda^A_{jk}=e^{-\frac{i\hbar}{2}\sigma_A(z_{A,j},z_{A,k})}(\rho_A)_\diamond(z_{A,j}-z_{A,k}), \]
the vectors $z_{A,j}$ and $z_{A,k}$ of $\mathbb{R}^{2n_A}$ being arbitrary. The continuity of $(\rho_A)_\diamond$ being obvious (Riemann–Lebesgue lemma), all we have to do is show that $\Lambda^A_{(N)}\ge0$. We first observe that, by Fubini’s theorem, $(\rho_A)_\diamond(z_A)=\rho_\diamond(z_A\oplus0)$, and hence
\[ \Lambda^A_{jk}=e^{-\frac{i\hbar}{2}\sigma(z_{A,j}\oplus0,\,z_{A,k}\oplus0)}\rho_\diamond\bigl((z_{A,j}\oplus0)-(z_{A,k}\oplus0)\bigr); \]
the matrix $\Lambda^A_{(N)}$ is thus the matrix $\Lambda_{(N)}$ corresponding to the particular choices $z_j=z_{A,j}\oplus0$ and $z_k=z_{A,k}\oplus0$. Since $\rho_\diamond$ satisfies the conditions in Proposition 165, we must have $\Lambda^A_{(N)}\ge0$, hence $(\rho_A)_\diamond$ also satisfies these conditions.
From now on, we will write the covariance matrix $\Sigma$ in the AB-ordering as
\[ \Sigma=\begin{pmatrix}\Sigma_{AA}&\Sigma_{AB}\\ \Sigma_{BA}&\Sigma_{BB}\end{pmatrix}\quad\text{with }\Sigma_{BA}=\Sigma_{AB}^T, \tag{16.7} \]
the blocks $\Sigma_{AA}$, $\Sigma_{AB}$, $\Sigma_{BA}$, $\Sigma_{BB}$ having dimensions $2n_A\times2n_A$, $2n_A\times2n_B$, $2n_B\times2n_A$, $2n_B\times2n_B$, respectively. In this notation, the quantum condition on the covariance matrix reads
\[ \Sigma+\frac{i\hbar}{2}J_{AB}\ge0 \tag{16.8} \]
where we have set
\[ J_{AB}=J_A\oplus J_B=\begin{pmatrix}J_A&0_{2n_A\times2n_B}\\ 0_{2n_B\times2n_A}&J_B\end{pmatrix}, \]
$J_A$ and $J_B$ being the standard symplectic matrices of $\mathbb{R}^{2n_A}$ and $\mathbb{R}^{2n_B}$, respectively. The covariance matrices $\Sigma_A$ and $\Sigma_B$ of the reduced density operators are, respectively, the blocks $\Sigma_{AA}$ and $\Sigma_{BB}$ of $\Sigma$, as immediately follows from the definitions, using the formulas
\[ \rho_A(z_A)=\int_{\mathbb{R}^{2n_B}}\rho(z_A,z_B)\,dz_B,\qquad\rho_B(z_B)=\int_{\mathbb{R}^{2n_A}}\rho(z_A,z_B)\,dz_A. \]
These reduced covariance matrices satisfy the quantum conditions
\[ \Sigma_A+\frac{i\hbar}{2}J_A\ge0\quad\text{and}\quad\Sigma_B+\frac{i\hbar}{2}J_B\ge0, \tag{16.9} \]
and the covariance ellipsoids of $\hat\rho^A$ and $\hat\rho^B$ are, accordingly,
\[ \Omega_A=\{z_A:\tfrac12\Sigma_{AA}^{-1}z_A\cdot z_A\le1\},\qquad\Omega_B=\{z_B:\tfrac12\Sigma_{BB}^{-1}z_B\cdot z_B\le1\}. \tag{16.10} \]
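That the quantum conditions (16.9) hold follows from the fact that $\hat\rho^A$ and $\hat\rho^B$ are bona fide density operators; this can, however, also be seen directly by noting that the quantum condition (16.8) can be written
\[ \begin{pmatrix}\Sigma_{AA}+\frac{i\hbar}{2}J_A&\Sigma_{AB}\\ \Sigma_{BA}&\Sigma_{BB}+\frac{i\hbar}{2}J_B\end{pmatrix}\ge0, \]
and that the diagonal blocks of a positive semidefinite matrix are themselves positive semidefinite.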
16.2 Separability and the PPT condition
16.2.1 Separable and entangled quantum states
We begin by introducing some notation. We denote by $I_A$ the identity $(x_A,p_A)\mapsto(x_A,p_A)$ and by $\overline{I}_B$ the involution $(x_B,p_B)\mapsto(x_B,-p_B)$ (“partial reflection”). We set $\overline{I}_{AB}=I_A\oplus\overline{I}_B$ and $J_{AB}=J_A\oplus J_B$, where $J_A$ (resp. $J_B$) is the standard symplectic matrix in $\mathbb{R}^{2n_A}$ (resp. $\mathbb{R}^{2n_B}$).
Definition 210. The density operator $\hat\rho$ on $L^2(\mathbb{R}^n)$ is said to be AB-separable (or simply separable) if there exist sequences $(\hat\rho^A_j)_j$ and $(\hat\rho^B_j)_j$ of density operators on $L^2(\mathbb{R}^{n_A})$ and $L^2(\mathbb{R}^{n_B})$, respectively, and real numbers $\alpha_j\ge0$ with $\sum_j\alpha_j=1$, such that
\[ \hat\rho=\sum_j\alpha_j\,\hat\rho^A_j\otimes\hat\rho^B_j, \tag{16.11} \]
the series being convergent in $\mathcal{L}_1(L^2(\mathbb{R}^n))$. If $\hat\rho$ is not separable, it is said to be entangled.
It follows that the Wigner distribution of a separable state (16.11) is of the type
\[ \rho(z)=\sum_j\alpha_j\,\rho^A_j(z_A)\,\rho^B_j(z_B). \]
The property of separability is not preserved by conjugation by arbitrary unitary operators. However:
Proposition 211. Let $S_A\in\operatorname{Sp}(n_A)$ and $S_B\in\operatorname{Sp}(n_B)$, and set $S=S_A\oplus S_B$; we have $S\in\operatorname{Sp}(n)$. Let $\hat S=\hat S_A\otimes\hat S_B\in\operatorname{Mp}(n)$ cover $S$. If $\hat\rho$ is a separable density operator, then so is $\hat S\hat\rho\hat S^{-1}$.
Proof. This immediately follows from the symplectic covariance of the Weyl transform since
\[ \hat S(\hat\rho^A_j\otimes\hat\rho^B_j)\hat S^{-1}=(\hat S_A\hat\rho^A_j\hat S_A^{-1})\otimes(\hat S_B\hat\rho^B_j\hat S_B^{-1}). \]
The Wigner distribution of $\hat S\hat\rho\hat S^{-1}$ is
\[ \rho\circ S^{-1}=\sum_j\alpha_j\,(\rho^A_j\circ S_A^{-1})\otimes(\rho^B_j\circ S_B^{-1}). \]
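16.2.2 The PPT theorem
We are going to prove a necessary condition for AB-separability. It makes use of the trivial equality
\[ W\psi(\overline{I}z)=W\overline{\psi}(z), \tag{16.12} \]
valid for all $\psi\in L^2(\mathbb{R}^n)$, where $\overline{I}$ is the reflection $(x,p)\mapsto(x,-p)$. In other words, replacing $W\psi(x,p)$ with $W\psi(x,-p)$ amounts to replacing $\psi$ with its complex conjugate $\overline{\psi}$. Recall from Proposition 67 in Chapter 5 that the transpose $\hat A^T$ of a Weyl operator $\hat A=\operatorname{Op}_W(a)$ is the Weyl operator $\operatorname{Op}_W(b)$ where $b(x,p)=a(x,-p)$.
Proposition 212 (PPT). Let $\hat\rho$ be a density operator on $L^2(\mathbb{R}^n)$. Suppose that the AB-separability condition (16.11) holds. Then the partially transposed operator
\[ \hat\rho^{T_B}=\sum_j\alpha_j\,\hat\rho^A_j\otimes(\hat\rho^B_j)^T, \tag{16.13} \]
that is,
\[ \hat\rho^{T_B}=(2\pi\hbar)^n\operatorname{Op}_W(\rho\circ\overline{I}_{AB}), \tag{16.14} \]
is also a density operator.
Proof. The transpose of $\hat\rho^B_j$ is explicitly given by
\[ (\hat\rho^B_j)^T=(2\pi\hbar)^{n_B}\operatorname{Op}_W(\rho^B_j\circ\overline{I}_B), \]
hence the equivalence of (16.13) and (16.14). Suppose that the separability condition (16.11) holds; writing $\lambda_j$ for the weights $\alpha_j$ of (16.11), the Wigner distribution is then $\rho=\sum_j\lambda_j\,\rho^A_j\otimes\rho^B_j$ with
\[ \rho^A_j=\sum_\ell\alpha_{j,\ell}\,W_A\psi^A_{j,\ell},\qquad\rho^B_j=\sum_m\beta_{j,m}\,W_B\psi^B_{j,m} \]
with $(\psi^A_{j,\ell},\psi^B_{j,m})\in L^2(\mathbb{R}^{n_A})\times L^2(\mathbb{R}^{n_B})$ and $\alpha_{j,\ell},\beta_{j,m}\ge0$; that is,
\[ \rho=\sum_{j,\ell,m}\gamma_{j,\ell,m}\,W_A\psi^A_{j,\ell}\otimes W_B\psi^B_{j,m} \]
where $\gamma_{j,\ell,m}=\lambda_j\alpha_{j,\ell}\beta_{j,m}\ge0$. We have
\[ \rho(\overline{I}_{AB}z)=\sum_j\lambda_j\,\rho^A_j(z_A)\,\rho^B_j(\overline{I}_Bz_B), \]
and thus, by (16.12),
\[ \rho\circ\overline{I}_{AB}=\sum_{j,\ell,m}\gamma_{j,\ell,m}\,W\bigl(\psi^A_{j,\ell}\otimes\overline{\psi^B_{j,m}}\bigr), \]
hence $\operatorname{Op}_W(\rho\circ\overline{I}_{AB})$ is also a positive semidefinite trace class operator. That we have $\operatorname{Tr}(\hat\rho^{T_B})=\operatorname{Tr}(\hat\rho)=1$ is obvious.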
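The denomination PPT for the stated condition is widely used in physics; it is an acronym for positive partial transpose. Proposition 212 has the following consequence. We set $\overline{J}_{AB}=J_A\oplus(-J_B)=\overline{I}_{AB}J_{AB}\overline{I}_{AB}$ (hence $\overline{J}_{AB}$ is the standard symplectic matrix of the symplectic space $(\mathbb{R}^{2n_A}\oplus\mathbb{R}^{2n_B},\sigma_A\oplus(-\sigma_B))$). Writing the covariance matrix in block matrix form
\[ \Sigma=\begin{pmatrix}\Sigma_{AA}&\Sigma_{AB}\\ \Sigma_{BA}&\Sigma_{BB}\end{pmatrix}, \tag{16.15} \]
we have
Corollary 213. Let $\hat\rho=(2\pi\hbar)^n\operatorname{Op}_W(\rho)$ be a separable density operator. Then, in addition to the total quantum condition (16.8), we have
\[ \Sigma+\frac{i\hbar}{2}\overline{J}_{AB}\ge0; \tag{16.16} \]
equivalently,
\[ \overline{\Sigma}+\frac{i\hbar}{2}J_{AB}\ge0 \tag{16.17} \]
where $\overline{\Sigma}=\overline{I}_{AB}\Sigma\overline{I}_{AB}$, that is,
\[ \overline{\Sigma}=\begin{pmatrix}\Sigma_{AA}&\Sigma_{AB}\overline{I}_B\\ \overline{I}_B\Sigma_{BA}&\overline{I}_B\Sigma_{BB}\overline{I}_B\end{pmatrix}. \tag{16.18} \]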
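Before turning to the proof, condition (16.17) can be tried out numerically on a small example. The sketch below is an illustration under stated assumptions, not part of the text: it takes $n_A=n_B=1$, $\hbar=1$, the ordering $(x_A,p_A,x_B,p_B)$, and the covariance matrix of a two-mode squeezed vacuum with squeezing parameter $r$ (a standard example from quantum optics). Positive semidefiniteness of a Hermitian matrix is tested through the classical criterion that all principal minors be nonnegative; all function names are hypothetical.

```python
import math
from itertools import combinations

def det(m):
    """Determinant of a square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [[complex(v) for v in row] for row in m]
    n = len(a)
    d = 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-14:
            return 0j                      # numerically singular
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def is_psd(h, tol=1e-9):
    """A Hermitian matrix is >= 0 iff all of its principal minors are >= 0."""
    n = len(h)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            minor = det([[h[i][j] for j in idx] for i in idx])
            if minor.real < -tol:
                return False
    return True

# J_AB = J_A (+) J_B in the ordering (x_A, p_A, x_B, p_B).
J_AB = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]

def tmsv_covariance(r, hbar=1.0):
    """Assumed example: covariance matrix of a two-mode squeezed vacuum."""
    c, s, h = math.cosh(2 * r), math.sinh(2 * r), hbar / 2
    return [[h * c, 0, h * s, 0],
            [0, h * c, 0, -h * s],
            [h * s, 0, h * c, 0],
            [0, -h * s, 0, h * c]]

def partial_reflection(sigma):
    """Sigma-bar = I_AB-bar Sigma I_AB-bar, cf. (16.18): flip the sign of p_B."""
    d = [1, 1, 1, -1]
    return [[d[i] * sigma[i][j] * d[j] for j in range(4)] for i in range(4)]

def quantum_form(sigma, hbar=1.0):
    """The Hermitian matrix Sigma + (i hbar / 2) J_AB."""
    return [[sigma[i][j] + 0.5j * hbar * J_AB[i][j] for j in range(4)]
            for i in range(4)]

sigma = tmsv_covariance(0.5)
is_state = is_psd(quantum_form(sigma))                       # condition (16.8)
passes_ppt = is_psd(quantum_form(partial_reflection(sigma)))  # condition (16.17)
```

For $r=0.5$ the quantum condition (16.8) holds while (16.17) fails, so by Corollary 213 this state cannot be separable; for $r=0$ both conditions hold.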
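Proof. Replacing $\rho$ with $\rho\circ\overline{I}_{AB}$, the block $\Sigma_{AA}$ in (16.7) remains unchanged, while $\Sigma_{BB}$, $\Sigma_{AB}$ and $\Sigma_{BA}$ become $\overline{I}_B\Sigma_{BB}\overline{I}_B$, $\Sigma_{AB}\overline{I}_B$ and $\overline{I}_B\Sigma_{BA}$, respectively. The covariance matrix (16.7) thus becomes $\overline{\Sigma}=\overline{I}_{AB}\Sigma\overline{I}_{AB}$. In view of Proposition 212, the operator $(2\pi\hbar)^n\operatorname{Op}_W(\rho\circ\overline{I}_{AB})$ is also positive semidefinite, hence we must have $\overline{\Sigma}+\frac{i\hbar}{2}J_{AB}\ge0$, which is equivalent to $\Sigma+\frac{i\hbar}{2}\overline{I}_{AB}J_{AB}\overline{I}_{AB}\ge0$. Since $\overline{I}_{AB}J_{AB}\overline{I}_{AB}=\overline{J}_{AB}$, this is the same thing as (16.16).
16.2.3 The Schur complement
Let $M$ be a $2n\times2n$ matrix, partitioned as
\[ M=\begin{pmatrix}M_{AA}&M_{AB}\\ M_{BA}&M_{BB}\end{pmatrix} \tag{16.19} \]
where the blocks $M_{AA}$, $M_{AB}$, $M_{BA}$, $M_{BB}$ have dimensions $2n_A\times2n_A$, $2n_A\times2n_B$, $2n_B\times2n_A$, $2n_B\times2n_B$, respectively.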
Assume that $M$ is symmetric and positive definite, which we write $M>0$. The symmetric matrices
\[ M/M_{BB}=M_{AA}-M_{AB}M_{BB}^{-1}M_{BA} \tag{16.20} \]
\[ M/M_{AA}=M_{BB}-M_{BA}M_{AA}^{-1}M_{AB} \tag{16.21} \]
are called the Schur complements of the blocks $M_{BB}$ and $M_{AA}$, respectively. The positivity of $M$ implies that $M_{AA}$ and $M_{BB}$ are themselves $>0$ and hence invertible. Using the obvious factorization
\[ M=\begin{pmatrix}I_A&M_{AB}M_{BB}^{-1}\\ 0&I_B\end{pmatrix}\begin{pmatrix}M/M_{BB}&0\\ 0&M_{BB}\end{pmatrix}\begin{pmatrix}I_A&0\\ M_{BB}^{-1}M_{BA}&I_B\end{pmatrix}, \tag{16.22} \]
we readily obtain various formulas for the inverse of $M$; the one we will use here is
\[ M^{-1}=\begin{pmatrix}(M/M_{BB})^{-1}&-(M/M_{BB})^{-1}M_{AB}M_{BB}^{-1}\\ -M_{BB}^{-1}M_{BA}(M/M_{BB})^{-1}&(M/M_{AA})^{-1}\end{pmatrix}. \tag{16.23} \]
Also note that it immediately follows from (16.22) that
\[ \det M=\det(M/M_{BB})\det M_{BB} \tag{16.24} \]
and
\[ \det M=\det(M/M_{AA})\det M_{AA}. \tag{16.25} \]
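The formulas (16.20) to (16.25) are easy to check numerically. The following sketch does so for one assumed $4\times4$ symmetric positive definite matrix with $2\times2$ blocks (that is, $n_A=n_B=1$); the matrix and the helper names are illustrative choices, not taken from the text.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def neg(a):
    return [[-x for x in r] for r in a]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def det(m):
    """Determinant of a real square matrix by Gaussian elimination."""
    a = [list(map(float, row)) for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-14:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def assemble(tl, tr, bl, br):
    """Build a block matrix from its four blocks."""
    return [r + s for r, s in zip(tl, tr)] + [r + s for r, s in zip(bl, br)]

# Assumed example: symmetric and strictly diagonally dominant, hence M > 0.
M = [[4.0, 1.0, 1.0, 0.0],
     [1.0, 3.0, 0.0, 1.0],
     [1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 2.0]]
M_AA = [r[:2] for r in M[:2]]
M_AB = [r[2:] for r in M[:2]]
M_BA = [r[:2] for r in M[2:]]
M_BB = [r[2:] for r in M[2:]]

schur_BB = sub(M_AA, matmul(matmul(M_AB, inv2(M_BB)), M_BA))   # (16.20)
schur_AA = sub(M_BB, matmul(matmul(M_BA, inv2(M_AA)), M_AB))   # (16.21)

# Candidate inverse built block by block from (16.23).
M_inv = assemble(
    inv2(schur_BB),
    neg(matmul(matmul(inv2(schur_BB), M_AB), inv2(M_BB))),
    neg(matmul(matmul(inv2(M_BB), M_BA), inv2(schur_BB))),
    inv2(schur_AA))
product = matmul(M, M_inv)   # should be the 4 x 4 identity
```

Multiplying `M` by the block matrix assembled from (16.23) returns the identity up to rounding, and both determinant identities (16.24) and (16.25) give the same value as a direct elimination on `M`.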
16.2.4 Orthogonal projections of the covariance ellipsoid
We are going to show (Proposition 215) that the orthogonal projections of the full covariance ellipsoid $\Omega$ on $\mathbb{R}^{2n_A}$ and $\mathbb{R}^{2n_B}$ are the covariance ellipsoids of the reduced density operators. The following is a restatement of Lemma 202 in our present notation:
Lemma 214. Let $\Pi_A$ (resp. $\Pi_B$) be the orthogonal projection $\mathbb{R}^{2n}\to\mathbb{R}^{2n_A}$ (resp. $\mathbb{R}^{2n}\to\mathbb{R}^{2n_B}$) and $\Omega_R$ the phase space ellipsoid $\Omega_R=\{z\in\mathbb{R}^{2n}:Mz^2\le R^2\}$ ($R>0$). We have
\[ \Pi_A\Omega_R=\{z_A\in\mathbb{R}^{2n_A}:(M/M_{BB})z_A^2\le R^2\} \tag{16.26} \]
\[ \Pi_B\Omega_R=\{z_B\in\mathbb{R}^{2n_B}:(M/M_{AA})z_B^2\le R^2\}. \tag{16.27} \]
It follows from Lemma 214 that the orthogonal projections on ℝ2nA and ℝ2nB of the covariance ellipsoid Ω of ρ̂ are just the covariance ellipsoids of the reduced operators ρ̂A and ρ̂B :
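Proposition 215. The covariance ellipsoids $\Omega_A$ and $\Omega_B$ of the reduced quantum states $\hat\rho^A$ and $\hat\rho^B$ are the orthogonal projections on $\mathbb{R}^{2n_A}$ and $\mathbb{R}^{2n_B}$ of the covariance ellipsoid $\Omega$ of $\hat\rho$:
\[ \Omega_A=\Pi_A\Omega=\{z_A\in\mathbb{R}^{2n_A}:\tfrac12\Sigma_{AA}^{-1}z_A\cdot z_A\le1\} \tag{16.28} \]
\[ \Omega_B=\Pi_B\Omega=\{z_B\in\mathbb{R}^{2n_B}:\tfrac12\Sigma_{BB}^{-1}z_B\cdot z_B\le1\}. \tag{16.29} \]
Proof. Let $M=\frac{\hbar}{2}\Sigma^{-1}$; the covariance ellipsoid is then defined by the inequality $Mz\cdot z\le\hbar$, so that its projections $\Pi_A\Omega$ and $\Pi_B\Omega$ are defined by
\[ (M/M_{BB})z_A\cdot z_A\le\hbar,\qquad(M/M_{AA})z_B\cdot z_B\le\hbar \]
in view of Lemma 214 with $R=\sqrt{\hbar}$. Writing $M$ in block-matrix form (16.19), its inverse has the form
\[ M^{-1}=\begin{pmatrix}(M/M_{BB})^{-1}&*\\ *&(M/M_{AA})^{-1}\end{pmatrix} \tag{16.30} \]
(formula (16.23)). Since $M^{-1}=\frac{2}{\hbar}\Sigma$, comparing the diagonal blocks gives
\[ (M/M_{BB})^{-1}=\frac{2}{\hbar}\Sigma_{AA}\quad\text{and}\quad(M/M_{AA})^{-1}=\frac{2}{\hbar}\Sigma_{BB}, \]
that is, $M/M_{BB}=\frac{\hbar}{2}\Sigma_{AA}^{-1}$ and $M/M_{AA}=\frac{\hbar}{2}\Sigma_{BB}^{-1}$. Formulas (16.28) and (16.29) follow.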
16.3 Werner and Wolf’s condition
16.3.1 Statement and equivalent formulation
The following crucial necessary condition for AB-separability was proven by Werner and Wolf:
Proposition 216 (Werner and Wolf). Suppose that the density operator $\hat\rho$ with covariance matrix $\Sigma$ is AB-separable. Then there exist two real symmetric matrices $\Sigma_A$ and $\Sigma_B$ of dimensions $2n_A\times2n_A$ and $2n_B\times2n_B$, respectively, satisfying the quantum conditions
\[ \Sigma_A+\frac{i\hbar}{2}J_A\ge0\quad\text{and}\quad\Sigma_B+\frac{i\hbar}{2}J_B\ge0 \tag{16.31} \]
and such that
\[ \Sigma\ge\Sigma_A\oplus\Sigma_B. \tag{16.32} \]
We omit the proof, which is quite technical, and instead show that Werner and Wolf’s result can be considerably refined using the properties of the symplectic group. We recall that the quantum condition $\Sigma+\frac{i\hbar}{2}J\ge0$ on a covariance matrix is equivalent to the following property: the covariance ellipsoid $\Omega$ contains a quantum blob (Theorem 201).
Proposition 217. The Werner–Wolf condition (16.32) is equivalent to the existence of two positive definite symplectic matrices
\[ P_A=(S_A^TS_A)^{-1}\in\operatorname{Sp}(n_A),\qquad P_B=(S_B^TS_B)^{-1}\in\operatorname{Sp}(n_B) \tag{16.33} \]
($S_A\in\operatorname{Sp}(n_A)$ and $S_B\in\operatorname{Sp}(n_B)$) such that
\[ \Sigma\ge\frac{\hbar}{2}(P_A\oplus P_B). \tag{16.34} \]
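Equivalently, the covariance ellipsoid $\Omega$ contains a quantum blob $(S_A\oplus S_B)(B^{2n}(\sqrt{\hbar}))$.
Proof. The sufficiency of the condition is clear since $\Sigma_A=\frac{\hbar}{2}P_A$ and $\Sigma_B=\frac{\hbar}{2}P_B$ satisfy the conditions (16.31). Assume conversely that $\Sigma\ge\Sigma_A\oplus\Sigma_B$. In view of Williamson’s diagonalization theorem, there exist $S_A\in\operatorname{Sp}(n_A)$ and $S_B\in\operatorname{Sp}(n_B)$ such that $S_A\Sigma_AS_A^T=D_A$ and $S_B\Sigma_BS_B^T=D_B$ where
\[ D_A=\begin{pmatrix}\Lambda_A&0\\ 0&\Lambda_A\end{pmatrix},\qquad D_B=\begin{pmatrix}\Lambda_B&0\\ 0&\Lambda_B\end{pmatrix}, \]
$\Lambda_A$ and $\Lambda_B$ being the diagonal matrices consisting of the symplectic eigenvalues $\lambda^A_1,\dots,\lambda^A_{n_A}$ of $\Sigma_A$ and $\lambda^B_1,\dots,\lambda^B_{n_B}$ of $\Sigma_B$. Since $S_AJ_AS_A^T=J_A$ and $S_BJ_BS_B^T=J_B$, the conditions $\Sigma_A+\frac{i\hbar}{2}J_A\ge0$ and $\Sigma_B+\frac{i\hbar}{2}J_B\ge0$ are equivalent to $D_A+\frac{i\hbar}{2}J_A\ge0$ and $D_B+\frac{i\hbar}{2}J_B\ge0$. These conditions imply in turn that $D_A\ge\frac{\hbar}{2}I_A$ and $D_B\ge\frac{\hbar}{2}I_B$: the characteristic equation of $D_A+\frac{i\hbar}{2}J_A$ is
\[ \det\Bigl((\Lambda_A-\lambda I)^2-\tfrac14\hbar^2I\Bigr)=0 \]
(with $I$ the $n_A\times n_A$ identity), and, writing $\Lambda_A=\operatorname{diag}(\lambda^A_1,\dots,\lambda^A_{n_A})$, this equation is equivalent to the set of equations
\[ (\lambda^A_j-\lambda)^2-\tfrac14\hbar^2=0,\qquad1\le j\le n_A, \]
whose solutions are the real numbers $\lambda=\lambda^A_j\pm\frac{\hbar}{2}$. Since these eigenvalues must be $\ge0$, we must have $\lambda^A_j\ge\frac{\hbar}{2}$, and hence $D_A\ge\frac{\hbar}{2}I_A$. Similarly, $D_B\ge\frac{\hbar}{2}I_B$, so we must have the inequalities
\[ \Sigma_A=S_A^{-1}D_A(S_A^T)^{-1}\ge\frac{\hbar}{2}(S_A^TS_A)^{-1},\qquad\Sigma_B=S_B^{-1}D_B(S_B^T)^{-1}\ge\frac{\hbar}{2}(S_B^TS_B)^{-1}. \]
Setting $P_A=(S_A^TS_A)^{-1}$ and $P_B=(S_B^TS_B)^{-1}$, the inequality (16.34) follows.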
16.3.2 A geometric consequence
The Werner and Wolf conditions (16.31) mean that $\Sigma_A$ and $\Sigma_B$ are quantum covariance matrices, hence the sum
\[ \Sigma_A\oplus\Sigma_B\equiv\begin{pmatrix}\Sigma_A&0\\ 0&\Sigma_B\end{pmatrix} \]
is a quantum covariance matrix in its own right. It follows from the condition (16.32) that the corresponding covariance ellipsoid, which we denote as
\[ \Omega_{A\oplus B}=\{z_A\oplus z_B:\tfrac12\Sigma_A^{-1}z_A^2+\tfrac12\Sigma_B^{-1}z_B^2\le1\}, \tag{16.35} \]
is included in $\Omega$. Moreover, the ellipsoid $\Omega_{A\oplus B}$ always contains a quantum blob of the form
\[ \Omega_{AB}=(S_A\oplus S_B)(B^{2n}(\sqrt{\hbar})) \tag{16.36} \]
\[ \phantom{\Omega_{AB}}=\{z_A\oplus z_B:|S_A^{-1}z_A|^2+|S_B^{-1}z_B|^2\le\hbar\}. \tag{16.37} \]
Hence, if the density operator $\hat\rho$ with covariance ellipsoid $\Omega$ is separable, then there exist quantum covariance ellipsoids of the form (16.35) and (16.36) such that the following inclusions hold:
\[ \Omega\supset\Omega_{A\oplus B}\supset\Omega_{AB}. \tag{16.38} \]
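This result has interesting consequences for the covariance ellipsoids
\[ \Omega_A=\{z_A:\tfrac12\Sigma_{AA}^{-1}z_A^2\le1\}\quad\text{and}\quad\Omega_B=\{z_B:\tfrac12\Sigma_{BB}^{-1}z_B^2\le1\} \]
of the reduced density operators $\hat\rho^A$ and $\hat\rho^B$. We first show that:
Proposition 218. The orthogonal projections $\Pi_A\Omega_{AB}$ and $\Pi_B\Omega_{AB}$ of $\Omega_{AB}$ onto $\mathbb{R}^{2n_A}$ and $\mathbb{R}^{2n_B}$ satisfy
\[ \Pi_A\Omega_{AB}=S_A(B^{2n_A}(\sqrt{\hbar})),\qquad\Pi_B\Omega_{AB}=S_B(B^{2n_B}(\sqrt{\hbar})). \tag{16.39} \]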
Proof. This result is easily proved directly from the definition of $\Omega_{AB}$.
Finally, from (16.38), we easily conclude that the covariance ellipsoids $\Omega_A$ and $\Omega_B$ impose the following constraints on the symplectic matrices $S_A$ and $S_B$ of Proposition 217:
Corollary 219. Assume that the density operator $\hat\rho$ with covariance ellipsoid $\Omega$ is separable. Then the symplectic matrices $S_A$ and $S_B$ of Proposition 217 satisfy
\[ S_A(B^{2n_A}(\sqrt{\hbar}))\subset\Omega_A,\qquad S_B(B^{2n_B}(\sqrt{\hbar}))\subset\Omega_B. \tag{16.40} \]
Proof. From (16.38), we have $\Omega_{AB}\subset\Omega$, and so, by Propositions 215 and 218,
\[ \Pi_A\Omega_{AB}\subset\Pi_A\Omega=\Omega_A,\qquad\Pi_B\Omega_{AB}\subset\Pi_B\Omega=\Omega_B. \]
16.4 Comments and references
The study of the notions of separability and entanglement can be traced back historically to the famous “EPR paper” [24] by Einstein, Podolsky, and Rosen, where the apparently paradoxical predictions of quantum mechanics about strongly correlated systems were first discussed. It was Schrödinger who coined the word entanglement (Verschränkung in German) in a letter to Einstein. There is an immense physical literature on entanglement and separability, still growing every day. For recent developments, see Lami et al. [52] and Serafini [65]. The PPT condition was first precisely stated (in physicists’ language) by the Horodecki family in [43, 44] and by Asher Peres [62]. The Werner and Wolf theorem was proven in [74] following earlier constructions by Reinhard Werner [73]. For useful formulas for the Schur complement, see Zhang [77]. Part of the last section contains material that is developed in a paper which is joint work [20] with Nuno Dias and Joao Prata. A numerical approach to the problem of entanglement can be found in the paper by Leinaas et al. [53]. We stress that the important PPT and Werner–Wolf theorems give necessary but not sufficient conditions; at the time of writing, a tractable sufficient condition is still lacking outside of a few particular cases (for instance, the Werner–Wolf condition is not only necessary but also sufficient for Gaussian states, as we will see in the next chapter).
17 Separability of Gaussian states Gaussian quantum states, i. e., states whose Wigner distribution is a Gaussian function, play a very privileged role in quantum mechanics. Not only are they somewhat easier to study analytically than more general states, but they are also omnipresent in all applications (quantum optics and computing, for instance). They have, in addition, very interesting properties; for instance, the quantum condition on the covariance matrix is sufficient to ensure “quantumness”, and so is the Werner–Wolf condition for separability.
17.1 The purity of a Gaussian state
Consider now, as we have done several times before in this book, a non-degenerate real Gaussian function on $\mathbb{R}^{2n}$ of the type
\[ \rho(z)=\frac{1}{(2\pi)^n\sqrt{\det\Sigma}}\,e^{-\frac12\Sigma^{-1}(z-\bar z)^2} \]
centered at $\bar z\in\mathbb{R}^{2n}$, where $\Sigma$ is a positive definite real symmetric $2n\times2n$ matrix (the “covariance matrix”). We will only consider the case $\bar z=0$; the general case is easily reduced to it by a phase-space translation. Hence we assume that
\[ \rho(z)=\frac{1}{(2\pi)^n\sqrt{\det\Sigma}}\,e^{-\frac12\Sigma^{-1}z^2}. \tag{17.1} \]
This function represents in all cases a probability distribution, since $\rho\ge0$ and $\rho$ is normalized:
\[ \int_{\mathbb{R}^{2n}}\rho(z)\,dz=1, \]
as is easily checked by diagonalizing $\Sigma$ and using the elementary formula
\[ \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-x^2/2\sigma^2}\,dx=1. \]
https://doi.org/10.1515/9783110722772-017
17.1.1 A general purity formula for Gaussian states
We know from Proposition 168 in Chapter 13 that the function $\rho$ defined by (17.1) is the Wigner distribution of a density operator if and only if the quantum condition
\[ \Sigma+\frac{i\hbar}{2}J\ge0 \]
is satisfied. Assuming that this condition is satisfied, we have:
Proposition 220. Let $\hat\rho$ be the density operator with the Gaussian Wigner distribution (17.1). The purity of $\hat\rho$ is
\[ \mu(\hat\rho)=\Bigl(\frac{\hbar}{2}\Bigr)^n\det(\Sigma^{-1/2}). \tag{17.2} \]
Proof. We have (formula (14.8))
\[ \operatorname{Tr}(\hat\rho^2)=(2\pi\hbar)^n\int_{\mathbb{R}^{2n}}\rho^2(z)\,dz. \]
Now, by definition of the Gaussian $\rho$,
\[ \int_{\mathbb{R}^{2n}}\rho^2(z)\,dz=\Bigl(\frac{1}{2\pi}\Bigr)^{2n}(\det\Sigma)^{-1}\int_{\mathbb{R}^{2n}}e^{-\Sigma^{-1}z^2}\,dz, \]
hence, using the elementary formula from the theory of Gaussian integrals,
\[ \int_{\mathbb{R}^{2n}}e^{-Mz^2}\,dz=\pi^n(\det M)^{-1/2}, \]
valid for every positive definite symmetric $2n\times2n$ matrix $M$, we get
\[ \int_{\mathbb{R}^{2n}}\rho^2(z)\,dz=\Bigl(\frac{1}{4\pi}\Bigr)^n(\det\Sigma)^{-1/2}; \]
formula (17.2) follows. Notice that, since $\operatorname{Tr}(\hat\rho^2)\le1$, we must have $\det\Sigma\ge(\frac12\hbar)^{2n}$, and $\mu(\hat\rho)=1$ if and only if $\det\Sigma=(\frac12\hbar)^{2n}$.
17.1.2 Pure Gaussian states
Let us return to the generalized Gaussians studied in Chapter 4. We slightly change the notation and write $\psi_{X,Y}$ instead of $\psi_M$ to avoid confusion. Let $X$ and $Y$ be real symmetric $n\times n$ matrices, with $X>0$. To these matrices, we associate the Gaussian function $\psi_{X,Y}$ on $\mathbb{R}^n$ defined by
\[ \psi_{X,Y}(x)=(\pi\hbar)^{-n/4}(\det X)^{1/4}\,e^{-\frac{1}{2\hbar}(X+iY)x^2} \tag{17.3} \]
where we are writing $(X+iY)x^2$ for $(X+iY)x\cdot x$. The function $\psi_{X,Y}$ is $L^2$-normalized, $\|\psi_{X,Y}\|_{L^2}=1$, and its Wigner transform is given by
\[ W\psi_{X,Y}(z)=(\pi\hbar)^{-n}\,e^{-\frac{1}{\hbar}S^TSz^2} \tag{17.4} \]
where
\[ S=\begin{pmatrix}X^{1/2}&0\\ X^{-1/2}Y&X^{-1/2}\end{pmatrix} \tag{17.5} \]
is symplectic, so that
\[ S^TS=\begin{pmatrix}X+YX^{-1}Y&YX^{-1}\\ X^{-1}Y&X^{-1}\end{pmatrix}. \tag{17.6} \]
Setting $\Sigma^{-1}=\frac{2}{\hbar}S^TS$, we can rewrite the Wigner transform (17.4) in the form
\[ W\psi_{X,Y}(z)=\frac{1}{(2\pi)^n\sqrt{\det\Sigma}}\,e^{-\frac12\Sigma^{-1}z^2}, \]
hence to $\rho_{X,Y}=W\psi_{X,Y}$ corresponds a Gaussian density operator $\hat\rho_{X,Y}$. The quantum condition on the covariance matrix becomes here $(S^TS)^{-1}+iJ\ge0$; since $SJS^T=J$, this is equivalent to $I+iJ\ge0$ (which is trivially satisfied).
Proposition 221. The only pure Gaussian states are those with Wigner distribution $W\psi_{X,Y}$.
Proof. Let $\Sigma=S^TDS$ be a Williamson diagonalization of $\Sigma$. We have $\det(\Sigma)=\det(J\Sigma)=\lambda_1^2\cdots\lambda_n^2$ where the $\lambda_j$ are the symplectic eigenvalues of $\Sigma$. Since we must have $\lambda_j\ge\frac12\hbar$ (Proposition 168), the equality $\det(\Sigma)=(\frac12\hbar)^{2n}$ requires that $\lambda_j=\frac12\hbar$ for all $j=1,2,\dots,n$. In this case, the matrix $D$ in the Williamson diagonalization $\Sigma=S^TDS$ is $\frac{\hbar}{2}$ times the identity, so that we must have $\Sigma=\frac{\hbar}{2}S^TS$, hence
\[ \rho(z)=\frac{1}{(2\pi)^n\sqrt{\det\Sigma}}\,e^{-\frac{1}{\hbar}(S^TS)^{-1}z^2}, \]
and the result follows from the previous discussion, replacing $S$ with $(S^T)^{-1}$.
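For a single degree of freedom the purity formula (17.2) and the matrices (17.5) and (17.6) can be checked directly. The sketch below assumes $n=1$, so that $X=x>0$ and $Y=y$ are real numbers, and $\hbar=1$ by default; the function names are illustrative only.

```python
import math

def s_matrix(x, y):
    """The matrix S of (17.5) for n = 1, with X = x > 0 and Y = y."""
    r = math.sqrt(x)
    return [[r, 0.0], [y / r, 1.0 / r]]

def gram(s):
    """S^T S for a 2 x 2 matrix S, cf. (17.6)."""
    return [[sum(s[k][i] * s[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def covariance(x, y, hbar=1.0):
    """Sigma = (hbar/2) (S^T S)^{-1}, i.e. Sigma^{-1} = (2/hbar) S^T S."""
    return [[hbar / 2 * v for v in row] for row in inv2(gram(s_matrix(x, y)))]

def purity(sigma, hbar=1.0):
    """Formula (17.2) for n = 1: mu = (hbar/2) / sqrt(det Sigma)."""
    return (hbar / 2) / math.sqrt(det2(sigma))

S = s_matrix(2.0, 0.7)
sigma_pure = covariance(2.0, 0.7)
sigma_mixed = [[2 * v for v in row] for row in sigma_pure]   # inflated covariance
```

With these choices `purity(sigma_pure)` equals 1 up to rounding, while doubling the covariance matrix halves the purity, in line with the remark following Proposition 220.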
17.2 Separability of Gaussian quantum states
17.2.1 Generalities, a sufficient condition for separability
Notice that Proposition 218 gives us an alternative proof of the fact that the partial trace operators $\hat\rho^A$ and $\hat\rho^B$ are density operators in the Gaussian case. In the general case, we had to use the KLM conditions (Proposition 165) to prove the positivity properties $\hat\rho^A\ge0$ and $\hat\rho^B\ge0$. In the Gaussian case, the equalities (16.39) imply that
\[ \Sigma_A+\frac{i\hbar}{2}J_A\ge0,\qquad\Sigma_B+\frac{i\hbar}{2}J_B\ge0, \tag{17.7} \]
hence $\hat\rho^A$ and $\hat\rho^B$ are (Gaussian) density operators. Recall that the purity of $\hat\rho$ is given by
\[ \mu(\hat\rho)=\Bigl(\frac{\hbar}{2}\Bigr)^n(\det\Sigma)^{-1/2}=\sqrt{\det M}, \tag{17.8} \]
where $M=\frac{\hbar}{2}\Sigma^{-1}$. That the terminology “covariance matrix” applied to $\Sigma$ is justified in the quantum case is clear, since it is justified in classical statistical mechanics; it is equally clear that $\rho\in\Gamma^m(\mathbb{R}^{2n})$ for every $m<-2n$, hence $\rho_A\in\Gamma^{m_A}(\mathbb{R}^{2n_A})$ for every $m_A<-2n_A$.
17.2.2 Reduced states of a Gaussian density operator
Let us describe in detail the reduced states of a Gaussian quantum state. We introduce the notation
\[ M=\frac{\hbar}{2}\Sigma^{-1}. \]
Notice that we have the equivalence
\[ \Sigma+\frac{i\hbar}{2}J\ge0\iff M^{-1}+iJ\ge0. \]
We will write $M$ in block matrix form
\[ M=\begin{pmatrix}M_{AA}&M_{AB}\\ M_{BA}&M_{BB}\end{pmatrix},\qquad M_{AB}^T=M_{BA}. \]
Recall that the Schur complements $M/M_{BB}$ and $M/M_{AA}$ are symmetric positive definite and given by
\[ M/M_{BB}=M_{AA}-M_{AB}M_{BB}^{-1}M_{BA} \tag{17.9} \]
\[ M/M_{AA}=M_{BB}-M_{BA}M_{AA}^{-1}M_{AB}. \tag{17.10} \]
Proposition 222. The reduced density operator $\hat\rho^A$ is a Gaussian state with Wigner distribution
\[ \rho_A(z_A)=(\pi\hbar)^{-n_A}(\det M/M_{BB})^{1/2}\,e^{-\frac{1}{\hbar}(M/M_{BB})z_A^2}, \tag{17.11} \]
and its covariance ellipsoid
\[ \Omega_A=\{z_A:(M/M_{BB})z_A^2\le\hbar\} \tag{17.12} \]
is the orthogonal projection $\Pi_A\Omega$ on $\mathbb{R}^{2n_A}$ of the covariance ellipsoid $\Omega$ of $\hat\rho$.
Proof. Writing $z=z_A\oplus z_B$, we have
\[ Mz^2=M_{AA}z_A^2+2M_{BA}z_A\cdot z_B+M_{BB}z_B^2, \]
so that
\[ \int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}Mz^2}\,dz_B=e^{-\frac{1}{\hbar}M_{AA}z_A^2}\int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}(M_{BB}z_B^2+2M_{BA}z_A\cdot z_B)}\,dz_B. \]
Setting $z_B=u_B-M_{BB}^{-1}M_{BA}z_A$, we have
\[ M_{BB}z_B^2+2M_{BA}z_A\cdot z_B=M_{BB}u_B^2-M_{AB}M_{BB}^{-1}M_{BA}z_A^2, \]
and hence, integrating with respect to the variables $z_B$,
\[ \int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}Mz^2}\,dz_B=e^{-\frac{1}{\hbar}(M_{AA}-M_{AB}M_{BB}^{-1}M_{BA})z_A^2}\int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}M_{BB}u_B^2}\,du_B. \]
Using the classical formula
\[ \int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}M_{BB}u_B^2}\,du_B=(\pi\hbar)^{n_B}(\det M_{BB})^{-1/2}, \]
we thus have
\[ \int_{\mathbb{R}^{2n_B}}e^{-\frac{1}{\hbar}Mz^2}\,dz_B=(\pi\hbar)^{n_B}(\det M_{BB})^{-1/2}\,e^{-\frac{1}{\hbar}(M/M_{BB})z_A^2} \]
where $M/M_{BB}$ is the Schur complement of $M_{BB}$ in $M$; since $\rho(z)=(\pi\hbar)^{-n}(\det M)^{1/2}e^{-\frac{1}{\hbar}Mz^2}$, the identity (17.11) now follows from formula (16.24). The covariance ellipsoid of the reduced state $\hat\rho^A$ is given by (17.12), and, in view of Lemma 214, it is indeed the orthogonal projection $\Pi_A\Omega$ of $\Omega$ on $\mathbb{R}^{2n_A}$.
Corollary 223. The purity of the reduced density operator $\hat\rho^A$ is
\[ \mu(\hat\rho^A)=(\det M/M_{BB})^{1/2}, \tag{17.13} \]
and $\hat\rho^A$ is a pure state if and only if $M/M_{BB}\in\operatorname{Sp}(n_A)$, in which case we have $\mu(\hat\rho)=(\det M_{BB})^{1/2}$.
Proof. In view of formulas (17.8) and (17.11), the purity of $\hat\rho^A$ is given by (17.13). It follows that $\mu(\hat\rho^A)=1$ if and only if $\det M/M_{BB}=1$; by the same token as used in the proof of Proposition 221, we must then have $M/M_{BB}\in\operatorname{Sp}(n_A)$. The equality $\mu(\hat\rho)=(\det M_{BB})^{1/2}$ follows from (17.8) and the identity (16.24).
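Corollary 223 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the text: it takes $n_A=n_B=1$, $\hbar=1$, the ordering $(x_A,p_A,x_B,p_B)$, and the covariance matrix of a two-mode squeezed vacuum with squeezing parameter $r$ (a standard example from quantum optics), for which the reduced purity is expected to equal $1/\cosh 2r$. The helper names are hypothetical.

```python
import math

def inverse(m):
    """Gauss-Jordan inverse of a real square matrix, with partial pivoting."""
    n = len(m)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [row[n:] for row in a]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def tmsv_covariance(r, hbar=1.0):
    """Assumed example: covariance matrix of a two-mode squeezed vacuum."""
    c, s, h = math.cosh(2 * r), math.sinh(2 * r), hbar / 2
    return [[h * c, 0, h * s, 0],
            [0, h * c, 0, -h * s],
            [h * s, 0, h * c, 0],
            [0, -h * s, 0, h * c]]

def reduced_purity(sigma, hbar=1.0):
    """Formula (17.13): sqrt(det(M/M_BB)) with M = (hbar/2) Sigma^{-1}."""
    M = [[hbar / 2 * v for v in row] for row in inverse(sigma)]
    M_AA = [r[:2] for r in M[:2]]
    M_AB = [r[2:] for r in M[:2]]
    M_BA = [r[:2] for r in M[2:]]
    M_BB = [r[2:] for r in M[2:]]
    corr = matmul(matmul(M_AB, inverse(M_BB)), M_BA)
    schur = [[M_AA[i][j] - corr[i][j] for j in range(2)] for i in range(2)]
    return math.sqrt(det2(schur))

mu_A = reduced_purity(tmsv_covariance(0.5))
```

For $r=0.5$ the computed value agrees with $1/\cosh(1)$, and for $r=0$ (a product of two vacuum states) the reduced state is pure.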
17.2.3 Back to the Werner and Wolf conditions
It turns out that the Werner and Wolf conditions in Proposition 216 are sufficient for a Gaussian state to be separable:
Proposition 224. Assume that there exist two partial covariance matrices $\Sigma_A$ and $\Sigma_B$ satisfying the quantum conditions (17.7) and such that
\[ \Sigma\ge\Sigma_A\oplus\Sigma_B. \tag{17.14} \]
Then the Gaussian state
\[ \rho(z)=\Bigl(\frac{1}{2\pi}\Bigr)^n\frac{1}{\sqrt{\det\Sigma}}\,e^{-\frac12\Sigma^{-1}z^2} \]
is separable.
The conditions (17.7) mean that $\Sigma_A$ and $\Sigma_B$ are quantum covariance matrices, hence the sum
\[ \Sigma_A\oplus\Sigma_B\equiv\begin{pmatrix}\Sigma_A&0\\ 0&\Sigma_B\end{pmatrix} \]
is a quantum covariance matrix in its own right. The corresponding covariance ellipsoid, which we denote
\[ \Omega_{A\oplus B}=\{z_A\oplus z_B:\tfrac12\Sigma_A^{-1}z_A^2+\tfrac12\Sigma_B^{-1}z_B^2\le1\}, \]
is included in $\Omega$, and its orthogonal projections on $\mathbb{R}^{2n_A}$ and $\mathbb{R}^{2n_B}$ are just the intersections of $\Omega_{A\oplus B}$ with the subspaces $z_B=0$ and $z_A=0$, respectively.
17.3 Disentanglement of Gaussian states
Since the action of the metaplectic group $\operatorname{Mp}(n)$ on the set of all centered Gaussians $\psi_{X,Y}$ is transitive, Proposition 221 can be rephrased by saying that every pure Gaussian state is obtained from the standard Gaussian $\phi_0(x)=(\pi\hbar)^{-n/4}e^{-|x|^2/2\hbar}$ by a unitary transformation $\hat S\in\operatorname{Mp}(n)$. We are in fact going to prove much more, namely that every Gaussian state (pure or mixed) can be “disentangled” using a metaplectic transform associated with a symplectic rotation.
17.3.1 A diagonalization result for positive symplectic matrices
A symplectic and positive definite matrix can be diagonalized using a symplectic rotation, that is, an element of $U(n)=\operatorname{Sp}(n)\cap O(2n)$.
Proposition 225. Let $S\in\operatorname{Sp}(n)$ be positive definite and symmetric. Let $\lambda_1\le\dots\le\lambda_n\le1$ be the $n$ smallest eigenvalues of $S$ and set
\[ \Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_n;1/\lambda_1,\dots,1/\lambda_n). \tag{17.15} \]
There exists $U\in U(n)$ such that $S=U^T\Lambda U$.
Proof. The eigenvalues of $S$ occur in pairs $(\lambda,1/\lambda)$ of positive numbers (Proposition 11); if $\lambda_1\le\dots\le\lambda_n$ are the $n$ first eigenvalues, then $1/\lambda_1,\dots,1/\lambda_n$ are the other $n$ eigenvalues. Let now $U$ be an orthogonal matrix such that $S=U^T\Lambda U$, with $\Lambda$ being given by (17.15). We claim that we can choose $U\in U(n)$. It suffices to show that we can write $U$ in the block-matrix form
\[ U=\begin{pmatrix}A&-B\\ B&A\end{pmatrix} \]
with the conditions
\[ AB^T=BA^T,\qquad AA^T+BB^T=I. \tag{17.16} \]
Let $e_1,\dots,e_n$ be $n$ orthonormal eigenvectors of $S$ corresponding to the eigenvalues $\lambda_1,\dots,\lambda_n$. Since $SJ=JS^{-1}$ (because $S$ is both symplectic and symmetric), we have, for $1\le k\le n$,
\[ SJe_k=JS^{-1}e_k=\frac{1}{\lambda_k}Je_k, \]
hence $\pm Je_1,\dots,\pm Je_n$ are orthonormal eigenvectors of $S$ corresponding to the remaining $n$ eigenvalues $1/\lambda_1,\dots,1/\lambda_n$. Viewing the $e_k$ as column vectors, we write the $2n\times n$ matrix $(e_1,\dots,e_n)$ as
\[ (e_1,\dots,e_n)=\begin{pmatrix}A\\ B\end{pmatrix} \]
where $A$ and $B$ are $n\times n$ matrices; we have
\[ (-Je_1,\dots,-Je_n)=-J\begin{pmatrix}A\\ B\end{pmatrix}=\begin{pmatrix}-B\\ A\end{pmatrix}, \]
hence $U$ is indeed of the type
\[ U=(e_1,\dots,e_n;-Je_1,\dots,-Je_n)=\begin{pmatrix}A&-B\\ B&A\end{pmatrix}. \]
The symplectic conditions (17.16) are automatically satisfied since $U^TU=UU^T=I$, the diagonalizing matrix $U$ being orthogonal.
17.3.2 The disentanglement result
Let us state and prove our “disentanglement” result.
Proposition 226. Let $\hat\rho$ be a bipartite Gaussian density operator. There exists a symplectic rotation $U\in U(n)$ such that $\hat U\hat\rho\hat U^{-1}$ is separable, where $\hat U\in\operatorname{Mp}(n)$ is any one of the two metaplectic operators covering $U$.
Proof. Since $\hat\rho$ is a density operator, its covariance ellipsoid $\Omega$ contains a quantum blob $SB^{2n}(\sqrt{\hbar})$ with $S\in\operatorname{Sp}(n)$ (Theorem 201). Let $S=PR$ be the symplectic polar decomposition of $S$, that is, $P\in\operatorname{Sp}(n)$, $P>0$, and $R\in U(n)$. We have $SB^{2n}(\sqrt{\hbar})=PB^{2n}(\sqrt{\hbar})$ since $RB^{2n}(\sqrt{\hbar})=B^{2n}(\sqrt{\hbar})$ by rotational symmetry. In view of Proposition 225, there exists a symplectic rotation $U\in U(n)$ such that
\[ P=U^T\Delta U \tag{17.17} \]
where $\Delta\in\operatorname{Sp}(n)$ is a diagonal matrix of the form (17.15). The inclusion $SB^{2n}(\sqrt{\hbar})\subset\Omega$ is thus equivalent to $\Delta B^{2n}(\sqrt{\hbar})\subset U(\Omega)$, that is, to
\[ \Delta B^{2n}(\sqrt{\hbar})\subset\Omega_{\Sigma_U} \tag{17.18} \]
where $\Sigma_U=U\Sigma U^T$. This inclusion is equivalent to the matrix inequality
\[ \frac{\hbar}{2}\Delta^2\le\Sigma_U. \tag{17.19} \]
We next note that $\Sigma_U$ is the covariance matrix of the density operator $\hat\rho_U$ with Wigner distribution $W_{\hat\rho_U}(z)=W_{\hat\rho}(U^Tz)$, that is,
\[ W_{\hat\rho_U}(z)=\frac{1}{(2\pi)^n\sqrt{\det U\Sigma U^T}}\,e^{-\frac12\Sigma^{-1}U^Tz\cdot U^Tz}. \tag{17.20} \]
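Applying the symplectic covariance principle for Weyl operators to $\hat\rho$ yields, since $U^T=U^{-1}$,
\[ \hat\rho_U=\hat U\hat\rho\hat U^{-1} \tag{17.21} \]
where $\hat U\in\operatorname{Mp}(n)$ is any of the two metaplectic operators $\pm\hat U$ covering $U$. We claim that $\hat\rho_U$ is separable. To see this, let us come back to the diagonal matrix $\Delta$. It is given by
\[ \Delta=\operatorname{diag}(\lambda_1,\lambda_1^{-1},\lambda_2,\lambda_2^{-1},\dots,\lambda_n,\lambda_n^{-1}), \tag{17.22} \]
and, in the AB-ordering, it has the form $\Delta=\Delta_A\oplus\Delta_B$ with
\[ \Delta_A=\bigoplus_{k=1}^{n_A}\Delta_k,\qquad\Delta_B=\bigoplus_{k=n_A+1}^{n}\Delta_k \tag{17.23} \]
and
\[ \Delta_k=\begin{pmatrix}\lambda_k&0\\ 0&\lambda_k^{-1}\end{pmatrix},\qquad k=1,\dots,n. \tag{17.24} \]
Clearly $\Delta_A\in\operatorname{Sp}(n_A)$ and $\Delta_B\in\operatorname{Sp}(n_B)$. The symmetric matrices
\[ \Sigma_A=\frac{\hbar}{2}\Delta_A^2,\qquad\Sigma_B=\frac{\hbar}{2}\Delta_B^2 \tag{17.25} \]
trivially satisfy
\[ \Sigma_A+\frac{i\hbar}{2}J_A\ge0,\qquad\Sigma_B+\frac{i\hbar}{2}J_B\ge0. \tag{17.26} \]
In view of (17.19), we have
\[ \Sigma_A\oplus\Sigma_B\le\Sigma_U, \tag{17.27} \]
and the proposition now follows from the Werner–Wolf separability conditions (Proposition 224).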
17.4 The case of Gaussians
Let us consider again Gaussian distributions of the type
\[ \rho(z)=\frac{1}{(2\pi)^n\sqrt{\det\Sigma}}\,e^{-\frac12\Sigma^{-1}(z-\bar z)^2} \]
where $\Sigma$ is a positive definite symmetric (real) $2n\times2n$ matrix. It is the $\eta$-Wigner distribution of a density matrix if and only if $\Sigma$ satisfies the condition
\[ \Sigma+\frac{i\eta}{2}J\ge0\iff|\eta|\le2\lambda_{\min} \tag{17.28} \]
where $\lambda_{\min}$ is the smallest symplectic eigenvalue of the covariance matrix $\Sigma$, as follows from the discussions in previous chapters. In fact, if $\eta>0$, this is just condition (13.17) in Proposition 168 with $\hbar$ replaced by $\eta$; if $\eta<0$, it suffices to note that the condition $\Sigma+\frac{i\eta}{2}J\ge0$ is equivalent, by complex conjugation, to $\Sigma-\frac{i\eta}{2}J\ge0$. The purity of the corresponding $\eta$-density matrix is
\[ \mu(\hat\rho_\eta)=\Bigl(\frac{\eta}{2}\Bigr)^n\det(\Sigma^{-1/2}), \tag{17.29} \]
hence $\hat\rho_\eta$ represents a pure state if and only if $\det(\Sigma)=(\eta/2)^{2n}$.
Proposition 227. Assume that $\hat\rho$ (i.e., $\hat\rho_\eta$ for $\eta=\hbar$) represents a pure state. Then, for every $\eta<\hbar$, the density operator $\hat\rho_\eta$ represents a mixed Gaussian state.
Proof. This is clear in view of the previous discussion, using Proposition 47, since the quantum condition $\Sigma+\frac{i\eta}{2}J\ge0$ is not only necessary but also sufficient for $\hat\rho_\eta$ to be a density operator.
To summarize, we have the following situation (we assume here for simplicity that $\eta>0$): Suppose that the quantum condition on the covariance matrix holds for $\eta=\hbar$. Then the system is in a mixed quantum state for all $\eta<\hbar$; when $\hbar\le\eta\le2\lambda_{\min}$, it is still in a mixed state unless $\eta=2\lambda_1=\dots=2\lambda_n$, in which case it becomes a pure Gaussian state; when $\eta>2\lambda_{\min}$, we are in the presence of a classical Gaussian state.
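The trichotomy just described can be made concrete for one degree of freedom. The sketch below assumes $n=1$ and $\eta>0$, and uses the fact that the symplectic eigenvalue of a positive definite $2\times2$ matrix $\Sigma$ is $\sqrt{\det\Sigma}$; the function name, the labels, and the tolerance are illustrative choices.

```python
import math

def classify(sigma, eta):
    """Classify a one-mode Gaussian with covariance sigma for a parameter eta > 0."""
    lam = math.sqrt(sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0])
    if eta > 2 * lam * (1 + 1e-12):
        return "classical"          # condition (17.28) fails
    purity = (eta / 2) / lam        # formula (17.29) with n = 1
    return "pure" if abs(purity - 1.0) < 1e-12 else "mixed"

# Assumed example: Sigma = (1/2) I, which is a pure state for eta = 1.
sigma = [[0.5, 0.0], [0.0, 0.5]]
labels = [classify(sigma, eta) for eta in (0.5, 1.0, 1.5)]
```

For this covariance matrix the state is mixed for $\eta=0.5$, pure for $\eta=1$, and classical for $\eta=1.5$.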
17.5 Comments and references

There is a vast literature on Gaussian quantum states and their separability. This is because Gaussian states are the easiest to manufacture "in the lab", and because they are the only states for which separability and entanglement are relatively well understood from a theoretical point of view. Proposition 216 goes back to the paper [74] by Werner and Wolf, where the authors elaborate on earlier work of Werner [73]. For a different proof, see Serafini [65], p. 178. Proposition 218 is an improvement, in the linear case, of the symplectic non-squeezing theorem of Gromov [41]; it refines a recent result of Abbondandolo and his collaborators [1, 2]. Proposition 226 extends to the multipartite case: adapting the proof, it is easy to show that every multipartite Gaussian state $\hat{\rho}$ can be disentangled by a symplectic rotation. See de Gosson [34].
Bibliography

[1] A. Abbondandolo, R. Matveyev, How Large Is the Shadow of a Symplectic Ball? J. Topol. Anal. 5(01), 87–119 (2013)
[2] A. Abbondandolo, P. Majer, A Non-Squeezing Theorem for Convex Symplectic Images of the Hilbert Ball. Calc. Var. 54, 1469–1506 (2015)
[3] F. Agorram, A. Benkhadra, A. El Hamyani, A. Ghanmi, Complex Hermite Functions as Fourier–Wigner Transform. Integral Transforms Spec. Funct. 27(2), 94–100 (2015)
[4] Arvind, B. Dutta, N. Mukunda, R. Simon, The Real Symplectic Groups in Quantum Mechanics and Optics. Pramana J. Phys. 45(6), 471–497 (1995)
[5] K. M. Ball, Ellipsoids of Maximal Volume in Convex Bodies. Geom. Dedic. 41(2), 241–250 (1992)
[6] P. Blanchard, E. Brüning, Mathematical Methods in Physics, Distributions, Hilbert Space Operators, and Variational Methods. Progress in Mathematical Physics, vol. 26 (Birkhäuser, 2015)
[7] P. Boggiatto, G. De Donno, A. Oliaro, Generalized Spectrograms and τ-Wigner Transforms. CUBO 12(3), 171–185 (2010)
[8] P. Boggiatto, G. De Donno, A. Oliaro, Time-Frequency Representations of Wigner Type and Pseudo-Differential Operators. Trans. Am. Math. Soc. 362(9), 4955–4981 (2010)
[9] P. Boggiatto, B. Kien Cuong, G. De Donno, A. Oliaro, Weighted Integrals of Wigner Representations. J. Pseudo-Differ. Oper. Appl. (2010)
[10] P. Boggiatto, G. De Donno, A. Oliaro, Hudson’s Theorem for τ-Wigner Transforms. Bull. Lond. Math. Soc. 45(6), 1131–1147 (2013)
[11] M. Born, P. Jordan, Zur Quantenmechanik. Z. Phys. 34, 858–888 (1925)
[12] M. Born, W. Heisenberg, P. Jordan, Zur Quantenmechanik II. Z. Phys. 35, 557–615 (1925). English translation in: M. Jammer, The Conceptual Development of Quantum Mechanics (New York: McGraw-Hill, 1966), 2nd ed. (New York: American Institute of Physics, 1989)
[13] T. Bröcker, R. F. Werner, Mixed States with Positive Wigner Functions. J. Math. Phys. 36(1), 62–75 (1995)
[14] L. Cohen, Generalized Phase-Space Distribution Functions. J. Math. Phys. 7(5), 781–786 (1966)
[15] L. Cohen, The Weyl Operator and its Generalization. Pseudo-Differential Operators (Birkhäuser, Basel, 2013)
[16] E. Cordero, M. de Gosson, F. Nicola, On the Invertibility of Born–Jordan Quantization. J. Math. Pures Appl. 105(4), 537–557 (2016)
[17] E. Cordero, M. de Gosson, F. Nicola, On the Positivity of Trace Class Operators. Adv. Theor. Math. Phys. 23(8), 2061–2091 (2019)
[18] N. C. Dias, J. N. Prata, The Narcowich–Wigner Spectrum of a Pure State. Rep. Math. Phys. 63(1), 43–54 (2009)
[19] N. Dias, M. de Gosson, J. Prata, Maximal Covariance Group of Wigner Transforms and Pseudo-Differential Operators. Proc. Am. Math. Soc. 142(9), 3183–3192 (2014)
[20] N. Dias, M. de Gosson, J. Prata, Partial Traces and the Geometry of Entanglement; Sufficient Conditions for the Separability of Gaussian States. arXiv:2003.13190v1 (2020)
[21] P. A. M. Dirac, Quantised Singularities in the Electromagnetic Field. Proc. R. Soc. A 133, 60–72 (1931)
[22] H. B. Domingo, E. A. Galapon, Generalized Weyl Transform for Operator Ordering: Polynomial Functions in Phase Space. J. Math. Phys. 56, 022104 (2015)
[23] J. Du, M. W. Wong, A Trace Formula for Weyl Transforms. Approx. Theory Appl. (N. S.) 16(1), 41–45 (2000)
[24] A. Einstein, B. Podolsky, N. Rosen, Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? Phys. Rev. 47, 777 (1935)
https://doi.org/10.1515/9783110722772-018
[25] H. G. Feichtinger, Un espace de Banach de distributions tempérées sur les groupes localement compact abéliens. C. R. Acad. Sci. Paris, Sér. A–B 290(17), A791–A794 (1980)
[26] H. G. Feichtinger, K. Gröchenig, Gabor Frames and Time-Frequency Analysis of Distributions. J. Funct. Anal. 146(2), 464–495 (1997)
[27] H. G. Feichtinger, F. Luef, E. Cordero, Banach Gelfand Triples for Gabor Analysis, in Pseudo-Differential Operators, Quantization and Signals. Lecture Notes in Mathematics, vol. 1949 (2008)
[28] G. B. Folland, Harmonic Analysis in Phase Space. Annals of Mathematics Studies (Princeton University Press, Princeton, N.J., 1989)
[29] M. de Gosson, Maslov Classes, Metaplectic Representation and Lagrangian Quantization. Research Notes in Mathematics, vol. 95 (Wiley–VCH, Berlin, 1997)
[30] M. de Gosson, Symplectic Geometry and Quantum Mechanics. Operator Theory: Advances and Applications (subseries: Advances in Partial Differential Equations), vol. 166 (Birkhäuser, Basel, 2006)
[31] M. de Gosson, Symplectic Methods in Harmonic Analysis and in Mathematical Physics (Birkhäuser, 2011)
[32] M. de Gosson, The Wigner Transform. Advanced Textbooks in Mathematics (World Scientific, 2017)
[33] M. de Gosson, Introduction to Born–Jordan Quantization. Fundamental Theories of Physics (Springer-Verlag, 2016)
[34] M. de Gosson, On the Disentanglement of Gaussian Quantum States by Symplectic Rotations. C.R. Acad. Sci. Paris 358(4), 459–462 (2020)
[35] M. de Gosson, Quantum Polar Duality and the Symplectic Camel: a New Geometric Approach to Quantization. arXiv:2009.10678v3 [quant-ph] (2020)
[36] M. de Gosson, F. Luef, Remarks on the Fact that the Uncertainty Principle does not Characterize the Quantum State. Phys. Lett. A 364, 453–457 (2007)
[37] M. de Gosson, F. Luef, Symplectic Capacities and the Geometry of Uncertainty: the Irruption of Symplectic Topology in Classical and Quantum Mechanics. Phys. Rep. 484, 131–179 (2009)
[38] M. de Gosson, F. Nicola, Born–Jordan Pseudodifferential Operators and the Dirac Correspondence: Beyond the Groenewold–van Hove Theorem. Bull. Sci. Math. 144, 64–81 (2018)
[39] I. S. Gradshteyn, I. M. Ryzhik, Table of Integrals, Series, and Products (Academic Press, 2014)
[40] K. Gröchenig, Foundations of Time-Frequency Analysis (Birkhäuser, Boston, 2000)
[41] M. Gromov, Pseudoholomorphic Curves in Symplectic Manifolds. Invent. Math. 82, 307–347 (1985)
[42] A. Grossmann, Parity Operators and Quantization of δ-Functions. Commun. Math. Phys. 48, 191–193 (1976)
[43] M. Horodecki, P. Horodecki, R. Horodecki, Separability of Mixed States: Necessary and Sufficient Conditions. Phys. Lett. A 223, 1–8 (1996)
[44] M. Horodecki, P. Horodecki, R. Horodecki, Separability of n-Particle Mixed States: Necessary and Sufficient Conditions in Terms of Linear Maps. Phys. Lett. A 283(1–2), 1–7 (2001)
[45] L. Hörmander, The Weyl Calculus of Pseudo-Differential Operators. Commun. Pure Appl. Math. 32, 359–443 (1979)
[46] L. Hörmander, The Analysis of Linear Partial Differential Operators III (Springer–Verlag, Berlin, 1985)
[47] R. L. Hudson, When is the Wigner Quasi-Probability Density Non-Negative? Rep. Math. Phys. 6, 249–252 (1974)
[48] M. S. Jakobsen, On a (No Longer) New Segal Algebra: A Review of the Feichtinger Algebra. J. Fourier Anal. Appl. 24(6), 1579–1660 (2018)
[49] A. J. E. M. Janssen, A Note on Hudson’s Theorem about Functions with Nonnegative Wigner Distributions. SIAM J. Math. Anal. 15(1), 170–176 (1984)
[50] D. Kastler, The C∗-Algebras of a Free Boson Field. Commun. Math. Phys. 1, 14–48 (1965)
[51] Y. Katznelson, An Introduction to Harmonic Analysis (Dover, New York, 1976)
[52] L. Lami, A. Serafini, G. Adesso, Gaussian Entanglement Revisited. New J. Phys. 20, 023030 (2018)
[53] J. M. Leinaas, J. Myrheim, E. Ovrum, Geometrical Aspects of Entanglement. Phys. Rev. A 74(1), 012313 (2006)
[54] J. Leray, Lagrangian Analysis and Quantum Mechanics, a mathematical structure related to asymptotic expansions and the Maslov index (MIT Press, Cambridge, Mass, 1981); translated from Analyse Lagrangienne RCP 25, Strasbourg Collège de France (1976–1977)
[55] G. Loupias, S. Miracle-Sole, C∗-Algèbres des systèmes canoniques, I. Commun. Math. Phys. 2, 31–48
[56] G. Loupias, S. Miracle-Sole, C∗-Algèbres des systèmes canoniques, II. Ann. Inst. Henri Poincaré 6(1), 39–58 (1967)
[57] B. D. MacCluer, Elementary Functional Analysis. Graduate Texts in Mathematics, vol. 253 (Springer, 2009)
[58] F. J. Narcowich, Conditions for the Convolution of Two Wigner Distributions to Be Itself a Wigner Distribution. J. Math. Phys. 29(9), 2036–2041 (1988)
[59] F. J. Narcowich, Distributions of ℏ-Positive Type and Applications. J. Math. Phys. 30(11), 2565–2573 (1989)
[60] F. J. Narcowich, Geometry and Uncertainty. J. Math. Phys. 31(2), 354–364 (1990)
[61] T. Paul, A propos du formalisme mathématique de la Mécanique Quantique. Logique & Interaction: Géométrie de la cognition, Actes du colloque et école thématique du CNRS “Logique, Sciences, Philosophie” à Cerisy (Hermann, 2009)
[62] A. Peres, Separability Criterion for Density Matrices. Phys. Rev. Lett. 77, 1413–1415 (1996)
[63] H. J. Reiter, Metaplectic Groups and Segal Algebras. Lect. Notes in Mathematics (Springer, 1989)
[64] A. Royer, Wigner Functions as the Expectation Value of a Parity Operator. Phys. Rev. A 15, 449–450 (1977)
[65] A. Serafini, Quantum Continuous Variables: A Primer of Theoretical Methods (CRC Press, 2017)
[66] M. A. Shubin, Pseudodifferential Operators and Spectral Theory (Springer-Verlag, 1987) [original Russian edition in Nauka, Moskva, 1978]
[67] F. Soto, P. Claverie, When Is the Wigner Function of Multidimensional Systems Nonnegative? J. Math. Phys. 24(1), 97–100 (1983)
[68] E. M. Stein, Harmonic Analysis: Real Variable Methods, Orthogonality, and Oscillatory Integrals (Princeton University Press, 1993)
[69] E. M. Stein, G. Weiss, Fourier Analysis on Euclidean Spaces (Princeton University Press, 1971)
[70] J. Toft, Hudson’s Theorem and Rank One Operators in Weyl Calculus. Pseudo-Differential Operators and Related Topics (Birkhäuser, Basel, 2006), pp. 153–159
[71] V. S. Varadarajan, Lie Groups, Lie Algebras, and Their Representations (Prentice Hall, 1974) (Springer Verlag, 1984)
[72] N. Wallach, Lie Groups: History, Frontiers and Applications. Symplectic Geometry and Fourier Analysis, vol. 5 (Math Sci Press, Brookline, MA, 1977)
[73] R. Werner, Quantum Harmonic Analysis on Phase Space. J. Math. Phys. 25(5), 1404–1411 (1984)
[74] R. F. Werner, M. M. Wolf, Bound Entangled Gaussian States. Phys. Rev. Lett. 86(16), 3658 (2001)
[75] E. Wigner, On the Quantum Correction for Thermodynamic Equilibrium. Phys. Rev. 40, 749–759 (1932)
[76] M. W. Wong, Weyl Transforms (Springer, 1998)
[77] F. Zhang, The Schur Complement and Its Applications (Springer, Berlin, 2005)