Time: Towards a Consistent Theory
Fundamental Theories of Physics
An International Book Series on The Fundamental Theories of Physics: Their Clarification, Development and Application
Editor:
ALWYN VAN DER MERWE, University of Denver, U.S.A.
Editorial Advisory Board:
ASIM BARUT, University of Colorado, U.S.A.
BRIAN D. JOSEPHSON, University of Cambridge, U.K.
CLIVE KILMISTER, University of London, U.K.
GÜNTER LUDWIG, Philipps-Universität, Marburg, Germany
NATHAN ROSEN, Israel Institute of Technology, Israel
MENDEL SACHS, State University of New York at Buffalo, U.S.A.
ABDUS SALAM, International Centre for Theoretical Physics, Trieste, Italy
HANS-JÜRGEN TREDER, Zentralinstitut für Astrophysik der Akademie der Wissenschaften, Germany
Volume 65
Time: Towards a Consistent Theory by
C. K. Raju, Indian Institute of Advanced Study, Rashtrapati Nivas, Shimla, India
and Centre for Development of Advanced Computing, New Delhi, India
SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 978-90-481-4462-4
DOI 10.1007/978-94-015-8376-3
ISBN 978-94-015-8376-3 (eBook)
Printed on acid-free paper
All Rights Reserved. © 1994 Springer Science+Business Media Dordrecht. Originally published by Kluwer Academic Publishers in 1994. Softcover reprint of the hardcover 1st edition 1994. No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.
Contents

Synoptic table of contents
Preface
Introduction

Part A: Preliminary paradoxes and puzzles
I     Philosophical time

Part B: The measurement of time
II    Newton's time
IIIA  The Michelson-Morley experiment
IIIB  Einstein's time

Part C: The arrow of time
IV    Thermodynamic time
VA    The electromagnetic field
VB    Electromagnetic time

Part D: The topology of time
VIA   Bell and non-locality
VIB   Quantum-mechanical time
VII   Cosmological time

Part E: Towards a consistent model of time
VIII  Mundane time

Notes and References
Index
Synoptic Table of Contents

INTRODUCTION
1 The structure of time
  1.1 Mundane time
  1.2 Superlinear time
  1.3 The paradox of mundane time
  1.4 The irrelevance of (quantum) indeterminism
  1.5 Temporal assumptions in the argument from chaos
2 The physicist's point of view
3 Objectives of the exposition
4 Organization

Part A: Preliminary paradoxes and puzzles

I PHILOSOPHICAL TIME
1 The definition of time
  1.1 The operational definition
  1.2 Is time objectively definable?
  1.3 Is a definition necessary?
2 The myth of passage
  2.1 The A-series: the flow of time
  2.2 The B-series: the arrow of time
  2.3 McTaggart's paradox
  2.4 Resolution of McTaggart's paradox
3 The sea fight tomorrow
  3.1 Fatalism and determinism
  3.2 Dummett's paradox
4 Logic and 'free will'
  4.1 The Master Argument of Diodorus Cronus
  4.2 Lukasiewicz's three-valued logic
  4.3 The quasi truth-functional system
5 Zeno's paradoxes of motion
  5.1 Achilles, dichotomy and the arrow
  5.2 Tasks and supertasks
6 Conclusions
Box items: 1. Augustine; 2. McTaggart; 3. Sri Harsa; 4. Zeno of Elea

Part B: The measurement of time

II NEWTON'S TIME
1 Introduction
2 Newton's laws of motion
  2.1 Bodies
  2.2 Forces
  2.3 Sources of forces
  2.4 Inertial frames
  2.5 External forces
  2.6 Newton's laws as definitions
  2.7 Clocks
3 Falsifiability and physical theories
4 Laws of motion and law of gravitation
5 Laplace's demon
  5.1 Time in classical mechanics
  5.2 The structure of time in classical mechanics
  5.3 The argument from complexity
  5.4 Laplace's demon
  5.5 Exorcism of Laplace's demon
6 Conclusions
Box items: 1. Newton and action at a distance; 2. Karl Raimund Popper; 3. Pierre Simon de Laplace

IIIA THE MICHELSON-MORLEY EXPERIMENT
1 Introduction
2 The background of aether theories
  2.1 The blue sky or why aether was introduced
  2.2 The Dragon's head
  2.3 Stellar aberration and the finiteness of the speed of light
  2.4 The grove of trees
  2.5 The aether drag
  2.6 A delicate point
  2.7 The aether dragged
3 Electrodynamics: theory and experiment
  3.1 Maxwell
  3.2 The experiment proper
  3.3 Miller's observations
  3.4 Aether, relativity and metaphysics
  3.5 The moral of the story
4 Conclusions
Appendix: Einstein on the Michelson-Morley experiment

IIIB EINSTEIN'S TIME
1 Introduction
2 Electrodynamics
  2.1 Lorentz #1
  2.2 Lorentz #2
3 Newton's laws again
  3.1 Velocity-dependent forces
  3.2 Homogeneity
  3.3 Isotropy
  3.4 Straight-line motion
  3.5 Relative motion and force
  3.6 Newton's time
4 Poincaré
  4.1 Poincaré and the relativity principle
  4.2 Poincaré and aether
  4.3 Poincaré and mechanical explanations
  4.4 Relative velocity and the Michelson-Morley experiment
  4.5 Poincaré on Lorentz's theory
  4.6 The new mechanics
  4.7 Poincaré on time
5 Einstein
6 Time
7 Conclusions
Appendix: The historical record

Part C: The arrow of time

IV THERMODYNAMIC TIME
1 Introduction
  1.1 Summary of earlier chapters
  1.2 Relativity and existence: loss of time asymmetry
  1.3 Physics and the direction of time
  1.4 Time symmetry of physics
  1.5 Is time symmetry reasonable?
2 The entropy law
  2.1 The meaning of entropy
3 The Boltzmann H-theorem
  3.1 The Ehrenfest model
4 The reversibility and recurrence paradoxes
  4.1 Loschmidt's paradox
  4.2 Poincaré recurrence theorem
    4.2.1 The recurrence theorem simplified
    4.2.2 Consequences of the recurrence theorem
    4.2.3 Relationship to mechanics: Liouville's theorem
  4.3 The recurrence paradox
  4.4 Refutation of the paradoxes
  4.5 Objections to the refutations
    4.5.1 Is the cosmos shuffled?
    4.5.2 The meaning(lessness) of large recurrence times
5 Resolution of the paradoxes
6 Conclusions
Appendix: Proof of Liouville's theorem
Box items: 1. Pictures of entropy and evolution; 2. Mixing and sorting: Maxwell's demon; 3. Elements of ergodic theory; 4. Rudolph Clausius; 5. Ludwig Boltzmann

VA THE ELECTROMAGNETIC FIELD
1 Introduction
2 Is the field necessary?
  2.1 Force at a distance
  2.2 Action by contact
  2.3 The field
  2.4 'Contact' and the structure of matter
  2.5 The meaning of 'contact'
3 The electromagnetic arrow of time
  3.1 Retarded and advanced solutions
  3.2 Radiative damping and the arrow of time
  3.3 Mixed potentials
  3.4 The paradoxes of advanced action
4 The divergences of field theory
  4.1 The Abraham-Lorentz model
  4.2 Action at a distance
  4.3 Dirac's approach
5 Preacceleration
6 Conclusions

VB ELECTROMAGNETIC TIME
1 Introduction
2 The two-body problem of electrodynamics
  2.1 Formulation
  2.2 Some definitions
  2.3 The recurrence paradox and the past-value problem
  2.4 The reversibility paradox: time asymmetry of delay
  2.5 Preacceleration: the Taylor-series approximation
  2.6 The pond paradox: advanced equations
  2.7 Indeterminism: mixed deviating arguments
  2.8 The Wheeler-Feynman paradox
3 The absorber theory of radiation
  3.1 Action at a distance
  3.2 The Sommerfeld radiation condition
  3.3 The Wheeler-Feynman theory
  3.4 Hogarth's theory
  3.5 Other theories
4 Empirical tests
5 Conclusions
Appendix: Derivation of the relativistic two-body equations of motion with a tilt in the arrow of time

Part D: The topology of time

VIA BELL AND NON-LOCALITY
1 The topology of time: a general introduction to Chapters VIA, VIB, VII
  1.1 The language of time
    1.1.1 The U-calculus
    1.1.2 The 'minimal' tense logic
    1.1.3 The language of time: further considerations
  1.2 A new logic for physics
    1.2.1 Structured time and the tilt in the arrow of time
  1.3 Objectives
  1.4 Non-locality
2 Background to Bell's inequalities
  2.1 Basic experiments
    2.1.1 The two-slit diffraction experiment
    2.1.2 Quantization of electron spin
  2.2 Interpretations
    2.2.1 Hidden variables
    2.2.2 The Copenhagen interpretation
  2.3 Controversy: the EPR paradox
  2.4 Classical mechanics as a hidden variable theory
  2.5 History: the no-hidden-variable theorems
    2.5.1 von Neumann's 'theorem'
    2.5.2 Gleason's theorem and context dependence
    2.5.3 Non-locality of Bohm's theory
  2.6 Metaphysics: locality and the voodoo principle
3 Bell's inequalities
  3.1 Outline of inequalities
  3.2 Bell's locality postulate
  3.3 Experiments on Bell's inequalities
  3.4 Spookiness: interactions between separated systems
  3.5 Bell locality: interpretation of experiments
    3.5.1 The early objections
    3.5.2 The argument of Barut and Meystre
    3.5.3 The argument from efficiency
4 Further confusion: other notions of locality
  4.1 The Aharonov-Bohm effect
  4.2 Locality in quantum field theory
  4.3 Locality and localizability
  4.4 Summary
5 The debate over non-locality in classical mechanics
  5.1 Francis Bacon and magic
  5.2 Locality, chains of causes, and action by contact
  5.3 Descartes and Newton
  5.4 The field
  5.5 Hyperbolicity: finite propagation speeds
6 Conclusions
Box items: 1. Time in q.m.: some recognized new features; 2. Schrödinger's cat; 3. Schrödinger's cat and SQUIDs; 4. The Aharonov-Bohm effect

VIB QUANTUM-MECHANICAL TIME
1 Introduction
2 The orthodox formalism of q.m.
3 From quantum logic to the formalism of q.m.
  3.1 Non-commutativity and non-existence of joint distributions
  3.2 Need for a change of logic: failure of the distributive law
  3.3 Birkhoff-von Neumann approach and the orthodox formalism of q.m.
    3.3.1 The lattice of projections
    3.3.2 Order relation and orthocomplement
    3.3.3 Geometrical interpretation
    3.3.4 Dynamical variables, random variables and self-adjoint operators
  3.4 The quantum logic approach
    3.4.1 The minimum requirements
    3.4.2 Orthomodularity
    3.4.3 Compatibility
  3.5 The Jauch-Piron approach
  3.6 Defects in the quantum logic approach
4 The structured-time interpretation of q.m.
  4.1 Motivation
  4.2 Overview of the argument
  4.3 From electrodynamics to structured time
    4.3.1 The analogy with CSP
    4.3.2 Past and present contingents
    4.3.3 The logic of structured time
  4.4 From structured time to quantum logic
    4.4.1 Statements
    4.4.2 Truth-functional worlds
    4.4.3 Quasi truth-functional worlds
    4.4.4 Modalities
    4.4.5 Irreducible contingents and admissible worlds
    4.4.6 States
    4.4.7 Quantum measurements, access relations, and selection functions
    4.4.8 Incompatibility, joint measurements and repeated measurements
    4.4.9 Measurability
    4.4.10 'And', 'or', 'if', 'not'
    4.4.11 Preliminary order relation
    4.4.12 Orthogonality, compatibility and order relation
    4.4.13 Main result
    4.4.14 Example of failure of the distributive law
    4.4.15 The lattice structure
    4.4.16 Admissibility
    4.4.17 Example of a selection function
  4.5 Relation to other interpretations
    4.5.1 The many-worlds interpretation
    4.5.2 The transactional interpretation
    4.5.3 The modal interpretation
    4.5.4 The Copenhagen interpretation
5 Conclusions
Appendix: Proof of the main theorem
Box items: 1. Collected definitions

VII COSMOLOGICAL TIME
1 Introduction
  1.1 Speculation in cosmology
  1.2 Choice of a theory
  1.3 The cosmological arrow of time
2 Darkness of the night sky: Olbers' paradox
  2.1 Olbers' paradox
  2.2 The absorption solution
  2.3 The expansion solution
  2.4 The finite age solution
  2.5 Summary
3 Cosmological expansion and redshifts
  3.1 Hubble's law
  3.2 Other interpretations of the redshift
4 The Friedmann models
  4.1 The cosmological principle
  4.2 The FLRW line element
  4.3 Identification with observable quantities
5 The cosmic microwave background
6 The beginning of time
  6.1 The initial singularity
  6.2 Singularities and black holes
  6.3 Cosmological singularities
  6.4 Interpretation of singularities: shocks
  6.5 Shocks and hyperbolicity
7 The end of time: dark matter and rotation of galaxies
8 The age of the universe and closed timelike curves
9 Problems about time: a cosmological perspective
  9.1 Is there a proper clock?
  9.2 Related asymmetries?
  9.3 Imperfect asymmetries?
  9.4 Local asymmetries?
  9.5 Is asymmetry adequate?

Part E: Towards a consistent model of time

VIII MUNDANE TIME
1 Introduction
2 Mundane time: linear past and branching future
  2.1 Reformulation of the A-series
  2.2 Ignorance, past and future
  2.3 Epistemic and ontic contingents
3 Comparison with time in physics
  3.1 The B-series from relativity
  3.2 Limitations of the arrow simile
  3.3 Branching of time and the failure of physical laws
4 Need for consistency
  4.1 The problem of consistency
  4.2 Building a consistent model of experience
    4.2.1 The two-time theory
    4.2.2 Mundane time and assumptions about time in physics
  4.3 Mundane experience as hallucination: the problem of intentional choice
    4.3.1 The flat earth and parallax
    4.3.2 The problem of intentional choice
  4.4 The complete problem of consistency
  4.5 Do physical laws somehow fail for human beings?
5 Consistency through broken time
  5.1 Ontically broken time: quantum indeterminism
    5.1.1 Chance vs. choice
    5.1.2 Occasionalism
    5.1.3 The transition to macrophysics
    5.1.4 Quantum 'choice' and measurement
    5.1.5 Summary of quantum indeterminism and mundane time
  5.2 Epistemically broken time: the argument from complexity
    5.2.1 The complexity of macrophysical systems
    5.2.2 Thermodynamics and stochastic evolution
    5.2.3 Comparison with quantum indeterminism
    5.2.4 Laplace's demon
  5.3 Epistemically broken time: the argument from chaos
    5.3.1 Predictability and contingents
    5.3.2 Reversibility of chaos
    5.3.3 Unpredictability of the past
6 Consistency through tilting time
  6.1 Tilting time and consistency within physics
  6.2 Preliminary consistency with mundane time
  6.3 Remaining problems of consistency
  6.4 The problem of intentionality
    6.4.1 Temporal assumptions underlying epistemically broken time
    6.4.2 Explanation from final causes
    6.4.3 Intention or 'purpose' as future data
    6.4.4 Intentionality with mundane time
    6.4.5 Differences between the two notions
  6.5 The problem of macrophysical choice
    6.5.1 Advanced interactions as perturbations
    6.5.2 Deep history dependence
    6.5.3 Anticipatory phenomena (choice) and Maxwell's demon
    6.5.4 Ampliatory self-organization
  6.6 The problem of past contingents: structured time vs. past linearity
7 Conclusions

Notes and References
Preface
TIME is indubitably a matter of life and death! But it is a subject on which misconceptions are widely prevalent, even amongst physicists. This prompted me to write a series of expository papers, On Time, which appeared in the pedagogical journal Physics Education (India) as follows.

I     Philosophical time, 7(3), 204-217, 1990.
II    Newton's time, 8(1), 15-25, 1991.
IIIA  The Michelson-Morley experiment, 8(3), 193-200, 1991.
IIIB  Einstein's time, 8(4), 293-305, 1992.
IV    Thermodynamic time, 9(1), 44-62, 1992.
VA    The Electromagnetic field, 9(2), 119-128, 1992.
VB    Electromagnetic time, 9(3), 251-265, 1992.
VIA   Bell and non-locality, 10(1), 55-73, 1993.
VIB   Quantum-mechanical time, 10(2), 143-161, 1993.
VII   Cosmological time (to appear).
I am grateful to Professor A.W. Joshi, Assistant Editor of that journal, for inviting me to write this series and for his constant enthusiasm which has been a source of support. I am also grateful to the Editors for permission to reproduce the papers in a slightly modified form in this book.

While writing this series of papers, it was hardly possible to avoid the infusion of some new ideas and a fresh way of looking at old controversies. I am grateful to the Indian Institute of Advanced Study, Shimla, for a Fellowship which made it possible to complete the core of the series and shape it into a book. Apart from the Fellowship which provided the time, the tranquil and beautiful setting of the Institute was a definite source of inspiration. I am grateful to the authorities of the Institute, particularly to Professor S. Gopal, Chairman of the Governing Body, and Professor Mrinal Miri, Director, for readily agreeing to my proposal to have the book published outside so that it might reach a wider audience.

In the surcharged atmosphere of an organization like C-DAC, invested with the mission of developing a supercomputer, from scratch, within three years, it would hardly have been possible to commence (and later complete) this book (even between delayed flights), but for the enlightened management of Dr. Vijay P. Bhatkar. I am grateful to Professor Jagdish Mehra for the general encouragement that he provided.

In addition, there are a large number of people to whom my debt must remain unacknowledged. These include the readers of Physics Education, and the anonymous referees who provided useful feedback. Unfortunately, there is no one whom I can thank for careful preparation of the typescript, so there is no one to whom I can indirectly pass on even part of the blame for any errors that remain!

I dedicate this book to Tomu and Jaya for the sudden disruption in their lives that the writing of this book entailed.
INTRODUCTION

1 The structure of time
DOES time have a non-trivial logical or topological structure? This possibility was implicitly recognized long ago; it found formal expression in temporal logics in the 60's and 70's and has even acquired considerable practical significance since the recent advent of parallel computing. But the question has, so far, not been considered from the point of view of physics. A quick way to bring the problem into focus is to think about the sharp contrast between the notion of time in physics and the notion of time in everyday life.
Fig. 1: Mundane time
On the mundane view, time is believed to be linear towards the past, and branching towards the future. There is only one past, and the possibilities actually realized are represented by the thick line. The arrows represent the several possible futures, while the thin lines represent the might-have-been possibilities, now excluded, presumably due to human choices.
1.1 Mundane time

Everyday actions tend to presume the picture of a time which branches towards the future and,
Σ_i p_i = 1. For an arbitrary (discrete) probability distribution, the meaning of the function S may be ascertained as follows.

Suppose that the random variable X can take on five values with equal probability. For example, suppose we know the first letter 'p', and the last letter 't', of a three-letter code: we know that the intermediate letter must be one of the five vowels a, e, i, o, u, and we suppose that all five of pat, pet, pit, pot, put are equally likely. Suppose also that one must determine the value of X based only on yes-no type of questions, and that each question is very expensive, so that one follows the best possible strategy. What is the average number of questions that one would have to ask?

This average may be calculated as follows. Let us choose the following strategy. First divide the five values into a group of three values, say {i, o, u}, and a group of two values, {a, e}. Next, divide the group of three into subgroups of one, say {i}, and two. Only one question (IOU?) is required to decide in which group the actual value lies, and one more question (I?) to isolate the subgroup. Thus, the actual value can be ascertained in a maximum of 3 questions, in 2 cases, and a minimum of 2 questions, in 3 cases, all cases being equally possible, so that each case has probability 1/5. If one keeps repeating this process, the number of questions required will vary, and the average or expected number of questions would be 3 × 2/5 + 2 × 3/5 = 2.4, which is in good agreement with log₂ 5 = 2.32, i.e., the entropy assuming that all possibilities are equally likely, so that each p_i = 1/5 in equation (1), except that the base of the logarithm is 2, corresponding to a choice of the unit. Exact agreement obtains whenever the number of possibilities is a power of 2; e.g., 2 = log₂ 4 questions are required if there are 4 equally likely possibilities.
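The counting in this example is easily checked mechanically. The following Python fragment is only an illustrative sketch (it is not part of the book): it walks through the question strategy just described for the five equally likely vowels and compares the average number of questions with log₂ 5.

    import math

    # Hypothetical illustration of the guessing game described above:
    # five equally likely values; first ask "is it in {i, o, u}?", then
    # "is it i?" (or "is it a?" on the other branch), then a final
    # question if one is still needed.
    vowels = ["a", "e", "i", "o", "u"]

    def questions_needed(v):
        n = 1                      # Q1: is v in {i, o, u}?
        if v in {"i", "o", "u"}:
            n += 1                 # Q2: is v == i?
            if v != "i":
                n += 1             # Q3: separate o from u
        else:
            n += 1                 # Q2: separate a from e
        return n

    average = sum(questions_needed(v) for v in vowels) / len(vowels)
    print(average)                 # 2.4
    print(math.log2(len(vowels)))  # 2.32..., the entropy with equal probabilities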
If the probabilities were unequal, say 0.3, 0.2, 0.2, 0.15, and 0.15, the average number of questions turns out to be 2.3, whereas S = 2.27. In general the average number of yes-no questions is greater than or equal to the entropy S, and S may be regarded as the minimum average number of yes-no questions that are needed. The average number of questions one must ask is a measure of one's lack of information about a system: the less information one has, the more questions one must ask (which perhaps explains why some students are afraid to ask questions!). Thus, the entropy represents the lack of information about the system. For other ways of looking at entropy, see Ash.1

Let us further acknowledge that the only truly 'closed system' that we have is the cosmos. Then the entropy law states the following.

Statement 2: The entropy of the cosmos never decreases.

That is, the entropy of the cosmos at a future instant must be at least as large as the entropy at a past instant. That is, one has at least as much information about the past as one has about the future. This is the same thing as saying that memory is not less reliable than expectation.

3 The Boltzmann H-theorem

Ever since Clausius and Boltzmann, physicists have not been happy with the above form of the entropy law, because it does not actually say that we are surer of the past than we are of the future. It is not of much use to say that entropy never decreases. What if it never increases as well, and remains constant? This would correspond to a situation where one is equally sure of past and future, so that expectation would have to be treated on par with memory. But more mundane problems would arise; for if entropy remains constant, heat need not flow, by itself, from hotter to cooler bodies. But this is something that is actually observed, and physicists would like to explain this observed phenomenon, using some general physical law such as the entropy law.

The second reason why physicists tend to be unhappy with statement 2 of the entropy law is that the statement is semi-empirical: it does not have the force of theory behind it. In keeping with Panini's principle of laghava or brevity (or Occam's razor), one would like to show that this form of the entropy law is a logical consequence of the basic laws of mechanics - Newton's laws of motion, say. Thus, one would like to replace statement 2 of the entropy law by a statement to the following effect.

Statement 3: The entropy of a closed system, or the cosmos, goes on increasing with time until it reaches its maximum value (in thermodynamic equilibrium).

This is the more ambitious H-theorem, called a theorem because one tries to prove it, starting from the laws of mechanics, and some plausible assumptions. The 'H' refers to the negative of entropy, which must have a minimum. If the H-theorem were true, one could use it to define an earlier-later relationship: later times are those with higher entropy. Thus, the H-theorem, if valid, provides a thermodynamic asymmetry between past and future, or a thermodynamic arrow of time. Many physicists regard this as the primary source of time asymmetry.
Box 1. Pictures of entropy and evolution

1. Thermodynamic entropy and irreversible evolution

dQ/T is a perfect differential for reversible processes, dS = dQ/T. For a perfect gas in equilibrium,5 S = Nk log(VT^(3/2)). This entropy represents the amount of heat 'unavailable' for 'useful' work. For heat, or any form of energy, to perform work, a non-equilibrium situation is necessary. The ocean is at an absolute temperature of several hundred degrees. But a ship sailing on the ocean cannot make use of this energy, for it is in equilibrium with the ocean. Irreversible processes, such as friction, transfer energy from a non-equilibrium to an equilibrium situation. Most processes in nature are irreversible, and the notion of a reversible process is an idealization.

2. Boltzmann entropy and shuffling

Consider the 6-dimensional μ-space. Each molecule (point mass) of the gas has six degrees of freedom, and its dynamical state is completely specified by a single point in μ-space. Thus, the state of a gas with N molecules is completely specified by prescribing N points in μ-space. Because of the difficulty in describing the motion of this swarm of N points, μ-space is divided or discretized into cells which are small compared to macroscopic dimensions, but still contain a large number of molecules. We can, thus, obtain the density of gas particles or points in μ-space, f(q, p, t), or the probability distribution in μ-space, which is all that is needed to describe the macroscopic properties of the gas. One often assumes spatial uniformity, constant energy, and a constant number N of gas particles, so that only the distribution function f(p, t) or f(v, t) is of concern. Equipartition of energy leads to the Maxwell-Boltzmann distribution. The H-function is now defined by
H(t) = ∫ f(v) log f(v) dv.   (B1.1)
H is minimized by the M-B distribution, and in this case of equilibrium it is related to the entropy S by S = -kHV. Thus, H provides an extension of the notion of thermodynamic entropy to the non-equilibrium case. In this picture of evolution, the following is assumed.
(i) Shuffling: points in μ-space are shuffled at random.
(ii) Molecular chaos: velocities (and positions) of molecules are statistically independent before collision.
(iii) Principle of micro-reversibility: the cross-section for a collision is equal to the cross-section for the reverse collision.
The H-theorem has an immediate intuitive appeal: the inevitability of the approach towards thermodynamic equilibrium is rather closely analogous to the apparently irreversible processes of aging and death. Ever since Boltzmann, more than a century ago, many attempts have been made to prove this theorem, with Newton's laws taken as the basic laws of mechanics.
3. Fine-grained entropy and Hamiltonian evolution

This is the entropy associated with the standard picture of Hamiltonian evolution in 6N-dimensional phase space or Γ-space. The exact state of the gas at any instant is described by a single point in phase space, and its evolution by a single trajectory. Uncertainties in the microphysical state are described at the macrophysical level by means of a number of identical mental copies of the system, an ensemble, represented by a swarm of points in Γ-space. The swarm of points in Γ-space is imaginary, compared to the swarm of N points in μ-space. The relative density of this swarm of points, or the probability distribution in Γ-space, ρ(q1, ..., q3N, p1, ..., p3N, t), describes the statistical properties of the system; ρ = constant is called the uniform ensemble, and ρ(E) = δ(E) is called the micro-canonical ensemble. The classical Liouville theorem states that the swarm of points in Γ-space behaves like an incompressible fluid, dρ/dt = 0, the derivative being taken in coordinates moving with the fluid, so that the density of points is conserved, or probability is preserved. The fine-grained entropy is defined by
S(t) = ∫ ρ(t) log ρ(t) dq dp,   (B1.2)
and remains constant by Liouville's theorem.
4. Coarse-grained entropy and Markovian evolution

The discrete cells in μ-space correspond to finite regions, called 'stars', in Γ-space. Due to the shuffling of points in μ-space, the representative point wanders from star to star in a stochastic manner. Specifically, the assumption is that this evolution is Markovian: given an exact knowledge of the current macrostate, knowledge of the past history is redundant. If one is interested in the behaviour at only discrete instants of time, n, one obtains a Markov chain. Mathematically, the requirement is that the conditional probability
Pr{X(n)=x(n) | X(n1)=x(n1), X(n2)=x(n2), ..., X(nk)=x(nk)} = Pr{X(n)=x(n) | X(nk)=x(nk)}   (B1.3)

for all 'instants of time' ni, such that n1 < n2 < ... < nk.
The use of Newton's laws, rather than relativity, may suggest a basic problem right here, since we know that in Newton's laws there is a conceptual difficulty with time. So the use of Newton's laws might amount to adding chaos to confusion! With a finite speed of interaction, there is a fundamental change in the nature of the many-body equations of motion: they become history-dependent, hence time asymmetric. Nevertheless, we will review the attempts that have been made to prove the H-theorem using Newton's laws, and point out the difficulties involved in these attempts before considering alternative means.
If the gas is bounded (e.g., kept in a box) there are only a finite number of stars. Thus, the evolution of the gas may be modeled by a finite Markov chain, the X(n) taking values in a finite set {e1, e2, ..., en}. The transition probabilities,

p_jk(n, m) = Pr{X(n) = e_k | X(m) = e_j},   (B1.4)
are called stationary if p_jk(n+r, m+r) = p_jk(n, m) for all n, m, r. That is, the origin chosen for time is immaterial, and p_jk(n, m) is a function only of n - m. Stationarity is the analogue of equilibrium, and it is a mathematical fact9 that any Markov chain will approach a stationary state. Ergodicity now means that each state, e_j, is accessible from any other state, e_i: p_ij(n) > 0 for some n, and so each state is visited some time. In the stationary case, the theory is particularly simple, since p_jk(n) = p_jk(n+m, m) is the jk-th element of the matrix [p_jk(1)]^n, by the Chapman-Kolmogorov (Fokker-Planck) equation. If Ω is the volume of a star, one defines a coarse-grained distribution function
P = (1/Ω) ∫_Ω ρ dq dp,   (B1.5)

and an associated coarse-grained H-function

H = ∫ P log P dq dp.   (B1.6)
This coarse-grained entropy increases, except in equilibrium, but the details of the approach to equilibrium are missing.
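As a concrete illustration of this Markovian picture, the following sketch (not from the book; the three-state transition matrix is simply made up) raises a stochastic matrix to higher and higher powers, in line with the Chapman-Kolmogorov relation, and shows the rows converging to a common stationary distribution.

    import numpy as np

    # An assumed 3-state transition matrix: P[j, k] = Pr{next state k | current state j}.
    P = np.array([[0.5, 0.4, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.1, 0.3, 0.6]])

    # n-step transition probabilities are the entries of P**n (Chapman-Kolmogorov).
    print(np.linalg.matrix_power(P, 5))
    print(np.linalg.matrix_power(P, 50))   # rows become nearly identical

    # The common row is the stationary distribution pi, with pi P = pi.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    pi = pi / pi.sum()
    print(pi)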
5. Entropy and information: see text.

3.1 The Ehrenfest model

To illustrate the operation and the proof of the H-theorem, take a full pack of cards. (The original Ehrenfest model involved two urns and balls numbered 1 to N, which were moved at random from one urn to another.) Divide it into two packs, one consisting of all red cards, and the other consisting of all black cards. Move the cards according to the following rule. Take the top card from pack 1 and put it on top of pack 2. Take the bottom card from pack 2 and put it at the bottom of pack 1. Shuffle the packs separately, and again interchange according to the above rule. Continue this process.

The number of, say, red cards in one of the packs gives the (macroscopic) state of the system. The rule for interchanging cards corresponds to the evolutionary (dynamical) law. The process of shuffling corresponds to the 'ergodic hypothesis', or the property of 'mixing' or 'metric transitivity'. (There are subtle differences between these concepts which we need not consider here; see Boxes 2 and 3.) The whole system is a model for a finite Markov chain (Box 1).
The initial state of the system, with all red cards in one pack and black cards in the other, is an ordered or low entropy state, like that of a gas in a box with all molecules in one half of the box. If we continue shuffling and interchanging the cards, after a short time there will be about an equal number (= 13) of red and black cards in each pack: the gas has expanded into vacuum. This is the most disordered state or the state of maximum entropy or thermodynamic equilibrium. The moral of the story is that the system, starting from a low entropy state, will progress irretrievably towards thermodynamic equilibrium.
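A direct simulation makes this approach to equilibrium visible. The sketch below (not part of the book) implements the card version just described: interchange the top and bottom cards, shuffle each pack, and record the number of red cards left in pack 1.

    import random

    random.seed(0)
    pack1 = ["red"] * 26       # initially all red cards in pack 1
    pack2 = ["black"] * 26     # and all black cards in pack 2

    red_counts = []
    for step in range(200):
        pack2.insert(0, pack1.pop(0))   # top card of pack 1 onto top of pack 2
        pack1.append(pack2.pop())       # bottom card of pack 2 to bottom of pack 1
        random.shuffle(pack1)           # the 'shuffling' step of the model
        random.shuffle(pack2)
        red_counts.append(pack1.count("red"))

    print(red_counts[:10])    # starts near 26 ...
    print(red_counts[-10:])   # ... and soon fluctuates around 13, the equilibrium value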
4 The reversibility and recurrence paradoxes

4.1 Loschmidt's paradox

Loschmidt objected to Boltzmann's proof of the H-theorem, pointing out that the laws of mechanics were reversible, so no proof of the H-theorem could depend on the laws of mechanics alone. When told about Loschmidt's objection that molecular motions were reversible, Boltzmann is rumoured to have remarked 'Go ahead, reverse them'.

In the above analogy, the laws of mechanics correspond to the rule for interchanging the cards. If the proof of the H-theorem depended only on this rule, then the entropy of the system would increase whether or not the cards were shuffled each time. If the cards are not shuffled after each interchange, then (by interchanging the two packs) one can get back to the initial state, starting from a given state, by the same rule for interchanging cards. If the increase in entropy depended only on this rule, then the entropy must increase in both cases. So the entropy of any state must be the same as the entropy of any other state, and entropy must remain constant.

One can arrive at the same conclusion in another way. A universe evolving according to the laws of mechanics is time-symmetrically deterministic in the following sense. The future of the universe is completely determined by its past and vice versa. In such a deterministic universe, the future is uncertain only to the extent that the past is uncertain: the process of evolution does not generate any uncertainty. To summarize Loschmidt's objection, in a universe evolving time-symmetrically and deterministically, according to the laws of classical mechanics say, the future cannot be more uncertain than the past. Some shuffling mechanism is necessary. But who or what is responsible for shuffling the cosmos? And how does one know whether, in fact, the cosmos is being shuffled?

4.2 Poincaré recurrence theorem

Zermelo raised another objection which applies to any proof of the H-theorem. This objection is based on the Poincaré recurrence theorem. To illustrate the recurrence theorem, consider first the case where the packs are not shuffled. If we go on interchanging the cards between packs, ultimately we will come back to the state where all the red cards are back in the first pack, and all the black cards in the second pack. The initial state has recurred.
If we continue interchanging, other states will recur as well - the evolution of the system is completely cyclic. Perhaps this is only because of the particular rule that we adopted. The Poincaré recurrence theorem states that this kind of quasi-periodic behaviour must occur regardless of the particular rule used to interchange the cards. More specifically, the theorem asserts:

Theorem. Let (X, μ, T_t) be an abstract dynamical system, i.e., (X, μ) is a probability space, and T_t is a group of automorphisms of (X, μ), so that each T_t is one-to-one and preserves measure. Let A be any subset of X. Then for almost every x ∈ A, there exist arbitrarily large (hence infinitely many) t for which T_t x ∈ A.

Proof. Let B = {x ∈ A : T_t x ∉ A, ∀ t ≥ t0}. Relabel t0 as 1 for convenience. The sets B, T1B, ..., TnB, ... are all disjoint, and μ(TnB) = μ(B), since each Tn preserves measure. Hence,

1 = μ(X) ≥ μ(B) + μ(T1B) + μ(T2B) + ... = μ(B) + μ(B) + μ(B) + ...,

so that μ(B) = 0.

4.2.1 The recurrence theorem simplified. One could restate the theorem and proof as follows. Suppose we do not quite know the state of a system; we may liken the evolution of the system to a flow of points in the manner of a fluid. The trajectory of a point initially at x(0) is described by the rule x(t) = T_t x(0). Suppose the flow of this fluid preserves volume, i.e., the fluid behaves as if it were incompressible. Then, for any finite-volume region A, the trajectory starting from A must return to A after arbitrarily large times, hence infinitely often.

The proof depends upon the assumptions that (a) the flow preserves volume and (b) the total volume of the available state space is finite. Thus, let A be any region, and let B be that subregion of A, starting from which the trajectory never returns to A after 1 s, say. If the fluid initially in B occupies the region B1 after 1 s, then B and B1 cannot overlap, by definition of B. Similarly, if the fluid is in B2 after 2 s, then B, B1, and B2 cannot overlap (else B1 and B would overlap). But the flow preserves volume, so that the volume of B must be the same as that of B1, B2, .... Since B, B1, B2, ... do not overlap, their volumes may be added. But there is only a finite volume available, which will sooner or later be exhausted unless B has zero volume. So, if A has non-zero volume, all that volume must be occupied by states that return to A after 1 s. The same argument goes through for t0 s in place of 1 s. Since t0 could be chosen arbitrarily large, the trajectory starting from almost every point of A will return to A after arbitrarily large times, hence infinitely often. Since the volume of A could be chosen as small as we please (so long as it is non-zero), the trajectory must return arbitrarily close to the initial point infinitely often.

4.2.2 Consequences of the recurrence theorem. The consequences of this theorem are rather dramatic. Suppose, for example, a gas is kept in one half of a box by means of a membrane, and the membrane is broken. We would expect the gas to expand and fill the entire box. The theorem predicts that the gas will spontaneously return to the half
of the box that it originally occupied, creating a vacuum in the other half. Moreover, this sort of thing will keep happening: history repeats itself.
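For a system with only a handful of 'molecules', this recurrence can actually be watched. The toy sketch below (not from the book) runs the Ehrenfest urn model: N balls in two urns, one randomly chosen ball moved per step, starting with all balls in urn A; it estimates the average number of steps before that initial state recurs, which grows roughly like 2^N.

    import random

    def recurrence_time(N, rng):
        in_A = N                      # all N balls start in urn A
        steps = 0
        while True:
            steps += 1
            # a ball is picked uniformly at random and moved to the other urn
            if rng.random() < in_A / N:
                in_A -= 1
            else:
                in_A += 1
            if in_A == N:             # the initial state has recurred
                return steps

    rng = random.Random(1)
    for N in (4, 8, 12):
        times = [recurrence_time(N, rng) for _ in range(200)]
        print(N, sum(times) / len(times))   # mean recurrence time, roughly 2**N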
4.2.3 Relationship to mechanics: Liouville's theorem. Before exploring this theorem further, let us settle one question. What does an abstract dynamical system have to do with Newton's laws of motion or Einstein's equations? The answer is provided by Liouville's theorem. The classical form of this theorem is that evolution in classical phase space, according to classical mechanics, preserves volume. We state this in a more general form, of which many physicists seem unaware, so that it also applies to, for example, the relativistic case of geodesic flow on a manifold.

Theorem. Let X be a compact Hausdorff space, and T_t a group of homeomorphisms on X. Then there exists a regular Borel probability measure μ on X which is invariant under T_t, so that T_t is a group of automorphisms of (X, μ).

We postpone the proof to the appendix, and explain here only the hypotheses of the theorem in the context of Newton's laws. For a point mass, specification of 3 position coordinates and 3 momenta completely specifies the dynamical state. Thus, the dynamical state of a gas is specified by specifying a point in the 6N-dimensional Euclidean space R^6N. (If the gas molecules cannot be treated as point masses, one would need more degrees of freedom.) If the gas is confined to a finite volume, and energy is conserved, then the state must lie in a closed and bounded region of R^6N, which is compact by the Heine-Borel theorem. This provides the compact X of the theorem.

The group T_t specifies the evolutionary law. To say that each T_t is a homeomorphism means, first of all, that every state has a unique precursor, t seconds in the past, and a unique successor t seconds in the future. This requirement is met in classical mechanics, and is usually stated in the following form: 'Through every point of phase space passes a trajectory, and no two trajectories may cross each other.' (Thus, this requirement excludes singularities of the Hawking-Penrose type encountered in relativistic evolution.2) If the evolutionary law is specified, like Newton's law or Hamilton's equations, by an ordinary differential equation, this property follows from the existence and uniqueness theorems for ordinary differential equations. The homeomorphism requirement implies continuity, so that two states that are sufficiently close to each other will/did stay close to each other after/before t seconds. Thus, two rockets fired into space at a slight angle to each other may stray parsecs apart after a sufficient lapse of time. But we could, in principle, arrange matters so that after 1000 years the rockets are not more than 100 km apart. In the case of classical mechanics this last requirement follows from the theorem for ordinary differential equations which ensures continuous dependence of the solution on initial values.

To summarize, in classical mechanics, Liouville's theorem, and hence the Poincaré recurrence theorem, apply provided only that (i) the system is bounded and (ii) has a finite number of degrees of freedom.

4.3 The recurrence paradox

Poincaré, and later Zermelo, objected to the H-'theorem'. Since every state of the system recurs, the entropy, rather than progressing irreversibly towards a maximum, must behave almost periodically. The observed entropy increase must be a local matter. One cannot
Box 2. Mixing and sorting: Maxwell's demon
1. Abstract dynamical system
An abstract dynamical system (X, T_t) consists of a compact topological space X, and a group of homeomorphisms T_t on X. Alternatively, by Liouville's theorem, an abstract dynamical system (X, μ, T_t) is a group of automorphisms (measure-preserving, bijective maps) on a probability space (X, μ). In classical mechanics, the homeomorphisms T_t are obtained, on a subset X of the phase space R^2n, from Hamilton's equations.
2. Shuffling and ergodicity

One understands what is meant by shuffling a pack of cards. But what does shuffling mean for a gas in a box? What does it mean for an abstract dynamical system? If the cards are shuffled properly, one would expect every possible combination to be realized. Since each combination is a 'state', there is a finite probability of going from a given state to any other state in a finite number of steps: every state is accessible from any other state, so that each state is visited sometime. This is the definition of ergodicity for a finite Markov chain, and possibly the notion that Boltzmann, the atomist, had in mind.

For a continuous system, the classical ergodic hypothesis took the form that the trajectory visits every point in phase space. A plausible consequence of this hypothesis is that space and time averages coincide. But as stated, the hypothesis is false, and was soon shown to be so. Dimension is a topological invariant, the T_t's are homeomorphisms (and hence preserve topological invariants), the trajectory is one-dimensional, and phase space is multi-dimensional.
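The coincidence of space and time averages can be checked numerically for a simple system that is known to be ergodic. The sketch below (not from the book) uses the irrational rotation x → x + α (mod 1) on the circle and compares the time average of an arbitrary observable along one trajectory with its space average.

    import math

    alpha = math.sqrt(2) - 1                       # an irrational rotation number
    f = lambda x: math.cos(2 * math.pi * x) ** 2   # an arbitrary observable

    x, total, N = 0.1, 0.0, 200000
    for _ in range(N):
        total += f(x)
        x = (x + alpha) % 1.0                      # one step of the rotation

    print(total / N)   # time average along the trajectory
    print(0.5)         # space average: integral of cos^2(2*pi*x) over [0, 1]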
3. Quasi-ergodicity and metric transitivity

However, what was actually required was the interchange of space and time averages, and this could be achieved by the quasi-ergodic hypothesis: the trajectory comes arbitrarily close to any point in phase space.

⟨·,·⟩ denotes the inner product in H, and ψ is the state, with ||ψ|| = 1. For an arbitrary ψ, ⟨ψ, E(·)ψ⟩ is a regular Borel measure, which is a probability measure when the state is normalized. The correspondence with the textbook approach is obtained as follows. In the first place, the form of the operator (whether q above, or i ∂/∂p) is unimportant. A self-adjoint operator T is really characterized by giving its spectrum σ(T), together with multiplicities. For an observable T, the spectrum σ(T) ⊆ R and corresponds to measurable values. For, say, the position operator q, if we believe that the spectrum σ(q) = R, with no multiplicity, the spectral multiplicity theorem3 (Hahn-Hellinger theorem) allows us to recover the usual configuration-space representation. The theorem provides a unitary map between the abstract Hilbert space H and L²(σ(q)) ≡ L²(R), which carries q to the multiplication operator on L²(R), and the spectral measure E_q to the spectral measure E on L²(R) corresponding to multiplication by characteristic functions. Thus,

Pr(q ∈ A) = ⟨ψ, E(A)ψ⟩ = ∫_R χ_A |ψ|² = ∫_A |ψ|²   (4)
recovers the more usual form of the probability interpretation. The uncertainty principle is an easy consequence of non-commutativity (1) and the Schwarz inequality.

(iv) Schrödinger equation: The Schrödinger equation describes unitary evolution in this Hilbert space:
ψ(t) = U(t) ψ(0).   (5)
By Stone's theorem, any such (strongly continuous) one-parameter unitary group may be written as
U(t) = e^(-iHt),   (6)
where H is self-adjoint. Hence, the infinitesimal form of (5) reads:
∂ψ/∂t = -iHψ,   (7)
which is the more usual form of the Schrödinger equation. Physically, the infinitesimal generator H is identified with the Hamiltonian, or energy operator. It is an extraordinarily curious fact that, modulo commutativity, the quantum Hamiltonian is the same function of the canonical variables as the classical Hamiltonian (when the latter exists).

(v) The projection postulate: The naive formulation of the projection postulate, for the paradigmatic observable with discrete spectrum, is the following.4

'... any result of a measurement of a real dynamical variable is one of its eigenvalues ...,
... if the measurement of the observable ξ for the system in the state corresponding to |x⟩ is made a large number of times, the average of all the results obtained will be ⟨x|ξ|x⟩ ...,

... a measurement always causes the system to jump into an eigenstate of the dynamical variable that is being measured ....'

The result of measuring such an observable is always an eigenvalue. As a 'consequence' of the measurement process, the system is thrown (discontinuously) into the eigenstate corresponding to the measured eigenvalue. If the same measurement is repeated immediately, the same eigenvalue results. Von Neumann incorrectly5 supposed that the postulate could be generalized in a straightforward way to observables with continuous spectra.

The precise formulation, even for an observable with discrete spectrum, is messy. One must account for incomplete measurements, possible degeneracy (non-zero multiplicity in the spectrum), and superpositions. The last can be achieved by viewing states as density matrices (trace-class operators, provided by Gleason's theorem, normalized to trace unity; this theorem was described in Chapter VIA). Now, if T is an observable with discrete spectrum, the integral in (2) reduces to a sum,
T = Σ_j λ_j P_j,   (8)

where λ_j are the eigenvalues and P_j are the corresponding eigenprojections. Let
A ∈ B_R (A is a Borel subset of R). The act of measurement conditioned on the statement T ∈ A (a partial measurement) transforms the state ρ to the (non-normalized) state ρ',

ρ' = Σ_{λ_i ∈ A} P_i ρ P_i,   (9)
the sum being taken over those indices i for which λ_i ∈ A. In the case where A is a singleton, A = {λ}, (9) reduces to a complete measurement, and the system is thrown into an eigenstate. This process does not work if the observable is degenerate.
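Equations (8) and (9) are easy to exercise numerically. The following sketch (not the book's; the three-level observable and the state are invented for illustration) builds T = Σ λ_j P_j, applies a partial measurement conditioned on T ∈ A, and reads off the probability and the post-measurement state.

    import numpy as np

    lambdas = [-1.0, 0.0, 1.0]                     # eigenvalues of the observable
    P = [np.diag([1.0, 0.0, 0.0]),                 # the corresponding eigenprojections
         np.diag([0.0, 1.0, 0.0]),
         np.diag([0.0, 0.0, 1.0])]
    T = sum(l * Pj for l, Pj in zip(lambdas, P))   # equation (8)

    psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
    rho = np.outer(psi, psi)                       # a pure state as a density matrix

    A = {0.0, 1.0}                                 # condition on the statement "T in A"
    rho_prime = sum(Pj @ rho @ Pj for l, Pj in zip(lambdas, P) if l in A)   # equation (9)

    print(np.trace(rho_prime))                     # 2/3: probability that T lies in A
    print(rho_prime / np.trace(rho_prime))         # normalized post-measurement state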
3 From quantum logic to the formalism of q.m.

3.1 Non-commutativity and non-existence of joint distributions

The relationship of the axiomatic approach to the textbook approach may be clear, but the relation to phenomena remains a mystery. Understanding this mystery may require a long process of distillation. What are the chief new features of this formalism? We have already encountered the Hilbert space and the picture of unitary evolution in the context of classical statistical mechanics (Chapter IV, Box 3). One could extend this picture to represent dynamical variables by self-adjoint operators.
The new feature, however, is non-commutativity: the approach of classical statistical mechanics always results in a commuting algebra of observables. Since non-commuting operators cannot be simultaneously diagonalized, this non-commutativity gives rise to a peculiar difference between classical and quantum probabilities: a joint probability distribution does not exist for canonically conjugate (non-commuting) dynamical variables, as observed6 and later proved by Wigner.7 The chief problem then would seem8 to be the explanation of the origin of these peculiar quantum probabilities.

3.2 Need for a change of logic: failure of the distributive law

Now probabilities may be defined on a σ-algebra M of subsets of a given set X, using ∪, ∩ and complementation, or on a logic of sentences. In the latter setting, the usual measure-theoretic approach to probability is recovered by identifying the usual set-theoretic operations with the logical operations required to define them: 'not' with complementation, 'and' with ∩, 'or' with ∪, and implication with ⊆. The usual calculus of sentences results in a Boolean algebra isomorphic to the algebra of subsets of a given set.9

In the Birkhoff-von Neumann (BN) approach10 the peculiarities of quantum probabilities are explained by asserting that the logic of q.m. differs from classical logic in that the distributive law between 'and' and 'or' fails. In a double-slit experiment, to say that 'the electron reached the screen and passed through slit A or slit B' is not the same as saying that 'the electron reached the screen and passed through slit A or the electron reached the screen and passed through slit B'. In one case one gets a diffraction pattern, in the other case a superposition of two Gaussians. The failure of the distributive law means that a joint probability distribution cannot be defined; for example, the marginal distributions would fail to be additive:
Pr{a∈A & (b∈B or b∈C)} ≠ Pr{a∈A & b∈B} + Pr{a∈A & b∈C},   (10)
even if the 'or' is exclusive, i.e., B and C are disjoint. The BN approach, therefore, advocates a change of the logic on which the probabilities are defined. Probabilities, such as those on the left hand side of (4), are defined on sentences, but the 'and' and 'or' used to compound these sentences are such that the distributive law fails. One therefore obtains a more general algebraic structure, rather than the usual Boolean algebra (or σ-algebra), on which probabilities are to be defined, as countably additive, positive functionals with total mass 1.

3.3 Birkhoff-von Neumann approach and the orthodox formalism of q.m.

3.3.1 The lattice of projections. The BN approach begins by noticing that the subspaces11 of a Hilbert space form a lattice, or a 'logic', with the desired properties. One may identify a subspace of a Hilbert space with the orthogonal projection onto that subspace. We now define an 'and' (∧) and 'or' (∨) as follows. If P1 and P2 are two orthogonal projections on the subspaces R(P1) and R(P2) respectively, then P1 ∧ P2 is the projection on the subspace R(P1) ∩ R(P2), while P1 ∨ P2 is the projection on the smallest subspace which contains both R(P1) and R(P2). If the dimension of the Hilbert space is ≥ 2, and P1, P2, P3 are taken, as in Fig. 1, as
the projections on the x-axis, y-axis, and the line y = x, then it is clear that the distributive law fails: P1 ∧ (P2 ∨ P3) = P1, but P1 ∧ P2 = 0, P1 ∧ P3 = 0, and 0 ∨ 0 = 0, so that P1 ∧ (P2 ∨ P3) ≠ (P1 ∧ P2) ∨ (P1 ∧ P3). Conversely, the algebraic structure corresponding to the usual sentence calculus is a distributive lattice or a Boolean algebra. Thus, the Hilbert space is related naturally to the failure of the distributive law.

3.3.2 Order relation and orthocomplement. The axioms for the usual sentence calculus may be formulated in terms of 'not', 'and' (∧) and 'or' (∨), or, more usually, in terms of 'not' (¬) and implication (⇒). In algebraic terms, following the algebraization of logic initiated by Boole, 'if' and 'not' may be described respectively by an order relation12 ≤, and an orthocomplement '. These may be used to describe the properties of the lattice of projections P: P1 ≤ P2 exactly if R(P1) ⊆ R(P2), i.e., the subspace onto which P1 projects must be a subset of the subspace onto which P2 projects. The orthocomplement P' is the projection on the null space of P, denoted by N(P).

Fig. 1: Failure of the distributive law

If the projections P1, P2, P3 are defined by P1a = a_x, P2a = a_y, P3a = a_l (the components of a vector a along the x-axis, the y-axis, and the line y = x), then the join of any two of these is the projection on the plane, while their meet is zero. Hence the join of any two meets is zero, and cannot equal the meet of any one projection with the join of the other two.
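The failure of the distributive law in Fig. 1 can be verified with small matrices. In the sketch below (not from the book), the join of two projections is computed as the projection onto the span of their ranges, and the meet via De Morgan's law P ∧ Q = (P' ∨ Q')'.

    import numpy as np

    def proj(v):
        v = np.asarray(v, dtype=float)
        return np.outer(v, v) / (v @ v)            # projection onto the line spanned by v

    def join(P, Q):
        # projection onto the span of the two ranges (orthonormal basis via SVD)
        U, s, _ = np.linalg.svd(np.hstack([P, Q]))
        B = U[:, : int(np.sum(s > 1e-10))]
        return B @ B.T

    def meet(P, Q):
        I = np.eye(P.shape[0])
        return I - join(I - P, I - Q)              # P and Q = not (not P or not Q)

    P1, P2, P3 = proj([1, 0]), proj([0, 1]), proj([1, 1])

    print(np.round(meet(P1, join(P2, P3)), 6))             # P1 and (P2 or P3) = P1
    print(np.round(join(meet(P1, P2), meet(P1, P3)), 6))   # (P1 and P2) or (P1 and P3) = 0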
Given an order relation, the 'and' and 'or' may be re-interpreted: P1 ∧ P2 is the greatest lower bound (g.l.b., infimum), while P1 ∨ P2 is the least upper bound (l.u.b., supremum) of the two-element set {P1, P2}. One would expect de Morgan's laws to hold. One may also define the notion of orthogonality: P ⊥ Q if P ≤ Q'.

3.3.3 Geometrical interpretation. All these notions have simple geometrical meanings in 3-dimensional Euclidean space. The closed subspaces are: the point at the origin, lines through the origin (extended to infinity in both directions), planes through the origin, and the whole space. The orthogonal projections are precisely that: if P is the projection onto a line or a plane, the result of applying P to a vector is obtained by dropping a perpendicular to the line or the plane in question. The partial order is set-theoretic inclusion, and the infimum is the set-theoretic intersection. The l.u.b. of two lines is the plane they span. The orthocomplement of a line is the plane perpendicular to it. Orthogonality just means perpendicularity of the corresponding subspaces.

3.3.4 Dynamical variables, random variables and self-adjoint operators. Apart from the failure of the distributive law, what is the point of studying the lattice of projections? An immediate application is that it leads to the operator representation of observables (hence the probability interpretation) in a natural way. The first step is that classically an observable or a dynamical variable is a random
variable. One observes the dynamical variable, and the observed values vary or show some dispersion or scatter. Now, in the usual measure-theoretic approach to probabilities, a random variable is a measurable function. Given a set X, and a Boolean σ-algebra, M, of subsets of X, a (real-valued) random variable is a function f: X → R, such that f⁻¹(A) ∈ M whenever A ∈ B_R. What one actually requires for this mysterious textbook definition is the inverse map, f⁻¹: B_R → M, which is an isomorphism between the two σ-algebras:
f⁻¹(∪ A_i) = ∪ f⁻¹(A_i),   f⁻¹(∩ A_i) = ∩ f⁻¹(A_i),   f⁻¹(A^c) = (f⁻¹(A))',   (11)
where i will always denote an index running over finite or countable values. When the setting is changed from a σ-algebra to the lattice P(H) of projections on a Hilbert space H, it is, therefore, natural to define a random variable in this way as an isomorphism m: B_R → P(H) into a Boolean subalgebra of P(H):

m(∪ A_i) = ∨ m(A_i),   m(∩ A_i) = ∧ m(A_i),   (12)

The definition (12) makes sense for a broader class of algebraic structures. But if the basic algebraic structure is the lattice of projections on a Hilbert space, then (12) means that an observable, or a random variable, is automatically a projection-valued measure or a self-adjoint operator.

Similarly, one can define a state, or a probability measure, on the lattice of projections. The textbook definition of a (positive) measure is the following. Given (X, M) as above, a measure μ is a function μ: M → R⁺ such that (i) μ(∅) = 0, (ii) if the A_i's are pairwise disjoint, μ(∪ A_i) = Σ μ(A_i). A probability measure satisfies μ(X) = 1. In the lattice of projections, the notion of disjointness is replaced by orthogonality: if P ⊥ Q then P ∧ Q = 0. So, one uses the conditions that (i) μ(0) = 0, (ii) μ(∨ P_i) = Σ μ(P_i), if the P_i's are pairwise orthogonal. A probability measure satisfies μ(1) = 1. Gleason's theorem now recovers the usual density matrix approach.
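The density-matrix form of such a measure can be checked directly: taking μ(P) = tr(ρP) for a density matrix ρ, as Gleason's theorem suggests, gives μ(0) = 0, additivity on pairwise orthogonal projections, and μ(1) = 1. The sketch below (not from the book; the state and the orthonormal basis are randomly generated) is one such check.

    import numpy as np

    rng = np.random.default_rng(0)

    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    rho = A @ A.conj().T
    rho = rho / np.trace(rho)                    # a density matrix: positive, trace one

    # three mutually orthogonal rank-one projections from a random orthonormal basis
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
    P = [np.outer(Q[:, k], Q[:, k].conj()) for k in range(3)]

    mu = lambda proj: np.trace(rho @ proj).real

    print(mu(np.zeros((3, 3))))                      # mu(0) = 0
    print(mu(P[0] + P[1]), mu(P[0]) + mu(P[1]))      # additive on orthogonal projections
    print(mu(np.eye(3)))                             # mu(1) = 1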
3.4 The quantum logic approach

3.4.1 The minimum requirements. To summarize the preceding section, the crucial distinguishing feature of quantum probabilities is the non-existence of joint distributions. In the BN approach, this is achieved by defining the probabilities on the non-distributive
lattice of projections on a Hilbert space. The operator representation of dynamical variables (or the 'probability interpretation') emerges as a bonus in this approach. But the mysterious Hilbert space is still in the background. Can one account for this too, starting only from a new type of 'and' and 'or', or a new type of 'if' and 'not'? The quantum logic approach tries to whittle down, a little further, the mystery of quantum axiomatics.

What is the minimum structure necessary to speak of probabilities without a joint distribution? Suppose we are given a set P, together with an order relation ≤, and an orthocomplement '. Such a triple (P, ≤, ') is called an orthoposet. For the above definition of random variable and probability measure to go through, the right hand sides should make sense. The minimum requirement is that P should be σ-orthocomplete, i.e., every countable collection of mutually orthogonal elements of P should admit a supremum in P.

3.4.2 Orthomodularity. There is one more technical requirement: P should be orthomodular (see box). Orthomodularity is a weak form of the distributive law: distributivity ⇒ modularity ⇒ orthomodularity. The lattice of projections on a Hilbert space H is orthomodular, but fails to be modular if H is infinite dimensional. The point of orthomodularity is this: if P fails to be orthomodular, measures on P may fail to be monotone. That is, we could have13 a ≤ b, but μ(a) > μ(b).