English Pages 129 [130] Year 2017
Joachim Schröter
Minkowski Space
The Spacetime of Special Relativity
De Gruyter Studies in Mathematical Physics, Volume 40
Edited by Michael Efroimsky, Bethesda, Maryland, USA; Leonard Gamberg, Reading, Pennsylvania, USA; Dmitry Gitman, São Paulo, Brazil; Alexander Lazarian, Madison, Wisconsin, USA; Boris Smirnov, Moscow, Russia
Mathematics Subject Classification 2010 Primary: 83; Secondary: 57
Author: Prof. Dr. Joachim Schröter, University of Paderborn, Department of Physics, Pohlweg 55, 33098 Paderborn
Translator: Dr. Christian Pfeifer, University of Bremen, Center of Applied Space Technology and Microgravity, Am Fallturm, 28359 Bremen
ISBN 978-3-11-048457-1 e-ISBN (PDF) 978-3-11-048573-8 e-ISBN (EPUB) 978-3-11-048461-8 Set-ISBN 978-3-11-048574-5 ISSN 2194-3532
Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2017 Walter de Gruyter GmbH, Berlin/Boston Typesetting: PTP-Berlin, Protago-TEX-Production GmbH, Berlin Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com
Contents
Introduction
1 Basic properties of special relativity
1.1 Special relativity as a special case of general relativity
1.2 Connecting Lorentz transformations and Lorentz matrices
1.3 Group properties
2 Further properties of Lorentz matrices
2.1 Supplements to Proposition 1.7
2.2 Proper, orthochronous, and antichronous Lorentz matrices
2.3 Special Lorentz matrices
2.4 Subgroups of L
3 Further properties of Lorentz transformations
3.1 Subgroups of P
3.2 A condition for special Lorentz transformations
3.3 A condition for orthochronous Lorentz transformations
4 Decomposition of Lorentz matrices and Lorentz transformations
4.1 The decomposition theorem for Lorentz matrices
4.1.1 Notations and assumptions
4.1.2 Theorem and proof
4.1.3 Remarks on the interpretation of the decomposition theorem
4.1.4 Decomposition of nonorthochronous Lorentz matrices
4.2 The decomposition theorem for Lorentz transformations
4.3 Nonuniqueness of the decomposition of Lorentz matrices
4.4 The decomposition of products
4.4.1 Preliminary remarks
4.4.2 The theorem of relativistic addition of velocities
4.4.3 Decomposition of a product L = L ⋅ L
4.5 Parameter representation of Lorentz matrices
5 Further structures on M^s
5.1 Introductory remarks
5.2 Vector space structure
5.3 Topology on M^s
6 Tangent vectors in Ms
6.1 Decomposition of Lorentz vector spaces
6.2 Timelike tangent vectors
6.3 Spacelike tangent vectors
6.4 Some conclusions
6.5 Non-Minkowskian coordinates
7 Orientation
7.1 Time orientation
7.1.1 Definitions
7.1.2 Time orientation on Ms
7.2 Orientation of vector bases
7.3 Orientations on M^s
7.3.1 Introductory remarks
7.3.2 Time orientation
7.3.3 Chronal and causal relations
8 Kinematics on M^s
8.1 Introductory remarks
8.2 Worldlines, signals, observers
8.3 Clocks
8.4 Newtonian notions in special relativity
8.5 Radar charts in Ms
8.6 Time dilation
8.7 Length contraction
8.8 Aberration of light
9 Some basic notions of relativistic theories
9.1 Manifolds
9.2 Tangent vectors
9.3 Cotangent vectors
9.4 Lorentz vector spaces
9.5 Direct decomposition of Lorentz vector spaces
9.6 Tensors
9.7 Lorentzian manifolds
Epilogue
Bibliography
Index
Introduction
Special relativity (SR) is the most important predecessor of general relativity (GR). There is no doubt that the insights Einstein gained by developing special relativity were necessary to create general relativity. Even though special relativity still showed some features of Newtonian physics, the theory was not fully accepted in the physics community after its publication in 1905 (see [1]). Important reasons were, on the one hand, that space and time lost their absolute status, which was completely new and radical in those days, and, on the other hand, that it was not possible to include the gravitational field in the special theory. It was decisive for the further development of the theory that Minkowski combined space and time into one entity, spacetime (see [2]). Finally, the description of gravitation was achieved satisfactorily ten years later with the formulation of general relativity in 1915 (see [3, 4]). Due to the huge success of general relativity it took quite some time until it was realized that the heuristic and inductive arguments which led to Einstein’s geometric description of gravity by a curved spacetime were not completely satisfactory. The need for clarification of this situation was first brought forward by Hans Reichenbach in 1925 [5]. These concerns about the arguments for the spacetime picture of relativity started the development of “space-time theories” (STTs), which should clarify the nature of space and time independently of special and general relativity. The aim of these theories is to find solid, rigorous arguments for the structure of spacetime in general, solely from the most elementary observable properties of matter in the widest sense. Logically such STTs are the theoretical bases, also called pretheories, of the specific spacetime pictures of special and general relativity.
Nowadays, there exist two mature STTs: the EPS axiomatic [6, 7] and the Sch2 theory [8–12], where [12] contains a complete collection of articles on STTs until 1997. Even though the two theories start their investigations from different ansatzes, they agree in their conclusion: Each physical spacetime M can be described mathematically as a 4-dimensional Lorentz $C^k$ manifold with $k > 2$, i.e., every coordinate transformation on M is continuously differentiable at least three times. Moreover, the elements, or points, p of M are to be interpreted as representatives of point-like physical events. A precise definition of these manifolds will be given in Sections 9.1 and 9.7. Throughout the literature on relativity the definitions of spacetime and Lorentzian manifolds vary slightly from source to source. In particular, in addition to topological requirements on spacetimes, it is sometimes demanded that they are orientable and time-orientable, which, however, is not necessary for our goal in this book. This means that we will characterize Minkowski spacetime by some axioms as simply as possible and then deduce all its other properties from these axioms. Consequently we do not formulate partial theories of special relativity, such as, for instance, mechanics of point particles or of continua, electrodynamics, thermodynamics, etc. Rather the following text is, besides being the explanation of Minkowski spacetime, a prolegomenon to the comprehension of partial theories of special relativity, such as the ones mentioned above.
DOI 10.1515/9783110485738-001
1 Basic properties of special relativity
1.1 Special relativity as a special case of general relativity
1.1.1 The following definition is the basis of the spacetime structure of special relativity:
Definition 1.1. Minkowski spacetime, or Minkowski space for short, is a manifold $\mathcal{M}_s = (M^s, \mathcal{A}_s, g^s)$ for which the following holds:
(1) $M^s$ is a set.
(2) $\mathcal{A}_s$ is a $C^k$-atlas on $M^s$ with $k \ge 3$.
(3) There exists a global chart $(M^s, \varphi)$ in $\mathcal{A}_s$, i.e.,
$$\varphi : M^s \to \mathbb{R}^4 \tag{1.1}$$
is bijective.
(4) $g^s$ is a (0, 2)-tensor field on $M^s$, called the metric.
(5) In the coordinates $x = \varphi(p)$, $p \in M^s$, defined by the global chart $(M^s, \varphi)$ the metric takes the form
$$g^s(p) = dx^1 \otimes dx^1 + dx^2 \otimes dx^2 + dx^3 \otimes dx^3 - dx^4 \otimes dx^4. \tag{1.2}$$
Properties (1) and (2) (with k ≥ 1) are the usual axioms of differentiable manifolds (Section 9.1). It follows that at every point p ∈ Ms there exists a tangent vector space T p Ms (Section 9.2) and its dual, the cotangent vector space T p∗ Ms (Section 9.3), as well as all of their tensor products (Section 9.6) and the corresponding differentiable tensor fields (Section 9.7). Properties (1)–(5) specify that Ms is a semi-Riemannian manifold (Section 9.7), by the fact that the metric g s (p) is an indefinite inner product in the tangent vector spaces T p Ms (Section 9.4). The objects dx κ , κ = 1, . . . , 4, used to formulate the metric in equation (1.2) are basis vectors in T p∗ Ms . In Chapters 1–5 these structures are not used very often; they become more important in Chapters 6–8. Their precise definition and properties can be found in Chapter 9 and in the literature, for example in [13–16]. Properties (3)–(5) in Definition 1.1 imply the characteristic properties of Minkowski spacetime which we will study in this and in the following sections in detail. Then in Chapter 5 we can conclude that the manifold Ms = (M s , As , g s ) is a Lorentz manifold and a Lorentz vector space at the same time. These properties are not assumed as axioms, but are deduced from the axioms in Definition 1.1. Additionally they imply that Minkowski spacetime is orientable as well as time-orientable.
DOI 10.1515/9783110485738-002
1.1.2 In order for $\mathcal{M}_s$ to be connected to general relativity the metric $g^s$ must be a solution of Einstein’s field equations
$$R - \tfrac{1}{2}\bar{R}\, g^s + \Lambda_0\, g^s = \kappa_0 T.$$
Here $R$ is the Ricci tensor, $\bar{R}$ the Ricci curvature scalar, $\Lambda_0$ the so-called cosmological constant, $T$ an energy momentum tensor, and $\kappa_0 = 8\pi c^{-4} G$ the Einsteinian gravitational constant, which consists of a combination of Newton’s constant $G$ and the speed of light $c$. The left-hand side of the equation is determined solely by the spacetime metric, while the right-hand side contains the metric and physical matter fields. We are considering a spacetime metric according to equation (1.2). The question now is: Are there matter fields which yield an energy momentum tensor such that the metric $g^s$, as in equation (1.2), satisfies the Einstein equations? To answer this question we expand the Einstein equations in the coordinates $x = \varphi(p)$, $p \in M^s$, defined in equation (1.1). Let $g^s_{\alpha\beta}$, $\alpha, \beta = 1, \dots, 4$, be the components of $g^s$ in this coordinate system; then we find from equation (1.2) that for $\alpha, \beta = 1, \dots, 4$ the following holds:
$$g^s_{\alpha\beta} = \eta_{\alpha\beta} := \sum_{j=1}^{3} \delta_{\alpha j}\,\delta_{\beta j} - \delta_{\alpha 4}\,\delta_{\beta 4}, \tag{1.3}$$
and thus $\eta_{\alpha\alpha} = 1$ for $\alpha = 1, 2, 3$, $\eta_{44} = -1$, and $\eta_{\alpha\beta} = 0$ for $\alpha \ne \beta$. This insight allows for the following.
Conclusion 1.2.
(1) The Christoffel symbols (see e.g., [16, p. 301]) calculated with equation (1.3) vanish identically, and hence the Ricci tensor and the Ricci scalar also vanish. The Einstein equations reduce to
$$\Lambda_0\, g^s = \kappa_0 T. \tag{1.4}$$
(2) Since $\kappa_0 > 0$, equation (1.4) can only be solved if $T = \hat{p}\, g^s$ with $\hat{p} = \kappa_0^{-1} \Lambda_0$. So $T$ must vanish for $\Lambda_0 = 0$. Thus, in this case, there are no gravitating matter fields.
(3) However, according to latest simulations [17, 18] $\Lambda_0$ is nonvanishing and satisfies $\Lambda_0 = 0.7\,\rho_c$, where $\rho_c$ is the critical density of the universe, i.e. $\Lambda_0 > 0$ and $\hat{p} > 0$. Thus, $T = \hat{p}\, g^s$ describes a perfect fluid with constant pressure $\hat{p}$ and constant energy density $\mu = -\hat{p}$ ([15, p. 70], [19, p. 85]). Hence there exists an exact solution for equation (1.4) for an interpretable energy momentum tensor $T$. The only problem is that to date there is no fluid known with constant pressure and negative energy density.
(4) An alternative conclusion to (3) is to consider $\Lambda_0$ to be a given constant which can be used for numerical approximations. The metric $g^s$ has only diagonal entries. Thus, it can only be an approximate solution of equation (1.4) if the off-diagonal terms in $T$ can be neglected. This means that for $\alpha, \beta = 1, \dots, 4$ the relations $\kappa_0 |T_{\alpha\beta}| \ll \Lambda_0$, $\alpha \ne \beta$, and $\kappa_0 T_{\alpha\alpha} \approx \Lambda_0 \eta_{\alpha\alpha}$ hold.
Consequently special relativity can be used only as an approximation of a more general spacetime geometry in regions of spacetime where these conditions are satisfied. The question of which physical situations satisfy them must be discussed depending on the system one wishes to describe. Even though we just concluded that Minkowski spacetime and special relativity have only restricted physical significance, in the following we will mainly investigate the properties of Minkowski spacetime. Its importance originates in its role as starting point and foundation for the development of general relativity. As usual in relativity we will employ the Einstein sum convention.
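As a brief supplementary sketch of Conclusion 1.2(1), and assuming the standard coordinate expression of the Christoffel symbols of a metric, the vanishing can be read off directly from the fact that the components $\eta_{\alpha\beta}$ are constant. The symbols $\Gamma^{\lambda}_{\alpha\beta}$, $\partial_\alpha$ and the inverse components $\eta^{\lambda\sigma}$ are notation introduced here for illustration, not taken from the text above.

```latex
\Gamma^{\lambda}_{\alpha\beta}
  = \tfrac{1}{2}\,\eta^{\lambda\sigma}
    \left( \partial_{\alpha}\eta_{\sigma\beta}
         + \partial_{\beta}\eta_{\sigma\alpha}
         - \partial_{\sigma}\eta_{\alpha\beta} \right) = 0
\quad\Longrightarrow\quad R = 0,\ \ \bar{R} = 0
\quad\Longrightarrow\quad \Lambda_0\, g^{s} = \kappa_0\, T .
```

The curvature quantities are built from the Christoffel symbols and their derivatives, so only the cosmological term survives on the left-hand side of the field equations, which is equation (1.4).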
1.1.3 Another notation convention we use is that $x \in \mathbb{R}^4$ is a column vector. The only exception is if $x$ is used as the argument of a function $f$ as in $f(x^1, x^2, x^3, x^4) = f(x^T)$, for which we write $f(x)$ for simplicity. In Definition 1.1, the definition of Minkowski spacetime, we required the existence of a global chart $(M^s, \varphi)$. It is commonly specified as follows.
Definition 1.3. Let $\mathcal{M}_s = (M^s, \mathcal{A}_s, g^s)$ be Minkowski spacetime, as defined in Definition 1.1, and let $(M^s, \varphi)$ be the specified global chart in $\mathcal{A}_s$. Moreover, let $(M^s, \varphi')$ be an arbitrary chart satisfying
$$\varphi' : M^s \to \mathbb{R}^4 \tag{1.5}$$
and
$$g^s = \eta_{\alpha\beta}\, dx'^{\alpha} \otimes dx'^{\beta}, \qquad x' = \varphi'(p),\ p \in M^s. \tag{1.6}$$
We call every chart $(M^s, \varphi')$, or simply $\varphi'$, of this kind a Minkowski chart and the corresponding coordinates $x' = \varphi'(p)$, $p \in M^s$, Minkowski coordinates.
By definition all of these Minkowski charts are included in the atlas $\mathcal{A}_s$. In the next section we will see that there exists more than one Minkowski chart on Minkowski spacetime. The transformations between different Minkowski charts $(M^s, \varphi')$ and $(M^s, \varphi'')$ will be of particular interest.
Definition 1.4. Let $\phi := \varphi'' \circ \varphi'^{-1}$. Then $\phi : \mathbb{R}^4 \to \mathbb{R}^4$ is bijective and of class $C^k$, $k \ge 3$. The function $\phi$ is called a Lorentz transformation (LT). Thus, if $x' = \varphi'(p)$ and $x'' = \varphi''(p)$ we have $x'' = \phi(x')$.
Another important notion in special relativity is the notion of so-called Lorentz matrices.
Definition 1.5. Let $L = ((L^{\alpha}_{\beta}))$ be a 4×4 matrix, and let $\eta = ((\eta_{\alpha\beta})) = \mathrm{diag}(1, 1, 1, -1)$, as in equation (1.3). In $L^{\alpha}_{\beta}$ the index $\alpha$ labels a row and the index $\beta$ a column. Moreover, $L^{\alpha}_{\beta}$ shall satisfy
$$\eta_{\kappa\lambda}\, L^{\kappa}_{\alpha}\, L^{\lambda}_{\beta} = \eta_{\alpha\beta}, \tag{1.7}$$
or in matrix notation
$$L^T \cdot \eta \cdot L = \eta. \tag{1.8}$$
Such a matrix $L$ is called a Lorentz matrix (LM). In Section 1.2 we clarify the connection between Lorentz transformations and Lorentz matrices.
Remark 1.6.
(1) Minkowski charts and Minkowski coordinates can also exist on manifolds which are not Minkowski spacetimes. More precisely, on a semi-Riemannian manifold $\mathcal{M} = (M, \mathcal{A}, g)$ with metric $g$ of signature 2, there may exist a chart $(V, \varphi)$ with $V \subset M$ and with coordinates $x = \varphi(p)$, $p \in V$, such that
$$g(p) = \eta_{\alpha\beta}\, dx^{\alpha} \otimes dx^{\beta}. \tag{1.9}$$
In case such a chart exists it is called Minkowski chart in M . More generally it can be shown (see Section 9.7) that on every n-dimensional semi-Riemannian manifold with metric of signature n − 2 there exist coordinates around each point p ∈ M such that equation (1.9) holds at this chosen point. In this case η αβ is defined like in equation (1.3), but with 4 replaced by n and 3 by n − 1. These coordinates are called local Minkowski coordinates. They are a special case of so-called normal coordinates. (2) The choice (1.1) suggests a very handy, dimensionless formulation of special relativity. A formulation including dimensions like length or time can be obtained by replacing ℝ with ℝ ⋅ l, where l represents the dimension of choice. The details of this procedure are discussed in [14, p. 9, Sect. 0.1.4]. (3) Minkowski coordinates are the coordinates mostly used in the discussion of special relativity. They are comparably simple as Cartesian coordinates are in Euclidean space. This, however, does not mean that they are the only possible choice. Any other chart which is C k -compatible, k ≥ 3, belongs to the atlas As of Minkowski spacetime; for example polar coordinates (see [15, p. 118]) and the coordinates defined in Section 6.5.
1.2 Connecting Lorentz transformations and Lorentz matrices
We can deduce several fundamental properties of the Lorentz matrices from their defining equation (1.7) or (1.8).
Conclusion 1.7.
(1) For $L$ being a LM we take the determinant of equation (1.8). Because of $\det L^T = \det L$ and $\det \eta = -1$ we obtain $\det L = \pm 1$. Thus, $L$ is not singular, and its inverse exists.
(2) If $L$ is a LM, the matrix $L^T$ is also a LM. From equation (1.8) it follows that
$$\eta = (L^T)^{-1} \cdot \eta \cdot L^{-1}. \tag{1.10}$$
Taking the inverse on both sides of equation (1.10) and using $\eta^{-1} = \eta$ we obtain
$$\eta = L \cdot \eta \cdot L^T. \tag{1.11}$$
(3) Because of equation (1.10) and $(L^T)^{-1} = (L^{-1})^T$ we see that for every LM $L$ its inverse $L^{-1}$ is a LM.
(4) The unit matrix $1_4 := \mathrm{diag}(1, 1, 1, 1)$ and $\eta$ itself are LMs.
(5) For $L_1$ and $L_2$ being LMs their product $L_1 \cdot L_2$ is also a LM since
$$(L_1 \cdot L_2)^T \cdot \eta \cdot L_1 \cdot L_2 = L_2^T \cdot L_1^T \cdot \eta \cdot L_1 \cdot L_2 = L_2^T \cdot \eta \cdot L_2 = \eta. \tag{1.12}$$
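The statements of Conclusion 1.7 can be checked on concrete matrices. The following is a minimal sketch in plain Python with exact rational arithmetic; the helper names (`matmul`, `transpose`, `det`, `is_lorentz`) and the two sample matrices `BOOST` and `ROT` are illustrative choices made here, not objects defined in the text at this point.

```python
from fractions import Fraction

# Metric matrix eta = diag(1, 1, 1, -1) from equation (1.3).
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j, entry in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_lorentz(m):
    """True if m satisfies L^T . eta . L = eta, equation (1.8)."""
    return matmul(matmul(transpose(m), ETA), m) == ETA

# Sample matrices (assumptions for illustration): a boost-type matrix with
# v = 3/5, for which k = (1 - v^2)^(-1/2) = 5/4 is rational, and a rotation
# by 90 degrees in the plane of the first two coordinates.
v, k = Fraction(3, 5), Fraction(5, 4)
BOOST = [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]
ROT = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

product = matmul(BOOST, ROT)
print(is_lorentz(BOOST), is_lorentz(transpose(BOOST)), is_lorentz(product))
print(det(BOOST), det(ROT), det(ETA))
```

Exact fractions are used so that the defining relation can be tested with `==` instead of a floating-point tolerance.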
Later on we will deduce further properties of Lorentz matrices. In this section we want to clarify the relation between Lorentz matrices and Lorentz transformations. The answer to this question follows from a theorem which does not only hold on Minkowski spacetime but more generally. The idea of its proof is taken from the book on special relativity by A. Papapetrou [20]. We state the following lemma.
Lemma 1.8. Consider a 4-dimensional semi-Riemannian $C^k$-manifold $\mathcal{M} = (M, \mathcal{A}, g)$ with $k \ge 2$, and metric of signature 2.
(1) Given two Minkowski charts $(V, \psi)$ and $(V', \psi')$ of $\mathcal{A}$ (see Remark 1.6) with their domains satisfying $V \cap V' =: N \ne \emptyset$, i.e. $\psi[N]$ and $\psi'[N]$ are open sets and the map
$$\phi := \psi' \circ \psi^{-1} : \psi[N] \to \psi'[N] \tag{1.13}$$
is $C^k$. In this case there exists precisely one LM $L$ and one $a \in \mathbb{R}^4$ such that for all $x \in \psi[N]$ we have
$$x' = \phi(x) = L \cdot x + a, \tag{1.14}$$
while for all $x' \in \psi'[N]$
$$x = \phi^{-1}(x') = L^{-1} \cdot x' - L^{-1} \cdot a \tag{1.15}$$
holds.
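(2) Let $(W, \varphi)$ be a Minkowski chart of $\mathcal{A}$, $L$ be a LM and $b \in \mathbb{R}^4$. Define $\varphi'$ as
$$\varphi'(p) = L \cdot \varphi(p) + b \tag{1.16}$$
for all $p \in W$. Then $(W, \varphi')$ is a Minkowski chart of $\mathcal{A}$.
Proof. (1) Using equation (1.13) and the transformation law of covectors (see Conclusion 9.16 and equation (9.49)), we obtain
$$dx'^{\lambda} = \frac{\partial \phi^{\lambda}}{\partial x^{\varrho}}\, dx^{\varrho}. \tag{1.17}$$
Since for both $\psi$ and $\psi'$ equation (1.6) holds, one obtains
$$\eta_{\lambda\mu}\, \frac{\partial \phi^{\lambda}}{\partial x^{\varrho}}\, \frac{\partial \phi^{\mu}}{\partial x^{\sigma}}\, dx^{\varrho} \otimes dx^{\sigma} = \eta_{\varrho\sigma}\, dx^{\varrho} \otimes dx^{\sigma}. \tag{1.18}$$
Now $dx^{\varrho} \otimes dx^{\sigma}$, $\varrho, \sigma = 1, \dots, 4$, are linearly independent (see Conclusion 9.47) which implies
$$\eta_{\lambda\mu}\, \frac{\partial \phi^{\lambda}}{\partial x^{\varrho}}\, \frac{\partial \phi^{\mu}}{\partial x^{\sigma}} = \eta_{\varrho\sigma}. \tag{1.19}$$
As an intermediate result we conclude that the matrix
$$L := \frac{\partial \phi}{\partial x} := \left(\!\left( \frac{\partial \phi^{\lambda}}{\partial x^{\varrho}} \right)\!\right) \tag{1.20}$$
is a Lorentz matrix which may be dependent on $x$. Next we show that $L$ is actually independent of $x$. Differentiation of equation (1.19) with respect to $x^{\kappa}$ yields
$$\eta_{\lambda\mu} \left( \frac{\partial^2 \phi^{\lambda}}{\partial x^{\varrho}\, \partial x^{\kappa}}\, \frac{\partial \phi^{\mu}}{\partial x^{\sigma}} + \frac{\partial \phi^{\lambda}}{\partial x^{\varrho}}\, \frac{\partial^2 \phi^{\mu}}{\partial x^{\sigma}\, \partial x^{\kappa}} \right) = 0. \tag{1.21}$$
We introduce the following useful abbreviations:
$$A_{\varrho\kappa\sigma} := \eta_{\lambda\mu}\, \frac{\partial^2 \phi^{\lambda}}{\partial x^{\varrho}\, \partial x^{\kappa}} \cdot \frac{\partial \phi^{\mu}}{\partial x^{\sigma}}, \qquad B_{\varrho\sigma\kappa} := A_{\varrho\kappa\sigma} + A_{\sigma\kappa\varrho}, \tag{1.22}$$
which satisfy
$$A_{\sigma\kappa\varrho} = A_{\kappa\sigma\varrho}, \tag{1.23}$$
since $\phi$ is $C^k$, $k \ge 2$, and
$$B_{\varrho\sigma\kappa} = 0, \tag{1.24}$$
since equation (1.21) holds. Cyclic permutation of the indices $\varrho, \sigma, \kappa$ does not change the form of equation (1.21), and thus, $B_{\kappa\varrho\sigma} = 0$ and $B_{\sigma\kappa\varrho} = 0$ must hold. This implies
$$B_{\varrho\sigma\kappa} + B_{\kappa\varrho\sigma} - B_{\sigma\kappa\varrho} = 0. \tag{1.25}$$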
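Expanding equation (1.25) gives for $\sigma, \kappa, \varrho = 1, \dots, 4$
$$0 = A_{\varrho\kappa\sigma} + A_{\sigma\kappa\varrho} + A_{\kappa\sigma\varrho} + A_{\varrho\sigma\kappa} - A_{\sigma\varrho\kappa} - A_{\kappa\varrho\sigma} = 2 A_{\sigma\kappa\varrho}. \tag{1.26}$$
Hence
$$A_{\sigma\kappa\varrho} := \eta_{\lambda\mu}\, \frac{\partial^2 \phi^{\lambda}}{\partial x^{\sigma}\, \partial x^{\kappa}} \cdot \frac{\partial \phi^{\mu}}{\partial x^{\varrho}} = 0. \tag{1.27}$$
Employing
$$\frac{\partial \phi^{\mu}}{\partial x^{\varrho}} \cdot \frac{\partial (\phi^{-1})^{\varrho}}{\partial x'^{\nu}} = \delta^{\mu}_{\nu}$$
equation (1.27) yields
$$A_{\sigma\kappa\varrho}\, \frac{\partial (\phi^{-1})^{\varrho}}{\partial x'^{\nu}} = \eta_{\lambda\nu}\, \frac{\partial^2 \phi^{\lambda}}{\partial x^{\sigma}\, \partial x^{\kappa}} = 0. \tag{1.28}$$
Since $\eta$ is nonsingular, and using equation (1.20) we find
$$\frac{\partial}{\partial x^{\sigma}} L^{\lambda}_{\kappa} = \frac{\partial^2 \phi^{\lambda}}{\partial x^{\sigma}\, \partial x^{\kappa}} = 0 \tag{1.29}$$
for all $\kappa, \lambda, \sigma = 1, \dots, 4$ and all $x \in \psi[N]$. Thus, $\phi$ is linear:
$$x' = \phi(x) = L \cdot x + a. \tag{1.30}$$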
(2) Let $x' = \varphi'(p)$ and $x = \varphi(p)$, $p \in W$. From equation (1.16) it follows for the coordinate transformation $\phi = \varphi' \circ \varphi^{-1}$ that
$$x' = \phi(x) = L \cdot x + b \tag{1.31}$$
holds for all $x \in \varphi[W]$. Hence we find $\frac{\partial \phi^{\alpha}}{\partial x^{\kappa}} = L^{\alpha}_{\kappa}$. Moreover, $(W, \varphi)$ is a Minkowski chart. Thus, in the $x$-coordinates $g_{\alpha\varrho} = \eta_{\alpha\varrho}$ holds, which implies for the $x'$-coordinates for all $x' \in \varphi'[W]$
$$g_{\kappa\lambda} = \eta_{\alpha\beta}\, \frac{\partial \phi^{\alpha}}{\partial x^{\kappa}}\, \frac{\partial \phi^{\beta}}{\partial x^{\lambda}} = L^{\alpha}_{\kappa}\, L^{\beta}_{\lambda}\, \eta_{\alpha\beta} = \eta_{\kappa\lambda}. \tag{1.32}$$
So we can conclude that $(W, \varphi')$ is a Minkowski chart.
To continue with our investigations we apply Lemma 1.8 to the Minkowski space according to Definition 1.1 and obtain the following theorem.
Theorem 1.9.
(1) Every Lorentz transformation $\phi$ (see Definition 1.4) is defined by a Lorentz matrix $L$ and an $a \in \mathbb{R}^4$ through
$$\phi(x) = L \cdot x + a. \tag{1.33}$$
(2) Conversely: For every Lorentz matrix $L$ and every $a \in \mathbb{R}^4$ the function $\phi$ in (1.33) is a Lorentz transformation for every Minkowski chart $(M^s, \varphi)$. This means that there exists another Minkowski chart $(M^s, \varphi')$ such that $\phi = \varphi' \circ \varphi^{-1}$.
The second part of the theorem is trivial, since $\varphi'$ is defined such that $\varphi' = \phi \circ \varphi$ is a Minkowski chart. Additionally for every Minkowski chart $(M^s, \varphi')$ there exists another chart $(M^s, \varphi)$ with $\varphi = \phi^{-1} \circ \varphi'$ such that $\varphi' = \phi \circ \varphi$. Thus, every Lorentz transformation is a bijection on the set of all Minkowski charts. Moreover, we can generate all Minkowski charts from one particular Minkowski chart by applying Lorentz transformations.
Corollary 1.10. The set $\mathcal{A}_M$ of all Minkowski charts on $M^s$ is a $C^{\omega}$ subatlas of $\mathcal{A}_s$, since equation (1.33) holds, i.e., the coordinate transformations between charts of $\mathcal{A}_M$ are real analytic.
1.3 Group properties
Before we continue we introduce some notations.
Notation 1.11.
(1) $\mathcal{L}$ is the set of all Lorentz matrices.
(2) $\mathcal{P}$ is the set of all Lorentz transformations.
(3) A Lorentz transformation is called homogeneous if and only if $\phi(0) = 0$. The set of all homogeneous Lorentz transformations is called $\mathcal{P}_0$.
(4) In what follows we use the abbreviation $1_n := \mathrm{diag}(1, \dots, 1)$ with $n$ numbers 1 in the parentheses. Likewise we define $0_n := (0, \dots, 0)$ with $n$ numbers 0.
Now we can show the following proposition.
Proposition 1.12. The set $\mathcal{L}$, equipped with the usual matrix multiplication, is an infinite noncommutative group.
Proof. According to Conclusion 1.7(5) the product of two LMs is again a LM, and matrix multiplication is associative. The unit matrix $1_4$ lies in $\mathcal{L}$, and for every $L \in \mathcal{L}$ there exists an inverse $L^{-1} \in \mathcal{L}$. The proof that $\mathcal{L}$ is noncommutative is postponed to Conclusion 2.6(2), where we give an explicit example of noncommuting LMs. $\mathcal{L}$ is an infinite group since its subset of special LMs (see Section 2.3) is not finite.
Definition 1.13. $\mathcal{L}$ is called the group of Lorentz matrices, LM group for short, or simply Lorentz group.
Equivalent results hold for the set $\mathcal{P}$.
Proposition 1.14. The set $\mathcal{P}$ with the composition of functions “∘” as multiplication is an infinite group.
Proof. Let $\phi', \phi'' \in \mathcal{P}$. According to equation (1.33) we have
$$\phi'' \circ \phi'(x) = L'' \cdot L' \cdot x + L'' \cdot a' + a''. \tag{1.34}$$
Now Theorem 1.9 guarantees that $\phi'' \circ \phi' \in \mathcal{P}$. Associativity holds for the composition of functions, and Conclusion 1.7(3) ensures that the inverse $\phi^{-1}$ as well as the identity $\phi = \mathrm{id}$ lies in $\mathcal{P}$. That $\mathcal{P}$ is infinite can be seen from the fact that $a \in \mathbb{R}^4$ is arbitrary.
Corollary 1.15. The set $\mathcal{P}_0$ is the subset of $\mathcal{P}$ which is isomorphic to $\mathcal{L}$. This is obvious, since $\phi \in \mathcal{P}_0$ if and only if $a = 0$ in equation (1.33). Thus, $\mathcal{P}_0$ is also a group since $\mathcal{L}$ is a group. An element $\phi \in \mathcal{P}_0$ is uniquely determined by an element $L \in \mathcal{L}$, and vice versa.
Definition 1.16. $\mathcal{P}$ is called the Poincaré group, and $\mathcal{P}_0$ is called the Lorentz group.
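The composition rule (1.34) can be illustrated by representing a Lorentz transformation as a pair (L, a). The sketch below is a hypothetical minimal representation chosen here for illustration; the names `apply` and `compose` and the sample data are assumptions, and the two sample matrices are η itself and a swap of the first two spatial axes, both of which satisfy equation (1.8).

```python
# Two Lorentz matrices with integer entries (illustrative choices).
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
SWAP = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, x):
    """Matrix times column vector, with the vector stored as a flat list."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]

def apply(t, x):
    """Evaluate phi(x) = L . x + a for t = (L, a), equation (1.33)."""
    l, a = t
    return [u + w for u, w in zip(matvec(l, x), a)]

def compose(t2, t1):
    """Return the pair representing t2 after t1, following equation (1.34)."""
    (l2, a2), (l1, a1) = t2, t1
    return matmul(l2, l1), [u + w for u, w in zip(matvec(l2, a1), a2)]

t1 = (ETA, [1, 2, 3, 4])
t2 = (SWAP, [0, -1, 5, 2])
x = [7, 0, -2, 3]

# Applying the composed pair agrees with applying t1 first and then t2.
print(apply(compose(t2, t1), x) == apply(t2, apply(t1, x)))
```

The translation part of the composed pair is L″·a′ + a″, not a′ + a″, which is why the Poincaré group is not simply a direct product of matrices and translations.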
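2 Further properties of Lorentz matrices
2.1 Supplements to Proposition 1.7
We state further results on Lorentz matrices.
Conclusion 2.1.
(1) For every Lorentz matrix $L$,
$$L^{-1} = \eta \cdot L^T \cdot \eta \tag{2.1}$$
holds. This is true since from equation (1.8) it follows that
$$L^T \cdot \eta \cdot L \cdot \eta = 1_4. \tag{2.2}$$
Thus,
$$L^T \cdot \eta = (L \cdot \eta)^{-1} = \eta \cdot L^{-1}, \tag{2.3}$$
which implies equation (2.1).
(2) The matrix elements of $L^{-1}$ can be derived from equation (2.1):
$$(L^{-1})^{\alpha}_{\beta} = \sum_{\kappa,\lambda=1}^{4} \eta_{\alpha\kappa}\, L^{\lambda}_{\kappa}\, \eta_{\lambda\beta} = \eta_{\alpha\alpha}\, L^{\beta}_{\alpha}\, \eta_{\beta\beta}, \tag{2.4}$$
where no summation is performed on the right-hand side. This implies
$$(L^{-1})^{\alpha}_{\beta} = \begin{cases} L^{\beta}_{\alpha} & \text{if } \alpha, \beta = 1, 2, 3 \text{ or } \alpha = \beta = 4, \\ -L^{\beta}_{\alpha} & \text{if } \alpha = 1, 2, 3 \text{ and } \beta = 4, \text{ or } \alpha = 4 \text{ and } \beta = 1, 2, 3. \end{cases} \tag{2.5}$$
Thus,
$$(L^{-1})^{\alpha}_{\lambda}\, L^{\lambda}_{\beta} = \sum_{\lambda=1}^{4} \eta_{\alpha\alpha}\, L^{\lambda}_{\alpha}\, \eta_{\lambda\lambda}\, L^{\lambda}_{\beta} = \eta_{\alpha\alpha}\, \eta_{\alpha\beta} = \delta^{\alpha}_{\beta}. \tag{2.6}$$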
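Equation (2.1) gives the inverse of a Lorentz matrix without any elimination procedure. A minimal sketch, assuming a sample boost-type matrix `S` with v = 3/5 (so that k = 5/4 is rational) and helper names chosen here for illustration:

```python
from fractions import Fraction

ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]

def inverse_lorentz(m):
    """Inverse of a Lorentz matrix via equation (2.1): eta . L^T . eta."""
    return matmul(matmul(ETA, transpose(m)), ETA)

# Sample Lorentz matrix (assumption for illustration): v = 3/5, k = 5/4.
v, k = Fraction(3, 5), Fraction(5, 4)
S = [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

S_inv = inverse_lorentz(S)
print(matmul(S_inv, S) == IDENTITY, matmul(S, S_inv) == IDENTITY)
# Sign pattern of equation (2.5): mixed space-time entries change sign.
print(S_inv[0][3] == -S[3][0], S_inv[0][0] == S[0][0])
```

The formula is only valid for matrices that satisfy equation (1.8); for a general matrix `inverse_lorentz` does not return the inverse.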
Observe that among all matrix elements of an LM the matrix element $L^4_4$ is of particular significance. It satisfies the following.
Proposition 2.2. $|L^4_4| \ge 1$.
Proof. Equation (1.7) with $\alpha = \beta = 4$ yields
$$-1 = L^{\alpha}_{4}\, \eta_{\alpha\beta}\, L^{\beta}_{4} = \sum_{\kappa=1}^{3} (L^{\kappa}_{4})^2 - (L^4_4)^2; \tag{2.7}$$
thus,
$$(L^4_4)^2 = 1 + \sum_{\kappa=1}^{3} (L^{\kappa}_{4})^2 \ge 1. \tag{2.8}$$
DOI 10.1515/9783110485738-003
We will see that this simple fact is of great importance for the physical notion of time in special relativity. Before we discuss this in more detail we present further properties of Lorentz matrices.
Proposition 2.3. Let $L$ be a Lorentz matrix with $|L^4_4| = 1$. Then
$$L = \begin{pmatrix} Q & 0_3^T \\ 0_3 & \pm 1 \end{pmatrix}, \tag{2.9}$$
where $Q$ is an orthogonal 3 × 3 matrix and $0_3 = (0, 0, 0)$.
Proof. From equation (2.8) we see that $L^{\kappa}_{4} = 0$ for $\kappa = 1, 2, 3$. Thus, $L$ takes the form
$$L = \begin{pmatrix} Q & 0_3^T \\ q & \pm 1 \end{pmatrix}, \tag{2.10}$$
where $Q$ is a 3 × 3 matrix and $q$ a 3-dimensional vector. Now we know that $L^T$ is again a Lorentz matrix with $(L^T)^4_4 = L^4_4 = \pm 1$, and
$$L^T = \begin{pmatrix} Q^T & q^T \\ 0_3 & \pm 1 \end{pmatrix}. \tag{2.11}$$
Thus, applying equation (2.8) to $L^T$, we get $q^T = 0$. Plugging equation (2.11) into (1.8) or (2.2) yields $Q^T \cdot Q = 1_3$ and $Q \cdot Q^T = 1_3$, which means that $Q$ is orthogonal.
Corollary 2.4. For a Lorentz matrix of the form
$$L = \begin{pmatrix} Q & 0_3^T \\ q & L^4_4 \end{pmatrix} \tag{2.12}$$
the following holds: $|L^4_4| = 1$, $q = 0_3^T$ and $Q$ is orthogonal.
This statement follows immediately from equation (2.8) and Proposition 2.3. The reverse of this corollary works as follows:
Proposition 2.5. For every orthogonal 3 × 3 matrix $Q$
$$L = \begin{pmatrix} Q & 0_3^T \\ 0_3 & \pm 1 \end{pmatrix} \tag{2.13}$$
is a Lorentz matrix.
Proof.
$$L^T \cdot \eta \cdot L = \cdots = \begin{pmatrix} Q^T \cdot Q & 0_3^T \\ 0_3 & -1 \end{pmatrix} = \eta. \tag{2.14}$$
Besides the equations (1.7) and (1.8) the following characterization of Lorentz matrices turns out to be useful.
Conclusion 2.6.
(1) Let
$$L := \begin{pmatrix} K & q \\ p & r \end{pmatrix}, \tag{2.15}$$
where $K$ is a 3 × 3 matrix, $q$ a column vector, $p$ a row vector, and $r$ a real number. Let $1_3 := \mathrm{diag}(1, 1, 1)$ so that
$$\eta = \begin{pmatrix} 1_3 & 0_3^T \\ 0_3 & -1 \end{pmatrix}. \tag{2.16}$$
Then we can rewrite equation (1.8) as follows:
$$\begin{pmatrix} K^T \cdot K - p^T \cdot p & K^T \cdot q - r p^T \\ q^T \cdot K - r p & q^T \cdot q - r^2 \end{pmatrix} = \begin{pmatrix} 1_3 & 0_3^T \\ 0_3 & -1 \end{pmatrix}. \tag{2.17}$$
Thus, we find that $L$ is a Lorentz matrix if
$$K^T \cdot K - p^T \cdot p = 1_3, \qquad K^T \cdot q - r p^T = 0_3^T, \qquad r^2 - q^T \cdot q = 1. \tag{2.18}$$
(2) Let $L'$ and $L''$ be two Lorentz matrices defined through orthogonal matrices $P$ and $Q$ according to equation (2.13), respectively. They commute if and only if $P$ and $Q$ commute. Thus, in general two Lorentz matrices $L'$ and $L''$ do not commute, because there are noncommuting orthogonal matrices.
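A concrete pair of noncommuting Lorentz matrices of the form (2.13) can be built from two rotations by 90 degrees about different axes. This is a minimal sketch; the helper `embed` and the particular rotations `Q_Z` and `Q_X` are illustrative choices made here, not taken from the text.

```python
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]

def is_lorentz(m):
    """True if m satisfies L^T . eta . L = eta, equation (1.8)."""
    return matmul(matmul(transpose(m), ETA), m) == ETA

def embed(q, sign=1):
    """Build the 4x4 matrix of the form (2.13) from a 3x3 matrix q."""
    rows = [list(row) + [0] for row in q]
    rows.append([0, 0, 0, sign])
    return rows

# Rotations by 90 degrees about the third and the first spatial axis.
Q_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
Q_X = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]

A, B = embed(Q_Z), embed(Q_X)
print(is_lorentz(A), is_lorentz(B))
print(matmul(A, B) == matmul(B, A))
```

Both matrices pass the Lorentz check, while the two products differ, which is the situation described in Conclusion 2.6(2).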
2.2 Proper, orthochronous, and antichronous Lorentz matrices We divide the set of all Lorentz matrices into subclasses and investigate their properties. The first step to achieve this goal is the following definition. Definition 2.7. (1) A Lorentz matrix L is called proper if and only if det L = 1. (2) A Lorentz matrix L is called orthochronous if and only if L44 ≥ 1. Due to equation (2.7) this condition is equivalent to L44 > 0. (3) A Lorentz matrix L is called antichronous if and only if L44 ≤ −1, or equivalently L44 < 0. The following statements hold.
Proposition 2.8.
(1) For $L$ being orthochronous or antichronous, $L^T$ and $L^{-1}$ inherit this property, which can be easily seen from $(L^T)^4_4 = (L^{-1})^4_4 = L^4_4 > 0$ resp. $< 0$.
(2) For $L$ being proper, $L^T$ and $L^{-1}$ inherit this property since $\det L^T = \det L^{-1} = \det L$.
For products of two Lorentz matrices $L_1$ and $L_2$ we observe a different behavior.
Proposition 2.9.
(1) For $L_1$, $L_2$ being proper, their product $L_1 \cdot L_2$ is proper, since $\det(L_1 \cdot L_2) = \det L_1 \cdot \det L_2 = 1$ holds.
(2) For $L_1$, $L_2$ being orthochronous, their product $L = L_1 \cdot L_2$ is orthochronous.
Proof. We need to show that $L^4_4 = L^4_{1\kappa} L^{\kappa}_{24} > 0$. Expanding the sum for any two Lorentz matrices we obtain
$$L^4_{1\kappa}\, L^{\kappa}_{24} = L^4_{14}\, L^4_{24} + \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24}. \tag{2.19}$$
Using equation (1.7) for $L_1^T$ and $L_2$ with $\alpha = \beta = 4$ one further obtains
$$\left| \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24} \right|^2 \le \sum_{\lambda=1}^{3} (L^4_{1\lambda})^2 \sum_{\kappa=1}^{3} (L^{\kappa}_{24})^2 = \left( (L^4_{14})^2 - 1 \right) \left( (L^4_{24})^2 - 1 \right) = (L^4_{14})^2 (L^4_{24})^2 + \tfrac{1}{2} - (L^4_{14})^2 + \tfrac{1}{2} - (L^4_{24})^2. \tag{2.20}$$
Thus, for all orthochronous Lorentz matrices
$$\left| \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24} \right| < L^4_{14}\, L^4_{24} \tag{2.21}$$
holds and so, due to equation (2.19), the inequality
$$L^4_{1\kappa}\, L^{\kappa}_{24} \ge L^4_{14}\, L^4_{24} - \left| \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24} \right| > 0 \tag{2.22}$$
follows. This is the result that had to be proved.
Corollary 2.10. For $L_1$, $L_2$ being antichronous, their product $L = L_1 \cdot L_2$ is orthochronous, since $L^4_{14} L^4_{24} > 0$, such that equations (2.21) and (2.22) hold.
Corollary 2.11. If $L_1$ is orthochronous and $L_2$ is antichronous, the products $L_1 \cdot L_2$ and $L_2 \cdot L_1$ are antichronous.
Proof. From equation (2.20) we derive
$$\left| \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24} \right| < \left| L^4_{14}\, L^4_{24} \right| = -L^4_{14}\, L^4_{24}. \tag{2.23}$$
Thus,
$$L^4_{1\kappa}\, L^{\kappa}_{24} \le L^4_{14}\, L^4_{24} + \left| \sum_{\lambda=1}^{3} L^4_{1\lambda}\, L^{\lambda}_{24} \right| < 0. \tag{2.24}$$
This leads directly to the following. Conclusion 2.12. If L1 is orthochronous and L2 is such that L424 = −1, then L1 ⋅ L2 and L2 ⋅ L1 are antichronous, since L2 is of the form (2.9), which implies κ κ = L414 ⋅ L424 = L42κ ⋅ L14 1 one obtains p = 03 and so K ⋅ K T = 13 . Additionally q T ⋅ q + r2 = 1, q T ⋅ q ≥ 0, and r2 ≥ 1 hold, and thus q = 03 and r = ±1. At first sight, the Lorentz matrices of the form (2.13) do not look very interesting in the context of special relativity. In Chapter 4 we will further discuss their relevance. Conclusion 2.14. Let L, L1 , L2 be Lorentz matrices of the form (2.13), and let L44 = L414 = L424 = 1; then L, L1 , L2 have the following properties: T
(1) $L$ and $L^{-1}$ are both proper and orthochronous if $\det Q = 1$, since $L^{-1} = \begin{pmatrix} Q^T & 0_3^T \\ 0_3 & 1 \end{pmatrix}$.
(2) Let $L = L_1 \cdot L_2$, then $L$ is of the form (2.13). It is proper as well as orthochronous if $\det Q_1 = \det Q_2 = \pm 1$.
(3) The 3 × 3 matrix diag(1, 1, 1) is orthogonal; thus, the 4 × 4 matrix diag(1, 1, 1, 1) is an orthochronous Lorentz matrix of the form (2.13).
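The product rules of Proposition 2.9(2) and Corollaries 2.10 and 2.11 can be checked on an example by looking at the sign of the entry $L^4_4$. The sketch below assumes a sample orthochronous matrix `S` (v = 3/5, k = 5/4) and the antichronous matrix `A` = η·S; these matrices and the helper `kind` are illustrative choices made here.

```python
from fractions import Fraction

ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kind(m):
    """Classify a Lorentz matrix by the sign of its entry in row 4, column 4."""
    return "orthochronous" if m[3][3] > 0 else "antichronous"

# Sample matrices (assumptions for illustration).
v, k = Fraction(3, 5), Fraction(5, 4)
S = [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]
A = matmul(ETA, S)  # entry in row 4, column 4 is -k, hence antichronous

for name, m in [("S.S", matmul(S, S)), ("A.A", matmul(A, A)),
                ("S.A", matmul(S, A)), ("A.S", matmul(A, S))]:
    print(name, kind(m))
```

The function `kind` relies on Proposition 2.2, which excludes the value 0 for this entry of a Lorentz matrix.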
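2.3 Special Lorentz matrices
In this section we consider matrices of the form
$$S_{\upsilon} = \begin{pmatrix} k & 0 & 0 & -\upsilon k \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\upsilon k & 0 & 0 & k \end{pmatrix}, \tag{2.28}$$
where $\upsilon \in\, ]-1, 1[$ and $k = (1 - \upsilon^2)^{-\frac{1}{2}} > 0$. These are of particular interest due to the following proposition.
Proposition 2.15. $S_{\upsilon}$ is a proper and orthochronous Lorentz matrix.
To see this observe that $S_{\upsilon}^T = S_{\upsilon}$ implies
$$S_{\upsilon} \cdot \eta \cdot S_{\upsilon} = \cdots = \begin{pmatrix} k^2 - \upsilon^2 k^2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \upsilon^2 k^2 - k^2 \end{pmatrix} = \eta. \tag{2.29}$$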
Moreover, det S υ = k2 (1 − υ2 ) = 1 and k > 0. Definition 2.16. A Lorentz matrix of the form (2.28) is called a special Lorentz matrix. Historically the transformations generated by S υ according to equation (1.33) were the first appearances of Lorentz transformations. H. A. Lorentz found these to be the transformations which leave the fundamental equations of electrodynamics, the Maxwell equations, invariant, when they are transformed from a reference coordinate system to a coordinate system which is in uniform motion with respect to the reference system (see [21]). Einstein derived the Lorentz transformations employing the axiom of the constancy of light (see Conclusion 8.24.1) and the relativity principle in the form: It is not possible to find an overall preferred coordinate system experimentally (see [1]). This line of thinking led to the discarding of the aether hypothesis as the explanation for the propagation of light as well as to the insight that there is no fixed observer-independent notion of space and time. Taking these ideas seriously allowed, among other achievements, for a consistent explanation of the results obtained in the experiment by Michelson and Morley, without the need of further hypotheses (see [20, p. 13]). Besides its historical relevance we will discuss the particular role of the special Lorentz transformations in Chapter 4. For S υ1 and S υ2 being Lorentz matrices, their product S υ1 ⋅ S υ2 is a Lorentz matrix. Proposition 2.17. For υ = (1 + υ1 υ2 )−1 (υ1 + υ2 ), we have S υ1 ⋅ S υ2 = S υ2 ⋅ S υ1 = S υ .
(2.30)
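Proof. Using the abbreviation $k_j = (1 - \upsilon_j^2)^{-1/2}$, $j = 1, 2$, one gets

\[
S_{\upsilon_1} \cdot S_{\upsilon_2} = \begin{pmatrix} k_1 k_2 (1 + \upsilon_1 \upsilon_2) & 0 & 0 & -k_1 k_2 (\upsilon_1 + \upsilon_2) \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -k_1 k_2 (\upsilon_1 + \upsilon_2) & 0 & 0 & k_1 k_2 (1 + \upsilon_1 \upsilon_2) \end{pmatrix} =: L. \tag{2.31}
\]

Employing equation (2.30) yields

\[
1 - \upsilon^2 = (1 + \upsilon_1 \upsilon_2)^{-2} (1 - \upsilon_1^2)(1 - \upsilon_2^2) \tag{2.32}
\]

and thus, with $k = (1 - \upsilon^2)^{-1/2}$,

\[
k = (1 + \upsilon_1 \upsilon_2)\, k_1 k_2. \tag{2.33}
\]

Moreover,

\[
\upsilon k = k\, \frac{\upsilon_1 + \upsilon_2}{1 + \upsilon_1 \upsilon_2} = k_1 k_2 (\upsilon_1 + \upsilon_2). \tag{2.34}
\]

Plugging equations (2.33) and (2.34) into (2.31) we have $L = S_\upsilon$.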
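Proposition 2.17 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `matmul`, `special` and `close` and the sample values 0.6 and 0.7 are illustrative assumptions, and matrices are stored as plain lists of rows.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def special(v):
    """Special Lorentz matrix S_v of equation (2.28), for -1 < v < 1."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0.0, 0.0, -v * k],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-v * k, 0.0, 0.0, k]]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to an absolute tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

v1, v2 = 0.6, 0.7                      # arbitrary sample velocities (assumption)
v = (v1 + v2) / (1.0 + v1 * v2)        # composed velocity from Proposition 2.17

product = matmul(special(v1), special(v2))
composed = close(product, special(v))                       # S_v1 . S_v2 = S_v
commutes = close(product, matmul(special(v2), special(v1)))  # S_v1 . S_v2 = S_v2 . S_v1
```

Both flags should come out true for any pair of velocities in $]-1, 1[$, up to floating-point rounding.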
The physical significance of equation (2.30) will be discussed in Section 4.4.2.

Conclusion 2.18. Using equation (2.28) it follows immediately that $S_0 = 1_4$. Setting $\upsilon_2 = -\upsilon_1$ in (2.31) yields $S_{-\upsilon_1} = S_{\upsilon_1}^{-1}$.

One feature of Definition 2.16, respectively equation (2.28), is that the first and fourth rows of $S_\upsilon$ are distinguished compared to the second and third rows. Since the fourth row of $S_\upsilon$ describes the transformation of time in a Lorentz transformation $x' = S_\upsilon \cdot x$, it is natural to investigate also the matrices

\[
S_{2,\upsilon} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & k & 0 & -\upsilon k \\ 0 & 0 & 1 & 0 \\ 0 & -\upsilon k & 0 & k \end{pmatrix}
\quad\text{and}\quad
S_{3,\upsilon} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & k & -\upsilon k \\ 0 & 0 & -\upsilon k & k \end{pmatrix}. \tag{2.35}
\]
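Besides these we can furthermore consider the matrices which are obtained by changing $k \to -k$ in the above ones. They are called $S_{\upsilon-}$, $S_{2,\upsilon-}$ and $S_{3,\upsilon-}$. The reason why we do not use $S_{2,\upsilon}$, $S_{3,\upsilon}$, $S_{\upsilon-}$, $S_{2,\upsilon-}$ and $S_{3,\upsilon-}$ is that they are not needed at all: it suffices to consider $S_\upsilon$. This can be seen from the following.

Conclusion 2.19.

(1) Let

\[
T_1 = \mathrm{diag}(-1, 1, 1, -1), \quad T_2 = \mathrm{diag}(1, -1, 1, -1), \quad T_3 = \mathrm{diag}(1, 1, -1, -1). \tag{2.36}
\]

Then $T_j^2 = 1_4$, $j = 1, 2, 3$, and

\[
\begin{aligned}
T_1 \cdot S_\upsilon &= S_\upsilon \cdot T_1 = S_{\upsilon-}, \\
T_2 \cdot S_{2,\upsilon} &= S_{2,\upsilon} \cdot T_2 = S_{2,\upsilon-}, \\
T_3 \cdot S_{3,\upsilon} &= S_{3,\upsilon} \cdot T_3 = S_{3,\upsilon-}.
\end{aligned} \tag{2.37}
\]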
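(2) Let

\[
V_2 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}; \tag{2.38}
\]

then $V_2^2 = 1_4$, i.e., $V_2^{-1} = V_2$, and

\[
V_2 \cdot S_\upsilon \cdot V_2 = S_{2,\upsilon}. \tag{2.39}
\]

(3) Let

\[
V_3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}; \tag{2.40}
\]

then $V_3^2 = 1_4$, i.e., $V_3^{-1} = V_3$, and

\[
V_3 \cdot S_\upsilon \cdot V_3 = S_{3,\upsilon}. \tag{2.41}
\]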
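(4) $V_2$ and $V_3$ are matrices of the form (2.13). Thus, they are Lorentz matrices. According to equations (2.39) and (2.41) this implies that $S_{2,\upsilon}$ and $S_{3,\upsilon}$ are Lorentz matrices. Since $T_1$, $T_2$ and $T_3$ are also Lorentz matrices, equation (2.37) shows that so are $S_{\upsilon-}$, $S_{2,\upsilon-}$ and $S_{3,\upsilon-}$.

Two examples may illustrate what has just been discussed.

Example 2.20. Consider $S_{2,\upsilon}$. By definition $S_{2,0} = 1_4$. Moreover, we have $S_{2,-\upsilon} = S_{2,\upsilon}^{-1}$, since

\[
S_{2,-\upsilon} = V_2^{-1} \cdot S_{-\upsilon} \cdot V_2^{-1} = (V_2 \cdot S_\upsilon \cdot V_2)^{-1} = S_{2,\upsilon}^{-1}. \tag{2.42}
\]

Finally, using equation (2.30) for $\upsilon_1$, $\upsilon_2$, and $\upsilon$ we get

\[
\begin{aligned}
S_{2,\upsilon_1} \cdot S_{2,\upsilon_2} &= V_2 \cdot S_{\upsilon_1} \cdot V_2 \cdot V_2 \cdot S_{\upsilon_2} \cdot V_2 \\
&= V_2 \cdot S_{\upsilon_1} \cdot S_{\upsilon_2} \cdot V_2 \\
&= V_2 \cdot S_\upsilon \cdot V_2 \\
&= S_{2,\upsilon}.
\end{aligned} \tag{2.43}
\]

The analogous calculation works for $S_{3,\upsilon}$.

Example 2.21. Consider $S_{\upsilon-}$. By definition $S_{0-} = T_1$. Moreover,

\[
S_{\upsilon-}^{-1} = (T_1 \cdot S_\upsilon)^{-1} = S_\upsilon^{-1} \cdot T_1 = S_{-\upsilon} \cdot T_1 = S_{-\upsilon-} \tag{2.44}
\]

and

\[
S_{\upsilon_1-} \cdot S_{\upsilon_2-} = S_{\upsilon_1} \cdot T_1 \cdot T_1 \cdot S_{\upsilon_2} = S_{\upsilon_1} \cdot S_{\upsilon_2} = S_\upsilon. \tag{2.45}
\]

Hence the product of two Lorentz matrices of the type $S_{\upsilon-}$ is no longer of this type: the set of all $S_{\upsilon-}$ is not closed under multiplication.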
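The relations of Conclusion 2.19 and Example 2.21 are easy to verify by direct multiplication. The sketch below is illustrative only: the function names (`special`, `special2`, `special_minus`) and the sample velocities are assumptions made for this example, with matrices stored as lists of rows.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def special(v):
    """S_v of equation (2.28)."""
    k = gamma(v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def special2(v):
    """S_{2,v} of equation (2.35)."""
    k = gamma(v)
    return [[1, 0, 0, 0], [0, k, 0, -v * k], [0, 0, 1, 0], [0, -v * k, 0, k]]

def special_minus(v):
    """S_{v-}: the matrix S_v with k replaced by -k."""
    k = gamma(v)
    return [[-k, 0, 0, v * k], [0, 1, 0, 0], [0, 0, 1, 0], [v * k, 0, 0, -k]]

T1 = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]   # (2.36)
V2 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]     # (2.38)

v1, v2 = 0.3, 0.5                     # sample values (assumption)
v = (v1 + v2) / (1.0 + v1 * v2)

swap_ok = close(matmul(matmul(V2, special(v1)), V2), special2(v1))       # (2.39)
sign_ok = (close(matmul(T1, special(v1)), special_minus(v1))
           and close(matmul(special(v1), T1), special_minus(v1)))        # (2.37)
product = matmul(special_minus(v1), special_minus(v2))
leaves_set = close(product, special(v)) and product[3][3] > 0            # (2.45)
```

The last flag records the observation of Example 2.21: the product has a positive (4,4) entry and equals $S_\upsilon$, so it is not of the type $S_{\upsilon-}$.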
2.4 Subgroups of $\mathcal{L}$

We introduce some more notation.

Definition 2.22. In special relativity we are interested in the following subgroups of $\mathcal{L}$:

\[
\begin{aligned}
\mathcal{L}_{ei} &= \{L \in \mathcal{L} : \det L = 1\}, \\
\mathcal{L}_{oc} &= \{L \in \mathcal{L} : L^4_4 > 0\}, \\
\mathcal{L}_{s} &= \{L \in \mathcal{L} : L = S_\upsilon,\ \upsilon \in\, ]-1, 1[\}, \\
\mathcal{L}_{og} &= \{L \in \mathcal{L} : L \text{ has the form of equation (2.13)}\}.
\end{aligned} \tag{2.46}
\]

This leads to the following conclusion.

Conclusion 2.23.
(1) $\mathcal{L}_{ei}$ is a group by Propositions 2.8(1), 2.8(2), and 2.9(1).
(2) $\mathcal{L}_{oc}$ is a group due to Propositions 2.8(1) and 2.9(2).
(3) $\mathcal{L}_{s}$ is an Abelian group by Proposition 2.17 and Conclusion 2.18.
(4) $\mathcal{L}_{og}$ is a group by the fact that the product of two matrices of the form (2.13) has again the form (2.13). It is noncommutative by Proposition 2.6(2). Elements of this group are called rotational (Lorentz) matrices or orthogonal Lorentz matrices.
(5) Further subgroups of $\mathcal{L}$ can be obtained by taking intersections, for example $\mathcal{L}_{s} \subset \mathcal{L}_{ei} \cap \mathcal{L}_{oc}$ or $\mathcal{L}_{og} \cap \mathcal{L}_{ei}$.
(6) All groups defined by equation (2.46) have an infinite number of elements. For $\mathcal{L}_{ei}$ and $\mathcal{L}_{oc}$ this follows from (5), since $\mathcal{L}_{s}$ is infinite. Since there exist infinitely many orthogonal $3 \times 3$ matrices, $\mathcal{L}_{og}$ is an infinite set.
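3 Further properties of Lorentz transformations

3.1 Subgroups of $\mathcal{P}$

In Section 1.3 we showed that the set $\mathcal{P}$ of all Lorentz transformations is a group with the composition of functions as group multiplication. Moreover, in Section 1.2 we found that we can uniquely associate a pair $(L, a)$ to every Lorentz transformation $\phi \in \mathcal{P}$, where $L \in \mathcal{L}$ is a Lorentz matrix and $a \in \mathbb{R}^4$. This suggests the following identification.

Notation 3.1. We write $\phi = (L, a)$ and

\[
\phi(x) = (L, a)(x) = L \cdot x + a. \tag{3.1}
\]

Conclusion 3.2. Equation (3.1) implies the group properties of $\mathcal{P}$:

\[
\begin{aligned}
\mathrm{id} &= (1_4, 0_4), \\
\phi' \circ \phi &= (L', a') \circ (L, a) = (L' \cdot L,\ L' \cdot a + a'), \\
\phi^{-1} &= (L, a)^{-1} = (L^{-1}, -L^{-1} \cdot a).
\end{aligned} \tag{3.2}
\]

Thus,

\[
\mathcal{P} = \{(L, a) : L \in \mathcal{L},\ a \in \mathbb{R}^4\}. \tag{3.3}
\]

The subgroups $\mathcal{L}_{ei}$, $\mathcal{L}_{oc}$, $\mathcal{L}_{s}$ and $\mathcal{L}_{og}$ of $\mathcal{L}$ generate subgroups of $\mathcal{P}$. This can be seen by introducing the following notation.

Notation 3.3.

\[
\begin{aligned}
\mathcal{P}_{ei} &:= \{(L, a) : L \in \mathcal{L}_{ei},\ a \in \mathbb{R}^4\}, \\
\mathcal{P}_{oc} &:= \{(L, a) : L \in \mathcal{L}_{oc},\ a \in \mathbb{R}^4\}, \\
\mathcal{P}_{s} &:= \{(L, a) : L \in \mathcal{L}_{s},\ a \in \mathbb{R}^4\}, \\
\mathcal{P}_{og} &:= \{(L, a) : L \in \mathcal{L}_{og},\ a \in \mathbb{R}^4\}.
\end{aligned} \tag{3.4}
\]

Conclusion 3.4. $\mathcal{P}_{ei}$, $\mathcal{P}_{oc}$, $\mathcal{P}_{s}$, and $\mathcal{P}_{og}$ are groups. They contain $(1_4, 0_4)$ as the unit element, an inverse for every element, and the product of any two of their elements. This statement follows immediately from equation (3.2) and the group properties of $\mathcal{L}_{ei}$, $\mathcal{L}_{oc}$, $\mathcal{L}_{s}$ and $\mathcal{L}_{og}$ according to Conclusion 2.23. Thus, $\mathcal{P}_{ei}$, $\mathcal{P}_{oc}$, $\mathcal{P}_{s}$, and $\mathcal{P}_{og}$ are infinite subgroups of the Poincaré group $\mathcal{P}$.

Similar results can be obtained for the Lorentz group $\mathcal{P}_0$:

\[
\mathcal{P}_0 = \{(L, 0_4) : L \in \mathcal{L}\}. \tag{3.5}
\]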
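The group law (3.2) translates directly into code on pairs $(L, a)$. The following sketch is an illustration under stated assumptions: a pair is stored as a Python tuple of a 4×4 list of rows and a list of four numbers, the function names are hypothetical, and the inverse matrix is computed from the standard identity $L^{-1} = \eta \cdot L^T \cdot \eta$, which follows from $L^T \cdot \eta \cdot L = \eta$.

```python
import math

ETA = [1.0, 1.0, 1.0, -1.0]   # diagonal of eta = diag(1, 1, 1, -1)

def special(v):
    """S_v of equation (2.28), used here only to produce sample Lorentz matrices."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def apply(L, x):
    """Matrix-vector product L . x."""
    return [sum(L[i][j] * x[j] for j in range(4)) for i in range(4)]

def lorentz_inverse(L):
    """Inverse of a Lorentz matrix via L^{-1} = eta . L^T . eta."""
    return [[ETA[i] * L[j][i] * ETA[j] for j in range(4)] for i in range(4)]

def transform(phi, x):
    """phi(x) = L . x + a, equation (3.1)."""
    L, a = phi
    return [s + t for s, t in zip(apply(L, x), a)]

def compose(phi2, phi1):
    """(L2, a2) o (L1, a1) = (L2 . L1, L2 . a1 + a2), equation (3.2)."""
    (L2, a2), (L1, a1) = phi2, phi1
    return matmul(L2, L1), [s + t for s, t in zip(apply(L2, a1), a2)]

def inverse(phi):
    """(L, a)^{-1} = (L^{-1}, -L^{-1} . a), equation (3.2)."""
    L, a = phi
    Linv = lorentz_inverse(L)
    return Linv, [-c for c in apply(Linv, a)]

phi = (special(0.5), [1.0, 2.0, 3.0, 4.0])      # sample transformations (assumption)
psi = (special(-0.3), [0.5, 0.0, -1.0, 2.0])
x = [1.0, -2.0, 0.5, 3.0]

lhs = transform(compose(psi, phi), x)            # (psi o phi)(x)
rhs = transform(psi, transform(phi, x))          # psi(phi(x))
back = transform(inverse(phi), transform(phi, x))  # should reproduce x
```

Comparing `lhs` with `rhs`, and `back` with `x`, confirms that the pair notation reproduces composition and inversion of the underlying maps.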
The following subgroups of $\mathcal{P}_0$ will be considered:

Notation 3.5.

\[
\begin{aligned}
\mathcal{P}_{0ei} &= \{(L, 0_4) : L \in \mathcal{L}_{ei}\}, \\
\mathcal{P}_{0oc} &= \{(L, 0_4) : L \in \mathcal{L}_{oc}\}, \\
\mathcal{P}_{0s} &= \{(L, 0_4) : L \in \mathcal{L}_{s}\}, \\
\mathcal{P}_{0og} &= \{(L, 0_4) : L \in \mathcal{L}_{og}\}.
\end{aligned} \tag{3.6}
\]

Conclusion 3.6. $\mathcal{P}_{0ei}$, $\mathcal{P}_{0oc}$, $\mathcal{P}_{0s}$, and $\mathcal{P}_{0og}$ are groups, and hence infinite subgroups of $\mathcal{P}_0$, by the same arguments which proved Conclusion 3.4.

Notation 3.7. The elements of $\mathcal{P}_{ei}$, $\mathcal{P}_{oc}$, and $\mathcal{P}_{s}$ are called proper, orthochronous, and special Lorentz transformations, respectively. The elements of $\mathcal{P}_{og}$ usually do not get a name of their own. We could call them orthogonal or rotational Lorentz transformations.
3.2 A condition for special Lorentz transformations

Let $\phi$ be a Lorentz transformation which satisfies the following:

(1)
\[
\phi(0_4) = 0_4. \tag{3.7}
\]

(2) There exist three real numbers $\alpha_j > 0$, $j = 1, 2, 3$, such that for every $(x^1, x^2, x^3) \in \mathbb{R}^3$ the relation

\[
\phi(x^1, x^2, x^3, 0) = (\alpha_1 x^1, \alpha_2 x^2, \alpha_3 x^3, x'^4)^T \tag{3.8}
\]

holds, with $x'^4$ depending on $(x^1, x^2, x^3)$, i.e., for time $x^4 = 0$ all spatial axes of the coordinates $x$ and $x' = \phi(x)$ coincide.

(3) For all $x^1, x^4 \in \mathbb{R}$

\[
\phi(x^1, 0, 0, x^4) = (x'^1, 0, 0, x'^4)^T \tag{3.9}
\]

holds, with $x'^1, x'^4$ depending on $x^1$ and $x^4$, i.e., for all times $x^4$ the 1-axes of the coordinate systems $x$ and $x'$ coincide.

(4) $\phi$ is orthochronous.

Proposition 3.8. $\phi$ is a homogeneous and special Lorentz transformation, i.e., there exists a $\upsilon \in\, ]-1, 1[$ such that

\[
\phi = (S_\upsilon, 0_4). \tag{3.10}
\]

Proof. $\phi$ is a Lorentz transformation; thus, there exist a Lorentz matrix $L$ and an $h \in \mathbb{R}^4$ such that $\phi = (L, h)$.

(1) Looking at equation (3.7) it is clear that $h = 0_4$, and thus $\phi = (L, 0_4)$. From equation (3.8) we find for all $x^\kappa \in \mathbb{R}$

\[
L \cdot (x^1, x^2, x^3, 0)^T = (\alpha_1 x^1, \alpha_2 x^2, \alpha_3 x^3, x'^4)^T, \tag{3.11}
\]

which in turn implies

\[
L = \begin{pmatrix} \alpha_1 & 0 & 0 & \cdot \\ 0 & \alpha_2 & 0 & \cdot \\ 0 & 0 & \alpha_3 & \cdot \\ \cdot & \cdot & \cdot & \cdot \end{pmatrix}, \tag{3.12}
\]
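since $\sum_{\lambda=1}^{3} L^\kappa_\lambda x^\lambda = \alpha_\kappa x^\kappa$, $\kappa = 1, 2, 3$. Equations (3.9), (3.12), and

\[
L^\kappa_1 x^1 + L^\kappa_4 x^4 = \begin{cases} x'^\kappa, & \kappa = 1, 4, \\ 0, & \kappa = 2, 3 \end{cases} \tag{3.13}
\]

yield that $L$ must have the form

\[
L = \begin{pmatrix} \alpha_1 & 0 & 0 & b \\ 0 & \alpha_2 & 0 & 0 \\ 0 & 0 & \alpha_3 & 0 \\ a_1 & a_2 & a_3 & a_4 \end{pmatrix}. \tag{3.14}
\]

Observe that $\alpha_2 \neq 0$ and $\alpha_3 \neq 0$ must hold since $\det L \neq 0$. We can expand $x' = L \cdot x$ explicitly:

\[
\begin{aligned}
x'^1 &= \alpha_1 x^1 + b x^4, && \text{as } L^1_1 = \alpha_1,\ L^1_4 = b, \\
x'^2 &= \alpha_2 x^2, && \text{as } L^2_2 = \alpha_2, \\
x'^3 &= \alpha_3 x^3, && \text{as } L^3_3 = \alpha_3, \\
x'^4 &= \textstyle\sum_{\lambda=1}^{4} a_\lambda x^\lambda, && \text{as } L^4_\lambda = a_\lambda,\ \lambda = 1, \ldots, 4.
\end{aligned} \tag{3.15}
\]

Combining this with equation (1.8) implies

\[
x'^T \cdot \eta \cdot x' = x^T \cdot \eta \cdot x. \tag{3.16}
\]

Plugging equation (3.15) into (3.16) we find the following identity:

\[
\begin{aligned}
&\alpha_1^2 (x^1)^2 + \alpha_2^2 (x^2)^2 + \alpha_3^2 (x^3)^2 + b^2 (x^4)^2 \\
&\quad - a_1^2 (x^1)^2 - a_2^2 (x^2)^2 - a_3^2 (x^3)^2 - a_4^2 (x^4)^2 \\
&\quad - 2 a_1 a_2 x^1 x^2 - 2 a_1 a_3 x^1 x^3 - 2 a_1 a_4 x^1 x^4 + 2 \alpha_1 b\, x^1 x^4 \\
&\quad - 2 a_2 a_3 x^2 x^3 - 2 a_2 a_4 x^2 x^4 - 2 a_3 a_4 x^3 x^4 \\
&= (x^1)^2 + (x^2)^2 + (x^3)^2 - (x^4)^2.
\end{aligned} \tag{3.17}
\]

This identity can only be satisfied if the following equations hold:

\[
\begin{aligned}
\alpha_1^2 - a_1^2 &= 1, & a_1 a_2 &= 0, \\
\alpha_2^2 - a_2^2 &= 1, & a_1 a_3 &= 0, \\
\alpha_3^2 - a_3^2 &= 1, & a_1 a_4 - \alpha_1 b &= 0, \\
b^2 - a_4^2 &= -1, & a_2 a_3 &= 0, \\
& & a_2 a_4 &= 0, \\
& & a_3 a_4 &= 0.
\end{aligned} \tag{3.18}
\]

They are labeled as $(3.18)_\kappa$, $\kappa = 1, \ldots, 10$, where the left column contains the equations labeled $\kappa = 1, \ldots, 4$ and the right one the equations $\kappa = 5, \ldots, 10$.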
(2) Equations (3.18) are ten equations for the eight unknown parameters $\alpha_1, \alpha_2, \alpha_3, b, a_1, a_2, a_3, a_4$. Thus, it could happen that this system of equations has no solution at all. Below we will see that this is not the case, since three of the equations (3.18) are not independent of the others. Thus, what we really have here are seven equations for eight unknowns. Therefore, we expect to find a 1-parameter family of solutions to the equations (3.18). This is one reason why the proposition was formulated as a statement on existence.

(3) The Lorentz matrix $L$ we seek is orthochronous, and thus, by (3.14),

\[
a_4 = L^4_4 \geq 1 \tag{3.19}
\]

must hold. Combining equations (3.18) and (3.19) yields

\[
a_2 = 0, \qquad a_3 = 0. \tag{3.20}
\]

Next, equation (3.20) immediately solves $(3.18)_5$, $(3.18)_6$, and $(3.18)_8$, so we do not need to worry about these. Moreover, combining (3.20), $(3.18)_2$, and $(3.18)_3$ with $\alpha_2, \alpha_3 > 0$ we find

\[
\alpha_2 = 1, \qquad \alpha_3 = 1. \tag{3.21}
\]

The remaining unknowns are $\alpha_1, b, a_1, a_4$. Since $\alpha_1 > 0$ by assumption, $(3.18)_1$ yields

\[
\alpha_1 \geq 1. \tag{3.22}
\]

Thus, it is possible to introduce a new variable $\upsilon$ by $b = \alpha_1 \upsilon$, and then equations $(3.18)_1$, $(3.18)_4$, and $(3.18)_7$ yield

\[
\alpha_1^4 \upsilon^2 = a_1^2 a_4^2 = (\alpha_1^2 - 1)(\alpha_1^2 \upsilon^2 + 1). \tag{3.23}
\]

Solving equation (3.23) for $\alpha_1^2$ using $\alpha_1 \geq 1$ one gets

\[
\alpha_1 = (1 - \upsilon^2)^{-1/2}. \tag{3.24}
\]

By definition of $\upsilon$,

\[
b = \upsilon (1 - \upsilon^2)^{-1/2}. \tag{3.25}
\]

Therefore the parameter $\upsilon$ satisfies

\[
-1 < \upsilon < 1. \tag{3.26}
\]

Finally we can solve equations $(3.18)_1$ and (3.24) as well as $(3.18)_4$ and (3.25) for $a_1$ and $a_4$ to obtain

\[
a_1 = \upsilon (1 - \upsilon^2)^{-1/2} = b \tag{3.27}
\]

and

\[
a_4 = (1 - \upsilon^2)^{-1/2} = \alpha_1. \tag{3.28}
\]

This proves Proposition 3.8.

Corollary 3.9. Every Lorentz transformation $\phi = (S_\upsilon, 0_4)$ with $\upsilon \in\, ]-1, 1[$ satisfies the assumptions of Proposition 3.8.
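As a plausibility check, one can insert the solution (3.20), (3.21), (3.24), (3.25), (3.27), (3.28) into the ten equations (3.18) and confirm that all residuals vanish for any $\upsilon \in\, ]-1, 1[$. The sketch below is an illustration only; the function name `residuals` and the sample values of $\upsilon$ are assumptions.

```python
import math

def residuals(v):
    """Left-hand side minus right-hand side of the ten equations (3.18),
    evaluated on the solution found in the proof of Proposition 3.8."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    alpha1, alpha2, alpha3 = k, 1.0, 1.0      # (3.24), (3.21)
    b = v * k                                 # (3.25)
    a1, a2, a3, a4 = v * k, 0.0, 0.0, k       # (3.27), (3.20), (3.28)
    return [
        alpha1 ** 2 - a1 ** 2 - 1.0,          # (3.18)_1
        alpha2 ** 2 - a2 ** 2 - 1.0,          # (3.18)_2
        alpha3 ** 2 - a3 ** 2 - 1.0,          # (3.18)_3
        b ** 2 - a4 ** 2 + 1.0,               # (3.18)_4
        a1 * a2,                              # (3.18)_5
        a1 * a3,                              # (3.18)_6
        a1 * a4 - alpha1 * b,                 # (3.18)_7
        a2 * a3,                              # (3.18)_8
        a2 * a4,                              # (3.18)_9
        a3 * a4,                              # (3.18)_10
    ]

# Largest residual over a few sample velocities (assumed test values).
worst = max(abs(r) for v in (-0.9, -0.3, 0.0, 0.5, 0.99) for r in residuals(v))
```

The value of `worst` should be zero up to floating-point rounding, in line with the expectation of a 1-parameter family of solutions.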
3.3 A condition for orthochronous Lorentz transformations

Using the terms introduced in Notation 3.7 for a given Lorentz transformation $\phi = (L, a)$ and its homogeneous version $\phi_0 = (L, 0_4)$, we obtain the following conclusion.

Conclusion 3.10. $\phi$ is orthochronous if and only if $\phi_0$ is orthochronous. Moreover, $\phi_0$ is orthochronous if and only if $L$ is orthochronous.

The aim is to prove another equivalence of this type. To do so we need the following.

Notation 3.11. Let $y \in \mathbb{R}^4$ be such that $y^T \cdot \eta \cdot y < 0$ and $y^4 > 0$ (or $y^4 < 0$). A Lorentz transformation $\phi = (L, a)$ satisfies the so-called OC condition if and only if $y' = \phi_0(y) = L \cdot y$ implies $y'^4 > 0$ (resp. $y'^4 < 0$). Thus, if $\phi$ satisfies the OC condition, all Lorentz transformations $\phi' = (L, a')$ do so for $a' \in \mathbb{R}^4$.

Now we can formulate the following theorem.

Theorem 3.12. The Lorentz transformation $\phi$ is orthochronous if and only if it satisfies the OC condition.

Proof. 1) Let $\phi = (L, a)$ be orthochronous, i.e., $L^4_4 \geq 1$. Then, applying equation (2.8) to $L^T$, we obtain

\[
L^4_4 > \Big( \sum_{\lambda=1}^{3} (L^4_\lambda)^2 \Big)^{1/2}. \tag{3.29}
\]

Since $y^T \cdot \eta \cdot y < 0$,

\[
(y^4)^2 > \sum_{\kappa=1}^{3} (y^\kappa)^2 \tag{3.30}
\]

holds. Thus,

\[
y^4 > \Big( \sum_{\kappa=1}^{3} (y^\kappa)^2 \Big)^{1/2}, \quad \text{if } y^4 > 0, \tag{3.31}
\]

and

\[
y^4 < -\Big( \sum_{\kappa=1}^{3} (y^\kappa)^2 \Big)^{1/2}, \quad \text{if } y^4 < 0. \tag{3.32}
\]

Let us consider the case $y^4 > 0$. Because of $y' = L \cdot y$ this implies

\[
y'^4 = L^4_\alpha y^\alpha \geq L^4_4 y^4 - \Big| \sum_{\kappa=1}^{3} L^4_\kappa y^\kappa \Big| \geq L^4_4 y^4 - \Big( \sum_{\lambda=1}^{3} (L^4_\lambda)^2 \Big)^{1/2} \Big( \sum_{\kappa=1}^{3} (y^\kappa)^2 \Big)^{1/2}. \tag{3.33}
\]

Using equations (3.29) and (3.31) we conclude

\[
y'^4 > 0. \tag{3.34}
\]

In a completely analogous way one obtains for $y^4 < 0$ and $y' = L \cdot y$:

\[
y'^4 = L^4_\alpha y^\alpha \leq L^4_4 y^4 + \Big| \sum_{\lambda=1}^{3} L^4_\lambda y^\lambda \Big| \leq L^4_4 y^4 + \Big( \sum_{\lambda=1}^{3} (L^4_\lambda)^2 \Big)^{1/2} \Big( \sum_{\kappa=1}^{3} (y^\kappa)^2 \Big)^{1/2}. \tag{3.35}
\]

This implies $y'^4 < 0$.

2) Conversely, let $\phi$ satisfy the OC condition, and consider $y = (0, 0, 0, y^4)^T$ with $y^4 > 0$. Now $y^T \cdot \eta \cdot y < 0$ holds. Moreover, $y' = L \cdot y$ and the OC condition imply $y'^4 = L^4_4 y^4 > 0$ and so $L^4_4 > 0$. Thus, $L$ is orthochronous.
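Theorem 3.12 can be illustrated numerically: an orthochronous matrix preserves the sign of the fourth component of every timelike vector, while an antichronous one reverses it. The sketch below is illustrative; the sample matrices ($S_\upsilon$ with $\upsilon = 0.8$ and $\eta \cdot S_\upsilon$), the sampling ranges and the fixed random seed are all assumptions chosen for this example.

```python
import math
import random

def special(v):
    """S_v of equation (2.28)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def fourth_component(L, y):
    """Fourth component of y' = L . y."""
    return sum(L[3][j] * y[j] for j in range(4))

def is_timelike(y):
    """y^T . eta . y < 0 with eta = diag(1, 1, 1, -1)."""
    return y[0] ** 2 + y[1] ** 2 + y[2] ** 2 - y[3] ** 2 < 0

random.seed(0)                                  # reproducible sample (assumption)
samples = []
while len(samples) < 1000:
    y = [random.uniform(-1, 1) for _ in range(3)] + [random.uniform(-2, 2)]
    if is_timelike(y):
        samples.append(y)

L_oc = special(0.8)                             # orthochronous: L44 >= 1
L_anti = [row[:] for row in L_oc]
L_anti[3] = [-c for c in L_anti[3]]             # eta . S_v: fourth row negated, L44 <= -1

# OC condition: the sign of y^4 is preserved by the orthochronous matrix ...
preserved = all((fourth_component(L_oc, y) > 0) == (y[3] > 0) for y in samples)
# ... and reversed by the antichronous one.
reversed_sign = all((fourth_component(L_anti, y) > 0) != (y[3] > 0) for y in samples)
```

A finite sample is of course no proof; it merely shows the inequality chain (3.33) at work on concrete vectors.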
4 Decomposition of Lorentz matrices and Lorentz transformations

In this chapter we prove that every Lorentz matrix can be represented as a product of three particular Lorentz matrices: one from $\mathcal{L}_{og}$, one from $\mathcal{L}_{s}$, and one from $\mathcal{L}_{og}$. This decomposition then implies a similar decomposition for Lorentz transformations.

4.1 The decomposition theorem for Lorentz matrices

4.1.1 Notations and assumptions

The following matrices will be considered:

\[
\begin{aligned}
P &= ((b^j_k))_3, & Q &= ((a_{kj}))_3, & L &= ((L^\alpha_\beta))_4, \\
K &= ((L^j_k))_3, & L_4 &= (L^1_4, L^2_4, L^3_4)^T, & L^4 &= (L^4_1, L^4_2, L^4_3).
\end{aligned} \tag{4.1}
\]

This yields

\[
L = \begin{pmatrix} K & L_4 \\ L^4 & L^4_4 \end{pmatrix}. \tag{4.2}
\]

Moreover, we use the abbreviation $0_3 = (0, 0, 0)$ and define

\[
B = \begin{pmatrix} P & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \tag{4.3}
\]

as well as

\[
S = \begin{pmatrix} L^4_4 & 0 & 0 & r \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ r & 0 & 0 & L^4_4 \end{pmatrix} = \begin{pmatrix} D & q^T \\ q & L^4_4 \end{pmatrix}, \tag{4.4}
\]

with

\[
D = \begin{pmatrix} L^4_4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad q = (r, 0, 0), \qquad r^2 = (L^4_4)^2 - 1, \qquad L^4_4 \geq 1. \tag{4.5}
\]

Together with the abbreviation

\[
\upsilon^2 = 1 - (L^4_4)^{-2} \tag{4.6}
\]

and with $|L^4_4| \geq 1$ one obtains the relations

\[
L^4_4 = (1 - \upsilon^2)^{-1/2}, \qquad -1 < \upsilon < 1, \qquad \text{and} \qquad r = -\upsilon (1 - \upsilon^2)^{-1/2}. \tag{4.7}
\]

Conclusion 4.1. $S$ is a special Lorentz matrix with dimensionless velocity $\upsilon$ measured in units of $c$. Using the notation introduced in equation (2.28) we obtain $S = S_\upsilon$.

Further notations

(1) The vector $a_1$ is defined by

\[
a_{1j} := r^{-1} L^4_j, \quad j = 1, 2, 3, \qquad a_1 := (a_{11}, a_{12}, a_{13}). \tag{4.8}
\]

(2) $a_2 := (a_{21}, a_{22}, a_{23})$ and $a_3 := (a_{31}, a_{32}, a_{33})$ denote normed vectors such that $\{a_1, a_2, a_3\}$ is an orthonormal basis in $\mathbb{R}^3$.

(3) Moreover, let

\[
b^j_1 := r^{-1} L^j_4, \quad j = 1, 2, 3, \tag{4.9}
\]

\[
b^j_k := \sum_{n=1}^{3} L^j_n a_{kn}, \quad k = 2, 3, \quad j = 1, 2, 3, \tag{4.10}
\]

and define $b_k := (b^1_k, b^2_k, b^3_k)$, $k = 1, 2, 3$.
4.1.2 Theorem and proof

Proposition 4.2.
(1) Let $L$ be a Lorentz matrix, i.e., $L$ satisfies

\[
L^T \cdot \eta \cdot L = \eta. \tag{4.11}
\]

Moreover, let $L$ be such that

\[
L^4_4 \geq 1. \tag{4.12}
\]

Then, under the assumptions (4.8), (4.9), and (4.10), the matrices $P$ and $Q$ in equation (4.1) are orthogonal. Moreover, using equations (4.3) and (4.4) the relation

\[
L = B \cdot S \cdot A \tag{4.13}
\]

holds.

(2) Conversely: Let $S$ be an arbitrary special Lorentz matrix, and let $A$ and $B$ be determined by arbitrary orthogonal matrices $P$ and $Q$ as in equation (4.3); then the matrix $L$ defined by equation (4.13) is a Lorentz matrix.
Remark 4.3. Throughout the following proof we will use equation (4.11) in the form $\eta_{\kappa\lambda} L^\kappa_\alpha L^\lambda_\beta = \eta_{\alpha\beta}$, as well as the fact that for a Lorentz matrix $L$ also $L^T$ and $L^{-1}$ are Lorentz matrices. Formula (2.8) is also used frequently.

Proof.
(1) $a_1$ is a normed vector, since

\[
\sum_{j=1}^{3} (a_{1j})^2 = r^{-2} \sum_{j=1}^{3} (L^4_j)^2 = r^{-2} \big( (L^4_4)^2 - 1 \big) = 1. \tag{4.14}
\]

By assumption, $a_2$ and $a_3$ are freely chosen normed vectors which are orthogonal to each other and to $a_1$, and thus $Q$ is orthogonal.

(2) We show that the vectors $b_k$ are normed. For $b_1$,

\[
\sum_{j=1}^{3} (b^j_1)^2 = r^{-2} \sum_{j=1}^{3} (L^j_4)^2 = r^{-2} \big( (L^4_4)^2 - 1 \big) = 1 \tag{4.15}
\]

holds. Considering $b_k$, $k = 2, 3$, and using equation (4.10) we obtain

\[
\begin{aligned}
\sum_{j=1}^{3} (b^j_k)^2 &= \sum_{n,m=1}^{3} \Big( \sum_{j=1}^{3} L^j_n L^j_m \Big) a_{kn} a_{km} \\
&= \sum_{n,m=1}^{3} a_{kn} a_{km} (L^4_n L^4_m + \delta_{nm}) \\
&= \sum_{n=1}^{3} (a_{kn})^2 + \Big( \sum_{n=1}^{3} a_{kn} L^4_n \Big)^2 \\
&= 1,
\end{aligned} \tag{4.16}
\]

since $a_k$ is normed and

\[
\sum_{n=1}^{3} a_{kn} L^4_n = r \sum_{n=1}^{3} a_{kn} a_{1n} = 0. \tag{4.17}
\]
(3) We show that $\{b_1, b_2, b_3\}$ is an orthonormal basis in $\mathbb{R}^3$. The vector $b_1$ is orthogonal to $b_2$ and $b_3$, since, for $k = 2, 3$,

\[
\begin{aligned}
\sum_{j=1}^{3} b^j_1 b^j_k &= r^{-1} \sum_{n,j=1}^{3} L^j_4 L^j_n a_{kn} \\
&= r^{-1} L^4_4 \sum_{n=1}^{3} L^4_n a_{kn} \\
&= L^4_4 \sum_{n=1}^{3} a_{1n} a_{kn} = 0,
\end{aligned} \tag{4.18}
\]

where the orthogonality between $a_1$ and $a_k$ is used. Furthermore, $b_2$ and $b_3$ are orthogonal, since

\[
\begin{aligned}
\sum_{j=1}^{3} b^j_2 b^j_3 &= \sum_{j,n,m=1}^{3} L^j_n a_{2n} L^j_m a_{3m} \\
&= \sum_{n,m=1}^{3} a_{2n} a_{3m} (L^4_n L^4_m + \delta_{nm}) \\
&= \sum_{n=1}^{3} a_{2n} a_{3n} + \Big( \sum_{n=1}^{3} a_{2n} L^4_n \Big) \Big( \sum_{m=1}^{3} a_{3m} L^4_m \Big) = 0,
\end{aligned} \tag{4.19}
\]

by the orthogonality of $a_1, a_2, a_3$, and $L^4_n = r a_{1n}$.

(4) Now we can use these preparations to prove the first part of the proposition, equation (4.13). We reformulate it in the equivalent form

\[
B^T \cdot L \cdot A^T = S. \tag{4.20}
\]

Plugging equations (4.1)–(4.5) into (4.20), we find that we must prove

\[
\begin{pmatrix} P^T \cdot K \cdot Q^T & P^T \cdot L_4 \\ L^4 \cdot Q^T & L^4_4 \end{pmatrix} = \begin{pmatrix} D & q^T \\ q & L^4_4 \end{pmatrix}. \tag{4.21}
\]

Thus, we need to verify three relations:

\[
P^T \cdot K \cdot Q^T = D, \tag{4.22}
\]
\[
P^T \cdot L_4 = q^T, \tag{4.23}
\]
\[
L^4 \cdot Q^T = q. \tag{4.24}
\]

We can decompose equations (4.22), (4.23), and (4.24) into components ($k, l = 1, 2, 3$):

\[
\sum_{j,n=1}^{3} b^j_k L^j_n a_{ln} = D_{kl}, \tag{4.25}
\]
\[
\sum_{j=1}^{3} b^j_k L^j_4 = r \delta_{1k}, \tag{4.26}
\]
\[
\sum_{j=1}^{3} L^4_j a_{kj} = r \delta_{1k}, \tag{4.27}
\]

which is the form we will use in the following.
(5) Proof of equation (4.25) by case-by-case analysis:

(5a) $l = k = 1$:

\[
\begin{aligned}
\sum_{j,n=1}^{3} b^j_1 L^j_n a_{1n} &= r^{-1} \sum_{j,n=1}^{3} L^j_4 L^j_n a_{1n} \\
&= r^{-1} L^4_4 \sum_{n=1}^{3} L^4_n a_{1n} \\
&= L^4_4 \sum_{n=1}^{3} (a_{1n})^2 = L^4_4 = D_{11}.
\end{aligned} \tag{4.28}
\]

(5b) $l = k = 2, 3$:

\[
\sum_{j,n=1}^{3} b^j_k L^j_n a_{kn} = \sum_{j=1}^{3} (b^j_k)^2 = 1 = D_{kk}. \tag{4.29}
\]

(5c) $l = 1$ and $k = 2, 3$:

\[
\sum_{j,n=1}^{3} b^j_k L^j_n a_{1n} = r^{-1} \sum_{j,n=1}^{3} b^j_k L^j_n L^4_n = r^{-1} L^4_4 \sum_{j=1}^{3} b^j_k L^j_4 = L^4_4 \sum_{j=1}^{3} b^j_k b^j_1 = 0 = D_{1k}. \tag{4.30}
\]

(5d) $l = 2, 3$ and $k = 1$:

\[
\sum_{j,n=1}^{3} b^j_1 L^j_n a_{ln} = \sum_{j=1}^{3} b^j_1 b^j_l = 0 = D_{1l}. \tag{4.31}
\]

(5e) $l = 2$, $k = 3$, or $l = 3$, $k = 2$:

\[
\begin{aligned}
\sum_{j,n=1}^{3} b^j_3 L^j_n a_{2n} &= \sum_{j=1}^{3} b^j_3 b^j_2 = 0 = D_{32}, \\
\sum_{j,n=1}^{3} b^j_2 L^j_n a_{3n} &= \sum_{j=1}^{3} b^j_2 b^j_3 = 0 = D_{23}.
\end{aligned} \tag{4.32}
\]

(6) Proof of equations (4.26) and (4.27):

\[
\begin{aligned}
\sum_{j=1}^{3} b^j_k L^j_4 &= r \sum_{j=1}^{3} b^j_k b^j_1 = r \delta_{1k}, \\
\sum_{j=1}^{3} L^4_j a_{kj} &= r \sum_{j=1}^{3} a_{1j} a_{kj} = r \delta_{1k}.
\end{aligned} \tag{4.33}
\]

This proves the first part of the decomposition theorem.

(7) The second part of the theorem is immediate, since the given assumptions imply that $A$, $B$, and $S$ are Lorentz matrices.
Remark 4.4. In case $\det Q = -1$, one can interchange $a_2$ and $a_3$ to obtain $\det Q = 1$. In other words, if the orthonormal basis $\{a_1, a_2, a_3\}$ is negatively oriented, then $\{a_1, a_3, a_2\}$ is positively oriented.

Corollary 4.5. The proof of the decomposition theorem presented here is constructive. For every orthochronous Lorentz matrix $L$ we can construct the matrices $A$, $B$, and $S_\upsilon$ explicitly from its matrix elements $L^j_k$ by using equations (4.6), (4.7), (4.8), (4.9), and (4.10).
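The construction can be carried out step by step in code. The following is a minimal sketch under stated assumptions rather than a definitive implementation: matrices are lists of rows with 0-based indices, the positive root is chosen for $r$ (the proof only uses $r^2$ and a consistent sign), the vectors $a_2, a_3$ are completed by a Gram-Schmidt step and a cross product, and $L^4_4 > 1$ is assumed so that $r \neq 0$. All function names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def special(v):
    """S_v of equation (2.28)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def rotation(i, j, phi):
    """4x4 rotation mixing the spatial axes i and j (0-based); used for test data only."""
    M = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
    M[i][i] = M[j][j] = math.cos(phi)
    M[i][j], M[j][i] = math.sin(phi), -math.sin(phi)
    return M

def embed(M3):
    """Block matrix of the form (4.3) built from a 3x3 matrix."""
    return [list(M3[i]) + [0.0] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def complete_basis(a1):
    """Unit vectors a2, a3 such that {a1, a2, a3} is orthonormal (a1 a unit vector)."""
    i = min(range(3), key=lambda n: abs(a1[n]))     # axis least aligned with a1
    t = [(1.0 if n == i else 0.0) - a1[i] * a1[n] for n in range(3)]
    norm = math.sqrt(sum(c * c for c in t))
    a2 = [c / norm for c in t]
    a3 = [a1[1] * a2[2] - a1[2] * a2[1],
          a1[2] * a2[0] - a1[0] * a2[2],
          a1[0] * a2[1] - a1[1] * a2[0]]           # cross product a1 x a2
    return a2, a3

def decompose(L):
    """Return (B, S, A) with L = B . S . A for an orthochronous L with L44 > 1."""
    L44 = L[3][3]
    r = math.sqrt(L44 * L44 - 1.0)                  # r^2 = (L44)^2 - 1, see (4.5)
    v = -r / L44                                    # from (4.7): r = -v (1 - v^2)^(-1/2)
    a = [[L[3][j] / r for j in range(3)]]           # a_1, equation (4.8)
    a.extend(complete_basis(a[0]))                  # a_2, a_3
    b = [[L[j][3] / r for j in range(3)]]           # b_1, equation (4.9)
    for k in (1, 2):                                # b_2, b_3, equation (4.10)
        b.append([sum(L[j][n] * a[k][n] for n in range(3)) for j in range(3)])
    Q = a                                           # rows of Q are the a_k
    P = transpose(b)                                # columns of P are the b_k
    return embed(P), special(v), embed(Q)

# Sample orthochronous Lorentz matrix (assumption): rotation . boost . rotation.
L = matmul(matmul(rotation(0, 1, 0.4), special(0.6)), rotation(0, 2, -1.1))
B, S, A = decompose(L)
reconstructed = matmul(matmul(B, S), A)
identity = embed([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
b_orthogonal = close(matmul(B, transpose(B)), identity)
a_orthogonal = close(matmul(A, transpose(A)), identity)
```

Here `reconstructed` should agree with `L` up to rounding, and both `B` and `A` should be orthogonal, as Proposition 4.2 asserts. A different admissible choice of $a_2, a_3$ yields a different but equally valid decomposition.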
4.1.3 Remarks on the interpretation of the decomposition theorem

(1) Let $L$ be a Lorentz matrix and consider the Lorentz transformation

\[
x' = L \cdot x. \tag{4.34}
\]

The point $x = (0, 0, 0, x^4)^T$ can be interpreted as the spatial origin in $x$-coordinates at time $x^4$. In the coordinates $x'$ it becomes

\[
\begin{aligned}
x' &= L \cdot (0, 0, 0, x^4)^T = (L^1_4, L^2_4, L^3_4, L^4_4)^T x^4, \\
x'^4 &= L^4_4 x^4, \\
x'^j &= L^j_4 x^4 = (L^4_4)^{-1} L^j_4\, x'^4, \quad j = 1, 2, 3.
\end{aligned} \tag{4.35}
\]

With the help of equation (4.9) it follows that, for $j = 1, 2, 3$,

\[
u^j := (L^4_4)^{-1} L^j_4 = r (L^4_4)^{-1} b^j_1 \tag{4.36}
\]

are the components of the velocity of the spatial origin in $x'$-coordinates. The norm $|u|$ of the vector $u = (u^1, u^2, u^3)$ is given by the root of

\[
\sum_{j=1}^{3} (u^j)^2 = (L^4_4)^{-2} \sum_{j=1}^{3} (L^j_4)^2 = (L^4_4)^{-2} \big( (L^4_4)^2 - 1 \big). \tag{4.37}
\]

Thus, by equation (4.6),

\[
|u|^2 = \upsilon^2 \quad \text{and} \quad u = \upsilon b_1, \tag{4.38}
\]

which is exactly the result one expects. The sign of $\upsilon$ depends on the orientation of the orthonormal basis $\{a_1, a_2, a_3\}$ resp. $\{b_1, b_2, b_3\}$.

(2) The corresponding conclusion for the spatial origin of the $x'$-coordinates at time $x'^4$, expressed in $x$-coordinates, can be obtained from $x = L^{-1} \cdot x'$ and

\[
x = L^{-1} \cdot (0, 0, 0, x'^4)^T = -(L^4_1, L^4_2, L^4_3, -L^4_4)^T x'^4. \tag{4.39}
\]

Thus,

\[
x^j = -(L^4_4)^{-1} L^4_j\, x^4, \quad j = 1, 2, 3. \tag{4.40}
\]

The velocity vector $w$ is given by the components

\[
w_j = -(L^4_4)^{-1} L^4_j = -r (L^4_4)^{-1} a_{1j}; \tag{4.41}
\]

compare equation (4.8). As in (1) one obtains

\[
|w|^2 = \upsilon^2 = |u|^2 \quad \text{and} \quad w = -\upsilon a_1, \tag{4.42}
\]

as expected.
4.1.4 Decomposition of nonorthochronous Lorentz matrices

The proof of the decomposition theorem assumed that $L$ is orthochronous. This assumption can be dropped due to the following proposition.

Proposition 4.6. Let $L'$ be an antichronous Lorentz matrix. Then there exist a special Lorentz matrix $S'$ and two matrices $A'$ and $B'$ of the form

\[
A' = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \qquad B' = \begin{pmatrix} P & 0_3^T \\ 0_3 & -1 \end{pmatrix}, \tag{4.43}
\]

with orthogonal $3 \times 3$ matrices $P$ and $Q$, such that

\[
L' = B' \cdot S' \cdot A'. \tag{4.44}
\]

Proof. According to Corollary 2.10, $L = \eta \cdot L'$ is orthochronous, and thus there exists a decomposition $L = B \cdot S \cdot A$. Since $\eta \cdot \eta = 1_4$, we get $L' = (\eta \cdot B) \cdot S \cdot A$, which is the decomposition (4.44) with $B' = \eta \cdot B$, $S' = S$, and $A' = A$.
4.2 The decomposition theorem for Lorentz transformations

Let a Lorentz transformation $(L, z)$ and a decomposition of its Lorentz matrix,

\[
L = B \cdot S_\upsilon \cdot A, \tag{4.45}
\]

be given. We then ask: Is it possible to find three vectors $u, w, y \in \mathbb{R}^4$ such that

\[
(L, z) = (B, u) \circ (S_\upsilon, w) \circ (A, y)? \tag{4.46}
\]

The following proposition holds.

Proposition 4.7. For every given pair $(u, w)$, $(u, y)$, or $(w, y)$ it is possible to determine the missing vector $y$, $w$, or $u$ such that equation (4.46) is true.

Proof. According to equation (3.2) we can write

\[
(B, u) \circ (S_\upsilon, w) \circ (A, y) = (B \cdot S_\upsilon \cdot A,\ B \cdot S_\upsilon \cdot y + B \cdot w + u). \tag{4.47}
\]

Thus, equation (4.46) holds exactly if

\[
z = B \cdot S_\upsilon \cdot y + B \cdot w + u. \tag{4.48}
\]

From this expression we can deduce:

\[
\begin{aligned}
&\text{given } (u, w), && \text{then } y = S_\upsilon^{-1} \cdot B^{-1} (z - u - B \cdot w), \\
&\text{given } (u, y), && \text{then } w = B^{-1} (z - B \cdot S_\upsilon \cdot y - u), \\
&\text{given } (w, y), && \text{then } u = z - B \cdot S_\upsilon \cdot y - B \cdot w.
\end{aligned} \tag{4.49}
\]

These are the essential facts about the decomposition of Lorentz transformations.

Conclusion 4.8. The above proposition shows that for a given Lorentz matrix $L$ the decomposition of the corresponding Lorentz transformations can be done in several ways, i.e., it is not unique. Besides this nonuniqueness there is an additional nonuniqueness in the previously discussed decomposition of Lorentz matrices, which will be discussed in Section 4.3.
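The three cases of Proposition 4.7 amount to solving $z = B \cdot S_\upsilon \cdot y + B \cdot w + u$ for one vector when the other two are given. The sketch below illustrates this under assumptions made for the example: $B$ is a sample rotational Lorentz matrix (so $B^{-1} = B^T$), $S_\upsilon^{-1} = S_{-\upsilon}$ by Conclusion 2.18, and all names and numerical values are hypothetical.

```python
import math

def special(v):
    """S_v of equation (2.28)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(4)) for i in range(4)]

def translation(B, v, u, w, y):
    """z = B . S_v . y + B . w + u: the translation part of the product in (4.46)."""
    BSy = apply(B, apply(special(v), y))
    Bw = apply(B, w)
    return [p + q + s for p, q, s in zip(BSy, Bw, u)]

def solve_y(B, v, z, u, w):
    """y = S_v^{-1} . B^{-1} (z - u - B . w), using S_v^{-1} = S_{-v} and B^{-1} = B^T."""
    Bw = apply(B, w)
    rhs = [zi - ui - bi for zi, ui, bi in zip(z, u, Bw)]
    return apply(special(-v), apply(transpose(B), rhs))

def solve_w(B, v, z, u, y):
    """w = B^{-1} (z - B . S_v . y - u)."""
    BSy = apply(B, apply(special(v), y))
    return apply(transpose(B), [zi - bi - ui for zi, bi, ui in zip(z, BSy, u)])

def solve_u(B, v, z, w, y):
    """u = z - B . S_v . y - B . w."""
    BSy = apply(B, apply(special(v), y))
    Bw = apply(B, w)
    return [zi - p - q for zi, p, q in zip(z, BSy, Bw)]

c, s = math.cos(0.3), math.sin(0.3)
B = [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # sample rotation (assumption)
v = 0.6
u, w, y = [1.0, 0.0, 2.0, -1.0], [0.5, 0.5, 0.0, 3.0], [-2.0, 1.0, 1.0, 0.0]
z = translation(B, v, u, w, y)

y_back = solve_y(B, v, z, u, w)
w_back = solve_w(B, v, z, u, y)
u_back = solve_u(B, v, z, w, y)
```

Each of the three solver functions recovers the vector that was left out, which makes the nonuniqueness stated in Conclusion 4.8 concrete: two of the three translations can be prescribed freely.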
4.3 Nonuniqueness of the decomposition of Lorentz matrices

Let $L$ be a given Lorentz matrix, $S_\upsilon$, $S_{\upsilon'}$ be special Lorentz matrices, and $A_1, A_2, B_1, B_2$ be rotational Lorentz matrices such that

\[
L = B_1 \cdot S_\upsilon \cdot A_1 = B_2 \cdot S_{\upsilon'} \cdot A_2 \tag{4.50}
\]

holds.

Problem 4.9. What are the consequences of the equations (4.50) for $S_\upsilon$, $S_{\upsilon'}$, $A_1$, $A_2$, $B_1$, $B_2$?

Since equation (4.50) is equivalent to

\[
S_\upsilon = B_1^T \cdot B_2 \cdot S_{\upsilon'} \cdot A_2 \cdot A_1^T, \tag{4.51}
\]

we can reformulate the problem as follows: What are the implications of

\[
S_\upsilon = B \cdot S_{\upsilon'} \cdot A \tag{4.52}
\]

for special Lorentz matrices $S_\upsilon$, $S_{\upsilon'}$ and rotational Lorentz matrices $B$, $A$?

Proposition 4.10. Let us assume that equation (4.52) holds. Then

\[
\upsilon' = \pm\upsilon, \qquad A = B^T, \tag{4.53}
\]

and

\[
B = \begin{pmatrix} \pm 1 & 0_2 & 0 \\ 0_2^T & U & 0_2^T \\ 0 & 0_2 & 1 \end{pmatrix}, \tag{4.54}
\]

with $U$ being an arbitrary orthogonal $2 \times 2$ matrix and $0_2 = (0, 0)$. The sign in equations (4.53) and (4.54) must be the same, i.e., $+1$ or $-1$ in both equations.
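Proof. (1) It is useful to introduce suitable notation first. Let $r = -\upsilon k$, $r' = -\upsilon' k'$ with $k = (1 - \upsilon^2)^{-1/2} > 0$, $k' = (1 - \upsilon'^2)^{-1/2}$, and

\[
D = \mathrm{diag}(k, 1, 1), \quad D' = \mathrm{diag}(k', 1, 1), \quad q = (r, 0, 0), \quad q' = (r', 0, 0). \tag{4.55}
\]

Then we can write

\[
S_\upsilon = \begin{pmatrix} D & q^T \\ q & k \end{pmatrix}, \qquad S_{\upsilon'} = \begin{pmatrix} D' & q'^T \\ q' & k' \end{pmatrix}. \tag{4.56}
\]

Moreover, let $0_j = (0, \ldots, 0)$ with $j$ zeros, and let

\[
B = \begin{pmatrix} P & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \tag{4.57}
\]

where $P$ and $Q$ are orthogonal $3 \times 3$ matrices.

(2) Using the notation just introduced we can rewrite equation (4.52):

\[
B \cdot S_{\upsilon'} \cdot A = \begin{pmatrix} P & 0_3^T \\ 0_3 & 1 \end{pmatrix} \cdot \begin{pmatrix} D' \cdot Q & q'^T \\ q' \cdot Q & k' \end{pmatrix} = \begin{pmatrix} P \cdot D' \cdot Q & P \cdot q'^T \\ q' \cdot Q & k' \end{pmatrix} = S_\upsilon. \tag{4.58}
\]

Using equations (4.56) and (4.58) yields

\[
k' = k \quad \text{and} \quad \upsilon' = \pm\upsilon, \tag{4.59}
\]

and so

\[
S_{\upsilon'} = S_{\pm\upsilon} \quad \text{and} \quad r' = \pm r. \tag{4.60}
\]

Thus, we obtain

\[
D' = D \quad \text{and} \quad q' = \pm q. \tag{4.61}
\]

Furthermore, we can conclude several equations from equation (4.58):

\[
P \cdot D \cdot Q = D, \qquad P \cdot q^T = \pm q^T, \qquad Q^T \cdot q^T = \pm q^T. \tag{4.62}
\]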
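(3) From equation (4.62) it follows that $q^T$ is an eigenvector of $P$ and $Q^T$ with eigenvalue $\pm 1$. Thus, $P$ and $Q$ are of the form

\[
P = \begin{pmatrix} \pm 1 & 0_2 \\ 0_2^T & U \end{pmatrix}, \qquad Q = \begin{pmatrix} \pm 1 & 0_2 \\ 0_2^T & V \end{pmatrix}, \tag{4.63}
\]

where $U$ and $V$ are $2 \times 2$ matrices and $0_2 = (0, 0)$. Employing

\[
1_3 = P \cdot P^T = \begin{pmatrix} 1 & 0_2 \\ 0_2^T & U \cdot U^T \end{pmatrix} \tag{4.64}
\]

yields

\[
U \cdot U^T = 1_2. \tag{4.65}
\]

Similarly we have for $Q$

\[
V \cdot V^T = 1_2. \tag{4.66}
\]

Thus, the matrices $U$ and $V$ are orthogonal.

(4) From equation (4.62) it follows that

\[
P \cdot D = \begin{pmatrix} \pm 1 & 0_2 \\ 0_2^T & U \end{pmatrix} \cdot \begin{pmatrix} k & 0_2 \\ 0_2^T & 1_2 \end{pmatrix} = \begin{pmatrix} \pm k & 0_2 \\ 0_2^T & U \end{pmatrix}
= D \cdot Q^T = \begin{pmatrix} k & 0_2 \\ 0_2^T & 1_2 \end{pmatrix} \cdot \begin{pmatrix} \pm 1 & 0_2 \\ 0_2^T & V^T \end{pmatrix} = \begin{pmatrix} \pm k & 0_2 \\ 0_2^T & V^T \end{pmatrix}, \tag{4.67}
\]

from which we can conclude

\[
U = V^T. \tag{4.68}
\]

There are no further equations $U$ has to satisfy; apart from being orthogonal, it can be chosen freely. Together with equation (4.64) this allows the conclusion that

\[
P = Q^T \quad \text{and} \quad B = A^T, \tag{4.69}
\]

as well as

\[
B = \begin{pmatrix} \pm 1 & 0_2 & 0 \\ 0_2^T & U & 0_2^T \\ 0 & 0_2 & 1 \end{pmatrix}. \tag{4.70}
\]

This proves the proposition.

Corollary 4.11.
(1) Equations (4.52) and (4.69) imply

\[
B \cdot S_{\upsilon'} = S_\upsilon \cdot B \tag{4.71}
\]

for either choice of the sign.

(2) There exist two choices to satisfy equation (4.50) for given Lorentz matrices $A_1, B_1, S_\upsilon$ with Lorentz matrices $A_2, B_2, S_{\upsilon'}$. One can choose the plus sign, i.e., $\upsilon' = \upsilon$ and so $S_{\upsilon'} = S_\upsilon$, as well as

\[
B_2 = B_1 \cdot B \quad \text{and} \quad A_2 = B^T \cdot A_1. \tag{4.72}
\]

Alternatively, one can choose the minus sign, i.e., $\upsilon' = -\upsilon$ and so $S_{\upsilon'} = S_{-\upsilon}$, as well as $B_2$ and $A_2$ according to (4.72), where the minus sign has to be chosen in the expressions for $B$ and $A$.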
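Both sign choices of Proposition 4.10 can be confirmed by a short computation. The sketch below is illustrative: the block $U$ is taken to be a rotation by an arbitrary angle (one possible orthogonal $2 \times 2$ matrix), and the helper names and sample values are assumptions.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def special(v):
    """S_v of equation (2.28)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def block_b(sign, alpha):
    """Matrix B of the form (4.54), with U a rotation by alpha (sample choice of U)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[sign, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]]

v, alpha = 0.6, 0.9                        # sample values (assumption)

B_plus = block_b(1.0, alpha)
plus_ok = close(matmul(matmul(B_plus, special(v)), transpose(B_plus)), special(v))

B_minus = block_b(-1.0, alpha)
minus_ok = close(matmul(matmul(B_minus, special(-v)), transpose(B_minus)), special(v))
```

With the plus sign, $B$ commutes with $S_\upsilon$; with the minus sign, conjugation by $B$ turns $S_{-\upsilon}$ into $S_\upsilon$, which is the content of equation (4.52) with $A = B^T$.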
4.4 The decomposition of products

4.4.1 Preliminary remarks

Let two Lorentz matrices $L'$ and $L''$ as well as their product $L = L' \cdot L''$ be given. We use the notation from Section 4.1.1 and the decomposition theorem (Proposition 4.2) to perform the following decompositions:

\[
L' = B' \cdot S_{\upsilon'} \cdot A', \qquad L'' = B'' \cdot S_{\upsilon''} \cdot A'', \qquad L = B \cdot S_\upsilon \cdot A. \tag{4.73}
\]

Now the following questions appear naturally:
(1) Which parts of $L'$ and $L''$ determine $\upsilon$ and thus the matrix $S_\upsilon$?
(2) Which parts of $L'$ and $L''$ determine the matrices $B$ and $A$?

To answer these questions we use the notation from Section 4.1.1:

\[
L' = \begin{pmatrix} K' & L'_4 \\ L'^4 & L'^4_4 \end{pmatrix}, \qquad L'' = \begin{pmatrix} K'' & L''_4 \\ L''^4 & L''^4_4 \end{pmatrix}, \tag{4.74}
\]

which yields

\[
L = L' \cdot L'' = \begin{pmatrix} K' \cdot K'' + L'_4 \cdot L''^4 & K' \cdot L''_4 + L'_4 \cdot L''^4_4 \\ L'^4 \cdot K'' + L'^4_4 \cdot L''^4 & L'^4 \cdot L''_4 + L'^4_4 \cdot L''^4_4 \end{pmatrix}. \tag{4.75}
\]

From this equation we can read off the following relations:

\[
\begin{aligned}
L^4_4 &= L'^4 \cdot L''_4 + L'^4_4 \cdot L''^4_4, \\
L_4 &= K' \cdot L''_4 + L'_4 \cdot L''^4_4, \\
L^4 &= L'^4 \cdot K'' + L'^4_4 \cdot L''^4.
\end{aligned} \tag{4.76}
\]

Furthermore, we use the notations (4.6) to (4.10) for $L$, $L'$ and $L''$ in the following, i.e., $a'_1$, $b''_1$, $\upsilon$, $\upsilon'$, $\upsilon''$, for example.
4.4.2 The theorem of relativistic addition of velocities

The velocities $\upsilon$, $\upsilon'$, $\upsilon''$ for $L$, $L'$, $L''$ are determined by (4.6), while the vectors $a'_1$, $b''_1$ are determined by (4.8) and (4.9), respectively.

Proposition 4.12. Using the abbreviation $\cos\vartheta := a'_1 \cdot b''_1$ the following holds:

\[
\upsilon = (1 + \upsilon' \upsilon'' \cos\vartheta)^{-1} \cdot \big( \upsilon'^2 + \upsilon''^2 + 2 \upsilon' \upsilon'' \cos\vartheta - \upsilon'^2 \upsilon''^2 \sin^2\vartheta \big)^{1/2}. \tag{4.77}
\]

Proof. The equations (4.8), (4.9), and (4.5) imply

\[
L'^4 = r' a'_1 = \upsilon' L'^4_4\, a'_1, \qquad L''_4 = r'' b''_1 = \upsilon'' L''^4_4\, b''_1. \tag{4.78}
\]

Plugging equation (4.78) into (4.76) yields

\[
L^4_4 = L'^4_4\, L''^4_4\, (1 + \upsilon' \upsilon'' \cos\vartheta). \tag{4.79}
\]
Using equation (4.6) in (4.79) yields equation (4.77).

Remark 4.13. For $\vartheta = 0$ one obtains the addition theorem (2.30).

Remark 4.14. A clear interpretation of the angle $\vartheta$ can be given by using the interpretation of the vectors $a_1$ and $b_1$ discussed in Section 4.1.3. Again, let $L = L' \cdot L''$, and furthermore let

\[
x = L' \cdot x', \qquad x' = L'' \cdot x'', \qquad x = L \cdot x''. \tag{4.80}
\]

According to Section 4.1.3 the normed coordinate vector $b''_1$ is interpreted as the spatial direction along which the spatial origin $(0, 0, 0, x''^4)$ of the $x''$-coordinates moves, expressed in $x'$-coordinates. Thus, $u'' = \upsilon'' b''_1$ is the (coordinate) vector of the velocity with which the spatial origin $(0, 0, 0, x''^4)$ of the $x''$-coordinates moves, in $x'$-coordinates. Accordingly, $-a'_1$ is the direction of the velocity with which the spatial origin of the $x$-coordinates moves, in $x'$-coordinates.

From these considerations we can conclude that the coordinate vectors $a'_1$ and $b''_1$ live in the same space, i.e., in $\mathbb{R}^3$. Thus, their (inner) product is mathematically well-defined.
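The addition formula of Proposition 4.12 can be cross-checked against its derivation. The sketch below takes the denominator $1 + \upsilon' \upsilon'' \cos\vartheta$, as in equation (4.79), computes $\upsilon$ once from the closed formula and once from (4.79) together with (4.6), and also checks the collinear case $\vartheta = 0$ of Remark 4.13. Function names and sample values are assumptions made for this illustration.

```python
import math

def add_velocities(v1, v2, theta):
    """Modulus of the composed velocity according to the addition formula (4.77),
    with denominator 1 + v1 v2 cos(theta) as in equation (4.79)."""
    c = math.cos(theta)
    s2 = math.sin(theta) ** 2
    radicand = v1 * v1 + v2 * v2 + 2.0 * v1 * v2 * c - v1 * v1 * v2 * v2 * s2
    return math.sqrt(radicand) / (1.0 + v1 * v2 * c)

def via_gamma(v1, v2, theta):
    """Same quantity from L44 = L'44 L''44 (1 + v1 v2 cos(theta)) and v^2 = 1 - (L44)^(-2)."""
    k1 = 1.0 / math.sqrt(1.0 - v1 * v1)
    k2 = 1.0 / math.sqrt(1.0 - v2 * v2)
    k = k1 * k2 * (1.0 + v1 * v2 * math.cos(theta))
    return math.sqrt(1.0 - 1.0 / (k * k))

v1, v2, theta = 0.6, 0.7, 1.0                      # sample values (assumption)
general_gap = abs(add_velocities(v1, v2, theta) - via_gamma(v1, v2, theta))
collinear_gap = abs(add_velocities(v1, v2, 0.0) - (v1 + v2) / (1.0 + v1 * v2))
```

Both gaps should vanish up to rounding, and the composed speed stays below 1 for all admissible inputs.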
4.4.3 Decomposition of a product $L = L' \cdot L''$

Equation (4.77) determines the velocity $\upsilon$, and thus $S_\upsilon$. To obtain the desired decomposition of $L$, the missing ingredients are the matrices $A$, $B$ and $Q$, $P$, respectively. According to equation (4.8), $a_1$ is determined by $L^4$, and $L^4$ by equation (4.76). Choosing the vectors $a_2$ and $a_3$ (see Section 4.1.1) then determines the $3 \times 3$ matrix $Q$ and, by equation (4.3), the matrix $A$. Similarly, equation (4.9) determines $b_1$ from $L_4$, and $L_4$ is determined by equation (4.76). The vectors $b_2$ and $b_3$, which fix the matrices $P$ and $B$, can be determined by equation (4.10). Thus, the decomposition of $L = L' \cdot L''$ can be performed explicitly, at least in principle.
4.5 Parameter representation of Lorentz matrices

In this section we consider the group $\mathcal{L}_{oe} := \mathcal{L}_{oc} \cap \mathcal{L}_{ei}$, the group of proper orthochronous Lorentz matrices. For this purpose we use the following notations.

Notation 4.15. Let $R_{jk}(\varphi)$, $1 \leq j < k \leq 3$, be the $3 \times 3$ matrix whose diagonal entries $jj$ and $kk$ are $\cos\varphi$ and whose off-diagonal entries $jk$ and $kj$ are $\sin\varphi$ and $-\sin\varphi$, respectively. Furthermore, let the remaining entry on the diagonal be 1 and all other entries be zero, i.e., for example

\[
R_{12}(\varphi) = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.81}
\]

Let $W_{jk}(\varphi)$ be the $4 \times 4$ matrix defined by

\[
W_{jk}(\varphi) = \begin{pmatrix} R_{jk}(\varphi) & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \tag{4.82}
\]

with $0_3 := (0, 0, 0)$.

Conclusion 4.16. Using Notation 1.11.(4) the following holds:

\[
\begin{aligned}
W_{jk}(\varphi) \cdot W_{jk}(\psi) &= W_{jk}(\varphi + \psi), & R_{jk}(\varphi) \cdot R_{jk}(\psi) &= R_{jk}(\varphi + \psi), \\
R_{jk}(0) &= 1_3, & W_{jk}(0) &= 1_4, \\
R_{jk}(\varphi)^{-1} &= R_{jk}(\varphi)^T = R_{jk}(-\varphi), & W_{jk}(\varphi)^{-1} &= W_{jk}(\varphi)^T = W_{jk}(-\varphi).
\end{aligned} \tag{4.83}
\]

With this notation the Euler Theorem can be formulated as follows.

Proposition 4.17. Let $V$ be an orthogonal $3 \times 3$ matrix. Then there exist precisely three angles $\varphi_{12}, \varphi_{13}, \varphi_{23}$, all in the interval $]-\frac{\pi}{2}, \frac{\pi}{2}[$, such that

\[
V = R_{12}(\varphi_{12}) \cdot R_{13}(\varphi_{13}) \cdot R_{23}(\varphi_{23}) \cdot C_\pm, \tag{4.84}
\]

with $C_\pm = \mathrm{diag}(1, 1, \pm 1)$ for $\det V = \pm 1$.

The proof can be found in various books on matrices, for example in [22, p. 150].

Conclusion 4.18. Let $V$ be as in (4.84) and $U = \begin{pmatrix} V & 0_3^T \\ 0_3 & 1 \end{pmatrix}$; then

\[
U = W_{12}(\varphi_{12}) \cdot W_{13}(\varphi_{13}) \cdot W_{23}(\varphi_{23}) \cdot E_\pm, \tag{4.85}
\]

with $E_\pm = \mathrm{diag}(1, 1, \pm 1, 1)$.
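Having clarified these prerequisites we can prove the following proposition.

Proposition 4.19. For every $L \in \mathcal{L}_{oe}$ there exist 6 parameters

\[
(\varphi_{12}, \varphi_{13}, \psi_{12}, \psi_{13}) \in\, \left]-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right[^4, \qquad \chi \in\, ]-\pi, \pi[, \qquad \upsilon \in\, ]-1, 1[, \tag{4.86}
\]

such that

\[
L = W_{12}(\varphi_{12}) \cdot W_{13}(\varphi_{13}) \cdot W_{23}(\chi) \cdot S_\upsilon \cdot W_{13}(-\psi_{13}) \cdot W_{12}(-\psi_{12}). \tag{4.87}
\]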
Proof. (1) Consider $L \in \mathcal{L}_{oe}$ and

\[
L = B \cdot S_\upsilon \cdot A, \tag{4.88}
\]

where

\[
B = \begin{pmatrix} P & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}. \tag{4.89}
\]

Furthermore, let $\det Q = 1$, which can always be realized by choosing a suitable orthonormal basis. Now $\det L = 1$ and $\det S_\upsilon = 1$ imply that $\det P = \det Q = 1$. The Euler Theorem guarantees that there exist 6 angles $\varphi_{12}, \ldots, \psi_{23}$ with values in $]-\frac{\pi}{2}, \frac{\pi}{2}[$ such that

\[
\begin{aligned}
P &= R_{12}(\varphi_{12}) \cdot R_{13}(\varphi_{13}) \cdot R_{23}(\varphi_{23}), \\
Q^T &= R_{12}(\psi_{12}) \cdot R_{13}(\psi_{13}) \cdot R_{23}(\psi_{23}).
\end{aligned} \tag{4.90}
\]

The last equation implies

\[
Q = R_{23}(-\psi_{23}) \cdot R_{13}(-\psi_{13}) \cdot R_{12}(-\psi_{12}), \tag{4.91}
\]

which can be used to obtain $A$ and $B$ from equations (4.82), (4.85), and (4.89). This in turn yields

\[
L = W_{12}(\varphi_{12}) \cdot W_{13}(\varphi_{13}) \cdot W_{23}(\varphi_{23}) \cdot S_\upsilon \cdot W_{23}(-\psi_{23}) \cdot W_{13}(-\psi_{13}) \cdot W_{12}(-\psi_{12}),
\]

and because of $S_\upsilon \cdot W_{23}(-\psi_{23}) = W_{23}(-\psi_{23}) \cdot S_\upsilon$ it follows that

\[
L = W_{12}(\varphi_{12}) \cdot W_{13}(\varphi_{13}) \cdot W_{23}(\varphi_{23} - \psi_{23}) \cdot S_\upsilon \cdot W_{13}(-\psi_{13}) \cdot W_{12}(-\psi_{12}). \tag{4.92}
\]

Thus, it only remains to show that the nonuniqueness of $A$ and $B$ in equation (4.88) does not change the angle $\chi := \varphi_{23} - \psi_{23}$. That this is the case can be seen as follows. The nonuniqueness of the decomposition of $L$ can be expressed with the help of equation (4.88) as the statement: There exists an angle $\alpha \in\, ]-\frac{\pi}{2}, \frac{\pi}{2}[$ such that

\[
\begin{aligned}
L &= B \cdot W_{23}(\alpha) \cdot S_\upsilon \cdot W_{23}(-\alpha) \cdot A \\
&= W_{12}(\varphi_{12}) \cdots W_{23}(\varphi_{23} + \alpha) \cdot S_\upsilon \cdot W_{23}(-\psi_{23} - \alpha) \cdots \\
&= W_{12}(\varphi_{12}) \cdots W_{23}(\varphi_{23} - \psi_{23}) \cdot S_\upsilon \cdot W_{13}(-\psi_{13}) \cdots.
\end{aligned}
\]

This shows that all decompositions of $L$ yield the same form (4.87).

(2) Let $L$ be of the form (4.87). Then $L$ is a Lorentz matrix. Furthermore, all factors are orthochronous and proper. Thus, $L \in \mathcal{L}_{oe}$.

Definition 4.20. Equation (4.87) is called the Euler decomposition of $L$.
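Part (2) of the proof can be illustrated by building a matrix from six parameters according to (4.87) and checking that it is a proper orthochronous Lorentz matrix. The following sketch is an illustration only: the function names `W`, `euler_form`, `det` and `is_lorentz` and the parameter values are assumptions, and the determinant is computed by a simple cofactor expansion.

```python
import math
from functools import reduce

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def special(v):
    """S_v of equation (2.28)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]

def W(j, k, phi):
    """W_jk(phi) of Notation 4.15, with 1 <= j < k <= 3 (1-based as in the text)."""
    M = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
    c, s = math.cos(phi), math.sin(phi)
    M[j - 1][j - 1] = c
    M[k - 1][k - 1] = c
    M[j - 1][k - 1] = s
    M[k - 1][j - 1] = -s
    return M

def euler_form(phi12, phi13, chi, v, psi13, psi12):
    """Right-hand side of equation (4.87)."""
    factors = [W(1, 2, phi12), W(1, 3, phi13), W(2, 3, chi),
               special(v), W(1, 3, -psi13), W(1, 2, -psi12)]
    return reduce(matmul, factors)

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_lorentz(L, tol=1e-12):
    """Check L^T . eta . L = eta with eta = diag(1, 1, 1, -1)."""
    eta = [1.0, 1.0, 1.0, -1.0]
    return all(abs(sum(L[k][a] * eta[k] * L[k][b] for k in range(4))
                   - (eta[a] if a == b else 0.0)) <= tol
               for a in range(4) for b in range(4))

L = euler_form(0.3, -0.7, 2.0, 0.5, 1.1, -0.4)     # sample parameters (assumption)
lorentz = is_lorentz(L)
proper = abs(det(L) - 1.0) < 1e-12
orthochronous = L[3][3] >= 1.0
```

Since the rotations $W_{jk}$ leave the (4,4) entry untouched, `L[3][3]` equals $(1 - \upsilon^2)^{-1/2}$ for the chosen $\upsilon$.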
5 Further structures on $M_s$

DOI 10.1515/9783110485738-006

5.1 Introductory remarks

In this section the following basic property of two sets $M_1$ and $M_2$ will be used. Let $M_1$ and $M_2$ be such that there exists a bijection
\[
f : M_1 \to M_2. \tag{5.1}
\]
In this case it is possible to map all structures defined on one of the sets $M_1$ or $M_2$ to the other. That this is the case is easy to see, since a bijection $f$ is nothing but a relabeling of the elements of $M_1$, and $f^{-1}$ is a relabeling of the elements of $M_2$. In the next two sections we will use such a mapping of structures for the sets $M_s$ and $\mathbb{R}^4$. The bijection $f$ we will employ will be one of the Minkowski charts $\varphi$.

5.2 Vector space structure

In every spacetime manifold a certain vector space structure is defined, namely its tangent vector spaces. Minkowski space $\mathcal{M}_s$ is distinguished among all possible spacetimes by the fact that the set $M_s$ itself can naturally be equipped with a vector space structure. Loosely speaking, one could say that $M_s$ is itself a vector space. It will become clear that there even exist infinitely many vector space structures on $M_s$.

Definition 5.1. Let $(M_s, \varphi)$ be a Minkowski chart, and let $p_1, p_2, p_3$ be three arbitrary points of $M_s$ with coordinates $x_1 = \varphi(p_1)$, $x_2 = \varphi(p_2)$, and $x_3 = \varphi(p_3)$. Addition $+$ and scalar multiplication $\cdot$ are defined through
\[
\begin{aligned}
p_1 + p_2 &= \varphi^{-1}(x_1 + x_2) = \varphi^{-1}(x_2 + x_1) = p_2 + p_1, \\
\alpha p_3 &:= \alpha \cdot p_3 = \varphi^{-1}(\alpha x_3), \qquad \alpha \in \mathbb{R}.
\end{aligned} \tag{5.2}
\]

Proposition 5.2. The triple $(M_s, +, \cdot)$ is a 4-dimensional vector space.

Proof. The addition of two elements of $M_s$ is commutative by definition. It is associative, since
\[
p_1 + (p_2 + p_3) = \varphi^{-1}(x_1 + (x_2 + x_3)) = \varphi^{-1}((x_1 + x_2) + x_3) = (p_1 + p_2) + p_3. \tag{5.3}
\]
The zero element $O \in M_s$ is defined by
\[
O = \varphi^{-1}((0, \dots, 0)). \tag{5.4}
\]
For every $p \in M_s$ there exists an inverse $-p$ defined by
\[
-p = \varphi^{-1}(-x) \quad \text{with} \quad x = \varphi(p). \tag{5.5}
\]
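For the scalar multiplication,
\[
\begin{aligned}
\alpha(p_1 + p_2) &= \varphi^{-1}(\alpha(x_1 + x_2)) = \cdots = \alpha p_1 + \alpha p_2, && (5.6) \\
(\alpha + \beta) \cdot p_1 &= \varphi^{-1}((\alpha + \beta) x_1) = \cdots = \alpha p_1 + \beta p_1, && (5.7) \\
(\alpha\beta) \cdot p_1 &= \varphi^{-1}((\alpha\beta) x_1) = \cdots = \alpha(\beta p_1), && (5.8) \\
1 \cdot p_1 &= \varphi^{-1}(1 \cdot x_1) = \cdots = p_1, && (5.9)
\end{aligned}
\]
and
\[
0 \cdot p_1 = \varphi^{-1}(0 \cdot x_1) = \cdots = O \tag{5.10}
\]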
holds. Thus, the vector spaces $M_s$ and $\mathbb{R}^4$ are isomorphic and the proposition holds as stated.

Interesting implications are collected in the following conclusion.

Conclusion 5.3. (1) The vector space structure of $M_s$ depends on the Minkowski chart chosen. It is defined such that $\varphi$ is a linear map from $M_s$ to $\mathbb{R}^4$. Moreover, all Minkowski charts which are related by homogeneous Lorentz transformations define the same vector space structure on $M_s$. This can be proved as follows. Let $x = \varphi(p)$, $x' = \varphi'(p)$ and $\phi = \varphi \circ \varphi'^{-1} = (L, 0)$, and thus
\[
\begin{aligned}
p_1 + p_2 &= \varphi^{-1}(x_1 + x_2) \\
&= \varphi^{-1}(\phi(x'_1) + \phi(x'_2)) \\
&= \varphi^{-1} \circ \phi(x'_1 + x'_2) = \varphi'^{-1}(x'_1 + x'_2)
\end{aligned} \tag{5.11}
\]
and
\[
\alpha p_3 = \varphi^{-1}(\alpha x_3) = \varphi^{-1}(\alpha \phi(x'_3)) = \varphi'^{-1}(\alpha x'_3). \tag{5.12}
\]
However, if $\phi = (L, a)$ with $a \neq 0$, then $\varphi$ and $\varphi'$ generate different vector space structures. In particular the zero elements differ. Since $O = \varphi^{-1}(\hat 0)$, $\hat 0 = (0, \dots, 0)$, and $\phi = \varphi \circ \varphi'^{-1}$, it follows that $\hat 0$ in $x$-coordinates corresponds to $x'$ with $\hat 0 = \phi(x') = L \cdot x' + a$, i.e., $x' = -L^{-1} \cdot a$. Using $\hat 0' = (0, \dots, 0)$ we see that
\[
O = \varphi^{-1} \circ \phi(x') = \varphi'^{-1}(-L^{-1} \cdot a) \neq \varphi'^{-1}(\hat 0') = O'. \tag{5.13}
\]
In conclusion we can associate an infinite number of vector space structures to $M_s$.

(2) The vector space $M_s$ is isomorphic to each of its own tangent vector spaces. On any tangent vector space of $M_s$ there exists an indefinite inner product; thus, by being isomorphic, it should be possible to define an indefinite inner product directly on $M_s$. To realize this explicitly one needs to set
\[
\hat g(p_1, p_2) := \varphi(p_1)^T \cdot \eta \cdot \varphi(p_2) \quad \text{for} \quad p_1, p_2 \in M_s. \tag{5.14}
\]
The vector space $(M_s, +, \cdot)$ equipped with the inner product $\hat g$ is a Lorentzian vector space, according to Definition 9.21, where $e_\alpha = \varphi^{-1}(z_\alpha)$ and $z_\alpha$ has components $z_\alpha^\beta = \delta_\alpha^\beta$.

This claim can easily be verified. Let $p = \varphi^{-1}(x)$, which implies $x = x^\alpha z_\alpha$ and $p = x^\alpha \varphi^{-1}(z_\alpha) = x^\alpha e_\alpha$. Since $\hat z = (z_1, \dots, z_4)$ is a Minkowski basis in $\mathbb{R}^4$, $\hat e = (e_1, \dots, e_4)$ must be a Minkowski basis in $(M_s, +, \cdot)$ (see Definition 9.19).
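(3) Let $\varphi$ and $\varphi'$ be different Minkowski charts such that $\phi = \varphi \circ \varphi'^{-1}$ is a homogeneous Lorentz transformation $(L, \hat 0)$. Let $\hat e$ be the basis defined by $\varphi$ in (2). Moreover, let $\hat e' = (e'_1, \dots, e'_4)$ with $e'_\alpha = \varphi'^{-1}(z_\alpha)$. The following then holds:
\[
\hat e' = L \cdot \hat e^T \quad \text{resp.} \quad e'_\alpha = L_\alpha^\beta e_\beta. \tag{5.15}
\]
This can be proved by a simple calculation:
\[
e'_\alpha = \varphi'^{-1}(z_\alpha) = \varphi^{-1} \circ \phi(z_\alpha) = \varphi^{-1}(L_\alpha^\beta z_\beta) = L_\alpha^\beta e_\beta = (L \cdot \hat e^T)_\alpha. \tag{5.16}
\]
The fact that $\hat e$ and $\hat e'$ are bases of a vector space implies, by $p = p^\alpha e_\alpha = p'^\beta e'_\beta$ and (5.15), the transformation law for vector components
\[
p^\alpha = L^\alpha_\beta \, p'^\beta. \tag{5.17}
\]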
Equations (5.15) and (5.17) are the analogues of the transformation laws for tangent vectors (9.19) and (9.23).

(4) Since $(M_s, +, \cdot)$ and $\mathbb{R}^4$ are isomorphic, one could start the whole construction discussed so far in $\mathbb{R}^4$ instead of on the abstract set $M_s$. Conceptually, however, this would have several drawbacks. The elements of $M_s$ are symbols for physical events, while $\mathbb{R}^4$ contains the coordinates of the events. Starting the construction of special relativity with $\mathbb{R}^4$ instead of with $M_s$, one would have to add how one distinguishes events from their coordinates.

(5) The tangent spaces of Lorentzian manifolds are isomorphic to $\mathbb{R}^4$; thus, one could identify all of them with $\mathbb{R}^4$. However, this is not a good idea, since one would put 4-velocities of particles and events in the same mathematical entity. This in turn would destroy, at least partly, the conceptual clarity introduced by general relativity.

(6) Instead of writing $(M_s, +, \cdot)$ we briefly write $M_s$. In Section 8.2 we discuss how smooth curves can be defined in $M_s$ without resorting to tangent spaces or coordinates. In terms of these we will obtain velocity vectors in $M_s$. In this formulation, events and velocities live in the same mathematical entity $M_s$. The vector space structure of $M_s$ thus implies a conceptual ambiguity which does not appear if one considers $M_s$ with its manifold structure only. The simple solution of this problem is to consider $M_s$ as an affine space. Instead of considering $M_s$, the pair $(M_s, M_s)$ is considered. The first $M_s$ contains all position vectors $p$ in $M_s$ as elements representing events, and the second $M_s$ contains all difference vectors $p - q$, which are interpreted as velocities. Thus, if the set $M_s$ is considered as a vector space in the context of special relativity, one should see it as an affine space for the reasons just discussed. For most of the mathematical proofs it suffices to consider $M_s$ as a vector space; the affine structure is usually not needed.
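The chart dependence described in item (1) of the conclusion can be made concrete with a small computation. The sketch below is illustrative only: it assumes that events are modelled as 4-tuples, that φ is the identity chart, and that φ′ is related to φ by ϕ = φ ∘ φ′⁻¹ = (L, a), with L a special Lorentz matrix of velocity 3/5. The helper names `make_chart` and `add` are hypothetical.

```python
from fractions import Fraction

# Special Lorentz matrix for v = 3/5: k = 5/4, r = -v*k = -3/4 (exact values).
K, R = Fraction(5, 4), Fraction(-3, 4)
L = ((K, 0, 0, R), (0, 1, 0, 0), (0, 0, 1, 0), (R, 0, 0, K))
L_INV = ((K, 0, 0, -R), (0, 1, 0, 0), (0, 0, 1, 0), (-R, 0, 0, K))


def apply(matrix, x):
    """Matrix times column vector."""
    return tuple(sum(m * xi for m, xi in zip(row, x)) for row in matrix)


def identity_chart(p):
    """The chart phi of this sketch: events are already coordinate tuples."""
    return tuple(p)


def make_chart(a):
    """Return (phi', phi'^{-1}) with phi(p) = L * phi'(p) + a."""
    def chart(p):
        return apply(L_INV, tuple(pi - ai for pi, ai in zip(p, a)))

    def inverse(x):
        return tuple(yi + ai for yi, ai in zip(apply(L, x), a))

    return chart, inverse


def add(p1, p2, chart, inverse):
    """Chart-induced addition: p1 + p2 = chart^{-1}(chart(p1) + chart(p2))."""
    x1, x2 = chart(p1), chart(p2)
    return inverse(tuple(s + t for s, t in zip(x1, x2)))
```

With a = 0 the two charts induce the same sum, while for a ≠ 0 the sums differ and the zero element of the φ-structure has φ′-coordinates −L⁻¹·a, in line with equation (5.13).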
5.3 Topology on $M_s$

In Section 1.1 it was already mentioned that Minkowski spacetime is a particular case of an $n$-dimensional $C^k$ manifold. In Section 9.1 these objects are defined as tuples $(M, \mathcal{A})$, where $M$ is a set and $\mathcal{A}$ an atlas on it. This means that $\mathcal{A}$ is a set of charts $(U, \chi)$ with $U \subseteq M$ and a bijection $\chi$ such that $\chi[U]$ is an open set in $\mathbb{R}^n$. Moreover, the composition $\chi' \circ \chi^{-1}$ of every two functions $\chi$ and $\chi'$ is $C^k$ with $k \geq 1$. In case $\mathcal{A}$ is a complete atlas, i.e., it contains all pairwise compatible charts on $M$, then $\mathcal{A}$ defines a topology on $M$ such that the coordinate chart maps $\chi$ are homeomorphisms (see Section 9.1). The topology of Minkowski space can be defined in precisely this way. However, there exists a simpler way to determine this topology for Minkowski space, which is demonstrated in the following. We will use the following notation.

Notation 5.4. (1) $\mathcal{N}$ is called the natural topology on $\mathbb{R}^4$, i.e., $\mathcal{N}$ is the set of open subsets of $\mathbb{R}^4$.
(2) The Minkowski coordinate maps are called $\varphi$.
(3) For one chosen $\varphi$, the set $\mathcal{T}$ is defined by
\[
\mathcal{T} = \{U : U = \varphi^{-1}[W],\ W \in \mathcal{N}\}. \tag{5.18}
\]

We can prove the following.

Proposition 5.5. $\mathcal{T}$ is a topology of $M_s$ independent of the choice of $\varphi$.

Proof. (1) $M_s$ and $\emptyset$ are elements of $\mathcal{T}$, since $M_s = \varphi^{-1}[\mathbb{R}^4]$ and $\emptyset = \varphi^{-1}[\emptyset]$.
(2) Let $W_j \in \mathcal{N}$ and $U_j = \varphi^{-1}[W_j]$, $j = 1, 2$. Then we have $W_1 \cap W_2 \in \mathcal{N}$ and $\varphi^{-1}[W_1 \cap W_2] = \varphi^{-1}[W_1] \cap \varphi^{-1}[W_2] = U_1 \cap U_2$. Thus, $U_1 \cap U_2 \in \mathcal{T}$.
(3) Let $\mathcal{S} \subset \mathcal{T}$ and $K = \{W : W = \varphi[U],\ U \in \mathcal{S}\}$. Then $K \subset \mathcal{N}$, so that $\bigcup_{W \in K} W \in \mathcal{N}$. Thus, $\bigcup_{U \in \mathcal{S}} U = \bigcup_{W \in K} \varphi^{-1}[W] = \varphi^{-1}\bigl[\bigcup_{W \in K} W\bigr] \in \mathcal{T}$.
(4) Let $\varphi$ and $\varphi'$ be different Minkowski charts and $\mathcal{T}'$ the topology generated by $\varphi'$. Let $U' \in \mathcal{T}'$. Then $W' := \varphi'[U'] \in \mathcal{N}$ holds. Since $\phi = \varphi \circ \varphi'^{-1}$ is a Lorentz transformation, and thus a homeomorphism from $\mathbb{R}^4$ to $\mathbb{R}^4$, it follows that $W := \phi[W'] \in \mathcal{N}$. Thus, $U := \varphi^{-1}[W] \in \mathcal{T}$ and $U = \varphi^{-1} \circ \phi[W'] = \varphi'^{-1}[W'] = U'$. Now we can conclude $\mathcal{T}' \subseteq \mathcal{T}$. In the same way one can prove the reverse relation, so that one finally arrives at $\mathcal{T}' = \mathcal{T}$, which proves the proposition.

Conclusion 5.6. (1) Every Minkowski chart $\varphi : M_s \to \mathbb{R}^4$ is a homeomorphism, since $\varphi$ and $\varphi^{-1}$ map open sets to open sets. All other charts in $\mathcal{A}_s$ are also homeomorphisms since they are $C^k$-compatible with the Minkowski charts, $k \geq 3$. The linearity of $\varphi$ implies that $\varphi$ is even a $C^\omega$ diffeomorphism from $M_s$ to $\mathbb{R}^4$ (see Corollary 1.10).
(2) Through any Minkowski chart, $M_s$ inherits all topological properties of $(\mathbb{R}^4, \mathcal{N})$, for example simple connectedness, the Hausdorff property, the existence of a countable basis for $\mathcal{T}$, or paracompactness. Thus, the manifold $\mathcal{M}_s = (M_s, \mathcal{A}_s, g_s)$ is a Lorentzian manifold according to Definition 9.52.
6 Tangent vectors in $\mathcal{M}_s$

DOI 10.1515/9783110485738-007

6.1 Decomposition of Lorentz vector spaces

The notions introduced in this section are of great importance in any relativistic theory. Therefore they will be introduced for general Lorentz vector spaces (see Section 9.4). The physical interpretation of relativistic theories is based on the following definition.

Definition 6.1. (1) Let $(V, g)$ be a Lorentz vector space; then $V$ can be decomposed into three disjoint parts, the elements of which are called and defined as follows:
$\upsilon$ is called timelike, iff $g(\upsilon, \upsilon) < 0$;
$\upsilon$ is called lightlike, iff $g(\upsilon, \upsilon) = 0$ and $\upsilon \neq 0$;
$\upsilon$ is called spacelike, iff $g(\upsilon, \upsilon) > 0$ or $\upsilon = 0$.
(2) A vector $\upsilon$ is called causal if it is not spacelike.

Special properties of Lorentz vector spaces are based on this classification of vectors. Examples, like the statement that lightlike vectors are orthogonal if and only if they are parallel, can be found e.g. in [14, pp. 20, 21]. It is important to stress that the meaning of the word causal in Definition 6.1 differs from its original meaning. A causal vector is not the cause of something, but rather it determines a signal between two events $p_1$ and $p_2$ in $\mathcal{M}_s$. A signal from $p_1$ to $p_2$ can cause something at $p_2$ but need not do so. In the context of special and general relativity the notion causal vector simply means that this vector can be used in connection with the propagation of a signal, as for example in the definition of the causal relations in Definition 7.17(2) or in the definition of a signal as tangent vector (see Definition 8.1(3) and Definition 8.7(3)). In the following sections we will only consider tangent vector spaces $T_p\mathcal{M}_s$, and we will make use of the results of Chapter 9.

6.2 Timelike tangent vectors

Calling a vector $u$ timelike suggests that $u$ behaves like time, i.e., that if it is represented in Minkowski coordinates, it only has one component, the 4-component. Simple counterexamples show that this intuition is not correct. The question must be posed as follows: Do Minkowski coordinates exist such that a given timelike vector $u \in T_p\mathcal{M}_s$ has only one nonvanishing component, its 4-component? According to Conclusion 9.8 it is possible to generate every tangent vector in $T_p\mathcal{M}_s$ from a straight line $\bar\gamma$ in $\mathbb{R}^4$. Recall that $\mathbb{R}^4$ is the image of the Minkowski coordinates. The question now is whether it is possible to find a Lorentz transformation such that the line $\bar\gamma$ gets transformed to a line $\bar\gamma'$ with
\[
\bar\gamma'^{\,j}(\sigma) = x_0'^{\,j}, \qquad j = 1, 2, 3.
\]
Using Definition 9.7, the answer is given by the following proposition.

Proposition 6.2. Let $u \in T_p\mathcal{M}_s$ be timelike. Then there exist a Minkowski chart $\varphi'$ and a number $u'^4 \neq 0$ such that
\[
u = u'^4 \, \partial_{x'^4}. \tag{6.1}
\]

Proof. Let $u = u^\alpha \partial_{x^\alpha}$ be given, with $x = \varphi(p)$, $p \in M_s$ being Minkowski coordinates, such that $\sum_{j=1}^{3} (u^j)^2 =: w^2$ with $w > 0$. (For $w = 0$ equation (6.1) is satisfied trivially.) From $u$ being timelike it follows that $|u^4| > w$, and there exists a number $b > 0$ such that
\[
w^2 - (u^4)^2 = g(u, u) = -b^2. \tag{6.2}
\]
Now we need to display a Lorentz matrix such that equation (6.1) holds for
\[
x' = \varphi' \circ \varphi^{-1}(x) = L \cdot x. \tag{6.3}
\]
According to the idea sketched above we try to use the ansatz
\[
L = S_\upsilon \cdot A, \tag{6.4}
\]
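where $A$ is a rotational matrix and $S_\upsilon$ a special Lorentz matrix. Let
\[
a_1 = \frac{1}{w}\,(u^1, u^2, u^3), \tag{6.5}
\]
and let $a_2, a_3 \in \mathbb{R}^3$ be two row vectors such that $a_1, a_2, a_3$ is an orthonormal basis of $\mathbb{R}^3$. This yields that
\[
Q = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \tag{6.6}
\]
is orthogonal, and that by Proposition 1.12
\[
A = \begin{pmatrix} Q & 0_3 \\ 0_3^T & 1 \end{pmatrix} \tag{6.7}
\]
is a rotational Lorentz matrix. We find
\[
A \cdot \begin{pmatrix} u^1 \\ u^2 \\ u^3 \\ u^4 \end{pmatrix} = \begin{pmatrix} w \\ 0 \\ 0 \\ u^4 \end{pmatrix}. \tag{6.8}
\]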
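Thus, $Q$ respectively $A$ rotates $(u^1, u^2, u^3)^T$ in the direction of the 1-axis. Now let $\upsilon = w (u^4)^{-1}$, $k = (1 - \upsilon^2)^{-\frac{1}{2}}$, and $r = -\upsilon k$, as well as (see equation (2.28))
\[
S_\upsilon = \begin{pmatrix} k & 0 & 0 & r \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ r & 0 & 0 & k \end{pmatrix}. \tag{6.9}
\]
With $|\upsilon| < 1$ and
\[
S_\upsilon \cdot \begin{pmatrix} w \\ 0 \\ 0 \\ u^4 \end{pmatrix} = \begin{pmatrix} wk + ru^4 \\ 0 \\ 0 \\ wr + ku^4 \end{pmatrix}, \tag{6.10}
\]
we find
\[
wk + ru^4 = k(w - \upsilon u^4) = 0 \tag{6.11}
\]
and, for $u^4 > 0$,
\[
wr + u^4 k = k u^4 (1 - \upsilon^2) = \bigl((u^4)^2 - w^2\bigr)^{\frac{1}{2}} = b. \tag{6.12}
\]
Thus, $u'^4 = b$ and $u'^i = 0$, $i = 1, 2, 3$. For $u^4 < 0$ the root in (6.12) carries the opposite sign, so that $u'^4 = -b$.

This proof shows that the Lorentz matrix $L$ is determined only up to rotations of $a_2$ and $a_3$. There exist infinitely many Minkowski coordinates which satisfy equation (6.1). For all of these coordinates $|u'^4| = b$; the sign of $u'^4$ is that of $u^4$, so $u'^4$ can be negative.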
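The construction in the proof of Proposition 6.2 can be followed step by step in code. The following sketch is only one possible realization: it completes $a_1$ to an orthonormal basis by Gram-Schmidt on the standard basis, which is an arbitrary choice among the infinitely many admissible ones, and the function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, x):
    """Matrix times column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


def orthonormal_rows(a1):
    """Complete the unit vector a1 to a positively oriented orthonormal
    basis (a1, a2, a3) of R^3 via Gram-Schmidt on the standard basis."""
    rows = [list(a1)]
    for i in range(3):
        e = [float(i == j) for j in range(3)]
        for b in rows:
            proj = sum(x * y for x, y in zip(e, b))
            e = [x - proj * y for x, y in zip(e, b)]
        norm = math.sqrt(sum(x * x for x in e))
        if norm > 1e-9 and len(rows) < 3:
            rows.append([x / norm for x in e])
    a1, a2, a3 = rows
    triple = (a1[0] * (a2[1] * a3[2] - a2[2] * a3[1])
              - a1[1] * (a2[0] * a3[2] - a2[2] * a3[0])
              + a1[2] * (a2[0] * a3[1] - a2[1] * a3[0]))
    if triple < 0:  # flip a3 so that det Q = +1
        rows[2] = [-x for x in a3]
    return rows


def rest_frame_matrix(u):
    """Return L = S_v * A as in (6.4) for a timelike u = (u1, u2, u3, u4)."""
    w = math.sqrt(sum(x * x for x in u[:3]))
    if abs(u[3]) <= w:
        raise ValueError("u is not timelike")
    if w == 0.0:
        return [[float(i == j) for j in range(4)] for i in range(4)]
    q = orthonormal_rows([x / w for x in u[:3]])          # (6.5), (6.6)
    a = [q[0] + [0.0], q[1] + [0.0], q[2] + [0.0],
         [0.0, 0.0, 0.0, 1.0]]                            # (6.7)
    v = w / u[3]
    k = 1.0 / math.sqrt(1.0 - v * v)
    r = -v * k
    s = [[k, 0, 0, r], [0, 1, 0, 0], [0, 0, 1, 0], [r, 0, 0, k]]  # (6.9)
    return matmul(s, a)
```

For example, for u = (1, 2, 2, 5) one has w = 3 and b = 4, and applying the resulting matrix to u should leave only a fourth component of magnitude 4.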
6.3 Spacelike tangent vectors

Spacelike vectors can have four nonvanishing components with respect to an arbitrary Minkowski chart, just like timelike vectors. There are two questions of interest to answer:
(1) Does there exist a Minkowski chart $\hat\varphi$ such that for a given spacelike vector $u \in T_p\mathcal{M}_s$, $p \in M_s$, its fourth component $\hat u^4$ vanishes in these coordinates?
(2) Does there exist a Minkowski chart $\varphi'$ such that for a spacelike vector $u \in T_p\mathcal{M}_s$, $p \in M_s$, only one of the components $u'^j$, $j \in \{1, 2, 3\}$, is nonvanishing?

A sufficient condition which answers question (1) is the following.

Proposition 6.3. Let $u = u^\alpha \partial_{x^\alpha}$ be a spacelike vector with $0 < |u^4| < |u^1|$ for a Minkowski chart $\varphi$. Then there exists a Minkowski chart $\hat\varphi$ such that $\hat u^4 = 0$.

Proof. Given a special Lorentz matrix $S_\upsilon$ (see (2.28)) one gets
\[
S_\upsilon \cdot \begin{pmatrix} u^1 \\ u^2 \\ u^3 \\ u^4 \end{pmatrix} = \begin{pmatrix} ku^1 + ru^4 \\ u^2 \\ u^3 \\ ru^1 + ku^4 \end{pmatrix}, \tag{6.13}
\]
where $k = (1 - \upsilon^2)^{-\frac{1}{2}}$ and $r = -\upsilon k$. For $\upsilon = u^4 (u^1)^{-1}$ it follows from (6.13) that
\[
ru^1 + ku^4 = 0
\]
and
\[
ku^1 + ru^4 = u^1 (1 - \upsilon^2)^{\frac{1}{2}} \neq 0.
\]
Thus, the proposition holds for $|u^1| > |u^4|$. In case $|u^2| > |u^4|$ or $|u^3| > |u^4|$, one can produce the proof identically with $S_{2,\upsilon}$ or $S_{3,\upsilon}$ (see equation (2.35)).

The answer to question (2) is given by the following.

Proposition 6.4. Again let $u$ be a spacelike vector, which for a Minkowski chart $\varphi$ is given by $u = u^\alpha \partial_{x^\alpha}$. Then, for every $n \in \{1, 2, 3\}$ there exists a Minkowski chart $\varphi'_n$ such that
\[
u = u'^n \, \partial_{x'^n}. \tag{6.14}
\]

Proof. For $u = 0$ the proposition holds trivially; thus, we consider $u \neq 0$. Moreover, we discuss only the case $n = 1$; the cases $n = 2, 3$ can be worked out analogously. We set $\varphi'_1 =: \varphi'$. Introducing
\[
\sum_{j=1}^{3} (u^j)^2 =: w^2, \qquad w > 0, \tag{6.15}
\]
we obtain from the fact that $u$ is spacelike the relation $w > |u^4|$. Consider the matrices $Q$, $A$, $S_\upsilon$ as in equations (6.6), (6.7), and (6.9), and
\[
L = S_\upsilon \cdot A \tag{6.16}
\]
with $\upsilon = u^4 w^{-1}$. From equations (6.8) and (6.10) it follows that
\[
L \cdot \begin{pmatrix} u^1 \\ u^2 \\ u^3 \\ u^4 \end{pmatrix} = \begin{pmatrix} wk + ru^4 \\ 0 \\ 0 \\ wr + ku^4 \end{pmatrix}. \tag{6.17}
\]
Since $r = -\upsilon k$, it follows that $wr + ku^4 = k(u^4 - w u^4 w^{-1}) = 0$ and
\[
wk + ru^4 = k(w - \upsilon u^4) = kw\bigl(1 - (u^4)^2 w^{-2}\bigr) = w(1 - \upsilon^2)^{\frac{1}{2}} =: u'^1.
\]
Thus, the proposition is proven. Moreover, we find that
\[
g_s(u, u) = (u'^1)^2. \tag{6.18}
\]
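As a quick plausibility check of this construction, consider a spacelike vector with conveniently chosen components. The numbers below are purely illustrative and are not taken from the text; the rotation $A$ is assumed to have already been applied, so that only $w$ and $u^4$ enter.

```latex
% Illustrative example: u = (3, 4, 0, 3) in a Minkowski chart.
\begin{align*}
w &= \sqrt{3^2 + 4^2 + 0^2} = 5 > |u^4| = 3, &
\upsilon &= \frac{u^4}{w} = \frac{3}{5}, \\
k &= (1 - \upsilon^2)^{-1/2} = \frac{5}{4}, &
r &= -\upsilon k = -\frac{3}{4}, \\
u'^1 &= wk + ru^4 = \frac{25}{4} - \frac{9}{4} = 4, &
u'^4 &= wr + ku^4 = -\frac{15}{4} + \frac{15}{4} = 0, \\
g_s(u, u) &= 3^2 + 4^2 + 0^2 - 3^2 = 16 = (u'^1)^2 .
\end{align*}
```

The last line reproduces relation (6.18) for this example.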
6.4 Some conclusions

The results discussed in the previous Sections 6.2 and 6.3 always refer to properties of single tangent vector spaces $T_p\mathcal{M}_s$ and their bases defined by Minkowski charts. Therefore, these results hold in any Lorentz vector space, since, for these, basis transformations are defined by homogeneous Lorentz transformations. In particular the results hold in the tangent vector spaces of the spacetimes considered in general relativity, by the fact that it is possible to use local Minkowski coordinates at every point of any spacetime manifold (see Proposition 9.53). This remark also applies to the following.

Proposition 6.5. Let $\upsilon, u \in T_p\mathcal{M}_s$ and $\upsilon$ timelike. Moreover, let $u$ be such that $g_s(\upsilon, u) = 0$. This implies that $u$ is spacelike.

Proof. Since $\upsilon$ is timelike there exists a Minkowski chart $\varphi$ such that $\upsilon = \upsilon^4 \partial_{x^4}$ with $\upsilon^4 \neq 0$. In these coordinates let $u = u^\alpha \partial_{x^\alpha}$. Thus, $g_s(\upsilon, u) = -\upsilon^4 u^4 = 0$, and so $u^4 = 0$. This means either $u = 0$ or
\[
g_s(u, u) = \sum_{j=1}^{3} (u^j)^2 > 0. \tag{6.19}
\]
Hence $u$ is spacelike.

Notation 6.6. Let $\upsilon \in T_p\mathcal{M}_s$. The set of all $u \in T_p\mathcal{M}_s$ such that $g_s(\upsilon, u) = 0$ is labeled by $\upsilon^\perp$ (see Section 9.4).

Proposition 6.7. For $\upsilon \in T_p\mathcal{M}_s$ being timelike, we find that $\upsilon^\perp$ is a vector space with positive definite inner product $h = g_s|_{\upsilon^\perp}$. In other words, $\upsilon^\perp$ is a 3-dimensional Euclidean vector space.

Proof. Let $u_1, u_2 \in \upsilon^\perp$ and $a_1, a_2 \in \mathbb{R}$; then $a_1 u_1 + a_2 u_2 \in \upsilon^\perp$, since
\[
g_s(\upsilon, a_1 u_1 + a_2 u_2) = 0. \tag{6.20}
\]
Moreover, the 0-vector belongs to $\upsilon^\perp$, and for $u \in \upsilon^\perp$ we have $-u \in \upsilon^\perp$. The positive definiteness of $h$ follows from (6.19).

Proposition 6.8. If for a given $\upsilon \neq 0$, $\upsilon \in T_p\mathcal{M}_s$, the set $\upsilon^\perp$ contains only spacelike vectors, then $\upsilon^\perp$ is a vector space, and $\upsilon$ is timelike.

Proof. As in Proposition 6.7 it is clear that $a_1 u_1 + a_2 u_2 \in \upsilon^\perp$ and that $0 \in \upsilon^\perp$. Again, if $u \in \upsilon^\perp$ it follows that $-u \in \upsilon^\perp$. For $\upsilon$ being lightlike, $g(\upsilon, \upsilon) = 0$ implies that $\upsilon \in \upsilon^\perp$. However, then $\upsilon$ must be lightlike and spacelike, which is not possible. On the other hand, if $\upsilon$ is spacelike then there exists a Minkowski chart such that
\[
\upsilon = \sum_{j=1}^{3} \upsilon^j \partial_{x^j}.
\]
This would imply that the timelike vector $u = u^4 \partial_{x^4}$ satisfies $g_s(\upsilon, u) = 0$ and thus lies in $\upsilon^\perp$, which is a contradiction to the supposition of the proposition. Thus, $\upsilon$ is neither spacelike nor lightlike, and hence it must be timelike, and $\upsilon^\perp$ is a 3-dimensional Euclidean vector space.

In Sections 6.2 and 6.3 only timelike and spacelike vectors have been considered. An obvious next question is whether there exists a representation like equations (6.1) or (6.14) for lightlike vectors. The answer is given by the following.

Proposition 6.9. A vector $\upsilon = \upsilon^\beta \partial_{x^\beta}$ is lightlike if and only if $\upsilon^4 \neq 0$ and
\[
\sum_{j=1}^{3} (\upsilon^j)^2 = (\upsilon^4)^2. \tag{6.21}
\]
There exists no Minkowski chart $\varphi$ such that the vector $\partial_{x^\alpha}$ is lightlike for any $\alpha = 1, 2, 3$, or $4$.

Proof. For a general $\upsilon = \upsilon^\beta \partial_{x^\beta}$,
\[
g_s(\upsilon, \upsilon) = \sum_{j=1}^{3} (\upsilon^j)^2 - (\upsilon^4)^2
\]
holds. Thus, $\upsilon$ is lightlike if and only if equation (6.21) is satisfied and $\upsilon^4 \neq 0$. In Minkowski charts the equation
\[
|g_s(\partial_{x^\alpha}, \partial_{x^\alpha})| = \bigl(dx^\alpha(\partial_{x^\alpha})\bigr)^2 = 1, \qquad \alpha = 1, 2, 3, 4,
\]
holds (no summation over $\alpha$). Hence it is clear that $\partial_{x^\alpha}$ is not lightlike. For a representation of a lightlike vector in a Minkowski chart one needs a timelike and at least one spacelike basis vector.
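The classification of Definition 6.1, together with the component criterion of Proposition 6.9, translates directly into a short function. The sketch below assumes that a vector is given by its four components with respect to a Minkowski chart, with the fourth component being the time component; the function names are hypothetical.

```python
def minkowski_square(v):
    """g(v, v) for components (v1, v2, v3, v4) in a Minkowski chart."""
    return v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - v[3] ** 2


def classify(v):
    """Return 'timelike', 'lightlike' or 'spacelike' as in Definition 6.1.
    The zero vector counts as spacelike."""
    q = minkowski_square(v)
    if q < 0:
        return "timelike"
    if q == 0 and any(v):
        return "lightlike"
    return "spacelike"


def is_causal(v):
    """A vector is causal if it is not spacelike."""
    return classify(v) != "spacelike"
```

With integer or rational components the comparison with zero is exact; with floating-point components a tolerance would be needed for the lightlike case.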
6.5 Non-Minkowskian coordinates

So far we have always considered Minkowski charts $(M_s, \varphi)$ for everything discussed in these sections. The bases $(\partial_{x^1}, \dots, \partial_{x^4})$ generated by such charts always consist of three spacelike vectors and one timelike vector, but no lightlike vectors. Here the following question will be discussed: Do non-Minkowskian charts $\psi(p) = y$ exist such that the bases generated by these charts (see Definition 9.7) contain only spacelike, or only timelike, or even only lightlike vectors? The answer to this question is yes. This will become clear in two steps.

Definition 6.10. Let $\varphi$ be a Minkowski chart. Define the vectors $e_1, \dots, e_4$ in $T_p\mathcal{M}_s$ in terms of the coordinate vectors $\partial_{x^\alpha}$, $\alpha = 1, \dots, 4$, with $x = \varphi(p)$ as follows: for $b > 0$ let
\[
\begin{aligned}
e_j &= \partial_{x^j} + b\,\partial_{x^4}, \qquad j = 1, 2, 3, \\
e_4 &= -\partial_{x^1} + b\,\partial_{x^4}.
\end{aligned} \tag{6.22}
\]

Proposition 6.11. (1) $(e_1, \dots, e_4)$ is a basis of $T_p\mathcal{M}_s$ for every $b > 0$.
(2) $e_\alpha$, $\alpha = 1, \dots, 4$, is spacelike for $b < 1$, lightlike for $b = 1$, and timelike for $b > 1$.

Proof. (1) We need to show that $e_1, \dots, e_4$ are linearly independent. Let $a^\alpha \in \mathbb{R}$, $\alpha = 1, \dots, 4$, such that
\[
a^1 e_1 + a^2 e_2 + a^3 e_3 + a^4 e_4 = 0. \tag{6.23}
\]
From equations (6.22) and (6.23) it follows that
\[
(a^1 - a^4)\,\partial_{x^1} + a^2\,\partial_{x^2} + a^3\,\partial_{x^3} + b \sum_{\beta=1}^{4} a^\beta\,\partial_{x^4} = 0. \tag{6.24}
\]
Thus,
\[
a^1 = a^4, \qquad a^2 = 0 = a^3, \qquad a^1 = -a^4, \tag{6.25}
\]
such that $a^\alpha = 0$ for $\alpha = 1, \dots, 4$.
(2) The second part of the proposition follows immediately from $g_s(e_\beta, e_\beta) = 1 - b^2$ for $\beta = 1, \dots, 4$.

The next step is to show that the basis $(e_1, \dots, e_4)$ is determined by a chart $\psi$.

Proposition 6.12. Let $T$ be a matrix defined by
\[
T = \begin{pmatrix} 1 & 0 & 0 & b \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & b \\ -1 & 0 & 0 & b \end{pmatrix}. \tag{6.26}
\]
Let $\chi$ be a function defined by
\[
\chi(y) = T \cdot y. \tag{6.27}
\]
Moreover, let $\psi = \chi^{-1} \circ \varphi$. Then $e_\alpha = \partial_{y^\alpha}$ holds for $\alpha = 1, \dots, 4$.

Proof. According to equation (9.19) the following relation holds for the coordinate transformation $\chi(y) = x$:
\[
\partial_{y^\beta} = \frac{\partial \chi^\alpha}{\partial y^\beta}\,\partial_{x^\alpha} = T_\beta^\alpha\,\partial_{x^\alpha}. \tag{6.28}
\]
Since $T_\beta^\alpha\,\partial_{x^\alpha} = e_\beta$, one obtains $e_\beta = \partial_{y^\beta}$.
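Conclusion 6.13. The coordinate transformation $x = \chi(y) = T \cdot y$ yields $y = \chi^{-1}(x) = T^{-1} \cdot x$. Thus,
\[
\psi(p) = T^{-1} \cdot \varphi(p), \tag{6.29}
\]
with
\[
T^{-1} = \begin{pmatrix}
\tfrac{1}{2} & 0 & 0 & -\tfrac{1}{2} \\
-\tfrac{1}{2} & 1 & 0 & -\tfrac{1}{2} \\
-\tfrac{1}{2} & 0 & 1 & -\tfrac{1}{2} \\
\tfrac{1}{2b} & 0 & 0 & \tfrac{1}{2b}
\end{pmatrix}, \tag{6.30}
\]
which is clear by multiplication with (6.26).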
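Both the inverse matrix and the causal character of the basis vectors can be checked by exact arithmetic. The sketch below assumes that the rows of the matrix $T$ hold the components of $e_1, \dots, e_4$ with respect to $\partial_{x^1}, \dots, \partial_{x^4}$, as given in equation (6.22); the helper names are hypothetical.

```python
from fractions import Fraction


def basis_matrix(b):
    """Matrix T: row beta holds the components of e_beta from (6.22)."""
    return [[1, 0, 0, b], [0, 1, 0, b], [0, 0, 1, b], [-1, 0, 0, b]]


def inverse_matrix(b):
    """Candidate inverse of T with entries 1/2, -1/2 and 1/(2b)."""
    half = Fraction(1, 2)
    c = 1 / (2 * b)
    return [[half, 0, 0, -half], [-half, 1, 0, -half],
            [-half, 0, 1, -half], [c, 0, 0, c]]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def minkowski_square(v):
    """g(v, v) for components with respect to a Minkowski chart."""
    return v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - v[3] ** 2


IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]


def check(b):
    """True if T * T^{-1} = T^{-1} * T = 1 and g(e, e) = 1 - b^2 for all e."""
    t, t_inv = basis_matrix(b), inverse_matrix(b)
    inverse_ok = matmul(t, t_inv) == IDENTITY and matmul(t_inv, t) == IDENTITY
    norms_ok = all(minkowski_square(row) == 1 - b * b for row in t)
    return inverse_ok and norms_ok
```

Calling `check` with `Fraction(1, 2)`, `Fraction(1)` and `Fraction(2)` covers the spacelike, lightlike and timelike cases of Proposition 6.11.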
7 Orientation

DOI 10.1515/9783110485738-008

7.1 Time orientation

7.1.1 Definitions

In our considerations so far, the $x^4$ component was interpreted as time, but was treated just like a spatial length. In everyday life, however, one's experience is that time has one direction. Minkowski charts do not characterize a distinguished time direction, which is most visible by the fact that the Lorentz matrix $\eta$ generates from every Minkowski chart $\varphi$ another Minkowski chart $\varphi' = \eta \cdot \varphi$ with opposite time direction. This observation introduces several questions, which we will answer here. We formulate the discussion such that it is valid not only for Minkowski space but for any Lorentzian manifold.

Problem 7.1. (1) What is a meaningful definition which allows for a distinguished direction of time, the so-called time orientability?
(2) In case a Lorentzian manifold is time-orientable, how is a time orientation defined?

To answer these questions the following heuristic point of view is of help. Since we are looking for a notion of time orientability, we expect that it can be defined without using spacelike tangent vectors of a Lorentzian manifold. Moreover, there exist only two directions of time, future and past; thus, we expect furthermore that a time orientation is connected to a split of the set of causal tangent vectors into two separate parts. We can make this heuristic idea precise by the following definition.

Definition 7.2. Let $\mathcal{M}$ be a Lorentzian manifold, and let $C$ be the set of causal tangent vectors on $\mathcal{M}$. The manifold $\mathcal{M}$ is called time-orientable if and only if there exist two subsets $C^+$ and $C^-$ of $C$ such that
\[
C^+ \cup C^- = C, \qquad C^+ \cap C^- = \emptyset. \tag{7.1}
\]
Moreover, let $u_p \in C^\pm \cap T_p\mathcal{M}$, and let $u_{p'}$ be the vector obtained by parallel transport of $u_p$ from $T_p\mathcal{M}$ to $T_{p'}\mathcal{M}$ along an arbitrary curve $\gamma$ (see for example [16, p. 302]). Then $u_{p'} \in C^\pm$ holds, where the choice of $+$ or $-$ in $\pm$ has to be the same in both of its appearances.

Time orientability means that under parallel transport it is not possible for a vector from $C^+$ to become an element of $C^-$ and vice versa. This answers the first question of Problem 7.1. The second will be answered by a choice.

Definition 7.3. The elements of $C^+$ are called future-pointing; the elements of $C^-$ are called past-pointing.

This choice is arbitrary but nonetheless meaningful. This is not yet obvious, but will become clear in Chapter 8 when we discuss examples which clearly demonstrate the physical meaning of the above definition.
7.1.2 Time orientation on $\mathcal{M}_s$

In this section we will have a closer look at the set $C$ of causal tangent vectors on $\mathcal{M}_s$. The following abbreviations turn out to be very helpful. Let $u \in C \cap T_p\mathcal{M}_s$ and let $\varphi$ be a Minkowski chart. Then we can expand $u = u^\alpha \partial_{x^\alpha}$ with $|u^4| > 0$, since for $u^4 = 0$ the vector $u$ would be spacelike. Moreover,
\[
\Bigl( \sum_{j=1}^{3} (u^j)^2 \Bigr)^{\frac{1}{2}} =: r \geq 0 \tag{7.2}
\]
holds. For $r > 0$ let $e^j := u^j \cdot r^{-1}$ such that
\[
\sum_{j=1}^{3} (e^j)^2 = 1. \tag{7.3}
\]
For $r = 0$ the objects $e^1, e^2, e^3$ shall be any real numbers such that equation (7.3) holds. As an abbreviation we set $u^4 = a$. Altogether one finds
\[
u = r \sum_{j=1}^{3} e^j \partial_{x^j} + a\,\partial_{x^4}, \tag{7.4}
\]
which implies
\[
g_s(u, u) = r^2 - a^2. \tag{7.5}
\]
Moreover, equation (7.5) immediately yields the following.

Conclusion 7.4. The vector $u$ is timelike if and only if $0 \leq r < |a|$, and $u$ is lightlike if and only if $0 < r = |a|$.

Moreover, the following proposition holds.

Proposition 7.5. Let $u$ be a causal vector with components as in equation (7.4) with respect to a Minkowski chart $\varphi$. If the Lorentz transformation $\phi = \varphi' \circ \varphi^{-1}$ from $\varphi$ to another Minkowski chart $\varphi'$ is orthochronous, then $u$ has a 4-component $a'$ with the same sign as $a$. In case $\phi$ is antichronous the sign changes.

Proof. Let $u = u^\alpha \partial_{x^\alpha} = u'^\beta \partial_{x'^\beta}$. According to the transformation equation (9.23),
\[
a' = u'^4 = \frac{\partial \phi^4}{\partial x^\alpha}\,u^\alpha = L^4_\alpha u^\alpha \tag{7.6}
\]
holds. Thus, by equation (7.4) we obtain
\[
a' = r \sum_{j=1}^{3} L^4_j e^j + L^4_4\,a, \tag{7.7}
\]
so that Proposition 7.5 is seen to be valid if $r = 0$. If $r > 0$ it is necessary to distinguish four cases.

(1) $\phi$ is orthochronous, i.e., $L^4_4 \geq 1$, and $a > 0$. This yields with (7.7)
\[
a' \geq L^4_4\,a - r\,\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr|. \tag{7.8}
\]
Because of
\[
\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr| \leq \Bigl( \sum_{j=1}^{3} (e^j)^2 \Bigr)^{\frac{1}{2}} \Bigl( \sum_{j=1}^{3} (L^4_j)^2 \Bigr)^{\frac{1}{2}} \tag{7.9}
\]
and employing equation (3.29), respectively (2.8) for $L^T$, one obtains
\[
\Bigl( \sum_{j=1}^{3} (L^4_j)^2 \Bigr)^{\frac{1}{2}} < L^4_4 \tag{7.10}
\]
and thus, by equation (7.8),
\[
a' > L^4_4\,(a - r) \geq 0. \tag{7.11}
\]

(2) $\phi$ is orthochronous, i.e., $L^4_4 \geq 1$, and $a < 0$. This yields $L^4_4\,a < 0$ and with equation (7.7)
\[
a' \leq L^4_4\,a + r\,\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr|. \tag{7.12}
\]
Employing equations (7.9) and (7.10) yields
\[
a' < L^4_4\,a + L^4_4\,r = L^4_4\,(r - |a|) \leq 0. \tag{7.13}
\]

(3) $\phi$ is antichronous and $a > 0$. This yields $L^4_4\,a < 0$ and with equation (7.7)
\[
a' \leq L^4_4\,a + r\,\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr|. \tag{7.14}
\]
Employing again (7.9) and (2.8) for $L^T$ yields
\[
\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr| < |L^4_4|, \tag{7.15}
\]
which in turn with equation (7.14) yields
\[
a' < L^4_4\,a + |L^4_4|\,r = |L^4_4|\,(r - a) \leq 0. \tag{7.16}
\]

(4) $\phi$ is antichronous and $a < 0$. This yields $L^4_4\,a > 0$, and with equation (7.7)
\[
a' \geq L^4_4\,a - r\,\Bigl| \sum_{j=1}^{3} L^4_j e^j \Bigr|. \tag{7.17}
\]
Together with equations (7.9) and (7.15) one obtains
\[
a' > L^4_4\,a - |L^4_4|\,r = |L^4_4|\,(|a| - r) \geq 0. \tag{7.18}
\]
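Thus, the proposition is proven.

The decomposition of the set of causal vectors on $\mathcal{M}_s$ is fixed by the following definition.

Definition 7.6. (1) The causal cone in $T_p\mathcal{M}_s$, $p \in M_s$, is defined by
\[
C_p := \{\upsilon \in T_p\mathcal{M}_s : g_s(p)(\upsilon, \upsilon) \leq 0,\ \upsilon \neq 0\}. \tag{7.19}
\]
The set of causal vectors on $\mathcal{M}_s$ is
\[
C = \bigcup_{p \in M_s} C_p. \tag{7.20}
\]
(2) Let $\varphi$ be an arbitrary Minkowski chart and let $x = \varphi(p)$, $p \in M_s$. Then
\[
C^\pm_p := \{\upsilon \in C_p : \upsilon = \upsilon^\alpha \partial_{x^\alpha},\ \pm\upsilon^4 > 0\} \tag{7.21}
\]
and
\[
C^\pm = \bigcup_{p \in M_s} C^\pm_p. \tag{7.22}
\]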
The definition of $C^\pm$ depends on an arbitrarily chosen chart $\varphi$. However, according to Proposition 7.5, $C^\pm$ is identical for all Minkowski charts which can be constructed from $\varphi$ by an orthochronous Lorentz transformation $\phi$. If one used in equation (7.21) a Minkowski chart $\varphi'$ which was generated from $\varphi$ by an antichronous Lorentz transformation, then the sets $C'^\pm$ so defined would satisfy
\[
C'^\pm = C^\mp. \tag{7.23}
\]
This reflects the fact that the construction of $\mathcal{M}_s$ discussed so far does not distinguish between future and past. The only way to introduce a notion of past and future is an arbitrary choice. This means the notions of future and past must be associated with the mathematical terms $C^+$ and $C^-$, which depend on the arbitrary choice of a Minkowski chart.

According to the discussion in Section 7.1.1 the introduction of such a notion of future and past is possible if $\mathcal{M}_s$ is time-orientable. To prove this property one needs some properties of covariant derivatives $\nabla$ and of the notion of parallelism. The precise definition of covariant derivatives can be found in the literature (see for example [16, p. 303]). To prove the time orientability of $\mathcal{M}_s$ we state the following definition.

Definition 7.7. (1) Let $u$ be a vector field defined on an open subset $V \subset M_s$. Then $u$ is called parallel if and only if $\nabla u = 0$ for all $p \in V$.
(2) Let $\gamma : I \to M_s$ be a differentiable curve for $I \subset \mathbb{R}$ open. Let $u$ be a vector field defined along $\gamma$ and on an open neighbourhood around $\gamma$. The vector field $u$ is called parallel along $\gamma$ if and only if $\nabla_{\dot\gamma} u = 0$.
(3) Let $\sigma_1, \sigma_2 \in I$, $\sigma_1 < \sigma_2$, and $p_j = \gamma(\sigma_j)$, $j = 1, 2$. Moreover, let $u$ be parallel along $\gamma$. We call $u(p_2)$ the parallel transport of $u(p_1)$ along $\gamma$.

Proposition 7.8. $\mathcal{M}_s$ is time-orientable.

Proof. Let a Minkowski chart $\varphi$ be given, together with the sets $C^\pm_p$, $p \in M_s$, and $C^\pm$ defined by $\varphi$. Moreover, let $p$ be an arbitrary element of $M_s$ and $u = u^\alpha \partial_{x^\alpha}|_p$ be an element of $C^\pm_p$. Then
\[
u(p') = u^\alpha \partial_{x^\alpha}|_{p'}, \qquad p' \in M_s, \tag{7.24}
\]
defines a vector field on $M_s$ with $u(p') \in C^\pm_{p'}$. Since the Christoffel symbols vanish for all Minkowski charts, the covariant derivative of $u$ is
\[
\nabla u = \frac{\partial u^\alpha}{\partial x^\beta}\,dx^\beta \otimes \partial_{x^\alpha} = 0. \tag{7.25}
\]
This means $u$ is parallel. Let $\gamma : I \to M_s$ be a differentiable curve for the open interval $I \subset \mathbb{R}$. Then
\[
\nabla_{\dot\gamma} u = \nabla u(\dot\gamma, \cdot) = 0. \tag{7.26}
\]
This means that after parallel transport from $p_1$ to $p_2$ every vector which was an element of $C^\pm_{p_1}$ lies in $C^\pm_{p_2}$. Of course, the same sign $+$ or $-$ has to be chosen throughout the statement.

Using Proposition 7.8 we can now introduce the time orientation of $\mathcal{M}_s$.

Definition 7.9. For a given chart $\varphi$ the sets $C^+$ and $C^-$ are defined as in Definition 7.6. The elements of $C^+$ are called future-pointing, and the elements of $C^-$ are called past-pointing.

There exists a large number of additional facts on the time orientation in relativistic theories which we will not discuss here (see for example [14]).
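The statement of Proposition 7.5, on which the chart independence of $C^\pm$ rests, can be checked on sample vectors. The sketch below is illustrative: it takes a special Lorentz matrix with velocity parameter 0.6 as the orthochronous example and its product with $\eta$ as the antichronous example, and tests causal vectors with small integer components. All names are hypothetical.

```python
import itertools
import math

# The Lorentz matrix eta = diag(1, 1, 1, -1) reverses the time direction.
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def boost(v):
    """Special Lorentz matrix acting on the coordinates 1 and 4, |v| < 1."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    r = -v * k
    return [[k, 0, 0, r], [0, 1, 0, 0], [0, 0, 1, 0], [r, 0, 0, k]]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, u):
    """Matrix times column vector."""
    return [sum(m * x for m, x in zip(row, u)) for row in matrix]


def is_causal(u):
    """Nonzero vector with r^2 <= a^2, compare equation (7.5)."""
    return any(u) and sum(x * x for x in u[:3]) <= u[3] ** 2


def causal_samples():
    """All causal vectors with integer components in a small range."""
    spatial = itertools.product(range(-2, 3), repeat=3)
    times = [t for t in range(-4, 5) if t != 0]
    return [s + (t,) for s in spatial for t in times if is_causal(s + (t,))]


def keeps_time_sign(lorentz, u):
    """True if the 4-component of u keeps its sign under the matrix."""
    return apply(lorentz, u)[3] * u[3] > 0


ORTHOCHRONOUS = boost(0.6)               # L44 = 1.25 >= 1
ANTICHRONOUS = matmul(ETA, boost(0.6))   # L44 = -1.25 <= -1
```

For every sampled causal vector the orthochronous matrix should preserve the sign of the fourth component, and the antichronous one should reverse it.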
60 | 7 Orientation
7.2 Orientation of vector bases Demanding that spacetime is orientable avoids a Mobius-strip-like space-time geometry. There exist different but equivalent definitions of orientation, for example, as follows: Definition 7.10. A Lorentzian manifold M is called oriented if there exists an everywhere-defined nonvanishing 4-form ω on M . This 4-form ω is called orientation. Proposition 7.11. Consider Ms and a Minkowski chart φ. The 4-form ω = ∑ sign (P)dx P(1) ⊗ ⋅ ⋅ ⋅ ⊗ dx P(4) =: dx1 ∧ dx2 ∧ dx3 ∧ dx4
(7.27)
P∈S
is an orientation of Ms , with S being the permutation group of four elements and sign(P) = 1 for even permutations and sign(P) = −1 for odd permutations of P. Proof. The 4-form ω is defined everywhere and for every p ∈ M s ω(p)(∂ x1 p , ∂ x2 p , ∂ x3 p , ∂ x4 p ) = 1.
(7.28)
holds, i.e., $\omega$ vanishes nowhere. Differentiability is guaranteed by Definition 9.49.

An orientation of vector bases can now be defined as follows.

Definition 7.12. Let $\hat e = (e_1, e_2, e_3, e_4)$ be an arbitrary basis of $T_p\mathcal{M}_s$. We call $\hat e$ positively oriented if $\omega(p)(\hat e) > 0$ and negatively oriented if $\omega(p)(\hat e) < 0$.

The definition of $\omega$ depends on the choice of a Minkowski chart $\varphi$. Thus, it is necessary to investigate the behavior of $\omega$ under chart changes.

Proposition 7.13. Let $\omega$ be defined as above for a Minkowski chart $\varphi$, and let $\omega'$ be defined in the same way by another Minkowski chart $\varphi'$. Then, using Notation 3.7, it holds that
$\omega' = \omega$ for proper Lorentz transformations,
$\omega' = -\omega$ for improper Lorentz transformations.

Proof. Equation (7.27) and the transformation equation $dx'^\alpha = L^\alpha{}_\beta\, dx^\beta$ (see Definition 9.14 and equation (9.49)) with
$$L^\alpha{}_\beta = \frac{\partial \phi^\alpha}{\partial x^\beta} \tag{7.29}$$
for $\phi = \varphi' \circ \varphi^{-1}$ yield
$$\omega' = \sum_{P \in S} \operatorname{sign}(P) \sum_{\beta_1, \ldots, \beta_4 = 1}^{4} L^{P(1)}{}_{\beta_1} \cdots L^{P(4)}{}_{\beta_4}\, dx^{\beta_1} \otimes \cdots \otimes dx^{\beta_4}. \tag{7.30}$$
Using the notation
$$A(\beta_1, \ldots, \beta_4) := \sum_{P \in S} \operatorname{sign}(P)\, L^{P(1)}{}_{\beta_1} \cdots L^{P(4)}{}_{\beta_4} \tag{7.31}$$
one obtains
$$A(\beta_1, \ldots, \beta_4) = \det L \tag{7.32}$$
for $(\beta_1, \ldots, \beta_4)$ being an even permutation of $(1, 2, 3, 4)$,
$$A(\beta_1, \ldots, \beta_4) = -\det L \tag{7.33}$$
for $(\beta_1, \ldots, \beta_4)$ being an odd permutation of $(1, 2, 3, 4)$, and
$$A(\beta_1, \ldots, \beta_4) = 0 \tag{7.34}$$
for $(\beta_1, \ldots, \beta_4)$ being not pairwise different. Thus, (7.30) implies
$$\omega' = \det L \sum_{P \in S} \operatorname{sign}(P)\, dx^{P(1)} \otimes \cdots \otimes dx^{P(4)}. \tag{7.35}$$
Thus finally,
$$\omega' = \det L \cdot \omega. \tag{7.36}$$
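The case distinction (7.32) to (7.34) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `sign`, `det`, `A`, and `satisfies_7_32_to_7_34` are hypothetical, indices are 0-based, `L[row][col]` stands for $L^{\text{row}}{}_{\text{col}}$, and the two sample matrices (a boost along the 1-axis and the same boost combined with a reflection of the 2-axis) are assumptions chosen only for illustration.

```python
from itertools import permutations, product

def sign(perm):
    """Sign of a permutation given as a tuple of distinct 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def A(L, beta):
    """Sum over P of sign(P) * L[P(1)][beta_1] * ... * L[P(4)][beta_4], as in (7.31)."""
    total = 0.0
    for P in permutations(range(4)):
        term = float(sign(P))
        for slot in range(4):
            term *= L[P[slot]][beta[slot]]
        total += term
    return total

def satisfies_7_32_to_7_34(L, tol=1e-12):
    """Check A(beta) = sign(beta) * det L for permutations and A(beta) = 0 otherwise."""
    d = det(L)
    for beta in product(range(4), repeat=4):
        expected = sign(beta) * d if len(set(beta)) == 4 else 0.0
        if abs(A(L, beta) - expected) > tol:
            return False
    return True

# Hypothetical sample data: k = (1 - v**2) ** -0.5 for v = 0.6.
k, v = 1.25, 0.6
proper = [[k, 0, 0, -v * k], [0, 1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]
improper = [[k, 0, 0, -v * k], [0, -1, 0, 0], [0, 0, 1, 0], [-v * k, 0, 0, k]]
```

For the proper sample matrix the determinant is $+1$, so the orientation is preserved; for the improper one it is $-1$, so the orientation is reversed, in line with (7.36).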
7.3 Orientations on $M_s$

7.3.1 Introductory remarks

In Section 5.2 we found that the quadruple $(M_s, +, \cdot, \hat g)$ is a Lorentz vector space. Since these structures on $M_s$ depend on the choice of a Minkowski chart, it is convenient to choose the chart $\varphi$ which was used to define the time orientation of $\mathcal{M}_s$. However, the following discussion could be carried out equally well in any other chart $\varphi'$ generated from $\varphi$ by an orthochronous Lorentz transformation.
7.3.2 Time orientation

Applying Definition 6.1 to $V = (M_s, +, \cdot, \hat g)$ we immediately find the following:

Conclusion 7.14.
(1) A vector $p \in M_s$ is called
timelike, if and only if $\hat g(p, p) < 0$,
lightlike, if and only if $\hat g(p, p) = 0$ and $p \neq 0$,
spacelike, if and only if $\hat g(p, p) > 0$ or $p = 0$.
(2) The coordinate form $p = p^\alpha e_\alpha$ with respect to $\varphi$ and the definition of $\hat g$ imply that $p$ is
timelike, if and only if $\sum_{j=1}^{3} (p^j)^2 < (p^4)^2$,
lightlike, if and only if $\sum_{j=1}^{3} (p^j)^2 = (p^4)^2$ and $p \neq 0$,
spacelike, if and only if $\sum_{j=1}^{3} (p^j)^2 > (p^4)^2$ or $p = 0$.
(3) The vector $p$ is causal if it is not spacelike.
Having established this classification we find the following proposition, which is analogous to Proposition 7.5.

Proposition 7.15. Let $p \in M_s$ be causal. Expanded in the Minkowski chart $\varphi$ it has the form $p = p^\alpha e_\alpha$. In another Minkowski chart $\varphi'$ the 4-component $p'^4$ of $p$ has the same sign as $p^4$ if the Lorentz transformation $\phi = \varphi' \circ \varphi^{-1}$ is orthochronous, and $p'^4$ has the opposite sign to $p^4$ if $\phi$ is antichronous.

The proof of this proposition is identical with the proof of Proposition 7.5. One simply has to replace
$$u'^\alpha \to p'^\alpha, \quad u^\beta \to p^\beta, \quad a \to p^4, \quad r = \Bigl(\sum_{j=1}^{3} (p^j)^2\Bigr)^{\frac{1}{2}} \ge 0. \tag{7.37}$$
Further notions of the time orientation on $M_s$, which we will discuss now, can be introduced on all tangent vector spaces.

Definition 7.16.
(1) The set
$$C = \{p \in M_s : p \text{ is causal}\} \tag{7.38}$$
is called the causal cone.
(2) For the Minkowski chart $\varphi$ chosen as in Section 7.3.1 we define
$$C_\pm := \{p \in C : p = p^\alpha e_\alpha,\ \pm p^4 > 0\}. \tag{7.39}$$
$C_+$ is called the future, or forward, causal cone; $C_-$ is called the past, or backward, causal cone. A vector $p \in C_+$ is called future-pointing, a vector $p \in C_-$ is called past-pointing.
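The classification of Conclusion 7.14 and the cones of Definition 7.16 translate directly into a few lines of code. The sketch below is an illustration only: the function names are hypothetical, vectors are given by their components $(p^1, p^2, p^3, p^4)$ in the chosen Minkowski chart, and exact (integer) components are assumed so that the lightlike case can be tested with an equality.

```python
def g_hat(p, q):
    """Lorentz product in a Minkowski chart; the last entry is the 4-component."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

def causal_character(p):
    """Return 'timelike', 'lightlike', or 'spacelike' following Conclusion 7.14."""
    norm = g_hat(p, p)
    if norm < 0:
        return "timelike"
    if norm == 0 and any(c != 0 for c in p):
        return "lightlike"
    return "spacelike"  # includes the zero vector

def cone(p):
    """Return '+' if p lies in C_+, '-' if p lies in C_-, None if p is not causal."""
    if causal_character(p) == "spacelike":
        return None
    # For a causal vector the 4-component cannot vanish.
    return "+" if p[3] > 0 else "-"
```

For example, `cone((3, 4, 0, 5))` classifies a lightlike future-pointing vector, while `cone((1, 0, 0, 0))` returns `None` because the vector is spacelike.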
7.3.3 Chronal and causal relations

Throughout Sections 5.2, 7.3.1, and 7.3.2 we used $M_s$ as a set of vectors, while originally, in Section 1.1, we considered $M_s$ as a manifold, interpreted as a set of pointlike physical events. These two different viewpoints on the elements of $M_s$ allow for the introduction of two further structures.

Definition 7.17.
(1) The relation $\ll$ is called the chronological relation or chronology. It is defined by all pairs $(p_1, p_2)$ with $p_1, p_2 \in M_s$ for which $p_2 - p_1$ is timelike and future-pointing. In other words: $p_1 \ll p_2$ means that $p_2$ lies in the chronological future of $p_1$.
(2) The relation $\le$ is called the causal relation or causality. It is defined by all pairs $(p_1, p_2)$ with $p_1, p_2 \in M_s$ for which $p_2 - p_1$ is either zero or causal and future-pointing. In other words: $p_1 \le p_2$ means that $p_2$ can be caused or causally influenced by $p_1$.
(3) The sets
$$\begin{aligned} K_+(p) &= \{p' \in M_s : p \le p'\} = \{p' : p' - p \in C_+\}, \\ K_-(p) &= \{p' \in M_s : p' \le p\} = \{p' : p' - p \in C_-\} \end{aligned} \tag{7.40}$$
are called causal future (+) and causal past (−) of $p$. In the same manner we call
$$\begin{aligned} J_+(p) &= \{p' \in M_s : p \ll p'\}, \\ J_-(p) &= \{p' \in M_s : p' \ll p\} \end{aligned} \tag{7.41}$$
the chronological future (+) and chronological past (−) of $p$.

Conclusion 7.18.
(1) By definition $\ll\ \subset\ \le$, which implies
$$J_\pm(p) \subset K_\pm(p). \tag{7.42}$$
(2) The sets $J_\pm(p)$ are nonempty. This is clear since $p = p^\alpha e_\alpha$ and for every $\lambda > 0$
$$p' := (p^\alpha \pm \lambda \delta^\alpha_4) e_\alpha \in J_\pm(p) \tag{7.43}$$
holds. Thus, by (7.42) the sets $K_\pm(p)$ contain more elements than only $p$.

In the application of the relations $\le$ and $\ll$ the following properties are of importance.

Proposition 7.19.
(1) The relation $\le$ is a partial ordering.
(2) The relation $\ll$ is transitive but not reflexive.

Proof. (1) We need to show that $\le$ is reflexive, antisymmetric, and transitive. By Definition 7.17, $p \le p$ holds; thus, $\le$ is reflexive. If $p \le p'$ and $p' \le p$ hold with $p \neq p'$, then $p' - p$ and $p - p'$ must both be future-pointing. By definition it is not possible that the two vectors $q$ and $-q$ are both future-pointing; thus, we are left with $p = p'$. Hence $\le$ is antisymmetric. What remains is to show transitivity, i.e., the conclusion from $p_1 \le p_2$ and $p_2 \le p_3$ to $p_1 \le p_3$. This is trivial if $p_1 = p_2$ or if $p_2 = p_3$. The only nontrivial case is the case when $p_1, p_2, p_3$ are pairwise different. In this case $p_2 - p_1$ and $p_3 - p_2$ are causal and future-pointing. This means their components with respect to the chart $\varphi$ satisfy
3
0
σ0 .

Proof. (1) Let $f$ be a function of $p, q \in M_s$ and $s \in \mathbb{R}_+$ defined by
$$f(s, p, q) = \hat g(p - q, p - q) + s = \eta_{\alpha\beta}(p^\alpha - q^\alpha)(p^\beta - q^\beta) + s, \tag{8.7}$$
where $p = p^\alpha e_\alpha$ and $q = q^\beta e_\beta$ in the chart chosen in Section 8.1. The form of $f$ implies that $p \in J_+(q)$ if $f(s, p, q) = 0$, for given $q$ and $s$. For a fixed $q$ and for each $s \ge 0$ the set $F_s$ of those $p$ such that $f(s, p, q) = 0$ is a 3-dimensional surface in $J_+(q)$ and
$$J_+(q) = \bigcup_{s \ge 0} F_s. \tag{8.8}$$
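In particular for $s = 0$ we find $f(0, q, q) = 0$ and $F_0 = \{q\}$.

(2) At a point $p$, the normal $n$ on $F_s$ is given by $n = n^\alpha e_\alpha$ with
$$n^\alpha = \Bigl(-\eta^{\kappa\lambda} \frac{\partial f}{\partial p^\kappa} \frac{\partial f}{\partial p^\lambda}\Bigr)^{-\frac{1}{2}} \eta^{\alpha\beta} \frac{\partial f}{\partial p^\beta} \tag{8.9}$$
(see for example [15, p. 99]). Using
$$\frac{\partial f}{\partial p^j} = 2(p^j - q^j), \quad j = 1, 2, 3, \qquad \frac{\partial f}{\partial p^4} = -2(p^4 - q^4) \tag{8.10}$$
one obtains
$$\eta^{\kappa\lambda} \frac{\partial f}{\partial p^\kappa} \frac{\partial f}{\partial p^\lambda} = 4(f - s), \qquad \eta^{\alpha\beta} \frac{\partial f}{\partial p^\beta} = 2(p^\alpha - q^\alpha), \tag{8.11}$$
which implies for $f(s, p, q) = 0$, i.e., at $p \in F_s \subset J_+(q)$ with $s > 0$,
$$n = s^{-\frac{1}{2}} (p - q). \tag{8.12}$$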
Thus, the normal vector $n$ is timelike and future-pointing. All tangent vectors of $F_s$ at $p$ lie in $n^\perp$ and thus are spacelike, according to Proposition 6.7.

(3) Now $\overset{\circ}{\gamma}(\sigma_0)$ is timelike and future-pointing. Moreover, at $\gamma(\sigma_0) = q$, the curve $\gamma$ can be approximated to arbitrary accuracy by its tangent in a neighborhood of $\sigma_0$. Thus, there exists a $\sigma_1 > \sigma_0$ such that $\gamma(\sigma) \in J_+(q)$ for $\sigma_0 < \sigma < \sigma_1$. We define the function $h_q$ by
$$h_q(\sigma) = -\hat g(\gamma(\sigma) - q, \gamma(\sigma) - q). \tag{8.13}$$
From equation (8.7) it follows that $\gamma(\sigma) \in F_s \subset J_+(q)$ if and only if
$$0 < s = h_q(\sigma). \tag{8.14}$$
Finally, we will show that $h_q$ is strictly monotonically increasing. If $h_q$ were not strictly increasing at some $\sigma$, then $\overset{\circ}{\gamma}(\sigma)$ would be tangent to $F_s$ with $s = h_q(\sigma)$, which would in turn imply that $\overset{\circ}{\gamma}(\sigma)$ is spacelike, which contradicts the assumptions. Thus, $h_q$ must be strictly increasing. So there exists $h_q^{-1}$, i.e., for every $s \in h_q[I]$ there exists a $\sigma$ with $h_q(\sigma) = s$. This ensures $\gamma(\sigma) \in J_+(q)$.

Corollary 8.6.
(1) Every world-curve $\gamma$ is injective. This can be seen from the fact that if $\sigma_1 \neq \sigma_2$, then $s_1 = h_q(\sigma_1) \neq s_2 = h_q(\sigma_2)$. Thus, $F_{s_1} \cap F_{s_2} = \emptyset$, which implies $\gamma(\sigma_1) \neq \gamma(\sigma_2)$ since $\gamma(\sigma_j) \in F_{s_j}$, $j = 1, 2$.
(2) Let $\gamma$ be a world-curve and let $\sigma_1 < \sigma_2$. Then $\gamma(\sigma_1) \ll \gamma(\sigma_2)$. This implies by equation (8.1) $\bar\gamma^4(\sigma_1) < \bar\gamma^4(\sigma_2)$. Conversely, equation (8.1) and $\bar\gamma^4(\sigma_1) < \bar\gamma^4(\sigma_2)$ imply $\gamma(\sigma_1) \ll \gamma(\sigma_2)$.

In a next step we consider lightlike world curves. They are used to describe light pulses in a vacuum and are not only characterized by being lightlike and future-pointing but are subject to further conditions.

Definition 8.7.
(1) Let $\gamma : I \to M_s$ be a $C^k$-curve with $k \ge 3$ defined on an open finite or infinite interval $I \subset \mathbb{R}$. Such a curve is called a lightcurve if $\dot\gamma(\sigma)$ is lightlike and future-pointing for every $\sigma \in I$ and if $\gamma$ is a geodesic. One can always assume that $\gamma$ is parametrized with an affine parameter.
(2) The set $\gamma[I]$ is called a light world line or light ray.
(3) The restriction of a lightcurve $\gamma$ to a (finite) subinterval $[\sigma_1, \sigma_2] \subset I$ is called a light signal or lightlike signal from $p_1 = \gamma(\sigma_1)$ to $p_2 = \gamma(\sigma_2)$.

For lightcurves we can formulate the following proposition. It corresponds to Propositions 8.2 and 8.5 as well as to Corollaries 8.3 and 8.6.

Proposition 8.8.
(1) If $\gamma : I \to M_s$ is a lightcurve with $\gamma(\sigma_0) = q$, then $\gamma(\sigma) \in K_+(q) \setminus J_+(q) =: H_+(q)$ for $\sigma \ge \sigma_0$.
(2) The function $\gamma$ is injective.
(3) For every $p_1 \le p_2$ with $p_1 \neq p_2$ and $p_2 \notin J_+(p_1)$ there exists a lightcurve $\gamma : \mathbb{R} \to M_s$ such that the restriction of $\gamma$ to the interval $[0, 1]$ is a lightlike signal from $p_1$ to $p_2$.
(4) For every lightcurve $\gamma$ the inequality $\bar\gamma^4(\sigma_1) < \bar\gamma^4(\sigma_2)$ holds if $\sigma_1 < \sigma_2$. Thus, $\bar\gamma^4$ is injective. Here we used the notation $\bar\gamma = \varphi \circ \gamma$.

Proof. (1) Let $\gamma$ be an affinely parametrized geodesic (see [16, p. 303]). Then $\gamma$ is a linear function of $\sigma$, i.e.,
$$\gamma(\sigma) = q + u(\sigma - \sigma_0), \tag{8.15}$$
and
$$\overset{\circ}{\gamma}(\sigma) = u \quad \text{for all} \quad \sigma \in I \tag{8.16}$$
holds. Since $u$ is lightlike and future-pointing we find for $\sigma > \sigma_0$
$$\gamma(\sigma) - \gamma(\sigma_0) = u(\sigma - \sigma_0). \tag{8.17}$$
Thus, $\gamma(\sigma) \in H_+(q)$.

(2) Using the chart $\varphi$, equation (8.15) becomes
$$\bar\gamma^\alpha(\sigma) = q^\alpha + u^\alpha(\sigma - \sigma_0), \quad \alpha = 1, 2, 3, 4. \tag{8.18}$$
Now $u$ is lightlike, so
$$0 < \Bigl(\sum_{j=1}^{3} (u^j)^2\Bigr)^{\frac{1}{2}} = u^4 \tag{8.19}$$
holds. Thus, each of the equations (8.18) can either be solved for $\sigma$ or does not depend on $\sigma$. This implies that $\gamma$ is injective and therefore $\gamma^{-1}$ exists.

(3) By assumption $p_2 - p_1$ is lightlike and future-pointing. Let
$$\gamma(\sigma) = p_1 + (p_2 - p_1)\sigma. \tag{8.20}$$
Then $\gamma(0) = p_1$ and $\gamma(1) = p_2$. Moreover, $\overset{\circ}{\gamma}(\sigma) = p_2 - p_1$ is lightlike and future-pointing. Thus, according to (8.20), $\gamma$ is the desired curve.
(4) By (8.17) and $\bar\gamma^\alpha = \varphi^\alpha \circ \gamma$,
$$\bar\gamma^4(\sigma) - \bar\gamma^4(\sigma_0) = u^4(\sigma - \sigma_0) \tag{8.21}$$
holds. Since $u^4 > 0$, the proposition is proven.

Definition 8.9.
(1) It is usual to call world-curves observers.
(2) A pair $(p, u)$ with $p \in M_s$ and $u \in T_p\mathcal{M}_s$ such that $u$ is timelike and future-pointing is called an instant observer.

The notion of observer is mainly used in the case when signals arrive at or leave from a world-curve $\gamma$. The physical interpretation is that a real point, which is moving on $\gamma$, explores its surroundings. The notion of instant observer is based on the idea that a real-point observer always has and knows its position $p$ and its velocity $u$. Globally one can see an observer as an infinite set of instant observers. At every instant of time an observer is an instant observer.

In the context of Minkowski space, which is the main subject of this book, there exists a particular class of observers which play an important role.

Definition 8.10. Let $\varphi$ be the Minkowski chart introduced in Section 8.1, and let $\psi$ be an arbitrary Minkowski chart with coordinates $y = \psi(p)$, $p \in M_s$, such that the Lorentz transformation $\psi \circ \varphi^{-1}$ is orthochronous. Then $\sigma \in I$, $y_0 \in \mathbb{R}^4$, and
$$\bar\mu^j(\sigma) = y_0^j, \quad j = 1, 2, 3 \qquad \text{and} \qquad \bar\mu^4(\sigma) = y_0^4 + a\sigma, \quad a > 0 \tag{8.22}$$
define a curve $\mu(\sigma) = \psi^{-1} \circ \bar\mu(\sigma)$ which is injective, timelike, and future-pointing. Thus, this curve is an observer. It is called a Minkowski observer or inertial observer.

This notion makes sense by the following fact. The curve $\bar\mu$ is linear in $\sigma$, and thus $\mu$ is an affinely parametrized geodesic, i.e., the observer is freely falling. Its curve $\mu$ can be visualized as a straight line parallel to the 4-axis of the coordinates $y = \psi(p)$.

Another notion of interest besides the decomposition of an observer into instant observers is the "cooperation" between observers.

Definition 8.11.
(1) A set of observers with pairwise disjoint worldlines is called a reference system.
(2) A set of observers which are defined in terms of a Minkowski chart $\psi$ as in Definition 8.10 but with different $(y^1, y^2, y^3)$ is called a Minkowski reference system or inertial system, abbreviated as MR. For different $(y^1, y^2, y^3)$ the values of $y^4$ and $a$ in equation (8.22) may differ.

In the next sections those Minkowski reference systems are of particular interest whose observers have the same $y^4$ and the same $a$, i.e., those which differ only in $(y^1, y^2, y^3)$.
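The coordinate form (8.22) of a Minkowski observer and the straight light signal (8.20) between two events can be sketched in code. This is an illustration under stated assumptions, not the author's construction: the function names are hypothetical, all points are given by their components in one fixed Minkowski chart, and integer sample data are chosen so that the lightlike condition can be tested exactly.

```python
def g_hat(p, q):
    """Lorentz product in a Minkowski chart; the last entry is the 4-component."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

def minkowski_observer(y0, a):
    """Curve with coordinates as in (8.22): fixed spatial part, 4-component y0[3] + a*sigma."""
    if a <= 0:
        raise ValueError("a must be positive")

    def mu(sigma):
        return (y0[0], y0[1], y0[2], y0[3] + a * sigma)

    return mu

def light_signal(p1, p2):
    """Straight curve (8.20) from p1 to p2; p2 - p1 must be lightlike and future-pointing."""
    d = tuple(b - a for a, b in zip(p1, p2))
    if g_hat(d, d) != 0 or d[3] <= 0:
        raise ValueError("p2 - p1 is not lightlike and future-pointing")

    def gamma(sigma):
        return tuple(a + c * sigma for a, c in zip(p1, d))

    return gamma

# Two observers of one Minkowski reference system with the same y^4 offset and the same a.
mu_a = minkowski_observer((0, 0, 0, 0), 1)
mu_b = minkowski_observer((3, 4, 0, 0), 1)

# A light signal emitted by mu_a at parameter 0 reaches mu_b at parameter 5.
signal = light_signal(mu_a(0), mu_b(5))
```

Here `signal(0)` and `signal(1)` reproduce the emission and reception events, and the 4-component of `signal` increases with the parameter, as stated in Proposition 8.8(4).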
8.3 Clocks

The most important tool of a real-point observer is its clock. In principle an observer may have infinitely many clocks, as we will discuss below. In this theoretical setting a clock is defined as follows.

Definition 8.12.
(1) Let $\gamma$ be an observer with worldline $\gamma[I]$. Furthermore, let
$$U : \gamma[I] \to \mathbb{R} \tag{8.23}$$
be a $C^k$, $k \ge 3$, function such that $f := U \circ \gamma$ is strictly monotonically increasing. The function $U$ is called a clock of $\gamma$ or a clock on $\gamma[I]$.
(2) The number $t$ determined by $t = U(\gamma(\sigma))$ is called the time parameter of $\gamma$ or simply the time on $\gamma[I]$.

Conclusion 8.13.
(1) For every world curve $\gamma$ there exists its inverse $\gamma^{-1}$. Thus it is possible to define a clock $U$ of $\gamma$ for every strictly monotonically increasing $C^k$, $k \ge 3$, function $f$ by $U = f \circ \gamma^{-1}$. The simplest example is the choice $f = \mathrm{id}$. With this choice, $\sigma$ is the time parameter of $\gamma$. In other words, $\gamma$ always possesses a clock, namely $\gamma^{-1}$. Another example is to choose the function $f = h_q$ defined in equation (8.13) to define a clock on $\gamma$.
(2) The notion of time used most often in special relativity is the following. Define $f$ by
$$f(\sigma) := \int_{\sigma_0}^{\sigma} \bigl(-g(\dot\gamma(\lambda), \dot\gamma(\lambda))\bigr)^{\frac{1}{2}}\, d\lambda. \tag{8.24}$$
The integrand of (8.24) is positive, and thus $f$ is strictly monotonically increasing. Hence,
$$U_E := f \circ \gamma^{-1} \tag{8.25}$$
is a clock.

Definition 8.14. The clock $U_E$ of $\gamma$ is called the proper time clock or standard clock of $\gamma$. The time parameter of $U_E$ is called proper time and is often denoted by $\tau$.

Conclusion 8.15.
(1) Let us consider the Minkowski observer $\mu$ introduced in equation (8.22). Since $\dot\mu(\sigma) = a\,\partial_{y^4}$ we find $f(\sigma) = a\sigma$ for $\sigma_0 = 0$, and thus $\tau = a\sigma = y^4$ is the proper time of the Minkowski observer.
(2) For every curve $\gamma$ parametrized in proper time we find $g(\dot\gamma(\tau), \dot\gamma(\tau)) = -1$, which can be understood as follows. By assumption, $\tau = U_E(\gamma(\tau)) = f(\tau)$.
Thus, $f = \mathrm{id}$, and since
$$\tau = \int_{\tau_0}^{\tau} \bigl(-g(\dot\gamma(\tau'), \dot\gamma(\tau'))\bigr)^{\frac{1}{2}}\, d\tau'$$
we find $\tau_0 = 0$ and $-g(\dot\gamma(\tau), \dot\gamma(\tau)) = b^2$ with $b$ being a constant. Thus, $f(\tau) = b\tau$ and so $b = 1$ follows.
(3) Consider an observer $\gamma$ with two clocks $U$ and $U'$. From Definition 8.12 we find that $h := f' \circ f^{-1} = U' \circ U^{-1}$ describes the relation between the time parameters $t$ and $t'$ of $U$ and $U'$: $t' = h(t)$, and $h$ is strictly monotonically increasing. This insight tells us that we can construct arbitrary clocks $U'$ from a given clock $U$ with the help of a real, strictly monotonically increasing function $h$. In the literature this is often used by considering only the proper time of observers.

These properties of clocks suggest the following definition.

Definition 8.16. Let $t$ be a time label of $U$. The following relations hold:
$U$ is faster than $U'$, if $\dot h(t) < 1$.
$U$ is slower than $U'$, if $\dot h(t) > 1$.
$U$ and $U'$ are equally fast, if $\dot h = 1$.
One could also say that the ticking rate of $U$ is larger than, smaller than, or equal to the one of $U'$.

Conclusion 8.17.
(1) The world-curve of every observer $\gamma$ can be parametrized with any of the observer's clocks without changing its status as an observer, i.e., without changing its causal character. This can be seen from the following. Let $\gamma : I \to M_s$ and let $U$ be a clock of $\gamma$. Then
$$f := U \circ \gamma : I \to I' \subset \mathbb{R}, \tag{8.26}$$
and for every $\sigma \in I$ there exists a $t \in I'$ such that $\sigma = f^{-1}(t)$. Let $\gamma' : I' \to M_s$ be defined by $\gamma'(t) = \gamma(f^{-1}(t))$. Then
$$\dot\gamma'(t) = \frac{d}{dt} f^{-1}(t)\, \dot\gamma(f^{-1}(t)). \tag{8.27}$$
Since $\frac{d}{dt} f^{-1}(t) > 0$ for all $t \in I'$, it follows that $\dot\gamma'(t)$ is timelike and future-pointing.
(2) According to what we just learned, the question of how one can realize all the theoretical clocks in an experiment reduces to the question of how one can realize one single clock, since all the others can be constructed by a rescaling. A few remarks regarding this problem can be found at the end of Section 8.6.
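The reparametrization argument of Conclusion 8.17(1) can be illustrated numerically. The sketch below is only an example under assumptions that are not in the original text: the world-curve $\gamma(\sigma) = (0.5\sigma, 0, 0, \sigma)$ and the clock reading $f(\sigma) = \sigma^3 + \sigma$ are hypothetical choices, the inverse $f^{-1}$ is obtained by bisection, and the tangent of the reparametrized curve is computed with the chain rule (8.27).

```python
def g_hat(p, q):
    """Lorentz product in a Minkowski chart; the last entry is the 4-component."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

def invert_increasing(f, t, lo=-1.0e3, hi=1.0e3, iterations=200):
    """Solve f(sigma) = t by bisection, assuming f is strictly increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_dot(sigma):
    """Tangent of the hypothetical world-curve gamma(sigma) = (0.5*sigma, 0, 0, sigma)."""
    return (0.5, 0.0, 0.0, 1.0)

def f(sigma):
    """Hypothetical clock reading f = U o gamma; strictly increasing in sigma."""
    return sigma ** 3 + sigma

def f_prime(sigma):
    """Derivative of f; positive everywhere."""
    return 3.0 * sigma ** 2 + 1.0

def reparametrized_tangent(t):
    """Tangent of gamma' = gamma o f^{-1} at clock time t, following (8.27)."""
    sigma = invert_increasing(f, t)
    rate = 1.0 / f_prime(sigma)  # d/dt f^{-1}(t), positive because f is increasing
    return tuple(rate * c for c in gamma_dot(sigma))
```

Because the rescaling factor is positive, the tangent returned by `reparametrized_tangent` stays timelike and future-pointing for every clock time, which is the content of the conclusion.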
So far we have only discussed clocks of single observers. This raises the question of how clocks of different observers of one reference system can, and should, be related. Observers of one reference system should be able to formulate statements about time measurements such that these time measurements make sense for all observers in the reference system. This requires a notion of time which is identical for all observers in the reference system, i.e., one clock for the reference system.

Definition 8.18. Let $W \subset M_s$ be an open and connected subset of $M_s$. Furthermore, let $\hat U : W \to \mathbb{R}$ be a $C^k$, $k \ge 3$, function which satisfies for every world-curve $\gamma$ with values in $W$ the following relation: If $\sigma_1 < \sigma_2$ then
$$\hat U(\gamma(\sigma_1)) < \hat U(\gamma(\sigma_2)). \tag{8.28}$$
Such a function $\hat U$ is called a universal clock on $W$.

This definition is the first step towards an answer to the above question. Let us illustrate the notion of a universal clock by the following example. Consider an arbitrary system of reference $B$ in $M_s$ and let $V$ be the union of all worldlines of $B$:
$$V = \bigcup_{\gamma \in B} \gamma[I_\gamma]. \tag{8.29}$$
Then we find the following.

Proposition 8.19. Let $\hat U$ be a universal clock with domain $W$ such that $V \subset W$. Then for every observer $\gamma$ of $B$ there exists a clock $U_\gamma$ whose time parameter is identical with the time parameter of $\hat U$.

Proof. For every $\gamma \in B$ let $U_\gamma$ be the restriction of $\hat U$ to the worldline $\gamma[I_\gamma]$, i.e., $U_\gamma := \hat U|_{\gamma[I_\gamma]}$. Then
$$f_\gamma := U_\gamma \circ \gamma = \hat U \circ \gamma \tag{8.30}$$
is strictly monotonically increasing. Thus, $U_\gamma$ is a clock on $\gamma[I_\gamma]$. According to Conclusion 8.17 one can change the parameter $\sigma$ of $\gamma$ to the parameter $t$ by $\sigma = f_\gamma^{-1}(t)$. Let
$$\gamma' = \gamma \circ f_\gamma^{-1} = \gamma \circ \gamma^{-1} \circ U_\gamma^{-1} = U_\gamma^{-1} \tag{8.31}$$
and
$$B' = \{\gamma' : \gamma' = U_\gamma^{-1},\ \gamma \in B\}. \tag{8.32}$$
Then $\hat U(\gamma'(t)) = t$ for every $\gamma' \in B'$ and every $t \in I'_\gamma := f_\gamma[I_\gamma]$.
The descriptive meaning of Proposition 8.19 is the following. The possibly not synchronized clocks of different observers of one reference system can be synchronized by using a universal clock. So how can one obtain a universal clock for a reference system? In the standard literature this problem is usually not discussed; however its reverse formulation is. This means that the problem of synchronizing clocks of one reference system is discussed in such a way that separate synchronized clocks get combined to a universal clock. The discussion of the details of this problem lies beyond the focus of this book. The interested reader may find details in [14, p. 52 ff.].
8.4 Newtonian notions in special relativity

The two basic notions in Newtonian physics are "absolute space" and "absolute time". Neither notion can be found in special relativity, since otherwise relativistic physics would not be necessary. The question can only be whether one finds an observer-dependent, nonabsolute notion of space and time in spacetime, here in particular in Minkowski spacetime. The answer to this question will not tell us what time and space are, but will only identify mathematical expressions in which space and time manifest themselves. The discussion of clocks in the previous Section 8.3 suggests the following.

Definition 8.20.
(1) The time of an observer $\gamma$ is given by the time parameter of the clock of $\gamma$.
(2) The space of an observer can be defined in $M_s$ using the properties of the tangent spaces $T_p\mathcal{M}_s$ mapped to $(M_s, +, \cdot, \hat g)$. The space of the observer $\gamma$ at time $\sigma$ is $R_{\overset{\circ}{\gamma}(\sigma)} := (\overset{\circ}{\gamma}(\sigma))^\perp$. It is possible to identify this space in $T_p\mathcal{M}_s$, $p = \gamma(\sigma)$, with $R_{\dot\gamma(\sigma)} := (\dot\gamma(\sigma))^\perp$ (the definition of $\perp$ can be found in Section 9.5).

Observe that time is defined for an observer, while space is only defined for an instant observer.
Based on this observer-dependent notion of time and space it is possible to define the notions of Newtonian velocity and Newtonian acceleration. First, we consider the velocity. The simplest approach is to use $(M_s, +, \cdot, \hat g)$ for the discussion. It can then be trivially transferred to the tangent spaces $T_p\mathcal{M}_s$ or the coordinate space $\mathbb{R}^4$. The problem we seek to solve can be formulated as follows. Given are two curves
$$\gamma : I \to M_s \quad \text{and} \quad \zeta : I' \to M_s, \tag{8.33}$$
where $\gamma$ is causal and future-pointing, while $\zeta$ is timelike and future-pointing. Furthermore, there exist two time parameters $\tau_0 \in I$ and $\sigma_0 \in I'$ such that
$$\gamma(\tau_0) = \zeta(\sigma_0) =: p \in M_s. \tag{8.34}$$
Let
$$u := \overset{\circ}{\gamma}(\tau_0) \quad \text{and} \quad \upsilon := \overset{\circ}{\zeta}(\sigma_0). \tag{8.35}$$
Now, if $(p, \upsilon)$ is an instant observer, the question is: What is the Newtonian velocity $u_N$ of $\gamma$ which the observer $(p, \upsilon)$ measures with its own clock? For the upcoming discussion we need a Minkowski chart $\psi$ with the following properties.
(1) The chart $\psi$ and the distinguished chart $\varphi$, defined in Section 8.1, are connected by an orthochronous Lorentz transformation $(L, 0)$, i.e., $\psi = L \cdot \varphi$.
(2) The basis $\hat e = (e_1, e_2, e_3, e_4)$ of $M_s$ defined by $\psi$ is constructed as in Proposition 5.3(4), i.e., $e_\alpha = \psi^{-1}(z_\alpha)$, where $z_\alpha$ has components $z_\alpha^\beta = \delta_\alpha^\beta$. The dual basis $\hat\theta = (\theta^1, \theta^2, \theta^3, \theta^4)$ of $\hat e$ (see Section 9.4) is defined by
$$\theta^\kappa = \eta^{\kappa\lambda}\, \hat g(e_\lambda, \cdot). \tag{8.36}$$
(3) $\upsilon = w e_4$, $w > 0$ holds (see Proposition 6.2).

Proposition 8.21. The map $P : M_s \to e_4^\perp$ defined by
$$P = \sum_{j=1}^{3} e_j \otimes \theta^j \tag{8.37}$$
is a projector.

Proof. $P$ is linear and $P e_4 = 0$, $P e_j = e_j$ for $j = 1, 2, 3$. Furthermore,
$$P^2 = \sum_{j,k=1}^{3} e_j \otimes \theta^j(e_k)\, \theta^k = P, \tag{8.38}$$
since $\theta^j(e_k) = \delta^j_k$.

By choice of the chart $\psi$ we obtain $\upsilon = w e_4$. Thus, the space $\upsilon^\perp$ defined by $\upsilon$ is spanned by $e = (e_1, e_2, e_3)$. Since the Newtonian velocity of a real-point or a light pulse must be a vector in space, i.e., in $\upsilon^\perp$, we investigate the first three components of $u$. By definition $u = u^\alpha e_\alpha$ with $u^\alpha = \dot{\bar\gamma}^\alpha(\tau_0)$ and $\bar\gamma = \psi \circ \gamma$. Thus,
$$P u = \sum_{j=1}^{3} \dot{\bar\gamma}^j(\tau_0)\, e_j \tag{8.39}$$
is exactly what is understood as Newtonian velocity. This justifies the following.

Definition 8.22. The Newtonian velocity $\tilde u_N$ of a causal future-pointing curve $\gamma$ measured with the time parameter $\tau$ of $\gamma$ at time $\tau_0$ is given by
$$\tilde u_N = P\, \overset{\circ}{\gamma}(\tau_0), \tag{8.40}$$
where $P$ is the projector onto the space $R_\upsilon$ in $M_s$ defined through the instant observer $(\gamma(\tau_0), \upsilon)$ (see Definition 8.20).

Having introduced one possible notion of a Newtonian velocity we can take the next step.

Proposition 8.23. The Newtonian velocity $u_N$ of $\gamma$ at $p = \gamma(\tau_0)$, measured by an instant observer $(p, \upsilon)$ with its own clock, is given by
$$u_N = a P u, \tag{8.41}$$
where
$$a := \hat g(\upsilon, \upsilon)\, \hat g(\upsilon, u)^{-1} \tag{8.42}$$
and $u = \overset{\circ}{\gamma}(\tau_0)$ as in equation (8.35).
Proof. (1) According to Proposition 8.8 and Corollary 8.6 the functions $\bar\gamma^4$ and $\bar\zeta^4$ are injective. Thus, the function $\chi := (\bar\gamma^4)^{-1} \circ \bar\zeta^4$ exists such that $\tau = \chi(\sigma)$ is the transformation of the time parameter $\tau$ of $\gamma$ to the time parameter $\sigma$ of $\zeta$. The reparametrized curve $\gamma$ is $\hat\gamma := \gamma \circ \chi$. Thus, we obtain, using $\tau = \chi(\sigma)$:
$$\frac{d}{d\sigma} \hat\gamma(\sigma) = \frac{d}{d\tau} \gamma(\tau) \cdot \frac{d}{d\sigma} \chi(\sigma). \tag{8.43}$$
Moreover, with $t = \bar\gamma^4(\tau)$,
$$\dot\chi(\sigma) = \frac{d}{dt} (\bar\gamma^4)^{-1}(t) \cdot \dot{\bar\zeta}^4(\sigma) \tag{8.44}$$
holds. The 4-velocity $\hat u$ of $\hat\gamma$, i.e., the 4-velocity after reparametrization of $\gamma$, with $\tau = \chi(\sigma)$ is
$$\hat u(\sigma) = \overset{\circ}{\hat\gamma}(\sigma) = u(\tau) \cdot \dot\chi(\sigma). \tag{8.45}$$
Calling $u_N$ the Newtonian velocity parametrized with the transformed time parameter $\sigma$ we obtain, according to equation (8.40),
$$u_N = P \hat u(\sigma_0) = \dot\chi(\sigma_0)\, P u(\tau_0). \tag{8.46}$$
(2) We must show that $\dot\chi(\sigma_0) = a$. Since $\hat g(\upsilon, \upsilon) = -w^2$ and $\hat g(\upsilon, u) = -w u^4$ we find $a = w (u^4)^{-1}$. Furthermore, $\dot{\bar\zeta}^4(\sigma_0) = w$, and using $t_0 = \bar\gamma^4(\tau_0)$,
$$\frac{d}{dt} (\bar\gamma^4)^{-1}(t_0) = (\dot{\bar\gamma}^4(\tau_0))^{-1} = (u^4)^{-1} \tag{8.47}$$
holds. Employing equation (8.44) yields $\dot\chi(\sigma_0) = a$.

To illustrate Proposition 8.23 consider the following example. Let $\gamma$ be a lightcurve through $p \in M_s$, and let $(p, \upsilon) = (\zeta(\sigma_0), \overset{\circ}{\zeta}(\sigma_0))$ be an instant observer. Then
$$\gamma(\tau) = p + u\tau. \tag{8.48}$$
With respect to the chart $\psi$ we have $\upsilon = w e_4$ and
$$u = \lambda \sum_{j=1}^{3} \varepsilon^j e_j + \lambda e_4 \tag{8.49}$$
with $\sum_{j=1}^{3} (\varepsilon^j)^2 = 1$ and $\lambda > 0$.

Conclusion 8.24.
(1) The above assumptions yield $a = w\lambda^{-1}$, and thus
$$u_N = w \sum_{j=1}^{3} \varepsilon^j e_j. \tag{8.50}$$
The absolute value of $u_N$, i.e., the speed of light $c$, is
$$\|u_N\| = \hat g(u_N, u_N)^{\frac{1}{2}} = w = |\hat g(\upsilon, \upsilon)|^{\frac{1}{2}}. \tag{8.51}$$
Thus, according to Conclusion 8.15(2), $w = 1$ in the case where the world curve $\zeta$ which generates the instant observer is parametrized in proper time. Let us point out the meaning of this result once more. At every time $\sigma$, each observer $\zeta$, using its standard clock, measures the speed of light $c = 1$, independent of its own velocity $\overset{\circ}{\zeta}(\sigma)$ and independent of the state of motion of the light source. This statement is what is called the principle of the constancy of the speed of light, which plays an important role in the heuristic deduction of special relativity. Here it is a consequence of the axioms of Minkowski spacetime in Definition 1.1, as it should be.
(2) Another, rather trivial, example of the measurement of the Newtonian velocity by an instant observer $(p, \upsilon)$ is the measurement of its own velocity. In this case $u = \upsilon$, so that $a = 1$, and thus
$$\upsilon_N = P\upsilon = \sum_{j=1}^{3} \upsilon^j e_j = 0. \tag{8.52}$$
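Formulas (8.41), (8.42), and (8.50) to (8.52) can be tried out with concrete numbers. The following sketch is an illustration under assumptions: all vectors are given by their components in the chart $\psi$ adapted to the observer (so that $\upsilon = w e_4$ and $P$ simply drops the 4-component), and the function names and sample values are hypothetical.

```python
import math

def g_hat(p, q):
    """Lorentz product in a Minkowski chart; the last entry is the 4-component."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

def project_space(u):
    """Projector P of (8.37) in the adapted chart: keep the spatial components only."""
    return (u[0], u[1], u[2], 0.0)

def newtonian_velocity(u, upsilon):
    """u_N = a * P u with a = g(upsilon, upsilon) / g(upsilon, u), see (8.41) and (8.42)."""
    a = g_hat(upsilon, upsilon) / g_hat(upsilon, u)
    return tuple(a * c for c in project_space(u))

def speed(u_N):
    """Absolute value of a spatial vector as in (8.51)."""
    return math.sqrt(g_hat(u_N, u_N))

# Hypothetical lightlike 4-velocity u = lambda * (eps, 1) with |eps| = 1 and lambda = 2.5.
light = (1.5, 2.0, 0.0, 2.5)

# Observer using its proper time clock (w = 1) and one with a rescaled clock (w = 2).
observer_proper = (0.0, 0.0, 0.0, 1.0)
observer_rescaled = (0.0, 0.0, 0.0, 2.0)
```

With these sample values the observer using its proper time clock measures the speed 1 for the light pulse, the observer with $w = 2$ measures the speed 2, and each observer measures the Newtonian velocity 0 for itself.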
So far we have considered the case that one instant observer $(p, \upsilon)$ measures the velocity of a causal curve $\gamma$ which coincides with the observer at time $\tau_0$, i.e., $\gamma(\tau_0) = p$. Surely different instant observers may exist at $p$, and they all can measure the velocity of the curve $\gamma$. So, are the measurements of the different observers mathematically related, and if so, how?

To answer this question it suffices to consider two instant observers $(p, \upsilon)$ and $(p, \upsilon')$. Let $\varphi$ be the Minkowski chart defined in Section 8.1. Then, according to Proposition 6.2, there exist charts $\psi$ and $\psi'$ such that, using the notations from Conclusion 5.3(2), in $M_s$ we find the component representations
$$\upsilon = \upsilon^4 e_4 \quad \text{and} \quad \upsilon' = \upsilon'^4 e'_4. \tag{8.53}$$
Here $\hat e = (e_1, \ldots, e_4)$ and $\hat e' = (e'_1, \ldots, e'_4)$ are the vector bases in $M_s$ generated by $\psi$ and $\psi'$, respectively. Figuratively one can think about the two instant observers as parts of the Minkowski observers defined by $\psi$ and $\psi'$. Using Proposition 6.2, the charts $\psi$ and $\psi'$ can be chosen such that $\psi \circ \varphi^{-1}$ and $\psi' \circ \varphi^{-1}$ are homogeneous Lorentz transformations. Thus, $\psi' \circ \psi^{-1} = (L, 0)$ is homogeneous as well. For the bases $\hat e$ and $\hat e'$ this yields with equation (5.15)
$$e_\beta = L^\alpha{}_\beta\, e'_\alpha. \tag{8.54}$$
The velocity of the causal curve $\gamma$ under consideration at $p = \gamma(\tau_0)$ is given by
$$\overset{\circ}{\gamma}(\tau_0) =: u = u^\alpha e_\alpha = u'^\beta e'_\beta. \tag{8.55}$$
Employing equation (8.54) this implies
$$u'^\kappa = L^\kappa{}_\lambda u^\lambda. \tag{8.56}$$
Finally, let
$$u_N = \sum_{j=1}^{3} u_N^j e_j \quad \text{and} \quad u'_N = \sum_{n=1}^{3} u'^n_N e'_n \tag{8.57}$$
be the Newtonian velocities measured by $(p, \upsilon)$ and $(p, \upsilon')$, respectively. The relation between $u'_N$ and $u_N$ can then be formulated as follows.

Proposition 8.25. For two instant observers $(p, \upsilon)$ and $(p, \upsilon')$ using their proper time clocks we find
$$u'^j_N = \Bigl(\sum_{r=1}^{3} L^4{}_r u^r_N + L^4{}_4\Bigr)^{-1} \Bigl(\sum_{n=1}^{3} L^j{}_n u^n_N + L^j{}_4\Bigr). \tag{8.58}$$

Proof. According to Conclusion 8.24, for proper time clocks $\hat g(\upsilon, \upsilon) = \hat g(\upsilon', \upsilon') = -1$ holds. Employing (8.53) we find $\upsilon^4 = \upsilon'^4 = 1$. Furthermore, $\hat g(\upsilon, u) = -u^4$ and $\hat g(\upsilon', u) = -u'^4$, such that one obtains, together with (8.42), $a = (u^4)^{-1}$ and $a' = (u'^4)^{-1}$. Thus, (8.41) yields
$$u_N = (u^4)^{-1} \sum_{j=1}^{3} u^j e_j, \qquad u'_N = (u'^4)^{-1} \sum_{j=1}^{3} u'^j e'_j. \tag{8.59}$$
Using these relations, we find that the components in (8.57) are given by
$$u^j_N = (u^4)^{-1} u^j \quad \text{and} \quad u'^j_N = (u'^4)^{-1} u'^j. \tag{8.60}$$
Plugging equation (8.56) into the second equation of (8.60) and multiplying the right-hand side with $1 = (u^4)^{-1} / (u^4)^{-1}$ one obtains
$$u'^j_N = \bigl(L^4{}_\alpha (u^4)^{-1} u^\alpha\bigr)^{-1} \bigl(L^j{}_\beta (u^4)^{-1} u^\beta\bigr). \tag{8.61}$$
Together with equation (8.60) this yields equation (8.58). The reverse can be proven in exactly the same manner.

We can interpret this result as follows. The observer $(p, \upsilon)$ is able to infer the measurement result $u'_N$ of the observer $(p, \upsilon')$ from its own measurement in case the Lorentz transformation between the observers is known.

The simplest example of equation (8.58) can be constructed by using a special Lorentz matrix $L$, i.e., if
$$L = S_\upsilon := \begin{pmatrix} k & 0 & 0 & -\upsilon k \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\upsilon k & 0 & 0 & k \end{pmatrix} \tag{8.62}$$
holds, where $k^{-1} = (1 - \upsilon^2)^{\frac{1}{2}}$. Using $l := 1 - \upsilon u^1_N$ one obtains the relation
$$u'^1_N = l^{-1}(u^1_N - \upsilon), \qquad u'^2_N = (kl)^{-1} u^2_N, \qquad u'^3_N = (kl)^{-1} u^3_N. \tag{8.63}$$
Equation (8.63) is called the theorem of the addition of velocities (see [20, p. 27]).

Let us conclude this section with a remark on the Newtonian acceleration $b_N$ of a world-curve $\gamma$ with respect to its own time parameter measured by an instant observer $(p, \upsilon)$ with $p = \gamma(\tau_0)$. We state the following definition.

Definition 8.26. Let $u(\tau) = \overset{\circ}{\gamma}(\tau)$ as defined in Proposition (8.6). Then
$$b_N = P\, \overset{\circ}{u}(\tau_0) = \sum_{j=1}^{3} \dot u^j(\tau_0)\, e_j. \tag{8.64}$$
Here $P$ is defined as in Proposition 8.21.
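The step from the general transformation (8.58) to the addition theorem (8.63) can be verified numerically. The sketch below is a minimal illustration: the function names are hypothetical, indices are 0-based (so `L[3][3]` stands for $L^4{}_4$), and the sample velocities are arbitrary choices with absolute value at most 1.

```python
import math

def special_lorentz_matrix(v):
    """The matrix S_v of (8.62) with k = (1 - v**2) ** -0.5."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [k, 0.0, 0.0, -v * k],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-v * k, 0.0, 0.0, k],
    ]

def transform_newtonian_velocity(L, u_N):
    """General rule (8.58) for the spatial components measured by the second observer."""
    denominator = sum(L[3][r] * u_N[r] for r in range(3)) + L[3][3]
    return tuple(
        (sum(L[j][n] * u_N[n] for n in range(3)) + L[j][3]) / denominator
        for j in range(3)
    )

def addition_theorem(v, u_N):
    """Closed form (8.63) for the special Lorentz matrix S_v."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    l = 1.0 - v * u_N[0]
    return ((u_N[0] - v) / l, u_N[1] / (k * l), u_N[2] / (k * l))
```

For a light pulse moving along the 1-axis, `addition_theorem(0.6, (1.0, 0.0, 0.0))` again yields a velocity of absolute value 1, in agreement with Conclusion 8.24(1).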
8.5 Radar charts in $\mathcal{M}_s$

The notion of a radar chart plays a fundamental role in relativistic spacetime theories. In this book we will discuss only one selected aspect of radar charts: their connection to Minkowski charts.

Definition 8.27. The radar coordinates of an event or a real-point are generated by an observer equipped with a clock and a device to measure the direction of incoming and outgoing light rays. This observer records the time of the emission and the return of a radar signal as well as the directions at its emission and its return. The numbers collected represent, after an appropriate conversion of units, the coordinates of the event or the real-point which reflected the outgoing light pulse.

In principle, the theoretical observer can employ an arbitrary number of radar experiments to explore the surroundings of its worldline and even the whole spacetime to obtain a complete map. In the following we will consider a Minkowski observer $\mu$ as well as a Minkowski chart $\psi$, which is generated from the chart $\varphi$ introduced in Section 8.1 by a homogeneous orthochronous Lorentz transformation. The observer $\mu$ is given by
$$\mu(\tau) = \sum_{j=1}^{3} y^j e_j + \tau e_4, \tag{8.65}$$
with $y^j = \hat g(\mu(\tau), e_j)$ and $\tau = -\hat g(\mu(\tau), e_4)$. Furthermore, $\overset{\circ}{\mu}(\tau) = e_4$, such that $\tau$ is the proper time of $\mu$. The event which shall be coordinatized is called $q$.
Proposition 8.28. The Minkowski observer $\mu$ can determine the Minkowski coordinates $x$ of any event $q \in M_s$ by radar observations, in case its proper time clock is used and its space coordinates $y^j$, $j = 1, 2, 3$, are given.

Proof. Let $x$ be the coordinates of $q$; then $q = x^\alpha e_\alpha$. Now $q$ shall be connected to the observer $\mu$ by incoming and outgoing light rays. Thus, the vector
$$q - \mu(\tau) = \sum_{j=1}^{3} (x^j - y^j) e_j + (x^4 - \tau) e_4 \tag{8.66}$$
must be lightlike. This implies
$$\Bigl(\sum_{j=1}^{3} (x^j - y^j)^2\Bigr)^{\frac{1}{2}} \pm (x^4 - \tau) = 0. \tag{8.67}$$
Calling $r = \bigl(\sum_{j=1}^{3} (x^j - y^j)^2\bigr)^{\frac{1}{2}}$ one can use the above equation to obtain $\tau$:
$$\tau_1 = x^4 - r \quad \text{and} \quad \tau_2 = x^4 + r. \tag{8.68}$$
Thus, $\tau_2 > \tau_1$, and
$$x^4 = \tfrac{1}{2}(\tau_2 + \tau_1), \qquad r = \tfrac{1}{2}(\tau_2 - \tau_1). \tag{8.69}$$
The radar signal consists of two light signals $\varrho_1$ and $\varrho_2$, with
$$\varrho_1(\sigma) = q + (q - \mu(\tau_1))\sigma, \qquad \varrho_2(\sigma) = q + (\mu(\tau_2) - q)\sigma, \tag{8.70}$$
such that $\varrho_1(-1) = \mu(\tau_1)$, $\varrho_2(1) = \mu(\tau_2)$ and $\varrho_1(0) = \varrho_2(0) = q$. The spatial directions of the signals $\varrho_1$ and $\varrho_2$ are, according to equation (8.70),
$$\varepsilon_1^j = \frac{1}{r}(x^j - y^j) \quad \text{and} \quad \varepsilon_2^j = \frac{1}{r}(y^j - x^j). \tag{8.71}$$
Furthermore,
$$r \sum_{j=1}^{3} \varepsilon_1^j e_j = \sum_{j=1}^{3} (x^j - y^j) e_j, \tag{8.72}$$
which implies for $j = 1, 2, 3$
$$x^j = y^j + r\varepsilon_1^j = y^j - r\varepsilon_2^j. \tag{8.73}$$
Thus, combined with equation (8.69), the Minkowski coordinates $x$ of $q$ are completely determined by radar data.

To obtain equations (8.69) and (8.73) the Minkowski chart $\psi$ was explicitly used. One may ask whether it is possible to drop this requirement, i.e., whether equations (8.69) and (8.73) are sufficient to generate a Minkowski chart. This is indeed the case. One only has to ensure that the observer who performs the radar experiment is a Minkowski observer. Questions of this kind belong to the heuristic preliminaries of special relativity and will not be discussed further here.
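The radar procedure of Proposition 8.28 amounts to a round trip between Minkowski coordinates and radar data. The sketch below illustrates it under assumptions: the observer rests at the spatial position `y` and uses its proper time clock, the event `x` is given by its four Minkowski coordinates, and the function names are hypothetical.

```python
import math

def radar_data(x, y):
    """Emission time, return time, and outgoing direction for the event x, see (8.68) and (8.71)."""
    r = math.sqrt(sum((x[j] - y[j]) ** 2 for j in range(3)))
    if r == 0:
        raise ValueError("the event lies on the observer's worldline")
    tau1 = x[3] - r  # emission of the outgoing light signal
    tau2 = x[3] + r  # return of the reflected light signal
    eps1 = tuple((x[j] - y[j]) / r for j in range(3))
    return tau1, tau2, eps1

def minkowski_coordinates(tau1, tau2, eps1, y):
    """Recover the coordinates of the event from radar data via (8.69) and (8.73)."""
    x4 = 0.5 * (tau2 + tau1)
    r = 0.5 * (tau2 - tau1)
    return tuple(y[j] + r * eps1[j] for j in range(3)) + (x4,)
```

For the event with coordinates `(3, 4, 0, 10)` and an observer at the spatial origin, the signal is emitted at proper time 5 and returns at proper time 15; feeding these data back into `minkowski_coordinates` reproduces the event up to rounding.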
8.6 Time dilation

Let $\psi$ be a Minkowski chart generated from the Minkowski chart $\varphi$ defined in Section 8.1 by a homogeneous and orthochronous Lorentz transformation, as in the previous section. Consider two observers $\mu$ and $\gamma$, where $\mu$ is a Minkowski observer with
$$\mu(\tau) = \sum_{j=1}^{3} y^j e_j + \tau e_4, \tag{8.74}$$
and $\gamma$ is an observer for which
$$\gamma(\tau) = \sum_{j=1}^{3} \bar\gamma^j(\tau) e_j + \tau e_4 \tag{8.75}$$
holds. Both observers use the coordinate time $x^4 = \tau$ as the time parameter. Furthermore, we assume the existence of two instants of time $\tau_1$ and $\tau_2$ with $\tau_1 < \tau_2$ and $\mu(\tau_1) = \gamma(\tau_1)$, $\mu(\tau_2) = \gamma(\tau_2)$. Additionally, the Newtonian velocity $\upsilon_N$ of $\gamma$ (see Definition 8.22) shall satisfy
$$\upsilon_N(\tau) = \Bigl(\sum_{j=1}^{3} (\dot{\bar\gamma}^j(\tau))^2\Bigr)^{\frac{1}{2}} \neq 0, \tag{8.76}$$
at least in a nonempty interval $I$ of $[\tau_1, \tau_2]$. According to equation (8.74) the proper time of $\mu$ is given by $\tau$. The proper time of $\gamma$ is denoted by $t$. It is defined by
$$t = \hat t(\tau) := \int_{0}^{\tau} \bigl(-\hat g(\overset{\circ}{\gamma}(\tau'), \overset{\circ}{\gamma}(\tau'))\bigr)^{\frac{1}{2}}\, d\tau'. \tag{8.77}$$
Pictorially, these assumptions mean that μ does not move with respect to the ψ-coordinates, while γ does, at least for some time interval.

Proposition 8.29. The following inequality holds:
$$t_2 - t_1 = \hat t(\tau_2) - \hat t(\tau_1) < \tau_2 - \tau_1. \tag{8.78}$$
Proof. From
$$\mathring\gamma(\tau) = \sum_{j=1}^{3} \dot{\bar\gamma}^j(\tau) e_j + e_4 \tag{8.79}$$
we find
$$\hat g(\mathring\gamma(\tau), \mathring\gamma(\tau)) = \sum_{j=1}^{3} \big(\dot{\bar\gamma}^j(\tau)\big)^2 - 1 = \upsilon_N^2(\tau) - 1. \tag{8.80}$$
Since $\upsilon_N(t) > 0$ holds in the interval $I \subset [\tau_1, \tau_2]$ we obtain
$$\hat t(\tau_2) - \hat t(\tau_1) = \int_{\tau_1}^{\tau_2} \big(1 - \upsilon_N(t)^2\big)^{1/2}\, dt < \int_{\tau_1}^{\tau_2} dt = \tau_2 - \tau_1. \tag{8.81}$$
Thus, the proposition is proven.
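The integral in equation (8.81) can be evaluated numerically for a concrete round trip. The following is a minimal sketch under stated assumptions: the midpoint rule, the function names, and the velocity profile (a traveller oscillating along the 1-axis with maximal speed 0.8 and returning after coordinate time T) are illustrative choices, with the speed of light set to 1.

```python
import math


def proper_time(speed, tau1, tau2, steps=10_000):
    """Midpoint-rule approximation of the integral of (1 - v_N^2)^(1/2), cf. (8.81)."""
    h = (tau2 - tau1) / steps
    total = 0.0
    for i in range(steps):
        tau = tau1 + (i + 0.5) * h
        total += math.sqrt(1.0 - speed(tau) ** 2) * h
    return total


T = 10.0


def round_trip_speed(tau):
    """Newtonian speed of a traveller with position (0.8 T / pi) sin(pi tau / T) on the 1-axis."""
    return abs(0.8 * math.cos(math.pi * tau / T))


elapsed_traveller = proper_time(round_trip_speed, 0.0, T)   # t2 - t1
elapsed_home = T                                            # tau2 - tau1
```

For a constant speed of 0.6 the same function returns 0.8 (τ2 − τ1), in agreement with the closed-form value of the integral.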
The effect described by equation (8.78) is called time dilation, first clock effect, or twin paradox. The picture behind the last of these names describes the cause of equation (8.78) quite precisely. Imagine that the observers μ and γ are twins. Twin γ starts a cosmic journey in a fast spaceship at time τ1, while twin μ stays at home and does not move in his coordinate system. Both twins meet again at time τ2. Thus, in coordinate time, which is the proper time of twin μ, the twins have aged by τ2 − τ1 units of time. Now twin γ carries a second clock on the journey, which shows his proper time. According to this clock the time t2 − t1 has passed between leaving home and coming back. Thus, in terms of proper time, twin γ is younger than twin μ, since t2 − t1 < τ2 − τ1.

In this context one often asks: “How old is γ really?” This question can only be answered by establishing a connection between real clocks and the proper time of the observers. By heuristic considerations and the Haefele–Keating experiment (see [23]), the following assumption can be formulated: Exactly identical atomic clocks can be scaled in such a way that they measure the proper time of an observer, defined according to Definition 8.14, up to a factor. By rescaling it is possible to set the factor to 1.

It is more difficult to explain the biological aspect of aging in this context, since the correlation between proper time and the typical causes of certain phases of life is loose. This means that the journey leaves twin γ younger than twin μ; however, this mechanism cannot be seen as a fountain of youth or a beauty holiday.

The question of the realization of arbitrary clocks of an observer, posed in Conclusion 8.17(2), can now be answered in terms of the assumption stated above. All clocks can be realized by atomic clocks with an appropriate scale. The same holds for any kind of clock that an observer can construct in a way similar to the atomic clocks.
8.7 Length contraction
The notion of length contraction, or Lorentz contraction, emerged before special relativity was formulated and describes the hypothesis that the length of a solid body moving relative to the aether is smaller than its length when it is at rest relative to the aether. This hypothesis was originally formulated by Lorentz and FitzGerald (see [24] and [25]). Since in special relativity aether and absolute space do not exist, this Lorentz–FitzGerald hypothesis cannot be formulated. Thus, if there is something like a length contraction in special relativity, it must be connected to the motion of observers and reference systems. We formulate the following definition.

Definition 8.30. Consider two inertial systems MR and MR′, which are determined by the Minkowski charts φ and φ′, and a solid rod Σ. We assume that Σ is at rest in MR and that MR′ moves with respect to MR.
Moreover, two measurements are performed: first, the length ℓ of Σ is determined in MR, and second, the length ℓ′ of Σ in MR′. If the result is ℓ′ < ℓ, this effect is called length contraction.

To be able to decide whether there is a length contraction or not, the measurement process must be discussed in further detail.

Measurement process 1. Let μ be a Minkowski observer in the inertial system MR, defined by the Minkowski chart φ. We assume that Σ is at rest relative to MR and that μ knows this, for example by performing a radar experiment. Thus, by observing the endpoints of Σ, μ knows the spacetime coordinates $x_1$, $x_2$ of these ends, i.e.,
$$x_1 = (x_1^1, x_1^2, x_1^3, \sigma_1)^T \quad\text{and}\quad x_2 = (x_2^1, x_2^2, x_2^3, \sigma_2)^T, \tag{8.82}$$
where $\sigma_1$ and $\sigma_2$ can in principle be different. Since Σ is at rest in MR, the coordinates $x_1^j$, $x_2^j$, $j = 1, 2, 3$, are the same for all values of $\sigma_1$, $\sigma_2$, and ℓ is determined by
$$\ell = \Big(\sum_{j=1}^{3} (x_1^j - x_2^j)^2\Big)^{1/2}. \tag{8.83}$$
Measurement process 2. To measure the length of Σ from MR′, all Minkowski observers in MR′ are needed. Moreover, it is assumed that they know their spacetime position in MR′, i.e., their coordinates in the Minkowski chart φ′. The observers in MR′ agree on one time parameter τ at which every observer determines whether an end of Σ coincides with his position or not. Thus, there exist exactly two such coincidence observers. Their coordinates in the chart φ′ are
$$x'_1 = (x'^{1}_{1}, x'^{2}_{1}, x'^{3}_{1}, \tau)^T \quad\text{and}\quad x'_2 = (x'^{1}_{2}, x'^{2}_{2}, x'^{3}_{2}, \tau)^T. \tag{8.84}$$
Thus,
$$\ell' = \Big(\sum_{j=1}^{3} (x'^{j}_{1} - x'^{j}_{2})^2\Big)^{1/2}. \tag{8.85}$$
Remark 8.31. It is important to stress that measurement process 2 cannot be performed by one observer alone, although it is quite often described that way in the literature. For the following considerations, observe also that measurement process 2 is not restricted to the measurement of a moving rod.

To investigate the length contraction in special relativity the following observation is of importance.

Lemma 8.32. As above, let $x_1$, $x_2$ be the φ-coordinates which label the ends of Σ according to measurement process 1, and let $x'_1$, $x'_2$ be the φ′-coordinates which label the ends of Σ according to measurement process 2. The relation between these coordinates is given by
$$x'^{j}_{\varrho} = L^j_4 (L^4_4)^{-1}(\tau - z^4) + z^j + \sum_{r=1}^{3} \big(L^j_r - (L^4_4)^{-1} L^j_4 L^4_r\big) x^r_\varrho, \tag{8.86}$$
with $j = 1, 2, 3$ and $\varrho = 1, 2$, as well as $(L, z) = \varphi' \circ \varphi^{-1}$.
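Proof. By assumption we have
$$x'_\varrho = L \cdot x_\varrho + z, \qquad \varrho = 1, 2, \tag{8.87}$$
and thus, using equation (8.84),
$$\tau = L^4_\alpha x_1^\alpha + z^4 = L^4_\beta x_2^\beta + z^4 \tag{8.88}$$
holds. This equation can be solved for $x_1^4$ and $x_2^4$:
$$\tau - z^4 - \sum_{r=1}^{3} L^4_r x^r_\varrho = L^4_4 x^4_\varrho, \qquad \varrho = 1, 2, \tag{8.89}$$
such that
$$x^4_\varrho = (L^4_4)^{-1}\Big(\tau - z^4 - \sum_{r=1}^{3} L^4_r x^r_\varrho\Big), \qquad \varrho = 1, 2. \tag{8.90}$$
Combining equations (8.87) and (8.89) ($\varrho = 1, 2$; $j = 1, 2, 3$) yields
$$x'^{j}_{\varrho} = \sum_{r=1}^{3} L^j_r x^r_\varrho + L^j_4 x^4_\varrho + z^j = \sum_{r=1}^{3} L^j_r x^r_\varrho + L^j_4 (L^4_4)^{-1}\Big(\tau - z^4 - \sum_{r=1}^{3} L^4_r x^r_\varrho\Big) + z^j. \tag{8.91}$$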
Thus, equation (8.86) holds.

Conclusion 8.33. (1) Using equation (8.86) one obtains
$$x'^{j}_{1} - x'^{j}_{2} = \sum_{r=1}^{3} \big(L^j_r - (L^4_4)^{-1} L^j_4 L^4_r\big)(x_1^r - x_2^r), \tag{8.92}$$
which enables us to determine ℓ′ depending on $x_1^r - x_2^r$, $r = 1, 2, 3$, and the Lorentz matrix $L$. This means that ℓ′ depends not only on ℓ and the Lorentz matrix but also on the direction of the rod Σ in the coordinate system φ.

(2) It is convenient to reformulate equation (8.92) with a different notation as follows. Let $K$ be a 3 × 3 matrix with elements
$$K^j_r = L^j_r - (L^4_4)^{-1} L^j_4 L^4_r, \tag{8.93}$$
and let
$$a = (a^1, a^2, a^3)^T, \qquad b = (b^1, b^2, b^3)^T \tag{8.94}$$
with components
$$a^j = \ell^{-1}(x_1^j - x_2^j), \qquad b^j = \ell'^{-1}(x'^{j}_{1} - x'^{j}_{2}), \tag{8.95}$$
so that $a^T \cdot a = b^T \cdot b = 1$. Hence $a$ and $b$ are the directions of the rod Σ in MR and in MR′. Thus, equation (8.92) becomes
$$\ell' b^j = \ell \sum_{r=1}^{3} K^j_r a^r, \tag{8.96}$$
or, even more compactly,
$$\ell' b = \ell\, K \cdot a. \tag{8.97}$$
The relation between ℓ′ and ℓ is given by
$$\ell'^2 = \ell^2\, a^T \cdot K^T \cdot K \cdot a. \tag{8.98}$$
The meaning of this relation can be best understood by considering the following two examples.

Example 1. We consider the case
$$L = S_\upsilon = \begin{pmatrix} k & 0 & 0 & -\upsilon k \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\upsilon k & 0 & 0 & k \end{pmatrix}, \tag{8.99}$$
with $k^{-1} = (1 - \upsilon^2)^{1/2}$, from which we find for $j, r = 1, 2, 3$
$$L^j_r = k\,\delta^j_1\delta^1_r + \delta^j_2\delta^2_r + \delta^j_3\delta^3_r, \qquad L^j_4 = -\upsilon k\,\delta^j_1, \qquad L^4_r = -\upsilon k\,\delta^1_r, \qquad L^4_4 = k. \tag{8.100}$$
Plugging equation (8.100) into (8.93) yields $(K_\upsilon)^j_r = k^{-1}\delta^j_1\delta^1_r + \delta^j_2\delta^2_r + \delta^j_3\delta^3_r$. Thus,
$$K_\upsilon = \begin{pmatrix} k^{-1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = K_\upsilon^T \tag{8.101}$$
and
$$a^T \cdot K_\upsilon^T \cdot K_\upsilon \cdot a = 1 - \upsilon^2 (a^1)^2. \tag{8.102}$$
Finally, the result is
$$\ell' = \big(1 - (a^1\upsilon)^2\big)^{1/2}\,\ell. \tag{8.103}$$
For υ = 0 both lengths coincide, ℓ′ = ℓ, i.e., there is no length contraction. This is immediately clear since in this case $L = 1_4$. In the case υ ≠ 0 a length contraction exists if $a^1 \neq 0$. In the case $a^1 = 1$ the rod Σ moves along the 1-direction relative to MR′ and ℓ′ < ℓ. For $a^1 = 0$, i.e., if Σ is oriented orthogonal to the direction of motion, there is no length contraction. From equations (8.97) and (8.103) we can determine the direction $b$ of Σ in MR′:
$$b = \big(1 - (a^1\upsilon)^2\big)^{-1/2}\,(k^{-1}a^1, a^2, a^3)^T. \tag{8.104}$$
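Example 1 can be checked numerically. The sketch below builds the matrix $K$ of equation (8.93) from the boost (8.99) and evaluates ℓ′ via (8.98); the function names and the sample values are illustrative assumptions. List index 3 plays the role of the index 4 of the text, and the speed of light is 1.

```python
import math


def boost(v):
    """Lorentz matrix S_v of (8.99); the last row and column belong to time."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0.0, 0.0, -v * k],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-v * k, 0.0, 0.0, k]]


def contraction_matrix(L):
    """3x3 matrix K of (8.93): K[j][r] = L[j][r] - L[j][3] * L[3][r] / L[3][3]."""
    return [[L[j][r] - L[j][3] * L[3][r] / L[3][3] for r in range(3)]
            for j in range(3)]


def measured_length(L, rest_length, a):
    """Length l' from (8.98) for a rod of rest length l with unit direction a."""
    K = contraction_matrix(L)
    Ka = [sum(K[j][r] * a[r] for r in range(3)) for j in range(3)]
    return rest_length * math.sqrt(sum(c * c for c in Ka))


v = 0.6
length_along = measured_length(boost(v), 1.0, (1.0, 0.0, 0.0))    # rod along the motion
length_across = measured_length(boost(v), 1.0, (0.0, 1.0, 0.0))   # rod orthogonal to it
length_oblique = measured_length(boost(v), 1.0, (0.6, 0.8, 0.0))
```

The three values agree with equation (8.103): the rod along the direction of motion is shortened by the factor $(1 - \upsilon^2)^{1/2} = 0.8$, the orthogonal rod keeps its length.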
Example 2. Using the results of Example 1 it is possible to discuss the general case, i.e.,
$$L = B \cdot S_\upsilon \cdot A, \tag{8.105}$$
with
$$B = \begin{pmatrix} P & 0_3^T \\ 0_3 & 1 \end{pmatrix} \quad\text{and}\quad A = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}, \tag{8.106}$$
where $P$ and $Q$ are orthogonal matrices and $0_3 := (0, 0, 0)$. By a simple calculation we can determine $K$. For $j, r = 1, 2, 3$ we find
$$L^j_r = k P^j_1 Q^1_r + P^j_2 Q^2_r + P^j_3 Q^3_r, \qquad L^j_4 = -\upsilon k P^j_1, \qquad L^4_r = -\upsilon k Q^1_r, \qquad L^4_4 = k. \tag{8.107}$$
This and equation (8.93) yield
$$K^j_r = k^{-1} P^j_1 Q^1_r + P^j_2 Q^2_r + P^j_3 Q^3_r. \tag{8.108}$$
In the next step we consider the matrix $P \cdot K_\upsilon \cdot Q$. Again its elements can easily be obtained: $(P \cdot K_\upsilon \cdot Q)^j_r = K^j_r$, so we can write
$$K = P \cdot K_\upsilon \cdot Q. \tag{8.109}$$
Plugging equation (8.109) into (8.98) gives
$$\ell'^2 = \ell^2\, (Q \cdot a)^T \cdot K_\upsilon^2 \cdot (Q \cdot a) \tag{8.110}$$
and, as in equations (8.102) and (8.103), the desired equation
$$\ell' = \big(1 - ((Q \cdot a)^1 \upsilon)^2\big)^{1/2}\,\ell, \tag{8.111}$$
since
$$a^T \cdot Q^T \cdot Q \cdot a = a^T \cdot a = 1. \tag{8.112}$$
The special cases can be discussed analogously to Example 1. One simply replaces $a^1$ by $(Q \cdot a)^1$. Again, for υ = 0 equation (8.111) implies ℓ′ = ℓ, i.e., there is no length contraction. Thus, only if there is relative motion between MR and MR′ do the measurement processes 1 and 2 lead to a length contraction. For υ ≠ 0 there always exists a length contraction if $(Q \cdot a)^1 \neq 0$. This means that the rod is longest in the system of reference where it is at rest. The length contraction effect emerges since the length of the rod Σ is measured with two different measurement processes. Thus, one cannot say that the rod at rest in MR has shrunk if looked at from MR′.

Finally, we can deduce the direction $b$ of the rod Σ with respect to MR′ from its direction $a$ in MR. Using equations (8.97) and (8.111) we find
$$b = \big(1 - ((Q \cdot a)^1 \upsilon)^2\big)^{-1/2}\, P \cdot K_\upsilon \cdot Q \cdot a. \tag{8.113}$$
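The factorization (8.109) and the contraction formula (8.111) of Example 2 can also be verified numerically. In the following sketch the rotations about the 3-axis, the angles, and all function names are illustrative assumptions; list index 3 again stands for the index 4 of the text.

```python
import math


def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def rotation_z(angle):
    """Orthogonal 3x3 matrix: rotation about the 3-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def embed(R):
    """Block matrix of (8.106): R acts on space, time is left unchanged."""
    return [row + [0.0] for row in R] + [[0.0, 0.0, 0.0, 1.0]]


def boost(v):
    """Lorentz matrix S_v of (8.99)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0.0, 0.0, -v * k],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-v * k, 0.0, 0.0, k]]


def contraction_matrix(L):
    """3x3 matrix K of (8.93)."""
    return [[L[j][r] - L[j][3] * L[3][r] / L[3][3] for r in range(3)]
            for j in range(3)]


v = 0.6
P, Q = rotation_z(0.3), rotation_z(1.1)
L = matmul(matmul(embed(P), boost(v)), embed(Q))       # (8.105)
K = contraction_matrix(L)                              # (8.93)

k_inv = math.sqrt(1.0 - v * v)
K_v = [[k_inv, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K_expected = matmul(matmul(P, K_v), Q)                 # right-hand side of (8.109)

a = [0.0, 1.0, 0.0]                                    # unit direction of the rod in MR
Qa = [sum(Q[j][r] * a[r] for r in range(3)) for j in range(3)]
Ka = [sum(K[j][r] * a[r] for r in range(3)) for j in range(3)]
ratio_direct = math.sqrt(sum(c * c for c in Ka))       # l'/l from (8.98)
ratio_formula = math.sqrt(1.0 - (Qa[0] * v) ** 2)      # l'/l from (8.111)
```

Only $Q$ enters the contraction factor; the rotation $P$ changes the direction $b$ of the rod in MR′ but not its measured length.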
In general the length contraction of extended objects, i.e., the contraction of their images in the direction of motion, is not always visible. For large objects, straight lines which are not parallel to the direction of motion appear curved. In a special case the images of small objects are compressed relatively to their image at rest. This compression manifests itself as visible length contraction. A detailed discussion of the visibility and invisibility of the Lorentz contraction can be found in [26].
8.8 Aberration of light
Two observers in relative motion to each other observe a distant glowing object in the sky, here simply called a star, in different directions. In other words: the angles under which the observers see the light of the star relative to a fixed axis are different. This phenomenon is called aberration of light and was experimentally detected before it was theoretically explained. Using Minkowski spacetime kinematics in $(M_s, +, \cdot, \hat g)$ we can model this effect as follows. The star is the source of light rays. In particular, it emits a bundle of parallel rays, which are studied by two instant observers (p, υ) and (p′, υ′). Here the events p and p′ need not coincide. Each observer measures the Newtonian velocity of light for one ray out of the bundle sent from the star. We seek conditions that determine whether the velocity vectors the observers measure are identical or not. Our investigation is performed using the following three assumptions.

Assumptions. (1) There exists a Minkowski chart ψ in which the star is at rest. It is described mathematically as a set of Minkowski observers, as presented in equation (8.22), for which the set $S$, containing the points $\tilde y = (y_0^1, y_0^2, y_0^3)^T$ with $y_0^j \in \mathbb{R}$, determines the spatial form of the star. In the simplest case $S$ is a ball. Moreover, we assume that the star is not evolving after the time $\tau_0$. Thus, for all $\tilde y \in S$ and every $\tau > \tau_0$ the star in $M_s$ is described by
$$\mu_S(\tilde y, \tau) = \psi^{-1}(y_0^1, y_0^2, y_0^3, \tau). \tag{8.114}$$
(2) All points of the surface $F$ of the star $S$ emit light pulses in all directions for all times $\tau > \tau_0$. According to equation (8.15) these pulses are described by
$$\gamma_u(\tilde y, \tau, \lambda) = (\lambda - \tau)u + r, \tag{8.115}$$
where $u$ is lightlike and $\lambda \geq \tau$, as well as
$$r = \psi^{-1}(\tilde y, \tau) \quad\text{and}\quad \tilde y \in F. \tag{8.116}$$
(3) Given Minkowski charts φ and φ′ such that their composition φ−1 ∘ φ = (L, 0) yields a proper orthochronous Lorentz matrix L. Then the two instant observers
$$(p, \upsilon) \quad\text{and}\quad (p', \upsilon') \tag{8.117}$$
belong to the Minkowski observers of the charts φ and φ′ which use their proper time clocks.

From these assumptions we can deduce the following.
Conclusion 8.34. (1) Let $F_u$ be the part of the surface of the star $S$ which emits light pulses with one specific velocity $u$. The ray bundle $B_u$ of interest then is
$$B_u = \{q \in M_s : q = \gamma_u(\tilde y, \tau, \lambda) \text{ for all } \tilde y \in F_u \text{ and } \lambda \geq \tau \geq \tau_0\}. \tag{8.118}$$
(2) According to Assumption 3 the two instant observers (p, υ) and (p′, υ′) belong to Minkowski observers μ and μ′ with proper time clocks, which are determined by Minkowski charts φ and φ′. Employing Definition 8.10 and Conclusion 8.15(1) (both with a = 1), μ is given in φ-coordinates by
$$\bar\mu^\alpha(\sigma) = y_0^\beta \delta^\alpha_\beta + \sigma\delta^\alpha_4 \tag{8.119}$$
and μ′ in the φ′-coordinates by
$$\bar\mu'^{\alpha}(\sigma') = y_0'^{\beta} \delta^\alpha_\beta + \sigma'\delta^\alpha_4. \tag{8.120}$$
Introducing $\bar\mu(\sigma) = (\bar\mu^1(\sigma), \ldots, \bar\mu^4(\sigma))^T$ and $\bar\mu'(\sigma') = (\bar\mu'^{1}(\sigma'), \ldots, \bar\mu'^{4}(\sigma'))^T$, one writes the Minkowski observers μ and μ′ with respect to the bases $\hat e = (e_1, \ldots, e_4)$ and $\hat e' = (e'_1, \ldots, e'_4)$ according to Conclusion 5.3(3) as
$$\mu(\sigma) = \varphi^{-1}(\bar\mu(\sigma)) = \sigma e_4 + q, \qquad \mu'(\sigma') = \varphi'^{-1}(\bar\mu'(\sigma')) = \sigma' e'_4 + q', \tag{8.121}$$
where $q = y_0^\alpha e_\alpha$ and $q' = y_0'^{\beta} e'_\beta$. Thus, we obtain for the instant observers
$$(p, \upsilon) = (\mu(\sigma_0), e_4), \qquad (p', \upsilon') = (\mu'(\sigma'_0), e'_4) \tag{8.122}$$
at the time parameters $\sigma_0$, $\sigma'_0$ at which the Newtonian speed of light is measured.

The points just discussed prepare the ground for the following.

Proposition 8.35. Given a bundle of light rays $B_u$ according to equation (8.118) and two instant observers (p, υ) and (p′, υ′) according to equation (8.122) with $p, p' \in B_u$. Let both observers measure the Newtonian velocity vectors $u_N$ and $u'_N$ of the light rays in $B_u$ with respect to their clocks. Then
$$u_N = u'_N \tag{8.123}$$
holds if and only if υ = υ′.

Proof. (1) Since both observers use their proper time clocks, the norm of both vectors $u_N$ and $u'_N$ is equal to 1 according to equation (8.51). Thus, $u_N$ and $u'_N$ are the unit vectors pointing into the direction of the light rays in $B_u$ which the observers measure.
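(2) Following Proposition 8.23 the measured unit vectors are
$$u_N = aPu \quad\text{and}\quad u'_N = a'P'u, \tag{8.124}$$
with
$$a = \hat g(\upsilon, \upsilon)\,\hat g(\upsilon, u)^{-1}, \qquad a' = \hat g(\upsilon', \upsilon')\,\hat g(\upsilon', u)^{-1} \tag{8.125}$$
and, moreover, according to equations (8.36) and (8.37)
$$P = \sum_{j=1}^{3} e_j \otimes \Theta^j \quad\text{and}\quad P' = \sum_{n=1}^{3} e'_n \otimes \Theta'^{n}. \tag{8.126}$$
Since $\upsilon = e_4$ and $\upsilon' = e'_4$ as well as
$$u = u^\alpha e_\alpha = u'^{\beta} e'_\beta, \tag{8.127}$$
one obtains
$$a = (u^4)^{-1} \quad\text{and}\quad a' = (u'^{4})^{-1}. \tag{8.128}$$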
(3) Both $u_N$ and $u'_N$ have length one, and thus $u_N \neq u'_N$ means that they point in different directions. Thus, equation (8.124) implies $Pu \neq P'u$ and so $P \neq P'$, which means $\upsilon^\perp \neq \upsilon'^{\perp}$ (for the definition of ⊥ see Definition 9.31). Corollary 9.33 then gives υ ≠ υ′.
(4) The other way around, if we assume $\upsilon = e_4 \neq \upsilon' = e'_4$, then $\upsilon^\perp \neq \upsilon'^{\perp}$, and so $P \neq P'$, and thus $Pu \neq P'u$, and so $u_N \neq u'_N$. This proves Proposition 8.35.

Conclusion 8.36. (1) Both observers (p, υ) and (p′, υ′) are inside the light ray bundle $B_u$, i.e., $p, p' \in B_u$. Thus, they measure the same directions $u_N$ and $u'_N$ of light rays in $B_u$ only if they have the same velocities υ and υ′. It is, however, not necessary that p = p′. In experiments to measure the aberration of light the inequality p ≠ p′ is used.
(2) The Newtonian spaces which the observers experience are $\upsilon^\perp$ and $\upsilon'^{\perp}$. Thus, they are different for υ ≠ υ′.

For practical purposes it is convenient to use the angle between $u_N$ and $u'_N$ instead of the directions themselves.

Definition 8.37. Given $u_N$ and $u'_N$. The angle between these two directions is
$$\delta := \sphericalangle(u_N, u'_N) = \arccos \hat g(u_N, u'_N). \tag{8.129}$$
The angle δ is called difference angle of the aberration.

Conclusion 8.38. Using the bases $\hat e$ and $\hat e'$ introduced in Conclusion 8.24(2), which are themselves defined by Minkowski charts φ and φ′, one obtains
$$\hat g(u_N, u'_N) = \sum_{j,n=1}^{3} u_N^j\, u_N'^{n}\, \hat g(e_j, e'_n). \tag{8.130}$$
By Assumption 3 φ−1 ∘ φ = (L, 0). Thus, by equation (5.15), $e'_n = L^\beta_n e_\beta$ and $\hat g(e_j, e'_n) = L^j_n$ imply
$$\hat g(u_N, u'_N) = \sum_{j,n=1}^{3} u_N^j\, u_N'^{n}\, L^j_n. \tag{8.131}$$
In Proposition 8.25 we showed that the components $u_N'^{n}$ of $u'_N$ are determined by $u_N$ and $L$. The proof carries over to the case p ≠ p′, which is the case we want to consider here. Plugging equation (8.58) into (8.131) yields
$$\hat g(u_N, u'_N) = \Big(\sum_{r=1}^{3} L^4_r u_N^r + L^4_4\Big)^{-1} \sum_{j,k=1}^{3} \Big(\sum_{n=1}^{3} L^k_j L^j_n u_N^k u_N^n + u_N^k L^k_j L^j_4\Big). \tag{8.132}$$
This result shows that an observer (p, υ) can determine the difference angle between the direction he measured and the direction the observer (p′, υ′) measures, if the Lorentz matrix L or the velocity υ′ is known. This can be seen from the proof of Proposition 9.29 in Chapter 9.
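The dependence of the measured light direction on the observer's velocity can be made concrete with a small numerical sketch. It rests on two labelled assumptions that are not taken from the text: the components of the lightlike vector $u$ are transformed with the boost $S_\upsilon$ of equation (8.99), and the direction is described by its angle to the 1-axis. The Newtonian velocity components are formed as $u^j/u^4$, in line with $a = (u^4)^{-1}$ in equation (8.128). All function names are illustrative; list index 3 stands for the index 4 of the text.

```python
import math


def boost(v):
    """Lorentz matrix S_v of (8.99)."""
    k = 1.0 / math.sqrt(1.0 - v * v)
    return [[k, 0.0, 0.0, -v * k],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-v * k, 0.0, 0.0, k]]


def apply(L, u):
    """Matrix-vector product for a 4x4 matrix and a 4-component vector."""
    return [sum(L[i][k] * u[k] for k in range(4)) for i in range(4)]


def newtonian_velocity(u):
    """Spatial components u^j / u^4 of a lightlike vector, cf. (8.128)."""
    return [u[j] / u[3] for j in range(3)]


theta = math.radians(60.0)
u = [math.cos(theta), math.sin(theta), 0.0, 1.0]   # lightlike: spatial norm equals u^4
v = 0.5

u_prime = apply(boost(v), u)                       # assumed transformation of the components
u_N = newtonian_velocity(u)
u_N_prime = newtonian_velocity(u_prime)

theta_prime = math.atan2(u_N_prime[1], u_N_prime[0])
norm_prime = math.sqrt(sum(c * c for c in u_N_prime))
```

Both Newtonian velocity vectors have norm 1, as stated in part (1) of the proof of Proposition 8.35, while the angles to the 1-axis differ. Under the stated assumptions the new angle satisfies the familiar relation cos θ′ = (cos θ − υ)/(1 − υ cos θ).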
9 Some basic notions of relativistic theories
9.1 Manifolds
In the Introduction it was already pointed out that the notion of spacetime is fundamental for all of physics, and that the mathematical notion of a spacetime is that of a Lorentzian manifold, which will be discussed in detail in Section 9.7. In this chapter we will discuss the basic definition of a differentiable manifold on which all considerations on spacetime are based.

Definition 9.1. (1) Let M be a set, U ⊂ M and
$$\varphi : U \to \mathbb{R}^n \tag{9.1}$$
be injective such that $\varphi[U] \subset \mathbb{R}^n$ is an open set. The pair (U, φ) is called a chart on M. For p ∈ U and φ(p) = x, the components $x^\alpha$, α = 1, . . . , n, of x are called coordinates of p with respect to the chart (U, φ). Sometimes one simply calls φ alone a “chart”.
(2) Let A be the set of all charts $(U_\sigma, \varphi_\sigma)$ on M, labeled by σ ∈ I, such that
$$\bigcup_{\sigma \in I} U_\sigma = M. \tag{9.2}$$
A is called a $C^k$-atlas on M if for σ ≠ σ′ and
$$W_{\sigma\sigma'} := U_\sigma \cap U_{\sigma'} \neq \emptyset \tag{9.3}$$
the sets $\varphi_\sigma[W_{\sigma\sigma'}]$, $\varphi_{\sigma'}[W_{\sigma\sigma'}]$ are open and the functions
$$\phi_{\sigma'\sigma} := \varphi_{\sigma'} \circ \varphi_\sigma^{-1} : \varphi_\sigma[W_{\sigma\sigma'}] \to \varphi_{\sigma'}[W_{\sigma\sigma'}] \tag{9.4}$$
are in $C^k$, k ≥ 1. The charts in A are called $C^k$-compatible.
(3) The pair (M, A) is called an n-dimensional $C^k$-manifold, or simply a differentiable manifold. As shorthand notation one often writes (M, A) =: M.

In case M = (M, A) is a differentiable manifold, it may be possible to define further atlases A′, A″ on M. Atlases with the following properties are of interest.

Definition 9.2. (1) The $C^k$-atlases $A_1$ and $A_2$ on M are called $C^k$-compatible if $A = A_1 \cup A_2$ is a $C^k$-atlas on M.
(2) The union $D = \bigcup_\varrho A_\varrho$ of all $C^k$-compatible atlases on M is called complete atlas, or differentiable structure, on M.

For several applications of manifolds it is important that M carries a topological structure in addition to its differentiable structure. These two structures can be related since the differentiable structure determines a topology on M.
DOI 10.1515/9783110485738-010
Proposition 9.3. Let D be a differentiable structure on M. Then
$$B = \{U : (U, \varphi) \in D\} \tag{9.5}$$
is the basis of a topology T on M, i.e., all O ∈ T are unions of U ∈ B.

Proof. Let $U_1$ and $U_2$ be elements of B. Then there exist two charts $(U_1, \varphi_1)$ and $(U_2, \varphi_2)$ in D. Thus, $\varphi_1$ and $\varphi_2$ are $C^k$-compatible. Moreover, let $U_1 \cap U_2 = U_3 \neq \emptyset$. Then, by the definition of the atlases, the sets $\varphi_1[U_3]$ and $\varphi_2[U_3]$ are open in $\mathbb{R}^n$. The restrictions of $\varphi_1$ and $\varphi_2$ onto $U_3$ are called $\psi_1$ and $\psi_2$. Then $(U_3, \psi_1)$ and $(U_3, \psi_2)$ are charts on M. Thus, both are in D and so $U_3 \in B$. This proves the proposition (see for example [27, p. 30, Theorem 11] or [28, p. 26, Theorem 3.50]).

In principle, this result allows one to test specific topological properties of a manifold M.
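The notions of chart and transition function in Definition 9.1 can be illustrated with a standard example that does not appear in the text: the unit circle with its two stereographic charts. The following sketch, with illustrative function names, evaluates the transition function (9.4) numerically; on the overlap it is the map t ↦ 1/t, which is smooth for t ≠ 0.

```python
import math


def chart_north(p):
    """Stereographic chart from the point (0, 1); defined for y != 1."""
    x, y = p
    return x / (1.0 - y)


def chart_south(p):
    """Stereographic chart from the point (0, -1); defined for y != -1."""
    x, y = p
    return x / (1.0 + y)


def chart_north_inverse(t):
    """Inverse of chart_north: maps a real number to a point of the circle."""
    d = t * t + 1.0
    return (2.0 * t / d, (t * t - 1.0) / d)


def transition(t):
    """chart_south composed with the inverse of chart_north, cf. (9.4); needs t != 0."""
    return chart_south(chart_north_inverse(t))


samples = [-3.0, -0.5, 0.25, 2.0]
transition_values = [transition(t) for t in samples]
```

The two chart domains cover the circle, and the transition function is differentiable on the image of the overlap, so the two charts form an atlas in the sense of Definition 9.1(2).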
9.2 Tangent vectors
In this section we will discuss some properties of any n-dimensional $C^k$-manifold M = (M, A) with k ≥ 1. The first aim is to define tangent vectors as operators which correspond to directional derivatives of a function f along a curve γ at a point p. This heuristic idea can be made precise:

Definition 9.4. (1) Let I ⊂ ℝ be an open interval and γ : I → M be a function such that for every chart (U, χ) ∈ A with γ[I] ∩ U ≠ ∅ the function $\bar\gamma = \chi \circ \gamma$ is in $C^r$, r ≤ k. Then γ is called a $C^r$-curve in M.
(2) Let W ⊂ M be an open set and let f : W → ℝ be a function such that for every chart (U, ψ) ∈ A with U ∩ W ≠ ∅ the function $\bar f = f \circ \psi^{-1}$ is in $C^r$, r ≤ k. Then f is called a $C^r$-function on M.
(3) For p ∈ M, the set of all $C^r$-curves with 1 ≤ r ≤ k for which a $\sigma_0 \in \operatorname{dom} \gamma$ with $\gamma(\sigma_0) = p$ exists is called $K_p$. The set of all $C^r$-functions with 1 ≤ r ≤ k which are defined in a neighborhood of p is called $F_p$.
(4) Consider $\gamma \in K_p$, $f \in F_p$ and the operator $\dot\gamma(\sigma_0)$ defined on $F_p$ by
$$\dot\gamma(\sigma_0)(f) = \frac{d}{d\sigma}(f \circ \gamma)(\sigma_0) \in \mathbb{R}, \tag{9.6}$$
which is often called $\upsilon_p$. Then we call $\upsilon_p = \dot\gamma(\sigma_0)$ a tangent vector in p. If for a tangent vector $\upsilon_p$ it is clear to which point p it belongs, the index p may be dropped.

In the ongoing discussion it is convenient to use the abbreviations $\bar f = f \circ \varphi^{-1}$ and $\bar\gamma = \varphi \circ \gamma$ as well as the components $\bar\gamma^\alpha$ of $\bar\gamma$ introduced in Definition 9.4. Using these we formulate the following.
Conclusion 9.5. (1) For a given chart (U, φ), with p ∈ U, the following equality can be deduced using equation (9.6):
$$\dot\gamma(\sigma_0)(f) = \frac{d}{d\sigma}\Big|_{\sigma_0} \big((f \circ \varphi^{-1}) \circ (\varphi \circ \gamma)\big) = \frac{d}{d\sigma}\Big|_{\sigma_0} (\bar f \circ \bar\gamma) = \frac{\partial}{\partial x^\alpha}\bar f(x) \cdot \frac{d}{d\sigma}\bar\gamma^\alpha(\sigma_0). \tag{9.7}$$
(2) Given $f_1, f_2 \in F_p$ and $\gamma_1, \gamma_2 \in K_p$. In case the equations
$$\frac{\partial}{\partial x^\alpha}\bar f_1(x) = \frac{\partial}{\partial x^\alpha}\bar f_2(x), \qquad \frac{d}{d\sigma}\bar\gamma_1^\alpha(\sigma_{0,1}) = \frac{d}{d\sigma}\bar\gamma_2^\alpha(\sigma_{0,2}) \tag{9.8}$$
hold for a chart φ, then
$$\dot\gamma_1(\sigma_{0,1})(f_1) = \dot\gamma_2(\sigma_{0,2})(f_2). \tag{9.9}$$
Thus, for $f_1 = f_2$ it is possible that different curves $\gamma_1$ and $\gamma_2$ generate the same tangent vector. Similarly, for $\gamma_1 = \gamma_2$ it is possible that the action of a tangent vector on different functions in $F_p$ yields the same real number. This result can be summarized as follows: A tangent vector is determined by an equivalence class of curves in $K_p$, while its value is determined by an equivalence class of functions. Both equivalence classes are determined by equations (9.8). The latter are independent of the chosen chart, i.e., if they hold in one chart (U, φ) they hold in any chart (U′, φ′) with p ∈ U ∩ U′.
(3) A tangent vector is a linear operator on $F_p$ by definition (9.6). Thus,
$$\dot\gamma(\sigma_0)(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 \dot\gamma(\sigma_0)(f_1) + \alpha_2 \dot\gamma(\sigma_0)(f_2). \tag{9.10}$$
Moreover, again by definition (9.6), a tangent vector is a differential operator, and the product rule holds:
$$\dot\gamma(\sigma_0)(f \cdot h) = f(p)\dot\gamma(\sigma_0)(h) + h(p)\dot\gamma(\sigma_0)(f). \tag{9.11}$$
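The description of a tangent vector as an operator on functions can be tried out numerically. The sketch below works in a single chart of ℝ², approximates the derivative in equation (9.6) by central differences, and compares the result with the chain rule (9.7) and the product rule (9.11). The curve, the functions, and all names are illustrative assumptions.

```python
import math


def derivative(func, s, h=1e-6):
    """Central-difference approximation of d func / d s."""
    return (func(s + h) - func(s - h)) / (2.0 * h)


def tangent_vector(curve, s0):
    """Operator f -> d/ds (f o curve)(s0), cf. (9.6)."""
    return lambda f: derivative(lambda s: f(curve(s)), s0)


def curve(s):
    """Coordinate expression of a curve (illustrative choice: the unit circle)."""
    return (math.cos(s), math.sin(s))


def f(x):
    return x[0] ** 2 * x[1]


def g(x):
    return x[0] + 3.0 * x[1]


s0 = 0.4
p = curve(s0)
v = tangent_vector(curve, s0)

# chain rule (9.7): partial derivatives of f at p contracted with the curve's velocity
grad_f = (2.0 * p[0] * p[1], p[0] ** 2)
velocity = (-math.sin(s0), math.cos(s0))
chain_rule_value = grad_f[0] * velocity[0] + grad_f[1] * velocity[1]

# product rule (9.11)
product_lhs = v(lambda x: f(x) * g(x))
product_rhs = f(p) * v(g) + g(p) * v(f)
```

The operator `v` depends on the curve only through its velocity at `s0`, which is the content of the equivalence classes described in part (2) of Conclusion 9.5.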
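The next three propositions study the component representation of tangent vectors with respect to a chart (U, φ) with p ∈ U.

Conclusion 9.6. Let x = φ(p) and
$$\bar\gamma_\alpha^\beta(\sigma) = x^\beta + \delta_\alpha^\beta \sigma, \qquad \bar\gamma_\alpha = (\bar\gamma_\alpha^1, \ldots, \bar\gamma_\alpha^n) \tag{9.12}$$
as well as $\gamma_\alpha = \varphi^{-1} \circ \bar\gamma_\alpha$. Using the notation from Definition 9.4 and equation (9.7) we find
$$\dot\gamma_\alpha(0)(f) = \frac{d}{d\sigma}\bar\gamma_\alpha^\beta(0) \cdot \frac{\partial}{\partial x^\beta}\bar f(x) = \frac{\partial}{\partial x^\alpha}\bar f(x). \tag{9.13}$$
Thus, $\frac{\partial}{\partial x^\alpha}\bar f$ defines a tangent vector generated by the curve $\gamma_\alpha$.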
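Definition 9.7. The tangent vector defined by equation (9.13) is denoted $\partial_{x^\alpha}$, or sometimes $\partial_{x^\alpha}|_p$. Thus, $\dot\gamma_\alpha(0) = \partial_{x^\alpha}$ holds.

Conclusion 9.8. (1) Let x = φ(p), $\bar\gamma^\alpha(\sigma) = x^\alpha + \upsilon^\alpha \sigma$ and $\bar\gamma = (x^1 + \upsilon^1\sigma, \ldots)$ as well as
$$\gamma = \varphi^{-1} \circ \bar\gamma. \tag{9.14}$$
Then
$$\dot\gamma(0)(f) = \frac{d}{d\sigma}\bar\gamma^\alpha(0) \cdot \frac{\partial}{\partial x^\alpha}\bar f = \upsilon^\alpha \partial_{x^\alpha} f \tag{9.15}$$
holds, and thus
$$\dot\gamma(0) = \upsilon^\alpha \partial_{x^\alpha}. \tag{9.16}$$
This defines the linear combination of the tangent vectors $\partial_{x^\alpha}$, α = 1, . . . , n.
(2) The other way around, consider a tangent vector $\upsilon_p = \dot\gamma'(\sigma_0)$ given by
$$\frac{d}{d\sigma}\bar\gamma'^{\alpha}(\sigma_0) =: \upsilon^\alpha. \tag{9.17}$$
Then the γ defined by equation (9.14) satisfies
$$\dot\gamma'(\sigma_0) = \dot\gamma(0). \tag{9.18}$$
Thus, for every equivalence class of curves according to equation (9.8) there exists exactly one curve of type (9.14) which generates the tangent vector of the equivalence class.

So far we have only used one arbitrarily chosen chart φ in our considerations. However, for many applications it is important to know the behavior of the components of a tangent vector, and the behavior of the corresponding basis, under coordinate changes.

Conclusion 9.9. (1) Given two charts (U, φ) and (U′, φ′) with p ∈ U ∩ U′ and a tangent vector $\upsilon_p$. Moreover, let $\phi = \varphi' \circ \varphi^{-1}$ and $\phi' := \phi^{-1}$. The coordinates x and x′ are defined by x = φ(p) and x′ = φ′(p). Then
$$\partial_{x^\alpha} = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \partial_{x'^{\beta}} \tag{9.19}$$
holds. Indeed, using the notions from Definitions 9.1 and 9.4 as well as equation (9.8), we obtain
$$\partial_{x^\alpha} f = \frac{\partial}{\partial x^\alpha}\bar f = \frac{\partial}{\partial x^\alpha}(\bar f' \circ \phi) = \frac{\partial \bar f'}{\partial x'^{\beta}} \frac{\partial \phi^\beta}{\partial x^\alpha} = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \partial_{x'^{\beta}} f. \tag{9.20}$$
The reverse of equation (9.19) can be proven analogously:
$$\partial_{x'^{\beta}} = \frac{\partial \phi'^{\alpha}}{\partial x'^{\beta}}\, \partial_{x^\alpha}. \tag{9.21}$$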
Let
$$\upsilon = \upsilon^\alpha \partial_{x^\alpha} = \upsilon'^{\beta} \partial_{x'^{\beta}}. \tag{9.22}$$
Then one obtains
$$\upsilon'^{\beta} = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \upsilon^\alpha \tag{9.23}$$
by plugging equation (9.19) into equation (9.22). Employing equation (9.21) one finally finds
$$\upsilon^\alpha = \frac{\partial \phi'^{\alpha}}{\partial x'^{\beta}}\, \upsilon'^{\beta}. \tag{9.24}$$
(2) From the above equations (9.19) and (9.23) we can deduce that the Lorentz transformations ϕ = (L, a) determine the same transformations for a given fixed L and all a ∈ ℝ⁴ with respect to the manifold $M_s$. Thus, if there is no specific a chosen, one can always choose ϕ = (L, 0) to obtain the transformation of the vector components and basis vectors.

From our discussion so far we can collect the tangent vectors as follows.

Definition 9.10. The set
$$T_p M := \{\upsilon : \upsilon = \dot\gamma(\sigma_0),\ \gamma(\sigma_0) = p,\ \gamma \in K_p\} \tag{9.25}$$
is called tangent space at p ∈ M or tangent space of p.

Now we find the following.

Proposition 9.11. (1) It is possible to define an addition “+” and a scalar multiplication “⋅” on $T_p M$ such that the triple $(T_p M, +, \cdot)$, or simply $T_p M$, is a vector space.
(2) For each chart (M, φ) ∈ A the n-tuple $(\partial_{x^1}, \ldots, \partial_{x^n})$ is a basis of $T_p M$.

Proof. (1) Let φ be a chart with x = φ(p). Moreover, let $\upsilon_1, \upsilon_2 \in T_p M$ such that
$$\upsilon_j = \upsilon_j^\alpha \partial_{x^\alpha}, \qquad j = 1, 2. \tag{9.26}$$
Then
$$\upsilon_3 = \upsilon_1 + \upsilon_2 \tag{9.27}$$
is defined through
$$\upsilon_3 = (\upsilon_1^\alpha + \upsilon_2^\alpha)\partial_{x^\alpha}. \tag{9.28}$$
This definition is independent of the choice of chart φ by linearity of the transformation equations (9.19) and (9.24). The zero element in $T_p M$ is defined by
$$0 = \sum_{\alpha=1}^{n} (0\,\partial_{x^\alpha})$$
and the inverse element of $\upsilon = \upsilon^\alpha \partial_{x^\alpha}$ by $-\upsilon = (-\upsilon^\alpha)\partial_{x^\alpha}$. By the associativity of addition, $(T_p M, +)$ is an Abelian group. Scalar multiplication is defined by
$$\alpha \cdot \upsilon =: \alpha\upsilon = (\alpha\upsilon^\alpha)\partial_{x^\alpha}, \tag{9.29}$$
which satisfies all four axioms of multiplication. The proof is trivial; see for example [29, p. 5].
(2) For every chart φ with x = φ(p) it is possible to express every tangent vector υ as in equation (9.16). Thus, we only need to show that the tangent vectors $\partial_{x^\alpha}$, α = 1, . . . , n, are linearly independent. Consider
$$u = u^\alpha \partial_{x^\alpha} = 0 \tag{9.30}$$
and $f_\beta = \bar f_\beta \circ \varphi$ with $\bar f_\beta(x) = x^\beta$. Then
$$\partial_{x^\alpha} f_\beta = \frac{\partial}{\partial x^\alpha}\bar f_\beta = \delta^\beta_\alpha. \tag{9.31}$$
Thus, by equations (9.30) and (9.31)
$$u(f_\beta) = u^\beta = 0(f_\beta) = 0. \tag{9.32}$$
Thus, equation (9.30) can only be satisfied if $u^\alpha = 0$ for all α = 1, . . . , n, which proves the second part of the proposition.
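The transformation rule (9.23) for the components of a tangent vector can be illustrated with a change from Cartesian to polar coordinates on a part of the plane. This example is not taken from the text: the transition function, the numerical Jacobian, and all names are illustrative assumptions.

```python
import math


def transition(x):
    """Transition function: Cartesian coordinates (x1, x2) -> polar coordinates (r, angle)."""
    return (math.hypot(x[0], x[1]), math.atan2(x[1], x[0]))


def jacobian(func, x, h=1e-6):
    """Numerical Jacobian J[b][a] = d func^b / d x^a by central differences."""
    n = len(x)
    columns = []
    for a in range(n):
        forward, backward = list(x), list(x)
        forward[a] += h
        backward[a] -= h
        fp, fm = func(forward), func(backward)
        columns.append([(fp[b] - fm[b]) / (2.0 * h) for b in range(len(fp))])
    return [[columns[a][b] for a in range(n)] for b in range(len(columns[0]))]


def transform_components(J, v):
    """Components in the new chart, cf. (9.23): v'^b = J[b][a] v^a."""
    return [sum(J[b][a] * v[a] for a in range(len(v))) for b in range(len(J))]


x = (1.0, 1.0)       # coordinates of the point p in the Cartesian chart
v = (0.0, 2.0)       # components of a tangent vector in the Cartesian chart
v_polar = transform_components(jacobian(transition, x), v)
```

At the point (1, 1) the partial derivatives are ∂r/∂x² = 1/√2 and ∂(angle)/∂x² = 1/2, so the polar components are (√2, 1).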
9.3 Cotangent vectors
In this section we will consider linear functions
$$\omega : T_p M \to \mathbb{R}, \tag{9.33}$$
i.e., functions which have the property
$$\omega(a_1\upsilon_1 + a_2\upsilon_2) = a_1\omega(\upsilon_1) + a_2\omega(\upsilon_2) \tag{9.34}$$
for all $\upsilon_1, \upsilon_2 \in T_p M$ and all $a_1, a_2 \in \mathbb{R}$.

Definition 9.12. The functions ω are called linear forms, 1-forms, or covectors. The set of all ω which can be defined on $T_p M$ is called cotangent space $T_p^\star M$.

Proposition 9.13. It is possible to define an addition “+” as well as a scalar multiplication “⋅” with real numbers on $T_p^\star M$ such that $T_p^\star M$, more precisely the triple $(T_p^\star M, +, \cdot)$, is a vector space.
Proof. The sum $\omega_1 + \omega_2$ of two elements in $T_p^\star M$ is defined by
$$(\omega_1 + \omega_2)(\upsilon) = \omega_1(\upsilon) + \omega_2(\upsilon) \tag{9.35}$$
for all $\upsilon \in T_p M$. Scalar multiplication is defined by
$$(a \cdot \omega)(\upsilon) =: (a\omega)(\upsilon) = \omega(a\upsilon) \tag{9.36}$$
for all $\upsilon \in T_p M$ and all $a \in \mathbb{R}$. The zero element $0^\star \in T_p^\star M$ is defined by
$$0^\star(\upsilon) = 0 \quad\text{for all } \upsilon \in T_p M. \tag{9.37}$$
The inverse element of ω, denoted by −ω, is defined by
$$(-\omega)(\upsilon) = -(\omega(\upsilon)). \tag{9.38}$$
Since the addition of covectors is associative, $T_p^\star M$ is an Abelian group. It is easy to show that the scalar multiplication defined above satisfies the necessary axioms (see [29, p. 5]). Thus, $T_p^\star M$ is a vector space.

Conclusion 9.14. $T_p^\star M$ is the dual vector space to $T_p M$.

In a next step we will define basis elements of $T_p^\star M$.

Definition 9.15. Let (U, φ) be a chart of A with p ∈ U. Moreover, let $\partial_{x^\beta}$, β = 1, . . . , n, be the basis of $T_p M$ induced by φ. Then we find that the covectors $\theta^\alpha$, α = 1, . . . , n, defined by
$$\theta^\alpha(\partial_{x^\beta}) = \delta^\alpha_\beta, \tag{9.39}$$
are determined for all $\upsilon \in T_p M$. A notation which is often used for these covectors is $\theta^\alpha =: dx^\alpha$ or $dx^\alpha_p$.

Conclusion 9.16. (1) Let φ be a chart and let $\partial_{x^\alpha}$, α = 1, . . . , n, be the corresponding basis elements. Moreover, let $\upsilon = \upsilon^\alpha \partial_{x^\alpha} \in T_p M$. Then, for an $\omega \in T_p^\star M$ it follows that
$$\theta^\alpha(\upsilon) = \upsilon^\alpha \quad\text{and}\quad \omega(\upsilon) = \upsilon^\alpha \omega(\partial_{x^\alpha}). \tag{9.40}$$
Using the abbreviation $\omega_\alpha = \omega(\partial_{x^\alpha})$ one obtains $\omega(\upsilon) = \omega_\alpha \theta^\alpha(\upsilon)$ for all $\upsilon \in T_p M$. Thus, every $\omega \in T_p^\star M$ has a component representation
$$\omega = \omega_\alpha \theta^\alpha = \omega_\alpha\, dx^\alpha. \tag{9.41}$$
(2) The covectors $\theta^\alpha$, α = 1, . . . , n, are a basis of $T_p^\star M$. By equation (9.41) we only need to show their linear independence. To see this, consider the following. If
$$\omega_\alpha \theta^\alpha = 0^\star, \tag{9.42}$$
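then, for every β = 1, . . . , n, the equation
$$\omega_\beta = \omega_\alpha \theta^\alpha(\partial_{x^\beta}) = 0^\star(\partial_{x^\beta}) = 0 \tag{9.43}$$
holds. Thus, $(\theta^1, \ldots, \theta^n)$ is in $T_p^\star M$ the dual basis to $(\partial_{x^1}, \ldots, \partial_{x^n})$.

So far we have considered only one specific chart φ. We now want to investigate the behavior of the basis elements $\theta^\alpha$ and the components $\omega_\alpha$ under a change of chart.

Conclusion 9.17. (1) As in Conclusion 9.9 we consider two charts: (U, φ) and (U′, φ′) with p ∈ U ∩ U′. Moreover, let $\phi = \varphi' \circ \varphi^{-1}$ and $\phi' := \phi^{-1}$. For $\omega \in T_p^\star M$ we find the following representations:
$$\omega = \omega_\alpha \theta^\alpha = \omega'_\beta \theta'^{\beta}, \tag{9.44}$$
where $\theta^\alpha$ are the basis elements corresponding to φ and $\theta'^{\beta}$ the ones belonging to φ′. Using (9.19) we obtain
$$\omega_\alpha = \omega(\partial_{x^\alpha}) = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \omega(\partial_{x'^{\beta}}) = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \omega'_\beta, \tag{9.45}$$
which yields
$$\omega_\alpha = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \omega'_\beta. \tag{9.46}$$
The other way around,
$$\omega'_\beta = \frac{\partial \phi'^{\alpha}}{\partial x'^{\beta}}\, \omega_\alpha \tag{9.47}$$
holds. Employing equation (9.23) yields
$$\theta'^{\beta}(\upsilon) = \upsilon'^{\beta} = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \upsilon^\alpha = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \theta^\alpha(\upsilon) \tag{9.48}$$
and in turn
$$\theta'^{\beta} = \frac{\partial \phi^\beta}{\partial x^\alpha}\, \theta^\alpha. \tag{9.49}$$
As above we find
$$\theta^\alpha = \frac{\partial \phi'^{\alpha}}{\partial x'^{\beta}}\, \theta'^{\beta}. \tag{9.50}$$
(2) On $M_s$ one obtains that for all Lorentz transformations ϕ = (L, a) with a fixed L and arbitrary a ∈ ℝ⁴ the same transformations as (9.46) and (9.49) hold.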
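The opposite transformation behavior of vector and covector components ensures that the number ω(υ) does not depend on the chart. The following minimal sketch checks this for a linear change of chart in two dimensions; the matrix, the component values, and the function names are illustrative assumptions.

```python
def mat_vec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]


def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def pairing(omega, v):
    """omega(v) = omega_alpha v^alpha, cf. (9.40) and (9.41)."""
    return sum(w * c for w, c in zip(omega, v))


J = [[2.0, 1.0], [1.0, 1.0]]          # J[b][a]: derivative of phi^b with respect to x^a
v = [3.0, -1.0]                       # vector components in the first chart
omega = [0.5, 4.0]                    # covector components in the first chart

v_new = mat_vec(J, v)                                       # (9.23)
omega_new = mat_vec(transpose(inverse_2x2(J)), omega)       # (9.47)
omega_back = mat_vec(transpose(J), omega_new)               # (9.46)
```

Here `pairing(omega, v)` and `pairing(omega_new, v_new)` give the same number, and applying (9.46) to the new components returns the original ones.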
9.4 Lorentz vector spaces
In the previous sections of this appendix we discussed some properties of tangent vector spaces, without using a metric tensor. However, a metric is the fundamental
entity in special and general relativity. In each tangent vector space it defines an inner product, and thus we present some properties of vector spaces with inner product here (see for example [29]). In the following we will consider an arbitrary real vector space V of dimension n. Definition 9.18. (1) Let the function g:V×V →ℝ
(9.51)
be symmetric and bilinear. Then g is called nondegenerate if g(u, υ) = 0 for all υ ∈ V implies u = 0. (2) A symmetric, bilinear and nondegenerate function g is called inner product (of V ). (3) The function g is called definite if g(υ, υ) > 0 for all υ ∈ V with υ ≠ 0. If this property does not hold, g is called indefinite. (4) The dual vector space of V is called V ⋆ , as usual. In the following we will consider pairs (V, g) where V is a vector space as above and g, defined by (9.51), is symmetric bilinear. The matrix η defined by equation (1.3) gets generalized to the n-dimensional case by the matrix elements n−1
η αβ = ∑ δ αj δ βj − δ αn δ βn .
(9.52)
j=1
With this notation we can characterize the vector spaces that play a particularly important role in relativity theory for $n = 4$.

Definition 9.19. The pair $(V, g)$, or $V$ for short, is called a Lorentz vector space if there is a basis $(e_1, \dots, e_n) \subset V$ such that
\[ g(e_\alpha, e_\beta) = \eta_{\alpha\beta} \tag{9.53} \]
for $\alpha, \beta = 1, \dots, n$. The basis $(e_1, \dots, e_n)$ is called a Minkowski basis.

Proposition 9.20. The function $g$ with property (9.53) is an inner product on $V$.

Proof. We need to show that $g$ is nondegenerate. Let $u = u^\alpha e_\alpha$ be a vector which satisfies $g(u, \upsilon) = 0$ for all $\upsilon \in V$. Then, for $\alpha = 1, \dots, n$,
\[ 0 = g(u, e_\alpha) = \eta_{\alpha\alpha}\, u^\alpha \quad \text{(no summation over } \alpha\text{)} \tag{9.54} \]
holds, and thus $u = 0$.

Proposition 9.21. Let $(e_1, \dots, e_n)$ be the basis of Definition 9.19 and $(\varepsilon^1, \dots, \varepsilon^n)$ be the corresponding dual basis of $V^\star$. Then,
\[ g = \eta_{\alpha\beta}\, \varepsilon^\alpha \otimes \varepsilon^\beta. \tag{9.55} \]
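Proof. Let $\upsilon_1, \upsilon_2 \in V$. Then $\upsilon_j = \upsilon_j^\kappa e_\kappa$, $j = 1, 2$, and thus
\[ \begin{aligned} \eta_{\alpha\beta}\, \varepsilon^\alpha \otimes \varepsilon^\beta (\upsilon_1, \upsilon_2) &= \eta_{\alpha\beta}\, \varepsilon^\alpha(\upsilon_1)\, \varepsilon^\beta(\upsilon_2) = \eta_{\alpha\beta}\, \upsilon_1^\kappa \upsilon_2^\lambda\, \varepsilon^\alpha(e_\kappa)\, \varepsilon^\beta(e_\lambda) \\ &= \eta_{\alpha\beta}\, \upsilon_1^\alpha \upsilon_2^\beta = \upsilon_1^\alpha \upsilon_2^\beta\, g(e_\alpha, e_\beta) = g(\upsilon_1, \upsilon_2), \end{aligned} \]
which proves the proposition.

Corollary 9.22. Since equation (9.55) implies (9.53) for a given basis $(e_1, \dots, e_n)$, the following result can be obtained. A pair $(V, g)$, with symmetric bilinear $g$, is a Lorentz vector space if and only if for the dual basis of a basis $(e_1, \dots, e_n) \subset V$ equation (9.55) holds.

The representation (9.55) of $g$ leads to the conjecture that $g$ enables us to define a map from $V$ to its dual $V^\star$. Indeed we find the following.

Proposition 9.23. For every $\upsilon \in V$ there exists exactly one $w \in V^\star$, and for every $w \in V^\star$ there exists exactly one $\upsilon \in V$, such that
\[ w = g(\upsilon, \cdot). \tag{9.56} \]

Proof. Since $g$ is bilinear, $g(\upsilon, \cdot)$ is linear, i.e., an element of $V^\star$. Let $(e_1, \dots, e_n)$ be a Minkowski basis of $V$ and $(\varepsilon^1, \dots, \varepsilon^n)$ the corresponding dual basis of $V^\star$. Then $\varepsilon^\alpha(\upsilon) = \upsilon^\alpha$ for every $\upsilon = \upsilon^\alpha e_\alpha \in V$. Thus, using equation (9.55),
\[ g(\upsilon, \cdot) = \eta_{\alpha\beta}\, \varepsilon^\beta(\upsilon)\, \varepsilon^\alpha = \eta_{\alpha\beta}\, \upsilon^\beta \varepsilon^\alpha \tag{9.57} \]
holds. From equation (9.57) we can deduce that for a given $\upsilon \in V$ the corresponding $w = w_\alpha \varepsilon^\alpha$ is determined by $w_\alpha = \eta_{\alpha\beta} \upsilon^\beta = \eta_{\alpha\alpha} \upsilon^\alpha$. In the same way we find that for a given $w$ the corresponding $\upsilon = \upsilon^\beta e_\beta$ is determined by $\upsilon^\beta = \eta^{\beta\alpha} w_\alpha = \eta^{\beta\beta} w_\beta$, where $\eta^{\kappa\lambda} := \eta_{\kappa\lambda}$ and $\beta = 1, \dots, n$. Thus, the proposition holds.

Conclusion 9.24.
(1) If $\upsilon = e_\kappa$, then $\upsilon^\alpha = \delta^\alpha_\kappa$, and thus $w_\alpha = \eta_{\alpha\alpha} \delta^\alpha_\kappa$. From this, using equation (9.57), it follows that
\[ g(e_\kappa, \cdot) = \eta_{\kappa\kappa}\, \varepsilon^\kappa. \tag{9.58} \]
If $g$ is defined independently of $(\varepsilon^1, \dots, \varepsilon^n)$, as for example in $(\mathcal{M}_s, +, \cdot, \hat g)$, then it is possible to use equation (9.58) in the form
\[ \varepsilon^\kappa = \eta_{\kappa\kappa}\, g(e_\kappa, \cdot) \tag{9.59} \]
as a definition of the dual basis of $(e_1, \dots, e_n)$, as for example in equation (8.36).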
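(2) From the equations
\[ w_\alpha = \eta_{\alpha\beta}\, \upsilon^\beta = \eta_{\alpha\alpha}\, \upsilon^\alpha, \qquad \upsilon^\beta = \eta^{\beta\alpha}\, w_\alpha = \eta^{\beta\beta}\, w_\beta \tag{9.60} \]
it follows that the components of the corresponding vectors and covectors differ at most by a sign. This means
\[ w_\alpha = \upsilon^\alpha \quad \text{for } \alpha = 1, \dots, n - 1, \qquad \text{and} \qquad w_n = -\upsilon^n. \tag{9.61} \]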
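As a small numerical illustration of (9.60) and (9.61), the following Python sketch lowers and raises indices with the matrix $\eta$ of (9.52). The helper names `eta`, `lower` and `raise_index` are chosen here for illustration only and do not appear in the text; components are stored as plain lists with respect to a Minkowski basis.

```python
def eta(n):
    """Matrix elements of (9.52): +1 on the first n-1 diagonal entries, -1 on the last."""
    return [[(1 if a < n - 1 else -1) if a == b else 0 for b in range(n)]
            for a in range(n)]


def lower(v):
    """Components w_alpha = eta_{alpha beta} v^beta of the covector g(v, .)."""
    n = len(v)
    e = eta(n)
    return [sum(e[a][b] * v[b] for b in range(n)) for a in range(n)]


def raise_index(w):
    """Components v^beta = eta^{beta alpha} w_alpha, using eta^{kl} := eta_{kl}."""
    # Numerically the same operation as lowering, since eta is its own inverse.
    return lower(w)


v = [2, -1, 3, 5]             # vector components for n = 4
w = lower(v)                  # only the n-th component changes sign, cf. (9.61)
print(w)                      # [2, -1, 3, -5]
print(raise_index(w) == v)    # True
```

Raising after lowering returns the original components, which is the componentwise content of the one-to-one correspondence (9.56).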
So far no inner product has been defined on $V^\star$, i.e., so far $V^\star$ is not a Lorentz vector space. This can be changed by the following.

Proposition 9.25. Let $w, w' \in V^\star$ and $w = g(\upsilon, \cdot)$, $w' = g(\upsilon', \cdot)$. The function defined by
\[ g^\star(w, w') = g(\upsilon, \upsilon') \tag{9.62} \]
is a Lorentzian metric on $V^\star$, and thus $V^\star$ is a Lorentz vector space.

Proof. Using the second equation of (9.60) we find
\[ g(\upsilon, \upsilon') = \eta_{\alpha\beta}\, \upsilon^\alpha \upsilon'^{\beta} = \eta_{\alpha\beta}\, \eta^{\alpha\lambda} \eta^{\beta\kappa}\, w_\lambda w'_\kappa. \tag{9.63} \]
From $\eta_{\alpha\beta}\, \eta^{\alpha\lambda} \eta^{\beta\kappa} = \delta^\lambda_\beta\, \eta^{\beta\kappa} = \eta^{\lambda\kappa}$ it follows that
\[ g(\upsilon, \upsilon') = \eta^{\lambda\kappa}\, w_\lambda w'_\kappa =: g^\star(w, w'). \tag{9.64} \]
In particular,
\[ g^\star(\varepsilon^\lambda, \varepsilon^\kappa) = \eta^{\lambda\kappa} \tag{9.65} \]
holds, such that $(V^\star, g^\star)$ is a Lorentz vector space according to Definition 9.19.

Conclusion 9.26. We have seen that the dual vector space $(V^\star, g^\star)$ of a Lorentz vector space $(V, g)$ is itself a Lorentz vector space. Thus, on $V^\star$ a Minkowski basis exists. Equation (9.65) implies that the dual basis $\hat\varepsilon$ of $V^\star$ of every Minkowski basis $\hat e$ of $V$ is itself a Minkowski basis.

The preceding results can be summarized by saying that for every Lorentz vector space there exists a dual Lorentz vector space. The elements of these vector spaces, mapped onto each other by equation (9.56), differ only by the sign of the $n$th component with respect to the bases $\hat e$ and $\hat\varepsilon$ identified by equation (9.56). In the particular case of special relativity, with $n = 4$, we find that physical phenomena can be described equally well with both vector spaces. This suggests the following convenient definition.

Definition 9.27.
(1) A vector $\upsilon \in V$ and its corresponding covector $w \in V^\star$ according to equation (9.56) are called physically equivalent.
(2) The map $\chi : V \to V^\star$ defined by (9.56) is called physical equivalence map.
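Proposition 9.25 can be checked numerically in components. The sketch below is an illustration only: the names `g`, `g_star` and `chi` are chosen here, the dimension is fixed to $n = 4$, and all components refer to a Minkowski basis and its dual basis.

```python
ETA = [1, 1, 1, -1]  # diagonal of eta for n = 4; all off-diagonal entries vanish


def g(u, v):
    """g(u, v) = eta_{alpha beta} u^alpha v^beta."""
    return sum(s * a * b for s, a, b in zip(ETA, u, v))


def chi(v):
    """Components of the covector g(v, .), cf. (9.56) and (9.60)."""
    return [s * a for s, a in zip(ETA, v)]


def g_star(w1, w2):
    """g*(w, w') = eta^{lambda kappa} w_lambda w'_kappa, cf. (9.64)."""
    return sum(s * a * b for s, a, b in zip(ETA, w1, w2))


u, v = [1, 0, 2, 3], [4, -1, 1, 1]
print(g(u, v), g_star(chi(u), chi(v)))   # 3 3

# Dual basis covectors have components (1,0,0,0), ..., (0,0,0,1); cf. (9.65).
dual_basis = [[1 if i == k else 0 for i in range(4)] for k in range(4)]
gram = [[g_star(a, b) for b in dual_basis] for a in dual_basis]
print(gram == [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]])  # True
```

The second check reproduces (9.65): the dual basis of a Minkowski basis is again a Minkowski basis with respect to $g^\star$.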
This physical equivalence can be described as follows: It is just a matter of convenience which Lorentz vector space is chosen to describe a physical system.

The equivalence map relating vectors and covectors suggests denoting these by the same letter and distinguishing them only by an index. Two different cases exist:
(1) Given $\upsilon \in V$, then $\upsilon = \upsilon^\alpha e_\alpha$. Instead of writing $\chi(\upsilon)$ one uses the notation $\upsilon^\flat$ with $\upsilon^\flat = \upsilon_\alpha \varepsilon^\alpha$. The musical symbol $\flat$ lowers the index of the components of $\upsilon$.
(2) Analogously we introduce for $w = w_\beta \varepsilon^\beta \in V^\star$ the notation $\chi^{-1}(w) = w^\sharp$ with $w^\sharp = w^\beta e_\beta$. The musical symbol $\sharp$ raises the index of the components of $w$.
This suggests the following names:

Notation 9.28. The symbols $\flat$ and $\sharp$ are called musical operators.

In our discussions so far Lorentz transformations have only appeared as coordinate transformations of Minkowski space. This suggests the question: Which role do Lorentz transformations play in Lorentz vector spaces?

Lorentz transformations act in tangent and cotangent spaces only indirectly, as transformations of the vector bases and vector components; see equations (9.19), (9.21), (9.23), and (9.24). They do not change a vector itself; see equation (9.22). Thus, it is convenient to introduce Lorentz transformations in arbitrary 4-dimensional Lorentz vector spaces as transformations of vector bases and vector components. The foundation for this procedure is given in the following proposition.

Proposition 9.29.
(1) Let a Lorentz vector space $(V, g)$ and a Minkowski basis $\hat e = (e_1, \dots, e_4)$ of $V$ be given, i.e., the basis satisfies equation (9.53). Moreover, let $L$ be a Lorentz matrix. Then $\hat e' = (e'_1, \dots, e'_4)$ defined by
\[ e'_\beta = L^{\alpha}_{\beta}\, e_\alpha \tag{9.66} \]
is again a Minkowski basis of $V$. For a vector $\upsilon = \upsilon^\alpha e_\alpha = \upsilon'^{\beta} e'_\beta$ the following holds:
\[ \upsilon'^{\beta} = (L^{-1})^{\beta}_{\alpha}\, \upsilon^\alpha. \tag{9.67} \]
(2) The other way around, given two Minkowski bases $\hat e = (e_1, \dots, e_4)$ and $\hat e' = (e'_1, \dots, e'_4)$ in $(V, g)$, then there exists a Lorentz matrix $L$ such that equation (9.66) holds, and the matrix elements of $L$ are determined by
\[ L^{\kappa}_{\lambda} = g(e_\varrho, e'_\lambda)\, \eta^{\varrho\kappa}. \tag{9.68} \]
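Proof. (1) Using equation (9.66) one obtains
\[ g(e'_\kappa, e'_\lambda) = g(L^{\alpha}_{\kappa} e_\alpha, L^{\beta}_{\lambda} e_\beta) = L^{\alpha}_{\kappa} L^{\beta}_{\lambda}\, g(e_\alpha, e_\beta) = L^{\alpha}_{\kappa} L^{\beta}_{\lambda}\, \eta_{\alpha\beta} = \eta_{\kappa\lambda}. \tag{9.69} \]
Thus, $\hat e'$ is a Minkowski basis. If moreover $e'_\beta$ is determined by (9.66), then
\[ \upsilon'^{\beta} e'_\beta = \upsilon = \delta^{\lambda}_{\kappa}\, \upsilon^\kappa e_\lambda = (L^{-1})^{\beta}_{\kappa} L^{\lambda}_{\beta}\, \upsilon^\kappa e_\lambda = (L^{-1})^{\beta}_{\kappa}\, \upsilon^\kappa e'_\beta, \]
which implies equation (9.67), since the components of a vector with respect to a basis are uniquely determined.
(2) Let $\hat e$ and $\hat e'$ be bases in $(V, g)$. Thus, there exists a nonsingular matrix $L$ such that
\[ e'_\alpha = L^{\kappa}_{\alpha}\, e_\kappa, \qquad \alpha = 1, \dots, 4. \tag{9.70} \]
Moreover, $\hat e$ and $\hat e'$ are Minkowski bases, and thus employing equation (9.53) we find
\[ \eta_{\alpha\beta} = g(e'_\alpha, e'_\beta) = L^{\kappa}_{\alpha} L^{\lambda}_{\beta}\, g(e_\kappa, e_\lambda) = L^{\kappa}_{\alpha} L^{\lambda}_{\beta}\, \eta_{\kappa\lambda}. \tag{9.71} \]
Thus, $L$ is a Lorentz matrix and employing equation (9.70), one gets
\[ g(e_\varrho, e'_\lambda) = L^{\beta}_{\lambda}\, g(e_\varrho, e_\beta) = L^{\beta}_{\lambda}\, \eta_{\varrho\beta} \tag{9.72} \]
such that equation (9.68) holds.

Since the dual $V^\star$ of a Lorentz vector space $V$ is itself a Lorentz vector space, according to Proposition 9.25, the statements proven for $V$ also hold for $V^\star$. The relation between $V$ and $V^\star$ can be discussed in terms of the following proposition.

Proposition 9.30.
(1) Let $\hat e = (e_1, \dots, e_4)$ be a Minkowski basis of $V$ and $\hat\varepsilon = (\varepsilon^1, \dots, \varepsilon^4)$ the corresponding dual basis of $V^\star$. Moreover, let $\hat e' = (e'_1, \dots, e'_4)$ be the transformed basis obtained by the Lorentz matrix $L$, i.e.,
\[ e'_\alpha = L^{\beta}_{\alpha}\, e_\beta, \qquad \alpha = 1, \dots, 4. \tag{9.73} \]
Then the basis $\hat\varepsilon' = (\varepsilon'^{1}, \dots, \varepsilon'^{4})$ defined by
\[ \varepsilon'^{\kappa} = (L^{-1})^{\kappa}_{\lambda}\, \varepsilon^\lambda, \qquad \kappa = 1, \dots, 4 \tag{9.74} \]
is the dual basis of $\hat e'$. All bases $\hat e'$, $\hat\varepsilon$, $\hat\varepsilon'$ are Minkowski bases.
(2) For every $w \in V^\star$ and the Minkowski bases $\hat e$ and $\hat e'$ the equation
\[ w = w_\lambda \varepsilon^\lambda = w'_\kappa \varepsilon'^{\kappa} \tag{9.75} \]
holds if and only if there exists a Lorentz matrix $L$ such that
\[ w'_\kappa = L^{\lambda}_{\kappa}\, w_\lambda. \tag{9.76} \]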
Proof. (1) Employing equations (9.73) and (9.74) yields
\[ \varepsilon'^{\kappa}(e'_\alpha) = (L^{-1})^{\kappa}_{\lambda} L^{\beta}_{\alpha}\, \varepsilon^\lambda(e_\beta) = (L^{-1})^{\kappa}_{\lambda} L^{\beta}_{\alpha}\, \delta^{\lambda}_{\beta} = \delta^{\kappa}_{\alpha}. \tag{9.77} \]
Thus, $\hat\varepsilon'$ is dual to $\hat e'$. According to Conclusion 9.26 and Proposition 9.29 the bases $\hat e'$, $\hat\varepsilon$, $\hat\varepsilon'$ are Minkowski bases.
(2) If
\[ w_\lambda \varepsilon^\lambda = w'_\kappa \varepsilon'^{\kappa} \tag{9.78} \]
holds, there exists a Lorentz matrix $L$ such that equation (9.74) holds. This implies
\[ w_\lambda \varepsilon^\lambda = w'_\kappa\, (L^{-1})^{\kappa}_{\lambda}\, \varepsilon^\lambda, \tag{9.79} \]
and so
\[ w_\lambda = (L^{-1})^{\kappa}_{\lambda}\, w'_\kappa, \tag{9.80} \]
which eventually yields
\[ L^{\lambda}_{\kappa}\, w_\lambda = w'_\kappa. \tag{9.81} \]
If, the other way round, equation (9.81) holds, then equations (9.80) and (9.79) must hold, and equation (9.74) implies (9.78).

From the considerations on Lorentz vector spaces in this Section 9.4 we can find the following result for the case $n = 4$. Every tangent space $T_p M$ of a Lorentzian manifold $M$ is a Lorentz vector space. This holds in particular for $T_p M_s$, since the metric tensor $g$, written in local or global Minkowski coordinates, is an inner product of the kind described in Definition 9.19. This means that every tangent or cotangent space of an arbitrary relativistic spacetime $M$, again in particular of $M_s$, possesses all the properties we could prove here for pairs $(V, g)$ and $(V^\star, g^\star)$.
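The transformation rules (9.67) and (9.76) can be tried out on a concrete Lorentz matrix. The following Python sketch is an illustration under assumptions made here: the matrix `L` is a standard boost along the first axis with speed 3/5 (in units with c = 1), the time axis is the 4th one, and `L[alpha][beta]` stores $L^{\alpha}_{\beta}$. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as F

ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def matmul(A, B):
    """Matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def apply(A, x):
    """Matrix times list of components."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


# Boost with gamma = 5/4 and gamma * speed = 3/4, together with its inverse.
L = [[F(5, 4), 0, 0, F(-3, 4)], [0, 1, 0, 0], [0, 0, 1, 0], [F(-3, 4), 0, 0, F(5, 4)]]
L_inv = [[F(5, 4), 0, 0, F(3, 4)], [0, 1, 0, 0], [0, 0, 1, 0], [F(3, 4), 0, 0, F(5, 4)]]

# Lorentz condition in the form (9.71): L^T eta L = eta.
is_lorentz = matmul(matmul(transpose(L), ETA), L) == ETA

v = [1, 2, 3, 4]                  # vector components v^alpha
w = [1, 2, 3, -4]                 # components of the covector g(v, .)
v_new = apply(L_inv, v)           # (9.67): v'^beta = (L^-1)^beta_alpha v^alpha
w_new = apply(transpose(L), w)    # (9.76): w'_kappa = L^lambda_kappa w_lambda

# The pairing w(v) does not depend on the chosen Minkowski basis.
pairing_old = sum(a * b for a, b in zip(w, v))
pairing_new = sum(a * b for a, b in zip(w_new, v_new))
print(is_lorentz, pairing_old == pairing_new)   # True True
```

Vector components transform with $L^{-1}$ and covector components with $L$, so their contraction is unchanged; in this example `w_new` is again obtained from `v_new` by flipping the sign of the 4th component.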
9.5 Direct decomposition of Lorentz vector spaces

First, an $n$-dimensional vector space $V$ with definite or indefinite inner product $g$ will be considered.

Definition 9.31.
(1) Let $V_1$ and $V_2$ be subspaces of $V$ such that for every $\upsilon \in V$ there exist two elements $\upsilon_1 \in V_1$ and $\upsilon_2 \in V_2$ with $\upsilon = \upsilon_1 + \upsilon_2$. In this case we write $V = V_1 + V_2$ and call $V$ the sum of $V_1$ and $V_2$. In case there exists a unique decomposition $\upsilon = \upsilon_1 + \upsilon_2$ for all $\upsilon \in V$, $V$ is called the direct sum of $V_1$ and $V_2$, written formally
\[ V = V_1 \oplus V_2 = V_2 \oplus V_1. \tag{9.82} \]
(2) Let $U \subset V$ be a subspace of $V$. The orthogonal complement $U^\perp$ of $U$ is the set of all vectors orthogonal to all vectors in $U$. Hence
\[ U^\perp = \{ w \in V : g(u, w) = 0 \text{ for all } u \in U \}. \tag{9.83} \]
In case $U = \{\lambda\upsilon : \lambda \in \mathbb{R}\}$ for a $\upsilon \in V$, one may write $\upsilon^\perp := U^\perp$.

The notions in the first part of Definition 9.31 can be generalized to more than two subspaces of $V$. However, such generalizations are not important for the following considerations, so we will only discuss properties of the decomposition of a vector space into two subspaces.

Proposition 9.32.
(1) From $V = V_1 + V_2$ it follows that $V = V_1 \oplus V_2$ if and only if $V_1 \cap V_2 = \{0\}$.
(2) For a subspace $U \subset V$, $U^\perp$ is also a subspace of $V$, and
\[ V = U \oplus U^\perp \tag{9.84} \]
holds if and only if the restriction $g'$ of $g$ to $U$ is nondegenerate.

Proof. (1) If $\upsilon_1, \upsilon'_1 \in V_1$ and $\upsilon_2, \upsilon'_2 \in V_2$ satisfy $\upsilon = \upsilon_1 + \upsilon_2 = \upsilon'_1 + \upsilon'_2$, then $u := \upsilon_1 - \upsilon'_1 = \upsilon'_2 - \upsilon_2 \in V_1 \cap V_2$. Thus, the decomposition of $\upsilon \in V$ is unique if and only if $u = 0$.
(2) If $u \in U$, $w, w' \in U^\perp$ and $\varrho, \varrho' \in \mathbb{R}$, then
\[ g(u, \varrho w + \varrho' w') = \varrho\, g(u, w) + \varrho'\, g(u, w') = 0. \]
Since all linear combinations of elements of $U^\perp$ are again elements of $U^\perp$, and since the zero element $0$ is in $U^\perp$, $U^\perp$ is a vector subspace of $V$.

In the next step we prove equation (9.84). Let $\hat a = (a_1, \dots, a_m)$ be an orthonormal basis in $U$. Since $V$ is $n$-dimensional it is possible to extend $\hat a$ with pairwise orthogonal elements $\hat b = (b_1, \dots, b_k)$ to an orthonormal basis $\hat e = (e_1, \dots, e_n)$ in $V$, i.e., $\hat e = \hat a \cup \hat b$. Thus, $\hat b$ is an orthonormal basis in $U^\perp$, and so $k = n - m$ and $V = U + U^\perp$ follows. In case $g'$ is nondegenerate and $u \in U \cap U^\perp$, the equation $g'(u, \upsilon) = 0$ holds for all $\upsilon \in U$, and it follows that $u = 0$. Thus, $U \cap U^\perp = \{0\}$ and equation (9.84) holds.

The other way round, if equation (9.84) holds, then there exists a decomposition $w = \upsilon + \upsilon'$, $\upsilon \in U$ and $\upsilon' \in U^\perp$, for every $w \in V$. Now let $u \in U$ be a vector for which $g'(u, \upsilon) = 0$ holds for all $\upsilon \in U$. Then, with $\upsilon' \in U^\perp$,
\[ g(u, w) = g'(u, \upsilon) + g(u, \upsilon') = 0 \tag{9.85} \]
holds. Since $g$ is an inner product on $V$, it is nondegenerate, and $u = 0$. Thus, $g'(u, \upsilon) = 0$ for all $\upsilon \in U$ implies $u = 0$, and so $g'$ is nondegenerate, which proves the second part of the proposition.

The above proposition leads to the following useful property.

Corollary 9.33. The relation
\[ (U^\perp)^\perp = U \tag{9.86} \]
holds if equation (9.84) holds.

Proof. The decomposition of $V$ according to (9.84) is unique for a given subspace $U$, i.e., there exists only one $U^\perp$ for which (9.84) holds. Take $W := U^\perp$ as given. Then (9.84) holds in the form
\[ V = W \oplus W^\perp. \tag{9.87} \]
By the uniqueness of the decomposition, $U = W^\perp = (U^\perp)^\perp$ must hold.

For the further considerations in this section we assume that $V$ is a 4-dimensional Lorentz vector space according to Definition 9.19.

Proposition 9.34. Let $u \in V$ be timelike. Then $u^\perp$ is a 3-dimensional Euclidean vector space consisting of spacelike vectors.

Proof. Since $V$ is a Lorentz vector space, there exists a Minkowski basis $\hat e = (e_1, \dots, e_4)$ such that $u = u^\alpha e_\alpha$ for each $u \in V$. Using the same procedure as in Proposition 6.2, there exists a Lorentz matrix $L$ defined as follows. Let
\[ \sum_{j=1}^{3} (u^j)^2 =: w^2 \quad \text{with } w > 0. \tag{9.88} \]
Then $|u^4| > w$ and $g(u, u) = w^2 - (u^4)^2 =: -b^2 < 0$, where we assume $b > 0$. In $\mathbb{R}^3$ we define three row vectors $a_1, a_2, a_3$ with
\[ a_1 := w^{-1} (u^1, u^2, u^3) \tag{9.89} \]
and $a_2, a_3$ such that $(a_1, a_2, a_3)$ form an orthonormal basis of $\mathbb{R}^3$. This defines the orthogonal matrix
\[ Q = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \tag{9.90} \]
as well as the rotational Lorentz matrix (see equation (4.3))
\[ A = \begin{pmatrix} Q & 0_3^T \\ 0_3 & 1 \end{pmatrix}. \tag{9.91} \]
Finally, let $S_\upsilon$ be the special Lorentz matrix with $\upsilon = (u^4)^{-1} w$. Then $L$ is defined by
\[ L = S_\upsilon \cdot A. \tag{9.92} \]
Next, define a basis $\hat e'$ and the vector components $u'^{\beta}$ by
\[ e'_\beta = (L^{-1})^{\alpha}_{\beta}\, e_\alpha \qquad \text{and} \qquad u'^{\beta} = L^{\beta}_{\alpha}\, u^\alpha. \]
This implies that $\hat e' = (e'_1, \dots, e'_4)$ is a Minkowski basis. Following the proof of Proposition 6.2 one obtains
\[ u'^{\beta} = 0 \ \text{ for } \beta = 1, 2, 3, \qquad u'^{4} = b = (-g(u, u))^{\frac{1}{2}}. \]
Hence $u = u'^{4} e'_4$, which yields
\[ u^\perp = (e'_4)^\perp = \operatorname{span}(e'_1, e'_2, e'_3) \tag{9.93} \]
and
\[ V = u^\perp \oplus \operatorname{span} u, \tag{9.94} \]
where $u^\perp$ contains only spacelike vectors and $\operatorname{span} u$ is timelike.

The complementary proposition to Proposition 9.34 is as follows.

Proposition 9.35. Let $V$ be a Lorentz vector space and $R \subset V$ be a 3-dimensional subspace whose elements are spacelike vectors. Then there exists, up to a real factor, a unique timelike vector $\upsilon$ such that $R = \upsilon^\perp$.

Proof. Let $\tilde e = (e_1, e_2, e_3)$ be an orthonormal basis of $R$. Then there exists a vector $w \in V$ such that the vectors $e_1, e_2, e_3, w$ are linearly independent. Let
\[ \upsilon := w - \sum_{j=1}^{3} g(w, e_j)\, e_j. \tag{9.95} \]
Then, for $k = 1, 2, 3$,
\[ g(\upsilon, e_k) = g(w, e_k) - \sum_{j=1}^{3} g(w, e_j)\, g(e_j, e_k) = g(w, e_k) - g(w, e_k) = 0. \tag{9.96} \]
Now $\upsilon \neq 0$ follows from the fact that $w$ is linearly independent of $(e_1, e_2, e_3)$. From equation (9.96) it follows that $R = \upsilon^\perp$, and so according to Proposition 6.8 and the introductory remarks of Section 6.4 $\upsilon$ is timelike. Let $g(\upsilon, \upsilon) = -a^{-2}$, $a > 0$, and $e_4 = a\upsilon$; then $g(e_4, e_4) = -1$. Now using
\[ R = \upsilon^\perp = (\operatorname{span} e_4)^\perp \tag{9.97} \]
and equation (9.86) we find
\[ R^\perp = \operatorname{span} e_4. \tag{9.98} \]
Thus, $R$ determines the normed vector $e_4$ uniquely up to sign, and $(e_1, \dots, e_4)$ is a Minkowski basis of $V$ since
\[ g(e_\lambda, e_k) = \eta_{\lambda k} \tag{9.99} \]
holds.

Propositions 9.34 and 9.35 enable us to draw two important conclusions.

Corollary 9.36.
(1) Given two timelike vectors $\upsilon, \upsilon' \in V$. If they satisfy $\upsilon^\perp = \upsilon'^{\perp}$, then according to Proposition 9.35 there exists a real number $\varrho$ such that $\upsilon = \varrho\upsilon'$.
(2) If $\upsilon^\perp \neq \upsilon'^{\perp}$, then according to Proposition 9.15, $\lambda\upsilon \neq \upsilon'$ holds for any real number $\lambda$.
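The construction in the proof of Proposition 9.35 can be followed step by step on an example. In the Python sketch below the subspace $R$, its orthonormal basis and the auxiliary vector `w` are illustrative choices made here (they are not taken from the text), and components refer to a fixed Minkowski basis with the timelike direction in the 4th slot.

```python
from fractions import Fraction as F

ETA = [1, 1, 1, -1]  # diagonal of eta for n = 4


def g(u, v):
    """g(u, v) = eta_{alpha beta} u^alpha v^beta."""
    return sum(s * a * b for s, a, b in zip(ETA, u, v))


# Orthonormal basis of a 3-dimensional spacelike subspace R (a tilted hyperplane).
e1 = [F(5, 4), 0, 0, F(-3, 4)]
e2 = [0, 1, 0, 0]
e3 = [0, 0, 1, 0]
R_basis = [e1, e2, e3]

w = [0, 0, 0, 1]  # a vector that is linearly independent of e1, e2, e3

# Equation (9.95): v := w - sum_j g(w, e_j) e_j
v = list(w)
for e in R_basis:
    coeff = g(w, e)
    v = [vi - coeff * ei for vi, ei in zip(v, e)]

a = F(4, 5)                  # chosen so that g(v, v) == -a**(-2)
e4 = [a * vi for vi in v]    # normed timelike vector spanning the complement of R

print([g(v, e) for e in R_basis])   # v is orthogonal to R, cf. (9.96)
print(g(v, v) < 0, g(e4, e4))       # v is timelike, e4 is normed to -1
```

Together with `e1, e2, e3`, the vector `e4` forms a Minkowski basis in the sense of (9.99); replacing `w` by another admissible vector changes `v` only by a real factor.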
9.6 Tensors

Tensors play a fundamental role in special relativity, as discussed in detail in Sections 1.1, 8.4, 8.8, and 9.4. However, this is not the end of the story, as a short look at the literature on relativity shows. It also reveals that there exist different definitions of the notion of a tensor.

The definition given in this section is extracted from the notion of a tensor developed by Greub in [30]. It is based on the notion of the tensor product. Roughly speaking, the objects which get multiplied are vector spaces, which yields a new vector space whose elements are called tensors. To perform this program and to make it precise, we need to make some introductory remarks. The starting point is the following.

Definition 9.37.
(1) For an arbitrary set $Y$ consider functions
\[ f : Y \to \mathbb{R} \tag{9.100} \]
such that $f(y) \neq 0$ only for finitely many $y \in Y$. The functions $f$ are called finite selection functions over $Y$.
(2) The set of all finite selection functions is called $F(Y)$.

This definition yields the following.

Conclusion 9.38.
(1) Let $f, g \in F(Y)$ and $\lambda, \mu \in \mathbb{R}$. The linear combination $\lambda f + \mu g$ is defined by
\[ (\lambda f + \mu g)(y) = \lambda f(y) + \mu g(y). \tag{9.101} \]
Thus, $\lambda f + \mu g \in F(Y)$ such that $F(Y)$ is a vector space.
(2) Consider special functions $f_z$ with $z \in Y$ defined by
\[ f_z(y) = \begin{cases} 1 & \text{for } y = z, \\ 0 & \text{for } y \neq z. \end{cases} \tag{9.102} \]
Let $f \in F(Y)$ and $f(y) \neq 0$ precisely for $z_1, \dots, z_m$, as well as $f(z_j) = \alpha_j$, $j = 1, \dots, m$. Then
\[ f = \sum_{j=1}^{m} \alpha_j f_{z_j}. \tag{9.103} \]
(3) The set of all functions $f_z$, $z \in Y$, denoted by $B$, is a basis of $F(Y)$. To see this we only need to prove linear independence, due to equation (9.103). If $z_j \in Y$, $j = 1, \dots, n$, are pairwise distinct and if
\[ \sum_{j=1}^{n} \beta_j f_{z_j} = 0 \tag{9.104} \]
holds, then equation (9.104) implies
\[ \sum_{j=1}^{n} \beta_j f_{z_j}(z_k) = \beta_k = 0 \]
for $k = 1, \dots, n$. Thus, $B$ is a basis of $F(Y)$.
(4) The elements $f_z \in B$ are uniquely determined by the elements $z \in Y$. Thus, the elements $z$ can be identified with functions $f_z$:
\[ f_z(y) =: z(y) \quad \text{for all } z, y \in Y. \tag{9.105} \]
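The finite selection functions of Conclusion 9.38 have a direct computational counterpart: a function with finite support can be stored as a dictionary that lists only its nonzero values. The following Python sketch uses names chosen here for illustration (`f_z`, `combine`, `evaluate`), and takes the elements of $Y$ to be strings purely as an example.

```python
def f_z(z):
    """The function f_z of (9.102), stored as {z: 1}; all other values are 0."""
    return {z: 1}


def combine(lam, f, mu, g):
    """Return lam*f + mu*g as in (9.101), keeping only the nonzero values."""
    out = {}
    for y in set(f) | set(g):
        value = lam * f.get(y, 0) + mu * g.get(y, 0)
        if value != 0:
            out[y] = value
    return out


def evaluate(f, y):
    """Value f(y); elements outside the stored support are mapped to 0."""
    return f.get(y, 0)


f = combine(2, f_z("a"), 3, f_z("b"))   # f = 2 f_a + 3 f_b, cf. (9.103)
h = combine(1, f, -3, f_z("b"))         # the coefficient of f_b cancels

print(evaluate(f, "a"), evaluate(f, "b"), evaluate(f, "c"))   # 2 3 0
print(h)                                                       # {'a': 2}
```

The keys of such a dictionary play the role of the basis elements $f_z$ and the values are the coefficients, which is the identification of $z$ with $f_z$ expressed in (9.105).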
The above results can be summarized in the following definition.

Definition 9.39. The set $F(Y)$ equipped with addition, scalar multiplication and the identification of $f_z$ and $z$ is called free vector space over the set $Y$.

Remark 9.40. According to equations (9.103) and (9.105) the elements of $F(Y)$ are given by
\[ f = \sum_{j=1}^{m} \alpha_j z_j, \qquad z_j \in Y, \quad \alpha_j \in \mathbb{R}, \quad m < \infty. \tag{9.106} \]
This means that forming linear combinations $\lambda f + \mu g$ according to (9.101) is done component-wise and does not use that $f$ and $z_j$, $j = 1, \dots, m$, are real-valued functions. The latter property is only used to give the sum symbol in equation (9.106) a mathematical meaning. In the older literature on tensors the expression (9.106) is sometimes called "formal sum". The mathematical construction above gives a precise meaning to this notion. In case the set $Y$ allows for the definition of an addition and a scalar multiplication such that $f$ as in (9.106) is a well-defined expression, one can determine $F(Y)$ in this way from the start.

The next to last step in our introductory remarks is the following.
Definition 9.41.
(1) Let a vector space $V$ and a subspace $U \subset V$ be given. Moreover, let $\upsilon, \upsilon' \in V$. The vector $\upsilon$ is called equivalent to $\upsilon'$ if $\upsilon - \upsilon' \in U$. Symbolically we write $\upsilon \sim \upsilon'$. The relation $\sim$ is an equivalence relation, since it is reflexive, symmetric and transitive.
(2) The relation $\sim$ splits $V$ into equivalence classes. The set of equivalence classes of $V$ is called quotient space $V/U$.
(3) The set of vectors equivalent to $\upsilon \in V$ is called $\tilde\upsilon$. Thus, $\tilde\upsilon = \tilde\upsilon'$ exactly if $\upsilon - \upsilon' \in U$.
(4) The map $\pi : V \to V/U$ defined by $\pi\upsilon = \tilde\upsilon$ is called canonical projection.

For later, the following is important.

Proposition 9.42. There exists a vector space structure on $V/U$ such that $\pi$ is linear, i.e.,
\[ \pi(\upsilon) + \pi(\upsilon') = \pi(\upsilon + \upsilon'), \qquad \lambda\pi(\upsilon) = \pi(\lambda\upsilon). \tag{9.107} \]
The proof can be found in [29, p. 28].

To continue our reasoning we need some consequences implied in Proposition 9.42.

Conclusion 9.43. Let $\upsilon \in U$. If $\upsilon' \in V$ satisfies $\upsilon' - \upsilon \in U$, then $\upsilon' \in U + \upsilon = U$. Thus, all elements of $U$ are equivalent and $\tilde 0 = U$. For $w \notin U$ and $\upsilon \in U$ we find
\[ \pi(\upsilon + w) = \pi(\upsilon) + \pi(w) = \pi(0) + \pi(w) = \pi(w) \tag{9.108} \]
such that $\pi(0) = U$ is the zero element in $V/U$.

This concludes our introductory remarks on the notion of tensors. Everything we would like to discuss about the notion of tensors in the context of this book follows from the supposition
\[ Y = V_1 \times \cdots \times V_n, \tag{9.109} \]
where $V_j$, $j = 1, \dots, n$, are arbitrary vector spaces. Next we apply the notions we have introduced so far to this special case of $Y$ we are interested in.

Definition 9.44.
(1) The elements of $V_j$ are called $\upsilon_j$, i.e.,
\[ (\upsilon_1, \dots, \upsilon_n) \in Y. \tag{9.110} \]
To label different elements of $V_j$, upper indices are used. Thus,
\[ V_j = \{ \upsilon_j^{\rho_j} : \rho_j \in I_j \}, \]
where $I_j$ is a set of indices. Then
\[ F(V_1 \times \cdots \times V_n) = \Big\{ \sum_{\rho_1, \dots, \rho_n} \alpha_{\rho_1 \cdots \rho_n}\, (\upsilon_1^{\rho_1}, \dots, \upsilon_n^{\rho_n}) \Big\}, \tag{9.111} \]
where the sums only have a finite number of terms.

Moreover, let $N(V_1 \times \cdots \times V_n)$ be the subspace of $F(V_1 \times \cdots \times V_n)$ which is spanned by the vectors
\[ (\upsilon_1, \dots, \alpha\upsilon_j + \beta\upsilon'_j, \dots, \upsilon_n) - \alpha(\upsilon_1, \dots, \upsilon_j, \dots, \upsilon_n) - \beta(\upsilon_1, \dots, \upsilon'_j, \dots, \upsilon_n) \tag{9.112} \]
for $j = 1, \dots, n$ and all $\upsilon_k \in V_k$, $k = 1, \dots, n$, as well as all $\alpha, \beta \in \mathbb{R}$. Hence $N(V_1 \times \cdots \times V_n)$ is the set of all linear combinations of vectors of the form (9.112). Finally, let
\[ T(V_1 \times \cdots \times V_n) := F(V_1 \times \cdots \times V_n) / N(V_1 \times \cdots \times V_n). \tag{9.113} \]
(2) As defined in Definition 9.41(4), $\pi$ is the canonical projection
\[ \pi : F(V_1 \times \cdots \times V_n) \to T(V_1 \times \cdots \times V_n). \tag{9.114} \]
(3) Moreover, we define the function
\[ \Psi : V_1 \times \cdots \times V_n \to T(V_1 \times \cdots \times V_n) \tag{9.115} \]
by
\[ \Psi(\upsilon_1, \dots, \upsilon_n) = \pi(\upsilon_1, \dots, \upsilon_n). \tag{9.116} \]
This definition makes sense, since $V_1 \times \cdots \times V_n \subset F(V_1 \times \cdots \times V_n)$.

$\Psi$ has the following properties:

Proposition 9.45. $\Psi$ is $n$-linear, i.e.,
\[ \Psi(\cdots, \lambda\upsilon_k + \lambda'\upsilon'_k, \cdots) = \lambda\Psi(\cdots, \upsilon_k, \cdots) + \lambda'\Psi(\cdots, \upsilon'_k, \cdots) \tag{9.117} \]
for every $k = 1, \dots, n$ and $\lambda, \lambda' \in \mathbb{R}$.

Proof. Consider the canonical projection $\pi$ applied to elements of $V_1 \times \cdots \times V_n$. Since $\pi$ is linear according to Proposition 9.42, and since
\[ \pi[N(V_1 \times \cdots \times V_n)] = \tilde 0 \tag{9.118} \]
holds, we find
\[ \pi(\upsilon_1, \dots, \lambda\upsilon_j + \lambda'\upsilon'_j, \dots, \upsilon_n) = \lambda\pi(\upsilon_1, \dots, \upsilon_j, \dots, \upsilon_n) + \lambda'\pi(\upsilon_1, \dots, \upsilon'_j, \dots, \upsilon_n) \tag{9.119} \]
for every $j = 1, \dots, n$. Thus, equation (9.117) holds.

To apply the usual notation used in the context of tensors, which was already used in the previous sections, it is necessary to rewrite two of the notions we introduced.

Definition 9.46. The linear space
\[ T(V_1 \times \cdots \times V_n) =: V_1 \otimes \cdots \otimes V_n =: \bigotimes_{j=1}^{n} V_j \tag{9.120} \]
is called tensor space over the vector spaces $(V_1, \dots, V_n)$ and written down as in equation (9.120). The tensor product of vectors $\upsilon_j \in V_j$, $j = 1, \dots, n$, is defined by
\[ \Psi(\upsilon_1, \dots, \upsilon_n) =: \upsilon_1 \otimes \cdots \otimes \upsilon_n. \tag{9.121} \]

The main properties of tensors are summarized as follows.

Conclusion 9.47.
(1) Equation (9.117) immediately implies
\[ \upsilon_1 \otimes \cdots \otimes (\lambda\upsilon_k + \lambda'\upsilon'_k) \otimes \cdots \otimes \upsilon_n = \lambda\, \upsilon_1 \otimes \cdots \otimes \upsilon_k \otimes \cdots \otimes \upsilon_n + \lambda'\, \upsilon_1 \otimes \cdots \otimes \upsilon'_k \otimes \cdots \otimes \upsilon_n \tag{9.122} \]
for $k = 1, \dots, n$. In other words, the tensor product of vectors is $n$-linear.
(2) Let $(e^j_1, \dots, e^j_{d_j})$ be a basis of $V_j$, and let
\[ \upsilon_j = \sum_{\alpha_j = 1}^{d_j} \upsilon_j^{\alpha_j}\, e^j_{\alpha_j} \in V_j \tag{9.123} \]
for $j = 1, \dots, n$. Then
\[ \upsilon_1 \otimes \cdots \otimes \upsilon_n = \sum_{\alpha_1, \dots, \alpha_n = 1}^{d_1, \dots, d_n} \upsilon_1^{\alpha_1} \cdots \upsilon_n^{\alpha_n}\, e^1_{\alpha_1} \otimes \cdots \otimes e^n_{\alpha_n}. \tag{9.124} \]
This implies that the $D := d_1 \cdots d_n$ tensors $e^1_{\alpha_1} \otimes \cdots \otimes e^n_{\alpha_n}$ with $\alpha_j = 1, \dots, d_j$ for $j = 1, \dots, n$ generate all linear combinations of tensors of the form $\upsilon_1 \otimes \cdots \otimes \upsilon_n$ with $\upsilon_j \in V_j$, $j = 1, \dots, n$.
(3) The set
\[ B = \{ e^1_{\alpha_1} \otimes \cdots \otimes e^n_{\alpha_n} : \alpha_j = 1, \dots, d_j,\ j = 1, \dots, n \} \tag{9.125} \]
is a basis of $V_1 \otimes \cdots \otimes V_n$. The proof can be found in [30, pp. 18, 29]. Thus, all elements of $V_1 \otimes \cdots \otimes V_n$ can be written as linear combinations of elements in $B$, and so $V_1 \otimes \cdots \otimes V_n$ is a $D$-dimensional vector space.
(4) Every element $S \in V_1 \otimes \cdots \otimes V_n$ can be written in the form
\[ S = \sum_{\alpha_1, \dots, \alpha_n}^{d_1, \dots, d_n} S^{\alpha_1 \cdots \alpha_n}\, e^1_{\alpha_1} \otimes \cdots \otimes e^n_{\alpha_n}. \tag{9.126} \]
The numbers $S^{\alpha_1 \cdots \alpha_n}$ are called (tensor) components of $S$ with respect to the basis $B$. From this it follows that these components behave under basis transformations like products of vector components. More precisely, let $\bar B$ be another basis, different from $B$, which is generated by the elements
\[ \bar e^j_{\beta_j} \in V_j, \qquad \beta_j = 1, \dots, d_j, \quad j = 1, \dots, n. \tag{9.127} \]
Then, if
\[ e^j_{\alpha_j} = A^{j\beta_j}_{\alpha_j}\, \bar e^j_{\beta_j}, \qquad j = 1, \dots, n \tag{9.128} \]
holds, we obtain
\[ S = \sum_{\beta_1, \dots, \beta_n}^{d_1, \dots, d_n} \bar S^{\beta_1 \cdots \beta_n}\, \bar e^1_{\beta_1} \otimes \cdots \otimes \bar e^n_{\beta_n}, \tag{9.129} \]
with
\[ \bar S^{\beta_1 \cdots \beta_n} = A^{1\beta_1}_{\alpha_1} \cdots A^{n\beta_n}_{\alpha_n}\, S^{\alpha_1 \cdots \alpha_n}. \tag{9.130} \]

For general relativistic theories there are only two types of vector spaces $V_1, \dots, V_n$, namely $T_p M$ and $T_p^\star M$. For special relativity there are two types of such pairs: $T_p M_s$ and $T_p^\star M_s$, as well as $\mathcal{M}_s$ and the corresponding covector space $\mathcal{M}_s^\star$. In the literature on relativity one usually works with the following notation.

Notation 9.48.
(1) The tensor space consisting of $n$ factors, of which $q$ factors are $T_p M$ and $r$ factors are $T_p^\star M$, is represented by
\[ T(q, r) := T_p M \otimes \cdots \otimes T_p M \otimes T_p^\star M \otimes \cdots \otimes T_p^\star M. \tag{9.131} \]
Hence, every $S \in T(q, r)$ can be written as
\[ S = S^{\alpha_1 \cdots \alpha_q}_{\beta_1 \cdots \beta_r}\, \partial_{x^{\alpha_1}} \otimes \cdots \otimes \partial_{x^{\alpha_q}} \otimes dx^{\beta_1} \otimes \cdots \otimes dx^{\beta_r}. \tag{9.132} \]
One calls $S$ a $(q, r)$-tensor. Corresponding notions are used in special relativity for the tensor products of $\mathcal{M}_s$ and $\mathcal{M}_s^\star$.
(2) A $(0, 2)$- or $(2, 0)$-tensor, represented as in equation (9.132), is called symmetric in case the matrix $((S_{\kappa\lambda}))$, respectively $((S^{\rho\sigma}))$, is symmetric.
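The component formulas (9.124) and (9.130) can be made concrete for $n = 2$ factors. The Python sketch below is an illustration with names and numbers chosen here: `tensor` returns the components of $\upsilon_1 \otimes \upsilon_2$, the matrices `A1` and `A2` are arbitrary invertible examples, and `A[beta][alpha]` stores $A^{\beta}_{\alpha}$.

```python
def tensor(u, v):
    """Components (u tensor v)^{ab} = u^a v^b with respect to the product basis, cf. (9.124)."""
    return [[a * b for b in v] for a in u]


def apply(A, x):
    """Change of vector components: x_bar^beta = A^beta_alpha x^alpha."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


def transform(A1, A2, S):
    """(9.130) for n = 2: S_bar^{b1 b2} = A1^{b1}_{a1} A2^{b2}_{a2} S^{a1 a2}."""
    return [[sum(A1[b1][a1] * A2[b2][a2] * S[a1][a2]
                 for a1 in range(len(S)) for a2 in range(len(S[0])))
             for b2 in range(len(A2))]
            for b1 in range(len(A1))]


A1 = [[1, 2], [0, 1]]                    # basis change in V_1 (dimension 2)
A2 = [[2, 0, 1], [0, 1, 0], [1, 0, 1]]   # basis change in V_2 (dimension 3)
u, v = [1, 2], [3, 0, -1]

S = tensor(u, v)
S_bar = transform(A1, A2, S)

print(S_bar)                                          # [[25, 0, 10], [10, 0, 4]]
print(S_bar == tensor(apply(A1, u), apply(A2, v)))    # True
```

Transforming the components of the product agrees with forming the product of the transformed vector components, which is what "behave like products of vector components" means in item (4) of Conclusion 9.47.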
9.7 Lorentzian manifolds

The starting point for the considerations in this section is the notion of a $C^k$-manifold introduced in Section 9.1. Moreover, we use the results from Sections 9.3, 9.4, and 9.6. With these it is possible to define fundamental notions which are important for all of relativistic physics.

Definition 9.49.
(1) Given an $n$-dimensional $C^k$-manifold $M = (M, \mathcal{A})$, $k \geq 1$, with its tensor spaces $T(q, r)$, $q \geq 0$, and $r \geq 0$. Moreover, define a function
\[ F^q_r : M \to T(q, r) \tag{9.133} \]
such that for every $p \in M$ and every chart $(U, \varphi)$ with $p \in U$
\[ F^q_r(p) = F^{\alpha_1, \dots, \alpha_q}_{\beta_1, \dots, \beta_r}(p)\, \partial_{x^{\alpha_1}} \otimes \cdots \otimes \partial_{x^{\alpha_q}} \otimes dx^{\beta_1} \otimes \cdots \otimes dx^{\beta_r} \tag{9.134} \]
holds, with $\partial_{x^\alpha} \in T_p M$ and $dx^\beta \in T_p^\star M$. $F^q_r$ is called $C^j$-tensor field, $j \leq k$, in $M$ if $F^{\alpha_1, \dots, \alpha_q}_{\beta_1, \dots, \beta_r} \circ \varphi^{-1}$ is in $C^j$ for every chart $(U, \varphi) \in \mathcal{A}$.
(2) Usually, one calls a $(1, 0)$-field a vector field, a $(0, 1)$-field a covector field, and a $(0, 0)$-field a scalar field. All other fields are simply called tensor fields.

As a special case of this notion we obtain the definition of a metric of a manifold as follows.

Definition 9.50.
(1) Let $M = (M, \mathcal{A})$ be the $n$-dimensional $C^k$-manifold from the previous definition. Consider a $(0, 2)$-tensor field $g$ on $M$ with the properties
1.1) $g$ is a $C^k$ tensor field.
1.2) $g$ is symmetric, i.e., for every $p \in M$ and every pair $u, \upsilon \in T_p M$ the equation
\[ g(p)(u, \upsilon) = g(p)(\upsilon, u) \tag{9.135} \]
holds.
1.3) $g$ is nondegenerate, i.e., for every $p \in M$ and a $u \in T_p M$ the equation
\[ g(p)(u, \upsilon) = 0 \tag{9.136} \]
holds for all $\upsilon \in T_p M$ if and only if $u = 0$.
1.4) Let $p \in M$ and $(U, \varphi)$ be a chart of $\mathcal{A}$ with $p \in U$. Moreover, let
\[ g(p) = g_{\kappa\lambda}(p)\, dx^\kappa_p \otimes dx^\lambda_p. \tag{9.137} \]
The signature $s$ of the matrix $((g_{\kappa\lambda}(p)))$ is identical for all $p \in M$ and is defined by $s = n - 2m$, where $m$ is the number of negative eigenvalues of the matrix $((g_{\kappa\lambda}(p)))$.
The $(0, 2)$-tensor field $g$ is called metric, or metrical tensor, on $M$.
(2) The manifold $M = (M, \mathcal{A}, g)$ is called Riemannian for $s = n$ and semi-Riemannian for $s = n - 2m$, with $m \geq 1$.

Conclusion 9.51.
(1) It follows from equation (9.137) that $g$ is symmetric for all $p \in M$ if and only if the matrices $((g_{\kappa\lambda}(p)))$ are symmetric for all $p \in M$.
(2) Moreover, $g$ is nondegenerate for all $p \in M$ if and only if the matrices $G(p) := ((g_{\kappa\lambda}(p)))$ are nonsingular. This in turn is equivalent to the property that all eigenvalues of $G(p)$ are different from $0$. To see this consider a symmetric and singular $n \times n$ matrix $A$. Then there exists an orthogonal matrix $P$ and a diagonal matrix $D$ such that
\[ A = P^T \cdot D \cdot P. \tag{9.138} \]
Since $\det A = 0$ and $\det P = \pm 1$ we find that $\det D = 0$. Thus, $0$ is an eigenvalue of $A$ and there exists a vector $u \in \mathbb{R}^n$ such that $u \neq 0$ and $A \cdot u = 0$. This implies $\upsilon^T \cdot A \cdot u = 0$ for all $\upsilon \in \mathbb{R}^n$, and thus $A$ is degenerate. By these results the above definition of the signature $s$ is justified, because all eigenvalues are either positive or negative.
After all these preparations we can finally define the mathematically precise notion of a spacetime.

Definition 9.52. An $n$-dimensional semi-Riemannian manifold with metric of signature $s = n - 2$, which is connected and has the Hausdorff property (see [16, p. 13, 14]), is called an $n$-dimensional Lorentzian manifold. In standard relativistic physics, such as special and general relativity, one uses 4-dimensional manifolds, i.e., $s = 2$.

In the previous chapters and sections we sometimes used local Minkowski charts. At the end of this section their existence will be proved.

Proposition 9.53. Let $M = (M, \mathcal{A}, g)$ be a semi-Riemannian manifold whose metric has signature $n - 2$. Then for each point $p \in M$ there exists a local Minkowski chart $(U, \chi)$ with $p \in U$.

Proof. The desired chart $(U, \chi)$ is constructed under the assumption that it is not yet contained in $\mathcal{A}$. Given a chart $(U, \varphi) \in \mathcal{A}$ with $p \in U$ and $\varphi(q) = x$ for $q \in U$. Then
\[ g(q) = g_{\alpha\beta}(q)\, dx^\alpha \otimes dx^\beta. \tag{9.139} \]
Since the matrix $G := ((g_{\alpha\beta}(p)))$ is symmetric there exists an orthogonal matrix $A = ((A^{\mu}_{\nu}))$ which diagonalizes $G$:
\[ g_{\alpha\beta}(p)\, A^{\alpha}_{\kappa} A^{\beta}_{\lambda} = b_\kappa\, \eta_{\kappa\lambda} \tag{9.140} \]
for $\kappa, \lambda = 1, \dots, n$, since $G$ has only one negative eigenvalue. Thus, $b_\kappa > 0$ for every $\kappa = 1, \dots, n$. We can introduce new coordinates $y$ on $U$ by setting
\[ y = A^{-1} \cdot x =: \phi(x). \tag{9.141} \]
According to equation (9.49) we obtain
\[ dy^\kappa = \frac{\partial \phi^\kappa}{\partial x^\lambda}\, dx^\lambda = (A^T)^{\kappa}_{\lambda}\, dx^\lambda. \tag{9.142} \]
Thus, $dx^\lambda = A^{\lambda}_{\kappa}\, dy^\kappa$ such that
\[ g(p) = g_{\kappa\lambda}(p)\, A^{\kappa}_{\alpha} A^{\lambda}_{\beta}\, dy^\alpha \otimes dy^\beta = b_\alpha\, \eta_{\alpha\beta}\, dy^\alpha \otimes dy^\beta. \tag{9.143} \]
Now, $b_\varrho > 0$, $\varrho = 1, \dots, n$, so we can introduce further coordinates $z$ by setting $z^\alpha = b_\alpha^{\frac{1}{2}} y^\alpha$ such that $dy^\alpha = b_\alpha^{-\frac{1}{2}} dz^\alpha$ holds. This implies with (9.142) the result
\[ g(p) = \eta_{\alpha\beta}\, dz^\alpha \otimes dz^\beta. \tag{9.144} \]
The coordinates $z$ are local Minkowski coordinates around $p \in U$. The local Minkowski chart $(U, \chi)$ around $p \in U$ is defined by
\[ \chi(q) = \operatorname{diag}\big(b_1^{\frac{1}{2}}, \dots, b_n^{\frac{1}{2}}\big) \cdot A^T \cdot \varphi(q) \tag{9.145} \]
for every $q \in U$.

A fortiori this proposition holds for all Lorentzian manifolds.
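The construction in the proof of Proposition 9.53 can be sketched numerically in the smallest nontrivial case $n = 2$, where the eigenvalues and eigenvectors of $G$ are available in closed form. The Python code below is only an illustration under assumptions made here: the function names are invented for this example, the example matrix `G` is arbitrary, and the restriction to $2 \times 2$ matrices is a simplification that is not part of the proof.

```python
import math


def minkowski_chart_matrix(G):
    """For a symmetric 2x2 matrix G with one positive and one negative eigenvalue,
    return M = diag(sqrt(b_1), sqrt(b_2)) * A^T as in (9.145)."""
    (a, b), (_, d) = G
    trace, det = a + d, a * d - b * b
    if det >= 0:
        raise ValueError("G needs exactly one negative eigenvalue")
    root = math.sqrt(trace * trace - 4 * det)
    rows = []
    for lam in ((trace + root) / 2, (trace - root) / 2):   # positive eigenvalue first
        if b != 0:
            vec = (b, lam - a)                  # eigenvector of G for eigenvalue lam
        else:
            vec = (1.0, 0.0) if abs(lam - a) <= abs(lam - d) else (0.0, 1.0)
        scale = math.sqrt(abs(lam)) / math.hypot(*vec)     # sqrt(b_kappa) / norm
        rows.append([scale * vec[0], scale * vec[1]])
    return rows


def pullback_eta(M):
    """Return M^T * eta * M for eta = diag(1, -1)."""
    return [[M[0][i] * M[0][j] - M[1][i] * M[1][j] for j in range(2)] for i in range(2)]


G = [[3, 4], [4, -3]]            # eigenvalues +5 and -5, so b_1 = b_2 = 5
M = minkowski_chart_matrix(G)    # here close to [[2, 1], [1, -2]]
print(M)
print(pullback_eta(M))           # reproduces G up to rounding
```

With $z = M \cdot x$ one has $\eta_{\alpha\beta}\, dz^\alpha \otimes dz^\beta = (M^T \eta M)_{\kappa\lambda}\, dx^\kappa \otimes dx^\lambda$, so recovering `G` from `pullback_eta(M)` is the statement (9.144) read backwards. The matrix `M` is determined only up to the signs of the eigenvectors.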
Without discussing the mathematical details we can now draw the following conclusion from the proposition above: For each point p of a general relativistic spacetime there exists a neighborhood U in which special relativity holds approximately. The approximation gets more exact the smaller one chooses U .
Epilogue

The initiative to write this book goes back to a long-standing debate with Dr. Wolfgang J. C. Müller (previously Dornier System GmbH, Friedrichshafen), a friend and former fellow student of mine, on results and problems in special and general relativity. Certain differences between our views on this subject turned out to be very fruitful. I decided to combine these results systematically into one manuscript for later use. In his critical reviews of this text, W. J. C. Müller pointed out misprints and suggested improvements, which I introduced into the manuscript, which at that time was written in German.

Finally, I offered the manuscript to the publishing company Walter de Gruyter, Berlin. It was accepted on the condition that the German text be translated into English. For this task the company was fortunate to engage Dr. Christian Pfeifer (ZARM Bremen), who performed the translation in a rather short time. Thus, the only thing still to do was proofreading. In this situation my friend and colleague Prof. Dr. Rolf Breuer (University of Paderborn) helped me enormously by a very careful and critical reading of the whole text.

I thank all three of those named above very much for their efforts. Finally, I want to thank Dr. Konrad Kieling, Mrs. Dipl. Phys. Astrid Seifert and Mrs. M. Sc. Nadja Schedensack as the representatives of the publishing company for their effective cooperation in the realization of my book project.

Joachim Schröter
University of Paderborn, Department of Physics
D-33098 Paderborn, Germany
e-mail: [email protected].
DOI 10.1515/9783110485738-011
Bibliography

[1] Einstein A. Zur Elektrodynamik bewegter Körper. Ann Physik. 1905; 17(4): 891.
[2] Minkowski H. Raum und Zeit. Physikalische Zeitschrift. 1909; 10: 104.
[3] Einstein A. Die Feldgleichungen der Gravitation. Berlin: Akademie der Wissenschaften; 1915; Sitzungsberichte, p. 844.
[4] Hilbert D. Die Grundlagen der Physik. Nachrichten von der Kön. Ges. der Wissenschaften zu Göttingen, math-phys Klasse. 1915; p. 395.
[5] Reichenbach H. The Philosophy of Space and Time. New York: Dover Publications; 1958.
[6] Ehlers J, Pirani FAE, Schild A. The geometry of free fall and light propagation. In O’Raifeartaigh L (ed.). General Relativity. Oxford: Clarendon Press; 1972; p. 63.
[7] Meister R. A structural analysis of the Ehlers-Pirani-Schild space-time-theory. Univ Paderborn: Diplomarbeit; 1991. English version: Center for Interdisciplinary Research, University of Bielefeld; 2004. Available on the internet under this title.
[8] Schröter J. An Axiomatic Basis of Space-Time Theory, Part I. Rep Math Phys. 1988; 26; p. 303.
[9] Schröter J, Schelb U. An Axiomatic Basis of Space-Time Theory, Part II. Rep Math Phys. 1992; 31; p. 5.
[10] Schelb U. An Axiomatic Basis of Space-Time Theory, Part III. Rep Math Phys. 1992; 31; p. 297.
[11] Schröter J, Schelb U. On the Relation between Space-Time Theory and General Relativity, Final Report. Bielefeld: Center for Interdisciplinary Research, University of Bielefeld; 1992/93.
[12] Schelb U. Zur physikalischen Begründung der Raum-Zeit Geometrie. Paderborn: Habilitation thesis, University of Paderborn; 1997; Internet: dr. udo schelb.
[13] von Westenholz C. Differential Forms in Mathematical Physics. Amsterdam, Oxford: North-Holland Publ. Company; 1986.
[14] Sachs RK, Wu H. General Relativity for Mathematicians. Heidelberg, Berlin: Springer-Verlag; 1977.
[15] Hawking SW, Ellis GFR. The large scale structure of space-time. Cambridge: Cambridge Monographs on Math. Physics; 1989.
[16] Choquet-Bruhat Y, de Witt-Morette C. Analysis, Manifolds and Physics. Amsterdam, Oxford: North-Holland Publ. Company; 1996.
[17] Vogelsberger M, et al. Introducing the Illustris Project. Simulating the coevolution of dark and visible matter in the Universe. Available on the internet under arXiv: 1405.2921v1.
[18] Springel V, et al. The Aquarius Project. Mon Not R Astr Soc. 2008; 391: 1685.
[19] Stephani H. Allgemeine Relativitätstheorie. Berlin: Deutscher Verlag der Wissenschaften; 1980.
[20] Papapetrou A. Spezielle Relativitätstheorie. Berlin: Deutscher Verlag der Wissenschaften; 1955.
[21] Lorentz HA. Attempt of a Theory of Electrical and Optical Phenomena in Moving Bodies. Brill Leiden; 1895.
[22] Neiss F, Liermann H. Determinanten und Matrizen. Berlin: Springer-Verlag; 1975.
[23] Hafele J, Keating R. Around the world atomic clocks observed relativistic time gains. Science. 1972; 177: 166.
[24] Lorentz HA. Die relative Bewegung der Erde und des Äthers. Abhandlungen über Theoretische Physik, Leipzig: B.G. Teubner; 1907, p. 443.
[25] FitzGerald GF. The Ether and the Earth’s Atmosphere. Science. 1889; 13: 390.
[26] Schröter J. Über die Bilder bewegter Objekte und die Unsichtbarkeit der Lorentz-Kontraktion. Z Naturforschung. 1966; 21a: 669.
[27] Grotemeyer KP. Topologie. Mannheim: Bibliographisches Institut; 1969; Hochschulskripten 836, p. 28.
[28] Cullen HF. Introduction to General Topology. Boston: D.C. Heath and Company; 1975.
[29] Greub W. Linear Algebra. Berlin, New York: Springer-Verlag; 1981.
[30] Greub W. Multilinear Algebra. Berlin, New York: Springer-Verlag; 1967.

DOI 10.1515/9783110485738-012
Index

1-forms 96
aberration
– of light 87
addition of velocities
– relativistic 39
atlas
– complete 91
canonical projection 110
causal 47, 61
causal cone 58, 62
– backward 62
– forward 62
– future 62
– past 62
causal future 63
causal past 63
causal relation 62
causality 62
chronological future 62, 63
chronological past 63
chronological relation 62
chronology 62
C^k-atlas 91
clock of γ 71
clock on γ[I] 71
coordinates
– non-Minkowskian 52
cosmological constant 4
cotangent space 96
covariant derivatives 59
covector field 114
covectors 96
decomposition
– nonorthochronous Lorentz matrices 35
– nonuniqueness 36
– of products 39
decomposition theorem
– interpretation of the 34
– Lorentz matrices 29
difference angle
– of the aberration 89
direct sum 104
energy momentum tensor 4
EPS axiomatic 1
Euler decomposition 42
Euler Theorem 41
finite selection functions
– over Y 108
first clock effect 82
free vector space
– over the set Y 109
future-pointing 56, 59, 62
gravitational constant
– Einsteinian 4
inertial system 70
initial observer 70
inner product 99
– definite 99
– indefinite 99
instant observer 70
length contraction 83
light ray 69
light signal 69
light world line 69
lightcurve 68
lightlike 47, 61
lightlike signal 69
linear forms 96
local Minkowski coordinates 6
Lorentz 1
Lorentz group 10
Lorentz manifold 3
Lorentz matrices
– orthogonal 21
– rotational 21
Lorentz matrix 6
– antichronous 15
– orthochronous 15
– proper 15
– special 18
Lorentz transformation 6
– homogeneous 10
– orthochronous 24
– proper 24
– special 24
Lorentz vector space 3, 99
Lorentzian manifold 55, 115
manifold 1
– n-dimensional C^k 91
– differentiable 91
metric 114
metrical tensor 114
Minkowski basis 99
Minkowski chart 5, 6
Minkowski coordinate 5, 6
Minkowski observer 70
Minkowski reference system 70
Minkowski space 3
Minkowski spacetime 3
musical operators 102
negatively oriented 64
Newtonian velocity 75
nondegenerate 99
observers 70
OC condition 27
orientation 60
– of the vector space M_s 64
oriented 60
– negatively 60
– positively 60
orthogonal complement 105
parallelism 59
past-pointing 56, 59, 62
physical equivalence map 101
physically equivalent 101
Poincaré group 11
positively oriented 64
proper time 71
proper time clock 71
quotient space 110
radar chart 79
radar coordinates 79
reference system 70
Ricci curvature scalar 4
Ricci tensor 4
Riemannian
– manifold 114
scalar 114
Sch2 theory 1
semi-Riemannian 114
signal
– material 66
space
– of an observer 74
space-time theories 1
spacelike 47, 61
standard clock of γ 71
structure
– differentiable 91
tangent space
– at p ∈ M 95
– of p 95
tangent vector 92
tangent vectors
– spacelike 49
– timelike 47
tensor field
– C^j 113
– j ≤ k 113
time
– of an observer 74
time dilation 82
time on γ[I] 71
time orientability 55
time orientable 55
time orientation 55
– on M_s 56
time parameter of γ 71
timelike 47, 61
twin paradox 82
universal clock 73
vector field 114
– parallel 59
world-curve 66
worldline 66