

Robert B. Scott

Problems and Solutions on Vector Spaces for Physicists From Part I in Mathematical Physics—A Modern Introduction to Its Foundations

Problems and Solutions on Vector Spaces for Physicists

Robert B. Scott

Problems and Solutions on Vector Spaces for Physicists From Part I in Mathematical Physics—A Modern Introduction to Its Foundations

Robert B. Scott Laboratoire de Mathématiques de Bretagne Atlantique Département de Physique Université de Bretagne Occidentale Brest, France

ISBN 978-3-031-31217-5    ISBN 978-3-031-31218-2 (eBook)
https://doi.org/10.1007/978-3-031-31218-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

For Redmar and Karsten, who remind me, every once in a short while, just how wonderful is the world.

Foreword

Robert Scott has done a great service to the physics community by providing highly detailed solutions to a third of the exercises from the first part of my textbook Mathematical Physics. Those of you who have used his A Student’s Manual for a First Course in General Relativity will recognize his style, wherein each step is carefully justified. He has also added supplementary problems, one third with solutions and two thirds without, so that the total number of unsolved problems linked to my book has not decreased. His solutions make systematic reference to the theorems, propositions, lemmas and definitions in my book, allowing the reader to quickly find the relevant material needed to solve the problems. Of course, students should try all the exercises on their own first. When you are successful without help, Scott’s solutions can be used simply for verification—I have personally double-checked all of Scott’s solutions, and in rare cases provided my own solutions to him when I felt my solutions were more direct. But when you are stuck, Scott’s solutions provide an invaluable resource to quickly find the material you are missing, so you can efficiently advance in your learning. Learning mathematical physics is often challenging, so it is well worthwhile to invest in the material that will support you on your journey. The rewards are certainly worth the effort. Urbana, IL, USA September 2022

Sadri Hassani


Preface

Why Use Hassani’s Textbook?

If you are an undergraduate or master’s physics student launching a career in physics research, there are two good reasons why studying a rigorous mathematical physics textbook will be a valuable use of your time: (i) it is easier to learn the mathematics in its more natural habitat rather than the watered-down version relegated to the appendix of a physics text, and (ii) you will acquire a more powerful set of tools that you will know how to use, and not to misuse, with confidence. Most physics students will respond that they do not want to be mathematicians; they want to do physics. But in the long run, it will be more efficient to study the formal foundations of the mathematics you need for physics rather than rely on the vague arguments and appeals to physical intuition that are used in the physics classroom in place of precise mathematics. There is great generality in abstraction (=⇒ power) and there is great precision in rigor (=⇒ confidence). As an added bonus, you will become more bilingual, opening access to the wealth of formal mathematics literature and better able to understand your colleagues in theoretical physics who speak and write in a mathematical dialect.

Why is it easier to study mathematical physics than to learn the mathematics from physics courses and textbooks? The mathematics presented in physics textbooks is often not adequate on its own. In an attempt to make their physics textbook self-contained, the author will often include a preliminary chapter on the requisite mathematics, or cram the essential theorems into an appendix. Invariably the author warns that these are no substitute for a proper course on this essential background material. You should take that warning seriously.

Mathematics is often the stumbling block in physics education. As a physics student, my peers and I were intimidated by the difficult physics courses like electricity and magnetism and general relativity because of the demanding mathematics. Now I am a university lecturer and my peers are mostly pure mathematicians. I was intrigued and surprised when several of them confessed to me that they loved physics and indeed wanted to become physicists but ended up mathematicians because they found mathematics much easier than physics.


As I learned more and more mathematics from different perspectives, I finally made sense of their point of view. Even mathematicians have trouble following the mathematics in physics textbooks because it is often not clearly presented.

Is the solution to turn to pure mathematics textbooks to learn the requisite mathematics for physics? After one fights one’s way past the funny symbols and becomes accustomed to the “style in which pure mathematicians think and write” (p. 196 of Misner, Thorne and Wheeler, Gravitation, W. H. Freeman and Co., 1973), the explanations in a pure mathematics book are often much clearer and more precise. The problem here is of course the great investment of time required. The 19-year-old Paul Dirac, having finished his engineering degree and unable to find work, did an undergraduate degree in mathematics. Not all of us have the luxury to do this. Most advanced mathematics books that cover the material interesting to physics students assume the reader has taken several mathematics courses that are typical of the mathematics undergraduate curriculum (notably analysis and algebra) but not that of physics. At the other end of the spectrum, one can find many textbooks on mathematical methods for physicists and engineers. Typically, these textbooks will teach you a variety of techniques and do little to help you learn rigorous mathematics. For some physicists, this may be enough. But for the ambitious aspiring theoretical physicist, an investment of precious time now with a mathematical physics textbook that emphasizes the foundations with a more rigorous presentation will soon pay off in your career.

My experience is that there are several textbooks that achieve this attention to rigor while remaining accessible to physicists lacking formal mathematics training, Hassani’s textbook being by far the most comprehensive. Others that achieve this delicate balance include the old classic by the masters Richard Courant and David Hilbert (Methods of Mathematical Physics, Volume 1, Wiley) and, from a more modern perspective, the textbook of Peter Szekeres (Methods of Modern Mathematical Physics, Cambridge University Press, 2004). At a slightly more introductory level but written to the same standards of rigor is the recent textbook of Alexander Altland and Jan von Delft (Mathematics for Physicists, Cambridge University Press, 2019). Several more specialized textbooks covering limited topics achieve this balance of mathematical rigor and physicist-friendly style, for example:

V. I. Arnold, Mathematical Methods of Classical Mechanics, Springer, 1989
J. David Logan, Invariant Variational Principles, Academic Press, 1977
Hanno Rund, The Hamilton-Jacobi Theory in the Calculus of Variations, D. Van Nostrand Company Ltd., 1966
David Lovelock and Hanno Rund, Tensors, Differential Forms, and Variational Principles, Dover Publications, 1989
Bernard Schutz, Geometrical Methods of Mathematical Physics, Cambridge University Press, 1980.


For the last 5 years I have been a physicist among pure mathematicians. I infiltrated the gauge theory working group of the mathematics laboratory Laboratoire de Mathématiques de Bretagne Atlantique (LMBA) and became an active member of their fortnightly meetings. It has been a very challenging experience with huge language and cultural barriers to surmount. But it was worth the extraordinary effort, for I have learned many useful and sometimes surprising things from this unique opportunity. The language barrier between mathematicians and physicists is huge, much larger and more frustrating than the French-English barrier. Working among the mathematicians I also learned of the great power and generality of abstract mathematics and how useful this can be in learning physics. I offer two first-hand experiences to illustrate this.

I have given several talks on general relativity (GR) and find it much harder to talk to nonspecialist physicists than to the mathematicians of LMBA. With the physicists who do not know the mathematics of GR, I have to give a vague, intuitive, hand-waving explanation of what spacetime curvature is just to explain the basic idea of GR, and then still feel my hands are tied when I want to present the results or message of my talk. But with the mathematicians, who know much more differential geometry than I do, I feel I can introduce the basic idea of GR and then move on to describe precisely the message of my talk.

As another illustrative example of the power of mathematics, I confess my experience learning quantum field theory (QFT). I convinced the working group that they should learn QFT because this is the theory behind the standard model of particle physics that motivates much of the mathematics that interests the group. So they organized a mini course with a sympathetic physicist, Glenn Barnich. My math colleagues, who have none of the prerequisites for learning QFT (no course in classical mechanics, no quantum mechanics, no special relativity, no relativistic quantum mechanics, no electricity and magnetism), were soon far ahead of me, asking penetrating questions and amazingly learning the essentials of QFT faster than me. Admittedly the course was tailored to them, and Glenn did a remarkable job and patiently addressed their naive questions, but still, it was a sobering lesson to me of the power of a strong mathematics background.

Maybe every aspiring physicist should, like Paul Dirac, do a mathematics degree before going to Cambridge for a Ph.D. in physics? Not all of us have time for a degree in mathematics, so a shortcut is highly desirable and self-study with the appropriate textbooks is a convenient solution. Here one has to be careful because, mathematics being highly hierarchical by nature with everything built from set theory and the real numbers, jumping in at a graduate-level or even advanced undergraduate-level textbook the physicist immediately finds she does not have the prerequisites. Many times I have found myself excited to pick up a mathematics textbook that covers background relevant to physics, then soon became deflated as I found myself lost in a maze of funny symbols and the impenetrable language of pure mathematics. My experience is that a good mathematical physics textbook, written for physicists but with a rigorous theorem-proof style, offers a viable shortcut to useful mathematics and the next best alternative to returning to first-year mathematics textbooks and working one’s way systematically through the undergraduate syllabus.


How to Use This Solution Manual? If you picked up this book you must know by now the value of solving problems, and know to use this manual wisely; look at the solution only after you have at least tried to solve the problem yourself. I suggest that if you are stuck on a given problem, try writing down exactly why you are stuck. Imagine explaining to a friend what you do not understand fully. Then re-read the appropriate part of Hassani’s text, or if in doubt the whole chapter. With these questions in mind, you will read like a detective looking for clues. It is a pleasure to thank Sadri Hassani for proofreading an earlier version. It is a pleasure to thank my kind and generous colleagues in Laboratoire de Mathématiques de Bretagne Atlantique for their help in preparing this manuscript, especially Luis Gallardo, Johan Huismann, Jean-Philippe Nicolas, Rachid Regbaoui, (and last alphabetically, but foremost for his patience) Carl Tipler. La Forest-Landerneau, France August 2020

Robert B. Scott

Contents

1 Mathematical Preliminaries
  1.1 Problems
  1.2 Supplementary Problems
2 Vectors and Linear Maps
  2.1 Problems
  2.2 Supplementary Problems
3 Algebras
  3.1 Problems
  3.2 Supplementary Problems
4 Operator Algebra
  4.1 Problems
  4.2 Supplementary Problems
5 Matrices
  5.1 Problems
  5.2 Supplementary Problems
6 Spectral Decomposition
  6.1 Problems
  6.2 Supplementary Problems
References
Index

Acronyms and Symbols

∀      For all or for any. For example, ∀x ∈ A reads "for every x in the set A."
∼      (In set theory) related to. An operation on ordered pairs of elements from a set that takes values either true or false.
\      (In set theory) difference. A\B = {a ∈ A | a ∉ B}.
≡      Equal by definition.
◦      Function or map composition. See Fig. 1.1.
ε_σ    Permutation symbol. Also called the Levi-Civita symbol or the alternating tensor. ε_σ has value +1 for an even permutation σ and −1 for an odd permutation σ. See Problems 2.36, 5.18 and 5.50.
GR     (In physics) The general theory of relativity, invented by Einstein and completed in 1915. There are many rigorous introductions at the level of Hassani, for example M. P. Hobson, G. P. Efstathiou and A. N. Lasenby, General Relativity: An Introduction for Physicists, Cambridge University Press, 2006.
iff    If and only if. In logic we have that A iff B is logically equivalent to A =⇒ B and B =⇒ A, often written A ⇔ B.
LHS    (In this book) The left-hand side of the equation in question.
QFT    (In physics) Quantum field theory, the theory of advanced quantum mechanics that lies at the base of the standard model of particle physics. Along with general relativity it is the most fundamental theory we currently have.
RHS    (In this book) The right-hand side of the equation in question.


Chapter 1

Mathematical Preliminaries

Abstract This chapter introduces some of the basic tools (sets, equivalence relations and equivalence classes, maps, metric spaces, etc.). This is standard material in the education of a mathematician but often overlooked in a physics education. It will be useful for the physics student to form a precise notion of these fundamental concepts. Equivalence classes and factor sets will arise for the physicist in group theory [6, Chap. 1] and some of the properties proved by Cornwell are more easily proved in the general setting of equivalence classes. Maps are of course very familiar to physics students but are traditionally treated with more care by mathematicians. For instance, mathematicians never introduce a new function without defining its domain X and codomain Y, often using the notation f : X → Y explained by Hassani in Sect. 1.2. This attention to detail does not take much effort yet could avoid so much confusion; physicists should really follow the mathematical tradition here.

1.1 Problems

1.3 For each natural number n excluding zero, n ∈ N*, let

    I_n = { x : |x − 1| < n  and  |x + 1| > 1/n }.    (1.1)

Find ∪_n I_n and ∩_n I_n. [Consider x ∈ R.]

1.3 This is just a little exercise in getting used to the notation of set theory. You can approach the problem formally and use the elementary rules of real-number analysis, especially

    |y| = y if y ≥ 0,  −y if y ≤ 0,    y ∈ R,    (1.2)

and the fact that the inequality changes direction when you multiply both sides by −1, i.e.

    x > y  =⇒  −x < −y.    (1.3)

However, it's faster to think physically; think of the two constraints as requiring that the distance from +1 must be less than n and the distance from −1 must be greater than 1/n, respectively. Setting n = 1, 2, 3, 4 we find the first two intervals of the sequence are a bit special:

    I_1 = (0, 2),
    I_2 = (−1/2, 3),
    I_3 = (−2, −(1 + 1/3)) and (−2/3, 4),
    I_4 = (−3, −(1 + 1/4)) and (−3/4, 5),
    ...
    I_n = (−n + 1, −(1 + 1/n)) and (−1 + 1/n, n + 1).    (1.4)

It is clear that the previous set I_n is a subset of I_{n+1} for any nonzero natural number n ∈ N*. So for the union of all sets I_n we can take the limit of I_n for n → ∞. We find all the reals except −1:

    ∪_n I_n = lim_{n→∞} I_n = R \ {−1}.    (1.5)

For the intersection of all sets I_n we can take the first interval of I_n, that is I_1:

    ∩_n I_n = I_1 = (0, 2).    (1.6)
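If you want to experiment numerically, the following short Python sketch (my own illustration, not part of the original solution; the helper name I is mine) uses sympy's interval arithmetic to spot-check the union and intersection found above.

```python
# Spot-check of Problem 1.3 with sympy interval arithmetic (illustrative only).
from sympy import Interval, Intersection, Union, Rational, oo

def I(n):
    """I_n = {x : |x - 1| < n and |x + 1| > 1/n} as a sympy set."""
    near_one = Interval.open(1 - n, 1 + n)                      # |x - 1| < n
    far_from_minus_one = Union(Interval.open(-oo, -1 - Rational(1, n)),
                               Interval.open(-1 + Rational(1, n), oo))
    return Intersection(near_one, far_from_minus_one)

print(I(1))                                         # expect Interval.open(0, 2)
print(Intersection(*[I(n) for n in range(1, 8)]))   # finite intersections collapse to I_1
print(Union(*[I(n) for n in range(1, 8)]))          # finite unions grow towards R \ {-1}
```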

1.6 Show that (f ◦ g)⁻¹ = g⁻¹ ◦ f⁻¹ when f and g are both bijections.


1.6 Denote the domain and codomain of g and f as the sets X, Y, W so that g : X → Y and f : Y → W. Figure 1.1a below shows what's going on here, and Fig. 1.1b shows how to construct the inverse. Let's see how to prove this rigorously. Recall that in general a map G : W → X is the inverse of map F : X → W iff both

    G ◦ F = id_X,    (1.7)
    F ◦ G = id_W,    (1.8)

where id_A is the identity map on set A. Here let F = f ◦ g. We wish to show that G = g⁻¹ ◦ f⁻¹ is the inverse of F. First note that we are guaranteed that the inverses g⁻¹ and f⁻¹ exist because we are told that g and f are bijections. It suffices to verify both Eqs. (1.7) and (1.8). The first is satisfied because

    G ◦ F = (g⁻¹ ◦ f⁻¹) ◦ (f ◦ g)
          = g⁻¹ ◦ (f⁻¹ ◦ f) ◦ g,    composition is associative,
          = g⁻¹ ◦ id_Y ◦ g,
          = g⁻¹ ◦ g = id_X,    (1.9)

and Eq. (1.8) is satisfied because

    F ◦ G = (f ◦ g) ◦ (g⁻¹ ◦ f⁻¹)
          = f ◦ (g ◦ g⁻¹) ◦ f⁻¹,    composition is associative,
          = f ◦ id_Y ◦ f⁻¹ = f ◦ f⁻¹ = id_W.    (1.10)

The existence of G such that Eq. (1.7) is satisfied is a sufficient condition that F : X → W is injective. The existence of G such that Eq. (1.8) is satisfied is a sufficient condition that F : X → W is surjective. A map that is both injective and surjective is called bijective. A bijective map has an inverse.
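A tiny Python sketch (my own illustration, not part of the text) makes the result concrete for bijections between finite sets represented as dictionaries.

```python
# Finite-set illustration of Problem 1.6 (a sketch): for bijections stored as dicts,
# the inverse of the composition is the composition of the inverses in reverse order.
g = {1: 'a', 2: 'b', 3: 'c'}           # g : X -> Y
f = {'a': 10, 'b': 20, 'c': 30}        # f : Y -> W

compose = lambda outer, inner: {x: outer[inner[x]] for x in inner}
invert = lambda m: {v: k for k, v in m.items()}

fg = compose(f, g)                      # f o g : X -> W
assert invert(fg) == compose(invert(g), invert(f))   # (f o g)^{-1} = g^{-1} o f^{-1}
```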

1.9 Take any two non-vanishing open intervals (a, b) and (c, d), in the reals, i.e. b > a, d > c, and a, b, c, d ∈ R, and show that there are as many points in the first as there are in the second, regardless of the size of the intervals. Hint: Find a (linear) algebraic relation between points of the two intervals.


Fig. 1.1 a The composition f ◦ g : X → W of maps g : X → Y and f : Y → W is illustrated graphically here. Note that the codomain of each map is identical to its range, i.e. all points in the codomain have corresponding points in the domain because f and g are surjective. b The direction of the arrows is reversed for the inverse map (f ◦ g)⁻¹. This graph illustrates the desired relation (f ◦ g)⁻¹ = g⁻¹ ◦ f⁻¹

1.9 Naively one would expect that an interval of the real line twice as long as another would have twice as many points. In this exercise we prove all intervals of the real line (a, b), b > a, a, b ∈ R have precisely the same number of points. We exploit the fact that if two sets A and B are related by a bijection f : A → B then A and B have precisely the same number of elements. Following the hint, we choose as our bijection a simple affine relation f : (a, b) → (c, d) given by:

    f(x) = c + (d − c) (x − a)/(b − a),    x ∈ (a, b), b > a, d > c.    (1.11)

f is clearly surjective and injective and therefore bijective. This suffices to prove that intervals (a, b) and (c, d) have the same number of points.
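The following small Python sketch (my own illustration; the function names are mine) evaluates the affine map of Eq. (1.11) and its explicit inverse, confirming the round trip numerically.

```python
# Numerical illustration of the bijection in Eq. (1.11) (a sketch, not part of the text).
def f(x, a, b, c, d):
    """Affine map sending the open interval (a, b) onto (c, d)."""
    return c + (d - c) * (x - a) / (b - a)

def f_inv(y, a, b, c, d):
    """Inverse map, sending (c, d) back onto (a, b)."""
    return a + (b - a) * (y - c) / (d - c)

a, b, c, d = 0.0, 1.0, -5.0, 7.0
for x in [0.1, 0.5, 0.9]:
    y = f(x, a, b, c, d)
    assert c < y < d                                  # image lies in (c, d)
    assert abs(f_inv(y, a, b, c, d) - x) < 1e-12      # round trip recovers x
```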

1.2 Supplementary Problems

1.12 Let A be the set of all English words and for a, b ∈ A let a ∼ b be interpreted as "a rhymes with b." Show ∼ is an equivalence relation.

1.13 Verify Hassani Proposition 1.1.2 that if ∼ is an equivalence relation on set A and a, b ∈ A, then either ⟦a⟧ ∩ ⟦b⟧ = ∅ or ⟦a⟧ = ⟦b⟧.


1.13 An equivalence relation on set A is, of course, a type of relation on set A and that means that for any pair of elements a, b ∈ A, either it is true or not true that a ∼ b. In the latter case we could write a ≁ b. Suppose a ∼ b. By the symmetry and transitivity properties of an equivalence relation we have that ⟦a⟧ = ⟦b⟧. In more detail, the equivalence classes ⟦a⟧ and ⟦b⟧ are defined to be the sets

    ⟦a⟧ = {c ∈ A | c ∼ a},    ⟦b⟧ = {c ∈ A | c ∼ b}.    (1.12)

One can show that membership in ⟦b⟧ implies membership in ⟦a⟧ as follows. By the symmetry of an equivalence relation, c ∼ b implies b ∼ c. Then by transitivity of an equivalence relation, a ∼ b and b ∼ c implies a ∼ c. Again by symmetry this implies c ∼ a. Thus ⟦b⟧ ⊂ ⟦a⟧. Similarly, membership in ⟦a⟧ implies membership in ⟦b⟧ because of transitivity, so ⟦a⟧ ⊂ ⟦b⟧. Thus ⟦a⟧ = ⟦b⟧.

The only other possibility is a ≁ b. We seek the intersection between the two equivalence classes ⟦a⟧ ∩ ⟦b⟧ in this case.

    ⟦a⟧ ∩ ⟦b⟧ = {c ∈ A | c ∈ ⟦a⟧ and c ∈ ⟦b⟧},    by definition of intersection,
              = {c ∈ A | c ∼ a and c ∼ b},        used Eq. (1.12)
              = {c ∈ A | a ∼ c and c ∼ b}.        used symmetry of ∼    (1.13)

Transitivity of an equivalence relation implies that if a c meeting this condition existed then a ∼ b. But we supposed this not to be the case; therefore no such c exists and we conclude

    ⟦a⟧ ∩ ⟦b⟧ = ∅.    (1.14)

1.14 Let ∼ be the relation on the set of integers Z defined by m ∼ n for m, n ∈ Z if m − n is divisible by k [with zero remainder understood], where k is a fixed, positive integer. Verify that ∼ is an equivalence relation. Furthermore, show that the factor set in this case is

    Z/∼ = {⟦0⟧, ⟦1⟧, . . ., ⟦k − 1⟧}.    (1.15)

Repeat this problem for k a fixed negative integer.

1.15 Why is the function det : M_{n×n} → R defined by det(A) = det A, the determinant of the real n × n matrix A, surjective?


1.15 To verify that det : M_{n×n} → R is surjective we must show that this function can take as its value any real number. First we note that the determinant of the identity matrix, regardless of the dimension n, is one:

    det(𝟙) = det diag(1, 1, . . ., 1) = 1^n = 1.

And now it suffices to observe that we can replace any one of these 1's with an arbitrary real number a ∈ R, and this gives the determinant the value a.
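A quick numerical check of this argument (my own sketch, not part of the text) can be made with numpy: a diagonal matrix with a single entry a and 1's elsewhere has determinant a.

```python
# Illustrative check for Problem 1.15.
import numpy as np

n = 4
for a in [-3.7, 0.0, 2.5]:
    M = np.eye(n)
    M[0, 0] = a                      # replace one diagonal 1 by a
    assert np.isclose(np.linalg.det(M), a)
```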

1.16 The abstract notion of distance d given in Hassani Definition 1.3.1 for a metric space E allows us to define the fundamental notion of an open subset of E. A subset A ⊂ E is open if for each element x ∈ A we can find a real number ε > 0 for which the open set B_ε(x) centred on x of radius ε,

    B_ε(x) = {y ∈ A | d(x, y) < ε},    (1.16)

is contained within A, i.e. B_ε(x) ⊂ A. The set B_ε(x) is often called the open ball centred on x of radius ε because of the geometric interpretation when the metric space E is Euclidean space. Prove the following important theorem. A bijective map f : A → A is continuous on A ⊂ E iff the inverse image V = f⁻¹(U) of every open set U ⊂ A is an open subset of A. Some advanced physics textbooks, for example [20, Appendix A], will require that you know this.

1.17 Prove that the power set 2^S of a countably infinite set S is uncountable.

1.17 We use Cantor's famous diagonal argument, which arrives at a contradiction when we try to construct a pairing between the natural numbers and the power set of S. Because S is countable we can organize its members into a sequence of elements S = {s_1, s_2, . . .}. Consider constructing an arbitrary subset U_1 ⊂ S; it is not important for us whether U_1 is finite or not. We do so by deciding, for each element s_n ∈ S, whether or not to include s_n in U_1. And we could encode the information that characterizes U_1 with a sequence x_1 of binary digits ε_{1n}, say 1's or 0's to indicate s_n is included or not:

    x_1 = {1, 0, 0, 1, . . .},    U_1 contains s_1 and s_4 but not s_2 and s_3, etc.    (1.17)


And similarly we could characterize another subset U_2 ⊂ S with sequence x_2. Now we attempt to put all subsets of S in one-to-one correspondence with N by indexing them U_m, m ∈ N. This implies we can form an array with row m corresponding to sequence x_m. Cantor's beautiful insight was to note that this construction suggests a sequence x (of binary digits ε_n) that is not in the array but represents another subset of S. For instance we could take x as the sequence obtained by flipping the digits of the diagonal of the array. That is, x differs from x_m at position n = m:

    ε_n = 1 if ε_{nn} = 0,    ε_n = 0 if ε_{nn} = 1.    (1.18)

So we have found a new subset of S that differs from all the others U_m, m ∈ N. This contradicts the fact that the rows of our array should characterize all possible subsets of S. We conclude the power set of a countably infinite set is not finite and not countable and therefore it is uncountable. Our solution embellishes the proof of Szekeres [19, Theorem 1.4].

1.18 The interval of the reals [0, 1] is uncountable. Hint: Put the elements of this interval into one-to-one correspondence with the power set 2^N.

1.19 Prove that Hassani Eq. (1.1) for the binomial (a + b)^m,

    (a + b)^m = \sum_{k=0}^{m} \binom{m}{k} a^{m−k} b^k,    (1.19)

is unchanged upon interchange of a and b.

1.19 Interchanging a and b in Eq. (1.19) and introducing a new dummy index ℓ we find

    (b + a)^m = \sum_{ℓ=0}^{m} \binom{m}{ℓ} b^{m−ℓ} a^ℓ.    (1.20)

We want to compare Eqs. (1.19) and (1.20) and in fact this can be done term by term because matching powers of a also gives the corresponding power of b for each term:

    b^{m−ℓ} a^ℓ = a^{m−k} b^k,    when ℓ = m − k,    ∀k = 0, 1, . . ., m.    (1.21)

So we require that

    \binom{m}{m−k} = \binom{m}{k}.

This is true because

    \binom{m}{m−k} = m! / [(m−k)! (m−(m−k))!] = m! / [(m−k)! k!] = \binom{m}{k}.

This confirms that the binomial (a + b)^m is unchanged upon interchange of a and b, as anticipated.
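A symbolic spot-check of this symmetry for one fixed power (an illustrative sketch, not part of the original solution) can be done with sympy.

```python
# Spot-check of Problem 1.19 with sympy.
from sympy import symbols, binomial, expand

a, b = symbols('a b')
m = 7   # any fixed power will do for a spot check
lhs = sum(binomial(m, k) * a**(m - k) * b**k for k in range(m + 1))
rhs = sum(binomial(m, k) * b**(m - k) * a**k for k in range(m + 1))
assert expand(lhs - rhs) == 0                                        # the two expansions agree
assert all(binomial(m, k) == binomial(m, m - k) for k in range(m + 1))   # C(m,k) = C(m,m-k)
```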

Chapter 2

Vectors and Linear Maps

Abstract The notion of a vector is abstracted and made precise with the mathematical structure of a vector space. Linear maps are important operations on vector spaces that find applications in many areas of science and indeed are essential for quantum mechanics.

2.1 Problems

2.3 For each of the following subsets of R³ determine whether it is a subspace of R³:

    (a) {(x, y, z) ∈ R³ | x + y − 2z = 0};
    (b) {(x, y, z) ∈ R³ | x + y − 2z = 3};
    (c) {(x, y, z) ∈ R³ | xyz = 0}.    (2.1)

2.3 We simply try to form arbitrary linear combinations and check if they meet the definition of the subset. If so, then they form a subspace. (a) {(x, y, z) ∈ R3 |x + y − 2z = 0}. Consider two points (x1 , y1 , z 1 ) and (x2 , y2 , z 2 ) that are part of the subset (satisfy the subset-defining equation). We form an arbitrary linear combination α(x1 , y1 , z 1 ) + β(x2 , y2 , z 2 ) = (x3 , y3 , z 3 ) and find that it does satisfy the subset-defining equation: x3 + y3 − 2z 3 = (αx1 + βx2 ) + (αy1 + β y2 ) − 2(αz 1 + βz 2 ), rearranged = α(x1 + y1 − 2z 1 ) + β(x2 + y2 − 2z 2 ), = 0, (2.2) for any α, β ∈ C. In the final step of Eq. (2.2) we used the fact that (x1 , y1 , z 1 ) and (x2 , y2 , z 2 ) satisfy the subset-defining equation.



(b) {(x, y, z) ∈ R3 |x + y − 2z = 3}. Consider two points (x1 , y1 , z 1 ) and (x2 , y2 , z 2 ) that are part of the subset (satisfy the subset-defining equation). We form an arbitrary linear combination α(x1 , y1 , z 1 ) + β(x2 , y2 , z 2 ) = (x3 , y3 , z 3 ) and find that it does not satisfy the subset-defining equation: x3 + y3 − 2z 3 = αx1 + βx2 + αy1 + β y2 − 2(αz 1 + βz 2 ), = α(x1 + y1 − 2z 1 ) + β(x2 + y2 − 2z 2 ) = α 3 + β 3 = 3.

(2.3)

Thus the equation is not satisfied for arbitrary α, β ∈ C. This subset is not a subspace. (c) {(x, y, z) ∈ R3 |x y z = 0}. Consider two points (x1 , y1 , z 1 ) and (x2 , y2 , z 2 ) that are part of the subset (satisfy the subset-defining equation). We form an arbitrary linear combination α(x1 , y1 , z 1 ) + β(x2 , y2 , z 2 ) = (x3 , y3 , z 3 ) and find that it does not satisfy the subset-defining equation: x3 y3 z 3 = (αx1 + βx2 )(αy1 + β y2 )(αz 1 + βz 2 ) = 0, =⇒ (x3 = 0) or (y3 = 0) or (z 3 = 0).

(2.4)

This equation is not satisfied for arbitrary α, β ∈ C. A counter example is (x1 , y1 , z 1 ) = (0, 1, 1) and (x2 , y2 , z 2 ) = (1, 0, 1). Then Eq. (2.4) reduces to x3 y3 z 3 = βα(α + β) = 0.

(2.5)

This is not true for arbitrary α, β ∈ C. This subset is not a subspace.
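A quick numerical illustration of the three closure tests above (my own sketch, not part of the text; the point values and coefficients are arbitrary choices) can be run with numpy.

```python
# Closure tests for Problem 2.3 (illustrative sketch).
import numpy as np

in_a = lambda p: np.isclose(p[0] + p[1] - 2 * p[2], 0)
in_b = lambda p: np.isclose(p[0] + p[1] - 2 * p[2], 3)
in_c = lambda p: np.isclose(p[0] * p[1] * p[2], 0)

alpha, beta = 2.0, -1.5
# (a) two points in the plane x + y - 2z = 0: the combination stays in the plane
p1, p2 = np.array([1.0, 1.0, 1.0]), np.array([2.0, 0.0, 1.0])
assert in_a(p1) and in_a(p2) and in_a(alpha * p1 + beta * p2)
# (b) the combination leaves the shifted plane x + y - 2z = 3
q1, q2 = np.array([3.0, 0.0, 0.0]), np.array([0.0, 3.0, 0.0])
assert in_b(q1) and in_b(q2) and not in_b(alpha * q1 + beta * q2)
# (c) the counterexample of Eq. (2.5): (0,1,1) and (1,0,1)
r1, r2 = np.array([0.0, 1.0, 1.0]), np.array([1.0, 0.0, 1.0])
assert in_c(r1) and in_c(r2) and not in_c(alpha * r1 + beta * r2)
```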

2.6 Prove Theorem 2.1.6: If S is any nonempty set of vectors in a vector space V, then the set W S of all linear combinations of vectors in S is a subspace of V.

2.6 Suppose |a⟩ and |b⟩ are arbitrary vectors in W_S. Then by definition they can be written as a linear combination of the elements s_k of S:

    |a⟩ = \sum_{k=1}^{N} α_k |s_k⟩,    |b⟩ = \sum_{k=1}^{N} β_k |s_k⟩,

where N is the number of vectors in S and α_k, β_k are complex coefficients. We must show that an arbitrary linear combination of these vectors is a vector in W_S. Let A, B ∈ C be constants. Then

    |c⟩ = A|a⟩ + B|b⟩ = A \sum_{k=1}^{N} α_k |s_k⟩ + B \sum_{k=1}^{N} β_k |s_k⟩ = \sum_{k=1}^{N} γ_k |s_k⟩,

where γ_k = Aα_k + Bβ_k ∈ C. So |c⟩ has the correct form to be a vector in W_S. This suffices to show that W_S, called the span of S, is a subspace of V.

2.9 Show that the vectors defined by

    | c_k ⟩ = (| a_k ⟩, | 0 ⟩_V),        if 1 ≤ k ≤ M,
    | c_k ⟩ = (| 0 ⟩_U, | b_{k−M} ⟩),    if M + 1 ≤ k ≤ M + N,        Hassani Eq. (2.5)    (2.6)

span W = U × V = Ũ ⊕ Ṽ. Here {| a_i ⟩}_{i=1}^{M} is a basis of vector space U, {| b_i ⟩}_{i=1}^{N} is a basis of vector space V, Ũ is the set of vectors of the form (| u ⟩, | 0 ⟩_V) with | u ⟩ ∈ U, and Ṽ is the set of vectors of the form (| 0 ⟩_U, | v ⟩) with | v ⟩ ∈ V.

2.9 Note the tilde symbol has been introduced to distinguish vectors | u ⟩ ∈ U and (| u ⟩, | 0 ⟩_V) ∈ Ũ. The sets Ũ and Ṽ are subspaces of W by Hassani Proposition 2.1.19. The span of {| c_k ⟩}_{k=1}^{N+M} is the set of vectors of the form

    Span{| c_k ⟩}_{k=1}^{N+M} = \sum_{k=1}^{N+M} γ_k | c_k ⟩,   γ_k ∈ C        definition of span
      = ( \sum_{k=1}^{M} γ_k | a_k ⟩, | 0 ⟩_V ) + ( | 0 ⟩_U, \sum_{k=M+1}^{N+M} γ_k | b_{k−M} ⟩ ),    used Eq. (2.6)
      = ( \sum_{k=1}^{M} γ_k | a_k ⟩ + | 0 ⟩_U, | 0 ⟩_V + \sum_{j=1}^{N} γ_{j+M} | b_j ⟩ ),    relabelled index
      = (u, v) ≡ u × v = W,    u ∈ U, v ∈ V.    (2.7)

In the last line u and v are completely general vectors because {| a_i ⟩}_{i=1}^{M} is a basis of vector space U and {| b_i ⟩}_{i=1}^{N} is a basis of vector space V.

2.12 Given the linearly independent vectors x(t) = t^n, for n = 0, 1, 2, . . . in P^c[t], use the Gram–Schmidt process to find the orthonormal polynomials e_0(t), e_1(t), and e_2(t)

(a) when the inner product is defined as ⟨ x | y ⟩ = \int_{−1}^{1} x*(t) y(t) dt.
(b) when the inner product is defined with a nontrivial weight function

    ⟨ x | y ⟩ = \int_{−∞}^{∞} e^{−t²} x*(t) y(t) dt.    (2.8)

Hint: Use the following result:

    \int_{−∞}^{∞} e^{−t²} t^n dt = √π if n = 0;  0 if n is odd;  √π · 1·3·5···(n−1) / 2^{n/2} if n is even.

2.12(a) Recall P^c[t] is the set of polynomials in t with complex coefficients. Clearly the given set of polynomials {t^n}_{n=0,1,...} forms a basis. We can always form an orthonormal basis by taking linear combinations of this basis (see the statement at the beginning of Hassani Sect. 2.2.2). We apply the Gram–Schmidt process, starting with | e′_0 ⟩ = t^0 = 1, which has squared norm

    ⟨ e′_0 | e′_0 ⟩ = \int_{−1}^{1} dt = 2.    (2.9)

We divide | e′_0 ⟩ by its norm to obtain our first normalized basis vector,

    | e_0 ⟩ = (1/√2) | e′_0 ⟩ = 1/√2.    (2.10)

In Hassani Sect. 2.2.2 the first basis vector was called | e_1 ⟩, not | e_0 ⟩. This is just a convenient label and here it was more convenient to label the first one | e_0 ⟩ because it corresponds to a polynomial of power zero. Next we find | e′_1 ⟩ by subtracting the projection of the next polynomial | a_1 ⟩ = t onto | e_0 ⟩ from | a_1 ⟩:

    | e′_1 ⟩ = | a_1 ⟩ − ⟨ e_0 | a_1 ⟩ | e_0 ⟩ = t − ( (1/√2) \int_{−1}^{1} t dt ) | e_0 ⟩ = t.    (2.11)

The integral of an odd function over an interval centred on zero vanishes, implying that | a_1 ⟩ and | e_0 ⟩ are orthogonal. All we have to do is normalize | e′_1 ⟩:

    | e_1 ⟩ = | e′_1 ⟩ / √⟨ e′_1 | e′_1 ⟩ = √(3/2) | e′_1 ⟩ = √(3/2) t.    (2.12)

The next polynomial | a_2 ⟩ = t² is orthogonal to | e_1 ⟩ but not to | e_0 ⟩ so we must subtract off this projection:

    | e′_2 ⟩ = | a_2 ⟩ − ⟨ e_0 | a_2 ⟩ | e_0 ⟩ = t² − ( (1/√2) \int_{−1}^{1} t² dt ) (1/√2) = t² − (1/√2) (1/3)[t³]_{−1}^{1} (1/√2) = t² − 1/3.    (2.13)

We thereby obtain an | e′_2 ⟩ orthogonal to both | e_0 ⟩ and | e_1 ⟩. Normalizing, we obtain

    | e_2 ⟩ = | e′_2 ⟩ / √⟨ e′_2 | e′_2 ⟩ = √(45/8) | e′_2 ⟩ = √(45/8) ( t² − 1/3 ).    (2.14)

(b) In part (b) only the inner product has changed, but this does change the orthonormal basis we construct via the Gram–Schmidt process. Starting with | e′_0 ⟩ = t^0 = 1, the squared norm is now

    ⟨ e′_0 | e′_0 ⟩ = \int_{−∞}^{∞} e^{−t²} dt = π^{1/2}.    (2.15)

We divide | e′_0 ⟩ by its norm to obtain our first normalized basis vector,

    | e_0 ⟩ = (1/π^{1/4}) | e′_0 ⟩ = 1/π^{1/4}.    (2.16)

Again we find | e′_1 ⟩ by subtracting the projection of the next polynomial | a_1 ⟩ = t onto | e_0 ⟩ from | a_1 ⟩:

    | e′_1 ⟩ = | a_1 ⟩ − | e_0 ⟩ ⟨ e_0 | a_1 ⟩ = t − ( (1/π^{1/4}) \int_{−∞}^{∞} e^{−t²} t dt ) | e_0 ⟩ = t.    (2.17)

Note that | a_1 ⟩ and | e_0 ⟩ are orthogonal as a result of the antisymmetry; the new inner product involves an integral centred at the origin with an even weight function. All we have to do is normalize | e′_1 ⟩. Here

    ⟨ e′_1 | e′_1 ⟩ = \int_{−∞}^{∞} e^{−t²} t² dt = π^{1/2}/2,    (2.18)

so

    | e_1 ⟩ = | e′_1 ⟩ / √⟨ e′_1 | e′_1 ⟩ = (√2 / π^{1/4}) t.    (2.19)

So the next polynomial | a_2 ⟩ = t² is also orthogonal to | e_1 ⟩ but not to | e_0 ⟩ so we must subtract off this projection:

    | e′_2 ⟩ = | a_2 ⟩ − ⟨ e_0 | a_2 ⟩ | e_0 ⟩ = t² − (1/π^{1/4}) ( \int_{−∞}^{∞} e^{−t²} t² dt ) (1/π^{1/4}) = t² − (1/π^{1/2}) (π^{1/2}/2) = t² − 1/2.    (2.20)

We thereby obtain an | e′_2 ⟩ orthogonal to both | e_0 ⟩ and | e_1 ⟩. Finally we must normalize | e′_2 ⟩. The squared norm is

    ⟨ e′_2 | e′_2 ⟩ = \int_{−∞}^{∞} e^{−t²} ( t² − 1/2 )² dt = 3π^{1/2}/4 − π^{1/2}/2 + π^{1/2}/4 = π^{1/2}/2,    (2.21)

so we obtain

    | e_2 ⟩ = | e′_2 ⟩ / √⟨ e′_2 | e′_2 ⟩ = (√2 / π^{1/4}) ( t² − 1/2 ).    (2.22)
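The whole calculation can be automated symbolically. The sketch below (my own illustration, not part of the original solution; the function names are mine) runs Gram–Schmidt with sympy for both inner products; the complex conjugate is dropped because the test polynomials are real. It reproduces the orthonormal polynomials quoted above.

```python
# Symbolic Gram-Schmidt check for Problem 2.12 (illustrative sketch).
from sympy import symbols, integrate, sqrt, exp, oo, simplify

t = symbols('t', real=True)

def gram_schmidt(vectors, inner):
    basis = []
    for v in vectors:
        for e in basis:
            v = v - inner(e, v) * e                          # remove projection onto earlier e's
        basis.append(simplify(v / sqrt(inner(v, v))))        # normalize
    return basis

# (a) <x|y> = integral_{-1}^{1} x y dt
inner_a = lambda x, y: integrate(x * y, (t, -1, 1))
print(gram_schmidt([1, t, t**2], inner_a))
# expected: 1/sqrt(2), sqrt(3/2)*t, sqrt(45/8)*(t**2 - 1/3)

# (b) <x|y> = integral_{-inf}^{inf} e^{-t^2} x y dt
inner_b = lambda x, y: integrate(exp(-t**2) * x * y, (t, -oo, oo))
print(gram_schmidt([1, t, t**2], inner_b))
# expected: pi**(-1/4), sqrt(2)*t/pi**(1/4), sqrt(2)*(t**2 - 1/2)/pi**(1/4)
```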

2.15 Show that

    \int_{−∞}^{∞} (t^{10} − t^6 + 5t^4 − 5) e^{−t^4} dt ≤ \sqrt{ \int_{−∞}^{∞} (t^4 − 1)^2 e^{−t^4} dt } \sqrt{ \int_{−∞}^{∞} (t^6 + 5)^2 e^{−t^4} dt }.    (2.23)

Hint: Define an appropriate inner product and use the Schwarz inequality.


2.15 Inspection of Eq. (2.23) reveals that the common element in each factor is the weighted integral

    \int_{−∞}^{∞} f(t) e^{−t^4} dt,    (2.24)

for various functions f(t). Following the hint, we guess that Eq. (2.23) has the form

    | ⟨ p | q ⟩ | ≤ \sqrt{⟨ p | p ⟩} \sqrt{⟨ q | q ⟩},    (2.25)

where the inner product is given by the weighted integral Eq. (2.24), i.e.

    ⟨ p | q ⟩ ≡ \int_{−∞}^{∞} p*(t) q(t) e^{−t^4} dt.    (2.26)

Because e^{−t^4} is real-valued, continuous and strictly positive, Eq. (2.26) is of the form Hassani Eq. (2.10), which is an inner product for p(t), q(t) ∈ P^c[t], the space of all polynomials in t with complex coefficients. In fact this candidate inner product is quite similar to Eq. (2.8) used in Problem 2.12(b). It is then straightforward to verify that

    p(t) = t^4 − 1,    q(t) = t^6 + 5,    (2.27)

confirming that Eq. (2.23) has the form Eq. (2.25). Thus the inequality Eq. (2.23) holds by the Schwarz inequality, Hassani Theorem 2.2.7.
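A crude numerical sanity check of the inequality (my own sketch, not part of the text; the integration window and grid are arbitrary choices) can be made with a Riemann sum, since e^{−t⁴} is negligible beyond |t| of a few.

```python
# Numerical sanity check of Eq. (2.23) (illustrative sketch).
import numpy as np

t, dt = np.linspace(-6.0, 6.0, 200001, retstep=True)
w = np.exp(-t**4)                              # weight defining the inner product (2.26)
p = t**4 - 1
q = t**6 + 5
inner = lambda f, g: np.sum(f * g * w) * dt    # Riemann-sum approximation of (2.26)
lhs = inner(p, q)                              # integral of (t^10 - t^6 + 5t^4 - 5) e^{-t^4}
rhs = np.sqrt(inner(p, p) * inner(q, q))
assert lhs <= rhs                              # Schwarz inequality
```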

2.18 Using the Schwarz inequality show that if {α_i}_{i=1}^{∞} and {β_i}_{i=1}^{∞} are in C^∞, then \sum_{i=1}^{∞} α_i^* β_i is convergent.

2.18 Recall, from Hassani Example 2.1.2, example 12, that C^∞ is the set of all complex sequences {α_i}_{i=1}^{∞} for which

    \sum_{i=1}^{∞} |α_i|² < ∞.    (2.28)

For any N ∈ N the two finite sequences {α_i}_{i=1}^{N}, {β_i}_{i=1}^{N} are members of the inner product space C^N, the Schwarz inequality applies and implies that

    \sqrt{ \sum_{i=1}^{N} |α_i|² } \sqrt{ \sum_{i=1}^{N} |β_i|² } ≥ | \sum_{i=1}^{N} α_i^* β_i |,    ∀N ∈ N.    (2.29)

Now let's create two new vectors by simply taking the absolute value of each term, | a′ ⟩ = {|α_i|}_{i=1}^{N}, | b′ ⟩ = {|β_i|}_{i=1}^{N}; these are also members of the inner product space, | a′ ⟩, | b′ ⟩ ∈ C^N, so the Schwarz inequality still applies, and because the terms are real the complex conjugate on the RHS gives

    \sqrt{ \sum_{i=1}^{N} |α_i|² } \sqrt{ \sum_{i=1}^{N} |β_i|² } ≥ | \sum_{i=1}^{N} |α_i| |β_i| |,    used |α_i| ∈ R so |α_i|* = |α_i|,
      = \sum_{i=1}^{N} |α_i^* β_i|,    ∀N ∈ N.    (2.30)

The LHS of Eq. (2.30) didn't change from that of Eq. (2.29) but the absolute values on the RHS moved inside the sum. Taking the limit N → ∞, the LHS of Eq. (2.30) remains finite by virtue of the assumption Eq. (2.28) applied to | a′ ⟩ and | b′ ⟩. This shows that our series \sum_{i=1}^{∞} α_i^* β_i converges absolutely; the sum of absolute values of the terms converges. Absolute convergence implies convergence so our task is now complete.

Absolute convergence might seem like overkill here but it is actually very important. A convergent series of complex numbers that is not absolutely convergent is called conditionally convergent and its sum can depend upon the order of the terms. This is explained clearly in Sect. 1.3 of Appel's textbook [2]. (My pure mathematician friends Carl Tipler and Jean-Philippe Nicolas were especially helpful with this question.)

2.21 Let π be the permutation that takes (1, 2, 3) to (3, 1, 2). Find

    Aπ | e_i ⟩,    i = 1, 2, 3,    (2.31)

where {| e_i ⟩}_{i=1}^{3} is the standard basis of R³ (or C³), and Aπ is as defined in Example 2.3.5: if | x ⟩ = (η_1, η_2, . . ., η_n) is a vector in C^n, we can write

    Aπ | x ⟩ = (η_{π(1)}, η_{π(2)}, . . ., η_{π(n)}).    (2.32)


2.21 We can think of π as a type of mapping from N into N. Because (1, 2, 3) was mapped to (3, 1, 2) we see that π(1) = 3, π(2) = 1, π(3) = 2. The standard basis of R³ or C³ was given in Hassani Example 2.1.10:

    | e_1 ⟩ = (1, 0, 0),    | e_2 ⟩ = (0, 1, 0),    | e_3 ⟩ = (0, 0, 1).    (2.33)

Applying Aπ to | e_1 ⟩ we obtain

    Aπ | e_1 ⟩ = (η_{π(1)}, η_{π(2)}, η_{π(3)}) = (η_3, η_1, η_2) = (0, 1, 0) = | e_2 ⟩.    (2.34)

Similarly,

    Aπ | e_2 ⟩ = (η_{π(1)}, η_{π(2)}, η_{π(3)}) = (η_3, η_1, η_2) = (0, 0, 1) = | e_3 ⟩,
    Aπ | e_3 ⟩ = (η_{π(1)}, η_{π(2)}, η_{π(3)}) = (η_3, η_1, η_2) = (1, 0, 0) = | e_1 ⟩.    (2.35)
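A short numpy sketch (my own illustration; the helper name A_pi is mine) implements Eq. (2.32) directly and confirms the cyclic action on the standard basis.

```python
# Illustrative check of Problem 2.21: A_pi permutes the components of a vector.
import numpy as np

pi = {1: 3, 2: 1, 3: 2}                        # the permutation taking (1, 2, 3) to (3, 1, 2)

def A_pi(x):
    return np.array([x[pi[i] - 1] for i in (1, 2, 3)])   # (eta_pi(1), eta_pi(2), eta_pi(3))

e = np.eye(3)                                  # rows are |e_1>, |e_2>, |e_3>
assert (A_pi(e[0]) == e[1]).all()              # A_pi |e_1> = |e_2>
assert (A_pi(e[1]) == e[2]).all()              # A_pi |e_2> = |e_3>
assert (A_pi(e[2]) == e[0]).all()              # A_pi |e_3> = |e_1>
```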

2.24 Give an example of a function f : R² → R such that

    f(α| a ⟩) = α f(| a ⟩),    ∀α ∈ R and | a ⟩ ∈ R²,    (2.36)

but f is not linear. Hint: Consider a homogeneous function of degree 1.

2.24 Consider f defined by

    f(x, y) = \sqrt{x² + y²}.    (2.37)

If | a ⟩ = (x, y) is an arbitrary vector in R² and α any real number then α| a ⟩ = (αx, αy) and

    f(α| a ⟩) = \sqrt{(αx)² + (αy)²} = α \sqrt{x² + y²} = α f(| a ⟩),    (2.38)

so the criterion in Eq. (2.36) is satisfied. Furthermore, f is not linear. Let | a ⟩ = (x_1, y_1) and | b ⟩ = (x_2, y_2). Then

    f(| a ⟩ + | b ⟩) = \sqrt{(x_1 + x_2)² + (y_1 + y_2)²}.    (2.39)

In general this is not equal to

    f(| a ⟩) + f(| b ⟩) = \sqrt{x_1² + y_1²} + \sqrt{x_2² + y_2²}.    (2.40)

This is related to the notion of a homogeneous function of order α ∈ R. Consider a function h : V → R defined on the vector space V. If

    h(λ| a ⟩) = λ^α h(| a ⟩),    ∀λ ∈ R, | a ⟩ ∈ V,    (2.41)

then h is homogeneous of degree α. So f in Eq. (2.37) is homogeneous of degree one. This exercise shows that the condition of linearity is more strict than homogeneity of degree one.
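A small numerical illustration (my own sketch, not part of the text; the sample points are arbitrary) shows that f scales with a positive factor yet fails additivity, so it cannot be linear.

```python
# Illustration for Problem 2.24: f(x, y) = sqrt(x^2 + y^2) scales but is not additive.
import math

f = lambda x, y: math.hypot(x, y)

a, b = (3.0, 4.0), (-1.0, 2.0)
alpha = 2.5
assert math.isclose(f(alpha * a[0], alpha * a[1]), alpha * f(*a))   # scaling (checked for alpha > 0)
s = (a[0] + b[0], a[1] + b[1])
assert not math.isclose(f(*s), f(*a) + f(*b))                       # fails additivity, so not linear
```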

2.27 Let V and W be finite dimensional vector spaces. Show that if T ∈ L(V, W) is surjective, then dimW ≤ dimV.

2.27 Recall from Hassani Sect. 1.2 that a surjective map f : X → Y from set X to set Y has range f(X) = Y, i.e. for a surjective map the range is the entire codomain. Here that map is the linear transformation T : V → W and because T is surjective we have T(V) = W. By the dimension theorem, see Hassani 2.3.13, we have for any linear transformation T : V → W mapping between finite dimensional vector spaces V and W:

    dimV = dim(T(V)) + dim(ker T),    used dimension theorem
         = dim(W) + dim(ker T).       used T is surjective    (2.42)

The dimension of a vector space is a natural number so dim(ker T) ∈ N, being zero when the space consists of the single vector, ker T = {| 0 ⟩}. Thus Eq. (2.42) implies the inequality

    dimV ≥ dimW.    (2.43)

2.30 Using Hassani Theorem 2.3.11, prove Hassani Theorem 2.3.19.

2.30 Recall Hassani Theorem 2.3.19 states that an injective linear transformation T : V → W carries linearly independent sets of vectors onto linearly independent sets of vectors. Let v = {| v_i ⟩}_{i=1}^{N} be a linearly independent set of vectors in V. This set is carried to the set w = {| w_i ⟩}_{i=1}^{N} = T(v) ⊂ W. We wish to establish that the injectivity of T implies the linear independence of w. By Hassani Definition 2.1.3, w is linearly independent iff the following equation for the scalars b_i has only the trivial solution:

    \sum_{i=1}^{N} b_i | w_i ⟩ = | 0 ⟩    =⇒    b_i = 0, i = 1, . . ., N.    (2.44)

Our goal is to establish the implication in Eq. (2.44). By the definition of w we can write each | w_i ⟩ = T| v_i ⟩. Substitute this into the LHS of Eq. (2.44) and use linearity of T:

    \sum_{i=1}^{N} b_i T| v_i ⟩ = | 0 ⟩,           used definition of w
    T( \sum_{i=1}^{N} b_i | v_i ⟩ ) = | 0 ⟩,       used linearity of T
    =⇒ \sum_{i=1}^{N} b_i | v_i ⟩ ∈ ker T.         used definition of kernel    (2.45)

Now we turn to Hassani Theorem 2.3.11, which states that a linear transformation is 1–1 (injective) iff its kernel is the set containing only the zero vector. Thus the injectivity of T implies ker T = {| 0 ⟩}, which we substitute in the last line of Eq. (2.45):

    \sum_{i=1}^{N} b_i | v_i ⟩ = | 0 ⟩    =⇒    b_i = 0, i = 1, . . ., N,    used linear independence of v.    (2.46)

Thus we have established the implication in Eq. (2.44), which completes the proof.

2.33 Show that W0 is a subspace of V* and

    dimV = dimW + dimW0.    (2.47)

Here we are assuming we have a finite dimensional vector space V with dual space V* and W is a subspace of V. Furthermore, W0 ⊂ V* is defined as the set of linear functionals φ that annihilate any vector in W, i.e.

    W0 = { φ | φ(| w ⟩) = 0, ∀| w ⟩ ∈ W }.    (2.48)


2.33 Let φ, θ ∈ W0 be two arbitrary linear functionals in the set W0 ⊂ V*. We wish to show that an arbitrary linear combination

    aφ + bθ = γ    (2.49)

is also in the set W0 for arbitrary a, b ∈ C (or reals if V is a real space). This is so because

    γ| w ⟩ = (aφ + bθ)(| w ⟩) = aφ| w ⟩ + bθ| w ⟩ = 0.    used linearity    (2.50)

This establishes that the set W0 is a subspace. Now let's check the dimensions of the vector spaces in Eq. (2.47). Let B = {| b_i ⟩}_{i=1}^{N} be a basis of vector space V, with N ∈ N a natural number. We suppose that M ∈ N with 1 ≤ M < N is the dimension of the subspace W, i.e. dimW = M. To find dimW0 we find the corresponding basis. By Hassani Theorem 2.5.2 there exists the dual basis B* = {φ^i}_{i=1}^{N}, defined by

    φ^i(| b_j ⟩) = δ^i_j,    (2.51)

that spans the entire dual space V*. (Superscript indices for dual space vectors is standard notation in general relativity.) We wish to find the subset of {φ^i}_{i=1}^{N} that spans the subspace W0. Our strategy is to start with a completely general φ ∈ V* and then find the restriction that ensures it is in W0. When an arbitrary φ ∈ V* acts on any | w ⟩ ∈ W the result is

    φ| w ⟩ = \sum_{i=1}^{N} a_i φ^i ( \sum_{j=1}^{M} d_j | b_j ⟩ ),    a_i, d_j ∈ C
           = \sum_{i=1}^{N} \sum_{j=1}^{M} a_i d_j φ^i(| b_j ⟩),       used linearity
           = \sum_{i=1}^{N} \sum_{j=1}^{M} a_i d_j δ^i_j = \sum_{j=1}^{M} a_j d_j.    summed over i    (2.52)

Note the terms with i > M in the sum above did not contribute because for all 1 ≤ j ≤ M the Kronecker delta vanished, δ^i_j = 0. Now to ensure that φ| w ⟩ = 0 for all | w ⟩ ∈ W we must have a_j = 0 for j = 1 to M. It follows that

    φ = \sum_{i=M+1}^{N} a_i φ^i,    for arbitrary a_i ∈ C,    (2.53)

is the most general φ such that φ| w ⟩ = 0. And thus the basis for W0 is {φ^i}_{i=M+1}^{N}. This gives that dimW0 = N − M and Eq. (2.47) then follows simply from N = M + (N − M).

2.36 Prove Hassani Theorem 2.6.3: Let ω ∈ Λ^p(V, U). Then the following assertions are equivalent:

A1. ω(| a_1 ⟩, . . ., | a_p ⟩) = 0 whenever | a_i ⟩ = | a_j ⟩ for some pair i ≠ j.
A2. ω(| a_{σ(1)} ⟩, . . ., | a_{σ(p)} ⟩) = ε_σ ω(| a_1 ⟩, . . ., | a_p ⟩), for any permutation σ of 1, 2, . . ., p, and any | a_1 ⟩, . . ., | a_p ⟩ ∈ V,
A3. ω(| a_1 ⟩, . . ., | a_p ⟩) = 0 whenever {| a_k ⟩}_{k=1}^{p} are linearly dependent.

Here the zero on the RHS of A1 and A3 is the zero vector of U and ε_σ was defined in Hassani Definition 2.6.2 as +1 for an even permutation σ and −1 for an odd permutation σ.

2.36 The statement A1 is equivalent to the following:

A′1. ω(. . ., |a_i⟩, . . ., |a_j⟩, . . .) = −ω(. . ., |a_j⟩, . . ., |a_i⟩, . . .).

Let τ_ij be the transposition that switches i and j (i.e. exchanges the position of the two elements |a_i⟩ and |a_j⟩). Then A′1 can be written as τ_ij · ω(|a_1⟩, . . ., |a_p⟩) = −ω(|a_1⟩, . . ., |a_p⟩), or more generally

A′1. τ · ω(|a_1⟩, . . ., |a_p⟩) = −ω(|a_1⟩, . . ., |a_p⟩) for any transposition τ.

Proof of A1 =⇒ A′1:

    0 = ω(. . ., |a_i⟩ + |a_j⟩, . . ., |a_i⟩ + |a_j⟩, . . .)
      = ω(. . ., |a_i⟩, . . ., |a_i⟩ + |a_j⟩, . . .) + ω(. . ., |a_j⟩, . . ., |a_i⟩ + |a_j⟩, . . .)
      = ω(. . ., |a_i⟩, . . ., |a_i⟩, . . .) + ω(. . ., |a_i⟩, . . ., |a_j⟩, . . .)
        + ω(. . ., |a_j⟩, . . ., |a_i⟩, . . .) + ω(. . ., |a_j⟩, . . ., |a_j⟩, . . .),

where the first and last terms vanish by A1.

Proof of A′1 =⇒ A1:

    τ_ij · ω(. . ., |a_i⟩, . . ., |a_i⟩, . . .) = −ω(. . ., |a_i⟩, . . ., |a_i⟩, . . .) = ω(. . ., |a_i⟩, . . ., |a_i⟩, . . .).    (2.54)

In the following, we use either A1 or A′1 and prove the two equivalences A′1 ⇐⇒ A2 and A1 ⇐⇒ A3. That will prove Theorem 2.6.3, because of A1 ⇐⇒ A′1 shown above. First we show the equivalence of A′1 and A2.

Proof of A′1 =⇒ A2: First note that any permutation σ can be decomposed into a number of steps m, each step τ_k consisting of a transposition. Let τ_1, τ_2, . . ., τ_m be the transpositions that decompose σ. Then σ = τ_1 · τ_2 · · · τ_m and

    ω(|a_{σ(1)}⟩, . . ., |a_{σ(p)}⟩) = σ · ω(|a_1⟩, . . ., |a_p⟩)
      = τ_1 · τ_2 · · · τ_m · ω(|a_1⟩, . . ., |a_p⟩) = (−1)^m ω(|a_1⟩, . . ., |a_p⟩),    applying A′1 m times,
      ≡ ε_σ ω(|a_1⟩, . . ., |a_p⟩).    (2.55)

Proof of A2 =⇒ A′1: Simply note that τ_ij is a permutation and ε_{τ_ij} = −1.

Next we show the equivalence of A1 and A3.

Proof of A1 =⇒ A3: Without loss of generality, assume |a_1⟩ = \sum_{k=2}^{p} α_k |a_k⟩. Then, by linearity

    ω(|a_1⟩, . . ., |a_p⟩) = \sum_{k=2}^{p} α_k ω(|a_k⟩, . . ., |a_p⟩).

In each term of the sum, the first entry in the argument of ω is equal to one of the remaining entries. By A1, each term is zero. Thus A3 is implied.

Proof of A3 =⇒ A1: If |a_i⟩ = |a_j⟩, then in the sum \sum_{k=1}^{p} α_k |a_k⟩, let α_j = −α_i ≠ 0 and set all the other α's equal to zero. Thus |a_i⟩ = |a_j⟩ is a special case of linear dependence.

(Sadri Hassani provided this solution. See Supplementary Problems 2.50 and 2.51 for a different approach.)

2.2 Supplementary Problems

2.39 Prove the triangle inequality,

    ||a + b|| ≤ ||a|| + ||b||.    (2.56)

Hint: Use the Schwarz inequality, Hassani Theorem 2.2.7.


2.39 It should be noted that an inner product is not needed to define the norm of a vector, as pointed out in the paragraph after Hassani Definition 2.2.8 of a vector norm. However, here we assume the inner product exists on the vector space in question to simplify the proof, as follows:

    ||a + b||² = ⟨ a + b | a + b ⟩
               = ⟨ a | a ⟩ + ⟨ b | b ⟩ + ⟨ a | b ⟩ + ⟨ b | a ⟩,    by property 2, Hassani 2.2.1,
               = ||a||² + ||b||² + 2 Re(⟨ a | b ⟩),                by property 1, Hassani 2.2.1,
               ≤ ||a||² + ||b||² + 2 |⟨ a | b ⟩|
               ≤ ||a||² + ||b||² + 2 ||a|| ||b||,                  used Schwarz inequality
               = (||a|| + ||b||)².    (2.57)

Taking the square root of both sides of Eq. (2.57) gives the triangle inequality,

    ||a + b|| ≤ ||a|| + ||b||.    (2.58)
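A quick numerical check in C^N (my own sketch, not part of the text; the random vectors are arbitrary) exercises both the Schwarz inequality and the triangle inequality just proved.

```python
# Illustrative check of the Schwarz and triangle inequalities in C^N.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=5) + 1j * rng.normal(size=5)
b = rng.normal(size=5) + 1j * rng.normal(size=5)

inner = lambda x, y: np.vdot(x, y)            # <x|y> = sum_i x_i^* y_i
norm = lambda x: np.sqrt(inner(x, x).real)

assert abs(inner(a, b)) <= norm(a) * norm(b)  # Schwarz inequality
assert norm(a + b) <= norm(a) + norm(b)       # triangle inequality (2.56)
```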

2.40 When the vectors | p ⟩, | q ⟩ ∈ C^∞(a, b), the set of all real-valued functions on the interval (a, b) of a single real variable that possess derivatives of all orders, and the inner product is of the form

    ⟨ p | q ⟩ = \int_a^b p(t) q(t) dt,    (2.59)

then the Schwarz inequality (Hassani Theorem 2.2.7) is known as the Cauchy-Schwarz-Buniakowsky inequality for integrals [10]. Prove the related inequality

    M_1 M_4 − M_3 M_2 ≥ 0,    (2.60)

where, for the strictly positive function E on [0, ∞),

    M_n ≡ \int_0^∞ k^n E(k) dk.    (2.61)

This inequality proved useful in the study of the evolution of the energy spectrum of turbulent fluids [17]. Hint: the required inequality can be easily obtained from entry 196 in the classic reference on inequalities by Hardy, Littlewood and Pólya [10].

2.41 Let {α_i}_{i=1}^{N} be the components of a vector | a ⟩ ∈ C^N. Prove that

    ||a||² ≡ \sum_{i=1}^{N} |α_i|²    (2.62)

defines a norm that satisfies the parallelogram law,

    ||a + b||² + ||a − b||² = 2||a||² + 2||b||².    Hassani Eq. (2.14)    (2.63)

2.42 Prove Hassani theorem 2.2.10: Every finite-dimensional vector space can be turned into an inner product space.

2.42 A finite-dimensional vector space V has, by definition, a finite basis, say {| a_i ⟩}_{i=1}^{N}, where N ∈ N is the dimension of the space and any vector | a ⟩ ∈ V can be written | a ⟩ = \sum_{i=1}^{N} α_i | a_i ⟩, where α_i are scalars (usually in mathematical physics scalars are real or complex numbers but in general can be elements of a field). The expression given in Eq. (2.62) defines a norm that satisfies the parallelogram law, see Problem 2.41. Hassani Theorem 2.2.9 states that a normed linear space is an inner product space if and only if the norm satisfies the parallelogram law. Thus any finite dimensional vector space forms an inner product space. The inner product associated with the norm in Eq. (2.62) is given by

    ⟨ a | b ⟩ = \sum_{i=1}^{N} α_i^* β_i,    (2.64)

where | b ⟩ = \sum_{i=1}^{N} β_i | a_i ⟩.

2.43 For | a ⟩ ∈ C^N, prove that the so-called l^p-norm

    ||a||_p ≡ ( \sum_{i=1}^{N} |α_i|^p )^{1/p},    (2.65)

where the real number p ≥ 1, satisfies the triangle inequality. Hint: see [13, Sect. 6.2].

2.44 Show that T : (U ⊕ V) ⊗ W → (U ⊗ W) ⊕ (V ⊗ W) given by

    T((| u ⟩ + | v ⟩) ⊗ | w ⟩) = | u ⟩ ⊗ | w ⟩ + | v ⟩ ⊗ | w ⟩    (2.66)

is an isomorphism.

2.45 Let B = {| a_i ⟩}_{i=1}^{N} be a basis for a vector space V and T : V → W a linear transformation to a vector space W such that B_k = {| a_i ⟩}_{i=1}^{n} is a basis for the subspace ker(T). Show that B_s = {T| a_i ⟩}_{i=n+1}^{N} is a basis for the range T(V).


2.45 Recall from Hassani Definition 2.1.7 that a basis of a vector space U must be a set of linearly independent vectors that spans U. Because B is a basis for V we can write any | v ⟩ ∈ V as

    | v ⟩ = \sum_{i=1}^{N} v^i | a_i ⟩,    (2.67)

with v^i the appropriate scalars for V. Applying T to this arbitrary | v ⟩ we find

    T| v ⟩ = T \sum_{i=1}^{N} v^i | a_i ⟩
           = \sum_{i=1}^{N} v^i (T| a_i ⟩),        used linearity
           = \sum_{i=n+1}^{N} v^i (T| a_i ⟩).      used T| a_i ⟩ = | 0 ⟩ for i = 1 to n    (2.68)

So B_s spans T(V). To show that B_s is a linearly independent set, consider the equation

    \sum_{i=n+1}^{N} α^i (T| a_i ⟩) = | 0 ⟩,    (2.69)

for unknown scalars α^i. By linearity of T we can rewrite this

    T( \sum_{i=n+1}^{N} α^i | a_i ⟩ ) = | 0 ⟩.    (2.70)

But the vector in parentheses, | u ⟩ = \sum_{i=n+1}^{N} α^i | a_i ⟩, by construction cannot be in the kernel of T unless | u ⟩ = | 0 ⟩. Thus we can write

    | u ⟩ = \sum_{i=n+1}^{N} α^i | a_i ⟩ = | 0 ⟩.    (2.71)

This implies α^i = 0 for all i = n + 1 to N because the vectors {| a_i ⟩}_{i=n+1}^{N} are a subset of the basis B and thus must be linearly independent. We conclude that B_s is also a linearly independent set, which completes the proof that B_s is a basis for the range T(V).


2.46 Find a complex structure J for the real vector space R² with the natural inner product

    ⟨ a | b ⟩ = (a_1  a_2) \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = a_1 b_1 + a_2 b_2.    (2.72)

2.47 Let V be a real vector space with inner product ⟨·|·⟩ and W = C ⊗ V its complexification. Show that the inner product on W defined by

    ⟨ α ⊗ a | β ⊗ b ⟩ ≡ ᾱ β ⟨ a | b ⟩,    (2.73)

where ᾱ denotes the complex conjugate¹ of α, satisfies the properties of Hassani Definition 2.2.1.

2.48 Justify in more detail all the steps of the proof given for Hassani Proposition 2.6.4.

2.48 We apply the N-linear skew-symmetric map ω from V to U to the set of N vectors | a j ⟩ ∈ V, j = 1 . . . N . Each | a j ⟩ is written as a linear combination of the basis vectors | aj ⟩ =

N ∑

α jk | ek ⟩ =

N ∑

α jk j | ek j ⟩.

(2.74)

k j =1

k=1

On the far RHS, to keep track of all the N indices, we gave them subscripts, k1 , k2 , . . . , k N . Then applying ω to these N vectors gives ⎛ ω (| a1 ⟩, . . . , | a N ⟩) = ω ⎝

N ∑ k1 =1

=

N ∑

α1k1 | ek1 ⟩, . . . ,

N ∑

⎞ α N k N | ek N ⟩⎠ ,

k N =1

α1k1 α2k2 · · · α N k N ω(| ek1 ⟩, . . . , | ek N ⟩).

(2.75)

k1 ,k2 ,...,k N =1

Here we used linearity to factor out the coefficients α_{j k_j}. Use the fact that ω is skew-symmetric so that terms with ω(| e_{k1} ⟩, . . . , | e_j ⟩, . . . , | e_i ⟩, . . . , | e_{kN} ⟩) = 0 when i = j. In other words, ω is nonzero only when all the basis vectors in the N slots of ω are different. This reduces the number of terms in the sum from N^N to N!. Now we use the fact that since each basis vector | e_j ⟩ appears once and only once as an argument of ω in each term of the sum, we can write the N-tuple (k1, . . . , kN) as a permutation of 1, 2, . . . , N, i.e. (k1, . . . , kN) = (π(1), . . . , π(N)), so that

ω(| a1 ⟩, . . . , | aN ⟩) = ∑_π α_{1π(1)} α_{2π(2)} · · · α_{Nπ(N)} ω(| e_{π(1)} ⟩, . . . , | e_{π(N)} ⟩).        (2.76)

In Eq. (2.76) we have written the index of the sum as π, indicating a sum over all permutations. Finally we use the defining property of a skew-symmetric p-linear map that changing the order of the arguments leaves the result unchanged up to a sign, so

ω(| a1 ⟩, . . . , | aN ⟩) = ( ∑_π ∈π α_{1π(1)} α_{2π(2)} · · · α_{Nπ(N)} ) ω(| e1 ⟩, . . . , | eN ⟩),        (2.77)

where ∈π = +1 when π is even, and ∈π = −1 when π is odd. Note the term involving ω acting on the set of basis vectors, ω(| e1 ⟩, . . . , | eN ⟩), has factored out of the sum. In this sense the result is determined uniquely by ω acting on the basis vectors. In particular, if ω(| e1 ⟩, . . . , | eN ⟩) = 0 then, regardless of the set of N vectors (regardless of the values of α_{jk}), the result will be nil, ω(| a1 ⟩, . . . , | aN ⟩) = 0.

¹ Up until Hassani Sect. 2.4, the complex conjugate has been denoted by *, but this was changed to the other common symbol, overbar, because in the last two sections of this chapter * was used to denote the dual space.
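For a concrete check of Eq. (2.77) (my addition, not in the printed solution), take ω = det on R^N, viewed as an N-linear skew-symmetric function of the rows; then ω(| e1 ⟩, . . . , | eN ⟩) = 1 and Eq. (2.77) reduces to the Leibniz permutation formula for the determinant.

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """Parity of a permutation given as a tuple of indices 0..N-1 (cycle count)."""
    s, seen = 1, set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:        # walk the cycle containing `start`
            seen.add(j)
            j = perm[j]
            length += 1
        if length % 2 == 0:         # an even-length cycle is an odd permutation
            s = -s
    return s

rng = np.random.default_rng(1)
N = 4
alpha = rng.normal(size=(N, N))     # coefficients alpha_{j,k} of Eq. (2.74)

# Right-hand side of Eq. (2.77) with omega(e_1, ..., e_N) = 1
leibniz = sum(
    sign(p) * np.prod([alpha[j, p[j]] for j in range(N)])
    for p in permutations(range(N))
)
print(np.isclose(leibniz, np.linalg.det(alpha)))    # True
```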

2.49 Prove the trivial part of Hassani Theorem 2.3.11. That is, show that an injective linear transformation necessarily has zero kernel, i.e. only the zero vector | 0 ⟩ is mapped to | 0 ⟩. The proof will use linearity. Can you find an example of an injective but nonlinear transformation that does not have zero kernel? (Hint: Consider the vector space R, the real numbers, and the affine but not linear map y : R → R defined by y(x) = 1 + x.)

2.50 Redo Problem 2.36 by proving the implications A1 =⇒ A2 =⇒ A3 =⇒ A1.

2.51 Redo Problem 2.36 by proving the implications A1 =⇒ A3 =⇒ A2 =⇒ A1.

2.51 We must show that any one of the three assertions above implies the other two. This can become a lot of work, so it's worthwhile to think about being efficient. Naively you might set out to prove 2 × (3 choose 2) = 6 implications. But actually we halve the amount of work by proving only three implications, such as A1 =⇒ A2 =⇒ A3 =⇒ A1. The economy comes from obtaining implications "for free"; for instance, we get A2 =⇒ A1, even if indirectly through A3. Likewise the other economic option here is to prove A1 =⇒ A3 =⇒ A2 =⇒ A1. Below we follow the second option; Problem 2.50 proposes that you follow the first option. However, the solution to Problem 2.36 above, provided by Sadri Hassani, is even shorter than my economic option 2 below.

For the second option we start by assuming assertion A1 holds and try to prove that A3 follows. Consider the p-linear skew-symmetric map ω applied to a linearly dependent set {| a_k ⟩}_{k=1}^{p},

ω(| a1 ⟩, . . . , | a_p ⟩).        (2.78)

By definition of a linearly dependent set there exists a set of scalars αk, not all zero, such that

∑_{k=1}^{p} αk | ak ⟩ = | 0 ⟩,        (2.79)

where | 0 ⟩ is the zero vector of the vector space V. Dividing through by one of these nonzero scalars, say αm ≠ 0, we have

| am ⟩ = − (1/αm) ∑_{k≠m}^{p} αk | ak ⟩.        (2.80)

Assuming that m = 1 simplifies the presentation without loss of generality. We substitute | a1 ⟩ by its linear combination Eq. (2.80) into Eq. (2.78). By the linearity of ω we have

ω(| a1 ⟩, . . . , | a_p ⟩) = ω( −(1/α1) ∑_{k=2}^{p} αk | ak ⟩, | a2 ⟩, . . . , | a_p ⟩ ),        used Eq. (2.80)
                         = −(1/α1) ∑_{k=2}^{p} αk ω(| ak ⟩, | a2 ⟩, . . . , | a_p ⟩),        used linearity
                         = | 0 ⟩,        used A1        (2.81)

where | 0 ⟩ here is the zero vector of U. The last line follows because each term in the sum vanishes, since there is a pair of equal vectors in the argument of ω, the first and the kth. So A1 =⇒ A3.

To show that A3 =⇒ A2 we note that any permutation σ can be decomposed into a number of steps M, each step σk consisting of a transposition


(the exchange of two elements). It will be sufficient to consider the effect of a single transposition of | ai ⟩ and | aj ⟩. Let's take the sum of ω applied to the arbitrary ordered set of p vectors (. . . | ai ⟩ . . . | aj ⟩ . . .) and the ordered set with | ai ⟩ and | aj ⟩ interchanged, (. . . | aj ⟩ . . . | ai ⟩ . . .), it being understood that | aj ⟩ in the second set occupies the position of | ai ⟩ of the first set and vice versa. This gives

ω(. . . | ai ⟩ . . . | aj ⟩ . . .) + ω(. . . | aj ⟩ . . . | ai ⟩ . . .) = ω(. . . (| ai ⟩ + | aj ⟩) . . . (| aj ⟩ + | ai ⟩) . . .),        used linearity and A3
                                                                 = | 0 ⟩.        used A3        (2.82)

In the first equality, expanding the right-hand side by multilinearity produces four terms, two of which, ω(. . . | ai ⟩ . . . | ai ⟩ . . .) and ω(. . . | aj ⟩ . . . | aj ⟩ . . .), vanish by A3 because a set containing a repeated vector is linearly dependent. The last equality follows because the argument of ω on the RHS of Eq. (2.82) contains a linearly dependent set: the vectors at positions i and j are equal, (| ai ⟩ + | aj ⟩) = (| aj ⟩ + | ai ⟩), which is a special case of linear dependence. In short, a single transposition of the arguments of the p-linear skew-symmetric map ω gives the result

ω(. . . | ai ⟩ . . . | aj ⟩ . . .) = −ω(. . . | aj ⟩ . . . | ai ⟩ . . .).        (2.83)

We can build the result for an arbitrary permutation consisting of M transpositions from the result Eq. (2.83). For instance, two transpositions give

ω(| aσ(1) ⟩, . . . , | aσ(p) ⟩) = (−1)² ω(| a1 ⟩, . . . , | a_p ⟩) = ω(| a1 ⟩, . . . , | a_p ⟩).        (2.84)

In general, for a permutation σ consisting of M transpositions of the p argument vectors we have

ω(| aσ(1) ⟩, . . . , | aσ(p) ⟩) = (−1)^M ω(| a1 ⟩, . . . , | a_p ⟩) = ∈σ ω(| a1 ⟩, . . . , | a_p ⟩),        (2.85)

which is A2. Thus A3 =⇒ A2.

Finally we must prove that A2 implies A1. Consider

ω(| a1 ⟩, . . . , | a_p ⟩),        (2.86)

where the ordered set of argument vectors of ω contains at least one pair | ai ⟩ = | aj ⟩ for i ≠ j. Then the transposition of these two argument vectors leaves the ordered set unchanged, yet by A2 it must change the sign of the output of ω,

ω(. . . | ai ⟩ . . . | aj ⟩ . . .) = −ω(. . . | aj ⟩ . . . | ai ⟩ . . .),        used A2
                                = ω(. . . | aj ⟩ . . . | ai ⟩ . . .).        argument unchanged        (2.87)

We see that ω maps the argument with two equal vectors into a vector in U that is −1 times itself. This must be the zero vector of U, which establishes A1 and completes our proof of the equivalence of all three assertions.

Chapter 3

Algebras

Abstract This chapter introduces further structure on vector spaces leading to the notion of an algebra. Recall these vector spaces consist of abstract vectors, and the power and efficiency of this approach stems from the generality of the application. The abstract vectors could be used to represent the familiar geometric quantities from classical physics; these are a type of tensor (discussed in great detail in Hassani's Part VIII). But the abstract vectors of a vector space are not necessarily tensors; as we saw in Hassani Example 2.1.2 they could be polynomials with complex coefficients or matrices of a given shape, etc. This generality paves the way to applications in quantum physics. It is important to bear in mind that when we talk about the vector space R3 over the reals, without further restriction, we are thinking of matrices of a particular shape, either 3 × 1 (column vectors) or 1 × 3 (row vectors), with not necessarily tensorial properties.

3.1 Problems

3.3 Prove that A2, the derived algebra of A, is indeed an algebra.

3.3 We must show that A2 has the properties of an algebra, listed in Hassani Definition 3.1.1. In particular, we must show that (i) A2 is a vector space and (ii) A2 is closed under a binary operation (of multiplication) that is distributive over addition as indicated in Hassani Definition 3.1.1. The distributive property in (ii) is obtained immediately, since the binary operation of multiplication is inherited from the algebra A, along with its appropriate distributive properties. That A2 is closed under this multiplication follows immediately also. Recall that A2 was defined by

A2 ≡ {x ∈ A | x = ∑_k ak bk , ak , bk ∈ A},        Hassani Eq. (3.2)        (3.1)


where the sum is over a finite number of terms. Because A is an algebra, for arbitrary x in A2 we also have x ∈ A, and similarly arbitrary y in A2 is also in A. So immediately by the definition of A2 we have xy ∈ A2.

For property (i), we must show that A2 is a subspace of A. A convenient way of doing this is to use {ei}_{i=1}^{n}, a basis of A. For x ∈ A2, we can write

x = ∑_k ak bk = ∑_k ( ∑_i αki ei ) ( ∑_j βkj ej ) = ∑_{i,j} ( ∑_k αki βkj ) ei ej ≡ ∑_{i,j} γij ei ej ,        (3.2)

where here and below the sums over i, j are from 1 to n, the dimension of A, while the sum over k involves an unspecified but finite number of terms. Thus every vector in A2 can be written as a linear combination of {ei ej}. Conversely, if

y = ∑_{i,j} ηij ei ej ,        (3.3)

then

y = ∑_j ( ∑_i ηij ei ) ej ≡ ∑_j cj ej ,        (3.4)

and y ∈ A2 by the definition of A2. So, with {ei}_{i=1}^{n} a basis of A,

A2 ≡ {x ∈ A | x = ∑_{i,j} γij ei ej }.        (3.5)

Note that {ei ej} is not a basis for A2 because, in general, there are more {ei ej} than {ei}, and a basis of a subspace cannot be larger than that of the whole space. To show that A2 is a subspace of A, let u, v ∈ A2 and write

u = ∑_{i,j} αij ei ej ,   v = ∑_{i,j} βij ei ej .        (3.6)

Then

αu + βv = α ∑_{i,j} αij ei ej + β ∑_{i,j} βij ei ej = ∑_{i,j} (ααij + ββij) ei ej ∈ A2 .        (3.7)

3.6 Prove Hassani Proposition 3.1.23: Let A and B be unital algebras. If φ : A → B is an epimorphism, then φ is unital.

3.6 The algebras A and B are unital, so we know they have identity elements; let's call them 1_A and 1_B. We wish to show that

φ(1_A) = 1_B .        (3.8)

This is what we mean by a unital homomorphism, Hassani Definition 3.1.22. We can multiply 1_A by any vector a ∈ A and obtain the result a,

a = a 1_A .        (3.9)

Recall that an epimorphism is a surjective algebra homomorphism, Hassani Definition 3.1.17, and both the surjectivity and the homomorphism properties will be used. Applying φ to Eq. (3.9) we have

φ(a) = φ(a 1_A) = φ(a) φ(1_A).        used φ is a homomorphism        (3.10)

Now we note that there are no restrictions on φ(a) ∈ B: for any b ∈ B there exists an a ∈ A such that φ(a) = b, because φ is surjective. Eq. (3.10) therefore states that b φ(1_A) = b for every b ∈ B; a parallel argument starting from a = 1_A a gives φ(1_A) b = b. Because the identity element of an algebra is unique, this implies that

φ(1_A) = 1_B ,        (3.11)

as desired.

3.9 Let A be an associative algebra, and x ∈ A. Show that (i) Ax is a left ideal, (ii) xA is a right ideal, and (iii) AxA is a two-sided ideal.


3.9 (i) Recall that a left ideal of an algebra has two defining characteristics: it is a vector subspace of the underlying vector space, and it contains all products on the left. So to prove that the set of elements Ax is a left ideal we start by showing that it is a subspace of A. The set Ax for given x ∈ A is defined by

Ax = {ax | a ∈ A}.        (3.12)

An arbitrary linear combination of members of this set can be written αax + βbx, where a, b ∈ A are two arbitrary members of the algebra and α, β ∈ C arbitrary scalars. Is this linear combination necessarily also in the set Ax?

αax + βbx = (αa + βb)x,        used linearity property of algebra product
          = cx.        (3.13)

But c ∈ A, because A is also a vector space, so the RHS of Eq. (3.13) is manifestly a member of the set Ax; we can now rightfully call Ax a subspace of A. Is the subspace Ax closed under multiplication on the left by any element of the algebra A? For any vector b ∈ A and any y ∈ Ax we must show that

by ∈ Ax.        (3.14)

Any y ∈ Ax can be written ax for some a ∈ A, so we can write

by = b(ax),
   = (ba)x,        used A is an associative algebra
   = cx,        c ∈ A, used A is an algebra
   =⇒ by ∈ Ax.        (3.15)

So we can rightfully call Ax a left ideal of algebra A. In fact it is called a left ideal generated by x.

(ii) Similarly, we can show that xA is a vector subspace. An arbitrary linear combination of members of this set can be written αxa + βxb, where a, b ∈ A are two arbitrary members of the algebra and α, β ∈ C arbitrary scalars. This linear combination is necessarily also in the set xA because

αxa + βxb = x(αa + βb),        used linearity property of algebra product
          = xc,        (3.16)

with c ∈ A, because A is also a vector space. Eq. (3.16) shows xA is closed under linear combinations, so we can now rightfully call xA a subspace of A.


And we can show that the subspace xA is a right ideal. Let y = xa be an arbitrary member of the subspace xA for some a ∈ A. Multiplying y on the right by any b ∈ A we find

yb = (xa)b,
   = x(ab),        used associative algebra
   = xc,        used A is an algebra        (3.17)

with c ∈ A. This shows that the subspace xA is a right ideal. In fact it is called a right ideal generated by x.

(iii) For the two-sided ideal AxA we must first show that this subset of the algebra A is a subspace of A. Recall that we defined the product of two sets in Eq. (3.1). Here we extend this to the case where we have a single member x ∈ A sandwiched between the two sets, so

AxA = {u ∈ A | u = ∑_k ak x bk , with ak ∈ A and bk ∈ A},        (3.18)

again with a finite number of terms in the sum. Reminiscent of the approach in Problem 3.3, we expand ak, bk in a basis {ei}_{i=1}^{n}, which permits us to write an arbitrary element u ∈ AxA,

u = ∑_k ak x bk = ∑_{k,i,j} αki ei x βkj ej = ∑_{i,j} ( ∑_k αki βkj ) ei x ej ≡ ∑_{i,j} γij ei x ej .        (3.19)

And conversely, any sum of this form,

w = ∑_{i,j} γij ei x ej = ∑_j ( ∑_i γij ei ) x ej ≡ ∑_j cj x ej ,        (3.20)

with γij arbitrary scalars, is a member of the ideal AxA because the RHS of Eq. (3.20) has the same form as the sum in Eq. (3.18). So any member of AxA can be written as in Eq. (3.19) and any element written as in Eq. (3.19) is a member of AxA. Now take a linear combination of two such terms u, v ∈ AxA, with α, β scalars,

αu + βv = α ∑_{i,j} γij ei x ej + β ∑_{i,j} ηij ei x ej = ∑_{i,j} ( αγij + βηij ) ei x ej ≡ ∑_{i,j} μij ei x ej .        (3.21)

Here μij are scalars and Eq. (3.21) has the form of the RHS of Eq. (3.19). This confirms that AxA is a subspace of A. Now we must show that this subspace is an ideal (i.e. both a left ideal and a right ideal). Any y ∈ AxA can be written as

y = ∑_l al x bl ,        al , bl ∈ A.        (3.22)

Multiplying this arbitrary y ∈ AxA on the left by any element, say c ∈ A, we find

cy = c ( ∑_l al x bl ),
   = ∑_l (c al) x bl ,        used associativity
   = ∑_l a'l x bl ,        (3.23)

where a'l = c al is also in A, so the RHS of Eq. (3.23) is manifestly a member of AxA. And a similar argument shows that yc ∈ AxA for any c ∈ A and y ∈ AxA, confirming that AxA is also a right ideal of A.

3.12 Show that the linear transformation of Hassani Example 3.1.18 is an isomorphism of the two algebras A and B.

3.12 We start by confirming that φ : A → B is a linear map from the vector space A to vector space B. Recall that A is R3 and B is the vector space of antisymmetric 3 × 3 real matrices. We assume that component-wise addition and scalar multiplication apply on vector spaces A and B. Then for α, β ∈ R and a, b ∈ A we have

φ(αa + βb) = ⎛ 0               (αa1 + βb1)     −(αa2 + βb2) ⎞
             ⎜ −(αa1 + βb1)    0               (αa3 + βb3)  ⎟ = αφ(a) + βφ(b),        (3.24)
             ⎝ (αa2 + βb2)     −(αa3 + βb3)    0            ⎠

confirming the linearity of φ. To show that φ : A → B is a vector space isomorphism, we must show that it is both injective and surjective. Consider the equation φ(a) = φ(b),

⎛ 0     a1    −a2 ⎞   ⎛ 0     b1    −b2 ⎞
⎜ −a1   0     a3  ⎟ = ⎜ −b1   0     b3  ⎟   =⇒ (a1, a2, a3) = (b1, b2, b3).        (3.25)
⎝ a2    −a3   0   ⎠   ⎝ b2    −b3   0   ⎠

This implies that φ is injective. There is no restriction on the three independent components of an arbitrary element a = (a1 , a2 , a3 ) ∈ A, so clearly φ : A → B is surjective, and thus also a linear vector space isomorphism. To now extend these results to φ : A → B being an algebra isomorphism we must show that φ is a homomorphism between the algebras A and B, that is φ(ab) = φ(a)φ(b),

∀ a, b ∈ A.

(3.26)

As instructed, we take the product on A to be the vector cross product, while the product on B is the antisymmetric matrix product M • N = MN − NM.

Hassani Eq. (3.3)

(3.27)

Consider the cross product of arbitrary a, b ∈ A, which we evaluate using the Levi-Civita tensor ∈ijk (see Hassani Example 3.1.16):

ab = a × b = ∑_{i,j,k} ∈ijk ai bj ek = (c1, c2, c3) = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1).        (3.28)

So the LHS of Eq. (3.26) evaluates to

φ(ab) = ⎛ 0                  (a2b3 − a3b2)     −(a3b1 − a1b3) ⎞
        ⎜ −(a2b3 − a3b2)     0                 (a1b2 − a2b1)  ⎟ .        (3.29)
        ⎝ (a3b1 − a1b3)      −(a1b2 − a2b1)    0              ⎠

For the RHS of Eq. (3.26),

φ(a) • φ(b) = φ(a)φ(b) − φ(b)φ(a)

  = ⎛ −(a1b1 + a2b2)    a2b3               a1b3            ⎞   ⎛ −(a1b1 + a2b2)    a3b2               a3b1            ⎞
    ⎜ a3b2              −(a1b1 + a3b3)     a1b2            ⎟ − ⎜ a2b3              −(a1b1 + a3b3)     a2b1            ⎟
    ⎝ a3b1              a2b1               −(a2b2 + a3b3)  ⎠   ⎝ a1b3              a1b2               −(a2b2 + a3b3)  ⎠

  = ⎛ 0                  (a2b3 − a3b2)     −(a3b1 − a1b3) ⎞
    ⎜ −(a2b3 − a3b2)     0                 (a1b2 − a2b1)  ⎟ = φ(ab).        (3.30)
    ⎝ (a3b1 − a1b3)      −(a1b2 − a2b1)    0              ⎠

This confirms that φ : A → B is an algebra homomorphism, and because φ is a vector space isomorphism, it is also an algebra isomorphism.
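The algebra-homomorphism property is easy to spot-check numerically (my addition, purely illustrative): for random vectors a, b ∈ R³, compare φ(a × b) with the matrix commutator φ(a)φ(b) − φ(b)φ(a), with φ as in Eq. (3.25).

```python
import numpy as np

def phi(a):
    """Map a = (a1, a2, a3) to the antisymmetric matrix of Eq. (3.25)."""
    a1, a2, a3 = a
    return np.array([[  0.,  a1, -a2],
                     [-a1,   0.,  a3],
                     [ a2,  -a3,  0.]])

rng = np.random.default_rng(2)
a, b = rng.normal(size=3), rng.normal(size=3)

lhs = phi(np.cross(a, b))                  # φ(ab), Eq. (3.29)
rhs = phi(a) @ phi(b) - phi(b) @ phi(a)    # φ(a) • φ(b), Eq. (3.30)
print(np.allclose(lhs, rhs))               # True
```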

3.15 Show that the algebra of quaternions is central.

3.15 The algebra of quaternions H was introduced in Hassani Example 3.1.16. The demonstration that it is central can be modelled after Hassani Example 3.1.7. Consider an arbitrary element a in the center of the algebra, a ∈ Z(H). By definition, a must commute with all elements in H. That is,

ba = ab,        (3.31)

where b is an arbitrary element of H. The algebra H was defined in Hassani Example 3.1.16 in terms of the standard basis {ei}_{i=0}^{3} of R4. So we expand a and b in terms of this basis,

a = ∑_{i=0}^{3} αi ei ,   b = ∑_{i=0}^{3} βi ei ,        (3.32)

and seek restrictions on αi ∈ R such that Eq. (3.31) holds. Substituting Eq. (3.32) into Eq. (3.31), the RHS becomes

ab = ( ∑_{i=0}^{3} αi ei ) ( ∑_{j=0}^{3} βj ej ) = ∑_{i,j=0}^{3} αi βj ei ej ,        (3.33)

while the LHS becomes

ba = ( ∑_{j=0}^{3} βj ej ) ( ∑_{i=0}^{3} αi ei ) = ∑_{i,j=0}^{3} αi βj ej ei .        (3.34)

Terms involving at least one e0 commute, so they cancel from the two sides of the equation, as do of course terms with i = j. However, terms with i, j > 0 and i ≠ j change sign with an interchange of i and j; recall from Hassani Example 3.1.16

ei ej = ∑_{k=1}^{3} ∈ijk ek ,        for i, j = 1, 2, 3, i ≠ j,        (3.35)

and ∈ijk is antisymmetric in all indices. Thus Eq. (3.31) leads to three separate conditions,

(α2β3 − α3β2)e1 = 0,   (α3β1 − α1β3)e2 = 0,   (α1β2 − α2β1)e3 = 0.        (3.36)

Recall this must hold for all b, so the βi are arbitrary. The unique solution is α1 = α2 = α3 = 0. But α0 remains arbitrary, so that a = α0 e0. That is, a ∈ Span{e0}. Therefore, H is central by Hassani Definition 3.1.6.

3.18 Let p and q be two quaternions. Show that (a) ( pq)∗ = q ∗ p ∗ , (b) q ∈ R iff q ∗ = q, and q ∈ R3 iff q ∗ = −q, and (c) qq ∗ = q ∗ q is a nonnegative real number.

3.18 The calculation proceeds most easily if we follow the notation proposed in Hassani Example 3.1.16, q = y0 + y1 i + y2 j + y3 k = y0 + y, q ∗ = y0 − y.

(3.37)


(a) The rule for multiplying quaternions p = x0 + x and q = y0 + y,

pq = (x0 + x)(y0 + y) = (x0 y0 − x · y) + x0 y + y0 x + x × y,        Hassani Eq. (3.6)        (3.38)

follows immediately from the structure constants in Hassani Example 3.1.16 and the convention that the so-called pure parts x and y are treated like vectors in 3D Euclidean space, so that

x · y = ∑_{i=1}^{3} xi yi ,   x × y = ∑_{i,j,k=1}^{3} ∈ijk xi yj ek .        (3.39)

The “real part” of Eq. (3.38) is the term in parentheses, i.e. the part that multiplies the implicit e0 represented by 1. Taking the conjugate of Eq. (3.38) we change the sign of only the pure part, ( pq)∗ = (x0 y0 − x · y) − x0 y − y0 x − x × y.

(3.40)

Now we try calculating q ∗ p ∗ q ∗ p ∗ = (y0 − y)(x0 − x), = (y0 x0 − y · x) − y0 x − x0 y + y × x.

(3.41)

Comparing Eqs(3.40) and (3.41) we conclude the result in (a) because the only term for which the order matters is y × x = −x × y. (b) Suppose q ∗ = q. This implies y0 − y = y0 + y, y = (0, 0, 0),

(3.42)

so q has only real part. The converse is trivial: if q ∈ R, i.e. q has only real part so that q = y0 , then obviously q ∗ = y0∗ = y0 = q. Suppose q ∗ = −q. This implies y0 − y = −(y0 + y), y0 = 0,

(3.43)

so q has only pure part, i.e. q ∈ R3 . The converse is trivial; if q ∈ R3 we mean that q = y because y0 = 0 so that q ∗ = −y = −q. (c) We proceed by direct calculation using the product rule Eq. (3.38)


qq ∗ = (y0 + y)(y0 − y) = (y0 y0 + y · y) − y0 y + y0 y + y × y, = (y0 y0 + y · y), = q ∗ q ≥ 0.

(3.44)
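The identities of parts (a) and (c) are easy to confirm numerically (an illustrative addition, not part of the printed solution); the sketch below implements the product rule Eq. (3.38) directly.

```python
import numpy as np

def qmul(p, q):
    """Quaternion product, Eq. (3.38); p = (x0, x), q = (y0, y)."""
    x0, x = p[0], p[1:]
    y0, y = q[0], q[1:]
    return np.concatenate(([x0 * y0 - x @ y],
                           x0 * y + y0 * x + np.cross(x, y)))

def conj(q):
    """Quaternion conjugate: flip the sign of the pure part."""
    return np.concatenate(([q[0]], -q[1:]))

rng = np.random.default_rng(3)
p, q = rng.normal(size=4), rng.normal(size=4)

# (a)  (pq)* = q* p*
print(np.allclose(conj(qmul(p, q)), qmul(conj(q), conj(p))))   # True

# (c)  q q* = q* q = |q|^2 >= 0, a nonnegative real number
print(np.allclose(qmul(q, conj(q)), [q @ q, 0, 0, 0]))         # True
```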

3.21 Prove Hassani Theorem 3.3.2: The total matrix algebra Mn (F) is central simple.

3.21 An algebra is simple if it has no ideals other than the two obvious ones, the algebra itself and the zero element (Hassani Definition 3.2.12). Our strategy will be to seek a two-sided ideal, I, that is not the trivial ideal I ≠ {0}. We will try to construct I as small as possible and yet find that I is the entire algebra. I must contain at least one nonzero a ∈ I. Expand this element in the standard basis, a = ∑_{i,j} αij eij ≠ 0. That implies that at least one αij ≠ 0, say αpq. Since I is also a left-sided ideal, we can multiply a on the left by any element b ∈ Mn(F) and obtain an element still in I. Expanding b = ∑_{i,j} βij eij, where βij ∈ F are arbitrary, we find I must include elements ba of the form

ba = ( ∑_{i,j=1}^{n} βij eij ) αpq epq = αpq ∑_{i,j=1}^{n} βij δjp eiq = αpq ∑_{i=1}^{n} βip eiq = ∑_{i=1}^{n} βi' eiq ,        (3.45)

i.e. the matrices that are zero everywhere except possibly column q. The set includes {0} as one particular element. For the second equality of the last line we absorbed the αpq, which is an arbitrary nonzero scalar in F, in the coefficients βip, by defining βi' = αpq βip. There is no need to carry the p and q indices in βi'. Let I be the set of all such matrices. Multiplying an element u ∈ I on the left by an arbitrary matrix c we find

cu = ( ∑_{i,j=1}^{n} γij eij ) ( ∑_{k=1}^{n} βk ekq ) = ∑_{i,j,k} γij βk δjk eiq = ∑_{i,j} γij βj eiq = ∑_{i=1}^{n} πi eiq ,        (3.46)

which is again a matrix zero everywhere except column q. This confirms I is a left ideal, as found by Hassani in §3.3. In fact, taking a linear combination of two such matrices, say ∑_i αi eiq and ∑_i βi eip, we find left ideals with (n − 2) zero columns. Repeating this we can find left ideals Im with n − m zero columns; I0 and In being the trivial ideals corresponding to the zero matrix and the entire algebra.

Now we take into consideration that I must also be a right-sided ideal (we arrive at the same conclusion working with any Im with m ≠ 0). We can multiply on the right by any element c ∈ Mn(F) and the result remains in the set I. Expanding c = ∑_{k,l} γkl ekl and multiplying it on the right of the arbitrary element u of the set I we are constructing,

uc = ( ∑_{i=1}^{n} βi eiq ) ( ∑_{k,l=1}^{n} γkl ekl ) = ∑_{i,l=1}^{n} ( ∑_{k=1}^{n} βi γkl δqk ) eil = ∑_{i,l=1}^{n} βi γql eil ≡ ∑_{i,l=1}^{n} ξil eil .        (3.47)

The important thing to note is that the βi, γql ∈ F are completely arbitrary, and thus so is ξil ∈ F. So Eq. (3.47) represents the expansion in the standard basis of a completely arbitrary element. Even though we started with a being a single nonzero element proportional to one of the standard basis elements, when we multiplied on the left and right by the arbitrary elements b, c ∈ Mn(F) the set grew to include all matrices, I = Mn(F). The only smaller ideal contained in I is {0}, the singleton set consisting only of the n × n matrix with zero everywhere. The total matrix algebra Mn(F) was shown to be central in Hassani Example 3.3.3.

3.24 Let D : A → A be a derivation. Show that ker D is a subalgebra of A.

3.24 A subalgebra is a linear subspace that is closed under multiplication, see Hassani Definition 3.1.3. Because D is a derivation it is also a type of linear mapping (in particular an endomorphism). Recall the kernel of a linear transformation between vector spaces forms a subspace of the domain, see Hassani Theorem 2.3.9. So the kernel of D : A → A forms a subspace of A. Furthermore the ker D is closed under multiplication because

D(ab) = [D(a)] b + a D(b),        Hassani Definition 3.4.1
      = 0.        for all a, b ∈ ker D        (3.48)

Therefore ab is in ker D whenever both a and b are in ker D. That is, this linear subspace is closed under multiplication, proving that ker D is a subalgebra.
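For a concrete instance (my addition, with an arbitrarily chosen derivation): on the algebra of smooth functions of x with pointwise multiplication, D = d/dx is a derivation, its kernel consists of the constants, and the product of two constants is again a constant, so ker D is indeed a subalgebra.

```python
import sympy as sp

x = sp.symbols('x')
D = lambda f: sp.diff(f, x)          # a derivation on smooth functions of x

# D obeys the Leibniz (product) rule of Hassani Definition 3.4.1
f, g = sp.sin(x), sp.exp(x)
print(sp.simplify(D(f * g) - (D(f) * g + f * D(g))))   # 0

# Two elements of ker D (constants) and their product, also in ker D
a, b = sp.Integer(3), sp.Rational(5, 2)
print(D(a), D(b), D(a * b))          # 0 0 0
```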

3.27 Show that Dc defined on Cr (a, b) by Dc ( f ) = f ' (c), where a < c < b, is a φc -derivation if φc is defined as the evaluation map φc ( f ) = f (c).

3.27 We must show that Dc meets the criteria of Hassani Definition 3.4.7 of a φ-derivation. The φ-derivation is a sort of generalization of the derivation (Hassani Definition 3.4.1) but maps elements of one algebra A to another algebra B. In this case A is the algebra Cr (a, b) of r -times differentiable, real-valued functions defined on the open interval (a, b) of the real numbers and B is the algebra of real numbers. The proposed φ-derivation Dc ( f ) = f ' (c) clearly maps f ∈ Cr (a, b) to the reals. Note the φ from Hassani Definition 3.4.7 must be a homomorphism between the same algebras. The φ in question here is the evaluation map φc : Cr (a, b) → R, which is easily shown to be a homomorphism because φc ( f g) = f (c)g(c) = φc ( f )φc (g),

f, g ∈ Cr (a, b).

(3.49)

The prime in the definition Dc(f) = f'(c) refers to ordinary differentiation of a real-valued function. To test if Dc is a φ-derivation we apply it to the product of two functions, f, g ∈ Cr(a, b),

Dc(fg) = φc[(fg)'],        by definition, Dc(f) = f'(c)
       = φc[f'g + fg'],        product rule of ordinary derivative
       = f'(c)g(c) + f(c)g'(c),        evaluate argument at c
       = Dc(f)g(c) + f(c)Dc(g),        by definition, Dc(f) = f'(c)
       = Dc(f)φc(g) + φc(f)Dc(g).        (3.50)

The final line corresponds to Hassani Definition 3.4.7 of a φ-derivation with the φ here being φc .
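The same computation can be spot-checked symbolically (an addition for illustration; the functions f and g below are arbitrary sample elements of Cʳ(a, b) and c is left as a symbol).

```python
import sympy as sp

t, c = sp.symbols('t c')
phi_c = lambda f: f.subs(t, c)                 # evaluation map, phi_c(f) = f(c)
D_c   = lambda f: sp.diff(f, t).subs(t, c)     # D_c(f) = f'(c)

f, g = sp.sin(t), t**3 + 1

lhs = D_c(f * g)
rhs = D_c(f) * phi_c(g) + phi_c(f) * D_c(g)    # Hassani Definition 3.4.7
print(sp.simplify(lhs - rhs))                  # 0
```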

3.30 Prove Hassani Theorem 3.4.10: Let Ω1 and Ω2 be antiderivations of algebra A with respect to two involutions ω1 and ω2 of A. Suppose that ω1 ◦ ω2 = ω2 ◦ ω1 . Furthermore assume that


ω1 Ω2 = ±Ω2 ω1

and

ω2 Ω1 = ±Ω1 ω2 .

(3.51)

Then Ω1 Ω2 ∓ Ω2 Ω1 is an antiderivation with respect to the involution ω1 ◦ ω2 .

3.30 Define the endomorphism ∏ ≡ Ω1 Ω2 ∓ Ω2 Ω1 on A. Then our goal is to show that ∏ satisfies Hassani Definition 3.4.9 for an antiderivation ∏(a1 a2 ) = ∏(a1 ) · a2 + ω(a1 ) · ∏(a2 ),

a1 , a 2 ∈ A

(3.52)

of algebra A with respect to the involution ω = ω1 ◦ ω2 = ω2 ◦ ω1 of A. We must evaluate Eq. (3.52). What is the meaning of the product Ω1 Ω2 in the definition of ∏? Recall from Hassani Definition 3.4.9 that antiderivations are endomorphisms and in Hassani Example 3.1.9 the algebra End(A) is defined with multiplication being function composition. So here Ω1 Ω2 = Ω1 ◦ Ω2 for Ω1 , Ω2 ∈ End(A). In contrast, the product on the algebra A will be indicated with a solid dot, a1 · a2 for a1 , a2 ∈ A; it is helpful to distinguish the two. The LHS of Eq. (3.52) can be written ∏(a1 · a2 ) = [Ω1 ◦ Ω2 ∓ Ω2 ◦ Ω1 ] (a1 · a2 ), meaning of sum of maps = Ω1 ◦ Ω2 (a1 · a2 ) ∓ Ω2 ◦ Ω1 (a1 · a2 ), = Ω1 (Ω2 (a1 · a2 )) ∓ Ω2 (Ω1 (a1 · a2 )). product is map composition (3.53) Now Ω1 and Ω2 are antiderivations with respect to involutions ω1 and ω2 respectively so Ω1 (a1 · a2 )) = Ω1 (a1 ) · a2 + ω1 (a1 ) · Ω1 (a2 ), used Hassani Def. 3.4.9 Ω2 (a1 · a2 )) = Ω2 (a1 ) · a2 + ω2 (a1 ) · Ω2 (a2 ). (3.54) Substituting the second line of Eq. (3.54) into the first term on the RHS of Eq. (3.53) gives ⎧ ⎫ Ω1 (Ω2 (a1 · a2 )) = Ω1 ⎩Ω2 (a1 ) · a2 + ω2 (a1 ) · Ω2 (a2 )⎭ , ⎫ ⎧ ⎫ ⎧ = Ω1 ⎩Ω2 (a1 ) · a2⎭ + Ω1 ⎩ω2 (a1 ) · Ω2 (a2 )⎭ , = Ω1 (Ω2 (a1 )) · a2 + ω1 (Ω2 (a1 )) · Ω1 (a2 ) + Ω1 (ω2 (a1 )) · Ω2 (a2 ) + ω1 (ω2 (a1 )) · Ω1 (Ω2 (a2 )), = Ω1 ◦ Ω2 (a1 ) · a2 + ω1 (Ω2 (a1 )) · Ω1 (a2 ) ± ω2 (Ω1 (a1 )) · Ω2 (a2 ) + ω1 ◦ ω2 (a1 ) · Ω1 ◦ Ω2 (a2 ).

used Eq. (3.54) used Ω1 linear used Eq. (3.54) used Eq. (3.51) (3.55)


Similarly, substituting the first line of Eq. (3.54) into the second term on the RHS of Eq. (3.53) gives: ∓Ω2 (Ω1 (a1 · a2 )) = ∓Ω2 ◦ Ω1 (a1 ) · a2 ∓ ω2 (Ω1 (a1 )) · Ω2 (a2 ) − ω1 (Ω2 (a1 )) · Ω1 (a2 ) ∓ ω2 ◦ ω1 (a1 ) · Ω2 ◦ Ω1 (a2 ), (3.56) where we used (∓)(±) = − to get the sign of the third term. Now substitute the sum of the RHSs of Eq. (3.55) and Eq. (3.56) into Eq. (3.53) and cancel the four terms indicated below   1 (Ω ))  · Ω1 (a2 ) ∏(a1 a2 ) = Ω1 ◦ Ω2 (a1 ) · a2 +  ω1 2 (a   ± ω2 (Ω ))  · Ω 1 (a1 2 (a 2 ) + ω1 ◦ ω2 (a1 ) · Ω1 ◦ Ω2 (a2 )     ))  · Ω ∓ Ω2 ◦ Ω1 (a1 ) · a2 ∓ ω2 (Ω 1 (a1 2 (a 2)      1 − (Ω ))  · Ω1 (a2 ) ∓ ω2 ◦ ω1 (a1 ) · Ω2 ◦ Ω1 (a2 ). (3.57) ω1 2 (a Now combine the two terms with the argument (a1 ). Also noting that we can assume ω1 ◦ ω2 = ω2 ◦ ω1 we can also combine the two terms with the argument (a2 ): ∏(a1 a2 ) = [Ω1 ◦ Ω2 ∓ Ω2 ◦ Ω1 ](a1 ) · a2 + ω1 ◦ ω2 (a1 ) · [Ω1 ◦ Ω2 ∓ Ω2 ◦ Ω1 ](a2 ) = ∏(a1 ) · a2 + ω(a1 ) · ∏(a2 ).

(3.58)

In the second line we recognized our endomorphism ∏ and defined involution ω = ω1 ◦ ω2 . Eq. (3.58) confirms the desired result Eq. (3.52).

3.33 Let A be an algebra with an idempotent P. Show that PAP consists of elements a such that aP = Pa = a. For the subspaces of Theorem 3.5.11, let 3 are subA1 ≡ PAP, A2 ≡ PL(P), A3 ≡ R(P)P, and A4 ≡ I(P). Show that {Ai }i=1 algebras of A and that Ai ∩ A j = {0}, but Ai A j / = {0} for all i / = j, i, j = 1, . . . , 4. Thus, Peirce decomposition is a vector space direct sum, but not an algebra direct sum.

3.33 Consider the set A1 = PAP, which by definition consists of elements A1 = PAP = {a ∈ A| a = PbP, for any b ∈ A},

(3.59)

where P is a given idempotent of algebra A. Multiplying the defining property of an arbitrary element a on the left by P we have


Pa = PPbP,        used Eq. (3.59), multiplied on left by P
   = P²bP,        assumed associative algebra
   = PbP,        used P is idempotent
   = a.        used Eq. (3.59)        (3.60)

Similarly, multiplying the arbitrary element a on the right by P we find

aP = PbPP,        multiplied on right by P
   = PbP²,        assumed associative algebra
   = PbP,        used P is an idempotent
   = a.        used definition of this arbitrary a        (3.61)

We conclude A1 ⊆ {a ∈ A| Pa = a = aP} = A'1 .

(3.62)

Be careful, we are not done yet! In order to conclude that A1 = A'1 we must show the reverse inclusion: A'1 ⊆ A1. But this is trivial; starting with the first property Pa = a of elements of set A'1, we substitute the a on the left with a from the second property a = aP. So a ∈ A'1 =⇒ a = PaP =⇒ a ∈ A1.

(3.63)

Therefore A'1 ⊆ A1 . Because the inclusion applies in both directions, we have A'1 = A1 . To show set A1 defined in Eq. (3.59) is a subalgebra of A we must show that it is a linear subspace of the vector space A that is closed under multiplication. (We ignore temporarily that A1 enjoys the properties in Eq. (3.60) and Eq. (3.61) because we do not need them.) Let a = Pa' P and b = Pb' P be an arbitrary pair of elements of the set A1 . Is the linear combination c = αa + βb also in the set A1 for arbitrary scalars α, β ∈ C? αa + βb = αPa' P + βPb' P, ) ( = P αa' P + βb' P , ) ( = P αa' + βb' P, = Pc' P, c' ∈ A,

used property of algebra multiplication used property of algebra multiplication used A is a linear space (3.64)

where c' = αa' + βb' . This establishes that c = αa + βb is in the set A1 defined in Eq. (3.59), so it is a linear subspace. Is the product d = ab also in the subspace A1 ?


d = ab = (Pa)(bP),

used Eqs(3.60, 3.61),

= P(ab)P.

used A associative algebra

(3.65)

We are only concerned that ab ∈ A, as it must be because A is closed under multiplication. This confirms that d ∈ A1 because it meets the criterion defining the set in Eq. (3.59). The set A2 ≡ PL(P) is defined by A2 = PL(P) = {a ∈ A|Pa = a and aP = 0},

Hassani Eq. (3.13). (3.66)

To show that set A2 ≡ PL(P) is a subalgebra, we first show it is a linear subspace by showing that it is closed under linear combinations. Consider two arbitrary elements a, b ∈ A2 . The linear combination c = αa + βb is also in set A2 because it obeys the two defining properties in Eq. (3.66): Pc = P(αa + βb), = αPa + βPb, = αa + βb, = c,

used property of algebra multiplication used a, b ∈ A2 (3.67)

and cP = (αa + βb)P, = αaP + βbP, = α0 + β0,

used property of algebra multiplication used a, b ∈ A2

= 0.

(3.68)

So A2 contains its linear combinations and we can now call it a linear subspace. The other criterion of a subalgebra is that it contains its products. Consider two arbitrary elements b1, b2 ∈ A2. Then we check whether b3 = b1 b2 meets the two defining properties of the set Eq. (3.66). The first property applies to b3 because

P b3 = P b1 b2 = (P b1) b2 = b1 b2 = b3 .        associative algebra        (3.69)

Furthermore, the second property also applies to b3 because

b3 P = b1 b2 P = b1 (b2 P) = b1 0 = 0.        associative algebra        (3.70)

Therefore b3 has the two defining properties in Eq. (3.66), so subspace A2 is also a subalgebra. By a similar argument we could show that A3 ≡ R(P)P is a subalgebra. Instead, as an exercise, we apply an alternative strategy by returning to the original definition of the set. Consider two arbitrary elements a, b ∈ A3 , so a = xP and b = yP with x, y ∈ R(P), i.e. Px = 0 and Py = 0 (see Hassani Example 3.2.2). All linear combinations c = αa + βb are also in set A3 as we now show: αa + βb = αxP + βyP, = (αx + βy)P, = zP,

used property of algebra multiplication (3.71)

where z = αx + βy earns its membership in R(P) because Pz = P(αx + βy), = αPx + βPy,

used property of algebra multiplication used x, y ∈ R(P).

= 0.

(3.72)

This shows that subset A3 includes all its linear combinations so it is indeed a subspace. Again consider the two arbitrary elements a, b ∈ A3 . Now we will show that d = ab ∈ A3 as well, i.e. that it is a subalgebra. d = ab = (xP)(yP), = (xPy)P.

associative algebra

(3.73)

But xPy ∈ R(P) because P(xPy) = (Px)Py = 0Py = 0.

(3.74)

Thus the subspace A3 is also a subalgebra. For completeness we mention that A4 is a subalgebra. It was defined by, A4 = I(P) = L(P) ∩ R(P).

(3.75)

But recall the left annihilator L(P) is a left ideal of A while the right annihilator R(P) is a right ideal of A. Thus I(P), the set of elements that have both these


properties, is a two-sided ideal (or simple “ideal” for short) of A and an algebra ideal is a special subalgebra. The fact that Ai ∩ A j = {0} follows from the definitions of these in Hassani Eq. (3.13): A1 = PAP = {a ∈ A| Pa = a = aP}, A2 = PL(P) = {a ∈ A| Pa = a, aP = 0},

c.f. Equation (3.62) c.f. Equation (3.66)

A3 = R(P)P = {a ∈ A| aP = a, Pa = 0}, A4 = I(P) = {a ∈ A| aP = Pa = 0}.

(3.76)

To find the intersection we combine with a logical conjunction the defining properties. For example, for A1 ∩ A2 we include all a ∈ A such that Pa = a and a = aP (to be in set A1 ) and Pa = a and aP = 0 (to also be in set A2 ). This implies that a = 0, so the intersection A1 ∩ A2 is the singleton with only the zero element. More briefly, A1 ∩ A2 = {a ∈ A| Pa = aP = a = 0} = {0}, A1 ∩ A3 = {a ∈ A| Pa = aP = a = 0} = {0}, A1 ∩ A4 = {a ∈ A| Pa = aP = a = 0} = {0}, A2 ∩ A3 = {a ∈ A| Pa = a = aP = 0} = {0}, A2 ∩ A4 = {a ∈ A| aP = Pa = a = 0} = {0}, A3 ∩ A4 = {a ∈ A| Pa = aP = a = 0} = {0}.

(3.77)

So the Peirce decomposition is a vector space direct sum. Recall, Hassani Definition 3.1.10, that an algebra A is the direct sum of its subalgebras B and C if the corresponding vector spaces can be written A = B ⊕ C and BC = CB = {0}. The Peirce decomposition does not meet the second criterion. To show this it suffices to find one product of elements not equal to the zero vector. We will show that A2 ⊂ A1 A2 so that, barring the trivial case of A2 = {0}, we have shown that A1 A2 contains a nonzero element. Every element a2 ∈ A2 has the two properties a2 = Pa2 and a2 P = 0. Does every element a2 also exist in A1 A2? First notice that P ∈ A1 and P ≠ 0 by definition. Then note that

a1 a2 = P a2 ,        chose a1 = P
      = a2 .        used Eq. (3.66)        (3.78)

So yes, every a2 is also in A1 A2, confirming that A2 ⊂ A1 A2.
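A concrete matrix illustration of the Peirce decomposition (my addition, with P chosen as a simple idempotent): in the algebra of 2 × 2 real matrices with P = diag(1, 0), each of the four subspaces of Eq. (3.76) is spanned by one matrix unit, a general matrix splits into the four pieces, and A1A2 contains a nonzero element.

```python
import numpy as np

P = np.diag([1., 0.])                       # an idempotent: P @ P == P
e11 = np.array([[1., 0.], [0., 0.]])
e12 = np.array([[0., 1.], [0., 0.]])
e21 = np.array([[0., 0.], [1., 0.]])
e22 = np.array([[0., 0.], [0., 1.]])

# Representatives of the Peirce subspaces of Eq. (3.76)
A1, A2, A3, A4 = e11, e12, e21, e22
assert np.allclose(P @ A1, A1) and np.allclose(A1 @ P, A1)   # PAP
assert np.allclose(P @ A2, A2) and np.allclose(A2 @ P, 0)    # PL(P)
assert np.allclose(A3 @ P, A3) and np.allclose(P @ A3, 0)    # R(P)P
assert np.allclose(P @ A4, 0) and np.allclose(A4 @ P, 0)     # I(P)

# Vector space direct sum: a general matrix splits into the four pieces
M = np.array([[1., 2.], [3., 4.]])
Q = np.eye(2) - P
pieces = [P @ M @ P, P @ M @ Q, Q @ M @ P, Q @ M @ Q]
assert np.allclose(sum(pieces), M)

# ...but not an algebra direct sum: A1 A2 contains a nonzero element
print(A1 @ A2)        # equals e12, i.e. A2 ⊂ A1 A2
```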


3.36 Derive the chain rule, d ( p(q)) = p ' (q) · q ' ,

Hassani Eq. (3.17)

(3.79)

3.36 Here p(q) is a polynomial, an element of the polynomial algebra P[a] generated by some fixed element a of algebra A, and q is also a polynomial of a. We can expand p(q) using the basis {q^k}_{k=0}^{∞},

p(q) = ∑_{k=0}^{∞} αk q^k ,        αk ∈ C.        (3.80)

Applying the differentiation map d : P[a] → P[a] we find

d(p(q)) = d( ∑_{k=0}^{∞} αk q^k ),        used Eq. (3.80)
        = ∑_{k=0}^{∞} αk d(q^k),        used linearity of differentiation map
        = ( ∑_{k=1}^{∞} αk k q^{k−1} ) · d(q),        note change in lower limit
        = p'(q) · q'.        different notation        (3.81)

In the second to last line above we used the formulas derived by Hassani immediately following his Theorem 3.6.3.
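In the commutative case (polynomials in a real variable) the chain rule Eq. (3.79) is the familiar one from calculus and is easy to verify symbolically; the polynomials p and q below are arbitrary illustrative choices of mine.

```python
import sympy as sp

t = sp.symbols('t')
q = t**2 + 1                        # q is itself a polynomial in the generator
p = lambda s: 3*s**3 - s + 2        # p as a polynomial function

lhs = sp.expand(sp.diff(p(q), t))                                # d(p(q))
rhs = sp.expand(sp.diff(p(t), t).subs(t, q) * sp.diff(q, t))     # p'(q) · q'
print(sp.simplify(lhs - rhs))       # 0
```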

3.2 Supplementary Problems

3.37 Show that the two-sided inverse of an element of an associative algebra is unique.

3.38 Prove the second part of Hassani Theorem 3.1.2. That is, let a and b be invertible elements of an associative algebra. Prove that ab is invertible with inverse

(ab)⁻¹ = b⁻¹ a⁻¹.

(3.82)


3.39 Convince yourself that C is a division algebra.

3.39 A division algebra is a unital algebra all of whose nonzero elements have inverses. Recall from Hassani Example 2.1.2 that C is a vector space over the complex numbers (and also over the reals). Regular multiplication of complex numbers obeys the distributive rules required for an algebra, a(βb + γ c) = βab + γ ac, (βb + γ c)a = βba + γ ca,

(3.83)

for all a, b, c ∈ C and β, γ ∈ C (or R). Furthermore, the product of any two complex numbers gives again a complex number, so the set is closed under multiplication. Thus we have all the necessary ingredients of an algebra. The number 1 plays the role of the identity, 1 · z = z · 1 = z,

∀ z ∈ C,

(3.84)

so the algebra C is also unital. Finally, to show that C is a division algebra, we must show that all nonzero elements have an inverse, i.e. for arbitrary z /= 0, z ∈ C, there exists a c ∈ C such that z · c = c · z = 1.

(3.85)

For z ≠ 0 we know that z∗z = |z|² > 0, where z∗ is the complex conjugate of z, so that we can define

c = z∗ / |z|² .        (3.86)

This c satisfies Eq. (3.85), showing that C is a division algebra.

3.40 The algebra of quaternions H, with vector space R4, was introduced in Hassani Example 3.1.16, using the standard basis {ei}_{i=0}^{3}, whose elements are normally denoted by 1, i, j, k respectively. This suggests e0 = 1 is the identity element of H and indeed we assumed this in the solution to Problem 3.15. Prove that e0 is indeed the unique identity element of H.


3.41 Let p, q be quaternions, with q ∗ the conjugate of q = x + i y + j z + kw = x0 + x defined by q ∗ = x − i y − j z − kw = x0 − x.

(3.87)

Here i, j, k are the basis vectors of Hassani Example 3.1.16 and x is the 3-vector x = i y + j z + kw. Verify that for p = y0 + y the product qp is given by qp = x0 y0 − x · y + x0 y + y0 x + x × y.

Hassani Eq. (3.6)

(3.88)

3.42 Explain briefly why it is clear from the definition of an ideal of an algebra that the only ideal that contains the identity, or an invertible element, is the entire algebra itself. Is it possible to have an ideal B of a unital algebra A such that the identity 1 A is not contained in B?

3.42 Let A be a unital algebra with identity 1 A . Suppose B is a left ideal of A. That means that any ab ∈ B,

∀ a ∈ A, ∀ b ∈ B.

(3.89)

Suppose 1 A ∈ B. Set b = 1 A in Eq. (3.89). We immediately have the result that A⊂B

(3.90)

and since B ⊂ A by assumption, we have that A = B. A similar argument applies if B is a right ideal of A and therefore also applies to the case of a two-sided ideal. Specifying that the left (right or two-sided) ideal contains an invertible element implies that the left (right or two-sided) ideal also contains the identity as we now argue. Suppose again that B is a left ideal and it now contains an invertible element, say c ∈ B with inverse c−1 ∈ A. Then by Eq. (3.89) with a = c−1 and b = c we have that c−1 c = 1 A ∈ B. Then our previous result applies. If B is a right ideal then it must contain all the products ba ∈ B,

∀ a ∈ A, ∀ b ∈ B.

(3.91)

So with the invertible element c in this right ideal and again when a = c−1 then cc−1 = 1 A ∈ B. Again, our previous result A = B applies. When B is a two-sided ideal with invertible element then either of our above two arguments shows that B contains the identity and we must therefore have that A = B.


Yes, it is possible to have an ideal that does not contain the identity. For example, there is the trivial ideal consisting of the zero element, the singleton set {0}. A less trivial example would be the left and right annihilator of a nonzero element a ∈ A.

3.43 Prove Hassani Theorem 3.2.6: Let L be a left ideal of [an associative algebra] A. Then the following statements are equivalent: (a) L is a minimal left ideal. (b) Ax = L for all x ∈ L. (c) Lx = L for all x ∈ L.

3.44 Show that the set of basis vectors of Hassani Example 3.2.10,

f1 = (1/2)(e0 + e3),   f2 = (1/2)(e1 − e2),   f3 = (1/2)(e0 − e3),   f4 = (1/2)(e1 + e2),        Hassani Eq. (3.9)        (3.92)

defined in terms of the basis vectors ei in Hassani Example 3.1.7, are linearly independent. 3.45 In Hassani Theorem 3.2.7 we saw that if A and B are algebras, φ : A → B an epimorphism and L a left minimal ideal of A then φ(L) is a left minimal ideal of B. (a) Explain how we know that φ(L) is a subspace of the underlying vector space B. (b) Highlight the role played by the surjectivity of φ in the proof of Theorem 3.2.7. (c) Provide an alternative proof that φ(L) is minimal using Hassani Theorem 3.2.6(c) rather than (b).

3.45 (a) φ is a linear map and L, being an ideal, must also be a subspace. Thus Hassani Theorem 2.3.10 applies and guarantees that the range φ(L) is a subspace of the target space B. (b) In proving that φ(L) is a left ideal of algebra B we must show that for every element b ∈ B the product by is in the subspace φ(L), where y ∈ φ(L). The surjectivity of φ was used in Hassani's proof to guarantee the existence of a ∈ A such that b = φ(a). In contrast, x ∈ L such that φ(x) = y ∈ φ(L) provides an arbitrary element of the subspace φ(L). The homomorphism property of φ then gave the stability of the product: by = φ(a)φ(x) = φ(ax) ∈ φ(L).

(3.93)


(c) Our task here is to prove that φ(L)u = φ(L),

∀ u ∈ φ(L),

(3.94)

which by Hassani Theorem 3.2.6(c) is equivalent to φ(L) being a minimal left ideal of B. By definition of the set φ(L), there exists a (not necessarily unique) t ∈ L such that φ(t) = u. So we can rewrite the LHS of Eq. (3.94) as φ(L)φ(t) = φ(Lt) = φ(L)

used φ is a homomorphism, used L is a minimal left ideal.

(3.95)

The last line again uses Hassani Theorem 3.2.6(c).

3.46 Prove Hassani Proposition 3.5.2: The identity element is the only idempotent in [an associative] division algebra. 3.47 Show that the vector product on R3 turns the vector space of R3 into an algebra that is noncommutative and nonassociative, as indicated in Hassani Example 3.1.9. 3.48 (a) Show that the set of n by n diagonal matrices forms a subalgebra of the square matrix algebra with complex elements Mn ×n . (b) The product defined on the vector space Cn by the following structure constants for the standard basis, cikj = δi j ,

∀ i, j, k = 1, . . . , n,

(3.96)

turns Cn into an algebra. Intuitively this algebra contains the same information as the subalgebra of (a). This intuition can be formalized by showing they are isomorphic. Find a linear map between the two algebras that meets all the requirements of an algebra isomorphism.

3.48 (a) The matrix algebra of Hassani Example 3.1.9 combined ordinary matrix multiplication with the vector space of square complex matrices, a subset of the m × n complex matrices Mm×n introduced in Hassani Example 2.1.2. So we inherit the usual (component-wise) matrix addition and scalar multiplication from this vector space. Then it is clear that the set of n by n diagonal matrices is closed under linear combinations. In detail, if A and B are arbitrary diagonal n by n complex matrices then, Ai j = 0 when i /= j, i, j = 1, . . . n Bi j = 0 when i /= j, i, j = 1, . . . n.

(3.97)


So the linear combination, αA + βB, with α, β ∈ C is also a diagonal matrix because, (αA + βB)i j = αAi j + βBi j = 0, i, j = 1, . . . n when i /= j.

(3.98)

The closure under linear combinations means that the n by n diagonal matrices are a subspace of the vector space of square complex matrices Mn×n . Furthermore the subspace of n by n diagonal matrices is closed under ordinary matrix multiplication as we now show. Let A and B again be the arbitrary diagonal n by n complex matrices of Eq. (3.97). Their product AB has elements (AB)i j =

n ∑

Aik Bk j ,

k=1

= Aii B j j δi j , i, j = 1, . . . n, ⎧ Aii Bii when i = j, = 0 when i /= j.

(3.99)

So the subspace of diagonal n by n (complex) matrices is a subalgebra of the n by n matrix algebra. (b) We seek a linear map φ : Mn×n → Cn . An obvious choice would be the map defined component-wise by identifying the kth component ak of the output a ∈ Cn with the row k column k diagonal element Akk of the input matrix A. This could be written, ak = φ(A)k = Akk ,

k = 1, . . . , n,

(3.100)

where φ(A)k means extract the kth component of the quantity φ(A). The map φ is linear, φ(αA + βB) = αAkk + βBkk , = αφ(A) + βφ(B).

(3.101)

The map φ is surjective because for any a ∈ Cn we can find an n × n diagonal matrix with the components of a along its diagonal. The map φ is injective because φ(A) = φ(B) implies A = B. Finally the map φ is a homomorphism, which we can verify by looking at the kth component of the output, for all k = 1 to n:


(φ(AB))k = (AB)kk , = Akk Bkk ,

used Eq. (3.100) used Eq. (3.99) used Eq. (3.99)

= (φ(A))k (φ(B))k .

(3.102)

3.49 Show that the product A • B ≡ AB + BA,

(3.103)

defined on the vector space of square matrices with ordinary matrix multiplication on the RHS of Eq. (3.103), turns this space into an algebra that is commutative but nonassociative. 3.50 Prove Hassani Theorem 3.4.4. Let {ei }iN=1 be a basis of the algebra A. Then a vector space endomorphism D : A → A is a derivation on A iff D(ei e j ) = D(ei ) · e j + ei · D(e j )

for i, j = 1, 2, . . . , N .

(3.104)

3.51 Let A be the vector space of n × n upper triangular, complex-valued matrices together with the antisymmetric matrix product Eq. (3.27). Show that this is an algebra. Also explain why the associated derived algebra A2 has matrices with vanishing diagonal elements, i.e. n × n strictly upper triangular matrices, as indicated in Hassani Example 3.1.9.

3.51 A square matrix A ∈ Mn×n is called upper triangular when it has vanishing elements below and to the left of the diagonal,

Aij = 0 when i > j.        (3.105)

Let A and B be two such upper triangular matrices and form their product Eq. (3.27). Because Aik = 0 unless k ≥ i and Bkj = 0 unless k ≤ j, the element in row i and column j of the product is

(A • B)ij = ∑_{k=i}^{j} ( Aik Bkj − Bik Akj ),        (3.106)

where for the first term only i ≤ k ≤ j contributes. The same consideration applies for the second


term with the roles of A and B interchanged. So clearly when i > j we have no terms in the sum and (A • B)ij = 0. This confirms that A ⊂ Mn×n is stable under its multiplication Eq. (3.27) and thus an algebra.

Why does the derived algebra A2 have strictly upper triangular matrices? The derived algebra contains only elements that are obtained from finite sums of products of elements from A:

A2 = {x ∈ A | x = ∑_l yl • zl , yl , zl ∈ A}.        Hassani Eq. (3.2)        (3.107)

Here the diagonal elements of yl • zl vanish, as we see by substituting yl = A and zl = B in Eq. (3.106) and extracting the diagonal element by setting j = i. In particular,

(A • B)ii = ∑_{k=i}^{i} ( Aik Bki − Bik Aki ) = Aii Bii − Bii Aii = 0.        (3.108)
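A quick numerical check of this (my addition): the commutator of two random upper triangular matrices is strictly upper triangular, i.e. it lies in the derived algebra described above.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
A = np.triu(rng.normal(size=(n, n)))      # upper triangular
B = np.triu(rng.normal(size=(n, n)))

C = A @ B - B @ A                          # the product of Eq. (3.27)
print(np.allclose(np.tril(C), 0))          # True: diagonal and below all vanish
```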

3.52 Let Ω be an antiderivation of algebra A. Show that the action of Ω is completely determined by its action on the generators of A. 3.53 Find an example of a unital algebra for which no element, other than those in Span{e}, has an inverse. Hint: Look in Hassani Example 3.1.9. 3.54 Explain in more detail the steps of the proof of Hassani Lemma 3.5.9.

3.54 Let B ≡ Aak−1 . Then B is a left ideal of A generated by the element ak−1 , see Problem 3.9. Form the set Ba. Recall the supposition of Hassani Lemma 3.5.9 that there exists an element a ∈ A such that Aak = Aak−1 . This supposition implies that, Ba = (Aak−1 )a = A(ak−1 a) = Aak , = Aa

k−1

,

used associative algebra used supposition Aak = Aak−1 ,

= B.

(3.109)

Multiplying the LHS and RHS of Eq. (3.109) repeatedly on the right by a we eventually arrive at Bak = B.

(3.110)

58

3 Algebras

We know that ak ∈ B from its definition, which implies that B contains all elements that can be written as cak−1 for any c ∈ A. But of course a ∈ A so aak−1 = ak is in B. Now define b = ak and substitute this into Eq. (3.110) giving Bb = B.

(3.111)

This says that every element in the set B can be written as some element in B times b. But since b ∈ B that means that b too can be written as some element, let’s say P ∈ B, times b, in other words, Pb = b.

(3.112)

Multiplying both sides of Eq. (3.112) on the left by P and rearranging, 0 = P2 b − Pb = (P2 − P)b.

(3.113)

Which, as we saw in Problem 3.32, implies P2 = P. In conclusion the supposition that there exists an element a ∈ A such that Aak = Aak−1 implies that A has an idempotent element.

3.55 Let A be the algebra with product Eq. (3.96) defined on the vector space R2 . (a) Show that the open unit disc, {(x, y) ∈ R2 |x 2 + y 2 < 1},

(3.114)

is a subset of A that is stable under multiplication Eq. (3.96). Why is this subset of A not a subalgebra of A? Are there any lines that are subalgebras of A? (b) Show that the straight line, {(x, y) ∈ R2 |y = mx, for fixed m ∈ R},

(3.115)

is a subset of A that is stable under linear combinations. Why is this subset of A not generally a subalgebra of A for arbitrary fixed m? Find three straight lines that are subalgebras of A. (c) Show that the set in Eq. (3.115) is a subalgebra of the algebra on R2 with product given by the following structure constants of the standard basis 1 2 = c22 = 1, c11 1 2 c22 = c11 = 0.

1 1 2 2 c12 = c21 = c12 = c21 =

1 , 2 (3.116)

3.2 Supplementary Problems

59

3.56 Take R2 as our underlying vector space upon which we wish to build an algebra A. Recall that proper nontrivial subspaces of R2 are straight lines through the origin. Let’s say that we want every subspace of R2 to be a subalgebra of A. Show that this imposes restrictions on the structure constants cikj ∈ R 1 2 2 = c12 + c21 , c11 1 = 0, c22 2 c11 = 0, 2 1 1 = c12 + c21 . c22

(3.117)

3.57 Explain why the trivial ideal consisting of only the zero element is always contained in a minimal left (right or two-sided) ideal but is not necessarily equal to the minimal left (right or two-sided) ideal.

3.57 We start with the case of a left ideal, but then note that our considerations apply equally to left, right and two-sided ideals. Any algebra A has at least two obvious left ideals, the whole algebra A and the singelton {0}, the latter being the so-called trivial ideal. Furthermore the zero element 0 must be contained in each algebra over a field such as R or C because the underlying vector space necessarily contains the zero vector (take any vector and multiply by zero). Therefore any left ideal B ⊂ A also trivially contains 0 because when we search for elements c satisfying c = ab ∈ B,

∀a ∈ A and b ∈ B,

(3.118)

we must include a = 0, which gives c = 0. A minimal left ideal can be defined as an ideal that contains no other nontrivial left ideal. Equivalently, a left ideal M of A is minimal if every nontrivial left ideal B of A contained in M coincides with M. The same considerations apply for a right or two-sided ideal.

3.58 Show that the structure constants cikj ∈ C of an associative algebra over C must obey ∑ l

m cljk cil =



m cilj clk .

(3.119)

l

3.59 (This problem might be difficult for students that have not yet read Chap. 5.) Suppose we have an associative algebra A given by an n-dimensional vector space over C with basis {ek }nk=1 and a set of structure constants cikj ∈ C that obey

60

3 Algebras

Eq. (3.119). Now transform to a new basis {fk }nk=1 where the two are related via an invertible n × n complex matrix R fk =

n ∑

R jk e j .

(3.120)

j=1

Show that the structure constants in the new basis are given by dikj =

n ∑

m Rli Rr j R−1 km clr ,

(3.121)

l,r,m=1

where R−1 is the inverse of R, so n ∑

Rik R−1 k j = δi j .

(3.122)

k=1

This implies that the structure constants are a type of tensor. Tensors are used extensively in physics and engineering and are the subject of Hassani Part VIII. They are discussed briefly in the next problem. 3.60 In physics we sometimes hear that if a, b ∈ R3 are so-called true vectors or polar vectors then their cross product c c = a × b,

(3.123)

is a so-called pseudo vector or axial vector. Explain why the vector space R3 over the reals endowed with the cross product is a legitimate algebra, not a somehow faulty algebra.

3.60 Names can be misleading. When we talk about vectors being true or polar vectors versus pseudo or axial vectors, we are thinking of them as a type of tensor. An important trait considered in the taxonomy of vectors, and tensors more generally, is their behaviour under transformations of the basis vectors of the coordinate system. For example an important species of tensors used extensively in classical mechanics is called affine tensors or sometimes Cartesian tensors. Affine tensors are invariant under orthogonal transformations (the determinant of the transformation matrix can be plus or minus 1, so including rotations and reflections) of the rectanglur coordinate system in which they are represented; true or polar vectors are animals of this species. Pseudo or axial vectors are tensors of a different but closely related species; they are invariant under rotations of the rectanglur coordinate system in which they are


represented (these are orthogonal transformations with transformation matrix determinant plus one so reflections are excluded). Pseudo or axial vectors are not faulty or illegitimate but simply more restricted in terms of the permitted transformations of the bases. For full discussion of tensors see either Hassani Part VIII, or the applied mathematics book dedicated to the subject [12], or physics books that use tensors extensively in relativity theory [11, 16]). In Part I of Hassani’s book we are considering finite dimensional vector spaces and endowing them with additional structure like a norm and products between them etc. These vectors spaces consist of abstract vectors that could be the geometric quantities used to represent familiar quantities from classical physics. But the abstract vectors of a vector space are not necessarily tensors; as we saw in Chap. 2 they could be polynomials with complex coefficients (see Hassani Example 2.1.2) or matrices of a given shape, etc., and this generality paves the way to applications in quantum physics. When we talk about the vector space R3 over the reals, without further retriction, we are thinking of matrices of a particular shape, either 1 × 3 (column vectors) or 3 × 1 (row vectors) with not necessarily tensorial properties. In short, an algebra over a field builds an algebra from an abstract vector space which is more than general enough to include so-called pseudo vectors. Even as tensors, pseudo vectors are a type of tensor with invariance properties but we must restrict the coordinate transformations a little bit more than we do for so-called true or polar vectors.


Chapter 4

Operator Algebra

Abstract The previous chapter on algebras introduced further structure on vector spaces by allowing the product of two vectors, the result being another vector in the vector space. Operators are linear transformations, i.e. linear maps from a vector space back into the same vector space. The vector space of primary interest in this chapter is that of operators. By “the product of operators” is this case we mean the composition of the linear maps. For the physicist operators are essential elements of quantum mechanics. Every observable quantity Q is associated with a hermitian operator Q and the only possible measured values of the quantity Q are the eigenvalues of Q.

4.1 Problems 4.3 For D and T defined in Example 2.3.5: (a) Show that [ D, T] = 1. (b) Calculate the linear transformations D3 T3 and T3 D3 .

4.3 (a) In Example 2.3.5 the derivative operator D was defined on the vector space P_c[t] described in Example 2.1.2, i.e. polynomials of unspecified order in the variable t with complex coefficients. The variable t need not be a real number; indeed recall in Hassani Sect. 3.6 we considered polynomials in an element a of an associative unital algebra. For convenience we assume t ∈ R. Consider the arbitrary vector |x⟩ ∈ P_c[t] given by

|x⟩ = Σ_{k=0}^{n} α_k t^k,    (4.1)


where α_k ∈ C are the coefficients and n ∈ N is the arbitrary order of the polynomial. The derivative operator D was defined via its effect on such an arbitrary vector,

D|x⟩ = Σ_{k=1}^{n} k α_k t^{k−1} = d/dt ( Σ_{k=0}^{n} α_k t^k ).    assumed t is a real variable    (4.2)

The operator T was also defined:

T|y⟩ = t|y⟩,    |y⟩ ∈ C^n(a, b),    (4.3)

where from Example 2.1.2 C^n(a, b) is the set of real-valued functions defined in the real interval (a, b) whose first n derivatives exist and are continuous. It is not essential that the functions be real-valued, so T can be extended to apply to complex-valued functions defined in the real interval (a, b). So clearly T can be applied to |x⟩ ∈ P_c[t] as well because their first n derivatives are continuous on a real interval (a, b) (indeed their derivatives of any order are continuous on −∞ < t < ∞). Now consider how the commutator [D, T] acts on an arbitrary vector |x⟩,

[D, T]|x⟩ = d/dt (t|x⟩) − t d/dt |x⟩ = |x⟩,    (4.4)

which implies [D, T] = 1.
(b) Now consider the operator D^3 T^3 operating on an arbitrary vector |x⟩,

D^3 T^3 |x⟩ = d^3/dt^3 (t^3 |x⟩) = d^2/dt^2 ( 3t^2 |x⟩ + t^3 d/dt |x⟩ ),
            = d/dt ( 6t|x⟩ + 3t^2 d/dt |x⟩ + 3t^2 d/dt |x⟩ + t^3 d^2/dt^2 |x⟩ ),
            = 6|x⟩ + 18t d/dt |x⟩ + 9t^2 d^2/dt^2 |x⟩ + t^3 d^3/dt^3 |x⟩.    (4.5)

This implies that

D^3 T^3 = 6·1 + 18 T D + 9 T^2 D^2 + T^3 D^3.    (4.6)


In the reverse order the resulting operator is T^3 D^3, which operates on an arbitrary vector |x⟩ quite simply,

T^3 D^3 |x⟩ = t^3 ( d^3/dt^3 |x⟩ ) = t^3 d^3/dt^3 |x⟩.    (4.7)

So we can immediately find the commutator

[D^3, T^3] = 6·1 + 18 T D + 9 T^2 D^2.    (4.8)
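As a quick independent check of Eqs. (4.5) and (4.6), we can let a computer algebra system differentiate a generic fourth-degree polynomial. This is only a sketch, not part of Hassani's exercises; it assumes SymPy is available and uses placeholder coefficient symbols.

```python
import sympy as sp

t = sp.symbols('t')
a = sp.symbols('a0:5')                      # generic coefficients of a degree-4 polynomial
p = sum(a[k] * t**k for k in range(5))      # the vector |x>

lhs = sp.diff(t**3 * p, t, 3)               # D^3 T^3 |x>
rhs = (6 * p + 18 * t * sp.diff(p, t)
       + 9 * t**2 * sp.diff(p, t, 2)
       + t**3 * sp.diff(p, t, 3))           # (6*1 + 18TD + 9T^2D^2 + T^3D^3)|x>

print(sp.simplify(lhs - rhs))               # prints 0, confirming Eq. (4.6)
```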

4.6 Show that if [[A, B], A] = 0, then for every positive integer k,

[A^k, B] = k A^{k−1} [A, B].    (4.9)

Hint: First prove the relation for low values of k; then use mathematical induction.

4.6 Of course the A, B here are operators acting on some vector space V, i.e. A, B ∈ End(V). There is the well-known ambiguity about whether the positive integers include zero or not. It will turn out that our proof is more general if we exclude k = 0, in the sense that we can be less restrictive about the operator A. Let's start with k = 1. The LHS of Eq. (4.9) becomes

[A^1, B] = [A, B],    (4.10)

while the RHS becomes

1·A^0 [A, B] = 1·1 [A, B] = [A, B].    (4.11)

(We used A^0 = 1 which, while natural enough, is not obvious either, see Supplementary Problem 4.40. In fact it was implied in Hassani Definition 3.6.1 and used explicitly in Definition 3.6.2 in the context of polynomial algebra.) We conclude from Eq. (4.11) that Eq. (4.9) holds for k = 1. Now we need to generalize these results to arbitrary k. We assume that Eq. (4.9) holds for a particular k ∈ Z+ and verify that it then holds for k + 1. We have on the LHS of Eq. (4.9)


[A^{k+1}, B] = [A^k A, B],                         used associative algebra
             = A^k [A, B] + [A^k, B] A,            used right derivative property
             = A^k [A, B] + k A^{k−1} [A, B] A,    used Eq. (4.9)
             = (k + 1) A^k [A, B].                 used [[A, B], A] = 0    (4.12)

Because Eq. (4.12) is precisely Eq. (4.9) with k replaced by k + 1, by mathematical induction this proves Eq. (4.9) holds for any k ∈ Z+. Note the result Eq. (4.9) emphasizes the similarity of the commutator with the derivative operator. For completeness, consider k = 0. Then the LHS of Eq. (4.9) is

[A^0, B] = [1, B] = 0,    (4.13)

while the RHS is

0·A^{−1} [A, B] = 0.    (4.14)

Note that we had to assume that A^{−1} exists. If so, then the RHS is the zero operator 0 and Eq. (4.9) holds for any integer k ≥ 0. If A is not invertible then we can only use Eq. (4.9) for k > 0; zero times an undefined operator is undefined.
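A quick numerical illustration (a sketch, not part of the original solution): truncated ladder-operator matrices give a pair with [[A, B], A] = 0 exactly, since [a, N] = a for a lowering-type operator a and the number operator N. Eq. (4.9) can then be checked for a few values of k with NumPy.

```python
import numpy as np

n = 6
A = np.diag(np.sqrt(np.arange(1, n)), k=1)   # truncated lowering operator a
B = np.diag(np.arange(n, dtype=float))       # number operator N

comm = lambda X, Y: X @ Y - Y @ X
assert np.allclose(comm(comm(A, B), A), 0)   # [[A, B], A] = 0 holds exactly here

for k in range(1, 4):
    lhs = comm(np.linalg.matrix_power(A, k), B)
    rhs = k * np.linalg.matrix_power(A, k - 1) @ comm(A, B)
    print(k, np.allclose(lhs, rhs))          # True for each k, as in Eq. (4.9)
```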

4.9 Show that for any α, β ∈ R and any H ∈ End(V), we have

e^{αH} e^{βH} = e^{(α+β)H}.    (4.15)

4.9 If α ≠ 0 we can let J ≡ (β/α)H and then write the LHS of Eq. (4.15) as

e^{αH} e^{βH} = e^{αH} e^{αJ},
             = e^{α(H+J)} e^{(α²/2)[H,J]},    used Baker-Campbell-Hausdorff formula
             = e^{α(H+J)},                    used J = f(H) and e^0 = 1
             = e^{(α+β)H}.    (4.16)

If α = 0 the result is immediate, because e^0 = 1, see Supplementary Problem 4.41.
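The operator identity can also be spot-checked numerically. In the following sketch (an illustration only, not a proof; it assumes NumPy and SciPy) a random hermitian matrix stands in for the abstract H ∈ End(V).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                     # a random hermitian matrix

alpha, beta = 0.7, -1.3
lhs = expm(alpha * H) @ expm(beta * H)
rhs = expm((alpha + beta) * H)
print(np.allclose(lhs, rhs))                 # True: Eq. (4.15) holds numerically
```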


4.12 Find the solution to the operator differential equation

dU/dt = t H U(t).    (4.17)

Hint: Make the change of variable y = t^2 and use the result of Example 4.2.3.

4.12 Using the chain rule of differential calculus we have

d/dt = (dy/dt) d/dy = 2t d/dy.    (4.18)

Substituting Eq. (4.18) into Eq. (4.17) we find

2t dU(±√y(t))/dy = t H U(±√y).    (4.19)

As a bit of formalism, it is convenient to redefine U(t) = U(±√y(t)) ≡ V(y), and to restrict t ≥ 0 so that we can drop the ± in y(t). Then Eq. (4.19) has the form encountered in Example 4.2.3, except for the unimportant factor of 2 (this could be absorbed into H if we wished):

dV(y)/dy = (1/2) H V(y).    (4.20)

So the solution is available at the end of Example 4.2.3:

V(y) = e^{(y/2)H} V(0),    y ≥ 0,    (4.21)

or

U(t) = e^{(t²/2)H} U(0),    t ≥ 0.    (4.22)
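One can confirm Eq. (4.22) numerically by integrating the matrix differential equation (4.17) directly and comparing with the closed form. A minimal sketch, assuming NumPy/SciPy, with a small random matrix for H and U(0) = 1:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

rng = np.random.default_rng(1)
H = rng.normal(size=(3, 3))
U0 = np.eye(3)

def rhs(t, u):                                # dU/dt = t H U, flattened for solve_ivp
    return (t * H @ u.reshape(3, 3)).ravel()

sol = solve_ivp(rhs, (0.0, 2.0), U0.ravel(), rtol=1e-10, atol=1e-12)
U_numeric = sol.y[:, -1].reshape(3, 3)
U_closed = expm(0.5 * 2.0**2 * H) @ U0        # Eq. (4.22) evaluated at t = 2
print(np.allclose(U_numeric, U_closed, atol=1e-6))   # True
```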

4.15 Assuming that [[S, T], T] = 0 = [[S, T], S], show that

[S, exp(tT)] = t [S, T] exp(tT).    (4.23)

Hint: Expand the exponential and use Problem 4.6.


4.15 Expand the operator function exp(tT) using a Taylor series,

exp(tT) = Σ_{k=0}^{∞} [ (d^k/dx^k) exp(tx) |_{x=0} ] T^k/k!,    used Hassani Eq. (4.5)
        = Σ_{k=0}^{∞} t^k T^k/k!.    (4.24)

Substitute Eq. (4.24) into the LHS of Eq. (4.23)

[S, exp(tT)] = − [ Σ_{k=0}^{∞} t^k T^k/k! , S ].    (4.25)

We want to follow the hint and use Eq. (4.9). However, to avoid the question of the invertibility of T, we first consider the first term in the infinite sum. When k = 0 we have

[ t^0 T^0/0! , S ] = [1, S] = 0.    (4.26)

So

[S, exp(tT)] = 0 − [ Σ_{k=1}^{∞} t^k T^k/k! , S ],
             = 0 − Σ_{k=1}^{∞} t^k k T^{k−1} [T, S]/k!,           used Eq. (4.9)
             = 0 + Σ_{k=1}^{∞} t t^{k−1} T^{k−1} [S, T]/(k − 1)!,   rearranged
             = ( Σ_{l=0}^{∞} t^l T^l/l! ) t [S, T],                 let l ≡ (k − 1)
             = exp(tT) t [S, T].    used Eq. (4.24)    (4.27)
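For a concrete check of Eq. (4.23), the 3 × 3 Heisenberg-algebra matrices S = e_12 and T = e_23 satisfy [[S, T], T] = 0 = [[S, T], S], since their commutator e_13 is central. The short numerical sketch below assumes NumPy and SciPy; it is only an illustration of the identity.

```python
import numpy as np
from scipy.linalg import expm

S = np.zeros((3, 3)); S[0, 1] = 1.0           # e_12
T = np.zeros((3, 3)); T[1, 2] = 1.0           # e_23
comm = lambda X, Y: X @ Y - Y @ X
assert np.allclose(comm(comm(S, T), T), 0)
assert np.allclose(comm(comm(S, T), S), 0)

t = 0.37
lhs = comm(S, expm(t * T))
rhs = t * comm(S, T) @ expm(t * T)
print(np.allclose(lhs, rhs))                  # True: Eq. (4.23)
```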


4.18 Prove Theorem 4.3.2: Let U, T ∈ End(V) and α ∈ C. Then

1. (U + T)† = U† + T†,
2. (UT)† = T† U†,
3. (αT)† = α* T†,
4. (T†)† = T.    (4.28)

(Note that identity 4. does not in general hold for infinite-dimensional V.) Hint: Use the result that for all |a⟩, |b⟩ ∈ V

⟨a|T|b⟩* = ⟨b|T†|a⟩,    Hassani Eq. (4.11)    (4.29)

and Theorem 2.3.7: An endomorphism T of an inner product space is 0 if and only if ⟨b|T|a⟩ ≡ ⟨b|Ta⟩ = 0 for all |a⟩ and |b⟩.

4.18 The basic strategy here is to use Eq. (4.29) to convert the operator on the LHS of Eq. (4.28) (with hermitian conjugate) into a familiar operator equation that we can manipulate using the standard rules. And then we use Eq. (4.29) again to convert it back to an equation with the operators on the RHS of Eq. (4.28). For the first one we have

⟨b|(U + T)†|a⟩ = ⟨a|(U + T)|b⟩*,              used Eq. (4.29)
               = ( ⟨a|U|b⟩ + ⟨a|T|b⟩ )*,      linearity of op. & inner product
               = ⟨a|U|b⟩* + ⟨a|T|b⟩*,          complex conjugate property
               = ⟨b|U†|a⟩ + ⟨b|T†|a⟩,          used Eq. (4.29)
               = ⟨b|(U† + T†)|a⟩.              linearity of op. & inner product    (4.30)

Now the answer seems obvious, but to be rigorous we go through all the steps of the argument. Rearranging the last line of Eq. (4.30) we have

0 = ⟨b|(U + T)† − (U† + T†)|a⟩.    linearity of op. & inner product    (4.31)

And now because |a⟩ and |b⟩ were arbitrary we have, by Theorem 2.3.7, that (U + T)† − (U† + T†) = 0 and therefore

(U + T)† = U† + T†.    (4.32)

This completes the proof of part 1. Part 2 of Eq. (4.28) proceeds in a similar way. Note the associative property for linear transformations is inherited from the general notion of the composition of two general transformations (Hassani, Fig. 1.2). We have


⟨b|(UT)†|a⟩ = ⟨a|(UT)|b⟩*,            used Eq. (4.29)
            = [ (⟨a|U)(T|b⟩) ]*,        used associative property
            = ⟨a|U|c⟩*,                 where |c⟩ = T|b⟩
            = ⟨c|U†|a⟩,                 used Hassani Eq. (4.11)
            = ⟨b|T†U†|a⟩.               used Hassani Eq. (4.12)    (4.33)

We saw above in the proof of part 1 that an equality of this sort implies the operators are equal, so Eq. (4.33) implies

(UT)† = T†U†,    (4.34)

which completes the proof of part 2. Part 3 is a bit shorter:

⟨b|(αT)†|a⟩ = ⟨a|(αT)|b⟩*,     used Eq. (4.29)
            = (α⟨a|T|b⟩)*,      used Definition 2.2.1 Proposition 2
            = α*⟨a|T|b⟩*,       used properties of C
            = α*⟨b|T†|a⟩,       used Eq. (4.29)
            = ⟨b|α*T†|a⟩.       used Definition 2.2.1 Proposition 2    (4.35)

So as above we can conclude that (αT)† = α*T†, as required for part 3. And part 4 is even shorter again:

⟨b|(T†)†|a⟩ = ⟨b|T†|a⟩*,     used Eq. (4.29)
            = (⟨a|T|b⟩*)*,    used Eq. (4.29)
            = ⟨a|T|b⟩.        used properties of C    (4.36)

And so (T†)† = T.
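The four identities of Eq. (4.28) are easy to spot-check numerically for matrices, where the adjoint is the conjugate transpose. A minimal sketch (assuming NumPy and random 3 × 3 complex matrices):

```python
import numpy as np

rng = np.random.default_rng(2)
U = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
T = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
alpha = 1.5 - 0.4j
dag = lambda X: X.conj().T                   # the adjoint of a matrix

print(np.allclose(dag(U + T), dag(U) + dag(T)))              # part 1
print(np.allclose(dag(U @ T), dag(T) @ dag(U)))              # part 2
print(np.allclose(dag(alpha * T), np.conj(alpha) * dag(T)))  # part 3
print(np.allclose(dag(dag(T)), T))                           # part 4
```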

4.21 In this problem, you will go through the steps of proving the rigorous statement of the Heisenberg uncertainty principle. Denote the expectation (average) value of an operator A in a state |Ψ⟩ by A_avg. Thus, A_avg = ⟨A⟩ = ⟨Ψ|A|Ψ⟩. The uncertainty (deviation from the mean) in normalized state |Ψ⟩ of the operator A is given by

ΔA = √⟨(A − A_avg)²⟩ = √⟨Ψ|(A − A_avg 1)²|Ψ⟩.    (4.37)

(a) Show that for any two hermitian operators A and B, we have


|⟨Ψ|AB|Ψ⟩|² ≤ ⟨Ψ|A²|Ψ⟩⟨Ψ|B²|Ψ⟩.    (4.38)

Hint: Apply the Schwarz inequality to an appropriate pair of vectors.
(b) Using the above and the triangle inequality for complex numbers, show that

|⟨Ψ|[A, B]|Ψ⟩|² ≤ 4⟨Ψ|A²|Ψ⟩⟨Ψ|B²|Ψ⟩.    (4.39)

(c) Define the operators A′ = A − α1, B′ = B − β1, where α and β are real numbers. Show that A′ and B′ are hermitian and

[A′, B′] = [A, B].    (4.40)

(d) Now use all the results above to show the celebrated uncertainty relation

(ΔA)(ΔB) ≥ (1/2) |⟨Ψ|[A, B]|Ψ⟩|.    (4.41)

What does this reduce to for position operator x and momentum p if [x, p] = iℏ?

4.21 (a) Recall the Schwarz inequality: for two vectors |a⟩, |b⟩ in the same inner product space we have

|⟨a|b⟩|² ≤ ⟨a|a⟩⟨b|b⟩.    Hassani Theorem 2.2.7    (4.42)

Start with Eq. (4.38) and rework it into the form of Eq. (4.42). In fact, the two vectors referred to in the hint will turn out to be |a⟩ = A|Ψ⟩ and |b⟩ = B|Ψ⟩. We see this by first expanding the LHS of Eq. (4.38), the first line using the definition of the square of the absolute value of a complex number:

|⟨Ψ|AB|Ψ⟩|² = ⟨Ψ|AB|Ψ⟩* ⟨Ψ|AB|Ψ⟩.    used def. of |z|², z ∈ C    (4.43)

Now we want to have only products of vectors so we absorb the operators as follows. Start with the first factor on the RHS of Eq. (4.43)

⟨Ψ|AB|Ψ⟩* = ⟨Ψ|(AB)†|Ψ⟩,    used Eq. (4.29)
          = ⟨Ψ|B†A†|Ψ⟩,      used Eq. (4.34)
          = ⟨Ψ|B†A|Ψ⟩,       used A is hermitian
          ≡ ⟨b|a⟩,            (4.44)

where

|a⟩ ≡ A|Ψ⟩,    |b⟩ ≡ B|Ψ⟩.    (4.45)

We then find that the second factor on the RHS of Eq. (4.43) is ⟨b|a⟩* and so Eq. (4.43) becomes:

|⟨Ψ|AB|Ψ⟩|² = ⟨b|a⟩⟨b|a⟩* = |⟨b|a⟩|².    (4.46)

We have the LHS of Eq. (4.38) in the form we want. Using the definitions Eq. (4.45), the RHS of Eq. (4.38) can be rewritten because

⟨Ψ|B²|Ψ⟩ = ⟨Ψ|B†B|Ψ⟩ = ⟨b|b⟩.    used B hermitian    (4.47)

Similarly,

⟨Ψ|A²|Ψ⟩ = ⟨Ψ|A†A|Ψ⟩ = ⟨a|a⟩.    used A hermitian    (4.48)

Combining the previous two results, the RHS of Eq. (4.38) can be rewritten

⟨Ψ|A²|Ψ⟩⟨Ψ|B²|Ψ⟩ = ⟨a|a⟩⟨b|b⟩.    (4.49)

The Schwarz inequality Eq. (4.42), using Eqs. (4.46) and (4.49), gives

|⟨Ψ|AB|Ψ⟩|² = |⟨b|a⟩|² ≤ ⟨a|a⟩⟨b|b⟩ = ⟨Ψ|A²|Ψ⟩⟨Ψ|B²|Ψ⟩.    (4.50)

(b) Start by expanding the LHS of Eq. (4.39)

|⟨Ψ|[A, B]|Ψ⟩|² = |⟨Ψ|AB|Ψ⟩ − ⟨Ψ|BA|Ψ⟩|².    (4.51)

We know something about the first term on the RHS of Eq. (4.51), but what about the second? Actually we know just as much! Using again our definitions Eq. (4.45), the second term on the RHS of Eq. (4.51) becomes

⟨Ψ|BA|Ψ⟩ = ⟨Ψ|B|a⟩,     used def. Eq. (4.45)
         = ⟨a|B†|Ψ⟩*,     used Eq. (4.29)
         = ⟨a|b⟩*,        used B hermitian, def. Eq. (4.45)
         = z*,             (4.52)


where z = ⟨Ψ|AB|Ψ⟩. Substituting z into the RHS of Eq. (4.51)

|⟨Ψ|[A, B]|Ψ⟩|² = |z − z*|²,
               ≤ ( |z| + |−z*| )²,    used triangle inequality, Definition 2.2.8
               = 4|z|²,
               = 4|⟨Ψ|AB|Ψ⟩|².    (4.53)

Now with the result Eq. (4.50) from (a),

|⟨Ψ|[A, B]|Ψ⟩|² ≤ 4|⟨a|b⟩|² ≤ 4⟨Ψ|A²|Ψ⟩⟨Ψ|B²|Ψ⟩,    (4.54)

which is the next desired intermediate result.
(c) The operators A′ = A − α1, B′ = B − β1, are clearly hermitian when α and β are real numbers, since for an arbitrary bra-ket pair ⟨c| and |d⟩,

⟨c|(A − α1)†|d⟩ = ⟨d|(A − α1)|c⟩*,           used Eq. (4.29)
                = ⟨d|A|c⟩* − α*⟨d|1|c⟩*,      used linearity
                = ⟨c|A|d⟩ − α*⟨c|d⟩,          used A and 1 hermitian
                = ⟨c|(A − α1)|d⟩.             used α ∈ R and linearity    (4.55)

Because ⟨c| and |d⟩ were arbitrary, this result and Hassani Theorem 2.3.7 prove A′ = A − α1 is hermitian. Of course B′ enjoys the same property. For the commutator we have

[A′, B′] = [A − α1, B − β1] = [A, B] − β[A, 1] − α[1, B] + αβ[1, 1],
         = [A, B].    used Proposition 4.1.8    (4.56)

(d) To show the uncertainty relation Eq. (4.41), all we have to do is identify the pieces, choosing α = A_avg and β = B_avg so that ΔA = √⟨Ψ|A′²|Ψ⟩ and ΔB = √⟨Ψ|B′²|Ψ⟩:

(ΔA)(ΔB) = √( (ΔA)²(ΔB)² ) = √( ⟨Ψ|A′²|Ψ⟩⟨Ψ|B′²|Ψ⟩ ),    used Eq. (4.37)
         ≥ √( (1/4)|⟨Ψ|[A′, B′]|Ψ⟩|² ),                     used Eq. (4.39)
         = √( (1/4)|⟨Ψ|[A, B]|Ψ⟩|² ),                       used Eq. (4.40)
(ΔA)(ΔB) ≥ (1/2)|⟨Ψ|[A, B]|Ψ⟩|.                             sq. root    (4.57)

The quantum mechanical operators of position x and momentum p are described in any introductory textbook on quantum mechanics, e.g. [8, Sect. 4.6]. Substituting the position operator A = x and the momentum operator B = p, we have

Δx Δp ≥ (1/2)|⟨Ψ|[x, p]|Ψ⟩| = (1/2)|iℏ⟨Ψ|Ψ⟩| = ℏ/2.    used normalization of the wavefunction    (4.58)

This is the result presented in introductory textbooks on quantum mechanics [8, Sect. 3.7 and Eq. (3.66)].
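A concrete finite-dimensional illustration of Eq. (4.41) (a sketch, not part of Hassani's text) uses the spin-1/2 matrices A = Sx, B = Sy (with ℏ = 1) and a random normalized state |Ψ⟩:

```python
import numpy as np

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
rng = np.random.default_rng(3)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)                      # normalized |Psi>

def delta(A):                                   # uncertainty, Eq. (4.37)
    avg = np.vdot(psi, A @ psi).real
    Ashift = A - avg * np.eye(2)
    return np.sqrt(np.vdot(psi, Ashift @ Ashift @ psi).real)

lhs = delta(Sx) * delta(Sy)
rhs = 0.5 * abs(np.vdot(psi, (Sx @ Sy - Sy @ Sx) @ psi))
print(lhs >= rhs - 1e-12, lhs, rhs)             # the inequality (4.41) holds
```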

4.24 Show that if P is a (hermitian) projection operator, so are 1 − P and U† PU for any unitary operator U.

4.24 Hassani Definition 4.4.2 states that a projection operator is a hermitian idempotent of End(V). This means that we must show that 1 − P and U†PU are all of the following:

1. members of End(V), i.e. they are linear operators on the vector space V,
2. hermitian, i.e. they equal their adjoint,
3. idempotents, i.e. they equal their square.

Both 1 − P and U†PU inherit their membership in End(V) from the fact they are built from 1, P, U†, U ∈ End(V). Both 1 − P and U†PU are hermitian:

(1 − P)† = 1† − P†,    used Eq. (4.28)
         = 1 − P,       used 1 hermitian, SP 4.42; P hermitian    (4.59)

and

(U†PU)† = U†P†(U†)†,    used Eq. (4.28)
        = U†PU.          used Eq. (4.28); P hermitian    (4.60)

Both 1 − P and U†PU are idempotents:

(1 − P)² = (1 − P)(1 − P),
         = 1 − 2P + P²,
         = 1 − P.    used P idempotent    (4.61)

(U†PU)² = (U†PU)(U†PU) = (U†P)(UU†)(PU),    associative
        = U†P²U,                             used U unitary
        = U†PU.                              used P idempotent    (4.62)

4.27 (a) Following Hassani Definition 4.4.4, the projection along any vector | a1 ⟩ is simply P1 =

1 | a1 ⟩⟨ a1 |. ⟨ a1 | a1 ⟩

Here the real number ⟨ a1 | a1 ⟩ is the norm squared

(4.63)

76

4 Operator Algebra



⎞ 1 ⟨ a1 | a1 ⟩ = 1 1 −1 ⎝ 1 ⎠ = 3. −1 (

)

(4.64)

(The corresponding dual vector, the bra ⟨ a1 | was written as a row vector consistent with Hassani Box 4.3.3.) The matrix is obtained via the rules of matrix algebra ⎛ ⎞ ⎛ ⎞ 1 ) 1 1 1 −1 1 ⎝ ⎠( 1 1 1 −1 = ⎝ 1 1 −1⎠ . P1 = 3 −1 3 −1 −1 1

(4.65)

For | a2 ⟩ one finds ⎛ ⎞ ⎛ ⎞ −2 ) 1 4 −2 2 1 ⎝ ⎠( 1 −2 1 −1 = ⎝−2 1 −1⎠ . P2 = 6 −1 6 2 −1 1

(4.66)

A projection operator is a hermitian idempotent. Both P1 and P2 are real matrices so to be hermitian they must be symmetric, as they are. For linear operator P to be idempotent we require P2 = P, which is easily verified ⎛ ⎛ ⎞⎛ ⎞ ⎞ 1 1 −1 1 1 −1 1 1 −1 1⎝ 1 1 1 −1⎠ ⎝ 1 1 −1⎠ = ⎝ 1 1 −1⎠ . P1 P1 = 9 −1 −1 1 3 −1 −1 1 −1 −1 1

(4.67)

Similarly, ⎛

⎛ ⎞⎛ ⎞ ⎞ 4 −2 2 4 −2 2 4 −2 2 1 1 P2 P2 = 2 ⎝−2 1 −1⎠ ⎝−2 1 −1⎠ = ⎝−2 1 −1⎠ . 6 6 2 −1 1 2 −1 1 2 −1 1

(4.68)

(b) The matrix for P = P1 + P2 is obtained via element-by-element addition. It is convenient to first multiply each element of the matrix in P1 by 2 to obtain a common denominator with P2:

P = P1 + P2 = (1/6) ( 2  2  −2 ;  2  2  −2 ;  −2  −2  2 ) + (1/6) ( 4  −2  2 ;  −2  1  −1 ;  2  −1  1 )
            = (1/6) ( 6  0  0 ;  0  3  −3 ;  0  −3  3 ) = (1/2) ( 2  0  0 ;  0  1  −1 ;  0  −1  1 ).    (4.69)

The result is a symmetric real matrix so clearly hermitian (as it must be since it is the sum of two hermitian matrices). The operator is idempotent,

P² = (1/4) ( 2  0  0 ;  0  1  −1 ;  0  −1  1 )² = (1/2) ( 2  0  0 ;  0  1  −1 ;  0  −1  1 ) = P.    (4.70)

(c) When P acts on an arbitrary vector |b⟩ = (x, y, z)^T the components are

P|b⟩ = (1/2) ( 2  0  0 ;  0  1  −1 ;  0  −1  1 ) (x, y, z)^T = ( x, (y − z)/2, (z − y)/2 )^T ≡ c.    (4.71)

The vector a1 × a2 = (0, 3, 3)^T will be orthogonal to the plane passing through a1 and a2. The arbitrary vector c = P|b⟩ will lie in the plane passing through a1 and a2. Hence we expect (0, 3, 3)^T to be orthogonal to any vector c,

(a1 × a2) · c = (0  3  3) ( x, (y − z)/2, (z − y)/2 )^T = 0.    (4.72)
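The matrices above are easy to verify numerically. A short sketch (assuming NumPy) rebuilds P1, P2 and P, checks idempotency, and confirms the orthogonality found in part (c) for an arbitrary vector:

```python
import numpy as np

a1 = np.array([1.0, 1.0, -1.0])
a2 = np.array([-2.0, 1.0, -1.0])
P1 = np.outer(a1, a1) / (a1 @ a1)              # Eq. (4.65)
P2 = np.outer(a2, a2) / (a2 @ a2)              # Eq. (4.66)
P = P1 + P2

print(np.allclose(P1 @ P1, P1), np.allclose(P2 @ P2, P2), np.allclose(P @ P, P))

b = np.array([0.3, -1.7, 2.5])                 # an arbitrary vector
c = P @ b
print(np.allclose(np.cross(a1, a2) @ c, 0.0))  # c lies in the plane of a1 and a2
```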

4.30 The parametric equation of a line L in a coordinate system with origin O is

x = 2t + 1,    y = t + 1,    z = −2t + 2.    (4.73)

A point P has coordinates (3, −2, 1). (a) Using the projection operators, find the length of the projection of OP on the line L. (b) Find the vector whose beginning is P and ends perpendicularly on L. (c) From this vector calculate the distance from P to L.

4.30 (a) We are working in three dimensional Euclidean space with Cartesian coordinates. By inspection we note that the coordinates of the points on the line L can be written in matrix form

(x, y, z)^T = (1, 1, 2)^T + t (2, 1, −2)^T.    (4.74)

This can be written in vector form:

OA = OQ + t OR,    (4.75)

where OA is the vector from the origin to an arbitrary point A on the line, OQ is the vector from the origin to the point Q with coordinates (1, 1, 2), and OR is the vector from the origin to the point R with coordinates (2, 1, −2). Let us define the unit ket vector |e⟩ = OR/||OR|| parallel to the line L. Then the projection operator |e⟩⟨e| acting on a vector gives its projection onto the line L. Here,

|e⟩ = (1/√(2² + 1² + (−2)²)) (2, 1, −2)^T = (1/3)(2, 1, −2)^T,    (4.76)

so the projection operator has matrix representation

|e⟩⟨e| = (1/9)(2, 1, −2)^T (2  1  −2) = (1/9) ( 4  2  −4 ;  2  1  −2 ;  −4  −2  4 ).    (4.77)

Define the vector OS as the projection of OP onto L; OS will have coordinates

|OS⟩ = |e⟩⟨e| |OP⟩ = (1/9) ( 4  2  −4 ;  2  1  −2 ;  −4  −2  4 ) (3, −2, 1)^T = (1/9)(4, 2, −4)^T.    (4.78)

The length of this projected vector is

||OS|| = √⟨OS|OS⟩ = ( (1/81)(4  2  −4)(4, 2, −4)^T )^{1/2} = 2/3.    (4.79)

(b) The vector PA from the point P = (3, −2, 1) to an arbitrary point A on the line has components

PA = (x − 3, y + 2, z − 1)^T = (−2, 3, 1)^T + t (2, 1, −2)^T.    used Eq. (4.74)    (4.80)

We wish to find the special point B on the line L such that PB is orthogonal to L. In this case

0 = ⟨e|PB⟩ = (1/3)(2  1  −2) [ (−2, 3, 1)^T + t (2, 1, −2)^T ] = −1 + 3t  ⟹  t = 1/3.    (4.81)

The vector PB = (−4/3, 10/3, 1/3)^T.
(c) The distance from P to the line L is ||PB||:

||PB||² = (1/9)(−4  10  1)(−4, 10, 1)^T = 117/9,    ||PB|| = √117 / 3.    (4.82)

4.33 The first part of the problem is rather straightforward. (a) The involution property follows naturally from the unitary and hermitian properties: S2 = SS = S† S, −1

= S S, = 1.

used hermitian property used unitary property (4.83)

(b) I found this exercise more challenging than the others so I’ll try to explain the reasoning that lead me to the solution. By virtue of S being an endomorphism it is an element of a vector space and can therefore be written as the sum of two other elements of the vector space. The problem is to show that P+ and P− can have the stated properties while S maintains its properties. Starting with Eq. (4.83) we require

80

4 Operator Algebra

)2 ( 1 = S2 = P+ − P− , = (P+ )2 + (P− )2 − P+ P− − P− P+ , = P+ + P− − P+ P− − P− P+ , = P+ + P− − {P+ , P− }.

used P± idempotents used def. anticommutator (4.84)

Here we might guess (I admit it was just a guess, but a very fortunate one) that just maybe the anticommutator of P+ and P− vanishes. Let’s try it and see what happens! We immediately find an expression for P+ and P− : 1 = P+ + P−

guessed anticommutator vanishes

(4.85)

Adding Eq. (4.85) to the following equation, S = P+ − P− ,

supposition of problem

(4.86)

allows us to solve for P+ : P+ =

1 (S + 1). 2

(4.87)

Subtracting Eq. (4.86) from Eq. (4.85) we find P− =

1 (S − 1). 2

(4.88)

This looks promising, but let’s verify that our guess of vanishing anticommutator applies: 1 1 1 1 (S + 1) (S − 1) + (S − 1) (S + 1), 2 2 2 2 1 1 = (S2 − 1) + (S2 − 1), 4 4 = 0.

{P+ , P− } =

used S is involutive

(4.89) So fortunately our guess worked and all that remains is to verify that the assumed idempotents P+ and P− can also be hermitian while S keeps its required properties:

4.1 Problems

81

(P+ )† =

(

)†

1 (S + 1) 2

1 † (S + 1† ), 2 1 = (S + 1), 2 = P+ . =

used Proposition 1 &3 Eq. (4.28) used S and 1 hermitian, (4.90)

A similar argument shows that P− is also hermitian. In summary, we have shown that an arbitrary hermitian and unitary operator S admits the decomposition Eq.(4.86) where P+ and P− are hermitian idempotents (i.e. projection operators).

4.36 Show that any two equivalent representations of any algebra have the same kernel.

4.36 For an arbitrary algebra A, assume we are given two equivalent representations, ρ1 : A → End(V1 ) in vector space V1 and ρ2 : A → End(V2 ) in vector space V2 . Because they are equivalent, there must be an isomorphism T : V1 → V2 such that T ◦ ρ1 (a) = ρ2 (a) ◦ T, ∀a ∈ A.

Hassani Definition 4.5.6

(4.91)

Now suppose a ∈ ker(ρ1 ), i.e. ρ1 (a) = 0 is the zero endomorphism on V1 . Then a ∈ ker(ρ2 ) because, ρ2 (a) = T ◦ ρ1 (a) ◦ T−1 ,

used Eq. (4.91)

= 0,

(4.92) (4.93)

where 0 is the zero endomorphism on V2 . There are several steps to appreciate in passing from Eq. (4.92) to Eq. (4.93). Starting on the far RHS of Eq. (4.92), T−1 is guaranteed to exist because T is an isomorphism (a bijection) and it simply maps any vector in V2 to V1 . The vector in V1 then hits ρ1 (a) the zero endomorphism on V1 , giving the zero vector of V1 , which is finally mapped by T back to the zero vector on V2 . Hassani Theorem 2.3.11 assures us that the zero vector of one vector space is mapped to the zero of another by an injective linear transformation. This implies ker(ρ1 ) ⊂ ker(ρ2 ).

(4.94)

82

4 Operator Algebra

Interchanging the roles of ρ1 and ρ2 and repeating the above argument we obtain ker(ρ2 ) ⊂ ker(ρ1 ).

(4.95)

We conclude the sets are identical, ker(ρ1 ) = ker(ρ2 ).

(4.96)

4.2 Supplementary Problems 4.38 For D and T defined in Example 2.3.5 calculate the commutator of D3 T3 and T3 D3 , using your results from Problem 4.3 and Hassani Proposition 4.1.8 to reduce the answer to the forms below: [D3 T3 , T3 D3 ] = 9[D2 T2 , T3 D3 ], = 27(T2 DT2 D3 − T3 D2 TD2 ).

(4.97)

4.39 Prove Hassani Proposition 4.1.3: An endomorphism T ∈ End(V) is invertible iff it sends a basis of V onto another basis of V.

4.39 (Solution kindly provided by Sadri Hassani.) We work with basis B = n n {|ai ⟩}i=1 of V. First suppose that T is invertible. We have to show that {T|ai ⟩}i=1 is a basis. T is surjective. So, for any |a⟩ ∈ V (here V is the range), there is a |b⟩ ∈ V (here V is the domain) such that |a⟩ = T|b⟩. Thus, |a⟩ = T

( n ∑

) βi |ai ⟩ =

i=1

n ∑

βi T|ai ⟩,

i=1

n span where βi are components of |b⟩ in the basis B. This shows that {T|ai ⟩}i=1 n V. To show that {T|ai ⟩}i=1 are linearly independent, note that

|0⟩ =

n ∑ i=1

αi T|ai ⟩ = T

( n ∑ i=1

) αi |ai ⟩ .

4.2 Supplementary Problems

83

∑n This implies that i=1 αi |ai ⟩ = |0⟩ because T is linear and bijective (actually, n are linearly independent, we conclude injectivity is sufficient). Since {|ai ⟩}i=1 n is a basis. that αi = 0 for all i. This completes the proof that {T|ai ⟩}i=1 n Conversely assume that {T|ai ⟩}i=1 is a basis. By Hassani Proposition 2.3.14, it is sufficient to show that T is surjective. For any |a⟩ ∈ V, we can write by assumption, ( n ) n ∑ ∑ βi T|ai ⟩ = T βi |ai ⟩ . |a⟩ = i=1

i=1

∑n βi |ai ⟩ ∈ V Therefore, for any |a⟩ ∈ V (here V is the range) there is |b⟩ ≡ i=1 (here V is the domain) such that |a⟩ = T|b⟩. This shows that T is surjective. This completes the proof. We could also prove the converse using the injectivity of T. By Hassani zero vector is the only vector Theorem 2.3.11, it is sufficient to show that ∑the n αi |ai ⟩ be a vector in V. We that maps to the zero vector. Let |a⟩ = i=1 want to show that if T|a⟩ = | 0 ⟩ then |a⟩ must be the zero vector. This is straightforward, because ( | 0 ⟩ = T|a⟩ = T

n ∑ i=1

) αi |ai ⟩ =

n ∑

αi T|ai ⟩.

i=1

n is a basis and therefore a linearly Therefore, αi = 0 for all i because {T|ai ⟩}i=1 independent set of vectors.

4.40 Which step of the proof of Theorem 4.3.2 part 4 breaks down for an infinite dimensional vector space (see solution to Problem 4.18)? 4.41 Show that the exponential of the zero operator is the identity operator, e0 = 1. 4.42 Under what conditions is a linear combination of hermitian operators on a complex vector space also hermitian?

4.42 Consider a linear combination αA + βB of two arbitrary hermitian operators A, B ∈ End(V) with α, β ∈ C. If we insist that their linear combination is hermitian we find:

84

4 Operator Algebra

(αA + βB)† = α∗ A† + β ∗ B† ,

used Hassani Theorem 4.3.2

= α∗ A + β ∗ B,

used A, B hermitian

= αA + βB,

insist linear combination hermitian

0 = (α − α∗ )A + (β − β ∗ )B.

rearranged

(4.98) This indicates two options. Either the two endomorphisms are linearly dependent vectors and the coefficients α, β such that Eq. (4.98) is satisfied, or the two endomorphisms are linearly independent vectors and the coefficients α, β are real.

4.43 Prove that the identity operator 1 ∈ End(V) on some vector space V is both hermitian and unitary. 4.44 If an operator S ∈ End(V) is both hermitian and unitary then it is involutive (see Supplementary Problem 4.33). Is the inverse statement true, i.e. if S is involutive, is it necessarily both hermitian and unitary? Hint: consider the operator S represented by the matrix ) −1 −1 . S= 0 1 (

(4.99)

4.45 (a) Show that the identity operator 1 ∈ End(V) on some vector space V is a projection operator. What space does it project onto? (b) In general projection operators do not necessarily have an inverse. Find a projection operator that does have an inverse. (c) Prove that a projection operator P ∈ End(V) that projects onto a proper subspace of a vector space V is not invertible.

4.45 (a) Recall a projection operator is a hermitian idempotent, Hassani Definition 4.4.2. In Supplementary Problem 4.43 we claimed that the identity 1 ∈ End(V) is hermitian. Clearly the identity 1 is idempotent: 12 = 1. So that identity is a projection operator. Any | a ⟩ ∈ V is projected onto itself by 1| a ⟩ = | a ⟩,

∀| a ⟩ ∈ V.

(4.100)

So the identity is a rather uninteresting projection operator in the sense that it projects onto the entire space V. (b) Let P ∈ End(V) be an invertible projection operator, so that by supposition P−1 exists such that

4.2 Supplementary Problems

85

P−1 P = PP−1 = 1.

(4.101)

Because P is idempotent P2 = P, −1

definition of idempotent,

−1

P P = P P, 2

(P−1 P)P = 1,

supposed P is invertible, End (V) is an associative algebra,

P = 1.

(4.102)

Assuming only that the projection operator is invertible we are lead inexorably to the identity endomorphism. (c) The implication of the above two results is that any projection operator more interesting than the identity in that it projects onto a proper subspace of End(V) is not invertible.

4.46 In Hassani Sect. 4.4 we learned how hermitian idempotent endomorphisms project vectors onto subspaces. How do these projection operators, framed within the context of linear operators, fit into the more general consideration of projecting vectors onto subspaces? Consider the familiar problem of projecting the globe (a spherical idealization of the surface of the Earth) onto a plane cutting through its centre to form a world map, the famous map projection problem with a very long history. The points on the globe can be represented by vectors in R3 . Project these points onto the equatorial plane via the mapping F : R3 → R2 F(R, θ, φ) = (Rθ, φ).

(4.103)

Here (R, θ, φ) on the LHS of Eq. (4.103) are the spherical coordinates of a point on the globe with radial coordinate the Earth’s radius R, polar angle θ related to the latitude lat radians by θ = π2 − lat, and azimuthal angle φ simply the longitude. The coordinates (Rθ, φ) on the RHS of Eq. (4.103) are polar coordinates in the plane with radial coordinate r = Rθ the distance from the North Pole and φ still the longitude. (This map projection, the cartographers’ azimuthal equidistant map projection has several nice properties and was used by the author to provide a simple visualization of the twin paradox [18].) Is F a projection operator in the sense of Hassani Definition 4.4.2, a hermitian idempotent endomorphism? If not, why not? 4.47 Suppose that the vector space V admits the decomposition into the direct sum of subspaces V = U1 ⊕ . . . ⊕ Ur =

r . i=1

Ui .

(4.104)

86

4 Operator Algebra

Define P j as the operator that sends any vector | v ⟩ ∈ V to its component | v j ⟩ ∈ U j P j | v ⟩ = | v j ⟩.

(4.105)

Show that P j is a linear operator. 4.48 While a strictly positive endomorphism T is an automorphism (an invertible operator, see Hassani Theorem 4.3.10), the converse is not true. Find an example of an automorphism R ∈ End(V) on some vector space V that is not strictly positive.

4.48 Suppose V is R3 and R ∈ End(V) is the anticlockwise rotation about the z-axis by π/2 radians. For | a ⟩ a nonzero vector lying in the x-y plane, R| a ⟩ is perpendicular to | a ⟩ so that ⟨ a |R| a ⟩ = 0.

(4.106)

So R is not strictly positive according to Hassani Definition 4.3.9. However, R is invertible, the inverse being the clockwise rotation about the z-axis by π/2 radians. 4.49 Suppose that, for an n-dimensional vector space V over C, P ∈ End(V) is a projection operator and S ∈ End(V) is an involution operator. n for which P is represented by a matrix (a) Show that there exists a basis {ai }i=1 with either 1 or 0 on the diagonal and zeros off the diagonal. (b) Show that S = 2P − 1, establishes a one-to-one correspondance between all projection operators and involutions. (c) Show that any involution S can be represented by a matrix with either ±1 on the diagonal and zeros off the diagonal. 4.50 Suppose that G ∈ End(V) is a hermitian operator and ∈ ∈ R. Show that U∈ = 1 − iG∈ is unitary to first order in ∈. (This problem is from a well-known textbook on quantum mechanics [15, Eq. (3.1.11) and following text].) 4.51 Recall from Hassani Problem 3.1 that the product (x1 , x2 ) · (y1 , y2 ) = (x1 y1 − x2 y2 , x1 y2 + x2 y1 ),

(4.107)

turns R2 into a commutative, associative algebra that we shall call A. Consider the mapping ρ : A → EndR (R2 ) defined by ( ) x Tq | x ⟩ ≡ Tq 1 , x2 )( ) ( x1 q −q2 , = 1 q2 q1 x2

(4.108)

4.2 Supplementary Problems

87

where Tq ≡ ρ(q) ∈ EndR (R2 ) is written in the second line Eq. (4.108) as a matrix and where the elements of R2 are expressed as a column vector. Show that ρ so defined provides a real representation of A. Is the representation unique? Is it faithful?

4.51 We must show that the mapping ρ : A → EndR (R2 ) defined by Eq. (4.108) meets the two criteria of Hassani Definition 4.5.1. Following Hassani Example 4.5.3, this proceeds most efficiently working with a basis for A. Let e1 = (1, 0) and e2 = (0, 1) be the standard basis of A. Using Eq. (4.107) we find e1 · e1 = (1, 0) · (1, 0), = (1 × 1 − 0 × 0, 1 × 0 + 0 × 1), = (1, 0) = e1 , e1 · e2 = (1, 0) · (0, 1) = (0, 1) = e2 , = e2 · e 1 .

used commutative algebra (4.109)

So e1 = 1A , the identity of A. Furthermore, e2 · e2 = (0, 1) · (0, 1) = (−1, 0) = −e1 .

(4.110)

The first criterion of Hassani Definition 4.5.1 was that ρ(1A ) = 1, where 1 is the identity of EndR (R2 ). Using Eq. (4.108) we find ( ρ(e1 ) =

) 10 , 01

(4.111)

the identity matrix and the identity of EndR (R2 ), confirming the first criterion of Hassani Definition 4.5.1. The second criterion of Hassani Definition 4.5.1 was that ρ is a homomorphism, ρ(q1 q2 ) = ρ(q1 )ρ(q2 ) for all q1 , q2 ∈ A. Hassani Proposition 3.1.19 allows us to save some effort by only verifying that ρ(q1 q2 ) = ρ(q1 )ρ(q2 ) for all the basis vectors of A. Using Eq. (4.108) we find ρ(e1 e1 ) = ρ(e1 ), ( ) 10 = . 01 On the other hand,

used Eq. (4.109) used Eq. (4.111)

(4.112)

88

4 Operator Algebra

(

1 0 ( 1 = 0

ρ(e1 )ρ(e1 ) =

0 1

)(

) 0 , 1

) 10 , 01

= ρ(e1 e1 ).

used Eq. (4.111),

used Eq. (4.112)

(4.113)

Similarly we find ρ(e1 e2 ) = ρ(e2 ), ( ) 0 −1 = , 1 0 ( )( ) 10 0 −1 = , 01 1 0

used Eq. (4.109) used Eq. (4.108)

= ρ(e1 )(e2 ).

(4.114)

While A is commutative, EndR (R2 ) is not so we must also check that, ρ(e2 e1 ) = ρ(e2 ), ( ) 0 −1 = , 1 0 ( )( ) 0 −1 10 = , 1 0 01

used Eq. (4.109) used Eq. (4.108)

= ρ(e2 )(e1 ).

(4.115)

Finally the only other product of two basis vectors is ρ(e2 e2 ) = −ρ(e1 ), ( ) 10 =− . 01

used Eq. (4.110) used Eq. (4.111)

(4.116)

On the other hand (

)( ) 0 −1 0 −1 , 1 0 1 0 ( ) −1 0 = , 0 −1

ρ(e2 )ρ(e2 ) =

= ρ(e2 e2 ).

used Eq. (4.108),

used Eq. (4.116)

So indeed ρ is a homomorphism and a real representation of A.

(4.117)

4.2 Supplementary Problems

89

Note the representation Eq. (4.108) is not unique. Using ( ) x Tq | x ⟩ ≡ Tq 1 , x2 )( ) ( x1 q1 q2 , = −q2 q1 x2

(4.118)

also works. Yes, the representation Eq. (4.108) is faithful. q = (0, 0) is the ( Clearly ) 00 , i.e. the kernel of ρ is only vector mapped to the zero endomorphism 00 the zero vector. By Hassani Theorem 2.3.11, it must be injective. By Hassani Definition 4.5.1, this is a so-called faithful representation.

4.52 Spell out the reasoning behind the proof given for Hassani Proposition 4.5.2, which states that a nontrivial representation of a simple algebra is faithful. What would be a trivial representation? 4.53 Consider the two dimensional complex vector space V with standard orthonormal basis written as for the ket vectors representing the quantum mechanical spin 21 systems with states spin up, | + ⟩, and spin down, | − ⟩, ⟨ + | + ⟩ = 1,

⟨ − | − ⟩ = 1,

⟨ + | − ⟩ = 0.

(4.119)

(a) Show that the projection operators defined as in Hassani Sect. 4.4.1, P+ ≡ |+⟩⟨+|,

P− ≡ |−⟩⟨−|,

(4.120)

satisfy the completeness relation (Hassani Proposition 4.4.6). (b) The operators Sx , defined by Sx ≡

2

(|+⟩⟨−| + |−⟩⟨+|) ,

(4.121)

corresponds to a measurement of spin along the x-axis, using for example a SternGerlach device with magnetic field gradient oriented in the x-direction. The connection between eigenstates, operators and laboratory measurements is well-explained by Sakurai and Napolitano [15, Chap. 1]; Eq. (4.121) above corresponds to [15, Chap. 1, their Eq. (4.18a)]. Similarly measurements of spin along the y-axis and z-axis correspond to operators, Sy ≡

2

(i|−⟩⟨+| − i |+⟩⟨−|) ,

Sz ≡

2

(|+⟩⟨+| − |−⟩⟨−|) .

(4.122)

90

4 Operator Algebra

Show that the spin operators obey the commutation relations of the operators Lx , L y , Lz in Hassani problem 4.4, if we choose units such that = 1. (c) The operator defined by ( Dz (φ) ≡ exp

−iφSz

) ,

(4.123)

with φ ∈ R a real parameter, corresponds to a rotation of the physical system about the z-axis by an angle φ, see reference [15, Chap. 3, Eq. (2.3)]. That is, an arbitrary spin state | α ⟩ ∈ V is rotated by an angle φ about the z-axis, to obtain a new spin state, | β ⟩ = Dz (φ)| α ⟩.

(4.124)

To show that Dz (φ) really does correspond to such a rotation, the authors suggest calculating the expectation value of the x, y and z-direction spins for an arbitrary spin state | α ⟩ ∈ V after it has been rotated using Dz (φ). That is, one calculates, | ⟩ ⟨ β |Sx | β ⟩ = ⟨ α |Dz (φ)† Sx Dz (φ)| α ,

(4.125)

| ⟩ and similarly ⟨ β |S y | β and ⟨ β |Sz | β ⟩. Show that | ⟩ | ⟩ ⟨ α |Dz (φ)† Sx Dz (φ)| α = cos(φ) ⟨ α |Sx | α ⟩ − sin(φ) ⟨ α |S y | α , | ⟩ | ⟩ ⟨ α |Dz (φ)† S y Dz (φ)| α = sin(φ) ⟨ α |Sx | α ⟩ + cos(φ) ⟨ α |S y | α , | ⟩ ⟨ α |Dz (φ)† Sz Dz (φ)| α = ⟨ α |Sz | α ⟩ . (4.126) That is, the three expected values ⟨ α |Sx | α ⟩ , ⟨ α |Sx | α ⟩ , ⟨ α |Sx | α ⟩ transform like the three components of a cartesian vector (x, y, z) in Euclidean space under an active rotation about the z-axis, ⎛ ⎞⎛ ⎞ ⎛ ⎞ cos φ − sin φ 0 x x cos φ − y sin φ ⎝ sin φ cos φ 0⎠ ⎝ y ⎠ = ⎝ y cos φ + x sin φ⎠ . (4.127) 0 0 1 z z Hint: Between Hassani 4.2.5 and 4.2.6 you’ll find a “widely used formula”: eA Be−A = B + [A, B] +

1 1 [A, [A, B]] + [A, [A, [A, B]] + . . . 2! 3!

(4.128)

Use this, and the commutation relations found in (b) to show that ( ( ) ) 1 1 1 1 Dz (φ)† Sx Dz (φ) = Sx 1 − φ2 + φ4 . . . − S y φ − φ3 + φ5 . . . . 2! 3! 4! 5!

Then identify the Taylor Series of the trigonometric functions.

(4.129)

Chapter 5

Matrices

Abstract Matrices are extremely important. They are familiar from elementary linear algebra as a compact way of writing a system of linear equations. They appear in many areas of mathematical physics, for they can represent tensors such as the 3-dimensional deformation tensors of continuum mechanics and the 4-dimensional coordinate transformations and tensors of special and general relativity. They are also essential in the representation of groups, which play such an important role in the standard model of particle physics. And they play a fundamental role representing the linear operators of quantum mechanics since, as Hassani demonstrates in Sect. 5.1, there is a one-to-one correspondence between linear operators and matrices.

5.1 Problems

5.3 The linear operator A : R^3 → R^2 is given by

A (x, y, z)^T = (2x + y − 3z, x + y − z)^T.    (5.1)

Construct the matrix representing A in the standard bases of R^3 and R^2.

5.3 By inspection the matrix is

A = ( 2  1  −3 ;  1  1  −1 ).    (5.2)

We confirm that this matrix corresponds to the operator in Eq. (5.1) by performing the matrix multiplication:

( 2  1  −3 ;  1  1  −1 ) (x, y, z)^T = (2x + y − 3z, x + y − z)^T = A (x, y, z)^T.    (5.3)

Here there is no subtlety, in contrast to Hassani Example 5.1.5 where the operator is to be represented in non-standard bases.

5.6 Prove that for

M_{B_U B_V}(B ◦ A) = M_{B_U B_W}(B) M_{B_W B_V}(A),    Hassani Eq. (5.6)    (5.4)

to hold, we must have

( M_{B_U B_V}(B ◦ A) )_{kj} = Σ_{i=1}^{M} ( M_{B_U B_W}(B) )_{ki} ( M_{B_W B_V}(A) )_{ij}.    (5.5)

5.6 Here A : V_N → W_M is a linear transformation from the N-dimensional vector space V_N to the M-dimensional vector space W_M, and B : W_M → U_K is a linear transformation from W_M to the K-dimensional vector space U_K. The RHS of Eq. (5.4) is the product of matrix M_{B_U B_W}(B) with matrix M_{B_W B_V}(A). Equation (5.5) expresses this matrix product in component form, using a standard notation; (A)_{ij} is the component of matrix A found in the ith row and jth column. (You'll find the same notation in other physics textbooks; those authors differ on whether to use square or rounded parentheses, sometimes even within the same book [16].) M being the dimension of vector space W_M, the first matrix M_{B_U B_W}(B) has M columns and the second matrix M_{B_W B_V}(A) has M rows.

5.9 Show that an arbitrary orthogonal 2 × 2 matrix can be written in one of the following two forms:

( cos θ  −sin θ ;  sin θ  cos θ )    or    ( cos θ  sin θ ;  sin θ  −cos θ ).    (5.6)

The first is a pure rotation (its determinant is +1), and the second has determinant −1. The form of the choices is dictated by the assumption that the first entry of the matrix reduces to 1 when θ = 0.


5.9 Let's be clear about what is being asked for here. We are required to prove that there is nothing preventing us from writing any orthogonal 2 × 2 matrix in one of the two forms given in Eq. (5.6). More formally, given an orthogonal 2 × 2 matrix A, we are guaranteed the existence of a real parameter θ such that one of the two matrices in Eq. (5.6) equals A. The condition for a matrix to be orthogonal is that its transpose is also its inverse, i.e. AA^t = 1 = A^t A. So it is necessary for us to show that the two matrices of Eq. (5.6) are orthogonal, but that is not sufficient. We must also show that the two matrices of Eq. (5.6) are general enough to cover all the possibilities. So we must be careful to not impose unnecessary restrictions. For a real 2 × 2 matrix A = ( a  b ;  c  d ), orthogonality

(5.7)

directly gives us six relations among the four matrix elements a 2 + b2 = 1,

(5.8)

a + c = 1,

(5.9)

b + d = 1,

(5.10)

2

2

2

2

c + d = 1,

(5.11)

ac + bd = 0,

(5.12)

ab + cd = 0.

(5.13)

2

2

Clearly these are not all independent. Equation (5.8) tells us that −1 ≤ a ≤ 1 and −1 ≤ b ≤ 1. Similarly, Eq. (5.11) tells us that −1 ≤ c, d ≤ 1. There’s nothing to prevent us from using any number of functions of a single parameter, say θ ∈ R, that gives f (θ) = a in the range from −1 to +1, but we prefer a function for which f (0) = 1, as indicated in the statement of the problem. While the cosine function is a natural choice for f , another valid but nonstandard possibility would be a = f (θ) = 2e−θ − 1, 2

−∞ < θ < ∞.

(5.14)

√ 2 Then Eq. (5.8) immediately √ gives b = ± 1 − f and similarly Eq. (5.9) 2 immediately gives c = ± 1 − f . Equations (5.10) and (5.11) then require d = ±a. The last two equations impose the same requirement; when a = d then b = −c, as in the first matrix of Eq. (5.6), and when a = −d then b = c, as in the second matrix of Eq. (5.6). This reveals that there are two classes of orthogonal matrix. In fact these correspond to determinant +1 when the

94

5 Matrices

diagonal elements have the same sign, and determinant −1 when the diagonal elements have opposite sign. For the determinant +1 orthogonal matrix, choosing f (θ) = cos(θ) = a = d for the diagonal elements, we require off-diagonal elements b = −c = ± sin θ. There is no loss in generality in deciding that c is fixed at c = sin θ, for with −π < θ ≤ π we can obtain any −1 ≤ c ≤ 1. For the determinant −1 orthogonal matrix, choosing f (θ) = cos(θ) = a = −d, we require off-diagonal elements b = c = ± sin θ. Again, there is no loss in generality in deciding that c = d = sin θ, for with −π < θ ≤ π we can obtain any −1 ≤ c = d ≤ 1.

5.12 Construct the matrix representations of D : Pc4 [t] → Pc4 [t]

T : Pc3 [t] → Pc4 [t]

and

(5.15)

the derivative and multiplication-by-t operators. Choose {1, t, t 2 , t 3 } as your basis of Pc3 [t] and {1, t, t 2 , t 3 , t 4 } as your basis of Pc4 [t]. Use the matrix of D so obtained to find the first, second, third, fourth, and fifth derivatives of a general polynomial of degree 4. 5.12 Recall Hassani Box 5.1.1 gives the procedure to obtain the matrix A representing a linear map A ∈ L(V, W) from vector space V to W. For the derivative operator D we have the special case of W = V = Pc4 [t] and we are instructed to use the given basis, {| b j ⟩}5j=1 = {1, t, t 2 , t 3 , t 4 }. Let D = (di j ) be the matrix representing D so the first column has elements di1 with i = 1 . . . 5 representing the five rows of this column vector. To find di1 we apply D to the first basis vector | b1 ⟩ = 1 and express the result as a linear combination of the basis vectors: D| b1 ⟩ =

5 ∑

di1 | bi ⟩.

(5.16)

i=1

(Compare this with Hassani Eq. (5.3) and note I have interchanged the roles of i and j.) Recall that D was defined in Hassani Example 2.3.5 and its action on a polynomial is given above in Eq. (4.2). Substituting | b1 ⟩ = 1, we find

5.1 Problems

95

D| b1 ⟩ = D 1 = 0, 0=

5 ∑

used Eq. (4.2)

di1 | bi ⟩.

used Eq. (5.16)

i=1

= d11 + d21 t + d31 t 2 + d41 t 3 + d51 t 4 , =⇒ di1 = 0, i = 1, . . . , 5.

used basis (5.17)

The implication of the last line follows from the fact that any basis must be a linearly independent set (Hassani Definition 2.1.7). For the second column di2 we apply D to the second basis vector: D| b2 ⟩ = D t = 1, 1=

5 ∑

used Eq. (4.2)

di2 | bi ⟩,

used Hassani Eq. (5.3)

i=1

= d12 + d22 t + d32 t 2 + d41 t 3 + d52 t 4 , =⇒ d12 = 1, di2 = 0, i = 2, 3, 4, 5.

used basis (5.18)

Continuing in this way, for the third column we find D| b3 ⟩ = D t 2 = 2t =

5 ∑

di3 | bi ⟩,

i=1

=⇒ d23 = 2, d13 = d33 = d43 = d53 = 0.

(5.19)

For the last two columns we find D| b4 ⟩ = D t 3 = 3t 2 =

5 ∑

di4 | bi ⟩,

i=1

=⇒ d34 = 3, d14 = d24 = d44 = d54 = 0. D| b5 ⟩ = D t 4 = 4t 3 =

5 ∑

di5 | bi ⟩,

i=1

=⇒ di5 = 4δi,4 , i = 1, . . . , 5.

(5.20)

Assemble the columns in Eq. (5.17) through (5.20) to form the matrix D that represents the operator D in this basis:

96

5 Matrices



0 ⎜0 ( ) ⎜ D = di j = ⎜ ⎜0 ⎝0 0

1 0 0 0 0

0 2 0 0 0

0 0 3 0 0

⎞ 0 0⎟ ⎟ 0⎟ ⎟. 4⎠ 0

(5.21)

The multiplication-by-t linear map T acts on vectors in Pc3 , i.e. linear combinations of, for example, the basis | bk' ⟩ ∈ {1, t, t 2 , t 3 } and produces vectors in Pc4 . (We keep | b j ⟩ as the notation for the basis vectors of Pc4 and use | bk' ⟩ for basis vectors of Pc3 . Applying T separately to all four basis vectors | bk' ⟩, we find, T| b1' ⟩ = t · 1 = t = | b2 ⟩,

T| b2' ⟩ = t · t = t 2 = | b3 ⟩,

T| b3' ⟩ = t · t 2 = t 3 = | b4 ⟩,

T| b4' ⟩ = t · t 3 = t 4 = | b5 ⟩.

(5.22)

Assembling the four columns implicit in Eq. (5.22) gives the matrix:

0 ⎜1 ⎜ ( ) t jk = ⎜ ⎜0 ⎝0 0

0 0 1 0 0

0 0 0 1 0

⎞ 0 0⎟ ⎟ 0⎟ ⎟. 0⎠ 1

(5.23)

To obtain the nth derivative of a general fourth-degree polynomial, p(t) = c1 + c2 t + c3 t 2 + c4 t 3 + c5 t 4 , with ck ∈ C we write this polynomial as a vector using our basis {| bk ⟩}4k=1 p(t) = | a ⟩ =

5 ∑

ck | bk ⟩.

(5.24)

k=1

Applying the derivative operator n times, Dn | a ⟩ then gives the nth derivative of a general fourth-degree polynomial, p(t). Working with the matrix (d jk ) that we found above, we find ∑ dp(t) = D| a ⟩ = d jk ck , dt k=1 ⎛ ⎞⎛ ⎞ ⎛ ⎞ 01000 c2 c1 ⎜0 0 2 0 0⎟ ⎜c2 ⎟ ⎜2c3 ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎟⎜ ⎟ ⎜ ⎟ =⎜ ⎜0 0 0 3 0⎟ ⎜c3 ⎟ = ⎜3c4 ⎟ . ⎝0 0 0 0 4⎠ ⎝c4 ⎠ ⎝4c5 ⎠ c5 0 00000 5

(5.25)

5.1 Problems

97

Repeating the process, we find for the second derivative, d 2 p(t) = D2 | a ⟩ = dt 2 ⎛ 0100 ⎜0 0 2 0 ⎜ =⎜ ⎜0 0 0 3 ⎝0 0 0 0 0000

5 ∑

di j d jk ck ,

j=1,k=1

⎞⎛ 0 0 ⎜0 0⎟ ⎟⎜ ⎜ 0⎟ ⎟ ⎜0 4⎠ ⎝0 0 0

1 0 0 0 0

0 2 0 0 0

0 0 3 0 0

⎞ ⎞⎛ ⎞ ⎛ 2c3 0 c1 ⎟ ⎜ ⎟ ⎜ 0⎟ ⎟ ⎜c2 ⎟ ⎜ 6c4 ⎟ ⎜c3 ⎟ = ⎜12c5 ⎟ . 0⎟ ⎟ ⎟⎜ ⎟ ⎜ 4⎠ ⎝c4 ⎠ ⎝ 0 ⎠ c5 0 0

(5.26)

Repeating the process, we find for the third derivative, 5 ∑ d 3 p(t) 3 = D | a ⟩ = dhi di j d jk ck , dt 3 i=1, j=1,k=1 ⎛ ⎞⎛ ⎞⎛ 01000 01000 01 ⎜0 0 2 0 0⎟ ⎜0 0 2 0 0⎟ ⎜0 0 ⎜ ⎟⎜ ⎟⎜ ⎟⎜ ⎟⎜ =⎜ ⎜0 0 0 3 0⎟ ⎜0 0 0 3 0⎟ ⎜0 0 ⎝0 0 0 0 4⎠ ⎝0 0 0 0 4⎠ ⎝0 0 00000 00000 00

0 2 0 0 0

0 0 3 0 0

⎞ ⎞⎛ ⎞ ⎛ 6c4 0 c1 ⎟ ⎜ ⎟ ⎜ 0⎟ ⎟ ⎜c2 ⎟ ⎜24c5 ⎟ ⎟ ⎟ ⎜ ⎜ 0⎟ ⎟ ⎜c3 ⎟ = ⎜ 0 ⎟ . 4⎠ ⎝c4 ⎠ ⎝ 0 ⎠ c5 0 0 (5.27)

For the fourth derivative, 5 ∑ d 4 p(t) 4 = D | a ⟩ = dgh dhi di j d jk ck , dt 4 h=1,i=1, j=1,k=1 ⎛ ⎞ ⎞4 ⎛ ⎞ ⎛ c1 01000 24c5 ⎜0 0 2 0 0⎟ ⎜c2 ⎟ ⎜ 0 ⎟ ⎜ ⎟ ⎟ ⎜ ⎟ ⎜ ⎟ ⎟ ⎜ ⎟ ⎜ =⎜ ⎜0 0 0 3 0⎟ ⎜c3 ⎟ = ⎜ 0 ⎟ . ⎝0 0 0 0 4⎠ ⎝c4 ⎠ ⎝ 0 ⎠ c5 0 00000

Finally for the fifth derivative,

(5.28)

98

5 Matrices 5 ∑ d 5 p(t) 5 = D | a ⟩ = d f g dgh dhi di j d jk ck , dt 5 g,h,i, j,k=1 ⎛ ⎞5 ⎛ ⎞ ⎛ ⎞ c1 01000 0 ⎜0 0 2 0 0⎟ ⎜c2 ⎟ ⎜0⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎟ ⎜ ⎟ ⎜ ⎟ =⎜ ⎜0 0 0 3 0⎟ ⎜c3 ⎟ = ⎜0⎟ . ⎝0 0 0 0 4⎠ ⎝c4 ⎠ ⎝0⎠ c5 00000 0

(5.29)

5.15 If the matrix representation of an endomorphism T of C3 with respect to the standard basis is ⎛ ⎞ 0 1 1 ⎝ 1 0 −1⎠ (5.30) −1 −1 0 what is the representation of T with respect to the basis ⎧ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎫ 1 −1 ⎬ ⎨ 0 ⎝ 1 ⎠ , ⎝−1⎠ , ⎝ 1 ⎠ ? ⎩ ⎭ −1 1 0

(5.31)

5.15 We seek the matrix T' that represent the linear map T with respect to the new basis given in Eq. (5.31). This is obtained from a so-called similarity transformation T' = RTR−1 ,

cf. Hassani Eq. (5.19)

(5.32)

where T is the matrix representation of T in the standard basis, and R is the “basis transformation matrix”, i.e. the matrix that transforms vectors from their old representation v to their new representation v ' , v ' = Rv.

cf. to Hassani Eq. (5.17)

(5.33)

How do we find the matrix R? Note that our non-standard basis in Eq. (5.31) is not orthogonal, nor is it normalized. So the instructions in Hassani Box 5.4.1 do not apply and we cannot use Hassani Eq. (5.21)! In fact, the information we are given in Eq. (5.31) lets us easily find the inverse of basis transformation matrix, R−1 , for we are given the linear combinations of the old basis, say | bk ⟩, that gives the new basis kets | b'j ⟩. To facilitate comparison with Hassani

5.1 Problems

99

Eq. (5.16) we write Eq. (5.31) in the form ⎞ ⎛ ⎞⎛ ⎞ 0 1 −1 | b1' ⟩ | b1 ⟩ ⎝| b2' ⟩⎠ = ⎝ 1 −1 1 ⎠ ⎝| b2 ⟩⎠ . −1 1 0 | b3' ⟩ | b3 ⟩ ⎛

(5.34)

We note that the roles of old and new basis are inverted so that the 3 × 3 matrix in Eq. (5.34) is (Rt )−1 . (Why the transpose? Look again at Hassani Eq. (5.16), the first “row” of the matrix is written ρ11 ρ21 . . . etc. The first index is the row index and the second is the column index. It’s not a typo – that’s his Rt matrix!) Elementary manipulations (see Hassani Example 5.5.9 and Problem 5.30 if you are unfamiliar with finding the matrix inverse) reveal that ⎛ ⎞ 110 (5.35) Rt = ⎝ 1 1 1 ⎠ . 011 (You should verify that applying Rt to the new basis vectors of Eq. (5.31) reproduces the standard basis. For example, the first row of Rt gives 1 · | b1' ⟩ + 1 · | b2' ⟩ + 0 · | b3' ⟩ = (0, 1, −1)t + (1, −1, 1)t = (1, 0, 0)t , the first standard basis vector.) Note that in this problem things simplify because Rt = R so we can immediately write, ⎛

⎞⎛ ⎞⎛ ⎞ 110 0 1 1 0 1 −1 T' = RTR−1 = ⎝1 1 1⎠ ⎝ 1 0 −1⎠ ⎝ 1 −1 1 ⎠ , 011 −1 −1 0 −1 1 0 ⎛ ⎞ 10 0 = ⎝0 0 0 ⎠ . 0 0 −1

5.18 Show number α.

(5.36)

that det(αA) = α N det A for an N × N matrix A and a complex

5.18 The desired result follows directly from Hassani Theorem 5.5.1 det A =

∑ π

∈π

N ∏ j=1

(A)π( j) j ,

(5.37)

100

5 Matrices

where π is a permutation of the row index and (A)π( j) j is the matrix element at column j and row π( j). Recall ∈σ from Hassani Definition 2.6.2 as the sign of the permutation σ; ∈σ = +1 when σ is even and ∈σ = −1 when σ is odd. How do we know if a permutation is odd or even? Any permutation can be constructed from pair-wise interchanges—pick two elements and interchange them. An even permutation is obtained by an even number of pair-wise interchanges, an odd-permutation is obtained by an odd number of pair-wise interchanges with the starting arrangement being the ordered set 1 through N . That this is independent of how you went about constructing the permutation from pair-wise interchanges is an important and not obvious result in group theory. To see this in more detail, recall that multiplying a matrix A by a scalar α ∈ C means that we multiply every element Ai j by α, so αA = (αAi j ). Substituting this in Eq. (5.37) gives det A =



∈π

π

=



N ∏

(A)π( j) j ,

j=1

∈π α

π

N

N ∏

(A)π( j) j = α N det(A).

(5.38)

j=1

While that’s a complete answer to this problem, it’s perhaps worth adding a couple remarks. The formula Eq. (5.37) of Hassani Theorem 5.5.1 is called “Leibniz rule”, not to be confused with Leibniz’s other rules, including the one for differentiation of products, or the derivative of an integral with variable limits. Furthermore, the formula Eq. (5.37) might appear more daunting that it should. What’s the funny π( j ) doing in place of the row index i?! It simply means that we have chosen one of the rows from a permutation of rows, in particular the jth element of the permutation called π. A little experience with elementary group theory and permutations would go a long way to deciphering this formula. Reference [1], which is an excellent mathematical physics textbook written in the same spirit as Hassani but at a more introductory level, expands Eq. (5.38) for N = 1, 2, and 3 allowing the reader to see how the familiar formula for a determinant is obtained.

5.21 Let A be any N × N matrix. Replace its ith row (column) with any of its other rows (columns), leaving the latter unchanged. Now expand the determinant of the new matrix by its ith row (column) to show that N ∑ j=1

(A) ji (cof A) jk = 0 =

N ∑ (A)i j (cof A)k j j=1

k /= i.

(5.39)

5.1 Problems

101

5.21 Let k /= i be the row that replaced row i of A in forming the new matrix, say B, i.e. ⎧ (B)lj =

(A)lj when l /= i, (A)k j when l = i.

(5.40)

As instructed, we expand the determinant of B about its ith row using

det B = Σ_{j=1}^{N} (B)_{ij} (cof B)_{ij}.    Hassani Eq. (5.27)    (5.41)

Now we observe that det B = 0 because the ith and kth rows are identical and therefore linearly dependent so that condition 4 of Hassani Theorem 5.5.2 applies. Furthermore, we can write the RHS of Eq. (5.41) in terms of A as follows, det B = 0, =

used Hassani Theorem 5.5.2,

N ∑

(A)k j (cof A)i j .

used Eq. (5.40)

(5.42)

j=1

That (cof B)i j = (cof A)i j follows from the fact that by construction the ith row is the only row that differs between A and B while the ith row of B is not used in calculating cofactor (B)i j . Recall (cof A)i j = (−1)i+ j Mi j ,

(5.43)

where Mi j is the so-called minor, i.e. the determinant of the N − 1 by N − 1 matrix with row i and column j of A removed. This completes the demonstration for the first equality in Eq. (5.39). The fact that det A = det At , by Hassani Theorem 5.5.1, allows us to interchange the roles of rows and columns in the above argument, giving the second equality.

5.24 Show explicitly that det(AB) = det(A) det(B) for 2 × 2 matrices.

102

5 Matrices

5.24 This is a straightforward calculation. Writing out the elements of our matrices we have )( )⎫ ⎧( b11 b12 a11 a12 , det(AB) = det a21 a22 b21 b22 ) ( (a11 b11 + a12 b21 ) (a11 b12 + a12 b22 ) , = det (a21 b11 + a22 b21 ) (a21 b12 + a22 b22 )    a b11 a22 b = a11 b11 a22 b22 + a12 b21 a21 b12 +  a11 21 b12 + a12 b21 22     a − a21 b11 a12 b22 − a22 b21 a11 b12 −  b11 b − a b a b a21 11 12 22 21 12  22  (5.44) The parentheses around individual elements in Eq. (5.44) are not strictly needed but added to distinguish the two different columns. Equation (5.44) can be compared to det(A) det(B) = (a11 a22 − a21 a12 )(b11 b22 − b21 b12 ) = a11 a22 b11 b22 + a21 a12 b21 b12 − a21 a12 b11 b22 − a11 a22 b21 b12 .

(5.45)

The terms in Eqs. (5.44) and (5.45) are in the same order but the factors need to be rearranged to see that they agree.

5.27 Find the matrix that transforms the standard basis of C3 to the vectors ⎛ 1 ⎞ ⎛ ⎞ ⎛ −i ⎞ √ √ 0 2 2 −2 ⎟ ⎜ √1 ⎟ ⎜√ ⎜ √i ⎟ | a3 ⟩ = ⎝ 6 ⎠ . | a2 ⟩ = ⎝ 6 ⎠ , (5.46) | a1 ⟩ = ⎝ 6 ⎠ , 1+i √ 6

−1+i √ 6

1+i √ 6

Show that this matrix is unitary. ⎛ ⎞ 1 5.27 Writing the standard basis as three column vectors, | e1 ⟩ = ⎝0⎠ , | e2 ⟩ = 0 ⎛ ⎞ ⎛ ⎞ 0 0 ⎝1⎠ , | e3 ⟩ = ⎝0⎠ organized into a matrix, gives the identity matrix. We 1 0 seek a transformation matrix, R, that multiplied by the identity matrix gives three basis vectors written as column vectors. (This is the basis transformation matrix, the R of the similarity transformation in Hassani Eq. (5.19).) Obviously the matrix is

5.1 Problems

103

⎛ R=



−i √ √1 0 ⎜ √12 √i 2 √ −2 ⎟ . ⎝ 6 6 6⎠ 1+i 1+i √ −1+i √ √ 6 6 6

(5.47)

To show that R is unitary, we must show that its hermitian adjoint R† is its inverse. ⎛ 1 −i ⎞ ⎛ 1 1 1−i ⎞ ⎛ ⎞ √ √ √ √ √ 0 100 2 6 6 2 2 ⎜ ⎟ ⎟ ⎜ −i −1−i −2 √i √ √ ⎠ = ⎝0 1 0 ⎠ . (5.48) RR† = ⎝ √16 √i 6 √ 6 6 6 ⎠⎝ 2 1−i −1+i 1+i −2 1+i 001 √ √ √ √ √ 0 6 6 6 6 6 Note that we could have approached this problem slightly differently. Let A be an operator that transforms the standard basis to the given vectors, i.e. A| e1 ⟩ = | a1 ⟩, A| e2 ⟩ = | a2 ⟩ and A| e3 ⟩ = | a3 ⟩. Find the matrix A representing A in the standard basis. Then, applying the procedure of Hassani Box.5.1.1, we arrive at exactly A = R as in Eq. (5.47).

5.30 Find the inverse of the following matrices if they exist:

$$A = \begin{pmatrix} 3 & -1 & 2 \\ 1 & 0 & -3 \\ -2 & 1 & -1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 & -1 \\ 1 & 2 & 0 \\ -1 & -2 & 1 \end{pmatrix}, \tag{5.49}$$
$$C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -1 \end{pmatrix}. \tag{5.50}$$

5.30 We should first check whether the matrix has an inverse. Recall a matrix is invertible if and only if its determinant is nonzero (Hassani Theorem 5.5.4). Laplace's rule, Hassani Eq. (5.27), is the most convenient formula for this. Let's expand along the second column so we pass through the zero in the middle:

$$\det A = \det\begin{pmatrix} 3 & -1 & 2 \\ 1 & 0 & -3 \\ -2 & 1 & -1 \end{pmatrix} = -1(-1)^{1+2}\det\begin{pmatrix} 1 & -3 \\ -2 & -1 \end{pmatrix} + 0(-1)^{2+2}\det\begin{pmatrix} 3 & 2 \\ -2 & -1 \end{pmatrix} + 1(-1)^{3+2}\det\begin{pmatrix} 3 & 2 \\ 1 & -3 \end{pmatrix}$$
$$= (+1)(-1-6) + 0 + (-1)(-9-2) = 4 \neq 0. \tag{5.51}$$


We conclude that this matrix definitely has an inverse, A⁻¹. We can find this inverse matrix using the procedure of Hassani Example 5.5.9. Recall we write the matrix we want to invert on the left of a vertical line and the identity on the right. Then we perform so-called "elementary row operations" on each row to reduce the matrix on the left to the identity. Once achieved, the matrix on the right will be the inverse we seek. Each operation on the left is repeated on the right. In their most general form, the elementary row operations consist of replacing a row by a linear combination of itself with another row, what Hassani indicates by α(i) + β(j). This accounts for all three types of elementary row operations in Hassani Definition 5.5.6. For example, we can interchange rows 1 and 2 by simultaneously performing 0(1) + 1(2) on row 1 and 1(1) + 0(2) on row 2, which we abbreviate by (1) ↔ (2). Here we go:

$$\left(\begin{array}{ccc|ccc} 3 & -1 & 2 & 1 & 0 & 0 \\ 1 & 0 & -3 & 0 & 1 & 0 \\ -2 & 1 & -1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{(1)\leftrightarrow(2)} \left(\begin{array}{ccc|ccc} 1 & 0 & -3 & 0 & 1 & 0 \\ 3 & -1 & 2 & 1 & 0 & 0 \\ -2 & 1 & -1 & 0 & 0 & 1 \end{array}\right) \xrightarrow[2(1)+(3)]{-3(1)+(2)} \left(\begin{array}{ccc|ccc} 1 & 0 & -3 & 0 & 1 & 0 \\ 0 & -1 & 11 & 1 & -3 & 0 \\ 0 & 1 & -7 & 0 & 2 & 1 \end{array}\right)$$
$$\xrightarrow[\frac14(2)+\frac14(3)]{(2)\leftrightarrow(3)} \left(\begin{array}{ccc|ccc} 1 & 0 & -3 & 0 & 1 & 0 \\ 0 & 1 & -7 & 0 & 2 & 1 \\ 0 & 0 & 1 & \tfrac14 & -\tfrac14 & \tfrac14 \end{array}\right) \xrightarrow[7(3)+(2)]{3(3)+(1)} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac34 & \tfrac14 & \tfrac34 \\ 0 & 1 & 0 & \tfrac74 & \tfrac14 & \tfrac{11}{4} \\ 0 & 0 & 1 & \tfrac14 & -\tfrac14 & \tfrac14 \end{array}\right) \tag{5.52}$$

The matrix we find on the right above is the inverse of the matrix we started with on the left:

$$A^{-1} = \begin{pmatrix} \tfrac34 & \tfrac14 & \tfrac34 \\ \tfrac74 & \tfrac14 & \tfrac{11}{4} \\ \tfrac14 & -\tfrac14 & \tfrac14 \end{pmatrix}. \tag{5.53}$$

For matrix B we choose the first column to find the determinant:

$$\det B = \det\begin{pmatrix} 0 & 1 & -1 \\ 1 & 2 & 0 \\ -1 & -2 & 1 \end{pmatrix} = 0 + 1(-1)^{2+1}\det\begin{pmatrix} 1 & -1 \\ -2 & 1 \end{pmatrix} + (-1)(-1)^{3+1}\det\begin{pmatrix} 1 & -1 \\ 2 & 0 \end{pmatrix}$$
$$= 0 + (-1)(1-2) - 1(0-(-2)) = -1 \neq 0. \tag{5.54}$$

We find the inverse matrix using the same procedure.


$$\left(\begin{array}{ccc|ccc} 0 & 1 & -1 & 1 & 0 & 0 \\ 1 & 2 & 0 & 0 & 1 & 0 \\ -1 & -2 & 1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{(1)\leftrightarrow(2)} \left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 0 & 1 & 0 \\ 0 & 1 & -1 & 1 & 0 & 0 \\ -1 & -2 & 1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{(1)+(3)} \left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 0 & 1 & 0 \\ 0 & 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right)$$
$$\xrightarrow{-2(2)+(1)} \left(\begin{array}{ccc|ccc} 1 & 0 & 2 & -2 & 1 & 0 \\ 0 & 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right) \xrightarrow{-2(3)+(1)} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -1 & -2 \\ 0 & 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right) \xrightarrow{(3)+(2)} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -1 & -2 \\ 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right). \tag{5.55}$$

We have found the inverse matrix

$$B^{-1} = \begin{pmatrix} -2 & -1 & -2 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}. \tag{5.56}$$

For matrix C let’s choose the first column, i = 1, to find the determinant: ⎛ ⎞ 10 1 det C = det ⎝0 1 0 ⎠ , 1 0 −1 ( ) ( ) 1 0 01 1+1 3+1 = 1(−1) det + 0 + 1(−1) det , 0 −1 10 = (−1)2 (−1 − 0) + 0 + 1(−1)4 (0 − 1) = −2 /= 0.

(5.57)

We conclude that C−1 exists. We find this inverse matrix using the procedure of Hassani Example 5.5.9 again. | | ⎛ ⎞ ⎞ 1 0 1 || 1 0 0 1 0 1 || 1 0 0 ⎝0 1 0 | 0 1 0⎠ → → ⎝ 0 1 0 || 0 1 0 ⎠ → | 1 1 0 0 1 | 21 0 − 21 1 0 −1 | 0 0 1 (1) − (3) 2 | 1 21 ⎞ ⎛ (1) − (3) 1 0 0 || 2 0 2 → → ⎝ 0 1 0 || 0 1 0 ⎠ . (5.58) 0 0 1 | 21 0 − 21 ⎛

106

5 Matrices

We have found the inverse matrix ⎛1 C−1

⎞ 0 21 = ⎝0 1 0 ⎠. 1 0 − 21 2 2

(5.59)
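The NumPy check below is not part of the original solution; it simply verifies the three hand-computed inverses of Eqs. (5.53), (5.56) and (5.59).

```python
import numpy as np

# Not from the text: verify the inverses found above.
A = np.array([[3, -1, 2], [1, 0, -3], [-2, 1, -1]])
B = np.array([[0, 1, -1], [1, 2, 0], [-1, -2, 1]])
C = np.array([[1, 0, 1], [0, 1, 0], [1, 0, -1]])
A_inv = np.array([[3, 1, 3], [7, 1, 11], [1, -1, 1]]) / 4
B_inv = np.array([[-2, -1, -2], [1, 1, 1], [0, 1, 1]])
C_inv = np.array([[1, 0, 1], [0, 2, 0], [1, 0, -1]]) / 2
for M, M_inv in [(A, A_inv), (B, B_inv), (C, C_inv)]:
    assert np.allclose(M @ M_inv, np.eye(3))
print("all three hand-computed inverses check out")
```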

5.33 For which values of α are the following matrices invertible? Find the inverses whenever they exist.

$$A = \begin{pmatrix} 1 & \alpha & 0 \\ \alpha & 1 & \alpha \\ 0 & \alpha & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} \alpha & 1 & 0 \\ 1 & \alpha & 1 \\ 0 & 1 & \alpha \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 1 & \alpha \\ 1 & \alpha & 0 \\ \alpha & 0 & 1 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & \alpha \\ 1 & \alpha & 1 \end{pmatrix}. \tag{5.60}$$

5.33 Remember the powerful idea expressed in Hassani Theorem 5.5.4 that the inverse of a matrix exists if and only if the determinant is nonzero. So we just need to calculate the determinant and set it not equal to zero to determine the conditions on the parameter α such that the inverse exists. (Herein we assume you can easily find the determinant. See Problem 5.30 for help with this.) Starting with A we find

$$\det A = 1\cdot(1-\alpha^2) - \alpha\cdot\alpha = 1 - 2\alpha^2 = (1 - \sqrt{2}\,\alpha)(1 + \sqrt{2}\,\alpha). \tag{5.61}$$

The factorization in Eq. (5.61) allows us to read off the solution; det A = 0 if α = ±1/√2. Hassani Theorem 5.5.4 implies that A⁻¹ exists iff α ≠ ±1/√2. To actually find the inverse we again use the method of Hassani Example 5.5.9; see Problem 5.30 for detailed examples. The only new thing here is that instead of a purely numerical calculation, we have a single parameter α. But that doesn't change the nature of the algebraic manipulations involved. We give the result here for comparison,

$$A^{-1} = \frac{1}{2\alpha^2 - 1}\begin{pmatrix} \alpha^2 - 1 & \alpha & -\alpha^2 \\ \alpha & -1 & \alpha \\ -\alpha^2 & \alpha & \alpha^2 - 1 \end{pmatrix}. \tag{5.62}$$


We can check that Eq. (5.62) makes sense by multiplying it by A and recovering the identity matrix. Also note that it is clear from Eq. (5.62) why an inverse does not exist when α = ±1/√2. For B we find

$$\det B = \alpha(\alpha^2 - 2) = \alpha(\alpha - \sqrt{2})(\alpha + \sqrt{2}), \tag{5.63}$$

from which one concludes det B = 0 if α = 0 or α = ±√2. Hassani Theorem 5.5.4 implies B is invertible iff α ≠ 0 and α ≠ ±√2. Solving for this inverse we find,

$$B^{-1} = \frac{1}{\alpha^2 - 2}\begin{pmatrix} \frac{\alpha^2-1}{\alpha} & -1 & \frac{1}{\alpha} \\ -1 & \alpha & -1 \\ \frac{1}{\alpha} & -1 & \frac{\alpha^2-1}{\alpha} \end{pmatrix}. \tag{5.64}$$

Note the form is consistent with our requirement that B is invertible iff α ≠ 0 and α ≠ ±√2, for only then does Eq. (5.64) avoid infinite matrix elements. For C we find

$$\det C = -(\alpha^3 + 1). \tag{5.65}$$

This result implies det C = 0 if α³ = −1. Some attention to detail is required here. Suppose we were working in the reals; that is, our physical problem required that the elements of the matrix satisfy (C)ᵢⱼ ∈ ℝ. In this case there is a unique solution, α = −1. However, if (C)ᵢⱼ ∈ ℂ then we expect three solutions to a complex cubic equation (by the fundamental theorem of algebra, Hassani Theorem 10.5.6). Is there a systematic way to find the other two roots? Yes, thankfully there is. Probably the fastest solution is to note that the absolute value is unity so we seek complex numbers of the form

$$(e^{i\theta})^3 = e^{i3\theta} = -1 \implies 3\theta = \pi + 2\pi k, \qquad k = 1, 0, -1. \tag{5.66}$$

The second line used Euler's formula,

$$e^{i\theta} = \cos\theta + i\sin\theta, \tag{5.67}$$

and in particular $e^{i\pi} = -1$. Furthermore, because the trigonometric functions are periodic with period 2π we can also include the solutions with k = ±1, giving the three solutions:


$$\theta_1 = \pi, \qquad \theta_2 = \frac{\pi}{3}, \qquad \theta_3 = -\frac{\pi}{3} \implies \alpha_1 = -1, \qquad \alpha_2 = \frac{1}{2} + i\frac{\sqrt{3}}{2}, \qquad \alpha_3 = \frac{1}{2} - i\frac{\sqrt{3}}{2}, \tag{5.68}$$

where again we used Euler’s formula. (This tidy solution was given to me by Sadri Hassani.) Perhaps it’s instructive to look at this problem another way. A cubic equation, like our equation for zero determinant, (α3 + 1) = 0, can be factored into three factors (α3 + 1) = (α − α1 )(α − α2 )(α − α3 ), where αi are three complex roots (again, see Hassani Theorem 10.5.6). We have already found one root, say α1 = −1. This gives us an equation to find the others: (α3 + 1) = (α + 1)(α − α2 )(α − α3 ), = α3 + (1 − α2 − α3 )α2 + (α2 α3 − α2 − α3 )α + α2 α3 . (5.69) Comparing the LHS and RHS of Eq. (5.69) we conclude 0 = (1 − α2 − α3 ), 0 = (α2 α3 − α2 − α3 ), 1 = α2 α3 .

(5.70)

These equations lead to a quadratic α2 − α + 1 = 0,

(5.71)

which gives the other two roots √ 1+i 3 α2 = , 2

√ 1−i 3 α3 = . 2

Hassani Theorem 5.5.4 implies that C−1 exists iff α / = −1 and α /= Solving for this inverse we find, C−1

⎛ ⎞ −α 1 α2 1 ⎝ 1 α2 −α⎠ . = 3 α + 1 α2 −α 1

(5.72) √ 1±i 3 . 2

(5.73)

Note the form is consistent with our requirement that C is invertible. For D we find

$$\det D = -\alpha^2 + 2\alpha - 1 = -(\alpha - 1)^2. \tag{5.74}$$

This result implies det D = 0 iff α = 1, so D is invertible when α ≠ 1. Consistent with this we find the inverse is

$$D^{-1} = \frac{1}{\alpha - 1}\begin{pmatrix} \alpha + 1 & -1 & -1 \\ -1 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix}. \tag{5.75}$$
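A symbolic check of this kind of α-dependent result is easy to automate; the SymPy sketch below is not part of the original solution and only verifies Eq. (5.62), but the same pattern works for B, C and D.

```python
import sympy as sp

# Not from the text: symbolic check of Eq. (5.62) with SymPy.
a = sp.symbols('alpha')
A = sp.Matrix([[1, a, 0], [a, 1, a], [0, a, 1]])
A_inv_claimed = sp.Matrix([[a**2 - 1, a, -a**2],
                           [a, -1, a],
                           [-a**2, a, a**2 - 1]]) / (2*a**2 - 1)
assert sp.simplify(A * A_inv_claimed - sp.eye(3)) == sp.zeros(3, 3)
print("Eq. (5.62) times A simplifies to the identity")
```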

5.36 Use determinants to show that an antisymmetric matrix whose dimension is odd cannot have an inverse.

5.36 Admittedly it is not obvious where to start, though with some trial and error and especially persistence one can eventually stumble upon a solution. In fact Sadri Hassani gave me the following argument because the one I first stumbled upon was less succinct. Note that Leibniz's rule Eq. (5.37) for calculating the determinant, as stated in Hassani Theorem 5.5.1, implies that

$$\det A = \det(A^t) \qquad \text{Hassani Theorem 5.5.1}$$
$$= \det(-A) \qquad \text{A is antisymmetric}$$
$$= (-1)^N \det A, \qquad \text{used Problem 5.18} \tag{5.76}$$

where N × N is the size of the matrix. Now we see why the parity matters. For N odd, Eq. (5.76) can only be satisfied with det A = 0, which, together with Hassani Theorem 5.5.4, gives our desired result. The argument above is not difficult, many will wonder how one finds such an argument. Here are some reflections on this process. From the problem statement, we knew we had to use the fact that the matrix in question was antisymmetric. The defining characteristic of an antisymmetric matrix is that it equals the negative of its transpose, i.e. its elements are related by (A)i j = −(A) ji . Furthermore, since we are looking for proof that the inverse does not exist, it is clear that we are looking for properties of the determinant. So we should reflect on connections between the transpose and the determinant, which leads us to recall that det A = det At . Surely this must enter into the solution. How? Well, you have to play around with these ideas and search. Personally I wasted considerable time looking for reasons terms should cancel before I stumbled upon the simple argument above. For that reason I gave this the rare status of being a “difficult problem” in my Table of Solutions, even though the solution is actually quite simple …once you find it!
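The NumPy sketch below is not part of the original argument; it simply illustrates the conclusion by showing that randomly generated antisymmetric matrices of odd dimension have (numerically) vanishing determinant.

```python
import numpy as np

# Not from the text: odd-dimensional antisymmetric matrices are singular.
rng = np.random.default_rng(1)
for n in (3, 5, 7):
    M = rng.standard_normal((n, n))
    A = M - M.T                    # antisymmetric by construction
    print(n, np.linalg.det(A))     # ~0 up to round-off for odd n
```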

5.39 Show that if two invertible N × N matrices A and B anticommute (that is, AB + BA = 0), then (a) N must be even, and (b) trA = trB = 0.


5.39 (a) By Hassani Theorem 5.5.4, for matrices A and B to be invertible they both must have nonzero determinants,

$$\det A \neq 0, \qquad \det B \neq 0. \tag{5.77}$$

Furthermore, because they anticommute,

$$AB = -BA. \tag{5.78}$$

Let's calculate the determinant of both sides of Eq. (5.78):

$$\det(AB) = \det(-BA),$$
$$\det(A)\det(B) = \det(-B)\det(A), \qquad \text{used Hassani Theorem 2.6.11}$$
$$\det(A)\det(B) = (-1)^N \det(B)\det(A), \qquad \text{used Eq. (5.38)} \tag{5.79}$$

where N is the dimension of A and B. On the RHS in Eq. (5.79) we used the fact that each of the N rows of −B is the corresponding row of B multiplied by −1. There are two cases. If N is odd, then Eq. (5.79) implies

$$\det(A)\det(B) = -\det(B)\det(A), \tag{5.80}$$

which implies that either det(A) = 0 or det(B) = 0 (or both), either of which would contradict Eq. (5.77). Hence N is not odd. When N is even Eq. (5.79) becomes trivial, resulting in no contradiction. We conclude N is even.
(b) On the other hand, tr(−B) = −tr(B) because

$$\mathrm{tr}(-B) = \sum_{k=1}^{N}(-B)_{kk} = -\sum_{k=1}^{N}(B)_{kk} = -\mathrm{tr}(B). \tag{5.81}$$

Let's multiply Eq. (5.78) on the left by A⁻¹ (remember, we assumed both A and B are invertible, so this is definitely allowed) and find the trace,

$$A^{-1}AB = -A^{-1}BA, \qquad \text{assumed A invertible}$$
$$\mathrm{tr}\,B = -\mathrm{tr}(A^{-1}BA) = -\mathrm{tr}\,B. \qquad \text{used Hassani Proposition 5.6.4} \tag{5.82}$$

The last equality follows from the fact that the RHS of Eq. (5.82) has the form of a similarity transformation (by basis matrix R = A⁻¹). Recall by Hassani Proposition 5.6.4 the trace is invariant under a similarity transformation. Note that Eq. (5.82) implies tr B = 0; the only number equal to minus itself is 0. Similarly, multiply Eq. (5.78) on the left by B⁻¹ and find the trace,

$$B^{-1}AB = -B^{-1}BA, \qquad \text{assumed B invertible}$$
$$\mathrm{tr}(B^{-1}AB) = -\mathrm{tr}(A),$$
$$\mathrm{tr}\,A = -\mathrm{tr}\,A = 0. \qquad \text{used Hassani Proposition 5.6.4} \tag{5.83}$$
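A concrete illustration, not part of the original solution: the Pauli matrices σ₁ and σ₂ (met again in Supplementary Problem 5.54) anticommute, are invertible, live in even dimension N = 2, and are traceless, exactly as the problem requires.

```python
import numpy as np

# Not from the text: sigma_1 and sigma_2 satisfy all the conclusions of Problem 5.39.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
assert np.allclose(s1 @ s2 + s2 @ s1, 0)                    # anticommute
assert np.linalg.det(s1) != 0 and np.linalg.det(s2) != 0    # invertible
print(np.trace(s1), np.trace(s2))                           # both 0
```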

5.42 Let S and A be a symmetric and an antisymmetric matrix, respectively, and let M be a general matrix. Show that (a) trM = trMt , (b) tr(SA) = 0; in particular, trA = 0, (c) SA is antisymmetric if and only if [S, A] = 0, (d) MSMt is symmetric and MAMt is antisymmetric, (e) MHM† is hermitian if H is.

5.42 Observe the matrices S and A are square, given their symmetry, and M must be square too for (a), (d), and (e) to make sense.
(a) The first property follows immediately from the definition of the trace, and the fact that the diagonal elements of the matrix and its transpose are exactly the same elements,

$$\mathrm{tr}\,M = \sum_j (M)_{jj} = \sum_j (M^t)_{jj} = \mathrm{tr}\,M^t. \tag{5.84}$$

(b) To show that the trace of this product vanishes, tr(SA) = 0, let's expand the product and exploit the symmetry properties of the two matrices to show that tr(SA) = −tr(SA):

$$\mathrm{tr}(SA) = \sum_{i,j} (S)_{ij}(A)_{ji}$$
$$= -\sum_{i,j} (S)_{ij}(A)_{ij} \qquad \text{used antisymmetry of A}$$
$$= -\sum_{i,j} (S)_{ji}(A)_{ij} \qquad \text{used symmetry of S}$$
$$= -\sum_{i,j} (S)_{ij}(A)_{ji} \qquad \text{relabelled dummy indices}$$
$$= -\mathrm{tr}(SA) = 0. \tag{5.85}$$


The final equality in Eq. (5.85) comes from the unique solution a = 0 to an equation of the form a = −a, when a ∈ ℂ. Now consider a particular example of a symmetric matrix, S = 1. Inserting this in Eq. (5.85) gives tr(A) = 0.
(c) First suppose the commutator vanishes, [S, A] = 0. We have

$$(SA)^t = A^t S^t \qquad \text{used Hassani Theorem 5.2.1(b)}$$
$$= A^t S \qquad \text{used S symmetric}$$
$$= -AS \qquad \text{used A antisymmetric}$$
$$= -SA, \qquad \text{used commutator vanishes} \tag{5.86}$$

which proves that SA is antisymmetric. But we are not done yet. We must also prove that this antisymmetry implies the commutator vanishes. So let's now start with the assumption that SA is antisymmetric and exploit symmetry properties to work out the implications for the commutator:

$$SA = -(SA)^t \qquad \text{assumed SA is antisymmetric}$$
$$= -A^t S^t \qquad \text{used Hassani Theorem 5.2.1(b)}$$
$$= AS, \qquad \text{used A antisymmetric, S symmetric} \tag{5.87}$$

which shows that [S, A] = 0. This completes the proof.
(d) Using Hassani Theorem 5.2.1 we could easily show MSMᵗ = (MSMᵗ)ᵗ. It's only slightly longer to go back to first principles to show that (MSMᵗ)ᵢⱼ = (MSMᵗ)ⱼᵢ. We do this by expanding the matrix products and exploiting the symmetry of S:

$$(MSM^t)_{ij} = \sum_{k,l} (M)_{ik}(S)_{kl}(M^t)_{lj} \qquad \text{definition of matrix product}$$
$$= \sum_{k,l} (M)_{ik}(S)_{kl}(M)_{jl} \qquad \text{definition of transpose}$$
$$= \sum_{k,l} (M)_{jl}(S)_{kl}(M)_{ik} \qquad \text{changed order}$$
$$= \sum_{k,l} (M)_{jl}(S)_{lk}(M^t)_{ki} \qquad \text{used symmetry of S}$$
$$= (MSM^t)_{ji}. \qquad \text{recognized matrix product} \tag{5.88}$$

With manipulations similar to those used above we could show that (MAMᵗ)ᵢⱼ = −(MAMᵗ)ⱼᵢ, establishing that MAMᵗ is antisymmetric. For the sake of variety, let's build upon the results established in Hassani Theorem 5.2.1:

$$(MAM^t)^t = (M(AM^t))^t \qquad \text{used square matrix algebra associative}$$
$$= (AM^t)^t M^t \qquad \text{used Hassani Theorem 5.2.1(b)}$$
$$= (M^t)^t A^t M^t \qquad \text{used Hassani Theorem 5.2.1(b)}$$
$$= M A^t M^t \qquad \text{used Hassani Theorem 5.2.1(c)}$$
$$= -MAM^t, \qquad \text{used antisymmetry of A} \tag{5.89}$$

which confirms that MAMᵗ is antisymmetric.
(e) To show from first principles that MHM† is hermitian if H is, we must show that (MHM†)ᵢⱼ = (MHM†)*ⱼᵢ if (H)ᵢⱼ = (H)*ⱼᵢ. The manipulations are similar to those used above in (d):

$$(MHM^\dagger)_{ij} = \sum_{k,l} (M)_{ik}(H)_{kl}(M^\dagger)_{lj} \qquad \text{definition of matrix product}$$
$$= \sum_{k,l} (M)_{ik}(H)_{kl}(M)^*_{jl} \qquad \text{definition of adjoint}$$
$$= \sum_{k,l} (M)^*_{jl}(H)_{kl}(M)_{ik} \qquad \text{changed order}$$
$$= \sum_{k,l} (M)^*_{jl}(H)^*_{lk}(M)_{ik} \qquad \text{used H hermitian}$$
$$= \sum_{k,l} (M)^*_{jl}(H)^*_{lk}(M^\dagger)^*_{ki} \qquad \text{used definition of adjoint}$$
$$= (MHM^\dagger)^*_{ji}. \qquad \text{recognized matrix product} \tag{5.90}$$

This implies (MHM† ) = (MHM† )† , which means that the matrix MHM† is hermitian. While the first principles approach is slightly longer it is more transparent in some respects.

5.45 Suppose that there are two operators A and B such that [A, B] = c1, where c is a [nonzero] constant. Show that the vector space in which such operators are defined cannot be finite-dimensional. Conclude that the position and momentum operators of quantum mechanics can be defined only in infinite dimensions.

5.45 Let’s use reductio ad absurdum. We assume the operators are defined on a finite N ∈ N-dimensional vector space. Then the operators have square


N × N matrix representations, as implied by Hassani Proposition 5.1.2. Then the commutator equation becomes the matrix equation,

$$[A, B] = c\,\mathbf{1}. \tag{5.91}$$

The trace of a matrix is just the sum of the diagonal elements, so tr(D + E) = tr(D) + tr(E). Taking the trace of Eq. (5.91) we have

$$\mathrm{tr}([A, B]) = \mathrm{tr}(c\,\mathbf{1}) = cN,$$
$$\mathrm{tr}(AB) - \mathrm{tr}(BA) = cN,$$
$$0 = cN, \qquad \text{used Hassani Theorem 5.6.2} \tag{5.92}$$

Recall Hassani Theorem 5.6.2 includes tr(AB) = tr(BA), the so-called cyclic invariance of the trace (see Sect. L5.6 of ref. [1]). Equation (5.92) is clearly a contradiction for any natural number N > 0 and arbitrary nonzero scalar c ∈ ℂ. This proves our assumption of finite dimension N is not valid; the operators must be defined on an infinite-dimensional space. The canonical commutation relations of quantum theory state that the position operator X and corresponding momentum operator P have the commutation relation

$$[X, P] = i\hbar\,\mathbf{1}, \tag{5.93}$$

where ℏ is a real number, Planck's reduced constant; see [14, Eq. (1.6.46)]. Clearly the position and momentum operators of quantum mechanics can be defined only in infinite dimensions.
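The NumPy sketch below is not part of the original solution; it illustrates the key fact used above, namely that for finite matrices the trace of any commutator vanishes, so it can never equal the trace of c1 with c ≠ 0.

```python
import numpy as np

# Not from the text: for matrices, tr([A, B]) is always zero.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
print(np.trace(A @ B - B @ A))   # ~0 up to round-off, while tr(c*identity) would be 4c
```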

5.2 Supplementary Problems

5.47 Find the determinant of the matrix

$$\begin{pmatrix} 0 & 1 & 2 & 1 \\ 1 & 2 & 2 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 2 & 2 & 1 \end{pmatrix}. \tag{5.94}$$

Hint: the answer is −2. Make sure you really understand Hassani Eq. (5.27) by calculating this determinant and comparing your answer with that provided.


5.48 Construct the matrix representation for the complex structure J ∈ End(V) on an even N = 2m-dimensional real vector space V as in Hassani Example 5.1.3 but for the other ordering of the basis vectors, i.e. (| e1 ⟩, | e2 ⟩, . . . , | em ⟩, J| e1 ⟩, J| e2 ⟩, . . . , J| em ⟩) ,

(5.95)

m where {| ei ⟩}i=1 are the first m vectors of a given orthonormal basis of V. Show examples when N = 2 and N = 6.

5.48 Following the rule in Hassani Box 5.1.1, we form columns from the components of the vectors obtained from the operator J acting on the basis vectors in Eq. (5.95). The first such vector J| e1 ⟩ = | em+1 ⟩ by construction. Note that it is automatically normalized because ⟨ em+1 | em+1 ⟩ = ⟨ Je1 | Je1 ⟩ , = ⟨ e1 | e1 ⟩ ,

by construction used isometric property, Eq. (5.147)

= 1.

(5.96)

So the first column vector has a single nonzero component at row m + 1, given by Jm+1,1 = 1. Similarly, hitting | e2 ⟩ with J gives by construction the m + 2 basis vector, which is represented by a column vector with a one at row m + 2, zeros elsewhere, i.e. Jm+2,2 = 1. Continuing in this way we find the first m columns of matrix J, Ji, j = δ(i−m), j ,

j = 1, . . . , m, and i = 1, . . . , m.

(5.97)

Eventually we arrive at the m + 1 basis vector | em+1 ⟩. Hitting | em+1 ⟩ with the complex structure gives J| em+1 ⟩ = J2 | e1 ⟩, = −| e1 ⟩.

by construction, used Eq. (5.141)

(5.98)

So representing this vector J| em+1 ⟩ by its components at column (m + 1) we have a −1 at row 1, i.e. J1,m+1 = −1, and zeros elsewhere. For the remaining vectors we obtain J| em+k ⟩ for k = 1, 2, . . . , m Ji, j = −δi,( j−m) ,

j = m + 1, . . . , 2m, and i = 1, . . . , m.

For N = 2m = 6 the matrix J would be

(5.99)




$$J = \begin{pmatrix} 0 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}. \tag{5.100}$$

For N = 2m = 2 the matrix J would be minus one times the matrix in Eq. (5.143).
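The construction above is easy to automate; the NumPy sketch below is not part of the original solution and simply builds the block form of Eq. (5.100) for general m and checks J² = −1.

```python
import numpy as np

# Not from the text: build J of Eq. (5.100) for general m and check J^2 = -1.
def complex_structure(m):
    J = np.zeros((2 * m, 2 * m))
    J[m:, :m] = np.eye(m)      # J|e_j>   = |e_{m+j}>
    J[:m, m:] = -np.eye(m)     # J|e_{m+j}> = -|e_j>
    return J

J = complex_structure(3)
assert np.allclose(J @ J, -np.eye(6))
print("J^2 = -1 holds for N = 6")
```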

5.49 Prove that det(AB) = det(A) det(B),

L169 of ref. [1] or Eq. (4.19) of ref. [9]

(5.101)

where A and B are square matrices of the same dimension. This is Theorem 3.16 of reference [3], and is related of course to Hassani Theorem 2.6.11 about linear operators. 5.50 To understand Hassani Theorem 5.5.1 you have to understand the symbol ∈i1 i2 ...i N , introduced in Hassani Eq. (2.29). ∈i1 i2 ...i N = +1 if i 1 i 2 . . . i N is an even permutation of 1, 2, . . . , N and ∈i1 i2 ...i N = −1 if i 1 i 2 . . . i N is an odd permutation of 1, 2, . . . , N . Convince yourself that ∈12 = +1 ∈21 = −1 ∈22 = 0

∈132 = −1, ∈321 = −1, ∈213 = −1,

∈4321 = +1, ∈3214 = −1, ∈2143 = +1,

∈54321 = +1, ∈43215 = +1, ∈32154 = +1.

(5.102)

See Problem 5.18 if you are unsure how to decide if a permutation is even or odd. 5.51 Redo the argument leading to Hassani Theorem 5.5.2 for the case where A represents an operator A ∈ End(C N ) as oppose to R N as considered in Hassani Sect. 5.5.

5.51 The argument generalizes to the complex case without modification. Let A be the matrix representation of an operator A ∈ End(C N ) as oppose to R N , with | v j ⟩ ∈ C N the jth column of A. Using Hassani Box 5.1.1 it follows that the operator and matrix are related through A| e j ⟩ = | v j ⟩,

j = 1, . . . , N .

(5.103)


where $\{| e_j \rangle\}_{j=1}^N$ are the standard basis vectors of ℂᴺ. Again let Δ be a determinant function, now in ℂᴺ, that gives unity on the standard basis. Again we find that

$$\Delta(| v_1 \rangle, \ldots, | v_N \rangle) = \Delta(A| e_1 \rangle, \ldots, A| e_N \rangle) \qquad \text{used Eq. (5.103)}$$
$$= \Delta_A(| e_1 \rangle, \ldots, | e_N \rangle) \qquad \text{used Hassani Eq. (2.31)}$$
$$= \det A \cdot \Delta(| e_1 \rangle, \ldots, | e_N \rangle) \qquad \text{used Hassani Eq. (2.32)}$$
$$= \det A. \tag{5.104}$$

Defining the determinant of the matrix as that of the corresponding operator we have det A = det A = Δ (| v1 ⟩, . . . , | v N ⟩) .

(5.105)

Furthermore, because the rows of A, say | u i ⟩, correspond to the columns of At we immediately conclude from Eq. (5.105) that det At = Δ (| u 1 ⟩, . . . , | u N ⟩) , = det A.

used Hassani Theorem 5.5.1.

(5.106)

5.52 Justify in detail the results of Hassani Theorem 5.5.2: Let A be a complex square matrix. Then 1. det A is linear with respect to any row or column vector of A. 2. If any two rows or two columns of A are interchanged, det A changes sign. 3. Adding a multiple of one row (column) of A to another row (column) of A does not change det A. 4. det A = 0 iff the rows (columns) are linearly dependent. Hint: Use Eq. (5.105) and the properties of a determinant function (see Hassani Definition 2.6.5) to justify each of the four elements of the theorem above. 5.53 Redo the proofs for the identities in Problem 5.42(b),(d) and (e) using matrix notation instead of the index notation as we have done in the solutions given above. This forces us to remember and use matrix properties rather than rederiving them at each step. Assume that the identity in (a) holds, trM = trMt . 5.54 The Pauli spin matrices from Hassani Example 5.2.7, ( ) 01 σ1 = , 10

(

) 0 −i σ2 = , i 0

(

) 1 0 σ3 = , 0 −1

play an important role in quantum mechanics. Show that these matrices are:

(5.107)


(a) hermitian, (b) unitary, (c) involutive, (d) anticommuting.

5.54 (a) Recall a hermitian matrix H obeys

$$H^\dagger = H, \quad \text{i.e.} \quad (H)_{ij} = (H)^*_{ji}. \qquad \text{Hassani Definition 5.2.4, Eq. (5.12)} \tag{5.108}$$

Applying this definition to the Pauli spin matrices we immediately find:

$$\sigma_1^\dagger = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sigma_1, \qquad \sigma_2^\dagger = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}^{\!*} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \sigma_2, \qquad \sigma_3^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma_3. \tag{5.109}$$

This confirms that all three Pauli spin matrices are hermitian.
(b) Recall a unitary matrix U obeys

$$U^\dagger U = UU^\dagger = \mathbf{1}. \qquad \text{Hassani Definition 5.2.4} \tag{5.110}$$

Applying this definition to the Pauli spin matrices we find

$$\sigma_i^\dagger \sigma_i = \sigma_i^2 = \sigma_i \sigma_i^\dagger, \qquad i = 1, 2, 3, \qquad \text{used that the } \sigma_i \text{ are all hermitian.} \tag{5.111}$$

For the individual matrices we find

$$\sigma_1^2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_2^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_3^2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{5.112}$$

where for σ₂ we used −i² = 1.

This confirms that all three Pauli spin matrices are unitary. It also shows they are involutive as we explain below.


(c) Recall an endomorphism A ∈ End(V) on a vector space V is involutive if it obeys

$$A^2 = \mathbf{1}. \qquad \text{Hassani Definition 3.1.24} \tag{5.113}$$

Extending this to the matrices that represent operators, we see that Eq. (5.112) implies that the Pauli spin matrices are involutions; in general any matrix that is both unitary and hermitian is therefore involutive.
(d) Recall the anticommutator of two operators A, B ∈ End(V) was defined just before Hassani Proposition 4.2.5 as

$$\{A, B\} = AB + BA. \tag{5.114}$$

Extending this definition to the square matrices that represent operators, the notion of anticommuting means that {A, B} = AB + BA = 0. Note that the anticommutator is symmetric in its two arguments, {A, B} = {B, A}. This means we only need to check three pairs of matrices to confirm that the Pauli spin matrices are anticommuting:

$$\{\sigma_1, \sigma_2\} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} + \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \{\sigma_2, \sigma_1\},$$
$$\{\sigma_1, \sigma_3\} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \{\sigma_3, \sigma_1\},$$
$$\{\sigma_2, \sigma_3\} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} + \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \{\sigma_3, \sigma_2\}. \tag{5.115}$$
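The NumPy sketch below is not part of the original solution; it numerically confirms all four claimed properties of the Pauli matrices at once.

```python
import numpy as np

# Not from the text: check hermitian, unitary, involutive and anticommuting properties.
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], dtype=complex)]
I = np.eye(2)
for a in range(3):
    assert np.allclose(s[a], s[a].conj().T)               # hermitian
    assert np.allclose(s[a] @ s[a].conj().T, I)           # unitary
    assert np.allclose(s[a] @ s[a], I)                     # involutive
    for b in range(a + 1, 3):
        assert np.allclose(s[a] @ s[b] + s[b] @ s[a], 0)   # anticommuting
print("all four properties verified")
```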

5.55 Show that the Pauli spin matrices, see Hassani Ex. 5.2.7, satisfy the following identity

$$\sigma_i \sigma_j = \delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k. \tag{5.116}$$

Hint: use the results of Supplementary Problem 5.54.


5.56 Show that the matrix of the Euler angles in Hassani Example 5.2.7(d),

$$\begin{pmatrix} \cos\psi\cos\varphi - \sin\psi\cos\theta\sin\varphi & -\cos\psi\sin\varphi - \sin\psi\cos\theta\cos\varphi & \sin\psi\sin\theta \\ \sin\psi\cos\varphi + \cos\psi\cos\theta\sin\varphi & -\sin\psi\sin\varphi + \cos\psi\cos\theta\cos\varphi & -\cos\psi\sin\theta \\ \sin\theta\sin\varphi & \sin\theta\cos\varphi & \cos\theta \end{pmatrix}, \tag{5.117}$$

is indeed orthogonal. Confirm also that it arises as described in the text: a rotation of angle φ about the z-axis, followed by a rotation of angle θ about the new x-axis, followed by a rotation of angle ψ about the new z-axis. Hint: These rotations are active transformations that rotate a vector, not passive rotations of the coordinate system. So the first rotation matrix is

$$\begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{5.118}$$

5.57 Prove that the transpose of the inverse of an invertible matrix A is the same as the inverse of the transpose of A.

5.57 Let A be an invertible matrix with inverse denoted A⁻¹ so that

$$A^{-1}A = \mathbf{1} = AA^{-1}. \tag{5.119}$$

Now apply the transpose to this equation,

$$(A^{-1}A)^t = \mathbf{1}^t = (AA^{-1})^t,$$
$$A^t (A^{-1})^t = \mathbf{1} = (A^{-1})^t A^t. \tag{5.120}$$

This implies that the inverse of Aᵗ is

$$(A^t)^{-1} = (A^{-1})^t. \tag{5.121}$$
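A quick numerical spot-check, not part of the original solution:

```python
import numpy as np

# Not from the text: check Eq. (5.121) on a random invertible matrix.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # diagonally dominant, hence invertible
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
print("(A^t)^{-1} = (A^{-1})^t confirmed numerically")
```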

5.58 Recall from Hassani Sect. 5.2 that an arbitrary real or complex square matrix G can be written as the sum of a symmetric part S = Sᵗ and an antisymmetric part A = −Aᵗ,

$$G = S + A. \tag{5.122}$$

For example, the following works,

$$S = \frac{1}{2}\left(G + G^t\right), \qquad A = \frac{1}{2}\left(G - G^t\right), \tag{5.123}$$

but is this the only possible decomposition into symmetric and antisymmetric parts? Prove that S and A are unique for any given real or complex square matrix G. Hint: Use Hassani Proposition 2.1.13. 5.59 An arbitrary square matrix M can be written as the sum of a hermitian part H = H† and an antihermitian part A = −A† , M = H + A.

(5.124)

Sometimes the antihermitian part is written A = iK where K is hermitian, and

$$M = H + iK \tag{5.125}$$

is called the Toeplitz decomposition. Only when M is real does this result correspond to the decomposition above in Supplementary Problem 5.58. Show K is hermitian, find matrices H and A in terms of M and prove that H and A are unique for a given square matrix M. Hint: The problem is very similar to Supplementary Problem 5.58, so try an analogous construction, see Eq. (5.123), to find matrices H and A, and use Hassani Proposition 2.1.13 for the proof of uniqueness.

5.60 An important lesson from this Chapter is that the matrix representation of an operator, much like the matrix representation of vectors and tensors, depends upon the basis. Reconsider Hassani Example 5.1.5, where we are given a linear operator A ∈ L(ℝ³), defined by its action on a column vector in the standard basis:

$$A\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x - y + 2z \\ 3x - z \\ 2y + z \end{pmatrix}. \qquad \text{Hassani Eq. (5.8)} \tag{5.126}$$

We are required to find the matrix, let's call it A′, representing A in a non-standard basis B. But here use the tools that you learned in Hassani Sect. 5.4. That is, first represent, by a matrix A in the standard basis, the operator A given in Eq. (5.126). Then find and apply the similarity transformation to the new basis. For the new basis B, let us consider

$$B = \left\{ | b_1 \rangle = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \; | b_2 \rangle = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \; | b_3 \rangle = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \right\}, \tag{5.127}$$

deliberately modified from Hassani Example 5.1.5, as we'll see later, to avoid a symmetric basis transformation matrix.


5.60 First we find A in the standard basis following the procedure in Hassani Box 5.1.1:

$$A| e_j \rangle = \sum_{i=1}^{3} (A)_{ij}\, | e_i \rangle, \qquad j = 1, 2, 3. \tag{5.128}$$

Substituting the standard basis

$$| e_1 \rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad | e_2 \rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad | e_3 \rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \tag{5.129}$$

we find immediately

$$A = \begin{pmatrix} 1 & -1 & 2 \\ 3 & 0 & -1 \\ 0 & 2 & 1 \end{pmatrix}, \tag{5.130}$$

the matrix in fact that Hassani suggested we are tempted to write. To find the similarity transformation we would write the old (standard) basis as linear combinations of the new basis,

$$| e_j \rangle = \sum_{i=1}^{3} | b_i \rangle (R)_{ij}, \qquad \left(| e_1 \rangle \;\; | e_2 \rangle \;\; | e_3 \rangle\right) = \left(| b_1 \rangle \;\; | b_2 \rangle \;\; | b_3 \rangle\right) R. \tag{5.131}$$

(Eq. (5.131) is the transpose of Hassani Eq. (5.16).) However the information we are given in Eq. (5.127) is the inverse of Eq. (5.131), so we multiply both sides of Eq. (5.131) by R⁻¹ to find,

$$\left(| b_1 \rangle \;\; | b_2 \rangle \;\; | b_3 \rangle\right) = \left(| e_1 \rangle \;\; | e_2 \rangle \;\; | e_3 \rangle\right) R^{-1}. \tag{5.132}$$

Comparing Eq. (5.132) with Eq. (5.127) we see immediately that

$$R^{-1} = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & -1 \end{pmatrix}. \tag{5.133}$$

Some tedious calculations, see Hassani Example 5.5.9, allow us to invert this matrix to find




$$R = \begin{pmatrix} 1 & -1 & 1 \\ -1 & 2 & -1 \\ 0 & 1 & -1 \end{pmatrix}. \tag{5.134}$$

And now we can apply the similarity transformation that expresses A in the new basis,

$$A' = RAR^{-1} \qquad \text{Hassani Eq. (5.19)} \tag{5.135}$$
$$= \begin{pmatrix} 1 & -1 & 1 \\ -1 & 2 & -1 \\ 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & -1 & 2 \\ 3 & 0 & -1 \\ 0 & 2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & -1 \end{pmatrix} = \begin{pmatrix} 3 & 5 & -6 \\ -1 & -6 & 10 \\ -1 & -4 & 5 \end{pmatrix}. \tag{5.136}$$

It’s worth reflecting on Eq. (5.135) to help remember the formula. The matrix A' multiplies column vectors, representing vectors in basis B, on the left and produces a column vector in basis B. How does this work? You can think of the RHS of Eq. (5.135) as a decomposition of A' into three matrices. The first to hit the column vector in basis B is R−1 , which transforms it into a column vector represented in the standard basis. Then this standard-basis column vector is multiplied by A, the matrix representing A in the standard basis, so it produces a column vector in the standard basis. Finally the column vector is transformed to the basis B by R.

5.61 Give a physical explanation of the similarity transformation, A = R−1 A' R,

Hassani Eq. (5.19)

(5.137)

interpreting A′ as a matrix representing an operator on a vector space in a basis B′, while A represents the same operator in a basis B. Specify exactly how R relates the two bases and its effect on column vector representations of vectors in B. Hint: It might help to reread the paragraph in Hassani Sect. 5.4 just before Hassani Eq. (5.19) and/or the final paragraph of the solution to the previous problem, Supplementary Problem 5.60.

5.62 Reconsider Hassani Example 5.1.5 and Supplementary Problem 5.60, where we are given a linear operator A ∈ L(ℝ³), defined by Eq. (5.126). We are required to find the matrix A representing A in a non-standard basis B. As in Supplementary Problem 5.60, use the tools that you learned in Hassani Sect. 5.4. That is, first represent, by a matrix A′ in the standard basis, the operator A given in Eq. (5.126). Then


find and apply the similarity transformation to the new basis. For the new basis B, use the basis from Hassani Example 5.1.5:

$$B = \left\{ | b_1 \rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \; | b_2 \rangle = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \; | b_3 \rangle = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\}. \tag{5.138}$$

Verify that you recover the matrix A given in the text,

$$A = \begin{pmatrix} \tfrac12 & 2 & -\tfrac32 \\ -\tfrac12 & 1 & \tfrac52 \\ \tfrac52 & 0 & \tfrac12 \end{pmatrix}. \tag{5.139}$$

5.63 Let J ∈ End(V) be an isometric operator on the real vector space V with symmetric, bilinear positive definite inner product: ⟨a |b⟩ = ⟨b|a ⟩, ⟨ a | (β| b ⟩ + γ| c ⟩) = β ⟨ a | b ⟩ + γ ⟨ a | c ⟩ , ⟨ a | a ⟩ ≥ 0, and ⟨ a | a ⟩ = 0 =⇒ | a ⟩ = | 0 ⟩.

(5.140)

This corresponds to the inner product of Hassani Definition 2.1.1 for the case of a real vector space. Furthermore, assume J2 = −1.

(5.141)

Note that J fulfills the definition of a complex structure, Hassani Definition 2.4.5. (a) Show that ⟨ x |J| x ⟩ = 0,

∀| x ⟩ ∈ V.

(5.142)

(b) Show that the matrix

$$J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{5.143}$$

represents a complex structure on V = ℝ² with the natural inner product. Confirm that the property in (a) applies.


(c) Reconcile the result in (a) with Hassani Theorem 2.3.8: A linear operator T on an inner product space is 0 if and only if ⟨ x |T| x ⟩ = 0 for all | x ⟩. Hint: reconsider the proof of Hassani Theorem 2.3.8 for the special case of a vector space over the reals.

5.63 (a) We follow the proof of Hassani Proposition 2.4.6. Let | x ⟩ ∈ V be an arbitrary vector and define | y ⟩ = J| x ⟩. Then for all | x ⟩ ∈ V we have ⟨ x |J| x ⟩ = ⟨ x | y ⟩ = ⟨ Jx | Jy ⟩ ,

used J isometric,

= ⟨ y |J |x ⟩ = − ⟨ y |x ⟩, = − ⟨x | y ⟩, = 0.

used linearity, J2 = −1,

2

used symmetry, (5.144)

(b) Consider R2 with standard inner product and J represented by the matrix J, ) ( 0 1 (5.145) . J= −1 0 Then ( J2 =

0 1 −1 0

)(

) ( ) 0 1 −1 0 = . −1 0 0 −1

(5.146)

The isometric property of J demands that ⟨ Ja | Jb ⟩ = ⟨ a | b ⟩ ,

∀| a ⟩, | b ⟩ ∈ V.

(5.147)

In R2 with operator J written as a 2 by 2 matrix we can write the vectors as column vectors so the standard inner product ⟨ a | b ⟩ between two vectors | a ⟩, | b ⟩ ∈ R2 becomes ( ) ) w ⟨a |b⟩ = x y = xw + yz. z (

Then the LHS of Eq. (5.147) becomes

(5.148)


)( ) ) ( ))t ( 0 1 w 0 1 x , −1 0 z −1 0 y ( )( )( ) ) 0 −1 0 1 w y , 1 0 −1 0 z ( )( ) ) 10 w y , 01 z

(( ⟨ Ja | Jb ⟩ = ( = x ( = x

= xw + yz, = ⟨a |b⟩.

used Eq. (5.148) (5.149)

Comparing this with Eq. (5.147) we see the isometric condition holds for the matrix J. The condition we verified in (a) must apply to this matrix, ( )( ) ( ) ( ) 0 −1 ( ) −y x ⟨ x |J| x ⟩ = x y = x y , 1 0 y x ( ) x = −x y + x y = 0, ∀ ∈ R2 . y

(5.150)

(c) Recall Hassani Theorem 2.3.8: A linear operator T ∈ End(W) on an inner product space is 0 if and only if ⟨ x |T| x ⟩ = 0 for all | x ⟩. Here the inner product is defined in Hassani 2.2.1, with the positive definiteness property (see Eq. (5.140)). In fact we will not need this property. Consider initially the general case where W is complex. Later we’ll restrict consideration to a real vector space V and we’ll find that the strong condition T = 0 no longer applies. Proof If T = 0 then ⟨ x |T| x ⟩ = ⟨ x |0| x ⟩ , = ⟨x |0⟩, = ⟨x | y − y ⟩, = ⟨x | y ⟩ − ⟨x | y ⟩,

for some arbitrary | y ⟩ ∈ W used linearity

= 0, ∀| x ⟩ ∈ W.

(5.151)

Conversely, suppose ⟨ x |T| x ⟩ = 0 for all | x ⟩ ∈ W. Let | x ⟩ = α| a ⟩ + β| b ⟩ with for now α, β ∈ C and for always nonzero, and | a ⟩, | b ⟩ ∈ W. Then 0 = ⟨ x |T| x ⟩ = (α∗ ⟨ a | + β ∗ ⟨ b |)T(α| a ⟩ + β| b ⟩), = |α|2 ⟨ a |T| a ⟩ + |β|2 ⟨ b |T| b ⟩ + α∗ β ⟨ a |T| b ⟩ + αβ ∗ ⟨ b |T| a ⟩ , = α∗ β ⟨ a |T| b ⟩ + αβ ∗ ⟨ b |T| a ⟩ . (5.152)


At this point we pause and consider the special case of a real vector space. Then α∗ β = αβ ∗ = αβ /= 0 so that we require only that ⟨ a |T| b ⟩ + ⟨ b |T| a ⟩ = 0.

(5.153)

This implies only that T is skew, i.e. TT = −T, where AT is the adjoint of operator A defined in Hassani Definition 2.4.3. T being skew is of course a much weaker conclusion than T = 0. For a real matrix representation J of the operator as in (b) this condition is met if the matrix equals the negative of its transpose. In summary, we find that the strong condition T = 0 applies only for a complex vector space W in Hassani Theorem 2.3.8. So in fact the positive definite property of the inner product is not an essential requirement for the case of a real vector space. For the case of a complex vector space, as considered for Hassani’s proof of his Theorem 2.3.8, the positive definiteness of the inner product was essential for his Theorem 2.3.7 to apply and to reach the strong conclusion T = 0.

5.64 Show that a unitary operator is represented by a unitary matrix in an orthonormal basis. Likewise, a unitary matrix represents, in an orthonormal basis, a unitary operator. Hint: Complete the demonstration of Hassani found at the end of his Sect. 5.3. 5.65 Is the product of two unitary matrices always a unitary matrix? It is tempting to reason as follows. The product of two unitary endomorphisms is also a unitary endomorphism, see Problem 4.32. Furthermore the matrix representation of the product of endomorphisms A and B is the matrix product of the matrices M(A) and M(B) representing the endomorphisms, M(B ◦ A) = M(B)M(A).

Hassani Eq. (5.7)

(5.154)

However only in an orthonormal basis is the adjoint of an operator represented by the adjoint of the matrix representing the operator, Hassani Box 5.3.1, which might raise some doubt. Prove directly from Hassani Definition 5.2.4 of a unitary matrix that the product of two unitary matrices is indeed a unitary matrix.

5.66 In special relativity physics takes place in a flat four-dimensional spacetime called Minkowski space. Introducing a rigid inertial reference frame O and suitably generalized Cartesian coordinates (called rectangular or pseudo-Cartesian coordinates), we can represent any point in Minkowski space with a column vector (ct, x, y, z)ᵗ, where c is the speed of light. Because c is a universal constant it is convenient to choose units such that the speed of light is unity, c = 1, so points have coordinates (t, x, y, z)ᵗ. In special relativity theory we learn that the change of reference frame from a given inertial reference frame to another Ō moving at constant speed


w along the z-axis results in a transformation of coordinates to the new coordinates (t̄, x̄, ȳ, z̄)ᵗ given by the famous Lorentz transformation for a boost,

$$\begin{pmatrix} \bar{t} \\ \bar{x} \\ \bar{y} \\ \bar{z} \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & -w\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -w\gamma & 0 & 0 & \gamma \end{pmatrix}\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \qquad \text{with } \gamma = \frac{1}{\sqrt{1-w^2}}, \text{ and } -1 < w < 1. \tag{5.155}$$

Find the Lorentz transformation for a constant velocity v = (u, v, w) with norm restricted by u 2 + v 2 + w2 < 1.

(5.156)

Use two rotations to a coordinate system such that relative velocity between the inertial frames is completely along the z-axis.

5.66 Write the v = (u, v, w) in spherical coordinates v = (V, θ, φ).

(5.157)

A rotation about the z-axis by angle φ will create inertial frame O ' with new axes, say x ' and y ' -axes, without modifying the others, so t ' = t and z ' = z. The vector v will fall on the x ' , z ' plane. The matrix of this transformation is ⎛ '⎞ ⎛ ⎞⎛ ⎞ 1 0 0 0 t t ⎜x ' ⎟ ⎜0 cos φ sin φ 0⎟ ⎜x ⎟ ⎜ '⎟ = ⎜ ⎟⎜ ⎟ ⎝ y ⎠ ⎝0 − sin φ cos φ 0⎠ ⎝ y ⎠ . z' 0 0 0 1 z

(5.158)

Now rotate about the new y ' -axis by angle θ, creating inertial frame O '' with new axes z '' and x '' such that the velocity v falls on the z '' -axis. The y ' and t ' axes are not affected so these coordinates remain unchanged, i.e. y '' = y ' and t '' = t ' . The matrix of this transformation is ⎛ '' ⎞ ⎛ ⎞⎛ '⎞ t 1 0 0 0 t ⎜x '' ⎟ ⎜ 0 cos θ 0 − sin θ⎟ ⎜x ' ⎟ ⎜ '' ⎟ = ⎜ ⎟⎜ ⎟ (5.159) ⎝y ⎠ ⎝ 0 0 1 0 ⎠ ⎝y'⎠ . z '' z' 0 sin θ 0 cos θ The product of these two rotations can be written as a single matrix that transforms coordinates from O to O '' :




⎞⎛ ⎞ 1 0 0 0 1 0 0 0 ⎜0 cos θ 0 − sin θ⎟ ⎜0 cos φ sin φ 0⎟ ⎟⎜ ⎟ R=⎜ ⎝0 0 1 0 ⎠ ⎝0 − sin φ cos φ 0⎠ , 0 sin θ 0 cos θ 0 0 0 1 ⎛ ⎞ 1 0 0 0 ⎜0 cos θ cos φ cos θ sin φ − sin θ⎟ ⎟. =⎜ ⎝0 − sin φ cos φ 0 ⎠ 0 sin θ cos φ sin θ sin φ cos θ

(5.160)

Note that a vector with spatial part parallel to v, say (t, a sin θ cos φ, a sin θ sin φ, a cos θ)t , will have components in O '' given by (t, 0, 0, a)t . Now the velocity is completely along the z '' -axis, with magnitude ||v|| = √ u 2 + v 2 + w 2 = V so we can apply the Lorentz transformation of the form in Eq. (5.155) ⎛ '' ⎞ ⎛ γ t¯ ⎜x¯ '' ⎟ ⎜ 0 ⎜ '' ⎟ = ⎜ ⎝ y¯ ⎠ ⎝ 0 −V γ z¯ ''

0 1 0 0

⎞ ⎛ '' ⎞ 0 −V γ t ⎜x '' ⎟ 0 0 ⎟ ⎟⎜ ⎟, 1 0 ⎠ ⎝ y '' ⎠ 0 γ z ''

with γ = √

1 1 − V2

.

(5.161)

Call the matrix in Eq. (5.161) ⎛

γ ⎜ 0 B'' = ⎜ ⎝ 0 −V γ

0 1 0 0

⎞ 0 −V γ 0 0 ⎟ ⎟. 1 0 ⎠ 0 γ

(5.162)

Now we must undo the rotations about z and y ' -axes in order to express the Lorentz transformation in the original coordinates. This requires finding the inverse to R, which is easily obtained by changing the sign of the two angles in the rotation matrices in Eqs. (5.158) and (5.159) and multiplying them. But even quicker would be to note that the product of two orthogonal matrices is orthogonal, so R is orthogonal: R−1 = Rt , ⎛ ⎞ 1 0 0 0 ⎜0 cos θ cos φ − sin φ sin θ cos φ⎟ ⎟ =⎜ ⎝0 cos θ sin φ cos φ sin θ sin φ ⎠ . 0 − sin θ 0 cos θ

(5.163)

The resulting transformation for the general boost B can be obtained from one single matrix




γ ⎜ −γu −1 '' B=R B R=⎜ ⎝ −γv −γw where γ =



1 1−u 2 −v 2 −w2

−γu Lxx L yx L zx

−γv Lxy L yy L zy

⎞ −γw L xz ⎟ ⎟, L yz ⎠ L zz

(5.164)

and



⎞ ⎛ ⎞ Lxx 1 + (γ − 1) sin2 θ cos2 φ ⎝ L yx ⎠ = ⎝(γ − 1) sin2 θ cos φ sin φ⎠ , L zx (γ − 1) cos θ sin θ cos φ ⎞ ⎛ ⎞ ⎛ (γ − 1) sin2 θ sin φ cos φ Lxy ⎝ L yy ⎠ = ⎝γ sin2 θ sin2 φ + cos2 θ sin2 φ + cos2 φ⎠ , L zy (γ − 1) cos θ sin θ sin φ ⎛ ⎞ ⎛ ⎞ Lxz (γ − 1) sin θ cos θ cos φ ⎝ L yz ⎠ = ⎝ (γ − 1) sin θ cos θ sin φ ⎠ . L zz 1 + (γ − 1) cos2 θ

(5.165)

Comparing the result in Eq. (5.164) with Hassani Eq. (5.19) we see that the Lorentz transformation for a boost along an arbitrary velocity vector is given by a similarity transformation of the matrix for a boost along the z-axis.
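The NumPy sketch below is not part of the original solution; it builds the general boost numerically as R⁻¹B″R for one assumed velocity (u, v, w) = (0.3, −0.2, 0.4) and checks two things the text leads us to expect: the result preserves the Minkowski metric, and its first row matches (γ, −γu, −γv, −γw) of Eq. (5.164).

```python
import numpy as np

# Not from the text: general boost as a similarity transformation of the z-boost.
u, v, w = 0.3, -0.2, 0.4
V = np.sqrt(u*u + v*v + w*w)
theta, phi = np.arccos(w / V), np.arctan2(v, u)
gamma = 1.0 / np.sqrt(1.0 - V*V)

Rz = np.eye(4); Rz[1:3, 1:3] = [[np.cos(phi), np.sin(phi)], [-np.sin(phi), np.cos(phi)]]
Ry = np.eye(4)
Ry[1, 1], Ry[1, 3], Ry[3, 1], Ry[3, 3] = np.cos(theta), -np.sin(theta), np.sin(theta), np.cos(theta)
R = Ry @ Rz                                        # the two rotations, cf. Eq. (5.160)
Bz = np.eye(4); Bz[0, 0] = Bz[3, 3] = gamma; Bz[0, 3] = Bz[3, 0] = -V * gamma
B = R.T @ Bz @ R                                   # R is orthogonal, so R^{-1} = R^t

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
assert np.allclose(B.T @ eta @ B, eta)             # Lorentz transformations preserve eta
assert np.allclose(B[0, 1:], [-gamma*u, -gamma*v, -gamma*w])   # first row of Eq. (5.164)
print("general boost obtained by a similarity transformation of the z-boost")
```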

5.67 Consider the equation relating the two bases | ai ⟩ and | a 'j ⟩ via the basis transformation matrix ρ ji , | ai ⟩ =

N ∑

ρ ji | a 'j ⟩,

i = 1, 2, . . . , N ,

(5.166)

j=1

see just before Hassani Eq. (5.16). Compare this to Hassani Eq. (5.3) for the more general case of a linear transformation A ∈ L(V N , W M ), A| ai ⟩ =

M ∑

ρ ji | b j ⟩,

i = 1, 2, . . . , N ,

(5.167)

j=1

What is the implied operator in Eq. (5.166) in the place of A? Why is this not a N is an eigenbasis contradiction to the result of Problem 5.2? Suppose that {| ai ⟩}i=1 (but not necessarily an orthonormal basis) and that we know the components of N . Find the basis transformation matrix these eigenvectors in a given basis {| a 'j ⟩}i=1 R = (ρ ji ).


5.68 Prove that the complex structure J ∈ End(V) on an odd N = 2m + 1-dimensional real vector space V does not exist. Does a 3 by 3 matrix analogous to those given in Hassani Example 5.1.3 or your solution to Supplementary Problem 5.48 exist? If so, why is it not the complex structure? Hint: Go through the construction of the matrix representation of the complex structure as in Hassani Example 5.1.3 for the case of an odd-dimensional vector space. If | e N ⟩ is the last basis vector, orthogonal by construction to all the previous basis vectors, search for a contradiction regarding the properties of the vector J| e N ⟩. 5.69 Reconsider the argument leading up to Hassani Theorem 5.5.1 applied to a linear map A : V N → W N . It is non-standard but illustrative to choose different N N and {| bi ⟩}i=1 respectively, for the domain space V N and target bases, say {| ai ⟩}i=1 space W N of A. Justifying all the steps, show that the LHS of Hassani Eq. (2.32) becomes LHS = Δ A (| ai ⟩, . . . , | a N ⟩), ∑ = απ(1)1 . . . απ(N )N ∈π · det(B)Δ(| ai ⟩, . . . , | a N ⟩),

(5.168)

π

where B = End(V N ) is the operator such that | bi ⟩ = B| ai ⟩. Also show that the RHS of Hassani Eq. (2.32) remains RHS = det(A)Δ(| ai ⟩, . . . , | a N ⟩).

(5.169)

Conclude that det A = det B

∑ π

∈π



(A)π(k)k = det B det A.

(5.170)

k=1

So the standard formula of Hassani Theorem 5.5.1 in which det A = det A relies on the standard choice of using the same basis in the domain and target space of the representation of the linear map A. Of course this is natural for an endomorphism A ∈ L(V) because in this case there is only one space.

5.69 LHS = Δ A (| ai ⟩, . . . , | a N ⟩),

used Hassani Eq. (2.32)

= Δ(A| a1 ⟩, . . . , A| a N ⟩). ⎞ ⎛ N N ∑ ∑ αi 1 1 | bi 1 ⟩, . . . , αi N 1 | bi N ⟩⎠ , = Δ⎝

used Hassani Eq. (2.31)

i 1 =1

i N =1

used Hassani Eq. (5.3)


=



N ∑

αi 1 1 Δ ⎝| bi 1 ⟩, . . . ,

i 1 =1

⎞ αi N 1 | bi N ⟩⎠ ,

used Δ is N -linear

i N =1

N ∑

=

N ∑

( ) αi 1 1 . . . αi N N Δ | bi 1 ⟩, . . . , | bi N ⟩ ,

used Δ is N -linear

i 1 ,i 2 ,...i N =1

= = =

∑ π

∑ π

∑ π

( ) απ(1)1 . . . απ(N )N Δ | bπ(1) ⟩, . . . , | bπ(N ) ⟩ ,

used Δ is skew

απ(1)1 . . . απ(N )N ∈π · Δ(| b1 ⟩, . . . , | b N ⟩),

used Δ is skew

απ(1)1 . . . απ(N )N ∈π det(B)Δ(| a1 ⟩, . . . , | a N ⟩).

used Hassani Eq. (2.31)

(5.171) On the other hand the RHS of Hassani Eq. (2.32) remains RHS = det(A)Δ(| a1 ⟩, . . . , | a N ⟩).

used Hassani Eq. (2.32)

(5.172)

Comparing the LHS and RHS we conclude that det(A)Δ(| a1 ⟩, . . . , | a N ⟩) =

∑ π

απ(1)1 . . . απ(N )N ∈π · det(B)Δ(| a1 ⟩, . . . , | a N ⟩).

(5.173) So det(A) =



απ(1)1 . . . απ(N )N ∈π · det(B),

Δ nonzero

π

= det(B)



∈π

π

= det(B) det A.

N ∏

(A)π(k)k .

changed notation

k=1

used Hassani Theorem 5.5.1 (5.174)

Chapter 6

Spectral Decomposition

6.1 Problems

6.3 Show that the intersection of two invariant subspaces of an operator is also an invariant subspace.

6.3 Let A ∈ End(V) for some finite dimensional vector space V. Furthermore, suppose M and N are two distinct but overlapping invariant subspaces of A on V, with P = M ∩ N /= ∅.

(6.1)

Then for any | p 〉 ∈ P we have A| p 〉 ∈ M, A| p 〉 ∈ N ,

because | p 〉 ∈ M, because | p 〉 ∈ N ,

(6.2)

so that A| p 〉 ∈ M ∩ N = P.

(6.3)

This shows that P is also an invariant subspace of A.



6.6 Show that (a) the coefficient of λ N in the characteristic polynomial of any linear operator is (−1) N , where N = dim V, and (b) the constant in the characteristic polynomial of an operator is its determinant.

N 6.6 (a) Our goal is to find ∑ Na N , then coefficient of the λ term in the characteristic polynomial P(A) = n=0 an λ . The characteristic polynomial of an operator A ∈ End(V), on the N -dimensional vector space V is obtained from the determinant of the operator B = A − λ1, where λ ∈ C, see Hassani Eq. (6.7). In Chap. 5 we learned that the determinant of an operator is obtained from the determinant of the corresponding matrix representation. Applying Hassani Theorem 5.5.1 to B we get

det(A − λ1) = det(A − λ1), ∑ = sgn(π )(A − λ1)π(1)1 (A − λ1)π(2)2 . . . (A − λ1)π(N )N .

(6.4)

π ∈S N

Here π ∈ S N is a succinct way of referring to the permutations of the natural numbers 1 through N , S N being the so-called symmtric group including the set of all such permutations and sgn(π ) is the sign of the permutation π, being +1 for an even permutation and −1 for an odd permutation. (I’ve departed deliberately from the notation of Hassani Theorem 5.5.1 and Supplementary Problem 5.50, in favour of the notation of Eq. (L147) in Ref. [1], because I wanted to introduce you to these useful bits of group theory that will be covered by Hassani in Sect. 23.4). Each permutation of course has N numbers and π(1) refers to the first number, π(2) the second, etc. The key observation for the matter at hand is that the only term in the sum Eq. (6.4) with λ N is the one arising from sgn(π )(A − λ1)11 (A − λ1)22 . . . (A − λ1) N N .

(6.5)

We actually know that π = (1, 2, . . . N ) in this case, so sgn(π ) = +1, so the contribution to the sum is a single term a N λ N = (−λ1)11 (−λ1)22 . . . (−λ1) N N

(6.6)

And since (1)11 = (1)22 . . . (1) N N = 1, that is the identity is just the matrix with 1 down the diagonal, this becomes a N λ N = (−λ) N = (−1) N λ N ,

=⇒

a N = (−1) N .

(6.7)


(b) Our goal is to find a0 , the coefficient of the λ0 = 1, i.e. the constant term in the characteristic polynomial. Inspecting Eq. (6.4) we observe that there are many terms that do not contain any factor of λ, and their sum gives a0 . A simple way to obtain the sum of all such terms is to set λ = 0, which then gives det(A) = a0 . You can see this also from Hassani Eqs. (6.8) and (6.9).

6.9 Assume that A and A' are similar matrices. Show that they have the same eigenvalues.

6.9 We have different options for approaching this problem. As a relativist at heart, I would prefer to say that the definition of an eigenvalue given in Hassani Definition 6.2.1 is an operator equation, A| a 〉 = λ| a 〉,

Hassani Eq.(6.5)

(6.8)

so that the validity is true in all bases in which we choose to represent the operators and vectors and eigenvalues. The eigenvalues, being scalars, do not transform under a change of basis. Voila! This might seem like a large bite to chew at this point, but after working through several examples of arguments like this in general relativity, see [11, 16], you’ll likely develop a taste for powerful arguments of this sort. Let’s reassure ourselves by developing an argument working only with matrices. In this case we start by asking ourselves, how are similar matrices related? The answer is the appropriately named similarity transformation Eq. (5.32) we encountered in Chap. 5 A' = RAR−1 .

(6.9)

This formula is easier to remember if you consider that A and A' apply the same operator A to a vector but do so in different bases and R is the invertible basis transformation matrix that maps the column vector representation v from the unprimed basis to the column vector representation v' = Rv in the primed basis. How does that help? We’ll see in a second when we apply the RHS of Eq. (6.9). First apply the LHS of Eq. (6.9) to an eigenvector with representation in the primed basis by the column vector v' : A ' v ' = λ' v ' .

(6.10)


We put a prime on the eigenvalue, for the moment, and then later we’ll prove that λ' = λ. Now let’s apply the RHS of Eq. (6.9) to the same vector v' . Now we see that the first thing that happens is that R−1 , the inverse of the basis transformation, maps the column vector from the primed to the unprimed bases, v = R−1 v' . This is exactly what we wanted for the next step, where A hits a vector v, both representations in the same unprimed basis: RAR−1 v' = RAv, = Rλv, = λRv, = λv' .

because R−1 maps components to unprimed basis because v is an eigenvector because λ is a scalar because R maps components to primed basis (6.11)

Comparing the RHSs of Eq. (6.10) and Eq. (6.11), and bearing in mind that by definition an eigenvector is nonzero, we conclude λ' = λ. Eigenvalues are invariant under similarity transformations.
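The NumPy sketch below is not part of the original solution; it illustrates the conclusion by showing that a matrix and its similarity transform share the same spectrum.

```python
import numpy as np

# Not from the text: the spectra of A and R A R^{-1} coincide.
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
R = rng.standard_normal((4, 4)) + 4 * np.eye(4)     # comfortably invertible
Ap = R @ A @ np.linalg.inv(R)
print(np.sort_complex(np.linalg.eigvals(A)))
print(np.sort_complex(np.linalg.eigvals(Ap)))       # same eigenvalues up to round-off
```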

6.12 Find all eigenvalues and eigenvectors of the following matrices:

$$A_1 = \begin{pmatrix} 1 & 1 \\ 0 & i \end{pmatrix}, \qquad B_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad C_1 = \begin{pmatrix} 2 & -2 & -1 \\ -1 & 3 & 1 \\ 2 & -4 & -1 \end{pmatrix},$$
$$A_2 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, \qquad B_2 = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad C_2 = \begin{pmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix},$$
$$A_3 = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad B_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \qquad C_3 = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \tag{6.12}$$

6.12 The business of finding eigenvalues and their eigenvectors is not complicated but does warrant some worked examples. I’ll give the details on matrix A1 to get us started and B3 because it gets “interesting”! And I’ll offer some words of advice for any pitfalls that I managed to find. As a general word of practical advice: the 3 × 3 matrices can lead to cubic equations that can be a lot of work to solve. Try to factor them immediately after writing down the determinant. In other words, don’t multiply the determinant out into a beautiful long cubic polynomial and then ponder how to solve it; that can lead to a lot of searching. Sometimes this requires a bit of creativity—see C3 .


A₁: We start with Hassani Eq. (6.7), with the operators A and 1 replaced by their matrix representations A and 1, which leads to the so-called characteristic polynomial, that we set to zero:

$$\det\begin{pmatrix} 1-\lambda & 1 \\ 0 & i-\lambda \end{pmatrix} = (1-\lambda)(i-\lambda) = 0. \tag{6.13}$$

Fortunately the characteristic polynomial in Eq. (6.13) is already factored for us, so we can read the eigenvalues directly as λ1 = 1,

λ2 = i.

(6.14)

(The labels λ₁ and λ₂ in Eq. (6.14) are completely arbitrary; there is no intrinsic order to the eigenvalues. But the labels are useful because we want to associate eigenvectors with their eigenvalues.) The eigenvalues each have algebraic multiplicity unity because they appear in Eq. (6.13) only once; contrast Eq. (6.13) with Eq. (6.20). Now to find the eigenvectors we choose one of our eigenvalues, say λ₁ = 1, and substitute it into the eigenvector equation Hassani Eq. (6.5), again replacing abstract operators and vectors by their matrix representations. In particular, we represent the as-yet-unknown vector by its two components (x, y)ᵗ:

$$\begin{pmatrix} 1-\lambda_1 & 1 \\ 0 & i-\lambda_1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\xrightarrow{\;\lambda_1 = 1\;}\; \begin{pmatrix} 0 & 1 \\ 0 & i-1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies y = 0. \tag{6.15}$$

We conclude y = 0, but what's the value of x? The equations tell us that here it is completely arbitrary, though it cannot be zero since the definition of an eigenvector stipulates that eigenvectors are nonzero vectors. This arbitrariness should not concern us because eigenvectors are always defined only up to an arbitrary (nonzero) scalar. (You can see this mathematically; multiply the abstract eigenvalue equation Hassani Eq. (6.5) by, say, 2, and you find that the vector 2| a ⟩ is a valid eigenvector just like | a ⟩:

$$A(2| a \rangle) = \lambda(2| a \rangle). \tag{6.16}$$


You can understand this physically; consider the example of the rotation of a vector about the z-axis. Any vector of nonzero length lying along the z-axis is unaffected and so represents an eigenvector of this rotation, regardless of its length.) Let's choose x = 1 so we can write our first eigenvector as the column vector v₁ = (1, 0)ᵗ. This is the eigenvector associated with eigenvalue λ₁; hence the same label. We must continue our analysis for all eigenvalues. Here there is just one more, λ₂ = i, and substitution into the eigenvector equation Hassani Eq. (6.5) gives the equations for the two components (x, y)ᵗ:

$$\begin{pmatrix} 1-\lambda_2 & 1 \\ 0 & i-\lambda_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\xrightarrow{\;\lambda_2 = i\;}\; \begin{pmatrix} 1-i & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies y = (i-1)x. \tag{6.17}$$

Again there is some arbitrariness. Let's choose x = 1 so that y = i − 1. We can write our second eigenvector as the column vector v₂ = (1, i − 1)ᵗ. Finally we observe for completeness that each of our eigenvalues is associated with a 1D eigenspace spanned by a single eigenvector. So we say the geometric multiplicity of each eigenvalue is unity. Contrast this situation with B₃, where there are two linearly independent eigenvectors associated with one of the eigenvalues.

A₂: The characteristic polynomial is a cubic, but it factors easily, by (1 − λ) in fact, so that finally it can be written

$$\lambda(1-\lambda)(2-\lambda) = 0. \tag{6.18}$$

We have three real eigenvalues, λ₁ = 0, λ₂ = 1, λ₃ = 2. There is nothing wrong with an eigenvalue being zero; it's the eigenvectors that must be nonzero. The eigenvectors are found as we did above for A₁ and can be written

$$v_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}. \tag{6.19}$$

A₃: The matrix is in upper triangular form so the determinant simplifies. Using the standard formula for the determinant one immediately finds the characteristic polynomial becomes simply

$$(1-\lambda)^3 = 0. \tag{6.20}$$


There is only one solution to this equation and hence only one eigenvalue, λ = 1, of algebraic multiplicity three. The geometric multiplicity is unity however because the solution space can be spanned by a single eigenvector. This eigenvector can be written ⎛ ⎞ 1 v = ⎝0⎠ . 0

(6.21)

B1 : The characteristic polynomial is λ2 = 0.

(6.22)

There is only one solution to this equation and hence only one eigenvalue, λ = 0, of algebraic multiplicity two. (The geometric multiplicity is unity however because it’s a 1D eigenspace.) The eigenvector can be written v=

( ) 1 . 0

(6.23)

B2 : The characteristic polynomial is a cubic. It’s important to identify the common factor (1 − λ) so that you can write the characteristic polynomial finally as (1 − λ)(λ + 1)(λ − 2) = 0.

(6.24)

We have three real eigenvalues each of algebraic multiplicity one: λ1 = 1, λ2 = −1, λ3 = 2. The eigenvectors can be written ⎛

⎞ −1 v1 = ⎝ 0 ⎠ , 1



⎞ 1 v2 = ⎝−2⎠ , 1

⎛ ⎞ 1 v3 = ⎝1⎠ . 1

(6.25)

B3 : The characteristic polynomial is a cubic and unfortunately there is no common factor immediately apparent. We have to expand the terms in the determinant, but then find some lucky cancellation. You should get λ2 (3 − λ) = 0.

(6.26)


The eigenvalues are λ1 = 0 (algebraic multiplicity 2) and λ2 = 3 (algebraic multiplicity 1). Let's search for the eigenvectors of the first eigenvalue together – it gets interesting. Substitute λ1 = 0 into the eigenvector equation Hassani Eq. (6.5), replacing abstract operators and vectors by their matrix representations. Represent the as-yet-unknown vector by its three components (x, y, z)^t:
$$
\begin{pmatrix} 1-\lambda_1 & 1 & 1 \\ 1 & 1-\lambda_1 & 1 \\ 1 & 1 & 1-\lambda_1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
$$
$$
\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{6.27}
$$

We have three identical equations all saying x + y + z = 0. We've seen redundancy in the equations for the eigenvectors before. In fact, we expect it because we forced the determinant to vanish. But this is double redundancy. How do we solve this equation? There are lots of ways. Let's set y = 0. Then x = −z, so we could choose v1a = (1, 0, −1)^t. (Note, I added a second part to the label, a redundancy index, anticipating that we will find more than one eigenvector associated with λ1.) And we could say z = 0, giving x = −y and v1b = (1, −1, 0)^t. There's a third possibility, setting x = 0, giving z = −y and v = (0, 1, −1)^t. Technically, this is another eigenvector. But we do not include it in the list because it is not linearly independent from the others we just found. In particular, v = v1a − v1b. Geometrically, v falls on the plane spanned by v1a and v1b. The dimension of this eigenspace, the so-called geometric multiplicity, is two for λ1 = 0. And that's the most it could be because the algebraic multiplicity (here two) provides a cap on the geometric multiplicity; see Supplementary Problem 6.47. For eigenvalue λ2 = 3 we find a single eigenvector v2 = (1, 1, 1)^t.

C1: The characteristic polynomial is a cubic. Expanding the terms in the determinant, we find some lucky cancellation and arrive at
$$
(2-\lambda)(\lambda-1)^2 = 0. \tag{6.28}
$$

The eigenvalues are λ1 = 2 and λ2 = 1. The first has eigenvector v1 = (1/2, −1/2, 1)^t. The second eigenvalue has algebraic multiplicity two and leads to three identical equations for the components of the associated eigenvectors, x − 2y − z = 0. The situation is analogous to that found with matrix B3 and is handled as described above. We find that two linearly independent eigenvectors span the associated eigenspace, for example v2a = (1, 0, 1)^t and v2b = (1, 1/2, 0)^t.

C2: The characteristic polynomial is a cubic. Expanding the terms in the determinant and searching for common factors you should spot (λ + 2) and eventually arrive at
$$
-(\lambda-1)(2+\lambda)^2 = 0. \tag{6.29}
$$

The eigenvalues are λ1 = 1 and λ2 = −2. The first gives the single eigenvector v1 = (1, 1, 1)^t, which you might spot immediately but if not, solve the linear system in the standard way. For example, using two equations to eliminate y you'll find 2x − 2z = 0, so x = z = 1 is one solution. Substituting these into any one of the three equations gives y = 1, confirming v1 found above. The second eigenvalue, λ2 = −2, results in the linear system Eq. (6.27) we have seen before with B3 and the eigenvalue λ = 0. Let's profit from that experience and copy our two eigenvectors: v2a = (1, 0, −1)^t and v2b = (1, −1, 0)^t.

C3: Solving the cubic characteristic polynomial requires a sharp eye for common factors. Let's have a look:
$$
\det\begin{pmatrix} -\lambda & 1 & 1 \\ 1 & -\lambda & 1 \\ 1 & 1 & -\lambda \end{pmatrix} = 0,
$$
$$
-\lambda(\lambda^2-1) - (-\lambda-1) + (1+\lambda) = 0,
$$
$$
-\lambda(\lambda^2-1) + 2(1+\lambda) = 0. \tag{6.30}
$$

We’ve arrived at a critical crossroads in this problem. Don’t expand, for if you do you’ll fall face first into the cubics. Factor! (λ2 − 1) = (λ − 1)(λ + 1), and then it’s clear sailing: −λ(λ − 1)(λ + 1) + 2(1 + λ) = 0, −(1 + λ)[λ2 − λ − 2] = 0, (1 + λ)(λ + 1)(λ − 2) = 0.

spotted (1 + λ) as global factor (6.31)

From here on, with the experience we’ve gained from the other problems, we are in familiar territory. We have two distinct eigenvalues. The first, λ1 = −1, has algebraic and geometric multiplicity two and results in the linear system

Eq. (6.27) we have seen twice before. Recall the two eigenvectors: v1a = (1, 0, −1)t and v1b = (1, −1, 0)t . The other eigenvalue λ2 = 2 has algebraic and geometric multiplicity unity and is associated with a single eigenvector, v2 = (1, 1, 1)t .
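For readers who like to double-check such hand calculations by machine, here is a short numerical cross-check (my addition, not part of Hassani's text) of the spectrum and geometric multiplicity we just found for C3, assuming numpy is available.

```python
# Numerical cross-check (not from Hassani) of the eigenvalues and geometric
# multiplicity found above for C3 = [[0,1,1],[1,0,1],[1,1,0]].
import numpy as np

C3 = np.array([[0., 1., 1.],
               [1., 0., 1.],
               [1., 1., 0.]])

print(np.sort(np.linalg.eigvalsh(C3)))            # expect [-1, -1, 2]

# Geometric multiplicity of lambda = -1: dimension of the null space of C3 + 1.
print(3 - np.linalg.matrix_rank(C3 + np.eye(3)))  # expect 2
```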

6.15 Consider (α1, α2, ..., αn) ∈ C^n and define E_ij as the operator that interchanges αi and αj. Find the eigenvalues of this operator.

6.15 I found this a challenging problem but also a great problem because you can approach it in at least three different ways and learn something new and interesting each time. There's (i) the "think like a physicist" approach, (ii) the "think like a slick mathematician" approach, and (iii) the brute force approach. Let's explore (i) and (ii) here and relegate (iii) to Supplementary Problem 6.40, which exercises your determinant calculating prowess.

(i) This operator interchanges only two of the components of the vector. When applied to an eigenvector it leaves the direction of the eigenvector unchanged though possibly scaling its magnitude by an amount λ. We can visualize this as follows. Draw an eigenvector with its tail at the origin and pointing in some direction. Draw the infinite straight line through the origin and along this vector extending to infinity in both directions. After being hit by the component interchange operator the resulting vector will still lie on this line. Consider now the Euclidean norm of the vector before and after it was hit by the operator. Interchanging two components leaves the norm unchanged. So after the operation the vector has to either remain completely unchanged, implying λ = 1, or it points in the opposite direction, implying λ = −1. That exhausts the physical possibilities, so the spectrum of the operator is simply λ = ±1.

(ii) The great physicist Steven Weinberg tells us that this little arrow with a tail at the origin version of a vector is the Kindergarten version (see Sect. 3.1 of Ref. [21]). So let's be sophisticated grown-ups about this. From the description of the operator as simply interchanging two components we conclude that applying the operator E_ij ∈ End(V) twice should take us back to the original vector, eigenvector or not. In other words
$$
E_{ij}(E_{ij}| a \rangle) = | a \rangle, \qquad \text{for any } | a \rangle \in V. \tag{6.32}
$$

Now let's apply the operator twice to an eigenvector, say | v ⟩, with eigenvalue λ:

$$
\begin{aligned}
E_{ij}(E_{ij}| v \rangle) &= E_{ij}(\lambda| v \rangle) \\
&= \lambda E_{ij}| v \rangle = \lambda^2 | v \rangle \\
&= | v \rangle. \qquad\text{used Eq. (6.32)}
\end{aligned} \tag{6.33}
$$

So the eigenvalue solves λ^2 − 1 = 0 (because the eigenvector cannot, by definition, be the zero vector). This proves the spectrum of the operator is simply λ = ±1. Of course there is nothing wrong with approach (i). It helps us see what's really going on here. But there is something very elegant about the slick mathematician approach too, which you will appreciate even more after you have slugged through approach (iii).
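If you want a quick numerical feel for approach (iii) before working through it, the following sketch (my own addition) represents E_ij as a permutation matrix for a small n and confirms that its spectrum is just {+1, −1}.

```python
# Brute-force numerical check (not part of the original solution): represent the
# swap operator E_ij on C^n as a permutation matrix and inspect its eigenvalues.
import numpy as np

n, i, j = 5, 1, 3              # swap components 1 and 3 (0-based) of C^5
E = np.eye(n)
E[[i, j]] = E[[j, i]]          # interchanging two rows of the identity gives E_ij

print(np.sort(np.linalg.eigvals(E).real))   # expect [-1, 1, 1, 1, 1]
```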

6.18 Show that ||Ax|| = ||A† x||, for all | x 〉 ∈ V if and only if A ∈ End(V) is normal.

6.18 This is Hassani Proposition 6.4.2. Suppose A is normal. Then for any | x ⟩ ∈ V,
$$
\begin{aligned}
\|Ax\|^2 &= \langle Ax | Ax \rangle = \langle x |A^\dagger A| x \rangle \\
&= \langle x |AA^\dagger| x \rangle \qquad\text{used A is normal} \\
&= \langle A^\dagger x | A^\dagger x \rangle = \|A^\dagger x\|^2.
\end{aligned} \tag{6.34}
$$

Because the norm of a vector is nonnegative, we take the positive square root of both sides and thereby confirm the first part of the proposition: ||Ax|| = ||A†x||, for all | x ⟩ ∈ V. Now we must show the converse: if ||Ax|| = ||A†x|| for all | x ⟩ ∈ V, then A must be normal. To show this implies normality we square both sides and arrive at
$$
\|Ax\|^2 = \|A^\dagger x\|^2. \tag{6.35}
$$

Reversing the above development, we arrive at
$$
\langle x |A^\dagger A| x \rangle = \langle x |AA^\dagger| x \rangle, \qquad \forall\, | x \rangle \in V. \tag{6.36}
$$

Rearranging this equation and using linearity, we write this as the expectation value of the commutator,
$$
\langle x |[A^\dagger, A]| x \rangle = 0, \qquad \forall\, | x \rangle \in V. \tag{6.37}
$$
By Hassani Theorem 2.3.8, [A†, A] = 0, so A must be normal.
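As a numerical illustration of this proposition (my addition, not Hassani's), the script below compares ||Ax|| and ||A†x|| for a normal and a non-normal 2 × 2 matrix.

```python
# ||Ax|| = ||A^dagger x|| holds for every x when A is normal, and fails otherwise.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)

N = np.array([[1., 1.], [-1., 1.]])   # normal (a scaled rotation), not hermitian
M = np.array([[1., 1.], [0., 1.]])    # not normal

for A in (N, M):
    print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(A.conj().T @ x)))
# expect: True for the normal matrix, False for the non-normal one
```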

6.21 Consider the matrix
$$
A = \begin{pmatrix} 4 & i & 1 \\ -i & 4 & -i \\ 1 & i & 4 \end{pmatrix}. \tag{6.38}
$$

(a) Find the eigenvalues of A. Hint: Try λ = 3 in the characteristic polynomial of A.
(b) For each λ, find a basis for M_λ, the eigenspace associated with the eigenvalue λ.
(c) Use the Gram-Schmidt process to orthonormalize the above basis vectors.
(d) Calculate the projection operators (matrices) P_i for each subspace and verify that Σ_i P_i = 1 and Σ_i λ_i P_i = A.
(e) Find the matrices √A, sin(πA/2), and cos(πA/2).
(f) Is A invertible? If so, find the eigenvalues and eigenvectors of A^{-1}.

6.21 (a) As in Problem 6.12, we start with Hassani Eq. (6.7), with the operators A and 1 replaced by their matrix representations A and 1, giving the characteristic polynomial whose roots give the eigenvalues:
$$
(4-\lambda)^3 - 3(4-\lambda) + 2 = 0. \tag{6.39}
$$

Cubics are notoriously laborious to solve so Hassani kindly offers a helping hand by pointing out that λ = 3 is a root. This means we can factor out (3 − λ). In the spirit of helping you develop skills that will help you fly after you leave the nest, let's pretend we didn't know λ = 3 is a root of our nasty Eq. (6.39). What to do!? You could plug it into Maple®, but then you don't learn anything about cubics. It's useful to notice that if we define z = (4 − λ) then our problem simplifies to a "depressed cubic", i.e. one without a squared term, which is easier to solve. We write
$$
z^3 + pz + q = 0, \tag{6.40}
$$

with p = −3 and q = 2. The three solutions to Eq. (6.40) are given in general by the so-called Cardano's formula,
$$
z_1 = u + v, \qquad z_2 = u\omega + v\omega^2, \qquad z_3 = u\omega^2 + v\omega, \tag{6.41}
$$

where
$$
u = \left( -\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} \right)^{1/3}, \qquad
v = \left( -\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} \right)^{1/3}, \tag{6.42}
$$

and ω is the primitive cube root of unity. Multiplying a number in the complex plane by ω = exp(i2π/3) rotates its representation about the origin by 120 degrees, so its cube is unity. Recall there are three values for the cube root of unity,
$$
\omega^0 = 1, \qquad \omega = \frac{-1 + i\sqrt{3}}{2}, \qquad \omega^2 = \frac{-1 - i\sqrt{3}}{2}. \tag{6.43}
$$

Cubics have discriminants that play a role similar to the discriminant of a quadratic equation; the discriminant vanishes if and only if the cubic has a multiple root. The discriminant D of a depressed cubic is based upon the quantity in the square root of Eq. (6.42),
$$
D = -(4p^3 + 27q^2). \tag{6.44}
$$

For our depressed cubic
$$
D = -(4p^3 + 27q^2) = -(4\cdot(-3)^3 + 27\cdot 2^2) = 0. \tag{6.45}
$$
This implies u = v = (−q/2)^{1/3} so that we have a double root:
$$
\begin{aligned}
z_1 &= 2u = 2\left(-\frac{q}{2}\right)^{1/3} = 2\,(-1)^{1/3} = -2, \\
z_2 &= z_3 = u(\omega + \omega^2) = \left(-\frac{q}{2}\right)^{1/3}(-1) = (1)^{1/3} = 1.
\end{aligned} \tag{6.46}
$$

Here we use only the simple cube roots because we have accounted for all three cube roots of unity with the ω in Cardano's formula. (This is analogous to using just the positive root for √(b^2 − 4ac) in the quadratic formula
$$
x = -\frac{b}{2} \pm \frac{1}{2}\sqrt{b^2 - 4ac}, \tag{6.47}
$$

because we already took into account both square roots with the plus/minus sign!) Now to find the eigenvalues we recall z = (4 − λ), so
$$
\lambda_1 = 4 - z_1 = 6, \qquad \lambda_2 = \lambda_3 = 4 - z_2 = 3. \tag{6.48}
$$
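If you want to check this Cardano calculation by machine, here is a small script (my addition, not Hassani's) for the depressed cubic above; it also compares with numpy's general root finder.

```python
# Numerical check of the Cardano roots for z^3 + p z + q = 0 with p = -3, q = 2.
import numpy as np

p, q = -3.0, 2.0
disc = q**2 / 4 + p**3 / 27                 # vanishes here, signalling a double root
u = np.cbrt(-q / 2 + np.sqrt(disc))
v = np.cbrt(-q / 2 - np.sqrt(disc))
w = np.exp(2j * np.pi / 3)                  # primitive cube root of unity

roots = np.array([u + v, u * w + v * w**2, u * w**2 + v * w])
print(np.round(roots, 10))                  # expect -2, 1, 1
print(np.roots([1, 0, p, q]))               # numpy's own cubic solver agrees
# The eigenvalues follow from lambda = 4 - z, i.e. 6, 3, 3.
```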

Expanding (6 − λ)(3 − λ)^2 and comparing with Eq. (6.39) we confirm these are indeed our eigenvalues. (b) (See Problem 6.15 for the worked examples of finding eigenvectors.) For λ1 = 6 we find a single eigenvector that we could write as a column vector

v1 = (1, −i, 1)^t. This vector spans the eigenspace M_λ1 associated with the eigenvalue λ1 = 6. This one vector is enough because the algebraic multiplicity of this root of the characteristic polynomial was unity and so we were guaranteed that the geometric multiplicity was at most unity; see Supplementary Problem 6.47. For λ2 = 3, the fact that the algebraic multiplicity is two opens the possibility of a 2D eigenspace so we should be on the alert to find possibly two linearly independent eigenvectors. Indeed, substituting (x, y, z)^t into (A − 3·1)(x, y, z)^t = (0, 0, 0)^t gives three identical relations between x, y and z:
$$
x + iy + z = 0. \tag{6.49}
$$

Equation (6.49) and the fact that eigenvectors are nonzero are the only constraints on x, y and z. Remember there's some arbitrariness here in selecting eigenvectors. One approach (again, see Problem 6.15 for worked examples) is to set one component to zero, another to one, and solve Eq. (6.49) for the third component. For example, z = 0, x = 1 gives y = i, or v2a = (1, i, 0)^t. And y = 0, x = 1 gives z = −1, or v2b = (1, 0, −1)^t. These two vectors are linearly independent and therefore span the eigenspace M_λ2 associated with the eigenvalue λ2 = 3.

(c) Eigenspace M_λ1 is 1D, so we simply divide our single eigenvector by its norm, calculated using the hermitian inner product (Hassani Definition 2.2.1)
$$
\langle v_1 | v_1 \rangle = \begin{pmatrix} 1 & i & 1 \end{pmatrix}\begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix} = 3. \tag{6.50}
$$

So the normalized eigenvector is v1' = (1/√3)(1, −i, 1)^t. Eigenspace M_λ2 is 2D, so we have to find two orthogonal vectors in the eigenspace and normalize them. Recall the Gram-Schmidt process, Hassani Sect. 2.2.2, gives a systematic procedure for doing so. Let's start by normalizing v2a,
$$
v_{2a}' = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \\ 0 \end{pmatrix}. \tag{6.51}
$$

Now remove the projection of | v2b ⟩ onto | v2a' ⟩ from | v2b ⟩, giving a new vector | v ⟩ orthogonal to | v2a' ⟩. (The notation here is v for the column vector that represents the abstract vector | v ⟩.)

$$
| v \rangle = | v_{2b} \rangle - \langle v_{2a}' | v_{2b} \rangle\, | v_{2a}' \rangle,
$$
$$
v = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}
- \left[\begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{-i}{\sqrt{2}} & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}\right]\begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} - \frac{1}{\sqrt{2}}\begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \\ 0 \end{pmatrix}
= \begin{pmatrix} 1/2 \\ -i/2 \\ -1 \end{pmatrix}. \tag{6.52}
$$

You can verify that | v ⟩ is orthogonal to | v2a' ⟩ but it's not yet normalized. So we divide it by its norm ||| v ⟩|| = √(3/2),
$$
v_{2b}' = \begin{pmatrix} 1/\sqrt{6} \\ -i/\sqrt{6} \\ -\sqrt{2/3} \end{pmatrix}. \tag{6.53}
$$

In summary, our orthonormal basis for M_λ2 is {| v2a' ⟩, | v2b' ⟩}, represented by the two column vectors in Eq. (6.51) and Eq. (6.53). This basis is not unique. We could have started our Gram-Schmidt process with v2b instead of v2a and would have ended up with a different basis. Obviously these basis vectors are eigenvectors of A; that's why we call it an eigenspace, it's a space of vectors all of which are eigenvectors of the same matrix. They have the same eigenvalue so the eigenvalue is a useful label for the space. (Reread Hassani Proposition 6.2.2 if you are not on board with this point.)
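The same Gram-Schmidt step is easy to reproduce numerically; the sketch below (my addition, not Hassani's) uses numpy's vdot, which conjugates its first argument, which is exactly the hermitian inner product used above.

```python
# Gram-Schmidt on the lambda = 3 eigenspace of A, with the hermitian inner product.
import numpy as np

v2a = np.array([1, 1j, 0])
v2b = np.array([1, 0, -1])

e1 = v2a / np.linalg.norm(v2a)            # normalize the first vector
w = v2b - np.vdot(e1, v2b) * e1           # subtract the projection onto e1
e2 = w / np.linalg.norm(w)                # normalize what is left

print(np.round(e2, 4))                    # approx (1/sqrt(6), -i/sqrt(6), -sqrt(2/3))
print(np.round(np.vdot(e1, e2), 12))      # orthogonality: expect 0
```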

(d) We are getting to the heart of the matter, since we are testing an instance of Hassani Theorem 6.2.10, which does indeed apply here. Recall from Hassani Sect. 6.1 that we can construct the projection operator P of a subspace M of a finite dimensional vector space V via an orthonormal basis of M,
$$
P = \sum_{k} | e_k \rangle\langle e_k |. \tag{6.54}
$$
Let's work with our matrix (row and column vector) representations of the orthonormal bases from (c) above. For eigenspace M_λ1 we have the projection matrix
$$
P_1 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix}\, \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & i & 1 \end{pmatrix}. \tag{6.55}
$$
(Don’t forget the complex conjugate when taking the transpose to convert from column to row vector!) Now multiply the column and row vectors to get a 3 × 3 matrix, ⎛ ⎞ 1 i 1 1⎝ −i 1 −i ⎠ . P1 = (6.56) 3 1 i 1 For eigenspace Mλ2 we repeat the above procedure, now with two elements in the sum ⎛ 1 ⎞ ⎛ 1 ⎞ √ √ 6 ( ( ) √ ) 2 ⎜ √i ⎟ √1 − √i 0 ⎜− √i ⎟ √1 √i √2 P2 = ⎝ 2 ⎠ 2 + − ⎠ ⎝ 6 2 √ 6 6 3 2 0 − √3 ⎛ 1 i ⎞ ⎛ 1 i −1 ⎞ ⎛ ⎞ 2 −i −1 −2 0 6 6 3 2 1 1 i ⎠ = ⎝ i 2 i ⎠. = ⎝ 2i 21 0⎠ + ⎝ −i (6.57) 6 6 3 3 −1 −i 2 −1 −i 2 0 0 0 3 3 3 We immediately confirm that the sum of the two projection matrices gives the identity, P1 + P2 = 1.

(6.58)

Furthermore
$$
\lambda_1 P_1 + \lambda_2 P_2 = \begin{pmatrix} 2 & 2i & 2 \\ -2i & 2 & -2i \\ 2 & 2i & 2 \end{pmatrix} + \begin{pmatrix} 2 & -i & -1 \\ i & 2 & i \\ -1 & -i & 2 \end{pmatrix} = \begin{pmatrix} 4 & i & 1 \\ -i & 4 & -i \\ 1 & i & 4 \end{pmatrix} = A, \tag{6.59}
$$
a most satisfying result.

(e) All the hard slugging has been done in the earlier steps. Here we just use the fruits of our labour. The square root of a matrix is the matrix √A such that its square is the original matrix,
$$
\sqrt{A}\,\sqrt{A} = A. \tag{6.60}
$$

The most direct solution is to think of the matrix as representing an operator and, because the square root can be expanded in a Taylor series, use the formula given by Hassani Eq. (6.15) for a function of an operator,

$$
f(A) = \sum_{i=1}^{r} f(\lambda_i)\, P_i, \qquad\text{Hassani Eq. (6.15)} \tag{6.61}
$$

where r is the number of eigenvalues. (The sum in Hassani Eq. (6.15) is over the eigenvalues so the upper limit should be r, the number of eigenvalues.) So here we have
$$
\sqrt{A} = \sum_{i=1}^{2} \sqrt{\lambda_i}\, P_i, \tag{6.62}
$$

which gives
$$
\sqrt{A} = \frac{1}{\sqrt{3}}\begin{pmatrix}
2+\sqrt{2} & i(-1+\sqrt{2}) & -1+\sqrt{2} \\
i(1-\sqrt{2}) & 2+\sqrt{2} & i(1-\sqrt{2}) \\
-1+\sqrt{2} & i(-1+\sqrt{2}) & 2+\sqrt{2}
\end{pmatrix}. \tag{6.63}
$$

Let’s find the matrices sin(πA/2) and cos(πA/2). We have functions of matrices so we can use the formula, given by Hassani Eq. (6.15) as a function of an operator, if the functions are expandable in a series. Both cosine and sine are expandable in power series gives (obtained from their Taylor series) so we have f (A) =

r ∑

f (λk )Pk ,

used Hassani Eq. (6.15)

k=1

sin

(π ) (π ) (π ) A = sin λ1 P1 + sin λ2 P2 , 2 2 ( 2) 3π = sin (3π ) P1 + sin P2 , 2 = −P2 ,

(6.64)

where P2 is the matrix we found above in Eq. (6.57). Similarly,
$$
\begin{aligned}
\cos\!\left(\frac{\pi}{2} A\right) &= \cos(3\pi)\, P_1 + \cos\!\left(\frac{3\pi}{2}\right) P_2 \\
&= -P_1,
\end{aligned} \tag{6.65}
$$
where P1 is the matrix we found above in Eq. (6.56).
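All of the results of parts (d) and (e) can be verified with a few lines of numpy; the following check (my addition, not part of the solution manual) rebuilds the projectors and the matrix functions from the eigenvectors found above.

```python
# Verify the spectral decomposition of A and the matrix functions computed above.
import numpy as np

A   = np.array([[4, 1j, 1], [-1j, 4, -1j], [1, 1j, 4]])
v1  = np.array([1, -1j,  1]) / np.sqrt(3)     # eigenvector for lambda = 6
v2a = np.array([1,  1j,  0]) / np.sqrt(2)     # orthonormal pair for lambda = 3
v2b = np.array([1, -1j, -2]) / np.sqrt(6)

P1 = np.outer(v1, v1.conj())
P2 = np.outer(v2a, v2a.conj()) + np.outer(v2b, v2b.conj())

print(np.allclose(P1 + P2, np.eye(3)))        # Eq. (6.58)
print(np.allclose(6 * P1 + 3 * P2, A))        # Eq. (6.59)

sqrtA = np.sqrt(6) * P1 + np.sqrt(3) * P2     # Eq. (6.63)
print(np.allclose(sqrtA @ sqrtA, A))

# Independent check of sin(pi A / 2) using numpy's hermitian eigensolver.
w, V = np.linalg.eigh(A)
sinA = V @ np.diag(np.sin(np.pi * w / 2)) @ V.conj().T
print(np.allclose(sinA, -P2))                 # Eq. (6.64)
```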

(f) Matrices are invertible iff none of their eigenvalues are zero. (That's because a matrix is invertible iff its determinant is nonzero, and the determinant is equal to the product of all the eigenvalues, including degeneracy; see Hassani Eq. (6.9).) So A is invertible because 6 × 3 × 3 ≠ 0. The eigenvalues of A^{-1} are
$$
\frac{1}{\lambda_1} = \frac{1}{6}, \qquad \frac{1}{\lambda_2} = \frac{1}{3}, \tag{6.66}
$$

and the eigenvectors are those we found for A (this result was stated in Problem 6.11).

6.24 Prove Lemma 6.6.4 by showing that
$$
\langle a |T^2 + \alpha T + \beta\mathbf{1}| a \rangle \geq \|Ta\|^2 - |\alpha|\, \|Ta\|\, \|a\| + \beta\, \langle a | a \rangle, \tag{6.67}
$$
which can be obtained from the Schwarz inequality in the form |⟨ a | b ⟩| ≥ −||a|| ||b||. Now complete the square on the right-hand side.

6.24 Thinking hard about Eq. (6.67) out of context is not a good approach because it might seem out of the blue. Let's back up and establish the context. Recall that our goal is to provide an alternative proof of Hassani Lemma 6.6.4, which states that if T is a self-adjoint operator on vector space V, and α and β are real numbers such that α^2 < 4β, then
$$
H \equiv T^2 + \alpha T + \beta\,\mathbf{1} \tag{6.68}
$$

is invertible. We know (Hassani Theorem 4.3.10) that all strictly positive operators are automorphisms (i.e. invertible), so it is sufficient to prove that H is strictly positive, i.e.
$$
\langle a |H| a \rangle = \langle a |T^2 + \alpha T + \beta\,\mathbf{1}| a \rangle > 0, \tag{6.69}
$$

for all nonzero | a ⟩ ∈ V. How do we show that this gives a number greater than zero? As always, the principal difficulty is really knowing where to begin and in what direction to push! First of all, it only makes sense to talk about a number being greater than zero if the number is at least real (and possibly in a subset of the reals, like the rationals or integers, etc.). But on a complex inner product space the inner

product is generally a complex number, so ⟨ a | b' ⟩ ∈ C. So how do we know that ⟨ a |H| a ⟩ ∈ R? Because T, and therefore H, is self-adjoint:
$$
\begin{aligned}
\langle a | b' \rangle = \langle a |H| a \rangle &= \langle a |H^\dagger| a \rangle^* \qquad\text{definition of adjoint} \\
&= \langle a |H| a \rangle^* \qquad\text{because H is self-adjoint} \\
&= \langle a | b' \rangle^* \\
\implies \langle a | b' \rangle &\in \mathbb{R}.
\end{aligned} \tag{6.70}
$$

Let’s expand the RHS of Eq. (6.69) to take stock of the task at hand: 〈 a |H| a 〉 = 〈 a |T2 | a 〉 + 〈 a |αT| a 〉 + 〈 a |β1| a 〉

(6.71)

Write | b ⟩ = T| a ⟩ so the first term ⟨ a |T^2| a ⟩ = ⟨ a |T^†T| a ⟩ = ⟨ b | b ⟩ is the squared norm of a vector, clearly a nonnegative quantity. The last term, ⟨ a |β1| a ⟩ = β ⟨ a | a ⟩, is the scalar β times the squared norm of the vector | a ⟩. We do actually know that β > 0: when α, β ∈ R,
$$
\alpha^2 < 4\beta \implies \beta > 0 \quad\text{and}\quad -2\sqrt{\beta} < \alpha < 2\sqrt{\beta}. \tag{6.72}
$$

So our attention should now focus on the possibly negative term (PNT),
$$
\langle a |\alpha T| a \rangle = \alpha\, \langle a | b \rangle. \tag{6.73}
$$

Of course it has to be real; for one thing all the other terms in Eq. (6.71) are real, and for another thing T is hermitian, so our previous argument applied to this PNT shows it's real. But it could be less than zero, so our job is to show that the positive terms must swamp this PNT. What can we do? One strategy you should definitely keep in mind is that of trying to construct a sort of worst case scenario. A step in this direction is to replace α ⟨ a | b ⟩ by −|α| |⟨ a | b ⟩|, for instance, writing
$$
\alpha\, \langle a | b \rangle \geq -|\alpha|\, |\langle a | b \rangle|. \tag{6.74}
$$

Now we have learned an inequality involving the dot product, the Schwarz inequality (Hassani Theorem 2.2.7),
$$
|\langle a | b \rangle|^2 \leq \|a\|^2\, \|b\|^2, \tag{6.75}
$$

so this should come to mind even without the helpful hand of Hassani pointing us in this direction. To put this to use here, we take the square root of both

sides and multiply by the non-positive number −|α|, which changes the direction of the inequality,
$$
-|\alpha|\, |\langle a | b \rangle| \geq -|\alpha|\, \|a\|\, \|b\|. \tag{6.76}
$$

Substitute this for the PNT in Eq. (6.71),
$$
\langle a |H| a \rangle \geq \|b\|^2 + \beta\|a\|^2 - |\alpha|\, \|a\|\, \|b\|. \tag{6.77}
$$

You might go back and reread the question at this point; what seemed potentially like an obscure question initially has evolved rather naturally, through consideration of the overall goal, into a partial solution and a clear path for the rest. As he advises, we complete the square,
$$
\|b\|^2 + \beta\|a\|^2 - |\alpha|\, \|a\|\, \|b\| = \left( \|b\| - \frac{|\alpha|}{2}\|a\| \right)^2 + \left( \beta - \frac{\alpha^2}{4} \right)\|a\|^2. \tag{6.78}
$$
Substituting this into Eq. (6.77) gives the desired result
$$
\langle a |H| a \rangle \geq \left( \|b\| - \frac{|\alpha|}{2}\|a\| \right)^2 + \left( \beta - \frac{\alpha^2}{4} \right)\|a\|^2, \tag{6.79}
$$

bearing in mind the restrictions on α. Putting this result in the context of Hassani Lemma 6.6.4, we note that this implies H is strictly positive because nonzero | a 〉 yields 〈 a |H| a 〉 > 0.
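A quick numerical sanity check of the lemma (my addition, not Hassani's): for a randomly generated hermitian T and any α, β with α² < 4β, the matrix T² + αT + β1 comes out positive definite.

```python
# For hermitian T and alpha^2 < 4*beta, H = T^2 + alpha*T + beta*1 is strictly positive.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = (X + X.conj().T) / 2                  # a random hermitian matrix
alpha, beta = 1.0, 2.0                    # alpha**2 < 4*beta

H = T @ T + alpha * T + beta * np.eye(4)
print(np.linalg.eigvalsh(H).min() > 0)    # expect True: H is positive definite
```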

6.27 Find the polar decomposition of the following matrices:
$$
A = \begin{pmatrix} 2i & 0 \\ \sqrt{7} & 3 \end{pmatrix}, \qquad
B = \begin{pmatrix} 41 & -12i \\ 12i & 34 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & -i \\ 1 & i & 0 \end{pmatrix}. \tag{6.80}
$$

6.27 Recall the goal, as revealed in the polar decomposition theorem, Hassani Theorem 6.7.1, is to express an operator T as a product U R of a positive operator R and an isometry U. Hassani Examples 6.7.2 and 6.7.3 show how this can be applied to matrices. Below I’ll give you the key results along the way so you can double check your work.

Matrix A: Step 1. Find A†A (careful, it's the dagger on the left! See Supplementary Problem 6.44 to see why it makes a difference):
$$
A^\dagger A = \begin{pmatrix} 11 & 3\sqrt{7} \\ 3\sqrt{7} & 9 \end{pmatrix} = R^2. \tag{6.81}
$$
Step 2. Spectrally decompose A†A. You'll find two eigenvalues, λ1 = 18 and λ2 = 2. After you normalize the eigenvectors, you should have:
$$
v_1 = \begin{pmatrix} 3/4 \\ \sqrt{7}/4 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} \sqrt{7}/4 \\ -3/4 \end{pmatrix}. \tag{6.82}
$$
From these you form the projection operators onto the associated eigenspaces. The symmetry is not an accident; because the eigenspaces are mutually orthogonal the projection operators must be hermitian (Hassani Proposition 6.1.3).
$$
P_1 = v_1 v_1^\dagger = \begin{pmatrix} \tfrac{9}{16} & \tfrac{3\sqrt{7}}{16} \\ \tfrac{3\sqrt{7}}{16} & \tfrac{7}{16} \end{pmatrix}, \tag{6.83}
$$
$$
P_2 = v_2 v_2^\dagger = \begin{pmatrix} \tfrac{7}{16} & -\tfrac{3\sqrt{7}}{16} \\ -\tfrac{3\sqrt{7}}{16} & \tfrac{9}{16} \end{pmatrix}. \tag{6.84}
$$
It's a useful debugging tool to check that P1 + P2 = 1. Step 3. With the spectral decomposition in hand, you can easily obtain the square root
$$
R = \sqrt{A^\dagger A} = \sqrt{\lambda_1}\, P_1 + \sqrt{\lambda_2}\, P_2
= \begin{pmatrix} \tfrac{17\sqrt{2}}{8} & \tfrac{3\sqrt{14}}{8} \\ \tfrac{3\sqrt{14}}{8} & \tfrac{15\sqrt{2}}{8} \end{pmatrix}. \tag{6.85}
$$
Step 4. Verify that R has an inverse. (Were all the eigenvalues of R^2 nonzero?) If so, calculate it:
$$
R^{-1} = \frac{1}{48}\begin{pmatrix} 15\sqrt{2} & -3\sqrt{14} \\ -3\sqrt{14} & 17\sqrt{2} \end{pmatrix}. \tag{6.86}
$$
There's a nice formula for 2 × 2 inverses:
$$
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \tag{6.87}
$$
Notice the factor in the denominator is the determinant. Furthermore, the determinant det(R) = √det(R^2) = √(λ1 λ2) = √(18 · 2) = 6. There's a 48 in Eq. (6.86) because we factored the 8 that appears in Eq. (6.85) outside the matrix too. Step 5. Finally an easy step to finish it all off:
$$
U = AR^{-1} = \frac{1}{8}\begin{pmatrix} 5\sqrt{2}\,i & -\sqrt{14}\,i \\ \sqrt{14} & 5\sqrt{2} \end{pmatrix}. \tag{6.88}
$$
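The whole five-step recipe is easy to automate; the sketch below (my addition, not Hassani's) redoes the polar decomposition of A numerically, building R from the eigendecomposition of A†A just as in Steps 1-3.

```python
# Numerical polar decomposition of A, following the same steps as above.
import numpy as np

A = np.array([[2j, 0], [np.sqrt(7), 3]])

w, V = np.linalg.eigh(A.conj().T @ A)        # spectral decomposition of R^2
R = V @ np.diag(np.sqrt(w)) @ V.conj().T     # the positive square root R
U = A @ np.linalg.inv(R)

print(np.allclose(U.conj().T @ U, np.eye(2)))   # U is unitary
print(np.allclose(U @ R, A))                    # A = U R
print(np.round(R, 4))                           # compare with Eq. (6.85)
```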

We can verify that it is unitary, U†U = 1. And of course we can verify that the polar decomposition worked: A = UR.

Matrix B: Step 1. Find B†B:
$$
B^\dagger B = \begin{pmatrix} 41 & -12i \\ 12i & 34 \end{pmatrix}\begin{pmatrix} 41 & -12i \\ 12i & 34 \end{pmatrix}
= \begin{pmatrix} 1825 & -900i \\ 900i & 1300 \end{pmatrix}
= 25\begin{pmatrix} 73 & -36i \\ 36i & 52 \end{pmatrix} = R^2. \tag{6.89}
$$
Step 2. Spectrally decompose B†B. You'll find two eigenvalues, λ1 = 2500 and λ2 = 625. (To simplify a little bit you could divide R^2 by 25, find the eigenvalues 100 and 25, then multiply these by 25.) After you normalize the eigenvectors, you should have:
$$
v_1 = \frac{1}{5}\begin{pmatrix} -4i \\ 3 \end{pmatrix}, \qquad
v_2 = \frac{1}{5}\begin{pmatrix} 3i \\ 4 \end{pmatrix}. \tag{6.90}
$$
For the projection operators onto the associated eigenspaces you obtain the following (necessarily hermitian) matrices:
$$
P_1 = v_1 v_1^\dagger = \frac{1}{25}\begin{pmatrix} 16 & -12i \\ 12i & 9 \end{pmatrix}, \tag{6.91}
$$
$$
P_2 = v_2 v_2^\dagger = \frac{1}{25}\begin{pmatrix} 9 & 12i \\ -12i & 16 \end{pmatrix}. \tag{6.92}
$$
Check that P1 + P2 = 1. Step 3. With the spectral decomposition in hand, you can easily obtain the square root
$$
R = \sqrt{B^\dagger B} = \sqrt{\lambda_1}\, P_1 + \sqrt{\lambda_2}\, P_2 = \begin{pmatrix} 41 & -12i \\ 12i & 34 \end{pmatrix}. \tag{6.93}
$$

Gosh, this looks familiar! We found R = B. Did something go wrong? No, not at all. It just means that U = 1. And that's fine because the identity matrix is obviously unitary. So, mission accomplished: B = UR = R.

Matrix C: Step 1. Find C†C:
$$
C^\dagger C = \begin{pmatrix} 2 & i & 1 \\ -i & 2 & -i \\ 1 & i & 2 \end{pmatrix} = R^2. \tag{6.94}
$$
Step 2. Spectrally decompose C†C. You'll find two eigenvalues, λ1 = 4 and λ2 = 1, the second with algebraic multiplicity two; we must have (and indeed we will soon find) a 2D eigenspace for this eigenvalue. After you normalize the eigenvectors, you might have:
$$
v_1 = \begin{pmatrix} 1/\sqrt{3} \\ -i/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix}, \qquad
v_{2a} = \begin{pmatrix} -1/\sqrt{2} \\ 0 \\ 1/\sqrt{2} \end{pmatrix}, \qquad
v_{2b} = \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \\ 0 \end{pmatrix}. \tag{6.95}
$$
But hang on! If you proceed things will go horribly wrong. Why? You need an orthonormal basis to form the projection operators, emphasis on the orthogonality bit. "But, but eigenvectors are orthogonal", you plead? Only if they are associated with different eigenvalues. Let's back up. We search for eigenvectors associated with λ2 from the equation (R^2 − λ2 1)(x, y, z)^t = (0, 0, 0)^t:
$$
\begin{aligned}
x + iy + z &= 0, \\
-ix + y - iz &= 0, \\
x + iy + z &= 0.
\end{aligned} \tag{6.96}
$$

Obviously the first and last equations are linearly dependent, so we can just work with the first two. Now trying to solve these we discover the first and second equations are identical (multiply equation two by i). In summary, we have one constraint on three coordinates, giving us two degrees of freedom. We can find two linearly independent eigenvectors to span this 2D eigenspace (geometric multiplicity of 2). With y = 0 we must have z = −x, so our first eigenvector for λ2, after normalization, is v2a in Eq. (6.95). (You could work with a different eigenvector, as long as it has eigenvalue λ2 = 1 and is normalized.) Now we search for a second, linearly independent one using the same equations. Setting z = 0 should give a linearly independent one, with y = ix, so a second eigenvector for λ2 points in the direction v = (1, i, 0). This might

be, and indeed is, partly in the direction of v2a. That's no good! We want to remove that bit using our Gram-Schmidt toolkit,
$$
v' = v - (v_{2a}^\dagger v)\, v_{2a}. \tag{6.97}
$$

Note v' is still an eigenvector with eigenvalue λ2 because we obtained it from a linear combination of eigenvectors with eigenvalue λ2. And it's orthogonal to v2a. Now we must normalize this vector:
$$
v_{2b} = \left( \frac{1}{\sqrt{v'^\dagger v'}} \right) v' = \begin{pmatrix} 1/\sqrt{6} \\ i\sqrt{2/3} \\ 1/\sqrt{6} \end{pmatrix}. \tag{6.98}
$$

† † P2 = v2a v2a + v2b v2b , ⎛ ⎞ 2 −i −1 1 = ⎝ i 2 i ⎠. 3 −1 −i 2

(6.99)

Check that P1 + P2 = 1. Step 3. Now we can find the square root, √ √ √ A† A = λ1 P1 + λ2 P2 , ⎛ ⎞ 4 i 1 1⎝ −i 4 −i ⎠ . = 3 1 i 4

R=

(6.100)

(We can verify that it is indeed positive because it is hermitian and has all positive eigenvalues, the positive square roots of λ1 and λ2 .) Step 4. R has an inverse because all the eigenvalues are nonzero. It is, after some laborious calculations, found to be, ⎛ ⎞ 5 −i −1 1 (6.101) R−1 = ⎝ i 5 i ⎠ . 6 −1 −i 5

6.1 Problems

157

Step 5. Finally an easy step to finish it all off: U = CR−1

⎛ ⎞ 2 −i 2 1⎝ i 2 −2i ⎠ . = 3 2 2i −1

(6.102)

Note that U is unitary, U† U = 1, and that our efforts were not in vain because indeed C = UR.

6.30 Find the unitary matrices that diagonalize the following hermitian matrices: (

) 2 −1 + i , −1 − i −1 ⎛ ⎞ 1 −1 −i B1 = ⎝−1 0 i ⎠ , i −i −1 A1 =

(

) 3 i −i 3 ⎛ ⎞ 2 0 i B2 = ⎝ 0 −1 −i ⎠ . −i i 0

(

A2 =

A3 =

) 1 −i , i 0 (6.103)

6.30 By Hassani Corollary 6.4.13 we are guaranteed the existence of a unitary matrix that diagonalizes a hermitian matrix. (Nothing has been said about its uniqueness so technically speaking we should refer to "a" unitary transformation rather than "the" unitary transformation.) We can follow the procedure of Hassani Example 6.4.14, though I suggest we use slightly different notation. Let's call R a basis transformation matrix mapping vector components from an eigenbasis to the non-eigenbasis in which the matrix was originally given. To be consistent with Hassani Example 6.4.14 we will later identify this as the hermitian transpose of a unitary matrix, i.e. U† = R. We learned in Hassani's Sect. 6.4 that a normal operator has a diagonal representation in its eigenbasis. So the columns of R are given by a set of orthonormal eigenvectors (read Problem 5.27 if you are not following the logic). So we have identified the problem as an eigenvalue/vector problem.

Matrix A1: Proceeding in the standard way with this eigenvalue/vector problem, we find two simple eigenvalues λ1 = (1 + √17)/2 and λ2 = (1 − √17)/2. Two normalized eigenvectors are
$$
v_1 = \begin{pmatrix} \dfrac{2-2i}{\sqrt{34-6\sqrt{17}}} \\[2mm] \dfrac{3-\sqrt{17}}{\sqrt{34-6\sqrt{17}}} \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} \dfrac{2-2i}{\sqrt{34+6\sqrt{17}}} \\[2mm] \dfrac{3+\sqrt{17}}{\sqrt{34+6\sqrt{17}}} \end{pmatrix}. \tag{6.104}
$$

So the unitary transformation U = R†, the hermitian transpose of the basis transformation matrix obtained from these orthonormal eigenvectors, is:
$$
U = R^\dagger = \begin{pmatrix} \dfrac{2+2i}{\sqrt{34-6\sqrt{17}}} & \dfrac{3-\sqrt{17}}{\sqrt{34-6\sqrt{17}}} \\[2mm] \dfrac{2+2i}{\sqrt{34+6\sqrt{17}}} & \dfrac{3+\sqrt{17}}{\sqrt{34+6\sqrt{17}}} \end{pmatrix}. \tag{6.105}
$$

It’s laborious but straightforward to verify that indeed U diagonalizes A1 , giving D = UA1 U† ,

(6.106)

where D = diag(λ1, λ2). How do we remember where the dagger goes? I think of it this way. On the LHS of Eq. (6.106) we are manifestly working in the eigenbasis, so the RHS must be too. A1 is in the non-eigenbasis, so the first thing we need on the far right of the RHS is a transformation of vector components from the eigenbasis to the non-eigenbasis. That's what our matrix R does. In Hassani Example 6.4.14, this R = U†.

Matrix A2: This problem does not throw up any new hurdles. If you can do the last one you can do this one. Here are the results: two simple (non-degenerate) eigenvalues, λ1 = 4, λ2 = 2. Two normalized eigenvectors (yours could differ by a complex scalar of absolute value unity):
$$
v_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix}, \qquad
v_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}. \tag{6.107}
$$

A unitary transformation mapping the matrix from its given basis to an eigenbasis, in which the matrix will be diagonal, is
$$
U = \frac{1}{\sqrt{2}}\begin{pmatrix} -i & 1 \\ 1 & -i \end{pmatrix}. \tag{6.108}
$$

Matrix A3: This problem does not throw up any new hurdles either. If you can do the last two you can do this one too. Here are the results: two simple (non-degenerate) eigenvalues, λ1 = (1 + √5)/2, λ2 = (1 − √5)/2. Two normalized eigenvectors (yours could differ by a complex scalar of absolute value unity):
$$
v_1 = \begin{pmatrix} \dfrac{2}{\sqrt{10-2\sqrt{5}}} \\[2mm] \dfrac{i(\sqrt{5}-1)}{\sqrt{10-2\sqrt{5}}} \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} \dfrac{2i}{\sqrt{10+2\sqrt{5}}} \\[2mm] \dfrac{1+\sqrt{5}}{\sqrt{10+2\sqrt{5}}} \end{pmatrix}. \tag{6.109}
$$

A unitary transformation mapping the matrix from its given basis to an eigenbasis, in which the matrix will be diagonal, is
$$
U = \begin{pmatrix} \dfrac{2}{\sqrt{10-2\sqrt{5}}} & \dfrac{-i(\sqrt{5}-1)}{\sqrt{10-2\sqrt{5}}} \\[2mm] \dfrac{-2i}{\sqrt{10+2\sqrt{5}}} & \dfrac{1+\sqrt{5}}{\sqrt{10+2\sqrt{5}}} \end{pmatrix}. \tag{6.110}
$$

Matrix B1: Conceptually there is no new challenge here, but the computations get very messy. I did them numerically to double precision, but I'll round them off. Three simple (non-degenerate) eigenvalues: λ1 = 2.21432, λ2 = −1.67513, λ3 = −0.53919. Three normalized eigenvectors (yours could differ by a complex scalar of absolute value unity):
$$
v_1 = \begin{pmatrix} -0.75579i \\ 0.52066i \\ 0.39711 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} 0.17215i \\ -0.42713i \\ 0.88765 \end{pmatrix}, \qquad
v_3 = \begin{pmatrix} 0.63178i \\ 0.73924i \\ 0.23319 \end{pmatrix}. \tag{6.111}
$$

A unitary transformation U mapping the matrix B1 from its given basis to an eigenbasis, in which the matrix is diagonal, diag(λ1, λ2, λ3) = U B1 U†, is approximately
$$
U = \begin{pmatrix} 0.75579i & -0.52066i & 0.39711 \\ -0.17215i & 0.42713i & 0.88765 \\ -0.63178i & -0.73924i & 0.23319 \end{pmatrix}. \tag{6.112}
$$

Matrix B2: Nothing new here, just very messy computations of the sort we did above, again presented to 5 decimal places. Three simple (non-degenerate) eigenvalues: λ1 = 2.46050, λ2 = −1.69963, λ3 = 0.23912. Three normalized eigenvectors (yours could differ by a complex scalar of absolute value unity):
$$
v_1 = \begin{pmatrix} 0.90175i \\ -0.12000i \\ 0.41526 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} -0.15312i \\ 0.80971i \\ 0.56650 \end{pmatrix}, \qquad
v_3 = \begin{pmatrix} -0.40422i \\ -0.57443i \\ 0.71179 \end{pmatrix}. \tag{6.113}
$$

A unitary transformation U mapping the matrix B2 from its given basis to an eigenbasis, in which the matrix is diagonal, diag(λ1, λ2, λ3) = U B2 U†, is approximately
$$
U = \begin{pmatrix} -0.90175i & 0.12000i & 0.41526 \\ 0.15312i & -0.80971i & 0.56650 \\ 0.40422i & 0.57443i & 0.71179 \end{pmatrix}. \tag{6.114}
$$
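Since these last two diagonalizations were done numerically anyway, it is worth knowing how little code they take; the sketch below (my addition) reproduces the B2 results with numpy's hermitian eigensolver (the ordering and overall phases of the eigenvectors may differ from those quoted above).

```python
# Numerically diagonalize the hermitian matrix B2 and verify U B2 U^dagger is diagonal.
import numpy as np

B2 = np.array([[2, 0, 1j], [0, -1, -1j], [-1j, 1j, 0]])

w, V = np.linalg.eigh(B2)        # columns of V are orthonormal eigenvectors
U = V.conj().T                   # a unitary that diagonalizes B2

print(np.round(w, 5))                                  # compare with the eigenvalues above
print(np.allclose(U @ B2 @ U.conj().T, np.diag(w)))    # expect True
```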

6.2 Supplementary Problems

6.32 Prove Hassani Proposition 6.1.2 that M⊥ is a subspace of the inner product space V, where M⊥ is the orthogonal complement of M, an arbitrary subspace of V.

6.33 Let M ≡ Span{A^k | a ⟩}_{k=0}^{N}, where | a ⟩ is an arbitrary vector in the N-dimensional vector space V, and A is some linear operator on V. Show that M is an invariant subspace of A.

6.33 M is a subspace of V by construction. This is clear because M is defined above as the span of a non-empty set of vectors within V; see Problem 2.6. Let | b ⟩ be an arbitrary vector in M. To show that M is invariant under A, we must show that A| b ⟩ ∈ M. By the definition of M, there are scalars βk such that
$$
| b \rangle = \sum_{k=0}^{N} \beta_k A^k |a\rangle = \beta_N A^N |a\rangle + \sum_{k=0}^{N-1} \beta_k A^k |a\rangle. \tag{6.115}
$$
Note | b ⟩ is expressed as a linear combination of a set of N + 1 vectors in an N-dimensional vector space. Thus this set is linearly dependent. From Eq. (2.44) this means that there exists a nontrivial set of scalars {α_k}_{k=0}^{N} such that
$$
\alpha_N A^N |a\rangle + \sum_{k=0}^{N-1} \alpha_k A^k |a\rangle = 0.
$$

If α_N ≠ 0, divide by α_N and write A^N|a⟩ in terms of the other vectors. If α_N = 0, apply A on the remaining sum to get
$$
\alpha_{N-1} A^N |a\rangle + \sum_{k=0}^{N-2} \alpha_k A^{k+1} |a\rangle = 0.
$$
If α_{N−1} ≠ 0, divide by α_{N−1} and write A^N|a⟩ in terms of the other vectors. If α_{N−1} = 0, apply A on the remaining sum to get
$$
\alpha_{N-2} A^N |a\rangle + \sum_{k=0}^{N-3} \alpha_k A^{k+2} |a\rangle = 0
$$
and continue as before. Let m be the first integer such that α_{N−m} ≠ 0. Then
$$
\alpha_{N-m} A^N |a\rangle + \sum_{k=0}^{N-m-1} \alpha_k A^{k+m} |a\rangle = 0
$$
and
$$
A^N |a\rangle + \sum_{k=0}^{N-m-1} \frac{\alpha_k}{\alpha_{N-m}} A^{k+m} |a\rangle = 0.
$$
Letting j = k + m, we get
$$
A^N |a\rangle = -\sum_{j=m}^{N-1} \frac{\alpha_{j-m}}{\alpha_{N-m}} A^j |a\rangle \equiv \sum_{j=0}^{N-1} \eta_j A^j |a\rangle,
\qquad
\eta_j \equiv \begin{cases} 0 & \text{if } j < m, \\[1mm] -\dfrac{\alpha_{j-m}}{\alpha_{N-m}} & \text{if } j \geq m. \end{cases}
$$
Substitute this in Eq. (6.115) to get
$$
| b \rangle = \beta_N \left( \sum_{k=0}^{N-1} \eta_k A^k |a\rangle \right) + \sum_{k=0}^{N-1} \beta_k A^k |a\rangle = \sum_{k=0}^{N-1} (\beta_N \eta_k + \beta_k)\, A^k |a\rangle.
$$
It is now obvious that A| b ⟩ ∈ M, and that M is an invariant subspace of V.
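A small numerical illustration (my addition, not part of the problem set) makes the invariance concrete: build the span of {A^k|a⟩} as a matrix of column vectors and check that applying A does not enlarge it.

```python
# The span M of {A^k a} is invariant under A: appending the columns of A K to K
# does not increase the rank.
import numpy as np

# A acts as a rotation on the first two coordinates and stretches the last two,
# and a lies in the rotation plane, so M should be that 2D plane.
A = np.array([[0., 1., 0., 0.],
              [-1., 0., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.]])
a = np.array([1., 0., 0., 0.])
N = 4

K = np.column_stack([np.linalg.matrix_power(A, k) @ a for k in range(N + 1)])
print(np.linalg.matrix_rank(K))                        # dim M = 2 in this example
print(np.linalg.matrix_rank(np.hstack([K, A @ K])))    # still 2: A K stays inside M
```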

6.34 Show that the eigenspace M_λ of an operator A is an invariant subspace of A.

6.35 Spell out the logic of the proof of Hassani Theorem 6.1.6 following the argument provided immediately before this theorem.

6.36 Find the eigenvalue(s) of the matrix A,

$$
A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}. \tag{6.116}
$$

Compare the geometric and algebraic multiplicity for each eigenvalue.

6.36 As in Problem 6.12, we start with Hassani Eq. (6.7), with the operators replaced by the matrices A and 1, which leads to the characteristic polynomial, that we set to zero:
$$
\det\begin{pmatrix} 1-\lambda & 2 \\ 0 & 1-\lambda \end{pmatrix} = (1-\lambda)^2 = 0. \tag{6.117}
$$
Fortunately the characteristic polynomial in Eq. (6.117) is already factored for us, so we can read the eigenvalue directly as
$$
\lambda = 1. \tag{6.118}
$$

That’s all; there is only one eigenvalue. Its algebraic multiplicity is m = 2 because the term with this eigenvalue is squared in the characteristic polynomial in Eq. (6.117). Note that the characteristic polynomial must be factored, as in det(A − λ1) =

p ∏

(λ j − λ)m j ,

HassaniEq.(6.8)

(6.119)

j=1

to read the algebraic multiplicity. To find the geometric multiplicity we must find the eigenvector(s) and the dimension of the space they span. We substitute the eigenvalue into the eigenvector equation Hassani Eq. (6.5), again replacing abstract operators and vectors by their matrix representations. In particular, we represent the as-yetunknown vector by its two components (x, y)t : (

)( ) ( ) x 0 1−λ 2 = , 0 1−λ y 0 ( )( ) ( ) 02 x 0 = , 00 y 0 =⇒ y = 0.

set λ = 1 (6.120)

We conclude y = 0, and the value of x is arbitrary, except that we cannot have x = 0 because eigenvectors are defined as nonzero vectors. For example, v = (1, 0)t represents our single eigenvector for the eigenvalue λ = 1. This single vector v spans a one-dimensional eigenspace so the geometric multiplicity of the eigenvalue is unity, which is less than its algebraic multiplicity.
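As usual, this is easy to confirm numerically; the two-liner below (my addition) computes the eigenvalues and the dimension of the null space of A − 1·1.

```python
# A = [[1, 2], [0, 1]] has eigenvalue 1 with algebraic multiplicity 2 but
# geometric multiplicity 1 (the matrix is "defective").
import numpy as np

A = np.array([[1., 2.], [0., 1.]])
print(np.linalg.eigvals(A))                      # [1., 1.]
print(2 - np.linalg.matrix_rank(A - np.eye(2)))  # geometric multiplicity: 1
```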

6.37 Find the solution to the complex polynomial equation,
$$
(a + ib)^2 = i. \tag{6.121}
$$

6.38 Long ago ancient mathematicians discovered negative numbers by subtracting a larger natural number from a smaller one and realizing that the answer did not belong to the natural numbers. Realizing that the square root of 2 did not have an answer in the rationals led to the discovery of the irrationals. Similarly, searching for the square root of a negative number led to the complex numbers. (Of course I'm simplifying history; the reader interested in the real story of the history of numbers should consult a work dedicated to this, for example Ref. [5].) Does every complex number have a square root that is also a complex number? If not we would have to enlarge the complex numbers to include such numbers. (Let's call them the "Rob numbers"; it was my suggestion after all!) Prove that there are of course no silly "Rob numbers". Hint: An instructive solution exists by showing that the equation
$$
(a + ib)^2 = c + id \tag{6.122}
$$

has a solution a, b ∈ R for all values of c, d ∈ R. Another possibility is to use the Euler formula. This result is a special case of a more general property, the algebraic closure of the complex numbers. Hassani refers to this in the final paragraph of Sect. 6.3. A fuller consideration will be given in Chap. 10; see Hassani Theorem 10.5.6, the fundamental theorem of algebra.

6.39 Find the eigenvalue(s) of the matrix A,
$$
A = \begin{pmatrix} 1 & 1 \\ 1 & i \end{pmatrix}. \tag{6.123}
$$

Hint: The matrix here was deliberately chosen such that the complex algebra did not simplify, unlike most of those in Hassani Problem 6.12. Instead of trying to find the square root √(1 − i/2), just express your answer in terms of such expressions.

6.39 As in Problem 6.12, we start with Hassani Eq. (6.7), with the operators replaced by the matrices A and 1, which leads to the characteristic polynomial, that we set to zero:
$$
\det\begin{pmatrix} 1-\lambda & 1 \\ 1 & i-\lambda \end{pmatrix} = (1-\lambda)(i-\lambda) - 1 = \lambda^2 - (1+i)\lambda - 1 + i = 0. \tag{6.124}
$$

Eq. (6.124) is a second order polynomial equation so we can use the quadratic formula,
$$
\lambda = \frac{1+i \pm \sqrt{(1+i)^2 - 4(-1+i)}}{2} = \frac{1+i}{2} \pm \sqrt{1 - i/2}. \tag{6.125}
$$
So there are two eigenvalues, each of algebraic multiplicity m = 1.

6.40 Recall that in Problem 6.15 we found that the operator E_ij that interchanges the components i and j of a vector in C^N has eigenvalues λ = ±1. Find the algebraic multiplicity of these two eigenvalues. Hint: Represent the operator as an N × N matrix and find all its eigenvalues. (This is the brute force approach to Problem 6.15.) Let A = E_ij − λ1 and without loss of generality specify i < j. The characteristic polynomial is obtained from
$$
\det(A) = \sum_{\ell=1}^{N} (A)_{\ell m} (-1)^{\ell+m} M_{\ell m}, \qquad\text{used Hassani Eqs. (5.26, 5.27)} \tag{6.126}
$$
where m is any fixed column and M_{ℓm} is the minor of order N − 1. If you choose m = i then you only need the following to find the determinant:
$$
\begin{aligned}
(A)_{ii} &= -\lambda, & M_{ii} &= (1-\lambda)^{N-2}(-\lambda), \\
(A)_{ji} &= 1, & M_{ji} &= (1-\lambda)^{N-2}(-1)^{i+j-1}, \\
(A)_{ki} &= 0, \quad k \neq i, j. & &
\end{aligned} \tag{6.127}
$$
You should draw a very good diagram and then verify the above formula. Finally you should find a characteristic polynomial:
$$
\det(A) = (1-\lambda)^{N-1}(\lambda + 1). \tag{6.128}
$$

6.41 For endomorphism A : V → V, show that A − λ1 is normal iff A is normal.

6.42 Let A : V → V be an invertible and diagonalizable endomorphism on an N-dimensional complex inner product space V. Hassani Example 6.4.12 explains an important and practical method to find the largest and smallest eigenvalues of A, sometimes called the power method.
(a) Would the power method work if the largest eigenvalue were degenerate?
(b) Consider the special case that two distinct eigenvalues have the same norm,
$$
|\lambda_1| = |\lambda_2|, \qquad \lambda_1 \neq \lambda_2, \tag{6.129}
$$
and also happen to be the largest,
$$
|\lambda_i| < |\lambda_1|, \quad\text{for all } i > 2. \tag{6.130}
$$

What would the algorithm yield in this case?
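Before turning to the solution, it may help to see the power method itself in a few lines of code; this sketch (my addition, and not the degenerate case asked about) iterates on a matrix whose largest eigenvalue is simple.

```python
# Minimal power-method sketch for a matrix with a non-degenerate largest eigenvalue.
import numpy as np

rng = np.random.default_rng(3)
A = np.diag([5.0, 2.0, 1.0])          # largest eigenvalue 5 is simple here
v = rng.standard_normal(3)

for _ in range(50):
    v = A @ v
    v = v / np.linalg.norm(v)         # renormalize to avoid overflow

print(v @ A @ v)                      # Rayleigh quotient converges to 5.0
```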

6.42 (a) We address this problem by repeating the argument of Hassani Example 6.4.12 but for the case of a degenerate largest eigenvalue. Let | v ⟩ be an arbitrary nonzero vector in the inner product space V. Note that because A is diagonalizable, there exists a basis, say B, for V consisting of eigenvectors of A. Let B = {| b_k ⟩}_{k=1}^{N}, with corresponding eigenvalues λ_k that have been ordered from largest λ_1 to smallest λ_N. By giving N distinct names to the eigenvalues, the degeneracy of the largest eigenvalue means that
$$
\lambda_1 = \lambda_2 = \cdots = \lambda_m, \tag{6.131}
$$

where m ≥ 2 is the dimension of the eigenspace M_λ1 corresponding to the largest eigenvalue. Applying the operator A to | v ⟩ n times gives
$$
\begin{aligned}
A^n |v\rangle &= A^n \left( \sum_{k=1}^{N} \xi_k | b_k \rangle \right) \qquad\text{expand } | v \rangle \text{ in eigenvector basis} \\
&= \sum_{k=1}^{N} \xi_k A^n | b_k \rangle \qquad\text{used linearity} \\
&= \sum_{k=1}^{N} \xi_k \lambda_k^n | b_k \rangle \qquad\text{because eigenvector basis} \\
&= \lambda_1^n \left( \xi_1 | b_1 \rangle + \xi_2 | b_2 \rangle + \cdots + \xi_m | b_m \rangle + \sum_{k=m+1}^{N} \frac{\lambda_k^n}{\lambda_1^n}\, \xi_k | b_k \rangle \right). 
\end{aligned} \tag{6.132}
$$

Now for large enough n we can make the term involving the ratio of eigenvalues arbitrarily small:
$$
\left| \frac{\lambda_k^n}{\lambda_1^n} \right|
$$