English Pages 168 Year 2015
Feynman Simplified 2C: Electromagnetism: in Relativity & in Dense Matter Everyone’s Guide to the Feynman Lectures on Physics by Robert L. Piccioni, Ph.D.
Copyright © 2015 by Robert L. Piccioni Published by Real Science Publishing 3949 Freshwind Circle Westlake Village, CA 91361, USA Edited by Joan Piccioni
All rights reserved, including the right of reproduction in whole or in part, in any form. Visit our web site www.guidetothecosmos.com
Everyone’s Guide to the Feynman Lectures on Physics Feynman Simplified gives mere mortals access to the fabled Feynman Lectures on Physics.
This Book
Feynman Simplified: 2C covers one quarter of Volume 2 of The Feynman Lectures on Physics. The topics we explore include:
Relativistic Maxwell’s Equations
Lorentz Transform for Potentials & Fields
Field Energy, Momentum & Mass
Relativistic Particles in Fields
Crystals
Refraction & Reflection in Dense Matter
Waveguides
To find out about other eBooks in the Feynman Simplified series, click HERE. I welcome your comments and suggestions. Please contact me through my WEBSITE. If you enjoy this eBook please do me the great favor of rating it on Amazon.com or BN.com.
Table of Contents
Chapter 25: Waveguides
Chapter 26: Relativistic Electrodynamics
Chapter 27: Transformation of Fields
Chapter 28: Energy & Momentum of Fields
Chapter 29: Electromagnetic Mass
Chapter 30: Particles in Fields
Chapter 31: Crystals
Chapter 32: Refraction in Dense Matter
Chapter 33: Reflection & Transmission
Chapter 34: Clever Tricks
Chapter 35: Review of Part 2C
Chapter 25 Waveguides
In Chapter 23, we examined the behavior of simple circuit elements as a function of frequency. We learned that the character of those elements often changes dramatically as frequencies increase. In V2p24-1, Feynman says:
“Another interesting technical problem is the connection of one object to another, so that electromagnetic energy can be transmitted between them. In low-frequency circuits the connection is made with wires, but this method doesn’t work very well at high frequencies because the circuits would radiate energy into all the space around them, and it is hard to control where the energy will go. The fields spread out around the wires; the currents and voltages are not “guided” very well by the wires. In this chapter we want to look into the ways that objects can be interconnected at high frequencies.”
In this chapter, we examine the theory of transmission lines: the “guiding” of electromagnetic waves in confined spaces toward desired destinations.
For low frequency AC, such as 50 or 60 Hz, simple wires are adequate across distances up to hundreds of miles or kilometers. At 60 Hz, electromagnetic wavelengths in conductors may be 2000 miles (assuming wave velocity is about 2/3 of the speed of light). This means the quarter-wavelength at which radiation peaks is about 500 miles. At 50 Hz, the quarter-wavelength is about 1000 km.
For kilohertz frequencies, such as local telephone wiring, twisted-pairs are employed to reduce crosstalk. Figure 25-1 shows four twisted-pairs that can carry four separate telephone conversations.
Figure 25-1 Four Twisted-Pairs
Twisting makes it more difficult for one party to hear another party's conversation, and reduces interference by diminishing signal pick-up in one pair due to fields radiated by adjacent pairs.
Frequencies in the megahertz range are often transmitted through coaxial cables (“coax”). The simplest coax consists of two coaxial, thin hollow cylinders. Typically, the signal or power is transmitted through the central tube, while the outer tube is a ground shield connected to a zero-volt potential. Figure 25-2 shows an inner conductor with radius r, and an outer conductor with radius R.
Figure 25-2 Coaxial Cable
One of the great advantages of coax is that its electromagnetic fields are completely contained in the space between the conductors (ideally). This means coaxial cables do not interfere with one another, even if many are bundled tightly together. They are also unaffected by external electrical devices or fields.
In Chapter 22, we found that the impedance $Z_0$ of a transmission line with inductance $L_0$ and capacitance $C_0$ per unit length is:
$$Z_0 = \sqrt{L_0 / C_0}$$
Let’s now analyze transmission through a coaxial cable from a different perspective. Figure 25-3 shows a cross-section of the interior of a coax (shown in gray), with the outer conductor at zero volts, and a signal propagating through the inner conductor that is represented by a solid black line. At a distance x from the start of the coax, let the signal voltage be V(x) and the current be J(x). A small distance further down the cable, at position x*, the signal has voltage V(x*) and current J(x*).
Figure 25-3 Signal In Coax
For a time-varying current, the coax has an inductance $L_0$ per unit length. This causes a voltage drop given by:
$$\Delta V = V(x^*) - V(x) = -L_0\,(x^* - x)\,\frac{\partial J}{\partial t}$$
$$\frac{\Delta V}{\Delta x} = -L_0\,\frac{\partial J}{\partial t}$$
In the limit that Δx goes to zero (x* goes to x), we obtain a differential equation:
$$\frac{\partial V}{\partial x} = -L_0\,\frac{\partial J}{\partial t}$$
We obtain a second differential equation by considering the time-varying voltage. The coax has capacitance $C_0$ per unit length, so in a small length Δx = (x*–x), the stored charge q is:
$$q = C_0\,\Delta x\,V$$
The net current flowing into Δx must equal the change in charge within Δx. This means:
$$J(x) - J(x^*) = \frac{\partial q}{\partial t} = C_0\,\Delta x\,\frac{\partial V}{\partial t}$$
In the limit that Δx goes to zero, this means:
$$-\frac{\partial J}{\partial x} = C_0\,\frac{\partial V}{\partial t}$$
Feynman says these are the two basic differential equations for any transmission line, adding:
“We could modify them to include the effects of resistance in the conductors or of leakage of charge through the insulation between the conductors, but for our present discussion we will just stay with the simple example.”
We now combine these equations using a familiar trick: differentiate the first equation with respect to x and the second with respect to t, so that both contain the term $\partial^2 J/\partial t\,\partial x$.
$$\frac{1}{L_0}\frac{\partial^2 V}{\partial x^2} = -\frac{\partial^2 J}{\partial t\,\partial x}$$
$$C_0\,\frac{\partial^2 V}{\partial t^2} = -\frac{\partial^2 J}{\partial t\,\partial x}$$
$$\frac{\partial^2 V}{\partial x^2} - L_0 C_0\,\frac{\partial^2 V}{\partial t^2} = 0$$
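The same trick, done the other way (differentiating the first equation with respect to t and the second with respect to x), yields two equations containing $\partial^2 V/\partial x\,\partial t$.
$$L_0\,\frac{\partial^2 J}{\partial t^2} = -\frac{\partial^2 V}{\partial x\,\partial t}$$
$$\frac{1}{C_0}\frac{\partial^2 J}{\partial x^2} = -\frac{\partial^2 V}{\partial x\,\partial t}$$
$$\frac{\partial^2 J}{\partial x^2} - C_0 L_0\,\frac{\partial^2 J}{\partial t^2} = 0$$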
We see that both V and J satisfy the 1-D wave equation:
$$\frac{\partial^2 \psi}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 \psi}{\partial t^2} = 0$$
With $v = 1/\sqrt{L_0 C_0}$, all solutions are of the form:
$$\psi = f(x - vt) + g(x + vt)$$
Here, f is a wave moving toward +x with a voltage and a current that we will call $V_+$ and $J_+$, and g is a wave moving toward –x with $V_-$ and $J_-$.
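As a quick numerical sanity check, the sketch below tests that a pulse of fixed shape travelling toward +x at speed v satisfies the 1-D wave equation. The Gaussian pulse shape, the speed, the sample point, and the helper names are all illustrative assumptions, and the derivatives are approximated by central differences, so the residual is only expected to be small rather than exactly zero.

```python
import math

def wave_equation_residual(shape, v, x, t, h=1e-3):
    """Return d2(psi)/dx2 - (1/v**2) d2(psi)/dt2 for psi(x, t) = shape(x - v*t).

    Second derivatives are estimated with central finite differences of step h.
    """
    def psi(xx, tt):
        return shape(xx - v * tt)

    d2_dx2 = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2
    d2_dt2 = (psi(x, t + h) - 2.0 * psi(x, t) + psi(x, t - h)) / h**2
    return d2_dx2 - d2_dt2 / v**2

def gaussian_pulse(u):
    """Hypothetical pulse shape; any smooth function of one variable would do."""
    return math.exp(-u * u)

# Arbitrary illustrative values for the speed and the sample point.
residual = wave_equation_residual(gaussian_pulse, v=2.0, x=0.3, t=0.1)
print(f"wave-equation residual: {residual:.2e}")
```

The same helper can be pointed at a pulse moving toward –x by passing a negative v, since only v squared enters the wave equation.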
Let’s now calculate the key parameters of a coaxial cable: $L_0$, $C_0$, $Z_0$, and v.
Recall that in Chapter 18, we found the inductance of a solenoid by calculating its field energy. We will use the same approach here, and calculate the field energy of a coax.
The magnetic field energy is given by:
$$U = \frac{\varepsilon_0 c^2}{2} \int \mathbf{B}\cdot\mathbf{B}\; dV$$
The B field at a distance ρ from a wire carrying current J is:
$$B = \frac{J}{2\pi\varepsilon_0 c^2 \rho}$$
In cylindrical coordinates, the volume integral over dV is:
$$dV = \rho\; dx\; d\beta\; d\rho$$
Here, x is the distance along the wire, ρ is the distance from the wire, and β is the azimuthal angle. The extra ρ arises because the distance moved by an incremental change in azimuthal angle is ρdβ. The integral over dβ equals 2π, and the integral over dx equals X, the length of the cable. (I’d prefer L, but we’re already using L for inductance.) The integral over ρ is over the region between the two cylindrical conductors, from ρ=r to ρ=R. There is no magnetic field outside the coax, because there is no net current flow through a cross-sectional surface that includes both conductors. The magnetic energy per unit length U/X is then:
$$U = \frac{\varepsilon_0 c^2}{2} \int_r^R 2\pi X\,\rho\; d\rho\; \frac{J^2}{(2\pi\varepsilon_0 c^2 \rho)^2}$$
$$U = \frac{J^2}{2}\,\frac{X}{2\pi\varepsilon_0 c^2} \int_r^R \frac{d\rho}{\rho}$$
$$\frac{U}{X} = \frac{J^2}{2}\,\frac{1}{2\pi\varepsilon_0 c^2} \int_r^R \frac{d\rho}{\rho}$$
$$\frac{U}{X} = \frac{\ln(R/r)\, J^2}{4\pi\varepsilon_0 c^2}$$
Another equation for magnetic field energy is:
$$U = \frac{L J^2}{2}$$
Equating these equations yields $L_0$, the inductance per unit length.
$$\frac{U}{X} = \frac{L J^2}{2X}$$
$$\frac{U}{X} = \frac{\ln(R/r)\, J^2}{4\pi\varepsilon_0 c^2}$$
$$L_0 = \frac{L}{X} = \frac{\ln(R/r)}{2\pi\varepsilon_0 c^2}$$
Now on to the capacitance. In Chapter 12, we found the stored charge of a coaxial capacitor at voltage V (the analog of heat flow between two concentric pipes). In our current notation, the equation is:
$$Q = \frac{2\pi\varepsilon_0\, V X}{\ln(R/r)}$$
$$C_0 = \frac{Q}{V X} = \frac{2\pi\varepsilon_0}{\ln(R/r)}$$
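This is the capacitance when the gap between conductors is empty (ideally, it would be vacuum). We therefore have the wave velocity v and impedance $Z_0$:
$$\frac{1}{v^2} = L_0 C_0$$
$$\frac{1}{v^2} = \left\{\frac{\ln(R/r)}{2\pi\varepsilon_0 c^2}\right\} \left\{\frac{2\pi\varepsilon_0}{\ln(R/r)}\right\}$$
$$\frac{1}{v^2} = \frac{1}{c^2}$$
$$v = \pm c$$
$$Z_0^2 = \frac{L_0}{C_0}$$
$$Z_0^2 = \left\{\frac{\ln(R/r)}{2\pi\varepsilon_0 c^2}\right\} \Big/ \left\{\frac{2\pi\varepsilon_0}{\ln(R/r)}\right\}$$
$$Z_0^2 = \left\{\frac{\ln(R/r)}{2\pi\varepsilon_0 c}\right\}^2$$
$$Z_0 = \frac{\ln(R/r)}{2\pi\varepsilon_0 c}$$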
Feynman says the constant $1/(2\pi\varepsilon_0 c)$ has the units of resistance and a value of 60 ohms. The ratio R/r is never very large, and the impedance varies only logarithmically with that ratio. The result is that almost all coaxial cables have impedances between 50 ohms and a few hundred ohms.
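The coax formulas derived above are easy to evaluate numerically. The following is a minimal sketch, assuming a vacuum gap between the conductors; the function name and the example radii are hypothetical choices made only for illustration.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
C_LIGHT = 299_792_458.0      # speed of light in vacuum, m/s

def coax_parameters(r_inner, r_outer):
    """Return (L0, C0, v, Z0) for a vacuum-filled coax with the given radii."""
    log_ratio = math.log(r_outer / r_inner)
    l0 = log_ratio / (2.0 * math.pi * EPS0 * C_LIGHT**2)   # inductance per unit length, H/m
    c0 = 2.0 * math.pi * EPS0 / log_ratio                  # capacitance per unit length, F/m
    v = 1.0 / math.sqrt(l0 * c0)                           # wave velocity, m/s
    z0 = math.sqrt(l0 / c0)                                # characteristic impedance, ohms
    return l0, c0, v, z0

# Illustrative radii (assumed): choosing R/r = e makes ln(R/r) = 1,
# so Z0 reduces to the constant 1/(2*pi*eps0*c).
l0, c0, v, z0 = coax_parameters(r_inner=1.0e-3, r_outer=math.e * 1.0e-3)
print(f"L0 = {l0:.3e} H/m, C0 = {c0:.3e} F/m, v/c = {v / C_LIGHT:.6f}, Z0 = {z0:.2f} ohm")
```

Because the impedance depends only on ln(R/r), doubling both radii leaves every one of these per-unit-length parameters unchanged.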
Real coax has a dielectric between the inner and outer conductors, which changes the above results.
Rectangular Waveguides
In V2p24-4, Feynman says:
“The next thing we want to talk about seems, at first sight, to be a striking phenomenon: if the central conductor is removed from the coaxial line, it can still carry electromagnetic power. In other words, at high enough frequencies a hollow tube will work just as well as one with wires. It is related to the mysterious way in which a resonant circuit of a [capacitor] and inductance gets replaced by nothing but a can at high frequencies.
“Although it may seem to be a remarkable thing when one has been thinking in terms of a transmission line as a distributed inductance and capacity, we all know that electromagnetic waves can travel along inside a hollow metal pipe. If the pipe is straight, we can see through it! So certainly electromagnetic waves go through a pipe.”
Sometimes a simple observation is worth a thousand equations. Let’s find out what kind of waves can go through a pipe. In this mode, the pipe is called a waveguide.
While the basic principles are the same for pipes of all shapes, we will analyze a rectangular pipe, since that is the simplest. We define a pipe that starts at z=0 and runs along the z-axis toward z=∞, with width X and height Y, as shown in the upper two images of Figure 25-4.
Figure 25-4 Rectangular Waveguide
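Assume an unspecified wave source at z<0 sends waves down the pipe toward z>0. We look for a solution in which the electric field points in the y-direction, with $E_y$ vanishing at the conducting sidewalls x=0 and x=X. For the x-dependence, $\sin(k_x x)$ satisfies this condition if $k_x = n\pi/X$, with n an integer.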
For the z-dependence, let’s try a typical wave solution: $\exp\{i\omega t - i k_z z\}$.
Putting these pieces together, with $E_0$ denoting the amplitude, we get:
$$E_y = E_0 \sin(k_x x)\, \exp\{i\omega t - i k_z z\}$$
Figure 25-5 shows the E and B fields in the waveguide, with E reaching its maximum at the left and right ends of the image, and reaching its minimum (most negative) in between.
Figure 25-5 Fields In Waveguide
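Inserting the expression for $E_y$ into the 3-D wave equation yields:
$$0 = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2}$$
$$0 = -k_x^2 E_y - k_z^2 E_y - \left(-\frac{\omega^2}{c^2}\right) E_y$$
$$k_x^2 + k_z^2 = \frac{\omega^2}{c^2}$$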
Since we already constrained $k_x$, this establishes a relationship between $k_z$ and ω.
$$k_z = \pm\sqrt{(\omega/c)^2 - (n\pi/X)^2}$$
The “±” sign determines the wave direction: “+” for waves moving toward +z, and “–” for waves moving toward –z. From this equation, we obtain the phase velocity (see Feynman Simplified 1D, Chapter 43):
$$v_{ph} = \omega / k_z$$
The guide wavelength $\lambda_{wg}$, the wavelength inside the waveguide, is:
$$\lambda_{wg} = 2\pi v_{ph} / \omega$$
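In empty space, the wavelength is: $\lambda_0 = 2\pi c/\omega$. Comparing the two wavelengths yields:
$$\lambda_{wg} = \frac{2\pi}{\sqrt{(\omega/c)^2 - (n\pi/X)^2}}$$
$$\lambda_{wg} = \frac{1}{\sqrt{(\omega/2\pi c)^2 - (n/2X)^2}}$$
$$\lambda_{wg} = \frac{1}{\sqrt{(1/\lambda_0)^2 - (n/2X)^2}}$$
$$\lambda_{wg} = \frac{\lambda_0}{\sqrt{1 - (n\lambda_0/2X)^2}}$$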
For very high frequencies, when ω>>c/X, the waveguide wavelength approaches the free space wavelength. For visible light and typical waveguides, $\lambda_{wg}$ and $\lambda_0$ are virtually equal.
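To get a feel for the numbers, the sketch below evaluates the guide wavelength from the wave number $k_z$ and compares it with the free-space wavelength. The 3 cm guide width and the 10 GHz operating frequency are assumed example values, and the function name is hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def guide_wavelength(freq_hz, width_m, n=1):
    """Guide wavelength 2*pi/k_z for transverse mode n, or None if k_z is not real."""
    omega = 2.0 * math.pi * freq_hz
    k_x = n * math.pi / width_m
    k_z_squared = (omega / C_LIGHT) ** 2 - k_x ** 2
    if k_z_squared <= 0.0:
        return None          # at or below cutoff: no propagating wave
    return 2.0 * math.pi / math.sqrt(k_z_squared)

width = 0.03                 # assumed guide width X, in meters
freq = 10.0e9                # assumed operating frequency, in hertz
lam_0 = C_LIGHT / freq
lam_wg = guide_wavelength(freq, width)
print(f"free-space wavelength: {lam_0 * 100:.3f} cm")
print(f"guide wavelength:      {lam_wg * 100:.3f} cm")
```

Raising the frequency in this sketch drives the two wavelengths together, in line with the high-frequency limit described above.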
Cutoff Frequency
Now consider low frequency waves traveling through our pipe. Let’s focus on the lowest transverse mode, the least oscillation in the x-direction, which corresponds to n=1. As the frequency decreases, the wavelength increases at a faster than usual rate. In empty space, $\lambda_0$ is inversely proportional to ω. But in a waveguide, we can rewrite the equation for $\lambda_{wg}$ as (with Ω=πc/X):
$$\lambda_{wg} = \frac{2\pi c}{\sqrt{\omega^2 - \Omega^2}}$$
$$\lambda_{wg} = \frac{2\pi c}{\sqrt{(\omega - \Omega)(\omega + \Omega)}}$$
When ω is close to Ω, $\lambda_{wg}$ becomes proportional to $1/\sqrt{\omega - \Omega}$. This means $\lambda_{wg}$ approaches infinity much more quickly than in empty space: at ω=Ω rather than at ω=0. Ω=πc/X is the cutoff frequency of a waveguide of width X for E orthogonal to the width.
Consider the wave number $k_z$ for frequencies below the cutoff (for n=1). Let’s rewrite our prior equation for $k_z$:
$$k_z = \pm\frac{1}{c}\sqrt{\omega^2 - \Omega^2}$$
$$k_z = \pm\frac{i}{c}\sqrt{\Omega^2 - \omega^2}$$
$$k_z = \pm iK \quad\text{with}\quad K = \frac{1}{c}\sqrt{\Omega^2 - \omega^2}$$
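When ω is less than cutoff frequency Ω, the wave number $k_z$ is imaginary. This is not as crazy as it might sound. Let’s insert this imaginary wave number into our expression for the wave.
$$E_y = E_0 \sin(k_x x)\, \exp\{i\omega t - i k_z z\}$$
$$E_y = E_0 \sin(k_x x)\, \exp\{i\omega t\}\, \exp\{\pm K z\}$$
$$E_y = E_0 \sin(k_x x)\, \exp\{i\omega t\}\, \exp\{-K z\}$$
In the last line, we assumed the wave source was at z<0, so for z>0 we keep only the solution that decays with increasing z. Below cutoff, the field does not propagate; it falls off exponentially with distance down the pipe.
Waveguides also support higher modes in the x-direction ($k_x = n\pi/X$ with n>1), and modes with more geometric complexity.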
If the electric field is entirely within the xy-plane, the magnetic field has a z-component, and the oscillation is called a transverse-electric (“TE”) mode. If E has a component in the z-direction, the magnetic field is entirely in the xy-plane, and the oscillation is called a transverse-magnetic (“TM”) mode. In rectangular waveguides, the lowest cutoff frequency occurs in the TE mode that we examined first. This is because Ω=nπc/X is lowest when n=1 and the side dimension X is as large as possible.
Typically, waveguides are used at frequencies just slightly higher than the lowest cutoff. This ensures that only the lowest mode propagates through the waveguide, which simplifies its use.
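The single-mode operating band can be illustrated with a short calculation. The sketch below classifies transverse mode n as propagating or attenuated at a given frequency, using the cutoff Ω=nπc/X, and returns the real wave number or the attenuation constant K accordingly. The 3 cm width, the 7.5 GHz frequency, and the function name are assumptions made for this example only.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def classify_mode(freq_hz, width_m, n):
    """Return ("propagates", k_z) above cutoff or ("attenuated", K) at or below it."""
    omega = 2.0 * math.pi * freq_hz
    cutoff = n * math.pi * C_LIGHT / width_m      # cutoff angular frequency for mode n
    if omega > cutoff:
        return "propagates", math.sqrt(omega**2 - cutoff**2) / C_LIGHT
    return "attenuated", math.sqrt(cutoff**2 - omega**2) / C_LIGHT

width = 0.03      # assumed guide width X, in meters
freq = 7.5e9      # assumed frequency, between the n=1 and n=2 cutoffs for this width

for n in (1, 2):
    kind, value = classify_mode(freq, width, n)
    if kind == "propagates":
        print(f"n={n}: propagates with k_z = {value:.1f} rad/m")
    else:
        # exp(-K z) evaluated one guide-width down the pipe
        print(f"n={n}: attenuated, field falls to {math.exp(-value * width):.3f} after z = X")
```

With these assumed values only the n=1 mode carries energy down the guide, while the n=2 field dies away within a few centimeters.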
Feynman’s Remarkable Perspective
In V2p24-10, Feynman presents an intriguing insight explaining the origin of a waveguide’s cutoff frequency. We earlier derived the cutoff equation mathematically, showing that the wave number becomes imaginary at low frequencies. There is nothing wrong with that derivation. Indeed, the same analysis leads to even more profound results in quantum mechanics. But, here Feynman provides a different approach that is less “imaginary” and may appeal more to your physical intuition. This approach works only for rectangular waveguides, while the “imaginary” approach works for waveguides of any shape.
You have probably noticed that the waveguide’s vertical size Y does not enter into any of our prior equations. Feynman says that, in the TE-$E_y$ mode, our waveguide operates identically for any value of Y, even for Y=∞.
We therefore consider the waveguide shown in Figure 25-9, where the y-axis points out of the screen. The waveguide sidewalls at x=0 and at x=X extend to y=±∞. A vertical wire $S_0$ that also extends to y=±∞ is located midway between the sidewalls, at x=X/2, z=0. The wire carries a current oscillating at frequency ω that produces electromagnetic waves that travel down the waveguide.
Figure 25-9 Wire Source Between Sidewalls
An isolated wire radiates cylindrical expanding waves. But the waveguide sidewalls constrain the wire’s radiation pattern. Assuming the sidewalls are ideal conductors, electric fields must be perpendicular to the sidewalls at their surface. We learned how to satisfy that requirement in Chapter 7: the fields from a charge near a conducting plane are the same as the fields from a charge and an image charge. In V2p24-11, Feynman says:
“The image idea works just as well for electrodynamics as it does for electrostatics, provided, of course, that we also include the retardations. We know that is true because we have often seen a mirror producing an image of a light source. And a mirror is [an almost ideal] conductor for electromagnetic waves with optical frequencies.”
The fields from $S_0$ are unchanged by replacing the upper wall with an opposite polarity source (call that $S_1$) at x=3X/2; the same distance above the upper wall as $S_0$ is below it. We define the polarity of $S_0$ to be “+”, which makes $S_1$ “–”. By “opposite polarity” we mean the oscillating currents in $S_0$ and in $S_1$ have a relative phase shift of 180 degrees.
Similarly, the image of $S_0$ in the lower wall is an opposite polarity source (call that $S_2$) at x=–X/2; the same distance below the lower wall as $S_0$ is above it. The polarity of $S_2$ is “–”.
But this is not the end by any means. As everyone who has stood between two parallel mirrors knows, one sees more than just two images; in fact one sees an infinite number of images. Each image produces another opposite polarity image of itself in the opposite mirror. The result here is an infinite column of sources of alternating polarities, as shown in Figure 25-10. The dotted lines indicate where the waveguide sidewalls originally were before being replaced by image sources.
Figure 25-10 Infinite Column of Sources
As Feynman says, this is: “in fact just what you would see if you looked at a wire placed halfway between two parallel mirrors.”
The field in the waveguide is the same either with sidewalls or with an infinite column of alternating sources. We discovered how to solve this problem in Feynman Simplified 1C, Chapter 32, where we calculated the radiation field from various dipole arrays. (Feynman is showing us how all these different ideas fit together in one comprehensive theory.)
Close to the sources, the fields are quite complex. Fortunately, what we are interested in here is waveguide transmission. So in what follows, we will consider only fields far from the sources at z=0.
There is no direct radiation along the z-axis (the waveguide axis) because there are equal numbers of sources of opposite polarities that cancel one another. Where then do the waveguide’s waves come from?
We also found in Feynman Simplified 1C that dipole arrays radiate in certain directions: the directions in which the waves from each source interfere constructively. Figure 25-11 illustrates when this happens. Here, at some time t, we see plane waves radiating up and to the right, at an angle +θ relative to the z-axis. The light solid lines indicate wave crests, and the light dotted lines indicate wave troughs. The distance between consecutive crests (or troughs) is the wavelength $\lambda_0$.
Figure 25-11 Radiation at Angle +θ
Since we removed the waveguide walls, we will assume the remaining space is empty. The wave propagation velocity v is therefore c, the speed of light in vacuum, and $\lambda_0 = 2\pi c/\omega$.
Examine the gray right triangle that is enlarged in the lower portion of this image. The hypotenuse has length 2X, which is the distance between consecutive sources of the same polarity. The shortest side of the triangle has length u, and the angle at the top is θ. For radiation from $S_1$ and from $S_2$ to interfere constructively, the extra distance u traveled from $S_2$ must be an integer number of wavelengths. This means:
$$u = \pm m\lambda_0 \quad\text{with m an integer}$$
$$2X \sin\theta = u$$
$$\sin\theta = \pm m\lambda_0 / 2X$$
For now, we will just consider the case of m=1. Note that this equation has no solutions for $\lambda_0 > 2X$: a line of sources cannot constructively interfere at any angle if the wavelength is too great (if the frequency is low). We will see that this leads to the same cutoff criterion we found earlier.
Also note that “+” polarity sources also interfere constructively with $S_1$ at angle θ. A similar triangle drawn with $S_1$-$S_0$ as the hypotenuse has sides that are half as large as the original triangle. The requirement for constructive interference between $S_1$ and $S_0$ is therefore $u/2 = \lambda_0/2$, which is exactly what the opposite polarities of $S_1$ and $S_0$ provide.
Thus all sources radiate constructively at angle θ. By symmetry, they also radiate constructively at angle –θ, which is shown in Figure 25-12.
Figure 25-12 Radiation at Angle –θ
Here, the radiation moves down and to the right, also at v=c with wavelength $\lambda_0$.
The actual radiation pattern is the superposition of radiation at +θ and at –θ. That superposition at some time t is shown in Figure 25-13. Again, the dotted horizontal lines indicate where the waveguide walls were before being replaced by image sources.
Figure 25-13 Sum of +θ & –θ Radiation
The three small dots with the letters A, B, and C below them indicate points where the +θ and –θ waves interfere maximally. Crests from both radiation patterns combine to produce the maximum positive E field at A and C, while their troughs combine to produce the minimum (greatest negative) E field at B. As time t increases, the +θ waves move up and to the right while the –θ waves move down and to the right, both at speed c with wavelength $\lambda_0$. In one full wave period, the crest at point A moves to point C; hence, the distance AC equals one guide wavelength, which we called $\lambda_{wg}$ above.
The distance AC is the length of the hypotenuse of the gray right triangle near the bottom of Figure 25-13. The middle-length side of this triangle is the distance between crests of the +θ wave, which is $\lambda_0$. And the angle between those two sides is θ. This defines the relationship between the two wavelengths:
$$\lambda_{wg} = \lambda_0 / \cos\theta$$
$$\lambda_{wg} = \lambda_0 / \sqrt{1 - \sin^2\theta}$$
$$\lambda_{wg} = \lambda_0 / \sqrt{1 - (\lambda_0/2X)^2}$$
Recalling that the cutoff frequency is Ω=πc/X, and that $\lambda_0 = 2\pi c/\omega$, we can rewrite this as:
$$\lambda_{wg} = \lambda_0 / \sqrt{1 - (\Omega/\omega)^2}$$
We see that for ω<Ω, the square root is imaginary and no wave propagates, which is the same cutoff we found earlier. For ω>Ω, $\lambda_{wg} > \lambda_0$. A longer wavelength at the same frequency corresponds to a greater velocity. This confirms that the phase velocity, the speed of a single-frequency wave, can exceed c. Nonetheless, no real entities, including photons and energy, are ever transported faster than speed c.
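The image-source picture and the earlier wave-number derivation should give the same guide wavelength, and that is easy to confirm numerically. In this sketch the function names, the 3 cm wall spacing, and the 8 GHz frequency are illustrative assumptions; only the m=1 interference condition is used.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def guide_wavelength_from_images(freq_hz, width_m):
    """Image-source picture: sin(theta) = lambda_0 / 2X, lambda_wg = lambda_0 / cos(theta)."""
    lam_0 = C_LIGHT / freq_hz
    sin_theta = lam_0 / (2.0 * width_m)
    if sin_theta >= 1.0:
        return None                      # no angle of constructive interference
    return lam_0 / math.sqrt(1.0 - sin_theta**2)

def guide_wavelength_from_wave_number(freq_hz, width_m):
    """Wave-number picture: k_z**2 = (omega/c)**2 - (pi/X)**2, lambda_wg = 2*pi / k_z."""
    omega = 2.0 * math.pi * freq_hz
    k_z_squared = (omega / C_LIGHT) ** 2 - (math.pi / width_m) ** 2
    if k_z_squared <= 0.0:
        return None                      # wave number is not real
    return 2.0 * math.pi / math.sqrt(k_z_squared)

width = 0.03     # assumed wall spacing X, in meters
freq = 8.0e9     # assumed frequency above cutoff, in hertz
print(guide_wavelength_from_images(freq, width))
print(guide_wavelength_from_wave_number(freq, width))
```

Both functions return None for the same low frequencies, which is the shared cutoff criterion expressed two ways.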
Chapter 25 Review: Key Ideas
• Transmission lines are characterized by two basic equations:
$$\frac{\partial V}{\partial x} = -L_0\,\frac{\partial J}{\partial t}$$
$$\frac{\partial J}{\partial x} = -C_0\,\frac{\partial V}{\partial t}$$
Here, V is voltage, J is current, and $C_0$ and $L_0$ are the capacitance and inductance per unit length. Both V and J satisfy the 1-D wave equation with wave velocity $v = 1/\sqrt{L_0 C_0}$.
• A coaxial cable, with an inner conductor of radius r and an outer conductor of radius R, and vacuum between the conductors, has v=c and impedance $Z_0$ given by:
$$Z_0 = \frac{\ln(R/r)}{2\pi\varepsilon_0 c}$$
The constant $1/(2\pi\varepsilon_0 c)$ has a value of 60 ohms, hence almost all coaxial cables have impedances between 50 ohms and a few hundred ohms.
• For a rectangular waveguide of width X and height Y, with X>Y, the lowest mode is:
$$E_y = E_0 \sin(k_x x)\, \exp\{i\omega t - i k_z z\}$$
$$k_x = \pi/X$$
$$c\,k_z = \pm\sqrt{\omega^2 - \Omega^2}$$
Here, Ω = πc/X is the cutoff frequency: waves of lower frequency are rapidly attenuated. The “+” sign is for waves moving toward +z, and “–” is for waves moving toward –z. The phase velocity, group velocity, and guide wavelength are:
$$v_{ph} = \omega / k_z = c / \sqrt{1 - (\Omega/\omega)^2}$$
$$v_{gp} = d\omega/dk_z = c\,\sqrt{1 - (\Omega/\omega)^2}$$
$$\lambda_{wg} = 2\pi v_{ph}/\omega = \lambda_0 / \sqrt{1 - (\Omega/\omega)^2}$$
Here, $\lambda_0 = 2\pi c/\omega$ is the vacuum wavelength.
The power transmitted through an X-by-Y rectangular waveguide is:
$$\text{power} = dU/dt = \varepsilon_0\, E_{max}^2\, X Y\, v_{gp} / 4$$
Here, $E_{max}$ is the peak value of $E_y$ (the amplitude $E_0$ in the mode expression above).
The oscillatory modes of a waveguide are characterized as either TE or TM. Assume the waveguide’s axis is in the z-direction. In the TE mode, the electric field is entirely transverse within the xy-plane, and the magnetic field has a z-component. In the TM mode, B is entirely transverse within the xy-plane, and E has a z-component. In rectangular waveguides, the lowest cutoff frequency occurs in the TE mode. Typically, waveguides are used at frequencies just slightly higher than the lowest Ω. This ensures that only the lowest mode propagates through the waveguide, which simplifies its use.
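The phase and group velocities listed in this review multiply to c², which can be checked numerically. The sketch below starts from the dispersion relation ω² = Ω² + c²k_z² for the lowest mode and estimates dω/dk_z with a central difference; the guide width, the sample wave number, and the function names are assumed for illustration.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def omega_of_kz(k_z, width_m):
    """Dispersion relation of the lowest mode: omega**2 = Omega**2 + (c*k_z)**2."""
    cutoff = math.pi * C_LIGHT / width_m
    return math.sqrt(cutoff**2 + (C_LIGHT * k_z) ** 2)

def phase_and_group_velocity(k_z, width_m, rel_step=1e-6):
    """Return (v_ph, v_gp), with v_gp = d(omega)/d(k_z) taken by central difference."""
    omega = omega_of_kz(k_z, width_m)
    v_ph = omega / k_z
    dk = k_z * rel_step
    v_gp = (omega_of_kz(k_z + dk, width_m) - omega_of_kz(k_z - dk, width_m)) / (2.0 * dk)
    return v_ph, v_gp

# Assumed example: 3 cm wide guide, k_z = 150 rad/m.
v_ph, v_gp = phase_and_group_velocity(k_z=150.0, width_m=0.03)
print(f"v_ph/c = {v_ph / C_LIGHT:.4f}, v_gp/c = {v_gp / C_LIGHT:.4f}")
print(f"v_ph * v_gp / c^2 = {v_ph * v_gp / C_LIGHT**2:.6f}")
```

The phase velocity exceeds c while the group velocity, which governs energy transport in the power formula, stays below it.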
Chapter 26 Relativistic Electrodynamics
Invariance in Nature & Physical Laws
Throughout the development of physics, our understanding of nature has been advanced by discoveries of invariance: natural properties that remain the same when other conditions change. Physicists have repeatedly reformulated theories and physical laws to better represent newly discovered invariance principles.
A familiar example is rotational invariance: experiments show that nature has no preferred direction. If everything were rotated by any angle about any axis, all natural phenomena would remain the same.
Physicists discovered that incorporating rotational invariance into the statement of physical laws is most conveniently achieved by employing vectors. For example: F = qv×B is rotationally invariant. If everything rotates by angle θ about the z-axis, or equivalently if our coordinate system rotates by angle –θ about the z-axis, this law remains valid. It is valid for any angle θ and any axis z.
While mastering vector algebra is initially challenging, it undeniably simplifies physical analysis. Even more importantly, vectors enable a deeper understanding of physical principles. Certainly, we could write the above law as:
$$F_x = q\,(v_y B_z - v_z B_y)$$
$$F_y = q\,(v_z B_x - v_x B_z)$$
$$F_z = q\,(v_x B_y - v_y B_x)$$
These three equations yield the correct answer to any problem. However, rotations change the values of nine of these variables (all but q). Furthermore, the single vector equation makes a clearer statement of natural law than do the three component equations.
Using vectors, we have been able to express even more complex relationships in manageable forms. Maxwell’s equations are an excellent example:
$$\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$$
$$\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t$$
$$\nabla\cdot\mathbf{B} = 0$$
$$c^2\,\nabla\times\mathbf{B} = \mathbf{j}/\varepsilon_0 + \partial\mathbf{E}/\partial t$$
Reading “a changing magnetic flux through a closed loop drives a circulating electric field around that loop” is much more meaningful than staring at the three equivalent component equations. You already knew all that. Let’s now turn to two other invariance principles that will lead us to restate the equations of electromagnetism. These principles are relativity and the constancy of the speed of light, the two foundational principles of Einstein’s Special Theory of Relativity. We explored both these principles in detail in Feynman Simplified 1C, chapters 25 through 29. First espoused by Galileo, the principle of relativity is: absolute velocity is meaningless, only relative velocities have physical consequence. All natural phenomena are the same in any reference frame moving at a constant velocity, regardless of that velocity. Einstein combined relativity with the postulate that the speed of light, in vacuum, has the same value c, in all reference frames, regardless of their velocity. (In special relativity, reference frames must have constant velocity; general relativity removes this requirement.) Innumerable experiments have confirmed the principles of relativity and the constancy of c to astonishing levels of precision. New tests are continually being performed, both to achieve even greater precision and to extend the range of validated conditions. No principle of science is sacrosanct; all may be questioned thoughtfully, and subject to more stringent testing. Nonetheless, nearly all physicists agree that these two principles are among the most certain concepts of all human knowledge. To correctly represent nature, all physical theories and laws must properly incorporate both principles.
Four-Vectors
As we discovered in 1C, the principles of relativity and the constancy of c are best incorporated into physical laws by employing 4-vectors in four-dimensional spacetime. For example, the coordinates of an event are represented by the position 4-vector:
$$x_\mu = (ct,\, x,\, y,\, z)$$
Here, subscript µ is an index that ranges over the values t, x, y, and z, selecting the desired component of the 4-vector $x_\mu$. This is analogous to the subscript j ranging over x, y, and z, selecting the desired component of the 3-vector F.
Another example is the momentum 4-vector, often called the 4-momentum:
$$p_\mu = (E/c,\, p_x,\, p_y,\, p_z)$$
Mastering 4-vectors in 4-D spacetime is also initially challenging. But learning this skill is essential to becoming a successful physicist, and it ultimately simplifies the math and illuminates a deeper understanding of nature. I assume that is your goal; why else would you be reading my book?
We will find that everything we learned in 3-vector algebra can be expanded into 4-D spacetime with 4-vector algebra. Adding one more component is a small price to pay to see the universe in its full 4-D glory.
The first question is: What makes a valid 4-vector? What must we add as the fourth component to convert a proper 3-vector into a proper 4-vector? The answer starts with the criterion for a proper 3-vector: rotational invariance.
The three components of a proper 3-vector cannot be just any three quantities one might pick. The combination (birthdate, weight, salary) is not a proper 3-vector. An example of a proper 3-vector is r=(x,y,z), the displacement between the origin of a coordinate system and a point P with those coordinate values. The length of vector r is invariant under coordinate rotation: rotating a coordinate system by angle θ about the z-axis does not change the length of r, for any θ and z. Rotation also does not change the relationship of r to any other proper 3-vector: the angle between any two proper 3-vectors r and s is unchanged by any rotation.
What all this ultimately means is that a proper 3-vector r transforms into another proper 3-vector r* in a rotated coordinate system according to:
$$\mathbf{r} = (x,\, y,\, z)$$
$$\mathbf{r}^* = (x\cos\theta - y\sin\theta,\; y\cos\theta + x\sin\theta,\; z)$$
Note that the rotational transformation has one plus sign and one minus sign. One can apply the same rule to 3-D rotations of any amount in any direction. Since this transformation is linear, any linear sum of proper 3-vectors produces another proper 3-vector.
We can confirm that the rotational transformation does not change vector r’s length, whose square is the scalar product r•r.
$$\mathbf{r}\cdot\mathbf{r} = x^2 + y^2 + z^2$$
$$\mathbf{r}^*\cdot\mathbf{r}^* = x^{*2} + y^{*2} + z^{*2}$$
$$\mathbf{r}^*\cdot\mathbf{r}^* = x^2\cos^2\theta - 2xy\cos\theta\sin\theta + y^2\sin^2\theta + z^2 + y^2\cos^2\theta + 2xy\cos\theta\sin\theta + x^2\sin^2\theta$$
$$\mathbf{r}^*\cdot\mathbf{r}^* = x^2(\cos^2\theta + \sin^2\theta) + y^2(\cos^2\theta + \sin^2\theta) + z^2$$
$$\mathbf{r}^*\cdot\mathbf{r}^* = x^2 + y^2 + z^2 = \mathbf{r}\cdot\mathbf{r}$$
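The rotational invariance just derived can also be checked numerically. The sketch below applies the rotation rule above to a few vectors and confirms that lengths are preserved and that F = qv×B holds equally well in the rotated frame. The sample vectors, the charge, the angle, and the helper names are arbitrary illustrative choices.

```python
import math

def rotate_about_z(vec, theta):
    """Apply r* = (x cos(theta) - y sin(theta), y cos(theta) + x sin(theta), z)."""
    x, y, z = vec
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, y * c + x * s, z)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, v, b_field):
    """Magnetic force F = q v x B, written with components."""
    return tuple(q * component for component in cross(v, b_field))

# Arbitrary sample values.
q, theta = 2.0, 0.7
v = (1.0, -2.0, 0.5)
b_field = (0.3, 0.4, -1.2)

force_then_rotate = rotate_about_z(lorentz_force(q, v, b_field), theta)
rotate_then_force = lorentz_force(q, rotate_about_z(v, theta), rotate_about_z(b_field, theta))

print("length squared before:", dot(v, v))
print("length squared after: ", dot(rotate_about_z(v, theta), rotate_about_z(v, theta)))
print("force, rotated afterwards:   ", force_then_rotate)
print("force from rotated vectors:  ", rotate_then_force)
```

All nine components change under the rotation, yet the two force results agree, which is the sense in which the single vector equation is rotationally invariant.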
Similarly, proper 4-vectors must transform according to the Lorentz transformation. The Lorentz transformation is identical to the rotational transformation among the three spatial dimensions. With the addition of time as the fourth dimension, the Lorentz transformation adds a new procedure for boosts, transformations of velocity. The boost transformation from a reference frame S to a reference frame S* that is moving at speed v in the +x-direction relative to S, is:
$$x^* = \gamma\,(x - \beta\, ct)$$
$$ct^* = \gamma\,(ct - \beta\, x)$$
$$y^* = y$$
$$z^* = z$$
Here, $\beta = v/c$, and $\gamma = 1/\sqrt{1 - \beta^2}$. One can apply the same rule for a boost of any amount in any direction. Note that the Lorentz transformation has two minus signs and that time transforms as well as space; these are the essential differences that produce all relativistic effects. Since the Lorentz transformation is linear, any linear sum of proper 4-vectors produces another proper 4-vector.
Since the Lorentz transformation mixes components, the four components of a proper 4-vector must all have the same units: they must all be velocities, or all be distances, or all have the same units of some other type. This requirement was trivial and left unstated in 3-D, because spatial dimensions naturally have the same units. But in 4-D, one component is intrinsically different, hence care is required to ensure it has the same units as the other components. This is why the new component of the position 4-vector is ct and not just t.
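A short numerical sketch can make the boost rule concrete. It applies the x-direction boost given above to an event written as (ct, x, y, z), then checks two things: a light signal along x still satisfies x = ct in the moving frame, and boosting by –β undoes a boost by β. The function name, the sample events, and β = 0.6 are assumed purely for illustration.

```python
import math

def boost_x(event, beta):
    """Boost (ct, x, y, z) into a frame moving at speed beta*c along +x."""
    ct, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

beta = 0.6   # assumed relative speed, in units of c

# An event on a light ray sent along +x from the origin: x = ct.
light_event = (5.0, 5.0, 0.0, 0.0)
ct_star, x_star, _, _ = boost_x(light_event, beta)
print("light ray in S*: ct* =", ct_star, " x* =", x_star)

# A boost followed by the opposite boost returns the original coordinates.
event = (1.0, 2.0, 3.0, 4.0)
round_trip = boost_x(boost_x(event, beta), -beta)
print("round trip:", round_trip)
```

All four components are given in the same units here, which is what allows the boost to mix ct and x.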
Dimensional Analysis & c=1
For brevity, Feynman adopts a standard physics convention that distance and time shall be measured in units that make the speed of light c equal to 1. One can do this by measuring time in seconds and distance in light-seconds (the distance light travels in one second). Astronomers prefer to measure time in years and distances in light-years. Some experimental high-energy physicists measure time in nanoseconds and distance in feet. Either way, c=1.
Feynman adopts c=1 for most of two lectures, but not consistently. For your benefit, I will present the equations with all c’s included. In reading chapters 25 and 26 of Volume 2 of Feynman’s Lectures, you can restore missing c’s in his equations by:
replacing each “t” with “ct”
replacing each “v” with “v/c”
replacing each “E” with “E/c”
replacing each “ø” with “ø/c”
replacing each “ρ” with “cρ”
After that, check the units on both sides of each equation and put in c’s as necessary to dimensionally balance the equation.
Let me demonstrate this with an example:
Ď×B = ∂E/∂t (with c=1 and j=0)
Using the replacement list above, the equation becomes:
Ď×B = ∂(E/c)/∂(ct)
c^2 Ď×B = ∂E/∂t
The replacements correctly restored the missing c’s. Alternatively, if one forgets the replacement list, one can go directly to balancing units, also called dimensional analysis. From the Lorentz force F=q(E+v×B), we know that the units of E must be the same as the units of vB; let’s write that [E]=[vB]. In dimensional analysis, Ď×B=∂E/∂t becomes:
[B] / [x] = [E] / [t]
The units on the left are magnetic field B divided by distance x, because the curl is a spatial derivative with many terms of the form ΔB_y/Δx. The units on the right are electric field E divided by time t. Substituting [E]=[vB] yields:
[B] = [x] [vB] / [t] = [v^2] [B]
The units of distance [x] divided by time [t] are the units of velocity [v]. Every valid equation must have the same overall units on both sides. To balance this equation, we must multiply the left side by a velocity squared. Since we left out the c’s, this means the restored equation is:
c^2 Ď×B = ∂E/∂t
Try this on a few equations that you already know. With a little practice, you will find dimensional analysis is easy and highly effective. If you want to be doubly sure, keep all the c’s and do dimensional analysis on your results.
4-Vector Scalar Product
The 4-D scalar product has some surprises. Feynman defines the scalar product of two 4-vectors A and B to be:
A_µ B_µ = +A_t B_t – A_x B_x – A_y B_y – A_z B_z
Here, repeating the µ subscript invokes the Einstein convention, directing us to sum over all four values of the index µ, with the indicated minus signs. Elsewhere Feynman says: “It’s rather awkward to have those minus signs, but that’s the way the world is.” There are several different conventions for indexing, for the order of components, and for the polarity
of product terms. One can get correct answers in any convention, but only if one uses it consistently. I will use Feynman’s conventions here, even though the most common modern convention reverses all the signs on the right side of the above equation (perhaps in the belief that one minus sign is less awkward than three). Let’s examine the scalar product of the position 4-vector with itself; this is essentially squaring that 4-vector.
x_µ = (ct, x, y, z)
x_µ x_µ = +c^2 t^2 – x^2 – y^2 – z^2
Since some of the above terms are negative, x_µ x_µ may be positive, zero, or negative. This might be unsettling if interpreted as a length-squared, so instead, this scalar product is often called an interval.
Just as the scalar product of any two proper 3-vectors is invariant, the scalar product of any two proper 4-vectors is also invariant. Let’s confirm this for both the position and momentum 4-vectors. We will assume a stationary frame S and a frame S* moving along the x-axis at velocity v. The y and z terms are unchanged in each case, so for brevity I will abbreviate them as [yz].
x*_µ x*_µ = c^2 (t*)^2 – (x*)^2 – (y*)^2 – (z*)^2
x*_µ x*_µ = γ^2 (ct–βx)^2 – γ^2 (x–βct)^2 – [yz]
= γ^2 {c^2 t^2 – 2βctx + β^2 x^2 – x^2 + 2βcxt – β^2 c^2 t^2} – [yz]
= γ^2 {c^2 t^2 (1–β^2) – x^2 (1–β^2)} – [yz]
x*_µ x*_µ = c^2 t^2 – x^2 – y^2 – z^2 = x_µ x_µ
p_µ = (E/c, p_x, p_y, p_z)
p_µ p_µ = E^2/c^2 – p_x^2 – p_y^2 – p_z^2
p_µ p_µ = E^2/c^2 – p^2 = m_0^2 c^2
p*_µ p*_µ = (E*)^2/c^2 – (p*_x)^2 – (p*_y)^2 – (p*_z)^2
p*_µ p*_µ = γ^2 (E/c – βp_x)^2 – γ^2 (p_x – βE/c)^2 – [yz]
= γ^2 {E^2/c^2 – 2βEp_x/c + β^2 p_x^2 – p_x^2 + 2βp_x E/c – β^2 E^2/c^2} – [yz]
= γ^2 {(E^2/c^2)(1–β^2) – p_x^2 (1–β^2)} – [yz]
p*_µ p*_µ = E^2/c^2 – p_x^2 – p_y^2 – p_z^2
p*_µ p*_µ = m_0^2 c^2 = p_µ p_µ
Note the useful relationship: γ^2 (1–β^2) = 1.
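The invariance argued above can also be checked numerically. The following is a minimal illustrative sketch that is not part of Feynman’s lecture; the function names boost_x and scalar_product, the sample component values, and the choice β=0.6 are all arbitrary assumptions made for the demonstration.

```python
import math

def boost_x(four_vec, beta):
    """Boost a 4-vector (time-like component first) along +x by beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    a_t, a_x, a_y, a_z = four_vec
    return (gamma * (a_t - beta * a_x),
            gamma * (a_x - beta * a_t),
            a_y,
            a_z)

def scalar_product(a, b):
    """Feynman's sign convention: plus for the time term, minus for space."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

# Hypothetical sample values, in arbitrary but consistent units.
x_vec = (5.0, 1.0, 2.0, 3.0)     # (ct, x, y, z)
p_vec = (4.0, 1.5, -0.5, 2.0)    # (E/c, p_x, p_y, p_z)
beta = 0.6

x_star = boost_x(x_vec, beta)
p_star = boost_x(p_vec, beta)

# Each pair of numbers should agree to rounding error.
print(scalar_product(x_vec, x_vec), scalar_product(x_star, x_star))
print(scalar_product(p_vec, p_vec), scalar_product(p_star, p_star))
print(scalar_product(x_vec, p_vec), scalar_product(x_star, p_star))
```

The third line of output checks the product of two different 4-vectors, which the algebra above does not work out explicitly but which follows from the same linearity.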
Is there a velocity 4-vector? Yes, but one must be careful. In V2p25-2, Feynman points out that someone less clever than you might try:
dx_µ = (cdt, dx, dy, dz). This is correct.
u_µ = dx_µ/dt = (cdt/dt, dx/dt, dy/dt, dz/dt)
u_µ = (c, v_x, v_y, v_z). This is wrong.
The problem is that dt is not invariant; its value is different in different reference frames. By analogy: (1, v_y/v_x, v_z/v_x) is not a proper 3-vector.
Feynman presents the right approach. Start with the momentum 4-vector.
p_µ = (E/c, p_x, p_y, p_z)
Recall that E = γmc^2 and p = γmv, where m is the object’s rest mass. Now divide p_µ by the invariant scalar m. The linearity of the Lorentz transformation ensures that any constant multiple of a 4-vector is also a 4-vector.
u_µ = p_µ / m = (γ c, γ v_x, γ v_y, γ v_z)
We have proven that u_µ is a 4-vector, and we define this to be the 4-velocity. Its scalar product with itself is:
u_µ u_µ = γ^2 (c^2 – v_x^2 – v_y^2 – v_z^2)
u_µ u_µ = γ^2 (c^2 – v^2) = c^2
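As a small illustration of the last result, the sketch below builds a 4-velocity from an ordinary 3-velocity and confirms that its square equals c^2. This is an assumed example, not something from the lecture; the function name four_velocity and the sample velocity are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(vx, vy, vz):
    """Return (gamma c, gamma v_x, gamma v_y, gamma v_z) for a 3-velocity."""
    v_squared = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - v_squared / C**2)
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)

# Hypothetical sample: |v|^2 = 0.5 c^2, so gamma is the square root of 2.
u = four_velocity(0.3 * C, 0.4 * C, 0.5 * C)
u_squared = u[0]**2 - u[1]**2 - u[2]**2 - u[3]**2

# The ratio should be 1 to rounding error, whatever velocity is chosen.
print(u_squared / C**2)
```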
How To Make Antiparticles
You can try this at home … if you have a huge tract of land (and billions of dollars to spare). Using what we have learned about 4-vectors, let’s analyze the process of producing new particles. To concentrate as much energy as possible in a tiny, particle-sized volume, physicists use brute force, smashing the highest energy particles available into other particles. Let’s start with what is easiest, and therefore what was done first: smashing a high-energy beam into a stationary target. If the beam consists of protons, the fundamental processes of interest are: beam-proton-hits-target-proton; and beam-proton-hits-target-neutron. Let’s focus on the former. Since particle accelerators are extremely expensive and their cost increases with beam energy, our particular interest is calculating the least beam energy needed to produce an antiproton. We consider a proton-proton collision in two reference frames: the Lab frame, in which the target is stationary; and the center of mass (CM) frame, in which the beam and target protons have equal but opposite momenta. Figure 26-1 illustrates the before-collision and after-collision states in both the Lab and CM frames.
Figure 26-1 Producing 2 New Particles
Here, the black circles represent protons and the open circle represents an antiproton. In Feynman Simplified 3C, Chapter 26, we explore elementary particles and various conservation laws that govern their interactions. One of those laws is the conservation of the total net number of quarks. For our purposes here, at modest energies, one consequence of quark conservation is that the (total number of protons) minus the (total number of antiprotons) never changes. This means that to create an antiproton we must simultaneously create a new proton. We cannot make just one new particle; we must make pairs. The minimum beam energy required to create an antiproton is therefore the minimum energy required to create a proton-antiproton pair. Hence the final state has four particles: three protons and one antiproton, as Figure 26-1 indicates. The minimum energy that four particles can have is the sum of their rest masses. That minimum is attained only if all four particles are stationary and have zero kinetic energy. This can occur only in the CM frame, where the total momentum is zero. In any other frame, total momentum is non-zero and at least some particles must have non-zero kinetic energy. We now employ four conservation laws, one for energy and one each for the three components of momentum. With 4-vectors, these four laws are combined into a single equation:
p^b_µ + p^t_µ = p^a_µ
Here, p^b_µ is the beam 4-momentum before collision, p^t_µ is the target 4-momentum before collision, and p^a_µ is the 4-momentum of the four-particle state after collision.
In V2p25-5, Feynman stresses that every 4-vector equation is valid in every reference frame, just as the 3-vector equation F=dp/dt is valid in every frame. It is wise to choose frames in which the math is simplest. Taking the scalar product of each side with itself yields:
(p^b_µ + p^t_µ) (p^b_µ + p^t_µ) = p^a_µ p^a_µ
p^b_µ p^b_µ + 2 p^b_µ p^t_µ + p^t_µ p^t_µ = p^a_µ p^a_µ
M^2 c^2 + 2 p^b_µ p^t_µ + M^2 c^2 = 16 M^2 c^2
Here, we used p^X_µ p^X_µ = m_X^2 c^2: the square of any object’s 4-momentum is always equal to the square of its rest mass multiplied by c^2. Let the proton rest mass be M. Since antiparticles have the same mass as the corresponding particles, the minimum energy of the four-particle final state is 4Mc^2 in the CM frame, if all four are stationary. In that frame the final 4-momentum is p^a_µ = (4Mc, 0, 0, 0), whose square is 16 M^2 c^2. We therefore have:
p^b_µ p^t_µ = 7 M^2 c^2
Now switch to the Lab frame in which:
stationary target: p^t_µ = (Mc,0,0,0)
beam toward +x: p^b_µ = (E/c,p,0,0)
p^b_µ p^t_µ = EM = 7 M^2 c^2
E = 7 Mc^2
Since E is the beam proton’s total energy, the minimum beam kinetic energy to produce antiprotons is 6 Mc^2, which is 5.63 GeV. The first particle accelerator capable of producing antiprotons was the Bevatron at the University of California at Berkeley, whose design energy was 6.2 GeV, about 10% more than the minimum energy. (The accelerator got its name because American physicists once denoted billion electron-volts by “BeV” instead of the now standard “GeV”.)
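The threshold arithmetic can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function name threshold_beam_energy is hypothetical, the proton rest energy is taken as 0.938272 GeV, and the formula is generalized (as an assumption beyond the text) to n equal-mass particles left at rest in the CM frame, so that 2 M^2 c^2 + 2EM = (nMc)^2.

```python
PROTON_REST_ENERGY_GEV = 0.938272  # M c^2 for the proton, in GeV

def threshold_beam_energy(n_final, rest_energy):
    """Minimum total beam energy on a stationary target that leaves
    n_final equal-mass particles at rest in the CM frame.

    From 2 M^2 c^2 + 2 E M = (n M c)^2 we get E = (n^2 - 2) / 2 * M c^2.
    """
    return (n_final**2 - 2) / 2.0 * rest_energy

# Antiproton production: three protons plus one antiproton in the final state.
total_energy = threshold_beam_energy(4, PROTON_REST_ENERGY_GEV)
kinetic_energy = total_energy - PROTON_REST_ENERGY_GEV

print(f"total = {total_energy:.2f} GeV, kinetic = {kinetic_energy:.2f} GeV")
```

With n_final = 4 the factor (n^2 – 2)/2 equals 7, matching E = 7 Mc^2, and the kinetic energy comes out at about 5.63 GeV.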
We now turn to a related question: for a proton beam of energy E hitting another proton, what is the maximum energy U available for the production of particles? (Here, “production” includes the two original protons.) From above, we know that the total 4-momentum before and after the collision is:
p^b_µ + p^t_µ = p^a_µ
Squaring each side yields:
(p^b_µ + p^t_µ) (p^b_µ + p^t_µ) = p^a_µ p^a_µ
M^2 c^2 + 2 p^b_µ p^t_µ + M^2 c^2 = p^a_µ p^a_µ
The maximum U is attained when the energy used for motion is minimized, which is when all particles are stationary in the CM frame. In that case, the entire CM energy E_cm is available for particle masses, and the right hand side becomes U^2/c^2.
2 p^b_µ p^t_µ + 2 M^2 c^2 = U^2/c^2
For a stationary target, we found that p^b_µ p^t_µ = EM, where E is the beam energy. This means:
U = √{ 2Mc^2 (E + Mc^2) }
This is unfortunate. At high energies, U grows only as the square root of E. Since costs scale almost linearly with E, a single-beam accelerator that costs 25 times as much delivers only 5 times the available energy. This is why physicists switched to machines that deliver more bang for the buck. Colliding-beam accelerators produce two beams that circulate in opposite directions and periodically collide head-on. For colliding beams of equal energy, the Lab frame is the CM frame, and all of the energy of both beams is available to produce particles. Hence:
U = 2E
A colliding-beam accelerator that costs 25 times as much delivers nearly 25 times the available energy. This is why we now build these much more complex accelerators. One of their many challenges is getting the two beams to actually collide. Aiming a beam at a large, stationary block of matter is trivial compared to aiming it at another beam that is 16 microns (0.0006 inches) wide and moving at virtually the speed of light.
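To make the comparison concrete, here is a hedged numerical sketch. The function names and the sample beam energies of 1000 and 25000 GeV are assumptions chosen only for illustration; the formulas are the two expressions for U given above, with the proton rest energy taken as 0.938272 GeV.

```python
import math

PROTON_REST_ENERGY_GEV = 0.938272  # M c^2 for the proton, in GeV

def available_energy_fixed_target(beam_energy, rest_energy):
    """U = sqrt(2 M c^2 (E + M c^2)) for a beam hitting a stationary target."""
    return math.sqrt(2.0 * rest_energy * (beam_energy + rest_energy))

def available_energy_collider(beam_energy):
    """U = 2E for two equal-energy beams colliding head-on."""
    return 2.0 * beam_energy

# Hypothetical beam energies, in GeV.
low, high = 1000.0, 25000.0

fixed_ratio = (available_energy_fixed_target(high, PROTON_REST_ENERGY_GEV)
               / available_energy_fixed_target(low, PROTON_REST_ENERGY_GEV))
collider_ratio = available_energy_collider(high) / available_energy_collider(low)

# 25 times the beam energy buys about 5 times U on a fixed target,
# but 25 times U in a collider.
print(fixed_ratio, collider_ratio)
```

As a consistency check, inserting the antiproton threshold E = 7 Mc^2 into the fixed-target formula gives U = 4 Mc^2, the rest energy of the four final particles.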
4-D Gradient
In three dimensions, we found that the differential operator combination:
Ď = (∂/∂x, ∂/∂y, ∂/∂z)
transforms like a proper 3-vector. This is the 3-vector gradient operator. It is most commonly denoted by an inverted Δ, but since that symbol isn’t supported by all eBook formats, I use Ď instead. We want to find the four-dimensional equivalent of Ď. In V2p25-6, Feynman shows that a seemingly reasonable choice is wrong, and then merely states the right answer. Let’s instead derive the correct 4-gradient. Clearly the 4-gradient operator will be some combination of ∂/∂t, ∂/∂x, ∂/∂y, and ∂/∂z. By symmetry, the coefficients of the three spatial derivatives must all be the same. We also know that other spacetime 4-vectors often have minus signs in unexpected places. Without loss of generality, we set the coefficient of ∂/∂t to 1/c, and set the spatial coefficients to b, a quantity we seek to determine. Let’s therefore try the combination:
Ď_µ = (c^–1 ∂/∂t, b ∂/∂x, b ∂/∂y, b ∂/∂z)
The 4-gradient must transform like a proper 4-vector. Let’s examine the partial derivatives of a scalar function f in two reference frames: stationary frame S; and frame S* moving toward +x at velocity v. To reduce clutter, we will ignore the y- and z-derivatives for now.
in S: Δf = ∂f/∂t Δt + ∂f/∂x Δx
in S*: Δf = ∂f/∂t* Δt* + ∂f/∂x* Δx*
We now use the Lorentz transformation to relate Δt* and Δx* to Δt and Δx.
cΔt* = γ (cΔt – vc^–1 Δx)
Δx* = γ (Δx – vΔt)
cΔt = γ (cΔt* + vc^–1 Δx*)
Δx = γ (Δx* + vΔt*)
in S*: Δf = γ{ ∂f/∂t* (Δt – vc^–2 Δx) + ∂f/∂x* (Δx – vΔt)}
We find ∂f/∂t by setting Δx=0 and taking the limit of infinitesimal Δt.
(Δf/Δt) –> ∂f/∂t = γ{ ∂f/∂t* – v ∂f/∂x*}
We also find ∂f/∂x by setting Δt=0 and taking the limit of infinitesimal Δx.
(Δf/Δx) –> ∂f/∂x = γ{ ∂f/∂x* – vc^–2 ∂f/∂t*}
Now, let’s compare these to the results of our trial 4-gradient Ď_µ f and its Lorentz transformed version Ď*_µ f.
Ď_µ = (c^–1 ∂/∂t, b ∂/∂x, b ∂/∂y, b ∂/∂z)
Ď*_µ = (c^–1 ∂/∂t*, b ∂/∂x*, b ∂/∂y*, b ∂/∂z*)
If Ď_µ f is a proper 4-vector, its components in S follow from those in S* by the inverse boost:
Ď_t f = γ (Ď*_t f + vc^–1 Ď*_x f)
Ď_x f = γ (Ď*_x f + vc^–1 Ď*_t f)
Inserting the component definitions converts the prior two equations to:
c^–1 ∂f/∂t = γ (c^–1 ∂f/∂t* + vc^–1 b ∂f/∂x*)
b ∂f/∂x = γ (b ∂f/∂x* + vc^–2 ∂f/∂t*)
∂f/∂t = γ (∂f/∂t* + b v ∂f/∂x*)
∂f/∂x = γ (∂f/∂x* + b^–1 vc^–2 ∂f/∂t*)
For Ď_µ to transform properly, these equations must match our earlier equations, which are:
∂f/∂t = γ{ ∂f/∂t* – v ∂f/∂x*}
∂f/∂x = γ{ ∂f/∂x* – vc^–2 ∂f/∂t*}
Both equations match with b = –1, which yields the 4-gradient operator Ď_µ:
Ď_µ = (+c^–1 ∂/∂t, –∂/∂x, –∂/∂y, –∂/∂z)
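The two chain-rule relations that fix b = –1 can be spot-checked with finite differences. The sketch below is an illustration under stated assumptions, not part of the lecture: the scalar field f_star, the helper partial, the evaluation point, and the choice of units with c = 1 and v = 0.6c are all arbitrary.

```python
import math

C = 1.0            # assumption: units with c = 1; formulas keep c explicit
V = 0.6 * C        # assumed boost speed
GAMMA = 1.0 / math.sqrt(1.0 - (V / C) ** 2)

def f_star(t_s, x_s):
    """An arbitrary smooth scalar field written in S* coordinates."""
    return math.sin(1.3 * t_s) * math.exp(-0.4 * x_s) + t_s * x_s

def f(t, x):
    """The same scalar field expressed in S coordinates via the boost."""
    t_s = GAMMA * (t - V * x / C**2)
    x_s = GAMMA * (x - V * t)
    return f_star(t_s, x_s)

def partial(func, args, index, h=1e-5):
    """Central finite-difference derivative with respect to args[index]."""
    up = list(args)
    dn = list(args)
    up[index] += h
    dn[index] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

t, x = 0.7, -0.2
t_s = GAMMA * (t - V * x / C**2)
x_s = GAMMA * (x - V * t)

df_dt = partial(f, (t, x), 0)
df_dx = partial(f, (t, x), 1)
dfs_dt = partial(f_star, (t_s, x_s), 0)
dfs_dx = partial(f_star, (t_s, x_s), 1)

# Residuals of the two relations; both should be near zero.
err_t = df_dt - GAMMA * (dfs_dt - V * dfs_dx)
err_x = df_dx - GAMMA * (dfs_dx - V / C**2 * dfs_dt)
print(err_t, err_x)
```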
Feynman says: Ď_µ “behaves as a 4-vector should.” This simply means that the 4-gradient of a proper scalar field is a proper 4-vector. Recall his prior statement: “It’s rather awkward to have those minus signs, but that’s the way the world is.” With Ď_µ, the 4-divergence is defined as:
Ď_µ A_µ = +Ď_t A_t – Ď_x A_x – Ď_y A_y – Ď_z A_z
Ď_µ A_µ = +c^–1 ∂A_t/∂t – (–∂A_x/∂x) – (–∂A_y/∂y) – (–∂A_z/∂z)
Ď_µ A_µ = +c^–1 ∂A_t/∂t + ∂A_x/∂x + ∂A_y/∂y + ∂A_z/∂z
The minus signs in the first line arise from the definition of the 4-vector scalar product. The additional minus signs in the second line arise from the definition of Ď_µ. All minus signs cancel in the last line.
The divergence as defined above is the scalar product of two proper 4-vectors and is therefore an invariant scalar under the Lorentz transformation. It is sometimes helpful to separate a 4-vector scalar product into its temporal and spatial parts. The above equation then becomes:
Ď_µ A_µ = +c^–1 ∂A_t/∂t + Ď•A
In V2p25-7, Feynman provides an example of using Ď_µ. In Chapter 14, we found that ρ and j form a 4-vector, called the 4-current, given by:
j_µ = (cρ, j_x, j_y, j_z)
Recall the 3-D equation for the conservation of electric charge:
Ď•j = – ∂ρ/∂t
We can rewrite this in 4-vector form.
Ď_µ j_µ = c^–1 ∂(cρ)/∂t + Ď•j = ∂ρ/∂t + Ď•j
Ď_µ j_µ = 0
Since the scalar product of two 4-vectors is an invariant scalar, this ensures the conservation of charge in every reference frame. Now consider the invariant scalar product Ď_µ Ď_µ:
Ď_µ Ď_µ = c^–2 ∂/∂t ∂/∂t – (–∂/∂x)(–∂/∂x) – (–∂/∂y)(–∂/∂y) – (–∂/∂z)(–∂/∂z)
Ď_µ Ď_µ = c^–2 ∂^2/∂t^2 – ∂^2/∂x^2 – ∂^2/∂y^2 – ∂^2/∂z^2
Ď_µ Ď_µ = c^–2 ∂^2/∂t^2 – Ď•Ď
The square of the 4-gradient equals c^–2 ∂^2/∂t^2 minus the square of the 3-gradient. This is the same combination that we found in the wave equation. This is the d’Alembert operator, also called the d’Alembertian, which operates on a 4-vector field to produce another 4-vector field. It is most often written:
☐ = Ď_µ Ď_µ
☐ = c^–2 ∂^2/∂t^2 – ∂^2/∂x^2 – ∂^2/∂y^2 – ∂^2/∂z^2
Note that Feynman’s definitions have the opposite signs of the most common modern usage. He also writes the d’Alembertian as ☐^2, which is not uncommon but is redundant (nothing else is symbolized by ☐^1). This square symbol is said to represent the four dimensions of spacetime squared.
We have not yet addressed the 4-D equivalents of cross products and curls, but we will in coming chapters. The 4-vector algebra that we have developed is summarized in the Review section at the end of this chapter.
Electrodynamics in 4-Vectors
In Chapters 4 and 15, we derived the following equations for the scalar potential ø and the vector potential A:
ø(r) = (1/4πε_0) ∫_V ρ(σ) dV / |r–σ|
A(r) = (1/4πε_0 c^2) ∫_V j(σ) dV / |r–σ|
Define the linear operator Ω, which acts on a source density, as:
Ω = (1/4πε_0 c^2) ∫_V dV / |r–σ|
Evidently, after rearranging factors of c in the first equation, the above equations can be written:
ø/c = Ω (cρ)
A = Ω (j)
Since (cρ, j) is a 4-vector and Ω combines 4-vectors linearly, (ø/c, A) must also be a 4-vector, which we define as the 4-potential A_µ. Since ø and A separately satisfy the sourced-wave equation, so must A_µ.
☐A_µ = j_µ / (c^2 ε_0)
In V2p25-9, Feynman reminds us that these equations are valid only in the Lorentz gauge, which is defined in 3-D by:
∂ø/∂t + c^2 Ď•A = 0
In 4-D, the Lorentz gauge is simply: Ď_µ A_µ = 0.
A Moving Charge in 4-D
Since the 4-potential A_µ is a proper 4-vector, it is governed by the Lorentz transformation. For motion with velocity v toward +x, the components of A*_µ in the moving frame are:
A*_t = γ (A_t – vc^–1 A_x)
A*_x = γ (A_x – vc^–1 A_t)
A*_y = A_y
A*_z = A_z
Let’s now calculate the fields from a charge moving toward +x with velocity v. We could calculate the fields in our rest frame S in which the charge velocity is v. Alternatively, we could calculate the fields in the rest frame S* of the charge in which its velocity is zero. The latter is particularly simple. Assume charge q is at rest at the origin of the S* coordinate system. The vector potential is zero for a stationary charge, leaving only the scalar potential, which is:
ø* = q / (4πε_0 r*)
Here, r* is the distance in S* from the charge to the point P at which the fields are evaluated. The geometry is shown in Figure 26-2.
Figure 26-2 Moving Charge in 2 Frames
We can transform the potentials in S* back to our rest frame S according to:
ø/c = γ ( ø*/c + 0)
A_x = γ ( 0 + vc^–1 ø*/c)
A_y = 0
A_z = 0
ø = γ q / (4πε_0 r*)
A_x = v ø / c^2
The last remaining task is calculating r* in terms of S coordinates.
r* = √{ (x*)^2 + (y*)^2 + (z*)^2 }
r* = √{ γ^2 (x–vt)^2 + y^2 + z^2 }
This matches the result we derived (with much greater pain) using Lienard-Wiechert potentials in Chapter 21.
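The potentials of the moving charge are easy to evaluate numerically. The sketch below is illustrative only; the function names r_star and potentials_moving_charge, the SI constants, and the sample charge, speed, and field point are assumptions. It also checks that the square of the 4-potential is the same in S and in the charge’s rest frame S*, as it must be for a proper 4-vector.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m
C = 299_792_458.0         # speed of light in m/s

def r_star(v, t, x, y, z):
    """Distance from the charge to the field point, measured in S*."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return math.sqrt(gamma**2 * (x - v * t) ** 2 + y**2 + z**2)

def potentials_moving_charge(q, v, t, x, y, z):
    """Return (phi, A_x) in frame S for a charge moving toward +x at speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    phi = gamma * q / (4.0 * math.pi * EPS0 * r_star(v, t, x, y, z))
    a_x = v * phi / C**2
    return phi, a_x

# Hypothetical sample: one elementary charge at 0.6c, field point 1 m away.
q, v = 1.602176634e-19, 0.6 * C
point = (0.0, 0.3, 0.8, 0.5)          # (t, x, y, z) in SI units

phi, a_x = potentials_moving_charge(q, v, *point)
phi_rest = q / (4.0 * math.pi * EPS0 * r_star(v, *point))   # phi* in S*

# Invariant square of the 4-potential: (phi/c)^2 - A_x^2 = (phi*/c)^2.
print((phi / C) ** 2 - a_x**2, (phi_rest / C) ** 2)
```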
Maxwell in 2 Equations
In V2p25-10, Feynman says the basic equations of electromagnetism can be reduced to two equations written in 4-D vector algebra. These are:
☐A_µ = j_µ / (c^2 ε_0)
Ď_µ j_µ = 0
One might quibble, saying that we must include:
Ď_µ A_µ = 0, the Lorentz gauge
F = q(E+v×B), the Lorentz force
E = –Ďø – ∂A/∂t
B = Ď×A
Nonetheless, it is impressive how compact these equations have become. Feynman says: “There, in one tiny space on the page, are all of the Maxwell equations—beautiful and simple. Did we learn anything from writing the equations this way, besides that they are beautiful and simple? In the first place, is it anything different from what we had before when we wrote everything out in all the various components? Can we from this equation deduce something that could not be deduced from the wave equations for the potentials in terms of the charges and currents? The answer is definitely no. The only thing we have been doing is changing the names of things—using a new notation. …What then is the significance of the fact that the equations can be written in this simple form? From the point of view of deducing anything directly, it doesn’t mean anything.” He adds: “Perhaps, though, the simplicity of the equations means that nature also has a certain simplicity.”
Feynman’s Lectures preceded the now over-worked phrase “Theory of Everything.” But he often foresaw future scientific and technological developments. Here is Feynman’s Theory of Everything from 1962. “Let us show you something interesting that we have recently discovered: All of the laws of physics can be contained in one equation. That equation is: U= 0 What a simple equation! Of course, it is necessary to know what the symbol means. U is …the “unworldliness” … Here is how you calculate the unworldliness. You take all of the known physical laws and write them in a special form. For example, suppose you take the law of mechanics, F=ma, and rewrite it as F–ma=0. Then you can call (F–ma), which should, of course, be zero, the “mismatch” of mechanics. Next, you take the square of this mismatch and call it U_1.”
He goes on to define U_2 = (Ď•E–ρ/ε_0)^2, etc. Once we have a U_j for each known equation of physics, we simply add them:
U = Σ_j U_j = 0, total unworldliness is zero
In V2p25-11, Feynman continues: “So the ‘beautifully simple’ law U=0 is equivalent to the whole series of equations that you originally wrote down. It is therefore absolutely obvious that a simple notation that just hides the complexity in the definitions of symbols is not real simplicity. It is just a trick. The beauty that appears…is no more than a trick. When you unwrap the whole thing, you get back where you were before.” But he says the equations listed at the start of this section do far more than does U=0. “However, there is more to the simplicity of the laws of electromagnetism written [as 4-vectors]. It means more, just as a theory of vector analysis means more. The fact that the electromagnetic equations can be written in a very particular notation which was designed for the four-dimensional geometry of the Lorentz transformations… It is because the Maxwell equations are invariant under those transformations that they can be written in a beautiful form. “There is, however, another reason for writing our equations this way. It has been discovered, after Einstein guessed that it might be so, that all of the laws of physics are invariant under the Lorentz transformation. That is the principle of relativity. Therefore, if we invent a notation which shows immediately when a law is written down whether it is invariant or not, we can be sure that in trying to make new theories we will write only equations which are consistent with the principle of relativity. “The fact that the Maxwell equations are simple in this particular notation is not a miracle, because the notation was invented with them in mind. But the interesting physical thing is that every law of physics, the propagation of meson waves or the behavior of neutrinos in beta decay, and so forth, must have this same invariance under the same transformation. Then when you are moving at a uniform velocity in a spaceship, all of the laws of nature transform together in such a way that no new phenomenon will show up. It is because the principle of relativity is a fact of nature that in the notation of four-dimensional vectors the equations of the world will look simple.”
Chapter 26 Review: Key Ideas
• Proper 4-vectors transform according to the Lorentz transformation, which is identical to the Euclidean transformation for rotations among the three spatial dimensions, but adds a boost for velocity changes. To transform 4-vector C_µ from reference frame S to C*_µ in reference frame S* that is moving at speed v in the +x-direction relative to S, the boost is:
C*_t = γ (C_t – β C_x)
C*_x = γ (C_x – β C_t)
C*_y = C_y
C*_z = C_z
Here, β=v/c, γ=1/√(1–β^2), and γ^2 (1–β^2)=1. Since the Lorentz transformation is linear, any linear sum of proper 4-vectors produces another proper 4-vector. Some 4-vectors and their squares are:
4-position: x_µ = (ct, x, y, z); x_µ x_µ = c^2 t^2 – x^2 – y^2 – z^2
4-momentum: p_µ = (E/c, p_x, p_y, p_z); p_µ p_µ = E^2/c^2 – p^2 = m_0^2 c^2
4-velocity: u_µ = (γ c, γ v_x, γ v_y, γ v_z); u_µ u_µ = c^2
4-current: j_µ = (cρ, j_x, j_y, j_z)
4-potential: A_µ = (ø/c, A_x, A_y, A_z)
• The 4-D equations of electromagnetism are:
☐A_µ = j_µ / (c^2 ε_0)
Ď_µ j_µ = 0, conservation of charge
Ď_µ A_µ = 0, the Lorentz gauge
F = q(E+v×B), the Lorentz force
E = –Ďø – ∂A/∂t
B = Ď×A
Vector Algebra in 3-D and 4-D
Vectors
3D: A = (A_x, A_y, A_z)
4D: A_µ = (A_t, A_x, A_y, A_z) = (A_t, A)
Scalar Product of Vectors
3D: A•B = A_x B_x + A_y B_y + A_z B_z
4D: A_µ B_µ = A_t B_t – A_x B_x – A_y B_y – A_z B_z
Differential Vector Operator
3D: Ď = (∂/∂x, ∂/∂y, ∂/∂z)
4D: Ď_µ = (c^–1 ∂/∂t, –∂/∂x, –∂/∂y, –∂/∂z)
Gradient of Scalar Field f
3D: Ďf = (∂f/∂x, ∂f/∂y, ∂f/∂z)
4D: Ď_µ f = (c^–1 ∂f/∂t, –∂f/∂x, –∂f/∂y, –∂f/∂z)
Divergence of Vector Field
3D: Ď•A = ∂A_x/∂x + ∂A_y/∂y + ∂A_z/∂z
4D: Ď_µ A_µ = c^–1 ∂A_t/∂t + ∂A_x/∂x + ∂A_y/∂y + ∂A_z/∂z
Laplacian / d’Alembertian
3D: Ď•Ď = ∂^2/∂x^2 + ∂^2/∂y^2 + ∂^2/∂z^2
4D: ☐ = Ď_µ Ď_µ = c^–2 ∂^2/∂t^2 – ∂^2/∂x^2 – ∂^2/∂y^2 – ∂^2/∂z^2
Chapter 27 Transformation of Fields In Chapter 21 and Chapter 26, we employed two different approaches to derive the potentials ø and A of a moving charge. Here we will use yet another approach to the same problem. Whenever you try to solve a complex problem, it is wise to try multiple approaches. If you get the same result in several different ways, you will be much more confident in your calculations.
Constant Velocity Potentials
Consider the potentials at point F from a charge at point P moving toward +x with constant velocity v, as shown in Figure 27-1.
Figure 27-1 Fields at F from Charge at P
In V2p26-1, Feynman says: you “should not be confused” by the fact that he is using different notation here than in prior chapters. I will endeavor to minimize that confusion. Let point F have coordinates (x,y,z). The charge q is at point P at time t with coordinates (vt,0,0), and r is the vector from P to F. Figure 27-1 also shows P_ret, the position of charge q at retarded time t_ret. The vector from P_ret to F is r_ret. Adapting the results of the prior chapter to this new notation, we have:
ø = γ q / (4πε_0 r)
A_x = v ø / c^2
A_y = A_z = 0
r = |r| = √{ γ^2 (x–vt)^2 + y^2 + z^2 }
Note that these equations make no reference to retarded quantities: r is the distance to F from P, the position of the charge at the same time t at which the fields at F are evaluated. I find the next portion of this lecture rather confusing. Feynman stresses an important point, then presents a seemingly reasonable argument, and ends up showing why that reasonable argument is wrong. Perhaps intending this as a cautionary tale, he offers this warning: “Whenever you see a sweeping statement that a tremendous amount can come from a very small number of assumptions, you always find that it is false. There are usually a large number of implied assumptions that are far from obvious if you think about them sufficiently carefully.” I think this warning is partially appropriate. It is true that one should always be aware of the assumptions underlying any assertion. And, certainly many sweeping statements have been made, even by eminent scientists, based on implicit and sometimes unjustified assumptions. This is particularly true when “science” is presented by commercial media hungry for sensational soundbites. One must therefore remember that statements are never scientific proof; only observations of nature can be compelling evidence in science. Yet, sweeping statements are not always false; some are well-confirmed by observation, including the conservation of energy, momentum, and charge. These should be cherished. To distinguish the substantive from the merely flashy, physicists must exercise their own judgment, which we develop over time. Success in physics is not so easy that it can be achieved by a robot, slavishly following a few simple rules. Let’s now go carefully through this portion of Feynman’s lecture and discover what is worth learning. Feynman stresses that the equations we derived, assuming constant charge velocity, depend on where the charge is at the same time that the fields are evaluated, not at the retarded time t_ret, as we have often said.
This seems contrary to the certain truth that all effects from charge q require time to reach point F. How can what happens at F at time t be determined by what happens at P at the same time t?
This point is discussed in Chapter 21, where we compare the Lienard-Wiechert equations with Feynman’s general equations for the fields from a moving charge, which are:
(4πε_0) E(R,t) = +q{ R/R^3 + (R/c) d(R/R^3)/dt + (1/c^2) d^2(R/R)/dt^2 }
B(R,t) = +R×E(R,t) / (Rc)
Here, R is the position at which we evaluate the fields from a charge q that was at (0,0,0) at time t_ret, the time when q emitted these fields, with t_ret = t–R/c.
Since a charge moving with constant velocity has zero acceleration, the last term in Feynman’s general equation is zero. We found in Chapter 21 that, for this case, the middle term exactly corrects for retardation, for the time delay due to the finite speed of light. We can therefore drop the middle term and replace R with r, the vector from where the charge is at time t to the point at which we evaluate the fields at time t. The retardation of fields is real, but the experimentally confirmed equations of electromagnetism automatically correct for that, if all charges move with constant velocity. As Feynman says, if a charge has a constant velocity, its actions at point P at time t are completely determined by its actions at point P_ret at time t_ret. The complex equations may obscure this reality, but it is nonetheless true.
An essential point is that the simpler equations are valid if the charge has zero acceleration at t_ret, the time when the fields that later reach point F are emitted at P_ret. The acceleration may be non-zero either before or after t_ret.
The above material is worth learning.
A Primrose Path To Avoid
This section describes the tale Feynman tells in V2p26-1,2 that may seem reasonable, but is erroneous. You may find it amusing and cautionary, but you will miss nothing essential by skipping to the next section. Feynman says that if we make the “…assumption that the potentials depend only upon the position and the velocity at the retarded moment…a complete formula for the potentials for a charge moving any way” is provided by:
ø = γ q / (4πε_0 r)
A_x = v ø / c^2
A_y = A_z = 0
r = √{ γ^2 (x–vt)^2 + y^2 + z^2 }
The scheme for finding the potentials at F at time t from a moving charge is: (1) find the charge position and velocity at time t_ret; (2) find the charge position at time t assuming constant charge velocity; and (3) use the above equations.
Feynman says: “…[knowing] the potentials from a charge moving in any manner whatsoever, we have the complete electrodynamics; we can get the potentials of any charge distribution by superposition. Therefore we can summarize all the phenomena of electrodynamics either by writing Maxwell’s equations or by the following series of remarks. (Remember them in case you are ever on a desert island. From them, all can be reconstructed. You will, of course, know the Lorentz transformation; you will never forget that on a desert island or anywhere else.) “First, A_µ is a four-vector. Second, the Coulomb potential for a stationary charge is q/4πε_0 r. Third, the potentials produced by a charge moving in any way depend only upon the velocity and position at the retarded time. With those three facts we have everything…” He then explains the trap at the end of this primrose path: “It is sometimes said, by people who are careless, that all of electrodynamics can be deduced solely from the Lorentz transformation and Coulomb’s law. Of course, that is completely false… The fields depend not only on the position and the velocity along the path but also on the acceleration. So there are several additional tacit assumptions in this great statement that everything can be deduced from the Lorentz transformation.” He then concludes with the warning I quoted earlier, which amounts to: “If it seems too good to be true, it probably isn’t true.” The moral of this long story is simple: don’t use the constant velocity equations when the velocity isn’t constant.
Constant Velocity Fields
There are interesting situations in which charge velocities are constant, including electrons flowing in a conductor, or high-energy particles traversing a detector. In such cases, we can obtain the electric and magnetic fields from the constant velocity potentials. First, recall that:
∂r^–1/∂z = – r^–2 ∂r/∂z
∂r^–1/∂z = – r^–2 (1/2) (1/r) (2z)
∂(1/r)/∂z = – z/r^3
We now calculate, in 3-vector algebra, the three components of E from:
E = – Ďø – ∂A/∂t
ø = γ q / (4πε_0 r)
A_x = v ø / c^2
A_y = A_z = 0
r = √{ γ^2 (x–vt)^2 + y^2 + z^2 }
E_z = –∂ø/∂z – ∂A_z/∂t
E_z = (γq/4πε_0) (z/r^3)
Similarly,
E_y = –∂ø/∂y – ∂A_y/∂t
E_y = (γq/4πε_0) (y/r^3)
The x-component is more complex.
E_x = –∂ø/∂x – ∂A_x/∂t
E_x = –(γq/4πε_0) ∂r^–1/∂x – vc^–2 ∂ø/∂t
We will deal with each term separately.
∂r^–1/∂x = (–r^–2) (1/2r) γ^2 2(x–vt) (1)
∂r^–1/∂x = –(1/r^3) γ^2 (x–vt)
∂ø/∂t = (γq/4πε_0) (–1/2r^3) γ^2 2(x–vt) (–v)
∂ø/∂t = (γq/4πε_0) (1/r^3) γ^2 v (x–vt)
E_x = (γq/4πε_0) {γ^2 – γ^2 v^2 c^–2} (x–vt) / r^3
E_x = (γq/4πε_0) (x–vt) / r^3
We next calculate the three components of B from:
B = Ď×A
B_z = ∂A_y/∂x – ∂A_x/∂y
B_z = – vc^–2 ∂ø/∂y
As we found above, E_y = –∂ø/∂y since A_y=0. Hence:
B_z = vc^–2 E_y
Similarly,
B_y = ∂A_x/∂z – ∂A_z/∂x
B_y = vc^–2 ∂ø/∂z = – vc^–2 E_z
Finally, B_x = ∂A_z/∂y – ∂A_y/∂z = 0, since A_y=A_z=0.
We can write B in one equation: B = v×E /c^2.
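The field components derived above can be checked by differentiating the potentials numerically. The following sketch is an illustration under stated assumptions, not a method from the lecture: it uses units in which c = 1 and q/(4πε_0) = 1, an assumed speed v = 0.8c, an arbitrary evaluation point, and a hypothetical finite-difference helper d.

```python
import math

C = 1.0     # assumption: units with c = 1
K = 1.0     # assumption: q / (4 pi eps0) set to 1
V = 0.8 * C
GAMMA = 1.0 / math.sqrt(1.0 - (V / C) ** 2)

def r_eff(t, x, y, z):
    """r = sqrt(gamma^2 (x - vt)^2 + y^2 + z^2)."""
    return math.sqrt(GAMMA**2 * (x - V * t) ** 2 + y**2 + z**2)

def phi(t, x, y, z):
    """Scalar potential of the charge moving toward +x."""
    return GAMMA * K / r_eff(t, x, y, z)

def a_x(t, x, y, z):
    """x-component of the vector potential; A_y and A_z are zero."""
    return V * phi(t, x, y, z) / C**2

def d(func, point, index, h=1e-5):
    """Central finite-difference derivative with respect to point[index]."""
    up = list(point)
    dn = list(point)
    up[index] += h
    dn[index] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

point = (0.3, 1.0, 0.5, -0.7)   # (t, x, y, z), arbitrary

# E = -grad(phi) - dA/dt, with only A_x non-zero.
e_num = (-d(phi, point, 1) - d(a_x, point, 0),
         -d(phi, point, 2),
         -d(phi, point, 3))
# B = curl A with A = (A_x, 0, 0).
b_num = (0.0, d(a_x, point, 3), -d(a_x, point, 2))

t, x, y, z = point
r3 = r_eff(*point) ** 3
e_formula = (GAMMA * K * (x - V * t) / r3, GAMMA * K * y / r3, GAMMA * K * z / r3)
# v x E / c^2 with v = (V, 0, 0).
b_formula = (0.0, -V * e_formula[2] / C**2, V * e_formula[1] / C**2)

max_err = max(abs(n - f) for n, f in zip(e_num + b_num, e_formula + b_formula))
print(max_err)
```

The printed value is the largest disagreement between the numerical derivatives and the closed-form components; it should be tiny compared with the field values themselves.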
Now let’s examine the electric field as a whole. Combining the components derived above yields:
E = (γq/4πε_0 r^3) (x–vt, y, z)
with r = √{ γ^2 (x–vt)^2 + y^2 + z^2 }
Define X=x–vt. The coordinate system Xyz is centered where the charge is at time t. To be clear: q’s position is (vt,0,0) in xyz-coordinates, and (0,0,0) in Xyz-coordinates. The E field is:
E(t,X,y,z) = (γq/4πε_0 r^3) (X, y, z)
E(t,R) = (γq/4πε_0 r^3) R
with R = (X, y, z) and r = √{ γ^2 X^2 + y^2 + z^2 }
Since E is directly proportional to R, E points radially outward from q everywhere for q>0, and radially inward for q