Feynman Simplified 2D: Magnetic Matter, Elasticity, Fluids, & Curved Spacetime [2 ed.]


Feynman Simplified 2D: Magnetic Matter, Elasticity, Fluids, & Curved Spacetime Everyone’s Guide to the Feynman Lectures on Physics by Robert L. Piccioni, Ph.D.

Copyright © 2015 by Robert L. Piccioni Published by Real Science Publishing 3949 Freshwind Circle Westlake Village, CA 91361, USA Edited by Joan Piccioni

All rights reserved, including the right of reproduction in whole or in part, in any form. Visit our web site www.guidetothecosmos.com

Everyone’s Guide to the Feynman Lectures on Physics Feynman Simplified gives mere mortals access to the fabled Feynman Lectures on Physics.

This Book Feynman Simplified: 2D covers the final quarter of Volume 2 of The Feynman Lectures on Physics. The topics we explore include: Principle of Least Action Tensors in 3-D and 4-D Spacetime Magnetic Materials Diamagnetism & Paramagnetism Ferromagnetism Elasticity & Elastic Matter Viscosity & Liquid Flow Gravity and Curved Spacetime To find out about other eBooks in the Feynman Simplified series, click HERE. I welcome your comments and suggestions. Please contact me through my WEBSITE. If you enjoy this eBook please do me the great favor of rating it on Amazon.com or BN.com.

Table of Contents Chapter 36: Principle of Least Action Chapter 37: Tensors Chapter 38: Magnetic Matter Chapter 39: Paramagnetism & Resonance Chapter 40: Theories of Ferromagnetism Chapter 41: Practical Ferromagnetism Chapter 42: Elasticity Chapter 43: Elastic Materials Chapter 44: Non-Viscous Fluid Flow Chapter 45: Viscous Fluid Flow Chapter 46: Curved Spacetime

Chapter 36 Principle of Least Action This is a special lecture on a general principle that applies to all of physics, not just electromagnetism. Feynman is famous for his profound understanding of the Principle of Least Action. The Feynman Lectures state that this lecture “is intended to be for ‘entertainment’.” That is code for: “this won’t be on the exam.” But that does not mean it is unimportant. In fact, the Principle of Least Action is one of the most important principles of physics — a principle every serious physicist should understand.

On V2p19-1, Feynman says: “When I was in high school, my physics teacher—whose name was Mr. Bader—called me down one day after physics class and said, ‘You look bored; I want to tell you something interesting.’ Then he told me something which I found absolutely fascinating, and have, since then, always found fascinating. Every time the subject comes up, I work on it. In fact, when I began to prepare this lecture I found myself making more analyses on the thing. Instead of worrying about the lecture, I got involved in a new problem. The subject is this—the principle of least action.

“Mr. Bader told me the following: Suppose you have a particle (in a gravitational field, for instance) which starts somewhere and moves to some other point by free motion—you throw it, and it goes up and comes down. It goes from the original place to the final place in a certain amount of time. Now, you try a different motion. Suppose that to get from here to there, it went [along a very different path] but got there in just the same amount of time. Then he said this: If you calculate the kinetic energy at every moment on the path, take away the potential energy, and integrate it over the time during the whole path, you’ll find that the number you’ll get is bigger than that for the actual motion.

“In other words, the laws of Newton could be stated not in the form F = ma but in the form: the average kinetic energy less the average potential energy is as little as possible for the path of an object going from one point to another.”

We define an object’s action S as the time integral of its kinetic energy minus its potential energy. In a gravitational field, the integrand is:

mv²/2 – mgx

Here, g is the acceleration of gravity, m is the object’s mass, v is its velocity, and x is its height above any convenient base elevation, such as sea level. If the potential energy represents all active forces, the principle of least action says: objects follow the path of least action. Let’s consider Mr. Bader’s simple example: a ball thrown upward in a uniform gravitational field. The total action from time t=A to time t=B is:

S_AB = ∫_A^B { mv²/2 – mgx } dt

Figure 36-1 shows two possible paths, with x plotted vertically and time t plotted horizontally.

Figure 36-1 Two Possible Paths

The actual path taken by a real ball is a parabola, shown as the solid curve in Figure 36-1. An alternative that we might imagine is shown as the dashed curve. Both curves start at the same x and t, and both end at the same x and t. We see that the alternative path is more “interesting”, with more structure and sharper turns. We might imagine many alternatives to nature’s actual path, but as Feynman says in V2p19-2: “The miracle is that the true path is the one for which [S_AB] is least.”

Let’s make the problem even simpler. Let’s suppose no forces act on the ball, and therefore there is no potential energy term in our integral. The action then reduces to:

S_AB = ∫_A^B { mv²/2 } dt

Now, we know what the average velocity ⟨v⟩ must be: total distance traveled Δx divided by total travel time Δt. We then write:

Δx = ∫_A^B v dt = ∫_A^B (dx/dt) dt = ∫_A^B dx
Δt = ∫_A^B dt
⟨v⟩ = Δx / Δt

Feynman uses the following argument to show that the action integral is minimized if v is always equal to ⟨v⟩: “As an example, say your job is to start from home and get to school in a given length of time with the car. You can do it several ways: You can accelerate like mad at the beginning and slow down with the brakes near the end, or you can go at a uniform speed, or you can go backwards for a while and then go forward, and so on. The thing is that the average speed has got to be, of course, the total distance that you have gone over the time. But if you do anything but go at a uniform speed, then sometimes you are going too fast and sometimes you are going too slow. Now the mean square of something that deviates around an average, as you know, is always greater than the square of the mean; so the kinetic energy integral would always be higher if you wobbled your velocity than if you went at a uniform velocity.”

Personally, I am more comfortable with mathematical proofs than with verbal arguments and analogies. Analogies are almost never perfect, and arguments are often won by the loudest and most forceful, whether or not they are right. Despite some glaring errors in his theories, no one successfully argued against Aristotle for 2000 years. So, let’s do the math, starting with this little trick:

(v – ⟨v⟩)² = v² – 2v⟨v⟩ + ⟨v⟩²
v² = (v – ⟨v⟩)² + 2v⟨v⟩ – ⟨v⟩²

S_AB = (m/2) ∫_A^B v² dt
2S_AB/m = ∫_A^B { (v – ⟨v⟩)² + 2v⟨v⟩ – ⟨v⟩² } dt

The last term is easy:

– ⟨v⟩² ∫_A^B dt = – ⟨v⟩² Δt

The middle term is almost as easy:

2⟨v⟩ ∫_A^B v dt = 2⟨v⟩ ∫_A^B (dx/dt) dt
2⟨v⟩ ∫_A^B v dt = 2⟨v⟩ Δx = 2⟨v⟩ (⟨v⟩ Δt)

Therefore the action becomes:

2S_AB/m = ⟨v⟩² Δt (+2 – 1) + ∫_A^B (v – ⟨v⟩)² dt

Everything has been reduced to constants except the action S_AB and the last integral. To minimize the action, we must minimize this integral. Since the integrand is a perfect square, it is always greater than or equal to zero. The minimum clearly occurs when v = ⟨v⟩ always, just as Feynman argued.

If I were as smart as Feynman, and if I already knew the right answer, I would also be satisfied with

arguments and analogies. Hence, for a ball subject to no forces whatsoever, the motion of least action is traveling at a constant velocity from “here to there”, as shown in Figure 36-2.

Figure 36-2 Path Without Forces
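As a quick numerical sanity check of this result (my own sketch, not from the Lectures; the mass, time interval, and wobble shape below are arbitrary choices), we can compute the force-free action for a constant-velocity path and for a wobbled path with the same endpoints, and confirm that the wobbled path has the larger action.

```python
import numpy as np

# Force-free action S = integral of (m v^2 / 2) dt for paths x(t)
# sharing the same endpoints. The wobble vanishes at both endpoints,
# so both paths go from x=0 at t=0 to x=1 at t=T.
m, T = 1.0, 1.0
t = np.linspace(0.0, T, 100001)

def action(x):
    v = np.gradient(x, t)          # numerical velocity dx/dt
    return np.trapz(0.5 * m * v**2, t)

x_uniform = t / T                              # constant velocity <v> = 1
x_wobbled = t / T + 0.2 * np.sin(np.pi * t / T)

print("S uniform :", action(x_uniform))   # = m <v>^2 T / 2 = 0.5
print("S wobbled :", action(x_wobbled))   # larger, as the math above proves
```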

Now, let’s make the problem more realistic by adding a conservative force with potential U(x). (Recall that all fundamental forces are conservative and that only conservative forces have meaningful potentials; see Feynman Simplified 1A, Chapter 10.) The action equation is:

S_AB = ∫_A^B { mv²/2 – U(x) } dt

To minimize S_AB, we would like to reduce the integrand’s positive term (kinetic energy) and maximize its negative term (potential energy U). Figure 36-3 shows the object’s true path represented by the solid curve, and an alternative path represented by the dashed curve.

Figure 36-3 Dashed Path With High Potential

Let’s assume U(x) increases with increasing x, as does the gravitational potential. The alternative path offers the lure of a higher average U than the true path. The problem, however, is that rapidly increasing x to rapidly increase U(x) requires a large initial velocity v, and therefore a large kinetic energy mv²/2 that increases the action S_AB.

Finding the minimum action is a puzzle whose solution optimally balances competing effects. Increasing U too much or too rapidly increases the kinetic energy thus increasing the action. But, increasing U too little or too slowly fails to reduce the action.

In V2p19-3, Feynman says: “That is all my teacher told me, because he was a very good teacher and knew when to stop talking. But I don’t know when to stop talking. So instead of leaving it as an interesting remark, I am going to horrify and disgust you with the complexities of life by proving that it is so. The kind of mathematical problem we will have is very difficult and [is of] a new kind. “You [might say:] ‘Oh, that’s just the ordinary calculus of maxima and minima. You calculate the action and just differentiate to find the minimum.’ “But watch out. Ordinarily we just have a function of some variable, and we have to find the value of that variable where the function is least or most. For instance, we have a rod which has been heated in the middle and the heat is spread around. For each point on the rod we have a temperature, and we must find the point at which that temperature is largest. But now for each path in space we have a number—quite a different thing—and we have to find the path in space for which the number is the minimum. That is a completely different branch of mathematics. It is not the ordinary calculus. In fact, it is called the calculus of variations.” Feynman says there are many similar variational problems in physics and mathematics. For example, we normally define a circle as the locus of points whose distance from a chosen center is r. An alternative definition is: a circle is the curve of length L that encloses the largest area. From that definition, one can construct a circle whose radius is L/2π. Feynman suggests you might amuse yourself by trying to find a circle fulfilling the second definition using integral or differential calculus.

Functions Near Extrema Before delving into the calculus of variations, let’s first examine carefully the behavior of an arbitrary function f near an extremum, either a minimum or a maximum. Recall that we can express any function as a Taylor series. Let’s assume our function f has a minimum, and define the x-axis so that this minimum occurs at x=0. For some set of constants a_j, the Taylor series is:

f(x) = a₀ + a₁x + a₂x² + a₃x³ + …

We now show that a₁ must be zero. At any extremum, the first derivative of f is zero. This means:

at x=0: df/dx = 0 = a₁ + 2a₂x + 3a₃x² + …

Hence a₁ = 0, and the Taylor series reduces to:

f(x) = a₀ + a₂x² + a₃x³ + …

For small values of |x|, x² is much smaller yet. Hence, any function changes very slowly near its extrema. The more elegant description is:

Near any function’s extrema, the function changes only in second order.

“Second order” means the change in f(x) is proportional to the change in x to the second or higher power. We write this:

f(x) = a₀ + O(x²)

O(x²) denotes any combination of unspecified terms that are proportional to second and higher powers of a small quantity x.

This fact will help us find the path of least action.
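A small numerical illustration of this point (my own, using an arbitrary example function): near a minimum the change in f shrinks like Δx², while at a generic point it shrinks only like Δx.

```python
import numpy as np

# f(x) = cosh(x) has a minimum at x = 0 (f = 1, f' = 0, f'' = 1).
f = np.cosh
for dx in (0.1, 0.01, 0.001):
    near_min = f(0.0 + dx) - f(0.0)    # scales like dx^2 (second order)
    generic  = f(1.0 + dx) - f(1.0)    # scales like dx   (first order)
    print(f"dx={dx:6}:  df near minimum = {near_min:.2e},  df at x=1 = {generic:.2e}")
```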

Variational Calculus Here is how variational calculus works. Imagine two paths that start at the same x and t, and end at the same x and t, as shown in Figure 36-4. The true path (what we seek) is represented by the solid curve, and an alternative path is represented by the dashed curve.

Figure 36-4 Path Difference: True vs. Alternate

We define x(t) and w(t) such that:

true path: x(t)
alternate: x(t) + w(t)

Thus w(t) is the difference between the alternate path and the true path. The length of each vertical line in Figure 36-4 represents the value of w(t) at selected times. In our analysis we will consider alternate paths that deviate only slightly from the true path, much less than shown in the above image. This means w(t) will be small compared with x(t) at all t. The action along the true path is:

S_trueAB = ∫_A^B { (dx/dt)² m/2 – U(x) } dt

The action along the alternative path is:

S_altAB = ∫_A^B { (d[x+w]/dt)² m/2 – U(x+w) } dt

Since the true path has the least action, S_altAB must be greater than or equal to S_trueAB. We define the variation in S, δS, to be:

δS = S_altAB – S_trueAB ≥ 0

δS = ∫_A^B { [(d[x+w]/dt)² – (dx/dt)²] (m/2) – U(x+w) + U(x) } dt

We will simplify this piece by piece, beginning with the U terms. For small w, we use the Taylor series:

U(x+w) = U(x) + (dU/dx) w + (d²U/dx²) w²/2 + …
U(x+w) = U(x) + (dU/dx) w + O(w²)
– U(x+w) + U(x) = – (dU/dx) w + O(w²)

Now, let’s turn to the term in [ ]’s, which is much more interesting.

[(d[x+w]/dt)² – (dx/dt)²] = (dx/dt + dw/dt)² – (dx/dt)² = 2(dx/dt)(dw/dt) + (dw/dt)²

Since w(t) is small, the term on the right is much smaller than the term on the left. We can lump it into the other O(w²) terms. We then have:

δS = ∫_A^B { [m (dx/dt)(dw/dt)] – (dU/dx) w } dt + O(w²)

Let’s ignore the clutter for a moment and consider the Big Picture. The integrand has the form:

P dw/dt – Q w

If it were Pw – Qw, we would immediately have a relationship between P and Q. So, the way forward is to turn P dw/dt into an Rw. But how? The first key step in this problem is employing integration by parts. Recall the procedure: for any two functions u and v:

d(uv)/dt = u dv/dt + v du/dt
∫_A^B [d(uv)/dt] dt = ∫_A^B u [dv/dt] dt + ∫_A^B v [du/dt] dt
uv |_A^B = ∫_A^B u dv + ∫_A^B v du
∫_A^B u dv = uv |_A^B – ∫_A^B v du

In the present case, set v = w and u = dx/dt. Integration by parts yields:

∫_A^B (dx/dt) (dw/dt) dt = (dx/dt) w |_A^B – ∫_A^B w {d(dx/dt)/dt} dt

Now comes the second key step. The alternative path and the true path must both start at the same x and t, and both end at the same x and t. We vary the path between the endpoints, but not at the endpoints. This means:

w(A) = w(B) = 0

Dropping O(w²), the equation for the variation of the action becomes more manageable.

δS = ∫_A^B { –m (d²x/dt²) w – (dU/dx) w } dt
δS = – ∫_A^B { m (d²x/dt²) + (dU/dx) } w dt

As we described earlier, any function changes very slowly near its extrema. In our case, the function is S. When the alternative path is the same as the true path, when w=0 everywhere, the variation δS equals zero. When the alternative path is close to the true path, δS will be very close to zero, with δS deviating from zero only in second order. Our problem boils down to finding the x(t) for which δS=0 for any small path deviation w(t). The third key step is realizing that, along the true path, the term in { }’s is zero everywhere. Why? Consider a function w(t) that is non-zero only over a very small range of t, say between t* and t*+Δt. If Δt is small enough, we can assume w(t) is constant over the time interval Δt. This reduces the integrand to a constant, making the integral trivial.

δS = – { m [d²x(t*)/dt²] + [dU(t*)/dx] } w(t*) Δt

On the true path δS = 0, which means:

0 = m [d²x(t*)/dt²] + [dU(t*)/dx]

This relationship must hold for every value of t*. In fact, this is a general result: the principle of least action is local, not merely global. It applies not just to the complete time interval A to B, but rather it applies separately to every infinitesimal time interval. Just as momentum is conserved at every instant, motion is governed by least action at every instant. We finally have our solution:

for all t: m d²x(t)/dt² = – dU(t)/dx

Since in one dimension F = – dU/dx, we obtain the familiar Newtonian equation of motion: ma = F. Variational calculus proves that, for a conservative force, every body moves according to Newton’s second law: F = ma.

Should we quibble about the need to be “conservative”? Newton’s law applies to any force, even to a “non-conservative” force like friction. The principle of least action, however, applies only to conservative forces, because only for those can we define a corresponding potential energy. But “non-conservative” forces are only non-conservative when we fail to account for all the action. Friction heats atoms, increasing their kinetic and potential energy. Energy is not lost, it is still conserved, but the accounting becomes more complex. At a fundamental level, all forces are conservative.

In V2p19-5, Feynman provides sage advice on this variational method: “It turns out that the whole trick of the calculus of variations consists of writing down the variation of S and then integrating by parts so that the derivatives of [w] disappear. It is always the same in every problem in which derivatives appear. … [Next] comes something which always happens—the integrated part disappears [(dx/dt)w |_A^B in this case]. (In fact, if the integrated part does not disappear, you restate the principle, adding conditions to make sure it does!)”
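The same check works with gravity switched on (again my own numerical sketch, with arbitrary values of m, g, and the perturbation): the true parabolic path has a smaller action than any nearby endpoint-preserving alternative.

```python
import numpy as np

# Action S = integral of (m v^2/2 - m g x) dt for a ball thrown upward,
# comparing the true parabola with a perturbed path that keeps the
# same endpoints (the perturbation w vanishes at t=0 and t=T).
m, g, T = 1.0, 9.8, 2.0
t = np.linspace(0.0, T, 200001)

def action(x):
    v = np.gradient(x, t)
    return np.trapz(0.5 * m * v**2 - m * g * x, t)

v0 = 0.5 * g * T                       # launch speed that returns x to 0 at t=T
x_true = v0 * t - 0.5 * g * t**2       # the parabola with x(0) = x(T) = 0
w = 0.3 * np.sin(np.pi * t / T)        # a small, endpoint-preserving deviation

print("S true path      :", action(x_true))
print("S perturbed path :", action(x_true + w))   # always larger
```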

Least Action Or Most Action Recall our studies of optics in Feynman Simplified 1C, Chapters 30 and 31. We first understood that the propagation of light is governed by the principle of least time: light takes the path that requires the least time, even in the presence of materials with different apparent light speeds. But, later we understood that the propagation of light is governed by wave interference: light takes the path in which waves interfere constructively by arriving in phase after taking nearly the same travel time along nearby paths. The key point is that the dominant paths are those with nearby paths that have nearly the same travel times. Those are the paths that interfere constructively. The dominant paths are most often those with the least travel time, but they can also be those with the greatest travel time. Either way, they are always paths of extremal travel time: the change in travel time along slightly different paths is almost zero. Here we have the same physics and the same math. Objects move along paths of extremal action, because nearby paths have nearly the same action. These paths are most often those of least action, but they can also be paths of greatest action. Either way, the variation of action along nearby paths is almost zero. While the key is extremal action, physicists nonetheless almost invariably call this the principle of least action.

Action in 3-D In V2p19-6, Feynman says the generalization to three dimensions is straightforward. The kinetic energy becomes:

{ (dx/dt)² + (dy/dt)² + (dz/dt)² } m/2 = v•v m/2

The potential energy becomes: U(x,y,z). And the variation becomes a vector field w(x,y,z). While we can no longer conveniently graph the action in its complete 4-dimensionality, the solution is obtained using the same methodology. One can consider first path variations only in the x-direction, then turn to variations in y, and finally variations in z. One obtains three equations, and combining these yields Newton’s second law in its complete vector form: F = ma.

Multi-Body Action The variation method can be generalized for action amongst many bodies, say N interacting particles. We then have N kinetic energy terms of the form: v•v m/2. We also have N(N–1)/2 terms for U_jk, the potential energy between particle j and particle k. The variation is 3N-dimensional, and we obtain 3N equations that reduce to Newton’s laws in 3-D for N particles.

Generalized Action The variational method can be generalized to encompass a very wide class of problems. It is even used in the tensor calculus of differential geometry for calculations in general relativity. In V2p19-7, Feynman says that in general, the action does not have the form: kinetic energy minus potential energy. In V2p19-8, he says: “The question of what the action should be for any particular case must be determined by some kind of trial and error. It is just the same problem as determining what are the laws of motion in the first place. You just have to fiddle around with the equations that you know and see if you can get them into the form of the principle of least action.” In general, action S is defined by:

S = ∫_A^B Ł dt

Here, the integral is from time = A to time = B, and the integrand Ł is the Lagrangian, named after Joseph Louis Lagrange (1736-1813). For a single particle in an electromagnetic field, Feynman says the action is:

S = – ∫_A^B (mc²/γ) dt + ∫_A^B q { v•A(r) – ø(r) } dt

Here, r is the position vector from the origin to (x,y,z), v = dr/dt, γ is the usual relativistic factor 1/√(1–v²/c²), q is the particle’s charge and m is its mass, ø is the scalar potential, and A is the vector potential. After extensive variational calculations that Feynman does not present, describing them as “much more difficult”, he says one obtains this equation:

d(mγv)/dt = q { E(r) + v×B(r) } This is the Lorentz force in relativistic notation. In V2p19-8, Feynman says: “There is quite a difference in the characteristic of a law which says a certain integral from one place to another is a minimum—which tells something about the whole path—[compared with Newton’s] law which says that as you go along, there is a force that makes it accelerate. The second way tells how you inch your way along the path, and the [first] is a grand statement about the whole path. … I would like to explain why [these two seemingly very different types of laws are in fact the same].” The reason these different laws are interconnected is because of something we discussed above. For a path to achieve the least action over a large path, it must also achieve the least action over every infinitesimal segment of that path. This must be true because we can adjust the path in any small interval Δt without making any changes to the path elsewhere. If a path does not have the least action in Δt, we can adjust the path in Δt and achieve a lower total action for the entire path. Only if a path has the least action in every segment is further adjustment unproductive. Thus, least action, the overall path law, is really also a differential law — a law based on derivatives — that determines the proper path on a point-by-point basis, which is just what Newton’s laws do. In V2p19-9, Feynman says: “In the case of light we also discussed the question: How does the particle find the right path? From the differential point of view, it is easy to understand. Every moment it gets an acceleration and knows only what to do at that instant. But all your instincts on cause and effect go haywire when you say that the particle decides to take the path that is going to give the minimum action. Does it ‘smell’ the neighboring paths to find out whether or not they have more action? … “Is the same thing true in mechanics? Is it true that the particle doesn’t just ‘take the right path’

but that it looks at all the other possible trajectories? … The miracle of it all is, of course, that it does just that. That’s what the laws of quantum mechanics say. So our principle of least action is incompletely stated. It isn’t that a particle takes the path of least action but that it smells all the paths in the neighborhood and chooses the one that has the least action by a method analogous to the one by which light chose the shortest time.” I added the bolding above to highlight Feynman’s most important statement. The reason that particles behave like light is because both have wave properties, according to the quantum mechanical principle of particle-wave duality. Both light and electrons propagate along paths in which their waves interfere constructively by arriving in phase after taking nearly the same travel time along nearby paths.

Least Action in Electrostatics In V2p19-10, Feynman says: “I want now to show that we can describe electrostatics, not by giving a differential equation for the field, but by saying that a certain integral is a maximum or a minimum.” We first consider the problem of calculating the electrostatic potential ø(r) from a charge density ρ(r) that is known at all r = (x,y,z). From Chapter 28, we know that the energy U* of an electric field is:

U* = (ε₀/2) ∫ E•E dV = (ε₀/2) ∫ (∇ø)² dV

Here, the integral is over all 3-D space. The potential energy of a charge distribution ρ(r) in an electrostatic potential ø(r) is:

U = ∫ ρ(r) ø(r) dV

Feynman says the equation for action S in this situation is:

S = (ε₀/2) ∫ (∇ø)² dV – ∫ ρ(r) ø(r) dV

We define the true potential to be ø(r) and an alternate potential to be ø(r)+w(r). Our goal is to find ø using the variational method. The variation of action S is:

δS = (ε₀/2) ∫ (∇[ø+w])² dV – ∫ ρ [ø+w] dV – (ε₀/2) ∫ (∇ø)² dV + ∫ ρ ø dV

Let’s evaluate the above integrands, discarding terms that are second and higher order in w.

(∇[ø+w])² = ∇[ø+w] • ∇[ø+w]
(∇[ø+w])² = ∇ø•∇ø + 2 ∇ø•∇w + O(w²)
ρ [ø+w] = ρ ø + ρ w

δS = ∫ (ε₀ ∇ø•∇w – ρ w) dV

Restating the dot product in component notation yields:

∇ø•∇w = (dø/dx)(dw/dx) + (dø/dy)(dw/dy) + (dø/dz)(dw/dz)

Now we must integrate each term by parts to replace the derivatives of w.

∫_–∞^+∞ (dø/dx)(dw/dx) dx = (dø/dx) w |_–∞^+∞ – ∫_–∞^+∞ (d²ø/dx²) w dx

As before, we require that w=0 at the integration limits. Before, the limits were A and B; here they are –∞ and +∞. But the idea is the same: the variation must be zero at the limits. Performing the same replacement for y and z reduces the variation to:

δS = ∫ {–ε₀ ∇²ø – ρ} w dV

As before, at the true ø, δS must be zero for any variation w, which requires the expression in { }’s to be zero everywhere. We have therefore proven this key equation of electrostatics:

∇²ø = – ρ / ε₀

In V2p19-11, Feynman repeats the last portion of this derivation in vector notation, without resorting to component math as we did above. Recall that the vector equation for δS is:

δS = ∫ (ε₀ ∇ø•∇w – ρ w) dV

Note that:

∇•(w ∇ø) = (∇w)•(∇ø) + w ∇•∇ø
(∇w)•(∇ø) = ∇•(w ∇ø) – w ∇²ø

In integrating over all space, the first term on the right becomes zero, as we now show using the divergence theorem.

∫ ∇•(w ∇ø) dV = ∫ (w ∇ø)•n dA

Here, the right integral is over the surface enclosing volume V, and n is the normal to that surface. Since the surface is at infinity, w must be zero everywhere on that surface. The integral for δS is then:

δS = ∫ (– ε₀ ∇²ø – ρ) w dV

This is the same result we obtained using components. However, Feynman says the vector approach is more powerful and allows us to solve a broader range of problems. At the start of this section, we assumed a charge distribution ρ(r) that is known at all r. Now imagine that we want to calculate the potential ø(r) from some array of charged conductors. We cannot vary the potential within the conductors; those must be constant. But, we can vary the potential throughout the empty space outside these conductors. This means w=0 within and on the surfaces of all conductors. Since w must still be zero at infinity, this ensures the continued validity of:

∫ ∇•(w ∇ø) dV = ∫ (w ∇ø)•n dA = 0

Here, the volume integral is over all empty space (excluding the conductors), and the surface integral is over all boundary surfaces of empty space. These surfaces include the surfaces of all conductors and the surface at infinity; w=0 on all these surfaces. The δS equation remains:

δS = ∫ {–ε₀ ∇²ø – ρ} w dV

with the integral over all empty space. This ensures the validity in empty space of:

∇²ø = – ρ / ε₀

Since charge densities and electric fields must be zero inside any perfect conductor, this equation applies there as well. That leaves only the surfaces of each conductor, where charge densities and electric fields may not be zero. For a situation in which charge density is zero everywhere outside conductors (no free charges), S reduces to the following integral over empty space:

S = (ε₀/2) ∫ (∇ø)² dV

S is now the energy of the electric field. The true potential ø produces an electric field with the least energy. Let’s apply the least action method to a simple example: the long cylindrical capacitor, shown in cross-section in Figure 36-5.

Figure 36-5 Cylindrical Capacitor

The inner conductor has voltage +V, and the outer conductor is set at zero volts. The gap between conductors extends from radius A to radius B. We define r to be the radial coordinate, and as before ø(r) is the true potential and ø(r)+w(r) is an alternative potential in the space between the conductors. The limit conditions are: w(r=A) and w(r=B) are zero. Since the energy stored by a capacitor equals CV²/2, and we have specified V, the value of S for any trial potential ø(r)+w(r) can be used to calculate a corresponding capacitance C(S), according to:

C(S) = 2S / V²

The true potential ø(r) yields the least energy S and the least capacitance C(S). From our mastery of Chapter 12, we know that the true capacitance (per unit length) is:

C_true = 2πε₀ / ln(B/A)

and that the true electric field is proportional to 1/r. In V2p19-12, Feynman asks us to pretend not to remember this answer (that may be easier for some than others), or imagine that we are facing a more complex geometry that has no analytical solution. He wants to show us how to get an approximate answer by trial and error. To phrase that more professionally: how to get an approximate answer with progressively more educated guesses. A surprisingly large portion of these Lectures is devoted to solving challenging arithmetic problems by hand. Feynman was amazingly proficient with numbers. He taught me how to accurately estimate cube roots and other fractional powers in my head without paper or calculator.

With the ubiquity of computers today, this appears old-fashioned. It seems younger people no longer take pride in being able to multiply and divide; they can simply use their pocket calculators or iPhones, and get much faster and more precise answers. But back in ye good olde days, in 1962, no one owned a computer or even a pocket calculator. Arithmetic was a survival skill. In fact, when I bought my first HP calculator in 1970, it cost $500 ($3000 in current dollars). I still use it every day. While it is interesting to see Feynman work, physicists today immediately turn to computers when facing such problems. If you skip to the end of this chapter, you will miss some “entertainment” but no important physics.

In any case, Feynman’s first trial potential corresponds to a constant electric field between the conductors. The potential must then vary linearly with r, as given by:

ø(r) = V (1 – [r–A]/[B–A])

Note that ø(r=A) = V and ø(r=B) = 0, matching the boundary conditions. The energy per unit length (the direction perpendicular to the screen in Figure 36-5) is:

S/L = (ε₀/2) ∫_A^B (∇ø)² 2πr dr
S/L = ε₀ ∫_A^B (–V/[B–A])² πr dr
S/L = V²πε₀/[B–A]² ∫_A^B r dr
S/L = V²πε₀/[B–A]² (r²/2) |_A^B
S/L = (V²πε₀/2[B–A]²) (B²–A²)
S/L = (V²πε₀/2) (B+A)/(B–A)

We now equate this with the capacitor’s stored energy per unit length, CV²/2:

CV²/2 = (V²πε₀/2) (B+A)/(B–A)

C_try1 = πε₀ (B+A) / (B–A)

That turns out to be not bad for small values of B/A, much worse for large B/A, and certainly not correct. In V2p19-13, Feynman says: “The next step is to try a better approximation to the unknown true ø. For example, we might try a constant plus an exponential ø, etc. But how do you know when you have a better approximation unless you know the true ø? Answer: You calculate C; the lowest C is the value nearest the truth.” Unfortunately, that advice falls far short of providing step-by-step instructions. Next, Feynman tries a quadratic potential. I will spare you the arithmetic; his result is:

C_try2 = 2πε₀ (B² + 4AB + A²) / 3(B² – A²)

Feynman provides tables of C_true, C_try1, and C_try2. Some of his numbers, at least in my first edition of the Lectures, are a bit off. Here are the correct values:

Here, the numbers in the third column are the fractional capacitance differences between trial #1 and the true values. The fourth column is the same for trial #2, with “ppm” meaning parts per million. The second trial is clearly superior to the first, but is still substantially off for very large values of B/A. Feynman concludes saying: “I have given these examples, first, to show the theoretical value of the principles of minimum action and minimum principles in general and, second, to show their practical utility… you can guess an approximate field with some unknown parameters and adjust them to get a minimum. You will get excellent numerical results for otherwise intractable problems.”
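Readers who would rather let a computer do this arithmetic can regenerate the comparison directly from the three formulas above. The short script below (the particular B/A values are arbitrary choices of mine) prints C/(2πε₀) for the true solution and both trials, together with the fractional errors of the trials.

```python
import numpy as np

# Compare the true cylindrical-capacitor capacitance (per unit length)
# with the two trial potentials, using the formulas given in the text.
# All values are reported as C / (2*pi*eps0), which depends only on B/A.
for ratio in (1.5, 2.0, 4.0, 10.0, 100.0):
    A, B = 1.0, ratio
    c_true = 1.0 / np.log(B / A)                         # 2*pi*eps0 / ln(B/A)
    c_try1 = (B + A) / (2.0 * (B - A))                   # pi*eps0 (B+A)/(B-A)
    c_try2 = (B**2 + 4*A*B + A**2) / (3.0 * (B**2 - A**2))
    print(f"B/A={ratio:6}:  true={c_true:.4f}  "
          f"linear={c_try1:.4f} ({(c_try1/c_true - 1)*100:+.2f}%)  "
          f"quadratic={c_try2:.4f} ({(c_try2/c_true - 1)*100:+.3f}%)")
```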

Chapter 36 Review: Key Ideas

• Near the extrema, maxima or minima, of any function f(x), f changes only in second order, meaning Δf is proportional to Δx to the second or higher power. We write this: f(x) = a₀ + O(Δx²)

• An object’s action S from time A to time B, in a gravitational field, is:

S = ∫_A^B { mv²/2 – mgx } dt

Here, m is the object’s mass, v is its velocity, g is the acceleration of gravity, and x is the object’s height. The principle of least action says objects follow the path of extremal action, provided all forms of kinetic and potential energy are included. At a fundamental level, all forces are conservative.

• Integration by parts solves many complex problems. It has this form: ∫ u dv = u v – ∫ v du Variational calculus is necessary to find the path that minimizes action S. For a path variation w(t), we use integration by parts to convert terms proportional to dw/dt into terms proportional to w. If done properly, the integrated part always vanishes because w=0 at the integration limits.

• Extremal action is the key requirement of this variational method. Nature chooses the paths of extremal action, because nearby paths have nearly the same action. These paths are most often those of least action, but they can also be paths of greatest action.

• In general, action S is defined by:

S = ∫_A^B Ł dt

Here, the integrand Ł is the Lagrangian. For a single particle in an electromagnetic field, Feynman says the action is:

S = – ∫_A^B (mc²/γ) dt + ∫_A^B q { v•A(r) – ø(r) } dt

Chapter 37 Tensors Feynman’s Volume 2 Chapter 31 is titled “Tensors”, but it only briefly addresses the mathematics of tensors, and almost all the analysis is done in component form. This is like giving a lecture on “Vectors” and writing every equation in components. This is unfortunate because tensors are both beautiful and powerful. I provide here a more comprehensive introduction to tensors, for both 3-D classical physics and for 4-D curved spacetime. We will then explore Feynman’s lecture.

What is a Tensor? Tensors are a generalization of vectors. Tensor calculus is a beautiful branch of mathematics that empowers us to elegantly and effectively describe many complex, multi-dimensional phenomena. Tensors are essential in general relativity, where 4-D spacetime curves, twists, and stretches differently at every point, at every instant, and in every direction. Tensors are also employed in 3-D analyses of mechanics and wave propagation in anisotropic materials, those whose properties are different in different directions. The most important thing to know about tensors is that any tensor equation that is valid in one coordinate system is automatically valid without any modifications in all coordinate systems, regardless of their rotation or motion relative to the original coordinate system. That generality is one reason that the mathematics of general relativity is so challenging, but it is also one of the most powerful tools in solving problems. If we can identify a coordinate system in which we can solve a complex problem with a tensor equation, we have immediately solved the problem in all coordinate systems. General relativity is the only major branch of physics in which tensor equations are universally employed. Tensors are arrays of components that transform properly between coordinate systems. In 3-D, they transform according to Euclidian coordinate rotations. In 4-D spacetime, they also transform according to the Lorentz transformation. Tensors can have one component or millions of components, with each being a different function of all coordinates.

Let’s consider some quantities that are not tensors. Temperature is a simple quantity that changes with time and location, making it a function of the four coordinates of spacetime. Its values are different in different coordinate systems, but the values do not change according to the rotation matrices or the Lorentz transformation. Hence, temperature is not a tensor. Similarly, energy by itself is not a tensor. But, the proper combination of energy and momentum — (E/c, p_x, p_y, p_z) — is a tensor because its components do transform properly.

Tensors are characterized by their rank and by the dimensionality of the space in which they are defined. In physics, the most common spaces are Euclidian 3-D, and 4-D curved spacetime. The most common tensors have rank 0, 1, 2, or 4. The largest meaningful tensor I know has rank 10 and has 1,048,576 components; I will not share this tensor with you in this course. The simplest tensors are scalars; these are rank 0 tensors. These include π, 7, 0, and your age, all numbers that have the same values in all coordinate systems. We are also very familiar with rank 1 tensors: vectors. Every proper 3-vector is a 3-D rank 1 tensor, and every proper 4-vector is a 4-D rank 1 tensor. In Chapter 27, we encountered the 4-D rank 2 Faraday tensor that is shown below.

When a tensor is shown as an array of components, its components are enclosed in square brackets [ ], as above. As we discussed in Chapter 27, the Faraday tensor has the special property of being antisymmetric. This means F_μσ = –F_σμ for all combinations of indices μ and σ. As a result, all the diagonal components are zero, and components on opposite sides of the diagonal are the same, but with the opposite sign. With 4 zero components and 6 redundant components, the Faraday tensor has only 16–4–6 = 6 independent components. These are the 3 components of E and the 3 components of B.

Tensors of rank 2 and greater can have interesting symmetry properties. Some tensors are symmetric, meaning that G_μσ = +G_σμ for all combinations of μ and σ. A rank 2, symmetric, 4-D tensor has 6+4 = 10 independent components, and 6 others that are redundant.

The sums and differences of tensors with the same rank and indices are also tensors. The product of

two tensors is a tensor, but the quotient of two tensors is not generally a tensor.

Tensor Indices Since tensors can have so many components, we use indices to avoid writing them all out individually. Above, we used two indices to identify the components of a rank 2 tensor, as we did in Chapter 27. Let’s now discuss tensor indices in general. A rank N tensor has N indices that all range over the same set of allowed values. A rank 2 tensor, for example, must have the same number of rows and columns, unlike a matrix that may have a different number of rows and columns. A rank N tensor has 4^N components in 4-D, and 3^N components in 3-D.

In 3-D and flat (non-curved) 4-D spaces, tensor indices are written as subscripts. For example, consider two alternative notations for the components of the rank 1 position tensor:

r₁ = r_x = x
r₂ = r_y = y
r₃ = r_z = z
r₀ = r_t = ct
r_σ = (ct, x, y, z)

Here, σ = 0, 1, 2, 3, or if you prefer, σ = t, x, y, z. These equations demonstrate a critical difference between the free index σ, which can have any value in the allowed range, and the fixed indices 0, 1, 2, 3, x, y, z, and t. The latter refer to specific components, whereas the former refers to any component corresponding to any possible value of σ. Free indices are just labels. The specific letters we choose have no significance mathematically or physically: x_σ and x_μ mean exactly the same thing, as do A_μσ and A_βα. The significance of free indices lies in the relationships they establish. For example: A_μσ = B_σμ means that every component of B equals the component of A on the opposite side of the diagonal. Just like a vector equation, this tensor equation would mean exactly the same thing for any letters we might substitute for μ and σ.

Tensor Algebra A common tensor equation is:

L_μ = 0

This establishes the same equation for all values of the free index μ. In this case, it means each component of L equals zero. Every tensor equation is valid for all values of every unmatched free index. By “unmatched”, we mean a free index that appears only once in a single product term. Hence:

A_σμ = 1

means every component of tensor A equals 1. We said above that the product of two tensors is itself a tensor. For example, if A_μ and B_σ are two position 4-vectors (rank 1 tensors), their product A_μ B_σ is a rank 2 tensor (AB)_μσ. The product of a rank N tensor with a rank M tensor is always a tensor of rank N+M.

When a free index appears twice in one term, the Einstein summation convention directs us to sum over all values of the repeated index. This is similar to a vector dot product, and is called a tensor contraction (contraction in the sense that the number of indices and the tensor’s rank are reduced). For example:

A_μ B_μ = A_0 B_0 + A_1 B_1 + A_2 B_2 + A_3 B_3

Here μ appears twice in the product term AB, which directs us to sum over all values of μ. All quantities on the right side are components, hence their sum is a scalar with no free indices. The tensor contraction of a rank N tensor with a rank M tensor has rank N+M–2. In a proper tensor equation, each non-zero term must have the same set of unmatched free indices. For example:

valid for all σ: A_σμ x_μ + C_σ = B_σ

invalid: A_σμ x_μ + C_β = B_μ, because σ and β are not in every term
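For readers who like to experiment, numpy’s einsum function implements exactly this repeated-index summation, so small contraction examples are easy to check by machine. The arrays below are arbitrary illustrations of mine, not physical tensors.

```python
import numpy as np

# Einstein summation with numpy: repeated indices are summed over.
A = np.arange(4.0)                  # a rank 1 tensor A_mu
B = np.arange(4.0, 8.0)             # a rank 1 tensor B_mu
M = np.arange(16.0).reshape(4, 4)   # a rank 2 tensor M_mu_sigma

outer  = np.einsum('m,s->ms', A, B)     # product A_mu B_sigma: rank 1+1 = 2
scalar = np.einsum('m,m->', A, B)       # contraction A_mu B_mu: rank 0
vector = np.einsum('ms,s->m', M, A)     # contraction M_mu_sigma A_sigma: rank 1

print(outer.shape, scalar, vector)      # (4, 4), 38.0, and a 4-component result
```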

The unit tensor [1] is as important in tensor calculus as 1 is in arithmetic. The components of the unit tensor are 1 when all indices are the same, and 0 whenever any two indices are different. In the rank 2, 4-D, unit tensor shown below, all diagonal components are 1 and all off-diagonal components are 0.

The symbol δ_μσ is often used in math and physics. It is called the Kronecker delta, and is defined by:

δ_μσ = 1 if μ=σ, and zero otherwise.

The tensor contraction δ_μμ equals:

δ_μμ = δ_tt + δ_xx + δ_yy + δ_zz = 4

Like matrices, all non-singular tensors have inverses: A⁻¹_σμ is the inverse of A_μσ. The contraction of a tensor with its inverse always equals the unit tensor. This is as close to tensor division as tensor calculus gets.

A⁻¹_σμ A_μβ = A_σμ A⁻¹_μβ = [1] = δ_σβ

We define tensors as arrays of components that transform properly. In 3-D, that means they transform according to the rules of Euclidean rotations. Let’s review the procedure for rotating a coordinate system. We start with a point Q defined in a coordinate system S with Q at (x,y,z). We then rotate this system about its z-axis by angle θ, resulting in a new coordinate system S* with the same z-axis but new x- and y-axes. In S*, the coordinates of Q are (x*,y*,z*). The equations for transforming Q’s coordinates are:

x* = + x cosθ + y sinθ
y* = – x sinθ + y cosθ
z* = z

Note that we are rotating coordinate axes, keeping Q stationary. If we instead keep the axes stationary and rotate Q, the equations would be the same as above except with θ replaced by –θ. We can write this rotation transformation as a tensor Z(θ), whose components are shown below:

Note that Z⁻¹(θ), the inverse of Z(θ), is simply Z(–θ), which is the transpose of Z:

Z⁻¹_jk = Z_kj

In tensor notation, the above component rotation equations become:

Q*_j = Z_jk Q_k

Again the sum over k is implied. Without indices this is: Q* = Z Q. The requirement that a tensor transform properly under any rotation R can be written:

rank 1: A*_j = R_jk A_k, sum over k
rank 2: A*_jk = R_jn R_km A_nm, sum over n and m
rank 3: A*_jkμ = R_jn R_km R_μσ A_nmσ, sum over n, m, and σ

In each line, A* is the transformed tensor of the original tensor A. For rank 1, we sum over all three values of k. For rank 2, we sum over all 3×3=9 combinations of n and m. For rank 3, we sum over all 3×3×3=27 combinations of all possible values of n, m, and σ. Note that for rank 2, the above transform can also be written:

A*_jk = R_jn R_km A_nm = R_jn A_nm R⁻¹_mk
A* = R A R⁻¹

Note that since tensor operations are fully defined by paired free indices, the order of product terms is irrelevant, unlike matrix multiplication. We can form tensors by combining any proper polar vectors (but not axial vectors). For example, if A_j and B_k are proper polar 3-vectors, then:

C_jk = A_j B_k

is a proper rank 2, 3-D tensor. Let’s show that C transforms properly for any rotation R by transforming each vector.

(A*_n)(B*_m) = (R_nj A_j)(R_mk B_k)
(A*_n)(B*_m) = R_nj R_mk A_j B_k
(A*_n)(B*_m) = R_nj R_mk C_jk = C*_nm

Hence, C transforms properly and is therefore a tensor. The same logic applies to tensors of any rank. For example:

F_μσ Λ_αβ R_δε

is a proper rank 6 tensor, although it has no physical meaning as far as I know. To understand the Feynman Lectures, that is as much as you need to know about tensors. Those who wish a glimpse of general relativistic tensor calculus can enjoy the next section (there’s no exam), while others can skip to the section on the polarization tensor.
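The transformation rules above are also easy to verify numerically. The sketch below (my own; the angle and vectors are arbitrary) builds the rotation Z(θ) defined earlier, confirms that its inverse is its transpose, and checks that the outer product C_jk = A_j B_k obeys the rank 2 rule.

```python
import numpy as np

def Z(theta):
    """Rotation of the coordinate axes about z by angle theta, as in the text."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c,   s, 0.0],
                     [ -s,   c, 0.0],
                     [0.0, 0.0, 1.0]])

R = Z(0.7)                                 # an arbitrary rotation angle
print(np.allclose(np.linalg.inv(R), R.T))  # True: Z^-1 = Z transposed = Z(-theta)

A = np.array([1.0, 2.0, 3.0])
B = np.array([-1.0, 0.5, 2.0])
C = np.outer(A, B)                          # C_jk = A_j B_k

C_star_direct = np.outer(R @ A, R @ B)      # transform the vectors, then multiply
C_star_rule   = R @ C @ R.T                 # rank 2 rule: C*_nm = R_nj R_mk C_jk
print(np.allclose(C_star_direct, C_star_rule))   # True
```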

Tensor Calculus in Curved Spacetime In 4-D curved spacetime, the dot product cannot be simply the sum of the products of corresponding components. This is because coordinate axes may change directions and rulers may change lengths, and all that can happen differently at every location and instant in time. We therefore need a metric to reveal the geometry at each event (ct,x,y,z) in spacetime. That metric specifies the invariant interval, the “true distance”, between any two nearby events. It turns out that knowing the interval between all nearby events completely determines the geometry everywhere. In curved space, the dot product of two 4-vectors A_μ and B_μ is:

A_μ B^μ = A^μ B_μ = g_μσ A^μ B^σ
= g_tt A^t B^t + g_tx A^t B^x + g_ty A^t B^y + g_tz A^t B^z
+ g_xt A^x B^t + g_xx A^x B^x + g_xy A^x B^y + g_xz A^x B^z
+ g_yt A^y B^t + g_yx A^y B^x + g_yy A^y B^y + g_yz A^y B^z
+ g_zt A^z B^t + g_zx A^z B^x + g_zy A^z B^y + g_zz A^z B^z

If ds^μ is the separation 4-vector between two nearby events, the invariant interval ds² between those points is:

ds^μ = (cdt, dx, dy, dz)
ds² = ds_μ ds^μ = ds^μ ds_μ = g_μσ ds^μ ds^σ

Here, ds² is an invariant scalar — it measures the separation between nearby events and has the same value in any coordinate system. In Feynman’s sign convention, ds² equals c² dτ², where τ is proper time, the time measured by an ideal clock moving between these nearby events.

Note that some indices are subscripted while others are superscripted. The former are called covariant indices, while the latter are called contravariant indices. Since superscripts look exactly like exponents, we try to avoid using exponents in tensor calculus whenever confusion might arise: x² always means the second component of x^μ, while (x)² means x*x. Because squares of coordinate differentials occur so frequently, an exception to this rule is dx², which means (dx)².

In 4-D curved spacetime, we only sum repeated free indices if one is covariant and the other is contravariant. The difference between the two is demonstrated by the covariant and contravariant position 4-vectors in Feynman’s sign convention:

x^α = (ct, x, y, z) is contravariant
x_α = (ct, –x, –y, –z) is covariant

The metric in flat spacetime, in Feynman’s sign convention, is:

In polar coordinates, the metric near a black hole in Feynman’s sign convention is:

Here, the g_θθ and g_ϕϕ components are the normal polar coordinate factors, which are unaffected by gravity. Gravity dilates time and stretches space through the factor Ω = 1 – 2GM/c²r, where G is Newton’s gravitational constant, M is the black hole’s mass, and r is the distance from its center. Note that odd things happen when Ω = 0 at r = 2GM/c², the location of the black hole’s event horizon. One interesting effect is that the event horizon is timeless — the passage of time has no effect whatsoever on the event horizon, because g_tt is zero at that radius.

Note that, in the most common modern notation, the metric g_μσ has a minus sign on the time component and plus signs on the three spatial components, which is opposite to Feynman’s convention.

The Lorentz transformation tensor is:

Here, β = v/c and γ = 1/√(1–β²).

Some typical index operations are:

1. Lowering an index: x_σ = g_μσ x^μ
2. Raising an index: x^μ = g^μσ x_σ
3. Lorentz transform: X_σ = Λ_σβ x_β

Like other square matrices, the metric tensor for most geometries can be diagonalized, meaning all non-diagonal components can be made zero with suitable transformations. The invariant interval is then reduced to 4 terms, and the inverse metric is simply g^αα = 1/g_αα.

Diagonalizing the metric tensor can mix the coordinates in surprising ways. For example, the time coordinate t and radial distance coordinate r might be replaced by the coordinates u=ct+r and v=ct–r, leaving nothing that represents pure time. But since the tensor calculus of general relativity works in any coordinate system, such mixing is mathematically valid and can simplify our calculations even when it defies our intuition.

When one becomes comfortable with tensor notation, it is possible to drop the indices altogether, as we do in vector algebra. We can then write equation #3 as: X = Λx

The ultimate equations of general relativity, and Einstein’s greatest contribution to mankind, are his field equations, which are written: G = 8π T. Here, G represents the geometry of spacetime, and T represents all forms of energy, including mass and momentum. G is now called the Einstein tensor, and T is called the mass-energy-stress tensor. Both are symmetric, rank 2 tensors. We say equations, the plural, because G = 8πT represents 16 component equations: 4 describe the conservation of energy and momentum; 6 relate energy density to spacetime curvature; and the remaining 6 are redundant.

John Archibald Wheeler said the meaning of G = 8πT is: “The geometry of spacetime tells mass and energy how to move, while mass and energy tell space how to curve.” Brian Greene describes Einstein’s field equations as the choreography of the cosmic ballet of the universe. It is a duet in which both parties lead one another. I hope you found this brief taste of general relativity intriguing. For a thorough yet accessible explanation of the most profound theory of science see General Relativity 1: Newton vs Einstein.
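As a concrete illustration of the flat metric and the Lorentz tensor (a sketch of mine; the boost speed and the separation 4-vector are arbitrary), the following code checks that the interval g_μσ ds^μ ds^σ, computed in Feynman's (+,–,–,–) convention, is unchanged by a boost along x.

```python
import numpy as np

# Flat-spacetime metric in Feynman's sign convention (+,-,-,-) and a
# boost along x. The interval g_mu_sigma ds^mu ds^sigma is invariant.
g = np.diag([1.0, -1.0, -1.0, -1.0])

def boost(beta):
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

ds = np.array([2.0, 1.0, 0.5, -0.3])          # (c dt, dx, dy, dz) between two events
interval = ds @ g @ ds                         # the invariant interval ds^2

ds_boosted = boost(0.6) @ ds
print(interval, ds_boosted @ g @ ds_boosted)   # the two numbers agree
```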

Polarization Tensor In V2p31-1, Feynman says that crystals are often anisotropic, as discussed in Chapter 31. This means their properties are different in different directions. Consider, for example, a crystal composed of long, thin molecules. An external electric field that is parallel to the molecules’ long axis can move electrons quite far. Conversely, in the direction perpendicular to that axis, electrons have less mobility, and will move much shorter distances for the same field strength. Macroscopically, we observe that the polarizability of such crystals is anisotropic. This results in a polarization vector P that is not collinear with the applied external electric field E. Let’s understand why this happens by considering a crystal whose primary axes are along the orthogonal coordinate axes x, y, and z. We will assume that along each coordinate axis, P is collinear with and linearly proportional to E, according to:

P_j = β_j E_j, for j = x, y, z

Figure 37-1 illustrates this in two dimensions, x and y. The electric field E_x induces a collinear polarization P_x, while a field E_y induces a collinear polarization P_y.

Figure 37-1 Anisotropic Polarization

Assuming the crystal has greater polarizability in the y-direction, we show P_y being greater than P_x, even though E_x and E_y have the same strength. When we vectorially add the E’s and the P’s, we find that P is not collinear with E.

Clearly, we will obtain different results if E_x and E_y do not have the same strength, or if the crystal axes are not aligned with our coordinate axes. In general, an external electric field in the x-direction may induce a polarization P with non-zero components along all three axes. Similar effects will occur for electric fields in the y- and z-directions.

That greater complexity is best managed by defining a polarization tensor Π, such that:

P_x = Π_xx E_x + Π_xy E_y + Π_xz E_z
P_y = Π_yx E_x + Π_yy E_y + Π_yz E_z
P_z = Π_zx E_x + Π_zy E_y + Π_zz E_z

In tensor notation, this is:

P_j = Π_jk E_k

Here the sum over k is implicit. Without the indices this becomes:

P = Π E

The crystal’s dielectric properties are completely described by the components of the rank 2 tensor Π, with its first index denoting the direction of the polarization induced by an external field along the axis denoted by the second index. In V2p31-3, Feynman discusses what happens if our coordinate system rotates. As always, we realize that nature is indifferent to how we define coordinate systems: the same physics results regardless of how we choose to describe it. Therefore the equation P = Π E remains valid in any coordinate system (that is why we use vectors and tensors), but the components of P, Π, and E change from coordinate system to coordinate system. Feynman does not show us how to calculate the new components, but with tensors that is easily done. Multiplying the polarization equation by rotation tensor R yields:

R P = R Π E

We can insert R⁻¹ R anywhere, because a tensor multiplied by its inverse equals [1], the unit tensor.

R P = R Π (R⁻¹ R) E

Regrouping yields:

R P = (R Π R⁻¹) (R E)

R P and R E are the P and E rank 1 tensors (3-vectors) in the new coordinate system, and (R Π R⁻¹) is the rank 2 polarization tensor Π in the new coordinate system. The tensor products produce everything we need.

That is beautiful math — a messy problem, elegantly solved. In V2p31-6, Feynman suggests “amusing yourself by proving that [the unit tensor] has the same form in any coordinate system.” Since I gave you the rotation tensor and showed you how to apply it, this should be easy. You can check your answer at the end of this chapter.
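That bookkeeping is easy to verify numerically. In the sketch below (mine; the tensor, field, and rotation angle are invented for illustration), we rotate E and Π into a new coordinate system and confirm that Π′E′ equals the rotated P.

```python
import numpy as np

# Polarization in a rotated coordinate system: P = Pi E must become
# (R P) = (R Pi R^-1)(R E). For a rotation, R^-1 is just R transposed.
Pi = np.array([[3.0, 0.5, 0.0],     # a made-up symmetric polarization tensor
               [0.5, 2.0, 0.0],
               [0.0, 0.0, 1.0]])
E = np.array([1.0, -2.0, 0.5])      # a made-up applied field
P = Pi @ E                          # polarization in the original coordinates

theta = 0.4                         # rotate the axes about z by 0.4 rad
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

Pi_new = R @ Pi @ R.T               # R Pi R^-1
print(np.allclose(Pi_new @ (R @ E), R @ P))   # True: same physics, new components
```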

Energy Ellipsoid To gain proficiency with tensors, Feynman suggests we consider the energy required to polarize a crystal. The field energy external to the crystal is ε₀E²/2, but what is the energy stored within the polarization of the crystal itself?

The energy du required to move a charge q an infinitesimal distance dx in an electric field E_x is:

du = q E_x dx

For N such charges per unit volume, the energy required per unit volume is:

du = qN E_x dx

Now notice that qN dx is the change in dipole moment dP_x. Hence,

du = dP_x E_x

In 3-D vector notation this is:

energy density = dP • E

In tensor notation this is:

du = d(Π_jk E_k) E_j

Note that Π does not vary; tensor Π is comprised of constants that describe the crystal’s properties. Now define e to be the unit vector in the direction of E, and let E be the magnitude of E; hence E = eE, and:

du = (Π_jk e_k) e_j E dE

Integrating from an initial field strength of zero to the final field E, we obtain:

u = (Π_jk e_k) e_j E²/2
u = E_j Π_jk E_k / 2

Here we sum over both j and k. Without indices, the equation is:

u = E Π E / 2

The energy density u is a scalar. In V2p31-4, Feynman says that, for any real crystal, the polarization tensor Π is symmetric, meaning Π_jk = Π_kj for all j and k. Tensor Π thus has 6 independent components and 3 redundant ones. He adds that this means the tensor components can be determined by measuring u for various field combinations. For example:

for E_x only: 2u = Π_xx E_x²
for E_y only: 2u = Π_yy E_y²
for E_x & E_y: 2u = Π_xx E_x² + 2Π_xy E_x E_y + Π_yy E_y²

The first two conditions allow us to determine Π_xx and Π_yy. Knowing those, the last condition provides Π_xy. We can similarly determine Π_zz, Π_xz, and Π_zy, the other 3 independent components of Π.

Feynman points out that the last equation is a quadratic in E and E . Figure 37-2 shows the locus of (E , E ) pairs that provide the same energy density u for a given polarization tensor Π. This locus forms an ellipse. x

x

y

y

Figure 37-2 Energy Ellipse in 2-D

In V2p31-5, Feynman says the "'energy ellipse' is a nice way of 'visualizing' the polarization tensor." In 3-D, the energy density is quadratic in E_x, E_y, and E_z. The locus of points (E_x, E_y, E_z) with the same energy density u is an ellipsoid, as shown in Figure 37-3.

Figure 37-3 Energy Ellipsoid in 3-D

All ellipsoids have three orthogonal principal axes. The coordinate system in Figure 37-3 has been aligned in those directions. In this figure, y is the first principal axis, because that is the direction of the largest diameter, which is 2b. The shortest diameter is 2a, along the principal axis in the x-direction. The last principal axis is in the z-direction, perpendicular to the other two; along this axis the diameter is 2c. The great advantage of the principal axes, as with moments of inertia (see Feynman Simplified 1D, Chapter 41), is that the behavior along each principal axis is particularly simple. By aligning our coordinate system to the crystal's principal axes, we find that:

2u = Π_xx E_x² + Π_yy E_y² + Π_zz E_z²

With these axes, the only non-zero components of Π are the 3 shown above; the other 6 are zero. This means that rotating our coordinate axes to align with the crystal's principal axes diagonalizes the polarization tensor Π, and reduces the polarization equations to:

P_x = Π_xx E_x
P_y = Π_yy E_y
P_z = Π_zz E_z

The polarization coefficients are still different, but now a field along any of these axes induces a polarization only along that axis. Feynman stresses that the three principal axes of the crystal may not be the axes of the unit cells; the latter are not necessarily mutually orthogonal. If the three polarization coefficients are equal, the crystal is isotropic with regard to polarization.

In V2p31-6, Feynman says: "Now the ellipsoid of polarizability must share the internal geometric symmetries of the crystal. For example, a triclinic crystal has low symmetry—[its] ellipsoid of polarizability will have unequal axes, and its orientation will not, in general, be aligned with the crystal axes. On the other hand, a monoclinic crystal has the property that its properties are unchanged if the crystal is rotated 180º about one axis. So the polarization tensor must be the same after such a rotation. …That can happen only if one of the axes of the ellipsoid is in the same direction as the symmetry axis of the crystal. …

"For an orthorhombic crystal, however, the axes of the ellipsoid must correspond to the crystal axes, because a 180º rotation about any one of the three axes repeats the same lattice. If we go to a tetragonal crystal, the ellipse must have the same symmetry, so it must have two equal diameters. Finally, for a cubic crystal, all three diameters of the ellipsoid must be equal; it becomes a sphere, and the polarizability of the crystal is the same in all directions."

Related to polarization anisotropy, crystals can also have anisotropic electrical conductivity. In simpler materials, conductivity σ is constant and isotropic. The electrical current j and applied external field E are then related by:

j = σ E

However, if a crystal's conductivity depends on the direction of current flow, σ must be represented by a tensor, and the above equation changes to:

j_µ = σ_µβ E_β

Here we sum over β for each value of µ.
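If you want to see the diagonalization described above happen numerically, here is a minimal sketch (Python with NumPy; the tensor components are illustrative assumptions, not crystal data). For a real symmetric Π, the eigenvectors are the crystal's principal axes, and rotating into that basis makes Π diagonal.

import numpy as np

# Hypothetical symmetric polarization tensor (illustrative values).
Pi = np.array([[3.0, 0.5, 0.2],
               [0.5, 2.0, 0.1],
               [0.2, 0.1, 1.5]])

# For a real symmetric tensor, eigh returns eigenvalues and orthonormal eigenvectors.
vals, vecs = np.linalg.eigh(Pi)

# Rotating into the eigenvector basis diagonalizes Π:
Pi_principal = vecs.T @ Pi @ vecs
print(np.round(Pi_principal, 12))   # off-diagonal components are zero
print(vals)                         # the three principal polarization coefficients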

Inertia Tensor

In Feynman Simplified 1D, Chapters 40 and 41, we explore the moment of inertia of rotating, rigid solids. Just as crystals have principal axes of polarization, solid objects have principal axes of inertia. If an object is rotated about a principal axis, the relationship between angular momentum L, angular velocity ω, and moment of inertia I is:

L = I ω

Here both L and ω are 3-vectors (axial vectors rather than polar vectors, to be precise). In general, objects have different moments of inertia about each principal axis, as does the rectangular cuboid ("block") shown in Figure 37-4.

Figure 37-4 Non-Principal Axis Rotation

Here, the three principal axes are perpendicular to the block's faces and are indicated by arrows. Due to the differing side lengths, the moments of inertia along the principal axes are all different. As shown, the block rotates about the vertical axis with angular velocity ω, and has angular momentum L that is not collinear with ω. This is because the vertical axis is not a principal axis and the three moments of inertia are different. This more complex relationship between L and ω is best represented by a tensor equation:

L_j = I_jk ω_k

Again, we sum over all values of k for each value of j. I_jk is the rank 2 inertia tensor.

In V2p31-7, Feynman writes out all the components:

L_x = I_xx ω_x + I_xy ω_y + I_xz ω_z
L_y = I_yx ω_x + I_yy ω_y + I_yz ω_z
L_z = I_zx ω_x + I_zy ω_y + I_zz ω_z

You be the judge of whether the tensor equation or the component equations provide greater insight. If we define our coordinate system along the object's principal axes, the inertia tensor will be diagonal, and the above equation reduces to:

L = (I_xx ω_x, I_yy ω_y, I_zz ω_z)

With the same logic that we used to analyze polarization, we can derive the components of the inertia tensor by calculating an object's rotational kinetic energy. Let the object contain N particles, and let the nth particle be at position r_n = (r_nx, r_ny, r_nz) and have mass m_n. To simplify the math, we choose a coordinate system centered at the object's center of mass (CM), and assume the axis of rotation passes through the CM, which is stationary. Mimicking the equations for polarization energy, the object's total rotational kinetic energy can be written in two ways:

Σ_n m_n v_n²/2 = K.E. = Σ_jk I_jk ω_j ω_k /2

Here v_n = |v_n| = |dr_n/dt|, and the sums are over: n = 1 to N; and j & k = x, y, and z.

To solve for the object's moment of inertia I, we must separate the sum of mv² into two parts: one that depends on angular velocity ω, and another that depends on the structure of the object itself, its sum of mr².

Since the motion is entirely rotational:

v_n = ω × r_n

The velocity is perpendicular to both ω and r_n, and is proportional to the magnitude of ω and to the component of r_n orthogonal to ω. The kinetic energy is then:

K.E. = Σ_n m_n (ω×r_n) • (ω×r_n) /2

Let's work on the squared cross product, using θ as the angle between ω and r_n.

2

n

2

2 n 2 n

2

2 n

n

2

n

2

2 n

2

n

2

n

We rewrite this in terms of components: = r (Σ w ) – (Σ ω r ) (Σ ω r ) 2 n

2

j

j

j

j

nj

k

k

nk

Next, we can combine the two sums in the right term, because they have different indices: 3 j-terms multiplied by 3 k-terms are the same as 9 jk-terms. We also make the ω-terms in the left sum match those of the right half with a little trick: ω = Σ δ ω ω . 2

j

k

jk

j

k

= r (Σ δ w w ) – Σ (ω r ω r ) 2 n

jk

jk

j

k

jk

j

nj

k

nk

|ω×r | = Σ (r δ – r r ) ω ω 2

n

jk

2 n

jk

nj

nk

j

k

Finally, we substitute this expression into the kinetic energy equation, and compare both versions to identify I_jk.

2 K.E. = Σ_jk I_jk ω_j ω_k = Σ_n m_n Σ_jk (r_n² δ_jk – r_nj r_nk) ω_j ω_k

I_jk = Σ_n m_n (r_n² δ_jk – r_nj r_nk)

Feynman provides a different derivation, with all components written out explicitly, that we will examine next. If you wish, you can skip this second proof and proceed to the next section.

K.E. = Σ_n m_n (ω×r_n) • (ω×r_n) /2

Feynman evaluates this for a single r with components (x, y, z).

(ω×r)•(ω×r) = (ω×r)_x² + (ω×r)_y² + (ω×r)_z²

= (ω_y z – ω_z y)² + (ω_z x – ω_x z)² + (ω_x y – ω_y x)²

= ω_y² z² – 2ω_y ω_z zy + ω_z² y² + ω_z² x² – 2ω_z ω_x xz + ω_x² z² + ω_x² y² – 2ω_x ω_y xy + ω_y² x²

Regrouping the squared terms gives us:

= ω_x² (y² + z²) – 2ω_y ω_z zy + ω_y² (x² + z²) – 2ω_z ω_x xz + ω_z² (x² + y²) – 2ω_x ω_y xy

Using the equation r² = x² + y² + z² yields:

= ω_x² (r² – x²) – 2ω_y ω_z zy + ω_y² (r² – y²) – 2ω_z ω_x xz + ω_z² (r² – z²) – 2ω_x ω_y xy

(ω×r)•(ω×r) = Σ_jk (r² δ_jk – r_j r_k) ω_j ω_k

The three j = k terms in the last equation give us the left half of the prior three lines, while the six other terms with j not equal to k give us the right half. Note that the sum in the last equation gives us both –ω_y ω_z zy and –ω_z ω_y yz, which combine to equal the right half of the first of the three prior lines.

From this point, Feynman’s derivation proceeds as I presented above.
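Here is a short numerical sketch of the result (Python with NumPy; the masses, positions, and angular velocity are illustrative assumptions, not data from the text). It builds I_jk = Σ m_n (r_n²δ_jk – r_nj r_nk) for a few point masses, confirms that ω·I·ω/2 equals Σ m_n v_n²/2, and shows that L = Iω is generally not parallel to ω.

import numpy as np

# Hypothetical rigid body: point masses m_n at positions r_n.
m = np.array([1.0, 2.0, 1.5, 2.5])
r = np.array([[ 1.0,  0.5, -0.2],
              [-0.5,  0.3,  0.4],
              [ 0.2, -0.8,  0.1],
              [-0.2,  0.1, -0.3]])
r -= (m[:, None] * r).sum(axis=0) / m.sum()      # shift so the CM is at the origin

# I_jk = Σ_n m_n (r_n² δ_jk – r_nj r_nk)
I = np.zeros((3, 3))
for mn, rn in zip(m, r):
    I += mn * (np.dot(rn, rn) * np.eye(3) - np.outer(rn, rn))

omega = np.array([0.3, -0.2, 0.7])               # an arbitrary angular velocity

# Kinetic energy two ways: Σ m v²/2 with v = ω×r, and ω·I·ω/2.
ke_particles = sum(0.5 * mn * np.dot(np.cross(omega, rn), np.cross(omega, rn))
                   for mn, rn in zip(m, r))
ke_tensor = 0.5 * omega @ I @ omega
assert np.isclose(ke_particles, ke_tensor)

# L = I ω is generally not collinear with ω unless ω lies along a principal axis.
print(I @ omega, omega)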

Cross Product as a Tensor

In V2p31-8, Feynman says we have been using tensors unknowingly since we studied the cross product in Feynman Simplified 1A, Chapter 6. For example, the equation for torque from Feynman Simplified 1D, Chapter 39, is:

τ = r × F

One can write this as:

τ_jk = r_j F_k – r_k F_j

Here, we see that τ_jk is formed by two pairs of proper polar 3-vectors. As we discussed in the Tensor Algebra section, each product term on the right side is a proper rank 2 tensor, and so therefore is their difference.

Clearly, τ_jk is antisymmetric: τ_jk = –τ_kj. In 3-D, this means τ_jk has only three independent non-zero components. Feynman says it is "almost by accident" that these three components form a proper axial 3-vector. He says "accident" because this is true only in 3-D. In 4-D, for example, antisymmetric tensors have 6 non-zero components, which clearly cannot make a 4-vector.

The same logic applies to any vector cross product; each can be written as a rank 2, antisymmetric tensor.
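A quick numerical check of that statement (Python with NumPy; r and F are illustrative vectors, not values from the text): the three independent components of the antisymmetric tensor are exactly the components of the axial vector r×F.

import numpy as np

r = np.array([1.0, 2.0, 3.0])        # illustrative position vector
F = np.array([0.5, -1.0, 2.0])       # illustrative force

tau_tensor = np.outer(r, F) - np.outer(F, r)   # τ_jk = r_j F_k – r_k F_j (antisymmetric)
tau_vector = np.cross(r, F)                    # τ = r × F

# The three independent tensor components form the axial vector:
# (τ_yz, τ_zx, τ_xy) = (τ_x, τ_y, τ_z)
assert np.allclose([tau_tensor[1, 2], tau_tensor[2, 0], tau_tensor[0, 1]], tau_vector)
print(tau_tensor)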

Stress Tensor

In V2p31-9, Feynman explores a tensor describing a very different phenomenon: stress. External forces acting on solid elastic matter distort the solid, and create internal stresses and strains. Atoms in the solid are displaced from their equilibrium positions (strain), and exert forces (stress) upon one another. We examined this phenomenon briefly in our discussion of a deflected drumhead in Chapter 12. Let's delve more deeply into such phenomena using a distorted block of Jello as our example.

Inside a block of Jello, intermolecular forces hold the material together. At equilibrium, across any plane surface, the forces acting on one side of the surface must balance the forces acting on the other side of that surface; the opposing forces must be equal in magnitude and opposite in direction. Over any small area Δa, these opposing forces, call them ΔF and –ΔF, must be proportional to Δa. Let's see why.

Select a vertical flat surface in the x=0 plane. The x-axis is normal to the surface, and y and z are the coordinates within the surface. Figure 37-5 illustrates the coordinate system, area Δa (shown in gray), and the components of force ΔF = (ΔF_x, ΔF_y, ΔF_z) = (α, β, γ).

Figure 37-5 Forces on Δa = Δy Δz

Since any physically meaningful function can be represented by a Taylor series, we can write the force ΔF as:

ΔF = a_0 + a_1 (Δa) + a_2 (Δa)² + O{(Δa)³}

Since ΔF is zero if Δa is zero, a_0 must be zero, and in the limit that Δa is infinitesimal, terms proportional to the second and higher powers of Δa are negligible. Thus, ΔF is proportional to Δa, which is equal to Δy × Δz.

In a static liquid, force ΔF is always normal to the surface; we call this pressure. But, Feynman says, in solids or flowing viscous liquids, there may also be non-zero forces parallel to the surface; these are called shear forces. Thus, there are three potential force components acting on the surface ΔyΔz. Similarly, there are three force components acting on ΔzΔx and ΔxΔy surfaces. All this is best addressed by defining the tensor S_jk for the stress per unit area:

S_jk = ΔF_j / Δa_k

where:
Δa_x has area Δy Δz & is orthogonal to x
Δa_y has area Δz Δx & is orthogonal to y
Δa_z has area Δx Δy & is orthogonal to z

In V2p31-10, Feynman says the next step is to show that S_jk is indeed a tensor, and that it completely describes the solid's internal forces.

To address completeness, let's consider a small volume V within the solid. If the solid is in equilibrium, volume V will be stationary, and the vector sum of all forces acting on its surfaces must be zero. In the limit that V goes to zero, any forces that originate outside the solid are negligible compared with the solid's internal forces. This is because macroscopic forces (electromagnetism and gravity) are proportional to volume for a homogeneous solid. The volume of V is proportional to its small dimensions cubed, while its surface area is proportional to its dimensions squared. We can therefore pick a size at which the internal forces dominate. Without outside forces, the sum of the internal forces over all surfaces of V must be zero. The volume we select is the triangular pyramid shown in gray in Figure 37-6. Three of the pyramid's edges meet orthogonally at the origin, and have lengths: Δx, Δy, and Δz.

Figure 37-6 Triangular Pyramid

Here, vectors A and B are along two edges of the slanted surface, and vector N = (N_x, N_y, N_z) is normal to that surface.

Let's first calculate N and the area of the slanted surface. Both are defined by the cross product:

A × B = N = n (area of AB-parallelogram)

As we discussed in Feynman Simplified 1A, Chapter 6, the graphic significance of the cross product N = A×B is that the magnitude of N equals A_|AB|, the area of a parallelogram whose sides are A and B, and the direction of N is normal to that parallelogram. In this case,

A × B = (Δx, 0, –Δz) × (0, Δy, –Δz)
N = (Δy Δz, Δx Δz, Δx Δy)
A_|AB| = √(Δy²Δz² + Δx²Δz² + Δx²Δy²)

The area of the slanted surface is one-half of A_|AB|, the area of the parallelogram formed by A×B.

Next, let's calculate the x-component of force on each of the four surfaces of the triangular pyramid, with each surface's unit normal defined to point outward. We assume that the dimensions are small enough that the force across each surface is constant. The equation for force is:

ΔF_j = S_jk Δa_k

Note that force F and area a are vectors. On the bottom surface (y=0), the unit normal points toward –y, hence Δa_y is negative, and this surface contributes:

bottom: (–1) S_xy Δx Δz /2

On the left surface (x=0), the unit normal points toward –x, hence Δa_x is negative, and this surface contributes:

left: (–1) S_xx Δy Δz /2

On the back surface (z=0), the unit normal points toward –z, hence Δa_z is negative, and this surface contributes:

back: (–1) S_xz Δx Δy /2

On the slanted surface, the force contribution F equals N_x, the x-component of force per unit area, multiplied by its surface area.

F = N_x A_|AB| /2

Setting the total force in the x-direction equal to zero yields:

0 = F – S_xy ΔxΔz/2 – S_xx ΔyΔz/2 – S_xz ΔxΔy/2

N_x A_|AB| = S_xx ΔyΔz + S_xy ΔxΔz + S_xz ΔxΔy

Recall that:

N = (Δy Δz, Δx Δz, Δx Δy)

We now define a unit vector n = (n_x, n_y, n_z) that is parallel to N.

n_x = Δy Δz / A_|AB|
n_y = Δx Δz / A_|AB|
n_z = Δx Δy / A_|AB|

Our prior equation becomes:

N_x = S_xx n_x + S_xy n_y + S_xz n_z

We can generalize this to any force component as:

N_j = S_jk n_k

The above equation shows that the stress tensor S_jk does indeed completely describe the solid's internal stresses for surfaces of any orientation. Also, since N and n are both proper 3-D vectors, S_jk must be a proper 3-D, rank 2 tensor.

(If you find Feynman's equation (31.24) in V2p31-11 confusing, you are not alone — I do too. Feynman is using a special notation that he invented and few others use. His symbol S_jk on the right side of this equation is the jk-component of a rank 2 tensor, while his symbol S_jn on the left is the jth component of the 3-D force vector S_n.)

Feynman shows that the stress tensor is symmetric: S_jk = S_kj. Figure 37-7 shows the stresses on a small square area in the xy-plane.

Figure 37-7 Stresses on Square

At equilibrium, the square cannot rotate, so the sum of all torques must be zero. For side length D, the total torque is:

0 = 2 S_yx D/2 – 2 S_xy D/2

Hence, S_xy = S_yx, and similarly for other mixed index combinations. The stress tensor is therefore symmetric.

Any stress tensor can be diagonalized, which is equivalent to saying that each has three shear-free principal axes. In the coordinate system defined by the principal axes, all forces act normal to any surface (as does pressure).

The stress tensor can also describe piezoelectricity, the phenomenon of a crystal producing an electric field E in response to stress, according to:

E_j = Θ_jkn S_kn

Here, we sum over k and n, and Θ_jkn is the piezoelectric tensor.

As a final note, Feynman says that an inhomogeneous solid’s stresses may vary with location. Hence the stress tensor has different components at each point. Just as temperature is a scalar field, and wind velocity is a vector field, stress in a distorted solid is a tensor field.
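To see the stress tensor in action, here is a minimal sketch (Python with NumPy; the stress components and the surface normal are illustrative assumptions, not material data). It evaluates the force per unit area N_j = S_jk n_k on a tilted surface and diagonalizes S to find the shear-free principal axes.

import numpy as np

# Hypothetical symmetric stress tensor (illustrative values, units of force/area).
S = np.array([[ 2.0,  0.4, -0.1],
              [ 0.4,  1.5,  0.3],
              [-0.1,  0.3,  0.8]])

# Force per unit area on a surface with unit normal n:  N_j = S_jk n_k
n = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
N = S @ n
print(N)

# Diagonalizing S gives the three shear-free principal axes; in that basis every
# traction is normal to its surface, like a pure pressure along each axis.
principal_stresses, principal_axes = np.linalg.eigh(S)
print(principal_stresses)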

Strain and Elasticity

Recall from Feynman Simplified 1A, Chapter 7, Hooke's law for springs:

F = kx

Here, F is the force required to displace a spring a distance x from its equilibrium length, and k is the spring strength constant. The elastic energy U stored in a displaced spring is given by:

U = F x / 2 = kx² / 2

Hooke's law is adequate for small displacements of simple materials in one dimension. More complex solids, with more complex distortions, are best addressed using strain tensor T and elasticity tensor Γ. In the general case, we have:

T = Γ S
U = T S / 2 = Γ S S / 2

With indices, these are:

T_jk = Σ_nm Γ_jknm S_nm
U = Σ_jknm Γ_jknm S_jk S_nm / 2

As you see, Γ has four indices, making it a rank 4 tensor with 3⁴ = 81 components. Fortunately, there are many duplicate components, reducing the number of independent components to a still formidable number of 21 in the general case. That number drops substantially in highly symmetric crystals. In a cubic crystal, for example, Γ has only 3 independent components.

In an isotropic solid, Γ has only 2 independent components. Feynman explains why. The only tensor that is completely symmetric and isotropic is δ_jk, or some construct thereof. One can build a rank 4 tensor out of δ_jk in two ways:

δ_jk δ_nm

δ_jn δ_km + δ_jm δ_nk

Hence, Γ must be a linear combination of these:

Γ_jknm = α (δ_jk δ_nm) + β (δ_jn δ_km + δ_jm δ_nk)

for some constants α and β.
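A small sketch of that construction (Python with NumPy; α and β are arbitrary example constants, not material data) builds Γ_jknm from Kronecker deltas and checks its symmetries.

import numpy as np

def isotropic_gamma(alpha, beta):
    # Γ_jknm = α δ_jk δ_nm + β (δ_jn δ_km + δ_jm δ_nk), the most general
    # isotropic rank 4 tensor built from Kronecker deltas.
    d = np.eye(3)
    return (alpha * np.einsum('jk,nm->jknm', d, d)
            + beta * (np.einsum('jn,km->jknm', d, d) + np.einsum('jm,nk->jknm', d, d)))

# Illustrative constants: check the symmetries claimed in the text.
G = isotropic_gamma(alpha=2.0, beta=0.5)
assert np.allclose(G, G.transpose(1, 0, 2, 3))   # symmetric in j, k
assert np.allclose(G, G.transpose(0, 1, 3, 2))   # symmetric in n, m
assert np.allclose(G, G.transpose(2, 3, 0, 1))   # symmetric under (jk) <-> (nm)
print(G.shape)   # (3, 3, 3, 3): 81 components, but only 2 independent constants here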

Field Momentum Tensor

We next expand our vision into the four dimensions of spacetime.

The stress tensor S_jk described above is actually the 3×3 spatial portion of a 4×4, rank 2, 4-D stress tensor. We previously defined the jk-component of S as the j-component of force per unit area on a surface orthogonal to the k-direction. Since force is the time derivative of momentum, we can describe force as the rate at which momentum flows. This means the jk-component of S is the rate of j-momentum flow per unit area through a surface orthogonal to the k-direction. In fewer words: S_jk is the flux of j-momentum in the k-direction. Here, j and k are any of x, y, or z.

What about time-components, such as S_tt or S_tx?

Recall that the time component of 4-momentum p_µ is E/c. We will not prove this here, but the time components of the stress tensor are:

S_tx = S_xt = energy flux in x-direction
S_ty = S_yt = energy flux in y-direction
S_tz = S_zt = energy flux in z-direction
S_tt = energy density

This 4-D tensor is called the stress-energy tensor, or the stress-energy-mass-momentum tensor, or various versions thereof. Normal convention denotes this: T_μσ. This tensor is featured in Einstein's most important equation of general relativity:

G_μσ = 8π T_μσ

Just for entertainment, we provide the equations for two other famous tensors. For a gas of free particles (no forces), with density ρ, pressure P, and gas 4-velocity u:

T_μσ = (ρ+P) u_μ u_σ – g_μσ P

For electromagnetic fields in empty space: T = (ε /2) {F F – g F F /4} β

μσ

0

μ

αβ

μβ

μσ

αβ

Universality of Kronecker Delta

Feynman challenges us to prove that the Kronecker delta δ_μσ has the same form [1] in every coordinate system.

At the start of this chapter, we learned how to transform a tensor from one coordinate system to another using the rotation tensor R and its inverse R⁻¹. For any rank 2 tensor A, the transformed tensor A* is given by:

A* = R A R⁻¹

In 4-D spacetime, this remains valid for rotations. For boosts, simply replace R above with the Lorentz tensor Λ. Multiplying any tensor X by the unit tensor δ yields X, as shown by:

Y = δ X
Y_αβ = Σ_μ δ_αμ X_μβ
Y_αβ = X_αβ

Here, we sum over µ for every combination of α and β. In that sum, δ_αμ equals 1 when α=μ, and equals zero otherwise.

Now, replace A in the transformation equation with δ_μσ, yielding:

δ*_μσ = R (δ_μσ R⁻¹) = R R⁻¹ = δ_μσ for any R

Chapter 37 Review: Key Ideas • Tensors are a generalization of vectors. Tensor calculus is a beautiful branch of mathematics that empowers us to elegantly and effectively describe many complex, multi-dimensional phenomena. Any tensor equation that is valid in any coordinate system is valid without modification in all coordinate systems. Tensors are characterized by their rank and by the dimensionality D of the space in which they are defined. A rank N tensor has N indices that all range over the same set of allowed values, and has D components. N

A fixed index, such as 1 or x, specifies a specific component or row or column, etc. A free index can assume any allowed value, such as 0,1,2, 3 or t, x, y, z. An unmatched free index appears only once in a single product term. Every tensor equation is valid for all values of every unmatched free index. In tensor equations, each non-zero term must have the same set of unmatched indices. When a free index appears twice in one term, the Einstein summation convention directs us to sum over all values of the repeated index. This is a tensor contraction. For example: A B =A B +A B +A B +A B μ

μ

0

0

1

1

2

2

3

3

All quantities on the right side are components, hence their sum is a scalar with no indices. The tensor contraction of a rank N tensor with a rank M tensor has rank N+M–2.

The Kronecker delta δ is defined by: μσ

δ = 1 if μ=σ and zero otherwise. μσ

All non-singular tensors have inverses: A inverse always equals the unit tensor:

–1

A

–1

A =A A

–1

σμ

μβ

σμ

μβ

= [1] = δ

μσ

is the inverse of A . The contraction of a tensor with its μσ

σβ

• The polarizability tensor Π of a crystal relates its polarization vector P to an external electric field E according to: P =Π E j

jk

k

• The moment of inertia tensor I of a solid relates its angular momentum vector L and angular velocity vector ω according to: L =I ω µ

µβ

β

I = Σ m (r δ – r r ) jk

n

n

2 n

jk

nj

nk

Here, the sum is over all particles in the solid, m is the mass of the nth particle and r is its position. n

n

• Any vector cross product can be expressed as a rank 2, antisymmetric tensor. For example, torque τ in vector form is: τ=r×F In tensor form, this is: τ = r F – r F jk

j

k

k

j

Chapter 38 Magnetic Matter There are several distinct classes of magnetism in macroscopic material objects. The most impressive is ferromagnetism, the subject of Chapter 40, which occurs in only a few select materials, primarily iron, nickel, and cobalt. Ordinary materials have magnetic effects that are much weaker — up to a million times weaker — than ferromagnetism. Let’s start with those. In electrostatics, we learned that all dielectric matter is attracted by an external electric field E. Positive charges within material bodies are displaced in the direction of E, while negative charges are displaced in the opposite direction. This creates an electric dipole that has lower energy in regions of higher electric fields. The resultant force pulls all types of matter toward higher electric fields. Magnetism is more complex — magnetic fields attract some materials and repel others. The two alternatives are easily distinguished by the apparatus shown in Figure 38-1.

Figure 38-1 Tapered Magnet Pole

Here, a small black sample dangles from a string between a magnet’s north and south poles. Magnetic field lines with arrows show that the B field is strongest near the tip of the tapered south pole. In this setup, ferromagnetic materials are strongly attracted to the strong field near the south pole tip. But ordinary, non-ferromagnetic materials are either weakly attracted or weakly repelled by the south pole. In V2p34-1, Feynman says materials that: “…are repelled in this way are called diamagnetic. Bismuth is one of the strongest diamagnetic materials, but even with it, the effect is still quite weak. Diamagnetism is always very weak.”

Conversely, materials that are weakly attracted to the tapered pole are called paramagnetic. Aluminum is a paramagnetic material. Feynman says he will provide some insights into these phenomena using classical (non-quantum) electromagnetism, but that a true understanding requires quantum mechanics. That these are quantum phenomena should not be surprising. Magnetism in matter is primarily due to atoms, and the development of quantum theory was driven by the desire to understand atoms for which classical physics fails utterly. In V2p34-2, Feynman says: “Such magnetic effects are a completely quantum-mechanical phenomenon. It is, however, possible to make some phoney classical arguments and to get some idea of what is going on. We might put it this way: …there are situations, such as in a plasma or a region of space with many free electrons, where the electrons do obey the laws of classical mechanics [and] some of the theorems from classical magnetism are worth while.” With that caveat, here is a brief summary of what this chapter will present. Diamagnetism: In all matter, an external B field induces weak currents that oppose B, in accordance with Lenz’s law. Since the induced magnetic moments are opposite to B, this diamagnetic effect causes all materials to be very weakly repelled by magnetic fields. In many substances, the magnetic moments of individual atoms balance exactly, resulting in zero net moment. In such cases, diamagnetism is the only effect, and these materials are weakly repelled by magnetic fields. Paramagnetism: In other substances, atoms do have permanent magnetic moments due to unbalanced spins or orbital motion. These atomic moments tend to align parallel to an external field B, causing these paramagnetic materials to be weakly attracted to B. Paramagnetism, when present, generally dominates the ever-present diamagnetism. The latter is not temperature-sensitive but the former is. Thermal energy drives interatomic collisions that disrupt the paramagnetic alignment of atomic magnetic moments. Paramagnetism is therefore stronger at lower temperatures.

Magnetic Moment & Angular Momentum In V2p34-3, Feynman explains the relationship between an electron’s magnetic moment µ and its angular momentum J. Let’s consider an electron in a circular orbit, as in an atom. From classical physics, we know that the equation for J is: J = m r×v Here, the electron’s mass is m, its velocity vector is v, and its position vector is r. In atoms, electron velocities are always much less than the speed of light. We also know that the magnetic moment µ is the product of current and enclosed area. See Chapter

15, but be aware that Feynman uses J for current there, whereas he uses J for angular momentum here. For a vast number of charges with charge density ρ, their current density is simply ρv. But for a single charge q in a circular orbit, its current is q multiplied by the number of orbits per unit time, which is v/2πr. Hence:

µ = q (v/2πr) (πr²)
µ = q r×v /2

In the last line, we included the directionality of µ: for positive current flowing counterclockwise, µ is parallel to angular momentum J. We can now examine the relationship between magnetic moment and angular momentum for any charged particle. µ = q r×v /2 = (q/2m) J In V2p34-3, Feynman defines the electron charge to be –q . This is particularly confusing since we all know that the electron charge q is negative. To clarify this, I will write the electron charge as –q , where q is the charge of a proton, which is indisputably positive. The equation for an electron’s magnetic moment is then: e

e

p

p

µ = – (q /2m) J p

It turns out that this last equation is also valid quantum mechanics for orbital angular momentum. However, each elementary particle in an atom also has an intrinsic magnetic moment that physicists attribute to its intrinsic spin. Quantum spin is somewhat like Earth’s spin, its daily rotation about its axis, but of course, quantum things are never really the same as the macroscopic things with which we are familiar. The actual relationships between spin magnetic moment µ and spin angular momentum J involve a quantum mechanical factor called g (not the gravitational acceleration g). For four prominent particles, these relationships are: muon: µ = – g (q /2m) J g = 2.002,331,8414 (±12) µ

µ

p

µ

electron: µ = – g (q /2m) J g = 2.002,319,304,361,17 (±15) e

e

p

e

proton: µ = + g (q /2m) J g = +5.585,694,702 (±17) p

p

p

p

neutron: µ = + g (q /2m) J g = –3.826,085,45 (±90) n

n

n

p

The g-values are the latest measured quantities, with the uncertainties in their right-most two digits given in parentheses. Note that the value of g is predicted by quantum electrodynamics (QED) and is measured experimentally to 14 decimal digits, one part in 10 trillion — a fantastic achievement by both theoretical and experimental physicists! g is one of the most precisely measured quantities in all of science. e

e

The Standard Model of Particle Physics says that the muon and electron are electromagnetically identical except for their mass — the muon is 207 times more massive. Their g-factors should be the same except for small mass-dependent effects. QED says those effects (different contributions from heavy particle exchange) account for 99.96% of the measured g-factor difference. That still leaves an observed but unexplained difference of 2.8±0.82 parts per billion. This amounts to a 3.4 standard deviation discrepancy, which is 99.9% unlikely to occur by random chance. If this discrepancy holds up in further experiments now underway, it will be evidence of new physics beyond the Standard Model — that would be exciting. One might be surprised that neutrons have a magnetic moment. Completely chargeless particles, such as photons and neutrinos, have no magnetic moment. However, we now know that neutrons are comprised of three charged quarks, whose charges are +2/3, –1/3, and –1/3 times q . While that sums to zero net charge, moving charged quarks within a neutron give it a magnetic moment. p

Because different types of atoms contain different mixtures of spin and orbital angular momentum, we characterize each type of atom using the Lande g-factor, according to: µ

atom

= – g (q /2m) J p

Here, g ranges from 1 to 2.002…, corresponding to the limits of the atom’s angular momentum being entirely orbital or entirely spin, respectively.

Precession of Atomic Moments In V2p34-4, Feynman reminds us that a magnetic moment precesses in an external magnetic field. A magnetic field B exerts a torque τ on a magnetic moment µ according to: τ = µ×B The direction of the torque seeks to align µ parallel to B. However, µ is a gyroscope with angular momentum J. As explained in Feynman Simplified 1D, Chapter 41, torque τ causes a gyroscope to precess by changing its angular momentum J. In a small time interval Δt, the change in angular momentum ΔJ is: ΔJ = τ Δt

Figure 38-2 shows magnetic field B, angular momentum J, and angular momentum change ΔJ. Also shown are the angle θ between B and J, and the dotted circle of radius r through which J precesses at frequency ω. Torque τ is parallel to ΔJ, but is not shown to reduce clutter.

Figure 38-2 Precession of J

Since atomic magnetic moments are primarily due to electrons, µ is antiparallel to J, and the direction of torque τ=µ×B is counterclockwise, as shown in Figure 38-2, which makes ω>0. Now that we know the vectors' directions, let's focus on their magnitudes.

τ = µ B sinθ
τ Δt = ΔJ = r ω Δt

Since radius r = J sinθ, the prior equation becomes:

τ = J sinθ ω

Combining equations yields:

J sinθ ω = { g (q_p/2m) J } B sinθ

ω = g q_p B / 2m

In V2p34-5, Feynman provides numerical values of the precession frequencies (f=ω/2π) for two cases: when magnetic moments are due to electrons, and when they are due to nucleons (protons or neutrons).

for electrons: f = (1.4 MHz/gauss) g B
for nucleons: f = (760 Hz/gauss) g B

This is the classical description. The quantum description is not very different.
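As a quick check of those numbers, here is a small Python sketch (the 1-gauss field is an assumed example value; the constants are standard SI values). It evaluates f = ω/2π = g q_p B / 4πm for an orbital electron and for a nucleon.

import numpy as np

# Classical precession (Larmor) frequency  ω = g q_p B / 2m,  f = ω / 2π.
q_p = 1.602176634e-19      # proton charge, C
m_e = 9.1093837015e-31     # electron mass, kg
m_p = 1.67262192369e-27    # proton mass, kg
B   = 1e-4                 # tesla (1 gauss), an assumed example field

f_electron = q_p * B / (2 * m_e) / (2 * np.pi)   # per unit g
f_nucleon  = q_p * B / (2 * m_p) / (2 * np.pi)   # per unit g

print(f"electrons: f/g ≈ {f_electron/1e6:.2f} MHz per gauss")   # ≈ 1.4 MHz/gauss
print(f"nucleons:  f/g ≈ {f_nucleon:.0f} Hz per gauss")         # ≈ 760 Hz/gauss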

Diamagnetism

Feynman says one of the "nice ways" to describe diamagnetism with classical physics is to imagine an isolated atom in a magnetic field that is slowly ramping up from zero strength. From Faraday's law, an increasing B field creates a circulating E field, according to:

∇ × E = – ∂B/∂t

Let's calculate E along the orbit of an electron at a distance r from the atom's center, as shown in Figure 38-3.

Figure 38-3 E from Increasing B

The loop Γ is a circle of radius r centered on the atom. Stokes' theorem equates the line integral of E•ds around Γ with minus the rate of change of magnetic flux through the gray area enclosed by Γ. This is:

2πr E = – πr² ∂B/∂t

E = – (r/2) ∂B/∂t

The induced field E exerts a torque on the electron, whose charge is –q_p, that changes its angular momentum J, as given by:

τ = dJ/dt = – q_p r × E

Substituting the magnitude of E yields:

dJ/dt = + (q_p r²/2) ∂B/∂t

p

Now, we obtain the total change in angular momentum ΔJ by integrating this equation from the time when the magnetic field is zero to the time when it is B.

ΔJ = + (q_p r²/2) B

p

This means ΔJ is added to the atom’s intrinsic angular momentum (if any), which in turn adds Δµ to the atom’s initial magnetic moment (if any) according to: Δµ = – (g q / 2m) ΔJ p

Δµ = – q_p² r² B / 4m

2

Since the added angular momentum is entirely orbital, the g-factor in the upper line equals 1. As expected, the induced magnetic moment opposes the increasing B field. This induced opposing moment is the basis of diamagnetism. In an atom with no intrinsic magnetic moment, the induced moment results in potential energy U (see Chapter 16):

U = – µ•B = + q_p² r² B² / 4m

2

2

Since nature always seeks to reduce potential energy (and thereby increase entropy), diamagnetic materials are forced away from regions of strong fields. This is why a diamagnetic substance swings away from the tapered south pole in Figure 38-1. The r factor in the prior equation refers to the radius of an electron’s circular orbit, assuming that orbit is two-dimensional. While plausible in classical physics, in quantum mechanics orbits are three-dimensional. To be closer to quantum theory, we should express this in terms of the average square 3-D distance from the atom’s center. 2

in 2-D, r = x + y 2

2

2

in 3-D, = x + y + z 2

2

2

2

Assuming symmetric orbits, we should replace r with (2/3) . Making this substitution changes the induced moment equation to: 2

2

Δµ = – q B / 6m 2 p

2

As mentioned earlier, the diamagnetic effect exists in all materials, but is generally dominated by other magnetic effects in materials with intrinsic magnetic moments.

Larmor’s Theorem The intrinsic spin of elementary particles is a purely quantum phenomenon that has no corresponding classical equivalent. In classical physics, therefore, all atomic angular momentum is due to electrons’ orbital motion. This means the Lande g-factor is always 1, and all electrons precess at the same angular velocity ω = q B/2m. This is not the case in quantum mechanics. p

In classical physics, Larmor’s theorem applies: the atomic motion of electrons in an external B field is their no-field motion plus an additional rotation about B at frequency ω = q B/2m. However complex are the atomic motions of electrons absent an external field, adding a field merely adds a simple constant-frequency rotation. L

p

In V2p34-7, Feynman proves this theorem to the extent of the validity of classical physics. Begin with an electron in some complex orbit around a nucleus, but without an external B field. At any time t, let the electron be at position r and be subject to force F(r).

Now add field B. This changes the total force on the electron to: F*(r) = F(r) + q v×B Now, Feynman’s trick is to consider this electron in a rotating coordinate system, one turning about the B-direction with angular velocity ω. As we discuss in Feynman Simplified 1A, Chapter 9, a rotating coordinate system is not an inertial frame. For classical mechanics to yield correct results in a rotating coordinate system, we must add centrifugal and Coriolis pseudo forces to the normal real forces. These pseudo forces produce an apparent tangential force F and an apparent radial force F given by: t

r

F = 2mωv t

r

F = mω r – 2mωv 2

r

t

Here v is the electron’s radial velocity, which is the same in both stationary and rotating frames, while v is its tangential velocity relative to the rotating frame. For small angular velocities (ωr 0. 2

(2j+1) = 2ħ {(1/2) + (3/2) + …j } (2j+1) = (ħ /2) {1 + 3 + …(2j) } (2j+1) = (ħ /2) { (2j)(2j+1)(2j+2)/6 } = j(j+1) ħ / 3 2 z 2 z 2 z

2 z

2

2

2

2

2

2

2

2

2

2

For an integer j, the sum of all j is also twice the sum of the squares of all j>0, since j=0 does not contribute. 2

(2j+1) = 2ħ {1 + 2 + 3 + …j } (2j+1) = 2ħ { j(j+1)(2j+1)/6 } = j(j+1) ħ / 3 2

2 z 2 z

2 z

2

2

2

2

2

2

For both integer and half-integer j, we obtain:

J² = 3⟨J_z²⟩ = j(j+1) ħ²

2 z

2

In V2p34-11, Feynman says: “Although we would think classically that the largest possible value of the z-component of J is just the magnitude of J—namely, √(J•J) — quantum mechanically the maximum of J is ħj, which is always a little less than ħ√j(j+1). The angular momentum is never ‘completely along [any] direction’.” z

To be quantitative, for spin 1/2, √j(j+1) = 0.86603, substantially more than 0.5.

Magnetic Energy of Atoms The magnetic energy of an atom is also quantized in quantum mechanics. Recall that we wrote an atom’s magnetic moment using the Lande g-factor as: µ

atom

= – g (q /2m) J p

In a magnetic field, the atom has potential energy U given by: U = – µ•B Aligning the z-axis parallel to B, these equations combine to yield: U = + g (q /2m) J B p

z

Since J is quantized, U becomes: z

U= + gµ B j B

z

where µ = q ħ / 2m is the Bohr magneton. B

p

This equation shows that the magnetic energy is proportional to both the strength of the B field and to the z-component of quantum number j. We say a magnetic field splits an atom’s energy into 2j+1 different energy levels, as shown in Figure 38-4 for a j=3/2 atom.

Figure 38-4 Energy Splitting for J=3ħ/2

Figure 38-5 shows the energy splitting for a j=1/2 atom (or a lone electron).

Figure 38-5 Energy Splitting for J=ħ/2

The following chapters will explore magnetic matter employing the quantum rules we learned here.

Sums of Integers Squared Consider first the sum of the squares of all integers up to n: S =1+2 +3 +…+n =Σ j 2

2

2

n

2

n

For a few values of n, S is: n

S =1+2 =5 S = 1 + 2 + 3 = 14 = 2•7 2

2

2

2

3

Both of these sums contain the factor 2n+1. Let’s factor that out, multiply by 3 to avoid fractions, and examine what remains for various n: n=1: 3S /(2n+1) = 1 n=2: 3S /(2n+1) = 3 n=3: 3S /(2n+1) = 6 = 2•3 n=4: 3S /(2n+1) = 10 = 2•5 n n n n

This series is simply the sum of the integers from 1 to n, which equals n(n+1)/2. We therefore have: 3S /(2n+1) = n(n+1)/2 S = n(n+1)(2n+1)/6 n

n

That works for n up to 4. To prove it for all n, we use the principle of mathematical induction: given an equation that is valid for some n, if we prove it is valid for n+1, then it must be valid for all n. S S S S S

n+1

= (n+1)(n+2)(2n+3)/6 = (n+1){ 2n + 7n + 6 }/6 = (n+1){ (2n +n) + 6(n+1) }/6 = (n+1){ n(2n+1) }/6 + (n+1)(n+1) = S + (n+1) 2

n+1

2

n+1 n+1

2

n+1

n

Thus the equation is valid for n+1 if it is valid for n. Since it is valid for n=1 through 4, the equation is valid by induction for all n. (since it is valid for n=4, it is valid for n=5, which makes it valid for n=6…) QED Next consider the sum of the squares of even integers: E =2 +4 +6 +…+n E = 4{ 1 + 2 + 3 + … + (n/2) } 2

2

2

2

n

2

n

2

2

2

E =4S E = 4 (n/2)(1+n/2)(n+1)/6 E = n(n+1)(n+2)/6 n

n/2

n n

QED Next consider the sum of the squares of odd integers: O =1 +3 +5 +…+n = {1 + 2 + 3 + 4 + 5 + … + n } – {2 + 4 + 6 + … + (n–1) } 2

2

2

2

2

2

2

n

2

2

2

2

2

2

2

O =S –E 6O = n(n+1)(2n+1) – (n–1)(n)(n+1) 6O = n(n+1){ (2n+1) – (n–1) } O = n(n+1)(n+2)/6 n

n

n–1

n n

n

QED Perhaps surprisingly, O = E — the sum of odd squares and the sum of even squares have the same equation, but of course with a different value of n. n

n

Chapter 38 Review: Key Ideas • Diamagnetism: In all matter, an external B field induces weak currents that oppose B, causing all materials to be very weakly repelled by magnetic fields. In many substances, the magnetic moments of individual atoms balance exactly, resulting in zero net moment. In such cases, diamagnetism is the only effect, and these materials are weakly repelled by magnetic fields.

• Paramagnetism: In other substances, atoms have permanent magnetic moments, due to unbalanced spins or orbital motion. These atomic moments tend to align parallel to an external field B, causing these materials to be weakly attracted to B. Paramagnetism, when present, generally dominates the ever-present diamagnetism. The latter is not temperature-sensitive but the former is. Thermal energy drives interatomic collisions that disrupt the paramagnetic alignment of atomic magnetic moments. Paramagnetism is therefore stronger at lower temperatures.

• The Lande g-factor relates an object’s angular momentum J to its magnetic moment µ, according to: µ = g (q/2m) J Here, q is the charge and m is the mass of the object. For orbital angular momentum g=1. For four prominent particles:

muon: µ = – g (q /2m) J g = 2.002,331,8414 (±12) µ

µ

p

µ

electron: µ = – g (q /2m) J g = 2.002,319,304,361,17 (±15) e

e

p

e

proton: µ = + g (q /2m) J g = +5.585,694,702 (±17) p

p

p

p

neutron: µ = + g (q /2m) J g = –3.826,085,45 (±90) n

n

p

n

• Magnetic moments precess in an external field B, turning about the B-direction at angular velocity ω (frequency f=ω/2π), where: ω = – g q B / 2m for electrons: f = (1.4 MHz/gauss) g B for nucleons: f = (760 Hz/gauss) g B

• Larmor’s theorem: in classical physics, the atomic motion of electrons in an external B field is their no-field motion plus an additional rotation about B at frequency ω = qB/2m. L

• In quantum mechanics, the z-component of angular momentum of an object with angular momentum j can only have one of 2j+1 specific, equally spaced values, such as: –jħ, –(j–1)ħ, –(j–2)ħ … +(j–2)ħ, +(j–1)ħ, +jħ Here, ħ = h/2π, where h is Planck’s constant, the number that sets the scale of quantum phenomena. For orbital angular momentum, j must be an integer. For intrinsic particle spins, j is an integer for bosons, and a half-integer for fermions. All elementary fermions, plus protons and neutrons, have spin 1/2. All known bosons have spin 1, except the Higgs boson that has spin 0. The total classical angular momentum J and the total quantum angular momentum j are related by: J = j(j+1) ħ 2

2

• The Bohr magneton µ and the magnetic energy U in a field B of the magnetic moment of a particle B

with angular momentum j parallel to B are: U= + gµ B j µ = q ħ / 2m B

B

p

This equation shows that a magnetic field splits an atom’s energy into 2j+1 different energy levels.

Chapter 39 Paramagnetism & Resonance In V2p35-1, Feynman bemoans not being able to present all of physics in one lecture: “In the last chapter we described how in quantum mechanics the angular momentum of a thing does not have an arbitrary direction, but its component along a given axis can take on only certain equally spaced, discrete values. It is a shocking and peculiar thing. You may think that perhaps we should not go into such things until your minds are more advanced and ready to accept this kind of an idea. Actually, your minds will never become more advanced—in the sense of being able to accept such a thing easily. There isn’t any descriptive way of making it intelligible that isn’t so subtle and advanced in its own form that it is more complicated than the thing you were trying to explain. The behavior of matter on a small scale—as we have remarked many times—is different from anything that you are used to and is very strange indeed. As we proceed with classical physics, it is a good idea to try to get a growing acquaintance with the behavior of things on a small scale, at first as a kind of experience without any deep understanding. Understanding of these matters comes very slowly, if at all. Of course, one does get better able to know what is going to happen in a quantum-mechanical situation—if that is what understanding means—but one never gets a comfortable feeling that these quantummechanical rules are ‘natural.’ Of course they are, but they are not natural to our own experience at an ordinary level.”

Why Quantization? I will attempt what Feynman doesn’t do in this lecture: provide a simple understanding of why quantization dominates the micro-world. The key concept you must grasp is particle-wave duality. This is the principle from which all quantum theory derives. Particle-wave duality says every entity in nature has both particle and wave properties, at all times. We tend to think of electrons and other “particles” as being rigid balls of zero size. Quantum mechanics says that is wrong: particles have a wavelength that spreads them in some fuzzy way across a small but decidedly non-zero volume. Duality also says that “waves”, such as light and sound, are ultimately comprised of individual, indivisible particles. Why should you believe that? For the only reason that you should believe anything in science: because experiments prove this is the way nature really is. As physicists, our vocation is to

understand how nature actually is, not how we wish it might be. Figure 39-1 shows stunning proof of particle-wave duality. In the upper image, light from source S passes through two small slits in an opaque screen. The resulting two beams interfere with one another, creating a fringe pattern on an imaging plane F. As explained in Feynman Simplified 1C, chapter 31, this fringe pattern is a hallmark sign of wave interference.

Figure 39-1 Light (above) vs. Electrons (below)

The lower image shows the same arrangement, but with source S emitting electrons instead of light. While less crisp, the real image shown here clearly has a fringe pattern, thus proving that electrons have wave properties. Figure 39-2, a magnified portion of the electron image, shows that image is comprised of many tiny dots — 140,000 to be exact. Each tiny dot is the interaction point of an individual electron. Since waves spread over large areas, the fact that each electron strikes only a single point on the detector proves electrons have particle properties.

Figure 39-2 Electron Image Zoom

A critical aspect of this brilliant experiment, done by Akira Tonomura, is that the electron flux is so small that only one electron at a time passes from source to detector. Thus we must accept the startling conclusion that each electron passes through both slits simultaneously, creating two electron-waves that interfere with one another, contributing to the fringe pattern while striking only a single point on the detector. Only particle-wave duality has "rationalized" this behavior. To paraphrase Niels Bohr, if that "hasn't profoundly shocked you, you haven't understood it yet."

Figure 39-3 Waves Modes: 3 Valid, 1 Not

Each allowed value of n is called a mode. Here, the upper three modes are valid, with n=1, n=3, and n=7. The lowest mode, with n=1.5, is invalid because the string's right end is moving, contrary to the stipulation that it must be stationary.

Note that quantization is most important when the size of the confining space L and wavelength λ are comparable. Let's consider some numerical examples, recalling that the mode number n=2L/λ. If λ=2L, n=1, and the next higher mode, n=2, corresponds to a 100% jump in frequency. But, if λ=2L/1000, n=1000, and the next higher mode, n=1001, is only a 0.1% jump in frequency. At the human scale, an electron's λ=2L/10⁷, n=10⁷, and the jump to the next mode is imperceptible. This is why we almost never notice quantum effects. (Lasers are an exception.)

7

By similar logic, an electron in an atom must have a wavelength that “fits” within that confining space. To “fit”, the electron must have a quantized energy and a quantized angular momentum.
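The numbers quoted above are easy to reproduce. This tiny sketch (Python; the mode numbers are the same illustrative ones used in the text) prints the fractional frequency jump 1/n between adjacent modes of a confined wave.

# Mode number n = 2L/λ for a wave confined to a length L; the fractional jump in
# frequency to the next allowed mode is 1/n. Illustrative numbers, not measured data.
for n in (1, 1000, 10**7):
    jump = 1.0 / n
    print(f"n = {n:>8}: next mode is a {100*jump:.5f}% jump in frequency")
# n = 1 -> 100% jump (quantization obvious); n = 10^7 -> imperceptible (classical world)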

This is not a comprehensive explanation. That will come in Feynman Simplified 3A. But, I hope it provides you some insight into how these quantum rules arise, and shows how far this course will take you.

Quantized Magnetic States Back to Feynman in V2p35-2. From the prior chapter, we know that atomic scale systems, including atoms and elementary particles, can possess angular momentum J and magnetic moment µ. Quantum mechanics requires that J is quantized in any chosen direction. Taking the z-axis as an example, J must equal j ħ, where j is either an integer or half-integer. Depending on charge polarity and the Lande g-factor, µ is either parallel or antiparallel to J, as given by: z

z

z

µ = g (q/2m) j ħ z

z

The quantity (gq/2m) is an intrinsic property of each type of atom or particle. The magnetic potential energy U of µ in a magnetic field B is: U = – µ•B For a magnetic field entirely in the +z-direction: U = – (gq/2m) j ħ B z

Absent a magnetic field, the system has the same energy for all allowed values of j . With a magnetic field, the energy levels of states with different j are split, as shown in Figures 38-4 and 38-5. Two states that have an angular momentum difference of Δj have their energy levels split by: z

z

z

ΔU = – (gq/2m) ħB Δj

z

Consecutive energy levels are all separated by the same amount, which we define as: ħω = ħ |g q| B / 2m ω = |g q| B / 2m Here, ω is the frequency of a photon that is emitted or absorbed when the system transitions from one energy level to an adjacent level. This is the same frequency at which the system precesses in field B.

Stern-Gerlach Experiment The 1922 experiment of Stern and Gerlach proved that atomic angular momentum is quantized.

The experimental setup starts with an oven hot enough to evaporate silver. Individual silver atoms in the vapor escape through a small hole in the oven wall, forming a beam of atoms labeled A in Figure 39-4. A small aperture selects those few atoms whose velocities are nearly horizontal. Those atoms pass through a magnet whose south pole is sharply tapered.

Figure 39-4 Stern-Gerlach Beam Splitter

As we discussed in the prior chapter, tapering intensifies the B field at the south pole tip, as shown in a side view in the left portion of the image. After exiting the magnet, silver atoms deposit themselves on a glass plate P, as shown in a front view in the right portion of the image. While in the magnetic field, silver atoms acquire a magnetic potential energy U due to their magnetic moment µ. U is given by: U = – µ B cosθ Here, B points upward, from north pole to south pole, and θ is the angle between the magnetic moment µ and field B. Since B varies with vertical distance, the z-direction, a vertical force is exerted on each silver atom, according to: F = –∂U/∂z = + µ cosθ ∂B/∂z z

Classically, the magnetic moments of silver atoms will point in all directions with equal probability, and the vertical force spans a continuous range. As a result, the silver atoms should form a vertical line on the glass plate. But to nearly everyone’s great surprise, the experimental results were radically different — quantized rather than continuous. Silver atoms formed two small, separated dots on the plate; one dot from atoms deflected upward, and the second dot from atoms deflected downward. All atoms were deflected by the same angle, with half going up and half going down. No silver atoms landed between the two dots. This means every silver atom had a quantized magnetic moment, either +X or –X. Otto Stern received the 1943 Nobel Prize for this discovery. (Walter Gerlach did not share the Prize due to his suspected participation in the Nazi war effort.) From the known distances and field gradient, measuring the dot separation determines the magnetic moment of silver. Unfortunately, that separation is quite small, limiting the measurement precision.

The Rabi Method

Isidor Isaac Rabi found a means of improving upon the Stern-Gerlach experiment. Starting in 1937, Rabi measured the magnetic moments of several types of atoms with superb precision, up to one part per 1000. In V2p35-4, Feynman calls this: “fantastic precision”. For this achievement, Rabi received the 1944 Nobel Prize. In the Rabi method, atoms from an oven pass through a collimating aperture, as before. After passing through three magnets, atoms with zero net deflection traverse a final aperture and are counted by detector D, as shown in Figure 39-5.

Figure 39-5 Rabi Measurement of µ

The first and third magnets, M1 and M3, have tapered south poles, just like Stern-Gerlach magnets, except that M3 is inverted relative to M1. With magnet M2 turned off, atoms that are deflected upward by M1 are deflected downward by M3, and vice versa. Note that magnet M2 blocks off-axis atoms with a central aperture that may be difficult to see in the figure. Rabi’s key improvement is the operation of M2, which produces a strong B field pointing upward plus a much weaker horizontal B field oscillating at an adjustable frequency Ω. The purpose of M2 is to change the z-component of the beam atoms’ magnetic moments, thus preventing them from reaching detector D. Because the fields in magnets M1 and M3 are equal but opposite, the deflections of a beam atom through those magnets are also equal but opposite if its µ is unchanged. But if µ flips polarity in M2, its deflection in M1 and M3 will be in the same direction and the atom will be unlikely to pass through the final aperture and be counted by detector D. z

z

Let’s see how M2 can flip µ by examining an atom with two spin states: j = +ħ/2 and j = –ħ/2. Each state has magnetic potential energy: z

z

z

U = – µ B = – (gq/2m) j B z

z

z

The difference in potential energy between the two spin states is: ΔU = (gq/2m)B { (+ħ/2) – (–ħ/2) } z

ΔU = (gq/2m) B ħ z

To transition from one state to the other, the atom must either absorb or emit a photon of that energy. This can occur only if the photon’s energy is:

ħω = ΔU = (gq/2m) B ħ z

An electromagnetic field, oscillating at frequency ω, can provide photons of energy ħω that could be absorbed by an atom or that could stimulate an atom to emit its own photon of that energy. The latter is called stimulated emission, a process predicted by Einstein that is discussed in Feynman Simplified 1B, Chapter 20. This ω is also the frequency at which µ precesses in field B , as we discussed in the prior chapter. From a classical viewpoint, a horizontal magnetic field, such as B shown in Figure 39-6, exerts a torque µ×B that has a component in the z-direction. z

y

y

Figure 39-6 Precessing µ & Horizontal Field

For a spin 1/2 atom with j = +ħ/2, a torque with a positive z-component cannot increase j since it is already at its maximum. But a torque with a negative z-component can flip j , changing it to –ħ/2. Similarly, for a spin 1/2 atom with j = –ħ/2, a torque with a positive z-component can flip j , changing it to +ħ/2, whereas a negative z-component torque has no effect. z

z

z

z

z

The probability of flipping j is small, but if the field reverses direction at the same frequency that the atom precesses, this torque has much more time to effect a flip. Conversely, if the field oscillation frequency Ω is appreciably different than the precession frequency ω, the probability of a spin flip is extremely remote. z

Figure 39-7 shows a possible counting rate at detector D as a function of the frequency of the horizontal field Ω. The effect illustrated here is a resonance, just like those discussed in Feynman Simplified 1B, Chapter 13.

Figure 39-7 Counting Rate vs. Field Frequency

Almost no atomic spins flip in M2 when the field oscillation frequency Ω is very different from ω, the frequency corresponding to the energy level split. In this case, nearly all atoms have the same µ_z from start to finish; they incur zero net deflection, and reach detector D. Only when Ω is close to ω do many spins flip in M2. Those atoms are doubly deflected by M1 and M3, and are unlikely to reach D. We see this as a drop in counting rate when Ω is close to ω.

By finding the Ω with the minimum counting rate, Rabi measured (gq/2m) and thereby determined g to a precision of one part in 1,000.
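
As an illustration of that procedure, the sketch below models the counting-rate dip with an assumed Lorentzian shape (the true line shape depends on the apparatus), locates its minimum, and converts that frequency back into a g-factor. The field strength, dip depth, and dip width are all illustrative assumptions.

```python
import math

q   = 1.602e-19   # elementary charge (C)
m_e = 9.109e-31   # electron mass (kg)

# Assumed experimental conditions (illustrative only)
B       = 0.5                          # steady field in M2 (T)
g_true  = 2.0                          # "unknown" g-factor used to fabricate the dip
omega_0 = g_true * q * B / (2 * m_e)   # true resonance frequency (rad/s)
width   = 0.01 * omega_0               # assumed dip width

def counting_rate(Omega):
    """Relative counting rate at detector D: near 1 off resonance,
    dipping toward a minimum when Omega is close to omega_0 (Lorentzian model)."""
    return 1.0 - 0.9 / (1.0 + ((Omega - omega_0) / width) ** 2)

# Scan Omega over a narrow band and find the frequency with the minimum counting rate
scan = [omega_0 * (0.95 + 0.1 * i / 2000) for i in range(2001)]
omega_min = min(scan, key=counting_rate)

# Recover g from the dip position: omega = (g q / 2m) B
g_measured = 2 * m_e * omega_min / (q * B)
print(f"dip found at   {omega_min/(2*math.pi)/1e9:.3f} GHz")
print(f"g (recovered)  {g_measured:.4f}")
```

The precision of the recovered g is set by how sharply the dip can be located, which is why a narrow resonance gives such a dramatic improvement over measuring a small spatial separation.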

Paramagnetism of Matter

Now that we know how to measure magnetic moments, we next examine paramagnetic matter: solid materials with permanent magnetic moments.

In V2p35-6, Feynman explains that most materials are not paramagnetic. In atoms with an even number of electrons, pairs of electrons typically have equal but opposite spins and orbital angular momenta that cancel one another. As a result, the atom has zero net magnetic moment. Atoms with an odd number of electrons usually combine in solid matter with other atoms, sharing their valence electrons. Electron pairs again cancel one another’s spin and angular momentum, resulting in zero net magnetic moment. With these typical atoms, solid matter has no magnetic moment, and therefore exhibits only diamagnetism, not paramagnetism.

But some atoms with an odd number of electrons have an unpaired electron in an inner shell. To understand the significance of that, we must briefly examine the atomic orbits of electrons.

In Feynman Simplified 3C, Chapter 30, we employ quantum mechanics to calculate electron atomic orbits, and derive the structure of the Periodic Table of Elements. We discover there that because electrons have wavelengths, their allowed orbits are severely restricted. Allowed electron orbits are divided into shells denoted by a principal quantum number n that determines the orbit’s average distance from the nucleus. Classically, one can think of an electron in shell n=4 as orbiting “outside” an electron in shell n=3. A very imprecise analogy is Mars orbiting outside Earth. In the quantum realm, of course, nothing is really that simple: the orbits are “fuzzy” with lots of overlap. The value of n also determines how many electrons each shell can accommodate: shell n has a maximum capacity of 2n² electrons. The electrons of hydrogen and helium are in shell n=1, while the electrons of uranium are in shells n=1 through n=7.
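
As a quick check of that capacity rule, here is a minimal Python sketch that lists 2n² for shells n = 1 through 7, along with the running total.

```python
# Shell capacities: shell n holds at most 2*n**2 electrons
total = 0
for n in range(1, 8):
    capacity = 2 * n**2
    total += capacity
    print(f"shell n={n}: capacity {capacity:3d}, cumulative {total:3d}")
```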

Shells are further divided into subshells, each with a different orbital angular momentum jħ. Shell number n has n subshells. The labels, angular momenta, and maximum capacities of four subshells are:

s: j=0, capacity 2
p: j=1, capacity 6
d: j=2, capacity 10
f: j=3, capacity 14

As is true throughout nature, the lowest energy states are the most favorable. Let’s see how electrons occupy the allowed orbits just described. Imagine an isolated nucleus with Z protons to which we add one electron at a time. The most recent electron enters the lowest energy orbit that has a vacancy. This is somewhat like a hotel without an elevator that is filling up with clients who hate stairs. An orbit’s energy (the number of stairs to a certain hotel room) depends primarily on its shell number; the energy increases with increasing n. But orbital energy also increases with increasing j. As a result, orbital energies do not follow a simple sequence of n and j values. For example, the 4s orbit has a lower energy than some 3d orbits. In this case, unpaired electrons in 3d can be “inside” and shielded by “outer” electrons in 4s. Those “inner” unpaired electrons give the atom a permanent, non-zero angular momentum and a permanent magnetic moment. Such atoms are paramagnetic.

Let’s now consider a large collection of atoms, each with a permanent magnetic moment µ. These atoms could be in the gas, liquid, or solid phase. Absent an external magnetic field to define a preferred direction, the magnetic moments point in random directions and average to zero for macroscopic numbers of atoms. But when an external field is applied, atoms with magnetic moments acquire magnetic potential energy given by:

U = – µ•B = – µ B cosθ

From Boltzmann’s law of statistical mechanics (see Feynman Simplified 1B, Chapter 16), we know that the fraction of atoms with angle θ is proportional to:

N(θ) ~ exp{–U/kT} = exp{+µBcosθ/kT}

Here, k is Boltzmann’s constant and T is absolute temperature (T on the Kelvin scale). Atoms with µ parallel to B (θ=0) have the lowest energy, and will be more common than atoms aligned antiparallel to B (θ=π). The atomic moments are no longer balanced, resulting in a non-zero magnetic moment. We define M to be the magnetization per unit volume of these atoms, according to:

M = N ⟨µ⟩

Here, N is the number of atoms per unit volume, and ⟨µ⟩ is their average moment, which is no longer zero. All this is logically equivalent to our discussion of dielectrics in an external electric field (see Chapter 11). Following the same procedures used there, we find:

M = N µ² B / 3kT
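
As a numerical sanity check of this result, the sketch below computes the Boltzmann-weighted average moment directly, by summing µcosθ over orientations weighted by exp{+µBcosθ/kT}, and compares it with Nµ²B/3kT. The moment (one Bohr magneton), field, temperature, and number density are illustrative assumptions.

```python
import math

# Assumed values for illustration
mu  = 9.274e-24      # one Bohr magneton (J/T)
B   = 1.0            # applied field (T)
T   = 300.0          # temperature (K)
k_B = 1.381e-23      # Boltzmann constant (J/K)
N   = 1.0e28         # atoms per cubic metre (assumed)

x = mu * B / (k_B * T)   # the small parameter mu*B/kT

# Boltzmann-weighted average of mu_z = mu*cos(theta), summed over solid angle
steps = 10000
num = den = 0.0
for i in range(steps):
    theta = (i + 0.5) * math.pi / steps
    weight = math.exp(x * math.cos(theta)) * math.sin(theta)
    num += mu * math.cos(theta) * weight
    den += weight
avg_mu = num / den

M_exact = N * avg_mu                        # magnetization from the full average
M_small = N * mu**2 * B / (3 * k_B * T)     # small-field (Curie-law) formula

print(f"mu*B/kT       = {x:.4f}")
print(f"M (numerical) = {M_exact:.4e} A/m")
print(f"M (mu^2B/3kT) = {M_small:.4e} A/m")
```

At room temperature µB/kT is only a few parts in a thousand, so the two results agree closely and the magnetization is a tiny fraction of its maximum possible value.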

This holds for small B fields (µB/kT << 1). For stronger fields, M saturates when µB/kT >> 1, approaching the asymptotic limit of Nµ. At that limit, all atomic moments are parallel to B, and M is as large as possible. This situation generally occurs only at extremely low temperature.

Figure 39-8 Magnetization vs. B
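
To see the saturation numerically, here is a short sketch that repeats the same classical Boltzmann average for a range of µB/kT and reports M/(Nµ): it grows linearly (as µB/3kT) at small fields and approaches 1, that is M → Nµ, at large ones. This is an illustrative calculation, not data from the figure.

```python
import math

def m_over_n_mu(x, steps=2000):
    """Boltzmann-averaged <cos(theta)> for mu*B/kT = x (classical orientations).
    This equals M/(N*mu): roughly x/3 for small x, approaching 1 as x grows."""
    num = den = 0.0
    for i in range(steps):
        theta = (i + 0.5) * math.pi / steps
        w = math.exp(x * math.cos(theta)) * math.sin(theta)
        num += math.cos(theta) * w
        den += w
    return num / den

for x in (0.1, 0.3, 1.0, 3.0, 10.0, 30.0):
    print(f"mu*B/kT = {x:5.1f}   M/(N*mu) = {m_over_n_mu(x):.3f}   (x/3 = {x/3:.3f})")
```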

More commonly, µB/kT