Ed Groth
Thermal Physics
Princeton 2002
TEXTS Text: Kittel & Kroemer, Thermal Physics 2nd ed., Freeman. (to be followed closely or loosely depending on the topic) Others: Mandl, Statistical Physics, Wiley. Similar to K&K. Reif, Fundamentals of Statistical and Thermal Physics, McGraw Hill. Also similar to K&K. Feynman, Statistical Mechanics, a Set of Lectures, Addison Wesley. This is fairly advanced, but it's always worthwhile to see what Feynman has to say. Callen, Thermodynamics, Wiley. A classic treatment of thermodynamics.
TENTATIVE SYLLABUS We will start by following K&K and cover fundamentals of statistical mechanics and thermodynamics, including temperature and entropy. We will cover the Boltzmann, Bose, and Fermi distributions; black body radiation; chemical potential; Gibbs free energy; and phase transitions. Then we'll consider some of the more advanced topics in K&K. A more detailed syllabus is in the table below.
Lecture Summary as of 27-Nov-2002
Lecture 1 • Introduction • Some History (mostly taken from Reif) • Some Thermodynamic Concepts • Entropy • Example: Ideal Gas Entropy • What Those Large Numbers Mean • Quantum Mechanics and Counting States
Lecture 2 • Reading • Entropy and the Number of States • Why is the Number of States Maximized? • Aside—Entropy and Information • Macroscopic Parameters • The Temperature
Lecture 3 • Pressure • Chemical Potential • Probability • Averages • Probabilities for Continuous Variables • The Binomial Distribution
Lecture 4 • Example—A Spin System • The Boltzmann Factor • The Gaussian Distribution
Lecture 5 • Reading • The Boltzmann Factor • Systems with Several Forms of Energy • Diversion: the Maxwell Velocity Distribution • Aside—the Gamma Function • The Partition Function
Copyright © 2002, Princeton University Physics Department, Edward J. Groth
Lecture 6 • Entropy and Probabilities • Heat Capacity • Reversible Processes • Pressure • Pressure in a Low Density Gas, I • Pressure in a Low Density Gas, II
Lecture 7 • States of a Particle in a Box • Partition Function for a Single Particle in a Box • Partition Function for N Particles in a Box • Helmholtz Free Energy • The Free Energy and the Partition Function
Lecture 8 • Reading • Classical Statistical Mechanics • A Classical Harmonic Oscillator • Classical Cavity Radiation
Lecture 9 • A Quantum Harmonic Oscillator • Quantum Cavity Radiation • More on Blackbody Radiation
Lecture 10 • Johnson Noise • Debye Theory of Lattice Vibrations • The Nyquist Frequency
Lecture 11 • Reading • Parting Shot on Oscillators • Integrals Related to Planck’s Law • The Chemical Potential • Getting a Feel for the Chemical Potential
Lecture 12 • The Gibbs Factor • Example: Binding of N Molecules • More on the Chemical Potential—Energy to Add a Particle • Example: Chemical Potential and Batteries
Lecture 13 • Example: Magnetic Particles in a Magnetic Field • Example: Impurity Ionization • Example: K&K, Chapter 5, Problem 6 • Fermi-Dirac and Bose-Einstein Distributions
Lecture 14 • Reading • The Ideal Gas Again • The N Particle Problem • The Ideal Gas From the Grand Partition Function
Lecture 15 • Internal Degrees of Freedom • Ideal Gas Processes • The Gibbs Paradox Revisited
Lecture 16 • The Sackur-Tetrode Entropy and Experiment • The Ideal Fermi Gas • Heat Capacity of a Cold Fermi Gas
Lecture 17 • Reading • More on Fermi Gases • Other Fermi Gases
Lecture 18 • Bose-Einstein Gases • Superfluid Helium
Lecture 19 • Heat and Work • The Carnot Cycle • Other Thermodynamic Functions
Lecture 20 • Reading • Gibbs Free Energy • Chemical Equilibrium • The Law of Mass Action • Application: pH • Other Ways of Expressing the Law of Mass Action
Lecture 21 • The Direction of a Reaction • Application: the Saha Equation • Phase Transitions • Phase Diagrams
Lecture 22 • First Order and Second Order Phase Transitions • The Clausius-Clapeyron Equation • The van der Waals Equation of State
Lecture 23 • Reading • Phase Transitions and the van der Waals Equation of State • Droplets
Lecture 24 • A Simple Model of Ferromagnetism • Superconductors, the Meissner Effect, and Magnetic Energy • The Ising Model
Lecture 25 • Landau Theory of Phase Transitions • Mixtures
Lecture 26 • Reading • Examples of Positive Mixing Energy • Liquid and Solid Mixtures Without a Solubility Gap • Liquid and Solid Mixtures With a Solubility Gap
Lecture 27 • Cooling by Expansion • Throttling Processes • The Joule-Thomson Effect • Cooling by Pumping • The Helium Dilution Refrigerator • Isentropic Demagnetization • Laser and Evaporative Cooling
Lecture 28 • Semiconductor Basics • Electron Distribution in Semiconductors • Law of Mass Action for Semiconductors • Electron Distribution in Doped Semiconductors
Lecture 29 • Reading • Electron Distribution in Degenerate Semiconductors • Ionization of Donors and Acceptors • Electron-Hole Interactions • The p-n Junction
Lecture 30 • The Depletion Region in a p-n Junction • A Reverse Biased p-n Junction • A Forward Biased p-n Junction
Lecture 31 • Reading • The Maxwell Velocity Distribution • Cross Sections • Example: Cross Section for Smooth Hard Sphere Elastic Scattering • Reaction Rates • The Collision Rate and the Mean Free Path
Lecture 32 • Transport • Transport Coefficients • Diffusivity • A Bit More on the Diffusivity • Thermal Conductivity • Viscosity
Lecture 33 • The Boltzmann Transport Equation • The Boltzmann Equation and Simple Diffusion • Diffusion and the Fermi-Dirac Distribution • Electrical Conductivity
Lecture 34 • Reading • High Vacuum Statistical Mechanics • Diffusion Equations • Sample Solution of the Diffusion Equation: Equilibrating Bar
Lecture 35 • The Dispersion Relation • Random Walks and Diffusion • Sample Solution: A Temperature Oscillation • The Diffusion of a One Dimensional Bump • Time Independent Solutions of the Diffusion Equation • Continuity Equation for Mass
Lecture 36 • To Do • Sound Waves in a Gas • Wave Functions for a Sound Wave • Heat Losses in a Wave
Week 0. Introduction, Thermodynamic Concepts
Physics 301
13-Sep-2002 1-1
Introduction In this course we will cover selected topics in thermodynamics and statistical mechanics. Since we only have twelve weeks, the selection is necessarily limited. You will probably need to take a graduate course in thermal physics or do studying on your own in order to gain a thorough knowledge of the subject. Classical (or maybe “conventional” is better) thermodynamics is an approach to thermal physics “from the large.” Statistical mechanics approaches the subject “from the small.” In thermodynamics, one is concerned with things like the pressure, temperature, volume, composition, etc., of various systems. These are macroscopic quantities and in many cases can be directly observed or felt by our senses. Relations between these quantities can be derived without knowing about the microscopic properties of the system. Statistical mechanics takes explicit account of the fact that all systems are made of large numbers of atoms or molecules (or other particles). The macroscopic properties (pressure, volume, etc.) of the system are found as averages over the microscopic properties (positions, momenta, etc.) of the particles in the system. In this course we will tend to focus more on the statistical mechanics rather than the thermodynamics approach. I believe this carries over better to modern subjects like condensed matter physics. In any case, it surely reflects my personal preference!
Some History (mostly taken from Reif) As it turns out, thermodynamics developed some time before statistical mechanics. The fact that heat is a form of energy was becoming apparent in the late 1700’s and early 1800’s with Joule pretty much establishing the equivalence in the 1840’s. The second law of thermodynamics was recognized by Carnot in the 1820’s. Thermodynamics continued to be developed in the second half of the 19th century by, among others, Clausius, Kelvin and Gibbs. Statistical mechanics was developed in the late 19th and early 20th centuries by Clausius, Maxwell, Boltzmann, and Gibbs. I find all of this rather amazing because at the time of the initial development of thermodynamics, the principle of energy conservation hadn’t been firmly established. Statistical mechanics was developed when the existence of atoms and molecules was still being debated. The fact that macroscopic properties of systems can be understood in terms of the microscopic properties of atoms and molecules helped convince folks of the reality of atoms and molecules.
Still more amazing is the fact that the foundations of statistical mechanics were developed before quantum mechanics. Incorporating quantum mechanics did make some changes, especially in the counting of states, but the basic approach and ideas of statistical mechanics remained valid. I suspect that this is a reflection of both the strength and weakness of statistical methods. By averaging over many molecules you derive results that are independent of the detailed properties of individual molecules. The flip side is that you can’t learn very much about these details with statistical methods.
Some Thermodynamic Concepts
From mechanics, we’re familiar with concepts such as volume, energy, pressure (force per unit area), mass, etc. Two new quantities that appear in thermodynamics are temperature ($T$) and entropy ($S$).
We will find that temperature is related to the amount of energy in a system. Higher temperature means greater internal energy (usually). When two systems are placed in contact, energy in the form of heat flows from the higher temperature system to the lower temperature system. When the energy stops flowing the systems are in thermal equilibrium with each other and we say they are at the same temperature. It turns out if two systems are in thermal equilibrium with a third system, they are also in thermal equilibrium with each other. (This is sometimes called the zeroth law of thermodynamics.) So the concept of temperature is well defined. It’s even more well defined than that as we will see later in the course.
Two systems can exchange energy by macroscopic processes, such as compression or expansion, or by microscopic processes. It is the microscopic process that is called heat transfer. Consider a collision among billiard balls. We think of this as a macroscopic process and we can determine the energy transfer involved by making measurements of a few macroscopic parameters such as the masses and velocity components. If we scale down by 24 orders of magnitude, we consider a collision between molecules, a microscopic process. A very large number of collisions occur in any macroscopic time interval. A typical molecule in the atmosphere undergoes $\sim 10^{10}$ collisions per second. All these collisions result in the exchange of energy and it is the net macroscopic transfer of energy resulting from all the microscopic energy transfers that we call heat.
Recall that the first law of thermodynamics is $dU = dQ + dW$, where $dU$ is the change of (internal) energy of a system, $dQ$ is energy added to the system via a heat transfer, and $dW$ is energy added by doing work on the system. Aside: you will often see the heat and work written as đQ and đW. This is a reminder that these quantities are not perfect differentials, just small changes. A system has a well
defined internal energy U(P, V, . . .) which can be differentiated with respect to P , V , . . ., but there is no such thing as the heat or work content of a system. The heat and work refer to energy transfers during a change to the system. So the first law really boils down to a statement of energy conservation. You can change the energy of a system by adding energy microscopically (dQ) or macroscopically (dW ). While we’re at it, the second law can be stated in many ways, but one way (without worrying too much about rigor) is: it’s impossible to turn heat completely into work with no other change. So for example, if you build a heat engine (like a power plant) you can’t turn all the heat you get (from burning coal) completely into electrical energy. You must dump some waste heat. From this law, one can derive the existence of entropy and the fact that it must always increase. (Or you can define entropy, and state the second law in terms of the increase in entropy).
Entropy
Earlier, we mentioned that temperature is related to internal energy. So, a picture we might carry around is that as the temperature goes up, the velocities of the random motions of the molecules increase, they tumble faster, they vibrate with greater amplitude, etc. What kind of picture can we carry around for entropy? Well that’s harder, but as the course goes along we should develop such a picture.
To start, we might recall that the change in entropy of a system is the heat added to the system divided by the temperature of the system (all this is for a reversible process, etc.): $dS = dQ/T$. If $dQ > 0$ is added to one system at temperature $T_1$, $-dQ$ must be added to a second system at temperature $T_2$. The total entropy change is $dQ/T_1 - dQ/T_2$, so to ensure that entropy increases, $T_1 < T_2$; the first system is cooler than the second system. The molecules in the first system speed up and the molecules in the second system slow down. After the heat is transferred (in a direction which makes entropy increase) the distribution of molecular speeds in the two systems is more nearly the same. The probability that a fast molecule is from system 1 has increased while the probability that a fast molecule is from system 2 has decreased. Similarly, the probability that a slow molecule is from system 2 has increased and the probability a slow molecule is from system 1 has decreased. In other words, as a result of the increase of entropy, the odds have become more even.
So increasing entropy corresponds to a leveling of the probabilities. Higher entropy means more uniform probability for the possible states of the system consistent with whatever constraints might exist (such as a fixed total energy of the system). So entropy is related to the number of accessible states of the system and we will
find that maximizing the entropy is equivalent to assuming that each accessible state is equally likely.
The first law of thermodynamics can be written as
$$dU = dQ + dW = T\,dS - p\,dV$$
or
$$dS = \frac{dU}{T} + \frac{p\,dV}{T},$$
where we’ve assumed that the number of particles in the system is constant and the work done on the system results from pressure acting while the volume changes. Suppose the system is an ideal gas. Then the energy depends only on temperature, $dU = nC_V\,dT$, where $n$ is the number of moles and $C_V$ is the molar specific heat at constant volume which we take to be constant. The equation of state is
$$pV = nRT$$
or
$$\frac{p}{T} = \frac{nR}{V},$$
where $R$ is the gas constant. We plug these into the first law and obtain
$$dS = nC_V\,\frac{dT}{T} + nR\,\frac{dV}{V},$$
which can be integrated to give
$$S_f - S_i = nC_V \log\frac{T_f}{T_i} + nR \log\frac{V_f}{V_i}.$$
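As a quick numerical check of the integrated result, here is a minimal Python sketch. The function name `ideal_gas_entropy_change` and the choice $C_V = 3R/2$ (a monatomic gas) are assumptions made only for illustration; the derivation itself holds for any constant $C_V$. The gas constant is taken in CGS units.

```python
import math

R = 8.31e7  # gas constant in erg/(mol K), CGS units


def ideal_gas_entropy_change(n, c_v, t_i, t_f, v_i, v_f):
    """Return S_f - S_i = n C_V log(T_f/T_i) + n R log(V_f/V_i) in erg/K."""
    return n * c_v * math.log(t_f / t_i) + n * R * math.log(v_f / v_i)


# Assumed example: one mole of a monatomic gas (C_V = 3R/2) doubles its
# volume at fixed temperature, so only the volume term contributes.
delta_s = ideal_gas_entropy_change(
    n=1.0, c_v=1.5 * R, t_i=300.0, t_f=300.0, v_i=1.0, v_f=2.0
)
print(f"Isothermal doubling of the volume: {delta_s:.3e} erg/K")
print(f"R log 2 for comparison:            {R * math.log(2.0):.3e} erg/K")
```

Only the ratios $T_f/T_i$ and $V_f/V_i$ enter, so the units chosen for temperature and volume drop out as long as they are used consistently.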
So, we have an expression for the entropy difference between any two states of an ideal gas. But how can we relate this to what’s going on at the microscopic level? (Note, unless otherwise stated, by log, I mean a natural logarithm, $\log_e$.)
First, let’s make a distinction between the macroscopic state and the microscopic state. The macroscopic state is completely specified (at equilibrium!) by a small number of parameters such as $p$, $V$, $n$, etc. Classically, the microscopic state requires the specification of the position and velocity of each particle, $\mathbf{r}_1, \mathbf{v}_1, \mathbf{r}_2, \mathbf{v}_2, \ldots, \mathbf{r}_N, \mathbf{v}_N$, where $N$ is the number of particles. $N$ is usually a huge number, comparable to Avogadro’s number, the number of particles in a mole, $N_0 = 6.02 \times 10^{23}$. Since there is such a large ratio of microscopic to macroscopic parameters, it must be that many microscopic states may produce a given macroscopic state.
How many microscopic states are there? Why do we want to know? The idea is that the macroscopic state which is generated by the most microscopic states is the most likely. Suppose we say that
$$S \propto \log g,$$
where g is the number of microstates corresponding to the macrostate. This definition has the desirable property that if we have two non-interacting systems with states g1 and g2 , and we bring them together, the entropy is additive. S = S1 + S2 . Since the systems are non-interacting, bringing the systems together does not change the states available to either system, and any microstate of system 1 may be combined with any microstate of system 2 to yield a microstate of the combined system. This means that there are a total of g1 · g2 states altogether. By defining the entropy with a logarithm, we ensure that it’s additive (at least in this case!). So let’s count states. At first sight, you might think there are an infinite number of states because r and v are continuous variables. Well, perhaps if you change them only slightly, you don’t really get a new state.
Example: Ideal Gas Entropy
Consider one mole of ideal gas at STP. Its volume is $V = 22.4$ L $= 2 \times 10^{4}$ cm$^3$ and it contains $N_0 = 6 \times 10^{23}$ molecules. How big is a molecule? Answer: about 1 Å $= 10^{-8}$ cm. A molecular volume is $V_m \approx 10^{-24}$ cm$^3$. Imagine dividing our total volume $V$ into cells the size of a molecule. There are $M = V/V_m = 2 \times 10^{28}$ cells. Let’s specify the micro-position state by stating which cells have molecules in them. That is, we are going to specify the positions of the molecules to a molecular diameter. How many states are there?
Pick a cell for the first molecule. This can be done in $M$ ways. Pick a cell for the second molecule. This can be done in $M - 1 \approx M$ ways. For the third molecule, there are $M - 2 \approx M$ ways. Continue to the $N$th molecule for which there are $M - N + 1 \approx M$ ways to pick a cell. Altogether there are about
$$g \approx M^N \approx \left(10^{28}\right)^{10^{24}}$$
ways to distribute the molecules in the cells. The fact that we get $M^N$ rather than a binomial coefficient depends on the fact that $M \approx 10^{28} \gg N \approx 10^{24}$. Also, we should probably divide by $N!$ to account for permutations of the molecules in the cells (since we can’t distinguish one molecule from another), but leaving this out won’t hurt anything at this point.
As an example, consider a two dimensional gas containing $N = 10$ molecules and $M = 100$ cells. The figure shows a couple of the possible position microstates of this gas.
There are $M!/[N!\,(M-N)!] = 1.7 \times 10^{13}$ distinct states. Our approximation gives $10^{20}$ states; the difference is mostly due to ignoring the $10!$ in the denominator.
Knowing the number of states, we have
$$S \propto N \log M = N \log\frac{V}{V_m} = \underbrace{N \log V}_{\text{volume term in entropy}} - \underbrace{N \log V_m}_{\text{constant for given amount of gas}}.$$
The $N \log V_m$ term is a constant for a given amount of gas and disappears in any calculation of the change in entropy, $S_f - S_i$. Similarly, the $N!$ correction would also disappear. So a lot of the (really awful?) approximations we made just don’t matter because things like the size of a molecule drop out as long as we only consider entropy differences.
The $N \log V$ term is the volume term in the ideal gas entropy. By considering the microstates in velocity, we would obtain the temperature term (and we will later in the term!).
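The counting for the two dimensional toy gas is small enough to check directly. The following is a minimal Python sketch; the variable names are mine, and `math.comb` needs Python 3.8 or later.

```python
import math

M, N = 100, 10  # cells and molecules in the two dimensional toy gas

# Exact count of position microstates for indistinguishable molecules.
exact = math.comb(M, N)  # M! / (N! (M - N)!)

# Crude estimate used in the text: each molecule picks any of the M cells.
approx = M ** N

# Crude estimate corrected for permutations of identical molecules.
approx_over_perm = approx / math.factorial(N)

print(f"exact    = {exact:.2e}")
print(f"M^N      = {approx:.2e}")
print(f"M^N / N! = {approx_over_perm:.2e}")
```

Dividing the crude estimate by $N!$ brings it to within a factor of two of the exact count; the remaining discrepancy comes from replacing $M - 1, M - 2, \ldots$ by $M$.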
What Those Large Numbers Mean
The key aspect of all this is the large number of states! Suppose we have a gas in equilibrium in a container of volume $2V$. Why doesn’t the gas, by chance, wind up in one-half the container with volume $V$? How many states are there in each case?
$$g_1 = \left(\frac{V}{V_m}\right)^{N}, \qquad g_2 = \left(\frac{2V}{V_m}\right)^{N}.$$
And,
$$\frac{g_2}{g_1} = 2^{N} = 2^{\text{Avogadro's number}} = 2^{6\times10^{23}} = 10^{2\times10^{23}} = 1\,\underbrace{000\cdots000}_{2\times10^{23}\ \text{zeros}}.$$
Such a state might be legal, but it’s extremely!!! unlikely. The fact that a system in equilibrium has the maximum possible entropy is nothing more than the fact that the normal equilibrium state has so many more ways to occur than an obviously weird state, that the weird state just never occurs.
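The ratio $g_2/g_1$ is far too large to hold in a floating point number, but its logarithm is easy to compute. A minimal Python sketch, with variable names of my own choosing:

```python
import math

N = 6.02e23  # Avogadro's number of molecules

# g2/g1 = 2^N overflows any float, so work with log10(g2/g1) = N log10(2).
log10_ratio = N * math.log10(2.0)

print(f"g2/g1 = 10^({log10_ratio:.2e})")
```

The exponent comes out a little below $2\times10^{23}$, which the estimate in the text rounds to one significant figure.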
Quantum Mechanics and Counting States
You might be thinking that it’s pretty flaky to assert that we need only specify a molecular position to a molecular diameter. We’ve shown that as long as it’s small, the resolution has no effect on our calculation of changes in the entropy, so this is OK for classical mechanics.
If we consider quantum mechanics, then we find that systems are in definite states. There are many ways to see this. An example is to consider a particle in a box and fit the wave functions in. Another way is to consider the uncertainty principle,
$$\Delta p_x\,\Delta x \ge \hbar/2.$$
If the state of the system is specified by a point in the $x$–$p_x$ diagram (phase space), then one can’t tell the difference between states which are as close or closer than the above. So we can divide up this phase space into cells of size $\hbar/2$ and we can specify a state by saying which cells are occupied and which are not.
As a numerical example, consider air (N$_2$) at room temperature. $m_{\mathrm{N}_2} = 28\,m_p = 28 \times 1.7 \times 10^{-24}$ g $= 4.8 \times 10^{-23}$ g. A typical kinetic energy is $mv^2/2 = 3kT/2$; with $T = 300$ K and $k = 1.38 \times 10^{-16}$ erg/K, then $E \sim 6 \times 10^{-14}$ erg, $v \sim 5.1 \times 10^{4}$ cm/s, $p \sim 2.4 \times 10^{-18}$ g cm/s. The molecular size is about $r \sim 1$ Å $= 10^{-8}$ cm, so
$$p\,r = 2.4 \times 10^{-26}\ \text{g cm}^2/\text{s} > \hbar = 1 \times 10^{-27}\ \text{erg s}.$$
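The arithmetic in this estimate can be reproduced with a few lines of Python. This is only a sketch in CGS units; the variable names are mine, and I use $\hbar = 1.05 \times 10^{-27}$ erg s rather than the rounded value quoted above.

```python
import math

m_p = 1.7e-24    # proton mass in g (value used in the text)
k = 1.38e-16     # Boltzmann's constant in erg/K
hbar = 1.05e-27  # reduced Planck constant in erg s
T = 300.0        # room temperature in K
r = 1e-8         # molecular size in cm (about 1 Angstrom)

m = 28 * m_p              # mass of an N2 molecule
E = 1.5 * k * T           # typical kinetic energy, 3kT/2
v = math.sqrt(2 * E / m)  # speed from m v^2 / 2 = E
p = m * v                 # typical momentum
ratio = p * r / hbar      # how far p*r sits above hbar

print(f"E = {E:.1e} erg, v = {v:.1e} cm/s, p = {p:.1e} g cm/s")
print(f"p r / hbar = {ratio:.0f}")
```

The product $p\,r$ exceeds $\hbar$ by a factor of a few tens, not by many orders of magnitude.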
Thus, at room temperature, one can specify the momentum of a molecule to a reasonable fraction of a typical momentum and the position to about the molecular size and still be consistent with quantum mechanics and the uncertainty principle. That is, room temperature air is classical, but not wildly separated from the quantum domain. If we consider lower temperatures or higher densities, electrons in metals, etc. quantum effects will be more important. The ideal gas at STP is a “low occupancy” system. That is, the probability that any particular state is occupied is extremely small. This means that the most likely number of occupants of a particular state is zero, one occurs very rarely, and we just don’t need to worry about two at all. This is the classical limit and corresponds to the Boltzmann distribution. If we have higher occupancy systems (denser and/or colder), then states occupied by two or more particles can become likely. At this point quantum mechanics enters. There are two kinds of particles: integer spin particles called bosons (such as photons or other particles that we associate with waves) and half-integer spin particles called fermions (protons, electrons, particles that we associate with matter). An arbitrary number of bosons can be placed in a single quantum state. This leads to Bose-Einstein statistics and the Bose distribution. At most one fermion can be placed in a quantum state. This leads to Fermi-Dirac statistics and the Fermi distribution.
Week 1. Entropy, Temperature, Pressure, Chemical Potential, Probability
Physics 301
16-Sep-2002 2-1
Reading This week, you should read the first two chapters of K&K.
Entropy and the Number of States As we discussed last time, in the statistical view, entropy is related to the number of “microstates” of a system. In particular, the entropy is the log of the number of states that are accessible to the system when it has specified macroscopic parameters (its “macrostate”). The fact that entropy always increases is just a reflection of the fact that a system adjusts its macroscopic parameters, within the allowed constraints, so as to maximize the number of accessible states and hence the entropy. So, a large part of statistical mechanics has to do with counting states and another large part has to do with deriving interesting results from these simple ideas.
Why is the Number of States Maximized?
Good question. We are going to take this as an axiom or postulate. We will not attempt to prove it. However, we can give some plausibility arguments.
First, remember that we are typically dealing with something like Avogadro’s number of particles, $N_0 = 6.02 \times 10^{23}$. As we discussed last time, this makes the probability distributions very sharp. Or put another way, improbable events are very improbable.
The other thing that happens with a large number of particles has to do with the randomness of the interactions. Molecules in a gas are in continual motion and collide with each other (we will see later in the term, how often). During these collisions, molecules exchange energy, momentum, angular momentum, etc. The situation in a liquid is similar; one of the differences between a liquid and gas has to do with the distance a molecule travels between collisions: in a gas, a molecule typically travels many molecular diameters; in a liquid, the distance between collisions is of the order of a molecular diameter. In a solid, molecules tend to be confined to specific locations, but they oscillate around these locations and exchange energy, momentum, etc. with their neighbors.
OK, molecules are undergoing collisions and interactions all the time. As a result, the distribution of molecular positions and speeds is randomized. If you pick a molecule and ask things like where is it located, how fast is it going, etc., the answers can only be given in terms of probabilities and these answers will be the same no matter which molecule you
pick. (Provided you pick the same kind of molecule - you’ll probably get different answers for a N2 molecule and an Ar atom, but you’ll get the same answers for two N2 molecules.) Sticky point: suppose we assume that the world is described by classical mechanics. Also suppose we know the interactions between molecules in some isolated system. Suppose we also know all ∼ N0 positions ri and momenta pi (and whatever else we might need to know to specify the system, perhaps the angular momenta of the molecules, etc.). Then in principle, the equations of motion can be solved and the solution tells us the exact state of the system for all future times. That is, there is nothing random about it! How do we reconcile this with the probabilistic view espoused in the preceding paragraphs? So far as I know, there are reasonable practical answers to this question, but there are no good philosophical answers. The practical answers have to do with the fact that one can’t really write down and solve the equations of motion for ∼ N0 particles. But we can in principle! A somewhat better answer is that we can only know the initial conditions with some precision, not infinite precision. As we evolve the equations of motion forward, the initial uncertainties grow and eventually dominate the evolution. This is one of the basic concepts of chaos which has received a lot of attention in recent years: small changes in the initial conditions can lead to large changes in the final result. (Have you ever wished you could get a 10 day or 30 day weather forecast? Why do they stop with the 5 day forecast?) Of course, the fact that we cannot measure infinitely precisely the initial conditions nor solve such a large number of equations does not mean (still assuming classical mechanics) that it couldn’t be done in principle. (This is the philosophical side coming again!) So perhaps there is still nothing random going on. 
At this point one might notice that it’s impossible to make a totally isolated system, so one expects (small) random perturbations from outside the system. These will disturb the evolution of the system and have essentially the same effect as uncertainties in the initial conditions. But, perhaps one just needs to include a larger system! If we recognize that quantum mechanics is required, then we notice that quantum mechanics is an inherently probabilistic theory. Also, I’m sure you’ve seen or will see in your QM course that in general, uncertainties tend to grow with time (the spreading out of a wave packet is a typical example). On the other hand, the system must be described by a wave function (depending on ∼ N0 variables), whose evolution is determined by Schroedinger’s equation . . .. As you can see this kind of discussion can go on forever. So, as said before, we are going to postulate that a system is equally likely to be in any state that is consistent with the constraints (macroscopic parameters) applied to the system.
As it happens, there is a recent Physics Today article on exactly this subject: trying to go from the reversibility of classical mechanics to the irreversibility of statistical mechanics. It’s by G. M. Zaslavsky and is called, “Chaotic Dynamics and the Origin of Statistical Laws,” 1999, vol. 52, no. 8, pt. 1, p. 39. I think you can read this article and get a feel for the problem even if some of it goes over your head (as some of it goes over my head).
Aside—Entropy and Information In recent times, there has been considerable interest in the information content of data streams and what manipulating (computing with) those data streams does to the information content. It is found that concepts in information theory are very similar to concepts in thermodynamics. One way out of the “in principle” problems associated with classical entropy is to consider two sources of entropy: a physical entropy and an information or algorithmic entropy. This goes something like the following: if we had some gas and we had the knowledge of each molecule’s position and momentum, then the physical entropy would be zero (there’s nothing random about the positions and momenta), but the algorithmic entropy of our list of positions and momenta would be large (and equal to the physical entropy of a similar gas whose positions and momenta we hadn’t determined). What is algorithmic entropy? Essentially, the logarithm of the number of steps in the algorithm required to reproduce the list. In 1998, Toby Marriage wrote a JP on this topic. You can find it at http://physics.princeton.edu/www/jh/juniors fall98.html . One of our criteria for junior papers is that other juniors should be able to understand the paper; so I think you might get something out of this paper as well!
Macroscopic Parameters
We will be most concerned with systems in equilibrium. Such a system can usually be described by a small number of macroscopic parameters. For example, consider a gas. If the density of the gas is low enough, it can be described quite well by the ideal gas law when it’s in equilibrium:
$$pV = NkT = nRT,$$
where $p$ is the pressure, $V$ is the volume, $N$ is the number of molecules, $n$ is the number of moles, $k = 1.38 \times 10^{-16}$ erg K$^{-1}$ is Boltzmann’s constant or the gas constant per molecule, $R = 8.31 \times 10^{7}$ erg mole$^{-1}$ K$^{-1}$ $= N_0 k$ is the gas constant per mole, and $T$ is the absolute temperature.
Notice that some parameters depend on how much gas we have and some don’t. For example, if we replicate our original system, so we have twice as much, then $V$, $N$, $U$ (internal energy), and $S$ (entropy) all double; $p$ and $T$ stay the same. We are ignoring the contribution of any surface interactions which we expect to be very small. Can you think why? Parameters which depend on the size of the system are called extensive parameters. Parameters that are independent of the size of the system are called intensive parameters.
Note that the gas law is not the whole story. If more than one kind of molecule is in the gas, we need to specify the numbers of each kind: $N_1$, $N_2$, . . .. Also, the gas law does not say anything about the energy of the gas or its entropy. So, the gas law is an equation of state, but it needs to be supplemented by other relations in order that we know everything there is to know about the gas (macroscopically, that is!). For systems more complicated than a gas, other parameters may be needed.
Another thing to notice is that not all parameters may be specified independently. For example, having specified $N$, $T$, and $V$, the pressure is determined. Thus there is a certain minimum number of parameters which specify the system. Any property of the system must be a function of these parameters.
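Two of the statements above are easy to check numerically: that $R = N_0 k$, and that doubling the system doubles the extensive parameters while leaving the pressure alone. The sketch below is in CGS units; the function name `pressure` is hypothetical, and the STP values (273.15 K and a molar volume of 22.4 L) are the ones used in Lecture 1.

```python
import math

k = 1.38e-16   # Boltzmann's constant in erg/K
N0 = 6.02e23   # Avogadro's number
R = N0 * k     # gas constant per mole in erg/(mol K)


def pressure(n_moles, volume_cm3, temperature_k):
    """Ideal gas law solved for p, in dyn/cm^2."""
    return n_moles * R * temperature_k / volume_cm3


# One mole at STP, then a doubled copy of the same system.
p_single = pressure(1.0, 2.24e4, 273.15)
p_double = pressure(2.0, 4.48e4, 273.15)

print(f"R = N0 k = {R:.3e} erg/(mol K)")
print(f"p (one mole)  = {p_single:.3e} dyn/cm^2")
print(f"p (two moles) = {p_double:.3e} dyn/cm^2")
```

Both pressures come out close to $10^{6}$ dyn/cm$^2$, about one atmosphere, as they should.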
Furthermore, we can often change variables and use a different set of parameters. For a single component ideal gas, we might have

$$ p = p(N, V, T) , \qquad U = U(N, V, T) , \qquad S = S(N, V, T) . $$

We might imagine solving for $T$ in terms of $N$, $V$, and $U$, and we can write

$$ p = p(N, V, U) , \qquad T = T(N, V, U) , \qquad S = S(N, V, U) . $$
Anything that depends only on the equilibrium state of the system can be expressed as a function of the parameters chosen. Which parameters are to be used depends on the particular situation under discussion. For example, if the volume of a system is under our control, we would likely use that as one of the independent parameters. On the other hand, many processes occur at constant pressure (with the volume adjusting to what it needs to be). In this case, using $p$ rather than $V$ as the independent parameter will probably be more convenient.
The Temperature

As we remarked, the entropy is the logarithm of the number of microstates accessible to a system. The number of states must be a function of the same macroscopic parameters that determine the macrostate of the system. Let's consider a system described by its internal energy $U$, its volume $V$, and the number of each kind of constituent particle $N_a, N_b, \ldots$. For the moment, we ignore the possibility of reactions which can change particles of one kind into another kind. This means that our expressions will have the same form for $N_a$, $N_b$, etc., so we'll just assume a single kind of particle for the time being and assume we have $N$ of them. Then the number of microstates is

$$ g = g(U, V, N) . $$

If we have two systems that we prevent from interacting, then the number of microstates of the combined system is

$$ g(U, V, N) = g_1(U_1, V_1, N_1)\, g_2(U_2, V_2, N_2) , $$

with

$$ U = U_1 + U_2 , \qquad V = V_1 + V_2 , \qquad N = N_1 + N_2 . $$
This is straightforward. Any microstate in system 1 can be paired with any microstate in system 2, so the total number of microstates is just the product of the number for each system. Also, we have specified the macrostate in terms of extensive parameters, so the parameters for the combined system are just the sum of those for the individual systems. Following K&K, the dimensionless entropy is just

$$ \sigma(U, V, N) = \log g(U, V, N) = \log (g_1 g_2) = \log g_1 + \log g_2 = \sigma_1(U_1, V_1, N_1) + \sigma_2(U_2, V_2, N_2) . $$

So far, we haven't really done anything. We've just written down some definitions twice. We have prevented the two systems from interacting, so nothing exciting can happen.

Now let's suppose we allow the systems to exchange energy. In other words, we allow $U_1$ and $U_2$ to vary, but any change in $U_1$ has a compensating change in $U_2$ so that $U$ is constant. In addition, we prevent changes in volume and numbers of particles, so that $V_1$, $V_2$, $N_1$, and $N_2$ remain constant. We're placing the systems in thermal contact, but preventing changes in volume or particle number. We know what will happen: energy flows from the hotter system to the cooler system until they come to thermal equilibrium at the same temperature. We know this from our intuitive understanding of the second law: heat flows from a hot object to a cold object. But, what about our postulate that a system maximizes the number of accessible microstates? In this case, it means that the system adjusts $U_1$ and $U_2$ to maximize the entropy. So,
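$$
\begin{aligned}
0 &= \left(\frac{\partial\sigma}{\partial U_1}\right)_{V,N} \qquad \text{since $\sigma$ is maximized} \\
  &= \left(\frac{\partial\sigma_1}{\partial U_1}\right)_{V,N} + \left(\frac{\partial\sigma_2}{\partial U_1}\right)_{V,N} \\
  &= \left(\frac{\partial\sigma_1}{\partial U_1}\right)_{V,N} + \left(\frac{\partial\sigma_2}{\partial U_2}\right)_{V,N} \frac{\partial U_2}{\partial U_1} \\
  &= \left(\frac{\partial\sigma_1}{\partial U_1}\right)_{V,N} - \left(\frac{\partial\sigma_2}{\partial U_2}\right)_{V,N} \qquad \text{since } \Delta U_1 = -\Delta U_2 .
\end{aligned}
$$

This means

$$ \left(\frac{\partial\sigma_1}{\partial U_1}\right)_{V,N} = \left(\frac{\partial\sigma_2}{\partial U_2}\right)_{V,N} , $$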
after equilibrium has been established. So at equilibrium, the rate of change of entropy with respect to energy is the same for the two systems. If we started out with the two systems and we allowed them to exchange energy and nothing happened, then we know that $\partial\sigma/\partial U$ was already the same. If system 1 and system 2 are in equilibrium with respect to energy exchange and we allow system 1 to exchange energy with a third system and nothing happens, then $\partial\sigma_3/\partial U_3$ must also have the same value and nothing will happen if systems 2 and 3 are allowed to exchange energy. Thus, $\partial\sigma/\partial U$ has properties very similar to those we ascribe to temperature. In fact, we can define the temperature as:

$$ \frac{1}{\tau} = \left(\frac{\partial\sigma}{\partial U}\right)_{V,N} . $$
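The equilibrium condition can be illustrated numerically. The sketch below is only an illustration under a stated assumption: each system is given the entropy $\sigma_i = \tfrac{3}{2} N_i \log U_i$ (the energy dependence of a monatomic ideal gas at fixed volume, up to an additive constant), which is not derived in this section. The function names are hypothetical. A brute-force scan over $U_1$ finds the split of the total energy that maximizes $\sigma_1 + \sigma_2$, and at that split the two values of $\partial\sigma/\partial U$ agree.

```python
import math


def sigma(u, n):
    """Assumed entropy at fixed V, up to a constant: (3/2) N log U."""
    return 1.5 * n * math.log(u)


def inverse_tau(u, n):
    """Analytic d(sigma)/dU = 3N / (2U) for the assumed entropy."""
    return 1.5 * n / u


def equilibrium_split(u_total, n1, n2, steps=10000):
    """Scan U1 on a grid and return the value that maximizes sigma1 + sigma2."""
    best_u1, best_sigma = None, -math.inf
    for i in range(1, steps):
        u1 = u_total * i / steps
        total = sigma(u1, n1) + sigma(u_total - u1, n2)
        if total > best_sigma:
            best_u1, best_sigma = u1, total
    return best_u1
```

With $N_1 = 1000$, $N_2 = 3000$ and a total energy of 100 (arbitrary units), the maximum sits at $U_1 = 25$, where the energy per particle, and hence $1/\tau$, is the same in both systems.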
This makes $\tau$ an intensive quantity (it's the ratio of two extensive quantities), and it makes the energy flow in the "correct" direction. This can be seen as follows: if the two systems are not in equilibrium when we allow energy to flow, then the entropy of the combined systems must increase. The increase in entropy after a very small amount of energy has been transferred is

$$
\begin{aligned}
0 < \delta\sigma &= \delta\sigma_1 + \delta\sigma_2 \\
  &= \frac{1}{\tau_1}\,\delta U_1 + \frac{1}{\tau_2}\,\delta U_2 \\
  &= \left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)\delta U_1 .
\end{aligned}
$$

So if $\tau_1 < \tau_2$, $\delta U_1 > 0$, which means energy flows from the high $\tau$ system to the low $\tau$ system.

Finally, if you remember your elementary thermodynamics, recall that $dU = T\,dS - p\,dV$, which agrees with this definition of temperature.

Units: from our definitions $\sigma$ is dimensionless and $\tau$ has the dimensions of energy. You recall that temperature $T$ has the dimensions of Kelvins and entropy $S$ has the dimensions of ergs per Kelvin. As it turns out,

$$ S = k\sigma , \qquad \tau = kT . $$

Boltzmann's constant is really just a scale factor which converts conventional units to the fundamental units we've defined above.

It's often said that we measure temperature in Kelvins or degrees Celsius or Fahrenheit because the measurement of temperature was established before the development of thermodynamics, which in turn took place before the connection to energy was fully appreciated. What would you think if you tuned in to the weather channel and found out that the high tomorrow was expected to be $4.14 \times 10^{-14}$ erg or 0.0259 eV??? (If I did the arithmetic correctly, this is $\sim 80^\circ$F.)

Actually, to measure a temperature, we need a thermometer. Thermometers make use of physical properties which vary with temperature. (That's obvious I suppose!) The trick is to calibrate the thermometers so you get an accurate measure of the thermodynamic temperature, $\tau/k$. A recent Physics Today article discusses some of the difficulties in defining a good practical scale for $\tau/k < 1$ Kelvin. (Soulen, Jr., R. J., and Fogle, W. E., 1997, Physics Today, vol. 50, no. 8, p. 36, "Temperature Scales Below 1 Kelvin.")

One other thing to point out here: You've no doubt noticed the $V, N$ subscripts. When you read a thermodynamics text, you'll often find the statement that this is a reminder that $V$ and $N$ are being held fixed in taking the indicated partial derivative. Well, this is true, but since we have a partial derivative, which already means hold everything else fixed, why do we need an extra reminder? Answer: since there are so many choices of independent variables, these subscripts are really a reminder of the set of independent variables in use.
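The weather-channel arithmetic is easy to reproduce. The following is a small sketch of the conversion $\tau = kT$; the function names are made up for illustration, and the erg-per-eV conversion factor is a standard value assumed here rather than one quoted in these notes.

```python
K_ERG_PER_KELVIN = 1.38e-16   # Boltzmann's constant, CGS, as quoted in the text
ERG_PER_EV = 1.602e-12        # assumed standard conversion factor


def tau_to_kelvin(tau_erg):
    """Convert a fundamental temperature tau (erg) to T = tau / k (K)."""
    return tau_erg / K_ERG_PER_KELVIN


def tau_to_ev(tau_erg):
    """Express tau in electron volts."""
    return tau_erg / ERG_PER_EV


def kelvin_to_fahrenheit(t_kelvin):
    """Convert kelvins to degrees Fahrenheit."""
    return (t_kelvin - 273.15) * 9.0 / 5.0 + 32.0
```

Feeding in $\tau = 4.14 \times 10^{-14}$ erg gives about 300 K, roughly 0.026 eV, and a little over 80 degrees Fahrenheit, consistent with the numbers in the text.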
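Note that we can add energy to a gas keeping the volume and number of particles fixed. In this case the pressure and temperature rise. Alternatively, we can keep the pressure and number of particles fixed. In this case the volume and temperature increase. Furthermore,

$$ \left(\frac{\partial\sigma}{\partial U}\right)_{V,N} \ne \left(\frac{\partial\sigma}{\partial U}\right)_{p,N} . $$

In the first case the derivative is $1/\tau$; in the second, the volume changes along with $U$, so the derivative picks up an extra contribution from $\partial\sigma/\partial V$.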
When it's obvious from the context which set of independent variables is in use, I will probably be lazy and omit the subscripts.
Physics 301
18-Sep-2002 3-1
Pressure

Last lecture, we considered two systems with entropy as a function of internal energy, volume and number of particles,

$$ \sigma(U, V, N) = \sigma_1(U_1, V_1, N_1) + \sigma_2(U_2, V_2, N_2) . $$

We allowed them to exchange internal energy (that is, they were placed in thermal contact), and by requiring that the entropy be a maximum, we were able to show that the temperature is

$$ \frac{1}{\tau} = \left(\frac{\partial\sigma}{\partial U}\right)_{V,N} . $$

Suppose we continue to consider our two systems, and ask what happens if we allow them to exchange volume as well as energy? (We're placing them in mechanical as well as thermal contact.) Again, the total entropy must be a maximum with respect to exchanges of energy and exchanges of volume. Working through similar mathematics, we find an expression for the change in total entropy and insist that it be zero (so the entropy is maximum) at equilibrium,

$$
\begin{aligned}
0 = \delta\sigma &= \frac{\partial\sigma_1}{\partial U_1}\,\delta U_1 + \frac{\partial\sigma_2}{\partial U_2}\,\delta U_2 + \frac{\partial\sigma_1}{\partial V_1}\,\delta V_1 + \frac{\partial\sigma_2}{\partial V_2}\,\delta V_2 \\
  &= \left(\frac{\partial\sigma_1}{\partial U_1} - \frac{\partial\sigma_2}{\partial U_2}\right)\delta U_1 + \left(\frac{\partial\sigma_1}{\partial V_1} - \frac{\partial\sigma_2}{\partial V_2}\right)\delta V_1 ,
\end{aligned}
$$

where the second line uses $\delta U_2 = -\delta U_1$ and $\delta V_2 = -\delta V_1$. From this we infer that at equilibrium,

$$ \frac{\partial\sigma_1}{\partial U_1} = \frac{\partial\sigma_2}{\partial U_2} , $$

which we already knew, and

$$ \frac{\partial\sigma_1}{\partial V_1} = \frac{\partial\sigma_2}{\partial V_2} . $$

This last equation is new, and it must have something to do with the pressure. Why? Because, once the temperatures are the same, two systems exchange volume only if one system can "push harder" and expand while the other contracts. We define the pressure:

$$ p = \tau \left(\frac{\partial\sigma}{\partial V}\right)_{U,N} . $$

We will see later that this definition agrees with the conventional definition of pressure as force per unit area.
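To see that these definitions are at least plausible, one can apply them to an entropy function and compare with the gas law. The sketch below assumes the form $\sigma = N[\log(V/N) + \tfrac{3}{2}\log(U/N)]$ plus a constant, which is the standard ideal gas result but is an assumption here, not something derived in this section. The helper names are hypothetical. The derivatives are taken by central finite differences so that only the definitions of $\tau$ and $p$ are used.

```python
import math


def sigma(u, v, n):
    """Assumed ideal gas entropy, up to a term depending only on N."""
    return n * (math.log(v / n) + 1.5 * math.log(u / n))


def central_difference(f, x, h):
    """Numerical derivative of f at x with step h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def tau_and_pressure(u, v, n, rel_step=1e-5):
    """Apply 1/tau = (d sigma/dU)_{V,N} and p = tau (d sigma/dV)_{U,N}."""
    dsigma_du = central_difference(lambda x: sigma(x, v, n), u, rel_step * u)
    dsigma_dv = central_difference(lambda x: sigma(u, x, n), v, rel_step * v)
    tau = 1.0 / dsigma_du
    pressure = tau * dsigma_dv
    return tau, pressure
```

For this assumed entropy the result satisfies $pV = N\tau$, which is the ideal gas law $pV = NkT$ written in fundamental units.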
Chemical Potential

Well, there's one variable left; guess what we're going to do now! Suppose we allow the two systems to exchange particles as well as energy and volume. Again, we want to maximize the entropy with respect to changes in all the independent variables and this leads to

$$
\begin{aligned}
0 = \delta\sigma &= \frac{\partial\sigma_1}{\partial U_1}\,\delta U_1 + \frac{\partial\sigma_2}{\partial U_2}\,\delta U_2 + \frac{\partial\sigma_1}{\partial V_1}\,\delta V_1 + \frac{\partial\sigma_2}{\partial V_2}\,\delta V_2 + \frac{\partial\sigma_1}{\partial N_1}\,\delta N_1 + \frac{\partial\sigma_2}{\partial N_2}\,\delta N_2 \\
  &= \left(\frac{\partial\sigma_1}{\partial U_1} - \frac{\partial\sigma_2}{\partial U_2}\right)\delta U_1 + \left(\frac{\partial\sigma_1}{\partial V_1} - \frac{\partial\sigma_2}{\partial V_2}\right)\delta V_1 + \left(\frac{\partial\sigma_1}{\partial N_1} - \frac{\partial\sigma_2}{\partial N_2}\right)\delta N_1 .
\end{aligned}
$$

So, when the systems can exchange particles as well as energy and volume,

$$ \frac{\partial\sigma_1}{\partial N_1} = \frac{\partial\sigma_2}{\partial N_2} . $$

The fact that these derivatives must be equal in equilibrium allows us to define yet another quantity, $\mu$, the chemical potential

$$ \mu = -\tau \left(\frac{\partial\sigma}{\partial N}\right)_{U,V} . $$

If two systems are allowed to exchange particles and the chemical potentials are unequal, there will be a net flow of particles until the chemical potentials are equal. Like temperature and pressure, chemical potential is an intensive quantity. Unlike temperature and pressure, you probably have not come across chemical potential in your elementary thermodynamics. You can think of it very much like a potential energy per particle. Systems with high chemical potential want to send particles to a system with low potential energy per particle.

Note that we can write a change in the entropy of a system, specified in terms of $U$, $V$, and $N$, as

$$ d\sigma = \frac{1}{\tau}\,dU + \frac{p}{\tau}\,dV - \frac{\mu}{\tau}\,dN , $$

or rearranging,

$$ dU = \tau\,d\sigma - p\,dV + \mu\,dN . $$

This is the conservation of energy (first law of thermodynamics) written for a system which can absorb energy in the form of heat, which can do mechanical $p\,dV$ work, and which can change its energy by changing the number of particles.
Probability

Here, we will introduce some basic concepts of probability. To start with, one imagines some experiment or other process in which several possible outcomes may occur. The possible outcomes are known, but not definite. For example, tossing a die leads to one of the 6 numbers 1, 2, 3, 4, 5, 6 turning up, but which number will occur is not known in advance. Presumably, a set of elementary outcomes can be defined and all possible outcomes can be specified by saying which elementary outcomes must occur. For example, the tossing of the die resulting in an even number would be made up of the elementary events: the toss is 2 or the toss is 4 or the toss is 6. A set of elementary events is such that one and only one event can occur in any repetition of the experiment. For example, the events (1) the toss results in a prime number and (2) the toss gives an even number could not both be part of a set of elementary events, because if the number 2 comes up, both events have occurred!

One imagines that a very large number of tosses of the die take place. Furthermore, in each toss, an attempt is made to ensure that there is no memory of the previous toss. (This is another way of saying successive tosses are independent.) Then the probability of an event is just the fraction of times it occurs in this large set of experiments, that is, $n_e/N$, where $n_e$ is the number of times event $e$ occurs and $N$ is the total number of experiments. In principle, we should take the limit as the number of trials goes to $\infty$. From this definition it is easy to see that the probabilities of a set of elementary events must satisfy

$$ p_i \ge 0 , \qquad \text{and} \qquad \sum_i p_i = 1 , $$
where $p_i$ is the probability of event $i$ and $i$ is an index that ranges over the possible elementary events.

The above definition is intuitive, but gives the sense of a process occurring in time. That is, we throw the same die over and over again and keep track of what happens. Instead, we can imagine a very large number of dice. Each die has been prepared, as nearly as possible, to be identical. Each die is shaken (randomized) and tossed independently. Again, the probability of an event is the fraction of the total number of trials in which the event occurs. This collection of identically prepared systems and identically performed trials is called an ensemble, and averages that we calculate with this construction are called ensemble averages.

You are probably thinking that for the die, the probabilities of each of the six elementary events 1, 2, 3, 4, 5, 6 must be 1/6. Well, they could be, but it's not necessary! You've heard of loaded dice, right? All that's really necessary is that each $p_i$ be non-negative and that their sum be 1. Probability theory itself makes no statement about the values of the probabilities. The values must come from somewhere else. In general, we just assign probabilities to the elementary events. Often we will appeal to symmetry or other arguments to assign the probabilities. For example, since a die is a symmetric cube, no face can be distinguished (mechanically) from any other face and we can plausibly argue that the probabilities should be equal.

Aside: well, the dots have to be painted on, so the die isn't perfectly symmetric. Presumably, differences in the amount and pattern of paint on each face make a negligible difference in the mechanical properties of the die (such as center of mass, moment of inertia, etc.) so it's a very good approximation to regard the die as symmetric. However, some dice have rather large indentations for each dot. I've occasionally wondered if this might make a detectable difference in the probabilities.

In our discussion of the entropy, we postulated that a system is equally likely to be in any microscopic state consistent with the constraints. This amounts to assigning the probabilities and is basically an appeal to symmetry in the same way that assigning equal probabilities to each face of a die is an appeal to symmetry!
Averages

Assuming there is some numeric value associated with each elementary event, we can calculate its average value just by adding up all the values and dividing by the total number of trials, which is exactly what you think of as an average. So, if event $i$ produces the value $y_i$, then its average value is

$$
\begin{aligned}
\langle y \rangle &= \frac{1}{N}\bigl(\underbrace{y_1 + y_1 + \cdots + y_1}_{n_1\ \text{times}} + \underbrace{y_2 + y_2 + \cdots + y_2}_{n_2\ \text{times}} + \cdots\bigr) \\
  &= \frac{1}{N}\,(n_1 y_1 + n_2 y_2 + \cdots) \\
  &= \sum_i y_i\, p_i .
\end{aligned}
$$

Quantities like $y$, whose value varies across an ensemble, are called random variables. After the average, we will often be most interested in the variance (often called the square of the standard deviation). This is just the average value of the square of the deviation from the average,

$$ \mathrm{var}(y) = \sigma_y^2 = \bigl\langle (y - \langle y \rangle)^2 \bigr\rangle , $$

where $\sigma_y$ is the standard deviation in $y$, not the entropy! The standard deviation is a measure of the spread about the average.
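As a concrete instance of these formulas, here is a short sketch that computes the mean and variance for a die. It uses exact rational arithmetic so the results are not clouded by rounding. The loaded-die probabilities are invented for illustration; any non-negative set summing to 1 would do.

```python
from fractions import Fraction


def mean(values, probs):
    """<y> = sum_i y_i p_i."""
    return sum(y * p for y, p in zip(values, probs))


def variance(values, probs):
    """var(y) = <(y - <y>)^2>."""
    m = mean(values, probs)
    return sum(p * (y - m) ** 2 for y, p in zip(values, probs))


FACES = [1, 2, 3, 4, 5, 6]
FAIR = [Fraction(1, 6)] * 6
# Hypothetical loaded die: six comes up half the time.
LOADED = [Fraction(1, 10)] * 5 + [Fraction(1, 2)]
```

For the fair die the mean is 7/2 and the variance is 35/12; the loaded die above has mean 9/2.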
Probabilities for Continuous Variables

Rather than giving one of a finite (or infinite) number of discrete outcomes, an experiment might result in the measurement of a random variable which is continuously distributed over some finite (or infinite) range. In this case we deal with a probability density rather than discrete probabilities. For example, we might make a measurement of a continuous variable $x$. Then the probability that the measurement falls in a small range $dx$ around the value $x$ is

$$ \mathrm{Prob}(x < \text{result} < x + dx) = p(x)\,dx , $$

where $p(x)$ is the probability density. Just as for discrete probabilities, the probability density must satisfy

$$ p(x) \ge 0 , \qquad \text{and} \qquad \int_{\text{allowed range of } x} p(x)\,dx = 1 . $$

We can simply define $p(x) = 0$ when $x$ is outside the allowed range, so the normalization becomes

$$ \int_{-\infty}^{+\infty} p(x)\,dx = 1 . $$

The average of any function of $x$, $y(x)$, is defined by

$$ \langle y \rangle = \int_{-\infty}^{+\infty} y(x)\,p(x)\,dx . $$
The Binomial Distribution

As an example of working with probabilities, we consider the binomial distribution. We have $N$ trials or $N$ copies of similar systems. Each trial or system has two possible outcomes or states. We can call these heads or tails (if the experiment is tossing a coin), spin up or spin down (for spin 1/2 systems), etc. We suppose that each trial or system is independent and we suppose the probability of heads in one trial or spin up in one system is $p$ and the probability of tails or spin down is $1 - p = q$. (Let's just call these up and down, I'm getting tired of all these words!)

To completely specify the state of the system, we would have to say which of the $N$ systems are up and which are down. Since there are 2 states for each of the $N$ systems, the total number of states is $2^N$. The probability that a particular state occurs depends on the number of ups and downs in that state. In particular, the probability of a particular state with $n$ up spins and $N - n$ down spins is

$$ \mathrm{Prob}(\text{single state with $n$ up spins}) = p^n q^{N-n} . $$

Usually, we are not interested in a single state with $n$ up spins, but we are interested in all the states that have $n$ up spins. We need to know how many there are. There is 1 state with no up spins. There are $N$ different ways we have exactly one of the $N$ spins up and $N - 1$ down. There are $N(N-1)/2$ ways to have two spins up. In general, there are $\binom{N}{n}$ different states with $n$ up spins. These states are distinct, so the probability of getting any state with $n$ up spins is just the sum of the probabilities of the individual states. So

$$ \mathrm{Prob}(\text{any state with $n$ up spins}) = \binom{N}{n} p^n q^{N-n} . $$

Note that

$$ 1 = (p + q)^N = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n} , $$

and the probabilities are properly normalized.

To illustrate a trick for computing average values, suppose that when there are $n$ up spins, a measurement of the variable $y$ produces $n$. What are the mean and variance of $y$? To calculate the mean, we want to perform the sum,

$$ \langle y \rangle = \sum_{n=0}^{N} n \binom{N}{n} p^n q^{N-n} . $$

Consider the binomial expansion

$$ (p + q)^N = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n} , $$

and observe that if we treat (for the moment) $p$ and $q$ as independent mathematical variables and we differentiate both sides of this expression with respect to $p$ (keeping $q$ fixed), we get

$$ N (p + q)^{N-1} = \sum_{n=0}^{N} n \binom{N}{n} p^{n-1} q^{N-n} . $$

The RHS is almost what we want; it's missing one power of $p$. No problem, just multiply by $p$,

$$ N p\, (p + q)^{N-1} = \sum_{n=0}^{N} n \binom{N}{n} p^{n} q^{N-n} . $$

This is true for any (positive) values of $p$ and $q$. Now specialize to the case where $p + q = 1$. Then

$$ N p = \sum_{n=0}^{N} n \binom{N}{n} p^{n} q^{N-n} = \langle y \rangle . $$

A similar calculation gives

$$ \mathrm{var}(y) = N p q . $$

The fractional spread about the mean is proportional to $N^{-1/2}$. This is typical; as the number of particles grows, the fractional deviations from the mean of physical quantities decrease in proportion to $N^{-1/2}$. So with $\sim N_0$ particles, fractional fluctuations in physical quantities are $\sim 10^{-12}$. This is extremely small. Even though the macroscopic parameters in statistical mechanics are random variables, their fluctuations are so small that they can usually be ignored. We speak of the energy of a system and write down a single value, even though the energy of a system in thermal contact with a heat bath is properly a random variable which fluctuates continuously.
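The results $\langle y \rangle = Np$ and $\mathrm{var}(y) = Npq$ can be checked by brute-force summation for a modest $N$. This is only a numerical sanity check, not a substitute for the derivative trick above; the function names are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(n, big_n, p):
    """Probability of any state with n up spins out of big_n."""
    return comb(big_n, n) * p ** n * (1.0 - p) ** (big_n - n)


def binomial_moments(big_n, p):
    """Return (normalization, mean, variance) by direct summation."""
    probs = [binomial_pmf(n, big_n, p) for n in range(big_n + 1)]
    total = sum(probs)
    mean = sum(n * pr for n, pr in enumerate(probs))
    var = sum((n - mean) ** 2 * pr for n, pr in enumerate(probs))
    return total, mean, var


def fractional_spread(big_n, p):
    """sqrt(var) / mean = sqrt(N p q) / (N p), which scales as N**-0.5."""
    return sqrt(big_n * p * (1.0 - p)) / (big_n * p)
```

With $N = 50$ and $p = 0.3$ the sums give a mean of 15 and a variance of 10.5, as the formulas predict. Evaluating `fractional_spread` at $N \approx 6 \times 10^{23}$ and $p = 1/2$ gives a number of order $10^{-12}$, the figure quoted in the text.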
Physics 301
20-Sep-2002 4-1
Example: A Spin System

In the last lecture, we discussed the binomial distribution. Now, I would like to add a little physical content by considering a spin system. Actually this will be a model for a paramagnetic material. We'll consider a system with a large number, $N$, of identical spin 1/2 systems. As you know, if you pick an axis, and measure the component of angular momentum of a spin 1/2 system along that axis, you can get only two answers: $+\hbar/2$ and $-\hbar/2$. If there's charge involved, then there's a magnetic moment, $\mathbf{m}$, parallel or antiparallel to the angular momentum. If there's a magnetic field, $\mathbf{B}$, then this defines an axis and the energy $-\mathbf{m}\cdot\mathbf{B}$ of the spin system in the magnetic field can be either $-mB$ if the magnetic moment is parallel to the field or $+mB$ if the magnetic moment is antiparallel to the field. To save some writing, let $E = mB > 0$ so the energy of an individual system is $\pm E$.

In this model, we are considering only the energies of the magnetic dipoles in an external magnetic field. We are ignoring all other interactions and sources of energy. For example, we are ignoring magnetic interactions between the individual systems, which means we are dealing with a paramagnetic material, not a ferromagnetic material. Also, we are ignoring diamagnetic effects, that is, effects caused by induced magnetic moments when the field is established. Generally, if there is a permanent dipole moment $m$, paramagnetic effects dominate diamagnetic effects. Of course, there must be some interactions of our magnets with each other or with the outside world or there would be no way for them to change their energies and come to equilibrium. What we're assuming is that these interactions are there, but just so small that we don't need to count them when we add up the energy. (Of course the smaller they are, the longer it will take for equilibrium to be established. . .)
Our goal here is to work out expressions for the energy, entropy, and temperature in terms of the number of parallel and antiparallel magnetic moments.

If there is no magnetic field, then there is nothing to pick out any direction, and we expect that any given magnetic moment or spin is equally likely to be parallel or antiparallel to any direction we pick. So the probability of parallel should be the same as the probability of antiparallel, namely 1/2: $p = 1 - p = q = 1/2$. If we turn on the magnetic field, we expect that more magnets will line up parallel to the field than antiparallel ($p > q$) so that the entire system has a lower total energy than it would have with equal numbers of magnets parallel and antiparallel.

If we didn't know anything about thermal effects, we'd say that all the magnets should align with the field in order to get the lowest total energy. But we do know something about thermal effects. What we know is that these magnets are exchanging energy with each other and the rest of the world, so a magnet that is parallel to the field, having energy $-E$, might receive energy $+2E$ and align antiparallel to the field with energy $+E$. It will stay antiparallel until it can give up the energy $2E$ to a different magnet or to the outside world. The strengths of the interactions determine how rapidly equilibrium is approached (a subject we will skip for the time being), but the temperature sets an energy scale and determines how likely it is that chunks of energy of size $2E$ are available.

So suppose that $n$ of the magnets are parallel to the field and $N - n$ are antiparallel. K&K define the "spin excess" as the number parallel minus the number antiparallel, $2s = n - (N - n) = 2n - N$ or $n = s + N/2$. The energy of the entire system is then

$$ U(n) = -nE + (N - n)E = -(2n - N)E = -2sE . $$

The entropy is the log of the number of ways our system can have this amount of energy and this is just the binomial coefficient,

$$ \sigma(n) = \log \binom{N}{n} = \log \frac{N!}{(N/2 + s)!\,(N/2 - s)!} . $$

To put this in the context of our previous discussion of entropy and energy, note that we talked about determining the entropy as a function of energy, volume, and number of particles. In this case, the volume doesn't enter and we're not changing the number of particles (or systems) $N$. At the moment, we are not writing the entropy as an explicit function of the energy. Instead, the two equations above are parametric equations for the entropy and energy.

To find the temperature, we need $\partial\sigma/\partial U$. In our formulation, the entropy and energy are functions of a discrete variable, not a continuous variable. No problem! We'll just send one magnet from parallel to antiparallel. This will make a change in energy, $\Delta U$, and a change in entropy, $\Delta\sigma$, and we simply take the ratio as the approximation to the partial derivative. So,

$$
\begin{aligned}
\Delta U &= U(n-1) - U(n) = 2E , \\
\Delta\sigma &= \sigma(n-1) - \sigma(n) \\
  &= \log\binom{N}{n-1} - \log\binom{N}{n} \\
  &= \log\left[\frac{N!}{(n-1)!\,(N-n+1)!}\;\frac{n!\,(N-n)!}{N!}\right] \\
  &= \log\frac{n}{N-n+1} \\
  &= \log\frac{n}{N-n} \qquad \text{(the 1 can't matter if $N - n \sim N_0$)} \\
  &= \log\frac{N/2+s}{N/2-s} ,
\end{aligned}
$$
where the last line expresses the result in terms of the spin excess. Throwing away the 1 is OK, provided we are not at zero temperature where $n = N$. The temperature is then

$$ \tau = \frac{\Delta U}{\Delta\sigma} = \frac{2E}{\log\left[(N/2 + s)/(N/2 - s)\right]} . $$

At this point it's convenient to solve for $s$. We have

$$ \frac{N/2 + s}{N/2 - s} = e^{2E/\tau} , $$

and with a little algebra

$$ \frac{2s}{N} = \tanh\frac{E}{\tau} . $$

The plot shows this function: fractional spin excess versus $E/\tau$. To the left, thermal energy dominates magnetic energy and the net alignment is small. To the right, magnetic energy dominates thermal energy and the alignment is large. Just what we expected!

Suppose the situation is such that $E/\tau$ is large. Then the magnets are all aligned. Now turn off the magnetic field, leaving the magnets aligned. What happens? The system is no longer in equilibrium. It absorbs energy and entropy from its surroundings, cooling the surroundings. This technique is actually used in low temperature experiments. It's called adiabatic demagnetization. Demagnetization refers to removing the external magnetic field and adiabatic refers to doing it gently enough to leave the magnets aligned.
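The finite-difference temperature and the $\tanh$ result can be compared directly for a large but finite $N$. The sketch below evaluates $\log\binom{N}{n}$ with the log-gamma function so that the factorials never overflow. The function names and the choice of units (energies measured in units of $E$ by default) are assumptions made for the example.

```python
import math


def log_multiplicity(big_n, n):
    """sigma(n) = log[N! / (n! (N - n)!)] via log-gamma."""
    return (math.lgamma(big_n + 1) - math.lgamma(n + 1)
            - math.lgamma(big_n - n + 1))


def temperature(big_n, n, energy=1.0):
    """tau = Delta U / Delta sigma when one magnet flips to antiparallel."""
    delta_u = 2.0 * energy
    delta_sigma = log_multiplicity(big_n, n - 1) - log_multiplicity(big_n, n)
    return delta_u / delta_sigma


def fractional_spin_excess(tau, energy=1.0):
    """2s/N = tanh(E / tau)."""
    return math.tanh(energy / tau)
```

For example, with $N = 10^6$ and $n = 6 \times 10^5$ the spin excess is $2s/N = 0.2$; computing $\tau$ from the flip of a single magnet and feeding it back into $\tanh(E/\tau)$ recovers 0.2 to good accuracy.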
The Boltzmann Factor

An additional comment on probabilities: When the spin excess is $2s$, the probabilities of parallel or antiparallel alignment are:

$$ p = \frac{1}{2} + \frac{s}{N} , \qquad q = \frac{1}{2} - \frac{s}{N} . $$

The ratio of the probabilities is

$$ \frac{q}{p} = \frac{1 - 2s/N}{1 + 2s/N} = e^{-2E/\tau} . $$

This is a general result. The relative probability of finding a system in either of two states that differ in energy by $\Delta E$ is just

$$ \frac{\text{Probability of high energy state}}{\text{Probability of low energy state}} = e^{-\Delta E/\tau} = e^{-\Delta E/kT} . $$

This is called the Boltzmann factor. As we've already mentioned, this says that energies $\lesssim kT$ are "easy" to come by, while energies $> kT$ are hard to come by! The temperature sets the scale of the relevant energies.
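A few lines of Python make the connection between the spin probabilities and the Boltzmann factor explicit. This is a sketch with hypothetical function names; energies and $\tau$ are in the same arbitrary units.

```python
import math


def boltzmann_ratio(delta_e, tau):
    """P(high energy state) / P(low energy state) = exp(-delta_e / tau)."""
    return math.exp(-delta_e / tau)


def spin_probabilities(energy, tau):
    """Return (p, q) for parallel and antiparallel alignment.

    Uses 2s/N = tanh(E / tau), so p = 1/2 + s/N and q = 1/2 - s/N.
    """
    excess = math.tanh(energy / tau)
    return 0.5 * (1.0 + excess), 0.5 * (1.0 - excess)
```

For any $E$ and $\tau$, the ratio $q/p$ from `spin_probabilities` equals `boltzmann_ratio(2 * E, tau)`, since the two orientations differ in energy by $2E$. An energy gap equal to $\tau$ suppresses the upper state by a factor $e^{-1} \approx 0.37$.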
The Gaussian Distribution

We've discussed two discrete probability distributions, the binomial distribution and (in the homework) the Poisson distribution. As an example of a continuous distribution, we'll consider the Gaussian (or normal) distribution. It is a function of one continuous variable and occurs throughout the sciences.

The reason the Gaussian distribution is so prevalent is that under very general conditions, the distribution of a random variable which is the sum of a large number of independent, identically distributed random variables approaches the Gaussian distribution as the number of random variables in the sum goes to infinity. This result is called the central limit theorem and is proven in probability courses.

The distribution depends on two parameters, the mean, $\mu$ (not the chemical potential!), and the standard deviation, $\sigma$ (not the entropy!). The probability density is

$$ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2} . $$

You should be able to show that

$$
\begin{aligned}
\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx &= 1 , \\
\langle x \rangle = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} x\, e^{-(x-\mu)^2/2\sigma^2}\,dx &= \mu , \\
\mathrm{var}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} (x-\mu)^2\, e^{-(x-\mu)^2/2\sigma^2}\,dx &= \sigma^2 .
\end{aligned}
$$
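If the integrals give you trouble, a numerical check can at least confirm the answers. The sketch below uses a simple midpoint rule over $\mu \pm 10\sigma$; the integration range, the step count, and the function names are choices made for this illustration, and the tails beyond $10\sigma$ are assumed negligible.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / norm


def gaussian_moments(mu, sigma, half_width=10.0, steps=20000):
    """Return (normalization, <x>, var(x)) by midpoint-rule integration."""
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    norm = mean = var = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx          # midpoint of the i-th bin
        weight = gaussian_pdf(x, mu, sigma) * dx
        norm += weight
        mean += x * weight
        var += (x - mu) ** 2 * weight
    return norm, mean, var
```

Calling `gaussian_moments(1.5, 0.7)` should return values very close to 1, 1.5, and 0.49, matching the three integrals above.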
Appendix A of K&K might be useful if you have trouble with these integrals.

One can always recenter so that $x$ is measured from $\mu$ and rescale so that $x$ is measured in units of $\sigma$. Then the density takes the dimensionless form,

$$ p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} . $$

Sometimes you might need to integrate this density over a finite (rather than infinite) range. Two related functions are of interest, the error function

$$ \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt = 2\,\frac{1}{\sqrt{2\pi}} \int_0^{\sqrt{2}\,z} e^{-x^2/2}\,dx , $$
and the complementary error function

$$ \mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} e^{-t^2}\,dt = 2\,\frac{1}{\sqrt{2\pi}} \int_{\sqrt{2}\,z}^{\infty} e^{-x^2/2}\,dx , $$

where the first expression (involving $t$) is the typical definition, and the second (obtained by changing variables $t = x/\sqrt{2}$) rewrites the definition in terms of the Gaussian probability density. Note that $\mathrm{erf}(0) = 0$, $\mathrm{erf}(\infty) = 1$, and $\mathrm{erf}(z) + \mathrm{erfc}(z) = 1$.

The Gaussian density is just the "bell" curve, peaked in the middle, with small tails. The error function gives the probability associated with a range in $x$ at the middle of the curve, while the complementary error function gives probabilities associated with the tails of the distribution. In general, you have to look these up in tables, or have a fancy calculator that can generate them.

As an example, you might hear someone at a research talk say, "I've obtained a marginal two-sigma result." What this means is that the signal that was detected was only $2\sigma$ larger than no signal. A noise effect this large or larger will happen with probability

$$ \frac{1}{\sqrt{2\pi}} \int_2^{\infty} e^{-x^2/2}\,dx = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{2}{\sqrt{2}}\right) = 0.023 . $$

That is, more than 2 percent of the time, noise will give a $2\sigma$ result just by chance. This is why $2\sigma$ is marginal.

We're straying a bit from thermal physics, so let's get back on track. One of the reasons for bringing up a Gaussian distribution is that many other distributions approach a Gaussian distribution when large numbers are involved. (The central limit theorem might have something to do with this!) For example, the binomial distribution. When the numbers are large, we can replace the discrete distribution in $n$ with a continuous distribution. The advantage is that it is often easier to work with a continuous function. In particular, the probability of a spin excess, $s$, is

$$ p_s = \frac{N!}{(N/2 + s)!\,(N/2 - s)!}\; p^{N/2+s}\, q^{N/2-s} . $$
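We need to do something with the factorials. In K&K, Appendix A, Stirling's approximation is derived. For large $N$,

$$ N! \sim \sqrt{2\pi N}\; N^N e^{-N} . $$

With this, we have

$$
\begin{aligned}
p_s &\sim \frac{\sqrt{2\pi N}\; N^N}{\sqrt{2\pi(N/2+s)}\,\sqrt{2\pi(N/2-s)}\;(N/2+s)^{(N/2+s)}\,(N/2-s)^{(N/2-s)}}\; p^{N/2+s}\, q^{N/2-s} \\
  &= \sqrt{\frac{1}{2\pi N\,(1/2+s/N)\,(1/2-s/N)}}\;\; \frac{p^{N/2+s}\, q^{N/2-s}}{(1/2+s/N)^{(N/2+s)}\,(1/2-s/N)^{(N/2-s)}} \\
  &= \sqrt{\frac{1}{2\pi N p q}}\;\sqrt{\frac{p q}{(1/2+s/N)\,(1/2-s/N)}}\;\; \frac{p^{N/2+s}\, q^{N/2-s}}{(1/2+s/N)^{(N/2+s)}\,(1/2-s/N)^{(N/2-s)}} \\
  &= \sqrt{\frac{1}{2\pi N p q}} \left(\frac{p}{1/2+s/N}\right)^{(N/2+s+1/2)} \left(\frac{q}{1/2-s/N}\right)^{(N/2-s+1/2)} .
\end{aligned}
$$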
Recall that the variance of the binomial distribution is $Npq$, so things are starting to look promising. Also, we are working under the assumption that we are dealing with large numbers. This means that $s$ cannot be close to $\pm N/2$. If it were, then we would have a small number of aligned, or a small number of anti-aligned magnets. So, in the exponents in the last line, $N/2 \pm s$ is a large number and we can ignore the 1/2. Then

$$ p_s = \sqrt{\frac{1}{2\pi N p q}} \left(\frac{p}{1/2+s/N}\right)^{(N/2+s)} \left(\frac{q}{1/2-s/N}\right)^{(N/2-s)} . $$
This is a sharply peaked function. We expect the peak to be centered at $s = s_0 = \langle s\rangle = \langle n\rangle - N/2 = Np - N/2 = N(p - 1/2)$. We want to expand this function about its maximum. Actually, it will be easier to locate the peak and expand the function, if we work with its logarithm.
$$\log p_s = A + \left(\frac{N}{2}+s\right)\left[\log p - \log\left(\frac{1}{2}+\frac{s}{N}\right)\right] + \left(\frac{N}{2}-s\right)\left[\log q - \log\left(\frac{1}{2}-\frac{s}{N}\right)\right],$$
where
$$A = \frac{1}{2}\log\left(\frac{1}{2\pi Npq}\right).$$
To locate the maximum of this function, we take the derivative and set it to 0
$$\frac{d\log p_s}{ds} = \log p - \log\left(\frac{1}{2}+\frac{s}{N}\right) - 1 - \log q + \log\left(\frac{1}{2}-\frac{s}{N}\right) + 1\,.$$
We note that this expression is 0 when s/N = p − 1/2, just as we expected. So this is the point about which we’ll expand the logarithm. The next term in a Taylor expansion requires the second derivative
$$\begin{aligned}
\frac{d^2\log p_s}{ds^2} &= -\frac{1}{N/2+s} - \frac{1}{N/2-s} \\
&= -\frac{1}{Np} - \frac{1}{Nq} = -\frac{1}{Npq}\,,
\end{aligned}$$
where, in the last line, we substituted the value of s at the maximum. We can expand the logarithm as
$$\log p_s = A - \frac{1}{2}\,\frac{1}{Npq}\,(s-s_0)^2 + \cdots$$
where $s_0 = N(p - 1/2)$ is the value of s at the maximum. Finally, we let $\sigma^2 = Npq$, exponentiate the logarithm, and obtain,
$$p(s) \sim \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-(s-s_0)^2/2\sigma^2}\,,$$
where the notation has been changed to indicate a continuous variable rather than a discrete variable. You might worry about this last step. In particular, we have a discrete probability that we just converted into a probability density. In fact, p(s) ds is the probability that the variable is in the range s → s + ds. In the discrete case, the spacing between values of s is unity, so we require $p(s)\,[(s+1) - s] = p_s$, which leads to $p(s) = p_s$. Had there been a different spacing there would be a different factor relating the discrete and continuous expressions.

All this was a lot of work to demonstrate in some detail that for large N (and not too large s), the binomial distribution describing our paramagnetic system goes over to the Gaussian distribution. Of course, expanding the logarithm to second order guarantees a Gaussian!

In practice, you would not go to all this trouble to do the conversion. The way you would actually do the conversion is to notice that large numbers are involved, so the
distribution must be Gaussian. Then all you need to know are the mean and variance which you calculate from the binomial distribution or however you can. Then you just write down the Gaussian distribution with the correct mean and variance.

Returning to our paramagnetic system, we found earlier that the mean value of the spin excess is
$$s_0 = \frac{N}{2}\tanh\frac{E}{\tau}\,.$$
We can use the Gaussian approximation provided s is not too large compared to N/2, which means $E \lesssim \tau$. In this case, a little algebra shows that the variance is
$$\sigma^2 = Npq = N\left(\frac{1}{2}\,\mathrm{sech}\,\frac{E}{\tau}\right)^2 .$$
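The Gaussian limit of the binomial distribution worked out above is easy to check numerically. The following is a minimal Python sketch and not part of the original notes: the function names and the test values N = 1000, p = 0.6 are arbitrary choices made for illustration. It evaluates the exact binomial probability of a spin excess s (through logarithms of factorials, to avoid overflow) and compares it with the Gaussian of mean N(p − 1/2) and variance Npq.

```python
import math

def binomial_spin_excess(N, s, p):
    """Exact binomial probability p_s for spin excess s (N even, N/2 + s spins up)."""
    q = 1.0 - p
    n_up = N // 2 + s
    n_down = N - n_up
    # lgamma(k + 1) = log k!, so this is log of N!/(n_up! n_down!) p^n_up q^n_down.
    log_ps = (math.lgamma(N + 1) - math.lgamma(n_up + 1) - math.lgamma(n_down + 1)
              + n_up * math.log(p) + n_down * math.log(q))
    return math.exp(log_ps)

def gaussian_spin_excess(N, s, p):
    """Gaussian approximation with mean s0 = N(p - 1/2) and variance N p q."""
    q = 1.0 - p
    s0 = N * (p - 0.5)
    variance = N * p * q
    return math.exp(-(s - s0) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

N, p = 1000, 0.6   # illustrative values, not taken from the notes
for s in (80, 100, 120):
    exact = binomial_spin_excess(N, s, p)
    approx = gaussian_spin_excess(N, s, p)
    print(f"s = {s:4d}  binomial = {exact:.5f}  gaussian = {approx:.5f}")
```

For these values the two expressions should agree to within a percent or so near the peak; the agreement degrades far out in the tails, where s is no longer small compared with N/2.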
For given E/τ, the actual s fluctuates about the mean $s_0$ with a spread proportional to $\sqrt{N}$ and a fractional spread proportional to $1/\sqrt{N}$. A typical system has $N \sim N_0$, so the fractional spread is of order $10^{-12}$ and the actual s is always very close to $s_0$.

While we’re at it, it’s also interesting to apply Stirling’s approximation to calculate the entropy of our paramagnetic system. Recalling Stirling’s approximation for large N,
$$N! \sim \sqrt{2\pi N}\, N^N e^{-N}\,.$$
Taking the logarithm, we have
$$\log N! \sim \frac{1}{2}\log 2\pi + \frac{1}{2}\log N + N\log N - N\,.$$
The first two terms can be ignored in comparison with the last two, so
$$\log N! \sim N\log N - N\,.$$
Suppose our spin system has $s_0 \approx 0$. Then the entropy is
$$\begin{aligned}
\sigma &\approx \log\frac{N!}{(N/2)!\,(N/2)!} \\
&\sim N\log N - N - 2\left[(N/2)\log(N/2) - (N/2)\right] \\
&= N\log N - N\log(N/2) \\
&= N\log 2 \\
&= 4.2\times 10^{23} \quad \text{(fundamental units)} \\
&= 5.8\times 10^{7}\ \mathrm{erg\ K^{-1}} \quad \text{(conventional units)}\,,
\end{aligned}$$
where the last two lines assume one mole of magnets.
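The two numerical values quoted for a mole of magnets can be reproduced in a few lines. This is a hedged sketch: the constants are rounded values supplied here rather than taken from the notes, and the helper names are made up for the example. It also compares the simple Stirling form against the exact log of the binomial coefficient for a smaller N, where the dropped terms are visible but small.

```python
import math

N_AVOGADRO = 6.022e23         # magnets in one mole (rounded)
K_BOLTZMANN_ERG = 1.381e-16   # Boltzmann constant in erg/K (rounded)

def log_factorial_stirling(n):
    """Leading Stirling form: log n! ~ n log n - n."""
    return n * math.log(n) - n

def spin_entropy_stirling(N):
    """sigma = log[N! / ((N/2)!)^2] using the leading Stirling form."""
    return log_factorial_stirling(N) - 2.0 * log_factorial_stirling(N / 2.0)

def spin_entropy_exact(N):
    """Same quantity using lgamma(k + 1) = log k! (practical for moderate N)."""
    return math.lgamma(N + 1) - 2.0 * math.lgamma(N / 2 + 1)

sigma_mole = spin_entropy_stirling(N_AVOGADRO)    # fundamental units
entropy_mole = K_BOLTZMANN_ERG * sigma_mole       # erg/K

print(f"sigma = {sigma_mole:.2e}, S = {entropy_mole:.2e} erg/K")
print(spin_entropy_stirling(1.0e6), spin_entropy_exact(1_000_000))
```

The Stirling form reduces exactly to N log 2, so the first line is just a unit conversion; the second line shows how little the neglected $\frac{1}{2}\log N$ terms matter even at N = 10^6.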
Physics 301
Homework No. 1
Due 23-Sep-2002
H1-1
1. Consider a system in which the particles are Physics 301 students confined to room A08. Suppose the particles have no kinetic energy so they occupy chairs in the room and there is no interaction energy so there is at most one particle per chair. What is the entropy of this system?

2. This problem is a generalization of K&K problem 1 in chapter 2. Suppose the number of microstates (or multiplicity function) for a system with energy U, volume V, and number of particles N is
$$g(U, N, V) = C\,U^{\alpha N}\,V^{\beta N}\,,$$
where C, α, and β are constants.
(a) What are the temperature and pressure of this system?
(b) If the system is an ideal gas, what can you say about α and β? In particular, can you relate α to the specific heat at constant volume? What must β be in order to recover the ideal gas law?

3. This is essentially K&K, chapter 2, problem 2 and concerns the paramagnetic spin system. In this problem we assume that the temperature τ is large compared with the magnetic energy mB, so the spin excess 2s is small compared with the total number of spins N. Then the entropy as a function of spin excess is $\sigma(s) \approx \sigma_0 - 2s^2/N$, where $\sigma_0 = \sigma(0) = \log g(N, 0)$. If U is the internal energy of the system, show that
$$\sigma(U) = \sigma_0 - \frac{U^2}{2m^2B^2N}\,, \qquad \frac{1}{\tau} = -\frac{U}{m^2B^2N}\,,$$
and finally, find the equilibrium fractional magnetization: M/mN = 2s/N.

Comment: note that U < 0 since the aligned magnetic dipoles have negative energy −E = −mB and anti-aligned dipoles have positive energy E = mB. We can rearrange the above expression as
$$-\frac{U}{N}\,\tau = E^2\,,$$
which says that the average energy per dipole U/N times the energy scale set by the temperature, τ, is a constant $(mB)^2$. The system would like to reduce its energy (make U as negative as possible) but it also wants to increase its entropy (make the alignment as random as possible). The temperature controls which is the more important. At high temperatures, maximizing entropy wins and U → 0. At low temperatures, minimizing
energy wins and U gets large and negative. (The above expression goes to −∞, but that’s a sign of the approximation breaking down!)

4. This comes from K&K, chapter 2, problem 3. A quantum harmonic oscillator with frequency ω can exist in states with energies (relative to the zero point energy) of $\hbar\omega, 2\hbar\omega, 3\hbar\omega, \ldots$. If there are N identical oscillators, the number of ways to obtain an energy $n\hbar\omega$ is
$$g(N, n) = \frac{(N+n-1)!}{n!\,(N-1)!}\,.$$
(This is worked out at the end of K&K chapter 1.)
(a) Find the entropy of this system of oscillators when N is large (so that you can use the Stirling approximation, log N! ≈ N log N − N, and you can ignore 1 compared to N).
(b) Write the entropy as a function of the total energy $U = n\hbar\omega$ and the number of oscillators N: σ = σ(U, N). Use this to find the expression for the energy in terms of the temperature:
$$U = \frac{N\hbar\omega}{\exp(\hbar\omega/\tau) - 1}\,.$$

5. The Poisson Distribution. We discussed the binomial distribution in class. Recall, the probability of obtaining n up spins with N systems is
$$p_n = \binom{N}{n}\, p^n (1-p)^{N-n}\,,$$
where p is the probability that any one system has an up spin. Consider the binomial distribution in the limit that N → ∞, p → 0, but Np → r.
(a) Show that the probability of obtaining n is
$$p_n = \frac{r^n}{n!}\, e^{-r}\,.$$
This is called the Poisson distribution. This distribution occurs in counting. For example, counting nuclear decays. Can you see why this is applicable? (Consider a time interval in which, on the average, there will be r decays and divide the interval into a large number, N, of equal subintervals. Then the probability of a decay in any interval is r/N, and if N is large enough you don’t need to worry about two or more decays in an interval...)
(b) Show that the mean and variance of the Poisson distribution are both r. What is the fractional uncertainty?

Note: you should of course look at the other problems in K&K. You will probably find chapter 2, problem 4 amusing!
Physics 301 Problem Set 1 Solutions

Problem 1. Let’s say there are N = 50 chairs and k = 30 students in the class. We also assume that the students are indistinguishable, have no kinetic energy and there is at most one student per chair. The number of states in this system is then
$$g = \binom{N}{k} = \frac{N!}{k!\,(N-k)!} \tag{1}$$
Using Stirling’s approximation (with errors of ∼ (1/20) = 5%), this reduces to:
$$\sigma = \log g \approx N\log N - k\log k - (N-k)\log(N-k) \tag{2}$$
which for the numbers above, gives σ ≈ 33 which in dimensions of energy/K is $s = 1.4\times 10^{-23} \times 33 \approx 4.5\times 10^{-22}$ J/K.
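With numbers this small the binomial coefficient can be evaluated exactly, which gives a feel for how good the leading Stirling form is. The sketch below is an illustration added to the solution, not part of it; the function names are invented for the example and the Boltzmann constant is a rounded value.

```python
import math

K_BOLTZMANN = 1.381e-23   # J/K (rounded)

def exact_entropy(N, k):
    """sigma = log of the exact number of ways to seat k students in N chairs."""
    return math.log(math.comb(N, k))

def stirling_entropy(N, k):
    """Equation (2): N log N - k log k - (N - k) log(N - k)."""
    return N * math.log(N) - k * math.log(k) - (N - k) * math.log(N - k)

N_chairs, k_students = 50, 30
sigma_exact = exact_entropy(N_chairs, k_students)
sigma_stirling = stirling_entropy(N_chairs, k_students)

print(f"exact    sigma = {sigma_exact:.2f}")
print(f"Stirling sigma = {sigma_stirling:.2f}")
print(f"S (Stirling)   = {K_BOLTZMANN * sigma_stirling:.2e} J/K")
```

The exact value comes out a little below the Stirling estimate (roughly 31.5 against 33.7), a difference of several percent, comparable to the error quoted in the solution.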
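Problem 2. The number of microstates of the system is
$$g(U, N, V) = C\,U^{\alpha N}\,V^{\beta N}\,. \tag{3}$$
This gives us the entropy of this system to be
$$\sigma(U, V, N) = \log g = \alpha N\log U + \beta N\log V + \log C \tag{4}$$
(a) The temperature and pressure are found from the entropy as
$$\frac{1}{\tau} = \frac{\partial\sigma}{\partial U} = \frac{\alpha N}{U}\,, \qquad \frac{p}{\tau} = \frac{\partial\sigma}{\partial V} = \frac{\beta N}{V} \tag{5}$$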
(b) For an ideal gas, pV = Nτ and $U = N c_V \tau/k$. If our system has to be an ideal gas, it means $\alpha = c_V/k$ and β = 1.

Problem 3. The magnetic energy per spin is mB, and the spin excess is 2s, which gives us the total energy of the system to be U = −2mBs or s = −U/2mB. We are given the entropy as a function of the spin excess to be $\sigma(s) \approx \sigma_0 - 2s^2/N$. The entropy in terms of the energy is therefore
$$\sigma(U) \approx \sigma_0 - \frac{U^2}{2m^2B^2N}\,, \tag{6}$$
which gives the temperature to be
$$\frac{1}{\tau} = \frac{\partial\sigma}{\partial U} = -\frac{U}{m^2B^2N} \tag{7}$$
The equilibrium fractional magnetization is given by
$$\frac{M}{mN} = \frac{2s}{N} = -\frac{U}{mBN} \tag{8}$$
Comment (in response to some questions): In this problem, we are dealing with macroscopic variables from the beginning (we don’t know the details of the various states; all of that goes into the formula for the entropy). So all the symbols are really average values. Also, the fact that the entropy is maximised holds for the system + reservoir, and so for a given external magnetic field, our system alone attains some value of energy, entropy and magnetization.

Problem 4. For N quantum harmonic oscillators, the number of ways to get an energy $U = n\hbar\omega$ is
$$g(N, n) = \frac{(N+n-1)!}{n!\,(N-1)!}\,. \tag{9}$$
For large N, and a finite temperature (compared to $\hbar\omega$), the probability that the system is not in its ground state is not very small, and hence the average value of the energy will also not be very small. This justifies the approximation that n is large as well.
(a) The entropy of the system is given by
$$\sigma(N, n) = \log g \approx (N+n)\log(N+n) - N\log N - n\log n\,. \tag{10}$$
(b) This gives the temperature to be (we treat U, N, n as continuous variables as usual)
$$\begin{aligned}
\frac{1}{\tau} = \frac{\partial\sigma}{\partial U} = \frac{1}{\hbar\omega}\frac{\partial\sigma}{\partial n} &= \frac{1}{\hbar\omega}\left(\log(N+n) + 1 - \log n - 1\right) \\
&= \frac{1}{\hbar\omega}\log\left(1 + \frac{N}{n}\right) = \frac{1}{\hbar\omega}\log\left(1 + \frac{N\hbar\omega}{U}\right)
\end{aligned} \tag{11}$$
We can now invert this relation to write
$$U = \frac{N\hbar\omega}{\exp(\hbar\omega/\tau) - 1} \tag{12}$$
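Equations (10) through (12) can be checked against each other numerically: differentiate the entropy (10) to get 1/τ, then feed that τ into (12) and see whether the energy we started with comes back. The sketch below is an added illustration under stated assumptions: energies are measured in units of ħω (so hbar_omega = 1 by default), the derivative is taken by a central finite difference, and the values of N and n are arbitrary.

```python
import math

def entropy(N, n):
    """Equation (10): Stirling-form entropy for N oscillators sharing n quanta."""
    return (N + n) * math.log(N + n) - N * math.log(N) - n * math.log(n)

def inverse_temperature(N, n, hbar_omega=1.0, dn=1.0e-3):
    """1/tau = (1/hbar_omega) d(sigma)/dn, estimated by a central difference."""
    return (entropy(N, n + dn) - entropy(N, n - dn)) / (2.0 * dn * hbar_omega)

def planck_energy(N, tau, hbar_omega=1.0):
    """Equation (12): U = N hbar_omega / (exp(hbar_omega / tau) - 1)."""
    return N * hbar_omega / math.expm1(hbar_omega / tau)

N, n = 1.0e4, 2.5e3                 # illustrative values; U = n in units of hbar_omega
tau = 1.0 / inverse_temperature(N, n)
U_recovered = planck_energy(N, tau)

print(f"tau = {tau:.4f} hbar_omega, U recovered = {U_recovered:.3f}, U input = {n:.3f}")
```

Treating n as a continuous variable in the finite difference mirrors the step taken in part (b).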
Problem 5. The binomial distribution is
$$p_n = \binom{N}{n}\, p^n (1-p)^{(N-n)} = \frac{N!}{(N-n)!\,n!}\, p^n (1-p)^{(N-n)} \tag{13}$$
The Poisson distribution arises in the limit N → ∞, p → 0, Np → r.
(a) To be slightly more precise with the limit, we keep n fixed as well when taking N → ∞ (which is what we mean when we say it is applicable for counting small numbers). Using the facts that in this limit, $(N-n) \approx N$, $(N-n)!\,N^n \approx N!$, and $(1 + \frac{x}{N})^N \to e^x$, we get
$$\begin{aligned}
p_n &= \frac{N!}{n!\,(N-n)!}\left(\frac{r}{N}\right)^n \left(1 - \frac{r}{N}\right)^{N-n} \\
&= \frac{r^n}{n!}\,\frac{N!}{(N-n)!\,N^n}\left(1 - \frac{r}{N}\right)^N \left(1 - \frac{r}{N}\right)^{-n} \approx \frac{r^n}{n!}\, e^{-r}\,.
\end{aligned} \tag{14}$$
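The limit in (14) can be watched happening numerically. In the hedged sketch below (function names and the choice r = 3 are ours, for illustration only) the binomial probabilities with p = r/N are compared with the Poisson probabilities for a few values of N, and the largest discrepancy over n = 0, ..., 10 is recorded.

```python
import math

def binomial_pmf(n, N, p):
    """Equation (13): probability of n successes in N trials."""
    return math.comb(N, n) * p ** n * (1.0 - p) ** (N - n)

def poisson_pmf(n, r):
    """Poisson probability r^n e^{-r} / n!."""
    return r ** n * math.exp(-r) / math.factorial(n)

r = 3.0   # illustrative mean number of counts
max_diff = {}
for N in (10, 100, 10_000):
    max_diff[N] = max(abs(binomial_pmf(n, N, r / N) - poisson_pmf(n, r))
                      for n in range(0, 11))
    print(f"N = {N:6d}  largest |binomial - Poisson| = {max_diff[N]:.2e}")
```

The discrepancy should shrink steadily as N grows with Np held fixed, which is the content of the limit.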
One could also use Stirling’s approximation (and log(1 + x) ≈ x for x ≪ 1) …

… $v_x > 0$ since we want the molecules that are about to collide with the wall, not those which have just collided. Pressure is the rate of change of momentum per unit area, so
$$p = \frac{\Delta P_x}{\Delta t\,\Delta A} = \frac{N}{V}\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} p(v_x, v_y, v_z)\, m v_x^2 \; dv_x\, dv_y\, dv_z = \frac{N}{V}\,\langle m v_x^2\rangle\,.$$
Since the distribution is independent of direction, we dropped the factor of two and extended the range of integration to $v_x < 0$. Also since the distribution is isotropic, we have $\langle m v_x^2\rangle = \langle m v_y^2\rangle = \langle m v_z^2\rangle = \frac{1}{3}\langle m v^2\rangle = \frac{2}{3}\langle E_{\rm tran}\rangle$, where $\langle E_{\rm tran}\rangle$ is the average translational kinetic energy per molecule. Finally,
$$pV = \frac{2}{3}\,N\langle E_{\rm tran}\rangle = \frac{2}{3}\,U_{\rm tran}\,,$$
where $U_{\rm tran}$ is the translational kinetic energy of all the gas molecules. If, in fact, the probability density for the velocities is the Maxwell density, then $\langle E_{\rm tran}\rangle = 3\tau/2$ (homework!) and pV = Nτ = NkT = nRT, where n is the number of moles and R is the gas constant.
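As a quick numerical illustration of the last step, one can integrate $m v_x^2$ against the Maxwell density for a single velocity component and confirm that the average is τ, so that pV = N⟨m v_x²⟩ = Nτ. This is a sketch under assumptions: the one-component density $\sqrt{m/2\pi\tau}\,e^{-m v_x^2/2\tau}$ is taken as given, the mass and temperature are rough values for N₂ near room temperature, and the simple midpoint rule and its step count are our own choices.

```python
import math

def mean_m_vx2(m, tau, n_steps=20000, n_sigma=10.0):
    """Midpoint-rule estimate of <m vx^2> for the one-component Maxwell density."""
    sigma_v = math.sqrt(tau / m)                 # width of the vx distribution
    dv = 2.0 * n_sigma * sigma_v / n_steps
    norm = math.sqrt(m / (2.0 * math.pi * tau))  # normalization of the density
    total = 0.0
    for i in range(n_steps):
        vx = -n_sigma * sigma_v + (i + 0.5) * dv
        total += m * vx * vx * norm * math.exp(-m * vx * vx / (2.0 * tau)) * dv
    return total

m = 4.65e-26     # kg, roughly one N2 molecule (assumed value)
tau = 4.04e-21   # J, roughly kT at 293 K (assumed value)
average = mean_m_vx2(m, tau)

print(f"<m vx^2> / tau = {average / tau:.8f}")
```

The ratio printed should be very close to 1, which together with isotropy gives ⟨E_tran⟩ = 3τ/2.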
Pressure in a Low Density Gas, II

Look inside a gas, and consider molecules with velocity components in $dv_x\,dv_y\,dv_z \equiv d^3v$ at $(v_x, v_y, v_z) = \mathbf{v}$. These molecules have an x momentum density (momentum per unit volume) of
$$\frac{\delta P_x}{\delta V} = \frac{N}{V}\, m v_x\, p(\mathbf{v})\, d^3v\,.$$
All this momentum is carried in the x direction at speed $v_x$. Note that positive momentum is carried in the positive direction while negative momentum is carried in the negative direction; both contribute to a momentum flux in the x direction. In fact, the flux of x momentum in the x direction (momentum per unit area perpendicular to x per unit time) is
$$\frac{\delta P_x}{\delta A\,\delta t} = \frac{N}{V}\, m v_x^2\, p(\mathbf{v})\, d^3v\,.$$
To get the total flux of momentum, we integrate over all velocities and come up with the same thing we had before. Momentum per unit area per second, which is force per unit area, which is pressure, is
$$p = \frac{N}{V}\int d^3v\; m v_x^2\, p(\mathbf{v}) = \frac{N}{V}\,\langle m v_x^2\rangle\,.$$

So, why did we bother with this? For one thing, we don’t have to introduce a wall to talk about pressure. Pressure exists throughout a fluid. Secondly, it’s a first introduction to calculation of transport phenomena. In the preceding we considered the transport of x momentum in the x direction. Of course, y momentum is transferred in the y direction and z momentum in the z direction. These are usually numerically equal to the flux of x momentum in the x direction and we have an isotropic pressure. One can also transport x momentum in the y and z directions, y momentum in the x and z directions and z momentum in the x and y directions. For the simple gas we’ve been considering, these fluxes are zero (can you see why?). However, in more complicated situations, they might not be zero; they correspond to viscous forces. In general, we need a nine component object to specify the transport of momentum (a vector) in any of three directions. This is a second rank tensor, usually called the stress tensor.

Summary: For an ideal gas, we’ve found an expression relating pressure, volume and translational kinetic energy. We related the energy to the temperature using the Maxwell velocity distribution, which was motivated by the Boltzmann factor. However, in writing down the Maxwell distribution, we “finessed” the issue of counting the states, so we haven’t really derived the ideal gas law.
Physics 301
27-Sep-2002 7-1
States of a Particle in a Box

In order to count states, we will use quantum mechanics to ensure that we have discrete states and energy levels. Let’s consider a single particle which is confined to a cubical box of dimensions L × L × L. You might think that this is artificial and wonder how the physics could depend on the size of a box a particle is in. It is artificial and it’s a trick to make the math easier. Once a box is big enough, the physics doesn’t depend on the size of the box, and the physics we deduce must not depend in any critical way on the box size when we take the limit of a very big box. (Of course, the volume of a system is one of the extensive parameters that describes the system and it’s OK for the volume to enter in a manner like it does in the ideal gas law!) In what follows, we’ll ignore rotational and internal energy of the particles and drop the “tran” subscript.

As you probably know, particles are described by wave functions in quantum mechanics. The de Broglie relation between wavelength (λ) and momentum (P) is P = h/λ, where h is Planck’s constant. The wave function for a particle in a box must be zero at the walls of the box (otherwise the particle might be found right at the wall). In one dimension, suitable wave functions are $\psi(x) \propto \sin(n_x \pi x/L)$ where 0 ≤ x ≤ L and $n_x$ is an integer. This amounts to fitting an integer number of half wavelengths into the box. (If you recall the Bohr model of the atom, the idea there is to fit an integral number of wavelengths in the electron’s orbit.) The momentum is $P_x = \pm n_x h/2L = \pm n_x \hbar\pi/L$. The ± sign on the momentum indicates that the wave function is a standing wave that is a superposition of travelling waves moving in both directions. The first three wave functions are shown in the figure.

In three dimensions, we have
$$\psi(x, y, z) \propto \sin(n_x \pi x/L)\,\sin(n_y \pi y/L)\,\sin(n_z \pi z/L)\,,$$
which corresponds to fitting standing waves in all three directions in the box. The momentum is
$$\mathbf{P} = \frac{\pi\hbar}{L}\,(\pm n_x, \pm n_y, \pm n_z)\,.$$
The energy is
$$E = \frac{P^2}{2m} = \frac{\pi^2\hbar^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right).$$
Now we are getting close to being able to count states. Consider a three dimensional coordinate system with axes $n_x$, $n_y$, $n_z$ (regarded as continuous variables). There is a state at each lattice point ($n_x$, $n_y$, $n_z$ integers) in the octant where all are non-negative. Because $\hbar$ is so small, and also because we usually deal with a large number of particles, we will be concerned with energies where the quantum numbers (the n’s) are large. How many states are there with energy < E? Answer:
$$N(<E) = \frac{1}{8}\,\frac{4\pi}{3}\left(\sqrt{\frac{E}{\pi^2\hbar^2/2mL^2}}\,\right)^{3} .$$
This is just the volume of an octant of a sphere with radius given by the square root above. It’s the number of states because each state (lattice point) occupies unit volume. For a large number of states, we don’t care about the small errors made at the surface of the sphere. The number of states with energy less than E is the integral of the density of states, n(E),
$$N(<E) = \int_0^E n(E')\,dE'\,,$$
where $E'$ is the dummy variable of integration. Differentiate both sides with respect to E using the previous result for N(< E),
$$n(E) = 2\pi\sqrt{E}\left(\frac{\sqrt{2m}\,L}{2\pi\hbar}\right)^{3} .$$
Recall that when we discussed the Maxwell distribution, we concluded that the density of states had to be proportional to $\sqrt{E}$ in order to give the Maxwell distribution. Sure enough that’s what we get. All the other factors get absorbed into the overall normalization constant.

It will be instructive to work out a numerical value for the number of states for a typical case. So let’s suppose that E = 3kT/2 where T = 273 K and the volume $L^3 = 22\times 10^3\ \mathrm{cm^3}$. That is, we consider an energy and volume corresponding to the average energy and molar volume of a gas at standard temperature and pressure. For m we’ll use a mass corresponding to an N₂ molecule. The result is about $4\times 10^{30}$ states, more than a million times Avogadro’s number.

We have been working out the states for one particle in a box. If we have more than one particle in the box, and they are non-interacting, then the same set of states is available to each particle. If the particles are weakly interacting, then these states are a first approximation to the actual states and we will usually just ignore the interactions when it comes to counting states. With this approximation, and with a mole of particles in the box, we’ve found that less than one in a million of the available states are occupied.
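The estimate of about 4 × 10³⁰ states can be reproduced directly from the formula for N(< E). The following is a sketch with rounded constants that we supply (they are not quoted in the notes), working in SI units; the function name is made up for the example.

```python
import math

HBAR = 1.0546e-34        # J s (rounded)
K_B = 1.3807e-23         # J/K (rounded)
AMU = 1.6605e-27         # kg (rounded)
N_AVOGADRO = 6.022e23

def states_below(E, m, L):
    """Number of single-particle states with energy < E in a cubical box of side L."""
    unit = math.pi ** 2 * HBAR ** 2 / (2.0 * m * L ** 2)   # energy of one unit of n^2
    radius = math.sqrt(E / unit)                            # radius in (nx, ny, nz) space
    return (1.0 / 8.0) * (4.0 * math.pi / 3.0) * radius ** 3

T = 273.0                                 # K
E = 1.5 * K_B * T                         # average translational energy
m_N2 = 28.0 * AMU                         # mass of an N2 molecule
L = (22.0e3 * 1.0e-6) ** (1.0 / 3.0)      # side of a 22 x 10^3 cm^3 box, in metres

n_states = states_below(E, m_N2, L)
print(f"N(<E) = {n_states:.2e}, states per molecule = {n_states / N_AVOGADRO:.2e}")
```

The ratio of states to molecules is the same "more than a million" factor that justifies ignoring multiple occupancy later on.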
Partition Function for a Single Particle in a Box

We can use the same states we’ve just discussed to evaluate the partition function for a single particle in a box. We have
$$Z(\tau) = \sum_{n_x, n_y, n_z} \exp\left[-\frac{\pi^2\hbar^2}{2mL^2\tau}\left(n_x^2 + n_y^2 + n_z^2\right)\right].$$
We will make a negligibly small error by converting the sums to integrals,
$$\begin{aligned}
Z(\tau) &= \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty dn_x\, dn_y\, dn_z\, \exp\left[-\frac{\pi^2\hbar^2}{2mL^2\tau}\left(n_x^2 + n_y^2 + n_z^2\right)\right], \\
&= \left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} L^3 \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty dx\, dy\, dz\, \exp(-x^2 - y^2 - z^2)\,, \quad \text{(rescaling variables)} \\
&= \frac{1}{8}\,4\pi\left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} L^3 \int_0^\infty dr\, r^2 e^{-r^2}\,, \quad \text{(changing to spherical coordinates and integrating over angles)} \\
&= \frac{1}{8}\,4\pi\left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} V\,\frac{1}{2}\,\Gamma\!\left(\frac{3}{2}\right), \\
&= \frac{V}{(2\pi\hbar^2/m\tau)^{3/2}}\,, \quad (\Gamma(3/2) = \sqrt{\pi}/2) \\
&= n_Q V\,.
\end{aligned}$$
The volume of the system is $V = L^3$ and the quantity that occurs in the last line, $n_Q$, has the dimensions of an inverse volume or a number per unit volume. mτ is the square of a typical momentum. So $\hbar^2/m\tau \sim \lambda^2$, and the volume associated with $n_Q$ is roughly a cube of the de Broglie wavelength. This is roughly the smallest volume in which you can confine the particle (given its energy) and still be consistent with the uncertainty principle. K&K call $n_Q$ the quantum concentration. A concentration is just a number per unit volume, and $n_Q$ can be thought of as the concentration that separates the classical (lower concentrations) and quantum (higher concentrations) domains. For a typical gas at STP, the actual concentration n = N/V is much less than the quantum concentration (by the same factor as the ratio of the number of states to the number of molecules we calculated earlier), so the gas can be treated classically.
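To put a number on the comparison between n and $n_Q$, here is a short sketch for the same N₂ gas at 273 K and a molar volume of 22 × 10³ cm³ used earlier. The constants are rounded values we supply and the function name is invented for the example; units are SI, so concentrations are per cubic metre.

```python
import math

HBAR = 1.0546e-34        # J s (rounded)
K_B = 1.3807e-23         # J/K (rounded)
AMU = 1.6605e-27         # kg (rounded)
N_AVOGADRO = 6.022e23

def quantum_concentration(m, tau):
    """n_Q = (m tau / 2 pi hbar^2)^(3/2), a number per unit volume."""
    return (m * tau / (2.0 * math.pi * HBAR ** 2)) ** 1.5

m_N2 = 28.0 * AMU
tau = K_B * 273.0
n_Q = quantum_concentration(m_N2, tau)    # per m^3
n = N_AVOGADRO / 22.0e-3                  # one mole in 22 x 10^3 cm^3 = 0.022 m^3

print(f"n_Q = {n_Q:.2e} m^-3, n = {n:.2e} m^-3, n_Q / n = {n_Q / n:.2e}")
```

The ratio comes out at a few million, the same order as the number of states per molecule found from N(< E), which is the sense in which the gas is safely classical.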
Partition Function for N Particles in a Box

If we have N non-interacting particles in our box, all with the same mass, then (see the homework) the partition function for the composite system is just the product of the partition functions for the individual systems,
$$Z_N(\tau) = Z_1(\tau)^N \qquad \text{(wrong!)}$$
where $Z_N$ is the N-particle partition function and $Z_1$ is the 1-particle partition function calculated in the previous section. Why is this wrong? Recall that the partition function is the sum of Boltzmann factors over all the states of the composite system. Writing $Z_N$ as a product includes terms corresponding to molecule A with energy $E_a$ and molecule B with energy $E_b$ and vice versa: molecule A with $E_b$ and molecule B with $E_a$. However, these are not different composite states if molecules A and B are indistinguishable! The product overcounts the composite states. Any given Boltzmann factor appears in the sum roughly N! times more than it should because there are roughly N! permutations of the molecules among the single particle states that give the same composite state.

Why “roughly?” Answer: if there are two or more particles in the same single particle state, then the correction for indistinguishability (what a word!) is not required. However, we’ve already seen that for typical low density gases, less than one in a million single particle states will be occupied, so it’s quite safe to ignore occupancies greater than 1. (If this becomes a bad approximation, other quantum effects enter as well, so we need to do something different, anyway!) To correct the product, we just divide by N!,
$$Z_N(\tau) = \frac{1}{N!}\,Z_1^N = \frac{1}{N!}\,(n_Q V)^N\,.$$
To find the energy, we use
$$\begin{aligned}
U &= \tau^2\,\frac{\partial \log Z_N}{\partial\tau}\,, \\
&= \tau^2\,\frac{\partial}{\partial\tau}\left(-\log N! + N\log n_Q + N\log V\right), \\
&= \tau^2\,\frac{\partial}{\partial\tau}\left(N\log n_Q\right), \quad \text{(derivatives of } \log N! \text{ and } \log V \text{ give 0)} \\
&= \tau^2\,\frac{\partial}{\partial\tau}\,N\log\left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2}, \\
&= \frac{3}{2}\,N\tau\,,
\end{aligned}$$
which expresses the energy of an ideal gas in terms of the temperature. We’ve obtained this result before, using the Maxwell distribution. Note that our correction for overcounting of the microstates does not appear in the result.

In lecture 6, we noted that
$$p = -\left(\frac{\partial U}{\partial V}\right)_{\sigma,N}\,,$$
and we remarked that keeping the entropy constant while changing the volume of a system means keeping the probability of each microstate constant. The average energy is
$$\langle E\rangle = \sum_s E_s\,P(E_s)\,,$$
and keeping the probabilities of the microstates constant means that $P(E_s)$ doesn’t change. Thus, changing the volume at constant entropy changes the energy through changes in energies of the individual states. For each single particle state,
$$E_s \propto \frac{1}{L^2} \propto V^{-2/3}\,,$$
which means
$$\frac{dU}{U} = \frac{dE_s}{E_s} = -\frac{2}{3}\,\frac{dV}{V}\,,$$
all at constant σ. Then the pressure is
$$p = -\left(\frac{\partial U}{\partial V}\right)_{\sigma,N} = \frac{2}{3}\,\frac{U}{V}\,.$$
Again, this is a result we’ve seen before.
Helmholtz Free Energy

Recall the expression for the conservation of energy,
$$dU = \tau\,d\sigma - p\,dV\,,$$
where we have omitted the chemical potential term, since we won’t be contemplating changing the number of particles at the moment. If we have a system whose temperature is fixed by thermal contact with a heat bath, it is convenient to eliminate the differential in the entropy in favor of a differential in the temperature. For this we use a Legendre transformation, exactly the same kind of transformation used in classical mechanics to go from the Lagrangian, a function of coordinates and velocities, to the Hamiltonian, a function of coordinates and momenta. Define the Helmholtz free energy by
$$F = U - \tau\sigma\,.$$
Not all authors use the symbol F for this quantity. I believe some use A and there may be others. In any case,
$$dF = dU - \tau\,d\sigma - \sigma\,d\tau = \tau\,d\sigma - p\,dV - \tau\,d\sigma - \sigma\,d\tau = -\sigma\,d\tau - p\,dV\,.$$
If a system is placed in contact with a heat bath and its volume is fixed, then its free energy is an extremum. As it turns out, the extremum is a minimum. To show this, we show that when the entropy of the system plus reservoir is a maximum (so equilibrium is established), the free energy is a minimum.
$$\begin{aligned}
\sigma &= \sigma_r(U - U_s) + \sigma_s(U_s)\,, \\
&= \sigma_r(U) - U_s\,(\partial\sigma_r/\partial U) + \cdots + \sigma_s(U_s)\,, \\
&= \sigma_r(U) - U_s/\tau + \sigma_s(U_s)\,, \\
&= \sigma_r(U) - \left(U_s - \tau\sigma_s(U_s)\right)/\tau\,, \\
&= \sigma_r(U) - F_s/\tau\,.
\end{aligned}$$
In the above, the subscripts r and s refer to the reservoir and system and U is the fixed total energy shared by the reservoir and system. Note that unlike our derivation of the Boltzmann factor, the system here need not be so small that it can be considered to be in a single state; it can be a macroscopic composite system. However, it should be much smaller than the reservoir so that $U_s \ll U$. Also, the partial derivatives above occur at fixed volume and particle number. Since $\sigma_r(U)$ is just a number, and τ is fixed, maximizing σ requires minimizing F.
From $dF = -\sigma\,d\tau - p\,dV$, we see
$$\sigma = -\left(\frac{\partial F}{\partial\tau}\right)_V \qquad \text{and} \qquad p = -\left(\frac{\partial F}{\partial V}\right)_\tau\,.$$
Substituting F = U − τσ in the right equation above,
$$p = -\left(\frac{\partial U}{\partial V}\right)_\tau + \tau\left(\frac{\partial\sigma}{\partial V}\right)_\tau\,.$$
This shows that at fixed temperature, if a system can lower its energy by expanding, then it generates a “force,” pressure, that will create an expansion. This is probably intuitive, since we are used to the idea that the equilibrium state is a minimum energy state. If the system can increase its entropy (at fixed temperature) by expanding, this too, generates a “force” to create an expansion.

Note that
$$\left(\frac{\partial\sigma}{\partial V}\right)_\tau = -\frac{\partial^2 F}{\partial V\,\partial\tau} = -\frac{\partial^2 F}{\partial\tau\,\partial V} = \left(\frac{\partial p}{\partial\tau}\right)_V\,.$$
The outer equality in this line is called a Maxwell relation. These occur often in thermodynamics and result from the fact that many thermodynamic parameters are first derivatives of the same thermodynamic “potential,” such as the free energy in this case.
The Free Energy and the Partition Function

Consider F = U − τσ and $\sigma = -(\partial F/\partial\tau)_V$. Putting these together, we have
$$\begin{aligned}
U &= F + \tau\sigma\,, \\
&= F - \tau\left(\frac{\partial F}{\partial\tau}\right)_V\,, \\
&= -\tau^2\,\frac{\partial (F/\tau)}{\partial\tau}\,.
\end{aligned}$$
Recall the expression for energy in terms of the partition function
$$U = \tau^2\,\frac{\partial\log Z}{\partial\tau}\,.$$
Comparing with the above, we see
$$\frac{F}{\tau} = -\log Z + C\,,$$
where C is a constant independent of τ. In fact, the constant must be zero in order to give the correct entropy as τ → 0. If τ is sufficiently small, only the lowest energy ($E_0$) state enters the partition function. If it occurs $g_0$ different ways, then $\log Z \to \log g_0 - E_0/\tau$ and $\sigma = -\partial F/\partial\tau \to \partial(\tau\log g_0 - E_0 - \tau C)/\partial\tau = \log g_0 - C$. So C = 0 in order that the entropy have the correct zero point. Then
$$F = -\tau\log Z \qquad \text{or} \qquad Z = e^{-F/\tau}\,.$$
Remembering that the Boltzmann factor is normalized by the partition function to yield a probability, we have
$$P(E_s) = \frac{e^{-E_s/\tau}}{Z} = e^{(F - E_s)/\tau}\,.$$

Just for fun, let’s apply some of these results using the partition function for the ideal gas we derived earlier.
$$\begin{aligned}
F &= -\tau\log Z\,, \\
&= -\tau\log\left((n_Q V)^N/N!\right), \\
&= -\tau\left(N\log n_Q + N\log V - N\log N + N\right) \quad \text{(Stirling’s approx.)}\,, \\
&= -\tau N\log\left[(m\tau/2\pi\hbar^2)^{3/2}\,(V/N)\right] - \tau N\,.
\end{aligned}$$
With p = −∂F/∂V, we have p = τN/V, the ideal gas law again. For the entropy, σ = −∂F/∂τ,
$$\begin{aligned}
\sigma &= N\log(n_Q V/N) + (3/2)N + N\,, \\
&= N\left[\log\frac{n_Q}{n} + \frac{5}{2}\right],
\end{aligned}$$
where n = N/V is the concentration. This is called the Sackur-Tetrode formula. Note that if one considers the change in entropy between two states of an ideal gas,
$$\sigma_f - \sigma_i = \frac{3}{2}\,N\log\frac{\tau_f}{\tau_i} + N\log\frac{V_f}{V_i}\,,$$
a classical result which doesn’t contain Planck’s constant. However, to set the zero point and get an “absolute” entropy, Planck’s constant does enter since it determines the spacing between states and their total number. The overcounting correction does not make any difference in the pressure above, but it does enter the entropy, as might have been expected. A final note is that these expressions for an ideal gas do not apply in the limit τ → 0. (Why?)
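Both statements about the Sackur-Tetrode formula can be illustrated numerically: the absolute entropy per particle depends on $\hbar$ through $n_Q$, while the difference between two states does not. The sketch below is our own illustration under assumptions: it uses rounded constants, the N₂ mass and molar volume from the earlier example (translational degrees of freedom only), and an arbitrary final state with doubled temperature and volume.

```python
import math

HBAR = 1.0546e-34        # J s (rounded)
K_B = 1.3807e-23         # J/K (rounded)
AMU = 1.6605e-27         # kg (rounded)
N_AVOGADRO = 6.022e23

def sackur_tetrode_per_particle(m, tau, n, hbar=HBAR):
    """sigma / N = log(n_Q / n) + 5/2 with n_Q = (m tau / 2 pi hbar^2)^(3/2)."""
    n_Q = (m * tau / (2.0 * math.pi * hbar ** 2)) ** 1.5
    return math.log(n_Q / n) + 2.5

def classical_change_per_particle(tau_i, V_i, tau_f, V_f):
    """(sigma_f - sigma_i) / N = (3/2) log(tau_f / tau_i) + log(V_f / V_i)."""
    return 1.5 * math.log(tau_f / tau_i) + math.log(V_f / V_i)

m = 28.0 * AMU
tau_i, V_i = K_B * 273.0, 0.022      # initial state: 273 K, 0.022 m^3
tau_f, V_f = K_B * 546.0, 0.044      # final state: both doubled (arbitrary choice)

sigma_i = sackur_tetrode_per_particle(m, tau_i, N_AVOGADRO / V_i)
sigma_f = sackur_tetrode_per_particle(m, tau_f, N_AVOGADRO / V_f)
delta_classical = classical_change_per_particle(tau_i, V_i, tau_f, V_f)

print(f"sigma_i / N = {sigma_i:.2f}")
print(f"difference  = {sigma_f - sigma_i:.4f}, classical formula = {delta_classical:.4f}")
```

Passing a different value for the hbar argument shifts both absolute entropies by the same amount and leaves the difference unchanged.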
Physics 301
Homework No. 2
Due 1-Oct-2002
H2-1
1. In lecture, we discussed the Maxwell velocity distribution for a low density gas,
$$p(v)\,dv = 4\pi\left(\frac{1}{2\pi\tau/m}\right)^{3/2} e^{-mv^2/2\tau}\, v^2\,dv\,.$$
(a) Determine the most probable speed (the speed at which the probability density is maximum), the average speed, $\langle v\rangle$, and the root mean square speed, $\sqrt{\langle v^2\rangle}$.
(b) Evaluate these quantities numerically for nitrogen molecules at room temperature (293 K).

2. Problem 1 was kind of plug in and grind. This one requires some thought. One way to examine molecular speeds and compare with the Maxwell distribution would be to heat up a gas in an oven, and let some gas escape to a vacuum through a small hole in the oven. To measure the speed distribution, one can look at the distribution of distance traveled in a fixed time. Determine the distribution of the speeds of molecules which make it through the hole. Assume that the Maxwell distribution applies to the molecules in the oven. Hint: all other things being equal, faster molecules hit the hole more often than slower molecules!

3. Consider the paramagnetic spin system we introduced in lecture 4, and discussed again in lecture 6. Determine the free energy (as a function of τ) for this system. From the free energy, deduce expressions for the energy and entropy. Of course, these expressions should agree with what’s been derived before by other techniques. This problem is essentially K&K, chapter 3, problem 1, except that it refers to our model for a paramagnetic system in which the two energy levels of a given magnet occur at −E and +E.

4. K&K, Chapter 3, Problem 9. Show that the partition function of two independent systems in thermal contact, but weakly interacting, is just the product of the partition functions of the two systems:
$$Z_{1\ \mathrm{and}\ 2}(\tau) = Z_1(\tau)\cdot Z_2(\tau)\,.$$

5. K&K, Chapter 3, Problem 7. It’s amazing that such a simple model might actually be relevant to something as complex as DNA!

6. K&K, Chapter 3, Problem 10. Note also, warming a gas makes it expand. Expanding a gas makes it cool. What happens to a rubber band when it is expanded? Your upper lip is a good temperature sensor. Place a rubber band in contact and stretch it. (Be careful, overdoing it could hurt!) Does it get hotter or colder?

7. K&K, Chapter 3, Problem 11.
8. In lecture 6, pages 1–3, we found expressions for probabilities that maximized the entropy. In one case, we found $p_i = e^{-1 - \lambda_1 - \lambda_2 E_i}$, where $E_i$ is the energy of state i and $\lambda_1$ and $\lambda_2$ are Lagrange multipliers. Show that $\lambda_2 = 1/\tau$. Hint: if we make changes $\delta p_i$ to the probabilities, what must $\delta p_i$ satisfy? What are δσ and δU?
Physics 301
6-Oct-2002
Physics 301 Problem Set 2 Solutions

Problem 1. The Maxwell velocity distribution is
$$p(v)\,dv = 4\pi\left(\frac{1}{2\pi\tau/m}\right)^{3/2} e^{-mv^2/2\tau}\, v^2\,dv\,. \tag{1}$$
(a) The most probable speed is the speed at which p(v) has a maximum, dp/dv = 0, which gives:
$$\left(2v - \frac{m}{2\tau}\,2v^3\right)e^{-mv^2/2\tau} = 0 \quad\Rightarrow\quad v_{mp} = \sqrt{\frac{2\tau}{m}} \tag{2}$$
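(v = 0 is a minimum). The average speed is given by:
$$\langle v\rangle = 4\pi\left(\frac{1}{2\pi\tau/m}\right)^{3/2}\int_0^\infty v^3 e^{-mv^2/2\tau}\,dv = \sqrt{\frac{8\tau}{\pi m}}\int_0^\infty x\,e^{-x}\,dx = \sqrt{\frac{8\tau}{\pi m}}\,\Gamma(2) = \sqrt{\frac{8\tau}{\pi m}}. \tag{3}$$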
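The mean of the square speed is given by:
$$\langle v^2\rangle = 4\pi\left(\frac{1}{2\pi\tau/m}\right)^{3/2}\int_0^\infty v^4 e^{-mv^2/2\tau}\,dv = \frac{4\tau}{\sqrt{\pi}\,m}\,\Gamma\!\left(\tfrac52\right) = \frac{4\tau}{\sqrt{\pi}\,m}\cdot\frac34\,\Gamma\!\left(\tfrac12\right) = \frac{3\tau}{m}, \tag{4}$$
which gives the rms speed to be $v_{rms} = \sqrt{3\tau/m}$.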
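The three closed forms from part (a) are easy to evaluate numerically. The following is a minimal sketch, assuming SI units and modern values of the physical constants; the function name `maxwell_speeds` is an illustrative choice and not part of the original solutions.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro's number, 1/mol

def maxwell_speeds(molar_mass_g, temperature_k):
    """Return (v_mp, v_avg, v_rms) in m/s from Eqs. (2)-(4)."""
    m = molar_mass_g * 1e-3 / N_A       # mass of one molecule, kg
    tau = K_B * temperature_k           # fundamental temperature tau = kT, J
    v_mp = math.sqrt(2 * tau / m)
    v_avg = math.sqrt(8 * tau / (math.pi * m))
    v_rms = math.sqrt(3 * tau / m)
    return v_mp, v_avg, v_rms

# Nitrogen (molar mass 28 g/mol) at 293 K
v_mp, v_avg, v_rms = maxwell_speeds(28.0, 293.0)
print(f"v_mp = {v_mp:.0f} m/s, v_avg = {v_avg:.0f} m/s, v_rms = {v_rms:.0f} m/s")
```

The ordering v_mp < v_avg < v_rms holds at any temperature, since the three speeds differ only by the fixed factors √2, √(8/π) and √3.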
(b) For nitrogen molecules at room temperature (T = 293 K), the mass of a molecule is $m = 28/(6.023\times10^{23})\ \mathrm{g} = 4.65\times10^{-23}\ \mathrm{g}$. Then $\sqrt{kT/m} \approx 2.9\times10^{4}$ cm/s, which gives the various speeds as:
$v_{mp} \approx 417$ m/s, $v_{avg} \approx 471$ m/s, $v_{rms} \approx 511$ m/s.
Problem 2. Let's assume a small circular hole in a wall of the oven. Let's look at the molecules hitting the hole at a given angle. In unit time, a molecule with speed v travels a distance v, which means that in unit time, the number of molecules with speed between v and v + dv hitting the hole at a given angle is proportional to v p(v) dv, where the proportionality constant depends on the angle. We can now integrate over the angle and find that the total number of molecules going through the hole with speeds between v and v + dv is also proportional to v p(v) dv. The actual number depends on the density of molecules in the container and the specifics of the hole. To find the distribution of speeds, we should normalize the probability to unity:
$$\int_0^\infty e^{-mv^2/2\tau}\,v^3\,dv = 2\left(\frac{\tau}{m}\right)^2, \tag{5}$$
which gives us the normalized probability distribution to be
$$p_{out}(v)\,dv = \frac{1}{2(\tau/m)^2}\,v^3 e^{-mv^2/2\tau}\,dv. \tag{6}$$
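As a sanity check on Eq. (6), a small numerical sketch can confirm the normalization and compare the mean square speed of the escaping molecules with that of the gas inside the oven. The units (τ/m = 1), the cutoff `V_MAX` and the helper `integrate` are assumptions made for illustration only.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

tau_over_m = 1.0  # work in units where tau/m = 1 (an arbitrary choice)

def p_oven(v):
    """Maxwell speed density inside the oven, Eq. (1)."""
    return (4 * math.pi * (2 * math.pi * tau_over_m) ** -1.5
            * v**2 * math.exp(-v**2 / (2 * tau_over_m)))

def p_out(v):
    """Speed density of molecules leaving through the hole, Eq. (6)."""
    return v**3 * math.exp(-v**2 / (2 * tau_over_m)) / (2 * tau_over_m**2)

V_MAX = 20.0  # upper cutoff; the integrands are negligible beyond this
norm_out = integrate(p_out, 0.0, V_MAX)
mean_v2_oven = integrate(lambda v: v**2 * p_oven(v), 0.0, V_MAX)
mean_v2_out = integrate(lambda v: v**2 * p_out(v), 0.0, V_MAX)
print(norm_out, mean_v2_oven, mean_v2_out)
```

In these units the mean square speed is 3 inside the oven and 4 in the beam, so the escaping molecules carry a mean kinetic energy of 2τ rather than 3τ/2.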
Note the dependence on temperature in this expression. More interestingly, note that if we collect all these molecules in a box and let them attain thermal equilibrium, the temperature of this box will be higher than that of the original box, because the escaping molecules include relatively more fast movers than the gas in the original box.
Problem 3. There are two energy levels for each spin, ∓E, for spins parallel and antiparallel to the external magnetic field. Here E = mB, where m is the magnetic moment and B is the external magnetic field. The partition function for such a system of N independent spins is given by:
$$Z(\tau) = \left(e^{-E/\tau} + e^{E/\tau}\right)^N. \tag{7}$$
The free energy is therefore:
$$F = -\tau\log Z = -N\tau\log\left(e^{-E/\tau} + e^{E/\tau}\right) = -N\tau\log\left(2\cosh(E/\tau)\right). \tag{8}$$
From this expression, we get the entropy
$$\sigma = -\frac{\partial F}{\partial\tau} = N\log\left(2\cosh(E/\tau)\right) - N\tau\,\frac{\sinh(E/\tau)}{\cosh(E/\tau)}\,\frac{E}{\tau^2} = N\log\left(2\cosh(E/\tau)\right) - \frac{NE}{\tau}\tanh(E/\tau), \tag{9}$$
and the energy
$$U = F + \tau\sigma = -NE\tanh(E/\tau). \tag{10}$$
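Equations (8) to (10) can be cross-checked numerically. The sketch below, with arbitrary illustrative values for N, E and τ, differentiates F by a central difference and compares the result with the closed-form entropy; the function names are hypothetical.

```python
import math

def free_energy(n_spins, energy, tau):
    """F = -N tau log(2 cosh(E/tau)), Eq. (8)."""
    return -n_spins * tau * math.log(2 * math.cosh(energy / tau))

def entropy(n_spins, energy, tau):
    """sigma = N log(2 cosh(E/tau)) - (N E / tau) tanh(E/tau), Eq. (9)."""
    x = energy / tau
    return n_spins * (math.log(2 * math.cosh(x)) - x * math.tanh(x))

def internal_energy(n_spins, energy, tau):
    """U = -N E tanh(E/tau), Eq. (10)."""
    return -n_spins * energy * math.tanh(energy / tau)

N, E, TAU = 100, 1.0, 0.7   # arbitrary illustrative values
h = 1e-6                    # step for the central difference
sigma_numeric = -(free_energy(N, E, TAU + h) - free_energy(N, E, TAU - h)) / (2 * h)
print(sigma_numeric, entropy(N, E, TAU))
```

At high temperature the entropy approaches N log 2, as expected for N spins with two equally likely orientations.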
Problem 4. The assumption that the two systems are in thermal contact but weakly interacting means two things: firstly, there is a common temperature τ; and secondly, a general state of the combined system is a pair of states, one in each of the systems, and the energy of such a combined state is given by the sum of the energies of the two individual states. (A formal way to say this would be that the state space of the combined system is a direct product of the state spaces of the two systems; the wavefunction of a state in the combined system would be a product of the two wavefunctions.) If the energies of the states of the two systems are labelled by $\{E_i^{(1)}\}$ and $\{E_j^{(2)}\}$, the partition sum of the combined system is given by:
$$Z_{1+2}(\tau) = \sum_{i,j}\exp\left[-\left(E_i^{(1)} + E_j^{(2)}\right)/\tau\right] = \sum_i\sum_j \exp\left[-E_i^{(1)}/\tau\right]\exp\left[-E_j^{(2)}/\tau\right] = \sum_i \exp\left[-E_i^{(1)}/\tau\right]\sum_j \exp\left[-E_j^{(2)}/\tau\right] = Z_1(\tau)\,Z_2(\tau). \tag{11}$$
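The factorization in Eq. (11) can be illustrated with two small made-up spectra. In this sketch the energy levels and the temperature are hypothetical values chosen only to exercise the identity.

```python
import math
from itertools import product

def partition_function(energies, tau):
    """Sum of Boltzmann factors over a list of state energies."""
    return sum(math.exp(-e / tau) for e in energies)

levels_1 = [0.0, 0.3, 1.1]        # hypothetical spectrum of system 1
levels_2 = [0.0, 0.5, 0.9, 2.0]   # hypothetical spectrum of system 2
tau = 0.8

# States of the combined system are pairs (i, j) with energy E_i + E_j.
combined = [e1 + e2 for e1, e2 in product(levels_1, levels_2)]
z_combined = partition_function(combined, tau)
z_product = partition_function(levels_1, tau) * partition_function(levels_2, tau)
print(z_combined, z_product)
```

The two numbers agree to rounding error; the identity would fail if an interaction energy depending on both i and j were added to the pair energies.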
Problem 5. (a) The fact that the zipper can only open from the left, and that you can open a link only if all the links to its left are already open, means that the energy levels of the system are given by $\{k\epsilon\}$, k = 0, 1, ..., N, where ε is the energy required to open a link. The partition sum can then be written as:
$$Z = \sum_{k=0}^{N}\exp(-k\epsilon/\tau) = \frac{1-\exp[-(N+1)\epsilon/\tau]}{1-\exp[-\epsilon/\tau]}. \tag{12}$$
(b) For low temperatures ($\epsilon \gg \tau$), $\exp[-\epsilon/\tau] \ll 1$, and
$$\log Z = \log\left(1-\exp[-(N+1)\epsilon/\tau]\right) - \log\left(1-\exp[-\epsilon/\tau]\right) \approx -\exp[-(N+1)\epsilon/\tau] + \exp[-\epsilon/\tau] \approx \exp[-\epsilon/\tau]. \tag{13}$$
The average number of open links is:
$$\langle k\rangle = \frac{\langle U\rangle}{\epsilon} = \frac{1}{\epsilon}\,\tau^2\frac{\partial\log Z}{\partial\tau} \approx \frac{1}{\epsilon}\,\tau^2\,\frac{\epsilon}{\tau^2}\exp[-\epsilon/\tau] = \exp[-\epsilon/\tau]. \tag{14}$$
Note that at very low temperatures, there isn't enough thermal energy to excite even one link.
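The closed form (12) and the low-temperature result (14) can be compared against a direct sum over the N + 1 zipper states. This is a sketch with illustrative parameter values; the function names are not from the original.

```python
import math

def zipper_z_sum(n_links, eps, tau):
    """Partition sum over states with k = 0..N open links, energy k*eps."""
    return sum(math.exp(-k * eps / tau) for k in range(n_links + 1))

def zipper_z_closed(n_links, eps, tau):
    """Geometric-series closed form, Eq. (12)."""
    x = math.exp(-eps / tau)
    return (1 - x ** (n_links + 1)) / (1 - x)

def mean_open_links(n_links, eps, tau):
    """Exact thermal average of k, computed directly from the Boltzmann weights."""
    weights = [math.exp(-k * eps / tau) for k in range(n_links + 1)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)

N_LINKS, EPS = 50, 1.0          # illustrative values
exact = mean_open_links(N_LINKS, EPS, 0.1)
approx = math.exp(-EPS / 0.1)   # low-temperature estimate, Eq. (14)
print(exact, approx)
```

At τ = 0.1ε the exact average and the estimate exp(−ε/τ) differ only by a relative correction of order exp(−ε/τ) itself.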
Problem 6. (a) If there are $N_1$ links directed to the right and $N_2$ to the left, then the length of the chain will be $(N_1 - N_2)\rho = 2s\rho$. The total number of links is $N_1 + N_2 = N$. This gives $N_1 = N/2 + s$, $N_2 = N/2 - s$. The number of ways you can form a chain of length 2sρ is therefore the number of ways you can choose N/2 + s out of N. If we care only about the length of the chain, $l = 2|s|\rho$, and not whether it is pointing to the left or right, we have to multiply this by two for the choice of sign of s. This gives the total number of choices as
$$G = g(N, s) + g(N, -s) = \frac{2\,N!}{(\tfrac12 N + s)!\,(\tfrac12 N - s)!}. \tag{15}$$
(b) The entropy of such a chain (using the Stirling approximation for large N and Taylor expanding the logarithms, $\log(1+x) \approx x - x^2/2$, for small |s|/N) is
$$\begin{aligned}
\sigma = \log G &\approx \log 2 + N\log N - N - (\tfrac12 N + s)\log(\tfrac12 N + s) - (\tfrac12 N - s)\log(\tfrac12 N - s) + (\tfrac12 N + s) + (\tfrac12 N - s)\\
&= \log 2 + N\log N - \tfrac12 N\log(\tfrac12 N + s) - \tfrac12 N\log(\tfrac12 N - s) - s\log\frac{N/2 + s}{N/2 - s}\\
&= \log 2 + N\log N - N\log\tfrac12 N - \tfrac12 N\log(1 + 2s/N) - \tfrac12 N\log(1 - 2s/N) - s\log\frac{1 + 2s/N}{1 - 2s/N}\\
&\approx \log 2g(N, 0) - \tfrac12 N\left[\frac{2s}{N} - \frac{2s^2}{N^2}\right] - \tfrac12 N\left[-\frac{2s}{N} - \frac{2s^2}{N^2}\right] - s\left[\frac{4s}{N}\right]\\
&= \log 2g(N, 0) - \frac{2s^2}{N} = \log 2g(N, 0) - \frac{l^2}{2N\rho^2},
\end{aligned} \tag{16}$$
where, to the same accuracy, $\log 2 + N\log N - N\log\tfrac12 N = \log 2 + N\log 2 \approx \log 2g(N, 0)$.
(c) From the thermodynamic identity, it follows that the force is given by $f = -\tau(\partial\sigma/\partial l)$ (in one dimension, the force is the analog of the pressure). This means that the force is
$$f = \frac{l\tau}{N\rho^2}. \tag{17}$$
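How good is the Gaussian form (16)? The following sketch compares it with the exact logarithm of Eq. (15), evaluated through the log-gamma function. It assumes N is even, and the values of N and s are chosen only for illustration.

```python
import math

def log_g_exact(n_links, s):
    """log of 2 N! / ((N/2 + s)! (N/2 - s)!), Eq. (15); N is assumed even."""
    half = n_links // 2
    return (math.log(2) + math.lgamma(n_links + 1)
            - math.lgamma(half + s + 1) - math.lgamma(half - s + 1))

def log_g_gaussian(n_links, s):
    """Eq. (16): log 2g(N,0) - 2 s^2 / N, with log 2g(N,0) evaluated exactly."""
    return log_g_exact(n_links, 0) - 2 * s**2 / n_links

N_LINKS, S = 10_000, 100    # illustrative values with |s| << N
difference = log_g_exact(N_LINKS, S) - log_g_gaussian(N_LINKS, S)
print(difference)
```

For these values the Gaussian term 2s²/N equals 2, while the neglected corrections are of order s⁴/N³ and s²/N², i.e. about 10⁻⁴.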
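Problem 7. For one particle on a line of length L, the energy eigenvalues are $E_n = \hbar^2\pi^2 n^2/2ML^2$. The partition function is then (replacing the sum over n by an integration):
$$Z = \int_0^\infty dn\,\exp\left(-\frac{\hbar^2\pi^2}{2ML^2\tau}\,n^2\right) = \frac{\sqrt{\pi}}{2}\sqrt{\frac{2ML^2\tau}{\hbar^2\pi^2}} = \frac{L}{(2\pi\hbar^2/\tau M)^{1/2}} \equiv \frac{L}{L_0(\tau)}. \tag{18}$$
For N particles, the partition function is $Z_N = Z^N/N!$. This gives the entropy (for large N) to be
$$\begin{aligned}
\sigma &= \log Z_N + \tau\frac{\partial}{\partial\tau}\log Z_N\\
&= N\log Z - N\log N + N + \tau N\frac{\partial}{\partial\tau}\log Z\\
&= N\log\left(\frac{L}{L_0}\right) - N\log N + N + \tau N\,\frac{1}{2\tau}\\
&= N\log\left(\frac{L}{L_0 N}\right) + \frac{3}{2}N.
\end{aligned} \tag{19}$$
Problem 8. We know that $p_i = \exp(-1 - \lambda_1 - \lambda_2 E_i)$. The probabilities satisfy the constraint:
$$\sum_i p_i = 1. \tag{20}$$
The energy and entropy are given by:
$$U = \sum_i p_i E_i, \qquad \sigma = -\sum_i p_i\log p_i. \tag{21}$$
If we make changes $\delta p_i$ to the probabilities of the states (and not the energies), then the changes are also constrained by $\sum_i \delta p_i = 0$. The changes in the energy and entropy are given by:
$$\begin{aligned}
\delta U &= \sum_i \delta p_i\,E_i,\\
\delta\sigma &= -\sum_i \delta p_i\log p_i - \sum_i p_i\,\frac{1}{p_i}\,\delta p_i\\
&= \sum_i \delta p_i\,(1 + \lambda_1 + \lambda_2 E_i) - \sum_i \delta p_i\\
&= \lambda_2\sum_i \delta p_i\,E_i = \lambda_2\,\delta U.
\end{aligned} \tag{22}$$
We see then that $\lambda_2 = \delta\sigma/\delta U = 1/\tau$.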
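The result of Problem 8 can be checked on a toy system: starting from Boltzmann probabilities at a known τ, shift a little probability between two states (keeping the sum fixed) and compare δσ/δU with 1/τ. The energy levels, the temperature and the size of the shift are hypothetical values used only for this sketch.

```python
import math

def boltzmann(energies, tau):
    """Normalized Boltzmann probabilities for the given energy levels."""
    weights = [math.exp(-e / tau) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs)

def mean_energy(probs, energies):
    return sum(p * e for p, e in zip(probs, energies))

energies = [0.0, 1.0, 2.5, 4.0]   # hypothetical energy levels
tau = 1.3
p = boltzmann(energies, tau)

# A small change that keeps sum(p) = 1: move probability from state 0 to state 2.
delta = 1e-6
p_new = list(p)
p_new[0] -= delta
p_new[2] += delta

d_sigma = entropy(p_new) - entropy(p)
d_u = mean_energy(p_new, energies) - mean_energy(p, energies)
ratio = d_sigma / d_u
print(ratio, 1 / tau)
```

Any other pair of states gives the same ratio to first order in the shift, which is the content of Eq. (22).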
Week 3. Harmonic Oscillator, Classical/Quantum Cavity Radiation, Oscillator Applications
Physics 301
30-Sep-2002 8-1
Reading
This week, we'll concentrate on the material in K&K chapter 4. This might be called the thermodynamics of oscillators.
Classical Statistical Mechanics
Recall that statistical mechanics was developed before quantum mechanics. In our discussions, we've made use of the fact that quantum mechanics allows us to speak of discrete states (sometimes we have to put our system in a box of volume V), so it makes sense to talk about the number of states available to a system, to define the entropy as the logarithm of the number of states, and to speak of maximizing the number of states (entropy) available to the system. If one didn't know about quantum mechanics and didn't know about discrete states, how would one do statistical mechanics? Answer: in classical statistical mechanics, the phase space volume plays the role of the number of states. We've mentioned phase space briefly. Here's a slightly more detailed description. In classical mechanics, one has the Lagrangian, $L(q, \dot q, t)$, which is a function of generalized coordinates q, their velocities, $\dot q$, and possibly the time, t. The equations of motion are Lagrange's equations
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q} - \frac{\partial L}{\partial q} = 0.$$
Note that q might be a single variable or it might stand for a vector of coordinates. In the latter case, there is one equation of motion for each coordinate. The Hamiltonian is defined by a Legendre transformation,
$$H(q, p, t) = p\,\dot q - L(q, \dot q, t),$$
where
$$p = \frac{\partial L}{\partial\dot q}$$
is called the momentum conjugate to q (or the canonical momentum). The equations of motion become (assuming neither L nor H is an explicit function of time)
$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q},$$
so that each second order equation of motion has been replaced by a pair of first order equations of motion. If p and q are given for a particle at some initial time, then the time development of p and q is determined by the equations of motion. If we consider a single pair of conjugate
coordinates, p and q (i.e., a one-dimensional system), and we consider a space with axes q and p, then a point in this space represents a state of the system. The equations of motion determine a trajectory (or orbit) in this space that the system follows. The q-p space is called phase space. If we consider a 3-dimensional particle, then three coordinates and three momenta are required to describe the particle. Phase space becomes 6-dimensional and is a challenge to draw. If we consider N 3-dimensional particles, then phase space becomes 6N-dimensional. Or, one might draw N trajectories in a 6-dimensional space. As an example of a phase space that we might actually be able to draw, consider two 1-dimensional particles moving along a common line. Suppose they are essentially free particles. The phase space coordinates are q1 , p1 , q2 , and p2 . (Subscripts refer to particles 1 and 2.) The figure shows an attempt at drawing a trajectory in the 4-dimensional phase
space. Since we have free particles, p1 and p2 are constants and q1 and q2 are linear functions of time, for example, q1 = p1 t/m1 . The figure shows a trajectory for q1 and for q2 . As shown, q1 has a positive momentum, so its trajectory is from left to right, while q2 has a negative momentum, so its trajectory is from right to left. Each point on the trajectory of q1 corresponds to exactly one point on the trajectory of q2 —the points are labeled by time and points at the same time are corresponding points. If we could draw in four dimensions, there would be a single line representing both particles and we would not have to point out this correspondence. Note that at some time, both particles are at the same physical place in space and simply pass through each other as we’ve drawn the trajectories above. Instead of passing through each other, suppose they have a collision and “bounce backwards.” This might be represented by the diagram shown in the next figure. This has been drawn assuming equal
masses, equal and opposite momenta, and an elastic collision. I’m sure you can work out diagrams for other cases. Suppose we are considering a low density gas (again!). We certainly would not want to try to draw a phase space for all the particles in the gas and we certainly wouldn’t want to try to draw all the trajectories including collisions. In the two particle case we’ve been considering, suppose we blinked while the collision occurred. What would we see? The answer (for a suitable blink) is shown in the next figure. We’d see particles 1 and 2
moving along as free particles before we blinked and again after we blinked, but while we blinked, they changed their momenta. We've already mentioned that in a low density gas, the molecules travel several molecular diameters between collisions while collisions occur only when molecules are within a few molecular diameters of each other. One way to treat a low density gas is to treat the molecules as free particles and to try to add in something to account for the collisions. By looking at the drawing of the collision (where we blinked), we can see that one way is to say that the particles follow phase space trajectories for a free particle, except every now and then a trajectory ends and reappears, at random, somewhere else. The disappearance and reappearance of phase space trajectories does not really happen; it's an approximate way to treat collisions. All this is motivation for the idea that collisions randomize the distribution of particles in phase space. Of course the randomization must be consistent with whatever constraints are placed on the system (such as fixed total energy, etc.). In general, if a system is in thermal contact with another system, we would expect that the exchanges of energy,
required for thermal equilibrium, would result in randomization of the phase space location. The classical statistical mechanics analog of our postulate that all accessible states are equally probable is the postulate that all accessible regions of phase space are equally probable. In other words, a point in phase space plays the role of a state. The leveling of the probabilities is of course accomplished by the collisions and energy transfers we've just been discussing. It shouldn't be too hard to convince yourself that any concept we've discussed that doesn't explicitly require Planck's constant can just as easily be done with classical statistical mechanics as with quantum statistical mechanics. Even in cases where we used $\hbar$, if there is a reasonable mapping of quantum states to phase space volume, the classical treatment will give the same results as the quantum treatment (but of course, lacking an $\hbar$). As an example, suppose we consider a single free particle in a box of volume V in thermal contact with a reservoir at temperature τ. Our derivation of the Boltzmann factor did not depend on quantum mechanics, so the probability of finding this particle in a state with energy E is proportional to exp(−E/τ), just as before. The partition function is no longer a sum over states, but an integral over phase space volume,
$$Z_C = \int_{-L/2}^{+L/2}dx\int_{-L/2}^{+L/2}dy\int_{-L/2}^{+L/2}dz\int_{-\infty}^{+\infty}dp_x\int_{-\infty}^{+\infty}dp_y\int_{-\infty}^{+\infty}dp_z\,\exp\left[-(p_x^2 + p_y^2 + p_z^2)/2m\tau\right],$$
where $Z_C$ stands for the classical partition function, and the volume is taken to be a cube of side L for convenience. The integrals over the coordinates give V and each integral over a momentum gives $\sqrt{2\pi m\tau}$. The result is
$$Z_C = V(2\pi m\tau)^{3/2}.$$
Recall our previous result for the free particle partition function,
$$Z_Q = \left(\frac{1}{h}\right)^3 V(2\pi m\tau)^{3/2},$$
where the subscript Q indicates the "quantum" partition function. Note that the expression includes h, not $\hbar$. So, in this case, the classical and quantum partition functions are the same except for a factor of $h^{-3}$. Mostly, we use the logarithm of the partition function. This means that many results that we derive from the partition function will not depend on whether we use $Z_C$ or $Z_Q$. For example, the energy is $\tau^2\,\partial(\log Z)/\partial\tau$, so the factor of $h^{-3}$ has no effect on the energy. An important exception is the entropy. The classical entropy is missing an additive constant. This has no effect on relative entropy, but it does matter for absolute entropy. (How would you measure absolute entropy?) By comparing the two expressions one sees that for each pair of conjugate phase space coordinates, such as x and $p_x$, one should assign the volume h to a single state. Using classical considerations, we can (at least for a low density gas) reproduce the quantum results simply by using
$$\frac{dx\,dp_x}{h}\,\frac{dy\,dp_y}{h}\,\frac{dz\,dp_z}{h}$$
as the appropriate volume in phase space. This works in general provided the average occupancy is very low. We see that the Maxwell velocity distribution falls out of the classical approach, and the $h^{-3}$, even if included, would get erased in the normalization factor for the probability density.
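The claim that the size of the phase space cell drops out of the energy can be illustrated numerically. In the sketch below, `cell` is a hypothetical parameter standing for the phase space volume assigned to one state per conjugate pair (1 for the classical result, h for the quantum one); the units for V, m and τ are arbitrary.

```python
import math

def log_z(tau, volume=1.0, mass=1.0, cell=1.0):
    """log of V (2 pi m tau)^{3/2} / cell^3; cell = 1 gives Z_C, cell = h gives Z_Q."""
    return (math.log(volume) + 1.5 * math.log(2 * math.pi * mass * tau)
            - 3 * math.log(cell))

def energy(tau, cell, step=1e-6):
    """U = tau^2 d(log Z)/d(tau), estimated with a central difference."""
    dlogz = (log_z(tau + step, cell=cell) - log_z(tau - step, cell=cell)) / (2 * step)
    return tau**2 * dlogz

u_classical = energy(2.0, cell=1.0)
u_quantum = energy(2.0, cell=6.626e-34)   # cell = h, in SI-like units
print(u_classical, u_quantum)
```

Both values come out as 3τ/2, while log Z itself (and hence the entropy) differs by the additive constant −3 log h.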
A Classical Harmonic Oscillator
Now suppose we have a one dimensional harmonic oscillator, whose Hamiltonian is
$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right),$$
where $\omega = \sqrt{k/m}$ is the natural frequency of the oscillator. Suppose this oscillator is in thermal equilibrium at temperature τ. What is the mean value of its energy? One way we can work this out is to take the Boltzmann factor as the probability density in phase space. So
$$P(E)\,dq\,dp = C\exp\left[-(p^2 + m^2\omega^2 q^2)/2m\tau\right]dq\,dp,$$
where uppercase P is used for probability density to distinguish it from momentum. The normalization constant, C, is set by requiring that the integral of the probability density over phase space be unity. The position and momentum integrals can be done separately and lead to normalization factors $m\omega/\sqrt{2\pi m\tau}$ for the position coordinate and $1/\sqrt{2\pi m\tau}$ for the momentum coordinate. To get the average energy of this oscillator, we have
$$\begin{aligned}
\langle E\rangle &= C\int_{-\infty}^{+\infty}dq\int_{-\infty}^{+\infty}dp\,\frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right)\exp\left[-(p^2 + m^2\omega^2 q^2)/2m\tau\right],\\
&= \frac{m\omega}{\sqrt{2\pi m\tau}}\int_{-\infty}^{+\infty}\frac{m^2\omega^2 q^2}{2m}\,e^{-m^2\omega^2 q^2/2m\tau}\,dq + \frac{1}{\sqrt{2\pi m\tau}}\int_{-\infty}^{+\infty}\frac{p^2}{2m}\,e^{-p^2/2m\tau}\,dp,\\
&= \frac{\tau}{2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}dx\,x^2\exp(-x^2/2) + \frac{\tau}{2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}dy\,y^2\exp(-y^2/2),\\
&= \frac{\tau}{2} + \frac{\tau}{2} = \tau.
\end{aligned}$$
Of course, we could obtain the same result by calculating the partition function and going from there. Note that the harmonic oscillator has two ways to store energy: as kinetic energy or as potential energy. Each of these can be considered a degree of freedom and each stores, on the average, τ /2 = kT /2. This is an example related to equipartition of energy discussed by K&K in chapter 3.
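The equipartition result can also be seen by sampling. Since the Boltzmann factor is Gaussian in both q and p, one can draw phase space points directly and average the two energy terms. This is a Monte Carlo sketch under assumed values of m, ω and τ; the function name, sample size and seed are illustrative choices.

```python
import math
import random

def sample_oscillator_energy(tau, mass=1.0, omega=2.0, n_samples=200_000, seed=0):
    """Average kinetic and potential energy over Boltzmann-distributed (q, p)."""
    rng = random.Random(seed)
    sigma_q = math.sqrt(tau / (mass * omega**2))   # from exp(-m w^2 q^2 / 2 tau)
    sigma_p = math.sqrt(mass * tau)                # from exp(-p^2 / 2 m tau)
    kinetic = potential = 0.0
    for _ in range(n_samples):
        q = rng.gauss(0.0, sigma_q)
        p = rng.gauss(0.0, sigma_p)
        kinetic += p * p / (2 * mass)
        potential += 0.5 * mass * omega**2 * q * q
    return kinetic / n_samples, potential / n_samples

mean_kinetic, mean_potential = sample_oscillator_energy(tau=1.5)
print(mean_kinetic, mean_potential)
```

Each average should come out close to τ/2 = 0.75, independent of the values chosen for m and ω, with a statistical scatter of a few parts in a thousand for this sample size.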
Classical Cavity Radiation We’re familiar with the idea that hot objects radiate energy. There are the expressions “red hot” and “white hot” denoting very hot objects. The color comes from the appearance of the objects and is the color of the electromagnetic energy radiated by the object. Allowing two objects to exchange radiation is a way to place them in thermal contact. For ordinary temperatures this may not be a very efficient method of heat exchange compared to conduction or convection, but at high temperatures (in stars, for example) it can become the dominant method of energy transfer. Also, when working at cryogenic temperatures, one needs to shield the experiment from direct exposure to room temperature radiation because this can be an important heat load on the cold apparatus. How can we make a perfect absorber of radiant energy? If we could, what would it look like? If it absorbed all the radiation that hit it, then nothing would be reflected back, so we couldn’t see anything and it would appear black. A perfect absorber is called a blackbody. We could make a perfect absorber by making a large cavity with a small hole connecting the cavity to the outside world. Then, as seen from the outside, any radiation hitting the hole, passes through the hole and bounces around inside the cavity until it is absorbed. By making the cavity sufficiently big and the hole sufficiently small, we can make the chances of the radiation coming back out the hole before it’s absorbed as small as we like. (Of course, when the wavelength of the radiation is comparable to or larger than the size of the hole, then we have to worry about diffraction...) A cavity containing a hole must also radiate energy. If not, it would heat up to arbitrarily high temperatures (and of course, this violates the second law of thermodynamics by transferring heat from a cold object to a hot object with no other change). 
So when a cavity is in thermal equilibrium with its surroundings, it must radiate energy out through the hole at the same rate that energy is absorbed through the hole. A hole has no properties, so the radiated spectrum (energy at each frequency or wavelength or color) can only depend on the temperature. This spectrum is called the blackbody (or thermal or cavity) radiation spectrum. A real physical object which is a perfect absorber must radiate the same spectrum. We can place a physically black object into thermal contact with a cavity radiator. In order to avoid violating the second law, the energy absorbed must equal the energy radiated. If we consider a filter which is perfectly transparent in some frequency range and perfectly reflecting outside this range and we insert this filter between the two objects, then we conclude that the perfect absorber and the cavity radiator must radiate the same spectrum (the same amount of energy at each frequency). Real absorbers, however, are not perfect. If in equilibrium, a fraction a of the incident radiation is absorbed, with the rest being reflected, then it must be the case that the object emits the fraction a of the ideal blackbody radiation, otherwise we can arrange to violate the second law. Finally, by using our filter again, we conclude that if it absorbs the fraction a(ω) at frequency ω, it must radiate the same fraction e(ω) = a(ω) of the ideal blackbody radiation spectrum. Jargon: a is called the absorptivity and e is called the emissivity.
The upshot of all this is that there is a universal radiation spectrum that depends only on temperature and is called the blackbody, thermal or cavity radiation spectrum. Let us try to calculate this spectrum. The spectrum is produced by electromagnetic fields inside the cavity. These fields contain energy and they are in equilibrium with the walls of the cavity at temperature τ . To make life simple, let’s suppose our cavity is a cube of side L. You may recall from your studies of electromagnetism that the fields in the cavity can be divided into modes with each mode oscillating like a harmonic oscillator. Electromagnetic energy oscillates back and forth between the electric field (like the position coordinate in a standard harmonic oscillator) and the magnetic field (like the momentum coordinate). Each mode can store energy independently, so each mode contributes a harmonic oscillator term to the Hamiltonian of the cavity. Different modes have different frequencies and this is where the spectrum comes from. So we are getting close: we’ve already calculated the average energy in a harmonic oscillator; all we have to do now is enumerate the modes and their frequencies and we’ll have the calculation of the blackbody spectrum. As you may know, the electric field for a given mode in a perfectly conducting cavity has components of the form Ez = E0 sin(ωt) sin(nx πx/L) sin(ny πy/L) cos(nz πz/L) , where E0 is the amplitude of the mode (an electric field, not an energy!), ω is the frequency of oscillation, and nx , ny , and nz are integers. The cavity is assumed to run from 0 to L in each coordinate. The sine terms ensure that electric field component parallel to a perfectly conducting wall is zero at the wall. (Electric field is always perpendicular to the surface of a perfect conductor). The cosine term ensures that the magnetic field (related to the E-field by Maxwell’s equations) has no perpendicular component at the wall. 
The integers $n_x$, $n_y$, $n_z$ are related to the number of half wavelengths that fit in the cavity. Maxwell's equations tell us that
$$\omega^2 = \frac{\pi^2c^2}{L^2}\left(n_x^2 + n_y^2 + n_z^2\right).$$
We can think of a mode as a wave bouncing around inside the cavity and when it has bounced around once, it must be in phase with the original wave in order to have resonance and a mode. This is another way of seeing how the integers arise. We also know that any electromagnetic wave in a vacuum has two polarizations. So for each set of positive integers there are two independent oscillators. And each oscillator has average energy τ according to our earlier calculation. To calculate the spectrum, we need to know how many sets of integers correspond to a given range in ω. We start by considering the number that correspond to a frequency less than a particular ω. If we consider a three dimensional space with axes $n_x$, $n_y$, and $n_z$, then the number is the number of lattice points in this space in the positive octant with $\sqrt{n_x^2 + n_y^2 + n_z^2} < r = \omega L/\pi c$. This is just the volume of 1/8 of a sphere of radius r,
$$N(<\omega) = 2\cdot\frac{1}{8}\cdot\frac{4\pi}{3}\left(\frac{\omega L}{\pi c}\right)^3,$$
where the factor of 2 accounts for the two polarizations. This should be reminding you very strongly of what we did when counting the states of a particle in a box. The number of oscillators in the frequency range ω → ω + dω is found by differentiating the above, and
$$n(\omega)\,d\omega = \frac{V\omega^2}{\pi^2c^3}\,d\omega,$$
where V has been inserted in place of $L^3$. Now, each oscillator has average energy τ, so the energy per unit frequency in the cavity is
$$\frac{dU}{d\omega}\,d\omega = \frac{V\omega^2\tau}{\pi^2c^3}\,d\omega,$$
and the total energy in the cavity is found by integrating over all frequencies,
$$U = \int_0^{+\infty}d\omega\,\frac{V\omega^2\tau}{\pi^2c^3} = \infty\ !!!$$
This says there is an infinite energy in the cavity and this can't be right! Also the energy per unit frequency result says that the energy is concentrated towards the high frequencies in proportion to ω². This is called the ultraviolet catastrophe. It says that if you made a cavity and put a small hole in it to let the radiation out, you'd be instantly incinerated by the flux of X-rays and gamma rays! Of course, we're all still here, so this doesn't happen. Where did we go wrong? This is the same question physicists were asking themselves in the latter part of the nineteenth and the early part of the twentieth century. The answer is, we didn't go wrong, at least as far as classical physics is concerned. Everything we did leading up to infinite energy density in a cavity is perfectly legal according to classical physics. It is one of the many contradictions that arose around the turn of the century that led to the development of quantum mechanics. One of the things to note is that cavity radiation could be measured and at low frequencies it gave results in agreement with what we've just derived. That is, the spectral density (1/V) dU/dω is proportional to τ and to ω². For higher frequencies the measured result falls far below our calculation. The ω² region is called the Rayleigh-Jeans part of the spectrum. What's needed to cure our calculation is a way to keep the high frequency modes from being excited. We shall see that it is the discreteness of the energy levels provided by quantum mechanics that keeps these modes quiescent.
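The mode count N(<ω) used in this calculation replaces a count of lattice points by the volume of an octant of a sphere. A brute-force count shows how good that replacement is. The sketch below follows the text in counting positive integers only; the function names and the choice r = 100 are illustrative.

```python
import math

def count_modes(r):
    """Count positive-integer triples with nx^2 + ny^2 + nz^2 < r^2, times 2 polarizations."""
    r2 = r * r
    n_max = int(r)
    count = 0
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            rem = r2 - nx * nx - ny * ny
            if rem <= 1:
                continue   # no nz >= 1 satisfies nz^2 < rem
            # Largest integer nz with nz^2 < rem
            count += math.isqrt(math.ceil(rem) - 1)
    return 2 * count

def modes_formula(r):
    """Continuum estimate 2 * (1/8) * (4 pi / 3) * r^3 with r = omega L / (pi c)."""
    return (math.pi / 3) * r**3

ratio = count_modes(100.0) / modes_formula(100.0)
print(ratio)
```

The exact count falls a few percent below the volume estimate at r = 100; the shortfall is a surface effect that shrinks in proportion to 1/r, so the continuum formula becomes accurate for cavities much larger than the wavelength.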
Physics 301
02-Oct-2002 9-1
A Quantum Harmonic Oscillator
The quantum harmonic oscillator (the only kind there is, really) has energy levels given by
$$E_n = (n + 1/2)\,\hbar\omega,$$
where n ≥ 0 is an integer and $E_0 = \hbar\omega/2$ represents zero point fluctuations in the ground state. We are going to shift the origin slightly and take the energy to be
$$E_n = n\hbar\omega.$$
That is, we are going to ignore zero point energies. The actual justification for this is a little problematic, but basically, it represents an unavailable energy, so we just leave it out of the accounting. The partition function is then
$$Z = \sum_{n=0}^{\infty} e^{-n\hbar\omega/\tau} = \frac{1}{1 - \exp(-\hbar\omega/\tau)} = \frac{\exp(\hbar\omega/\tau)}{\exp(\hbar\omega/\tau) - 1}.$$
(This is just a geometric series $\sum x^n$ with $x = \exp(-\hbar\omega/\tau)$.) We calculate the average energy of the oscillator
$$\langle E\rangle = \tau^2\frac{\partial\log Z}{\partial\tau} = \frac{\hbar\omega}{\exp(\hbar\omega/\tau) - 1}.$$
It's instructive to consider two limiting cases. First, consider the case that $\hbar\omega \ll \tau$. That is, the energy level spacing of the oscillator is much less than the typical thermal energy. In this case, the denominator becomes
$$e^{\hbar\omega/\tau} - 1 \approx 1 + \frac{\hbar\omega}{\tau} + \cdots - 1 = \frac{\hbar\omega}{\tau}.$$
If we plug this into the expression for the average energy, we get
$$\langle E\rangle \to \tau, \qquad (\hbar\omega \ll \tau),$$
just as we found for the classical case. On the other hand, if $\hbar\omega/\tau \gg 1$, then the exponential in the denominator is large compared to unity and the average energy becomes
$$\langle E\rangle \to \hbar\omega\,e^{-\hbar\omega/\tau}, \qquad (\hbar\omega \gg \tau).$$
In other words, the average energy is "exponentially killed off" for high energies. Recall that we needed a way to keep the high energy modes quiescent in order to solve our cavity radiation problem!
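The closed form for the average energy and its two limits can be checked against a direct (truncated) sum over the levels $E_n = n\hbar\omega$. This is a minimal sketch in units where energies are measured in arbitrary units; the truncation `n_max` and the function names are assumptions made for illustration.

```python
import math

def mean_energy_closed(hbar_omega, tau):
    """<E> = hbar*omega / (exp(hbar*omega/tau) - 1)."""
    return hbar_omega / math.expm1(hbar_omega / tau)

def mean_energy_sum(hbar_omega, tau, n_max=2000):
    """Thermal average of E_n = n*hbar*omega over the first n_max levels."""
    weights = [math.exp(-n * hbar_omega / tau) for n in range(n_max)]
    z = sum(weights)
    return sum(n * hbar_omega * w for n, w in enumerate(weights)) / z

classical = mean_energy_closed(0.001, 1.0)   # hbar*omega << tau: approaches tau
quantum = mean_energy_closed(20.0, 1.0)      # hbar*omega >> tau: exponentially small
print(classical, quantum, 20.0 * math.exp(-20.0))
```

With ħω = 0.001τ the average energy is within a tenth of a percent of τ, while with ħω = 20τ it is indistinguishable from ħω exp(−ħω/τ).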
Quantum Cavity Radiation
Now that we have given a treatment of the quantum harmonic oscillator, we can return to the cavity radiation problem. Note that our counting of states is basically the counting of electromagnetic modes. These came out quantized because we were considering standing electromagnetic waves. That is, classical considerations gave us quantized frequencies and quantized modes. With quantum mechanics, we identify each mode as a quantum oscillator and realize that the energies (and amplitudes) of each mode are quantized as well. In addition, with quantum mechanics, we know that particles have wave properties and vice-versa and that quantum mechanics associates an energy with a frequency according to $E = \hbar\omega$. So given that we have a mode of frequency ω, it is populated by particles with energy $\hbar\omega$. We can get the mode classically by considering standing waves of the electromagnetic field and we can get it quantum mechanically by considering a particle in a box. Either way we get quantized frequencies. With quantum mechanics we also find that the occupants of the modes are particles with energies $\hbar\omega$, so we get quantized energies at each quantized frequency. When you take a course in quantum field theory, you will learn about second quantization which is what we've just been talking about! The particles associated with the electromagnetic field are called photons. They are massless and travel at the speed of light. They carry energy $E = h\nu = \hbar\omega$ and momentum $p = h/\lambda = h\nu/c = \hbar\omega/c$, where ω and λ are the frequency and wavelength of the corresponding wave. Note that the frequency in Hertz is ν = ω/2π. When $\hbar\omega \ll \tau$, so the thermal energy is much larger than the photon energy, we have
$$\frac{\langle E\rangle}{\hbar\omega} \to \frac{\tau}{\hbar\omega} \gg 1, \qquad (\hbar\omega \ll \tau).$$
The average energy divided by the energy per photon is the average number of photons in the mode or the average occupancy. We see that in the limit of low energy modes, each mode has many photons. When quantum numbers are large, we expect quantum mechanics to go over to classical mechanics and sure enough this is the limit where the classical treatment gives a reasonable answer. At the other extreme, when the photon energy is high compared to the thermal energy, we have
$$\frac{\langle E\rangle}{\hbar\omega} \to e^{-\hbar\omega/\tau} \ll 1, \qquad (\hbar\omega \gg \tau).$$
In this limit, the average occupancy is much less than 1. This means that the mode is quiescent (as needed) and also that quantum effects should be dominant. In particular, the heat bath, whose typical energies are ∼ τ, has a hard time getting together an energy much larger than τ all at once so as to excite a high energy mode. Perhaps a bit of clarification is needed here. When discussing the ideal gas, consisting of atoms or molecules, we said that a low occupancy gas was classical and a high occupancy
gas needed to be treated with quantum mechanics—apparently just the opposite of what was said in the previous paragraph! When a large number of particles are in the same state, they can be treated as a classical field. Thus at low photon energies, with many photons in the same mode, we can speak of the electromagnetic field of the mode. At high photon energies, where the occupancy is low, the behavior is like that of a classical particle but a quantized field. Let’s calculate the cavity radiation spectrum. The only change we need to make from our previous treatment is to substitute the quantum oscillator average energy in place of the classical result. The number of modes per unit frequency is the same whether we count the modes classically or quantum mechanically. However, since the average energy now depends on frequency, we must include it in the integral when we attempt to find the total energy. The energy per unit frequency is dU ¯hω V ω2 dω = 2 3 dω , dω π c exp(¯ hω/τ ) − 1 and the total energy in the cavity is,
+∞
dω
U= 0
¯hω V ω2 . 2 3 π c exp(¯ hω/τ ) − 1
It is convenient to divide the energy per unit frequency by the volume and consider the spectral density $u_\omega$, where
$$u_\omega = \frac{1}{V}\frac{dU}{d\omega} = \frac{\hbar\omega^3}{\pi^2c^3\left(\exp(\hbar\omega/\tau)-1\right)} .$$
This is called the Planck radiation law. It is simply the energy per unit volume per unit frequency at frequency ω inside a cavity at temperature τ. For convenience, let $x = \hbar\omega/\tau$. Then x is dimensionless and we have
$$u_\omega = \frac{\tau^3}{\pi^2\hbar^2c^3}\,\frac{x^3}{e^x-1} .$$
The shape of the spectrum is given by the second factor above which is plotted in the figure.
Changing the temperature shifts the curve to higher frequencies (in proportion to τ) and multiplies the curve by $\tau^3$ (and constants). At low energy the spectrum is proportional to $\omega^2$ in agreement with the classical result. At high energy there is an exponential cut-off. The exponential behavior on the high energy side of the curve is known as Wien’s law. To find the total energy per unit volume we have
$$\begin{aligned}
u &= \int_0^{+\infty} d\omega\,u_\omega , \\
  &= \int_0^{+\infty} d\omega\,\frac{\hbar\omega^3}{\pi^2c^3\left(\exp(\hbar\omega/\tau)-1\right)} , \\
  &= \frac{\tau^4}{\pi^2\hbar^3c^3}\int_0^{\infty}\frac{x^3\,dx}{e^x-1} , \\
  &= \frac{\tau^4}{\pi^2\hbar^3c^3}\,\frac{\pi^4}{15} \quad\text{(looking up the integral)} , \\
  &= \frac{\pi^2}{15\hbar^3c^3}\,\tau^4 .
\end{aligned}$$
The fact that radiation density is proportional to τ 4 is called the Stefan-Boltzmann law.
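The integral that was looked up, $\int_0^\infty x^3\,dx/(e^x-1) = \pi^4/15$, is easy to check numerically. The following is only a rough sketch using Simpson’s rule; the function names, the cutoff at x = 60, and the number of subintervals are choices made for this illustration and are not part of the notes.

```python
import math

def planck_integrand(x):
    """x^3 / (e^x - 1); the x -> 0 limit (which is 0) is handled explicitly."""
    if x == 0.0:
        return 0.0
    return x**3 / math.expm1(x)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Beyond x ~ 60 the integrand (roughly x^3 e^-x) contributes nothing measurable.
numeric = simpson(planck_integrand, 0.0, 60.0)
exact = math.pi**4 / 15
print(numeric, exact)
```

The two printed numbers should agree to many digits, which is all the Stefan-Boltzmann coefficient depends on.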
We can also calculate the entropy of radiation. We have
$$U = Vu = \frac{\pi^2}{15\hbar^3c^3}\,V\tau^4 .$$
We know that $\tau\,d\sigma = dU$ when the volume is constant, so
$$d\sigma = \frac{1}{\tau}\,\frac{4\pi^2}{15\hbar^3c^3}\,V\tau^3\,d\tau = \frac{4\pi^2}{15\hbar^3c^3}\,V\tau^2\,d\tau .$$
We integrate this relation setting the integration constant to 0, (why?) and obtain
$$\sigma = \frac{4\pi^2}{45\hbar^3c^3}\,V\tau^3 .$$
It is sometimes useful to think of blackbody radiation as a gas of photons. Some of the homework problems explore this point of view as well as other interesting facts about the blackbody radiation law. One application of the blackbody radiation law has to do with the cosmic microwave background radiation which is believed to be thermal radiation left over from the hot big bang which started our universe. Due to the expansion of the universe, it has cooled down. This radiation has been measured very precisely by the FIRAS instrument on the COBE satellite and is shown in the accompanying figure which was put together by
Lyman Page mostly from data collected by Reach, et al., 1995, Astrophysical Journal, 451, 188. The dashed curve is the theoretical curve and the solid curve represents the measurements where the error bars are smaller than the width of the curve! Other curves on the plot represent deviations in the curve due to our motion through the background radiation (dipole), irregularities due to fluctuations that eventually gave rise to galaxies and physicists (anisotropy) and sources of interfering foreground emission. The temperature is 2.728 ± 0.002 K where the error (one standard deviation) is all systematic and reflects how well the experimenters could calibrate their thermometer and subtract the foreground sources.
More on Blackbody Radiation

Before moving on to other topics, we’ll clean up a few loose ends having to do with blackbody radiation. In the homework you are asked to show that the pressure is given by
$$p = \frac{\pi^2\tau^4}{45\hbar^3c^3} ,$$
from which one obtains
$$pV = \frac{1}{3}\,U ,$$
for a photon gas. This is to be compared with
$$pV = \frac{2}{3}\,U ,$$
appropriate for a monatomic ideal gas. In an adiabatic (isentropic, i.e. constant entropy) expansion, an ideal gas obeys the relation
$$pV^\gamma = \text{Constant} ,$$
where $\gamma = C_p/C_V$ is the ratio of heat capacity at constant pressure to heat capacity at constant volume. For a monatomic ideal gas, γ = 5/3. For complicated gas molecules with many internal degrees of freedom, γ → 1. A monatomic gas is “stiffer” than a polyatomic gas in the sense that the pressure in a monatomic gas rises faster for a given amount of compression.

What are the heat capacities of a photon gas? Since
$$U = \frac{\pi^2}{15\hbar^3c^3}\,V\tau^4 ,$$
$$C_V = \left(\frac{\partial U}{\partial\tau}\right)_V = \frac{4\pi^2}{15\hbar^3c^3}\,V\tau^3 .$$
How about the heat capacity at constant pressure? We can’t do that! The pressure depends only on the temperature, so we can’t change the temperature without changing the pressure. We can imagine adding some heat energy to a photon gas. In order to keep the pressure constant, we must let the gas expand while we add the energy. So, we can certainly add heat at constant pressure, it just means the temperature is constant as well, so I suppose the heat capacity at constant pressure is formally infinite!

If one recalls the derivation of the adiabatic law for an ideal gas, it’s more or less an accident that the exponent turns out to be the ratio of heat capacities. This, plus the fact that we can’t calculate a constant pressure heat capacity, is probably a good sign that we should calculate the adiabatic relation for a photon gas directly. We already know $\sigma \propto V\tau^3 \propto Vp^{3/4}$, so for an adiabatic process with a photon gas,
$$pV^{4/3} = \text{Constant} ,$$
and a photon gas is “softer” than an ideal monatomic gas, but “stiffer” than polyatomic gases. Note that γ = 4/3 mainly depends on the fact that photons are massless. Consider a gas composed of particles of energy E and momentum P = E/c, where c is the speed of light. Suppose that particles travel at speed c and that their directions of motion are isotropically distributed. Then if the energy density is u = nE, where n is the number of particles per unit volume, the pressure is u/3. This can be found by the same kind of argument suggested in the homework problem. The same result holds if the particles have a distribution in energy provided they satisfy P = E/c and v = c. This will be the case for ordinary matter particles if they are moving at relativistic speeds. A relativistic gas is “softer” than a similar non-relativistic gas!

On problem 3 of the homework you are asked to determine the power per unit area radiated by the surface of a blackbody or, equivalently, a small hole in a cavity. The result is (c/4)u where the speed of light accounts for the speed at which energy is transported by the photons and the factor of 1/4 accounts for the efficiency with which the energy gets through the hole. The flux is
$$J = \frac{\pi^2\tau^4}{60\hbar^3c^2} = \frac{\pi^2k^4}{60\hbar^3c^2}\,T^4 = \sigma_B T^4 ,$$
where the Stefan-Boltzmann constant is
$$\sigma_B = \frac{\pi^2k^4}{60\hbar^3c^2} = 5.6687\times10^{-5}\ \frac{\text{erg}}{\text{cm}^2\ \text{s}\ \text{K}^4} .$$
We saw that the Planck curve involved the function $x^3/(\exp(x)-1)$ with $x = \hbar\omega/\tau$. Let’s find the value of x for which this curve is a maximum. We have
$$\begin{aligned}
0 &= \frac{d}{dx}\,\frac{x^3}{e^x-1} , \\
  &= \frac{3x^2}{e^x-1} - \frac{x^3e^x}{(e^x-1)^2} , \\
  &= \frac{(3x^2-x^3)e^x - 3x^2}{(e^x-1)^2} ,
\end{aligned}$$
or
$$0 = (x-3)e^x + 3 .$$
This transcendental equation must be solved numerically. The result is x = 2.82144. At maximum,
$$\frac{\hbar\omega_{max}}{\tau} = \frac{h\nu_{max}}{kT} = 2.82 ,$$
or
$$\frac{T}{\nu_{max}} = \frac{h}{2.82k} = 0.017\ \frac{\text{Kelvin}}{\text{Gigahertz}} .$$
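Solving $(x-3)e^x + 3 = 0$ numerically takes only a few lines. Here is one possible sketch using bisection; the bracket [1, 5] and the SI values of h and k are assumptions made for this illustration (the notes themselves work in cgs units and do not say how the root was found).

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi] by bisection; f must change sign on the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Condition for the maximum of x^3 / (e^x - 1); x = 0 is a trivial root we avoid.
x_max = bisect(lambda x: (x - 3.0) * math.exp(x) + 3.0, 1.0, 5.0)

H_PLANCK = 6.62607015e-34   # J s   (SI value, assumed here)
K_BOLTZ = 1.380649e-23      # J / K (SI value, assumed here)

# T / nu_max = h / (x_max k), converted from K/Hz to K/GHz.
kelvin_per_ghz = H_PLANCK / (x_max * K_BOLTZ) * 1e9
print(x_max, kelvin_per_ghz)
```

The starting bracket deliberately excludes x = 0, which also satisfies the equation but corresponds to the zero of the spectrum rather than its peak.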
So the above establishes a relation between the temperature and the frequency of the maximum energy density per unit frequency. You will often see $u_\nu$ which is the energy density per unit Hertz rather than the energy density per unit radians per second. This is related to $u_\omega$ by the appropriate number of 2π’s. You will also see $u_\lambda$ which is the energy density per unit wavelength. This is found from
$$u_\lambda\,|d\lambda| = u_\omega\,|d\omega| .$$
This says that the energy density within a range of wavelengths should be the same as the energy density within the corresponding range of frequencies. The absolute value signs are there because we only care about the widths of the ranges, not the signs of the ranges. We use $\omega = 2\pi c/\lambda$ and $|d\omega| = (2\pi c/\lambda^2)\,|d\lambda|$,
$$\begin{aligned}
u_\lambda &= u_\omega\left|\frac{d\omega}{d\lambda}\right| , \\
  &= \frac{\hbar(2\pi c/\lambda)^3}{\pi^2c^3\left(\exp(2\pi\hbar c/\lambda\tau)-1\right)}\,\frac{2\pi c}{\lambda^2} , \\
  &= \frac{8\pi hc}{\lambda^5\left(\exp(hc/\lambda\tau)-1\right)} , \\
  &= \frac{8\pi\tau^5}{h^4c^4}\,\frac{x^5}{e^x-1} ,
\end{aligned}$$
where $x = hc/\lambda\tau$. At long wavelengths, $u_\lambda \to 8\pi\tau/\lambda^4$, and at short wavelengths $u_\lambda$ is exponentially cut off. The maximum of $u_\lambda$ occurs at a wavelength given by the solution of
$$(x-5)e^x + 5 = 0 .$$
The solution is x = 4.96511…. From this, we have
$$\lambda_{max}T = \frac{hc}{4.97k} = 0.290\ \text{cm K} .$$
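The same kind of numerical check works for the wavelength form. This sketch again uses bisection, with the bracket [1, 10] and SI constants as assumptions of the illustration, and converts the result to cm K for comparison with the number quoted above.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi] by bisection; f must change sign on the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Condition for the maximum of x^5 / (e^x - 1), with x = hc / (lambda tau).
x_peak = bisect(lambda x: (x - 5.0) * math.exp(x) + 5.0, 1.0, 10.0)

H_PLANCK = 6.62607015e-34   # J s   (SI value, assumed here)
C_LIGHT = 2.99792458e8      # m / s (SI value, assumed here)
K_BOLTZ = 1.380649e-23      # J / K (SI value, assumed here)

# lambda_max * T = h c / (x_peak k), converted from m K to cm K.
wien_cm_kelvin = 100.0 * H_PLANCK * C_LIGHT / (x_peak * K_BOLTZ)
print(x_peak, wien_cm_kelvin)
```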
This is known as Wien’s displacement law. It simply says that the wavelength of the maximum in the spectrum and the temperature are inversely related. In this form, the constant is easy to remember. It’s just 3 mm Kelvin. (Note that the wavelength of the maximum in the frequency spectrum and the wavelength of the maximum in the wavelength spectrum differ by a factor of about 1.76, the ratio 4.97/2.82 of the two roots found above. This is just a reflection of the fact that wavelength and frequency are inversely related.)

Let’s apply some of these formulae to the sun. First, the peak of the spectrum is in about the middle of the visible band (do you think this is a coincidence or do you suppose there’s a reason for it?), at about $\lambda_{max} \approx 5000$ Å $= 5\times10^{-5}$ cm. Using the displacement law (and assuming the sun radiates as a blackbody), we find $T_{sun} \approx 5800$ K. The luminosity of the sun is $L = 3.8\times10^{33}$ erg s$^{-1}$. The radius of the sun is $r = 7.0\times10^{10}$ cm. The flux emitted by the sun is $J = L/4\pi r^2 = 6.2\times10^{10}$ erg cm$^{-2}$ s$^{-1}$. This is about 60 Megawatts per square meter! We equate this to $\sigma_B T^4$ and find $T_{sun} \approx 5700$ K, very close to what we estimated from the displacement law.

Problem 17 in chapter 4 of K&K points out that the entropy of a single mode of thermal radiation depends only on the average number of photons in the mode. Let’s see if we can work this out. We will use
$$\sigma = \left(\frac{\partial\,(\tau\log Z)}{\partial\tau}\right)_V ,$$
where Z is the partition function and the requirement of constant volume is satisfied by holding the frequency of the mode constant. We’ve already worked out the partition function for a single mode
$$Z = \frac{1}{1-e^{-\hbar\omega/\tau}} .$$
The average occupancy (number of photons) in the mode is
$$n = \frac{1}{e^{\hbar\omega/\tau}-1} ,$$
from which we find
$$\frac{n+1}{n} = e^{\hbar\omega/\tau} ,$$
or
$$\frac{\hbar\omega}{\tau} = \log\frac{n+1}{n} .$$
Now let’s do the derivatives to get the entropy
$$\begin{aligned}
\sigma &= \frac{\partial}{\partial\tau}\left(\tau\log Z\right) , \\
  &= \log Z - \tau\,\frac{\partial}{\partial\tau}\log\left(1-e^{-\hbar\omega/\tau}\right) , \\
  &= \log\frac{1}{1-e^{-\hbar\omega/\tau}} - \tau\,\frac{1}{1-e^{-\hbar\omega/\tau}}\left(-e^{-\hbar\omega/\tau}\right)\left(-\frac{\hbar\omega}{\tau}\right)\left(-\frac{1}{\tau}\right) , \\
  &= \log\frac{1}{1-n/(n+1)} + \frac{\hbar\omega}{\tau}\,\frac{e^{-\hbar\omega/\tau}}{1-e^{-\hbar\omega/\tau}} , \\
  &= \log(n+1) + \log\left(\frac{n+1}{n}\right)\frac{n/(n+1)}{1-n/(n+1)} , \\
  &= \log(n+1) + n\log\frac{n+1}{n} , \\
  &= (n+1)\log(n+1) - n\log n ,
\end{aligned}$$
which is the form given in K&K. This is another way of making the point that the expansion of the universe does not change the entropy of the background radiation. The expansion redshifts each photon (stretches out its wavelength) in proportion to the expansion factor, but it does not change the number of photons that have the redshifted wavelength, that is, the number of photons in the mode. So, the entropy doesn’t change. (This assumes that the photons don’t interact with each other or with the matter. Once the universe is cool enough (≤ 4000 K) that the hydrogen is no longer ionized, then the interactions are very small.)
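The single-mode entropy formula can be checked numerically by differentiating τ log Z directly and comparing with $(n+1)\log(n+1) - n\log n$. The sketch below works in units where ħω = 1 and uses a central finite difference; the function names and the step size are assumptions made for this illustration.

```python
import math

def log_Z(tau, hw=1.0):
    """log of the single-mode partition function, Z = 1 / (1 - exp(-hw/tau))."""
    return -math.log(-math.expm1(-hw / tau))

def occupancy(tau, hw=1.0):
    """Average number of photons in the mode, n = 1 / (exp(hw/tau) - 1)."""
    return 1.0 / math.expm1(hw / tau)

def entropy_from_Z(tau, hw=1.0, step=1e-6):
    """sigma = d(tau log Z)/d(tau) at fixed frequency, by central difference."""
    f = lambda t: t * log_Z(t, hw)
    return (f(tau + step) - f(tau - step)) / (2.0 * step)

def entropy_from_n(n):
    """sigma = (n + 1) log(n + 1) - n log n."""
    return (n + 1.0) * math.log(n + 1.0) - n * math.log(n)

for tau in (0.3, 1.0, 5.0):
    n = occupancy(tau)
    print(tau, n, entropy_from_Z(tau), entropy_from_n(n))
```

Since the entropy depends on τ and ω only through n, any process that leaves the occupancy of every mode unchanged leaves the entropy unchanged, which is the point made above about the expansion of the universe.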
Physics 301
04-Oct-2002 10-1
Johnson Noise

This is another application of the thermal equilibrium of electromagnetic modes. Consider an ideal transmission line, like a long piece of lossless coaxial cable. Suppose its length is L and suppose it is shorted out at each end. Then any wave that travels along the line is reflected at each end and to have an appreciable amplitude, the length of the line must contain an integral number of half wavelengths. In other words this is just a one-dimensional cavity of length L. There are modes of the electromagnetic fields, most conveniently represented by the potential difference between the inner and outer conductors,
$$V_n = V_{n,0}\sin(\omega t)\sin(n\pi x/L) ,$$
where n is any positive integer. $V_n$ and $V_{n,0}$ represent the potential difference and the amplitude of the potential difference. The fields must satisfy Maxwell’s equations, so $\omega = n\pi c/L$. Actually, if the coax is filled with a dielectric, the speed of propagation can be different from c; let’s assume it’s filled with vacuum.

If this line is in thermal equilibrium at temperature τ, each mode acts like an oscillator and has average energy $\hbar\omega/(e^{\hbar\omega/\tau}-1)$. Let’s consider the low frequency limit so the average energy in each mode is just τ. The number of modes per integer n is just 1. Then the number of modes per unit frequency is
$$n(\omega)\,d\omega = \frac{L}{\pi c}\,d\omega .$$
The energy per unit length per unit frequency is then
$$u_\omega = \frac{\tau}{\pi c} ,$$
at low frequencies. As you may know, all transmission lines have a characteristic impedance, R. If a resistor R is connected across the end of the line, then a wave traveling down the line is completely absorbed by the resistor. So, let’s take a resistor, in equilibrium at temperature τ, and connect it to the end of the line. Since the resistor and the line are at the same temperature, they are already in thermal equilibrium and no net energy transfer takes place. Each mode in the line is a standing wave composed equally of traveling waves headed in both directions. The waves traveling towards the resistor will be completely absorbed by the resistor. This means that the resistor must emit waves with equal power in order that there be no net transfer of energy. The energy in the frequency band dω per unit length headed towards the resistor is $\tau\,d\omega/2\pi c$. This is traveling at speed c, so the power incident on the resistor is $\tau\,d\omega/2\pi$, which is also the power emitted by the resistor.

What we’ve established so far is that the line feeds power $\tau\,d\omega/2\pi$ into the resistor and vice-versa. This means that a voltage must appear across the resistor. This will be a fluctuating voltage with mean 0 since it’s a random thermal voltage. However, its mean square value will not be zero. Let’s see if we can calculate this. As an equivalent circuit, we have a resistor R, a voltage generator (the thermally induced voltage source), a filter (to limit the frequencies to dω), and another resistor of resistance R representing the transmission line. Then the current is I = V/2R. The average power delivered to the resistor is then $\langle I^2\rangle R = \langle V^2\rangle/4R = \tau\,d\omega/2\pi$. In the lab, one measures frequencies in Hertz rather than radians per second, ν = ω/2π. Finally
$$\langle V^2\rangle = 4R\tau\,d\nu .$$
This relates the mean square noise voltage which appears across a resistor to the temperature, resistance, and bandwidth (dν). Of course, this voltage results from fluctuations in the motions of electrons inside the resistor, but we calculated it by considering electromagnetic modes in a one-dimensional cavity, a much simpler system! This thermal noise voltage is called Johnson noise.
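To get a feel for the size of Johnson noise, we can evaluate the mean square voltage $4R\tau\,d\nu$ with τ = kT for a typical laboratory case. The resistor value, temperature, and bandwidth below are made-up illustrative numbers, and the function name is hypothetical.

```python
import math

K_BOLTZ = 1.380649e-23  # J / K (SI value, assumed here)

def johnson_vrms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS noise voltage from <V^2> = 4 R k T (bandwidth), in volts."""
    return math.sqrt(4.0 * resistance_ohm * K_BOLTZ * temperature_k * bandwidth_hz)

# Illustrative case: a 1 kilohm resistor at 300 K seen through a 10 kHz bandwidth.
v_rms = johnson_vrms(1.0e3, 300.0, 1.0e4)
print(v_rms)  # about 4.07e-07 volts
```

So the noise is a fraction of a microvolt here, and it grows only as the square root of the resistance, the temperature, or the bandwidth.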
Debye Theory of Lattice Vibrations

A little thought will show that sound waves in a solid are not all that different from electromagnetic waves in a cavity. Further thought will show that there are some important differences that we must take into account.

The theory of lattice vibrations that we’ll discuss below applies to the ion lattice in a conductor. In addition, one needs to account for the thermal effects of the conduction electrons which behave in many respects like a gas. We’ll consider the electron gas later in the term. For now, we imagine that we’re dealing with an insulator.

We will treat crystalline solids. This is mainly for conceptual convenience, but also because we want reasonably well defined vibrational modes.

As a model, suppose the atoms in a solid are arranged in a regular cubic lattice. Each atom vibrates around its equilibrium position. The equilibrium and the characteristics of the vibrations are determined by interactions with the neighboring atoms. We can imagine that each atom is connected to its six nearest neighbors by springs. At first sight, this seems silly. But, the equilibrium position is determined by a minimum in the potential energy, and the potential energy almost surely increases quadratically with displacement from equilibrium. This gives a linear restoring force which is exactly what happens with a spring. So our solid is a large number of coupled oscillators. In general, the motion of a system of coupled oscillators is very complex. You probably know from your classical mechanics course that the motion of a system of coupled oscillators can be resolved into a superposition of normal modes with the motion of each mode being simple harmonic in time. So, we can describe the motion with the N vectors $\mathbf{r}_i$ which represent the displacement of each atom from its equilibrium position, or we can describe the motion with 3N normal mode amplitudes.

For those of you that know about Fourier transforms, the normal modes are just the Fourier transforms of the position coordinates.
These normal modes represent elastic vibrations of our solid. They are standing elastic waves, or standing sound waves. In this respect, they are similar to the cavity modes we discussed earlier. There are two main differences. First, there are three polarizations: there are two transversely polarized waves (as we had in the electromagnetic case) and one longitudinally polarized wave (absent in the electromagnetic case). Second, there is a limit to the number of modes. If our solid contains N atoms, there are 3N modes. In the electromagnetic case, there is no upper limit to the frequency of a mode. High frequency modes with $\hbar\omega \gg \tau$ are not excited, but they are there. In the elastic case, frequencies which are high enough that the wavelength is shorter than twice the distance between atoms do not exist.

For simplicity, we are going to assume that the velocity of sound is isotropic and is the same for both transverse and longitudinal waves. Also, we’ll assume that the elastic properties of the solid are independent of the amplitude of the vibrations (at least for the amplitudes we’ll be dealing with).

We’ll carry over as much stuff from the electromagnetic case as we can. A typical mode will look something like
$$\text{displacement component} = A\sin\omega t\,\sin\frac{n_x\pi x}{L}\,\sin\frac{n_y\pi y}{L}\,\sin\frac{n_z\pi z}{L} ,$$
where the sine factors might be cosines depending on the mode, A represents an amplitude, and for convenience, the solid is a cube of side L. The frequency and mode numbers are related by the speed of sound, v,
$$\omega^2 = \frac{\pi^2v^2}{L^2}\left(n_x^2 + n_y^2 + n_z^2\right) .$$
If the solid contains N atoms, the distance between atoms is $L/N^{1/3}$. The wavelength must be longer than twice this distance. More precisely
$$\frac{2L}{n_x} > \frac{2L}{N^{1/3}} ,$$
with similar relations for $n_y$ and $n_z$. In other words, the mode numbers $n_x$, $n_y$, and $n_z$ are integers within the cube $N^{1/3}\times N^{1/3}\times N^{1/3}$. This lower limit on the wavelength (upper limit on the frequency) is an example of the Nyquist limit discussed later in these notes.

The number of modes per unit frequency is just as it was for the electromagnetic case except that we must multiply by 3/2 to account for the three polarizations instead of two,
$$n(\omega)\,d\omega = \frac{3V\omega^2}{2\pi^2v^3}\,d\omega .$$
This works for frequencies low enough that the corresponding n’s are within the cube. It’s messy to deal with this cubical boundary to the mode number space. Instead, let’s approximate the upper boundary as the surface of a sphere which gives the same number of modes. In other words, there will be an upper limit to the frequency, called $\omega_D$, such that
$$3N = \int_0^{\omega_D} n(\omega)\,d\omega = \int_0^{\omega_D}\frac{3V}{2\pi^2v^3}\,\omega^2\,d\omega = \frac{V}{2\pi^2v^3}\,\omega_D^3 ,$$
which gives
$$\omega_D = \left(\frac{6\pi^2N}{V}\right)^{1/3} v .$$
Each mode acts like a harmonic oscillator and its energy is an integer times ħω. The quanta of sound are called phonons. A solid contains a thermally excited phonon gas. The average energies of these oscillators are just as they were in the electromagnetic case. We find the total energy by adding up the energies in all the modes,
$$\begin{aligned}
U &= \int_0^{\omega_D}\frac{3V}{2\pi^2v^3}\,\omega^2\,\frac{\hbar\omega}{\exp(\hbar\omega/\tau)-1}\,d\omega , \\
  &= \frac{3V}{2\pi^2\hbar^3v^3}\,\tau^4\int_0^{x_D}\frac{x^3\,dx}{e^x-1} ,
\end{aligned}$$
where
$$x_D = \frac{\hbar\omega_D}{\tau} = \left(\frac{6\pi^2N}{V}\right)^{1/3}\frac{\hbar v}{\tau} = \frac{k\theta}{kT} = \frac{\theta}{T} ,$$
where θ is called the Debye temperature and is given by
$$\theta = \frac{\hbar v}{k}\left(\frac{6\pi^2N}{V}\right)^{1/3} .$$
The Debye temperature is not a temperature you can change by adding or removing heat from a solid! Instead, it’s a characteristic of a given solid. The way to think of it is that a vibration with phonon energy equal to kθ is the highest frequency vibration that can exist within the solid. Otherwise the wavelength would be too short. (The weird factor of $6\pi^2$ occurs because we replaced a cube with a sphere!) Typical Debye temperatures are a few hundred Kelvin.

The limit of integration depends on the temperature, so in general, we can’t look up the integral. Instead, we have to numerically integrate and produce a table for different values of $x_D = \theta/T$. Such a table is given in K&K. There are two limiting cases where we can do the integral.

The first case is very low temperature ($T \ll \theta$). In this case $x_D$ is very large and we can replace $x_D$ by ∞. Then the integral is $\pi^4/15$ and we have
$$U = \frac{\pi^2V}{10\hbar^3v^3}\,\tau^4 = \frac{3\pi^4N}{5k^3\theta^3}\,\tau^4 .$$
The heat capacity at constant volume is then
$$C_V = \frac{12}{5}\,\pi^4Nk\left(\frac{T}{\theta}\right)^3 .$$
So a prediction of this theory is that at low temperatures, the heat capacities of solids should be proportional to $T^3$. This is borne out by experiment!

The other limit we can consider is very high temperature ($T \gg \theta$). In this case, we expect all modes are excited to an average energy τ, so the total should be U = 3Nτ. Is this what we get? At very high temperatures, $x_D \ll 1$, so we can expand the exponential in the denominator of the integrand,
$$\begin{aligned}
U &= \frac{3V}{2\pi^2\hbar^3v^3}\,\tau^4\int_0^{x_D}\frac{x^3\,dx}{e^x-1} , \\
  &= \frac{3V}{2\pi^2\hbar^3v^3}\,\tau^4\int_0^{x_D} x^2\,dx , \\
  &= \frac{V}{2\pi^2\hbar^3v^3}\,\tau^4\,x_D^3 , \\
  &= \frac{V}{2\pi^2\hbar^3v^3}\,\tau^4\,\frac{\hbar^3v^3}{\tau^3}\,\frac{6\pi^2N}{V} , \\
  &= 3N\tau ,
\end{aligned}$$
as expected. Actually, we picked $\omega_D$ so this result would occur “by construction.” The heat capacity goes to 3Nk in this limit. In one of the homework problems you are asked to come up with a better approximation in the limit of small $x_D$.
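Between the two limits the Debye integral has to be done numerically. Combining the energy integral with the definition of $\omega_D$ gives $U/(3N\tau) = 3I(x_D)/x_D^3$, where $I(x_D) = \int_0^{x_D}x^3\,dx/(e^x-1)$. The sketch below tabulates this ratio with Simpson’s rule; the function names and the grid are assumptions of this illustration, and it is not meant to reproduce the table in K&K.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def debye_integrand(x):
    """x^3 / (e^x - 1), with the x -> 0 limit handled explicitly."""
    return 0.0 if x == 0.0 else x**3 / math.expm1(x)

def energy_ratio(x_d):
    """U / (3 N tau) = 3 I(x_D) / x_D^3 for x_D = theta / T."""
    return 3.0 * simpson(debye_integrand, 0.0, x_d) / x_d**3

for t_over_theta in (0.02, 0.1, 0.5, 1.0, 5.0, 100.0):
    print(t_over_theta, energy_ratio(1.0 / t_over_theta))
```

At high temperature the ratio approaches 1 (the classical U = 3Nτ), and at low temperature it approaches $\pi^4/(5x_D^3)$, which is the $\tau^4$ law found above.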
The Nyquist Frequency

We imagine that we have a function of time that we sample periodically, every T seconds. Then the Nyquist frequency is the frequency corresponding to a period of two samples,
$$\omega_N = 2\pi/2T = \pi/T .$$
Consider a sine wave at some frequency ω,
$$y(t) = \sin(\omega t + \phi) .$$
Since we are sampling, we don’t have a continuous function, but a discrete set of values:
$$y_m = \sin(\omega mT + \phi) .$$
Suppose the frequency is larger than the Nyquist frequency. Then we can write it as an even integer times the Nyquist frequency plus a frequency less than the Nyquist frequency:
$$\omega = 2n\omega_N + \Omega = 2\pi n/T + \Omega ,$$
where $-\omega_N \le \Omega \le +\omega_N$. Then
$$y_m = \sin(2\pi nm + \Omega mT + \phi) = \sin(\Omega mT + \phi) .$$
In other words, when we sample a sine wave periodically, waves with frequencies greater than the Nyquist frequency look exactly the same as waves with frequencies less than the Nyquist frequency. This is illustrated in the figure. The arrows along the bottom indicate the times at which the signal is sampled. A signal at the Nyquist frequency would have one cycle every two sample intervals. The high frequency wave has 3.7 cycles every two sample intervals or a phase change of 3.7π = (4 − 0.3)π every sample. We can’t tell how many multiples of 2π go by between samples, so the high frequency wave looks exactly like a low frequency wave with −0.3 cycles per two samples. The points show the value of the signal (either wave) at each sampling interval.

Of course the application to the Debye theory of lattice vibrations is that the Nyquist spatial frequency is the highest frequency a periodic lattice can support.
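The aliasing argument is easy to see numerically. The sketch below samples a wave at 3.7 ω_N and one at −0.3 ω_N (the case described for the figure) at the same instants; the sampling interval T = 1, the phase, and the number of samples are arbitrary choices made for this illustration.

```python
import math

T = 1.0                              # sampling interval (arbitrary units)
omega_N = math.pi / T                # Nyquist frequency
n = 2                                # number of multiples of 2*omega_N to remove
Omega = -0.3 * omega_N               # aliased (low) frequency
omega = 2 * n * omega_N + Omega      # true (high) frequency, 3.7 * omega_N
phi = 0.4                            # arbitrary phase

def samples(w, count=20):
    """Sampled values y_m = sin(w m T + phi) for m = 0 .. count-1."""
    return [math.sin(w * m * T + phi) for m in range(count)]

high = samples(omega)
low = samples(Omega)
max_diff = max(abs(a - b) for a, b in zip(high, low))
print(max_diff)  # limited only by floating-point rounding
```

The two sets of samples agree to rounding error, so nothing in the sampled data can distinguish the two waves.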
Physics 301
Homework No. 3
Due 08-Oct-2002
H3-1
1. We can consider blackbody radiation as a gas of photons. Let’s calculate the pressure in several ways.

(a) As noted in lecture, a photon of frequency ω carries energy ħω and momentum ħω/c. So we have a momentum density (which is 0 when we add up the momentum in all directions) and a momentum flux density. Assuming that the directions of the velocities of the photons in an ideal gas are isotropically distributed, use our method of calculating the momentum flux (see lecture 6) to show that the pressure is
$$p = \frac{u}{3} = \frac{\pi^2\tau^4}{45\hbar^3c^3} .$$

(b) Use $p = -(\partial U/\partial V)_\sigma$. Recall that changing the volume at constant entropy means keeping the photons in the same states as the energies of the states change. By considering photons in a cubical box of side L, show that at constant entropy,
$$\frac{dV}{V} = 3\,\frac{dL}{L} = -3\,\frac{d\omega}{\omega} = -3\,\frac{d\tau}{\tau} .$$
This result can also be obtained from the expression for the entropy which shows that $\sigma \propto V\tau^3$. Using this result, evaluate the pressure by differentiating the energy with respect to volume at constant entropy and show that you obtain the same expression for the pressure as in part (a). See also K&K, chapter 4, problem 6.

2. By summing the average energy per oscillator in a cavity, we obtained the result that
$$U = \frac{\pi^2}{15\hbar^3c^3}\,V\tau^4 .$$
If we divide the average energy per oscillator by the energy per photon in the oscillator, we get the average number of photons in the oscillator. We can then sum this up over all the oscillators to get the total number of photons in the cavity. Carry out this summation and obtain an expression for the number of photons, N. (You will have to look up or numerically evaluate an integral.) Show that the entropy is proportional to the number of photons, σ ≈ 3.6N. See also K&K, chapter 4, problem 1.

3. If we have a cavity and we make a small hole in it, then we will be interested in the energy getting through the hole. Determine an expression for the flux density emerging from the hole or, equivalently, the flux density radiated by a surface which is a perfect blackbody. The flux density is the energy per unit area, per unit time, per unit frequency, per unit solid angle. Note that this will depend on the direction relative to the normal to the surface or the hole. For example, at 90° from the normal, the hole or surface is seen edge on and can’t radiate any energy in this direction. Integrate over solid angle and frequency to obtain the total energy per unit area per unit time emitted by a blackbody surface. This problem is similar in concept to last week’s problem on the distortion of the Maxwell velocity distribution for molecules emitted through a small hole. Also, it’s similar to K&K, chapter 4, problem 15.

4. K&K, chapter 4, problem 3.

5. K&K, chapter 4, problem 7.

6. K&K, chapter 4, problem 8. See also problem 19.

7. K&K, chapter 4, problem 11.

8. K&K, chapter 4, problem 14.
Physics 301
13-Oct-2002
Physics 301 Problem Set 3 Solutions

Problem 1.

(a) The momentum flux density is the momentum carried across an imaginary surface of unit area per unit time per unit frequency. Let’s say the normal to the surface is in the z direction. Then, there is no x and y component of momentum carried across this surface. We want to calculate then, the z-momentum flux density. The number of photons with momentum $\mathbf{p} \equiv (p, \theta, \phi) = (\hbar\omega/c, \theta, \phi)$ going through the hole in unit time in the range ω to ω + dω is the same as the number of photons carrying the above momentum in a cylinder of length c and base area A at angle (θ, φ) to the normal from the hole. The z-momentum carried by a photon of momentum $\mathbf{p}$ is $\hbar\omega\cos\theta/c$. The z-momentum flux carried by these photons is thus:
$$p_z(\omega)\,d\omega\,d\Omega = \frac{\hbar\omega}{c}\cos\theta\;(c\cos\theta)\;s(\omega)\,n(\omega)\,d\omega\,\frac{d\Omega}{4\pi} , \tag{1}$$
where
1. $(c\cos\theta)$ is the volume of the cylinder (per unit base area),
2. $s(\omega) = (\exp(\hbar\omega/\tau)-1)^{-1}$ is the average number of photons in the mode ω,
3. $n(\omega)\,d\omega = \omega^2\,d\omega/\pi^2c^3$ is the number of modes per unit volume in the range ω to ω + dω (travelling in all directions); and
4. $d\Omega/4\pi = \sin\theta\,d\theta\,d\phi/4\pi$ is the fraction of the photons that are travelling in the direction of the hole.

Integrating over the angles gives the momentum flux density:
$$\begin{aligned}
p_z(\omega)\,d\omega &= \hbar\omega\,s(\omega)\,n(\omega)\,d\omega\;\frac{1}{4\pi}\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\,\cos^2\theta \\
  &= \hbar\omega\,s(\omega)\,n(\omega)\,d\omega\;\frac{2\pi}{4\pi}\,\frac{2}{3} = \frac{1}{3}\,u(\omega)\,d\omega ,
\end{aligned} \tag{2}$$
where u(ω) is the spectral density. The pressure is given by summing over the frequencies (we can either do the integral or look up the total energy which we have seen many times before):
$$p = \int_0^{\infty}d\omega\,\frac{1}{3}\,u(\omega) = \frac{1}{3}\,u . \tag{3}$$

(b) For photons in a cubical box, the frequencies are given by ω = const/L; $V = L^3$ which gives:
$$\frac{dV}{V} = 3\,\frac{dL}{L} = -3\,\frac{d\omega}{\omega} . \tag{4}$$
For constant entropy processes, the occupation numbers $\langle s\rangle = (\exp(\hbar\omega/\tau)-1)^{-1}$ don’t change, and this means that d(ω/τ) = 0 which gives us the fourth relation $dV/V = -3\,d\omega/\omega = -3\,d\tau/\tau$.

The energy of a photon gas is $U = AV\tau^4$ and the entropy is $\sigma = BV\tau^3$. The energy can therefore be expressed as $U = C\sigma\tau$ where A, B and C are constants. The pressure is then given by:
$$p \equiv -\left(\frac{\partial U}{\partial V}\right)_\sigma = -\frac{(\partial U/\partial\tau)_\sigma}{(\partial V/\partial\tau)_\sigma} = -\frac{C\sigma}{(-3V/\tau)} = \frac{U}{3V} = \frac{u}{3} . \tag{5}$$
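Problem 2.

The average number of photons per mode is
$$\langle n(\omega)\rangle = \frac{1}{\exp(\hbar\omega/\tau)-1} . \tag{6}$$
Using the same density of states as we used to count the total energy, the total number of photons is
$$\begin{aligned}
N &= \int_0^{\infty}d\omega\,\frac{V\omega^2}{\pi^2c^3}\,\langle n(\omega)\rangle = \frac{V}{\pi^2c^3}\int_0^{\infty}\frac{\omega^2\,d\omega}{\exp(\hbar\omega/\tau)-1} \\
  &= \frac{V\tau^3}{\pi^2c^3\hbar^3}\int_0^{\infty}dx\,\frac{x^2}{e^x-1} = \frac{V\tau^3}{\pi^2c^3\hbar^3}\,I = \frac{V\tau^3}{\pi^2c^3\hbar^3}\,2\zeta(3) \approx 2.404\,\frac{V\tau^3}{\pi^2c^3\hbar^3} .
\end{aligned} \tag{7}$$
The entropy of the photon gas is
$$\sigma = \frac{4\pi^2}{45\hbar^3c^3}\,V\tau^3 = \frac{4\pi^4}{45\times2.404}\,N \approx 3.6N . \tag{8}$$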
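The numbers 2.404 and 3.6 in this problem can be reproduced in a few lines. This sketch evaluates ζ(3) by direct summation (the cutoff of 200000 terms is an arbitrary choice for this illustration) and then forms the ratio σ/N.

```python
import math

# zeta(3) by direct summation; the neglected tail is about 1/(2 K^2) ~ 1e-11.
zeta3 = sum(1.0 / k**3 for k in range(1, 200001))

# I = integral_0^inf x^2 / (e^x - 1) dx = Gamma(3) * zeta(3) = 2 * zeta(3)
integral_I = 2.0 * zeta3

# sigma / N = (4 pi^2 / 45) / (I / pi^2) = 4 pi^4 / (45 I)
entropy_per_photon = 4.0 * math.pi**4 / (45.0 * integral_I)
print(integral_I, entropy_per_photon)
```

The result is an entropy of roughly 3.6 per photon, independent of temperature and volume.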
Problem 3.

Let the area of the small hole be dA. We want to count the number of photons with momentum $\mathbf{p} \equiv (p, \theta, \phi) = (\hbar\omega/c, \theta, \phi)$ going through the hole in unit time in the range ω to ω + dω. This is the same as the number of photons carrying the above momentum in a cylinder of length c and base area dA at angle (θ, φ) to the normal from the hole. (Drawing such a cylinder would make this statement clear.) This number is
$$f(\omega)\,dA\,d\Omega\,d\omega = (dA\,c\cos\theta)\;s(\omega)\,n(\omega)\,d\omega\,\frac{d\Omega}{4\pi}$$
(f denotes the number flux density) where
1. $(dA\,c\cos\theta)$ is the volume of the cylinder,
2. $s(\omega) = (\exp(\hbar\omega/\tau)-1)^{-1}$ is the average number of photons in the mode ω,
3. $n(\omega)\,d\omega = \omega^2\,d\omega/\pi^2c^3$ is the number of modes per unit volume in the range ω to ω + dω (travelling in all directions); and
4. $d\Omega/4\pi = \sin\theta\,d\theta\,d\phi/4\pi$ is the fraction of the photons that are travelling in the direction of the hole.

The number of photons in the above frequency range going through the hole per unit area per unit time is given by an integration over the angles, and since each photon carries energy ħω, the energy flux density is:
$$\begin{aligned}
E_{hole}(\omega)\,d\omega &= c\,\hbar\omega\,s(\omega)\,n(\omega)\,d\omega\;\frac{1}{4\pi}\int_0^{2\pi}d\phi\int_0^{1}(d\cos\theta)\,\cos\theta \\
  &= \frac{1}{4}\,c\,\hbar\omega\,s(\omega)\,n(\omega)\,d\omega = \frac{1}{4}\,c\,u(\omega)\,d\omega ,
\end{aligned} \tag{9}$$
where u(ω) is the spectral density. The total energy going through the hole is therefore given by summing the above density over all the frequencies, and the answer is (this is basically the same integral as in problem 1, and in the calculation of total energy):
$$F = \frac{1}{4}\,cu = \frac{\pi^2\tau^4}{60\hbar^3c^2} . \tag{10}$$
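The geometric factors 1/3 (pressure, Problem 1) and 1/4 (flux through the hole, Problem 3) both come from angular integrals, and a quick numerical check guards against slips in the limits. This is a sketch using the midpoint rule; the helper name and the number of steps are choices made for this illustration.

```python
import math

def midpoint(f, a, b, n=10000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The phi integral gives 2*pi; dividing by 4*pi leaves a factor 1/2 in both cases.
azimuth = 2.0 * math.pi / (4.0 * math.pi)

# Pressure: photons crossing in both directions count, theta from 0 to pi.
pressure_factor = azimuth * midpoint(lambda t: math.sin(t) * math.cos(t)**2, 0.0, math.pi)

# Flux through the hole: only outgoing photons count, theta from 0 to pi/2.
flux_factor = azimuth * midpoint(lambda t: math.sin(t) * math.cos(t), 0.0, 0.5 * math.pi)

print(pressure_factor, flux_factor)  # close to 1/3 and 1/4
```

The different upper limits reflect the physics: momentum flux of the same sign is carried by photons moving in either direction across the surface, while energy only escapes through the hole in the outward direction.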
Problem 4. The gravitational self energy of the sun is (by dimensional arguments) E = −GM²/R, where the mass and the radius are those of the sun. (There is also a numerical factor of order 1 in front.) By the virial theorem, the average kinetic energy is K = −E/2 = GM²/2R. Assuming the particles in the sun form an ideal gas, K = 3Nk_BT/2, which gives the estimate of the temperature as T ≈ GM²/(N k_B R) ≈ 3 × 10^7 K for the numbers given, again up to factors of order 1.

Problem 5. (a) For a given mode of the photon gas with frequency ω, the allowed states are labelled by an integer k, which is the occupancy (number of photons) of the mode. The energies are kħω. The partition function is therefore

$$Z(\omega) = \sum_{k=0}^{\infty}\exp(-k\hbar\omega/\tau) = \left(1-\exp(-\hbar\omega/\tau)\right)^{-1}. \tag{11}$$

Since the modes are independent of each other, the full partition sum is the product of the one-mode partition sums over all the modes:

$$Z = \prod_n\left(1-\exp(-\hbar\omega_n/\tau)\right)^{-1}. \tag{12}$$
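(b) The free energy is

$$F = -\tau\log Z = -\tau\log\prod_n\left(1-\exp(-\hbar\omega_n/\tau)\right)^{-1} = \tau\sum_n\log\left(1-\exp(-\hbar\omega_n/\tau)\right). \tag{13}$$

The frequencies are ω_n = nπc/L, where L is the size of the box, and we can approximate the sum by an integral using the usual density of states:

$$\begin{aligned}
F &\approx \int_0^\infty \tau\log\left(1-\exp(-\hbar\omega/\tau)\right)\frac{V}{\pi^2c^3}\,\omega^2\,d\omega \\
&= \frac{\tau V}{\pi^2c^3}\left(\frac{\tau}{\hbar}\right)^3\int_0^\infty x^2\log(1-e^{-x})\,dx \\
&= \frac{\tau^4V}{\pi^2c^3\hbar^3}\left\{\left[\frac{1}{3}x^3\log(1-e^{-x})\right]_0^\infty - \frac{1}{3}\int_0^\infty\frac{x^3e^{-x}}{1-e^{-x}}\,dx\right\} \\
&= \frac{\tau^4V}{\pi^2c^3\hbar^3}\left(0-\frac{1}{3}\,\frac{\pi^4}{15}\right) = -\frac{\pi^2\tau^4V}{45\,c^3\hbar^3}.
\end{aligned}\tag{14}$$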
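The dimensionless integral in equation (14) should equal −π⁴/45. A minimal sketch of a numerical check follows, using a plain midpoint rule; the function name, the cutoff at x = 60, and the step count are choices made for this illustration only.

```python
import math


def planck_log_integral(upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x^2 * log(1 - e^-x) from 0 to `upper`.

    The integrand falls off like x^2 * e^-x, so truncating at x = 60 is harmless.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # log1p(-e^-x) is log(1 - e^-x) evaluated without loss of precision.
        total += x * x * math.log1p(-math.exp(-x))
    return total * h


if __name__ == "__main__":
    print(planck_log_integral(), -math.pi**4 / 45.0)
```

The two printed numbers should agree to many digits, which confirms the integration by parts used above.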
Problem 6. The middle plane absorbs radiation from both of the outer planes and radiates from both of its sides. The power absorbed by unit area of the middle plane is σ_B T_u⁴ + σ_B T_l⁴, and the power radiated by the same area is 2σ_B T_m⁴. Equilibrium means that these two are equal, which gives

$$T_m^4 = \tfrac{1}{2}\left(T_u^4 + T_l^4\right). \tag{15}$$

The radiation flux between the upper and middle planes is

$$J = \sigma_B T_u^4 - \sigma_B T_m^4 = \sigma_B T_u^4 - \tfrac{1}{2}\sigma_B\left(T_u^4 + T_l^4\right) = \tfrac{1}{2}\sigma_B\left(T_u^4 - T_l^4\right) = \tfrac{1}{2}J_{\rm initial}, \tag{16}$$
which is also the radiation flux between the middle and lower planes.

Problem 7. The total energy of a solid due to the elastic waves in the solid is

$$U = \frac{3V\tau^4}{2\pi^2\hbar^3v^3}\int_0^{x_D}dx\,\frac{x^3}{e^x-1}, \tag{17}$$

where v is the velocity of sound in the solid and x_D = θ/T, where the Debye temperature θ = (ħv/k_B)(6π²N/V)^{1/3} is related to the maximum frequency allowed. In the limit T ≫ θ, we have x_D ≪ 1 and e^x − 1 ≈ x, so the integral can be approximated as

$$U \approx \frac{3V\tau^4}{2\pi^2\hbar^3v^3}\int_0^{x_D}dx\,x^2 = \frac{3V\tau^4}{2\pi^2\hbar^3v^3}\,\frac{x_D^3}{3} = 3Nk_BT. \tag{18}$$
For T only moderately larger than θ, we get a better approximation by expanding the integral further in powers of x_D, which means expanding the function x³(e^x − 1)^{−1} further in powers of x. Expanding to two more orders gives U = 3NkT(1 + a₁(θ/T) + a₂(θ/T)² + ...). The heat capacity is thus C_v = 3Nk(1 − a₂(θ/T)² + ...). The first non-zero correction is therefore the a₂ term, which we shall calculate. To this order,

$$(e^x-1)^{-1} \approx \left(x+\frac{x^2}{2}+\frac{x^3}{6}\right)^{-1} = \frac{1}{x}\left(1+\frac{x}{2}+\frac{x^2}{6}\right)^{-1} = \frac{1}{x}\left(1-\frac{x}{2}-\frac{x^2}{6}+\frac{x^2}{4}\right) = \frac{1}{x}\left(1-\frac{x}{2}+\frac{x^2}{12}\right). \tag{19}$$

This gives

$$\int_0^{x_D}dx\,\frac{x^3}{e^x-1} \approx \int_0^{x_D}dx\,x^2\left(1-\frac{x}{2}+\frac{x^2}{12}\right) = \frac{x_D^3}{3}-\frac{x_D^4}{8}+\frac{x_D^5}{60} = \frac{x_D^3}{3}\left(1-\frac{3\theta}{8T}+\frac{1}{20}\left(\frac{\theta}{T}\right)^2\right), \tag{20}$$

which gives the value a₂ = 1/20. Looking up table 4.2, we find C_v(0) = 24.93 for θ/T = 0 and C_v(1) = 23.74 for θ/T = 1. If we ignore higher approximations, this gives a₂ = (C_v(0) − C_v(1))/C_v(0) ≈ 1.19/24.93 ≈ 0.048, which is off from 1/20 = 0.05 by 4%.

Problem 8. (a) If we have only longitudinal sound waves, the density of states is less than the usual one by a factor of 3: n(ω) = (V/2π²v³)ω². The number of modes of the solid remains 3N. The Debye frequency in this case is given by

$$3N = \int_0^{\omega_D}n(\omega)\,d\omega = \frac{V}{6\pi^2v^3}\,\omega_D^3 \;\Rightarrow\; \omega_D = \left(\frac{18\pi^2Nv^3}{V}\right)^{1/3} = (18\pi^2\rho)^{1/3}\,v. \tag{21}$$
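The mass density 0.145 g/cc corresponds to a number density ρ = 0.145 × 6.023 × 10^23/4 = 2.18 × 10^22 atoms/cc. This gives

$$\theta_D = \hbar\omega_D/k = (18\pi^2\rho)^{1/3}\,(\hbar v/k) \approx 28.5\ \mathrm{K}.$$

(b) The total energy is, for x_D ≫ 1,

$$\begin{aligned}
U &= \int_0^{\omega_D}d\omega\,\frac{V\omega^2}{2\pi^2v^3}\,\frac{\hbar\omega}{e^{\hbar\omega/\tau}-1} = \frac{V}{2\pi^2\hbar^3v^3}\,\tau^4\int_0^{x_D}\frac{x^3}{e^x-1}\,dx \\
&\approx \frac{V}{2\pi^2\hbar^3v^3}\,\tau^4\int_0^\infty\frac{x^3}{e^x-1}\,dx = \frac{V}{2\pi^2\hbar^3v^3}\,\tau^4\,\frac{\pi^4}{15} \\
&= \frac{V\pi^2}{30\,\hbar^3v^3}\,\tau^4 = \frac{3\pi^4N}{5k^3\theta_D^3}\,\tau^4 = \frac{3\pi^4Nk}{5\theta_D^3}\,T^4 \\
\Rightarrow\; C_v &= \frac{12}{5}N\pi^4k\left(\frac{T}{\theta_D}\right)^3 \approx 1.39\times10^{-25}\,N\,T^3\ \mathrm{J/K}.
\end{aligned}\tag{22}$$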
N = N_A atoms is 4 g of He, which means that the specific heat per gram is C_v ≈ (1.39 × 10^{−25} × 6.02 × 10^{23}/4) T³ ≈ 0.021 T³ J/(g K).
Week 4. Chemical Potential, Gibbs Distribution, Fermi-Dirac and Bose-Einstein Distributions
Physics 301
07-Oct-2002 11-1
Reading

This week we'll work on the chemical potential and the Gibbs distribution, which are covered in K&K chapter 5.
Parting Shot on Oscillators

Before we get to the main part of this week's material, let's have a quick recap on oscillators. If we have an oscillator of frequency ω, its energy levels are spaced by ħω. If this oscillator is in thermal equilibrium at temperature τ, then if ħω < τ, its average energy is τ. If ħω > τ, its average energy is exponentially "killed off" and it's not too gross an approximation to say that it's 0. This happens because of the quantized states. Energy can only be exchanged with the oscillator in units of ħω, and when this is larger than a thermal energy, the heat reservoir almost never gets enough energy together to excite the oscillator.

We know the energy of the oscillator, and all we have to do is count up how many oscillators there are in order to find the total energy of the system. For blackbody radiation, the number of modes is proportional to ω³ (∫ω² dω), and all these modes are excited up to the maximum ω where ħω = τ. So the energy in blackbody radiation is proportional to τ⁴.

In the case of lattice vibrations, we again have a number of modes proportional to ω³, but there are a finite number, so if we run out of modes before we reach the maximum ω of an excited oscillator, then every mode has energy τ and the total energy is 3Nτ (where 3N is the number of modes). If we don't run out of modes before we reach the maximum ω of an excited oscillator, then the situation is just like that with blackbody radiation and the energy is proportional to τ⁴. These two cases correspond to high and low temperatures and give heat capacities which are constant or proportional to τ³, respectively. By the way, the fact that molar heat capacities of solids are usually 3R at room temperature (R is the gas constant) is called the law of Dulong and Petit.

Finally, when we considered Johnson noise in a resistor, we made use of a one dimensional cavity, where the number of modes is proportional to ω (∫dω), and we considered the low frequency limit (ħω < τ) and found the energy proportional to τ dω.

Basically, we know the energy of the modes and we count the modes. All the factors of π, ħ, etc., come out as a result of the proper bookkeeping when we do the counting.
Integrals Related to Planck's Law

Judging by experience with previous classes, some (maybe many) of you are wondering just how one goes about doing the integral

$$\int_0^\infty\frac{x^2\,dx}{\exp(x)-1}.$$

The first thing to note is that doing integrals is an art, not a science. You've probably learned a number of techniques for doing integrals. However, there is never a guarantee that an arbitrary expression can be integrated in closed form, or even as a useful series. Some expressions you just have to integrate numerically! Let's see what we can do about

$$I_n = \int_0^\infty\frac{x^n\,dx}{\exp(x)-1},$$

where n need not be an integer, but I think we'll need n > 0. The first thing to do is to try and look it up! I like Dwight, Tables of Integrals and other Mathematical Data, 4th edition, 1964, MacMillan. (Actually, I bought mine when I was an undergraduate in the late 60's. It seems that they were coming out with a new edition every 10 years, so maybe it's up to the seventh edition by now!) Anyway, in my edition of Dwight, there is entry 860.39:

$$\int_0^\infty\frac{x^{p-1}\,dx}{e^{ax}-1} = \frac{\Gamma(p)}{a^p}\left(1+\frac{1}{2^p}+\frac{1}{3^p}+\cdots\right) = \frac{\Gamma(p)}{a^p}\,\zeta(p),$$

and this is basically the integral we're trying to do. We've talked about the gamma function, Γ(z); see lecture 5. The Riemann zeta function is

$$\zeta(s) = \sum_{k=1}^\infty\frac{1}{k^s},$$

where Re(s) > 1. The function can be defined for other values of s, but this series requires Re(s) > 1. A good reference book for special functions is Abramowitz and Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, US Government Printing Office. If you look up the zeta function in this handbook, you'll find lots of cool stuff. For example,

$$\zeta(s) = \prod_{{\rm primes}\ p}\frac{1}{1-1/p^s}.$$
The zeta function establishes a connection between number theory and function theory! Other things you'll find in A&S are various integral representations, representations in terms of Bernoulli and Euler polynomials (if you don't know what these are, they're also discussed in A&S), special values, and tables of values. For example ζ(0) = −1/2, ζ(1) = ∞, ζ(2) = π²/6, and ζ(4) = π⁴/90. ζ(3), needed for the x² integral at the start of this section, does not have a simple value. Instead, we find it in a table, ζ(3) = 1.2020569.... Can we write I_n in the form suggested by Dwight?
$$\begin{aligned}
I_n &= \int_0^\infty\frac{x^n\,dx}{\exp(x)-1} \\
&= \int_0^\infty\frac{x^n e^{-x}\,dx}{1-e^{-x}} \\
&= \int_0^\infty x^n e^{-x}\sum_{m=0}^\infty e^{-mx}\,dx \\
&= \int_0^\infty x^n\sum_{m=0}^\infty e^{-(m+1)x}\,dx \\
&= \sum_{m=0}^\infty\int_0^\infty x^n e^{-(m+1)x}\,dx \\
&= \sum_{m=0}^\infty\frac{1}{(m+1)^{n+1}}\int_0^\infty\left((m+1)x\right)^n e^{-(m+1)x}\,d\left((m+1)x\right) \\
&= \sum_{m=0}^\infty\frac{1}{(m+1)^{n+1}}\int_0^\infty y^n e^{-y}\,dy \\
&= \sum_{m=0}^\infty\frac{1}{(m+1)^{n+1}}\,\Gamma(n+1) \\
&= \Gamma(n+1)\sum_{m=1}^\infty\frac{1}{m^{n+1}} \qquad\text{(m now starts at 1)} \\
&= \Gamma(n+1)\,\zeta(n+1),
\end{aligned}$$

in agreement with Dwight.

If you need to numerically evaluate ζ(s), you can just start summing the series. Suppose you've summed the inverse powers from 1 to M − 1. You should be able to show (make some sketches) that

$$\int_M^\infty\frac{dx}{x^s} = \frac{1}{(s-1)\,M^{s-1}}$$

is less than the remainder of the sum and

$$\int_M^\infty\frac{dx}{(x-1)^s} = \frac{1}{(s-1)\,(M-1)^{s-1}}$$

is greater than the remainder of the sum. You can use the average of these two integrals as an estimate of the remainder of the sum and half their difference as a bound on the numerical error. (Actually the error will be quite a bit smaller!) As an example, consider ζ(2) = π²/6 = 1.64493.... The sum of the first 10 terms is 1 + 1/4 + 1/9 + 1/16 + ··· + 1/100 = 1.5497677.... The two integrals are just 1/11 = 0.09090909... and 1/10 = 0.1. Their average is 0.09545454... and half their difference is 0.004545..., so numerically we can be pretty sure that the value is within 0.00455 of 1.64522. In fact, we actually miss by only 0.00028!
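The bracketing recipe just described can be sketched in a few lines. The function name and return convention below are assumptions made for this illustration; the arithmetic is exactly the partial sum plus the average of the two bounding integrals.

```python
import math


def zeta_estimate(s, terms):
    """Return (estimate, error_bound) for zeta(s), s > 1, after summing `terms` terms.

    The remainder of the series is bracketed by two integrals; their average
    corrects the partial sum and half their difference bounds the error.
    """
    partial = sum(1.0 / k**s for k in range(1, terms + 1))
    m = terms + 1                                        # first term not summed
    lower = 1.0 / ((s - 1.0) * m ** (s - 1.0))           # integral of dx / x^s from M
    upper = 1.0 / ((s - 1.0) * (m - 1) ** (s - 1.0))     # integral of dx / (x-1)^s from M
    return partial + 0.5 * (lower + upper), 0.5 * (upper - lower)


if __name__ == "__main__":
    estimate, bound = zeta_estimate(2.0, 10)
    print(estimate, bound, abs(estimate - math.pi**2 / 6.0))
    print(zeta_estimate(3.0, 100))
```

With ten terms and s = 2 this reproduces the numbers in the text; with s = 3 and a hundred terms the estimate can be compared with the tabulated ζ(3) = 1.2020569....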
The Chemical Potential

Recall that in lectures 2 and 3 we discussed two systems in thermal (microscopic exchange of energy), volume (macroscopic exchange of energy), and diffusive (exchange of particles) equilibrium. By requiring that the entropy be a maximum, we found that

$$d\sigma = \frac{1}{\tau}\,dU + \frac{p}{\tau}\,dV - \frac{\mu}{\tau}\,dN,$$

where µ is the chemical potential and N is the number of particles. In other words,

$$\mu = -\tau\left(\frac{\partial\sigma}{\partial N}\right)_{U,V}.$$

We can also rewrite the differential relation above in the form

$$dU = \tau\,d\sigma - p\,dV + \mu\,dN,$$

from which we deduce

$$\mu = \left(\frac{\partial U}{\partial N}\right)_{\sigma,V}.$$

Adding a particle to a system changes its energy by µ. Of course, the entropy is not a completely natural variable to work with as an independent variable. To get around this, we use the Helmholtz free energy, which we've previously defined as a function of temperature and volume. We now extend the definition to include particle number. In particular, consider the free energies of two systems in contact with a reservoir at temperature τ. We allow these two systems to exchange particles until equilibrium is established. The free energy of the combined system is F = F₁ + F₂, where the subscripts refer to the individual systems. The free energy will be a minimum at constant temperature and volume. The change in free energy due to particle exchange is

$$dF = dF_1 + dF_2 = \left(\frac{\partial F_1}{\partial N_1}\right)_{\tau,V}dN_1 + \left(\frac{\partial F_2}{\partial N_2}\right)_{\tau,V}dN_2.$$
We want dF = 0 at the minimum and, since the total number of particles is constant, we have dN₁ = −dN₂, which means that

$$\left(\frac{\partial F_1}{\partial N_1}\right)_{\tau,V} = \left(\frac{\partial F_2}{\partial N_2}\right)_{\tau,V} = \mu.$$

This constitutes yet another definition of the chemical potential. Is it the same chemical potential we've already defined? Yes, provided the free energy continues to be defined by F = U − τσ. Then when the particle number changes, we have

$$\begin{aligned}
dF &= dU - \tau\,d\sigma - \sigma\,d\tau \\
&= \tau\,d\sigma - p\,dV + \mu\,dN - \tau\,d\sigma - \sigma\,d\tau \\
&= -\sigma\,d\tau - p\,dV + \mu\,dN,
\end{aligned}$$

and it's the same chemical potential according to either definition.

By the way we defined the chemical potential, it must be the same for two systems in diffusive and thermal contact, once they've reached equilibrium. What if the two systems in diffusive contact do not have equal values of the chemical potential? Since

$$dF = dF_1 + dF_2 = \mu_1\,dN_1 + \mu_2\,dN_2,$$

there will be a flow of particles in order to minimize F. If µ₁ > µ₂, then dN₁ < 0 and dN₂ = −dN₁ > 0, so particles flow from the system with the higher chemical potential to the system with the lower chemical potential.

To summarize,

$$\mu = -\tau\left(\frac{\partial\sigma}{\partial N}\right)_{U,V} = \left(\frac{\partial U}{\partial N}\right)_{\sigma,V} = \left(\frac{\partial F}{\partial N}\right)_{\tau,V}.$$
Getting a Feel for the Chemical Potential

The chemical potential is another one of those thermodynamic quantities that seems to appear by magic. In order to gain intuition about the chemical potential, you will probably have to see it in action and work with it for a while. To start this process, note that adding a particle to a system requires that the energy of the system be changed by µ while the entropy and volume are kept constant. Better yet, the free energy changes by µ while the temperature and volume are kept constant.

Why might adding a particle to a system change the system's energy? There are at least two reasons. There might be macroscopic fields around (such as gravitational or electromagnetic fields) in which the particle has an ordinary potential energy (mgh or eΦ for example). In addition, when a particle is added to a system at temperature τ, it must acquire a thermal energy which depends on τ and other parameters of the system. In other words, the change in energy upon adding a particle can be due to both macroscopic fields and microscopic thermal effects. The distinction made in K&K between the external, internal and total chemical potentials is just a division into the macroscopic, microscopic, and total contributions to the energy upon adding a particle.

Let's find the chemical potential of the classical, ideal, monatomic gas. Recall that in lecture 7 we found

$$U = \frac{3}{2}N\tau,\qquad F = -N\tau\left(\log\frac{n_Q}{n}+1\right),\qquad \sigma = N\left(\log\frac{n_Q}{n}+\frac{5}{2}\right),$$

$$n_Q = \left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2},\qquad n = \frac{N}{V}.$$

Of the thermodynamic potentials U, F, and σ above, only F is expressed in terms of its natural independent variables τ, V, and N. Let's find µ by differentiating F with respect to N while keeping τ and V constant:

$$\begin{aligned}
\mu = \left(\frac{\partial F}{\partial N}\right)_{\tau,V} &= \frac{\partial}{\partial N}\left[-N\tau\left(\log\frac{n_QV}{N}+1\right)\right] \\
&= -\tau\left(\log\frac{n_QV}{N}+1\right)+\tau \\
&= -\tau\log\frac{n_QV}{N} \\
&= \tau\log\frac{n}{n_Q}.
\end{aligned}$$
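To put numbers on µ = τ log(n/n_Q), here is a minimal sketch that evaluates it for an assumed example, helium at 300 K and 1 atm, and compares the closed form with a finite difference of F in N at fixed τ and V. The gas, the one cubic micron volume, the constants, and all function names are assumptions for this illustration, not values from the lecture.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
AMU = 1.66053907e-27     # kg


def quantum_concentration(mass_kg, tau):
    """n_Q = (m tau / 2 pi hbar^2)^(3/2) in m^-3, with tau in joules."""
    return (mass_kg * tau / (2.0 * math.pi * HBAR**2)) ** 1.5


def mu_over_tau(n, n_q):
    """Closed form: mu / tau = log(n / n_Q)."""
    return math.log(n / n_q)


def free_energy_over_tau(n_particles, volume, n_q):
    """F / tau = -N [log(n_Q V / N) + 1]."""
    return -n_particles * (math.log(n_q * volume / n_particles) + 1.0)


def mu_over_tau_by_difference(n_particles, volume, n_q):
    """Central difference of F / tau with respect to N at fixed tau and V."""
    plus = free_energy_over_tau(n_particles + 1.0, volume, n_q)
    minus = free_energy_over_tau(n_particles - 1.0, volume, n_q)
    return 0.5 * (plus - minus)


# Assumed example: helium at T = 300 K, p = 1 atm, in one cubic micron.
TAU = K_B * 300.0
N_Q = quantum_concentration(4.0026 * AMU, TAU)
N_CONC = 101325.0 / TAU      # ideal gas law, n = p / tau
VOLUME = 1.0e-18             # m^3, small enough that N + 1 differs from N in floating point
N_PARTICLES = N_CONC * VOLUME

if __name__ == "__main__":
    print(N_CONC / N_Q)
    print(mu_over_tau(N_CONC, N_Q))
    print(mu_over_tau_by_difference(N_PARTICLES, VOLUME, N_Q))
```

A small volume is used on purpose: with a macroscopic N, adding one particle would be lost to floating-point rounding and the finite difference would be meaningless.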
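Note that for a typical gas, the concentration n is quite small compared to the quantum concentration n_Q, perhaps a part in 10^6 to a part in 10^5. So we expect µ = −14τ to −11τ. If µ gets close to zero, then the concentration is approaching the quantum concentration and the classical treatment is no longer valid.

Suppose we wanted to calculate the chemical potential by differentiating the entropy. We express the entropy in terms of its natural independent variables U, V, and N,

$$\begin{aligned}
\sigma &= N\left(\log\frac{n_Q}{n}+\frac{5}{2}\right) \\
&= N\left(\log\left[\left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2}\frac{V}{N}\right]+\frac{5}{2}\right) \qquad\text{substitute } \tau = 2U/3N, \\
&= N\left(\log\left[\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right]+\frac{5}{2}\right).
\end{aligned}$$

Now differentiate with respect to N, multiply by −τ, and replace U with 3Nτ/2,

$$\begin{aligned}
\mu &= -\tau\left(\frac{\partial\sigma}{\partial N}\right)_{U,V} \\
&= -\tau\,\frac{\partial}{\partial N}\left[N\left(\log\left[\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right]+\frac{5}{2}\right)\right] \\
&= -\tau\left(\log\left[\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right]+\frac{5}{2}-\frac{5}{2}\right) \\
&= -\tau\log\frac{n_QV}{N} \\
&= \tau\log\frac{n}{n_Q},
\end{aligned}$$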
the same as we obtained before.

Since we're having so much fun playing with the mathematics, suppose we wanted to find the chemical potential by differentiating the energy. This should be done at constant entropy and volume. The Sackur-Tetrode formula for the entropy is fairly messy to solve for the temperature, so we may not want to rewrite the energy in closed form as a function of the entropy, volume and number of particles. Instead, we can write the energy as a differential involving dN and dτ and find the relation between these differentials which makes the entropy change vanish. This will be left as a homework problem. Needless to say, one gets the same expression for the chemical potential that we've already derived.

As an example of a macroscopic contribution to the chemical potential, consider our atmosphere, which exists in the gravitational field of the Earth. Suppose the atmosphere consists of a single kind of atom of mass m, is isothermal, and is in equilibrium. Then the chemical potential must be the same everywhere in the gas. At height h above the zero level for the gravitational potential, there is a contribution mgh from the gravitational field. This means

$$\mu = \text{Constant} = \tau\log\frac{n}{n_Q} + mgh,$$

or

$$n(h) = n(0)\,e^{-mgh/\tau},$$

so the concentration decreases exponentially with altitude. With the ideal gas law, p/τ = N/V = n, so the pressure also decreases exponentially with altitude. We can write

$$p(h) = p(0)\,e^{-h/h_0},$$

where h₀ = τ/mg = kT/mg = RT/Mg is called the scale height of the atmosphere. R is the gas constant, and M is the molar mass. If we take T = 300 K, M = 28 g (appropriate for nitrogen molecules), and g = 980 cm s⁻², we find h₀ ≈ 9 km.

There are several problems with this simple model. First, the atmosphere is stirred by winds, so it is not in equilibrium. However, this is important only in the first few miles above sea level. At higher altitudes, the atmosphere is approximately isothermal, but with a considerably colder temperature, about 230 K according to the plot in K&K. Also, each molecular species should have a slightly different scale height, with the lighter molecules having a larger scale height (and being more likely to completely evaporate from the Earth).
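The scale height arithmetic can be reproduced with a short sketch. The function names and the SI value of the gas constant are assumptions added for this illustration; the inputs T = 300 K, M = 28 g, and g = 980 cm s⁻² are the ones quoted above.

```python
import math

R_GAS = 8.314462618   # J / (mol K)


def scale_height(temperature_kelvin, molar_mass_kg, g=9.80):
    """h0 = R T / (M g) in metres, for an isothermal atmosphere."""
    return R_GAS * temperature_kelvin / (molar_mass_kg * g)


def pressure_ratio(height_m, h0_m):
    """p(h) / p(0) = exp(-h / h0)."""
    return math.exp(-height_m / h0_m)


if __name__ == "__main__":
    h0 = scale_height(300.0, 0.028)            # nitrogen at 300 K
    print(h0 / 1000.0, "km")
    print(pressure_ratio(h0, h0))              # one scale height up: 1/e
    print(scale_height(230.0, 0.028) / 1000.0, "km at 230 K")
    print(scale_height(300.0, 0.004) / 1000.0, "km for helium")
```

The last two lines illustrate the caveats in the text: a colder atmosphere has a smaller scale height, and a lighter species has a larger one.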
Physics 301
09-Oct-2002 12-1
The Gibbs Factor

In lecture 5, we divided up a large object into a system and a heat reservoir and we considered what happens when the system and reservoir exchange energy. This led us to the Boltzmann factor and the partition function. Now let's consider what happens if we divide up the large object into the system and a reservoir, except this time we allow the exchange of particles as well as energy. The total energy is U₀ and the total number of particles is N₀ (which is not Avogadro's number during this discussion!). What is the probability that the system is in the single state 1 with N₁ particles and energy E₁ compared to the probability that it's in the single state 2 with N₂ particles and energy E₂? The ratio is

$$\frac{P(N_1,E_1)}{P(N_2,E_2)} = \frac{g(N_0-N_1,\,U_0-E_1)\times1}{g(N_0-N_2,\,U_0-E_2)\times1},$$

where g(N_R, U_R) is the number of states available to the reservoir when it contains N_R particles and has energy U_R. We are just applying our postulate that the probability is proportional to the number of available states. We have

$$\begin{aligned}
P(N,E) &\propto g(N_0-N,\,U_0-E) \\
&\propto e^{\sigma(N_0-N,\,U_0-E)} \\
&\propto e^{\sigma(N_0,U_0)-N\,\partial\sigma/\partial N-E\,\partial\sigma/\partial U} \\
&\propto e^{\sigma(N_0,U_0)+N\mu/\tau-E/\tau} \\
&\propto e^{\sigma(N_0,U_0)}\times e^{N\mu/\tau-E/\tau} \\
&\propto e^{N\mu/\tau-E/\tau},
\end{aligned}$$

where we dropped the first factor since it's a constant for any given reservoir. The factor e^{(Nµ−E)/τ}, to which the probability P(N, E) is proportional, is called the Gibbs factor. A probability distribution described by Gibbs factors is called the grand canonical distribution. Consider the sum

$$Z = \sum_{{\rm All}\ N,\,E}e^{(N\mu-E)/\tau},$$

where the sum is over all numbers of particles N, and for each N, all possible states of N particles with energies E. Or, one can sum over all energies first, and then over all numbers of particles with that energy. Basically it's a sum over all possible states, which are parameterized by number of particles and energy. Z is called the Gibbs sum, the grand sum, or the grand partition function. K&K use a kind of cursive Z symbol for this sum. I don't seem to have that in my TEX fonts, so we'll use Z.
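The grand partition function is used to normalize the probabilities:

$$P(N,E) = \frac{1}{Z}\,e^{(N\mu-E)/\tau}.$$

With the normalized probability distribution, we can compute mean values. For example, the mean number of particles in the system is

$$\begin{aligned}
\langle N\rangle &= \frac{1}{Z}\sum_{{\rm All}\ N,\,E}N\,e^{(N\mu-E)/\tau} \\
&= \tau\,\frac{1}{Z}\sum_{{\rm All}\ N,\,E}\frac{N}{\tau}\,e^{(N\mu-E)/\tau} \\
&= \tau\,\frac{1}{Z}\,\frac{\partial Z}{\partial\mu} \\
&= \tau\,\frac{\partial\log Z}{\partial\mu}.
\end{aligned}$$

The mean energy is slightly more complicated. If we differentiate Z with respect to 1/τ, it will pull down a term with the energy in it, but we'll also get the number of particles again:

$$\begin{aligned}
\langle N\mu-E\rangle &= \frac{1}{Z}\sum_{{\rm All}\ N,\,E}(N\mu-E)\,e^{(N\mu-E)/\tau} = \frac{1}{Z}\,\frac{\partial Z}{\partial(1/\tau)} = \frac{\partial\log Z}{\partial(1/\tau)}, \\
\langle E\rangle &= \langle N\mu\rangle-\frac{\partial\log Z}{\partial(1/\tau)} \\
&= \mu\langle N\rangle-\frac{\partial\log Z}{\partial(1/\tau)} \\
&= \mu\tau\,\frac{\partial\log Z}{\partial\mu}-\frac{\partial\log Z}{\partial(1/\tau)} \\
&= \mu\tau\,\frac{\partial\log Z}{\partial\mu}+\tau^2\,\frac{\partial\log Z}{\partial\tau}.
\end{aligned}$$

K&K define the activity as λ = e^{µ/τ}. The grand partition function can be rewritten in terms of the activity as

$$Z = \sum_{{\rm All}\ N,\,E}\lambda^N e^{-E/\tau},$$

from which it follows that the average number of particles is

$$\langle N\rangle = \lambda\,\frac{\partial\log Z}{\partial\lambda}.$$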
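The derivative formulas for ⟨N⟩ and ⟨E⟩ can be checked against direct averages on a toy system. The sketch below uses a made-up list of (N, E) states and central differences of log Z; the state list, parameter values, and function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy system: each entry is (number of particles, energy) of one state.
STATES = [(0, 0.0), (1, -0.5), (1, 0.3), (2, -0.8)]


def gibbs_sum(states, mu, tau):
    """Grand partition function: sum of Gibbs factors exp((N mu - E) / tau)."""
    return sum(math.exp((n * mu - e) / tau) for n, e in states)


def mean_number(states, mu, tau):
    """<N> computed directly from the normalized probabilities."""
    z = gibbs_sum(states, mu, tau)
    return sum(n * math.exp((n * mu - e) / tau) for n, e in states) / z


def mean_energy(states, mu, tau):
    """<E> computed directly from the normalized probabilities."""
    z = gibbs_sum(states, mu, tau)
    return sum(e * math.exp((n * mu - e) / tau) for n, e in states) / z


def dlogz_dmu(states, mu, tau, h=1e-6):
    """Central difference of log Z with respect to mu."""
    return (math.log(gibbs_sum(states, mu + h, tau))
            - math.log(gibbs_sum(states, mu - h, tau))) / (2.0 * h)


def dlogz_dtau(states, mu, tau, h=1e-6):
    """Central difference of log Z with respect to tau."""
    return (math.log(gibbs_sum(states, mu, tau + h))
            - math.log(gibbs_sum(states, mu, tau - h))) / (2.0 * h)


if __name__ == "__main__":
    mu, tau = -0.2, 1.0
    print(mean_number(STATES, mu, tau), tau * dlogz_dmu(STATES, mu, tau))
    print(mean_energy(STATES, mu, tau),
          mu * tau * dlogz_dmu(STATES, mu, tau) + tau**2 * dlogz_dtau(STATES, mu, tau))
```

Each printed pair should agree to within the accuracy of the finite differences.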
Example: Binding of N Molecules

This example is related to the myoglobin example discussed in K&K and also to K&K, chapter 5, problem 14. The example system is a hemoglobin molecule which can bind zero to four oxygen molecules. A hemoglobin molecule is similar to four myoglobin molecules, each of which can bind zero or one oxygen molecule. We will work out an expression for the average number of bound molecules as a function of the partial pressure of oxygen in the atmosphere.

We will assume that each successive molecule binds with the same energy (relative to infinite separation), ε < 0. This is not quite right, as successive oxygen molecules are bound more tightly than the first oxygen molecule. Also, we will start by assuming that 0 to M molecules may be bound, and specialize to the case of four molecules later. Finally, we will assume that there is only one state in which N molecules are bound. This corresponds to assuming that the molecules are bound to hemoglobin in a definite order.

We will let the activity be λ = exp(µ/τ). Then the grand partition function is

$$\begin{aligned}
Z &= 1+\lambda e^{-\epsilon/\tau}+\left(\lambda e^{-\epsilon/\tau}\right)^2+\cdots+\left(\lambda e^{-\epsilon/\tau}\right)^M \\
&= \frac{1-\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{1-\lambda e^{-\epsilon/\tau}}, \\
\log Z &= \log\left(1-\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}\right)-\log\left(1-\lambda e^{-\epsilon/\tau}\right).
\end{aligned}$$

Now we differentiate with respect to λ in order to find the average number of bound molecules,

$$\begin{aligned}
\langle N\rangle &= \lambda\,\frac{\partial\log Z}{\partial\lambda} \\
&= -\frac{(M+1)\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{1-\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}+\frac{\lambda e^{-\epsilon/\tau}}{1-\lambda e^{-\epsilon/\tau}} \\
&= \lambda e^{-\epsilon/\tau}\,\frac{1-(M+1)\left(\lambda e^{-\epsilon/\tau}\right)^M+M\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{\left(1-\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}\right)\left(1-\lambda e^{-\epsilon/\tau}\right)}.
\end{aligned}$$
In the case that M = 1, which corresponds to myoglobin, which can bind one molecule, the expression becomes

$$\langle N\rangle = \frac{\lambda e^{-\epsilon/\tau}}{1+\lambda e^{-\epsilon/\tau}}.$$

Now, λ = exp(µ/τ), and in order for our system to be in equilibrium with atmospheric oxygen it must have the same chemical potential as atmospheric oxygen. This means that λ = n/n_Q = p/(τ n_Q), where p is the partial pressure of oxygen, τ is the temperature of atmospheric oxygen (presumably room temperature), and n_Q is the quantum concentration evaluated at temperature τ and for the mass of an O₂ molecule. So λ can be evaluated numerically for any desired partial pressure of oxygen. If we look at the curve for myoglobin in figure 5.12 of K&K, it appears that the average number of bound molecules is about 1/2 when the partial pressure of oxygen is about 5 mm of Hg. (One atmosphere is 760 mm of Hg and oxygen is about 20% of the atmosphere, so the maximum partial pressure is roughly 150 mm of Hg.) Let λ_{1/2} be the activity when the number of bound molecules is 1/2. From our expression above, we see that this means λ_{1/2} exp(−ε/τ) = 1 or exp(−ε/τ) = 1/λ_{1/2}. Let's plug this into the formula for hemoglobin (M = 4) and express the result as a fraction of the maximum number of bound molecules. The result is

$$f = \frac{\langle N\rangle}{4} = \frac{x}{4}\,\frac{1-5x^4+4x^5}{(1-x^5)(1-x)} = \frac{x}{4}\,\frac{1+2x+3x^2+4x^3}{1+x+x^2+x^3+x^4},$$

where x = λ/λ_{1/2}. This curve is shown in the figure. Binding more molecules, all with the same binding energy, causes a sharper transition from "empty" to "full!"
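The sharpening of the transition can be seen numerically. The sketch below computes the bound fraction directly from the Gibbs sum for any M and compares it with the closed form for M = 4; the function names are assumptions for this illustration, and x stands for λ/λ_{1/2} as in the text.

```python
def mean_bound(x, max_bound):
    """<N> for a site binding 0..M molecules, each state weighted by x^N."""
    weights = [x**k for k in range(max_bound + 1)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def fraction_m4_closed_form(x):
    """f = <N>/4 for M = 4, from the closed form in the text."""
    return x * (1 + 2 * x + 3 * x**2 + 4 * x**3) / (4 * (1 + x + x**2 + x**3 + x**4))


if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0, 2.0, 4.0):
        myoglobin = mean_bound(x, 1)             # M = 1
        hemoglobin = mean_bound(x, 4) / 4.0      # M = 4, as a fraction of 4
        print(x, myoglobin, hemoglobin, fraction_m4_closed_form(x))
```

Both curves pass through 1/2 at x = 1, but the M = 4 fraction lies below the M = 1 curve for x < 1 and above it for x > 1, which is the sharper transition described above.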
More on the Chemical Potential: Energy to Add a Particle

This section is based on the discussion in K&K on pages 250–252. As you recall, the chemical potential is the amount of energy required to add one particle to a system. Also, we learned that the chemical potential for an ideal monatomic gas is µ = τ log(n/n_Q), which is about −14τ to −11τ for a typical gas under typical conditions. It appears that if we add one more particle to a gas, we're not required to spend energy, but we get back some energy! This must be wrong, but what's the explanation?

The answer has to do with where the particle came from. There's also an energy involved in removing the particle from its original location before we add it to the gas. Suppose we have two containers of the same gas at the same temperature, τ. Suppose the chemical potentials are different with µ₂ > µ₁. Then the concentrations must be different, or equivalently, the pressures are different with p₂ > p₁. If we remove a molecule from container 1 and add it to container 2, we receive energy µ₁ from container 1 but must give energy µ₂ to container 2. The total amount of energy that must be supplied by an external agent to move this molecule is µ₂ − µ₁ > 0. What does this turn out to be?

$$\Delta E = \mu_2-\mu_1 = \tau\log\frac{n_2}{n_Q}-\tau\log\frac{n_1}{n_Q} = \tau\log\frac{n_2}{n_1}.$$

Now suppose we have N molecules of a gas at temperature τ and we isothermally compress it from volume V₁ down to volume V₂ or, equivalently, from concentration n₁ up to concentration n₂. How much mechanical work is required?

$$\begin{aligned}
\Delta W &= \int_{V_1}^{V_2}-p\,dV \\
&= -N\tau\int_{V_1}^{V_2}\frac{dV}{V} \\
&= -N\tau\log\frac{V_2}{V_1} \\
&= N\tau\log\frac{V_1}{V_2} \\
&= N\tau\log\frac{N/V_2}{N/V_1} \\
&= N\tau\log\frac{n_2}{n_1}.
\end{aligned}$$

So the energy per molecule required to isothermally change the concentration from n₁ to n₂ is just the energy required to move one molecule from a gas at concentration n₁ to a gas at concentration n₂.
In fact, we could imagine doing the following: Isothermally compress the gas in container 1 from concentration n1 to concentration n2 . This requires spending an energy Nτ log(n2 /n1 ). Move the molecule from container 1 to container 2. This requires no energy since the concentrations and the chemical potentials are now the same. Expand the gas in container 1 back to concentration n1 . This recovers an energy (N − 1)τ log(n2 /n1 ), so the net expenditure of energy is τ log(n2 /n1 ) = µ2 − µ1 . Recall that the internal energy of an ideal monatomic gas depends only on its temperature (U = 3Nτ /2). Before and after we moved the molecule from container 1 to container 2, the temperature of all the gas was τ , so the internal energy of the gas did not change! Where did the energy τ log(n2 /n1 ) go??? Hints: has the free energy of the combined systems changed? What about the entropy?
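The bookkeeping in the compress, move, expand cycle can be checked numerically. The sketch below integrates −p dV for an ideal gas with a midpoint rule (in units of τ) and compares the net cost of the cycle with τ log(n₂/n₁); the particle number, the volumes, and the function names are arbitrary choices made for this illustration.

```python
import math


def transfer_energy_over_tau(n1, n2):
    """(mu_2 - mu_1) / tau = log(n2 / n1) for an ideal gas at fixed temperature."""
    return math.log(n2 / n1)


def isothermal_work_over_tau(n_particles, v_start, v_end, steps=100_000):
    """Work done ON the gas, -integral of p dV with p = N tau / V, in units of tau."""
    dv = (v_end - v_start) / steps
    return -sum(n_particles / (v_start + (i + 0.5) * dv) for i in range(steps)) * dv


if __name__ == "__main__":
    n_particles, v1, v2 = 1000, 2.0, 1.0        # assumed values, compress by a factor of 2
    n1, n2 = n_particles / v1, n_particles / v2
    compress = isothermal_work_over_tau(n_particles, v1, v2)
    # After one molecule has been moved out, expand N - 1 molecules back to n1.
    v_back = (n_particles - 1) / n1
    expand = isothermal_work_over_tau(n_particles - 1, v2 * (n_particles - 1) / n_particles, v_back)
    print(compress + expand, transfer_energy_over_tau(n1, n2))
```

The expansion work is negative (energy is recovered), and the sum of the two steps should equal log(n₂/n₁) in units of τ.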
Example: Chemical Potential and Batteries

Surprise: chemical potential might actually have something to do with chemistry! An example has to do with batteries, or better, voltaic cells. K&K have a discussion of the lead acid battery used in cars on pages 129–131. However, I've been told that this discussion is not quite right. In particular see Saslow, W., 1996, PRL, 76, 4849. By the way, did you know that Princeton subscribes to many of the on-line journals? This means if you access the web from a Princeton address, you'll be allowed to read the journals on-line. In particular, you can find Physical Review and Physical Review Letters on-line and the article cited above can be downloaded and printed out.

Rather than discuss the lead acid battery, let's look at a simpler (I hope) system: the Daniell cell. This is discussed by the same author in 1999, AJP, 67, 574. (True confession: I have not read the article in the American Journal of Physics, but rather, the preprint that used to be on the author's web site. But the TAMU physics web site has been revamped and I can't find the preprint anymore!) The Daniell cell is also discussed in chemistry textbooks, such as the one I used many years ago, Pauling, L., 1964, College Chemistry, (San Francisco: Freeman), p. 354. The following discussion is based on both of these sources.

The figure shows a schematic of the cell. It has a solution of zinc sulfate (ZnSO4) surrounding a zinc electrode and a copper sulfate (CuSO4) solution surrounding a copper electrode. The two solutions are in contact. The zinc electrode is the negative electrode or cathode and the copper electrode is the positive electrode or anode. Chemical reactions occur at the electrodes. At the copper electrode, the reaction is

Cu++ + 2e− → Cu .

The copper ion was in solution and the electrons come from the electrode. The neutral copper atom "plates out" on the copper electrode.
At the zinc electrode, the reaction is

Zn → Zn++ + 2e− .

Zinc atoms in the electrode go into solution as zinc ions and leave behind two electrons on the cathode. If a wire is connected between the two electrodes, the electrons left behind by the zinc can travel through the external circuit to the copper electrode, where they join up with the copper ions to plate out the copper atoms. (Of course, electrons go in one end and different electrons come out the other end....)

Charge is transferred inside the cell, through the electrolyte, by sulfate ions. That is, one can think of CuSO4 dissociating into Cu++ and SO4−− at the positive electrode; the Cu++ plates out, leaving behind a spare sulfate ion which diffuses over to the negative electrode to join up with a zinc ion and form ZnSO4. (Of course, sulfate ions don't go all the way across the electrolyte; ions go in one end and different ions come out the other end....) Essentially all the current in the electrolyte is carried by the ions and none by electrons.

If we actually have a complete circuit, current will flow until one of the consumables is exhausted. If all the copper is plated out of solution or if the zinc electrode is completely dissolved, that will be the end of the cell. When operated in this mode, the cell converts chemical potential energy into electrical energy.

Our methods apply to equilibrium situations, so we'll discuss the situation when there is no current flowing in the external circuit and the system has reached equilibrium. (Actually, a non-uniform distribution of electrolytes is also not an equilibrium situation, so we are really assuming that the time for the electrolytes to diffuse is long compared to the time for the reactions at the electrodes to complete.) As zinc goes into solution and copper plates out, the electrodes acquire charges and electric potentials. When the potentials are large enough, the reactions stop.
When the reactions stop, the chemical potentials of the atoms/ions must be the same whether they are in solution or on the electrodes. Let Va, Vs, and Vc be the electric potentials (voltages) of the anode, solutions, and cathode. Note that we assume the electrolytes (solutions) are equipotentials. If not, there would be a current flow until a uniform potential is established. The voltage of the cell (e.g., measured by a voltmeter placed across the anode and cathode) is

Vcell = Va − Vc = (Va − Vs) + (Vs − Vc) .
Consider a zinc ion in the cathode. When equilibrium has been established, the chemical potential of the zinc in the cathode must be the same as that of zinc in solution. The chemical potential is made of two parts: the internal chemical potential and the potential energy of the ion in the macroscopic electric potential of the cathode or the solution:

µci(Zn++) + 2eVc = µsi(Zn++) + 2eVs ,

or

µci(Zn++) − µsi(Zn++) = 2e(Vs − Vc) ,

where e > 0 represents the magnitude of the charge on an electron and µci and µsi represent the internal chemical potentials in the cathode and the solution. Note that I have shown the zinc as zinc ions on the cathode as well as in solution. This is mainly for clarity and can be justified by noting that the conduction electrons in a metal are not localized to any particular atom.

The difference of internal chemical potentials is determined by the chemical reaction. It is customary to divide this by the magnitude of the electric charge and the number of charges involved and tabulate it as a potential difference. So, for example, my 1962 edition of the Handbook of Chemistry and Physics has a table titled "Potentials of Electrochemical Reactions at 25°C" in which one finds +0.7628 V listed for the reaction Zn → Zn++ + 2e−. This means that Vs − Vc is about 0.76 V. At the anode, with no current flowing, we have

µai(Cu++) − µsi(Cu++) = −2e(Va − Vs) .

The Handbook lists the electric potential of the reaction Cu → Cu++ + 2e− as −0.3460 V. Thus Va − Vs = 0.35 V and the open circuit cell potential is Vcell = 1.11 V.

Comments: the potentials associated with reactions that occur at the cathode or anode are called half cell potentials. If this reminds you of redox reactions in chemistry, it should! The Handbook contains a table titled "Electromotive Force and Composition of Voltaic Cells" which gives the composition and voltage of selected cells.
The half cell voltages are determined by defining a standard half cell (a platinum electrode over which hydrogen gas is bubbled) to have zero half cell potential. All other half cells are then measured relative to this standard. Recall: only potential energy differences are important! Finally, by now you should be getting a feel for why it’s called the chemical potential!
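As a quick check of the arithmetic, here is a minimal Python sketch that combines the two half cell potentials quoted from the Handbook into the open circuit cell voltage. The function name `cell_voltage` and the sign bookkeeping in the comments are ours, chosen to follow the convention used in the text; they are not taken from the Handbook.

```python
# Half cell potentials quoted in the text (volts) for the reactions
#   Zn -> Zn++ + 2e-   and   Cu -> Cu++ + 2e-.
E_ZN = 0.7628
E_CU = -0.3460


def cell_voltage(e_cathode_reaction, e_anode_reaction):
    """Open circuit voltage Vcell = (Va - Vs) + (Vs - Vc).

    Following the text, Vs - Vc equals the tabulated zinc potential and
    Va - Vs equals minus the tabulated copper potential.
    """
    vs_minus_vc = e_cathode_reaction
    va_minus_vs = -e_anode_reaction
    return va_minus_vs + vs_minus_vc


v_cell = cell_voltage(E_ZN, E_CU)
print(f"Vcell = {v_cell:.2f} V")  # about 1.11 V
```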
Physics 301
11-Oct-2002 13-1
Example: Magnetic Particles in a Magnetic Field

Recall the paramagnetic spin system we discussed in lecture 4. In this system, there are magnets with orientations parallel or antiparallel to a magnetic field. In the parallel orientation, the energy is $-mB = -E$, where $m$ is the magnetic moment and $B$ is the magnetic field. In the antiparallel orientation the energy is $+mB = +E$. In lecture 4, we worked out the relative numbers of parallel and antiparallel magnets and found that it depended on the ratio of thermal to magnetic energies.

Following the discussion in K&K, pages 127–129, suppose that we have the same kind of system, but in addition, the magnetic particles are free to move, so the aligned magnets will be attracted to regions of high field strength while the antiparallel magnets will be repelled from regions of high field strength. Of course, in the regions of high field, one would expect to find a greater fraction aligned even if the particles couldn’t move.

Let $n_\uparrow$ be the concentration of parallel and $n_\downarrow$ be the concentration of antiparallel systems. Just as with an ideal gas, we expect that the microscopic or internal contribution to the chemical potential should depend on the concentration,
$$\mu_{\uparrow,\mathrm{int}} = \tau \log\frac{n_\uparrow}{n_Q} \qquad\text{and}\qquad \mu_{\downarrow,\mathrm{int}} = \tau \log\frac{n_\downarrow}{n_Q} .$$
We assume that we can treat the parallel and antiparallel magnets as distinct kinds of “particles.” To the internal chemical potential must be added the external potential due to the energy in the magnetic field,
$$\mu_\uparrow = \tau \log\frac{n_\uparrow}{n_Q} - mB , \qquad \mu_\downarrow = \tau \log\frac{n_\downarrow}{n_Q} + mB .$$
Now, the parallel and antiparallel magnets are in thermal equilibrium with each other and can be changed into one another. That is, one can remove a particle from the parallel group and add it to the antiparallel group and vice-versa. When the system has come to equilibrium, at temperature $\tau$, the free energy must be stationary with respect to changes in the particle numbers, which means the chemical potentials of the two kinds of particles must be the same. Furthermore, we are allowing the particles to diffuse to regions of higher or lower field strength, and the chemical potential must be independent of field strength. So,
$$\mu_\uparrow = \mu_\downarrow = \text{Constant} .$$
This relation together with the previous equations is easily solved to yield
$$n_\uparrow(B) = \frac{1}{2}\, n(0)\, e^{+mB/\tau} \qquad\text{and}\qquad n_\downarrow(B) = \frac{1}{2}\, n(0)\, e^{-mB/\tau} ,$$
where we’re explicitly showing that the concentrations depend on $B$, and $n(0)$ is the combined concentration where $B = 0$. The combined concentration as a function of $B$ is
$$n(B) = n_\uparrow(B) + n_\downarrow(B) = n(0)\cosh(mB/\tau) = n(0)\left(1 + \frac{m^2B^2}{2\tau^2} + \cdots\right) .$$
These relations show both effects we mentioned earlier. The higher the field strength, the greater the fraction of aligned magnets (as we already knew from lecture 4) and the greater the concentration of magnets. The magnetic particles diffuse to regions of high field strength.

In figure 5.6, K&K show a plot of chemical potential versus concentration for several different field strengths. In problem 5 of chapter 5, we are asked for what value of $m/\tau$ this figure was drawn. The key datum to extract from the plot is that at a given chemical potential, the concentration increases by two orders of magnitude as $B$ is increased from 0 to 20 kG. We can plug this directly into the previous expression: $\cosh(mB/\tau) = 100$ gives $mB/\tau = 5.30$, so $m/\tau = 5.30/(20000\ \mathrm{G}) = 0.000265\ \mathrm{G}^{-1}$. Note that we had to use the cosh form of the expression, not the series, because $mB/\tau > 1$.

Problem 5.5 also asks how many Bohr magnetons must be contained in each particle. A Bohr magneton (roughly the magnetic moment of an electron) is $\mu_B = e\hbar/2m_ec$, where $e$ and $m_e$ are the charge and mass of an electron; $\mu_B = 0.927 \times 10^{-20}$ erg G$^{-1}$. Doing the arithmetic, we obtain about 1200 magnetons. The particles must contain 1200 paramagnetic molecules with a spin of $\hbar/2$ and a magnetic moment of $\mu_B$. They could also contain a more or less arbitrary number of non-magnetic molecules.
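The arithmetic for problem 5.5 can be sketched in a few lines of Python. This is only an illustration: the temperature of 300 K used to convert $m/\tau$ into a magnetic moment is our assumption (the text above does not state it), and the function name `moment_over_tau` is ours.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant in erg/K
MU_BOHR = 0.927e-20   # Bohr magneton in erg/G, the value quoted in the text


def moment_over_tau(concentration_ratio, b_gauss):
    """Solve n(B)/n(0) = cosh(m B / tau) for m/tau, in 1/G."""
    return math.acosh(concentration_ratio) / b_gauss


# Two orders of magnitude in concentration between B = 0 and B = 20 kG.
m_over_tau = moment_over_tau(100.0, 20000.0)

T_ASSUMED = 300.0     # kelvin; ASSUMPTION, not stated in the text
moment = m_over_tau * K_B * T_ASSUMED   # magnetic moment in erg/G
n_magnetons = moment / MU_BOHR

print(f"m/tau = {m_over_tau:.3e} per gauss")
print(f"about {n_magnetons:.0f} Bohr magnetons at {T_ASSUMED:.0f} K")
```

With the assumed temperature the result lands near the "about 1200 magnetons" quoted above; a different temperature would scale the answer proportionally.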
Example: Impurity Ionization

In pages 143–144, K&K discuss an impurity atom in a semiconductor. The atom may lose a valence electron and become ionized. The energy required to remove an electron from the donor atom is $I$. The model for this impurity atom is a three state system: the ionized state has energy 0 and no electron is present. There are two bound states; both have energy $-I$ and both have one electron present. One has the electron with spin up along some axis and the other has the electron with spin down. The grand partition function is
$$Z = 1 + e^{(\mu + I)/\tau} + e^{(\mu + I)/\tau} ,$$
where the first term comes from the ionized state and the second and third terms account for the spin up and spin down bound states. The average number of (bound) electrons and the average energy are
$$\langle N \rangle = \frac{e^{(\mu + I)/\tau} + e^{(\mu + I)/\tau}}{1 + e^{(\mu + I)/\tau} + e^{(\mu + I)/\tau}} = \frac{2e^{(\mu + I)/\tau}}{1 + 2e^{(\mu + I)/\tau}} ,$$
$$\langle E \rangle = \frac{-Ie^{(\mu + I)/\tau} - Ie^{(\mu + I)/\tau}}{1 + e^{(\mu + I)/\tau} + e^{(\mu + I)/\tau}} = \frac{-2Ie^{(\mu + I)/\tau}}{1 + 2e^{(\mu + I)/\tau}} .$$
The probability that the impurity atom is ionized is
$$P(N = 0) = \frac{1}{1 + 2e^{(\mu + I)/\tau}} .$$
If we don’t know the value of µ, we can’t actually calculate any of these averages or this probability. What sets the value of µ? Answer: µ is determined by the electron distribution in the rest of the semiconductor. (A subject we’ll get to in a few weeks!) Although we don’t know µ at this point, we’re used to the idea that µ increases with increasing concentration. In the above expressions we see that increasing µ increases the mean number of particles in the system, decreases the mean energy (energy goes down for a bound particle), and decreases the probability of being ionized. All this is reasonable and might have been expected. The higher the concentration of electrons in the semiconductor, the harder it is for the atom to give an extra electron to the semiconductor and become ionized!
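To see these trends numerically, here is a small Python sketch of the three state donor model. The numbers (energies measured in units of $\tau$, an ionization energy of 2) are purely illustrative assumptions, and the function name `impurity_averages` is ours.

```python
import math


def impurity_averages(mu, ionization_energy, tau):
    """Return (<N>, <E>, P(ionized)) for the three state donor model."""
    # Gibbs factor of ONE bound state; there are two (spin up, spin down).
    bound = math.exp((mu + ionization_energy) / tau)
    z = 1.0 + 2.0 * bound
    n_mean = 2.0 * bound / z
    e_mean = -ionization_energy * n_mean   # each bound state has energy -I
    p_ionized = 1.0 / z
    return n_mean, e_mean, p_ionized


TAU = 1.0          # illustrative: energies in units of tau
IONIZATION = 2.0   # illustrative value of I

for mu in (-6.0, -4.0, -2.0, 0.0):
    n_mean, e_mean, p_ion = impurity_averages(mu, IONIZATION, TAU)
    print(f"mu = {mu:5.1f}  <N> = {n_mean:.4f}  <E> = {e_mean:8.4f}  "
          f"P(ionized) = {p_ion:.4f}")
```

As $\mu$ increases down the table, the mean occupancy rises, the mean energy falls, and the ionization probability drops, as described above.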
Example: K&K, Chapter 5, Problem 6

In this problem we are asked to work with a 3 state system. The states are: (1) no particle, energy is 0; (2) one particle, energy is still 0; (3) one particle, energy is $\epsilon$. So a particle can be absent, present with zero energy, or present with energy $\epsilon$. The grand partition function is
$$Z = 1 + \lambda + \lambda e^{-\epsilon/\tau} ,$$
where $\lambda = \exp(\mu/\tau)$. The three terms in this sum correspond to the three states enumerated above. The thermal average occupancy is just the average number of particles in the system and is
$$\langle N \rangle = \frac{1}{Z}\left(0\cdot 1 + 1\cdot\lambda + 1\cdot\lambda e^{-\epsilon/\tau}\right) = \frac{\lambda + \lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$
Of course, this result can also be obtained using $\langle N \rangle = \lambda(\partial/\partial\lambda)\log Z$. Just as in the previous example, increasing $\mu$ ($\lambda$) makes it harder for the system to give the particle to the reservoir (which determines $\mu$) and the system is more likely to contain a bound particle. The thermal average occupancy of the state with energy $\epsilon$ is
$$\langle N(E = \epsilon) \rangle = \frac{\lambda e^{-\epsilon/\tau}}{Z} = \frac{\lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$
Here we see that in the limit of very large $\mu$ ($\lambda$) the system always contains a particle, and the relative probability that the particle is in the high energy state is just the Boltzmann factor, $\exp(-\epsilon/\tau)$. The average energy is
$$\langle E \rangle = \frac{\epsilon\lambda e^{-\epsilon/\tau}}{Z} = \frac{\epsilon\lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$
Finally, we are asked to calculate the grand partition function in the event that a particle can exist in both the zero energy state and the state with energy $\epsilon$ simultaneously. In other words, there is a fourth state of the system; it contains two particles and has energy $\epsilon$. We have
$$Z = 1 + \lambda + \lambda e^{-\epsilon/\tau} + \lambda^2 e^{-\epsilon/\tau} = (1 + \lambda)\cdot(1 + \lambda e^{-\epsilon/\tau}) .$$
In this case, $Z$ can be factored. K&K point out that this means that the system can be treated as two independent systems. This is an example of a general rule that for independent (but weakly interacting) systems, the grand partition function is the product of the grand partition functions for each independent system, just as the partition function of independent systems is a product of the individual partition functions (Homework 2, problem 4).
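The factorization is easy to confirm by brute force. The following Python sketch sums the Gibbs factors over an explicit list of (particle number, energy) states; the numerical values of the level energy, $\tau$, and $\lambda$ are arbitrary illustrative choices, and the helper name `grand_sum` is ours.

```python
import math


def grand_sum(states, lam, tau):
    """Sum lam**N * exp(-E/tau) over a list of (N, E) states."""
    return sum(lam ** n * math.exp(-e / tau) for n, e in states)


EPS, TAU, LAM = 1.0, 0.5, 0.3   # illustrative values only

three_state = [(0, 0.0), (1, 0.0), (1, EPS)]
four_state = three_state + [(2, EPS)]   # both single particle states occupied

z3 = grand_sum(three_state, LAM, TAU)
z4 = grand_sum(four_state, LAM, TAU)

# Product of the grand sums of the two independent single particle states.
z_product = (1.0 + LAM) * (1.0 + LAM * math.exp(-EPS / TAU))

print(f"three state Z = {z3:.6f}")
print(f"four state  Z = {z4:.6f}, product form = {z_product:.6f}")
```

Only the four state sum matches the product form; the three state system, in which the two single particle states exclude each other, does not factor.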
Fermi-Dirac and Bose-Einstein Distributions

When we considered a low density gas in lecture 7, we considered the single particle states of a particle confined to a box. To treat more than one particle, we imagined that the particles were weakly interacting, so we could, to some level of approximation, treat each particle as though it occupied a single particle state. In the limit of no interactions between particles, this would be exact (but it might be hard to achieve thermal equilibrium!). For typical gases at room temperature and atmospheric pressure we found that the concentration was very low, so that the chance that any single particle state was occupied was very small, maybe one part in a million. We just didn’t have to worry about the chances of finding two particles in a state.

Now we want to consider the distribution when there’s a good chance of finding single particle states occupied. We are going to assume that we have non-interacting particles in which each particle in the system can be said to be in a single particle state. There are two kinds of particles: fermions, which have half integer spins (spin angular momentum is a half integer times $\hbar$), and bosons, which have integral spins. Fermions obey the Pauli exclusion principle: at most one particle may occupy a single state. On the other hand, an unlimited number of bosons may be placed in any given state.

Since we have independent single particle states, the grand partition function for all the states is the product of the grand partition functions for the individual states. So to start with, let’s calculate the grand partition function for an individual state of energy $\epsilon$. In the case of fermions, there are two possibilities: no particle present with energy 0 and one particle present with energy $\epsilon$. Then
$$Z = 1 + e^{(\mu - \epsilon)/\tau} .$$
The average number of particles in this state of energy $\epsilon$ is denoted by $f(\epsilon)$,
$$f(\epsilon) = \langle N \rangle = \frac{e^{(\mu - \epsilon)/\tau}}{1 + e^{(\mu - \epsilon)/\tau}} = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} .$$
This is called the Fermi-Dirac distribution and fermions are said to obey Fermi-Dirac statistics. If $\epsilon = \mu$, then the average number of particles in the state is 1/2. If $\epsilon < \mu$, the average occupancy is bigger than 1/2 and approaches 1 as $\epsilon \to -\infty$. If $\epsilon > \mu$, the average occupancy is less than 1/2 and approaches 0 as $\epsilon \to +\infty$. This distribution starts at 1 at very low energies, winds up at 0 at very high energies, and makes the transition from 1 to 0 in the neighborhood of $\mu$. The temperature controls the width of the transition. At very low temperatures the transition is sharp. At high temperatures, the transition is gradual.

For bosons, the possibilities are no particles present with energy 0, 1 particle present with energy $\epsilon$, 2 particles present with energy $2\epsilon$, 3 particles present with energy $3\epsilon$, and so on. The grand partition function is
$$Z = 1 + e^{(\mu - \epsilon)/\tau} + e^{2(\mu - \epsilon)/\tau} + e^{3(\mu - \epsilon)/\tau} + \cdots = \frac{1}{1 - e^{(\mu - \epsilon)/\tau}} .$$
Note that $\mu < \epsilon$ if the sum is to converge. The average occupancy is
$$f(\epsilon) = \tau\frac{\partial \log Z}{\partial \mu} = \frac{e^{(\mu - \epsilon)/\tau}}{1 - e^{(\mu - \epsilon)/\tau}} = \frac{1}{e^{(\epsilon - \mu)/\tau} - 1} .$$
This is called the Bose-Einstein distribution and bosons obey Bose-Einstein statistics. Again, note that $\mu < \epsilon$ if the distribution function is to make sense. In fact, if we have weakly interacting particles occupying states of several different energies, they all have the same chemical potential, which must therefore be less than the lowest energy of any available state. In other words, $\mu < \epsilon_{\min}$. The minimum energy is often set to zero, and then $\mu < 0$, but the real constraint is just that $\mu$ be lower than any accessible energy.

The Bose-Einstein distribution diverges as $\epsilon \to \mu$. As $\epsilon \to +\infty$ the distribution goes exponentially to zero. The average occupancy is 1 when $\epsilon - \mu = \tau \log 2$. At lower energies there is on average more than one particle in the state and at higher energies there is on average less than one particle in the state.

The Bose-Einstein distribution, with $\mu = 0$, is exactly the occupancy we came up with for photons in blackbody radiation. Photons have spin 1, so they are bosons and obey Bose-Einstein statistics. There is no upper limit on the wavelength, so the lowest conceivable energy is arbitrarily close to zero, which means $\mu \le 0$.
At large energies both the Fermi-Dirac and Bose-Einstein distributions become
$$f(\epsilon \to +\infty) \to e^{(\mu - \epsilon)/\tau} .$$
In this limit the average occupancy is small and quantum effects are negligible; this is the classical limit. The classical distribution is called the Maxwell-Boltzmann distribution.
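The three occupancy functions are simple enough to compare side by side. The Python sketch below tabulates them as a function of $(\epsilon - \mu)/\tau$; the function names and the sample points are ours, chosen only for illustration.

```python
import math


def fermi_dirac(eps, mu, tau):
    """Mean occupancy of a single particle state for fermions."""
    return 1.0 / (math.exp((eps - mu) / tau) + 1.0)


def bose_einstein(eps, mu, tau):
    """Mean occupancy for bosons; only meaningful for eps > mu."""
    if eps <= mu:
        raise ValueError("Bose-Einstein occupancy requires eps > mu")
    return 1.0 / math.expm1((eps - mu) / tau)


def maxwell_boltzmann(eps, mu, tau):
    """Classical (low occupancy) limit of both distributions."""
    return math.exp((mu - eps) / tau)


MU, TAU = 0.0, 1.0   # illustrative: energies measured from mu, in units of tau
for x in (0.1, math.log(2.0), 1.0, 3.0, 10.0):
    print(f"(eps-mu)/tau = {x:6.3f}  FD = {fermi_dirac(x, MU, TAU):.5f}  "
          f"BE = {bose_einstein(x, MU, TAU):.5f}  "
          f"MB = {maxwell_boltzmann(x, MU, TAU):.5f}")
```

Near $\epsilon = \mu$ the three differ strongly, while a few $\tau$ above $\mu$ they become indistinguishable, which is the classical limit described above.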
Physics 301
Homework No. 4
Due 15-Oct-2002
H4-1
1. Before Debye’s theory of lattice vibrations, there was Einstein’s theory. We will explore Einstein’s theory in this problem. Assume that you have a solid consisting of $N$ atoms. Assume each atom is a 3D harmonic oscillator so there are $3N$ degrees of freedom. Assume each oscillator has the same set of energy levels, $n\hbar\omega$. Finally assume that the oscillators are not coupled. That is, each oscillator can oscillate independently of all the other oscillators. You might want to define an “Einstein temperature,” $\Theta = \hbar\omega/k$. Determine the energy and heat capacity of this system. Show that for $T \gg \Theta$, the heat capacity agrees with the Dulong and Petit value. Show that for $T \ll \Theta$, the heat capacity goes to zero exponentially. This does not agree with experiment, at least as far as the heat capacity of solids is concerned.

Comment 1. When Einstein made this theory, the low temperature heat capacity data were not very good. He was actually more concerned with explaining why some solids did not obey the Dulong and Petit law at room temperature. The reason is that they had fairly high natural frequencies of oscillation and therefore a large $\Theta$.

Comment 2. Although this theory, based on “uncoupled” oscillators, does not explain the heat capacity of solids, it does explain other things! Whenever we have independent oscillators and the temperature gets below $\hbar\omega/k$, the oscillator is basically in its ground state. Excitations of the oscillator are “frozen out.” So, the rotational and vibrational modes of diatomic and polyatomic gases do not contribute to the heat capacities at low temperatures. See K&K figure 3.9. Also, in a liquid, the transverse oscillations of the atoms about their equilibrium positions are basically uncoupled. So at low temperatures, the transverse oscillations freeze out and do not contribute to the heat capacity. This is why in last week’s homework problem on the heat capacity of liquid He, we only needed to include the longitudinal modes in the heat capacity.

2. Mathematical exercise as promised in lecture. Determine the chemical potential of an ideal gas by differentiating the energy with respect to particle number at constant volume and entropy.

3. K&K, chapter 5, problem 3.

4. K&K, chapter 5, problem 4.

5. K&K, chapter 5, problem 7.

6. K&K, chapter 5, problem 10.
7. K&K, chapter 5, problem 12. Note: if you get a very large tree, you have probably solved the wrong problem. A tree doesn’t need to hold up a column of water vapor, it needs to hold up a column of water (or rather sap, which is water with some “impurities” dissolved in it). To do this it uses osmotic pressure. Solute molecules in a dilute solution can be treated much the same way we treated an ideal gas. Suppose there is a membrane, through which solvent molecules can flow freely, but through which solute molecules cannot pass. (The membrane can be provided by the cells in the tree.) Also suppose that one side of the membrane contains no solute molecules and the other side contains a concentration n. What is the pressure difference across the membrane? Can you estimate n for sap? Can you estimate the height of a tree?
Physics 301
20-Oct-2002
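Physics 301 Problem Set 4 Solutions

Problem 1. There are $3N$ oscillators, each of frequency $\omega$. The average energy of one oscillator is $\hbar\omega/(\exp(\hbar\omega/\tau) - 1)$. The average energy of the system is the sum of the average energies of all the oscillators, which is simply
$$U = \frac{3N\hbar\omega}{\exp(\hbar\omega/\tau) - 1} \equiv \frac{3Nk\Theta}{\exp(\Theta/T) - 1} . \qquad (1)$$
The specific heat is therefore given by
$$C_v = \frac{\partial U}{\partial T} = 3Nk\left(\frac{\Theta}{T}\right)^2 \frac{e^{\Theta/T}}{(e^{\Theta/T} - 1)^2} = 3Nk\left(\frac{\Theta}{T}\right)^2 \frac{1}{4\sinh^2(\Theta/2T)} . \qquad (2)$$
In the limits of high and low temperature, using the facts $2\sinh x \approx e^x$ for $x \gg 1$ and $\sinh x \approx x$ for $x \ll 1$, this becomes:
$$T \gg \Theta: \quad C_v \to 3Nk , \qquad\qquad T \ll \Theta: \quad C_v \to 3Nk\left(\frac{\Theta}{T}\right)^2 e^{-\Theta/T} . \qquad (3)$$
We see that for high temperatures, the value approaches the Dulong-Petit value, and for low temperatures, it is exponentially damped.

Problem 2. We want to differentiate the energy $U$ with respect to the particle number $N$ at constant volume $V$ and entropy $\sigma$. We shall try to express the energy as a function of the independent variables $N$, $V$, $\sigma$. We know that the entropy of an ideal gas is $\sigma = N(\log(n_QV/N) + 5/2)$, where $n_Q = (M\tau/2\pi\hbar^2)^{3/2}$. This gives us the temperature in terms of the entropy,
$$\tau = \frac{2\pi\hbar^2}{M}\, n_Q^{2/3} = \frac{2\pi\hbar^2}{M}\left[\frac{N}{V}\exp\left(\frac{\sigma}{N} - \frac{5}{2}\right)\right]^{2/3} = \frac{2\pi\hbar^2}{M}\left(\frac{N}{V}\right)^{2/3}\exp\left(\frac{2\sigma}{3N} - \frac{5}{3}\right) , \qquad (4)$$
which gives the energy
$$U = \frac{3}{2}N\tau = \frac{3}{2}\,\frac{2\pi\hbar^2}{M}\,\frac{N^{5/3}}{V^{2/3}}\exp\left(\frac{2\sigma}{3N} - \frac{5}{3}\right) . \qquad (5)$$
Now, we can differentiate with respect to $N$ keeping the rest of the variables fixed:
$$\begin{aligned}
\mu = \left(\frac{\partial U}{\partial N}\right)_{V,\sigma} &= \frac{3}{2}\,\frac{2\pi\hbar^2}{M}\,\frac{1}{V^{2/3}}\left[\frac{5}{3}N^{2/3}\exp\left(\frac{2\sigma}{3N} - \frac{5}{3}\right) - N^{5/3}\,\frac{2\sigma}{3N^2}\exp\left(\frac{2\sigma}{3N} - \frac{5}{3}\right)\right] \\
&= \frac{2\pi\hbar^2}{M}\left(\frac{N}{V}\right)^{2/3}\exp\left(\frac{2\sigma}{3N} - \frac{5}{3}\right)\left[\frac{5}{2} - \frac{\sigma}{N}\right] \\
&= \tau\left[\frac{5}{2} - \frac{\sigma}{N}\right] = -\tau\log\frac{n_Q}{n} .
\end{aligned} \qquad (6)$$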
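The differentiation in Problem 2 can be checked numerically. The Python sketch below, in illustrative units with $\hbar = M = 1$ (an assumption made only to keep the numbers simple), builds $U(N, V, \sigma)$ from equations (4) and (5), takes a central difference in $N$ at fixed $V$ and $\sigma$, and compares with the closed form of equation (6). All function names are ours.

```python
import math

HBAR = 1.0   # illustrative units: hbar = M = 1 (assumption)
MASS = 1.0
PREF = 2.0 * math.pi * HBAR ** 2 / MASS


def temperature(n, v, sigma):
    """Equation (4): tau as a function of N, V and sigma."""
    return PREF * (n / v) ** (2.0 / 3.0) * math.exp(
        2.0 * sigma / (3.0 * n) - 5.0 / 3.0)


def energy(n, v, sigma):
    """Equation (5): U = (3/2) N tau expressed through N, V, sigma."""
    return 1.5 * n * temperature(n, v, sigma)


def mu_closed_form(n, v, sigma):
    """Equation (6): mu = tau (5/2 - sigma/N)."""
    return temperature(n, v, sigma) * (2.5 - sigma / n)


def mu_numeric(n, v, sigma, dn=1e-3):
    """Central difference dU/dN at constant V and sigma."""
    return (energy(n + dn, v, sigma) - energy(n - dn, v, sigma)) / (2.0 * dn)


# Pick a state by its temperature, then compute its entropy.
N, V, TAU = 1000.0, 1.0e5, 1.0
n_q = (MASS * TAU / (2.0 * math.pi * HBAR ** 2)) ** 1.5
SIGMA = N * (math.log(n_q * V / N) + 2.5)

print("tau recovered from (4):", temperature(N, V, SIGMA))
print("mu, numerical derivative:", mu_numeric(N, V, SIGMA))
print("mu, closed form (6):     ", mu_closed_form(N, V, SIGMA))
print("tau log(n/nQ):           ", TAU * math.log((N / V) / n_q))
```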
Problem 3. The average kinetic energy of an ideal gas is $3\tau/2$ per particle. For an ideal gas in a constant gravitational field, we know from equating chemical potentials that the density decreases exponentially as $n(h) = n(0)\exp(-mgh/\tau)$. We can normalize this by using the fact that the total number of particles is equal to $N = \int_0^\infty n(h)\,dh$. The average potential energy per particle is given by
$$\begin{aligned}
\langle V_{\rm part}\rangle = \langle mgh\rangle &= \frac{\int_0^\infty (mgh)\,n(h)\,dh}{\int_0^\infty n(h)\,dh} = \frac{\int_0^\infty (mgh)\,e^{-mgh/\tau}\,dh}{\int_0^\infty e^{-mgh/\tau}\,dh} \\
&= \frac{mg(\tau/mg)^2\int_0^\infty x e^{-x}\,dx}{(\tau/mg)\int_0^\infty e^{-x}\,dx} = mg(\tau/mg) = \tau .
\end{aligned} \qquad (7)$$
The total energy per atom is therefore $3\tau/2 + \tau = 5\tau/2$, and the total heat capacity per atom is $5k/2$.

Problem 4. Assuming the system of the ions in water is an ideal gas, the chemical potential is $\mu = \tau\log(n/n_Q)$. The difference in chemical potential across the membrane is $\delta\mu = \tau\log(n_1/n_2) = kT\log(10^4)$. Using the fact that at $T = 300$ K, $kT \approx 0.026$ eV, we get $\delta\mu \approx 0.026 \times 4\log 10 \approx 0.24$ eV. This is equivalent to a cell of voltage 0.24 V across the membrane.
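For the record, the Problem 4 estimate takes only a couple of lines of Python. The value of Boltzmann's constant in eV/K is the standard one; the function name is ours, and the equivalence of eV and volts assumes singly charged ions.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K


def chemical_potential_difference_ev(concentration_ratio, t_kelvin):
    """delta mu = k T log(n1/n2) for an ideal gas of ions, in eV."""
    return K_B_EV * t_kelvin * math.log(concentration_ratio)


delta_mu_ev = chemical_potential_difference_ev(1.0e4, 300.0)
print(f"delta mu = {delta_mu_ev:.3f} eV")  # about 0.24 eV
```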
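Problem 5. Think of each atom as being on one site of the lattice. The sites are labelled and non-interacting, so it is enough to compute quantities for one site. At each site, there are four states with different energies and particle numbers. The grand partition sum for one of the sites is
$$Z = \lambda e^{\Delta/2\tau} + e^{\delta/2\tau} + \lambda e^{-\Delta/2\tau} + \lambda^2 e^{-\delta/2\tau} . \qquad (8)$$
The average number of electrons per site is given by
$$\langle N\rangle = \frac{1}{Z}\,\lambda\frac{dZ}{d\lambda} = \frac{\lambda(e^{\Delta/2\tau} + e^{-\Delta/2\tau}) + 2\lambda^2 e^{-\delta/2\tau}}{\lambda e^{\Delta/2\tau} + e^{\delta/2\tau} + \lambda e^{-\Delta/2\tau} + \lambda^2 e^{-\delta/2\tau}} . \qquad (9)$$
Setting this equal to one gives us the relation:
$$\begin{aligned}
\lambda(e^{\Delta/2\tau} + e^{-\Delta/2\tau}) + 2\lambda^2 e^{-\delta/2\tau} &= \lambda e^{\Delta/2\tau} + e^{\delta/2\tau} + \lambda e^{-\Delta/2\tau} + \lambda^2 e^{-\delta/2\tau} \\
\Rightarrow\ \lambda^2 e^{-\delta/2\tau} &= e^{\delta/2\tau} \\
\Rightarrow\ \lambda^2 e^{-\delta/\tau} &= 1 .
\end{aligned} \qquad (10)$$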
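Problem 6. (a) In the notation of K&K, $Z = \sum_{\rm ASN} e^{(N\mu - E)/\tau}$. The average number of particles is found by differentiating this relation with respect to $\mu$ once. If we differentiate twice, we get:
$$\frac{\partial^2 Z}{\partial\mu^2} = \sum_{\rm ASN}\frac{N^2}{\tau^2}\, e^{(N\mu - E)/\tau} = \frac{Z}{\tau^2}\,\frac{\sum_{\rm ASN} N^2 e^{(N\mu - E)/\tau}}{Z} \equiv \frac{Z}{\tau^2}\langle N^2\rangle . \qquad (11)$$
This gives us the relation
$$\langle N^2\rangle = \frac{\tau^2}{Z}\,\frac{\partial^2 Z}{\partial\mu^2} . \qquad (12)$$
(b) We start with $\langle N\rangle = (\tau/Z)(\partial Z/\partial\mu)$. We differentiate this once to get:
$$\begin{aligned}
\frac{\partial\langle N\rangle}{\partial\mu} = \frac{\partial}{\partial\mu}\left(\frac{\tau}{Z}\frac{\partial Z}{\partial\mu}\right) &= \frac{\tau}{Z}\frac{\partial^2 Z}{\partial\mu^2} - \frac{\tau}{Z^2}\frac{\partial Z}{\partial\mu}\frac{\partial Z}{\partial\mu} \\
\Rightarrow\ \tau\frac{\partial\langle N\rangle}{\partial\mu} &= \tau^2\left[\frac{1}{Z}\frac{\partial^2 Z}{\partial\mu^2} - \frac{1}{Z^2}\frac{\partial Z}{\partial\mu}\frac{\partial Z}{\partial\mu}\right] \equiv \langle(\Delta N)^2\rangle .
\end{aligned} \qquad (13)$$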
Problem 7. If we naively estimate the height of the tree by saying that it is the maximum height to which a water vapour column at temperature 300 K can support itself, this height is given by $n(h) = n(0)\exp(-mgh/kT)$, where $n(h) = r\,n(0)$. The mass of a water molecule is $m \approx 18 \times 1.6 \times 10^{-27}$ kg, which for $r = 0.9$ gives a height $h \approx 1500$ m.

This, as said in the comment below the question, is solving the wrong problem. If the water is held up by osmotic pressure, then we can estimate the height as follows. There is a membrane which is permeable only to water and not to sugar. As a first approximation, assume the top of the tree is on one side of this membrane with sugar molecules, and the bottom part (immersed in a pool of water) is on the other side with no sugar. The presence of the sugar reduces the chemical potential of the water, and so water will flow up to equalize the chemical potential (by increasing its gravitational potential energy). Equating the chemical potential on the two sides gives
$$mgh = -kT\log\left(n_w^{\rm up}/n_w^{\rm down}\right) ,$$
where $m$ is the mass of a water molecule, and $n_w^{\rm up}$ and $n_w^{\rm down}$ are the concentrations of water on the two sides. The relation with the sugar concentration is $n_w^{\rm up}/n_w^{\rm down} \approx 1 - n_{\rm sugar}/n_w^{\rm down}$.

Another way to think about this is that there is a pressure exerted downwards by the sugar molecules (an ideal gas, for which $p = n_{\rm sugar}kT$), and this must be equal to the pressure difference in the water. This causes water to flow up above it to a height such that the pressures are equalized.

Either way, we get the following relation:
$$n_{\rm sugar}kT = \rho_{\rm water}gh \quad\Rightarrow\quad h = \frac{n_{\rm sugar}kT}{\rho_{\rm water}g} . \qquad (14)$$
Plugging in numbers: $\rho_{\rm water} = 1$ g/cc, $T = 300$ K, $g \approx 10$ m s$^{-2}$, and the estimate $n_{\rm sugar} = 0.5\% \times n_{\rm water}$, we get $h \approx 70$ m.

Some of you estimated (or looked up) the osmotic pressure directly to be ≈ 1 atm, which gives $h \approx 10$ m.
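The osmotic estimate of Problem 7 can be reproduced with a short Python sketch in SI units. The inputs are the ones quoted in the solution (water density 1 g/cc, 300 K, $g \approx 10$ m/s², a sugar concentration of 0.5% of the water concentration); the function names are ours, and treating the solute as an ideal gas is the approximation made in the solution, not a statement about real sap.

```python
K_B = 1.380649e-23    # Boltzmann constant, J/K
AMU = 1.66054e-27     # atomic mass unit, kg
WATER_MASS = 18.0 * AMU


def height_from_pressure(pressure_pa, rho=1000.0, g=10.0):
    """Height of a water column supported by a pressure difference."""
    return pressure_pa / (rho * g)


def osmotic_height(sugar_fraction, t_kelvin=300.0, rho=1000.0, g=10.0):
    """h = n_sugar k T / (rho g), with n_sugar = sugar_fraction * n_water."""
    n_water = rho / WATER_MASS              # molecules per cubic metre
    n_sugar = sugar_fraction * n_water
    pressure = n_sugar * K_B * t_kelvin     # ideal gas law for the solute
    return height_from_pressure(pressure, rho, g)


h_sugar = osmotic_height(0.005)
h_one_atm = height_from_pressure(101325.0)
print(f"0.5% sugar: h = {h_sugar:.0f} m")
print(f"1 atm osmotic pressure: h = {h_one_atm:.0f} m")
```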
Week 5. Ideal Gas (again), Sackur-Tetrode Entropy, Ideal Fermi Gas
Physics 301
14-Oct-2002 14-1
Reading K&K chapter 6 and the first half of chapter 7 (the Fermi gas).
The Ideal Gas Again

Using the grand partition function, we’ve discussed the Fermi-Dirac and Bose-Einstein distributions and their classical (low occupancy) limit, the Maxwell-Boltzmann distribution. In lecture 7, we considered an ideal gas starting from the partition function. We considered the states of a single particle in a box and we used the Boltzmann factor and the number of such states to calculate the partition function for a single particle in a box. Then we said the partition function for $N$ weakly interacting particles is the product of $N$ single particle partition functions divided by $N!$,
$$Z_N(\tau) = \frac{1}{N!}Z_1^N = \frac{1}{N!}(n_QV)^N .$$
We introduced the factor of N ! to account for the permutations of the N particles among the single particle states forming the overall composite state of the system. The introduction of this N! factor was something of a “fast one!” We gave a plausible argument for it, but without a formalism that includes the particle number, it’s hard to do more. Now that we have the grand partition function we can reconsider the problem. In addition to cleaning up this detail, we also want to consider how to account for the internal states of the molecules in a gas, the heat and work, etc., required for various processes with an ideal gas, and also we want to consider the absolute entropy and see how the Sackur-Tetrode formula relates to experiment.
The N Particle Problem

The factor of $N!$ in the ideal gas partition function was apparently controversial in the early days of statistical mechanics. In fact, in Schroedinger’s book on the subject, he has a chapter called “The N Particle Problem.” I think this is kind of amusing, so let’s see what the N particle problem really is. The free energy with the $N!$ term is
$$\begin{aligned}
F_C &= -\tau\log Z , \\
&= -\tau\log\left((n_QV)^N/N!\right) , \\
&= -\tau\left(N\log n_Q + N\log V - N\log N + N\right) , \\
&= -\tau N\log\left[(m\tau/2\pi\hbar^2)^{3/2}(V/N)\right] - \tau N ,
\end{aligned}$$
where the subscript $C$ denotes the “correct” free energy. Without the $N!$ term, the “incorrect” free energy is
$$\begin{aligned}
F_I &= -\tau\log Z , \\
&= -\tau\log(n_QV)^N , \\
&= -\tau\left(N\log n_Q + N\log V\right) , \\
&= -\tau N\log\left[(m\tau/2\pi\hbar^2)^{3/2}V\right] .
\end{aligned}$$
For the entropy, $\sigma = -\partial F/\partial\tau$,
$$\sigma_C = N\log(n_QV/N) + (3/2)N + N = N\left[\log\frac{n_QV}{N} + \frac{5}{2}\right] ,$$
and
$$\sigma_I = N\log(n_QV) + (3/2)N = N\left[\log(n_QV) + \frac{3}{2}\right] .$$
With a given amount of gas, the change in entropy between an initial and final state is given correctly by either formula,
$$\sigma_{Cf} - \sigma_{Ci} = \sigma_{If} - \sigma_{Ii} = \frac{3}{2}N\log\frac{\tau_f}{\tau_i} + N\log\frac{V_f}{V_i} .$$
But what happens when we change the amount of gas? Note that $N$ and $V$ are both extensive quantities; the concentration, $n = N/V$, is an intensive quantity. $\sigma_C$ is proportional to an extensive quantity. On the other hand, $\sigma_I$ contains an extensive quantity times the logarithm of an extensive quantity. This means that the (incorrect) entropy is not proportional to the amount of gas we have!

For example, suppose we have two volumes $V$, each containing $N$ molecules of (the same kind of) gas at temperature $\tau$. Then each has $\sigma_I = N(\log(n_QV) + 3/2)$, for a total of $2N(\log(n_QV) + 3/2)$. We can imagine a volume $2V$ divided in half by a removable partition. We start with the partition in place and the entropy as above. We remove the partition. Now we have $2N$ molecules in a volume $2V$. The entropy becomes $\sigma_I = 2N(\log(2n_QV) + 3/2)$, which exceeds the entropy with the partition in place by $\Delta\sigma_I = 2N\log 2$! But did anything really change upon removing the partition? What kind of measurements could we make on the gas in either volume to detect whether the partition were in place or not? Note that the total $\sigma_C$ is the same before and after the partition is removed.

We might consider the same experiment but performed with two different kinds of molecules, A and B. We start with $N$ molecules of type A on one side of the partition and $N$ molecules of type B on the other side of the partition. Before the partition is removed, we have
$$\sigma_C = N\left[\log\frac{n_{QA}V}{N} + \frac{5}{2}\right] + N\left[\log\frac{n_{QB}V}{N} + \frac{5}{2}\right] ,$$
where the two kinds of molecules may have different masses and so might have different quantum concentrations. Now we remove the partition. This time we have to wait for equilibrium to be established. We assume that no chemical reactions occur; we are only waiting for the molecules to diffuse so that they are uniformly mixed. Once equilibrium has been established, each molecule occupies single particle states in a volume $2V$ and the entropy is
$$\sigma_C = N\left[\log\frac{2n_{QA}V}{N} + \frac{5}{2}\right] + N\left[\log\frac{2n_{QB}V}{N} + \frac{5}{2}\right] ,$$
which is $2N\log 2$ greater than the initial entropy. This increase is called the entropy of mixing.
In the experiment with the same gas on both sides of the partition, the incorrect expression for entropy gave an increase which turns out to be the same as the entropy of mixing if we start out with two different gases. This emphasizes the point that the incorrect expression results from over counting the states by treating the molecules as distinguishable. In the mixing experiment, we can make measurements that tell us whether the partition has been removed. If we sample the gas in one of the volumes and find all the molecules are type A, then we’re pretty sure that the partition hasn’t been removed! If we find a mixture of type A and type B, then we’re pretty sure that it has been removed. Finally, note that if we go back to the case of the same gas, and reinsert the partition, then $\sigma_I$ decreases by $2N\log 2$. This is a violation of the second law!
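The partition experiments are easy to replay numerically. In the Python sketch below, `sigma_correct` and `sigma_incorrect` are our names for the two entropy formulas (with and without the $N!$), and the values of $N$, $V$ and $n_Q$ are arbitrary illustrative numbers chosen so that $n_QV/N \gg 1$.

```python
import math


def sigma_correct(n, v, n_q):
    """Entropy with the N! included: N [log(nQ V / N) + 5/2]."""
    return n * (math.log(n_q * v / n) + 2.5)


def sigma_incorrect(n, v, n_q):
    """Entropy without the N!: N [log(nQ V) + 3/2]."""
    return n * (math.log(n_q * v) + 1.5)


N, V, NQ = 1000.0, 1.0e6, 1.0   # illustrative values, nQ V / N = 1000

# Same gas on both sides: two boxes (N, V) become one box (2N, 2V).
delta_correct = sigma_correct(2 * N, 2 * V, NQ) - 2 * sigma_correct(N, V, NQ)
delta_incorrect = (sigma_incorrect(2 * N, 2 * V, NQ)
                   - 2 * sigma_incorrect(N, V, NQ))

# Different gases A and B (taken to have the same nQ for simplicity):
# each species keeps its N molecules but now occupies the volume 2V.
delta_mixing = 2 * sigma_correct(N, 2 * V, NQ) - 2 * sigma_correct(N, V, NQ)

print("same gas, correct formula:  ", delta_correct)
print("same gas, incorrect formula:", delta_incorrect)
print("two gases, entropy of mixing:", delta_mixing)
print("2 N log 2 =", 2 * N * math.log(2.0))
```

The correct formula gives no change for identical gases, while the incorrect formula reproduces the entropy of mixing, $2N\log 2$, even though nothing has been mixed.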
The Ideal Gas From the Grand Partition Function

An ideal gas is the low occupancy limit of non-interacting particles. In this limit, both the Fermi-Dirac and Bose-Einstein distributions become the Maxwell-Boltzmann distribution, which is
$$f(\epsilon) = e^{(\mu - \epsilon)/\tau} ,$$
where $f(\epsilon) \ll 1$ is the average occupancy of a state with energy $\epsilon$. The chemical potential, $\mu$, is found by requiring that the gas have the correct number of molecules,
$$\begin{aligned}
N &= \sum_{\rm All\ states} e^{(\mu - \epsilon)/\tau} , \\
&= e^{\mu/\tau}\sum_{\rm All\ states} e^{-\epsilon/\tau} , \\
&= e^{\mu/\tau}Z_1 , \\
&= e^{\mu/\tau}n_QV ,
\end{aligned}$$
where $Z_1$ is the single particle partition function we discussed earlier. Then
$$\mu = \tau\log\frac{n}{n_Q} ,$$
as we found earlier. The free energy satisfies
$$\left(\frac{\partial F}{\partial N}\right)_{\tau,V} = \mu(N, \tau, V) ,$$
so
$$\begin{aligned}
F &= \int_0^N \mu(N', \tau, V)\,dN' , \\
&= \int_0^N \tau\log\frac{N'}{n_QV}\,dN' , \\
&= \tau\int_0^N \left(\log N' - \log(n_QV)\right)dN' , \\
&= \tau\Big[N'\log N' - N' - N'\log(n_QV)\Big]_0^N , \\
&= N\tau\left[\log\frac{n}{n_Q} - 1\right] .
\end{aligned}$$
Of course, this is in agreement with what we had before. The $N!$ factor that we previously inserted by hand comes about naturally with this method. (It is responsible for the $N$ in the concentration in the logarithm and the $-1$ within the brackets.) As a reminder, the pressure is found from the free energy by
$$p = -\left(\frac{\partial F}{\partial V}\right)_{\tau,N} ,$$
which gives the ideal gas equation of state
$$p = \frac{N\tau}{V} .$$
The entropy is found by differentiating with respect to the temperature,
$$\sigma = -\left(\frac{\partial F}{\partial\tau}\right)_{V,N} ,$$
which gives the Sackur-Tetrode expression,
$$\sigma = N\left[\log\frac{n_Q}{n} + \frac{5}{2}\right] .$$
The internal energy is most easily found from
$$U = F + \tau\sigma = \frac{3}{2}N\tau .$$
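As a sanity check on these derivatives, the following Python sketch differentiates the free energy numerically and compares with the closed forms for $p$, $\sigma$, $U$ and $\mu$. It works in illustrative units with $\hbar = m = 1$ (our assumption), uses simple central differences, and all function names are ours.

```python
import math


def n_quantum(tau, mass=1.0, hbar=1.0):
    """Quantum concentration nQ = (m tau / 2 pi hbar^2)^(3/2)."""
    return (mass * tau / (2.0 * math.pi * hbar ** 2)) ** 1.5


def free_energy(n, tau, v):
    """F = N tau [log(n / nQ) - 1] with n = N / V."""
    return n * tau * (math.log(n / (v * n_quantum(tau))) - 1.0)


def central_difference(func, x, h):
    """Symmetric finite difference approximation to d(func)/dx."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


N, TAU, V = 1000.0, 2.0, 1.0e5   # illustrative state

pressure = -central_difference(lambda v: free_energy(N, TAU, v), V, 1.0)
entropy = -central_difference(lambda t: free_energy(N, t, V), TAU, 1e-5)
chem_pot = central_difference(lambda n: free_energy(n, TAU, V), N, 1e-3)
internal_energy = free_energy(N, TAU, V) + TAU * entropy

print("p:", pressure, "vs N tau / V =", N * TAU / V)
print("sigma:", entropy, "vs Sackur-Tetrode =",
      N * (math.log(n_quantum(TAU) * V / N) + 2.5))
print("U:", internal_energy, "vs (3/2) N tau =", 1.5 * N * TAU)
print("mu:", chem_pot, "vs tau log(n/nQ) =",
      TAU * math.log((N / V) / n_quantum(TAU)))
```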
The energy of an ideal gas depends only on the number of particles and the temperature. Since
$$dU = \tau\,d\sigma - p\,dV + \mu\,dN ,$$
the change in energy at constant volume and particle number is just $\tau\,d\sigma$. Then the heat capacity at constant volume is
$$C_V = \tau\left(\frac{\partial\sigma}{\partial\tau}\right)_{V,N} ,$$
which for the case of an ideal gas is
$$C_V = \frac{3}{2}N = \frac{3}{2}Nk ,$$
where the last expression gives the heat capacity in conventional units. The molar specific heat at constant volume is $(3/2)N_0k = (3/2)R$, where $R$ is the gas constant. The heat capacity at constant pressure can be found by requiring that $dV$ and $d\tau$ be such that $p$ doesn’t change,
$$C_p = \tau\left(\frac{\partial\sigma}{\partial\tau}\right)_{p,N} = \left(\frac{\partial U}{\partial\tau}\right)_{p,N} + p\left(\frac{\partial V}{\partial\tau}\right)_{p,N} .$$
Since $U$ depends only on $N$ and $\tau$,
$$\left(\frac{\partial U}{\partial\tau}\right)_{p,N} = C_V .$$
With the ideal gas equation of state, $V = (N/p)\tau$,
$$p\left(\frac{\partial V}{\partial\tau}\right)_{p,N} = N ,$$
so
$$C_p = C_V + N , \qquad\text{or}\qquad C_p = C_V + Nk \quad\text{(in conventional units)} .$$
For the molar heat capacities, we have
$$C_p = C_V + R ,$$
and for the ideal monatomic gas, these are
$$C_V = \frac{3}{2}R \qquad\text{and}\qquad C_p = \frac{5}{2}R .$$
The ratio of specific heats is usually denoted by $\gamma$, which for an ideal monatomic gas is
$$\gamma = \frac{C_p}{C_V} = \frac{5}{3} .$$
Physics 301
16-Oct-2002 15-1
Internal Degrees of Freedom

There are several corrections we might make to our treatment of the ideal gas. If we go to high occupancies, our treatment using the Maxwell-Boltzmann distribution is inappropriate and we should start from the Fermi-Dirac or Bose-Einstein distribution directly. We have ignored the interactions between molecules. This is a good approximation for low density gases, but not so good for higher densities (but these higher densities can still be low enough that the MB distribution applies). We will discuss an approximate treatment of interactions in a few weeks when we discuss phase transitions. Finally, we have ignored any internal structure of the molecules. We will remedy this omission now.

We imagine that each molecule contains several internal states with energies $\epsilon_{\rm int}$. Note that int is understood to be an index over the internal states. There may be states with the same energy and states with differing energies. In our non-interacting model, the external energy is just the kinetic energy due to the translational motion of the center of mass, $\epsilon_{\rm cm}$. Again, cm is to be understood as an index which ranges over all states of motion of the center of mass. Although we are considering internal energies, we are not considering ionization or dissociation. When a molecule changes its internal state, we assume the number of particles does not change.

Let's consider the grand partition function for a single state of center of mass motion. That is, we're going to consider the grand partition function for single particle states, with internal degrees of freedom, in a box. The energy of the particle is $\epsilon_{\rm cm} + \epsilon_{\rm int}$. Then the grand partition function is

$$
\begin{aligned}
Z &= 1 + e^{(\mu - \epsilon_{\rm cm} - \epsilon_{{\rm int},1})/\tau} + e^{(\mu - \epsilon_{\rm cm} - \epsilon_{{\rm int},2})/\tau} + \cdots + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{\rm cm})/\tau} \sum_{\rm int} e^{-\epsilon_{\rm int}/\tau} + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int} + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int} + e^{2(\mu - \epsilon_{\rm cm})/\tau} \frac{Z_{\rm int}^2}{2!} + e^{3(\mu - \epsilon_{\rm cm})/\tau} \frac{Z_{\rm int}^3}{3!} + \cdots ,
\end{aligned}
$$

where $Z_{\rm int}$ is the partition function for the internal states. The above expression is strictly correct only for bosons. For fermions, we would need to be sure that the multiple particle terms have all particles in different states, which means that the internal partition functions do not factor as shown above. However, we really don't need to worry about this because we're going to go to the classical limit where the occupancy is very small. This means we can truncate the sum
above after the second term,

$$Z = 1 + e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int} .$$

The mean occupancy of the center of mass state, whatever the internal state, is

$$f(\epsilon_{\rm cm}) = \frac{e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int}}{1 + e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int}} \approx e^{(\mu - \epsilon_{\rm cm})/\tau} Z_{\rm int} ,$$

which is just the Maxwell-Boltzmann distribution with an extra factor of the internal partition function, $Z_{\rm int}$.

Now we should modify our previous expressions to allow for this extra factor of $Z_{\rm int}$. Recall that we chose the chemical potential to get the correct number of particles. In that calculation, $\exp(\mu/\tau)$ must be replaced by $\exp(\mu/\tau) Z_{\rm int}$, and everything else will go through as before. Then our new expression for $\mu$ is

$$\mu = \tau \log\frac{n}{n_Q Z_{\rm int}} = \tau \left( \log\frac{n}{n_Q} - \log Z_{\rm int} \right) .$$

The free energy becomes

$$F = F_{\rm cm} + F_{\rm int} = N\tau \left[ \log\frac{n}{n_Q Z_{\rm int}} - 1 \right] ,$$

where $F_{\rm cm}$ is our previous expression for the free energy due to the center of mass motion of molecules with no internal degrees of freedom, and

$$F_{\rm int} = -N\tau \log Z_{\rm int}$$

is the free energy of the internal states alone. The expression for the pressure is unchanged since, in the normal situation, the partition function of the internal states does not depend on the volume. (Is this really true? How do we get liquids and solids? Under what conditions might it be a good approximation?) The expression for the entropy becomes

$$\sigma = \sigma_{\rm cm} + \sigma_{\rm int} ,$$

where $\sigma_{\rm cm}$ is our previous expression for the entropy of an ideal gas, the Sackur-Tetrode expression, and

$$\sigma_{\rm int} = -\left( \frac{\partial F_{\rm int}}{\partial \tau} \right)_{V,N} = \left( \frac{\partial (N\tau \log Z_{\rm int})}{\partial \tau} \right)_{V,N} = N \log Z_{\rm int} + N\tau \left( \frac{\partial (\log Z_{\rm int})}{\partial \tau} \right)_{V,N} .$$

The energy, $U$, and therefore the heat capacities, receive a contribution from the internal states. The extra energy is

$$U_{\rm int} = F_{\rm int} + \tau \sigma_{\rm int} = F_{\rm int} - \tau \left( \frac{\partial F_{\rm int}}{\partial \tau} \right)_{V,N} = -\tau^2 \left( \frac{\partial}{\partial \tau} \frac{F_{\rm int}}{\tau} \right)_{V,N} .$$
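As a rough numerical check of the relations between $Z_{\rm int}$, $F_{\rm int}$, $\sigma_{\rm int}$ and $U_{\rm int}$, the following Python sketch differentiates $F_{\rm int} = -N\tau \log Z_{\rm int}$ by finite differences. The function name `internal_thermo`, the degeneracy `g = 3`, and the two-level spacing `delta` are illustrative assumptions, not part of the notes.

```python
import math

def internal_thermo(log_z, tau, n_particles=1.0, rel_step=1e-6):
    """Return (F_int, sigma_int, U_int) given a function log_z(tau) = log Z_int.

    Hypothetical helper: the derivative dF_int/dtau is approximated by a
    central finite difference, so the results are approximate.
    """
    def free_energy(t):
        # F_int = -N tau log Z_int
        return -n_particles * t * log_z(t)

    step = rel_step * tau
    # sigma_int = -(dF_int/dtau) at fixed V and N
    sigma = -(free_energy(tau + step) - free_energy(tau - step)) / (2.0 * step)
    f_int = free_energy(tau)
    # U_int = F_int + tau * sigma_int
    return f_int, sigma, f_int + tau * sigma

# Case 1 (assumed example): g degenerate internal states at zero energy, Z_int = g.
g = 3
f1, s1, u1 = internal_thermo(lambda t: math.log(g), tau=1.0)

# Case 2 (assumed example): two levels separated by delta, Z_int = 1 + exp(-delta/tau).
delta, tau2 = 1.0, 0.5
f2, s2, u2 = internal_thermo(lambda t: math.log(1.0 + math.exp(-delta / t)), tau=tau2)
u2_exact = delta / (math.exp(delta / tau2) + 1.0)  # closed form for one particle

print(s1, u1, u2, u2_exact)
```

A temperature-independent $Z_{\rm int}$ adds entropy but no energy, while a temperature-dependent one contributes to both, which is why only the latter affects the heat capacity.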
To make further progress, we need to consider some specific examples of internal structure that can give rise to $Z_{\rm int}$. Suppose the molecules are single atoms but these atoms have a spin quantum number $S$. Then there are $2S + 1$ internal states that correspond to the $2S + 1$ projections of the spin along an arbitrary axis. In the absence of a magnetic field, all these states have the same energy, which we take as $\epsilon_{\rm int} = 0$. Then $Z_{\rm int} = 2S + 1$ and

$$F_{\rm int} = -N\tau \log(2S + 1) , \qquad \sigma_{\rm int} = N \log(2S + 1) , \qquad U_{\rm int} = 0 ,$$

so the entropy is increased over that of a simple ideal gas, but the energy doesn't change. The increase in entropy is easy to understand. What's happening is that each atom has $2S + 1$ times as many states available as a simple atom with no internal structure. The entropy, the logarithm of the number of states, increases by $\log(2S + 1)$ per atom.

That was a fairly trivial example. Here's another one: suppose that each molecule has one internal state with energy $\epsilon_1$. Then $Z_{\rm int} = \exp(-\epsilon_1/\tau)$ and

$$F_{\rm int} = -N\tau \log Z_{\rm int} = +N\epsilon_1 , \qquad \sigma_{\rm int} = 0 , \qquad U_{\rm int} = N\epsilon_1 ,$$

and

$$\Delta\mu = -\tau \log Z_{\rm int} = +\epsilon_1 .$$

In this example, we didn't change the entropy (each molecule has just one state), but we added $\epsilon_1$ to the energy of each molecule. This change in energy shows up in the chemical potential as a per molecule change and it shows up in the free energy and energy as $N$ times the per molecule change. This example is basically a small test of the self-consistency of the formalism!

More realistic examples include the rotational and vibrational states of the molecules. Single atoms have neither rotational nor vibrational modes (they do have electronic excitations!). A linear molecule (any diatomic molecule and some symmetric molecules such as CO$_2$, but not H$_2$O) has two rotational degrees of freedom. Non-linear molecules have three rotational degrees of freedom. Diatomic molecules have one degree of vibrational freedom. More complicated molecules have more degrees of vibrational freedom.
If the molecule has $M$ atoms, $3M$ coordinates are required to specify the locations of all the atoms. The molecule thus has $3M$ degrees of freedom. Three of these are used in specifying the location of the center of mass. Two or three are used for the rotational degrees of freedom. The remainder are vibrational degrees of freedom.

You might be uncomfortable with zero or two degrees of rotational freedom for point or linear molecules. To make this plausible, recall that an atom consists of a nucleus surrounded by an electron cloud. The electrons are in states with angular momentum, and to change the angular momentum one or more electrons must be placed in an excited electronic state. This is possible, but if there is an appreciable thermal excitation of such states, the atom has a fair chance of being ionized. If the atom is part of a molecule, that molecule has probably been dissociated, as molecular binding energies are usually small compared to atomic binding energies. The upshot of all this is that such excitations are not important unless the temperature is high enough that molecules are dissociating and atoms are ionizing!

Rotational energy is the square of the angular momentum divided by twice the moment of inertia. (Ignoring things like the fact that inertia is a tensor!) Since angular momentum is quantized, so is rotational energy. This means that at high temperatures, we expect an average energy of $\tau/2$ per rotational degree of freedom, but at low temperatures we expect that the rotational modes are “exponentially frozen out.” In this case, they do not contribute to the partition function, the energy, or the entropy. The spacing between the rotational energy levels sets the scale for low and high temperatures.

Similarly, for each vibrational degree of freedom, we expect that the corresponding normal mode of oscillation can be treated as a harmonic oscillator and that at high temperatures there will be an average energy of $\tau$ per vibrational degree of freedom ($\tau/2$ in kinetic energy and $\tau/2$ in potential energy). At low temperatures the vibrational modes are exponentially frozen out and do not contribute to the internal partition function, the energy or the entropy. $\hbar$ times the frequency of vibration sets the scale for low and high temperatures. As an example, consider a diatomic gas.
At low temperatures, the energy will be 3Nτ /2, the entropy will be given by the Sackur-Tetrode expression and the molar heat capacities will be CV = 3R/2 and Cp = 5R/2 with γ = 5/3. As the temperature is raised the rotational modes are excited and the energy becomes 5Nτ /2 with molar specific heats of CV = 5R/2 and Cp = 7R/2 and γ = 7/5. If the temperature is raised still higher the vibrational modes can be excited and U = 7Nτ /2, CV = 7R/2, Cp = 9R/2, and γ = 9/7.
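The bookkeeping behind these heat capacities can be sketched in a few lines of Python. This is only an illustration under the equipartition assumptions stated above ($R/2$ per thawed translational or rotational degree of freedom, $R$ per thawed vibrational mode); the function names are made up for this example.

```python
from fractions import Fraction

def molar_heat_capacities(rot_dof, vib_modes):
    """Return (C_V/R, C_p/R, gamma) assuming the listed modes are fully thawed.

    rot_dof   : number of excited rotational degrees of freedom (0, 2 or 3)
    vib_modes : number of excited vibrational modes (each contributes R:
                R/2 kinetic plus R/2 potential)
    """
    cv = Fraction(3, 2) + Fraction(rot_dof, 2) + vib_modes
    cp = cv + 1  # C_p = C_V + R for an ideal gas
    return cv, cp, cp / cv

def vibrational_modes(n_atoms, linear):
    """Count vibrational modes: 3M minus 3 translations minus the rotations."""
    if n_atoms == 1:
        return 0
    return 3 * n_atoms - 3 - (2 if linear else 3)

# Diatomic gas at increasing temperature: frozen, rotations thawed, vibration thawed.
for rot, vib in [(0, 0), (2, 0), (2, 1)]:
    cv, cp, gamma = molar_heat_capacities(rot, vib)
    print(f"C_V = {cv} R, C_p = {cp} R, gamma = {gamma}")
```

Using exact fractions keeps the ratios in the same form as the text (5/3, 7/5, 9/7) rather than as rounded decimals.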
Ideal Gas Processes

We will consider various processes involving a fixed amount of an ideal gas. We will assume that the heat capacities are independent of temperature for these processes. (In other words, the temperature changes will not be large enough to thaw or freeze the rotational or vibrational modes.) We will want to know the work done, heat added, the change in energy and the change in entropy of the system. Note that work and heat depend on the process, while energy and entropy changes depend only on the initial and final states. For the most part we will consider reversible processes.

Consider a constant volume process. About the only thing one can do is add heat! In this case, $p_f/p_i = T_f/T_i$, and

$$
\begin{aligned}
Q &= nC_V (T_f - T_i) , \\
W &= 0 , \\
\Delta U &= nC_V (T_f - T_i) , \\
\Delta S &= nC_V \int_{T_i}^{T_f} \frac{dT}{T} = nC_V \log\frac{T_f}{T_i} ,
\end{aligned}
$$

where $Q$ is the heat added to the gas, $W$ is the work done on the gas, $n$ is the number of moles, $C_V$ and $C_p$ are the molar heat capacities in conventional units and $T$ and $S$ are the temperature and entropy in conventional units.

Consider a constant pressure (isobaric) process. In this case, if heat is added, the gas will expand and $V_f/V_i = T_f/T_i$, with

$$
\begin{aligned}
Q &= nC_p (T_f - T_i) , \\
W &= -\int_{V_i}^{V_f} p\, dV = -nR (T_f - T_i) , \\
\Delta U &= nC_V (T_f - T_i) , \\
\Delta S &= nC_p \int_{T_i}^{T_f} \frac{dT}{T} = nC_p \log\frac{T_f}{T_i} = nC_V \log\frac{T_f}{T_i} + nR \log\frac{V_f}{V_i} .
\end{aligned}
$$

Consider a constant temperature (isothermal) process. Heat is added and the gas expands to maintain a constant temperature. The pressure and volume satisfy $p_f/p_i = (V_f/V_i)^{-1}$, and

$$
\begin{aligned}
W &= -\int_{V_i}^{V_f} p\, dV = -nRT \int_{V_i}^{V_f} \frac{dV}{V} = -nRT \log\frac{V_f}{V_i} , \\
\Delta U &= 0 , \\
Q &= -W = nRT \log\frac{V_f}{V_i} , \\
\Delta S &= nR \int_{V_i}^{V_f} \frac{dV}{V} = nR \log\frac{V_f}{V_i} .
\end{aligned}
$$

Consider a constant entropy process. This is often called an adiabatic process. However, adiabatic is also taken to mean that no heat is transferred. Since it is possible to change the entropy without heat transfer, the term isentropic can be used to explicitly mean that the entropy is constant. It is left as an exercise to show that in an isentropic process with an ideal gas, $pV^\gamma = \text{constant}$. Then

$$
\begin{aligned}
W &= -\int_{V_i}^{V_f} p\, dV = -p_i V_i^\gamma \int_{V_i}^{V_f} \frac{dV}{V^\gamma} = \frac{p_i V_i^\gamma}{\gamma - 1} \left( \frac{1}{V_f^{\gamma-1}} - \frac{1}{V_i^{\gamma-1}} \right) = -\frac{nRT_i}{\gamma - 1} \left[ 1 - \left( \frac{V_i}{V_f} \right)^{\gamma-1} \right] , \\
Q &= 0 , \\
\Delta U &= W = -\frac{nRT_i}{\gamma - 1} \left[ 1 - \left( \frac{V_i}{V_f} \right)^{\gamma-1} \right] , \\
\Delta S &= 0 .
\end{aligned}
$$
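The four reversible processes above can be collected into a small Python sketch, which makes it easy to check that each set of results satisfies the first law, $\Delta U = Q + W$, with $W$ the work done on the gas. The function and class names, and the sample numbers at the bottom, are illustrative assumptions.

```python
import math
from dataclasses import dataclass

R = 8.314462618  # molar gas constant, J / (mol K)

@dataclass
class ProcessResult:
    Q: float   # heat added to the gas
    W: float   # work done ON the gas
    dU: float  # change in energy
    dS: float  # change in entropy

def isochoric(n, cv, t_i, t_f):
    q = n * cv * (t_f - t_i)
    return ProcessResult(q, 0.0, q, n * cv * math.log(t_f / t_i))

def isobaric(n, cv, t_i, t_f):
    cp = cv + R
    return ProcessResult(n * cp * (t_f - t_i), -n * R * (t_f - t_i),
                         n * cv * (t_f - t_i), n * cp * math.log(t_f / t_i))

def isothermal(n, t, v_i, v_f):
    q = n * R * t * math.log(v_f / v_i)
    return ProcessResult(q, -q, 0.0, n * R * math.log(v_f / v_i))

def isentropic(n, cv, t_i, v_i, v_f):
    gamma = (cv + R) / cv
    w = -n * R * t_i / (gamma - 1.0) * (1.0 - (v_i / v_f) ** (gamma - 1.0))
    return ProcessResult(0.0, w, w, 0.0)

# Assumed sample case: one mole of a monatomic gas (C_V = 3R/2).
cv_mono = 1.5 * R
results = {
    "isochoric": isochoric(1.0, cv_mono, 300.0, 400.0),
    "isobaric": isobaric(1.0, cv_mono, 300.0, 400.0),
    "isothermal": isothermal(1.0, 300.0, 1.0, 2.0),
    "isentropic": isentropic(1.0, cv_mono, 300.0, 1.0, 2.0),
}
for name, r in results.items():
    print(name, r)
```

For the isentropic case the final temperature follows from $T V^{\gamma-1} = \text{constant}$, so $\Delta U$ should also equal $nC_V(T_f - T_i)$; the sketch does not use that relation directly, which makes it a useful independent check.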
Finally, let's consider an irreversible process. Suppose a gas is allowed to expand from a volume $V_i$ into a vacuum until its volume is $V_f$. This is called a free expansion. No work is done on the gas and no heat is added, so the energy and temperature don't change. The initial and final states are the same as in the reversible isothermal expansion, so the entropy change is the same as for that case,

$$
\begin{aligned}
Q &= 0 , \\
W &= 0 , \\
\Delta U &= 0 , \\
\Delta S &= nR \log\frac{V_f}{V_i} .
\end{aligned}
$$

This is an adiabatic, but not isentropic, process. Note that if a gas is not ideal, then it may be that the energy depends on volume (or rather, concentration) as well as temperature. Such a deviation from the ideal gas law can be uncovered by measuring the temperature of a gas before and after a free expansion.
The Gibbs Paradox Revisited

Last lecture's treatment of the $N$ particle problem might have engendered some uneasiness, especially the example with two volumes of gas that were allowed to mix. Recall that we had two volumes, $V$, each containing $N$ molecules of gas at temperature $\tau$, and separated by a partition. We considered two cases: the same gas on both sides of the partition and different gases, A and B, on the two sides of the partition. When the partition is removed in the same gas case “nothing happens,” the system is still in equilibrium and the entropy doesn't change, according to the correct expression which included the $N!$ over counting correction in the partition function. When the partition is removed in the different gas case, we must wait a while for equilibrium to be established and once this happens, we find that the entropy has gone up by $2N \log 2$. This is called the entropy of mixing. The incorrect expression for the entropy (omitting the $N!$ over counting correction in the partition function) gives the same entropy of mixing in both cases. This manifestation of the $N$ particle problem is called the “Gibbs paradox.”

The fact that we have to wait for equilibrium to be established means that the mixing of the two different gases is a non-reversible process. Entropy always increases in non-reversible processes! On the other hand, removing the partition between identical gases is a reversible process (in the limit of a negligible mass, frictionless partition...). In a reversible process, total entropy (system plus surroundings) does not increase, and there is obviously no entropy change in the surroundings when the partition is removed from the identical gases.

The question of measuring the entropy has come up several times. There is no such thing as an entropy meter that one can attach to a system and out pops a reading of the entropy! Changes of the entropy can be measured. Recall that $d\sigma = \bar{d}Q/\tau$ for a reversible process.
So if we can take a system from one state to another via a (close approximation of a) reversible process and measure the heat flow and temperature, we can deduce the entropy difference between the two states. To measure the absolute entropy of a state, we must start from a state whose absolute entropy we know. This is a state at $\tau = 0$! We will see how this goes in the comparison of the Sackur-Tetrode expression with experimental results.

Aside: the fact that we can only measure changes in entropy should not be that bothersome. At the macroscopic level, entropy is defined by an integral ($\int \bar{d}Q/\tau$). There is always the question of the constant of integration. A similar problem occurs with potential energy. It is potential energy differences that are important to the dynamics and only differences can be measured. For example, consider a mass $m$ on the surface of the Earth. If we take the zero of gravitational potential energy to be at infinite separation of the mass and the Earth, then the potential energy of the mass-Earth system is $-GMm/R$ when the mass sits on the surface of the Earth at distance $R$ from the center. I'm sure we're all happy with this, right? But there is no potential energy meter that you can attach to the mass and out pops a potential energy value. Instead, $-GMm/R$ is a calculated value, much like the entropy of mixing is a calculated value.

What can we actually measure in the gravitational case? We can measure the force required to change the height (distance from the center of the Earth) of the mass and so measure $\int \mathbf{F} \cdot d\mathbf{r}$. That is, we can measure the change in gravitational potential energy between two states. Of course, we have to be careful that there is no friction, that $\mathbf{F} = -m\mathbf{g}$, that there is negligible acceleration of the mass, etc. In other words, we have to approximate a reversible process! Reversible processes aren't just for thermodynamics! I suspect they're easier to visualize in other branches of physics, so they don't cause as much comment and concern.

What about measuring the “absolute” gravitational potential? This requires measuring the changes in potential energy between the state whose potential we know (infinite separation) and the state whose potential we want to know (mass on the surface of the Earth). I suppose if we had enough money, we might get NASA to help with this project! Of course, by calculation we can relate the “absolute” gravitational potential to other quantities, for example, the escape velocity, that can be more easily measured. This is one of the arts of theoretical physics.

Back to our mixing example: can we think of a reversible process which would allow us to mix the gases? Then we could calculate the entropy change by keeping track of the heat added and the temperature at which it was added. Also, if we knew of such a process, we could use it, in reverse, to separate the gases again. The process I have in mind uses semi-permeable membranes.
We need two of them: one that passes molecules of type A freely but is impermeable to molecules of type B, and a second which does the opposite. It passes type B molecules and blocks type A molecules. We can call these “A-pass” and “B-pass” membranes for short. Do such membranes actually exist? Semi-permeable membranes certainly exist. The molecules that are passed may not pass through “freely,” but if we move the membrane slowly enough, it should be possible to make the friction due to the molecules passing through the small holes in the membrane negligibly small. The possibility of finding the desired membranes depends on the molecules in question and almost certainly is not possible for most pairs of molecules. However, the fact that semipermeable membranes do exist for some molecules would seem to make this a reasonable thought experiment (if not an actually realizable experiment).
The figure is a schematic of our mixing apparatus. The volume $2V$ is in equilibrium with a thermal bath at temperature $\tau$. At the center of the volume and dividing it into two sections of volume $V$ are the two membranes. The A-pass membrane confines the $N$ type B molecules to the right volume and the B-pass membrane confines the $N$ type A molecules to the left volume. The next two figures show the situations when the gases are partially and fully mixed. The membranes are something like pistons and are moved by mechanical devices that aren't shown. These devices are actually quite important, as each membrane receives a net force from the gases. The mechanical devices are used to counteract this force and do work on the gases as the membranes are moved slowly and reversibly through the volume.

What is the force on a membrane due to the gases? Consider the A-pass membrane. A molecules pass freely through this membrane, so there is no interaction of this membrane with the A molecules. The B molecules are blocked by this membrane, so the B molecules are bouncing off the membrane as though it were a solid surface. Since there are B molecules on the right side of this membrane and not on the left, there is a pressure (only from the B molecules) which is just $N\tau/V_B$, where $V_B$ is the volume occupied by B molecules to the right of the A-pass membrane. The net force from this pressure points to the left. So as we move the membrane to the left, the B molecules do work on it. This comes from the energy, $U_B$, of the B molecules. The B molecule system would cool, except it is in contact with the heat bath, so heat flows from the bath to make up for the work done on the membrane and keep $\tau$ and hence $U_B$ constant. The same thing happens with the B-pass membrane and the A gas. It is the heat transferred (reversibly) from the reservoir to each gas that increases the entropy of the gas.
By now, you're convinced that the A molecules can be treated as a gas occupying the volume to the left of the B-pass membrane without worrying what's going on with the B molecules, and vice versa. We've arranged this “by construction.” Our model for an ideal gas is based on non-interacting molecules (well, weakly interacting, but only enough to maintain thermal equilibrium). We've also made the membranes so they interact only with A or B molecules but not both. So the A molecules interact strongly with the walls of the container and the B-pass membrane and interact weakly ($\to 0$) with everything else, including other A molecules. So when the B-pass membrane is moved all the way to the right, the A molecules undergo an isothermal expansion from $V$ to $2V$. We apply our ideal gas results for an isothermal expansion and find

$$\Delta\sigma_A = \int_V^{2V} \frac{\bar{d}Q}{\tau} = \int_V^{2V} \frac{p\, dV}{\tau} = \int_V^{2V} \frac{N\, dV}{V} = N \log\frac{2V}{V} = N \log 2 .$$
Of course, a similar result applies to the B molecules when the A-pass membrane is moved all the way to the left. The total change of entropy in this process is

$$\Delta\sigma = 2N \log 2 ,$$

which is what we had obtained before by applying the Sackur-Tetrode expression to the initial and final states of the irreversible process.

Aside: suppose we have several ideal gases occupying the same volume. $p_i = N_i \tau/V$ is called the partial pressure of gas $i$ and is the pressure that the same $N_i$ molecules of the gas would have if they occupied the volume by themselves (no other gases present). Then the total pressure is the sum of the partial pressures: $p = \sum_i p_i$. This is called Dalton's law. It falls out of our ideal gas treatment “by construction.” Since the gases are non-interacting, the presence of other gases cannot affect the rate of momentum transfer by a given gas! So, Dalton's law seems trivial, but it probably helped point the way towards a non-interacting model as a good first approximation.
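The reversible-mixing argument can be mimicked numerically: integrate $d\sigma = p\,dV/\tau$ with $p = N\tau/V$ for each gas as its membrane sweeps from $V$ to $2V$. The Python sketch below uses a simple midpoint rule; the function name, the step count and the choice $N = 1$ are illustrative assumptions.

```python
import math

def isothermal_entropy_change(n_molecules, v_i, v_f, steps=100_000):
    """Midpoint-rule estimate of the integral of p dV / tau with p = N tau / V.

    The temperature cancels, so only N and the volumes are needed.
    """
    dv = (v_f - v_i) / steps
    return sum(n_molecules / (v_i + (k + 0.5) * dv) * dv for k in range(steps))

N = 1.0   # entropy per molecule of each species (assumed normalisation)
V = 1.0   # arbitrary volume unit

d_sigma_a = isothermal_entropy_change(N, V, 2 * V)  # A gas expands from V to 2V
d_sigma_b = isothermal_entropy_change(N, V, 2 * V)  # B gas does the same
d_sigma_total = d_sigma_a + d_sigma_b

print(d_sigma_total, 2 * N * math.log(2))
```

The two numbers printed should agree to many digits, which is just the statement that the heat drawn reversibly from the bath accounts for the whole entropy of mixing.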
Physics 301
18-Oct-2002 16-1
The Sackur-Tetrode Entropy and Experiment

In this section we'll be quoting some numbers found in K&K which are quoted from the literature.

You may recall that I've several times asked how one would measure absolute entropy. I suspect that I pretty much gave it away (if you hadn't figured it out already) in the last section. The answer is that you have to measure heat transfers from a state of known absolute entropy to the desired state so that you can calculate $\int \bar{d}Q/\tau$. What is a state of known entropy? Answer: at absolute 0, one expects the entropy to be very small and we can take it to be 0. Actually, there is the third law of thermodynamics (not as famous as the first two!) which says that the entropy should go to a constant as $\tau \to 0$.

At absolute 0, a reasonable system will be in its ground state. In fact the ground state might not be a single state. For example, if we consider a “perfect crystal,” its ground state is clearly unique. But real crystals have imperfections. Suppose a crystal is missing a single atom from its lattice. If there are $N$ atoms in the crystal there are presumably $N$ different sites from which the atom could be missing, so the entropy is $\log N$. Also, there's presumably an energy cost for having a missing atom, so the crystal is not really in its ground state. But this might be as close as we can get with a real crystal. The point is that the energy and the entropy are both very small in this situation and very little error is made by assuming that $\sigma(0) = 0$. (Compare $\log N$ with $N \log(n_Q/n)$ when $N \sim 10^{23}$!) In fact, a bigger problem is getting to very low temperatures. In practice, one gets as low as one can and then extrapolates to $\tau = 0$ using a Debye law (assuming an insulating solid).

So to measure the entropy of a monatomic ideal gas such as neon, one makes heat capacity measurements and does the integral $\int C(\tau)\, d\tau/\tau$. The heat capacity measurements go to as low a $\tau$ as needed to get a reliable extrapolation to 0 with the Debye law.

According to K&K, the calculation goes like this: solid neon melts at 24.55 K. At the melting point, its entropy (by extrapolation and numerical integration) is

$$S_{\rm melting} - S_0 = 14.29\ \frac{\rm J}{\rm mol\ K} .$$

To melt the solid (which occurs at a constant temperature) requires 335 J mol$^{-1}$, so the entropy required to melt is

$$\Delta S_{\rm melt} = 13.65\ \frac{\rm J}{\rm mol\ K} .$$

Again, a numerical integration is required to find the entropy change as the liquid neon is taken from the freezing point to the boiling point at 27.2 K. This is

$$S_{\rm boiling} - S_{\rm freezing} = 3.85\ \frac{\rm J}{\rm mol\ K} .$$

Finally, 1761 J mol$^{-1}$ is required to boil the neon at the boiling point, and

$$\Delta S_{\rm boil} = 64.74\ \frac{\rm J}{\rm mol\ K} .$$
Now we have a gas to which we can apply the Sackur-Tetrode expression. Assuming $S_0 = 0$, the total is

$$S_{\rm vapor} = \Delta S_{\rm boil} + (S_{\rm boiling} - S_{\rm freezing}) + \Delta S_{\rm melt} + (S_{\rm melting} - S_0) = 96.40\ \frac{\rm J}{\rm mol\ K} , \qquad \sigma = 6.98 \times 10^{24}\ /{\rm mol} ,$$

where I have quoted the sum from K&K, which differs slightly from the sum you get by adding up the four numbers, presumably because there is some round-off in the input numbers. (For example, using a periodic table on the web, I find the melting and boiling points of neon are 24.56 K and 27.07 K.) According to K&K, the Sackur-Tetrode value for neon at the boiling point is

$$S_{\rm Sackur-Tetrode} = 96.45\ \frac{\rm J}{\rm mol\ K} ,$$

which is in very good agreement with the observed value. When I plug into the Sackur-Tetrode expression I actually get

$$S_{\rm Sackur-Tetrode} = 96.47\ \frac{\rm J}{\rm mol\ K} ,$$

still in very good agreement with the observed value. Why did I get a slightly different value than that quoted in K&K? I used

$$S = R \left[ \log\left( \left( \frac{mkT}{2\pi\hbar^2} \right)^{3/2} \frac{kT}{p} \right) + \frac{5}{2} \right] .$$

Everything can be looked up, but I'm using $p/kT$ instead of $N_0/V$ for the concentration, which assumes the ideal gas law is valid. However, this expression is being applied right at the boiling point, so it's not clear that the ideal gas law should work all that well.

Some other things to note. (1) If we had left out the $N!$ over counting correction, we would have to add

$$R(\log N_0 - 1) = 447\ \frac{\rm J}{\rm mol\ K}$$

to the above. This swamps any slight problems with deviations from the ideal gas law or inaccuracies in the numerical integrations! (2) The Sackur-Tetrode expression includes $\hbar$, which means it depends on quantum mechanics. So this is an example where measurements of the entropy pointed towards quantum mechanics. Of course, $\hbar$ occurs inside a logarithm, so it might not have been so easy to spot!
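For readers who want to redo the arithmetic, here is a hedged Python sketch of the Sackur-Tetrode evaluation for neon. The inputs are assumptions, since the notes do not say which values were plugged in: an atomic mass of 20.18 u, the boiling point 27.2 K quoted above, a pressure of one standard atmosphere, and CODATA constants. The result lands close to the values quoted above, but the last digit depends on these choices.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
N_A = 6.02214076e23       # Avogadro's number, 1/mol
R = K_B * N_A             # molar gas constant, J/(mol K)
AMU = 1.66053906660e-27   # atomic mass unit, kg

def sackur_tetrode_molar(mass_kg, temp_k, pressure_pa):
    """Molar entropy R [log(n_Q / n) + 5/2], with n = p / kT (ideal gas law)."""
    n_q = (mass_kg * K_B * temp_k / (2.0 * math.pi * HBAR**2)) ** 1.5
    n = pressure_pa / (K_B * temp_k)
    return R * (math.log(n_q / n) + 2.5)

# Assumed inputs: neon mass 20.18 u, T = 27.2 K, p = 1 atm.
s_neon = sackur_tetrode_molar(20.18 * AMU, 27.2, 101325.0)

# Direct sum of the four measured contributions quoted from K&K.
s_measured_sum = 14.29 + 13.65 + 3.85 + 64.74

# Extra entropy that would appear without the N! over counting correction.
overcount = R * (math.log(N_A) - 1.0)

print(f"Sackur-Tetrode: {s_neon:.2f} J/(mol K)")
print(f"Sum of measured pieces: {s_measured_sum:.2f} J/(mol K)")
print(f"R (log N0 - 1): {overcount:.0f} J/(mol K)")
```

Because the temperature enters as $T^{5/2}$ inside the logarithm, changing the boiling point from 27.2 K to 27.07 K shifts the result by roughly 0.1 J/(mol K), which is comparable to the discrepancies discussed above.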
The Ideal Fermi Gas

Consider a metal like sodium or copper (or the other metals in the same columns in the periodic table). These metals have one valence electron, an electron which can be easily removed from the atom, so these atoms often form chemical bonds as positively charged ions. In the solid metal, the valence electrons aren't bound to the atoms. How do we know this? Because the metals are good conductors of electricity. If the electrons were bound to the atoms, the metals would be insulators. Of course, there are interactions between the electrons and the ions and between the electrons and other electrons. But, as a first approximation, we can treat all the valence electrons as forming a gas of free (non-interacting) particles confined to the metal.

Let's do a little numerology. First, let's calculate the quantum concentration for an electron at room temperature,

$$
\begin{aligned}
n_Q &= \left( \frac{m_e kT}{2\pi\hbar^2} \right)^{3/2} , \\
&= \left( \frac{(9.108 \times 10^{-28}\ {\rm g})\,(1.380 \times 10^{-16}\ {\rm erg\ K^{-1}})\,(300\ {\rm K})}{2\pi\,(1.054 \times 10^{-27}\ {\rm erg\ s})^2} \right)^{3/2} , \\
&= 1.26 \times 10^{19}\ {\rm cm^{-3}} , \\
&= \left( \frac{1}{43\ \text{Å}} \right)^3 .
\end{aligned}
$$

In other words, the density of electrons is equal to the room temperature quantum concentration if there is one electron every 43 Å. Now consider copper. It has a density of 8.90 g cm$^{-3}$ and an atomic mass of 63.54 amu. So the number density of copper atoms is

$$n_{\rm Cu} = 8.44 \times 10^{22}\ {\rm cm^{-3}} = \left( \frac{1}{2.3\ \text{Å}} \right)^3 .$$

The concentration of electrons in the electron gas (assuming one per copper atom) exceeds the quantum concentration by a factor of 6700. For copper the actual concentration and the quantum concentration are equal at a temperature of about 100,000 K (assuming we could get solid copper that hot!).

The upshot of all this is that we are definitely not in the classical domain when dealing with an electron gas in metals under normal conditions. We will have to use the Fermi-Dirac distribution function. Low energy states are almost certain to be filled. When this is true, the system is said to be degenerate. Furthermore, the electron gas is “cold” in the sense that thermal energies are small compared to the energies required to confine the electrons to high densities. (This is most easily seen from an uncertainty principle argument.)
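The numerology above is easy to reproduce. The Python sketch below uses the cgs constants quoted in the text; Avogadro's number (6.022e23) is an added assumption, since it is not quoted in the notes, and the function name is made up for this example.

```python
import math

# cgs values as quoted in the text
M_E = 9.108e-28    # electron mass, g
K_B = 1.380e-16    # Boltzmann constant, erg/K
HBAR = 1.054e-27   # reduced Planck constant, erg s
N_A = 6.022e23     # Avogadro's number (assumption: not quoted in the notes)

def quantum_concentration(mass_g, temp_k):
    """n_Q = (m k T / 2 pi hbar^2)^(3/2), in cm^-3."""
    return (mass_g * K_B * temp_k / (2.0 * math.pi * HBAR**2)) ** 1.5

n_q = quantum_concentration(M_E, 300.0)
spacing_angstrom = n_q ** (-1.0 / 3.0) * 1e8   # 1 cm = 1e8 Angstrom

# Copper: density 8.90 g/cm^3, atomic mass 63.54 amu, one valence electron per atom.
n_cu = 8.90 / 63.54 * N_A
ratio = n_cu / n_q

# n_Q scales as T^(3/2), so n_Q(T) = n_Cu when T = 300 K * ratio^(2/3).
t_equal = 300.0 * ratio ** (2.0 / 3.0)

print(f"n_Q(300 K) = {n_q:.3g} cm^-3 (one electron per {spacing_angstrom:.0f} Angstrom)")
print(f"n_Cu = {n_cu:.3g} cm^-3, ratio = {ratio:.0f}, T_equal = {t_equal:.3g} K")
```

The steep $T^{3/2}$ scaling is why a ratio of several thousand in concentration corresponds to a temperature only a few hundred times room temperature.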
So as a first approximation, we can use the Fermi-Dirac distribution at zero temperature. This means we will ignore thermal energies altogether. At zero temperature, the Fermi-Dirac distribution becomes

$$f(\epsilon) = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} \to \begin{cases} 1, & \epsilon < \mu ; \\ 0, & \epsilon > \mu . \end{cases}$$

Imagine a chunk of copper in which all the valence electrons have been removed (it would have a rather large electric charge...). Add back one valence electron, remembering that the temperature is 0. This electron goes into the lowest available state. Add another electron; it goes into the state with the next lowest energy. Actually it's the same center of mass state and the same energy, but the second electron has its spin pointing in the opposite direction from the first. The third electron goes in the state with the next lowest energy. And so on. What we are doing is filling up states (with no gaps) until we run out of valence electrons. Since we have the lowest possible energy, this configuration must be the ground state (which is the state the system should be in at 0 temperature!).

We must choose the chemical potential so that our metal has the correct number of valence electrons. To do this, we need to know the number of states. Since we are considering free electrons, we are dealing with single particle states in a box. This is the same calculation we've done before. The number of states with position vector in the element $d^3x$ and momentum vector in the element $d^3p$ is

$$dn(\mathbf{x}, \mathbf{p}) = 2\, \frac{d^3x\, d^3p}{8\pi^3\hbar^3} ,$$

where the factor of 2 arises because there are two spin states for each center of mass state. When the number of states is used in an integral, the integral over $d^3x$ just leads to the volume of the box, $V$. The element $d^3p = p^2\, dp\, d\Omega$ and the solid angle may be integrated over to give $4\pi p^2\, dp$. Finally, the independent variable may be converted from momentum to energy with $p^2/2m = \epsilon$, and we have

$$dn(\epsilon) = \frac{V}{2\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \sqrt{\epsilon}\, d\epsilon .$$

It's customary to write this as the density of states per unit energy,

$$D(\epsilon)\, d\epsilon = \frac{V}{2\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \sqrt{\epsilon}\, d\epsilon .$$

Now we're ready to calculate $\mu$. At zero temperature, the occupancy is 1 up to $\mu$ and 0 above $\mu$, so the total number of electrons is

$$N = \int_0^\mu D(\epsilon)\, d\epsilon .$$
Before we do this integral, a bit of jargon. The energy of the highest filled state at zero temperature is called the Fermi energy, $\epsilon_F$. So $\mu(\tau = 0) = \epsilon_F$. Then

$$N = \int_0^{\epsilon_F} \frac{V}{2\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \sqrt{\epsilon}\, d\epsilon = \frac{V}{3\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \epsilon_F^{3/2} ,$$

or

$$\epsilon_F = (3\pi^2 n)^{2/3}\, \frac{\hbar^2}{2m} ,$$

where $n = N/V$ is the concentration. One also speaks of the Fermi temperature defined by $\tau_F = kT_F = \epsilon_F$. This is not actually the temperature of anything, but is a measure of the energy that separates degenerate from non-degenerate behavior. The Fermi temperature is a few tens of thousands of kelvins for most metals, so the electron gas in typical metals is cold.

Having determined the Fermi energy, we can determine the total energy of the gas. We just add up the energies of all the occupied states,

$$U_0 = \int_0^{\epsilon_F} \frac{V}{2\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \epsilon \sqrt{\epsilon}\, d\epsilon = \frac{V}{5\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \epsilon_F^{5/2} = \frac{V}{5\pi^2} (3\pi^2 n)^{5/3}\, \frac{\hbar^2}{2m} = \frac{3}{5} N \epsilon_F ,$$

where the subscript on $U$ indicates the ground state energy. In the ground state, the average energy of an electron is 3/5 the Fermi energy. Also note that the concentration contains $V^{-1}$, which means $U_0 \propto V^{-2/3}$, which means that as the system expands, the energy goes down, which means it must be exerting a pressure on its container. This is called degeneracy pressure. In fact,

$$p = -\left( \frac{\partial U_0}{\partial V} \right)_{\sigma,N} = \frac{2}{3} \frac{U_0}{V} ,$$

so

$$pV = \frac{2}{3} U_0 ,$$

just as for an ideal gas. Note that the derivative is to be taken at constant entropy. We are dealing with the ground state, so the entropy is constant at 0.

As a point of interest, using the concentration for copper that we calculated earlier, we find

$$\epsilon_F = 7.02\ {\rm eV} \qquad \text{and} \qquad T_F = 81{,}500\ {\rm K} ,$$

and the electron gas in copper really is “cold” all the way up to the point where copper melts (1358 K)!
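The copper numbers can be checked with a short Python sketch that evaluates $\epsilon_F = (3\pi^2 n)^{2/3}\hbar^2/2m$ in SI units. The constants are CODATA values and the concentration is the one computed earlier in the notes; the variable and function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # joules per electron volt

def fermi_energy(concentration_m3, mass_kg=M_E):
    """Fermi energy (J) of a free spin-1/2 gas with the given number density (m^-3)."""
    return (3.0 * math.pi**2 * concentration_m3) ** (2.0 / 3.0) * HBAR**2 / (2.0 * mass_kg)

n_cu = 8.44e28                      # copper: 8.44e22 cm^-3, converted to m^-3
e_f = fermi_energy(n_cu)            # Fermi energy, J
t_f = e_f / K_B                     # Fermi temperature, K
u0_per_electron = 0.6 * e_f         # ground state energy per electron, (3/5) e_F
pressure = (2.0 / 3.0) * n_cu * u0_per_electron   # degeneracy pressure, p = 2 U0 / 3V

print(f"e_F = {e_f / EV:.2f} eV, T_F = {t_f:.0f} K")
print(f"degeneracy pressure ~ {pressure:.2e} Pa")
```

The degeneracy pressure comes out at several times $10^{10}$ Pa, which is a reminder that in a real metal it is balanced by the attraction of the ions rather than by the walls of a container.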
Heat Capacity of a Cold Fermi Gas

In the preceding section we considered the Fermi gas in its ground state. This does not allow us to consider adding heat to the gas because the gas would no longer be in the ground state. To calculate the heat capacity, we need to expand our treatment a bit. What we need to do is calculate the energy of the gas as a function of temperature. We will calculate the difference between the ground state energy and the energy at temperature $\tau$ and we will make use of the fact that the gas is cold. With a cold gas, all the action occurs within $\tau$ of $\mu$. That is, the occupancy goes from 1 to 0 over a range of a few $\tau$ centered at $\mu$. Since we have a cold gas, this is a relatively narrow range. The difference in energy between the gas at temperature $\tau$ and the gas in the ground state is
$$\begin{aligned}
\Delta U(\tau) &= \int_0^\infty \epsilon\,D(\epsilon)f(\epsilon)\,d\epsilon - U_0\,,\\
&= \int_0^\infty D(\epsilon)f(\epsilon)(\epsilon-\epsilon_F)\,d\epsilon + \int_0^\infty D(\epsilon)f(\epsilon)\,\epsilon_F\,d\epsilon - U_0\,,\\
&= \int_0^\infty D(\epsilon)f(\epsilon)(\epsilon-\epsilon_F)\,d\epsilon + N\epsilon_F - U_0\,.
\end{aligned}$$
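Now, we differentiate with respect to $\tau$ to get the heat capacity. The terms $N\epsilon_F - U_0$ do not depend on $\tau$ and drop out, leaving
$$C_V = \frac{\partial\,\Delta U}{\partial\tau} = \int_0^\infty D(\epsilon)\,\frac{df(\epsilon)}{d\tau}\,(\epsilon-\epsilon_F)\,d\epsilon\,.$$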
So far, everything is “exact”; now we start making approximations. At $\tau = 0$, the distribution is a step function, so its derivative is a delta function (a very sharply peaked function in the neighborhood of the step). This means that the main contribution to the integral occurs (even if $\tau$ is not 0) when $\epsilon$ is very close to $\mu$. So we will ignore the variation in the density of states, evaluate it at $\mu$ and take it out of the integral. What about $\mu$? At $\tau = 0$, $\mu = \epsilon_F$. As $\tau$ increases, $\mu$ decreases, but when $\tau \ll \epsilon_F$, the change in $\mu$ is negligibly small (plot some curves!), so we will take $\mu = \epsilon_F$. Then we have
$$C_V = D(\epsilon_F)\int_0^\infty \frac{df(\epsilon)}{d\tau}\,(\epsilon-\epsilon_F)\,d\epsilon\,.$$
Now,
$$\frac{df}{d\tau} = \frac{d}{d\tau}\left[\frac{1}{e^{(\epsilon-\epsilon_F)/\tau}+1}\right] = \frac{\epsilon-\epsilon_F}{\tau^2}\,\frac{e^{(\epsilon-\epsilon_F)/\tau}}{\left(e^{(\epsilon-\epsilon_F)/\tau}+1\right)^2}\,.$$
At this point, we change variables to $x = (\epsilon-\epsilon_F)/\tau$ and we have
$$C_V = \tau\,D(\epsilon_F)\int_{-\infty}^{+\infty}\frac{x^2 e^x\,dx}{(e^x+1)^2}\,,$$
where the actual lower limit of integration, $-\epsilon_F/\tau$, has been replaced by $-\infty$ since all the contribution to the integral is in the neighborhood of $x = 0$. It turns out that the integral is $\pi^2/3$, so
$$C_V = \frac{\pi^2}{3}\,D(\epsilon_F)\,\tau\,,$$
a surprisingly simple result! If we plug in the expression for the density of states, we have
$$D(\epsilon_F) = \frac{3N}{2\epsilon_F}\,,$$
and
$$C_V = \frac{\pi^2}{2}\,N\,\frac{\tau}{\epsilon_F} = \frac{\pi^2}{2}\,N\,\frac{\tau}{\tau_F}\,.$$
In conventional units,
$$C_V = \frac{\pi^2}{2}\,Nk\,\frac{T}{T_F}\,.$$
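The value $\pi^2/3$ quoted for the integral is easy to confirm numerically. The sketch below uses a hand-written composite Simpson rule (so it needs only the standard library) and rewrites the integrand as $x^2/(4\cosh^2(x/2))$, which is algebraically identical to $x^2e^x/(e^x+1)^2$ but cannot overflow. The finite limits $\pm 60$ and the room temperature of 300 K are assumptions of the example; the Fermi temperature is the copper value from the notes.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def integrand(x):
    # x^2 e^x / (e^x + 1)^2 == x^2 / (4 cosh^2(x/2)); the integrand is
    # negligible (of order e^-60) beyond |x| = 60.
    return x * x / (4.0 * math.cosh(0.5 * x) ** 2)

integral = simpson(integrand, -60.0, 60.0)

# Electronic heat capacity per electron, in units of k, for copper at an
# assumed room temperature of 300 K with T_F = 81,500 K.
cv_per_nk = 0.5 * math.pi**2 * 300.0 / 81500.0

print(integral, math.pi**2 / 3.0, cv_per_nk)
```

The second number illustrates the size of the effect: at room temperature the electrons contribute only about 2% of $k$ each, compared with $\tfrac{3}{2}k$ for a classical gas.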
Some comments. This is proportional to $T$, which means that the energy of the electron gas is $U_0 + {\rm constant}\cdot\tau^2$. Can we see how this happens? In going from 0 to $\tau$, we are exciting electrons in the energy range from $\epsilon_F - \tau \to \epsilon_F$ by giving them a thermal energy of roughly $\tau$. The number of such electrons is roughly $N\tau/\epsilon_F$, so the added energy is roughly $N\tau^2/\epsilon_F$. The Fermi gas heat capacity is quite a bit smaller than that of a classical ideal gas with the same energy and pressure. This is because only the fraction $\tau/\tau_F$ of the electrons are excited out of the ground state. At low temperatures, heat capacities of metals have a linear term due to the electrons and a cubic term due to the lattice. Some experimental data may be found in K&K.
Physics 301
Homework No. 5
Due 22-Oct-2002
H5-1
1. Consider a “rigid rotator.” This is a system with a fixed moment of inertia, $I$, and angular momentum $L = \ell\hbar$ where $\ell = 0, 1, 2, 3, \ldots$. The energy of this system is
$$E = \frac{L^2}{2I}\,.$$
From quantum mechanics, we know that $L^2$ takes on the values $\ell(\ell+1)\hbar^2$ and there are $2\ell+1$ states with $L = \ell\hbar$.
(a) Evaluate the partition function, the thermal average energy, and the entropy for a rigid rotator when the temperature is large, $\tau \gg \hbar^2/2I$. Hint: you can approximate the sum by an integral.
(b) Evaluate the partition function, the thermal average energy, and the entropy for a rigid rotator when the temperature is small, $\tau \ll \hbar^2/2I$. Hint: in this case you might want to truncate the sum after just a couple of terms. Are the rotational modes “frozen out” at low temperatures?
(c) Look up whatever you need to know to estimate $I$ for a nitrogen (N$_2$) molecule. Then estimate the temperature which divides low temperatures from high temperatures for the rotational modes of a nitrogen molecule.
2. K&K, chapter 6, problem 1.
3. K&K, chapter 6, problem 3. Note that the two occupancies are not the same. You might want to plot the occupancies versus $x = \exp[(\mu-\epsilon)/\tau]$ in order to see the difference.
4. K&K, chapter 6, problem 4. You really don’t need to know relativity to work this problem. You just need to use $E = pc$ rather than $E = p^2/2m$ as the relation between energy and momentum.
5. K&K, chapter 6, problem 9.
6. K&K, chapter 6, problem 10.
7. K&K, chapter 6, problem 11. Also, food for thought (not to be handed in): Suppose the atmosphere is in a state of isentropic convective equilibrium in the morning. Now the sun heats the ground, which heats the air, which starts to rise. (There will be rising and falling patches due to uneven heating.) So a blob of air rises. What effect does the water vapor, and especially the cooling of water vapor through its condensation point, have on this process?
8. K&K, chapter 6, problem 15. Note that you get high temperatures even before the fuel burns. This might have something to do with making pollutants like nitrous-oxide, etc.
You might find K&K, chapter 6, problem 8 amusing, but you don’t need to hand it in!
Physics 301
3-Nov-2002
Physics 301 Problem Set 5 Solutions

Problem 1. The rigid rotator has energy levels $E = L^2/2I$, and there are $2l+1$ states with $L^2 = l(l+1)\hbar^2$. The partition function of this system is given by
$$Z = \sum_l (2l+1)\exp\left(-\frac{l(l+1)\hbar^2}{2I\tau}\right) \tag{1}$$
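(a) In the limit of high temperature, $\tau \gg \hbar^2/2I$, which means that the spacing between two adjacent energy levels is very small compared with $\tau$, and we can approximate the sum by an integral:
$$\begin{aligned}
Z &\approx \int_0^\infty dl\,(2l+1)\exp\left(-\frac{l(l+1)\hbar^2}{2I\tau}\right)\\
&= \frac{2I\tau}{\hbar^2}\int_0^\infty dx\,e^{-x} \qquad \left(x = \frac{l(l+1)\hbar^2}{2I\tau} \;\Rightarrow\; dx = \frac{(2l+1)\hbar^2}{2I\tau}\,dl\right)\\
&= \frac{2I\tau}{\hbar^2}\,.
\end{aligned} \tag{2}$$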
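We can now calculate the free energy, the average energy and the entropy:
$$\begin{aligned}
F &= -\tau\log Z = -\tau\log(2I\tau/\hbar^2)\\
U &= \tau^2\,\frac{\partial\log Z}{\partial\tau} = \tau\\
\sigma &= \frac{U-F}{\tau} = \log(2I\tau/\hbar^2) + 1
\end{aligned} \tag{3}$$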
(b) In the other limit of low temperature, $\tau \ll \hbar^2/2I$, each term is much smaller than the previous one, and we can approximate the sum by just keeping the first two terms:
$$Z \approx 1 + 3e^{-\hbar^2/I\tau} \;\Rightarrow\; \log Z \approx 3e^{-\hbar^2/I\tau}\,. \tag{4}$$
This gives:
$$\begin{aligned}
F &= -3\tau\,e^{-\hbar^2/I\tau}\\
U &= \tau^2\,\frac{\partial}{\partial\tau}\left(3e^{-\hbar^2/I\tau}\right) = \frac{3\hbar^2}{I}\,e^{-\hbar^2/I\tau}\\
\sigma &= 3\left(\frac{\hbar^2}{I\tau}+1\right)e^{-\hbar^2/I\tau}
\end{aligned} \tag{5}$$
(c) Modelling the nitrogen molecule as two nitrogen atoms at the ends of a rigid stick, $I = 2MR^2$, where $M = 14/(6\times10^{23})$ g is the mass of a nitrogen atom, and $R \approx 0.5\times10^{-8}$ cm is half the interatomic distance. This gives $I \approx 1.16\times10^{-39}$ g cm$^2$. The characteristic temperature is $\hbar^2/2Ik \approx 3$ K.
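The two limiting forms of the rotator partition function in Problem 1 can be compared with a direct evaluation of the sum. In the sketch below the temperature is measured in units of $\hbar^2/2I$ (the variable `t` is an illustrative name, not notation from the solutions), so the high-temperature result reads $Z \approx t$ and the low-temperature one $Z \approx 1 + 3e^{-2/t}$. The nitrogen numbers repeat the estimate of part (c), with Avogadro's number taken as $6.022\times10^{23}$ rather than the rounded $6\times10^{23}$.

```python
import math

def z_exact(t, lmax=2000):
    """Rigid-rotator partition function by direct summation.

    t is the temperature in units of hbar^2 / (2 I); the terms underflow
    harmlessly to zero long before l reaches lmax.
    """
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t) for l in range(lmax + 1))

def z_high(t):
    """High-temperature (integral) approximation, Z = 2 I tau / hbar^2."""
    return t

def z_low(t):
    """Low-temperature approximation keeping only l = 0 and l = 1."""
    return 1.0 + 3.0 * math.exp(-2.0 / t)

# Characteristic rotational temperature of N2, CGS units
HBAR = 1.054572e-27   # erg s
K_B = 1.380649e-16    # erg / K
N_A = 6.02214e23
atom_mass = 14.0 / N_A      # g
half_bond = 0.5e-8          # cm, half the interatomic distance (value from the solution)
inertia = 2.0 * atom_mass * half_bond**2
t_rot = HBAR**2 / (2.0 * inertia * K_B)   # K

print(z_exact(100.0), z_high(100.0))
print(z_exact(0.2), z_low(0.2))
print(f"I = {inertia:.3g} g cm^2, T_rot = {t_rot:.2f} K")
```

At `t = 100` the integral approximation is good to a fraction of a percent; the small residual is the usual Euler-Maclaurin correction to replacing a sum by an integral.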
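Problem 2.
$$\begin{aligned}
f(E) &= \frac{1}{e^{(E-\mu)/\tau}+1}\\
-\frac{\partial f}{\partial E} &= \frac{1}{\tau}\,\frac{e^{(E-\mu)/\tau}}{\left(e^{(E-\mu)/\tau}+1\right)^2}\\
-\left.\frac{\partial f}{\partial E}\right|_{E=\mu} &= \frac{1}{4\tau}
\end{aligned} \tag{6}$$

Problem 3.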
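(a) The allowed occupancies are 0, 1, 2 with energies $0$, $E$, $2E$. The grand partition sum is given by:
$$\mathcal{Z} = 1 + e^{(\mu-E)/\tau} + e^{2(\mu-E)/\tau} \tag{7}$$
This gives the average occupancy as:
$$\langle N\rangle = \tau\,\frac{\partial\log\mathcal{Z}}{\partial\mu} = \frac{e^{(\mu-E)/\tau} + 2e^{2(\mu-E)/\tau}}{1 + e^{(\mu-E)/\tau} + e^{2(\mu-E)/\tau}} \tag{8}$$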
(b) Here, there are two orbitals in the usual quantum mechanics, each of energy $E$. The grand partition sum in this case is:
$$\mathcal{Z} = 1 + 2e^{(\mu-E)/\tau} + e^{2(\mu-E)/\tau} \tag{9}$$
and the average occupancy is
$$\langle N\rangle = \frac{2\left(e^{(\mu-E)/\tau} + e^{2(\mu-E)/\tau}\right)}{1 + 2e^{(\mu-E)/\tau} + e^{2(\mu-E)/\tau}} = \frac{2e^{(\mu-E)/\tau}}{1 + e^{(\mu-E)/\tau}} \tag{10}$$
Both these curves asymptote to 2 as a function of $e^{(\mu-E)/\tau}$, but the first one has half the slope of the second near the origin.

Problem 4. If we think of the particles being in a box, the counting of states in this problem remains the same as in the non-relativistic one, because the de Broglie relation $\lambda = h/p$ continues to hold. This means that the density of states remains the same as a function of the momentum. The energy-momentum relation is different though: $E = pc$. We can therefore write down the partition function for one particle as:
$$\begin{aligned}
Z &= \int \frac{d^3x\,d^3p}{(2\pi\hbar)^3}\,e^{-pc/\tau}\\
&= \frac{V}{(2\pi\hbar)^3}\int_0^\infty dp\,4\pi p^2\,e^{-pc/\tau}\\
&= \frac{4\pi V}{(2\pi\hbar)^3}\left(\frac{\tau}{c}\right)^3\int_0^\infty du\,u^2 e^{-u}\\
&= \frac{8\pi V}{h^3c^3}\,\tau^3
\end{aligned} \tag{11}$$
This gives the average energy per particle to be $U = \tau^2\,\frac{\partial}{\partial\tau}\log Z = 3\tau$.
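Problem 5. We know that the chemical potential of a gas with internal structure is given by $\mu = \mu_{\rm ideal} - \tau\log Z_{\rm int}$. If we set the scale of the energy such that the ground state energy of each atom is 0, the partition function of the internal states alone is $Z_{\rm int} = 1 + e^{-\Delta/\tau}$. The various quantities are then calculated to be:
$$\begin{aligned}
\mu &= \tau\log\lambda = \tau\left(\log(n/n_Q) - \log(1+e^{-\Delta/\tau})\right)\\
F &= \int_0^N dN\,\mu = N\tau\left(\log(n/n_Q) - 1 - \log(1+e^{-\Delta/\tau})\right) = N\tau\left(\log\frac{n}{n_Q(1+e^{-\Delta/\tau})} - 1\right)\\
\sigma &= -\left(\frac{\partial F}{\partial\tau}\right)_V = N\left(\log\frac{n_Q(1+e^{-\Delta/\tau})}{n} + \frac{5}{2} + \frac{\Delta}{\tau}\,\frac{1}{e^{\Delta/\tau}+1}\right)\\
p &= -\left(\frac{\partial F}{\partial V}\right)_{\tau,N} = \frac{N\tau}{V}
\end{aligned} \tag{12}$$
The pressure is the same as that for an ideal gas because $Z_{\rm int}$ does not depend on $V$. This means that $p\,(\partial V/\partial\tau)_p = N$, which gives $C_p = C_v + N$.
$$\begin{aligned}
C_p &= \tau\left(\frac{\partial\sigma}{\partial\tau}\right)_V + N\\
&= \left(\frac{5}{2} + \frac{\Delta^2}{\tau^2}\,\frac{e^{\Delta/\tau}}{(e^{\Delta/\tau}+1)^2}\right)N
\end{aligned} \tag{13}$$

Problem 6.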
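(a) For an ideal gas of $N$ particles, we know that $pV = N\tau$, and $C_p = C_v + N$. This gives $\gamma = C_p/C_v = 1 + N/C_v$. For an isentropic process,
$$\begin{aligned}
d\sigma = 0 &\Rightarrow dU + p\,dV = 0\\
&\Rightarrow C_v\,d\tau + p\,dV = 0\\
&\Rightarrow \frac{d\tau}{\tau} + \frac{p\,dV}{C_v\tau} = 0\\
&\Rightarrow \frac{d\tau}{\tau} + \frac{N\tau}{V}\,\frac{dV}{C_v\tau} = 0\\
&\Rightarrow \frac{d\tau}{\tau} + (\gamma-1)\frac{dV}{V} = 0\,.
\end{aligned} \tag{14}$$
Using the ideal gas law to rewrite the first term in the above equation, we get
$$\frac{p\,dV + V\,dp}{pV} + (\gamma-1)\frac{dV}{V} = \frac{dp}{p} + \gamma\,\frac{dV}{V} = 0 \tag{15}$$
and rewriting the second term instead gives
$$\frac{d\tau}{\tau} + (\gamma-1)\left(\frac{d\tau}{\tau} - \frac{dp}{p}\right) = \gamma\,\frac{d\tau}{\tau} + (1-\gamma)\frac{dp}{p} = 0 \;\Rightarrow\; \frac{dp}{p} + \frac{\gamma}{1-\gamma}\,\frac{d\tau}{\tau} = 0 \tag{16}$$
(b) From the above relations, we get $(\partial p/\partial V)_\sigma = -\gamma p/V$. The bulk modulus is $B_\sigma = -V(\partial p/\partial V)_\sigma = V(\gamma p/V) = \gamma p$. From the ideal gas law, we find $(\partial p/\partial V)_\tau = -p/V$, which gives the other bulk modulus $B_\tau = V p/V = p$.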
Problem 7. (a) For mechanical equilibrium, let us consider a thin layer of atmosphere at altitude $z$ and thickness $dz$. The pressure on the lower surface exceeds the pressure on the upper surface by the weight of the layer per unit area. If $n$ is the concentration at height $z$, then $dp(z) = -mg\,n(z)\,dz = -mg\,p\,dz/\tau$. On the other hand, we know from problem 6 that for an isentropic system, $dp/p = \frac{\gamma}{\gamma-1}\,d\tau/\tau$. Putting these two together, we get:
$$-\frac{mg\,dz}{\tau} = \frac{\gamma}{\gamma-1}\,\frac{d\tau}{\tau} \;\Rightarrow\; \frac{d\tau}{dz} = \frac{mg(1-\gamma)}{\gamma} = {\rm const} \tag{17}$$
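(b) Plugging in numbers, taking $m = 28/(6\times10^{23})$ g to be the mass of a nitrogen molecule, and $\gamma = 1.4$ as for an ideal diatomic gas, we get $dT/dz = mg(1-\gamma)/k\gamma \approx -10$ K/km, that is, the temperature falls by about 10 K per kilometer of altitude.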
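The lapse-rate arithmetic of part (b) can be reproduced in a few lines. This is only a sketch of the formula $dT/dz = mg(1-\gamma)/(k\gamma)$ from part (a); the value of $g$ and the use of pure N$_2$ for the air are the assumptions of the example.

```python
# CGS constants
K_B = 1.380649e-16   # erg / K
N_A = 6.02214e23
G = 980.665          # cm / s^2, ASSUMED standard gravity

m_n2 = 28.0 / N_A    # mass of an N2 molecule, g
gamma = 1.4          # ideal diatomic gas

lapse_k_per_cm = m_n2 * G * (1.0 - gamma) / (K_B * gamma)
lapse_k_per_km = lapse_k_per_cm * 1.0e5   # 1 km = 1e5 cm

print(f"dT/dz = {lapse_k_per_km:.2f} K/km")
```

The result is negative, a bit under 10 K per kilometer in magnitude, consistent with the rounded value quoted in the solution.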
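(c) We know that $p = n\tau = ({\rm const})\,\rho\tau$. We know from problem 6 that $p^{1-\gamma}\tau^{\gamma} = {\rm const} \Rightarrow p^{1-\gamma}(p/\rho)^{\gamma} = {\rm const} \Rightarrow p/\rho^{\gamma} = {\rm const}$.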
Problem 8. From problem 6, we know that for an isentropic process, $TV^{\gamma-1} = {\rm const}$. For our engine, if we look at the two end points of the process, $T_2 = T_1(V_1/V_2)^{\gamma-1} = 300\,(15)^{0.4} = 886$ K $= 613\,^\circ$C.
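The compression temperature in Problem 8 follows directly from $TV^{\gamma-1} = {\rm const}$. The helper name `adiabatic_temperature` below is illustrative; the inputs (300 K, compression ratio 15, $\gamma = 1.4$) are the ones used in the solution.

```python
def adiabatic_temperature(t1, compression_ratio, gamma=1.4):
    """Final temperature after isentropic compression, T2 = T1 (V1/V2)^(gamma-1)."""
    return t1 * compression_ratio ** (gamma - 1.0)

t2 = adiabatic_temperature(300.0, 15.0)   # K
t2_celsius = t2 - 273.15

print(f"T2 = {t2:.0f} K = {t2_celsius:.0f} C")
```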
Week 6. Fermi Gases, Bose-Einstein Gases, Heat, Work, Carnot Cycle
Physics 301
21-Oct-2002 17-1
Reading

Finish K&K chapter 7 and start on chapter 8. Also, I’m passing out several Physics Today articles. The first is by Graham P. Collins, August, 1995, vol. 48, no. 8, p. 17, “Gaseous Bose-Einstein Condensate Finally Observed.” This describes the research leading up to the first observation of a BE condensate that’s not a superfluid or superconductor. The second is by Barbara Goss Levi, March, 1997, vol. 50, no. 3, p. 17, “Bose Condensates are Coherent Inside and Outside an Atom Trap,” describing the first “atom laser” which was based on a BE condensate. The third is also by Levi, October, 1998, vol. 51, no. 10, p. 17, “At Long Last, a Bose-Einstein Condensate is Formed in Hydrogen,” describing even more progress on BE condensates. In addition, there is a recent Science report on an atomic Fermi Gas, DeMarco, B., and Jin, D. S., September 10, 1999, vol. 285, p. 1703, “Onset of Fermi Degeneracy in a Trapped Atomic Gas.”
More on Fermi Gases

So far, we’ve considered the zero temperature Fermi gas and done an approximate treatment of the low temperature heat capacity of Fermi gases. The zero temperature Fermi gas was straightforward. We simply said that all states, starting from the lowest energy state, are filled until we run out of particles. The energy at which this happens is called the Fermi energy and is the same as the chemical potential at 0 temperature, $\epsilon_F = \mu(\tau = 0)$. Basically, all we had to do was determine the density of states, a problem we’ve dealt with before.

Working on the low temperature heat capacity required an approximate calculation of the energy versus temperature for a cold Fermi gas. In this calculation we assumed that the density of states near the Fermi energy is constant, which allows one to pull the density of states out of the integral and also to set the chemical potential to its 0 temperature value. These approximations work quite well for the electron gas in metals at room temperature because the Fermi temperature for these electrons is typically several tens of thousands of Kelvins.

To calculate the energy, etc., at arbitrary temperatures, one must numerically integrate the Fermi-Dirac distribution times the density of states to obtain the number of particles. Then the chemical potential is varied until the desired number of particles is obtained. Knowing the chemical potential, one can integrate the density of states times the Fermi-Dirac distribution times the energy to get the energy at a given temperature. All of this requires numerical integration or approximate techniques.
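The numerical procedure just described (vary $\mu$ until the particle number comes out right, then integrate for the energy) can be sketched with nothing but the standard library. The choices here are assumptions of the example, not part of the notes: energies and temperatures are measured in units of $\epsilon_F$, so the density of states per particle is $\tfrac{3}{2}\sqrt{\epsilon}$; the substitution $\epsilon = s^2$ removes the square-root behavior at the origin; the integrals are cut off $40\tau$ above $\mu$; and $\mu$ is found by bisection.

```python
import math

def fermi(e, mu, t):
    """Fermi-Dirac occupancy, guarded against overflow for large arguments."""
    arg = (e - mu) / t
    if arg > 700.0:
        return 0.0
    return 1.0 / (math.exp(arg) + 1.0)

def simpson(f, a, b, n=4000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def upper_limit(mu, t):
    # Integrate in s = sqrt(e) up to an energy 40 t above max(mu, 0).
    return math.sqrt(max(mu, 0.0) + 40.0 * t)

def number(mu, t):
    """N/N_total = (3/2) * integral of sqrt(e) f(e) de, written with e = s^2."""
    return simpson(lambda s: 3.0 * s**2 * fermi(s * s, mu, t), 0.0, upper_limit(mu, t))

def energy(mu, t):
    """U/(N eps_F) = (3/2) * integral of e^(3/2) f(e) de, written with e = s^2."""
    return simpson(lambda s: 3.0 * s**4 * fermi(s * s, mu, t), 0.0, upper_limit(mu, t))

def chemical_potential(t, tol=1e-10):
    """Bisect on mu (in units of eps_F) until the particle number equals 1."""
    lo, hi = -50.0 * t - 1.0, 1.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if number(mid, t) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

t = 0.05                                  # tau / eps_F
mu = chemical_potential(t)
u = energy(mu, t)
dt = 0.001                                # centered difference for C_V / N
cv = (energy(chemical_potential(t + dt), t + dt)
      - energy(chemical_potential(t - dt), t - dt)) / (2.0 * dt)

print(mu, u, cv, 0.5 * math.pi**2 * t)
```

At $\tau = 0.05\,\epsilon_F$ the chemical potential has dropped by only about 0.2% from $\epsilon_F$, and the heat capacity per particle agrees with the cold-gas result $(\pi^2/2)\,\tau/\tau_F$ to better than 1%. Raising `t` shows where those approximations start to fail.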
Figure 7.9 and tables 7.2 and 7.3 of K&K demonstrate that the low temperature heat capacities (low enough that the Debye lattice vibrations are accurately following a $T^3$ heat capacity) have a component proportional to the temperature and list the proportionality constants for various metals. One thing you will notice is that the proportionality constants agree with the calculations to only ∼ 30% and up to a factor of 2 in at least one case. This is most likely due to the fact that the electrons are not really a non-interacting gas. Also, there are effects due to the crystal structure such as energy bands and gaps.
Other Fermi Gases

In addition to the conduction electron gas in metals, Fermi gases occur in other situations. In heavy elements, the number of electrons per atom becomes large enough that a statistical treatment is a reasonable approximation. This kind of treatment is called the Thomas-Fermi (or sometimes the Fermi-Thomas) model of the atom.

Also in heavy elements, the number of nucleons (neutrons and protons) in the nucleus is large and, again, a statistical treatment is a reasonable approximation. The radius of a nucleus is
$$R \approx (1.3\times10^{-13}\ {\rm cm})\cdot A^{1/3}\,,$$
where $A$ is the number of nucleons. The coefficient in this relationship can vary by a tenth or so depending on just how one measures the size: scattering by charged particles, scattering by neutrons, effects on atomic structure, etc. Aside: the unit of length $10^{-13}$ cm, which is one femtometer, is called the Fermi in nuclear physics. The volume of a nucleus is
$$V = \frac{4\pi}{3}\,2.2\times10^{-39}\,A\ {\rm cm^3}\,,$$
and the number density or concentration is
$$n_{\rm nuc} = \frac{A}{V} = 1.1\times10^{38}\ {\rm cm^{-3}}\,.$$
The nuclear density (with this number of significant digits, the mass difference between neutrons and protons is negligible) is
$$\rho_{\rm nuc} = 1.8\times10^{14}\ {\rm g\ cm^{-3}}\,.$$
Basically, all nuclei have the same density. Of course, this is not quite true. Nuclei have a shell structure and “full shell” nuclei are more tightly bound than partially full shell nuclei. Also, the very lightest nuclei show some deviations. Nevertheless, the density variations aren’t large and it’s reasonable to speak of the nuclear density.
The neutron to proton ratio in nuclei is about 1 : 1 for light nuclei up to about 1.5 : 1 for heavier nuclei. Assuming the latter value, then it is the neutrons whose Fermi energy is important.
$$\epsilon_F = \frac{\hbar^2}{2m_n}\left(3\pi^2\,(0.6\cdot n_{\rm nuc})\right)^{2/3} = 5.2\times10^{-5}\ {\rm erg} = 32\ {\rm MeV}\,.$$
This is a little larger than K&K’s number because it’s computed for a nucleus with 40% protons and 60% neutrons, instead of equal numbers. Since the average kinetic energy in a Fermi gas is $3\epsilon_F/5$, the average kinetic energy is about 19 MeV in a heavy nucleus. The experimentally determined binding energy per nucleon is about 8 MeV. This varies somewhat, especially for light nuclei; it reaches a peak at $^{56}$Fe. To the extent that the binding energy per nucleon and the kinetic energy per nucleon are constant, the potential energy per nucleon is also constant. This reflects the fact that the nuclear force is the short range strong force and nucleons only “see” their nearest neighbors. The strong force is about the same between neutrons and protons, between protons and protons and between neutrons and neutrons. But the protons have a long range electromagnetic interaction. As the number of particles goes up, the “anti-binding” energy of the protons goes up faster than the number of protons (can you figure out the exponent?), so the equilibrium shifts to favor neutrons in spite of the fact that they are slightly more massive than protons. The Fermi temperature for neutrons in a heavy nucleus is
$$T_F = \epsilon_F/k = 3.8\times10^{11}\ {\rm K}\,,$$
so nuclei (which are usually in their ground state) are very cold!

In a star like the Sun, gravity is balanced by the pressure of a very hot, but classical, ideal gas. The Sun has a mass about 300,000 times that of the Earth and a radius about 100 times that of the Earth, so the average density of the Sun is somewhat less than that of the Earth (it’s about the density of water!). The temperature varies from about 20 million Kelvins at the center to about 6000 K at the surface.
So it’s completely gaseous and the electrons are non-degenerate throughout. Since the Sun is radiating, it is cooling. Energy is supplied by nuclear reactions in the Sun’s core.

A typical white dwarf star has about the mass of the Sun but the radius of the Earth. It’s the degeneracy pressure of the electrons that balances gravity in a white dwarf. White dwarves shine by cooling. There are no nuclear reactions in the core, so after they cool enough, they become invisible. White dwarves are discussed in K&K, so let’s move on to neutron stars, which are not discussed in K&K.

In a neutron star it’s the degeneracy pressure of the neutrons that balances gravity. A typical neutron star has a mass like the Sun ($M = 2\times10^{33}$ g) but a radius smaller than New Jersey, let’s say $R \approx 10$ km. Let’s assume that the mass in a neutron star is uniformly distributed. What’s the density?
$$\rho = M/V = 4.8\times10^{14}\ {\rm g\ cm^{-3}}\,,$$
about three times nuclear density. (Of course, the density in a star is not uniform and it may exceed 10 times nuclear density in the center, but we’re just trying to do a back of the envelope calculation here.) In terms of the concentration of neutrons, this corresponds to
$$n_0 = 2.9\times10^{38}\ {\rm cm^{-3}}\,.$$
The Fermi energy for these neutrons is
$$\epsilon_{F,0} \approx 86\ {\rm MeV}\,,$$
and the Fermi temperature is
$$T_F = 10^{12}\ {\rm K}\,.$$
Neutron stars are nowhere near this hot. Otherwise they would be very strong sources of gamma rays. Instead they are thought to have temperatures of millions of degrees and radiate X-rays. I believe there are some observations which indicate this. Also, due to having a magnetic field and rotating, they can radiate electromagnetic energy and are observed as pulsars. In any case, the neutrons in a neutron star are cold!

An interesting question is why is a neutron star made of neutrons? (Well, if it weren’t, we probably wouldn’t call it a neutron star, but besides that?) In particular, what’s wrong with the following? Let the star be made of protons and electrons, each with the concentration we’ve just calculated. Then the star is electrically neutral because there is a sea of positively charged protons and a sea of negatively charged electrons. But the protons have a slightly lower mass than the neutrons and this is true even if one adds in the mass of the electron, so this configuration would seem to be energetically favored over the neutron configuration. In fact, a free neutron decays according to
$$n \to p + e^- + \bar\nu_e\,,$$
where $\bar\nu_e$ is the electron anti-neutrino. The mean lifetime is about 15 minutes. Neutrinos are massless (or very nearly so) and for our purposes we can ignore them. That is, we can assume that the neutrons are in equilibrium with the protons and electrons. If we need to change a neutron into a proton and electron, the above reaction will do it. If we need to change a proton and electron into a neutron, there is
$$p + e^- \to n + \nu_e\,.$$
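The Fermi energies quoted above for neutrons in a heavy nucleus and in a neutron star both come from the same formula, $\epsilon_F = (\hbar^2/2m)(3\pi^2 n)^{2/3}$. The sketch below redoes the back-of-the-envelope numbers in CGS units using the inputs stated in the notes (radius coefficient $1.3\times10^{-13}$ cm, 60% neutrons, $M = 2\times10^{33}$ g, $R = 10$ km); the variable names are illustrative.

```python
import math

# CGS constants
HBAR = 1.054572e-27         # erg s
M_N = 1.674927e-24          # neutron mass, g
K_B = 1.380649e-16          # erg / K
ERG_PER_MEV = 1.602177e-6   # erg per MeV

def fermi_energy(n, m):
    """Non-relativistic Fermi energy (erg) of a spin-1/2 gas of concentration n (cm^-3)."""
    return HBAR**2 / (2.0 * m) * (3.0 * math.pi**2 * n) ** (2.0 / 3.0)

# Heavy nucleus: R = r0 * A^(1/3), so the volume per nucleon is (4 pi / 3) r0^3.
r0 = 1.3e-13                                        # cm
n_nuc = 1.0 / (4.0 / 3.0 * math.pi * r0**3)         # nucleons per cm^3
rho_nuc = n_nuc * M_N                               # g / cm^3
eps_nucleus_mev = fermi_energy(0.6 * n_nuc, M_N) / ERG_PER_MEV
t_f_nucleus = eps_nucleus_mev * ERG_PER_MEV / K_B   # K

# Uniform-density neutron star
star_mass = 2.0e33                                  # g
star_radius = 1.0e6                                 # cm (10 km)
rho_star = star_mass / (4.0 / 3.0 * math.pi * star_radius**3)
n0 = rho_star / M_N
eps_star_mev = fermi_energy(n0, M_N) / ERG_PER_MEV
t_f_star = eps_star_mev * ERG_PER_MEV / K_B

print(f"nucleus: n = {n_nuc:.2e} cm^-3, eps_F = {eps_nucleus_mev:.0f} MeV, T_F = {t_f_nucleus:.1e} K")
print(f"star:    rho = {rho_star:.1e} g/cm^3, n0 = {n0:.1e} cm^-3, "
      f"eps_F = {eps_star_mev:.0f} MeV, T_F = {t_f_star:.1e} K")
```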
What would be the Fermi energies of the protons and electrons in our hypothetical star? The Fermi energy for the protons would be very nearly the same as that for the neutrons above (because the concentration would be the same and the mass is nearly the same). On the other hand, the Fermi energy of the electrons would be larger by the ratio of the neutron mass to the electron mass, a factor of 1838, so the electron Fermi energy would be about 160,000 MeV, enough to make about 170 nucleons! Remember that the chemical potential (the Fermi energy since all the gases are cold) is the energy required to add a particle to a system. If neutrons are in equilibrium with protons and electrons, then the chemical potential of the neutrons equals the chemical potential of the protons plus the chemical potential of the electrons minus the energy difference between a neutron and a proton plus electron. In other words
$$\epsilon_{F,n} = \epsilon_{F,p} + \epsilon_{F,e} - (m_n - m_p - m_e)c^2\,.$$
Denote the concentrations of the neutrons, protons, and electrons by $n_n$, $n_p$, and $n_e$. Then
$$n_p = n_e\,,$$
for charge neutrality and
$$n_p + n_n = n\,,$$
where $n$ is the concentration of the nucleons, which is not changed by the reactions above. To simplify the notation a bit, let
$$x = \frac{n_p}{n} = \frac{n_e}{n}\,,\qquad 1 - x = \frac{n_n}{n}\,.$$
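Each of the Fermi energies can be written in terms of the concentrations
$$\begin{aligned}
\epsilon_{F,n} &= \frac{\hbar^2}{2m_n}(3\pi^2 n_n)^{2/3} = \epsilon_{F,0}\left(\frac{n}{n_0}\right)^{2/3}(1-x)^{2/3}\,,\\
\epsilon_{F,p} &= \frac{\hbar^2}{2m_p}(3\pi^2 n_p)^{2/3} = \epsilon_{F,0}\left(\frac{n}{n_0}\right)^{2/3}\frac{m_n}{m_p}\,x^{2/3}\,,\\
\epsilon_{F,e} &= \frac{\hbar^2}{2m_e}(3\pi^2 n_e)^{2/3} = \epsilon_{F,0}\left(\frac{n}{n_0}\right)^{2/3}\frac{m_n}{m_e}\,x^{2/3}\,,
\end{aligned}$$
where
$$\epsilon_{F,0} = \frac{\hbar^2}{2m_n}(3\pi^2 n_0)^{2/3}\,,$$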
is the Fermi energy for a pure neutron gas at the concentration $n_0$ we calculated previously. We plug these energies into the energy equation to obtain
$$\epsilon_{F,0}\left(\frac{n}{n_0}\right)^{2/3}(1-x)^{2/3} = \epsilon_{F,0}\left(\frac{n}{n_0}\right)^{2/3}\left(\frac{m_n}{m_p}+\frac{m_n}{m_e}\right)x^{2/3} - E\,,$$
where $E = (m_n - m_p - m_e)c^2 = 0.783$ MeV is the mass energy excess of a neutron over a proton and electron. If we rearrange slightly, we obtain
$$(1-x)^{2/3} = \left(\frac{m_n}{m_p}+\frac{m_n}{m_e}\right)x^{2/3} - \frac{E}{\epsilon_{F,0}}\left(\frac{n_0}{n}\right)^{2/3}\,,$$
or
$$(1-x)^{2/3} = (1.0014 + 1838.7)\,x^{2/3} - \frac{0.783}{86}\left(\frac{n_0}{n}\right)^{2/3}\,,$$
or
$$x^{2/3} = 0.000544\left[(1-x)^{2/3} + 0.0091\left(\frac{n_0}{n}\right)^{2/3}\right]\,.$$
If $n$ is in the neighborhood of $n_0$, then $x$ is small, we can ignore $x$ on the right hand side, and we finally obtain
$$x \approx 1.3\times10^{-5}\,.$$
At higher concentrations $x$ will get slightly smaller and at lower concentrations $x$ will grow slowly. The concentrations of neutrons, protons, and electrons are equal ($x = 0.5$) when
$$n = 2.2\times10^{-8}\,n_0 = 6.4\times10^{30}\ {\rm cm^{-3}}\,.$$
Such low concentrations will be attained only very near the surface of the neutron star.

Caveats: (1) The Fermi energy of the electrons works out to be about 87 MeV, so the electrons are extremely relativistic, and we really shouldn’t be using our non-relativistic formula for the electron Fermi energy. One of this week’s homework problems gives you a chance to modify the treatment to allow for relativistic electrons. (2) With an electron and proton instead of a neutron, the pressure changes, so the equilibrium condition that we set up is not quite right. Nevertheless, this calculation gives the flavor of what’s involved and points to the correct conclusion: For most of its volume a neutron star is almost pure neutrons!

Of course, we can turn the earlier question around: how is it that nuclei have any protons??? Haven’t we just shown that at nuclear densities, the nucleons must exist as neutrons, not protons???
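The equilibrium condition above is a single nonlinear equation for the proton fraction $x$, so it can be solved for any density rather than only in the small-$x$ limit. The sketch below uses bisection on $(1-x)^{2/3} = (1.0014 + 1838.7)\,x^{2/3} - (0.783/86)(n_0/n)^{2/3}$ with the numerical coefficients from the notes. It keeps the same non-relativistic electron formula, so caveat (1) applies to its output as well; the function name `proton_fraction` is illustrative.

```python
MASS_RATIO_SUM = 1.0014 + 1838.7   # m_n/m_p + m_n/m_e, values from the notes
EXCESS_OVER_EF0 = 0.783 / 86.0     # E / eps_F0, both in MeV
EPS_F0_MEV = 86.0                  # pure-neutron Fermi energy at n0
MN_OVER_ME = 1838.7

def proton_fraction(n_over_n0):
    """Solve the beta-equilibrium condition for x = n_p / n by bisection."""
    b = EXCESS_OVER_EF0 * n_over_n0 ** (-2.0 / 3.0)

    def g(x):
        # g increases monotonically with x; its root is the equilibrium x.
        return MASS_RATIO_SUM * x ** (2.0 / 3.0) - b - (1.0 - x) ** (2.0 / 3.0)

    lo, hi = 0.0, 1.0
    if g(hi) <= 0.0:
        return 1.0          # density so low that no neutrons are favored
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_star = proton_fraction(1.0)      # at n = n0
x_dilute = proton_fraction(2.2e-8) # density at which the notes find x = 0.5
eps_e_mev = EPS_F0_MEV * MN_OVER_ME * x_star ** (2.0 / 3.0)  # electron Fermi energy at n0

print(x_star, x_dilute, eps_e_mev)
```

Scanning `n_over_n0` over several decades shows how slowly $x$ changes with density until the density is many orders of magnitude below $n_0$.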
Physics 301
23-Oct-2002 18-1
Bose-Einstein Gases

An amazing thing happens if we consider a gas of non-interacting bosons. For sufficiently low temperatures, essentially all the particles are in the same state (the ground state). This Bose-Einstein condensation can occur even when the temperature is high enough that one would naively expect that higher energy states should be well populated. In addition, properties of the gas change when it is in this state, so something like a phase transition occurs.

Note that photons obey Bose statistics so they constitute a non-interacting gas of bosons. We’ve already calculated their distribution (which has $\mu = 0$). It’s just the Planck distribution and this distribution does not have a Bose-Einstein condensation. The difference between photons and the situation we’re about to discuss is that there is no fixed number of photons. If a photon gas is cooled, the number of photons per unit volume decreases. The gases we’ll be considering will contain a fixed number of matter particles.

So let the gas contain $N$ particles. The Bose-Einstein distribution is
$$f(\epsilon) = \frac{1}{e^{(\epsilon-\mu)/\tau}-1}\,,$$
and the sum of this distribution function over all states must add up to $N$. For convenience, we adjust the energy scale so that the lowest energy state has $\epsilon = 0$. When $\tau \to 0$, all the particles must be in the ground state,
$$\begin{aligned}
\lim_{\tau\to0}\frac{1}{e^{-\mu/\tau}-1} &= N\,,\\
\lim_{\tau\to0}\left(e^{-\mu/\tau}-1\right) &= \frac{1}{N}\,,\\
\lim_{\tau\to0}e^{-\mu/\tau} &= 1 + \frac{1}{N}\,,\\
\lim_{\tau\to0}\left(1 - \frac{\mu}{\tau} + \cdots\right) &= 1 + \frac{1}{N}\,,\\
\lim_{\tau\to0}\frac{-\mu}{\tau} &= \frac{1}{N}\,,\\
\lim_{\tau\to0}\mu &= -\frac{\tau}{N}\,.
\end{aligned}$$
Recall that $\mu$ must be lower than any accessible energy for the Bose-Einstein distribution, and here we have $\mu < 0$ in agreement with this constraint. It converges to 0 as $\tau \to 0$, but this is to be expected as all the particles must pile up in the ground state when $\tau \to 0$. It’s instructive to evaluate $\mu$ for a mole of particles at a temperature of 1 K. The result is
$$\mu(1\ {\rm K}) = -2.3\times10^{-40}\ {\rm erg}\,.$$
If we consider a mole of $^4$He and treat it as an ideal gas with $p = 1$ atm and $T = 1$ K, then its volume would be $V = 82$ cm$^3$. This would be equivalent to a cube of side $L = 4.3$ cm. Recall that the energies of single particle states in a cube are
$$\epsilon(n_x, n_y, n_z) = \frac{\pi^2\hbar^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right)\,.$$
The ground state has $n_x = n_y = n_z = 1$ and in the first excited state one of these quantum numbers is 2. Using the $L$ we just calculated and the mass of $^4$He, we find
$$\epsilon(1,1,1) = 1.34\times10^{-31}\ {\rm erg}\,,\qquad \epsilon(2,1,1) = 2.68\times10^{-31}\ {\rm erg}\,.$$
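These numbers are easy to reproduce. The sketch below computes $\mu = -\tau/N$ for a mole at 1 K, the ideal-gas volume and cube side at 1 atm, and the two lowest particle-in-a-box levels for $^4$He, all in CGS units. It keeps the unrounded side $L \approx 4.34$ cm, so the level energies come out a couple of percent below the values quoted above, which use $L = 4.3$ cm. The last line compares the level spacing with $|\mu|$.

```python
import math

# CGS constants
HBAR = 1.054572e-27   # erg s
K_B = 1.380649e-16    # erg / K
N_A = 6.02214e23
ATM = 1.01325e6       # dyn / cm^2
M_HE = 4.0026 / N_A   # mass of a 4He atom, g

temperature = 1.0                            # K
volume = N_A * K_B * temperature / ATM       # cm^3, one mole of ideal gas
side = volume ** (1.0 / 3.0)                 # cm

def box_level(nx, ny, nz, mass=M_HE, length=side):
    """Single-particle energy (erg) in a cubical box of the given side."""
    unit = math.pi**2 * HBAR**2 / (2.0 * mass * length**2)
    return unit * (nx**2 + ny**2 + nz**2)

e111 = box_level(1, 1, 1)
e211 = box_level(2, 1, 1)
mu = -K_B * temperature / N_A                # erg, low-temperature limit mu = -tau/N
gap_over_mu = (e211 - e111) / abs(mu)

print(f"V = {volume:.1f} cm^3, L = {side:.2f} cm")
print(f"e(1,1,1) = {e111:.3g} erg, e(2,1,1) = {e211:.3g} erg")
print(f"mu = {mu:.3g} erg, gap/|mu| = {gap_over_mu:.2g}")
```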
Actually, the ground state energy is supposed to be adjusted to 0, so we need to subtract $\epsilon(1,1,1)$ from all energies in the problem. Then the ground state energy is 0 and the first excited state energy is
$$\epsilon_1 = 1.34\times10^{-31}\ {\rm erg} = 5.8\times10^{8}\,|\mu|\,,$$
at $T = 1$ K. The key point is that even though the energy of the first excited state is incredibly small, and you might think such a small energy can have nothing to do with any macroscopic properties of a system, this energy (or more properly, the difference in energy between the ground state and the first excited state) is almost nine orders of magnitude bigger than $|\mu|$ (at the temperature and density we’re considering). Under these conditions, what is the population of the first excited state?
$$\begin{aligned}
N_1 &= \frac{1}{e^{(\epsilon_1-\mu)/\tau}-1}\,,\\
&= \frac{1}{e^{-5.8\times10^{8}\mu/\tau}-1}\,,\\
&= \frac{1}{1 - 5.8\times10^{8}\mu/\tau + \cdots - 1}\,,\\
&= \frac{1}{-5.8\times10^{8}\mu/\tau}\,,\\
&= \frac{1}{5.8\times10^{8}/N}\,,\\
&= \frac{N}{5.8\times10^{8}}\,,
\end{aligned}$$
so the occupancy of the first excited state is almost 9 orders of magnitude smaller than the occupancy of the ground state. Essentially all the particles are in the ground state even though $kT$ is much larger than the excitation energy!

Now we want to do a proper sum of the occupancy over the energy states. We might try to write
$$N = \int_0^\infty f(\epsilon)D(\epsilon)\,d\epsilon\,,$$
where
$$D(\epsilon) = \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\sqrt{\epsilon}\,,$$
is the same density of states we used for the Fermi gas except there’s a factor of two missing because we’re assuming a spin 0 boson gas. (If the spin were different from 0, we would include a factor $2S+1$ to account for the multiplicity of the spin states.) The expression above has the problem that it fails to count the particles in the ground state. We have had this problem in previous calculations but it never mattered because there were only a few (2 or fewer) in the ground state and ignoring these particles makes absolutely no difference to any quantity involving the other ∼ $10^{23}$ particles.
However, we are expecting to find many, and in some cases most, of the particles in the ground state. It would not be a good idea to ignore them in the sum! So we write the sum as
$$N = N_0 + \int_0^\infty f(\epsilon)D(\epsilon)\,d\epsilon\,,$$
where the first term is the number of particles in the ground state and the second term accounts for all particles in excited states. This term still makes an error in the low energy excited states (since we’re integrating rather than summing), but when these states contain a lot of particles, the ground state contains orders of magnitude more, so errors in the occupancies of these states are of no concern. In the case that these states don’t contain many particles, it means that the occupancies of all states are small, and again we make no appreciable error if we miss on the occupancies of a few of the low energy excited states.

So, the number of particles in the ground state is
$$N_0 = \frac{1}{e^{-\mu/\tau}-1}\,,$$
and the number of particles in excited states is
$$\begin{aligned}
N_e &= \int_0^\infty f(\epsilon)D(\epsilon)\,d\epsilon\,,\\
&= \int_0^\infty \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\frac{\sqrt{\epsilon}}{e^{(\epsilon-\mu)/\tau}-1}\,d\epsilon\,,\\
&= \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty \frac{\sqrt{\epsilon}\,d\epsilon}{e^{(\epsilon-\mu)/\tau}-1}\,,\\
&= \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty \frac{\sqrt{\epsilon}\,d\epsilon}{e^{\epsilon/\tau}-1}\qquad(\text{since } |\mu|/\tau \ll \epsilon/\tau)\,,\\
&= \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\tau^{3/2}\int_0^\infty \frac{\sqrt{x}\,dx}{e^{x}-1}\qquad(x = \epsilon/\tau)\,,\\
&= \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\tau^{3/2}\,\Gamma(3/2)\,\zeta(3/2)\,,\\
&= \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\tau^{3/2}\;1.306\sqrt{\pi}\,,\\
&= 2.612\,V\left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2}\,,\\
&= 2.612\,V n_Q\,,
\end{aligned}$$
where $n_Q$ is the quantum concentration again. The major approximation we made in the above calculation was ignoring the chemical potential. As long as there are an appreciable number of particles in the ground state, then $|\mu|$ must be much smaller than the energy of any excited state and this is a good approximation. With the numerical example we worked out before, $|\mu|$ will be closer to 0 than to the first excited state energy provided the ground state contains about $10^{15}$ or more particles, which means the excited states must contain about $6\times10^{23} - 10^{15} \approx 6\times10^{23}$ particles. In other words, our approximation for $N_e$ above should be valid all the way to the point where $N_e = N$.

This means that $N_e \propto \tau^{3/2}$. We define the proportionality constant by defining the Einstein condensation temperature, $\tau_E$, such that
$$N_e = N\left(\frac{\tau}{\tau_E}\right)^{3/2}\,,$$
so
$$\tau_E = \frac{2\pi\hbar^2}{m}\left(\frac{N}{2.612\,V}\right)^{2/3}\,,$$
and we expect the expression for $N_e$ should be valid from $\tau = 0$ up to $\tau = \tau_E$. Then the number in the condensate is
$$N_0 = N\left[1 - \left(\frac{\tau}{\tau_E}\right)^{3/2}\right]\,.$$
Numerically, the Einstein temperature is
$$T_E = \frac{115}{V_m^{2/3}\,m}\,,$$
where $T_E$ is in Kelvins, $V_m$ is the molar volume in cm$^3$ and $m$ is the molar weight in grams. For liquid $^4$He, with a molar volume of 27.6 cm$^3$, this gives $T_E = 3.1$ K. There is actually a transition in liquid helium at about 2.17 K. Below this temperature, liquid $^4$He develops a superfluid phase. This phase is most likely a Bose-Einstein condensation, but it is more complicated than the simple theory we have worked out because there are interatomic forces between the helium atoms. We know this because there must be forces that are responsible for the condensation of helium gas to liquid helium at $T = 4.2$ K and one atmosphere.

If you read the articles referenced at the beginning of these notes, you’ll see that a major problem faced by the experimenters in creating BE condensates in other systems is getting the atoms cold enough and dense enough to actually form the condensate. In the case of helium, the attractive interactions help to get the density high enough to form the condensate at more accessible temperatures!
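The numerical coefficients in the condensation-temperature estimate above (the 2.612, the $1.306\sqrt{\pi}$, and the 115) can all be regenerated from $\Gamma(3/2)$ and $\zeta(3/2)$. The sketch below is one way to do that with the standard library only. The `zeta` helper is an assumption of the example: it sums the first 1000 terms and adds an Euler-Maclaurin tail correction, which is more than accurate enough here.

```python
import math

# CGS constants
HBAR = 1.054572e-27   # erg s
K_B = 1.380649e-16    # erg / K
N_A = 6.02214e23

def zeta(s, terms=1000):
    """Riemann zeta for s > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    k = float(terms)
    partial = sum(j ** (-s) for j in range(1, terms + 1))
    tail = k ** (1.0 - s) / (s - 1.0) - 0.5 * k ** (-s) + s * k ** (-s - 1.0) / 12.0
    return partial + tail

def einstein_temperature(molar_volume, molar_mass):
    """Condensation temperature (K) of an ideal spin-0 Bose gas.

    molar_volume in cm^3, molar_mass in grams.
    """
    m = molar_mass / N_A
    n = N_A / molar_volume
    tau_e = 2.0 * math.pi * HBAR**2 / m * (n / zeta(1.5)) ** (2.0 / 3.0)
    return tau_e / K_B

def condensate_fraction(t, t_e):
    """N0 / N = 1 - (T/T_E)^(3/2) below T_E, zero above."""
    return max(0.0, 1.0 - (t / t_e) ** 1.5)

integral_value = math.gamma(1.5) * zeta(1.5)   # integral of sqrt(x)/(e^x - 1)
coefficient = einstein_temperature(1.0, 1.0)   # the "115" in T_E = 115 / (V_m^(2/3) m)
t_e_helium = einstein_temperature(27.6, 4.0)   # liquid 4He

print(zeta(1.5), integral_value, coefficient, t_e_helium)
print(condensate_fraction(2.17, t_e_helium))
```

The last line evaluates the ideal-gas condensate fraction at 2.17 K purely for illustration; as the notes stress, real liquid helium is not a non-interacting gas, so this number should not be read as a prediction.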
Superfluid Helium

As mentioned, the transition of $^4$He at 2.17 K at 1 atm is believed to be the condensation of most of the helium atoms into the ground state, a Bose-Einstein condensation. That this does not occur at the calculated temperature of 3.1 K is believed to be due to the fact that there are interactions among helium atoms so that helium cannot really be described as a non-interacting boson gas! Above the transition temperature, helium is referred to as He I and below the transition, it’s called He II. K&K present several reasons why thinking of liquid helium as a non-interacting gas is not totally off the wall. You should read them and also study the phase diagrams for both $^4$He and $^3$He (K&K figures 7.14 and 7.15).

The fact that something happens at 2.17 K is shown by the heat capacity versus temperature curve (figure 7.12 in K&K), which is similar to the heat capacity curve for a phase transition and also not all that different from the curve you’re going to calculate for the Bose-Einstein condensate in the homework problem (despite what the textbook problem actually says about a marked difference in the curves!).

In addition to the heat capacity, the mechanical properties of $^4$He are markedly different below the transition temperature. The liquid helium becomes a superfluid, which means it flows without viscosity (that is, friction). The existence of a Bose-Einstein condensate does not necessarily imply the existence of superfluidity. To understand this we need to examine the mechanics of friction on a microscopic level. Without worrying about the details, friction must be caused by molecular collisions which transfer energy from the average motion (represented by the bulk velocity) to the microscopic motion (internal energy).
If our Bose-Einstein condensate were really formed from a gas of non-interacting particles, then it would be possible to excite any molecule in the condensate out of the ground state and into the first excited state simply by providing the requisite energy (and momentum). Previously, we calculated that under typical conditions, the energy difference
between the ground state and the first excited state was about $10^{-31}$ erg, an incredibly small amount of energy that would be very easy to provide given that thermal energies are about $10^{-16}$ erg. In order to have superfluid behavior, it must be that there are interactions among the molecules such that it’s not possible to excite just one molecule out of the condensate and into the first excited state. A better way to say this is that due to the molecular interactions, the single particle states are not discrete energy states of the superfluid.

We need to consider the normal modes of the fluid: the longitudinal oscillations, or sound waves. In particular, we can consider travelling sound waves of wave vector $\mathbf{k}$ and frequency $\omega$ (rather than the standing waves, which carry no net momentum). A travelling wave carries energy in units of $\hbar\omega$ and momentum in units of $\hbar\mathbf{k}$. The number of units is determined by the number of phonons or quasiparticles in the wave.

Now imagine an object of mass $M$ moving through a stationary superfluid with velocity $\mathbf{V}_i$. In order for there to be a force on the object, there must be a momentum transfer to the superfluid. In order to do this, the object must create excitations in the fluid which contain momentum (the quasiparticles in the travelling waves we just discussed). Of course, if quasiparticles already exist, the object could “collide” with a quasiparticle and scatter it to a new state of energy and momentum. (This can also be viewed as the absorption of one quasiparticle and the emission of another.) We will assume that there are not very many existing quasiparticles and consider only the creation (emission) of a quasiparticle.

So, let’s consider this emission process. Before the event, the object has velocity $\mathbf{V}_i$ and afterwards it has velocity $\mathbf{V}_f$. We must conserve both energy and momentum,
$$\frac{1}{2} M V_i^2 = \frac{1}{2} M V_f^2 + \hbar\omega ,$$
and
$$M\mathbf{V}_i = M\mathbf{V}_f + \hbar\mathbf{k} .$$
We can go through some algebra with the goal of solving for $\mathbf{V}_i\cdot\mathbf{k}$.
The momentum equation can be rewritten as
$$M\mathbf{V}_i - \hbar\mathbf{k} = M\mathbf{V}_f ,$$
squared and divided by $2M$,
$$\frac{1}{2} M V_i^2 - \hbar\mathbf{V}_i\cdot\mathbf{k} + \frac{1}{2M}\hbar^2 k^2 = \frac{1}{2} M V_f^2 .$$
Subtract from the energy equation to get
$$-\hbar\omega + \hbar\mathbf{V}_i\cdot\mathbf{k} = \frac{1}{2M}\hbar^2 k^2 ,$$
or
$$\mathbf{V}_i\cdot\frac{\mathbf{k}}{k} = \frac{\hbar\omega}{\hbar k} + \frac{\hbar k}{2M} .$$
What we really want to do is place a lower limit on the magnitude of $\mathbf{V}_i$. This means we can drop the term containing $M$ on the right hand side. The smallest value will occur when $\mathbf{V}_i$ is parallel to $\mathbf{k}/k$, the unit vector in the $\mathbf{k}$ direction. This corresponds to emission of the quasiparticle in the forward direction. Thus
$$V_i > \frac{\hbar\omega}{\hbar k} .$$
I’ve left the $\hbar$’s there in order to emphasize that the right hand side is the ratio of the energy to the momentum of an excitation.

Suppose the excitations are single particle states with momentum $\hbar k$ and energy $\hbar^2 k^2/2m$. This is the travelling wave analog of the standing wave particle-in-a-box states we’ve discussed many times. Then the right hand side becomes $\hbar k/2m$, which goes to zero as $k \to 0$. Thus, an object moving with an arbitrarily small velocity can produce an excitation and feel a drag force; there is no superfluid in this case. (Note: $k$ must be bigger than $\sim 1/L$, where $L$ is the size of the box containing the superfluid, but as we’ve already seen the energies corresponding to this $k$ are tiny compared to thermal energies.)
Suppose the excitations are sound waves (as we’ve been assuming) and the phase velocity is independent of $k$. Then
$$V_i > \frac{\omega}{k} = v_s ,$$
where $v_s$ is the phase velocity of sound in the fluid. This means that if an object moves through the fluid at less than the velocity of sound, the motion is without drag! That is, the fluid is a superfluid. In fact, $v_s$ is not independent of $k$ and what sets the limit is the minimum phase velocity of any excitation that can be created by the object moving through the fluid. Figure 7.17 in K&K shows that this minimum is about 5000 cm s$^{-1}$ for the low lying excitations in $^4$He. K&K point out that helium ions have been observed to travel without drag through He II at speeds up to about 5000 cm s$^{-1}$!

Comment 1: It appears that we’ve shown that any fluid should be a superfluid as long as we don’t move things through it faster than its speed of sound. In our derivation, we made an assumption that doesn’t apply in most cases. Can you figure out which assumption it was?

Comment 2: As a general rule, superfluidity or superconductivity requires the condensation of many particles into a single (ground) state and a threshold for creating excitations. The minimum velocity required to create an excitation is the threshold for non-viscous flow.
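The threshold argument above can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the minimum of $\omega/k$ over a range of wave numbers for the two dispersion relations discussed (free particles and idealized sound waves). The sound speed `V_SOUND`, the range of `k_values`, and the function names are illustrative assumptions.

```python
HBAR = 1.0546e-27    # erg s
M_HE4 = 6.646e-24    # g, mass of one 4He atom (about 4.0026 u)
V_SOUND = 2.4e4      # cm/s; assumed illustrative sound speed, not taken from the notes


def min_phase_velocity(omega_of_k, k_values):
    """Smallest omega/k over the sampled wave numbers (the drag threshold)."""
    return min(omega_of_k(k) / k for k in k_values)


def omega_free(k):
    """Free-particle excitation: hbar*omega = hbar^2 k^2 / 2m."""
    return HBAR * k**2 / (2.0 * M_HE4)


def omega_sound(k):
    """Idealized sound wave with a k-independent phase velocity."""
    return V_SOUND * k


# Wave numbers from 1 cm^-1 (box-sized) up to 1e8 cm^-1 (atomic scale).
k_values = [10.0 ** (e / 10.0) for e in range(0, 81)]

v_free = min_phase_velocity(omega_free, k_values)
v_sound = min_phase_velocity(omega_sound, k_values)

print(f"free particles: threshold ~ {v_free:.2e} cm/s (tends to 0 as k -> 0)")
print(f"sound waves:    threshold ~ {v_sound:.2e} cm/s (the sound speed)")
```

For free-particle excitations the threshold shrinks with the smallest allowed $k$, so there is effectively no drag-free regime, while for sound-like excitations it stays at the sound speed.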
Physics 301
25-Oct-2002 19-1
Heat and Work

Now we want to discuss the material covered in chapter 8 of K&K. This material might be considered to have a more classical thermodynamics rather than statistical mechanics flavor. We’ve already discussed a lot of this material in bits and pieces throughout the term, so we will try to focus on the material not yet covered and just hit the highlights of the remaining material.

Heat and work occur during processes. They are energy transfers. Work is an energy transfer by macroscopic means and heat is an energy transfer by microscopic means. We’ve discussed reversible processes several times and we’ll assume reversible processes unless we explicitly state otherwise. When work is done to or by a system, the macroscopic parameters of the system are changed; for example, changing the volume causes $p\,dV$ work to be performed. Performing work changes the energy, $U$, of a system. But work does not change the entropy. Heat transfer changes the entropy as well as the energy:
$$dU = \bar{d}Q = \tau\,d\sigma .$$
A very important activity in any modern society is the conversion of heat to work. This is why we have power plants and engines, etc. Basically all forms of mechanical or electrical energy that we use involve heat to work conversion. Not all of them involve fossil fuels, and in some cases it may be hard to see where the heat enters. For example, what about hydro-electric power? This is the storage of water behind a dam and then releasing the gravitational potential energy of the water to run an electric generator. Where is the heat supplied? Heat is supplied in the form of sunlight which keeps the weather going which provides water in the atmosphere to make rain to fill the lake behind the dam. Of course, the economics of this process are quite different from the economics of an oil fired electrical generating plant. It was the steam engine (conversion of heat, obtained by burning coal, into work) that allowed the industrial revolution to proceed. The desire to make better steam engines produced thermodynamics!

With an irreversible process you can turn work completely into heat. Actually, this statement is not well defined. What we really mean to say is that with an irreversible process we can use work to increase the internal energy of a system and leave that system in a final configuration that would be exactly the same as if we had reversibly heated the system. For example, consider a viscous fluid in an insulating container. Immersed in the fluid is a paddle wheel which is connected by a string running over various pulleys and whatever to a weight. The weight is allowed to fall under the influence of gravity. Because the fluid is so viscous, the weight drops at a constant slow speed. Once the weight reaches the end of its travel, we wait for the fluid to stop sloshing and the temperature and pressure in the fluid to become uniform. Thus essentially all of the mechanical gravitational
potential energy is converted to internal energy of the fluid. We can take the fluid from the same initial state to the same final state by heating slowly (reversibly!) until we have the same temperature rise.

There is no known way to convert (reversibly or non-reversibly) heat (more properly, internal energy, $U$) entirely into work with no other change. This is one of the ways of stating the second law of thermodynamics.

It is certainly possible to convert heat into work. (I’m getting tired of trying to say it exactly correctly, so I’ll just use the vernacular and you know what I mean, right?) The constraints are that you can’t convert all of it to work or there must be some permanent change in the system or both. For example, suppose we reversibly add heat to an ideal gas while we keep the volume constant. Then we insulate the gas and allow it to reversibly expand until its temperature is the same as when we started. Then the internal energy of the gas is the same as when we started, so we have completely converted the heat into work, but the system is not the same as when we started. The gas now occupies a bigger volume and has a lower pressure.

The problem is that when we reversibly add heat to a system we add internal energy $dU = \bar{d}Q$ and we also add entropy $d\sigma = \bar{d}Q/\tau$, but when we use the system to perform work, we remove only the energy $dU = \bar{d}W$ and leave the entropy! If we want to continue using the system to convert heat to work, we have to remove the entropy as well as the energy, so there is no accumulation of entropy. The only way to remove entropy (reversibly) is to remove heat. We want to remove less heat than we added (so we have some energy left over for work) so we must remove the heat at a lower temperature than it was added in order to transfer the same amount of entropy.
To make this a little more quantitative, consider some time interval (perhaps a complete cycle of a cyclic engine) during which heat $Q_h$ is transferred into the system at temperature $\tau_h$, heat $-Q_l$ is transferred into the system at temperature $\tau_l$, and energy $-W$ in the form of work is transferred into the system. (So heat $Q_l > 0$ leaves the system and work $W > 0$ is performed on the outside world.) At the end of this time interval we want the system to be in the same state it was when we started. This means
$$\Delta U = 0 = Q_h - Q_l - W ,$$
and
$$\Delta\sigma = 0 = \frac{Q_h}{\tau_h} - \frac{Q_l}{\tau_l} .$$
We find
$$\frac{Q_h}{Q_l} = \frac{\tau_h}{\tau_l} ,$$
and
$$\eta_C = \frac{W}{Q_h} = 1 - \frac{\tau_l}{\tau_h} .$$
The ratio of the heat input and output is the same as the ratio of the temperatures of the input and output reservoirs. The energy conversion efficiency, or just efficiency, $\eta$, is defined as the work output over the heat input, $W/Q_{\rm input}$. For the ideal engine we’ve been considering, the efficiency is $\eta_C$, the Carnot efficiency, and is the upper limit to the efficiency of any real (i.e. non-reversible) engine operating between temperature extremes $\tau_h$ and $\tau_l$. Carnot might be called the father of thermodynamics. He worked in the early 1800’s and understood the second law. This was before heat was recognized as a form of energy!

Of course, this definition of efficiency is motivated by the fact that if you’re an electric power company, you can charge your customers based on $W$ but you have to pay your suppliers based on $Q_h$ and you want to maximize profits!

We live on the surface of the Earth and any engine must dump its waste heat, $Q_l$, at what amounts to room temperature, about 300 K. This is roughly the equilibrium temperature of the surface of the Earth and is set by radiation equilibrium between the Sun and Earth and between the Earth and space ($T = 3$ K). See problem 5 in chapter 4 of K&K. Aside: are you surprised that room temperature and the surface temperature of the Earth are about the same? Anyway, the waste heat goes into the environment and usually generates thermal pollution. There may come a time when a cost is associated with $Q_l$. In this case it’s still desirable to maximize $\eta$, because that minimizes $Q_l$. Because waste heat must be dumped at room temperature, improving the Carnot efficiency requires increasing the high temperature, $\tau_h$. But this is not so easy to do, especially in an economically viable power plant that’s supposed to last for many years.

Comment 1: no real engine is reversible, so all real engines operating between temperature extremes $\tau_h$ and $\tau_l$ will have an efficiency less than the Carnot efficiency.
Comment 2: many practical engines are designed in such a way that heat is exchanged at intermediate temperatures as well as the extremes. Such engines, even if perfectly reversible, frictionless, etc. have an efficiency less than the Carnot efficiency. However, if one assumes a perfect, reversible engine operating according to the specified design, one can calculate an efficiency (lower than the Carnot efficiency) which is the upper limit that can be achieved by that engine design.

A reversible refrigerator uses work supplied from the outside to remove heat from a low temperature reservoir and deposit that heat plus the work used as heat in a high temperature reservoir. Such a refrigerator is basically the reversible engine just discussed, but run backwards! The signs of $Q_h$, $Q_l$, and $W$ all change but anything involving their ratios remains the same. (Note if you wanted to “derive” refrigerators you could start from the same idea we used with the engine: entropy is only transferred when there’s a heat transfer and entropy must not be allowed to accumulate in the refrigerator.) With refrigerators, one uses a coefficient of performance, defined as the ratio of the heat removed from the low temperature reservoir to the work required. This is
$$\gamma = \frac{Q_l}{W} ,$$
and for a reversible refrigerator operating between temperatures $\tau_l$ and $\tau_h$, the Carnot coefficient of performance is
$$\gamma_C = \frac{\tau_l}{\tau_h - \tau_l} ,$$
and this is an upper limit to the performance of any refrigerator operating between the same temperature extremes.

Aside: If you go to a department store and look at air conditioners, you will find something called an energy efficiency rating (EER) which is basically the coefficient of performance. But, I believe these are given in BTU per hour per watt. That is, they have dimensions instead of being dimensionless! To convert to a dimensionless number you must multiply by
$$\frac{1055\ \text{Joules}}{1\ \text{BTU}} \cdot \frac{1\ \text{Hour}}{3600\ \text{Seconds}} = 0.29\ \frac{\text{J Hour}}{\text{BTU s}} .$$
A typical EER you’ll find on an air conditioner is roughly 10, so the “real” $\gamma$ is about 3!

Note that all reversible engines operating between the same two temperature reservoirs must have the same efficiency. Similarly, all reversible refrigerators operating between the same two temperature reservoirs must have the same coefficients of performance. To see this, suppose that one has two reversible engines operating between the same two temperature reservoirs but they have different efficiencies. Run the high efficiency engine for some time, taking heat $Q_h$ from the high temperature reservoir, producing work $W$, and dumping heat $Q_l = Q_h - W$ in the low temperature reservoir. Now run the other engine in reverse (it’s reversible!) as a refrigerator removing heat $Q_l$ from the low temperature reservoir, so the low temperature reservoir is exactly the same as when we started. To do this, the refrigerator is supplied with work $W'$ and it dumps heat $Q_h' = Q_l + W'$ to the high temperature reservoir. Since this is the less efficient of the two reversible engines, $W' < W$. So the net effect of running the two engines is to extract heat $Q_h - Q_h' = W - W'$ from the high temperature reservoir and turn it completely into work. This is a violation of the second law of thermodynamics, and it does not happen.
Therefore all reversible engines operating between the same two reservoirs must have the same efficiency. Similar kinds of arguments can be used to show that all reversible refrigerators operating between the same two reservoirs must have the same coefficients of performance, that reversible engines are more efficient than irreversible engines, and that reversible refrigerators have higher coefficients of performance than irreversible refrigerators.
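As a quick numerical companion to the formulas above, here is a minimal sketch (not from the original notes) of the Carnot efficiency, the Carnot coefficient of performance, and the EER conversion. The function names and the sample reservoir temperatures are hypothetical; only the formulas and the 1055 J/BTU and 3600 s/hour factors come from the text.

```python
BTU_IN_JOULES = 1055.0
SECONDS_PER_HOUR = 3600.0


def carnot_efficiency(tau_h, tau_l):
    """eta_C = 1 - tau_l / tau_h for a reversible engine."""
    return 1.0 - tau_l / tau_h


def carnot_cop(tau_h, tau_l):
    """gamma_C = tau_l / (tau_h - tau_l) for a reversible refrigerator."""
    return tau_l / (tau_h - tau_l)


def eer_to_cop(eer):
    """Convert an EER in BTU per hour per watt to a dimensionless gamma."""
    return eer * BTU_IN_JOULES / SECONDS_PER_HOUR


# Hypothetical reservoirs: 600 K and 300 K for the engine,
# 310 K outdoors and 290 K indoors for the air conditioner.
eta = carnot_efficiency(600.0, 300.0)
gamma = carnot_cop(310.0, 290.0)
real_gamma = eer_to_cop(10.0)

print(f"Carnot efficiency: {eta:.2f}")
print(f"Carnot coefficient of performance: {gamma:.1f}")
print(f"EER 10 corresponds to gamma of about {real_gamma:.1f}")
```

Comparing `gamma` with `real_gamma` for plausible temperatures shows how far a typical air conditioner sits below the reversible limit.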
The Carnot Cycle

We’ve mentioned the Carnot efficiency and we’ve talked about heat engines, but how would one make a heat engine that (were it reversible) would actually have the Carnot efficiency? Simple, make an engine that uses the Carnot cycle. The Carnot cycle is most conveniently plotted on a temperature-entropy diagram. We plot the entropy of the “working substance” in an engine on the horizontal axis and the temperature of the working substance on the vertical axis. The working substance might be an ideal gas. It’s whatever component actually receives the heat and undergoes changes in its entropy and internal energy and performs work on the outside world.

There are four steps in a Carnot cycle. In step $ab$, the temperature is constant at $\tau_h$ while the entropy is increased from $\sigma_1$ to $\sigma_2$. This is the step in which the system is in contact with the high temperature reservoir and heat $Q_h = \tau_h(\sigma_2 - \sigma_1)$ is added to the system. If the system is an ideal gas, then it must expand to keep the temperature constant, so it does work on the outside world. In step $bc$, the temperature is lowered at constant entropy. No heat is exchanged, the gas expands and does more work on the outside world. In step $cd$, entropy is removed at constant temperature by placing the system in contact with the low temperature reservoir. The heat removed is $Q_l = \tau_l(\sigma_2 - \sigma_1)$. In this step, the gas is compressed in order to maintain constant temperature, so the outside world does work on the gas. In step $da$, the system is returned to the starting temperature, $\tau_h$, by isentropic compression, so more work is done on the system.

The hatched area within the path followed by the system is the total heat added to the system in one cycle of operation,
$$Q = Q_h - Q_l = \oint \tau\,d\sigma .$$
Since the system returns to its starting configuration (same $U$) this is also the work done in one cycle. Whatever the working substance, a Carnot cycle always looks like a rectangle on a $\tau\sigma$ diagram.
(It has two isothermal segments and two isentropic segments.)

Given a working substance we can plot the Carnot cycle on a $pV$ diagram. The figure shows the Carnot cycle for a monatomic ideal gas. The vertices $abcd$ in this diagram are the same as the vertices $abcd$ in the $\tau\sigma$ diagram. So paths $ab$ and $cd$ are the constant temperature paths with $\tau = \tau_h$ and $\tau = \tau_l$. Along these paths $pV = \text{const}$. Paths $bc$ and $da$ are the constant entropy paths with $\sigma = \sigma_2$ and $\sigma = \sigma_1$. Along these paths $pV^{5/3} = \text{const}$. The work done on the outside world in one cycle is the hatched area within the path,
$$W = \oint p\,dV .$$
The arrows on the paths indicate clockwise traversal. In this direction, the Carnot cycle is a Carnot engine producing work and waste heat from high temperature input heat. If the cycle is run in reverse (counterclockwise) one has a Carnot refrigerator using work to move heat to a higher temperature reservoir.

As an example of a non-Carnot cycle, suppose we have a monatomic ideal gas which is the working substance of a reversible engine and it follows the rectangular path on the $pV$ diagram shown in the figure. Along $da$, heat is added at constant volume, $V_1$. On $ab$, heat is added at constant pressure, $p_2$, and work is performed. On $bc$, heat is removed at constant volume, $V_2$, and on $cd$, heat is removed at constant pressure, $p_1$, while the outside world does some work on the system. As before, the total work done on the outside world is the area within the path and in this case,
$$W = (p_2 - p_1)(V_2 - V_1) .$$
The heat added is
$$\begin{aligned}
Q_{\rm in} &= \frac{3}{2} N(\tau_a - \tau_d) + \frac{5}{2} N(\tau_b - \tau_a) , \\
&= \frac{3}{2}(p_2 V_1 - p_1 V_1) + \frac{5}{2}(p_2 V_2 - p_2 V_1) , \\
&= \frac{5}{2} p_2 V_2 - p_2 V_1 - \frac{3}{2} p_1 V_1 .
\end{aligned}$$
The actual efficiency of this reversible engine is
$$\eta = \frac{W}{Q_{\rm in}} = \frac{p_2 V_2 - p_2 V_1 - p_1 V_2 + p_1 V_1}{\frac{5}{2} p_2 V_2 - p_2 V_1 - \frac{3}{2} p_1 V_1} ,$$
while the Carnot efficiency is
$$\eta_C = 1 - \frac{p_1 V_1}{p_2 V_2} .$$
These formulae aren’t all that illuminating so let’s consider a numerical example: suppose $p_2 = 2p_1$ and $V_2 = 2V_1$. Then the temperature at the hottest (upper right) vertex is 4 times the temperature at the coldest (lower left) vertex, so the Carnot efficiency is
$$\eta_C = \frac{3}{4} .$$
The actual efficiency is
$$\eta = \frac{2}{13} .$$
If you’re actually trying to build a heat engine that operates on this cycle, then as you improve the engine by reducing friction, heat losses, etc., you will approach an efficiency of 2/13 and this should be your goal, not the Carnot efficiency. If you want to approach the Carnot efficiency, you must redesign the cycle to be more like a Carnot cycle. In the cycle shown, the extreme temperatures are reached only at the upper right and lower left vertices. Most heat transfers are at less extreme temperatures and this is why the actual efficiency is so much less than the Carnot efficiency.
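The numerical example for the rectangular cycle can be reproduced with exact rational arithmetic. This is a minimal sketch, not part of the original notes; the function name `rectangular_cycle` is hypothetical and pressures and volumes are given in units of $p_1$ and $V_1$.

```python
from fractions import Fraction


def rectangular_cycle(p1, p2, V1, V2):
    """Efficiency of the rectangular pV cycle for a monatomic ideal gas.

    Arguments are integers in arbitrary units. Returns (eta, eta_carnot).
    """
    work = (p2 - p1) * (V2 - V1)
    # Heat in: constant volume leg da (C_V = 3N/2) plus constant pressure leg ab (C_p = 5N/2).
    heat_in = Fraction(3, 2) * (p2 * V1 - p1 * V1) + Fraction(5, 2) * (p2 * V2 - p2 * V1)
    eta = work / heat_in
    # Extreme temperatures are at the lower left (p1 V1) and upper right (p2 V2) vertices.
    eta_carnot = 1 - Fraction(p1 * V1, p2 * V2)
    return eta, eta_carnot


eta, eta_carnot = rectangular_cycle(1, 2, 1, 2)
print(eta, eta_carnot)  # prints: 2/13 3/4
```

Trying other ratios, for example `rectangular_cycle(1, 3, 1, 3)`, shows that stretching the rectangle raises the Carnot limit faster than it raises the efficiency of this cycle.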
Other Thermodynamic Functions

We have concentrated on the internal energy, $U$, in the preceding discussion. If we consider a constant temperature process, then the work done on the system is the change in the Helmholtz free energy. This is because at constant temperature, $d(\tau\sigma) = \tau\,d\sigma$, so
$$\bar{d}W = dU - \tau\,d\sigma = dU - d(\tau\sigma) = dF \qquad \text{(constant temperature)} .$$
Many processes occur at constant pressure, such as all processes open to the atmosphere. If a process occurs at constant pressure, then we are letting the system adjust its volume “as necessary,” so we cannot really use or supply any $p\,dV$ work performed by or on the system. The $p\,dV$ work “just happens.” If the system can perform work in other ways, then we can divide the work into the $p\,dV$ work and other work,
$$\bar{d}W = \bar{d}W_{\rm other} + \bar{d}W_{pV} ,$$
and
$$\begin{aligned}
\bar{d}W_{\rm other} &= dU - \bar{d}W_{pV} - \bar{d}Q , \\
&= dU + p\,dV - \bar{d}Q , \\
&= dU + d(pV) - \bar{d}Q , \\
&= dH - \bar{d}Q \qquad \text{(constant pressure)} ,
\end{aligned}$$
where $H = U + pV$ is called the enthalpy. In any constant pressure process the heat added plus the non-$p\,dV$ work done is the change in enthalpy. In particular, if there is no non-$p\,dV$ work done, the change in enthalpy is just the heat added. Note that $\bar{d}W_{\rm other}$ is what K&K call the “effective” work in a constant pressure process.
In the event that we have a reversible process that occurs at constant temperature and constant pressure, the Gibbs free energy is useful. This is defined as
$$G = F + pV = H - \tau\sigma = U + pV - \tau\sigma .$$
It should be clear that
$$\bar{d}W_{\rm other} = dG \qquad \text{(constant temperature and pressure)} .$$
Also, a system which is allowed to come to equilibrium at constant temperature and pressure will come to equilibrium at a minimum of the Gibbs free energy.

As an example of the use of the Gibbs free energy, consider a system (cell) of two noninteracting electrodes in an electrolyte consisting of sulfuric acid dissolved in water. The sulfuric acid becomes two hydrogen ions and one sulfate ion,
$$\mathrm{H_2SO_4 \leftrightarrow 2H^+ + SO_4^{--}} .$$
When current is forced through the system in the direction to supply electrons to the cathode, the reaction at the cathode is
$$\mathrm{2H^+ + 2e^- \rightarrow H_2} ,$$
and the reaction at the anode is
$$\mathrm{SO_4^{--} + H_2O \rightarrow H_2SO_4 + \tfrac{1}{2}O_2 + 2e^-} ,$$
and the net reaction is
$$\mathrm{H_2O \rightarrow H_2 + \tfrac{1}{2}O_2} .$$
If the current is passed through the cell slowly and the cell is open to the atmosphere and kept at constant temperature, then the process occurs at constant $\tau$ and $p$. The “other” work is electrical work,
$$W_{\rm other} = G(\mathrm{H_2O}) - G(\mathrm{H_2}) - \tfrac{1}{2} G(\mathrm{O_2}) ,$$
where the Gibbs free energies can be looked up in tables and it is found that the difference is $\Delta G = -237{,}000$ J mol$^{-1}$. The other work done is electrical work equal to the charge times the voltage. Since we have two electrons per molecule,
$$W_{\rm other} = -2 e N_0 V_0 ,$$
or
$$V_0 = -\frac{\Delta G}{2 N_0 e} = 1.229\ \text{Volts} ,$$
where $N_0$ is Avogadro’s number and $e$ is the charge on an electron. $V_0$ is the voltage that is established with no current flowing. At higher voltage, current flows and the cell liberates hydrogen and oxygen (electrolysis). If these gases are kept in the cell and allowed to run the reactions in reverse, one obtains a fuel cell in which the reaction of hydrogen and oxygen to form water generates electricity “directly.”
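The arithmetic behind the 1.229 Volt figure can be sketched as follows. This is not from the original notes: the function name is hypothetical, and the value $\Delta G \approx -237.13$ kJ mol$^{-1}$ is an assumption, namely the commonly tabulated Gibbs free energy of formation of liquid water, chosen because it reproduces the quoted voltage.

```python
AVOGADRO = 6.02214e23            # particles per mole
ELEMENTARY_CHARGE = 1.602177e-19  # coulombs


def cell_voltage(delta_g_joules_per_mole, electrons_per_reaction=2):
    """V0 = -Delta G / (n N0 e): zero-current voltage of the cell."""
    charge_per_mole = electrons_per_reaction * AVOGADRO * ELEMENTARY_CHARGE
    return -delta_g_joules_per_mole / charge_per_mole


# Assumed tabulated value for H2 + (1/2) O2 -> H2O (liquid), in J/mol.
DELTA_G_WATER = -237.13e3

v0 = cell_voltage(DELTA_G_WATER)
print(f"V0 is about {v0:.3f} volts")
```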
Physics 301
Homework No. 6
Due 5-Nov-2002
1. This is similar to K&K, chapter 7, problem 2. The goal is to work out the Fermi energy and the ground state energy of a completely degenerate relativistic Fermi gas. The calculation should proceed just as the calculation for the non-relativistic Fermi gas in the lecture notes except that one takes the energy to be given by $\epsilon = pc$ rather than $\epsilon = p^2/2m$. Of course, the very low energy states may not actually be relativistic, but if the Fermi energy turns out to be many times the rest mass energy, then almost all the particles are relativistic and we don’t make an appreciable error if we assume that all of them are relativistic. You should be able to show that the Fermi and ground state energies are
$$\epsilon_F = \pi\hbar c\left(\frac{3n}{\pi}\right)^{1/3}$$
and
$$U_0 = \frac{3}{4} N \epsilon_F .$$
With these results modify our treatment of the electron and proton concentration in a neutron star. As always, make suitable approximations. What do you get for the relative fraction of electrons and protons?

2. K&K, chapter 7, problem 5. Note that $^3$He has a complicated phase diagram at low temperatures and can even become a superfluid. The 1996 Physics Nobel Prize was awarded to Lee, Osheroff, and Richardson for their discovery of superfluidity in $^3$He.

3. K&K, chapter 7, problem 6.

4. K&K, chapter 7, problem 8. Note that Figure 7.19 shows the calculated heat capacity curve above $\tau_E$ as well as below $\tau_E$ where you are asked to calculate it. How would you calculate the energy, heat capacity, and entropy above $\tau_E$? (Note that you’re not being asked to do the calculation, just outline the calculation!)

5. K&K, chapter 7, problem 12.

6. In the Physics Today article by Collins, August, 1995, vol. 48, no. 8, p. 17, it is stated that a Bose-Einstein condensate of about 2000 $^{87}$Rb atoms forms at a temperature of $170 \times 10^{-9}$ K. Use our theory to estimate the concentration of rubidium atoms in the sample. What is the density? Assuming a spherical sample, what is the radius?

Note that K&K, chapter 7, problem 10 hints at an upper limit for the mass of white dwarf stars. Beyond this limit, about $1.4\,M_\odot$, gravity overwhelms the degeneracy pressure of the electrons and the white dwarf collapses (probably to explode as a supernova of “type Ia”). Similarly, there is a (more uncertain) upper mass limit to neutron stars beyond which gravity overwhelms the degeneracy pressure of the neutrons. What do you suppose happens when a neutron star goes above the upper mass limit?
Physics 301
11-Nov-2002
Physics 301 Problem Set 6 Solutions

Problem 1. As for the non-relativistic Fermi gas, we have $dn(x, p) = 2\,d^3x\,d^3p/(2\pi\hbar)^3$ which gives $dn(p) = (V/\pi^2\hbar^3)\,p^2\,dp$. Because of the dispersion relation $E = pc$, we have $dn(E) = (V/\pi^2\hbar^3c^3)\,E^2\,dE \equiv D(E)\,dE$. In the ground state, all the states below the Fermi energy are filled and the rest are empty. We can now calculate the total number of particles and the total energy.
$$N = \int_0^{E_F} D(E)\,dE = \frac{V}{\pi^2\hbar^3c^3}\frac{E_F^3}{3}
\;\Rightarrow\; E_F = (3\pi^2\hbar^3c^3 n)^{1/3} = \pi\hbar c\left(\frac{3n}{\pi}\right)^{1/3} . \tag{1}$$
$$U_0 = \int_0^{E_F} E\,D(E)\,dE = \frac{V}{\pi^2\hbar^3c^3}\frac{E_F^4}{4}
= \frac{V}{\pi^2\hbar^3c^3}\frac{E_F^3}{3}\cdot\frac{3E_F}{4} = \frac{3}{4} N E_F . \tag{2}$$
In the neutron star treatment, we now have a new value of $E_{F,e} = \pi\hbar c\,(3n_e/\pi)^{1/3} = \pi\hbar c\,n^{1/3}(3x/\pi)^{1/3}$ where $x = n_e/n$, the ratio of the electron concentration to the nucleon concentration. This changes the equilibrium equation to (cf. lecture 17):
$$(1 - x)^{2/3} = \frac{m_n}{m_p}\,x^{2/3} - \frac{E}{E_{F,0}}\left(\frac{n_0}{n}\right)^{2/3} + \frac{2 c\,m_n\,x^{1/3}}{\hbar\,(3\pi^2 n)^{1/3}} . \tag{3}$$
Assuming $n \approx n_0$ ($\Rightarrow x \ll 1$) as before, we get
$$(1 - x)^{2/3} \approx x^{2/3} + 4.5\,x^{1/3} - 0.0091
\;\Rightarrow\; x \approx \frac{1}{(4.5)^3} \approx 0.01 . \tag{4}$$
We see that the relative concentration is about 1000 times more than in the non-relativistic case, but is still small compared to 1.

Problem 2. (a) The density of liquid $^3$He is $\rho = 0.081$ g/cc. The mass of one $^3$He atom is $m = 3/(6.023 \times 10^{23})$ g, and the number density is $n = 0.081\,(6.023 \times 10^{23}/3)$/cc $= 1.6 \times 10^{22}$/cc. The Fermi sphere parameters are:
$$v_F = \frac{\hbar}{m}(3\pi^2 n)^{1/3} \approx 1.65 \times 10^4\ \text{cm/s}, \qquad
E_F = \frac{1}{2} m v_F^2 \approx 6.78 \times 10^{-16}\ \text{erg}, \qquad
T_F = \frac{E_F}{k_B} \approx 4.9\ \text{K}. \tag{5}$$

(b) The specific heat is $C_v = (\pi^2/2T_F)\,N k_B T \approx 1.00\,N k_B T$ (with $T$ in kelvin). This differs from the experimentally observed value of $2.89\,N k_B T$ by a factor of about 3, but we should remember that our calculation is for a non-interacting Fermi gas.
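The numbers in Problem 2 can be re-derived with a few lines of Python. This is a minimal sketch, not part of the original solutions; the function name and the CGS constant values are assumptions chosen for illustration, so the results agree with the quoted ones only to about two significant figures.

```python
import math

HBAR = 1.0546e-27     # erg s
K_B = 1.3807e-16      # erg / K
AVOGADRO = 6.022e23   # per mole


def fermi_parameters(density_g_per_cc, molar_mass_g):
    """Return (n, v_F, E_F, T_F) in CGS units for a free Fermi gas of spin-1/2 atoms."""
    m = molar_mass_g / AVOGADRO                          # mass of one atom, g
    n = density_g_per_cc / m                             # number density, cm^-3
    v_f = (HBAR / m) * (3.0 * math.pi**2 * n) ** (1.0 / 3.0)
    e_f = 0.5 * m * v_f**2
    t_f = e_f / K_B
    return n, v_f, e_f, t_f


n, v_f, e_f, t_f = fermi_parameters(0.081, 3.0)
heat_capacity_coefficient = math.pi**2 / (2.0 * t_f)     # C_v / (N k_B T), per kelvin

print(f"n   ~ {n:.2e} per cc")
print(f"v_F ~ {v_f:.2e} cm/s")
print(f"E_F ~ {e_f:.2e} erg")
print(f"T_F ~ {t_f:.1f} K")
print(f"C_v ~ {heat_capacity_coefficient:.2f} N k_B T")
```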
Problem 3. (a) Dimensionally, the order of magnitude of the self energy is $V \approx -GM^2/R$. If we know the density distribution, we can divide the sphere into shells and find the exact gravitational potential energy.

(b) The kinetic energy of electrons in their ground state is the total ground state energy of a free Fermi gas, which is $K \approx N_e E_F \approx (\hbar^2/2M_e)\,N_e\,n^{2/3}$ where $n$ is the number density of electrons, $n = N_e/V$, which gives $K \approx \hbar^2 N_e^{5/3}/(M_e V^{2/3}) \approx \hbar^2 N_e^{5/3}/(M_e R^2)$. Assuming the star is neutral, the number of electrons is equal to the number of protons, $N_e = N_H = M/M_H$, where $M$ is the mass of the star in which we have ignored the contributions of the electrons. This gives $K \approx (\hbar^2/M_e R^2)(M/M_H)^{5/3}$.

(c) By the virial theorem, $K \approx -V$ which gives $GM^2/R \approx (\hbar^2/M_e R^2)(M/M_H)^{5/3} \Rightarrow M^{1/3} R \approx \hbar^2/(M_e G M_H^{5/3})$, which is a constant $\approx 10^{20}$ g$^{1/3}$ cm.

(d) If the mass of the white dwarf is equal to that of the sun, $M \approx 2 \times 10^{33}$ g, the radius is $R \approx 10^{20}/M^{1/3} \approx 8 \times 10^8$ cm.

(e) For a neutron star, the calculations in parts (b) and (c) go through by replacing $M_e$ by $M_n$. This gives $M^{1/3} R \approx (M_e/M_n)\,10^{20} \approx 10^{20}/2000 \approx 10^{17}$ g$^{1/3}$ cm. The radius calculation becomes $R \approx 8 \times 10^8/2000 \approx 5 \times 10^5$ cm $= 5$ km.
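The order-of-magnitude constant in Problem 3 is easy to evaluate directly. The sketch below is not from the original solutions; the function names and constant values (CGS) are assumptions for illustration. Since the derivation drops factors of order unity, only the powers of ten are meaningful.

```python
HBAR = 1.0546e-27   # erg s
G = 6.674e-8        # cm^3 g^-1 s^-2
M_E = 9.109e-28     # g, electron mass
M_H = 1.673e-24     # g, hydrogen (proton) mass
M_N = 1.675e-24     # g, neutron mass
M_SUN = 1.989e33    # g


def mass_radius_constant(m_fermion):
    """M^(1/3) R ~ hbar^2 / (G m_fermion M_H^(5/3)), in g^(1/3) cm."""
    return HBAR**2 / (G * m_fermion * M_H ** (5.0 / 3.0))


def radius(mass, m_fermion):
    """Order-of-magnitude radius of a degenerate star of the given mass, in cm."""
    return mass_radius_constant(m_fermion) / mass ** (1.0 / 3.0)


wd_constant = mass_radius_constant(M_E)
r_white_dwarf = radius(M_SUN, M_E)      # electron degeneracy pressure
r_neutron_star = radius(M_SUN, M_N)     # neutron degeneracy pressure

print(f"M^(1/3) R ~ {wd_constant:.1e} g^(1/3) cm")
print(f"white dwarf radius  ~ {r_white_dwarf:.1e} cm")
print(f"neutron star radius ~ {r_neutron_star:.1e} cm")
```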
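Problem 4. Let us assume that the ground state energy of the boson gas is zero. We know that for temperatures $\tau < \tau_E$, the chemical potential is approximately zero. The energy of the gas is given by
$$\begin{aligned}
U &= E_0 + \int E\,f(E)\,D(E)\,dE \\
&= \int_0^\infty dE\,\frac{E}{e^{E/\tau} - 1}\,\frac{V}{4\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} E^{1/2} \\
&= \frac{V}{4\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} \tau^{5/2} \int_0^\infty \frac{x^{3/2}}{e^x - 1}\,dx \\
&\equiv \frac{V}{4\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} I\,\tau^{5/2} .
\end{aligned} \tag{6}$$
The specific heat is
$$C_v = \frac{\partial U}{\partial\tau} = \frac{5V}{8\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} I\,\tau^{3/2} , \tag{7}$$
and the entropy is
$$\begin{aligned}
\sigma &= \int \frac{dU}{\tau} = \int_0^\tau \frac{C_v\,d\tau}{\tau} \\
&= \frac{5V}{8\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} I \int_0^\tau \tau^{1/2}\,d\tau \\
&= \frac{5V}{12\pi^2}\left(\frac{2M}{\hbar^2}\right)^{3/2} I\,\tau^{3/2} .
\end{aligned} \tag{8}$$
For temperatures above $\tau_E$, the chemical potential is not zero, and it is determined by the number of particles. In this range, we can write down an expression involving $\mu$ for the number of particles: $N = \int_0^\infty f(E)\,D(E)\,dE$ (here, $f(E)$ is a function of $\mu$ and $\tau$). We can numerically find the value of $\mu$ for the given temperature and then plug it back into the expression for energy.

Problem 5.
$$\langle N\rangle = \frac{1}{e^{(E-\mu)/\tau} - 1}$$
$$\begin{aligned}
\langle(\Delta N)^2\rangle &= \tau\,\frac{\partial\langle N\rangle}{\partial\mu} \\
&= -\tau\,\langle N\rangle^2\,e^{(E-\mu)/\tau}\left(-\frac{1}{\tau}\right) \\
&= \langle N\rangle^2\left(\frac{1}{\langle N\rangle} + 1\right) \\
&= \langle N\rangle\left(\langle N\rangle + 1\right) .
\end{aligned} \tag{9}$$

Problem 6. We know that
$$\tau_E = \frac{2\pi\hbar^2}{M}\left(\frac{N}{2.612\,V}\right)^{2/3} , \tag{10}$$
which can be written as $n = N/V = 2.612\,(M\tau_E/2\pi\hbar^2)^{3/2}$. For the numbers in this problem, $\tau_E$ corresponding to $170 \times 10^{-9}$ K and $M = 87/(6.023 \times 10^{23})$ g, the concentration is $n \approx 2.8 \times 10^{13}$/cc. The mass density is $Mn \approx 4 \times 10^{-9}$ g/cc. Assuming it is a spherical sample, $V = 4\pi R^3/3$ which means $R = (3N/4\pi n)^{1/3} = 2.6 \times 10^{-4}$ cm.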
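The estimate in Problem 6 can be reproduced numerically. This is a minimal sketch, not part of the original solutions; the function name and the constant values (CGS) are assumptions for illustration, and the temperature is converted to energy units with $\tau = k_B T$.

```python
import math

HBAR = 1.0546e-27     # erg s
K_B = 1.3807e-16      # erg / K
AVOGADRO = 6.022e23   # per mole


def bec_concentration(temperature_kelvin, molar_mass_g):
    """n = 2.612 (M tau_E / 2 pi hbar^2)^(3/2), in cm^-3, with tau_E = k_B T."""
    atom_mass = molar_mass_g / AVOGADRO
    tau = K_B * temperature_kelvin
    return 2.612 * (atom_mass * tau / (2.0 * math.pi * HBAR**2)) ** 1.5


N_ATOMS = 2000
n = bec_concentration(170e-9, 87.0)
mass_density = n * 87.0 / AVOGADRO                               # g per cc
sample_radius = (3.0 * N_ATOMS / (4.0 * math.pi * n)) ** (1.0 / 3.0)  # cm

print(f"n   ~ {n:.1e} per cc")
print(f"rho ~ {mass_density:.1e} g/cc")
print(f"R   ~ {sample_radius:.1e} cm")
```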
Week 7. Gibbs Free Energy, Chemical Equilibrium, Saha Equation, Phase Transitions
Physics 301
4-Nov-2002 20-1
Reading K&K chapter 9 and start on chapter 10. Also, some of the material we’ll be discussing this week is taken from Mandl, chapter 11.
Gibbs Free Energy

As we discussed last time, the Gibbs free energy is obtained from the energy via two Legendre transformations, which change the independent variables from entropy to temperature and from volume to pressure,

$$dU(\sigma, V, N) = +\tau\, d\sigma - p\, dV + \mu\, dN$$

$$
\begin{array}{ccc}
F = U - \tau\sigma & \qquad & H = U + pV \\
dF(\tau, V, N) = -\sigma\, d\tau - p\, dV + \mu\, dN & & dH(\sigma, p, N) = +\tau\, d\sigma + V\, dp + \mu\, dN \\
\downarrow & & \downarrow \\
G = F + pV & & G = H - \tau\sigma
\end{array}
$$

$$G = U - \tau\sigma + pV , \qquad dG(\tau, p, N) = -\sigma\, d\tau + V\, dp + \mu\, dN .$$

There are a couple of general points to make here. First of all, if the system has other ways of storing energy, those ways should be included in all these thermodynamic functions. For example, if the system is magnetic and is in a magnetic field, then there will have to be an integral of the magnetization (magnetic dipole moment per unit volume) times the magnetic field times the volume element to account for the magnetic energy. The second point is that if the system contains several different kinds of particles, then $\mu\, dN$ is replaced by $\sum_i \mu_i\, dN_i$, where the index $i$ runs over the particle types. (We will be doing this shortly!) The above ways of writing the energy, the Helmholtz free energy $F$, the enthalpy $H$, and the Gibbs free energy $G$ are really just shorthand for what might actually have to be included.

As remarked earlier, the Gibbs free energy is particularly useful for situations in which the system is in contact with a thermal reservoir which keeps the temperature constant, $d\tau = 0$, and a pressure reservoir which keeps the pressure constant, $dp = 0$. Then if the number of particles doesn't change, the Gibbs free energy is an extremum, $dG = 0$, and in fact, it must be a minimum (because the entropy enters with a minus sign!).

Another thing to note is that $\tau$, $p$, and $\mu$ are intensive parameters while $\sigma$, $V$, $N$, and $G$ itself are extensive parameters. This means that for fixed temperature and pressure, $G$ must be proportional to the number of particles. Or, $G = N f(\tau, p)$, where $f$ is some function of the temperature and pressure.

If we differentiate with respect to $N$, we have

$$\left(\frac{\partial G}{\partial N}\right)_{\tau,p} = f(\tau, p) .$$
If we compare this with the earlier expression for dG, we see that f(τ, p) = µ(τ, p). In other words, the chemical potential of a single component system depends only on the temperature and pressure. Furthermore, G(τ, p, N) = Nµ(τ, p) .
What happens when there is more than one kind of particle in the system? In this case, we can show that

$$G(\tau, p, N_1, N_2, \ldots) = \sum_i N_i \mu_i .$$

We must have, for any $\lambda$,

$$G(\tau, p, \lambda N_1, \lambda N_2, \ldots) = \lambda\, G(\tau, p, N_1, N_2, \ldots) ,$$

as this just expresses the fact that $G$ and the $N_i$ are extensive parameters. Now, set $x_i = \lambda N_i$ and differentiate with respect to $\lambda$,

$$\sum_i \frac{\partial G}{\partial x_i}\frac{\partial x_i}{\partial \lambda} = G(\tau, p, N_1, N_2, \ldots) .$$

Note that $\partial x_i/\partial\lambda = N_i$, and when $\lambda \to 1$, then $x_i \to N_i$ and $\partial G/\partial N_i = \mu_i$, so

$$G(\tau, p, N_1, N_2, \ldots) = \sum_i N_i \mu_i ,$$
but it is not necessarily true that $\mu_i$ depends only on $\tau$ and $p$.

As an example, we can write down the Gibbs free energy for a classical ideal gas. We worked out the Helmholtz free energy in lecture 14. For a single component ideal gas, it is

$$F = N\tau\left[\log\frac{n}{n_Q Z_{\rm int}} - 1\right] ,$$

so

$$
\begin{aligned}
G &= N\tau\left[\log\frac{n}{n_Q Z_{\rm int}} - 1\right] + pV , \\
  &= N\tau\left[\log\frac{N/V}{n_Q Z_{\rm int}} - 1\right] + N\tau , \\
  &= N\tau\,\log\frac{p}{\tau\, n_Q Z_{\rm int}} ,
\end{aligned}
$$

where we used the ideal gas law to replace $pV$ with $N\tau$ and $N/V$ with $p/\tau$. Of course, we could also have obtained this result with our expression for $\mu$ that we worked out in lecture 14! Note that, as advertised, $\mu$ is a function only of $p$ and $\tau$.
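As a numerical illustration of the single component result, the sketch below evaluates $G$ both as $F + pV$ and directly as $N\tau\log(p/\tau n_Q Z_{\rm int})$ for a helium-like gas at 300 K and 1 atm. The form $n_Q = (M\tau/2\pi\hbar^2)^{3/2}$ is assumed to match the lecture 14 definition, $Z_{\rm int} = 1$ is assumed for a structureless atom, and the constants and the choice of gas are illustrative rather than taken from the notes.

```python
import math

# Standard CGS constants (assumptions, not quoted in the notes).
HBAR = 1.0546e-27   # erg s
K_B = 1.3807e-16    # erg / K
AMU = 1.6605e-24    # g


def quantum_concentration(mass_g, tau):
    """n_Q = (M tau / 2 pi hbar^2)^(3/2); assumed to match the lecture 14 definition."""
    return (mass_g * tau / (2.0 * math.pi * HBAR**2)) ** 1.5


def gibbs_via_helmholtz(n_particles, volume, tau, n_q, z_int=1.0):
    """G = F + pV with F = N tau [log(n / (n_Q Z_int)) - 1] and pV = N tau."""
    n = n_particles / volume
    helmholtz = n_particles * tau * (math.log(n / (n_q * z_int)) - 1.0)
    return helmholtz + n_particles * tau


def gibbs_direct(n_particles, pressure, tau, n_q, z_int=1.0):
    """G = N tau log(p / (tau n_Q Z_int))."""
    return n_particles * tau * math.log(pressure / (tau * n_q * z_int))


tau = K_B * 300.0                        # erg
pressure = 1.01325e6                     # dyn / cm^2 (1 atm)
n_q = quantum_concentration(4.0 * AMU, tau)
n_particles = 1.0e20
volume = n_particles * tau / pressure    # ideal gas law, cm^3

g1 = gibbs_via_helmholtz(n_particles, volume, tau, n_q)
g2 = gibbs_direct(n_particles, pressure, tau, n_q)
mu_over_tau = g2 / (n_particles * tau)
print(f"G via F: {g1:.6e} erg, G direct: {g2:.6e} erg, mu/tau = {mu_over_tau:.2f}")
```

The two routes agree to rounding error, and $\mu/\tau$ comes out large and negative, as expected for a dilute classical gas with $n \ll n_Q$.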
If we have a multicomponent ideal gas, the situation is slightly more complicated. Starting from the Helmholtz free energy again, we have

$$
\begin{aligned}
G &= \sum_i N_i\tau\left[\log\frac{n_i}{n_{i,Q} Z_{i,\rm int}} - 1\right] + pV , \\
  &= \sum_i N_i\tau\,\log\frac{N_i/V}{n_{i,Q} Z_{i,\rm int}} - \sum_i N_i\tau + pV , \\
  &= \sum_i N_i\tau\,\log\frac{(N_i/N)(N/V)}{n_{i,Q} Z_{i,\rm int}} - \sum_i N_i\tau + pV , \\
  &= \sum_i N_i\tau\,\log\frac{x_i\, p}{\tau\, n_{i,Q} Z_{i,\rm int}} - N\tau + pV , \\
  &= \sum_i N_i\tau\,\log\frac{x_i\, p}{\tau\, n_{i,Q} Z_{i,\rm int}} ,
\end{aligned}
$$

where $x_i$ is the fractional concentration of molecules of type $i$, $x_i = N_i/N = n_i/n$. Also, $x_i p = p_i$, the partial pressure of molecules of type $i$. The quantum concentrations are given a molecular subscript since they depend on the masses of the molecules as well as the temperature. The Gibbs free energy is of the form $\sum_i N_i\mu_i$, and the chemical potentials depend on pressure, temperature, and the intensive parameters $x_i$.

The derivation of $G$ in the above paragraph hides an important issue in the internal partition functions, $Z_{i,\rm int}$. This is the fact that all energies in the system must be measured from the same zero point. In particular, if we have molecules that can undergo chemical reactions (which is where we're headed), then we might have a reaction like

$$A + B \leftrightarrow C .$$

If $C$ is stable, then the reaction of $A$ and $B$ to produce $C$ gives up some binding energy $\epsilon_b$, so the ground state energy for $Z_{C,\rm int}$ is lower than zero by $\epsilon_b$. In other words, the internal energy states of the molecules are

$$
\begin{array}{llllll}
A: & 0, & \epsilon_{A,1}, & \epsilon_{A,2}, & \epsilon_{A,3}, & \ldots , \\
B: & 0, & \epsilon_{B,1}, & \epsilon_{B,2}, & \epsilon_{B,3}, & \ldots , \\
C: & -\epsilon_b, & -\epsilon_b + \epsilon_{C,1}, & -\epsilon_b + \epsilon_{C,2}, & -\epsilon_b + \epsilon_{C,3}, & \ldots .
\end{array}
$$

When we compute the internal partition function for molecule $C$ we need to include $-\epsilon_b$ as part of the energy in every term in the sum. This extra energy will factor out and we will have

$$Z_{C,\rm int} = e^{+\epsilon_b/\tau}\, Z_{0,C,\rm int} ,$$

where $Z_{0,C,\rm int}$ is the usual partition function with the ground state at 0. Since the logarithm of the partition function occurs in the chemical potential, the net effect is to add $-\epsilon_b$ to $\mu_C$
and $-N_C\,\epsilon_b$ to the Gibbs free energy. The message is that energies must be measured on a common scale. We will sometimes assume the internal partition functions are calculated with the internal ground state energy set to 0 and explicitly add any binding energies to the chemical potentials. Other times, we will assume that all binding energies are incorporated into the internal partition functions!
Chemical Equilibrium

Suppose we have a chemical reaction which takes place at constant temperature and pressure. Then we know that the Gibbs free energy is a minimum. But in addition to this condition, we also have the constraint imposed by the reaction. In particular, we can write any chemical reaction as

$$\nu_1 A_1 + \nu_2 A_2 + \nu_3 A_3 + \cdots + \nu_l A_l = 0 ,$$

where $A_i$ stands for a particular compound and $\nu_i$ denotes the relative amount of that compound which occurs in the reaction. For example, the formation of water from hydrogen and oxygen is typically written

$$2\,\mathrm{H_2} + \mathrm{O_2} \to 2\,\mathrm{H_2O} .$$

This becomes

$$2\,\mathrm{H_2} + \mathrm{O_2} - 2\,\mathrm{H_2O} = 0 ,$$

with

$$A_1 = \mathrm{H_2} , \quad A_2 = \mathrm{O_2} , \quad A_3 = \mathrm{H_2O} , \quad \nu_1 = 2 , \quad \nu_2 = 1 , \quad \nu_3 = -2 .$$

If the reaction occurs, the change in numbers of molecules is described by $\nu_i$,

$$dN_i = \nu_i\, dR ,$$

where $dR$ is the number of times the reaction occurs in the direction that makes the left hand side. Then the change in the Gibbs free energy is

$$dG = \sum_i \mu_i\, dN_i = \sum_i \mu_i \nu_i\, dR = \left(\sum_i \mu_i \nu_i\right) dR .$$

This must be an extremum, which means that there is no change in $G$ if the reaction or the inverse reaction occurs ($dR = \pm 1$), so

$$\sum_i \mu_i \nu_i = 0 ,$$
when a reaction occurs at constant temperature and pressure.

Note 1: the expression we've just derived also holds if the temperature and volume are held fixed. This is most easily seen by noting that when the temperature and pressure are held fixed, the reaction proceeds until $\sum_i \mu_i \nu_i = 0$, at which point the system has some particular volume determined by the total amount of reactants, the pressure and temperature. If we start with the temperature fixed and some particular volume, the reaction proceeds to equilibrium, at which point the system has some pressure. Now imagine that one had started with this pressure, and allowed the reaction to proceed at constant pressure. Assuming there are not multiple minima in $G$, the reaction will wind up at the same place and have the same volume!

Note 2: the expression we've just derived holds for a single chemical reaction. If there are several reactions going on but the net reaction can be reduced to a single reaction, the above holds. For example, if the reaction is catalyzed by another molecule via an intermediate step, the reaction rate might differ with and without the catalyst, but the equilibrium will be the same.

Note 3: the $\nu_i$ are fixed. It is the chemical potentials which adjust to satisfy the equilibrium condition. Other constraints may need to be satisfied as well. For example, in the water reaction above, the equilibrium condition provides one equation for the three unknown chemical potentials. Two other conditions might be the total amount of hydrogen and the total amount of oxygen.

Note 4: if there is more than one reaction, there may be several equations similar to $\sum_i \mu_i \nu_i = 0$ which must be satisfied at equilibrium.

As an example, consider

$$\mathrm{N_2} + \mathrm{O_2} \leftrightarrow 2\,\mathrm{NO} .$$

The equilibrium condition, $\sum_i \mu_i \nu_i = 0$, can be written

$$\mu_{\mathrm{N_2}} + \mu_{\mathrm{O_2}} = 2\,\mu_{\mathrm{NO}} .$$

In other words, we just substitute the appropriate chemical potentials for the chemicals in the reaction equation. If we also have

$$2\,\mathrm{N} \leftrightarrow \mathrm{N_2} , \qquad 2\,\mathrm{O} \leftrightarrow \mathrm{O_2} , \qquad \mathrm{N} + \mathrm{O} \leftrightarrow \mathrm{NO} ,$$

then we also have the additional relations among the chemical potentials (at equilibrium),

$$2\,\mu_{\mathrm{N}} = \mu_{\mathrm{N_2}} , \qquad 2\,\mu_{\mathrm{O}} = \mu_{\mathrm{O_2}} , \qquad \mu_{\mathrm{N}} + \mu_{\mathrm{O}} = \mu_{\mathrm{NO}} .$$

Note that there are five kinds of molecules. There must be a total amount of nitrogen and a total amount of oxygen (two conditions), and there are four conditions of chemical equilibrium. There are six conditions for five chemical potentials. However, the four
equilibrium conditions are not all independent. For example, the last one can be derived from the previous three.
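One way to see how many of the equilibrium conditions are independent is to write each reaction as a row of coefficients $\nu_i$ and compute the rank of the resulting matrix. The sketch below does this with exact rational arithmetic. The species ordering and the helper name `rank` are choices made for this illustration only.

```python
from fractions import Fraction

# Columns: N2, O2, NO, N, O. Each row holds the nu_i of one reaction,
# positive on the left hand side and negative on the right hand side.
REACTIONS = [
    [1, 1, -2, 0, 0],    # N2 + O2 <-> 2 NO
    [-1, 0, 0, 2, 0],    # 2 N <-> N2
    [0, -1, 0, 0, 2],    # 2 O <-> O2
    [0, 0, -1, 1, 1],    # N + O <-> NO
]


def rank(rows):
    """Rank of an integer matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


independent = rank(REACTIONS)
print(f"{independent} independent equilibrium conditions out of {len(REACTIONS)}")
```

The rank is 3, so the three independent equilibrium conditions together with the two conservation conditions give five equations for the five chemical potentials. Indeed, the first three rows sum to twice the fourth.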
The Law of Mass Action

We've seen that for chemical equilibrium, the chemical potentials adjust to satisfy the equilibrium condition, $\sum_i \mu_i \nu_i = 0$. Among other things, the chemical potentials depend on the concentrations of the molecules. To bring this out, we'll consider the case that all molecules participating in a reaction can be treated as an ideal classical gas. (This, of course, works for low density gases, but also for low concentration solutes.) Then

$$\mu_i = \tau \log\frac{n_i}{n_{i,Q} Z_{i,\rm int}} = \tau\log n_i - \tau\log\left(n_{i,Q} Z_{i,\rm int}\right) = \tau\log n_i - \tau\log c_i ,$$

where $c_i = n_{i,Q} Z_{i,\rm int}$, and $c_i$ depends on the characteristics of molecule $i$ through its mass in the quantum concentration and its internal states in the partition function, but otherwise $c_i$ depends only on the temperature. Note that in this expression, we're assuming that any binding energies are included in the internal partition function. The equilibrium condition can be written

$$
\begin{aligned}
\sum_i \nu_i \log n_i &= \sum_i \nu_i \log c_i , \\
\sum_i \log n_i^{\nu_i} &= \sum_i \log c_i^{\nu_i} , \\
\log \prod_i n_i^{\nu_i} &= \log \prod_i c_i^{\nu_i} , \\
\prod_i n_i^{\nu_i} &= \prod_i c_i^{\nu_i} , \\
\prod_i n_i^{\nu_i} &= K(\tau) .
\end{aligned}
$$

The last line is known as the law of mass action. The quantity $K(\tau)$ is known as the equilibrium constant, although it is not a constant: it depends on temperature. In terms of molecular properties, it's given by

$$K(\tau) = \prod_i c_i^{\nu_i} = \prod_i \left(n_{i,Q} Z_{i,\rm int}\right)^{\nu_i} .$$

Note that at a given temperature, measurement of all the concentrations allows one to determine the equilibrium constant at that temperature. For complicated situations it is
easier to determine the constant experimentally than to calculate it from the molecular properties!
Application: pH

Water can undergo the reaction

$$\mathrm{H_2O} \leftrightarrow \mathrm{H^+} + \mathrm{OH^-} .$$

In water at room temperature a very small fraction of the water molecules are dissociated into hydrogen and hydroxyl ions. The equilibrium concentrations satisfy

$$[\mathrm{H^+}][\mathrm{OH^-}] = 10^{-14}\ \mathrm{mol^2\,l^{-2}} .$$

The notation [whatever] denotes the concentration of whatever. This is almost in the form of the law of mass action. We need to divide by the concentration of $\mathrm{H_2O}$ to place it in the proper form. However, the concentration of water in water is about 56 mol/l and it doesn't change very much, so we can treat it as a constant, and then the law of mass action takes the form of the above equation. Note that in pure water, the concentrations must be equal, so

$$[\mathrm{H^+}] = [\mathrm{OH^-}] = 10^{-7}\ \mathrm{mol\,l^{-1}} .$$

The pH of a solution is defined as

$$\mathrm{pH} = -\log_{10}[\mathrm{H^+}] .$$

The pH of pure water is 7. If an acid, such as HCl, is dissolved in water, the increased availability of $\mathrm{H^+}$ shifts the equilibrium to increase $[\mathrm{H^+}]$ and decrease $[\mathrm{OH^-}]$, but the product stays constant. When $[\mathrm{H^+}]$ goes up, the pH goes down. Similarly, adding a base, such as NaOH, increases $[\mathrm{OH^-}]$ and increases the pH.
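The statement that the product $[\mathrm{H^+}][\mathrm{OH^-}]$ stays fixed while the pH moves can be illustrated with a short calculation. The sketch below makes two assumptions that go beyond the text: the added acid dissociates completely, and charge neutrality holds, so that $[\mathrm{H^+}] = c_{\rm acid} + [\mathrm{OH^-}]$. The function names are hypothetical.

```python
import math

K_W = 1.0e-14   # [H+][OH-] in mol^2 l^-2 at room temperature, as quoted above


def ph(h_conc):
    """pH = -log10 [H+], with [H+] in mol/l."""
    return -math.log10(h_conc)


def hydroxide(h_conc):
    """[OH-] fixed by the law of mass action, [H+][OH-] = K_W."""
    return K_W / h_conc


def h_conc_with_strong_acid(acid_conc):
    """[H+] after adding a fully dissociated acid (assumption) of concentration acid_conc.

    Charge neutrality gives [H+] = acid_conc + [OH-]; combined with
    [H+][OH-] = K_W this is a quadratic in [H+], whose positive root is returned.
    """
    return 0.5 * (acid_conc + math.sqrt(acid_conc**2 + 4.0 * K_W))


for acid in (0.0, 1.0e-6, 1.0e-3):
    h = h_conc_with_strong_acid(acid)
    print(f"acid = {acid:.0e} mol/l: pH = {ph(h):.2f}, [OH-] = {hydroxide(h):.2e} mol/l")
```

With no acid the positive root reduces to $[\mathrm{H^+}] = \sqrt{K_W}$, the pure water case, and for acid concentrations well above $10^{-7}$ mol/l the hydrogen ion concentration is essentially the acid concentration.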
Other Ways of Expressing the Law of Mass Action

We have written the law of mass action in terms of the particle concentrations, $n_i = N_i/V$. The partial pressure of component $i$ is $p_i = N_i\tau/V$, or $n_i = p_i/\tau$. If we substitute these forms in the law of mass action and rearrange slightly, we have

$$\prod_i p_i^{\nu_i} = \prod_i \tau^{\nu_i}\, K(\tau) = \tau^{\sum_i \nu_i}\, K(\tau) = K_p(\tau) ,$$

where the equilibrium constant is now called $K_p(\tau)$, depends only on temperature, and is the product of $K(\tau)$ and the appropriate power of the temperature.

We can also write the law of mass action in terms of the fractional particle concentrations, $x_i = N_i/N = p_i/p$, introduced earlier. We simply divide each partial pressure above by $p$ (or each concentration by the overall concentration $n = N/V$) and we have

$$\prod_i x_i^{\nu_i} = \left(\frac{\tau}{p}\right)^{\sum_i \nu_i} K(\tau) = K_x(\tau, p) ,$$

where the equilibrium constant is the product of $K(\tau)$ and the appropriate power of $\tau/p$. In this case, the equilibrium constant is a function of pressure as well as temperature.
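The three forms of the law of mass action can be cross-checked numerically. The sketch below uses made-up concentrations for the water reaction, in arbitrary units, purely to confirm that the products of partial pressures and of fractional concentrations reproduce $K_p$ and $K_x$ as defined above. The helper names are hypothetical.

```python
import math


def product_of_powers(values, nus):
    """Return prod_i values[i] ** nus[i]."""
    result = 1.0
    for value, nu in zip(values, nus):
        result *= value ** nu
    return result


def k_p(k, tau, nus):
    """K_p(tau) = tau^(sum nu_i) K(tau)."""
    return tau ** sum(nus) * k


def k_x(k, tau, p, nus):
    """K_x(tau, p) = (tau / p)^(sum nu_i) K(tau)."""
    return (tau / p) ** sum(nus) * k


nus = [2, 1, -2]                      # 2 H2 + O2 - 2 H2O = 0
tau = 0.05                            # arbitrary energy units
concentrations = [3.0, 1.5, 40.0]     # made-up n_i, arbitrary units

k = product_of_powers(concentrations, nus)
partial_pressures = [n * tau for n in concentrations]    # p_i = n_i tau
p = sum(partial_pressures)
fractions = [p_i / p for p_i in partial_pressures]       # x_i = p_i / p

print(product_of_powers(partial_pressures, nus), k_p(k, tau, nus))
print(product_of_powers(fractions, nus), k_x(k, tau, p, nus))
```

For a reaction with $\sum_i \nu_i = 0$, such as $\mathrm{N_2} + \mathrm{O_2} \leftrightarrow 2\,\mathrm{NO}$, the powers of $\tau$ and $\tau/p$ drop out and all three constants coincide.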
Physics 301
6-Nov-2002 21-1
The Direction of a Reaction

Suppose we have a reaction such as

$$A + B \leftrightarrow C ,$$

which has come to equilibrium at some temperature $\tau$. Now we raise the temperature. Does the equilibrium shift to the left (more $A$ and $B$) or to the right (more $C$)?

The heat of reaction at constant pressure, $Q_p$, is the heat that must be supplied to the system if the reaction goes from left to right. If $Q_p > 0$, heat is absorbed and the reaction is called endothermic. If $Q_p < 0$, heat is released and the reaction is called exothermic. For a reaction at constant pressure, the heat is the change in the enthalpy of the system, $Q_p = \Delta H$. We have

$$H = G + \tau\sigma ,$$

and

$$\sigma = -\left(\frac{\partial G}{\partial \tau}\right)_{p,N_i} ,$$

so

$$H = G - \tau\left(\frac{\partial G}{\partial \tau}\right)_{p,N_i} = -\tau^2\, \frac{\partial}{\partial\tau}\left(\frac{G}{\tau}\right)_{p,N_i} .$$

What we actually want to do is to change the temperature slightly. Then the system is no longer in equilibrium and the reaction (in the forward or reverse direction) will have to occur in order to restore equilibrium. When the reaction occurs from left to right, the change in particle number is $\Delta N_i = -\nu_i$ and the change in $G$ is

$$\Delta G = -\sum_i \mu_i \nu_i .$$

If this is 0, we have the equilibrium condition (but we've taken it out of equilibrium by changing the temperature). The change in $H$ is

$$Q_p = \Delta H = -\tau^2\, \frac{\partial}{\partial\tau}\left(\frac{\Delta G}{\tau}\right)_{p,N_i} = +\tau^2\, \frac{\partial}{\partial\tau}\left(\frac{\sum_i \mu_i \nu_i}{\tau}\right)_{p,N_i} .$$

The chemical potential is

$$\mu_i = \tau \log\left(\frac{x_i\, p/\tau}{n_{i,Q} Z_{i,\rm int}}\right) .$$
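We substitute into our expression for $Q_p$ and obtain

$$
\begin{aligned}
Q_p &= \tau^2\, \frac{\partial}{\partial\tau} \sum_i \left(\nu_i \log(x_i\, p) - \nu_i \log\left(\tau\, n_{i,Q} Z_{i,\rm int}\right)\right) , \\
    &= -\tau^2\, \frac{\partial}{\partial\tau} \sum_i \nu_i \log\left(\tau\, n_{i,Q} Z_{i,\rm int}\right) , \\
    &= -\tau^2\, \frac{\partial}{\partial\tau} \sum_i \log\left(\tau\, n_{i,Q} Z_{i,\rm int}\right)^{\nu_i} , \\
    &= -\tau^2\, \frac{\partial}{\partial\tau} \log \prod_i \left(\tau\, n_{i,Q} Z_{i,\rm int}\right)^{\nu_i} , \\
    &= -\tau^2\, \frac{\partial}{\partial\tau} \log K_p(\tau) .
\end{aligned}
$$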
We’ve related the heat of reaction to the equilibrium constant! This is called van’t Hoff’s equation. A note on signs: I’ve assumed that the νi are positive on the left hand side of the reaction and negative for the right hand side of the reaction. Mandl (who provides the basis for this section) assumes the opposite, so we wind up with our equilibrium constants being inverses of each other and opposite signs in the van’t Hoff equation. In any case, our law of mass action has the concentrations of the left hand side reactants in the numerator and the right hand side reactants in the denominator. So an increase in the equilibrium constant means the reaction goes to the left and a decrease means the reaction goes to the right. We see that if Qp is positive (we have to add heat to go from left to right, an endothermic reaction), then our equilibrium constant decreases with temperature. This means increasing the temperature moves the reaction to the right. Rule of thumb: increasing the temperature causes the reaction to go towards whatever direction it can absorb energy. We’ve just shown that increasing the temperature drives an endothermic reaction to the right. It will drive an exothermic reaction to the left.
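The van't Hoff equation can be checked on a toy model. The sketch below assumes a reaction $A + B \leftrightarrow C$ between structureless particles, with binding energy $\epsilon_b$ included in $Z_{C,\rm int}$, so that $K_p(\tau) \propto \tau^{5/2} e^{-\epsilon_b/\tau}$ (each $n_Q \propto \tau^{3/2}$, and $\sum_i \nu_i = 1$ contributes one more power of $\tau$). The constant prefactor is dropped because it does not affect the logarithmic derivative. The model and the numbers are illustrative assumptions, not part of the notes.

```python
import math


def log_kp(tau, eps_b):
    """log K_p for the toy model, up to an additive constant."""
    return 2.5 * math.log(tau) - eps_b / tau


def heat_of_reaction(tau, eps_b, rel_step=1e-6):
    """Q_p = -tau^2 d(log K_p)/d(tau), using a central finite difference."""
    step = tau * rel_step
    derivative = (log_kp(tau + step, eps_b) - log_kp(tau - step, eps_b)) / (2.0 * step)
    return -tau**2 * derivative


tau, eps_b = 0.05, 1.0                    # same (arbitrary) energy units
q_numeric = heat_of_reaction(tau, eps_b)
q_expected = -(eps_b + 2.5 * tau)         # Delta H = Delta U + p Delta V for A + B -> C
print(f"Q_p numeric = {q_numeric:.6f}, expected = {q_expected:.6f}")
```

Here $Q_p < 0$, so the toy reaction is exothermic from left to right, and $K_p$ grows with temperature, which in the sign convention above means heating shifts the equilibrium to the left.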
Application: the Saha Equation

This section is related to K&K, chapter 9, problem 2. Consider the ionization of atomic hydrogen,

$$p^+ + e^- \leftrightarrow \mathrm{H} .$$

Ionizing hydrogen from its ground state requires an energy of 13.6 eV, and as the above reaction is written, it's exothermic from left to right. If we are considering low density gases, we can treat them as classical ideal gases and apply our law of mass action:

$$\frac{[p^+][e^-]}{[\mathrm{H}]} = \frac{\left(n_{p,Q} Z_{p,\rm int}\right)\left(n_{e,Q} Z_{e,\rm int}\right)}{n_{\mathrm{H},Q}\, Z_{\mathrm{H},\rm int}\, \exp(I/\tau)} ,$$

where the partition function for the hydrogen atom is to be computed with the ground state at the zero of energy, as we've taken explicit account of the binding energy $I = 13.6$ eV. This (or more properly, some of the forms we will derive below) is called the Saha equation.

Some of the factors in the equilibrium constant are easy to calculate and others are hard to calculate! Let's do the easy ones. First of all, the mass of a proton and the mass of a hydrogen atom are almost the same, so the quantum concentrations of the proton and the hydrogen are almost the same and we can cancel them out. The quantum concentration of the electron is

$$n_{e,Q} = \left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/2} .$$

The internal partition functions for the electron and proton are both just 2, since each has spin 1/2. This leaves us with the internal partition function of the hydrogen atom. This is complicated. First of all, the electron and proton each have two spin states, so whatever else is going on there is a factor of four due to the spins.

Aside: in fact the spins can combine with the orbital angular momentum to give a total angular momentum. In the ground state, the orbital angular momentum is zero and the spins can be parallel to give a total angular momentum of $1\hbar$ with 3 states or anti-parallel to give a total angular momentum of 0 with 1 state. The parallel states are slightly higher in energy than the anti-parallel state. Transitions between these states are called hyperfine transitions and result in the 21 cm line which is radiated and absorbed by neutral hydrogen throughout our galaxy and others. In any case, the energy difference between these states is small enough to be ignored in computing the internal partition function for the purposes of the Saha equation.

When all is said and done, we have

$$\frac{[p^+][e^-]}{[\mathrm{H}]} = 4\left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/2} e^{-I/\tau}\, \frac{1}{Z_{\mathrm{H},\rm int}} ,$$
where the factor of four accounts for the two spin states of the proton and the two spin states of the electron (there is a factor of four in the hydrogen partition function as well). If the temperature is small compared to the binding energy of hydrogen (which means it's small compared to the difference between the first excited state and the ground state), then we might as well approximate the partition function as 4. This gives

$$\frac{[p^+][e^-]}{[\mathrm{H}]} \approx \left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/2} e^{-I/\tau} .$$

If we have only hydrogen and ionized hydrogen, $[p^+] = [e^-]$ and

$$[e^-] \approx [\mathrm{H}]^{1/2} \left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/4} e^{-I/2\tau} .$$
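To get a feel for the sizes involved, the sketch below evaluates the approximate Saha relation for pure hydrogen, $[e^-] \approx ([\mathrm{H}]\, n_{e,Q})^{1/2} e^{-I/2\tau}$. The physical constants are standard CGS values and the neutral hydrogen concentration of $10^{17}\ \mathrm{cm^{-3}}$ is an arbitrary illustrative choice; neither comes from the notes.

```python
import math

# Standard CGS constants (assumptions, not quoted in the notes).
HBAR = 1.0546e-27    # erg s
K_B = 1.3807e-16     # erg / K
M_E = 9.1094e-28     # g
EV = 1.6022e-12      # erg
IONIZATION = 13.6 * EV


def saha_rhs(temperature_k):
    """n_{e,Q} exp(-I / tau), the right hand side of the approximate Saha equation."""
    tau = K_B * temperature_k
    n_q = (M_E * tau / (2.0 * math.pi * HBAR**2)) ** 1.5
    return n_q * math.exp(-IONIZATION / tau)


def electron_concentration(temperature_k, neutral_h):
    """[e-] for pure hydrogen, where [p+] = [e-] so that [e-]^2 = [H] * saha_rhs."""
    return math.sqrt(neutral_h * saha_rhs(temperature_k))


neutral_h = 1.0e17   # cm^-3, illustrative
for temperature in (5000.0, 10000.0):
    n_e = electron_concentration(temperature, neutral_h)
    print(f"T = {temperature:.0f} K: [e-] = {n_e:.2e} cm^-3, [e-]/[H] = {n_e / neutral_h:.2e}")
```

Because the exponent is $-I/2\tau$ rather than $-I/\tau$, the ionized fraction at a given temperature is much larger than a single Boltzmann factor would suggest.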
Some points to note: the fact that the exponential has $-I/2\tau$ indicates that this is a mass action effect, not a Boltzmann factor effect. If there is another source of electrons (for example, heavier elements whose outer electrons are loosely bound), the reaction would shift to favor more hydrogen and fewer protons. The Saha equation applies to gases in space or stars as well as donor atoms in semiconductors (modified for the appropriate physical characteristics of the atom and the medium).

In fact, we can do a little more with the Saha equation. Let's consider an atom which has several electrons, and ask about the ionization equilibrium between the ions that have been ionized $i$ times and those that have been ionized $i+1$ times,

$$\frac{n_{i+1}\,[e^-]}{n_i} = \frac{\left(n_{i+1,Q} Z_{i+1,\rm int}\right)\left(n_{e,Q} Z_{e,\rm int}\right)}{n_{i,Q}\, Z_{i,\rm int}\, \exp(I_{i+1,i}/\tau)} ,$$

where $n_{i+1}$ and $n_i$ are the concentrations of the two ions, $n_{i+1,Q}$ and $n_{i,Q}$ are the quantum concentrations of the two ions, which are essentially the same, so we cancel them out, $Z_{i+1,\rm int}$ and $Z_{i,\rm int}$ are the internal partition functions of the two ions, and $I_{i+1,i}$ is the difference in binding energy between the two ions. That is, $I_{i+1,i}$ is the energy required to remove an electron from ion $i$ and produce ion $i+1$.

Now, each ion will have some internal structure and energy levels. We let $\epsilon_{i+1,j}$ be the energy (relative to 0 for the ion ground state) of the $j$th state of ion $i+1$. This state has multiplicity $g_{i+1,j}$. (If there is more than one state at a given energy we say that energy is degenerate and the multiplicity is the number of such states. Sometimes the multiplicity is called the degeneracy or the statistical weight.) Similarly, $\epsilon_{i,k}$ and $g_{i,k}$ are the energy and multiplicity of the $k$th state of ion $i$. The fraction of ions $i+1$ which are in state $j$ is given by a Boltzmann factor,

$$\frac{n_{i+1,j}}{n_{i+1}} = \frac{g_{i+1,j}\, e^{-\epsilon_{i+1,j}/\tau}}{Z_{i+1,\rm int}} .$$
If we substitute this expression (and the corresponding one for state $k$ of ion $i$) into the Saha equation, and also substitute the quantum concentration of the electrons and the internal partition function of the electrons (2), we get

$$\frac{n_{i+1,j}\,[e^-]}{n_{i,k}} = \frac{2\, g_{i+1,j}}{g_{i,k}} \left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/2} e^{-(I_{i+1,i} + \epsilon_{i+1,j} - \epsilon_{i,k})/\tau} .$$

This form of the Saha equation connects the concentration of ions in various energy levels to the electron concentration and the temperature. Note that we managed to get rid of the internal partition functions. Of course, now we have a relation connecting concentrations of states of a given energy level rather than concentrations of ions of a given ionization.

We can apply the above expression to hydrogen (again!). There are only two ionization states. We let $i = 0$ and $k = 0$, so $n_{i,k}$ is the concentration of hydrogen atoms in the ground state (which has multiplicity $g_{0,0} = 4$ and energy $\epsilon_{0,0} = 0$). The ionized state is just a proton, which has a multiplicity of 2 and no excited states. So

$$\frac{[p^+][e^-]}{n_{0,0}} = \left(\frac{m_e \tau}{2\pi\hbar^2}\right)^{3/2} e^{-I/\tau} ,$$

which is essentially the same equation we had before except that now it includes only hydrogen atoms in the ground state and it is "exact."
Phase Transitions

Phase transitions occur throughout physics. We are all familiar with melting ice and boiling water. But other kinds of phase transitions occur as well. Some solids, when heated through certain temperatures, change their crystal structure. For example, sulfur can exist in monoclinic or rhombic forms. When iron is cooled below the Curie point, it spontaneously magnetizes. The Curie point of iron is $T_c = 1043$ K. A typical chunk of iron has no net magnetization because it magnetizes in small domains with the direction of the magnetic field oriented at random. The magnetization, even in the small domains, disappears above the Curie temperature.

The transition between the normal and superfluid states of $^4$He is a phase transition, as are the transitions between normal and superconducting states in superconductors.

You've probably heard about the "symmetry breaking phase transitions" that might have occurred in the very early universe, as the universe cooled from its extremely hot "initial" state. Such transitions "broke" the symmetry of the fundamental forces, causing there to be different couplings for the strong, weak, electromagnetic, and gravitational forces. The latent heat released in such a transition might have driven the universe into a state of very rapid expansion (inflation).
The spontaneous magnetization of iron as it’s cooled below the Curie temperature is an example of a symmetry breaking transition. Above the Curie point, the atomic magnets (spins) are oriented at random (by thermal fluctuations). So any direction is the same as any other direction and there is rotational symmetry. Below the Curie point (and within a single domain) all the atomic magnets are lined up, so a single direction is picked out and the rotational symmetry is broken. This is not an exhaustive list of phase transitions! Even so, we will not have time to discuss all these kinds of phase transitions. We will start with something “simple” like the liquid to gas transition.
Phase Diagrams

Suppose we do some very simple experiments. We place pure water inside a container which keeps the amount of water constant and doesn't allow any other kinds of molecules to enter. The container is in contact with adjustable temperature and pressure reservoirs. We dial in a temperature and a pressure, wait for equilibrium to be established, and then see what we have. For most pressures and temperatures we will find that the water is all solid (ice), all liquid, or all vapor (steam). For some temperatures and pressures we will find mixtures of solid and vapor, or solid and liquid, or liquid and vapor.

The figure shows a schematic plot of a phase diagram for water. I didn't put any numbers on the axes, which is why it's schematic. (Also, there are several kinds of ice which we're ignoring!) K&K give a diagram, but it doesn't have any resolution at the triple point.

Note that the first figure (which we'll talk about some more in a minute) is something like a map: it says here we have vapor, there we have solid, etc. The second figure is a schematic of a $pV$ diagram showing an isotherm. For an ideal gas, we would have a hyperbola. For the isotherm as shown, we have pure liquid on the branch to the left of point $a$, pure vapor to the right of point $b$, and along the segment from $a$ to $b$ we have a mixture of liquid and vapor.

If we move along this isotherm from left to right, we are essentially moving down a vertical line in the $p\tau$ diagram. To the left of point $a$ we are moving to lower pressures, with liquid water. From $a$ to $b$ we are stuck at the line in the $p\tau$ diagram that divides the liquid from the vapor region, and to the right of $b$ we are moving down in the vapor region. So the
entire transition from all liquid to all vapor, which is $a$ to $b$ in the $pV$ diagram, happens at a single point in the $p\tau$ diagram. At this point, the water has a fixed temperature and pressure, and what adjusts to match the volume is the relative amounts of liquid and vapor.

Now, at each location in the $p\tau$ diagram, we fix the temperature and pressure and let the system come to equilibrium. The equilibrium condition is that the Gibbs free energy is minimized. Ignoring for the moment the fact that the water can be a solid, the Gibbs free energy is

$$G(p, \tau, N_l, N_v) = N_l\, \mu_l(p, \tau) + N_v\, \mu_v(p, \tau) ,$$

where the subscripts $l$ and $v$ refer to the liquid and vapor and we've made use of the fact that for a single component substance the chemical potential can be written as a function of $p$ and $\tau$ only.

There are several ways we might minimize $G$. First of all, if $\mu_l(p, \tau) < \mu_v(p, \tau)$, then we minimize $G$ by setting $N_l = N$ and $N_v = 0$, where $N$ is the total number of water molecules. In other words, the system is entirely liquid. If $\mu_v(p, \tau) < \mu_l(p, \tau)$, we minimize the free energy by making the system entirely vapor. Finally, if $\mu_l(p, \tau) = \mu_v(p, \tau)$, we can't change the free energy by changing the amount of vapor and liquid, so we can have a mixture with the exact amounts of liquid and vapor determined by other constraints (such as the volume to be occupied).

So, what we've just shown is that where liquid and vapor coexist in equilibrium, we must have

$$\mu_l(p, \tau) = \mu_v(p, \tau) ,$$

which is exactly the same condition we would have come up with had we considered the "reaction"

$$\mathrm{H_2O_{liquid}} \leftrightarrow \mathrm{H_2O_{vapor}} .$$

This is a relation between $p$ and $\tau$ and it describes a curve on the $p\tau$ diagram. It's called the vapor pressure curve. With similar arguments, we deduce that solid and vapor coexist along the curve defined by

$$\mu_s(p, \tau) = \mu_v(p, \tau) ,$$

which is called the sublimation curve, and solid and liquid coexist along the curve

$$\mu_s(p, \tau) = \mu_l(p, \tau) ,$$

which is the melting curve.

If we have all three chemical potentials equal simultaneously,

$$\mu_s(p, \tau) = \mu_l(p, \tau) = \mu_v(p, \tau) ,$$
we have two conditions on $p$ and $\tau$ and this defines a point. This unique (for each substance) point where solid, liquid, and vapor all coexist is called the triple point. For water,

$$T_t = 273.16\ \mathrm{K} , \qquad p_t = 4.58\ \mathrm{mm\ Hg} .$$
Actually, this is now used to define the Kelvin scale. If a substance has more than three phases, it can have more than one triple point. For example, the two crystalline phases of sulfur give it four phases, and it has three triple points. The vapor pressure curve eventually ends at a point called the critical point. At this point, one can’t tell the difference between the liquid phase and the vapor phase. We’ll say more about this later, but for now, consider that as you go up in temperature, you get sufficiently violent motions that binding to neighboring molecules (a liquid) becomes a negligible contribution to the energy. As one goes up in temperature, the heat of vaporization decreases. At the critical point it is zero. The critical point for water occurs at Tc = 647.30 K , pc = 219.1 atm . Another way to think of the phase diagram and the coexistence curves is to imagine a 3D plot. Pressure and temperature are measured in a horizontal plane, while µ(p, τ ) is plotted as height above the plane. This defines a surface. In fact we have several surfaces, one for µs , µl , and µv . We take the overall surface to be the lowest of all the surfaces— remember we’re trying to minimize G. Where µv is the lowest, we have pure vapor, etc. Where two surfaces intersect, we have a coexistence curve. Of course, the phase diagram corresponds to equilibrium. It is possible to have liquid in the vapor region (superheated), or solid region (supercooled), etc., but these situations are unstable and the system will try to get to equilibrium. Whether this happens rapidly or slowly depends on the details of the particular situation.
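The "lowest surface wins" picture can be illustrated with a toy model at fixed pressure. The sketch below assumes each phase has a chemical potential that is linear in temperature, $\mu = e - s\tau$, so that the slope is minus the entropy per particle. The numbers are invented for illustration and are not properties of water.

```python
import math

# Toy phases at fixed pressure: (energy-like intercept e, entropy per particle s),
# in arbitrary units. These values are illustrative assumptions only.
PHASES = {
    "liquid": (-1.0, 1.0),
    "vapor": (0.0, 3.0),
}


def chemical_potential(phase, tau):
    """mu(tau) = e - s tau for the named toy phase."""
    e, s = PHASES[phase]
    return e - s * tau


def stable_phase(tau):
    """The equilibrium phase is the one with the lowest chemical potential."""
    return min(PHASES, key=lambda phase: chemical_potential(phase, tau))


def coexistence_temperature(phase_a, phase_b):
    """Temperature at which the two toy chemical potentials are equal."""
    e_a, s_a = PHASES[phase_a]
    e_b, s_b = PHASES[phase_b]
    return (e_b - e_a) / (s_b - s_a)


tau_star = coexistence_temperature("liquid", "vapor")
latent_heat = tau_star * (PHASES["vapor"][1] - PHASES["liquid"][1])   # per particle
for tau in (0.4, 0.6):
    print(f"tau = {tau}: {stable_phase(tau)}")
print(f"coexistence at tau = {tau_star}, latent heat per particle = {latent_heat}")
```

The minimum of the two lines is continuous at the crossing but its slope jumps, which is the kind of discontinuity in the entropy associated with a latent heat.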
Physics 301
8-Nov-2002 22-1
First Order and Second Order Phase Transitions

In the phase diagram we've been discussing, as we cross a coexistence curve, $G$ is continuous, but its slope changes discontinuously. This is true whether we cross the curve by changing temperature or by changing pressure. This means that the entropy and volume have step discontinuities. Recall,

$$dG = -\sigma\, d\tau + V\, dp + \mu\, dN ,$$

so

$$\sigma = -\left(\frac{\partial G}{\partial \tau}\right)_{p,N} , \qquad V = +\left(\frac{\partial G}{\partial p}\right)_{\tau,N} , \qquad \mu = +\left(\frac{\partial G}{\partial N}\right)_{p,\tau} .$$

The situation is sketched in the left pair of plots in the figure, which shows the change in entropy resulting from the phase transition. Such a transition is called a first order transition: the first derivatives of $G$ have discontinuities. Second order transitions have discontinuities in the second derivatives. So things like the entropy and volume are continuous, but their slopes change suddenly. This is illustrated in the right-hand pair of plots.

Since there is a discontinuous change in the entropy in a first order transition, heat must be added, and $\Delta\sigma = L/\tau$, where $L$ is the heat required for the system to go from completely liquid to completely vapor at temperature $\tau$. This is called the heat of vaporization (or sometimes the latent
Physics 301
8-Nov-2002 22-2
heat of vaporization). Similarly, there are heats of melting (fusion) and sublimation. In a first order transition, the heat capacities dQ/dτ are δ-functions!
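The Clausius-Clapeyron Equation

Now we are going to return to a first order transition, like the liquid–vapor transition in water, and see if we can say something about the functional form of the coexistence curve. The vapor pressure curve is given by

$$\mu_l(p,\tau) = \mu_v(p,\tau) .$$

Let's move a short distance along the curve in which $\tau$ changes by $d\tau$ and $p$ changes by $dp$. As we move along the curve, the chemical potentials change as well. If we remain on the curve, the change in both chemical potentials must be the same. We have

$$d\mu_l(p,\tau) = d\mu_v(p,\tau) ,$$

$$\left(\frac{\partial\mu_l}{\partial p}\right)_\tau dp + \left(\frac{\partial\mu_l}{\partial\tau}\right)_p d\tau = \left(\frac{\partial\mu_v}{\partial p}\right)_\tau dp + \left(\frac{\partial\mu_v}{\partial\tau}\right)_p d\tau ,$$

$$+\left(\frac{\partial\mu_v}{\partial p}\right)_\tau dp - \left(\frac{\partial\mu_l}{\partial p}\right)_\tau dp = -\left(\frac{\partial\mu_v}{\partial\tau}\right)_p d\tau + \left(\frac{\partial\mu_l}{\partial\tau}\right)_p d\tau ,$$

$$\frac{dp}{d\tau} = \frac{-\left(\dfrac{\partial\mu_v}{\partial\tau}\right)_p + \left(\dfrac{\partial\mu_l}{\partial\tau}\right)_p}{+\left(\dfrac{\partial\mu_v}{\partial p}\right)_\tau - \left(\dfrac{\partial\mu_l}{\partial p}\right)_\tau} .$$

Now, what are all these partial derivatives?

$$-\left(\frac{\partial\mu_v}{\partial\tau}\right)_p = -\left(\frac{\partial (G_v/N_v)}{\partial\tau}\right)_{p,N_v} = -\frac{1}{N_v}\left(\frac{\partial G_v}{\partial\tau}\right)_{p,N_v} = +\frac{1}{N_v}\,\sigma_v = \frac{\sigma_v}{N_v} = s_v ,$$

where $s_v$ is the entropy per particle in the vapor phase. The other partial derivative in the numerator gives the entropy per particle in the liquid phase, $s_l$, while the partials in the denominator give the volumes per particle in the vapor and liquid phases, $v_v$ and $v_l$, respectively. Altogether, we have

$$\frac{dp}{d\tau} = \frac{s_v - s_l}{v_v - v_l} .$$

This is called the Clausius-Clapeyron equation.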
Some comments on this equation are in order. First of all, $dp/d\tau$ is the slope of the vapor pressure curve (it has nothing directly to do with the equation of state of the substance). Secondly, we have the entropy per particle and the volume per particle. Since we have a ratio, the equation remains true if we use the entropy per mole and volume per mole, or the entropy per gram and volume per gram, etc. In words, the equation says the slope of the vapor pressure curve is the ratio of the change in specific entropy to the change in specific volume between the vapor and liquid phases. A similar equation applies to each coexistence curve. We just need to put in the right quantities. For example, the melting curve would have

$$\frac{dp}{d\tau} = \frac{s_l - s_s}{v_l - v_s} .$$

A final comment is that the specific entropies and volumes are to be evaluated at the temperature and pressure at the point on the coexistence curve for which the slope is desired.

The Clausius-Clapeyron equation is often written in other forms. In particular, the change in entropy can be immediately related to the latent heat:

$$\frac{dp}{d\tau} = \frac{\ell}{\tau\,\Delta v} ,$$

where $\ell$ is the specific latent heat and $\Delta v$ is the change in specific volume.

We can apply this to the melting of ice and the change in melting temperature with pressure. We start with what happens at 1 atm and 0 °C = 273.15 K. The specific latent heat of fusion is $3.35\times10^{9}$ ergs g$^{-1}$. The specific volumes of ice and liquid water are

$$v_s = 1.09070\ {\rm cm^3\,g^{-1}} , \qquad v_l = 1.00013\ {\rm cm^3\,g^{-1}} .$$

Remember, water expands as it freezes! The result is

$$\frac{dp}{dT} = -1.35\times10^{8}\ {\rm dyne\,cm^{-2}\,K^{-1}} = -134\ {\rm atm\,K^{-1}} .$$

The slope is negative! This accounts for the fact that the melting curve of water leaves the triple point headed up and slightly to the left. Most materials (which expand on melting!) have a melting curve which leaves the triple point headed up and slightly to the right. That is, a large positive slope instead of a large negative slope.

This unusual property of water is often said to be the reason why we can have figure skating and ice hockey and why glaciers can flow. As a glacier meets up with an obstruction, the pressure at the point of contact with the obstruction increases until the ice melts and the liquid water can flow around the obstruction and refreeze on the other side. Similarly, ice skates can melt ice and the liquid water helps lubricate the skate. This sounds good, but the numbers don't work out. For example, one needs about 10 meters of ice to generate a pressure of one atmosphere; to get a 10 degree change in melting temperature, we would need a 13 km thick glacier. A 50 kg skater on a pair of 20 cm by 0.1 mm (very sharp) skates would produce about a 1 degree change in melting temperature. Although this effect may play a role, it is probable that surface effects are more important. For example, a water molecule on the surface forms bonds with fewer neighbors than a molecule in the interior of the solid. Also, it may be attracted to the material in contact with the surface, making it easier to "melt."

Now let's look at what happens at the normal boiling point of water and the liquid–vapor transition. This occurs at 1 atm and 100 °C = 373.15 K. The latent heat of vaporization is $2.257\times10^{10}$ ergs g$^{-1}$ and the specific volumes of the liquid and the vapor are

$$v_l = 1.043\ {\rm cm^3\,g^{-1}} , \qquad v_v = 1673\ {\rm cm^3\,g^{-1}} .$$

This gives

$$\frac{dp}{dT} = 3.62\times10^{4}\ {\rm dyne\,cm^{-2}\,K^{-1}} = 0.036\ {\rm atm\,K^{-1}} .$$

On Mauna Kea in Hawaii at an altitude of about 14,000 ft, the pressure is about 60% of sea level pressure. That is, the pressure has decreased by 0.4 atm. Using the slope we just calculated, we find that the boiling point of water decreases by 11 °C to 89 °C.

Up to this point, we haven't made any approximations in dealing with the Clausius-Clapeyron equation. When we deal with the vapor pressure curve, we can usually neglect the volume of the liquid compared to the gas (as we've just seen). Also, we can use the ideal gas law to get volume in terms of the pressure and temperature (of course, if the substance is making the transition between liquid and gas, the ideal gas law may not apply all that well!). Recall, we need the specific volume, so if we have the latent heat per unit mass, then we can write $v = RT/pM$, where $R$ is the gas constant (per mole) and $M$ is the molecular weight (mass per mole). Then we have

$$\frac{dp}{dT} = \frac{\ell}{T\,(RT/pM)} = p\,\frac{\ell M}{RT^2} .$$

This is yet another form of the Clausius-Clapeyron equation. Note that if we evaluate the slope at the normal boiling point with this expression, we get

$$\frac{dp}{dT} = 3.54\times10^{4}\ {\rm dyne\,cm^{-2}\,K^{-1}} = 0.035\ {\rm atm\,K^{-1}} ,$$

within 3% of what we had with the exact expression.
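The arithmetic behind these slopes is easy to reproduce. The following is a minimal sketch in CGS units using the latent heats and specific volumes quoted above; the values of the gas constant and the molecular weight of water are assumptions supplied here (the notes do not state them), and the function name `cc_slope` is purely illustrative.

```python
# Clausius-Clapeyron slopes for water in CGS units.
# Latent heats and specific volumes are the values quoted in the notes;
# R_GAS and M_WATER are assumed values, not taken from the notes.
ATM = 1.01325e6    # dyne cm^-2 per atmosphere
R_GAS = 8.314e7    # erg mol^-1 K^-1 (assumed)
M_WATER = 18.015   # g mol^-1 (assumed)


def cc_slope(latent, temperature, v_final, v_initial):
    """Slope dp/dT = latent / (T * (v_final - v_initial)), in dyne cm^-2 K^-1."""
    return latent / (temperature * (v_final - v_initial))


# Melting of ice at 0 C: solid -> liquid, so delta v = v_l - v_s < 0.
melt = cc_slope(3.35e9, 273.15, 1.00013, 1.09070)

# Boiling at 100 C: liquid -> vapor, so delta v = v_v - v_l > 0.
boil = cc_slope(2.257e10, 373.15, 1673.0, 1.043)

# Ideal gas form dp/dT = p * latent * M / (R T^2), evaluated at p = 1 atm.
boil_ideal = ATM * 2.257e10 * M_WATER / (R_GAS * 373.15**2)

print(f"melting: {melt:.3e} dyne cm^-2 K^-1 = {melt / ATM:.1f} atm/K")
print(f"boiling: {boil:.3e} dyne cm^-2 K^-1 = {boil / ATM:.4f} atm/K")
print(f"boiling (ideal gas form): {boil_ideal:.3e} dyne cm^-2 K^-1")
```

The ideal gas value depends slightly on the constants chosen, so its last digit need not match the figure quoted in the text.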
We can rewrite the approximate form of the Clausius-Clapeyron equation as

$$\frac{dp}{p} = \frac{\ell M}{R}\,\frac{dT}{T^2} .$$

If we now assume that the specific latent heat $\ell$ does not depend on temperature or pressure, we can integrate this expression:

$$\log p = -\frac{\ell M}{RT} + {\rm constant} ,$$

or

$$p = p_0\, e^{-\ell M/RT} .$$

What this says is that a semi-log plot of the vapor pressure against $T^{-1}$ should be a straight line. K&K figure 10.3 shows that this is not all that bad an approximation. Of course, we know that the latent heat isn't constant and it goes to zero at the critical point. Part of the reason the plot in K&K doesn't look all that bad is that the scale is very coarse and covers 8 orders of magnitude in pressure. Even though the latent heat isn't constant, it's a good approximation to assume it is for a small range of the curve, and for a small range, the pressure is well approximated by an exponential of $1/\tau$. (One can think of the integration constant, $p_0$, as changing from one small range to the next.)
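As an illustration of the integrated form, the sketch below estimates the boiling point of water at 60% of sea level pressure (the Mauna Kea example) and compares it with the straight-line extrapolation based on the slope 0.036 atm K$^{-1}$. It assumes a constant latent heat and ideal gas vapor, as in the derivation; the gas constant and molecular weight are assumed values, and the function name is hypothetical.

```python
import math

# Assumed constants (not stated in the notes); the latent heat is the value
# quoted for water at its normal boiling point.
R_GAS = 8.314e7     # erg mol^-1 K^-1 (assumed)
M_WATER = 18.015    # g mol^-1 (assumed)
LATENT = 2.257e10   # erg g^-1


def boiling_temperature(p_ratio, t_ref=373.15):
    """Temperature at which the vapor pressure is p_ratio times its value at t_ref.

    Uses log(p / p_ref) = -(LATENT * M / R) * (1/T - 1/t_ref), i.e. the
    integrated Clausius-Clapeyron equation with constant latent heat.
    """
    inv_t = 1.0 / t_ref - R_GAS * math.log(p_ratio) / (LATENT * M_WATER)
    return 1.0 / inv_t


t_exponential = boiling_temperature(0.6)   # integrated (exponential) form
t_linear = 373.15 - 0.4 / 0.036            # constant-slope estimate from the text

print(f"integrated form: {t_exponential - 273.15:.1f} C")
print(f"constant slope:  {t_linear - 273.15:.1f} C")
```

The two estimates differ by a few degrees because the slope of the vapor pressure curve itself falls as the pressure drops; neither accounts for the temperature dependence of the latent heat.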
The van der Waals Equation of State

If we want to have an atomic model for a liquid–vapor transition, we will need to model the gas as something more than non-interacting point particles. A reasonably successful approach models the gas molecules as having an attractive force for separations larger than some distance (roughly the equilibrium separation in the solid). The attractive force gets weaker and goes to zero as the separation is increased. If the molecules get too close, a strong repulsive force arises. We can approximate this force by thinking of the molecules as "hard spheres" which can get as close as twice their radii, but no closer. K&K, figure 10.7 shows a schematic of the potential energy curve of the interaction between two molecules. (Remember the force is the negative of the slope of this curve.) Fortunately, we don't need to know the details of this curve. It's only described to this extent in order to motivate the van der Waals equation of state.

The van der Waals equation of state is

$$\left(p + a\frac{N^2}{V^2}\right)(V - bN) = N\tau ,$$

where $a$ and $b$ are constants that depend on the gas molecules. $b$ is related to the hard sphere repulsion and $a$ is related to the longer range attraction. This equation of state can be obtained by starting with the Helmholtz free energy of an ideal gas and making corrections to account for these effects. The ideal gas free energy is

$$F = -N\tau\left(\log(n_Q/n) + 1\right) .$$

If each molecule occupies a volume $b$ then the effective volume available is $V - bN$, so the concentration should be replaced by $N/(V - bN)$. Of course, this is not entirely legitimate, but if each molecule has a volume $b$, then you would expect the pressure to diverge if the density reaches 1 molecule per volume $b$. This is exactly what this correction provides.
Since there is an attractive force between the molecules, there is a net negative contribution to the energy produced by every pair of molecules. We will evaluate this in an approximate way. Suppose $\phi(r)$ is the potential energy between two molecules separated by $r$. The potential energy of one molecule due to its interactions with all the other molecules is

$$u = \int_{r_{\rm min}}^{\infty} n(r)\,\phi(r)\,dV ,$$

where $r_{\rm min}$ corresponds to the minimum distance set by $b$ and $n(r)$ is the concentration at distance $r$ from the given molecule. The simplest thing we can do is to assume $n(r) = n = {\rm const}$. This is called the mean field approximation. We assume that each molecule moves in the average field of all the other molecules and does not affect the density of the other molecules. Of course, since there is an attractive force, the concentration of molecules around any given molecule will be higher than it is at a randomly chosen point. That is, the molecular positions are correlated. The mean field approximation ignores these correlations. So, we have

$$u = n\int_{r_{\rm min}}^{\infty} \phi(r)\,dV = -2na ,$$

which is really just the definition of $a$. The factor of two is included for computational convenience. Due to its interactions with all the other molecules, a given molecule has, on the average, a change to its energy of $-2na$. There are $N$ molecules, so the total change in energy due to the attractive part of the van der Waals interaction is

$$\Delta U = -2a\frac{N^2}{V} .$$

However, this double counts the interaction energy since each molecule is counted twice: once while contributing to the mean field and once while being acted upon by the mean field. So we need to divide by a factor of two (which is why we put the 2 in to start with!). So

$$\Delta U = \Delta F = -a\frac{N^2}{V} .$$

Our final approximate expression for the free energy of a van der Waals gas is

$$F = -N\tau\left[\log\left(\frac{n_Q(V - bN)}{N}\right) + 1\right] - a\frac{N^2}{V} .$$

We differentiate with respect to the volume to get the pressure,

$$p = -\left(\frac{\partial F}{\partial V}\right)_{\tau,N} = \frac{N\tau}{V - bN} - a\frac{N^2}{V^2} ,$$

which can be rearranged to

$$\left(p + a\frac{N^2}{V^2}\right)(V - bN) = N\tau .$$
We can put the van der Waals equation of state into dimensionless form if we define

$$p_c = a/27b^2 , \qquad V_c = 3bN , \qquad \tau_c = 8a/27b .$$

Then

$$\left(\frac{p}{p_c} + \frac{3}{(V/V_c)^2}\right)\left(\frac{V}{V_c} - \frac{1}{3}\right) = \frac{8}{3}\,\frac{\tau}{\tau_c} .$$

This equation is plotted for several values of $\tau$ in the figure. For large $\tau$, it approaches the ideal gas equation of state, but for small $\tau$ there are large deviations from the ideal gas equation of state. We will explore these deviations and see what they have to do with phase transitions next time!
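The particular combinations $p_c$, $V_c$, and $\tau_c$ can be checked with a short calculation. The following is a sketch (not spelled out at this point in the notes) which assumes that these constants mark the isotherm with a horizontal inflection point, that is, the point where both $\partial p/\partial V$ and $\partial^2 p/\partial V^2$ vanish at fixed $\tau$ and $N$.

```latex
% Start from p = N\tau/(V - bN) - aN^2/V^2 and set the first two
% volume derivatives to zero at fixed \tau and N.
\begin{align*}
\left(\frac{\partial p}{\partial V}\right)_{\tau,N}
  &= -\frac{N\tau}{(V-bN)^2} + \frac{2aN^2}{V^3} = 0
  &&\Rightarrow\quad \frac{N\tau}{(V-bN)^2} = \frac{2aN^2}{V^3} , \\
\left(\frac{\partial^2 p}{\partial V^2}\right)_{\tau,N}
  &= \frac{2N\tau}{(V-bN)^3} - \frac{6aN^2}{V^4} = 0
  &&\Rightarrow\quad \frac{N\tau}{(V-bN)^3} = \frac{3aN^2}{V^4} .
\end{align*}
% Dividing the first condition by the second gives V - bN = 2V/3, so
\begin{align*}
V_c &= 3bN , \\
\tau_c &= \frac{2aN\,(V_c - bN)^2}{V_c^3}
        = \frac{2aN\,(2bN)^2}{27\,b^3N^3} = \frac{8a}{27b} , \\
p_c &= \frac{N\tau_c}{V_c - bN} - \frac{aN^2}{V_c^2}
     = \frac{4a}{27b^2} - \frac{a}{9b^2} = \frac{a}{27b^2} .
\end{align*}
```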
Physics 301
Homework No. 7
Due 12-Nov-2002
H7-1
1. K&K, chapter 8, problem 2. In this problem you are asked to consider a refrigerator whose energy input is a heat source. Sounds a bit counter-intuitive, doesn't it! But consider: with the appropriate reversible engines/refrigerators you could use the engine to extract heat from the reservoir hh (provided by the flame), do work, and deliver waste heat to the reservoir h (room temperature). Then you could use the work you just obtained to run a refrigerator extracting heat from the low temperature reservoir l (the cold part of the refrigerator) and dumping heat in the reservoir h. Perhaps there is a way to do it without explicitly going to the trouble of producing mechanical work. One way is described in M. W. Zemansky, 1957, Heat and Thermodynamics, 4th ed., (New York: McGraw Hill), pp. 235-7. Note that you don't actually need to know a physical implementation to work this problem!

2. K&K, chapter 8, problem 7. (In addition to a simple yes or no answer, please explain!)

3. K&K, chapter 8, problem 9.

4. K&K, chapter 9, problem 1. Note that the third law of thermodynamics is discussed by K&K at the end of chapter 2.

5. K&K, chapter 9, problem 4.

6. The corona of the Sun is very hot, much hotter than the surface of the Sun. (The corona can be seen during total eclipses of the Sun.) Since it is so hot, it contains many highly ionized atoms. In particular, lines from Ca XIII (12 times ionized calcium) and Ca XV (14 times ionized) are seen. The ionization potentials of these ions are 655 and 814 eV, respectively. The lines of Ca XIII are much stronger than the lines of Ca XV. Use this fact to estimate (order of magnitude!) the temperature of the corona. The ionization potential is the energy required to remove another electron from the ion. Note that Ca XV is an astronomical convention. The chemical symbol followed by a Roman numeral indicates the ionization state of the atom. Since the Romans never invented zero, the neutral atom is designated by I, the singly ionized atom by II, etc. This problem comes from M. Harwit, Astrophysical Concepts, 1973, Wiley, p. 144.

7. K&K, chapter 10, problem 1.
Physics 301
17-Nov-2002
Physics 301 Problem Set 7 Solutions

Problem 1. The reversible refrigerator is in contact with three thermal baths. The first one, the low temperature reservoir (inside the fridge), is at temperature $\tau_l$, and the heat and the entropy flowing into the fridge are $Q_l$ and $\sigma_l = Q_l/\tau_l$. The second is the high temperature reservoir at temperature $\tau_h$, and the heat and the entropy flowing out of the fridge are $Q_h$ and $\sigma_h = Q_h/\tau_h$. The third is the reservoir that drives the process at temperature $\tau_{hh}$, and the heat and the entropy flowing into the fridge are $Q_{hh}$ and $\sigma_{hh} = Q_{hh}/\tau_{hh}$. The first law gives us $Q_l + Q_{hh} = Q_h$, and the second law gives us $Q_l/\tau_l + Q_{hh}/\tau_{hh} = Q_h/\tau_h$. Putting these two together, we get

$$\frac{Q_l}{\tau_l} + \frac{Q_{hh}}{\tau_{hh}} = \frac{Q_l + Q_{hh}}{\tau_h} \quad\Rightarrow\quad \frac{Q_l}{Q_{hh}} = \frac{1/\tau_h - 1/\tau_{hh}}{1/\tau_l - 1/\tau_h} , \tag{1}$$

which is positive because $\tau_l < \tau_h < \tau_{hh}$.

Problem 2. Let us assume that the Carnot refrigerator can cool below room temperature $\tau_h$ to a temperature $\tau_l$. In each cycle, heat $Q_l$ and entropy $\sigma_l = Q_l/\tau_l$ flow into the fridge from its interior at $\tau_l$, and heat $Q_h$ and entropy $\sigma_h = Q_h/\tau_h$ are dumped by the fridge into the room at $\tau_h$. The work done on the fridge per unit time is $W = 100$ watts. We also know that all the power put out by the bulb must be taken away from the inside, or else the temperature won't stay constant. This means that $Q_l$ is at least 100 watts. The first and second laws give us $Q_l + W = Q_h$ and $Q_l/\tau_l = Q_h/\tau_h = (Q_l + W)/\tau_h$. This gives us $\tau_l/\tau_h = Q_l/(Q_l + W)$, which is consistent with $Q_l \ge 100\ {\rm watts} = W$ for $\tau_l \ge \tau_h/2$. This means that the fridge can cool its interior below room temperature, down to as low as half the room's absolute temperature.

Problem 3. For each cycle, there is heat $dQ_h$ going out of the fridge at temperature $T_h$, work $dW$ being done electrically on the fridge, and heat $dQ_l$ going into the fridge at variable temperature $T$. The last quantity is a property of the solid and is given by $dQ_l = -C\,dT = -aT^3\,dT$ (the negative sign is because $dT$ is negative). The first law says
$dW + dQ_l = dQ_h$, and reversibility says $dQ_l/T = dQ_h/T_h$. This gives us

$$dW = \frac{T_h}{T}\,dQ_l - dQ_l = -aT_hT^2\,dT + aT^3\,dT , \tag{2}$$

and the total electrical energy required to cool the sample to zero temperature is

$$W = \int dW = -\int_{T_h}^{0} a\left(T_hT^2 - T^3\right)dT = \frac{aT_h^4}{12} . \tag{3}$$
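The closed form in equation (3) can be checked by integrating equation (2) numerically. This is a minimal sketch using a midpoint rule; the values chosen for $a$ and $T_h$ are arbitrary illustrative numbers, not data from the problem, and the function name is hypothetical.

```python
def cooling_work(a, t_h, steps=100_000):
    """Midpoint-rule estimate of the integral of a*(t_h*T**2 - T**3) dT from 0 to t_h.

    This is the work needed to cool a solid with heat capacity C = a*T**3
    from t_h to zero temperature with a reversible refrigerator.
    """
    d_t = t_h / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * d_t          # midpoint of the i-th temperature slice
        total += a * (t_h * t**2 - t**3) * d_t
    return total


a_coeff, t_hot = 2.0, 300.0          # arbitrary illustrative values
w_numeric = cooling_work(a_coeff, t_hot)
w_exact = a_coeff * t_hot**4 / 12.0  # closed form from equation (3)

print(w_numeric, w_exact)
```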
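Problem 4. (a) We know that $dG = \mu\,dN - \sigma\,d\tau + V\,dp$. This gives us

$$\left(\frac{\partial G}{\partial N}\right)_{\tau,p} = \mu ; \qquad \left(\frac{\partial G}{\partial\tau}\right)_{N,p} = -\sigma ; \qquad \left(\frac{\partial G}{\partial p}\right)_{N,\tau} = V . \tag{4}$$

Writing out the second differentials, and using the fact that the partial derivatives commute, gives us:

$$\begin{aligned}
\left(\frac{\partial}{\partial\tau}\left(\frac{\partial G}{\partial N}\right)_{\tau,p}\right)_{N,p} &= \left(\frac{\partial}{\partial N}\left(\frac{\partial G}{\partial\tau}\right)_{N,p}\right)_{\tau,p} &&\Rightarrow\quad \left(\frac{\partial\mu}{\partial\tau}\right)_{N,p} = -\left(\frac{\partial\sigma}{\partial N}\right)_{\tau,p} \\
\left(\frac{\partial}{\partial p}\left(\frac{\partial G}{\partial N}\right)_{\tau,p}\right)_{N,\tau} &= \left(\frac{\partial}{\partial N}\left(\frac{\partial G}{\partial p}\right)_{N,\tau}\right)_{\tau,p} &&\Rightarrow\quad \left(\frac{\partial\mu}{\partial p}\right)_{N,\tau} = \left(\frac{\partial V}{\partial N}\right)_{\tau,p} \\
\left(\frac{\partial}{\partial\tau}\left(\frac{\partial G}{\partial p}\right)_{\tau,N}\right)_{N,p} &= \left(\frac{\partial}{\partial p}\left(\frac{\partial G}{\partial\tau}\right)_{N,p}\right)_{\tau,N} &&\Rightarrow\quad \left(\frac{\partial V}{\partial\tau}\right)_{N,p} = -\left(\frac{\partial\sigma}{\partial p}\right)_{\tau,N}
\end{aligned} \tag{5}$$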
(b) From the last relation above, the volume coefficient of thermal expansion $\alpha = (1/V)(\partial V/\partial\tau)_p = -(1/V)(\partial\sigma/\partial p)_\tau$, which approaches zero as $\tau \to 0$ because, by the third law, the entropy approaches a constant value in this limit.

Problem 5. (a) The law of mass action for each step 1mer + Nmer = (N + 1)mer looks like:

$$\frac{[1][N]}{[N+1]} = K_N . \tag{6}$$

Multiplying these equations for steps $1, 2, 3, \ldots, N$ gives

$$\frac{[1]^{N+1}}{[N+1]} = K_1K_2\cdots K_N . \tag{7}$$

(b) For equilibrium of the basic reaction written above, we have $dG = (\mu_1 + \mu_N - \mu_{N+1})\,dN_c = 0$. If we assume the participants to be ideal gases, equating the chemical potential of the two sides gives us:

$$\begin{aligned}
&\tau\log\left(\frac{[1]}{n_Q(1)}\right) + F_1 + \tau\log\left(\frac{[N]}{n_Q(N)}\right) + F_N = \tau\log\left(\frac{[N+1]}{n_Q(N+1)}\right) + F_{N+1} \\
&\Rightarrow\quad K_N = \frac{[1][N]}{[N+1]} = \frac{n_Q(N)\,n_Q(1)}{n_Q(N+1)}\exp\left(\frac{F_{N+1} - F_N - F_1}{\tau}\right) ,
\end{aligned} \tag{8}$$

where the chemical potential of each species is made up of the ideal gas part with $n_Q(N) = (M_N\tau/2\pi\hbar^2)^{3/2}$ and an internal part which we write as $F$.

(c) If $N \gg 1$, $n_Q(N) \approx n_Q(N+1)$, and if we assume $\Delta F \equiv F_{N+1} - F_N - F_1 = 0$ at room temperature, then $[N+1]/[N] = [1]/K_N \approx [1]/n_Q(1)$. At room temperature $T = 300$ K, with $M_1 = 200/(6\times10^{23})$ g, we have $n_Q(1) = 2.9\times10^{27}$/cc. For $[1] = 10^{20}$/cc, we get $[N+1]/[N] = 3.3\times10^{-8}$.

(d) In reality, $\Delta F \ne 0$. If the reaction has to go in the direction of the long molecules, then at equilibrium, we should have $[N+1]/[N] > 1$, which gives $\Delta F = -\tau\left(\log(n_Q(1)/[1]) + \log([N+1]/[N])\right) < -\tau\log(n_Q(1)/[1]) = -0.025 \times 17 \approx -0.4$ eV.
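The numbers in parts (c) and (d) can be reproduced with a few lines. The sketch below evaluates the quantum concentration in CGS units; the physical constants are assumed values supplied here, so the results agree with the rounded figures in the solution only to within about ten percent. The function name is illustrative.

```python
import math

# Assumed CGS constants (not given in the solution).
HBAR = 1.0546e-27    # erg s
K_B = 1.3807e-16     # erg K^-1
AMU = 1.6605e-24     # g
ERG_PER_EV = 1.602e-12


def quantum_concentration(mass_g, temperature):
    """n_Q = (M k_B T / (2 pi hbar^2))^(3/2), in cm^-3."""
    return (mass_g * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5


T_ROOM = 300.0
MONOMER_CONC = 1e20                       # [1] in cm^-3, as in the solution

n_q = quantum_concentration(200 * AMU, T_ROOM)
ratio = MONOMER_CONC / n_q                # [N+1]/[N] when Delta F = 0

# Upper bound on Delta F for long chains to be favored, in eV.
tau_ev = K_B * T_ROOM / ERG_PER_EV
delta_f_bound = -tau_ev * math.log(n_q / MONOMER_CONC)

print(f"n_Q(1) = {n_q:.2e} per cc")
print(f"[N+1]/[N] = {ratio:.1e}")
print(f"Delta F < {delta_f_bound:.2f} eV")
```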
Problem 6. The reaction we can write down is Ca XV + 2e$^-$ = Ca XIII. The law of mass action tells us

$$\frac{[{\rm Ca\,XV}][e^-]^2}{[{\rm Ca\,XIII}]} = K = \frac{n_Q({\rm Ca\,XV})\left(n_Q(e^-)\right)^2}{n_Q({\rm Ca\,XIII})}\,e^{\Delta F/\tau} . \tag{9}$$

Here, $n_Q$ as usual refers to the quantum concentration, and $\Delta F$ is the difference in the internal free energy between the left and the right side, approximately equal to $-655 - 814 = -1469$ eV. Since the masses of the two ions are very close, the ratio $n_Q({\rm Ca\,XV})/n_Q({\rm Ca\,XIII}) \approx 1$. This gives $\Delta F/\tau \approx \log([{\rm Ca\,XV}]/[{\rm Ca\,XIII}]) - 2\log(n_Q(e^-)/[e^-])$. Since the lines of Ca XIII are much (say a million times) stronger than those of Ca XV, we can make an estimate $\log([{\rm Ca\,XV}]/[{\rm Ca\,XIII}]) \approx \log 10^{-6} \approx -14$. We still need to make an estimate for $\log(n_Q(e^-)/[e^-])$. If we ignore this term, the estimate for the temperature is $T \approx \Delta F/\left(k_B\log([{\rm Ca\,XV}]/[{\rm Ca\,XIII}])\right) \approx 10^6$ K. At this temperature, $n_Q \approx 10^{24}$/cc. As an estimate for the density of electrons, we assume that it is 10 times smaller than the average density of the sun: $[e^-] \approx 6\times10^{22}$/cc. This means that there is a correction to the above estimate for the large number by about two orders of magnitude. We did not have that much accuracy anyway (we simply guessed that the ratio of the strengths of the two sets of lines is $10^6$), so we can trust the above estimate roughly: the temperature of the corona is about a million degrees.

Problem 7. The free energy of the van der Waals gas is

$$F = -N\tau\left(\log\left(n_Q(V - Nb)/N\right) + 1\right) - N^2a/V , \tag{10}$$
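where $n_Q = (M\tau/2\pi\hbar^2)^{3/2}$ as usual. From this (keeping only first order terms in $a$ and $b$ from now on), we find

(a) The entropy

$$\sigma = -\frac{\partial F}{\partial\tau} = N\left(\log\left(n_Q(V - Nb)/N\right) + 1\right) + N\tau\,\frac{3}{2\tau} = N\left(\log\left(n_Q(V - Nb)/N\right) + \frac{5}{2}\right) ; \tag{11}$$

(b) and the energy

$$U = -\tau^2\,\frac{\partial(F/\tau)}{\partial\tau} = \tau^2\left(\frac{3N}{2\tau} - \frac{N^2a}{V\tau^2}\right) = \frac{3}{2}N\tau - \frac{N^2a}{V} . \tag{12}$$

(c) The pressure of a van der Waals gas is

$$p = \frac{N\tau}{V - Nb} - \frac{N^2a}{V^2} \approx \frac{N\tau}{V}\left(1 + \frac{Nb}{V}\right) - \frac{N^2a}{V^2} . \tag{13}$$

We can now find the enthalpy $H = U + pV$ as a function of temperature and volume:

$$H(\tau,V) = \left(\frac{3}{2}N\tau - \frac{N^2a}{V}\right) + \left(\frac{N\tau}{V}\left(1 + \frac{Nb}{V}\right) - \frac{N^2a}{V^2}\right)V = \frac{5}{2}N\tau + \frac{N^2b\tau}{V} - 2\,\frac{N^2a}{V} . \tag{14}$$

Since the second and third terms are already first order in $a$, $b$, we only have to find $V(p)$ to zeroth order to write the enthalpy in terms of temperature and pressure. This is simply the ideal gas law $V = N\tau/p$, which gives us

$$H(\tau,p) = \frac{5}{2}N\tau + Nbp - 2\,\frac{Nap}{\tau} . \tag{15}$$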
Week 8. Phase Transition Examples, Ising Model
Physics 301
11-Nov-2002 23-1
Reading Finish K&K chapter 10, start on chapter 11.
Phase Transitions and the van der Waals Equation of State

Last time, we discussed the van der Waals equation of state. Note that we didn't derive the equation of state. We simply showed how it might plausibly come from a plausible model of the interaction energy between molecules. The van der Waals corrections to the ideal gas equation of state are really the first terms in a Taylor expansion. Higher order terms are needed to accurately model a real gas. Our goal is not accurate modeling of a gas, but a model for a phase transition, and that's what we will consider here. Recall that the van der Waals equation of state is

$$\left(p + a\frac{N^2}{V^2}\right)(V - bN) = N\tau ,$$

where $b$ is a volume related to the short range "hard sphere" repulsion of the molecules and $a$ is a volume times an energy related to the longer range attraction between molecules. We put the van der Waals equation of state into dimensionless form with the definitions

$$p_c = a/27b^2 , \qquad V_c = 3bN , \qquad \tau_c = 8a/27b ,$$

and the equation of state becomes

$$\left(\frac{p}{p_c} + \frac{3}{(V/V_c)^2}\right)\left(\frac{V}{V_c} - \frac{1}{3}\right) = \frac{8}{3}\,\frac{\tau}{\tau_c} .$$

In fact it's convenient to work in the dimensionless variables

$$\hat p = \frac{p}{p_c} , \qquad \hat V = \frac{V}{V_c} , \qquad \hat\tau = \frac{\tau}{\tau_c} ,$$

in which the equation of state becomes

$$\left(\hat p + \frac{3}{\hat V^2}\right)\left(\hat V - \frac{1}{3}\right) = \frac{8}{3}\,\hat\tau .$$

A plot of the isotherms on a $pV$ diagram showed that for $\hat\tau \gg 1$ and $\hat V \gg 1/3$, they look very much like the ideal gas isotherms. In the neighborhood of $\hat\tau = 1$, the curves develop a "wiggle," and for small $\hat\tau$, the volume is a multivalued (there are three roots) function of the pressure. We can get a little more insight by differentiating the pressure with respect to the volume at fixed temperature:

$$\begin{aligned}
\hat p &= +\frac{8\hat\tau}{3\hat V - 1} - \frac{3}{\hat V^2} , \\
\frac{d\hat p}{d\hat V} &= -\frac{24\hat\tau}{(3\hat V - 1)^2} + \frac{6}{\hat V^3} , \\
\frac{d^2\hat p}{d\hat V^2} &= +\frac{144\hat\tau}{(3\hat V - 1)^3} - \frac{18}{\hat V^4} .
\end{aligned}$$
The extrema of the pressure along an isotherm occur when $d\hat p/d\hat V = 0$, which gives the condition

$$\hat\tau\hat V^3 = \frac{1}{4}(3\hat V - 1)^2 .$$

The right hand side of this equation is plotted as the dashed line in the figure. Solid curves show the left hand side for $\hat\tau = 2$, 1, and 1/2 from left to right. The curves always have an intersection at $\hat V < 1/3$, since $\hat V = 1/3$ is the point at which the right hand side goes to zero. However, the volume cannot be as small as or less than $\hat V = 1/3$ as this corresponds to the hard spheres in contact and a singularity in the pressure. This means that the entire region of this diagram with $\hat V \le 1/3$ is unphysical and does not correspond to real solutions.

If the temperature is high enough, the unphysical root is the only root of the equation, the volume is a single valued function of pressure, and there is no phase transition. An example is the $\hat\tau = 2$ curve on the diagram. If the temperature is low enough, there are two roots in the physical region, the isotherm has a wiggle, and $\hat V$ is a triple valued function of $\hat p$ for $\hat p$ between the two roots. This case is illustrated by the $\hat\tau = 1/2$ curve in the figure. There is a root just past $\hat V = 1/3$ and another outside the figure up and to the right (the cubic curve eventually catches up to the quadratic curve). This case corresponds to a phase transition. The dividing line between the two cases occurs when the cubic curve just touches (is tangent to) the quadratic curve. This is illustrated by the $\hat\tau = 1$ curve. This case corresponds to the critical point which divides the phase transition region from the no phase transition region. Since the slope of the isotherm is 0, but $\hat V$ remains a single valued function of $\hat p$, this point must be a horizontal ($d\hat p/d\hat V = 0$) inflection point. At an inflection point, the second derivative is 0. In other words, in this case we find the critical point by setting both the slope and the second derivative to zero. If we do this, we find

$$\hat p = 1 , \qquad \hat V = 1 , \qquad \hat\tau = 1 ,$$

which is what motivated the definition of $p_c$, $V_c$, and $\tau_c$ in the first place!

Now consider an isotherm for a temperature less than $\tau_c$, for example, $\hat\tau = 0.85$ as shown in the figure. Note that the plot has a logarithmic volume axis in order to get everything in but still be able to see what's happening at the "wiggle." The dashed lines delimit the lower ($\hat p = 0.0496$) and upper ($\hat p = 0.6206$) pressures for which $\hat V$ is triple valued on this isotherm. The labeled points show the intersections of these lines with the isotherm.

We suppose our van der Waals gas is in a cylinder topped by a piston with a weight on it to provide the pressure. The cylinder is in contact with a thermal reservoir to maintain constant $\hat\tau$. Suppose we start with a big volume and low pressure, that is, to the right of point a on the isotherm. We slowly and carefully add a bit of weight to the piston. This increases the pressure and the volume decreases in response. Some mechanical work was done on the gas and some heat was transferred to the heat reservoir. We moved left along the isotherm. We continue increasing the pressure and moving left along the isotherm. Once we pass point a, we have to be extremely careful that the gas doesn't condense into a liquid. Suppose we are careful and we get to point b. At point b, we add just a tiny bit more weight to the top of the cylinder. What happens? Once we get to the left of point b, decreasing the volume lowers the pressure the gas can support; this means the volume decreases even more, which means the pressure decreases even more, etc. No point on the isotherm between b and c is stable! Anywhere in this region, a slight lowering of the pressure causes a runaway collapse. Once we get to point b, and we continue to increase the pressure, we will wind up at point d. If we start to the left of point d, with the system a liquid, and slowly remove weight to lower the pressure, and we get to point c, we can't lower the pressure any further.
If we try, we will wind up at point a.

We've just shown that all points between b and c are mechanically unstable. However, there are points to the right of b and the left of c where the Gibbs free energy is not a minimum when on the isotherm. How can this be? Again, we start to the right of a and follow the isotherm. Since the number of particles isn't changing and the temperature isn't changing, we have

$$G(\tau,p,N) = N\mu(\tau,p) = \int_{p_0}^{p} V\,dp + N\mu(\tau,p_0) ,$$

where $p_0$ is the pressure where we start. As we go along the isotherm to the left, the Gibbs free energy increases ($V > 0$ and $dp > 0$) until we reach point b. At this point, it starts to decrease and continues to decrease until we reach point c. But, when we get to point c, it's not as low as it was at point a, because the volume is smaller on b → c than it was on a → b. After point c, the free energy increases again. At point d, the free energy still isn't as large as it was at point b (for the same reason as before).

So what does all this tell us? First of all, for a given pressure between the two dashed lines, we can be on the a ↔ b or c ↔ d segment of the isotherm. The b ↔ c segment is excluded because it's unstable. Near a or c, the a ↔ b segment has the lowest free energy and will be the equilibrium phase. Near b or d, the c ↔ d segment has the lowest free energy and will be the equilibrium phase. For some pressure between the two dashed lines, the free energies of the two phases are equal and at this pressure, the equilibrium is a mixture of the two phases with the amounts of the two phases determined by the volume. The pressure is determined by the requirement that the areas of the two hatched regions shown in the figure should be equal. Note that this figure has a linear volume axis so that the areas appear correctly. Numerical calculations are used to determine that for $\hat\tau = 0.85$, $\hat V_1 = 3.1276$, $\hat V_2 = 0.5534$, and $\hat p_1 = \hat p_2 = 0.5045$.

The equilibrium isotherm starts at the far right, goes through point a, comes to point 1, follows the constant pressure segment to point 2, goes through point d, and out the top left. It's possible, under very carefully controlled conditions, to have the system on the segment 1 ↔ b, where it would be considered a supercooled gas. The system can also be on segment 2 ↔ c, where it would be considered a superheated liquid. The system can never be on the b ↔ c segment. On the 1 ↔ 2 segment, part of the system is at 1, as a gas, and part of the system is at 2, as a liquid. The relative amounts of gas and liquid determine the total volume of the system and just where along the 1 ↔ 2 segment the system appears.
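The notes do not say how the numerical calculation was done. One possible approach, sketched below under that caveat, uses plain bisection: first locate the two extrema of the isotherm, then adjust the pressure until the area under the isotherm between the liquid and gas volumes equals the area of the rectangle under the constant pressure segment (which is the equal area requirement). All function names are hypothetical, and the bracketing assumes the local pressure minimum is positive, which holds at $\hat\tau = 0.85$ but not at much lower temperatures.

```python
import math

TAU = 0.85  # reduced temperature of the example isotherm


def p_hat(v, tau):
    """Reduced van der Waals pressure at reduced volume v on isotherm tau."""
    return 8.0 * tau / (3.0 * v - 1.0) - 3.0 / v**2


def bisect(f, lo, hi, iters=100):
    """Root of f between lo and hi; assumes f changes sign exactly once there."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def extrema_volumes(tau):
    """Physical roots of tau*V^3 = (3V - 1)^2 / 4 for tau < 1 (points c and b)."""
    g = lambda v: tau * v**3 - 0.25 * (3.0 * v - 1.0) ** 2
    v_min = bisect(g, 1.0 / 3.0 + 1e-9, 1.0)       # local pressure minimum
    v_max = bisect(g, 1.0, 9.0 / (4.0 * tau))      # local pressure maximum
    return v_min, v_max


def coexistence_volumes(p, tau, v_min, v_max):
    """Liquid and gas volumes at pressure p on the two stable branches."""
    f = lambda v: p_hat(v, tau) - p
    v_liq = bisect(f, 1.0 / 3.0 + 1e-9, v_min)
    v_gas = bisect(f, v_max, 8.0 * tau / (3.0 * p) + 1.0)
    return v_liq, v_gas


def area_mismatch(p, tau, v_min, v_max):
    """Integral of p_hat dV from liquid to gas volume, minus p*(v_gas - v_liq)."""
    v_liq, v_gas = coexistence_volumes(p, tau, v_min, v_max)
    antideriv = lambda v: (8.0 * tau / 3.0) * math.log(3.0 * v - 1.0) + 3.0 / v
    return antideriv(v_gas) - antideriv(v_liq) - p * (v_gas - v_liq)


v_min, v_max = extrema_volumes(TAU)
p_low, p_high = p_hat(v_min, TAU), p_hat(v_max, TAU)
p_eq = bisect(lambda p: area_mismatch(p, TAU, v_min, v_max),
              p_low + 1e-9, p_high - 1e-9)
v_liq, v_gas = coexistence_volumes(p_eq, TAU, v_min, v_max)

print(f"extremal pressures: {p_low:.4f}, {p_high:.4f}")
print(f"coexistence: p = {p_eq:.4f}, V_liquid = {v_liq:.4f}, V_gas = {v_gas:.4f}")
```

The same routine can be rerun with other values of `TAU` between roughly 0.85 and 1 to trace out the coexistence curve in reduced variables.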
The Gibbs free energy for the van der Waals gas is

$$G(\tau,V,N) = \frac{N\tau V}{V - Nb} - \frac{2N^2a}{V} - N\tau\log\left(\frac{n_Q(V - Nb)}{N}\right) - N\tau .$$

We can convert this to dimensionless variables by making the same substitutions as before. The result is

$$\frac{G}{N\tau_c} = \frac{\hat\tau\hat V}{\hat V - 1/3} - \frac{9}{4\hat V} - \hat\tau\log\left(3b\,n_Q(\tau_c)\,\hat\tau^{3/2}(\hat V - 1/3)\right) - \hat\tau .$$

In this expression, $n_Q(\tau_c)$ is the quantum concentration evaluated at $\tau_c$. The temperature dependence of $n_Q$ is explicitly shown as the $\hat\tau^{3/2}$ factor in the logarithm. We want $G(\tau,p,N)$. It turns out that there is no analytic solution for $V$ in terms of $\tau$ and $p$, so we can't write down an expression for $G$ in terms of the desired variables. Instead, we can evaluate $G$ numerically. For this purpose, we've taken $\log(3b\,n_Q(\tau_c)) = -2$. The actual value of this factor changes the overall slope of $G$ versus $\hat\tau$, but does not change any of the details. The particular slope we've chosen helps fit things on the plots. Since $V$ is a triple valued function of pressure for some temperatures, $G$ will also be a triple valued function. (This is one of the reasons we can't come up with an analytic form for $G$.)

The figure shows $G/N\tau_c$ versus $\hat\tau$ for three pressures: the critical pressure, 1.4 times the critical pressure, and 0.6 times the critical pressure. The first two are smooth (continuous first derivative) curves. The curve for $0.6p_c$ is the triple valued curve. There are branches corresponding to the gas phase, the liquid phase, and the unstable phase. The minimum, comprising part of the gas curve and part of the liquid curve, is the equilibrium phase.

The next figure shows a surface plot of the Gibbs free energy versus both temperature and pressure. The critical point is in the center of the diagram. It's not very pronounced, but perhaps you can spot the ridge line which separates the liquid and gas phases at low pressures and temperatures.
Droplets Suppose we have a system with a temperature and pressure that are close to the values where gas and liquid can co-exist, but they slightly favor the liquid. Suppose the system exists as a gas (it might have been warmer but has cooled down, such as a rising air mass on a hot day). If the system is to condense into a liquid, it’s not likely to do it all at once. Instead, small droplets form and grow. But do they? Let’s consider the Gibbs free energy of a droplet and see under what conditions growing a droplet would decrease the free energy of a system. So, we let ∆µ be the difference in chemical potential between the vapor phase and the “bulk” liquid phase. That is ∆µ = µv − µl > 0 , where the inequality holds since we postulated that pressure and temperature favor the liquid phase. You’ll notice that we said “bulk” liquid. What’s the difference between bulk liquid and droplets? Answer: the relative contribution of the surface energy. In a liquid, molecules have neighbors with which they have an attractive interaction and the interaction energy is negative. It’s this overall negative contribution to the energy that’s responsible for the wiggle in the van der Waals isotherms and the phase transition to a liquid. A molecule at the surface of a liquid is missing some neighbors (compared to a molecule in the interior), so it costs energy to have a surface. A liquid configuration minimizes its energy by minimizing its surface area. This is why droplets and bubbles are round! (What about bubbles on special frames?) The tendency to minimize surface area leads to a force known as surface tension. This is a force per unit length which acts perpendicular to any small length lying in a surface. Just like pressure is a force per unit area which acts perpendicular to any small area embedded in a fluid. Of course, pressure is repulsive while surface tension is attractive. 
The surface tension of a given liquid can depend on what’s dissolved in it and what gases are on the other side of the surface. For water in air, the surface tension varies from about 76 dyne cm⁻¹ at 0 °C to about 59 dyne cm⁻¹ at 100 °C. At room temperature (20 °C) it’s 72.8 dyne cm⁻¹. Suppose you have a rectangular surface with sides a × b. You grab the b edge and pull it a distance da. The force required is γb, where γ is the surface tension. The work done is γb da = γ dA, where dA is the change in area of the surface. So a change in surface area requires an energy input to the system of γ∆A. This is very much like −p∆V work. With surface tension, we have a positive sign since it’s a tension rather than a pressure, and we have an area rather than a volume since it’s a force per unit length rather than a force per unit area. The upshot of all this is that a surface of area A requires an energy γA to produce. Also, γ can equally well be expressed as energy per unit area. So for water at room temperature, γ = 72.8 erg cm⁻².
Back to the droplets. Suppose we imagine removing some molecules from the vapor phase and making a spherical droplet with them. What is the change in Gibbs free energy? Answer:

$$\Delta G = G_l - G_v = -N\,\Delta\mu + 4\pi R^2\gamma = -\frac{4\pi}{3}R^3 n_l\,\Delta\mu + 4\pi R^2\gamma ,$$

where N is the number of molecules in the droplet, R is the radius of the droplet, and $n_l$ is the concentration of the liquid in the droplet. The first term in this expression accounts for the decrease in energy of going from vapor to bulk liquid. The second term adds a correction for the surface of the droplet. Here’s the interesting point. For small enough R, the quadratic surface energy term wins over the cubic volume energy term. A small drop lowers the free energy by evaporating. If R is big enough, the cubic term wins and this size drop can lower the free energy by growing. The dividing line is found by setting $d\Delta G/dR = 0$, which gives the critical radius

$$R_c = \frac{2\gamma}{n_l\,\Delta\mu} ,$$

at which point the change in Gibbs free energy is

$$\Delta G_c = \frac{16\pi\gamma^3}{3(n_l\,\Delta\mu)^2} .$$

In dimensionless units we have

$$\frac{\Delta G}{\Delta G_c} = -2\left(\frac{R}{R_c}\right)^3 + 3\left(\frac{R}{R_c}\right)^2 ,$$

which is plotted in the figure. Approximating water vapor as an ideal gas, we have $\Delta\mu = \tau\log(p/p_{\rm vapor})$, where $p_{\rm vapor}$ is the equilibrium vapor pressure of the water. Recall (lecture 11) that the chemical potential for an ideal gas can be written as $\mu = \tau\log(n/n_Q) = \tau\log(N/Vn_Q) = \tau\log(p/\tau n_Q)$. If the pressure were the same as the vapor pressure, then the vapor and bulk liquid would have the same chemical potentials. So the difference in chemical potentials depends on $p/p_{\rm vapor}$, as indicated. If we assume a value for this ratio, then we know everything to estimate $R_c$. If we follow K&K and take $p/p_{\rm vapor} = 1.1$, T = 293 K, then $R_c$ = 110 Å, quite small, but still about 70 atoms in diameter! Do you suppose this has anything to do with cloud formation and seeding as a technique to get clouds to form?
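As a rough numerical check of the droplet estimate, here is a minimal sketch in Python. It assumes liquid water at 1 g cm⁻³ and 18 g mole⁻¹ to get the liquid concentration (these two inputs are not stated in the notes), uses the ideal gas form of ∆µ, and works in CGS units. The function names are made up for illustration.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant in erg/K (CGS, matching the notes)
N_A = 6.02214076e23   # Avogadro's number in 1/mol


def delta_mu(T, p_ratio):
    """Ideal-gas estimate of mu_v - mu_l = tau * log(p / p_vapor), in erg."""
    return K_B * T * math.log(p_ratio)


def delta_g(R, gamma, n_l, dmu):
    """Gibbs free energy change for forming a droplet of radius R (cm)."""
    return -(4.0 * math.pi / 3.0) * R**3 * n_l * dmu + 4.0 * math.pi * R**2 * gamma


def critical_radius(gamma, n_l, dmu):
    """R_c = 2 gamma / (n_l dmu), where d(Delta G)/dR = 0."""
    return 2.0 * gamma / (n_l * dmu)


def barrier_height(gamma, n_l, dmu):
    """Delta G_c = 16 pi gamma^3 / (3 (n_l dmu)^2)."""
    return 16.0 * math.pi * gamma**3 / (3.0 * (n_l * dmu) ** 2)


# Assumed inputs: liquid water at 1 g/cm^3 and 18 g/mol; gamma from the text.
gamma = 72.8                   # erg/cm^2
n_l = (1.0 / 18.0) * N_A       # molecules per cm^3
dmu = delta_mu(293.0, 1.1)     # erg
r_c = critical_radius(gamma, n_l, dmu)
dg_c = barrier_height(gamma, n_l, dmu)

print(f"R_c = {r_c * 1e8:.0f} Angstrom")
print(f"Delta G_c / tau = {dg_c / (K_B * 293.0):.0f}")
```

With these inputs the radius should land close to the 110 Å quoted above. The barrier is printed in units of τ to give a feel for how unlikely a thermal fluctuation of that size is.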
Physics 301
13-Nov-2002 24-1
A Simple Model of Ferromagnetism Recall way back in lecture 4 we discussed a magnetic spin system. In our discussion, we assumed spin 1/2 magnets with an energy ±E when anti-aligned or aligned with the magnetic field. We had a total of N spins and we let 2s be the “spin excess,” the number of aligned minus the number of anti-aligned magnets. We assumed that the magnets were weakly interacting with a thermal reservoir and with each other. We found that

$$\frac{2s}{N} = \tanh\frac{E}{\tau} ,$$

which gives small net alignment if $E \ll \tau$ and essentially perfect alignment if $E \gg \tau$. It’s customary to speak of the magnetization, which is the magnetic moment per unit volume, and in this section µ denotes the magnetic moment, not the chemical potential. Then the magnetization is

$$M = \frac{N}{V}\,\mu\tanh\frac{\mu B}{\tau} = n\mu\tanh\frac{\mu B}{\tau} ,$$

where n is the concentration of elementary magnets and B is the magnetic field. Previously, we assumed that B was externally supplied. But of course, if the system has a net magnetization, it generates a magnetic field. We assume that when the magnetization is M, there is an effective field acting on each magnetic dipole proportional to the magnetization,

$$B_{\rm eff} = \lambda M ,$$

where λ is a proportionality constant. This is essentially an application of the mean field approximation to get $B_{\rm eff}$. In the crystal structure of a ferromagnet (or any material for that matter), the electric and magnetic fields must be quite complicated, changing by substantial amounts on the scales of atoms. We are encapsulating all our ignorance about what’s really going on in the constant λ. In any case, we now assume there is no external magnetic field, and we have

$$M = n\mu\tanh\frac{\mu\lambda M}{\tau} .$$

We can rewrite this in dimensionless form with the following definitions:

$$m = \frac{M}{n\mu} , \qquad \tau_c = n\mu^2\lambda , \qquad t = \frac{\tau}{\tau_c} ,$$

where $\tau_c$ is called the Curie temperature. With these definitions, our equation becomes

$$m = \tanh\frac{m}{t} ,$$

which is actually kind of remarkable. It says that, at a given temperature, a magnetization can occur spontaneously, with no externally applied field.
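The self-consistency condition m = tanh(m/t) has no closed-form solution, but it is easy to solve numerically. The sketch below uses plain bisection on f(m) = m − tanh(m/t). The function name and the bracketing interval are choices made here for illustration, not anything from K&K.

```python
import math


def spontaneous_m(t, tol=1e-12):
    """Nonzero solution of m = tanh(m/t) for reduced temperature t > 0.

    Returns 0.0 for t >= 1, where m = 0 is the only solution.
    """
    if t >= 1.0:
        return 0.0
    # For t < 1, f(m) = m - tanh(m/t) is negative just above m = 0
    # and positive at m = 1, so the nonzero root is bracketed.
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - math.tanh(mid / t) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for t in (0.2, 1.0 / 3.0, 0.5, 0.8, 0.95, 1.2):
    print(f"t = {t:.3f}   m = {spontaneous_m(t):.4f}")
```

Bisection is slow compared to Newton’s method but cannot run away, which matters near t = 1 where the root approaches zero and the slope of f there becomes small.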
In order to determine the spontaneous magnetization, we must solve this transcendental equation for m. The figure shows a plot of the left hand side (the straight line) and several plots of the right hand side for various values of t. At t = 1, the right hand side is tangent to the left hand side at m = 0. For t > 1, the curves intersect only at m = 0. So there is no spontaneous magnetization when the temperature is greater than the Curie temperature. For t < 1, there is an intersection at a non-zero m which moves to larger m as t gets smaller, approaching m = 1 at τ = 0. The figure shows the magnetization versus temperature. For temperatures less than about a third of the Curie temperature, the magnetization is essentially complete: all the magnetic moments are lined up. This is the case for iron at room temperature. K&K show a similar plot including data points for nickel. The data points follow the curve reasonably well. Before we get too excited about this theory, we should plug in some numbers. For iron, the Curie point is $T_c$ = 1043 K, the saturation magnetic field is about $B_s$ = 21,500 G, the density is ρ = 7.88 g cm⁻³ and the atomic weight is about 56 g mole⁻¹. We might also want to know the Bohr magneton, $\mu_B = 9.27\times10^{-21}$ G cm³. The Bohr magneton is almost exactly the magnetic moment of the electron. If we assume that one electron per atom participates in generating the magnetic field, we have $n = 8.47\times10^{22}$ cm⁻³ and $M = n\mu_B$ = 785 G. Note also that we expect B = 4πM = 9900 G. So we are in the ballpark. Next, let’s calculate $T_c$. To do this, we have to know λ, which hasn’t entered into the calculations so far. If we assume that the smoothed field which we just calculated is the effective field acting on an electron spin, then λ = 4π and $T_c$ = 0.66 K, just a little on the small side! We’re off by a factor of 2 in the overall magnetic field and a factor of about 1600 in the Curie temperature.
Perhaps more than one electron per atom participates in generating the mean field. After all, iron has 26 electrons per atom. If the electrons pair with opposite spins, an even number per atom have to wind up with the same spin. (Of course this ignores the fact that electrons are in the conduction band of the solid.) Also, λ is supposed to characterize the field acting on the aligned electrons. Since it appears that a simple estimate of λ is off by a factor of a thousand or so, there must be some complicated interactions going on in order to get an effective field this strong. These interactions are presumably due to the other electrons in the atom, in nearby atoms, and in the Fermi sea of electrons in the metal. Without a detailed understanding of what’s going on at the atomic level, we can’t say much more about this model.
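The iron estimates above are simple enough to reproduce in a few lines. The sketch below uses the numbers quoted in the notes plus standard values of Boltzmann’s constant and Avogadro’s number, and assumes one electron per atom and λ = 4π as in the text. The last line asks what λ would be needed to match the observed Curie point; that quantity is computed here for illustration and is not quoted in the notes.

```python
import math

K_B = 1.380649e-16     # Boltzmann constant, erg/K
N_A = 6.02214076e23    # Avogadro's number, 1/mol
MU_B = 9.27e-21        # Bohr magneton, erg/G (value quoted in the text)

rho = 7.88             # density of iron, g/cm^3
atomic_weight = 56.0   # g/mol
T_c_observed = 1043.0  # K

n = rho / atomic_weight * N_A        # one aligned electron per atom (assumption)
M = n * MU_B                         # saturation magnetization, G
B = 4.0 * math.pi * M                # smoothed field, G
lam = 4.0 * math.pi                  # naive guess: B_eff = 4 pi M
T_c_model = n * MU_B**2 * lam / K_B  # tau_c = n mu^2 lambda, converted to kelvin
lam_needed = K_B * T_c_observed / (n * MU_B**2)

print(f"n         = {n:.3g} cm^-3")
print(f"M         = {M:.0f} G")
print(f"B = 4 pi M = {B:.0f} G")
print(f"T_c model = {T_c_model:.2f} K")
print(f"lambda needed / 4 pi = {lam_needed / lam:.0f}")
```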
Superconductors, the Meissner Effect, and Magnetic Energy As you know, when some materials are cooled, they become superconductors. All resistance to the flow of electricity disappears. K&K state that superconductivity disappears for temperatures above about 20 K. This is a little out of date. In the last decade or so, high temperature superconductors were discovered (called high Tc ) and the record high temperature is around 190 K. (Of course, I might be out of date, too!) The new high Tc superconductors are ceramics with anisotropic superconductivity. The old-style or normal superconductors are metals with isotropic superconductivity. We will be talking about old-style superconductors. There are two kinds of superconductors, naturally called type I and type II! Type I superconductors completely exclude magnetic fields from their interiors when in a superconducting state. This is called the Meissner effect. Type II superconductors partially exclude magnetic field. Actually what happens is the type II superconductor organizes itself into vortex tubes with normal conductor and magnetic field in the centers of the vortices and superconductor and no magnetic field between the vortices. We will consider type I superconductors. The Meissner effect is actually quite amazing. You will recall from E&M (or you will learn when you take physics 304) that you can’t get a magnetic field inside a perfect conductor. If you try, then by Lenz’ law the induced currents create an induced magnetic field opposite to the one you’re trying to push into the conductor. In the case of a perfect conductor, the currents are sufficiently large to exclude the field from the interior of the conductor. So if you have a superconductor (in its superconducting state) and you try to move it into a region of magnetic field, you might not be surprised that the magnetic field is excluded. But, suppose you start with a superconductor in its normal state, warm, and you turn on a magnetic field. 
There is no problem with having a static magnetic field in a good, but not perfect, conductor. Now cool the superconductor until it becomes superconducting. When it makes the transition to the superconducting state, the magnetic field pops out leaving no magnetic field in the interior. This cannot be explained by Lenz’ law plus a perfect conductor. It’s a property of the superconducting phase. As you can imagine, it costs energy to expel the magnetic field from the interior of the superconductor. The system is in a superconducting state because the free energy in the superconducting state is lower than the free energy in the normal state. But the energy cost of excluding the magnetic field can make the normal state the minimum energy state. What’s actually observed is that at the transition temperature, superconductivity is destroyed by a very small magnetic field. As the temperature is lowered, it takes a larger and larger magnetic field to destroy the superconductivity. The increase in free energy of a superconductor upon excluding a magnetic field B must just be the energy required to create a field of −B inside the superconductor. This energy is just

$$\Delta U = \frac{V B^2}{8\pi} ,$$

or $V B^2/2\mu_0$ if you really want SI units! So the critical field will be determined by

$$\frac{1}{V}\left(F_N(\tau) - F_S(\tau)\right) = \frac{B_c^2(\tau)}{8\pi} ,$$

where $F_N$ is the free energy of the normal state (which doesn’t change much with a magnetic field present) and $F_S$ is the free energy of the superconducting state in the absence of a magnetic field. For fields larger than $B_c$, the free energy of the normal state with magnetic field is less than the free energy of the superconducting state with excluded field and the normal state is the equilibrium state. In K&K chapter 8, you’ll find some plots of the critical field versus temperature for various superconductors.
The Ising Model In the mean field theory of ferromagnetism, we attempt to account for the interactions of the magnetic dipoles by considering the “global mean field.” Our only allowance for the microscopic structure of the atoms was the proportionality factor λ which related the field acting on a dipole to the mean field. In the Ising model, we go to the other extreme! We assume that all the interaction comes from nearest neighbors and that far away electrons, atoms, or molecules have no direct effect. The idea is that there is an extra interaction energy due to the magnetic dipoles (in addition to whatever else is going on in the material). Furthermore, the dipoles are arranged on a regular lattice. The extra energy is taken to be

$$U = -\frac{1}{2}\sum_{\substack{i,j \\ i\neq j}} J_{ij}\,\sigma_i\sigma_j - \mu B\sum_i \sigma_i ,$$
where the factor of 1/2 is a double counting correction, $\sigma_i = \pm1$ is proportional to the spin of the electron, B is an external, constant magnetic field, µ is the electron magnetic moment as we’ve had before and $J_{ij}$ represents the magnetic interaction between electrons i and j. Note that even at this point, we’ve already simplified things greatly. In particular, we’ve taken “scalar spins.” Presumably, an exact quantum mechanical treatment would consider $\vec\sigma_i$ and $\vec\sigma_j$, that is, a vector treatment of the spins which would also include how the spins are aligned with respect to the line joining them. But, we’re going to sweep all that under the rug and just assume that the product of the spins gives ±1. Note: a model in which the interaction energy is proportional to $\vec\sigma_i\cdot\vec\sigma_j$, but still ignoring terms like $\vec\sigma_i\cdot\vec r_{ij}$, is known as the Heisenberg model. An additional simplification is the following: we take $J_{ij}$ to be 0 if the dipoles are not nearest neighbors and the single constant J if they are. Thus every pair of nearest neighbor dipoles contributes either +J or −J to the energy. When looking for spontaneous magnetization, we will take B = 0. However, a non-zero B can be included if it’s desired to see how an external field affects the phase transition. A bit of history: the Ising model was first proposed in 1920 by Lenz. His PhD student was Ising who in 1925 showed that (what is now called) the Ising model had no phase transition in one dimension. Since the original proposal was motivated by the desire to study a phase transition, this was not a good sign and not much happened for a while. In the 1940s, 50s and 60s, people returned to the model and analytic solutions were obtained for the Ising model on a two dimensional square lattice. It turns out there is a phase transition in this case. The three dimensional Ising model has not been solved analytically. Nor has the 2D model with a non-zero B. However, with computers, it’s possible to tackle the Ising model numerically and folks have had a lot of fun playing with the model in the last couple of decades (that is, after K&K was published!). Ernst Ising died in 1998 and an obituary appeared in Physics Today. Perhaps by now, you’ve figured out that the Ising model is extremely difficult to solve. What does it mean to solve the Ising model? Answer: we would like to calculate the partition function from which we can calculate the free energy, entropy, specific heat, and so on as a function of temperature. If the model is a good representation of a phase transition, we should be able to see some kind of change in its properties on either side of the critical temperature.
A rather advanced discussion of analytic techniques applied to the Ising model is given in H. S. Robertson, 1993, Statistical Thermophysics, (New Jersey: Prentice Hall), chapter 7. The mathematics involved is beyond the scope of this course. However, a couple of results can be quoted. First, the 1D Ising model has no phase transition. Second, the 2D square lattice Ising model (4 nearest neighbors) has a phase transition at $\sinh(2J/\tau_c) = 1$, which gives $\tau_c = 2.269J$. Other results (for example, a plot of the heat capacity) may be found in this reference. There is also a considerable amount of information about the Ising model on the web. In particular, there are a number of Java applets that do interesting Monte Carlo calculations of a 2D Ising model. Many of these let you set the parameters. Since you know where the critical temperature is, you can set the parameters appropriately if you play with any of these models. A good place to start is http://oscar.cacr.caltech.edu/Hrothgar/Ising/index.html .
What is a Monte Carlo calculation? The basic features involve the following. Set up a 2D lattice (2D array in your code) of spins. Each spin can have the value ±1. To start, one might as well choose the values randomly (very hot) or all the same (extremely cold). Pick a value for J. Pick a value for τ. Now make a pass through the lattice. A pass might consist of visiting each spin in turn or it might consist of visiting a number of the spins chosen randomly. As each spin is visited, calculate the difference in energy between having the spin up and having the spin down. Then choose the orientation of the spin randomly so that the probability ratio is a Boltzmann factor:

$$\frac{P(\mathrm{up})}{P(\mathrm{down})} = e^{-(E_{\rm up} - E_{\rm down})/\tau} .$$

In other words, each spin is numerically placed in thermal contact with a heat bath at temperature τ. Successive passes correspond to time passing. Since the system is finite (very finite!), relatively large fluctuations in quantities such as the average energy of the system are expected.

If you play with some of the simulations on the net, you will observe the following. At τ = 0, the system is in the ground state: all the spins are either +1 or −1, but in the absence of other influences, either case is equally likely, so there are two ground states. Note that the energy is $U_0 = -2NJ$ if each spin has four nearest neighbors. At very high temperatures, each spin is equally likely to be +1 or −1, so the average energy is 0. Of course you didn’t need any simulations to know the preceding, but what happens in the neighborhood of the transition temperature? The simulations show that big patches containing all one orientation of spin form. It’s sometimes described as “the system can’t make up its mind what it wants to do.” (A rather anthropomorphic statement, don’t you think?) In any case, just above the transition temperature, the patches tend to fluctuate but remain the same size (at a given temperature).
Below the transition temperature, patches of one orientation tend to win out. This effect becomes stronger as the temperature is lowered.
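The recipe above can be turned into a short program. What follows is a minimal sketch in pure Python, not a reproduction of any of the applets: it assumes a square lattice with periodic boundaries, B = 0, and a “visit each spin in turn” pass. With B = 0 the energy difference for a spin whose four neighbors sum to S is E_up − E_down = −2JS, so the Boltzmann ratio gives P(up) = 1/(1 + e^{−2JS/τ}), which the code evaluates in the equivalent, overflow-safe form (1 + tanh(JS/τ))/2. The lattice size, number of passes, and function names are arbitrary choices.

```python
import math
import random


def neighbor_sum(spins, i, j):
    """Sum of the four nearest neighbors, with periodic boundaries."""
    L = len(spins)
    return (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
            + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])


def sweep(spins, J, tau, rng):
    """One pass: visit each spin in turn and redraw it from the heat bath."""
    L = len(spins)
    for i in range(L):
        for j in range(L):
            S = neighbor_sum(spins, i, j)
            # P(up)/P(down) = exp(2 J S / tau)  =>  P(up) = (1 + tanh(J S / tau)) / 2
            p_up = 0.5 * (1.0 + math.tanh(J * S / tau))
            spins[i][j] = 1 if rng.random() < p_up else -1


def energy(spins, J):
    """U = -J * (sum over nearest-neighbor pairs), each pair counted once."""
    L = len(spins)
    return -J * sum(spins[i][j] * (spins[(i + 1) % L][j] + spins[i][(j + 1) % L])
                    for i in range(L) for j in range(L))


def magnetization(spins):
    """Average spin, between -1 and +1."""
    L = len(spins)
    return sum(map(sum, spins)) / (L * L)


def simulate(L, J, tau, n_sweeps, seed=0, cold=True):
    """Run n_sweeps passes from a cold (all up) or hot (random) start."""
    rng = random.Random(seed)
    if cold:
        spins = [[1] * L for _ in range(L)]
    else:
        spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    for _ in range(n_sweeps):
        sweep(spins, J, tau, rng)
    return spins


J = 1.0
tau_c = 2.0 * J / math.asinh(1.0)   # from sinh(2J/tau_c) = 1
for tau in (1.0, tau_c, 10.0):
    s = simulate(16, J, tau, 100)
    print(f"tau = {tau:6.3f}   m = {magnetization(s):+.3f}   "
          f"U/N = {energy(s, J) / 16**2:+.3f}")
```

On a lattice this small the numbers near τ_c fluctuate a lot from run to run (change `seed` to see this), which is the “very finite” effect mentioned above. Well below τ_c a cold start should stay almost fully magnetized with U/N close to −2J, and well above τ_c both m and U/N should hover near zero.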
Physics 301
15-Nov-2002 25-1
Landau Theory of Phase Transitions In this section we’ll have a quick look at Landau’s theory of phase transitions. We will be looking at a general approach rather than any specific system. We will consider systems at constant temperature and volume, so the Helmholtz free energy is a minimum when in thermal equilibrium. We suppose that the phase of a system is determined by some parameter. For example, in the ferromagnetic system we discussed last time, the magnetization, M, was the parameter of interest. It had a value different from zero for temperatures below the Curie temperature, went to zero at the Curie temperature, and stayed 0 for higher temperatures. We will denote the parameter by ξ; in the generic case, it’s called the order parameter. Sometimes I call it “wiggly.” At thermal equilibrium, the order parameter will have some value, $\xi_0(\tau)$, but we suppose that we can calculate the free energy for other values of ξ. That is, we write

$$F_L(\xi,\tau) = U(\xi,\tau) - \tau\sigma(\xi,\tau) ,$$

where the subscript L indicates the “Landau free energy.” Of course the free energy, energy, and entropy depend on the volume and number of particles as well, but we’re assuming they’re constant and we’re not bothering to show them. When the system is in thermal equilibrium, the order parameter will take on the value which minimizes the free energy and we have

$$F(\tau) = F_L(\xi_0,\tau) \le F_L(\xi,\tau) ,$$

for all $\xi \ne \xi_0$. Given $F_L(\xi,\tau)$, we can determine its minimum. It’s the appearance of a new minimum as τ is varied that is a sign of phase transition. Suppose $F_L$ is an even function of the order parameter. Also suppose $F_L$ can be expanded in a power series in the order parameter. (Both assumptions are reasonable, but are certainly not guaranteed to always be true!) We have

$$F_L(\xi,\tau) = g_0(\tau) + \frac{1}{2}g_2(\tau)\xi^2 + \frac{1}{4}g_4(\tau)\xi^4 + \frac{1}{6}g_6(\tau)\xi^6 + \cdots ,$$

where the indices on the g’s indicate the corresponding power of ξ, and the fractions are put in for later convenience.

Now we need to look at this power series to see under what conditions there is a minimum which moves around as we change the temperature. First of all, suppose all the g functions are positive. Then all of them increase the Landau free energy for all values of $\xi^2$. So the minimum occurs when the order parameter is zero. Not very exciting!
To get a minimum somewhere other than ξ = 0, we need at least one of the g functions to be negative and for this function to be large enough compared to the other functions that it can shift the minimum away from zero. Also, to have a transition, we would like the minimum to depend on the temperature. We suppose that

$$g_2(\tau) = \alpha(\tau - \tau_0) ,$$

where α is a positive constant. The temperature $\tau_0$ is the temperature at which the phase transition will occur and this expression for $g_2$ should only be taken as valid in a neighborhood of $\tau_0$. It clearly can’t work all the way down to τ = 0! We suppose that $g_4(\tau) > 0$ and that α and $g_4(\tau)$ are sufficiently large that all the interesting behavior occurs for small ξ so we don’t have to worry about $g_6$ and higher order functions. With all these caveats, the Landau free energy is

$$F_L(\xi,\tau) = g_0(\tau) + \frac{1}{2}\alpha(\tau - \tau_0)\xi^2 + \frac{1}{4}g_4(\tau)\xi^4 .$$

To find the minimum, we set the derivative with respect to the order parameter to zero,

$$0 = \left(\frac{\partial F_L}{\partial\xi}\right)_\tau = \alpha(\tau - \tau_0)\xi + g_4(\tau)\xi^3 ,$$

which has the roots

$$\xi = 0 \qquad\text{and}\qquad \xi = \pm\sqrt{(\tau_0 - \tau)\frac{\alpha}{g_4(\tau)}} .$$

The root at 0 is always at least an extremum of the free energy. The other roots are imaginary and unphysical if $\tau > \tau_0$. For temperatures lower than the transition temperature, $\tau_0$, the roots are real, physical and are actually the global minima of the free energy. The figure shows plots of the free energy versus the square of the order parameter for several values of the temperature. For $\tau > \tau_0$ the minimum is at ξ = 0. For $\tau < \tau_0$, the minimum moves away from 0 with decreasing τ.
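To see the minimum move away from zero, one can compare the analytic root with a brute-force minimization of the truncated Landau free energy. The sketch below does this in Python for illustrative parameter values (α = τ₀ = g₄ = 1, chosen here only for convenience, with g₄ treated as a constant and g₀ subtracted off). Substituting the nonzero root back into the free energy gives a minimum value of −α²(τ₀ − τ)²/(4g₄) relative to g₀, which vanishes together with its first temperature derivative at τ = τ₀.

```python
import math


def landau_f(xi, tau, alpha=1.0, tau0=1.0, g4=1.0):
    """F_L - g0 for the expansion truncated at xi^4 (illustrative parameters)."""
    return 0.5 * alpha * (tau - tau0) * xi**2 + 0.25 * g4 * xi**4


def xi0_analytic(tau, alpha=1.0, tau0=1.0, g4=1.0):
    """Non-negative root of dF_L/dxi = 0 that minimizes F_L."""
    return math.sqrt(alpha * (tau0 - tau) / g4) if tau < tau0 else 0.0


def xi0_numeric(tau, alpha=1.0, tau0=1.0, g4=1.0, xi_max=2.0, n=20001):
    """Brute-force minimum over an evenly spaced grid of non-negative xi."""
    grid = [xi_max * k / (n - 1) for k in range(n)]
    return min(grid, key=lambda xi: landau_f(xi, tau, alpha, tau0, g4))


for tau in (1.5, 1.0, 0.9, 0.75, 0.5):
    xa = xi0_analytic(tau)
    xn = xi0_numeric(tau)
    print(f"tau = {tau:4.2f}   xi0 analytic = {xa:.4f}   numeric = {xn:.4f}   "
          f"F_min - g0 = {landau_f(xa, tau):+.5f}")
```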
The figure shows the minimum value of the free energy as a function of the temperature. Both $\xi_0$ and the minimum free energy are continuous functions of temperature. The minimum free energy below $\tau_0$ joins smoothly to the minimum curve $g_0(\tau)$ at $\tau = \tau_0$. Presumably the entropy is also a continuous function of temperature through $\tau_0$. This means there is no latent heat and we are dealing with a second order transition. The ferromagnetic system we discussed last time is a phase transition of this sort. To show this, we need to express the energy and the entropy as functions of the magnetization which will be our order parameter. Recall that M is the magnetic dipole moment per unit volume, so we will express these quantities per unit volume. The energy of an aligned magnet in a magnetic field is −µB. In terms of magnetization, this becomes U(M) = −MB. With our mean field approximation where B = λM, this becomes $U(M) = -\lambda M^2$. But. . . It is the magnetic dipoles which are the source of the field so we have a double counting problem. Each magnet contributes once in the B factor of −µB and once in the µ factor. We need a factor of 1/2. Then the energy density is $U(M) = -\lambda M^2/2$. What about the entropy? In homework 1, problem 3, you worked out the entropy of our spin system. When the magnetization is small (that is, near the transition temperature), the entropy is

$$V\sigma(s) = V\sigma_0 - \frac{2s^2}{N} ,$$

where N is the number of dipoles and 2s is the spin excess as usual. We’ve written the volume explicitly, so σ refers to entropy density. We have

$$M = n\mu\,\frac{2s}{N} ,$$

so

$$\sigma = \sigma_0 - \frac{M^2}{2n\mu^2} ,$$

where n is the concentration of magnetic moments, µ. Then the free energy as a function of M is

$$F_L(M,\tau) = -\frac{\lambda M^2}{2} - \tau\sigma_0 + \tau\frac{M^2}{2n\mu^2} + \text{higher order terms} .$$

The higher order terms arise because our approximation for the entropy is only accurate when the magnetization is close to zero. When the magnetization is large, then the entropy is small. Ignoring the higher order terms, we can rearrange this expression as

$$F_L(M,\tau) = \underbrace{-\tau\sigma_0}_{g_0} + \frac{1}{2}\,\underbrace{\frac{1}{n\mu^2}\left(\tau - n\mu^2\lambda\right)}_{g_2 = \alpha(\tau - \tau_0)}\,M^2 ,$$
and the higher order terms will produce $M^4$ and higher powers. So the Landau free energy for our mean field approximation for a ferromagnet has exactly the form we’ve been considering for a second order phase transition. This method of analysis correctly yields the existence of a Curie temperature and the functional dependence of the order parameter (magnetization) on temperature near the Curie point, but does not tell us what happens as τ → 0 (we neglected the higher order terms; what did you expect?). How would we go about making a first order transition? To have a first order transition, we need a discontinuous entropy so that we have a latent heat. Since the entropy is $\sigma = -\partial F/\partial\tau$, this means that the slope of the free energy as a function of temperature must be discontinuous. In the second order transition which we’ve just examined, the “new” minimum of the free energy first appears ($\tau = \tau_0$) at the same location (ξ = 0) as the old minimum and then moves to non-zero values of the order parameter as the temperature is lowered below the transition point. What we want is for the new minimum to first appear at a non-zero order parameter. Then the order parameter will jump discontinuously to the new minimum. In terms of our expansion of the free energy around ξ = 0, we need a function that starts at 0 (we imagine subtracting off $g_0$), rises, dips below zero, and then rises again as ξ increases. Since we are dealing with a power series in $\xi^2$, the $\xi^2$ term must be responsible for the initial rise. Then the $\xi^4$ term takes over and produces the dip. For larger ξ, the $\xi^6$ term wins and the function rises again. All this means that $g_4$ must be negative and $g_6$ must be positive. Our expansion for the free energy looks like

$$F_L(\xi,\tau) = g_0(\tau) + \frac{1}{2}\alpha(\tau - \tau_0)\xi^2 - \frac{1}{4}|g_4(\tau)|\xi^4 + \frac{1}{6}g_6(\tau)\xi^6 + \cdots .$$

We’re allowing the $\xi^2$ coefficient to be either positive or negative depending on whether τ is larger or smaller than $\tau_0$. Note that $\tau_0$ is not the transition temperature in this case.
To find the locations of the minima, we differentiate with respect to the order parameter,

$$0 = \left(\frac{\partial F_L}{\partial\xi}\right)_\tau = \alpha(\tau - \tau_0)\xi - |g_4(\tau)|\xi^3 + g_6(\tau)\xi^5 ,$$

which has solutions ξ = 0 and

$$\xi^2 = \frac{|g_4| \pm \sqrt{g_4^2 - 4g_6\alpha(\tau - \tau_0)}}{2g_6} .$$

If τ is sufficiently large, the argument of the square root is negative and there are no extrema other than the one at ξ = 0. In particular, τ − τ0
$\frac{1}{2}(u_{AA} + u_{BB})$.

In other words, the strength of the A-B interaction must be less than the average strength of the A-A and B-B interactions in order to have a solubility gap. In this case, we can relate the temperature at which a solubility gap appears to the energies per atom. The idea is that as the temperature is lowered, the entropy of mixing contribution to the free energy decreases until a positive bump appears in the free energy which indicates a solubility gap. The border between homogeneous and heterogeneous mixtures occurs when the free energy is flat (linear) as a function of composition. In this case, the second derivative of the free energy, with a negative contribution due to the energy of mixing, and a positive contribution due to the entropy of mixing, must be zero. So,

$$\begin{aligned}
0 &= \frac{d^2F_M}{dx^2} = \frac{d^2}{dx^2}\left(u_M - \tau\sigma_M\right) , \\
  &= \frac{d^2}{dx^2}\left\{\frac{1}{2}\,p\,x(1-x)\left[2u_{AB} - (u_{AA} + u_{BB})\right] + \tau\left[(1-x)\log(1-x) + x\log x\right]\right\} , \\
  &= -p\left[2u_{AB} - (u_{AA} + u_{BB})\right] + \tau\left(\frac{1}{1-x} + \frac{1}{x}\right) .
\end{aligned}$$

The peak of the bump occurs at x = 1/2, so the maximum temperature for which a heterogeneous mixture is the stable equilibrium is

$$\tau_M = \frac{p}{4}\left[2u_{AB} - (u_{AA} + u_{BB})\right] .$$
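A quick numerical check of this result is sketched below in Python. It uses the free energy of mixing per atom in the form appearing inside the second derivative above, with the combination 2u_AB − (u_AA + u_BB) lumped into a single parameter called `delta` here. The values p = 12 and delta = 0.1 (in arbitrary energy units) are hypothetical, picked only to have something to evaluate.

```python
import math


def f_mix(x, tau, p, delta):
    """Free energy of mixing per atom; delta = 2 u_AB - (u_AA + u_BB)."""
    u_m = 0.5 * p * x * (1.0 - x) * delta
    minus_sigma_m = (1.0 - x) * math.log(1.0 - x) + x * math.log(x)
    return u_m + tau * minus_sigma_m


def d2f_mix(x, tau, p, delta):
    """Analytic second derivative of f_mix with respect to x."""
    return -p * delta + tau * (1.0 / (1.0 - x) + 1.0 / x)


def tau_max(p, delta):
    """Highest temperature with a solubility gap: tau_M = p * delta / 4."""
    return p * delta / 4.0


p, delta = 12, 0.1          # hypothetical values, arbitrary energy units
tau_m = tau_max(p, delta)
for tau in (0.8 * tau_m, tau_m, 1.2 * tau_m):
    curvature = d2f_mix(0.5, tau, p, delta)
    print(f"tau / tau_M = {tau / tau_m:.1f}   d2F/dx2 at x = 1/2: {curvature:+.3f}")
```

A negative curvature at x = 1/2 means the bump is present (solubility gap); a positive one means the homogeneous mixture is stable at every composition.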
Another example relates to the gold-silicon mixture we used as an example last time. In this case, the two pure substances have different crystal structures. So it is easy to imagine that starting from pure gold and adding a small amount of silicon, silicon atoms just substitute for gold atoms in the gold lattice which is a face centered cubic lattice. Similarly, starting from pure silicon and adding gold, the gold just substitutes for silicon in the silicon lattice which is a diamond lattice. Aside: what are face centered cubic and diamond lattices? Answer: to make a face centered cubic lattice, start with a cube and put an atom at each corner (8 atoms) and an atom in the center of each face (6 atoms). Now replicate this cube throughout space.
Note that each atom in a corner position is shared among 8 cubes, and each atom in a face center is shared among two cubes, so there are 4 atoms per cube. Each atom in the face centered cubic structure has 12 nearest neighbors at distance $a/\sqrt{2}$, where a is the side of the cube. Consider an atom in the center of a face. Its nearest neighbors are the four corner atoms in the same face and the other face atoms in the two cubes it’s in, except for the face atoms on the opposite sides of the cubes. Note that if you start with a simple cubic lattice (atoms at the corner positions only) and add three more, each offset by a/2 along one side of the cube and a/2 along another side (there are three possibilities for doing this), you wind up with the face centered cubic lattice. So corner and face positions are equivalent. How about the diamond lattice? Start with a face centered cubic lattice. Add another face centered cubic lattice offset by a/4 along each side of the cube. Each atom in this lattice has four nearest neighbors at distance $\sqrt{3}a/4$. For example, the atom at (a/4, a/4, a/4) has neighbors at (0, 0, 0), (0, a/2, a/2), (a/2, 0, a/2), and (a/2, a/2, 0). An atom can be thought of as being at the center of a regular tetrahedron and its four nearest neighbors can be thought of as being at the vertices of the tetrahedron. The tetrahedral structure matches the four covalent bonds that carbon (in a diamond!), silicon, germanium, and tin can form. To complete the story on the crystal structures of gold and silicon, gold’s face centered cubic lattice has a = 4.07 Å, a nearest neighbor distance of 2.88 Å and an atomic volume of 10.2 cm³ mole⁻¹. Silicon’s diamond lattice has a = 5.43 Å, a nearest neighbor distance of 2.35 Å and an atomic volume of 12.0 cm³ mole⁻¹. All these numbers are for room temperature. The bond lengths aren’t all that different, which means it isn’t that hard to substitute a few of one kind of atom in the other atom’s lattice.
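The neighbor counts and the numbers for gold and silicon can be checked by brute force. The sketch below builds both lattices on an integer grid in units of a/4 (so the face centers sit at coordinates like (0, 2, 2) and the diamond offset is (1, 1, 1)), then finds the nearest neighbors of the atom at the origin. The function names and the integer-grid trick are choices made here for illustration; the lattice constants are the ones quoted above.

```python
import math

N_A = 6.02214076e23   # Avogadro's number, 1/mol

# Atom positions within one cube, in units of a/4.
FCC_BASIS = [(0, 0, 0), (0, 2, 2), (2, 0, 2), (2, 2, 0)]
DIAMOND_BASIS = FCC_BASIS + [(x + 1, y + 1, z + 1) for x, y, z in FCC_BASIS]


def nearest_neighbors(basis, reps=1):
    """Return (distance in units of a, count) for the atom at the origin."""
    d2 = []
    for i in range(-reps, reps + 1):
        for j in range(-reps, reps + 1):
            for k in range(-reps, reps + 1):
                for bx, by, bz in basis:
                    x, y, z = 4 * i + bx, 4 * j + by, 4 * k + bz
                    if (x, y, z) != (0, 0, 0):
                        d2.append(x * x + y * y + z * z)
    d2_min = min(d2)
    return math.sqrt(d2_min) / 4.0, d2.count(d2_min)


def molar_volume(a_angstrom, atoms_per_cell):
    """Volume per mole of atoms in cm^3 for a cubic cell of side a (Angstrom)."""
    return (a_angstrom * 1e-8) ** 3 / atoms_per_cell * N_A


for name, a, basis in (("gold (fcc)", 4.07, FCC_BASIS),
                       ("silicon (diamond)", 5.43, DIAMOND_BASIS)):
    d, count = nearest_neighbors(basis)
    print(f"{name}: {count} neighbors at {d * a:.2f} Angstrom, "
          f"{molar_volume(a, len(basis)):.1f} cm^3/mole")
```

Working in integer units of a/4 keeps the distance comparisons exact, so counting neighbors at the minimum distance does not need a floating point tolerance.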
On the other hand, the bond lengths are different, and as the amount of substitution increases, the energy will go up. If the energy (or free energy) for silicon substituted in the gold lattice and gold substituted in a silicon lattice are plotted versus composition, they will cross at some intermediate composition. So not only is there a bump at intermediate compositions, there’s a kink! This will lead to a solubility gap when the temperature is low enough that the entropy of mixing doesn’t “win.” Note that this argument depends on the fact that the crystal lattices are sufficiently different that they can’t be continuously deformed into one another. A third example is a mixture of liquid 3 He and 4 He. Below 0.87 K, such a mixture separates into two components: liquid 3 He containing some dissolved 4 He floats on top of liquid 4 He containing some dissolved 3 He. As the temperature is lowered, almost all the 4 He condenses into the superfluid (BE) ground state with about 6% dissolved 3 He, while the remaining 3 He becomes almost pure. The phase diagram for this mixture is shown in K&K, figure 11.7. The fact that 3 He and 4 He become almost immiscible at low temperatures is the basis for operation of the helium dilution refrigerator. In addition, the overall reason for this effect is amusing. c 2002, Princeton University Physics Department, Edward J. Groth Copyright
At these low temperatures, the $^4$He is essentially all in the superfluid ground state (as already noted). In this state, its energy is all potential energy of interaction among atoms. (That is, no kinetic energy except zero point energy.) The $^3$He is a degenerate Fermi liquid with all kinetic energy states up to the Fermi energy populated. Since $^3$He and $^4$He have the same electronic structure, one expects similar interaction energies between the atoms. The big difference is the kinetic energy. The kinetic energy per atom is $3\epsilon_F/5$ and the Fermi energy depends on concentration as $n^{2/3}$. (See lecture 16.) If we consider a homogeneous mixture of $^3$He and $^4$He with $x$ denoting the fraction of $^4$He and $1 - x$ the fraction of $^3$He, then the energy of mixing is proportional to the concentration of $^3$He to the 2/3 power, or
$$u_M \propto (1 - x)^{2/3},$$
which starts at 1 at $x = 0$ and ends up at 0 at $x = 1$. This may not seem like a bump, but remember, we’re concerned with a rise over the straight line connecting the energy per atom when it’s all $^3$He and when it’s all $^4$He. Relative to this line (the straight line from $(0, 1)$ to $(1, 0)$), there is a bump. Another way to see this is to note that the second derivative with respect to concentration is negative,
$$\frac{d^2 u_M}{dx^2} \propto -\frac{2}{9}(1 - x)^{-4/3}.$$
Since we have a bump, we have a solubility gap which, as $\tau \to 0$, extends from $x = 0$ to $x = 0.94$.
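The bump can also be seen numerically. The following is a minimal sketch, with the proportionality constant set to 1 and function names chosen just for this example; it compares $(1 - x)^{2/3}$ with the straight line $1 - x$ and evaluates the second derivative quoted above.

```python
def u_mix(x):
    """Energy of mixing per atom in arbitrary units: (1 - x)**(2/3)."""
    return (1.0 - x) ** (2.0 / 3.0)


def chord(x):
    """Straight line from (0, 1) to (1, 0)."""
    return 1.0 - x


def d2u_dx2(x):
    """Second derivative of u_mix: -(2/9) (1 - x)**(-4/3)."""
    return -(2.0 / 9.0) * (1.0 - x) ** (-4.0 / 3.0)


# Sample interior compositions and measure the rise above the chord.
xs = [i / 100.0 for i in range(1, 100)]
bump = [u_mix(x) - chord(x) for x in xs]
x_peak = xs[bump.index(max(bump))]

print("bump is positive everywhere:", all(b > 0.0 for b in bump))
print("largest rise %.4f near x = %.2f" % (max(bump), x_peak))
print("second derivative at x = 0.5: %.4f" % d2u_dx2(0.5))
```

Setting the derivative of the difference to zero gives the peak at $1 - x = 8/27$, where the rise above the line is $4/27$ in these units.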
Liquid and Solid Mixtures Without a Solubility Gap

We’ve been considering a two component system that’s completely solid or completely liquid. If we consider the liquid and solid phases of two component systems, things get even more interesting. To start with, suppose that neither the liquid nor the solid phase has a solubility gap. Then we plot the free energy for the liquid and the free energy for the solid as a function of composition and neither curve will exhibit a local maximum. That is, neither curve will have a bump in the middle which would lead to a solubility gap. At high temperatures, the entropy wins and the liquid free energy curve lies below the solid free energy curve for the entire range of compositions and any composition will be a liquid at high temperatures. At low temperatures, the energy wins, and the solid free energy curve will lie below the liquid curve for all compositions and any composition is a homogeneous alloy at low temperatures. (These are basically just a restatement that there are no solubility gaps!)

There must be a temperature range over which the solid and liquid free energy curves intersect. In this range, the appropriate free energy curve is the minimum of the two curves. Since there is an intersection, the free energy curve will be the liquid curve for one range of compositions and the solid curve for another range. (K&K figure 11.9 would be a good thing to
study at this point.) If the overall minimum curve has a bump in it, then we have the same kind of situation that we had with the solubility gap. Draw the tangent curve below the bump; this is tangent to the minimum free energy curve near the two local minima. One local minimum is in the liquid part of the minimum free energy curve and the other is in the solid part. Anywhere between the tangent points, the system can lower its free energy by splitting into a liquid phase with the composition of the liquid tangent point and a solid phase with the composition of the solid tangent point. The overall amounts of liquid and solid are determined by the overall composition. (This is essentially the same argument we had with the solubility gap.) We can produce a phase diagram for this situation: a schematic is shown in the figure. The upper curve in the figure is known as the liquidus curve. Above this curve, the system is a homogeneous liquid. The lower curve is called the solidus curve and below this curve the system is a homogeneous solid. In between the two curves, the equilibrium configuration of the system is a mixture of homogeneous liquid and homogeneous solid, but with different compositions. The composition of the liquid is determined by the liquidus curve at the given temperature and the composition of the solid by the solidus curve. The relative amounts of liquid and solid are determined by the overall composition. Now imagine the following experiment. Start with a homogeneous liquid of some composition. Lower the temperature. Eventually the liquidus curve is reached. At this point, a further lowering of the temperature causes some of the material to solidify with the composition of the solidus curve—in general, this will have a greater fraction of the high melting temperature material than contained in the initial composition. 
The remaining liquid has more of the lower melting temperature material, so to continue to solidify what’s left, the temperature has to be lowered still more. In other words, this system does not have a well defined freezing point. It will solidify over a range of temperatures. Furthermore, the composition of the solid changes as solidification proceeds.
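The statement that the relative amounts of liquid and solid are determined by the overall composition is just conservation of atoms, often called the lever rule. Here is a minimal sketch, assuming the liquidus and solidus compositions at the temperature of interest have already been read off the phase diagram; the function name and the sample compositions are made up for illustration.

```python
def lever_rule(x_overall, x_liquid, x_solid):
    """Return (liquid fraction, solid fraction) for a two phase mixture.

    Solves f_liquid * x_liquid + f_solid * x_solid = x_overall
    together with f_liquid + f_solid = 1.
    """
    if x_liquid == x_solid:
        raise ValueError("liquid and solid compositions must differ")
    low, high = sorted((x_liquid, x_solid))
    if not low <= x_overall <= high:
        raise ValueError("overall composition is outside the two phase region")
    f_liquid = (x_solid - x_overall) / (x_solid - x_liquid)
    return f_liquid, 1.0 - f_liquid


# Hypothetical compositions: liquidus at x = 0.30, solidus at x = 0.60.
f_liquid, f_solid = lever_rule(0.40, 0.30, 0.60)
print("liquid fraction %.3f, solid fraction %.3f" % (f_liquid, f_solid))
```

As the temperature is lowered, the liquidus and solidus compositions both move, so repeated calls with the same overall composition trace how the liquid fraction falls from 1 to 0 over the freezing range.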
Liquid and Solid Mixtures With a Solubility Gap

As you can imagine, the number of things that can happen with mixtures is quite large. As a final example, we will consider a two component mixture with liquid and solid phases. We’ll suppose there is no solubility gap in the liquid phase (whenever we have a liquid, it’s homogeneous) but we’ll suppose there’s a solubility gap in the solid phase. At low enough temperatures, where there’s no liquid phase, we have the same situation as we’ve discussed before. For intermediate compositions (within the solubility gap) the equilibrium configuration is a heterogeneous mixture of solid A containing some dissolved B and solid B containing some dissolved A.

Now suppose we have a high enough temperature that the liquid phase is the equilibrium configuration for some compositions. The free energy curves for this temperature might look something like those shown (very schematically) in the figure. Starting from the left we have the free energy curve for a homogeneous solid of mostly A atoms with some B atoms substituted at lattice sites. Next is the free energy curve for a homogeneous liquid of A and B atoms. Finally, the curve on the right is for a homogeneous solid of mostly B atoms with some A atoms substituted at lattice sites. If the mixture were to remain homogeneous, its phase (liquid or solid) would be determined by the lowest of the three free energy curves. But once again, we can draw tangents to the curves and whenever the composition lies in a range between tangent points, the equilibrium configuration is a two phase (in this case, liquid and solid) mixture. We can draw two tangents: one connects the mostly A solid and the liquid curves and the other connects the liquid and mostly B solid curves. The tangent points are indicated in the figure.

At this temperature there are five possibilities. If the composition is less than $x_a$, we have a homogeneous solid of mostly A atoms but with some B atoms. Between a and b ($x_a < x < x_b$), the equilibrium consists of a homogeneous solid of composition $x_a$ and a homogeneous liquid of composition $x_b$. The relative amounts of liquid and solid determine the overall composition within the range $(x_a, x_b)$. Between b and c ($x_b < x < x_c$), we have a homogeneous liquid. Between c and d ($x_c < x < x_d$), the equilibrium consists of a homogeneous liquid of composition $x_c$ and a homogeneous solid of composition $x_d$. The relative amounts of liquid and solid determine the overall composition within the range $(x_c, x_d)$. Finally, to the right of d ($x_d < x$), we have a homogeneous solid of mostly B atoms with some A atoms. The phase diagram might look something like that shown schematically in the figure.
At high temperatures we have a homogeneous liquid. At low temperatures we have, depending on the composition, either homogeneous nearly pure solids, or a heterogeneous solid mixture. This is just what we discussed last time. At intermediate temperatures, we can have the five cases we discussed in the previous paragraph. (If the melting temperatures of the two solids aren’t the same, we can also have a range of temperatures where there are three cases.) A mixture with this kind of phase diagram, that is, two liquidus branches, is called a eutectic system. The minimum temperature at which the mixture can exist as a liquid is the eutectic temperature ($\tau_e$) and the composition that remains liquid at $\tau_e$ is called the eutectic composition, $x_e$.

Now suppose we start out in the liquid state with a composition $x > x_e$. We lower the temperature. Eventually we reach the right liquidus curve. As we lower the temperature still further, the system splits into two phases. The solid phase solidifies with the composition of the right solidus curve, while the remaining liquid follows the composition of the liquidus curve. What happens when the composition and temperature reach $x_e$ and $\tau_e$? Answer: the remaining liquid solidifies at that temperature. Of course, it solidifies into the heterogeneous solid we’ve been discussing. Similarly, if we start with a liquid with composition $x < x_e$ and lower the temperature, the solid that forms will follow the left solidus curve, while the remaining liquid follows the left liquidus curve until it reaches $x_e$ and $\tau_e$ where it solidifies. If we started with the eutectic composition, then the entire system would reach the eutectic temperature as a liquid, where it would solidify “all at once.” In this respect (having a single freezing and melting temperature), a eutectic behaves very much like a pure substance.

You may be wondering about the fact that a eutectic system can have a melting temperature lower than the melting temperature of either pure component. This is somewhat non-intuitive. The key idea is that the system must come to some compromise between minimizing its energy and maximizing its entropy. If we have a pure substance as a crystalline solid, minimizing the energy has won over maximizing the entropy.
With two components, with different crystal structures, the energy might still win over the entropy even allowing for the additional entropy of mixing. However, if we allow the system to be a liquid, then the energy of mixing may mostly disappear since we are not trying to force atoms into the “wrong” crystal structure. Then the entropy of mixing can have a substantial effect and make the free energy of the liquid lower than that of the solid at temperatures substantially below the melting temperature of either pure substance.
Physics 301
20-Nov-2002 27-1
Cooling by Expansion

Now we’re going to change the subject and consider the techniques used to get really cold temperatures. Of course, the best way to learn about these techniques is to work in a lab where experiments in the micro-Kelvin to milli-Kelvin range are being done. One of the things you might learn, that you won’t find in K&K, is that cryostats have to be mechanically isolated from their surroundings. Any energy that is transmitted to the interior of the cryostat winds up as thermal energy that the cooling apparatus must remove from the system.

To get to really cold temperatures requires several stages. The first stage is usually the cooling of helium to just above its liquefaction temperature. This is followed by liquefaction and pumping on the helium to get to a few tenths of a Kelvin, the helium dilution refrigerator to get to milli-Kelvins, and adiabatic demagnetization to get to micro-Kelvins. These days, one can use laser cooling and evaporative cooling of trapped atoms to get to nano-Kelvins. (Recall the Physics Today articles on the Bose-Einstein condensate, handed out with lecture 17.) These techniques are not discussed in K&K!

So, how does one do cooling by expansion? The basic scheme involves a working substance (perhaps helium in the lab, a banned CFC in an old refrigerator?). The refrigerator operates in a cycle. The working substance (called the gas from now on) travels the following path. From the cold volume (where it picked up some thermal energy) it goes to a heat exchanger (where it picks up more thermal energy from the hot gas), and then to a compressor. The compressor compresses the gas, increasing its pressure and temperature. The hot, high pressure gas is allowed to cool in contact with the outside world (room temperature). The gas then flows through the heat exchanger where it gives up heat to the cold gas on the way to the compressor. Then it goes through an expansion device, where it expands, does work on the outside world, and cools. Finally, the cold gas is sent back to the cold volume where it extracts heat.

The expansion device is often a turbine which allows the gas to expand approximately isentropically. If a given amount of gas has energy $U_1$, volume $V_1$ and pressure $p_1$ on the input side of the expansion device and $U_2$, $V_2$, and $p_2$ on the output side, then the total energy going in is $U_1 + p_1 V_1 = H_1$. (The pressure $p_1$ pushes the volume $V_1$ through the device.) Similarly, the energy coming out is $U_2 + p_2 V_2 = H_2$. Then the work done on the expansion device is
$$W = (U_1 + p_1 V_1) - (U_2 + p_2 V_2) = H_1 - H_2 = \frac{5}{2} N(\tau_1 - \tau_2),$$
where we have assumed an ideal monatomic gas so $U = 3N\tau/2$ and $pV = N\tau$ and the enthalpy is $H = 5N\tau/2$. Note that the heat exchanger operates at constant pressure, no mechanical work is done by the heat exchanger, and the energy transferred in the form of heat appears as the change in enthalpy of the gas.
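To get a feel for the size of the work extracted, here is a rough numerical sketch for an ideal monatomic gas. The function names, the temperatures, and the pressure ratio are made-up illustrative values. The outlet temperature uses the standard isentropic relation for a monatomic ideal gas, $T_2 = T_1 (p_2/p_1)^{2/5}$, which is an assumption added here and only approximates a real turbine.

```python
GAS_CONSTANT = 8.314462618  # J / (mole K), Avogadro's number times k


def expansion_work(moles, t_in, t_out):
    """Work W = H1 - H2 = (5/2) N k (T1 - T2), in joules."""
    return 2.5 * moles * GAS_CONSTANT * (t_in - t_out)


def isentropic_outlet_temperature(t_in, p_in, p_out):
    """Outlet temperature for an ideal monatomic gas expanding isentropically.

    Uses T p**(-2/5) = constant, i.e. gamma = 5/3.
    """
    return t_in * (p_out / p_in) ** 0.4


# Hypothetical stage: one mole of helium cooled from 20 K to 10 K.
print("work extracted: %.1f J" % expansion_work(1.0, 20.0, 10.0))

# Hypothetical pressure drop from 10 atm to 1 atm starting at 300 K.
t_out = isentropic_outlet_temperature(300.0, 10.0, 1.0)
print("isentropic outlet temperature: %.1f K" % t_out)
```

The work per stage is proportional to the temperature drop, which is one reason several stages are needed: each stage starts from the temperature the previous one reached.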
In a typical refrigerator for operation with helium, the gas is not liquefied in this step. There may be several stages of expansion cooling required to get the helium close to its liquefaction temperature. In household refrigerators and air conditioners the gas may be allowed to condense into a liquid. In this case the expansion device is often a porous plug which acts as the expansion valve in a Joule-Thomson device to be described shortly.
Throttling Processes

Consider a device with two insulated chambers. The chambers are separated by an insulating partition, with a small hole. The chambers have pistons which can be used to adjust the pressure in each chamber. Start with some gas (or whatever) in the left-hand chamber with volume $V_1$ and pressure $p_1$. Since the partition has a small hole in it, some of the gas will get through the hole into the right-hand chamber. We adjust both pistons continuously, so the gas in the left-hand chamber stays at pressure $p_1$ and the gas in the right-hand chamber stays at pressure $p_2 < p_1$. Since $p_2 < p_1$, the gas gradually flows from the left-hand chamber to the right-hand chamber. No heat is exchanged and no work is done by the hole or the partition. The only work done is the work done by the pistons.

This is an irreversible process. Changing the forces on the pistons by an infinitesimal amount will not cause the process to go in the opposite direction. There need not be friction or viscosity at the hole; in fact we assume there isn’t. There is a pressure drop at the hole simply because it is small and molecules only run into it (and get through to the other side) every so often. This process is called a throttling process. We also assume that the speeds of the pistons are slow enough that we can ignore any kinetic energy of the gas.

So, we start with some amount of gas containing internal energy $U_1$, with pressure $p_1$, occupying volume $V_1$ on the left-hand side of the partition. We end up with the same amount of gas containing internal energy $U_2$, with pressure $p_2$, occupying volume $V_2$ on the right-hand side of the partition. The work done on the gas by the left piston is $W_1 = +p_1 V_1$ and the work done on the gas by the right piston is $W_2 = -p_2 V_2$. As we’ve already mentioned, there is no heat transfer. Therefore the change in internal energy of the gas is equal to the work done on the gas,
$$U_2 - U_1 = W_1 + W_2 = p_1 V_1 - p_2 V_2,$$
or
$$U_2 + p_2 V_2 = U_1 + p_1 V_1,$$
or
$$H_2 = H_1.$$
The single small hole in our partition is simply to illustrate the point. In practice, the small hole may be replaced by a partition containing many small holes, by a valve, by a porous plug, or even by a wad of cotton! What’s needed is a way to restrict (throttle) the flow and produce a pressure drop without extracting heat or work. Note that in the refrigerator discussed in the previous section, the pressure drop was not produced by a throttling process. Instead, there was a turbine which extracted the work. That’s why that process was not a constant enthalpy process. Note also that since a throttling process is irreversible, the state of the gas as a whole is not an equilibrium state (part of it’s at $p_1$ and part of it’s at $p_2$). So we cannot really plot a throttling process on a p-V diagram.

Some more comments on enthalpy: Recall
$$dU = dQ - p\,dV,$$
and in a constant volume (isochoric) process the change in energy is just the heat added and
$$U_f - U_i = \int_i^f C_V\,dT.$$
For the enthalpy,
$$dH = dU + d(pV) = dQ + V\,dp,$$
and in a constant pressure (isobaric) process the change in enthalpy is just the heat added and
$$H_f - H_i = \int_i^f C_p\,dT.$$
In the case of an adiabatic (K&K would say isentropic) process
$$U_f - U_i = -\int_i^f p\,dV,$$
while
$$H_f - H_i = \int_i^f V\,dp.$$
In the case of a free expansion (an irreversible process) in an insulated container, there is no work done and the energy is constant. In the case of a throttling process in an insulated container, we’ve just shown that the enthalpy is constant. For a monatomic ideal gas
$$U = \frac{3}{2} NkT,$$
and
$$H = \frac{5}{2} NkT.$$
As another example of the use of enthalpy, consider a chemical reaction which occurs in a beaker open to the atmosphere. This is a constant pressure process, so the heat generated or consumed by the reaction represents a change in enthalpy of the system. Chemical handbooks list reaction enthalpies.
The Joule-Thomson Effect

When a gas is close to its liquefaction temperature, the molecules are necessarily close enough to each other that there is a significant contribution to the total energy from the interaction energy between molecules. This contribution is negative. If we let the gas expand, the interaction energy decreases in absolute value (that is, it gets more positive), so work must be done against the attractive forces between molecules. If there is no other source for this work, it will come from the kinetic energy of the molecules of the gas and the gas will cool. How can we arrange for there to be no other source for the work? An extreme way would be to let the gas expand into a vacuum. This motivates the idea of an expansion valve which creates a pressure drop between two volumes of gas. In other words, the throttling process just discussed.

A Joule-Thomson apparatus contains a valve or other flow restriction device like a porous plug which allows for a throttling process. As we’ve just seen, the gas goes through the expansion valve at constant enthalpy. That is, $H_1 = H_2$. For an ideal monatomic gas, $H = 5N\tau/2$, so there is no cooling of an ideal gas in a Joule-Thomson apparatus! Of course, we argued in the previous paragraph that it is the interaction energy between molecules that is responsible for liquefaction and an ideal gas has no interaction energy so maybe we need a better model than the ideal gas.

How about a van der Waals gas? You showed in problem 7 of homework 7 that for a monatomic van der Waals gas
$$H(\tau, V) = \frac{5}{2} N\tau + \frac{N^2}{V}(b\tau - 2a),$$
where $a$ and $b$ are the van der Waals constants we introduced earlier. The first term is the enthalpy of an ideal gas and the second term is a correction for the long range attraction and short range repulsion of the van der Waals interaction. Consider a temperature defined by
$$\tau_{\rm inv} = \frac{2a}{b} = \frac{27}{4}\tau_c,$$
where $\tau_c$ is the critical temperature defined earlier. The van der Waals correction to the enthalpy changes sign at the inversion temperature, $\tau_{\rm inv}$. At temperatures above the inversion temperature, the van der Waals correction is positive, and increasing the volume (expanding) at constant enthalpy means the temperature must rise. (Not what we want!) Below the inversion temperature, the correction is negative and expanding at constant enthalpy requires that the temperature decrease. Note that the inversion temperature fell out of our van der Waals model. What happens if the van der Waals model is not a good description? It must be that all gases have inversion temperatures, as all gas molecules have negative interaction energies at close (but not too close) separations.

If one combines a heat exchanger with an expansion valve, then one has a Linde cycle. High pressure warm gas comes into the heat exchanger where it gives up some heat to the low pressure cool gas going the other way through the heat exchanger. We assume that the input gas and the output gas have the same temperature. After the high pressure gas leaves the heat exchanger, it goes through the expansion valve where part of the gas liquefies. The remainder exits to the heat exchanger. Of course, the input and output of the heat exchanger must be connected to something, perhaps one or more stages of an expansion cooling device.

The whole process occurs at constant enthalpy as before. Suppose the fraction which liquefies is $\lambda$. Then
$$H_{\rm in} = \lambda H_{\rm liq} + (1 - \lambda) H_{\rm out}.$$
The three enthalpies are the enthalpy of the gas at the input pressure and temperature, the enthalpy of the gas at the output pressure and the same temperature (by the heat exchanger condition on the temperatures) and the enthalpy of the liquid at the temperature at which its vapor pressure is the output pressure. One can solve for the liquefaction fraction,
$$\lambda = \frac{H_{\rm out} - H_{\rm in}}{H_{\rm out} - H_{\rm liq}},$$
so if the output enthalpy is greater than the input enthalpy, some liquid is formed. This will always be true if the Joule-Thomson expansion cools the gas. K&K figure 12.4 shows some calculated values of $\lambda$ for various temperatures and pressures. Values around 20% are typical.
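The two formulas of this section are easy to play with numerically. The sketch below is illustrative only: it works with the enthalpy per atom, $h = H/N = \frac{5}{2}\tau + n(b\tau - 2a)$ with $n = N/V$, in made-up units where $a = b = 1$ (so $\tau_{\rm inv} = 2$), and the enthalpies fed to the liquefaction fraction are hypothetical numbers, not data for any real gas.

```python
def vdw_enthalpy_per_atom(tau, n, a, b):
    """Low density van der Waals enthalpy per atom: (5/2) tau + n (b tau - 2a)."""
    return 2.5 * tau + n * (b * tau - 2.0 * a)


def tau_after_throttle(tau1, n1, n2, a, b):
    """Temperature after a constant enthalpy expansion from density n1 to n2."""
    h = vdw_enthalpy_per_atom(tau1, n1, a, b)
    # Solve h = (5/2) tau2 + n2 (b tau2 - 2a) for tau2.
    return (h + 2.0 * a * n2) / (2.5 + b * n2)


def liquefaction_fraction(h_in, h_out, h_liq):
    """Linde cycle fraction lambda = (H_out - H_in) / (H_out - H_liq)."""
    return (h_out - h_in) / (h_out - h_liq)


A, B = 1.0, 1.0              # hypothetical units, inversion temperature 2a/b = 2
for tau1 in (1.0, 2.0, 3.0):  # below, at, and above the inversion temperature
    tau2 = tau_after_throttle(tau1, 0.1, 0.01, A, B)
    print("start at tau = %.1f, end at tau = %.4f" % (tau1, tau2))

# Hypothetical enthalpies in arbitrary units.
print("liquefaction fraction: %.2f" % liquefaction_fraction(9.0, 10.0, 5.0))
```

Starting below the inversion temperature the expansion cools the gas, at the inversion temperature nothing happens, and above it the gas warms, as argued in the text. The temperature changes are small because the correction term is first order in the density.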
Cooling by Pumping

Once liquid helium has been obtained, pumping on the liquid helium cools the liquid by evaporation. In the steady state, the temperature of the helium is set by the vapor pressure maintained by the vacuum pump. Typical temperatures that can be reached are from a few tenths of a Kelvin to a few Kelvin. Table 12.2 in K&K lists helium vapor pressures versus temperature for both $^3$He and $^4$He.
The Helium Dilution Refrigerator

As we remarked last time, at low temperatures $^3$He and $^4$He become immiscible. As $\tau \to 0$, the equilibrium configuration becomes essentially pure $^3$He floating on $^4$He containing about 6% dissolved $^3$He. With such an arrangement, the chemical potential for $^3$He must be the same in both the pure $^3$He phase and the $^4$He phase. Now suppose we have such an arrangement, but we replace the $^4$He containing dissolved $^3$He with pure $^4$He. Then the $^3$He “evaporates” into the $^4$He until the “vapor pressure” of the $^3$He in the $^4$He reaches equilibrium (which happens at 6% concentration as $\tau \to 0$). In this evaporation process, latent heat is used up, and the liquid phase $^3$He cools. If we arrange to replenish the pure $^4$He in a cyclic or continuous method, we can make a continuous operation refrigerator. K&K describe several methods that are used to replenish the $^4$He including distillation.

The latent heat that’s released is just the difference in chemical potentials. The internal energy of the pure $^3$He is proportional to the Fermi energy (presumably constant) plus a term proportional to $\tau^2$. (See the discussion in lecture 16 on the heat capacity of a cold Fermi gas.) As the temperature goes down, the latent heat available goes down and the practical limit of a dilution refrigerator is about 10 mK.
Isentropic Demagnetization

This technique is usually called adiabatic demagnetization. We’ve already mentioned it in connection with our magnetic spin system that we discussed in lecture 4. Consider a paramagnetic system. We don’t want ferromagnetic interactions, and we need intrinsic magnetic moments, so we don’t want a diamagnetic system either. Suppose the system is in contact with a thermal bath (probably provided by liquid $^3$He!) at temperature $\tau_1$ and an external magnetic field $B_1$ is applied. We need $\mu B_1/\tau_1 > 1$. When this is satisfied, most of the magnetic moments are aligned and the entropy of the magnetic system is very low. In fact, the distribution of spins depends only on $\mu B/\tau$, which means the entropy can only depend on $\mu B/\tau$. Now disconnect the spin system from the thermal bath and decrease the field to $B_2$ without changing the distribution of spins (leaving the distribution of spins constant is the isentropic part). Then the entropy doesn’t change. Since the entropy didn’t change it must be that $\mu B/\tau$ remained constant while the field was lowered. So the temperature had to change (it’s basically dialed in with the field). The final temperature satisfies
$$\tau_2 = \tau_1 \frac{B_2}{B_1}.$$
The temperature of the spin system can be lowered substantially by this technique.

The spin system isn’t completely isolated; it’s still in weak thermal contact with the lattice vibrations (phonons) of whatever solid the spins are contained in. (We assume the solid is an insulator, so we don’t need to worry about the electronic contribution to the heat capacity.) At low enough temperatures, the energy that can be absorbed by the spin system can be comparable to the phonon energy content of the lattice. With an electron paramagnetic system, temperatures in the milli-Kelvin or fraction of a milli-Kelvin range are possible. (I think K&K figure 12.9 must have its temperature axis mislabeled. It must be the final temperature in mK, not K!)

The same trick can be played with nuclear magnetic moments which are several orders of magnitude weaker than electron magnetic moments. Nuclear adiabatic demagnetization must start from temperatures in the milli-Kelvin range and can reach temperatures around the micro-Kelvin range.

You might notice that the final temperature is proportional to the final magnetic field. Why not let the final magnetic field be zero? Then the temperature of the spin system (before it comes to equilibrium with the lattice vibrations) would be zero. Wrong! The external field can be taken to zero, but there will still be some internal magnetic field due to the magnetic interactions and inhomogeneities on the atomic scale. So there is some effective final value of $B$ that’s not zero which limits the lowest temperature the spin system may reach.
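To make the statement that the entropy depends only on $\mu B/\tau$ concrete, here is a minimal sketch for the simplest case. It assumes independent spin-1/2 moments with energies $\mp\mu B$, for which the entropy per spin is $\sigma = \ln(2\cosh y) - y\tanh y$ with $y = \mu B/\tau$. The function names, fields, and temperatures are illustrative choices, not values from K&K.

```python
import math


def spin_entropy(y):
    """Entropy per spin of a two state paramagnet, with y = mu B / tau."""
    return math.log(2.0 * math.cosh(y)) - y * math.tanh(y)


def final_temperature(tau1, b1, b2):
    """Temperature after isentropic demagnetization: tau2 = tau1 * B2 / B1."""
    return tau1 * b2 / b1


# Entropy runs from ln 2 (random spins) toward 0 (aligned spins).
for y in (0.0, 1.0, 5.0):
    print("mu B / tau = %.1f, entropy per spin = %.5f" % (y, spin_entropy(y)))

# Hypothetical run: start at 1 K in 1 T, reduce the field to 0.01 T.
t2 = final_temperature(1.0, 1.0, 0.01)
print("final temperature: %.3f K" % t2)
```

Lowering $B$ and $\tau$ by the same factor leaves $y$, and therefore the entropy, unchanged, which is all that the formula for $\tau_2$ says. The residual internal field mentioned above would enter as a floor on the value of $B_2$ passed to the function.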
Laser and Evaporative Cooling

As we’ve already remarked, more modern techniques for getting really cold involve atom traps and laser and evaporative cooling. See the list of articles in lecture 17.

To get a feel for laser cooling, consider an atom in the path of two lasers pointed in opposite directions (actually it’s the same laser, but with two beams directed at the atom from opposite directions). This will suffice for “1D cooling.” To get the other two dimensions cold, we need 4 more beams. The lasers are tuned just below a transition energy of the atom (from the ground state to some excited state). If the atom is at rest (very, very cold), it sees the laser beam as off resonance and nothing happens. If the atom moves towards one of the laser beams, that beam appears blue shifted to the atom and starts to be on resonance. If the atom absorbs some of the light it gets a kick opposite to its velocity, slows down and is cooled!

What I’ve just described is way over-simplified. The coherent oppositely directed lasers form a standing wave. A proper treatment of the situation involves a quantum treatment of the atom and its electron states in the standing wave field of the laser. Nevertheless, the simplified treatment gives an idea of what’s going on.

Evaporative cooling is just what it sounds like: the high energy atoms are allowed to leave the trap and what’s left behind are the lower energy and cooler atoms. Various tricks are used to produce interactions among the atoms in order that some will get some excess energy and evaporate. The article on a Fermi gas cited in lecture 17 describes such a trick.
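The over-simplified picture of laser cooling can be put into numbers using only the first order Doppler shift, $\Delta f = v/\lambda$. The sketch below is a toy illustration: the function name, the 780 nm wavelength, the 10 MHz detuning, and the 5 m/s atomic speed are all hypothetical values chosen for the example, and nothing here accounts for the standing wave or the quantum treatment mentioned above.

```python
def detunings_seen_by_atom(laser_detuning_hz, velocity, wavelength):
    """Detuning of each beam in the atom's frame, to first order in v/c.

    laser_detuning_hz is (laser frequency - transition frequency) for an
    atom at rest; it is negative for a laser tuned below the transition.
    Returns (opposing, copropagating): the beam the atom moves toward is
    blue shifted by v / wavelength, the other is red shifted by the same.
    """
    shift = velocity / wavelength
    return laser_detuning_hz + shift, laser_detuning_hz - shift


# Hypothetical numbers: laser 10 MHz below resonance, 780 nm light, 5 m/s atom.
opposing, copropagating = detunings_seen_by_atom(-10.0e6, 5.0, 780.0e-9)
print("opposing beam detuning:      %.2f MHz" % (opposing / 1.0e6))
print("copropagating beam detuning: %.2f MHz" % (copropagating / 1.0e6))
```

With the laser tuned below the transition, the beam opposing the motion is always the one closer to resonance, so the atom preferentially absorbs photons that slow it down, whichever way it happens to be moving.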
Physics 301
22-Nov-2002 28-1
Semiconductor Basics

In solids, the electron energy levels are organized into bands. How does this come about? Imagine a regular crystal lattice composed of identical atoms. Imagine that the size of the crystal can be varied. So we can consider the crystal to have an atomic spacing anywhere from its actual spacing to very large separations between the atoms. What are the energy levels when the atoms are far apart? The energy levels are just the atomic levels but the degeneracy of each atomic level is multiplied by $N$, the total number of atoms in our crystal. Now imagine bringing the atoms closer together. Eventually, they will be close enough that the electrons in neighboring atoms can begin to have weak interactions. Each atomic energy level then gives rise to $N$ very closely spaced levels corresponding to the original energy. As the atoms are brought still closer together, the interactions among neighboring atoms increase and the spreading out of the original atomic levels into a range of levels increases. In effect, each atomic level becomes a band of extremely closely spaced levels in the crystal. There can be gaps between bands and bands can overlap.

The bands are most important for the energy levels of the outer electrons. The wave functions of the inner electrons barely overlap from one atom to the next, so there is very little spreading out of the energy levels of the inner electrons. The inner electrons can be thought of as belonging to a specific atom. They are localized at the site of the atom. On the other hand the outer electrons are more properly thought of as bound to the crystal rather than any particular atom. Of the most interest for what we’re about to discuss are the bands of the outermost electrons.

What’s the difference between an insulator, a semiconductor, and a conductor? Consider a crystal at $\tau = 0$ (and suppose it doesn’t become a superconductor!). The energy levels available to the outer electrons are in bands as we’ve discussed.
The number of levels within a band is just the number of atoms in the crystal. At absolute 0, electrons fill up the energy levels until they reach the Fermi energy. Where the Fermi energy occurs with respect to a band edge is critical for determining whether the material is an insulator or a conductor. If a band is just filled, so there is an appreciable energy gap to the next available level, the material will be an insulator. To make any change to the electron distribution, for example, to produce a distribution of electrons with a non-zero average velocity, requires that electrons be given enough energy to overcome the gap between bands. If the gap is substantial (several electron volts, say) ordinary electric fields will not do it. (Several electron volts is the energy of a photon of visible light!) On the other hand, if a band is only partially filled, or if the gap from a filled band to the next higher empty band is very small, then it is easy to give electrons enough energy to change their distribution and in particular to put them in a distribution with a net average velocity in which case they are carrying an electric current. An insulator has a filled band with a big gap to the next band. A conductor has a partially filled band or else a filled band with a very small gap to the next band. If the temperature is not zero, then the electron distribution does not end abruptly c 2002, Princeton University Physics Department, Edward J. Groth Copyright
at the Fermi energy. Instead, there will be electrons in states with energies above the Fermi level and there will be holes (missing electrons) in states at energies below the Fermi level. If at τ = 0 a band was filled with a gap to the next band, then if the gap is not too large, there may be an appreciable number of electrons thermally excited to the next band (leaving the same number of holes behind). In this case, there aren’t as many charge carriers as in a good conductor, but there are more than in an insulator, so we have a semiconductor. Semiconductors are technologically useful because their electrical properties are relatively easy to control. We will investigate some of the properties of semiconductors starting with the electron distribution.
Electron Distribution in Semiconductors

We will consider a situation where the valence band is completely full at $\tau = 0$ but the conduction band is completely empty. This is in the absence of doping by impurities. With doping, there may be some excess electrons that must reside in the conduction band, even at $\tau = 0$, or there may be some holes (a deficit of electrons) so the valence band is not completely full at $\tau = 0$. The energy of the top of the valence band is denoted by $\epsilon_v$, the bottom of the conduction band is $\epsilon_c$, and the energy gap is $\epsilon_g = \epsilon_c - \epsilon_v$. The concentrations of electrons in the conduction band and holes in the valence band are denoted by $n_e$ and $n_h$. With a pure semiconductor, they are equal so that the semiconductor is electrically neutral.

But most of the fun with semiconductors comes from the doping. Impurity atoms which provide an extra electron are called donors; impurity atoms which take up an electron from the valence band, leaving a hole, are called acceptors. Silicon is the basis of much of the semiconductor industry. An atom from the periodic table column just to the right of silicon is a donor. Phosphorus is often used. An atom from the periodic table column just to the left of silicon is an acceptor. Boron is often used. Of course, not all donors and acceptors will be ionized, so the concentration of excess electrons will not be exactly the same as $n_d$, the concentration of donor atoms. Instead, consider the concentrations of donor and acceptor ions, $n_d^+$ and $n_a^-$. Then
$$\Delta n = n_d^+ - n_a^-$$
is the net ionized donor concentration, or the net concentration of positive charges from the ions. If the sample is electrically neutral, this must be balanced by the net concentration of negative charges in electrons and holes in the conduction and valence bands,
$$n_e - n_h = \Delta n = n_d^+ - n_a^- .$$
The electron distribution (that is, the probability that a state of energy $\epsilon$ is occupied) is the Fermi-Dirac distribution (see lectures 13 and 16),
$$f_e(\epsilon) = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} .$$
The chemical potential of the electrons is $\mu$. At this point, we have to mention a little jargon. Recall that when we discussed Fermi gases earlier, we used the term “Fermi energy,” and the symbol $\epsilon_F$, to denote the chemical potential of the electrons at $\tau = 0$; that is, the energy of the highest filled state at $\tau = 0$. In the world of semiconductors, the term “Fermi level,” and the symbol $\epsilon_F$ are used to denote the chemical potential of the electrons at non-zero temperature. We will follow K&K and continue to let $\epsilon_F$ denote the chemical potential at $\tau = 0$ and we will use $\mu$ for the chemical potential at other temperatures.

The total number of conduction electrons is given by summing the electron distribution over all states in the conduction band,
$$N_e = \sum_{\rm CB} f_e(\epsilon) .$$
Note that this is the total number, not the concentration, since we’re summing over all states.

A hole in the valence band is the absence of an electron. The probability that a state is occupied by a hole (that is, there’s not an electron in the state) plus the probability that the state is occupied by an electron must sum to one since these events are mutually exclusive and are the only things that can happen. The distribution function for holes is then
$$f_h(\epsilon) = 1 - f_e(\epsilon) = 1 - \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} = \frac{1}{e^{(\mu - \epsilon)/\tau} + 1} ,$$
and curiously, a hole (I keep wanting to say anti-electron, but it’s not really!) looks as though it has energy and chemical potential opposite to that of the electron, just as it has charge opposite to that of the electron. In any case, the total number of holes in the valence band requires a sum over the states in the valence band
$$N_h = \sum_{\rm VB} f_h(\epsilon) .$$
These sums of the occupancies over states are similar in principle to the sums we did for the free electron gas. The big difference has to do with the number of states as a function of energy. In the free electron gas, we were able to use particle in a box states to get the number of states versus energy. With bands, the density of states is determined by the lattice structure, the doping, and so on. We will be making appropriate approximations!

K&K table 13.1 gives some data for popular semiconductors. The energy gap at room temperature is near an electron volt for most of the entries in the table. Recall that at room temperature $\tau = kT \approx \frac{1}{40}$ eV, so thermal energies are small compared to the gap energies. Even the small 0.18 eV gap of InSb is large compared to thermal energies. At least in pure (undoped) semiconductors, we expect to find the chemical potential somewhere in the middle of the gap. So let’s start by assuming that this is the case and we can check our results to see if they’re consistent with this assumption.
Consider the occupancy at the conduction band edge. It involves the exponential $\exp((\epsilon_c - \mu)/\tau)$. By the assumptions we’ve just made, the argument of the exponential is large and the exponential is even larger. The same argument applies but more so for higher energy states in the conduction band. Then the electron distribution is
$$f_e(\epsilon) = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} \approx e^{-(\epsilon - \mu)/\tau} \ll 1 .$$
Similarly, the hole distribution is
$$f_h(\epsilon) = \frac{1}{e^{(\mu - \epsilon)/\tau} + 1} \approx e^{-(\mu - \epsilon)/\tau} \ll 1 .$$
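As a quick numerical check of how good the classical form is, here is a minimal sketch comparing the full Fermi-Dirac occupancy with the Boltzmann factor. The function names and the sample point $(\epsilon - \mu)/\tau = 20$ (roughly a 1 eV gap with $\mu$ at mid-gap and $\tau \approx 1/40$ eV) are illustrative choices, not part of the lecture.

```python
import math

def fermi_dirac(x):
    """Electron occupancy f_e for x = (eps - mu)/tau."""
    return 1.0 / (math.exp(x) + 1.0)

def hole_occupancy(x):
    """Hole occupancy f_h = 1 - f_e, written as 1/(exp(-x) + 1)."""
    return 1.0 / (math.exp(-x) + 1.0)

def boltzmann(x):
    """Classical approximation exp(-x), valid when x >> 1."""
    return math.exp(-x)

# Assumed sample point: a state 20 tau above the chemical potential.
x = 20.0
rel_err = abs(fermi_dirac(x) - boltzmann(x)) / fermi_dirac(x)
print(f"f_e = {fermi_dirac(x):.3e}  Boltzmann = {boltzmann(x):.3e}  rel. error = {rel_err:.1e}")
```

The relative error of the Boltzmann form is itself $e^{-x}$, so it is utterly negligible at mid-gap and only becomes a concern once $\mu$ comes within a few $\tau$ of a band edge.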
So, with our assumptions, both electrons and holes are distributed “classically.” Of course, they’re not really, but we managed to get the distribution functions in a form that looks like the classical Boltzmann factor. What’s actually going on is that the step from 100% occupancy to 0 occupancy is hidden in the band gap. There’s a small exponential tail sticking into the conduction band and there’s one minus a small exponential tail sticking into the valence band. By considering holes, not electrons, as the occupants of the valence band, we convert this into an exponential tail of holes sticking into the valence band.

Returning to the total number of electrons in the conduction band, we have
$$N_e = \sum_{\rm CB} e^{-(\epsilon - \mu)/\tau} = e^{-(\epsilon_c - \mu)/\tau} \underbrace{\sum_{\rm CB} e^{-(\epsilon - \epsilon_c)/\tau}}_{N_c} .$$
The expression for $N_c$ looks very much like a single particle partition function. We’re just summing Boltzmann factors over all the energy states in the conduction band. When we did this sum for a single particle in a box, we came up with $Z_1 = n_Q V$ with $n_Q = (m\tau/2\pi\hbar^2)^{3/2}$. The $\tau^{3/2}$ dependence comes about because the energy depends on momentum squared and the single particle states are uniformly distributed in 3 dimensions of momentum. (When we integrate, our volume element is $dp_x\,dp_y\,dp_z$ and we need a $\tau^{1/2}$ for each momentum.) Since we have a Boltzmann factor, only the states near $\epsilon_c$ will be important in determining $N_c$. Assuming we knew the density of states, we might expect that the integral we would write would again have a volume element $dp_x\,dp_y\,dp_z$, so we might expect to get a $\tau^{3/2}$ temperature dependence. Then we can account for the actual density of states by introducing an effective mass in place of the electron mass in the quantum concentration. So, we wind up with
$$N_c = 2\left(\frac{m_e^*\tau}{2\pi\hbar^2}\right)^{3/2} V = n_c V ,$$
where the factor of 2 accounts for the two spin states of the electron, $n_c$ is the conduction electron quantum concentration, and $m_e^*$ is called the density of states effective mass. This
means it’s the fudge factor to be used in the density of states to get the overall number $N_c$ to come out right. K&K table 13.1 lists these masses as well as the equivalent quantities for holes. Putting all this together, we have
$$n_e = n_c\, e^{-(\epsilon_c - \mu)/\tau} ,$$
and for our earlier assumptions concerning the size of the gap compared with thermal energies to be valid, we must have $n_e \ll n_c$.

We do the same thing with the holes,
$$N_h = \sum_{\rm VB} e^{-(\mu - \epsilon)/\tau} = e^{-(\mu - \epsilon_v)/\tau} \underbrace{\sum_{\rm VB} e^{-(\epsilon_v - \epsilon)/\tau}}_{N_v} .$$
The quantum concentration of the holes is
$$n_v = N_v/V = 2\left(\frac{m_h^*\tau}{2\pi\hbar^2}\right)^{3/2} ,$$
where $m_h^*$ is the density of states effective mass for the holes. Then the hole concentration is
$$n_h = n_v\, e^{-(\mu - \epsilon_v)/\tau} .$$
Note that the two effective masses are really just proportionality constants to get the number of electrons and holes to come out right. They need not be equal, even in the same semiconductor. It should also be pointed out that effective masses occur in other contexts in condensed matter physics. In particular, as an electron or hole moves through a semiconductor, it interacts with the surrounding atoms and electrons, perhaps causing them to move as well, and the electron may have a “dynamical” effective mass that’s different from the actual mass. The two quantum concentrations $n_c$ and $n_v$ are also known as the effective density of states in the conduction and valence bands.
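To get a feel for the size of these quantum concentrations, here is a small sketch that evaluates $2(m^*\tau/2\pi\hbar^2)^{3/2}$ in SI units and converts to cm$^{-3}$. The function name and the choice of $m^* = m_e$ at 300 K are assumptions made for illustration; real effective masses should be taken from K&K table 13.1.

```python
import math

# CODATA SI constants
HBAR = 1.054571817e-34    # J s
K_B = 1.380649e-23        # J / K
M_E = 9.1093837015e-31    # kg

def band_quantum_concentration(mass_ratio, temperature_k):
    """Return 2 (m* k_B T / 2 pi hbar^2)^(3/2) in cm^-3, with m* = mass_ratio * m_e."""
    m_eff = mass_ratio * M_E
    n_per_m3 = 2.0 * (m_eff * K_B * temperature_k / (2.0 * math.pi * HBAR**2)) ** 1.5
    return n_per_m3 * 1e-6    # m^-3 -> cm^-3

# Assumed illustration: free-electron mass at room temperature.
n_free = band_quantum_concentration(1.0, 300.0)
print(f"n_c for m* = m_e at 300 K: {n_free:.2e} cm^-3")
```

The result is a few times $10^{19}$ cm$^{-3}$, which is why carrier concentrations of $10^{15}$ to $10^{17}$ cm$^{-3}$ comfortably satisfy $n_e \ll n_c$.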
Law of Mass Action for Semiconductors

We have expressions for the electron and hole concentrations in the conduction and valence bands. If we multiply them together, the chemical potential drops out,
$$n_e n_h = n_c\, e^{-(\epsilon_c - \mu)/\tau}\, n_v\, e^{-(\mu - \epsilon_v)/\tau} = n_c n_v\, e^{-(\epsilon_c - \epsilon_v)/\tau} = n_c n_v\, e^{-\epsilon_g/\tau} .$$
If we have a pure semiconductor, with no donors or acceptors, then the electron and hole concentrations must be equal. The common value is called the intrinsic carrier concentration and the semiconductor is said to be an intrinsic semiconductor. This concentration is
$$n_i = (n_c n_v)^{1/2}\, e^{-\epsilon_g/2\tau} .$$
So, we can write the previous expression as
$$n_e n_h = n_i^2(\tau) ,$$
where the dependence on temperature has been explicitly indicated as a reminder. There’s a temperature in an exponential and three powers of temperature in the product of the quantum concentrations. Probably the earlier expression is to be preferred, since it hides less of what’s going on. On the other hand, the expression in terms of $n_i^2$ is certainly more elegant and also, I believe, it’s possible to measure $n_i$ more or less directly without having to measure the energy gap and density of states.

In any case, what we’ve just come up with is known as the law of mass action for semiconductors. The key assumption we made is that the chemical potential is far enough away from both the valence and conduction band edges that we can approximate a Fermi-Dirac distribution (in both bands) as a classical Maxwell-Boltzmann distribution. The law of mass action will continue to apply (with essentially the same $n_i^2$) for doped semiconductors provided the chemical potential does not move too close to one of the band edges and our assumption that the electron and hole distributions are classical remains valid.

Why would the chemical potential move around? Just as with the standard Fermi-Dirac ideal gas, the chemical potential determines the number of particles. For example, if a semiconductor is doped with donors, it will have more electrons and fewer holes, the chemical potential shifts towards the conduction band, the electrons are less strongly cut off by the Boltzmann factor, and the holes are more strongly cut off by the Boltzmann factor.

As a point of interest, we can use the data in table 13.1 of K&K to calculate the intrinsic carrier concentration for silicon at room temperature. The result is $n_{i,{\rm Si}}(300\ {\rm K}) = 4.8 \times 10^9$ cm$^{-3}$.

In the case of an intrinsic semiconductor, we can calculate the chemical potential in terms of the other quantities we’ve defined. We have
$$n_e = n_c\, e^{-(\epsilon_c - \mu)/\tau} = n_h = n_v\, e^{-(\mu - \epsilon_v)/\tau} .$$
Isolating the chemical potential gives
$$e^{2\mu/\tau} = \frac{n_v}{n_c}\, e^{(\epsilon_c + \epsilon_v)/\tau} ,$$
and taking the logarithm gives
$$\mu = \frac{1}{2}(\epsilon_c + \epsilon_v) + \frac{3\tau}{4}\log\frac{m_h^*}{m_e^*} .$$
For an intrinsic semiconductor, the chemical potential is at the center of the gap plus a small offset which comes from the differing densities of states at the two band edges. If, for example, $m_h^* > m_e^*$, this means that there is a greater density of hole states than electron states. The correction increases $\mu$ past the gap center to provide a greater occupancy of the electron states relative to the hole states so that there will be equal numbers of electrons and holes. The logarithmic term remains a small correction, even for rather large effective mass ratios, because we’re working in the regime where thermal energies ($\sim \tau$) are small compared to $\epsilon_g$.
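The intrinsic concentration and the offset of $\mu$ from mid-gap are easy to evaluate numerically. The sketch below does so for silicon-like inputs; the values of $n_c$ and $n_v$ are assumed for illustration (they should be checked against K&K table 13.1), and the offset is written in the equivalent form $(\tau/2)\log(n_v/n_c)$.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant, eV / K

def intrinsic_concentration(n_c, n_v, gap_ev, tau_ev):
    """n_i = sqrt(n_c n_v) exp(-eps_g / 2 tau)."""
    return math.sqrt(n_c * n_v) * math.exp(-gap_ev / (2.0 * tau_ev))

def intrinsic_mu_offset(n_c, n_v, tau_ev):
    """mu - (eps_c + eps_v)/2 = (tau/2) log(n_v / n_c), in eV."""
    return 0.5 * tau_ev * math.log(n_v / n_c)

tau = K_B_EV * 300.0
# Assumed silicon-like inputs in cm^-3 and eV (not authoritative values).
n_c, n_v, gap = 2.7e19, 1.1e19, 1.14
n_i = intrinsic_concentration(n_c, n_v, gap, tau)
offset = intrinsic_mu_offset(n_c, n_v, tau)
print(f"n_i = {n_i:.2e} cm^-3, mu offset from mid-gap = {offset * 1e3:.1f} meV")
```

With these inputs $n_i$ comes out at a few times $10^9$ cm$^{-3}$ and the offset is of order 10 meV, tiny compared to the gap, as claimed.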
Electron Distribution in Doped Semiconductors

As already mentioned, if we have a doped semiconductor the electron and hole concentrations are no longer equal but must adjust to maintain charge neutrality,
$$n_e - n_h = \Delta n = n_d^+ - n_a^- .$$
If the occupancies are small enough that the classical approximation remains valid, then the law of mass action provides another equation relating $n_e$ and $n_h$. (We assume for the moment that $\Delta n$ is known.) Solving for the concentrations, we have
$$n_{e,h} = \frac{1}{2}\left(\sqrt{\Delta n^2 + 4n_i^2} \pm \Delta n\right) ,$$
where the upper sign gives $n_e$ and the lower sign gives $n_h$.
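The algebra behind this solution is short. Eliminating $n_h$ with the law of mass action gives a quadratic in $n_e$, and the positive root is the physical one:

```latex
n_h = \frac{n_i^2}{n_e}
\;\Rightarrow\;
n_e - \frac{n_i^2}{n_e} = \Delta n
\;\Rightarrow\;
n_e^2 - \Delta n\, n_e - n_i^2 = 0
\;\Rightarrow\;
n_e = \tfrac{1}{2}\left(\sqrt{\Delta n^2 + 4 n_i^2} + \Delta n\right),
\qquad
n_h = n_e - \Delta n = \tfrac{1}{2}\left(\sqrt{\Delta n^2 + 4 n_i^2} - \Delta n\right).
```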
In many cases of practical interest, the doping concentration is much greater than the intrinsic concentration and we can expand the square root
$$\sqrt{\Delta n^2 + 4n_i^2} \approx |\Delta n| + \frac{2n_i^2}{|\Delta n|} .$$
For an n-type semiconductor (donors), $\Delta n > 0$, and
$$n_e \approx \Delta n , \qquad n_h \approx \frac{n_i^2}{\Delta n} .$$
For a p-type semiconductor (acceptors), $\Delta n < 0$, and
$$n_e \approx -\frac{n_i^2}{\Delta n} , \qquad n_h \approx -\Delta n .$$
In this limit, the semiconductor is called an extrinsic semiconductor and whichever carrier is the majority carrier has a concentration equal to the net concentration of dopants while the other carrier is inversely proportional to the dopant concentration.

You can see that in an extrinsic n-type semiconductor, the chemical potential must move towards the conduction band edge compared to its location in the intrinsic case. Similarly, a p-type semiconductor has a lower chemical potential than the corresponding intrinsic semiconductor. In fact, if we assume that the doping doesn’t change $n_i$, the effective masses, and the gap width, we can solve for the chemical potential, much as we did before, and we get
$$\mu_{n,p} = \frac{1}{2}(\epsilon_c + \epsilon_v) + \frac{3\tau}{4}\log\frac{m_h^*}{m_e^*} \pm \frac{\tau}{2}\log\frac{\Delta n^2}{n_i^2} ,$$
where we’ve assumed that $|\Delta n| \gg n_i$. The upper sign applies to n-type and the lower sign applies to p-type semiconductors. You might think that the additional term invalidates our classical treatment of electrons and holes when we get to high temperatures. Actually the problem is at low temperatures because $n_i$ goes to 0 as the temperature goes to 0. As the temperature is lowered, $\mu$ goes to one or the other band edge and we must use the full Fermi-Dirac distribution.
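Here is a minimal sketch of the doped case: it solves neutrality plus mass action for $n_e$ and $n_h$, and reports the shift of $\mu$ from its intrinsic value in units of $\tau$. The function names and the doping level $\Delta n = 10^{16}$ cm$^{-3}$ are assumptions for illustration; $n_i$ is the silicon value quoted earlier.

```python
import math

def carrier_concentrations(delta_n, n_i):
    """Solve n_e - n_h = delta_n together with n_e * n_h = n_i**2.

    The majority carrier comes from the quadratic formula; the minority
    carrier then follows from mass action, which avoids subtracting two
    nearly equal numbers when |delta_n| >> n_i.
    """
    root = math.sqrt(delta_n**2 + 4.0 * n_i**2)
    majority = 0.5 * (root + abs(delta_n))
    minority = n_i**2 / majority
    return (majority, minority) if delta_n >= 0 else (minority, majority)

def mu_shift_over_tau(delta_n, n_i):
    """(mu - mu_intrinsic)/tau = log(n_e / n_i), from n_e = n_c exp(-(eps_c - mu)/tau)."""
    n_e, _ = carrier_concentrations(delta_n, n_i)
    return math.log(n_e / n_i)

n_i = 4.8e9                                     # cm^-3, silicon at 300 K
n_e, n_h = carrier_concentrations(1e16, n_i)    # assumed n-type doping
print(f"n_e = {n_e:.3e}, n_h = {n_h:.3e}, shift = {mu_shift_over_tau(1e16, n_i):.2f} tau")
```

For this doping the minority concentration is only a few thousand holes per cm$^3$, and $\mu$ sits roughly $15\tau$ above its intrinsic position, in line with $\pm(\tau/2)\log(\Delta n^2/n_i^2)$.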
Physics 301
Homework No. 9
Due 26-Nov-2002
H9-1
1. K&K, chapter 11, problem 2. Note that when we discussed the helium mixture in lecture, we ignored the effect you’re about to discover!
2. K&K, chapter 11, problem 3.
3. K&K, chapter 11, problem 5.
4. K&K, chapter 12, problem 1.
5. K&K, chapter 12, problem 2.
6. K&K, chapter 12, problem 4.
7. K&K, chapter 12, problem 5.
Physics 301
1-Dec-2002
Physics 301 Problem Set 9 Solutions

Problem 1. At zero temperature, there is no entropy contribution and the free energy is equal to the energy, $f(x) = u(x)$. The fact that there is a finite residual solubility of $^3$He in $^4$He means then that the curve $u(x)$ has a local minimum at $x \approx 0.94$, and it curves slightly upwards after that. On the left, the minimum is at $x = 0$. Between the two points, it has a local maximum as usual. When the He mixture was discussed, we ignored the interaction energies completely; this has to be taken into account when the fraction of $^3$He is small.
Problem 2. For either of the phases, the non-mixing part of the free energy is a linear function $f_0(x) = f_0(0) + x f_0'(0)$ for small $x$. The mixing entropy (per molecule) is $\sigma_m = -[(1-x)\log(1-x) + x\log x]$. If we ignore the mixing energy (or consider it to be part of the linear approximation), then the full free energy is $f(x) = f_0(0) + x f_0'(0) + \tau[(1-x)\log(1-x) + x\log x]$. The minimum of this, where $\partial f/\partial x = 0$, is at $x = \exp(-f_0'/\tau)$ for small $x$. The fact that the liquid mixture is in equilibrium with the solid mixture means that $x$ is between the minima of the solid and liquid free energy curves. The ratio of the concentrations in the solid and liquid phases is then $k = x_s/x_l = \exp(-(f_{0s}' - f_{0l}')/\tau)$. For the values given, $f_{0s}' - f_{0l}' = 1$ eV, and $\tau = k_B \times 1000\ {\rm K} = 1000/(300 \times 40)\ {\rm eV} \approx 1/12$ eV, which gives $k = e^{-12} \approx 6 \times 10^{-6}$.
Problem 3. We have a finite amount of gold on an (infinite) crystal of silicon. At a temperature above the melting point of the eutectic, the composition will therefore be a lot of pure silicon and a little bit of homogeneous liquid with a certain ratio of elements which we can read off from the graph as the point of intersection of the horizontal line at the given temperature and the right ‘wing’ of the curve. (All the gold will have melted into this liquid.) Now we assume that the diffusion only occurs in the vertical direction. The height of the liquid layer is then given (in Angstroms) by $1000\, n_{Au} = x h n_{liq}$, where $x$ is the ratio of constituents and $n$ is the number density (the equation equates the number of gold atoms before and after the diffusion). The density of the homogeneous liquid, $n_{liq}$, is given by $1/n_{liq} = x/n_{Au} + (1-x)/n_{Si}$. The height of the liquid layer is then
$$h = \frac{1000\, n_{Au}}{x\, n_{liq}} = 1000\left(1 + \frac{(1-x)\, n_{Au}}{x\, n_{Si}}\right) .$$
We are given the mass densities $\rho_{Au} = 19.3$ g/cc and $\rho_{Si} = 2.33$ g/cc. The number densities are then $n_{Au} = 5.91 \times 10^{22}$/cc and $n_{Si} = 4.99 \times 10^{22}$/cc.
At 400 °C, the ratio is $x = 0.31$, which gives $h \approx 3640$ Å, and at 800 °C, the ratio is $x = 0.43$, which gives $h \approx 2570$ Å.

Problem 4. The liquefaction coefficient is $\lambda = (H_{out} - H_{in})/(H_{out} - H_{liq})$. The denominator is the sum of the latent heat and the difference in enthalpies of the gas in the out and in stages. In the ideal gas approximation, this is $H_{out} - H_{liq} \approx \Delta H + (5/2)N(\tau_{in} - \tau_{liq})$. The numerator for a van der Waals gas is
$$H_{out} - H_{in} = (5/2)N(\tau_{out} - \tau_{in}) + (N^2/V_{out})(b\tau_{out} - 2a) - (N^2/V_{in})(b\tau_{in} - 2a) .$$
For the liquefaction setup, $\tau_{out} = \tau_{in}$, and $\tau_{liq} = \tau_b$ is the boiling point of the liquid. To first order in $a$ and $b$, we can use the ideal gas equation to write the volumes in terms of the pressures and we get
$$H_{out} - H_{in} = N(b - 2a/\tau_{out})(p_{out} - p_{in}) = Nb(1 - 2a/b\tau_{out})(p_{out} - p_{in}) .$$
To select the coefficients $a$ and $b$, we set the molar volume of helium $V = 32$ cc $= 2Nb$ and the inversion temperature $\tau = k_B \times 51\ {\rm K} = 2a/b$. The latent heat is $\Delta H = 0.082$ kJ/mol, $T_b = 4.18$ K, $T_{in} = 15$ K, and $p_{out}$ is the atmospheric pressure. This gives us the estimate (writing all energies in J, and pressures in atm):
$$\lambda = \frac{Nb(1 - 2a/b\tau_{out})(p_{out} - p_{in})}{\Delta H + (5/2)N(\tau_{in} - \tau_{liq})} = \frac{(32/2) \times 10^{-6}\,(1 - 51/15)(p_{atm} - p_{in})\,10^5}{82 + 2.5\,(15 - 4.18)\,1.38 \times 10^{-23} \times 6 \times 10^{23}} \approx 0.013\,(p_{in} - 1) , \qquad (1)$$
which is approximately the values in the figure.

Problem 5. (a) The fridge is ideal, so there is no loss of entropy, which means that, for a cycle with change in temperature $dT$, the high temperature reservoir fixed at $T_h = T_0$, and the low temperature reservoir (the gas) at variable $T_l = T$, $dQ_h/T_h = dQ_l/T_l$. The work done for the above cycle is $dW = dQ_h - dQ_l = (T_h/T - 1)\,dQ_l = -(T_h/T - 1)\,C_p\,dT$ for cooling the ideal gas, where $C_p = (5/2)R$. Integrating this from $T_0$ to $T_b$ gives $W_1 = -(5/2)R(T_0 - T_b) + (5/2)RT_0\log(T_0/T_b)$. Once the gas has reached boiling point, it remains at a fixed temperature and the heat removed is the latent heat $\Delta H$. The work done for this part is $W_2 = Q_h - Q_l = \Delta H(T_0/T_b - 1)$. Adding all this up gives
$$W = (5/2)RT_0\left(\log(T_0/T_b) - (T_0 - T_b)/T_0\right) + \Delta H\,(T_0 - T_b)/T_b .$$
(b) We have $T_0 = 300$ K, $T_b = 4.18$ K, $\Delta H = 82$ J/mol. Plugging in the numbers gives us $W = 0.228$ kW hr/l.
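The arithmetic in part (b) can be reproduced with a few lines. This sketch assumes that “per litre” means per litre of liquid helium and uses the 32 cc molar volume from Problem 4 for the conversion; the variable names are illustrative.

```python
import math

R = 8.314                  # J / (mol K)
T0, TB = 300.0, 4.18       # K
DELTA_H = 82.0             # J / mol
MOLAR_VOLUME_CC = 32.0     # cc / mol of liquid, value used in Problem 4 (assumed here)

# Work to cool one mole of gas from T0 to TB with an ideal refrigerator
cool_gas = 2.5 * R * T0 * (math.log(T0 / TB) - (T0 - TB) / T0)   # J / mol
# Work to remove the latent heat at TB and reject it at T0
condense = DELTA_H * (T0 - TB) / TB                               # J / mol
work_per_mol = cool_gas + condense

mol_per_litre = 1000.0 / MOLAR_VOLUME_CC
work_kwh_per_litre = work_per_mol * mol_per_litre / 3.6e6
print(f"W = {work_per_mol:.0f} J/mol = {work_kwh_per_litre:.3f} kW hr per litre of liquid")
```

Most of the work goes into cooling the gas rather than condensing it, even though the Carnot factor for the last step is about 70.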
Problem 6. The heat load is 0.1 W, which is the amount of heat being extracted per second. This is equal to $\Delta H\,(dN/dt)$ where $\Delta H = 82$ J/mol is the latent heat of vaporization, and $dN/dt$ is the number of moles being extracted per second. The extraction is happening at room temperature, and so $dN/dt = (P/RT_{room})\,dV/dt = (P/P_{room})(1/24)(100)$ mol/s. This gives us $0.1 = 82\,(P/P_{room})(1/24)(100) \Rightarrow P = 0.22$ torr. This is the vapor pressure of the helium gas, and we read off the temperature to be $T_{min} \approx 1$ K. For the smaller heat load of $10^{-3}$ W and a faster pump speed of $10^3$ l/s, we have $P = 2.2 \times 10^{-4}$ torr, and we read off the temperature to be $T_{min} \approx 0.6$ K.
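The pressure balance above is a one-liner, sketched here for both cases. The function name is hypothetical, and the 100 l/s pump speed for the first case is read off from the factor $(1/24)(100)$ in the solution.

```python
DELTA_H = 82.0           # J / mol, latent heat of vaporization
MOLAR_VOLUME_L = 24.0    # litres / mol of gas at room temperature and 1 atm
TORR_PER_ATM = 760.0

def vapor_pressure_torr(heat_load_w, pump_speed_l_per_s):
    """Pressure at which the pump removes vapor as fast as the heat load boils it off."""
    mol_per_s = heat_load_w / DELTA_H
    p_atm = mol_per_s * MOLAR_VOLUME_L / pump_speed_l_per_s
    return p_atm * TORR_PER_ATM

p1 = vapor_pressure_torr(0.1, 100.0)    # 0.1 W load, 100 l/s pump
p2 = vapor_pressure_torr(1e-3, 1e3)     # 1 mW load, 1000 l/s pump
print(f"P = {p1:.2f} torr and {p2:.1e} torr")
```

The temperatures still have to be read off the helium vapor pressure curve; the code only supplies the pressures.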
Problem 7. A magnetic field of $B_1 = 100$ kG corresponds to a temperature of 7 K (the magnetic moment of the ion is a Bohr magneton). The maximum cooling will be achieved when there is no external magnetic field at the end. The residual magnetic field is then $B_2 = 100$ G, which corresponds to a temperature of 7 mK. The change in entropy of the spin system is then $N(\tanh^2(7/T_1) - \tanh^2(7/1000T_2))$. The Debye temperature of the lattice is 100 K. For temperatures much less than 100 K, the entropy of the lattice is $\sigma_l = N(4\pi^4/5\theta^3)T^3 \approx 8 \times 10^{-5} N T^3$ and the change in entropy for the lattice is $8 \times 10^{-5} N (T_1^3 - T_2^3)$. For large temperatures, the entropy of the lattice overwhelms the spin entropy.

Let us say the temperatures are in the range much below the lattice temperature and above the spin temperatures. Then the change in spin entropy can be approximated by $N(7/T_1)^2 - N(7/1000T_2)^2$. For significant cooling ($T_2 = 0.1\,T_1$), the change in lattice entropy is approximately $8 \times 10^{-5} N T_1^3$, and the change in spin entropy is approximately $N(7/T_1)^2$. Equating the two gives the temperature $T \approx 14$ K, which is consistent with our earlier assumptions. (We can be slightly more precise and use the exact expression for the spin entropies, and find the answer numerically. The answer shifts down a little bit, but we will probably use a slightly lower temperature in an experiment anyway.)
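Equating the two approximate entropy changes gives $8 \times 10^{-5}\,T^3 = (7/T)^2$, which can be solved in closed form. A minimal sketch, with illustrative variable names:

```python
LATTICE_COEFF = 8e-5    # lattice entropy per ion is LATTICE_COEFF * T**3
SPIN_TEMP_K = 7.0       # temperature equivalent of B1 = 100 kG

# LATTICE_COEFF * T**3 = (SPIN_TEMP_K / T)**2  =>  T**5 = SPIN_TEMP_K**2 / LATTICE_COEFF
t_start = (SPIN_TEMP_K**2 / LATTICE_COEFF) ** 0.2
print(f"T = {t_start:.1f} K")
```

This only solves the approximate balance; the exact $\tanh^2$ expressions would need a numerical root finder.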
Week 10. Electrons in Semiconductors
Physics 301
25-Nov-2002 29-1
Reading Finish K&K Chapter 13.
Electron Distribution in Degenerate Semiconductors

When the chemical potential is sufficiently close to one of the band edges that the classical approximation no longer applies, we have to use the full treatment as we did (to some level of approximation) for a Fermi gas in lectures 16 and 17. In particular, our old expression for the number of particles was
$$N = \int_0^\infty f(\epsilon)\, D(\epsilon)\, d\epsilon ,$$
and $\mu$ appearing in $f(\epsilon)$ is adjusted to produce the correct number of particles, and $D(\epsilon)\, d\epsilon$ is the number of states with energies in the range $\epsilon \to \epsilon + d\epsilon$. For a spin 1/2 particle in a box we had
$$D(\epsilon) = \frac{V}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\sqrt{\epsilon} .$$
As we’ve discussed, to get the right density of states we should replace $m$ by $m_e^*$. In addition, the energy that appears in the square root in the density of states really came from changing variables from momentum to energy. At the bottom of the conduction band, the electrons presumably have very little momentum. That is, the first state to be filled corresponds to an electron wave which is a standing wave made up of the smallest possible momentum components. The energy $\epsilon_c$ is negative and is mostly due to the interactions with the atoms (remember, before the atoms formed a crystal, the atomic state that gave rise to the conduction band was a bound electronic state). The upshot of all this is that we should replace the energy in the density of states expression by $\epsilon - \epsilon_c$ and the integral should start at $\epsilon_c$. We have
$$n_e = \frac{N_e}{V} = \frac{1}{2\pi^2}\left(\frac{2m_e^*}{\hbar^2}\right)^{3/2}\int_{\epsilon_c}^\infty \frac{\sqrt{\epsilon - \epsilon_c}\; d\epsilon}{e^{(\epsilon - \mu)/\tau} + 1} .$$
Recall the definition of $n_c$,
$$n_c = 2\left(\frac{m_e^*\tau}{2\pi\hbar^2}\right)^{3/2} ,$$
from which we get
$$\frac{n_e}{n_c} = \frac{2}{\sqrt{\pi}\,\tau^{3/2}}\int_{\epsilon_c}^\infty \frac{\sqrt{\epsilon - \epsilon_c}\; d\epsilon}{e^{(\epsilon - \mu)/\tau} + 1} .$$
Now let $x = (\epsilon - \epsilon_c)/\tau$ and $\eta = (\mu - \epsilon_c)/\tau$, and we can write the result in dimensionless form
$$\frac{n_e}{n_c} = \frac{2}{\sqrt{\pi}}\int_0^\infty \frac{\sqrt{x}\; dx}{e^{x - \eta} + 1} = I(\eta) ,$$
where $I(\eta)$ is called the Fermi-Dirac integral. In the classical limit, where $\eta \ll -1$, the Fermi-Dirac integral reduces to our previous result
$$\frac{n_e}{n_c} = e^{\eta} = e^{(\mu - \epsilon_c)/\tau} .$$
If this limit doesn’t apply, then an excellent approximation is
$$\eta = \log\frac{n_e}{n_c} + \frac{1}{\sqrt{8}}\,\frac{n_e}{n_c} .$$
Figure 13.4 of K&K shows how well this Joyce-Dixon approximation works.
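For readers without the figure at hand, the following sketch evaluates $I(\eta)$ numerically and feeds the result back through the Joyce-Dixon formula. The integration scheme (Simpson’s rule after the substitution $x = t^2$), the cutoff, and the function names are choices made here, not something taken from K&K.

```python
import math

def fermi_dirac_integral(eta, steps=4000):
    """I(eta) = (2/sqrt(pi)) * integral_0^inf sqrt(x) dx / (exp(x - eta) + 1).

    Substituting x = t**2 turns the integrand into the smooth function
    2 t**2 / (exp(t**2 - eta) + 1), which is integrated with composite
    Simpson's rule up to a cutoff where the Boltzmann tail is negligible.
    `steps` must be even.
    """
    t_max = math.sqrt(max(eta, 0.0) + 40.0)
    h = t_max / steps

    def g(t):
        return 2.0 * t * t / (math.exp(t * t - eta) + 1.0)

    total = g(0.0) + g(t_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return (2.0 / math.sqrt(math.pi)) * total * h / 3.0

def joyce_dixon_eta(r):
    """Joyce-Dixon estimate of eta from r = n_e / n_c (the two terms given above)."""
    return math.log(r) + r / math.sqrt(8.0)

for eta in (-5.0, 0.0, 1.0):
    r = fermi_dirac_integral(eta)
    print(f"eta = {eta:+.1f}: I = {r:.4f}, classical = {math.exp(eta):.4f}, "
          f"Joyce-Dixon eta = {joyce_dixon_eta(r):+.3f}")
```

At $\eta = -5$ the classical result is already good to a fraction of a percent, while near $\eta \approx 0$ to 1 the classical form fails badly but the two-term Joyce-Dixon estimate recovers $\eta$ to within a few hundredths.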
Ionization of Donors and Acceptors

We were working out the electron and hole concentrations in a semiconductor. We found that the concentrations depended on the concentration of donor and acceptor ions. Up to now, we’ve assumed that these were “givens.” Of course, the true givens are the concentrations of the donors and acceptors and an overall equilibrium will be established between donor and acceptor ion concentrations, the concentration of electrons in the conduction band, and the concentration of holes in the valence band.

When a donor is introduced into a semiconductor, the extra electron may stay bound to the donor, or the donor may ionize and contribute the extra electron to the conduction band. When bound to the donor, the energy of the electron is $\epsilon_d$. This must be less than the lower edge of the conduction band $\epsilon_c$ (otherwise the donor will always be ionized). However, $\epsilon_d$ should not be a lot less than $\epsilon_c$ or the donor will hardly ever be ionized and there wasn’t much point in introducing this donor in the first place. Thus donor bound states are “in the gap,” but close to the conduction band edge. Similarly, acceptor bound states, with energy $\epsilon_a$, are in the gap but close to the valence band edge at $\epsilon_v$. Table 13.2 of K&K gives some values of $\Delta\epsilon_d = \epsilon_c - \epsilon_d$ and $\Delta\epsilon_a = \epsilon_a - \epsilon_v$. The values are a few tens of milli-eV.

Next we want to calculate the probability that a donor is ionized or neutral. There are two bound states corresponding to the two spin states of the electron. There is one ionized state and this state has no particles and no energy. Then the grand partition function for this system is
$$\mathcal{Z} = 1 + e^{(\mu - \epsilon_d)/\tau} + e^{(\mu - \epsilon_d)/\tau} .$$
The probabilities that the donor is ionized or neutral are
$$f(D^+) = \frac{1}{1 + 2e^{(\mu - \epsilon_d)/\tau}} , \qquad f(D) = \frac{1}{1 + \frac{1}{2}e^{(\epsilon_d - \mu)/\tau}} .$$
In the case of an acceptor, there’s one way to ionize it: adding an electron (which means producing the hole in the valence band) gives four covalent bonds with the four neighbors. Since all these electrons are equivalent, there’s only one way to produce this state. It has one extra electron and it has energy $\epsilon_a$. When it’s neutral, the added hole (missing electron) has two spin states, and the system has energy 0.
$$\mathcal{Z} = e^{(\mu - \epsilon_a)/\tau} + 1 + 1 ,$$
$$f(A^-) = \frac{e^{(\mu - \epsilon_a)/\tau}}{e^{(\mu - \epsilon_a)/\tau} + 2} = \frac{1}{1 + 2e^{(\epsilon_a - \mu)/\tau}} ,$$
$$f(A) = \frac{2}{e^{(\mu - \epsilon_a)/\tau} + 2} = \frac{1}{1 + \frac{1}{2}e^{(\mu - \epsilon_a)/\tau}} .$$
To get the concentrations of ions, we multiply by the total concentrations.
$$n_d^+ = n_d\, f(D^+) = \frac{n_d}{1 + 2e^{(\mu - \epsilon_d)/\tau}} , \qquad n_a^- = n_a\, f(A^-) = \frac{n_a}{1 + 2e^{(\epsilon_a - \mu)/\tau}} .$$
Recall our expressions for $n_e$ and $n_h$ in the classical limit
$$n_e = n_c\, e^{-(\epsilon_c - \mu)/\tau} , \qquad n_h = n_v\, e^{-(\mu - \epsilon_v)/\tau} .$$
We now have expressions for four concentrations in terms of givens and one unknown. The givens are the temperature, the concentrations of donors and acceptors, the quantum concentrations of electrons and holes in the conduction and valence bands (which depend on temperature and effective masses), the energies at the band edges, and the donor and acceptor ionization energies (whew!). The unknown is the chemical potential. The chemical potential must be chosen so that the system is electrically neutral. To help with this, note that the total negative charge concentration must equal the total positive charge concentration,
$$n^- = n_e + n_a^- = n_h + n_d^+ = n^+ .$$
We can make a semi-log plot of all four concentrations versus $\mu$. The electron and hole concentrations are just straight lines, and the ionized donor and acceptor concentrations are essentially two intersecting straight lines. Then we can plot $n^+$ and $n^-$, and where they intersect determines $\mu$ and all the concentrations. All this is shown in the figure. The range of $\mu$ shown in the figure is from $\epsilon_v$ to $\epsilon_c$. Of course, $\mu$ must remain a few $\tau$ away from either band edge in order that our classical approximation remain valid. The figure is drawn with $\mu$ scaled to run between 0 and 1 at the valence and conduction band edges. The curves have been drawn for $n_d = 10^{17}$ cm$^{-3}$, $n_a = 10^{15}$ cm$^{-3}$, $n_v = 10^{19}$ cm$^{-3}$, and $n_c = 10^{20}$ cm$^{-3}$. The donor ion concentration is flat when $\mu \ll \epsilon_d$ and slopes downward at greater $\mu$. Similarly, the acceptor ion concentration is flat when $\mu \gg \epsilon_a$ and slopes downward at smaller $\mu$. All the slopes are just $\pm 1/\tau$ which is why all the slopes in the drawing are the same! The temperature has been set so that as $\mu$ traverses the gap, $\exp(\mu/\tau)$ changes by 20 orders of magnitude. In other words, the gap width is $20 \times 2.303 \times \tau$. (2.303 is $\log 10$.)

If we really want to consider $\mu$ close to the band edges, then we can use the Joyce-Dixon approximation or the result of numerically integrating the Fermi-Dirac integral. When $\mu$ gets to within a few $\tau$ of the band edge, the $n_e$ and $n_h$ curves start to flatten.

We can also use the figure to see what happens if the temperature is changed. As an example, suppose the temperature is doubled (actually this might damage your semiconductor, but this is a thought experiment!). The quantum concentrations $n_c$ and $n_v$ scale as $\tau^{3/2}$, so they increase by a factor of $2^{3/2} = 2\sqrt{2}$, and the intersection of the $n_e$ or $n_h$ line with the right or left edges of the diagram moves up by this amount. But the major effect is the change in the slope of the sloping parts of all the curves. All slopes halve for a doubling of the temperature. The $n_e$ or $n_h$ curves go through the new values of $n_c$ or $n_v$ with the new slope, while the $n_a^-$ or $n_d^+$ curves continue to go approximately through $n_a$ and $\epsilon_a$ or $n_d$ and $\epsilon_d$ with the new slope. Mentally shifting the curves around can give a qualitative
understanding of what happens with a change in temperature.
Electron-Hole Interactions

In our discussion so far, the overall picture has probably been one of a “static” situation. But just as a gas may seem quiescent while having thermal motions at the molecular level, the situation within a semiconductor is similar. The holes and electrons have thermal motions. Electrons and holes may recombine with donors and acceptors. Similarly, electrons and holes may be ionized from donors and acceptors. Electrons and holes may recombine (annihilate!). And an electron-hole pair may be created. Some of these processes can emit or absorb light. Examples are the light emitting diode and the CCD detector used in today’s video-cams.
The p-n Junction

Probably the most important thing in semiconductor technology is the p-n junction. Take a piece of donor (n-type) doped semiconductor and place it in contact (thermally and electrically) with a piece of the same kind of semiconductor with acceptor (p-type) doping. What happens? To start with, the n-type semiconductor has lots of electrons and a Fermi level ($\mu$) near the conduction band edge. The p-type semiconductor has lots of holes and a Fermi level near the valence band edge. When we place them in contact, we have a non-equilibrium situation! This is because the chemical potentials are different. Electrons will flow from regions where their chemical potential is high to regions where their chemical potential is low. Holes flow in the opposite direction (because a high chemical potential for an electron is equivalent to a low chemical potential for a hole).

At the boundary between the two types of semiconductor we have excess electrons (excess negative charge) in the p-type and excess holes (excess positive charge) in the n-type semiconductor. If we imagine that the boundary is a plane, then we have a plane of positive charge next to a plane of equal negative charge. What does this do? It’s like the charge on the plates of a capacitor. It creates an electric field pointing from the n-type to the p-type semiconductor. There is an electric potential difference between the n-type and p-type semiconductors with the n-type at a higher potential.

Now consider following an electron as it travels from the n-type to the p-type semiconductor. When it’s well inside the n-type semiconductor, its intrinsic chemical potential is $\mu_{i,n}$. Here, “intrinsic” does not mean the chemical potential associated with the intrinsic carrier concentration, but rather, the chemical potential calculated ignoring external sources of potential energy, such as the electric potential energy we’ve just been discussing.
Physics 301
25-Nov-2002 29-6
As the electron traverses the junction and gets well inside the p-type semiconductor, its intrinsic chemical potential drops to $\mu_{i,p}$. On the other hand its electric potential energy rises by $e(V_n - V_p) = e\Delta V$, where $V_n$ and $V_p$ are the electric potentials (voltages) in the n-type and p-type semiconductors and the charge of an electron is $-e$. The electron traverses the voltage step from high to low voltage, but since it has a negative charge, it increases its potential energy. As we discussed in lecture 12, any external sources of potential energy must be added to the chemical potential. The system will be in diffusive equilibrium when the total chemical potential is the same everywhere. In other words, when

$$\mu_{i,n} - \mu_{i,p} = e\Delta V ,$$

the electrons and holes stop diffusing across the junction.

What we’ve just discovered is that a p-n junction spontaneously develops a voltage. Since $\mu_{i,n}$ is close to the conduction band edge and $\mu_{i,p}$ is close to the valence band edge, we expect that the voltage times the electron charge should be the same order as the energy gap. For example, silicon has an energy gap of 1.14 eV at room temperature, so we expect voltage steps across silicon p-n junctions to be about one volt. The exact value depends on the type and amount of doping, as we will now work out.

Suppose one has an n-type semiconductor with no acceptors, suppose also that $n_i \ll n_d \ll n_c$ and that $\epsilon_d$ is close to $\epsilon_c$. Then the electron concentration and the chemical potential are determined by the intersection of the electron exponential with the flat part of the donor curve (that’s what all the assumptions are for), and we have

$$n_d = n_e = n_c e^{(\mu - \epsilon_c)/\tau}$$

or

$$\mu_{i,n} = \tau \log\frac{n_d}{n_c} + \epsilon_c .$$

In the p-type semiconductor, we assume that there are no donors, that $n_i \ll n_a \ll n_v$, and that $\epsilon_a$ is close to $\epsilon_v$. Then the hole concentration and the chemical potential are determined by the intersection of the hole exponential with the flat part of the acceptor curve, and

$$n_a = n_h = n_v e^{(\epsilon_v - \mu)/\tau} ,$$

which means

$$\mu_{i,p} = \epsilon_v - \tau \log\frac{n_a}{n_v} .$$

Now we subtract and find

$$e\Delta V = \mu_{i,n} - \mu_{i,p} = \epsilon_g - \tau \log\frac{n_c n_v}{n_d n_a} ,$$

where $\epsilon_g = \epsilon_c - \epsilon_v$ is the energy gap.
One might have $n_c/n_d \approx n_v/n_a \approx 100$, and in this case, the result for silicon at room temperature would be

$$e\Delta V = 1.14\ \mathrm{eV} - 8.62\times 10^{-5}\ \mathrm{eV\,K^{-1}} \cdot 300\ \mathrm{K} \cdot \log 10000 = 0.90\ \mathrm{eV} .$$

Note that as the temperature goes up, the potential difference gets smaller. The electron and hole concentrations stay at $n_d$ and $n_a$, but since the quantum concentrations go up with temperature and especially since the slopes on our diagram get shallower with increasing temperature, both chemical potentials move towards the center of the gap.
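The arithmetic in this estimate is easy to reproduce. The following is a minimal sketch, assuming the formula $e\Delta V = \epsilon_g - \tau\log(n_c n_v/n_d n_a)$ from the text; the function name and argument names are made up for illustration. It takes the two concentration ratios as inputs, so it does not model how the quantum concentrations themselves change with temperature.

```python
import math

K_B_EV_PER_K = 8.62e-5  # Boltzmann constant in eV/K, the value used in the text


def built_in_step_eV(gap_eV, temperature_K, nc_over_nd, nv_over_na):
    """Return e*DeltaV in eV from e*DeltaV = gap - tau*log(nc*nv/(nd*na)).

    Hypothetical helper: the ratios nc/nd and nv/na are supplied directly.
    """
    tau_eV = K_B_EV_PER_K * temperature_K
    return gap_eV - tau_eV * math.log(nc_over_nd * nv_over_na)


# Silicon at room temperature with nc/nd = nv/na = 100, as in the text.
step_eV = built_in_step_eV(1.14, 300.0, 100.0, 100.0)
print(f"e*DeltaV = {step_eV:.2f} eV")
```

Heavier doping (smaller ratios) pushes the step closer to the full gap energy, since the logarithm shrinks.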
Physics 301
27-Nov-2002 30-1
The Depletion Region in a p-n Junction

The figure is a schematic of our view of a p-n junction up to this point. The n-type material is on the left and the p-type is on the right. x is a spatial coordinate increasing from left to right. At the top are shown the energy levels one would have before the flow (diffusion) of electrons and holes to set up a space charge density which creates the electric potential step we are discussing. This is not an equilibrium configuration. Next is a diagram of the energy levels in the equilibrium configuration after the potential step has been established. The dotted lines indicate that the levels must connect from one side to the other, but to know how they connect, we will have to do some work! The chemical potential is now constant across the junction and all energy levels get a step as the junction is crossed. The next diagram shows the electric potential. The ⊕ and ⊖ symbols indicate the location of positive and negative space charge. Finally, the bottom diagram shows the electric potential energy of an electron. This is the potential curve, but inverted due to the negative charge of the electron.

As the electric potential curve is drawn, there is an infinitely thin plane of positive charge (holes) in the n-type semiconductor and a plane of electrons in the p-type semiconductor. (This gives a constant slope to the potential between the two planes.) Such a configuration is also not an equilibrium configuration. Instead, the positive and negative charge will spread out (that’s why it’s called space charge) in a region around the junction. To determine the space charge density, we must solve Gauss’ law (from E&M) in conjunction with the thermodynamic constraints.

We will work through a simple model to see how it goes. We will assume that nothing depends on y or z. Also, we assume the doping changes abruptly from n-type to p-type at x = 0.
In a real semiconductor, there is some diffusion of dopants, so the change from n-type to p-type can’t be instantaneous. However, if it occurs in a distance shorter than the depletion length (coming soon), then our model of an abrupt change is a good approximation.
The electron and hole concentrations will be functions of x, as will the energies of the band edges and the electric potential. Only the chemical potential is independent of x, once equilibrium has been established. Recall Gauss’ law,

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} ,$$

written for SI units, with $\rho$ the charge density, $\epsilon_0$ the permittivity, and $\mathbf{E}$ the electric field. Note: if you took Physics 104, you learned the integral form of Gauss’ law,

$$\oint_{\text{Closed Surface}} \mathbf{E}\cdot\mathbf{n}\,dA = \frac{1}{\epsilon_0}\int_{\text{Enclosed Volume}} \rho\,dV .$$

The integral and differential forms can be derived from one another using the divergence theorem (which you should have seen in a math class by now!)

$$\oint_{\text{Closed Surface}} \mathbf{X}\cdot\mathbf{n}\,dA = \int_{\text{Enclosed Volume}} \nabla\cdot\mathbf{X}\,dV ,$$

where $\mathbf{X}$ is any vector field. The electric potential is defined for static fields so that

$$\mathbf{E} = -\nabla\Phi ,$$

which means

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} .$$

If we specialize all this to our case, our fields are a function of x only, and we must use the permittivity of the semiconductor, $\epsilon$, rather than that of free space. We have

$$\frac{d^2\Phi(x)}{dx^2} = -\frac{1}{\epsilon}\,\rho(x) .$$

What is the charge density? In the n-type semiconductor, at large distances from the junction, there is a concentration $n_d$ of positive charge and an equal concentration $n_e$ of negative charge, so the semiconductor is neutral, $\rho = 0$. As we get close to the junction (but stay on the n-type side), $n_d$ remains constant, but $n_e$ decreases (fewer electrons is the same as more holes). One way to think of the decrease in the electron concentration is that it occurs because the conduction band edge, $\epsilon_{c,n}$, is raised, relative to the chemical potential, $\mu$, by the addition of the electric potential energy of the electrons. That is, without an electric potential, we have

$$n_d = n_e = n_c e^{-(\epsilon_c - \mu)/\tau} .$$

With an electric potential, the difference between $\epsilon_c$ and $\mu$ widens and the electron concentration decreases,

$$n_e = n_c e^{-(\epsilon_{c,n} - e\Phi - \mu)/\tau} = n_d e^{e\Phi/\tau} .$$
The charge density is then

$$\rho(x) = e(n_d - n_e) = e n_d\left(1 - e^{e\Phi/\tau}\right) ,$$

and Gauss’ law becomes

$$\frac{d^2\Phi(x)}{dx^2} = -\frac{e n_d}{\epsilon}\left(1 - e^{e\Phi/\tau}\right) .$$

We’ve found the differential equation that $\Phi$ must satisfy. We can integrate this equation with the following trick. Multiply both sides by $2\,d\Phi/dx$,

$$2\,\frac{d\Phi}{dx}\,\frac{d^2\Phi(x)}{dx^2} = -\frac{2 e n_d}{\epsilon}\left(1 - e^{e\Phi/\tau}\right)\frac{d\Phi}{dx} ,$$

or

$$\frac{d}{dx}\left(\frac{d\Phi}{dx}\right)^2 = -\frac{2 e n_d}{\epsilon}\,\frac{d}{dx}\left(\Phi - \frac{\tau}{e}\,e^{e\Phi/\tau}\right) .$$

At this point we need to think about the boundary conditions, that is, the values of $\Phi$ at the limits of integration, and in fact we need to pick good limits of integration. To start with we assume that the semiconductor extends far enough away from the junction that the effects of the junction become negligible (that is, $n_e \to n_d$). In this case, we might as well assume it extends to $x = -\infty$ where we take the potential to be zero, $\Phi(-\infty) = 0$. Also, $x = -\infty$ will be one of our limits of integration. We will take the other limit to be $x = 0$. At this position, $\Phi(0) = -\Delta V_n$, where $\Delta V_n$ is that part of the potential difference $\Delta V$ that occurs in the n-type material. When we integrate, we will have to evaluate $d\Phi/dx = -E_x$ at the limits of integration. We take $E_x(-\infty) = 0$, and represent by $E$ the value of the electric field at $x = 0$. Then

$$(-E)^2 - 0 = -\frac{2 e n_d}{\epsilon}\left[\left(\Phi(0) - \frac{\tau}{e}\,e^{e\Phi(0)/\tau}\right) - \left(\Phi(-\infty) - \frac{\tau}{e}\,e^{e\Phi(-\infty)/\tau}\right)\right] ,$$

or

$$E^2 = -\frac{2 e n_d}{\epsilon}\left(-\Delta V_n - \frac{\tau}{e}\,e^{-e\Delta V_n/\tau} + \frac{\tau}{e}\right) .$$

Now, $e\Delta V_n$ is of the same order as the energy gap which we’re assuming is much bigger than $\tau$. This means the exponential above can be ignored and we have

$$E = \pm\sqrt{\frac{2 e n_d}{\epsilon}\left(\Delta V_n - \frac{\tau}{e}\right)} ,$$

and we must choose the positive sign since the electric field points in the positive x-direction at the junction. We’ve obtained a relation between the electric field at the junction and part of the voltage drop across the junction. If we do the same arithmetic for the p-type material, the result is

$$E = \sqrt{\frac{2 e n_a}{\epsilon}\left(\Delta V_p - \frac{\tau}{e}\right)} .$$
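It’s the same electric field whether we calculate it from the n-type side or the p-type side, so we have

$$E^2 = \frac{2 e n_d}{\epsilon}\left(\Delta V_n - \frac{\tau}{e}\right) , \qquad E^2 = \frac{2 e n_a}{\epsilon}\left(\Delta V_p - \frac{\tau}{e}\right) ,$$

or

$$\frac{\epsilon E^2}{2e}\,\frac{1}{n_d} = \Delta V_n - \frac{\tau}{e} , \qquad \frac{\epsilon E^2}{2e}\,\frac{1}{n_a} = \Delta V_p - \frac{\tau}{e} .$$

Add, using $\Delta V = \Delta V_n + \Delta V_p$:

$$\frac{\epsilon E^2}{2e}\left(\frac{1}{n_d} + \frac{1}{n_a}\right) = \Delta V - \frac{2\tau}{e} ,$$

or

$$E = \sqrt{\frac{2e}{\epsilon}\,\frac{n_d n_a}{n_d + n_a}\left(\Delta V - \frac{2\tau}{e}\right)} .$$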
We know the electric field at the junction in terms of the potential drop across the junction, the temperature, and the “givens.” Returning to our differential equation for $\Phi(x)$, we were able to integrate it once, but so far as I know, it can’t be integrated in closed form to get $\Phi$. Numeric integration is required to get $\Phi(x)$, and from $\Phi(x)$ one can calculate the electron concentration using $n_e(x) = n_d e^{e\Phi(x)/\tau}$. The electron concentration is essentially 0 at the junction and it rises to $n_d$ as one moves away from the junction. We say that electrons are depleted at the junction. (And in the p-type material, holes are depleted at the junction.)

To get a handle on how far from the junction the effects of the junction extend, we imagine that the electron density is 0 for some distance $w_n$ in the n-type semiconductor. We ask what value of $w_n$ is required in order that we have the same electric field at the junction as the electric field we calculated in the previous paragraph. We apply Gauss’ law in integral form to this region by drawing a box with unit area perpendicular to x with one face at $x = -w_n$, where the electric field is 0, and the other face at $x = 0$, where the electric field is $E$ calculated above. The charge in this box is $n_d w_n e$, and this divided by the permittivity must be $E$. So

$$w_n = \frac{\epsilon E}{e n_d} = \sqrt{\frac{2\epsilon}{e}\,\frac{n_a}{n_d (n_d + n_a)}\left(\Delta V - \frac{2\tau}{e}\right)} .$$
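The numeric integration mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the method used in the notes or in K&K: we introduce the dimensionless potential $u = e\Phi/\tau$ and the length $L = \sqrt{\epsilon\tau/(e^2 n_d)}$ (both are our own notation), so that the first integral of Gauss’ law on the n-type side becomes $(du/ds)^2 = 2(e^u - u - 1)$ with $s = -x/L$ measured from the junction into the n-type material. In these units the abrupt-depletion estimate is $w_n/L = \sqrt{2(u_0 - 1)}$ with $u_0 = e\Delta V_n/\tau$. The integrator is a plain fixed-step Runge-Kutta scheme and the value $u_0 = 40$ is an arbitrary illustrative choice.

```python
import math


def slope(u):
    """du/ds from the first integral, for u = e*Phi/tau <= 0 and s = -x/L."""
    # expm1 keeps e^u - 1 accurate when u is close to zero
    return math.sqrt(max(0.0, 2.0 * (math.expm1(u) - u)))


def electron_profile(u0, s_max, h=0.01):
    """Integrate from the junction (s = 0, u = -u0) into the n-type side.

    Returns a list of (s, n_e/n_d) pairs, using fixed-step RK4.
    """
    s, u = 0.0, -u0
    profile = [(s, math.exp(u))]
    while s < s_max:
        k1 = slope(u)
        k2 = slope(u + 0.5 * h * k1)
        k3 = slope(u + 0.5 * h * k2)
        k4 = slope(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        u = min(u, 0.0)  # the potential never rises above its bulk value
        s += h
        profile.append((s, math.exp(u)))
    return profile


u0 = 40.0                           # e*DeltaV_n/tau, illustrative value
w_n = math.sqrt(2.0 * (u0 - 1.0))   # abrupt-depletion estimate, in units of L
profile = electron_profile(u0, w_n + 10.0)

for s, ratio in profile[::200]:
    print(f"s = {s:6.2f}   n_e/n_d = {ratio:.3e}")
```

Printing the profile shows the electron concentration staying negligible over most of the distance $w_n$ and then recovering to $n_d$ over a few more units of $L$.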
One expects that it will take several times $w_n$ for the electron concentration to rise from 0 at the junction to $n_d$ inside the n-type region. Similarly, the depletion length for the holes is

$$w_p = \frac{\epsilon E}{e n_a} = \sqrt{\frac{2\epsilon}{e}\,\frac{n_d}{n_a (n_d + n_a)}\left(\Delta V - \frac{2\tau}{e}\right)} ,$$

and the total depletion length is

$$w = w_n + w_p = \sqrt{\frac{2\epsilon}{e}\,\frac{n_d + n_a}{n_d n_a}\left(\Delta V - \frac{2\tau}{e}\right)} = \frac{2(\Delta V - 2\tau/e)}{E} .$$

To calculate a representative number, we take $n_d = n_a = 10^{15}\ \mathrm{cm^{-3}}$, $\epsilon = 10\,\epsilon_0$, and $\Delta V - 2\tau/e = 1$ V. We find $E = 1.34\times 10^{4}\ \mathrm{V\,cm^{-1}}$. This is a factor of $\sqrt{10}$ smaller than the number quoted in K&K. It appears that K&K may have left out the dielectric constant when computing the electric field (the 10 in $\epsilon = 10\,\epsilon_0$). The characteristic width of the depletion region is $w = 1.49\times 10^{-4}$ cm. Remember this is the width assuming that the depletion is 100% over this range and 0 outside the range. The depletion actually goes from 100% to 0 gradually over several times this distance.
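The representative numbers can be checked with a short script. This is a sketch that assumes the two results derived above, $E = \sqrt{(2e/\epsilon)\,[n_d n_a/(n_d+n_a)]\,(\Delta V - 2\tau/e)}$ and $w = 2(\Delta V - 2\tau/e)/E$, evaluated in SI units; the function and variable names are our own.

```python
import math

E_CHARGE = 1.602e-19  # electron charge magnitude, C
EPS0 = 8.854e-12      # permittivity of free space, F/m


def junction_field_and_width(n_d, n_a, rel_permittivity, effective_voltage):
    """Return (E in V/m, w in m) for an abrupt junction.

    n_d and n_a are in m^-3; effective_voltage is DeltaV - 2*tau/e in volts.
    """
    eps = rel_permittivity * EPS0
    reduced = n_d * n_a / (n_d + n_a)
    field = math.sqrt(2.0 * E_CHARGE / eps * reduced * effective_voltage)
    width = 2.0 * effective_voltage / field
    return field, width


# 1e15 cm^-3 is 1e21 m^-3
field, width = junction_field_and_width(1e21, 1e21, 10.0, 1.0)
field_V_per_cm = field / 100.0
width_cm = width * 100.0
print(f"E = {field_V_per_cm:.3g} V/cm, w = {width_cm:.3g} cm")
```

Setting `rel_permittivity` to 1 raises the field by $\sqrt{10}$, which is the size of the discrepancy with K&K noted above.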
A Reverse Biased p-n Junction Let’s connect the p-n junction to an external voltage source so that the positive terminal is connected to the n-type material and the negative terminal to the p-type material. The external voltage adds to the potential step at the junction. To see this consider the following. The positive terminal connected to the n-type material attracts electrons from the semiconductor. Since there are plenty of electrons in the n-type material, the electrons flow into the positive terminal and they keep an almost constant potential in the n-type material. But what happens at the junction? To replenish the electrons, we need electrons to flow across the junction from the p-type material. But the p-type material doesn’t have any electrons! Of course the same statements work for the holes. Holes are attracted by the negative voltage, the plentiful holes in the p-type material keep the potential almost constant in the bulk of the p-type material and to replenish the holes, we need holes to flow from the n-type material where there aren’t any. What happens is that the electric field at the junction gets bigger—we do all the same calculations but with the external voltage plus ∆V in place of ∆V —and the depletion region gets wider. The extra voltage drop is taken almost entirely at the junction and after the initial transient to establish equilibrium, there is almost no current flow.
A Forward Biased p-n Junction Let’s connect the p-n junction in the opposite sense. We connect the external negative voltage to the n-type semiconductor and the positive voltage to the p-type semiconductor. Now what happens? Electrons are repelled from the negative terminal, so an electron current travels from the negative terminal toward the junction. (As electrons leave the region of the negative terminal, more are supplied by the negative terminal.) The electrons forced into the depletion region can recombine with the holes producing a smaller depletion region. This means that the potential drop across the junction is smaller than in the equilibrium open circuit condition. This means that electrons will diffuse across the junction to re-establish the equilibrium potential drop. Electrons that diffuse across recombine with holes on the p-side. These holes have come from the positive terminal and either recombine with the electrons that have diffused to the p-side or they diffuse to the n-side and recombine with electrons that have come from the negative terminal. Whew! In any case, the potential drop across the junction and the width of the depletion region are both smaller than in the equilibrium case. Electrons and holes are diffusing across the junction and recombining and there is a current of electrons going from the negative terminal to the junction (which is an electric current going from the junction to the negative terminal) and a current of holes (and electric current) going from the positive terminal to the junction. This is not an equilibrium situation! Since it’s not an equilibrium situation, it’s not really clear what to do about the chemical potential. The chemical potential must not be uniform since we have flowing electrons and holes. One approach is to say that the electrons in the conduction band follow equilibrium Fermi-Dirac statistics as do the holes in the valence band, but the electrons and holes aren’t in equilibrium with each other. 
Then one can use quasi-Fermi levels: separate chemical potentials for the valence and conduction bands (which also depend on position). Then the electron distribution in the conduction band is

$$f_c = \frac{1}{1 + e^{(\epsilon - \mu_c)/\tau}} ,$$

with a similar expression involving $\mu_v$ for the valence band. This approach allows for an increase in the electron concentration in the n-type region (due to the electrons flowing in from the negative terminal) and an increase in the hole concentration as well, due to the holes diffusing across the junction and past the depletion region (not all will recombine in the depletion region). Also, if the electron concentration is increased, the hole concentration must increase if the material is to be electrically neutral.

Electrons diffuse from high chemical potential to low chemical potential. Since this is a non-equilibrium process, we don’t know all that much about what happens. So we make the reasonable assumption that the electron (not electric) current, which is the number of electrons crossing a unit area in a unit time, is proportional to the negative of the gradient of the chemical potential as well as the electron concentration, so

$$\mathbf{j}_e \propto -n_e \nabla\mu_c ,$$
where $\mathbf{j}_e$ is the electron current. To get electric current we need to multiply by $-e$, so

$$\mathbf{J}_e \propto +e n_e \nabla\mu_c .$$

Now we absorb $e$ into the proportionality constant which is called the electron mobility $\tilde\mu_e$, which has nothing directly to do with the electron chemical potential. Then the electric current carried by electrons is

$$\mathbf{J}_e = \tilde\mu_e n_e \nabla\mu_c .$$

At this point the electron mobility is to be regarded as an experimentally determined parameter characteristic of the material and temperature, etc. With luck, we will learn more about mobility when we discuss chapter 14.

If the electron concentration is classical, then

$$n_e = n_c e^{(\mu_c - \epsilon_c)/\tau} ,$$

so

$$\mu_c = \epsilon_c + \tau \log\frac{n_e}{n_c} ,$$

and the electric current carried by electrons in the conduction band is

$$\mathbf{J}_e = \tilde\mu_e n_e \nabla\epsilon_c + \tilde\mu_e \tau \nabla n_e .$$

The conduction band energy level is equal to a constant plus $-e\Phi$, so $\nabla\epsilon_c = -e\nabla\Phi = e\mathbf{E}$, and

$$\mathbf{J}_e = e\tilde\mu_e n_e \mathbf{E} + e D_e \nabla n_e ,$$

where the diffusion coefficient is defined by $D_e = \tilde\mu_e \tau/e$. Again, we’ll learn more about diffusion coefficients later. For now it can be regarded as an experimentally determined parameter. The upshot of all this is that the electron current in the conduction band arises from an electric field (an electric potential gradient) and from a concentration gradient. Similarly, the hole current in the valence band is

$$\mathbf{J}_h = e\tilde\mu_h n_h \mathbf{E} - e D_h \nabla n_h .$$

A little thought will show that the signs are correct. Holes, having positive charge, travel in the direction of the electric field and contribute a current in that direction. They travel opposite to the hole concentration gradient and, having positive charge, contribute an electric current opposite to the hole concentration gradient.

Throughout most of the semiconductor, the field points from the p-type to the n-type region and there is very little concentration gradient, so it is the electric field term which is responsible for the electric current. At the junction, in the depletion region, the electric field term points the other way! The same direction as it pointed in the static, equilibrium case. (Unless we apply a whopping external field!) However, in the neighborhood of the junction, the concentration gradients are large and it is the diffusion due to these gradients that carries the current across the junction.
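As a consistency check on the two terms (a worked step added here, not part of the lecture derivation), we can insert the equilibrium electron profile found for the depletion region, $n_e = n_d e^{e\Phi/\tau}$, into the electron current. Assuming that profile and $D_e = \tilde\mu_e \tau/e$ as defined above, the field term and the diffusion term should cancel exactly, as they must when nothing is flowing:

```latex
\begin{align*}
\nabla n_e &= \frac{e}{\tau}\, n_e \nabla\Phi = -\frac{e}{\tau}\, n_e \mathbf{E} , \\
e D_e \nabla n_e &= e\,\frac{\tilde\mu_e \tau}{e}\left(-\frac{e}{\tau}\, n_e \mathbf{E}\right)
                  = -e \tilde\mu_e n_e \mathbf{E} , \\
\mathbf{J}_e &= e \tilde\mu_e n_e \mathbf{E} + e D_e \nabla n_e = 0 .
\end{align*}
```

This is just the statement that $\mu_c$ is uniform in equilibrium, so $\mathbf{J}_e = \tilde\mu_e n_e \nabla\mu_c$ vanishes; a forward bias upsets the balance in favor of the diffusion term.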
Physics 301
Homework No. 10
Due 3-Dec-2002
H10-1
1. K&K, chapter 13, problem 1.
2. K&K, chapter 13, problem 2.
3. K&K, chapter 13, problem 3.
4. K&K, chapter 13, problem 6. Hint: the intersection of the electron concentration curve with the $n_d^+$ curve most likely occurs where the $n_d^+$ curve (as in the figure in the notes or K&K figure 13.6) is sloped down with constant logarithmic slope.
5. K&K, chapter 13, problem 7.
6. For the p-n junction we discussed in lecture, make sketches of the electric potential and the magnitude of the electric field as a function of x, where x is 0 at the junction, x < 0 in the n-type material and x > 0 in the p-type semiconductor. Do this for three cases:
   (a) Open circuit (the potential was already sketched in lecture).
   (b) Reverse bias.
   (c) Forward bias.
   Note that you can always add a constant to the potential, so only the shape is important. In the case of the electric field, both the overall value and the shape are important. These are supposed to be sketches and not involve a lot of computation, but be sure the electric field is consistent with the potential and be sure to get the sign of the electric field right!
Physics 301
8-Dec-2002
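Physics 301 Problem Set 10 Solutions

Problem 1. The electron and hole concentrations are given by

$$n_{e,h} = \tfrac{1}{2}\left((\Delta n^2 + 4n_i^2)^{1/2} \pm \Delta n\right) \qquad (1)$$

where $\Delta n = n_d^+ - n_a^-$ is the net donor concentration. In the limit of small net donor concentration ($\Delta n \ll n_i$), the above expression reduces to

$$n_{e,h} = \tfrac{1}{2}\left[2n_i\left(1 + \left(\frac{\Delta n}{2n_i}\right)^2\right)^{1/2} \pm \Delta n\right] \approx n_i + \frac{(\Delta n)^2}{8n_i} \pm \tfrac{1}{2}\Delta n \approx n_i \pm \tfrac{1}{2}\Delta n \qquad (2)$$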
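Problem 2. (a) We have a formula for the conductivity, $\sigma = e(n_e\tilde\mu_e + n_h\tilde\mu_h)$. We can plug in the expressions (1) for $n_e$ and $n_h$. Then we minimize with respect to $\Delta n$ and find

$$\left(\frac{\Delta n}{\sqrt{\Delta n^2 + 4n_i^2}} + 1\right)\tilde\mu_e + \left(\frac{\Delta n}{\sqrt{\Delta n^2 + 4n_i^2}} - 1\right)\tilde\mu_h = 0$$

$$\Rightarrow\ \frac{\Delta n}{\sqrt{\Delta n^2 + 4n_i^2}}\,(\tilde\mu_e + \tilde\mu_h) = (\tilde\mu_h - \tilde\mu_e) \quad (< 0 \text{ usually})$$

$$\Rightarrow\ \Delta n = \frac{n_i(\tilde\mu_h - \tilde\mu_e)}{\sqrt{\tilde\mu_e\tilde\mu_h}}$$

For this value of net donor concentration, the conductivity is

$$\sigma_{\min} = 2 e n_i \sqrt{\tilde\mu_e\tilde\mu_h} \qquad (3)$$

(b) For an intrinsic semiconductor, $\Delta n = 0$, and the conductivity is $\sigma = e n_i(\tilde\mu_e + \tilde\mu_h)$. The minimum conductivity is lower than the conductivity of an intrinsic semiconductor by a factor of $2\sqrt{\tilde\mu_e\tilde\mu_h}/(\tilde\mu_e + \tilde\mu_h)$.

(c) Plugging in numbers, we get for Si $\sigma_{\min} = 5.6\times 10^{-7}\ (\mathrm{ohm\ cm})^{-1}$, and for InSb $\sigma_{\min} = 36\ (\mathrm{ohm\ cm})^{-1}$.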
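The minimization in part (a) can be spot-checked numerically. The sketch below assumes only equation (1) and $\sigma = e(n_e\tilde\mu_e + n_h\tilde\mu_h)$; the values of $n_i$ and the mobilities are made-up numbers in arbitrary units (not the Si or InSb values used in part (c)), and $e$ is set to 1.

```python
import math


def conductivity(delta_n, n_i, mu_e, mu_h, e=1.0):
    """sigma = e*(n_e*mu_e + n_h*mu_h), with n_e and n_h from equation (1)."""
    root = math.sqrt(delta_n ** 2 + 4.0 * n_i ** 2)
    n_e = 0.5 * (root + delta_n)
    n_h = 0.5 * (root - delta_n)
    return e * (n_e * mu_e + n_h * mu_h)


# Hypothetical values in arbitrary units, chosen only for illustration.
n_i, mu_e, mu_h = 1.0, 3.0, 1.0

delta_n_min = n_i * (mu_h - mu_e) / math.sqrt(mu_e * mu_h)
sigma_min = conductivity(delta_n_min, n_i, mu_e, mu_h)
sigma_formula = 2.0 * n_i * math.sqrt(mu_e * mu_h)   # equation (3) with e = 1
sigma_intrinsic = conductivity(0.0, n_i, mu_e, mu_h)

print(sigma_min, sigma_formula, sigma_intrinsic)
```

With these numbers the minimum sits at negative $\Delta n$ (slightly p-type), as expected when the electrons are the more mobile carriers.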
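Problem 3. The conductivity is $1/20\ (\mathrm{ohm\ cm})^{-1}$.

(a) For n-type semiconductors, we have $\sigma \approx e n_e \tilde\mu_e$, which gives $\Delta n \approx n_e = \sigma/e\tilde\mu_e \approx 8\times 10^{13}$/cc.

(b) For p-type semiconductors, we have $\sigma \approx e n_h \tilde\mu_h$, which gives $\Delta n \approx -n_h = -\sigma/e\tilde\mu_h \approx -1.6\times 10^{14}$/cc.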
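Problem 4. We are given $\Delta E_d \gg \tau\log(n_c/8n_d)$, which means that

$$n_e^* = n_c \exp(-\Delta E_d/\tau) \ll 8n_d .$$

This gives

$$n_e \approx \tfrac{1}{4}\, n_e^* \left(8n_d/n_e^*\right)^{1/2} = \sqrt{n_d n_e^*/2} = \sqrt{n_d n_c/2}\,\exp(-\Delta E_d/2\tau) \ll n_d .$$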
Problem 5. The acceptor (and hence the hole) concentration falls off exponentially between the two values $n_1$ at $x_1$ and $n_2$ at $x_2$, both of which are much larger than $n_i$ (it is p-type): $n(x) = n_0 \exp(-kx)$. Fitting the two values gives $k = \log(n_1/n_2)/(x_2 - x_1)$. The hole concentration is given by $n = n_v \exp((E_v - \mu)/\tau)$. The electric field is the gradient of the potential $\phi = -E_v/e$. Since $\mu$ is uniform, $E_v$ must vary linearly with x with slope of magnitude $\tau|k|$, and we have

$$|E| = (\tau/e)|k| = (\tau/e)\,\frac{\log(n_1/n_2)}{x_2 - x_1} .$$

For $n_1/n_2 = 10^3$ and $x_2 - x_1 = 10^{-5}$ cm, we have $E = 0.025\ \mathrm{V}\cdot\log(10^3)/10^{-5}\ \mathrm{cm} \approx 17$ kV/cm.
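The final number in Problem 5 is a one-line evaluation. A minimal sketch, assuming $|E| = (\tau/e)\log(n_1/n_2)/(x_2 - x_1)$ with $\tau/e = 0.025$ V as in the solution; the function name is ours.

```python
import math


def built_in_field_V_per_cm(tau_over_e_V, n1_over_n2, distance_cm):
    """|E| = (tau/e) * log(n1/n2) / (x2 - x1), in V/cm."""
    return tau_over_e_V * math.log(n1_over_n2) / distance_cm


field = built_in_field_V_per_cm(0.025, 1e3, 1e-5)
print(f"|E| = {field / 1e3:.0f} kV/cm")
```

Because the field depends only logarithmically on the concentration ratio, it is the short distance over which the doping changes that makes the field large.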
Week 11. Collisions, Mean Free Path, Transport
Physics 301
2-Dec-2002 31-1
Reading K&K chapter 14.
The Maxwell Velocity Distribution

The beginning of chapter 14 covers some things we’ve already discussed. Way back in lecture 6, we calculated the pressure for an ideal gas of non-interacting point particles by integrating over the velocity distribution. In lecture 5, we discovered the Maxwell velocity distribution for non-interacting point particles. In homework 2, you worked out some of the velocity moments for the Maxwell distribution and you also worked out the distribution of velocities of gas molecules emerging from a small hole in an oven. Rather than derive the earlier results again, we’ll just summarize them here for convenient reference.

First of all, let $f(t, \mathbf{r}, \mathbf{v})\, d^3r\, d^3v$ be the number of molecules with position vector in the small volume element $d^3r$ centered at position $\mathbf{r}$ and velocity vector in the small velocity volume $d^3v$ centered at velocity $\mathbf{v}$ at time $t$. The function $f$ is the distribution function (usually, we’ve assumed it to be a function of the energy, but we’re working up to bigger and better things!). When we have a Maxwell distribution,

$$f_{\rm Maxwell}(t, \mathbf{r}, \mathbf{v}) = n \left(\frac{m}{2\pi\tau}\right)^{3/2} e^{-m(v_x^2 + v_y^2 + v_z^2)/2\tau} ,$$

where $m$ is the molecular mass, and $n$ is the concentration. The Maxwell distribution does not depend on $t$ nor $\mathbf{r}$. Integrating $d^3v$ over all velocities produces $n$, and integrating $d^3r$ multiplies by the volume of the container, and we get $N$, the total number of particles. If we want the probability distribution for just the velocity components, we just leave off the $n$ (and then we also don’t consider it to be a function of $\mathbf{r}$).

Sometimes, we want to know the probability distribution for the magnitude of the velocity. We change variables from $v_x, v_y, v_z$ to $v, \theta, \phi$, and $dv_x\, dv_y\, dv_z = v^2\, dv\, \sin\theta\, d\theta\, d\phi$. We integrate over the angles, pick up a factor of $4\pi$, and the probability distribution for $v$ is

$$P(v)\, dv = 4\pi \left(\frac{m}{2\pi\tau}\right)^{3/2} e^{-mv^2/2\tau}\, v^2\, dv .$$

With this distribution, one finds the most probable speed (peak of the probability distribution) is

$$v_{\rm mp} = \sqrt{\frac{2\tau}{m}} = 1.414\sqrt{\frac{\tau}{m}} .$$

The mean speed (denoted by $\bar c$ by K&K) is

$$\bar c = \langle v \rangle = \sqrt{\frac{8\tau}{\pi m}} = 1.596\sqrt{\frac{\tau}{m}} .$$

The root mean square speed (which appears in the energy) is

$$v_{\rm rms} = \sqrt{\langle v^2 \rangle} = \sqrt{\frac{3\tau}{m}} = 1.732\sqrt{\frac{\tau}{m}} .$$
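The three characteristic speeds can be recovered directly from $P(v)$ by numerical integration. The sketch below works in units where $\tau/m = 1$, so speeds come out in units of $\sqrt{\tau/m}$; the trapezoid rule, the cutoff at 12, and the helper names are choices made for this illustration.

```python
import math


def speed_pdf(x):
    """P(v) in units where tau/m = 1, with x = v / sqrt(tau/m)."""
    return 4.0 * math.pi * (2.0 * math.pi) ** -1.5 * x * x * math.exp(-0.5 * x * x)


def integrate(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


# The integrand is negligible beyond x = 12, so that serves as "infinity".
norm = integrate(speed_pdf, 0.0, 12.0)
mean_speed = integrate(lambda x: x * speed_pdf(x), 0.0, 12.0)
rms_speed = math.sqrt(integrate(lambda x: x * x * speed_pdf(x), 0.0, 12.0))

# Locate the peak of P(v) on a fine grid.
grid = [i * 0.001 for i in range(1, 12000)]
most_probable = max(grid, key=speed_pdf)

print(norm, most_probable, mean_speed, rms_speed)
```

The ordering $v_{\rm mp} < \bar c < v_{\rm rms}$ reflects the long high-speed tail of the distribution.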
Cross Sections The next topic discussed in K&K is the mean free path. We’re going to use this as an excuse to discuss cross sections and develop an expression for reaction rates based on cross sections and the Maxwell velocity distribution. The first thing we need to do is to see where a cross section comes from and what it means. Consider a situation in which two particles can undergo some sort of interaction when they “collide.” The interaction may be probabilistic; examples might be a nuclear or chemical reaction, an elastic scattering, or an inelastic scattering (which means one or both particles must be in an excited state after the interaction). Suppose one has a stationary target particle, and one shoots other particles at it with velocity v. The concentration of the incident particles (number per unit volume) is n. The flux density of incident particles, or the particle current, is the number crossing a unit area in a unit time and this is just nv. Note that this has the dimensions of number per unit area per unit time. Now the number of interactions per second must be proportional to nv. If we double the density, we have twice as many particles per second with a chance to interact. Similarly, if we double the velocity, particles arrive at twice the rate and we have twice as many particles per second with a chance to interact. Actually, the “strength” of the interaction may depend on the relative velocity, so it’s not strictly true that the rate is proportional to v, but in a simple process, like the collision of hard spheres, it’s certainly true. We take as a starting point that the interaction rate is proportional to the velocity, and effects having to do with the energy of the collision are included in the proportionality “constant.” The interaction rate is then R = σ(v)nv , where σ is NOT the entropy. Instead it’s the proportionality constant and is called the cross section. We’ve explicitly shown that it might depend on v. 
There can be more complicated dependencies. For example, if the incident and target particles have spin, the cross section might depend on the spins of the particles as well as the relative velocity.
The proportionality constant is called a cross section because it must have the dimensions of an area. $R$ has the dimensions of a number (of interactions) per unit time, while $nv$ has dimensions of number per unit area per unit time. The cross section can be thought of as an area that the target particle “holds up,” perpendicular to the incoming beam. If an incident particle hits this area presented by the target, an interaction occurs.

Now some jargon. It’s often the case that one considers a scattering, in which case the question asked is how many particles per second scatter into the solid angle $d\Omega$ centered on the direction $(\theta, \phi)$? In this case, the number is proportional to $d\Omega$, and the proportionality constant is often written as a differential so the rate into $d\Omega$ is

$$R(\to d\Omega) = \frac{d\sigma}{d\Omega}\, nv\, d\Omega ,$$

where $d\sigma/d\Omega$ is called the differential cross section. If one integrates over all solid angles to find the rate for all interactions (no matter what the scattering direction), one has

$$\sigma = \int \frac{d\sigma}{d\Omega}\, d\Omega ,$$

and $\sigma$ is called the total cross section in this case. Of course, a reaction (not just a scattering) may result in particles headed into a solid angle $d\Omega$, so reactions may also be described by differential and total cross sections.

If one knows the forces (the interaction) between the incident and target particles, one may calculate the cross sections. Cross sections may also be measured. One uses a target which has many target particles. One counts the scattered particles or the outgoing reactants using particle detectors. The total number of events is just $N n v \sigma(v) t$, where $N$ is the number of target particles exposed to the incident beam and $t$ is the duration of the experiment. This assumes that the chances of a single interaction are so small that the chances of interactions with two target particles are negligible. Of course, sometimes this isn’t true and a multiple scattering correction must be deduced and applied. Sigh. . .
Example: Cross Section for Smooth Hard Sphere Elastic Scattering

As an example, we’ll work out the differential cross section for smooth hard sphere scattering. You might say, what does this have to do with thermal physics? Answer: not much, but it’s to motivate the use of the cross section as part of the next topic.

We’ll consider a stationary target and an incident particle with velocity $\mathbf{v}$. Both particles are smooth hard spheres with radius $a$ and mass $m$. (Think of billiard balls.) We assume the collision conserves the kinetic energy of motion. This is what’s meant by an elastic collision. The momentum of the incident particle is $\mathbf{p}_0$ before the collision and $\mathbf{p}_1$ after the collision. The momentum of the target particle is $\mathbf{p}_2$ after the collision. Conservation of momentum and energy tell us

$$\mathbf{p}_0 = \mathbf{p}_1 + \mathbf{p}_2 , \qquad \frac{1}{2m}p_0^2 = \frac{1}{2m}p_1^2 + \frac{1}{2m}p_2^2 .$$

If we square the conservation of momentum equation, we have

$$p_0^2 = p_1^2 + 2\,\mathbf{p}_1\cdot\mathbf{p}_2 + p_2^2 .$$

If we compare with the conservation of energy equation, we conclude

$$\mathbf{p}_1\cdot\mathbf{p}_2 = 0 ,$$

which means that either one of the particles is at rest after the collision or their momenta (and velocities) are perpendicular after the collision. I’m sure you knew this trick from Physics 103/5, right?

Suppose the incident particle scatters at an angle $\theta$ relative to its initial direction. The target particle must recoil at angle $\theta - \pi/2$ from the initial direction of the incident particle. We have

$$p_0 = p_1\cos\theta + p_2\cos(\theta - \pi/2) , \qquad 0 = p_1\sin\theta + p_2\sin(\theta - \pi/2) ,$$

from which we conclude

$$p_1 = p_0\cos\theta , \qquad p_2 = p_0\sin\theta .$$

So far, our analysis has proceeded very much as it would have in Physics 103/5. We’ve applied conservation of momentum and energy, but we’ve not completely “solved” the collision. We have expressed the final momenta in terms of the initial momentum (known) and an angle (unknown). This is typical for collision problems. Conservation of energy and momentum provides four constraints but there are six components of momentum to be determined. (In case you’re wondering how come we have only one unknown, $\theta$, the plane in which the scattering occurs is also undetermined. That is, we don’t know $\phi$, either!)
To make further progress, we must examine the collision in more detail and specify the initial conditions more carefully. So we suppose that in the absence of a collision, the centers of the two spheres would pass within distance b of each other. b is called the impact parameter. The initial momentum points in the z-direction, as we’ve been assuming. Finally, the plane containing the initial momentum vector and the centers of the two spheres makes an angle φ with respect to the x-axis. It’s easy to see that the collision will be confined to this plane. Since we’ve assumed an elastic collision between smooth spheres, the incident particle can exert only a radial, not a tangential, force on the target particle. So we can calculate the angle at which the target particle recoils just from the geometry of the collision. Inspection of the diagram shows that sin(π/2 − θ) =
b , 2a
b b2 cos θ = , sin θ = 1 − 2 . 2a 4a At this point, we’ve completely solved the collision. Given the initial momentum or velocity, the impact parameter, and the initial plane of the collision, we’ve determined the final plane (same as initial), the scattering and recoil angles, and all the final momentum components. which means
However, we still haven’t figured out the cross section! We assumed specific values for $\phi$ and $b$. But we usually don’t have such control over the initial conditions. Instead we should assume that the impact parameter is in the range $b \to b + db$ and the azimuthal angle is in the range $\phi \to \phi + d\phi$. Then the cross section is just the area that this implies,
$$d\sigma = b\,db\,d\phi .$$
However, it’s customary to express the cross section in terms of $\theta$ and $\phi$ rather than $b$ and $\phi$, since $b$ cannot usually be measured, but $\theta$ is relatively easy to measure (it’s just the angle describing the location of the particle counter). From our previous expression relating $b$ and $\theta$, we have
$$\sin^2\theta = 1 - \frac{b^2}{4a^2} .$$
Differentiate,
$$2\sin\theta\cos\theta\,d\theta = -2b\,db\,\frac{1}{4a^2} ,$$
so
$$b\,db = \left|4a^2\cos\theta\sin\theta\,d\theta\right| ,$$
where we don’t care that increasing $b$ means decreasing $\theta$, and
$$d\sigma = 4a^2\cos\theta\sin\theta\,d\theta\,d\phi = 4a^2\cos\theta\,d\Omega ,$$
or
$$\frac{d\sigma}{d\Omega} = 4a^2\cos\theta .$$
Note that $\theta$ is confined to the range $0 \le \theta \le \pi/2$.

The total cross section is found by integrating the differential cross section over all allowed angles,
$$\sigma = \int_0^{\pi/2}\sin\theta\,d\theta\int_0^{2\pi}d\phi\;4a^2\cos\theta = 4\pi a^2 .$$

This makes sense. If the center of the incident sphere passes within $2a$ of the target sphere, there will be a collision. The area within $2a$ of a point is $4\pi a^2$. We didn’t really need to do all this work to get the total cross section; in this case, we could have just written it down by inspection!
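If you’d rather let a computer do the integral, the following minimal Python sketch integrates $d\sigma/d\Omega = 4a^2\cos\theta$ over the allowed solid angle with a midpoint rule and compares with $4\pi a^2$. The function names and the number of integration steps are choices made for this illustration only.

```python
import math

def differential_cross_section(theta, a):
    """d(sigma)/d(Omega) = 4 a^2 cos(theta) for 0 <= theta <= pi/2, else 0."""
    if 0.0 <= theta <= math.pi / 2:
        return 4.0 * a**2 * math.cos(theta)
    return 0.0

def total_cross_section(a, steps=20000):
    """Midpoint-rule integral over theta; the phi integral contributes 2*pi."""
    dtheta = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        # Solid-angle element is sin(theta) dtheta dphi.
        total += differential_cross_section(theta, a) * math.sin(theta) * dtheta
    return 2.0 * math.pi * total
```

Calling `total_cross_section(a)` for any sphere radius `a` should agree with `4 * math.pi * a**2` to within the accuracy of the quadrature.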
Reaction Rates

Consider a gas composed of weakly interacting particles. We want the gas to be essentially an ideal gas and the distribution functions to be the Maxwell distribution functions. Suppose the gas contains two kinds of molecules with masses $m_1$ and $m_2$ and concentrations $n_1$ and $n_2$. Suppose these two kinds of molecules, upon colliding, can undergo a reaction. The (total) cross section for this reaction is $\sigma(v)$ where $v$ is the relative velocity of the molecules. We want to calculate the rate at which the reaction occurs.

Consider a molecule of type 1 moving with velocity $\mathbf{v}_1$. The molecules of type 2 are coming from all directions with different velocities, so how do we calculate anything? We isolate a particular velocity and direction. Consider molecules with velocity $\mathbf{v}_2$ in the range $d^3v_2$. The concentration of such molecules is $f_2(\mathbf{v}_2)\,d^3v_2$. The relative velocity is $v = |\mathbf{v}_2 - \mathbf{v}_1|$, so the rate of interactions with our single target molecule is
$$dR = \sigma(v)\,v\,f_2(\mathbf{v}_2)\,d^3v_2 .$$
We consider target molecules in a small volume $d^3r_1$ and with velocities in the range $d^3v_1$. There are $f_1(\mathbf{v}_1)\,d^3r_1\,d^3v_1$ such molecules. The interaction rate for this number of target particles is
$$dR = \sigma(v)\,v\,f_1(\mathbf{v}_1)\,f_2(\mathbf{v}_2)\,d^3v_1\,d^3v_2\,d^3r_1 .$$
A little thought shows that $dR$ has the dimensions of inverse time. To get the overall rate we need to integrate over all differentials. The integral over $d^3r_1$ just gives the volume of the container, $V$. The integrals over the velocity elements are a little trickier since the integrand includes a dependence on the relative velocity. We have
$$R = V n_1 n_2 \left(\frac{m_1}{2\pi\tau}\right)^{3/2}\left(\frac{m_2}{2\pi\tau}\right)^{3/2}\int d^3v_1\,d^3v_2\;\sigma(v)\,v\,e^{-m_1v_1^2/2\tau}\,e^{-m_2v_2^2/2\tau} .$$
We’re going to change variables to the center of mass velocity and the relative velocity. The center of mass velocity is
$$\mathbf{u} = \frac{1}{M}\left(m_1\mathbf{v}_1 + m_2\mathbf{v}_2\right) ,$$
where $M = m_1 + m_2$. The relative velocity is $\mathbf{v} = \mathbf{v}_2 - \mathbf{v}_1$. Expressing $\mathbf{v}_2$ and $\mathbf{v}_1$ in terms of $\mathbf{u}$ and $\mathbf{v}$, we have
$$\mathbf{v}_2 = \frac{m_1}{M}\mathbf{v} + \mathbf{u} , \qquad \mathbf{v}_1 = -\frac{m_2}{M}\mathbf{v} + \mathbf{u} .$$
The kinetic energy can be expressed as
$$E = \frac{1}{2}m_1v_1^2 + \frac{1}{2}m_2v_2^2 = \frac{1}{2}\frac{m_1m_2}{M}v^2 + \frac{1}{2}Mu^2 = \frac{1}{2}\mu v^2 + \frac{1}{2}Mu^2 ,$$
where
$$\mu = \frac{m_1m_2}{m_1 + m_2}$$
is called the reduced mass. We’ve already used $\mu$ for a number of things (chemical potential, magnetic moment, mobility). Oh well. The Jacobian of the transformation is unity, so
$$R = V n_1 n_2 \left(\frac{M}{2\pi\tau}\right)^{3/2}\left(\frac{\mu}{2\pi\tau}\right)^{3/2}\int e^{-Mu^2/2\tau}\,\sigma(v)\,v\,e^{-\mu v^2/2\tau}\,d^3u\,d^3v .$$
We started with an integral written for a pair of particles. We’ve now rewritten the integral in terms of the center of mass motion and the motion relative to the center of mass. The collision occurs in the relative motion! The integral over the center of mass motion gives unity and finally, the rate per unit volume is
$$r = \frac{R}{V} = n_1 n_2\left(\frac{\mu}{2\pi\tau}\right)^{3/2}\int \sigma(v)\,v\,e^{-\mu v^2/2\tau}\,d^3v .$$
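The two facts used in the change of variables (the kinetic energy splits into relative and center of mass parts, and the Jacobian is unity) are easy to check numerically. Here is a minimal Python sketch; the function names are invented for this illustration, and the Jacobian is computed for a single Cartesian component, the full six dimensional Jacobian being its cube.

```python
import random

def to_cm_and_relative(m1, m2, v1, v2):
    """Map (v1, v2) to (u, v): center-of-mass and relative velocity vectors."""
    M = m1 + m2
    u = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))
    v = tuple(b - a for a, b in zip(v1, v2))
    return u, v

def kinetic_energy_lab(m1, m2, v1, v2):
    """(1/2) m1 v1^2 + (1/2) m2 v2^2."""
    return 0.5 * m1 * sum(a * a for a in v1) + 0.5 * m2 * sum(b * b for b in v2)

def kinetic_energy_cm(m1, m2, u, v):
    """(1/2) mu v^2 + (1/2) M u^2 with mu the reduced mass."""
    M = m1 + m2
    mu = m1 * m2 / M
    return 0.5 * mu * sum(c * c for c in v) + 0.5 * M * sum(c * c for c in u)

def jacobian_per_component(m1, m2):
    """Determinant of d(u, v)/d(v1, v2) for one Cartesian component."""
    M = m1 + m2
    # Rows of the matrix: u = (m1/M) v1 + (m2/M) v2 ;  v = -v1 + v2
    return (m1 / M) * 1.0 - (m2 / M) * (-1.0)
```

Feeding in random velocities, the two energy functions should agree to rounding error for any choice of masses, and the determinant should be 1.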
This expression is generally applicable whenever the gas can be described by a classical Maxwell Boltzmann distribution and the reaction rate is slow enough that the gas is always essentially in thermal equilibrium and the Maxwell Boltzmann distribution applies. For example, expressions like the above are used to calculate the nuclear reaction rates in the Sun when computing a numerical model of the Sun. If particles of types 1 and 2 are in fact the same, then $\mu = m/2$, $n_1 = n_2 = n$, and we need a double counting correction of 1/2,
$$r = \frac{1}{2}n^2\left(\frac{m}{4\pi\tau}\right)^{3/2}\int \sigma(v)\,v\,e^{-mv^2/4\tau}\,d^3v .$$
Usually, $\sigma$ depends only on the magnitude of $\mathbf{v}$, so we can integrate over the angles and
$$r = 2\pi n^2\left(\frac{m}{4\pi\tau}\right)^{3/2}\int_0^\infty \sigma(v)\,v^3\,e^{-mv^2/4\tau}\,dv .$$
The Collision Rate and the Mean Free Path

We can use the expression we’ve just derived to calculate the collision rate among molecules in a gas. We assume the molecules are hard spheres of radius $d/2$. We ignore the correction to the concentration (as in the van der Waals equation of state). As we worked out earlier, the total cross section for a collision is $\pi d^2$, so the collision rate is
$$r = 2\pi^2 d^2 n^2\left(\frac{m}{4\pi\tau}\right)^{3/2}\int_0^\infty v^3\,e^{-mv^2/4\tau}\,dv .$$
This integral is straightforward and the result is
$$r = n^2 d^2\sqrt{\frac{4\pi\tau}{m}} .$$
This is the number of collisions per unit time per unit volume. Now we want to convert this into the number of collisions per unit time per particle. A unit volume contains $n$ particles, so we need to divide by $n$. In addition, each collision involves two molecules, so we need to multiply by 2. The rate per molecule is
$$r_1 = n d^2\sqrt{\frac{16\pi\tau}{m}} .$$
All the factors in this expression make sense. The bigger the concentration, the higher the collision rate; the larger the cross sectional area of a molecule, the higher the collision rate; the faster the molecular speed ($\sqrt{\tau/m}$), the higher the collision rate. Except for the numerical factor of $\sqrt{16\pi}$, we could have written down this expression just from dimensional analysis.
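To see the hard sphere result come out of the general formula, here is a minimal Python sketch that evaluates the same-species rate integral numerically for an arbitrary $\sigma(v)$ and compares it with the closed form $n^2d^2\sqrt{4\pi\tau/m}$. The function names, the midpoint rule, and the cutoff `vmax_factor` are assumptions of this sketch, not part of the derivation.

```python
import math

def rate_per_volume(sigma, n, m, tau, steps=20000, vmax_factor=12.0):
    """r = 2 pi n^2 (m/(4 pi tau))^(3/2) * Int sigma(v) v^3 exp(-m v^2/(4 tau)) dv."""
    v_scale = math.sqrt(4.0 * tau / m)   # width of the relative-speed Gaussian
    vmax = vmax_factor * v_scale         # integrand is negligible beyond this speed
    dv = vmax / steps
    integral = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv               # midpoint rule
        integral += sigma(v) * v**3 * math.exp(-m * v * v / (4.0 * tau)) * dv
    return 2.0 * math.pi * n**2 * (m / (4.0 * math.pi * tau)) ** 1.5 * integral

def hard_sphere_rate(n, d, m, tau):
    """Closed form for hard spheres of diameter d: r = n^2 d^2 sqrt(4 pi tau / m)."""
    return n**2 * d**2 * math.sqrt(4.0 * math.pi * tau / m)
```

For hard spheres one passes `lambda v: math.pi * d**2` as `sigma`; any other velocity dependent cross section can be dropped in the same way, provided it doesn’t grow so fast that the cutoff matters.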
The mean time between collisions is $1/r_1$. If we assume that between collisions a particle typically moves with the average speed, we can get an estimate of the typical distance a particle moves between collisions. This is called the mean free path,
$$\ell = \frac{\bar c}{r_1} = \sqrt{\frac{8\tau}{\pi m}}\Bigg/\left(n d^2\sqrt{\frac{16\pi\tau}{m}}\right) = \frac{1}{\sqrt{2}\,n\pi d^2} .$$
You will notice that this expression differs by $\sqrt{2}$ from the expression given in K&K, and derived by a completely different method. The method used by K&K is to imagine a particle sweeping out a cylinder of base area $\pi d^2$ as it travels. If another particle in the gas has its center within this cylinder, then the first particle will collide with it. As the height of the cylinder grows, the chance that another particle is in the cylinder grows. When the height of the cylinder is such that there’s one particle (on average) within the cylinder, declare that a collision has occurred and the height of the cylinder is the mean free path. In other words,
$$n\pi d^2\,\ell_{\rm K\&K} = 1 ,$$
or
$$\ell_{\rm K\&K} = \frac{1}{n\pi d^2} .$$
In actual fact, neither method is completely kosher! Our discussion of the collision rate is legitimate (within the assumptions of hard sphere molecules), but dividing the average rate into the average velocity is not strictly legal since the average of a ratio is not the same as the ratio of the averages. The K&K method ignores the relative velocities of the molecules. Why is this important? The simple argument applies to a fast moving molecule. On the average, it will run into another molecule after going a distance $\ell_{\rm K\&K}$. But a slow moving molecule (consider the limit of a molecule that happens to be at rest) will more than likely be hit by another molecule in the gas before it has a chance to move $\ell_{\rm K\&K}$. Averaging over both slow and fast molecules, the mean free path will be shorter than $\ell_{\rm K\&K}$.

At this point, it’s useful to plug in some numbers just to get a feel for the collision rate and the mean free path. We consider a neon atom which has a diameter of about 2 Å. The mass of neon is about 20.2 atomic mass units. We assume it’s an ideal gas and evaluate the concentration at 0 °C and one atmosphere. Under these conditions, one mole occupies 22.4 liters, and the concentration is
$$n_0 = \frac{N_0}{V} = 2.69\times10^{19}\ \text{molecules cm}^{-3} ,$$
also known as Loschmidt’s number. The mean free path is
$$\ell = \frac{1}{\sqrt{2}\,n\pi d^2} = 2.10\times10^{-5}\ \text{cm} ,$$
about 1000 times the atomic diameter. The mean speed is
$$\bar c = \sqrt{\frac{8kT}{\pi m}} = 5.35\times10^{4}\ \text{cm s}^{-1} .$$
The collision rate per atom is
$$r_1 = \frac{\bar c}{\ell} = 2.55\times10^{9}\ \text{collisions s}^{-1} .$$
An interesting thing happens in a vacuum. Since the mean free path is inversely proportional to the concentration (density) and the collision rate is proportional to it, lowering the density increases the mean free path and lowers the collision rate. Just as an example, consider a “vacuum” of $10^{-6}$ of atmospheric pressure (not a particularly good vacuum). The collision rate drops to about 2550 times per second and the mean free path rises to about 21 cm. This is a macroscopic length! It’s comparable to the size of laboratory equipment. In good vacuums, the residual gas interacts more with the container than with other gas molecules.
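The neon numbers above are simple to reproduce. Here is a minimal Python sketch in cgs units; the physical constants are standard values typed in by hand, and the function name and the `pressure_fraction` argument (which just scales the concentration at fixed temperature) are inventions of this sketch.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant, erg/K
N_A = 6.02214076e23   # Avogadro's number, 1/mol
AMU = 1.66053907e-24  # atomic mass unit, g

def neon_numbers(pressure_fraction=1.0):
    """Concentration, mean free path, mean speed, collision rate for neon at 0 C."""
    T = 273.15                              # temperature, K
    d = 2.0e-8                              # atomic diameter, cm (about 2 angstroms)
    m = 20.2 * AMU                          # atomic mass, g
    # One mole occupies 22.4 liters (22.4e3 cm^3) at 0 C and one atmosphere.
    n = pressure_fraction * N_A / 22.4e3    # atoms per cm^3
    mfp = 1.0 / (math.sqrt(2.0) * n * math.pi * d**2)   # mean free path, cm
    c_bar = math.sqrt(8.0 * K_B * T / (math.pi * m))    # mean speed, cm/s
    r1 = c_bar / mfp                                    # collisions per second per atom
    return n, mfp, c_bar, r1
```

`neon_numbers()` should land within a percent or so of the values quoted in the text, and `neon_numbers(1e-6)` gives the “vacuum” case: the mean free path goes up by a factor of $10^6$ and the collision rate goes down by the same factor.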
Physics 301
4-Dec-2002 32-1
Transport

When a system is not in equilibrium, various transport processes may occur. By a “transport process” is meant the transfer of energy, charge, particles, etc. by the physical motion of particles. In some cases, wave phenomena may be important in transport processes such as heat conduction in an electrical insulator. The fact that the material is an insulator means that the electrons are not free to move. The conduction of heat occurs by atoms passing energy from one to the next. So there is no net flow of atoms, but there is a flow of heat energy. This is similar to what happens with sound waves. Of course, sound waves can be thought of as phonons... In the case of electric current, charge is transported by the physical movement of particles. In an electrical conductor, heat is transported by the “wavelike” action described earlier and also by the flow of electrons.

We will consider systems which are only mildly out of equilibrium. Usually these systems will be in a steady state in which nothing varies with time. Of course, this is an idealization: if we consider heat conduction by a system connecting two thermal reservoirs, then the reservoirs are assumed to be so large that the removal of heat energy from one reservoir and its addition to the other makes negligible change in the reservoirs.

If the system is only slightly out of equilibrium, it usually happens that a flow will be set up in such a direction that it would restore equilibrium. For example, consider a rod connecting two thermal reservoirs at different temperatures. Heat energy flows through the rod from the high temperature reservoir to the low temperature reservoir. This is the direction required to move closer to equilibrium. Note that since we’re discussing a non-equilibrium situation, the system tries to maximize or minimize the appropriate thermodynamic function (to get to equilibrium). In this case, we have reservoirs in thermal contact (through the rod).
No work can be done and no particles can be exchanged. So the appropriate thing to do is to try and maximize the entropy. Sure enough, energy leaving a high temperature reservoir removes some entropy from that reservoir, but when that same energy is deposited in the low temperature reservoir, more entropy is added than was removed from the high temperature reservoir. So the system is creating entropy which is what you do if you want to maximize it! In other situations, the flow might be such that it would minimize the free energy.

Since we are speaking of a flow in response to a non-equilibrium situation, it seems reasonable that the “strength” of the flow should be proportional to the departure from equilibrium, at least for small departures from equilibrium. In other words, we’re going to assume that the flow can be expanded in a Taylor series in whatever measures the departure from equilibrium. The constant term must be zero (no flow at equilibrium) so the first term must be the linear term. We’ll assume that the linear term is non-zero and departures from equilibrium are small enough that we only need consider the linear term.

How do we measure a flow? An electric current density is an example. Recall that the electric current density is a vector which has the dimensions of charge per unit area per unit time. In general, we’ll consider the flux density which is a vector with units of whatever is being transported per unit area per unit time. The direction of the vector gives the direction in which the whatever is being transported. The magnitude gives the net amount of whatever that crosses a unit area perpendicular to the direction of the vector in a unit time. In the case of electric current density, negative charges moving with some velocity in some direction produce the same current as positive charges moving with the same velocity in the opposite direction. The current or flux density is always the net current or flux density. Sometimes we want the current or flux crossing a surface. This has units of whatever per unit time. It is found by integrating the current density or flux density times the unit normal over the surface,
$$I = \int_{\text{surface}} \mathbf{J}\cdot\mathbf{n}\,dA ,$$
where $I$ is the current or flux, $\mathbf{J}$ is the current density or flux density, $\mathbf{n}$ is the unit normal to the surface pointing from the negative side to the positive side of the surface, and $dA$ is the differential area element on the surface. Food for thought: suppose the quantity being transported is itself a vector. How would you describe the flux density?

How do we measure a departure from equilibrium? In general, any non-uniformities in a system indicate non-equilibrium. For example, temperature variations, variations in particle concentrations, or variations in electric potential might indicate a non-equilibrium situation. But there’s more to it than just a variation; there is also the scale of the variation. A 10,000 volt potential difference between two electrodes in air is not a big deal if the electrodes are separated by several meters. Rather dramatic effects occur if they’re separated by only a millimeter; probably more than the linear term is needed to describe the resulting transport! In other words, how rapidly the non-uniformity varies with position is the important quantity for determining the transport. The upshot of all this hand waving is that a transport process is described by
$$\mathbf{J} = (\text{Constant}) \times \left[\nabla(\text{Scalar Field})\right] .$$
Some transport laws are the following: Ohm’s law relating electric current density, $\mathbf{J}_q$, to the electric field or the gradient of the electric potential where the coefficient is the conductivity, $\sigma$ (not the entropy nor the cross section, here),
$$\mathbf{J}_q = \sigma\mathbf{E} = -\sigma\nabla\Phi .$$
The negative sign indicates that (positive) charge flows from high to low potential. For particle diffusion there is Fick’s law relating the particle flux density, $\mathbf{J}_n$, to the concentration. The proportionality constant is called the diffusivity, $D$,
$$\mathbf{J}_n = -D\nabla n .$$
Again, the negative sign indicates that particles diffuse from high to low concentrations. Note that diffusivity has dimensions of area per unit time or length times velocity. Fourier’s law describes heat conduction. It relates the flow of energy (heat), $\mathbf{J}_u$, to the temperature gradient. The proportionality constant is the thermal conductivity, $K$,
$$\mathbf{J}_u = -K\nabla\tau .$$
Guess what the negative sign means! In these units, $K$ has the dimensions of one over a length times a time. In conventional units, $K$ has the dimensions of energy per Kelvin per length per time.

K&K also discuss viscosity as a transport process. It is a transport process, but it’s the transport of momentum, which is a vector! As K&K discuss viscosity, they refer to the transport of x-momentum in the z-direction. This is a force in the x-direction exerted by something at small z on something at large z and vice versa. Ordinary pressure is a transport of (say) x-momentum in the x-direction; a force in the x direction exerted by something at small x on something at large x and vice versa. Ordinary pressure might be thought of as a “straight ahead” force. The force due to z-transport of x-momentum may be thought of as a “sideways” force. Of course, that’s too vernacular a term, so it’s actually called a shear force.

The diagram shows the standard example of viscosity. A fluid occupies the space between two parallel plates or walls. One plate is kept stationary and the other has a constant velocity in the x direction. The velocity is small enough that the fluid flows smoothly (laminar, not turbulent, flow). In the steady state, there will be a gradient in the velocity of the fluid as indicated in the diagram. Fluid elements in neighboring layers exert a drag force on each other. Fluid in contact with the walls exerts drag forces on the walls. These forces are in the x direction, but they are transmitted in the z direction. Also, it should be fairly clear that the force on a wall must be proportional to the area of the wall. The intensive quantity is force per unit area, like a pressure. Force is momentum per unit time, so we have momentum per unit area per unit time which is a momentum flux density. The faster the plates move relative to each other for a given thickness of fluid, the larger the drag force, so the gradient we need is the z-derivative of the x-component of the velocity. In other words
$$\frac{F_x}{A_z} = J_{p_x,z} = -\eta\,\frac{dv_x}{dz} .$$
The subscripts on $J$ indicate it’s the x-momentum flux density in the z direction. The proportionality constant, $\eta$, is called the viscosity. Its dimensions are mass per length per time. The cgs unit of viscosity is the poise, 1 poise = 1 g cm$^{-1}$ s$^{-1}$.
Transport Coefficients

The constants that occur in the transport equations must depend on the microscopic processes which are going on in the system. In particular, if the system is a gas, then they must depend on the collision rate, the typical velocity and the mean free path. For example, if we consider diffusion, then if gas atoms were point particles, diffusion would occur at the typical average speed of the gas molecules. Instead, diffusion is much slower because particles are bigger than points and they collide and have their velocities randomized after roughly a mean free path. What we are going to do now is obtain order of magnitude expressions for these constants in terms of microscopic parameters. The discussion will be at the same level as the simple derivation of the mean free path. We will miss out on factors of order unity, but we will determine the overall dependencies on microscopic parameters.
Diffusivity

As a first example, we will work out an expression for the diffusivity of molecules in a gas (or solute molecules in a dilute solution). Let’s suppose there is a concentration gradient in the z-direction. Imagine sitting at a point in the gas and watching the molecules go by. If $dn/dz$ is positive then when you see molecules coming from the $+z$ direction, they’ve come from a place where the concentration was higher. Molecules coming at you from the $-z$ direction came from a place where the concentration was lower. Molecules coming at you from anywhere in the xy-plane have come from places with the same concentration. Molecules from other directions have come from intermediate concentrations.

How far away have the molecules come from? Assumption: on average, they came from one mean free path away and have the concentration at that location. What is the flux density of such molecules? Assumption: it’s just the concentration we’ve just mentioned times the average velocity. So if we look in direction $(\theta, \phi)$, the concentration carried by the molecules coming from that direction is
$$n(\mathbf{r}_0 + \delta\mathbf{r}) = n(\mathbf{r}_0) + \frac{dn}{dz}\,\delta r_z = n(\mathbf{r}_0) + \frac{dn}{dz}\,\ell\cos\theta ,$$
where $\mathbf{r}_0$ is the position under discussion and $\delta\mathbf{r}$ is a vector of length $\ell$ in direction $(\theta, \phi)$. As mentioned, to get the flux density, we multiply by the average velocity. But this gives the flux density in the direction opposite to $(\theta, \phi)$. We want the z-component of the flux density (the other components average to zero by symmetry), so we need to multiply by a $\cos\theta$. Also, we need a minus sign to account for the fact it’s opposite to $(\theta, \phi)$. Altogether, the z-component of flux density carried by molecules coming from $(\theta, \phi)$ is
$$J_z(\theta, \phi) = -\bar c\cos\theta\left[n(\mathbf{r}_0) + \ell\cos\theta\,\frac{dn}{dz}\right] .$$
Now, we need to average $J_z$ over all $\theta$ and $\phi$. That is, average over the surface of a sphere. The first term averages to 0. The second term involves the average of $\cos^2\theta$ over the surface of a sphere. It’s useful to remember that this is 1/3. (But easy to work out if you forget.) Finally, we have
$$J_z = -\frac{1}{3}\,\bar c\,\ell\,\frac{dn}{dz} ,$$
from which we deduce that the diffusivity is
$$D \approx \frac{1}{3}\,\bar c\,\ell ,$$
where the approximation sign reflects the approximate nature of our treatment. This also means it doesn’t much matter whether you use the mean free path expression with or without the $\sqrt{2}$ factor!
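In case you did forget, here is the average over the sphere worked out explicitly. The angle bracket notation for the average is introduced just for this aside, and the substitution $x = \cos\theta$ does all the work.

```latex
\langle \cos\theta \rangle
  = \frac{1}{4\pi}\int_0^{2\pi} d\phi \int_0^{\pi} \cos\theta\,\sin\theta\,d\theta
  = \frac{1}{2}\int_{-1}^{1} x\,dx = 0 ,
\qquad
\langle \cos^2\theta \rangle
  = \frac{1}{4\pi}\int_0^{2\pi} d\phi \int_0^{\pi} \cos^2\theta\,\sin\theta\,d\theta
  = \frac{1}{2}\int_{-1}^{1} x^2\,dx = \frac{1}{3} ,
\qquad\text{so}\qquad
\langle J_z \rangle
  = -\bar c\, n(\mathbf{r}_0)\,\langle\cos\theta\rangle
    - \bar c\,\ell\,\frac{dn}{dz}\,\langle\cos^2\theta\rangle
  = -\frac{1}{3}\,\bar c\,\ell\,\frac{dn}{dz} .
```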
A Bit More on the Diffusivity

When we obtained our expression for the diffusivity, we did an integral over a sphere of radius one mean free path. The sphere is shown in the diagram. We used one mean free path, because, on average, a particle passing through the center of the sphere had its last collision one mean free path away. And, on average, it reflects the concentration, temperature, whatever, at the point where it had its last collision. The arrows in the diagram show the velocity vectors (on average, that is, the average velocity $\bar c$) of particles leaving the sphere and heading toward the center. The size of the dots is meant to indicate the concentration. There are more particles per unit volume at the “top” of the sphere than at the “bottom”, so more particles go by in the downward than the upward direction. The integral we did, which gave the result $D = \bar c\,\ell/3$, was just a formal treatment of this picture.
Thermal Conductivity

To get an expression for the thermal conductivity, we can use the same approach, but modified in details, that we used for the diffusivity. Recall that the transport law is
$$\mathbf{J}_u = -K\nabla\tau ,$$
where $\mathbf{J}_u$ is the energy flux density, $\tau$ is the temperature, and $K$ is the thermal conductivity which we want to evaluate by looking at what’s going on at the molecular level. As a first approach, we consider a point in an ideal gas in which there is a temperature gradient, go one mean free path away, look at the energy density, calculate the flux density of the energy from that direction, and then integrate over a sphere to get the average flux density of the energy.

There is just one small problem. The energy density in an ideal gas must be constant! Why is that? There is a temperature gradient, but if the gas is at rest, with no external forces applied, there cannot be a pressure gradient. The ideal gas law is $p = n\tau$ and for a monatomic ideal gas, the energy density is $(3/2)n\tau$ which must be constant if $p$ is constant. For other ideal gases, the 3/2 factor will be different, but it will still be true that the energy density is constant as long as the pressure is constant. This is the source of the following amusing paradox. You get up in the morning. It’s cold. You turn up the heat. Pretty soon you’re nice and warm. Where is the energy consumed by the furnace as the air in your room was heated? It’s not in the room where you are. It’s all outside!

In any case, let’s represent the energy density, energy per unit volume, by
$$u = \hat C_V\,\tau ,$$
where $\hat C_V$ is the heat capacity at constant volume per unit volume for the gas under consideration. For a monatomic ideal gas, $\hat C_V = 3n/2$. If there is a temperature gradient, but the pressure is constant, then $\hat C_V$ has a gradient opposite to the temperature.

If the energy density is constant, then how can there be a flow of energy? The temperature varies, so the average velocity varies. The flux density varies because the energy is transported faster in high temperature regions than in low temperature regions. This is indicated in the figure where the dots represent energy density, not a function of position, and the arrows show the mean speed which is larger at the larger temperatures, so energy from the high temperature region (“top”) of the sphere arrives faster than energy from the low temperature region. The z component of the energy flux density coming from one mean free path away from the center of the sphere in the direction $(\theta, \phi)$ is
$$J_{u,z}(\theta, \phi) = -u\,\bar c(\ell, \theta, \phi)\cos\theta .$$
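Now,
$$\bar c(\ell, \theta, \phi) = \bar c(0) + \ell\cos\theta\,\frac{d\bar c}{dz} .$$
Remember that $\bar c \propto \sqrt{\tau}$, so
$$\frac{d\bar c}{\bar c} = \frac{1}{2}\frac{d\tau}{\tau} .$$
We plug this into our expression for the flux density and we have
$$J_{u,z}(\theta, \phi) = -u\cos\theta\left[\bar c(0) + \ell\cos\theta\,\frac{\bar c}{2\tau}\frac{d\tau}{dz}\right] .$$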
Now we average over the surface of the sphere. The term containing $\bar c(0)$ gives 0 and the other term yields
$$\langle J_{u,z}(\theta, \phi)\rangle = -u\,\frac{\bar c\,\ell}{6\tau}\frac{d\tau}{dz} = -\frac{1}{6}\hat C_V\,\bar c\,\ell\,\frac{d\tau}{dz} .$$
We conclude
$$K \approx \frac{1}{6}\hat C_V\,\bar c\,\ell \approx \frac{1}{2}\hat C_V D .$$
Again, the approximation signs are there so we don’t take the factors out in front too seriously, because our treatment is very simplified. In fact, you will notice that we have an overall factor of 1/2 compared to K&K (who also provide a simplified treatment). I believe our factor of 1/2 comes from the proper treatment of the velocity gradient in terms of the temperature gradient. However, there are a lot of things we have left out! In fact doing the integrals correctly is a bit messy, but let me just mention one thing that makes changes to what we’ve done. The velocity appears in the energy, that which is to be transported, and in the speed of the transport. So we really need to average the energy times the velocity with the Maxwell probability distribution. This increases the average over what we have. It’s similar to the change in the speed distribution of particles coming out of a small hole in an oven that we worked out in an early homework assignment. I believe this effect changes the factor out in front to 2/3 instead of 1/2. (This is for monatomic molecules; it affects diatomic molecules differently because not all the energy is translational kinetic energy.) In any case, the main physical terms are in our expression. Ignoring the overall factors, we have
$$K \propto \hat C_V\,\bar c\,\ell \propto n\sqrt{\frac{\tau}{m}}\,\frac{1}{n d^2} \propto \sqrt{\frac{\tau}{m}}\,\frac{1}{d^2} .$$
This tells us that other things being equal, light, small molecules have higher thermal conductivities than heavy or big molecules. Table 14.3 of K&K shows this to be the case. For example, the conductivity of $^4$He is about six times that of N$_2$. The other possibly surprising thing about this expression is that the only macroscopic parameter it depends on is the temperature. It does not matter how dense a gas is, it conducts heat at the same rate. A low density gas doesn’t have as many molecules to conduct heat as a high density gas, and so you might have expected that a low density gas wouldn’t be as effective. However, the low density means that the molecules can go farther between collisions and “starting over.” There are fewer molecules, but each one can do a better job. Of course, this breaks down when the density is so high that the gas is about to become a liquid or when the density is so low that we are in the regime where the mean free path is on the scale of macroscopic dimensions.
Viscosity

We apply the same method as before to relate the viscosity to microscopic parameters. The figure shows momentum in the x direction with a gradient in the z direction. Rightward pointing arrows are the momentum vectors and radial arrows are the transport velocity vectors. It’s important to remember that the momentum vectors represent a bulk motion of the gas. In other words, the velocity probability distribution has its center shifted away from zero, $\exp(-m(v_x - v_{x0})^2/2\tau)$ rather than $\exp(-mv_x^2/2\tau)$. The transport velocity vectors are random velocities and we’re just picking the radial ones to calculate the flow at the center of the sphere. The transport law is
$$J_{p_x,z} = -\eta\,\frac{dv_x}{dz} ,$$
where $v_x$ is the bulk velocity of the gas. If $\rho = mn$ is the mass density then the z-component of the x-momentum flux density of the gas at $(\ell, \theta, \phi)$ which is flowing in a radial direction is
$$J_{p_x,z}(\ell, \theta, \phi) = -\rho(\ell, \theta, \phi)\,v_x(\ell, \theta, \phi)\,\bar c(\ell, \theta, \phi)\cos\theta .$$
We assume the density and the average velocity are constant, so
$$J_{p_x,z}(\ell, \theta, \phi) = -\rho\,\bar c\cos\theta\left[v_x(0) + \ell\cos\theta\,\frac{dv_x}{dz}\right] .$$
Now we average over the sphere. The term containing $v_x(0)$ gives zero and we are left with
$$J_{p_x,z} = -\frac{1}{3}\,\rho\,\bar c\,\ell\,\frac{dv_x}{dz} .$$
We conclude
$$\eta \approx \frac{1}{3}\,\rho\,\bar c\,\ell \approx \rho D .$$
Again, our treatment is very simplified. We’ve ignored the fact that the x component of the transport velocity must be averaged with the x-component of momentum using the Boltzmann factor. I believe that if this is taken into account, it multiplies the results by 4/5. But again, we’ve got all the important terms in our expression. Ignoring all the numerical factors,
$$\eta \propto mn\sqrt{\frac{\tau}{m}}\,\frac{1}{n d^2} \propto \sqrt{m\tau}\,\frac{1}{d^2} .$$
Small or heavy molecules have a higher viscosity than large or light molecules. Again, the only macroscopic parameter the viscosity depends on is the temperature. In particular, there is (surprisingly?) no dependence on pressure or density. Partially evacuating an experimental apparatus will not reduce “air drag” unless the vacuum is good enough that the mean free path becomes bigger than the size of the apparatus! Caveat: we are talking about the air drag for viscous flow, where the drag is caused by the viscous forces. This is low speed, laminar flow. A more usual case for macroscopic objects moving through air is inertial drag caused by having to move the air “out of the way.” This provides a force proportional to $\rho v^2$.

If we form the ratio of the thermal conductivity to the viscosity we have
$$\frac{K}{\eta} = \frac{1}{2}\frac{\hat C_V}{\rho} ,$$
or
$$\frac{K\rho}{\eta\,\hat C_V} = \text{Constant, independent of gas} .$$
Table 14.3 of K&K with experimentally determined values shows this to be approximately correct. Good! However, our constant is 1/2 (5/6 if the corrections mentioned above are included, since $\frac{1}{2}\times\frac{4/3}{4/5} = \frac{5}{6}$), while the table has values around 1.9 to 2.5. Not so good! To do a better job we need to pay more attention to all the details of the probabilities and be sure that we compute all the averages correctly. Also note in table 14.3 that the values for monatomic gases are somewhat higher than for diatomic gases. This goes in the direction mentioned in passing above.
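To get a feel for the sizes involved, here is a minimal Python sketch that evaluates the three simple estimates, $D \approx \bar c\,\ell/3$, $\eta \approx \rho D$ and $K \approx \hat C_V D/2$, for a monatomic hard sphere gas in cgs units. The function name is made up for this sketch, the numerical prefactors are the uncorrected ones from our simple treatment, and $\hat C_V = 3n/2$ is assumed, so the outputs are order of magnitude estimates only.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant, erg/K
AMU = 1.66053907e-24  # atomic mass unit, g

def transport_estimates(n, d, m, T):
    """Order-of-magnitude D, eta, K from the mean-free-path picture (cgs units).

    n: concentration (cm^-3), d: molecular diameter (cm),
    m: molecular mass (g), T: temperature (K).
    """
    tau = K_B * T                                       # temperature in energy units
    c_bar = math.sqrt(8.0 * tau / (math.pi * m))        # mean speed, cm/s
    mfp = 1.0 / (math.sqrt(2.0) * n * math.pi * d**2)   # mean free path, cm
    rho = m * n                                         # mass density, g/cm^3
    c_v = 1.5 * n          # heat capacity per unit volume, monatomic gas assumed
    D = c_bar * mfp / 3.0  # diffusivity, cm^2/s
    eta = rho * D          # viscosity, poise
    K = 0.5 * c_v * D      # conductivity, 1/(cm s); times K_B gives erg/(K cm s)
    return D, eta, K, rho, c_v
```

With the neon numbers used earlier ($n = 2.69\times10^{19}$ cm$^{-3}$, $d = 2$ Å, $m = 20.2$ amu, 0 °C) the viscosity estimate comes out at a few times $10^{-4}$ poise. Changing `n` changes `D` in inverse proportion but leaves `eta` and `K` alone, which is the density independence discussed above, and `K * rho / (eta * c_v)` returns the constant 1/2.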
Physics 301
6-Dec-2002 33-1
The Boltzmann Transport Equation

We can do a somewhat better job of evaluating transport phenomena with a technique developed by Boltzmann. The idea is to start from the distribution function. Recall that we describe the number of particles in a space volume $d^3r$ and a velocity volume $d^3v$ located at $\mathbf{r}$ and $\mathbf{v}$ at time $t$ by
$$dN = f(t, \mathbf{r}, \mathbf{v})\,d^3r\,d^3v .$$
We are dealing with a six dimensional space. In the case that the momenta and velocities are the same except for the mass, this is just phase space. Liouville’s theorem which we’ll demonstrate in a minute says that the phase space volume along a stream line in phase space is constant. This means that the distribution function must be constant along a stream line. To see what this means, let’s consider a two dimensional phase space, with coordinates $q$ and $p$, which is described by a Hamiltonian, so
$$\dot q = \frac{\partial H}{\partial p} , \qquad \dot p = -\frac{\partial H}{\partial q} .$$
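Consider the rectangular volume element $\Delta q\,\Delta p$ with corners at $(q, p)$, $(q + \Delta q, p)$, $(q, p + \Delta p)$, $(q + \Delta q, p + \Delta p)$ at time $t$. If a particle has coordinates $(q, p)$ at time $t$, where will it be at time $t + dt$? This is determined by the Hamiltonian. The coordinates of the lower left corner of our rectangle move to
\[ (q + \dot q\,dt,\; p + \dot p\,dt) = \left( q + \frac{\partial H(q,p)}{\partial p}\,dt,\; p - \frac{\partial H(q,p)}{\partial q}\,dt \right) . \]
The lower right corner goes to
\[ \begin{aligned}
(q + \Delta q + \dot q\,dt,\; p + \dot p\,dt)
&= \left( q + \Delta q + \frac{\partial H(q+\Delta q,p)}{\partial p}\,dt,\; p - \frac{\partial H(q+\Delta q,p)}{\partial q}\,dt \right) \\
&= \left( q + \Delta q + \frac{\partial H(q,p)}{\partial p}\,dt + \frac{\partial^2 H(q,p)}{\partial q\,\partial p}\,\Delta q\,dt,\; p - \frac{\partial H(q,p)}{\partial q}\,dt - \frac{\partial^2 H(q,p)}{\partial q^2}\,\Delta q\,dt \right) .
\end{aligned} \]
So the bottom side of the volume element is now the vector
\[ \left( \Delta q + \frac{\partial^2 H}{\partial q\,\partial p}\,\Delta q\,dt,\; -\frac{\partial^2 H}{\partial q^2}\,\Delta q\,dt \right) . \]
Similarly, after time $dt$, the left side becomes
\[ \left( \frac{\partial^2 H}{\partial p^2}\,\Delta p\,dt,\; \Delta p - \frac{\partial^2 H}{\partial p\,\partial q}\,\Delta p\,dt \right) . \]
Now take the cross product of these two vectors to get the volume
\[ dV = \Delta q\,\Delta p \left[ 1 - \left( \frac{\partial^2 H}{\partial p\,\partial q} \right)^2 (dt)^2 + \frac{\partial^2 H}{\partial p^2}\frac{\partial^2 H}{\partial q^2}\,(dt)^2 \right] = \Delta q\,\Delta p , \]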
where the last equality applies because all the first order terms in $dt$ canceled leaving only the constant term and second and higher order terms. The upshot is that our original rectangle transformed into a parallelogram of the same phase space volume. Assuming particles are neither created nor destroyed, particles with coordinates in the original box at time $t$ are in the transformed box at time $t + dt$. Since phase space trajectories can't cross, no particles moved into the box. The number of particles is the same and the area is the same. This means the phase space density of particles didn't change. This is the Liouville theorem.

Of course, the phase space density is just $f(t, \mathbf r, \mathbf v)$, modulo some factors of the mass. In terms of $f$, what we've just shown is that
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + \mathbf v\cdot\nabla_r f + \mathbf a\cdot\nabla_v f = 0 , \]
when particles are neither created nor destroyed and we are dealing with a Hamiltonian system. Note that $\mathbf a$ is the acceleration, $\nabla_r$ indicates the spatial gradient and $\nabla_v$ indicates the gradient with respect to velocity coordinates.

The idea behind the Boltzmann transport equation is to divide the interactions of the particles into two parts. One part, due to macroscopic forces and potentials, is described by a Hamiltonian. The other part is due to the microscopic interactions between particles: the collisions. The "external" interactions satisfy Liouville's theorem. The collisions "create" and "destroy" particles. That is, a particle undergoing a collision will "suddenly" disappear from its volume of phase space (it's been destroyed) and reappear (be created) in a different volume of phase space. In this case we write
\[ \frac{df}{dt} = \frac{\partial f}{\partial t} + \mathbf v\cdot\nabla_r f + \mathbf a\cdot\nabla_v f = \left( \frac{\partial f}{\partial t} \right)_{\rm collisions} , \]
where the collision term is supposed to represent the net creation/destruction of particle density per unit time at that point in phase space. It will be greater than 0 if particles are entering the volume as a result of collisions.

The trick to using this method is to make a guess for the collision term. To make a guess, the first thing to assume is that the system is not too far from equilibrium. If $f_0$ represents the equilibrium distribution the system would have if the constraints keeping the system out of equilibrium were removed (or if $f_0$ is the distribution the system is trying to get to for equilibrium), then one guesses that if $f > f_0$ in some volume of phase space, more particles will be scattered out of that volume than will be scattered in. (In equilibrium, the number scattered in and out must be the same, otherwise the number in the volume will change, and that's not equilibrium.) Since we're not too far from equilibrium, we guess that the net rate of scattering particles out of the volume will be proportional to the difference between the actual density and the equilibrium density. The proportionality constant must have dimensions of inverse time, so we just write it as one over a time which is called the collision time or relaxation time. The Boltzmann transport equation in the relaxation time approximation is
\[ \frac{\partial f}{\partial t} + \mathbf v\cdot\nabla_r f + \mathbf a\cdot\nabla_v f = -\frac{f - f_0}{\tau_c} , \]
where $\tau_c$ is the collision/relaxation time, not the temperature, and the minus sign assures that particles leave the volume if the actual distribution is larger than the equilibrium density. Of course, $\tau_c$ doesn't actually have to be a constant. It could depend on $\mathbf r$ or $\mathbf v$.
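To see what the relaxation time approximation implies in the simplest possible setting, here is a minimal sketch. It assumes a spatially uniform gas with no external force, so only the collision term survives and the equation reduces to df/dt = -(f - f0)/τc at a single point in velocity space. The function name `relax` and all numerical values are illustrative assumptions, not part of the lecture.

```python
import math

def relax(f_start, f_eq, tau_c, t_end, steps=10_000):
    """Integrate df/dt = -(f - f_eq)/tau_c with forward Euler steps."""
    dt = t_end / steps
    f = f_start
    for _ in range(steps):
        f += -(f - f_eq) / tau_c * dt
    return f

# Hypothetical numbers: start 10% above equilibrium, follow for 3 collision times.
f_eq, f_start, tau_c = 1.0, 1.1, 2.0
numeric = relax(f_start, f_eq, tau_c, t_end=3 * tau_c)

# Exact solution of the same equation: exponential decay of the excess.
exact = f_eq + (f_start - f_eq) * math.exp(-3.0)
print(numeric, exact)
```

The excess over equilibrium decays as exp(-t/τc), which is why τc is called a relaxation time.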
The Boltzmann Equation and Simple Diffusion

Suppose we have a steady state, so $\partial f/\partial t = 0$, and suppose there are no external forces, so $\mathbf a = 0$. Let's suppose that $f$ varies in the x direction. Then the Boltzmann equation becomes
\[ v_x \frac{df}{dx} = -\frac{f - f_0}{\tau_c} . \]
Since we want to be only mildly out of equilibrium, we expect that $f - f_0$ is small (compared to $f_0$), which means $df/dx$ must be small. So we can use $df_0/dx$ in place of $df/dx$ and the error we make should be second order small. Then a first order approximation is
\[ f = f_0 - \tau_c v_x \frac{df_0}{dx} . \]
Recall $f$ is the density per unit volume per unit velocity volume. If we multiply by $v_x$ and integrate over velocity volume, we'll have the concentration flux density. In other words,
\[ J_{n,x} = \int v_x f\, d^3v = \int d^3v\; v_x \left( f_0 - \tau_c v_x \frac{df_0}{dx} \right) . \]
For $f_0$, we should use the Maxwell velocity distribution times the concentration, and the derivative of $f_0$ will have the Maxwell velocity distribution times the concentration gradient. The first term will give 0 (no net $v_x$), but the second term will give a non-zero result,
\[ J_{n,x} = -\tau_c \langle v_x^2 \rangle \frac{dn}{dx} = -\tau_c \frac{\tau}{m} \frac{dn}{dx} . \]
This calculation yields a diffusivity
\[ D = \tau_c \frac{\tau}{m} . \]
Comparing with our earlier expression for the diffusivity, we find that the two will be the same if
\[ \tau_c = \frac{8\ell}{3\pi \bar c} , \]
where $\ell$ is the mean free path. The relaxation time is roughly the time it takes to go a mean free path at the average speed, or roughly the time between collisions. This had to be: in our earlier derivation we assumed that one collision (the last one, at a distance of one mean free path) was enough to randomize the velocity, energy, etc.

One might also assume that
\[ \tau_c = \frac{\ell}{v} , \]
that is, fast particles relax faster. Then our expression becomes
\[ \begin{aligned}
J_{n,x} &= -\ell \int d^3v\, \frac{v_x^2}{v} \frac{df_0}{dx} \\
&= -\ell \int d^3v\, \frac{1}{3v} \left( v_x^2 + v_y^2 + v_z^2 \right) \frac{df_0}{dx} \\
&= -\ell \int d^3v\, \frac{v}{3} \frac{df_0}{dx} \\
&= -\frac{1}{3}\, \ell\, \bar c\, \frac{dn}{dx} ,
\end{aligned} \]
and $D = \ell \bar c/3$ as before.
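The two choices of relaxation time can be checked numerically. The sketch below draws velocities from a Maxwell distribution (each component Gaussian with variance τ/m) and estimates the averages ⟨v_x²⟩ and ⟨v_x²/v⟩ that appear in the two diffusion integrals. Units are arbitrary, and the function name, sample size and seed are assumptions made for illustration.

```python
import math
import random

def maxwell_averages(tau_over_m, n_samples=200_000, seed=1):
    """Monte Carlo estimates of <v_x^2>, <v_x^2 / v> and <v> for a Maxwell distribution.

    Each velocity component is Gaussian with variance tau/m.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(tau_over_m)
    sum_vx2 = sum_vx2_over_v = sum_v = 0.0
    for _ in range(n_samples):
        vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
        v = math.sqrt(vx * vx + vy * vy + vz * vz)
        sum_vx2 += vx * vx
        sum_vx2_over_v += vx * vx / v
        sum_v += v
    return sum_vx2 / n_samples, sum_vx2_over_v / n_samples, sum_v / n_samples

tau_over_m = 1.0   # arbitrary units (assumption)
mfp = 1.0          # mean free path, arbitrary units (assumption)
vx2, vx2_over_v, c_bar_sampled = maxwell_averages(tau_over_m)
c_bar_exact = math.sqrt(8.0 * tau_over_m / math.pi)

# Constant relaxation time tau_c = 8 l / (3 pi c_bar): D = tau_c <v_x^2>
D_const = 8.0 * mfp / (3.0 * math.pi * c_bar_exact) * vx2
# Speed-dependent relaxation time tau_c = l / v: D = l <v_x^2 / v>
D_speed = mfp * vx2_over_v
print(D_const, D_speed, mfp * c_bar_exact / 3.0)
```

Both estimates should agree with ℓc̄/3 to within the sampling noise, which is a fraction of a percent for this sample size.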
Diffusion and the Fermi-Dirac Distribution

Up to now, we've been using the Maxwell-Boltzmann distribution function. Let's see what happens with the Fermi-Dirac distribution. This is important for describing the electrical conductivity of a conductor. So
\[ f_0 = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} . \]
Note that this is normalized in a strange way: it's missing the density of states. So we'll have to multiply by the density of states. If we want to describe a situation where the temperature is constant but the concentration varies with position, it will be the chemical potential which varies. Let's write down the diffusion integral
\[ J_{n,x} = -\int d\epsilon\; v_x^2\, \tau_c\, \frac{df_0}{dx}\, D(\epsilon) . \]
This is just like our earlier expression except that the density of states appears explicitly and the integral is over the energy rather than velocity volume. The density of states appearing here has to be the number of states per unit energy per unit volume. Recall in lecture 16, we used the same symbol for the number of states per unit energy. Now,
\[ \frac{df_0}{dx} = \frac{df_0}{d\mu} \frac{d\mu}{dx} . \]
What is $df_0/d\mu$? If we are at low temperatures (room temperature for a metal whose Fermi temperature is thousands of kelvins), then $f_0$, viewed as a function of $\mu$, is flat at 0 almost up to $\mu = \epsilon$, then rises sharply to 1 and stays at 1. This means that $df_0/d\mu$ must be essentially 0 everywhere except very close to $\mu = \epsilon$, where it must have a big spike. The area under the spike must be +1. You can see this because integrating $df_0/d\mu$ must produce the step of 1 in $f_0$. Note that $df_0/d\mu = -df_0/d\epsilon$, and $f_0$ has a step of $-1$ when $\epsilon$ increases past $\mu$. Then
\[ \begin{aligned}
J_{n,x} &= -\frac{d\mu}{dx} \int d\epsilon\; v_x^2\, \tau_c\, \frac{df_0}{d\mu}\, D(\epsilon) \\
&= \frac{d\mu}{dx} \int d\epsilon\; v_x^2\, \tau_c\, \frac{df_0}{d\epsilon}\, D(\epsilon) \\
&= \frac{d\mu}{dx} \int d\epsilon\; \frac{1}{3} \left( v_x^2 + v_y^2 + v_z^2 \right) \tau_c\, \frac{df_0}{d\epsilon}\, D(\epsilon) \\
&= \frac{1}{3} \frac{d\mu}{dx} \int d\epsilon\; v^2\, \tau_c\, \frac{df_0}{d\epsilon}\, D(\epsilon) \\
&= -\frac{1}{3} \frac{d\mu}{dx}\, v_F^2\, \tau_c\, D(\epsilon_F) ,
\end{aligned} \]
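where in the last line we took $\tau = 0$, so the energy and velocity are the Fermi energy and velocity. The density of states at the Fermi energy (lecture 16) is $D(\epsilon_F) = 3n/2\epsilon_F$. The Fermi energy itself is
\[ \mu(\tau = 0) = \epsilon_F = \frac{\hbar^2}{2m} \left( 3\pi^2 n \right)^{2/3} . \]
Differentiate with respect to x,
\[ \frac{d\mu}{dx} = \frac{2}{3} \frac{\hbar^2}{2m} \left( 3\pi^2 \right)^{2/3} n^{-1/3} \frac{dn}{dx} = \frac{2}{3} \frac{\epsilon_F}{n} \frac{dn}{dx} . \]
Now we can put it all together,
\[ J_{n,x} = -\frac{1}{3} v_F^2\, \tau_c\, \frac{dn}{dx} , \]
so the diffusivity for a cold Fermi gas is
\[ D = \frac{1}{3} v_F^2\, \tau_c = \frac{2\epsilon_F}{3m}\, \tau_c . \]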
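The step in the derivation above that replaces the energy integral by its value at the Fermi energy relies on -df0/dε acting like a spike of unit area at ε = μ. A rough numerical check, assuming τ/μ = 0.005 (about room temperature for a typical metal) and a smooth weight ε^{3/2} standing in for v² D(ε); the function names and the integration settings are illustrative assumptions.

```python
import math

def fermi(eps, mu, tau):
    """Fermi-Dirac occupancy 1 / (exp((eps - mu)/tau) + 1)."""
    return 1.0 / (math.exp((eps - mu) / tau) + 1.0)

def spike_average(g, mu, tau, n=20_000, width=40.0):
    """Integrate g(eps) * (-df0/deps) over mu +/- width*tau with the midpoint rule."""
    lo, hi = mu - width * tau, mu + width * tau
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        eps = lo + (i + 0.5) * h
        f = fermi(eps, mu, tau)
        total += g(eps) * f * (1.0 - f) / tau * h   # -df0/deps = f (1 - f) / tau
    return total

mu, tau = 1.0, 0.005                                # energies in units of mu (assumption)
area = spike_average(lambda eps: 1.0, mu, tau)      # area under the spike
weighted = spike_average(lambda eps: eps ** 1.5, mu, tau)
print(area, weighted, mu ** 1.5)
```

The area comes out equal to 1 and the weighted integral differs from the weight evaluated at μ only by a correction of order (τ/μ)², which is what justifies setting τ = 0 in the last line.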
Electrical Conductivity

We can apply the result just obtained for diffusion with a Fermi-Dirac distribution to calculate the electrical conductivity. Everything is the same, except that we want the flux density of charge, so we need to multiply the above by $-e$. Also, the variation in chemical potential is not due to a variation in concentration but to the variation in electric potential energy,
\[ \frac{d\mu}{dx} = \frac{d}{dx}(-e\Phi) = eE_x . \]
If we plug this in to our previous results, we have
\[ J_{q,x} = -(-e)(eE_x)\, \frac{1}{3} v_F^2\, \tau_c\, \frac{3n}{2\epsilon_F} = \frac{ne^2 \tau_c}{m}\, E_x . \]
We conclude the electrical conductivity is
\[ \sigma = \frac{ne^2 \tau_c}{m} . \]
Note that $\tau_c$ is to be evaluated for the electrons with energies near the Fermi energy. When we were considering gases, we could make an estimate of the collision time by considering a simple model of the molecules as hard spheres, and the collision time could be calculated by considering the collisions of the spheres. In a metal, the relaxation time for the electrons is harder to calculate. (We're not going to calculate it!) The relaxation time is determined by interactions of the electrons with the lattice and especially impurities in the lattice.

An interesting point is that a classical treatment of the electrons gives the same expression for the conductivity (see K&K). So from studying electric conductivity, you might not have decided quantum effects are important in a metal. However, the Fermi-Dirac statistics reduce the contribution of the electrons to the heat capacity and so allow understanding of how a metal can have large electric and thermal conductivities due to the electrons, but very little electronic heat capacity. Note also that this is another clue that the electron relaxation time is determined by collisions with the lattice. Since the lattice "has the heat," the electrons must interact with the lattice in order to get the heat to transport it.
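To attach numbers to the conductivity formula, the sketch below evaluates it for copper in SI units. It assumes the conduction electron concentration and mean free path quoted for copper in the homework later in these notes (n = 8 × 10²² cm⁻³, ℓ = 400 × 10⁻⁸ cm) and the estimate τc = ℓ/v_F; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34      # J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # C

def fermi_energy(n):
    """Fermi energy (J) of a free electron gas with concentration n (m^-3)."""
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * n) ** (2.0 / 3.0)

def conductivity(n, mean_free_path):
    """sigma = n e^2 tau_c / m, with tau_c estimated as mean_free_path / v_F."""
    e_f = fermi_energy(n)
    v_f = math.sqrt(2.0 * e_f / M_E)
    tau_c = mean_free_path / v_f
    return n * E_CHARGE**2 * tau_c / M_E

n_cu = 8e28      # m^-3, assumed copper value (8e22 per cm^3)
mfp_cu = 4e-8    # m, assumed mean free path (400e-8 cm)
e_f_ev = fermi_energy(n_cu) / E_CHARGE
sigma_cu = conductivity(n_cu, mfp_cu)
print(e_f_ev, sigma_cu)   # Fermi energy in eV, conductivity in 1/(ohm m)
```

With these inputs the Fermi energy is several eV and the conductivity is a few times 10⁷ per ohm-meter, the right order of magnitude for a good metal.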
Physics 301
Homework No. 11
Due 10-Dec-2002
H11-1
1. Some fun with cross sections.

(a) Consider elastic scattering of two hard spheres of radius $a$ as in the lecture notes, but this time in the center of mass frame. Show that
\[ \frac{d\sigma}{d\Omega'} = a^2 , \]
so the scattering is isotropic in the center of mass frame. The prime on $d\Omega'$ is meant to indicate the center of mass frame.

(b) In the lab frame, one of the particles is at rest before the collision. By considering the transformation of the center of mass angles $(\theta', \phi')$ to the lab angles $(\theta, \phi)$, transform the above cross section to the lab frame and show that the same result is obtained as in lecture.

2. Diffusion Equations. We've discussed particle diffusion and found that the particle flux density is related to the concentration gradient according to the transport equation
\[ \mathbf J_n = -D \nabla n . \]
In coming up with this equation, we assumed that the system was in a steady state, but that's not necessary. It's also valid if the concentration and flux density are functions of time.

(a) If particles are conserved then a change in particle concentration in a region must result from particles crossing the boundary of the region. Show that
\[ \nabla \cdot \mathbf J_n + \frac{\partial n}{\partial t} = 0 . \]
Hint: you might find the divergence theorem useful. This equation expresses the conservation of particles. It is called a continuity equation.

(b) If the continuity equation is combined with the transport equation, one can obtain a partial differential equation for $n$ (not containing $\mathbf J$). What equation do you obtain? This is called the diffusion equation.

Comment: if the concentration is a linear function of position, the equation you just derived shows that the concentration is independent of time. So this is a steady state solution. At a local maximum, your equation should show the concentration decreases with time. Does it? Similarly, at a local minimum the concentration increases with time.

Keep going, this problem continues on the next page!
Another comment: A very crude approximation to a spatial derivative of $f$ is $f/L$, where $L$ is a characteristic spatial dimension of the system. A very crude approximation to a time derivative is $f/T$, where $T$ is a characteristic time scale of the system.

(c) Suppose that in lecture, I open a bottle of ammonia in the front of the room. Estimate how long it will take you to notice an appreciable odor if you're sitting in the back of the room. This is a "make crude but reasonable estimates" problem! You might get a relatively long time compared to normal experience. But normal experience may also involve convection (air currents).

3. K&K, chapter 14, problem 3.

4. K&K, chapter 14, problem 4.

5. K&K, chapter 14, problem 5.

6. K&K, chapter 14, problem 6. The main point here is to solve for the velocity as a function of radius!
Physics 301
16-Dec-2002
Physics 301 Problem Set 11 Solutions

Problem 1.

(a) As a setup, we fix the scattering to be in the z-x plane; the initial momenta are $\pm p\,\hat z$, and the final momenta are $\pm p(\cos\theta'\,\hat z + \sin\theta'\,\hat x)$. For hard spheres, the component of momentum perpendicular to the line joining the centers does not change, and in the center of mass frame the component parallel to it is reflected. This means that the line joining the centers makes an angle $(\pi - \theta')/2$ with the z-axis. We then get $\cos(\theta'/2) = b/2a$, where $b$ is the impact parameter, and so
\[ d\sigma = |b\,db|\,d\phi' = 2a\cos(\theta'/2)\; 2a\sin(\theta'/2)\; d(\theta'/2)\, d\phi' = a^2 \sin\theta'\, d\theta'\, d\phi' = a^2\, d\Omega' . \]
The scattering is isotropic.

(b) Changing coordinates from the center of mass frame to the lab frame, we have
\[ \mathbf p_{1,f}^{\rm lab} = \mathbf p_{1,f}^{\rm cm} + \mathbf p_{\rm cm} . \]
The momentum $\mathbf p_{\rm cm}$ carried by each ball due to the motion of the center of mass in the lab frame is simply half the initial lab momentum of the first ball, which is exactly the initial c.m. momentum of the first ball. The equation is then $\mathbf p_{1,f}^{\rm lab} = \mathbf p_{1,f}^{\rm cm} + \mathbf p_{1,i}^{\rm cm}$. If we draw this diagram, it is clear that we have $\theta_{\rm cm} = 2\theta_{\rm lab}$. Changing back to the primed notation, we have $\sin\theta' = \sin(2\theta) = 2\sin\theta\cos\theta$ and $d\theta' = 2\,d\theta$. The axial coordinate remains the same, $\phi' = \phi$. We then have $d\sigma = a^2\, d\Omega' = 4a^2 \cos\theta\, d\Omega$, which is the result in the lecture.

Problem 2.

(a) If we zoom in on a little ball surrounding a point in space, we can write the particle conservation equation in that ball as
\[ \oint dS\; \mathbf J_n \cdot \hat{\mathbf n} + \frac{\partial}{\partial t} \int dV\, n = 0 , \qquad (1) \]
where the second term is the rate of change of the number of particles in the region and the first denotes the flow of particles out through the boundary. Using the divergence theorem to convert the first term to a volume integral of $\nabla \cdot \mathbf J_n$, and then saying that this holds for an arbitrary volume, gives us the microscopic continuity equation $\nabla \cdot \mathbf J_n + \partial n/\partial t = 0$.

(b) The transport equation says $\mathbf J_n = -D\nabla n$. Plugging this into the continuity equation above gives us a second order equation for $n$ alone:
\[ -D \nabla^2 n + \frac{\partial n}{\partial t} = 0 . \qquad (2) \]
(c) Using the crude approximations $\partial n/\partial t \approx n/T$ and $\partial n/\partial x \approx n/L$, where $T$ and $L$ are some characteristic time and length scales in the problem, we can write the diffusion equation as $n/T \approx Dn/L^2$, and this gives an estimate $T \approx L^2/D$. For the ammonia example, we have $L \approx 10$ m (size of room). We can either calculate the diffusivity as in the lectures, $D \approx \bar c\,\ell$, or look up the experimental value. A search gave $D \approx 0.2$ cm$^2$/s, which gives $T \approx 5 \times 10^6$ s, that is, of order $10^6$ s.

Problem 3.

We have $K = D C_v n = (1/3)\,\bar c\,\ell\,(3/2)\,n = (1/2)\,\bar c\,\ell\,n$, and $\sigma = nq^2\tau_c/m = nq^2\ell/m\bar c$. This gives $K/\sigma\tau = m\bar c^2/2\tau q^2$. Approximating the square of the mean speed by the mean square speed $3\tau/m$, we get $K/\sigma\tau = 3/2q^2$. One should remember that this is just an estimate.

Problem 4.

(a) Electrons in copper are a cold Fermi gas. Their heat capacity per unit volume is $\hat C_{el} = \pi^2 n k_B^2 T/2E_F$, where $E_F = (3\pi^2 n)^{2/3}(\hbar^2/2m)$ is the Fermi energy. The concentration of conduction electrons in copper is $n = 8 \times 10^{22}$/cc, which gives $\hat C_{el} = 2 \times 10^5$ erg/(K cc).
(b) The thermal conductivity is $K = (1/3)\, v_F^2\, \tau_c\, \hat C_V$, where the Fermi velocity is $v_F = \sqrt{2E_F/m}$. We can approximate the relaxation time as $\tau_c = \ell/v_F$. This gives us $K = (1/3)\, v_F\, \ell\, \hat C_V$. We are given $\ell = 400 \times 10^{-8}$ cm, so $K = 4 \times 10^7$ erg/(cm s K).

(c) The electrical conductivity is $\sigma = ne^2\tau_c/m = ne^2\ell/mv_F \approx 6 \times 10^7$/(ohm m).

Problem 5.

(a) The zeroth order distribution is Maxwellian, $f_0 = (C/\tau^{3/2}) \exp(-E/\tau)$. This gives
\[ \frac{df_0}{dx} = \frac{\partial f_0}{\partial \tau} \frac{d\tau}{dx} = \left( -\frac{3}{2\tau} + \frac{E}{\tau^2} \right) \frac{d\tau}{dx}\, f_0 . \qquad (3) \]
The Boltzmann transport equation to first order then reads:
\[ f \approx f_0 - v_x \tau_c \left( -\frac{3}{2\tau} + \frac{E}{\tau^2} \right) \frac{d\tau}{dx}\, f_0 . \qquad (4) \]

(b) The energy flux in the x direction is given by
\[ J_u^x = \int v_x\, E\, D(E)\, f(E)\, dE , \qquad (5) \]
where the energy is taken to be a function of the x velocity, $E = 3mv_x^2/2$. (We consider the averages in the three directions to be equal.) Plugging expression (4) into this equation, we see that the first term, with $f_0$ alone, is odd in $v_x$ and hence vanishes. The second term gives us the conductivity.
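(c)
\[ \begin{aligned}
J_u^x &= -\frac{d\tau}{dx}\, \tau_c \int v_x^2\, E \left( -\frac{3}{2\tau} + \frac{E}{\tau^2} \right) D(E)\, f_0(E)\, dE \\
&= -\frac{d\tau}{dx}\, \tau_c \int \frac{2E}{3m}\, E \left( -\frac{3}{2\tau} + \frac{E}{\tau^2} \right) D(E)\, f_0(E)\, dE \\
&\equiv -K \frac{d\tau}{dx} .
\end{aligned} \qquad (6) \]
To perform the integral for $K$, we plug in the Maxwell distribution $f_0(E)\, D(E)\, dE = C(E)\, e^{-E/\tau}\, dE$. $C(E)$ has a factor of the concentration and a factor of $E^{1/2}$ from the density of states. Calling $E/\tau = x$, we have:
\[ K = \frac{\tau_c n \tau}{m}\; \frac{\int_0^\infty x \left( -x + \frac{2}{3} x^2 \right) e^{-x} x^{1/2}\, dx}{\int_0^\infty e^{-x} x^{1/2}\, dx} = \frac{\tau_c n \tau}{m}\; \frac{-\Gamma(7/2) + \frac{2}{3} \Gamma(9/2)}{\Gamma(3/2)} = 5\, \frac{\tau_c n \tau}{m} . \qquad (7) \]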
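The numerical factor of 5 in equation (7) is easy to confirm. The following sketch evaluates the ratio of Gamma functions with the standard library and, as an independent check, integrates the two integrals with a simple midpoint rule; the helper name, the cutoff at x = 60 and the number of steps are assumptions made for illustration.

```python
import math

# Closed form from the Gamma functions in equation (7).
ratio_gamma = (-math.gamma(3.5) + (2.0 / 3.0) * math.gamma(4.5)) / math.gamma(1.5)

def midpoint_integral(func, upper, n):
    """Midpoint-rule integral of func from 0 to upper."""
    h = upper / n
    return sum(func((i + 0.5) * h) for i in range(n)) * h

# Direct numerical integration; the integrands are negligible beyond x = 60.
numerator = midpoint_integral(
    lambda x: x * (-x + (2.0 / 3.0) * x * x) * math.exp(-x) * math.sqrt(x),
    60.0, 200_000)
denominator = midpoint_integral(
    lambda x: math.exp(-x) * math.sqrt(x), 60.0, 200_000)
ratio_numeric = numerator / denominator
print(ratio_gamma, ratio_numeric)
```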
Problem 6.

We shall consider a cylindrical sheet of fluid centered on the axis of the tube. This has length $L$, radius $r$ and thickness $dr$. The viscous force on the inner surface is $F = \eta L\, 2\pi r\, (dv/dr)_r$. The force on the outer surface is $F + dF = \eta L\, 2\pi (r + dr)\, (dv/dr)_{r+dr}$ in the opposite direction. The net force is $dF = 2\pi\eta L \left( r\,(d^2v/dr^2) + (dv/dr) \right) dr$, where all the derivatives are evaluated at $r$. This force is equal to the pressure times the area, $p\, 2\pi r\, dr$. This gives us a differential equation for $v(r)$:
\[ \frac{d^2 v}{dr^2} + \frac{1}{r} \frac{dv}{dr} = \frac{p}{L\eta} . \qquad (8) \]
To solve this, we substitute a quadratic ansatz $v(r) = A + Br + Cr^2$. We cannot have a linear term because there shouldn't be a cusp at $r = 0$, and we also have that the fluid does not flow at the wall ($v(r = a) = 0$). This gives $v(r) = (p/4L\eta)(r^2 - a^2)$, and the total flow rate has magnitude
\[ |\dot V| = \left| \int_0^a v(r)\, 2\pi r\, dr \right| = \frac{p \pi a^4}{8 L \eta} . \]
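The solution to Problem 6 can be checked numerically. The sketch below verifies by finite differences that the quadratic profile satisfies equation (8) and integrates it over the cross section to recover the a⁴ dependence of the flow rate. The parameter values are arbitrary and the function names are hypothetical.

```python
import math

def velocity(r, a, p, L, eta):
    """Profile v(r) = (p / 4 L eta) (r^2 - a^2) from the solution above."""
    return p / (4.0 * L * eta) * (r * r - a * a)

def ode_residual(r, a, p, L, eta, h=1e-3):
    """Left side minus right side of equation (8), using central differences."""
    v = lambda s: velocity(s, a, p, L, eta)
    d1 = (v(r + h) - v(r - h)) / (2.0 * h)
    d2 = (v(r + h) - 2.0 * v(r) + v(r - h)) / (h * h)
    return d2 + d1 / r - p / (L * eta)

def flow_rate(a, p, L, eta, n=100_000):
    """Midpoint-rule integral of v(r) 2 pi r dr from 0 to a."""
    h = a / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += velocity(r, a, p, L, eta) * 2.0 * math.pi * r * h
    return total

a, p, L, eta = 1.0, 2.0, 3.0, 0.5          # arbitrary illustrative values
residual = ode_residual(0.4, a, p, L, eta)
flow = flow_rate(a, p, L, eta)
expected = math.pi * p * a**4 / (8.0 * L * eta)
print(residual, abs(flow), expected)
```

With the sign convention of equation (8) the integral itself comes out negative, so only its magnitude is compared with pπa⁴/8Lη.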
Week 12. High Vacuum, Diffusion, Sound Waves and Heat Losses
Physics 301
9-Dec-2002 34-1
Reading K&K Chapter 15.
High Vacuum Statistical Mechanics

As we've mentioned, when the pressure is less than about a millionth of an atmosphere at room temperature, the mean free path becomes long enough that it's comparable to the size of laboratory apparatus. As the gas becomes even more rarefied, it becomes a better and better approximation to completely ignore the interactions between gas molecules and to consider only the interactions between the molecules and the walls of the container.

The flux density of particles crossing a unit area perpendicular to the x direction and headed in the positive x direction is
\[ \begin{aligned}
J_{n,x} &= n \left( \frac{m}{2\pi\tau} \right)^{3/2} \int_0^{+\infty} v_x\, dv_x\, e^{-mv_x^2/2\tau} \int_{-\infty}^{+\infty} dv_y\, e^{-mv_y^2/2\tau} \int_{-\infty}^{+\infty} dv_z\, e^{-mv_z^2/2\tau} \\
&= n \left( \frac{m}{2\pi\tau} \right)^{1/2} \int_0^{+\infty} v_x\, dv_x\, e^{-mv_x^2/2\tau} \\
&= n \sqrt{\frac{\tau}{2\pi m}} \\
&= \frac{1}{4}\, n \bar c .
\end{aligned} \]
If there is a hole of area $A$ in a vacuum system (with a good vacuum on the outside, too), the flux of particles leaving through the hole is
\[ \Phi = \frac{1}{4} A n \bar c = nS , \qquad S = \frac{1}{4} A \bar c . \]
$S$ is called the conductance of the hole. It's just the volume of gas which flows through the hole per unit time. Of course, the gas isn't very dense, so this may not be much mass! We can write the particle flux as
\[ \Phi = \frac{p}{\tau}\, S = \frac{Q}{\tau} , \qquad Q = pS . \]
$Q$ is called the throughput. It has the dimensions of a pressure times a volume per unit time. An interesting point about the throughput is that, numerically, it gives the volume per second that flows through a hole at unit pressure.
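For a feel for the size of a hole's conductance, here is a rough calculation for nitrogen at 300 K and a hole of 1 cm². It uses τ = k_B T with SI constants; the gas, the temperature and the hole area are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23       # J/K
AMU = 1.66053906660e-27  # kg

def mean_speed(temp_kelvin, mass_amu):
    """Mean speed c_bar = sqrt(8 tau / (pi m)) with tau = k_B T."""
    return math.sqrt(8.0 * K_B * temp_kelvin / (math.pi * mass_amu * AMU))

def hole_conductance(area_m2, temp_kelvin, mass_amu):
    """Conductance S = A c_bar / 4 of a thin hole, in m^3/s."""
    return 0.25 * area_m2 * mean_speed(temp_kelvin, mass_amu)

c_bar_n2 = mean_speed(300.0, 28.0)              # N2, assumed 28 amu
S_hole = hole_conductance(1e-4, 300.0, 28.0)    # 1 cm^2 hole
S_litres_per_s = S_hole * 1e3
print(c_bar_n2, S_litres_per_s)
```

The result is roughly a dozen litres per second per square centimeter of opening, independent of pressure as long as the gas is in the high vacuum regime.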
In general, there will be a flux of particles passing through a hole from both sides. The net flux passing through the hole from side 1 to side 2 is
\[ \Phi_{\rm net} = \frac{1}{4} A \left( n_1 \bar c_1 - n_2 \bar c_2 \right) = \frac{1}{4} A \left( \frac{p_1 \bar c_1}{\tau_1} - \frac{p_2 \bar c_2}{\tau_2} \right) . \]
Suppose we want there to be no net flux (by the way, we assume the gas on either side of the hole is the same kind of gas; the same molecular mass). Then
\[ \frac{p_1 \bar c_1}{\tau_1} = \frac{p_2 \bar c_2}{\tau_2} , \]
or
\[ \frac{p_1}{p_2} = \frac{\tau_1 \bar c_2}{\tau_2 \bar c_1} = \sqrt{\frac{\tau_1}{\tau_2}} , \]
and there can be a net flux of molecules even when the pressures are the same! The net flow is from the cold side to the hot side. In order to have zero net flux when there is a temperature difference, the pressure on the hot side must be higher than the pressure on the cold side! Since the energy of a molecule is proportional to the temperature, the net energy flux is
\[ P_{\rm net} \propto p_1 \sqrt{\tau_1} - p_2 \sqrt{\tau_2} . \]
Even when the pressures are equal and the net flow of particles is from cold to hot, the net flow of energy is from hot to cold!

An interesting calculation involves the flow through a tube under high vacuum conditions. There must be a pressure difference from one end of the tube to the other to account for the "friction" between the molecules and the walls of the tube. We suppose the gas particles are flowing through the tube with mean speed (parallel to the tube) $u$. The tube has diameter $d$, input opening cross section $A = \pi d^2/4$, length $L$, and the concentration of molecules in the tube is $n$. We suppose that on average, when a molecule hits the wall of the tube, it has a velocity parallel to the direction of flow equal to the average velocity, $u$. When it recoils from its collision with the wall of the tube, on the average, it has zero velocity parallel to the tube. In other words, each collision with the wall transfers momentum $P = mu$ and scatters the molecule isotropically. This is analogous to the assumption that a molecule is thermalized at its last scattering which occurred one mean free path away. This is the assumption we made when considering transport in gases. The rate at which molecules strike the tube is $n \bar c\, \pi L d/4$, so the net longitudinal force on the tube is
\[ F = \frac{\pi}{4}\, m u n \bar c L d = \Delta p\, A , \]
where the pressure difference times the area of the opening is the "external force." We solve for the average velocity and then the net flux of molecules through the tube,
\[ u = \frac{\Delta p\, d}{m n \bar c L} , \]
\[ \Phi_{\rm net} = n u A = \frac{\Delta p\, A d}{m \bar c L} = \frac{\Delta p\, S}{\tau} , \]
where
\[ S = \frac{A d \tau}{m \bar c L} = \frac{1}{4} A \bar c \left( \frac{\pi d}{2L} \right) \]
is the conductance of the tube. We see that the conductance is the same as the conductance of a hole with the same opening area as the tube multiplied by the ratio of the diameter to the length. This just says that the longer the tube, the more likely it is that a molecule will collide with the walls of the tube and the harder it is for a molecule to get through. Of course, the above treatment breaks down for a short tube because it gives a bigger conductance than a simple hole. This is because the assumption that most molecules hit the wall breaks down for a short tube. K&K point out that the factor of $\pi/2$ in parentheses should be replaced by 4/3 if a more careful averaging is done. Oh well...

Vacuum pumps have a characteristic "speed" which is the volume per unit time pumped at the input pressure of the pump. Note that this is the same definition as that of the conductance of a hole or tube. Conductance is also referred to as speed. If a pump is in series with a tube, the reciprocals of the speeds add to give the reciprocal effective speed. So speed (conductance) is inverse resistance, and in this respect the conductance and resistance we've been talking about are like electrical conductance and resistance. To see this, suppose the system is isothermal and suppose the pressure at the intake end of the tube is $p_1$ and the pressure at the output of the tube, which is the input of the pump, is $p_2$. Then the net flux is (analogous to electric current)
\[ \Phi_{\rm net} = \frac{p_1}{\tau}\, S_{\rm eff} = \frac{p_1 - p_2}{\tau}\, S_{\rm tube} = \frac{p_2}{\tau}\, S_{\rm pump} , \]
from which we deduce
\[ \frac{1}{S_{\rm eff}} = \frac{1}{S_{\rm tube}} + \frac{1}{S_{\rm pump}} . \]
Also note that $p/\tau$ is analogous to the electric potential. (Although we might be pushing this a little too far, since there's a $\sqrt{\tau}$ in the speeds which comes from $\bar c$.)

If we start out with a volume $V$ of ideal gas at temperature $\tau$, containing $N$ molecules, then $p = N\tau/V$. If this is attached to a pumping system with effective speed $S$, the rate at which the pressure decreases is
\[ \frac{dp}{dt} = \frac{\tau}{V} \frac{dN}{dt} = -\frac{\tau}{V} \frac{pS}{\tau} = -\frac{S}{V}\, p , \]
which has the solution
\[ p = p_0\, e^{-t/t_0} , \qquad t_0 = \frac{V}{S} . \]
Note that we assumed the system is isothermal. In actual practice, the pressure goes down rapidly (roughly like the above solution) at first, but then reductions occur much more slowly. Molecules adsorbed on the surfaces of the vacuum apparatus desorb and must be pumped out. Also, molecules trapped in the interiors of porous materials can outgas and keep the pressure from dropping as rapidly as one would expect.
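The tube conductance, the series rule and the exponential pump-down fit together in a short estimate. The sketch below assumes a 2 cm bore, 50 cm long tube, a mean molecular speed of 476 m/s (roughly nitrogen near room temperature), a pump speed of 0.1 m³/s and a 50 litre chamber; all of these numbers and the function names are hypothetical, and outgassing is ignored.

```python
import math

def tube_conductance(diameter, length, c_bar):
    """S = (1/4) A c_bar (pi d / 2L), the simple long-tube model derived above."""
    area = math.pi * diameter**2 / 4.0
    return 0.25 * area * c_bar * math.pi * diameter / (2.0 * length)

def series_speed(*speeds):
    """Effective speed of elements in series: reciprocals add."""
    return 1.0 / sum(1.0 / s for s in speeds)

def pump_down_time(volume, speed, p_start, p_end):
    """Time for p = p_start * exp(-t S / V) to fall to p_end."""
    return volume / speed * math.log(p_start / p_end)

c_bar = 476.0                                 # m/s, assumed
S_tube = tube_conductance(0.02, 0.5, c_bar)   # m^3/s
S_pump = 0.1                                  # m^3/s, hypothetical pump
S_eff = series_speed(S_tube, S_pump)
volume = 0.05                                 # m^3, hypothetical chamber
t0 = volume / S_eff
t_factor_1000 = pump_down_time(volume, S_eff, 1e-3, 1e-6)
print(S_tube, S_eff, t0, t_factor_1000)
```

In this example the narrow tube, not the pump, limits the effective speed, which is the usual reason for keeping vacuum plumbing short and wide.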
Diffusion Equations

All of our transport equations had the generic form that a gradient of a quantity times a constant yielded a flux density of a related quantity. For example, consider heat conduction and temperature,
\[ \mathbf J_u = -K \nabla \tau . \]
When we discussed transport we assumed a steady state. However, the main requirement for the above transport equation to be valid is that the system be not too far from equilibrium so that we need only consider a linear relation between flux density and driving force.

Suppose we consider the divergence of $\mathbf J_u$. What is the physical meaning of $\nabla \cdot \mathbf J_u$? To find out, we'll integrate over a volume $V$ bounded by a surface $S$ and use the divergence theorem,
\[ \int_V \nabla \cdot \mathbf J_u\, dV = \oint_S \mathbf J_u \cdot \mathbf n\, dA , \]
where $\mathbf n$ is the outward pointing unit normal to the surface $S$ and $dA$ is an element of area on the surface. Recall that the magnitude of $\mathbf J_u$ is the energy per unit time that flows across a unit area perpendicular to $\mathbf J_u$. The integral above is the amount of energy per unit time that leaves the volume $V$ by passing through the surface $S$. Assuming that any change in energy inside $S$ must be the result of energy passing through $S$, we have
\[ \int_V \nabla \cdot \mathbf J_u\, dV = -\frac{d}{dt} \int_V u\, dV , \]
where $u$ is the energy density. If we assume $S$ is a fixed surface, then we can move the time derivative inside the integral where it must become a partial derivative. We have
\[ \int_V \left( \nabla \cdot \mathbf J_u + \frac{\partial u}{\partial t} \right) dV = 0 . \]
This must be true for any volume, so the integrand must be zero and we have
\[ \nabla \cdot \mathbf J_u + \frac{\partial u}{\partial t} = 0 . \]
This is called a continuity equation. It simply says that the change in energy in a volume is due to energy passing through the boundaries of the volume. It can also be called an energy conservation equation. It's exactly analogous to the charge conservation equation in E&M; just replace $\mathbf J_u$ with $\mathbf J_q$ (electric current density) and $u$ with $\rho$ (charge density).

Now we consider the time rate of change of $u$ in more detail. In our previous discussion, we took $u = \hat C_V \tau$, where $\hat C_V$ is the heat capacity per unit volume at constant volume. This was motivated by considering an ideal gas. We will assume that volume changes can be ignored, and then we have $du = \hat C\, d\tau$, which is valid for systems other than an ideal gas provided $\hat C$ can be treated as independent of temperature. Then the continuity equation becomes
\[ \nabla \cdot \mathbf J_u + \hat C\, \frac{\partial \tau}{\partial t} = 0 . \]
Finally we substitute from the transport equation to eliminate $\mathbf J_u$ and we have
\[ \nabla^2 \tau - \frac{1}{D_\tau} \frac{\partial \tau}{\partial t} = 0 , \qquad D_\tau = \frac{K}{\hat C} . \]
The constant $D_\tau$ is the "diffusivity" of the temperature or internal energy (since $du = \hat C\, d\tau$). The equation we've just derived, which has a second space derivative minus a constant times a first time derivative, is called a diffusion equation. You will notice it looks a little like the Schroedinger equation, except it's entirely real. This changes the character of the solutions from oscillations (for Schroedinger) to "spreading out" (for diffusion). It also looks a little like a wave equation except a wave equation has a second time derivative. Again, this changes the solutions from oscillations to diffusion. The diffusion equation, like Schroedinger's equation and the wave equation, is one of the ubiquitous partial differential equations of physics. In the homework you are asked to derive the diffusion equation for particle diffusion.

Note that the diffusivity has the dimensions of a speed times a length, or length squared divided by time. For a gas, the order of magnitude is the average velocity times the mean free path. For an insulating solid, one might guess it's the speed of sound times the mean distance between molecules. For a metallic solid, it's more complicated as energy is transported by electrons but stored in lattice vibrations.

The diffusion equation occurs with a magnetic field in a conducting medium. Starting from Ampere's law (Gaussian units), we have
\[ \nabla \times \mathbf B = \frac{1}{c} \frac{\partial \mathbf E}{\partial t} + \frac{4\pi}{c}\, \mathbf J . \]
In a conducting medium, $\mathbf J = \sigma \mathbf E$ where $\sigma$ is the conductivity. Insert this in place of the current density and take the curl of Ampere's law:
\[ \nabla \times (\nabla \times \mathbf B) = \frac{1}{c} \frac{\partial\, (\nabla \times \mathbf E)}{\partial t} + \frac{4\pi\sigma}{c}\, \nabla \times \mathbf E . \]
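We expand the double curl with a vector identity,
\[ \nabla \times (\nabla \times \mathbf B) = \nabla (\nabla \cdot \mathbf B) - \nabla^2 \mathbf B = -\nabla^2 \mathbf B , \]
where we eliminated the first term since there is a Maxwell equation which says $\nabla \cdot \mathbf B = 0$. Substitute for $\nabla \times \mathbf E$ from Faraday's law,
\[ \nabla \times \mathbf E = -\frac{1}{c} \frac{\partial \mathbf B}{\partial t} , \]
to get
\[ -\nabla^2 \mathbf B = -\frac{1}{c^2} \frac{\partial^2 \mathbf B}{\partial t^2} - \frac{4\pi\sigma}{c^2} \frac{\partial \mathbf B}{\partial t} . \]
If the conductivity is larger than the characteristic frequency associated with the time variation of $\mathbf B$, in other words, for a good conductor, we can ignore the term with the second time derivative compared to the first and we have
\[ \nabla^2 \mathbf B - \frac{1}{D_B} \frac{\partial \mathbf B}{\partial t} = 0 , \qquad D_B = \frac{c^2}{4\pi\sigma} . \]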
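Because a diffusivity is a length squared divided by a time, a crude time scale for diffusion over a distance L is T ~ L²/D. The sketch below applies this to two cases: ammonia crossing a 10 m room with D ≈ 0.2 cm²/s (the values used in the problem set solutions in these notes), and a magnetic field soaking 1 cm into copper, taking a Gaussian-unit conductivity of about 5.2 × 10¹⁷ s⁻¹ as an assumed room temperature value. The function names are hypothetical.

```python
import math

def diffusion_time(length_cm, diffusivity_cm2_s):
    """Crude time scale T ~ L^2 / D, in seconds."""
    return length_cm**2 / diffusivity_cm2_s

def magnetic_diffusivity(sigma_gaussian):
    """D_B = c^2 / (4 pi sigma) in Gaussian units: sigma in 1/s, result in cm^2/s."""
    c = 2.99792458e10  # speed of light, cm/s
    return c**2 / (4.0 * math.pi * sigma_gaussian)

t_ammonia = diffusion_time(1000.0, 0.2)    # 10 m room, D = 0.2 cm^2/s
D_B_cu = magnetic_diffusivity(5.2e17)      # assumed conductivity of copper
t_field = diffusion_time(1.0, D_B_cu)      # 1 cm of copper
print(t_ammonia, D_B_cu, t_field)
```

Molecular diffusion across a room takes months on this estimate, while a field penetrates a centimeter of copper in milliseconds; the same equation covers both because only L²/D matters.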
Sample Solution of the Diffusion Equation: Equilibrating Bar

In the next few sections we'll discuss some solutions of the diffusion equation. The first thing to notice is that it's a linear, homogeneous equation. This means that any solution can be multiplied by a constant and this yields another solution. Also, the sum of any two solutions is a solution. In general, the solution must be determined in conjunction with the boundary conditions: specifications of the desired solution at a given time from which the solution can be integrated forward (it's a first time derivative). It is often a good idea to expand the solution as a sum of simple solutions such as plane waves. The plane waves can be evaluated at $t = 0$ and adjusted to fit the boundary conditions. Then the diffusion equation is used to determine the time dependence of each wave.

As an example, suppose we have a bar which has been used to conduct heat between a hot reservoir and a cold reservoir. If the bar has come to a steady state, and if it has a uniform cross section, there will be a linear temperature gradient along the bar. Suppose the reservoirs are removed and the bar is isolated from the rest of the world. What happens? Heat flows from the hot end of the bar to the cold end until the bar reaches a uniform temperature distribution. What happens in detail? To answer that, we have to solve the diffusion equation with the initial condition that there is a uniform temperature gradient in the bar.
We let $x$, $0 < x < L$, represent the coordinate along the bar, with $L$ being the length of the bar. At $t = 0$, when the reservoirs are removed, we can take the temperature in the bar to be
$$\tau(x, 0) = \frac{1}{2}\Delta\tau\,(1 - 2x/L) ,$$
where $\Delta\tau$ is the temperature difference from one end to the other. This makes the mean temperature 0. Actually, we can add a constant to this temperature without affecting the problem, so the mean temperature can be the mean temperature of the two reservoirs.

Since this is a one-dimensional problem, $\nabla^2 = d^2/dx^2$ and we have $\nabla^2\tau = 0$, which says $d\tau/dt = 0$, which says that the bar just sits there with the linear temperature gradient. We know this can't be right, but where did we go wrong? In fact, it's right when the bar is between the two reservoirs and a steady state has been reached. But in this situation, heat is entering the bar at the hot end and leaving the bar at the cold end. Once we remove the bar from the reservoirs, we need to look for solutions with $J_x(x = 0) = 0$ and $J_x(x = L) = 0$. Since $\mathbf{J} \propto \nabla\tau$, we need solutions with $d\tau/dx = 0$ at the ends of the bar.

Our boundary conditions are that at $t = 0$ the bar has a linear temperature gradient as above, and for all time $t > 0$, $d\tau/dx = 0$ at the ends of the bar. These may seem like incompatible conditions. The problem is the abrupt change of slope at the ends of the bar. We will see that what happens in our solution is that the heat flow in the bar when it's removed from the reservoirs causes an "infinitely" fast flattening of the slope at the ends so there is no heat flow into or out of the ends. (In real life, it's not possible to remove the bar from the reservoir abruptly, so it's not necessary to have this infinitely fast change!)

Suppose we consider a solution of the form $\sin(kx)\,f(t)$, where $k$ is a constant (the wave number) and $f(t)$ is a yet to be determined function of time. If we plug this into the diffusion equation, we have
$$-k^2 \sin(kx)\,f(t) - \frac{1}{D}\sin(kx)\frac{\partial f(t)}{\partial t} = 0 ,$$
which has the solution for $f(t)$,
$$f(t) = f_0\, e^{-Dk^2 t} .$$
Similarly,
$$\tau(x, t) = \cos(kx)\, e^{-Dk^2 t} ,$$
is also a solution of the diffusion equation. An arbitrary sum of these solutions is also a solution. So, our plan is to evaluate the solutions at $t = 0$, find a sum which matches the boundary conditions, and then add the time dependence to the sum to get the solution for times $t > 0$.

The solutions above don't necessarily satisfy the boundary condition having to do with the gradient of $\tau$ at the ends of the bar. In particular, we can't use the $\sin(kx)$ functions at all, because they give $d\tau/dx \propto \cos(kx)$ which is non-zero at $x = 0$. So we can only use the $\cos(kx)$ functions which automatically satisfy the boundary condition at $x = 0$. At $x = L$, we have $d\tau/dx \propto \sin(kL)$. This will be zero if $kL = n\pi$ where $n$ is an integer. So, our general solution which satisfies the boundary conditions at the ends of the bar is
$$\tau(x, t) = \sum_{n=1}^{\infty} A_n \cos(k_n x)\, e^{-D k_n^2 t} , \qquad k_n = \frac{n\pi}{L} ,$$
where $A_n$ are constants to be adjusted to make the solution have the correct linear dependence at $t = 0$. In other words, we are writing the linear temperature gradient as a Fourier series. To determine the coefficients $A_n$, we multiply by $\cos(k_m x)$ and integrate from 0 to $L$. We get 0 for $n \ne m$. For $n = m$, we have
$$\int_0^L A_m \cos(k_m x)\cos(k_m x)\,dx = A_m L/2 .$$
We do the same thing with the desired linear dependence. Note that when $n$ is even the cosine functions are symmetric about the center of the bar. The linear temperature profile is odd about the center of the bar, so there will be no even terms in the sum. For the odd terms, the coefficients are
$$\begin{aligned}
A_n L/2 &= \int_0^L \frac{1}{2}\Delta\tau\,(1 - 2x/L)\cos(n\pi x/L)\,dx , \\
&= \left.\frac{\Delta\tau L}{2n\pi}(1 - 2x/L)\sin(n\pi x/L)\right|_0^L + \frac{\Delta\tau L}{2n\pi}\,\frac{2}{L}\int_0^L \sin(n\pi x/L)\,dx , \\
&= 0 - 0 - \left.\frac{\Delta\tau L}{n^2\pi^2}\cos(n\pi x/L)\right|_0^L , \\
&= \frac{2\Delta\tau L}{n^2\pi^2} , \qquad \text{remember $n$ is odd,}
\end{aligned}$$
which gives
$$A_n = \frac{4\Delta\tau}{n^2\pi^2} ,$$
and our solution for $t > 0$ is
$$\tau(x, t) = \sum_{n=1,3,5,\ldots} \frac{4\Delta\tau}{n^2\pi^2}\cos(k_n x)\, e^{-D k_n^2 t} , \qquad k_n = \frac{n\pi}{L} .$$
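The series is easy to evaluate numerically. The following is a minimal sketch, not part of the original notes: the function names, the default values L = D = Δτ = 1, and the cutoff n_max are all illustrative assumptions. It sums the odd cosine modes and can be compared with the linear profile at t = 0.

```python
import math

def bar_temperature(x, t, L=1.0, D=1.0, delta_tau=1.0, n_max=2001):
    """Partial Fourier sum for the isolated bar (odd n only).

    L, D, delta_tau and n_max are illustrative choices, not values from the notes.
    """
    total = 0.0
    for n in range(1, n_max + 1, 2):              # n = 1, 3, 5, ...
        k_n = n * math.pi / L                     # wave number of mode n
        a_n = 4.0 * delta_tau / (n * math.pi) ** 2
        total += a_n * math.cos(k_n * x) * math.exp(-D * k_n ** 2 * t)
    return total

def initial_profile(x, L=1.0, delta_tau=1.0):
    """Linear steady-state profile tau(x, 0) = (1/2) delta_tau (1 - 2x/L)."""
    return 0.5 * delta_tau * (1.0 - 2.0 * x / L)
```

At t = 0 the partial sum should approach the straight line away from the ends of the bar, and at the center of the bar every odd mode vanishes, so the temperature there stays at the mean value for all times.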
Some comments are in order. Each mode decays like $\exp(-t/t_n)$ where the decay time is
$$t_n = \frac{L^2}{D n^2 \pi^2} ,$$
so short wavelength (large $k_n$) modes decay very fast compared to the $n = 1$ mode. The decay time is inversely proportional to the square of the wavenumber. It is the very short wavelengths that are necessary to produce the discontinuity in slope at the ends of the bar. The discontinuity decays very rapidly (infinitely rapidly if we go all the way to $n = \infty$!).

The figure shows the temperature as a function of position in the bar for several times including $t = 0$, and $t = (0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1)\,t_1$ where $t_1 = L^2/D\pi^2$ is the decay time of the longest wavelength mode. The plotted functions include the Fourier terms $n = 1, 3, 5, 7, 9, 11, 13, 15$. The function for $t = 0$ has some small wiggles at the ends. This is because it's missing the high frequency ($n > 15$) components. By the time the $n = 1$ component has experienced one decay time, the next slowest decaying term, $n = 3$, has experienced 9 decay times. In other words the temperature profile becomes indistinguishable from a half cycle of a cosine very quickly!
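To put a number on "very quickly", here is a small sketch (the function names are hypothetical, introduced only for illustration) of the decay time $t_n$ and of the size of mode $n$ relative to the $n = 1$ mode, using $A_n \propto 1/n^2$ and the $\exp(-t/t_n)$ decay from above.

```python
import math

def decay_time(n, L, D):
    """Decay time t_n = L^2 / (D n^2 pi^2) of mode n."""
    return L ** 2 / (D * n ** 2 * math.pi ** 2)

def relative_amplitude(n, t_over_t1):
    """Amplitude of mode n divided by that of mode 1, at time t measured in units of t_1.

    The 1/n^2 factor is the ratio of Fourier coefficients A_n / A_1;
    the exponential is exp(-t/t_n) / exp(-t/t_1) with t_n = t_1 / n^2.
    """
    return (1.0 / n ** 2) * math.exp(-(n ** 2 - 1) * t_over_t1)
```

For example, relative_amplitude(3, 1.0) is (1/9) e^{-8}, a few parts in 10^5, so after one decay time of the fundamental the n = 3 term is already negligible.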
Physics 301
11-Dec-2002 35-1
The Dispersion Relation

When considering the diffusion equation in the last lecture, we found that an expansion in cosine components was quite useful. One might also ask under what conditions is a plane wave a solution of the diffusion equation? Suppose
$$\tau(\mathbf{r}, t) = \tau_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} .$$
Plug into the diffusion equation and get
$$-k^2 \tau_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + \frac{i\omega}{D}\,\tau_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = 0 ,$$
or
$$Dk^2 = i\omega .$$
A relation between $k$ and $\omega$ is called a dispersion relation. This comes from the fact that if $k$ and $\omega$ are not proportional then a wave pulse disperses (or spreads out).

In the example of the equilibrating bar, the spatial parts of the solutions are composed of pure oscillations. This means $k$ is real and $k^2$ is positive. The dispersion relation tells us that $\omega$ is pure imaginary, $\omega = -iDk^2$. When plugged back into the plane wave solution, the negative $i$ in $\omega$ times the negative $i$ in the plane wave exponent gives $-1$, so we get an exponential decay in time. We'll consider some other consequences of the dispersion relation later.
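The sign bookkeeping can be checked with complex arithmetic. This is a minimal sketch with hypothetical function names and a one-dimensional wave assumed for simplicity: for real k, the plane wave built from the dispersion relation has modulus exp(-D k^2 t).

```python
import cmath
import math

def omega_from_k(k, D):
    """Dispersion relation D k^2 = i omega, solved for omega."""
    return -1j * D * k ** 2

def plane_wave(x, t, k, D, tau0=1.0):
    """One-dimensional plane wave tau0 exp[i(kx - omega t)] with omega from the dispersion relation."""
    omega = omega_from_k(k, D)
    return tau0 * cmath.exp(1j * (k * x - omega * t))
```

Taking abs(plane_wave(x, t, k, D)) removes the oscillating factor exp(ikx) and leaves only the decay in time.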
Random Walks and Diffusion

A simple one-dimensional random walk consists of $N$ steps. Each step is of length $\ell$ but may be to the left or right. The probability for either direction is the same, 1/2. The question is, after $N$ steps, how far from the origin (where you started) have you randomly walked? The probability of $n$ steps to the right and $(N - n)$ steps to the left is just the binomial distribution (see lecture 3)
$$P(n) = \binom{N}{n}\frac{1}{2^N} .$$
The distance from the origin is the number of positive steps minus the number of negative steps times the length per step
$$d = [n - (N - n)]\,\ell = (2n - N)\,\ell .$$
The average distance is 0. What's the root mean square distance?
$$\sqrt{\langle d^2 \rangle} = \sqrt{\langle 4(n - N/2)^2 \rangle\, \ell^2} = \sqrt{4Np(1 - p)}\;\ell = \sqrt{N}\,\ell .$$
In other words, the mean position is still the origin, but the spread around the mean position grows as the square root of the number of steps.

Now suppose each step is accomplished at speed $\bar{c}$. Since each step is of length $\ell$, the time per step is $\ell/\bar{c}$. If the random walk lasts for time $t$, the number of steps is $N = t/(\ell/\bar{c}) = t\bar{c}/\ell$ and
$$\sqrt{\langle d^2 \rangle} = \sqrt{t\,\bar{c}\,\ell} .$$
Notice that what's multiplying $t$ inside the square root is basically the diffusivity! So, our simple random walk satisfies
$$\langle d^2 \rangle = Dt .$$
The variance in position is proportional to the time and the proportionality constant is $D$.

What does this have to do with transport and the diffusion equation, you ask? Recall, our picture for transport is that a molecule travels (on the average) at speed $\bar{c}$ for about one mean free path $\ell$. At that point (on the average), it has a randomizing collision and goes off in a random direction. This is three dimensional, but other than that it's the random walk we've been discussing.

Let's consider again the example of the equilibrating bar. This time, we'll think of it as energy that must randomly walk until equilibrium is achieved. That is, until the energy is sufficiently spread out (sufficiently large variance) that it appears uniform on the scale of interest. Consider a mode with wave number $k_n = n\pi/L$. The characteristic distance for this mode is $k_n^{-1}$. Our random walk model says that the energy will random walk to this rms distance in time $t_n$ given by
$$t_n = \frac{1}{D}\,d_n^2 = \frac{L^2}{D n^2 \pi^2} ,$$
and sure enough, $t_n$ is the decay time for mode $n$. This is a characteristic property of diffusion processes and random walks: doubling the length scale quadruples the time scale.
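The result $\langle d^2\rangle = N\ell^2$ can be confirmed without any simulation by summing over the binomial distribution directly. The sketch below is illustrative only (the function name is an assumption, and it needs Python 3.8 or later for math.comb).

```python
import math

def mean_square_distance(N, ell=1.0):
    """Exact <d^2> for an N-step symmetric random walk with step length ell."""
    total = 0.0
    for n in range(N + 1):
        prob = math.comb(N, n) / 2 ** N       # binomial P(n) with p = 1/2
        d = (2 * n - N) * ell                 # net displacement for n right-steps
        total += prob * d * d
    return total
```

For N = 100 steps of length 2 this returns 400 to rounding error, that is, an rms distance of sqrt(N) times the step length.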
Sample Solution: A Temperature Oscillation

Suppose we consider a slab of material which has a boundary at $x = 0$. The material extends to $x > 0$. We suppose everything is uniform in the $y$ and $z$ directions. Suppose that the $x = 0$ boundary is forced (by some external agent) to undergo a sinusoidal temperature oscillation,
$$\tau(0, t) = \tau_0\, \mathrm{Re}\, e^{-i\omega t} .$$
Since the diffusion equation is linear, the real part of a complex solution is a solution, so we'll usually drop the indication that we're considering the real part and take it to be implied. We want to know the temperature distribution, $\tau(x, t)$, in the slab. We'll have a plane wave propagating in the $x$ direction. The wave number is determined by the dispersion relation,
$$k^2 = \frac{i\omega}{D} .$$
We can take the square root,
$$k = \pm\sqrt{i}\,\sqrt{\frac{\omega}{D}} = \pm\sqrt{\frac{\omega}{2D}}\,(1 + i) = \pm k_1 (1 + i) ,$$
where $k_1 = \sqrt{\omega/2D}$ is a handy abbreviation. So the solution is
$$\tau(x, t) = \tau_0\, e^{\pm i k_1 x \mp k_1 x - i\omega t} = \tau_0\, e^{\mp k_1 x}\, e^{\pm i k_1 x - i\omega t} .$$
So the solution is a plane wave (the last factor above) times an exponential decay (or growth) in the $x$-direction (the middle factor above). The exponential growth of the wave into the material is unphysical, and we must take the upper sign to be sure the wave decays as we go into the slab. We have a wave, but it is damped with a damping length comparable to its wavelength. This means that the wave "doesn't get very far" (in terms of wavelengths). K&K give some numerical examples showing that not much dirt is needed to insulate underground pipes from the day-night cycle or the summer-winter cycle!
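As a rough numerical illustration, the damping length $1/k_1 = \sqrt{2D/\omega}$ can be evaluated for a daily cycle. The sketch below uses the soil diffusivity $D = 10^{-3}$ cm$^2$ s$^{-1}$ quoted in the homework solutions; the function names are hypothetical.

```python
import math

def damping_length(D, period):
    """Damping length 1/k_1 = sqrt(2 D / omega) for a forcing of the given period."""
    omega = 2.0 * math.pi / period
    return math.sqrt(2.0 * D / omega)

def amplitude_factor(x, D, period):
    """Factor exp(-k_1 x) by which the oscillation amplitude is reduced at depth x."""
    return math.exp(-x / damping_length(D, period))

# Daily cycle in soil, D in cm^2/s and period in s, so lengths come out in cm.
delta_day = damping_length(1e-3, 86400.0)
```

With these inputs delta_day is a little over 5 cm. One full wavelength, $2\pi/k_1$, corresponds to an amplitude factor of $e^{-2\pi}$, which is why the wave "doesn't get very far".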
The Diffusion of a One Dimensional Bump

When we looked at a one dimensional random walk, we found that the variance in position grows in proportion to the time. We've also argued that microscopically, diffusion is just a random walk. So, we might expect that if we start with an excess of energy concentrated at $x = 0$, it will diffuse away from the origin in such a way that the mean position of the energy remains at the origin, but the variance of the location of the energy grows with time. Furthermore, since the macroscopic energy distribution is determined by many small events, we might expect the distribution to be Gaussian. So, we guess that the one-dimensional temperature distribution
$$\tau(x, t) = \frac{1}{\sqrt{2\pi a t}}\, e^{-x^2/2at} ,$$
might be a solution of the diffusion equation. Here, $a$ is a constant that must have the same dimensions as a diffusion constant. This distribution is a Gaussian, centered at 0, with variance $at$. To find out if this is a solution, we must plug into the diffusion equation for the temperature. We find
$$\frac{\partial^2 \tau}{\partial x^2} = \frac{1}{\sqrt{2\pi a t}}\, e^{-x^2/2at}\left(-\frac{1}{at} + \frac{x^2}{a^2 t^2}\right) ,$$
$$\frac{\partial \tau}{\partial t} = \frac{1}{\sqrt{2\pi a t}}\, e^{-x^2/2at}\left(-\frac{1}{2t} + \frac{x^2}{2at^2}\right) .$$
The one-dimensional diffusion equation for the temperature is
$$\frac{\partial^2 \tau}{\partial x^2} - \frac{1}{D}\frac{\partial \tau}{\partial t} = 0 ,$$
which is solved by our distribution provided $a = 2D$. This result reinforces the interpretation of diffusion as a random walk. It also has another application.

The solution we've just constructed has the following interesting properties. First of all,
$$\int_{-\infty}^{+\infty} \tau(x, t)\,dx = 1 .$$
In other words, the solution is normalized to unit "total temperature" for all $t > 0$. As $t \to 0$, the distribution gets infinitely narrow and infinitely high, but it still integrates to 1. Therefore if $g(x)$ is any reasonably well behaved function, it must be that
$$\lim_{t\to 0}\int_{-\infty}^{+\infty} g(x)\,\tau(x, t)\,dx \to g(0)\,\lim_{t\to 0}\int_{-\infty}^{+\infty} \tau(x, t)\,dx \to g(0) .$$
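The condition $a = 2D$ can also be checked numerically with finite differences. The following is only a sketch under stated assumptions: the function names, the step size h, and the sample point are arbitrary illustrative choices.

```python
import math

def gaussian_bump(x, t, a):
    """Trial solution: Gaussian centered at 0 with variance a*t."""
    return math.exp(-x * x / (2.0 * a * t)) / math.sqrt(2.0 * math.pi * a * t)

def diffusion_residual(x, t, a, D, h=1e-3):
    """Central-difference estimate of d2(tau)/dx2 - (1/D) d(tau)/dt for the trial solution."""
    d2x = (gaussian_bump(x + h, t, a) - 2.0 * gaussian_bump(x, t, a)
           + gaussian_bump(x - h, t, a)) / h ** 2
    dt = (gaussian_bump(x, t + h, a) - gaussian_bump(x, t - h, a)) / (2.0 * h)
    return d2x - dt / D
```

With a = 1 and D = 0.5 (so a = 2D) the residual is zero to within the truncation error of the differences, while a = 1 with D = 1 leaves a residual that is clearly non-zero.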
If you know about $\delta$-functions, what we've just shown is that as $t \to 0$, our temperature distribution behaves like $\delta(x)$. We can shift the origin to $x'$ simply by subtracting $x'$ in the argument of the exponential. We can shift the origin of time to $t'$ by subtracting $t'$ from $t$ in both places where $t$ occurs. We can also call this function $G(x, x', t, t')$ rather than $\tau(x, t)$. Now we have
$$G(x, x', t, t') = \frac{1}{\sqrt{4\pi D(t - t')}}\, e^{-(x - x')^2/4D(t - t')} .$$
This is a solution to the thermal diffusion equation in $x$ and $t > t'$ and it corresponds to a "point source" of unit temperature at $x = x'$ and $t = t'$. The point source diffuses away from where it starts as time goes on.

Now suppose that at $t'$ we have an arbitrary distribution of temperature $\tau(x, t')$. Note that
$$\tau(x, t') = \lim_{t \to t'}\int_{-\infty}^{+\infty} \tau(x', t')\,G(x, x', t, t')\,dx' .$$
In other words, if we multiply our temperature distribution by $G$, integrate over $x'$ and take the limit $t \to t'$, we get back the temperature distribution we put in. What we've done is to treat the temperature distribution at $t'$ as an infinite number of point sources located at $x'$. The strength of each point source is $\tau(x', t')\,dx'$. Remember, the diffusion equation is linear and homogeneous so any sum of solutions is a solution. We know how a point source diffuses; the diffusion of a sum of point sources is just the sum of the diffusion of the individual point sources. In other words, if the temperature distribution at time $t'$ is $\tau(x, t')$, the distribution at a later time $t$ is
$$\tau(x, t) = \int_{-\infty}^{+\infty} \tau(x', t')\,G(x, x', t, t')\,dx' .$$
If you look at what's going on, you see that we are convolving the original temperature distribution with the Gaussian $G$ to get the later temperature distribution. Convolving with a Gaussian is a smoothing operation. In fact, it's often called a low pass filter. As time goes on, the filter gets wider in proportion to the square root of the time.

Interesting point: suppose you have a temperature distribution at time $t_1$. To propagate the temperature distribution to time $t_2$, you can use $G(x, x', t_2, t_1)$. To propagate from $t_1$ to $t_3 > t_2$ you can use $G(x, x', t_3, t_1)$. OR, you can propagate to $t_2$ and then regard the temperature distribution at $t_2$ as the initial distribution and propagate that to $t_3$ with $G(x, x', t_3, t_2)$. In other words
$$\begin{aligned}
\tau(x, t_3) &= \int_{-\infty}^{+\infty} \tau(x', t_1)\,G(x, x', t_3, t_1)\,dx' , \\
&= \int_{-\infty}^{+\infty} \tau(x'', t_2)\,G(x, x'', t_3, t_2)\,dx'' , \\
&= \int_{-\infty}^{+\infty} \left[\int_{-\infty}^{+\infty} \tau(x', t_1)\,G(x'', x', t_2, t_1)\,dx'\right] G(x, x'', t_3, t_2)\,dx'' , \\
&= \int_{-\infty}^{+\infty} \tau(x', t_1)\left[\int_{-\infty}^{+\infty} G(x, x'', t_3, t_2)\,G(x'', x', t_2, t_1)\,dx''\right] dx' .
\end{aligned}$$
Or,
$$G(x, x', t_3, t_1) = \int_{-\infty}^{+\infty} G(x, x'', t_3, t_2)\,G(x'', x', t_2, t_1)\,dx'' .$$
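The composition rule can be checked by doing the intermediate integral numerically. The sketch below is an illustration, not a general-purpose solver: the function names, the trapezoid rule, and the integration window are assumptions that are adequate only when both Gaussians are much narrower than the window.

```python
import math

def green(x, x_src, t, t_src, D):
    """One-dimensional diffusion Green's function G(x, x', t, t') for t > t'."""
    dt = t - t_src
    return math.exp(-(x - x_src) ** 2 / (4.0 * D * dt)) / math.sqrt(4.0 * math.pi * D * dt)

def compose(x, x_src, t1, t2, t3, D, half_width=20.0, steps=4000):
    """Trapezoid estimate of the integral of G(x, x'', t3, t2) G(x'', x', t2, t1) over x''."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x_mid = -half_width + i * h               # intermediate point x''
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * green(x, x_mid, t3, t2, D) * green(x_mid, x_src, t2, t1, D)
    return total * h
```

Comparing compose(x, x_src, t1, t2, t3, D) with green(x, x_src, t3, t1, D) for a few sample points shows the two agree, which is the statement that propagating in two stages is the same as propagating in one.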
To get the filter for time $t_1 \to t_3$, we convolve (or filter) the filter for $t_1 \to t_2$ with the filter for $t_2 \to t_3$. A mathematical property of Gaussians is that when Gaussians are convolved, the variances add. This is just what we need for diffusion as the variance is proportional to the time.

Mathematical note: $G(x, x', t, t')$ is called a Green's function. It is the response of the system at $x$, $t$ to a unit point source located at $x'$, $t'$. You've used Green's functions before, you just didn't know it. For example, a unit point charge located at $\mathbf{r}'$ produces an electric potential at $\mathbf{r}$
$$G(\mathbf{r}, \mathbf{r}') = \frac{1}{|\mathbf{r} - \mathbf{r}'|} .$$
Then to find the electric potential from a distribution of charge, you convolve the charge distribution with the Green's function
$$\Phi(\mathbf{r}) = \int \rho(\mathbf{r}')\,G(\mathbf{r}, \mathbf{r}')\,d^3r' = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d^3r' .$$
There's no time dependence in this electrostatics problem, but the basic idea of convolution and a Green's function is the same.

Mathematical note 2: We've worked out the one dimensional Green's function for the diffusion problem. In K&K chapter 15, problem 2, the two and three dimensional pulse response functions are worked out.

Mathematical note 3: The Green's function depends on the geometry of the system. We've assumed a one-dimensional system that's infinite in both directions. Temperature (or internal energy) can diffuse away to infinity. If we had a finite medium which did not permit heat flow past its boundaries, the Green's function would be different. If we had a non-uniform medium, the Green's function would be different, etc.
Mathematical note 4: Given a uniform, one-dimensional, infinite medium, we've completely "solved" the initial value problem. Given the temperature distribution at one time, we can find it at all later times just by doing an integral. (Well, actually an integral for each point and time!)

Physics(!) note: The key idea is that in the absence of heat sources, sinks, or reservoirs, etc., temperature non-uniformities just diffuse away with time. Any non-uniformity spreads to a size $\sqrt{2Dt}$ after time $t$ and has its amplitude reduced by the same factor. (This assumes, of course, that its initial size was much smaller than $\sqrt{2Dt}$.) In three dimensions, the size grows at the same rate, but the amplitude decreases as the cube of that factor since it's spreading out in three dimensions rather than just one.
Time Independent Solutions of the Diffusion Equation

In some cases, we'll be interested in the time independent solution of the diffusion equation. Why would we be interested in this? Answer: when a system reaches a steady state, the temperature distribution must satisfy
$$\nabla^2 \tau = 0 ,$$
as well as whatever boundary conditions exist.

Suppose for example that a high ($\tau_1$) and low ($\tau_2$) temperature reservoir are connected by a bar with a non-uniform cross section. The transport equation applies, so
$$\mathbf{J}_u = -K\nabla\tau .$$
If we are in a steady state, then any energy entering a thin slab of the bar must leave the other side of the slab at the same rate. In other words, the energy flux (not flux density) must be constant along the length of the bar. So
$$J(x)\,A(x) \approx \text{constant} ,$$
where $x$ measures position along the bar and $A(x)$ is the cross sectional area at position $x$. This says that, approximately,
$$\nabla\tau \propto \frac{1}{A(x)} .$$
As a specific example, suppose that the bar is in the form of a truncated cone, with $A(x) = A_1 (x^2/x_1^2)$ where $A_1$ is the cross section at $x_1$, and $x_1$ is the value of the $x$ coordinate at the $\tau_1$ end of the bar and $x_2 > x_1$ is the value of the $x$ coordinate at the $\tau_2$ end of the bar. We also suppose that $\sqrt{A_1} \ll x_1$. Basically we have a bar that is a wedge of a spherical shell of inner and outer radii $x_1$ and $x_2$. The temperature distribution must satisfy $\nabla^2\tau = 0$ subject to the boundary conditions which are: the temperature at $x_1$ is $\tau_1$, the temperature at $x_2$ is $\tau_2$ and there is no energy flux through the sides of the bar, only the ends. A little thought shows that if we actually had a complete spherical shell of inner and outer radii $x_1$ and $x_2$ at temperatures $\tau_1$ and $\tau_2$, then the flux density would be radial and the solution to this problem would also solve the present bar problem. So we want the solutions of $\nabla^2\tau = 0$ which are spherically symmetric. These are the same as the solutions for a spherically symmetric electric potential in a charge free region. We know what these are. There can be a point charge at the center of the sphere ($\Phi = 1/r$) and a constant. So the temperature as a function of $x$ must be
$$\tau(x) = \frac{C_1}{x} + C_2 .$$
A little algebra finds the constants,
$$\tau(x) = \frac{x_1 x_2 (\tau_1 - \tau_2)}{x (x_2 - x_1)} + \frac{x_2\tau_2 - x_1\tau_1}{x_2 - x_1} .$$
Note that $\nabla\tau \propto 1/x^2$ as we wanted.
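The "little algebra" is easy to verify by evaluating the formula at the two ends of the bar. A minimal sketch follows; the function name and the sample numbers in the note below are illustrative assumptions.

```python
def cone_temperature(x, x1, x2, tau1, tau2):
    """Steady-state temperature tau(x) = C1/x + C2 in the truncated cone."""
    c1 = x1 * x2 * (tau1 - tau2) / (x2 - x1)      # coefficient of the 1/x term
    c2 = (x2 * tau2 - x1 * tau1) / (x2 - x1)      # constant term
    return c1 / x + c2
```

For instance, with x1 = 1, x2 = 3, tau1 = 100 and tau2 = 20 the constants are C1 = 120 and C2 = -20, so the function returns 100 at x = 1, 20 at x = 3, and 40 at x = 2. The profile is not linear: most of the temperature drop occurs near the narrow end, where the gradient must be largest to carry the same energy flux through a smaller area.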
Continuity Equation for Mass

Consider a medium in which the mass density is $\rho(\mathbf{r}, t)$ and the medium moves with a bulk velocity $\mathbf{v}(\mathbf{r}, t)$. Then the mass flux density at $\mathbf{r}$ at time $t$ (we'll drop the explicit notation) is
$$\mathbf{J}_\rho = \rho\mathbf{v} .$$
Just like other quantities, this satisfies a continuity equation, or in this case a mass conservation equation, which says that the net mass flux leaving a small volume is just the rate of decrease of the mass in the volume
$$\nabla \cdot \mathbf{J}_\rho + \frac{\partial\rho}{\partial t} = \nabla \cdot (\rho\mathbf{v}) + \frac{\partial\rho}{\partial t} = 0 .$$
Physics 301
13-Dec-2002 36-1
To Do

Have a Good Holiday, find some time to study a little physics, and come back from break relaxed, refreshed, and ready to go! The 3R's of higher education? Also, keep your eye on the web page for announcements of review sessions during reading period. I know some of you got a little lost in the period between midterms and the JP deadline, so I'll especially try to review that material, as well as some of the basics from the earlier parts of the course.
Sound Waves in a Gas

In this section we'll work out the wave equation for sound waves in an ideal gas. To start with, let's consider a plane wave in the pressure. The change in pressure from its equilibrium value is
$$\delta p = \delta p_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} ,$$
where $\delta p_0$ is the amplitude of the wave. The wave is assumed to be small enough that only first order quantities need to be considered. Of course, this means that there is a sinusoidally varying pressure gradient which accelerates the gas.

Suppose there is a pressure gradient in the $x$ direction and consider a small volume of gas in a box with cross sectional area $A$ and length $dx$. Then the net force on the gas in the box is $-(\nabla p)_x\, A\, dx$. The momentum of the gas in the box at time $t$ and position $\mathbf{r}$ is $\rho(t, \mathbf{r})\,\mathbf{v}(t, \mathbf{r})\, A\, dx$. The rate of change of this momentum must be the force. (Note that we are ignoring viscous forces and assuming smooth flow!) When calculating the rate of change, we must note that the box is moving with velocity $\mathbf{v}$, so in time $dt$, $\mathbf{r} \to \mathbf{r} + \mathbf{v}\,dt$. We cancel out the $A\,dx$ and take the time derivative and find
$$\frac{\partial(\rho v_x)}{\partial t} + \mathbf{v}\cdot\nabla(\rho v_x) = -(\nabla p)_x ,$$
or generalizing to a pressure gradient in an arbitrary direction,
$$\frac{\partial(\rho\mathbf{v})}{\partial t} + \mathbf{v}\cdot\nabla(\rho\mathbf{v}) = -\nabla p .$$
Now, in equilibrium, $\rho$ is a constant, $p$ is a constant, and $\mathbf{v} = 0$. When a wave is present, all of these have small oscillatory components. On the left hand side, we can ignore the $\mathbf{v}\cdot\nabla(\rho\mathbf{v})$ term altogether since it has two powers of velocity and is second order small. In the other term, we can ignore the oscillatory wave in $\rho$ since it is multiplied by the first order small velocity. To make all this explicit, let's write
$$\rho = \rho_0 + \delta\rho , \qquad p = p_0 + \delta p , \qquad \mathbf{v} = 0 + \delta\mathbf{v} , \qquad \tau = \tau_0 + \delta\tau ,$$
and so on. The quantities prefixed by $\delta$ are the small oscillatory quantities and the others are the equilibrium quantities and we will drop products of oscillatory quantities in our expressions. In particular the force equation above becomes
$$\rho_0\frac{\partial(\delta\mathbf{v})}{\partial t} = -\nabla(\delta p) .$$
If we have an ideal gas,
$$p = n\tau = \frac{\rho\tau}{m} ,$$
so
$$\rho_0\frac{\partial(\delta\mathbf{v})}{\partial t} = -\frac{\tau_0}{m}\nabla(\delta\rho) - \frac{\rho_0}{m}\nabla(\delta\tau) .$$
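We also have
$$\delta U + p_0\,\delta V = \tau_0\,\delta\sigma ,$$
which can be rewritten in terms of unit volume
$$\frac{\delta U}{V_0} + \frac{p_0}{V_0}\,\delta V = \tau_0\frac{\delta\sigma}{V_0} = \tau_0\,\delta\hat{\sigma} ,$$
where $\hat{\sigma}$ is the entropy per unit volume or the entropy density. Now divide by $dt$ and remember that because we have an ideal gas, the energy per unit volume is $\hat{C}_V\tau$,
$$\hat{C}_V\frac{\partial(\delta\tau)}{\partial t} + \frac{p_0}{V_0}\frac{\partial(\delta V)}{\partial t} = \tau_0\frac{\partial(\delta\hat{\sigma})}{\partial t} .$$
Note also that $\rho V = \text{constant}$, so $\delta\rho/\rho_0 = -\delta V/V_0$ and
$$\hat{C}_V\frac{\partial(\delta\tau)}{\partial t} - \frac{p_0}{\rho_0}\frac{\partial(\delta\rho)}{\partial t} = \tau_0\frac{\partial(\delta\hat{\sigma})}{\partial t} .$$
Also $\tau_0\,\partial(\delta\hat{\sigma})/\partial t$ is the heat added per unit volume per unit time. This must equal the heat flow which is $K\nabla^2(\delta\tau)$. Note that earlier, in deriving the thermal diffusion equation, we ignored the $p\,dV$ term. This was OK, because $p\,dV$ work is usually quite small for a solid.

At this point, it may be useful to summarize the equations we've obtained so far. We have the continuity equation for mass, which written in terms of our small quantities, is
$$\rho_0\nabla\cdot(\delta\mathbf{v}) = -\frac{\partial(\delta\rho)}{\partial t} ,$$
and the force equation
$$\rho_0\frac{\partial(\delta\mathbf{v})}{\partial t} = -\frac{\tau_0}{m}\nabla(\delta\rho) - \frac{\rho_0}{m}\nabla(\delta\tau) ,$$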
and the thermal diffusion equation
$$\frac{\partial(\delta\tau)}{\partial t} - \frac{p_0}{\hat{C}_V\rho_0}\frac{\partial(\delta\rho)}{\partial t} = D\nabla^2(\delta\tau) .$$
Note that $D = K/\hat{C}_V$ and the thermal diffusion equation is the same as the one we discussed earlier without the density term.

What we're going to do is take the time derivative of the continuity equation and the divergence of the force equation. This gives us a second time derivative equal to a second space derivative, which is what we need for a wave equation. There is one small problem: both $\delta\rho$ and $\delta\tau$ appear in the equation. We use the heat flow equation to relate the two. Let's consider this latter relation. The time derivative produces (order of magnitude) $1/T$ where $T$ is the period of the wave. The Laplacian produces $1/\lambda^2$, where $\lambda$ is the wavelength of the wave. So the ratio of the right hand side to the left hand side is of order
$$\frac{DT}{\lambda^2} = \frac{D}{v_s\lambda} ,$$
where $v_s$ is the speed of whatever wave we're dealing with, that is, the speed of sound. We know that $D$ is roughly the speed of sound times the mean free path, so the ratio of the right hand side to the left hand side is roughly the mean free path over the wavelength of the wave. So until we get to very short wavelength waves, we can ignore the right hand side and we have
$$\frac{\partial(\delta\tau)}{\partial t} - \frac{p_0}{\hat{C}_V\rho_0}\frac{\partial(\delta\rho)}{\partial t} = 0 ,$$
which means
$$\delta\tau = \frac{p_0}{\hat{C}_V\rho_0}\,\delta\rho .$$
What we've just shown is that sound waves (with wavelengths longer than the mean free path) are isentropic. The compressed parts of the wave are hotter than the rarefied parts of the wave, but the wave goes by so fast that there's no time for heat to flow from a compression to adjacent rarefactions. The force equation becomes
$$\begin{aligned}
\rho_0\frac{\partial(\delta\mathbf{v})}{\partial t} &= -\frac{\tau_0}{m}\nabla(\delta\rho) - \frac{p_0}{\hat{C}_V m}\nabla(\delta\rho) , \\
&= -\left(\frac{\tau_0}{m} + \frac{n_0\tau_0}{\hat{C}_V m}\right)\nabla(\delta\rho) , \\
&= -\frac{(\hat{C}_V + n_0)\,\tau_0}{\hat{C}_V m}\nabla(\delta\rho) , \\
&= -\frac{\hat{C}_p\,\tau_0}{\hat{C}_V m}\nabla(\delta\rho) , \\
&= -\frac{\gamma\tau_0}{m}\nabla(\delta\rho) .
\end{aligned}$$
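We've used the facts that the heat capacity per unit volume at constant pressure is just $\hat{C}_V + n_0$ and the ratio $C_p/C_V$ is denoted by $\gamma$. Now, we take the divergence of the force equation and we have
$$\rho_0\frac{\partial[\nabla\cdot(\delta\mathbf{v})]}{\partial t} = -\frac{\gamma\tau_0}{m}\nabla^2(\delta\rho) .$$
We take the time derivative of the continuity equation
$$\rho_0\frac{\partial[\nabla\cdot(\delta\mathbf{v})]}{\partial t} = -\frac{\partial^2(\delta\rho)}{\partial t^2} .$$
We equate the right hand sides and find the wave equation
$$\nabla^2(\delta\rho) - \frac{1}{v_s^2}\frac{\partial^2(\delta\rho)}{\partial t^2} = 0 ,$$
where
$$v_s = \sqrt{\frac{\gamma\tau_0}{m}} ,$$
is the speed of sound in the gas.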
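To get a feel for the size of $v_s$, here is a short numerical sketch. The inputs in the example are assumptions chosen for illustration (a diatomic gas with $\gamma = 1.4$, molecular mass 28 u roughly appropriate to nitrogen, and 300 K); they are not taken from the notes. Recall that $\tau_0 = k_B T$.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K (exact SI value)
AMU = 1.66053906660e-27     # atomic mass unit in kg

def sound_speed(gamma, temperature_kelvin, mass_amu):
    """Speed of sound v_s = sqrt(gamma tau_0 / m) with tau_0 = k_B T."""
    tau0 = K_B * temperature_kelvin
    return math.sqrt(gamma * tau0 / (mass_amu * AMU))

# Illustrative inputs (assumed, not from the notes): gamma = 1.4, T = 300 K, m = 28 u.
v_nitrogen = sound_speed(1.4, 300.0, 28.0)
```

With these assumed inputs the result is about 350 m/s, in the range one expects for air at room temperature.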
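Wave Functions for a Sound Wave

If we have a harmonic wave in a gas, then all the small quantities we've been discussing look like
$$\begin{aligned}
\delta\rho &= \delta\rho_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} , \\
\delta p &= \delta p_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} , \\
\delta\tau &= \delta\tau_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} , \\
\delta\mathbf{v} &= \delta\mathbf{v}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} , \\
\delta\mathbf{r} &= \delta\mathbf{r}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} ,
\end{aligned}$$
where $\delta\mathbf{r}(\mathbf{r}, t)$ is the displacement of the gas from its equilibrium position. Note we're using $\mathbf{r}$ as a coordinate that labels where we are and $\delta\mathbf{r}$ as a dynamical variable. This is confusing, but fairly common practice. In any case, we haven't introduced $\delta\mathbf{r}$ before. It's just
$$\delta\mathbf{r} = \int \delta\mathbf{v}\,dt .$$
For a complex harmonic wave, the differential operators $\nabla$ and $\partial/\partial t$ are replaced by multiplication by $i\mathbf{k}$ and $-i\omega$ respectively. The time integral operator is replaced by multiplying by $i/\omega$. Given one amplitude, all the others are determined by the various equations we've been dealing with. So let's express them all in terms of $\delta p_0$. The force equation gives
$$\delta\mathbf{v}_0 = \frac{\mathbf{k}}{\omega\rho_0}\,\delta p_0 = \hat{\mathbf{k}}\,\frac{1}{v_s\rho_0}\,\delta p_0 = \hat{\mathbf{k}}\,\frac{v_s}{\gamma}\,\frac{\delta p_0}{p_0} ,$$
where $v_s = \omega/k$ and $\hat{\mathbf{k}}$ is a unit vector in the $\mathbf{k}$ direction. We see that the velocity is parallel to the direction of the wave, so we have a longitudinal wave. With this expression for $\delta\mathbf{v}_0$, we use the continuity equation to get $\delta\rho_0$,
$$\delta\rho_0 = \rho_0\,\frac{\mathbf{k}\cdot\delta\mathbf{v}_0}{\omega} = \frac{1}{v_s^2}\,\delta p_0 .$$
Using the facts that $v_s^2 = \gamma\tau_0/m$ and $p_0 = \rho_0\tau_0/m$, we have $v_s^2 = \gamma p_0/\rho_0$ and
$$\frac{\delta\rho_0}{\rho_0} = \frac{1}{\gamma}\,\frac{\delta p_0}{p_0} ,$$
which could have been deduced from the isentropic equation for an ideal gas, $pV^\gamma = \text{constant}$. We use the thermal diffusion equation to find $\delta\tau_0$,
$$\delta\tau_0 = \frac{p_0}{\hat{C}_V\rho_0}\,\delta\rho_0 = \frac{n_0\tau_0}{\hat{C}_V\rho_0}\,\delta\rho_0 = \frac{\tau_0}{\hat{C}_V m}\,\delta\rho_0 = \frac{\tau_0}{\hat{C}_V m}\,\frac{1}{v_s^2}\,\delta p_0 = \frac{\tau_0}{\hat{C}_V m}\,\frac{m}{\gamma\tau_0}\,\delta p_0 = \frac{1}{\gamma\hat{C}_V}\,\delta p_0 .$$
Slightly more algebra gives
$$\frac{\delta\tau_0}{\tau_0} = \frac{\gamma - 1}{\gamma}\,\frac{\delta p_0}{p_0} ,$$
which also could have been derived from the isentropic law. Finally, the displacement is found just by integrating the velocity
$$\delta\mathbf{r}_0 = \hat{\mathbf{k}}\,\frac{i v_s}{\gamma\omega}\,\frac{\delta p_0}{p_0} .$$
Note that the pressure, density, temperature and velocity are all in phase. The displacement is 90° out of phase. You may recall from Physics 103/5 that a pressure node is a displacement anti-node and vice-versa.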
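These amplitude relations can be collected in a few lines. The sketch below (the function name and the dictionary keys are hypothetical) returns the fractional density and temperature amplitudes and the velocity amplitude for a given fractional pressure amplitude. One consistency check: the linearized ideal gas law requires $\delta p/p_0 = \delta\rho/\rho_0 + \delta\tau/\tau_0$, and indeed $1/\gamma + (\gamma - 1)/\gamma = 1$.

```python
def sound_wave_amplitudes(dp_over_p, gamma, v_s):
    """Amplitudes of an isentropic sound wave in terms of delta p_0 / p_0."""
    return {
        "density": dp_over_p / gamma,                        # delta rho_0 / rho_0
        "temperature": (gamma - 1.0) / gamma * dp_over_p,    # delta tau_0 / tau_0
        "velocity": v_s / gamma * dp_over_p,                 # |delta v_0|
    }
```

For a monatomic gas (gamma = 5/3), 60% of the fractional pressure change shows up as a density change and 40% as a temperature change.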
Heat Losses in a Wave

The equations we developed earlier included the transfer of heat from the compressed, hot parts of the wave to the rarefied, cool parts of the wave. We argued that we could ignore the heat transfer. For very short wavelength (high frequency) waves we can't ignore the heat transfer, and we need to keep the $D\nabla^2\delta\tau$ term. Rather than try to derive a wave equation, it's probably easier just to plug the plane wave solutions (as in the previous section) into the various equations. We have three equations (continuity, force, heat transfer) and three unknowns, $\delta\mathbf{v}_0$, $\delta\tau_0$, and $\delta\rho_0$. The force equation gives
$$-i\omega\rho_0\,\delta\mathbf{v}_0 + i\mathbf{k}\,\frac{\tau_0}{m}\,\delta\rho_0 + i\mathbf{k}\,\frac{\rho_0}{m}\,\delta\tau_0 = 0 .$$
This tells us that we have a longitudinal wave, so we'll just drop the vector symbols on $\mathbf{k}$ and $\delta\mathbf{v}_0$ for now. The continuity equation is
$$ik\rho_0\,\delta v_0 - i\omega\,\delta\rho_0 = 0 .$$
The thermal diffusion equation is
$$(k^2 D - i\omega)\,\delta\tau_0 + i\omega\,\frac{p_0}{\hat{C}_V\rho_0}\,\delta\rho_0 = 0 .$$
Now, these are homogeneous equations which have a non-trivial solution only if the determinant of the matrix of coefficients is 0. This gives the characteristic equation
$$(k^2 D - i\omega)\left(\frac{k^2\tau_0}{m} - \omega^2\right) - ik^2\omega\,\frac{p_0}{m\hat{C}_V} = 0 .$$
After a fair amount of algebra, this can be put in the form
$$\omega^2 = k^2\,\frac{\tau_0}{m}\,\frac{\hat{C}_p + ik^2 K/\omega}{\hat{C}_V + ik^2 K/\omega} .$$
So, the equations have a solution provided $k$ and $\omega$ satisfy this dispersion relation. Note that when the terms containing the thermal conductivity are ignored we get the previous relation between $\omega$ and $k$. When they are important, we have a complex wave number and an attenuation of the wave resulting from the wave energy being dissipated as heat.

Some things we've ignored: If the gas is not monatomic, the excitation of the rotational and vibrational modes can get out of phase with the wave. In this case, energy may be lost to the excitation of these modes. This is discussed in K&K, but we don't have time.
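The "fair amount of algebra" can be cross-checked numerically. Written out for $q = k^2$, the dispersion relation is the quadratic $(iK\tau_0/m\omega)\,q^2 + (\hat{C}_p\tau_0/m - iK\omega)\,q - \hat{C}_V\omega^2 = 0$. The sketch below is an illustration only: the function names are hypothetical and the numbers are dimensionless values invented for the test. It solves this quadratic and substitutes the root back into the characteristic equation, using $p_0 = n_0\tau_0$ and $n_0 = \hat{C}_p - \hat{C}_V$.

```python
import cmath

def k_squared(omega, tau0_over_m, c_v, c_p, K):
    """Solve the dispersion relation for q = k^2, picking the sound-wave root."""
    q0 = omega ** 2 * c_v / (c_p * tau0_over_m)   # K = 0 limit: omega^2 / v_s^2
    if K == 0.0:
        return q0
    a = 1j * K * tau0_over_m / omega
    b = c_p * tau0_over_m - 1j * K * omega
    c = -c_v * omega ** 2
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    return min(roots, key=lambda q: abs(q - q0))  # root closest to the lossless value

def characteristic_residual(q, omega, tau0_over_m, c_v, c_p, K):
    """Left hand side of the characteristic equation, with q = k^2 and D = K / C_V."""
    D = K / c_v
    p0_over_m = (c_p - c_v) * tau0_over_m         # p_0/m, since p_0 = n_0 tau_0
    return ((q * D - 1j * omega) * (q * tau0_over_m - omega ** 2)
            - 1j * q * omega * p0_over_m / c_v)
```

For small K the root sits just off the real axis with a positive imaginary part, so $k$ acquires a positive imaginary part and the wave $e^{ikx}$ is attenuated as it propagates.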
Physics 301
Homework No. 12
Due: Never
H12-1
1. You may recall that I mentioned the (thermal) diffusivity for an insulating solid might be the speed of sound in the solid times the distance between atoms. Look up the thermal conductivity, specific heat (you will probably need specific heat at constant pressure for a solid), density, speed of sound, and whatever else you need for an insulating (non-metallic) solid. Calculate the thermal diffusivity. Calculate the speed of sound times the distance between atoms. Are your two numbers of the same order? Hint: you may be able to find some interesting data on the web. A cool starting point might be http://www.webelements.com/webelements.html
2. K&K, chapter 15, problem 3.

3. K&K, chapter 15, problem 4.

4. K&K, chapter 15, problem 6. In this problem you are being asked to find the temperature distribution in the cylinder or sphere that results in a steady state. (So ∂τ/∂t = 0.) Then the boundary condition at the surface is that the rate of heat flowing out of the surface must be the rate of heat generated within the volume.
Physics 301: Solutions to Homework No. 12, John D. Naud

1. The thermal diffusivity, $D_\tau$, is given by $D_\tau = K/\hat{C}$, where $K$ is the thermal conductivity and $\hat{C}$ is the heat capacity per unit volume. Consider the element Selenium (Se) which is an insulating solid at room temperature. From the URL given in the question we can find the following data:
$$K = 0.52\ \mathrm{W\,K^{-1}\,m^{-1}} , \qquad C = 25.36\ \mathrm{J\,K^{-1}\,mol^{-1}} .$$
To convert the heat capacity per mole into a heat capacity per unit volume we use the molar volume $V_M = 16.42\ \mathrm{cm^3\,mol^{-1}} = 16.42 \cdot 10^{-6}\ \mathrm{m^3\,mol^{-1}}$. Therefore the thermal diffusivity of Selenium is
$$D_\tau = \frac{K V_M}{C} = \frac{(0.52\ \mathrm{W\,K^{-1}\,m^{-1}})(16.42 \cdot 10^{-6}\ \mathrm{m^3\,mol^{-1}})}{25.36\ \mathrm{J\,K^{-1}\,mol^{-1}}} = 3.4 \cdot 10^{-7}\ \mathrm{m^2\,s^{-1}} .$$
At the same website we can also find the speed of sound, $c_s$, for Selenium and the bond length in the elemental solid, $a$:
$$c_s = 3350\ \mathrm{m\,s^{-1}} , \qquad a = 232.1 \cdot 10^{-12}\ \mathrm{m} .$$
The product of these two numbers gives us cs a = 7.8 · 10−7 m2 s−1 , which is the same order of magnitude as Dτ . 2. We have a hypothetical climate in which the mean annual temperature is θ0 = 10◦ C, and in which the daily and annual variations in temperature are purely sinusoidal with amplitudes θd = 10◦ C. We are given the thermal diffusivity of soil to be D = 10−3 cm2 s−1 . Using Eq. (13) on pg. 426 of K&K and the principle of superposition (which is valid because the diffusion equation is linear), we can write the temperature as a function of depth z in the soil as θ(z, t) = θ0 + θd exp(−z/δ0 ) cos(ω0 t − z/δ0 ) + θd exp(−z/δ1 ) cos(ω1 t − z/δ1 ), p where ω0 = 2π/T0 = 2π/(24 hr) = 7.3· 10−5 s−1 , and δ0 = 2D/ω0 = 5.2 cm for the daily variation and ω1 = 2π/T1 = √ ω0 /365 = 2.0· 10−7 s−1 and δ1 = 365δ0 = 100 cm for the annual variation. This is a solution to the diffusion equation and obeys the correct boundary condition at z = 0. At a given depth z the minimum temperature is bounded by θmin (z) ≥ θ0 − θd e−z/δ0 − θd e−z/δ1 (Note that at a given z, this minimum is obtained if and only if there exists a time T such that ω0 T − z/δ0 = (2n + 1)π and ω1 T − z/δ1 = (2m + 1)π for some integers n and m.) We want to find a z such that the minimum temperature is above the freezing point of water, θF = 0◦ C. Hence θ0 − θd e−z/δ0 − θd e−z/δ1 = θF
⇒
e−z/δ0 + e−z/δ1 = 1.
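The condition $e^{-z/\delta_0} + e^{-z/\delta_1} = 1$ has no closed-form solution. A minimal Python sketch of one way to find the root is given here; the bisection routine and the bracketing interval of 1 cm to 100 cm are choices made for this illustration, not part of the original solution.

```python
import math

# Values taken from the solution text.
D = 1.0e-3                                  # thermal diffusivity of soil, cm^2/s
omega_day = 2 * math.pi / (24 * 3600.0)     # daily angular frequency, 1/s
omega_year = omega_day / 365.0              # annual angular frequency, 1/s


def penetration_depth(omega):
    """Penetration depth delta = sqrt(2 D / omega), in cm."""
    return math.sqrt(2 * D / omega)


delta_day = penetration_depth(omega_day)
delta_year = penetration_depth(omega_year)


def g(z):
    """exp(-z/delta_day) + exp(-z/delta_year) - 1; its root is the frost depth."""
    return math.exp(-z / delta_day) + math.exp(-z / delta_year) - 1.0


def bisect(f, lo, hi, tol=1e-9):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Bracket chosen for this sketch: g(1 cm) > 0 and g(100 cm) < 0.
z_frost = bisect(g, 1.0, 100.0)
print(f"delta_day = {delta_day:.2f} cm, delta_year = {delta_year:.1f} cm, "
      f"z = {z_frost:.1f} cm")
```

The root lies between 11 cm and 12 cm, consistent with the rounded value quoted in the solution.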
Solving this equation numerically gives $z \approx 12$ cm.

3. Let the $x$-axis be along the thickness of the slab such that the surfaces of the slab are at $x = 0$ and $x = 2a$. If $\theta(x,t)$ is the temperature in the slab, we have the initial condition $\theta(x,0) = \theta_1$ for $0 < x < 2a$, and the boundary condition $\theta(0,t) = \theta(2a,t) = \theta_0$ for $t > 0$. Suppose we write the temperature as

$$\theta(x,t) = \theta_0 + \sum_{n=1}^{\infty} A_n(t) \sin(k_n x).$$

This satisfies the boundary condition $\theta(0,t) = \theta_0$ for $t > 0$. To satisfy the boundary condition $\theta(2a,t) = \theta_0$ for $t > 0$ we require $\sin(k_n 2a) = 0$ for all $n$, which implies $k_n = n\pi/2a$. If we substitute the above form for $\theta(x,t)$ into the diffusion equation and use the orthogonality of the sine functions we find

$$\frac{dA_n(t)}{dt} = -D k_n^2 A_n(t),$$

which has solution $A_n(t) = C_n \exp(-D k_n^2 t)$, where $C_n$ is an arbitrary constant. Using this result, our expression for the temperature becomes

$$\theta(x,t) = \theta_0 + \sum_{n=1}^{\infty} C_n \sin(k_n x)\, e^{-D k_n^2 t}.$$

The coefficients $C_n$ are fixed by the initial condition $\theta(x,0) = \theta_1$. Note that we are interested in times for which all but the longest wavelength Fourier component (i.e., $n = 1$) have decayed away and the distribution becomes sinusoidal. Therefore, we only need to calculate $C_1$. Using

$$\int_0^{2a} dx\, \sin(k_n x) \sin(k_m x) = a\,\delta_{n,m}$$

for $n, m > 0$, we can find $C_1$ by multiplying both sides of our expression for $\theta(x,0)$ by $\sin(k_1 x)$ and integrating from $x = 0$ to $x = 2a$:

$$(\theta_1 - \theta_0) \int_0^{2a} dx\, \sin(k_1 x) = \sum_{n=1}^{\infty} C_n \int_0^{2a} dx\, \sin(k_n x) \sin(k_1 x) \quad\Rightarrow\quad (\theta_1 - \theta_0)\frac{4a}{\pi} = a C_1 \quad\Rightarrow\quad C_1 = \frac{4}{\pi}(\theta_1 - \theta_0).$$

The question asks us to find the time $t_1$ at which the temperature difference between the center of the slab and its surface decays to $r \equiv 0.01$ of the initial difference $\theta_1 - \theta_0$, that is

$$\theta(a,t_1) - \theta(0,t_1) = r(\theta_1 - \theta_0) \quad\Rightarrow\quad C_1 \sin(k_1 a)\, e^{-D k_1^2 t_1} = r(\theta_1 - \theta_0) \quad\Rightarrow\quad \frac{4}{\pi} e^{-D k_1^2 t_1} = r \quad\Rightarrow\quad t_1 = \frac{4a^2}{D\pi^2} \log_e\!\left(\frac{4}{\pi r}\right),$$
where in the first step we kept only the first term in the Fourier series, and in the second step we used $C_1 = (4/\pi)(\theta_1 - \theta_0)$ and $k_1 = \pi/2a$.

4.(a) In steady state $d\tau/dt = 0$, and the continuity equation with sources reads $\nabla \cdot \mathbf{J}_u = g_u$, where as always $\mathbf{J}_u = -K\nabla\tau$. Consider a cylindrical wire of radius $R$, where $g_u$ is constant inside the wire and zero outside. By symmetry we know that $\mathbf{J}_u$ points radially outward from the axis of the cylinder and depends only on $r = |\mathbf{r}|$. If we consider a cylindrical volume of length $L$ and radius $r \le R$, whose axis coincides with the axis of the wire, then

$$\int_V dV\, g_u = \int_V dV\, \nabla \cdot \mathbf{J}_u = \oint_{\partial V} d\mathbf{s} \cdot \mathbf{J}_u \quad\Rightarrow\quad (\pi r^2 L)\, g_u = 2\pi r L\, J_u(r) \quad\Rightarrow\quad \mathbf{J}_u(r) = \frac{1}{2} g_u r\, \hat{\mathbf{r}},$$

where $\hat{\mathbf{r}} = \mathbf{r}/|\mathbf{r}|$, and we have used the divergence theorem. From Fourier's law we then have:

$$\frac{d\tau}{dr} = -\frac{1}{K}\, \hat{\mathbf{r}} \cdot \mathbf{J}_u = -\frac{g_u}{2K}\, r.$$

Integrating this from $r = R$ to $r = 0$ gives

$$\tau(0) - \tau(R) = \frac{g_u R^2}{4K}.$$

(b) Consider the spherical Earth of radius $R$, where again we assume $g_u$ is constant inside the sphere and zero outside. By symmetry we know that $\mathbf{J}_u$ points radially outward from the center of the sphere and depends only on $r = |\mathbf{r}|$. If we consider a spherical volume of radius $r \le R$, whose center coincides with the center of the Earth, then

$$\int_V dV\, g_u = \int_V dV\, \nabla \cdot \mathbf{J}_u = \oint_{\partial V} d\mathbf{s} \cdot \mathbf{J}_u \quad\Rightarrow\quad \frac{4\pi r^3}{3}\, g_u = 4\pi r^2 J_u(r) \quad\Rightarrow\quad \mathbf{J}_u(r) = \frac{1}{3} g_u r\, \hat{\mathbf{r}}.$$

From Fourier's law we then have:

$$\frac{d\tau}{dr} = -\frac{1}{K}\, \hat{\mathbf{r}} \cdot \mathbf{J}_u = -\frac{g_u}{3K}\, r.$$

Integrating this from $r = R$ to $r = 0$ gives

$$\tau(0) - \tau(R) = \frac{g_u R^2}{6K}.$$
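As a quick sanity check on the two closed forms, the radial gradient $d\tau/dr = -g_u r/(fK)$ (with $f = 2$ for the cylinder and $f = 3$ for the sphere) can be integrated numerically. This is only a sketch: the function name and the numerical values of $g_u$, $K$ and $R$ are hypothetical and in arbitrary units.

```python
def center_minus_surface(geometry_factor, g_u, K, R, steps=1000):
    """Return tau(0) - tau(R) by midpoint-rule integration of
    d(tau)/dr = -g_u * r / (geometry_factor * K) from r = R to r = 0.

    geometry_factor is 2 for the cylinder and 3 for the sphere.
    """
    dr = R / steps
    total = 0.0
    for i in range(steps):
        r_mid = (i + 0.5) * dr
        total += g_u * r_mid / (geometry_factor * K) * dr
    return total


# Hypothetical values in arbitrary units, chosen only for illustration.
g_u, K, R = 2.0, 0.5, 3.0

cylinder = center_minus_surface(2, g_u, K, R)
sphere = center_minus_surface(3, g_u, K, R)

print(cylinder, g_u * R**2 / (4 * K))   # numerical vs. g_u R^2 / (4K)
print(sphere, g_u * R**2 / (6 * K))     # numerical vs. g_u R^2 / (6K)
```

Because the integrand is linear in $r$, the midpoint rule reproduces the analytic results up to rounding.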
Final Examination
PHYSICS DEPARTMENT, PRINCETON UNIVERSITY
PHYSICS 301 FINAL EXAMINATION January 22, 2003, 1:30–4:30pm, Jadwin A08
This exam contains five problems. Work any three of the five problems. All problems count equally although some are harder than others. Do all the work you want graded in the separate exam books. Indicate clearly which three problems you have worked and want graded. I will only grade three problems. If you hand in more than three problems without indicating which three are to be graded, I will grade the first three, only! Write legibly. If I can’t read it, it doesn’t count! Put your name on all exam books that you hand in. (Only one should be necessary!!!) On the first exam book, rewrite and sign the honor pledge: I pledge my honor that I have not violated the Honor Code during this examination.
c 2003, Princeton University Physics Department, Edward J. Groth Copyright
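Physical Constants and Conversion Factors

$c = 2.998 \times 10^{10}\ \mathrm{cm\,s^{-1}}$
$\hbar = 1.054 \times 10^{-27}\ \mathrm{erg\,s}$
$k = 1.380 \times 10^{-16}\ \mathrm{erg\,K^{-1}}$
$e = 4.803 \times 10^{-10}\ \mathrm{statcoulomb}$
$N_0 = 6.025 \times 10^{23}\ \mathrm{molecules\,mole^{-1}}$
$m_\mathrm{electron} = 9.108 \times 10^{-28}\ \mathrm{g}$
$m_\mathrm{proton} = 1.672 \times 10^{-24}\ \mathrm{g}$
$m_\mathrm{neutron} = 1.675 \times 10^{-24}\ \mathrm{g}$
$m_\mathrm{amu} = 1.660 \times 10^{-24}\ \mathrm{g}$
$\mu_B = 9.273 \times 10^{-21}\ \mathrm{erg\,Gauss^{-1}}$
$G = 6.673 \times 10^{-8}\ \mathrm{cm^3\,s^{-2}\,g^{-1}}$
$1\ \mathrm{atm} = 1.013 \times 10^{6}\ \mathrm{dyne\,cm^{-2}}$
$1\ \mathrm{eV} = 1.602 \times 10^{-12}\ \mathrm{erg}$
$1\ \mathrm{cal} = 4.186 \times 10^{7}\ \mathrm{erg}$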
1. Two identical objects, A and B, are thermally and mechanically isolated from the rest of the world. Their initial temperatures are τA > τB . Each object has heat capacity C (the same for both objects) which is independent of temperature. (a) Suppose the objects are placed in thermal contact and allowed to come to thermal equilibrium. What is their final temperature? How much entropy is created in this process? How much work is done on the outside world in this process? (b) Instead, suppose objects A (temperature τA ) and B (temperature τB < τA ) are used as the high and low temperature heat reservoirs of a heat engine. The engine extracts energy from object A (lowering its temperature), does work on the outside world, and dumps waste heat to object B (raising its temperature). When the temperatures of A and B are the same, the heat engine is in the same state as it started and the process is finished. Suppose this heat engine is the most efficient heat engine possible. In other words, it performs the maximum work possible. What is the final temperature of the objects? How much entropy is created in this process? How much work is done on the outside world in this process?
2. Consider waves on a liquid surface where the restoring force is produced by surface tension. Assume there is a single polarization and the dispersion relation is

$$\omega^2 = \frac{\gamma}{\rho} k^3,$$

where $\gamma$ is the surface tension of the liquid, $\rho$ is its density, $\omega$ is the frequency of the waves and $k$ is the wavenumber of the waves. Our goal is to find the contribution of these waves to the low temperature heat capacity of the liquid.

(a) If the surface is in equilibrium at temperature $\tau$, what is the average energy of a wave with frequency $\omega$? (Ignore the $\hbar\omega/2$ zero point energy.)

(b) Suppose the surface is a square of side $L$ and area $A = L^2$. How many modes are there with wavenumbers between $k$ and $k + dk$?

(c) At low temperatures what are the energy and heat capacity of these surface waves? You will come up with an integral that can't be done easily. Explain why it's OK to set the upper limit to $\infty$. Having done this, convert the integral to dimensionless form and call its value $I$.
3. The normal boiling point (that is, at $p = 1$ atm) of mercury is $T = 630$ K. At this pressure and temperature, mercury vapor may be treated as a monatomic ideal gas. The latent heat of vaporization is $L = 5.93 \times 10^{11}\ \mathrm{erg\,mole^{-1}}$. The atomic weight of mercury is 200.6 amu.

(a) Suppose that the latent heat of vaporization is constant, that mercury vapor may be treated as an ideal gas over the range of interest, and that the specific volume of liquid mercury is negligible compared with the specific volume of mercury vapor. Estimate the vapor pressure of mercury at $T = 300$ K (roughly room temperature).

(b) What is the entropy of the liquid at 630 K? Hint: You may need to know the entropy of the vapor. In case you don't remember the expression, you might want to work it out. Remember that the number of states in $d^3r$ and $d^3p$ is $dn = d^3r\, d^3p/(2\pi\hbar)^3$. You should also remember that for large $N$, $\log N! \to N \log N - N$.
4. Consider a line of $N + 1$ atoms. Each atom interacts only with its nearest neighbors, so there are $N$ interactions. $N$ is very large, so we can always neglect 1 when compared to $N$. Also, $\log N! \approx N \log N - N$. There are two kinds of atoms on the line, type A and type B. The interaction energy for an AA pair is $-\epsilon$; for a BB pair, $-\epsilon$; and for an AB pair, $-\epsilon/2$, where $\epsilon > 0$. In other words, when the two atoms in a bond are the same, they are twice as strongly bound as when they are different. The whole line of atoms is in equilibrium at temperature $\tau$. Exchanging energy with the heat bath means that atoms must change positions. For example, if there is a configuration ABAB, then the three bonds in this configuration have a total energy $-3\epsilon/2$. If the inner two atoms swap positions to give AABB, the three bonds have a total energy of $-5\epsilon/2$ and energy was given to the heat bath. Let $x$ denote the fraction of B atoms, so if there are $N_A$ and $N_B$ atoms of types A and B, respectively, then $N_A = (1 - x)N$ and $N_B = xN$, and $0 \le x \le 1$.

(a) Suppose the temperature is high enough that the atoms of each type are randomly and independently distributed along the line (in other words, it is a homogeneous mixture). Obtain expressions for the energy, the entropy, and the Helmholtz free energy as a function of $\epsilon$, $\tau$, $N$ and $x$. Be sure to use the fact that $N$ is large to simplify your answers.

(b) Suppose the temperature is zero! What is the configuration of atoms in this case? What is the ground state energy? What is the entropy? What is the free energy? Once again, use the fact that $N$ is very large to simplify your answer.

(c) In general, the system may be a homogeneous mixture like that in part (a) or it may be more like the configuration in part (b). Suppose $x = 1/2$ and determine the transition temperature which separates the (b) like state from the (a) like state. Hint: consider, for the homogeneous mixture case, the plot of free energy versus $x$ at a given temperature.
5. Recall that a definition for the entropy is

$$\sigma = -\sum_i p_i \log p_i,$$

where $p_i$ is the probability that the system is in state $i$ and the sum is over all states which satisfy whatever constraints are placed on the system. Note that this definition handles zero probability states and also produces an additive entropy when two non-interacting systems are combined. In all cases, suppose the volume of the system is fixed. The number of particles in each state is $N_i$ and the energy of each state is $E_i$.

(a) Suppose the number of particles in the system is fixed and the energy is fixed. What are the probabilities that maximize the entropy? Of course, the sum will include only those states which have the correct number of particles and energy. (Or, equivalently, the probability for a state with the wrong number of particles or energy or both is 0.) In this and subsequent parts, you may have to introduce some auxiliary "constants." Be sure to identify or give a physical interpretation for each such constant. Hint: be sure to use the fact that the system is in some state: $\sum_i p_i = 1$.
(b) Now suppose the number of particles is fixed, but the system is in equilibrium with a heat bath such that its average energy is E. What are the probabilities that maximize the entropy? As before, be sure to identify or give a physical interpretation for any constants you introduce. (c) Now suppose the system is in equilibrium with a heat bath and a “particle bath” (a reservoir with which it can exchange particles) such that the average energy is E and the average number of particles in the system is N. Now what are the probabilities that maximize the entropy? As before, be sure to identify or give a physical interpretation for any constants you introduce.
PHYSICS DEPARTMENT, PRINCETON UNIVERSITY
PHYSICS 301 FINAL EXAMINATION January 22, 2003, 1:30–4:30pm, Jadwin A08
SOLUTIONS
1. Two identical objects, A and B, are thermally and mechanically isolated from the rest of the world. Their initial temperatures are $\tau_A > \tau_B$. Each object has heat capacity $C$ (the same for both objects) which is independent of temperature.

(a) Suppose the objects are placed in thermal contact and allowed to come to thermal equilibrium. What is their final temperature? How much entropy is created in this process? How much work is done on the outside world in this process?

Solution

No work is done; the objects are just in thermal contact. If energy $\bar{d}Q$ is transferred from object A to B and the objects are allowed to come to equilibrium, the temperature change of object A is $d\tau_A = -\bar{d}Q/C$ and the temperature change of object B is $d\tau_B = +\bar{d}Q/C$. In other words, the temperatures of objects A and B change by equal and opposite amounts, so the final temperature is $\tau_f = (\tau_A + \tau_B)/2$. The change in entropy of the system is

$$\Delta\sigma = \int_{\tau_A}^{\tau_f} \frac{C\, d\tau}{\tau} + \int_{\tau_B}^{\tau_f} \frac{C\, d\tau}{\tau} = C \log\!\left(\frac{\tau_f^2}{\tau_A \tau_B}\right) = C \log\!\left(\frac{(\tau_A + \tau_B)^2}{4\tau_A \tau_B}\right).$$

End Solution
(b) Instead, suppose objects A (temperature $\tau_A$) and B (temperature $\tau_B < \tau_A$) are used as the high and low temperature heat reservoirs of a heat engine. The engine extracts energy from object A (lowering its temperature), does work on the outside world, and dumps waste heat to object B (raising its temperature). When the temperatures of A and B are the same, the heat engine is in the same state as it started and the process is finished. Suppose this heat engine is the most efficient heat engine possible. In other words, it performs the maximum work possible. What is the final temperature of the objects? How much entropy is created in this process? How much work is done on the outside world in this process?

Solution

If we have the most efficient engine possible, it must be a reversible engine, which means that the entropy created is zero. Any entropy extracted from object A must wind up in object B. This means

$$d\sigma = 0 = \frac{C\, d\tau_a}{\tau_a} + \frac{C\, d\tau_b}{\tau_b},$$

where $\tau_a$ and $\tau_b$ are the temperatures of objects A and B. Integrating gives

$$\log \tau_a + \log \tau_b = \log(\tau_a \tau_b) = \text{Constant},$$

so $\tau_a \tau_b$ is constant in this process. Thus, $\tau_f = \sqrt{\tau_A \tau_B}$. When $\tau_a$ changes by $d\tau_a$, the work done on the outside world is

$$\bar{d}W = -C\, d\tau_a - C\, d\tau_b = -C \left(1 - \frac{\tau_b}{\tau_a}\right) d\tau_a.$$

We could integrate this expression to find the total work done, but it's much easier to use the initial and final temperatures:

$$W = C(\tau_A - \tau_f) - C(\tau_f - \tau_B) = C\left(\tau_A + \tau_B - 2\sqrt{\tau_A \tau_B}\right).$$

An interesting point to notice: if we plug $\tau_f = \sqrt{\tau_A \tau_B}$ into the expression $\Delta\sigma = C \log(\tau_f^2/\tau_A \tau_B)$ for the entropy created that was derived in part (a), we find $\Delta\sigma = 0$, as we should.

End Solution
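The two processes can be compared side by side with a few lines of Python. This is a sketch only: the function names are made up for this illustration, and the temperatures and heat capacity are hypothetical values in fundamental (energy) units, not numbers from the exam.

```python
import math


def direct_contact(tau_a, tau_b, heat_capacity):
    """Part (a): heat flows directly between the objects; no work is done.

    Returns (final temperature, entropy created, work done).
    """
    tau_f = 0.5 * (tau_a + tau_b)
    d_sigma = heat_capacity * math.log(tau_f**2 / (tau_a * tau_b))
    return tau_f, d_sigma, 0.0


def reversible_engine(tau_a, tau_b, heat_capacity):
    """Part (b): reversible engine, so the product tau_a * tau_b is conserved.

    Returns (final temperature, entropy created, work done).
    """
    tau_f = math.sqrt(tau_a * tau_b)
    d_sigma = heat_capacity * math.log(tau_f**2 / (tau_a * tau_b))
    work = heat_capacity * (tau_a + tau_b - 2.0 * tau_f)
    return tau_f, d_sigma, work


# Hypothetical illustrative values.
tau_hot, tau_cold, C = 400.0, 100.0, 1.0

contact = direct_contact(tau_hot, tau_cold, C)
engine = reversible_engine(tau_hot, tau_cold, C)
print("direct contact:   ", contact)
print("reversible engine:", engine)
```

For these values the direct-contact case ends at the arithmetic mean of the temperatures with positive entropy production, while the reversible case ends at the geometric mean with zero entropy production and positive work.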
2. Consider waves on a liquid surface where the restoring force is produced by surface tension. Assume there is a single polarization and the dispersion relation is

$$\omega^2 = \frac{\gamma}{\rho} k^3,$$

where $\gamma$ is the surface tension of the liquid, $\rho$ is its density, $\omega$ is the frequency of the waves and $k$ is the wavenumber of the waves. Our goal is to find the contribution of these waves to the low temperature heat capacity of the liquid.

(a) If the surface is in equilibrium at temperature $\tau$, what is the average energy of a wave with frequency $\omega$? (Ignore the $\hbar\omega/2$ zero point energy.)

Solution

The thermal average occupancy of a mode is $1/(\exp(\hbar\omega/\tau) - 1)$, and the energy in the mode is just the energy per surface "phonon" times the occupancy. So

$$\langle \epsilon(\omega) \rangle = \frac{\hbar\omega}{e^{\hbar\omega/\tau} - 1}.$$

Note: if you forgot the occupancy expression, it only takes a couple of lines to work it out from the partition function:

$$Z = \sum_{n=0}^{\infty} e^{-n\hbar\omega/\tau} = \frac{1}{1 - e^{-\hbar\omega/\tau}},$$

$$\langle n \rangle = \frac{1}{Z} \sum_{n=0}^{\infty} n\, e^{-n\hbar\omega/\tau} = -\frac{\tau}{\hbar} \frac{\partial \log Z}{\partial \omega}.$$

End Solution

(b) Suppose the surface is a square of side $L$ and area $A = L^2$. How many modes are there with wavenumbers between $k$ and $k + dk$?

Solution

We will need standing waves, which means an integer number of half wavelengths must fit in the square. This requires

$$k_x = \frac{n_x \pi}{L}, \qquad k_y = \frac{n_y \pi}{L}.$$

We see that the "volume" of a mode is

$$dk_x\, dk_y = \frac{\pi^2}{L^2}.$$

The "volume" contained between $k$ and $k + dk$ is the area of a quarter annulus of radius $k$ and thickness $dk$. This is $\pi k\, dk/2$. Then the number of modes between $k$ and $k + dk$ is

$$N(k)\, dk = \frac{\pi k\, dk/2}{\pi^2/L^2} = \frac{L^2}{2\pi} k\, dk = \frac{A}{2\pi} k\, dk.$$

End Solution
(c) At low temperatures what are the energy and heat capacity of these surface waves? You will come up with an integral that can't be done easily. Explain why it's OK to set the upper limit to $\infty$. Having done this, convert the integral to dimensionless form and call its value $I$.

Solution

To get the energy, we need to add up the energy per mode times the number of modes. From the dispersion relation, we find

$$k\, dk = \frac{2}{3} \left(\frac{\rho}{\gamma}\right)^{2/3} \omega^{1/3}\, d\omega.$$

Then

$$U = \int_0^{\omega_{\max}} \frac{\hbar\omega}{e^{\hbar\omega/\tau} - 1}\, \frac{A}{2\pi}\, \frac{2}{3} \left(\frac{\rho}{\gamma}\right)^{2/3} \omega^{1/3}\, d\omega,$$

where $\omega_{\max}$ indicates that there aren't an infinite number of modes. The number of modes must be the same as the number of molecules in the surface. However, at very low temperatures, the higher energy modes are not excited (they are killed by the exponential in the denominator), so we make negligible error if we extend the integral to infinite frequency. To make the integral dimensionless, change variables to $x = \hbar\omega/\tau$. Then

$$U = \frac{A}{2\pi}\, \frac{2}{3} \left(\frac{\rho}{\gamma}\right)^{2/3} \frac{\tau^{7/3}}{\hbar^{4/3}} \underbrace{\int_0^{\infty} \frac{x^{4/3}\, dx}{e^x - 1}}_{=\,I} = \frac{AI}{3\pi} \left(\frac{\rho^2 \tau^7}{\gamma^2 \hbar^4}\right)^{1/3}.$$

The heat capacity (at constant area) is

$$C_A = \frac{\partial U}{\partial \tau} = \frac{7AI}{9\pi} \left(\frac{\rho^2 \tau^4}{\gamma^2 \hbar^4}\right)^{1/3}.$$

Note that what we've done is very similar to the derivation of the Debye theory for the low temperature heat capacity of solids. We find that $C \propto \tau^{4/3}$ instead of $\tau^3$. Part of the difference comes from the fact that we are considering a two dimensional rather than a three dimensional system. (This would have given $\tau^2$.) The other cause of the difference is the strange dispersion relation. Finally, as a point of interest, $I = 1.68$.

End Solution
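The quoted value of $I$ can be reproduced with the standard identity $\int_0^\infty x^{s-1}/(e^x - 1)\, dx = \Gamma(s)\,\zeta(s)$, here with $s = 7/3$. The identity is not used in the original solution; it is brought in for this sketch. The zeta function is evaluated by a direct sum plus the leading Euler-Maclaurin tail corrections, and the helper names are made up for the illustration.

```python
import math


def zeta_tail_corrected(s, n_terms=1000):
    """Riemann zeta(s) for s > 1.

    Sums n**(-s) for n = 1 .. n_terms - 1 directly, then adds the leading
    Euler-Maclaurin corrections for the remaining tail starting at n_terms.
    """
    partial = sum(n ** (-s) for n in range(1, n_terms))
    big_n = float(n_terms)
    tail = (big_n ** (1.0 - s) / (s - 1.0)      # integral of the tail
            + 0.5 * big_n ** (-s)               # half of the first tail term
            + s * big_n ** (-s - 1.0) / 12.0)   # first derivative correction
    return partial + tail


def surface_wave_integral():
    """I = integral_0^inf x^(4/3) / (e^x - 1) dx = Gamma(7/3) * zeta(7/3)."""
    s = 7.0 / 3.0
    return math.gamma(s) * zeta_tail_corrected(s)


I = surface_wave_integral()
print(f"I = {I:.4f}")
```

The result agrees with the value $I = 1.68$ quoted in the solution to the digits given there.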
3. The normal boiling point (that is, at $p = 1$ atm) of mercury is $T = 630$ K. At this pressure and temperature, mercury vapor may be treated as a monatomic ideal gas. The latent heat of vaporization is $L = 5.93 \times 10^{11}\ \mathrm{erg\,mole^{-1}}$. The atomic weight of mercury is 200.6 amu.

(a) Suppose that the latent heat of vaporization is constant, that mercury vapor may be treated as an ideal gas over the range of interest, and that the specific volume of liquid mercury is negligible compared with the specific volume of mercury vapor. Estimate the vapor pressure of mercury at $T = 300$ K (roughly room temperature).

Solution

This is a problem for Clausius-Clapeyron. Recall that along the vaporization curve,

$$\frac{dp}{d\tau} = \frac{\Delta\sigma}{\Delta V},$$

where $\Delta\sigma$ is the change in specific entropy between the liquid and vapor and $\Delta V$ is the change in volume. Since we can ignore the volume of the liquid, $\Delta V = V_\mathrm{vapor} = \tau/p$ per molecule. The change in entropy is just the latent heat (here per molecule) divided by the temperature, so

$$\frac{dp}{d\tau} = \frac{pL}{\tau^2}.$$

We can integrate and get

$$\log\frac{p_2}{p_1} = -L \left(\frac{1}{\tau_2} - \frac{1}{\tau_1}\right),$$

or, with $L$ now the molar latent heat given in the problem,

$$p_2 = p_1\, e^{-(L/N_0)(1/kT_2 - 1/kT_1)}.$$

At this point, we just plug in the numbers to get $p(300\ \mathrm{K}) = 3.9 \times 10^{-6}$ atm. As a point of interest, the CRC handbook gives the vapor pressure as $2.9 \times 10^{-6}$ atm, so our estimate is not bad!

End Solution

(b) What is the entropy of the liquid at 630 K? Hint: You may need to know the entropy of the vapor. In case you don't remember the expression, you might want to work it out. Remember that the number of states in $d^3r$ and $d^3p$ is $dn = d^3r\, d^3p/(2\pi\hbar)^3$. You should also remember that for large $N$, $\log N! \to N \log N - N$.
Solution

If we know the entropy of the gas, we can get the entropy of the liquid by subtracting $L/\tau$. So we need to know the entropy of the gas. The appropriate expression is the Sackur-Tetrode formula, which is not something that I would use brain cells to memorize. We can work it out. The one particle partition function is found from

$$Z_1 = \int\!\!\int \frac{d^3r\, d^3p}{(2\pi\hbar)^3}\, e^{-(p_x^2 + p_y^2 + p_z^2)/2m\tau} = V \left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2} = n_Q V.$$

For $N$ particles

$$Z_N = \frac{Z_1^N}{N!} = \frac{(n_Q V)^N}{N!}.$$

Note the factor of $N!$. The free energy is

$$F = -\tau \log Z_N = -\tau N \left(\log\frac{n_Q}{n} + 1\right).$$

The entropy is

$$\sigma = -\frac{\partial F}{\partial \tau} = N \left(\log\frac{n_Q}{n} + \frac{5}{2}\right),$$

which is the Sackur-Tetrode expression. From this, we must subtract the entropy added by the heat of vaporization. We have for the entropy of the liquid

$$\sigma_\mathrm{liquid} = N_0 \left(\log\frac{n_Q}{n} + \frac{5}{2}\right) - \frac{L}{\tau},$$

where the number of molecules has been set to the number in one mole. To get the entropy in conventional units, we also need to multiply by $k$, so we have

$$S_\mathrm{liquid} = N_0 k \left(\log\!\left(\left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \frac{kT}{p}\right) + \frac{5}{2}\right) - \frac{L}{T}$$

$$= (6.025 \times 10^{23})(1.380 \times 10^{-16}) \left(\log\!\left(\left(\frac{200.6\,(1.660 \times 10^{-24})(1.380 \times 10^{-16})\,630}{2\pi (1.054 \times 10^{-27})^2}\right)^{3/2} \frac{(1.380 \times 10^{-16})\,630}{1.013 \times 10^{6}}\right) + \frac{5}{2}\right) - \frac{5.93 \times 10^{11}}{630}$$

$$= (8.315 \times 10^{7})(20.40 + 2.50) - 9.413 \times 10^{8}$$

$$= 19.042 \times 10^{8} - 9.413 \times 10^{8}$$

$$= 9.629 \times 10^{8}\ \mathrm{erg\,K^{-1}\,mole^{-1}},$$

or $\sigma = S/k = 6.98 \times 10^{24}\ \mathrm{mole^{-1}}$.

Actually, this is what is determined for the entropy of the liquid by starting from near absolute 0 and keeping track of the heat and entropy added to get to the boiling point. This is another way that the Sackur-Tetrode expression has been confirmed!

End Solution
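The arithmetic in both parts of this problem is easy to get wrong by hand, so here is a short Python sketch that redoes it using the constants from the exam's constants page (CGS units). The function and variable names are chosen for this illustration only.

```python
import math

# Constants as given on the exam's constants page (CGS units).
K_B = 1.380e-16     # Boltzmann constant, erg/K
HBAR = 1.054e-27    # erg s
N_0 = 6.025e23      # molecules per mole
AMU = 1.660e-24     # g
ATM = 1.013e6       # dyne/cm^2

# Data from the problem statement.
T_BOIL = 630.0              # normal boiling point, K
LATENT = 5.93e11            # latent heat of vaporization, erg/mole
MASS = 200.6 * AMU          # mass of one mercury atom, g


def vapor_pressure_atm(temperature):
    """Part (a): integrated Clausius-Clapeyron with constant latent heat,
    normalized so that the pressure is 1 atm at T_BOIL."""
    exponent = -(LATENT / N_0) * (1.0 / (K_B * temperature) - 1.0 / (K_B * T_BOIL))
    return math.exp(exponent)


def log_nq_over_n(temperature, pressure):
    """log(n_Q / n) for an ideal gas, with n = p / (k T)."""
    n_q = (MASS * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5
    return math.log(n_q * K_B * temperature / pressure)


p_300 = vapor_pressure_atm(300.0)

# Part (b): Sackur-Tetrode entropy of the vapor minus L/T, per mole.
log_term = log_nq_over_n(T_BOIL, ATM)
S_vapor = N_0 * K_B * (log_term + 2.5)      # erg / K / mole
S_liquid = S_vapor - LATENT / T_BOIL        # erg / K / mole

print(f"p(300 K)   = {p_300:.2e} atm")
print(f"log(nQ/n)  = {log_term:.2f}")
print(f"S_liquid   = {S_liquid:.3e} erg/K/mole")
print(f"S_liquid/k = {S_liquid / K_B:.2e} per mole")
```

The computed values match the numbers in the solution to the precision quoted there.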
4. Consider a line of $N + 1$ atoms. Each atom interacts only with its nearest neighbors, so there are $N$ interactions. $N$ is very large, so we can always neglect 1 when compared to $N$. Also, $\log N! \approx N \log N - N$. There are two kinds of atoms on the line, type A and type B. The interaction energy for an AA pair is $-\epsilon$; for a BB pair, $-\epsilon$; and for an AB pair, $-\epsilon/2$, where $\epsilon > 0$. In other words, when the two atoms in a bond are the same, they are twice as strongly bound as when they are different. The whole line of atoms is in equilibrium at temperature $\tau$. Exchanging energy with the heat bath means that atoms must change positions. For example, if there is a configuration ABAB, then the three bonds in this configuration have a total energy $-3\epsilon/2$. If the inner two atoms swap positions to give AABB, the three bonds have a total energy of $-5\epsilon/2$ and energy was given to the heat bath. Let $x$ denote the fraction of B atoms, so if there are $N_A$ and $N_B$ atoms of types A and B, respectively, then $N_A = (1 - x)N$ and $N_B = xN$, and $0 \le x \le 1$.

(a) Suppose the temperature is high enough that the atoms of each type are randomly and independently distributed along the line (in other words, it is a homogeneous mixture). Obtain expressions for the energy, the entropy, and the Helmholtz free energy as a function of $\epsilon$, $\tau$, $N$ and $x$. Be sure to use the fact that $N$ is large to simplify your answers.

Solution

A number of you tried to do this part via the partition function. I don't know how to calculate the partition function for this situation; the problem is counting the configurations correctly. None of the partition function solutions solved the problem. If I use this problem again, I will give a hint that the partition function is too hard to do! (Sorry!)

Since the atoms are randomly distributed, if we pick an atom at random, it's an A atom with probability $1 - x$ and a B atom with probability $x$. Thus there are $N(1 - x)^2$ AA pairs, $Nx(1 - x)$ AB pairs, $N(1 - x)x$ BA pairs, and $Nx^2$ BB pairs, so the total energy is

$$U = -N\epsilon \left((1 - x)^2 + (1 - x)x/2 + x(1 - x)/2 + x^2\right) = -N\epsilon\,(1 - x + x^2) = -N\epsilon + N\epsilon\, x(1 - x).$$

There are $N!/(N_A!\, N_B!)$ ways to distribute $N_A$ and $N_B$ atoms to $N$ sites. The entropy is the logarithm of this number, so

$$\sigma = \log N! - \log N_A! - \log N_B!$$
$$= N \log N - N - N_A \log N_A + N_A - N_B \log N_B + N_B$$
$$= N \log N - N(1 - x) \log N(1 - x) - Nx \log Nx$$
$$= -N \left((1 - x) \log(1 - x) + x \log x\right).$$

Finally, the free energy is

$$F = U - \tau\sigma = N \left(-\epsilon + \epsilon\, x(1 - x) + \tau \left((1 - x) \log(1 - x) + x \log x\right)\right).$$

End Solution
(b) Suppose the temperature is zero! What is the configuration of atoms in this case? What is the ground state energy? What is the entropy? What is the free energy? Once again, use the fact that $N$ is very large to simplify your answer.

Solution

If the temperature is zero, the system minimizes its free energy by minimizing the energy, in other words by going to the ground state. AA or BB bonds represent the least energy, while it takes energy to create an AB bond. We want the fewest possible AB bonds. The minimum we can have is 1. All the A atoms are on the left of the line while all the B atoms are on the right of the line, or vice-versa. The ground state energy is thus $U = -(N - 1)\epsilon - \epsilon/2 \approx -N\epsilon$. There are two ground states, so $\sigma = \log 2 \approx 0$. And the free energy is just the energy in this case, $F = -N\epsilon$.

End Solution
(c) In general, the system may be a homogeneous mixture like that in part (a) or it may be more like the configuration in part (b). Suppose $x = 1/2$ and determine the transition temperature which separates the (b) like state from the (a) like state. Hint: consider, for the homogeneous mixture case, the plot of free energy versus $x$ at a given temperature.

Solution

It's clear that if the free energy is plotted against $x$ at a given temperature, the curve is symmetric about $x = 1/2$. This means $x = 1/2$ will be a local maximum or a local minimum. If it's a local maximum, then the system can lower its free energy by separating into a mostly A phase containing some B and a mostly B phase containing some A, in other words a configuration similar to part (b). If it's a local minimum, then a homogeneous mixture, as in part (a), is the equilibrium phase. So we need to find the temperature at which the curve goes from concave down to concave up. In other words, at what temperature is the second derivative with respect to $x$ equal to 0?

$$F = N \left(-\epsilon + \epsilon\, x(1 - x) + \tau \left((1 - x) \log(1 - x) + x \log x\right)\right)$$

$$\frac{dF}{dx} = N \left(\epsilon (1 - 2x) + \tau \left(-\log(1 - x) + \log x\right)\right)$$

$$\frac{d^2F}{dx^2} = N \left(-2\epsilon + \tau \left(\frac{1}{1 - x} + \frac{1}{x}\right)\right)$$

Inserting $x = 1/2$ and setting the second derivative to 0 gives $\tau = \epsilon/2$.

End Solution
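The sign change of the curvature at $x = 1/2$ can also be located numerically, which makes a useful cross-check on the algebra. The sketch below works with $F/N$ in units where $\epsilon = 1$; the finite-difference step, the bisection bracket, and all function names are choices made for this illustration.

```python
import math


def free_energy_per_bond(x, tau, eps=1.0):
    """F/N for the homogeneous mixture of part (a)."""
    mixing = (1 - x) * math.log(1 - x) + x * math.log(x)
    return -eps + eps * x * (1 - x) + tau * mixing


def curvature_at_half(tau, eps=1.0, h=1e-3):
    """Central-difference estimate of d^2(F/N)/dx^2 at x = 1/2."""
    f_plus = free_energy_per_bond(0.5 + h, tau, eps)
    f_mid = free_energy_per_bond(0.5, tau, eps)
    f_minus = free_energy_per_bond(0.5 - h, tau, eps)
    return (f_plus - 2 * f_mid + f_minus) / h**2


def transition_temperature(eps=1.0):
    """Bisect on tau for the temperature at which the curvature changes sign.

    The bracket assumes the curvature is negative at 0.1 * eps and positive
    at 2 * eps, which follows from the analytic form -2 eps + 4 tau.
    """
    lo, hi = 0.1 * eps, 2.0 * eps
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if curvature_at_half(mid, eps) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau_c = transition_temperature()
print(f"tau_c / eps = {tau_c:.4f}")
```

Below `tau_c` the point $x = 1/2$ is a local maximum of the free energy (phase separation is favored); above it, a local minimum (the homogeneous mixture is stable).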
5. Recall that a definition for the entropy is σ =−
X
pi log pi ,
i
where pi is the probability that the system is in state i and the sum is over all states which satisfy whatever constraints are placed on the system. Note that this definition handles zero probability states and also produces an additive entropy when two non-interacting systems are combined. In all cases, suppose the volume of the system is fixed. The number of particles in each state is Ni and the energy of each state is Ei (a) Suppose the number of particles in the system is fixed and the energy is fixed. What are the probabilities that maximize the entropy. Of course, the sum will be include only those states which have the correct number of particles and energy. (Or, equivalently, the probability for a state with the wrong number of particles or energy or both is 0.) In this and subsequent parts, you may have to introduce some auxiliary “constants.” Be sure to identify or give a physical interpretation for Peach such constant. Hint: be sure to use the fact that the system is in some state: i pi = 1.
Solution

This case corresponds to the microcanonical ensemble. We want to maximize the entropy subject to the constraint that the probabilities sum to 1. Also, we only include states with the correct number of particles and energy. We use a Lagrange multiplier and maximize

$$f_1 = \sum_i -p_i \log p_i + \lambda_1\left(1 - \sum_i p_i\right).$$

Taking the derivative with respect to a particular $p_i$ gives

$$\frac{\partial f_1}{\partial p_i} = -\log p_i - 1 - \lambda_1 .$$

Setting the derivative to zero gives $p_i = e^{-1-\lambda_1}$, so all the probabilities (for the states with the correct number of particles and the correct energy) are the same. If there are $M$ such states, the constraint that the probabilities sum to 1 gives

$$1 = \sum_i e^{-1-\lambda_1} = M e^{-1-\lambda_1}; \qquad \lambda_1 = \log M - 1; \qquad p_i = \frac{1}{M}.$$

So $\lambda_1$ is the entropy, $\sigma = \log M$, minus 1; it also plays a role in normalizing the probabilities.

End Solution
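As a numerical sanity check on the equal-probability result, the sketch below compares the entropy of the uniform distribution over $M$ allowed states with the entropies of many randomly drawn normalized distributions. The value `M = 6`, the number of trials, and the random seed are arbitrary choices made for illustration.

```python
import math
import random

def entropy(probs):
    """sigma = -sum_i p_i log p_i, using the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

M = 6                      # hypothetical number of allowed states
uniform = [1.0 / M] * M    # p_i = 1/M

random.seed(0)             # fixed seed so the run is repeatable
trial_entropies = []
for _ in range(1000):
    weights = [random.random() for _ in range(M)]
    total = sum(weights)
    trial_entropies.append(entropy([w / total for w in weights]))

print("log M                    =", math.log(M))
print("entropy of uniform p_i   =", entropy(uniform))
print("largest random-trial sigma =", max(trial_entropies))
```

No random trial should exceed $\log M$, which is the entropy of the uniform distribution.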
(b) Now suppose the number of particles is fixed, but the system is in equilibrium with a heat bath such that its average energy is $E$. What are the probabilities that maximize the entropy? As before, be sure to identify or give a physical interpretation for any constants you introduce.

Solution

This case corresponds to the canonical ensemble. In this case, we want to maximize

$$f_2 = \sum_i -p_i \log p_i + \lambda_1\left(1 - \sum_i p_i\right) + \lambda_2\left(E - \sum_i p_i E_i\right).$$

Differentiating with respect to a given $p_i$ gives

$$\frac{\partial f_2}{\partial p_i} = -\log p_i - 1 - \lambda_1 - \lambda_2 E_i .$$

Setting the derivative to 0, we have $p_i = e^{-1-\lambda_1-\lambda_2 E_i}$, so the probabilities are proportional to an exponential function of the energy of a state. This is just a Boltzmann factor and we identify $\lambda_2 = 1/\tau$, where $\tau$ is the temperature. Furthermore, we see that

$$\sum_i p_i = 1 = e^{-1-\lambda_1}\sum_i e^{-E_i/\tau} = e^{-1-\lambda_1} Z ,$$

where $Z$ is the partition function. In this case, $\lambda_1 = \log Z - 1$ and again, $\lambda_1$ has to do with normalizing the probabilities.

End Solution
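The role of the two multipliers can be made concrete with a small numerical sketch: given a set of level energies and a target average energy, solve for the temperature by bisection, then confirm that the Boltzmann probabilities are reproduced by $e^{-1-\lambda_1-\lambda_2 E_i}$. The level energies, the target average, the bisection bracket, and all function names are assumptions chosen for illustration.

```python
import math

def boltzmann_probs(energies, tau):
    """Return the probabilities e^{-E_i/tau}/Z and the partition function Z."""
    weights = [math.exp(-e / tau) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def mean_energy(energies, tau):
    probs, _ = boltzmann_probs(energies, tau)
    return sum(p * e for p, e in zip(probs, energies))

def solve_tau(energies, target, lo=1e-3, hi=1e3, iters=200):
    """Bisection on tau; the mean energy increases monotonically with tau."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

energies = [0.0, 1.0, 2.0, 3.0]   # hypothetical level energies, arbitrary units
target_E = 1.0                    # hypothetical average energy set by the heat bath

tau = solve_tau(energies, target_E)
probs, Z = boltzmann_probs(energies, tau)
lambda1 = math.log(Z) - 1         # normalization multiplier
lambda2 = 1.0 / tau               # energy multiplier

# The multiplier form of p_i should match the Boltzmann probabilities.
from_multipliers = [math.exp(-1 - lambda1 - lambda2 * e) for e in energies]
sigma = -sum(p * math.log(p) for p in probs)

print("tau =", tau)
print("mean energy =", mean_energy(energies, tau))
print("sigma =", sigma, " log Z + E/tau =", math.log(Z) + target_E / tau)
```

The last line also checks the identity $\sigma = \log Z + E/\tau$, which follows from substituting $\log p_i = -E_i/\tau - \log Z$ into the entropy definition.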
(c) Now suppose the system is in equilibrium with a heat bath and a “particle bath” (a reservoir with which it can exchange particles) such that the average energy is $E$ and the average number of particles in the system is $N$. Now what are the probabilities that maximize the entropy? As before, be sure to identify or give a physical interpretation for any constants you introduce.

Solution

This case corresponds to the grand canonical ensemble. In this case, we want to maximize

$$f_3 = \sum_i -p_i \log p_i + \lambda_1\left(1 - \sum_i p_i\right) + \lambda_2\left(E - \sum_i p_i E_i\right) + \lambda_3\left(N - \sum_i p_i N_i\right).$$

Differentiating with respect to a given $p_i$ gives

$$\frac{\partial f_3}{\partial p_i} = -\log p_i - 1 - \lambda_1 - \lambda_2 E_i - \lambda_3 N_i .$$

Setting the derivative to 0, we have $p_i = e^{-1-\lambda_1-\lambda_2 E_i-\lambda_3 N_i}$, so the probabilities are proportional to an exponential function of the energy of a state times an exponential function of the number of particles in the state. This is just a Gibbs factor and we identify $\lambda_2 = 1/\tau$ and $\lambda_3 = -\mu/\tau$, where $\tau$ is the temperature and $\mu$ is the chemical potential. Furthermore, we see that

$$\sum_i p_i = 1 = e^{-1-\lambda_1}\sum_i e^{(\mu N_i - E_i)/\tau} = e^{-1-\lambda_1}\,\mathcal{Z} ,$$

where $\mathcal{Z}$ is the grand partition function. In this case, $\lambda_1 = \log \mathcal{Z} - 1$ and again, $\lambda_1$ has to do with normalizing the probabilities.

End Solution
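A minimal sketch of the Gibbs-factor result, assuming the simplest possible system: a single orbital that is either empty ($N_i = 0$, $E_i = 0$) or singly occupied ($N_i = 1$, $E_i$ equal to an orbital energy written here as `eps`). The numerical values of `eps`, `tau`, and `mu` and the function name are illustrative assumptions. For this system the average occupancy should reduce to the Fermi-Dirac form $1/(e^{(\epsilon-\mu)/\tau} + 1)$.

```python
import math

def gibbs_probs(states, tau, mu):
    """states is a list of (N_i, E_i); returns probabilities and the grand partition function."""
    weights = [math.exp((mu * n - e) / tau) for n, e in states]
    grand_z = sum(weights)
    return [w / grand_z for w in weights], grand_z

eps, tau, mu = 1.0, 0.5, 0.8          # hypothetical values, arbitrary energy units
states = [(0, 0.0), (1, eps)]         # empty orbital, singly occupied orbital

probs, grand_z = gibbs_probs(states, tau, mu)

# Multipliers identified in the solution.
lambda1 = math.log(grand_z) - 1
lambda2 = 1.0 / tau
lambda3 = -mu / tau
from_multipliers = [math.exp(-1 - lambda1 - lambda2 * e - lambda3 * n)
                    for n, e in states]

mean_N = sum(p * n for p, (n, _) in zip(probs, states))
fermi_dirac = 1.0 / (math.exp((eps - mu) / tau) + 1.0)

print("probabilities     =", probs)
print("from multipliers  =", from_multipliers)
print("mean occupancy    =", mean_N, " Fermi-Dirac =", fermi_dirac)
```

Solving for $\tau$ and $\mu$ from given averages $E$ and $N$ would require a two-dimensional root find; the sketch takes $\tau$ and $\mu$ as inputs to keep the check short.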
The grade distributions are shown in the figure. The maximum for the homework is 174 and the maximum for the final is 60. The total score is calculated as 50 (Homework/174 + Final/60).
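The total-score formula can be written as a one-line function. The function and argument names are assumptions; the maxima 174 and 60 and the factor 50 come from the text above.

```python
def total_score(homework, final, hw_max=174.0, final_max=60.0):
    """Total score out of 100: homework and final each contribute up to 50 points."""
    return 50.0 * (homework / hw_max + final / final_max)

print(total_score(174, 60))   # full marks on both
print(total_score(87, 30))    # half marks on both
```

Full marks on both components give 100, so the homework and the final are weighted equally.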