The Statistical Mechanics of Lattice Gases, Volume I [Course Book ed.] 9781400863433

A state-of-the-art survey of both classical and quantum lattice gas models, this two-volume work will cover the rigorous


English Pages 536 [533] Year 2014


Table of contents :
Contents
Introduction
Chapter I. Preliminaries
Chapter II. The Pressure
Chapter III. States: The Classical Case
Chapter IV. States: The Quantum Case
Chapter V. High Temperature and Low Densities
References
Index

THE STATISTICAL MECHANICS OF LATTICE GASES

The Statistical Mechanics of Lattice Gases, Volume I

BARRY SIMON

Princeton University Press Princeton, New Jersey

Copyright © 1993 by Princeton University Press
Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540
In the United Kingdom: Princeton University Press, Chichester, West Sussex
All Rights Reserved

Library of Congress Cataloging-in-Publication Data
Simon, Barry, 1946-
The statistical mechanics of lattice gases / Barry Simon.
p. cm.
Includes bibliographical references and index.
ISBN 0-691-08779-2
1. Lattice gas. 2. Statistical mechanics. I. Title.
QC174.85.L38S6 1993
530.1'3-dc20
92-36714 CIP

This book has been composed in 10/12 Times Roman

Princeton University Press books are printed on acid-free paper, and meet the guidelines for permanence and durability of the Committee on Production Guidelines for Book Longevity of the Council on Library Resources

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

TO MY CHILDREN, Rivka and Sanford, Benny, Zvi, Ari, Chana

Contents

Introduction

I. Preliminaries
  1. Models to be Discussed
     Appendix to I.1. Thermodynamics and the Models of Statistical Mechanics
  2. Models Not to be Discussed
  3. Convexity Inequalities: Abelian Case
  4. Convexity Inequalities: Non-Abelian Case
  5. Linear Functionals on Infinite-Dimensional Spaces
  6. Legendre Transforms
  7. States on C*-Algebras

II. The Pressure
  1. The Basic Formalism
  2. Convergence of the Pressure: Free Boundary Conditions
  3. Convergence of the Pressure: Other Boundary Conditions
  4. Pressure for Coulomb Interactions
  5. Transfer Matrices, I: One Dimension
  6. Transfer Matrices, II: Two-Dimensional Ising Model
  7. Duality and Other Transformations
     Appendix to II.7. Bethe Lattices and the Bethe Approximation
  8. Surface Pressure and Surface Tension
  9. Limit Theorems, I: Bishop-de Leeuw Order
  10. Limit Theorems, II: Lieb's Method
  11. Limit Theorems, III: 1/D Expansion
  12. Ursell Functions
  13. Mean Field Theory
  14. Limit Theorems, IV: Mean Field Limit
  15. Limit Theorems, V: Potts Model Limit

III. States: The Classical Case
  1. States
  2. Equilibrium States: The DLR Equations
  3. Translation Invariant Equilibrium States and Tangent Functionals
  4. Entropy and the Gibbs Variational Principle
     Appendix to III.4. The Microcanonical Ensemble
  5. Pure States and Pure Phases
  6. Ward Identities and the Classical Bogoliubov Inequalities
  7. Absence of Continuous Symmetry Breaking in Two Dimensions
  8. Energy Shift Estimates
  9. The Third Law of Thermodynamics
  10. Phase Transitions and Fluctuations of Macroscopic Observables
  11. Decay of Correlations in Two-Dimensional Plane Rotors

IV. States: The Quantum Case
  1. States
  2. Entropy and the Gibbs Variational Principle
  3. Time Automorphisms
  4. Equilibrium States: The KMS Condition
  5. Equilibrium States: The Gibbs Condition
     Appendix to IV.5. KMS Conditions and the Second Law of Thermodynamics
  6. Energy Shift Estimates
  7. The Inequalities of Bogoliubov and Bruch-Falk
  8. Absence of Continuous Symmetry Breaking in Two Dimensions

V. High Temperature and Low Densities
  1. Dobrushin's Uniqueness Theorem
  2. Analyticity and Decay of Correlations
  3. KRVO Distance and Mean Field Theory Bounds on Transition Temperatures
  4. Ward Identities and Mean Field Theory Bounds on Transition Temperatures
  5. Mean Field Bounds on the Magnetization
  6. Fisher's Bounds
  7. High-Temperature and Small-Fugacity Expansions
     Appendix to V.7. Combinatorial Aspects of Polymer Expansions
  8. Low-Temperature Expansions
  9. Detailed Analysis of the Decay of Correlations at Large Temperatures

References

Index

Introduction

In 1979-80, my last year at Princeton (although I didn't know it at the time!), I gave a course on rigorous results in the statistical mechanics of discrete lattice models. In preparing the course, I realized that a twenty-year period of explosive growth in our knowledge seemed to be drawing to a close as the most approachable problems were solved. So it seemed like a good time to think about a book summarizing the state of our knowledge. It was clear such a book would need to be comprehensive, even encyclopedic, since it was describing a mature subject. Little did I realize that within a short time the project would bifurcate into a two-volume plan, nor that it would stretch out for over ten years while I juggled other responsibilities and interests.

I was correct that the subject had matured. A book begun in 1969 would have had to be extensively rewritten as the seventies progressed, but the changes/additions to this book as the process of writing got drawn out were not large. Indeed, the only results of the eighties comparable in depth to the major advances of the sixties and seventies were those of Fröhlich-Spencer on multiscale cluster expansions (which are to be discussed in volume 2).

I made several decisions about the book early on: I would restrict myself to "standard" lattice models, which meant no spin glasses or field theories or ..., but I would include something about quantum systems. I'd freely use correlation inequalities in passing, even though their comprehensive treatment would wait for the end of volume 2. Many of the "sexiest" topics (infrared bounds, the Fröhlich-Spencer theory, the Lee-Yang theorem, and correlation inequalities) have been postponed until volume 2. I hope that it won't take another fourteen years for it to appear, but it is certainly going to take some time.

This has been a fun subject to write about as it has so many high points of great beauty while it requires little in the way of fancy functional analytic argument.

The earlier parts of the book were kicking around in manuscript form for many years and benefited from comments from many friends and colleagues to whom I am grateful. I'd especially single out Jean Bricmont, Jürg Fröhlich, and Alan Sokal for detailed commentary. A variety of secretaries also deserve thanks, most notably Joan Yap and Cherie Galvez.

BARRY SIMON
Los Angeles, Ca.
August, 1992


Chapter I

Preliminaries

I.1 Models to be Discussed

Lattice models are caricatures invented to illuminate various aspects of elementary statistical mechanics, especially the phenomena of phase transitions and spontaneously broken symmetry. The simplest of all models is the Ising (or Lenz-Ising) model. This model was suggested to Ising by his thesis adviser, Lenz. Ising [1925] solved the one-dimensional model, an easy task (we will solve it three times: once in this section, once using transfer matrices in Section II.5, and once using high-temperature expansions in Section V.6), and on the basis of the fact that the one-dimensional model had no phase transition, he asserted there was no phase transition in any dimension. As we shall see, this is false. It is ironic that on the basis of an elementary calculation and erroneous conclusion, Ising's name has become among the most commonly mentioned in the theoretical physics literature. But history has had its revenge. Ising's name, which is correctly pronounced "E-zing," is almost universally mispronounced as "I-zing"!

To describe the Ising model, we pick an integer $\nu$ and let $\mathbb{Z}^\nu$ be the family of $\nu$-tuples of integers $\alpha = (\alpha_1, \ldots, \alpha_\nu)$, called sites. Fix $\ell = 1, 2, \ldots$. Let $\Lambda_\ell = \{\alpha \mid |\alpha_i| \le \ell\}$, so $\Lambda_\ell$ has $(2\ell + 1)^\nu = |\Lambda_\ell|$ sites. At each site, we place a "spin," $\sigma_\alpha$, which can take the values $\pm 1$ corresponding to "spin up" or "spin down." We imagine the spin represents a little magnet which can point in one of two directions. Thus, in finite volume, a configuration of one system corresponds to giving $|\Lambda_\ell|$ numbers, each $\pm 1$; that is, there are $2^{|\Lambda_\ell|}$ distinct configurations (values of "$\sigma_\alpha$"). Given a configuration $\{\sigma_\alpha\}_{\alpha \in \Lambda_\ell}$, we define its energy to be

$$H_{\Lambda_\ell}(\sigma_\alpha) = -J \sum_{\substack{\langle \alpha\gamma \rangle;\ |\alpha - \gamma| = 1 \\ \alpha, \gamma \in \Lambda_\ell}} \sigma_\alpha \sigma_\gamma \tag{1.1.1}$$

In (1.1.1), $|\alpha - \gamma| = \sum_{i=1}^{\nu} |\alpha_i - \gamma_i|$, so the sum is over nearest neighbors in $\mathbb{Z}^\nu$. The symbol $\langle \alpha\gamma \rangle$ is a reminder that normally we have a convention that, in sums like (1.1.1), each pair is counted only once. Thus, for example, if $J_{\alpha\gamma}$ is a symmetric matrix vanishing on the diagonal,

$$\sum_{\langle \alpha\gamma \rangle} J_{\alpha\gamma} = \frac{1}{2} \sum_{\alpha} \sum_{\gamma} J_{\alpha\gamma}.$$

In (1.1.1), $J$ is a parameter which we most often take positive. $J > 0$ means that the more pairs of spins which point parallel (i.e., $\sigma_\alpha \sigma_\gamma = 1$), the lower the energy. Since we will see that low energy states have a higher weighting in statistical mechanics, this means that spins have a preferred possibility of pointing parallel, so in our magnetic picture, the system is ferromagnetic. Indeed, the Hamiltonian (= energy function or operator) (1.1.1) is often called the nearest-neighbor, spin-$\tfrac12$ Ising ferromagnet; the meaning of the phrase "spin-$\tfrac12$" will be made clear below.

According to the fundamental laws of statistical mechanics as laid down by our forefathers (Maxwell, Boltzmann, and Gibbs), if our system is placed in equilibrium with a heat bath at temperature $T$, then configurations occur with probabilities, the probability of a configuration being proportional to $\exp(-\beta H(\sigma_\alpha))$, where $\beta = (kT)^{-1}$ and $k$ is Boltzmann's constant. Thus, the expected value, $\langle f(\sigma) \rangle$, of some function, $f$, of the configuration is given by

$$\langle f \rangle = Z^{-1} \sum_{\sigma_\alpha = \pm 1} \exp(-\beta H(\sigma_\alpha))\, f(\sigma_\alpha) \tag{1.1.2}$$

the sum being over the $2^{|\Lambda_\ell|}$ possible configurations. $Z$ is the normalization constant

$$Z = \sum_{\sigma_\alpha = \pm 1} \exp(-\beta H(\sigma_\alpha)) \tag{1.1.3}$$

called the partition function. The set of weights $e^{-\beta H}/Z$ is called the Gibbs distribution; see the appendix for a discussion of a "justification" of the Gibbs choice. The partition function is of importance because of its connection with some basic thermodynamic objects. It is quite natural to associate the expected value of $H$

$$\langle H(\sigma_\alpha) \rangle = -\partial(\ln Z)/\partial\beta \tag{1.1.4}$$

with the internal energy of the system, in which case basic thermodynamics (see, e.g., Sommerfeld [1956], Pippard [1964]) says that

$$F_{\Lambda_\ell} = -\beta^{-1} \ln Z$$

is the "free energy," a fundamental thermodynamic "potential."
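The definitions (1.1.1) through (1.1.4) can be tried out on a box small enough to enumerate. The following is only a minimal sketch: the function names, the default box ($\ell = 1$, $\nu = 2$, nine sites) and the free boundary are illustrative choices, not anything fixed by the text.

```python
import itertools
import math

def box_sites(ell, nu):
    """All sites alpha in Z^nu with |alpha_i| <= ell."""
    return list(itertools.product(range(-ell, ell + 1), repeat=nu))

def nearest_neighbor_pairs(sites):
    """Pairs with |alpha - gamma| = 1, each pair counted once."""
    site_set = set(sites)
    pairs = []
    for a in sites:
        for i in range(len(a)):
            g = a[:i] + (a[i] + 1,) + a[i + 1:]
            if g in site_set:
                pairs.append((a, g))
    return pairs

def energy(spins, pairs, J):
    """H = -J * sum over nearest-neighbor pairs of sigma_a * sigma_g, as in (1.1.1)."""
    return -J * sum(spins[a] * spins[g] for a, g in pairs)

def gibbs(beta, J, ell=1, nu=2):
    """Return (Z, <H>, F) by summing over all 2^{|Lambda|} configurations."""
    sites = box_sites(ell, nu)
    pairs = nearest_neighbor_pairs(sites)
    Z = 0.0
    H_weighted = 0.0
    for values in itertools.product((-1, 1), repeat=len(sites)):
        spins = dict(zip(sites, values))
        H = energy(spins, pairs, J)
        w = math.exp(-beta * H)      # unnormalized Gibbs weight
        Z += w
        H_weighted += H * w
    return Z, H_weighted / Z, -math.log(Z) / beta
```

Calling `gibbs(0.4, 1.0)` sums over 512 configurations. Comparing the returned $\langle H \rangle$ with a centered finite difference of $-\ln Z$ in $\beta$ is a quick numerical check of (1.1.4).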

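To further discuss the physics of the Ising model, it is useful to introduce an additional parameter, $h$, representing an external magnetic field. Choosing units so that the magnetic moment of a spin is 1, we see that

$$H = -J \sum_{\substack{\langle \alpha\gamma \rangle;\ |\alpha - \gamma| = 1 \\ \alpha, \gamma \in \Lambda_\ell}} \sigma_\alpha \sigma_\gamma - h \sum_{\alpha \in \Lambda_\ell} \sigma_\alpha.$$

Now $Z$, $\langle \cdot \rangle$, $F$ depend on $\Lambda_\ell$, $\beta$ and $h$ (we imagine fixing $J$). The magnetization of the magnet is given by

$$M_{\Lambda_\ell}(\beta, h) = \Big\langle |\Lambda_\ell|^{-1} \sum_{\alpha \in \Lambda_\ell} \sigma_\alpha \Big\rangle = -|\Lambda_\ell|^{-1}\, \partial F/\partial h. \tag{1.1.5}$$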

In general, $F$ is a kind of generating function for truncated correlation functions.

An observed aspect of ferromagnets in nature is that (at least at low enough temperatures) they have memories; that is, if a magnetic field is turned on and then off, the system remains magnetized in the direction of the field. Thus, one would like to think that, for $\beta$ sufficiently large,

$$\lim_{h \downarrow 0} M_{\Lambda_\ell}(\beta, h) \neq 0 \tag{1.1.6}$$

but this is false. There are two related ways of seeing this. First, we can note that the finite-volume expectation $\langle \cdot \rangle_{\Lambda_\ell, \beta, h}$ is continuous in $\beta, h$, so $\lim_{h \downarrow 0} M_{\Lambda_\ell}(\beta, h) = M_{\Lambda_\ell}(\beta, h = 0)$, and since $\langle \cdot \rangle_{\Lambda_\ell, \beta, h=0}$ is left invariant under the map $\sigma_\alpha \to -\sigma_\alpha$ (all $\alpha$), $M_{\Lambda_\ell}(\beta, h = 0) = 0$. Alternatively, in finite volume $F_{\Lambda_\ell}$ is analytic in $\beta, h$ and in particular it is $C^1$ in $h$. Moreover, by an obvious symmetry, it is an even function of $h$; hence, its derivative at $h = 0$ is zero.

The reconciliation of the obvious analyticity of the Gibbs formalism and the observed discontinuities in nature (discontinuities which, we might add, come at such well-defined values of parameters that you can set your thermometers, and indeed we do set our thermometers, by them) is an interesting problem. The accepted solution is that true discontinuities in derivatives only occur in the limit of infinite $\Lambda$. The picture that results is that, in nature, where material is finite in size, a function like the density of water at fixed pressure and varied temperature is really a smooth function of $T$ (or would be, if one waited for thermal equilibrium at each value of $T$). However, it is close to a discontinuous function; that is, the density jumps over such a short temperature interval (an interval whose size shrinks with increasing volume of our sample) that for all practical purposes, the density is discontinuous. Under such circumstances, it is obviously convenient to study objects in the idealization of infinite volume.

It should be mentioned that this notion of infinite volume reconciling phase transitions and statistical mechanics took time to be accepted by the theoretical physics community. Indeed, it was only with Onsager's solution of the Ising model in 1944 in two dimensions and zero field that the notion was more or less universally accepted (Peierls's work in 1936 should have done the trick, but it wasn't widely appreciated at the time). In fact, at the van der Waals Centenary Conference in 1937, there was a spirited debate on whether phase transitions are consistent with the formalism of statistical mechanics. After the debate, a vote was taken on whether the infinite-volume limit could provide the answer. While the infinite-volume limit did win, it was a close vote! (See the discussion on pp. 432-33 in Pais's beautiful biography of Einstein [1982].)

Thus, a basic object for which one wants to prove existence and then study it is the free energy per unit volume,

$$f(\beta, h) = \lim_{\ell \to \infty} |\Lambda_\ell|^{-1} F_{\Lambda_\ell}(\beta, h). \tag{1.1.7}$$
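The finite-volume facts used in this argument (zero magnetization at $h = 0$, evenness of $F$ in $h$, and the relation (1.1.5) between $M$ and $\partial F/\partial h$) can be checked by brute force. This is a sketch under stated assumptions: an open chain of $n$ spins stands in for $\Lambda_\ell$, and the function names are hypothetical.

```python
import itertools
import math

def log_Z(beta, J, h, n):
    """ln Z for an open chain of n spins with H = -J sum s_i s_{i+1} - h sum s_i."""
    total = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        H = -J * sum(s[i] * s[i + 1] for i in range(n - 1)) - h * sum(s)
        total += math.exp(-beta * H)
    return math.log(total)

def free_energy(beta, J, h, n):
    """F = -beta^{-1} ln Z."""
    return -log_Z(beta, J, h, n) / beta

def magnetization(beta, J, h, n):
    """<n^{-1} sum s_i>, computed directly from the Gibbs weights."""
    Z = 0.0
    m = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        H = -J * sum(s[i] * s[i + 1] for i in range(n - 1)) - h * sum(s)
        w = math.exp(-beta * H)
        Z += w
        m += w * sum(s) / n
    return m / Z
```

For any finite `n`, `magnetization(beta, J, 0.0, n)` vanishes up to rounding however large `beta` is. That is the reason a nonzero limit in (1.1.6) can only emerge after the volume is taken to infinity.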

Chapter II proves the existence of this limit and some general features of the functional dependence of $f$. We note that independently of its interest for phase transitions, the existence of the limit (1.1.7) is of great physical significance. After all, if one needs to know the specific heat of iron, one looks it up in a table. One doesn't bother to find out the precise geometric configuration used by the experimenter in measuring this specific heat; rather, one takes three points on faith: that the experimenter was competent enough to take a big enough piece of material; that surface effects aren't important; and that for large pieces of material, the specific heat is proportional to the volume, that is, that limits like (1.1.7) exist! In finite volume, there are, in principle, fluctuations about the predictions of thermodynamics, so that one only expects thermodynamic arguments to be exact in this infinite-volume limit. For this reason, the infinite-volume limit is often called the thermodynamic limit.

It was a basic discovery of Gibbs (when, prior to his work on statistical mechanics, he worked on the thermodynamics of mixtures; see Wightman [1979] for a delightful discussion of the history) that thermodynamic functions are concave or convex in many basic parameters (convex functions are discussed in Section I.3); in particular, $f$ is a concave function of $h$ for each fixed $\beta$. Such functions automatically have one-sided derivatives (see Prop. I.3.1(f)), that is,

$$(D_h^{\pm} f)(\beta, h) = \lim_{\varepsilon \downarrow 0} (\pm\varepsilon)^{-1} [f(\beta, h \pm \varepsilon) - f(\beta, h)]$$

exists. The symmetry $h \to -h$ now says $f(\beta, -h) = f(\beta, h)$, so $(D_h^{+} f)(\beta, -h) = -(D_h^{-} f)(\beta, h)$. Given the formula in (1.1.5), it is natural to replace (1.1.6) by the conjecture

$$(D_h^{+} f)(\beta, h = 0) \neq (D_h^{-} f)(\beta, h = 0). \tag{1.1.8}$$

Such a discontinuity in the derivative of the fundamental thermodynamic function is called a first-order phase transition. If some higher-order derivative fails to exist or is discontinuous, we say there is a higher-order phase transition. Of course, it could happen that $f$ fails to be $C^1$ in one parameter and only fails to be $C^2$, or even is analytic, in another. It is thus often useful to speak of a first-order phase transition in $h$ (or some other parameter).

(1.1.5) suggests it should be useful to study infinite-volume limits of finite-volume Gibbs expectations. Here, a new phenomenon enters: The limit (1.1.7) always exists and the limit is the same if $H_\Lambda$ is modified by changing "boundary conditions." The limit of states might not exist (or more properly, we only know that it exists at special values of parameters, like high temperatures [small $\beta$], or under special restrictions on the interactions like ferromagnetism) and, in general, the limit may depend on boundary conditions. Since $\langle \sigma_{\alpha_0} \rangle$ is simpler than $\langle |\Lambda_\ell|^{-1} \sum_{\alpha \in \Lambda_\ell} \sigma_\alpha \rangle$, it should be particularly useful to consider infinite-volume states invariant under translations (so $\langle \sigma_\alpha \rangle$ is independent of $\alpha$), so that (1.1.5) suggests that $-\partial f/\partial h = \langle \sigma_\alpha \rangle$ for such states. What the detailed analysis (see chapter III) shows is that the set of values of $\langle \sigma_\alpha \rangle$, as one runs through all "suitable" infinite-volume translation invariant states, is precisely the set of numbers between $-(D_h^{+} f)(h)$ and $-(D_h^{-} f)(h)$. A similar relation holds for other functions than $\sigma_\alpha$: for example, if we want to know about expectation values of $\sigma_\alpha \sigma_{\alpha+\delta}$ for some fixed $\delta$, we must look at the free energy per unit volume, $f(\beta, \eta)$, associated to the Hamiltonian

$$-J \sum_{\substack{\langle \alpha\gamma \rangle \\ |\alpha - \gamma| = 1}} \sigma_\alpha \sigma_\gamma - \eta \sum_{\alpha,\, \alpha+\delta \in \Lambda_\ell} \sigma_\alpha \sigma_{\alpha+\delta}$$

and look at derivatives with respect to $\eta$. Thus, one sees a basic fact: First-order phase transitions in some parameter are equivalent to the existence of more than one distinct translation invariant infinite-volume state.

If there are states at $h = 0$ with $\langle \sigma_\alpha \rangle \neq 0$, we have states which do not have the symmetry $\sigma_\alpha \to -\sigma_\alpha$ of the basic interaction (rather, $\sigma_\alpha \to -\sigma_\alpha$ takes the state to another suitable infinite-volume limit). This illustrates the fundamental notion of spontaneous broken symmetry, a notion which (in a related context; see Section I.2) is fundamental to all modern theories of elementary particle physics.

There is an alternative interpretation of the Ising model which associates a different physics to it: namely, consider a "gas" of particles. At each site, $\alpha$, we either place a particle, in which case we set $\rho_\alpha = 1$, or we don't, in which case we set $\rho_\alpha = 0$. The particles attract one another in the sense that we gain energy, $K > 0$, if two particles are nearest neighbors, that is,

$$H_\Lambda = -K \sum_{\substack{\langle \alpha\gamma \rangle;\ |\alpha - \gamma| = 1 \\ \alpha, \gamma \in \Lambda}} \rho_\alpha \rho_\gamma.$$
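One way to see how this gas energy relates to the Ising energy is the change of variables $\sigma_\alpha = 2\rho_\alpha - 1$. The sketch below checks the resulting identity numerically. Its setup is an assumption made for illustration and is not taken from the text: a one-dimensional ring of $n$ sites (periodic, so no boundary terms arise), with Ising parameters $J = K/4$, $h = K/2$ and an additive constant $-nK/4$.

```python
import itertools

def gas_energy(rho, K):
    """-K * (number of occupied nearest-neighbor pairs) on a ring."""
    n = len(rho)
    return -K * sum(rho[i] * rho[(i + 1) % n] for i in range(n))

def ising_energy(sigma, J, h):
    """-J sum s_i s_{i+1} - h sum s_i on the same ring."""
    n = len(sigma)
    bonds = sum(sigma[i] * sigma[(i + 1) % n] for i in range(n))
    return -J * bonds - h * sum(sigma)

def max_mismatch(n, K):
    """Largest deviation from gas = Ising(J=K/4, h=K/2) - n*K/4 over all 2^n configurations."""
    worst = 0.0
    for rho in itertools.product((0, 1), repeat=n):
        sigma = [2 * r - 1 for r in rho]   # occupied -> +1, empty -> -1
        lhs = gas_energy(rho, K)
        rhs = ising_energy(sigma, K / 4, K / 2) - n * K / 4
        worst = max(worst, abs(lhs - rhs))
    return worst
```

On a box with a free boundary the same substitution leaves extra terms for sites with fewer than $2\nu$ neighbors. The ring avoids them, which is why it was chosen here.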


We do not wish to study this model in a picture where the particle number $N_\Lambda = \sum_{\alpha \in \Lambda} \rho_\alpha$ is fixed (or in which the density is fixed), but rather want to allow variable particle numbers. This means that one is supposed to work in the so-called grand canonical ensemble. Rather than the density, a direct physical quantity, one has a parameter, $\mu$, the chemical potential. One forms the grand canonical partition function

$$Z_{GC} = \sum_{\rho_\alpha = 0, 1} \exp\big(-\beta [H_\Lambda(\rho) - \mu N_\Lambda(\rho)]\big).$$

In the grand canonical ensemble, the basic thermodynamic object is the pressure, defined by [...] (1.1.9).

We claim that this is just a rewriting of the Ising model; for define $\sigma_\alpha = 2\rho_\alpha - 1$. Then

$$H_\Lambda(\rho) - \mu N_\Lambda(\rho) = -J \sum_{\langle \alpha\gamma \rangle} \sigma_\alpha \sigma_\gamma - h \sum_{\alpha \in \Lambda} \sigma_\alpha - C|\Lambda| + (\text{boundary terms})$$

where $J = K/4$, $h = \tfrac12(\mu + \nu K)$ ($\nu$ is the dimension), $C = \tfrac12 \mu + \tfrac14 \nu K$, and the boundary terms involve constant terms and multiples of $\sigma_\alpha$ for $\alpha$ on the boundary of $\Lambda$, but most significant, the number of terms is of the order of the number of boundary sites. The boundary terms come from the fact that in $H_\Lambda$ spins not on the boundary have $2\nu$ neighbors while boundary spins have fewer neighbors. This is only a boundary term and so it shouldn't affect the limit, which suggests that the pressure and the free energy per unit volume are related by the change of variables above. This is indeed correct. We discuss such "physically equivalent" interactions in Section II.1.

This lattice gas language has several aspects. First, the Ising model is a lousy model of a magnet, since the symmetry of magnets in three dimensions should be $SO(3)$ (rotations), not $\mathbb{Z}_2$ (spin flip) and, as we shall discuss shortly, that change is very significant. The Ising model is, however, a pretty good model of an alloy ($\rho_\alpha = 1$ means one material, $\rho_\alpha = 0$ the other; $\mu$ is now a difference of chemical potentials).

Second, while we drop the subscript "GC" on $Z$, we will talk about the "pressure" and limit (1.1.9) rather than "the free energy per unit volume" and the limit (1.1.7). We do this because the phrase is shorter and it is convenient not to have to continually insert the factor [...]. However, we warn the reader first


that much of the literature discusses free energy, and because of the minus sign, getting lower bounds on $f$ is the same as getting upper bounds on $p$. Moreover, we continue to use expressions like "lattice gas" and "pressure" in cases where the basic variables (which replace $\sigma_\alpha$ or $\rho_\alpha$) are such that a canonical ensemble is more appropriate. As long as we don't take the language too seriously, this is no problem.

Third, we note that the spin flip symmetry is more obscure in lattice gas language. It is not clear in this language why the value of $\mu$ corresponding to $h = 0$ is special; but it is the point where the liquid-gas transition takes place. In fact, the hidden symmetry is as responsible as the lattice cutoff for the proofs of phase transitions in these models of "gases."

Fourth, we note it is useful to have the two intuitions: Correlation inequalities (see chapter X) are a powerful tool; the GKS correlation inequalities are intuitive in the magnetic picture, while the FKG inequalities are most intuitive in the lattice gas picture. The fugacity, which is a basic variable in the grand canonical ensemble, is a natural variable in the Lee-Yang theorem (see chapter IX).

As a warm-up, we compute [...] in case [...]. Then, summing independently over the [...] is the same as summing independently over the choices [...]; we see that [...] and we see that [...]. In the same way, if we compute [...], using the fact that if $k$ [...] $m$, then [...], we have [...]. Since [...] (actually, we don't need that [...], only that [...]). Thus, in one dimension, [...] is analytic in [...]. Of course, this calculation does not establish that (1.1.8) fails if $\nu = 1$. One can calculate [...] exactly (see Section II.5) and indeed [...]. Thus, [...] is jointly analytic and (1.1.8) does not hold in one dimension.

Onsager [1944], in a celebrated work, computed [...]. We discuss his result in Section II.6. [...] is real analytic in $\beta$ except at a single point, $\beta_c$, but the second derivative (related to the specific heat) diverges as $\ln|\beta - \beta_c|$, that is, $p$ has a second-order phase transition in $\beta$. Since one has not computed [...], one cannot directly see that (1.1.8) holds if $\nu = 2$, that is, that there is a first-order transition in $h$ for large $\beta$. However, by computing infinite-volume expectations at $h = 0$ and using various theoretical devices which are described in the later chapters, one can indirectly compute [...] and, in particular, establish that (1.1.8) holds if [...].

There is an alternate way of establishing (1.1.8) for $\beta$ large which has the disadvantage of not computing $\beta_c$ exactly, but the advantage of working in a large variety of situations where one does not know how to compute exactly. This method, due to Peierls [1936], and its descendants, will be the subject of chapters VI and VII in volume 2. It is useful to describe an intuition behind the method which explains why, for the (discrete symmetry) Ising model, one has a first-order phase transition only if $\nu \ge 2$. Later in this section, we will discuss how this intuition is modified in the case of continuous symmetry.

To discuss this intuition, we need a basic fact which is reasonable and which will be proven in chapter X (see Griffiths [1971]). The plus boundary condition state, [...], is defined by adding [...] to $H$, where [...] is the set of $\alpha$ in $\Lambda_\ell$ with at least one neighbor outside $\Lambda_\ell$ and [...] is the number of such neighbors. This Hamiltonian describes relative energies properly in a situation where we imagine freezing all spins outside $\Lambda_\ell$ to be $+1$ and let the spins inside be variable. We claim that [...] (1.1.10). In (1.1.10), the subscript $\infty$ refers to the limit as $\ell \to \infty$ (which exists, see chapter X or Griffiths [1971]) and $\alpha$ is arbitrary since [...] is translation invariant


(see chapter X or Griffiths [1971]). Given that $D^{+} p$ is the maximal value of $\langle \sigma_\alpha \rangle$ over all infinite-volume states, (1.1.10) is reasonable since intuitively this should be maximized if all outside spins are positive.

Now, we can explain why if $\nu > 1$, one expects $\langle \sigma_\alpha \rangle^{+} > 0$ if $\beta$ is large. First consider $\nu = 1$. With plus boundary conditions, the ground configuration (minimum energy configuration) has all spins $\sigma_\alpha = +1$. The lowest energy configurations with $\sigma_0 = -1$ have a long array of minus spins containing zero. Their energy is $2J$ more than the ground configuration, but their number is arbitrarily large as $\ell \to \infty$. Thus, the large number of such configurations should overwhelm the factor $e^{-2\beta J}$ for any $\beta$. Of course, this consideration of the simplest low energy states is not definite, but it is suggestive. We will prove $D^{+} p = 0$ if $\nu = 1$ for very general systems in Sections II.5 and III.8.

Now consider $\nu = 2$. The simplest configurations with $\sigma_0 = -1$ consist of a "connected" block of minus spin containing zero. The cost, relative to the ground configuration, in energy for such a block is proportional to the length, $q$, of the boundary of the block, since there are that number of neighboring pairs with $\sigma_\alpha \sigma_\gamma = -1$ (this is not quite true at corners unless $q$ is suitably defined). Thus, the energy of such a configuration is $2Jq$. A simple combinatorial estimate shows that the number, $n_q$, of blocks with boundary length $q$ and with zero in the block is bounded by $A^q$ for suitable $A$. Thus, relative to the ground configuration, the total weight of these elementary configurations is $\sum_q e^{-2\beta J q} A^q$, which can be made arbitrarily small by taking $\beta$ large. This suggests that $\langle \sigma_\alpha \rangle^{+} > 0$ for $\beta$ large and indeed that $\lim_{\beta \to \infty} \langle \sigma_\alpha \rangle^{+}_{\ell = \infty, \beta, h = 0} = 1$.

Again, the intuition isn't a proof: In the infinite volume $\beta < \infty$ situation, the probability that all $\sigma_\alpha = +1$ is zero (for while it costs an energy $4J$ to flip a single spin, there are an infinite number of spins one could flip); indeed, one can show the fraction of spins with $\sigma_\alpha = -1$ in a typical configuration for $\langle \cdot \rangle^{+}$ is $\tfrac12 (1 - \langle \sigma_0 \rangle^{+}) > 0$ if $\beta < \infty$. But the rigorous argument of Peierls isn't too far removed from the intuition.

The logarithm of a number of configurations like $n_q$ is related to entropy (see Section III.4), so $n_q e^{-\beta E} = \exp(-\beta E + \ln n_q)$ and one talks about a balance of energy vs. entropy. In one dimension, entropy overwhelms energy and there is no first-order phase transition; while in two dimensions, energy can overwhelm entropy at large $\beta$. As we already explained, the Peierls argument, described in chapter VI, provides a rigorous version of entropy-energy arguments in cases where energy overwhelms entropy. In cases where entropy overwhelms energy, there is a rigorous version of entropy-energy arguments due to Simon-Sokal [1981] which exploits the Gibbs variational principle of Sections III.3, 4. We do not discuss the Simon-Sokal method in these volumes.

The moral of all this, something we will prove in many ways below, is that discrete symmetries are unbroken in one dimension but can be spontaneously broken in dimension two or more.

* * *
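The one-dimensional half of this moral can be checked on a short chain. The sketch below assumes the standard closed forms for a free-boundary chain of $n$ spins at $h = 0$, namely $Z = 2(2\cosh\beta J)^{n-1}$ and $\langle \sigma_i \sigma_{i+k} \rangle = (\tanh \beta J)^k$. These are textbook results supplied here for comparison and are not quoted from the passage above; the function names are hypothetical.

```python
import itertools
import math

def chain_stats(beta, J, n, i, k):
    """Brute-force Z and <s_i s_{i+k}> for an open chain of n spins at h = 0."""
    Z = 0.0
    corr = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        H = -J * sum(s[j] * s[j + 1] for j in range(n - 1))
        w = math.exp(-beta * H)
        Z += w
        corr += w * s[i] * s[i + k]
    return Z, corr / Z

def closed_form(beta, J, n, k):
    """Assumed closed forms: Z = 2 (2 cosh(beta J))^(n-1), correlation = tanh(beta J)^k."""
    Z = 2.0 * (2.0 * math.cosh(beta * J)) ** (n - 1)
    return Z, math.tanh(beta * J) ** k
```

Because $\tanh \beta J < 1$ for every finite $\beta$, the correlation decays exponentially in $k$ at all temperatures. That is consistent with the absence of symmetry breaking in one dimension.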


The model we described above has a number of generalizations whose study will concern us in these volumes. First, one often wants to allow more general interactions than nearest neighbors, that is, [...] is replaced by [...] for some suitable function $J$ on [...]. In addition, more complicated interactions than ones involving pairs are considered.

More significant, one replaces $\sigma_\alpha = \pm 1$ by a $\sigma_\alpha$ taking more values. The simplest examples allow $m$ values equally spaced and placed symmetrically about zero. They are usually normalized so that either the spacings are 1 or so that the maximum value is 1 (so, for example, if $m = 4$, the natural choices are either $\pm\tfrac12, \pm\tfrac32$ or $\pm\tfrac13, \pm 1$). With $m = 2S + 1$, this model is called the spin $S$ Ising model (the reason for the name will become clearer when we describe quantum models below). Thus, what we discussed above is the spin-$\tfrac12$ Ising model.

One also wants to consider cases where $\sigma_\alpha$ is a continuous variable and [...] is replaced by [...] for some measure [...] (called the a priori measure) on an interval $[a, b]$. The formalism is discussed in detail in Section II.1. So long as $\sigma_\alpha$ takes values in $\mathbb{R}$ ("one-component models"), the qualitative nature of the models is similar to that of the spin-$\tfrac12$ Ising model.

New phenomena occur if $\sigma_\alpha$ is a vector valued variable with values in $\mathbb{R}^N$ ("$N$-component models") with a rotation invariant a priori measure. The simplest models in the class are the $N$-vector models, where $\sigma_\alpha$ is a unit vector in $\mathbb{R}^N$ and [...] is the unique rotation invariant measure on [...]. In the nearest-neighbor model, the Hamiltonian is [...] where $e$ is a fixed unit vector (often taken to be $[1, 0, \ldots, 0]$, except for the tradition to take [...] in case $N = 3$). We define [...].

The case $N = 2$ is often called the plane rotor, and $N = 3$ the classical Heisenberg model; it is a classical analog of the (quantum) Heisenberg model described below (and also a classical limit; see Section II.10). As already emphasized, these models have a continuous symmetry if $N \ge 2$, namely, if $h = 0$, $H$ is invariant under simultaneous rotations. For this reason, they are better models of magnets than the Ising model.

A new phenomenon occurs. Let us describe it for plane rotors and first if $\nu = 1$. In one dimension, we saw that for the plus boundary condition model, the lowest energy configurations with $\sigma_0 = -1$ have energy $2J$ relative to the ground configuration. For plane rotors, one can do better: write [...] so plus boundary conditions mean we set [...] outside $\Lambda_\ell$. Relative to the ground configuration (all [...]), the lowest energy configuration in one dimension with [...] has [...] and energy [...] and in particular, this goes to zero as $\ell$ goes to infinity. If we try similar configurations in $\nu$ dimensions, that is, [...], then the number of nonparallel pairs is [...] and the energy cost per pair [...]. Thus, there are configurations in [...] with [...] and energy [...]; this is bounded as [...] (actually, one can arrange to choose configurations with energy going to zero as [...]; see Section III.7). This suggests that for $N$-vector models, energy cannot overwhelm entropy if $\nu = 2$. The moral is that continuous symmetry breaking does not occur if $\nu = 1, 2$ and occurs if $\nu \ge 3$.

Of course, the above construction should cause one to pause: Perhaps we just weren't clever enough. However, if [...], there is a positive lower bound on the energy of plus boundary conditions configurations with [...], and for any [...] and [...] there is an integer, $N$, so that all configurations with energy smaller than [...] have at most $N$ spins with [...]. The absence of continuous symmetry breaking if $\nu = 2$ will be discussed in Section III.7 and its occurrence if $\nu \ge 3$ in chapter VIII.

There is, however, a special phenomenon for plane rotors if $\nu = 2$. When $\nu = 1$, we will see that [...] decays exponentially as [...] at all temperatures (see Section II.5). For $\nu = 2$ and $N = 2$, there is a phenomenon discovered by Kosterlitz-Thouless [1973] and Berezinskii [1971] and rigorously proven by Fröhlich-Spencer [1981a] that at large $\beta$, [...] only decays as a negative power of [...]. We discuss this result in chapter VII. It is believed, but not proven, that if $\nu = 2$, $N \ge 3$, the decay is exponential at all temperatures.
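The claim that a plane rotor chain can reverse the spin at the origin at an energy cost that vanishes as $\ell \to \infty$ can be illustrated with an explicit slowly rotating configuration. This is a minimal sketch under assumptions of its own: spins are written as angles $\theta_\alpha$ with nearest-neighbor energy $-J\cos(\theta_\alpha - \theta_\gamma)$, and the linear profile $\theta_\alpha = \pi(1 - |\alpha|/(\ell + 1))$ is one convenient choice, not necessarily the configuration used in the text.

```python
import math

def twist_energy(ell, J=1.0):
    """Energy above the ground state for a 1D plane-rotor chain with plus boundary conditions.

    Sites alpha = -ell..ell carry angles theta_alpha = pi * (1 - |alpha| / (ell + 1)),
    so theta_0 = pi and the frozen spins at alpha = +-(ell + 1) have theta = 0.
    Each bond costs J * (1 - cos(delta theta)) relative to the aligned state.
    """
    step = math.pi / (ell + 1)   # angle change across every bond
    bonds = 2 * (ell + 1)        # bonds between sites -(ell + 1) and ell + 1
    return J * bonds * (1.0 - math.cos(step))
```

For large `ell` this behaves like $J\pi^2/(\ell + 1)$, so the cost tends to zero. Reversing a block of Ising spins, by contrast, costs an energy that does not shrink with the size of the box.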
Finally, we should briefly describe the simplest quantum model. The spin-^ model involves the 2 2 Pauli matrices

(1.1.11)

obey the commutation relations of the generators of SO(3). Given A, one takes independent spin-y spins

(to do this properly, one needs the notion

of tensor product of vector spaces; see Section II. 1). They act on Now we form

14

and define finite-volume states as follows: Observables are operators

CHAPTER I

and

This is called the (spin-4 quantum) Heisenberg model. The spin S model lets be generators of the irreducible 2 5 + 1 -dimensional representation of SO(3). Otherwise, the formalism is identical These models have a continuous symmetry and the expected behavior is the same as in the classical case: No symmetry breaking occurs if v = 2 (see Section IV.7) and symmetry breaking is expected if . However, there is still no proof of this for the ferromagnet . There is a proof for the antiferromagnet (see Dyson-Lieb-Simon [1976], [1978], or chapter VIII). There are two connections between these quantum models and the classical models we have already discussed. Consider the sequence of spin S quantum models where is replaced by Since the basic operators should "commute in the limit," that is, the limit should be classical. Since , the limit should be a classical Heisenberg model. To prove convergence of the finite-volume partition functions as directly is easy (see, e.g., Millard-Leff [1971]). Sufficient control to allow one to interchange the and limits and get convergence of pressures as was obtained by Lieb [1973a] and is described in Section 11.10. A second connec*: i is as follows: One can consider anisotropic quantum (or for that matter, classical) Heisenberg models where is replaced by The extreme anisotropy case where is equivalent to the Ising model with spin S quantum systems corresponding to spin S Ising systems (explaining the name). This is because is spanned by vectors indexed by c o n f i g u r a t i o n s w i t h t a k i n g values S so that

Thus, if g is an arbitrary operator function of the commuting operators, we have

and we see the identity of the extreme anisotropic quantum model and the Ising model.
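The two facts used above can be spot-checked on small matrices. The following is a minimal sketch, not taken from the text: it assumes the standard Pauli matrices with the normalization S = σ/2, and it uses a hypothetical two-site system with spin S = 1. It checks one commutation relation and then confirms that the z-z coupling is diagonal in the product basis, with diagonal entries equal to the products of the classical spin values.

```python
from itertools import product

# Standard Pauli matrices (assumed normalization; the text's display (1.1.11) is not reproduced here).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * v for v in row] for row in a]

def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def spin_z_values(two_s):
    """Eigenvalues S, S-1, ..., -S of the z-component for spin S = two_s / 2."""
    return [two_s / 2 - k for k in range(two_s + 1)]

def diag(values):
    """Diagonal matrix with the given entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def is_diagonal(m):
    """True if every off-diagonal entry vanishes."""
    return all(m[i][j] == 0 for i in range(len(m)) for j in range(len(m)) if i != j)

# [S_x, S_y] = i S_z with S = sigma / 2.
print(close(commutator(scale(0.5, SX), scale(0.5, SY)), scale(0.5j, SZ)))

# Two spin-1 sites: S_z (x) S_z is diagonal, with entries s_1 * s_2 over configurations.
sz = diag(spin_z_values(2))
two_site = kron(sz, sz)
ising_values = [a * b for a, b in product(spin_z_values(2), repeat=2)]
print(is_diagonal(two_site), [two_site[i][i] for i in range(9)] == ising_values)
```

Because the coupling is diagonal, any function of it (in particular the Gibbs factor) is diagonal too, so its trace is a sum over classical configurations.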


There is a general class of quantum models, where the "observables" at a site α are just all Hermitian operators on ℂ^d and Z_Λ is a trace on ℂ^{d_Λ} with d_Λ = d^{|Λ|}. This is described in Sections II.1 and IV.1.

Quantum systems have one important structure not shared by all classical systems, namely, a natural (Heisenberg picture) dynamics on the observables. After time t, an observable A goes into A(t) = e^{itH} A e^{-itH}. That this has a limit as Λ → ∞

Fig. 1.2.2. The hard square gas

Unoccupied sites we call monomers. We give an allowed configuration a weight, z^m, where 0 < z < 1 and m is the number of monomers. The partition function is Z_L(z), the sum of the weights. One can show that

p(z) = \lim_{L \to \infty} |\Lambda_L|^{-1} \ln Z_L(z)

exists, but there is no phase transition in z; that is, p is real analytic away from z = 0. This is a theorem of Heilmann-Lieb [1970], [1972] (see also Gruber-Kunz [1971]), whose papers can be consulted for additional references. An interesting model of a liquid crystal related to monomer-dimer models can be found in Heilmann-Lieb [1979].

(ii) Z_n and Potts Models

There have been a number of attempts to generalize the (spin-½) Ising model. The only such generalization with finite configuration space that we discussed in Section 1.1 was the spin S Ising model. The spin S model lacks an important feature of the spin-½ model: namely, in the spin-½ case, there is a symmetry group which, when restricted to a single site, acts transitively on the single spin space; that is, all possible values (at a single site) can be obtained from any fixed one by applying the symmetry. The spin S model doesn't have this property: σ_α = S − 1 is in no sense equivalent to the state σ_α = S. In 1952, Potts [1952] defined two models with a single-site spin space consisting of n points, but with a symmetry group large enough to act transitively. The one with symmetry group Z_n (= group of integers mod n) is usually called the Z_n model (or clock model), and the one with symmetry group S_n (= group of permutations of n objects) bears Potts's name. In a real sense, the spin S Ising model "generalizes" the spin-½ model in that its behavior is qualitatively like that of the spin-½ model. For n large, the Z_n and Potts models have qualitatively rather different features from the Ising models (and from each other).

The Z_n model or clock model has n states ω_α = 0, 1, ..., n − 1 at a site and a Hamiltonian

H_\Lambda = -J \sum_{\alpha,\gamma \in \Lambda,\ |\alpha - \gamma| = 1} \cos\left( \frac{2\pi}{n} (\omega_\alpha - \omega_\gamma) \right)

Usually, one describes this by an angle θ_α = (2π/n) ω_α on the circle or a two-vector σ_α = (cos(θ_α), sin(θ_α)) (so cos(θ_α − θ_γ) = σ_α · σ_γ). In some ways, for n large this model looks like a plane rotor. Indeed, for n very large, one expects that for β with βn^{-2} ≪ 1, the model will qualitatively look like a rotor model, while for βn^{-2} ≫ 1 like a discrete model (n^{-2} enters, since min_{θ_α ≠ 0} |cos(θ_α) − 1| ≈ 2π² n^{-2} for n large). The most dramatic aspect of this is that in two dimensions and n large, there are (at


least) three temperature ranges: For β small, correlations decay exponentially; for β intermediate, the decay is as a power; and for β very large, there is long-range order (see Frohlich-Spencer [1981a], [1981b] for proofs and Elitzur et al. [1982] for further discussion).

For n = 2, the Z_n model is an Ising model; for n = 3, it agrees with the n = 3 Potts model (after a constant shift of energy and redefinition of coupling constant). For n = 4, the variables act like independent spin-½ Ising spins, and since

the Hamiltonian decomposes into

independent Ising models. This realization is useful for comparing transition temperatures in the Ising and plane rotor models (see Aizenman-Simon [1980]). Z_n models have become especially popular because the Z_n lattice gauge models are believed to be intimately related to the SU(n) lattice gauge model; see Elitzur et al. [1982].

For the Potts model, let and take

with δ the Kronecker delta function. This is a model where neighboring sites like to be the same. One particularly interesting feature is that for n large, the transition is first order in β (see Kotecky-Shlosman [1982] and chapter VI).

(iii) Random Ising Model and Spin Glasses

There is a class of models where randomness is put into the couplings. The simplest examples are the random Ising models whose finite Hamiltonians are

(1.2.1)

where the h_α are random variables. Let E denote the expectation value over the h's. For the h's fixed, define

The "annealed pressure"

(1.2.2)

just corresponds to thinking of the h's as "spins," and if the h's have the distribution of independent identically distributed (i.i.d.) random variables, (1.2.1)-(1.2.2) fit into the general framework we consider. Indeed, if, as is typical,

(1.2.3)

with z_α independent Ising spins, then since

the λ dependence of p_A as β and h are varied is trivial, and the model has the same phase structure as an ordinary Ising model. Much more interesting is the "quenched pressure" (1.2.4) (the names "quenched" and "annealed" come from the theory of spin glasses, based on the fact that the two models are believed to describe materials made by the physical processes called by these names). The same method we use to prove the convergence of the pressure in normal models (see Section II.2) shows that the limit (1.2.4) exists if the h_α are i.i.d.'s, and indeed more can be shown by exploiting the law of large numbers. Namely, for almost all choices of h_α (with respect to their initial a priori distribution),

lim

exists and equals p_Q. Thus, p_Q describes the system for "typical" sets of couplings, while in terms of viewing the couplings as fixed, p_A gives high weighting to atypical configurations which dominate in the limit as Λ → ∞. If we take the distribution (1.2.3) for h_α with λ fixed, we get a quenched pressure, and, for example, (1.2.5) where

(the limit may need to be taken through subsequences) and the last equality in (1.2.5) holds for a.e. h. One can ask whether, for β large, there is long-range order. This turns out to be a hard question! There was once some disagreement among theoretical physicists in the


field about the minimum dimension for such long-range order. Recently, Bricmont and Kupiainen [1987] have proven that

For a spin glass, one makes the spin-spin couplings random; typically, one has a Hamiltonian

where the are random variables, typically Ising spins, and where the are fixed, and in cases of physical interest, often only have power law decay at infinity. One wants to know if

has a nonzero limit as i

Khanin-Sinai [1979] have proven that if the

are

independent and , then the fluctuations actually allow the quenched pressure to exist for J ' s decaying so slowly that the methods of Section II.2 do not apply (essentially Section II.2 requires and (Khanin-Sinai [1979]) only requires that

For additional rigorous literature on spin glasses, see Frohlich-Zegarlinski [1987]; Aizenman-Lebowitz-Ruelle [1987], [1988]; and see Parisi [1984] for physicists' reviews.
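The relation between the quenched and annealed pressures discussed in this subsection can be illustrated by brute-force enumeration on a tiny system. The sketch below makes several assumptions that are not fixed by the text as extracted: a ring of four sites, a Gibbs weight exp(β(J Σ σσ' + Σ h_α σ_α)), fields h_α = h + λ z_α with z_α = ±1 equally likely, and pressures normalized by the number of sites. Under those assumptions, Jensen's inequality gives p_Q ≤ p_A, and averaging over the fields site by site gives p_A = ln cosh(βλ) plus the pressure of the ordinary Ising model in the uniform field h.

```python
import math
from itertools import product

def partition_function(fields, beta, coupling):
    """Sum of exp(beta * (J * sum of neighbor products + sum of h_i * s_i)) over a ring."""
    n = len(fields)
    total = 0.0
    for spins in product((-1, 1), repeat=n):
        bond = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        field = sum(h * s for h, s in zip(fields, spins))
        total += math.exp(beta * (coupling * bond + field))
    return total

def pressures(n, beta, coupling, h, lam):
    """Return (quenched, annealed) pressures, averaging exactly over all 2**n sign choices."""
    log_sum = 0.0
    z_sum = 0.0
    count = 0
    for signs in product((-1, 1), repeat=n):
        fields = [h + lam * z for z in signs]
        z_value = partition_function(fields, beta, coupling)
        z_sum += z_value
        log_sum += math.log(z_value)
        count += 1
    p_quenched = log_sum / count / n          # (1/n) E[ln Z]
    p_annealed = math.log(z_sum / count) / n  # (1/n) ln E[Z]
    return p_quenched, p_annealed

p_q, p_a = pressures(n=4, beta=1.0, coupling=1.0, h=0.2, lam=0.7)
# Annealed pressure predicted by averaging over the fields site by site.
p_ising = math.log(partition_function([0.2] * 4, 1.0, 1.0)) / 4
print(p_q, p_a, math.log(math.cosh(0.7)) + p_ising)
```

The annealed value differs from the ordinary Ising pressure only by the additive term ln cosh(βλ), which is why the annealed model carries no new phase structure, while the quenched value is strictly smaller here.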

(iv) Unbounded Spin Models

There is a class of models where the single spin space, Ω, is ℝ with an a priori measure, which is supported on all of ℝ or at least on an unbounded set. Thus, is not compact. As a result, even if H_Λ is continuous on , it may not be bounded. The point is not that the spins are unbounded but that the interactions are unbounded. A typical example of interest is to take a priori measure, and Hamiltonian, H

As we will see in (v), such models are related to quantum field theory and their study has been advocated to help understand There is no question that the intuition built up in lattice systems is essential in understanding but I have seen no evidence that the tools produced to study the special technicalities of unbounded spin models are of use in Of course, the models do have some


mathematical interest of their own. Ruelle [1970] identified some important conditions and estimates for studying unbounded spins going under the rubric "superstability." An extensive theory has developed starting with the applications of Ruelle's estimates by Lebowitz-Presutti [1976]; see Benfatto et al. [1978] and Presutti et al. [1976].

(v) Quantum Mechanics and Quantum Field Theory

What we will discuss here is not really a lattice system, but we would be remiss if we didn't indicate to the reader the close connection between spin systems and (imaginary time) quantum theory. We will only scratch the surface here; for further discussion, see Simon [1974], [1979b], and Glimm-Jaffe [1981]. We describe the connection first in a simple example. Let H be the Hamiltonian, eigenvector, spec(H).

where is the harmonic oscillator Hamiltonian. There is an of H with . Automatically, since inf be corresponding objects for

Then (1.2.6) and, for example, (1.2.7) These formulas look a little like statistical mechanics: (1.2.6) like a formula for a free energy and (1.2.7) like a correlation function. The analogy is made extremely close if we use the Trotter product formula to write where and use the explicit formula for Then (1.2.6) becomes:

is clearly the partition function of a one-dimensional unbounded spin system with a priori measure (except at the edges) and ferromagnetic nearest-neighbor pair coupling. By going to high dimensions one gets analogs for quantum field theory (but with replaced by this is called


Euclidean (quantum) field theory since Lorentz invariance translates into Euclidean invariance after the t → it replacement).

There is a connection between quantum field theory and statistical mechanics. First, there is an analogy on the level of ideas: Phase transitions and spontaneous symmetry breaking in statistical mechanics have, as their analog, spontaneous symmetry breaking of a different sort; this plays a fundamental role in current theories of elementary particles. Second, there is an analogy on the level of mathematical techniques: Correlation inequalities, introduced in qft by Guerra et al. [1975a], [1975b], and high temperature expansions, introduced in qft by Glimm et al. [1973], are the two standard tools; both are borrowed from statistical mechanics.

(vi) Lattice Gauge Theories

Lattice approximations to ordinary Euclidean quantum field theory are technically quite important, but one can claim that they play no fundamental role since the theories, at least some of the time, can be rigorously defined by perturbation about a free continuum theory. This is really no longer true for gauge quantum theories which are intrinsically nonlinear. It would be out of place to discuss continuum gauge theories here, but it does seem appropriate to describe the lattice gauge theory of Wilson [1974].

Let G be a compact group with Haar measure, dμ, and let χ be a real character. Let ℤ^ν denote the hypercubic lattice in ν dimensions. The "sites" of the model are bonds, b, on the lattice, that is, pairs α, γ ∈ ℤ^ν with |α − γ| = 1. Every bond is given, once and for all, a preferred direction, and the configuration space in Λ is G^B, that is, the assignment to each bond of an element of G. A plaquette, P, is a set of four bonds, b_1, b_2, b_3, b_4, successively forming a square (they must therefore lie in a plane). Given a plaquette, P, we traverse it in some direction and let σ_1, σ_2, σ_3, σ_4 each be ±1, denoting the ratio of the preassigned directions on b_i and those inherited from traversing P. Let

where g^{-1} is the inverse of g. Because χ is real, χ_P has a value independent of the direction in which we traverse P, and since χ(gh) = χ(hg), χ_P has a value independent of which site we begin the loop with. Then, the basic Hamiltonian is

the sum being over each plaquette counted once. The g in front is a constant not to be confused with the group elements. Notice that H has an enormous symmetry group. For, if h ∈ G^Λ is the assignment of an element of G to each α ∈ Λ, and if


directed from γ to then has the property that This local gauge symmetry is never broken, but plays an important role in the physics. While the theory fits into the framework of chapter II, the basic questions one asks are rather different from the questions one asks in statistical mechanics. A fundamental role is played by the Wilson loop, defined as follows. Let be an enormous L × L square planar loop. Let be a labeling of bonds in the loop, and the relative signs. Set

It is believed that when the parameters of the theory are such that quarks are not confined, then for L large, but that if quarks are confined, It is known rigorously (Simon-Yaffe [1982]; Seiler [1980]) that

for c, d and all large L. For additional remarks about the physics and mathematical physics of gauge models, see Pokorski [1987] and Seiler [1982].

(vii) The Hubbard Model

We want to briefly describe a quantum lattice model of interest to the understanding of ferromagnetism. The issue is to understand how the Pauli principle, together with simple electron interactions, produces the spin aligning interaction which is taken as input for the Ising model. Models of this form were introduced by Gutzwiller [1963], Hubbard [1963], and Kemeny [1965], and are usually called the Hubbard model. We will assume, in giving our description, that the reader has some experience with fermion algebra. Given a set A in the lattice, we imagine fermions, labeled by and that is, we have matrices on a complex space of dimension determined (uniquely up to unitary equivalence)

The Hubbard model Hamiltonian depends on the parameters T, U


The first term (which counts each pair as both αβ and βα) describes hopping, that is, it preserves the number of spin up and spin down fermions

N_\sigma^{(\Lambda)} = \sum_{\alpha \in \Lambda} c^*_{\alpha\sigma} c_{\alpha\sigma}

and if ψ describes an explicit configuration of fermions, c*_{ασ} c_{βσ} ψ is zero if either there is a fermion of type σ at site α or no fermion of type σ at site β, and it is ψ', the configuration with a fermion moved from β to α, if the above conditions fail to hold. The second term indicates that fermions of opposite spins do not like to be on the same site. The N_σ^{(Λ)} commute with H_Λ, so one is interested in H_Λ restricted to the space of vectors ψ with N_+^{(Λ)} ψ = n_Λ ψ, N_-^{(Λ)} ψ = m_Λ ψ. Typical questions concern the behavior of the ground state (minimum energy state of H_Λ) as Λ → ∞ with n_Λ/|Λ| → n_∞, m_Λ/|Λ| → m_∞. One is interested in whether ψ gives higher weight to neighboring spins being parallel or antiparallel. Since H has built into it a repulsion (if U > 0) and fermions, it can be regarded as a schematic model of electrons with Coulomb repulsion. The study of the nature of the ground state is a caricature useful as a test of the validity of Heisenberg's idea that ferromagnetism can come from a combination of electron repulsion and the Pauli principle.

This model is farthest, of the models mentioned so far, from the theme of this book. It does not even fit into the framework of quantum lattice systems we present since operators at distinct sites do not commute (they anticommute). We mention it as a lattice model where the thermodynamic limit is relevant, and because it is important to realize that the Ising model only addresses one-half of the issue of "explaining" ferromagnetism: namely, going from local spin aligning interactions to bulk magnetism. It does not address the question of a microscopic justification of local spin alignment. For references on the Hubbard model and rigorous results in one dimension, see Lieb-Wu [1972].

(viii) Percolation

The next model appears even farther from statistical mechanics than many we have discussed, but we shall see that the analogy is surprisingly close, and that cross-fertilization has been significant. We will describe bond percolation in detail. We consider a simple cubic lattice ℤ^ν and pick p ∈ [0, 1]. A configuration of the system is the assignment of a number x_b, 0 or 1 ("open" or "closed"), to each nearest-neighbor pair b = (α, β). We assign x_b as independent random variables with distribution Prob(x_b = 1) = p, Prob(x_b = 0) = (1 − p). We say that two sites, α, β lie in the same cluster if there is a path from α to β going through closed bonds, that is, bonds with x_b = 1. Let C_α denote the cluster containing α. One defines the "two-point function" A(α, β; p) by

A(α, β; p) = Prob(β ∈ C_α)


and one looks at the probability distribution for Explicitly for

number of sites in

One says that percolation occurs if . The model was introduced by Broadbent [1954] as a model of the spread of fluid through a random porous medium. It is clear why we say that percolation occurs if . It is intuitively clear and not too hard to prove that is monotone nondecreasing in p. It is a remarkable discovery of Broadbent and Hammersley [1957] and Hammersley [1957] that is strictly between 0 and 1. Let us indicate a proof of this fact for ν = 2; the upper bound is related to the Peierls argument (see chapter VI).

Theorem 1.2.1. If

Proof: If must contain with |α| arbitrarily large, and so there must be a simple path of closed bonds starting at 0 of arbitrary length (a path is simple if no bonds are repeated). Thus, for any ℓ,

Prob(There is a simple path of closed bonds of length ℓ from 0 to some

Prob(γ has only closed bonds)

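The number of such paths is certainly bounded by the number of all walks of length ℓ starting at 0 that don't backtrack from one step to the next. This number is clearly 4·3^{ℓ−1}. Moreover, the probability that any given γ has only closed bonds is clearly p^ℓ. Thus, for any ℓ, the probability that the cluster of 0 is infinite is at most 4·3^{ℓ−1} p^ℓ, which tends to zero as ℓ → ∞ whenever p < 1/3; that is, percolation does not occur for p < 1/3.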
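The counting step in this half of the proof can be checked by direct enumeration on the square lattice. The sketch below is illustrative only (the function names are hypothetical): it counts non-backtracking walks of a given length from the origin, counts the paths that never reuse a bond, and evaluates the resulting bound 4·3^{ℓ−1} p^ℓ for a sample p below 1/3.

```python
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def count_non_backtracking(length):
    """Count walks of `length` steps from the origin that never reverse the previous step."""
    def extend(prev, remaining):
        if remaining == 0:
            return 1
        total = 0
        for step in STEPS:
            if prev is not None and step == (-prev[0], -prev[1]):
                continue  # immediate reversal is forbidden
            total += extend(step, remaining - 1)
        return total
    return extend(None, length)

def count_simple_paths(length):
    """Count paths of `length` steps from the origin in which no bond is used twice."""
    def extend(site, used, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in STEPS:
            nxt = (site[0] + dx, site[1] + dy)
            bond = frozenset((site, nxt))
            if bond in used:
                continue
            used.add(bond)
            total += extend(nxt, used, remaining - 1)
            used.remove(bond)
        return total
    return extend((0, 0), set(), length)

def path_bound(p, length):
    """Upper bound 4 * 3**(length - 1) * p**length on the probability of a closed simple path."""
    return 4 * 3 ** (length - 1) * p ** length

for ell in range(1, 7):
    print(ell, count_simple_paths(ell), count_non_backtracking(ell), 4 * 3 ** (ell - 1))

print([path_bound(0.3, ell) for ell in (10, 50, 100)])
```

A path that never reuses a bond cannot reverse its previous step, so the first count never exceeds the second; the two first differ at length 5, where a walk can go around a unit square and retrace its first bond.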

For the other inequality, we introduce the important notion of the dual problem. Through every open bond, draw a line of length 1 perpendicular to the bond, centered at the midpoint of the bond (see fig. 1.2.3, where It is clear that there is always a simple closed curve (contour) surrounding (in fig. 1.2.3, this contour has length 14). Thus, we have Prob(There is no contour surrounding 0). Actually, one can do better. Namely, we can pick an m > 0 and imagine closing all bonds in block, containing (there are such bonds) and then looking at contours surrounding Thus

Prob(There is no contour surrounding

Fig. 1.2.3. A dual contour

We claim the number of contours of length ℓ surrounding is bounded by since we can imagine the contour as a closed curve starting at some point within ℓ of 0 ( is a generous upper bound on that) and simple paths of length ℓ starting at any point number fewer than as we saw above. Since any contour surrounding must have length at least 4m, we see that

Prob(There is no contour surrounding

So long as

we can arrange for this number to be strictly positive by

choosing m large. Thus,

that is,

In fact, Kesten [1982] has proven that Thus, a simple object like is nonanalytic in p and the percolation probability, pH, is like a critical temperature in a statistical mechanical model. One can push the analogy further by looking at the rate of divergence of various quantities, for example, . Another object of interest is

the expected cluster size. Notice that if

, then


so S looks like a susceptibility. This analogy is made clearer by an idea of Essam-Sykes [1964]. We define

F(p, h) = \sum_{n=1}^{\infty} e^{-hn} n^{-1} N(p, n).

This turns out to be an analog of the "pressure" in the Ising model. It is real analytic in h and p in the region h > 0 and

1 - N(p, \infty) = \lim_{h \downarrow 0} \left( -\frac{\partial F}{\partial h}(p, h) \right)

and

S(p) = \lim_{h \downarrow 0} \frac{\partial^2 F}{\partial h^2}(p, h)

so 1 − N(p, ∞) behaves like a magnetization and S like a susceptibility.

The cross-fertilization of percolation and lattice gases is illustrated by the issue of correlation inequalities. The first "correlation inequality" in the modern sense was actually proven by Harris in 1960 in the context of percolation (Harris [1960]). Griffiths's work seven years later [1967a], [1967b] on correlation inequalities in Ising ferromagnets was done without knowledge of Harris's inequality, and caused within a short period an explosion of new inequalities and results. In particular, Fortuin, Kasteleyn, and Ginibre discovered the FKG inequalities [1971] trying to understand how Harris's inequalities might have an analog in lattice gases. More recently, correlation inequalities in lattice gases have been used to suggest analogous inequalities in percolation theory (see, e.g., Aizenman-Newman [1984]). Another example of cross-fertilization: Russo's geometric ideas in percolation [1978] were one source of motivation in Aizenman's analysis of equilibrium states in the two-dimensional Ising model (Aizenman [1980]) (see chapter VI).

Another percolation model concerns directed percolation: One only allows paths from 0 to γ which have Σ_i α_i increase in each step. There are also site percolation models and mixed bond-site percolation where, with some probability, one places barriers at sites.

For rigorous reviews of percolation, see Kesten [1982] and Durrett [1981], and see Essam [1972] and Frisch-Hammersley [1963] for reviews of the physics literature.

(ix) Stochastic Spin-flip Systems

We want to describe the simplest examples of a set of dynamic stochastic processes. For generalizations of the model, extensive discussion, and references, see Durrett [1981], Griffeath [1977], Liggett [1977], and Spitzer [1975]. The set of configurations of our system is still {−1, 1}^{ℤ^ν}, that is, the assignment of a value σ_α to each α ∈ ℤ^ν, and states are just measures on this set. But now states move in a time-dependent way, that is, there is a dynamics is the set of probability measures on (see Section III.1). One is given a basic function from obeying: (a) Translation invariance, that is, if then for any (b) Finite range, that is, only depends on finitely many is supposed to describe the rate at which spins flip at site α if the system is in state σ; that is, we imagine as time-dependent random variables and we want

Explicitly, one defines an operator L on , the continuous functions depending on only finitely many variables in the notation of Section II. 1) by

where

has the spin at site α flipped). Liggett [1977] has shown that the closure of this operator generates a semigroup on and then is defined by

Here are simple examples of c's:

(a) Stochastic Ising model:

defines a dynamical model introduced initially by Glauber [1963].

(b) General exponential models: Let be an interaction in the sense of Section II.1 which is finite range. Define

(c) Contact processes: Fix a finite subset and Let

This is a public health model! We imagine sites as occupied by individuals who are susceptible to a disease. means that individual α is infected (resp. healthy). Individuals who are infected recover at rate 1, while healthy

individuals are infected at a rate depending on the number of nearby infected individuals. These processes are reviewed in Griffeath [1981].

One is interested in the number and properties of invariant measures, that is, μ with T_t μ = μ, and the question of when a given measure, v, approaches some invariant measure as t → ∞ (i.e., does T_t v have a limit?). One interesting feature of models (a) and (b) is that any Gibbs state (see Section III.2) for the interaction Φ is an invariant state for T_t. The converse is only known if ν = 1, 2 (see Griffeath [1977]), but may well be true in general.
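The invariance of Gibbs states under the spin-flip dynamics can be checked exactly on a small periodic chain. The sketch below rests on assumptions that the extracted text does not fix: the flip rate is taken to be c(α, σ) = exp(−β σ_α Σ_γ σ_γ), summed over the two neighbors of α (one common choice for a stochastic Ising model, used here as a hypothetical stand-in for the lost formula), and the Gibbs measure is that of the nearest-neighbor Ising ring with unit coupling. The code computes the net probability flow into each configuration under the generator and reports the largest deviation from zero.

```python
import math
from itertools import product

def flip(spins, i):
    """Return the configuration with the spin at site i reversed."""
    return spins[:i] + (-spins[i],) + spins[i + 1:]

def rate(spins, i, beta):
    """Assumed flip rate exp(-beta * s_i * (s_{i-1} + s_{i+1})) on a ring."""
    n = len(spins)
    local = spins[(i - 1) % n] + spins[(i + 1) % n]
    return math.exp(-beta * spins[i] * local)

def gibbs_weights(n, beta):
    """Normalized Gibbs measure of the ferromagnetic Ising ring with unit coupling."""
    weights = {}
    for spins in product((-1, 1), repeat=n):
        energy = -sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        weights[spins] = math.exp(-beta * energy)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def invariance_defect(measure, beta):
    """Largest |net flow| into any configuration; zero means the measure is invariant."""
    worst = 0.0
    for spins in measure:
        n = len(spins)
        net = sum(
            measure[flip(spins, i)] * rate(flip(spins, i), i, beta)
            - measure[spins] * rate(spins, i, beta)
            for i in range(n)
        )
        worst = max(worst, abs(net))
    return worst

mu = gibbs_weights(4, 0.8)
uniform = {s: 1.0 / len(mu) for s in mu}
print(invariance_defect(mu, 0.8), invariance_defect(uniform, 0.8))
```

With this rate, the product μ(σ) c(α, σ) does not depend on σ_α, so every single-site flip is balanced in detail; the uniform measure, by contrast, is not invariant at β > 0.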

1.3 Convexity Inequalities: Abelian Case

Convexity plays a major role in statistical physics. It was Gibbs who first realized the importance of convexity ideas in thermodynamics and statistical mechanics (see Wightman [1979] for a history of convexity in thermal physics). In this section and the next, we expose many of the basic mathematical ideas. In particular, we will prove two fundamental inequalities on sums or integrals: Jensen's inequality, which is a generalization of the familiar geometric-arithmetic mean inequality [(ab)^{1/2} ≤ ½(a + b)], and Holder's inequality, which is a generalization of the Schwarz inequality.

Definition: A set C in a vector space V is called convex if and only if for all x, y ∈ C and 0 ≤ θ ≤ 1, θx + (1 − θ)y ∈ C. A real valued function F on a vector space V is convex if and only if F(θx + (1 − θ)y) ≤ θF(x) + (1 − θ)F(y) for all x, y ∈ V and 0 ≤ θ ≤ 1; equivalently, {(x, λ) | λ > F(x)} is a convex set.

Proposition 1.3.1: (a) A C² function F on ℝ is convex if and only if F''(x) ≥ 0 for all x. (b) A C² function F on ℝ^n is convex if and only if the n × n matrix

A_{ij}(x) = \frac{\partial^2 F}{\partial x_i \partial x_j}(x)

is positive definite for all x. (c) A convex function on ℝ^n is bounded on balls. (d) A convex function F on a normed linear space which is bounded on balls is continuous; indeed, for y fixed, |F(x) − F(y)|

2. Then, for any integral p

by Thm. 1.4.9. The Lie product formula completes the proof in this case. Given the result for suppose and use

since

if XY is self-adjoint.

Corollary 1.4.13 (The Golden-Thompson inequality): For A, B self-adjoint matrices,

\mathrm{Tr}(e^{A+B}) \le \mathrm{Tr}(e^{A} e^{B}).

Proof:

As two examples of the applicability of these inequalities, we have:

Theorem 1.4.14: Define the function F on n × n self-adjoint matrices by F(A) = ln Tr(e^A). Then F is convex. Proof:

(by Golden-Thompson)

(by Holder's inequality)
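Both ingredients can be spot-checked numerically. The sketch below assumes the standard statement of the Golden-Thompson inequality, Tr(e^{A+B}) ≤ Tr(e^A e^B), and uses a truncated Taylor series for the matrix exponential, which is adequate only for small matrices of moderate norm. The two real symmetric matrices are arbitrary illustrative choices; the code compares the two traces and tests midpoint convexity of F(A) = ln Tr(e^A).

```python
import math

def mat_mul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(x, y):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, y)]

def mat_scale(c, x):
    """Multiply every entry by the scalar c."""
    return [[c * v for v in row] for row in x]

def mat_exp(x, terms=60):
    """Matrix exponential via the Taylor series sum of x**k / k!, truncated after `terms` terms."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, x))
        result = mat_add(result, term)
    return result

def trace(x):
    """Sum of the diagonal entries."""
    return sum(x[i][i] for i in range(len(x)))

def log_trace_exp(x):
    """F(A) = ln Tr exp(A)."""
    return math.log(trace(mat_exp(x)))

# Two non-commuting real symmetric (hence self-adjoint) matrices, chosen arbitrarily.
A = [[1.0, 0.5], [0.5, -1.0]]
B = [[0.0, 1.0], [1.0, 0.3]]

lhs = trace(mat_exp(mat_add(A, B)))
rhs = trace(mat_mul(mat_exp(A), mat_exp(B)))
print(lhs, rhs, lhs <= rhs)

midpoint = log_trace_exp(mat_scale(0.5, mat_add(A, B)))
average = 0.5 * (log_trace_exp(A) + log_trace_exp(B))
print(midpoint, average, midpoint <= average)
```

A single pair of matrices proves nothing, of course; the point is only to make the direction of each inequality concrete.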

Example: Fix a finite subset

and some function

on

Let

be the Ising model partition function, and let

(where τ are Pauli matrices) be the Heisenberg model partition function. Then

The upper bound comes from the convexity result Thm. 1.4.14 writing noticing that the trace, with replaced by is just the Ising model Z. The lower bound comes from (1.4.4) by choosing with a basis

\tau_{\alpha z} |s_\alpha\rangle = s_\alpha |s_\alpha\rangle

and noting that

\langle s | \tau_\gamma \cdot \tau_\lambda | s \rangle = s_\gamma s_\lambda

for γ ≠ λ.

1.5 Linear Functionals on Infinite-Dimensional Spaces

One of the pleasant features of the study of lattice gases as opposed to other areas of mathematical physics, such as quantum field theory, continuum statistical mechanics, or Schrodinger operators, is that the discrete lattice structure has a tendency of preventing various analytic technicalities. Indeed, many of the basic devices can easily be explained to a bright undergraduate without the necessity of hiding anything. Nevertheless, every once in a while, some sophisticated notion concerning infinite-dimensional functional analysis must be pulled out of the hat as the coup de grace in some otherwise elementary argument. In this section, we want to summarize some of the necessary theory. This is not intended as a replacement for a serious textbook presentation, and we will give many textbook references along the way. In particular, we will use words like "topology" carefully, but we will not give the definitions here. The reader who feels uncomfortable, because of acrophobia, is urged to defer or even completely avoid this section.

We begin by describing some notions concerning dual pairs of vector spaces and associated linear functional notions, and apply these ideas to the question of tangents to convex functions. Finally, we discuss briefly the basics of Choquet theory. Our vector spaces will normally be over the real numbers.

Definition: A dual pair (E, F) is a vector space, E, and set F of linear functionals on E, which is a vector subspace of the set of all linear functionals on E, and so that, for every nonzero x ∈ E, there is an ℓ ∈ F with ℓ(x) ≠ 0.

If E is finite dimensional, the final requirement forces F to be all linear functionals, but this is not true once E is infinite dimensional.

Notice if x ∈ E, we can define a linear functional u_x on F by u_x(ℓ) = ℓ(x). The final requirement on dual pair implies that the map x → u_x is one-one, so that we can think of E as a set of functions on F, and with this association, (F, E) is automatically a dual pair also. This explains the use of the term "dual." Many books use ⟨ℓ, x⟩ in place of ℓ(x) to indicate the fact that E and F should be viewed on an equal footing. For the basic notions of topological spaces, see Reed-Simon [1972] and Choquet [1969].

Definition: Given a dual pair (E, F), the weak or σ(E, F) topology on E is that topology in which a net x_α in E converges to x in E if and only if ℓ(x_α) → ℓ(x) for every ℓ in F.

Clearly, by definition, every ℓ ∈ F is a continuous function on E when E is given the σ(E, F) topology. The converse is also true; see, for example, Reed-Simon [1972], Thm. IV.2.

Theorem 1.5.1: If ℓ is a σ(E, F)-continuous linear functional on E, then ℓ ∈ F.

Example 1: Let X be a compact Hausdorff space, let C(X) denote the continuous functions on X, and M(X) the Baire measures on X (see Reed-Simon [1972], pp. 105 ff). (C(X), M(X)) with

is a dual pair. For sequences (but not nets), weak convergence on C(X) is easy to describe: f_n → f weakly if and only if f_n(x) → f(x) for each x and sup_n ‖f_n‖_∞ < ∞. The σ(M(X), C(X)) topology on M(X) is often called vague convergence.

Example 2: If and are both dual pairs, so is (with and the weak topology on is just the product topology. We will use this fact frequently in the case because, given a function g (not necessarily linear) on E, we will often want to consider the related sets in given by and

In infinite-dimensional analysis, a major role is played by a generalization of Lemma 1.3.8; see Reed-Simon [1972], p. 130 for a proof:

Theorem 1.5.2 (Separating hyperplane theorem): Let (E, F) be a dual pair, and let C be a convex subset of E and x a point with x ∉ C. Then
(1) There exists a nonzero linear functional ℓ (not necessarily in F) with (1.5.1)
(2) If C is open in the weak topology, then the ℓ of (1) may be chosen in F.
(3) If C is closed in the weak topology, then the ℓ of (1) may be chosen to be in F, and so that the inequality in (1.5.1) is strict.

One application of this result is to extend Prop. I.3.6(iii) to infinite dimensions.

Definition: Let (E, F) be a dual pair, and f : E → ℝ a convex function. ℓ ∈ F is called tangent to f at x ∈ E if and only if

(1.5.2)

for all y ∈ E. From the form of (1.5.2), the following is obvious:


Proposition 1.5.3: Fix f convex and x ∈ E. The set of ℓ ∈ F which are tangent to f at x is a convex set which is closed in the σ(F, E) topology.

Theorem 1.5.4: Let (E, F) be a dual pair. Let f be a weakly continuous convex function from E to ℝ. Then, for every y ∈ E, there exists an ℓ ∈ F tangent to f at y.

Proof: and

so that either

is an open convex set, so by Thm. 1.5.2, there exist and

F

(1.5.3) Since

for all

can go to and Then (1.5.3) becomes:

since, for any

inf

Let

E, that is, is tangent to / at y.

Remark: Full continuity of f was not used. All that was needed here is that is open. If E is a normed linear space, F = its norm dual, and if f is only norm continuous (a weaker hypothesis than weak continuity), then is a norm open convex set. Such sets are automatically weakly open, so weak continuity can be replaced by norm continuity in the last theorem.

This theorem deals with existence. As for uniqueness, we have the following theorem of Mazur [1933]; see Dunford-Schwartz [1958] for a proof:

Theorem 1.5.5: Let F be a norm continuous convex function on a separable Banach space, X. Then {x | there is a unique tangent functional at x} is a dense G_δ in X.

Dense G_δ's are a good candidate for "generic sets"; see Reed-Simon [1972]. Of course, there can be many, even dense (but not dense G_δ), sets of nonunique tangents, as can be seen already in the case X = ℝ. (For example, let be a counting of the rationals, and let

The convex function

has nonunique tangents at every rational.) Finally, as regards tangent functionals, one can ask which functionals, ℓ, enter as tangent functionals to F at some point.

Definition: Let (E, F) be a dual pair. Let P be a convex function on E. We say that ℓ ∈ F is P-bounded if and only if

Theorem 1.5.6 (Bishop-Phelps [1963]): Let P be a (norm) continuous convex function on a Banach space, X. Then in the norm on X*, the tangent functionals to P are dense in the P-bounded functionals.

For a proof including additional information, see Israel [1979]. The proof is an application of the separating hyperplane theorem.

As the other subject in this section, we want to provide, mainly without proofs, an introduction to the basic principles of Choquet theory. For more details, see Choquet [1969], Lanford [1971], and Phelps [1966].

Definition: Let C be a convex set. x ∈ C is called an extreme point of C if and only if y, z ∈ C, 0 < θ < 1, and x = θy + (1 − θ)z imply that y = z = x. ℰ(C) = {extreme points of C}.

Thus, extreme points are points which are not interior points of line segments in C. If C is the disc in R², then ℰ(C) is its boundary; but for convex polygons, ℰ(C) is the set of corners.

Definitions: Let (E, F) be a dual pair. Let A ⊂ E be an arbitrary set. The closed convex hull of A, written cch(A), is the smallest σ(E, F)-closed convex set containing A; equivalently, it is the closure of

Theorem 1.5.7 (Krein-Milman theorem): Let (E, F) be a dual pair, and let C be a compact (in the σ(E, F) topology) convex subset of E. Then

In particular, 0,λ~ 1 j e A } U {0,0). Consider now compact convex subsets, A, of R². It seems likely (and is indeed true) that only when A is a triangle do points in A have a unique convex decomposition in terms of extreme points. Think of the three cones, c(A) in R³, obtained when A is a triangle, a square, and a circle. The triangular case is distinguished by the fact that only in that case is [x + c(A)] ∩ c(A) of the form y + c(A). This fact should be borne in mind when trying to understand the theorem and definitions below.

Definition: A convex cone, C, is a subset of a vector space, E, so that, for all x, y ∈ C and μ, λ ≥ 0, we have that μx + λy ∈ C. C is called a proper cone if C ∩ (−C) = {0}. C is said to generate E if and only if E = C − C. If C is a proper convex cone generating E, we introduce the C-order on E by x ≥ y if and only if x − y ∈ C.

Note that the C-order has the following properties: (i) x ≥ x; (ii) if x ≥ y and y ≥ x, then x = y; (iii) x ≥ y and λ ≥ 0 implies λx ≥ λy; (iv) x ≥ y and z ≥ w implies x + z ≥ y + w; (v) x ≥ y implies x + a ≥ y + a for any a; (vi) x = y − z with y ≥ 0 and z ≥ 0; (vii) x ≥ y and y ≥ z implies x ≥ z. Moreover, if an order obeys (i)-(vi) [(vii) is implied by (iv) and (v)], then the order is just the C-order for C = {x | x ≥ 0}, which is a proper convex cone generating E.

Definition: Let E be a vector space. Let A ⊂ E be a convex subset of E. Let c(A) ⊂ E × R be the cone with base A, and let Ẽ = c(A) − c(A), which is a subspace of E × R. We say that A is a simplex if and only if Ẽ with the c(A)-order is a lattice, that is, given any x, y ∈ Ẽ, there is a (necessarily unique) z ∈ Ẽ, so that z ≥ x, z ≥ y, and for any w with w ≥ x, w ≥ y, we have w ≥ z (i.e., x and y have a least upper bound [l.u.b.]). It is not hard to see that this is equivalent to the statement that, for any x, y ∈ c(A), there exists z ∈ c(A), so that [x + c(A)] ∩ [y + c(A)] = [z + c(A)].

The name comes from the fact that the only finite-dimensional sets of the above form are the usual simplices. The relevance of this definition to Choquet theory is the following (see Choquet [1969], Phelps [1966] and Lanford [1971] for proofs).

Theorem 1.5.9 (Choquet-Meyer theorem): Let (E, F) be a dual pair, and A ⊂ E a compact convex subset of E. Suppose that the σ(E, F) topology on A is given by a metric. Then, A has the property that every x ∈ A is the barycenter of a unique measure supported on ℰ(A) if and only if A is a simplex.

Example 3 (cont'd): Let A = (X). Then and The map sets up an isomorphism of and and under this isomorphism, c(A) is identical to The order on is the usual one, that is, if and only if for all positive f in C(X). We claim that is a lattice under this order. Indeed, by the invariance of the order under addition, we need only show that μ and 0 have a least upper bound for any measure μ. This follows from the Hahn decomposition theorem, which implies there is a Baire set A with positive and negative. For clearly, and is an upper bound for μ and 0, and if then

so

is the least upper bound.

Example 4 (cont'd): As an example, consider = the 2 × 2 matrices. Then we can associate with by (A) = Tr(ρA). Under the association, we see that If we write ρ in terms of the familiar Pauli σ matrices (see [1.1.11])

where corresponds to (Euclidean norm). Thus, is the unit ball in R³, which is certainly not a simplex. In fact, one can prove that is a simplex if and only if 𝔄 is abelian (see Bratteli-Robinson [1979], Example 4.2.6).

Example 5 (cont'd): It will be important for later applications to know when a subset of a simplex is a simplex. It is not automatic; for example, let A be the standard three-dimensional simplex (regular tetrahedron). A plane passed between opposite edges and parallel to each will intersect A in a rectangle which is not a simplex. The following two theorems are thus of interest:

Definition: A closed, convex subset, B, of a compact convex set, A, is called a face if and only if x, y ∈ A and θx + (1 − θ)y ∈ B for some 0 < θ < 1 implies that x and y lie in B.

For example, x ∈ A is an extreme point if and only if {x} is a face.
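The 2 × 2 matrix example above can be checked numerically. What follows is a minimal sketch, assuming the standard parametrization ρ = ½(1 + a · σ) with a ∈ R³; the helper names rho, eigenvalues, is_state, and mix are ours and do not come from the text. It checks that ρ is positive exactly when |a| ≤ 1 (on a few sample vectors), and that the center of the ball is a mixture of extreme points in two different ways, which is the failure of the simplex property.

```python
import math

# Pauli matrices as nested lists (standard convention, assumed here).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def rho(a):
    """rho = (1/2)(1 + a . sigma) for a real 3-vector a."""
    ax, ay, az = a
    return [[0.5 * (I2[i][j] + ax * SX[i][j] + ay * SY[i][j] + az * SZ[i][j])
             for j in range(2)] for i in range(2)]

def eigenvalues(m):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr - disc) / 2, (tr + disc) / 2)

def is_state(a, tol=1e-12):
    """A -> Tr(rho A) is a state iff rho >= 0 (the trace is 1 by construction)."""
    return eigenvalues(rho(a))[0] >= -tol

def mix(t, a, b):
    """Convex combination t*rho(a) + (1-t)*rho(b)."""
    ra, rb = rho(a), rho(b)
    return [[t * ra[i][j] + (1 - t) * rb[i][j] for j in range(2)] for i in range(2)]

# Two different decompositions of the same interior point into extreme points.
center_z = mix(0.5, (0, 0, 1), (0, 0, -1))
center_x = mix(0.5, (1, 0, 0), (-1, 0, 0))
```

Both center_z and center_x are the matrix ½·1, although they are built from different pairs of points on the unit sphere.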

Theorem 1.5.10: A face of a simplex is a simplex.

Proof: Let A be a simplex, and B a face of A. It is easy to see that because B is a face, any extreme point of B is an extreme point of A, and thus, if y ∈ B and y is the barycenter of a measure μ supported on ℰ(B), then μ is also supported on ℰ(A). Since y has a unique representation as the barycenter of a measure supported on ℰ(A), a fortiori, it has, at most, one representation as the barycenter of a measure supported on ℰ(B). The result now follows from the Choquet-Meyer theorem.

Theorem 1.5.11: Let (E, F) be a dual pair. Let A be a simplex in E. Let be a family of linear functionals on F, and let be a family of real numbers. Suppose

is nonempty. Suppose that c(B) has the property that if x,y e c(B), then their l.u.b. in the c(A) order lies in c(B). Then, B is a simplex.

Proof: Define on E × R by

Then, it is easy to see that (1.5.4) In particular, if z, x ∈ c(B) and (A), then Given x, y ∈ c(B), we must show that they have a l.u.b. in the c(B) order. Let z be their l.u.b. in the c(A) order. By hypothesis, z ∈ c(B). By (1.5.4), z − x ∈ c(B), so z is an upper bound in the c(B) order. Since c(B) ⊂ c(A), any upper bound in c(B) order is one in c(A) order, so z must be a l.u.b. in c(B) order.

1.6 Legendre Transforms

The procedure of taking Legendre transforms is a very common one in physics. The way it is usually described is that, given a function f(x) on Rⁿ and y ∈ Rⁿ, one first finds a point x* obeying

y = (∇f)(x*) (1.6.1)

and then defines f*(y) by

f*(y) = x* · y − f(x*). (1.6.2)

Of course, the usual presentation is vague about what to do if (1.6.1) has more than one solution or no solution. A much more satisfactory definition is to take

f*(y) = sup_x [x · y − f(x)]. (1.6.3)

Note that if the supremum is taken at some point x*, and if f is smooth there, then x* must obey (1.6.1). Moreover, with this definition it turns out that, if f is convex, then

(f*)* = f. (1.6.4)
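As an illustration of (1.6.3) and (1.6.4), here is a small numerical sketch in which the supremum is taken over a finite grid. The function f(x) = x², the grids, and the helper name legendre are our choices for illustration only; for this f one has f*(y) = y²/4, and transforming twice should return f on the part of the line where the slope grid is large enough.

```python
def legendre(xs, fs, ys):
    """Discrete version of (1.6.3): f*(y) = max over grid points x of [x*y - f(x)]."""
    return [max(x * y - f for x, f in zip(xs, fs)) for y in ys]

# Grid on [-2, 2]; f(x) = x^2 is convex, with exact transform f*(y) = y^2 / 4.
xs = [i / 100 for i in range(-200, 201)]
fs = [x * x for x in xs]

# Slopes y in [-2, 2], so the maximizer x* = y/2 lies inside the x-grid.
ys = [j / 50 for j in range(-100, 101)]
fstar = legendre(xs, fs, ys)

# Transform back, evaluating (f*)* only on [-1, 1], where the maximizer y = 2x
# lies inside the slope grid.
xs_inner = [i / 100 for i in range(-100, 101)]
fstarstar = legendre(ys, fstar, xs_inner)

err_star = max(abs(a - y * y / 4) for a, y in zip(fstar, ys))
err_bidual = max(abs(a - x * x) for a, x in zip(fstarstar, xs_inner))
```

The grids are chosen so that each maximizer is itself a grid point, which is why both errors are at the level of rounding; with a coarser or misaligned grid the discrete transform would only approximate f*.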

For the case n = 1, these ideas in the guise of "conjugate convex functions" go back to the early part of this century and ideas of Young [1912] (and n = 1 is special in that f* can often be computed by a very explicit procedure; see Example 2 below). Surprisingly, the general n-dimensional theory, especially (1.6.4), was only developed in 1948 by Fenchel [1949]. The general theory will concern us here because of its application in Section III.4, where we will see that, on the infinite-dimensional spaces of interactions and states, the pressure and the entropy (actually not s(ρ), but −s(−ρ)) are Legendre transforms. Many of the ideas associated to the Gibbs variational principle are best understood within this context. For this application, we will need to deal with the theory on an infinite-dimensional space. It turns out that even if f is very nice, f* may not be continuous (indeed, the entropy, s(ρ), is not continuous), but it may only have a weaker property:

Definition: Let g be a real valued function on a topological space. We say that g is lower semicontinuous (l.s.c.) if and only if

g(x) ≤ lim inf_{y→x} g(y). (1.6.5)

If −g is l.s.c., we say that g is upper semicontinuous (u.s.c.).

Remarks: 1. A useful mnemonic for l.s.c. is to note that (1.6.5) says that g can get lower as y reaches x.
2. g is continuous if and only if it is both l.s.c. and u.s.c.
3. Often, one allows g to take the value +∞ and still speaks of l.s.c., if (1.6.5) holds.
4. (1.6.5) says that, for any ε > 0 and any x, there is a neighborhood, N, of x so that g(y) > g(x) − ε if y ∈ N. This says that {(x, λ) | λ < g(x)} is an open set. We thus see that

Proposition 1.6.1: g is l.s.c. if and only if {(x, λ) | λ ≥ g(x)} is a closed set.

We remark that it is also easy to see that g is l.s.c. if and only if, for each fixed λ, {x | g(x) ≤ λ} is a closed set.

We are now prepared to make (1.6.3) a formal definition:

Definition: Let (E, F) be a dual pair. Let A be a subset of E, and f a real valued function on A. The Legendre transform, f*, of f is defined on the set A* = {ℓ ∈ F | sup_{x∈A} [ℓ(x) − f(x)] < ∞} (i.e., on the f-bounded functionals) by

f*(ℓ) = sup_{x∈A} [ℓ(x) − f(x)]. (1.6.3′)

Notice that f* doesn't depend on f alone, but also on A, and we should really write (f, A)* rather than f*.

Definition: A Fenchel pair is a pair (f, A) of a real valued function, f, on a set, A ⊂ E, with the property that {(x, t) | x ∈ A, t ≥ f(x)} is a closed convex subset of E × R.

It is easy to see that

Proposition 1.6.2: (f, A) is a Fenchel pair if and only if (i) A is a convex set. (ii) f is a convex function on A. (iii) f is lower semicontinuous on A. (iv) If x_α ∈ A and x_α → x with x ∉ A, then f(x_α) → ∞.

Notice that if we set f to be +∞ on E\A, then (iii), (iv) say that f, as a function on E, is l.s.c.

The point of the definition of Fenchel pair is the following two theorems:

Theorem 1.6.3: Every Legendre transform is a Fenchel pair.

Proof: is a closed convex set as the intersection of such sets.

Theorem 1.6.4 (Fenchel [1949]): If (f, A) is a Fenchel pair with then = (f, A).

Proof: For all x ∈ A and we have that (1.6.6) so

To complete the proof, we need only show that, given a pair with either and or and there is an with (1.6.7)

Now, let The above condition on is that (x₀, t₀) ∉ T. It is easy to see that F × R is a dual pair to E × R (Example 2 of Section 1.5), and that the σ(E × R, F × R) topology is just the product topology on E × R. Since T is closed by hypothesis, the separating hyperplane theorem (Thm. 1.5.2) says that we can find and s so that (1.6.8)

Since t can get arbitrarily large, we cannot have We consider separately the cases and s = 0. If then

Since (1.6.8) says that the sup is finite, we see that and the last calculation and (1.6.8) says that

Taking we have (1.6.7). If s = 0, then (1.6.8) says that (1.6.9)

In particular, there is some

Thus, letting

Thus, by the above construction for any We claim that, for any indeed,

we see that

we find

so that, by (1.6.9) we have (1.6.7) if we take

sufficiently large.

Remark: Fenchel's theorem is a special case of a general form of the "bipolar theorem," a general theorem about duality; see Choquet [1969], Thm. 22.7.

Example 1 : Let 1

and define / on M by

Let Define The Example and implies Then, is smooth basic for Holder's and 2yinequality Indeed, :and fixed, Letgoes inequality and (1.6.6), to Itbe note is Let any easy (see that that atstrictly betoReed-Simon is, the seeinverse so monotone thatits/ maximum is function [1972]). convex. socontinuous defined occurs We claim function byat that a point on (1.6.10) with

For is smooth and the maximum occurs at

and

Consider the curve or (fig. 1.6.1). The point is on the curve, is the area of a rectangle, and is the area under the curve. Thus, is the area between the curve and the y axis, that is, is given by (1.6.10). In this one-dimensional case, is often called the conjugate convex function (a term also sometimes used in general), and the inequality (1.6.6) is often called Young's inequality. It plays a fundamental role in the theory of Orlicz spaces (Krasnoselskii-Ruttickii [1969]). By the duality theorem of Fenchel,

We are heading toward showing that equality holds in the resulting inequality (1.6.11)

if and only if ℓ is tangent to f at x. In the last section, we considered tangents to globally defined convex functions. Here we extend the notion and Thm. 1.5.4.

Fig. 1.6.1. The conjugate convex functions

Definition: Let (f, A) be a Fenchel pair. ℓ ∈ F is called tangent to f at x ∈ A if and only if (1.6.12) for all

Theorem 1.6.5: Let (f, A) be a Fenchel pair. Suppose that A is open in E and f is continuous on A. Then, there is a tangent to f at any point x ∈ A.

Proof: Identical to Thm. 1.5.4.

Theorem 1.6.6: Let (f, A) be an arbitrary Fenchel pair. Fix x. Then, equality holds in (1.6.11) if and only if ℓ is tangent to f at x.

Proof: (1.6.12) is equivalent to

Thus, ℓ is tangent to f at x if and only if

Since (1.6.11) always holds, we have equality if and only if ℓ is tangent to f at x.

Obviously, the last two results imply:

Corollary 1.6.7: Under the hypothesis of Thm. 1.6.5, for any x ∈ A, there is an ℓ ∈ F for which equality holds in (1.6.11).

Corollary 1.6.8: ℓ is tangent to f at x if and only if x is tangent to f* at ℓ.

Proof: Both statements are equivalent to

If there is a unique tangent to f at each point x ∈ A, there is a well-defined correspondence from A to A*, and if there is a unique tangent plane to f* at each ℓ ∈ A*, this correspondence is one-one. When f has nonunique tangents, f* has very special forms in part of its domain of definition.

Definition: Let (f, A) be a Fenchel pair. We say that f is ruled over a subset B of A if B is a convex set with more than one point and there is an affine functional, L, on B, so that f(x) = L(x) for all x ∈ B.

This name comes from the fact that a portion of the graph of f is flat.

Proposition 1.6.9: Let (f, A) be a Fenchel pair, and let x ∈ A be a point with multiple tangents. Let B be the family of tangents to f at x. Then f* is ruled over B. Conversely, if A* is open, f* continuous and f* is ruled over B, there is a point x ∈ A with B a subset of the set of tangents to f at x.

Proof: The set, B, of ℓ obeying (1.6.12) is trivially seen to be convex. Since tangency is equivalent to

f*(ℓ) = ℓ(x) − f(x),

f* agrees with the affine functional L(ℓ) = ℓ(x) − f(x) on B. For the converse, let S be the convex open set {(ℓ, t) | ℓ ∈ A*; t > f*(ℓ)}, and C the disjoint set {(ℓ, t) | ℓ ∈ B; f*(ℓ) = t}. C is convex since f* is ruled over B. By the separating hyperplane theorem (in a form slightly stronger than Thm. 1.5.2) there exists (x₀, s) ∈ E × R with

inf_{(ℓ, t) ∈ S} [ℓ(x₀) + st] ≥ sup_{(ℓ, t) ∈ C} [ℓ(x₀) + st].

Clearly, s ≥ 0. Since B ⊂ A*, if s = 0, then ℓ ↦ ℓ(x₀), restricted to A*, takes its minimum value on A*. But a nonzero linear functional does not take its minimum value on an open set. Thus, s > 0. It follows that, with x = −s⁻¹x₀,

f*(ℓ′) − ℓ′(x) ≥ f*(ℓ) − ℓ(x)

for all ℓ ∈ B and ℓ′ ∈ A*. Thus, x is tangent to f* at each ℓ ∈ B, so by Cor. 1.6.8, the result is proven. •

Imagine a function f with a three-dimensional graph similar to that shown in figure 1.6.2, that is, a point, p, with a set of tangent planes which has three extreme points. Moreover, p is the end point of three curves coming together, and each point on each curve has a one-parameter family of tangents. A typical example would be the free energy of water as a function of temperature and pressure, where p is the triple point at which ice, water, and steam can all occur. The Legendre transform will then be as pictured in figure 1.6.3, where the function is ruled over the triangular region and each of the adjacent regions is a union of ruled lines. In the case of water just mentioned, the variables dual to temperature and pressure are the entropy and specific volume (i.e., inverse of the density), and the Legendre transform is the internal energy. The triangle then corresponds to the fact that, at the triple point, different densities and entropies are allowed. More about such pictures can be found in Wightman [1979]. We should close with a word of warning about these pictures. The one shown where the set of ruled lines ends by getting shorter and shorter (or in a triangle) is

Fig. 1.6.2. A sample convex function

Fig. 1.6.3. The Legendre transform of the sample

so reasonable that one might think that this is the only possibility consistent with convexity. This is not true, as can be seen by looking at the function (Simon-Sokal [unpublished]):

f(x, y) = |x| + x² + y² if y ≤ 0
f(x, y) = (x² + y⁴)^{1/2} + x² + y² if y ≥ 0

where a family of double points ((0, y) with y < 0) ends (at y = 0) without the discontinuity of the derivative going to zero.
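The behavior of this last function can be seen numerically. The sketch below is only an illustration under our own choices (the step h, the sample values of y, and the helper names f and derivative_jump are not from the text): it estimates the jump of ∂f/∂x across x = 0 by one-sided difference quotients. For y < 0 the jump stays near 2 however close y is to 0, while for y > 0 it is essentially zero.

```python
import math

def f(x, y):
    """The two-branch function quoted above; both branches equal |x| + x^2 at y = 0."""
    if y <= 0:
        return abs(x) + x * x + y * y
    return math.sqrt(x * x + y ** 4) + x * x + y * y

def derivative_jump(y, h=1e-6):
    """Right minus left one-sided difference quotient of x -> f(x, y) at x = 0."""
    right = (f(h, y) - f(0.0, y)) / h
    left = (f(0.0, y) - f(-h, y)) / h
    return right - left

# The jump does not shrink as y increases to 0 from below ...
jumps_below = [derivative_jump(y) for y in (-1.0, -0.1, -0.001)]
# ... and is absent (up to discretization error) for y > 0.
jumps_above = [derivative_jump(y) for y in (0.5, 1.0)]
```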

1.7 States on C*-Algebras

Certain aspects of the theory of C*-algebras and von Neumann algebras play a basic role in quantum statistical mechanics. A thorough treatment of the necessary background could require an entire book; indeed, the first volume of Bratteli-Robinson [1979] is devoted precisely to that task. Our goal here in this section and parts of chapter IV is an exposition of some of the main themes, often leaving out details of proofs, emendations, and the like. In this section, we will present those aspects of the abstract theory which do not rely too heavily on the special structure of quantum lattice systems. After the basic definitions, we define states and the associated GNS construction. We then describe various convex subsets of states emphasizing two situations where there are natural unique decompositions: the central decomposition and the ergodic decomposition associated to asymptotically abelian automorphisms. Finally, we will describe some aspects of the Tomita-Takesaki theory.

Definition: A C*-algebra is a complex algebra, 𝔄 (i.e., a complex vector space with a product obeying the usual distributive laws), a norm, ‖·‖, under which 𝔄 is a Banach space, and an antilinear map a ↦ a* obeying: (i) (ab)* = b*a*; (ii) ‖ab‖ ≤ ‖a‖ ‖b‖; (iii) ‖a*‖ = ‖a‖; (iv) ‖a*a‖ = ‖a‖².

We will always suppose that our C*-algebras have an identity, that is, an element, 1, with a1 = 1a = a and ‖1‖ = 1.

Example 1 (C(X)): Let X be a compact Hausdorff space, and let 𝔄 = C(X), the algebra of complex-valued, continuous functions on X under the operations of pointwise multiplication and addition. If ‖f‖ = sup_{x∈X} |f(x)| and f*(x) is the complex conjugate of f(x), 𝔄 is a C*-algebra. Notice that C(X) is an abelian algebra, that is, multiplication is commutative.
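For a finite set X every function is continuous, so C(X) is just the set of complex vectors indexed by X, and the norm properties in the definition can be checked directly. The sketch below does this for two arbitrarily chosen functions on a four-point space; the helper names sup_norm, product, and star and the sample values are ours, and a check on samples is of course not a proof.

```python
import math

def sup_norm(f):
    """||f|| = max over x of |f(x)| for a function on a finite set X, stored as a list."""
    return max(abs(v) for v in f)

def product(f, g):
    """Pointwise product of two functions on X."""
    return [a * b for a, b in zip(f, g)]

def star(f):
    """Involution: pointwise complex conjugate."""
    return [complex(v).conjugate() for v in f]

# Two sample functions on a four-point space X (values chosen arbitrarily).
f = [1 + 2j, -3j, 0.5, 2 - 1j]
g = [2, 1j, -1 + 1j, 0.25]

submultiplicative = sup_norm(product(f, g)) <= sup_norm(f) * sup_norm(g) + 1e-12
star_isometric = math.isclose(sup_norm(star(f)), sup_norm(f))
c_star_identity = math.isclose(sup_norm(product(star(f), f)), sup_norm(f) ** 2)
commutative = all(abs(a - b) < 1e-12
                  for a, b in zip(product(f, g), product(g, f)))
```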

Example 2 (ℒ(ℋ) and its subalgebras): Let ℋ be a (complex) Hilbert space, and ℒ(ℋ) the set of all bounded operators on ℋ, which is an algebra if the product is operator composition. If ‖·‖ is the operator norm and A* is the adjoint, then ℒ(ℋ) is a C*-algebra. Any subalgebra of ℒ(ℋ) closed under norm limits and taking adjoints is obviously also a C*-algebra. Such a concrete subalgebra of ℒ(ℋ) is called a C*-algebra of operators. ℒ(ℋ) is not abelian. A subclass of the C*-algebras of operators has special properties which make them especially tractable. A C*-algebra of operators, 𝔄, on ℋ is called a von Neumann algebra if and only if A_α ∈ 𝔄 and A_α → A in the weak operator topology imply A ∈ 𝔄 (i.e., if and only if 𝔄 is weakly closed). Von Neumann algebras are important because of the following: If A ∈ 𝔄, a C*-algebra, is self-adjoint, then any polynomial in A, and hence any continuous function of A, lies in 𝔄. In general, the spectral projections of A may not lie in 𝔄 (see Reed-Simon [1972] for a discussion of spectral theory), but if 𝔄 is a von Neumann algebra and A = A* ∈ 𝔄, then the spectral projections of A also lie in 𝔄. Thus, if 𝔄 is a von Neumann algebra, it has lots of orthogonal projections; indeed, it has enough so that finite linear combinations of the projections are dense. There exist C*-algebras with no nontrivial projections. Given any C*-algebra of operators, 𝔄, one can form 𝔄^w, its weak closure, the smallest von Neumann algebra containing 𝔄. The name "von Neumann algebra" concerns the following remarkable theorem of von Neumann (see, e.g., Bratteli-Robinson [1979], p. 74 for proofs):

Definition: Given any subset 𝒮 of ℒ(ℋ), we define the commutant, 𝒮′, of 𝒮 by 𝒮′ = {B ∈ ℒ(ℋ) | BA = AB for all A ∈ 𝒮}. The commutant, 𝒮″, of 𝒮′ is called the double commutant.

Theorem 1.7.1 (von Neumann density theorem): If 𝔄 is any C*-algebra of operators, then 𝔄″ = 𝔄^w. Thus, given A ∈ 𝔄″, we can find A_α ∈ 𝔄 with A_α → A weakly. Actually, one can find A_n ∈ 𝔄 with A_n → A strongly and ‖A_n‖ ≤ ‖A‖ (Kaplansky density theorem); see Bratteli-Robinson [1979], pp. 74-75. The two examples above are not as special as they appear. This is the content of the following theorem of Gel'fand and Naimark (see Bratteli-Robinson [1979], Section 2.3.4, and for the abelian theory, Bratteli-Robinson [1979], Section 2.3.5).

Theorem 1.7.2 (Gel'fand-Naimark): (a) Every abelian C*-algebra, 𝔄, is isometrically isomorphic to some C(X). X is uniquely determined by 𝔄. (b) Every C*-algebra, 𝔄, is isometrically isomorphic to a C*-algebra of operators.

Remark: 1. (a) comes out of the beautiful Gel'fand theory of abelian Banach algebras; see Bratteli-Robinson [1979]. 2. Below, we will see some parts of the proof of (b).

3. In some of the older literature, what we have called a "C*-algebra of operators" is just called a "C*-algebra," and what we have called a "C*-algebra" is called a "B*-algebra." Thm. 1.7.2(b) is then stated as "every B*-algebra is a C*-algebra." Given this fact, modern terminology is not to introduce the extra name and just refer to C*-algebras. Often below, we will drop the phrase "of operators."

Example 1, revisited: By the above, every C(X) is (more precisely, is isometrically isomorphic to) a C*-algebra of operators. How? Let μ be a measure on X with supp μ = X, and let Given define by It is not hard to see that is an isometric (because supp μ = X) isomorphism. makes sense if One can also see that if then If X is connected, 𝔄 has no nontrivial projections; 𝔄″ has one for each "event" (a measurable set modulo sets of measure zero).

In the above theorem, we were careful to note that the X in part (a) is uniquely determined by 𝔄. The realization in ℒ(ℋ) given by (b) is not unique. We see this in Example 1, revisited, above: There is a different one for each measure. Even if we look at when there is a unitary V : so that and call inequivalent if no such V exists, then there are many inequivalent (one for each measure class). Thus, one is motivated to look at representations of 𝔄, that is, maps, π, of 𝔄 into some ℒ(ℋ) which preserve adjoints, the identity, and all algebraic operations. We do not require that π be an isometry; indeed, Ker π ≠ {0} is allowed so that π may not be an algebraic isomorphism. While we do not require that π be continuous, it follows automatically that it is continuous. Since the argument gives a flavor of the subject, we give it:

Proposition 1.7.3: Let be two C*-algebras of operators, and a *-algebraic homomorphism taking into Then (1.7.1) for all A.

Proof: Suppose that (1.7.1) holds for A self-adjoint. Then, for any self-adjoint, so

and thus, (1.7.1) holds in general. Thus, we need only prove (1.7.1) for self-adjoint A with and so Thus, which implies that Given a -algebra, and representation, on pick

is

. Then,

The subspace

is left invariant by so we may as well look at that is, without loss, we can look only for cyclic representations, that is representations with is called a cyclic vector. To be precise, by a cyclic representation of we mean a triple of a representation on and a distinguished unit vector so that is dense in Given such a representation, we define (1.7.2) It is easy to see that

is a state where:

Definition: A state on a -algebra, is a linear functional obeying (a) if We note that in the abstract context has several different meanings which can be shown to be equivalent; for example,

if and only if

B. To see that

we note that if

Example 1 (cont'd): is a cyclic representation with for some then The corresponding obeys

Thus, states represent a kind of noncommutative analog of the set of measures. The set of all states on we will denote by It is easy to see that if then (the same argument as in Prop. 1.7.3 if A is self-adjoint; since that

for all complex x, one sees that and is closed in the weak-* topology, that is, where if and only if for all Thus, is a closed subset of the compact set,

of functionals in

of norm at most 1. Thus, since

implies

that Proposition 1.7.4:

is a compact convex set.

Example 3 (States on is a Hilbert space under the inner product Since all norms on a finite-dimensional space are equivalent, any linear functional has the form for some Since the projection onto is positive for any unit vector for all i if is a state. Conversely, if Thus, If every such p has the form

for some

in

with

Thus,

if are the conventional Pauli σ-matrices,

is isomorphic to the unit ball in R³ and its extreme points to It is very far from being a simplex. Indeed (see Bratteli-Robinson [1979], Example 4.2.6), is a simplex if and only if 𝔄 is abelian! The subsets of which we will show below are simplexes have some abelian object lurking in their background.

It is a simple but remarkable fact that every state is an for some cyclic representation:

Theorem 1.7.5 (The GNS construction): Let Then, there is a cyclic representation so that

Moreover, this representation is essentially unique in that, if is another cyclic representation with then there is a unitary map U so that (1.7.3) for all and so that U is uniquely determined by these properties.

Proof: Define an "inner product" on by This inner product has all the proper properties, except it may not be strictly positive definite. Let By the Schwarz inequality for if then for all B. It follows that if then "lifts" to an honest inner product on the quotient space Given we denote its equivalence class in as [A]. Let be the abstract completion of i If and then

so,

Thus, we can define

by

and thus, extends to all of It is easy to see that preserves algebraic operations, and that is, Let Then, and is dense by construction. This proves the first half of the theorem. Given a cyclic representation define U by

Then, so U is an isometry and so extends to all of Its range is all of ℋ by the cyclicity of Thus, U is unitary. Moreover, and

so, (1.7.3) holds. Uniqueness of U is left as an exercise.

is called the GNS representation associated to ω. GNS stands for Gel'fand, Naimark, and Segal. Uniqueness of the representation has an important consequence. Let α be an automorphism of 𝔄, that is, an invertible map preserving adjoints and the algebraic operations. (It thus preserves positivity and so norms, since inf We say that the automorphism α is implementable in the representation, π, if and only if there is a unitary U : so that (1.7.4)

Corollary 1.7.6: Fix an automorphism, α, of 𝔄. Let and let be the associated GNS representation. Then, α is implementable in by a unitary U obeying (1.7.5) if and only if (1.7.6) and in that case, U is uniquely determined by (1.7.5).

Proof: That ω obeys (1.7.6) if there is a U obeying (1.7.4,5) is trivial. Conversely, if ω obeys (1.7.6), then is a representation of 𝔄, and

Thus, by Thm. 1.7.5, there is a unique U obeying (1.7.4,5).

An important consequence of the uniqueness of U is that if G is a group, and a family of automorphisms obeying and if for all g, then the obeying also obey

We note that Thm. 1.7.5 is a key element in the proof of Thm. 1.7.2(b). One shows (using a Hahn-Banach argument) that has enough elements to separate points, and then takes and obtains an isomorphic representation.

Our next result is a kind of noncommutative analog of the Radon-Nikodym theorem (although, admittedly, its commutative specialization is only a weak form of the Radon-Nikodym theorem; the proof below is very close in spirit to von Neumann's proof of the Radon-Nikodym theorem; see Reed-Simon [1972], Thm. 1.19). By we mean elements in with that is, positive multiples of elements in

Theorem 1.7.7: Fix Then, there exists a unique T

and suppose that the commutant of

also. so that (1.7.7)

and (1.7.8) Conversely, if T in obeys (1.7.8), then (1.7.7) obeys have the form (1.7.7) with corresponding then

if and only if

We write Proof: The converse is easy: For T

and A

implies

are both positive if (1.7.8) holds. To prove the direct result, we return to the proof of Thm. 1.7.5 and define a sesquilinear form, F, on 𝔄 by

Clearly, if A then A = 0 for all B. Thus, F lifts to

since and

, and thus F(A, B)

By the Riesz representation theorem, there is an operator, T, on

so that (1.7.9)

implies (1.7.8), and taking A = 1, (1.7.7) holds. To check that T we compute

Since is dense, As for the final statement, it is trivial that

implies

If

let a -

so there exists with by the uniqueness of T, and The last theorem allows us to identify the extreme points of Corollary 1.7.8: Let to The following are equivalent: (a) co is an extreme point. (b) is multiples of the identity. (c) is irreducible, that is, , has no nontrivial subspace for all A

with

Proof: (a) ⇔ (b). It is easy to see that ω is extreme if and only if implies μ is a multiple of ω. But by Thm. 1.7.7, is in one-one correspondence with and this is if and only if This is just Schur's lemma; see Bratteli-Robinson [1979], Prop. 2.3.8.

Thm. 1.7.7 helps explain why the set of states of a non-abelian algebra isn't a simplex, for the cone over for some c} is, by that theorem, isomorphic to If is the n × n matrices, the extreme rays of this set are multiples of rank 1 projections, and it is easy to see nonuniqueness of convex combinations. To get unique decompositions, we will need to deal with subsets of so that the corresponding subset of generates an abelian subalgebra, The two cases of interest will be where , and where is the commutant of with being the implementors of a group of automorphisms. We develop this latter theory first.

The key to a good decomposition theory of invariant states is the notion of asymptotic abelianness, and the weaker notions of G-abelianness. The notion of asymptotic abelianness is due to Robinson [unpublished]. Its relevance to the characterization of ergodic states was found by Doplicher, Kastler, and Robinson [1966], and to decompositions by Ruelle [1966] and Kastler-Robinson [1966]. There are later developments by these authors, and others; see Bratteli-Robinson [1979], pp. 454ff. The theory was developed for its applications to quantum statistical mechanics. While the theory is not restricted to the group we will only need the theory in this case, so we restrict ourselves to it. Also, since the stronger notion of asymptotic abelianness will hold in our applications, we restrict ourselves to that case.

Definition: A -algebra is a C*-algebra, together with a family of automorphisms indexed by obeying , the invariant states, will denote the set of with It is obvious that is a compact convex set. Given an invariant state, ρ, on a -algebra, Cor. 1.7.6 yields a set of unitary operators obeying

We define to be the projection onto the set of all r| for . By the von Neumann ergodic theorem (see the proof of Thm. III. 1.8), all (1.7.9)

Of course, If the only vectors in Ran are multiples of , we say that ρ is an ergodic state. By (1.7.9), one immediately has that

Proposition 1.7.9: If p is ergodic, then for all A, B (1.7.10)

Definition: A -algebra is called asymptotically abelian if and only if, for all A, B one has that where, as usual,

The main decomposition theorem for -algebras is:

Theorem 1.7.10: Let 𝔄 be a -algebra, which is asymptotically abelian. Then
(a) the invariant states, is a Choquet simplex.
(b) is an extreme point if and only if it is ergodic.

The theorem is easier to prove in the case where 𝔄 is abelian. Since this special case is needed in classical statistical mechanics, we develop that special case in Section III.1 without recourse to the C*-machinery below. As a warm-up, the reader might want to look at that special case (Thms. III.1.10,12). We will need to develop some machinery to prove Thm. 1.7.10. Until we make it an explicit hypothesis, we do not assume asymptotic abelianness in the lemmas below. First, we need an extension of Thm. 1.7.7:

Proposition 1.7.11: (b) The association between

sets up a one-one correspondence and

Proof: (a) By (1.7.9), if C commutes with

so,

it commutes with

Conversely, if C , then for all since is dense. We have used (b) By Thm. 1.7.7, we need only show that, given co

so

we have

that if and only if T . But, by definition of and since is invariant. Moreover, since . Thus, by uniqueness of T, if then that is, The converse is easy.

Asymptotic abelianness will enter because it will imply that is abelian. First, we need to study in a little more detail. Notice that since if A then . We can view PA as an operator on , which we denote by ; that is,

Lemma 1.7.12: (a) isomorphism of (b) on Ran

viewed as a set of operators . Then

PRELIMINARIES


Remark: may not be an algebra, but it is closed under taking adjoints and sums. Proof: (a) That is a *-homomorphism is easy, so we need only show that its kernel is {0}. If 0, then , so since for all B , and so by cyclicity A = 0. (b) Since commutes with and it is obvious that Conversely, given C define D by (1.7.11) Letting

we see that

since and This shows that D is well defined and extends to a bounded operator on It is an exercise in the definition (1.7.11) that D and in Thus, D is in by Prop. 1.7.11(a). It follows that

so, by cyclicity

This last lemma says that, for to be abelian, we need only show that is abelian. This will follow if we can show that is abelian because of:

Lemma 1.7.13: Let be a family of operators on a Hilbert space, so that (a) A implies (b) A, B implies that (c) There exists with dense in Then is abelian and equals the set of strong limits of the algebra generated by

Proof: That is immediate from (b), as is the fact that is abelian. Since is closed under taking adjoints (by (a)), it suffices to show that any C with has . Pick Then, since is abelian,


so,

and thus

Thus, without loss we can suppose that C is self-adjoint. Since C commutes with , we have that so, since each B commutes with C and and is dense, By the continuity of the functional calculus (Reed-Simon [1972], Thm. VIII.20), we have, for any bounded continuous function f, that Pick Then, is a limit of polynomials in , and so in and so is in

Remarks: 1. A priori, the may not be uniformly bounded, which is why we go through the above contortions with resolvents and
2. By the (strong) von Neumann density theorem, , but we don't need that abstract result. Rather, abelian implies that while abelian implies that

Lemma 1.7.14: If is an asymptotically abelian -algebra and p , then the set of Lemma 1.7.12 is abelian.

Proof: Fix A, B

By (1.7.9) and

by the condition of asymptotic abelianness.


Proof of Theorem 1.7.10: By the last lemma, is abelian. Let Then is dense in the hypothesis of Lemma 1.7.13. Thus, is abelian and is one dimensional if and only if Ran trivially implies sion 1, so does

is one dimensional. For dim

has dimension 1. Conversely, if and so

Ran and so obeys We claim that has dimen-

has dimension 1.

By Lemma 1.7.12, is isomorphic to Thus, we conclude that , is abelian and that dim 1 if and only if dim Ran To prove (b), note that by Prop. 1.7.11, dim multiples of 1) if and only if that is, if and only if p is an extreme point of Thus, p is an extreme point of if and only if dim Ran is 1.

To prove (a), we need only show that is a lattice. Given Thus, there exist with Now, by the above, is abelian, and so by Thm. 1.7.2 for some X. Since 1, under this isomorphism, T and S correspond to functions f, g in C(X) with 1 for all x. Let and let correspond to $f \wedge g$ under the isomorphism. It is easy to see that is the greatest lower bound of $\mu$ and $\nu$.

Suppose that the algebra is separable. Given Thms. 1.5.9 and 1.7.10 imply that $\omega$ is the barycenter of a unique measure, $\mu$, supported on the ergodic states. Further developments of the theory (see, e.g., Bratteli-Robinson [1979], chapter 4) yield an explicit formula for Given A define a function By the Stone-Weierstrass theorem, polynomials in the A's are dense in the measure on ergodic states whose barycenter is , is given by (1.7.12).

The notion of central decomposition, due to Sakai [1965], is the state analog of the direct integral decompositions of von Neumann [1949] (the relation was made precise by Effros [1961]; see Bratteli-Robinson [1979], Section 4.4). Nonuniqueness of the decomposition of a state into extreme states is associated to carrying the decomposition too far. To understand what we mean by this, let us analyze Example 3 further.

Example 3 (cont'd): If we saw that was isomorphic to , where Of special interest is the representation, $\pi$, of on given by putting the inner product and letting Then, the GNS triple is essentially if Ker p = and

is not an


irreducible representation. It is the direct sum of the natural representation of the algebra on taken twice, but this direct sum can be taken in an infinite number of ways. One difference between the abelian and non-abelian case is that in the abelian case, irreducible subrepresentations of cyclic representations always have multiplicity one, while in the non-abelian case we can have higher multiplicities. Nonuniqueness comes from decomposing things so far that we have split up a part of the representation of multiplicity at least two, putting equivalent representations into different pieces of our decomposition. Thus, one can hope to get unique decompositions if one keeps things "disjoint" where:

Definition: Let $\mathfrak{A}$ be a C*-algebra, and let $\pi_1$ on and $\pi_2$ on be two representations of $\mathfrak{A}$. We say that $\pi_1$ and $\pi_2$ have equivalent subrepresentations if and only if there exist subspaces with and a unitary map V : so that If $\pi_1$ and $\pi_2$ do not have equivalent subrepresentations, we call $\pi_1$ and $\pi_2$ disjoint and write Two states on $\mathfrak{A}$ are called disjoint (we write ) if their associated GNS representations are disjoint.

Thus, and are disjoint if and only if their sets of associated vector states are disjoint sets; here, by an associated vector state we mean a state of the form A where is a unit vector in It is common to consider $\omega$-normal states, that is, states of the form where p is a positive trace class operator on with One can show that disjointness is equivalent to saying the family of -normal states is disjoint from the family of -normal states. We want to decompose states so far that we can't go further without losing disjointness. In this regard, the following lemma and proposition are useful:

Lemma 1.7.15: Let If for some c, then is unitarily equivalent to a subrepresentation of

Proof: By Thm. 1.7.7, we can find T the closure of and find an explicit GNS realization for p which is manifestly a subrepresentation of

Proposition 1.7.16: Let where Then, and are disjoint if and only if there is a nontrivial projection P in with

Proof: Suppose first that such a P exists. Then


is an explicit GNS realization for $\omega_1$, and similarly for and Extend V to all of

by setting

we see that

be a unitary with Since

But

with

and Ran

that is, Since P We have thus shown that

no such V exists, that is, that $\omega_1$ and $\omega_2$ are disjoint.

Conversely, suppose that $\omega_1$ and $\omega_2$ are disjoint. Since we can apply Thm. 1.7.7 and find T with

and since

We first claim that T is a projection, for if not, there exists in spec T, and so 0 with a nonzero spectral projection, Q, for T associated to the interval Since Then, . By the last lemma, and have equivalent subrepresentations violating the assumed disjointness. Thus T = P, a projection in All that remains is to show that We will show that If not, let

By definition of the GNS space for 0 representation of Moreover, if A

so, by the lemma,

is a realization of is manifestly a sub0, since

is a subrepresentation of

, which is a subrepresentation of

This violates the assumption of disjointness, so we conclude that f j = 0. We have thus shown, for any


But then BP = PBP, so replacing B by B* and taking adjoints, PB = PBP, that is, as desired. This proposition makes the following class of states special. Definition:

is called a factor state or a primary state if and only if (the center, is trivial (i.e., multiples of the identity). Since is, in any event, a von Neumann algebra, it has nontrivial projections if it is nontrivial. Thus, the last proposition has an immediate consequence. Corollary 1.7.17: A state is a factor state if and only if it cannot be written as a nontrivial convex combination of disjoint states. This result suggests that we should be able to get uniqueness if we make a decomposition into disjoint factor states. Since we have to expect an integral, in general, we will demand a stronger version of disjointness than just that individual states in the "support" of be disjoint. It will be convenient to say that (not normalized) are disjoint if either one is zero or both are nonzero and and are disjoint elements of We note that when is separable, one can show the factor states are a Baire subset of

Theorem 1.7.18 (Central decomposition): Let $\mathfrak{A}$ be a separable C*-algebra. For each $\omega$ there is a unique probability measure $\mu$ obeying
(i) $\mu$ has barycenter $\omega$, that is,
(ii) For any Baire set S are disjoint states.
(iii) $\mu$ is supported on the set, S, of factor states of that is,
$\mu$ is called the central decomposition of $\omega$.

This theorem is discussed in chapter 4 of Bratteli-Robinson [1979]. $\mu$ is given by the formula (1.7.12) if is now interpreted as the projection onto the closure of . An alternate way of understanding this theorem is to look at the weak closure, . Then, is a simplex (since is abelian, this shouldn't be surprising) and its extreme points are factor states. $\mu$ is just the unique measure on supported on its extreme points with barycenter $\omega$.

The final topic in the abstract theory that we want to describe is the Tomita-Takesaki theory, something that plays a critical role in the study of quantum statistical mechanics. The main theorem appeared in an unpublished work of Tomita, who sought to generalize the theory of a class of algebras called Hilbert algebras (essentially algebras with a tracial state; see below). Complete proofs with refinements and applications appeared first in a paper of Takesaki [1970]. At roughly the same time as this work, Haag, Hugenholtz, and Winnink [1967]


discovered that the interplay between states and dynamics in quantum statistical mechanics (KMS conditions; see below and Section IV.4) yielded a structure partly in common with the earlier theory of Hilbert algebras. It then turned out that there was a close relation between these two themes: the Tomita-Takesaki theory can be interpreted as saying that all states of a certain form are equilibrium states for a suitable quantum system with a suitable dynamics.

Antilinear operators enter in the Tomita-Takesaki theory; since they are not as familiar as the usual linear operators, we quickly sketch some aspects of their theory. A, a map on a Hilbert space, $\mathcal{H}$, is antilinear if it has a dense subspace D(A) as domain of definition, and if $A(\alpha u + \beta v) = \bar{\alpha} Au + \bar{\beta} Av$ for all $u, v \in D(A)$, $\alpha, \beta \in \mathbb{C}$. Given such an A, we say that $u \in D(A^*)$ with $A^*u = w$ if and only if

$(u, Av) = (v, w)$ (1.7.13)

for all $v \in D(A)$. The appearance of (v, w), rather than the (w, v) of the linear theory, is necessary because of antilinearity. An anti-isometry is an everywhere defined antilinear map, J, with $\|Ju\| = \|u\|$. It follows that

$(Ju, Jv) = (v, u)$ (1.7.14)

so $(J^*Ju, v) = (Jv, Ju) = (u, v)$, that is, $J^*J = I$. An antiunitary is an anti-isometry with range $\mathcal{H}$. A is called closed if and only if $\Gamma(A) = \{(u, Au) \in \mathcal{H} \times \mathcal{H} \mid u \in D(A)\}$ is closed. If A is closed, one can write $A = J|A|$ where $|A| = \sqrt{A^*A}$ is a (linear) self-adjoint operator and J is antilinear with $J^*J$ and $JJ^*$ projections (a partial anti-isometry). If Ran A is dense and Ker A = {0}, then J is an antiunitary. $A = J|A|$ is called the polar decomposition of A.

The Tomita-Takesaki theory provides a non-abelian version of the following consequence of Lemma 1.7.13: If $\mathfrak{M}$ is an abelian von Neumann algebra of operators with a cyclic vector, then $\mathfrak{M} = \mathfrak{M}'$. Since we will not provide complete proofs of the general theory, we will at least discuss a special case in complete detail. A state $\omega$ on a C*-algebra, $\mathfrak{A}$, is called a tracial state if and only if $\omega(AB) = \omega(BA)$ for all $A, B \in \mathfrak{A}$. $\mathfrak{I} = \{A \mid \omega(A^*A) = 0\}$ is then closed under taking adjoints, and so it is a two-sided star-ideal. Thus, $\mathfrak{A}/\mathfrak{I}$ is still a C*-algebra, so by passing to it, we may as well suppose that $\omega(A^*A) > 0$ for all $A \neq 0$ (such an $\omega$ is called faithful). Given such a state, we can pass to the GNS space and define J on $\mathcal{H}_\omega$ by

$J \pi_\omega(A)\varphi_\omega = \pi_\omega(A^*)\varphi_\omega$. (1.7.15)

It is easy to see that J is antilinear on $\{\pi_\omega(A)\varphi_\omega\}$ and $\|J\pi_\omega(A)\varphi_\omega\|^2 = \|\pi_\omega(A^*)\varphi_\omega\|^2 = \omega(AA^*) = \omega(A^*A) = \|\pi_\omega(A)\varphi_\omega\|^2$ because $\omega$ is tracial. Thus, J has an extension, which we still call J, to all of $\mathcal{H}_\omega$ which is antiunitary. Moreover, since $A^{**} = A$, we have that $J^2 = I$, and thus $J = J^*$. The basic result here is:


Theorem 1.7.19 (Ambrose's theorem): If $\omega$ is a faithful tracial state on a C*-algebra, and J is given by (1.7.15), then (1.7.16) that is,

and

are anti-isomorphic.
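The simplest instance of this setting can be checked by hand or by machine. The sketch below is an illustration under stated assumptions, not the text's construction: it takes the algebra to be the 2 x 2 complex matrices with the normalized trace as tracial state, realizes the GNS space as the matrices themselves with inner product $\omega(X^*Y)$ and the identity as cyclic vector, and lets the algebra act by left multiplication. All function names are hypothetical. In this model J is $X \mapsto X^*$, and $J\pi(A)J$ turns out to be right multiplication by $A^*$, which commutes with every left multiplication.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def omega(a):
    """Normalized trace: a faithful tracial state on n x n matrices."""
    return sum(a[i][i] for i in range(len(a))) / len(a)

def inner(x, y):
    """GNS inner product (x, y) = omega(x* y) in this model."""
    return omega(matmul(dagger(x), y))

def J(x):
    """The antiunitary of (1.7.15) in this model: X -> X*."""
    return dagger(x)

def conjugated_left_mult(a, x):
    """Apply J pi(a) J to x, where pi(a) is left multiplication by a."""
    return J(matmul(a, J(x)))

def max_abs_diff(x, y):
    n = len(x)
    return max(abs(x[i][j] - y[i][j]) for i in range(n) for j in range(n))

def random_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

rng = random.Random(0)
A, B, X, Y = (random_matrix(2, rng) for _ in range(4))

# (JX, JY) = (Y, X), the anti-isometry identity (1.7.14).
anti_isometry_error = abs(inner(J(X), J(Y)) - inner(Y, X))
# J pi(A) J acts as right multiplication by A*.
right_mult_error = max_abs_diff(conjugated_left_mult(A, X), matmul(X, dagger(A)))
# J pi(A) J commutes with pi(B).
commutator_error = max_abs_diff(conjugated_left_mult(A, matmul(B, X)),
                                matmul(B, conjugated_left_mult(A, X)))
print(anti_isometry_error, right_mult_error, commutator_error)
```

All three printed numbers should be at rounding-error level. The commutation check corresponds to the first step of the proof, where $[JAJ, B] = 0$ is obtained from cyclicity.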

Proof: Let . By using the von Neumann density theorem, one sees that Thus, we may as well drop the $\pi$'s and imagine that $\mathfrak{A}$ is a concrete von Neumann algebra of operators on a space with a vector which is cyclic, and so that

is tracial. (1.7.15) and (1.7.16) then become (1.7.15') (1.7.16')

Let A, B

Then

Thus, if B, C

so, by cyclicity, [ J A J , B] = 0, that is, we have shown that Now let X Then, since J is antilinear and J = J*,

(1.7.17)

and X and B commute. Thus, by cyclicity of (1.7.15') holds also if A is replaced by X Next, let X, Y Then, by the antiunitarity of J, that is, $\omega$ is also a tracial state for Moreover, is cyclic for Thus, we can repeat the proof of (1.7.17) to get (1.7.17) and

imply that


Since,


we obtain

and thus (1.7.16') is proven.

We want to analyze a larger class of states than just tracial states. Clearly, given a concrete -algebra, a cyclic vector, the idea of studying a map is a good one. To do this, one needs to know that implies A sufficient condition for this is that for all A; surprisingly, this is also necessary.

Definition: A vector is called separating for a -algebra, of operators, if and only if A implies that A = 0.

Proposition 1.7.20: Let a be a -algebra of operators on a Hilbert space and let Then (i) is separating for Ct if and only if 0 for all A with A

(ii) If . is cyclic for

0 as $n \to \infty$. One can show (see Israel [1979]) that if $\Lambda_n \to \infty$ in van Hove sense, then $p_{\Lambda_n}(\Phi) \to p(\Phi)$.
2. The result holds in both the classical and quantum case. Once one has the basic estimate (Lemma II.2.2C and II.2.2Q) the proofs are identical, so we only give it in the classical case.
3. The strict translation invariance of the interaction is not necessary; for example, if $\Phi$ is periodic or even only almost periodic in a very weak sense, the limit exists (see Ruelle [1972]). However, some kind of restriction is needed; see the example before in Section II.1.
4. Convexity of p is a stability statement; for example, it implies the positivity of the specific heat.
5. A function F is called strictly convex if there is strict inequality in the basic definition whenever $x \neq y$ and $\theta \neq 0, 1$. The pressure p is definitely not strictly convex because (see Section II.1) it is linear along lines of physically equivalent interactions. In the big space, ffi, there are other examples preventing strict convexity (Israel [1979]). However, in Si, after physical equivalence is factored out, p is strictly convex; see Israel [1979] (earlier results are due to Griffiths-Ruelle [1971] and Roos [1974]).
6. We will give two proofs of Thm. II.2.1 below: one as a formal proof and the other in Example 4 below.

The basic estimate is the following:


Lemma II.2.2C: Let $d\mu$ be any measure. Then for real valued functions f, g: (II.2.2)

Proof: By the inequality

This together with interchange of g and f implies (II.2.2).

The quantum analog of this lemma is

Lemma II.2.2Q: Let A, B be self-adjoint matrices on a finite-dimensional space. Then: (II.2.3)

Proof: Clearly is the j'th eigenvalue counting multiplicity listed in decreasing order, then so

This and symmetry implies the result.
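A quick numerical check can make the classical estimate concrete. The displayed form of (II.2.2) is not legible in this copy; the sketch below assumes it is the standard bound $|\ln \int e^f d\mu - \ln \int e^g d\mu| \le \|f - g\|_\infty$, which follows from $e^f \le e^{\|f-g\|_\infty} e^g$ and the same inequality with f and g interchanged. The discrete measure, the sample sizes, and the function name are illustrative choices.

```python
import math
import random

def log_integral_exp(values, weights):
    """ln of the integral of exp(values) against a discrete positive measure."""
    return math.log(sum(w * math.exp(v) for v, w in zip(values, weights)))

rng = random.Random(1)
n_points = 50
# A positive measure on n_points atoms; it need not be normalized.
weights = [rng.uniform(0.1, 2.0) for _ in range(n_points)]

worst_ratio = 0.0
for _ in range(200):
    f = [rng.uniform(-3.0, 3.0) for _ in range(n_points)]
    g = [rng.uniform(-3.0, 3.0) for _ in range(n_points)]
    sup_norm = max(abs(a - b) for a, b in zip(f, g))
    lhs = abs(log_integral_exp(f, weights) - log_integral_exp(g, weights))
    worst_ratio = max(worst_ratio, lhs / sup_norm)

# The assumed bound says this ratio never exceeds 1.
print(worst_ratio)
```

The ratio stays well below 1 for random f and g; it approaches 1 only when f - g is nearly constant.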



Proof of Theorem II.2.1: As mentioned above, we only give the details in the classical case. By the basic .^-estimate (Prop. II. 1.11):

Thus by (II.2.2): (II.2.4) This inequality establishes (II.2.1) once the limit is shown to exist. Similarly, convexity of p follows from the convexity of which is a consequence of estimates in Sections 1.3 and 1.4. Thus, the proof is reduced to showing that the limit exists. Now use (II.2.4) again. Suppose we can show that the limit exists for a dense set in $\mathcal{B}$. Given any and $\varepsilon$, pick with Then Since we conclude that

THE PRESSURE


Since $\varepsilon$ is arbitrary the limit exists for all The dense set we pick will be the family of finite-range interactions. We say that if and only if for all X with

bigger than r. We let

Given any

we define

by

It is easy to see that in is dense in It thus suffices to prove that the limit exists for Notice that, since only finitely many have diameter less than r, if , then Now fix and with , divide the b-cube, into a-cubes and a leftover region C of volume . Let denote the interactions between spins in , that is,

We can write

R consists of interactions which involve spins in or which involve spins in more than one Since has range r, the spins involved in R must be within a "border region" in of size Thus, the number of sites involved in R is at most Since the contribution of each site is certainly dominated by we have that

Next, note that (assuming


by the translation invariance. The last two formulas and (II.2.2) imply that

where is shortened for with M the fe-cube. Dividing by to infinity through a suitable sequence and using na we find that:

Now take

, taking ('

through a suitable sequence and find that

There is another approach to the existence of the thermodynamic limit which only works directly in special cases. It is worth mentioning since it will be useful for Coulomb lattice gases (where ) and since it gives information on the sign of With one trick (see Example 4) it gives a second proof of Thm. II.2.1. The basic result on which it depends is the following, which will also be useful in the thermodynamic limit of the entropy (see Section III.3).

Definition: A function f on {1, 2, ...} is called subadditive if and only if $f(n + m) \le f(n) + f(m)$. A function on $\{1, 2, \ldots\}^\nu$ is called subadditive if it is subadditive in each $n_i$ with the other $n_j$'s held fixed.

Theorem II.2.3: Let f be a function on $\{1, 2, \ldots\}^\nu$ which is subadditive. Then, $\lim_{n_i \to \infty} f(n_1, \ldots, n_\nu)/(n_1 \cdots n_\nu)$ exists and equals $\inf_n f(n_1, \ldots, n_\nu)/(n_1 \cdots n_\nu)$.

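Remarks: 1. The existence of the limit is not supposed to include its finiteness in this case. It could be $-\infty$.
2. The $n_i$ can go to infinity at arbitrary rates including going to infinity successively.

Proof: We consider the case $\nu = 1$; the general case is similar. Fix a and, given b, write $b = na + r$ with $1 \le r \le a$. We have by induction and the subadditivity relation that

$f(b) \le n f(a) + f(r)$.

Since $n/b \to 1/a$ as $b \to \infty$ and $f(r)$ takes only finitely many values, we conclude that

$\limsup_{b \to \infty} f(b)/b \le f(a)/a$

for any a. Thus

$\limsup_{b \to \infty} f(b)/b \le \inf_a f(a)/a \le \liminf_{b \to \infty} f(b)/b$.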
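The one-dimensional statement of Theorem II.2.3 is easy to see in a toy case. The sketch below uses an illustrative function chosen here, $f(n) = 0.3\,n + \sqrt{n}$, which is subadditive because the square root is; it is not taken from the text. The code checks subadditivity on a finite range and compares $f(n)/n$ with its infimum 0.3.

```python
import math

def f(n):
    """An illustrative subadditive function: linear part plus a square root."""
    return 0.3 * n + math.sqrt(n)

N = 200
# Subadditivity f(n + m) <= f(n) + f(m) on a finite range (small float tolerance).
is_subadditive = all(f(n + m) <= f(n) + f(m) + 1e-12
                     for n in range(1, N) for m in range(1, N))

ratios = [f(n) / n for n in range(1, N + 1)]
infimum_over_range = min(ratios)   # infimum of f(n)/n over 1..N
far_ratio = f(10**6) / 10**6       # f(n)/n far out along the sequence

print(is_subadditive, infimum_over_range, far_ratio)
```

Here $f(n)/n = 0.3 + n^{-1/2}$ decreases to 0.3, so the limit and the infimum coincide, as the theorem asserts. For a general subadditive function the ratios need not be monotone; only the equality of limit and infimum is guaranteed.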

Theorem II.2.4: Let be a classical interaction with the following property: for any X and any rectangle with and , one has that and that for any such X and A

for any rectangle

disjoint from A and containing

Then (II.2.5)

for any rectangle Remarks: 1. The proof also provides another demonstration of the existence of the limit for such 2. Using (1.4.6), the quantum case can be accommodated.

Proof: By Jensen's inequality for disjoint rectangles

where By hypothesis,

so

(II.2.6)


Thus,

is subadditive in the sides of rectangle so the limit exists and equals sup
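The subadditivity argument can be watched at work in the smallest example. The sketch below assumes a one-dimensional nearest-neighbor Ising chain with free boundary conditions, Hamiltonian $H = -J \sum_i \sigma_i \sigma_{i+1}$, and the normalized a priori measure giving weight 1/2 to each spin value; the coupling K stands for $\beta J$ and the function name is hypothetical. For this model $Z_n = (\cosh K)^{n-1}$, so $-\ln Z_n$ is subadditive and $n^{-1} \ln Z_n$ increases to its supremum $\ln \cosh K$.

```python
import itertools
import math

def partition_function(n, k):
    """Z_n for an open chain of n Ising spins, each with a priori weight 1/2.

    k plays the role of beta * J in exp(-beta * H), H = -J * sum s_i s_{i+1}.
    """
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        bond_sum = sum(spins[i] * spins[i + 1] for i in range(n - 1))
        total += math.exp(k * bond_sum)
    return total / 2 ** n

K = 0.7
N_MAX = 12
log_z = {n: math.log(partition_function(n, K)) for n in range(1, N_MAX + 1)}

# -ln Z is subadditive: -ln Z_{n+m} <= -ln Z_n - ln Z_m.
subadditive = all(-log_z[n + m] <= -log_z[n] - log_z[m] + 1e-12
                  for n in range(1, N_MAX) for m in range(1, N_MAX)
                  if n + m <= N_MAX)

pressures = [log_z[n] / n for n in range(1, N_MAX + 1)]  # finite-volume pressures
limit = math.log(math.cosh(K))                           # infinite-volume value
print(subadditive, pressures[-1], limit)
```

Every finite-volume pressure lies below the limit, which is the kind of one-sided information on the sign of the finite-size correction that this approach provides.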

Example 1 (Symmetric systems): Consider a consisting only of pair interactions with a fixed function on Suppose that there is a mapping a : so that = id, is left invariant by a and Then, by symmetry

This shows that (II.2.5) holds for

the Ising ferromagnet of spin-1/2 in zero external field.

Example 2 (Ferromagnets): Let and be Ising spins. An interaction with is called a ferromagnetic interaction if for all X. We will prove in Section X. 1 that for any

and

if Thus for such ferromagnets, (II.2.5) holds. This extends to many other types of ferromagnetic interactions. Example 3 (Negative interactions): If for all X, then even if doesn't factor, the above proof shows that (II.2.6) holds. Example 4 (General interactions): For any , Then, since c is constant

max

(X) and (X) = It is easy to see

that (when

By Example 3,

has a limit. This provides a new proof of Thm. II.2.1.

We add a word about the history of proving existence of the thermodynamic limit for pressure, often in the more complicated continuum-particle systems. The first rigorous results are due to van Hove [1949] with subsequent contributions by Yang-Lee [1952]. Systematic treatments begin with Ruelle [1963a] and then Dobrushin [1964], Fisher [1964], Fisher-Ruelle [1966], and Griffiths [1969]. For more recent references, see the book of Ruelle [1969] and review article of Griffiths [1971].

Finally, we want to say something about the ground state energy density and its relation to the pressure. While we will not mention it again, similar considerations work with other boundary conditions as in the next section. Definition: Given Clearly,

B, let

(classical case) (quantum case). by (Prop II. 1.11) and the basic estimate on R in the


proof of Thm. II.2.1 shows that Thus, as in that proof, we have Theorem II.2.5: For any

the ground state energy

exists. In many simple cases, one can determine e explicitly. One famous case where this cannot be done is the nearest-neighbor quantum Heisenberg antiferromagnet; see Keffer [1966] for an introduction to the literature on the value of e for this model. The connection of e and p is given by:

Theorem II.2.6: Suppose that the support of all open A. Let Then

Proof: Fix

Clearly,

Thus, by Thm. 1.3.3, find

is all of

such that

that is,

for

so taking limits

has a finite limit : for all

with

Given e,

and

The uniformity of the convergence in p necessary to find such an follows from the proof of Thm. II.2.1 which bounds errors by something proportional to It is easy, using the hypothesis on the support of that so that Since is arbitrary, this completes the proof for By a density argument of the type used in Thm. II.2.1, the result extends to all

II.3 Convergence of the Pressure: Other Boundary Conditions

In the last section, we considered the convergence of where was defined with "free boundary conditions," that is, spins inside had no interaction with spins outside Here we want to consider with some other boundary conditions We want to show not only that converges but also that the limit is the same limit as with free boundary conditions. The reason will be very simple. Suppose we show that (II.3.1)


as


where

Then by Lemma II.2.2,

as We begin with the classical case and consider first periodic boundary conditions. We consider only cubes, although it is easy to accommodate rectangles. Given a cube A of side a and some we let be that unique point of A with each component of divisible by a. Given we define by

for all a , that is, we extend the configuration in A to all of by demanding that it be periodic. Write to mean that (i) Some translate of X lies within A. We define

(II.3.2) (II.3.3) The prime is included to indicate we don't include, with na with n twice in the sum but only once. A case can be made for either of these as being more "natural." The first is only a finite sum. Moreover, since the points in any X with lie inside a fixed cube of side 3a, we have that (II.3.4) so can be defined for any in the big Banach space. The second sum is an infinite sum, but as long as it is not hard to see that the sum converges and

(II.3.5) However, if

is more natural from the point of reflection positivity. We note that then

Theorem II.3.1: If side of , goes to to

(resp. then (resp. converges as a, the the infinite-volume, free boundary condition pressure.

Proof: By (II.3.4), (II.3.5), and Lemma II.2.2C

and


so as in the free boundary case we need only prove convergence and equality for

Let Then, as noted above, only consider the first case. But boundary, so

for a large, so we need only involves spins within r of the

so using Lemma II.2.2C,

This proves convergence and the equality. Next, we consider "external" boundary conditions. Let and we let s x t denote the obvious configuration in define by (II. 1.10). Clearly,

Given t and we

(II.3.6)

and if

and A is a cube of a side

If we define then by the usual arguments: Theorem II.3.2: Let A be a sequence of cubes of size trary sequence of configurations in Then for any TheGiven convergence is uniform in choice of on a probability measure we define

Let

be an arbi-

Then without any change, Thm. II.3.2 extends to arbitrary Another possible choice for weighting with v is to let (11.3.7)


By the above estimate,

mine To then be for Notice proofs lations written Again, In Thm. StillTo the "trace define all another II.3.1 the define quantum are so mod that in States class"), partial some matrix virtually extends a.choice finite-dimensional; external are Then We translate case, (•) form trace. there further by define identical would we is without It the we linear fields, is islet Let acalled above. can discussed will be unique change. so and to we be describe lies in we let the be need naturally operator the do in in ^-dimensional, partial infinite-dimensional A. chapter not the the asDbother above. notions D defined. Hamiltonians. trace eon IV. =f to of tr(B)A. IfGiven iof statethem with CIfnpartial and m-dimensional; eany case The These written bthen has trace operator explicitly. yone theorems then properties realizing and requires one-one D C ="state" and on and Suppose the on detertransCtheir (C). be on CX,
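The passage above is badly damaged in this copy, but it legibly introduces the partial trace on a tensor product of an n-dimensional and an m-dimensional space, with the value tr(B)A on product operators. The sketch below is a plain-Python illustration of that notion under an assumed convention: the second factor is traced out and basis vectors of the product space are ordered with index i*m + k. The helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product: entry at row i*m + k, column j*m + l is a[i][j] * b[k][l]."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def partial_trace_second(c, n, m):
    """Trace out the m-dimensional factor of an (n*m) x (n*m) matrix."""
    return [[sum(c[i * m + k][j * m + k] for k in range(m)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]                    # acts on the 2-dimensional factor
B = [[2, 0, 1], [1, 3, 0], [0, 1, 5]]   # acts on the 3-dimensional factor
D = partial_trace_second(kron(A, B), 2, 3)   # should equal tr(B) * A

# Defining property on a non-product operator: tr(D X) = tr(E (X tensor 1)).
E = [[(3 * r + 5 * c + r * c) % 7 - 3 for c in range(6)] for r in range(6)]
X = [[0, 1], [2, -1]]
identity3 = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
lhs = trace(matmul(partial_trace_second(E, 2, 3), X))
rhs = trace(matmul(E, kron(X, identity3)))
print(D, lhs, rhs)
```

The second check is the duality that characterizes the partial trace as a linear map, which is how uniqueness is usually argued.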


A state p on is the assignment of an operator with Tr(p x ) = 1 ( all X) and Given a state p on we define

The usual estimates hold, and for any

on

for each finite X for all disjoint X, Y

we have

In

Tr(exj

Finally, as regards corrections to the limit, we will prove later (Section II.5) that for general cubes and nearest-neighbor Ising models, (to be compared with for free boundary conditions).

II.4 Pressure for Coulomb Interactions

Let us consider the very simplest lattice Coulomb model one can think of. At each site in we place a charge of either +1 or -1, so we may as well think of the charge at a site as an Ising spin. The basic Hamiltonian is (II.4.1)

For each q, we let that is, we consider an a priori measure which might assign different weights to and and we ask for

(II.4.2)

We will show that this limit exists (and is finite) for all $\beta$, q and even determine the q dependence explicitly. We will then say something about extensions, history, and give a partial guide to the enormous literature on the subject. We remark that often one writes things in a grand canonical ensemble as follows: Let = number of $\alpha \in \Lambda$ with and consider

(II.4.3)


If we note that


we have that

where so it suffices to consider the limit (II.4.2). The first observation one makes is that the associated to (II.4.1) does not lie in This actually must be because the sign of is critical for p to be finite, that is, Proposition II.4.1: For any q and

Proof: By symmetry, the sum is invariant under replacing q by 1 — q, so without loss, take Let for all A and take A to be an L x L x L cube. The number of pairs a , y is asymptotically and is at least so

Therefore, the contribution of to the sum is exp proving the proposition.

We begin controlling the limit (II.4.2) for the case We will use the following estimate whose proof we defer:

(II.4.4) for some constant d. The formula allows $\sigma_\alpha$ to take values other than $\pm 1$. The following result includes the existence of the limit (II.4.2) for

Proposition II.4.2: Let be an arbitrary probability measure on with the properties (a) (b) is supported on [-A, A] for some A. Let be given by (II.4.1). Then for any

exists and is finite (the limit being in the sense of cubes).


Proof: Let $Z_\Lambda$ be the partition function. By the symmetry and the method of Example 1 following Thm. II.2.4, $-\ln Z_\Lambda$ is subadditive in the sides of $\Lambda$. This proves existence of the limit. The fact that it is finite depends on (II.4.4) which implies

To control the limit (II.4.2) for we must understand the physics of Coulomb systems a little more deeply than we have so far. [Of course, (II.4.4) already indicates that there are cancellations in Coulomb systems; these are critical for the physics.] The basic idea we will exploit is that there is an overwhelming tendency for Coulomb systems to be neutral, that is, for the only configurations to count in the sum to be those with For such configurations, changing q from to some other value only multiplies the sum by Thus, and are comparable and controlling the limit for will thereby control the limit for any As preliminary to the basic estimate we need, we note: Lemma II.4.3 (a): For any signed measure p with we have (II.4.5) (b) Let dp be concentrated on the sphere of radius R, be rotationally invariant, and of total weight Q. Then

obeys

Proof: (These are standard facts; we give proofs for the sake of completeness.) (a) By a limiting argument, it suffices to prove positivity for with But since has Fourier transform we have

with

the Fourier transform of / .


(b) This is a direct calculation, due to Sir Isaac Newton, but here is a proof without calculations. is clearly harmonic in R and and rotationally invariant. By looking at the differential equation for rotationally invariant harmonic functions for and similarly for Since obviously we see and since for
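Part (b) is Newton's theorem, and the "direct calculation" can be reproduced numerically. The sketch below assumes the statement in its standard form: a unit charge spread uniformly over the sphere of radius R produces the potential $1/\max(r, R)$ at distance r from the center. Averaging $1/|x - y|$ over the sphere reduces to the one-dimensional integral $\frac{1}{2}\int_{-1}^{1} (r^2 + R^2 - 2rRu)^{-1/2}\,du$, which is evaluated here by the midpoint rule; the function name and step count are illustrative.

```python
import math

def sphere_potential(r, R, steps=20000):
    """Potential at distance r from the center of a uniformly charged sphere
    of radius R carrying unit total charge (midpoint rule in u = cos(theta))."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h
        total += 1.0 / math.sqrt(r * r + R * R - 2.0 * r * R * u)
    return 0.5 * h * total

inside = sphere_potential(0.5, 1.0)    # expected 1/R = 1.0
outside = sphere_potential(2.0, 1.0)   # expected 1/r = 0.5
center = sphere_potential(0.0, 1.5)    # expected 1/R = 2/3
print(inside, outside, center)
```

The potential is constant inside the sphere and equals that of a point charge outside, which is what the harmonicity argument in the proof yields without computation. The quadrature converges slowly for r very close to R, where the integrand has an integrable singularity, so the sample points are kept away from the sphere.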

Theorem II.4.4: Let A be any cube. Let be an arbitrary set of numbers with sup

be given by (II.4.1) and let Then (11.4.6)

for positive a, b.

Remarks: 1. (II.4.4) is a special case of (II.4.6); although we only state this theorem for cubes, the proof of the lower bound holds for arbitrary
2. The values of a, b we get are ; one can clearly do better by exploiting cubes in place of spheres ($aL^{-1}$ comes from the capacity of a sphere of radius ; the capacity of a cube of side L is strictly larger).

Proof: Let p be the unit measure on a sphere of radius

Let dp

Then, using (b) of the lemma,

(II.4.7)

Now let y be the unit measure on the sphere of radius $L\sqrt{3}$ and let Using (b) again and the fact that the L x L x L cube lies inside the sphere, we have (II.4.8) (II.4.5,7,8) imply (II.4.6).

We are now ready for

Theorem II.4.5: For any q with 0 < q < 1, and

exists, the limit going through cubes, and is finite. Moreover,


(II.4.9) if

Proof: Fix . By symmetry we can suppose Let (q) be the sum over all configurations Let (q) denote the sum over only those configurations with obeying (II.4.10) (2.5 plays no special role, we only want Since

and

we have, by (II.4.10), that

so that (II.4.11) But by (II.4.6): (II.4.12) This estimate for together with (which follows from subadditivity of ) implies (II.4.13) But then, (II.4.11) implies exp for L large, so (II.4.12) yields (II.4.13) with arbitrary q in (0,1) replacing q = 1/2 in both Z and The theorem is clearly implied by (II.4.11), (II.4.13), and Prop. II.4.2.

The above proof closely followed the intuition we described before Lemma II.4.3. Notice that since independently of q, (II.4.9) fails for and, in particular, cannot equal for all q. Indeed, one can


show that $\lim_{\beta \downarrow 0} p(\beta, q) = \tfrac{1}{2} \ln[4q(1 - q)]$, which equals $p(\beta = 0, q)$ only if $q = \tfrac{1}{2}$. This noncontinuity in $\beta$ is only possible because $p(\beta, q) = \infty$ for $\beta < 0$ (for interactions in $\mathcal{B}$, p is convex in $\beta$ for $\beta$ in $(-\infty, \infty)$ and continuous automatically). Notice also that the proof really shows that no matter what the value of q, $\lim_{\Lambda \to \infty} |\Lambda|^{-1} \langle Q(\Lambda) \rangle_\Lambda$ (with $\langle \cdot \rangle_\Lambda$ the finite-volume Gibbs state) is zero if $\beta > 0$. (By the law of large numbers, if $\beta = 0$, the limit is $(2q - 1)$.)
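The tendency toward neutrality is already visible in a very small volume. The sketch below enumerates all 256 configurations of a 2 x 2 x 2 cube. It rests on assumptions, since (II.4.1) is not legible in this copy: the Hamiltonian is taken to be $H = \sum_{\text{pairs}} \sigma_\alpha \sigma_\gamma / |\alpha - \gamma|$ over unordered pairs of distinct sites, and the a priori measure gives weight q to charge +1 and 1 - q to charge -1. Function names are hypothetical.

```python
import itertools
import math

def cube_sites(L):
    return list(itertools.product(range(L), repeat=3))

def pair_weights(sites):
    """(i, j, 1/distance) for every unordered pair of distinct sites."""
    pairs = []
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(sites[i], sites[j])))
            pairs.append((i, j, 1.0 / d))
    return pairs

def energy(spins, pairs):
    """Assumed Coulomb energy: sum over pairs of s_i s_j / |i - j|."""
    return sum(spins[i] * spins[j] * w for i, j, w in pairs)

def mean_charge_density(beta, q, L=2):
    """Finite-volume Gibbs expectation of the total charge, per site."""
    sites = cube_sites(L)
    pairs = pair_weights(sites)
    n = len(sites)
    numerator = denominator = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        n_plus = spins.count(1)
        weight = (q ** n_plus * (1 - q) ** (n - n_plus)
                  * math.exp(-beta * energy(spins, pairs)))
        numerator += weight * sum(spins)
        denominator += weight
    return numerator / (denominator * n)

def ground_state(L=2):
    sites = cube_sites(L)
    pairs = pair_weights(sites)
    return min(((energy(s, pairs), s)
                for s in itertools.product((-1, 1), repeat=len(sites))),
               key=lambda item: item[0])

free_density = mean_charge_density(0.0, 0.8)    # equals 2q - 1 at beta = 0
cold_density = mean_charge_density(10.0, 0.8)   # close to 0 for large beta
e_min, minimizer = ground_state()
print(free_density, cold_density, e_min, sum(minimizer))
```

At $\beta = 0$ the charge density is $2q - 1$ regardless of the interaction, while at large $\beta$ the alternating, neutral configuration dominates even though the a priori measure favors positive charge. In so small a volume the suppression is only pronounced at large $\beta$; the statement in the text concerns the infinite-volume limit at any $\beta > 0$.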

Thus there are a number of rather strange properties due to this tendency for Coulomb systems to be neutral in the strong sense of even being neutral on scales which are large but not very large. In many ways the most striking is Debye screening discussed below. But this neutrality is also responsible for many limits existing where one would expect divergence and for many especially pretty properties of Coulomb systems. It must always be borne in mind when considering Coulomb systems; because of it, the study of Coulomb systems is a subject often separate from the rest of lattice gases. For this reason, we will not treat Coulomb systems extensively. However, we should emphasize that one can make the case that for physics, Coulomb systems are as important as (and perhaps more important than) the lattice gases we do discuss!

Let us describe extensions of Thm. II.4.5 in a number of remarks:

(1) It is easy to add interactions in $\mathcal{B}$ which preserve $\sigma_\alpha \to -\sigma_\alpha$ symmetry, for (II.4.6) will still hold. In particular, for any / with Σ l/(-c(N +M)

(II.4.15)

the analog of (II.4.4). The importance of this question was raised in the thirties by Onsager but a complete solution took over thirty years. In fact (II.4.15) is false if one does not require that at least one species of particles obey Fermi-Dirac statistics (Dyson [1967]) (fortunately electrons do this!). If one of the species obeys these statistics, (II.4.15) holds; this was proven by Dyson and Lenard [1967], [1968], but with a constant c, which is too large by many orders of magnitude. Lieb and Thirring [1975], [1976] discovered a proof, which is not only much improved over that of Dyson-Lenard, but which yields a constant that is within about an order of magnitude of what the right constant should be (assuming the hydrogen molecular solid is the ground state).

Because of the long-range nature of Coulomb potentials, it does not follow immediately from (II.4.15) that the thermodynamic limit of the free energy exists. Because of the special role played by spheres in controlling Coulomb cancellations, it turns out to be very useful to consider partition functions in balls. Using a clever ball packing analysis, Lieb and Lebowitz [1972] succeeded in proving that the free energy exists for continuum quantum systems. The rotational symmetry of the continuum theory then plays an important role. Since this is lacking in the lattice theory, there do not exist as strong results in the literature for the lattice theory. By using a Sine-Gordon transformation (see below), Kunz [unpublished] has treated "essentially charge symmetric" gases, that is, systems with charge values $S_1, \dots, S_n$ and corresponding probabilities of occurrence $p_1, \dots, p_n$ with $\sum_{i=1}^{n} p_i = 1$, so that $p_i > 0$ and so that for every $S_i \neq 0$, there is an $S_j$ with $S_j = -S_i$. The method used above to prove Thm. II.4.5 appears to be new. In any event, a definitive treatment of the Coulomb lattice gas free energy with general a priori distribution remains to be presented.

In one and two dimensions the Coulomb potential diverges at spatial infinity, which makes the tendency to neutrality even stronger, but which does introduce some extra complication and phenomena. Because the short distance behavior is less singular, there is classical stability of the continuum model generally in one dimension and if the charges aren't too large in two dimensions as first proven by Fröhlich [1975], [1976]. For discussion of one dimension, see especially Lenard [1973].

One of the more striking phenomena in Coulomb systems is Debye screening. To appreciate it, note that for a system of two Ising spins $\sigma_1, \sigma_2$ with Hamiltonian $H = -J\sigma_1\sigma_2$, the two-point function is $\langle\sigma_1\sigma_2\rangle = \tanh(\beta J)$.

By a simple correlation inequality (see Griffiths [1967a]), if

$$H = -\sum J_{\alpha\gamma}\,\sigma_\alpha\sigma_\gamma, \qquad J_{\alpha\gamma} \ge 0,$$

then $\langle\sigma_\alpha\sigma_\gamma\rangle \ge \tanh(\beta J_{\alpha\gamma})$. Thus for positive J's, the falloff of $\langle\sigma_\alpha\sigma_\gamma\rangle$ is no faster than that of J. In Coulomb systems with J's not all positive, at high temperature and/or small fugacity, $\langle\sigma_\alpha\sigma_\gamma\rangle$ falls exponentially at large distances. Physically this is produced by the tendency for the value of $\sigma_\alpha$ to be "lost" due to screening by charges of the opposite sign. Debye screening was first proven in a class of continuum models with short distance cutoffs by Brydges [1978], with later developments by Brydges and Federbush [1980]. This work and many other developments in the theory exploit a device called the "Sine-Gordon transformation." This depends on the Gaussian formula: ex

with

P(-1

Σ ^αβ p.

Proof: It is easy to see that in each dimension there is a transfer matrix formalism whose "state space" is the array of spins in the hyperplanes orthogonal to that direction. The result follows from Prop. II.5.4 (b) once the positive definiteness of A is established. This positive definiteness follows from the machinery of infrared bounds. •

Remark: This result is a lattice analog of a field theoretic result noticed by Guerra et al. [1975a], [1975b], [1976].

II.6

Transfer Matrices, II: Two-Dimensional Ising Model

Onsager's celebrated calculation [1944] of the free energy of the two-dimensional nearest-neighbor Ising model in zero field is a benchmark in the study of statistical mechanics. Among other things, it set to rest once and for all the question of whether the thermodynamic limit can produce nonanalyticity by providing a function with explicit singularities (actually Peierls's earlier work [1936] [see Section VI.1] should have settled this problem in physicists' minds, but appears not to have).

Onsager's paper succeeded in diagonalizing the transfer matrix by studying a Lie algebra and using rather formidable techniques and calculations; it has a reputation of incomprehensibility. Some considerable simplification of this work was accomplished by Kaufman [1949], who related the Lie algebra to Clifford algebras. By stating things in terms of "fermion operators," Schultz, Mattis, and Lieb [1964] made this approach even more palatable and it is their presentation that we will follow in this section.

It should be mentioned that an alternate approach using combinatorial methods in place of algebraic methods was developed beginning with the work of Kac and Ward [1952] with especially significant contributions by Hurst and Green [1960] (see the review articles listed below for more detailed history).

The method fairly easily accommodates couplings of different strengths in the vertical and horizontal directions, but for simplicity we will take them equal. There have also been solutions for other two-dimensional lattices than the square lattice, such as the triangular and honeycomb lattices. Among the review articles on the subject are ones by Newell and Montroll [1953], Domb [1960], Schultz et al. [1964], Green and Hurst [1964], Harary [1967], and Temperley [1972]. A nice presentation of a combinatorial approach can be found in Percus [1969].

From 1944 until 1967, virtually all models with rich structure and exact solutions were close relatives of Onsager's model. This was changed by Lieb's calculation [1967b] of the "entropy of square ice" and subsequent generalization to the class of six-vertex models. These models are discussed and the history reviewed in Lieb and Wu [1972]. We will not discuss these various models here.

As a warm-up for the Ising model, we will describe the "solution" of two other models: the free Fermi form and the one-dimensional quantum xy model.

We use the symbol $\{\cdot,\cdot\}$ for an anticommutator, that is, $\{A, B\} = AB + BA$. A fermion operator is an operator A (for our purposes here, we can view it as acting on a finite-dimensional inner product space) obeying

$$\{A, A\} = 0 \tag{II.6.1}$$

$$\{A^*, A\} = 1 \tag{II.6.2}$$

where 1 stands for the identity matrix and $A^*$ is the adjoint of A.

Proposition II.6.1: If A is a fermion operator on $\mathcal{H}$, then $\mathcal{H}$ is the orthogonal direct sum of Ker(A) and Ker(A*). A is an isometry from Ker(A*) onto Ker(A) and A* is an isometry from Ker(A) to Ker(A*) with

$$A^*A\varphi = \varphi \quad \text{if } \varphi \in \operatorname{Ker}(A^*) \tag{II.6.3a}$$

$$AA^*\varphi = \varphi \quad \text{if } \varphi \in \operatorname{Ker}(A). \tag{II.6.3b}$$

Proof: (II.6.3) follows immediately from (II.6.2). From these equations we see that $\operatorname{Ker}(A) \cap \operatorname{Ker}(A^*) = \{0\}$. Even more is true: if $\varphi \in \operatorname{Ker}(A^*)$ and $\psi \in \operatorname{Ker}(A)$, then

$$(\varphi, \psi) = (\varphi, (A^*A + AA^*)\psi) = (\varphi, (A^*A)\psi) + (AA^*\varphi, \psi) = 0$$

so Ker(A) is orthogonal to Ker(A*). By (II.6.1), $\operatorname{Ran}(A) \subset \operatorname{Ker}(A)$ so A certainly maps Ker(A*) to Ker(A). Similarly, (II.6.1) implies that $(A^*)^2 = 0$, so the same conclusion holds if the roles of A and A* are reversed. To see that $\mathcal{H} = \operatorname{Ker}(A) + \operatorname{Ker}(A^*)$, we note that by (II.6.2)

$$\varphi = A^*A\varphi + AA^*\varphi$$

and by (II.6.1), $A^*A\varphi \in \operatorname{Ker}(A^*)$ and $AA^*\varphi \in \operatorname{Ker}(A)$.

The above implies that $\mathcal{H}$ must have even dimension and that (II.6.1-2) have a unique representation by 2 x 2 matrices with and any other representation is just a direct sum of these.

$A_1, \dots, A_n$ are called n fermion operators if they obey

$$\{A_i, A_j\} = 0, \qquad \{A_i^*, A_j\} = \delta_{ij}. \tag{II.6.4}$$

Proposition II.6.2: If there are n fermion operators on $\mathcal{H}$, then $\mathcal{H}$ must have a dimension which is a multiple of $2^n$ and the representation is a direct sum of copies of the following representation, on $\mathbb{C}^{2^n}$: There is a unique vector η (the fermion vacuum) with

and by

has a basis of vectors

given

with

with and

the rearrangement of and

ordered, so that

Proof: By induction on n, n = 1 is just the last proposition. Given n fermion operators, by that proposition we can write But (II.6.4) implies that and all map Ker into itself. On that space we thus have a (n - 1) fermion operator which we know about by induction. This observation proves the theorem.
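For readers who like to see such a representation concretely, here is a minimal numerical sketch of n fermion operators on a space of dimension $2^n$. The particular construction (tensor products of a 2 x 2 lowering matrix with sign matrices on the earlier factors) is a standard choice that we assume here for illustration; it is not taken from the proof above, and all function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def fermion_operators(n):
    """Return matrices A_1..A_n on C^(2^n); the construction is an assumption.

    A_j = z (x) ... (x) z (x) a (x) 1 (x) ... (x) 1, with j-1 sign factors z.
    """
    a = np.array([[0.0, 1.0], [0.0, 0.0]])  # one-mode operator with a^2 = 0
    z = np.diag([1.0, -1.0])                # sign matrix, anticommutes with a
    one = np.eye(2)
    ops = []
    for j in range(n):
        op = np.eye(1)
        for factor in [z] * j + [a] + [one] * (n - j - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops


def anticommutator(x, y):
    return x @ y + y @ x


def satisfies_car(ops, tol=1e-12):
    """Check {A_i, A_j} = 0 and {A_i^*, A_j} = delta_ij."""
    dim = ops[0].shape[0]
    for i, ai in enumerate(ops):
        for j, aj in enumerate(ops):
            if not np.allclose(anticommutator(ai, aj), 0.0, atol=tol):
                return False
            target = np.eye(dim) if i == j else np.zeros((dim, dim))
            if not np.allclose(anticommutator(ai.conj().T, aj), target, atol=tol):
                return False
    return True


def vacuum(n):
    """Vector annihilated by every A_j in this representation."""
    eta = np.zeros(2 ** n)
    eta[0] = 1.0
    return eta


def occupation_basis(ops):
    """Columns A_{i1}^* ... A_{ik}^* eta over all subsets i1 < ... < ik."""
    n = len(ops)
    vectors = []
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            v = vacuum(n)
            for i in reversed(subset):
                v = ops[i].conj().T @ v
            vectors.append(v)
    return np.column_stack(vectors)
```

Applying the adjoints to the vacuum in increasing index order produces $2^n$ orthonormal vectors, which is the kind of basis the proposition describes.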


By the "free fermi form," we mean the operator

$$H = \sum_{i,j=1}^{n} M_{ij}\, A_i^* A_j \tag{II.6.5}$$

where M is a Hermitean matrix and $A_1, \dots, A_n$ are n fermion operators (this is often called "quasifree," "free" being reserved for a particular M). H is an operator on a space of dimension (at least) $2^n$. We want to show how to find its eigenvalues and eigenvectors by diagonalizing M, which is only n x n. For, let U diagonalize M with U unitary, that is, (II.6.6)

(II.6.7) Define (II.6.8)

so, by (II.6.7)

Inserting this in (II.6.5) and using (II.6.6), we see that

Moreover, by (II.6.4), (II.6.7), and (II.6.8), we see that

so that we know all about the Indeed, the various commute and the are their simultaneous eigenvector (r) 's for B, not A). Thus (II.6.9) The reader should notice that the above is just another way of writing the alternating algebra discussed in Section I.4. with is£-fixed just the of justdimension span acts

like

The operator H leaves

invariant and acts like

there. (II.6.9) is just the infinitesimal version of

when the are eigenvectors of M. The point of the above is that the purely algebraic relations (II.6.4) are a tip-off to the alternating algebra structure being present.

Another way of writing these same relations is to define 2n by (II.6.10) or equivalently (II.6.11) It is easy to see that if are 2n Hermitean operators and the A's and are related by (II.6.10), then (II.6.4) holds for the A's if and only if (II.6.12) The t's are said to generate a Clifford algebra and the uniqueness Prop. II.6.2 is also a uniqueness theorem for (II.6.12).

With the above in mind, let us try to solve a particular one-dimensional quantum system. Consider the set up of the one-dimensional spin-$\tfrac12$ quantum Heisenberg model but take (II.6.13a) which is called the quantum xy Hamiltonian for obvious reasons. We want to show how to compute the pressure for such a model; since it is a quantum model, the transfer matrix methods of the last section are not applicable. The $\sigma$'s come close to obeying (II.6.12) in that

Ironically, the problem is that the $\sigma$'s aren't noncommutative enough; anticommutation at different sites is replaced by commutation. The linear transformation needed to diagonalize the matrix playing the role of M here will mix different sites so the new operators will have neither simple commutation nor anticommutation relations. There is a device going back to Jordan and Wigner [1928] which gets around this. It will be more convenient to express this in terms of fermion algebra than Clifford algebra, so define

so that (II.6.13) (II.6.14) The Jordan-Wigner transformation defines

by

(II.6.15)

where Since is Hermitean with eigenvalues

(indeed,

is just

has eigenvalues 0 and Notice that

From these relations one readily checks that (II.6.16)

so that the A's are fermion operators and (II.6.17) (II.6.18a) (II.6.18b) (II.6.18c) (II.6.18d) There is no simple relation between

and products of


Now take to be the xy model Hamiltonian with n sites 1, . . . , n. Because we are in one dimension (and this is where one dimension comes in)

where M is the matrix By the arguments in Section II.2, the "boundary terms" will not affect the pressure in

with

If has eigenvalues, then by arguments above, has eigenvalues so (recalling that tr is the normalized trace)

But because of our arranging periodic boundary conditions, has eigenvectors with Thus and


Recognizing that the Riemann sum is approaching an integral we have

Theorem II.6.3: For the one-dimensional quantum xy model with Hamiltonian (II.6.13), the pressure lim is given by:
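The reduction behind this theorem (Jordan-Wigner transformation followed by diagonalization of an n x n matrix) can be checked numerically on a short chain. The sketch below rests on assumptions that are ours rather than the text's: an open chain with Hamiltonian $-J\sum_i(\sigma^x_i\sigma^x_{i+1} + \sigma^y_i\sigma^y_{i+1})$ written in Pauli matrices, for which the one-particle matrix has entries $-2J$ between neighboring sites, and an unnormalized trace. The normalization of (II.6.13) may differ by constant factors, and all names are illustrative.

```python
import itertools

import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def site_operator(single, site, n):
    """Embed a 2 x 2 matrix at one site of an n-site chain."""
    op = np.eye(1, dtype=complex)
    for k in range(n):
        op = np.kron(op, single if k == site else I2)
    return op


def xy_hamiltonian(n, coupling):
    """Open-chain xy Hamiltonian as a 2^n x 2^n matrix (assumed normalization)."""
    dim = 2 ** n
    h = np.zeros((dim, dim), dtype=complex)
    for i in range(n - 1):
        for s in (SX, SY):
            h -= coupling * site_operator(s, i, n) @ site_operator(s, i + 1, n)
    return h


def one_particle_energies(n, coupling):
    """Eigenvalues of the n x n hopping matrix playing the role of M."""
    hopping = np.zeros((n, n))
    for i in range(n - 1):
        hopping[i, i + 1] = hopping[i + 1, i] = -2.0 * coupling
    return np.linalg.eigvalsh(hopping)


def fermion_spectrum(n, coupling):
    """All 2^n subset sums of the one-particle energies, sorted."""
    eps = one_particle_energies(n, coupling)
    sums = [sum(eps[i] for i in subset)
            for k in range(n + 1)
            for subset in itertools.combinations(range(n), k)]
    return np.sort(np.array(sums, dtype=float))


def log_trace_per_site(n, coupling, beta):
    """(1/n) ln Tr exp(-beta H) from the one-particle energies alone."""
    eps = one_particle_energies(n, coupling)
    return float(np.sum(np.log1p(np.exp(-beta * eps))) / n)
```

The $2^n$ eigenvalues of the spin Hamiltonian coincide with the subset sums of only n numbers, so $\ln \operatorname{Tr} e^{-\beta H}$ is a sum of n terms; with plane-wave eigenvectors this sum is a Riemann sum, which is the step the theorem takes to the limit. With the normalized trace one subtracts ln 2 per site.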

See Thompson's review article [1972] for further discussion of the one-dimensional xy model, including external field and anisotropic couplings in the xx and yy directions.

Before leaving the presentation of these ideas, we note that one can further extend the analysis to diagonalize objects like using Bogoliubov transformations; this is described, for example, in the book of Berezin [1966]. These maps define B's as sums of A's and $A^*$'s. Indeed, it is quadratic forms of this type which enter below in analyzing the two-dimensional Ising model; but we can avoid the general theory at the cost of diagonalizing a 2 x 2 matrix by hand.

With these lengthy preliminaries out of the way, we can turn to the analysis of the Ising model. We consider an array of spins with periodic boundary conditions in all directions. Label the spins i We consider a transfer matrix for the problem. The matrix will be a $2^L \times 2^L$ matrix whose rows and columns can be labeled by the $2^L$ assignments of $\pm 1$ to spins labeled 1, ..., L. We let

(II.6.19) with the convention

As in the last section

is recognized as the partition function

associated to the Hamiltonian

(II.6.20) Since ln (largest eigenvalue of for any fixed L and since we know that one can compute the pressure by first taking M to infinity and then L to infinity (see Section II.3), we have

Lemma II.6.4: Let p(K) be the pressure for the Hamiltonian (II.6.20) (with no factor in front of sum defining Z). Let be the largest eigenvalue of the matrix T given by (II.6.19). Then

Thus, we are reduced to the task of merely diagonalizing a matrix. We write as a matrix product (II.6.21a) where

is the diagonal matrix (II.6.21b) (II.6.21c)

It will pay to rewrite the S and V matrices in terms of other objects. Define matrices

All we have done is thought of as an L-fold tensor product of 's, and taken copies of the basic x matrices. Clearly, we can write in terms of x and since kl

(II.6.22) we can write

in terms of the

The

's will enter later. Notice that (II.6.23)

Trivially (as matrices) (II.6.24a) To rewrite

, we begin by noting that by (II.6.22):

Noting that

we are led to seek

and a so that

Clearly, is given by (II.6.25) Thus (II.6.24b) S is written as exponential of a quadratic in $\tau$'s but seems to be only an exponential of a linear in $\tau$'s; but by writing as we can remedy this! Thus in forming fermion operators out of $\tau$'s, we want to use and This motivates the definition of new matrices By direct calculation

As a consequence, we can find that the $\sigma$'s obey (II.6.14) on account of (II.6.23) and

(II.6.26a) (II.6.26b) As in the xy model, we next make the Jordan-Wigner transformation (II.6.15). Using (II.6.17) and (II.6.26b) (II.6.27a) By (II.6.18), for $k = 1, \dots, L-1$ This is not true for $k = L$ and it appears that disaster has struck! We will shortly use the periodic boundary conditions to diagonalize things, so we can't say we'll just compute Z with free boundary conditions. Moreover, changing that term won't change logs of eigenvalues of by much (which is what worked for the xy model) but it seems nontrivial to work on

. What is true is that

where

Define measures the "number of fermions." Thus

(II.6.27b)


Note that since

(which explains why N is the number of fermions), we see that commutes with any bilinear in A's and $A^*$'s, and so with In particular, leaves invariant each of spaces with and on those spaces equals , that is,

Lemma II.6.5: The eigenvalues of are just all those eigenvalues of with eigenvectors obeying and those eigenvalues of obeying

We now pass to diagonalizing the quadratic forms for the exponential of clearly has "periodic" boundary conditions so we use plane waves and since has "antiperiodic boundary conditions," we use the plane waves appropriate for that, that is, we define (II.6.28a) (where

is chosen for later simplicity) and pick (II.6.29a)

for

and (II.6.29b)

for and we have supposed that L is even for simplicity. (II.6.28a) then is inverted by (II.6.28b) Putting this into (II.6.27) and using the symbol [resp. to indicate the product of positive q's obeying (II.6.29a) [resp. (II.6.29b)], we see that


where

As a result (II.6.30)

(II.6.30) is not merely a product. The commute for different 0], and where the n take values 0 or 1 and with even (for or odd (for Below, we'll show that is monotone in I I and q

with of

and - if Thus, the largest eigenvalue has a log as Dividing by L and noticing that the Riemann sum turns into an integral

(and recalling the

in writing

), we see that

Theorem II.6.9 (Onsager [1944]): Let solve (II.6.32) with $K^*$ given by (II.6.25). Let p(K) be the pressure for the two-dimensional spin-$\tfrac12$ Ising model with Hamiltonian . Then (II.6.33)

To study this further, we first study (II.6.25) and (II.6.32). The first equation can be rewritten as (II.6.25') This shows a symmetry between K and $K^*$ so that Since (II.6.32) is symmetric in K and $K^*$ we see that (II.6.34) by the explicit formula (II.6.37). This result predates Onsager's work as we shall discuss in the next section. Of particular interest we will see shortly is the point $K_c$ determined by

(II.6.35) given by the equivalent formulas: (II.6.36)

or numerically: (II.6.32) can be rewritten, using (II.6.25) (II.6.32') From this, we read the facts that cosh the formula relating Since

monotonicity of

in q and

is solved by

and ln [tanh is analytic in K, we see that is analytic in K uniformly in as long as we stay away from the point . Thus

Proposition II.6.10: p(K) is real analytic away from the point $K = K_c$.

To study the singularity near $K_c$, we first note that

by (II.6.25'). Differentiating (II.6.32'), we see

and differentiating again

Noting that and that , we see

is bounded as

But

so using

we see that (II.6.37) This is the celebrated logarithmic divergence of the specific heat (which is related to This shows that p(K) is singular at $K_c$.

This completes our presentation of the Onsager solution. We will discuss some expansions and further results in a series of remarks:

(1) Further analysis of the eigenvalues (see, e.g., Schultz-Mattis-Lieb [1964]) shows that for , the gap between the two smallest eigenvalues remains nonzero in the limit $L \to \infty$ and is exactly (as L c

Correlations like odic states] in the limit indeed one can show that

fixed,

the infinite volume limit of the perishould fall like and

is given by (see, e.g., Kadanoff [1966] and Wu [1966]):

Theorem II.6.11: For $K < K_c$, the mass gap m(K) is given by

(II.6.37) For later purposes, we note that for all states are the same, so $\langle\cdot\rangle$ can be defined with any boundary conditions. We also warn the reader that the rate of falloff depends on direction (see Section V.9).

(2) For there is exactly one eigenvalue degenerate with the lowest one in the limit and then a gap. At the critical point , more and more eigenvalues get closer together, giving a continuous distribution beginning at the lower end in the limit as $L \to \infty$.

(3) Yang [1952] and subsequently Montroll, Potts, and Ward [1963] computed a candidate for the spontaneous magnetization. The result is the following:

Theorem II.6.12: Define the spontaneous magnetization where

i is the pressure for Hamiltonian

Then for K> Kc (the "Yang magnetization")

and We note that neither paper computed directly

for many years there was a gap in the rigorous proof in that these authors computed things that should equal M but which weren't known to be. Montroll et al. compute

with $\langle\cdot\rangle$ the limit of periodic boundary condition states. But it can be proven that with the "plus boundary condition state," that see Griffiths Thus, with recent progress, it is known that (II.6.38) really gives M.

Notice the above results say that the critical value of K determined by (i) maximum of K's for which $\langle\sigma_\alpha\sigma_0\rangle$ falls exponentially; (ii) minimum of K's for which $M(K) \neq 0$; and (iii) singular point for p(K) all agree. Unfortunately, this model (and variants) is the only model for which we rigorously know this! Notice that $M(K) \to 0$ as $K \downarrow K_c$, indeed in that limit:

(4) By carrying through the analysis above with distinct couplings $K_1$ in the direction of transfer and $K_2$ in the other direction, one finds that (II.6.33) holds if the K in 2 sinh 2K is $K_1$ and if is defined by (II.6.32) [not (II.6.32')] with K replaced by and by where is still given by

$$\sinh(2K_1)\,\sinh(2K_1^*) = 1.$$

The critical point is now given by $K_1^* = K_2$.

(5) Other examples which have been treated include the triangle and honeycomb lattices, see, for example, Syozi [1972] and Onsager [1944].

(6) No one has succeeded in computing Ising model pressures exactly if ν ≥ 3 or in external field. In the above calculations, it is clear what goes wrong in each case: the Jordan-Wigner transformation required the transfer matrix to be parameterized by a one-dimensional set of spins; thus the restriction to ν = 2. If we had an external field, we'd have exponentials of τ2τ'ζ, τΛ, and X . There would result linear terms in $\sigma$ and these do not transform nicely under Jordan-Wigner transformations.
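As a numerical companion to the solution, the sketch below evaluates the pressure in the widely quoted double-integral form of Onsager's result, $p(K) = \ln 2 + (8\pi^2)^{-1}\int_0^{2\pi}\!\!\int_0^{2\pi} \ln[\cosh^2 2K - \sinh 2K(\cos\theta_1 + \cos\theta_2)]\,d\theta_1 d\theta_2$. We assume, without having checked term by term, that this agrees with (II.6.33) when Z carries no normalizing factor; the function names and the midpoint-rule grid are our own choices.

```python
import math

import numpy as np


def dual_coupling(k):
    """K* with tanh(K*) = exp(-2K), equivalently sinh(2K) sinh(2K*) = 1."""
    return math.atanh(math.exp(-2.0 * k))


def critical_coupling():
    """Self-dual point K_c = K_c*, where sinh(2 K_c) = 1."""
    return 0.5 * math.asinh(1.0)


def onsager_pressure(k, grid=400):
    """Midpoint-rule evaluation of the double-integral form (assumed form)."""
    theta = (np.arange(grid) + 0.5) * (2.0 * math.pi / grid)
    cos_a = np.cos(theta)[:, None]
    cos_b = np.cos(theta)[None, :]
    integrand = np.log(math.cosh(2.0 * k) ** 2
                       - math.sinh(2.0 * k) * (cos_a + cos_b))
    # (1 / 8 pi^2) * integral over [0, 2 pi]^2 equals half the grid average
    return math.log(2.0) + 0.5 * float(integrand.mean())


if __name__ == "__main__":
    kc = critical_coupling()
    print("K_c =", kc)
    k = 0.3
    print(onsager_pressure(dual_coupling(k)),
          onsager_pressure(k) - math.log(math.sinh(2.0 * k)))
```

In this form the relation between the pressures at K and K* reads $p(K^*) = p(K) - \ln\sinh 2K$, since the integrand at K* equals the integrand at K divided by $\sinh^2 2K$; the two printed numbers agree to rounding error. The only point where the argument of the logarithm can vanish is $K = K_c$, $\theta_1 = \theta_2 = 0$, which is where the singularity of p(K) sits.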


II.7

Duality and Other Transformations

The striking relation (II.6.34) which relates the two-dimensional Ising pressures at two distinct temperatures was found before the exact solution and is one of a number of relations that results from transforming one model into another. One interesting aspect of such relations is that under an assumption that p is singular at only one temperature, one can sometimes determine that temperature from the relation. For example, (II.6.34) and this assumption imply that the singularity is at K = K* (this is the correct answer; i.e., the assumption is correct). Later in this section, we will make a similar determination of the transition temperature in the nearest-neighbor Ising model on the triangular and hexagonal (honeycomb) lattices.

In this section, we begin by deducing (II.6.34) and generalizations, a set of ideas going under the name duality, or more descriptively, high-temperature/low-temperature duality. We will then describe another way of going from triangular to hexagonal lattices, known as the star-triangle or "Y-Δ" transformation. Finally, we present one example of a whole class of transformations to Coulomb gases.

We begin with the Ising model (II.6.20) in a two-dimensional square of side L, with vanishing boundary conditions. We begin by describing two expansions of $Z_L(K)$, the finite-volume partition function: one, a power series in tanh K, is most rapidly convergent for K small and so it is called the high-temperature expansion (since K is inverse temperature); the other, a power series in $e^{-2K}$, is most rapidly convergent for K large and so is called the low-temperature expansion. Duality will arise from the fact that the coefficients of the two expansions are "essentially" the same. We emphasize that despite the names, both series converge for all K so long as we fix L. The situation for L = ∞ will concern us in chapter V.
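To make the two expansions concrete, consider the smallest nontrivial case, worked here as an illustration that is not in the text: a 2 x 2 square with free boundary conditions, so four spins and four nearest-neighbor bonds, with $Z$ the plain sum of $\exp(K\sum\sigma_\alpha\sigma_\gamma)$ over the 16 configurations (an assumed normalization). Of these, 2 have no broken bonds, 12 have two broken bonds, and 2 have four broken bonds, so

```latex
\begin{aligned}
Z_2(K) &= 2e^{4K} + 12 + 2e^{-4K}
        = 2e^{4K}\bigl(1 + 6\,(e^{-2K})^{2} + (e^{-2K})^{4}\bigr)
        && \text{(low-temperature form)} \\
       &= 16\bigl(\cosh^4 K + \sinh^4 K\bigr)
        = 2^{4}(\cosh K)^{4}\bigl(1 + \tanh^{4} K\bigr)
        && \text{(high-temperature form)}
\end{aligned}
```

Both forms are finite sums and so converge for every K, as emphasized above. The high-temperature form has one term for the empty graph and one for the single closed loop around the square; the low-temperature form counts configurations by their number of broken bonds.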
There are high-temperature expansions in K for virtually all models (see Section V.7) but for spin-$\tfrac12$ models, one can develop a somewhat simpler expansion in tanh K by exploiting an idea of van der Waerden [1941]: If y = ±1, then $e^{Ky} = \cosh K\,(1 + y \tanh K)$.

Thus

the number of pairs in the Hamiltonian (II.6.20) and $|\Lambda|$ the number of sites):

In (II.7.2), the sum is over all high-temperature graphs, G, defined to be an arbitrary subset of the family of nearest-neighbor bonds, |G| is the number of bonds in G and $\langle\cdot\rangle_0$ indicates $2^{-|\Lambda|}$ times the sum over all choices $\sigma_\alpha = \pm 1$ (i.e., expectations in the uncoupled system). (II.7.2) comes from expanding the product in (II.7.1). By the boundary, ∂G, of a high-temperature graph, we mean the family of sites α contained in an odd number of elements of G. This notion is quite intuitive; for example, if G has ∂G = ∅, the empty set, then G is a set of closed loops; if ∂G = {α,γ}, a single pair, then G has a path from α to γ and a family of closed loops. Note that the expectation in (II.7.2) is either 1 or 0 depending on whether or not ∂G = ∅. Thus, we have

Proposition II.7.1: Let $Z_L(K)$ be the partition function for (II.6.20) with free boundary conditions in a square box of side L. Then

where is the number of sites and is the number of bonds. Note that a similar formula holds for periodic boundary conditions except that then and we allow bonds in the graph linking sites which are not nearest neighbors in but are in the torus ("opposite" edges).

The low-temperature expansion is based on the idea of associating to each configuration, a measure of how far it is from the ground state configurations (all $\sigma_\alpha = +1$ and all $\sigma_\alpha = -1$). We thus associate a graph G to each configuration of spins by


Clearly,

so (II.7.4)

To go further, we must answer two questions: exactly which graphs arise as $G(\sigma_\alpha)$ for some $\sigma$'s and how many distinct $\sigma$'s lead to the same G. These are answered by

Proposition II.7.2: A graph G is the graph of some configuration if and only if each elementary square in the lattice (i.e., the four bonds joining the four sites at the corners of a unit square) contains either 0, 2, or 4 bonds of G. Moreover, each G with this property is $G(\sigma_\alpha)$ for precisely two $\sigma$'s.

Proof: Let $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ be the four sites in an elementary square numbered around the square. Since

we see that each has the required property. Conversely, let G be given with the required property. Construct $\sigma_\alpha$ from G by fixing with γ in the lower left corner of Λ and then determining $\sigma_\alpha$ by

where H is any graph with A simple argument shows that this is independent of the choice of since has the required property and that is G for this choice of $\sigma_\alpha$. Obviously, we get two $\sigma$'s by beginning instead with But knowing $\sigma_\gamma$ and the value of all $\sigma_\alpha\sigma_\beta$ with determines all so there are exactly two configurations.

Thus (II.7.5) with $\mathcal{G}$ the set of all graphs with the property in Prop. II.7.2, which we call low-temperature graphs.

The final step in this analysis is to note the similarity between $G \in \mathcal{G}$, which says each square has 0, 2, or 4 lines in G, and ∂G = ∅, which says that each site lies on either 0, 2, or 4 bonds. We therefore define the

dual lattice to $\mathbb{Z}^2$ to be the family of centers of squares in $\mathbb{Z}^2$. Notice every nearest-neighbor bond in $\mathbb{Z}^2$ is naturally associated to a nearest-neighbor bond in the dual lattice; for example, in figure II.7.1, sites in the original lattice are shown by dots, in the dual lattice by open circles, and a single pair of associated bonds is shown. Thus, given any graph, G, in the lattice, we can associate uniquely to it a graph, G*, in the dual lattice. In particular, every configuration has associated to it a unique graph in the dual lattice called its contour, shown in an example in figure II.7.2. Notice that contours precisely separate the regions of plus spins from the regions of minus spins. These contours will be the major heroes in chapter VI. Notice also that the contours associated to a configuration in an L × L square are graphs in an (L + 1) × (L + 1) square with two properties: (i) ∂Γ contains no points in the (L − 1) × (L − 1) "interior" of the region; (ii) Γ has no bonds between two boundary points of the (L + 1) × (L + 1) region. ((i) is just the translation of $G \in \mathcal{G}$.) Thus

Proposition II.7.3:

$$Z_L(K) = 2\,e^{2L(L-1)K} \sum_{\Gamma} \exp(-2K|\Gamma|) \tag{II.7.6}$$

the sum being over all graphs obeying (i) and (ii) above.

The types of graphs only match up to boundary conditions. By controlling these boundary errors, one can use these free boundary conditions to obtain

Fig. II.7.1. Dual lattice

Fig. II.7.2. A configuration and its contour

(II.6.34) but it is more convenient to find boundary conditions where finite-volume duality is exact. At first sight, periodic boundary conditions seem attractive since, naively, it looks like the LT periodic graphs are identical to the HT periodic graphs. Alas, this is not true; there are HT graphs of a single line winding around the torus but such a single line cannot divide the torus into two regions of plus and minus spins and so will not be an LT graph (the LT graphs are precisely the HT graphs whose $\mathbb{Z}_2$ homology is 0).

The good boundary conditions turn out to be plus boundary conditions for LT. Return to the LT expansion, but consider an (L − 1) × (L − 1) array of free spins surrounded by spins all forced to be plus. Let $Z^+_{L-1}(K)$ denote the corresponding partition function. Given a configuration in the (L − 1) × (L − 1) array, one needs to consider the associated configuration in the (L + 1) × (L + 1) array with plus spins on the outside to get the energy. The contour for any such configuration is precisely an arbitrary graph in the L × L dual graph with ∂Γ = ∅. Noting that there is no factor of two configurations for each set of contours (since we have fixed pluses outside), we conclude

Theorem II.7.4 (HT-LT duality): Let K and K* be related by

$$e^{-2K^*} = \tanh K. \tag{II.7.7}$$

Let $Z_L(K)$ denote the free b.c. partition function in a square of side L and $Z^+_{L-1}(K)$ the plus b.c. partition function in a square of side (L − 1). Then (II.7.8)

Proof: By Prop. II.7.1, the left side of (II.7.8) is exactly

while by Prop. II.7.3 in the plus case, the right side of (II.7.8) is

the set of graphs being identical (all sums are finite for L fixed so there is no convergence question). The choice (II.7.7) makes (II.7.8) hold. •

We repeat the fact noted in the last section that (II.7.7) is equivalent to $(\sinh 2K)(\sinh 2K^*) = 1$ or to $\tanh K^* = e^{-2K}$. We conclude from (II.7.8) that

Corollary II.7.5:

Proof: (II.7.8) immediately yields

tanh K, we have

since $[\sinh 2K][\sinh 2K^*] = 1$. Thus, the corollary is true.

This is the promised proof of (II.6.34) which does not go through the exact solution. The results of Thm. II.7.4 and Cor. II.7.5 (but using transfer matrices rather than expansions) were first found by Kramers and Wannier [1941] in a celebrated paper published in 1941, three years before Onsager's exact solution.
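Finite-volume duality is exact, so it can be tested by brute-force enumeration on very small squares. The sketch below assumes the Hamiltonian $-K\sum\sigma_\alpha\sigma_\gamma$ over nearest-neighbor pairs with no normalizing factor in Z. Under those assumptions the two expansions give $Z_L(K) = 2^{L^2}(\cosh K)^{B} e^{-K^*B}\,Z^+_{L-1}(K^*)$ with $B = 2L(L-1)$ bonds on either side; this is our own arrangement of the constants, which may be organized differently in (II.7.8). All names are illustrative.

```python
import itertools
import math


def free_partition_function(side, coupling):
    """Z for a side x side square with free b.c., summed over all spins."""
    total = 0.0
    for spins in itertools.product((1, -1), repeat=side * side):
        bond_sum = 0
        for x in range(side):
            for y in range(side):
                here = spins[x * side + y]
                if x + 1 < side:
                    bond_sum += here * spins[(x + 1) * side + y]
                if y + 1 < side:
                    bond_sum += here * spins[x * side + y + 1]
        total += math.exp(coupling * bond_sum)
    return total


def plus_partition_function(side, coupling):
    """Z for side x side free spins surrounded by spins fixed to +1."""
    def spin(config, x, y):
        if 0 <= x < side and 0 <= y < side:
            return config[x * side + y]
        return 1  # boundary spin, forced to be plus

    total = 0.0
    for config in itertools.product((1, -1), repeat=side * side):
        bond_sum = 0
        for x in range(side):
            for y in range(side):
                here = config[x * side + y]
                # bonds to the right and upward (possibly into the boundary)
                bond_sum += here * spin(config, x + 1, y)
                bond_sum += here * spin(config, x, y + 1)
                # bonds into the left and bottom boundary, counted once
                if x == 0:
                    bond_sum += here * spin(config, -1, y)
                if y == 0:
                    bond_sum += here * spin(config, x, -1)
        total += math.exp(coupling * bond_sum)
    return total


def dual_coupling(coupling):
    """K* with exp(-2 K*) = tanh K."""
    return -0.5 * math.log(math.tanh(coupling))


def duality_ratio(side, coupling):
    """Ratio of the two sides of the assumed duality identity; should be 1."""
    bonds = 2 * side * (side - 1)
    k_star = dual_coupling(coupling)
    lhs = free_partition_function(side, coupling)
    rhs = (2.0 ** (side * side) * math.cosh(coupling) ** bonds
           * math.exp(-k_star * bonds)
           * plus_partition_function(side - 1, k_star))
    return lhs / rhs
```

For L = 3 the enumeration runs over 512 free configurations on one side and 16 on the other, and the ratio equals 1 to rounding error for any K > 0. Dividing logarithms by $L^2$ and letting L grow, the prefactor contributes $\ln 2 + 2\ln\cosh K - 2K^*$ per site, which equals $\ln\sinh 2K$.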


Fig. II.7.3. A dual relation

Under the assumption of a single singularity in p, they predicted the transition temperature subsequently verified by Onsager, who also first understood the expansion view of duality.

Before leaving the two-dimensional square lattice model, we want to note that there are general relations between correlations also (but between products of $\sigma$'s in the HT system and a product of system). Rather than give these in general, we present one special case we will need in the next section.

Theorem II.7.6 denote the plus b.c. partition function in a rectangle of sides L, and denote the partition function in a rectangle of sides and with b.c. shown in the left side of figure II.7.3, that is, plus on the bottom and the lower spins on the sides and minus the top and upper M_ + 1 side spins. denote free b.c. expectations in a rectangle of sides L and and let α, γ denote the spins singled out in the right part of figure II.7.3, that is, at opposite sides in row measured from the bottom. Then (II.7.9)


Proof: Make an LT expansion of numerator and denominator of the left side of (II.7.9). We have already seen the denominator is (up to factors which cancel from top and bottom) the sum of all HT graphs for K in the free b.c. expansion of $Z_{M_+ + M_- + 2}$. The numerator has contours which in the dual model lead to all graphs whose boundary is exactly {α,γ}. This yields (II.7.9). •

In 1971, Wegner [1971] considered a family of models which are precisely the LT duals of Ising models in dimension higher than two. He noted they have the very interesting property of singularities of the pressure without one expecting multiple phases. It turns out that the ν = 3 model is of special interest: it is a lattice gauge theory (see Section I.2) with lattice group $\mathbb{Z}_2$ (such models hadn't been invented as lattice approximations to gauge theories in 1971!).

Here is the model: In $\mathbb{Z}^3$, let $\mathcal{P}$ denote the set of plaquettes, that is, basic unit squares formed by four neighbors. Take a model with a spin, $\sigma_\alpha$, associated to each nearest-neighbor bond in $\mathbb{Z}^3$. Given $P \in \mathcal{P}$, let $\sigma_P$ denote the product of the four spins corresponding to the four edges of P. Given a finite volume, Λ, let

$$H_\Lambda = -K \sum_{P \in \mathcal{P},\; P \subset \Lambda} \sigma_P \qquad \text{(II.7.10)}$$

and let $p_W(K)$ denote the pressure of this model, with |Λ| = number of sites, not bonds (the number of bonds ≡ number of "spins" = 3|Λ|). Let $p_I(K)$ be the pressure for the usual nearest-neighbor, three-dimensional Ising model. Then

Theorem II.7.7 (Wegner [1971]): If $e^{-2K^*} = \tanh K$, then

$$p_I(K) - \ln[2(\cosh K)^3] = p_W(K^*) - 3K^* \qquad \text{(II.7.11)}$$

Proof: The LT graphs for the model (II.7.10) are described in terms of which plaquettes have $\sigma_P = -1$. If we consider the lattice whose sites are the centers of basic cubes, and associate to a graph of plaquettes the dual lattice bond graph obtained by taking the bond perpendicular to each plaquette (see fig. II.7.4), we get an HT graph for the Ising model. By using periodic b.c., (II.7.11) results.

The above is of special interest because of the following two facts:

(i) If the three-dimensional Ising model always has either one or two phases, and if when there are two phases they have $\langle \sigma_\alpha \rangle \neq 0$, then $p_I(K)$ is $C^1$ in K in (0, ∞).

(ii) If $p_W(K)$ is $C^1$, then the corresponding model always has a unique phase.

One expects that the three-dimensional Ising model has the phase structure which would lead us to conclude that $p_I(K)$ is $C^1$ by (i) but nonanalytic. Thus by (ii) and (II.7.11), $p_W(K)$ is $C^1$, so the model has a unique phase for all K, but $p_W(K)$ does have a nonanalyticity. This shows that nonanalyticity of p need not be associated with multiple phases.


Fig. II.7.4. A plaquette and its dual bond

As a final example of duality, we will consider the honeycomb and triangular lattices. Before doing so, we should note there is a general abstract treatment of duality. McKean [1964] noted duality is intimately connected with group theory via a Poisson-summation formula (see Lemma II.7.14 below) and the general theory developed by Greenberg, Gruber, Hintermann, Merlini, and Slawny is expressed in terms of a large variety of groups (some of the flavor of these ideas will occur in Sections VI.7-9). A comprehensive review can be found in the monograph Group Analysis of Classical Lattice Systems by Gruber, Hintermann, and Merlini [1977], which contains extensive references to the original literature.

The triangular and honeycomb lattices are discussed in Section II.1 (see figs. II.1.1 and II.1.3). Let $p_t(K)$ and $p_h(K)$ denote the pressures of the models on the triangular and honeycomb lattices with Hamiltonian

$$-H = K \sum_{|\alpha - \gamma| = 1} \sigma_\alpha \sigma_\gamma$$

and normalized by dividing by the number of sites. As figure II.7.5 shows, the triangular and hexagonal lattices are dual to one another. Thus, LT graphs for the h-model will be HT graphs for the t-model (and vice-versa). Computing with periodic b.c. and bearing in mind the critical fact that the hexagonal model has twice as many sites as the triangular model (the number of bonds b is the same, but if the numbers of sites are t and h, then $b = \tfrac{3}{2}h$ and $b = 3t$, so $h = 2t$), we have

Theorem II.7.8: If K and K* are related in the usual way, then

$$p_h(K) - \ln 2 - \tfrac{3}{2} \ln[\cosh K] = \tfrac{1}{2}\, p_t(K^*) - \tfrac{3}{2} K^* \qquad \text{(II.7.12)}$$


Fig. II.7.5. A pair of dual lattices

Proof: Up to boundary terms which vanish in the limit, duality says that

(II.7.12) comes from this (if we use

by taking logs, dividing by h, and taking h to infinity.

(II.7.12) has a more symmetric-looking form. By following the proof of Cor. II.7.5, (II.7.12) becomes


Fig. II.7.6. Hexagonal/triangle Y-Δ relation

There is a second relation between $p_t$ and $p_h$ found by Onsager [1944] which depends on the geometry in figure II.7.6 and the following basic idea:

Lemma II.7.9 (Y-Δ transformation)

be four ±1 spins. Then

where (II.7.15) (II.7.16)

Proof: The left side of (II.7.14) is clearly a symmetric function of the σ's, invariant under flipping all spins. It thus only takes two possible values, depending on whether We can clearly fit (II.7.14) if (II.7.17) (II.7.18) which are equivalent to (II.7.15) and (II.7.16).

The name comes from the fact that if we join interacting spins, the four spins interact in a Y (star) and after summing out we get a triangle. Applying this idea to figure II.7.6 we find:

Theorem II.7.10: Let K, R,

be as above. Then (II.7.19)
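The two-value matching used in the proof of Lemma II.7.9 is easy to verify numerically. The following sketch assumes the lemma in the form $\sum_{\sigma_0 = \pm 1} \exp[K\sigma_0(\sigma_1 + \sigma_2 + \sigma_3)] = A \exp[R(\sigma_1\sigma_2 + \sigma_2\sigma_3 + \sigma_3\sigma_1)]$; the labels $\sigma_0, \ldots, \sigma_3$ and the prefactor name A are choices made here for illustration, and the resulting expressions for R and A may be written differently from (II.7.15) and (II.7.16).

```python
import itertools
import math


def star_triangle(K):
    """Return (A, R) fitting the star sum to a triangle interaction.

    Matching the two possible values of the star sum gives
      all outer spins equal:    2 cosh(3K) = A exp(3R)
      one outer spin different: 2 cosh(K)  = A exp(-R)
    """
    R = 0.25 * math.log(math.cosh(3 * K) / math.cosh(K))
    A = 2 * math.cosh(K) * math.exp(R)
    return A, R


def max_mismatch(K):
    """Largest relative difference between star and triangle weights over all outer spins."""
    A, R = star_triangle(K)
    worst = 0.0
    for s1, s2, s3 in itertools.product((1, -1), repeat=3):
        # Sum out the central spin of the Y.
        star = sum(math.exp(K * s0 * (s1 + s2 + s3)) for s0 in (1, -1))
        triangle = A * math.exp(R * (s1 * s2 + s2 * s3 + s3 * s1))
        worst = max(worst, abs(star - triangle) / star)
    return worst


if __name__ == "__main__":
    for K in (0.2, 0.7, 1.5):
        print(K, star_triangle(K), max_mismatch(K))
```

Only two equations are needed for the eight outer configurations because the star sum depends only on whether the three outer spins are all equal, which is exactly the symmetry observation in the proof.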


Combining (II.7.13) and (II.7.17) we find

Theorem II.7.11: (a) Given K, define K by (II.7.20) Then (II.7.21)

(b) Given R, define R' by (II.7.22) Then (II.7.23)

Proof: (a) Given define in the usual way and then (2 cosh That is given by (II.7.20) is then straightforward hyperbolic trigonometric manipulation as is the formula

(use (II.7.18) needed to obtain (II.7.21) from (II.7.13) and (II.7.17)). Given obeying (II.7.20), define ' by

(II.7.22) is obvious and (II.7.23) follows from (II.7.21) and (II.7.17).

The following corollary should, if viewed strictly as a corollary, have the phrase "assuming there is a unique singularity of p" but the exact solutions show this, so we don't put in the phrase.

Corollary II.7.12: (a) The critical

for the hexagonal model is given by (cosh

(b) The critical R for the triangular model is given by


Remark: In mean field theory (see Sections II.13 and II.14), the critical K is 1/η where η is the number of spins interacting with a given spin (3, 4, 6, respectively, for the hexagonal, square, and triangular models), so one should look at $3K_c$, $4K_c$, and $6R_c$, which have values 1.98, 1.76, and 1.65, respectively (in line with the expected limit 1 as η → ∞). Even slower variation is found if one uses Fisher's bound (Section V.6) to suggest it might be even better to multiply by η - 1, or look at $2K_c$, $3K_c$, and $5R_c$ with values 1.32, 1.32, and 1.37.

In addition to the Y-Δ transformation, there is a procedure of introducing new spins called "decoration." An extensive review can be found in the article of Syozi [1972].
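The numerical values quoted in the Remark can be reproduced in a few lines. The sketch below takes as an assumption the standard closed forms for the critical couplings, namely $\cosh 2K_c = 2$ for the hexagonal lattice, $\sinh 2K_c = 1$ for the square lattice, and $e^{4R_c} = 3$ for the triangular lattice; the constant and function names are introduced here for illustration only.

```python
import math

# Assumed closed forms for the critical couplings (see the lead-in).
K_HEX = 0.5 * math.acosh(2.0)                 # cosh(2 K_c) = 2
K_SQUARE = 0.5 * math.log(1 + math.sqrt(2))   # sinh(2 K_c) = 1
K_TRI = 0.25 * math.log(3.0)                  # exp(4 R_c) = 3

# Coordination numbers (eta) of the three lattices.
COORDINATION = {"hexagonal": 3, "square": 4, "triangular": 6}
CRITICAL = {"hexagonal": K_HEX, "square": K_SQUARE, "triangular": K_TRI}


def scaled_couplings(shift=0):
    """Return (eta - shift) * K_c for each lattice, rounded to two decimals."""
    return {
        name: round((COORDINATION[name] - shift) * CRITICAL[name], 2)
        for name in COORDINATION
    }


def dual_coupling(K):
    """Return K* defined by exp(-2 K*) = tanh K."""
    return -0.5 * math.log(math.tanh(K))


if __name__ == "__main__":
    print("eta * K_c      :", scaled_couplings(0))
    print("(eta - 1) * K_c:", scaled_couplings(1))
    # The hexagonal critical point is dual to the triangular one.
    print("dual of K_HEX:", dual_coupling(K_HEX), " K_TRI:", K_TRI)
```

With these inputs the products η K_c and (η - 1) K_c round to the values stated in the Remark, and the dual of the hexagonal critical coupling coincides with the triangular one, as the duality between the two lattices requires.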

* * *

There is one last "duality" we want to briefly indicate, which will relate certain two-dimensional nearest-neighbor models to Coulomb gases. These ideas go back to the fundamental work of Kosterlitz-Thouless [1973] and Villain [1975] and have been developed especially by José et al. [1977] and Kadanoff-Zisook [1981], and in the work of Fröhlich and Spencer [1981a], [1981b], and [1981c]. As we will see, there is a connection with the Sine-Gordon transformation discussed in Section II.4. We will only indicate the ideas by giving one equality on the level of partition functions. Many of the more subtle applications involve equalities of correlation functions where simple correlations in the nearest-neighbor model correspond to expectations of lines of dipoles.

We will consider plane rotors but with the usual $\cos(\theta_\alpha - \theta_\gamma)$ replaced by a slightly different interaction. Let

$$V_\beta(x) = -\ln\Big[\sum_{n=-\infty}^{\infty} \exp\big[-\tfrac{1}{2}\beta (x - 2\pi n)^2\big]\Big].$$

We will consider an L × L array, Λ, of plane rotors with interaction energy

$$H = \sum_{|\alpha - \gamma| = 1} V_\beta(\theta_\alpha - \theta_\gamma). \qquad \text{(II.7.24)}$$
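A quick numerical look at $V_\beta$ may help fix ideas. The sketch below evaluates the defining sum with a symmetric truncation $|n| \le n_{\max}$, which is an approximation chosen here (the cutoff `n_max` and the function names are not from the text), and compares the result with the quadratic $\tfrac{1}{2}\beta x^2$ and with the shifted cosine $\beta(1 - \cos x)$ for small x and moderately large β.

```python
import math


def villain_potential(x, beta, n_max=50):
    """V_beta(x) = -ln sum_n exp(-beta (x - 2 pi n)^2 / 2), truncated to |n| <= n_max."""
    total = sum(
        math.exp(-0.5 * beta * (x - 2 * math.pi * n) ** 2)
        for n in range(-n_max, n_max + 1)
    )
    return -math.log(total)


def shifted_cosine(x, beta):
    """-beta cos x with the constant -beta removed, for comparison."""
    return beta * (1.0 - math.cos(x))


if __name__ == "__main__":
    beta = 10.0
    for x in (0.05, 0.1, 0.3):
        print(
            x,
            villain_potential(x, beta),
            0.5 * beta * x * x,
            shifted_cosine(x, beta),
        )
```

For β of this size only the n = 0 term matters near x = 0, so the first two columns agree to many digits, while the cosine differs at order $x^4$. Unlike the plain quadratic, $V_\beta$ is 2π-periodic, which is what makes it a sensible interaction between angles.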

Notice that for β large, H looks just like the more usual plane rotor Hamiltonian (up to a constant), in the sense that when β is large, only low energy configurations, that is, ones with $\theta_\alpha - \theta_\gamma \approx 0$, enter. For x small, $V_\beta(x) \approx \tfrac{1}{2}\beta x^2$ and $-\beta \cos x \approx \text{const} + \tfrac{1}{2}\beta x^2$. Thus, while V is somewhat artificial, it is a good approximation to plane rotors and has a much simpler duality. The model with Hamiltonian (II.7.24) is called the Villain model.

We will study this model with free b.c.; that is, we include no interactions with the outside. Let $\Lambda^*$ be the dual lattice to Λ, that is, the (L - 1) × (L - 1) array of centers of basic lattice squares in Λ. Let Λ' be the (L + 1) × (L + 1) array of sites (with the corners removed), whose dual lattice is Λ, that is, the set $\Lambda^*$ and its nearest


neighbors. Given {