English Pages 327 Year 1987
Ricardo Mañé
Ergodic Theory and Differentiable Dynamics Translated from the Portuguese by Silvio Levy With 32 Figures
Springer-Verlag Berlin Heidelberg New York London Paris Tokyo
Ricardo Mañé
I.M.P.A. Estrada Dona Castorina 110 22460 Rio de Janeiro RJ Brasil
Silvio Levy Dept. of Mathematics Princeton University Princeton, New Jersey, 08544 U.S.A.
Title of the original Portuguese edition: Introdução à Teoria Ergódica. © 1983 by Ricardo Mañé
Mathematics Subject Classification (1980): 58F11, 58F15
ISBN 3-540-15278-4 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-15278-4 Springer-Verlag New York Berlin Heidelberg

Library of Congress Cataloging-in-Publication Data
Mañé, Ricardo, 1948- Ergodic theory and differentiable dynamics. (Ergebnisse der Mathematik und ihrer Grenzgebiete; 3. Folge, Bd. 8) Translation of: Introdução à teoria ergódica. Bibliography: p. Includes index. 1. Ergodic theory. 2. Measure theory. I. Title. II. Series. QA614.M3613 1987 515.4'2 86-25983 ISBN 0-387-15278-4 (U.S.)

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1987 Printed in Germany
Typesetting: Asco Trade Typesetting Ltd., Hong Kong Printing and bookbinding: Konrad Triltsch, Würzburg 2141/3140-543210
To Alexandre, Claus, Paulinho and Zeze
Preface to the English Edition
This version differs from the Portuguese edition only in a few additions and many minor corrections. Naturally, this edition raised the question of whether to use the opportunity to introduce major additions. In a book like this, ending in the heart of a rich research field, there are always further topics that should arguably be included. Subjects like geodesic flows or the role of Hausdorff dimension in contemporary ergodic theory are two of the most tempting gaps to fill. However, I let it stand with practically the same boundaries as the original version, still believing these adequately fulfill its goal of presenting the basic knowledge required to approach the research area of Differentiable Ergodic Theory. I wish to thank Dr. Levy for the excellent translation and several of the corrections mentioned above. Rio de Janeiro, January 1987
Ricardo Mañé
Introduction
This book is an introduction to ergodic theory, with emphasis on its relationship with the theory of differentiable dynamical systems, which is sometimes called differentiable ergodic theory. Chapter 0, a quick review of measure theory, is included as a reference. Proofs are omitted, except for some results on derivatives with respect to sequences of partitions, which are not generally found in standard texts on measure and integration theory and tend to be lost within a much wider framework in more advanced texts. Chapter I starts with a quick and superficial introduction, then presents the main kinds of dynamical systems around which ergodic theory has developed. This development itself starts in chapter II, devoted to the classical concepts and theorems. Chapters III and IV are devoted to contemporary ergodic theory, born in 1958 with the introduction of the notion of entropy by Kolmogorov and developed primarily by Sinai, Anosov, Bowen and Ornstein, in the sixties and seventies. Chapter III is a typical example of differentiable ergodic theory. It studies ergodic properties of Anosov diffeomorphisms and expanding maps. The techniques used in this analysis have become classical, and remain the conceptual foundation for a good part of today's research. Entropy is the subject of chapter IV; we start with the basic formalism and the calculation of simple examples, then discuss topological entropy, the variational principle of entropy and the construction of the unique entropy-maximizing measure for hyperbolic homeomorphisms. We conclude with Lyapunov exponents, the Pesin formula for the entropy of volume-preserving diffeomorphisms, and the Brin-Katok local entropy formula. We have included many advanced results without proof, in the belief that an introductory text does not have to deprive the reader of a comprehensive and up-to-date panorama of the subject. In particular, we state Ornstein's famous classification theorem. 
There are good and readily accessible expositions of this result (see references in section I.12), so we see no point in plagiarizing them here. The theorems of Katok and Pesin (sections IV.15 and IV.10) are a different story: there seem to be as yet no pedagogical treatments of them. A third kind of result quoted without proof is exemplified by Manning's theorem on the linearization of Anosov diffeomorphisms (section IV.15): strictly speaking, such results are outside the main stream of ideas presented in this work, but familiarity with them is fundamental to a balanced, global understanding of our subject. A good part of the information in this book is contained in the exercises. This is intentional, and a careful reading, at least, of all the exercises is essential.
The reader is also encouraged to concentrate on a careful understanding of new ideas and statements, and not so much on proofs, in a first reading. The proofs are often arid and demanding, and a less motivated reader may well be turned away if he attempts to go through all of them. I would like to thank Elon Lima for asking me to write this book: his insistence during slack periods was decisive in its coming to light. Alexandre Freire helped me immensely, proofreading the original and contributing relevant comments.
Table of Contents
Chapter 0. Measure Theory
1. Measures
2. Measurable Maps
3. Integrable Functions
4. Differentiation and Integration
5. Partitions and Derivatives

Chapter I. Measure-Preserving Maps
1. Introduction
2. The Poincaré Recurrence Theorem
3. Volume-Preserving Diffeomorphisms and Flows
4. First Integrals
5. Hamiltonians
6. Continued Fractions
7. Topological Groups, Lie Groups, Haar Measure
8. Invariant Measures
9. Uniquely Ergodic Maps
10. Shifts: the Probabilistic Viewpoint
11. Shifts: the Topological Viewpoint
12. Equivalent Maps

Chapter II. Ergodicity
1. Birkhoff's Theorem
2. Ergodicity
3. Ergodicity of Homomorphisms and Translations of the Torus
4. More Examples of Ergodic Maps
5. The Theorem of Kolmogorov-Arnold-Moser
6. Ergodic Decomposition of Invariant Measures
7. Furstenberg's Example
8. Mixing Automorphisms and Lebesgue Automorphisms
9. Spectral Theory
10. Gaussian Shifts
11. Kolmogorov Automorphisms
12. Mixing and Ergodic Markov Shifts

Chapter III. Expanding Maps and Anosov Diffeomorphisms
1. Expanding Maps
2. Anosov Diffeomorphisms
3. Absolute Continuity of the Stable Foliation

Chapter IV. Entropy
1. Introduction
2. Proof of the Shannon-McMillan-Breiman Theorem
3. Entropy
4. The Kolmogorov-Sinai Theorem
5. Entropy of Expanding Maps
6. The Parry Measure
7. Topological Entropy
8. The Variational Property of Entropy
9. Hyperbolic Homeomorphisms
10. Lyapunov Exponents. The Theorems of Oseledec and Pesin
11. Proof of Oseledec's Theorem
12. Proof of Ruelle's Inequality
13. Proof of Pesin's Formula
14. Entropy of Anosov Diffeomorphisms
15. Hyperbolic Measures. Katok's Theorem
16. The Brin-Katok Local Entropy Formula

Bibliography
Notation Index
Subject Index
Chapter 0. Measure Theory
The purpose of this chapter is to present the basic definitions and theorems of measure theory. Proofs are only included when they cannot be found in standard references or when the formulation of the statement involved differs significantly from the usual one. The basic references for this chapter are Rudin [R6] and Munroe [M13].
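1. Measures

Let $X$ be a set. A family $\mathcal{A}$ of subsets of $X$ is called an algebra if
$$X \in \mathcal{A}, \qquad A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}, \qquad A, B \in \mathcal{A} \Rightarrow A \cup B \in \mathcal{A}.$$
It follows from these properties that
$$A, B \in \mathcal{A} \Rightarrow A \cap B = (A^c \cup B^c)^c \in \mathcal{A}, \qquad A, B \in \mathcal{A} \Rightarrow A \setminus B := A \cap B^c \in \mathcal{A}.$$
We say that $\mathcal{A}$ is a $\sigma$-algebra if
$$A_i \in \mathcal{A},\ i \ge 1 \ \Rightarrow\ \bigcup_i A_i \in \mathcal{A}.$$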
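As a small illustration of these axioms, the following Python sketch checks them on a finite set, where every algebra is automatically a $\sigma$-algebra. The function names `is_algebra` and `generated_algebra` are illustrative choices and do not come from the text; the closure procedure is one simple way to build the smallest algebra containing a given family, under the assumption that $X$ is finite.

```python
from itertools import combinations


def is_algebra(X, family):
    """Check the algebra axioms for a family of subsets of the finite set X."""
    X = frozenset(X)
    fam = {frozenset(A) for A in family}
    if X not in fam:                                   # X must belong to the family
        return False
    if any(X - A not in fam for A in fam):             # closed under complements
        return False
    return all(A | B in fam for A, B in combinations(fam, 2))  # closed under unions


def generated_algebra(X, family):
    """Smallest algebra of subsets of the finite set X containing `family`."""
    X = frozenset(X)
    fam = {X} | {frozenset(A) for A in family}
    while True:
        # Add all complements and pairwise unions until nothing new appears.
        new = {X - A for A in fam} | {A | B for A in fam for B in fam}
        if new <= fam:
            return fam
        fam |= new


if __name__ == "__main__":
    algebra = generated_algebra({1, 2, 3, 4}, [{1}, {2}])
    print(len(algebra), is_algebra({1, 2, 3, 4}, algebra))
```

Here the atoms of the generated algebra are $\{1\}$, $\{2\}$ and $\{3,4\}$, so the algebra consists of all unions of these three sets.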
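If $\mathcal{A}_0$ is a family of subsets of $X$, we say that a $\sigma$-algebra $\mathcal{A}$ is generated by $\mathcal{A}_0$ if $\mathcal{A}_0 \subset \mathcal{A}$ and every $\sigma$-algebra $\mathcal{A}'$ of subsets of $X$ containing $\mathcal{A}_0$ also contains $\mathcal{A}$. If $(\mathcal{A}_n)_{n \ge 1}$ is a sequence of families of subsets of $X$, we denote by $\bigvee_{n \ge 1} \mathcal{A}_n$ the $\sigma$-algebra generated by $\bigcup_{n \ge 1} \mathcal{A}_n$. If $\mathcal{A}$ is an algebra of subsets of $X$, we say that $\mu\colon \mathcal{A} \to [0, +\infty]$ is a measure if, for every family $(A_i)_{i \ge 1}$ of disjoint subsets of $X$ such that $A_i \in \mathcal{A}$, $i \ge 1$, and $\bigcup_{i \ge 1} A_i \in \mathcal{A}$, the following holds:
$$\mu\Big(\bigcup_{i \ge 1} A_i\Big) = \sum_{i \ge 1} \mu(A_i).$$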
If $X$ is a topological space, we define the Borel $\sigma$-algebra of $X$ as the $\sigma$-algebra generated by the family of open sets of $X$. Sets in the Borel $\sigma$-algebra of $X$ are called Borel subsets of $X$.

Theorem 1.1. Let $\mathcal{A}$ be the Borel $\sigma$-algebra of $\mathbb{R}^n$. Then there exists a unique measure $\lambda\colon \mathcal{A} \to [0, +\infty]$ such that, for every open box $A := (a_1, b_1) \times \cdots \times (a_n, b_n)$, we have
$$\lambda(A) = (b_1 - a_1) \cdots (b_n - a_n).$$
This measure is called the Lebesgue measure.

A measure space is a triple $(X, \mathcal{A}, \mu)$, where $\mathcal{A}$ is a $\sigma$-algebra of subsets of $X$ and $\mu\colon \mathcal{A} \to [0, +\infty]$ is a measure. We say that $X$ is $\sigma$-finite if $X$ can be written as a countable union $X = \bigcup_{n \ge 1} A_n$ with $\mu(A_n) < +\infty$ for all $n$. If $\mu(X) = 1$ we call $X$ a probability space.

Let $(X, \mathcal{A}, \mu)$ be a measure space. A set $A \subset X$ has measure zero if there exists $A_1 \in \mathcal{A}$ such that $A \subset A_1$ and $\mu(A_1) = 0$. We call two sets $A_1, A_2 \subset X$ equivalent modulo zero, and we write $A_1 = A_2$ (mod 0), if the symmetric difference $A_1 \,\Delta\, A_2$ has measure zero. If $\mathcal{S}$ is a family of subsets of $X$, we write $A \in \mathcal{S}$ (mod 0) if there is a set $A_0 \in \mathcal{S}$ such that $A = A_0$ (mod 0). We define
$$\mathcal{S}(\mathrm{mod}\ 0) := \{A \subset X \mid A \in \mathcal{S}\ (\mathrm{mod}\ 0)\}.$$
We say that $\mathcal{S}$ generates $\mathcal{A}$ (mod 0) if $\mathcal{A} = \mathcal{A}_0$ (mod 0), where $\mathcal{A}_0$ is the $\sigma$-algebra generated by $\mathcal{S}$. We write $A \subset B$ (mod 0) if $A$ is a subset of $B$ and $A = B$ (mod 0). Finally, we say that a property of points of a set $S \subset X$ holds almost everywhere (a.e.) in $S$, or for almost every point in $S$, if the set of points of $S$ where it does not hold has measure zero.

Theorem 1.2 (Approximation theorem). If $(X, \mathcal{A}, \mu)$ is a probability space, a subalgebra $\mathcal{A}_0 \subset \mathcal{A}$ generates $\mathcal{A}$ (mod 0) if and only if, for every $A \in \mathcal{A}$ and $\varepsilon > 0$, there exists $A_0 \in \mathcal{A}_0$ such that $\mu(A \,\Delta\, A_0) \le \varepsilon$.

Theorem 1.3 (Lusin). Let $X$ be a locally complete separable metric space, $\mathcal{A}$ the Borel $\sigma$-algebra of $X$, and $\mu\colon \mathcal{A} \to [0, 1]$ a measure. For every Borel set $A \subset X$ and every $\varepsilon > 0$ there exists a compact set $K \subset A$ such that $\mu(A \setminus K) \le \varepsilon$.

Theorem 1.4 (Additivity criterion). Let $\mathcal{A}$ be an algebra of subsets of $X$ and $\mu\colon \mathcal{A} \to [0, 1]$ a function such that
$$\mu(A_1 \cup \cdots \cup A_m) = \sum_{i=1}^{m} \mu(A_i)$$
for every family $(A_i)_{1 \le i \le m}$, $A_i \in \mathcal{A}$, of disjoint sets. Suppose there is a family [...]

[...] $> 0$ and choose $N := N(A, \varepsilon) > 0$ satisfying the hypothesis of the theorem. Given $n \ge N$, take an $A_n \in \mathcal{A}$ which is a union of atoms in $\mathcal{P}_n$ and such that $\mu(A \,\Delta\, A_n) \le \varepsilon$. Observe that $A(\lambda, n)$ is also a union of atoms in $\mathcal{P}_n$. It follows that we can write
$$A_n \cap A(\lambda, n) = \bigcup_{j=1}^{m} \mathcal{P}_n(x_j),$$
where the union is disjoint and $x_1, \dots, x_m$ belong to $A(\lambda, n)$. Then
$$\mu(A \cap A_n \cap A(\lambda, n)) = \sum_j \mu(\mathcal{P}_n(x_j) \cap A) \le \lambda \sum_j \mu(\mathcal{P}_n(x_j)) = \lambda \mu(A_n \cap A(\lambda, n)).$$
On the other hand, the inequality $\mu(A \,\Delta\, A_n) \le \varepsilon$ implies that
$$\mu(A_n \cap A(\lambda, n)) \le \varepsilon + \mu(A \cap A(\lambda, n)),$$
$$\mu(A \cap A_n \cap A(\lambda, n)) \ge \mu(A \cap A(\lambda, n)) - \varepsilon.$$
We conclude that
$$\mu(A \cap A(\lambda, n)) - \varepsilon \le \lambda\varepsilon + \lambda \mu(A \cap A(\lambda, n)),$$
or again
$$\mu(A \cap A(\lambda, n)) \le \frac{1 + \lambda}{1 - \lambda}\, \varepsilon.$$
This shows that
$$\lim_{n \to +\infty} \mu(A \cap A(\lambda, n)) = 0$$
for every $0 < \lambda < 1$. Since $F_n(x) \le 1$ for every $x$, and $\lambda$ can be taken arbitrarily close to 1, we conclude that $F_n|A$ converges in measure to the constant function 1. Another application of this result to $A^c$ completes the proof of the theorem. □

The next theorem provides a simple criterion to determine, when $X$ is a metric space, whether a sequence of partitions satisfies the assumptions of the preceding theorem:

Theorem 5.5. Let $(X, \mathcal{A}, \mu)$ be a probability space, where $X$ is a locally complete separable metric space and $\mathcal{A}$ is the Borel $\sigma$-algebra of $X$. Let $(\mathcal{P}_n)_{n \ge 1}$ be a sequence
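of partitions such that
$$\lim_{n \to +\infty} \Big( \sup_{P \in \mathcal{P}_n} \operatorname{diam}(P) \Big) = 0.$$
Then the sequence $\mathcal{P}_n$ satisfies the assumptions of theorem 5.4, and $\mathcal{A}$ is equal (mod 0) to $\bigvee_{n=1}^{\infty} \mathcal{P}_n$.

Proof. Let $A \in \mathcal{A}$ and $\varepsilon > 0$. By theorem 1.3 there exists a compact set $K \subset A$ such that $\mu(A \setminus K) \le \varepsilon/2$. Let $U_1 \supset U_2 \supset \cdots$ be a sequence of open sets such that $\bigcap_{n=1}^{\infty} U_n = K$. Then $\lim_{n \to +\infty} \mu(U_n \setminus K) = 0$. Let $m > 0$ be such that
$$\mu(U_m \setminus K) \le \varepsilon/2,$$
and choose $N > 0$ so that
$$\sup_{P \in \mathcal{P}_n} \operatorname{diam}(P) \le d(K, U_m^c)$$
for $n \ge N$. Then, for $n \ge N$, the set
$$A_n := \bigcup \{P \in \mathcal{P}_n \mid P \cap K \ne \emptyset\}$$
satisfies
$$K \subset A_n \subset U_m,$$
so that
$$\mu(A_n \,\Delta\, A) \le \varepsilon/2 + \mu(A_n \setminus K) \le \varepsilon/2 + \mu(U_m \setminus K) \le \varepsilon,$$
proving the first assertion. To prove the second, we observe that by the reasoning above any $A \in \mathcal{A}$ can be approximated by $A_n \in \bigvee_{n=1}^{\infty} \mathcal{P}_n$, so by the approximation theorem 1.2 $\mathcal{A}$ is equal (mod 0) to $\bigvee_{n=1}^{\infty} \mathcal{P}_n$. □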
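To make the statement concrete, here is a minimal Python sketch for the dyadic partitions of $[0, 1)$ with Lebesgue measure, whose atoms have diameter $2^{-n}$. It assumes, as the proof of theorem 5.4 suggests, that the quantity of interest is the ratio $\mu(\mathcal{P}_n(x) \cap A)/\mu(\mathcal{P}_n(x))$, and it takes $A$ to be an interval $[a, b)$; the function names are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from math import floor


def dyadic_atom(x, n):
    """Endpoints of the level-n dyadic interval [k/2^n, (k+1)/2^n) containing x."""
    k = floor(x * 2**n)
    return Fraction(k, 2**n), Fraction(k + 1, 2**n)


def interval_overlap(a, b, c, d):
    """Lebesgue measure of the intersection of [a, b) and [c, d)."""
    return max(Fraction(0), min(b, d) - max(a, c))


def conditional_ratio(x, n, a, b):
    """Ratio mu(P_n(x) ∩ A) / mu(P_n(x)) for A = [a, b) and the dyadic atom P_n(x)."""
    left, right = dyadic_atom(x, n)
    return interval_overlap(left, right, a, b) / (right - left)


if __name__ == "__main__":
    x, a, b = Fraction(1, 5), Fraction(0), Fraction(1, 3)
    for n in range(1, 5):
        print(n, conditional_ratio(x, n, a, b))
```

For the point $x = 1/5$ in $A = [0, 1/3)$, the ratio equals $2/3$ at level 1 and equals 1 as soon as the atom containing $x$ lies inside $A$.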
To conclude, we formulate a criterion to determine when, given a probability space $(X, \mathcal{A}, \mu)$, there exists a countable family $\mathcal{S} \subset \mathcal{A}$ which generates $\mathcal{A}$ (mod 0). Such probability spaces are called separable.

Theorem 5.6. Given a probability space $(X, \mathcal{A}, \mu)$, the following properties are equivalent:
a) $(X, \mathcal{A}, \mu)$ is separable;
b) $\mathcal{L}^1(X, \mathcal{A}, \mu)$ is separable;
c) $\mathcal{L}^p(X, \mathcal{A}, \mu)$ is separable for every $1 \le p < \infty$;
d) There exists a countable subalgebra $\mathcal{A}_0 \subset \mathcal{A}$ which generates $\mathcal{A}$ (mod 0);
e) There exists a sequence of partitions $\mathcal{P}_1 \le \mathcal{P}_2 \le \cdots$ such that $\bigvee_{n=1}^{\infty} \mathcal{P}_n$ generates $\mathcal{A}$ (mod 0).
Chapter I. Measure-Preserving Maps
1. Introduction

If $(X, \mathcal{A}, \mu)$ is a measure space, we say that a measurable map $T\colon X \to X$ is measure-preserving, and that $\mu$ is invariant under $T$, if for every $A \in \mathcal{A}$ we have $\mu(A) = \mu(T^{-1}(A))$. The dynamic behavior of measure-preserving maps is the theme of ergodic theory. This section elaborates on this last assertion by introducing, in an informal way, some objects and methods of ergodic theory. The remainder of the chapter will be devoted to a formal presentation of the more basic of these objects. The development of the methods will be postponed till the next chapter.

We start with the situation which led Poincaré to prove the first theorem of ergodic theory. Consider a differential equation
$$\dot{x} = f(x) \tag{1}$$
defined in an open set $U \subset \mathbb{R}^m$. Suppose that $f$ is of class $C^1$ and that it satisfies the following assumptions:
a) For every $p \in U$ there exists a solution $x\colon \mathbb{R} \to U$ of (1) with initial condition $p$, i.e. such that $x(0) = p$.
b) If $f = (f_1, \dots, f_m)$, then
$$\sum_{i=1}^{m} \frac{\partial f_i}{\partial x_i}(p) = 0$$
for every $p \in U$, i.e. $f$ is a field whose divergence vanishes.
c) The Lebesgue measure of $U$ is finite.

Under these conditions, Poincaré [P4] proved that for almost every $p \in U$ the solution $x\colon \mathbb{R} \to U$ of (1) with initial condition $p$ is recurrent, i.e.
$$\liminf_{t \to +\infty} \|x(t) - p\| = 0.$$
Assumptions (a), (b) and (c) are often satisfied by equations in Mechanics, as shall be explained in section 4. The importance of Poincaré's result is that it shows that if the evolution of a system is described by an equation satisfying these assumptions, then the system returns infinitely often to configurations arbitrarily
close to the initial one, except for a set of initial configurations which can be neglected from the probabilistic point of view, since it has zero Lebesgue measure. The above result is proved by combining elementary properties of differential equations with a theorem, now known as the Poincaré recurrence theorem, which states the following: Let $K$ be a separable metric space, $T\colon K \to K$ a measurable map, and $\mu$ a finite Borel measure on $K$ which is invariant under $T$. Then almost every point $x$ is recurrent, i.e. satisfies
$$\liminf_{n \to +\infty} d(T^n(x), x) = 0.$$
"Almost every point" here refers to the measure $\mu$, so an equivalent statement is that
$$\mu\Big(\Big\{x \,\Big|\, \liminf_{n \to +\infty} d(T^n(x), x) > 0\Big\}\Big) = 0.$$
In order to get from the Poincaré recurrence theorem to the result about differential equations, consider the one-parameter family of maps $\varphi_t\colon U \to U$, $t \in \mathbb{R}$, defined by putting $\varphi_t(p) := x(t)$, where $p \in U$ and $x\colon \mathbb{R} \to U$ is the solution of (1) satisfying $x(0) = p$. It is a basic result in the theory of differential equations that $\varphi_t$ is a diffeomorphism of $U$ for each $t$, and that $\varphi_t \varphi_s = \varphi_{t+s}$ for each $t$ and $s$. Assumption (b) implies, as will be seen in section 3, that $\varphi_t$ preserves the Lebesgue measure for all $t$. Applying the Poincaré recurrence theorem to $K = U$, $T = \varphi_1$ and $\mu$ the Lebesgue measure, we obtain that
$$\liminf_{n \to +\infty} d(\varphi_1^n(p), p) = 0$$
for almost every $p \in U$. But, from the relation $\varphi_{t+s} = \varphi_t \varphi_s$, it immediately follows that
$$\varphi_1^n(p) = \varphi_n(p),$$
and, from the definition of $\varphi_t$, we know that $\varphi_n(p) = x(n)$, where $x\colon \mathbb{R} \to U$ is the solution of (1) with initial condition $p$. Thus
$$\liminf_{t \to +\infty} d(x(t), p) \le \liminf_{n \to +\infty} d(x(n), p) = \liminf_{n \to +\infty} d(\varphi_1^n(p), p) = 0.$$
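The recurrence statement can be observed numerically in the simplest measure-preserving example, a rotation of the circle $\mathbb{R}/\mathbb{Z}$, which preserves Lebesgue measure. The Python sketch below is only an illustration under that assumption (the rotation is not the flow discussed above), and the function names are hypothetical. It tracks how close the orbit of a point returns to its starting position.

```python
import math


def rotation(x, alpha):
    """One step of the rotation x -> x + alpha (mod 1) on the circle R/Z."""
    return (x + alpha) % 1.0


def circle_distance(x, y):
    """Distance between two points of R/Z, measured along the shorter arc."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)


def closest_return(x0, alpha, n_steps):
    """Smallest distance d(T^n(x0), x0) over 1 <= n <= n_steps."""
    x = x0
    best = 1.0
    for _ in range(n_steps):
        x = rotation(x, alpha)
        best = min(best, circle_distance(x, x0))
    return best


if __name__ == "__main__":
    for n_steps in (10, 100, 1000):
        print(n_steps, closest_return(0.3, math.sqrt(2), n_steps))
```

With $\alpha = \sqrt{2}$ the closest return distance shrinks as more iterates are allowed, in agreement with $\liminf_{n} d(T^n(x), x) = 0$.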
Billiards are another interesting class of examples. Take a bounded open set $U$ in the plane whose boundary $\partial U$ is a finite union of simple closed curves which we assume for simplicity to be of class $C^\infty$. Consider a point particle moving inside $U$ along straight lines and hitting the boundary in perfectly elastic collisions; this means that the directions of incidence and reflection form the same angle with the tangent to the boundary at the contact point. In order to study the movement of the particle, we introduce the space $K := (0, \pi) \times \partial U$ and the map $T\colon K \to K$ defined as follows: For $(\alpha, p) \in K$ we consider the particle hitting the boundary at point $p$, coming from a direction that forms an angle $\alpha$ with the tangent vector to $\partial U$ at $p$. This information determines the new direction of movement of the particle, and hence the point $q$ where the particle hits the boundary again for the first time, as well as the new angle of incidence $\beta$. We then put $T(\alpha, p) := (\beta, q)$. This is a measurable map (and is in fact of class $C^\infty$ except on a finite number of curves in $K$).
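For a concrete instance of the map $T$, the following Python sketch computes one step of the billiard in the unit disk, a special case chosen here because the geometry can be written down explicitly. The boundary point is parametrized by its polar angle `s`, and `alpha` is taken to be the angle between the outgoing direction and the counterclockwise tangent, which by the reflection law equals the angle of incidence; these conventions and the function name are assumptions made for the illustration.

```python
import math


def circle_billiard_step(alpha, s):
    """One step (alpha, s) -> (beta, s_new) of the billiard map in the unit disk."""
    p = (math.cos(s), math.sin(s))        # current boundary point
    t = (-math.sin(s), math.cos(s))       # counterclockwise unit tangent at p
    # Direction of motion: angle alpha with the tangent, pointing into the disk.
    d = (math.cos(alpha) * t[0] - math.sin(alpha) * p[0],
         math.cos(alpha) * t[1] - math.sin(alpha) * p[1])
    # The ray p + tau * d meets the unit circle again at tau = -2 <p, d>.
    tau = -2.0 * (p[0] * d[0] + p[1] * d[1])
    q = (p[0] + tau * d[0], p[1] + tau * d[1])
    s_new = math.atan2(q[1], q[0]) % (2.0 * math.pi)
    # Angle between the incoming direction and the tangent at the new point.
    t_q = (-math.sin(s_new), math.cos(s_new))
    cos_beta = d[0] * t_q[0] + d[1] * t_q[1]
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))
    return beta, s_new


if __name__ == "__main__":
    print(circle_billiard_step(0.7, 1.0))
```

In the disk the angle is unchanged and the boundary point advances by $2\alpha$, so the map reduces to $(\alpha, s) \mapsto (\alpha, s + 2\alpha)$; for a general domain $U$ the next collision point has to be found by intersecting the ray with $\partial U$.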
Denote by $\lambda_0$ the Lebesgue measure in $\partial U$ (i.e. the unique Borel measure in $\partial U$ such that if $A \subset \partial U$ is a regular arc then $\lambda_0(A)$ is the length of $A$). Denote by $\lambda$ the Lebesgue measure in $\mathbb{R}$; then $\lambda \times \lambda_0$ is a measure in $K$. Birkhoff [A6] has shown that the measure $\mu$ in $K$ defined by
$$\mu(A) := \int_A \sin\theta \, d(\lambda \times \lambda_0)$$
is invariant under $T$. The recurrence theorem, applied to $T$ and $\mu$, shows that almost every point $(\alpha, p) \in K$ is recurrent. We leave it to the reader to show from this that for almost every $x \in U$ and almost every $u \in \mathbb{R}^2$ a particle that starts from point $x$ with velocity $u$ will return infinitely often to points arbitrarily close to $x$, with velocity arbitrarily close to $u$.

We now consider an example of a very different nature, the map $\varphi\colon [0, 1] \to [0, 1]$ defined by
$$\varphi(x) := \frac{1}{x} - \left[\frac{1}{x}\right]$$
for $x \ne 0$, and $\varphi(0) := 0$. This map is called the Gauss transformation, and it plays an important role in the theory of continued fractions, which we now introduce briefly. (A more detailed development comes in section 6.) Given $0 < x < 1$, we can write
$$x = \frac{1}{n_1 + \varphi(x)},$$
where $n_1 := [1/x]$ is the integer part of $1/x$.
If $\varphi(x) \ne 0$, we can repeat the procedure with $\varphi(x)$ and obtain
$$x = \cfrac{1}{n_1 + \cfrac{1}{n_2 + \varphi^2(x)}},$$
where $n_2 := [1/\varphi(x)]$. If $\varphi^n(x) \ne 0$ for all $n$ (this property is equivalent to $x$ being irrational), we can associate with $x$ the sequence $(n_j(x))_{j \ge 1}$ of positive integers (generally denoted simply by $(n_j)$), such that
$$x = \cfrac{1}{n_1 + \cfrac{1}{n_2 + \cfrac{1}{n_3 + \cdots + \cfrac{1}{n_j + \varphi^j(x)}}}}$$
for every $j \ge 1$. It can in fact be proved that
$$x = \lim_{j \to +\infty} \cfrac{1}{n_1 + \cfrac{1}{n_2 + \cfrac{1}{n_3 + \cdots + \cfrac{1}{n_j}}}},$$
which is usually expressed by writing
$$x = \cfrac{1}{n_1 + \cfrac{1}{n_2 + \cfrac{1}{n_3 + \cdots}}}.$$
Denoting by $I_n$ the interval $(1/(n+1), 1/n)$, the sequence $(n_j)$ is determined by the property
$$\varphi^j(x) \in I_{n_{j+1}}, \quad j \ge 0.$$
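The procedure just described translates directly into a short computation. The Python sketch below iterates the Gauss transformation to read off the integers $n_1, n_2, \dots$ and rebuilds the finite continued fraction from them. The function names are illustrative, and exact rational arithmetic is used so that the expansion of a rational number terminates exactly; with floating-point input the later digits would be affected by rounding.

```python
from fractions import Fraction
from math import floor


def gauss_map(x):
    """Gauss transformation: 1/x minus its integer part, with 0 sent to 0."""
    if x == 0:
        return x
    inverse = 1 / x
    return inverse - floor(inverse)


def cf_digits(x, max_digits=20):
    """Integers n_1, n_2, ... of the continued fraction of x in (0, 1)."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(floor(1 / x))   # n_{j+1} = [1 / phi^j(x)]
        x = gauss_map(x)
    return digits


def convergent(digits):
    """Value of the finite continued fraction 1/(n_1 + 1/(n_2 + ... + 1/n_k))."""
    value = Fraction(0)
    for n in reversed(digits):
        value = 1 / (n + value)
    return value


if __name__ == "__main__":
    x = Fraction(30, 43)
    digits = cf_digits(x)
    print(digits, convergent(digits))
```

For $x = 30/43$ the orbit under $\varphi$ is $13/30$, $4/13$, $1/4$, $0$, which gives the integers $1, 2, 3, 4$.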
The map $\varphi$ possesses the important property that it preserves the Borel measure in $[0, 1]$ given by
$$\mu(A) := \frac{1}{\log 2} \int_A \frac{1}{1 + x} \, d\lambda,$$
where $\lambda$ is Lebesgue measure. In other words, $\mu(\varphi^{-1}(A)) = \mu(A)$ for every Borel set $A \subset [0, 1]$. This is the so-called Gauss measure, and its invariance will be proved in the next section. In Chapter III we will show that it is the only probability measure invariant under $\varphi$ that is absolutely continuous with respect to $\lambda$.
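The invariance can be checked numerically on intervals. For $0 \le a < b \le 1$ the preimage of $(a, b)$ under the Gauss transformation is the disjoint union of the intervals $(1/(n+b), 1/(n+a))$, $n \ge 1$, so its Gauss measure is a convergent series. The Python sketch below compares a truncated version of that series with the Gauss measure of $(a, b)$; the function names and the truncation level are choices made for the illustration, not part of the text.

```python
import math


def gauss_measure(a, b):
    """Gauss measure of the interval (a, b): (1/log 2) * integral of 1/(1+x)."""
    return math.log((1.0 + b) / (1.0 + a)) / math.log(2.0)


def gauss_measure_of_preimage(a, b, n_branches=100_000):
    """Gauss measure of the preimage of (a, b), truncated to the first n_branches intervals."""
    total = 0.0
    for n in range(1, n_branches + 1):
        # The branch x -> 1/x - n maps (1/(n+b), 1/(n+a)) onto (a, b).
        total += gauss_measure(1.0 / (n + b), 1.0 / (n + a))
    return total


if __name__ == "__main__":
    a, b = 0.2, 0.5
    print(gauss_measure(a, b), gauss_measure_of_preimage(a, b))
```

The two printed values agree up to the truncation error of the series, which is of order $(b - a)/N$ when $N$ branches are kept.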
We next mention a result, proved in Chapter IV, to illustrate the applications of this measure to the theory of continued fractions. Let $0 < x < 1$. If $x$ is rational, then it is of the form
$$x = \cfrac{1}{n_1 + \cfrac{1}{n_2 + \cfrac{1}{n_3 + \cdots + \cfrac{1}{n_k}}}}.$$
If $x$ is irrational, it is important to evaluate the error at each finite stage of the expansion, i.e. the difference
$$\Delta_m(x) := x - \cfrac{1}{n_1(x) + \cfrac{1}{n_2(x) + \cdots + \cfrac{1}{n_m(x)}}}.$$
It is easy to prove (see section 6) that $\lim_{m \to +\infty} \Delta_m(x)$ [...] exist $a \ge 0$ and $C > 0$ such that $\Delta_m(x)$ [...]
[...] $0\}$ intersects $N$.
a) Prove that there exists a diffeomorphism $f\colon N \to N$ of class $C^r$ (called the Poincaré map associated to the transverse section $N$) with the following property: for every $x \in N$ there exists $\tau(x) > 0$ such that $f(x) = \varphi(\tau(x), x) \in N$ and $\varphi(t, x) \notin N$ for every $0 < t < \tau(x)$. Prove that $\tau$ is of class $C^r$.
b) If $\omega$ is a volume form of class $C^r$ on $M$, prove that the $(n-1)$-form $\bar\omega$ on $N$ defined by
$$\bar\omega_p(u_1, \dots, u_{n-1}) := \omega_p(X(p), u_1, \dots, u_{n-1})$$
for $p \in N$ and $u_1, \dots, u_{n-1} \in T_pN$ is a volume form of class $C^r$ on $N$. Prove also that $\bar\omega$ is $f$-invariant if $\omega$ is $X$-invariant.

3.5 Let $(D, \varphi)$ be a flow defined on a manifold $M$. For $x \in M$, we define the $\omega$-limit set of $x$ as the set of points $y \in M$ such that
$$\liminf_{t \to \tau^+(x)} d(\varphi(t, x), y) = 0,$$
where $d$ is any metric generating the topology of $M$.
a) Show that $\tau^+(x) < \infty$ implies $\omega(x) = \emptyset$.
b) Let $D_\infty = \bigcap_{T > 0} D_T$. Prove that $\varphi_t(D_\infty) = D_\infty$ for every $t$, and that $\omega(x) \subset D_\infty$ for every $x \in M$.
c) (Hopf) If $\varphi$ preserves a $\sigma$-finite measure $\mu$, prove that for almost every $x \in M$ we have $\omega(x) = \emptyset$ or $x \in \omega(x)$.

3.6 Find a function $f\colon \mathbb{R}^2 \setminus \{0\} \to (0, +\infty)$ of class $C^\infty$ such that the field $X$ on $\mathbb{R}^2 \setminus \{0\}$ given by
$$X(x, y) := -f(x, y)\left(x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y}\right)$$
preserves the Lebesgue measure.
4. First Integrals

Let $f$ be a diffeomorphism of a manifold $M$. A first integral (or prime integral) of $f$ is a function $H\colon M \to \mathbb{R}$ of class $C^\infty$ such that $H \circ f = H$ and $H$ has regular values $c \in \mathbb{R}$ with $H^{-1}(c) \ne \emptyset$. (This latter condition is, by Sard's theorem, equivalent to saying that $M$ has some component restricted to which $H$ is not constant.) If $c$ is such a regular value, $H^{-1}(c)$ is a submanifold of $M$ satisfying $f(H^{-1}(c)) = H^{-1}(c)$. As we shall see in the next section, diffeomorphisms which admit first integrals occur naturally in classical mechanics. Here we shall prove the following property:

Proposition 4.1. Let $f\colon M \to M$ be a diffeomorphism preserving a volume form $\omega$. If $f$ possesses a first integral $H\colon M \to \mathbb{R}$ and $c$ is a regular value with $H^{-1}(c) \ne \emptyset$, there exists a volume form on $H^{-1}(c)$ invariant under $f|H^{-1}(c)$.

In principle, $f|H^{-1}(c)$ could leave invariant many different volume forms. In the proof, however, we shall construct one which is in some sense the most natural (see exercise 4.4). For the question of uniqueness of invariant volume forms, see exercise 4.1.
Proof. We define a volume form $\bar\omega$ on $H^{-1}(c)$ in the following way: Let $x \in H^{-1}(c)$ and $u_1, \dots, u_{n-1} \in T_xH^{-1}(c)$. Put
$$\bar\omega_x(u_1, \dots, u_{n-1}) := \omega_x(u, u_1, \dots, u_{n-1}), \tag{1}$$
where $u \in T_xM$ is such that
$$(D_xH)u = 1.$$
Equality (1) does not depend on the choice of $u$ because if $u'$ is another vector in $T_xM$ satisfying $(D_xH)u' = 1$, we have $(D_xH)(u - u') = 0$, which implies $u - u' \in T_xH^{-1}(c)$, and
$$\omega_x(u - u', u_1, \dots, u_{n-1}) = 0,$$
since $u - u', u_1, \dots, u_{n-1}$ are $n$ vectors in the $(n-1)$-dimensional space $T_xH^{-1}(c)$. Let us check that $f^*\bar\omega = \bar\omega$. If $x \in H^{-1}(c)$ and $u_1, \dots, u_{n-1}$ are vectors in $T_xH^{-1}(c)$, we obtain
$$(f^*\bar\omega)_x(u_1, \dots, u_{n-1}) = \bar\omega_{f(x)}((D_xf)u_1, \dots, (D_xf)u_{n-1}) = \omega_{f(x)}(w, (D_xf)u_1, \dots, (D_xf)u_{n-1}), \tag{2}$$
where $w \in T_{f(x)}M$ satisfies $(D_{f(x)}H)w = 1$. Let $w_0 \in T_xM$ be such that $(D_xf)w_0 = w$. Differentiating $H \circ f = H$ at $x$ we obtain
$$(D_{f(x)}H)(D_xf) = D_xH,$$
which implies
$$(D_xH)w_0 = (D_{f(x)}H)(D_xf)w_0 = (D_{f(x)}H)w = 1. \tag{3}$$
Thus (2) can be written in the form
$$(f^*\bar\omega)_x(u_1, \dots, u_{n-1}) = \omega_{f(x)}((D_xf)w_0, (D_xf)u_1, \dots, (D_xf)u_{n-1}) = (f^*\omega)_x(w_0, u_1, \dots, u_{n-1}) = \omega_x(w_0, u_1, \dots, u_{n-1}),$$
the last equality holding by virtue of the invariance of $\omega$ under $f$. But from (3) and (1),
$$\omega_x(w_0, u_1, \dots, u_{n-1}) = \bar\omega_x(u_1, \dots, u_{n-1}).$$
We conclude that $(f^*\bar\omega)_x = \bar\omega_x$ for every $x \in H^{-1}(c)$, which means that $\bar\omega$ is invariant under $f|H^{-1}(c)$. □

We shall also need in the next section the related notion of a first integral of a vector field. If $X$ is a vector field in $M$, a function $H\colon M \to \mathbb{R}$ is a first integral of $X$ if $H$ has regular values $c$ with $H^{-1}(c) \ne \emptyset$ and $(D_pH)X(p) = 0$ for every $p \in M$. If $(D, \varphi)$ is the flow generated by $X$, any first integral of $X$ is also a first integral of $\varphi_t$, for $t \in \mathbb{R}$.
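A simple example, not taken from the text, may help fix both definitions. On $M = \mathbb{R}^2$ the function $H(x, y) = x^2 + y^2$ is invariant under every rotation $f$ about the origin, and it satisfies $(D_pH)X(p) = 0$ for the field $X(x, y) = (-y, x)$ generating those rotations; every $c > 0$ is a regular value with $H^{-1}(c)$ a circle. The Python sketch below checks both identities numerically, with the derivative approximated by a central difference; all names are illustrative.

```python
import math


def rotate(point, theta):
    """Rotation of the plane by the angle theta about the origin."""
    x, y = point
    return (math.cos(theta) * x - math.sin(theta) * y,
            math.sin(theta) * x + math.cos(theta) * y)


def H(point):
    """Candidate first integral: squared distance to the origin."""
    x, y = point
    return x * x + y * y


def rotation_field(point):
    """Vector field X(x, y) = (-y, x), whose flow consists of rotations."""
    x, y = point
    return (-y, x)


def directional_derivative(func, point, vector, h=1e-6):
    """Central-difference approximation of (D_p func) applied to `vector`."""
    forward = (point[0] + h * vector[0], point[1] + h * vector[1])
    backward = (point[0] - h * vector[0], point[1] - h * vector[1])
    return (func(forward) - func(backward)) / (2.0 * h)


if __name__ == "__main__":
    p = (1.0, 2.0)
    print(H(p), H(rotate(p, 0.3)))                               # H o f = H
    print(directional_derivative(H, p, rotation_field(p)))       # (D_p H) X(p) = 0
```

The level sets $H^{-1}(c)$ are the invariant circles, and the restriction of a rotation to such a circle preserves arc length, in line with proposition 4.1.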
Exercises

4.1 a) Let $f$ be a $C^\infty$ diffeomorphism of a manifold $M$. If $f$ does not possess first integrals, prove that $f$ leaves at most one $C^\infty$ volume form invariant (up to multiplication by a scalar).
b) Let $f$ be a $C^\infty$ diffeomorphism of a manifold $M$. If $f$ leaves a $C^\infty$ volume form invariant, prove that there exists a submanifold $N \subset M$ such that $f(N) = N$ and $f|N$ leaves invariant exactly one volume form (up to multiplication by a scalar).

4.2 a) Prove that if $f$ is a diffeomorphism of $M$ and $H\colon M \to \mathbb{R}$ is a first integral, then $H(y) = H(x)$ for every $x \in M$ and $y$ in the $\omega$-limit set of $x$.
b) The $\omega$-limit set of a diffeomorphism $f\colon M \to M$ is defined as the set $L^+(f) = \bigcup_{x \in M} \omega(x)$. Prove that if a diffeomorphism $f\colon M \to M$ admits a first integral then $L^+(f)$ is uncountable.

4.3 a) Prove that for every volume form $\omega$ in $S^1 \times \mathbb{R}$ and every diffeomorphism $g\colon S^1 \to S^1$ there exists a diffeomorphism $f\colon S^1 \times \mathbb{R} \to S^1 \times \mathbb{R}$ preserving $\omega$ and such that
$$f(p, 0) = (g(p), 0)$$
for every $p \in S^1$.
b) Find a diffeomorphism of a compact manifold which preserves a volume form and leaves invariant a compact submanifold, but which, when restricted to that invariant submanifold, does not preserve any volume form.

4.4 If $f$, $M$, $H$, $c$ and $\omega$ are as in the statement of proposition 4.1, show that the form $\bar\omega$ defined in the proof of 4.1 is unique with the property that for every $x \in H^{-1}(c)$ we have [...] for any sufficiently small $r$.
5. Hamiltonians

A symplectic manifold is a pair $(M, \omega)$, where $M$ is a manifold and $\omega$ is a nondegenerate closed 2-form of class $C^\infty$, called the symplectic structure of $M$. Nondegenerate means that if $x \in M$ and $u \in T_xM$ are such that $\omega_x(u, w) = 0$ for every $w \in T_xM$, then $u = 0$. The most natural examples of symplectic manifolds are the pairs $(U, \omega)$, where $U$ is an open set in $\mathbb{R}^{2n}$ and $\omega$ is the form defined by
$$\omega = \sum_{i=1}^{n} dp_i \wedge dq_i;$$
here $q_1, \dots, q_n$ indicate the first $n$ coordinates in $\mathbb{R}^{2n}$ and $p_1, \dots, p_n$ the last $n$ coordinates. Another class of examples is the pairs $(M, \omega)$, where $M$ is a 2-dimensional manifold and $\omega$ is a volume form.

Not every manifold $M$ admits a symplectic structure: there are some conditions which $M$ must satisfy. For example, $M$ must be even-dimensional (exercise 5.1b) and orientable (see proof below). In the 2-dimensional case, orientability is also a sufficient condition, since, as mentioned above, a volume form provides a symplectic structure in dimension two. However, in higher dimensions there are other necessary conditions of a deeper nature.

Given two symplectic manifolds $(M_1, \omega_1)$ and $(M_2, \omega_2)$, we say that a diffeomorphism $f\colon M_1 \to M_2$ is a symplectic diffeomorphism or symplectomorphism if
$$\omega_2((D_xf)u, (D_xf)w) = \omega_1(u, w)$$
for every $x \in M_1$ and $u, w \in T_xM_1$. If there exists such a diffeomorphism between $M_1$ and $M_2$, we say that $M_1$ and $M_2$ are symplectomorphic. Locally, all symplectic manifolds of the same dimension are symplectomorphic. This is a consequence of the following result, which provides a universal local model for symplectic manifolds:

Theorem 5.1 (Darboux [R4] [A7]). If $(M, \omega)$ is a symplectic manifold of dimension $2n$, every point in $M$ possesses a neighborhood $V$ such that $(V, \omega|V)$ is symplectomorphic to $(U, \sum_{i=1}^{n} dp_i \wedge dq_i)$, where $U$ is an open set in $\mathbb{R}^{2n}$.

Every symplectic manifold $(M, \omega)$ possesses a canonical volume form $\bar\omega$ given by
$$\bar\omega := \underbrace{\omega \wedge \cdots \wedge \omega}_{n}.$$
It is clear that $\bar\omega$ is a $2n$-form. That it is non-degenerate is a simple consequence of
Darboux's theorem, but can also be proved using elementary considerations (exercise 5.1). It follows from the definitions that a symplectic diffeomorphism preserves the volume form thus defined.

If $M$ is a manifold and $x$ a point in $M$, we denote by $T_x^* M$ the cotangent space to $M$ at $x$, i.e. the space of linear maps from $T_x M$ into $\mathbb{R}$. If $\omega$ is a symplectic structure on $M$, we define for every $x \in M$ a map $A_x: T_x M \to T_x^* M$ in the following way:
$$(A_x u)w := \omega_x(u, w), \qquad w \in T_x M.$$
The map $A_x$ is evidently linear, and is injective because $\omega$ is non-degenerate. It is thus an isomorphism. Using this isomorphism we can, given a symplectic manifold and a $C^1$ function $H: M \to \mathbb{R}$, define its symplectic gradient $\nabla^\# H$ as the vector field on $M$ given by
$$\nabla^\#_x H := A_x^{-1}(D_x H);$$
in other words, $\nabla^\#_x H$ is the unique vector in $T_x M$ such that
$$(D_x H)w = \omega_x(\nabla^\#_x H, w)$$
for all $w \in T_x M$. A field which is the symplectic gradient of some function is called a Hamiltonian field, and the function from which it derives (which is unique up to a constant) is called the Hamiltonian of the field. If $U$ is an open set of $\mathbb{R}^{2n}$ with the symplectic structure given by $\omega := \sum_{i=1}^n dp_i \wedge dq_i$, the symplectic gradient of a function $H: U \to \mathbb{R}$ is given by $\nabla^\# H$
= (oH , ... aq1
,:H, -!H ,... ,-!H). q.
P1
P.
The function $H$ is a first integral of the field $\nabla^\# H$ because
$$(D_x H)\nabla^\#_x H = \omega_x(\nabla^\#_x H, \nabla^\#_x H) = 0.$$
Moreover, $\nabla^\# H$ preserves the volume form $\bar\omega$, and the flow generated by it is symplectic:
Proposition 5.2. Let (M, w) be a symplectic manifold. If X is a C 1 vector field on M, the following conditions are equivalent: a) Xis locally Hamiltonian, i.e. every point of M possesses a neighborhood V where there is a C 2 function H: V---> R satisfying V# H =XIV; b) If (D,¢,) is the flow generated by X, then for every T > 0 andW- < T the diffeomorphism ¢,,IDT: (DT,wlDT) .... (,o((Dpi,6,)u,(Dpi,6, )w) = 0.
Thus, for every $|t| < T$, we have
$$\omega_{\phi_t(p)}((D_p\phi_t)u, (D_p\phi_t)w) = \omega_p(u, w) + \int_0^t (L_X\omega)_{\phi_s(p)}((D_p\phi_s)u, (D_p\phi_s)w)\,ds = \omega_p(u, w). \tag{2}$$
If $X$ is of class $C^1$ only, then (1) does not make sense, because $L_X\omega$ is not defined. But we can approximate $X$ in the $C^1$ topology by a sequence $(X_n)$ of $C^2$ vector fields, and, denoting by $(D^{(n)}, \phi^{(n)})$ the flow generated by $X_n$, we obtain
$$\omega_{\phi_t(p)}((D_p\phi_t)u, (D_p\phi_t)w) = \lim_{n\to+\infty} \omega_{\phi_t^{(n)}(p)}((D_p\phi_t^{(n)})u, (D_p\phi_t^{(n)})w) = \omega_p(u, w).$$
(b) $\Rightarrow$ (c) (outline). If $X$ is of class $C^2$ one uses (b) and (2) to prove that $L_X\omega = 0$, and (c) follows by using (1). If $X$ is of class $C^1$ only, we proceed as before, approximating $X$ in the $C^1$ topology by a sequence $(X_n)$, and proving that $L_{X_n}\omega$ converges to 0. Then $d(i_{X_n}\omega) = L_{X_n}\omega$ converges to zero; but $d(i_{X_n}\omega)$ converges to $d(i_X\omega)$. □
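The defining relation of the symplectic gradient, $(D_xH)w = \omega_x(\nabla^\#_x H, w)$, can be checked by hand in the simplest case. The following is a minimal sketch, assuming $M = \mathbb{R}^2$ with coordinates $(q, p)$, the form $\omega = dp \wedge dq$, and a sample function $H = p^2/2 + q^4/4$ chosen only for illustration; the function names are hypothetical. It solves the defining relation directly and confirms that $H$ is a first integral.

```python
# Coordinates on R^2 are (q, p); omega = dp ^ dq, i.e.
# omega(u, w) = u_p * w_q - u_q * w_p.

def omega(u, w):
    """Evaluate the symplectic form dp ^ dq on two vectors (q-part, p-part)."""
    return u[1] * w[0] - u[0] * w[1]


def differential_H(q, p):
    """Differential (dH/dq, dH/dp) of the sample function H = p**2/2 + q**4/4."""
    return (q ** 3, p)


def symplectic_gradient(q, p):
    """Solve omega(X, w) = (DH) w for every w.

    Testing against w = (1, 0) gives X_p = dH/dq, and testing against
    w = (0, 1) gives -X_q = dH/dp.
    """
    dH_dq, dH_dp = differential_H(q, p)
    return (-dH_dp, dH_dq)


def first_integral_defect(q, p):
    """Return (DH) X, which should vanish because omega is alternating."""
    dH = differential_H(q, p)
    X = symplectic_gradient(q, p)
    return dH[0] * X[0] + dH[1] * X[1]
```

The sign of the resulting field depends on the convention $\omega = dp \wedge dq$ used here; with the opposite ordering of the wedge product the field changes sign, but $H$ remains a first integral.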
Exercises
5.1 Let OJ: Rm x Rm-+ R be an alternating bilinear map. a) Prove that if1(x) E In,
if ~i(x) 'F 0
or n; := 0
if ~i(x)
= 0.
This sequence is obviously normal, and
so that
This implies the theorem.
Exercises
6.1 Prove that,given a sequence(mi);;;a,o of positive integers such that L;(l/m;) < oo, there exists for almost every xe(O, 1) an integer N such that n;(x) ~ m;
for allj ~ N. Hint: The set of points x where this property fails is co
n U ri((O, 1/m;)).
n;;.O j=l
6.2 Prove that there exists O < Ao < 1 such that 1(~2 )'(x)I < Ao at every point where )'(x) exists. Deduce that there exist G > 0 and O < A < 1 such that
(~ 2
Ix for every x e (0, 1) and k
~
lno(x) ... lnk(x)I ~
mt
1.
7. Topological Groups, Lie Groups, Haar Measure

A topological group is a Hausdorff topological space $G$ endowed with a group operation $G \times G \to G$, denoted multiplicatively, satisfying the following conditions:
a) The map $(x, y) \mapsto xy$ is continuous.
b) The map $x \mapsto x^{-1}$ is continuous.
Under certain mild additional assumptions, it can be proved that (a) implies (b) (see exercises 7.8, 7.9). The simplest examples of topological groups are the finite groups, the (additive) real and complex numbers, and the circle $S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$, with the operation induced by multiplication in $\mathbb{C}$. Given a family $\{G_\alpha\}_{\alpha \in A}$ of topological groups, we define its product $\prod_{\alpha \in A} G_\alpha$ as the product topological space (i.e. the set of all maps $x: A \to \bigcup_\alpha G_\alpha$ such that $x(\alpha) \in G_\alpha$ for all $\alpha \in A$, with the coarsest topology that makes all the projections $x \mapsto x(\alpha)$ continuous), endowed with the product group structure (i.e. $(xy)(\alpha) = x(\alpha)y(\alpha)$). Examples are the tori $T^n := S^1 \times \overset{n}{\cdots} \times S^1$, and the solenoid, defined as the product of a countable family of copies of $S^1$. An important theorem of Pontrjagin [P5] asserts that every compact abelian metric topological group is a product of finite cyclic groups and circles.

A Lie group is a manifold $M$ of class $C^\infty$ endowed with a topological group operation which is of class $C^\infty$ with respect to the differentiable structure of $M$. It follows from the implicit function theorem that the map $x \mapsto x^{-1}$ is a $C^\infty$-diffeomorphism (exercise 7.9). (It is sufficient to formulate the definition with a $C^1$ condition on the manifold and the operation, since this implies that they are $C^\infty$, and, in fact, analytic.) Every continuous homomorphism from a Lie group into another is $C^\infty$, and every closed subgroup of a Lie group is a $C^\infty$ submanifold, and thus also a Lie group. The finite-dimensional topological groups above are all examples of Lie groups. The next simplest examples are the groups $GL(n)$ of linear isomorphisms of $\mathbb{R}^n$. Since $GL(n)$ is an open set in the vector space $\mathcal{L}(\mathbb{R}^n)$ of linear maps from $\mathbb{R}^n$ into itself, it is a $C^\infty$ manifold, and the operation $(A, B) \mapsto AB$ is obviously $C^\infty$, as can be seen by writing $A$ and $B$ as matrices relative to some basis of $\mathbb{R}^n$. That
$$SL(n) := \{A \in GL(n) \mid \det(A) = 1\}$$
and
$$O(n) := \{A \in GL(n) \mid AA^* = I\}$$
are Lie groups follows from the fact that they are closed subgroups of $GL(n)$, but can also be proved directly (ex. 7.4); $O(n)$ is compact, and it can be shown that every compact Lie group is isomorphic to a closed subgroup of $O(n)$.

If $G$ is a topological group, one associates to each $x \in G$ the homeomorphisms $L_x: G \to G$ (left translation by $x$) and $R_x: G \to G$ (right translation) defined by $L_x(y) := yx$, $R_x(y) := xy$. If $G$ is a Lie group, left and right translations are diffeomorphisms. A left-invariant measure on $G$ is a measure $\mu$ on the Borel $\sigma$-algebra of $G$ invariant under all left translations and satisfying the following condition: for every Borel set $A$,
$$\mu(A) = \inf_{U \text{ open},\ U \supset A} \mu(U) = \sup_{C \text{ closed},\ C \subset A} \mu(C).$$
This condition is satisfied by any finite measure by Lusin's theorem (0.1.3). It can be replaced by the finiteness of $\mu$ on compact sets if the group is locally compact
(Halmos [HO]). Right-invariant measures are defined in the same way. If a measure is both left- and right-invariant, it is called simply an invariant measure.
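For a finite group the definition can be verified exhaustively. The sketch below is an illustration only, assuming the group is the symmetric group $S_3$ (a non-abelian finite group, represented by permutation tuples) and the measure is normalized counting measure; the helper names are hypothetical. It checks that multiplying a set on either side by a group element does not change its measure, which is equivalent to invariance under the corresponding translations because each translation is a bijection.

```python
from fractions import Fraction
from itertools import combinations, permutations

# Elements of S_3: a tuple g encodes the permutation i -> g[i].
G = list(permutations(range(3)))


def mul(g, h):
    """Group product (g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))


def mu(A):
    """Normalized counting measure of a subset A of G."""
    return Fraction(len(A), len(G))


def is_invariant(A):
    """Check mu(xA) = mu(A) = mu(Ax) for every x in G."""
    A = set(A)
    return all(
        mu({mul(x, a) for a in A}) == mu(A) == mu({mul(a, x) for a in A})
        for x in G
    )


def all_subsets():
    """Enumerate all 64 subsets of G."""
    for r in range(len(G) + 1):
        for A in combinations(G, r):
            yield set(A)
```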
Theorem 7.1. A complete topological group has a left-invariant measure if and only if it is locally compact. In this case, the left-invariant measure is unique up to a positive multiplicative constant.

If $G$ is compact, we have $\nu(G) < \infty$ for any left-invariant measure $\nu$ (check), so $\nu$ can be divided by $\nu(G)$, giving rise to a normalized measure $\mu$. The uniqueness part of theorem 7.1 guarantees that $\mu$ is the only left-invariant measure with total mass 1; we call it the Haar measure on $G$.

Corollary 7.2. The Haar measure of a compact group is right-invariant, and invariant under surjective continuous homomorphisms.

Proof. Let $G$ be a compact group and $\mu$ its Haar measure. For $x \in G$ we define a measure $\nu$ on $G$ by $\nu(A) := \mu(R_x^{-1}(A))$. Then $\nu$ is left-invariant, because for $z \in G$ and $A$ a Borel set in $G$ we have, since left and right translations commute,
$$\nu(L_z^{-1}(A)) = \mu(R_x^{-1}(L_z^{-1}(A))) = \mu(L_z^{-1}(R_x^{-1}(A))) = \mu(R_x^{-1}(A)) = \nu(A).$$
Moreover, $\nu$ obviously satisfies $\nu(G) = 1$, so by uniqueness $\nu = \mu$, which means that $\mu$ is invariant under $R_x$ for any $x$.
Analogously, if $H: G \to G$ is a surjective continuous homomorphism we define a measure $\nu$ by $\nu(A) := \mu(H^{-1}(A))$. We leave it to the reader to show that $\nu$ is left-invariant, hence $\mu = \nu$, and $\mu$ is $H$-invariant. □

Theorem 7.1 was proved by von Neumann for compact groups; the extension to locally compact groups is due to Haar. See Pontrjagin [P4] for the proof in the compact case. Here we will prove the theorem for Lie groups; we will obtain a left-invariant measure which arises from a $C^\infty$ volume form.

Let $G$ be a Lie group. Let $\langle \cdot, \cdot \rangle_e$ be an inner product in $T_e G$, where $e$ denotes the identity of $G$. For $x \in G$ we define an inner product $\langle \cdot, \cdot \rangle_x$ in $T_x G$ by $\langle u, w \rangle_x = \langle (D_e L_x)^{-1}u, (D_e L_x)^{-1}w \rangle_e$. It is easy to see that these inner products endow $G$ with a Riemannian structure for which every left translation is an isometry. In particular, if $\omega$ is the volume form associated to this structure, it follows that $\mu_\omega$ is left-invariant. This settles the existence part.

We now prove uniqueness. Let $\nu$ be another left-invariant measure. We leave it to the reader to show that $\nu(\{e\}) = 0$. Thus $\nu(B_r(e)) < +\infty$ for small values of $r$, say $0 < r < R$. From the left invariance of $\nu$ and of the metric on $G$ it follows that $\nu(B_r(x)) < +\infty$ for every $x \in G$ and $0 < r < R$. We claim that for every $x \in G$ we have
$$\limsup_{r \to 0} \frac{\nu(B_r(x))}{\mu_\omega(B_r(x))} \le \frac{\nu(B_R(e))}{\mu_\omega(B_R(e))}. \tag{1}$$
By the left invariance ofµ, v and the metric on M we just have to consider the case x = e. Take any sequence (rn)n=i, 2 , ••• such that O < rn T" such that n] = fn:.
The map $\hat f$ is called the linear lift of $f$.
Proof. Let $U_0$ be a neighborhood of $0$ in $\mathbb{R}^n$, and $V_0$ a neighborhood of the identity in $T^n$, chosen so that $\pi|U_0$ is a homeomorphism between $U_0$ and $V_0$. Let $V_2 \subset V_1 \subset V_0$ be neighborhoods of $e$ such that
$$f(V_1) + f(V_1) \subset V_0, \tag{1}$$
$$f(V_2) + f(V_2) \subset V_1, \tag{2}$$
and let $U_1 := (\pi|U_0)^{-1}V_1$ and $U_2 := (\pi|U_0)^{-1}V_2$.

We define $\hat f: U_1 \to \mathbb{R}^n$ by $\hat f(x) := (\pi|U_0)^{-1}f\pi(x)$. (Condition (1) guarantees that $(\pi|U_0)^{-1}$ is well defined at the point $f\pi(x)$.) If $x, y \in U_2$, we have, by condition (2),
(3)
and
$$\pi(\hat f(x) + \hat f(y)) = f(\pi(x) + \pi(y)) = f(\pi(x + y)). \tag{4}$$
(3) and (4) imply that
$$\hat f(x) + \hat f(y) = (\pi|U_0)^{-1}f(\pi(x + y)) = \hat f(x + y) \tag{5}$$
for every $x, y \in U_2$. Let $U_3 \subset U_2$ be a neighborhood of $0$ such that $\lambda U_3 \subset U_3$ for every $0 \le \lambda \le 1$. Then, for $x \in U_3$ and $p, q$ integers such that $|p| < |q|$, we have

or, in other words,
$$\hat f(rx) = r\hat f(x) \tag{6}$$
for every rational $r < 1$. Since $f$ is continuous, so is $\hat f$, and (6) holds for every real number $0 \le r \le 1$. Thus $\hat f|U_3$ satisfies (5) and (6). It is easy to see that $\hat f$ possesses a linear extension $\hat f: \mathbb{R}^n \to \mathbb{R}^n$; we shall prove that $\pi\hat f = f\pi$. Let $S \subset \mathbb{R}^n$ be the set of points where $\pi\hat f$ and $f\pi$ coincide. $S$ is obviously closed. Let $p \in S$ and $y \in U_3$. Then
$$\pi\hat f(p + y) = \pi\hat f(p) + \pi\hat f(y) = f\pi(p) + f\pi(y) = f(\pi(p + y)),$$
so that $p + U_3 \subset S$, and $S$ is also open. Thus $S = \mathbb{R}^n$, and $\pi\hat f = f\pi$. The coefficients of $\hat f$ in the canonical basis $(e_1, \dots, e_n)$ of $\mathbb{R}^n$ are integers because, if $x \in \mathbb{Z}^n$, then $\pi\hat f(x) = f\pi(x) = f(e) = e$, so that $\hat f(x) \in \pi^{-1}(e) = \mathbb{Z}^n$; in particular, the coefficients of $\hat f(e_i)$ are integers.

To prove uniqueness, suppose that $\tilde f: \mathbb{R}^n \to \mathbb{R}^n$ is linear and $\pi\tilde f = f\pi$, and take $x \in U_1$. Then $\pi(x) \in V_1$ and $f\pi(x) \in V_0$, and, if $x$ is close enough to $0$, we have $\tilde f(x) \in U_0$. The relations $f\pi(x) \in V_0 = \pi(U_0)$ and $\tilde f(x) \in U_0$ imply that $\tilde f(x) = (\pi|U_0)^{-1}f\pi(x)$. In other words, $\tilde f$ coincides with $\hat f$ on a neighborhood of $0$. Since both are linear maps, it follows that $\hat f = \tilde f$.
Now assume that f is an isomorphism, and take j-1 such that 1ej-1 = 1-11e. Then 1t.ff-1 = f1ej- 1 = fr 1 = fa. Since 1tl = l1t, we obtain. by uniqueness. that I = ff- 1 • But j and j-1 have integer-coefficient matrices, so their determinants are integers and, since det(j)det(j-1 ) = det(I) = 1, we get ldeto such that
for allj > 1. 7.2 Let G be a compact Lie group with Haar measure µ and ip: G-+ GL(n) a continuous homomorphism. a) Prove that there exists an inner product in Rn such that
0) is contained in the closure of the set of recurrent points of $T$, which is generally called the Birkhoff center of $T$. This is an easy consequence of the Poincaré recurrence theorem, since the set of recurrent points has measure 1. It is not true that the Birkhoff center is the union of the support of all invariant measures. There exists an example [N1] of a $C^\infty$ diffeomorphism of $T^2$, with a single fixed point $p$, for which every point of $T^2$ is recurrent but the only invariant probability measure is the Dirac measure at $p$.

When the homeomorphism $T: X \to X$ possesses periodic points, it is easy to associate with them invariant measures, in the following way: let $x$ be a periodic point and $n$ the period of $x$. Put
$$\mu := \frac{1}{n}\sum_{j=0}^{n-1}\delta_{T^j(x)}. \tag{1}$$
We clearly have $\mu \in \mathcal{M}_T(X)$. The following theorem says that for a generic diffeomorphism of a compact manifold without boundary every invariant probability measure can be approximated by a convex linear combination of measures of the type of (1). Recall that a subset $S$ of a topological space $X$ is called residual if it contains the intersection of a countable family of open and dense subsets of $X$. The space $X$ is said to be a Baire space if every residual subset is dense. The commonest class of Baire spaces is the locally complete metric spaces. A property of points of a space $X$ is generic if it holds for a residual set of points of $X$.

Theorem 8.5 [M9]. If $M$ is a compact manifold without boundary and $\mathrm{Diff}^1(M)$ is the set of diffeomorphisms of $M$ with the $C^1$ topology, there exists a residual set $\mathcal{S} \subset \mathrm{Diff}^1(M)$ such that, for every $f \in \mathcal{S}$, the set $\mathcal{M}_f(M)$ is the convex hull of measures of the form (1).
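The invariance of a measure equidistributed on a periodic orbit, $\mu = \frac{1}{n}\sum_{j=0}^{n-1}\delta_{T^j(x)}$, is easy to confirm in a concrete case. The following is a small sketch, assuming for illustration that $T$ is the rotation of the circle $\mathbb{R}/\mathbb{Z}$ by $2/5$ (a homeomorphism all of whose points have period 5); the function names are hypothetical and exact rational arithmetic is used so that the orbit closes up exactly.

```python
from fractions import Fraction


def T(x):
    """Rotation of the circle R/Z by 2/5."""
    return (x + Fraction(2, 5)) % 1


def period(x, max_iter=1000):
    """Smallest n >= 1 with T^n(x) = x (x is assumed periodic)."""
    y, n = T(x), 1
    while y != x:
        if n >= max_iter:
            raise ValueError("point does not appear to be periodic")
        y, n = T(y), n + 1
    return n


def orbit_measure(x):
    """The measure (1/n) * sum of Dirac masses on the orbit, as {point: weight}."""
    n = period(x)
    weights = {}
    y = x
    for _ in range(n):
        weights[y] = weights.get(y, Fraction(0)) + Fraction(1, n)
        y = T(y)
    return weights


def integrate(f, mu):
    """Integral of f with respect to a finitely supported measure."""
    return sum(w * f(y) for y, w in mu.items())
```

Invariance can then be tested through the identity $\int (f \circ T)\,d\mu = \int f\,d\mu$ for any function $f$, since $T$ merely permutes the points of the orbit.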
Exercises
8.1 Let $f: S^1 \to S^1$ be a homeomorphism with fixed points. Prove that a point of $S^1$ is recurrent if and only if it is a fixed point of $f^n$ for some $n \in \mathbb{Z}$. Deduce that if $f$ has a finite set of fixed points then every probability measure invariant by $f$ is a convex linear combination of Dirac measures concentrated on fixed points of $f^n$, $n \in \mathbb{Z}$.

8.2 Let $f: [0, 1] \times S^1 \to S^1$ be continuous and such that for every $t \in [0, 1]$ the map $f_t(\cdot) = f(t, \cdot)$ is a homeomorphism. Is there a continuous map $\mu: [0, 1] \to \mathcal{M}(S^1)$ such that $\mu(t)$ is $f_t$-invariant for every $0 \le t \le 1$?

8.3 Let $X$ be a compact metric space and $T: X \to X$ a continuous map. If $A$ is a subset of $X$, we put
$$\tau(x, A) := \limsup_{n \to \infty} \frac{1}{n}\#\{0 \le j < n \mid T^j(x) \in A\}.$$
a) Prove that for every compact set $U \subset X$ and every $x \in X$ there exists $\mu \in \mathcal{M}_T(X)$ such that
8. Invariant Measures
µ(U) ;;;,, t(x, U). Hint Let (n1}i;,. 1 be a sequence of integers such that lim
.!_ #{0 :,;;j
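(c). If (c) does not hold there exists $f \in C^0(X)$ such that sequence (2) does not converge uniformly to $\int_X f\,d\mu$, where $\mu$ is the unique element of $\mathcal{M}_T(X)$. Then there exist $\varepsilon > 0$, a divergent sequence of integers $(n_i)_{i \ge 1}$ and a sequence $(x_i)_{i \ge 1}$ of points of $X$ such that
$$\left|\frac{1}{n_i + 1}\sum_{j=0}^{n_i} f(T^j(x_i)) - \int_X f\,d\mu\right| \ge \varepsilon$$
for every $i$. Let $\mu_i \in \mathcal{M}(X)$ be such that
$$\int_X g\,d\mu_i = \frac{1}{n_i + 1}\sum_{j=0}^{n_i} g(T^j(x_i))$$
for every $g \in C^0(X)$; the existence of this measure is guaranteed by the Riesz representation theorem (8.4). Since $\mathcal{M}(X)$ is compact by Theorem 8.2, we can assume that the sequence $(\mu_i)$ converges to $\nu \in \mathcal{M}(X)$. We now prove that $\nu \in \mathcal{M}_T(X)$. Let $g \in C^0(X)$; then
$$\int_X (g \circ T)\,d\nu = \lim_{i \to \infty}\int_X (g \circ T)\,d\mu_i = \lim_{i \to \infty}\left(\int_X g\,d\mu_i + \frac{g(T^{n_i+1}(x_i)) - g(x_i)}{n_i + 1}\right) = \int_X g\,d\nu,$$
so that $\nu \in \mathcal{M}_T(X)$. But
$$\left|\int_X f\,d\nu - \int_X f\,d\mu\right| = \lim_{i \to \infty}\left|\int_X f\,d\mu_i - \int_X f\,d\mu\right| = \lim_{i \to \infty}\left|\frac{1}{n_i + 1}\sum_{j=0}^{n_i} f(T^j(x_i)) - \int_X f\,d\mu\right| \ge \varepsilon,$$
and we get $\nu \ne \mu$, contradicting the fact that $T$ is uniquely ergodic.

(c) $\Rightarrow$ (b) is trivial.

(b) $\Rightarrow$ (a). Let $\phi: C^0(X) \to \mathbb{R}$ be the functional defined by
$$\phi(f) = \lim_{n \to \infty}\frac{1}{n + 1}\sum_{j=0}^{n} f(T^j(x)).$$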
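Then, for $\mu \in \mathcal{M}_T(X)$,
$$\int_X f\,d\mu = \frac{1}{n + 1}\sum_{j=0}^{n}\int_X (f \circ T^j)\,d\mu,$$
because
$$\int_X f\,d\mu = \int_X (f \circ T^j)\,d\mu$$
for every $j$. Since the sequence $\frac{1}{n+1}\sum_{j=0}^{n} f(T^j(x))$ is bounded by $\|f\|_0$, it follows from the dominated convergence theorem (0.3.4) that
$$\int_X f\,d\mu = \lim_{n \to \infty}\frac{1}{n + 1}\sum_{j=0}^{n}\int_X (f \circ T^j)\,d\mu = \int_X \lim_{n \to \infty}\left(\frac{1}{n + 1}\sum_{j=0}^{n}(f \circ T^j)\right)d\mu = \int_X \phi(f)\,d\mu = \phi(f).$$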
Thus the only element of $\mathcal{M}_T(X)$ is the measure associated with the positive linear functional $\phi$. □

We shall now analyze the dynamics of a uniquely ergodic map on the support of its invariant measure.

Definition. Let $T$ be a continuous map on a compact space $X$, and $A \subset X$ a non-empty subset of $X$. We call $A$ a minimal (invariant) set for $T$ if $A$ is compact, $T(A) \subset A$, and there is no non-empty proper subset of $A$ with the same properties. The map $T$ is called minimal if $X$ itself is a minimal set.

Proposition 9.3. Every continuous map on a compact space possesses a minimal set.

Proof. The family $\mathcal{F}$ of non-empty, compact, invariant subsets of $X$ is non-empty, since $X \in \mathcal{F}$. The order relation defined by inclusion satisfies the assumptions of Zorn's lemma, so a minimal element under this relation is a minimal invariant set. □
Proposition 9.4. If $T: X \to X$ is uniquely ergodic, the support of the unique element $\mu \in \mathcal{M}_T(X)$ is a minimal set.

Proof. Let $A$ be the support of $\mu$; clearly $T(A) \subset A$. Take a non-empty compact set $A_0 \subset A$ such that $T(A_0) \subset A_0$, and define $\nu \in \mathcal{M}(X)$ by $\nu(B) = \mu(B \cap A_0)$ for Borel sets $B$. Then $\nu(T^{-1}(B)) = \mu(T^{-1}(B) \cap A_0)$. Since $T(A_0) \subset A_0$ it follows that $T^{-1}(B) \cap A_0 = (T|A_0)^{-1}(B \cap A_0)$. Thus $\nu(T^{-1}(B)) = \mu((T|A_0)^{-1}(B \cap A_0)) = \mu(B \cap A_0) = \nu(B)$, implying that $\nu \in \mathcal{M}_T(X)$, hence $\nu = \mu$. But this means that $A_0 \supset \operatorname{supp}(\nu) = \operatorname{supp}(\mu) = A$, so $A = A_0$. □
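The uniform convergence of the averages $\frac{1}{n+1}\sum_{j=0}^{n} f(T^j(x))$ that characterizes unique ergodicity can be observed numerically. The sketch below assumes, as a standard example not taken from the text above, the rotation of $\mathbb{R}/\mathbb{Z}$ by the golden mean, which is uniquely ergodic with respect to Lebesgue measure; the rotation number is only a floating-point approximation and the supremum over $x$ is estimated on a finite grid, so this is an illustration rather than a proof.

```python
import math

# Floating-point approximation of an irrational rotation number (assumption).
ALPHA = (math.sqrt(5) - 1) / 2


def birkhoff_average(f, x, n):
    """Average of f over the orbit segment x, T(x), ..., T^n(x)."""
    return sum(f((x + j * ALPHA) % 1.0) for j in range(n + 1)) / (n + 1)


def sup_deviation(f, integral, n, grid=200):
    """Largest deviation of the n-th average from the integral over a grid of x."""
    return max(
        abs(birkhoff_average(f, k / grid, n) - integral) for k in range(grid)
    )


def sample_function(x):
    """Continuous function on the circle with Lebesgue integral 0."""
    return math.cos(2 * math.pi * x)
```

For this particular function the deviation is bounded by $1/((n+1)|\sin \pi\alpha|)$, so `sup_deviation(sample_function, 0.0, n)` decays like $1/n$ independently of the starting point.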
10. Shifts: the Probabilistic Viewpoint
In the next chapter we shall see an important example (due to Furstenberg) of an analytic diffeomorphism of T 2 which is area-preserving, minimal, but not uniquely ergodic.
Exercises
9.1 Let $X$ be a compact metric space and $T: X \to X$ a continuous map.
a) Show that if $A \subset X$ is a minimal set then $T(A) = A$ and $\omega(p) = A$ for every $p \in A$.
b) Prove that if $p \in X$ is recurrent, the set $\omega(p)$ is minimal if and only if for every $\varepsilon > 0$ there exists $N > 0$ such that for any $n > 0$ we can find $n \le m \le n + N$ with $d(T^m(p), p) < \varepsilon$.
c) Let $\Gamma$ be the closure of the union of all minimal sets of $T$. Show that, for any $\varepsilon > 0$, there exists $N > 0$ such that, for every $p \in \Gamma$ and $n > 0$, we can find $n \le m < n + N$ with $d(T^m(p), \Gamma) \le \varepsilon$.

9.2 Let $X$ be a complete metric space, and $T: X \to X$ a homeomorphism. We say that a point $x \in X$ is stable if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $d(y, x) < \delta$ implies $d(T^n(x), T^n(y)) < \varepsilon$ for every $n \in \mathbb{Z}$.
a) Prove that every point of $X$ is stable if $X$ is a compact metric topological group and $T$ is a translation.
b) Prove that every point of $X$ is stable if and only if the family $(T^n)_{n \in \mathbb{Z}}$ is equicontinuous.
c) Prove that if every point of $X$ is stable, then the closure of any orbit is a minimal set.
d) Prove that if every point of $X$ is stable, the restriction $T|A$ is uniquely ergodic for every minimal set $A$.

9.3 Prove that a homeomorphism $T$ of a compact metric space $X$ is uniquely ergodic if and only if it possesses a unique minimal set $A$ and $T|A$ is uniquely ergodic.
10. Shifts: the Probabilistic Viewpoint

Let $X$ be a topological space. We denote by $B(X)$ the set of sequences $\theta: \mathbb{Z} \to X$ and by $B^+(X)$ the set of sequences $\theta: \mathbb{Z}^+ \to X$, both endowed with the product topology. The (left) shift $\sigma: B(X) \to B(X)$ is the map defined by
$$\sigma(\theta)(n) = \theta(n + 1).$$
The same definition applies for the (one-sided) shift $\sigma: B^+(X) \to B^+(X)$. Both maps $\sigma$ are continuous. When $X$ is finite we identify it with the set $\{1, \dots, n\}$, endowed with the discrete topology, and we write $B(n) := B(X)$, $B^+(n) := B^+(X)$. For $\mu \in \mathcal{M}(B(X))$ we denote by $B_\mu(X)$ the space $B(X)$ endowed with the measure $\mu$; $B^+_\mu(X)$ is defined analogously. Given Borel sets $A_0, \dots, A_m$ in $X$, and $j \in \mathbb{Z}$, we define the cylinder $C(j, A_0, \dots, A_m)$ as
$$C(j, A_0, \dots, A_m) = \{\theta \in B(X) \mid \theta(j + i) \in A_i,\ 0 \le i \le m\}.$$
Disjoint unions of cylinders form an algebra which generates the Borel $\sigma$-algebra of $B(X)$. Thus every probability measure $\mu \in \mathcal{M}(B(X))$ is determined by its restriction to this algebra, and, by the approximation theorem (0.1.2), given a Borel set $A \subset B(X)$ and some $\varepsilon > 0$, there exists a disjoint union of cylinders $A_0$ such that $\mu(A \,\Delta\, A_0) \le \varepsilon$. Given a measure $\mu_0 \in \mathcal{M}(X)$, it can be proved that there exists a unique measure $\mu \in \mathcal{M}(B(X))$, called the product measure associated with $\mu_0$, such that
$$\mu(C(j, A_0, \dots, A_m)) = \prod_{i=0}^{m}\mu_0(A_i).$$
It is easy to verify that $\mu$ is $\sigma$-invariant. The shift $\sigma: B_\mu(X) \to B_\mu(X)$ is called a Bernoulli shift. When $X = \{1, \dots, n\}$ and $\mu_0 \in \mathcal{M}(X)$ is the probability defined by $\mu_0(\{i\}) = p_i$, $0 \le p_i \le 1$, $\sum_{i=1}^n p_i = 1$, we will sometimes write $B(p_1, \dots, p_n) := B_\mu(X)$.
Giving a $\sigma$-invariant probability measure on the Borel $\sigma$-algebra of $B(X)$ is equivalent to giving a stationary stochastic process with values in $X$. To understand this, recall the definitions of random variable and stochastic process: A random variable with values in a topological space $X$ is a measurable map $\xi$ from a probability space $(Y, \mathcal{A}, \nu)$ into $X$, and the probability measure $\mu \in \mathcal{M}(X)$ associated to $\xi$ is defined by $\mu(A) = \nu(\xi^{-1}(A))$. Random variables are mathematical models to describe outcomes of non-deterministic experiments or phenomena. For example, if the outcome of an experiment or measurement is expressed as a real number, and we do not have enough information to predict this number exactly, we can at least give for every interval $[a, b] \subset \mathbb{R}$ the "probability" that the number will fall inside this interval. The domain $(Y, \mathcal{A}, \nu)$ of $\xi$ is, most often, a purely mathematical object which is never described, exactly because its points contain the information which would make the outcome deterministic. The reader who is interested in a conceptual discussion of modeling by random variables will find an elegant treatment of the topic in Billingsley [B2].

A stochastic process with values in $X$ is a sequence $(\xi_n)_{n \in \mathbb{Z}}$ of random variables $\xi_n: (Y, \mathcal{A}, \nu) \to X$. Consider the map $\xi: Y \to B(X)$ defined by
$$\xi(x)(n) = \xi_n(x).$$
Associated to the stochastic process we have the probability measure $\mu \in \mathcal{M}(B(X))$ defined by
$$\mu(A) = \nu(\xi^{-1}(A))$$
for any Borel set $A$ in $B(X)$. In other words, $\mu$ is the probability measure associated to the random variable $\xi$ with values in $B(X)$. If $\mu$ is $\sigma$-invariant we say that the process is stationary. Stochastic processes are models for a series of non-deterministic experiments or measurements. If the outcomes of these experiments are real numbers, for example, they can be described as a stochastic process $(\xi_n)_{n \in \mathbb{Z}}$ with values in $\mathbb{R}$, and given intervals $[a_i, b_i]$ and integers $m_i$, $1 \le i \le l$, the value $\mu(\bigcap_{i=1}^{l} C(m_i, [a_i, b_i]))$ is interpreted as the "probability" that the $m_i$-th outcome falls between $a_i$ and $b_i$, for $1 \le i \le l$. For comments and examples, see Breiman [B11].
A Markov measure in $B(n)$ is an element $\mu \in \mathcal{M}_\sigma(B(n))$ such that there exist numbers $P_{ij}$, $1 \le i, j \le n$, satisfying
$$\mu(C(m, k_1, \dots, k_r)) = p_{k_1}P_{k_1k_2}\cdots P_{k_{r-1}k_r}$$
for every cylinder $C(m, k_1, \dots, k_r)$. We describe the general method for constructing Markov measures:

Definition. An $n \times n$ stochastic matrix is a matrix $P$ whose coefficients $P_{ij}$ satisfy
a) $\sum_j P_{ij} = 1$;
b) there exists a vector $(p_1, \dots, p_n)$ such that
$$p_j > 0, \quad j = 1, \dots, n,$$
$$\sum_{i=1}^{n} p_iP_{ij} = p_j, \quad j = 1, \dots, n.$$
We say that the vector $(p_1, \dots, p_n)$ is a probability vector associated to the matrix $P$.
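The relation $\sum_i p_iP_{ij} = p_j$ is what makes the cylinder formula for Markov measures consistent. The following sketch illustrates this on a hypothetical $2 \times 2$ example (the matrix, the vector and the function name are chosen for illustration, and symbols are numbered from 0 rather than 1). It computes cylinder measures from the product formula and checks that summing over an extra symbol on the right, or on the left, gives back the measure of the shorter cylinder.

```python
from fractions import Fraction
from itertools import product

# Hypothetical stochastic matrix (rows sum to 1) and probability vector p with pP = p.
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
p = [Fraction(1, 3), Fraction(2, 3)]


def cylinder_measure(word):
    """Measure p[k_1] * P[k_1][k_2] * ... * P[k_{r-1}][k_r] of a cylinder.

    The result does not depend on the position m of the cylinder, which is
    why m is not an argument.
    """
    value = p[word[0]]
    for a, b in zip(word, word[1:]):
        value *= P[a][b]
    return value


def total_mass(length):
    """Sum of the measures of all cylinders of a given length."""
    return sum(cylinder_measure(w) for w in product(range(len(p)), repeat=length))
```

Extending a word on the right uses only that the rows of $P$ sum to 1; extending it on the left uses the relation $pP = p$, and this is the step behind the $\sigma$-invariance of the measure.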
Theorem 10.1. Given a stochastic matrix $P$ with coefficients $P_{ij}$, $1 \le i, j \le n$, and its associated probability vector $p = (p_1, \dots, p_n)$, there exists a unique measure $\mu \in \mathcal{M}(B(n))$ such that
$$\mu(C(m, k_1, \dots, k_r)) = p_{k_1}P_{k_1k_2}\cdots P_{k_{r-1}k_r}$$
for every $r \ge 1$, $m$, $k_1, \dots, k_r$. Moreover, $\mu$ is a $\sigma$-invariant Markov measure.

Proof. Let $\mathcal{A}_0$ be the algebra of finite disjoint unions of cylinders. We define $\mu: \mathcal{A}_0 \to \mathbb{R}$ by
$$\mu(C(m, k_0, \dots, k_r)) = p_{k_0}P_{k_0k_1}\cdots P_{k_{r-1}k_r}$$
on cylinders, and, for $A \in \mathcal{A}_0$ a union $A_1 \cup \dots \cup A_s$ of disjoint cylinders, by
$$\mu(A) = \mu(A_1) + \dots + \mu(A_s).$$
Then $\mu$ is clearly additive, and it is easy to see that
$$\mu(A) = \sup\{\mu(C) \mid C \in \mathcal{A}_0,\ C \text{ compact},\ C \subset A\}.$$
Thus, by theorem 0.1.4, $\mu$ is a measure, and it is straightforward to check that it satisfies the desired condition. This condition also implies that $\mu$ is Markov. Finally, $\mu(\sigma^{-1}(A)) = \mu(A)$ for $A \in \mathcal{A}_0$, so, by 2.1, $\mu$ is $\sigma$-invariant. □

In this situation $P$ is called the transition matrix for the process, and the $P_{ij}$ are its transition probabilities. In what follows, and whenever we discuss spaces of the type of $B(n)$, we will write
$$C(m, k_0, \dots, k_l) = \{\theta \in B(n) \mid \theta(m + j) = k_j,\ 0 \le j \le l\},$$
and denote by $B(p, P)$ the space $B(n)$ endowed with the Markov probability measure associated to the vector $p$ and the transition matrix $P$.

In order to illustrate the statistical significance of Markov processes, we shall present an example of a sequence of random events which is naturally modeled by a Markov process. Consider a building with $N$ fire extinguishers, which are inspected every year and replaced if their pressure has fallen under a certain level or if they have not been replaced in the previous inspection. Assume that the probability for the pressure of an extinguisher to fall below the safety level in one year is $p$. If in last year's inspection $i$ extinguishers were replaced, what is the probability that $j$ extinguishers will be replaced in this year's inspection? The condition that $i$ extinguishers were replaced last year means that $N - i$ of them will be "old" this year. Thus
$$P_{ij} = 0 \quad \text{for } j < N - i.$$
On the other hand, if $j \ge N - i$, we must compute the probability of finding $j - (N - i)$ extinguishers below the safety level among the $i$ which were replaced last year. This number is
$$P_{ij} = \binom{i}{j + i - N}p^{\,j+i-N}(1 - p)^{N-j} \quad \text{for } j \ge N - i.$$
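The transition probabilities of the fire extinguisher example can be tabulated directly from the binomial formula. The sketch below is an illustration with hypothetical names, assuming states $i, j \in \{0, \dots, N\}$ (the number of extinguishers replaced) and exact rational arithmetic; it builds the matrix and allows one to check that each row sums to 1.

```python
from fractions import Fraction
from math import comb


def transition_matrix(N, p):
    """Matrix P[i][j]: probability of replacing j extinguishers this year,
    given that i were replaced last year (0 <= i, j <= N)."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        # The N - i old extinguishers are always replaced, so j >= N - i.
        for j in range(N - i, N + 1):
            k = j + i - N  # number of failures among the i new extinguishers
            P[i][j] = comb(i, k) * p ** k * (1 - p) ** (N - j)
    return P


def row_sums(P):
    """Sum of each row; every entry should equal 1 for a stochastic matrix."""
    return [sum(row) for row in P]
```

Row $i$ is the binomial distribution of the number of failures among $i$ new extinguishers, shifted by the $N - i$ compulsory replacements, which is why the rows sum to 1.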
The matrix P = (p,i) is stochastic. If we assume that in the first year all the extinguishers were new, the probability pj"> that j extinguishers will be replaced in the n-th inspection is given by the formula
p 0, ... , q. > 0.

Corollary 10.3. If $Q$ satisfies the assumptions of the theorem and moreover $\sum_i q_{ij} = 1$ for every $j$, then $\lambda = 1$, we can choose $q$ uniquely so that $\sum_i q_i = 1$, and if $x = (x_1, \dots, x_n)$ satisfies $\sum_i x_i = 1$ then
$$\lim_{n \to +\infty} Q^n x = q.$$
We will prove the corollary from the theorem; for a proof of the theorem, see [G2].
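The convergence $Q^n x \to q$ asserted by the corollary can be watched on a small example. The sketch below assumes a hypothetical $2 \times 2$ matrix with positive entries whose columns sum to 1 (positivity is an assumption made here in the spirit of the Perron-Frobenius setting); the names are illustrative and exact fractions are used so that the fixed vector can be checked exactly.

```python
from fractions import Fraction

# Hypothetical positive matrix whose columns sum to 1.
Q = [
    [Fraction(1, 2), Fraction(1, 5)],
    [Fraction(1, 2), Fraction(4, 5)],
]
# Fixed vector of Q with coordinates summing to 1.
q = [Fraction(2, 7), Fraction(5, 7)]


def apply(Q, x):
    """Matrix-vector product Q x."""
    return [sum(Q[i][j] * x[j] for j in range(len(x))) for i in range(len(Q))]


def iterate(Q, x, n):
    """Compute Q^n x by repeated multiplication."""
    for _ in range(n):
        x = apply(Q, x)
    return x
```

In this example the second eigenvalue of $Q$ is $3/10$, so the distance from $Q^n x$ to $q$ shrinks by that factor at each step for any $x$ whose coordinates sum to 1, including vectors with negative entries.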
From conclusions (b) and (c) of the theorem it follows that we can take $q = (q_1, \dots, q_n)$ as the unique point in the intersection of the one-dimensional space $(Q - \lambda I)^{-1}(0)$ with the affine hyperplane $H = \{(x_1, \dots, x_n) \mid \sum_i x_i = 1\}$. Then we have for every $i$
$$\lambda q_i = \sum_j q_{ij}q_j.$$
Adding these equalities,
$$\lambda = \sum_{i,j} q_{ij}q_j = \sum_j\Big(\sum_i q_{ij}\Big)q_j = \sum_j q_j = 1.$$
Finally, let $H_0 = \{(x_1, \dots, x_n) \mid \sum_i x_i = 0\}$. If $x \in H$ we can write $x = q + y$, with $y \in H_0$. Then
$$Q^n x = Q^n q + Q^n y = q + Q^n y.$$
The condition $\sum_i q_{ij} = 1$ for every $j$ implies that $QH_0 = H_0$. Since 1 is the dominant eigenvalue and its eigenspace is not contained in $H_0$, it follows that the eigenvalues of $Q|H_0$ all have absolute value less than 1. This implies that $\lim_{n \to +\infty}\|Q^n|H_0\| = 0$, and thus $\lim_{n \to +\infty} Q^n y = 0$. □

We now return to our example. Observe that
p 0 such that for every n ;;,, N.
Proposition 11.4. Let $X$ be a separable Baire metric space and $T: X \to X$ a continuous map. The following properties are equivalent:
a) $T$ is transitive;
b) For a residual subset of points $x \in X$, we have $\omega(x) = X$;
c) For every open subset $U \subset X$ the set $\bigcup_{n \ge 0} T^{-n}(U)$ is dense in $X$.
From (c) it immediately follows that topologically mixing continuous maps of compact spaces are necessarily transitive. The converse is false; for example, transla-
tions in compact groups are never mixing (exercise 11.10) but, as we have seen in the previous section, they can be minimal (hence transitive). Proof (a)=>(c). Let Uc X be open, and yeX. Let xeX be such that w(x) = X. Then, since yew(x), we can find for every neighborhood V of y some n > 0 such that Tn(x)E V; also, since w(x) = X, we can find an arbitrarily large m such that Tm(x)e U. Let's take m > n. Then, since Tn(x)E Vand Tm(x)E U, we have y- 0. Since U contains some u., we only have to prove the claim for U = u•. We have xeS
Cs. C
nu
y-i(U.),
i~Oj~i
which means that $x \in T^{-m}(U_n)$ for infinitely many values of $m$. For the same values we have $T^m(x) \in U_n$.
(b) $\Rightarrow$ (a). Trivial. □

Corollary 11.5. A homeomorphism $T$ of a Baire space $X$ is transitive if and only if $T^{-1}$ is transitive.

Proof. We will prove that $T^{-1}$ satisfies (c), i.e. that $\bigcup_{j \ge 0} T^j(U)$ is dense for every open set $U \subset X$. Using condition (b), we pick $x$ such that $\omega(x) = X$, so there exists $m$ such that $T^m(x) \in U$. It follows that
$$\overline{\bigcup_{j \ge 0} T^j(U)} \supset \overline{\bigcup_{j \ge 0} T^j(T^m(x))} \supset \omega(T^m(x)) = \omega(x) = X.$$
□
It is true that every shift is topologically mixing, but we will instead prove a more general result. We say that a subset $\Lambda \subset B(n)$ is a subshift if it is compact and invariant under $\sigma$, i.e. if $\sigma(\Lambda) = \Lambda$. By extension we also say that the subshift $\Lambda$ is transitive (resp. topologically mixing or minimal) if $\sigma|\Lambda$ is transitive (resp. topologically mixing or minimal). The subshift $\Lambda \subset B(n)$ is said to be of finite type if there exists an $n \times n$ matrix $A = (a_{ij})$ such that the $a_{ij}$ are all 0 or 1 and that $\theta \in \Lambda$ if and only if
$$a_{\theta(i)\theta(i+1)} = 1$$
11. Shifts: the Topological Viewpoint
for every $i \in \mathbb{Z}$. In this case we write $\Lambda = B(A)$. Conversely, given an $n \times n$ matrix $A = (a_{ij})$ whose coefficients are all 0 or 1, we can define the subset $\Lambda := \{\theta \in B(n) \mid a_{\theta(i)\theta(i+1)} = 1 \text{ for every } i \in \mathbb{Z}\}$, and it is easy to prove that $\Lambda$ is a compact set and $\sigma(\Lambda) = \Lambda$. If $a_{ij} = 1$ for every $i$ and $j$ we get $\Lambda = B(n)$.

Proposition 11.6. Let $B(A) \subset B(n)$ be a subshift of finite type, and let $(a^{(m)}_{ij})$ be the coefficients of $A^m$, $m \ge 0$.
a) $B(A)$ is transitive if and only if for every $1 \le i, j \le n$ there exists $m > 0$ such that $a^{(m)}_{ij} > 0$.
b) $B(A)$ is topologically mixing if and only if for every $1 \le i, j \le n$ there exists $m > 0$ such that $a^{(l)}_{ij} > 0$ for every $l \ge m$.

In the proof we shall use the following
In the proof we shall use the following Lemma 11.7. For every 1,;;;; i,j,;;;; n and m;;, 1, we have alj> the set of maps y: {O, ... ,m} ➔ {1, ... , n} such that
y(O)
= i;
y(m)
= j;
= #S&m>,
where s&m> is
arr, where t is such that a,i = 1. Then
#S
=
L
L
#St1 =
atr-F'O
al;"1 = Ial;"1a,i = alj+i>.
t1.irf:.O
t
This completes the induction step, because there is a bijection
given by
o-m(C(j, k 0 , .•. , k,) n B(A)) n (C(i, h0 , ••• , h.) n B(A))
= C(j + m, k 0 , ••• , k,) n C(i, h0 , .•• , h.) n B(A). Consider 0 1 E C(j, k 0 , •.• , k,) n B(A), 02 E C(i, h 0 , ... , hs). Take m such that
(1)
+ s > 0. i
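The second property implies, by lemma 11.7, that there exists $\alpha: \{0, \dots, m + j - i - s\} \to \{1, \dots, n\}$ such that
$$\alpha(0) = h_s, \tag{2}$$
$$\alpha(m + j - i - s) = k_0, \tag{3}$$
$$a_{\alpha(t)\alpha(t+1)} = 1, \quad 0 \le t < m + j - i - s. \tag{4}$$
We define $\theta \in B(A)$ by
$$\theta(t) = \theta_2(t), \quad t \le i + s,$$
$$\theta(t) = \alpha(t - i - s), \quad i + s \le t \le m + j,$$
$$\theta(t) = \theta_1(t - m), \quad t \ge m + j.$$
Then $\theta \in B(A)$ since $\theta_1 \in B(A)$ and $\theta_2 \in B(A)$ and from (2), (3) and (4). Furthermore,
$$\theta \in C(j + m, k_0, \dots, k_r) \cap C(i, h_0, \dots, h_s),$$
which implies, by (1), that $\sigma^m(U) \cap V \ne \emptyset$.

Conversely, if $\sigma: B(A) \to B(A)$ is transitive, and given $1 \le i, j \le n$, we take $m > 0$ such that $\sigma^m(C(0, j)) \cap C(0, i) \ne \emptyset$. Let $\theta \in \sigma^m(C(0, j)) \cap C(0, i) = C(m, j) \cap C(0, i)$. Then $\theta|\{0, \dots, m\}$ satisfies
$$\theta(0) = i; \qquad \theta(m) = j; \qquad a_{\theta(t)\theta(t+1)} = 1 \quad \text{for } 0 \le t \le m - 1.$$
By the lemma, this implies that $a^{(m)}_{ij} \ge 1$. □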
The reader should undertake the proof of part (b), which is done using the same ideas. The next proposition shows that subshifts of finite type are exactly the support of Markov measures.
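The path-counting interpretation of the coefficients of $A^m$ in Lemma 11.7, and the transitivity criterion of Proposition 11.6 (a), can be tested by brute force on small matrices. The sketch below uses hypothetical helper names and numbers the symbols from 0; it compares the entries of $A^m$ with a direct enumeration of admissible maps $\gamma$, and tests transitivity by inspecting the powers $A^1, \dots, A^n$ only (enough because a shortest admissible path of positive length between two symbols never needs more than $n$ steps).

```python
from itertools import product


def mat_mul(A, B):
    """Product of two square integer matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, m):
    """A^m for m >= 1."""
    result = A
    for _ in range(m - 1):
        result = mat_mul(result, A)
    return result


def count_paths(A, i, j, m):
    """Number of maps gamma on {0, ..., m} with gamma(0) = i, gamma(m) = j
    and A[gamma(t)][gamma(t+1)] = 1 for 0 <= t < m (m >= 1)."""
    n = len(A)
    total = 0
    for middle in product(range(n), repeat=m - 1):
        gamma = (i,) + middle + (j,)
        if all(A[gamma[t]][gamma[t + 1]] == 1 for t in range(m)):
            total += 1
    return total


def is_transitive(A):
    """Criterion of Proposition 11.6 (a), checked on the powers A^1, ..., A^n."""
    n = len(A)
    powers = [mat_pow(A, m) for m in range(1, n + 1)]
    return all(any(Am[i][j] > 0 for Am in powers)
               for i in range(n) for j in range(n))
```

For instance, the matrix with rows $(1, 1)$ and $(1, 0)$ passes the test, while the identity matrix, whose subshift consists of two fixed points, does not.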
Proposition 11.8. Let $P$ be an $n \times n$ stochastic matrix, and $p \in \mathbb{R}^n$ its associated vector of probabilities. The support of the Markov measure $\mu$ associated with $P$ and $p$ is the subshift of finite type $B(A)$, where $A = (a_{ij})$ is given by
$$a_{ij} = 1 \ \text{ if } P_{ij} > 0, \qquad a_{ij} = 0 \ \text{ if } P_{ij} = 0.$$
Proof. For $\theta \in B(A)$, every neighborhood $V$ of $\theta$ in $B(n)$ contains a neighborhood of the form $C(i, \theta(i), \dots, \theta(i + m))$. Thus
$$\mu(V) \ge \mu(C(i, \theta(i), \dots, \theta(i + m))) = p_{\theta(i)}\prod_{t=i}^{m+i-1}P_{\theta(t)\theta(t+1)},$$
and this number is positive because p 9 0 (p is a vector of probabilities associated with P) and Pe(tJO(t+lJ > 0 for every i ..; t ..; m + i (since 0 e B(A) and thus a 6 0 we put
= {yeXld(Tn(x), Tn(y)) ~ e, Vn ~ O}, W."(x) = {yeXld(T"(x), Tn(y)) ~ e, Vn ~ O}. w.•(x)
a) Prove that T(W."(x)) c W."(T(x)) and r 1 (W."(x)) c W."(T(x)). b) If T is expansive with expansiveness constant e0 , prove that.for every there exists N > 0 such that rn(w.•(x))
C
~
> 0,
W/(Tn(x))
for all O < ll < e0 , xeX and n ~ N. c) Consider B(n) with the metric defined in exercise 11.1. Prove that, if eN then a. e W.:(P) if and only if rx.(i) = P(i) for every i > - N.
= 4-
n
Prove that pis a topological invariant. b) Prove, using Lemma 11.7, that for every m > 0 we have #{OeB(A)lu,.(O)
= O} = tr Am.
Deduce that p(ulB(A))
= p(A),
where p(A) denotes the spectral radius of A. Hint.--l;et J be a matrix such that J Ar 1 is triangular. Use the fact that tr An= tr(JAr 1
t.
c) (M. Shub). Prove that p(T) < oo if Tis an expansive homeomorphism of a compact metric space X. Hint: Let U1 , ••• , Um be open sets whose diameters are smaller than some expansiveness constant and whose union is X. Prove that
for every n. 11.5 Prove that the subshifts
are not topologically equivalent.

11.6 Prove that the set of periodic points of a subshift of finite type is dense.

11.7 We say that a function $\alpha: \{0, \dots, m\} \to \{1, \dots, n\}$ is contained in $\theta \in B(n)$ if there exists $j$ such that
$$\alpha(t - j) = \theta(t)$$
for $j \le t \le j + m$.
a) Given a set $\mathcal{F}$ of maps $\alpha_i: \{0, \dots, m_i\} \to \{1, \dots, n\}$, let $\Lambda(\mathcal{F})$ be the set of $\theta \in B(n)$ such that none of the maps in $\mathcal{F}$ is contained in $\theta$. Prove that $\Lambda(\mathcal{F})$ is a subshift.
b) Prove that
$$\Lambda\Big(\bigcup_n \mathcal{F}_n\Big) = \bigcap_n \Lambda(\mathcal{F}_n).$$
c) For $\Lambda \subset B(n)$ a subshift, denote by $\mathcal{F}(\Lambda)$ the set of all maps $\alpha: \{0, \dots, m\} \to \{1, \dots, n\}$, $m > 0$, which are not contained in any $\theta \in \Lambda$. Prove that $\Lambda(\mathcal{F}(\Lambda)) = \Lambda$.
d) Prove that when $\mathcal{F}$ is finite $\Lambda(\mathcal{F})$ is topologically equivalent to a subshift of finite type. Hint: Let $N$ be the maximal cardinality of the domain of the maps $\alpha_i$, for $\alpha_i \in \mathcal{F}$, and let $\mathcal{S}$ be the set of maps $\alpha: \{0, \dots, N\} \to \{1, \dots, n\}$ such that $\alpha|\{0, \dots, s\} \notin \mathcal{F}$ for every $0 \le s \le N$. Define $h: \Lambda(\mathcal{F}) \to B(\mathcal{S})$ by
$$(h(\theta)(j))(i) = \theta(j - N + i),$$
and show that $h(\Lambda)$ is of finite type.
e) Deduce from this that every subshift is the intersection of subshifts of finite type.
11.8 Let $T_1$ and $T_2$ be expansive homeomorphisms of the metric space $X$, both with the expansiveness constant $\varepsilon_0$. Let $h: X \to X$ be a surjective map such that
$$T_1 h = h T_2 \quad \text{and} \quad d(x, h(x)) \le \varepsilon_0/2$$
for every $x$. Prove that $h$ is a homeomorphism.
11.9 Prove that a metric space is totally disconnected if and only if any connected subset in it has only one point.
11.10 Prove that there exist no expansive homeomorphisms of $S^1$. Prove that a translation of a topological group is not topologically mixing.
11.11 a) If $X \subset B(n)$ and $h: X \to B(m)$ is continuous, prove that there exists $N > 0$ such that $h(\theta)(0)$ depends only on $\theta(-N), \dots, \theta(N)$; in other words, if $\alpha$ and $\beta$ belong to $X$ and $\alpha(j) = \beta(j)$ for $|j| \le N$, then $h(\alpha)(0) = h(\beta)(0)$. Hint: Let $C(j_i, k_1, \dots, k_s)$, $1 \le i \le m$, be such that [...]. Then take $N = \max |j_i|$.
b) If $\sigma(X) = X$ and $\sigma h(\theta) = h \sigma(\theta)$ for every $\theta \in X$, prove that there exist $N > 0$ and $F: \{1, \dots, n\}^{2N+1} \to \{1, \dots, m\}$ such that
$$h(\theta)(j) = F(\theta(j - N), \dots, \theta(j + N)).$$
12. Equivalent Maps
The objects we have studied so far are formed by a measure space $(X, \mathcal{A}, \mu)$ and a measure-preserving map $T: X \to X$. We shall now introduce a concept of equivalence between two such objects.
Definition. Let $(X_i, \mathcal{A}_i, \mu_i)$, $i = 1, 2$, be two measure spaces, and $T_i: X_i \to X_i$, $i = 1, 2$, measure-preserving maps. We say that $T_1$ is equivalent to $T_2$ if there exist maps $F: X_1 \to X_2$ and $G: X_2 \to X_1$ such that
a) For every $A_2 \in \mathcal{A}_2$, $F^{-1}(A_2) \in \mathcal{A}_1$ (mod 0) and $\mu_1(F^{-1}(A_2)) = \mu_2(A_2)$;
b) For every $A_1 \in \mathcal{A}_1$, $G^{-1}(A_1) \in \mathcal{A}_2$ (mod 0) and $\mu_2(G^{-1}(A_1)) = \mu_1(A_1)$;
c) $I = GF = FG$ almost everywhere;
d) $T_2 F = F T_1$ almost everywhere.
This is obviously an equivalence relation. One of the aims of ergodic theory is to classify measure-preserving maps modulo this equivalence relation. One of the methods for this analysis consists in associating with a map $T: (X, \mathcal{A}, \mu) \to (Y, \mathcal{B}, \nu)$ a linear operator $U_T: \mathcal{L}^2(Y) \to \mathcal{L}^2(X)$ defined by
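$$U_T f = f \circ T.$$
The fact that $T$ preserves measure can be proved to imply that $U_T$ is a unitary operator, i.e., denoting by $\langle \cdot, \cdot \rangle$ the inner products in $\mathcal{L}^2(X)$ and $\mathcal{L}^2(Y)$:
$$\langle U_T f, U_T g \rangle = \langle f, g \rangle$$
for every $f, g \in \mathcal{L}^2(Y)$.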
Definition. Let $(X_i, \mathcal{A}_i, \mu_i)$, $i = 1, 2$, be measure spaces and $T_i: X_i \to X_i$ measure-preserving maps, with the associated linear operators $U_{T_i}$. We say that $T_1$ and $T_2$ are spectrally equivalent if there exists an isometry $L: \mathcal{L}^2(X_1) \to \mathcal{L}^2(X_2)$ such that $L U_{T_1} = U_{T_2} L$.
If $T_1$ and $T_2$ are equivalent, they are spectrally equivalent, since the map $F: X_1 \to X_2$ given by the definition of equivalence gives rise to an isometry $U_F: \mathcal{L}^2(X_2) \to \mathcal{L}^2(X_1)$ which satisfies the condition $U_F U_{T_2} = U_{T_1} U_F$ (just take $U_F f = f \circ F$). Halmos and Von Neumann [H3] proved in 1942 that spectral equivalence implies equivalence for maps $T: X \to X$ for which the eigenfunctions of $U_T$ generate $\mathcal{L}^2(X)$. As we shall see in Section II.3, translations of $T^n$ are examples of such maps. In general, however, spectrally equivalent maps are not necessarily equivalent; for example, all Bernoulli shifts are spectrally equivalent (Section II.11), but Kolmogorov [K8] proved in 1958 that they are not all equivalent. He did this by associating to each map $T: X \to X$ a real number $h(T)$, called the entropy of $T$ (see Chapter IV), which is an invariant of the map, meaning that all equivalent maps have the same entropy. The entropy of a Bernoulli shift $\sigma: B(p_1, \dots, p_n) \to B(p_1, \dots, p_n)$ is equal to $-\sum_{i=1}^{n} p_i \log p_i$; thus, in particular, the shifts $B(1/2, 1/2)$, $B(1/3, 1/3, 1/3)$ and $B(1/4, 1/4, 1/2)$ have different entropies and so cannot be equivalent. In the case of Bernoulli shifts, the converse also holds: two Bernoulli shifts with the same entropy are equivalent. This remarkable result was proved by Ornstein. Thus we have the following
Theorem 12.1. Two Bernoulli shifts are equivalent if and only if they have the same entropy.
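The entropy formula $-\sum_i p_i \log p_i$ quoted above is easy to evaluate numerically. The short Python sketch below (the function name `bernoulli_entropy` is an assumption of this illustration) computes it for the three shifts mentioned in the text, and for one further pair of probability vectors, chosen here only as an example, whose entropies coincide.

```python
import math

def bernoulli_entropy(probs):
    """Entropy -sum p_i log p_i of the Bernoulli shift B(p_1, ..., p_n)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

h_a = bernoulli_entropy([1/2, 1/2])        # log 2
h_b = bernoulli_entropy([1/3, 1/3, 1/3])   # log 3
h_c = bernoulli_entropy([1/4, 1/4, 1/2])   # (3/2) log 2

# The three values are pairwise different, so the shifts are not equivalent.
print(h_a, h_b, h_c)

# Illustrative pair with equal entropy 2 log 2; by Theorem 12.1 such
# shifts would be equivalent.
h_d = bernoulli_entropy([1/4] * 4)
h_e = bernoulli_entropy([1/2, 1/8, 1/8, 1/8, 1/8])
print(math.isclose(h_d, h_e))
```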
In 1972 Friedman and Ornstein [F3] complemented this result with a criterion for determining if a map is equivalent to a Bernoulli shift (such maps are called Bernoulli transformations). Two of the more notable applications of this criterion are the following:
Theorem 12.2 (Friedman-Ornstein [F3]). A Markov shift $\sigma: B(p, P) \to B(p, P)$ is a Bernoulli shift if and only if there exists $n$ such that all coefficients of $P^n$ are positive.
Theorem 12.3 (Katznelson [K2]). A continuous automorphism of $T^n$ is Bernoulli if none of the eigenvalues of its linear lifting is a root of unity.
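The positivity condition in Theorem 12.2 can be tested mechanically. The sketch below is one possible way to do it, not a method taken from the text: it tracks only the zero pattern of the powers of $P$, and it stops at exponent $(k-1)^2 + 1$ for a $k \times k$ matrix, relying on Wielandt's classical bound for primitive matrices (an outside fact assumed here). The name `has_positive_power` is hypothetical.

```python
def has_positive_power(P):
    """True if some power P^n of the nonnegative square matrix P has all
    entries positive. Only the zero pattern is tracked; the search stops at
    exponent (k-1)^2 + 1 (Wielandt's bound, assumed, not from the text)."""
    k = len(P)
    pattern = [[entry > 0 for entry in row] for row in P]

    def bool_product(X, Y):
        # Boolean matrix product: entry (i, j) is True when a path exists.
        return [[any(X[i][t] and Y[t][j] for t in range(k)) for j in range(k)]
                for i in range(k)]

    power = pattern  # zero pattern of P^1
    for _ in range((k - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = bool_product(power, pattern)
    return False

print(has_positive_power([[0.5, 0.5], [1.0, 0.0]]))  # P^2 is positive
print(has_positive_power([[0.0, 1.0], [1.0, 0.0]]))  # powers alternate
```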
Given Ornstein's theorem (12.1), the automorphisms of $T^n$ which satisfy the assumption of Katznelson's theorem (12.3) are classified by their entropy. We shall prove in Chapter III that the entropy has value $\sum_i \log|\lambda_i|$, where the $\lambda_i$ are the eigenvalues of the linear lift whose absolute value is greater than one.
As an example of the construction of an equivalence, we will now show that the homomorphism $\phi: S^1 \to S^1$, $\phi(z) = z^n$, is equivalent to the one-sided Bernoulli shift $\sigma: B^+(1/n, \dots, 1/n) \to B^+(1/n, \dots, 1/n)$. Let $a_1 = 1, a_2, a_3, \dots, a_n$ be the $n$-th roots of unity taken in counterclockwise order. Let $I_j$ be the interval $(a_j, a_{j+1})$, $1 \le j \le n - 1$, $I_n = (a_n, 1)$. Put $A =$ [...] $> 0$. Thus we can apply Zorn's Lemma to $\mathcal{F}$, and obtain a maximal family $(K_j)_{j \ge 1} \in \mathcal{F}$. We claim that this family satisfies the conclusion of the lemma. For if the complement $X_0$ of its union has positive measure, we can apply Lusin's theorem (0.1.3) to the space $\overline{X}_0$ endowed with the finite measure $\mu$ and the Borel set $X_0 \subset \overline{X}_0$, thus obtaining a compact set $C \subset X_0$ with $\mu(X_0 \setminus C)$ arbitrarily small, in particular $\mu(C) > 0$. Since $C$ is covered by a finite number of balls of radius $r/2$, its intersection with one of them must have positive measure; but then, by adjoining this intersection to $(K_j)_{j \ge 1}$ we can get a strictly larger family satisfying the same conditions, thus contradicting maximality. □
By repeated application of the lemma we now take a sequence $(\mathcal{P}_n)_{n \ge 1}$ of families of compact sets $(K_j^{(n)})_{j \ge 1}$ such that
$$\mu\Bigl(X - \bigcup_j K_j^{(n)}\Bigr) = 0, \qquad \sup_j \operatorname{diam}(K_j^{(n)}) \le 1/n$$
for every $n \ge 1$, and such that each $K_j^{(n)}$ is a union of sets in $(K_i^{(n+1)})_{i \ge 1}$ and, possibly, a set of zero measure. It is then clear that the increasing sequence of partitions $\mathcal{P}_1 \le \mathcal{P}_2 \le \cdots$, where $\mathcal{P}_n = (K_j^{(n)})_{j \ge 1}$, is a Lebesgue sequence (check this). □
Let $\mathcal{A}_0$ be the Borel $\sigma$-algebra on $[0, 1]$ and $\lambda$ the Lebesgue measure.
Proposition 12.6. Let $(X, \mathcal{A}, \mu)$ be a Lebesgue probability space without atoms. There exists a measure-preserving invertible map from $(X, \mathcal{A}, \mu)$ into $([0, 1], \mathcal{A}_0, \lambda)$.
Proof. Let $(\mathcal{P}_n)_{n \ge 1}$, $\mathcal{P}_n = (K_j^{(n)})_{j \ge 1}$, be a Lebesgue sequence of partitions of $X$. Set
$$X_0 := \bigcap_{n \ge 1} \Bigl( \bigcup_{j \ge 1} K_j^{(n)} \Bigr);$$
then $\mu(X_0^c) = 0$. Let $(\mathcal{R}_n)_{n \ge 1}$ be a Lebesgue sequence of partitions of $[0, 1]$ satisfying the following condition: there exist bijections $B_n: \mathcal{R}_n \to \mathcal{P}_n$ such that
$$\mu(B_n(P)) = \lambda(P), \qquad B_n(Q) \subset B_m(P)$$
for every $m \ge 1$, $n \ge m$, $P \in \mathcal{R}_m$ and $Q \in \mathcal{R}_n$ contained in $P$. Now put
$$Y = \bigcap_{m \ge 1} \Bigl( \bigcup_{R \in \mathcal{R}_m} R \Bigr)$$
and define $T_0: Y \to X_0$ by
$$T_0(x) = \bigcap_{m \ge 1} B_m(R_m(x)),$$
where $R_m(x)$ is the element of $\mathcal{R}_m$ containing $x$. The map $T_0$ is well-defined, as can be seen from the assumptions on the $B_m$ and the definition of a Lebesgue space.
Furthermore, for $P \in \mathcal{P}_n$,
$$T_0^{-1}(P) = B_n^{-1}(P) \cap Y,$$
which implies that $T_0^{-1}(P)$ is a Borel set satisfying
$$\lambda(T_0^{-1}(P)) = \mu(P).$$
Since this holds for every $P \in \bigcup_{n \ge 1} \mathcal{P}_n$, Proposition 2.1 implies that $T_0$ is a measure-preserving map from $(Y, \mathcal{A}_0 \cap Y, \lambda|\mathcal{A}_0 \cap Y)$ into $(X, \mathcal{A}, \mu)$. We extend $T_0$ to a map $T: [0, 1] \to X$ by mapping any point outside of $Y$ into an arbitrary point $p \in X_0$. To prove that $T$ is invertible, consider the map $G: X \to [0, 1]$ given by
$$G(x) = \bigcap_{n \ge 1} B_n^{-1}(\mathcal{P}_n(x))$$
for $x \in X_0$ (and $G(X_0^c) = 0$, say). The intersection is non-empty, since we have a nested family of compact sets, and contains a single point because
$$\lim_{n \to +\infty} \operatorname{diam}(B_n^{-1}(\mathcal{P}_n(x))) = \lim_{n \to +\infty} \lambda(B_n^{-1}(\mathcal{P}_n(x))) = \mu\Bigl(\bigcap_{n \ge 1} \mathcal{P}_n(x)\Bigr) = \mu(\{x\}) = 0;$$
this shows that $G$ is well-defined. It is also measurable, being the pointwise limit of measurable maps $G_n: X \to [0, 1]$ defined by
$$G_n(x) = \inf B_n^{-1}(\mathcal{P}_n(x)), \quad x \in X_0,$$
$G_n(X_0^c) = 0$. The verification that $GT(y) = y$ for all $y \in Y$ is left to the reader. We check that $TG(x) = x$ for $x \in G^{-1}(Y)$: if $G(x) \in Y$, it follows that
$$G(x) = \bigcap_{n \ge 1} B_n^{-1}(\mathcal{P}_n(x)),$$
so that
$$TG(x) = \bigcap_{m \ge 1} B_m(B_m^{-1}(\mathcal{P}_m(x))) = \bigcap_{m \ge 1} \mathcal{P}_m(x) = x.$$
Finally, we have to prove that $\mu(G^{-1}(Y)) = 1$. But
$$\mu(G^{-1}(Y)) = \lambda(T^{-1}(G^{-1}(Y))) = \lambda((GT)^{-1}(Y)) = \lambda(Y) = 1,$$
completing the proof. □
Proposition 12.7. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be Lebesgue probability spaces. Then for every homomorphism $F: \mathcal{B} \to \mathcal{A}$ there exists a measure-preserving map $T: X \to Y$ such that $F(A) = T^{-1}(A)$ for every $A \in \mathcal{B}$. Two such maps agree almost everywhere.
Proof. We prove only the case when the spaces $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ have no atoms. For the general case, which follows after smoothing away a few technical difficulties, see exercise 12.10. Because of Proposition 12.6, it is clear that we only have to prove the result at hand for $(X, \mathcal{A}, \mu) = (Y, \mathcal{B}, \nu) = B(1/2, 1/2)$, which is a Lebesgue space by Proposition 12.4. We recall that $B(1/2, 1/2)$ is, by definition, $B(2)$ with the Bernoulli measure $\mu$ such that $\mu(C(0,1)) = \mu(C(0,2)) = 1/2$. Put $\mathcal{P}_1 := \{C(0,1), C(0,2)\}$, $\mathcal{P}_{n+1} :=$ [...] we have
$$y \in \bigcap_{n \ge 1} F(P_n) = \bigcap_{n \ge 1} F(\mathcal{P}_n(x)) = S_x.$$
We now define $T: B(1/2, 1/2) \to B(1/2, 1/2)$ by
$$T(y) = x \quad \text{for } y \in S_x.$$
It is easy to see that (3)
when $P$ is an atom in some partition $\mathcal{P}_n$. Thus, by Proposition 2.1, $T$ is measure-preserving. Besides, if $A$ is a Borel set, there exists for each $\varepsilon > 0$ an $n \ge 1$ and a set $A_\varepsilon$, the union of atoms in $\mathcal{P}_n$, satisfying
Thus but, from (3),
so that On the other hand,
which implies
$$\mu(F(A) \,\triangle\, T^{-1}(A)) \le 2\varepsilon;$$
since $\varepsilon$ is arbitrary, it follows that $F(A) = T^{-1}(A)$.
Now assume that $S: B(1/2, 1/2) \to B(1/2, 1/2)$ is a measure-preserving map satisfying $S^{-1}(A) = F(A)$ for every Borel set $A$. This means that $S^{-1}(A) = T^{-1}(A)$ (mod 0). If $A$ is an atom of $\mathcal{P}_n$, it follows that $S(y) \in A$ for almost every point $y \in T^{-1}(A)$. Since $A = \mathcal{P}_n(T(y))$ for such $y$, we have that $S(y) \in \mathcal{P}_n(T(y))$ almost everywhere in $T^{-1}(A)$. But this holds for every $A \in \mathcal{P}_n$ and every $n \ge 1$, so there exists a set $X_0 \subset B(1/2, 1/2)$ of full measure on which the relation above holds. This implies the last assertion of the proposition. □
Corollary 12.8. Let $(X, \mathcal{A}, \mu)$, $(Y, \mathcal{B}, \nu)$ be Lebesgue probability spaces and $G: \mathcal{A} \to \mathcal{B}$ an isomorphism. Then there exists an invertible measure-preserving map $T: X \to Y$ such that $G(A) = T(A)$ for every $A \in \mathcal{A}$. Two such maps agree almost everywhere on $X$.
Proof. From Proposition 12.7, there exist measure-preserving maps $T: X \to Y$ and $H: Y \to X$ such that
$$T^{-1}(B) = G^{-1}(B), \qquad H^{-1}(A) = G(A)$$
for every $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Thus the composition $TH$ satisfies
$$(TH)^{-1}(B) = B$$
for every $B \in \mathcal{B}$. By the uniqueness part of Proposition 12.7, it follows that $TH(y) = y$ almost everywhere on $Y$, and it follows analogously that $HT(x) = x$ almost everywhere on $X$. □
Definition. Let $T$ and $S$ be automorphisms of probability spaces $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$, respectively. We say that $T$ and $S$ are isomorphic if there exists an isomorphism $F: \mathcal{A} \to \mathcal{B}$ such that $\tilde{S} F = F \tilde{T}$; i.e. such that the following diagram commutes:
$$\begin{array}{ccc} \mathcal{A} & \xrightarrow{\ \tilde{T}\ } & \mathcal{A} \\ \downarrow F & & \downarrow F \\ \mathcal{B} & \xrightarrow{\ \tilde{S}\ } & \mathcal{B} \end{array}$$
It is clear that isomorphism is an equivalence relation. Equivalent automorphisms are isomorphic. It follows from Proposition 12.7 that for Lebesgue spaces the converse also holds. The reader should verify both assertions.
Exercises
12.1 Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces, and $T: X \to Y$ a measure-preserving map.
a) Prove that $U_T$ is surjective if and only if the map $\tilde{T}: \mathcal{B} \to \mathcal{A}$ given by $\tilde{T}(A) = T^{-1}(A)$ is an isomorphism.
b) Prove that for Lebesgue spaces $U_T$ is surjective if and only if $T$ is invertible.
12.2 Let $(X, \mathcal{A}, \mu)$, $(Y, \mathcal{B}, \nu)$ be measure spaces, and $T: X \to Y$ a measure-preserving map. Prove that if $U_T$ is not surjective there exist disjoint sets $A_1$ and $A_2$ in $\mathcal{A}$ such that $\mu(A_1) \ne 0$, $\mu(A_2) \ne 0$ and $T(A_1) = T(A_2)$ (mod 0). Hint: Take $f \in \mathcal{L}^2(X, \mathcal{A}, \mu)$ such that $U_T^* f = 0$. Then put $A_1 = f^{-1}((0, +\infty))$, $A_2 = f^{-1}((-\infty, 0))$.
12.3 Let $X$ be a compact metric space, $T: X \to X$ an expansive homeomorphism with constant $\varepsilon$, and $\mathcal{P} = \{P_1, \dots, P_n\}$ a finite partition of $X$ all of whose atoms have diameter less than $\varepsilon$. Consider the map $h: X \to B(n)$ defined by the property $T^n(x) \in P_{h(x)(n)}$.
a) Prove that $h$ is measurable and one-to-one.
b) Prove that for every $\mu \in \mathcal{M}_T(X)$ there exists $\nu \in \mathcal{M}_\sigma(B(n))$ such that $T$ is equivalent to $\sigma: B_\nu(n) \to B_\nu(n)$.
12.4 Let $S$ be the set of all maps from $\{0, \dots, m\}$ into $\{1, \dots, n\}$. Consider $h: B(n) \to B(S)$ defined by
$$(h(\theta)(j))(i) := \theta(j - m + i).$$
a) Prove that $h$ is continuous and one-to-one.
b) Prove that $h(B(n))$ is a subshift of finite type, topologically equivalent to $\sigma: B(n) \to B(n)$.
12.5 For a probability space $(X, \mathcal{A}, \mu)$, define a metric $d(\cdot, \cdot)$ in $\mathcal{A}$ as follows:
$$d(A, B) = \mu(A \,\triangle\, B).$$
a) Prove that $d(\cdot, \cdot)$ is indeed a metric, and that it makes $\mathcal{A}$ into a complete space.
b) Prove that $(X, \mathcal{A}, \mu)$ is separable, that is, possesses a countable generating (mod 0) subalgebra, if and only if $\mathcal{A}$ is separable.
c) If $A$ is a separable metric space, let $\mathrm{Is}(A)$ be the set of isometries of $A$ endowed with the topology of simple convergence. Take a countable dense subset $\{x_1, x_2, \dots\}$ of $A$, and consider the map $d_0(\cdot, \cdot): \mathrm{Is}(A) \times \mathrm{Is}(A) \to \mathbf{R}$ defined by
Prove that $d_0$ is a metric which generates the topology of $\mathrm{Is}(A)$, and that it is complete if $A$ is.
d) For $A = \mathcal{A}$, show that the set $\mathrm{Is}_\mu(\mathcal{A})$ of isomorphisms of $\mathcal{A}$ is a closed subset of $\mathrm{Is}(\mathcal{A})$.
12.6 Let $\mathcal{A}$ be the $\sigma$-algebra of subsets of $[0, 1] \times [0, 1]$ of the form $A \times [0, 1]$, where $A \subset [0, 1]$ is a Borel set. Denoting by $\lambda$ the Lebesgue measure in both $[0, 1]$ and $[0, 1] \times [0, 1]$ and by $\mathcal{A}_0$ the Borel sets of $[0, 1]$, prove that $([0, 1] \times [0, 1], \mathcal{A}, \lambda)$ is not a Lebesgue space but is isomorphic to $([0, 1], \mathcal{A}_0, \lambda)$.
12.7 Prove that if $(Y, \mathcal{B}, \nu)$ is a separable probability space without atoms and $([0, 1], \mathcal{A}_0, \lambda)$ is the unit interval with the Borel $\sigma$-algebra and Lebesgue measure, then $\mathcal{B}$ is isomorphic to $\mathcal{A}_0$.
12.8 Let $T$ be an invertible automorphism of a probability space $(X, \mathcal{A}, \mu)$, and $\mathcal{P}$ a partition of the same space. We say that $\mathcal{P}$ is $T$-generating if $\bigcup_n \bigvee_{j=-n}^{n} T^j(\mathcal{P})$ generates $\mathcal{A}$.
a) If $\mathcal{P} = \{P_1, \dots, P_n\}$ is $T$-generating, prove that there exists $\nu \in \mathcal{M}_\sigma(B(n))$ such that
$$\nu(C(0, k_0, \dots, k_l)) = \mu\Bigl(\bigcap_{j=0}^{l} T^{-j}(P_{k_j})\Bigr).$$
Hint: Use the Hahn-Kolmogorov theorem, 0.1.5.
b) Deduce that $T$ is equivalent to a shift $\sigma: B_\nu(n) \to B_\nu(n)$.
12.9 a) Let $T$ and $S$ be invertible automorphisms of the probability spaces $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$, respectively, and assume that there exist a $T$-generating partition $\mathcal{P} = \{P_1, \dots, P_n\}$ and an $S$-generating partition $\mathcal{R} = \{R_1, \dots, R_n\}$ such that, for every $m \ge 0$, $1 \le n_j \le n$, $1 \le j \le m$, we have
Prove that $T$ and $S$ are isomorphic.
b) Prove that an invertible automorphism $T$ of a probability space $(X, \mathcal{A}, \mu)$ is isomorphic to a Bernoulli shift $\sigma: B(p_1, \dots, p_n) \to B(p_1, \dots, p_n)$ if and only if there exists a $T$-generating partition $\mathcal{P} = \{P_1, \dots, P_n\}$ such that, for every $m \ge 0$, $1 \le n_j \le n$, $1 \le j \le m$, we have
12.10 a) Let $(X, \mathcal{A}, \mu)$ be a measure space. Prove that $\mu$ possesses a unique decomposition $\mu = \mu_1 + \mu_2$, where $\mu_1$ and $\mu_2$ are measures on $\mathcal{A}$, $\mu_2$ is purely atomic (i.e. each $A \in \mathcal{A}$ is either an atom of $(X, \mathcal{A}, \mu)$ or has zero $\mu_2$-measure), and $(X, \mathcal{A}, \mu_1)$ has no atoms. Hint: Put $\mu_2(A) = \mu(A)$ if $A \in \mathcal{A}$ is an atom, and $\mu_2(A) = 0$ otherwise.
b) For Lebesgue probability spaces $(X, \mathcal{A}, \mu)$ show that $\{p\} \in \mathcal{A}$ for almost every point $p$ in $X$. Deduce that in this case the measure $\mu_2$ found in part (a) is of the form
where the $a_i$ are positive and $\sum_i a_i = 1$.
c) Prove Proposition 12.7.
12.11 Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be Lebesgue probability spaces, and $T: X \to Y$ a measure-preserving map. Prove that if for every $A \in \mathcal{A}$ we have $T(A) \in \mathcal{B}$ (mod 0) and $\nu(T(A)) = \mu(A)$, then $T$ is invertible.
12.12 a) Let $(X, \mathcal{A}, \mu)$ be a measure space, and $F: \mathcal{A} \to \mathcal{A}$ a homomorphism. Prove that there exists a unique isometry $U_F: \mathcal{L}^2(X, \mathcal{A}, \mu) \to \mathcal{L}^2(X, \mathcal{A}, \mu)$ such that
$$U_F f_A = f_{F(A)}$$
for every $A \in \mathcal{A}$.
b) Given an isometry $U: \mathcal{L}^2(X, \mathcal{A}, \mu) \to \mathcal{L}^2(X, \mathcal{A}, \mu)$ which maps characteristic functions into characteristic functions, prove that there exists a homomorphism $F: \mathcal{A} \to \mathcal{A}$ such that $U = U_F$. If $U$ is surjective, show that $F$ is an isomorphism.
c) Given an isometry $U: \mathcal{L}^2(X, \mathcal{A}, \mu) \to \mathcal{L}^2(X, \mathcal{A}, \mu)$ as in (b), and assuming that $(X, \mathcal{A}, \mu)$ is a Lebesgue probability space, show that there exists a measure-preserving map $T: X \to X$ such that $U = U_T$.
12.13 a) Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces, and $T: X \to Y$ a measure-preserving invertible map. Prove that there exists $A \in \mathcal{A}$ with full measure, whose image is in $\mathcal{B}$ and also has full measure, and such that $T|A$ is injective.
b) Prove the converse of Proposition 12.6, namely: if $(X, \mathcal{A}, \mu)$ is a probability space and there exists a measure-preserving invertible map from $(X, \mathcal{A}, \mu)$ to $([0, 1], \mathcal{A}_0, \lambda)$, then $(X, \mathcal{A}, \mu)$ is a Lebesgue space. Hint: use (a) and Proposition 12.4.
Chapter II. Ergodicity
1. Birkhoff's Theorem
Around the turn of the century the work of Boltzmann and Gibbs on statistical mechanics raised a mathematical problem which, in our context, can be stated as follows: given a measure-preserving map $T$ of a space $(X, \mathcal{A}, \mu)$ and an integrable function $f: X \to \mathbf{R}$, find conditions under which the limit
$$\lim_{n \to \infty} \frac{f(x) + f(T(x)) + \cdots + f(T^{n-1}(x))}{n} \qquad (1)$$
exists and is constant almost everywhere. Similar questions had already shown up in other areas of mathematics, for example, in the problem of the average movement of the perihelion in celestial mechanics (see Arnold [A6]). In 1931 Birkhoff [B1] proved that for any $T$ and $f$ the limit (1) exists almost everywhere. From this result he showed that a necessary and sufficient condition for its value to be constant a.e. is that there exist no set $A \in \mathcal{A}$ such that $0 < \mu(A) < 1$ and $T^{-1}(A) = A$. As we shall see below, the fact that the limit is constant easily implies that it is equal to the integral of $f$ over $X$. Maps which satisfy this condition are called ergodic. Birkhoff's result did not close the problem that motivated it, since for the maps that occur in statistical mechanics ergodicity could not then be proved. Only in the sixties did the results of Sinai, and more recently those of Bunimovich, imply that maps analogous to the ones studied in statistical mechanics are ergodic. The reader can find a good short history of statistical mechanics and its connection with the beginnings of Birkhoff's theorem in Mackey's survey [M1]; we will not develop this study here, although in II.4 we describe more precisely some of the results of Sinai and Bunimovich. Quite apart from its origins and its relation to statistical mechanics, Birkhoff's theorem is a lofty result. It is pointless to emphasize its importance here: it will become manifest as we go through the theorem's manifold applications. Now for a precise statement of the theorem:
Theorem 1.1 (Birkhoff). Let $(X, \mathcal{A}, \mu)$ be a probability space and $T: X \to X$ a measure-preserving map.
a) If $f \in \mathcal{L}^1(X)$, the limit
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j(x))$$
exists for almost every point $x \in X$.
b) If $f \in \mathcal{L}^p(X)$, $1 \le p < \infty$, the function $\tilde{f}$ defined by
$$\tilde{f}(x) = \lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j(x))$$
is in $\mathcal{L}^p(X)$ and satisfies
$$\lim_{n \to +\infty} \Bigl\| \tilde{f} - \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \Bigr\|_p = 0, \qquad (2)$$
$$\tilde{f}(T(x)) = \tilde{f}(x) \quad \text{a.e.} \qquad (3)$$
c) For every $f \in \mathcal{L}^p(X)$ we have
$$\int_X \tilde{f} \, d\mu = \int_X f \, d\mu.$$
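The averages appearing in Birkhoff's theorem are simple to compute for explicit maps. The following Python sketch evaluates them for a rotation of the circle $[0, 1)$, which preserves Lebesgue measure; the rotation angle, the observable $\cos(2\pi x)$ (whose integral is $0$) and the names `birkhoff_average`, `rotation` and `observable` are all choices made for this illustration, not taken from the text.

```python
import math

def birkhoff_average(T, f, x, n):
    """Return (1/n) * sum_{j=0}^{n-1} f(T^j(x))."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

ALPHA = math.sqrt(2) - 1  # assumed irrational rotation angle

def rotation(x):
    """Rotation x -> x + ALPHA (mod 1) of the circle [0, 1)."""
    return (x + ALPHA) % 1.0

def observable(x):
    """Integrable function whose integral over [0, 1) is 0."""
    return math.cos(2 * math.pi * x)

# The averages should approach the space average 0 as n grows.
for n in (10, 100, 1000, 10000):
    print(n, birkhoff_average(rotation, observable, 0.1, n))
```

For this particular map the time averages of a continuous function converge at every starting point; the theorem itself only asserts convergence almost everywhere.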
Proof. The crucial point is property (a); the others follow from it by combining it with some elementary facts of integration theory. We thus start by showing these two implications. To prove (b), assume (a) and observe that
$$|\tilde{f}(x)| \le \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} |f(T^j(x))|$$
for almost every point $x$; the second limit exists because of (a). Thus (4)
Since $|\tilde{f}|^p$ is a positive function, it will be integrable if the function given by the limit is, and this, by Fatou's lemma (0.3.2), follows from showing that
$$\liminf_{n \to +\infty} \int_X \Bigl( \frac{1}{n} \sum_{j=0}^{n-1} |f \circ T^j| \Bigr)^p d\mu < +\infty.$$
But
$$\int_X \Bigl( \frac{1}{n} \sum_{j=0}^{n-1} |f \circ T^j| \Bigr)^p d\mu$$
0 and $K_2 > 0$ such that
$$\liminf_{n \to +\infty} \frac{1}{n} C(T^n(x)) \ge K_1, \qquad C(x) \le K_2 \quad \text{for } x \in A$$
and $\mu(A) > 0$. Use the Poincaré recurrence theorem (I.2) to derive a contradiction between the two inequalities.
b) Prove that, if $C: X \to (0, +\infty)$ is measurable and $C \circ T - C$ is integrable,
$$\lim_{n \to +\infty} \frac{1}{n} C(T^n(x)) = 0 \quad \text{a.e.}$$
Hint: Write
$$\frac{1}{n} C(T^n(x)) = \frac{1}{n} \sum_{j=0}^{n-1} (C \circ T - C)(T^j(x)) + \frac{1}{n} C(x),$$
and apply Birkhoff's theorem.
1.5 If $T$ is a continuous map of a compact metric space $X$, prove that the minimal attraction center of $T$ (Chap. I, exercise 8.3) coincides with the closure of the union of supports of all $T$-invariant probability measures.
2. Ergodicity
Consider a measure-preserving map $T$ of a probability space $(X, \mathcal{A}, \mu)$. Recall that a set $A \in \mathcal{A}$ is called $T$-invariant if $T^{-1}(A) = A$.
Definition. $T$ is said to be ergodic if every $T$-invariant set has measure 0 or 1.
Bernoulli shifts are ergodic. The proof uses the following fact, which should be checked by the reader: if $A_i \subset B(p_1, \dots, p_n)$, $i = 1, 2$, are cylinders, there exists $m_0 \ge 0$ such that, for all $m \ge m_0$,
$$\mu(\sigma^{-m}(A_1) \cap A_2) = \mu(A_1)\mu(A_2). \qquad (1)$$
Now assume that $A \subset B(p_1, \dots, p_n)$ is $\sigma$-invariant. Since the $\sigma$-algebra of $B(p_1, \dots, p_n)$ is generated by cylinders, there exists for any $\varepsilon > 0$ a finite union $A_0$ of disjoint cylinders such that $\mu(A_0 \,\triangle\, A) \le \varepsilon$. Property (1) is easily seen to hold for finite unions of cylinders as well as for cylinders. Thus, for some $m \ge 0$,
$$\mu(\sigma^{-m}(A_0) \cap A_0^c) = \mu(A_0)\mu(A_0^c), \qquad \mu(\sigma^{-m}(A_0^c) \cap A_0) = \mu(A_0^c)\mu(A_0).$$
It follows that
$$\mu(\sigma^{-m}(A_0) \,\triangle\, A_0) \le \mu(\sigma^{-m}(A) \,\triangle\, \sigma^{-m}(A_0)) + \mu(\sigma^{-m}(A) \,\triangle\, A) + \mu(A_0 \,\triangle\, A) = 2\mu(A \,\triangle\, A_0) \le 2\varepsilon. \qquad (2)$$
On the other hand,
$$\mu(\sigma^{-m}(A_0) \,\triangle\, A_0) = \mu(\sigma^{-m}(A_0) \cap A_0^c) + \mu(\sigma^{-m}(A_0^c) \cap A_0) = 2\mu(A_0)\mu(A_0^c) = 2\mu(A_0)(1 - \mu(A_0)). \qquad (3)$$
From (2) and (3) we get
$$\mu(A_0)(1 - \mu(A_0)) \le \varepsilon$$
and, since $\varepsilon$ is arbitrary, $\mu(A)(1 - \mu(A)) = 0$, or, again, $\mu(A) = 0$ or 1.
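The fact about cylinders used in this proof can be checked by brute force in small cases. In the sketch below, a cylinder $A_i$ is taken to fix a word $w_i$ on the coordinates $0, \dots, |w_i| - 1$, the shift is assumed to act by $\sigma(\theta)(i) = \theta(i+1)$, and symbols are 0-based; these conventions and all function names are assumptions of the illustration. With them, $\sigma^{-m}(A_1)$ constrains the coordinates $m, \dots, m + |w_1| - 1$, so independence holds as soon as $m \ge |w_2|$.

```python
from fractions import Fraction
from itertools import product

def word_measure(p, word):
    """Bernoulli measure of the cylinder fixing `word` on consecutive coordinates."""
    measure = Fraction(1)
    for symbol in word:
        measure *= p[symbol]
    return measure

def shifted_intersection_measure(p, w1, w2, m):
    """mu(sigma^{-m}(A1) ∩ A2), where A_i fixes w_i on coordinates 0..len(w_i)-1."""
    length = max(len(w2), m + len(w1))
    total = Fraction(0)
    # Sum the measures of all length-`length` words meeting both constraints.
    for theta in product(range(len(p)), repeat=length):
        if theta[:len(w2)] == tuple(w2) and theta[m:m + len(w1)] == tuple(w1):
            total += word_measure(p, theta)
    return total

p = [Fraction(1, 3), Fraction(2, 3)]  # hypothetical probability vector
w1, w2 = (0, 1), (1, 1, 0)

for m in range(len(w2), len(w2) + 3):
    lhs = shifted_intersection_measure(p, w1, w2, m)
    assert lhs == word_measure(p, w1) * word_measure(p, w2)
```

For small $m$ the two sets of constrained coordinates overlap and the product formula can fail, which is why the statement only holds from some $m_0$ on.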
Definition. A function $f \in \mathcal{L}^1(X)$ is called $T$-invariant if $f \circ T = f$ almost everywhere.
Proposition 2.1. The following properties are equivalent:
1) $T$ is ergodic;
2) If $f \in \mathcal{L}^1(X)$ is $T$-invariant then $f$ is constant almost everywhere;
3) If $f \in \mathcal{L}^p(X)$ is $T$-invariant then $f$ is constant almost everywhere;
4) For every $A, B \in \mathcal{A}$ we have
$$\lim_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} \mu(T^{-m}(A) \cap B) = \mu(A)\mu(B);$$
5) For every $f \in \mathcal{L}^1(X)$ we have $\tilde{f} = \int_X f \, d\mu$ almost everywhere.
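Proof. (3) $\Rightarrow$ (1). If $A \in \mathcal{A}$ is $T$-invariant, its characteristic function $f_A$ is $T$-invariant and lies in $\mathcal{L}^p(X)$. Thus $f_A$ is constant almost everywhere, i.e. $\mu(A) = 0$ or 1.
(1) $\Rightarrow$ (2). If $f \in \mathcal{L}^1(X)$ is $T$-invariant, the set $A_c := \{x \mid f(x) \le c\}$ is invariant for each $c$. Since $T$ is ergodic, this means $\mu(A_c) = 0$ or 1 for each $c$. We leave it to the reader to show that this implies that $f$ is constant almost everywhere.
(2) $\Rightarrow$ (3) is immediate, since $\mathcal{L}^p(X) \subset \mathcal{L}^1(X)$ on a probability space.
(2) $\Rightarrow$ (5). Since $\tilde{f}$ lies in $\mathcal{L}^1(X)$ and is $T$-invariant, it must be a constant. From
$$\int_X \tilde{f} \, d\mu = \int_X f \, d\mu$$
the assertion follows.
(5) $\Rightarrow$ (4). By Birkhoff's theorem (1.1),
$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f_A(T^j(x)) = \tilde{f}_A = \int_X f_A \, d\mu = \mu(A)$$
almost everywhere. By the dominated convergence theorem (0.3.4),
$$\mu(A)\mu(B) = \lim_{n \to +\infty} \frac{1}{n} \int_X \sum_{j=0}^{n-1} f_A(T^j(x)) f_B \, d\mu = \lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} \mu(T^{-j}(A) \cap B).$$
(4) $\Rightarrow$ (1). If $A$ is $T$-invariant, we apply (4) to the sets $A$ and $A^c$. Then
$$\mu(A)\mu(A^c) = \lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} \mu(T^{-j}(A) \cap A^c) = 0,$$
so that $\mu(A) = 0$ or 1. □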
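Condition (4) of Proposition 2.1 can be explored numerically for rotations of the circle $[0, 1)$ with Lebesgue measure, where $T^{-m}(A)$ is again an arc when $A$ is one. The sketch below compares an irrational angle with the rational angle $1/2$, taking $A = B = [0, 1/4)$; the choice of sets and angles, and the names `arc_overlap` and `cesaro_average`, are assumptions made for this illustration.

```python
import math

def arc_overlap(s, a, b):
    """Length of [s, s+a) ∩ [0, b) on the circle R/Z, for 0 <= s <= 1, 0 < a, b < 1."""
    total = 0.0
    for shift in (-1.0, 0.0):  # the arc may wrap around past 1
        lo = max(s + shift, 0.0)
        hi = min(s + shift + a, b)
        total += max(0.0, hi - lo)
    return total

def cesaro_average(alpha, a, b, n):
    """(1/n) * sum_{m<n} lambda(T^{-m}(A) ∩ B) for T(x) = x + alpha (mod 1),
    A = [0, a), B = [0, b)."""
    total = 0.0
    for m in range(n):
        s = (-m * alpha) % 1.0  # T^{-m}(A) is the arc [s, s + a) mod 1
        total += arc_overlap(s, a, b)
    return total / n

irrational = cesaro_average(math.sqrt(2) - 1, 0.25, 0.25, 20000)
rational = cesaro_average(0.5, 0.25, 0.25, 1000)
print(irrational, rational, 0.25 * 0.25)
```

With the irrational angle the averages come close to $\mu(A)\mu(B) = 1/16$, while for the angle $1/2$ they equal $1/8$, in agreement with the fact that this rational rotation is not ergodic.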
In fact, condition (5) only need hold in a dense set of $\mathcal{L}^1(X)$ to imply the other four:
Proposition 2.2. If there exists a dense set $F \subset \mathcal{L}^1(X)$ such that
$$\tilde{f}(x) = \int_X f \, d\mu \quad \text{a.e.}$$
for every $f \in F$, then $T$ is ergodic.
Proof. Since the sequence
$$\frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j$$
converges to $\tilde{f}$ in $\mathcal{L}^1(X)$, it is enough to check that, for $f \in \mathcal{L}^1(X)$, we have
$$\lim_{n \to +\infty} \Bigl\| \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j - \int_X f \, d\mu \Bigr\|_1 = 0.$$
103
fll 1
Take e > 0, and choose g E F such that Ilg implies goT1 l !"f n J=O
-f
gdµII
X
~
1
a/3. Let n0 be such that n ;;,:, n 0
~e/3.
Then, for n;;,:, n 0 , 11~%foT1 - Lfdµt ~~t~foTi- 1tgoTit
JI! ~f
+ n 1=0 go Ti -
Since
llf o T 1 - go T 11I 1 = Jl(f -
we conclude that
"f
II !n 1=0 f
O
T1 -
IL f
X
f
X
gdµII
1
+ IL gdµ- Lf ciµj. g)o T 1II 1 = llf - gJI 1 , and since
gdµ- L f dµI
f dµ
II I
~ Jig -!111,
~ Ilg =fll
1
+ e/3 + Ilg -
fll 1
~ s.
□
Another characterization of ergodicity can be given in terms of the average time $\tau_A(x)$ spent by a point $x$ in a set $A$ (see definition at the end of Section 1), which exists by Birkhoff's theorem.
Proposition 2.3. $T$ is ergodic if and only if
$$\tau_A(x) = \mu(A) \quad \text{a.e.}$$
for every $A \in \mathcal{A}$.
Proof. If $T$ is ergodic then
$$\tau_A(x) = \tilde{f}_A(x) = \int_X f_A \, d\mu = \mu(A) \quad \text{a.e.}$$
Conversely, let $A \in \mathcal{A}$ be $T$-invariant. Assume $\mu(A) > 0$. Since $\tau_A(x) = 1$ for $x \in A$, it follows that $\mu(A) = 1$. □
The following "uniqueness theorem" holds for ergodic maps:
Proposition 2.4. If $T$ is ergodic and $\mu_1: \mathcal{A} \to [0, 1]$ is another $T$-invariant measure, the following conditions are equivalent:
a) $\mu_1 = \mu$;
b) $\mu_1 \ll \mu$;
c) There exists no $T$-invariant set $A \in \mathcal{A}$ such that $\mu(A) = 0$ and $\mu_1(A) \ne 0$.
Proof. (b) $\Rightarrow$ (a). If $\mu_1 \ll \mu$ and $\mu_1$, $\mu$ are invariant, the Radon-Nikodym derivative $d\mu_1/d\mu$ is invariant. Since $T$ is ergodic, $d\mu_1/d\mu$ is a.e. constant, and $\mu_1 = \mu$. (c) $\Rightarrow$ (b). Assume (b) does not hold, and take $A_0 \in \mathcal{A}$ such that $\mu(A_0) = 0$ and $\mu_1(A_0) \ne 0$. The set $A := \bigcup_{n \ge 0} T^{-n}(A_0)$ contradicts (c). (a) $\Rightarrow$ (c). Trivial. □
If $X$ is a set and $\mathcal{A}$ is a $\sigma$-algebra on $X$, the set $\mathcal{M}(X, \mathcal{A})$ of all measures on $\mathcal{A}$ has an obvious vector space structure. If $T: X \to X$ is a measurable map, the set $\mathcal{M}_T(X, \mathcal{A})$ of $T$-invariant probability measures on $\mathcal{A}$ is a convex subset of $\mathcal{M}(X, \mathcal{A})$. The next proposition characterizes the measures that are ergodic with respect to $T$ as the extremal points of the convex set $\mathcal{M}_T(X, \mathcal{A})$. (The expression "$\mu$ is ergodic with respect to $T$" evidently means that $\mu \in \mathcal{M}_T(X, \mathcal{A})$ and $T$ is an ergodic map of $(X, \mathcal{A}, \mu)$. An extremal point $x$ of a convex set $C$ is one which cannot be written $x = \lambda y + (1 - \lambda)z$ for distinct $y, z \in C$ and some $0 < \lambda < 1$.)
Proposition 2.5. A measure $\mu \in \mathcal{M}_T(X, \mathcal{A})$ is ergodic if and only if $\mu$ is an extremal point of $\mathcal{M}_T(X, \mathcal{A})$.
Proof. Assume that $\mu \in \mathcal{M}_T(X, \mathcal{A})$ is ergodic and that $\mu = \lambda\mu_1 + (1 - \lambda)\mu_2$, with $\mu_1, \mu_2 \in \mathcal{M}_T(X, \mathcal{A})$ and $0 < \lambda < 1$. We have $\mu_1 \ll \mu$ since $\lambda \ne 0$, so Proposition 2.4 implies that $\mu_1 = \mu$. Analogously, we have $\mu_2 = \mu$, which shows that $\mu_1 = \mu_2$ and $\mu$ is extremal. Now if $\mu \in \mathcal{M}_T(X, \mathcal{A})$ is not ergodic, there exists a $T$-invariant set $A_0 \in \mathcal{A}$ satisfying $0 < \mu(A_0) < 1$. We define measures $\mu_i \in \mathcal{M}_T(X, \mathcal{A})$, $i = 1, 2$, by
$$\mu_1(A) = \frac{1}{\mu(A_0)} \mu(A \cap A_0), \qquad \mu_2(A) = \frac{1}{\mu(A_0^c)} \mu(A \cap A_0^c)$$
for all $A \in \mathcal{A}$. Then we can write
$$\mu = \mu(A_0)\mu_1 + \mu(A_0^c)\mu_2,$$
showing that $\mu$ is not extremal. □
We next mention an elementary relation between ergodicity and transitivity (see I.11.4):
Proposition 2.6. Let $T$ be a continuous map of a Baire space $X$, and $\mu$ a $T$-invariant probability measure on $X$ assigning positive measure to open sets. If $T$ is ergodic, then it is transitive.
Proof. Let $U_0 \subset X$ be a non-empty open set. Then the set
$$U := \bigcup_{n \ge 0} T^{-n}(U_0)$$
has positive measure and is $T$-invariant. Thus its complement has measure 0, hence empty interior (since $\mu$ is positive on open sets). This means $U$ is dense, and, by I.11.5, $T$ is transitive. □
The converse does not hold. For example, if $\mu_1$ and $\mu_2$ are distinct Bernoulli measures on the space $B(2)$, the measure $\mu := (\mu_1 + \mu_2)/2$ assigns positive measure to open sets (since $\mu_1$ and $\mu_2$ do), but is not ergodic, by Proposition 2.5. Also Furstenberg's example (Section 7), already mentioned to illustrate the relationship between minimal and uniquely ergodic maps, and between equivalences and topological equivalences, furnishes a counter-example to the converse of 2.6, since it is an area-preserving and minimal (hence transitive) analytic diffeomorphism of $T^2$, but is not ergodic.
Proposition 2.7. Let $G$ be a compact abelian group, and $L_x: G \to G$ a left translation. The following properties are equivalent:
a) $L_x$ is transitive;
b) $L_x$ is minimal;
c) $L_x$ is ergodic;
d) $L_x$ is uniquely ergodic;
e) The set $\{nx \mid n \in \mathbf{Z}^+\}$ is dense.
Proof. (c) $\Rightarrow$ (a) follows from Proposition 2.6. (a) $\Rightarrow$ (e). There exists $y$ such that $\{L_x^n y \mid n \in \mathbf{Z}^+\}$ is dense. But $\{nx \mid n \in \mathbf{Z}^+\} = L_y^{-1}(\{nx + y \mid n \in \mathbf{Z}^+\}) = L_y^{-1}(\{L_x^n y \mid n \in \mathbf{Z}^+\})$, proving (e). (e) $\Rightarrow$ (d) follows from Theorem I.9.1. (d) $\Rightarrow$ (b) follows from Proposition I.9.5. (b) $\Rightarrow$ (e). Trivial. (d) $\Rightarrow$ (c). Trivial. □
Exercises
2.1 Let $X$ be a set, $\mathcal{A}$ a $\sigma$-algebra on $X$ and $T: X \to X$ a measurable map. If $\mu_i \in \mathcal{M}_T(X, \mathcal{A})$, $i = 1, \dots, n$, are ergodic and $\mu_i \not\ll \mu_j$ for $i \ne j$, prove that there exist disjoint sets $A_i \in \mathcal{A}$, $i = 1, \dots, n$, such that
$$\bigcup_{i=1}^{n} A_i = X, \qquad \mu_j(A_i) = \delta_{ij}.$$
2.2 Let $(X, \mathcal{A}, \mu)$ be a probability space and $T: X \to X$ an ergodic map. Prove that a measurable function $f: X \to \mathbf{C}$ is integrable if and only if
$$\limsup_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} |f(T^j(x))| < +\infty$$
for almost every point $x \in X$. Hint: apply exercise 11.2.
2.3 Let $T$ be a continuous map of a compact metric space $X$ and $\mu \in \mathcal{M}_T(X)$ an ergodic measure.
a) Prove that for almost every $x \in X$ we have
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j(x)) = \int_X f \, d\mu$$
for every continuous function $f: X \to \mathbf{C}$.
b) Prove that for almost every $x \in X$ we have
$$\tau_A(x) = \mu(A)$$
for any set $A \in \mathcal{A}$ such that $\mu(\partial A) = 0$.
c) Prove that there exists a countable set $S \subset (0, +\infty)$ such that, for almost every $x \in X$ and every $r \notin S$, we have
$$\tau_{B_r(x)}(x) = \mu(B_r(x)).$$
2.4 Let $X$ be a compact metric space and $T: X \to X$ a homeomorphism. We say that a subset $A \subset X$ is $\omega$-saturated if for every $x \in A$ all points $y \in X$ such that
$$\lim_{n \to +\infty} d(T^n(y), T^n(x)) = 0$$
also belong to $A$. We say that $A$ is $\alpha$-saturated if $x \in A$ implies $y \in A$ for every $y$ such that
$$\lim_{n \to +\infty} d(T^{-n}(x), T^{-n}(y)) = 0.$$
a) Prove that if $f: X \to \mathbf{R}$ is a continuous function and $a \in \mathbf{R}$, the set of $x \in X$ such that the limit
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j(x))$$
exists and is $< a$ is an $\omega$-saturated Borel set.
b) Prove that if $\mu \in \mathcal{M}_T(X)$ is such that every $\omega$-saturated Borel set of positive measure intersects every $\alpha$-saturated Borel set of positive measure in a set of positive measure, then $\mu$ is ergodic. Hint: Use (a) and Corollary 1.4.
2.5 Let $(X, \mathcal{A}, \mu)$ be a probability space and $T: X \to X$ a measure-preserving map. Prove that if $f \in \mathcal{L}^2(X)$ satisfies
then
for almost every $x \in X$. Hint: Use Chebyshev's inequality, exercise I.1.4.
3. Ergodicity of Homomorphisms and Translations of the Torus
According to Corollary I.7.2, all surjective continuous homomorphisms and all translations of $T^n$ preserve Haar measure. Here we investigate when such maps are ergodic. As before, we write $\pi: \mathbf{R}^n \to T^n$ for the canonical homomorphism. We recall that for every continuous homomorphism $f: T^n \to T^n$ there exists a unique linear map $\bar{f}: \mathbf{R}^n \to \mathbf{R}^n$, called the linear lifting of $f$, such that $\pi \bar{f} = f \pi$. The eigenvalues of $f$ are those of $\bar{f}$.
Theorem 3.1. A continuous homomorphism $f: T^n \to T^n$ is ergodic if and only if none of its eigenvalues is a root of unity.
Theorem 3.2. If $x \in \mathbf{R}^n$, the translation $L_{\pi(x)}: T^n \to T^n$ is ergodic if and only if $\langle k, x \rangle \notin \mathbf{Z}$ for every $0 \ne k \in \mathbf{Z}^n$.
The proofs utilize the orthonormal basis of $\mathcal{L}^2(T^n)$ obtained from the Fourier basis $\{e^{2\pi i \langle k, x \rangle} \mid k \in \mathbf{Z}^n\}$ of the cube $[0, 1] \times \cdots \times [0, 1]$. The formal description of this basis and its fundamental properties are the following:
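Lemma 3.3. There exists an orthonormal basis $\{\phi_k \mid k \in \mathbf{Z}^n\}$ of $\mathcal{L}^2(T^n)$ such that
a) $\phi_0 = 1$;
b) For every homomorphism $f: T^n \to T^n$ we have
$$\phi_k \circ f = \phi_{\bar{f}^*(k)},$$
where $\bar{f}^*$ is the adjoint of $\bar{f}$;
c) For every $x \in \mathbf{R}^n$ the translation $L_{\pi(x)}$ satisfies
$$\phi_k \circ L_{\pi(x)} = e^{i \langle k, 2\pi x \rangle} \phi_k.$$
Proof. For each $\psi: \mathbf{R}^n \to \mathbf{C}$ satisfying $\psi(x) = \psi(x + m)$ for all $x \in \mathbf{R}^n$, $m \in \mathbf{Z}^n$, there exists a unique $\hat{\psi}: T^n \to \mathbf{C}$ such that $\hat{\psi} \circ \pi = \psi$. Let $\psi_k: \mathbf{R}^n \to \mathbf{C}$ be defined by $\psi_k(x) = e^{i \langle k, 2\pi x \rangle}$ [...]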
... we conclude that
$$\phi \circ f = \sum_{j=0}^{i-1} \phi_{\tilde f^{*j}(k)} \circ f = \sum_{j=0}^{i-1} \phi_{\tilde f^{*(j+1)}(k)} = \phi,$$
showing that $f$ is not ergodic.
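The criterion of Theorem 3.1 can be checked mechanically when the linear lifting is given by an integer matrix. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and it assumes the lifting is an $n \times n$ integer matrix with non-zero determinant. It uses two standard facts: an eigenvalue is an $m$-th root of unity for some $m$ exactly when $\det(A^m - I) = 0$ for some $m$, and a primitive $m$-th root of unity can be an eigenvalue of an integer $n \times n$ matrix only if Euler's function satisfies $\varphi(m) \le n$, which (since $\varphi(m) \ge \sqrt{m/2}$) forces $m \le 2n^2$.

```python
from fractions import Fraction
from math import gcd


def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    n = len(m)
    rows = [[Fraction(v) for v in row] for row in m]
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result
        result *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return result


def euler_phi(m):
    """Euler's totient function, computed naively."""
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)


def has_root_of_unity_eigenvalue(a):
    """True if some eigenvalue of the integer matrix a is a root of unity."""
    n = len(a)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power = identity
    for m in range(1, 2 * n * n + 1):
        power = mat_mul(power, a)  # now power == a**m
        if euler_phi(m) <= n:
            shifted = [[power[i][j] - identity[i][j] for j in range(n)]
                       for i in range(n)]
            if det(shifted) == 0:
                return True
    return False


def is_ergodic_toral_endomorphism(a):
    """Criterion of Theorem 3.1 applied to the lifting matrix a."""
    return not has_root_of_unity_eigenvalue(a)
```

For instance, the matrix with rows (2, 1) and (1, 1) passes the test, while the identity and the rotation with rows (0, -1) and (1, 0) do not.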
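Proof of Theorem 3.2. Let $\phi \in \mathcal{L}^2(T^n)$ and $\phi \circ L_{\pi(x)} = \phi$. Then
$$\langle \phi, \phi_k\rangle = \langle \phi \circ L_{\pi(x)}, \phi_k\rangle = \langle \phi, \phi_k \circ L_{\pi(-x)}\rangle = e^{-i\langle k, 2\pi x\rangle}\langle \phi, \phi_k\rangle.$$
Thus if $\langle k, x\rangle \notin \mathbb{Z}$ for every $0 \neq k \in \mathbb{Z}^n$, we have ...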
... $T^2$ be a hyperbolic automorphism of $T^2$ (see exercise I.7.8), and let $p$, $q$ be distinct fixed points of $f$. Prove, following steps (a) to (d) below, that if $\varphi\colon T^2 \to \mathbb{R}$ is continuous and $\varphi(p) \neq \varphi(q)$ the set $A$ of points $x \in T^2$ such that the sequence
$$\frac{1}{n}\sum_{j=0}^{n-1} \varphi(f^j(x)), \qquad n \ge 1,$$
converges is meager, i.e. it is contained in a countable union of closed sets of empty interior.
a) Let $0 < a < \tfrac12|\varphi(p) - \varphi(q)|$. For $N \in \mathbb{Z}^+$, define $A_N$ as the set of $x$ such that
$$\left|\frac{1}{n}\sum_{j=0}^{n-1} \varphi(f^j(x)) - \frac{1}{m}\sum_{j=0}^{m-1} \varphi(f^j(x))\right| \le a$$
for every $n, m \ge N$. Prove that $A_N$ is closed for every $N$ and that $A \subset \bigcup_{N=1}^{\infty} A_N$.
b) Define $\varphi_0\colon T^2 \to \mathbb{R}$ by
$$\varphi_0(x) = \limsup_{n\to+\infty} \frac{1}{n}\sum_{j=0}^{n-1} \varphi(f^j(x)).$$
Prove that if $(x_j)_{j\ge0}$ is a sequence of points in $A_N$ converging to a point $x$, for some $N \ge 1$, then $\limsup_{j\to+\infty} |\varphi_0(x_j) - \varphi_0(x)| \le a$.
c) Prove that if $x_0$ is a fixed point of $f$ then $\varphi_0(x) = \varphi_0(x_0)$ for every $x$ in $W^s(x_0)$ (see exercise I.7.8).
d) Deduce that $A_N$ has empty interior for every $N$. Hint: Use the fact, from exercise I.7.8, that $W^s(p)$ and $W^s(q)$ are dense.
4. More Examples of Ergodic Maps
We have already seen several examples of ergodic maps, and many more will appear in the following sections. The purpose of the present section is to give an idea of examples which are relevant in the current development of ergodic theory; it is not meant to be complete, and proofs are only sketchily presented, if at all.
A) Toral billiards are a generalization of the systems described in I.1. Take the torus $T^2$, and draw on it a number of obstacles, that is, disjoint convex sets with $C^\infty$ boundary. (The figure below shows an example with three obstacles.)
Consider a particle which moves in $T^2$ along a geodesic (i.e. the projection of a straight line under the canonical map $\pi\colon \mathbb{R}^2 \to T^2$) until it collides with one of the obstacles. Collisions are perfectly elastic, i.e. the incidence and reflection angles are the same. In order to study the path of the particle we denote by $C_1, \dots, C_k$ the boundaries of the obstacles, oriented counterclockwise and parametrized by arclength $s_1, \dots, s_k$. A collision is uniquely determined by the label $i$ of the obstacle $C_i$, the parameter $s_i$ of the contact point, and the angle $0 < \theta < \pi$ between the incidence vector and the tangent vector to the curve at the contact point. Put $M = (C_1 \times (0, \pi)) \cup \dots \cup (C_k \times (0, \pi))$, and let $T\colon M \to M$ be the map that associates with a collision parametrized by $(s_i, \theta) \in C_i \times (0, \pi)$ the next collision of the particle, parametrized by $T(s_i, \theta)$. This map is well-defined because after a collision at $p$ a particle must necessarily find another obstacle on its path; if not, the particle would eventually return to a point $p'$ arbitrarily close to $p$ (a simple consequence of the fact that all orbits of a translation of the torus are recurrent), and this, as the figure below shows, is impossible, since a collision must have occurred before the particle reaches $p'$.
We can define on $M$ a measure $\mu$ by the formula
$$\mu(A) = K\int_A \sin\theta \, ds \, d\theta,$$
where $A$ is a Borel set contained in $M$ and $K$ is a constant such that $\mu(M) = 1$. It can be proved [A6] that $T$ preserves $\mu$.
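The normalizing constant $K$ can be written out explicitly. The short computation below is a sketch that is not in the original text; the notation $|C_i|$ for the length of the boundary curve $C_i$ is introduced here only for illustration.

```latex
\mu(M) = K \sum_{i=1}^{k} \int_{C_i} \int_0^{\pi} \sin\theta \, d\theta \, ds_i
       = K \sum_{i=1}^{k} 2\,|C_i| ,
\qquad\text{so}\qquad
K = \frac{1}{2\sum_{i=1}^{k} |C_i|} .
```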
Theorem 4.1. If the curvature of the obstacles is everywhere non-zero, $T$ is a Bernoulli map.
The reader can find in Gallavotti [G1] a proof which is quite concise and elegant, considering the technical complexity which generally riddles proofs about billiards. The study of billiards goes back to Gibbs, who introduced as a model for a perfect gas a number of spheres within a bounded region, moving with constant velocity and colliding with one another and with the boundary in a perfectly elastic way. In the thirties, Birkhoff introduced billiards as defined in I.1, but only in the sixties, starting with Sinai's work [S5], were any billiards proved to be ergodic. (See discussion in I.1.)
B) Existence of ergodic diffeomorphisms. It is natural to ask whether every compact manifold without boundary endowed with a volume form has a diffeomorphism that preserves the volume form and is ergodic with respect to the corresponding measure. The answer to this question is yes; it was given by Anosov and Katok. Later Brin, Katok and Feldman proved [B12] the following stronger result:
Theorem 4.2. Every compact manifold without boundary endowed with a $C^\infty$ volume form possesses a $C^\infty$ diffeomorphism which preserves this form and is a Bernoulli map with respect to the corresponding measure.
The proof of this theorem is quite complicated, already in dimension 2. However, we can at least delineate the construction in the two-dimensional case, which involves some interesting ideas. Observe first that given a manifold $M$ it is enough to show that for some $C^\infty$ volume form $\omega_0$ there exists a diffeomorphism $f_0$ with the stated properties, for given any other form $\omega$ we can apply Moser's theorem, which guarantees the existence of a $C^\infty$ diffeomorphism $g\colon M \to M$ such that $g^*\omega = \omega_0$. Then $f = g f_0 g^{-1}\colon M \to M$ preserves $\omega$ and is a Bernoulli map with respect to the measure $\mu_\omega$ corresponding to $\omega$, because $g$ gives an equivalence between $f$ and $f_0$ considered as automorphisms of the probability spaces $(M, \mathcal{A}, \mu_\omega)$ and $(M, \mathcal{A}, \mu_{\omega_0})$ respectively, where $\mathcal{A}$ is the Borel $\sigma$-algebra on $M$.
The case $M = T^2$ is immediate if we take for $f$ an automorphism whose eigenvalues have absolute value different from 1, since such automorphisms are Bernoulli with respect to the Haar measure. Call such an automorphism $f_0$. In order to prove the result for a torus with $k$ handles, the idea is to take $2k$ fixed points $p_1, \dots, p_{2k}$ of $f_0$, remove these points from $T^2$, blow up the punctures into circular holes and finally identify pairs of holes along the boundaries. More specifically, we first make sure that $f_0$ has $2k$ fixed points by replacing $f_0$ by some power of itself (which still has eigenvalues off the unit circle), since $f_0$ certainly has infinitely many periodic points. We next take $k = 1$ and deal only with this case, since the general case is just a matter of repeating the procedure. So let $D_1$ and $D_2$ be two disjoint discs on $T^2$ centered on $p_1$ and $p_2$.
We claim that there exists a $C^\infty$ volume form $\omega$ on $T^2 \setminus (D_1 \cup D_2)$ and a continuous map $h\colon T^2 \setminus (D_1 \cup D_2) \to T^2$ such that:
I) $h(\partial D_i) = p_i$, $i = 1, 2$;
II) The restriction of $h$ to $T^2 \setminus (D_1 \cup D_2)$ is a diffeomorphism which transforms $\omega$ into the volume form $\omega_0$ on $T^2$ given by the Haar measure; and
III) The diffeomorphism $\hat f_0\colon T^2 \setminus (D_1 \cup D_2) \to T^2 \setminus (D_1 \cup D_2)$ defined by $h^{-1} f_0 h = \hat f_0$ extends to a homeomorphism of the manifold with boundary $T^2 \setminus (D_1 \cup D_2)$, and its extension acts on the boundary as shown:
[Figure: action of the extension on $\partial D_i$, $i = 1, 2$.]
i.e. it has four fixed points to which the orbits of the other points converge, as indicated by the arrows. To construct $h$, picture $D_1$ as the disc in $\mathbb{R}^2$ of radius $R$ and centered at the origin:
[Figure: the disc $D_1$ of radius $R$, with the radii $R'$ and $R''$ marked.]
Let $R'' > R' > R$ and take $\phi\colon [0, R''] \to [0, R'']$ of class $C^\infty$ in $(R, R'')$ and such that $\phi(r) = r$ for $r$ very close to $R''$, $\phi(R) = 0$ and
$$\phi(r)\phi'(r) = \frac{R'^2\, r}{R'^2 - R^2}$$
for every $R < r < R'$. Solving for $\phi$ in this interval we have
$$\phi(r) = \frac{R'}{(R'^2 - R^2)^{1/2}}\,(r^2 - R^2)^{1/2}.$$
The reason for this strange-looking concoction is that now we can define $h\colon T^2 \setminus D_1 \to T^2 \setminus \{p_1\}$ by $h(x) = x$ outside $B_{R''}(p_1)$ and
$$h(r, \theta) = (\phi(r), \theta)$$
for $R \le r \le R'$, where $(r, \theta)$ are polar coordinates on the disc. Then the determinant of $h'$ extends to a $C^\infty$ function on the manifold with boundary $T^2 \setminus (D_1 \cup D_2)$ because
$$(\det h')(r, \theta) = \frac{R'^2}{R'^2 - R^2}, \qquad R < r < R'.$$
If $J$ is the extension of $\det h'$ to $T^2 \setminus D_1$, we take
$$\omega = J\omega_0,$$
and then $h^*(\omega_0) = (\det h')\,\omega_0 = \omega$.
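The value of $\det h'$ on the annulus $R < r < R'$ follows directly from the condition imposed on $\phi$. The computation below is a sketch added for the reader's convenience; it is not in the original text and uses only the formulas stated above, written with the area form $r\,dr\,d\theta$ in polar coordinates.

```latex
% h(r,\theta) = (\phi(r),\theta) pulls back the area form \rho\, d\rho\, d\theta to
h^*(\rho\, d\rho \wedge d\theta) = \phi(r)\,\phi'(r)\, dr \wedge d\theta
  = \frac{\phi(r)\phi'(r)}{r}\; r\, dr \wedge d\theta ,
\qquad\text{hence}\qquad
(\det h')(r,\theta) = \frac{\phi(r)\phi'(r)}{r} = \frac{R'^2}{R'^2 - R^2}.
% Consistency check of the explicit solution:
\phi(r)^2 = \frac{R'^2\,(r^2 - R^2)}{R'^2 - R^2}
\;\Longrightarrow\;
2\,\phi(r)\phi'(r) = \frac{2\,R'^2\, r}{R'^2 - R^2},
\qquad \phi(R) = 0 .
```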
The reader should convince himself by drawing pictures that the diffeomorphism $\hat f_0\colon T^2 \setminus D_1 \to T^2 \setminus D_1$ given by $\hat f_0 = h^{-1} f_0 h$ extends to a homeomorphism of $T^2 \setminus D_1$ which acts on $\partial D_1$ in the desired way. It is clear that repeating the procedure for $D_2$ we do indeed get a map $h$ satisfying conditions I, II and III. We now glue the boundaries of $D_1$ and $D_2$ as indicated in the figure:
Then $\hat f_0$ is a homeomorphism of $M^2$ which preserves $\omega$, and $h$ gives an equivalence between $f_0$ and $\hat f_0$ considered as automorphisms of the probability spaces $(T^2, \mathcal{A}, \mu_{\omega_0})$ and $(M^2, \mathcal{A}, \mu_\omega)$. Observe we merely say that $\hat f_0$ is a homeomorphism, and indeed the reader with enough patience can check that it is not a diffeomorphism. In order that $\hat f_0$ be a $C^\infty$ diffeomorphism, it is necessary to start with a different $f_0$. It can be shown that there exist continuous functions $\rho_n\colon [0, +\infty) \to [0, +\infty)$, $n \ge 2$, such that $\rho_n(0) = 0$ and $\rho_n(t) > 0$ for $t > 0$, and having the following property: If $f_1\colon T^2 \to T^2$ is a $C^\infty$ diffeomorphism such that $f_1(p_i) = p_i$, $i = 1, 2$, and
$$f_1'(p_i) = I, \qquad i = 1, 2, \tag{1}$$
$$\|f_1^{(j)}(x)\| \le \rho_j(d(x, p_i)), \qquad j \ge 2,\ i = 1, 2 \tag{2}$$
(i.e. if $f_1$ is close enough to the identity in the neighborhood of $p_1$ and $p_2$), then the diffeomorphism $\hat f_1\colon T^2 \setminus (D_1 \cup D_2) \to T^2 \setminus (D_1 \cup D_2)$ defined by $\hat f_1 = h^{-1} f_1 h$ extends to a $C^\infty$ diffeomorphism of $T^2 \setminus (D_1 \cup D_2)$ such that
$$\hat f_1(x) = x, \tag{3}$$
$$\hat f_1'(x) = I, \tag{4}$$
$$\hat f_1^{(j)}(x) = 0 \tag{5}$$
for every $x \in \partial D_1 \cup \partial D_2$ and $j \ge 2$. On the other hand, Katok showed [K1] that it is possible to modify a diffeomorphism $f_0$ of the kind we used above into a $C^\infty$ diffeomorphism $f_1$ satisfying conditions (1) and (2). Applying the same cut-and-paste procedure to $f_1$, we get a diffeomorphism $\hat f_1\colon M^2 \to M^2$ with the necessary properties, since by (3), (4) and (5) $\hat f_1$ is still $C^\infty$ after gluing the discs to the holes.
The only remaining compact oriented two-dimensional manifold without boundary not encompassed by the above procedure is the sphere $S^2$, which we discuss now. The involution $(x, y) \mapsto (-x, -y)$ of $\mathbb{R}^2$ induces an involution $\rho$ of $T^2$ which can be visualized, as indicated by the figure below, as a 180-degree rotation around the axis $r$. This involution has four fixed points $a_1, a_2, a_3, a_4$. There exists a continuous map $h\colon T^2 \to S^2$ which is a local $C^\infty$ diffeomorphism outside the fixed points of $\rho$ and such that the inverse image $h^{-1}(x)$ of any point $x \in S^2$ is either one of the fixed points of $\rho$ or a pair of points of $T^2$ mapped into one another by $\rho$. We describe this branched cover as follows: Cut the torus in the figure by a vertical plane so that the intersection is two circles (meridians), and consider the half-torus obtained on either side. The map $h$ acts on this half-torus by identifying the points of the boundary as shown in the figure below (where we have included an intermediate step of the "closing" of the two openings):
The action of $h$ on the other half-torus is determined by the fact that the inverse image of points of $S^2$ consists of two points in involution. Now let $f_0\colon T^2 \to T^2$ be a hyperbolic automorphism. It is obvious that
$$f_0 \circ \rho = \rho \circ f_0. \tag{6}$$
This implies that there exists a map $\hat f_0\colon S^2 \to S^2$ such that the diagram
$$\begin{array}{ccc} T^2 & \xrightarrow{\ f_0\ } & T^2 \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle h} \\ S^2 & \xrightarrow{\ \hat f_0\ } & S^2 \end{array}$$
commutes. In fact, this map is given by $\hat f_0(x) := h(f_0(y))$, where $y \in h^{-1}(x)$; this is independent of the choice of $y$ because of (6). It is easy to check that $\hat f_0$ is a homeomorphism, and in fact a $C^\infty$ diffeomorphism outside the points $h(a_j)$, $j = 1, 2, 3, 4$. At these points, however, $\hat f_0$ is not differentiable. To fix this problem one uses a technique similar to the one for the case of the torus with handles: one proves that $f_0$ can be modified so as to obtain a diffeomorphism $f_1\colon T^2 \to T^2$ which is a Bernoulli map with respect to the Haar measure, which satisfies $f_1 \circ \rho = \rho \circ f_1$ and which fulfills conditions analogous to (1) and (2) at the points $a_1, a_2, a_3, a_4$. This implies that $\hat f_1$ is of class $C^\infty$ at these points as well.
C) Genericity of ergodicity. The problem of whether or not ergodicity is a generic property in a given space of automorphisms of a probability space has obvious interest. The answer, and the degree of difficulty in proving it, depends on the space. Here we shall briefly describe the main results on this subject. Consider a compact manifold $M$ without boundary, and let $\omega$ be a volume form on $M$. Let $\mathrm{Hom}_\omega(M)$ be the space of homeomorphisms of $M$ preserving the probability measure associated with $\omega$. The most natural topology on this space is given by the metric
$$d(f, g) := \sup_{x \in M} d(f(x), g(x)) + \sup_{x \in M} d(f^{-1}(x), g^{-1}(x)).$$
With this metric, $\mathrm{Hom}_\omega(M)$ becomes a complete metric space. The question of whether ergodic maps form a residual subset of this space was solved affirmatively by Oxtoby and Ulam, who proved the following result:
Theorem 4.3. Ergodicity is a generic property in $\mathrm{Hom}_\omega(M)$.
Now let us consider the space $\mathrm{Diff}^r_\omega(M)$ of $C^r$ diffeomorphisms of $M$ preserving $\omega$, endowed with the $C^r$ topology. In this context the question of the genericity of ergodicity is open when $\dim M > 2$, and has a negative answer for every two-dimensional $M$ if $r \ge 4$. When $1 \le r \le 3$ it is again open. The negative answer when $\dim M = 2$ and $r \ge 4$ follows from a deep result (the Kolmogorov-Arnold-Moser theorem) about the dynamics of area-preserving diffeomorphisms near fixed points of elliptic type. This result and its negative consequences for ergodicity are discussed in Section 5.
Another space where the question arises naturally is the space $\mathrm{Symp}^r(M, \omega_0)$ of symplectic diffeomorphisms of a compact symplectic manifold without boundary $(M, \omega_0)$. Here again the answer is unknown when $1 \le r \le 3$, and can be proved to be negative, as a consequence of the Kolmogorov-Arnold-Moser theorem, for $r \ge 4$ and $(M, \omega_0)$ arbitrary. In a purely measure-theoretical context, ergodicity is a generic property. More precisely, let $(X, \mathcal{A}, \mu)$ be a Lebesgue probability space (in particular, a probability space where $X$ is a locally complete separable metric space and $\mathcal{A}$ its Borel $\sigma$-algebra). Let $\mathrm{Aut}(X, \mathcal{A}, \mu)$ be the set of automorphisms of $(X, \mathcal{A}, \mu)$. There exists a metrizable topology on $\mathrm{Aut}(X, \mathcal{A}, \mu)$ such that $(T_n)$ converges to $T$ if and only if
$$\lim_{n\to+\infty} \mu(T_n(A) \,\Delta\, T(A)) = 0$$
for all $A \in \mathcal{A}$. This topology can be explicitly defined by taking, for example, a countable family $\{A_n\} \subset \mathcal{A}$ that generates $\mathcal{A}$ (mod 0) (such a family exists because the space is Lebesgue), and putting
$$d(T_1, T_2) := \sum_n \frac{1}{2^n}\, \mu(T_1(A_n) \,\Delta\, T_2(A_n)).$$
This topology makes $\mathrm{Aut}(X, \mathcal{A}, \mu)$ into a complete metric space (exercise 4.4). Ergodicity is a generic property in this space (Rokhlin). The reader may have observed that we have not addressed the question of density of ergodic maps. Since density is a priori a weaker property, it could turn out to hold in cases where genericity fails. The fact is, however, that the density of ergodic maps implies genericity in any reasonable space. The formal statement of this fact can be found in exercise 4.4(c).
D) Interval exchange transformations. We say that a map $T\colon [0, 1] \to [0, 1]$ is an interval exchange transformation if it is injective and there exist $0 = t_0 < t_1 < \dots < t_m = 1$ and numbers $a_i \in \mathbb{R}$, $1 \le i \le m$, such that for every $i$ and every $t_{i-1} < x < t_i$ we have $T(x) = \sigma_i x + a_i$, where $\sigma_i = +1$ or $-1$. Obviously $T$ preserves Lebesgue measure. As an example, take $r$, $s$ such that $0 < r < s < 1$, and define $T\colon [0, 1] \to [0, 1]$ piecewise by the three formulas
$$T(x) = x + 1 - s, \qquad T(x) = -x + r + 1, \qquad T(x) = -x + 1.$$
[Figure: graph of $T$, with the values $1 - s$ and $1 + r - s$ marked on the vertical axis and $r$ on the horizontal axis.]
Then $T$ is an interval exchange transformation.
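The example can be checked with exact arithmetic. The sketch below is not from the original text: it assumes that the three formulas apply, in order, on the open intervals $(0, r)$, $(r, s)$ and $(s, 1)$ (the ranges are not legible in this copy), and the function names are hypothetical. Under that assumption the three images are disjoint open intervals that tile $[0, 1]$ up to endpoints, which is the content of injectivity and measure preservation.

```python
from fractions import Fraction


def make_exchange(r, s):
    """Build T from the three formulas of the example.

    Assumption (not stated legibly in the source): the formulas apply,
    in order, on the open intervals (0, r), (r, s) and (s, 1).
    """
    def T(x):
        if 0 < x < r:
            return x + 1 - s
        if r < x < s:
            return -x + r + 1
        if s < x < 1:
            return -x + 1
        raise ValueError("x must lie in one of the three open intervals")
    return T


def image_of_interval(T, a, b):
    """Image of (a, b) under a piece of T that is affine with slope +1 or -1."""
    mid = (a + b) / 2
    half = (b - a) / 2
    return T(mid) - half, T(mid) + half


def images(r, s):
    """Sorted list of the images of the three intervals of continuity."""
    T = make_exchange(r, s)
    pieces = [(Fraction(0), r), (r, s), (s, Fraction(1))]
    return sorted(image_of_interval(T, a, b) for a, b in pieces)
```

With $r = 1/4$ and $s = 2/3$ the images are $(0, 1/3)$, $(1/3, 7/12)$ and $(7/12, 1)$, that is $(0, 1-s)$, $(1-s, 1+r-s)$ and $(1+r-s, 1)$.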
$0 \le x \le 1$ gives a "topological equivalence" between $T_0$ and $T$. Again, we use quotes because it is not quite a homeomorphism; but even then, it clearly makes sense to say that the dynamics of $T$ is explained by the dynamics of $T_0$. For example, $T_0$ is minimal if and only if $\omega(x) = M^2$ for every $x$ whose $\omega$-limit set is not a singularity. As an example of the construction just described, consider the figure below:
The vector field has two singular points (the second of which cannot be seen in the picture), and we assume it preserves volume. Identifying two by two the boundary components of the figure on the left, we obtain the surface on the right, where the vector field has a finite number of singularities and possesses a transverse circle $\gamma$ (the image under the identification of any boundary component). The return map $T\colon \gamma \to \gamma$ is then an interval exchange transformation. Next we state some fundamental results about interval exchange transformations. We start with some definitions. If $T\colon [0, 1] \to [0, 1]$ is an interval exchange transformation, we define $C(T)$ as the set of points $x \in [0, 1]$ where all powers $T^n$, $n \in \mathbb{Z}$, are continuous. The complement of this set is obviously countable, and we have $T(C(T)) = T^{-1}(C(T)) = C(T)$. We say that $T$ is minimal if $\omega(x) = [0, 1]$ for every $x \in C(T)$.
Theorem 4.4 ([K5]). The following properties of an interval exchange transformation $T\colon [0, 1] \to [0, 1]$ are equivalent:
a) There exists $x \in C(T)$ such that $\omega(x) = [0, 1]$;
b) $T$ is minimal;
c) For every $n > 0$ the set of fixed points of $T^n$ has empty interior.
A useful criterion that implies minimality was obtained by Keane [K5]:
Theorem 4.5. Let $T\colon [0, 1] \to [0, 1]$ be an orientable interval exchange transformation, ...

... $T_xM$. When both eigenvalues have absolute value 1, we say that $x$ is an elliptic fixed point of $f$. When neither has absolute value 1, we say that $x$ is hyperbolic (see section I.11). In the hyperbolic case the dynamics of $f$ in the neighborhood of $x$ is entirely described by the theorems of Hartman and Sternberg. The first asserts that there exists a neighborhood $U$ of $x$ and a homeomorphism $h$ from $U$ into a neighborhood of the origin in $T_xM$ such that
$$(D_xf)\,h(y) = h(f(y))$$
for $y \in U \cap f^{-1}(U)$. In other words, up to a continuous change of coordinates, $f$ and $D_xf$ act in the same way. The theorem of Sternberg says that, if $f$ is $C^\infty$ and $\lambda_1^n \lambda_2^m \neq \lambda_i$ for all integers $n, m \ge 0$ with $n + m \ge 2$ and $i = 1, 2$, then the homeomorphism $h$ with the properties above can be chosen smooth and with a smooth inverse. Both theorems hold in higher dimensions (with an appropriate modification in the case of Sternberg's). When $x$ is elliptic, nothing can be inferred in general about the local behavior of $f$ from its linear part $D_xf$. For example, the diffeomorphisms $f_1, f_2\colon \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$f_1(x) = Lx + x\|x\|^2, \qquad f_2(x) = Lx - x\|x\|^2,$$
where $L\colon \mathbb{R}^2 \to \mathbb{R}^2$ is a rotation, exhibit very different dynamics around the origin (a fixed point). For $f_1$, the $\alpha$- and $\omega$-limit sets of any point close to the origin, but
different from it, are $\emptyset$ and $\{0\}$, respectively. For $f_2$ the converse obtains; furthermore, both behaviors are different from that of $L$, for which the $\alpha$- and $\omega$-limits are non-empty and do not contain the origin. We now assume that $f$ preserves some volume form, which implies that $\lambda_1\lambda_2 = 1$. Then the situation is much brighter; under fairly reasonable assumptions about the eigenvalues of an elliptic fixed point and the differentiability of $f$, one can prove some surprisingly strong properties. The condition on the eigenvalues is that their arguments be different from $0$, $\pm\pi/2$, $\pi$ and $\pm 2\pi/3$, and the condition on $f$ is that it be of class $C^4$. Then it is possible to choose coordinates on $M$ in which $f$ takes on a very simple form, the so-called Birkhoff normal form. More precisely, there exist a neighborhood $U$ of $x$ and a $C^4$ diffeomorphism $h$ from $U$ into a neighborhood $V$ of the origin in $\mathbb{R}^2$ such that, writing $(r, \theta)$ for the polar coordinates of a point in $h(U \cap f^{-1}(U))$, we have
$$(hfh^{-1})(r, \theta) = (r, \theta + \alpha_0 + \dots$$

... $> 2$. In the latter case there exists a version of the Kolmogorov-Arnold-Moser theorem for symplectic diffeomorphisms. It says that in every neighborhood of an elliptic fixed point $x$ of a symplectic diffeomorphism $f\colon M^{2n} \to M^{2n}$ and under certain conditions of non-degeneracy and regularity, there exists a set of positive measure which is the union of invariant sets, each diffeomorphic to an $n$-dimensional torus and restricted to which $f$ is topologically equivalent to transitive translations of $T^n$. For exact statements of this result, see [R4]. In 1973 Zehnder [Z1] proved that there exists a residual subset $\mathcal{C}$ of $\mathrm{Diff}^\infty_\omega(M^2)$ such that, for $f \in \mathcal{C}$ and $x$ a non-degenerate elliptic periodic point of some power $f^m$ of $f$, there exist, inside the annuli determined by the theorem of Kolmogorov-Arnold-Moser, other non-degenerate elliptic periodic points, as well as hyperbolic periodic points and transversal homoclinic points associated with the latter. In [A6] and [A7] the reader will find more information and interesting pictures describing how complicated the dynamics is in the neighborhood of a non-degenerate elliptic point.
6. Ergodic Decomposition of Invariant Measures
Let $X$ be a compact metric space and $T\colon X \to X$ a measurable map. As before, we denote by $\mathcal{M}_T(X)$ the set of $T$-invariant probability measures on the Borel $\sigma$-algebra of $X$. By Theorem I.8.1, $\mathcal{M}_T(X) \neq \emptyset$ if $T$ is continuous, but when this is not the case, $\mathcal{M}_T(X)$ can well be empty (see exercise I.8.6). When $\mathcal{M}_T(X) \neq \emptyset$ it is reasonable to ask whether it contains any ergodic measures. The answer is yes, and it follows from the results in this section. The motivation for these results is as follows: we would like to associate to each point $x \in X$ a measure $\mu_x$ such that
$\mu_x(A) = \tau_A(x)$ for every Borel set $A \subset X$ (see definition at the end of Section 1). This, however, is asking for too much, if only because the limit in the definition of $\tau_A(x)$ doesn't always exist; in fact, unless $x$ is a periodic point, it is always possible to find a Borel set $A$ for which $\tau_A(x)$ is not defined (exercise 6.1). Thus we have to be more sophisticated in using measures to describe the stochastic behavior of the orbit of a point. We first define classes of points whose orbits can be described by means of probability measures. Let $T\colon X \to X$ be a measurable map. We define $E_0(T)$ as the set of points $x \in X$ such that, for every continuous $f\colon X \to \mathbb{R}$, the limit
$$\tilde f(x) = \lim_{n\to+\infty} \frac{1}{n}\sum_{j=0}^{n-1} f(T^j(x)) \tag{1}$$
exists. Further, let $C^0(X)$ be the space of continuous functions $f\colon X \to \mathbb{R}$ endowed with the norm $\|f\| = \sup_{x\in X}|f(x)|$. For $x \in E_0(T)$ we define $L_x\colon C^0(X) \to \mathbb{R}$ by
$$L_x(f) = \lim_{n\to+\infty} \frac{1}{n}\sum_{j=0}^{n-1} f(T^j(x)). \tag{2}$$
Then $L_x$ is a positive linear functional and $L_x(1) = 1$, so that by Riesz's representation theorem (I.8.4) there exists a unique probability measure $\mu_x$ on the Borel $\sigma$-algebra of $X$ such that
$$\int_X f \, d\mu_x = L_x(f).$$
We define $E_1(T)$ as the set of $x \in E_0(T)$ such that $\mu_x$ is $T$-invariant. When $T$ is continuous we have $E_0(T) = E_1(T)$ because, for every $f \in C^0(X)$,
$$\int_X (f \circ T) \, d\mu_x = L_x(f \circ T) = L_x(f) = \int_X f \, d\mu_x,$$
the second equality being an easy consequence of (2). When $T$ is not continuous $f \circ T$ may not be either, so it doesn't make any sense to write $L_x(f \circ T)$, and the argument above fails. Thus $E_0(T)$ is not necessarily equal to $E_1(T)$, as for the example in exercise I.8.6, where $\mathcal{M}_T(X) = \emptyset$ but it can easily be seen that $E_0(T) = X$. We finally define $E_2(T)$ as the set of $x \in E_1(T)$ for which $\mu_x$ is ergodic, and $E(T)$ as the set of $x \in E_2(T)$ for which $x$ belongs to the support of $\mu_x$. Points in $E(T)$ are called regular. For example, for the map $T\colon [0, 1] \to [0, 1]$ defined by
$$T(x) = \tfrac12(x + x^2)$$
it is easy to verify that
$$\mu_x = \delta_0, \quad 0 \le x < 1, \qquad \mu_1 = \delta_1,$$
so that $E_2(T) = [0, 1]$ and $E(T) = \{0, 1\}$.
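The example can be explored numerically. The following sketch is not part of the original text and its function names are hypothetical; it computes finite Birkhoff averages of a test function along the orbit of a point under $T(x) = \tfrac12(x + x^2)$. Since $T(x) < x$ on $(0, 1)$, orbits starting there decrease to $0$, so the averages approach the value of the test function at $0$, in agreement with $\mu_x = \delta_0$; the point $1$ is fixed, in agreement with $\mu_1 = \delta_1$. Floating-point averages only illustrate the limit, they do not prove it.

```python
import math


def T(x):
    """The map T(x) = (x + x^2) / 2 on [0, 1]."""
    return 0.5 * (x + x * x)


def birkhoff_average(f, x, n):
    """Average of f over the first n points of the orbit of x under T."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n


def identity(t):
    """Test function f(t) = t, for which the delta_0 average is 0."""
    return t
```

For example, `birkhoff_average(identity, 0.5, 10000)` is close to `0`, and `birkhoff_average(math.cos, 0.5, 10000)` is close to `cos(0) = 1`.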
The four sets defined above are Borel sets. (A sketch of the proof of this fact is given in exercise 6.2.) Our purpose here is to prove that, when $\mathcal{M}_T(X)$ is non-empty, the set $E(T)$ is also non-empty, and that every element of $\mathcal{M}_T(X)$ can essentially be decomposed as a linear combination of measures $\mu_x$, with $x \in E(T)$.
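Definition. If $\mathcal{M}_T(X) \neq \emptyset$, a set $A \subset X$ is said to have total measure if $\mu(A^c) = 0$ for every $\mu \in \mathcal{M}_T(X)$.

... $0$ for every $f \in C^0(X)$ such that $f(p) > 0$. Let $r > 0$ and $n > 0$ be such that $f(x) > r$ for every $x \in B_{1/n}(p)$. Then
$$\int_X f \, d\mu_p = \tilde f(p) \ge r \limsup_{m\to+\infty} \frac{1}{m} \#\{0 \le j < m \mid T^j(p) \in B_{1/n}(p)\} > 0. \qquad \square$$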
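Theorem 6.4. Let $T\colon X \to X$ be a measurable map and $\mu \in \mathcal{M}_T(X)$. Then every $f \in \mathcal{L}^1(X, \mu)$ is $\mu_x$-integrable for $\mu$-almost every $x \in E(T)$ and
$$\int_X \left(\int_X f \, d\mu_x\right) d\mu(x) = \int_X f \, d\mu.$$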
It is interesting to observe that if the characteristic function of a set $A \subset X$ is $\mu$-integrable, i.e. if $A$ is a Borel set (mod 0), then
$$\mu(A) = \int_X \mu_x(A) \, d\mu(x).$$
In particular we have the following corollary:
Corollary 6.5. A set $A \subset X$ has total measure if $\mu(A^c) = 0$ for every ergodic measure $\mu \in \mathcal{M}_T(X)$.
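Proof of Theorem 6.4. For $f\colon X \to \mathbb{R}$ bounded and measurable, we have, by Lemma 6.3,
$$\int_X f \, d\mu_x = \tilde f(x)$$
for $\mu$-almost every $x \in X$. Then
$$\int_X \left(\int_X f \, d\mu_x\right) d\mu(x) = \int_X \tilde f \, d\mu = \int_X f \, d\mu.$$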
If $f \in \mathcal{L}^1(X, \mu)$ we can assume, without loss of generality, that $f$ is positive. Then there exists a monotonically increasing sequence $(f_n)_{n\ge0}$ of bounded measurable functions $f_n\colon X \to \mathbb{R}$ which converge pointwise to $f$. It follows that
$$\int_X f \, d\mu_x = \lim_{n\to+\infty} \int_X f_n \, d\mu_x = \lim_{n\to+\infty} \tilde f_n(x)$$
for $\mu$-almost every $x$. Since the sequence $(\tilde f_n)$ is monotonically increasing (because $(f_n)$ is), and since
$$\int_X \tilde f_n \, d\mu = \int_X f_n \, d\mu,$$
we can apply the monotone convergence theorem (0.3.3) and obtain
$$\int_X \left(\int_X f \, d\mu_x\right) d\mu(x) = \int_X \left(\lim_{n\to+\infty} \tilde f_n\right) d\mu = \lim_{n\to+\infty} \int_X \tilde f_n \, d\mu = \lim_{n\to+\infty} \int_X f_n \, d\mu = \int_X f \, d\mu. \qquad \square$$
Exercises
6.1 Let $X$ be a compact metric space and $T\colon X \to X$ a continuous map. Let $\phi\colon X \to \mathbb{R}$ be continuous and such that
$$\liminf_{n\to+\infty} \frac{1}{n}\sum \dots$$

... $\mathbb{R}$ of period $2\pi$ such that
$$k(x + \alpha) - k(x) = -\varphi(x) \quad \text{a.e.} \tag{2}$$
Proof. Let $h\colon T^2 \to T^2$ be given by
$$h(e^{ix}, w) = (e^{ix}, w e^{-ik(x)}).$$
Then
$$(fh)(e^{ix}, w) = (e^{i(x+\alpha)}, w e^{-i(k(x) - \varphi(x))}),$$
$$(hT)(e^{ix}, w) = (e^{i(x+\alpha)}, w e^{-ik(x+\alpha)}),$$
showing that $fh = hT$ almost everywhere. On the other hand, $h$ is invertible, with inverse
$$h^{-1}(e^{ix}, w) = (e^{ix}, w e^{ik(x)}),$$
and both $h$ and $h^{-1}$ can be proved, using Fubini's theorem, to preserve Haar measure. Thus $h$ gives an equivalence between $f$ and the translation. $\square$
The problem now lies in recognizing when $f$ is minimal.
Lemma 7.3. If
$$\int_0^{2\pi} \varphi \, dx = 0 \tag{3}$$
and there exists no continuous function $k\colon \mathbb{R} \to \mathbb{R}$ of period $2\pi$ satisfying
$$k(x + \alpha) - k(x) = -\varphi(x),$$
where
... $> 0$ such that $2\pi/\alpha$ is irrational and that (1) has no continuous periodic solution of period $2\pi$, but has a solution $k\colon \mathbb{R} \to \mathbb{R}$ of period $2\pi$ whose restriction to $[0, 2\pi]$ is in $\mathcal{L}^2$. Put ...

... $C > 0$ and $r > 2$ satisfying
$$|e^{in\alpha} - 1| \ge \frac{C}{|n|^r} \tag{21}$$
for every $n \neq 0$. Let $\varphi\colon \mathbb{R} \to \mathbb{R}$ be a smooth function of period $2\pi$, whose integral over $[0, 2\pi]$ is zero. Prove that (4) has a smooth solution $k\colon \mathbb{R} \to \mathbb{R}$ with period $2\pi$.
b) Deduce that if $\alpha$ and $\varphi$ are as in part (a), the diffeomorphism defined by (1) is topologically equivalent to the translation $(z, w) \mapsto (e^{i\alpha}z, e^{i\beta}w)$, where
$$\beta = \int_0^{2\pi} \varphi \, dx.$$
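The mechanism behind exercise 7.1 can be illustrated for trigonometric polynomials, where no convergence question arises. This is a hedged sketch, not the book's solution: it assumes that (4) is the equation $k(x+\alpha) - k(x) = -\varphi(x)$ of Lemma 7.3 and that the diffeomorphism (1) has the form $f(e^{ix}, w) = (e^{i(x+\alpha)}, w e^{i\varphi(x)})$, as the computation of $fh$ in the proof above suggests; neither equation is legible in this copy, and the function names are hypothetical. Writing $\varphi(x) = \sum_{n\neq0} c_n e^{inx}$, the coefficients of $k$ are $-c_n/(e^{in\alpha} - 1)$, which is where the small denominators controlled by (21) enter.

```python
import cmath
import math


def solve_cohomological(phi_coeffs, alpha):
    """Fourier coefficients of k with k(x + alpha) - k(x) = -phi(x).

    phi_coeffs maps each non-zero integer n to the coefficient c_n of e^{inx}.
    """
    k_coeffs = {}
    for n, c in phi_coeffs.items():
        if n == 0:
            raise ValueError("phi must have zero integral over [0, 2*pi]")
        k_coeffs[n] = -c / (cmath.exp(1j * n * alpha) - 1)
    return k_coeffs


def evaluate(coeffs, x):
    """Value at x of the trigonometric polynomial sum_n c_n e^{inx}."""
    return sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())


def conjugacy_defect(phi_coeffs, alpha, x, w):
    """|second coordinate of f(h(p)) - h(T(p))| at p = (e^{ix}, w).

    Here h(e^{ix}, w) = (e^{ix}, w e^{-ik(x)}) and T is the translation
    (e^{ix}, w) -> (e^{i(x+alpha)}, w); the first coordinates agree trivially.
    """
    k_coeffs = solve_cohomological(phi_coeffs, alpha)
    fh = w * cmath.exp(-1j * evaluate(k_coeffs, x)) \
           * cmath.exp(1j * evaluate(phi_coeffs, x))
    hT = w * cmath.exp(-1j * evaluate(k_coeffs, x + alpha))
    return abs(fh - hT)


# Hypothetical data: phi(x) = cos(x) + 0.5*sin(2x), alpha = sqrt(2).
PHI = {1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}
ALPHA = math.sqrt(2)
```

With these sample values the defect is zero up to rounding error at every point tested, consistent with $fh = hT$.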
7.2 A real number $t$ is called a Liouville number if there do not exist constants $C > 0$ and $r > 2$ such that
(22)
for all integers $p$, $q$.
a) Prove that the set of Liouville numbers has Lebesgue measure zero. Hint: Let $N(C, r)$ be the set of $t \in [0, 1]$ which do not satisfy (22) for some integers $0 \le p \le q$. Prove that
$$\lambda(N(C, r)) \le 2C \sum_{q \ge 1} \frac{1}{q^{r-1}}$$
and that

is the complement in $[0, 1]$ of the set of Liouville numbers.
b) Let $\alpha \in \mathbb{R}$. Prove that there exist $C > 0$ and $r > 2$ satisfying (21) if and only if $\alpha/2\pi$ is not a Liouville number. Hint: use the inequality
$$\frac{2}{\pi}\, s \le |e^{is} - 1| \le s,$$
which holds for all $s \in [0, \pi]$.
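The inequality in the hint is elementary; the short derivation below is added as a reminder and is not part of the original text.

```latex
|e^{is} - 1| = \left| e^{is/2}\bigl(e^{is/2} - e^{-is/2}\bigr) \right| = 2\sin\frac{s}{2},
\qquad 0 \le s \le \pi ,
\\[4pt]
\frac{2}{\pi}\,u \le \sin u \le u \quad \text{for } 0 \le u \le \frac{\pi}{2}
\quad (\text{concavity of } \sin \text{ on } [0, \pi/2]) ,
\\[4pt]
\text{so with } u = \frac{s}{2}: \qquad
\frac{2}{\pi}\,s \;\le\; 2\sin\frac{s}{2} \;\le\; s .
```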
8. Mixing Automorphisms and Lebesgue Automorphisms
In Section 3 we proved that an automorphism $T$ of a probability space $(X, \mathcal{A}, \mu)$ is ergodic if and only if
$$\lim_{m\to+\infty} \frac{1}{m}\sum_{n=0}^{m-1} \mu(T^{-n}(A) \cap B) = \mu(A)\mu(B)$$
for any sets $A, B \in \mathcal{A}$. If this is the case and the limit of the sequence $\mu(T^{-n}(A) \cap B)$ exists, it can only be $\mu(A)\mu(B)$. This motivates the following definition:
Definition. An automorphism $T$ of a probability space $(X, \mathcal{A}, \mu)$ is called mixing if for every pair of sets $A, B \in \mathcal{A}$ we have
$$\lim_{n\to+\infty} \mu(T^{-n}(A) \cap B) = \mu(A)\mu(B).$$
Mixing automorphisms are ergodic because, if $T^{-n}(A) = A$ for every $n$ and $T$ is mixing,
$$\mu(A)\mu(A^c) = \lim_{n\to+\infty} \mu(T^{-n}(A) \cap A^c) = \mu(A \cap A^c) = 0,$$
showing that $\mu(A) = 0$ or $\mu(A^c) = 0$.
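The difference between mixing and mere ergodicity can be seen in two exactly computable cases on $[0, 1)$ with Lebesgue measure and $A = B = [0, 1/2)$. This is an illustrative sketch that is not in the original text, with hypothetical function names. The first map, $x \mapsto 2x \bmod 1$, is measure-preserving but not invertible, and is used here only because the computation is exact; the second is the rotation $x \mapsto x + \alpha \bmod 1$. For the doubling map the correlation equals $\mu(A)\mu(B) = 1/4$ for every $n \ge 1$, while for a rotation it equals $|1/2 - c_n|$ with $c_n = -n\alpha \bmod 1$ and so keeps oscillating.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def doubling_correlation(n):
    """mu(T^{-n}(A) & A) for T(x) = 2x mod 1 and A = [0, 1/2).

    T^{-n}(A) is the union of the 2^n intervals [j/2^n, j/2^n + 1/2^(n+1)).
    """
    pieces = 2 ** n
    width = Fraction(1, 2 * pieces)
    total = Fraction(0)
    for j in range(pieces):
        left = Fraction(j, pieces)
        right = left + width
        # length of the overlap of [left, right) with A = [0, 1/2)
        total += max(Fraction(0), min(right, HALF) - left)
    return total


def rotation_correlation(n, alpha):
    """mu(T^{-n}(A) & A) for T(x) = x + alpha mod 1 and A = [0, 1/2).

    T^{-n}(A) is the arc [c, c + 1/2) mod 1 with c = -n*alpha mod 1,
    and its overlap with A has length |1/2 - c|.
    """
    c = (-n * alpha) % 1
    return abs(HALF - c)
```

For instance, `doubling_correlation(n)` is `1/4` for every `n >= 1`, whereas `rotation_correlation(n, alpha)` takes values spread over the whole interval from `0` to `1/2` when `alpha` is irrational.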
Proposition 8.1. Let $X$ be a topological space, $T\colon X \to X$ a measurable map, and $\mu$ a $T$-invariant probability measure on the Borel $\sigma$-algebra of $X$. If $T$ is mixing and $\mu$ is positive on open sets, then $T$ is topologically mixing.
Proof. Let $U$ and $V$ be non-empty open sets. Then $\lim_{n\to+\infty} \mu(T^{-n}(U) \cap V) = \mu(U)\mu(V) > 0$. Thus $\mu(T^{-n}(U) \cap V) > 0$ for all large enough values of $n$, and in particular $T^{-n}(U) \cap V \neq \emptyset$. $\square$
It follows from this proposition that translations of compact topological groups are never mixing (with respect to Haar measure, which, according to Theorem I.7.1, is the only measure invariant by such maps), since, by exercise I.11.10, they are never topologically mixing. We know that translations of compact topological groups can be ergodic, so here we have examples of ergodic maps which are not mixing. We will soon see examples of mixing automorphisms, including all ergodic automorphisms of $T^n$ and all Bernoulli shifts. We recall from I.12 that we can associate to every automorphism $T$ of a probability space $(X, \mathcal{A}, \mu)$ an isometry $U_T\colon \mathcal{L}^2(X, \mathcal{A}, \mu) \to \mathcal{L}^2(X, \mathcal{A}, \mu)$ defined by $U_T f = f \circ T$.
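Proposition 8.2. An automorphism $T$ is mixing if and only if
$$\lim_{n\to+\infty} \langle U_T^n f, g\rangle = \langle f, 1\rangle\langle g, 1\rangle \tag{1}$$
for every $f, g \in \mathcal{L}^2(X, \mathcal{A}, \mu)$.
Proof. If $U_T$ satisfies this property, we apply it to $f = \chi_A$ and $g = \chi_B$ for $A, B \in \mathcal{A}$, and obtain
$$\lim_{n\to+\infty} \mu(T^{-n}(A) \cap B) = \lim_{n\to+\infty} \langle U_T^n \chi_A, \chi_B\rangle = \langle \chi_A, 1\rangle\langle \chi_B, 1\rangle = \mu(A)\mu(B). \tag{2}$$
Conversely, if $T$ is mixing, (2) shows that (1) holds when $f$ and $g$ are characteristic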
functions of sets in $\mathcal{A}$, and consequently also when they are linear combinations of such functions, by linearity. The set of such linear combinations is dense in $\mathcal{L}^2(X, \mathcal{A}, \mu)$. Now take $f, g \in \mathcal{L}^2(X, \mathcal{A}, \mu)$. Given $\varepsilon > 0$, let $f_0, g_0 \in \mathcal{L}^2(X, \mathcal{A}, \mu)$ be such that
$$\|f - f_0\|_2 \le \varepsilon, \qquad \|g - g_0\|_2 \le \varepsilon, \qquad \lim_{n\to+\infty} \langle U_T^n f_0, g_0\rangle = \dots$$

... $\tilde f^{*m}(S_0)$, so it can be written $k = \tilde f^{*i}(l')$, $i > 0$, $l' \in S_0$. But then $l$ and $l'$ are two points in the intersection of $S_0$ with the orbit of $k$, implying that $l = l'$. Thus
$$l' = l = \dots = \tilde f^{*i}(l'),$$
and, if $l \neq 0$, $\tilde f^*$ has an eigenvalue which is an $i$-th root of unity. By contradiction we have $l = 0$ and $k = \tilde f^{*i}(l) = 0$. $\square$
All known invertible Lebesgue automorphisms are spectrally equivalent (see I.12). To make this assertion independent of the state of our ignorance, we introduce the following definition: A Lebesgue automorphism $T$ has finite rank if the subspace $E$ in the definition can be found such that the dimension of $E \ominus U_T(E)$ is finite. Otherwise $T$ is said to have infinite rank. Analogously, $T$ has countable rank if $E$ can be found so that $E \ominus U_T(E)$ is separable (i.e. has a countable Hilbert basis). Observe that if $\mathcal{L}^2(X, \mathcal{A}, \mu)$ is separable, which is by far the commonest situation, then any Lebesgue automorphism of infinite rank has countable rank. The dimension of $E \ominus U_T(E)$ is independent of $E$ when $E$ satisfies (3), (4) and (5) (see exercise 9.3), and thus can be called the rank of $T$. The definition of finite rank given above, however, avoids having to prove this fact. A final important observation is that there is no known example of a Lebesgue automorphism of finite rank. This fact, together with the proposition below, justifies
our assertion that all known invertible Lebesgue automorphisms are spectrally equivalent.
Proposition 8.7. All invertible Lebesgue automorphisms with countable rank are spectrally equivalent.
Proof. Let $T$ and $T'$ be Lebesgue automorphisms of probability spaces $(X, \mathcal{A}, \mu)$ and $(X', \mathcal{A}', \mu')$. Let $E_0$ and $E_0'$ be closed subspaces of $\mathcal{L}^2(X, \mathcal{A}, \mu)$ and $\mathcal{L}^2(X', \mathcal{A}', \mu')$ such that

Since $T$ and $T'$ have countable rank, both $E_0$ and $E_0'$ have countable dimension. Thus there exists an isometric isomorphism $L_0\colon E_0 \to E_0'$. We define a linear map $L\colon \mathcal{L}^2(X, \mathcal{A}, \mu) \to \mathcal{L}^2(X', \mathcal{A}', \mu')$ by the conditions
$$L(1) = 1, \qquad L \,|\, U_T^n(E_0) = U_{T'}^n L_0 U_T^{-n} \,|\, U_T^n(E_0),$$
$n \ge 0$. Then $L$ is an isometric isomorphism which satisfies $U_{T'} L U_T^{-1} = L$, so $T$ and $T'$ are spectrally equivalent.
In Section 10, using Gaussian shifts, we will construct mixing automorphisms which are not Lebesgue.
Exercises

8.1 An automorphism $T$ of a probability space $(X, \mathcal{A}, \mu)$ is called weakly mixing if

$$\lim_{n\to+\infty} \frac{1}{n}\sum_{j=0}^{n-1} \left|\mu(T^{-j}(A)\cap B) - \mu(A)\mu(B)\right| = 0$$

for all $A, B \in \mathcal{A}$.
a) Prove that a weakly mixing automorphism is ergodic.
b) Prove that powers of a weakly mixing automorphism are also weakly mixing.
c) Prove that the Cartesian product of a weakly mixing and an ergodic automorphism is ergodic.
d) Prove that the Cartesian product of two weakly mixing automorphisms is weakly mixing.
e) Prove that if $T$ is weakly mixing the only eigenvalue of $U_T$ is 1.

8.2 Prove that if $T$ is an invertible Lebesgue automorphism of the probability space $(X, \mathcal{A}, \mu)$ then for every $f \in \mathcal{L}^2(X, \mathcal{A}, \mu)$ there exists a probability measure $\nu \in \mathcal{M}([0, 2\pi])$ such that

for every $n$. Hint: Write $f$ as $\sum_m U_T^m f_m$, with $f_m \in E_0$, $E_0$ satisfying (6), and define $\nu$ by

$$\frac{d\nu}{d\lambda}(x) = \Big\|\sum_m f_m e^{imx}\Big\|^2,$$

where $\lambda$ denotes Lebesgue measure.

8.3 By the Riemann-Lebesgue Lemma any $\nu \in \mathcal{M}([0, 2\pi])$ which is absolutely continuous with respect to the Lebesgue measure $\lambda$ satisfies

$$\lim_{n\to+\infty}\int_0^{2\pi} e^{int}\,d\nu = 0.$$
Prove, following the steps below, that there exist sequences $(\theta_n)_{n \geq 1}$ of real numbers, converging to zero, for which no $\nu \in \mathcal{M}([0, 2\pi])$ such that $\nu \ll \lambda$ satisfies

$$\int_0^{2\pi} e^{int}\,d\nu = \theta_n \quad \text{for every } n \geq 1.$$

(This result will be utilized in the proof of Corollary 10.5.)

a) Let $c_0$ be the space of sequences $\theta\colon \mathbb{Z}^+ \to \mathbb{R}$ which converge to 0, endowed with the norm $\|\theta\| = \sup_{n \geq 0}|\theta(n)|$, and $l^1$ the space of sequences $\theta\colon \mathbb{Z} \to \mathbb{R}$ with $\sum_n |\theta(n)| < \infty$, endowed with the norm $\|\theta\| = \sum_n |\theta(n)|$. Consider the operators $F\colon \mathcal{L}^1([0, 2\pi], \lambda) \to c_0$, $G\colon l^1 \to \mathcal{L}^\infty([0, 2\pi], \lambda)$ defined by

$$F(f)(n) = \int_0^{2\pi} e^{inx} f\,d\lambda, \qquad G(\theta)(x) = \sum_n \theta(n)e^{inx}.$$

Prove that $F$ and $G$ are well-defined and $G$ is the adjoint of $F$.
b) Prove that the image of $G$ does not contain the function $f(t) = t$. Observe that it contains all trigonometric polynomials, and deduce that it cannot be closed.
c) Deduce that $F$ cannot be surjective.
9. Spectral Theory

The properties of automorphisms $T$ of probability spaces $(X, \mathcal{A}, \mu)$ considered in the previous sections (ergodic, mixing, Lebesgue) can all be characterized in terms of the associated operator $U_T$. Thus, $T$ is ergodic if and only if the eigenspace of $U_T$ with eigenvalue 1 is one-dimensional (Proposition 2.2); it is mixing if and only if $U_T$ satisfies (1) in Section 8 (Proposition 8.2); and the definition itself of Lebesgue automorphisms involves the operator $U_T$. Properties such as these, which can be expressed in terms of the operator $U_T$, are called spectral properties, and it is clear that they occur very naturally, so much so that it was only in the late fifties that Kolmogorov introduced the first non-spectral class of automorphisms, which now bears his name (Section 11). Because the map $U_T$ is an isometry of a Hilbert space,
it is natural to try to apply functional analytic results about such isometries to the study of automorphisms of probability spaces. Already in 1930, during the infancy of ergodic theory, Koopman observed that $U_T$ is an isometry and suggested that this fact be taken as the starting point for solving the problem of existence of orbital averages and its relation with the spatial average (see Section 1). Roughly at the same time, von Neumann proved the following result:

Theorem 9.1. Let $H$ be a Hilbert space and $U\colon H \to H$ an isometry. Let $E(U) := \{u \mid Uu = u\}$ and $\pi\colon H \to E(U)$ be the orthogonal projection. Then

$$\lim_{n\to+\infty}\frac{1}{n}\sum_{j=0}^{n-1}U^j u = \pi u$$

for every $u \in H$.

The reader will notice that for $U$ of the form $U_T$, for some $T$, this result follows from Birkhoff's theorem. Its proof is the object of exercise 9.1. It is easy, and sometimes useful, to introduce classes of isometries which correspond to the classes of automorphisms we have considered. With the notation of Theorem 9.1, we say that an isometry $U$ is ergodic if $\dim E(U) = 0$, mixing if lim $H_2$ satisfying $U_2 L = L U_1$.

Theorem 9.2. If $U$ is an invertible isometry of a Hilbert space $H$, then
$$U|_{S(u)} \simeq U_{\mu_u}\colon \mathcal{L}^2([0, 2\pi], \mu_u) \to \mathcal{L}^2([0, 2\pi], \mu_u)$$

for all $u \in H$.
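
Theorem 9.1 can be illustrated numerically in finite dimension. The sketch below is only an illustration under stated assumptions: the isometry is a hypothetical example (a rotation of the $(x, y)$-plane of $\mathbb{R}^3$ that fixes the $z$-axis, so that $E(U)$ is the $z$-axis), and the names `rotate_xy` and `ergodic_average` are not from the text.

```python
import math

def rotate_xy(u, theta):
    """Hypothetical isometry U of R^3: rotate the (x, y)-plane by theta, fix the z-axis."""
    x, y, z = u
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def ergodic_average(u, theta, n):
    """Return (1/n) * sum_{j=0}^{n-1} U^j u."""
    total = [0.0, 0.0, 0.0]
    v = u
    for _ in range(n):
        total = [t + vi for t, vi in zip(total, v)]
        v = rotate_xy(v, theta)
    return tuple(t / n for t in total)

u = (1.0, 2.0, 3.0)
theta = 1.0  # assumed angle; any theta that is not a multiple of 2*pi gives E(U) = z-axis
projection = (0.0, 0.0, u[2])  # orthogonal projection of u onto E(U)

for n in (10, 100, 1000, 10000):
    avg = ergodic_average(u, theta, n)
    # The distance to the projection should shrink roughly like 1/n.
    print(n, math.dist(avg, projection))
```

In this example the rotating component averages out at rate of order $1/n$, while the component in $E(U)$ is untouched, which is the content of the theorem in this very special case.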
An interesting consequence of 9.2 is the following characterization of Lebesgue invertible isometries:

Corollary 9.3. An invertible isometry $U$ of a Hilbert space $H$ is Lebesgue if and only if $\mu_u$ is equivalent to the Lebesgue measure for every $u \in H$.
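Proof. Suppose that every $\mu_u$ is equivalent to the Lebesgue measure $\lambda$. We claim that there exists a set $\{x_\alpha \mid \alpha \in \mathcal{T}\}$ such that $H$ can be written as the orthogonal sum

$$H = \bigoplus_{\alpha} S(x_\alpha).$$

The proof of this claim is a standard maximality argument. Consider the family of all sets $\{x_\alpha \mid \alpha \in \mathcal{T}\}$ such that $S(x_{\alpha'}) \perp S(x_{\alpha''})$ for $\alpha' \neq \alpha''$. Introduce an order $\leq$ in this family by setting $\{x'_{\alpha'}\} \leq \{x''_{\alpha''}\}$ if $\bigoplus_{\alpha'} S(x'_{\alpha'}) \subset \bigoplus_{\alpha''} S(x''_{\alpha''})$. A straightforward application of Zorn's lemma shows the existence of a maximal set $\{x_\alpha \mid \alpha \in \mathcal{T}\}$ in this order. Let $H_0 := \bigoplus_\alpha S(x_\alpha)$. If $H_0 = H$ we are done. Otherwise, $H_0^\perp \neq \{0\}$. Clearly $U(H_0) = H_0$; since $U$ is an isometry, this implies $U(H_0^\perp) = H_0^\perp$. Then, if $0 \neq u \in H_0^\perp$, we have $\{u\} \cup \{x_\alpha \mid \alpha \in \mathcal{T}\} \geq \{x_\alpha \mid \alpha \in \mathcal{T}\}$ without equality, contradicting the maximality of $\{x_\alpha \mid \alpha \in \mathcal{T}\}$ and proving our claim. Now observe that, writing $\mu_\alpha := \mu_{x_\alpha}$,

$$U|_{S(x_\alpha)} \simeq U_{\mu_\alpha} \simeq U_\lambda,$$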
where the first equivalence follows from 9.2 and the second is an immediate consequence of the equivalence between $\mu_\alpha$ and $\lambda$ (see exercise 8.2). Thus $U$ is equivalent to an orthogonal sum of copies of $U_\lambda$. Therefore it is sufficient to show that $U_\lambda$ is a Lebesgue isometry. But this is clear because $U_\lambda^n(1)(t) = e^{int}$ for all $n$, and these functions form an orthonormal basis for $\mathcal{L}^2([0, 2\pi], \lambda)$. Thus, if $E_0$ is the space spanned by the function 1, we have $U_\lambda^n(E_0) \perp U_\lambda^m(E_0)$ for $n \neq m$ and $H = \bigoplus_n U_\lambda^n(E_0)$. This shows that $U$ is a Lebesgue isometry. The converse implication is left to the reader. □
Exercises

9.1 Prove Theorem 9.1 by following the steps below:
a) Let $H_0 := E(U)^\perp$. Observe that $U(H_0) \subset H_0$. Show that the theorem follows from proving that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{j=0}^{n-1}U^j u = 0$$

for every $u \in H_0$. Observe that if $w \in H_0$ and $Uw = w$ then $w = 0$.
b) Define $S_n\colon H_0 \to H_0$ by

$$S_n := \frac{1}{n}\sum_{j=0}^{n-1}U^j.$$

Prove the identity

$$U S_n^* S_n = \frac{1}{n}S_n^* U^n - \frac{1}{n}S_n^* + S_n^* S_n.$$

c) Prove that if the sequence $(S_{n_j}^* S_{n_j} u)_{j \geq 0}$ converges weakly to $w$, then $Uw = w$. Deduce that $S_n^* S_n u$ converges weakly to 0 for every $u \in H_0$. Prove that $\|S_n u\|^2$ converges to 0 for every $u \in H_0$.

9.2 Prove that $f = 1$ is the unique vector of $\mathcal{L}^2([0, 2\pi], \lambda)$ such that $\{U_\lambda^n f \mid n \in \mathbb{Z}\}$ is orthogonal. Deduce that $U_\lambda|E$ is not a Lebesgue isometry for every closed subspace $E \neq \mathcal{L}^2([0, 2\pi], \lambda)$.

9.3 Prove that the definition of the rank of a Lebesgue isometry $U$ of a Hilbert space is independent of the subspace $E$. Hint (for the invertible case): If $E'$ and $E''$ satisfy the definition and $F'$ and $F''$ are the orthogonal complements of $U(E')$ in $E'$ and $U(E'')$ in $E''$.
10. Gaussian Shifts

In this section we study a class of probability measures on the Borel $\sigma$-algebra of $B(\mathbb{R})$ called Gaussian probabilities. The corresponding shifts $\sigma\colon B_\mu(\mathbb{R}) \to B_\mu(\mathbb{R})$, where $\mu$ is a $\sigma$-invariant Gaussian probability measure, are called Gaussian shifts.
In the framework of the relationship between $\sigma$-invariant probability measures and stochastic processes (Section I.10), Gaussian probabilities correspond to the so-called stationary Gaussian processes. We start by defining Gaussian probability measures on $\mathbb{R}^n$. A probability measure $\mu \in \mathcal{M}(\mathbb{R}^n)$ is called Gaussian if there exists a positive definite symmetric matrix $A$ such that

$$\mu(B) = \frac{1}{\sqrt{\det(A)}}\,\frac{1}{(2\pi)^{n/2}}\int_B e^{-\frac{1}{2}\langle A^{-1}x, x\rangle}\,dx$$

for every Borel set $B \subset \mathbb{R}^n$. From now on we shall denote the factor outside the integral by $k(A)$; it is designed to make $\mu(\mathbb{R}^n) = 1$, since, by exercise 10.1, we have

$$k(A) = \left(\int_{\mathbb{R}^n} e^{-\frac{1}{2}\langle A^{-1}x, x\rangle}\,dx\right)^{-1}. \qquad (1)$$
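
The normalization (1) and the role of $A$ as a matrix of second moments can be checked numerically in dimension two. The following is a minimal sketch, assuming a hypothetical matrix `A_EXAMPLE` and a plain midpoint rule on a large square; the function names are illustrative and not part of the text.

```python
import math

# Hypothetical positive definite symmetric matrix used only for illustration.
A_EXAMPLE = ((2.0, 0.5), (0.5, 1.0))

def gaussian_density(A):
    """Return x -> k(A) * exp(-<A^{-1} x, x> / 2) for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    k = 1.0 / (math.sqrt(det) * (2.0 * math.pi))  # k(A) for n = 2

    def density(x, y):
        q = inv[0][0] * x * x + (inv[0][1] + inv[1][0]) * x * y + inv[1][1] * y * y
        return k * math.exp(-0.5 * q)

    return density

def integrate_against_mu(A, g, half_width=8.0, steps=320):
    """Midpoint-rule approximation of the integral of g with respect to mu."""
    density = gaussian_density(A)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        for j in range(steps):
            y = -half_width + (j + 0.5) * h
            total += g(x, y) * density(x, y)
    return total * h * h

mass = integrate_against_mu(A_EXAMPLE, lambda x, y: 1.0)
m11 = integrate_against_mu(A_EXAMPLE, lambda x, y: x * x)
m12 = integrate_against_mu(A_EXAMPLE, lambda x, y: x * y)
m22 = integrate_against_mu(A_EXAMPLE, lambda x, y: y * y)
print(mass, m11, m12, m22)
```

Under these assumptions the total mass should come out close to 1 and the second moments close to the entries of `A_EXAMPLE`; the truncation to a square and the grid size are arbitrary choices.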
The coefficients $a_{ij}$ of $A$ can be expressed in terms of $\mu$ by the following formula (exercise 10.2):

$$a_{ij} = \int_{\mathbb{R}^n} x_i x_j\,d\mu(x).$$
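An infinite-dimensional matrix $(a_{ij})_{i,j \in \mathbb{Z}}$ is said to be positive definite if, for all $n \geq 0$, the matrix $(a_{ij})_{|i|,|j| \leq n}$ is. A probability measure $\mu \in \mathcal{M}(B(\mathbb{R}))$ is called Gaussian if there exists a positive definite symmetric matrix $A = (a_{ij})_{i,j \in \mathbb{Z}}$ such that, for every cylinder $C(k, B_0, \dots, B_m)$, we have

$$\mu(C(k, B_0, \dots, B_m)) = \int_{B_0 \times \cdots \times B_m} k(A)\,e^{-\frac{1}{2}\langle A^{-1}x, x\rangle}\,dx, \qquad (2)$$

where, inside the integral, $A$ stands for the matrix $(a_{ij})_{k \leq i,j \leq k+m}$. From (1) it follows that, defining $\pi_i\colon B(\mathbb{R}) \to \mathbb{R}$ by $\pi_i(\beta) = \beta(i)$, we have

$$a_{ij} = \int_{B(\mathbb{R})} \pi_i \pi_j\,d\mu. \qquad (3)$$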
The matrix $A$ is called the covariance matrix of $\mu$ (also in the finite-dimensional case).

Theorem 10.1. Given a symmetric, positive definite matrix $A := (a_{ij})_{i,j \in \mathbb{Z}}$ there exists a unique Gaussian measure $\mu \in \mathcal{M}(B(\mathbb{R}))$ whose covariance matrix is $A$. The measure $\mu$ is $\sigma$-invariant if and only if $a_{i+1,j+1} = a_{ij}$ for every $i, j \in \mathbb{Z}$.
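Proof. Uniqueness is obvious from (2). To prove existence, we use the following lemma, whose demonstration is outlined in exercise 10.3:

Lemma 10.2. Let $A$ be a symmetric, positive definite $m \times m$ matrix. Write $\mathbb{R}^m = \mathbb{R}^p \times \mathbb{R}^q$, $p + q = m$, and let $A_1$, $A_2$ be the matrices obtained by restricting $A$ to $\mathbb{R}^p$ and $\mathbb{R}^q$, respectively. Then

$$\int_{\mathbb{R}^p \times F} k(A)\,e^{-\frac{1}{2}\langle A^{-1}x, x\rangle}\,dx = \int_F k(A_2)\,e^{-\frac{1}{2}\langle A_2^{-1}x, x\rangle}\,dx,$$

$$\int_{G \times \mathbb{R}^q} k(A)\,e^{-\frac{1}{2}\langle A^{-1}x, x\rangle}\,dx = \int_G k(A_1)\,e^{-\frac{1}{2}\langle A_1^{-1}x, x\rangle}\,dx,$$

where $F \subset \mathbb{R}^q$ and $G \subset \mathbb{R}^p$ are arbitrary Borel sets. □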
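
Lemma 10.2 says that integrating out some coordinates of a Gaussian measure leaves a Gaussian measure whose matrix is the corresponding restriction of $A$. A rough numerical check for $p = q = 1$ might look as follows; the matrix `A`, the interval $F = [0.3, 1.2]$ and the grid sizes are assumptions chosen for illustration.

```python
import math

# Hypothetical 2 x 2 covariance matrix; A_2 is its lower-right entry.
A = ((2.0, 0.5), (0.5, 1.0))
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
inv = ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))
k_A = 1.0 / (math.sqrt(det) * 2.0 * math.pi)

def density(x, y):
    """k(A) * exp(-<A^{-1} z, z> / 2) at z = (x, y)."""
    q = inv[0][0] * x * x + 2.0 * inv[0][1] * x * y + inv[1][1] * y * y
    return k_A * math.exp(-0.5 * q)

def left_side(lo, hi, half_width=10.0, nx=800, ny=200):
    """Midpoint rule for the integral over R x [lo, hi] (R truncated to a large interval)."""
    hx = 2.0 * half_width / nx
    hy = (hi - lo) / ny
    total = 0.0
    for i in range(nx):
        x = -half_width + (i + 0.5) * hx
        for j in range(ny):
            y = lo + (j + 0.5) * hy
            total += density(x, y)
    return total * hx * hy

def right_side(lo, hi):
    """One-dimensional Gaussian measure of [lo, hi] with variance A_2 = a_22."""
    s = math.sqrt(2.0 * A[1][1])
    return 0.5 * (math.erf(hi / s) - math.erf(lo / s))

print(left_side(0.3, 1.2), right_side(0.3, 1.2))
```

The two printed values should agree up to the discretization error of the grid, which is consistent with the first identity of the lemma in this special case.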
Let $\mathcal{A}_n$ be the sub-$\sigma$-algebra of the Borel $\sigma$-algebra of $B(\mathbb{R})$ generated by cylinders of the form $C(-n, D_0, \dots, D_{2n})$, where $D_i$, $0 \leq i \leq 2n$, are arbitrary Borel sets. We define probability measures $\mu_n\colon \mathcal{A}_n \to [0, 1]$ by

$$\mu_n(C(-n, D_0, \dots, D_{2n})) = \int_{D_0 \times \cdots \times D_{2n}} k(A_n)\,e^{-\frac{1}{2}\langle A_n^{-1}x, x\rangle}\,dx,$$

where $A_n$ is the restricted matrix $(a_{ij})_{-n \leq i,j \leq n}$. By the lemma, we have

$$\mu_{n+1}|\mathcal{A}_n = \mu_n, \quad n \geq 0. \qquad (4)$$

Next we take the algebra $\mathcal{A}_\infty := \bigcup_{n \geq 0} \mathcal{A}_n$ and the function $\mu_\infty\colon \mathcal{A}_\infty \to [0, 1]$ defined by $\mu_\infty(B) = \mu_n(B)$ for $B \in \mathcal{A}_n$. This is well-defined by (4), and $\mu_\infty$ is additive. Since $\mathcal{A}_\infty$ is a compact class, $\mu_\infty$ is in fact $\sigma$-additive, and can be extended to a probability measure $\mu \in \mathcal{M}(B(\mathbb{R}))$, which satisfies $\mu|\mathcal{A}_n = \mu_n$. By the lemma, $\mu$ satisfies (2), and is thus a Gaussian probability measure whose covariance matrix is $A$. To prove the last statement of the theorem, use (3) to write $a_{i+1,j+1}$
=
JI
0 (i.e. $p_{ij}^{(n)} > 0$ for all $1 \leq i, j \leq N$);
c) For every $1 \leq i, j \leq N$ we have $\lim_{n\to+\infty} p_{ij}^{(n)} = p_j$.

Proof. (a) ⇒ (c). We have already proved that

$$p_i\,p_{ij}^{(n)} = \mu(\sigma^{-n}(C(0, i)) \cap C(1, j)).$$

If $\sigma$ is mixing,

$$\lim_{n\to+\infty} p_i\,p_{ij}^{(n)} = \lim_{n\to+\infty}\mu(\sigma^{-n}(C(0, i)) \cap C(1, j)) = \mu(C(0, i))\,\mu(C(1, j)) = p_i p_j.$$
(c) ⇒ (b). Trivial.

(b) ⇒ (a). We apply the Perron-Frobenius theorem (I.10.2) and its corollary (I.10.3) to the matrix $P^n$ and conclude that its largest eigenvalue is 1. Since $P(1, \dots, 1) = (1, \dots, 1)$ we find that 1 is also the largest eigenvalue of $P$. Let $P_0 := JPJ^{-1}$ be the Jordan canonical form of $P$, where $J$ is an invertible matrix; we can write
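Since $|\lambda_i| < 1$ for all $1 \leq i \leq k$ it is easy to see that $\lim_{n\to+\infty} P_0^n$ exists. This shows that $P^n = J^{-1}P_0^n J$ converges. To prove (a) we have to show that

$$\lim_{n\to+\infty}\langle U_\sigma^n f, g\rangle = \langle f, 1\rangle\langle g, 1\rangle$$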
for $f$, $g$ in $\mathcal{L}^1(X)$. It is enough to take $f$ and $g$ characteristic functions of cylinders, because the set of such functions generates $\mathcal{L}^1(X)$ (since cylinders generate the $\sigma$-algebra of $B(p, P)$). Thus, let $A := C(m, k_0, \dots, k_j)$ and $B := C(m', k'_0, \dots, k'_i)$ be cylinders. We have

$$\langle U_\sigma^n f_A, f_B\rangle = \mu(\sigma^{-n}(A) \cap B) = \mu(A)\mu(B)\,\frac{p^{(l_0+n)}_{k_j k'_0}}{p_{k'_0}},$$

where $l_0 := m' - (m + j)$, by Lemma 12.2. Thus

$$\lim_{n\to+\infty}\langle U_\sigma^n f_A, f_B\rangle = \mu(A)\mu(B)\,\frac{1}{p_{k'_0}}\lim_{n\to+\infty} p^{(l_0+n)}_{k_j k'_0} = \langle f_A, 1\rangle\langle f_B, 1\rangle\,\frac{1}{p_{k'_0}}\lim_{n\to+\infty} p^{(l_0+n)}_{k_j k'_0}.$$
We finally have to show that $\lim_{n\to+\infty} p_{ij}^{(n)} = p_j$. The assumption $P^n > 0$ for some $n$ implies that $P$ is irreducible, which, by Theorem 12.1, implies that

$$\lim_{n\to+\infty}\frac{1}{n}\sum_{k=0}^{n-1} p_{ij}^{(k)} = p_j.$$

But we proved above that the sequence $(p_{ij}^{(k)})_{k \geq 0}$ is convergent, so it must converge to $p_j$. □
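
The equivalence between conditions (b) and (c) can be observed on a small stochastic matrix. The sketch below uses a hypothetical two-state matrix `P` with stationary vector `p` (so that $pP = p$); it looks for the first power of `P` with all entries positive and then checks that the rows of a high power approach `p`. The helper names are illustrative only.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, n):
    """n-th power of P by repeated multiplication (fine for small n)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

def first_positive_power(P, max_power=50):
    """Smallest n <= max_power with all entries of P^n positive, or None."""
    Q = P
    for n in range(1, max_power + 1):
        if all(entry > 0 for row in Q for entry in row):
            return n
        Q = mat_mul(Q, P)
    return None

# Hypothetical example: P has a zero entry, but P^2 > 0.
P = [[0.0, 1.0], [0.5, 0.5]]
p = [1.0 / 3.0, 2.0 / 3.0]  # stationary probability vector: p P = p

print(first_positive_power(P))
for row in mat_pow(P, 60):
    print(row)  # each row should be close to p
```

For this matrix the second eigenvalue is $-1/2$, so the entries $p_{ij}^{(n)}$ approach $p_j$ geometrically, in line with the Jordan form argument in the proof.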
As a consequence of this result, Theorem I.12.2 can be stated in the following way:

Theorem 12.4. Every mixing Markov shift is equivalent to a Bernoulli shift.
Exercises

12.1 Show that, if $\sigma\colon B(p, P) \to B(p, P)$ is a Markov shift,

$$\lim_{n\to+\infty}\frac{1}{n}\log\left|\mu(\sigma^{-n}(A) \cap B) - \mu(A)\mu(B)\right| < 0,$$

where $A$ and $B$ are cylinders.

12.2 Show, using exercise 11.1, that mixing Markov shifts are Kolmogorov automorphisms.
Chapter III. Expanding Maps and Anosov Diffeomorphisms
1. Expanding Maps

In Section I.1 we highlighted the fact that some maps have an invariant measure naturally associated with them, but that ergodic theory also comprises, among its aims, the dynamical study of maps which are not born with an associated invariant measure. The first step in this direction was showing that every continuous map of a compact metric space has an invariant measure (I.8). The second was the ergodic decomposition theorem and the concept of regular points (II.6). Not much more can be said about continuous maps in general; in order to develop our theory further, we will restrict our attention to differentiable maps of closed manifolds. Before doing that, however, we will introduce the concept of expanding maps of metric spaces: roughly, maps which locally increase distances. For such maps, it is possible to find, among all invariant measures, one whose properties make it especially interesting. We start with two particular cases of the definition, where the results obtained are particularly relevant.

Definition. A $C^1$ map $f$ of a closed manifold $M$ is called an expanding endomorphism if there exists $n > 0$ such that

$$\|(D_x f^n)u\| > \|u\| \quad \text{for all } x \in M,\ 0 \neq u \in T_xM.$$
The simplest example of an expanding endomorphism is the map taking $z \in S^1$ into $z^2$. This has an obvious generalization, the automorphisms of $T^n$ whose eigenvalues have absolute value greater than one. Other examples can be obtained by observing that in $C^1(M, M)$ expanding endomorphisms form an open set, as can easily be seen from the definition. From the topological point of view, the theory of expanding endomorphisms is quite developed. It started with the work of Shub [S5], where he showed that the universal cover of a manifold which has expanding endomorphisms is $\mathbb{R}^n$, and that homotopic expanding endomorphisms are topologically equivalent. In the same paper, Shub introduced a method for finding expanding endomorphisms, which essentially consisted in the following: Let $G$ be a Lie group and $\Gamma$ a subgroup such that the quotient manifold $G/\Gamma$ is compact. If $F\colon G \to G$ is an isomorphism whose derivative at the identity has eigenvalues > 1, and if $F(\Gamma) \subset \Gamma$, then $F$ induces an
expanding endomorphism of $G/\Gamma$. The existence of a diffeomorphism $F$ with the property above implies that $G$ belongs to a special class of Lie groups, known as nilpotent groups. Manifolds of the type $G/\Gamma$ are called nil-manifolds. To obtain further examples, we consider closed manifolds $M$ such that there exists a covering map $\pi\colon G/\Gamma \to M$, where $G/\Gamma$ is a nil-manifold. Such an $M$ is called an infra-nil-manifold; an example is the Klein bottle. It is easy to see that if there is an expanding endomorphism $f\colon G/\Gamma \to G/\Gamma$ and an endomorphism $f_0\colon M \to M$ such that $f_0\pi = \pi f$, then $f_0$ is also expanding. If $f$ was obtained as above, we say that $f_0$ is an algebraic expanding endomorphism of $M$. Gromov [G6] proved that, topologically speaking, this is the only kind of expanding endomorphisms there is, or, more precisely, that every expanding endomorphism is topologically equivalent to an algebraic expanding endomorphism of an infra-nil-manifold (Porteus [P7]). We recall that a map $f\colon X \to Y$ between metric spaces is called Hölder continuous if for each $x \in X$ there exist constants $C > 0$, $0 < \gamma < 1$ and a neighborhood $U$ of $x$ such that

$$d(f(y_1), f(y_2)) \leq C\,d(y_1, y_2)^\gamma$$
for every $y_1, y_2 \in U$. If $X$ is compact, $C$ and $\gamma$ can be chosen so as not to depend on $x$. A differentiable map $f\colon M \to N$ between manifolds is Hölder $C^1$ if its derivative is Hölder continuous.

Theorem 1.1. If $f$ is an expanding endomorphism of a closed manifold $M$, and its derivative $D_xf$ is a Hölder continuous function of $x$, there exists a unique $f$-invariant probability measure $\mu$ absolutely continuous with respect to Lebesgue measure $\lambda$. This measure has the following properties:
a) The Radon-Nikodym derivative $d\mu/d\lambda$ is Hölder continuous and strictly positive.
b) $f$ is exact with respect to $\mu$.
c) For every Borel set $A \subset M$, we have

$$\lim_{n\to+\infty}\lambda(f^{-n}(A)) = \mu(A).$$
Markov transformations of the interval are another particular case of expanding maps, as defined later, and they are analogous to expanding endomorphisms:

Definition. A map $f\colon [0, 1] \to [0, 1]$ is called a Markov transformation if there exists a finite or countable family $(I_j)$ of disjoint open intervals in $[0, 1]$ such that
a) $\lambda([0, 1] - \bigcup_j I_j) = 0$;
b) For every $j$, there is a set $K$ of indices such that $f(I_j) = \bigcup_{k \in K} I_k \pmod 0$.
c) For every $x \in \bigcup_j I_j$, the derivative of $f$ exists and satisfies $|f'(x)| \geq a$, for some fixed $a > 0$.
d) There exist $\beta > 1$ and $n_0 > 0$ such that, if $f^m(x) \in \bigcup_j I_j$ for all $0 \leq m \leq n_0 - 1$, we have $|(f^{n_0})'(x)| \geq \beta$.
e) There exists $m > 0$ such that $\lambda(f^{-m}(I_i) \cap I_j) \neq 0$ for every $i$, $j$.
f) There exist $C > 0$ and $0 < \gamma < 1$ such that

$$\left|\frac{f'(x)}{f'(y)} - 1\right| \leq C|x - y|^\gamma$$

for every $x$, $y$ belonging to the same interval $I_j$.
The most remarkable example of a Markov transformation is the Gauss transformation, introduced in I.1:

$$f(x) = \frac{1}{x} - \left[\frac{1}{x}\right], \quad x \neq 0; \qquad f(0) = 0.$$
It is easy to find other examples; just take transformations with graphs as shown below:
The theorem below contains the core of Adler's and Bowen's results on Markov transformations:

Theorem 1.2. If $f$ is a Markov transformation of the interval, there exists a unique $f$-invariant probability measure $\mu$ on the Borel $\sigma$-algebra of $[0, 1]$ which is absolutely continuous with respect to Lebesgue measure. This measure satisfies the following properties:
a) $d\mu/d\lambda$ is strictly positive, and Hölder continuous on each interval $I_j$.
b) $f$ is exact with respect to $\mu$.
c) $\lim_{n\to+\infty}\lambda(f^{-n}(A)) = \mu(A)$ for every Borel set $A \subset [0, 1]$.
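
For the Gauss transformation the invariant measure of Theorem 1.2 is the classical Gauss measure, with density $1/((1 + x)\log 2)$; this is a standard fact that is used here as an assumption, since the density is not stated in this passage. The sketch below checks invariance of that density under the sum over inverse branches $y = 1/(k + x)$, and estimates $\lambda(f^{-n}(A))$ for $A = [0, 1/2]$ by pushing forward seeded pseudo-random points. All function names, the truncation level and the sample size are illustrative choices.

```python
import math
import random

LOG2 = math.log(2.0)

def gauss_map(x):
    """Gauss transformation: f(x) = 1/x - [1/x] for x != 0, f(0) = 0."""
    if x == 0.0:
        return 0.0
    y = 1.0 / x
    return y - math.floor(y)

def gauss_density(x):
    """Density of the Gauss measure with respect to Lebesgue measure (assumed known)."""
    return 1.0 / ((1.0 + x) * LOG2)

def gauss_measure(a):
    """Gauss measure of the interval [0, a]."""
    return math.log(1.0 + a) / LOG2

def transfer(h, x, branches=200000):
    """Truncated sum over inverse branches y = 1/(k + x) of h(y) / |f'(y)|."""
    return sum(h(1.0 / (k + x)) / (k + x) ** 2 for k in range(1, branches + 1))

def pushed_lebesgue(a, n, samples=20000, seed=0):
    """Monte Carlo estimate of lambda(f^{-n}([0, a])) using a seeded generator."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = rng.random()
        for _ in range(n):
            x = gauss_map(x)
        if x <= a:
            hits += 1
    return hits / samples

# Invariance: the transferred density should reproduce the density itself.
for x in (0.1, 0.5, 0.9):
    print(x, transfer(gauss_density, x), gauss_density(x))

# Property (c): lambda(f^{-n}(A)) should approach mu(A), not lambda(A) = 0.5.
print(pushed_lebesgue(0.5, 8), gauss_measure(0.5))
```

The Monte Carlo part is only a statistical illustration of property (c); floating-point iteration of a chaotic map does not follow true orbits, so the numbers should be read as approximate.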
This result is essentially due to Rényi, who proved it in 1957 ([R10]). Later, it was revisited and polished by Adler ([A1]) in 1975 and by Bowen ([B10]) in his last paper, posthumously published in 1979 with a preface by Adler, where he briefly sketches the history of the subject. In fact, Theorem 1.2 is an archetypical result, whose proof carries a very simple but central property of smooth ergodic theory, which deserves a separate exposition, free of the technicalities that usually clobber it in the course of rigorous proofs. This property is the following: There exists $K > 0$ such that, for $J$ an interval and $I_{j_n}$ the interval in the family $(I_j)$ that contains $f^n(J)$ ($0 \leq n \leq N$), we have

$$K^{-1}\,\frac{\lambda(A)}{\lambda(B)} \leq \frac{\lambda(f^N(A))}{\lambda(f^N(B))} \leq K\,\frac{\lambda(A)}{\lambda(B)}, \qquad (1)$$

where $\lambda$ is Lebesgue measure and $A$, $B$ are arbitrary Borel sets contained in $J$. In the context of Theorem 1.1, i.e. when $f$ is a Hölder $C^1$ expanding map of a closed manifold $M$, the corresponding version of this property would be obtained by taking $J \subset M$ an open set such that $f^n|J$ is injective for $n = 0, 1, \dots, N$, and the conclusion would again be (1). Returning to Markov maps (and leaving to the reader the translation of these considerations to the case of expanding maps of closed manifolds), it is easy to see that to prove (1) it is enough to find $K > 0$ such that

$$K^{-1} \leq \frac{|(f^N)'(x)|}{|(f^N)'(y)|} \leq K \qquad (2)$$
for all $N \geq 0$ and $x$, $y$ such that $f^n(x)$ and $f^n(y)$ belong to the same member of the family $(I_j)$ for all $0 \leq n \leq N$. But, by condition (f),

$$\frac{|(f^N)'(x)|}{|(f^N)'(y)|} = \prod_{n=0}^{N-1}\frac{|f'(f^n(x))|}{|f'(f^n(y))|} \leq \prod_{n=0}^{N-1}\left(1 + C|f^n(x) - f^n(y)|^\gamma\right).$$

Using condition (d) it is easy to find $A > 0$ and $0 < t < 1$ such that

$$|f^n(x) - f^n(y)| \leq A\,t^{N-n}\,|f^N(x) - f^N(y)|,$$

and therefore, since $|f^N(x) - f^N(y)| \leq 1$,

$$\frac{|(f^N)'(x)|}{|(f^N)'(y)|} \leq \prod_{n=0}^{\infty}\left(1 + CA^\gamma(t^\gamma)^n\right).$$

Setting $K := \prod_{n=0}^{\infty}(1 + CA^\gamma(t^\gamma)^n)$ we get the second inequality in (2). The first follows by interchanging $x$ and $y$. Properties (1) and (2) are called distortion inequalities. Observe that (1) bounds the change in the relative sizes of subsets of $J$ under iteration by $f^N$. A different way of understanding (1) and (2) is to observe that if $f$ is locally linear (i.e. linear on intervals of $(I_j)$), we have $K = 1$. Therefore these inequalities give a uniform bound for the buildup of non-linearity under iteration of the map. Inequalities (1) and (2) are simple but rich in consequences. (1) was first used successfully by Rényi in his 1957 paper, but he used (2) as his hypothesis. (Perhaps, as pointed out by Adler in his foreword to Bowen's posthumous paper, it was clear to Rényi how to derive (2) from a condition like (f).) Later, distortion inequalities
played a key role in the fundamental papers of Anosov [A4] and Sinai [S9], and since then they have become a standard tool in the ergodic theory of smooth maps. A landmark improvement in this technique was achieved by Jacobson [J1] in 1980. It concerns maps of the interval of the form $f(x) = Ax(1 - x)$. When $A = 4$, it is easy to see that $f$ has an invariant probability measure, absolutely continuous with respect to Lebesgue measure (cf. exercise I.2.4). Jacobson proved that there is a set of positive Lebesgue measure of values of $A \in (0, 4)$ for which this property holds. His proof, roughly speaking, starts with an attempt to extend the method we shall use to prove Theorem 1.2. But here distortion inequalities such as (2) are much more delicate because of the existence of a point where $f'$ vanishes. To overcome this deep difficulty, Jacobson proved that for a fat set of values $0 < A < 4$ the influence of this point on most orbits is weak enough to grant certain distortion inequalities that are sufficient to complete the construction of the absolutely continuous invariant probability measure. We now introduce the general definition of expanding maps, and then prove a theorem about them which subsumes the two preceding ones. This theorem will be a fundamental tool in the development of the ergodic theory of Anosov diffeomorphisms (Section 2).
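
The distortion inequality (2) for Markov maps can be observed numerically. The sketch below is an illustration under assumptions that are not in the text: it uses the hypothetical two-branch map $f(x) = 2x + a\sin(2\pi x) \pmod 1$ with $a = 0.05$, intervals $I_0 = (0, 1/2)$ and $I_1 = (1/2, 1)$, and a Lipschitz bound (exponent 1) in place of the Hölder exponent $\gamma$. Two points are pulled back along the same sequence of inverse branches, so that their first $N$ iterates share intervals, and the logarithm of the derivative ratio is compared with the bound obtained as in the argument above.

```python
import math

A_PERT = 0.05  # hypothetical perturbation size; f' stays in [2 - 2*pi*a, 2 + 2*pi*a]

def lift(x):
    """The map before reduction mod 1; increasing because its derivative exceeds 1."""
    return 2.0 * x + A_PERT * math.sin(2.0 * math.pi * x)

def deriv(x):
    """Derivative f'(x)."""
    return 2.0 + 2.0 * math.pi * A_PERT * math.cos(2.0 * math.pi * x)

def inverse_branch(u, branch):
    """Solve f(x) = u with x in I_0 = (0, 1/2) or I_1 = (1/2, 1) by bisection."""
    lo, hi = (0.0, 0.5) if branch == 0 else (0.5, 1.0)
    target = u + branch  # on I_1 the lift takes values in (1, 2)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lift(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_derivative_ratio(u, v, code):
    """Pull u, v back along the branches in `code`; return log(|(f^N)'(x)| / |(f^N)'(y)|)
    for the resulting points x, y with f^N(x) = u and f^N(y) = v."""
    x, y = u, v
    total = 0.0
    for branch in reversed(code):
        x = inverse_branch(x, branch)
        y = inverse_branch(y, branch)
        total += math.log(deriv(x)) - math.log(deriv(y))
    return total

def distortion_bound_log():
    """log K with C = sup|f''| / inf f' and contraction rate t = 1 / inf f'."""
    inf_deriv = 2.0 - 2.0 * math.pi * A_PERT
    c = 4.0 * math.pi ** 2 * A_PERT / inf_deriv
    t = 1.0 / inf_deriv
    return c * t / (1.0 - t)

log_K = distortion_bound_log()
for N in (5, 10, 20, 40):
    code = [n % 2 for n in range(N)]  # arbitrary itinerary, alternating branches
    print(N, log_derivative_ratio(0.1, 0.9, code), log_K)
```

The printed ratios should stay below the same constant for every $N$, which is the point of the inequality: the bound does not deteriorate as the number of iterates grows.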
Definition. Let $(X, \mathcal{A}, \mu)$ be a probability space, where $X$ is a separable metric space and $\mathcal{A}$ its Borel $\sigma$-algebra. We say that a map $f\colon X \to X$ is expanding if there exists a sequence of partitions $(\mathcal{P}_n)_{n \geq 0}$ such that
a) $\bigcup_{P \in \mathcal{P}_0} P = X$.
b) For every $n \geq 0$ and $P \in \mathcal{P}_{n+1}$, $f(P)$ is a union (mod 0) of atoms of $\mathcal{P}_n$, and $f|P$ is injective.
c) There exist $0 < \lambda < 1$ and $K > 0$ such that, denoting by $\mathcal{P}_n(x)$ the atom of $\mathcal{P}_n$ which contains $x$, we have

$$d(x, y) \leq K\lambda^n\,d(f^n(x), f^n(y))$$

for every $n \geq 0$, $x \in X$, $y \in \mathcal{P}_n(x)$.
d) There exists $m > 0$ such that, for every pair of atoms $P, Q \in \mathcal{P}_0$, we have $\mu(f^{-m}(P) \cap Q) \neq 0$.
e) There exist $J\colon X \to \mathbb{R}^+$, $0 < \gamma < 1$ and $C > 0$ such that, for every $n \geq 0$ and every Borel set $A$ contained in an atom of $\mathcal{P}_0$, we have

$$\mu(f(A)) = \int_A J\,d\mu,$$

and for every $x$, $y$ contained in the same atom of $\mathcal{P}_n$ we have

$$\left|\frac{J(y)}{J(x)} - 1\right| \leq C\,d(f(x), f(y))^\gamma.$$
We first check that Markov transformations and Hölder $C^1$ expanding endomorphisms are comprised in this definition. If $f\colon [0, 1] \to [0, 1]$ is a Markov transformation, we take for $\mu$ the Lebesgue measure and for $\mathcal{P}_0$ the partition given by the family $(I_j)$. Conditions (a), (b) and (d) are immediately verified. Defining $J$ by $J(x) = |f'(x)|$ for $x \in \bigcup_j I_j$ and $J(x) = 0$ elsewhere, condition (e) is satisfied because

$$\left|\frac{J(y)}{J(x)} - 1\right| \leq C\,d(x, y)^\gamma$$

and $d(f(x), f(y)) \geq a\,d(x, y)$.
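0, $C_2 > 0$ such that, for every $n \geq 1$, $x \in X_0$ and $y \in \mathcal{P}_n(x)$, we have

$$\frac{J_n(y)}{J_n(x)} \leq \exp\left(C_1\,d(f^n(x), f^n(y))^\gamma\right), \qquad \left|\frac{J_n(y)}{J_n(x)} - 1\right| \leq C_2\,d(f^n(x), f^n(y))^\gamma.$$

Proof. By property (e) of the definition,

$$\frac{J_n(y)}{J_n(x)} = \prod_{m=0}^{n-1}\frac{J(f^m(y))}{J(f^m(x))} \leq \prod_{m=0}^{n-1}\left(1 + C\,d(f^{m+1}(x), f^{m+1}(y))^\gamma\right).$$

But $y \in \mathcal{P}_n(x)$ implies $f^m(y) \in \mathcal{P}_{n-m}(f^m(x))$ for $0 \leq m \leq n$, so (c) shows that

$$d(f^m(y), f^m(x)) \leq K\lambda^{n-m}\,d(f^n(x), f^n(y)).$$

Putting $\tilde\lambda := \lambda^\gamma$ and writing $j = n - m - 1$, we obtain

$$\frac{J_n(y)}{J_n(x)} \leq \prod_{j=0}^{n-1}\left(1 + CK^\gamma\tilde\lambda^j\,d(f^n(x), f^n(y))^\gamma\right) = \exp\sum_{j=0}^{n-1}\log\left(1 + CK^\gamma\tilde\lambda^j\,d(f^n(x), f^n(y))^\gamma\right)$$

$$\leq \exp\sum_{j=0}^{n-1} CK^\gamma\tilde\lambda^j\,d(f^n(x), f^n(y))^\gamma \leq \exp\left(\frac{CK^\gamma\,d(f^n(x), f^n(y))^\gamma}{1 - \tilde\lambda}\right).$$

Taking $C_1 = CK^\gamma(1 - \tilde\lambda)^{-1}$ proves the first inequality. To prove the second, let $D$ be the diameter of $X_0$, and $B, B' > 0$ be constants such that

$$\exp(\alpha) \leq 1 + B\alpha \quad \text{and} \quad (1 + B\alpha)^{-1} \geq 1 - B'\alpha$$

for every $0 \leq \alpha \leq C_1D^\gamma$. Then

$$\frac{J_n(y)}{J_n(x)} \leq 1 + BC_1\,d(f^n(x), f^n(y))^\gamma.$$

Switching $x$ and $y$,

$$\frac{J_n(y)}{J_n(x)} \geq \left(1 + BC_1\,d(f^n(x), f^n(y))^\gamma\right)^{-1} \geq 1 - B'C_1\,d(f^n(x), f^n(y))^\gamma,$$

so that

$$\left|\frac{J_n(y)}{J_n(x)} - 1\right| \leq \max\{B, B'\}\,C_1\,d(f^n(x), f^n(y))^\gamma. \qquad □$$

Lemma 1.5. There exists $K > 0$ such that $S_n(x) \leq K$ for every $x \in X_0$ and $n \geq 1$.

Proof. Put $r := \inf_{P \in \mathcal{P}_0}\lambda(f(P))$. Then, if $f^n(y) = x$, we have

$$r \leq \lambda(\mathcal{P}_0(f^n(y))) = \lambda(f^n(\mathcal{P}_n(y))) = \int_{\mathcal{P}_n(y)} J_n\,d\lambda = J_n(y)\int_{\mathcal{P}_n(y)}\frac{J_n(z)}{J_n(y)}\,d\lambda(z)$$

$$\leq J_n(y)\int_{\mathcal{P}_n(y)}\exp\left(C_1\,d(f^n(y), f^n(z))^\gamma\right)d\lambda(z) \leq J_n(y)\,\lambda(\mathcal{P}_n(y))\exp(C_1D^\gamma).$$

Thus

$$S_n(x) \leq \sum_{y \in f^{-n}(x)} r^{-1}\lambda(\mathcal{P}_n(y))\exp(C_1D^\gamma).$$

Since each atom of $\mathcal{P}_n$ contains at most one point of $f^{-n}(x)$, we conclude that $S_n(x) \leq r^{-1}\exp(C_1D^\gamma)$. □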
Lemma 1.6. For every $n \geq 0$ and every pair of points $x$, $y$ in the same atom of $\mathcal{P}_0$, we have
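Proof. Since $x$ and $y$ belong to the same atom of $\mathcal{P}_0$, every atom of $\mathcal{P}_n$ which intersects $f^{-n}(y)$ also intersects $f^{-n}(x)$. Enumerate the points $x_1, x_2, \dots$ in $f^{-n}(x)$ and the points $y_1, y_2, \dots$ in $f^{-n}(y)$ in such a way that $x_i \in \mathcal{P}_n(y_i)$. Then, by Lemma 1.4,

$$|S_n(x) - S_n(y)| \leq \sum_m J_n(x_m)^{-1}\left|\frac{J_n(x_m)}{J_n(y_m)} - 1\right| \leq \sum_m J_n(x_m)^{-1}\,C_2\,d(f^n(x_m), f^n(y_m))^\gamma = S_n(x)\,C_2\,d(x, y)^\gamma. \qquad □$$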
Proof. It is enough to show that, for every Borel set $A$ contained in some atom of $\mathcal{P}_0$, we have

$$\lambda_n(A) = \int_A S_n\,d\lambda.$$

Call $P$ the atom of $\mathcal{P}_0$ containing $A$. Then $f^{-n}(P)$ is a union of atoms $B_1, B_2, \dots$ of $\mathcal{P}_n$. For every $j$, $f^n|B_j\colon B_j \to P$ is a bijection, whose inverse we call $\phi_j$. Applying (4) to $(J_n \circ \phi_j)^{-1}$, we get

$$\lambda(\phi_j(A)) = \int_{\phi_j(A)} d\lambda = \int_{\phi_j(A)} J_n(y)\left((J_n \circ \phi_j \circ f^n)(y)\right)^{-1}d\lambda(y) = \int_{f^n(\phi_j(A))}(J_n \circ \phi_j)^{-1}\,d\lambda = \int_A (J_n \circ \phi_j)^{-1}\,d\lambda.$$

It follows that

$$\lambda_n(A) = \lambda(f^{-n}(A)) = \sum_{P \in \mathcal{P}_n}\lambda(f^{-n}(A) \cap P) = \sum_j \lambda(\phi_j(A)) = \sum_j \int_A (J_n \circ \phi_j)^{-1}\,d\lambda$$

c for all $n$. Next we choose a finite family $\{P_1, \dots, P_l\}$ of atoms of $\mathcal{P}_0$ such that every $U_n^+$ and $U_n^-$ contains an atom of this family. Using the fact that $\mu_0$ satisfies (b) in Theorem 1.3, it is easy to see that there exists a sequence $(\varepsilon_n)$ converging to zero such that
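$$\frac{\mu_0(A_n \cap P_j)}{\mu_0(P_j)} \leq \varepsilon_n \quad \text{for } P_j \subset U_n^-.$$

We now use property (d) in the definition of expanding maps. By this property and (b) in Theorem 1.3, there exists $\delta > 0$ such that $\mu_0(f^{-m}(P_i) \cap P_j) > \delta$ for all $1 \leq i, j \leq l$. Choose $P_i \subset U_{n+m}^+$ and $P_j \subset U_n^-$. Then

$$\varepsilon_{n+m}\,\mu_0(P_i) \geq \mu_0(P_i \cap A_{n+m}^c) = \mu_0(f^{-m}(P_i \cap A_{n+m}^c)) = \mu_0(f^{-m}(P_i) \cap A_n^c).$$

Moreover,

$$\mu_0(f^{-m}(P_i) \cap A_n \cap P_j) \leq \mu_0(A_n \cap P_j) \leq \varepsilon_n\,\mu_0(P_j).$$

Thus

$$\mu_0(f^{-m}(P_i) \cap P_j) = \mu_0(f^{-m}(P_i) \cap A_n^c \cap P_j) + \mu_0(f^{-m}(P_i) \cap A_n \cap P_j) \leq \varepsilon_{n+m}\,\mu_0(P_i) + \varepsilon_n\,\mu_0(P_j).$$

As $n$ increases the last term goes to zero, contradicting the fact that $\mu_0(f^{-m}(P_i) \cap P_j) > \delta$. □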
2. Anosov Diffeomorphisms

A diffeomorphism $f$ of a closed manifold $M$ is called Anosov if there exists a direct sum decomposition of the tangent space $T_xM$ at each point $x$ into complementary subspaces $E^s_x$, $E^u_x$, and also constants $K > 0$ and $0 < \lambda < 1$, satisfying

$$(D_xf)E^s_x = E^s_{f(x)}, \qquad (D_xf)E^u_x = E^u_{f(x)},$$

$$\|(D_xf^n)|E^s_x\| \leq K\lambda^n, \qquad \|(D_xf^{-n})|E^u_x\| \leq K\lambda^n$$

for every $x \in M$ and $n \geq 0$. The definition of Anosov flows is analogous: A flow $\phi\colon \mathbb{R} \times M \to M$ is Anosov if it has no singularities and there exists a decomposition $T_xM = E^s_x \oplus E^u_x \oplus E^0_x$ at each point $x \in M$ (where $E^0_x$ is the subspace spanned by the direction of the flow at $x$) and constants $K > 0$ and $0 < \lambda < 1$ such that

$$(D_x\phi_t)E^s_x = E^s_{\phi_t(x)}, \qquad (D_x\phi_{-t})E^u_x = E^u_{\phi_{-t}(x)},$$

$$\|(D_x\phi_t)|E^s_x\| \leq K\lambda^t, \qquad \|(D_x\phi_{-t})|E^u_x\| \leq K\lambda^t$$
for every $x \in M$, $t \geq 0$. This class of dynamical systems originated with the work of Hadamard, where he essentially proved that the geodesic flow on the unit tangent bundle of a manifold all of whose sectional curvatures are negative is an Anosov flow. In 1939, Hedlund proved that the geodesic flow on the unit tangent bundle of a surface of strictly negative constant curvature is ergodic, and in 1940 Hopf extended this result to manifolds with arbitrary strictly negative curvature. The present definition was introduced by Anosov in 1966 [A4], but he used the name C-flow (in Russian U-flow), where C stands for "condition". Our nomenclature was introduced by Smale, but the old one is still used in the Russian literature. In this famous paper, Anosov showed that Anosov flows and diffeomorphisms which preserve a measure given by a volume form and which are Hölder $C^1$ are ergodic. He also proved that Anosov flows and diffeomorphisms are open in the $C^1$ topology, and that they have the important property of being structurally stable. A diffeomorphism $f$ (resp. flow $\phi$) of a closed manifold is structurally stable if it has a neighborhood $\mathcal{U}$ in the $C^1$ topology such that every $g$ (resp. $\psi$) in $\mathcal{U}$ is topologically equivalent to $f$ (resp. $\phi$). The definition of topologically equivalent flows is analogous to that of diffeomorphisms (I.11): $\phi$ and $\psi$ are topologically equivalent if there exists a homeomorphism $h\colon M \to M$ taking orbits of $\phi$ into orbits of $\psi$. The concept of structural stability was introduced by Andronov and Pontrjagin in 1937 and has flourished spectacularly since the sixties. The reader can consult [S3] or [N2] with respect to the fundamental results on structural stability; here we limit ourselves to the ergodicity properties of Anosov diffeomorphisms. First, however, we shall briefly describe some central topological results.
The basic examples of Anosov diffeomorphisms are the isomorphisms of $T^n$ all of whose eigenvalues have absolute value different from one. They are called linear Anosov diffeomorphisms of the torus; we show that they are indeed Anosov diffeomorphisms. Let $f\colon T^n \to T^n$ be an isomorphism as above. Then the eigenvalues of $D_ef\colon T_e(T^n) \to T_e(T^n)$ have absolute value $\neq 1$. We define $E^s_e$, $E^u_e$ as the $D_ef$-invariant subspaces of $T_e(T^n)$ associated with eigenvalues of absolute value less than one and greater than one, respectively. Then there exist constants $K > 0$ and $0 < \lambda < 1$ such that $\|(D_ef^n)|E^s_e\| \leq K\lambda^n$, $\|(D_ef^{-n})|E^u_e\| \leq K\lambda^n$ for all $n \geq 0$. It is easy to see that $E^s_x := x + E^s_e$, $E^u_x := x + E^u_e$, $K$ and $\lambda$ satisfy the conditions in the definition. From the topological point of view this takes care of Anosov diffeomorphisms on tori:

Theorem 2.1 (Franks [F3], Manning [M3]). Every Anosov diffeomorphism of a torus is topologically equivalent to a linear one.
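
The splitting for a linear Anosov diffeomorphism can be made concrete on $T^2$. The sketch below uses the matrix with rows $(2, 1)$ and $(1, 1)$, a standard example that is an assumption here since the text does not single out a particular matrix; it computes the two eigenvalues and eigendirections and checks that forward iterates contract the stable direction and backward iterates contract the unstable one by the factor $\lambda^n$, with $K = 1$.

```python
import math

# Hypothetical example: an integer matrix of determinant 1 with no eigenvalue of modulus 1.
CAT = ((2.0, 1.0), (1.0, 1.0))
CAT_INV = ((1.0, -1.0), (-1.0, 2.0))

def eigen_data():
    """Eigenvalues and eigenvectors of CAT from its characteristic polynomial."""
    trace = CAT[0][0] + CAT[1][1]
    det = CAT[0][0] * CAT[1][1] - CAT[0][1] * CAT[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    lam_s = (trace - disc) / 2.0  # |lam_s| < 1: stable direction
    lam_u = (trace + disc) / 2.0  # |lam_u| > 1: unstable direction
    # (CAT - lam I) v = 0 gives v = (1, lam - 2) because CAT[0] = (2, 1).
    v_s = (1.0, lam_s - 2.0)
    v_u = (1.0, lam_u - 2.0)
    return lam_s, lam_u, v_s, v_u

def apply_power(M, v, n):
    """Apply the 2 x 2 matrix M to the vector v, n times."""
    x, y = v
    for _ in range(n):
        x, y = M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y
    return (x, y)

def norm(v):
    return math.hypot(v[0], v[1])

lam_s, lam_u, v_s, v_u = eigen_data()
for n in (1, 4, 8):
    forward = norm(apply_power(CAT, v_s, n)) / norm(v_s)       # contraction on E^s
    backward = norm(apply_power(CAT_INV, v_u, n)) / norm(v_u)  # contraction of f^{-n} on E^u
    print(n, forward, backward, lam_s ** n)
```

Only small values of $n$ are used because rounding errors along the expanding direction grow like $\lambda_u^n$ and would eventually dominate the contracted vector.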
There are examples of Anosov diffeomorphisms on other manifolds. Smale introduced in [S10] an extension to infra-nil-manifolds of the concept of linear Anosov diffeomorphisms, which Shub used to develop his method of construction of expanding endomorphisms described in Section 1. With this generalization, the Franks-Manning theorem is still valid, but it is unknown whether there are any other manifolds which have Anosov diffeomorphisms. In the case of codimension 1 Anosov diffeomorphisms (i.e. when the dimension of $E^s$ is one) the problem has been solved:

Theorem 2.2 (Franks [F4], Newhouse [N3]). Any manifold which has a codimension 1 Anosov diffeomorphism is homeomorphic to a torus.
Another famous question about Anosov diffeomorphisms is whether they are transitive. This is true for all known examples, and in the case of linear Anosov diffeomorphisms follows from the fact that they preserve Haar measure; by Theorem 2.1, this proves the conjecture for every Anosov diffeomorphism on a torus. With respect to Anosov flows the situation is much more complex. There are examples of Anosov flows on three-dimensional manifolds for which the nonwandering set is not the whole manifold (Franks and Williams [F5]). The problem of classifying or describing the manifolds which have Anosov flows presents special difficulties, in part because there exist no canonical procedures to generate models up to topological equivalence, as is the case with linear Anosov diffeomorphisms. The central theorem in the ergodic theory of Anosov diffeomorphisms is the following:

Theorem 2.3. Let $f$ be a transitive Anosov diffeomorphism of a closed manifold $M$ with a Hölder continuous derivative. There exists a unique $f$-invariant probability measure $\mu^+$ on the Borel $\sigma$-algebra of $M$ such that, for every continuous function $\phi\colon M \to \mathbb{R}$, we have:

$$\lim_{n\to+\infty}\frac{1}{n}\sum_{j=0}^{n-1}\phi(f^j(x)) = \int_M \phi\,d\mu^+ \qquad (1)$$

for $\lambda$-almost every $x \in M$, where $\lambda$ denotes the Lebesgue measure on $M$. With respect to this measure $f$ is Bernoulli, and, for every $A \subset M$ with $\mu^+(\partial A) = 0$, we have

$$\lim_{n\to+\infty}\lambda(f^{-n}(A)) = \mu^+(A). \qquad (2)$$
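Property (1) immediately guarantees that $\mu^+$ is unique, since, if $\mu$ is another measure satisfying (1), we have $\int_M \phi\,d\mu = \int_M \phi\,d\mu^+$ for every continuous $\phi\colon M \to \mathbb{R}$. Furthermore, if $f$ preserves a probability measure $\lambda_0 \ll \lambda$, we have $\lambda_0 = \mu^+$. In fact, the set of points where (1) does not hold has Lebesgue measure zero, so has measure zero with respect to $\lambda_0$. Applying the dominated convergence theorem (0.3.4), we get

$$\int_M \phi\,d\mu^+ = \int_M\left(\int_M \phi\,d\mu^+\right)d\lambda_0 = \int_M\left(\lim_{n\to\infty}\frac{1}{n}\sum_{j=0}^{n-1}\phi(f^j(x))\right)d\lambda_0(x) = \lim_{n\to\infty}\frac{1}{n}\sum_{j=0}^{n-1}\int_M \phi \circ f^j\,d\lambda_0 = \int_M \phi\,d\lambda_0.$$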
For $\varepsilon > 0$, the local stable manifold $W^s_\varepsilon(x)$ is defined as

$$W^s_\varepsilon(x) := \{y \mid d(f^n(y), f^n(x)) \le \varepsilon \text{ for all } n \ge 0\},$$

and the local unstable manifold $W^u_\varepsilon(x)$ as

$$W^u_\varepsilon(x) := \{y \mid d(f^n(y), f^n(x)) \le \varepsilon \text{ for all } n \le 0\}.$$

These sets can have an extremely complex structure. If $f$ is an Anosov diffeomorphism, however, and $\varepsilon$ is small enough, they are diffeomorphic to discs.

Theorem 2.5. If $f\colon M \to M$ is an Anosov diffeomorphism of class $C^r$, there exists $\varepsilon_0 > 0$ such that:
III. Expanding Maps and Anosov Diffeomorphisms
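a) For every $0 < \varepsilon \le \varepsilon_0$ and every $x \in M$ the sets $W^s_\varepsilon(x)$ and $W^u_\varepsilon(x)$ are $C^r$ diffeomorphic to discs, and $T_x W^s_\varepsilon(x) = E^s_x$, $T_x W^u_\varepsilon(x) = E^u_x$.
b) For every $0 < \varepsilon \le \varepsilon_0$ there exists $\delta = \delta(\varepsilon)$ such that $\#(W^s_\varepsilon(x) \cap W^u_\varepsilon(y)) = 1$ for all $x, y \in M$ such that $d(x, y) \le \delta$.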
This theorem was demonstrated by Anosov in [A4]. It was later generalized by Hirsch, Pugh and Shub [H4]. Figures I and II illustrate the relevant properties in the three-dimensional case:
[Fig. I] [Fig. II]
From now on f will denote an Anosov diffeomorphism of M. Put 0, 0 < oc ~ 1 such that IJ(z) - J(w)I ~ Cmax{d(z, w)",d(n(x), n(w)f}
for every z, we W"(y,R) and Ju(n(A)) =
for every Borel set A
c
L
J dJ.
W"(y, R).
In Section 4 we shall also prove, using Lemma 2.7, the following property:
We shall also need the following two lemmas. A proof of the first can be found in [N2]; for the second, see [H4].

Lemma 2.9. The fiber bundle $E^u$ is Hölder continuous.

Lemma 2.10. There exists a $C^\infty$ Riemannian metric such that, denoting by $\|\cdot\|$ the associated norm, we have $\|(D_x f)|E^s_x\| < 1$, $\|(D_x f^{-1})|E^u_x\| < 1$ for every $x \in M$.
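Proof of Theorem 2.3. Let $\mathcal{R} = \{R_1, \dots, R_m\}$ be a Markov partition. Write

$$M_0 := \bigcap_{n \in \mathbb{Z}} f^n\Big(M \setminus \bigcup_j \partial R_j\Big).$$

Then $f(M_0) = M_0$ and, by Lemma 2.8, we have $\lambda(M \setminus M_0) = 0$. Choose for each $j$ a point $x_j \in R_j$, and write $X := \bigcup_j W^u(x_j, R_j) \cap M_0$. Define $h\colon M_0 \to X$ by

$$h(x) := W^s_{\varepsilon_0}(x) \cap X;$$

the reader can verify that $h$ is well-defined. Define $F\colon X \to X$ as $F := h \circ f$. Then

$$F h(x) = h f(x) \tag{7}$$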
for every $x \in M_0$. Since $X$ is a manifold, it has a well-defined Lebesgue measure $\lambda_1$. Let $\lambda_0$ be the probability measure on the Borel $\sigma$-algebra of $X$ defined by $\lambda_0(A) = \lambda_1(A)/\lambda_1(X)$ for $A \in \mathcal{A}$, where $\mathcal{A}$ is the Borel $\sigma$-algebra of $X$. From Lemma 2.7, we have (8)
Finally, we define a metric $d(\cdot,\cdot)$ on $X$. Put $d(x, y) := 1$ if $x$ and $y$ belong to different rectangles, and

$$d(x, y) = \sup_{z \in R_j} d_u\big(h^{-1}(x) \cap W^u(z, R_j),\ h^{-1}(y) \cap W^u(z, R_j)\big) \tag{9}$$

for $x, y \in W^u(x_j, R_j)$, where $d_u(\cdot,\cdot)$ denotes the Riemannian distance in the manifold $W^u(z, R_j)$ that comes from the Riemannian structure induced by the metric given by Lemma 2.10. The metric $d_u$ satisfies (10) for $p \in M$ and $q \in W^u_{\varepsilon_0}(p)$, where

$$\lambda := \sup_{x \in M} \big\{ \|(D_x f)|E^s_x\|,\ \|(D_x f^{-1})|E^u_x\| \big\}. \tag{11}$$
We claim that Fis an expanding map of(X,d,2 0 ). Take the partition &'0 formed by the sets P_; := W"(x1, R1) n M 0 , and put &. := V'l=o F-1(&'0 ). Then property (a) in the definition of an expanding map is immediately satisfied. That F(P_;), P; e &',.+ 1 , is a union of atoms of&',. follows from Lemma 2.6. The injectivity of F is clear. As a consequence of(9) and (10), property (c) holds ifwe take K = 1 and las in (11). The transitivity of F implies (d). Finally, to prove (e)", we take for each 1 ~ i ~ m the integer j(i) such that f(xi) e RJ(i)• Defining nJ(i>: RJ -> W"(xJ W"(x1, R1) be defined by p(x) := w.:(x) n W"(x1, R1).
Then v(S) = ).(p(S)) and S is a cylinder contained in ~- Let S0 c p(A) be a compact set such that ).(p(A)\S0 ) < B. Then A 0 = p-1 (S0 ) is compact and v(A\A 0 ) = ).(p(A)\S0 ) < B. Thus v satisfies the condition of the lemma and is u-additive; in particular, µ00 is also Cl-additive, and a probability measure on u.;a,oJ"(h- 1 (d')). We now apply Theorem 0.1.5 to obtain a probability measure µ+ over V':=of"(h- 1 (d')). Since µ 00 is !-invariant, so is µ+. We will show that v.;,,of"(h- 1 (d)) is the Borel Cl-algebra of Mo. Observe that the atoms of the partition v:=o f-"(9P) belong to h- 1 (d') for every N ~ 0. This implies thatthe atoms of Ji(V:;,,of-"(9P)) belong to jl(h- 1 (d')) for every j. Thus the atoms of
belong to f"(h- 1 (d')). We'll be done if we show that the partitions Bf. generate the Borel u-algebra of M 0 • Since (Bf.) is an increasing sequence, it is enough to check that lim ( sup diam(A)) n-+oo
= 0.
(13)
AE!Jln
If x, y EA E Bf., then fm(x) and fm(y) belong to the same atom of 9P for-every -n ~ m ~ n. This implies that d(r(x),fm(y)) ~ Bo for every -n ~ m,:;; n. If (13) were false, we would have a sequence (n1}j;,,; diverging to infinity, a number r > 0 and sets A •. E Bf•. such that diam R •. > r. Take x. , Yn. E Bf•. such that d(x• ., Yn .) > r. 1 1 1 1 1 1 1 1 Then d(r(x.),r(Y.)) ,:;; Bo
for every -n1 ~ m,:;; n1. Put x := limJ-+oo x";' y := limJ-+oo Yn; (taking a subsequence if necessary). Then d(r(x),r(y)) ,:;;
Bo
for every m. This gives x E W.:(y) n E:/y) which, by Theorem 2.5, implies x = y. But d(x, y) = lim1_ d(x. , Y. ) > r. This contradiction completes the proof of (13) and shows that u.;,,of"(h- 1 (d)) is the Borel u-algebra of Mo. We finally extendµ+ from M 0 to M by putting µ+(A):= µ+(An M0 ) for A a Borel set of M. We will show that µ+ satisfies the desired properties (with Kautomorphism instead of Bernoull~ as already mentioned). Thatµ+ is !-invariant was already proved. To show that f is a K-system, we take the u-algebra h- 1 (d') and show that it satisfies the conditions of the definition. We have already seen thatf- 1 (h- 1(d')) C h- 1 (d') and that u.;,,of"(h- 1 (d')) is the Borel u-algebra of M0 and consequently the Borel u-algebra of M (mod 0). Finally, take A E n.;,,of-"(h- 1(d')). Then 00
AE
n r·h-l(d')) = n h-l w•(x) = W'(y), W"(x) n W"(y) #- 0 => W"(x) = W"(y). Ty w•(x)
The family $(W^s(x))_{x \in M}$ is called the stable foliation and denoted by $\mathcal{F}^s$. The unstable foliation $\mathcal{F}^u$ is defined analogously. We say that a submanifold $N \subset M$ is transversal to $\mathcal{F}^s$ if

$$T_x N \oplus E^s_x = T_x M$$

for every $x \in N$. A Poincaré transformation of $f$ is a triple $(U, V, \varphi)$, where $U$ and $V$ are manifolds transversal to $\mathcal{F}^s$ and $\varphi\colon U \to V$ is a continuous injective map such that

$$\varphi(x) \in W^s(x) \cap V$$

for any $x \in U$.
For linear Anosov diffeomorphisms every Poincaré transformation is $C^\infty$, but in general, even if $f$ is $C^\infty$, Poincaré transformations are not even Lipschitz. It can be proven, though we won't need it here, that if $f$ is Hölder $C^1$ every Poincaré transformation is Hölder continuous. In the case of codimension one Anosov diffeomorphisms (i.e. $\dim E^s_x = 1$ for every $x$) Poincaré transformations are $C^1$ (exercise 3.1). This property (or rather, the corresponding property of Anosov flows) was essential in Hopf's proof that the geodesic flow on the unitary tangent bundle of a closed surface of negative curvature is ergodic. His proof, however, only used the differentiability of Poincaré transformations via their Jacobian. Anosov found out that Poincaré transformations of Hölder $C^1$ Anosov diffeomorphisms have a Jacobian, even if they are not differentiable, and this enabled him to extend Hopf's proof and show that Hölder $C^1$ Anosov diffeomorphisms which preserve Lebesgue measure are ergodic (this is a particular case of Theorem 2.3). It is still unknown whether the same result holds for Anosov diffeomorphisms which are merely $C^1$.

In order to present Anosov's result in detail, we start by introducing the concept of absolute continuity. A map $\varphi\colon N \to P$, where $N$ and $P$ are manifolds, is called absolutely continuous if it is continuous, injective, and there exists a continuous function $J\colon N \to \mathbb{R}$, called the Jacobian of $\varphi$, such that
3. Absolute Continuity of the Stable Foliation
$$\int_A J\, d\lambda = \lambda(\varphi(A))$$

for every Borel set $A \subset N$, where $\lambda$ denotes Lebesgue measure on $N$ or $P$.

Theorem 3.1. Every Poincaré transformation $(U, V, \varphi)$ of a Hölder $C^1$ Anosov diffeomorphism is absolutely continuous, and, for every compact set $U_0 \subset U$, there exist $C > 0$ and $0 < \gamma \le 1$ such that the Jacobian of $\varphi$ satisfies

$$|J(p) - J(q)| \le C \max\{d(p, q), d(\varphi(p), \varphi(q))\}^\gamma \tag{1}$$

for every $p, q \in U_0$, where $d(\cdot,\cdot)$ denotes the intrinsic metrics of $U$ and $V$.
The Hölder condition of the statement is necessary: there is an example, due to Robinson and Young [R5], of an Anosov diffeomorphism of class $C^1$ which has Poincaré transformations $(U, V, \varphi)$ such that $\varphi$ is not absolutely continuous and, even worse, maps certain Borel sets of measure zero (in $U$) into sets of positive measure. The max in condition (1), on the other hand, is artificial, and can in fact be left aside, since condition (1) and the already mentioned fact that $\varphi$ is Hölder continuous imply that $|J(p) - J(q)| \le C d(p, q)^\gamma$ for every $p, q \in U_0$. However, we will not prove that $\varphi$ is Hölder continuous, and for our purposes (1) is sufficient.

Before the proof proper we establish some basic definitions, properties and notations. We assume that the Riemannian structure of $M$ is such that there exists $0 < \lambda < 1$ satisfying $\|(D_x f)|E^s_x\| < \lambda$ and $\|(D_x f^{-1})|E^u_x\| < \lambda$ for any $x \in M$.
(p)), Po := f'"(p), ifo := /'"(ef>(p)). Then _ jdet(Dp/'")I TPUI m J(p) - ldet(D;(p}/'")I Tq){p)I H(Po,qo, Tp0 f (U), _ J(p)
ldet(Dp/m)I TPUI
_ _
= ldet(D;a;i/'")I T;a;JI H(po, qo, TPof
m
¾,0 / (V)),
m m (U), T,;_J (V)).
Thus J(p) - ldet(DPorm)I Ji,Jm(U)I _ldet(Dqorm)I YgJffl(U)I J(p) - ldet(DPorm)I T,,Jm(U)I ldet(Dqorm)I T,;_Jm(U)I
. H(Po,qo, T,,Jm(U), ¾,0 /m(V)) H(po,ifo, '.li,0 /'"(U), '¼.Jm(V)).
By Lemma 3.6 and (101 the first factor is
197 ~
than
exp(Adm(P0 ,p0 )') ~ exp(Au!) ~ exp(AC11u~11 );
(12)
the second factor is estimated in the same way. For the numerator of the third factor, we have, from (6) and Lemma 3.2: H(Po,qo, T;,0 r(U), 'TqJm(V)) ~ exp(Ci(cx(TPor(U)) + cx(TqJm(V)) + d.(p 0 ,q0 ) 1 )) ~
exp(Ci(A.,,,(cx(TPU) + a(~
V)) + d.(Po,qof)).
Since Sm ;;;:a Pf's0 , it follows that m;;;,, log(s0 /sm)/log(1/P1 ). Thus, putting c := -logljlog(1/P1 ), we have from (11): Am~ (sm/s0 }' ~ (C/so)'uf Defining B:= suppeu0 d.(p,ip(p))- 1 and B' := max{supxeUo cx(T,,U),supye~ 0 and O < the constants appearing in (14), such that
{J ~
1, depending only on
(15) Switching p and pin (15) we get J(p) ~ 1- G' ~ J(p),,... "o·
(16)
Inequality (1) now follows from (15), (16), the compactness of U0 and the continuity ~~
□
We close this section with the proof of Lemmas 3.6, 3.4 and 3.3. Proof of Lemma 3.6. We denote by p( ·, ·) a Riemannian metric on the Grassmanian fiber bundle G of £"-dimensional subspaces of the fibers of TM. The following lemma, proved below, will be necessary:
Lemma3.7. a) For every N0 c N as in the statement of Lemma 3.6 there exist constants A> 0, 0 < v ,;;;; 1 such that, for every x, ye r(No) and n ~ 0, we have (17)
b) There exist constants D > 0 and O
(x) be the matrix of L.(x) with respect to the bases glil(x), ... , ~~>(x)} and {11lil(x), ... , 17jil(x) }. For x EU; and f(x)E ½ let A@(x) be the matrix of Dxf with respect to the bases {~Yl(x), ... , ~~>(x), 17y>(x), ... , 17jil(x)} and {~Yl(f(x)), ... , ~~l(f(x)), 11Yl(f(x)), ... , 17jil(J(x)) }. We will prove by induction that there exist constants A > 0, n0 > 0 and O < v :::;; 1 such that (18)
for every 1 ~ i ~ m, n;, n0 and x, yEf"(N0 ) n U; such thatf-i(x) andf-i(y) belong to the same element of 0/,/ for any O ~ j ~ n. This immediately implies (17), because, since N is Holder C1, we have an inequality similar to (18) for n ~ n0 , and (17) only has to be checked for x close to y. Put
D := sup rx(TwN). WE
Take C1 > 0 and O
0, 0 < v < 8 such that IIL~l(x) - L~l(Y)II ~ Ad(x,y)"
(20)
for any x, yef""(N0 ) n U;. We replace '¥1 by a refinement, also called '¥1, such that 2C1 Dl'¥11'-• < (1 - A.2)A, where 1'¥11 := max 1 .,;;.,;mdiam(U1). It is clear that (19) and (20) are still valid for the new cover. Now use the induction assumption to pick constants A > 0 and O < v < {J such that (18) holds for n0 ~ n ~ p (instead of all n ;;;i, n0 ). Take x, yef!'+i(N0 ) n U1 such that f- 1(x) and f- 1(y) belong to the same element of '¥1 for O ~j ~ p + 1. In particular, 1 (x), 1-1 (y)e ½, and
r
Lt\ 1(x)
= AUil(J-1(x))L~1(f-1(x))AUi>(J-1 (xW 11E!,
analogously for y. Put H := Auil(J-1 (x)) - AU1>(J-1(y)) AUi)(f-l(x))-1 - AUl>(rl(y))-1. Then
and
L~!1(x)- L~~1(Y)
and G :=
= AU1>(J- 1 (x))IE•(J- 1(xn-1 + HlvL~)(rl(y))AUil(J-l(x)rl + AUil(J-l(y))L~i(rl(y))G.
Recalling that, for xef!'(N0 )n ½, we have_ IILt(z)II
= cx(T,,f!'(No)) ~ lPcx(Tr-,No)
and using the induction hypothesis, we get 11Lg~1(x)- L~1+.1(Y)II ~ ..1.2IIL~i(r 1 (x)) - L~1cr 1(y))II
+ C 1dp(r1(x),f-1(y))',V Dl
+ l-lP DC1dp(r 1 (x),r 1(y))' ~ (l 2 A+ 2C1 Ddp(r 1 (x),r 1 (y))1 -•dp(r 1(x),r 1 (y))' ~ Adp(r 1 (x),r 1 (y))' ~ Adp+l(x,y)'.
□
Proof of Lemma 3.3. Let Kc N be compact and K 6 := {yld(y,QJ(K)) < f>}. Since QJ(K) is compact,
and, since (QJ.) converges uniformly to QJ, we have for each {J > 0 an integer n0 > 0 such that QJ.(K) c K, for n ;;;i, n0 • Thus, for every e > 0 there exists n0 such that for n
;;;i,
n0 • Thus
I
JK
J dl
=
lim
I
n➔ +coJK
J.dl
= lim n-+co
l(QJ.(K))
~ l(QJ(K)) + e,
and, since e is arbitrary, we conclude that
Now let Ac N be a Borel set and K 1 c K 2 c ···be compact sets such that
200 a)
A=UK; and
lim n ➔ +co
i=l
l (A\ l) K1) = 0. i=1
Then
In particular, A.(tp(A (Nx n TxM)
= T,;M
for xeM.
We put N,(x) := {ueNxl llull < r}. From now on we denote by d0 (-, ·) the intrinsic metric of G, for G c M a submanifold, and by G,(x) the set {ye Gld 0 (y, x) < r}. Lemma 3.4 will follow from the following
o > 0, r > 0 and r > O such that, if we take compact submanifolds with boundary H, G c M and a continuous injection h: G-+ H satisfying
Lemma 3.8. Let e > 0 be given. There exist constants
o, cx(J;H) ~ o, cx(T,,,G)
~
d(x, h(x))
0 and O < y :i;;;; 1 such that, given x e M and Borel sets A c B c W"(x), we have
for every n ~ 0 such that diamf"(B) to the metric d.( ·, · ).
~ r2 ,
the diameter being calculated with respect
It follows from this lemma and the preceding paragraph that
l.(ut> nf"(A)
_ l.(f"(J-•(ut>) n Ac))
-
l.(f"(f-n(ut>)))
,;:: l.(r"(Ul:1 n Ac)) ,;:: .._,, K l.(rn(ut>)) .._,, Ke •. Write a•ge :=
Uia•R1, and observe that f(a'Bf)
C
a•Bf
(exercise 4.1). Then we have f"(Ac) ::, f"((a'at) 0 there exists n > 0 such that &'oo(x) n Y-"(&'oc,(x)) #-
0-
It is easy to see that for any $y \in T^{-n}(\mathcal{P}_\infty(x))$ we have
1. Introduction
$$\mathcal{P}_\infty(y) \subset T^{-n}(\mathcal{P}_\infty(x)).$$

Suppose that $y$ also belongs to $\mathcal{P}_\infty(x)$. Then $\mathcal{P}_\infty(y) = \mathcal{P}_\infty(x)$, and it follows that

$$\mathcal{P}_\infty(x) \subset T^{-n}(\mathcal{P}_\infty(x)),$$

implying that $T^{-n}(\mathcal{P}_\infty(x)) = \mathcal{P}_\infty(x)$ (mod 0), since the two sets have the same measure. Since $0 < \mu(\mathcal{P}_\infty(x)) \le \mu(\mathcal{P}(x)) < 1$, we conclude that $T^n$ is not ergodic. □

Entropy is a measure of how fast $\mu(\mathcal{P}_n(x))$ converges to zero. The fundamental result which formalizes this assertion is the following theorem, a weaker version of which was proved by Shannon [S1] in 1948, and the present one by McMillan [M2] and Breiman [B10]:
Theorem 1.2. If Tis a measure-preserving map of the probability space (X, d, µ) and [!l> is a partition of the same space satisfying
L µ(P)logµ(P)
I.U).
.2,
H(fll.U)
= H(f:1>
V
fll.K) = H(fl>I.U)
+ H(fllfl>
V
.U);:;;,, H(fl>I.U).
For the second part, recall that the function qi: (0, oo)-+ R defined by qS(x) := x log x is convex. Thus
°"
=-?-µ(M;nPk)l~g 1,k
µ(M;nPk) (P.) =H(.UI&). µ k
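(c) $H(\mathcal{Q}) \le H(\mathcal{Q} \vee \mathcal{P}) = H(\mathcal{P}) + H(\mathcal{Q}|\mathcal{P})$.
(d) $H(\mathcal{P} \vee \mathcal{Q}|\mathcal{M}) = H(\mathcal{P}|\mathcal{M}) + H(\mathcal{Q}|\mathcal{P} \vee \mathcal{M}) \le H(\mathcal{P}|\mathcal{M}) + H(\mathcal{Q}|\mathcal{M})$.
(e) and (f) are trivial. □

Definition. If $T\colon X \to X$ is a measure-preserving map and $\mathcal{P}$ is a partition of $X$, the entropy of $T$ with respect to $\mathcal{P}$ is defined as

$$h(T, \mathcal{P}) := \lim_{n\to+\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big). \tag{1}$$

We must prove the limit exists: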
IV. Entropy
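Proposition 3.2. Let $(a_n)_{n \ge 1}$ be a sequence of positive real numbers such that $\inf_n (a_n/n) > -\infty$ and $a_{n+m} \le a_n + a_m$ for all $n$, $m$. Then the limit $\lim_{n\to+\infty} a_n/n$ exists.

Proof. Put $c := \inf_n (a_n/n)$. Given $\varepsilon > 0$, pick $n_0$ such that $a_{n_0}/n_0 \le c + \varepsilon$. Then, for $n > n_0$, we can write $n = n_0 p + q$ with $n_0 \ge q \ge 1$ and $p \ge 1$. We obtain

$$c \le \frac{a_n}{n} \le \frac{p\, a_{n_0} + a_q}{n} \le \frac{a_{n_0}}{n_0} + \frac{a_q}{n} \le c + \varepsilon + \frac{1}{n} \sup_{1 \le j \le n_0} a_j.$$

For $n$ large enough it follows that

$$c \le \frac{a_n}{n} \le c + 2\varepsilon.$$

□

We can apply the proposition to the sequence $a_n := H\big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\big)$, since

$$a_{n+m} = H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P} \vee T^{-n}\bigvee_{j=0}^{m-1} T^{-j}\mathcal{P}\Big) \le a_n + H\Big(T^{-n}\bigvee_{j=0}^{m-1} T^{-j}\mathcal{P}\Big) = a_n + a_m.$$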
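This proves the existence of the limit in the definition.

Proposition 3.3.
a) $h(T, \mathcal{P}) - h(T, \mathcal{Q}) \le H(\mathcal{P}|\mathcal{Q})$.
b) If $\mathcal{Q} \le \mathcal{P}$ then $h(T, \mathcal{Q}) \le h(T, \mathcal{P})$.
c) $h(T, T^{-1}\mathcal{P}) = h(T, \mathcal{P})$.
d) For every $n \ge 0$, $h(T, \mathcal{P}) = h\big(T, \bigvee_{j=0}^{n} T^{-j}\mathcal{P}\big)$.
e) $h(T, \mathcal{P}) = \lim_{n\to\infty} H\big(\mathcal{P}\,\big|\,\bigvee_{j=1}^{n} T^{-j}\mathcal{P}\big)$.
f) The sequence $\frac{1}{n} H\big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\big)$ is decreasing.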
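Proof. a) Using successively (c), (d) and (e) in Proposition 3.1, we can write

$$H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big) - H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{Q}\Big) \le \cdots = \sum_{i=0}^{n-1} H(\mathcal{P}|\mathcal{Q}) = nH(\mathcal{P}|\mathcal{Q}).$$

But

$$h(T, \mathcal{P}) - h(T, \mathcal{Q}) = \lim_{n\to\infty} \frac{1}{n}\Big(H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big) - H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{Q}\Big)\Big),$$

so (a) is proven.
b) If $\mathcal{Q} \le \mathcal{P}$ we get $H(\mathcal{Q}|\mathcal{P}) = 0$, and (a) implies (b).
c) is trivial.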
3. Entropy
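d) We have

$$h\Big(T, \bigvee_{j=0}^{n} T^{-j}\mathcal{P}\Big) = \lim_{m\to\infty} \frac{1}{m} H\Big(\bigvee_{i=0}^{m-1} T^{-i}\Big(\bigvee_{j=0}^{n} T^{-j}\mathcal{P}\Big)\Big) = \lim_{m\to\infty} \frac{m+n-1}{m}\cdot\frac{1}{m+n-1} H\Big(\bigvee_{j=0}^{m+n-1} T^{-j}\mathcal{P}\Big) = h(T, \mathcal{P}).$$

e) The sequence $H\big(\mathcal{P}\,\big|\,\bigvee_{j=1}^{n} T^{-j}\mathcal{P}\big)$ is decreasing by Proposition 3.1 (b). Let $c \ge 0$ be its limit. Then, by 3.1 (a),

$$H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big) = H\Big(\bigvee_{j=1}^{n-1} T^{-j}\mathcal{P}\Big) + H\Big(\mathcal{P}\,\Big|\,\bigvee_{j=1}^{n-1} T^{-j}\mathcal{P}\Big).$$

By induction the first summand is equal to

$$H(\mathcal{P}) + \sum_{j=1}^{n-2} H\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{j} T^{-i}\mathcal{P}\Big),$$

so we obtain

$$H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big) = H(\mathcal{P}) + \sum_{j=1}^{n-1} H\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{j} T^{-i}\mathcal{P}\Big).$$

This shows that

$$h(T, \mathcal{P}) = \lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n-1} H\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{j} T^{-i}\mathcal{P}\Big).$$

By Cesàro's theorem this limit is $c$, and (e) is proved.

f) From the proof of (e),

$$H\Big(\bigvee_{i=0}^{n-1} T^{-i}\mathcal{P}\Big) = H(\mathcal{P}) + \sum_{j=1}^{n-1} H\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{j} T^{-i}\mathcal{P}\Big).$$

This implies that

$$H\Big(\bigvee_{i=0}^{n-1} T^{-i}\mathcal{P}\Big) \ge nH\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{n} T^{-i}\mathcal{P}\Big).$$

We thus have

$$nH\Big(\bigvee_{j=0}^{n} T^{-j}\mathcal{P}\Big) = nH\Big(\bigvee_{j=1}^{n} T^{-j}\mathcal{P}\Big) + nH\Big(\mathcal{P}\,\Big|\,\bigvee_{i=1}^{n} T^{-i}\mathcal{P}\Big) \le nH\Big(\bigvee_{j=1}^{n} T^{-j}\mathcal{P}\Big) + H\Big(\bigvee_{i=0}^{n-1} T^{-i}\mathcal{P}\Big) = nH\Big(\bigvee_{i=0}^{n-1} T^{-i}\mathcal{P}\Big) + H\Big(\bigvee_{i=0}^{n-1} T^{-i}\mathcal{P}\Big),$$

so that

$$\frac{1}{n+1} H\Big(\bigvee_{j=0}^{n} T^{-j}\mathcal{P}\Big) \le \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big).$$

□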
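The statements of Propositions 3.2 and 3.3 can be checked on a small example. The sketch below is only an illustration under stated assumptions: it takes the shift on two symbols with a stationary Markov measure (the matrix `P` and vector `PI` are arbitrary illustrative choices, not taken from the text) and the partition by the zeroth coordinate, so that the atoms of $\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}$ are the cylinders of length $n$. It computes $H_n = H(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P})$ by brute force.

```python
import itertools
import math

P = [[0.9, 0.1], [0.4, 0.6]]   # transition probabilities (illustrative choice)
PI = [0.8, 0.2]                # stationary vector for P: PI * P == PI

def block_entropy(n):
    """Entropy of the partition into cylinders of length n."""
    h = 0.0
    for word in itertools.product(range(2), repeat=n):
        mass = PI[word[0]]
        for a, b in zip(word, word[1:]):
            mass *= P[a][b]
        if mass > 0.0:
            h -= mass * math.log(mass)
    return h

N_MAX = 10
H = [0.0] + [block_entropy(n) for n in range(1, N_MAX + 1)]  # H[n] = H_n
rates = [H[n] / n for n in range(1, N_MAX + 1)]              # sequence in 3.3 (f)
cond = [H[n] - H[n - 1] for n in range(2, N_MAX + 1)]        # conditional entropies in 3.3 (e)

# Known closed form for the entropy of a stationary Markov measure.
entropy_rate = -sum(PI[i] * P[i][j] * math.log(P[i][j])
                    for i in range(2) for j in range(2))
print(rates[-1], cond[-1], entropy_rate)
```

In this example `rates` decreases, `H` is subadditive, and the conditional entropies in `cond` already equal the limit `entropy_rate`, which is what (e) and (f) predict.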
Exercises

3.1 Let $X$ be a compact metric space and $\mathcal{P} = (P_i)_{1 \le i \le m}$ a partition of $X$ into compact Borel sets. We denote by $H_\mu(\mathcal{P})$ the entropy of $\mathcal{P}$ with respect to $\mu \in \mathcal{M}(X)$.
a) Prove that the function $\mu \mapsto H_\mu(\mathcal{P})$ on $\mathcal{M}(X)$ is upper semicontinuous at any point $\mu_0$ such that $\mu_0(\partial P_i) = 0$ for every $i$.
b) Let $T\colon X \to X$ be continuous. Prove that the function $\mu \mapsto h_\mu(T, \mathcal{P})$ on $\mathcal{M}_T(X)$ is upper semicontinuous at any point $\mu_0$ such that $\mu_0(\partial P_i) = 0$ for every $i$.
3.2 Let (X,d,µ) be a probability space. Let 1tn(X,d,µ) be the set of partitions of (X, d, µ) consisting of n atoms (after identifying partitions which coincide (mod 0)). For two partitions fl'= (P;) 1 ,;;,,;;n and ut = (R,) 1 ,;;1,;;m define d(fl',ut) := inf sup µ(P,A R~,J), (,
l~i~n
where R is continuous with respect to d. c) Let T be an automorphism of (X, d, µ). Prove that the function fl' 1-+ h(T, fl') is continuous with respect to d.
4. The Kolmogorov-Sinai Theorem

Definition. Let $(X, \mathcal{A}, \mu)$ be a probability space, and $T$ a measure-preserving map. The entropy $h(T)$ of $T$ is the supremum of $h(T, \mathcal{P})$ over all finite partitions $\mathcal{P}$ of $(X, \mathcal{A}, \mu)$.

Observe that taking the supremum over all partitions $\mathcal{P}$ such that $H(\mathcal{P}) < +\infty$ gives the same result. On the one hand, since all finite partitions have finite entropy, this supremum is not less than $h(T)$. On the other, if $\mathcal{P} = (P_i)_{i \ge 1}$ and $H(\mathcal{P}) < +\infty$, we define $\mathcal{P}^{(n)} := \{P_1, \dots, P_n, \bigcup_{i > n} P_i\}$. Then we have

$$0 \le h(T, \mathcal{P}) - h(T, \mathcal{P}^{(n)}) \le H(\mathcal{P}|\mathcal{P}^{(n)}), \qquad \lim_{n\to+\infty} h(T, \mathcal{P}^{(n)}) = h(T, \mathcal{P}),$$

and $h(T, \mathcal{P})$ can be approximated by the entropies $h(T, \mathcal{P}^{(n)})$.

Proposition 4.1. Equivalent maps have the same entropy.

Proof. The proof is left to the reader.
□
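We now develop some fundamental rules for the calculation of the entropy of maps.

Theorem 4.2. Let $\mathcal{P}_1 \le \mathcal{P}_2 \le \mathcal{P}_3 \le \cdots$ be partitions of $X$, and $\mathcal{P}$ a partition with $H(\mathcal{P}) < +\infty$. Then $\mathcal{P}$ is contained in the [...] are the intervals $\{z \mid 2\pi(j-1)/n^m < \operatorname{Arg}(z) < 2\pi j/n^m\}$, $1 \le j \le n^m$. Then $\bigvee_{j=0}^{\infty} T^{-j}\mathcal{P}$ is equal to the Borel $\sigma$-algebra of $S^1$. By Corollary 4.5 this shows that $h(T) = h(T, \mathcal{P})$. But

$$h(T, \mathcal{P}) = \lim_{m\to\infty} \frac{1}{m} H\Big(\bigvee_{j=0}^{m-1} T^{-j}\mathcal{P}\Big),$$

and the atoms of $\bigvee_{j=0}^{m-1} T^{-j}\mathcal{P}$ are $n^m$ in number and all have measure $n^{-m}$. Thus

$$h(T, \mathcal{P}) = -\lim_{m\to\infty} \frac{1}{m}\, n^m n^{-m} \log\Big(\frac{1}{n}\Big)^m = -\lim_{m\to\infty} \frac{m}{m} \log\Big(\frac{1}{n}\Big) = \log n.$$

□

4) The entropy of the translation $T(z) = az$ of $S^1$ is zero.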
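Proof. Consider first the case $\operatorname{Arg}(a) \notin 2\pi\mathbb{Q}$. Then $T$ is ergodic, and consequently minimal. If we take a partition $\mathcal{P}$ formed by two intervals $(a_1, a_2)$ and $(a_2, a_1)$, the atoms of $\bigvee_{j=1}^{n} T^{-j}\mathcal{P}$ consist of all intervals with endpoints $T^{-j}a_1$, $T^{-j}a_2$. Since $T$ is minimal, $\bigvee_{j=1}^{\infty} T^{-j}\mathcal{P}$ is equal to the Borel $\sigma$-algebra of $S^1$. Then

$$h(T) = h(T, \mathcal{P}) = \lim_{n\to\infty} H\Big(\mathcal{P}\,\Big|\,\bigvee_{j=1}^{n} T^{-j}\mathcal{P}\Big).$$

Since $\mathcal{P}$ is contained in the $\sigma$-algebra $\bigvee_{n=1}^{\infty} T^{-n}\mathcal{P}$, Theorem 4.2 shows that we indeed have

$$h(T) = h(T, \mathcal{P}) = 0.$$

If $\operatorname{Arg}(a) = 2\pi p/q$, we use the following proposition: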
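Proposition 4.6. $h(T^m) = m\,h(T)$ for every $m \ge 0$. If $T$ is invertible, $h(T^m) = |m|\,h(T)$ for $m \in \mathbb{Z}$.

Proof. Let $\mathcal{P}$ be a partition of $X$. Then

$$h(T, \mathcal{P}) = \lim_{n\to+\infty} \frac{1}{nm} H\Big(\bigvee_{j=0}^{m(n-1)} T^{-j}\mathcal{P}\Big) \ge \lim_{n\to+\infty} \frac{1}{nm} H\Big(\bigvee_{j=0}^{n-1} T^{-jm}\mathcal{P}\Big) = \frac{1}{m} \lim_{n\to+\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{-jm}\mathcal{P}\Big) = \frac{1}{m} h(T^m, \mathcal{P}). \tag{1}$$

On the other hand,

$$h\Big(T^m, \bigvee_{j=0}^{m} T^{-j}\mathcal{P}\Big) = \lim_{n\to+\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{-jm}\Big(\bigvee_{i=0}^{m} T^{-i}\mathcal{P}\Big)\Big) = \lim_{n\to+\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{nm} T^{-j}\mathcal{P}\Big) = \lim_{n\to+\infty} \frac{nm}{n}\cdot\frac{1}{nm} H\Big(\bigvee_{j=0}^{nm} T^{-j}\mathcal{P}\Big) = m\,h(T, \mathcal{P}),$$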
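so that $h(T^m) \ge m\,h(T)$, since for every partition $\mathcal{P}$ there exists a partition $\mathcal{Q} := \bigvee_{j=0}^{m} T^{-j}\mathcal{P}$ such that $h(T^m, \mathcal{Q}) \ge m\,h(T, \mathcal{P})$. By (1), $h(T^m, \mathcal{P}) \le m\,h(T, \mathcal{P})$, so $h(T^m) \le m\,h(T)$. This shows $h(T^m) = m\,h(T)$. If $T$ is invertible, we have for any partition $\mathcal{P}$:

$$h(T, \mathcal{P}) = \lim_{n\to\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{-j}\mathcal{P}\Big) = \lim_{n\to\infty} \frac{1}{n} H\Big(\bigvee_{j=0}^{n-1} T^{j}\mathcal{P}\Big) = h(T^{-1}, \mathcal{P}),$$

so $h(T^{-1}) = h(T)$ and $h(T^{-m}) = m\,h(T^{-1}) = m\,h(T)$.

□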
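Proposition 4.6 can be illustrated with the maps $z \mapsto z^n$ of example 3. The following sketch is a hedged illustration, not part of the proof: it identifies $S^1$ with $[0, 1)$, takes $T(x) = nx \bmod 1$ with Lebesgue measure and the partition into $n$ equal arcs, and estimates $\frac{1}{m} H(\bigvee_{j=0}^{m-1} T^{-j}\mathcal{P})$ by counting itineraries of a uniform grid of sample points. The function name `join_entropy_rate` and the grid construction are assumptions of the example; exact integer arithmetic is used so that no sample point falls on an arc endpoint.

```python
import math
from collections import Counter

def join_entropy_rate(n, m, extra=2):
    """(1/m) * H of the join of T^{-j}P, j = 0..m-1, for T(x) = n*x mod 1.

    P is the partition of [0, 1) into n equal arcs. Sample points are
    x = (2k + 1) / denom, a uniform grid with n**extra points per atom.
    """
    points = n ** (m + extra)
    denom = 2 * points
    counts = Counter()
    for k in range(points):
        a = 2 * k + 1                 # numerator of the current point
        word = []
        for _ in range(m):
            a *= n
            word.append(a // denom)   # index of the arc containing the point
            a %= denom                # numerator of the image under T
        counts[tuple(word)] += 1
    total = sum(counts.values())
    entropy = -sum(c / total * math.log(c / total) for c in counts.values())
    return entropy / m

rate_T = join_entropy_rate(2, 8)      # T(x) = 2x mod 1
rate_T2 = join_entropy_rate(4, 4)     # T^2(x) = 4x mod 1, partition into 4 arcs
print(rate_T, rate_T2, 2 * math.log(2))
```

Under these assumptions the two rates come out as $\log 2$ and $2\log 2$, consistent with $h(T^2) = 2h(T)$.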
Lemma 4.7. Let $A$ be in the $\sigma$-algebra $\bigvee_{n=1}^{\infty} \mathcal{P}_n$. Then, for every $\varepsilon > 0$, there exists $N > 0$ such that for all $n \ge N$ there exists a union $A_n$ of atoms of $\mathcal{P}_n$ such that

$$\mu(A_n \,\Delta\, A) \le \varepsilon.$$

Proof. The set of $A$ with this property obviously contains $\bigcup_{n \ge 1} \mathcal{P}_n$ and is easily seen to form a $\sigma$-algebra. □
Proof of Theorem 4.2. Assume flJJ is contained in the u-algebra V':=t flJJ•. Given e, take N > 0 such that if n ;;,, N and P; is an atom of flJJ there exists P/"l E P. such that
µ(P; LI P/"l) ,,,-; e. Put j-1
AJ"l :=
P_/"1\U P/"l,
2 ,,,-;j ,,,-; r - 1,
i=l r-1
A)(f µ(A!•>nPi)logµ(A!•>nPi)) • i-fu. µ(A!"l) µ(A\"l)
as n increases is zero, since the summands in parentheses all converge to zero (µ(A!•> n Pi) goes to zero if i ¥= j, and µ(A)•> n Pi)/µAt> goes to 1 if i = j). By Proposition 3.1 (b),
so that n-t-+oo
We now prove the converse. Suppose that limH(&'I&'.) = O; we show that there exists a sequence (A.), A.e&>., such that limµ(A.A P) = 0. Put&'.:= {Pl"lli ~ 1}. Given O < l < 1, defines.:= {jlµ(PnPj•>)/µ(Pi) < l}. Then H(&'I&'.)
~
-
L µ(P n Pi(n)) logµ(P n 1")) µ(Pj )
jeS•
~
- I:
µ(P n Pj•>) log l.
jeSn
For A.:=
Uus. Pi, we get
µ(P\_U Pj·1) = µ(Pn _U Pi 0 for every non-trivial partition &i".
if
Proof We prove only one direction; for the converse see [K3]. Let T be Kolmogorov, with sub-er-algebra d 0 . We use the following lemma:
Lemma 4.11. Given O < .5 < 1 there exists c > 0 such that every partition fJJJ ·with µ(B) > 8 for every atom B of fJJJ satisfies h(T, fJJJ) ;, c. Proof Assume the lemma false for some 8. There exist partitions fJJJ. the required condition, and such that Iim h(T, fJJJ.) = 0. Then h(T,fJJJ.) = lim H(fJJJ.J m-+oo
Vy-i[JJJ.)
1=1
=
Jim H(r-•fJJJ., m-+ +oo
c
c
d
d 0 satisfying
y-•(v y-i[JJJ.)) 1=1
By Theorem 4.8 we can select, for each n, an atom B E r-•fJJJ. and a union
A(n) of atoms of
n~=1
Vrt: r-1dln
Un;;.NB(n), A:=
n~=l
satisfying µ(B 0.
n
1_. 0
= 0. Since
r-Jd 0 , so µ(A) = 0 and µ(B) = 0. But µ(B)
~ □
Now let &' := { P 1 , ... , P,.} be a non-trivial partition. We want to prove h(T, &') > 0. Since V':=o d 0 = d, it is clear that for every e > 0 there exist mn > 0 and a partition din c Vi,!!0 Tid0 such that µ(pt.n> LIP;) ,;;; e,
where Pfn> is an arbitrary atom of din. If e is sufficiently small, H(&'nl&') and H(&'l&'n) are also small, and so is H(&'n) - H(&'), which is less than or equal to sup{H(&'nl&'), H(&'l&'n)}. Put Y;. := r-m•&'n. Then Y;. c d 0 , and since the atoms of Y;. have the same measures as the atoms of &'n, these measures are close to those of the atoms of&>. By the Lemma, there exists c > 0 such that h(T, Y;.) ;;,:
C.
Thus
But
so
Since H(&'nl&') is arbitrarily small we are done.
□
Exercises
4.1 Let nn(X,d,µ) be defined as in exercise 3.2 (a). For&' and !l. in nn(X,d,µ), we put d(&', !l.) := H(&'l!l.)
Prove that nn(X,d,µ).
d and
+ H(!l.191').
d (as in exercise 3.2 (a)) give rise to the same topology on
4.2 Let $M$ be a compact manifold and $\mu$ a probability measure on the Borel $\sigma$-algebra of $M$. Let $C^0(M, M)$ be the space of continuous maps from $M$ into itself, endowed with the $C^0$ topology, and $C^0_\mu(M, M)$ the subspace formed by the maps that preserve $\mu$. Prove that the entropy function $f \mapsto h_\mu(f)$ on $C^0_\mu(M, M)$ possesses a residual set of continuity points.
5. Entropy of Expanding Maps

In Section III.1 we proved that expanding endomorphisms, Markov transformations of the interval, and, more generally, expanding maps, have a unique invariant probability measure absolutely continuous with respect to Lebesgue measure, and that this measure has other interesting properties: it is equivalent to the Lebesgue measure, and with respect to it the dynamical system is a K-system. The purpose of this section is to compute the entropy of these classes of transformations with respect to this special measure. We start with expanding maps, and obtain the entropy of expanding endomorphisms, Anosov diffeomorphisms and Markov transformations as corollaries.

Let $X$ be a metric space, $\mathcal{A}$ its Borel $\sigma$-algebra, $\mu$ a probability measure on $\mathcal{A}$ and $f$ an expanding map of $(X, \mathcal{A}, \mu)$. Let $J$ be the Jacobian of $f$, and $\mathcal{P}_n$ the partitions of $X$ required by the definition of expanding maps. Let $\mu^+$ be the unique $f$-invariant measure absolutely continuous with respect to $\mu$ (Theorem III.1.3). We intend to prove that the entropy $h_{\mu^+}(f)$ of $f$ with respect to $\mu^+$ is given by

$$h_{\mu^+}(f) = \int_X \log J\, d\mu^+.$$

In order to guarantee that $\log J$ is integrable we have to assume that the entropy $H_\mu(\mathcal{P}_0)$ of $\mathcal{P}_0$ with respect to $\mu$ is finite. By Theorem III.1.3 there exist constants $0 < k_1 < k_2$ such that

$$k_1\,\mu(A) \le \mu^+(A) \le k_2\,\mu(A) \tag{1}$$

for every Borel set $A$. This obviously implies that the entropy $H_{\mu^+}(\mathcal{P}_0)$ of $\mathcal{P}_0$ with respect to $\mu^+$ is finite. On the other hand, it also implies that $\log J$ is integrable with respect to $\mu^+$ if and only if it is with respect to $\mu$.

Theorem 5.1. Put $\mathcal{P} := \mathcal{P}_0$, and assume $H_\mu(\mathcal{P}) < +\infty$. Then $\log J$ is integrable, and

$$h_{\mu^+}(f) = \int_X \log J\, d\mu^+.$$
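The formula of Theorem 5.1 can be sanity-checked on a piecewise linear example. The sketch below makes several assumptions that are not in the text: it takes the map of $[0, 1)$ with two full linear branches, $x \mapsto x/p$ on $[0, p)$ and $x \mapsto (x - p)/(1 - p)$ on $[p, 1)$, which preserves Lebesgue measure, and it treats $|f'|$ as the Jacobian. It compares a midpoint-rule value of $\int \log J\, d\lambda$ with $\frac{1}{n} H$ of the partition into cylinders of length $n$, whose lengths are products of $p$ and $1 - p$.

```python
import math

P_LEFT = 0.3   # length of the left branch (illustrative choice)

def jacobian(x):
    """|f'(x)| for the two-branch piecewise linear map described above."""
    return 1.0 / P_LEFT if x < P_LEFT else 1.0 / (1.0 - P_LEFT)

def integral_log_jacobian(samples=100_000):
    """Midpoint rule for the Lebesgue integral of log J over [0, 1)."""
    return sum(math.log(jacobian((i + 0.5) / samples))
               for i in range(samples)) / samples

def cylinder_entropy_rate(n):
    """(1/n) * H of the partition into cylinders of length n.

    A cylinder whose itinerary visits the left branch k times has length
    p**k * (1 - p)**(n - k), and there are C(n, k) such cylinders.
    """
    total = 0.0
    for k in range(n + 1):
        length = P_LEFT ** k * (1.0 - P_LEFT) ** (n - k)
        total -= math.comb(n, k) * length * math.log(length)
    return total / n

lhs = cylinder_entropy_rate(12)
rhs = integral_log_jacobian()
print(lhs, rhs)
```

Both printed values agree with $-p\log p - (1 - p)\log(1 - p)$ for $p = 0.3$, as the theorem predicts for this example.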
Proof. The integrability of $\log J$ is the less interesting part of the statement, because in the common case that $\mathcal{P}$ is finite it follows immediately from (e) in the definition of expanding maps. Thus we leave the proof of the integrability of $\log J$ to the end, and first use it to prove (1). For $i \ge 1$, put $\mathcal{P}_i := \bigvee_{j=0}^{i} f^{-j}(\mathcal{P}_0)$. We claim that $\mathcal{A}$ is equal (mod 0) to $\bigvee_{j \ge 0} f^{-j}(\mathcal{P}) = \bigvee_{j \ge 0} \mathcal{P}_j$. This is because the atoms of $\mathcal{P}_n$ have diameter $\le K\lambda^n$, where $K$ and $\lambda$ are given by (c) in the definition of an expanding map. Since $\mathcal{P}_1 \le \mathcal{P}_2 \le \cdots$, Theorem 0.5.5 proves the claim. Now Corollary 4.5 implies

$$h_{\mu^+}(f) = h_{\mu^+}(f, \mathcal{P}).$$
To calculate h,,.(f, f?J>), recall that by Lemma 111.1.3 there exist constants C1 > 0 and
O
) =
lim H, (
V y-;£1>
m-1
)
.
J=O
m-+oo
Let Sm be the number of atoms of V;";,,1 y-irJ>. We have Sm= #{0: {l, ... ,m} ➔ {1, ... ,N}la8w8 u+1J
= 1 for every 1 ,s;;,j ,s;;, m}
= L #{0: {1, ... ,m} ➔ {1, ... ,N}la8w8u+iJ = 1 for every 1 B(A) is topologically mixing; thus proposition 11.2.4 shows there exists K c B(A) with µ(K) = 0 and v(K) = 1. Since V':=o rJ>. is equal to the Borel ualgebra (where rJ>. := V;;;J y-if!>), there exists a unions. of atoms of rJ>. such that
234 lim. ➔ 00
v(S.) = 1, lim. ➔ 00 µ(S.) = 0. Since logl = hv(u) = h.(u,£!1') = inf~H.(£!1'.), n n
we get
L7=i X; log X; subject to the constraint L7=i X; = c is equal
Now the maximum of to c log(k/c). Thus
1 #{i1Pl"1 cs.} 1 C #{ilP/"1 cs~} log A ~ -v(S.) log (S ) + -v(S.) log c) . n v n n v(S.
Multiplying by n and subtracting log J. ",
0,;:: v(S )l ""
n
#{ilPl"l Cs.}+ v(Sc)lo #{ilPl"l CS~} og
v(S.)l"
"
g
v(S~)l"
·
Here we need the following lemma: Lemma 6.2. There exists C > 0 such that for all n and i µ(Pl°))~ cl-•. Proof. Let Pl") =:Pion (1-l P;, n ... n
(1-(n-l)
P;._, = C( -(n - 1), i.-1, ... ' io),
Then
□ Using the lemma we obtain µ(s.)
= I:
µ(P/" 1) ~ c;.-•#{ilPl") cs.},
p~nlcsn
µ(S~)
= I: µ(Pl"))~ cl-"#{ilPi") cs~}, p~nlcS~
so that 0
~ v(S.) log Cµ((SS.,)) v
n
+ (1 - v(S.)) log µ(S~) . Cv(S~)
6. The Parry Measure
Since µ(Sn) converges to 1 and v(S,.) converges to zero, the right hand side tends to - oo as n increases, contradiction. In order to prove (3), define Lk: C 0 (B(A)) -+ R by
1
Ld := # p·IX(O" k)f(x). We want to show that Iimk ➔ +oo Ld = Js lone has #(Fix(o-k)nA)
= ai~i:/>-
Hence
so that
On the other hand,
f
£.dµ =µ(A)= Pk,Pk,k2···Pk,_,k,
B(A)
Since qk,k, = Ak,qik,, qk,k,
= Ak q1k,, we get 1
Ak, Aki
qk1k1
= qk1k1'
so that
f
B(A)
£.dµ
=r
1qk,k1
= lim LJAk ➔ oo
Cylinders form a system of relatively compact open neighborhoods of B(A), so every continuous function f can be approximated by a linear combination fo of characteristic functions of cylinders. Say that
llf - folio ,,;; e/3. Since
Ldo =
lim k-+oo
we can find n0 such that for k ;;,, n0
I
Lt/0
f
lo dµ,
JB(A)
-
t
A>
lo dµ
I. ;
s/3.
Then, fork ;;,, n0 ,
ILd- JfB(A) ldµl..;;1Lk(f-lo)l+ILdo-
f lodµj+jf
J B(A)
(f-lo)dµI
J B(A)
..;; Ill - lollo + s/3 + Ill - lollo ..;; s.
□
Exercises
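6.1 Let $X$ be a compact metric space and $T\colon X \to X$ a homeomorphism. Let $\mathcal{P}$ be a finite partition of $X$ such that $\bigvee_{j \ge 0} T^{-j}(\mathcal{P})$ is equal to the Borel $\sigma$-algebra of $X$. Assume $\mu \in \mathcal{M}_T(X)$ is ergodic and there exists $C > 0$ such that

$$\mu(P) \ge C \exp(-n\,h_\mu(T))$$

for all $P \in \bigvee_{j=0}^{n} T^{-j}(\mathcal{P})$. Prove that $h_\nu(T) < h_\mu(T)$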
r- 1(&>). Prove that h,(T) < h,,(T)
for all v e .,I(T(X) distinct from µ. 6.2 Given a subshift B(A) of finite type, let 0: {O, ... , n}-+ {1, ... , N} such that ao(i)ou+1>
s.
be the number of functions
= 1 for O ..;; j ..;; n -
1.
Prove that the Parry measure µ satisfies, for every cylinder C(O, k 0 , ••• , k1): µ(C(O, k 0 , ••. , k1))
= -
1
s.
# {O!O(j) = k1 for O ..;;j..;; land a6(.j)ou+1>
= 1 for O ..;;j..;; n}.
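The counts $S_n$ of exercise 6.2 are easy to compute. The sketch below is an illustration under stated assumptions: it reads $S_n$ as the number of words $\theta(0)\dots\theta(n)$ with $a_{\theta(j)\theta(j+1)} = 1$ for $0 \le j \le n - 1$, uses the matrix `A` of the golden-mean shift as an arbitrary example, and estimates the dominant eigenvalue by power iteration (a standard numerical method, not one used in the text). It then compares $\frac{1}{n} \log S_n$ with the logarithm of that eigenvalue, the value that Section 7 identifies as the topological entropy of the subshift.

```python
import math

A = [[1, 1], [1, 0]]   # transition matrix of the golden-mean shift (illustrative)

def count_admissible(A, n):
    """S_n: number of words theta(0)..theta(n) with A[theta(j)][theta(j+1)] == 1."""
    size = len(A)
    ends = [1] * size   # ends[k] = number of admissible words ending in symbol k
    for _ in range(n):
        ends = [sum(ends[i] * A[i][k] for i in range(size)) for k in range(size)]
    return sum(ends)

def dominant_eigenvalue(A, iterations=200):
    """Power iteration; assumes A is a primitive nonnegative matrix."""
    size = len(A)
    v = [1.0] * size
    lam = 1.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = max(w)
        v = [c / lam for c in w]
    return lam

n = 400
growth_rate = math.log(count_admissible(A, n)) / n
lam = dominant_eigenvalue(A)
print(growth_rate, math.log(lam))
```

For this matrix the counts are Fibonacci numbers and the eigenvalue is the golden ratio, so the two printed values should be close.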
7. Topological Entropy

Let $X$ be a compact metric space and $T\colon X \to X$ a (not necessarily continuous) map. We say that a subset $S \subset X$ is an $(n, \varepsilon)$-generator if for every $x \in X$ there exists $y \in S$ such that $d(T^j(x), T^j(y)) \le \varepsilon$ for all $0 \le j \le n$. In other words, every point of $X$ stays $\varepsilon$-close to some point of $S$ for at least $n$ iterations. It is easy to show that the compactness of $X$ implies the existence of finite generators: take a cover $\{U_1, \dots, U_m\}$ of $X$ by sets with diameter $\le \varepsilon$, and choose a point $z$ in every non-empty set of the form
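$$U_{i_0} \cap T^{-1}(U_{i_1}) \cap \cdots \cap T^{-n}(U_{i_n}), \tag{1}$$

where $1 \le i_j \le m$, $0 \le j \le n$. The at most $m^{n+1}$ points thus selected form an $(n, \varepsilon)$-generator. To see this, take $x \in X$, and for each $0 \le j \le n$ take $i_j$ such that $T^j(x) \in U_{i_j}$. Then $x$ belongs to the set in (1), and, if $z$ is the element of $S$ chosen to represent this set, we have $d(T^j(x), T^j(z)) \le \varepsilon$ for $0 \le j \le n$ since $T^j(x)$ and $T^j(z)$ both belong to $U_{i_j}$, which has diameter $\le \varepsilon$. Let $r(n, \varepsilon)$ be the least number of points in an $(n, \varepsilon)$-generator. We have just seen that $r(n, \varepsilon) \le m^{n+1}$. Thus

$$\limsup_{n\to+\infty} \frac{1}{n} \log r(n, \varepsilon) \tag{2}$$

is finite. We define the topological entropy of $T$ as

$$h_{\mathrm{top}}(T) = \lim_{\varepsilon\to 0} \limsup_{n\to+\infty} \frac{1}{n} \log r(n, \varepsilon). \tag{3}$$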
Since $r(n, \varepsilon)$ increases as $\varepsilon$ decreases, the same happens to (2). Thus the outer limit in (3) exists, although it can be infinite. In exercise 7.4 we will encounter homeomorphisms with infinite topological entropy. If $T$ is a Lipschitz map of a compact set homeomorphic to a subset of $\mathbb{R}^n$, the topological entropy is finite (exercise 7.6). Diffeomorphisms of compact manifolds, for example, fall into this case. The lim sup in (3) cannot be replaced by a limit: there are examples of maps for which the sequence $\frac{1}{n} \log r(n, \varepsilon)$ diverges for arbitrarily small values of $\varepsilon$. However, even when $\liminf_{n\to\infty} \frac{1}{n} \log r(n, \varepsilon)$ and $\limsup_{n\to\infty} \frac{1}{n} \log r(n, \varepsilon)$ are distinct, their difference is small for small $\varepsilon$.

Proposition 7.1.

$$h_{\mathrm{top}}(T) = \lim_{\varepsilon\to 0} \liminf_{n\to\infty} \frac{1}{n} \log r(n, \varepsilon).$$
This proposition will be proved later. We now comment on the relation between topological and metric (i.e. measure-theoretic) entropy. In 1963, in the paper which introduced the concept of topological entropy [A1], Adler, Konheim and McAndrew conjectured that for $T$ continuous, the topological entropy is the supremum of the metric entropies:

$$h_{\mathrm{top}}(T) = \sup_{\mu \in \mathcal{M}_T(X)} h_\mu(T).$$

This equality, known as the variational property of the entropy, was proved by Dinaburg [D2], Goodman [G4] and Goodwyn [G5]. We prove it in Section 8 for the case of $X$ homeomorphic to a subset of $\mathbb{R}^n$, which covers the most relevant situations arising in the study of dynamical systems. It follows from this equality and Parry's theorem (6.1) that the topological entropy of a subshift of finite type $\sigma\colon B(A) \to B(A)$ is the logarithm of the dominant eigenvalue of the matrix $A$. Below we prove the same result directly. Observe that
IV. Entropy
238
in this case the topological entropy is actually the maximum of the metrical entropies; this is not true in general. Misiurewicz [M9] gave examples of C' diffeomorphisms of compact surfaces for which the maximum is achieved, but only for 1 ,.;; r < oo. No C00 examples are known. To prove 7.1 we shall use the following lemma:
Lemma 7.2. Let $n_i$, $i = 1, \dots, N$, be positive integers, and $\varepsilon > 0$. Then
$$r\Big(\sum_{i=1}^{N} n_i,\, 2\varepsilon\Big) \le \prod_{i=1}^{N} r(n_i,\varepsilon).$$
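Proof. Let $S_i := \{x^{(i)}_1, \dots, x^{(i)}_{l_i}\}$ be an $(n_i,\varepsilon)$-generator, with $l_i = r(n_i,\varepsilon)$. Let $S$ be the set of $N$-tuples $\alpha := (j_1, \dots, j_N)$ with $1 \le j_i \le l_i$ such that there exists $z(\alpha) \in X$ such that
$$d\big(T^t(T^{m_i}(z(\alpha))),\, T^t(x^{(i)}_{j_i})\big) \le \varepsilon$$
for all $0 \le t \le n_i$, where
$$m_i := \sum_{j=1}^{i-1} n_j, \qquad i = 1, \dots, N.$$
We obviously have
$$\# S \le \prod_{i=1}^{N} l_i = \prod_{i=1}^{N} r(n_i,\varepsilon).$$
On the other hand, $\{z(\alpha) \mid \alpha \in S\}$ is a $(\sum_{j=1}^{N} n_j, 2\varepsilon)$-generator because, given $x \in X$, we can find $1 \le j_i \le l_i$ such that
$$d\big(T^t(T^{m_i}(x)),\, T^t(x^{(i)}_{j_i})\big) \le \varepsilon, \qquad 0 \le t \le n_i, \quad i = 1, \dots, N.$$
Thus $\alpha = (j_1, \dots, j_N) \in S$ and
$$d\big(T^t(T^{m_i}(x)),\, T^t(T^{m_i}(z(\alpha)))\big) \le 2\varepsilon, \qquad 0 \le t \le n_i, \quad i = 1, \dots, N,$$
which implies that
$$d\big(T^t(x),\, T^t(z(\alpha))\big) \le 2\varepsilon, \qquad 0 \le t \le \sum_{j=1}^{N} n_j.$$
We conclude that
$$r\Big(\sum_{i=1}^{N} n_i,\, 2\varepsilon\Big) \le \# S \le \prod_{i=1}^{N} r(n_i,\varepsilon).$$
□

Proof of Proposition 7.1. If $n_0 > 0$, every $n \ge n_0$ can be written as $n = k n_0 + s$, where $k$ and $s$ are integers with $k \ge 1$ and $0 \le s \le n_0$. Thus
$$\log r(n, 2\varepsilon) = \log r(n_0 + \cdots + n_0 + s,\, 2\varepsilon) \le k \log r(n_0,\varepsilon) + \log r(s,\varepsilon).$$
We can rewrite this as
$$\frac{1}{n}\log r(n, 2\varepsilon) \le \frac{k n_0}{n}\cdot\frac{1}{n_0}\log r(n_0,\varepsilon) + \frac{1}{n}\sup_{0 \le s \le n_0} \log r(s,\varepsilon).$$
Now take the lim sup of this inequality for increasing $n$, and observe that $k n_0 / n$ approaches 1 as $n$ increases. Since $n_0$ is arbitrary, we obtain
$$\limsup_{n\to+\infty} \frac{1}{n}\log r(n, 2\varepsilon) \le \inf_{n} \frac{1}{n}\log r(n,\varepsilon),$$
which immediately implies the proposition.
□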
We now consider expansive homeomorphisms (cf. 1.11), for which the definition of topological entropy can be considerably refined.

Proposition 7.3. Let $T$ be an expansive homeomorphism of a compact metric space $X$, with expansiveness constant $\varepsilon_0 > 0$. For every $0 < \varepsilon < \varepsilon_0$ we have
$$h_{\mathrm{top}}(T) = \lim_{n\to+\infty} \frac{1}{n}\log r(n,\varepsilon) = \inf_{n} \frac{1}{n}\log r(n,\varepsilon).$$
Lemma 7.4. Under the assumptions of Proposition 7.3, for every $0 < \varepsilon < \varepsilon_0$ there exist $k > 0$ and $N > 2k$ such that, if $x$ and $y$ belong to $X$ and satisfy
$$d(T^t(x), T^t(y)) \le \varepsilon, \qquad 0 \le t \le n,$$
for some $n \ge N$, then
$$d(T^t(x), T^t(y)) \le \varepsilon/2, \qquad k \le t \le n - k.$$

Proof. Assume the conclusion false. For every $k > 0$, there exist points $x_k$, $y_k$ and integers $n_k \ge 2k$ and $k \le t_k \le n_k - k$ such that
$$d(T^t(x_k), T^t(y_k)) \le \varepsilon \quad \text{for } 0 \le t \le n_k, \tag{4}$$
$$d(T^{t_k}(x_k), T^{t_k}(y_k)) \ge \varepsilon/2. \tag{5}$$
Let $x$ and $y$ be cluster points of the sequences $(T^{t_k}(x_k))$, $(T^{t_k}(y_k))$, taken along a common subsequence. From (5) we get
$$d(x, y) \ge \varepsilon/2. \tag{6}$$
We can rewrite (4) as
$$d\big(T^t(T^{t_k}(x_k)),\, T^t(T^{t_k}(y_k))\big) \le \varepsilon \quad \text{for } -t_k \le t \le n_k - t_k.$$
Thus, recalling that $t_k$ and $n_k - t_k$ diverge to $+\infty$ as $k$ increases, we get
$$d(T^t(x), T^t(y)) \le \varepsilon$$
for every $t$, which contradicts (6) since $T$ is expansive.
□
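Corollary 7.5. Under the assumptions of Proposition 7.3, for every $0 < \varepsilon < \varepsilon_0$ there exist $k > 0$ and $N \ge 2k$ such that
$$r(n,\varepsilon) \ge r(n - 2k,\, \varepsilon/2) \tag{7}$$
for all $n \ge N$.

Proof. Let $S := \{x_1, \dots, x_l\}$ be an $(n,\varepsilon)$-generator with $l = r(n,\varepsilon)$. It follows immediately from Lemma 7.4 that $\{T^k(x_1), \dots, T^k(x_l)\}$ is an $(n - 2k, \varepsilon/2)$-generator: given $y \in X$, apply the generator property of $S$ to the point $T^{-k}(y)$, which exists because $T$ is a homeomorphism.
□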
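Proof of Proposition 7.3. To prove the first equality, take $k$ as in Corollary 7.5 and use (7) to obtain
$$\liminf_{n\to+\infty} \frac{1}{n}\log r(n,\varepsilon) \ge \limsup_{n\to+\infty} \frac{1}{n}\log r(n - 2k,\, \varepsilon/2) \ge \limsup_{n\to+\infty} \frac{1}{n}\log r(n - 2k,\, \varepsilon)$$
$$= \limsup_{n\to+\infty} \frac{n - 2k}{n}\cdot\frac{1}{n - 2k}\log r(n - 2k,\, \varepsilon) = \limsup_{n\to+\infty} \frac{1}{n}\log r(n,\varepsilon).$$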
For the second, take $n_0 > 0$. Write $n > 0$ as $n = l(n_0 + 2k) + s$, where $l$, $s$ are integers and $0 \le s < n_0 + 2k$. Then, by 7.2 and 7.5, we have
$$\log r(n,\varepsilon) \le l \log r(n_0 + 2k,\, \varepsilon/2) + \log r(s,\, \varepsilon/2) \le l \log r(n_0,\, \varepsilon/2) + \log r(s,\, \varepsilon/2) \le l \log r(n_0,\varepsilon) + \log r(s,\varepsilon).$$
Dividing by $n$ and taking the limit as $n$ increases gives the second equality.
□
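Corollary 7.6. Let $\Lambda \subset B(\nu)$ be a subshift, and put $r(n) := \#\{\psi\colon \{0, \dots, n\} \to \{1, \dots, \nu\} \mid \psi$ is the restriction to $\{0, \dots, n\}$ of some $\theta \in \Lambda\}$. Then
$$h_{\mathrm{top}}(\sigma|\Lambda) = \lim_{n\to+\infty} \frac{1}{n}\log r(n).$$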
Proof. Recall from III.9 that $\sigma|\Lambda$ is expansive. We define a metric $d(\cdot,\cdot)$ on $B(\nu)$ as follows:
where $d_0(i,j) = 0$ if $i = j$ and $d_0(i,j) = 1$ otherwise. By exercise I.11.1, there exist arbitrarily small values of $\varepsilon$ and corresponding integers $N := N(\varepsilon)$ for which the following conditions are equivalent:
a) $d(\alpha,\beta) \le \varepsilon$;
b) $d(\alpha,\beta) \le 2\varepsilon$;
c) $\alpha(j) = \beta(j)$ for all $-N \le j \le N$.
Let $S_n \subset \Lambda$ be a set such that for every $\theta \in \Lambda$ there exists a unique $\alpha \in S_n$ whose restriction to $\{-N, \dots, N + n\}$ is equal to the restriction of $\theta$ to the same interval. Since (a) and (c) are equivalent, we have $d(\sigma^j(\alpha), \sigma^j(\theta)) \le \varepsilon$ for every $0 \le j \le n$. Thus $S_n$ is an $(n,\varepsilon)$-generator. This implies that
$$r(n,\varepsilon) \le \# S_n = r(n + 2N).$$
On the other hand, if $S$ is an $(n,\varepsilon)$-generator and $\# S < \# S_n = r(n + 2N)$, there exist distinct $\alpha_1, \alpha_2 \in S_n$ and $\theta \in S$ such that
$$d(\sigma^j(\alpha_i), \sigma^j(\theta)) \le \varepsilon \quad \text{for } 0 \le j \le n,\ i = 1, 2.$$
Thus
$$d(\sigma^j(\alpha_1), \sigma^j(\alpha_2)) \le 2\varepsilon, \qquad 0 \le j \le n,$$
which implies, by the equivalence between (a), (b) and (c),
$$\alpha_1(j) = \theta(j) = \alpha_2(j), \qquad -N \le j \le N + n,$$
contradicting the definition of $S_n$. Thus $r(n,\varepsilon) = \# S_n = r(n + 2N)$, which obviously implies the corollary.
□
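Corollary 7.7. If $B(A)$ is a subshift of finite type and $\lambda$ is the dominant eigenvalue of $A$, we have
$$h_{\mathrm{top}}(\sigma|B(A)) = \log\lambda.$$

Proof. Using the notation of Section 6, we can write
$$r(n) = \sum_{i,j} a^{(n)}_{ij}.$$
Since we have proved that the limit
$$\lim_{n\to\infty} \lambda^{-n} a^{(n)}_{ij}$$
exists for all $i$, $j$ and is non-zero for some $i$, $j$, we get
$$\lim_{n\to\infty} \frac{1}{n}\log r(n) = \log\lambda.$$
□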
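Corollaries 7.6 and 7.7 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the transition matrix is taken to be irreducible and aperiodic (so that every admissible finite word extends to a point of the subshift and the power iteration converges), and the helper names `count_words` and `dominant_eigenvalue` are hypothetical. A word indexed by $\{0, \dots, n\}$ involves $n$ transitions, so $r(n)$ is the sum of the entries of $A^n$.

```python
import math

def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def count_words(a, n):
    """r(n): number of admissible words indexed by {0, ..., n}, computed as
    the sum of the entries of the n-th power of the transition matrix."""
    size = len(a)
    power = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = mat_mul(power, a)
    return sum(sum(row) for row in power)

def dominant_eigenvalue(a, iterations=200):
    """Power iteration; assumes the matrix is non-negative and primitive."""
    size = len(a)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = sum(w) / sum(v)          # growth factor of the l1-norm
        total = sum(w)
        v = [x / total for x in w]     # renormalise to avoid overflow
    return lam

if __name__ == "__main__":
    golden_mean = [[1, 1], [1, 0]]     # the word "22" is forbidden
    lam = dominant_eigenvalue(golden_mean)
    for n in (10, 100, 1000):
        print(n, math.log(count_words(golden_mean, n)) / n, math.log(lam))
```

The two printed columns approach each other slowly, at rate $O(1/n)$, which is consistent with $r(n)$ behaving like a constant times $\lambda^n$.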
A continuous map $T$ of a compact metric space $X$ is called intrinsically ergodic if there exists a unique $\mu \in \mathcal{M}_T(X)$ such that $h_{\mathrm{top}}(T) = h_\mu(T)$. In this case $\mu$ is called the intrinsic measure of $T$.
Proposition 7.8. An intrinsically ergodic map is ergodic with respect to its intrinsic measure.

Proof. Let $T\colon X \to X$ be intrinsically ergodic, with intrinsic measure $\mu$. If $T$ is not ergodic with respect to $\mu$, there exists a Borel set $A$ such that $T^{-1}(A) = A$ and $\mu(A) \notin \{0, 1\}$. Define $\mu_1, \mu_2 \in \mathcal{M}_T(X)$ by $\mu_1(S) = \mu(A)^{-1}\mu(S \cap A)$, $\mu_2(S)$
= µ(~ X be such that $T(0) = 0$, $T(A_i) = A_i$, and $T|A_i$ is topologically equivalent to the shift $\sigma\colon B(j + 1) \to B(j + 1)$. Prove that $h_{\mathrm{top}}(T) = +\infty$.

7.6 Let $X \subset \mathbb{R}^l$ be a compact set and $T\colon X \to X$ a Lipschitz map, i.e. such that there exists $K > 0$ satisfying $d(T(x), T(y)) \le K\,d(x,y)$ for all $x, y \in X$. Prove that $h_{\mathrm{top}}(T) \le l \log K$.

Hint. Let $\varepsilon > 0$ and put $\varepsilon_n = \varepsilon K^{-n}$. Let $S_n := \{x_1, \dots, x_{m(n)}\}$ be such that every point of $X$ has distance $\le \varepsilon_n$ to $S_n$. Prove that $S_n$ is an $(n,\varepsilon)$-generator, and deduce that
$$\limsup_{n\to+\infty} \frac{1}{n}\log r(n,\varepsilon) \le \limsup_{n\to+\infty} \frac{1}{n}\log m(n).$$
8. The Variational Property of Entropy

In this section we prove the following result, stated in Section 7:

Proposition 8.1 (Variational property of entropy). Let $X$ be a compact metric space, and $T\colon X \to X$ a continuous map. The topological entropy of $T$ satisfies
$$h_{\mathrm{top}}(T) = \sup_{\mu\in\mathcal{M}_T(X)} h_\mu(T).$$
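To make the statement concrete, the following sketch evaluates the metric entropy of a one-parameter family of Markov measures on the subshift of finite type with matrix $A = \begin{pmatrix}1&1\\1&0\end{pmatrix}$ and compares it with $\log\lambda$, where $\lambda = (1+\sqrt5)/2$ is the dominant eigenvalue of $A$. It is an illustration under assumptions, not part of the proof: only stationary Markov measures are sampled (not all invariant measures), the standard entropy formula $-\sum_i p_i \sum_j P_{ij}\log P_{ij}$ for Markov measures is used, and the function name `markov_entropy` is hypothetical.

```python
import math

GOLDEN = (1 + math.sqrt(5)) / 2   # dominant eigenvalue of [[1, 1], [1, 0]]

def markov_entropy(q):
    """Entropy of the stationary Markov measure with transitions
    1 -> 1 with probability q, 1 -> 2 with probability 1 - q, 2 -> 1 always."""
    p1 = 1.0 / (2.0 - q)              # stationary weight of symbol 1
    h = 0.0
    for prob in (q, 1.0 - q):         # the row of symbol 2 contributes nothing
        if prob > 0.0:
            h -= prob * math.log(prob)
    return p1 * h

if __name__ == "__main__":
    best_q = max((k / 1000 for k in range(1001)), key=markov_entropy)
    print("largest sampled entropy:", markov_entropy(best_q), "at q =", best_q)
    print("log of dominant eigenvalue:", math.log(GOLDEN))
```

Within this family the supremum is attained at $q = 1/\lambda$, which corresponds to the measure given by Parry's theorem (6.1).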
Proof. The proof presented here requires two additional assumptions. The first, that $K := X$ be a subset of some Euclidean space (or homeomorphic to one), is not very significant, since it is always satisfied in the more interesting examples. The second is that $T$ be a homeomorphism. A radically different and very elegant proof has been given by Misiurewicz ([M13], [W1]). We start by proving that
$$h_{\mathrm{top}}(T) \ge \sup_{\mu\in\mathcal{M}_T(K)} h_\mu(T). \tag{1}$$
Take $\mu \in \mathcal{M}_T(K)$. For $m > 0$ an integer, let $\mathcal{A}_m$ be the partition of $\mathbb{R}$ whose atoms are $[j/m, (j+1)/m]$, $j \in \mathbb{Z}$. Assuming $K \subset \mathbb{R}^N$, let $\mathcal{P}_m$ be the partition of $K$ whose atoms are the sets $A \cap K$, $A \in \mathcal{A}_m \times \cdots \times \mathcal{A}_m$. In principle, the property $\mu(A \cap B) = 0$ for two atoms $A$, $B$ of $\mathcal{P}_m$ is not necessarily satisfied (since $\mu$ is not assumed absolutely continuous with respect to the Lebesgue measure), but we can guarantee that it is satisfied by translating the set $K$ if necessary. Setting $L := 3^N - 1$, each atom of $\mathcal{P}_m$ intersects at most $L$ other atoms of $\mathcal{P}_m$. Given $\varepsilon > 0$, take $m_0$ such that
$$2\,\mathrm{diam}(A) \le \varepsilon \tag{2}$$
for every atom $A$ of $\mathcal{P}_{m_0}$. We show below that, in the notation of exercise 7.2,
$$h_\mu(T, \mathcal{P}_m) \le \frac{1}{n}\log s(n,\varepsilon) + \log L \tag{3}$$
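The constant $L = 3^N - 1$ used above is the number of other closed grid cubes that meet a given cube. A minimal sketch of this count follows, assuming the cubes are indexed by integer vectors, so that the cube with index $(j_1, \dots, j_N)$ is $\prod_i [j_i/m, (j_i+1)/m]$; the function names are hypothetical. Since the atoms of the partition are intersections of such cubes with $K$, the count is an upper bound for the atoms.

```python
from itertools import product

def intervals_meet(i, j):
    """Closed intervals [i/m, (i+1)/m] and [j/m, (j+1)/m] intersect
    exactly when the integer indices differ by at most 1."""
    return abs(i - j) <= 1

def neighbours(index, reach=2):
    """Indices of the other grid cubes that intersect the cube `index`.
    Offsets up to `reach` in each coordinate are scanned."""
    dim = len(index)
    found = []
    for offset in product(range(-reach, reach + 1), repeat=dim):
        other = tuple(a + o for a, o in zip(index, offset))
        if other != index and all(intervals_meet(a, b)
                                  for a, b in zip(index, other)):
            found.append(other)
    return found

if __name__ == "__main__":
    for dim in (1, 2, 3):
        print(dim, len(neighbours((0,) * dim)), 3 ** dim - 1)
```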
for all $m \ge m_0$ and $n > 0$. This implies that
$$h_\mu(T, \mathcal{P}_m) \le \limsup_{n\to\infty} \frac{1}{n}\log s(n,\varepsilon) + \log L$$
for all $m \ge m_0$. Since $\bigvee_{m \ge 0} \mathcal{P}_m$ is equal (mod 0) to the Borel $\sigma$-algebra of $K$, Corollary 4.4 implies that
$$h_\mu(T) = \lim_{m\to\infty} h_\mu(T, \mathcal{P}_m) \le \limsup_{n\to\infty} \frac{1}{n}\log s(n,\varepsilon) + \log L.$$
This holds for all $\varepsilon > 0$, so
$$h_\mu(T) \le h_{\mathrm{top}}(T) + \log L.$$
But the same inequality can be applied to $T^n$:
$$h_\mu(T) = \frac{1}{n} h_\mu(T^n) \le \frac{1}{n}\big(h_{\mathrm{top}}(T^n) + \log L\big) = \frac{1}{n}\big(n\,h_{\mathrm{top}}(T) + \log L\big) = h_{\mathrm{top}}(T) + \frac{1}{n}\log L.$$
Since this holds for all values of $n > 0$ we get
$$h_\mu(T) \le h_{\mathrm{top}}(T),$$
proving (1). To prove (3) we observe that (f) in Proposition 3.3 implies that
Thus, (4)
Let X1,
... ,
xk be points of K, each contained in one atom of
k
=
v;.;;J y-j(~), so that
n-1
#
V
r-j(~).
j=O
By (2), two points X;, xi which are not (e, n)-separated satisfy
for all 0 ~ s ~ n - l. This property implies, by an elementary combinatorial argument, that for each X; there exist at most L" points in the set {x1 , ••. , xd which are not (e, n)-separated from X;. Thus there exists an (e, n)-separated subset {xii, ... , xii} with
It follows that s(n, e) ? j 1 ;;,,
1
n-1
-,;
L
#
Vr
. 1 (&';).
j=O
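Together with (4), this implies (3). To prove that
$$h_{\mathrm{top}}(T) \le \sup_{\mu\in\mathcal{M}_T(K)} h_\mu(T) \tag{5}$$
we shall use the following lemmas:

Lemma 8.2. Let $\Lambda \subset B(N)$ be a subshift. There exists $\mu \in \mathcal{M}_{\sigma|\Lambda}(\Lambda)$ such that
$$h_{\mathrm{top}}(\sigma|\Lambda) = h_\mu(\sigma|\Lambda).$$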
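Lemma 8.3. Consider the commutative diagram
$$\begin{array}{ccc} K_1 & \xrightarrow{\;T_1\;} & K_1 \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle h} \\ K_2 & \xrightarrow{\;T_2\;} & K_2 \end{array}$$
where $K_1$, $K_2$ are compact metric spaces, $T_1$, $T_2$ are continuous and $h\colon K_1 \to K_2$ is continuous and surjective. Then $h^*\colon \mathcal{M}_{T_1}(K_1) \to \mathcal{M}_{T_2}(K_2)$ is surjective.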
We first see how (5) follows from the lemmas. Take fJ > 0 such that . 1 h,0 p(T) ~ hmsup-logs(n,e) n--+ +co n
+ fJ.
(6)
Let &'m be as in the first part of the proof, and define Sm as the set of all subsets A of &'m such that
n p cf 0.
PeA
We define the map hm: K-+ B(Sm) by hm(x)(n)
=
{Pe&'mW(x)eP}.
Also, for l ? I' ? m, we define the map
by hl'l(O)(n) = {ip(A)IAeO(n)},
where ip: &'1 -> &'l' is determined by the condition ip(P) ::::, P. Then hl'l is continuous for every I ? l' ? m. On the other hand, hm is not continuous in general, but its image Am is closed and it satisfies
showing that A,. is a subshift. We also have uh,.,
= h1,1u, = hll".
h11 ,h,,,,,
This means the diagram
K
Tl
a.
Am
·l
A,,
h...,,
h..,,,
h,,.,
·l
h,,.,
A,,
A,
commutes for all I' ;;;,, l ;;;,, m. Take m such that 2diam(A),.; e
(8)
for all A e&'m. The reader can check, using (8), that if {x 1 , ... ,x,} c K is an (e, n)separated set the blocks hm(x;)l[O, n] and hm(xi)l[O, n] are distinct for all l ,.; i ,.; j,.; s. This implies that limsup!logs(n,e),.; h,0 p(ulAm). n ➔ +co n Then, by (6), h,op(T) ,.; h,op(ulAm)
+ b.
Using Lemma 8.2, choose µme.,l(,,JA= such that h,,JulAm)
= h,op(ulAm)-
For every l > m, use Lemma 8.3 to find µ 1 e.,l(.-JA,(A 1) such that
h!.,µ, = µ... Then
for all I > m. Let f/11 be the inverse image under h1 of the Borel u-algebra of A,, and let v1 be the probability over f/11 given by v1(h11 (A)) = µ 1(A). The commutativity of (7) implies that
r- 1 (fll,) C v(T- 1
(A))
for l > m,
= v1(A) for A efll,, l > m,
f/11, => f/11,, µ 1,lfll,,,
fll,
for l'
~
I"
~
m,
= µl'lfll,. for I' > I"> m.
These four properties imply the existence of an additive function µ"': Ur>mf/11 -> [0, 1] such that µcclfll, = µ, for l ~ m, µ"'(r- 1 (A))
= µcc(A)
for A E
U fll,. l>m
We claim thatµ"' is a probability measure. By Theorem 0.1.4, it is enough to exhibit a compact class CC c: Ui>mf/11 such that (9)
For '(J we take all sets of the form h11 (S), where l > m and Sc: A, is compact. To check that CC is indeed a compact class, we take A 1 => A 2 =>···in CC, say
A.= hi:,1 (S.), wheres. c: A,. is compact. Without loss of generality we can assume l. wish to prove that
= n.
We (10)
To do this we will find a compact metric space Ace, a sequence of surjective continuous maps h., A"'-+ A. and a surjective map h"': K-+ Ace that make the diagram 00 :
K
~~
h.*A. +---A ~~ •. +---A"' hn,n'
hn',oo
commute for all n < n'. Once we have these data, we can write A.
= h_;;1h;;:~(S.),
so that
Since h"' is surjective, it remains to show that
But since h;;:!c,(Sn) is compact for every n, this amounts to showing that n' > n implies h;.~00 (Sn')
C
h;;:!,,(Sn).
This is immediate, since h;;1(h;.~.,,(Sn,)\h;;:!,,(Sn)) = h;,1(h;.~.,,(Sn;))\h;;;1(h;;:!c,(Sn)) = An,\An =
0
and h00 is surjective. This proves (10). We now construct A 00 , h00 and hn,ao· Let A 00 be the space of sequences 0: Z-+ Un An such that O(n) E An and hn,n•(O(n')) = 0(n) for all n' > n, and consider on it the topology induced by the product topology on On An. Define hn,ao by hn, 00 (0) = 0(n) and· h00 (x)(n)
= hn(x).
The reader can easily verify that these objects satisfy the desired properties. Next we prove (9). Take A E Ui>m £ii; by definition there is n > m such that A = h;; 1 (S), where Sis a Borel subset of An. Using Lusin's Theorem (0.1.3), we find a compact subset S0 of S such that µn(S\S0 ) < e. Then ~(h;; 1 (S)) - µn(h;; 1 (S0 ))
= µn(h;; 1 (S\So)) = µn(S\So) < e.
h;; 1 (S0 )e'C,
Since this completes the proof of(9). Thus we have found a T-invariant probability measure µ 00 over Ui>m£i1• By Theorem 0.1.5 µ 00 can be extended to a probability measure, also called µ 00 , over the er-algebra U l>m &191• But Ui>m &191 contains all the atoms of the partitions~' l > m, because they are the inverse images under h1 of the canonical partition of A 1• Thus µ 00 E.,/(r(K) andµ.= µ 00 oh•. To complete the proof (modulo Lemmas 8.2 and 8.3) just observe that hµJT) ~ hµm(alAm) = h,op(alAm) ~ h,op(T) - tJ,
and that fJ is arbitrary, showing (5).
D
Proof of Lemma 8.2. From exercise 1.11. 7 we know there exists a decreasing sequence (A.).;. 1 , where each An is topologically equivalent to a subshift of finite type and
n An=A.
n~l
By Parry's theorem (6.1), there exists µn e .,l(o-lA.(A.) such that hµ.(alA.)
= h,op(alAn)-
If n > 1 we can assume that µne.,l(,,1A,(A 1 ) by putting
µn(A)
= µn(A n An)
for every Borel subset A of A 1 • Since crJA 1 is expansive, we know from exercises 7.3 and 7.4 (c) that, for any sequence (v.).;. 1 with v. E .,l(,,1A(A 1 ) and lim•-+oo vn = v, we have h.(alA 1 ) ~ limsuph,.(crlAi), (11) n-+oo
and h,0 p(o-lA)
= lim h, p(o-lA.).
(12)
0
n ➔ +a:>
From (11) and (12) it follows that if (µ.)i,. 1 is a subsequence of(µ.) converging to µ we have hµ(ulAi);,, limsuphµ. (ulAi) = limsuphµ •.(o-lA.) j ➔ +~oo
=
j ➔ +oo
1
Jim hrop(ulA.)
j ➔ +co
J
= h, p(o-lA). 0
J
But, since the support ofµ. is contained in A •. , the support ofµ is contained in A, and thus , ,
□ Proof of Lemma 8.3. Take µE .Ar2 (K 2 ), and let E c C 0 (K 1 ) be the space of continuous functions : X-+ R:
f
x
,f>dµ
= lim
•-+oo
I
.1 • ,f>(x). # F1x(f ) xeFix(f•)
The following results, which we shall not prove, elaborate on the hypothesis that $f$ is topologically mixing, showing that it is not unnecessarily restrictive.

Theorem 9.2 (Bowen [BJ]). If $f\colon X \to X$ is a mixing hyperbolic homeomorphism, there exists a decomposition $X = X_1 \cup \cdots \cup X_l$ into disjoint invariant sets such that
$$f(X_i) = X_{i+1}, \qquad i = 1, \dots, l - 1,$$
$f(X_l) = X_1$ and $f^l|X_i$ is topologically mixing for all $1 \le i \le l$.
9. Hyperbolic Homeomorphisms
Theorem 9.3 (Smale [S10]). If $\Lambda$ is a hyperbolic set with local product structure of a diffeomorphism $f$ and the periodic points of $f|\Lambda$ are dense in $\Lambda$, there exists a decomposition $\Lambda = \Lambda_1 \cup \cdots \cup \Lambda_k$ into disjoint invariant compact sets such that $f|\Lambda_i$ is transitive for all $1 \le i \le k$.

Since a connected manifold cannot be decomposed into two or more disjoint non-empty compact sets, we have

Corollary 9.4. A transitive Anosov diffeomorphism of a connected manifold is topologically mixing.
The fundamental tool in the proof of Theorem 9.1 are Markov partitions, which we introduced in Section III.2 in a more specialized context. The definitions below are formally the same presented there. Let e0 > 0, {J > 0 be the constants given in the definition of a hyperbolic homeomorphism. Take O < ) :/:0- TakexeRB(l) such thatf-1 (x)eR9(0)• By the induction hypothesis, n:=of-"(R9(n))
is an s-subrectangle of Ro(l)• so for any ye ()!=of-"(R 0) we have
W.:(x) n
(.6
0
r•(Roc•>)) :::, W.:(x) n w,;:(y) =I- 0.
But Ro(O) n r 1 0:V-"(Ro(n)>):::, w.:(x) n w.:(y), so it is enough to prove that this latter set is non-empty. Since
0 =I- w.:(x) n
(.60 r·(Ro(n)>)
w.:(x) n Ro(l),
C
this reduces to showing that
1-10,v.:(x) n Ro(l))
C
Ro(O)•
which follows from r 1(w,;:(x) n Ro(l))
= r 1(W"(x, Ro11)) = W"(r 1(x), Ro10J
The proof that the sets in (2) are non-empty is analogous. We still have to show that (i) cannot contain more than one point. If x, y E n.r"(Ro(n)), then f"(x), f"(y) E Ro(n) for all n, which implies
0. Pick N such that diam( n F"(Ro(n)>),,;;;
B.
JnJ,S:N
Any ci: close enough to 0 satisfies ci:(n) rr(y)
nr"(R.(n))
=
= 0(n) for jnj ,;::; N. Thus n F"(Ra(n))
C
~O
n
=
r"(Ro(n)),
()
~O
proving that d(n(x), n(y)) ,,;;; t:. Property (a) follows from fn(0) = =
1(
0r·(Ro(n)>)
nr•(R,,o)
=
=
r:v-(RO(n))
=
Qr"(Ro(n+l))
rro-(0).
n
For (b ), we must show that for every x EA there exists 0 E B(A 91 ) such that (3)
for every N. Define 0(0) such that x E Ro(o)• By induction, assume we have already
selected 0(-m), ... , 0(0), ... , 0(m) satisfying (3) for N aB(J)BU+l)
= m, and
=1
(4)
for -m ,s;;;j ,s;;; m - 1. We have f(Recm>) n
(YA)
# 0,
for otherwise, since UiRi = X, we would have f(Re(m)) C uiaRi, which is impossible because then iJR 1 has non-empty interior, contradicting the assumption that the R; are proper. Let 0(m + 1) be such that f(Recm>) n Recm+1> # 0, and similarly for 0(-(m + 1)). Then 0 satisfies (3) for N = m + 1 and -(m + 1) :!i;_j ,s;;; m + 1, completing the induction. The proof of (c) depends on some preliminary lemmas.
Lemma 9.7. The function (N0 ) = N. We then say that N is modeled on N 0 • If x e N we define the tangent space T"N to N at x as t/>'(r 1 (x))YxN0 . It can be easily shown that this concept does not depend on the map ¢,. The intrinsic topology on N consists of the sets whose inverse image under t/> are open in N 0 • This concept, too, is independent of N 0 and¢,. We next define a measure on the Borel a-algebra of N (with respect to the intrinsic topology) as the image- in N of the measure on N 0 generated by an arbitrary volume form w. If we change N 0 , t/> and w, this measure can change, but stays in the same equivalence class. The expression "almost everywhere" in an injectively immersed submanifold refers to this class of measures. The theorem below deals with the stable and unstable sets of points of the Pesin region. The stable set of a point x e M is defined as W'(x) := {Ye MI lim d(f"(x),f"(y)) = n ➔ +co
o},
and the unstable set as W"(x) := {ye Ml lim d(f-"(x),r"(y)) = n-+oo
o}.
For A(f) as in the statement of Theorem 10.1 and xeA(f), we define E'(x) and E"(x) as
EB E (x), E"(x) = EB E (x). ll,(x)l>O E'(x)
=
1
ll,(x)l; b) x, ye.Er> implies w•(x)n W"(y) c: I,O
The details of the proof are left to the reader. Proof of Theorem 11.1. Here the really rekwant part is (c), so we start assuming (a) and (b) to prove (c). Parts (a) and (b), although very useful, are minor and will be left for the end. By Corollary 11.6.5, proving that A has total measure amounts to showing that µ(A) = 1 for all ergodic measures µ e .,1t1 (X). Take such a µ. A fiber bundle F is a family (Fx>xex of vector subspaces of Rk (called fibers of F) such that there exist measurable maps 17;: X--+ Rk, i = 1, ... , m, such that {17 1 (x), _.. , '7m(x)} is a basis for F" for µ-almost every x e X. An isomorphism is a family of linear isomorphisms T(x): F"--+ Ff such that if 17 1 , ••• , '1m are maps as specified above, the entries of the matrix of T(x) with respect to the bases {17 1 (x), ... ,17m(x)} and {17 1 (/(x)), ... , '7m(f(x))} are µ-integrable and bounded. A fiber subbundle G c F is called T-invariant if T(x)G" = Gf(x) for µ-almost every x. Given an isomorphism T of a fiber bundle F, we put
l 1 (T,x)
= limsup!logllT,,(x)II, n ➔ +co
n
where T,,(x) is defined as T(f"- 1 (x)) ... T(x) if n > 0, and T(f-n(x)) ... T(f- 1 (x)) if n < 0. It is easily seen that ). 1 (T,f(x)) = ). 1 (T, x); sinceµ is ergodic, ). 1 (T, x) has the same value ). 1 (T) at µ-almost all points x. We have the following two lemmas, whose proof is also postponed: Lemma 11.S (Invariant subbundle). Let F be a fiber bundle and T an isomorphism of F. For xeX, let Gx be given by Gx := {ueFxllimsup!logllT_.(x)ull:,;:; - l 1 (T)}. n ➔ +oo n
11. Proof of Oseledec's Theorem
Then the subspaces Gx, x e X, form a T-invariant subbundle such that
!
lim log II T,,(x)ull n-±«>n
= 2 1 (T)
(4)
for µ-almost every point x e X and every u e Gx. Lemma 11.6 (Complementary subbundle). Let F be a fiber bundle, Tan isomorphism of F and G a T-invariant subbundle. Let GJ. be the subbundle of F whose fibers G;are the orthogonal complements of Gx in Fx. Denote by nx: Fx-+ G;- the orthogonal projection, and define the isomorphism t of GJ. as f(x) := nx T(x)IG;. Then, if
A1(t>
+ A1(Y-1IG) < 0,
there exists a T-invariant subbundle of F such that HxEBGx
= Fx
(5)
at almost every xeX, and
(6) We now show how these lemmas imply that µ(A)= 1, and thus Theorem 11.l(c). We start with the following corollary: Corollary 11.7. If Fis a fiber bundle and T: F-+ Fis an isomorphism, then either Iim.-+oo (1/n)logll T,,(x)ull = l 1 (T) for all O =I- ueFx and µ-almost every x, or there exist T-invariant fiber bundles G and H such that
GxEBHx
= Fx
for µ-almost every x,
and
lim n-+±co
!n log II T,,(x)ull = l 1 (T)
(7)
for µ-almost every x and every u e Gx. Proof Let G be given by Lemma 11.5. If Gx = Fx for µ-almost every x, the first option in the statement holds and we are done. Thus, by the ergodicity off, we can assume that G, =f, Fx for µ-almost every x. Let GJ. and f: GJ. -+ GJ. be as in the statement of Lemma 11.6. We claim that 21 (f)
< l 1 (T).
Suppose 2 1 (f) ~ A1 (T). By 11.5, there exists a f-invariant sub bundle O c GJ. such that lim n-+±oo
!n log II f.(x)ull = l
1 (f)
for µ-almost every x and evecy ue (t. But, for ue G;-, nf"(x)
T,,(x)u
= 't,.(x)u.
Thus liminr!IogllT,,(x)ull ;;;i: lim !1ogi1't,.(x)ul1 n-+oo
n
n ➔ +oon
= l 1 (1').
(8)
By the definition of l 1 (T), limsup!logllT,,(x)II :s;;; l 1 (T,x). n-+oo n It follows that l 1 (T) ;;;i: l 1 (1'), and then l 1 (T)
(9)
= A. 1 (1'). Thus
lim !1ogl1T,,(x)l1,;;;; l 1 (f,x) n
(10)
n-+oo
for µ-almost every x and evecy u e G". From (8) and (10) we easily get lim inc! log II T,,(x)ull ;;;i: l 1 (T) n-++oo
n
for µ-almost every x and evecy u e G" EB G;- =: Fx. Fore > 0, put C,x. ( ) ·=
.f m 0,O Gx(o), it suffices to show that, for every 8 > 0, dim Gx(o) =I= 0 for µ-almost every x. Let Xm be the set of x EX such that there exists O =I= u e Fx satisfying (19)
for every O ~ n ~ m. We claim that infµ(Xm) > 0.
(20)
m
Assuming this we have, since X 1 ::, X 2
But if x
0
~
E
::, • • ·:
Xm for every m, there exist vectors um such that (19) holds with u
= um and
~
m. Taking u as the limit of some subsequence (um,), we clearly have O #- u e Gx(o), and u satisfies (19) for all n;;,,, 0. This shows that dim Gx(o) > 0 for XE n
nm;,,iXm. But Gf" 0 if XE Un;,,ol"(nm;,,1 Xm)- Since Un;,,ol"(nm;,,1 Xm) is invariant and has positive measure (as it contains nm;,,i X.,), it follows from the ergodicity ofµ that dimx(o) > 0 for µ-almost every x, as desired. We now tum our attention to (20). Let x be such that 1:x)x)
for all m
~
= µ(Xm)
1 (cf. page 98). We must show that
inf
'Xm(X)
> 0.
m~l
We shall use the following lemma, due to V. I. Pliss.

Lemma 11.8. Given $\lambda > 0$, $\varepsilon > 0$, $H > 0$, there exist $N_0 = N_0(\lambda,\varepsilon,H)$ and $\delta = \delta(\lambda,\varepsilon,H) > 0$ such that, if $a_1, \dots, a_N$ are real numbers, $N \ge N_0$, satisfying
$$\sum_{n=1}^{N} a_n \le N\lambda, \qquad |a_n| \le H \quad \text{for } n = 1, \dots, N,$$
there exist $l \ge \delta N$ and $1 \le n_1 \le \cdots \le n_l \le N$ such that
$$\sum_{i=n_j+1}^{n} a_i \le (n - n_j)(\lambda + \varepsilon)$$
for every $j = 1, \dots, l$ and $n_j < n \le N$.
Proof Put
n
s. :=Lb •. j=I Then
+ e)
=
Let n1
,;;; · · · ,;;;
277
n1 be the integers in [l, NJ which satisfy Sn1:;;. Sn
for all n:;;. nr The largest of them, n1, is obviously equal to N. It might happen that this is the only element in the set {n 1 , ••• , n1}. In order to avoid this, we require that
N >H+l+e ----,
(21)
8
which implies SN,;;; -Ne< -(H
+ l + e),;;; a 1 -
(l
+ e) =
b1 = S 1 ,
(22)
and this means that if Sm= max 1 ,;;;n,;;;NS., we have me{n 1 , ••• ,n1} and, by (22), < N. We thus have, forj > 1:
m
n
•
L a; = i=nJ+l L b; + (n i=n,+1 = (S. -
S.)
n1)(l
+ (n -
+ e)
n1)(l + e) ,;;; (n - n1)(l
+ e).
To estimate the value of l, we start by observing that, for 1 < j ,;;; l,
s.j+l :;;. s.;+l = s.J + bn;+l :;;. s.j -
(H
+ Ill + e),
so that
s.
1 :;;.
s., -
(j - l)(H
+ Ill + e).
In particular,
s., :;;. s., -
(l - l)(H +
Ill + e).
Sincen1 = N,
+ 1).1 + e) l)(H +Ill+ e) = -H -
-eN :;;. SN :;;. Sn, - (I - l)(H :;;. S1
-
(I -
(I- l)(H +Ill+ e).
Thus (l - l)(H
+ Ill + e) :;;.
-
H
+ eN,
and I l N :;;. N - (H
e
H
+ Ill + e)N + (H + Ill + e)"
We then put No= max(H
=
+ lell + e,
27),
e
2(H
+Ill+ e)
If N > N0 , (21) holds automatically, and, from (23),
(23)
l H e - ;;;, - - - - - - - + - - - N (H + 111 + e)N0 H + 111 + e ~
e
e
- - - - - - + - - - - = (). 2(H + 111 + e) H + 111 + e
□
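The selection of the integers $n_1 \le \cdots \le n_l$ in this proof is constructive, and can be sketched as follows. The code assumes the construction used above: $b_n := a_n - (\lambda + \varepsilon)$, $S_n := \sum_{j \le n} b_j$, and the $n_j$ are the integers in $[1, N]$ with $S_{n_j} \ge S_n$ for all $n \ge n_j$. The function name `pliss_times` is hypothetical, and the code does not compute the constants $N_0$ and $\delta$; it only returns the selected indices for a given finite sequence.

```python
def pliss_times(a, lam, eps):
    """Return the 1-based indices n_j with S_{n_j} >= S_n for all n >= n_j,
    where S_n is the n-th partial sum of b_i = a_i - (lam + eps)."""
    shift = lam + eps
    partial_sums = []
    total = 0.0
    for value in a:
        total += value - shift
        partial_sums.append(total)

    times = []
    running_max = float("-inf")
    # Scan from the right, keeping the maximum of S over later indices.
    for n in range(len(a), 0, -1):
        if partial_sums[n - 1] >= running_max:
            times.append(n)
            running_max = partial_sums[n - 1]
    times.reverse()
    return times

if __name__ == "__main__":
    # Illustrative data: a periodic integer sequence with mean 0 and |a_n| <= 2.
    a = [((7 * i) % 5) - 2 for i in range(50)]
    times = pliss_times(a, lam=0.0, eps=0.5)
    print(len(times), times)
```

For each returned index $n_j$ and each $n_j < n \le N$ one has $\sum_{i=n_j+1}^{n} a_i = S_n - S_{n_j} + (n - n_j)(\lambda+\varepsilon) \le (n - n_j)(\lambda + \varepsilon)$, which is the inequality in the statement of the lemma; the last index $N$ is always selected.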
Now take O # u E Fx and N1 > 0 such that logllT,.(x)ull;;, n(l 1 (T)- e/2)
(24)
for all n;;, N 1 . For N;;, N 1 , set
Then (24) implies N
I
ai,,; N(-1 1 (T)
+ e/2).
j=l
Taking H > 0 such that IIT- 1 (x)II,,; expH, IIT(x)II,,; expH for µ-almost every x, we get
la) ,,; H. Choose N0 := N0 ( - Ai (T) + e/2, e/2, H), o := o( -Ai (T) + e/2, e/2, H) as in Lemma 11.8. Then there exist 1 ,,; ni ,,; · · · ,,; n1 ,,; N, l ;;, ON, such that n
I
a; ,,; (n - ni)( -Ai (T) + e)
(25)
i=ni+l
for every j = 1, ... , I and n1 ,,; n ,,; N. This means that for ui
:= (TN-n;+l (x)u)/ll(TN-n;+i (x)u)II,
we have
II T_(N-•;>(f N-•;(x))u)I ,,; exp((n - n)(-li (T) + e)) for j = 1, ... , l and ni,,; n,,; N. Thus fN-•;(x)E Xm if N - ni ~ m. But N - ni;;, m holds at least for j = 1, ... , I - m. This shows that f N-•;(x) E Xm for the same values ofj, and
Taking the limit as N increases,
so infm -rxJx) > 0, as desired.
□
Proof of Lemma 11.6. Take measurable maps I'/;: X --> Rk, i = 1, ... , k, and (/ X --> R\j = 1, ... , I, such that,for µ-almost every x, {'li(x), ... , '1k(x)} and {( 1 (x), ... , ( 1(x)} are orthonormal bases for G;- and Gx, respectively. Let Ebe the space of maps which
associate to each x a linear map A(x): G;--. G,,, in such a way that the entries of the matrix of A(x)with respect to the matrices {'1 1 (x), ... , 'ft(x)} and {e 1 (x), ... , e,(x)} depend measurably on x. We define the operator Rk by the properties
= (Dxf)u, u E T,;M, L(x)u = AU, UE(TxM)J_,
L(x)u
where l > 0 is chosen so small that
This inequality guarantees that O =fa u e(TxM)J_ if and only if . 1 hm -logllL.(x)ull n-+-±oo
n
= l.
From this we conclude that x is a regular point off if and only if it is a regular point of L, and that the eigenspaces and Lyapunov exponents of Lat regular points are the same as those off, plus (TxMl and A, respectively. D
12. Proof of Ruelle's Inequality In this section we demonstrate part (a) of Theorem 10.2. Let f, Mandµ be as in the statement of 10.2. Assume Mis embedded in R 1, and let Ube an open neighborhood of M such that there exists a diffeomorphismf0 : U-> f(U) c U satisfyingf0 IM = f and (Dxf0 )(TxM)J_ = (1.r., are the atoms of the partition &'1 , translated by some vector y 0 e R1• Besides, ¢,;; 1 (&'.(x)) is contained
in Q 0 , so that v9 ,.(x).,;:; #{Pe[?'>ilg.(Qo)n(P Since g.(Q 0 )
+ Yo) -:f. 0}.
= (Dxg) + rx,n + q.,
Vg,.(x).,;:; #{Pe[?"1l(((D,,g) ~
+ q.)(Qo) +
Yo)nP cf. 0},
(1,n -
sup #{Pe[?"i!(((Dxg) + q.)(Q 0 ) + y)nP cf. 0}. y
Using the fact that (q.) converges to zero uniformly over bounded sets, we get vg(.»)
= limsupv9,.(x).,;:; sup #{Pe&'il((D,,g)Q 0 + ynP-:t- 0}.
D
y
n-++oo
In particular, putting C(g) :=
SUPxeM
ll(Dxg)II\ we have
vg(x)
~
C(g)
(3)
for almost every x EM. Lemma 12.2. For ge!5},
Proof Since the sequence H(f?l'. n MIV}'.!. 1 gi(&. n M)) is decreasing and converges to hµ(g IM,[?'>. n M) (Proposition 3.5), it is enough to prove that
H(&.nMlg([?".nM)).,;:;
L
logvg,ndµ.
But H([?".nMlg(&mnM))
=
I
µ(A)(-
A e.&>.nM
I PEtl•.nM
µ{Png(A))logµ(Png(A))) µ(g(A))
µ(g(A))
g(A)nP,t0
,s;
I
µ(A)log #{Pe[?"mlg(A)nP cf. 0}
Ae.&>.nM
:'S;
IM log
Vg,n
□
dµ.
Corollary 12.3. For ge!5},
Proof. Since V:'= 1 (f?l'm n M) is equal to the Borel er-algebra of M (µ-mod 0), we get hµ(gJM) = lim hµ(glM,[?'>.n M) ,s; limsuplogv9 ,.dµ. n-+oo
To show that
n-++oo
limsup
f
I logv ,.dµ,;;;; 9
n➔ +oo JM
limsuplogv11,.dµ,
M n➔ +oo
it is enough to find K > 0 such that v9,.(x),;;;; K for almost every xeM and n large enough. Let V c R1 be a compact neighborhood of M, and put K := SUPxev ll(Dxf)ll 1• It is easy to see that there exists n0 such that V9 ,.(x),;;;; K for n ~ n0 and almost every x e M. □ Lemma 12.4. Forge~,
hµ(glM),;;;;
f
M
(1imsup!logv9 n-+oo
n
.)dµ.
Proof. We have
so that
. f
h,,(glM),;;;; hmsup
1 -logv,-dµ.
Mn
n ➔ +oo
By (3),
l~logv9 .(x)l ,;;;;~logC(gf for almost always x EM. Then limsupJ !1ogv0 .dµ,;;;; n-++oo
Mn
f
M
= C(g)
limsup!logv9.dµ. n
n-+oo
□
Lemma 124 reduces the proof of Theorem 10.2 (a) to showing that, if x E Am(f), limsup!logv18 (x),;;;; n-++oo
n
I
l;(x)dimE;(x).
(4)
A.,(x)>O
This requires another lemma. A box with sides a 1 , written as
••• ,
t
s = {x + 1 t;u;IO,;;;; t1 ,;;;; 1, llu;II =
a1 is a set S which can be
a}
Let 0 such that
l
a;.
Proof. It is clear that ¢,(ai,a2 , ••• ,nai, ... ,a1)..;; n¢,(a 1 ,a2 , ••• ,a1)
for n ;;;, 1 an integer. This gives ¢,(a1, ... ' a,) ..;; ¢,([a1]
+ 1, ... ' [ai] + 1)..;; ¢,(1, ... ' 1) n([aj] + 1) j
The lemma follows by taking C := ¢,(1, ... , 1)21•
□
We return now to the proof of(4). Let {u 1 , ••• ,u1} be a basis ofR1 such that each ui belongs to some E 1(x) or to (T,,M).1. Let Q be a box with sides parallel to ui, ... , u1 and containing Q0 • By Lemma 12.1, %(x):,:;; sup #{Pe&'il((D_,fo")Q 0
+ y)nP # 0}
y
:,:;; sup #{Pe&'il((Dxfo")Q'
+ y)nP # 0}.
y
By Lemma 12.5, this latter expression is less than or equal to
TT { ll(Dxfo")u;!I I ll(Dxfo")u1II >
1},
so that
where the first sum is taken over all i such that log ll(Dxfo)"u;II > 0.
□
13. Proof of Pesin's Formula

In this section we prove Pesin's formula (Theorem 10.2 (b)). Our proof follows [M9]. Let $M$ be a closed manifold and $f\colon M \to M$ a diffeomorphism preserving a probability measure $\mu \in \mathcal{M}(M)$, absolutely continuous with respect to the Lebesgue measure. Ruelle's inequality gives
$$h_\mu(f) \le \int_M \chi\, d\mu,$$
so the point is proving that
$$h_\mu(f) \ge \int_M \chi\, d\mu.$$
We will first estimate a lower bound for $h_\mu(f)$ which holds in a more general context. Let $\mu$ be a probability measure on the Borel $\sigma$-algebra of $M$. If $\rho\colon M \to (0, 1)$ is a
measurable function, put s.(g,p,x) := {yld(gi(x),gi(y))
< p(gi(x)),O : U-+ (D0 F)E 2 be given by
cI>(u) := Tu
+ q(tf,(u), u).
For u, wEU we have llcI>(u) - cI>(w)II ;;?, II T(u - w)II - llq(l/f(u), u) - q(l/f(w), w)II ~ lllu - wll - or;o(lltf,(u) - tf,(w)II
;;;; (l - or;o(l
+ c)) llu -
wll.
+ llu -
wll)
If0 < o ~ iix- 1 (1 + c)-1, we conclude that ,Pis a homeomorphism onto ,P(U) and its inverse is Lipschitz with constant ~(A - ixo(l + cW 1• Let if,: (U)}.
But if, can be written as if, = if, o small enough this number is less than c. D We return to the proof of (4). For x a regular point off, we put E"(x) =
EB
Ej(x),
Aj(x) 0 and 0 < t,;;; 1 such that
$\|(D_xg^n) - (D_yg^n)\| \le C^n \|x - y\|^t$ (8)
for every $x, y \in M$. To prove this we take $0 < t \le 1$ and $C_0 > 0$ such that
$\|(D_xg) - (D_yg)\| \le C_0 \|x - y\|^t$ (9)
for every $x, y \in M$. (These constants exist since $f$, and hence also $g$, is Hölder $C^1$.) Let $A$ be a Lipschitz constant for $g$, and let $C > C_0$ be such that
$C \ge A + C_0 \left( \frac{A^{t+1}}{C} \right)^n$ (10)
for all $n \ge 0$. We prove (8) by induction; the case $n = 1$ is clear since $C \ge C_0$. Assume (8) holds for $1 \le n \le m$. Then, by the chain rule,
$\|(D_xg^{m+1}) - (D_yg^{m+1})\| \le \|(D_{g^m(x)}g) - (D_{g^m(y)}g)\|\,\|(D_xg^m)\| + \|(D_{g^m(y)}g)\|\,\|(D_xg^m) - (D_yg^m)\|.$ (11)
Now $A$ is a Lipschitz constant for $g$, so $\|(D_xg)\| \le A$ for all $x$, and (11) and (9) imply that
$\|(D_xg^{m+1}) - (D_yg^{m+1})\| \le C_0 \|g^m(x) - g^m(y)\|^t A^m + A C^m \|x - y\|^t \le (C_0 A^{m(t+1)} + A C^m)\,\|x - y\|^t.$
From (10) it follows that
$\|(D_xg^{m+1}) - (D_yg^{m+1})\| \le C^{m+1} \|x - y\|^t,$
which concludes the induction step. Now take $0 < \varepsilon < 1$ such that $(C\varepsilon^t)^n < b$ for every $n$. For $y \in B_{\varepsilon^m}(x)$, (8) implies that $\|(D_yg^m) - (D_xg^m)\| \le b$. All the assumptions of Lemma 13.4 are thus satisfied, and Lemma 13.5 is proved. □
Now assume we fixed the constant $c > 0$ so small that there exists $a > 0$ with the following property: for $x \in K$ and $y \in M$ less than $a$ apart, every subspace $E \subset T_yM$ which is an $(E^0(x), E^u(x))$-graph with dispersion $\le c$ satisfies
$\bigl|\, \log|\det (D_yg)|E| - \log|\det (D_xg)|E| \,\bigr| \le \varepsilon.$ (12)
For $x \in K$, set
$D_r(x) := \{\, x + y_1 + y_2 \mid y_1 \in E^0(x),\ y_2 \in E^u(x),\ \|y_1\| \le r,\ \|y_2\| \le r \,\},$
and let $k_2 > k_1 > 0$, $r_1 > 0$ be such that
$B_{k_1 r}(x) \subset D_r(x) \subset B_{k_2 r}(x)$ (13)
for every $x \in K$, $0 < r \le r_1$. Let $N: K \to \mathbb{Z}^+$ be the return function of $K$, that is, let $N(x)$ be the least integer such that $g^{N(x)}(x) \in K$. We recall that $N$ is integrable (exercise I.2.3). Extend $N$ to $M$ by putting $N(x) := 0$ for $x \notin K$. Finally, define $\rho: M \to (0, 1)$ by $\rho(x) := \min(a, (k_1/k_2)\,\varepsilon^{N(x)})$.
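Then $\log\rho$ is $\mu$-integrable because $N$ is. Since $\mu(K) \ge \mu(\Sigma) - \varepsilon$, Birkhoff's theorem (II.1.1) implies that $\mu(\{x \in \Sigma \mid \tau_{K^c}(x) \le \sqrt{\varepsilon}\}) \ge (1 - \sqrt{\varepsilon})\,\mu(\Sigma)$. There exists a compact set $K_1 \subset K$ and an integer $N_0 > 0$ such that
$\mu(K_1) \ge (1 - 2\sqrt{\varepsilon})\,\mu(\Sigma)$
and (14)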
for all $x \in K_1$ and $n \ge N_0$. Since $\varepsilon$ is arbitrarily small, the proof will be complete if we show that (4) holds for points in $K_1$. To estimate the Lebesgue measures $\lambda(S_n(g,\rho,x))$, observe that for some $B > 0$ we have
$\lambda(S_n(g,\rho,x)) = B \int_{E^0(x)} \lambda((y + E^u(x)) \cap S_n(g,\rho,x))\, d\lambda(y)$
for every $n \ge 0$, where $\lambda$ denotes Lebesgue measure on the spaces $y + E^u(x)$. This reduces the problem to showing that
$\limsup_{n\to\infty}\ \inf_{y \in E^0(x)} \frac{1}{n}\bigl(-\log\lambda((y + E^u(x)) \cap S_n(g,\rho,x))\bigr) \ge N(\chi(x) - \varepsilon_0).$ (15)
For $y \in E^0(x)$, let $A_n(y)$ be the set of points $w \in y + E^u(x)$ such that
$g^j(w) \in D_{\rho(g^j(x))/k_1}(g^j(x)).$
By the definition of the sets $D_r(y)$ and by the properties of the constants $k_1$ and $k_2$, this implies that $A_n(y) \supset (y + E^u(x)) \cap S_n(g,\rho,x)$. The proof of (15) is then reduced to showing that
$\limsup_{n\to\infty}\ \inf_{y \in E^0(x)} \frac{1}{n}\bigl(-\log\lambda(A_n(y))\bigr) \ge N(\chi(x) - \varepsilon_0).$ (16)
We claim that if gn(x)eK and An(Y) #- 0 then g"(An(y)) is an (E 0 (g"(x)),E"(gn(x)))graph with dispersion m be such that Am(Y) # 0, g"(x) e Kand g 1 ¢ K form vol(G) for all (E 0 (w), E"(w))-graphs with dispersion vol(g"(A.(y))
=
f
Jdet(D,g")I T,,An(y)I dl(z).
(18)
A.(y)
Settings.:= {O e > 0, and take µ e J{r(M) satisfying h,.(f) ;;?; h,0 p(f) -
B.
From the ergodicity of $\mu$, there exist $\lambda_1$, $\lambda_2$ such that the Lyapunov exponents of $f$ are $\lambda_1$ and $\lambda_2$ and the Lyapunov exponents of $f^{-1}$ are $-\lambda_1$ and $-\lambda_2$ at $\mu$-almost every point of $M$. From Ruelle's inequality, applied to $f$ and to $f^{-1}$,
$0 < h_{\mathrm{top}}(f) - \varepsilon \le h_\mu(f) \le \max\{0, \lambda_1\} + \max\{0, \lambda_2\},$
$0 < h_{\mathrm{top}}(f) - \varepsilon \le h_\mu(f) \le \max\{0, -\lambda_1\} + \max\{0, -\lambda_2\}.$
From the first inequality, at least one of $\lambda_1$ or $\lambda_2$ is strictly positive. From the second, at least one of them is strictly negative; thus they are both non-zero. This shows that $\mu$ is hyperbolic. Equality (1) follows since $\varepsilon$ is arbitrarily small. □
If $f: M \to M$ has a hyperbolic set $K \subset M$, it is clear that every $\mu \in \mathcal{M}_{f|K}(K)$ is ergodic and hyperbolic. The converse problem, i.e. finding hyperbolic sets starting from the existence of hyperbolic measures, is dealt with in the following theorem of Katok:
Theorem 15.2 (Katok [K3]). If $f$ is Hölder $C^1$ and $\mu \in \mathcal{H}(f)$ satisfies $h_\mu(f) > 0$, then for all $0 < \varepsilon < h_\mu(f)$ there exists a hyperbolic set $\Lambda$ topologically equivalent to a transitive subshift of finite type and such that $h_{\mathrm{top}}(f|\Lambda) \ge h_\mu(f) - \varepsilon$.
Corollary 15.3. If $\dim M = 2$ and $f$ is a Hölder $C^1$ diffeomorphism of $M$ with positive topological entropy, there exists for every $0 < \varepsilon < h_{\mathrm{top}}(f)$ a hyperbolic set $\Lambda$ topologically equivalent to a transitive subshift of finite type and satisfying $h_{\mathrm{top}}(f|\Lambda) \ge h_{\mathrm{top}}(f) - \varepsilon$.
This follows immediately from Lemma 15.1 and Theorem 15.2. Another interesting consequence of Katok's theorem is the following corollary, which guarantees the existence of hyperbolic sets from conditions on the action of the diffeomorphism on homology:
Corollary 15.4. If $\dim M = 2$ and $f$ is a Hölder $C^1$ diffeomorphism of $M$ whose induced action on homology $f_*: H_1(M^2, \mathbb{R}) \to H_1(M^2, \mathbb{R})$ is hyperbolic (i.e. has eigenvalues with absolute value different from 1), then there exists for every $0 < \varepsilon < \log|\lambda|$ a hyperbolic set $\Lambda$ topologically equivalent to a transitive subshift of finite type and such that
$h_{\mathrm{top}}(f|\Lambda) \ge \log|\lambda| - \varepsilon.$
This corollary follows from Katok's theorem and the following theorem of Manning, whose importance goes beyond this application:
Theorem 15.5 (Manning [M4]). Let $T$ be a continuous map of a closed manifold $M$, and $T_*: H_1(M, \mathbb{R}) \to H_1(M, \mathbb{R})$ the induced map on homology. Then $h_{\mathrm{top}}(T) \ge \log\lambda$, where $\lambda$ is the spectral radius of $T_*$. □
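To make the bound in Manning's theorem concrete, here is a small sketch in Python for a case chosen by us, not taken from the text. On the torus $T^2 = \mathbb{R}^2/\mathbb{Z}^2$ we have $H_1(T^2, \mathbb{R}) \cong \mathbb{R}^2$, and a map induced by an integer $2 \times 2$ matrix acts on $H_1$ by that same matrix. The lower bound $\log\lambda$ can then be read off from the eigenvalues. The function names and the sample matrix are hypothetical.

```python
import cmath
import math

def spectral_radius_2x2(matrix):
    """Largest absolute value of an eigenvalue of a 2x2 matrix (pair of rows)."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    # Roots of the characteristic polynomial z^2 - trace*z + det.
    disc = cmath.sqrt(trace * trace - 4 * det)
    return max(abs((trace + disc) / 2), abs((trace - disc) / 2))

def manning_lower_bound(matrix):
    """log of the spectral radius of the action on H_1.

    Assumes the spectral radius is positive (true for any invertible matrix).
    """
    return math.log(spectral_radius_2x2(matrix))

# Hypothetical example: the matrix with rows (1, 1) and (1, 0).
bound = manning_lower_bound(((1, 1), (1, 0)))
print(bound)
```

For a rotation by a quarter turn, with rows $(0, -1)$ and $(1, 0)$, the eigenvalues have absolute value 1 and the bound gives no information, which is consistent with the hyperbolicity hypothesis in Corollary 15.4.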
Obtaining lower bounds for the topological entropy from conditions on the action on homology is a cogent problem. We recommend reading, however superficially, [P6], [M4], [B7] and [S4] (in this order).
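16. The Brin-Katok Local Entropy Formula
Let $K$ be a compact metric space, $f: K \to K$ a continuous map and $\mu \in \mathcal{M}(f)$. Put
$B_\varepsilon(f,n,x) := \{\, y \mid d(f^j(x), f^j(y)) \le \varepsilon,\ 0 \le j \le n \,\},$
$h_\varepsilon^+(f,x) = \limsup_{n\to\infty}\bigl(-\tfrac{1}{n}\log\mu(B_\varepsilon(f,n,x))\bigr),$
$h_\varepsilon^-(f,x) = \liminf_{n\to\infty}\bigl(-\tfrac{1}{n}\log\mu(B_\varepsilon(f,n,x))\bigr).$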
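The sets $B_\varepsilon(f,n,x)$ can be computed by hand in simple cases. As a sketch under assumptions of our own (none of it is from the text), take the doubling map $x \mapsto 2x \bmod 1$ on the circle with Lebesgue measure. For $\varepsilon < 1/4$ the set $B_\varepsilon(f,n,x)$ is the arc of radius $\varepsilon/2^n$ around $x$, so its measure is $2\varepsilon/2^n$. The Python code below checks this on a finite grid with exact rational arithmetic; the grid only approximates Lebesgue measure, and all function names are hypothetical.

```python
from fractions import Fraction
import math

def circle_dist(a, b):
    """Distance between two points of the circle R/Z."""
    d = abs(a - b) % 1
    return min(d, 1 - d)

def doubling(x):
    """The doubling map x -> 2x mod 1."""
    return (2 * x) % 1

def in_bowen_ball(x, y, eps, n):
    """True if d(f^j(x), f^j(y)) <= eps for all 0 <= j <= n."""
    for _ in range(n + 1):
        if circle_dist(x, y) > eps:
            return False
        x, y = doubling(x), doubling(y)
    return True

def bowen_ball_measure(x, eps, n, grid_bits=12):
    """Fraction of grid points k / 2**grid_bits lying in B_eps(f, n, x)."""
    size = 2 ** grid_bits
    hits = sum(1 for k in range(size)
               if in_bowen_ball(x, Fraction(k, size), eps, n))
    return Fraction(hits, size)

def local_entropy_estimate(x, eps, n):
    """The quantity -(1/n) log mu(B_eps(f, n, x)) on the grid."""
    return -math.log(bowen_ball_measure(x, eps, n)) / n

x0, eps0 = Fraction(1, 3), Fraction(1, 8)
print(bowen_ball_measure(x0, eps0, 4), bowen_ball_measure(x0, eps0, 6))
print(local_entropy_estimate(x0, eps0, 6))
```

Since the measure is halved at each step, the estimates tend to $\log 2$ as $n$ grows, although slowly, because of the fixed term $-\log(2\varepsilon)/n$.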
Clearly, 0
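For each $\varepsilon > 0$, take a partition $\mathcal{P}^{(\varepsilon)}$ of $\mathbb{R}^m$ whose atoms are cubes $(n_1\varepsilon, (n_1 + 1)\varepsilon) \times \cdots \times (n_m\varepsilon, (n_m + 1)\varepsilon)$, $n_i \in \mathbb{Z}$. Without loss of generality, we can assume that the measure of the boundaries of these cubes is zero. Set
$\mathcal{P}_n^{(\varepsilon)} := \bigvee_{j=0}^{n} f^{-j}(\mathcal{P}^{(\varepsilon)}),$
and, as usual, let $\mathcal{P}_n^{(\varepsilon)}(x)$ be the atom of $\mathcal{P}_n^{(\varepsilon)}$ containing the point $x$. Define
$\tilde h^{(\varepsilon)}(x) := \lim_{n\to+\infty}\bigl(-\tfrac{1}{n}\log\mu(\mathcal{P}_n^{(\varepsilon)}(x))\bigr).$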
Since clearly ;ta> ;:,,
dµ = h,,(f, f?J>(•>).
(1)
;(x). e➔ O
From (1) and the monotonicity of ¢>t•> on ll we get
$\int_K \tilde h_\mu(f,x)\, d\mu(x) = h_\mu(f).$
The next lemma is the key step in the proof:
Lemma 16.3. If
$c_1 \le \tilde h_\mu(f,x) \le c_2$
for almost every $x \in K$, then
$c_1 \le h^-(f,x) \le h^+(f,x) \le c_2$
almost everywhere.
Let us see how Lemma 16.3 implies the theorem. Observe that
$\tilde h_\mu(f, f(x)) = \tilde h_\mu(f,x)$ (2)
ahnost everywhere. Thus, for all s0 > 0 and i e z+, the sets K; := {xeKlillo..; O}, and for i e S define a probability measure µ1 by the formula µiCA) := µ(An K 1)/µ(K 1).
Clearly
$\tilde h_{\mu_i}(f,x) = \tilde h_\mu(f,x)$
for $\mu$-almost every $x \in K_i$, and then also for $\mu_i$-almost every $x \in K_i$. But saying $\mu_i$-almost every $x \in K_i$ is the same as saying $\mu_i$-almost every $x \in K$, since $\mu_i(K_i^c) = 0$. Thus
$i\varepsilon_0 \le \tilde h_{\mu_i}(f,x) = \tilde h_\mu(f,x) \le (i + 1)\varepsilon_0$ (3)
for $\mu_i$-almost every $x$. Applying Lemma 16.3 to $f$ and $\mu_i$ it follows that
$i\varepsilon_0 \le h^-(f,x) \le h^+(f,x) \le (i + 1)\varepsilon_0$ (4)
for µ 1-almost every x. Further, (3) and (4) imply that lh-(f,x)- (x) = lim - !1ogµ(&J•>(x)) n ➔ +oo n
$\ge \limsup_{n\to\infty}\bigl(-\tfrac{1}{n}\log\mu(B_\varepsilon(f,n,x))\bigr) = h_\varepsilon^+(f,x).$
Taking the limit when $\varepsilon$ approaches zero, $h^+(f,x) \le \tilde h_\mu(f,x) \le c_2$ for almost every $x \in K$.
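To prove the other inequality in Lemma 16.3, we introduce the following definition: we say that two atoms $A$ and $B$ of a partition are related, and write $A \sim B$, if
$\overline{\mathcal{P}^{(\varepsilon)}(f^j(x))} \cap \overline{\mathcal{P}^{(\varepsilon)}(f^j(y))} \ne \emptyset$
for all $0 \le j \le n$, $x \in A$, $y \in B$. In this case it is easy to check that
$B_\varepsilon(f,n,x) \subset \bigcup \{\, P \in \mathcal{P}_n^{(\varepsilon)} \mid P \sim \mathcal{P}_n^{(\varepsilon)}(x) \,\}.$ (6)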
Now take a positive integer p such that for any ll > 0 every atom of f1J intersects at most p atoms of~•>. This number depends only on the dimension of Rm and is independent of ll. Take any b > 0 and define the set of "good" atoms of fl'~•> as G(n, e) := { A e &>~•>jµ(A)
~
e(x) = ~,,(f,x);,, c 1 ,-o implies that the set {xl~t(x);,, c 1
-
b/2},
which is contained in $G(\varepsilon)$, has measure arbitrarily near one if $\varepsilon$ is small enough. Now set
$c := b + c_2 - c_1 + 2\log p,$
and define the set of "fat" atoms of~•> as F(n,e) := {Ae&>!'>lµ(A);,, eJ•> related to a given one is bounded by pn. Hence µ(B) ,,;;