English Pages 345 [348] Year 2014
Luger Modern X-Ray Analysis on Single Crystals
Peter Luger
Modern X-Ray Analysis on Single Crystals
A Practical Guide
2nd fully revised and extended edition
Author Prof. Dr. Peter Luger Freie Universität Berlin Institut für Chemie und Biochemie, Anorganische Chemie Fabeckstr. 36a 14195 Berlin [email protected]
ISBN 978-3-11-030823-5
e-ISBN 978-3-11-030828-0
Set-ISBN 978-3-11-030829-7

Library of Congress Cataloging-in-Publication data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.

© 2014 Walter de Gruyter GmbH, Berlin/Boston
Typeset: LVD GmbH Berlin
Print and Binding: Hubert & Co. GmbH & Co. KG, Göttingen
Cover: Corannulene, C20H10, deformation density, reproduced by courtesy of the Verlag der Zeitschr. für Naturforsch., see Grabowsky et al., Z. Naturforsch. 65b 452 (2010).
∞ Printed on acid-free paper
Printed in Germany
www.degruyter.com
Preface to the 1st edition Seventeen years after the discovery of X-rays by Wilhelm Conrad Röntgen in 1895, von Laue and his collaborators Friedrich and Knipping found that this novel type of radiation showed the property of diffraction when passed through a crystal lattice. In their classic experiment of 1912 they proved that X-rays, like all other electromagnetic waves, interact with the electron sheath when exposed to a sample of matter, thus causing a diffraction process. This experiment can be regarded as the foundation of X-ray analysis, a new method for determining the structure of solid matter. However, in the first half-century after its invention this method could seldom be applied, and then only under restrictive conditions. The execution of one structure determination frequently took several months, in some cases even a few years. Moreover, the results were of limited accuracy and not always unambiguous. This situation has changed totally in the last few years. In a dynamic development modern X-ray analysis has become an instrument of structure determination which yields the most detailed, safe and precise information on molecular and crystal geometry available. Two major reasons can be given for this decisive progress. First, it is a consequence of the development of the so-called “Direct Methods” of phase determination in the last 20 years. From this it is now possible to work on a large number of compounds, especially of organic chemistry, which could never have been treated before. Second, the possibility of executing the extensive numerical calculations with the help of ever faster and larger computers has reduced the time needed for a structure determination or has even made its execution possible in the case of larger structures. Today it can be stated that an X-ray analysis can be performed on any crystalline compound within a reasonable amount of time if its molecular weight is not too large. 
Being now comparable in speed and expense to other methods and superior in results, the application of X-ray analysis is increasing not only in all chemical laboratories but also in biological and biochemical as well as in physical research projects; the number of scientists using this method is becoming larger from year to year. Today, by means of highly sophisticated computer programs controlling fairly automatically the measuring and structure determination process, an X-ray analysis can be processed with little effort on the part of the user. In general, extensive previous knowledge of theoretical crystallography is unnecessary; instead, much practical experience is more helpful for the experimenter to continue his investigation. However, in spite of automation several sources of error remain for the user, each capable of preventing a successful solution to a structural problem. It seemed therefore appropriate to have a guide for practical work in X-ray analysis directed at those who are not highly experienced in crystallography but who need structure determination as a method for solving some of their problems. In this book the fundamentals of crystallography are presented together with those topics
that are helpful for the execution of a structure analysis. The contents were selected with respect to practical applicability; most questions arising in the course of practical work are treated. This book is addressed to graduate students intending to use this method in any part of an examination as well as to scientists in any research or industrial laboratory, hence to all people concerned with a structural problem which might be solved by the method of single crystal analysis. In the first part mainly theoretical aspects are presented. Note that no effort has been made to derive all results of diffraction theory. This is not the aim of this book since we are more interested in practical problems. In the second and subsequent parts we describe the process of an X-ray structure determination in all details, starting with the diffraction experiments, then dealing with the phase determination, the refinement and finally with the representation and documentation of results. The presentation of three structures as examples supports the orientation of this book toward practical work. I have tried to give as modern a formulation as possible of the mathematical aspects that figure so largely in X-ray analysis because modern mathematical language seems to be the most appropriate for a clear understanding, despite its somewhat abstract nature. It is the aim of this book to serve as a guide and to enable the reader to solve his structural problems almost without further preparation. It is desired that this book will be a contribution to the further dissemination of X-ray analysis as a modern method of structure determination to an ever increasing number of scientists.
Preface to the 2nd edition When I wrote the 1st edition of this book at the end of the 1970s I believed that some kind of a revolution had taken place in X-ray crystallography because the two major obstacles, the solution of the phase problem and the huge amount of numerical calculations, seemed to have satisfactorily been overcome. I was completely wrong. The revolution noticed at that time continued over the following 30 years and an X-ray analysis today is by no means comparable to the one I described around 1980. The experimental developments (brilliant X-ray sources, area detection, non-ambient temperatures and pressures) and the breathtaking development in computer hardware and software led to an “explosion” in X-ray analysis applications. It was therefore necessary to enter all the modern developments into a new version of this book, maintaining basic crystallography which has, of course, been left unchanged. In contrast to the 1st edition, the first chapter on mathematics has now been moved to an appendix to allow the reader to directly start with the fundamentals of crystallography. The part on crystallographic film methods has been shortened because of their minor current relevance. The parts on experimental developments concerning X-ray sources and area detection diffractometers have been given more space or are completely new. A further completely new chapter is on structures determined at nonroutine conditions. Of the three test structures of the 1st edition, two were kept but supplemented by two further examples of medium sized structures with much more than 50 atoms, which would have been a serious problem at the time of the 1st edition. Like the 1st edition this edition should be a guide for practical work in X-ray analysis directed at those who are not highly experienced in crystallography. 
Although today the execution of an X-ray analysis is highly automated with respect to the experimental and the computational procedures, this book should help to teach the user what is behind this method to know what he/she is doing when carrying out an X-ray analysis. In 2012 the crystallographic community celebrated 100 years of X-ray diffraction because it was in 1912 when Max von Laue together with Walter Friedrich and Paul Knipping performed the first X-ray experiment on a crystal. 2014 will be the year of crystallography. Our field can review a very exciting and successful century and I am very optimistic that it will even further develop. May this new edition contribute to a wide distribution of crystallographic knowledge.
Acknowledgment

For the 1st edition, I gratefully received support from Professor George A. Jeffrey, University of Pittsburgh, USA, for a scientific and linguistic revision of my manuscript. Jeff is no longer with us. He passed away in 2000. I am deeply indebted to him for his advice, not only for this book but also in the course of several fruitful research cooperations and discussions we had over the years. For the preparation of this manuscript for the 2nd edition, several colleagues and coworkers of my group helped me with various contributions. My special thanks go to Manuela Weber, who worked with me for decades in research and teaching activities. She made extremely valuable contributions to the preparation of this edition. I also owe many thanks to the publisher De Gruyter, Berlin, for the cooperation and for much helpful advice in preparing and publishing this book. Last but not least I am very thankful to my wife Rut and to my daughter Katrin for their patience with me when I was physically with them but mentally absent while thinking about crystallographic problems.

Berlin, March 2014
Peter Luger
Contents

1 Introduction
1.1 Historical remarks
1.2 The crystal lattice: Basic definitions
1.2.1 Periodicity, lattice constants
1.2.2 Lattice points, lattice planes, reciprocal lattice
1.3 Sample structures

2 Fundamental results of diffraction theory, X-radiation
2.1 Electron density and related functions
2.2 Diffraction conditions for single crystals
2.3 X-rays
2.3.1 Generation of X-rays
2.3.2 Absorption
2.3.3 Filters, monochromators
2.3.4 X-ray tubes
2.3.5 Synchrotron radiation

3 Preliminary experiments
3.1 Film methods
3.1.1 The rotation method
3.1.2 Zero level Weissenberg method
3.1.3 Upper level Weissenberg – normal beam and equi-inclination method
3.1.4 Precession technique
3.2 Practicing film techniques
3.2.1 Choice of experimental conditions
3.2.2 Rotation and Weissenberg photographs of KAMTRA and SUCROS

4 Crystal symmetry
4.1 Symmetry operations in a crystal lattice
4.1.1 Introduction
4.1.2 Basic symmetry operations
4.1.3 Crystal classes and related coordinate systems
4.1.4 Translational symmetry, lattice types and space groups
4.2 Crystal symmetry and related intensity symmetry
4.2.1 Representation of ρ and F as Fourier series
4.2.2 Thermal motion, displacement parameters
4.2.3 Intensity symmetry, asymmetric unit
4.2.4 Systematic extinctions
4.2.5 Quasicrystals
4.3 Space group determination
4.3.1 General considerations, practical aspects
4.3.2 Space groups of KAMTRA and SUCROS from film exposures

5 Diffractometer measurements
5.1 Point detector data collection on a four-circle diffractometer
5.1.1 Eulerian cradle geometry
5.1.2 Choice of experimental conditions
5.1.3 Precise determination of lattice constants
5.1.4 Intensity measurements
5.2 Area detector diffractometers
5.2.1 Imaging plates
5.2.2 CCD diffractometers
5.2.3 Data processing for area detector measurements
5.2.4 Data collections for B12 and C60F18

6 Computer programs

7 Solution of the phase problem
7.1 Data reduction
7.2 Fourier methods
7.2.1 Interpretation of the Patterson function
7.2.2 Heavy atom methods, principle of difference electron density
7.2.3 Harker sections, applications to KAMTRA
7.2.4 Numerical calculation of Fourier syntheses
7.3 Direct methods
7.3.1 Normalization
7.3.2 Fundamental formulae
7.3.3 Origin definition, choice of starting set
7.3.4 Application of direct methods, the examples of SUCROS and C60F18
7.4 Phase determination for macromolecules

8 Refinements
8.1 Theoretical aspects
8.1.1 Model versus experiment, R-value
8.1.2 Theory of least-squares refinement
8.2 Practicing least-squares methods
8.2.1 Aspects of numerical calculations
8.2.2 Execution of a complete refinement process
8.2.3 Corrections to be applied during refinement
8.3 Analysis and representation of results
8.3.1 Geometrical data
8.3.2 Graphical representations
8.3.3 Archiving data, crystallographic information file (CIF)
8.4 Applications to the test structures
8.4.1 KAMTRA and SUCROS: Completion and refinement
8.4.2 Refinement of C60F18
8.4.3 Structure solution and refinement of B12

9 Structure analysis at non-routine conditions
9.1 Low temperature X-ray diffraction experiments
9.2 Neutron diffraction
9.3 Structure analysis at high pressures
9.4 Electron density
9.4.1 Multipole model, experimental conditions
9.4.2 Topological analysis using QTAIM
9.4.3 Examples in the Life Sciences

10 Concluding remarks and outlook

Appendix: Mathematics
A.1 Matrices, vectors
A.1.1 Introduction
A.1.2 Matrices, determinants, linear equations
A.1.3 Vector algebra
A.1.4 Linear independency, bases, reciprocal bases
A.1.5 Basis transformations
A.1.6 Lines and planes
A.2 Fourier transforms and convolution operations
A.3 Counter statistics, mean value, standard deviation

Bibliography
Index
1 Introduction

1.1 Historical remarks

Crystals can easily be distinguished from all other types of matter by optical inspection. They are generally transparent with sharp edges and plane faces, properties which make them unique in nature. For these samples the Greeks introduced the name κρύσταλλος (krystallos = ice) because they were comparable in shape with ice crystals. Over the centuries crystals were the subject of detailed examination in mineralogy or admired as precious stones. The microscopic nature of the regularities which could be observed macroscopically was originally not known. It was a milestone in crystallography when in 1912 Max von Laue together with Friedrich and Knipping performed the first X-ray diffraction experiment on a crystal [1]. This experiment not only confirmed the character of X-rays as electromagnetic waves, but also proved the periodic arrangement of matter in a crystal. This opened up a rapidly developing field of research in crystallography, X-ray structure analysis. In the first months this field saw an exciting development. Still in 1912 X-ray diffraction was described theoretically by von Laue himself, Ewald [2] and their English colleagues W.H. Bragg and W.L. Bragg (father and son). The famous “Bragg equation” was reported at a conference in November 1912 and already in 1913 the first X-ray crystal structure, namely that of sodium chloride, was published by the Braggs [3]. Then, however, for about 40–50 years X-ray structure analysis was confronted with severe problems. On the one hand there was the so-called “phase problem” (see Chapters 2 and 7), which could be solved only in exceptional cases at that time; on the other hand there was the need to carry out huge numerical calculations, especially for larger structures, which was impossible at a time when computers were practically unavailable. The decisive breakthrough came in the 1960s, based on two developments which happened almost at the same time.
The so-called “Direct Methods” for the solution of the crystallographic phase problem became available through sophisticated and easy-to-use computer programs, and the introduction of ever faster and more powerful computers allowed almost unlimited numerical calculations. Figure 1.1 shows the rapid development of crystal structure research in the last 50 years, illustrated by the exponentially increasing number of entries into the two most important databases of worldwide published crystal structures. While it took 50 years for the first 1000 structures, we now see more than 40 000 structures per year. In 2009 the International Union of Crystallography (IUCr) announced the 500 000th entry into the Cambridge Structural Data Base (CSD) [7]. It follows that nowadays crystal structure analysis is a reliable and accurate method, and one of the fastest, for determining three-dimensional atomic structures in the solid state. Applications take place in various disciplines like chemistry/biochemistry, biology, pharmaceutical research, solid state physics, material sciences etc.

Figure 1.1: Entries of published crystal structures of organic and organometallic compounds into the Cambridge Structural Data Base (CSD) [4] in five year intervals. Insert: Corresponding entries of macromolecular structures into the Protein Data Bank (PDB) [5]. For inorganic structures, a third international database, the Inorganic Crystal Structure Data Base, exists [6].
1.2 The crystal lattice: Basic definitions

1.2.1 Periodicity, lattice constants

A single crystal is a sample of material where the distribution of matter $q(\mathbf{r})$ (the nature of $q(\mathbf{r})$ will be specified later in detail, see Section 2.1) is periodic in three dimensions; that is (Figure 1.2), there are three non-coplanar vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ with

\[ q(\mathbf{r}) = q(\mathbf{r} + m\mathbf{a}) = q(\mathbf{r} + n\mathbf{b}) = q(\mathbf{r} + p\mathbf{c}) = \dots = q(\mathbf{r} + m\mathbf{a} + n\mathbf{b} + p\mathbf{c}), \qquad m, n, p \ \text{integers}. \tag{1.1} \]

The volume element defined by $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ is called the unit cell. It represents the nonperiodic unit. The whole crystal lattice is obtained by periodic sequences of unit cells in all three dimensions. Generally the unit cell is the smallest nonperiodic volume element; however, for certain reasons, exceptions exist. The unit cell volume $V$ is given by the scalar triple product

\[ V = (\mathbf{a}\,\mathbf{b}\,\mathbf{c}). \tag{1.2} \]

Figure 1.2: Model of a single crystal.

The vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, forming a basis of direct space in a mathematical sense, are called unit cell vectors. Their magnitudes together with the angles between them are called cell constants or lattice constants, being hence the following six quantities

\[ a = |\mathbf{a}|, \quad b = |\mathbf{b}|, \quad c = |\mathbf{c}|, \quad \alpha = \angle(\mathbf{b}, \mathbf{c}), \quad \beta = \angle(\mathbf{c}, \mathbf{a}), \quad \gamma = \angle(\mathbf{a}, \mathbf{b}). \tag{1.3} \]

While the angles are given in degrees, it is common practice (and convenient) to use the Å-unit for the lengths, 1 Å = $10^{-10}$ m, because covalent atomic distances are in the Å-range (the C–H bond distance is close to 1 Å, the C–C single bond length is about 1.5 Å). To correspond to the SI-unit system, also pm ($10^{-12}$ m) or nm ($10^{-9}$ m) might be used, that is 1 Å = 100 pm = 0.1 nm. In terms of the unit cell vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ [according to (1.3), also called lattice constant vectors] every point P in space represented by the vector $\mathbf{r}$, e.g. the position of an atom in the crystal, can uniquely be written as

\[ \mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}. \tag{1.4} \]
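Equations (1.2) to (1.4) can be tried out numerically. The following is a minimal Python sketch, not part of the original text: the function names, the axis convention (a along x, b in the xy-plane) and the monoclinic sample cell are assumptions chosen only for illustration.

```python
import math

def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian components of a, b, c from the six lattice constants (1.3).

    Lengths in Å, angles in degrees. Assumed convention (not from the
    text): a along x, b in the xy-plane.
    """
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cx = c * cb
    cy = c * (ca - cb * cg) / sg
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return (a, 0.0, 0.0), (b * cg, b * sg, 0.0), (cx, cy, cz)

def cell_volume(va, vb, vc):
    """Unit cell volume V = (abc) = a . (b x c), equation (1.2)."""
    return dot(va, cross(vb, vc))

def frac_to_cart(xyz, va, vb, vc):
    """Position r = x a + y b + z c, equation (1.4)."""
    x, y, z = xyz
    return tuple(x * va[i] + y * vb[i] + z * vc[i] for i in range(3))

# Hypothetical monoclinic cell; the numbers are illustrative only.
va, vb, vc = cell_vectors(7.0, 8.0, 9.0, 90.0, 105.0, 90.0)
volume = cell_volume(va, vb, vc)
centre = frac_to_cart((0.5, 0.5, 0.5), va, vb, vc)
print(f"V = {volume:.2f} Å^3, cell centre at {centre}")
```

For a monoclinic cell the triple product reduces to a·b·c·sin β, which gives a quick plausibility check of the computed volume.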
The quantities x, y, z, having no physical dimensions, are said to be the fractional coordinates of the point P. If P is situated inside a chosen unit cell, its fractional coordinates have numerical values between 0 and 1. Macroscopically, some symmetry can frequently be recognized in the crystal habit. This symmetry, which originates from symmetry in the unit cell and hence is already present microscopically, is an important issue in crystal structure determination and will be elaborated in Chapter 4. For the moment let us consider some aspects of symmetry in Figure 1.3. Except for pure translation, the crystal shown on the left has no further symmetry; the unit cell shows just one structural motif. In the crystal on the right the motif reappears a second time, related to the first one by mirror symmetry. It follows that the periodicity vector in b-direction has to be extended to the end of the second row of motifs. Only the third row is a pure translation of the first one. The unit cell contains the motif twice. Moreover the symmetry causes the lattice constant vectors a and b to be perpendicular to each other, b ⊥ a. If the crystal structure has to be determined, it is sufficient also for the crystal on the right to know one motif, if in addition the symmetry is given. In this context, the symmetry independent part of the unit cell is called the asymmetric unit. In Figure 1.3 the asymmetric unit is the entire cell on the left and half the cell on the right. We note that the choice of the lattice constants is not necessarily unique, because there is more than one way to describe the periodicity of a crystal. Several conventions have been introduced to agree on a preferable choice of lattice constants; they will be considered later (Chapter 4). For example, in the crystal shown on the left of Figure 1.3 the unit cell vector b could be replaced by the diagonal b′ = a + b, which represents the periodicity equally well.
Figure 1.3: Influence of symmetry on the unit cell size and the choice of unit cell vectors, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
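The remark that b may be replaced by b′ = a + b can be checked numerically: the cell volume is unchanged, and a given point keeps its position if its fractional coordinates are transformed accordingly. The sketch below is an illustration only; the cell vectors and the sample point are hypothetical values, not taken from Figure 1.3.

```python
def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def triple(u, v, w):
    """Scalar triple product (uvw) = u . (v x w)."""
    return dot(u, cross(v, w))

def position(xyz, u, v, w):
    """Cartesian position for fractional coordinates xyz in the basis u, v, w."""
    x, y, z = xyz
    return tuple(x * u[i] + y * v[i] + z * w[i] for i in range(3))

# Hypothetical oblique cell in Cartesian components (Å).
a = (5.0, 0.0, 0.0)
b = (1.0, 6.0, 0.0)
c = (0.0, 0.0, 7.0)
b_alt = tuple(p + q for p, q in zip(a, b))  # b' = a + b

v_old = triple(a, b, c)
v_new = triple(a, b_alt, c)

# r = x a + y b + z c = (x - y) a + y b' + z c
x, y, z = 0.3, 0.2, 0.5
r_old = position((x, y, z), a, b, c)
r_new = position((x - y, y, z), a, b_alt, c)
print(v_old, v_new, r_old, r_new)
```

Both cells describe the same lattice; only the labelling of the points by fractional coordinates differs.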
Considering periodicity and symmetry as key properties of the crystal we can already express in brief the task to be carried out during a crystal structure determination, which consists of three parts:
(1) Determine the periodicity, hence the lattice constants,
(2) Determine the symmetry, later called the space group,
(3) Determine the structure of the asymmetric unit, hence the fractional coordinates of all atoms in the asymmetric unit.
1.2.2 Lattice points, lattice planes, reciprocal lattice

To allow an easy understanding of the major results of diffraction theory and for a simple representation of the diffraction conditions for single crystals (see Section 2.2) it is very convenient to introduce further notions like lattice points and lattice planes and to make use of a second set of basis vectors, the reciprocal lattice constant vectors. For a given set of lattice constant vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ we consider all points in space having the position vectors

\[ \mathbf{g} = m\mathbf{a} + n\mathbf{b} + p\mathbf{c}, \qquad m, n, p \ \text{integers}. \tag{1.5} \]

These vectors define a lattice, the so-called crystal lattice, and the vectors $\mathbf{g}$ are called lattice vectors. They represent the lattice points. To approach a convenient description of planes we consider for the moment the following problem: A plane may intersect the three lattice constant vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ at $\tfrac{1}{h}\mathbf{a}$, $\tfrac{1}{k}\mathbf{b}$, $\tfrac{1}{l}\mathbf{c}$ (see Figure 1.4), where $h$, $k$, $l$ can in principle be arbitrary nonzero scalars, but in crystallography will always be integers. For a description of this plane we wish to calculate the normal vector $\mathbf{e}$ and the origin distance $d$, which provide all geometric properties of a plane ($\mathbf{e}$ gives the orientation, $d$ gives the absolute position in space, see Appendix, Section A.1.6, normal form of a plane equation).
Figure 1.4: Example of a lattice plane. Actual values of hkl are h = 2, k = 1, l = 3.
A vector in the direction of $\mathbf{e}$ is given by the vector product of two vectors in the plane, e.g. $\mathbf{v}_1 = \tfrac{1}{l}\mathbf{c} - \tfrac{1}{h}\mathbf{a}$ and $\mathbf{v}_2 = \tfrac{1}{k}\mathbf{b} - \tfrac{1}{h}\mathbf{a}$. Setting

\[ \mathbf{v} = \mathbf{v}_1 \times \mathbf{v}_2 = \left(\tfrac{1}{l}\mathbf{c} - \tfrac{1}{h}\mathbf{a}\right) \times \left(\tfrac{1}{k}\mathbf{b} - \tfrac{1}{h}\mathbf{a}\right) \]

we get

\[ \mathbf{v} = \frac{1}{kl}\,\mathbf{c} \times \mathbf{b} - \frac{1}{hl}\,\mathbf{c} \times \mathbf{a} - \frac{1}{hk}\,\mathbf{a} \times \mathbf{b}. \]

At this stage we cannot proceed further unless we construct something new to replace the vector products of the lattice constant vectors. This is done by the so-called reciprocal lattice constant vectors, which represent another type of basis, as is detailed also in the Appendix, Section A.1.4. They are defined by

\[ \mathbf{a}^* = \frac{\mathbf{b} \times \mathbf{c}}{(\mathbf{a}\,\mathbf{b}\,\mathbf{c})}, \qquad \mathbf{b}^* = \frac{\mathbf{c} \times \mathbf{a}}{(\mathbf{a}\,\mathbf{b}\,\mathbf{c})}, \qquad \mathbf{c}^* = \frac{\mathbf{a} \times \mathbf{b}}{(\mathbf{a}\,\mathbf{b}\,\mathbf{c})}. \tag{1.6} \]

The scalar triple product in the denominator is $V = (\mathbf{a}\,\mathbf{b}\,\mathbf{c})$, the unit cell volume. The six quantities

\[ a^* = |\mathbf{a}^*|, \quad b^* = |\mathbf{b}^*|, \quad c^* = |\mathbf{c}^*|, \quad \alpha^* = \angle(\mathbf{b}^*, \mathbf{c}^*), \quad \beta^* = \angle(\mathbf{c}^*, \mathbf{a}^*), \quad \gamma^* = \angle(\mathbf{a}^*, \mathbf{b}^*) \tag{1.7} \]

are called reciprocal lattice constants. Analogous to the lattice points $\mathbf{g}$ introduced above [see (1.5)] we call the lattice defined by the vectors

\[ \mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*, \qquad h, k, l \ \text{integers}, \tag{1.8} \]

the reciprocal lattice, and the vectors of type $\mathbf{h}$ are called reciprocal lattice vectors. Now we have two types of lattices. The first one, defined by (1.5), is called the direct lattice; the second one, defined by (1.8), is the reciprocal lattice. Correspondingly, to distinguish from the reciprocal lattice constants, the notation direct lattice constants is also used. Moreover, the entire space, defined by the direct basis vectors, is the direct or physical space (because the physics happens in this space), while the reciprocal basis defines the reciprocal space. If the vectors of the direct space have the unit of a length, say Å, it follows immediately from (1.6) that the vectors in reciprocal space have the unit of a reciprocal length, say Å⁻¹.
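Some properties:

(1) $(\mathbf{a}^*)^* = \mathbf{a}$, the same holds for $\mathbf{b}^*$ and $\mathbf{c}^*$. It follows that the reciprocals of the reciprocal lattice constants reproduce the direct lattice constants.

(2) $\mathbf{a}\cdot\mathbf{b}^* = \mathbf{a}\cdot\mathbf{c}^* = 0$, but $\mathbf{a}\cdot\mathbf{a}^* = 1$,
$\mathbf{b}\cdot\mathbf{c}^* = \mathbf{b}\cdot\mathbf{a}^* = 0$, but $\mathbf{b}\cdot\mathbf{b}^* = 1$,
$\mathbf{c}\cdot\mathbf{a}^* = \mathbf{c}\cdot\mathbf{b}^* = 0$, but $\mathbf{c}\cdot\mathbf{c}^* = 1$.

(3) If $V = (\mathbf{a}\,\mathbf{b}\,\mathbf{c})$ (unit [Å³]), $V^* = (\mathbf{a}^*\mathbf{b}^*\mathbf{c}^*)$ (unit [Å⁻³]) is called the volume of the reciprocal unit cell. It holds

\[ V^* = 1/V. \tag{1.9} \]

(4) If $\alpha = \angle(\mathbf{b}, \mathbf{c})$, $\beta = \angle(\mathbf{c}, \mathbf{a})$, $\gamma = \angle(\mathbf{a}, \mathbf{b})$ are the angles between the direct lattice vectors, then for the corresponding angles $\alpha^*$, $\beta^*$, $\gamma^*$ of the reciprocal lattice vectors, the following equations hold:

\[ \cos\alpha^* = \frac{\cos\beta\cos\gamma - \cos\alpha}{\sin\beta\sin\gamma}, \qquad \cos\beta^* = \frac{\cos\gamma\cos\alpha - \cos\beta}{\sin\gamma\sin\alpha}, \qquad \cos\gamma^* = \frac{\cos\alpha\cos\beta - \cos\gamma}{\sin\alpha\sin\beta}. \tag{1.10} \]

(5) If $\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}$ and $\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ we obtain

\[ \mathbf{r}\cdot\mathbf{h} = hx + ky + lz. \tag{1.11} \]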
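Properties (1) to (4) lend themselves to a direct numerical check. The following Python sketch builds the reciprocal basis from definition (1.6) for a hypothetical triclinic cell (the Cartesian components are illustrative assumptions, not values from the text) and evaluates both sides of the stated relations.

```python
import math

def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def reciprocal_basis(a, b, c):
    """Reciprocal lattice constant vectors a*, b*, c* after (1.6)."""
    v = dot(a, cross(b, c))
    return (tuple(x / v for x in cross(b, c)),
            tuple(x / v for x in cross(c, a)),
            tuple(x / v for x in cross(a, b)))

def cos_angle(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

# Hypothetical triclinic cell in Cartesian components (Å).
a = (6.0, 0.0, 0.0)
b = (1.0, 7.0, 0.0)
c = (1.5, 2.0, 8.0)
a_s, b_s, c_s = reciprocal_basis(a, b, c)

# Property (1): the reciprocal of the reciprocal basis is the direct basis.
a_back, b_back, c_back = reciprocal_basis(a_s, b_s, c_s)

# Property (3): V* = 1/V.
v_direct = dot(a, cross(b, c))
v_recip = dot(a_s, cross(b_s, c_s))

# Property (4), first equation of (1.10).
cos_al, cos_be, cos_ga = cos_angle(b, c), cos_angle(c, a), cos_angle(a, b)
sin_be = math.sqrt(1.0 - cos_be ** 2)
sin_ga = math.sqrt(1.0 - cos_ga ** 2)
cos_al_star_formula = (cos_be * cos_ga - cos_al) / (sin_be * sin_ga)
cos_al_star_direct = cos_angle(b_s, c_s)

# Property (2): a . a* = 1, a . b* = a . c* = 0 (up to rounding).
print(dot(a, a_s), dot(a, b_s), dot(a, c_s))
print(v_direct * v_recip, cos_al_star_formula, cos_al_star_direct)
```

Because the check works on Cartesian components, it does not depend on any particular crystal system.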
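Proof for (1)–(5) is given in Section A.1.4. Now we can recall the problem with respect to the vector $\mathbf{v}$ introduced above. We repeat

\[ \mathbf{v} = \frac{1}{kl}\,\mathbf{c} \times \mathbf{b} - \frac{1}{hl}\,\mathbf{c} \times \mathbf{a} - \frac{1}{hk}\,\mathbf{a} \times \mathbf{b}. \]

We can now replace the vector products in the equation for $\mathbf{v}$ by making use of the reciprocal lattice constant vectors, e.g. $\mathbf{a} \times \mathbf{b} = \mathbf{c}^* (\mathbf{a}\,\mathbf{b}\,\mathbf{c})$, $\mathbf{c} \times \mathbf{a} = \mathbf{b}^* (\mathbf{a}\,\mathbf{b}\,\mathbf{c})$, $\mathbf{c} \times \mathbf{b} = -\mathbf{a}^* (\mathbf{a}\,\mathbf{b}\,\mathbf{c})$ (note the change of sign here, because the vector product is not commutative!). Then we get

\[ \mathbf{v} = -(\mathbf{a}\,\mathbf{b}\,\mathbf{c}) \left[ \frac{1}{kl}\,\mathbf{a}^* + \frac{1}{hl}\,\mathbf{b}^* + \frac{1}{hk}\,\mathbf{c}^* \right] = -\frac{(\mathbf{a}\,\mathbf{b}\,\mathbf{c})}{hkl}\,\left(h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*\right). \]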
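The last relation can be verified numerically. The sketch below, an illustration with a hypothetical cell and the indices of Figure 1.4 (h = 2, k = 1, l = 3), computes v directly as the vector product of the two in-plane vectors and compares it with −(abc)/(hkl) · (ha* + kb* + lc*).

```python
def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def scale(s, u):
    """Scalar multiple s * u."""
    return tuple(s * x for x in u)

def add(*vectors):
    """Component-wise sum of any number of 3-tuples."""
    return tuple(sum(comp) for comp in zip(*vectors))

# Hypothetical triclinic cell (Å) and plane indices as in Figure 1.4.
a = (6.0, 0.0, 0.0)
b = (1.0, 7.0, 0.0)
c = (1.5, 2.0, 8.0)
h, k, l = 2, 1, 3

vol = dot(a, cross(b, c))
a_s = scale(1.0 / vol, cross(b, c))
b_s = scale(1.0 / vol, cross(c, a))
c_s = scale(1.0 / vol, cross(a, b))

# Two vectors lying in the plane through a/h, b/k, c/l.
v1 = add(scale(1.0 / l, c), scale(-1.0 / h, a))
v2 = add(scale(1.0 / k, b), scale(-1.0 / h, a))
v = cross(v1, v2)

h_vec = add(scale(h, a_s), scale(k, b_s), scale(l, c_s))
expected = scale(-vol / (h * k * l), h_vec)
print(v, expected)
```

The reciprocal lattice vector is also perpendicular to both in-plane vectors, which is what makes it a normal of the plane.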
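Let us consider the vector $\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$. (This is an exception to our convention, in that $h$ is not the magnitude of the vector $\mathbf{h}$. The reason is that crystallographers denote vectors of the above kind by the symbol $\mathbf{h}$ and the defining integers by $h$, $k$, $l$ with $h$ being not equal to $|\mathbf{h}|$.) Since $\mathbf{h}$ is a multiple of $\mathbf{v}$, $\mathbf{h}$ has the direction of $\mathbf{e}$, and we get $\mathbf{e} = \mathbf{h}/|\mathbf{h}|$. The normal equation of the plane is then

\[ \mathbf{r}\cdot\mathbf{e} = \frac{\mathbf{r}\cdot\mathbf{h}}{|\mathbf{h}|} = d \qquad \text{or} \qquad \mathbf{r}\cdot\mathbf{h} = |\mathbf{h}|\,d. \]

If we express $\mathbf{r}$ in terms of the direct basis, $\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}$, and suppose $\mathbf{r}$ to be the vector of a point in the plane, the vectors

\[ \mathbf{r}_1 = \tfrac{1}{h}\mathbf{a} - \mathbf{r}, \qquad \mathbf{r}_2 = \tfrac{1}{k}\mathbf{b} - \mathbf{r} \qquad \text{and} \qquad \mathbf{r}_3 = \tfrac{1}{l}\mathbf{c} - \mathbf{r} \]

are coplanar and therefore their scalar triple product has to be zero:

\[ \begin{aligned} (\mathbf{r}_1 \mathbf{r}_2 \mathbf{r}_3) = 0 &= \left[\left(\tfrac{1}{h} - x\right)\mathbf{a} - y\mathbf{b} - z\mathbf{c}\right] \cdot \left\{ \left[-x\mathbf{a} + \left(\tfrac{1}{k} - y\right)\mathbf{b} - z\mathbf{c}\right] \times \left[-x\mathbf{a} - y\mathbf{b} + \left(\tfrac{1}{l} - z\right)\mathbf{c}\right] \right\} \\ &= \left[\left(\tfrac{1}{h} - x\right)\mathbf{a} - y\mathbf{b} - z\mathbf{c}\right] \cdot \Big[ xy\,\mathbf{a} \times \mathbf{b} - x\left(\tfrac{1}{l} - z\right)\mathbf{a} \times \mathbf{c} - x\left(\tfrac{1}{k} - y\right)\mathbf{b} \times \mathbf{a} \\ &\qquad + \left(\tfrac{1}{k} - y\right)\left(\tfrac{1}{l} - z\right)\mathbf{b} \times \mathbf{c} + zx\,\mathbf{c} \times \mathbf{a} + zy\,\mathbf{c} \times \mathbf{b} \Big]. \end{aligned} \]

Since all scalar triple products containing the same vector twice vanish, we get

\[ (\mathbf{r}_1 \mathbf{r}_2 \mathbf{r}_3) = (\mathbf{a}\,\mathbf{b}\,\mathbf{c}) \left[ \left(\tfrac{1}{h} - x\right)\left(\tfrac{1}{k} - y\right)\left(\tfrac{1}{l} - z\right) - \left(\tfrac{1}{h} - x\right)zy - 2xyz - xy\left(\tfrac{1}{l} - z\right) - xz\left(\tfrac{1}{k} - y\right) \right] = 0. \]

Since $(\mathbf{a}\,\mathbf{b}\,\mathbf{c}) \neq 0$, because $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ are basis vectors, the expression in square brackets vanishes. Elementary calculation leads to

\[ \frac{1}{hkl} - \frac{z}{hk} - \frac{y}{hl} - \frac{x}{kl} = 0 \qquad \text{or} \qquad hx + ky + lz = 1. \]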
On the other hand, it follows from (1.11) for the scalar product that $\mathbf{r}\cdot\mathbf{h} = hx + ky + lz$. So we get

\[ \mathbf{r}\cdot\mathbf{h} = |\mathbf{h}|\,d = 1, \qquad |\mathbf{h}| = 1/d. \tag{1.12} \]
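Result (1.12) can be cross-checked by computing the origin distance of the plane in a purely geometric way, without the reciprocal basis, and comparing it with 1/|h|. The following sketch does this for a hypothetical cell and the indices h = 2, k = 1, l = 3 of Figure 1.4; all numerical values are assumptions for illustration.

```python
import math

def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def sub(u, v):
    """Difference u - v."""
    return tuple(p - q for p, q in zip(u, v))

# Hypothetical triclinic cell (Å) and plane indices.
a = (6.0, 0.0, 0.0)
b = (1.0, 7.0, 0.0)
c = (1.5, 2.0, 8.0)
h, k, l = 2, 1, 3

# Geometric route: plane through the three axial intercepts.
p1 = tuple(x / h for x in a)
p2 = tuple(x / k for x in b)
p3 = tuple(x / l for x in c)
normal = cross(sub(p2, p1), sub(p3, p1))
d_geometric = abs(dot(normal, p1)) / math.sqrt(dot(normal, normal))

# Reciprocal-space route: d = 1/|h| with h = h a* + k b* + l c*.
vol = dot(a, cross(b, c))
a_s = tuple(x / vol for x in cross(b, c))
b_s = tuple(x / vol for x in cross(c, a))
c_s = tuple(x / vol for x in cross(a, b))
h_vec = tuple(h * a_s[i] + k * b_s[i] + l * c_s[i] for i in range(3))
d_reciprocal = 1.0 / math.sqrt(dot(h_vec, h_vec))

print(d_geometric, d_reciprocal)
```

The two routes agree to rounding error, while the reciprocal-space route needs only a single vector per plane.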
Let us summarize this result since it is of major importance in crystallography: For a plane given by the vectors $\tfrac{1}{h}\mathbf{a}$, $\tfrac{1}{k}\mathbf{b}$, $\tfrac{1}{l}\mathbf{c}$ with $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ being arbitrary basis vectors and $h$, $k$, $l$ scalars, the vector $\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ has the following properties:
(a) $\mathbf{h}$ has the direction of the normal vector of the plane,
(b) the magnitude of $\mathbf{h}$ is the reciprocal of the origin distance $d$.

With these important geometrical properties, every plane in three-dimensional space can be expressed by its vector $\mathbf{h}$ in reciprocal space. Since it is more advantageous to proceed with vectors than with equations of planes in arithmetical calculations, we have a good reason for the introduction of reciprocal bases. In the special case of planes being parallel to one or more basis vectors, the intercept on these basis vectors lies at infinity. Since the inverse is zero, we have to use a zero for the corresponding index of $\mathbf{h}$. For instance, the plane parallel to the a–b plane has the vector $\mathbf{h} = 0\mathbf{a}^* + 0\mathbf{b}^* + l\mathbf{c}^* = l\mathbf{c}^*$.

As we shall see in Chapter 2 the reciprocal lattice vectors after (1.8), $\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$, play a dominant role in describing diffraction in a crystal. As shown above, they have the property to define a plane in direct space intersecting the lattice constant vectors at $\tfrac{1}{h}\mathbf{a}$, $\tfrac{1}{k}\mathbf{b}$, $\tfrac{1}{l}\mathbf{c}$. Now we consider a set of planes, parallel to this given plane, with the additional property that each plane contains at least one lattice point of the direct lattice and that each lattice point lies on one plane. A set of such planes is called a set of lattice planes. One plane of this set is called a lattice plane. We note that a large variety of possibilities exist to construct sets of lattice planes, as illustrated in Figure 1.5. In the following we wish to characterize such sets of lattice planes. It is immediately clear that each individual plane of a given set of lattice planes has a normal vector $c\,\mathbf{h}$ where $c$ is a scalar factor.

Now let us consider instead of $\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ the vector $\mathbf{h}' = h'\mathbf{a}^* + k'\mathbf{b}^* + l'\mathbf{c}^*$, where $h'$, $k'$, $l'$ are obtained by dividing $h$, $k$, $l$ by a common factor such that the greatest common divisor (g.c.d.) of $h'$, $k'$, $l'$ is equal to 1. Then it is evident that each normal vector of the desired planes is a multiple of $\mathbf{h}'$, since $\mathbf{h}$ and $\mathbf{h}'$ are parallel. One example of the desired planes can easily be obtained. Let $f$ be the least common multiple (l.c.m.) of $h'$, $k'$ and $l'$. If we multiply $1/h'$, $1/k'$, $1/l'$ by $f$ we get three integers, $m = f/h'$, $n = f/k'$, $p = f/l'$. The plane having the axial intercepts $m\mathbf{a}$, $n\mathbf{b}$, $p\mathbf{c}$ is then one example of the desired lattice plane, since it contains the lattice points $(m, 0, 0)$, $(0, n, 0)$, $(0, 0, p)$ and its normal vector is parallel to

\[ \mathbf{v} = \begin{pmatrix} 1/m \\ 1/n \\ 1/p \end{pmatrix} = \begin{pmatrix} h'/f \\ k'/f \\ l'/f \end{pmatrix} = \frac{1}{f} \begin{pmatrix} h' \\ k' \\ l' \end{pmatrix} = \frac{1}{f}\,\mathbf{h}'. \]

We can now solve the problem by two propositions.

(1) The set of planes desired can be expressed by the equation

\[ \mathbf{h}'\cdot\mathbf{r} = n, \qquad n = 0, \pm 1, \pm 2, \dots \tag{1.13} \]

with $\mathbf{h}' = h'\mathbf{a}^* + k'\mathbf{b}^* + l'\mathbf{c}^*$.

(2) The distance $d$ between two neighboring lattice planes is given by

\[ d = 1/|\mathbf{h}'|. \tag{1.14} \]
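Propositions (1) and (2) translate into a short recipe: divide h, k, l by their greatest common divisor and take the reciprocal of the length of the resulting vector h′. A minimal Python sketch of this recipe follows; the function names and the cubic sample cell (a = 4 Å) are hypothetical and serve only as an illustration.

```python
import math
from functools import reduce

def cross(u, v):
    """Vector product u x v for 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(p * q for p, q in zip(u, v))

def reduce_indices(hkl):
    """Divide h, k, l by their greatest common divisor to obtain h', k', l'."""
    g = reduce(math.gcd, hkl)
    if g == 0:
        raise ValueError("at least one index must be nonzero")
    return tuple(i // g for i in hkl)

def plane_spacing(hkl, a, b, c):
    """Spacing d = 1/|h'| of neighboring lattice planes, equation (1.14)."""
    hp, kp, lp = reduce_indices(hkl)
    vol = dot(a, cross(b, c))
    a_s = tuple(x / vol for x in cross(b, c))
    b_s = tuple(x / vol for x in cross(c, a))
    c_s = tuple(x / vol for x in cross(a, b))
    h_vec = tuple(hp * a_s[i] + kp * b_s[i] + lp * c_s[i] for i in range(3))
    return 1.0 / math.sqrt(dot(h_vec, h_vec))

# Hypothetical cubic cell with a = 4 Å.
a = (4.0, 0.0, 0.0)
b = (0.0, 4.0, 0.0)
c = (0.0, 0.0, 4.0)

d_220 = plane_spacing((2, 2, 0), a, b, c)  # reduced to (1, 1, 0)
d_110 = plane_spacing((1, 1, 0), a, b, c)
print(reduce_indices((2, 4, 6)), d_220, d_110)
```

For a cubic cell the result can be compared with the familiar expression a/√(h′² + k′² + l′²).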
Since the proof of these propositions is not well known in the crystallographic literature, it is given below [8]. Readers who prefer to skip the mathematical details may pass directly to Section 1.3. To prove (1) we state at first that all planes parallel to the given one can be expressed by the equation $\mathbf{h}'\cdot\mathbf{r} = C$, with $C$ being an arbitrary constant.
Figure 1.5: Different sets of lattice planes. The small circles represent the lattice points, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
It follows from (1.12) that every plane having the desired properties must satisfy an equation dr = 1 with d parallel to h. That means d = C′h. Since h′ is parallel to h, we get d = C′h = C′′h′ and 1 = dr = C′′h′r or h′r = 1/C′′ = C. The desired set of lattice planes is therefore a subset of the set of planes defined by the equation above.
If a lattice point given by the integers (m′, n′, p′) satisfies this equation, we get h′r = h′m′ + k′n′ + l′p′ = C. It follows immediately that C must be an integer. To prove (1.13), we have to show that the set of all integers C appearing on the right side of the last equation is equal to the set of all integers. Let us denote that set by Z. Then we have
Z = [h′m′ + k′n′ + l′p′ ⎪ m′, n′, p′ integers].
The plane having the equation h′r = 0 contains the origin, which is a lattice point with m′ = n′ = p′ = 0, so zero is an element of Z. Since Z is a set of integers that also contains nonzero elements, there exists an element u ≠ 0 of Z whose magnitude ⎪u⎪ is the smallest among all nonzero elements of Z. We shall show that ⎪u⎪ is the greatest common divisor (g.c.d.) of h′, k′, l′, and that Z = [nu ⎪ n = 0, ± 1, ± 2, …].
First we need the property that the sum (difference) of two elements C, C′ of Z is also an element of Z. This is trivial, since C and C′ are of the form
C = h′m′ + k′n′ + l′p′
C′ = h′m′′ + k′n′′ + l′p′′
hence
C ± C′ = h′(m′ ± m′′) + k′(n′ ± n′′) + l′(p′ ± p′′) = h′m′′′ + k′n′′′ + l′p′′′
which is again an element of Z.
Let C be an arbitrary element of Z. Division by u gives C = nu + r with an integer n and a remainder r satisfying ⎪r⎪ < ⎪u⎪. Since u is an element of Z, nu is an element of Z and the same holds for C – nu. That means, r = C – nu is an element of Z. Since ⎪r⎪ < ⎪u⎪ and u was the nonzero element of Z with the smallest magnitude, it follows that r = 0, hence C = nu. Then it follows that Z = [nu ⎪ n = 0, ± 1, ± 2, …].
Now it remains to show that u is the g.c.d. of h′, k′, l′. Suppose u′ to be the greatest common divisor of h′, k′, l′, so that h′ = u′m*, k′ = u′n*, l′ = u′p* with integers m*, n*, p*. Since u is an element of Z, it can be written
u = h′m′ + k′n′ + l′p′ = u′m*m′ + u′n*n′ + u′p*p′ (since u′ is a common divisor) = u′(m*m′ + n*n′ + p*p′).
The last expression in brackets is an integer, say r, so we get u = u′r, hence ⎪u′⎪ ≤ ⎪u⎪. Since h′ = h′ · 1 + k′ · 0 + l′ · 0 is an element of Z and Z = [nu] was already shown, an integer m′ exists with h′ = m′u. Similarly we get that u is a divisor of k′ and l′. Hence u is a common divisor of h′, k′, l′, and since u′ is the greatest common divisor, it follows that ⎪u⎪ ≤ ⎪u′⎪. Finally, we get u′ = ± u. Since in our case the g.c.d. of (h′, k′, l′) was equal to 1, we get Z = [n ⎪ n = 0, ± 1, ± 2, …] and thus (1.13) holds.
If we write (1.13) in the form
$$\frac{\mathbf{h}'}{n}\,\mathbf{r} = 1 \qquad \text{for } n \neq 0$$
we get from (1.12), for the origin distance d′ of all planes except the one passing through the origin, d′ = ⎪n⎪/⎪h′⎪, and the distance between two neighboring planes is then d = 1/⎪h′⎪, hence (1.14) holds.
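The statement proved above can be illustrated with a small brute-force experiment. The sketch below is not part of the proof: it enumerates the values h′m′ + k′n′ + l′p′ for a limited range of integers and compares the result for a primitive and a non-primitive index triple. The function names and the cubic cell edge are assumptions chosen for illustration.

```python
from itertools import product
from math import gcd, sqrt

def primitive_indices(h, k, l):
    """Divide (h, k, l) by their greatest common divisor."""
    g = gcd(gcd(abs(h), abs(k)), abs(l))
    if g == 0:
        raise ValueError("at least one index must be non-zero")
    return h // g, k // g, l // g

def reachable_values(hkl, limit):
    """All values h*m + k*n + l*p with integers m, n, p in [-limit, limit]."""
    h, k, l = hkl
    steps = range(-limit, limit + 1)
    return {h * m + k * n + l * p for m, n, p in product(steps, repeat=3)}

# The indices (12, 20, 30) have the common factor 2; dividing gives (6, 10, 15).
primitive = primitive_indices(12, 20, 30)

# With g.c.d. = 1 every small integer occurs as a plane constant, as in (1.13) ...
values = reachable_values(primitive, 3)
print(all(n in values for n in range(-3, 4)))

# ... whereas the non-primitive triple only produces multiples of its g.c.d.
non_primitive_values = reachable_values((12, 20, 30), 3)
print(all(v % 2 == 0 for v in non_primitive_values))

# Plane spacing (1.14) for a hypothetical cubic cell, where |h'| = sqrt(h'^2 + k'^2 + l'^2)/a.
cell_edge = 4.0  # Angstrom, illustrative value
spacing = cell_edge / sqrt(sum(index * index for index in primitive))
print(spacing)
```

The example triple is chosen so that no two indices are coprime although the g.c.d. of all three is 1, which shows that only the common divisor of the complete triple matters.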
1.3 Sample structures
We shall describe in the course of this book the structure determinations of four compounds in detail. These four compounds are
(1) potassium hydrogen tartrate, C4H5O6K, in the following the code KAMTRA is used,
(2) sucrose, C12H22O11, code SUCROS,
(3) vitamin B12, code B12,
(4) the C60-fullerene compound C60F18, code C60F18.
We have chosen these compounds for several reasons. Good single crystals of the first two compounds can be obtained and the molecules are relatively small. The data measurement procedures and computer calculations are very easy to carry out. We can use KAMTRA as an example of a structure in which the phase problem can be solved by application of the “heavy atom method” (see Section 7.2.2). SUCROS is an optically active organic structure to which we can apply “direct methods” (see Section 7.3) for the structure solution. Moreover, the history of the various sucrose structure determinations illustrates nicely the progress in the field in the last decades. The first X-ray analysis of sucrose was published in 1952, hence 40 years after the discovery of X-ray diffraction, with a remark by the authors that it took almost 8 years to complete [9]. At that time the structure was even based on two-dimensional projections because of the size and the rather complicated geometry of the molecule. Three-dimensional neutron and X-ray structures were then published in the 1960s and 1970s [10, 11] and a 20 K charge density study was reported in 2007 [12]. As we shall see, today the sucrose structure is a small crystallographic problem, which can be solved within a couple of hours. Vitamin B12 also played an important role in crystallographic history [13]. It is a structure with more than 150 atoms and was solved in the mid-1950s by the group around Dorothy Hodgkin in the UK using the “heavy atom method”.
At that time it was a huge problem, and her fantastic work on this and further large biomedical substances was awarded the Nobel Prize in Chemistry in 1964. We shall redetermine the B12 structure applying both the heavy atom method and direct methods. For centuries graphite and diamond were known as the only two forms of carbon, so the discovery in the 1980s of the fullerene molecules, consisting of 60, 70 and even more carbon atoms, attracted considerable attention [14]. While the C60 molecule has a very symmetric, almost spherical shape similar to a conventional “soccer ball”, the structure of the highly fluorinated C60F18 differs remarkably from this symmetrical shape [15]. So not only is the structure of interest, but there is also, as we shall see, an experimental challenge with respect to the poor crystal quality, so that we consider this structure analysis as an example where synchrotron radiation had to be applied [16]. If the reader becomes experienced with all the problems arising from these four structure analyses, or even redetermines one or two of them, he/she should be able to solve most of the more straightforward structural problems arising in his/her laboratory.
2 Fundamental results of diffraction theory, X-radiation
2.1 Electron density and related functions
We wish now to introduce some important results of diffraction theory and we shall see that properties of Fourier and convolution theory described in the Appendix, Section A.2, are very convenient for that purpose. Diffraction occurs when light interacts with matter. When X-rays, which are electromagnetic waves with wavelengths in the range 0.1 to 10 Å, are the source of radiation, they interact with the electrons of matter; hence all the results of X-ray diffraction experiments must be due to the electron distribution of the diffracting material. So we have to be concerned with the electron distribution which is described relative to the material volume, hence with the electron density function, denoted by ρ(r) (unit: electrons per volume, e.g. e Å–3). From now on we replace the quantity q(r), introduced in Section 1.2.1 to describe the distribution of matter, by ρ(r), having the periodicity property of equation (1.1) in a crystal. When the electron density can be derived from an X-ray diffraction experiment this function is a so-called physical observable in the sense of quantum chemistry. With this property ρ(r) differs from the wave functions ψ(r) in Schrödinger’s equation. The time-independent Schrödinger equation reads
$$H\psi = E\psi$$
with the Hamilton operator
$$H = -\frac{\hbar^2}{2m}\nabla^2 + U, \qquad \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2},$$
U = potential energy, E = total energy. It is a second order partial differential equation and has two disadvantages. An exact solution exists only for the hydrogen atom and the wave function ψ(r) is not accessible to any experiment. After Born’s statistical interpretation, the probability dW to find an electron in the volume element dV is given by dW = ψψ*(r) dV, where an asterisk indicates the complex conjugate of a complex quantity. Then the probability density is
$$\lim_{dV \to 0} \frac{dW}{dV} = \rho(\mathbf{r})$$
and
$$\rho(\mathbf{r}) \approx \psi\psi^*(\mathbf{r}). \tag{2.1}$$
This way the experimentally obtainable electron density ρ(r) plays a fundamental role for the description of chemical structures. The most general diffraction experiment is that in which X-rays from an X-ray source interact with a sample of arbitrary material and the diffracted radiation is recorded on a detector as shown in Figure 2.1. The aim of such an experiment is to get information about the electron density ρ(r) of the sample from the intensity I of the diffracted radiation. The relationship between I and ρ(r) is complicated and we shall not deal with its theoretical derivation, but the result is simple to understand, once we introduce a further function, the Fourier transform of the electron density (the Fourier transform operator is denoted by the symbol $\mathcal{F}$): The function F(b) = $\mathcal{F}$(ρ(r)) is a function of reciprocal space and defined by
$$F(\mathbf{b}) = \int_V \rho(\mathbf{r})\, e^{2\pi i(\mathbf{b},\mathbf{r})}\, dV. \tag{2.2a}$$
F(b) is called the structure factor. The reciprocal space with the basis vectors a*, b*, c* (see Section 1.2.2) in this connection is also called Fourier space, and its vectors b have the dimension of a reciprocal length (e.g. Å–1). In general the structure factor is a complex function, which can be expressed as
$$F(\mathbf{b}) = |F(\mathbf{b})|\, e^{i\varphi(\mathbf{b})}. \tag{2.2b}$$
The magnitude of the structure factor, i.e.
$$|F(\mathbf{b})| = +\sqrt{F(\mathbf{b})F^*(\mathbf{b})}$$
(F*(b) = the complex conjugate of F(b)), is called the structure amplitude, ϕ(b) is its phase. Conversely, ρ(r) is related to F(b) by the inverse Fourier transform:
$$\rho(\mathbf{r}) = \int_{V^*} F(\mathbf{b})\, e^{-2\pi i(\mathbf{b},\mathbf{r})}\, dV^*. \tag{2.3}$$
Now we can present the main result of diffraction theory. The diffracted intensity I is proportional to the square of the structure amplitude:
Figure 2.1: General X-ray diffraction experiment.
$$I(\mathbf{b}) \sim F(\mathbf{b})F^*(\mathbf{b}) = |F(\mathbf{b})|^2 \tag{2.4a}$$
or
$$|F(\mathbf{b})| = +\sqrt{\frac{I(\mathbf{b})}{c}}, \tag{2.4b}$$
if c is the proportionality factor. This is the main result of diffraction theory, and at the same time, its main problem. The aim of the experiment is to get the electron density function ρ(r). If we had its Fourier transform F(b), we could calculate ρ(r) by inverse transformation after (2.3), but unfortunately experiments provide us only with the magnitude ⎪F(b)⎪ from I, and not with the phase ϕ(b) of this complex function. This phase problem is the central problem of every diffraction experiment and no generally applicable direct experimental solution has yet been found. It has obstructed for decades the progress in X-ray crystallography. There are some experiments reported for the so-called multi-beam case [1], which provide, at least in principle, the possibility of direct experimental phase measurement. However, because of the considerable effort involved, a general application to practical problems of single crystal analysis has not been considered. Since we cannot obtain the inverse Fourier transformation of F(b), let us examine the inverse transformation of what we get from experiment, that is, $\mathcal{F}^{-1}[F(\mathbf{b})F^*(\mathbf{b})]$. The function P = $\mathcal{F}^{-1}(FF^*)$ given by
$$P(\mathbf{u}) = \int_{V^*} F(\mathbf{b})F^*(\mathbf{b})\, e^{-2\pi i(\mathbf{b},\mathbf{u})}\, dV^* \tag{2.5}$$
with its argument u being a vector in direct space, is of great importance in crystallography. It is called the Patterson function, since it was Patterson (1934/35) [2] who introduced this function for the first time as an approach to the phase problem. Note that the argument of P is usually denoted by a “u” and not by an “r”, although it is a vector of physical space as is the argument r of ρ(r). When we discuss the properties of P in another section (7.2.1), we shall see that it is useful to distinguish between u and r because there is a geometric difference between these two quantities. For the moment, let us only point to a mathematical property of P(u), which can easily be derived from eq. (A.30b) (Appendix, Section A.2). The Patterson function is the convolution square of the electron density, hence
$$P(\mathbf{u}) = \int_V \rho(\mathbf{r})\,\rho(\mathbf{r} - \mathbf{u})\, dV. \tag{2.6}$$
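The equivalence of (2.5) and (2.6) can be made plausible with a one-dimensional discrete model. The sketch below is only an illustration under simplifying assumptions: the "crystal" is a periodic row of eight sampling points, the integrals are replaced by sums, and the density values (a heavy and a light "atom") are invented for the example.

```python
import cmath

def structure_factors(rho):
    """Discrete analogue of (2.2a): F(b) = sum_r rho(r) exp(2 pi i b r / N)."""
    n = len(rho)
    return [sum(rho[r] * cmath.exp(2j * cmath.pi * b * r / n) for r in range(n))
            for b in range(n)]

def patterson_from_intensities(factors):
    """Discrete analogue of (2.5): inverse transform of |F(b)|^2 (phases not needed)."""
    n = len(factors)
    intensities = [abs(f) ** 2 for f in factors]
    return [sum(intensities[b] * cmath.exp(-2j * cmath.pi * b * u / n)
                for b in range(n)).real / n
            for u in range(n)]

def patterson_direct(rho):
    """Discrete analogue of (2.6): P(u) = sum_r rho(r) rho(r - u), periodic in N."""
    n = len(rho)
    return [sum(rho[r] * rho[(r - u) % n] for r in range(n)) for u in range(n)]

# Hypothetical density: a heavy atom at position 1 and a light atom at position 4.
rho = [0, 8, 0, 0, 1, 0, 0, 0]

p_fourier = patterson_from_intensities(structure_factors(rho))
p_direct = patterson_direct(rho)
print([round(p, 6) for p in p_fourier])
print(p_direct)
```

Both routes give the same function. Its maxima lie at u = 0 and at the interatomic vectors u = ±3 (that is, 3 and 5 in the periodic row), not at the atomic positions themselves, which is the geometric difference between u and r mentioned above.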
This last equation will be discussed in detail since it provides an important method for the solution of the phase problem. All procedures, called heavy atom or Patterson methods (see Section 7.2.1) are based on (2.6). Now we have introduced the three important functions of diffraction theory, which are the two functions of direct space,
(a) the electron density ρ(r),
(b) the Patterson function P(u),
and the function of reciprocal space,
(c) the structure factor F(b).
An elegant summary of the results of diffraction theory has been given by Hosemann and Bagchi [3] in the form of a diagram:
[Diagram: relations between the electron density ρ(r), the structure factor F(b), the Patterson function P(u) and the intensity I(b) ∼ F(b)F*(b).]
The arrows drawn as full lines indicate the possible operations. The arrows drawn as dashed lines show the phase problem. We see that there is no path from the result of the experiment, I(b), to the right-hand side of the diagram, to the desired electron density ρ(r).
2.2 Diffraction conditions for single crystals
From experiments with diffraction of light from a slit or a lattice we learn that diffraction maxima are observable if the geometrical dimensions of diffracting material are of the same order of magnitude as the wavelength of the light. With X-rays having wavelengths in the Å-range, diffraction can be observed when the geometry of the diffracting matter is of atomic dimensions. Crystals have the favorable property that they provide a periodic arrangement with dimensions comparable to the wavelength of the incoming beam so that in certain directions the diffracted intensity is no longer spread continuously but becomes discrete. This is illustrated by the one-dimensional example in Figure 2.2. In the three-dimensional case of the periodic crystal lattice, the diffracted intensity is likewise distributed discontinuously in space. This is expressed by the vectorial form of the famous Laue diffraction condition, which is the three-dimensional generalization of what is shown in Figure 2.2. Let a single crystal have the unit cell vectors a, b, c and the corresponding reciprocal cell vectors a*, b*, c*. Then
$$I(\mathbf{b}) \neq 0 \ \text{only if}\ \mathbf{b} = \mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^* \ \text{with } h, k, l \text{ integers, otherwise } I(\mathbf{b}) = 0. \tag{2.7}$$
Two important properties derive from the Laue condition. The first is that the intensity distribution is discontinuous. Of course, this is an important advantage for the experimental situation. If diffraction intensities from a single crystal are to be taken, they appear only at discrete points in space and no continuous intensity diagram has to be measured. The second is that each diffraction spot is related uniquely to a reciprocal lattice vector h = ha* + kb* + lc*, which in turn represents a lattice plane, which intersects the unit cell vectors at 1/ha, 1/kb, 1/lc.
Figure 2.2: Periodic row of scatterers as one-dimensional crystal model. If an incoming wave hits the points P0, P1, P2, … at the angle ϕ0, amplification in the outgoing direction ϕ takes place if the path difference of neighboring waves is a multiple of the wavelength λ. This means P0A − P1B = hλ. In the three-dimensional periodic crystal, three equations of this type have to be fulfilled [see Laue diffraction condition, eq. (2.7)], illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
As was pointed out above, diffraction occurs if the primary radiation has a wavelength comparable to the periodicity. This condition is also satisfied by neutron or electron radiation. Electron diffraction suffers from high absorption in the solid state, and is therefore not well suited to crystal structure analysis. In neutron diffraction, the interaction takes place with the nuclei of the atoms, so that X-ray and neutron diffraction yield complementary information about electron density and nuclear sites. Hence both methods are valuable for structure determination. In practical work neutron diffraction is seldom applied because neutron sources are available only at a few reactor sites while X-ray equipment can be installed in almost every laboratory. That is why we consider in the course of this book mainly the diffraction of X-rays on single crystals; however, we shall give a brief description of neutron diffraction in Section 9.2. We know from the Laue condition that only the lattice planes are responsible for diffraction; the geometrical conditions under which this will occur are most simply understood through the Ewald diffraction condition (Figure 2.3). Let us expose a crystal K to an X-ray beam of wavelength λ and let s0 be the unit vector in the direction of the primary beam. Diffraction in the direction of a further unit vector s happens if and only if a lattice plane L with its normal vector h has a special orientation with respect to the primary beam, so that the three vectors, h, s0, and s satisfy the equation
$$\mathbf{h} = \frac{\mathbf{s} - \mathbf{s}_0}{\lambda} \quad \text{or} \quad \mathbf{h} = \frac{\mathbf{s}}{\lambda} - \frac{\mathbf{s}_0}{\lambda}. \tag{2.8}$$
Since two vectors of length 1/λ appear in eq. (2.8) it is very convenient to consider a sphere of radius 1/λ with the crystal at its center, called the Ewald sphere. Let 0 be the point where the vector s0/λ ends on the Ewald sphere. It follows now from the Ewald condition that diffraction occurs if the normal vector h of a lattice plane L, when plotted from the origin 0, ends on the surface of the Ewald sphere.
Figure 2.3: Geometrical representation of the Ewald diffraction condition.
Let us explain the role of the angle θ in Figure 2.3. We find that
$$\frac{|\mathbf{h}|/2}{1/\lambda} = \sin\theta$$
or with d = 1/⎪h⎪,
$$\lambda = 2d \sin\theta. \tag{2.9}$$
This famous equation is called Bragg’s law (or Bragg’s equation), and the angle θ is said to be the “Bragg angle”. Now we can discuss both the Ewald condition and Bragg’s law. Diffraction by a lattice plane L happens only if d, the reciprocal of the normal vector’s magnitude, satisfies Bragg’s equation. The plane L behaves like a mirror, since the incident and reflected beams make the same angle θ with the plane. The angle between incident and diffracted beam is then 2θ. Now we can understand why crystallographers refer to the discrete diffracted intensities as reflections, and we can express in a few words the diffraction of X-rays on single crystals: X-rays are reflected by a lattice plane of a single crystal if, and only if, the angle of the incident beam to the plane satisfies Bragg’s law. In that case, the incident beam, the diffracted beam, and the vector normal to the plane satisfy Ewald’s condition. It has to be pointed out, however, that the underlying physical process is diffraction and not reflection; the description as reflection is merely convenient.
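The connection between the Ewald condition (2.8) and Bragg’s law (2.9) can be verified with a few lines of Python. In this sketch the primary beam is placed along the x axis and the diffracted beam in the x–y plane; this choice of axes, the function names and the plane spacing of 2 Å are assumptions made for illustration.

```python
import math

def bragg_angle(d_spacing, wavelength):
    """Bragg angle in radians from lambda = 2 d sin(theta); None if no reflection is possible."""
    sine = wavelength / (2.0 * d_spacing)
    if sine > 1.0:
        return None
    return math.asin(sine)

def scattering_vector(theta, wavelength):
    """h = (s - s0)/lambda with s0 along x and s deflected by 2*theta in the x-y plane."""
    s0 = (1.0, 0.0, 0.0)
    s = (math.cos(2.0 * theta), math.sin(2.0 * theta), 0.0)
    return tuple((s_i - s0_i) / wavelength for s_i, s0_i in zip(s, s0))

wavelength = 1.542   # Angstrom, CuK-alpha
d_spacing = 2.0      # Angstrom, hypothetical lattice plane spacing

theta = bragg_angle(d_spacing, wavelength)
h_vec = scattering_vector(theta, wavelength)
h_length = math.sqrt(sum(x * x for x in h_vec))

print(math.degrees(theta))        # Bragg angle in degrees
print(h_length, 1.0 / d_spacing)  # |h| equals 1/d, as required

# A spacing below lambda/2 cannot give a reflection:
print(bragg_angle(0.5, wavelength))
```

The vector h constructed from s and s0 has the length 2 sin θ/λ, so that Bragg’s law is simply the statement ⎪h⎪ = 1/d.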
We see again that the description of diffraction can easily be made on the basis of the reciprocal lattice vectors h, introduced in Section 1.2.2. Therefore, to describe in the following all diffraction experiments it is only necessary to consider these vectors. Another important consequence derives from Bragg’s law. Since ⎪sin θ⎪ ≤ 1, we get
$$\lambda/(2d) \leq 1 \quad \text{or} \quad |\mathbf{h}| \leq 2/\lambda. \tag{2.10}$$
It follows that, for a given radiation with fixed wavelength λ, the number of possible reflections is limited. Only those reflections in reciprocal space inside the sphere of radius 2/λ can be observed. This sphere is called the limiting sphere. The limiting sphere has twice the radius of the Ewald sphere (see Figure 2.4). Since we have one reflection per reciprocal unit cell, the number of reflections inside the limiting sphere is M if its volume is equal to M times V*. It follows that
$$MV^* = \frac{4}{3}\pi\left(\frac{2}{\lambda}\right)^3, \qquad M = \frac{32\pi}{3}\,\frac{1}{\lambda^3}\,\frac{1}{V^*}$$
or
$$M = 33.5\,\frac{V}{\lambda^3} \tag{2.11}$$
with V = 1/V* = cell volume. For the wavelengths of CuKα and MoKα radiation, which are most frequently used, we have λ(Cu) ≈ 1.542 Å and λ(Mo) ≈ 0.711 Å. Then we get M ≈ 9 × V for Cu radiation and M ≈ 93 × V for Mo radiation (V in Å3), i.e. the number of reflections which could be taken with Mo radiation is about a factor ten larger than for Cu radiation. Usually the experimental conditions only allow a reflection measurement below a given θmax. Then (2.10) has to be replaced by
$$|\mathbf{h}| < (2/\lambda)\sin\theta_{max} \tag{2.10a}$$
and (2.11) becomes
$$M = 33.5\,\frac{V}{\lambda^3}\,\sin^3\theta_{max}. \tag{2.11a}$$
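Equations (2.11) and (2.11a) are easily evaluated. The short Python sketch below reproduces the factors quoted for Cu and Mo radiation; the function name is illustrative, and the cell volume of 628.5 Å3 is the value used for KAMTRA in the absorption example of Section 2.3.2.

```python
import math

def reflections_in_sphere(cell_volume, wavelength, theta_max_deg=90.0):
    """Number of reciprocal lattice points with |h| < (2/lambda) sin(theta_max), eq. (2.11a)."""
    sine = math.sin(math.radians(theta_max_deg))
    return (32.0 * math.pi / 3.0) * cell_volume * sine ** 3 / wavelength ** 3

LAMBDA_CU = 1.542  # Angstrom
LAMBDA_MO = 0.711  # Angstrom

# Reflections per cubic Angstrom of cell volume for the full limiting sphere:
m_cu = reflections_in_sphere(1.0, LAMBDA_CU)
m_mo = reflections_in_sphere(1.0, LAMBDA_MO)
print(m_cu, m_mo)

# Restricted measurement range for Cu radiation:
m_cu_70 = reflections_in_sphere(1.0, LAMBDA_CU, theta_max_deg=70.0)
print(m_cu_70)

# Total count for a cell of 628.5 cubic Angstrom with Cu radiation:
print(reflections_in_sphere(628.5, LAMBDA_CU))
```

The count includes all symmetry-equivalent reflections, so the number of independent reflections that must actually be measured is smaller.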
If, for instance, θmax = 70° must be chosen for CuKα radiation, we get M ≈ 7.6 V. The same magnitude of M for the same cell volume V is obtained with MoKα radiation if θmax ≈ 26° is taken. The resolution of the diffraction experiment can never be better than λ/2, and a structure determination can reach this value only if all reflections of the limiting sphere are measured.
Figure 2.4: Ewald sphere (radius 1/λ) and limiting sphere (radius 2/λ).
If the experimental conditions are chosen according to (2.10a) the resolution, denoted generally by dmax, reduces to
$$d_{max} = \lambda/(2 \sin\theta_{max}) \ [\text{Å}]. \tag{2.12}$$
For CuKα radiation and θmax = 70° we get dmax = 0.82 Å. Since atoms do not approach closer than this separation, this is not a significant problem. The diffraction experiment is said to be at atomic resolution. In protein crystallography, high-order reflections are generally too weak to be measured. Then θmax must be strongly reduced, so that dmax can increase considerably to values around 2−4 Å and atomic resolution is lost. In any case, for protein structures dmax is always reported as important information and efforts are made to keep dmax as small as possible.
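As a small worked example of (2.12), the following Python sketch computes the resolution for a given θmax and, conversely, the θmax needed for a desired resolution. The function names are illustrative, and the 2 Å target is taken from the lower end of the range quoted above for protein structures.

```python
import math

def resolution(wavelength, theta_max_deg):
    """d_max = lambda / (2 sin(theta_max)), eq. (2.12), in the unit of the wavelength."""
    return wavelength / (2.0 * math.sin(math.radians(theta_max_deg)))

def theta_max_for_resolution(wavelength, d_max):
    """Inverse of (2.12): theta_max in degrees needed to reach the resolution d_max."""
    return math.degrees(math.asin(wavelength / (2.0 * d_max)))

LAMBDA_CU = 1.542  # Angstrom
LAMBDA_MO = 0.711  # Angstrom

print(resolution(LAMBDA_CU, 70.0))               # about 0.82 Angstrom
print(resolution(LAMBDA_CU, 90.0))               # limit lambda/2 of the full sphere
print(theta_max_for_resolution(LAMBDA_MO, 2.0))  # theta_max for 2 Angstrom data with Mo
```

Note that a smaller numerical value of dmax means a better resolution, which is why efforts are made to keep dmax as small as possible.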
2.3 X-rays
2.3.1 Generation of X-rays
X-rays, first discovered by Wilhelm Conrad Röntgen on November 8, 1895, are electromagnetic radiation with wavelengths λ in a range 0.1 < λ < 100 Å. X-rays are produced when electrons of high speed hit a target material and are rapidly decelerated. Their kinetic energy
$$E = \frac{m}{2}v^2$$
is then transformed into X-radiation. Since the X-ray quanta do not absorb the incident energy completely, their energies E = (hc)/λ are spread over a large range and form a continuous spectrum of so-called “white radiation”. Another radiation type of high intensity is obtained if the incident electrons have sufficient energy to remove an electron from an inner orbital of a target atom. Since the excited atom is very unstable, it completes its inner orbital with an electron of an outer shell. The energy difference ΔE gained from the transition of an outer to an inner shell is transformed into X-radiation with a discrete wavelength λ corresponding to ΔE, which is named characteristic radiation:
$$\frac{hc}{\lambda} = \Delta E.$$
Figure 2.5: Experimentally measured intensity distribution of white radiation plotted versus λ for various voltages V. Target material, Cu. The interval plotted by dashed lines is superimposed by the characteristic radiation.
Figure 2.5 shows the intensity distribution I of white radiation plotted versus λ for various voltages V. There is a characteristic limit for a minimum wavelength λmin for every voltage. This wavelength corresponds to the maximum energy gained if the complete kinetic energy
$$E = \frac{m}{2}v^2 = eV$$
of incident electrons is transformed into radiation energy (hc)/λ. Then we get
$$eV = h\,\frac{c}{\lambda_{min}}$$
or
$$\lambda_{min} = \frac{hc}{eV}.$$
Taking the numerical values for the constants
h = 6.626 × 10–34 J s
c = 2.998 × 108 m s–1
e = 1.602 × 10–19 C
we obtain
$$\lambda_{min} = \frac{12.4}{V\,[\text{kV}]} \ [\text{Å}] \tag{2.13}$$
as the minimum wavelength of white radiation to be obtained with a given voltage V. Formula (2.13) is called the Duane–Hunt law. Another feature, which can be derived from Figure 2.5, is that the position of intensity maximum shifts toward smaller wavelengths with increasing voltage. For a fixed voltage the maximum intensity is at a λ-value of about 1.5 times λmin. A good approximation for the white radiation intensity Iw in terms of V, the electronic current i and the atomic number Z of target material is given by
$$I_w = c\,V^2\,i\,Z \tag{2.14}$$
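The numerical factor 12.4 in the Duane–Hunt law (2.13) follows directly from the constants h, c and e. The Python sketch below recomputes it with rounded values of the constants; the function name is illustrative, and 39 kV is the tube voltage quoted in the caption of Figure 2.7.

```python
PLANCK = 6.626e-34             # J s (rounded)
LIGHT_SPEED = 2.998e8          # m / s (rounded)
ELEMENTARY_CHARGE = 1.602e-19  # C (rounded)

def lambda_min_angstrom(voltage_kv):
    """Short-wavelength limit of the white radiation, lambda_min = h c / (e V), in Angstrom."""
    voltage = voltage_kv * 1.0e3                         # kV -> V
    wavelength_m = PLANCK * LIGHT_SPEED / (ELEMENTARY_CHARGE * voltage)
    return wavelength_m * 1.0e10                         # m -> Angstrom

print(lambda_min_angstrom(1.0))   # the factor of eq. (2.13), about 12.4
print(lambda_min_angstrom(39.0))  # limit for a tube operated at 39 kV

# Rough position of the intensity maximum, about 1.5 times lambda_min:
print(1.5 * lambda_min_angstrom(39.0))
```

Doubling the voltage halves λmin, which is the shift of the spectra toward shorter wavelengths seen in Figure 2.5.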
where c is a constant. For most single crystal diffraction experiments, we use a fixed value for λ, and are more interested in the monochromatic lines of characteristic radiation than in the continuous radiation. The exception is the Laue method which uses white radiation. Since this method is hardly used in home X-ray labs, it is not further considered here. To understand why the characteristic lines appear and what types of lines can appear, we have to consider the quantum mechanical model of discrete atomic energy levels. In Figure 2.6, the three inner electron shells are displayed, named K, L, M, corresponding to the principal quantum numbers n = 1, 2, 3. Every shell is separated into 2n–1 discrete energy levels with respect to the three further quantum numbers l, m and s. So we get one level for the K-shell, three for the L-shell, five for the M-shell, etc. Every transition of an outer to an inner level causes a characteristic line. However, there are “selection rules” which make some transitions impossible, for instance that from the innermost L-level (LI) to the K-shell. The nomenclature of characteristic lines is defined in the following way: The radiation type is defined by the capital letter of the accepting shell. If the donor shell is the neighboring shell, the radiation is said to be α-radiation; if it is the next shell, it is called β-radiation (although for historical reasons, there are some exceptions).
Figure 2.6: Energy levels and transitions causing characteristic X-ray lines.
The most important radiation types for the purpose of X-ray diffraction are the Kα1 and Kα2 and the Kβ-radiation. Kα1 and Kα2 wavelengths result from the two possible transitions of the LII and LIII-level to the K-shell. Since the energy difference between the two L-levels is small, the wavelengths of Kα1 and Kα2-radiation are close together. Therefore, under some experimental conditions, the two lines are not separated and coalesce in one line, the Kα-line. The same holds for the Kβ-line. Actually, we have a doublet of the β1 and β3 line due to the MIII → K and MII → K transition, but in practice they are not separated, and are therefore always used as Kβ-radiation. Although a large number of further lines exists, they are of no practical importance in crystal structure analysis because of their weak intensities. Since an electron jump to the K-shell is more likely from the L-shell than from the M-shell, the Kα-line has the greatest intensity. For the wavelength λ of a characteristic line, Balmer’s formula holds
$$\frac{1}{\lambda} = R\,(Z - \sigma)^2\left(\frac{1}{n^2} - \frac{1}{m^2}\right) \tag{2.15}$$
with R = 1.097 × 105 cm–1 = Rydberg constant, n, m are the principal quantum numbers of participating shells, and σ is a screening constant taking repulsion of other electrons into account. For n = 1 and m → ∞ we get a special wavelength λK with the corresponding energy
$$\Delta E_K = h\,\frac{c}{\lambda_K}$$
necessary to remove an electron from the K-shell to infinity. It follows from (2.13) that a minimum voltage VK of
$$V_K = \frac{12.4}{\lambda_K} \tag{2.16}$$
(VK in kV, λK in Å) is necessary for the excitation of λK. This voltage, VK, is called the excitation potential of the K-series. For fixed n and m, the factor
$$R\left(\frac{1}{n^2} - \frac{1}{m^2}\right)$$
is a constant p and we get
$$\frac{1}{\lambda} = p\,(Z - \sigma)^2. \tag{2.17}$$
This is Moseley’s law. It states that the wavelength of the characteristic radiation is inversely proportional to (Z − σ)2, i.e. it decreases approximately with the second power of the atomic number Z. Since 1/λ is proportional to the radiation energy and since radiation is said to become harder with increasing energy, we can state that the degree of hardness of a radiation increases with the second power of Z. For the intensity IK of K-radiation, a good approximation is given by
$$I_K = L\,(V - V_K)^{1.5}\,i \tag{2.18}$$
where i is the electronic current as in (2.14) and L has a particular value for every type of line, depending on the transition probability of the corresponding pair of shells. For the Kα1–Kα2 doublet, the intensity ratio is approximately
$$\frac{I(K\alpha_1)}{I(K\alpha_2)} = 2. \tag{2.19}$$
The ratio of integral Kα-radiation to integral Kβ-radiation is approximately
$$\frac{I(K\alpha)}{I(K\beta)} = 5. \tag{2.20}$$
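To get a feeling for formula (2.15), one can estimate the Kα wavelengths of copper and molybdenum. The sketch below assumes a screening constant σ = 1 for the L → K transition (n = 1, m = 2); this value is a common textbook approximation and is not taken from the text above, so the results should be read as rough estimates only.

```python
RYDBERG_PER_CM = 1.097e5  # Rydberg constant in 1/cm, as quoted with eq. (2.15)

def characteristic_wavelength(z, n, m, sigma=1.0):
    """Wavelength in Angstrom from 1/lambda = R (Z - sigma)^2 (1/n^2 - 1/m^2).

    sigma = 1 is an assumed screening constant for K-alpha lines.
    """
    inverse_wavelength = RYDBERG_PER_CM * (z - sigma) ** 2 * (1.0 / n ** 2 - 1.0 / m ** 2)
    return 1.0e8 / inverse_wavelength  # cm -> Angstrom

lambda_cu = characteristic_wavelength(z=29, n=1, m=2)  # copper
lambda_mo = characteristic_wavelength(z=42, n=1, m=2)  # molybdenum
print(lambda_cu, lambda_mo)

# Moseley's law (2.17): the ratio of wavelengths follows the squared ratio of (Z - sigma).
print(lambda_cu / lambda_mo, (42 - 1) ** 2 / (29 - 1) ** 2)
```

With this simple assumption the estimates agree with the values λ(Cu) ≈ 1.542 Å and λ(Mo) ≈ 0.711 Å used in Section 2.2 to within about two percent.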
Figure 2.7 shows a complete intensity diagram of the continuous and characteristic Kα- and Kβ-radiation. Since we intend to use the monochromatic radiation, we try to reduce, or better, to remove, all but the Kα-radiation. The optimum ratio, Q, of characteristic to white radiation, in terms of V, is given by setting its first derivative equal to zero, where, from the ratio of (2.18) and (2.14), we have
$$Q = \frac{I_K}{I_W} = \frac{L\,(V - V_K)^{1.5}\,i}{c\,V^2\,i\,Z} = L'\,\frac{(V - V_K)^{1.5}}{V^2}.$$
Then
$$0 = \frac{dQ}{dV} = L'\,\frac{1.5\,(V - V_K)^{0.5}\,V^2 - 2V\,(V - V_K)^{1.5}}{V^4}.$$
Figure 2.7: Complete spectrum of white and characteristic radiation from a copper X-ray tube, taken experimentally at 39 kV.
It follows that 1.5 V = 2 (V – VK) or
$$V = 4\,V_K. \tag{2.21}$$
From (2.21) we see that we have to apply a voltage four times VK to get the best ratio of IK over IW. However, the white and β-radiation are not removed. This is done, at least in part, by absorption, discussed in the next section.
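The result (2.21) can be confirmed numerically by searching for the maximum of Q on a fine voltage grid. In the sketch below the constant L′ is dropped, since it does not affect the position of the maximum, and VK = 9.0 kV is an illustrative round number of the order of the K excitation potential of copper. Function names and grid parameters are assumptions.

```python
def characteristic_to_white_ratio(voltage, v_k):
    """Q / L' = (V - V_K)^1.5 / V^2 for V > V_K."""
    return (voltage - v_k) ** 1.5 / voltage ** 2

def best_voltage(v_k, step=0.01, v_max_factor=10.0):
    """Grid search for the voltage maximizing Q between V_K and v_max_factor * V_K."""
    steps = int((v_max_factor - 1.0) * v_k / step)
    candidates = [v_k + i * step for i in range(1, steps + 1)]
    return max(candidates, key=lambda v: characteristic_to_white_ratio(v, v_k))

v_k = 9.0  # kV, illustrative excitation potential
v_opt = best_voltage(v_k)
print(v_opt, 4.0 * v_k)

# The maximum is rather flat: at 3 V_K and 5 V_K the ratio is still close to its optimum.
q_opt = characteristic_to_white_ratio(v_opt, v_k)
print(characteristic_to_white_ratio(3.0 * v_k, v_k) / q_opt)
print(characteristic_to_white_ratio(5.0 * v_k, v_k) / q_opt)
```

Because the maximum is flat, moderate deviations from 4 VK cost little in terms of the ratio of characteristic to white radiation.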
2.3.2 Absorption
X-rays are attenuated when passing through matter. This loss of radiation is due to two effects, photoelectric absorption and scattering. If an X-ray beam of primary intensity I0 passes a sample of homogeneous material of thickness dx, the loss dI at the element dx is proportional to I0 and dx:
$$dI = -\mu I_0\, dx.$$
By integration, we get
$$I = I_0\, e^{-\mu x} \tag{2.22}$$
an expression called Beer’s law. The factor μ, having the dimension of a reciprocal length, is called the linear absorption coefficient. Taking the two sources of absorption into account, μ is sometimes decomposed into a sum of μ = s + t, with s as the scattering and t as the absorption coefficient. For our purposes, it is sufficient to operate with μ only. Absorption is an additive atomic property of matter, which means that its magnitude does not depend on the physical or chemical state of the atoms. The absorption coefficient of any kind of matter can therefore be calculated by a simple addition, if the absorption coefficients of contributing elements are known. In addition to the linear absorption coefficient, two further types of absorption coefficients are used. The first, denoted by μm, is the ratio of μ and the density ρ, and is called the “mass absorption coefficient”,
$$\mu_m = \frac{\mu}{\rho}. \tag{2.23}$$
The second, called the atomic absorption coefficient, is useful in taking the atomic properties of absorption into account. It is denoted μa and defined by the following consideration (see Figure 2.8). If an X-ray beam of cross section q passes through a sample of matter with thickness dx, the attenuation dI is proportional to the number dN of atoms passed:
$$dI = -\mu_a I_0\, dN.$$
Figure 2.8: Attenuation of an X-ray beam by a layer of atoms.
The volume element dV = q dx may contain dM atoms, then dN = dM/q. With L being Avogadro’s number, A the atomic weight, and ρ the density, we get
$$dM = \frac{dm\,L}{A}$$
if dm = ρ dV is the mass of dV. Then
$$dN = \frac{dm\,L}{qA} = \frac{\rho\, dV\, L}{qA} = \frac{\rho\, q\, dx\, L}{qA} = \frac{\rho L}{A}\, dx,$$
hence
$$dI = -\mu_a\, \frac{\rho L}{A}\, I_0\, dx,$$
or by integration
$$I = I_0 \exp\left(-\mu_a\, \frac{\rho L}{A}\, x\right).$$
Comparing this result with (2.22) and (2.23), we get
$$\mu = \mu_a\, \frac{\rho L}{A} \tag{2.24a}$$
$$\mu_m = \mu_a\, \frac{L}{A} \tag{2.24b}$$
$$\mu_a = \frac{A}{L}\, \frac{\mu}{\rho}. \tag{2.24c}$$
Since μ has the dimension of a reciprocal length, the dimensions of μm and μa are cm2 g–1 and cm2, respectively. The dimension of μa is that of a “cross section” and therefore μa is sometimes called the “atomic cross section for absorption”. Since absorption depends only on the atomic properties of matter, the molecular absorption coefficient μmol is
$$\mu_{mol} = \sum_i N_i\, \mu_{a_i} = \frac{1}{L} \sum_i N_i \left(\frac{\mu}{\rho}\right)_i A_i$$
if the molecule is composed of Ni atoms of type i. In analogy to (2.24b), μm,mol is defined by μm,mol = μmol (L/Mr) if Mr is the molecular weight. Then we get for an arbitrary compound
$$\frac{\mu}{\rho} = \mu_{m,mol} = \sum_i N_i\, \frac{A_i}{M_r} \left(\frac{\mu}{\rho}\right)_i.$$
Setting
$$g_i = \frac{N_i A_i}{M_r},$$
we get
$$\frac{\mu}{\rho} = \sum_i g_i \left(\frac{\mu}{\rho}\right)_i \tag{2.25}$$
with gi being the mass fraction of element i contributing to the compound. In terms of μa for an absorber of volume V, we get for μ, with (2.24a):

$$\mu = \mu_{mol} \frac{\rho L}{M_r} = \mu_{mol} \frac{m L}{V M_r}.$$

With M = mL/Mr being the number of molecules in the volume V, we have ni = M Ni as the number of atoms of type i in the volume V. Then

$$\mu = \frac{1}{V} \sum_i n_i\,\mu_{a,i} \qquad (2.26)$$
If V = Vc is the unit cell volume of a single crystal and n is the number of molecules in the unit cell, we get

$$\mu = \frac{n}{V_c} \sum_i n_i\,\mu_{a,i} \qquad (2.27a)$$

or

$$\mu = \frac{n}{L V_c} \sum_i n_i \left(\frac{\mu}{\rho}\right)_i A_i \qquad (2.27b)$$
where the summation is taken over the atoms of one molecule. The quantities μa and (μ/ρ) are dependent only on the absorbing material. The (μ/ρ) values are tabulated in the “International Tables for X-Ray Crystallography” (2004), Vol. C for the wavelengths most frequently used and for nearly all elements [4] (for some more details about the International Tables and further volumes, see Section 4.1.4).

Let us, for example, calculate μ for the four structures we wish to determine. All quantities needed are given in Table 2.1. Since in crystal structure analysis generally CuKα or MoKα radiation is of interest, we calculate μ for these radiation types only (cell volume Vc and n are taken from the results obtained later). The sum corresponding to equation (2.27b) reads for KAMTRA:

$$\mu(\mathrm{CuK}\alpha) = \frac{4}{628.5 \times 6.022}\,(4 \times 4.51 \times 12.011 + 5 \times 0.391 \times 1.008 + 6 \times 11.5 \times 15.999 + 1 \times 145 \times 39.102) \times 10^{24} \times 10^{-23}.$$

Note the two different powers of 10. The factor 10²⁴ derives from the cell volume, which is usually given in Å³, so that it must be transformed by 10⁻²⁴ into cm³. The factor 10⁻²³ results from Avogadro’s number. We then get

$$\mu(\mathrm{CuK}\alpha) = \frac{4}{628.5 \times 6.022}\,(216.68 + 1.97 + 1103.93 + 5669.79) \times 10 = 73.9\ \mathrm{cm^{-1}}.$$
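The KAMTRA calculation can be reproduced with a short Python sketch. It evaluates μ once from the unit cell via (2.27b) and once via the mass fractions of (2.25) multiplied by the density, so the two routes can be compared. The function and variable names are illustrative only; the atomic weights, the CuKα values of μ/ρ, the cell volume and n are those used in the worked example and in Table 2.1, and the density is not a measured value but is derived here from n, Mr and Vc.

```python
AVOGADRO = 6.022e23  # L in the text [1/mol], rounded as in the worked example

# Atomic weights and CuKα mass absorption coefficients [cm^2/g] from the example
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999, "K": 39.102}
MU_RHO_CU = {"C": 4.51, "H": 0.391, "O": 11.5, "K": 145.0}


def linear_mu_from_cell(formula, mu_rho, cell_volume_A3, n_molecules):
    """Eq. (2.27b): mu = n / (L * Vc) * sum_i n_i (mu/rho)_i A_i, in 1/cm."""
    total = sum(count * mu_rho[el] * ATOMIC_WEIGHT[el] for el, count in formula.items())
    cell_volume_cm3 = cell_volume_A3 * 1e-24  # 1 Å^3 = 1e-24 cm^3
    return n_molecules * total / (AVOGADRO * cell_volume_cm3)


def molecular_weight(formula):
    """Molecular weight M_r as the sum of N_i * A_i."""
    return sum(count * ATOMIC_WEIGHT[el] for el, count in formula.items())


def mass_mu_compound(formula, mu_rho):
    """Eq. (2.25): mu/rho = sum_i g_i (mu/rho)_i with mass fractions g_i = N_i A_i / M_r."""
    m_r = molecular_weight(formula)
    return sum(count * ATOMIC_WEIGHT[el] / m_r * mu_rho[el] for el, count in formula.items())


kamtra = {"C": 4, "H": 5, "O": 6, "K": 1}  # C4H5O6K
v_cell, n_mol = 628.5, 4                   # Å^3 and molecules per cell

mu_cell = linear_mu_from_cell(kamtra, MU_RHO_CU, v_cell, n_mol)

# Density derived from the cell contents (an assumption of this sketch)
density = n_mol * molecular_weight(kamtra) / (AVOGADRO * v_cell * 1e-24)
mu_mass = mass_mu_compound(kamtra, MU_RHO_CU) * density

print(f"mu from (2.27b): {mu_cell:.1f} 1/cm, from (2.25) and density: {mu_mass:.1f} 1/cm")
```

Both routes are algebraically identical, so any difference between the two numbers would indicate a bookkeeping error in the formula or the units.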
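Table 2.1: Numerical values of atomic absorption coefficients needed for the calculation of μ for the test structures.

| Compound | Vc [Å³] | n | Element | μ/ρ CuKα [cm² g⁻¹] | μ/ρ MoKα [cm² g⁻¹] | μ CuKα [cm⁻¹] | μ MoKα [cm⁻¹] |
|---|---|---|---|---|---|---|---|
| KAMTRA (C4H5O6K) | 628.5 | 4 | C | 4.51 | 0.576 | 73.9 | 8.2 |
| | | | H | 0.391 | 0.371 | | |
| | | | O | 11.5 | 1.22 | | |
| | | | K | 145 | 16.2 | | |
| SUCROS (C12H22O11) | 716.5 | 2 | C | 4.51 | 0.576 | 12.4 | 1.4 |
| | | | H | 0.391 | 0.371 | | |
| | | | O | 11.5 | 1.22 | | |
| B12 (C72H136CoN14O29P) | 8963.0 | 4 | C | 4.51 | 0.576 | 23.7 | 2.9 |
| | | | H | 0.391 | 0.371 | | |
| | | | Co | 321 | 41 | | |
| | | | N | 7.44 | 0.845 | | |
| | | | O | 11.5 | 1.22 | | |
| | | | P | 75.5 | 7.97 | | |
| C60F18 (C60F18) | 3593.8 | 4 | C | 4.51 | 0.576 | 16.0 | 1.8 |
| | | | F | 15.8 | 1.63 | | |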
Figure 2.9: Schematic representation of μ/ρ plotted versus λ.
By similar calculations we get all further linear absorption coefficients listed in the last two columns of Table 2.1. As expected, μ is always significantly smaller for MoKα radiation than for CuKα. The magnitude of μ will be an important consideration when the choice of radiation for intensity measurement is made. In the case of C60F18, the data collection was carried out with synchrotron radiation at λ = 0.6214 Å, an even shorter wavelength than that of MoKα radiation. Hence for this wavelength μ is still lower (μ = 1.4 cm⁻¹) than in the MoKα case.

For a given element, μa and μ/ρ depend on the wavelength λ. Figure 2.9 shows the characteristic distribution of μ/ρ plotted versus λ. Sharp discontinuities occur at particular values of λ. These discontinuities are called absorption edges, found once in the K-series, and three times in the L-series. An explanation for this rapid increase when approaching the absorption edge from the long-wave side can be given by considering the energy level model in Figure 2.6. When the X-ray quanta have an energy less than the energy necessary to remove an electron, say from the K-shell, their photoelectrical absorption is possible (more or less probable) only with respect to outer shells. If the critical energy is EK and λ reaches the corresponding magnitude λK = hc/EK, photoelectrical absorption with respect to a further shell, the K-shell, can take place. Therefore μ/ρ increases discontinuously at λK. It is clear that absorption edges are present for every level; hence we have one absorption edge for the K-shell, three for the L-shell, and five or more for the outer shells. From this explanation of absorption edges, it follows that the wavelength of the edge is equal to λK defined in (2.16) and the result of (2.21) can now be explained as follows.

The wavelength λK corresponding to the excitation potential VK is equal to the wavelength of the K-absorption edge (of course, the same holds for all other energy levels) and can be determined experimentally from this property. If we calculate VK from (2.16) we get an optimum ratio of characteristic to white radiation with a voltage V = 4 VK. For the absorption discontinuities it is very difficult to give an equation relating μa to λ or Z. Between the absorption edges a good approximation is given by

$$\mu_a = c\,\lambda^3 Z^3, \quad c = \mathrm{const.} \qquad (2.28)$$
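As a rough numerical illustration of (2.28) and of the relation λK = hc/EK, the following Python sketch compares the λ³ prediction for the CuKα/MoKα absorption ratio with the ratios that follow from the μ/ρ values in Table 2.1. The function names are hypothetical, the value hc ≈ 12.398 keV Å is a standard constant not quoted in the text, and the edge energy of 9 keV is the approximate copper value VK ≈ 9 kV mentioned in this chapter.

```python
HC_KEV_ANGSTROM = 12.398  # h*c in keV·Å (standard value, rounded)


def edge_wavelength(edge_energy_kev):
    """Wavelength of an absorption edge in Å from its energy in keV, lambda_K = hc / E_K."""
    return HC_KEV_ANGSTROM / edge_energy_kev


def absorption_ratio(lambda_1, lambda_2):
    """mu_a(lambda_1) / mu_a(lambda_2) from eq. (2.28) for one element.

    Only meaningful if no absorption edge lies between the two wavelengths.
    """
    return (lambda_1 / lambda_2) ** 3


ratio_model = absorption_ratio(1.5418, 0.7107)  # CuKα versus MoKα

# Ratios actually tabulated for light elements in Table 2.1
ratio_carbon = 4.51 / 0.576
ratio_oxygen = 11.5 / 1.22

print(f"lambda^3 model: {ratio_model:.1f}, C: {ratio_carbon:.1f}, O: {ratio_oxygen:.1f}")
print(f"K edge for E_K = 9 keV: {edge_wavelength(9.0):.3f} Å")
```

The tabulated ratios fall somewhat below the simple λ³ estimate, which is consistent with (2.28) being described as an approximation rather than an exact law.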
(2.28) tells us that absorption can be reduced by using hard radiation and that hard radiation is essential if the material to be examined is composed of elements with large atomic number.

2.3.3 Filters, monochromators

As mentioned before, we wish to carry out our diffraction experiments with monochromatic radiation, so it has to be discussed how to suppress the unwanted radiation. This can be done either with filters or monochromators.
The large variation of absorption near the K-absorption edge is a useful property for filtering out Kβ-radiation. Figure 2.10 shows the λ-dependence of the intensity distribution of copper (Z = 29) radiation together with the mass absorption coefficient of nickel (Z = 28). The absorption edge of nickel is situated between the CuKβ and the Kα lines, attenuating Kβ-radiation much more (about a factor of 5) than the Kα. Experimental investigations show that the use of a nickel filter with a thickness of 0.015 mm changes the ratio of I(Kα) : I(Kβ) to

$$\frac{I(\mathrm{K}\alpha)}{I(\mathrm{K}\beta)} \approx 100. \qquad (2.29)$$

Comparing (2.29) to (2.20), we see that the ratio of I(Kα) to I(Kβ) is increased by a factor of 20. However, this improvement is accompanied by a loss of almost 50 % of the Kα-radiation. It is common practice to use as a filter material an element whose atomic number is one or two less than that of the target element, since the absorption edge of the filter then lies between the Kβ- and the Kα-line. A summary of radiation types most frequently used and their filter materials is given in Table 2.2 (from International Tables
Figure 2.10: Intensity of the Cu radiation and the mass absorption coefficient of nickel, both plotted versus λ.
Table 2.2: K wavelengths and β-filters for some commonly used X-ray tube target elements [5]. Wavelengths are given in Å.

| Target element | β-filter | Kα1 | Kα2 | Kα* | Kβ1 |
|---|---|---|---|---|---|
| Cr | V | 2.289726(3) | 2.293651(3) | 2.2909 | 2.084881(4) |
| Fe | Mn | 1.936041(3) | 1.939973(3) | 1.9373 | 1.756604(4) |
| Co | Fe | 1.788996(1) | 1.792835(1) | 1.7905 | 1.620826(3) |
| Cu | Ni | 1.5405929(5) | 1.5444274(5) | 1.5418 | 1.392234(6) |
| Mo | Zr | 0.7093172(4) | 0.71361(1) | 0.7107 | 0.63230(1) |
| Ag | Pd | 0.5594218(8) | 0.563813(3) | 0.5608 | 0.497082(6) |
for X-Ray Crystallography (2004), Vol. C) [5]. Note that Kα* indicates a weighted average of the Kα1 and Kα2 lines. The use of filters has the advantage that they can easily be inserted as small foils into the primary beam without further alignment. The disadvantage is, however, that the Kβ-line is not completely suppressed and is visible in the profiles of strong reflections.

An alternative is the use of monochromator crystals. If the lattice plane of a strong reflection having d = 1/|h| is oriented at an angle θ with respect to the beam, for which λ satisfies Bragg’s law, then only that part of the primary radiation having this wavelength will be reflected and further used. Monochromators for laboratory use (see an example in Figure 2.11) contain pyrolytic graphite crystals of either planar or bent shape; the latter have focusing properties and contribute to an intensity increase. For the graphite monochromator the 002 reflection, having d = 3.357 Å, is used. The monochromator angle can easily be calculated as follows: From Bragg’s equation we obtain sin θ = λ/(2d). Choosing, for example, λ = 1.5418 Å for CuKα radiation, θ = 13.28° is obtained. Setting the monochromator at this angle, the beam is monochromated in favor of this wavelength. For MoKα and AgKα, the corresponding angles are θ = 6.076° and θ = 4.773° [7]. An accurate alignment of the monochromator, which is provided by set screws on the monochromator housing, has to be ensured for an optimal intensity yield.

With a monochromator the Kβ-radiation is eliminated, but no separation of the energetically close Kα1- and Kα2-lines is made. The reason is that the monochromator crystal is not ideally perfect but a real crystal (as most crystals are), subdivided into a large number of mosaic blocks, all misoriented over a certain angular range (for more details see Section 8.2.3). This so-called mosaic spread is around 0.4° for graphite monochromator crystals, so that a small energy range is allowed to pass. It should not be overlooked that the advantage of a better monochromatic beam comes at the expense of a stronger loss of intensity compared to the β-filter and, moreover, the homogeneity of the focal spot is reduced. The plateau of constant intensity at the sample site is smaller than for filtered radiation, with consequences for the choice of the crystal size.
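The monochromator angle follows directly from sin θ = λ/(2d). A minimal Python sketch, using the graphite 002 spacing quoted above and the Kα* wavelengths for Cu and Mo from Table 2.2, could look as follows; the function name is illustrative only.

```python
import math


def bragg_angle_deg(wavelength, d_spacing):
    """Bragg angle theta in degrees from sin(theta) = lambda / (2 d); both lengths in Å."""
    s = wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("no Bragg reflection possible: lambda / (2d) must lie in (0, 1]")
    return math.degrees(math.asin(s))


D_GRAPHITE_002 = 3.357  # Å, spacing of the graphite 002 reflection

for label, lam in (("CuKα", 1.5418), ("MoKα", 0.7107)):
    print(f"{label}: theta = {bragg_angle_deg(lam, D_GRAPHITE_002):.3f} deg")
```

The check on λ/(2d) reflects the fact that a reflection with spacing d can only diffract wavelengths up to 2d.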
Figure 2.11: An example of a commercial graphite monochromator (Huber, type 151) [6] by courtesy of Huber GmbH & Co. KG, Rimsting, Germany.
As long as point detectors were used exclusively for intensity data collection, one-dimensional reflection profiles were measured, allowing a rather convenient subtraction of the unwanted background by appropriate software. Under these experimental conditions it was always a matter of debate whether filtered or monochromatic radiation was to be chosen. Laboratory experience was that filtered radiation frequently gave a more accurate set of intensity data. Since area detection is now more and more preferred, monochromatic radiation is used exclusively, because all data processing routines rely on this type of radiation.
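Returning to the nickel filter discussed at the beginning of this section, the quoted figures can be checked for internal consistency with Beer's law. The sketch below is a back-of-the-envelope estimate, not a use of tabulated coefficients: it assumes that "almost 50 %" Kα loss means a transmission of exactly 0.5, and that the factor 20 improvement in I(Kα)/I(Kβ) is entirely due to the filter. The resulting effective coefficients are therefore only indicative.

```python
import math


def effective_mu(transmission, thickness_cm):
    """Effective linear absorption coefficient [1/cm] from I/I0 = exp(-mu * x)."""
    return -math.log(transmission) / thickness_cm


thickness = 0.0015          # cm, i.e. the 0.015 mm nickel foil
t_alpha = 0.5               # assumed Kα transmission ("almost 50 %" loss)
t_beta = t_alpha / 20.0     # Kβ must be weakened 20 times more than Kα

mu_alpha = effective_mu(t_alpha, thickness)
mu_beta = effective_mu(t_beta, thickness)
ratio = mu_beta / mu_alpha

print(f"mu(Kα) ≈ {mu_alpha:.0f} 1/cm, mu(Kβ) ≈ {mu_beta:.0f} 1/cm, ratio ≈ {ratio:.1f}")
```

Under these assumptions the ratio of the two effective coefficients comes out slightly above 5, in line with the statement that nickel attenuates CuKβ about a factor of 5 more strongly than CuKα.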
2.3.4 X-ray tubes

A number of X-ray tubes of different types are available commercially today. Although we shall not discuss construction details (interested readers may consult the manufacturer's instruction manual), some of their principal properties can be described. We nowadays have the choice between conventional sealed tubes, which have been in use in most laboratories for decades, rotating anode setups, and, more recently, microfocus X-ray sources [8]. X-rays from synchrotron sources are a special issue and will be discussed in Section 2.3.5.

Figure 2.12 shows a schematic representation of a sealed X-ray tube and an image of a conventional tube for diffraction experiments. Electrons from the heated cathode are accelerated by high voltage towards the anode. The X-rays emitted from the anode target leave the tube through a window which is normally made of beryllium. Since the intensity maximum of X-rays is found to be at a take-off angle of approximately 10°, windows are positioned at such a distance from the target as to realize this favorable angle.

When deciding which type of X-ray tube to use, the practical crystallographer must be aware of three features. The first is the choice of target material and the radiation filter or monochromator. Tubes with Cr, Fe, Cu, Mo, and Ag targets are commercially available. As mentioned before, Cu and Mo tubes are adequate for 99 % of all problems in single crystal diffractometry. A Cu tube is most useful for crystals with
Figure 2.12: (a) Schematic representation of a sealed X-ray tube. (b) Conventional sealed X-ray tube in laboratory use. Warning: On mounting the tube the user should not touch the highly toxic beryllium windows.
large unit cell dimensions, because there is a danger that neighboring reflection profiles may overlap with harder X-radiation. On the other hand absorption is lower with Mo tubes, so as a rough guide Cu radiation should be used for large organic or protein structures, where absorption is not a severe problem, while for inorganic structures with high absorption but generally smaller unit cells Mo tubes are the first choice.

The second and third important properties of an X-ray tube are the size of the focal spot and the power, since the X-ray intensity at the sample depends on these two parameters. The shape of the target is generally rectangular with dimensions a : b = 6 : 1. At a take-off angle of 10° we then get a square focus of size b × b (see Figure 2.13) in the direction parallel to the long dimension of the target, and a line focus of size 6b × b/6 in the perpendicular direction. The four tube windows therefore provide two line and two square foci from each X-ray tube. For single crystal diffraction, only the square focus is used. The two line foci can be used for powder diffraction experiments. Since circular collimators are used to limit the incident beam, we always have a circular X-ray beam with a diameter less than b at the crystal. Because the diffracted intensity depends on the crystal volume, care has to be taken that the largest dimensions of the crystal are less than the diameter of the primary beam at the crystal. Crystallographers say the crystal should be “bathed” in the beam. Since the edge of the primary
Figure 2.13: (a) Reduction of X-ray beam diameter to one sixth of target length at a 10° take-off angle (since sin 10° ≈ 1/6); (b) target dimensions and focus size.
beam is sometimes less homogeneous than the center, the crystal size should, in fact, be significantly smaller than the diameter of the X-ray beam. Since X-ray collimators generally have diameters between 0.5 and 1.0 mm, the size of single crystals used in crystal structure analysis is always smaller than 1 mm. A useful estimate of an optimum crystal size is obtained by application of Beer’s law, equation (2.22). Assuming a crystal of volume ≈ x³, we have I0 ∼ V ≈ x³ and I ∼ x³ exp(−μx). A maximum of I is obtained in terms of x by setting its first derivative equal to zero:

$$\frac{dI}{dx} \sim -\mu x^3 e^{-\mu x} + 3x^2 e^{-\mu x} = 0.$$

Hence

$$x = \frac{3}{\mu}. \qquad (2.30)$$
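To get a feeling for the sizes implied by (2.30), the following Python sketch evaluates x = 3/μ for the CuKα values of μ listed in Table 2.1 and confirms numerically that x³ exp(−μx) is larger at this point than slightly to either side. The function names are hypothetical and the model is the simple cube of edge length x used in the derivation.

```python
import math


def relative_intensity(x_cm, mu_per_cm):
    """Diffracted intensity up to a constant factor, I ~ x^3 * exp(-mu * x)."""
    return x_cm ** 3 * math.exp(-mu_per_cm * x_cm)


def optimum_size_mm(mu_per_cm):
    """Optimum edge length from eq. (2.30), x = 3/mu, converted from cm to mm."""
    return 10.0 * 3.0 / mu_per_cm


# Linear absorption coefficients for CuKα radiation from Table 2.1 [1/cm]
MU_CU = {"KAMTRA": 73.9, "SUCROS": 12.4, "B12": 23.7, "C60F18": 16.0}

for name, mu in MU_CU.items():
    print(f"{name}: x_opt = {optimum_size_mm(mu):.2f} mm")
```

For the weakly absorbing compounds the formula yields sizes above 1 mm, which already exceed the collimator diameters of 0.5 to 1.0 mm mentioned above.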
However, equation (2.30) is applicable only if μ >> 30 cm⁻¹. Otherwise the collimator size is the more restrictive limit.

The power dissipation of modern sealed X-ray tubes with a stationary anode varies from 2 to 3 kW. Since the optimum voltage for a given target material does not depend on the tube’s power, but on equation (2.21), the full capacity can be utilized by choosing the current as high as possible. If we have, as an example, a copper tube with a power of 2.0 kW, voltage V and current i are chosen as follows. VK is about 9 kV; from (2.21) we obtain a best voltage V = 4 × VK = 36 kV. Setting i = 55 mA we have the best condition for the optimum intensity of X-radiation. In practical work, i should be chosen smaller, both to avoid overloading the tube and to extend the lifetime of the rather expensive tube.

The conversion of the incident energy of the electron beam into X-radiation is a very inefficient process. Less than 1 % is transformed into radiation; the remaining energy appears as heat, which has to be dissipated by a technically rather demanding water-cooling system. It normally consists of a closed cycle system which has to provide a circulation of at least 3–4 liters per minute for sufficient cooling. To make sure that the tube is not damaged by overheating, a safe cooling process has to be ensured and supervised by appropriate water-flow monitoring devices, which should switch off the generator once the water flow falls below a critical threshold. Care has also to be taken that clean water is always in the circuit, since contamination by even small solid particles may block the small slits next to the anode and interrupt a continuous water flow.

Even with a properly working cooling system the heat problem cannot be solved completely, so that progress in improving the power dissipation of sealed tubes has been slow in the last decades and has reached only a factor of 2 or 3. To avoid the heat concentration on the small stationary anode target, rotating anodes were introduced to distribute the incident energy over a larger area. In an evacuated system, an anode target disc rotates at several thousand revolutions per minute and in this way offers the electron beam from the cathode an area of given radius and thickness which is substantially larger than that of a stationary anode. That is why the power dissipation is much higher than for sealed tubes, especially if a small focal spot is utilized. Rotating anodes are commercially offered as 5 to 30 kW versions. Although the intensity increase of rotating anodes is appreciated, this advantage comes at the expense of permanent maintenance of the vacuum and the mechanical setup. Moreover, the cathode filament, producing the electron beam, and the anode target have to be replaced from time to time, so that in addition to considerable maintenance also non-negligible running costs have to be taken into account.

The intensity obtained from an X-ray source does not depend only on the power but also on the focus size.
It is evident that, for a given crystal size, no intensity is gained in the diffraction experiment by changing to a tube with double the power if the focus size is doubled at the same time. Only by improving the ratio of power to focus size can a higher intensity be obtained. That is why a very interesting development has taken place in the last few years with the introduction of the microfocus X-ray sources. The basic idea is a focusing of the incident beam to a small spot of high intensity by appropriate X-ray optics such as mirrors or tapered capillaries. A first microfocus tube using X-ray mirrors was described by Arndt et al. as early as the 1990s [9, 10]. Recently, microfocus X-ray sources have become commercially available, among them the Incoatec Microfocus Source (IμS). It is a sealed tube with a two-dimensional focusing multilayer mirror optic producing a highly intense focal spot of ∼ 180 μm or less. Since the power dissipation is only 30 W, air-cooling is sufficient, so that the complex water-cooling described above for conventional sealed tubes is not needed. IμS is available for CrKα, CuKα, MoKα, and AgKα radiation. Typical generator settings for MoKα are 50 kV and 0.6 mA. The manufacturers report a performance exceeding that of 5 kW rotating anode sources. As further advantages they point out an easy and stable mounting and little or almost no maintenance [11].
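The generator settings quoted for the sealed copper tube and for the microfocus source are simple arithmetic on voltage, current and power, which the following Python sketch makes explicit. The function names are illustrative; the rule V = 4 VK is the one the text takes from (2.21), and the numbers are those given above.

```python
def optimum_voltage_kv(excitation_kv):
    """Tube voltage for the best ratio of characteristic to white radiation, V = 4 * V_K."""
    return 4.0 * excitation_kv


def max_current_ma(power_kw, voltage_kv):
    """Largest current in mA allowed by the rated power: i = P / V (kW / kV = A)."""
    return 1000.0 * power_kw / voltage_kv


def power_w(voltage_kv, current_ma):
    """Electrical power in W from kV and mA (kV * mA = W)."""
    return voltage_kv * current_ma


# Sealed copper tube: V_K about 9 kV, rated power 2.0 kW
v_cu = optimum_voltage_kv(9.0)
i_cu = max_current_ma(2.0, v_cu)
print(f"Cu sealed tube: V = {v_cu:.0f} kV, i_max = {i_cu:.1f} mA")

# Microfocus source with the typical MoKα settings 50 kV and 0.6 mA
print(f"Microfocus source: P = {power_w(50.0, 0.6):.0f} W")
```

The computed current limit lies just above the 55 mA quoted in the text, and the microfocus settings reproduce the stated power of 30 W.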
Figure 2.14: Different shapes of the IμS beam (left) and the one of a conventional sealed X-ray tube (right). Reproduced from ref. [12] with permission of the IUCr.
The group of Stalke (Göttingen, Germany) [12] reported single crystal diffraction experiments with a microfocus X-ray source on the one hand and a conventional sealed tube on the other hand, with both sources mounted on the same goniometer and otherwise exactly the same experimental conditions. The beam characteristics and the obtained data sets were analyzed. They measured the shape of the IμS beam and that of the sealed tube (Figure 2.14). It could clearly be seen that the microfocus source produces a narrow sharp beam profile (half width 0.16 mm) with higher peak intensity than the sealed tube, which in turn provides a plateau of stable intensity of about 0.5 mm. From these beam specifications and their experiences with several data collections from both X-ray sources the authors conclude that for larger crystals (dimensions ≥ 300 μm) both sources provide data sets of comparable quality, while for small crystals (≤ 100 μm) the IμS is the first choice. However, the authors also point out that due to the narrow inhomogeneous beam profile of the microfocus source the crystal volume exposed to the beam is not always constant, so that a proper alignment of the crystal and scaling of the data is essential. Since for a large number of compounds, especially with increasing molecular weight, large crystals are unlikely to be obtained, the microfocus X-ray source is an appreciated alternative.

It was mentioned above that heat is a severe problem in X-ray generation and cooling has to ensure that the solid anode material is kept at a temperature well below its melting point. This physical limit is overcome by anodes consisting of a continuous flow of liquid metal alloys, realized by a new technology, the recently introduced so-called liquid Metaljet X-ray sources [13]. As target material, gallium or indium alloys can be chosen which are liquid close to room temperature. Their Kα emission lines provide wavelengths at λ = 1.3 Å for Ga, close to CuKα, and at λ = 0.51 Å for In, close to AgKα. Since a continuous flow of already molten alloy supplies a permanently regenerated target surface, a high X-ray flux can be generated. The manufacturers report a significantly higher brightness compared to X-ray sources with solid anodes.
2.3.5 Synchrotron radiation

If charged particles (electrons or positrons) are accelerated in a storage ring to a velocity close to the speed of light, electromagnetic radiation is emitted in a tangential direction with respect to the storage ring. This radiation, which is concentrated within a narrow angular space, is white radiation and covers a wavelength spectrum from infrared to gamma radiation. It has not only a brilliant intensity, but also further outstanding properties which will be detailed below.

First experiments with synchrotrons started in the late 1940s, and the so-called “first generation” storage rings were in operation by the end of the 1960s. They were originally dedicated to particle physics experiments, where synchrotron radiation was a byproduct with some disadvantages concerning beam and intensity stability and rather short runtime cycles. Even this application of synchrotron radiation offered promising experimental options, so that second generation sources, established by the 1980s, were dedicated to the use of synchrotron radiation only. In the early 1990s, third generation synchrotrons started operation, and since the beginning of the 21st century X-ray free electron lasers (XFELs) have been under development. A detailed overview of synchrotron crystallography is given by Helliwell [14].

To generate and maintain high intensity synchrotron radiation a procedure is followed which is illustrated in Figure 2.15. From an electron gun low energy electrons are fed into a linear accelerator [(1), for numbering, see Figure 2.15], where they gain an energy of several million eV. In a next step they are fed to a (circular) booster synchrotron (2), where they obtain their final energy of some 4–8 GeV and are then injected into the storage ring (3), where they travel in a vacuum of around 10⁻⁸ to 10⁻¹⁰ mbar. Some technical effort with
Figure 2.15: Schematic representation of a synchrotron storage ring and its components. Image reproduced by courtesy of Diamond Light Source (UK) [15].
bending magnets is needed to maintain a precise orbit for hours. Openings in tangential directions of the storage ring allow the synchrotron radiation to exit the ring and enter one of the beamlines (4, 5) to be used at the experimental stations (6–8). Several further magnetic devices with various functions are inserted into third generation storage rings, such as wigglers or undulators. Both consist of periodically arranged magnets in two parallel rows, forcing the beam between them to follow a sinusoidal path. Depending on how they are technically conditioned they have special functions. Wigglers contribute to increasing the intensity and produce a broad radiation spectrum, while undulators yield a more compressed beam with a characteristic energy spectrum. Undulators have a smaller electron beam deflection angle than wigglers. For a given undulator geometry this leads to wave interference and a quasi-monochromatic beam.

Synchrotron radiation is several orders of magnitude more intense than the radiation of a conventional X-ray tube. While the average source brilliance of an X-ray tube is around 10⁸ (see Figure 2.16), first generation synchrotrons already made a big step forward to 10¹², which was increased to more than 10²⁰ for the third generation synchrotrons. The upper end of the vertical axis in Figure 2.16 is occupied by the free electron laser brilliance. One further important property is the tunability, which means that the user can choose a desired wavelength. This is of great current importance for multiple-wavelength diffraction experiments carried out for phasing macromolecular structures, see Section 7.4. Once a choice is made, appropriate X-ray optics, such as monochromators or focusing mirrors, generate a highly monochromatic and sharply focused beam. The technically demanding X-ray optics system is designed for each experimental station to meet the individual requirements of the experiment in question. It is normally located in its own optics hutch (6) just in front of the experiment, which in turn is in a second hutch at the end of the beamline (7). For safety reasons this hutch can be entered by the user only when the beam is off, which is supervised by a strict interlock system. Finally, outside the experimental hutch in an operator area (8) the experiment is controlled by the user with adequate software, either commercially supplied or written by the beamline staff or in some cases also by the user group.

For single crystal diffraction experiments synchrotron primary radiation has a number of major advantages. The high primary intensity allows
– short exposure times, so that, for example, radiation sensitive or unstable crystals are exposed to the beam in minimum time intervals;
– tiny and weakly diffracting crystals to be measured. Crystal dimensions of a few microns are acceptable;
– weak high-order reflections to be measured. This allows the resolution of a diffraction experiment to be increased (important in protein crystallography and for electron density data sets).
The sharp and highly monochromatic beam yields extremely sharp reflection profiles, which is helpful for unit cells with long lattice constants, where overlap of
Figure 2.16: Average X-ray source brilliance development over the years for X-ray tubes and the different generations of synchrotrons. Image reproduced by courtesy of G. Admans/ESRF [16]. The brilliance of an X-ray source is a measure of its brightness per spot size and is given in photons/s/mm²/mrad²/0.1 % bandwidth [14, 16].
neighboring reflection profiles is avoided, so that the intensity of each reflection can be properly recorded.

Some disadvantages should also be mentioned. Synchrotron radiation is provided by large scale facilities only. Although their number has increased worldwide in recent years and is steadily growing, the availability of synchrotron beamtime is limited. Each potential user wishing to carry out synchrotron experiments has to submit a proposal to the scientific board of a synchrotron source, which should substantiate the planned experiment and explain in particular why synchrotron radiation is needed. Only if the proposal is accepted is beamtime awarded, which is generally given in shifts (one shift is a time interval of 8 hours). Usually no more than a couple of shifts are allocated, so that even after a successful application the user has a strongly limited time interval for the experiment, which must be completely exploited. This means that the experiment and the crystalline samples should be carefully prepared in advance at the home laboratory and that on-site the experiment should be permanently supervised, that is 24 hours per day. This also means that the visiting experimenter group should consist of more than one person. Moreover, the administrative procedure described above can take several weeks or months, so synchrotron experiments must be planned well in advance. It is also strongly recommended that the user contact the beamline staff as early as possible to agree on individual requirements, choice of wavelength, low temperatures, etc. Nevertheless, in some fields of X-ray crystallography, for example for large protein or virus structures, the majority of data collections are already carried out at synchrotron beamlines.
3 Preliminary experiments

We can now proceed to the first experiments in a single crystal structure analysis. When starting a structure analysis, we have to assume that single crystals of the compound are available. The first information we can get is that of the crystal symmetry and lattice dimensions. For this purpose we describe the film techniques which can serve as first experimental applications of single crystal diffraction. The notation “film techniques” indicates that the detector is an X-ray sensitive film.
3.1 Film methods

3.1.1 The rotation method

From the Ewald condition, shown in Figures 2.3 and 2.4, we see that we get diffraction of a lattice plane if, and only if, its vector h intersects Ewald’s sphere. If a crystal is oriented in an arbitrary orientation towards the X-ray beam, this condition will only be fulfilled for a few planes by pure chance. One way to obtain diffraction by every reflection in the limiting sphere is to rotate the crystal about an axis normal to the direction of the incident beam. This is the fundamental concept of most film methods used in single crystal diffractometry which uses monochromatic radiation. Another way to obtain a large number of reflections is to vary the wavelength to get a variety of Ewald spheres, which means to use white X-radiation. This method was used in the first diffraction experiments of von Laue
Figure 3.1: (a) Random orientation of a crystal with respect to the X-ray beam; (b) orientation of reciprocal lattice planes in the direction of the primary beam.
and coworkers and is called the Laue Method. It has recently experienced something of a renaissance, because synchrotrons provide white X-radiation in a certain energy range, see Section 2.3.5. Since all diffraction spots can be assigned to a lattice vector h in reciprocal space, all single crystal diffraction experiments can be described in terms of the reciprocal lattice. That is why we have plotted only the reciprocal lattice in Figure 3.1. Let us now examine two possible orientations of this lattice with respect to the incident beam. (As in Figure 2.3 we have placed the origin of the reciprocal lattice on the intersection of the incident beam direction and the Ewald sphere.) In Figure 3.1(a), the reciprocal lattice is in an arbitrary orientation to the incident beam. Rotation of the crystal (and the reciprocal lattice) gives rise to several intersections of lattice points with the Ewald sphere. However, the reflected beams will have no special directions and will therefore show a distribution on a film which bears no obvious relationship to the crystal symmetry or the crystal axes. The result of such a rotation photograph is shown in Figure 3.2(a); it is like the “stars in the sky”. We get a more interpretable diffraction pattern if the crystal is adjusted, as shown in Figure 3.1(b), so that it rotates about an axis which is normal to a set of reciprocal lattice planes and normal to the X-ray beam. The reciprocal lattice planes then intersect the Ewald sphere on parallel circles. The reflected beams are then all gathered on different cones, each cone being associated with one reciprocal lattice plane. These cones, named Laue cones, have the crystal rotation axis as the common cone-axis (Figure 3.3).
Figure 3.2: (a) Rotation photograph of a randomly oriented crystal; (b) rotation photograph of a properly aligned crystal.
3 Preliminary experiments
If the film is positioned cylindrically around the crystal with the cylinder axis coinciding with the rotation axis of the crystal, these cones appear as parallel circles on the cylindrical film, becoming straight lines after the film is unrolled. An example of such a rotation photograph is shown in Figure 3.2(b). Since all reflections of one reciprocal layer are present on one line of the film, the lines are called layer lines. That in the plane of the incident beam (also called equator plane) is the zero layer line, the next the first layer line, and so forth. From the rotation photograph of an aligned crystal, we can measure the distance D* between the lattice layers. From Figure 3.3 we find

$$\tan\mu_1 = \frac{l_1}{R_F} \qquad (R_F = \text{radius of film camera})$$

and

$$\sin\mu_1 = \frac{D^*}{1/\lambda} \quad\text{or}\quad D^* = \frac{\sin\mu_1}{\lambda}.$$
Figure 3.3: Laue cones and their registration on a rotation photograph.
Since RF is known, and l1 can be measured from the photograph, μ1 and then D* can be calculated. In general for the nth layer line we get

$$\tan\mu_n = \frac{l_n}{R_F} \tag{3.1}$$

and from

$$\sin\mu_n = \frac{nD^*}{1/\lambda}$$

we find

$$D^* = \frac{\sin\mu_n}{n\lambda}. \tag{3.2}$$
Equation (3.2) applies to every rotation axis and for every kind of lattice layer. Consider a rotation axis of the form r = ua + vb + wc with u, v, w being integers with no common factor. Now, since direct and reciprocal spaces are completely equivalent, we can use formula (1.13) of Section 1.2.2 for planes in reciprocal space and their normal vectors in direct space. From the definition of r it follows that r is normal to a set of planes satisfying the equation hₙ·r = n, n = 0, 1, 2, …. The distance between these planes is D* = 1/|r| [see formula (1.14)]. Since r is the rotation axis, these planes represent the layer lines. That means for a reflection h lying in the nth layer line, we get the so-called “layer line condition”:

$$\mathbf{h}\cdot\mathbf{r} = n, \qquad n = 0, 1, 2, \ldots \tag{3.3}$$

From (3.2) we get

$$D^* = \frac{1}{|\mathbf{r}|} = \frac{\sin\mu_n}{n\lambda}$$

or, with D = |r|,

$$D = \frac{n\lambda}{\sin\mu_n}. \tag{3.4}$$
If the rotation axis is a crystal axis, say one of the base vectors, for instance a, then we have r = 1a + 0b + 0c and (3.3) reduces to h·1 + k·0 + l·0 = n. It follows that the zero layer line includes the reflections of the b* – c* plane. They all have the indices 0kl. The first layer line contains reflections of type 1kl; in general, reflections on the nth layer line are of the type nkl. From (3.4) we then get, with D = |a| = a,

$$a = \frac{n\lambda}{\sin\mu_n}. \tag{3.5}$$
Thus we can measure the magnitude of the lattice constant, a, from an aligned rotation photograph, if the rotation axis is parallel to this lattice direction. The lattice constant
derived from a rotation photograph is, in fact, the only information of a direct space quantity that we get from film techniques. All further information will be of reciprocal space. Although the rotation photograph is the first technique to provide information about the crystal lattice, it has one disadvantage. It provides no information relating to the reflections within the layers, because the two-dimensional information about an entire reciprocal lattice layer is projected on a one-dimensional layer line. To overcome this difficulty, we use other film methods. Frequently used in single crystal analysis and of some historical relevance are the Weissenberg and the precession method.
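The evaluation of a rotation photograph via (3.1) and (3.4) can be sketched numerically as follows. This is only a minimal illustration: the function name `layer_spacing` and the film reading used in the example (l1 = 6.0 mm on a camera with RF = 28.65 mm, CuKα radiation) are assumptions made for demonstration and are not taken from a real exposure.

```python
import math

def layer_spacing(l_n_mm, n, wavelength, film_radius_mm):
    """Lattice repeat D (in Å) along the rotation axis from the n-th layer line.

    l_n_mm         -- distance of the n-th layer line from the zero layer line (mm)
    n              -- index of the layer line (1, 2, ...)
    wavelength     -- X-ray wavelength in Å
    film_radius_mm -- radius R_F of the cylindrical film (mm)
    """
    if n < 1:
        raise ValueError("the zero layer line carries no spacing information")
    mu_n = math.atan(l_n_mm / film_radius_mm)   # eq. (3.1)
    return n * wavelength / math.sin(mu_n)      # eq. (3.4)

# Hypothetical reading: first layer line 6.0 mm from the equator,
# R_F = 28.65 mm, CuKα radiation (1.5418 Å).
D = layer_spacing(6.0, 1, 1.5418, 28.65)
print(f"D = {D:.2f} Å")
```

If the rotation axis coincides with a unit cell vector, the returned value is the length of that vector, as stated by (3.5).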
3.1.2 Zero level Weissenberg method
At first the experimental conditions for this technique are similar to those for the rotation method. As before, it is necessary to rotate a well-aligned crystal about an axis perpendicular to the direction of the primary beam. There are two main differences from the rotation method. A slotted screen is used which permits the recording of only one layer line, and the film holder makes a translational movement in the direction of the crystal’s rotation axis (Figure 3.4), as the crystal rotates.
Figure 3.4: Principle of the Weissenberg geometry, zero layer is recorded.
Figure 3.5: Schematic representation of a Weissenberg camera, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
With this experimental arrangement (see also Figure 3.5), reflections intersecting the Ewald sphere at different times will be registered at different positions on the film. Thus a complete layer of reciprocal space which appears as one single line (layer line) on a rotation film will be distributed over the whole plane of the Weissenberg film. In contrast to the rotation technique, all reflections are separated and can be identified. These experimental conditions give a distorted representation of a layer of the reciprocal lattice in contrast to the second moving film method, the precession method, which will be described later.
Figure 3.6: (a) Example of a Weissenberg exposure, zero layer; (b) undistorted representation of a selection of reflections recorded on the exposure shown in (a). The reflection marked by its x-y-coordinates in (a) is encircled in the representation in (b).
An example of a Weissenberg film is shown in Figure 3.6(a). The typical arrangement of diffraction spots on festoon-like curves can clearly be seen. To measure detailed information about the geometry of the reciprocal lattice from the reflection positions, the experimental conditions and their geometrical consequences have to be known. Figure 3.7 shows the experimental conditions for a zero-level Weissenberg exposure in the plane of the zero layer; this is perpendicular to the projection drawn in Figure 3.4. The crystal rotation axis is directed perpendicular to the plane of the paper. Consider a two-dimensional coordinate system in the plane having its origin, O, at the intersection of the primary beam and the Ewald sphere and its x-axis perpendicular to the direction of the primary beam. Relative to this coordinate system we choose polar coordinates so that to every lattice point P having the position vector h, a polar radius r and a polar angle ω can be assigned. If the crystal is rotated about its axis, P turns by an angle ω + β to intersect the Ewald sphere at P′, resulting in a reflection. This reflection will appear on the extension of the vector s/λ on the film at B. Let t be the time necessary to move P to P′ and let us then denote the position vector of P′ by h(t). Let y be the circular arc AB which will be a straight line if the film is unrolled. If RF is the radius of the cylindrical film holder, we get, y/(2 π RF) = (2 θ)/360 or θ = (y 180)/(2 π RF).
(3.6)
Figure 3.7: Experimental conditions of the Weissenberg technique in the zero layer plane, projected in the direction of the crystal rotation axis.
Most Weissenberg cameras have a film cassette radius of RF = 180/(2 π) ≈ 28.6 mm. Then we get θ (in degrees) = y (in mm). The crystal rotation is coupled via a gearing g (to be given in mm/degrees) with the translation x of the film holder. Then we get for x, x = g(ω + β). From Figure 3.7, it follows from simple geometric considerations that β = 90° − α and 2α = 180° − 2θ, hence β = 90° − (90° − θ) = θ, or x = g(ω + θ).
The quantities x and y can be measured directly on a Weissenberg film [see Figure 3.6(a)]. x has to be measured relative to an origin which can be chosen arbitrarily, in principle. However, for convenience, the point x = 0 should be positioned at the left margin. Using Bragg’s equation, r = ⎪h⎪ can be calculated and the polar coordinates (r, ω) of a lattice point can be obtained from x and y by r = ⎪h⎪ = (2 sin θ)/λ with
(3.7)
θ = (180 y)/(2 π RF) (= y, if RF = 28.6 mm), and ω = x/g − θ.
(3.8)
By application of the transform (x, y) → (r, ω) an undistorted reciprocal lattice layer is obtained, from which the distance between reciprocal lattice points can be calculated. Figure 3.6(b) shows a graphical representation of the undistorted reciprocal layer obtained from the exposure of Figure 3.6(a). Properties which can be derived from this undistorted image will be discussed in Section 4.3.2. A special feature of a Weissenberg film is that reflections having the same ω-values [for instance the reflections on lines A and B in Figure 3.6(a)] are situated on a straight line passing through the origin of the reciprocal lattice. For these reflections, it follows from (3.7) and (3.8) that x = g(ω + θ) = g ω + (y g 180)/(2 π RF).
Setting c = g ω, and d = (g 180)/(2 π RF), then c and d are constants and we get y = (1/d)x − c/d.
(3.9)
Thus y and x are related by the equation of a straight line, which implies that these reflections are positioned on a straight line on the Weissenberg exposure (see Figures 3.6(a) and 3.6(b) for comparison). So we have shown that reflections situated on a straight line on a Weissenberg film belong to a reciprocal lattice line passing through the origin. This fact is frequently utilized to determine reciprocal lattice constants directly from the Weissenberg film. If the rotation axis is identical to one of the unit cell vectors, say c, the zero layer Weissenberg exposure contains the hk0 reflections. Since the reciprocal lattice lines h00 and 0k0 pass through the origin of the reciprocal lattice, they appear as lines on the film. They can be identified by symmetry considerations as we shall discuss in Section 4.3.2. These reflections, which are multiples of the reciprocal basis vectors, are said to be axial reflections. For an h00 reflection we have h = ha* + 0b* + 0c* = ha*. Then Bragg’s equation becomes

$$\lambda = \frac{2\sin\theta}{|\mathbf{h}|} = \frac{2\sin\theta}{h\,a^*}$$

and we get

$$a^* = \frac{2\sin\theta}{h\lambda}. \tag{3.7a}$$
Since θ is given by its y-value, the reciprocal lattice constant a* is obtained from a Weissenberg exposure.
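The transformation (3.6) to (3.8) from film coordinates to polar coordinates, and the evaluation of an axial reflection by (3.7a), can be sketched as below. The function names, the gearing of 0.5 mm per degree, and the film coordinates in the example are assumptions chosen for illustration only; the gearing of an actual camera has to be taken from the instrument.

```python
import math

# With this film radius (in mm), theta in degrees equals y in mm.
R_F_STANDARD = 180 / (2 * math.pi)

def weissenberg_to_polar(x_mm, y_mm, wavelength, gearing,
                         film_radius_mm=R_F_STANDARD):
    """Map film coordinates (x, y) of a zero-layer reflection to (r, omega).

    gearing -- film translation per degree of crystal rotation (mm/degree)
    Returns r in 1/Å (wavelength in Å) and omega in degrees.
    """
    theta = 180.0 * y_mm / (2 * math.pi * film_radius_mm)   # eq. (3.6)
    r = 2 * math.sin(math.radians(theta)) / wavelength      # eq. (3.7)
    omega = x_mm / gearing - theta                          # eq. (3.8)
    return r, omega

def reciprocal_axis_length(y_mm, h, wavelength, film_radius_mm=R_F_STANDARD):
    """a* (in 1/Å) from the y-coordinate of an axial h00 reflection, eq. (3.7a)."""
    theta = 180.0 * y_mm / (2 * math.pi * film_radius_mm)
    return 2 * math.sin(math.radians(theta)) / (h * wavelength)

# Hypothetical reflection at x = 40 mm, y = 20 mm, CuKα, g = 0.5 mm/degree.
r, omega = weissenberg_to_polar(40.0, 20.0, 1.5418, gearing=0.5)
print(f"r = {r:.4f} 1/Å, omega = {omega:.1f} degrees")

# Hypothetical axial reflection 200 found at y = 11.5 mm.
a_star = reciprocal_axis_length(11.5, 2, 1.5418)
print(f"a* = {a_star:.4f} 1/Å")
```

Reflections that share the same ω come out on one radial line of the (r, ω) plot, which is the numerical counterpart of the straight lines A and B on the film.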
3.1.3 Upper level Weissenberg – normal beam and equi-inclination method
Let us now proceed to the upper level Weissenberg technique. The experimental conditions are similar to those for the zero layer, except that the layer line screen has to be shifted so that only the desired nth layer is registered on the film. This technique is called the normal beam method, since the incident beam is still normal to the crystal rotation axis. The shift tn of the layer line screen (see Figures 3.3 and 3.4) can be calculated from tan μn = tn/Rs, or tn = Rs tan μn, where Rs is the radius of the layer line screen.
A disadvantage of the normal beam method is that it allows the registering of only those layers which would appear on a rotation photograph. As can be seen from (3.4), that means (since sin μn ≤ 1) that n ≤ D/λ.
(3.10)
With minor changes in the geometrical conditions, the number of layers being registered can be enlarged by a factor of two. Therefore, in practice it is customary to prefer the so-called equi-inclination method rather than the normal beam method. When using this technique the primary beam is no longer chosen to be perpendicular to the crystal rotation axis, but it is turned by an angle ϕn from that direction [Figure 3.8(a)]. As before, the origin O of the reciprocal lattice is at the intersection of the primary beam and the Ewald sphere. The central plane c is perpendicular to the crystal rotation axis and passes through the crystal center. Then the distance z between zero layer and c-plane is given by [see Figure 3.8(a)]

$$\frac{z}{1/\lambda} = \sin\phi_n.$$

ϕn, the equi-inclination angle, is now chosen so that the nth layer has the same distance to the c-plane as the zero layer, and since (1/2)nD* = z (see Figure 3.3),

$$\frac{(1/2)\,nD^*}{1/\lambda} = \sin\phi_n, \qquad \frac{n\lambda}{2D} = \sin\phi_n$$

or, using (3.4),

$$\sin\phi_n = \frac{\sin\mu_n}{2}. \tag{3.11}$$

Figure 3.8: Experimental conditions for the equi-inclination method; (a) view perpendicular to the crystal rotation axis (as in Figure 3.4); (b) view in the direction of the crystal rotation axis (as in Figure 3.7).
The reflections of the nth layer having their reflection position at P′ are registered on the film at B [Figure 3.8(a) and (b)]. To suppress all but the reflections of the nth layer, the layer line screen has to be shifted by an amount tn [Figure 3.8(a)], with tn = Rs tan ϕn.
(3.12)
To obtain the polar coordinates r and ω using the upper level Weissenberg technique, let us look again at Figure 3.8(b), which shows the experimental conditions in the plane of the nth layer. It is perpendicular to Figure 3.8(a). There is one significant difference from the situation represented in Figure 3.7. The reflection angle 2θ is replaced by an angle 2∆, with 2∆ = ∠ O′ C P′, which is the projection of 2θ = ∠ O K P′ onto the plane of the nth layer. All further results derived from Figure 3.7 are still valid, so we have ω = x/g − ∆. The radius r can again be derived from the distance y = AB on the film. However, y has to be transformed into ∆ according to (3.6), and r = O′P′ is no longer equal to |h|, but r is the projection of |h| onto the plane of the nth layer. We get r = O′D′ sin ∆ = 2 O′C sin ∆. From Figure 3.8(a), it can be seen that O′C = (1/λ) cos ϕn holds. Hence,
r = (2/λ) sin ∆ cos ϕn.

Let us summarize. For examination of the layers n (n = 0, 1, 2, ...) of the reciprocal lattice by the equi-inclination Weissenberg technique, the following procedure must be followed:

(1) After setting the crystal about a special rotation axis, a rotation photograph is taken. The lattice repeat D along the rotation axis, which is the reciprocal of the spacing D* of the reciprocal layers perpendicular to that axis, is given by

$$D = \frac{n\lambda}{\sin\mu_n} \tag{3.4}$$

with

$$\tan\mu_n = \frac{l_n}{R_F}. \tag{3.1}$$

If the rotation axis has the direction of one unit cell vector, (3.4) gives the length of this unit cell vector.

(2) To obtain a Weissenberg photograph of the nth layer of the reciprocal lattice, the inclination angle ϕn and the displacement tn of the layer line screen have to be calculated from

$$\sin\phi_n = \frac{\sin\mu_n}{2} \tag{3.11}$$

$$t_n = R_s\tan\phi_n. \tag{3.12}$$

For n = 0, ϕ0 = t0 = 0 holds.

(3) The transformation to an undistorted reciprocal lattice can be performed by taking the coordinates (x, y) of each reflection on the film, followed by the calculation of the corresponding polar coordinates (r, ω):

$$\Delta = \frac{180\,y}{2\pi R_F}\ (= y \text{ if } R_F = 28.6\ \text{mm}), \qquad \omega = \frac{x}{g} - \Delta, \qquad r = \frac{2\sin\Delta\,\cos\phi_n}{\lambda}. \tag{3.13}$$

If n = 0, eq. (3.7) replaces the equation for r.

An advantage of the equi-inclination technique over the normal beam method is obvious. From (3.4) and (3.11), it follows that ϕn has to satisfy (n λ)/(2D) = sin ϕn ≤ 1. It follows that

$$n \le \frac{2D}{\lambda}. \tag{3.14}$$
A comparison with (3.10) shows that in the case of equi-inclination techniques, the maximum number of layers which can be recorded is twice the number of those in the case of the normal beam method.
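The equi-inclination relations (3.11), (3.13) and the two layer limits (3.10) and (3.14) can be checked with a short sketch. The function names, the gearing of 0.5 mm per degree and the film coordinates are hypothetical; D = 8.80 Å is used as an illustrative lattice repeat (it is the value Table 3.1 lists for SUCROS) together with CuKα radiation.

```python
import math

R_F_STANDARD = 180 / (2 * math.pi)   # mm; Delta in degrees then equals y in mm

def equi_inclination_angle(n, wavelength, D):
    """phi_n in degrees from sin(phi_n) = n*lambda/(2*D), eqs. (3.4) and (3.11)."""
    s = n * wavelength / (2 * D)
    if s > 1:
        raise ValueError("layer cannot be recorded: n exceeds 2D/lambda")
    return math.degrees(math.asin(s))

def upper_level_to_polar(x_mm, y_mm, phi_n_deg, wavelength, gearing,
                         film_radius_mm=R_F_STANDARD):
    """Polar coordinates (r, omega) of an upper-level reflection, eq. (3.13)."""
    delta = 180.0 * y_mm / (2 * math.pi * film_radius_mm)
    omega = x_mm / gearing - delta
    r = (2 * math.sin(math.radians(delta))
         * math.cos(math.radians(phi_n_deg)) / wavelength)
    return r, omega

def max_layers(D, wavelength):
    """Highest recordable layer index: (normal beam, equi-inclination)."""
    return math.floor(D / wavelength), math.floor(2 * D / wavelength)

phi_1 = equi_inclination_angle(1, 1.5418, 8.80)
r_n, omega_n = upper_level_to_polar(40.0, 20.0, phi_1, 1.5418, gearing=0.5)
n_normal, n_equi = max_layers(8.80, 1.5418)
print(f"phi_1 = {phi_1:.2f} degrees, r = {r_n:.4f} 1/Å, omega = {omega_n:.1f} degrees")
print(f"normal beam: n <= {n_normal}, equi-inclination: n <= {n_equi}")
```

Because the limits are integer parts of D/λ and 2D/λ, the equi-inclination count is roughly, and not always exactly, twice the normal beam count.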
3.1.4 Precession technique
The great disadvantage of the Weissenberg method is that it records a distorted representation of the reciprocal lattice. The distortion of the reciprocal lattice representation on a Weissenberg film results from the fact that the movement of the crystal, and therefore the movement of the reciprocal lattice, is different from that of the film. It follows that an undistorted representation can only be obtained if the crystal and the film movement are synchronous. This condition is realized in different ways by two film methods, the de Jong-Bouman and the precession method. The de Jong-Bouman method, which provides in principle the same information as the Weissenberg technique, is nowadays hardly in use and shall not be considered here. The other film method which allows the recording of the undistorted lattice is the precession method developed by Buerger in 1942. [1] The basic experimental situation is illustrated in Figure 3.9. In contrast to the rotation and Weissenberg method, the crystal is assumed to be aligned with one axis, say a, in the direction of the primary beam [Figure 3.9(a)]. The reciprocal lattice layers 0kl, 1kl, etc., are then oriented perpendicular to the primary beam. Consider the zero layer. To make this layer intersect the Ewald sphere, it is necessary to incline the crystal with its a-axis by an angle μ0 against the X-ray beam direction [Figure 3.9(b)]. Since reflections of zero layer lattice points are obtained only if these points cut the surface of the Ewald sphere, movement of the crystal is necessary. In the Buerger precession method, the crystal executes a precession movement about the primary beam axis keeping a constant opening angle μ0 between the crystal a-axis and the primary beam direction.
An undistorted exposure of one reciprocal plane is obtained if a screen is positioned so that only reflections of one lattice plane are able to pass and if a planar film simultaneously performs the same motion as the crystal. These conditions are realized with a Buerger precession camera, of which a schematic representation is given in Figure 3.10. The principal parts of this camera are the crystal and the film holders H and H′, which are coupled to ensure a simultaneous motion. An annular screen S, which is attached rigidly to the crystal holder, obstructs all but the reflections of one layer. The inclination angle μ is adjusted via an arc A, which is connected with a motor M. The rotation of the motor shaft then causes the precession motion of the complete system of film, crystal and screen. To choose the geometric settings, let us look at Figure 3.11, which shows the geometry in the case of a zero layer precession exposure. The intersections of the layers 0kl, 1kl, etc., result in reflections being located on the corresponding Laue cones. Let us denote the angles subtended by these Laue cones by 2ϕ0, 2ϕ1, etc. As Figure 3.11 shows, we get a cone opening angle of 2ϕ0 = 2μ0 for the zero layer. To prevent all but the zero layer cone from reaching the film, a screen having an annular opening of radius rs is placed at a distance ds from the crystal with

$$d_s = \frac{r_s}{\tan\mu_0}. \tag{3.15a}$$

Figure 3.9: Crystal setting by the precession method: (a) inclination angle μ0 = 0°; (b) μ0 > 0°.

Figure 3.10: Schematic representation of a Buerger precession camera.
If crystal and film are moved synchronously, then an undistorted image of the zero layer appears on the film. The magnification follows from simple geometric proportionality. The ratio of a distance x taken on the film (in mm) to a corresponding distance d* on the zero layer (in Å–1) is equal to the ratio of dF and 1/λ (dF in mm, λ in Å):

$$\frac{x}{d^*} = \frac{d_F}{1/\lambda}.$$

It follows

$$\frac{1}{d^*} = \frac{d_F\,\lambda}{x}\ [\text{Å}] \tag{3.16a}$$

or

$$d^* = \frac{x}{d_F\,\lambda}\ [\text{Å}^{-1}]. \tag{3.16b}$$
On the precession exposure of the zero layer, a circular area of radius rB is recorded, with rB/2 = OA, see Figure 3.11. It holds that

$$\frac{r_B/2}{1/\lambda} = \sin\phi_0$$

or

$$r_B = \frac{2}{\lambda}\sin\phi_0. \tag{3.17}$$
As Figure 3.11 shows, not only the zero layer but also higher layer Laue cones are present (as long as the corresponding layers intersect the Ewald sphere). It is therefore usual to place a second film between the screen and the crystal to record all Laue cones on one film. An exposure of that type is called a cone-axis exposure and contains information similar to a rotation photograph in the Weissenberg arrangement. The difference is that one layer appears as a line on a rotation photograph but as a circle on a cone-axis exposure. The determination of the lattice constant in the precession axis direction can then be made from a cone-axis exposure. Suppose again that the alignment axis is a. Then the distance d* between the zero and first layer is 1/a. From Figure 3.11 we get

$$\frac{AK}{1/\lambda} = \cos\phi_0, \qquad \frac{BK}{1/\lambda} = \cos\phi_1, \qquad \frac{1}{a} = AK - BK,$$

$$\frac{1}{a} = \frac{\cos\phi_0}{\lambda} - \frac{\cos\phi_1}{\lambda}. \tag{3.18}$$

Figure 3.11: Geometrical situation for taking a zero layer precession exposure.
Figure 3.12: Geometrical conditions for a first layer precession exposure.
Since the first layer is supposed to be closer to the crystal, its cone opening angle 2ϕ1 is larger than that of the zero layer. If the radii of the zero and first layer circle on the cone-axis photograph are r0 and r1, we have r0 < r1 and we get tan ϕi = ri/∆ (i = 0, 1), where ∆ is the film to crystal distance. Since ∆ is not usually very precisely known, whereas ϕ0 = μ0 is known from the experimental conditions, we get ∆ = r0/tan ϕ0 and

$$\tan\phi_1 = \frac{r_1}{\Delta} = \frac{r_1}{r_0}\tan\phi_0. \tag{3.19}$$
So we get ϕ1 from (3.19) and then a from (3.18). For an upper level precession photograph (see Figure 3.12), the precession angle μ1 can be chosen arbitrarily. However, it is usually chosen to be smaller than μ0, since otherwise the large cone opening angle would require too large a film holder, resulting in a collision during the precession motion. The screen distance for a screen of radius rs is now given by

$$d_{s,1} = \frac{r_s}{\tan\phi_1} \tag{3.15b}$$

where ϕ1, half the opening angle of the first layer cone, is obtained as in (3.18) from

$$\cos\phi_1 = \cos\mu_1 - \frac{\lambda}{a}. \tag{3.20a}$$
Since the layer to crystal distance is different for the first layer from that for the zero layer, a different magnification factor would result if the film were left at the same distance as for the zero layer. Therefore we displace the film by a shift u1 towards the crystal (because the first layer is closer to the crystal than the zero layer, see Figure 3.12), which is given by

$$\frac{u_1}{1/a} = \frac{d_F}{1/\lambda}, \qquad u_1 = \frac{\lambda\,d_F}{a}. \tag{3.21a}$$

Generalizing the results for the first layer to the nth layer, we get

$$d_{s,n} = \frac{r_s}{\tan\phi_n} \tag{3.15c}$$

with

$$\cos\phi_n = \cos\mu_n - \frac{n\lambda}{a} \tag{3.20b}$$

$$u_n = \frac{n\,\lambda\,d_F}{a}. \tag{3.21b}$$
Let us now summarize the procedure for obtaining a series of precession photographs: (1) Adjust the crystal with one axis, say a, parallel to the primary beam. (2) Choose an inclination angle μ0 (μ0 = 30° is a good choice), place a film in the film holder at a distance dF (dF is an instrument constant, usually dF = 60 mm). Choose an annular screen of radius rs (screens with radii from 20 to 60 mm are generally available) and calculate ds from (3.15a). Place a second film on the crystal side of the screen to obtain a cone-axis exposure simultaneously. (3) To get the instrumental setting constants for the first and higher layers, determine the circle radii from the cone-axis exposure and calculate a from (3.18). (4) Choose an inclination angle μn (smaller than μ0) for the upper layers. Calculate ds,n from (3.15c) and un from (3.21b) and use them when taking the upper layers. Note that it may be necessary to vary the screen radius rs when recording the upper layers. Figure 3.13 shows a zero layer precession photograph of SUCROS and a simultaneously recorded cone-axis exposure. They were recorded using an inclination angle μ0 = ϕ0 = 23.5°. With an annular screen having a radius rs = 20 mm we chose a crystal to screen distance ds = 46.0 mm.
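The instrument settings named in this procedure follow directly from (3.15a) to (3.15c), (3.20b) and (3.21b); a minimal sketch is given below. The zero layer call uses the SUCROS values just quoted (rs = 20 mm, μ0 = 23.5°). The upper layer inputs (μ1 = 20°, a = 7.84 Å, CuKα radiation, dF = 60 mm) and all function names are assumptions chosen for illustration.

```python
import math

def screen_distance(r_s_mm, phi_deg):
    """d_s = r_s / tan(phi), eqs. (3.15a)-(3.15c); phi = mu_0 for the zero layer."""
    return r_s_mm / math.tan(math.radians(phi_deg))

def cone_half_angle(n, mu_n_deg, wavelength, a):
    """phi_n in degrees from cos(phi_n) = cos(mu_n) - n*lambda/a, eq. (3.20b)."""
    c = math.cos(math.radians(mu_n_deg)) - n * wavelength / a
    if c < -1:
        raise ValueError("layer n does not intersect the Ewald sphere")
    return math.degrees(math.acos(c))

def film_shift(n, wavelength, a, d_F_mm):
    """Film displacement u_n = n*lambda*d_F/a towards the crystal, eq. (3.21b)."""
    return n * wavelength * d_F_mm / a

# Zero layer, values quoted for the SUCROS exposure.
d_s0 = screen_distance(20.0, 23.5)

# First layer, hypothetical settings: mu_1 = 20 degrees, a = 7.84 Å, CuKα.
phi_1 = cone_half_angle(1, 20.0, 1.5418, 7.84)
d_s1 = screen_distance(20.0, phi_1)
u_1 = film_shift(1, 1.5418, 7.84, 60.0)

print(f"d_s (zero layer) = {d_s0:.1f} mm")
print(f"phi_1 = {phi_1:.1f} degrees, d_s,1 = {d_s1:.1f} mm, u_1 = {u_1:.1f} mm")
```

The zero layer result reproduces the crystal to screen distance of 46.0 mm stated in the text.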
The lattice layer spacing can be obtained from the radii of the zero and first layer circle (note that r0 < r1) and we could also calculate the settings for the first layer. However, for the space group determination of SUCROS, this first layer is unnecessary (as we shall see in Section 4.3.2). The diameters of the zero and first layer circles are found to be 2r0 = 36.8 mm, 2r1 = 81.5 mm. With ϕ0 = μ0 = 23.5°, we get [from (3.19)] ϕ1 = 43.9° and then [from (3.18)] 1/D = 0.128 Å–1, D = 7.84 Å.

It is of interest to compare the areas of the reciprocal lattice recorded when using Weissenberg and precession techniques (see Figure 3.14). The area covered by a Weissenberg exposure has a radius rw equal to the diameter of the Ewald sphere, hence rw = 2/λ. The radius rB of the circle limiting that portion of the zero layer to be recorded by the precession technique is given by formula (3.17). However, the precession angle μ0 is generally not larger than 30°, to avoid collisions of the moving parts of the instrument. With sin 30° = 0.5, we get rB = 1/λ in comparison with rw = 2/λ. So we get the relation

$$r_B : r_w = 1 : 2 \tag{3.22a}$$

and for the corresponding areas,

$$F_B : F_w = 1 : 4. \tag{3.22b}$$
For the precession technique, which provides an undistorted representation of the reciprocal lattice layers, there is significant loss of information. This disadvantage can be compensated for by using a shorter wavelength. If for instance a Weissenberg exposure is taken with CuKα radiation (λ = 1.5418 Å), a precession photograph of the same reciprocal lattice layer would record about the same number of reflections if MoKα radiation (λ = 0.7107 Å) were used, since λMo : λCu ≈ 1 : 2.
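Both the cone-axis evaluation of the SUCROS exposure and the comparison of the recorded areas can be reproduced with a few lines. The sketch assumes CuKα radiation for the cone-axis exposure, which is not stated explicitly at this point but reproduces the quoted numbers; the function names are hypothetical.

```python
import math

def lattice_constant_from_cone_axis(r0_mm, r1_mm, mu0_deg, wavelength):
    """Return (phi_1 in degrees, lattice repeat in Å) from a cone-axis exposure.

    r0_mm, r1_mm -- radii of the zero and first layer circles on the film
    mu0_deg      -- precession angle, equal to phi_0 for the zero layer
    """
    phi0 = math.radians(mu0_deg)
    phi1 = math.atan(r1_mm / r0_mm * math.tan(phi0))             # eq. (3.19)
    inverse_a = (math.cos(phi0) - math.cos(phi1)) / wavelength   # eq. (3.18)
    return math.degrees(phi1), 1.0 / inverse_a

def precession_radius(mu0_deg, wavelength):
    """Radius r_B of the recorded zero layer area in 1/Å, eq. (3.17)."""
    return 2 * math.sin(math.radians(mu0_deg)) / wavelength

def weissenberg_radius(wavelength):
    """Radius r_w = 2/lambda of the area covered by a Weissenberg exposure."""
    return 2 / wavelength

# SUCROS: measured circle diameters 36.8 mm and 81.5 mm, mu_0 = 23.5 degrees.
phi_1, D = lattice_constant_from_cone_axis(36.8 / 2, 81.5 / 2, 23.5, 1.5418)
print(f"phi_1 = {phi_1:.1f} degrees, D = {D:.2f} Å")

# Coverage at mu_0 = 30 degrees: same wavelength, and MoKα versus CuKα.
ratio_same = precession_radius(30.0, 1.5418) / weissenberg_radius(1.5418)
r_B_mo = precession_radius(30.0, 0.7107)
r_w_cu = weissenberg_radius(1.5418)
print(f"r_B/r_w = {ratio_same:.2f}; r_B(Mo) = {r_B_mo:.3f}, r_w(Cu) = {r_w_cu:.3f} 1/Å")
```

With MoKα the precession radius comes out slightly larger than the CuKα Weissenberg radius, in line with the approximate 1 : 2 wavelength ratio.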
Figure 3.13: Example of a zero level precession exposure of SUCROS (left) and a simultaneously recorded cone-axis exposure (right). The role of the axes shown on the left will be discussed in Section 4.3.2.
A further drawback of the precession method should be mentioned. Central circular areas exist for all upper layers which do not intersect the Ewald sphere (see Figure 3.12). It is therefore impossible to record these “shadowed areas”. A circular area of radius rshad,B = CD is situated inside the Ewald sphere. We get

$$r_{\text{shad},B} = CB - DB = CB - OA = \frac{\sin\phi_1}{\lambda} - \frac{\sin\mu_1}{\lambda}.$$

The image of the shadowed circle on the film then has the radius r′shad,B with r′shad,B : rshad,B = dF : 1/λ. It follows

$$r_{\text{shad},B} = \frac{1}{\lambda}\,(\sin\phi_1 - \sin\mu_1) \tag{3.23a}$$

$$r'_{\text{shad},B} = d_F\,(\sin\phi_1 - \sin\mu_1). \tag{3.23b}$$
The question can arise: which method should be used for an actual problem? A general answer cannot be given. With modern techniques, the film methods are used only for space group determination and for examination of crystal quality. Therefore, the information obtained from precession exposures is sufficient in most cases, even if CuKα radiation is used. On the other hand, for an experienced crystallographer, the distorted representation of a Weissenberg exposure causes no great problem, and this technique might be preferred since a Weissenberg instrument is less troublesome to operate. Because the mechanical parts of a Weissenberg camera are less complicated than for a precession camera, the problems of misalignment, due to mechanical breakdown or mechanical wear, are less frequent.
Figure 3.14: Areas of reciprocal layers covered by Weissenberg (Fw, rw = 2/λ) and precession technique (Fb, rb = 1/λ for μ0 = 30°); SUCROS, h0l plane, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 1st edn. (1992), de Gruyter, Berlin.
3.2 Practicing film techniques
3.2.1 Choice of experimental conditions
The aim of these preliminary investigations is to get information about the unit-cell dimensions, about the crystal symmetry, and about the crystal quality and its stability to X-rays. At this stage, no intensity measurements are made, since it is assumed that an automatic diffractometer can be used for this purpose (see Chapter 5). Up to the mid-1960s, diffraction intensities were measured by film techniques, but since then nearly all single crystal experiments have been carried out with diffractometers. These instruments run automatically and measure intensities more precisely than film methods and at a substantially higher speed, especially if modern area detector diffractometers are available. Several experimenters consider the execution of film exposures unnecessary, sometimes even a “waste of time”, because the above-mentioned information about unit cell and symmetry is also provided by the software routines that modern diffractometers are usually equipped with. This is true in principle. Nevertheless, for the less experienced user it is a good idea to practice film techniques in his/her first two or three X-ray analyses, because this provides the unique chance to learn basic crystallography by directly inspecting the diffraction pattern on the exposed film. The geometrical properties of the reciprocal lattice are easily recognized from photographs. Intensity symmetry can immediately be visualized and the user can directly decide about the choice of lattice constants and measure their lengths. Abnormal crystal properties, such as disorder, twinning, or crystal splitting, which influence the diffraction intensities, are recognized easily by film methods. When the diffractometer software packages are used, everything is done automatically, so that the user has nothing to do but to trust the computer. Hence, although this advice seems somewhat old-fashioned, beginners in X-ray crystallography might spend the time to make film exposures in their first steps in the field, if a corresponding camera is available in the lab.

Figure 3.15: Choice of some crystals. Some specimens suitable for single crystal measurements are marked by arrows.

When choosing a single crystal for film techniques we must consider three properties.
(1) The crystal must be a real single crystal (see Figure 3.15). No more than one individual should be selected. A polarization microscope should be used to observe whether the crystal is twinned. Some remarks about twinned crystals are made in Section 8.2.3.
(2) The crystal should be as large as possible within the limitations given by the primary beam diameter and the ~1/μ limitation discussed in Section 2.3.4, eq. (2.30). A larger crystal will usually shorten the time of exposure.
(3) If film techniques are used to determine the crystal setting, it is useful to select a crystal with well-formed edges and faces if possible. Needle-shaped crystals are the most favorable ones for a fast setting, but less suitable for diffractometry because of the difficulty of enclosing all the crystal within the X-ray beam. Moreover, needle-shaped crystals may increase absorption problems.

The crystal shape depends on the compound to be investigated and not all crystals are needle-shaped. But if you have one, use the needle direction as the rotation axis for film exposures. Frequently, the longest elongation of the crystal will coincide with the shortest lattice constant. This is understandable if we consider a simple model as shown in Figure 3.16. Suppose we want to stack match boxes. The most stable way is that shown in Figure 3.16(a), in which there is the maximum contact area between two match boxes and therefore the best stability. This direction of stacking is the shortest dimension of one box, but the longest dimension of the pile.
Figure 3.16: Two ways of stacking match boxes. Method (a) is much more stable than method (b).
Figure 3.17: (a) A commercial goniometer head (Huber Diffraktionstechnik, Rimsting, Germany), [4] (b) a crystal mounted on a cactus needle.
Having chosen a good crystal it has to be mounted on a goniometer head, which is shown in Figure 3.17(a). Two arcs, A and A′, allow the crystal axis to be turned about ±20° in two perpendicular planes. Two sledges, B and B′, provide further alignment by allowing the crystal axis to be centered. The crystal is mounted by an adhesive on a glass fiber of length approximately 10 mm. Alternatively, a cactus needle instead of the glass fiber has proven a good choice. Both are amorphous materials and provide minor background [Figure 3.17(b)]. In most cases crystal setting on the camera can be done optically. If necessary, it can be followed by a precise alignment using X-rays, by means of a setting photograph. Several techniques for setting photographs are known. Most of them are based on the method of Bunn, [2] or a modification of Bunn’s procedure by Dragsdorf. [3] We refer to the corresponding literature for details.
3.2.2 Rotation and Weissenberg photographs of KAMTRA and SUCROS
Single crystals of KAMTRA are shown in Figure 3.18 (left). We have chosen a colorless needle-shaped crystal having dimensions 0.2 × 0.3 × 0.8 mm. All film exposures will be made with a 2 kW copper X-ray tube and nickel filter. The voltage is 40 kV, with a current of 40 mA. With 1.6 kW we are 20 % below the maximum of 2 kW, but this helps to increase the lifetime of the rather expensive X-ray tube. With a needle axis the optical setting is almost perfect. With this relatively large crystal, an exposure time of one hour is sufficient for the rotation photograph. If smaller crystals are used, several hours of exposure may be necessary. The zero level Weissenberg photograph can be started immediately after the rotation photograph. Several hours of exposure time are needed for the Weissenberg technique, depending on the crystal size and its scattering power. The following timetable can be kept: crystal selection, setting and rotation photographs can be done within one day by starting the work in the morning. The first Weissenberg photograph can then be started in the afternoon, and finished and processed the next morning.
Figure 3.18: Single crystals of KAMTRA (left) and SUCROS (right).
To prepare upper level Weissenberg photographs by the equi-inclination technique, the inclination angle ϕn and the displacement constant tn have to be calculated. From the rotation photograph (Figure 3.19) we can measure ln to be used in formula (3.1). To improve the accuracy, we measure 2ln and divide by 2R, which is 57.3 mm for our camera type. Application of (3.4), (3.11) and (3.12) leads to the desired quantities given in Table 3.1. If the rotation axis is long, i.e. > 10 Å, layer lines higher than the second should be used to reduce the error caused by the relatively poor precision of the 2ln values. The zero and first level Weissenberg photographs taken for KAMTRA are shown in Figure 3.20. These two levels give us almost all the information about the dimensions and symmetry of the crystal lattice. The type of layer line levels recorded with one crystal setting is given by (3.3). If the needle axis of the crystal is the direction of the shortest lattice constant, say c, then the quantity D from the rotation photograph is equal to c and the layer line levels contain the hk0 and hk1 reflections.
Some important features of these Weissenberg photographs should be noted. Consider the two lines of reflections A and B on the zero layer photograph shown in Figure 3.20(a). On either side of the line A we can always observe pairs of reflections having equal intensity. The same is true for line B. Looking at the lines A′ and B′ on the first layer [Figure 3.20(b)], we find the same symmetric behavior of intensities. The lines of reflections A, B and A′, B′ in both layers are therefore symmetry lines. This is the first experimental result concerning the symmetry properties of the crystal. A further feature should be noted. The reflections on the zero and first layer appear at nearly the same positions. This is best seen by placing the two photographs on top of each other. There are, however, two exceptions. The reflection lines A′ and B′ seem to contain twice as many reflections, or in other words, on lines A and B every second reflection is systematically absent.
Figure 3.19: Rotation photograph of KAMTRA.
Table 3.1: Instrument setting constants for KAMTRA and SUCROS (Weissenberg technique).
Sample    n    2ln [mm]    D [Å]    ϕn [°]    tn [mm]
KAMTRA    1    11.7        7.71     5.7       2.5
KAMTRA    2    25.1        7.69     11.6      5.0
SUCROS    1    10.2        8.80     5.0       2.2
SUCROS    2    21.5        8.78     10.1      4.4
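The D and ϕn columns of Table 3.1 can be reproduced from the measured 2ln values with a few lines of Python. This is only a sketch and does not quote formulas (3.1), (3.4) and (3.11) verbatim. It assumes the usual rotation-photograph relation D = nλ / sin(arctan(ln/R)), the equi-inclination condition sin ϕn = nλ/(2D), and a mean Cu Kα wavelength of 1.5418 Å. The function names are hypothetical. The displacement tn is left out, because formula (3.12) depends on the camera and is not restated here.

```python
import math

CU_K_ALPHA = 1.5418   # assumed mean Cu K-alpha wavelength in Å
TWO_R = 57.3          # 2R in mm for the camera type used in the text


def axis_length(n, two_ln, wavelength=CU_K_ALPHA, two_r=TWO_R):
    """Lattice constant D (Å) along the rotation axis from layer line n."""
    nu = math.atan(two_ln / two_r)      # elevation angle of the n-th layer line
    return n * wavelength / math.sin(nu)


def equi_inclination_angle(n, d, wavelength=CU_K_ALPHA):
    """Equi-inclination angle (degrees) for layer n and axis length d (Å)."""
    return math.degrees(math.asin(n * wavelength / (2.0 * d)))


# Measured 2ln values (mm) taken from Table 3.1.
measurements = {"KAMTRA": [(1, 11.7), (2, 25.1)],
                "SUCROS": [(1, 10.2), (2, 21.5)]}

results = {}
for sample, rows in measurements.items():
    for n, two_ln in rows:
        d = axis_length(n, two_ln)
        phi = equi_inclination_angle(n, d)
        results[(sample, n)] = (d, phi)
        print(f"{sample} n={n}: D = {d:.2f} Å, phi_n = {phi:.1f}°")
```

Under these assumptions the computed values agree with the tabulated D and ϕn to within the rounding of the table.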
Figure 3.20: Weissenberg exposures of KAMTRA: (a) zero layer; (b) first layer.
Figure 3.21: Weissenberg exposures of SUCROS: (a) zero layer; (b) first layer; (c) second layer.
Now we have two important results. The intensity distribution shows a symmetry property and there are special groups of reflections which are systematically absent. We shall make use of these findings in Section 4.3.2 when lattice constants and the space group are derived. The next example for the application of film techniques concerns sucrose. Since crystals of this well-known compound are more plate-shaped (see right part of Figure 3.18), we now have two prominent axes rather than one. We mount a crystal with dimensions 0.4 × 0.4 × 0.2 mm on a goniometer head and select the same conditions for the Weissenberg technique as for KAMTRA. The instrument setting constants are given in Table 3.1; the resulting photographs are shown in Figure 3.21. On the zero and upper layers we again see the lines of reflections marked A and B, A′ and B′, A″ and B″. In contrast to the KAMTRA layers, however, they are not lines of symmetry, either on the zero or on the upper layers. Furthermore, neither these lines nor any other groups of reflections show systematic absences. An explanation of these observations will be given in Section 4.3.2.
4 Crystal symmetry
4.1 Symmetry operations in a crystal lattice
4.1.1 Introduction
It was already mentioned in the Introduction, Section 1.1, that crystals have outstanding macroscopic properties. Optical inspection reveals a more or less pronounced symmetry of the crystal faces, giving crystals a beauty for which some of them are admired as precious stones. In this respect crystals belong to the most expensive solid materials. One carat (0.2 g) of a good quality single crystal of diamond costs about 15 to 20 thousand Euros, whereas one carat of gold or platinum, for example, is three orders of magnitude cheaper. The symmetry which is macroscopically visible must also exist microscopically in the crystal lattice. First indications were already seen on the various film exposures described in Chapter 3, where the diffraction pattern showed a symmetry which must have its correspondence in the crystal symmetry. So we have to consider which symmetry can occur in a crystal. Then we have to examine the consequences of crystal symmetry for the intensity of the diffraction pattern. This question matters because investigators are interested in the minimum number of necessary measurements. We have learned already that the number of reflections which can be recorded is a fixed number given by equation (2.11). It is therefore quite useful to know how the symmetry properties of the crystal can reduce the amount of experimental data required and also the number of atomic parameters to be determined. Figure 4.1 illustrates two different arrangements in a crystal. In Figure 4.1(a) we have a pattern where the only symmetry is periodicity. The unit cell is the smallest nonperiodic volume element consisting of one motif. In Figure 4.1(b), however, the situation is quite different. Here the motif is present in four different orientations. In both directions the identical motif reappears only after another one has been passed.
The lattice constants have twice the magnitude of those of Figure 4.1(a), hence the unit cell has the fourfold volume. Nevertheless, both patterns consist of the same motif. Obviously the structure of crystal (a) is completely known if the structure of this motif and the lattice constants are known. For crystal (b) we have, in principle, two possibilities. We could regard all four orientations as four different motifs, in which case the crystal structure would be known from the determination of all four motifs and the lattice constants. However, it is simpler to determine the one motif which, together with the knowledge of the symmetry and the lattice constants, completely describes the crystal structure. In example (b) at least two symmetry operations are present. The related motif in the horizontal direction is obtained by a rotation of 180° about the axis marked “2”. The lower part of the unit cell is obtained by reflection on a plane marked “m”, perpendicular to the plane of the drawing. The advantage of using this symmetry can be expressed numerically. We have four molecules in the unit cell, of which only one is symmetry-independent. The crystal structure is described by the positions of the six atoms of this one molecule, the two symmetry operations, and the cell constants. If the symmetry were neglected, the positions of all 24 atoms in the unit cell would have to be determined.
Figure 4.1: A molecular motif in two different periodic arrangements; (a) only translational symmetry; (b) rotations and reflections in a molecular crystal [1].
The asymmetric unit of the entire unit cell is the whole cell in crystal 4.1(a), a quarter cell or one molecule in 4.1(b). By taking crystal symmetry into account, we can restrict ourselves to the determination of the asymmetric unit. Since the diffraction experiment can provide some information about the crystal symmetry, an important task will be to find out the relationship between the diffraction intensities and the crystal symmetry. We shall do this by considering first all possible crystal symmetry operations and then their effect on the intensity distribution.
4.1.2 Basic symmetry operations
Before proceeding further, we have to define the symmetry notation. A motif is said to be symmetric if a geometrical operation exists which leaves the motif indistinguishable from that initially present. Such an operation is denoted a symmetry operation. For example, for a cube (Figure 4.2), more than one symmetry operation exists, as there are rotations by 90° or 180° [Figures 4.2(a) and (b)] or reflections [Figure 4.2(c)]. The geometrical objects (axes, planes, points) associated with a symmetry operation are called the corresponding symmetry elements. From the example in Figure 4.2 we illustrate two important types of symmetry operations: a rotation about an axis, and a reflection on a plane. The symmetry elements belonging to those operations are a rotation axis and a mirror plane.
Figure 4.2: Symmetry at a cube. The shown symmetry operations are rotations by 90° (a) and 180° (b) and a reflection (c). The small flag makes the operations visible. Otherwise the cube would be indistinguishable from its initial position. The symmetry elements (being also the fixed points) are fourfold (a) and twofold axes (b) and the mirror plane (c), illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
A rotation axis is said to be an n-fold axis if the rotation angle is ϕ = 360°/n. In the examples in Figures 4.2(a) and 4.2(b), we had rotations by ϕ = 90° = 360°/4 and ϕ = 180° = 360°/2, hence we had a fourfold and a twofold axis. A crystal symmetry operation must not only meet the above mentioned definitions, but must also be consistent with the three-dimensional periodicity in a crystal. As we shall see, this condition will restrict the possible symmetries in a crystal drastically.
The symmetry operations considered above have in common that they have a fixed point. A point in space is called a “fixed point” if it is left invariant by the execution of a symmetry operation. For a rotation, all points on the rotation axis are fixed points. For a reflection, the same holds for all points on the mirror plane. Further symmetry operations in a crystal are all translations by one or more periodicity vectors. Their main difference from the rotation and reflection symmetries is the absence of a fixed point. We shall consider this case later. We shall first deal with those operations having a fixed point, referred to as point group symmetry operations.
If the three-dimensional periodicity holds, then only ten basic crystal symmetry operations with fixed points exist, which are the 1-, 2-, 3-, 4-, and 6-fold axes and the related rotation inversion axes (Figure 4.3). An inversion axis is the combination of the n-fold axis and the center of symmetry or inversion center. This inversion operation, sometimes called “reflection on a point” (see Figure 4.4), is equivalent to a rotation by 180° followed by reflection on a plane perpendicular to the rotation axis. The intersection of the twofold axis with the mirror plane is the inversion center and the only fixed point of this operation. Motifs related by a center of symmetry are said to be centrosymmetric. Whether a crystal structure is centrosymmetric or not plays an important role in phase determination, as we shall see in Chapter 7.
Figure 4.3: The ten basic symmetry operations, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
Figure 4.4: Illustration of an inversion center, representation kindly provided by Bergmann/Schäfer, Vol. 6, 1st edn. (1992), de Gruyter, Berlin.
The restriction to the above mentioned rotation axes follows immediately from the three-dimensional periodicity. Proof of this fundamental proposition can be given geometrically, and is illustrated in Figure 4.5. Suppose that N1 is the origin of an n-fold axis perpendicular to the plane of the paper. From the periodicity of the crystal, we must have the same n-fold axis at N2 if a is one lattice constant vector. If ϕ is the rotation angle, we get, by application of the rotation axis at N2, a further lattice point at N3, which is also the origin of an n-fold axis. This axis then produces the next lattice point at N4 which, in accordance with the translational concept of the crystal lattice, must
be at a distance d from N1 which is an integer multiple of a = |a|. From trigonometry, we get a = d + 2a cos ϕ. If d = ma, it follows that a = ma + 2a cos ϕ, or
1 – m = 2 cos ϕ.    (4.1)
Since |cos ϕ| ≤ 1, the integer M = 1 – m must satisfy the inequality |M| ≤ 2. Then M can only have the five values ±2, ±1, 0. With these values we get from (4.1) the five possible rotation angles which are specific for 1-, 2-, 3-, 4-, and 6-fold axes (see Table 4.1).
Figure 4.5: Restriction on 1-, 2-, 3-, 4-, and 6-fold axes.
Table 4.1: Possible rotation axes.
M     m     ϕ [°]          n = 360/ϕ
–2    3     180            2
–1    2     120            3
0     1     90             4
1     0     60             6
2     –1    0 (≡ 360)      1
The five possible n-fold axes are shown in Figure 4.3(a)–(e). Now let us discuss the rotation-inversion axes. A one-fold axis followed by an inversion center is just the inversion center itself [Figure 4.3(f)]. A 2-fold inversion axis results in a well-known symmetry element, the mirror plane. Its operation is illustrated in Figure 4.3(g). A rotation by 180° of the original point 1, followed by an inversion, leads to the final position 2, a reflection of position 1. Note that while a 2-fold inversion axis is identical to a mirror plane, the 3-, 4-, and 6-fold inversion axes illustrated in Figure 4.3(h), (i) and (j) are new operations and not equivalent to any introduced previously. (However, a 6-fold inversion axis can also be described by a 3-fold axis combined with a mirror plane perpendicular to the axis, 6̄ = 3/m.) Certain descriptors are assigned to each symmetry operation. N-fold axes are denoted by 1, 2, 3, 4, 6; the three and more fold inversion axes by 3̄, 4̄, 6̄. The 1- and 2-fold inversion axes, which are identical to a center of symmetry and to a mirror plane, are generally denoted 1̄ and m, although sometimes the symbols i for the inversion center and 2̄ for the mirror plane are also used.
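The restriction expressed by equation (4.1) and Table 4.1 can be checked numerically. The following sketch simply enumerates the integers M = 1 – m with |M| ≤ 2 and derives the rotation angle and the order n for each. The variable names are illustrative, and the 5-fold axis is added as a counterexample that is not part of the table.

```python
import math

# Enumerate the integer values M = 1 - m = 2*cos(phi) allowed by |cos(phi)| <= 1.
allowed = {}
for M in range(-2, 3):
    m = 1 - M
    phi = math.degrees(math.acos(M / 2))
    # phi = 0 corresponds to the identity, i.e. a full turn of 360 degrees.
    n = 1 if phi == 0 else round(360 / phi)
    allowed[M] = (m, round(phi), n)

for M, (m, phi, n) in allowed.items():
    print(f"M = {M:2d}, m = {m:2d}, phi = {phi:3d}°, n = {n}")

# A 5-fold axis would need 2*cos(72°) to be an integer, which it is not.
five_fold = 2 * math.cos(math.radians(72))
print(f"2*cos(72°) = {five_fold:.3f}")
```

The loop yields exactly the orders 1, 2, 3, 4 and 6, while 2 cos 72° is not an integer, so a 5-fold axis is incompatible with the lattice translation.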
4.1.3 Crystal classes and related coordinate systems
The example in Figure 4.1(b) shows that a crystal need not contain only one symmetry element; in general more will be present. We noticed the existence of a 2-fold axis and a mirror plane, and now that the inversion center has been introduced we see that this symmetry element is also present in the example of Figure 4.1(b). The rule that every symmetry operation transforms a crystal to a position which is indistinguishable from the initial state is also true for a combination of two symmetry operations. It can be shown that the set of all possible symmetry operations of a motif has the properties of a mathematical group. For this reason, a systematic discussion of crystallographic symmetry theory is provided by group theoretical methods. With these methods, it can be shown that due to the translational restrictions in a crystal, the combination of basic symmetry operations is restricted to 32 possibilities.
These 32 symmetry groups, each containing a finite number of basic symmetry operations, are called the crystal classes. A crystal class which is commonly found is that named 2/m (say two over m), consisting of a 2-fold axis and a mirror plane oriented perpendicular to the axis. For illustration of this symmetry, let us use the “baby head” model in Figure 4.6, which nicely illustrates the front and back side of a motif. Before starting any symmetry operation it may have position 1. From group theoretical aspects it is necessary to include the identity as one symmetry operation in every crystal class (we had already included the identity as the 1-fold axis in the basic elements). Now let the identity be symmetry element I, the 2-fold axis II, and the mirror plane III. Then we get position 2 by application of II and positions 3 and 4 by application of the reflection. These two symmetry operations automatically produce a further one, an inversion. We have an additional symmetry element, IV, a center of symmetry at the intersection of the 2-fold axis and the mirror plane. Thus the crystal class 2/m contains four symmetry elements: (1) the identity, (2) the 2-fold axis, (3) the mirror plane, (4) the center of symmetry. The symmetry in this crystal class is said to be 4-fold, since every point is present in four equivalent positions. Looking again at our starting example in Figure 4.1(b), we see that this was a crystal belonging to crystal class 2/m.
Figure 4.6: Symmetry elements in the crystal class 2/m.
Another possibility of combining a 2-fold axis and a mirror plane is obtained by having the axis in the mirror plane. The complete set of symmetric positions for this combination is shown in Figure 4.7. Let 1 be the identical position. By rotation about the 2-fold axis, we get position 2. Reflection of 1 and 2 on m leads to positions 3 and 4. A second mirror plane m′ is produced, perpendicular to the first and having the 2-fold axis as the intersection of the two mirror planes. It follows that a crystal class 2m defined by a 2-fold axis and a parallel mirror plane is identical to a crystal class mm2, defined by two perpendicular mirror planes and a 2-fold axis.
Figure 4.7: A 2-fold axis in a mirror plane (crystal class mm2).
What happens if the 2-axis is replaced by a 2̄-axis in the examples discussed above? A crystal class 2̄/m is nonsense, of course, since 2̄ = m. In principle, 2̄m is possible, but it is the same crystal class as 2m. Examining Figure 4.7, application of 2̄ = m′ on 1 leads to 4. Then the reflection of 1 and 4 on m (having 2̄ in its plane) gives 2 and 3, and we get the same result as for 2m. Now we have the equality of three combinations: 2m = mm2 = 2̄m.
Table 4.2 gives the 32 crystal classes in two notations. One is that of Hermann and Mauguin, which is recommended by the International Union of Crystallography; the other is that of Schoenflies, frequently preferred in inorganic and solid state chemistry.
Table 4.2: The 32 crystal classes.
No.   Crystal system   Hermann and Mauguin*   Schoenflies*
01    Triclinic        1                      C1
02    Triclinic        1̄                      Ci
03    Monoclinic       2                      C2
04    Monoclinic       m                      Cs
05    Monoclinic       2/m                    C2h
06    Orthorhombic     222                    D2
07    Orthorhombic     mm2                    C2v
08    Orthorhombic     mmm                    D2h
09    Tetragonal       4                      C4
10    Tetragonal       4̄                      S4
11    Tetragonal       4/m                    C4h
12    Tetragonal       422                    D4
13    Tetragonal       4mm                    C4v
14    Tetragonal       4̄2m                    D2d
15    Tetragonal       4/mmm                  D4h
16    Trigonal         3                      C3
17    Trigonal         3̄                      C3i
18    Trigonal         32                     D3
19    Trigonal         3m                     C3v
20    Trigonal         3̄m                     D3d
21    Hexagonal        6                      C6
22    Hexagonal        6̄                      C3h
23    Hexagonal        6/m                    C6h
24    Hexagonal        622                    D6
25    Hexagonal        6mm                    C6v
26    Hexagonal        6̄2m                    D3h
27    Hexagonal        6/mmm                  D6h
28    Cubic            23                     T
29    Cubic            m3                     Th
30    Cubic            432                    O
31    Cubic            4̄3m                    Td
32    Cubic            m3m                    Oh
* For the crystal class symbols the nomenclatures of Hermann and Mauguin and of Schoenflies are in use. The Hermann and Mauguin symbols are the preferred ones in crystallography.
As previously noted, the symbols 1, 2, 3, 4, 6 are used for the corresponding n-fold axes; the symbols 1̄, 2̄, 3̄, 4̄, 6̄ for the inversion axes, and additionally m = 2̄ for mirror planes. A slash (/) between an axis and a mirror symbol indicates that the two symmetry elements are perpendicular to each other. If an axis is situated parallel to a mirror plane, a separation character between their symbols is omitted. So, as already shown in the examples, 2m indicates that the 2-fold axis is parallel to m, while 2/m is the symbol for a 2-fold axis perpendicular to m. The crystal classes are related to the basic symmetry operations which were characterized by the property of having a fixed point in space. Before expanding the symmetry concept by the addition of translational elements, we have to introduce the mathematical representation of symmetry operations and the implications of crystal symmetry for the selection of unit cell vectors. In Section 1.2.1 we introduced the fractional coordinates by eq. (1.4), which uniquely allow the representation of a point P in space by the vector r in terms of the unit cell vectors a, b, c:
r = xa + yb + zc.    (1.4)
We recall that the fractional coordinates x, y, z have no physical dimensions. If P is situated inside the chosen unit cell, its fractional coordinates have numerical values between 0 and 1. If P is transformed by any of the basic symmetry operations to an equivalent point P′, the question arises, what mathematical operation produces its vector r′? The solution is very simple. Every basic symmetry operation can be represented by a 3 × 3 matrix, the matrix elements depending on the choice of the unit cell vectors. This can be illustrated with an example. In Figure 4.8 we have the situation where two points P and P′ are related by a reflection on a mirror plane m. We could in principle choose any unit cell if it is in line with the periodicity of the crystal. However, from the special symmetry in this crystal, one choice of unit cell vectors is necessarily the most favorable, that is, two vectors in the mirror plane, the third one in a direction perpendicular to this plane. As shown in Figure 4.8, the representation of the symmetry transformed vector is very simple. For r′, transformed via the mirror plane, only its y-component changes its sign and we get r′ = xa – yb + zc. Using the representation of a vector by a column (see Appendix, Section A.1.4), with
\[ \mathbf{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \]
we get from the definition of a matrix product [Appendix, formula (A.2)]
\[ \mathbf{r}' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \mathbf{r}. \]
From this special choice of unit cell vectors, requesting two right angles between them, the relation between r and r′ is very simple and the elements of the transformation matrix are simple integer numbers, in some more general cases simple rational numbers.
Figure 4.8: Best choice of unit cell vectors if mirror symmetry exists.
If we consider a molecular crystal with no symmetry but the identity 1, as in Figure 4.1(a), we note that more than one possible unit cell can be chosen, all having the same volume. Three examples are drawn in Figure 4.9; obviously, none of the possible choices has any advantage over the others. It follows that the lattice constants a, b, c have no restrictions, and the angles α, β, γ between the unit cell vectors have no specific values. A crystal having a unit cell of that type is said to belong to the triclinic crystal system. Since it can be shown that the same arbitrary choice of unit cell can be made if no symmetry except an inversion center is present, we can say that a triclinic crystal system is characterized by the absence of any symmetry except the identity and a center of symmetry.
Figure 4.9: Possible choices of the unit cell in a triclinic crystal: All are consistent with the periodicity [1].
Let us now expand the symmetry versus unit cell consideration to a crystal with symmetry 2/m (see Figure 4.1(b) or 4.6). Again we could choose several unit cells, unless we want to represent the symmetry operations by simple arithmetic expressions. The 2-fold axis is oriented perpendicular to the mirror plane. Let us choose the origin of the unit cell at the intersection of the twofold axis and the mirror plane and let r be the vector of one atom of the molecule. Again one choice of unit cell vectors is necessarily the most favorable, that is, one vector in the direction of the 2-fold axis, the two others in the mirror plane, see Figure 4.10. The representation of the symmetry transformed vectors is very simple. For r′, transformed via the 2-fold axis, we obtain r′ = –xa + yb – zc, for r′′, transformed via the center of symmetry, r′′ = –xa – yb – zc, and for r′′′, transformed via the mirror plane, r′′′ = xa – yb + zc. (To obtain all symmetry-related vectors in the same unit cell as r, only pure translations by appropriate unit cell vectors are necessary.)
Figure 4.10: Favored unit cell choice in a crystal having the symmetry 2/m.
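The relations for r′, r′′ and r′′′ can be checked numerically by writing each operation of 2/m as a 3 × 3 matrix acting on fractional coordinates, with b along the 2-fold axis as chosen above. The sketch below is an illustration only: the test position and the helper names are hypothetical. It also confirms that the product of any two of the four operations is again one of the four, for example 2 followed by m gives the inversion.

```python
def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(s, r):
    """Apply the 3x3 matrix s to the fractional coordinate triple r."""
    return [sum(s[i][k] * r[k] for k in range(3)) for i in range(3)]


# The four operations of crystal class 2/m (unique axis b).
OPS = {
    "1": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],      # identity
    "2": [[-1, 0, 0], [0, 1, 0], [0, 0, -1]],    # 2-fold axis along b
    "i": [[-1, 0, 0], [0, -1, 0], [0, 0, -1]],   # center of symmetry
    "m": [[1, 0, 0], [0, -1, 0], [0, 0, 1]],     # mirror plane perpendicular to b
}


def name_of(matrix):
    """Return the label of a matrix if it is one of the four operations."""
    for name, s in OPS.items():
        if s == matrix:
            return name
    return None


r = [0.10, 0.25, 0.40]   # arbitrary fractional coordinates, for illustration
equivalents = {name: apply(s, r) for name, s in OPS.items()}

# Multiplication table: every product must again be an element of the class.
closure = {(p, q): name_of(matmul(OPS[p], OPS[q])) for p in OPS for q in OPS}

for name, pos in equivalents.items():
    print(name, pos)
print("2 * m =", closure[("2", "m")])
```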
In column representation we get,
\[ \mathbf{r}' = \begin{pmatrix} -x \\ y \\ -z \end{pmatrix}, \qquad \mathbf{r}'' = \begin{pmatrix} -x \\ -y \\ -z \end{pmatrix}, \qquad \mathbf{r}''' = \begin{pmatrix} x \\ -y \\ z \end{pmatrix} \]
or (from the definition of the matrix product),
\[ \mathbf{r}' = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \mathbf{r}, \qquad \mathbf{r}'' = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \mathbf{r}, \qquad \mathbf{r}''' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \mathbf{r}. \]
With these three matrices we have a mathematical representation of all symmetry operations in this crystal class, if we add the unit matrix for the identical operation. We can identify the four symmetry operations in the crystal class 2/m by the 3 × 3 matrices
\[ S(1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad S(2) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \]
\[ S(i) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad S(m) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
and this matrix representation permits the analytical calculation of symmetry-related positions. While another choice of unit cell vectors would also have resulted in a matrix representation, this is the simplest and the most suitable for this kind of symmetry. It is therefore customary for all crystallographers to choose the unit cell vectors as described above. From this special choice it follows that the lattice constants have some restrictions. Since two vectors have to be in a plane perpendicular to the third vector, two of the angles are restricted to 90°. Usually the axis in the direction of the 2-fold axis, called the unique axis, is designated b. Then only β differs from 90° and α = γ = 90°. A crystal with lattice constants of that kind is said to belong to the monoclinic system. The system is monoclinic when only one 2-fold axis or one mirror plane m is present, or if both are combined perpendicular to each other. With the other crystal classes, other restrictions due to symmetry are observed. We shall not explain this in detail, but only present the results. It was shown that seven coordinate systems, generally called crystal systems, are sufficient to describe all possible crystal symmetries. These seven systems, which are nothing more than special coordinate systems, are summarized in Table 4.3, together with the conditions for the lattice constants. In Table 4.2, which shows the crystal classes, an additional column indicates the crystal system for every crystal class.
Table 4.3: Crystal systems.
Name                            Conditions for lattice constants
Triclinic                       no restrictions for a, b, c, α, β, γ
Monoclinic                      no restrictions for a, b, c, β; α = γ = 90° (2nd setting)
                                no restrictions for a, b, c, γ; α = β = 90° (1st setting)
Orthorhombic                    no restrictions for a, b, c; α = β = γ = 90°
Tetragonal                      no restrictions for a and c, but a = b and α = β = γ = 90°
Hexagonal                       no restrictions for a and c, but a = b, α = β = 90°, γ = 120°
Trigonal (rhombohedral axes)    no restrictions for a, but a = b = c; no restrictions for α, but α = β = γ
Cubic                           no restrictions for a, but a = b = c, α = β = γ = 90°
There are several conventions which are used with these crystal systems. In the monoclinic system, the unique axis is generally denoted “b”. Then the non-right angle must be β. This choice is called the second setting, in contrast to the choice of c as the unique axis, which is denoted the first setting, but is used less frequently. In all other systems the axis of highest symmetry is named c, as in the tetragonal and hexagonal system. The reciprocal lattice constants as introduced in Section 1.2.2 are restricted in the same way as for the direct lattice. If we calculate, for instance, the reciprocals in a monoclinic system, we get, by the definition in eqs. (1.6),
\[ V = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = abc \sin\beta \]
\[ a^* = \frac{bc \sin\alpha}{V} = \frac{1}{a \sin\beta}, \qquad b^* = \frac{ca \sin\beta}{V} = \frac{1}{b}, \qquad c^* = \frac{ab \sin\gamma}{V} = \frac{1}{c \sin\beta} \]
\[ \cos\alpha^* = \frac{0 - 0}{\sin\beta} = 0, \quad \text{hence } \alpha^* = 90^\circ \]
\[ \cos\beta^* = -\cos\beta, \quad \text{hence } \beta^* = 180^\circ - \beta \]
\[ \cos\gamma^* = \frac{0 - 0}{\sin\beta} = 0, \quad \text{hence } \gamma^* = 90^\circ . \]
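The monoclinic shortcuts can be compared with the general triclinic expressions in a short numerical sketch. It assumes that eqs. (1.6) and (1.10) have the usual form a* = bc sin α / V and cos α* = (cos β cos γ – cos α)/(sin β sin γ), with cyclic permutations. The cell constants are invented for illustration and the function name is hypothetical.

```python
import math


def reciprocal_cell(a, b, c, alpha, beta, gamma):
    """Reciprocal cell constants from the general (triclinic) formulas.

    Lengths in Å give reciprocal lengths in 1/Å; angles are in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sa, sb, sg = (math.sin(math.radians(x)) for x in (alpha, beta, gamma))
    volume = a * b * c * math.sqrt(
        1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
    a_star = b * c * sa / volume
    b_star = c * a * sb / volume
    c_star = a * b * sg / volume
    alpha_star = math.degrees(math.acos((cb * cg - ca) / (sb * sg)))
    beta_star = math.degrees(math.acos((cg * ca - cb) / (sg * sa)))
    gamma_star = math.degrees(math.acos((ca * cb - cg) / (sa * sb)))
    return a_star, b_star, c_star, alpha_star, beta_star, gamma_star


# Illustrative monoclinic cell (values chosen for the example only).
a, b, c, beta = 10.0, 8.0, 12.0, 105.0
general = reciprocal_cell(a, b, c, 90.0, beta, 90.0)

# Monoclinic shortcuts derived in the text.
sin_beta = math.sin(math.radians(beta))
shortcut = (1 / (a * sin_beta), 1 / b, 1 / (c * sin_beta),
            90.0, 180.0 - beta, 90.0)

for g, s in zip(general, shortcut):
    print(f"{g:.6f}  {s:.6f}")
```

Both routes give the same six numbers up to floating-point rounding, which is a quick plausibility check when the shortcut formulas for a given crystal system are typed in by hand.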
The result of these calculations for all seven crystal systems is given in Table 4.4. A special case is the trigonal system, which is preferred to describe 3-fold axis symmetry. The unit cell vectors having the properties a = b = c and α = β = γ (not necessarily equal to 90°) are named rhombohedral. They form a unit cell as shown in Figure 4.11(a); the 3-fold axis lies along the body diagonal of the unit cell. In principle, the symmetry of a crystal belonging to the trigonal system can also be described in the hexagonal system, and for some symmetry combinations with 3- and 6-fold axes, a trigonal or a hexagonal system can be chosen. The orientation of the two types of unit cells is shown in Figure 4.11(b). The hexagonal c-axis coincides with the body diagonal of the rhombohedral cell.
Table 4.4: Cell volume and reciprocal lattice constants in terms of direct cell constants for various crystal systems.
Crystal system                  Non-restricted cell constants   Volume             Reciprocal cell constants
Triclinic                       a, b, c, α, β, γ                V = (see below)    a* = bc sin α / V; b* = ca sin β / V; c* = ab sin γ / V; α*, β*, γ* as given by equation (1.10) in Section 1.2.2
Monoclinic                      a, b, c, β                      V = abc sin β      a* = 1/(a sin β); b* = 1/b; c* = 1/(c sin β); β* = 180° – β
Orthorhombic                    a, b, c                         V = abc            a* = 1/a; b* = 1/b; c* = 1/c
Tetragonal                      a, c                            V = a²c            a* = 1/a; c* = 1/c
Hexagonal                       a, c                            V = a²c √3/2       a* = 2/(a√3); c* = 1/c; (γ* = 60°)
Trigonal (rhombohedral axes)    a, α                            V = (see below)    a* = 1/(a sin α sin α*); cos(α*/2) = 1/(2 cos(α/2))
Cubic                           a                               V = a³             a* = 1/a
In the triclinic system, V = abc √(1 – cos²α – cos²β – cos²γ + 2 cos α cos β cos γ).
In the trigonal system, V = a³ √(1 – 3cos²α + 2cos³α).
Figure 4.11: (a) Unit cell defined by rhombohedral unit cell vectors, (b) relation between rhombohedral and hexagonal cell.
The transformation between the unit cell vectors a, b, c (rhombohedral) and A, B, C (hexagonal) is given by
A = a – b
B = b – c
C = a + b + c
or in matrix notation
\[ \begin{pmatrix} \mathbf{A} \\ \mathbf{B} \\ \mathbf{C} \end{pmatrix} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \mathbf{a} \\ \mathbf{b} \\ \mathbf{c} \end{pmatrix}. \]
The transformation given above is said to belong to an obverse orientation of the rhombohedral axes relative to the hexagonal axes. Another orientation may be chosen, denoted as reverse orientation, by A = a – c, B = b – a, C = a + b + c. By convention, the obverse orientation is standard. The transformation of all other quantities such as reciprocal unit cell vectors, indices, etc., can be done as described in the Appendix, Section A.1.5.
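The obverse transformation can be tried out on the metric tensor G of the rhombohedral cell, whose elements are the scalar products of the cell vectors. If P is the transformation matrix above, the hexagonal metric tensor is P G Pᵀ. The following sketch uses an invented rhombohedral cell (a = 5 Å, α = 55°) and hypothetical helper names. It recovers A = B, α = β = 90°, γ = 120°, and shows that the hexagonal cell has three times the rhombohedral volume.

```python
import math

P_OBVERSE = [[1, -1, 0], [0, 1, -1], [1, 1, 1]]   # rows: A, B, C in terms of a, b, c


def metric_rhombohedral(a, alpha_deg):
    """Metric tensor (scalar products of cell vectors) for a = b = c, alpha = beta = gamma."""
    cos_alpha = math.cos(math.radians(alpha_deg))
    return [[a * a if i == j else a * a * cos_alpha for j in range(3)]
            for i in range(3)]


def transform_metric(p, g):
    """Return P G P^T, the metric tensor of the transformed cell."""
    pg = [[sum(p[i][k] * g[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[sum(pg[i][k] * p[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cell_from_metric(g):
    """Cell lengths and angles (degrees) from a metric tensor."""
    lengths = [math.sqrt(g[i][i]) for i in range(3)]

    def angle(i, j):
        return math.degrees(math.acos(g[i][j] / (lengths[i] * lengths[j])))

    return (*lengths, angle(1, 2), angle(0, 2), angle(0, 1))


a_rh, alpha_rh = 5.0, 55.0                  # illustrative rhombohedral cell
g_hex = transform_metric(P_OBVERSE, metric_rhombohedral(a_rh, alpha_rh))
A, B, C, al, be, ga = cell_from_metric(g_hex)

cos_a = math.cos(math.radians(alpha_rh))
v_rh = a_rh**3 * math.sqrt(1 - 3 * cos_a**2 + 2 * cos_a**3)   # trigonal volume, Table 4.4
v_hex = A * A * C * math.sqrt(3) / 2                          # hexagonal volume, Table 4.4

print(f"A = {A:.4f}, B = {B:.4f}, C = {C:.4f}")
print(f"alpha = {al:.2f}, beta = {be:.2f}, gamma = {ga:.2f}")
print(f"V_hex / V_rh = {v_hex / v_rh:.4f}")
```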
4.1.4 Translational symmetry, lattice types and space groups Until now the considered crystal symmetry operations had the property to have at least one “fixed point” in space. If we refrain from this restriction we can add a number of translational elements. However, since again the additional symmetry operations
4.1 Symmetry operations in a crystal lattice
83
have to be consistent with the three-dimensional periodicity of the crystal, they are restricted to a limited number as we shall see below. As shown in the last section, every basic symmetry operation has a matrix representation. The addition of a translational element can be expressed by a vector ⎛ t1 ⎞ t = ⎜ t2 ⎟ ⎜ ⎟ ⎝ t3 ⎠ creating a new situation where we now have to operate with symmetry operations of the form B=S+t (4.2) where S is one of the 3 × 3 matrices derived in the last section and t is a translational vector which complies with the crystal periodicity. (Note that (4.2) has to be regarded as a formal operator to act on a position vector r so that the execution of B should lead to a symmetry-related vector r′, given by r′ = Br = Sr + t. Only in this sense is (4.2) a reasonable expression. Regarded as a matrix equation, B would be undefined since S and t are of different types.) It can be shown that the set of all B operations forms a mathematical group, named the group of motions, M. The complete symmetry of any motif can be described by a special subgroup of M. No limitation can be given for the number of subgroups
Figure 4.12: Superposition of two lattices. Light and dark motifs displaced by a vector t.
of M unless we restrict ourselves to the consideration of crystal symmetry. For this case, it was deduced by Schoenflies and Fedorov that the number of subgroups of M necessary to describe the symmetry of a crystal lattice is limited to 230. These 230 subgroups are called space groups, and every crystal lattice has a symmetry which is described by one of the 230 space groups. Every space group is related to a crystal class, since it can be shown that the set of matrices in (4.2) needed for the description of its symmetry is equal to the set of matrices in one of the 32 crystal classes. In this sense, every space group belongs to a crystal class, or, we can say that every crystal class can be expanded to a number of space groups by addition of some translation vectors to its symmetry matrices. The expansion of the crystal classes to the space groups by addition of translational elements can easily be surveyed because it can be carried out in only two ways:
(1) Addition of overall translation vectors to all symmetry matrices of a crystal class, resulting in the so-called centered lattices.
(2) Addition of individual translation vectors to one or more symmetry matrices of a crystal class, resulting in the so-called glide planes and screw axes.
Centered Lattices: The first and simplest possibility for expanding a crystal class is the addition of one overall translation vector to all symmetry matrices of a crystal class. The geometrical result of such an operation is illustrated in Figure 4.12. Suppose this (two-dimensional) lattice is spanned by unit cell vectors a and b. We find, for every light motif, an equivalent shadowed motif, translated by a vector t = 1/2 (a + b). Thus every unit cell contains one additional shadowed motif in the center of the cell which can be obtained from a light one by a pure translation operation. A lattice having this property is said to be centered; non-centered lattices are called primitive.
As we see from Figure 4.12, this centered lattice can be regarded as the sum of two identical primitive lattices, displaced against each other by the translation vector t; the two primitive lattices are given by the light and the shadowed motifs. Note that in the example in Figure 4.12, it is not absolutely necessary to regard the lattice as centered. If the unit cell vectors a′= a and b′ = t are chosen, a primitive lattice is obtained. The reason for this is the low symmetry of this lattice, which has the identity as its only basic symmetry element. So it can alternatively be regarded as a (two-dimensional) triclinic lattice. Triclinic lattices permit an arbitrary choice of unit cells, so the construction of a centered lattice can be avoided. The situation changes fundamentally for a lattice with higher symmetry. Consider again a lattice with symmetry 2/m. The a–c plane is the mirror plane, and the 2-fold axis has the direction of b. The conventions of a monoclinic system require the choice of unit cell vectors as illustrated in the a–b plane shown in Figure 4.13(a). b is the monoclinic “unique axis” (second setting), and therefore perpendicular to a (and also to c). As we have already seen, the lattice symmetry generates for a given motif four equivalents I=1, II=2, III=i, and IV=m, which we display schematically in a common
large sphere around the origin. The addition of a vector t = 1/2 (a + b) to all these symmetry-related motifs leads to a second set of motifs at the center of the a–b plane. Again we can regard this two-dimensional lattice (which also holds for the three-dimensional case) as the sum of two primitive lattices displaced by the vector t = 1/2 (a + b), and again we have the possibility of avoiding the description of this lattice as centered by choosing another unit cell, for example, replacing b by b′ = t. But this unit cell does not comply with the rules of a monoclinic system, since the angle between b′ and a no longer has a value of 90°. Now we have two possibilities. Either we refrain from choosing the crystal system appropriate to the symmetry, or we make use of a centered lattice. Crystallographers decide for the latter, which means that in the example in Figure 4.13(a), the correct choices are a and b, resulting in a centered lattice, and not a and b′. In Figure 4.13(a) the translation vector t = 1/2 (a + b), which is responsible for this centering, points from the origin of the unit cell to the center of the a–b plane. Crystallographers call this plane the C-plane; the a–c and b–c planes are called the B- and A-planes, respectively. Lattices with centering of one plane are called face-centered, with a designation of the special face which is centered. That is, they are said to be A-, B-, or C-centered. The lattice in Figure 4.13(a) is then C-centered. It is common practice to make use of a symbolic representation [see Figure 4.13(b)], where all motifs related by non-translational operations are collected in one single point in the eight equivalent corners of the unit cell, while the motifs obtained by pure translation are represented as a single point at the top of the translational vector. 
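The argument about the alternative cell choice b′ = t can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical rectangular a–b net (a = 5 Å, b = 7 Å); the helper names are illustrative only.

```python
import numpy as np

# Hypothetical rectangular a-b net of a monoclinic cell (lengths in Angstrom)
a = np.array([5.0, 0.0])
b = np.array([0.0, 7.0])
t = 0.5 * (a + b)          # C-centering vector t = 1/2 (a + b)
b_prime = t                # alternative primitive choice b' = t

def cross2(u, v):
    """Signed area of the parallelogram spanned by two 2D vectors."""
    return u[0] * v[1] - u[1] * v[0]

def angle_deg(u, v):
    """Angle between two vectors in degrees."""
    cos_uv = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(cos_uv))

area_centered = abs(cross2(a, b))
area_primitive = abs(cross2(a, b_prime))
angle_a_bprime = angle_deg(a, b_prime)
print(area_primitive / area_centered, angle_a_bprime)
```

The cell spanned by a and b′ has half the area of the centered cell, so it is primitive, but the angle between a and b′ is no longer 90°, which is why the centered description is preferred.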
A face-centered lattice like the monoclinic C-lattice is called doubly primitive, since it can be derived from two primitive lattices and an overall displacement vector transforming every point in one lattice to its equivalent in the second. Generalizing this concept, we can define centered lattices of higher multiplicity by n primitive lattices together with n-1 displacement vectors. Fortunately, this general aspect needs
Figure 4.13: (a) Two-dimensional representation of a centered lattice with symmetry 2/m, the four motifs corresponding to this symmetry are displayed schematically by their symbols; (b) symbolic representation of a C-centered monoclinic unit cell.
not be considered, since there are only two more centered lattice types present in crystals. These are those with all faces centered, designated F, and the body-centered lattices, designated I. An F-centered lattice is quadruply primitive. In addition to every lattice point, we get three further points by translation vectors pointing to the centers of all faces. An I-lattice is doubly primitive, with one translation vector pointing to the body center of the unit cell. The analytical representations of the displacement vectors are, for the F-lattice,
$$\mathbf{t}_1 = \begin{pmatrix} 1/2 \\ 1/2 \\ 0 \end{pmatrix}, \qquad \mathbf{t}_2 = \begin{pmatrix} 0 \\ 1/2 \\ 1/2 \end{pmatrix}, \qquad \mathbf{t}_3 = \begin{pmatrix} 1/2 \\ 0 \\ 1/2 \end{pmatrix}$$
and for the I-lattice,
$$\mathbf{t} = \begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \end{pmatrix}.$$
A prominent example of an F-centered lattice is the NaCl structure, which was, as mentioned in the Introduction, the first crystal structure ever reported, see Figure 4.14. The relatively large Cl− anions arrange in a cubic F-centered lattice. The smaller Na+ cations fill the voids, also forming a cubic F-centered lattice. The two contributing lattices are displaced against each other by a translation (1/2, 0, 0). The space group is cubic, $Fm\bar{3}m$. It should be noted here that a triply primitive lattice can be observed in a hexagonal lattice. But it can be replaced by a primitive one, if the lattice is described in the rhombohedral system. With this reservation, we can summarize: The addition of an overall translation vector to the symmetry operations known so far leads to three possible types of centered lattices. Two of them, the face-centered lattices (with one centered face, A, B or C) and the body-centered lattices, are doubly primitive, and the last, called all face-centered, is quadruply primitive.
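The NaCl description above can be reproduced with exact fractions. In this sketch the Cl− ions are placed at the origin of the F-lattice, which is an assumption made for illustration (the text only fixes the relative displacement of the two lattices); the helper `translate` is hypothetical.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)
# Origin plus the three F-centering vectors t1, t2, t3 from the text
F_CENTERING = [(Fr(0), Fr(0), Fr(0)), (HALF, HALF, Fr(0)),
               (Fr(0), HALF, HALF), (HALF, Fr(0), HALF)]

def translate(point, t):
    """Add translation t to a fractional coordinate and wrap it back into the cell."""
    return tuple((p + s) % 1 for p, s in zip(point, t))

# Assumption: Cl at the origin; Na is displaced by (1/2, 0, 0)
cl_sites = {translate((Fr(0), Fr(0), Fr(0)), t) for t in F_CENTERING}
shift = (HALF, Fr(0), Fr(0))
na_sites = {translate(site, shift) for site in cl_sites}

# The Na set is mapped onto itself by every centering vector,
# i.e. the Na ions form an F-centered lattice as well.
na_is_f_centered = all({translate(s, t) for s in na_sites} == na_sites
                       for t in F_CENTERING)
print(sorted(na_sites), na_is_f_centered)
```

Each ion type occupies four sites per cell, in line with the quadruply primitive character of the F-lattice.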
Figure 4.14: An example of an F-centered lattice: The NaCl structure [1], illustration kindly provided by Bergmann/ Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
A detailed study of the possibilities for centering in all crystal systems shows that crystals can have only 14 different types of lattices. Since Bravais (1850) was the first to discover this, these 14 lattice types are called the Bravais lattices. We show the symbolic representations of the 14 lattice types in Figure 4.15, and give a brief discussion here.
Figure 4.15: The 14 Bravais lattice types, illustration kindly provided by Bergmann/ Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
In the triclinic crystal system, only primitive lattices, denoted P, are necessary, as has already been demonstrated in the example in Figure 4.12. It is conventional to avoid centered lattices whenever possible, unless the choice of a primitive cell is inconsistent with the orientation of the symmetry axes in the unit cell. Following this convention, triclinic cells will only be primitive. In the monoclinic system, it is sufficient to describe all centerings by two types: a primitive or a C-centered lattice. An A-centered lattice, which is equivalent to C-centering, is avoided by choosing the non-unique axis vectors a and c in a way that the centered face will be a–b instead of b–c. A B-centered monoclinic cell can be transformed into a primitive monoclinic cell of half the volume. This transformation is illustrated in Figure 4.16(a), looking at the a–c plane, which is sufficient to consider, because the unique axis vector b is not affected. If a, b, c are the unit cell vectors of the B-centered cell, we obtain three unit cell vectors A, B, C by the transformation
$$\mathbf{A} = \tfrac{1}{2}(\mathbf{a} - \mathbf{c}), \qquad \mathbf{B} = \mathbf{b}, \qquad \mathbf{C} = \tfrac{1}{2}(\mathbf{a} + \mathbf{c})$$
representing a primitive monoclinic lattice, since the angles between A and B and between C and B are still 90°, and the angle between A and C is generally not a right angle. The problem of showing the redundancy of I- and F-lattices in the monoclinic system is left as an exercise to the reader.
Figure 4.16: (a) Transformation of a monoclinic B-centered cell to a primitive cell, illustrated in the a–c plane; (b) transformation of a tetragonal F-centered cell to an I-centered cell, illustrated in the a–b plane. Dark circles represent points at z = 0, the light circles obtained from the F-centering are all at z = 1/2. Choosing A = 1/2(a + b), B = 1/2(b – a), C = c, we transform into a tetragonal I-lattice.
The orthorhombic crystal system allows all four lattice types (P, C, I and F). For the face-centered lattice, the convention is the same as in the monoclinic case. The choice of unit cell vectors is such that the centered face is the a–b plane. In the tetragonal system, there are only two lattice types, the primitive and the body-centered. One example of reduction of other types to one of these two representatives is demonstrated in Figure 4.16(b), which shows the transformation between tetragonal F- and I-lattices. In the hexagonal and trigonal system, only primitive lattices are present. However, it must be noted that a centering in the hexagonal system can only be avoided by transforming a triply primitive hexagonal lattice into a trigonal system. Figure 4.17 shows a lattice of that type. We have three types of lattice points: those drawn as small full spheres, having components in the c-direction of 0 or 1, those drawn as larger full spheres, having z-components 1/3, and those drawn as open spheres, having z-components 2/3. The translation vectors in the sense of (4.2) are
$$\mathbf{t}_1 = \tfrac{1}{3}\mathbf{a} + \tfrac{2}{3}\mathbf{b} + \tfrac{2}{3}\mathbf{c}, \qquad \mathbf{t}_2 = \tfrac{2}{3}\mathbf{a} + \tfrac{1}{3}\mathbf{b} + \tfrac{1}{3}\mathbf{c}$$
or, written as columns,
$$\mathbf{t}_1 = \begin{pmatrix} 1/3 \\ 2/3 \\ 2/3 \end{pmatrix}; \qquad \mathbf{t}_2 = \begin{pmatrix} 2/3 \\ 1/3 \\ 1/3 \end{pmatrix}.$$
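The two cell transformations of Figure 4.16 can be checked through their determinants, since the number of lattice points per cell scales with the cell volume. The sketch below uses the transformation matrices stated in the text and in the figure caption; the function name is hypothetical.

```python
import numpy as np

# Rows express the new vectors (A, B, C) in terms of the old ones (a, b, c).
MONOCLINIC_B_TO_P = np.array([[0.5, 0.0, -0.5],    # A = 1/2 (a - c)
                              [0.0, 1.0, 0.0],     # B = b
                              [0.5, 0.0, 0.5]])    # C = 1/2 (a + c)
TETRAGONAL_F_TO_I = np.array([[0.5, 0.5, 0.0],     # A = 1/2 (a + b)
                              [-0.5, 0.5, 0.0],    # B = 1/2 (b - a)
                              [0.0, 0.0, 1.0]])    # C = c

def points_in_new_cell(points_in_old_cell, transform):
    """Lattice points per cell scale with the cell volume, i.e. with |det(transform)|."""
    return points_in_old_cell * abs(np.linalg.det(transform))

n_monoclinic = points_in_new_cell(2, MONOCLINIC_B_TO_P)   # B-centered cell holds 2 points
n_tetragonal = points_in_new_cell(4, TETRAGONAL_F_TO_I)   # F-centered cell holds 4 points
print(n_monoclinic, n_tetragonal)
```

Both determinants are 1/2, so the B-centered monoclinic cell becomes primitive (one lattice point) and the F-centered tetragonal cell becomes doubly primitive, consistent with an I-lattice.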
Figure 4.17: Unit cell transformation of a triply primitive hexagonal lattice to a primitive rhombohedral lattice [1].
The transformation into the rhombohedral cell allows the description of this lattice as primitive. However, note that here the centering does not disappear by choosing another cell in the same crystal system, but only by changing to another system. The notation of lattice types in the hexagonal and trigonal system is as follows. Primitive lattices in both systems are indicated by a P. A lattice which is triply primitive in the hexagonal system, but primitive in the rhombohedral system, is denoted R. Finally, the cubic system allows P-, I-, and F-lattices. A lattice with only one face centered (A, B or C) is impossible, since all axial directions are equivalent because of the symmetry in this system.
Glide Planes and Screw Axes: The second (and also the last) subject to be discussed is the consequence of adding an individual translation vector to a basic symmetry operation. This can be done briefly, since the results are easy to formulate. In a crystal lattice there are just two possibilities of combining a basic symmetry operation with an individual translation vector. The first is the addition of a displacement to a reflection, the displacement vector necessarily having a direction parallel to the mirror plane. A symmetry element of that kind is designated a glide plane. The second is the addition of a displacement to a rotation, the displacement vector necessarily having a direction parallel to the rotation axis. A symmetry element of that kind is called a screw axis. We have already seen that the components of the overall displacement vectors could not have arbitrary values, but were restricted to integer fractions of the unit cell vectors. The same holds for glide planes and screw axes. Usually glide planes are parallel to one of the unit cell planes, either a–b, a–c or b–c (there are only a few exceptions in the highly symmetric crystal systems). In Figure 4.18, we have drawn a glide plane parallel to the a–c plane of a monoclinic cell.
Since we have already pointed out that the glide vector t must be parallel to the mirror plane, here the a–c plane, it follows that t = αa + βc.
Figure 4.18: Glide planes in a monoclinic cell, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
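α and β can have only the values 0 and 1/2, and we have three possibilities
$$\mathbf{t}(a) = \tfrac{1}{2}\mathbf{a} \tag{4.3a}$$
$$\mathbf{t}(c) = \tfrac{1}{2}\mathbf{c} \tag{4.3b}$$
$$\mathbf{t}(a,c) = \tfrac{1}{2}(\mathbf{a} + \mathbf{c}). \tag{4.3c}$$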
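Why α and β are restricted to 0 and 1/2 can be seen by applying a glide operation twice: the reflection cancels, so the doubled glide vector must be a lattice translation. A minimal sketch in the operator form r′ = Sr + t of (4.2) follows; the glide plane is assumed to pass through the origin, and `apply` and `compose` are hypothetical helper names.

```python
from fractions import Fraction as Fr

def apply(op, r):
    """Apply r' = S r + t to a position r (fractional coordinates, not wrapped)."""
    S, t = op
    return tuple(sum(S[i][k] * r[k] for k in range(3)) + t[i] for i in range(3))

def compose(op2, op1):
    """Operator equivalent to applying op1 first, then op2: (S2 S1, S2 t1 + t2)."""
    S2, t2 = op2
    S1, t1 = op1
    S = tuple(tuple(sum(S2[i][k] * S1[k][j] for k in range(3)) for j in range(3))
              for i in range(3))
    t = tuple(sum(S2[i][k] * t1[k] for k in range(3)) + t2[i] for i in range(3))
    return S, t

# Reflection in the a-c plane (y -> -y), combined with the glide vectors (4.3a-c)
MIRROR_B = ((1, 0, 0), (0, -1, 0), (0, 0, 1))
GLIDES = {
    "a": (MIRROR_B, (Fr(1, 2), 0, 0)),
    "c": (MIRROR_B, (0, 0, Fr(1, 2))),
    "n": (MIRROR_B, (Fr(1, 2), 0, Fr(1, 2))),
}

r = (Fr(1, 10), Fr(1, 5), Fr(3, 10))          # arbitrary test position
image = apply(GLIDES["c"], r)
# Translation part of each glide operation applied twice
squares = {name: compose(op, op)[1] for name, op in GLIDES.items()}
print(image, squares)
```

In all three cases the squared operation is a pure translation by whole unit cell vectors (a, c, or a + c), which is consistent with the crystal periodicity.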
Glide planes are called axial if the glide component is parallel to an axial direction; they are called diagonal if the glide component is diagonal. In the axial case, they are designated by the character indicating the glide direction. In the diagonal case, they are designated by n. The three possible glide planes are shown in Figure 4.18; two axial of types a (with t = t(a)) and c (with t = t(c)), and one diagonal of type n (with t = t(a, c)). There is a further type of glide plane known as the diamond glide d, whose name derives from its occurrence in the diamond structure. The translation vector is given by a quarter of a diagonal, i.e., it is of the form 1/4(a + b), 1/4(a + c), 1/4(b + c), or 1/4(a + b + c). Screw axes exist for every allowed rotation axis. Their classification is very simple. If the repetition vector in the axis direction is R, we can derive from every n-fold axis (n – 1) screw axes, having the translational components (1/n) R, (2/n) R, ..., ((n – 1)/n) R (n = 2, 3, 4, 6). This proposition is also valid for n = 1, but there it has no practical meaning. The symbolic notation is $n_m$ if the axis is n-fold and the translation vector is (m/n) R. Figure 4.19 shows all possible 2-fold and 3-fold screw axes. Two properties should be pointed out: (1) Usually the rotation axis has the direction of one unit cell vector, say c. Then for an $n_m$-screw axis, the translation vector is t = (m/n) c or, written as a column,
$$\mathbf{t} = \begin{pmatrix} 0 \\ 0 \\ m/n \end{pmatrix}. \tag{4.4}$$
It is possible for 3-, 4-, and 6-fold screw axes that with multiple applications of the screw axis operation the translation component exceeds the repetition period. In a 3₂-axis, for example, the first rotation by 120° is accompanied by a translation of (2/3)c, producing the point 2. The second rotation by 120° needs a further translation of (2/3)c; that means 3′ is already translated by (4/3)c, which is c + (1/3)c. Since c is the repetition period, 3′ is the equivalent of point 3 (having t = (1/3)c) in the initial cell. The same situation is observed for all other screw axes for which this problem arises. (2) Let us compare all equivalent points produced by a 3₁- and a 3₂-axis (Figure 4.19). It is clear that the points of the 3₁-axis are related by an anticlockwise rotation of 120° each, plus translation by 1/3 in the c-direction. The points produced by the 3₂-axis operation can be regarded as rotated by 120° clockwise, and then translated by (1/3)c. Thus the two axes differ only in their direction of rotation. In other words, the motifs produced by these two screw axes are mirror-image related; they behave like left- and right-hand screws. Motifs having these properties are said to be enantiomorphous. Enantiomorphy plays an important role when dealing with the structures of optically active compounds. An enantiomorphic relationship exists also for the 4₁- and 4₃-axes and for two pairs of 6-fold screw axes, the 6₁–6₅ pair and the 6₂–6₄ pair.
With the introduction of the additional translational symmetry elements no further expansion needs to be considered and we can summarize the crystal symmetry concepts (Figure 4.20).
(1) From the five rotation axes possible in a crystal lattice, we get, by addition of their corresponding inversion axes, the 10 basic symmetry operations.
Figure 4.19: 2- and 3-fold screw axes, illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
Figure 4.20: Crystal symmetry derivation scheme.
(2) All 10 basic operations can be expressed analytically by 3 × 3 matrix representations. The matrix elements become simple rational numbers if the unit cell vectors are chosen in accordance with one of the seven crystal systems.
(3) With the restrictions resulting from the three-dimensional periodicity of crystals, 32 combinations of the basic operations are possible. These 32 sets of basic symmetry operations, each forming a mathematical group, are the so-called 32 crystal classes. Each crystal class can be represented by a finite set of 3 × 3 matrices corresponding to the basic operations actually present in this class.
(4) The introduction of translational elements results in the fourteen possible Bravais lattices on the one hand and in two new types of symmetry elements, the glide planes and the screw axes, on the other hand. Every symmetry element of this expanded set can be represented by a 3 × 3 matrix and a three-dimensional column vector. 230 subsets, each forming a mathematical group, can be deduced to describe the complete symmetry of a crystal lattice. These groups are called the 230 space groups. Since the set of matrices needed in the matrix-vector description in one space group is identical to the set of matrices of one crystal class, it can be said that every space group belongs to a crystal class, or that every crystal class has an expansion to a certain number of space groups.
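The matrix-plus-vector description of item (4) can be illustrated for the two 3-fold screw axes discussed above. The sketch assumes hexagonal axes, in which an anticlockwise 3-fold rotation about c maps (x, y, z) to (−y, x − y, z); the starting point and the function name are hypothetical.

```python
from fractions import Fraction as Fr

# 3-fold rotation about c in hexagonal axes: (x, y, z) -> (-y, x - y, z)
ROT3 = ((0, -1, 0), (1, -1, 0), (0, 0, 1))

def screw_orbit(m, start, n=3, rot=ROT3):
    """Points generated by an n_m screw axis along c, with z reduced into [0, 1)."""
    points = []
    r = start
    for _ in range(n):
        points.append((r[0], r[1], r[2] % 1))
        rotated = tuple(sum(rot[i][k] * r[k] for k in range(3)) for i in range(3))
        r = (rotated[0], rotated[1], rotated[2] + Fr(m, n))   # add t = (m/n) c
    return points

start = (Fr(1, 5), Fr(1, 10), Fr(0))     # arbitrary general position
orbit_31 = screw_orbit(1, start)
orbit_32 = screw_orbit(2, start)
print([p[2] for p in orbit_31])
print([p[2] for p in orbit_32])
```

For the 3₂-axis the third point, translated by (4/3)c, is reduced to height 1/3. The point found at height 1/3 is thus the one rotated by 240° (that is, 120° clockwise), whereas for the 3₁-axis it is the one rotated by 120° anticlockwise.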
For a better understanding of the derivation of space groups, let us discuss as an example the expansion of the monoclinic crystal class 2/m to its six possible space groups. The symmetry elements of this crystal class are: (1) the identity, 1, (2) the 2-fold axis, 2, (3) the mirror plane, m, perpendicular to the 2-fold axis, (4) the inversion center, i. Let us do these expansions step by step.
(1) P2/m: This space group is a very simple one. It is obtained by addition of no translational elements, i.e., the first space group contains the same symmetry elements as the crystal class itself. The first capital character always denotes the Bravais lattice type, which is primitive in this case. We get new space groups by replacing the 2-fold axis with a screw axis, or the mirror plane with a glide plane (or both), and we have the opportunity of introducing a C-centering. No other centering is allowed in the monoclinic system.
(2) P2₁/m: The 2-fold axis can be replaced only by one screw axis, a 2₁-axis. This leads to the space group P2₁/m.
(3) P2/c: We proceed with the replacement of the mirror plane by a glide plane. Since the standard choice of the glide direction is c, we get the space group P2/c. In the case of an a- or an n-glide plane, the space group would be called P2/a or P2/n, but together with P2/c all three are regarded as one space group, since they all contain the same symmetry elements.
(4) P2₁/c: Replacing both the mirror plane and the 2-fold axis by a glide plane and a screw axis, we get the well-known space group P2₁/c. This is by far the most frequently occurring space group, especially for organic structures. Because of the great frequency of space group P2₁/c, everyone working on crystal structures should be well experienced with its symmetry. It is evident that instead of a c-glide plane, an a- or n-glide plane can be chosen in this space group. Then we get as symbols P2₁/a or P2₁/n.
(5) C2/m: This space group is obtained by replacing the primitive lattice by a C-centered one, or in other words, by addition of an overall translation vector. The mirror plane and the 2-fold axis are not affected.
(6) C2/c: The last space group, derived from the crystal class 2/m and designated C2/c, consists of a C-centered lattice, a 2-fold axis and a c-glide plane. Note that the presence of a 2-fold axis, together with a C-centering, automatically produces 2₁-screw axes. Therefore this space group could also be written as C2₁/c. The same is true for the space group C2/m, which is equivalent to C2₁/m.
All the possibilities of replacing one or more basic symmetry elements in 2/m by those with translational elements have now been utilized. We have derived the six space groups belonging to the crystal class 2/m, and no more can be derived from this class.
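As an illustration of how a space group arises from a few operators of the form (4.2), the following sketch generates the operations of P2₁/c from a 2₁-screw axis and the inversion center. The operators are written in the standard setting with unique axis b as listed in the International Tables (−x, y + 1/2, −z + 1/2 for the screw axis); the function names are hypothetical, and translations are reduced modulo lattice vectors.

```python
from fractions import Fraction as Fr
from itertools import product

def compose(op2, op1):
    """(S2, t2) after (S1, t1), with the translation reduced modulo lattice vectors."""
    (S2, t2), (S1, t1) = op2, op1
    S = tuple(tuple(sum(S2[i][k] * S1[k][j] for k in range(3)) for j in range(3))
              for i in range(3))
    t = tuple((sum(S2[i][k] * t1[k] for k in range(3)) + t2[i]) % 1 for i in range(3))
    return S, t

def generate_group(generators):
    """Close a set of operators under composition (modulo lattice translations)."""
    ops = set(generators)
    while True:
        new = {compose(p, q) for p, q in product(ops, repeat=2)} - ops
        if not new:
            return ops
        ops |= new

ZERO, HALF = Fr(0), Fr(1, 2)
SCREW_21_B = (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (ZERO, HALF, HALF))  # -x, y+1/2, -z+1/2
INVERSION = (((-1, 0, 0), (0, -1, 0), (0, 0, -1)), (ZERO, ZERO, ZERO))   # -x, -y, -z

p21c = generate_group([SCREW_21_B, INVERSION])
print(len(p21c))
```

The closure contains four operations: the identity, the 2₁-screw axis, the inversion, and the c-glide plane x, −y + 1/2, z + 1/2, which appears automatically as the product of the two generators.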
Before finishing the description of crystal symmetry, it should be noted that questions concerned with crystal symmetry and space group representation are discussed in detail in the “International Tables for X-Ray Crystallography, Vol. A.” Illustrated descriptions of each of the 230 space groups are given, with all the symbols and official abbreviations used for every symmetry operation. All questions which are not discussed here are surely described in the “Tables”. The subject of the International Tables is not only crystal symmetry, but they also present in several volumes all important aspects of crystallography in detail. The first issue appeared already in the 1930s, a second issue in the 1960s, while at present the following volumes are available, either as print versions or online, see also the IUCr website http://it.iucr.org: Volume A, Space group symmetry, 2006 edition; Volume A1, Symmetry relations between space groups, 2006, 2011 edition; Volume B, Reciprocal Space, 2006, 2010 edition; Volume C, Mathematical, physical and chemical tables, 2006 edition; Volume D, Physical properties of crystals; 2006 edition; Volume E, Subperiodic groups, 2006, 2010 edition; Volume F, Crystallography of biological macromolecules, 2006, 2012 edition; Volume G, Definition and exchange of crystallographic data, 2006 edition. (Source: http://it.iucr.org) [2]. Everyone working on crystal structure analysis makes use of the “Tables”. They are a tool no crystallographer should dispense with. They must be available in each lab where crystallographic work is carried out.
4.2 Crystal symmetry and related intensity symmetry
As shown in Section 2.2, the diffraction pattern of a single crystal consists of discrete intensities which can be uniquely related to the integer lattice planes. Using this provision the basic formulae of diffraction theory can be transformed into expressions which take this property into account.
4.2.1 Representation of ρ and F as Fourier series
First the argument b in the structure factor expression F(b) as derived in Section 2.1, formula (2.2), for arbitrary materials can be replaced by a reciprocal lattice vector h giving rise to the nonzero diffraction spots. Then the equations relating ρ and F by Fourier transforms are
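$$F(\mathbf{h}) = \int_V \rho(\mathbf{r})\, e^{2\pi i\,\mathbf{h}\mathbf{r}}\, dV \tag{4.5}$$
$$\rho(\mathbf{r}) = \int_{V^*} F(\mathbf{h})\, e^{-2\pi i\,\mathbf{h}\mathbf{r}}\, dV^* \tag{4.6}$$
with
$$\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*, \qquad h, k, l \ \text{integers} \tag{4.7}$$
and
$$\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}. \tag{4.8}$$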
The argument h of F indicates that F now has to be considered only for lattice vectors satisfying condition (4.7). In practical work, unit cell vectors a, b, c and a*, b*, c* as used in (4.7) and (4.8) are no longer chosen arbitrarily, but will have to be in agreement with the conditions of one of the seven crystal systems given in Table 4.3. ρ(r), which has to be considered only in one unit cell because of its three-dimensional periodicity, can be written in terms of the electron density of the contributing atoms (Figure 4.21). Let ρj(r) be the electron density of the jth atom with the position vector r referred to an origin at the atomic center. The problem is that the electron density distribution of an atom is not precisely known. For chemical reasons, it is evident that ρj depends on both the atom type and its bonding state. Let us suppose for the moment that ρj(r) is known; then we get for the electron density of N atoms with position vectors rj in the unit cell
Figure 4.21: Representation of ρ(r) in terms of the ρj′s, illustration kindly provided by Bergmann/ Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
$$\rho(\mathbf{r}) = \sum_{j=1}^{N} \rho_j(\mathbf{r} - \mathbf{r}_j).$$
If fj(h) is the Fourier transform of ρj, we get from the “Shift-Theorem” [see Appendix, equation (A.24)],
$$F(\mathbf{h}) = \mathcal{F}(\rho(\mathbf{r})) = \sum_{j=1}^{N} f_j(\mathbf{h})\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j}.$$
The integral representation for F(h) in (4.5) is now replaced by a series representation, which is more convenient in numerical calculations. The problem remaining is the quantity fj(h), which must be calculated from the ρj(r). In crystallography it is customary to make the approximation that the ρj′s are spherical, i.e. ρj does not depend on the vector r, but only on its magnitude r = |r|, hence
$$\rho_j \approx \rho_j(r). \tag{4.9}$$
This implies that fj also has spherical symmetry. Since |h| is proportional to s = sin θ/λ (Bragg’s law, see equation (2.9) in Section 2.2), we get
$$f_j = f_j(s), \qquad s = \frac{\sin\theta}{\lambda} \tag{4.10}$$
and finally for F(h),
$$F(\mathbf{h}) = \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j}. \tag{4.11}$$
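The series (4.11) translates directly into a short routine. The following is a minimal sketch for a hypothetical two-atom cell; the scattering factors are constants chosen for illustration only, not tabulated values (real fj depend on s = sin θ/λ), and the function name is hypothetical.

```python
import cmath

def structure_factor(hkl, atoms):
    """F(h) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)), cf. (4.11)."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical cell: (f_j, fractional coordinates); f values are illustrative constants
atoms = [
    (6.0, (0.10, 0.20, 0.30)),
    (8.0, (0.60, 0.20, 0.30)),
]

F_000 = structure_factor((0, 0, 0), atoms)
F_100 = structure_factor((1, 0, 0), atoms)
F_200 = structure_factor((2, 0, 0), atoms)
print(abs(F_000), abs(F_100), abs(F_200))
```

Because the two atoms are separated by exactly a/2, their contributions are in antiphase for (100) and in phase for (200), so |F| changes from the difference to the sum of the two scattering factors.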
(Usually the argument s for the fj is omitted.) Several models have been used to express ρj(r) analytically and thence to derive fj(s). These calculations are based on the atomic model at rest, so that the fj corresponds to the scattering power of the stationary atom. The fj-curves in terms of s obtained by various calculations (Hartree, Fock and several other authors) are tabulated in the International Tables, Vol. C [3] for all elements and a large number of elemental ions. The fj′s are called atomic scattering factors. For example, the atomic scattering factor curves for H, C, O and O⁻ are drawn in Figure 4.22. They all start at s = 0 with a value equal to the number of electrons (the atomic number for a neutral atom), and decrease monotonically, approaching zero for s > 1.1. Coefficients for the analytical representations of scattering factors are also found in the International Tables, Vol. C [4]. These expressions are advantageous in computer calculations and are stored in most of the existing programs. The spherical approximations for ρj(r) and fj(s) result in two important consequences: (1) The assumption of a spherical atomic electron density distribution is denoted as the so-called independent atom model (IAM) approximation. It has in principle
Figure 4.22: Examples of atomic scattering factor curves, fj(s) [e] vs. s =sinθ/λ [Å–1].
severe disadvantages. If, for example, an atom in a chemical structure is covalently bonded, it is certain that its electron density in bonding or other preferred regions (such as lone pair regions) does not adopt spherical symmetry. However, this disadvantage is generally accepted, because then the expression (4.10) for fj is also spherical and is rather easy to express. In the IAM the maximum in the spherical electron density of an atom is usually interpreted as the atomic position. Therefore, although we are concerned with the electron density function, we determine nothing but the atomic positions rather than the real density distribution. If precise results of an X-ray analysis show electron density details in non-spherical directions, the IAM has to be discarded for an analysis of these details and replaced by a model which can describe aspherical properties. One model of that type is the multipole model [5], which we shall discuss briefly in Section 9.4.1. (2) The magnitude of F [see (4.11)] depends on the magnitude of the fj’s. Since they decrease with increasing s, there is a general trend for the F’s to decrease with s. Therefore, reflections with large sinθ/λ-values will usually tend to have weak intensities, which is observed on the film exposures shown in Section 3.2.2, for example. Furthermore, for s >> 1.1, the fj(s) curves approach zero. Reflection intensities for these s-values are generally too weak to be observed. For MoKα radiation with λ ≈ 0.71 Å, for example, we have for θ = 45° (sinθ ≈ 0.71), s = sinθ/λ ≈ 1. Therefore, reflections with θ > 45° have intensities which are usually near zero and can only be measured under special conditions. Even below this angle, reflections are very weak
because of the small fj(s)-values and an additional damping from thermal motion effects (see below), so that in general, reflections can only be observed with MoKα radiation to a θ-limit of 25–30°. The situation is different for CuKα radiation with a wavelength λ = 1.54 Å. Even a maximum sinθ = 1 leads to sinθ/λ ≈ 0.65. It follows that for CuKα radiation, all reflections up to the limit given by the Ewald sphere are likely to be observed and should be measured. Transformation into a discrete series representation can also be obtained for ρ(r). Since reflections are discrete, we can replace the integral by a sum in (4.6). Then dV* reduces to ∆V*. Thus, ∆V* = V* = 1/V with V* the volume of the reciprocal unit cell, since F exists only for lattice points of type (4.7). Then we get
$$\rho(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} F(\mathbf{h})\, e^{-2\pi i\,\mathbf{h}\mathbf{r}}$$
or, with hr = hx + ky + lz [see Appendix, eq. (A.14)],
$$\rho(\mathbf{r}) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l} F(hkl)\, e^{-2\pi i (hx + ky + lz)}. \tag{4.12}$$
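The interplay of (4.11) and (4.12) can be tried out in one dimension. The sketch below assumes a hypothetical one-dimensional cell with two point-like atoms and constant scattering powers; structure factors are computed for |h| ≤ 10 and then summed back to a density on a grid. All numerical values are illustrative.

```python
import numpy as np

# Hypothetical one-dimensional "structure": (scattering power f, fractional coordinate x)
atoms = [(6.0, 0.2), (8.0, 0.7)]
cell_length = 1.0                      # plays the role of V in one dimension
h_values = np.arange(-10, 11)          # truncated series, |h| <= 10

# Structure factors, cf. (4.11) with constant f
F = np.array([sum(f * np.exp(2j * np.pi * h * x) for f, x in atoms) for h in h_values])

# Fourier synthesis, cf. (4.12): rho(x) = (1/V) * sum_h F(h) * exp(-2*pi*i*h*x)
x_grid = np.arange(100) / 100.0
phases = np.exp(-2j * np.pi * np.outer(x_grid, h_values))
rho = (phases @ F).real / cell_length

x_max = x_grid[np.argmax(rho)]
print(x_max, rho.max())
```

The synthesized density has its highest maximum at the position of the heavier atom and a second maximum at the position of the lighter one, illustrating how the maxima of ρ are interpreted as atomic positions. The finite peak width and the ripples between the peaks are a consequence of truncating the series.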
4.2.2 Thermal motion, displacement parameters
It was already mentioned that the atomic scattering factors correspond to the stationary atom model, so that formula (4.11) for the structure factor is valid only for the atoms at rest. In the crystal, the atoms always execute thermal and zero-point vibrations about their rest points, so that a correction has to be applied to the structure factor expression. As was shown by Debye as early as 1914, the thermal motion of each atom can be taken into account if the atomic scattering factor f for the stationary atom is replaced by the scattering factor fT for the vibrating atom, of the form
$$f_T = f\, e^{-(B \sin^2\theta)/\lambda^2}. \tag{4.13}$$
The quantity B, denoted as the Debye–Waller factor, is related to the atomic vibration by
$$B = 8\pi^2 U = 8\pi^2 u^2 \tag{4.14}$$
where u is the root-mean-square amplitude of the atomic vibration. u then has the dimension of a length, B and U that of the square of a length. Since λ is usually given in Angstroms, the condition that the exponent in (4.13) has no dimension requires the dimensions of Å for u and Å² for B and U.
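To get a feeling for the size of the correction, (4.13) and (4.14) can be evaluated for a few values of s = sin θ/λ. The value U = 0.05 Å² used below is a hypothetical example, and the function names are illustrative.

```python
import math

def b_from_u(u_iso):
    """B = 8 * pi^2 * U, cf. (4.14); U in Angstrom^2."""
    return 8.0 * math.pi ** 2 * u_iso

def thermal_damping(u_iso, s):
    """Factor exp(-B * s^2) with s = sin(theta)/lambda in 1/Angstrom, cf. (4.13)."""
    return math.exp(-b_from_u(u_iso) * s * s)

U_ISO = 0.05                 # hypothetical value in Angstrom^2
for s in (0.0, 0.3, 0.6, 1.0):
    print(s, round(thermal_damping(U_ISO, s), 4))
```

The factor equals 1 at s = 0 and falls off with increasing s, so thermal motion weakens the high-angle reflections in addition to the decrease of the fj(s) themselves.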
The description of thermal motion by a single parameter U is based on the assumption of an isotropic vibration of the atom, which means that the atomic motion is equal in all directions in space and that the volume indicating the mean sojourn probability is a sphere (see Figure 4.23). The quantities B or U are then said to be isotropic temperature factors. It is clear that this simple model can only be an approximate description of the atomic motion. It is not reasonable that the vibration in the bond direction is the same as that normal to the bond. Therefore a more complicated thermal motion than the isotropic assumption will generally take place. An improved description is obtained by introducing anisotropic temperature factors. This is done as follows: For each reflection h we get for the scalar product hh = |h|², from Bragg’s equation,
$$|\mathbf{h}|^2 = \frac{4 \sin^2\theta}{\lambda^2}.$$
Substituting this in (4.13), we get $f_T = f\, e^{-T_{iso}}$ with
$$T_{iso} = \frac{B\,|\mathbf{h}|^2}{4} = \frac{1}{4}\left(B h^2 a^{*2} + B k^2 b^{*2} + B l^2 c^{*2} + 2 B hk\, a^* b^* \cos\gamma^* + 2 B hl\, a^* c^* \cos\beta^* + 2 B kl\, b^* c^* \cos\alpha^*\right). \tag{4.15}$$
Replacing the isotropic B in each summand by the tensor components Bij (i, j = 1, 2, 3), we get
$$T_{aniso} = \frac{1}{4}\left(B_{11} h^2 a^{*2} + B_{22} k^2 b^{*2} + B_{33} l^2 c^{*2} + 2 B_{12} hk\, a^* b^* \cos\gamma^* + 2 B_{13} hl\, a^* c^* \cos\beta^* + 2 B_{23} kl\, b^* c^* \cos\alpha^*\right) \tag{4.16a}$$
Figure 4.23: Description of the isotropic and anisotropic vibrational behavior of an atom by a sphere or an ellipsoid [6].
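or, in terms of Uij = Bij/(8π²),

$$ T_{aniso} = 2\pi^2 \left( U_{11} h^2 a^{*2} + U_{22} k^2 b^{*2} + U_{33} l^2 c^{*2} + 2 U_{12} h k\, a^* b^* \cos\gamma^* + 2 U_{13} h l\, a^* c^* \cos\beta^* + 2 U_{23} k l\, b^* c^* \cos\alpha^* \right). \tag{4.16b} $$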
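As a check on the anisotropic exponent, the following sketch evaluates the Uij form of Taniso given above and compares it with the isotropic exponent B⎪h⎪²/4 = 2π²U⎪h⎪² for the special case in which all six Uij are set equal to U. The function names, the argument layout and the reciprocal cell values are assumptions made for this example only.

```python
import math

def h_squared(hkl, recip):
    """|h|^2 from the indices and the reciprocal cell (a*, b*, c*, alpha*, beta*, gamma*)."""
    h, k, l = hkl
    ast, bst, cst, al, be, ga = recip
    ca, cb, cg = (math.cos(math.radians(x)) for x in (al, be, ga))
    return (h * h * ast * ast + k * k * bst * bst + l * l * cst * cst
            + 2 * h * k * ast * bst * cg
            + 2 * h * l * ast * cst * cb
            + 2 * k * l * bst * cst * ca)

def t_aniso(hkl, u, recip):
    """Anisotropic exponent in the U_ij form; u = (U11, U22, U33, U12, U13, U23) in A^2."""
    h, k, l = hkl
    u11, u22, u33, u12, u13, u23 = u
    ast, bst, cst, al, be, ga = recip
    ca, cb, cg = (math.cos(math.radians(x)) for x in (al, be, ga))
    return 2.0 * math.pi ** 2 * (
        u11 * h * h * ast * ast + u22 * k * k * bst * bst + u33 * l * l * cst * cst
        + 2 * u12 * h * k * ast * bst * cg
        + 2 * u13 * h * l * ast * cst * cb
        + 2 * u23 * k * l * bst * cst * ca)

# Hypothetical reciprocal cell (lengths in 1/A, angles in degrees)
recip = (0.12, 0.09, 0.07, 85.0, 100.0, 95.0)
u_iso = 0.03
t_a = t_aniso((2, -1, 3), (u_iso,) * 6, recip)
t_i = 2.0 * math.pi ** 2 * u_iso * h_squared((2, -1, 3), recip)
print(abs(t_a - t_i) < 1e-12)
```

With this convention, in which the cosine terms are written explicitly, the isotropic case corresponds to six equal Uij, which is why the two exponents agree.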
The probability volume is now an ellipsoid with the Uii expressing the mean square amplitudes of vibration axes and the Uij (i ≠ j) representing the ellipsoid orientation (Figure 4.23). Although a more exact description of thermal motion requires a more complicated model for its correct representation, the ellipsoid model is found in practice to be a good compromise. On the one hand it allows the description of an anisotropic behavior of the vibrating atom, and on the other hand it keeps the number of parameters to an acceptable limit. Note that the transition from isotropic to anisotropic temperature parameters needs the introduction of five more parameters for each atom. More general models for the representation of thermal motion, such as the so-called Gram–Charlier expansion [7], require substantially more parameters and are applied only in very exceptional cases. Another expression for the anisotropic exponent which is frequently used is

$$ T_{aniso} = \beta_{11} h^2 + \beta_{22} k^2 + \beta_{33} l^2 + 2\beta_{12} h k + 2\beta_{13} h l + 2\beta_{23} k l. \tag{4.16c} $$

A comparison with (4.16b) shows that the β’s are defined by

$$ \begin{aligned} \beta_{11} &= 2\pi^2 U_{11} a^{*2}, & \beta_{12} &= 2\pi^2 U_{12}\, a^* b^* \cos\gamma^*, \\ \beta_{22} &= 2\pi^2 U_{22} b^{*2}, & \beta_{13} &= 2\pi^2 U_{13}\, a^* c^* \cos\beta^*, \\ \beta_{33} &= 2\pi^2 U_{33} c^{*2}, & \beta_{23} &= 2\pi^2 U_{23}\, b^* c^* \cos\alpha^*. \end{aligned} \tag{4.18} $$
(Note that frequently a definition of the Uij is used which includes the cosine terms for the Uij elements with i ≠ j, i.e. U12, U13, and U23 are replaced by U12′ = U12 cosγ*, U13′ = U13 cosβ*, U23′ = U23 cosα*.) With the introduction of temperature factors, we get for the structure factor

$$ F(\mathbf{h}) = \sum_{j=1}^{N} f_j\, e^{-T_j}\, e^{2\pi i \mathbf{h}\mathbf{r}_j} \tag{4.19} $$
with Tj either the isotropic expression (4.15) or the anisotropic expression (4.16). To avoid the long listing of the six anisotropic thermal parameters of each atom in publications, it has become common practice to quote instead a single parameter, the so-called equivalent isotropic temperature factor Ueq [8], which reads [9]

$$ U_{eq} = \frac{1}{3}\left( U_{11} a^2 a^{*2} + U_{22} b^2 b^{*2} + U_{33} c^2 c^{*2} + 2 U_{12}\, a b\, a^* b^* \cos\gamma + 2 U_{13}\, a c\, a^* c^* \cos\beta + 2 U_{23}\, b c\, b^* c^* \cos\alpha \right), \tag{4.20} $$

and which reduces for orthogonal crystal systems to

$$ U_{eq} = \frac{1}{3}\left( U_{11} + U_{22} + U_{33} \right). \tag{4.20a} $$
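Formula (4.20) is easily evaluated once the reciprocal axis lengths are known. The sketch below computes them from the direct cell using the standard relations a* = bc sinα/V etc. (an assumption of this example, since the reciprocal cell is introduced elsewhere in the text) and then forms Ueq. The function name and the numerical values are hypothetical.

```python
import math

def u_eq(u, cell):
    """Equivalent isotropic parameter following (4.20).

    u    = (U11, U22, U33, U12, U13, U23) in A^2
    cell = (a, b, c, alpha, beta, gamma), lengths in A, angles in degrees
    """
    u11, u22, u33, u12, u13, u23 = u
    a, b, c, al, be, ga = cell
    al, be, ga = (math.radians(x) for x in (al, be, ga))
    ca, cb, cg = math.cos(al), math.cos(be), math.cos(ga)
    # Unit cell volume and reciprocal axis lengths (standard relations)
    vol = a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    ast = b * c * math.sin(al) / vol
    bst = a * c * math.sin(be) / vol
    cst = a * b * math.sin(ga) / vol
    return (u11 * (a * ast) ** 2 + u22 * (b * bst) ** 2 + u33 * (c * cst) ** 2
            + 2.0 * u12 * a * b * ast * bst * cg
            + 2.0 * u13 * a * c * ast * cst * cb
            + 2.0 * u23 * b * c * bst * cst * ca) / 3.0

# Hypothetical orthorhombic example: the result equals (U11 + U22 + U33)/3
print(round(u_eq((0.02, 0.03, 0.04, 0.01, 0.005, -0.002),
                 (5.0, 6.0, 7.0, 90.0, 90.0, 90.0)), 6))
```

For an orthogonal cell the products a·a*, b·b* and c·c* are all equal to one and the cosine terms vanish, which reproduces the simplified form (4.20a).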
Figure 4.24 shows a very popular illustration of a molecular structure obtained from an X-ray analysis, which is found in most publications in that field. It is an ORTEP representation [6] (shown for the example of cyclobutane) where the atoms are displayed by their corresponding ellipsoids. This type of graphical representation was introduced already in the 1960s by C.K. Johnson [10] and is still today one of the most frequently used graphics software for the representation of X-ray structures. It shows not only the molecular geometry but it presents in addition some information about the atomic vibrational behavior. It allows the choice of a certain probability level to find an atom in the given volume. In most representations a 50 % probability is chosen. Figure 4.24 shows a further aspect, which led to the replacement of the notation “thermal parameters” by “displacement parameters” several years ago [12]. The X-ray analysis of non-substituted cyclobutane was carried out to examine whether the molecule is planar or not. Since it was located on a crystallographic mirror plane, it seemed that an absolute planar conformation in the crystal was confirmed. However, a closer look at the ellipsoids in Figure 4.24 indicated that two opposite carbon atoms show extensions of their ellipsoids perpendicular to the molecular mirror plane. This allows two interpretations: The first is to assume that the molecule carries out “butterfly-like” vibrations perpendicular to the molecular plane with respect to a diagonal of the four membered ring. The second interpretation is a small disorder of the two atoms in question in the sense of a displacement from the mirror plane being too small for getting properly resolved. This would lead to a slightly nonplanar molecule.
Figure 4.24: ORTEP representation of the structure of cyclobutane. The carbon atoms are plotted at a 50 % probability level, the hydrogen atoms as spheres at arbitrary radius [11]. Illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
In any case the ellipsoids represent either thermal motion effects or small displacements from an equilibrium position or a mixture of both. Since effects of this type are frequently observed, it was decided to replace the designation of the quantities U, Ueq and Uij by the more general expressions isotropic and anisotropic displacement parameters. We shall make use of this notation in the following.

Note that the exponential term in (4.13) causes a further damping of the f(s) curves with increasing s. Therefore the tendency of high-order reflections to have weak intensities is amplified. Crystals consisting of atoms with low vibration occur mostly with inorganic compounds, which give intensity data to high sinθ/λ values. Crystals of organic compounds usually consist of more strongly vibrating atoms. In consequence, low temperature measurements give better results for those crystals in which reflection intensities are not observable at medium sinθ/λ values at room temperature.

Figure 4.25 represents some quantitative considerations for the example of the scattering factor curve of carbon. For B = 0 the damping is due only to the decrease of the scattering factor itself. With increasing B-values, which were chosen from averages of structure analyses at the given temperatures, the scattering factor curves become considerably more damped. For example, at sinθ/λ = 0.6 Å⁻¹, which is the normally obtained resolution, the room temperature curve has lost more than 80 % against the B = 0 curve and at sinθ/λ = 1.0 Å⁻¹ the room temperature curve is practically zero. If the diffraction experiment is carried out at 100 K, which is nowadays frequently done, the loss at sinθ/λ = 0.6 Å⁻¹ is smaller, but still amounts to 60 %. It follows that reduction of temperature during the diffraction experiment has a strong favorable influence on the reflection intensity at medium and high orders in sinθ/λ. Some details of low temperature experiments are discussed in Section 9.1.

Figure 4.25: Scattering factor curves of carbon multiplied by a number of exponential expressions as in (4.13) to illustrate the influence of different B-values, scattering factors in e, s = sinθ/λ in Å⁻¹.
4.2.3 Intensity symmetry, asymmetric unit

With the expression (4.11) for F (the alternative use of (4.19) would not affect the following considerations), we can transform the symmetry properties of the unit cell to those of F and with that to the intensities. The first general symmetry property of F, however, does not depend on any crystal symmetry. Since in the expression (4.11) for F(h) the scattering factors fj are spherical, hence do not depend on h, but only on ⎪h⎪, it can be seen easily from (4.11) that

F(–h) = F*(h). (4.21a)

It follows that

I(h) ~ F(h) F*(h) = F(h) F(–h)
I(–h) ~ F(–h) F*(–h) = F(–h) F(h)

hence

I(h) = I(–h). (4.21b)
This important property, which expresses the fact that X-ray diffraction is always centrosymmetric, is called Friedel’s law. It follows immediately that the number of independent reflections is reduced to one half of the limiting sphere. For example, the rule of thumb derived in Section 2.2 that the copper limiting sphere includes 9 × V reflections can now be modified to state that the number of independent reflections is at most 4.5 × V in the general asymmetric case. It should be noted already here that conditions exist where Friedel’s law is not properly satisfied. We shall discuss this case, which has some influence on the intensity symmetry considerations, at the end of this chapter. So, for the moment we assume that Friedel’s law is valid.

The triclinic space groups P1 and P1̄ have only the identity and the inversion center as symmetry elements. This inversion center does not cause further intensity symmetry, so for the triclinic system the half limiting sphere is the so-called asymmetric unit of reflections, that is, it includes the number of independent reflections. In practical intensity measurements, this means that only non-negative values need be considered for one of the indices hkl.

Although it has no influence on intensity symmetry, the existence of an inversion center in P1̄ has an important consequence on another property of F, i.e. the phase. Since for every atom with its position vector r = (x, y, z), the corresponding centrosymmetric vector r′ = (-x, -y, -z) is present, (4.11) reduces to

$$ F_{cent}(\mathbf{h}) = \sum_{j=1}^{N/2} f_j \left( e^{2\pi i \mathbf{h}\mathbf{r}_j} + e^{-2\pi i \mathbf{h}\mathbf{r}_j} \right). $$

With Euler’s formula 2 cosϕ = e^(iϕ) + e^(–iϕ) we get

$$ F_{cent}(\mathbf{h}) = 2 \sum_{j=1}^{N/2} f_j \cos 2\pi \mathbf{h}\mathbf{r}_j. \tag{4.22} $$
Instead of being a complex number, F is now a real number, i.e. the phase problem, which in the complex representation F = ⎪F⎪ eiϕ includes the problem of determining ϕ for all possible values from 0 to 2π, now reduces to a “sign problem”. For centrosymmetric structures, it has to be determined whether the signs of the F’s are “plus” or “minus”. The important property of centrosymmetric structures, that the structure factors are real, can also be illustrated graphically (see Figure 4.26). Note that it follows from (4.21a) for centrosymmetric structure factors that F(–h) = F(h).
(4.23)
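That the structure factor of a centrosymmetric arrangement is real can be verified with a few lines of code. The following sketch sums the exponential form of the structure factor over a small set of hypothetical atoms and their inversion-related partners and compares the result with the cosine sum obtained from Euler’s formula. The scattering factors and coordinates are invented for illustration.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(h) = sum_j f_j exp(2 pi i h.r_j); atoms = [(f, (x, y, z)), ...]."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical independent atoms (constant f, fractional coordinates)
half = [(6.0, (0.10, 0.20, 0.30)), (8.0, (0.35, 0.15, 0.40))]
# Add the inversion-related atoms r' = (-x, -y, -z)
full = half + [(f, (-x, -y, -z)) for f, (x, y, z) in half]

hkl = (1, 2, 3)
f_exp = structure_factor(hkl, full)
f_cos = 2.0 * sum(f * math.cos(2.0 * math.pi * (hkl[0] * x + hkl[1] * y + hkl[2] * z))
                  for f, (x, y, z) in half)
print(abs(f_exp.imag) < 1e-9, abs(f_exp.real - f_cos) < 1e-9)
```

Only the sign of the real number remains to be determined, which is the “sign problem” mentioned in the text.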
As the first example of a crystal symmetry element having influence on intensity symmetry, let us look at the mirror plane. Consider a crystal in the monoclinic system with the mirror plane in the x-z plane. Then every atom with its position vector r = (x, y, z) has its equivalent in r′ = (x, -y, z). It follows for the structure factor

$$ F_m(hkl) = \sum_{j=1}^{N/2} f_j \left( e^{2\pi i (h x_j + k y_j + l z_j)} + e^{2\pi i (h x_j + k(-y_j) + l z_j)} \right) $$

and

$$ F_m(h\,{-k}\,l) = \sum_{j=1}^{N/2} f_j \left( e^{2\pi i (h x_j - k y_j + l z_j)} + e^{2\pi i (h x_j - k(-y_j) + l z_j)} \right) $$

hence

Fm(hkl) = Fm(h –k l). (4.24)

Figure 4.26: Graphic representation of a centrosymmetric structure factor.
For a glide plane, translational elements are added only to x and z, and do not affect the calculation above. So, as a general rule, any mirror or glide plane in direct space causes a mirror plane in reciprocal space. A similar rule can be obtained for the n-fold axes, as represented by the following calculation for a 2-fold axis. Assuming again a monoclinic system with the 2-fold axis in the b-direction, the equivalent position vectors are r = (x, y, z) and r′ = (-x, y, -z). Then we get for the structure factor

$$ F_2(hkl) = \sum_{j=1}^{N/2} f_j \left( e^{2\pi i (h x_j + k y_j + l z_j)} + e^{2\pi i [h(-x_j) + k y_j + l(-z_j)]} \right). $$

We obtain the same result for negative h and l, i.e.

F2(hkl) = F2(-h k -l). (4.25)
A screw component does not affect this result and since we get analogous properties for the other rotation axes, a general rule can be given for n-fold axes: n-fold axes or screw axes in direct space cause n-fold axes in reciprocal space. These findings seem rather unfavorable for the moment, because from the intensity symmetry we cannot distinguish between mirror and glide planes on the one hand and axes and screw axes on the other hand. Fortunately we shall see in the next chapter that the so-called systematic extinctions will solve this problem.

As a representative example for all crystal systems, let us discuss the consequences of these two symmetry rules for the space groups in the monoclinic system. Here every space group contains either a 2-fold axis or a mirror plane (or a related symmetry element). Then one of the properties (4.24) or (4.25) is valid, holding also for the intensities. Since Friedel’s law is always satisfied, we get as additional conditions:
(a) if I(hkl) = I(h-kl), it follows that I(hkl) = I(-hk-l) because (-hk-l) = -(h-kl), or
(b) if I(hkl) = I(-hk-l), it follows that I(hkl) = I(h-kl) from Friedel’s law.
Since either (a) or (b) is valid, we get for the intensity symmetry in the thirteen monoclinic space groups,

I(hkl) = I(-h-k-l) = I(h-kl) = I(-hk-l). (4.26)

Table 4.5: The eleven Laue groups.

| Crystal System | Laue Group | Included Crystal Classes | Asymmetric Unit of Limiting Sphere |
|---|---|---|---|
| Triclinic | 1̄ | 1; 1̄ | one half sphere |
| Monoclinic | 2/m | 2; m; 2/m | one quadrant |
| Orthorhombic | mmm | 222; mm2; mmm | one octant |
| Tetragonal | 4/m | 4; 4̄; 4/m | one octant or less (*) |
| Tetragonal | 4/mmm | 422; 4mm; 4̄2m; 4/mmm | one octant or less (*) |
| Hexagonal | 6/m | 6; 6̄; 6/m | one octant or less (*) |
| Hexagonal | 6/mmm | 622; 6mm; 6̄m2; 6/mmm | one octant or less (*) |
| Trigonal | 3̄ | 3; 3̄ | one octant or less (*) |
| Trigonal | 3̄m | 32; 3m; 3̄m | one octant or less (*) |
| Cubic | m3 | 23; m3 | one octant or less (*) |
| Cubic | m3m | 432; 4̄3m; m3m | one octant or less (*) |

(*) See International Tables, Vol. I for structure factor expressions.
The only octants of the limiting sphere which have different intensities are those which differ in the sign of h or l. Thus the asymmetric unit of reflections in the monoclinic system is one quadrant, for instance that containing reflections of type (hkl) and (-hkl). The investigation of intensity symmetry in the monoclinic system has shown not only that all space groups of one crystal class have the same intensity symmetry, but that all three classes (2, m, 2/m) of this crystal system have the same intensity symmetry. This is not surprising when taking Friedel’s law into consideration, since these crystal classes differ only by a center of symmetry. This equivalence of intensity symmetry leads to a further classification among the crystal classes. All crystal classes having the same intensity symmetry, i.e. differing only by a center of symmetry in their symmetry elements, are said to belong to a “Laue group”. All 32 crystal classes can be classified by the eleven Laue groups which are listed in Table 4.5, together with their corresponding crystal classes. Since the X-ray diffraction from a single crystal gives the intensity symmetry and not the crystal symmetry, this equivalence in the intensity symmetry for the various space groups of the same Laue group is a disadvantage for space group determination. Before discussing this problem in further detail, let us conclude the question of the asymmetric reflection unit. In the orthorhombic system, we have only one Laue group, just as in the monoclinic and triclinic case. Similar considerations as above lead to the result that intensities are equal if their absolute values of indices are equal, i.e., I(hkl) = I(-hkl) = I(h-kl) = I(hk-l) = I(-h-kl) = I(-hk-l) = I(h-k-l) = I(-h-k-l).
(4.27)
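In data reduction, relations such as (4.26) and (4.27) are used to map every measured reflection onto one representative of its set of equivalents. The following sketch does this for the Laue groups 2/m (b unique) and mmm. The function names and the particular choice of representative (largest k, then largest l, then largest h) are assumptions of this example; any consistent choice would serve.

```python
def reduce_mmm(hkl):
    """Laue group mmm: intensities depend only on the absolute values of the indices."""
    h, k, l = hkl
    return (abs(h), abs(k), abs(l))

def equivalents_2m(hkl):
    """The four equivalent reflections in Laue group 2/m with b as unique axis."""
    h, k, l = hkl
    return [(h, k, l), (-h, -k, -l), (h, -k, l), (-h, k, -l)]

def reduce_2m(hkl):
    """Pick one representative with k >= 0 and l >= 0 (and h >= 0 if l = 0)."""
    return max(equivalents_2m(hkl), key=lambda t: (t[1], t[2], t[0]))

print(reduce_2m((1, -2, -3)))   # same representative as for (-1, 2, 3)
print(reduce_mmm((-1, 2, -3)))
```

Note that (hkl) and (-hkl) are mapped to different representatives in 2/m, in agreement with the statement that the monoclinic asymmetric unit is one quadrant containing both types.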
The asymmetric unit of reflections is thus one octant hkl, e.g. with h, k, and l all nonnegative. For the higher symmetric crystal systems, no general conditions can be given, since the intensity symmetry is described by more than one Laue group. The symmetry relations of F’s are given in the International Tables, Vol. I [13] for every space group. Until now we assumed that Friedel’s law [formula (4.21b)] was satisfied, but it was mentioned already that exceptions exist. We introduced the atomic scattering factors as the Fourier transforms of the atomic electron densities. In the independent atom model these scattering factors were real functions dependent only on s = sin θ/λ, so that all reflections with the same absolute value ⎪h⎪ have the same f. This simple representation of the scattering factors is only valid if the wavelength λ of the incident beam is significantly different from the wavelength λk of the K-absorption edge for an atom of the scattering material. If, however, λ ≈ λk, the scattering process shows an unusual behavior caused by an anomalous phase shift of the scattered wave. This effect, called anomalous dispersion, can be expressed analytically by replacing the real atomic scattering factor f by a complex quantity fA which is obtained from f, by inclusion of a real and an imaginary correction term, fA = f + ∆ f′ + i ∆ f ′′.
(4.28a)
The effect of anomalous dispersion increases with the wavelength of the incident radiation, and is usually significant only for those elements of the scattering material having an atomic number close to that of the target material of the X-ray tube. For most atoms, the corrections ∆f′ and ∆f′′ are tabulated in the International Tables, Vol. C [14], for various wavelengths. They are insensitive to s and in practice it is sufficient for most cases to use the values for s = 0 when such corrections are necessary. In practical X-ray work, anomalous dispersion is small for CuKα and even smaller for MoKα radiation for atoms having an atomic number less than 20. For very accurate analyses, or for the determination of absolute configurations, anomalous scattering factors should be used, preferably if heavier elements like sulfur or halogen atoms are present. For light atom organic structures, the effect is usually ignored. For much heavier atoms, the magnitudes of ∆f′ and ∆f′′ indicate whether it is worthwhile to correct the scattering factors for anomalous dispersion.

The effect of anomalous dispersion has important consequences on the structure factor expression. We shall see immediately that Friedel’s law no longer holds, because the scattering factors no longer have spherical symmetry, but are even complex quantities, see (4.28a). Let us write

$$ f_A = |f_A|\, e^{2\pi i \alpha}, \qquad 0 \le \alpha < 2\pi. \tag{4.28b} $$

Then we get the expression FA(h) for the structure factor, corrected for anomalous dispersion,

$$ F_A(\mathbf{h}) = \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i \alpha_j}\, e^{2\pi i \mathbf{h}\mathbf{r}_j} $$

or

$$ F_A(\mathbf{h}) = \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i (\alpha_j + \mathbf{h}\mathbf{r}_j)}. \tag{4.29} $$

For FA(−h) we get

$$ F_A(-\mathbf{h}) = \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i (\alpha_j - \mathbf{h}\mathbf{r}_j)} $$

and then

$$ F_A(\mathbf{h})\, F_A^*(\mathbf{h}) = \left( \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i (\alpha_j + \mathbf{h}\mathbf{r}_j)} \right) \left( \sum_{j=1}^{N} |f_{Aj}|\, e^{-2\pi i (\alpha_j + \mathbf{h}\mathbf{r}_j)} \right) $$

$$ F_A(-\mathbf{h})\, F_A^*(-\mathbf{h}) = \left( \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i (\alpha_j - \mathbf{h}\mathbf{r}_j)} \right) \left( \sum_{j=1}^{N} |f_{Aj}|\, e^{-2\pi i (\alpha_j - \mathbf{h}\mathbf{r}_j)} \right) $$
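Since these two expressions are generally different and since I(h) ~ F(h) F*(h), it follows that Friedel’s law is no longer valid in the presence of anomalous dispersion. However, for centric structures, we get for the structure factor expression

$$ \begin{aligned} F_A(\mathbf{h}) &= \sum_{j=1}^{N/2} |f_{Aj}|\, e^{2\pi i \alpha_j} \left( e^{2\pi i \mathbf{h}\mathbf{r}_j} + e^{-2\pi i \mathbf{h}\mathbf{r}_j} \right) \\ F_A(\mathbf{h}) &= \sum_{j=1}^{N/2} 2\, |f_{Aj}|\, e^{2\pi i \alpha_j} \cos 2\pi \mathbf{h}\mathbf{r}_j \end{aligned} $$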
Since the cosine is an even function, we get FA(h) = FA(−h) and hence I (h) = I (−h). Therefore, because of the anomalous dispersion, the structure factor of a centric structure is no longer real, but is a complex quantity, as for the acentric case. However, since FA(h) = FA(−h), Friedel’s law still holds. We can now summarize: If anomalous dispersion has to be considered, the structure factor is always a complex quantity; Friedel’s law is still valid for centric structures but not in the acentric case. An important and interesting application can be made from this last property. If the structure of a chiral compound is examined, its absolute configuration cannot be determined as long as Friedel’s law holds, as can easily be seen:
Figure 4.27: D- and L-glyceraldehyde; examples of a pair of enantiomorphic structures [6].
If we denote the structures of the enantiomorphic pairs as left-handed (L) and right-handed (R) (a prominent example of such a pair is illustrated in Figure 4.27), their atomic positions are related by

rj(R) = −rj(L) (j = 1, …, N). (4.30)

From formula (4.11) for the structure factor, we obtain

$$ F_R(\mathbf{h}) = \sum_{j=1}^{N} f_j\, e^{2\pi i \mathbf{h}\mathbf{r}_j(R)} = \sum_{j=1}^{N} f_j\, e^{-2\pi i \mathbf{h}\mathbf{r}_j(L)} = F_L^*(\mathbf{h}) = F_L(-\mathbf{h}). $$

Hence

⎪FR(h)⎪ = ⎪FL(h)⎪ and ϕR(h) = −ϕL(h). (4.31)
It follows that, if optically active compounds such as amino acids, carbohydrates, or any chiral natural products are investigated, the X-ray analysis gives no information whether the L or D form is present, provided that Friedel’s law holds. When Friedel’s law is no longer true, the effect of anomalous dispersion can be used to derive the correct absolute configuration. By using fA from (4.28b) instead of f, we get for the structure factors

$$ F_{AR}(\mathbf{h}) = \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i [\alpha_j + \mathbf{h}\mathbf{r}_j(R)]} = \sum_{j=1}^{N} |f_{Aj}|\, e^{2\pi i [\alpha_j - \mathbf{h}\mathbf{r}_j(L)]} = F_{AL}(-\mathbf{h}). $$

Since Friedel’s law no longer holds, we get

$$ F_{AR}(\mathbf{h})\, F_{AR}^*(\mathbf{h}) = F_{AL}(-\mathbf{h})\, F_{AL}^*(-\mathbf{h}) \ne F_{AL}(\mathbf{h})\, F_{AL}^*(\mathbf{h}). $$

It follows that the intensities of right- and left-handed structures are not equal, and the differences can be used for a decision in favor of the correct absolute configuration.
It has to be noted that the question of absolute configuration frequently arises for light atom organic compounds which show only small anomalous dispersion. Nevertheless there are several examples in which the relatively small dispersion effect of oxygen with CuKα radiation (∆f′ = 0.049, ∆f′′ = 0.032) was sufficient to determine the absolute configuration. However, this work requires exceptionally precise intensity measurements and in this case Friedel-related pairs of reflections (h and –h) have to be measured. This means, for example, that for a triclinic structure, data of the entire limiting sphere instead of half of it are needed. The asymmetric unit of reflections given in Table 4.5 has to be complemented by the Friedel-related reflections, so that in general the amount of data increases by a factor of two. A criterion to indicate whether the absolute configuration has been successfully determined from a data set that includes Friedel pairs was introduced by Flack (Flack parameter) [15], see Section 8.2.3. The first successful application of anomalous dispersion for determining the absolute configuration was described by Bijvoet and coworkers [16]. From their famous investigation on a NaRb tartrate, the absolute configuration of tartaric acid was derived experimentally for the first time.
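The behavior of Friedel pairs discussed in this section can be illustrated with a small numerical experiment. The sketch below compares I(h) and I(−h) for a hypothetical acentric arrangement with real scattering factors, for the same arrangement with one complex (anomalous) scattering factor, and for a centric arrangement with anomalous scattering. It also checks that inverting the acentric structure exchanges I(h) and I(−h). All coordinates and scattering factor values, including the anomalous corrections, are invented for illustration and are not tabulated values.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(h) = sum_j f_j exp(2 pi i h.r_j); f_j may be complex."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

def intensity(hkl, atoms):
    """I(h) taken as |F(h)|^2 (scale factors omitted)."""
    return abs(structure_factor(hkl, atoms)) ** 2

def friedel_difference(hkl, atoms):
    """I(h) - I(-h) for one Friedel pair."""
    h, k, l = hkl
    return intensity((h, k, l), atoms) - intensity((-h, -k, -l), atoms)

def invert(atoms):
    """Apply r -> -r to every atom (enantiomorphic structure)."""
    return [(f, (-x, -y, -z)) for f, (x, y, z) in atoms]

light = [(6.0, (0.10, 0.20, 0.30)), (8.0, (0.35, 0.15, 0.40))]
heavy_pos = (0.25, 0.40, 0.05)
normal = light + [(16.0, heavy_pos)]                 # real f only
anomalous = light + [(16.0 + 0.3 + 0.6j, heavy_pos)]  # f + df' + i df'' (assumed values)
centric = anomalous + invert(anomalous)

hkl = (1, 2, 3)
print("real f, acentric:     ", round(friedel_difference(hkl, normal), 6))
print("anomalous, acentric:  ", round(friedel_difference(hkl, anomalous), 6))
print("anomalous, centric:   ", round(friedel_difference(hkl, centric), 6))
```

Only the second case gives a Friedel difference that is clearly different from zero, and this difference changes sign when the structure is inverted, which is the basis of the absolute configuration determination.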
4.2.4 Systematic extinctions

In our first inspection of the film exposures, we observed, in addition to intensity symmetry, special groups of reflections in which some were systematically absent. Let us now discuss this phenomenon in detail. We shall see that, without exception, the translational parts of the symmetry elements will cause these systematic absences and that this property will be of significant assistance in space group determination. It will help to solve the problem of the last chapter to distinguish between mirror and glide planes, and axes and screw axes, respectively. As representative for all translational elements, let us prove three propositions:
(1) If a lattice is C-centered, all reflections hkl with h + k = 2n + 1 are systematically absent (general extinction rule).
(2) If a lattice contains a glide plane perpendicular to the b-direction with glide component c/2, all reflections of type h0l with l = 2n + 1 are systematically absent (zonal extinction rule).
(3) If a lattice contains a 2-fold screw axis in the b-direction, all reflections of type 0k0 with k = 2n + 1 are systematically absent (axial extinction rule).

The proof of all three propositions is easily done by calculation of F(h), assuming the named symmetry:

(1) For a C-centered lattice, every atomic vector (x, y, z) has its equivalent in (x + 1/2, y + 1/2, z). Then we have

$$ \begin{aligned} F(hkl) &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + k y_j + l z_j)} + e^{2\pi i (h(x_j + 1/2) + k(y_j + 1/2) + l z_j)} \right] \\ &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + k y_j + l z_j)} \left( 1 + e^{i\pi (h + k)} \right) \right]. \end{aligned} $$
With h + k odd, e^(iπ(h+k)) = –1, hence F(hkl) = 0, if h + k = 2n + 1.

(2) Assuming the situation as in space group P21/c, the pair of equivalent positions concerning the c-glide plane is (x, y, z) and (x, 1/2 – y, 1/2 + z). Then we get

$$ \begin{aligned} F(hkl) &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + k y_j + l z_j)} + e^{2\pi i (h x_j + k(1/2 - y_j) + l(1/2 + z_j))} \right] \\ &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + l z_j)} \left( e^{2\pi i k y_j} + e^{2\pi i k (1/2 - y_j)}\, e^{i\pi l} \right) \right]. \end{aligned} $$

For reflections (hkl) with k = 0, we get

$$ F(h0l) = \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + l z_j)} \left( 1 + e^{i\pi l} \right) \right]. $$
For l odd, e^(iπl) = –1, hence F(h0l) = 0 if l = 2n + 1.

(3) If a 2-fold screw axis in the direction of b is present, we have the equivalent positions (x, y, z) and (–x, 1/2 + y, 1/2 – z). It follows that

$$ \begin{aligned} F(hkl) &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (h x_j + k y_j + l z_j)} + e^{2\pi i (-h x_j + k(1/2 + y_j) + l(1/2 - z_j))} \right] \\ &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (k/4 + k y_j + l/4)} \left( e^{2\pi i (h x_j - k/4 + l z_j - l/4)} + e^{2\pi i (-h x_j + k/4 - l z_j + l/4)} \right) \right] \\ &= \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (k/4 + k y_j + l/4)}\; 2 \cos 2\pi (h x_j + l z_j - k/4 - l/4) \right]. \end{aligned} $$

For reflections of type (0k0), we get

$$ F(0k0) = \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i (k/4 + k y_j)}\; 2 \cos (k\pi/2) \right]. $$

For k odd, cos(kπ/2) = 0, hence F(0k0) = 0 if k = 2n + 1.
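The three proofs above can be confirmed numerically. The following sketch builds small hypothetical atom sets related by a C-centering translation, by a c-glide plane perpendicular to b, and by a 2-fold screw axis along b, and evaluates the structure factor for reflections that should be absent or present. Atom coordinates and scattering factors are invented for illustration.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j exp(2 pi i (h x_j + k y_j + l z_j))."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

def expand(atoms, operation):
    """Add the symmetry-equivalent partner of every atom."""
    return atoms + [(f, operation(x, y, z)) for f, (x, y, z) in atoms]

base = [(6.0, (0.12, 0.23, 0.31)), (8.0, (0.40, 0.07, 0.18))]  # hypothetical atoms

c_centered = expand(base, lambda x, y, z: (x + 0.5, y + 0.5, z))
c_glide = expand(base, lambda x, y, z: (x, 0.5 - y, 0.5 + z))
screw_21 = expand(base, lambda x, y, z: (-x, 0.5 + y, 0.5 - z))

checks = [
    ("C-centering, h+k odd ", (1, 2, 3), c_centered),
    ("C-centering, h+k even", (1, 1, 3), c_centered),
    ("c-glide, h0l, l odd  ", (2, 0, 3), c_glide),
    ("c-glide, h0l, l even ", (2, 0, 2), c_glide),
    ("2_1 axis, 0k0, k odd ", (0, 3, 0), screw_21),
    ("2_1 axis, 0k0, k even", (0, 2, 0), screw_21),
]
for label, hkl, atoms in checks:
    print(label, hkl, f"|F| = {abs(structure_factor(hkl, atoms)):.4f}")
```

The absences appear only in the reflection classes named in the propositions; for example, a general reflection hkl with l odd is not extinguished by the c-glide plane.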
It is evident that calculations with corresponding symmetry elements will have similar results. So we can establish the following rules:
(1) In every non-primitive lattice, general systematic extinctions are present.
(a) For a C-centered lattice, only reflections hkl with h + k = 2n are present. For A- or B-centered lattices, the conditions are k + l = 2n and h + l = 2n, respectively.
(b) For an F-centered lattice, only reflections hkl satisfying simultaneously the conditions h + k = 2n, k + l = 2n, (l + h = 2n) are present (the condition l + h = 2n is redundant, since it follows from the first and second conditions).
(c) For an I-centered lattice, only reflections hkl with h + k + l = 2n are present.
(2) Every glide plane causes zonal systematic extinctions. If the glide plane is perpendicular to a, b or c, the extinctions affect reflections of type 0kl, h0l, or hk0. The glide direction is given by that of the nonzero indices of which even parity is required. For a glide plane perpendicular to b, we have the possibilities:
(a) if reflections h0l are present only for h = 2n, the glide plane is of type a,
(b) if reflections h0l are present only for l = 2n, the glide plane is of type c,
(c) if reflections h0l are present only for h + l = 2n, the glide plane is of type n.
(3) Every screw axis causes axial systematic extinctions. If the axes coincide with a, b or c, the reflection series affected are h00, 0k0, or 00l. The extinctions for all possible screw axes in the c-direction are listed in Table 4.6.
Table 4.6: Axial extinction conditions for screw axes in the c-direction (reflections present are listed).

| Type | 00l reflections | Type | 00l reflections |
|---|---|---|---|
| 2₁ | l = 2n | 4₂ | l = 2n |
| 3₁, 3₂ | l = 3n | 6₁, 6₅ | l = 6n |
| 4₁, 4₃ | l = 4n | 6₂, 6₄ | l = 3n |
| | | 6₃ | l = 2n |
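The extinction rules for lattice centering and the screw axis conditions of Table 4.6 lend themselves to a simple lookup. The following sketch encodes them as two small functions. The function names and the labels used for the screw axes (such as "2_1") are choices made for this example, not an established convention of any program.

```python
def centering_allows(hkl, centering):
    """True if reflection hkl is allowed (not systematically absent) for the lattice type."""
    h, k, l = hkl
    if centering == "P":
        return True
    if centering == "A":
        return (k + l) % 2 == 0
    if centering == "B":
        return (h + l) % 2 == 0
    if centering == "C":
        return (h + k) % 2 == 0
    if centering == "I":
        return (h + k + l) % 2 == 0
    if centering == "F":
        # l + h = 2n follows from the other two conditions
        return (h + k) % 2 == 0 and (k + l) % 2 == 0
    raise ValueError(f"unknown centering type: {centering}")

# 00l reflections are present only if l is a multiple of the listed number (Table 4.6)
SCREW_00L_MULTIPLE = {
    "2_1": 2, "4_2": 2, "6_3": 2,
    "3_1": 3, "3_2": 3, "6_2": 3, "6_4": 3,
    "4_1": 4, "4_3": 4,
    "6_1": 6, "6_5": 6,
}

def screw_allows_00l(l, screw):
    """True if reflection 00l is allowed for a screw axis along c."""
    return l % SCREW_00L_MULTIPLE[screw] == 0

print(centering_allows((1, 2, 3), "C"), centering_allows((1, 1, 2), "I"))
print([l for l in range(1, 13) if screw_allows_00l(l, "6_1")])
```

In practice the rules are applied in the opposite direction: the observed absences are compared with such conditions in order to narrow down the possible space groups.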
4.2.5 Quasicrystals It follows from the relations between crystal and intensity symmetry elaborated in the last chapters that the diffraction pattern of a single crystal cannot show other symmetry properties than the point group symmetries introduced in chapter 4.1.2. It was therefore surprising and confusing when in the 1980s diffraction exposures appeared with fivefold or tenfold symmetries, as shown in the examples in Fig. 4.28 [17, 18], which were considered so far as forbidden for the diffraction pattern of a crystal.
Fig. 4.28: Diffraction pattern of quasicrystals. (a) Exposure of a Zn-Mg-Ho quasicrystalline sample showing five-fold symmetry, source: H. Takakura and A. P. Tsai [17]. Copyright 1998, reprinted by permission of The Japan Society of Applied Physics. (b) Exposure of an Al-Pd-Re quasicrystalline phase, ten-fold symmetry is seen. Insert: The sample which gave rise to this exposure clearly shows macroscopically faces with five-fold symmetry, source: Fisher et al. [18]. Copyright 2002, reprinted by permission of Taylor & Francis Ltd., UK.
A likewise forbidden symmetry was visible macroscopically at related crystalline samples, as shown in the insert of Fig. 4.28b. Samples of this type with symmetry properties being in strong contrast to the non existence of five-fold or higher symmetry than six-fold, as was expressed as a fundamental law in crystallography, are so-called quasicrystals. They are composed of a quasiperiodical atomic arrangement with no translational symmetry. The first person who found these unexpected results was Daniel Shechtman, who worked on some aluminum transition metal alloys. The quasicrystalline samples were grown by rapid cooling from the melt. This discovery was originally extremely controversial. His first manuscript about this subject was immediately rejected with the advice to read basic crystallographic textbooks first. So it took two years from his first experiment in 1982 until his findings got published [19]. For almost a decade he was confronted with strict denial by Linus Pauling, as a two-time Nobel Prize winner a giant in the chemical society. Shechtman could never convince Pauling about the correctness of his discovery until Pauling died in 1994. Nevertheless, during the time the quasicrystal concept was accepted by the scientific community and in 2011 Daniel Shechtman was awarded the Nobel Prize in chemistry. Because of their forbidden symmetry quasicrystals cannot consist of strict periodic structures. However, since they give rise to sharp diffraction spots (see Figure 4.28), the atomic distribution in the sample must follow some (nonperiodic) principles.
Figure 4.29: Penrose pattern generated by two rhombi having edges of equal lengths but have different thicknesses. If the areas of all rhombi having edges in same directions in common are shadowed, they define equidistant and parallel lines, which can be considered as models for lattice planes in the three-dimensional samples, illustration kindly provided by Bergmann/ Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.
Mathematic models describing one-dimensional structures of this type are well known. The law of propagation of a one-dimensional nonperiodic structure follows the so-called Fibonacci sequences. Making use of two initial components, a short and a long one, a one-dimensional nonperiodic sequence according to the Fibonacci laws can be constructed. If the concept of allowing two components as building blocks is also applied to the two-dimensional case, a quasiperiodic pattern can be composed, which was already introduced in 1974 by the Penrose pattern [20]. It allows paving the ground void-less and quasiperiodically. Figure 4.29 shows an example of a Penrose pattern composed of two rhombi as building blocks. Although this pattern also follows a certain building principle, it cannot be generated periodically by any unit cell. At certain sites in Figure 4.29, local fivefold symmetry can be recognized and even approximate lattice planes can be constructed which are prerequisite for sharp reflections. In three dimensions quasiperiodic models can also be constructed from two rhombohedral building blocks and evidence exists that the forbidden symmetries are based on quasicrystalline models of this type. It is not quite understood why nature applies this energetically less favorable construction principle of using two building blocks instead of one unit cell. That is why alternatively so-called overlapping-cluster models are discussed. It was shown by P. Gummelt that quasiperiodic structures can also be composed of one building block if cluster overlap is tolerated [21]. Quasicrystals have unusual favorable material properties, so that several industrial applications have been considered. They have low thermal and electrical conductivity, so that they can be used as insulators (although being metal alloys!). They are suited for surface coating and they are used further in the production of fine surgical instruments and for the coating of frying pans to replace Teflon. 
Due to their outstanding structural properties, quasicrystals have also been designated as third state of solid matter in addition to the crystalline and the amorphous state.
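The Fibonacci construction mentioned above can be illustrated with a few lines of Python. This is only a sketch: it assumes that the "Fibonacci law" meant here is the standard substitution rule L → LS, S → L applied to a long (L) and a short (S) component, and the function name `fibonacci_chain` is chosen purely for illustration.

```python
def fibonacci_chain(generations):
    """Build a one-dimensional Fibonacci chain of long (L) and short (S) segments.

    Assumed rule (standard substitution): L -> LS, S -> L, starting from "L".
    """
    chain = "L"
    for _ in range(generations):
        chain = "".join("LS" if segment == "L" else "L" for segment in chain)
    return chain


for n in range(6):
    chain = fibonacci_chain(n)
    print(n, len(chain), chain)
```

The chain lengths follow the Fibonacci numbers (1, 2, 3, 5, 8, 13, ...), and no finite block of the sequence repeats periodically, which is the one-dimensional analogue of the quasiperiodic tiling in Figure 4.29.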
4.3 Space group determination

4.3.1 General considerations, practical aspects

We shall describe here how the information obtained from the diffraction pattern can be used for a proper space group determination. First, however, we have to point out that only some, but not all, space groups can be determined uniquely from the Laue symmetry and the systematic extinctions. In some cases important additional information can be obtained by calculating the unit cell contents via the density. This is done as follows: The molar mass of a compound of molecular weight Mr is m = Mr [g mol⁻¹]. The mass mZ of Z molecules is then mZ = (Z Mr)/L [g], where L is Avogadro's number. If the unit cell has the volume V, the density ρ can be calculated if the number Z of molecules in the unit cell is known. Since ρ = mZ/V, we get ρ = (Z Mr)/(V L) [g cm⁻³]. This expression for ρ is usually called the X-ray density and is designated ρx, since it is derived from X-ray results. Usually V is given in Å³, where 1 Å³ = 10⁻²⁴ cm³. Since L = 6.023 × 10²³ mol⁻¹, ρx is written as

$$\rho_x = \frac{Z M_r}{V \cdot 10^{-24} \cdot 6.023 \times 10^{23}} = \frac{10}{6.023}\,\frac{Z M_r}{V}\ [\mathrm{g\,cm^{-3}}].$$

With 10/6.023 ≈ 1.65:

$$\rho_x = \frac{1.65\,Z M_r}{V}\ [\mathrm{g\,cm^{-3}}]\ \text{or}\ [\mathrm{Mg\,m^{-3}}] \tag{4.32a}$$

or

$$Z = \frac{\rho_x V}{1.65\,M_r}. \tag{4.32b}$$
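As a numerical illustration, equations (4.32a) and (4.32b) can be evaluated with a short Python sketch. The function names are hypothetical; the input values are those quoted for KAMTRA in this section (Mr = 188.2, V = 628.5 Å³, measured density 1.95 g cm⁻³). The rounded factor 1.65 of the text is kept so that the quoted numbers are reproduced; the more exact value 10/6.022 is closer to 1.66 and gives slightly different results.

```python
def xray_density(z, mr, volume_a3, factor=1.65):
    """Eq. (4.32a): X-ray density in g cm^-3 for a cell volume given in cubic angstroms.

    factor = 1.65 is the rounded constant used in the text (10/6.022 is about 1.66).
    """
    return factor * z * mr / volume_a3


def molecules_per_cell(density, mr, volume_a3, factor=1.65):
    """Eq. (4.32b): number of formula units Z from a measured density."""
    return density * volume_a3 / (factor * mr)


# KAMTRA values as quoted in the text
z_estimate = molecules_per_cell(1.95, 188.2, 628.5)
z = round(z_estimate)                      # Z must be an integer
rho_x = xray_density(z, 188.2, 628.5)

print(f"Z estimate = {z_estimate:.2f} -> Z = {z}")   # Z estimate = 3.95 -> Z = 4
print(f"rho_x = {rho_x:.2f} g cm^-3")                # rho_x = 1.98 g cm^-3
```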
These formulae can be used in two ways: The first is the calculation of Z from the value of ρ measured macroscopically (by flotation methods, for example). Since Mr is usually known and V is derived from the lattice constants, (4.32b) gives Z. Since Z must be an integer (except for a few special examples), the calculation from (4.32b) should give an integral value (with some error caused mainly by the uncertainty of ρ). Nevertheless, the nearest integral value is a good estimate of Z. The second is the calculation of ρ from (4.32a). Once the correct integer value Z is known, ρ can be calculated very precisely from (4.32a), since Mr is known and V can
be determined very accurately from precisely measured lattice constants. Thus the X-ray density determination can yield a value that is more precise than most experimental procedures.

Applying (4.32) to the structure of KAMTRA, the molecular weight is Mr = 188.2 g/mol and the volume V will be determined to be 628.5 Å³ (see Section 5.1.3). The experimentally measured density was found to be 1.95 g cm⁻³. From (4.32b) it follows that Z = 3.95, which means that Z must be equal to 4. With Z = 4 we calculate ρx = 1.98 g cm⁻³ from (4.32a) and get a very precise result for the density.

Let us consider an example where the space group determination from intensity symmetry and systematic extinctions alone is not unique. If the intensity symmetry corresponds to the monoclinic Laue group 2/m and no systematic extinction is observed, then we have a primitive lattice with no translational symmetry elements such as screw axes or glide planes. The space group could be P2, Pm or P2/m. Recognizing the correct space group without solving the structure is impossible. However, we can use the knowledge of the unit cell contents to decide in favor of one of the space groups as follows: the general position in the space groups P2 and Pm is twofold, that in space group P2/m is fourfold. If the cell contains four molecules, P2/m can be the correct space group; one molecule would then be the asymmetric unit. However, the space groups with a twofold general position cannot be excluded, because two molecules can constitute the asymmetric unit. This is not unlikely. Several structures are known in which even more than two molecules are in the asymmetric unit. Recently a structure of L-tryptophan was reported in which 16 molecules formed the asymmetric unit in the space group P1. In this case the asymmetric unit was identical to the unit cell contents [22].
On the other hand, if the cell contains two molecules, the space group P2/m is only possible if the molecule itself contains one of the space group symmetry elements, that is, either a 2-fold axis, a mirror plane, or an inversion center [see Figure 4.30(a)]. Then the space group could be P2/m with one half molecule in the asymmetric unit. The atoms sitting on the symmetry element like the three atoms on the mirror plane m in Figure 4.30(a) are said to be in special positions. Some (or sometimes all) of their fractional coordinates are restricted to fix the atom to the symmetry element. If the molecule cannot have this symmetry as in Figure 4.30(b), only P2 or Pm remain
Figure 4.30: (a) Example of a molecule with internal mirror symmetry located on a crystallographic mirror plane m perpendicular to the paper plane, allowing half a molecule as asymmetric unit. (b) Structure with no molecular symmetry, which cannot match this mirror symmetry.
possible, and the space group P2/m can be excluded, except in rare special cases where the molecules are disordered.

In some cases the physical properties of the compound are useful for space group determination. One major problem is deciding whether a crystal belongs to an acentric or to the corresponding centric space group, since the Laue symmetry always indicates the corresponding centric class. So from the diffraction experiment an indication of one of the 21 non-centric crystal classes can scarcely be obtained. In this connection the property of certain crystal classes to have polar directions may be useful. A given direction, represented by a vector r = [u, v, w], is defined as a polar direction if it is not related by one of the crystal class symmetry elements to its centrosymmetric counterpart –r = [–u, –v, –w]. From this definition it follows immediately that a polar direction can only occur in one of the 21 non-centrosymmetric classes (Table 4.7). If, for a given crystal class, a vector r can be chosen in a polar direction such that the vector sum of r and all symmetry-related vectors r′, r″, … does not equal zero, the crystal class is said to be a polar class. The ten polar classes are listed in Table 4.7.

Two physical properties are closely connected to crystals belonging to a polar or a non-centric crystal class. If, in a polar class, an electric dipole moment along this nonzero resultant is produced by a change of temperature, the phenomenon is called pyroelectricity. The development of an electrical polarity along a crystal's polar direction under mechanical compression or tension is defined as piezoelectricity. Since piezoelectricity requires the existence of a polar direction, it is clear that this effect is only observable in the non-centric classes. An exception is the highly symmetric cubic class 432, in which piezoelectricity is not possible, so that actually 20 crystal classes remain (see Table 4.7).
Since pyroelectricity is restricted to one of the ten polar crystal classes, the presence of one of these two physical properties may be used in the course of crystal class decision. However, it should be pointed
Table 4.7: Non-centrosymmetric crystal classes, polar and enantiomorphous classes.

Crystal system | Noncentric (piezoelectricity) | Polar (pyroelectricity) | Enantiomorphous
Triclinic      | 1                             | 1                       | 1
Monoclinic     | 2; m                          | 2; m                    | 2
Orthorhombic   | 222; mm2                      | mm2                     | 222
Tetragonal     | 4; 4̄; 422; 4mm; 4̄2m           | 4; 4mm                  | 4; 422
Hexagonal      | 6; 6̄; 622; 6mm; 6̄m2           | 6; 6mm                  | 6; 622
Trigonal       | 3; 32; 3m                     | 3; 3m                   | 3; 32
Cubic          | 23; 432⁺; 4̄3m                 | –                       | 23; 432
No.            | 21                            | 10                      | 11

⁺ No piezoelectricity
out that nothing can be concluded from the absence of these effects, because they may be too weak to be observable. The last column of Table 4.7 shows those eleven of the twenty-one non-centrosymmetric classes which are enantiomorphous, i.e. they contain no symmetry element which produces the mirror image of a motif. If, for example, a structure consists of one enantiomer of an optically active compound, only one of the 65 space groups belonging to these crystal classes is possible. If we had such a compound in our monoclinic example, we could immediately exclude Pm and P2/m, resulting in a unique solution, in this case P2.

In general, even if additional information from the physical behavior of a compound is used, a unique determination of the space group is not obtained, although there are 50 out of the 230 space groups for which the determination is unambiguous. Two of these are P2₁/c and P2₁2₁2₁, which occur very frequently with organic molecules. In the monoclinic system, one axial extinction and one zonal extinction determine the unique space group P2₁/c. The orthorhombic space group P2₁2₁2₁ is also uniquely determined: if the intensities show the symmetry of the orthorhombic system and three perpendicular axial extinctions are present, the unique solution for the space group is P2₁2₁2₁.

In practical X-ray work it is advisable to try the space group determination in the following sequence:
(1) Determine the Laue group from the intensity symmetry.
(2) Select the unit cell vectors to be consistent with the observed intensity symmetry.
(3) Check whether these unit cell vectors are equal in length and determine the angles between them; hence determine the crystal system (see Table 4.3).
(4) Determine the systematic extinctions, if any.
(5) Calculate the cell volume and the unit cell contents, making use of the X-ray density [eq. (4.32)].
(6) Check whether the results of (1) to (5) are in agreement with the properties of one (or more than one) of the 230 space groups.

The procedure described here is carried out in this or a similar way by the software integrated in the control programs of modern diffractometers. The software suggests one or more possible space groups on the basis of some preliminary data collection results. The user is then requested to make a choice. An urgent warning should be expressed here: if a wrong space group is chosen, it is very probable that the subsequent structure determination will fail completely. Hence the user should not rely on the software alone but should use his/her crystallographic knowledge to decide in favor of the most probable space group. A frequent source of error is that systematic extinctions are not correctly recognized, especially serial extinctions on short axes, where only a few reflections contribute to the decision. If a unique choice cannot be made, the structure determination must be attempted in all possible space groups.
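Steps (4) to (6) amount to a table lookup followed by a filter. The sketch below illustrates this for a deliberately small subset, the common monoclinic space groups with unique axis b; it is not how any particular diffractometer software works. The dictionary keys are shorthand labels invented here: "hkl" stands for the general extinction h+k = 2n+1 (C-centering), "h0l" for the zonal extinction l = 2n+1 (c glide plane) and "0k0" for the axial extinction k = 2n+1 (2₁ screw axis).

```python
# Illustrative lookup for monoclinic space groups (unique axis b) only.
# Keys are the observed extinction types (shorthand labels, see lead-in).
MONOCLINIC_CANDIDATES = {
    frozenset(): ["P2", "Pm", "P2/m"],
    frozenset({"0k0"}): ["P21", "P21/m"],
    frozenset({"h0l"}): ["Pc", "P2/c"],
    frozenset({"0k0", "h0l"}): ["P21/c"],
    frozenset({"hkl"}): ["C2", "Cm", "C2/m"],
    frozenset({"hkl", "h0l"}): ["Cc", "C2/c"],
}

# Groups of the enantiomorphous class 2: no mirror, glide plane or inversion center.
ENANTIOMORPHOUS = {"P2", "P21", "C2"}


def candidates(extinctions, enantiopure=False):
    """Return the space groups compatible with the observed extinctions.

    If the crystal is known to contain a single enantiomer, only groups
    from an enantiomorphous class are kept.
    """
    groups = MONOCLINIC_CANDIDATES[frozenset(extinctions)]
    if enantiopure:
        groups = [g for g in groups if g in ENANTIOMORPHOUS]
    return groups


print(candidates([]))                      # ['P2', 'Pm', 'P2/m']
print(candidates([], enantiopure=True))    # ['P2']
print(candidates(["0k0", "h0l"]))          # ['P21/c']
```

A list with more than one entry corresponds to the situation described above in which the structure determination has to be attempted in every remaining space group.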
Figure 4.31: (a) Originally assumed formula for a tropical plant derivative. (b) Reasonable structure solution in C2 [1].
Figure 4.32: Final correct structure of the pentacyclic triterpene [1].
The following is an example where neither the chemical identity nor the space group was unambiguously known: For a natural product isolated from a tropical plant, preliminary chemical examinations suggested a formula C28H46O2 (Mr = 414.7) and a structure as given in Figure 4.31(a). X-ray exposures indicated monoclinic symmetry (Laue group 2/m), systematic extinctions hkl for h+k = 2n+1 and a cell volume of 2822 Å³. This general systematic extinction indicated a C-centered lattice. Since there were no zonal or serial extinctions, the possible space groups were C2, Cm and C2/m. Making use of eq. (4.32) we calculated ρx = 0.98 g cm⁻³ for Z = 4, and ρx = 1.96 g cm⁻³ for Z = 8. Z = 4 would favor C2 or Cm, while Z = 8 made C2/m more probable. However, both calculated densities were suspicious: ρx = 0.98 g cm⁻³ seemed very small for an organic compound of that type, while ρx = 1.96 g cm⁻³ was too large. Probably the chemical formula was not quite correct. Since we could not trust the formula (otherwise we could have excluded Cm and C2/m because of chirality), structure determinations were attempted in all three space groups. No chemically reasonable solution was obtained in Cm and C2/m; only the solution in C2 gave an estimate of the correct structure [Figure 4.31(b)], which was somewhat different from that expected and turned out to be a pentacyclic triterpene, see Figure 4.32. With the correct formula C32H52O2 and a more precise cell volume of V = 2821.9 Å³, a reliable density ρx = 1.10 g cm⁻³ was now calculated for Z = 4.
4.3.2 Space groups of KAMTRA and SUCROS from film exposures

With our knowledge of crystal symmetry and the related intensity symmetry, we can now analyze the film exposures of KAMTRA in more detail. On the zero layer Weissenberg exposure shown already in Figure 3.20(a) and displayed again in Figure 4.33(a), for which the characteristic "festoon" character is represented in Figure 4.33(b), the lines marked A and B have special properties already discussed in Section 3.2.2. The ∆x-distance of those two lines on the film is exactly 4.5 cm, corresponding to an angular increment of 90° (since g = 0.5 mm/°). These two lines therefore constitute a rectangular lattice plane, as shown in Figure 4.34, which is a plot of the undistorted lattice layer from the upper half of the Weissenberg exposure. This image is obtained by applying the transformation given by formulae (3.7) and (3.8) in Section 3.1.2. Because of the symmetry and extinctions, the reciprocal unit cell vectors are chosen with b* in the direction of A and a* in the direction of B, see also Figure 4.35. Then the vector of the rotation axis must be c to satisfy the layer line condition (3.3) in Section 3.1.1. With h = ha* + kb* + lc* and r = 0a + 0b + 1c, we get h·r = h·0 + k·0 + l·1 = l = n (n = 0, 1, 2, …). Then the zero layer contains all hk0-reflections, the first layer all hk1-reflections, etc.

With these definitions and the undistorted representation of Figure 4.34, the reflections can be indexed, i.e., they can be identified by their integer coordinates. The result of indexing is shown for several reflections of the zero layer in Figure 4.34. Note the relation between reflections on the upper and lower half of a Weissenberg exposure. As Figure 4.33(b) shows, one complete lattice layer would be recorded
Figure 4.33: (a) Zero layer Weissenberg exposure of KAMTRA [see also Figure 3.20(a)]; (b) its schematic representation in "festoon" type arrangement. The hk0 quadrants with their proper signs are indicated.
on the upper half as well as on the lower half of the film if the translation were large enough. In practice, the length of the film cylinder (which usually does not exceed 100–120 mm, corresponding to an angular range of 200 to 240° for film rotation with g = 0.5 mm/°) restricts the range covered to two or two and one-half quadrants on each half of the film. Usually this is sufficient, since in most cases two quadrants of a reciprocal lattice layer include all the symmetry-independent reflections of that layer.

The magnitude of c is 7.70 Å, obtained as D = 1/D* from the rotation photograph (Table 3.1). The magnitudes of a* and b* can be derived from the θ-values of reflections on the lines B and A, which have the indices h00 and 0k0, respectively. Application of Bragg's law in the form of equation (3.7a) to axial reflections of the type h00 and 0k0 gives the magnitude of the corresponding reciprocal lattice constants. The precision of a* depends strongly on the precision of the measurement of θ. The error Δθ of θ can be assumed to be independent of θ. Since Δsinθ decreases when θ
Figure 4.34: Undistorted representation of the zero layer of the KAMTRA lattice with a couple of reflections indexed.
approaches 90°, the precision of a lattice constant determination increases at high angles θ. Precise measurements of lattice constants should therefore be made using high-order reflections. Choosing the reflections I on line A and II on line B [see Figure 4.33(a)], their indices can easily be derived with the help of Figure 4.34 as (0 10 0) and (6 0 0). We find yI = 46.0 mm and yII = 36.0 mm, corresponding to the angles θI = 46.0° and θII = 36.0°. Then we have a* = 0.127 Å⁻¹ and b* = 0.0933 Å⁻¹, and we know that γ*, the angle between a* and b*, is 90°.

Considering the hkl-quadrants in Figure 4.33(b) and the indexing of reflections in Figure 4.34, the intensity symmetry can be formulated as

$I(\bar{h}k0) = I(hk0)$ (quadrants on both sides of line A),
$I(h\bar{k}0) = I(hk0)$ (quadrants on both sides of line B), and
$I(\bar{h}\bar{k}0) = I(hk0)$.

Since the first level Weissenberg exposure [Figure 3.20(b)] shows the same intensity symmetry along A′ and B′, the intensity symmetry can be formulated generally as

$$I(\bar{h}kl) = I(h\bar{k}l) = I(\bar{h}\bar{k}l) = I(hkl).$$

Intensity symmetry of that type is found in the orthorhombic and higher symmetry systems.

Additional information on the crystal system is obtained from the inspection of the angles between the axes. c* is the vector from the origin to the intersection point of the lattice lines A′ and B′ in the first layer. If c* forms right angles with a* and b*, the point of intersection of the rotation axis coincides with the top of c* at point C [Figure 4.35(a)]. It then follows that the lattice lines A′ and B′ also intersect the rotation axis in C and in consequence appear as lines on the Weissenberg photographs [see equation (3.9)]. If one of the angles differs from 90° [in Figure 4.35(b) this is illustrated for the angle β*], the point C at the top of c* is not identical with the intersection point D, and the lattice line B′ no longer intersects the rotation axis. Its image on the Weissenberg film is no longer a line.
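The lattice-constant arithmetic quoted above for KAMTRA can be checked with a short sketch. Two assumptions are made that are not stated in this passage: Bragg's law is taken in its usual reciprocal-space form d* = 2 sin θ/λ, and the wavelength is taken as Cu Kα (1.5418 Å), which reproduces the quoted values. The function names are illustrative only.

```python
import math

WAVELENGTH = 1.5418  # angstrom, Cu K-alpha; an assumption, not given in this passage


def reciprocal_axis(theta_deg, order, wavelength=WAVELENGTH):
    """Reciprocal lattice constant from an axial reflection (h00 or 0k0).

    Bragg's law in reciprocal form: order * a* = d* = 2 sin(theta) / lambda.
    """
    return 2.0 * math.sin(math.radians(theta_deg)) / (order * wavelength)


def relative_error(theta_deg, delta_theta_deg):
    """Relative error of d* for a given error in theta: cot(theta) * delta_theta."""
    return math.radians(delta_theta_deg) / math.tan(math.radians(theta_deg))


b_star = reciprocal_axis(46.0, 10)   # reflection I, (0 10 0)
a_star = reciprocal_axis(36.0, 6)    # reflection II, (6 0 0)
print(f"a* = {a_star:.3f} A^-1, b* = {b_star:.4f} A^-1")   # a* = 0.127, b* = 0.0933

# The same reading error of 0.1 degrees matters less at high theta
for theta in (10.0, 36.0, 46.0, 80.0):
    print(f"theta = {theta:4.1f} deg: relative error = {relative_error(theta, 0.1):.5f}")
```

The second loop illustrates why high-order reflections are preferred: the relative error falls off as cot θ when the angular error is constant.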
If both angles between c* and the two other unit cell vectors differ from 90°, neither A′ nor B′ is a straight line on the film. The question whether c* forms one or two non-right angles with the vectors in the zero layer can easily be answered by inspecting the Weissenberg exposure of the first layer, where the absence of straight lines for A′ or B′ (or both) indicates non-right angles. If only one angle differs from 90°, the magnitude of the displacement p = DC can be used to estimate the angle. Since OD = D*, we get [see Figure 4.35(b)]

$$\cot\beta^* = \frac{p}{D^*}. \tag{4.33}$$
Figure 4.35: Illustration of a shift of the intersection point: (a) orthogonal system; (b) non-orthogonal system.
However, the determination of p is not very accurate. Furthermore, since p […]

[…] > 100 cm⁻¹, it should even be considered whether a radiation source with shorter wavelength can be used to reduce μ. In contrast to the above-mentioned suggestions, we note that some scientific journals, such as Acta Crystallographica, request authors who wish to publish the results of a crystal structure analysis to carry out an absorption correction if μ > 3.0 cm⁻¹. Organic compounds without heavy atoms have linear absorption coefficients of around 10 cm⁻¹ for CuKα and around 1–2 cm⁻¹ for MoKα radiation, so that in practice an absorption correction is requested for all Cu data sets. For a crystal of arbitrary shape bathed in X-rays, the absorption factor may be expressed by
Figure 7.2: Incident and reflected beam corresponding to the crystal volume element dV.
7.1 Data reduction
$$A = \frac{1}{V}\int_V e^{-\mu (p + q)}\,dV. \tag{7.6}$$
dV is a volume element of a specimen having a crystal volume V, and p and q are the path lengths of the incident and reflected beams for the volume element dV (Figure 7.2). Several methods for applying absorption corrections are available (in more or less sophisticated computer programs), most of them based on the calculation of path lengths through the crystal, which vary from reflection to reflection. The determination of these path lengths is not a simple problem. For regularly shaped crystals (e.g. spherical and cylindrical ones), (7.6) can be calculated analytically and the result may be used for crystals of that shape. It is sometimes advisable, especially for compounds with large μ, to grind a crystal in order to obtain a spherical specimen. However, it should be noted that this procedure does not work for soft crystals, and there is always the danger that the crystal will be damaged, so the procedure should only be used if crystals of sufficient mechanical stability are available. If the crystal shape cannot be approximated by a sphere or a cylinder, the calculation of (7.6) must be done numerically for the individual crystal. For this calculation, the precise shape, obtained from a measurement of the crystal faces, is used as input. For that purpose, it is advisable to use crystals with well-developed natural faces.

The Lorentz, the polarization and the absorption corrections are in most cases the only ones necessary for data reduction. We can therefore modify (7.1) to

$$I(\mathbf{h}) = t^2\,Lp\,A\,|F(\mathbf{h})|^2. \tag{7.7}$$
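To make the numerical evaluation of the absorption integral (7.6) concrete, the following sketch estimates A for a spherical crystal by summing over a regular grid of volume elements. It is a minimal illustration under stated assumptions, not the algorithm of any particular program: the crystal is a perfect sphere, the incident beam travels along +x, the diffracted beam lies in the xy-plane at the scattering angle 2θ, and the function names are hypothetical. μ and the radius must be given in consistent units (e.g. cm⁻¹ and cm).

```python
import math


def path_to_surface(point, direction, radius):
    """Distance from an interior point to the sphere surface along a unit vector."""
    b = sum(p * d for p, d in zip(point, direction))
    c = sum(p * p for p in point) - radius * radius   # negative inside the sphere
    return -b + math.sqrt(b * b - c)


def absorption_factor_sphere(mu, radius, two_theta_deg, n=24):
    """Grid estimate of A = (1/V) * integral of exp(-mu (p + q)) dV, eq. (7.6)."""
    two_theta = math.radians(two_theta_deg)
    back_along_incident = (-1.0, 0.0, 0.0)   # from the volume element back to the entry point
    along_diffracted = (math.cos(two_theta), math.sin(two_theta), 0.0)
    step = 2.0 * radius / n
    centres = [-radius + (i + 0.5) * step for i in range(n)]
    total, count = 0.0, 0
    for x in centres:
        for y in centres:
            for z in centres:
                if x * x + y * y + z * z >= radius * radius:
                    continue                  # grid point outside the crystal
                point = (x, y, z)
                p = path_to_surface(point, back_along_incident, radius)
                q = path_to_surface(point, along_diffracted, radius)
                total += math.exp(-mu * (p + q))
                count += 1
    return total / count


# Example: mu = 10 cm^-1 (typical organic compound, CuK-alpha) and a sphere of radius 0.015 cm
for two_theta in (0.0, 60.0, 120.0):
    print(two_theta, round(absorption_factor_sphere(10.0, 0.015, two_theta), 4))
```

For a crystal bounded by measured faces the same sum applies, but p and q must then be obtained by intersecting the two beam directions with the face planes instead of with a sphere.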
The one quantity that cannot be obtained at this stage is the scale factor t, which can only be calculated if the structure is already known. It is therefore customary to define a so-called "Frel(h)", which differs from |F(h)| only by the scale factor:

$$F_{rel}(\mathbf{h}) = t\,|F(\mathbf{h})|. \tag{7.8}$$

Then, from (7.7), we get

$$F_{rel}(\mathbf{h}) = \sqrt{\frac{I(\mathbf{h})}{Lp\,A}}. \tag{7.9}$$

To get σ(Frel), we express I as a function of Frel and calculate the first derivative:

$$I = Lp\,A\,F_{rel}^2, \qquad \frac{dI}{dF_{rel}} = 2\,Lp\,A\,F_{rel}.$$

Setting σ(Frel) ≈ dFrel and σ(I) ≈ dI, we get
7 Solution of the phase problem
$$\sigma(F_{rel}) = \frac{\sigma(I)}{2\,Lp\,A\,F_{rel}}. \tag{7.10}$$
If no absorption correction is made, A = 1 is used throughout. The calculation of Frel, together with the calculation of σ(Frel), is done from the experimental data in the data reduction for every reflection. Eq. (7.10) has a severe drawback for weak reflections: if Frel in the denominator of (7.10) is close to zero, σ(Frel) tends to infinity. That is why it has recently been considered more favorable to use Frel² and σ(Frel²) in all further calculations. In any case, a so-called hkl-file has to be provided by the data reduction procedure, which should include for each reflection either h, k, l, Frel, σ(Frel) or h, k, l, Frel², σ(Frel²), with

$$F_{rel}^2(hkl) = \frac{I(\mathbf{h})}{Lp\,A} \tag{7.11a}$$

and

$$\sigma(F_{rel}^2) = \frac{\sigma(I)}{Lp\,A}. \tag{7.11b}$$
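Equations (7.9) to (7.11b) translate directly into a per-reflection routine. The sketch below is a minimal illustration; the function name and the handling of non-positive intensities (Frel set to zero, σ(Frel) reported as infinite) are assumptions made here, not prescriptions of the text.

```python
import math


def reduce_reflection(intensity, sigma_i, lp, absorption=1.0):
    """Return (Frel, sigma(Frel), Frel^2, sigma(Frel^2)) for one reflection.

    lp is the combined Lorentz-polarization factor, absorption the factor A
    (A = 1 if no absorption correction is made).
    """
    f2 = intensity / (lp * absorption)          # eq. (7.11a)
    sigma_f2 = sigma_i / (lp * absorption)      # eq. (7.11b)
    f = math.sqrt(f2) if f2 > 0 else 0.0        # eq. (7.9); assumption: Frel = 0 if I <= 0
    if f > 0:
        sigma_f = sigma_i / (2.0 * lp * absorption * f)   # eq. (7.10)
    else:
        sigma_f = math.inf                      # the drawback of (7.10) for weak reflections
    return f, sigma_f, f2, sigma_f2


strong = reduce_reflection(400.0, 20.0, lp=1.0)
weak = reduce_reflection(0.01, 1.0, lp=1.0)
print(strong)   # (20.0, 0.5, 400.0, 20.0)
print(weak)     # sigma(Frel) is 50 times Frel, while sigma(Frel^2) stays at 1.0
```

The weak reflection shows why Frel² and σ(Frel²) are preferred: σ(Frel²) remains well behaved, whereas σ(Frel) grows without bound as Frel approaches zero.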
Whether the hkl-file should include all measured reflections or only the symmetry-independent ones has to be decided by the user. We recall the number of independent reflections:
(1) triclinic: one half of the limiting sphere, e.g. ±h, ±k, +l
(2) monoclinic: one quarter of the limiting sphere, e.g. ±h, +k, +l, or +h, +k, ±l
(3) orthorhombic: one eighth of the limiting sphere, e.g. +h, +k, +l
(4) tetragonal and higher symmetries: one eighth of the limiting sphere or less.

The rules (1)–(4) given above are only valid in the absence of anomalous dispersion. In this case pairs of reflections related by Friedel's law (so-called Friedel pairs) have the same intensities [see also eq. (4.21b)]. If anomalous dispersion is present, Friedel pairs can exhibit (in general small) intensity differences. Then it is up to the user whether to ignore these small differences and merge over Friedel pairs or not. Intensity differences caused by anomalous dispersion can be utilized for deriving the absolute configuration of chiral compounds. If this is intended, a data set with non-merged Friedel pairs must, of course, be used.

The quality of the data set after merging can be examined by the so-called internal R-value, denoted Rint. If the average for a group of multiple observations and symmetry-related reflections is Iav, then Rint is defined by
$$R_{int} = \frac{\sum_{hkl} \left| I(hkl) - I_{av} \right|}{\sum_{hkl} I(hkl)}. \tag{7.12}$$
This is the first time that a certain type of R-value is introduced, a construction which is very popular in crystallography. The "R" stands for residual or reliability; sometimes R-values are also called agreement factors. We shall learn about further types of R-values in the next chapters, especially in Chapter 8 on refinements. The R-value is always a relative quantity, is frequently given in percent and thus has no physical dimension. A second type of R-value is Rσ, defined by
$$R_{\sigma} = \frac{\sum_{hkl} \sigma(I(hkl))}{\sum_{hkl} I(hkl)}. \tag{7.13}$$
We note that Rint as well as Rσ can also be considered for Frel²(hkl) and Frel(hkl). Both R-values should be as small as possible. A large Rint can indicate either that equivalent reflections are not correctly measured or that merging was made in the wrong symmetry. In this case the choice of the space group should be reconsidered. Rσ is a measure of the diffraction property of the crystal and thus also of the quality of the data set. If the majority of the reflections are strong, the sum in the denominator of (7.13) is large and the σ's in the numerator are small, giving a small Rσ. For weakly diffracting crystals the opposite holds. For the data sets of our four sample structures we obtained upon data reduction the quantities given below:

Structure | Reflections collected | Unique | Rint  | Rσ    | Detector
KAMTRA    | 2174                  | 1004   | 1.1 % | 1.2 % | point
SUCROS    | 2619                  | 1310   | 0.6 % | 0.8 % | point
B12       | 660822                | 98204  | 3.6 % | 3.5 % | area
C60F18    | 245274                | 14898  | 4.3 % | 2.4 % | area
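The two agreement factors of eqs. (7.12) and (7.13) can be computed in a few lines once every measured reflection has been assigned to its symmetry-unique index. The sketch below assumes that this assignment has already been done by the caller (it depends on the Laue group and on whether Friedel pairs are merged); the function name and the small data set are invented for illustration.

```python
from collections import defaultdict


def merging_r_values(reflections):
    """Return (Rint, Rsigma) following eqs. (7.12) and (7.13).

    reflections: iterable of (unique_hkl, intensity, sigma), where unique_hkl
    is the symmetry-unique index the measurement has been mapped to.
    """
    groups = defaultdict(list)
    for unique_hkl, intensity, sigma in reflections:
        groups[unique_hkl].append((intensity, sigma))

    deviation_sum = intensity_sum = sigma_sum = 0.0
    for observations in groups.values():
        i_av = sum(i for i, _ in observations) / len(observations)
        deviation_sum += sum(abs(i - i_av) for i, _ in observations)
        intensity_sum += sum(i for i, _ in observations)
        sigma_sum += sum(s for _, s in observations)
    return deviation_sum / intensity_sum, sigma_sum / intensity_sum


# Hypothetical measurements: two equivalents each of two unique reflections
data = [
    ((1, 0, 0), 100.0, 2.0),
    ((1, 0, 0), 104.0, 2.0),
    ((0, 2, 0), 50.0, 1.0),
    ((0, 2, 0), 46.0, 1.0),
]
r_int, r_sigma = merging_r_values(data)
print(f"Rint = {100 * r_int:.1f} %, Rsigma = {100 * r_sigma:.1f} %")   # Rint = 2.7 %, Rsigma = 2.0 %
```

Merging the same data in a wrong, too high symmetry would put non-equivalent reflections into one group and inflate Rint, which is the warning sign described above.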
7.2 Fourier methods

7.2.1 Interpretation of the Patterson function

As has been pointed out in Section 2.1, the phase problem is the central problem in single crystal analysis. It results from the impossibility of calculating the Fourier transform of F to obtain ρ because of the missing phases of the F's. To overcome this difficulty, the F's may be replaced by FF*, for which the Fourier transform is possible. The result is the Patterson function

$$P(\mathbf{u}) = \int_{V^*} F(\mathbf{h})\,F^*(\mathbf{h})\,e^{-2\pi i(\mathbf{h}\mathbf{u})}\,dV^* \tag{2.5}$$

which we have already shown to be the convolution square of ρ
$$P(\mathbf{u}) = \int_V \rho(\mathbf{r})\,\rho(\mathbf{r} - \mathbf{u})\,dV. \tag{2.6}$$
From the property that P(u) is the convolution square of ρ, useful geometrical interpretations can be derived which permit the solution of the phase problem in structures that have some special properties. Let us suppose that the structure is represented by point-shaped atoms with a weight given by their atomic numbers, as indicated by the two-dimensional model of benzene in Figure 7.3. The integral (2.6) then degenerates to a sum (because we have discrete points)

$$P(\mathbf{u}) \sim \sum_{\text{atoms}} \rho(\mathbf{r})\,\rho(\mathbf{r} - \mathbf{u}).$$
This can be interpreted as follows. To obtain P(u), shift the structure ρ by the vector u [dashed representation in Figure 7.3(a), (b), (c)] to get ρ(r – u), calculate for every r the product ρ(r) ρ(r – u) and sum over all r. It follows that P(u) differs from zero only if in a product ρ(r) ρ(r – u), both factors are different from zero. This happens only if u is the difference vector of any pair of position vectors of two atoms of the structure, since only under this condition can one atom of the dashed structure coincide with an atom of the original model. This happens once in Figure 7.3(a), where atom no. 4 of the dashed molecule coincides with atom no. 1 of the non-shifted structure. The weight of P(u) is then proportional to the product of weights of contributing atoms. If u does not satisfy this condition, we have P(u) = 0 [Figure 7.3(b)]. It is, of course, possible that more than one pair of position vectors coincide for a special u [Figure 7.3(c)]. In this case, u is said to have a multiplicity larger than one and P(u) has a weight proportional to the sum of the product weights of contributing atoms.
Figure 7.3: Interpretation of the Patterson function as a convolution square demonstrated by the benzene molecule: (a) u is equal to the difference vector between two atomic positions [C(1) − C(4)], causing P(u) ≠ 0; (b) u does not coincide with any atomic difference vector, thus P(u) = 0; (c) u is identical with two atomic difference vectors [C(6) − C(1) and C(4) − C(3)], P(u) ∼ 2 × 6².
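The point-atom interpretation illustrated in Figure 7.3 can be reproduced numerically by listing all interatomic difference vectors and summing the products of the atomic numbers. The sketch below does this for a planar carbon hexagon; the C–C distance of 1.39 Å, the orientation of the ring and the function name are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def patterson_peaks(atoms, ndigits=3):
    """Point-atom Patterson map: {u: sum of Z_i * Z_j over all pairs with r_i - r_j = u}.

    atoms: list of (Z, x, y). Difference vectors are rounded to ndigits so that
    vectors that coincide within rounding error fall on the same peak.
    """
    peaks = defaultdict(float)
    for z_i, x_i, y_i in atoms:
        for z_j, x_j, y_j in atoms:
            u = (round(x_i - x_j, ndigits), round(y_i - y_j, ndigits))
            peaks[u] += z_i * z_j
    return dict(peaks)


bond = 1.39  # angstrom, approximate C-C distance in benzene (assumed value)
benzene = [
    (6, bond * math.cos(math.radians(60 * k)), bond * math.sin(math.radians(60 * k)))
    for k in range(6)
]

peaks = patterson_peaks(benzene)
origin = peaks[(0.0, 0.0)]
others = sorted(w for u, w in peaks.items() if u != (0.0, 0.0))

print("origin peak weight:", origin)              # 216.0 = 6 atoms x 6^2
print("number of non-origin peaks:", len(others)) # 18
print("weights:", sorted(set(others)))            # [36.0, 72.0]
```

The six peaks of weight 36 are the single C(1) − C(4) type vectors across the ring, case (a) of Figure 7.3, while the twelve peaks of weight 72 = 2 × 6² arise from vectors that occur twice, case (c).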
From this interpretation of the convolution square, it can be said that the Patterson function represents all difference vectors of a given structure with a weight proportional to the frequency of occurrence times the weight of the contributing atoms. In Section A.1.3 (Appendix), we discussed this problem when, as an exercise, we solved the problem of drawing all difference vectors in the benzene ring (Figure A.4). For a real structure which has, instead of atomic points, a continuous electron density with maxima at the atomic centers, there is no fundamental difference. The Patterson function of a real electron density has in principle the same properties as that of the point-shaped model. Only the discrete Patterson points have to be replaced by more or less broad maxima, and a decrease to exactly zero can no longer be expected outside the difference vector positions.

With this geometrical interpretation of the Patterson function, the problem of structure determination can be expressed as follows: how to find the structure from a given distribution of all difference vectors between each pair of atomic position vectors. This problem cannot be solved in this general formulation. However, the situation becomes quite favorable if additional provisions are made. As we shall show, there is a good chance of a solution if the structure has a very simple geometry or contains one or a few atoms with an atomic number significantly larger than that of most of the other atoms; such a structure is called a heavy atom structure.
7.2.2 Heavy atom methods, principle of difference electron density

Suppose that we have the structure of benzene as in Figure 7.3, but with one hydrogen atom replaced by a substituent of high atomic number, e.g. bromine (Figure 7.4). The strategy for solving this structure would depend in detail on the space group symmetry; nevertheless, some general remarks can be made. If, for instance, the space group contains at least a center of symmetry, the Patterson function will show not only the difference vectors within one molecule (intramolecular difference vectors) but also those between each pair of atoms in different molecules (intermolecular difference vectors). The weights of the difference vectors can be
1 × 1 = 1 from H–H vectors,
1 × 6 = 6 from H–C vectors,
1 × 35 = 35 from H–Br vectors,
6 × 6 = 36 from C–C vectors,
6 × 35 = 210 from C–Br vectors,
35 × 35 = 1225 from Br–Br vectors.
The Br–Br vector will have the highest maximum in the Patterson function (except for the origin maximum) unless the frequency of other difference vector types is high
Figure 7.4: Obtaining the heavy atom position from a centrosymmetric heavy atom structure.
which is rather unlikely in most cases. The Br–Br vector, however, combines the position vector r(Br) with its centrosymmetric vector −r(Br), so that

u(Br–Br) = r(Br) − [−r(Br)] = 2r(Br)

holds. Therefore, the vector having the highest maximum is likely to be twice the position vector of the bromine atom. Hence the location of this atom in the unit cell is determined. This derivation was only possible because just one heavy atom was present in the structure, resulting in only one significantly high maximum of the Patterson function. If there are a few more heavy atoms in the structure, it may also be possible to identify further heavy atom difference vectors, but with increasing complexity.

Over a certain period of time it was very popular to apply the so-called Patterson vector search methods [1] (also Faltmolekülmethode in German) [2]. The basic idea was as follows: If for a given structure its geometry, or at least the geometry of a sufficiently large fragment, was known, then all difference vectors could be calculated. By rotation and translation of this set of vectors, attempts were made to find an agreement with the maxima of the experimental Patterson function. Then the positions of the given fragment atoms could be expected to be determined. Vector search procedures are still applied in DIRDIF and PATSEE, mentioned in Chapter 6, and can be very useful in the rare cases when direct methods (see Section 7.3) fail.

With the heavy atom or fragment position(s) known, it becomes possible, in favorable cases, to complete the whole structure determination. This is done using the principle of difference electron density calculation (difference Fourier synthesis or difference synthesis). It is customary in crystallography to denote the numerical calculation of the Fourier transformation from the reciprocal to the direct space as a synthesis; thus the term Patterson synthesis is also frequently used. Suppose that some part of the structure, denoted ρc, is already known. From this partial model, the Fourier transform can be calculated and we get $F_c(\mathbf{h}) = \mathcal{F}[\rho_c(\mathbf{r})]$, where the subscript c indicates that these quantities are structure factors calculated from a given model. Expressed as a sum after eq. (4.19), Fc(h) is given by

$$F_c(\mathbf{h}) = \sum_{j=1}^{N_c} f_j\,e^{2\pi i\,\mathbf{h}\mathbf{r}_j}\,e^{-T_j} \tag{7.14}$$
with Nc < N indicating that only Nc of the N atoms are known. The principle of difference synthesis is based on the assumption that, if Nc does not differ too much from N, the phase ϕ(Fc) [known from (7.14)] is approximately that of F itself. By assigning the phases of Fc to the Frel's, we obtain phases for the experimentally derived F's. A scaling of the Frel's can be obtained from the condition
$$\sum_{\mathbf{h}} c\,F_{rel}(\mathbf{h}) = \sum_{\mathbf{h}} |F_c(\mathbf{h})|. \tag{7.15}$$
Using (7.15), the quotient
$$c = \frac{\sum_{\mathbf{h}} |F_c(\mathbf{h})|}{\sum_{\mathbf{h}} F_{rel}(\mathbf{h})}$$
gives a first guess of the scale factor c, which is called the Frel scale factor since it acts on Frel. The quantity |Fo|, defined by
$$|F_o| = c\,F_{rel} \tag{7.16}$$
is called the observed structure amplitude, since it is derived directly from experimental observations. With Fo and Fc introduced, we can express the difference electron density calculation as
$$\Delta\rho(\mathbf{r}) = \mathcal{F}^{-1}[F_o(\mathbf{h}) - F_c(\mathbf{h})] \tag{7.17}$$
with ϕ[Fo(h)] being approximated by ϕ[Fc(h)]. From the linearity of the Fourier transform, it follows that
$$\Delta\rho(\mathbf{r}) = \mathcal{F}^{-1}[F_o(\mathbf{h})] - \mathcal{F}^{-1}[F_c(\mathbf{h})].$$
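The scaling and phase assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three reflections and their values are invented toy data, the function names are hypothetical, and phases are handled in radians (as returned by `cmath.phase`) rather than in fractions of a cycle.

```python
import cmath

def scale_factor(f_rel, f_calc):
    """First guess of c: sum of |Fc| divided by sum of Frel."""
    return sum(abs(fc) for fc in f_calc) / sum(f_rel)

def difference_coefficients(f_rel, f_calc):
    """Return c and the coefficients (|Fo| - |Fc|) * exp(i*phi_c) used in (7.17)."""
    c = scale_factor(f_rel, f_calc)
    coeffs = []
    for frel, fc in zip(f_rel, f_calc):
        phi_c = cmath.phase(fc)      # phase of Fc in radians
        f_obs = c * frel             # |Fo| = c * Frel, eq. (7.16)
        coeffs.append((f_obs - abs(fc)) * cmath.exp(1j * phi_c))
    return c, coeffs

# Hypothetical toy data: unscaled magnitudes and Fc from a partial model
f_rel = [10.0, 4.0, 6.0]
f_calc = [complex(18, 0), complex(0, 9), complex(-8, 8)]
c, dcoef = difference_coefficients(f_rel, f_calc)
```

The coefficients in `dcoef` would then be passed to a Fourier synthesis routine; a negative difference |Fo| − |Fc| simply appears as a phase shifted by half a cycle.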
Let us denote by ρo(r) the observed electron density, i.e. that given by the experimental data together with the true phases. Note that ρo(r) is not equal to the true ρ, which of course can never be obtained, since neither can all F's be measured nor can experimental errors be completely excluded. Moreover, let us define ρc(r) = F⁻¹[Fc(h)] and, approximately, ρo(r) ≈ F⁻¹[Fo(h)]. Note that for ρo only "≈" can be written, since the phases of Fo are only approximately determined. Then we get
$$\Delta\rho(\mathbf{r}) \approx \rho_o(\mathbf{r}) - \rho_c(\mathbf{r}) \tag{7.18}$$
which means that by application of (7.17) we get an electron density distribution from which the known atoms are removed, so that the remaining residue should show the missing part of the structure. In practice, however, the success of this difference Fourier technique will depend on the amount and the accuracy of the known part of the structure, since it is on these parameters that the validity of approximating ϕ(Fo) by ϕ(Fc) depends. Unfortunately, it cannot be determined in advance whether a structural fragment is sufficient to ensure the success of this difference Fourier technique. For this reason, the use of a heavy atom position from the Patterson function is generally necessary for the further progress of structure determination. It is evident that in the sum for F,
$$F(\mathbf{h}) = \underbrace{\sum_{\text{heavy}} f_h\, e^{2\pi i\,\mathbf{h}\mathbf{r}_h}\, e^{-T_h}}_{F_c} + \sum_{\text{light}} f_l\, e^{2\pi i\,\mathbf{h}\mathbf{r}_l}\, e^{-T_l},$$
the weight of Fc will increase and thus its phase will be close to that of F if fh >> fl, which is the definition of a heavy atom structure. This is illustrated in Figure 7.5. Suppose that we have the case of the bromobenzene structure of Figure 7.4. Then the expression for F(h) consists of 7 contributions (hydrogens not considered), one for bromine and six for carbon. Since for small sinθ/λ the scattering factors are close to the atomic numbers, the complex bromine vector is significantly longer than any of the carbon vectors, so that the phase of the bromine vector should approximate the correct phase rather closely.

Before trying to identify the heavy atom portion of the structure by interpretation of the Patterson function, the question should be considered whether this part of the structure is heavy enough for a successful structure determination. For that purpose, a rule of thumb is given by the so-called Sim quotient Q; i.e.
Figure 7.5: Schematic representation of F(h) as a sum over the contributing atoms in the plane of complex numbers. It is obvious that the contribution of the heavy atom (Br) plays a dominant role.
$$Q = \frac{\sum_h Z_h^2}{\sum_l Z_l^2}, \tag{7.19}$$
where the Z’s are the atomic numbers of the atoms in the unit cell and the summation over h (heavy atoms in a heavy atom structure) in the numerator is over all atoms already known, while the sum over l (light atoms) in the denominator is taken over all unknown atoms. If Q > 1,
(7.20)
then the application of the heavy atom method is likely to be successful. However, with very precise diffractometer data, the criterion (7.20) can be relaxed. There is a good chance of success with the heavy atom method if the sum of the squares of the heavy atom atomic numbers is not significantly less than that of the light atoms. A value for Q of 0.5 still gives a reasonable chance of success. Our test structure KAMTRA, formula C4H5O6K, is a good example for testing this rule. With the potassium atom as the heavy atom, we get a Sim quotient:
$$Q = \frac{19^2}{4 \times 6^2 + 6 \times 8^2 + 5} = \frac{361}{144 + 384 + 5} = \frac{361}{533} \approx 0.7.$$
As mentioned above, this value might be sufficient for the heavy atom method to be successful. We shall try it, but first we have to learn more about some special properties of the Patterson function which derive from the space group symmetry.
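The Sim quotient (7.19) is simple enough to evaluate by hand, but a short script makes it easy to try other compositions. The following is a minimal sketch; the function name and the small table of atomic numbers are illustrative choices, and the bromobenzene composition C6H5Br is assumed from the example of Figure 7.4.

```python
# Atomic numbers of the elements needed for the two examples
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8, "K": 19, "Br": 35}

def sim_quotient(known, unknown):
    """Q = sum(Z^2 of known heavy atoms) / sum(Z^2 of unknown light atoms)."""
    num = sum(n * ATOMIC_NUMBER[el] ** 2 for el, n in known.items())
    den = sum(n * ATOMIC_NUMBER[el] ** 2 for el, n in unknown.items())
    return num / den

# KAMTRA, C4H5O6K, with potassium as the heavy atom
q_kamtra = sim_quotient({"K": 1}, {"C": 4, "H": 5, "O": 6})

# Bromobenzene, C6H5Br, with bromine as the heavy atom
q_bromobenzene = sim_quotient({"Br": 1}, {"C": 6, "H": 5})
```

For KAMTRA this reproduces 361/533, i.e. a value between the relaxed limit of 0.5 and the criterion (7.20), while bromobenzene lies well above 1.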
7.2.3 Harker sections, applications to KAMTRA

It was first pointed out by Harker (1936) [3] that space group symmetry can result in special sections of a Patterson synthesis which can be interpreted as projections of the
structure onto a plane or a line. These sections, called Harker planes or Harker lines, or generally Harker sections, are very useful for the interpretation of a Patterson function. Let us discuss, for example, the space group P21/c. The symmetry operation representing the screw axis is −x, 1/2 + y, 1/2 − z. Every atom with position vector r = (x, y, z) has its equivalent at r′ = (−x, 1/2 + y, 1/2 − z). Since the Patterson function contains the difference vectors between each pair of atoms, the vector
u(r) = r − r′ = (2x, −1/2, 2z − 1/2) = (u, −1/2, w)
will be present, and this is true for each atom of the structure. From the periodicity of the crystal lattice, +1 can be added to the y- and z-components and we get
u(r) = (2x, 1/2, 2z + 1/2) = (u, 1/2, w).
This result can be interpreted as a projection of the structure on the plane y = 1/2 in twice the original scale and shifted in z by 1/2. On this Harker plane, a maximum at (u, 1/2, w) gives the atomic coordinates
x = u/2, z = (w − 1/2)/2.
Since this is true for each atom, in principle a complete two-dimensional projection of the structure can be obtained. There is another Harker section for the space group P21/c, namely a Harker line caused by the glide plane symmetry. The difference vector of r = (x, y, z) and r′ = (x, 1/2 − y, 1/2 + z) is
u(r) = (0, 2y − 1/2, −1/2) = (0, 2y + 1/2, 1/2).
This can be interpreted as a projection of the structure on the line (0, v, 1/2). Generally, the projection of a complete structure on a single line is not very helpful, but in combination with a Harker plane such a line projection may give useful information. If, for instance, the x and z coordinates of a heavy-atom position have been determined from a Harker plane, it may be possible to obtain the third coordinate from the Harker line.
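The two Harker sections of P21/c derived above can be reproduced mechanically. The sketch below is restricted, as an assumption, to symmetry operations that only change signs and add translations (which covers the operations used here); the function name and the tuple representation are illustrative.

```python
from fractions import Fraction

def harker_vector(signs, shifts):
    """Difference r - r' for r' = (s_x*x + t_x, s_y*y + t_y, s_z*z + t_z).

    Returns one (coefficient, constant) pair per axis, so that the
    component of u is coefficient * coordinate + constant (mod 1).
    """
    result = []
    for s, t in zip(signs, shifts):
        coeff = 1 - s        # 2 if the coordinate changes sign, otherwise 0
        const = (-t) % 1     # lattice translations may be added freely
        result.append((coeff, const))
    return result

half = Fraction(1, 2)
# Screw axis of P21/c: -x, 1/2 + y, 1/2 - z  ->  Harker plane (2x, 1/2, 2z + 1/2)
screw = harker_vector((-1, 1, -1), (0, half, half))
# Glide plane of P21/c: x, 1/2 - y, 1/2 + z  ->  Harker line (0, 2y + 1/2, 1/2)
glide = harker_vector((1, -1, 1), (0, half, half))
```

An axis with coefficient 0 carries only a constant, which is what makes the section a plane (one constant component) or a line (two constant components).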
Although a Harker plane may contain a projection of the complete structure and information about the third dimension can be derived from Harker lines, in practice, there are difficulties, because Harker maxima may be superimposed by other vectors which correspond to vectors between atoms which are not symmetry related. Nevertheless, Harker sections should always be inspected very carefully. In spite of some ambiguities, their interpretation is often possible if the geometry of the structure is rather simple (especially for planar molecules) or if a single heavy atom position has to be determined. It can be generalized, from the example of space group P21/c, that symmetry elements derived from rotation or screw axes will result in Harker planes while symmetry elements derived from mirror or glide planes cause Harker lines. If, for example, more than one rotational symmetry element is present, we have more than one Harker plane and the information derived from one Harker plane may be completed and confirmed by that from the others. Therefore, in space groups of higher symmetry, the information of all the Harker planes and lines may permit the interpretation of a significant part of the structure. A further symmetry-related property should be noted here, although its application is more limited. If the space group is centrosymmetric, every vector r = (x, y, z) has its equivalent in r′ = (−x, −y, −z) and we obtain a Patterson vector u (r) for every atomic vector r u(r) = (2x, 2y, 2z). It follows that the vector 2r of every atomic vector r is present in the Patterson synthesis. But usually these special vectors can only be resolved from one another for a heavy atom vector as demonstrated in the example of Figure 7.4. Such a vector does, however, provide a check whether the interpretation of the Harker sections was correct. Now let us apply the heavy atom method via Harker planes to the example of KAMTRA. 
The space group P212121 is very favorable for the application of Patterson methods, since from this symmetry three Harker planes can be derived. The following four symmetry operations
(1) x, y, z
(2) 1/2 − x, −y, 1/2 + z
(3) 1/2 + x, 1/2 − y, −z
(4) −x, 1/2 + y, 1/2 − z
generate three Harker planes, given by
(1)–(2): H1 (u, v, 1/2): (1/2 + 2x, 2y, 1/2)
(1)–(3): H2 (1/2, v, w): (1/2, 1/2 + 2y, 2z)
(1)–(4): H3 (u, 1/2, w): (2x, 1/2, 1/2 + 2z).
(A constant −1/2 is always replaced by +1/2, since +1/2 and −1/2 differ only by a translation of one unit cell.) For an inspection of these planes we have to calculate the Patterson function for the three sections given by u = 1/2, v = 1/2, w = 1/2. The results are shown in contour line representations in Figure 7.6. In all three sections, four high maxima are found which are all symmetry-related, so that only one independent high maximum is present. We assume that it is the potassium-potassium vector.
Figure 7.6: The three Harker planes of KAMTRA. (a) H1 at w = 1/2; (b) H2 at u = 1/2; (c) H3 at v = 1/2. The maxima analyzed for the potassium position are marked by arrows.
Taking H3, the position of one high maximum is at u = 0.663, w = 0.838. It follows that the x- and z-coordinates of the potassium position are
x = u/2 = 0.332, z = (w − 1/2)/2 = 0.169.
On H2, one of the four maxima should then be at w = 2z = 0.338. Indeed one maximum is at v = 0.562, w = 0.325. Within the uncertainties of interpolation, we can regard w = 0.325 as equal to w = 0.338. So we get from this Harker plane
y = (v − 1/2)/2 = 0.031, z = w/2 = 0.163.
We now have all three fractional coordinates of the potassium position: x = 0.332; y = 0.031; z = 0.166 (by averaging). We should then observe a high maximum on H1 at u = 0.5 + 0.664 = 1.164 ≡ 0.164 (from the periodicity) and v = 0.062. Indeed we find this maximum at u = 0.160 and v = 0.066. This consistency in the three Harker planes indicates that the potassium position vector has been found correctly. Its fractional coordinates are
x = 0.332, y = 0.033, z = 0.166.
No more information is likely to be obtained from the Patterson synthesis, since all further high maxima correspond to K–O or K–C vectors, of which there are so many that an interpretation is impossible. We must therefore complete the structure determination by a difference synthesis. We shall describe this process in the section concerned with refinement (see Section 8.4.1). At this stage we have established that the heavy atom method has been successful in locating the potassium position with certainty.
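The arithmetic of this Harker analysis can be retraced with a short script. It is a sketch that uses only the peak positions quoted in the text; the variable names are illustrative, and because the unrounded value x = 0.3315 is carried through, the predicted H1 peak comes out at u ≈ 0.163 rather than the 0.164 obtained from the rounded x.

```python
def wrap(value):
    """Bring a fractional coordinate into the range [0, 1)."""
    return value % 1.0

# Peak positions read from the Harker sections of Figure 7.6
u3, w3 = 0.663, 0.838     # H3: (2x, 1/2, 1/2 + 2z)
v2, w2 = 0.562, 0.325     # H2: (1/2, 1/2 + 2y, 2z)

x = u3 / 2
z_from_h3 = (w3 - 0.5) / 2
y = (v2 - 0.5) / 2
z_from_h2 = w2 / 2
z = (z_from_h3 + z_from_h2) / 2     # average of the two z estimates

# Predicted peak on H1: (1/2 + 2x, 2y, 1/2), to be compared with
# the observed maximum at u = 0.160, v = 0.066
u1_predicted = wrap(0.5 + 2 * x)
v1_predicted = wrap(2 * y)
```

The predicted and observed H1 positions agree within a few thousandths of a cell edge, which is the consistency check described above.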
7.2.4 Numerical calculation of Fourier syntheses

The calculation of the Patterson function and of the difference electron density introduced above are both Fourier inverse transforms. A program which is able to carry out all types of Fourier inverse transforms is called a Fourier synthesis program. The numerical formalism for these types of calculations is always the same, so we shall briefly discuss some details here. The calculation is based on the integral
$$R(\mathbf{r}) = \int_{V^*} G(\mathbf{b})\, e^{-2\pi i\,\mathbf{b}\mathbf{r}}\, dV^*$$
where G(b) is a general function of reciprocal space, which we shall replace later by the different types of actual functions. In the case of single crystals, it reduces, after eq. (4.12), to the sum
$$R(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} G(\mathbf{h})\, e^{-2\pi i\,\mathbf{h}\mathbf{r}}$$
or, with G(h) = |G(h)| e^{2πiϕ(h)}, ϕ(h) = phase of G(h),
$$R(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} |G(\mathbf{h})|\, e^{2\pi i[\phi(\mathbf{h}) - \mathbf{h}\mathbf{r}]}.$$
To avoid complex arithmetic, Friedel pairs of reflections can be coupled, provided that |G(−h)| = |G(h)| and ϕ(−h) = −ϕ(h) hold. Then we get
$$|G(\mathbf{h})|\, e^{2\pi i[\phi(\mathbf{h}) - \mathbf{h}\mathbf{r}]} + |G(-\mathbf{h})|\, e^{2\pi i[-\phi(\mathbf{h}) + \mathbf{h}\mathbf{r}]} = 2\,|G(\mathbf{h})| \cos 2\pi[\mathbf{h}\mathbf{r} - \phi(\mathbf{h})].$$
Hence,
$$R(\mathbf{r}) = \frac{2}{V} \sum_{h=0}^{\infty}\ \sum_{k=-\infty}^{+\infty}\ \sum_{l=-\infty}^{+\infty} |G(\mathbf{h})| \cos 2\pi[hx + ky + lz - \phi(hkl)]. \tag{7.21}$$
The only reflection having no Friedel pair is the 000 reflection. The intensity of the 000 reflection, and therefore |F(000)|, can never be obtained experimentally, since it would have to be measured at θ = 0°. However, F(000) can be calculated theoretically. From eq. (4.11) for F(h), we get for h = 0:
$$F(\mathbf{0}) = \sum_{j=1}^{N} f_j(0)\, e^{0}.$$
Since for s = sinθ/λ = 0 the fj's are equal to the atomic numbers Zj of the contributing atoms, we get
$$F(000) = \sum_{j=1}^{N} Z_j. \tag{7.22}$$
In other words, the magnitude of F(000) is given by the number of electrons present in the unit cell. It follows from (7.22) that the phase of F(000) is also known and is equal to zero. We can say that F(000) is the only structure factor the magnitude and phase of which are originally known.

The following functions are commonly used for G(h):
(1) |G(h)| = |Fo(h)|, ϕ(h) = ϕ[Fc(h)]. This is called an Fo-Fourier synthesis and the result is the observed electron density distribution ρo(r), with reservations arising from possible uncertainties of the Fc-phases.
(2) G(h) = Fc(h). The result is a so-called Fc-Fourier synthesis representing the calculated electron density ρc(r).
(3) |G(h)| = |Fo| − |Fc|, ϕ(h) = ϕ(Fc). From this calculation a difference synthesis, as discussed in Section 7.2.2, is obtained.
(4) G(h) = Frel². Since Frel² is proportional to FF* [see eq. (7.8)], this calculation produces a Patterson synthesis. The Frel's are usually unscaled. The lack of a proper scaling of the Frel's affects only the scaling of the Patterson function. However, since the user is only interested in the positions of maxima and their relative weights, this is of no significance.

For the calculation of a Fourier synthesis, a three-dimensional grid is used to subdivide the unit cell into "grid points". The calculation of R(r) is then performed for every grid point of a specified section of the unit cell. The grid can either be chosen by the user or is provided automatically by the Fourier program. The number of grid points in each direction should be chosen in relation to the length of the axes and the resolution of the structure determination. As already pointed out, the resolution cannot be better than half the wavelength of the radiation used (≈ 0.8 Å for CuKα), so it is not sensible to choose more grid points than necessary for this resolution. A choice of three grid points/Å is in most cases reasonable. The output of a Fourier synthesis is generally not printed but stored in a mass storage file. It is then common practice to analyze the results by a peak-searching
program which determines all peaks above a given limit, sorts the peaks with respect to their height, and either prints out or stores the fractional coordinates of a specified number of the highest maxima. This can be followed by a fragment identification program which tries to generate a connected set of atoms from chemical considerations using sensible bond lengths and angles defaults. If more information than peak positions and relative height are desired, output maps can be generated consisting of two-dimensional sections. Each section, or page, consists of the density results for the grid value in that plane. By contouring equidensity points, the density distribution may be illustrated graphically. This then makes fine details visible. The three Harker planes shown in Figure 7.6 are examples of this type of representation.
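The combination of a Fourier synthesis on a grid and a subsequent peak search can be illustrated with a one-dimensional analogue of eq. (7.21). The sketch below is not the algorithm of any particular crystallographic program; the point-atom model, the atom positions, the grid size and all function names are assumptions chosen to keep the example small.

```python
import cmath
import math

def structure_factors(positions, h_max):
    """G(h), h = 0..h_max, for unit point scatterers at fractional positions."""
    return {h: sum(cmath.exp(2j * math.pi * h * x) for x in positions)
            for h in range(h_max + 1)}

def fourier_synthesis(G, n_grid, length=1.0):
    """R(x) on a grid; Friedel pairs are coupled into cosine terms."""
    density = []
    for i in range(n_grid):
        x = i / n_grid
        total = abs(G[0])                 # the 000 term has no Friedel pair
        for h, g in G.items():
            if h == 0:
                continue
            phi = cmath.phase(g) / (2 * math.pi)     # phase in cycles
            total += 2 * abs(g) * math.cos(2 * math.pi * (h * x - phi))
        density.append(total / length)
    return density

def peak_search(density, limit):
    """Local maxima above `limit` on a periodic grid, sorted by height."""
    n = len(density)
    peaks = [(density[i], i / n) for i in range(n)
             if density[i] > limit
             and density[i] >= density[i - 1]
             and density[i] > density[(i + 1) % n]]
    return sorted(peaks, reverse=True)

atoms = [0.10, 0.45, 0.80]                # hypothetical fractional coordinates
G = structure_factors(atoms, h_max=15)
rho = fourier_synthesis(G, n_grid=100)
peaks = peak_search(rho, limit=0.5 * max(rho))
```

With the limit set to half the highest density value, the series termination ripples are rejected and only the three atomic maxima remain in `peaks`.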
7.3 Direct methods

Up to the early 1960s, the principal method of structure determination was the interpretation of the Patterson function. As a consequence, a great number of chemical compounds, especially the so-called light atom structures of organic chemistry, were very difficult or impossible to investigate by single crystal structure analysis. The development of the so-called Direct Methods of phase determination has completely removed this limitation. The name "Direct Methods" is derived from the fact that the phases of the structure factors are derived from the magnitudes of the F's rather than indirectly by an interpretation of the Patterson function. However, extensive numerical calculations are necessary to apply this technique. It was an important step forward for X-ray crystallography that the development of direct methods and the rapid progress in computer technology happened almost at the same time in the 1960s.

The detailed results of the theory of direct methods have been published in a large number of publications which use a non-trivial mathematical formalism. For our purposes, it is sufficient to present the fundamental ideas of the method and the formulae most used in practical applications. A short and not too detailed presentation of the principles of direct methods can be given if we consider a very simplified model of the crystal structure. We shall then see that the fundamental formulae can easily be derived. This simple model, which also requires a modification of the structure factor expression, is introduced below.
7.3.1 Normalization

The structure factor F(h), as derived in Section 4.2.2 [formula (4.19)], is based on a model of N atoms with different electron densities of finite dimensions, with their thermal motion taken into account by the temperature (or displacement) factor
expression. In direct methods, it is advantageous to consider instead an idealized model of point-shaped atoms of unique electron density with no thermal motion. For this model (see Figure 7.7), quantities are derived which play an important role in phase determination. These are known as normalized structure factors or E-values, since they are usually represented by the capital letter E(h). The process of calculating E-values is called normalization. For the model introduced above, the entire electron density, ρn, can be expressed as
$$\rho_n(\mathbf{r}) = \sum_{j=1}^{N} \Delta(\mathbf{r} - \mathbf{r}_j) \tag{7.23}$$
where Δ(r) is Dirac's delta function. Using the shift theorem [Appendix, Section A2, formula (A.24)], we get for the structure factor G(h) of that model
$$G(\mathbf{h}) = \sum_{j=1}^{N} g(\mathbf{h})\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j}$$
where g(h) is the Fourier transform of Δ(r). Since it can be shown that the Fourier transform of a delta function is equal to 1, we get
$$G(\mathbf{h}) = \sum_{j=1}^{N} e^{2\pi i\,\mathbf{h}\mathbf{r}_j}. \tag{7.24}$$
The quasi-normalized structure factor E′(h) [4] is defined by the following equation
$$|E'(\mathbf{h})|^2 = \frac{|F_e(\mathbf{h})|^2}{\sum_{j=1}^{N} f_j^2} \tag{7.25}$$
Figure 7.7: Idealized crystal model to be used in direct methods with point-shaped atoms of unique electron density and no thermal motion (left). For this model a density overlap between neighboring atoms does not happen (right).
with
$$F_e(\mathbf{h}) = \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j}$$
being the structure factor of a model with thermal motion excluded. If all atoms in the unit cell are alike, then fj = f for all j, and (7.25) reduces to
$$|E'(\mathbf{h})|^2 = \frac{f^2 \left|\sum_{j=1}^{N} e^{2\pi i\,\mathbf{h}\mathbf{r}_j}\right|^2}{N f^2}.$$
It follows that
$$|E'(\mathbf{h})| = \frac{1}{\sqrt{N}}\, |G(\mathbf{h})|.$$
From this result it follows that, for a structure of one type of atoms, the absolute values of E′ and G differ by a constant factor. We can therefore interpret the E′-values as structure factors of an idealized model with all atoms having the same unique electron density and no thermal motion. The normalized structure factors, or E-values, are defined by
$$|E(\mathbf{h})|^2 = \frac{|E'(\mathbf{h})|^2}{\varepsilon} = \frac{|F_e(\mathbf{h})|^2}{\varepsilon \sum_{j=1}^{N} f_j^2} \tag{7.26}$$
with ε being related to some symmetry properties of the space group. Note that neither (7.25) nor (7.26) defines a phase for the E′- and E-values; these formulae give only the absolute value of these quantities. It is customary to assign the phase of F to the E-value of the same reflection, so that a phase determination of the E's would determine that of the F's, and vice versa. The advantage of using E-values instead of G-values is that the E's can be calculated (at least approximately) from the experimental data, while the G's can only be derived from the known structure. The quantity ε is a small positive integer which depends on the space group symmetry. It is usually equal to 1 for general reflections, but may have larger values for some reflection classes, depending on the symmetry. For instance, in space group P212121, ε = 2 for the series h00, 0k0, 00l, while for all other reflections ε is equal to 1. The correct ε-values for each reflection class in every space group are provided by normalization programs, which all contain an ε-calculation routine. Since the quasi-normalized and the normalized structure factors differ only by that factor ε, the physical interpretation of E-values is in principle the same as that
of E′-values. The main difficulty with the calculation of E-values from the experimentally derived Frel's arises from the scale and temperature factors, which are originally unknown. It follows from (7.8) that
$$F_{rel}(\mathbf{h}) = t \left| \sum_{j=1}^{N} f_j \exp[2\pi i\,\mathbf{h}\mathbf{r}_j] \exp[-(B_j \sin^2\theta)/\lambda^2] \right|.$$
Assuming, as a first approximation, that the Bj are equal for all atoms (i.e. the individual isotropic Bj's are replaced by an overall isotropic B), we get
$$F_{rel}(\mathbf{h}) = t \exp[-(B \sin^2\theta)/\lambda^2]\, |F_e(\mathbf{h})|$$
or
$$|F_e(\mathbf{h})|^2 = \frac{F_{rel}(\mathbf{h})^2}{t^2 \exp[-(2B \sin^2\theta)/\lambda^2]}. \tag{7.27}$$
|Fe|² is needed for the calculation of E-values and Frel is obtained from the experimental data. The unknown quantities are t and B. They can be estimated following a method proposed by Wilson [5]. Using the variable s = sinθ/λ and taking the averages ⟨|Fe(h)|²⟩ and ⟨Frel(h)²⟩ for a given s, we get
$$t^2 \exp[-2Bs^2] = \frac{\langle F_{rel}(\mathbf{h})^2 \rangle}{\langle |F_e(\mathbf{h})|^2 \rangle}.$$
The average ⟨|Fe(h)|²⟩ can be calculated as follows:
$$|F_e(\mathbf{h})|^2 = F_e(\mathbf{h})\, F_e(\mathbf{h})^* = \left( \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j} \right) \left( \sum_{j=1}^{N} f_j\, e^{-2\pi i\,\mathbf{h}\mathbf{r}_j} \right).$$
This can be separated into
$$|F_e(\mathbf{h})|^2 = \sum_{j=1}^{N} f_j^2 + \sum_{n=1}^{N} \sum_{m=1}^{N} f_n f_m\, e^{2\pi i\,\mathbf{h}(\mathbf{r}_n - \mathbf{r}_m)}$$
(with n ≠ m in the double sum). If the average is taken, a summation over all h is necessary so that an equal number of positive and negative terms are produced in the second sum, which will therefore tend to zero. It follows that
$$\langle |F_e(\mathbf{h})|^2 \rangle = \sum_{j=1}^{N} f_j^2. \tag{7.28}$$
Then we get
$$t^2 \exp[-2Bs^2] = \frac{\langle F_{rel}(\mathbf{h})^2 \rangle}{\sum_{j=1}^{N} f_j^2} = K(s).$$
Taking the natural logarithm on both sides, we get
$$\ln t^2 - 2Bs^2 = \ln[K(s)]. \tag{7.29}$$
Equation (7.29) is that of a straight line if ln[K(s)] is plotted versus s², with slope −2B and intercept ln t² at s² = 0. In practice, this plot is made in the following way. A number of sinθ/λ intervals is chosen. For all reflections in each interval, the average Frel² is calculated together with the sum in the denominator of K(s), using the mean atomic form factors fj of this interval. In this way, ln[K(s)] can be obtained and plotted versus s² (Wilson plot). Approximating that plot by a straight line, its slope −2B gives the overall isotropic temperature factor, B, and the intercept at s² = 0 is ln t², from which the scale factor t can be derived. With B and t known, the |Fe(h)|² values can be calculated from (7.27) and then the E-values from (7.26).

Each computer program for this calculation requires as input the unit cell and space group information, the Frel's (usually the output of the data reduction program), the atomic form factors and, finally, the unit cell contents, in order to calculate the fj² sums. The last calculation is difficult to carry out if the chemical identity of the compound being investigated is not completely known. In practice, however, it is sufficient to input the approximate unit cell contents. It has been shown that even considerable deviations from the true unit cell contents do not significantly affect the E-value calculation. For example, if there are four molecules in a unit cell, each having the formula, say, C18H35O5, it is quite sufficient at this stage to approximate the cell contents by 4 × (C20H40). It follows from (7.26) and (7.28) that the average of E² is
$$\langle E^2 \rangle = 1. \tag{7.30}$$
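The straight-line approximation of the Wilson plot is an ordinary least-squares fit of eq. (7.29). The following sketch shows only this fitting step; the data are synthetic and noise-free, generated from assumed values B = 3.0 Å² and t = 0.5, and the function name is hypothetical. In a real application the ln[K(s)] values would come from the interval averages described above.

```python
import math

def wilson_fit(s_squared, ln_k):
    """Least-squares line ln K = ln t^2 - 2*B*s^2; returns (B, t)."""
    n = len(s_squared)
    mean_x = sum(s_squared) / n
    mean_y = sum(ln_k) / n
    sxx = sum((x - mean_x) ** 2 for x in s_squared)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s_squared, ln_k))
    slope = sxy / sxx                      # equals -2B
    intercept = mean_y - slope * mean_x    # equals ln t^2
    return -slope / 2.0, math.exp(intercept / 2.0)

# Synthetic Wilson plot from assumed B and t (not experimental data)
B_true, t_true = 3.0, 0.5
s2 = [0.02 * i for i in range(1, 11)]
ln_K = [math.log(t_true ** 2) - 2 * B_true * x for x in s2]

B_fit, t_fit = wilson_fit(s2, ln_K)
```

With real data the points scatter around the line, so the fitted B and t are estimates whose quality depends on the number of reflections per interval.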
Thus the magnitudes of E-values are independent of the actual structure and do not differ too much from 1. Usually the highest E-values of a structure are less than 3 or 4. A good check for severe errors in intensity data is to examine whether extremely large E-values are present. If a reflection is found having an E-value >> 5 or so, it is quite certain that there is an error in the intensity value of this reflection.
Similarly, from the results of (7.28) and (7.30), further averages for E-values can be calculated. These averages, as well as the distribution of E-values, do not depend on the actual structure, but only on whether or not the structure is centrosymmetric. Based on calculations made by Wilson [6], it can, for instance, be shown that the average ⟨|E|⟩ has the following values:
$$\langle |E| \rangle = \sqrt{2/\pi} = 0.798 \ \text{for centrosymmetric structures}, \qquad \langle |E| \rangle = \tfrac{1}{2}\sqrt{\pi} = 0.886 \ \text{for acentric structures}. \tag{7.31}$$
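The theoretical averages and fractions can be reproduced from the two Wilson distributions: in the centric case E follows a real normal distribution with unit variance, in the acentric case |E|² follows an exponential distribution with unit mean. The closed-form expressions below follow from these two distributions; the function and key names are illustrative.

```python
import math

def centric_stats():
    """E real and normally distributed with <E^2> = 1."""
    return {
        "mean_abs_E": math.sqrt(2 / math.pi),
        "mean_abs_E2_minus_1": 2 * math.sqrt(2 / (math.pi * math.e)),
        "fraction_above": lambda e: math.erfc(e / math.sqrt(2)),
    }

def acentric_stats():
    """|E|^2 exponentially distributed with mean 1."""
    return {
        "mean_abs_E": math.sqrt(math.pi) / 2,
        "mean_abs_E2_minus_1": 2 / math.e,
        "fraction_above": lambda e: math.exp(-e * e),
    }

centric = centric_stats()
acentric = acentric_stats()
centric_above_2 = centric["fraction_above"](2.0)      # fraction with |E| > 2
acentric_above_2 = acentric["fraction_above"](2.0)
```

Comparing such values with the experimental averages printed by a normalization program is the basis of the centric/acentric decision.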
Table 7.1a gives some theoretical values depending on the presence or absence of an inversion center, which can be used to decide whether the structure being investigated is centrosymmetric or not. Every computer program concerned with the normalization of intensity data will usually calculate these values, which enables the user to compare them with the theoretical values. Although the experimental values will never agree completely with the theory (due to experimental errors and to the fact that a finite number of reflections is considered, whereas the theoretical averages are based on an infinite number of reflections), it is usually possible to decide whether they indicate the centric or the acentric case. Also, if these values are very far from both the centric and the acentric values, this is a definite indication that something is wrong with the experimental data set.

Once E-values have been introduced, we note that in addition to the Fourier inverse transformations discussed in Section 7.2.4 two further types can be calculated. The first uses G(h) = E²(h) − 1. From this type of Fourier coefficient, we get a so-called vector map V(r). From its properties, this synthesis is a so-called sharpened, origin-removed Patterson synthesis. The interpretation of the Patterson function using Frel² is occasionally complicated
Table 7.1a: Some theoretical values relating to the distribution of normalized structure factors.

Quantity                  Centrosymmetric   Acentric
⟨|E|²⟩                    1.0               1.0
⟨|E|⟩                     0.798             0.886
⟨|E² − 1|⟩                0.968             0.736
Amount with |E| > 1.0     31.7%             36.8%
Amount with |E| > 2.0     4.6%              1.8%
Amount with |E| > 2.5     1.2%              0.2%
Table 7.1b: Actual values found for the example structures.

Quantity                  SUCROS            C60F18
⟨|E|⟩                     0.880             0.877
⟨|E² − 1|⟩                0.762             0.765
Amount with |E| > 1.0     34.1%             35.0%
Amount with |E| > 2.0     2.2%              2.5%
Conclusion                Acentric          Acentric
by overlap of two or more maxima, caused by the fact that the atomic maxima and thus the difference vector maxima have a finite extension. Since the E-values relate to atoms without thermal motion, it follows that the maxima derived from a calculation using E²-values should have less peak overlap. These peaks are said to be sharpened relative to those of a normal Patterson synthesis. The large maximum at the origin also obscures the analysis of difference vectors of short length. By using (E² − 1) instead of E² as Fourier coefficients, the origin peak can be removed:
$$V(\mathbf{r} = 0) = \frac{2}{V} \sum_{\mathbf{h}} \left( E^2(\mathbf{h}) - 1 \right) \cos(0) = \frac{2}{V} \left[ \Big\{ \sum_{\mathbf{h}} E^2(\mathbf{h}) \Big\} - M \right]$$
where M is the total number of reflections. Since it follows from (7.30) that
$$\sum_{\mathbf{h}} E^2(\mathbf{h}) = M,$$
we get V(r = 0) = 0, which means that the large origin maximum usually present in a Patterson synthesis is removed.

The second type uses |G(h)| = |E(h)|, with ϕ(h) derived from direct methods. This type of Fourier synthesis, called an E-map, will be used frequently in the application of direct methods (see Section 7.3.4). The result is an electron density distribution having sharper maxima than when |Fo|'s are used. However, E-maps more frequently show maxima that are not related to an actual atom position. These maxima, denoted as ghosts or false peaks, must be recognized and carefully eliminated from further calculations.

The Wilson plots of our test structures SUCROS and C60F18 are shown in Figure 7.8. Significant deviations from a straight line can be seen, especially for SUCROS. Such deviations, in the form of an S, are frequently observed and are not the result of experimental errors. From the approximation of the Wilson plots by least-squares lines, the slope and intercept are obtained, see inserts in Figure 7.8.

Figure 7.8: Wilson plots of (a) SUCROS and (b) C60F18 as obtained from SIR [7]. In the output of SIR ln[K(s)] is plotted in the horizontal and s² in the vertical direction.
7.3.2 Fundamental formulae

The principles of direct methods are based on the fact that the experimental data, that is the |Frel|'s or the magnitudes of the Fo's, already contain information about the phases of the structure factors. This was first pointed out by Harker and Kasper in their so-called "Harker–Kasper inequalities" [8]. By application of the Cauchy–Schwarz inequality
$$\left| \int f(x)\, g(x)\, dx \right|^2 \le \left( \int |f(x)|^2\, dx \right) \left( \int |g(x)|^2\, dx \right)$$
to the expression of the structure factor, some remarkable results can be obtained. With
$$F(\mathbf{h}) = \int_V \rho(\mathbf{r})\, e^{2\pi i\,\mathbf{h}\mathbf{r}}\, dV,$$
supposing ρ(r) ≥ 0 for all r, and
$$f(\mathbf{r}) = \rho(\mathbf{r})^{1/2}, \qquad g(\mathbf{r}) = \rho(\mathbf{r})^{1/2}\, e^{2\pi i\,\mathbf{h}\mathbf{r}},$$
we get
$$|F(\mathbf{h})|^2 \le \left( \int_V \rho(\mathbf{r})\, dV \right) \left( \int_V \rho(\mathbf{r})\, \left| e^{2\pi i\,\mathbf{h}\mathbf{r}} \right|^2 dV \right).$$
Since
$$\left| e^{2\pi i\,\mathbf{h}\mathbf{r}} \right|^2 = 1 \quad \text{and} \quad \int_V \rho(\mathbf{r})\, dV = Z,$$
with Z = F(000) (see 7.22) being equal to the number of electrons in the unit cell, we get
$$|F(\mathbf{h})| \le F(000) \tag{7.32}$$
which is, of course, a trivial result. However, with the additional provision of special symmetry elements present in the unit cell, some important non-trivial results can be
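derived. Let us assume, for example, that an inversion center is present. Then F(h) reduces to
$$F(\mathbf{h}) = \int_{V/2} \rho(\mathbf{r}) \left( e^{2\pi i\,\mathbf{h}\mathbf{r}} + e^{-2\pi i\,\mathbf{h}\mathbf{r}} \right) dV = 2 \int_{V/2} \rho(\mathbf{r}) \cos 2\pi\mathbf{h}\mathbf{r}\, dV.$$
Choosing f and g as above, we get
$$|F(\mathbf{h})|^2 \le 2Z \int_{V/2} \rho(\mathbf{r}) \left( \cos 2\pi\mathbf{h}\mathbf{r} \right)^2 dV,$$
and with cos²α = 1/2 + (1/2) cos 2α, we then get
$$|F(\mathbf{h})|^2 \le 2Z \left( \int_{V/2} (\rho/2)\, dV + \int_{V/2} (\rho/2) \cos 2\pi(2\mathbf{h})\mathbf{r}\, dV \right) = \frac{Z}{2} \left( Z + F(2\mathbf{h}) \right),$$
or, with Z = F(000),
$$|F(\mathbf{h})|^2 \le F(000) \left[ F(000)/2 + F(2\mathbf{h})/2 \right].$$
Defining
$$u(\mathbf{h}) = \frac{F(\mathbf{h})}{F(000)}$$
we finally get
$$|u(\mathbf{h})|^2 \le 1/2 + u(2\mathbf{h})/2. \tag{7.33}$$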
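Inequality (7.33) can be turned directly into a sign test for u(2h). The sketch below simply tries both signs for given magnitudes; the function name and the numerical values (one pair of strong reflections, one pair of weak ones) are chosen for illustration.

```python
def allowed_signs(u_h, u_2h):
    """Signs of u(2h) compatible with |u(h)|^2 <= 1/2 + u(2h)/2, eq. (7.33)."""
    lhs = abs(u_h) ** 2
    return [sign for sign in (+1, -1)
            if lhs <= 0.5 + sign * abs(u_2h) / 2]

strong = allowed_signs(0.6, 0.6)   # large magnitudes: one sign is excluded
weak = allowed_signs(0.2, 0.2)     # small magnitudes: no decision possible
```

For the strong pair only the positive sign survives, while for the weak pair both signs satisfy the inequality, so nothing can be concluded.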
Using u instead of F, (7.32) reduces to |u(h)| ≤ 1. From (7.33), information about a phase or, since we have a centrosymmetric problem, about a sign can be derived from the magnitudes of the u's. Let us suppose that we have two reflections h and 2h whose u-values both have large magnitudes, say 0.6. Then (7.33) is only satisfied for u(2h) = +0.6 (since |u(h)|² = 0.36 and 1/2 + (1/2) 0.6 = 0.8, while 1/2 − (1/2) 0.6 = 0.2), so the sign of F(2h) must be +. However, a definite conclusion from (7.33) can only be drawn for a limited number of reflections. If, for instance, both reflections have small magnitudes of u, say 0.2 or so, no information can be derived, since in this case +u(2h) as well as −u(2h) satisfies the inequality. Although further Harker–Kasper inequalities can be derived for other symmetry elements, their practical use is limited, since reflections with large u-magnitudes are
usually rare and hence an insufficient number of phases can be determined. Therefore Harker–Kasper inequalities are no longer applied in practical structure analysis. Nevertheless, this first recognition of a relationship between magnitudes and phases was an important result, which stimulated the development of more powerful methods of direct phase determination. A large number of investigations on that subject were initiated, of which one of the earlier important results was the Sayre equation, published in 1952 [9]. It is one of the basic formulae of direct methods. A simple derivation of Sayre's equation can be obtained using the simplified structural model introduced in Section 7.3.1, with point-shaped atoms of unique density having no thermal motion, so that there is no overlap between pairs of atoms. For this model ρ(r) = [ρ(r)]² holds. The Fourier transform of ρ(r) is denoted by G(h) (see Section 7.3.1). Since the Fourier transform of a product can be expressed by a convolution operation [Convolution Theorem, Appendix, Section A2, formula (A.29)], we get
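$$\mathcal{F}[\rho(\mathbf{r})] = \mathcal{F}[\rho(\mathbf{r})] * \mathcal{F}[\rho(\mathbf{r})]$$

or, by using G(h) for the Fourier transform,

$$G(\mathbf{h}) = \int_{V^*} G(\mathbf{h}')\, G(\mathbf{h} - \mathbf{h}')\, dV^*.$$

Since it was shown in Section 7.3.1 that the E-values are closely related to that model, and since for single crystals the integral over the reciprocal space can be replaced by a sum, we get

$$E(\mathbf{h}) = T \sum_{\mathbf{h}'} E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}'). \tag{7.34}$$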
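The self-convolution property behind Sayre's equation can be illustrated with a one-dimensional toy model. The sketch below is an assumption-laden illustration, not a derivation from the text: it places unit point atoms on a periodic grid of n points (so that ρ = ρ² holds exactly), computes the discrete transform G(h), and compares it with the cyclic sum over G(h′)G(h − h′). In this discrete setting the constant factor is T = 1/n; the grid size and atom positions are hypothetical.

```python
import cmath


def dft(rho):
    """Discrete structure factors G(h) = sum_x rho(x) * exp(2*pi*i*h*x/n)."""
    n = len(rho)
    return [sum(rho[x] * cmath.exp(2j * cmath.pi * h * x / n) for x in range(n))
            for h in range(n)]


def sayre_rhs(G):
    """(1/n) * sum_h' G(h') * G(h - h'), with indices taken modulo n."""
    n = len(G)
    return [sum(G[k] * G[(h - k) % n] for k in range(n)) / n for h in range(n)]


n = 16
rho = [0] * n
for x in (1, 5, 6, 11):   # hypothetical atom positions on the grid
    rho[x] = 1            # unit point atoms, hence rho == rho**2

G = dft(rho)
# Largest deviation between G(h) and the scaled self-convolution; should be
# at the level of floating-point rounding.
max_dev = max(abs(a - b) for a, b in zip(G, sayre_rhs(G)))
print(max_dev)
```

If the atoms are given unequal weights, ρ = ρ² no longer holds and the two sides differ, which is why the equal-atom model matters for the simple derivation.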
(7.34) is the famous Sayre equation, which is a key formula in the theory of direct methods. The non-negative factor T does not play a role and has no effect on its application. Although we have derived this equation from a very special model, it is valid generally. From Sayre's equation, two formulae to be applied in actual phase determination will be derived, depending on whether the structure is centrosymmetric or acentric, so we consider both cases separately.

(1) Centrosymmetric case

In the centrosymmetric case, the E-values have a sign of + or −. Sayre's equation can then be interpreted as follows. For reflections h with |E(h)| being sufficiently large, it is likely that the sum on the right side of (7.34) will contain more terms E(h′) E(h − h′) having the same sign as E(h) itself than terms of opposite sign. Otherwise equation
(7.34) could not hold. This is especially true for those terms for which |E(h′)| and |E(h − h′)| are large, since they are the major contributors to the sum. Let us consider an example. An E-value of, say, −3.5 may have six summands on the right side of Sayre's equation:

−3.5 = ? 0.5 ? 0.3 ? 2.7 ? 0.2 ? 1.5 ? 0.1

The question marks indicate that we do not yet know the signs on the right side. To add up to −3.5 it is essential that the largest term, |E| = 2.7, has the sign "−"; otherwise the sum on the right side can never give a negative result. It also follows that the second largest term, |E| = 1.5, must have the sign "−", so that a sign distribution as given below leads to the correct solution:

−3.5 = + 0.5 + 0.3 − 2.7 − 0.2 − 1.5 + 0.1

Hence there is a probability of more than 50 % that, for large E-values, s(h) = s(h′) s(h − h′) holds, where s(h) denotes the sign of E(h). This equation remains valid if on the left side h is replaced by −h (since s(h) = s(−h)). Setting −h = h1, h′ = h2 and h − h′ = h3, we finally get s(h1) = s(h2) s(h3) or

$$s(\mathbf{h}_1)\, s(\mathbf{h}_2)\, s(\mathbf{h}_3) = 1 \tag{7.35}$$

if the three reflections h1, h2, h3 satisfy the equation

$$\mathbf{h}_1 + \mathbf{h}_2 + \mathbf{h}_3 = 0. \tag{7.36}$$
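The numerical example with the six summands can be checked by brute force. The sketch below simply enumerates all 2⁶ sign combinations; working in tenths (integers) is an implementation choice made here to avoid floating-point comparison problems.

```python
from itertools import product

target = -35                   # E(h) = -3.5, expressed in tenths
terms = [5, 3, 27, 2, 15, 1]   # magnitudes 0.5, 0.3, 2.7, 0.2, 1.5, 0.1 in tenths

# Keep every sign combination whose signed sum reproduces the left side.
solutions = [signs for signs in product((+1, -1), repeat=len(terms))
             if sum(s * t for s, t in zip(signs, terms)) == target]
print(solutions)
```

For these particular numbers exactly one combination survives, and in it the two largest terms (2.7 and 1.5) both carry the negative sign, in line with the argument above.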
Reflection triplets for which (7.36) holds are said to be related by a Σ2-relation. As we shall see, these Σ2-relations play an important role in all applications of direct methods. From the derivation of (7.35), it was clear that this equation cannot hold exactly, so that instead of writing "=" it would be better to write "≈". Fortunately, the probability that (7.35) is valid for a given structure can be calculated, as was first done by Cochran and Woolfson (1955) [10]. Setting

$$n_j = \frac{f_j}{\sum_{j=1}^{N} f_j} \tag{7.37}$$

with the f_j being the atom form factors of the N atoms of the unit cell, and

$$\sigma_k = \sum_{j=1}^{N} n_j^{\,k} \tag{7.38}$$

the probability, p, that (7.35) holds can be expressed by

$$p = 1/2 + (1/2) \tanh \left[ \frac{\sigma_3}{\sigma_2^{3/2}} \left| E(\mathbf{h}_1)\, E(\mathbf{h}_2)\, E(\mathbf{h}_3) \right| \right]. \tag{7.39}$$
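For an easier interpretation let us transform (7.39) to the case of N atoms of equal type. Then

$$n_j = n = \frac{f}{Nf} = \frac{1}{N}, \qquad \sigma_k = N\,\frac{1}{N^k} = \frac{1}{N^{k-1}}, \qquad \frac{\sigma_3}{\sigma_2^{3/2}} = \frac{1/N^2}{(1/N)^{3/2}} = \frac{1}{\sqrt{N}},$$

and we get

$$p = 1/2 + (1/2) \tanh \left[ \frac{1}{\sqrt{N}} \left| E(\mathbf{h}_1)\, E(\mathbf{h}_2)\, E(\mathbf{h}_3) \right| \right]. \tag{7.40}$$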
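Formula (7.40) is simple enough to evaluate directly. The following minimal sketch assumes, as in the discussion of Table 7.2, that all three reflections have the same magnitude E; the function name `p_sign` is illustrative.

```python
import math


def p_sign(E: float, N: int) -> float:
    """Probability (7.40) for three reflections of equal magnitude E, N equal atoms."""
    return 0.5 + 0.5 * math.tanh(E ** 3 / math.sqrt(N))


# Rows: E-values; columns: N = 40, 80, 120, 160, 200 atoms in the unit cell.
for E in (3.0, 2.5, 2.0, 1.5, 1.0):
    print(E, [round(p_sign(E, N), 2) for N in (40, 80, 120, 160, 200)])
```

The printed values should agree with Table 7.2 to within about one unit in the last digit; small differences can arise from rounding.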
Now we can interpret the sign relationship (7.35) together with (7.39) or (7.40) as follows: If we have three reflections connected by a Σ2-relation (7.36), and if the signs of two of them are known, then the sign of the third reflection can be deduced from (7.35) with a probability given by (7.39). This probability increases with the magnitudes of contributing E’s, but decreases with the number of atoms in the unit cell, thus with the size of the structure, as shown by (7.40). Several questions arise with this interpretation of (7.35). The first is where to obtain the known signs so that the sign relationship can be applied? This question will be discussed in Section 7.3.3. As we shall see, it will be possible to obtain a required starting set of known signs. Another question is, are the probabilities calculated from (7.39) large enough so that sign determinations from (7.35) can be trusted? To get an impression about the magnitude of p, we have calculated p from (7.40) [which does not differ too much from (7.39)] for various values of N and for different magnitudes of E, assuming ⎪E(h1)⎪ = ⎪E(h2)⎪ = ⎪E(h3)⎪ = E. The results, given in Table 7.2, show that for a medium-sized structure with N = 120 atoms in the unit cell, p is greater than 95 % only if all three contributing reflections have E-values greater than 2.5. As follows from Table 7.1a, only about 1 % of all reflections have E-values of that magnitude in centrosymmetric structures. If we have, for instance, a total of 3000 reflections, only 30 will have E-values above 2.5, and out of these, only a small number of ∑2-relationships will result. Although relations of that high probability
Table 7.2: Probabilities for the sign relationship calculated from eq. (7.40) for various parameters E and N.

E      N = 40   N = 80   N = 120   N = 160   N = 200
3.0    1.00     1.00     0.99      0.99      0.98
2.5    0.99     0.97     0.95      0.92      0.90
2.0    0.93     0.86     0.81      0.78      0.76
1.5    0.74     0.68     0.65      0.63      0.62
1.0    0.58     0.55     0.54      0.54      0.53
are infrequently obtained, it is dangerous to accept lower probabilities. Note that even a probability of 90 % means that out of 10 Σ2-relations, one sign determination is wrong. For a structure with hundreds of reflections, hundreds of Σ2-relations must be applied; therefore it is clear that a significant number of sign determinations will be wrong. Furthermore, every sign determined from a Σ2-relationship may be used as input in a further relation, so that a wrong sign determined in the first stages of phase determination will be propagated throughout the whole phase determination process. If, in the later stages of sign determination, a large number of signs is known, it can happen that an unknown reflection is contained in more than one Σ2-relation. Then, of course,

$$s(\mathbf{h}) = \sum_{\mathbf{h}'} s(\mathbf{h}')\, s(\mathbf{h} - \mathbf{h}') \tag{7.41}$$

is a better approximation of Sayre's equation, and in this case the probability for the sign of h to be "+" is given by

$$p_+(\mathbf{h}) = 1/2 + (1/2) \tanh \left[ \frac{\sigma_3}{\sigma_2^{3/2}} \left| E(\mathbf{h}) \right| \sum_{\mathbf{h}'} E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}') \right] \tag{7.42}$$

while the probability p₋(h) of a sign to be "−" is given by

$$p_-(\mathbf{h}) = 1 - p_+(\mathbf{h}) \tag{7.43}$$
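As a rough sketch of how (7.37), (7.38) and (7.42) fit together, the code below computes the factor σ3/σ2^(3/2) from a list of scattering factors and then evaluates p₊(h) from the signed E-values of the contributing pairs. The function names and the use of a constant scattering factor per atom are assumptions made for illustration only.

```python
import math


def sigma_ratio(f):
    """sigma_3 / sigma_2**1.5 from scattering factors f_j (eqs. 7.37 and 7.38)."""
    total = sum(f)
    n = [fj / total for fj in f]
    sigma2 = sum(nj ** 2 for nj in n)
    sigma3 = sum(nj ** 3 for nj in n)
    return sigma3 / sigma2 ** 1.5


def p_plus(E_h_mag, pairs, ratio):
    """Probability (7.42) that s(h) = +1.

    pairs holds the signed values (E(h'), E(h - h')) of all Sigma-2 partners.
    """
    signed_sum = sum(e1 * e2 for e1, e2 in pairs)
    return 0.5 + 0.5 * math.tanh(ratio * E_h_mag * signed_sum)


# Hypothetical cell with 100 equal atoms: the ratio reduces to 1/sqrt(N) = 0.1.
ratio = sigma_ratio([6.0] * 100)
print(ratio)
# Two relations that both point to a positive sign for E(h).
print(p_plus(2.0, [(2.0, 2.0), (-1.5, -1.8)], ratio))
```

Because tanh is an odd function, reversing the sign of the sum turns p₊ into 1 − p₊, which is the content of (7.43).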
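(2) Acentric case

For phase determination in the acentric case, a further formula can be derived from Sayre's equation. Separating (7.34) into its real and imaginary parts, we get

$$|E(\mathbf{h})| \sin\varphi(\mathbf{h}) = T \sum_{\mathbf{h}'} |E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}')| \sin[\varphi(\mathbf{h}') + \varphi(\mathbf{h} - \mathbf{h}')]$$

and

$$|E(\mathbf{h})| \cos\varphi(\mathbf{h}) = T \sum_{\mathbf{h}'} |E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}')| \cos[\varphi(\mathbf{h}') + \varphi(\mathbf{h} - \mathbf{h}')].$$

By division, the factor T cancels and we get

$$\tan\varphi(\mathbf{h}) = \frac{\sum_{\mathbf{h}'} |E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}')| \sin[\varphi(\mathbf{h}') + \varphi(\mathbf{h} - \mathbf{h}')]}{\sum_{\mathbf{h}'} |E(\mathbf{h}')\, E(\mathbf{h} - \mathbf{h}')| \cos[\varphi(\mathbf{h}') + \varphi(\mathbf{h} - \mathbf{h}')]}. \tag{7.44}$$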
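A minimal sketch of evaluating the tangent formula (7.44) for one reflection is given below. The data layout (a list of tuples) and the function name are assumptions. The sketch uses `atan2` on numerator and denominator rather than a plain arctangent, so that the quadrant of ϕ(h) follows from the signs of the sine and cosine sums, which is possible because T is non-negative.

```python
import math


def tangent_phase(terms):
    """Estimate phi(h) in radians from eq. (7.44).

    terms: iterable of (|E(h')|, |E(h - h')|, phi(h'), phi(h - h')).
    """
    num = sum(a * b * math.sin(p1 + p2) for a, b, p1, p2 in terms)
    den = sum(a * b * math.cos(p1 + p2) for a, b, p1, p2 in terms)
    return math.atan2(num, den) % (2 * math.pi)


# With a single term the estimate is simply phi(h') + phi(h - h').
print(tangent_phase([(2.0, 1.5, 0.3, 0.9)]))
# With several terms the result is a weighted compromise of the phase sums.
print(tangent_phase([(2.0, 1.5, 0.3, 0.7), (2.0, 1.5, 1.1, 0.9)]))
```

The single-term case corresponds to the situation in the first stages of phasing, where the result can only be a first approximation.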
This is the well-known tangent formula derived by Hauptman and Karle in 1956 [11]. Just as (7.35) is the key formula for phase determination in the centric case, the tangent formula is the key formula for acentric structures. The tangent formula had an important impact on the application of direct methods. For their contribution to the field Jerome Karle and Herbert Hauptman were awarded the Nobel Prize in chemistry in 1985.

The process of phase determination by application of (7.44) is in principle the same as in the centric case. Σ2-relations have to be found, and if the phases of two reflections are known, the relation can be used as input to (7.44) to determine the third. Note that (7.44) holds exactly only if the summation is taken over all h′. In practical application, however, only a few terms can be used. In the first stages of phasing it may happen that only one term is available. Then the result of (7.44) is only a first approximation of ϕ(h), but after the phase determination has been expanded to a larger set, a re-phasing process may be started using more terms in the tangent sum. By an iterative expansion and re-phasing process, convergence to a set of sufficiently correct phases is likely to be obtained. Again, the major contributors to the sum in (7.44) are those terms having large |E|-values, so that, as in the centric case, one starts with relations which include reflections of large |E|-values.

Summarizing the theoretical results of direct methods, we can state that a practical application is possible if, first, the structure is not too large; second, a set of known phases can be obtained; and third, this set can be used in a sufficient number of Σ2-relations between reflections of large E-values for the determination of additional phases. In view of all the aspects discussed above, it can be questioned whether or not phase determination by direct methods is a guaranteed success.
However, very powerful programs, using subtle procedures, have been developed and are distributed worldwide (see Chapter 6). Experience in recent years has shown that, in spite of the fact that all the formulae are approximations, direct methods are by far the most powerful general method of phase determination presently available. Provided that a set of reasonably accurate intensity data is available and the number of atoms in the asymmetric unit is not larger than a few hundred, the phase problem will be solved almost without exception. From the probability formula (7.40) it follows, however, that direct methods may not be applied successfully to very large, i.e. macromolecular, structures. That is why
in protein crystallography different phase determination methods are generally used, which will briefly be described in Section 7.4.
7.3.3 Origin definition, choice of starting set

It was shown in the last section that the application of direct methods requires a set of reflections with known phases. Fortunately, a set of up to three known phases can be obtained by the definition of the unit cell origin. As was pointed out in the discussion of crystal symmetry, the origin of a unit cell is chosen to be in a special position relative to the symmetry elements. For instance, in a centrosymmetric cell, the origin must always coincide with the inversion center. In every centric cell, inversion centers are not only present at (0, 0, 0) but also in seven other positions, so that a total of eight centers of symmetry can be found. They are located at (see Figure 7.9):

(0, 0, 0), (1/2, 0, 0), (0, 1/2, 0), (0, 0, 1/2),
(1/2, 1/2, 0), (1/2, 0, 1/2), (0, 1/2, 1/2), (1/2, 1/2, 1/2).

Note that these eight centers of symmetry are not identical, since the structural motifs around each of these points are different, in contrast to centers of symmetry related by unit cell translations. Therefore the choice of origin at a center of symmetry can be made in eight different ways. If O and O′ are two origins with ∆r being the origin shift vector (see Figure 7.10), every point P can be expressed by a vector r relative to O and a vector r′ relative to O′ with r = ∆r + r′.
Figure 7.9: Positions of inversion centers in a centric cell.
Figure 7.10: Relationship between the position vectors r and r′ and the origin shift vector ∆r.
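The structure factor expressions are

$$F(\mathbf{h}) = |F|\, e^{i\varphi} = \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j}$$

and

$$F'(\mathbf{h}) = |F'|\, e^{i\varphi'} = \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}\mathbf{r}_j'} = \sum_{j=1}^{N} f_j\, e^{2\pi i\,\mathbf{h}(\mathbf{r}_j - \Delta\mathbf{r})} = e^{-i(2\pi\mathbf{h}\Delta\mathbf{r})}\, F(\mathbf{h}) = |F|\, e^{i(\varphi - \Delta\varphi)}$$

with

$$\Delta\varphi = 2\pi\,\mathbf{h}\,\Delta\mathbf{r}. \tag{7.45}$$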
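The effect of an origin shift can be verified numerically. The sketch below uses a hypothetical three-atom model with constant scattering factors (both are assumptions made purely for illustration), shifts the origin by ∆r, and checks that the amplitude is unchanged while the phase changes by ∆ϕ = 2πh∆r as in (7.45). It also checks that the product of three structure factors whose indices sum to zero is unaffected by the shift.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) = sum_j f_j * exp(2*pi*i*h.r_j); atoms is a list of (f_j, (x, y, z))."""
    return sum(f * cmath.exp(2j * math.pi * sum(hi * xi for hi, xi in zip(h, r)))
               for f, r in atoms)


# Hypothetical model in fractional coordinates.
atoms = [(6.0, (0.10, 0.20, 0.30)), (8.0, (0.40, 0.15, 0.70)), (7.0, (0.85, 0.60, 0.25))]
dr = (0.25, 0.10, 0.40)                                   # origin shift, r' = r - dr
shifted = [(f, tuple(x - d for x, d in zip(r, dr))) for f, r in atoms]

h = (2, -1, 3)
F, Fp = structure_factor(h, atoms), structure_factor(h, shifted)
dphi = 2 * math.pi * sum(hi * di for hi, di in zip(h, dr))
print(abs(F), abs(Fp))                       # amplitudes agree
print(abs(Fp - F * cmath.exp(-1j * dphi)))   # close to zero: phi' = phi - dphi

triplet = [(2, -1, 3), (1, 4, -2), (-3, -3, -1)]          # h1 + h2 + h3 = 0


def triple_product(model):
    """F(h1) * F(h2) * F(h3) for the triplet defined above."""
    prod = 1
    for hk in triplet:
        prod *= structure_factor(hk, model)
    return prod


print(abs(triple_product(atoms) - triple_product(shifted)))  # close to zero
```

The last check works because the three individual phase shifts add up to 2π∆r(h1 + h2 + h3), which vanishes for such a triplet.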
It follows that F and F′ have equal amplitudes, but the choice of a different origin causes a phase shift ∆ϕ, given by (7.45). This result is important and requires interpretation. An immediate consequence is that no phase determination is possible unless the origin is specified; otherwise the phases are ambiguous. In this connection, we have to ask why the tangent formula (7.44) or the sign relationship (7.35) can be valid, since they propose phase values with no provisions for defining an origin. In fact, this is only possible because the reflections involved in these two formulae satisfy Σ2-relations and, moreover, it can be shown that the phase sum ϕ1 + ϕ2 + ϕ3 of a reflection triplet involved in a Σ2-relation is a so-called structure invariant. Phases, or combinations of phases, are said to be structure invariants if they do not change with origin transformations. From the phase shift calculated above, it can easily be seen that the phase sum ϕ(h1) + ϕ(h2) + ϕ(h3) is invariant if the three reflections satisfy h1 + h2 + h3 = 0. If F(h) and F′(h) are related to O and O′, we get (ϕi = ϕ(hi))

$$F(\mathbf{h}_1)\, F(\mathbf{h}_2)\, F(\mathbf{h}_3) = |F(\mathbf{h}_1)|\, |F(\mathbf{h}_2)|\, |F(\mathbf{h}_3)|\; e^{i(\varphi_1 + \varphi_2 + \varphi_3)}$$

and

$$F'(\mathbf{h}_1)\, F'(\mathbf{h}_2)\, F'(\mathbf{h}_3) = |F'(\mathbf{h}_1)|\, |F'(\mathbf{h}_2)|\, |F'(\mathbf{h}_3)|\; e^{i(\varphi_1' + \varphi_2' + \varphi_3')} = |F(\mathbf{h}_1)|\, |F(\mathbf{h}_2)|\, |F(\mathbf{h}_3)|\; e^{i[\varphi_1 + \varphi_2 + \varphi_3 - 2\pi\Delta\mathbf{r}(\mathbf{h}_1 + \mathbf{h}_2 + \mathbf{h}_3)]}.$$

Since h1 + h2 + h3 = 0, we obtain

$$\varphi_1' + \varphi_2' + \varphi_3' = \varphi_1 + \varphi_2 + \varphi_3,$$

hence ϕ1 + ϕ2 + ϕ3 is a structure invariant. This property is the reason why phase determination is possible from the sign relationship (7.35) and the tangent formula (7.44), since these two formulae define the phases of the structure invariants, which are independent of the choice of origin.

In the course of further developing direct methods, additional invariants such as quartets [12], quintets or higher invariants [13] were introduced. In a quartet invariant four reflections are involved:

$$Q = \varphi(\mathbf{h}_1) + \varphi(\mathbf{h}_2) + \varphi(\mathbf{h}_3) + \varphi(\mathbf{h}_4) \quad \text{with} \quad \mathbf{h}_1 + \mathbf{h}_2 + \mathbf{h}_3 + \mathbf{h}_4 = 0.$$

These higher invariants also carry phase information and are used in some of the modern direct methods programs.

To understand how origin definition can be used to get known phases, let us study the problem for the space group $P\bar{1}$. This space group has the eight inversion centers shown in Figure 7.9, each of which may be chosen as the origin. Since only π or 2π are possible phase shifts (we have a centric cell), it is of interest only whether the scalar product h∆r is an odd or an even multiple of 1/2. So for phase-shift considerations, the actual values of the reflection indices are of no consequence. It is only necessary to know whether they are odd or even. Therefore, all reflections can be put into eight categories, denoted by eee, oee, eoe, eeo, ooe, oeo, eoo, ooo (e = even, o = odd). These categories are named parity groups. All reflections belonging to the same category have the same properties with respect to a phase shift. Now we can calculate the phase shift for each category if the origin is transferred from (0, 0, 0) to one of the other possible positions. The results are given in Table 7.3.
A phase shift 0 means that the sign of a reflection is not changed; a shift π indicates a change to the opposite sign. The reflection 1 1 1, for instance, belonging to the parity group ooo, will change its sign if the origin is shifted by 1/2 along any one of the axial directions or by (1/2, 1/2, 1/2). For the other origin shifts, its sign will remain unchanged. On the other hand, a reflection of parity group eee is completely determined by the structure and does not depend on any origin shift, as can be seen from the first column, which shows that all shifts are zero. This is the only parity group having this property.

Table 7.3: Phase shifts in $P\bar{1}$.

No.   Origin shift     eee   oee   eoe   eeo   ooe   oeo   eoo   ooo
1     0, 0, 0          0     0     0     0     0     0     0     0
2     1/2, 0, 0        0     π     0     0     π     π     0     π
3     0, 1/2, 0        0     0     π     0     π     0     π     π
4     0, 0, 1/2        0     0     0     π     0     π     π     π
5     1/2, 1/2, 0      0     π     π     0     0     π     π     0
6     1/2, 0, 1/2      0     π     0     π     π     0     π     0
7     0, 1/2, 1/2      0     0     π     π     π     π     0     0
8     1/2, 1/2, 1/2    0     π     π     π     0     0     0     π

The results listed in Table 7.3 can be used for origin definition as follows. For a given structure, the reflections have a fixed sign distribution with respect to the structure itself and a special origin. Although no sign is known, we can fix the sign
of a reflection not belonging to the eee parity group to a value of our choice (see Figure 7.11). This can be, for example, a reflection of the parity group ooo for which we define a sign value, say a (a = +1 or −1). The sign will remain a if we transform the structure by one of the origin shifts 1, 5, 6, or 7 (Table 7.3). Let b be the sign of a second reflection of, say, the eoo parity group relative to the origin 1. Since we do not know b, the sign of this second reflection can be chosen either as +b or −b. If we choose +b we can allow an origin shift 1 or 7 [see Figure 7.11(a) and (b)]. If −b was chosen instead of b, an origin shift 5 or 6 has to be applied. Now let us proceed with a third reflection and assume it to be taken from the parity group oeo [see Figure 7.11(a)]. For the initial sign c, we can choose +c or −c. If we are in the path of origin shifts 1 or 7 (+b was chosen), +c would make no change of origin necessary, which means we stay at origin 1. If we choose −c, we must change to origin 7. In the path of origin shifts 5 or 6 (−b was chosen), a choice of +c requires the origin 6, whereas if −c was chosen, origin 5 must be used. Therefore, for every sign combination of the second and third reflection relative to the first, there is only one origin which is consistent with the given sign combination. However, the sign of a fourth reflection can no longer be chosen arbitrarily. Since it can have two possible sign values and the origin shift is already uniquely fixed by three reflections, it cannot be ensured that the origin shift necessary for the fourth reflection is identical to that previously fixed.

Figure 7.11: Reflection categories and origin shifts: (a) allowed combination; (b) forbidden combination.

Before discussing the final interpretations of the example of Figure 7.11(a), let us deal with the example of Figure 7.11(b). It differs from that of Figure 7.11(a) only by the choice of the third reflection, which is taken from the parity group oee instead of oeo. If the third reflection has the original sign c and if we are in the path of origin shifts 1 or 7 (+b was selected), +c would permit these origin shifts but −c would not (see Table 7.3), since neither 1 nor 7 could change the sign. In path 5 or 6, we have a similar situation. In this case, neither 5 nor 6 is consistent with +c, so that for neither path is it possible to realize both choices of c by means of origin shifts.

It follows that each choice of sign of three reflections of the parity groups ooo, eoo and oeo can be realized by a uniquely determined origin shift, so that this definition of signs can be interpreted as an implicit definition of one of the possible origins. In the case of the parity groups ooo, eoo and oee, the origin is either determined ambiguously (+c in path 1, 7 or −c in path 5, 6) or its definition is impossible (−c in path 1, 7 or +c in path 5, 6). Therefore, this combination of parity groups cannot be used for fixing the origin.

The procedure described above can be expressed more simply by the use of the so-called parity vectors p. A vector p is introduced with components 0 or 1, with 0 representing an even parity and 1 an odd parity. Eight parity vectors exist, one for each of the reflection categories given in Table 7.3, and to each reflection a corresponding parity vector can be assigned.
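Before turning to examples of parity vectors, the bookkeeping of origin shifts described above can be checked by brute force. The following sketch (all names are illustrative assumptions) encodes the eight origins of Table 7.3 in units of 1/2, derives for each parity group whether a shift keeps or reverses the sign, and lists the origins compatible with a requested combination of sign changes.

```python
from itertools import product

# Origins 1..8 of Table 7.3, components in units of 1/2.
ORIGINS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
           (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]


def parity(group: str):
    """Parity vector of a group such as 'ooe': 1 for odd, 0 for even."""
    return tuple(1 if c == "o" else 0 for c in group)


def sign_change(group, origin):
    """+1 if the sign is kept (shift 0), -1 if it is reversed (shift pi)."""
    dot = sum(p * o for p, o in zip(parity(group), origin))
    return -1 if dot % 2 else +1


def origins_for(groups, signs):
    """Origin numbers (1..8) that produce the requested sign changes for all groups."""
    return [k + 1 for k, o in enumerate(ORIGINS)
            if all(sign_change(g, o) == s for g, s in zip(groups, signs))]


# Every sign combination is listed with the origins that realize it.
for groups in (("ooo", "eoo", "oeo"), ("ooo", "eoo", "oee")):
    for signs in product((+1, -1), repeat=3):
        print(groups, signs, origins_for(groups, signs))
```

For the first set of parity groups every sign combination is matched by exactly one origin, whereas for the second set some combinations are matched by two origins and others by none, which mirrors the two cases of Figure 7.11.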
Let us have a look at two examples. The reflection

$$\mathbf{h} = \begin{pmatrix} 4 \\ 2 \\ 0 \end{pmatrix},$$

parity group eee, has the corresponding parity vector

$$\mathbf{p} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

The reflection

$$\mathbf{h} = \begin{pmatrix} 5 \\ -3 \\ 6 \end{pmatrix},$$

parity group ooe, has the corresponding parity vector

$$\mathbf{p} = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.$$

Then it can be decided for three reflections whether or not they are suitable for the definition of the origin by the following rule: Three reflections h1, h2, h3 can be used for the choice of the origin if and only if their corresponding parity vectors p1, p2, p3 are linearly independent, or if the determinant consisting of the pi (i = 1, …, 3) is equal to ±1.

Let us consider, for example, three reflections of the parity groups used in Figure 7.11(b), say h1 = (−7 5 11), h2 = (2 −3 1), h3 = (1 0 2). Then we get

$$D = \begin{vmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{vmatrix} = 0 \ne \pm 1,$$

indicating that this choice is invalid. Having, on the other hand, three reflections of the parity groups used in Figure 7.11(a), say h1 = (−7 5 11), h2 = (2 −3 1), h3 = (1 0 3), we get (only the third reflection differs from the example above)

$$D = \begin{vmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{vmatrix} = 1.$$

The second reflection triplet can be taken for the choice of origin, but not the first.

Since the only property of $P\bar{1}$ used in this discussion was the choice of the origin at the eight positions shown in Figure 7.9, it follows that this rule holds for every space group which permits these eight points to be defined as origins. This condition holds for a large number of space groups, that is, all primitive centric space groups in the triclinic, monoclinic and orthorhombic systems, and all primitive acentric space groups of the crystal class 222. For space groups not included in the set described above, a generalization of the origin fixing rule has to be given. We shall not do this here, because all current direct methods computer programs include an algorithm for a valid selection of the number and type of origin-fixing reflections. This was first introduced in the pioneering work of the MULTAN authors in their subprogram CONVERGE (see Chapter 6) [14].

In addition to the problem of origin fixing, a further aspect has to be discussed in the case of acentric space groups. In this case, the chirality of the structure is unknown, since it follows from Friedel's law that two structures of different chirality are not distinguishable from the reflection intensities. However, the phases of the left-handed (L) and right-handed (R) structures differ in their signs (see Section 4.2.3):

$$\varphi_R(\mathbf{h}) = -\varphi_L(\mathbf{h}). \tag{4.31}$$
It follows that in spite of fixing the origin, two phase sets can exist, one corresponding to the left-handed structure and the other to the right-handed structure. At the stage of phase determination, a decision in favor of one enantiomorphic form has to be made. This can be done by restricting the phase of a reflection which is significantly different from 0 or π to one-half of the permissible range, say 0 to π. This task is also handled by modern software, so that the user need not get involved in this problem. We note that the above-mentioned phase restriction selects one enantiomorphic form arbitrarily, because it is not known beforehand which phase range belongs to which form. It follows that the structure obtained from a successful phase determination may or may not be the correct enantiomer.

The number of reflections used for origin (and possibly enantiomorph) fixing is generally too small for a successful structure determination. Therefore, it is necessary to include additional reflections in the starting set, which are denoted as variables, because no fixed phase values can be assigned. For centric structures, each variable reflection is assigned the sign + or − in turn, and a complete sign determination is calculated for each choice. Since one of the two signs must be correct, one of the two sign determinations should lead to a correct solution. If n variables are used, 2^n trials have to be calculated. The problem becomes more difficult in acentric phase determination. Since the phases are unrestricted and can vary over 0 to 2π, it would, in principle, be necessary to use a large number of starting values taken over the whole phase range. Experience has shown that it is generally sufficient to use only four starting values for a general reflection: π/4, 3π/4, 5π/4 and 7π/4. This would lead to four trials for one general reflection and, if m reflections are necessary, the number of trials would be 4^m and could therefore increase rapidly with an increasing number of variable reflections.
The number of trials can, however, be reduced drastically if the so-called magic integer method is applied, first introduced by White and Woolfson (1975) [15] and later detailed by Main (1977, 1980) [16]. Let n phases ϕi (i = 1, …, n) be given, which are generally expressed either in fractions of 2π or in degrees (0 ≤ ϕi < 360°), but are now considered in cycles, 0 ≤ ϕi < 1. Then a single factor x, also in the range 0 to 1, can be chosen, so that the phases ϕi can be approximated by the n equations

$$\varphi_i = m_i\, x \;[\mathrm{mod}(1)], \qquad i = 1, \ldots, n \tag{7.46}$$

(mod(1) means that the phase values mi x obtained on the right side of (7.46) should always be reduced to the interval between 0 and 1. For example, for mi = 3 and x = 0.5 the product 1.5 is reduced to 0.5.) The non-negative integers mi (i = 1, …, n) are denoted magic integers. Since (7.46) holds only approximately, a phase error ∆ϕi exists. Depending on the choice of the magic integers and an optimal factor x, the errors ∆ϕi can be kept in an acceptable range. For an appropriate choice of a magic integer sequence, the use of Fibonacci series was recommended [17]. If the magic integer formalism is applied to the permutation of the starting set phases, a significantly smaller number of trials is needed than the 4^n trials mentioned above. In a table given by Viterbo [17] the number of quadrant permutations is compared with the number of magic integer trials. If, for example, 5 variable reflections are given, the number of permutations reduces from 4^5 = 1024 to 50 with magic integers.

Summarizing what has to be observed for the choice of a good starting set:

(1) The starting set should consist of up to three reflections for origin definition, possibly one for enantiomorph fixing, and a number of variables. The starting reflections should have E-values as high as possible and should be involved in as many Σ2-relations as possible, so that a large number of further phases of high probability can be generated.

(2) The number of variable reflections should be kept as small as possible; however, magic integers help to keep the number of trials moderate.

As mentioned before, the entire task of selecting an optimal starting set (including number and type of variable reflections) is now taken over by the modern direct methods programs. Only if a phase determination fails may the user try to influence the starting set.
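To make the magic integer representation (7.46) more concrete, the sketch below generates the phases mi·x mod 1 for a given x and scans x over a grid to find the value that best approximates a set of target phases. The magic integer sequence (3, 5, 8), the grid of 1000 steps and all function names are assumptions chosen for illustration; they are not taken from any particular program.

```python
def magic_phases(x, magic):
    """Phases in cycles generated by eq. (7.46): phi_i = m_i * x mod 1."""
    return [(m * x) % 1.0 for m in magic]


def cyclic_error(a, b):
    """Smallest distance between two phases given in cycles (period 1)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def best_x(target, magic, steps=1000):
    """Scan x over [0, 1) and return (x, rms_error) of the best approximation."""
    best = None
    for k in range(steps):
        x = k / steps
        generated = magic_phases(x, magic)
        rms = (sum(cyclic_error(p, t) ** 2 for p, t in zip(generated, target))
               / len(target)) ** 0.5
        if best is None or rms < best[1]:
            best = (x, rms)
    return best


magic = (3, 5, 8)                      # illustrative Fibonacci-like sequence
print(magic_phases(0.5, [3]))          # the example from the text: 1.5 reduces to 0.5
print(best_x([0.20, 0.65, 0.90], magic))
```

In a trial calculation each grid value of x would define one starting phase set, so the number of trials is governed by the step width in x rather than by 4^n.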
Since it normally happens that a relatively large number of trials are calculated, the question remains open of which is the correct solution of the phase problem. A number of reliability criteria have been developed denoted as figures of merit (fom), which should indicate whether a solution is correct or not. However, none of them are certain to indicate the correct solution, so that the programs combine some of these figures of merit to a combined figure of merit (cfom), which is then used as a criterion to indicate the most probable solution.
One of the figures of merit in use is the R(E)-value introduced by Karle and Karle (1966) [18]. R(E) is the relative average difference between the "observed" E-values Eo(h) and the "calculated" E-values Ec(h). The observed E-values are those obtained from the experimental data according to (7.25). Calculated E-values are derived from the application of the tangent formula. The numerator and the denominator of the tangent formula can be interpreted as the imaginary and the real part of E(h). Therefore, if we write (7.44) as

$$\tan\varphi(\mathbf{h}) = \frac{E_i(\mathbf{h})}{E_r(\mathbf{h})},$$

we get

$$|E_c(\mathbf{h})|^2 = E_i(\mathbf{h})^2 + E_r(\mathbf{h})^2.$$

Then R(E) is defined by

$$R(E) = \frac{\sum_{\mathbf{h}} \bigl|\, |E_o(\mathbf{h})| - |E_c(\mathbf{h})| \,\bigr|}{\sum_{\mathbf{h}} |E_o(\mathbf{h})|}, \tag{7.47}$$
which is a further type of R-value. The trial having the lowest R(E)-value is the one most likely to lead to the correct solution.

We note here that a number of modifications have been applied to the classical methods. Generalized tangent formulae have been introduced, starting sets consisting of a rather large number of reflections with random phases (RANTAN) [19] were applied, and great progress in solving medium-sized structures was reported by using phase annealing methods [20]. For more details concerning the history of direct methods and modern developments we refer to the review article of Sheldrick (2008) [21].

The two methods of phase determination introduced in this chapter, Patterson and direct methods, dominate the modern methods of crystal structure analysis, especially in small molecule crystallography. More than 98 % of these crystal structures are solved with one of these two methods, and for the large majority of structures direct methods are applied. In some cases Patterson and direct methods are combined. This is realized in computer programs like PATSEE or DIRDIF. Newer versions of SHELXS and SIR also allow Patterson methods to be applied (see Chapter 6).
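Returning to the figure of merit (7.47), a minimal sketch of its evaluation is given below. The data layout (dictionaries keyed by reflection indices) and the optional `scale` parameter, which stands in for the constant factor relating the tangent sums to the E scale, are assumptions made for this illustration.

```python
import math


def r_e(E_obs, tangent_sums, scale=1.0):
    """R(E) of eq. (7.47).

    E_obs:        dict h -> |Eo(h)|
    tangent_sums: dict h -> (Ei, Er), numerator and denominator of (7.44)
    scale:        assumed constant bringing |Ec| onto the scale of |Eo|
    """
    num = den = 0.0
    for h, eo in E_obs.items():
        ei, er = tangent_sums[h]
        ec = scale * math.hypot(ei, er)   # |Ec| = sqrt(Ei^2 + Er^2)
        num += abs(eo - ec)
        den += eo
    return num / den


# Hypothetical values for two reflections.
E_obs = {(1, 0, 0): 2.0, (0, 1, 0): 1.5}
good_trial = {(1, 0, 0): (1.2, 1.6), (0, 1, 0): (0.9, 1.2)}
poor_trial = {(1, 0, 0): (0.0, 1.0), (0, 1, 0): (0.0, 1.5)}
print(r_e(E_obs, good_trial), r_e(E_obs, poor_trial))
```

A trial whose calculated magnitudes reproduce the observed ones gives an R(E) near zero, while a poorer trial gives a larger value.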
7.3.4 Application of direct methods, the examples of SUCROS and C60F18

The procedure of direct methods application follows a schedule which is related to the structure of the corresponding computer programs, as was first introduced with MULTAN (see the flow diagram in Figure 7.12), and is basically similar in current programs, although modern programs offer several additional options.
Figure 7.12: Historical flow diagram of MULTAN [14].
The procedure consists roughly of five steps:

(1) After input of an hkl-file and an instruction file specifying crystal data such as lattice constants, space group and unit cell contents, E-values are calculated (NORMAL routine).

(2) Σ2-relations are generated, using only reflections above a given Emin.

(3) An optimal starting set is chosen by the program; no interference by the user is needed. Depending on the number and type of variable reflections, various trials of phase determination are calculated and the most probable phase set is identified by the best cfom.

(4) An E-map is calculated (a Fourier transform using E-values with the most probable phase set now determined).

(5) A peak search and fragment identification procedure follows, which tries to find a chemically sensible molecular fragment. Its fractional coordinates are written into an export file for further use (generally of .res-file type).

Here are some details of how direct methods were applied to our sample structures SUCROS and C60F18. In both cases the data sets described in Section 5.2.4 were used and we applied SIR [7]. For SUCROS the program chooses the 254 largest E-values (Emin = 1.28) for phasing. 24 trials are calculated, of which trial no. 5 was identified (cfom = 1.0) as the most promising one, yielding the 23 non-H atoms of SUCROS (C12H22O11). The 11 highest maxima were correctly identified by the program as the oxygen atoms, so that the structure was completely and correctly determined.
For C60F18 the program chooses 501 E-values above Emin = 1.88. For this structure a rather large number of 174 trials are calculated, of which trial 9 has cfom = 1.00. The obtained structure showed the expected 78 maxima, of which the 18 highest ones were considered by the peak interpretation routine as fluorine. This assignment was only partially correct, because 6 of them were in fact carbon atoms, while 6 maxima interpreted as carbon were in fact fluorine. This mismatch could easily be recognized from the chemical connectivity and corrected. Hence in this case all 78 atoms were found, but a few were incorrectly identified with respect to their atomic type.

The relative height of the maxima in the peak list is a further strong indication of whether or not the solution is correct. For SUCROS peak no. 23 has a relative height of 668, the next one, no. 24, has a relative height of only 181. For C60F18, peaks nos. 78 and 79 have heights of 370 and 64. These strong differences allow a rather reliable discrimination between correct and incorrect atom positions, the latter also denoted as ghosts. It should be recalled that, except for an hkl-file and unit cell and space group information, no further input from the user was requested; in particular, the choice of Emin, the number and type of starting reflections and the number of trials were left completely to the program.
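The discrimination between genuine atoms and ghosts by relative peak height can be sketched in a few lines. The following Python fragment is only an illustration and not the routine used by SIR: the function name `largest_height_drop` and the shortened peak list are hypothetical, and only the two heights 668 and 181 are taken from the SUCROS example quoted above.

```python
def largest_height_drop(heights):
    """Return (n, ratio) for a descending list of peak heights:
    n is the number of peaks before the largest relative drop,
    ratio is the height ratio across that drop."""
    best_n, best_ratio = 0, 0.0
    for i in range(len(heights) - 1):
        ratio = heights[i] / heights[i + 1]
        if ratio > best_ratio:
            best_n, best_ratio = i + 1, ratio
    return best_n, best_ratio


# Hypothetical, shortened peak list; only 668 and 181 are quoted in the text.
peaks = [999, 870, 720, 668, 181, 175, 160]
n_atoms, ratio = largest_height_drop(peaks)
print(n_atoms, round(ratio, 2))
```

Peaks before the drop would be treated as atom candidates and the remaining ones as probable ghosts; the chemical connectivity check described above is still needed to confirm the atom types.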
7.4 Phase determination for macromolecules

The probability formula (7.40), which contains the number of atoms N in the denominator, is an indication that direct methods will probably fail for macromolecular structures such as proteins. If the number of non-H atoms is much larger than, say, 1000, other methods for phase determination are applied, which will briefly be described below. One of the first and frequently applied methods in protein crystallography, where several thousands of atoms must be determined, was the so-called isomorphous replacement, which makes use of Patterson methods in a first step. Two crystal structures are designated as isomorphous if their lattice constants are (almost) equal, if they have the same space group and if the atomic positions differ only in a few atoms. Chemically similar compounds can have isomorphous crystal structures. For example, some salts of tartaric acid are isomorphous; this holds for our sample structure KAMTRA and the corresponding ammonium salt. All atoms of the tartrate anion are on the same positions in both unit cells; the only difference exists for the cation. The same site is occupied once by K+ and once by NH4+ in the two structures. Concerning the diffraction pattern, isomorphous structures agree with respect to the reflection positions, but can show slightly different intensities due to differences in the chemical identity of a few atoms (e.g. the exchange K+ ↔ NH4+). For small molecule structures isomorphous pairs are rarely seen, because small changes in the
chemical identity can result in large structural changes. For macromolecules the situation is different. They crystallize generally with large spaces occupied with solvent molecules, mostly water. Then a good chance exists to replace some solvent positions by heavy atoms without influencing the original lattice structure. If isomorphous heavy atom derivatives are obtained they can be utilized for phase determination as follows: If F(h) is the structure factor of the original macromolecule (in case of a protein the native protein) and if F′(h) is the structure factor of the heavy atom derivative we can separate (because the structures are isomorphous!),

\[ F'(\mathbf{h}) = F(\mathbf{h}) + F_s(\mathbf{h}), \tag{7.48} \]
with Fs(h) being the heavy atom contribution (subscript s = schwer, German for heavy). If we measure intensity data of both structures, we obtain ⎪F⎪ and ⎪F′⎪ (for convenience the argument h is left out). If we can obtain the position of the heavy atom, e.g. from Patterson interpretation, then Fs is known including its phase. The structure factor is generally a complex quantity (which is always true for proteins because of their chirality), so we can try a graphical solution of equation (7.48): F = F′ – Fs in the plane of complex numbers. As Figure 7.13 shows, two circles with radii ⎪F⎪ and ⎪F′⎪ are drawn with their origins O and O′ displaced against each other by the vector –Fs. They intersect at A1 and A2 and the vectors F1 = OA1 and F2 = OA2 solve eq. (7.48). Unfortunately the solution of this single isomorphous replacement procedure (SIR) is not unique, because we obtain two possible phases α1 and α2 [α1 = –18°, α2 = 108° in Figure 7.13(a)] for the structure factor of the native protein.

Figure 7.13: Graphical illustration of phase determination using the method of isomorphous replacement. (a) Determination of two possible phase angles α1 and α2 from one heavy atom derivative (single isomorphous replacement, SIR); (b) unique phase determination in favor of α from two heavy atom derivatives (multiple isomorphous replacement, MIR), illustration kindly provided by Bergmann/Schäfer, Vol. 6, 2nd edn. (2005), de Gruyter, Berlin.

It follows that a second isomorphous heavy atom derivative with amplitudes ⎪F′′⎪ must be measured and its heavy atom contribution FT be determined. Then we have F′′ = F + FT. A third circle is drawn with radius ⎪F′′⎪ and the origin O′′ displaced by –FT against O [Figure 7.13(b)], which can confirm one of the two intersection points A1 or A2. This method depends strongly on the quality of the data collections, hence the accuracy of the values ⎪F⎪, ⎪F′⎪ and ⎪F′′⎪ and the determination of the heavy atom positions. So the three circles will hardly intersect in a common point [A2 in Figure 7.13(b)], but may favor a certain phase interval ∆α. In practical work it can happen that more than two heavy atom derivatives are necessary to obtain reliable phases. The method is then called multiple isomorphous replacement, MIR.

One drawback of the multiple isomorphous replacement method is the need to prepare and to measure at least two, sometimes even more heavy atom derivatives. The procedure can be reduced to the use of one heavy atom derivative, if anomalous dispersion is taken into account. In Section 4.2.3 it was shown that in the presence of anomalous dispersion the real scattering factor f has to be replaced by a complex quantity fA:

\[ f_A = f + \Delta f' + i\,\Delta f'' \tag{4.28a} \]
with i being the imaginary unit. The following three data sets are needed:
(1) the native protein, structure factor F(h);
(2) the heavy atom derivative, measured at a wavelength λ1 where the structure is not affected by anomalous dispersion, structure factor analogous to eq. (7.48):

\[ F_{s1}(\mathbf{h}) = F(\mathbf{h}) + F_s(\mathbf{h}); \tag{7.49} \]

(3) the same heavy atom derivative, measured at a wavelength λ2 where anomalous dispersion has to be taken into account, structure factor

\[ F_{s2}(\mathbf{h}) = F_{s1}(\mathbf{h}) + \Delta f_{\mathrm{anom}}. \tag{7.50} \]
For convenience, let the structure consist of N atoms with only the Nth one affected by anomalous dispersion. Making use of (4.28a) and some non-trivial arithmetic with complex numbers, the structure factors Fs2(h) and Fs2(−h) read

\[ F_{s2}(\mathbf{h}) = F_{s1}(\mathbf{h}) + \Delta f_{\mathrm{Re}}(\mathbf{h}) + \Delta f_{\mathrm{Im}}(\mathbf{h}) \tag{7.51a} \]

\[ F_{s2}(-\mathbf{h}) = F_{s1}(\mathbf{h})^{*} + \Delta f_{\mathrm{Re}}(\mathbf{h})^{*} - \Delta f_{\mathrm{Im}}(\mathbf{h})^{*} \tag{7.51b} \]

with ∆fRe(h) and ∆fIm(h) being contributions of the anomalous scatterer. The asterisk “*” indicates the complex conjugate of a complex number.
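Equation (7.51a) can be used to approach the phase of Fs1(h). We get immediately

\[ F_{s1}(\mathbf{h}) = F_{s2}(\mathbf{h}) - \bigl(\Delta f_{\mathrm{Re}}(\mathbf{h}) + \Delta f_{\mathrm{Im}}(\mathbf{h})\bigr). \tag{7.51c} \]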
Once we have obtained the heavy atom position from Patterson interpretation we know ∆fRe(h) and ∆fIm(h) including phases and we can try a graphical solution of (7.51c) in the plane of complex numbers, similarly as was done with the isomorphous replacement procedure. Two circles with radii ⎪Fs1(h)⎪ and ⎪Fs2(h)⎪ are drawn with their origins O1 and O2 displaced against each other by the vector −(∆fRe(h) + ∆fIm(h)), see Figure 7.14(a). The intersections at A1 and A2 solve eq. (7.51c) in favor of Fs1(h); however, as in the case of SIR, we have two solutions. Making use of the Friedel-related reflection, we obtain from eq. (7.51b) Fs2(−h)* = Fs1(h) + ∆fRe(h) − ∆fIm(h), or

\[ F_{s1}(\mathbf{h}) = F_{s2}(-\mathbf{h})^{*} - \bigl(\Delta f_{\mathrm{Re}}(\mathbf{h}) - \Delta f_{\mathrm{Im}}(\mathbf{h})\bigr). \tag{7.51d} \]
We can now draw a third circle of radius ⎪Fs2(−h)⎪ around an origin O2′ being displaced by the vector −(∆fRe(h) − ∆fIm(h)) against O1, see Figure 7.14(b). One of the intersections of this circle with the first one confirms solution A1, so that the phase of Fs1(h) is obtained from the vector from O1 to A1. With the phase of Fs1(h) known, we can use eq. (7.49) to get the phase of the native protein, provided that the heavy atom position has been determined before.
Figure 7.14: Graphical solution of eq. (7.51c): (a) two solutions at A1 and A2 from the intersection of two circles of radii ⎪Fs1(h)⎪ and ⎪Fs2(h)⎪ (solid representation); (b) unique solution in favor of A1 with a third circle of radius ⎪Fs2(−h)⎪ (dashed representation).
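The circle constructions of Figures 7.13 and 7.14 all have the same form: a complex vector of known length is sought whose sum with a known complex vector has a second known length. A minimal numerical sketch of this construction in Python follows. The function names `candidate_vectors` and `common_candidate` and all numbers are hypothetical and serve only as illustration; error-free amplitudes are assumed, whereas with real data the circles will not meet exactly in one point.

```python
import cmath
import math


def candidate_vectors(abs_unknown, abs_sum, known):
    """Return the two complex vectors F with |F| = abs_unknown and
    |F + known| = abs_sum (the two circle intersections).
    Assumes abs_unknown > 0 and known != 0."""
    abs_known = abs(known)
    # Law of cosines: |F + D|^2 = |F|^2 + |D|^2 + 2|F||D| cos(alpha_F - alpha_D)
    cos_delta = (abs_sum**2 - abs_unknown**2 - abs_known**2) / (
        2.0 * abs_unknown * abs_known
    )
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding errors
    delta = math.acos(cos_delta)
    alpha_known = cmath.phase(known)
    return [
        cmath.rect(abs_unknown, alpha_known + delta),
        cmath.rect(abs_unknown, alpha_known - delta),
    ]


def common_candidate(first, second):
    """Pick the pair of candidates (one from each list) that agree best
    and return their mean."""
    a, b = min(
        ((a, b) for a in first for b in second),
        key=lambda pair: abs(pair[0] - pair[1]),
    )
    return (a + b) / 2


# Hypothetical example: native F and two heavy atom contributions Fs, Ft
F_native = cmath.rect(10.0, math.radians(40.0))
Fs = cmath.rect(4.0, math.radians(110.0))
Ft = cmath.rect(5.0, math.radians(-30.0))

sir = candidate_vectors(abs(F_native), abs(F_native + Fs), Fs)  # two solutions
second = candidate_vectors(abs(F_native), abs(F_native + Ft), Ft)
solution = common_candidate(sir, second)  # ambiguity removed

print([round(math.degrees(cmath.phase(c)), 1) for c in sir])
print(round(math.degrees(cmath.phase(solution)), 1))
```

With `known` set to Fs the function corresponds to eq. (7.48); with `known` set to ∆fRe + ∆fIm or ∆fRe − ∆fIm it corresponds to eqs. (7.51c) and (7.51d), because ⎪Fs2(−h)*⎪ = ⎪Fs2(−h)⎪.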
The procedure described above, being designated single isomorphous replacement using anomalous scattering (SIRAS), has the advantage that in contrast to MIR only one isomorphous heavy atom derivative is needed; however, this comes at the expense of data collections at two wavelengths.

The use of anomalous dispersion has been expanded to so-called multiwavelength anomalous dispersion phasing methods (MAD). In contrast to the methods discussed above, MAD has the advantage that no isomorphous derivatives have to be sought because crystals of only one sample are needed. Generally three data sets are measured at various wavelengths close to the absorption edge, where the anomalous signal is amplified. For MAD phasing it is essential that the wavelength can be tuned, so that the wide wavelength spectrum at synchrotrons can be exploited [22]. In the case of metallo-proteins the experiment can be performed around an absorption edge of the metal. In the absence of a strong anomalous scatterer within the range of synchrotron beamlines, the anomalous signal can be enhanced by substituting the sulfur-containing residue methionine by seleno-methionine [23]. In several cases the anomalous signal of one data set measured close to the S- or Se-absorption edge is sufficient for phasing, known as single wavelength anomalous dispersion (SAD or S-SAD, Se-SAD) [24, 25, 26]. The pioneering SAD study was carried out by Hendrickson and Teeter (1981) with conventional CuKα radiation at home laboratory conditions [24]. However, since all of these methods are based on a weak anomalous signal, data collections at synchrotrons are preferred, all the more so since more and more protein data sets are measured at synchrotron protein beamlines.

Since a large and still increasing number of protein structures have been solved by now, the solution of the phase problem can be attempted by a method known as molecular replacement (MR), which is gaining increasing importance.
If a structure has to be solved which has a very similar sequence to an already known structure, and if the unit cells are approximately alike, it is assumed that the unknown structure arranges in the crystal comparably to the known structure. That part of the known structure which is expected to agree with the unknown one is entered as a model into the phasing of the structure to be solved, and it is attempted to determine the missing part via difference syntheses. If necessary, rotation and translation functions can be applied to place the model in the correct position, which can be recognized by a comparison of the computed model Patterson function with that of the protein to be solved. If the molecular replacement method is successfully applied, it has the great advantage that only the data set of the native protein (hence one data set!) is needed, which reduces the experimental effort drastically compared to most of the other methods described above. Given the large number of protein structures deposited in the protein database (see Figure 1.1 in Section 1.1), there is a good chance of finding an adequate model structure, so that this method is now frequently applied in protein crystallography. In a survey of 2004 by Jiang and Sweet molecular replacement was already mentioned as the dominant method for phasing of macromolecules as it was applied to more than 50 % of the reported structures [27]. We refer also to a
review article on the different methods of solving the phase problem in macromolecular crystallography published by Taylor (2003) [28]. In this chapter we have discussed structure solution methods for macromolecules, consisting of several thousand atoms, while the methods described in the previous chapter apply to structures of, say, some hundred atoms. For a certain amount of time there was indeed a gap in the structure determination of medium-sized structures of more than 500 to 1000 atoms. This gap has been closed by so-called dual space methods, introduced by Miller et al. (shake and bake) [29] and later by Sheldrick (half baked) [21]. Both methods differ from the classical direct methods, which operate only in reciprocal space, in that calculations are carried out alternately in reciprocal and direct space (therefore dual space methods). In a first step a test structure is generated, a random structure in Miller’s “shake and bake”, and expectedly a more reliable one by attempts of Patterson interpretation in Sheldrick’s “half baked”. By Fourier transform, structure factors and hence starting phases are provided in reciprocal space. By application of generalized direct methods improved phases are expected to be obtained (shake step). An inverse Fourier transform into direct space should allow the interpretation of an improved structural model (bake step). This procedure usually has to go through a large number of shake and bake cycles until it can be recognized whether it converges to the correct structure. If no convergence is achieved, a further test structure has to be chosen and the process repeated. In practice it can happen that several thousand trials have to be run [30]. An implementation of Sheldrick’s procedure is in his program SHELXD [31], which is now widely in use. Already more than ten years ago, his group used SHELXD to solve the triclinic structure of HEW-lysozyme and exceeded the 1000-atom limit for the first time. 
Most of the phasing procedures described above need the collection of more than one data set, whereas one data set is already very large with respect to the huge unit cells of macromolecules. It follows that the effort in phasing macromolecular structures is extremely high. We note, however, that protein crystals generally do not diffract to high angles in 2θ. This reduces not only the resolution dmax (as introduced in Section 2.2)

\[ d_{\max} = \frac{\lambda}{2 \sin\theta_{\max}} \quad [\text{\AA}] \tag{2.12} \]

but reduces at the same time the number of reflections to collect drastically. If, for example, for CuKα radiation 2θmax = 45° is chosen, the data set reduces to 5 % (!) of the limiting sphere at the expense of a resolution of only 2 Å. The first structure determination of hemoglobin described in the literature was even carried out at a resolution of 5.5 Å [32]. Also today only few protein structures at atomic resolution are known. A prominent example is crambin, which was determined at 0.48 Å and was reported to set the record in crystallographic resolution for biological macromolecules [33].
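The two numbers quoted for 2θmax = 45° can be checked with a short calculation. The sketch below assumes a mean CuKα wavelength of 1.5418 Å and estimates the fraction of the limiting sphere from the ratio of the reciprocal-space volumes, which scales with sin³θmax; the function names are chosen here for illustration only.

```python
import math

CU_K_ALPHA = 1.5418  # mean CuK-alpha wavelength in Angstrom (assumed value)


def resolution(wavelength, two_theta_max_deg):
    """Resolution d = lambda / (2 sin(theta_max)), eq. (2.12), in Angstrom."""
    theta_max = math.radians(two_theta_max_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta_max))


def limiting_sphere_fraction(two_theta_max_deg):
    """Fraction of the limiting sphere (theta = 90 deg) that is covered.
    The reciprocal-space radius is proportional to sin(theta), the
    volume, and hence the number of reflections, to its cube."""
    theta_max = math.radians(two_theta_max_deg / 2.0)
    return math.sin(theta_max) ** 3


d = resolution(CU_K_ALPHA, 45.0)
fraction = limiting_sphere_fraction(45.0)
print(f"d = {d:.2f} A, fraction of limiting sphere = {100 * fraction:.1f} %")
```

The calculation gives a resolution close to 2 Å and a fraction of between 5 and 6 % of the limiting sphere, in line with the values given in the text.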
8 Refinements

8.1 Theoretical aspects

8.1.1 Model versus experiment, R-value
All methods described above for solving the crystallographic phase problem were based on more or less severe approximations. It follows that even a successful structure determination will give only a preliminary or even incomplete model, which needs to be improved. Hence we shall make use of a refinement procedure which will be introduced below. First, however, we define a criterion which can be used to examine the accuracy of the present state of structure determination, which means we wish to compare the obtained model with the experiment. This is done by calculation of a further type of residual or agreement factor, designated R-value (see also Section 7.1). The present type of R-value is the most prominent and most frequently used one. It is the relative average difference between the calculated structure amplitudes ⎪Fc⎪ [see eq. (7.14)] and the observed structure amplitudes ⎪Fo⎪ [see eq. (7.16)] given by

\[ R = \frac{(1/r) \sum_{\mathbf{h}} \bigl|\, |F_o(\mathbf{h})| - |F_c(\mathbf{h})| \,\bigr|}{(1/r) \sum_{\mathbf{h}} |F_o(\mathbf{h})|}, \]

where r is the total number of reflections. The factor (1/r) cancels, hence

\[ R = \frac{\sum_{\mathbf{h}} \bigl|\, |F_o(\mathbf{h})| - |F_c(\mathbf{h})| \,\bigr|}{\sum_{\mathbf{h}} |F_o(\mathbf{h})|}. \tag{8.1} \]
Since the ⎪Fo⎪’s are derived directly from the experiment and the ⎪Fc⎪’s are calculated from the structural model, the R-value serves for two purposes. On one hand it can be regarded as an indication of how well the model fits the real structure. If the model is inaccurate or incomplete (or completely wrong), the ⎪Fc⎪’s will strongly differ from the ⎪Fo⎪’s leading to a high R-value. If on the other hand we have a very good model and hence proper ⎪Fc⎪’s, but have only a poor data set, the ⎪Fo⎪’s will differ considerably from the ⎪Fc⎪’s, giving also a high R-value. Only if both model and data set are of proper quality we will obtain a small R-value. Since the R-value is a relative quantity, which is frequently expressed in percent, it does not depend on the size of a given structure. So the R-value provides just one single number which serves as a measure for the quality of a structure determination and allows even a convenient comparison between different structures. We note that we shall introduce slightly modified versions of the R-value given in eq. (8.1) in the next section, which play, however, a similar role to the one introduced here.
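Equation (8.1) translates directly into a few lines of code. The following Python sketch is an illustration only; the function name `r_value` and the four amplitude pairs are hypothetical and not taken from any of the sample structures.

```python
def r_value(f_obs, f_calc):
    """R-value of eq. (8.1): sum of | |Fo| - |Fc| | divided by sum of |Fo|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical structure amplitudes, already on a common scale
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [12.0, 18.0, 33.0, 37.0]

print(f"R = {r_value(f_obs, f_calc):.3f}")  # (2 + 2 + 3 + 3) / 100 = 0.100
```

Because numerator and denominator are sums over the same reflections, the result does not change if both lists are multiplied by a common factor, which is why R does not depend on the size of the structure.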
Since the scaling provides that the sum of the ⎪Fo⎪’s is equal to the sum of ⎪Fc⎪’s, the R-values will always be smaller than 1.0, although the model might be completely wrong. Wilson (1950) has shown that the R-value of a structure which is oriented randomly in the unit cell can be calculated theoretically [1]. The result depends on whether the space group is centric or acentric:

\[ R(\text{centric}) = 2\,(\sqrt{2} - 1) = 0.828 \tag{8.2a} \]

\[ R(\text{acentric}) = 2 - \sqrt{2} = 0.586. \tag{8.2b} \]
So it is clear that an R-value calculated for an actual structure or structural fragment must be significantly smaller than the values given in (8.2) to indicate that the solution is at least partially correct. The first Fc-calculation for KAMTRA, based on the K+ position only and the isotropic temperature factor obtained from the Wilson plot, led to an R-value of 0.434, see Section 8.4.1. This value is significantly lower than those for random structures, indicating that progress towards the correct structure has been made.
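The two limiting values of (8.2) can be reproduced by a simple simulation. The sketch below is not Wilson's derivation; it assumes that the structure factors of both the "observed" and the random model structure follow Wilson statistics, i.e. a real Gaussian distribution in the centric case and a complex Gaussian distribution in the acentric case. The function name, the sample size and the seed are arbitrary choices.

```python
import math
import random


def random_structure_r(centric, n_reflections=200_000, seed=1):
    """Monte Carlo estimate of the R-value between two unrelated structures
    whose structure factors are Gaussian distributed (assumption)."""
    rng = random.Random(seed)

    def amplitude():
        if centric:
            # real structure factor: |F| is the modulus of one Gaussian
            return abs(rng.gauss(0.0, 1.0))
        # complex structure factor: independent real and imaginary parts
        return math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

    numerator = 0.0
    denominator = 0.0
    for _ in range(n_reflections):
        f_obs, f_calc = amplitude(), amplitude()
        numerator += abs(f_obs - f_calc)
        denominator += f_obs
    return numerator / denominator


print(f"centric:  {random_structure_r(True):.3f}  (theory {2 * (math.sqrt(2) - 1):.3f})")
print(f"acentric: {random_structure_r(False):.3f}  (theory {2 - math.sqrt(2):.3f})")
```

Both amplitude sets are drawn from the same distribution, so their sums agree on average, which mimics the scaling condition mentioned above.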
8.1.2 Theory of least-squares refinement

The structural model determined so far can be expressed mathematically by a number of parameters. These are the positional coordinates and the displacement factors of each atom and one or more scale factors. As pointed out in the last section, these parameters can only be regarded as preliminary. How to improve the given model needs some mathematical considerations, which are, however, not very complicated. Let us represent the parameters of a given model M′ by x′1, ..., x′n and note that this parameter set includes all parameters of the model contributing to the Fc’s. So x′j may be a positional, a displacement, or a scaling parameter, or possibly one of some special parameters which shall be discussed later. The problem is to obtain a model M having the parameters

\[ x_j = x'_j + \Delta x_j, \qquad j = 1, \dots, n, \tag{8.3} \]

so that M is a “best” approximation of the experimental data. The problem expressed in terms of F’s reads: Given a set of Fc’s dependent on the parameters x′1, ..., x′n,

\[ F'_c = F_c(x'_1, \dots, x'_n), \]

how do we find improvements ∆xj for the x′j, so that a set of Fc’s derived from xj = x′j + ∆xj (j = 1, ..., n),

\[ F_c = F_c(x_1, \dots, x_n), \]

is a best approximation of the Fo’s? The question immediately arises, which approximation is said to be the best, and how do we get this approximation? This problem is a typical fitting problem and it will be solved by a method which is more than 200 years old, the least-squares method. It is based on the principle expressed first by Gauss and Legendre (∼ 1800), who stated that a set of theoretical values Tk (k = 1, …, r) is the best approximation for a set of observations Lk (k = 1, …, r) if the sum

\[ \sum_{k=1}^{r} (T_k - L_k)^2 \tag{8.4} \]
is a minimum. If the Tk depend on the parameters x1, …, xn, the condition of (8.4) to be minimal can be used to calculate the most favorable parameter set. In the application of that principle to structure analysis, the observations Lk are given by the observed structure amplitudes ⎪Fo(h)⎪, the theoretical values Tk by the calculated structure amplitudes ⎪Fc(h)⎪. Then the principle of least-squares for single crystal analysis is

\[ Q = \sum_{\mathbf{h}} \bigl( |F_o(\mathbf{h})| - |F_c(\mathbf{h})| \bigr)^2 = \text{minimum}. \tag{8.5a} \]
Weighting factors w(h) are frequently given to the various terms in that sum, to take into account the different precision of the ⎪Fo⎪’s. Then we modify (8.5a) to

\[ Q = \sum_{\mathbf{h}} w(\mathbf{h}) \bigl( |F_o(\mathbf{h})| - |F_c(\mathbf{h})| \bigr)^2 = \text{minimum}. \tag{8.5b} \]
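As a minimal illustration of (8.5b), consider the simplest possible refinement in which only one parameter, a scale factor k applied to the ⎪Fc⎪’s, is varied. This one-parameter case is an assumption made here for brevity (the convention of scaling ⎪Fc⎪ rather than ⎪Fo⎪ is likewise assumed); setting dQ/dk = 0 then gives the optimal k in closed form. All names and numbers in the Python sketch are hypothetical.

```python
def weighted_q(f_obs, f_calc, weights, scale=1.0):
    """Q of eq. (8.5b) with the calculated amplitudes multiplied by `scale`."""
    return sum(
        w * (fo - scale * fc) ** 2 for fo, fc, w in zip(f_obs, f_calc, weights)
    )


def best_scale(f_obs, f_calc, weights):
    """Scale k minimising Q(k) = sum w (|Fo| - k |Fc|)^2.
    dQ/dk = 0 gives k = sum(w Fo Fc) / sum(w Fc^2)."""
    numerator = sum(w * fo * fc for fo, fc, w in zip(f_obs, f_calc, weights))
    denominator = sum(w * fc * fc for fc, w in zip(f_calc, weights))
    return numerator / denominator


# Hypothetical amplitudes and weights
f_obs = [10.5, 19.0, 31.0]
f_calc = [5.0, 10.0, 15.0]
weights = [1.0, 1.0, 0.5]

k = best_scale(f_obs, f_calc, weights)
print(f"k = {k:.4f}")
print(f"Q before scaling = {weighted_q(f_obs, f_calc, weights):.2f}")
print(f"Q after scaling  = {weighted_q(f_obs, f_calc, weights, k):.2f}")
```

For parameters on which the ⎪Fc⎪’s depend non-linearly, such as atomic coordinates, no closed form exists, which is why an iterative procedure is needed in general.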
We define a model M with parameters x1, ..., xn to be the best approximation for the observations Fo if the Fc’s calculated from that model,

\[ F_c = F_c(x_1, \dots, x_n), \]

satisfy (8.5). Provided that a first approximation model M′ exists with F′c = Fc(x′1, ..., x′n), we shall develop an algorithm for the calculation of improvements ∆xj (j = 1, …, n) by using (8.5b). Suppose that we have measured r reflections h1, …, hr, so that r ⎪Fo⎪’s, ⎪Fo⎪1 … ⎪Fo⎪r, and r ⎪Fc⎪’s, ⎪Fc⎪1 … ⎪Fc⎪r, are present. Let us denote by vk the weighted difference between the observed structure amplitude and that of the desired model

\[ v_k = \sqrt{w_k}\,\bigl( |F_c|_k(x_1, \dots, x_n) - |F_o|_k \bigr), \qquad k = 1, \dots, r. \]
Then (8.5b) can be expressed as

\[ Q = \sum_{k=1}^{r} v_k^2 = \text{minimum}. \]
For the corrections ∆xj (see 8.3), we assume that ∆xj