English Pages 528 Year 2021
Biophysics for Beginners Second Edition
Helmut Schiessel
Biophysics for Beginners A Journey through the Cell Nucleus Second Edition
Published by Jenny Stanford Publishing Pte. Ltd. Level 34, Centennial Tower 3 Temasek Avenue Singapore 039190 Email: [email protected] Web: www.jennystanford.com British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. Biophysics for Beginners: A Journey through the Cell Nucleus (Second Edition) Copyright © 2022 Jenny Stanford Publishing Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 978-981-4877-80-0 (Hardcover) ISBN 978-1-003-22310-8 (eBook) DOI: 10.1201/9781003223108
Contents
Preface to the First Edition
Preface to the Second Edition
1 Molecular Biology of the Cell
1.1 The Central Dogma of Molecular Biology
1.2 A Journey through the Cell Nucleus
2 Statistical Physics
2.1 The Partition Function
2.2 Applications
2.3 The Entropy
2.4 Particles with Interactions and Phase Transitions
2.5 Biomolecular Condensates
3 Polymer Physics
3.1 Random Walks
3.2 Freely Jointed and Freely Rotating Chains
3.3 The Role of Solvent Quality
3.4 Self-Avoiding Walks
3.5 The Flory Argument
3.6 The Blob Picture
3.7 Polymers in Poor Solvents
3.8 Internal Structure of Polymers
4 DNA
4.1 The Discovery of the DNA Double Helix
4.2 DNA on the Base Pair Level
4.2.1 A Geometrical Approach
4.2.2 A Statistical Physics Approach
4.3 DNA as a Wormlike Chain
4.4 DNA Melting
5 Stochastic Processes
5.1 Introduction
5.2 Markov Processes
5.3 Master Equation
5.4 Fokker–Planck Equation
5.5 Application: Escape over a Barrier
5.6 Application: Dynamic Force Spectroscopy
5.7 Langevin Equation
5.8 Application: Polymer Dynamics
6 RNA and Protein Folding
6.1 RNA Folding
6.2 Protein Folding
7 Electrostatics inside the Cell
7.1 Poisson–Boltzmann Theory
7.2 Electrostatics of Charged Surfaces
7.3 Electrostatics of Cylinders and Spheres
7.4 Debye–Hückel Theory
7.5 Breakdown of Mean Field Theory
8 DNA–Protein Complexes
8.1 Protein Target Search
8.2 RNA Polymerase
8.3 Nucleosome Dynamics
8.3.1 Site Exposure Mechanism
8.3.2 Force-Induced Nucleosome Unwrapping
8.3.3 Nucleosome Sliding
8.4 Chromatin Fibers
8.4.1 Two-Angle Model
8.4.2 Solenoid-Type Models
8.5 Chromatin at Large Scales
8.5.1 From Classical Polymers to Fractal Globules
8.5.2 From Polymer Rings to Loop Extrusion
9 Computational Methods
9.1 Molecular Dynamics Simulations
9.2 Monte Carlo Simulations
Appendix A: Probability Theory
Appendix B: The Distribution of Magnetization and the Central Limit Theorem
Appendix C: Connection between Polymer Statistics and Critical Phenomena
Appendix D: Hamilton’s Principle and the Pendulum
Appendix E: Fourier Series
Appendix F: The Pre-Averaging Approximation
Appendix G: Interaction between Two Equally Charged Plates at Zero Temperature
Appendix H: Geometries of Chromatin Fiber Models
References
Index
Preface to the First Edition
Preface to the first edition: Biophysics or biological physics or statistical physics of biological matter or quantitative biology or computational biology is a large and rapidly growing interdisciplinary field with many names on the border between physics, biology and mathematics. It is not clear where this field begins, where it ends, or where it will ultimately lead. However, it is clear that tremendous progress has been made in this area over the past two decades. This book is the result of various attempts to teach the subject in a variety of settings to audiences from diverse backgrounds at different stages in their studies. It started with courses (usually 4 h long) at winter and summer schools in Denmark, the Netherlands, Belgium, South Korea, South Africa, France and Mexico. The material was expanded in a theoretical biophysics course for Physics Masters students at Leiden University. A script that I made for a course at the Casimir Graduate School between Delft and Leiden University was a short first version of the book. When I decided to write an entire book on the subject, I wanted it to be self-contained (avoiding phrases like “one can show”), especially since I noticed that some students in my classes did not know much about molecular biology while others had no background in statistical physics. In order for the book to reach a certain depth without becoming too thick, some difficult decisions had to be made beforehand. I decided that everything in this book should be related to what I consider the heart of molecular biology of the cell, the central dogma of molecular biology, which states that information flows from DNA to RNA to proteins. By limiting myself to this topic, I only had to deal with three types of molecules, all polymers, all of which are in the nucleus of
eukaryotic cells. Using these molecules as examples, I cover a large number of biophysical topics, most of which are also relevant in many other areas of biophysics. I present examples that make it clear what is special about biophysics compared to other areas of physics. A second decision was to restrict myself to “paper and pencil” theories and show how they can be used to understand experimental observations, but not to discuss computer simulations. I stress here that simulations are just as important as theory and experiments, but for this textbook I have selected topics where a purely theoretical treatment is sufficient—as in most textbooks on more classical subjects. [Note: The second edition of this book covers computer simulations.] The book contains 8 chapters. The first chapters are more introductory, are shorter and contain fewer examples to enable the reader to grasp the structures of the theories. The later chapters bring more and more experimental examples and help the reader to develop the physical intuition necessary to grasp the complex physics behind the systems under consideration. The problems considered are also more recent, and many are still the subject of intense debate in the current literature. I have tested preliminary versions of all the chapters of the book in various courses: Chapters 2 and 7 with physics bachelors students, Chapters 1, 3, 4, 5 and 7 with physics masters students and Chapters 1, 4, 6, 7 and 8 with PhD students and postdocs, most of them physicists, but some with a background in chemistry or biology. I hope that this book will find usefulness in a variety of settings. I would like to acknowledge the people without whom this book would not have been possible. The late Jonathan Widom had been extremely kind in sharing his deep insights into biological matter with me when I had just started in the field. 
Never again have I met someone who was able to combine the biological and physical perspectives so seemingly effortlessly. It was Robijn Bruinsma and Bill Gelbart with whom I took my first steps in this new field. Robijn’s lecture at a winter school in Vancouver opened my eyes to what is special about biophysics; I tried to preserve some of that excitement in the section on protein-target search in the last chapter of the book [now Chapter 8]. Working with Philip Pincus gave me a more intuitive understanding of electrostatics, which I tried to explain
in the chapter on this subject. Discussions from this time with his student Andy Lau also affected this chapter. John Maddocks had kindly and critically read large parts of the manuscript and had helped clear up inconsistencies in my explanations of Euler elasticas. If there are still any left, it is entirely my fault. My former student Igor Kulić is the person whose work has found its way into this book in more places than anyone else. His clear-cut approaches to problems related to DNA and nucleosomes have proven ideal for a textbook. Martin Depken’s work on chromatin fibers and kinetic proofreading during transcription also seemed to fit too well into this book to be left out. Ralf Everaers has greatly influenced my views on chromatin fibers and the organization of chromosomes on large scales. In addition to John Maddocks, many other people helped me with the book. Peter Prinsen read and corrected large parts of it at an early stage. Marc Emanuel and Giovanni Lanzani had been helpful at several points when I was hopelessly stuck in a calculation. Giovanni also helped with some illustrations. I would also like to thank Behrouz Eslami-Mossallam, Jean-Charles Walter and Raoul Schram and many students in my courses for helpful suggestions. My doctoral supervisor Alexander Blumen helped me to appreciate clarity and precision during my PhD work, which I also wanted to achieve here. Some of the work on polymer dynamics with him and Gleb Oshanin found its way into this book. My favorite course during my studies at Freiburg University was the statistical physics course given by Hartmann Römer; I consulted my old notes from this course as the basis for Chapter 2. There are many more people I should thank.
I mention some of them below (and apologize to those I forgot to mention): Ralf Blossey, Reza Ejtehadi, Ion Cosma Fulga, Stephan Grill, Remus Dame, Markus Deserno, Marianne Gouw, Rosalie Driessen, Arman Fathizadeh, Peter Kes, Kurt Kremer, Jörg Langowski, Ralf Metzler, Farshid Mohammad-Rafiee, Daniela Rhodes, Hervé Mohrbach, Laleh Mollazadeh-Beidokhti, John van Noort, Wilma Olson, Fran Ouwerkerk, Jens-Uwe Sommer, Mario Tamashiro, Rochish Thaokar, Harald Totland, Michelle Wang and Kenichi Yoshikawa. Last but not least, I would like to thank Sabina for her infinite patience. H. Schiessel
Preface to the Second Edition
This is an updated and expanded version of my textbook Biophysics for Beginners: A Journey through the Cell Nucleus. I decided to break with a rule in the first edition, namely to limit myself to “paper and pencil” theories. Most of the chapters now contain extensive computational exercises and there is a new chapter on computational methods (Chapter 9). I hope that the exercises will enable a deeper understanding of various (bio)physical systems (including the ideal and real gas, self-avoiding walks, DNA mechanics, RNA and protein folding and nucleosomes) and various methods (including exact enumeration, transfer matrix method, dynamic programming algorithm as well as molecular dynamics and Monte Carlo simulations). New sections reflect recent developments such as biomolecular condensates and loop extrusion in chromosomes. The chapter on polymer physics has been expanded and a new appendix has been added on the connection between polymer statistics and critical phenomena. Finally, the text has been improved throughout. I thank Marco Tompitak for many suggestions on how to improve the book. Martijn Zuiddam developed an analytical approach to describe sequence-dependent DNA mechanics, which was missing in the first edition (new Subsection 4.2.2). Many thanks to him and Ralf Everaers. I thank Michael Wimmer and Lennart de Bruin, with whom I taught computational physics at Leiden University. The interaction with them formed the basis for many changes in the new edition. Several of the new computational exercises were successfully tested in Leiden: Problem 2.6 “Verifying the ideal gas law” in the Bachelor course on Statistical Mechanics taught by Peter Denteneer (thanks to him and Pascal van der Vaart for improving it), Problem 3.4 “Exact enumeration of self-avoiding walks” in my MSc course
on Theoretical Biophysics and Problems 9.1 “Different phases in a system of Argon atoms” and 9.2 “Phase transitions in the two-dimensional Ising model” in my Computational Physics courses. Since the first edition of this textbook, I have had the pleasure of interacting with various other people who influenced my thinking. I would like to mention Alain Arneodo, Benjamin Audit, Gerard Barkema, Rhys Bird, Giovanni Brandani, Enrico Carlon, Jamie Culkin, Koen van Deelen, Nicolas Destainville, Ariel Kaplan, Lennard Kwakernaak, Steven Lankhorst, Joshua Lequieu, Kazuhiro Maeshima, Manoel Manghi, John Marko, Alireza Mashaghi, Alexandre Morozov, Jonas Neipel, Alexey Onufriev, Juan de Pablo, Lois Pollack, Rob Phillips, Akihisa Shioi, Andy Spakowitz, Cedric Vaillant, Tetsuya Yamamoto, Takahiro Sakaue, Shelley Sazer, Bahareh Shakiba, Gijs Vermariën, Michael Wellens, Jesse van Welzenes, Dulcy van der Werff, Joerie Wondergem and Renger Zoonen. H. Schiessel
Chapter 1
Molecular Biology of the Cell
1.1 The Central Dogma of Molecular Biology
An introduction to the fundamentals of the molecular biology of the cell would easily fill this book. Instead, I shall focus on one set of problems in molecular biology that Francis Crick, one of the discoverers of the DNA double helix, has termed the central dogma of molecular biology. This central dogma states that there are three types of crucial biological macromolecules, DNA, RNA and proteins, that “communicate” in such a way that genetic information flows in the single direction from DNA to RNA to proteins. Figure 1.1 specifies the different steps of that information flow, which we will discuss below. The whole genetic information about a cell, its genome, is written down in one or several DNA molecules (DNA stands for DeoxyriboNucleic Acid). When a cell divides, the information must be passed on to its two daughter cells, and therefore the DNA needs to be replicated before the division can take place. To understand how nature’s elegant solution to replication works, we first need to discuss the structure of the DNA chain itself. The genetic text on the DNA chain is made from four letters, the nucleotides: adenine (A), guanine (G), cytosine (C) and thymine (T). These letters are
Figure 1.1 The central dogma of molecular biology: information flows from DNA to RNA to proteins. The DNA molecules contain the complete genetic information of the cell in the form of genes which are often separated by pieces of “junk” DNA. Each gene is a building plan of a protein. Whenever a cell needs a certain protein, a transcript of its gene is made in the form of an RNA copy. This copy is then used as a blueprint to assemble the protein. Also shown are enlarged portions of the three types of macromolecules: DNA and RNA are chemically very similar, except DNA is double stranded whereas RNA is single stranded. Proteins are chemically very different and are made from a sequence of amino acids (aa’s). The physical properties of these aa’s cause the protein to fold into a unique three-dimensional shape.
chemically linked into a one-dimensional chain producing a text like AAGCTTAG, but much, much longer. A DNA molecule in a cell carries this information not only once, but effectively twice, since it occurs in a double-stranded form. The two DNA strands are linked via hydrogen bonds through base pairing such that an A is always paired with a T and a G with a C. So our AAGCTTAG strand will be paired with a TTCGAATC strand (more precisely CTAAGCTT, since each strand has a chemically built-in direction and the two strands run anti-parallel). Therefore, the two strands are not identical, but they carry exactly the same genetic information. To duplicate the DNA,
Figure 1.2 DNA polymerase at work (very schematic).
the double-stranded chain just needs to be unzipped and each strand used as a template to convert it back to a double-stranded molecule using the base pairing rules, see Fig. 1.2. This is accomplished by molecular Xerox copiers called DNA polymerases, which open up the DNA double helix and then take single nucleotide monomers from the surrounding solution and add them whenever they match to the growing complementary strands. In this way, two identical copies are made. What information is stored along the DNA chains? Essentially, they contain the blueprints of all the proteins of the cell. Each blueprint is called a gene. However, DNA is far too precious to be used directly as a template in a protein factory. Instead, when a certain protein is needed, a copy of the corresponding gene is created. This copy is a different chain, a ribonucleic acid (RNA), and it is much shorter than the DNA chain. RNA is chemically very similar to DNA, but with the T’s replaced by U’s (uracil). In contrast to DNA, it always appears as a single-stranded molecule. The creation of a gene copy is called transcription (Fig. 1.3), with the transcribing cellular “monks” being called RNA polymerases. Such a polymerase locally opens the DNA into its two strands and uses one of these strands as a template to create an RNA molecule. Similar to the duplication process, it attaches new nucleotides to the growing RNA chain, always in accordance with the base pairing rules. This creates an
Figure 1.3 A transcribing RNA polymerase at work (schematic).
identical copy of the gene that now acts as a messenger. Therefore, such RNAs are called messenger RNAs (mRNAs). The messenger must next find its way to a protein factory where translation into the protein or polypeptide takes place. What complicates things is that this third class of macromolecules speaks a completely different language, made up of 20 letters, called amino acids (aa’s). The protein factories, the so-called ribosomes, are thus essentially translation offices. A schematic view of a ribosome is provided in Fig. 1.4. Inside a ribosome the protein is polymerized according to the genetic code. Groups of three nucleotides, the codons, are translated from the mRNA blueprint into aa’s (e.g., UAC into the aa tyrosine). This translation is made possible by codon-specific adapters, so-called transfer RNAs (tRNAs). Such a tRNA molecule has on one side an anti-codon that is complementary to the codon (e.g., GAA is the anti-codon to UUC) and carries on the other end the corresponding aa. Once a tRNA is found that can bind to the codon inside the ribosome, the aa is chemically linked to the growing protein chain. The RNA blueprint is then shifted by three bases and the process starts again. Why does a codon consist of three nucleotides? The reason lies in the fact that there are far fewer nucleotides than aa’s. If the codon consisted of only two nucleotides, there would only be $4 \times 4 = 16$ combinations, not enough to encode for the 20 aa’s. A three-letter codon is therefore necessary. But the $4^3 = 64$ possible codons lead
Figure 1.4 A ribosome translating an mRNA into a protein (schematic). The tRNA shown carries the anticodon GAA and the attached amino acid F.
to a certain redundancy, which, however, could be useful, as we argue further below. In Fig. 1.5, the genetic code is displayed. Usually, the genetic code is presented as a two-dimensional table. However, since a codon consists of three bases, we prefer to give a three-dimensional representation, with each axis corresponding to a base. Since the inside of the 4 × 4 × 4-block cannot be seen, the lower part of the figure shows the cube broken into four separate layers. You can now read the genetic code directly from the table. For instance, the codon CCU (which is equivalent to CCT) corresponds to the aa P which stands for proline, sometimes also abbreviated as Pro. Now it is immediately clear by inspection that most of the redundancy goes into the third position of the codon: e.g., codons starting with CC (CCA, CCU, CCG and CCC) always lead to proline. Many more rows of identical aa’s can be seen in the 3-direction. This is not accidental but reflects a feature typical for many of the tRNAs. As described above, these adapters “memorize” this code directly by having on one side an anti-codon and on the other the corresponding aa. Many tRNAs tolerate a mismatch in the third codon position, which means that a tRNA can represent several codons that stand for the same aa. Finally, we mention that there are also three codons that act as stop signs telling a translating ribosome that it has reached the end of the protein-coding sequence. The ribosome then falls off and releases the freshly produced protein.
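The base pairing rules used in replication, transcription and codon recognition can be sketched in a few lines of Python. This is only an illustration of the rules stated in the text, not a model of the polymerases or the ribosome; the function names (`complement`, `reverse_complement`, `transcribe`, `anticodon`) are hypothetical helpers introduced here, and sequences are assumed to be written in the same direction convention as the examples in the text.

```python
# Watson-Crick pairing rules as stated in the text: A-T and G-C in DNA,
# with U taking the place of T in RNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(strand, pairs=DNA_PAIRS):
    """Pair every base with its partner, keeping the order of the bases."""
    return "".join(pairs[base] for base in strand)


def reverse_complement(strand, pairs=DNA_PAIRS):
    """Partner strand read in its own direction (the strands are anti-parallel)."""
    return complement(strand, pairs)[::-1]


def transcribe(template_strand):
    """RNA built on a DNA template strand: base paired to it, with U instead of T."""
    return reverse_complement(template_strand).replace("T", "U")


def anticodon(codon):
    """tRNA anti-codon that pairs with an mRNA codon."""
    return reverse_complement(codon, RNA_PAIRS)


if __name__ == "__main__":
    print(complement("AAGCTTAG"))          # partner bases, position by position
    print(reverse_complement("AAGCTTAG"))  # partner strand in its own direction
    print(transcribe("CTAAGCTT"))          # RNA copy made from the partner strand
    print(anticodon("UUC"))                # anti-codon for the codon UUC
```

With the example strand from the text, `complement` gives TTCGAATC and `reverse_complement` gives CTAAGCTT, while `anticodon("UUC")` gives GAA, the pair quoted above.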
Figure 1.5 The genetic code. Three consecutive bases form a codon that codes for an aa. In this three-dimensional representation each space direction stands for one position within the codon. For instance, GAA represents the aa denoted by E. The different colors indicate the physical properties of the corresponding aa as indicated by the legend. The stop signs correspond to stop codons. The lower part of the figure shows the 4 × 4 × 4-block separated into four layers so that you can “look inside.” Note that the genetic code is degenerate, especially with regard to the base in the third position. In the 3D representation, this means that rows in the 3-direction are often identical, e.g., CCA, CCT, CCG and CCC all stand for the aa P. The standard single-letter abbreviations for the aa’s are given in the legend to the right together with the standard three-letter abbreviation in brackets. The black and white representations of the letters indicate the mechanical properties of the encoding DNA sequences, allowing for mechanical “information” to be written along DNA molecules, see Chapter 4 for more details.
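The counting argument for the codon length can be checked by brute force. The following Python sketch enumerates all codons of a given length over the four RNA letters and finds the shortest length that offers at least 20 combinations; it also lists the four codons beginning with CC, which according to the text all stand for proline. The helper names are hypothetical and the code does not contain the genetic code itself.

```python
from itertools import product

BASES = "ACGU"        # the four RNA letters
N_AMINO_ACIDS = 20    # number of amino acids quoted in the text


def all_codons(length):
    """Every possible word of the given length over the four bases."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


def minimal_codon_length(n_targets, n_letters=len(BASES)):
    """Smallest word length with at least n_targets distinct words."""
    length = 1
    while n_letters ** length < n_targets:
        length += 1
    return length


if __name__ == "__main__":
    for length in (1, 2, 3):
        print(length, len(all_codons(length)))
    print("minimal codon length:", minimal_codon_length(N_AMINO_ACIDS))
    # Codons that differ only in the third position, here the CC family.
    print([codon for codon in all_codons(3) if codon.startswith("CC")])
```

Two letters give only 16 words, so three letters are the minimum, and the 64 resulting codons leave room for the redundancy in the third position that is visible in Figure 1.5.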
Up to now we spoke about a one-dimensional world of information. Somewhere the step to our three-dimensional world needs to take place. This happens via the spontaneous folding of a protein into its unique three-dimensional shape. Since the protein is a one-dimensional chain of aa’s, it is not obvious how this is possible. The mechanism underlying the folding lies in the fact that each aa has certain physical properties, some being polar or hydrophilic (tending to mix with water), some non-polar or hydrophobic (tending to not mix with water, like oil) and some being positively or negatively charged. These physical properties are indicated in Fig. 1.5 by the colors of the little blocks. A protein with its chain of aa’s is thus a chain of objects with different physical properties. An aa that carries a positive charge attracts an aa that carries a negative charge, hydrophilic aa’s try to be in contact with the surrounding water and hydrophobic aa’s try to be shielded from it. The aa chain thus attempts to adopt a shape that accommodates all its aa’s in an optimal environment. Each protein’s aa sequence has evolved so that the molecule folds into a unique three-dimensional structure that enables it to perform its specific task. While the one-dimensional DNAs are more like academic people living in a world of knowledge, the three-dimensional proteins are working class: they are in the real world doing real work. Some proteins catalyze certain chemical reactions, others form the building blocks of a cellular network of stiff fibers, the cytoskeleton. To move our muscles there are motor proteins, and in plants proteins harvest the energy of the sunlight. Also, the two copying machines, the DNA and the RNA polymerases, are made from proteins. At first glance, RNA seems to play only a minor role in this whole game, being merely the messenger between the DNA and the protein factories.
But note that single-stranded RNAs occur in various roles: as mRNAs they carry information just like DNA, as tRNAs they fold into unique three-dimensional shapes just like proteins. The translation machine, the ribosome, is made from several proteins and from so-called ribosomal RNAs (rRNAs) which are also uniquely folded. This double role of RNAs suggests that in an early stage of evolution life consisted just of an RNA world with self-replicating RNA molecules. At a later stage it became advantageous to divide information storage and catalytic activities between specialists, the
DNAs and the proteins. In this way, the role of the universal RNA molecules was reduced to mainly act as the interface between the DNA and protein worlds, even though modern molecular biology is now somewhat restoring its reputation by discovering various RNAs that are also found in modern cells to play important roles. In this book, we focus mainly on biophysical problems that are related to the central dogma of molecular biology. The molecules we deal with are DNA in Chapter 4, RNA and proteins in Chapter 6 and complexes made from DNA and proteins in Chapter 8. We discuss some key processes of the central dogma. Transcription is described in detail in Section 8.2. Concerning replication, we do not describe the intricacies of DNA polymerase but only discuss heat-induced strand separation in Section 4.4. Translation is only described insofar as we study the folded structures of RNAs including tRNA in Section 6.1. Finally, protein folding is discussed in Section 6.2. In addition to the thematic chapters mentioned above, there are also introductory chapters, the purpose of which is to provide the necessary background. Depending on the reader’s background some of these introductory chapters can be skipped. Chapter 2 introduces statistical physics, the foundation on which most of the book rests. Chapter 3 discusses ordinary polymer physics, which serves as a basis for and sets the contrast with the less ordinary polymer physics displayed by DNA, RNA and proteins. Chapter 5 introduces stochastic processes with which we can describe non-equilibrium thermodynamics, without which life would not be possible. Chapter 7 is devoted to the most prevailing interaction inside the cell, the electrostatic interaction. And finally, Chapter 9 gives an introduction to the two most important methods in computational physics, molecular dynamics simulations and Monte Carlo simulations. But let us make a quick journey through the cell nucleus first.
1.2 A Journey through the Cell Nucleus
So far, we have discussed the different key macromolecular actors in a living cell and outlined their interactions, as summarized in the central dogma of molecular biology. In this section we discuss the structure of a cell and where these processes take place. In Fig. 1.6,
Figure 1.6 A journey into the cell nucleus (schematic). The largest scale is shown in the window to the left that is about 20 μm across. It depicts an animal cell with its various organelles, the most conspicuous being the nucleus, the compartment that contains the DNA. Zooming in to the cell nucleus (5 μm) one can distinguish the various DNA chains which reside in well-separated chromosomal territories. A closer view (1.2 μm) reveals a DNA-protein fiber that folds onto itself within its territory. In a further closeup (0.4 μm) one can distinguish dense, stiff and more open, fluffy parts of this fiber. Zooming in (60 nm) shows that the fiber consists of one DNA chain (shown in red) wrapped around blue protein cylinders. The resulting DNA spools, so-called nucleosomes, are connected by naked pieces of linker DNA. Some of the nucleosomes are isolated, others are stacked on top of each other and form 33 nm wide chromatin fibers. A close-up of a nucleosome (20 nm) reveals that the DNA is wrapped in one and three quarter left-handed superhelical turns around the protein cylinder. The 6 nm window shows the DNA double helix as well as some of the eight histone proteins that make up the inner core of the nucleosome. The final window, 2 nm across, shows the DNA double helix with its stacked series of base pairs. Note that the scale factor between the first and the last window is 10 000.
a sequence of pictures is shown. The first picture to the left (the one in the nine o’clock direction) shows an animal cell. Then we zoom into the cell in a clockwise direction, frame by frame, and finally arrive at a close-up of the DNA double helix. The first figure is about 20 μm across, the last one is about 2 nm across, which corresponds to the diameter of the DNA double helix. There is a 10 000-fold magnification when going from the first to the last picture. Before we discuss the structure of the cell on the different length scales, let me point out that there are two categories of cell types, prokaryotic and eukaryotic. Eukaryotic cells have a nucleus (hence their name) and humans are composed of such cells. All cells of animals, plants and fungi are eukaryotic. On the other hand bacteria and archaea do not feature a nucleus and they are thus prokaryotes. In this textbook we have mainly eukaryotes in mind, yet most of the book applies to prokaryotes as well since all life obeys the central dogma. Even so, there are many crucial differences. An important one is the fact that eukaryotic DNA is ingeniously packaged into a DNA-protein complex called chromatin as described further below. There are at least two reasons why nature went through the effort to package eukaryotic DNA. One of them is the fact that many eukaryotes are multi-cellular organisms that have many different types of cells. For instance, nerve cells conduct electric signals over large distances, and white blood cells protect our bodies against infectious diseases. Yet each of our cells carries exactly the same genetic information in its DNA molecules. This is only possible if different genes are expressed in different cell types at different levels. To a large extent this is accomplished through the way in which DNA is packaged inside the cells. If one does not want a particular gene in a particular cell to be read out, one simply packs it away. 
A second, more obvious reason for the DNA packaging is the length scales involved. For instance, the human genome consists of 3 billion base pairs (bp), two copies of which make up the two meters of DNA double helix per cell. The nucleus is the compartment inside which all the DNA is stored. As you can see in Fig. 1.6, a nucleus is typically just a few microns in diameter. So how do two meters of DNA fit into such a tiny container? To be precise, the DNA in a human cell consists of 46 DNA chains, called chromosomes,
A Journey through the Cell Nucleus
46×
random chromosomal DNA coil
cell nucleus densely packed DNA
100 μm
10 μm 2 μm
Figure 1.7 The nucleus of a human cell contains 46 DNA chains, each about 4 cm long. Unconfined, each of the 46 chains forms a random coil of about 100 μm diameter (left). All 46 chains packed densely together form a ball of about 2 μm diameter (right). In the middle: The cell nucleus (here with 10 μm diameter), the compartment that contains all the DNA.
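The length and volume figures quoted for the DNA in a human cell can be reproduced with a short calculation. The Python sketch below is a back-of-the-envelope check, not part of the original derivation: it treats the DNA as a cylinder of 1 nm radius, and the rise of 0.34 nm per base pair is an assumed textbook value for the double helix that is not stated in this section. All function and constant names are hypothetical.

```python
import math

BP_PER_GENOME = 3e9       # base pairs per copy of the human genome (from the text)
GENOME_COPIES = 2         # two copies per cell (from the text)
N_CHROMOSOMES = 46        # DNA chains per cell (from the text)
DNA_RADIUS_NM = 1.0       # half of the 2 nm helix diameter (from the text)
RISE_PER_BP_NM = 0.34     # ASSUMPTION: helix rise per base pair, not given here


def cylinder_volume_um3(radius_nm, length_m):
    """Volume of a cylinder in cubic micrometers."""
    radius_um = radius_nm * 1e-3
    length_um = length_m * 1e6
    return math.pi * radius_um ** 2 * length_um


def sphere_diameter_um(volume_um3):
    """Diameter of a sphere with the given volume, from V = (pi/6) d^3."""
    return (6.0 * volume_um3 / math.pi) ** (1.0 / 3.0)


# Total contour length per cell and per chromosome.
total_length_m = BP_PER_GENOME * GENOME_COPIES * RISE_PER_BP_NM * 1e-9
length_per_chain_cm = total_length_m / N_CHROMOSOMES * 100.0

# Volume of 2 m of DNA and the diameter of a ball of that volume.
dna_volume_um3 = cylinder_volume_um3(DNA_RADIUS_NM, 2.0)
ball_diameter_um = sphere_diameter_um(dna_volume_um3)

if __name__ == "__main__":
    print(f"total DNA length per cell: {total_length_m:.2f} m")
    print(f"length per chromosome:     {length_per_chain_cm:.1f} cm")
    print(f"volume of 2 m of DNA:      {dna_volume_um3:.2f} um^3")
    print(f"diameter of packed ball:   {ball_diameter_um:.2f} um")
```

Under these assumptions the total length comes out at about 2 m and roughly 4 cm per chromosome, the volume equals 2π μm³, and the densely packed ball has a diameter of 12^(1/3) μm, a little over 2 μm.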
each about 4 cm long. How much space would such a chromosomal DNA chain take up if it were not confined? Standard polymer physics, as discussed in Chapter 3, applied to DNA, as discussed in Chapter 4, predicts that a DNA chain by itself would have a random conformation, in polymer language called a random coil (like that of a very long, well-cooked spaghetti). There are infinitely many conformations, but on average the size of such a coil can be estimated (with Eq. 4.67) to be about 100 μm across. Thus the diameter of such a randomly coiled DNA molecule is much smaller than its contour length, but still significantly larger than the diameter of a nucleus, see Fig. 1.7; and this is just one of 46 chains. On the other hand, the volume $V$ of the 2 meters of DNA is rather small since the DNA double helix is incredibly thin with its 2 nm diameter. Thus $V = \pi (1\,\mathrm{nm})^2 \times 2\,\mathrm{m} = 2\pi\,\mu\mathrm{m}^3$. A sphere of this volume would have a diameter of $12^{1/3}\,\mu\mathrm{m} \approx 2\,\mu\mathrm{m}$ and would thus easily fit into a nucleus. So if the DNA chains are packed in some way, they easily fit into the nucleus. Now that we have discussed some of the challenges that DNA faces, let us go step-by-step through Fig. 1.6. Before discussing the different length scales, let me first stress one extremely important point. Even though these are pictures allowing you to “see” with
your own eyes the structures inside a cell, and even though this is printed in a book, this does not necessarily mean that all the things shown here are really true. It should rather be considered as a very rough map, giving you an idea of what is going on. This is an active area of research and many details, and maybe not just details, might be completely wrong. Over time, when new experimental methods allow for more detailed knowledge, some of these pictures will need to be adjusted. I will try to clarify, when discussing some of the structures presented in this book, what is solid knowledge and what is only speculation. A similar critical interpretation of the many shiny pictures in biological and biophysical textbooks and scientific articles (and especially the front pages of the journals in which they are found) is highly recommended.

We start with the 20 μm window, which depicts a whole animal cell. The most conspicuous structure inside the cell is the nucleus, the compartment that contains all the DNA. Before zooming into the nucleus, let me point out that there are many other structures in the cell called organelles. For instance, the orange and yellow ellipses represent mitochondria that produce the energy for the cell. All these structures are separated from each other by membranes that are made from lipids, molecules that contain a hydrophilic and a hydrophobic part and that spontaneously self-assemble into two-dimensional layers. The cell as a whole is also separated from its environment by a membrane. All these structures are extremely interesting and display exciting biophysics, but we shall focus in this book on what happens inside the cell nucleus.

Zooming in (5 μm), you can see a close-up of the whole nucleus. The nucleus is also separated from the rest of the cell, the so-called cytoplasm, by a membrane (actually a double membrane) that contains nuclear pores allowing molecules to enter and leave the nucleus. Going back to the central dogma (Fig.
1.1): replication and transcription both take place in the nucleus, but the ribosomes that do the translation are located in the cytoplasm. This requires that the mRNAs leave the nucleus by passing through the nuclear pores. The most surprising feature of the 5 μm window is the fact that the different DNA chains (shown here in different colors) do not intermingle but reside in separated so-called chromosomal territories—even though there are no physical boundaries between
them. The fact that they are doing this is far from obvious. If you have ever cooked spaghetti, you know that they have no problem mixing perfectly. The physics underlying chromosomal territories has long been unclear, but recent advances have been made, as discussed in Chapter 8. A close-up of a chromosomal territory (1.2 μm window) reveals the fashion in which an individual chromosome is folded within its domain. There are indications that this might be similar to the conformation of a crumpled sheet of paper, except that it is the one-dimensional DNA-protein chain instead of the two-dimensional paper that is crumpled. A further close-up (0.4 μm window) shows that the chromosome is composed of a fiber that is highly inhomogeneous. There are dense, stiff stretches (cylinders) and open, fluffy pieces in between (circles). Only when we reach the 60 nm window can we discern the DNA molecule itself, as the red chain. Most of this chain is wrapped around blue cylinders that are each made of eight so-called histone proteins. These DNA spools are called nucleosomes. Some of the nucleosomes are “free,” others are stacked and form 33 nm-thick so-called chromatin fibers (often also referred to as 30 nm fibers). Zooming in on one of the free nucleosomes (20 nm window), we can see that there are 1 and 3/4 turns of DNA wrapped around the protein cylinder. The DNA continues on each end of the wrapped portion as free stretches connecting to the neighboring nucleosomes. These unwrapped stretches are called linker DNA. A further close-up (6 nm) reveals details of the protein cylinder (colored ribbons) and of the DNA double helix, which is shown in close-up in the last figure of this series (2 nm), where we can see the paired bases.

In this image sequence, we increased the scale by a factor of 10 000 as we went from the entire cell (20 μm) to the DNA double helix (2 nm). Our current understanding of the different length scales varies tremendously.
Surprisingly, we understand the smallest structures, the DNA double helix and the nucleosome, best, while all the other levels are highly debated. Both for DNA and for nucleosomes we know the structure up to atomic resolution. For example, Fig. 1.8 shows the atomic structure of the nucleosome. To obtain it, nucleosomes were reconstituted from histones and DNA of just the right length, namely that of the wrapped DNA, 147 bp. At sufficiently high concentrations and sufficiently low temperatures, these so-called nucleosome core particles form a crystal, a phenomenon that is comparable to the formation of crystals from common table salt. The structure of the nucleosome core particles has been determined via X-ray diffraction on such a crystal. Similarly, several structures of short pieces of DNA (oligonucleotides) have been determined.

Figure 1.8 Crystal structure of the nucleosome core particle. One and three-quarter turns of DNA are wrapped on a left-handed superhelical wrapping path around an octamer of histone proteins. Left: top view. Right: side view. Structure 1aoi (Luger et al., 1997) from the Protein Data Bank.

Already at the next level of organization, the chromatin fiber, the picture becomes rather hazy due to a lack of appropriate experimental methods. Scientists cannot even agree whether such structures actually exist in vivo, i.e., inside a living cell, or whether they are artifacts only observed in vitro, i.e., inside a test tube. Moreover, the structure of the 30 nm fiber, which is so often observed in vitro, is still under discussion, and the structure shown in the 60 nm window in Fig. 1.6 is just one of many possible ones, as we shall discuss later. And things have been even worse for the larger scales, at least up to very recently, before new breakthrough experiments opened our eyes. We discuss these different length scales and what we know about them in Chapter 8. Figure 1.9 again depicts the hierarchical structure of DNA in eukaryotes, but this time without showing its spatial organization
within a cell. It starts at the left with the DNA double helix with a diameter of 2 nm. Next the string of nucleosomes, sometimes referred to as the 10 nm fiber, is shown. The 33 nm wide chromatin fiber is the highly debated next level of organization. We indicate as a possible higher level beyond the fiber its folding into 300 nm-sized loops. Finally, at the right we show the highly condensed mitotic chromosome. This structure is denser than the chromosomes shown in Fig. 1.6 and appears before cells divide. It contains a chromosome and its copy, neatly packaged for the distribution between the two daughter cells.

Figure 1.9 The hierarchical structure of a chromosome. From left to right: the DNA double helix; the string of nucleosomes; the 30 nm chromatin fiber; higher order features like loops; and the mitotic chromosome. [Labels in the figure: DNA double helix, 2 nm; string of nucleosomes, 10 nm; chromatin fiber, 33 nm; folded fiber, 300 nm; mitotic chromosome, 1.4 μm.]

It is instructive to draw a comparison between the structure and function of chromatin and an everyday-life example: a library of books. Just as a nucleus stores the long one-dimensional string of base pairs, a library contains a huge one-dimensional string of letters, the text written down in all its books. A book like the one in front of you contains about 10 km of text. A library with 10 000 books thus stores roughly 100 000 km of text. This raises the question of how all this text can be stored and retrieved. Stuffing it arbitrarily into the library like strings of spaghetti would make it really hard to deal with. Instead the text is folded neatly in a hierarchical fashion in lines, pages, books and shelves, see Fig. 1.10. This makes it relatively easy to find the passage of interest with the
help of a few markers. Furthermore, although all the text is tightly stored, the book of interest can be taken off the shelf and opened at the appropriate page without perturbing the rest of the library. It is therefore clear that this hierarchical type of text organization makes it possible to store a large amount of information in a small space and still make it easily accessible.

Figure 1.10 A library of books has a structure and function similar to those of chromatin, cf. Fig. 1.9. It stores the text of all the books hierarchically in lines, pages, books and shelves. The information can be accessed locally by removing the desired book from the shelf and opening it at the appropriate page.

Figure 1.11 Crystal structure of a transcription factor bound to a short piece of DNA double helix (seen as a crosscut). Note that the factor consists of two proteins, as indicated in the figure, that bind completely around the double helix. This makes it sterically impossible for the transcription factor to bind to DNA wrapped on a nucleosome. Picture of 1vkx (Chen et al., 1998) created with Mathematica.
It is obvious that both a library and chromatin have a hierarchical organization. Less obvious and, in many respects, still an open question is how the dense chromatin structure can be opened locally to allow access to its genes. For instance, the nucleosomal repeat length is about 200 bp, of which 147 bp are wrapped around the protein cylinder. This means that about 75% of a DNA chain is tightly bound to the histones. It is known that there is a multitude of proteins that bind to DNA at specific sites that contain specific short bp sequences. One class of such proteins are the so-called transcription factors, which regulate transcription. However, many such proteins cannot interact with DNA when it is wrapped. This becomes clear when we look at a concrete example, e.g., the transcription factor in Fig. 1.11. This figure depicts a co-crystal structure of a transcription factor bound to an 11 bp piece of DNA. In the center of the figure you can see a cross section through the DNA double helix. It is important to note that the transcription factor binds all around the DNA double helix, which is necessary for the protein to actually recognize the target bp sequence and therefore be able to bind specifically. However, this obviously means that this transcription factor cannot detect its target sequence and bind to it if the corresponding DNA section is located within the part that is wrapped in a nucleosome. Moreover, even the unwrapped sections (the linker DNA) are somewhat buried inside the dense chromatin fiber. Therefore, it is necessary for the cell to have mechanisms at hand to open (unfold) the fiber and then somehow unwrap the DNA from the protein spools. This becomes especially mysterious when thinking about transcription and duplication: How do RNA and DNA polymerases (Figs. 1.2 and 1.3) make their way through the tens or hundreds of nucleosomes that they encounter on their way?
We do not know the answer to that question, but we will explain in Chapter 8 how proteins can bind to nucleosomal DNA.
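The order-of-magnitude numbers used in this chapter can be reproduced in a few lines. The following Python sketch is only a sanity check, not part of the argument itself. The rise of 0.34 nm per base pair is a commonly quoted value for the double helix that is assumed here (it is not stated in this section), and all variable names are illustrative.

```python
import math

# Numbers quoted in the text
BP_PER_GENOME = 3e9      # base pairs in one copy of the human genome
COPIES = 2               # two copies per cell
N_CHROMOSOMES = 46       # DNA chains per human cell
DNA_RADIUS_NM = 1.0      # half of the 2 nm diameter of the double helix
REPEAT_LENGTH_BP = 200   # nucleosomal repeat length
WRAPPED_BP = 147         # bp wrapped around one histone octamer

# Assumption (not from this section): rise per base pair in nm
RISE_PER_BP_NM = 0.34

total_length_m = BP_PER_GENOME * COPIES * RISE_PER_BP_NM * 1e-9
length_per_chromosome_cm = total_length_m / N_CHROMOSOMES * 100

# Volume of a cylinder of radius 1 nm and the total DNA length, in cubic microns
radius_um = DNA_RADIUS_NM * 1e-3
volume_um3 = math.pi * radius_um**2 * (total_length_m * 1e6)

# Diameter of a sphere of the same volume, from V = (pi / 6) d^3
packed_diameter_um = (6 * volume_um3 / math.pi) ** (1 / 3)

# Fraction of the DNA that is wrapped, and number of nucleosomes per cell
wrapped_fraction = WRAPPED_BP / REPEAT_LENGTH_BP
n_nucleosomes = BP_PER_GENOME * COPIES / REPEAT_LENGTH_BP

print(f"total DNA length:        {total_length_m:.2f} m")
print(f"length per chromosome:   {length_per_chromosome_cm:.1f} cm")
print(f"DNA volume:              {volume_um3:.2f} um^3")
print(f"packed sphere diameter:  {packed_diameter_um:.2f} um")
print(f"wrapped fraction:        {wrapped_fraction:.1%}")
print(f"nucleosomes per cell:    {n_nucleosomes:.1e}")
```

With these inputs the wrapped fraction comes out slightly below the "about 75%" quoted above, and the packed diameter is a little over 2 μm, consistent with the rounded estimates in the text.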
Problems

1.1 Translation and transcription
Right or wrong? Translation is the copying of a gene into messenger RNA by RNA polymerase. Transcription takes place inside ribosomes where a protein is produced from its RNA blueprint. DNA is made from amino acid building blocks. If you do not know the answers, go back to Section 1.1.

1.2 Complementary strands
Inside cells DNA is double-stranded with the two strands being complementary to each other. Please give as many reasons as you can think of why this is biologically advantageous.

1.3 Protein world
As mentioned in the text, some people claim that there was an RNA world without DNA and proteins at an early stage of evolution. Could there have been a protein world instead without RNA and DNA?

1.4 DNA and life
Is DNA alive? Explain your answer.

1.5 Random coil
In Fig. 1.7, it is indicated that a chromosomal DNA molecule (non-confined and under physiological conditions) would form a random coil of about 100 μm diameter. This estimate is based on Eq. 4.67. In this formula, L is the length of the molecule and D its thickness. You can find both numbers in the main text. Finally, $l_P$ is the so-called persistence length, a measure of the stiffness of the molecule. For the DNA double helix one has $l_P = 50$ nm. Calculate the coil diameter by inserting these numbers into the formula.
Chapter 2
Statistical Physics
In this book, we shall encounter several important branches of physics, but for us the most important one will be statistical physics. Here I present a short introduction to statistical physics to provide the reader with the framework necessary to understand later chapters where it is used extensively. As we did in the previous chapter, I try to be as concise as possible and only discuss issues that are relevant for this book. In other words, this introduction is far from complete, but hopefully it will help readers not familiar with statistical physics to get a quick idea of the basic concepts. If you are familiar with this topic, you can skip this chapter except for the last section.
2.1 The Partition Function

To explain what statistical physics is all about, I give you a simple example that we have all been familiar with since childhood: a balloon filled with gas. The physical state of the gas in the balloon can be fully characterized by three physical quantities: (1) The volume V of the balloon, which corresponds to the volume available for the gas. (2) The pressure p that describes how hard one has to press to compress the gas, a well-known concept from, e.g., car tires. (3) The temperature T of the gas, a quantity one typically plays with in a hot-air balloon. It has long been known that these three quantities are related to each other. Robert Boyle found in 1662 that when a fixed amount of gas is kept at a constant temperature, the pressure and volume are inversely proportional, i.e., $p \sim 1/V$. Jacques Charles found in the 1780s that the volume is proportional to the temperature, $V \sim T$, when the pressure of a fixed amount of gas is kept constant. And finally Joseph Louis Gay-Lussac stated in 1802 that the pressure of a fixed amount of gas in a fixed volume is proportional to the temperature: $p \sim T$. You can easily check that all three of these laws are met if the ratio between the pressure-volume product and the temperature is constant:

\[ \frac{pV}{T} = \text{const.} \tag{2.1} \]
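Eq. 2.1 can be tried out numerically. The sketch below is illustrative only: the function name and the way the three laws are probed (halving or doubling one variable) are choices made for this example. The reference state, 22.4 L at 273.15 K and 1.013 bar, is the state of one mole of gas quoted in the following paragraph.

```python
def pv_over_t(p_pa, v_m3, t_k):
    """Return p*V/T in J/K for a state given in SI units."""
    return p_pa * v_m3 / t_k

# Reference state of one mole: 1.013 bar, 22.4 L, 273.15 K (converted to SI)
p0, v0, t0 = 1.013e5, 22.4e-3, 273.15
const = pv_over_t(p0, v0, t0)

# Boyle: at fixed T, halving the volume doubles the pressure
p_boyle = const * t0 / (v0 / 2)

# Charles: at fixed p, doubling the temperature doubles the volume
v_charles = const * (2 * t0) / p0

# Gay-Lussac: at fixed V, doubling the temperature doubles the pressure
p_gay_lussac = const * (2 * t0) / v0

print(f"pV/T for one mole: {const:.3f} J/K")
print(f"Boyle:      p/p0 = {p_boyle / p0:.3f}")
print(f"Charles:    V/V0 = {v_charles / v0:.3f}")
print(f"Gay-Lussac: p/p0 = {p_gay_lussac / p0:.3f}")
```

The value of pV/T obtained for the reference state is close to 8.31 J/K, and each of the three historical laws appears as a special case of keeping one variable fixed.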
That means if we look at a gas in two different states, $(p_1, V_1, T_1)$ and $(p_2, V_2, T_2)$, we always find $p_1 V_1/T_1 = p_2 V_2/T_2$. What is that value, i.e., the value of the constant on the right-hand side (rhs) of Eq. 2.1? The value depends on the amount of gas inside the balloon. An amount of gas that occupies $V = 22.4$ L at $T = 0\,^{\circ}\text{C} = 273.15$ K and atmospheric pressure, $p = 1.013$ bar, is called one mole. The constant then takes the value

\[ R = 8.31\,\frac{\text{J}}{\text{K}} \tag{2.2} \]

and is called the universal gas constant (J: joule). If the amount of gas in the balloon is n moles, then the constant in Eq. 2.1 has the value nR. Equation 2.1, the so-called combined gas law, is an example of an empirical law that relates measurable physical quantities. Statistical physics is the theoretical framework that allows us to derive such laws from first principles. This is a pretty daunting task. A gas is a collection of a huge number of particles. We nowadays know that one mole of gas contains $N_A = 6.02 \times 10^{23}$ particles (independent of the type of gas chosen; normal air, helium, etc.) where $N_A$ is called the Avogadro constant. This is a rather mind-blowing fact: a balloon that contains one mole of particles can be fully characterized by three so-called macroscopic variables, p, V and T, yet it has a myriad
of microstates characterized by the positions and velocities (in the X-, Y- and Z-directions) of $6 \times 10^{23}$ gas molecules! Let us try to derive Eq. 2.1 from the microscopic structure of the gas. This will serve as a concrete example to introduce the methods of statistical physics. As a first step we introduce an abstract, very high-dimensional space, the phase space of our system, that contains the positions and momenta (the momentum of a particle is the product of its mass and velocity) of all N particles. A point in phase space is given by $(x_1, y_1, z_1, \ldots, x_N, y_N, z_N, p_1^x, p_1^y, p_1^z, \ldots, p_N^x, p_N^y, p_N^z)$, where, e.g., $y_i$ and $p_i^y$ denote the position and momentum of the i-th particle, both in the Y-direction. In shorthand notation we can write (q, p) for such a point in phase space, where q is a high-dimensional vector that contains all the positions and p all the momenta of the N particles (do not confuse this p with the pressure p from above; they have nothing to do with each other). The amazing thing is that as the gas molecules move inside the balloon and bounce off its surface, i.e., as the point (q, p) races through the phase space, we cannot see anything happen to the balloon in our hands, which remains at a constant pressure, volume and temperature. This suggests the following: To a given macrostate characterized by the triplet (p, V, T) there is a myriad of microstates, each characterized by a high-dimensional vector (q, p). We can imagine dividing the phase space into many very small hypercubes. Our system (e.g., the balloon) visits a given hypercube from time to time as it races through its microstates. Thus to each of these tiny hypercubes we can assign a probability, namely the probability to find the system inside that cube. In other words, we need to introduce the concept of probabilities by assigning to each microstate (q, p) a probability ρ = ρ(q, p).
For a quick introduction to probability theory, please consult Appendix A. We now present a line of argument that allows us to determine the form of ρ, namely Eq. 2.5 below. Please be warned that even though each of the steps looks rather compact, it is not easy to grasp them entirely. At this stage, you might rather consider this as a rough outline that gives you a general view of things and allows you to quickly get to something concrete to work with. You do not have to feel too uncomfortable with this, since we shall later on provide a completely different argument that again leads us to Eq. 2.5.
Figure 2.1 A balloon filled with gas molecules. The virtual line divides the whole balloon (system $\Sigma$) into two subsystems, its left half, $\Sigma_1$, and its right half, $\Sigma_2$. To a good approximation these two halves are independent from each other, as mathematically expressed in Eq. 2.3.
In Fig. 2.1 we show the balloon with its gas molecules; we call this whole system $\Sigma$. We now consider two subsystems, $\Sigma_1$ and $\Sigma_2$, namely the molecules to the left and to the right of a virtual dividing plane, as indicated in the figure by a dashed line. Real gas molecules have a very short range of interaction that is much shorter than the diameter of the balloon. This means that only a very tiny fraction of the molecules in $\Sigma_1$ feel molecules from $\Sigma_2$ and vice versa. Therefore, the two subsystems can be considered as independent from each other to a good approximation. We can therefore define the probability densities $\rho_1$ and $\rho_2$ separately for each subsystem, without going into further mathematical details here. Now since $\Sigma_1$ and $\Sigma_2$ are independent, the probability of the whole system is simply the product of the probabilities of its subsystems, $\rho = \rho_1 \rho_2$ (just like with two dice: the probability to throw a 6 amounts for each die to 1/6 and the probability that both dice yield a 6 is 1/6 × 1/6 = 1/36). Using the functional property of the (natural) logarithm, $\ln ab = \ln a + \ln b$, this can be rewritten as:

\[ \ln \rho = \ln \rho_1 + \ln \rho_2 . \tag{2.3} \]
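The dice analogy behind Eq. 2.3 is easy to make explicit. The following Python sketch builds the joint distribution of two independent fair dice and checks that the logarithm of the joint probability is the sum of the logarithms; the variable names are illustrative, and the dice stand in for the two subsystems only by analogy.

```python
import math
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Probability distribution of a single fair die
p_single = {face: Fraction(1, 6) for face in faces}

# Joint distribution of two independent dice: rho = rho_1 * rho_2
p_joint = {(a, b): p_single[a] * p_single[b] for a, b in product(faces, faces)}

p_double_six = p_joint[(6, 6)]

# Additivity of the logarithm, the analogue of Eq. 2.3
log_joint = math.log(float(p_double_six))
log_sum = math.log(float(p_single[6])) + math.log(float(p_single[6]))

print("P(6, 6) =", p_double_six)
print("ln rho          =", log_joint)
print("ln rho1+ln rho2 =", log_sum)
```

Using exact fractions for the probabilities makes the product rule visible without rounding; only the logarithms are evaluated in floating point.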
This is one of the conditions that ρ needs to fulfill. A second one is the following. Here and in the rest of this chapter we are considering systems in equilibrium. By this we mean that the system has developed into a state in which nothing changes anymore. For our balloon this means that the values of p, V and T stay
constant in time (unlike, e.g., a glass of water where all the water evaporates if you wait long enough). Likewise nothing changes anymore microscopically, i.e., the function $\rho = \rho(q, p, t)$ does not explicitly depend on time but is of the form $\rho = \rho(q, p)$, as we originally wrote it. In other words,

\[ \frac{\partial \rho}{\partial t} = 0. \tag{2.4} \]

Amazingly, Eqs. 2.3 and 2.4 are enough to determine ρ. We know from Eq. 2.4 that ρ is a conserved quantity, meaning a quantity that does not change in time. ρ must thus be a function of a conserved physical quantity. Possible candidate quantities are: (a) the total energy H of the system, (b) its total momentum P and (c) the particle number N (or the numbers $N_\alpha$ of each type if there are different particle types). Most systems are confined by walls (e.g., a gas in a balloon). Whenever a gas molecule hits the balloon, it gets reflected and thereby transmits momentum to the balloon; thus P of the gas is not conserved. This means ρ can only depend on H and N. From Eq. 2.3 we know that ln ρ is an additive quantity and so are the energy H, $H = H_1 + H_2$, and the particle number N, $N = N_1 + N_2$. This means we know more about how ln ρ should depend on H and N, namely that it must be a linear function of additive, conserved quantities. This leaves several possibilities for ln ρ that depend on the concrete physical situation. For the balloon the number of particles inside the balloon is fixed since the gas molecules cannot pass through the balloon skin. However, energy can flow in and out of the balloon in the form of heat. In that case we should expect that $\ln \rho = -\beta H + C$ where β is some constant; C is another constant that has to be chosen such that ρ is normalized to one. If in addition particles can move in and out, one should expect that $\ln \rho = +\alpha N - \beta H + C$ with α being yet another constant. The plus and minus signs here are just conventions and do not mean anything since we do not yet know the signs of α and β. Let us begin with the first case, the one with N fixed.
Such a system is called the canonical ensemble. From above we know that ρ must be of the form

\[ \rho(q, p) = \frac{1}{Z}\, e^{-\beta H(q, p)}, \tag{2.5} \]

the Boltzmann distribution. Here the function H = H(q, p) is the energy of the system that depends on the positions and momenta of all the particles. The role of the factor 1/Z is to normalize the probability distribution such that the sum over all different possible states of the system adds up to one (just as the probabilities of a perfect die, 1/6 for each possible outcome, add up to 1). Surprisingly, this seemingly harmless factor is the key to understanding the properties of the system, as we shall see below. As it turns out to be so important, it should not surprise you that it has a name: the partition function. We need to choose Z such that

\[ \frac{1}{N!\,h^{3N}} \int \rho(q, p)\, d^{3N}q\, d^{3N}p = 1 \tag{2.6} \]

and hence

\[ Z = \frac{1}{N!\,h^{3N}} \int e^{-\beta H(q, p)}\, d^{3N}q\, d^{3N}p. \tag{2.7} \]
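To see the role of Z as a normalization constant, it can help to look at a discrete analogue in which the phase-space integral of Eq. 2.7 is replaced by a sum over a handful of states. The sketch below is such a toy version: the energy values, the choice β = 1 and the function name are arbitrary assumptions for illustration, and the prefactor $1/N!\,h^{3N}$ has no counterpart here.

```python
import math

def boltzmann_distribution(energies, beta):
    """Return (Z, probabilities) for a finite list of state energies."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                    # discrete analogue of Eq. 2.7
    probs = [w / z for w in weights]    # discrete analogue of Eq. 2.5
    return z, probs

# Arbitrary example energies (in units where beta = 1 is a sensible scale)
energies = [0.0, 1.0, 2.0, 5.0]
beta = 1.0

z, probs = boltzmann_distribution(energies, beta)

print(f"Z = {z:.4f}")
for e, p in zip(energies, probs):
    print(f"  E = {e:3.1f}  ->  probability {p:.4f}")
print(f"sum of probabilities = {sum(probs):.6f}")
```

The ratio of two probabilities depends only on the energy difference of the two states, while Z fixes the overall scale so that the probabilities add up to one.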
The factor $1/N!\,h^{3N}$ in front of the integrals in Eqs. 2.6 and 2.7 seems to be an unnecessary complication in the notation and needs some explanation. Let us start with the factor 1/N!. Here N! is the number of possible ways in which one could number the N particles (we pick a particle and give it a number between 1 and N, then the second particle and give it one of the N − 1 remaining tags and so on). If the microscopic world behaved classically (like the macroscopic world we are used to living in), we could give each of the N gas molecules such an individual tag and follow its course in time. That way the two configurations on the left-hand side (lhs) of Fig. 2.2 are different from each other, since particles 1 and 3 are exchanged. However, the microscopic world of these particles is governed by the laws of quantum mechanics. One of these laws is that identical particles are indistinguishable; in other words, the two conformations on the lhs of Fig. 2.2 are identical and belong to exactly the same physical state, the one shown to the right of Fig. 2.2. When performing the integrals $d^{3N}q\, d^{3N}p$ in Eqs. 2.6 and 2.7 one would encounter each such configuration N! times. The factor 1/N! prevents this overcounting.

Figure 2.2 A balloon with N = 5 identical gas molecules. In classical mechanics (lhs) we can number the particles individually, allowing us to distinguish between the configurations shown at the top and at the bottom. In quantum mechanics (rhs) identical particles are indistinguishable, which means that the two states shown on the left are one and the same, namely the configuration depicted at the right. This quantum-mechanical law is the cause of the 1/N! factor in Eq. 2.7.

Next we discuss the factor $h^{3N}$. This factor is introduced to make Z dimensionless, i.e., no matter what units we use (e.g., meters or inches for length) Z is always the same. h is a quantity with the dimensions of length times momentum (or equivalently energy times time), namely

\[ h = 6.626 \times 10^{-34}\ \text{Js}. \tag{2.8} \]
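Returning briefly to the 1/N! factor: the overcounting it removes can be made concrete with a toy counting exercise. The sketch below places a few particles on distinct sites of a small lattice, once with labeled particles and once with unlabeled ones. The lattice, the restriction to one particle per site and the variable names are assumptions made purely for counting; they are not a model of the gas.

```python
from itertools import combinations, permutations
from math import factorial

n_sites = 6       # hypothetical number of available positions
n_particles = 3   # number of identical particles

# Labeled (classical) particles: ordered assignments of particles to sites
labeled = sum(1 for _ in permutations(range(n_sites), n_particles))

# Indistinguishable particles: only the set of occupied sites matters
unlabeled = sum(1 for _ in combinations(range(n_sites), n_particles))

overcounting = labeled // unlabeled

print("labeled configurations:  ", labeled)
print("unlabeled configurations:", unlabeled)
print("overcounting factor:     ", overcounting, "= N! =", factorial(n_particles))
```

Every physical configuration is counted N! times in the labeled enumeration, which is the factor that the prefactor of Eq. 2.7 divides out.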
Even though this choice seems arbitrary from the perspective of classical mechanics, it can be motivated to be the most logical choice in the realm of quantum mechanics. The quantity h is the so-called Planck constant that appears in a famous relation in quantum mechanics: It is impossible to measure the position and momentum of a particle beyond a certain precision. According to Heisenberg's uncertainty principle the uncertainty in position, $\Delta x$, and in momentum, $\Delta p_x$, both in the X-direction, obey the relation $\Delta x\, \Delta p_x \geq h/4\pi$ (more precisely, $\Delta x$ and $\Delta p_x$ are the standard deviations, Eq. A.6, found when the measurement is repeated again and again under identical conditions). So if one measures the position of a particle very precisely, there is a large uncertainty in its momentum and vice versa, a consequence of the particle-wave duality that we shall not discuss here further. For this reason, it makes sense to divide our 6N-dimensional space into small
hypercubes of volume $h^{3N}$, which explains the choice of the prefactor in Eq. 2.7. To give a concrete example we calculate the partition function of the gas in the balloon, Fig. 2.1. Before we can start to evaluate the integral, Eq. 2.7, we need to have an expression for the energy of the gas, H = H(q, p), also called the Hamiltonian of the system. We consider an idealization of a real gas, the so-called ideal gas. In this model the interaction between different gas molecules is neglected altogether. This turns out to be an excellent approximation for most gases since the concentration of gas molecules is so low that they hardly ever feel each other's presence. This means that the energy is independent of the distribution of the molecules in space, i.e., H = H(q, p) does not depend on q. This leaves us only with the kinetic energy of the particles (assumed here to all have the same mass m):

\[ H = H(p) = \sum_{i=1}^{N} \frac{p_i^2}{2m} = \frac{1}{2m} \sum_{i=1}^{N} \left[ \left(p_i^x\right)^2 + \left(p_i^y\right)^2 + \left(p_i^z\right)^2 \right] \tag{2.9} \]

with $p_i = |\mathbf{p}_i|$ being the length of the momentum vector $\mathbf{p}_i = (p_i^x, p_i^y, p_i^z)$. Plugging this into Eq. 2.7 we realize that we have Gaussian integrals of the form given in Eq. A.8 of Appendix A. The momentum integration of each of the 3 components of each particle gives a factor $\sqrt{2\pi m/\beta}$. In addition each particle is allowed to move within the whole balloon so that its position integration gives a factor V. Altogether this leads to

\[ Z = \frac{V^N}{N!\,h^{3N}} \int e^{-\frac{\beta}{2m} \sum_{i=1}^{N} p_i^2}\, d^{3N}p = \frac{V^N}{N!\,h^{3N}} \left( \frac{2\pi m}{\beta} \right)^{3N/2}. \tag{2.10} \]

It is customary to introduce a quantity called the thermal de Broglie wavelength

\[ \lambda_T = h \sqrt{\frac{\beta}{2\pi m}} \tag{2.11} \]

that allows us to write the partition function Z of the ideal gas very compactly:

\[ Z = \frac{1}{N!} \left( \frac{V}{\lambda_T^3} \right)^{N}. \tag{2.12} \]

We introduced the partition function in Eq. 2.5 merely as a factor necessary to normalize the probability distribution, but we
mentioned already that you can derive from Z almost everything you want to know about the macroscopic system. As a first example we show now that knowing Z means that it is straightforward to determine $E = \langle H \rangle$, the average energy of the system. According to Eq. A.4 of Appendix A, $\langle H \rangle$ is given by

\[ \langle H \rangle = \frac{\int H(q, p)\, e^{-\beta H(q, p)}\, d^{3N}q\, d^{3N}p}{\int e^{-\beta H(q, p)}\, d^{3N}q\, d^{3N}p} = \frac{1}{Z} \frac{1}{N!\,h^{3N}} \int H(q, p)\, e^{-\beta H(q, p)}\, d^{3N}q\, d^{3N}p. \tag{2.13} \]

Here the denominator is necessary to normalize the canonical distribution and is, of course, again proportional to the partition function. It seems at first that the integral on the rhs of Eq. 2.13 needs to be evaluated all over again. However, the beauty of the partition function Z is that it is of such a form that it allows expressions such as Eq. 2.13 to be obtained from it by straightforward differentiation. You can easily convince yourself that one has

\[ E = \langle H \rangle = -\frac{\partial}{\partial \beta} \ln Z \tag{2.14} \]

with Z given by Eq. 2.7. The differentiation of the ln-function produces the 1/Z factor on the rhs of Eq. 2.13 and the form of its integrand, $H e^{-\beta H}$, follows simply from the differentiation, $-\partial e^{-\beta H}/\partial \beta$. This means all the hard work lies in calculating Z through a high-dimensional integral, Eq. 2.7. Once this is done, the harvest consists of straightforward differentiation as in Eq. 2.14. We can also calculate the variance of the energy fluctuations of a macroscopic system. In the case of an air-filled balloon these fluctuations result from the exchange of heat with the surrounding air outside the balloon, which constitutes a so-called heat bath. This quantity is $\sigma_E^2 = \langle H^2 \rangle - \langle H \rangle^2$ (cf. Eq. A.6 of Appendix A) and follows simply by differentiating ln Z twice:

\[ \sigma_E^2 = \frac{\partial^2}{\partial \beta^2} \ln Z = -\frac{\partial \langle H \rangle}{\partial \beta} \]
\[ \phantom{\sigma_E^2} = \langle H^2 \rangle - Z \langle H \rangle \frac{\partial}{\partial \beta} \frac{1}{Z} = \langle H^2 \rangle - \langle H \rangle^2 . \tag{2.15} \]

To arrive at the second line we used Eq. 2.13; the first term accounts for the β-dependence inside the integral, the second for that of the $Z^{-1}$ factor.
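Eqs. 2.14 and 2.15 can be checked numerically on a small toy system. The sketch below again replaces the phase-space integral by a sum over a few discrete states, computes the mean and variance of the energy directly from the Boltzmann weights, and compares them with finite-difference derivatives of ln Z. The energy levels, the value of β and the step size are arbitrary assumptions chosen for illustration.

```python
import math

def log_z(energies, beta):
    """ln Z for a finite set of states (discrete analogue of Eq. 2.7)."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def direct_moments(energies, beta):
    """Mean and variance of the energy from the Boltzmann weights."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    mean = sum(e * w for e, w in zip(energies, weights)) / z
    mean_sq = sum(e * e * w for e, w in zip(energies, weights)) / z
    return mean, mean_sq - mean**2

energies = [0.0, 0.5, 1.3, 2.0]   # arbitrary example levels
beta, eps = 0.8, 1e-4             # arbitrary beta, finite-difference step

mean_direct, var_direct = direct_moments(energies, beta)

# Eq. 2.14: <H> = -d(ln Z)/d(beta), via a central difference
mean_from_z = -(log_z(energies, beta + eps) - log_z(energies, beta - eps)) / (2 * eps)

# Eq. 2.15: sigma_E^2 = d^2(ln Z)/d(beta)^2, via a second central difference
var_from_z = (
    log_z(energies, beta + eps)
    - 2 * log_z(energies, beta)
    + log_z(energies, beta - eps)
) / eps**2

print(f"<H>:       direct {mean_direct:.6f}, from ln Z {mean_from_z:.6f}")
print(f"sigma_E^2: direct {var_direct:.6f}, from ln Z {var_from_z:.6f}")
```

The two routes agree up to the small error of the finite-difference approximation, which illustrates why computing Z once is enough to obtain both quantities.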
Since we already calculated the partition function of the ideal gas, Eq. 2.10, we can immediately obtain, via Eq. 2.14, its average energy:

\[ E = \langle H \rangle = \frac{3N}{2\beta}. \tag{2.16} \]
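Eq. 2.16 can be cross-checked by differentiating ln Z of the ideal gas numerically. The sketch below uses the compact form of Eq. 2.12 and evaluates ln N! with the log-gamma function. Working in units where m = h = 1, as well as the particular values of N, V and β, are assumptions made for this example. The second derivative is included as well; it gives the energy variance of Eq. 2.15 and, anticipating Eq. 2.17 below, the relative size of the fluctuations.

```python
import math

def log_z_ideal(n, volume, beta, mass=1.0, h=1.0):
    """ln Z of the ideal gas from Eq. 2.12; ln N! is computed as lgamma(N + 1)."""
    lambda_t = h * math.sqrt(beta / (2 * math.pi * mass))   # Eq. 2.11
    return n * math.log(volume / lambda_t**3) - math.lgamma(n + 1)

# Arbitrary example values in units where m = h = 1
n, volume, beta, eps = 1000, 50.0, 2.0, 1e-3

lz_plus = log_z_ideal(n, volume, beta + eps)
lz_mid = log_z_ideal(n, volume, beta)
lz_minus = log_z_ideal(n, volume, beta - eps)

# Eq. 2.14 by central difference, compared with Eq. 2.16
energy = -(lz_plus - lz_minus) / (2 * eps)
expected_energy = 3 * n / (2 * beta)

# Eq. 2.15 by second central difference, and the relative fluctuation
variance = (lz_plus - 2 * lz_mid + lz_minus) / eps**2
relative_fluctuation = math.sqrt(variance) / energy

print(f"E from ln Z:     {energy:.4f}")
print(f"3N / (2 beta):   {expected_energy:.4f}")
print(f"sigma_E / <H>:   {relative_fluctuation:.5f}")
print(f"sqrt(2 / (3 N)): {math.sqrt(2 / (3 * n)):.5f}")
```

Note that neither V nor the N! term contributes to the derivative with respect to β, which is why the energy of the ideal gas depends only on N and β.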
The energy is thus proportional to the particle number N, as one should expect for non-interacting particles, and inversely proportional to the quantity β. We still do not know the physical meaning of this quantity, even though, as we shall see, it is well-known to us; we even have a sensory organ for it. For now we can only give β a rather technical meaning: It allows us, via Eq. 2.16, to set the average energy $\langle H \rangle$ of the gas to a given value. We can now also calculate the typical relative deviation of the energy from its mean value $\langle H \rangle$. It follows from Eqs. 2.10 and 2.15 that

\[ \frac{\sigma_E}{\langle H \rangle} = \sqrt{\frac{2}{3}}\, \frac{1}{\sqrt{N}}. \tag{2.17} \]

This means that in large systems the relative fluctuations around the mean value are so tiny that, for any practical purposes, the system is indistinguishable from a system that is thermally isolated, i.e., a system that cannot exchange energy with the outside world. Such a system is called a microcanonical ensemble, but is not discussed further here.

Figure 2.3 Gas in a cylinder. The piston is under an externally imposed force f that counterbalances the individual forces $f_i$ of the gas molecules close to the surface of the piston. Each of these forces follows from a short-ranged wall potential $u(l - x)$ that smoothly goes to infinity as the gas molecule reaches the surface of the piston.

Our aim is now to derive an equation for the pressure of the ideal gas and to check whether statistical mechanics allows us to derive the combined gas law, Eq. 2.1, from first principles. To make the analysis more convenient we put the gas in a cylinder with a movable piston, Fig. 2.3, instead of a balloon. If we apply a force f on the piston, then the pressure on it is given by p = f/A where A is the area of the piston. The gas occupies a volume V = Al with l denoting the height of the piston above the bottom of the cylinder. To better understand how the gas can exert a force on the piston we add to the Hamiltonian H(q, p) a wall potential $U_{\text{wall}}(l, q)$ that depends on the positions of all the particles and on the height l of the piston. We do not assume anything here about the form of the Hamiltonian H(q, p), so the following formulas are general. The wall potential $U_{\text{wall}}(l, q)$ takes an infinite value if any of the molecules is outside the allowed volume. This way the gas is forced
to stay inside the cylinder. To calculate the force exerted by the gas molecules, we assume that the potential goes smoothly to infinity over a microscopically short distance δ when a particle gets close to the surface of the piston (for the other confining walls we simply assume that the potential jumps right to infinity). More specifically, the wall potential is of the form N u (l − xi ) (2.18) U wall (l, q) = i =1
as long as all particles are inside the cylinder and infinity otherwise. Most particles are far from the surface of the piston, l − xi > δ, and thus do not feel it, i.e., u (l − xi ) = 0. But a small fraction of them are nearby, l − xi < δ, and they are pushed to the left exerting a force on the piston. For a given configuration of particles, q = (x1 , y1 , z1 , . . . , x N , yN , zN ) this force is given by N ∂u (l − xi ) ∂U wall (l, q) f =− =− . (2.19) ∂l ∂l i =1 We are interested in the mean force f that is given by
∂U wall (l, q) −β[H (q, p)+U wall (l, q)] 3N 3N 1 1 f = − e d q d p. Z N!h3N ∂l (2.20)
This expression might look complicated, but again it is just a simple derivative of the partition function, namely
$$\langle f \rangle = \frac{1}{\beta}\,\frac{\partial \ln Z}{\partial l}. \tag{2.21}$$
This is the average force that is exerted by the gas on the piston (and vice versa). Using the relations $p = \langle f \rangle / A$ and $V = Al$ we can immediately write down the relation for the pressure:
$$p = \frac{\langle f \rangle}{A} = \frac{1}{\beta}\,\frac{\partial \ln Z}{\partial V}. \tag{2.22}$$
We can now use Eq. 2.22 to determine the pressure of an ideal gas. When calculating its partition function in Eq. 2.10 we did not take account of a detailed wall potential. But since the wall potential increases over a microscopically small distance $\delta \ll l$, the partition function is not affected by such details. Using Eq. 2.12 we find
$$p = \frac{N}{\beta V}. \tag{2.23}$$
Comparison with the combined gas law, Eq. 2.1, lets us finally understand the physical meaning of $\beta$. It is inversely proportional to the temperature:
$$\beta = \frac{1}{k_B T}. \tag{2.24}$$
The quantity $k_B$ is called the Boltzmann constant. From Eq. 2.1 together with Eq. 2.2 follows its value
$$k_B = \frac{R}{N_A} = 1.38 \times 10^{-23}\,\frac{\text{J}}{\text{K}}. \tag{2.25}$$
To summarize, we have found two equations that characterize an ideal gas. From Eq. 2.16 we find for the energy
$$E = \frac{3}{2} N k_B T \tag{2.26}$$
and from Eq. 2.23 we obtain the ideal gas equation of state
$$pV = N k_B T. \tag{2.27}$$
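To get a feeling for the numbers, the relations found so far can be evaluated directly. The following is a minimal sketch, not part of the derivation: the numerical values of $R$ and $N_A$ are the standard reference values, the volume of 24 litres is an arbitrary illustrative choice, and all function names are made up for this example.

```python
import math

R = 8.314462618        # molar gas constant in J/(mol K), standard reference value
N_A = 6.02214076e23    # Avogadro constant in 1/mol, standard reference value

k_B = R / N_A          # Boltzmann constant, Eq. 2.25


def mean_energy(n_particles, temperature):
    """Average energy E = (3/2) N k_B T of an ideal gas, Eq. 2.26."""
    return 1.5 * n_particles * k_B * temperature


def pressure(n_particles, temperature, volume):
    """Ideal gas equation of state solved for p, Eq. 2.27."""
    return n_particles * k_B * temperature / volume


def relative_energy_fluctuation(n_particles):
    """Relative energy fluctuation sigma_E / <H>, Eq. 2.17."""
    return math.sqrt(2.0 / (3.0 * n_particles))


if __name__ == "__main__":
    T = 293.0    # room temperature in K
    V = 0.024    # illustrative volume (24 litres) in m^3
    print(f"k_B       = {k_B:.3e} J/K")
    print(f"E         = {mean_energy(N_A, T):.0f} J for one mole")
    print(f"p         = {pressure(N_A, T, V):.3e} Pa")
    print(f"sigma_E/E = {relative_energy_fluctuation(N_A):.1e}")
```

For one mole of gas the relative energy fluctuation is of the order of $10^{-12}$, which illustrates why the canonical and the thermally isolated system are indistinguishable in practice.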
The first relation, Eq. 2.26, states that each gas molecule has on average an energy of $(3/2) k_B T$; this is, as we can see from Eq. 2.9, its kinetic energy. The temperature of a gas is thus a measure of the average kinetic energy of its molecules, which move faster on average inside a hotter gas. The second relation states how these molecules exert a force when they bounce off the inner side of the wall of the balloon, Fig. 2.1, or the piston, Fig. 2.3. The hotter the gas, the faster the gas molecules and the larger the transferred momentum during collision. The larger the volume, the longer it takes for the molecules to hit the wall again, and the lower the average pressure. The quantity $k_B T$ is called the thermal energy. At room temperature, $T = 293$ K, one has
$$k_B T = 4.1\ \text{pN nm}. \tag{2.28}$$
It is worthwhile to remember this formula by heart (instead of Eqs. 2.2 and 2.25). As we have already seen in Fig. 1.6, typical sizes of objects of interest (proteins, base pairs, etc.) are on the order of nanometers. Many of these molecules show configurational changes, often related to their biological function, that involve movements of some of their parts by distances comparable to their overall sizes. In addition, these molecules are constantly under bombardment by the surrounding smaller molecules (e.g., water molecules), which spontaneously provide amounts of energy of the order of the thermal energy. Many biological macromolecules or their complexes seem to have been tuned by nature to require energies of the order of the thermal energy in order to be able to perform such configurational changes spontaneously. A beautiful way to study such molecules is to pull on them with small forces, which is experimentally feasible nowadays, as we shall discuss in later chapters. Typically something interesting happens to the molecules, like a configurational rearrangement, when the forces are in the pN range. This is precisely what we should expect based on the above argument, together with Eq. 2.28. Likewise there are naturally occurring molecular motors that can exert forces and these are again in the pN range. Equation 2.28 is therefore important in biophysics because it relates energies, forces and length scales relevant for biomacromolecules inside the cell and inside the test tube.

Let us now come to the second case, the case of a system that exchanges energy and particles with its surroundings. In this case only the expectation values of the energy, $E = \langle H \rangle$, and the particle number, $N = \langle N \rangle$, can be given. This is the so-called grandcanonical ensemble. The density distribution $\rho$ is now of the form:
$$\rho = \frac{1}{Z_G}\, e^{\alpha N - \beta H}. \tag{2.29}$$
The grandcanonical partition function is a summation and integration over all possible states of the system, each state weighted with $\rho$. This means we have to sum over all particle numbers and then, for each number, over the positions and momenta of all the particles:
$$Z_G = \sum_{N=0}^{\infty} \frac{1}{N!\,h^{3N}} \int e^{\alpha N - \beta H(q, p)}\, d^{3N}q\, d^{3N}p. \tag{2.30}$$
This can be rewritten as
$$Z_G = \sum_{N=0}^{\infty} e^{\alpha N} Z_N = \sum_{N=0}^{\infty} z^N Z_N \tag{2.31}$$
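where $Z_N$ is the canonical partition function of a system of $N$ particles, i.e., the quantity that we called $Z$ in Eq. 2.7. On the rhs of Eq. 2.31 we introduced the so-called fugacity $z = e^{\alpha}$. Using arguments similar to the ones that led to Eqs. 2.14 and 2.15, it is straightforward to see that
$$\begin{aligned} E = \langle H \rangle &= -\frac{\partial}{\partial \beta} \ln Z_G, & \sigma_E^2 &= \frac{\partial^2}{\partial \beta^2} \ln Z_G, \\ N = \langle N \rangle &= \frac{\partial}{\partial \alpha} \ln Z_G, & \sigma_N^2 &= \frac{\partial^2}{\partial \alpha^2} \ln Z_G. \end{aligned} \tag{2.32}$$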
Just as in Eq. 2.17, the relative fluctuations in energy and particle number, $\sigma_E / E$ and $\sigma_N / N$, for large $N$ become so small that the grandcanonical ensemble with mean energy $E$ and mean particle number $N$ becomes physically equivalent to the canonical ensemble with mean energy $E$ and exact particle number $N$. So it is often just a matter of convenience which ensemble one chooses. Many calculations are more convenient in the grandcanonical ensemble since one does not have such a strict condition on $N$.

Let us again consider the ideal gas. Inserting Eq. 2.12 into Eq. 2.31 we find its grandcanonical partition function
$$Z_G = \sum_{N=0}^{\infty} z^N Z_N = \sum_{N=0}^{\infty} \frac{1}{N!} \left( \frac{zV}{\lambda_T^3} \right)^N = e^{zV/\lambda_T^3}. \tag{2.33}$$
To arrive at the rhs we used the fact that $1 + x/1! + x^2/2! + \cdots$ is just the series expansion of the exponential function, $e^x$. The expectation value of the particle number follows from Eq. 2.32
$$N = \frac{\partial}{\partial \alpha} \ln Z_G = \frac{zV}{\lambda_T^3} \tag{2.34}$$
and that of the energy as well
$$E = -\frac{\partial}{\partial \beta} \ln Z_G = \frac{3}{2} k_B T\, \frac{zV}{\lambda_T^3} = \frac{3}{2} N k_B T. \tag{2.35}$$
This is equivalent to Eq. 2.26 but $N$ is now, strictly speaking, $\langle N \rangle$. The pressure formula, Eq. 2.27, follows even more directly from these relations, as we shall see below (see Eq. 2.71).
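The sum in Eq. 2.33 and the derivatives in Eq. 2.32 can be checked numerically. The sketch below abbreviates $x = zV/\lambda_T^3$ and weights each particle number $N$ with $x^N/N!$; the function names and the truncation of the sum at `n_max` are assumptions made only for this illustration.

```python
import math


def grand_partition_sum(x, n_max=200):
    """Truncated sum over N of x**N / N!, with x = z V / lambda_T**3 (Eq. 2.33)."""
    term, total = 1.0, 1.0              # N = 0 term
    for n in range(1, n_max + 1):
        term *= x / n                   # builds x**n / n! without huge factorials
        total += term
    return total


def particle_number_stats(x, n_max=200):
    """Mean and variance of N when state N carries the weight x**N / N!."""
    z_g = grand_partition_sum(x, n_max)
    term, first, second = 1.0, 0.0, 0.0
    for n in range(1, n_max + 1):
        term *= x / n
        first += n * term
        second += n * n * term
    mean = first / z_g
    variance = second / z_g - mean**2
    return mean, variance


if __name__ == "__main__":
    x = 10.0
    mean, variance = particle_number_stats(x)
    print(grand_partition_sum(x), math.exp(x))   # both sides of Eq. 2.33
    print(mean, variance)                        # both close to x
```

Mean and variance both equal $x$, so that $\sigma_N / N = 1/\sqrt{N}$, in line with the statement that the relative fluctuations vanish for large systems.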
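2.2 Applications

Here we give three applications of what we just learned. These applications will become important later in the book. We first present the equipartition theorem which says that for a classical canonical distribution one has
$$\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle = \delta_{ij}\, k_B T \tag{2.36}$$
where $x_i$ denotes any position or momentum coordinate. $\delta_{ij}$ is called the Kronecker delta and is defined as $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ otherwise. Equation 2.36 can be proven via integration by parts:
$$\begin{aligned} \left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle &= -\frac{1}{Z}\,\frac{k_B T}{N!\,h^{3N}} \int x_i\, \frac{\partial}{\partial x_j} e^{-H/k_B T}\, d^{3N}q\, d^{3N}p \\ &= \frac{1}{Z}\,\frac{k_B T}{N!\,h^{3N}} \int \frac{\partial x_i}{\partial x_j}\, e^{-H/k_B T}\, d^{3N}q\, d^{3N}p = k_B T\, \delta_{ij}. \end{aligned} \tag{2.37}$$
As an example consider again the ideal gas where the Hamiltonian $H$ is given by Eq. 2.9. The average kinetic energy of particle $i$ can be rewritten as
$$\left\langle \frac{\mathbf{p}_i^2}{2m} \right\rangle = \frac{1}{2} \left\langle p_i^x \frac{\partial H}{\partial p_i^x} + p_i^y \frac{\partial H}{\partial p_i^y} + p_i^z \frac{\partial H}{\partial p_i^z} \right\rangle \tag{2.38}$$
that leads together with Eq. 2.36 to
$$\left\langle \frac{\mathbf{p}_i^2}{2m} \right\rangle = \frac{3}{2} k_B T. \tag{2.39}$$

Figure 2.4 The barometric formula describes the density $w(z)$ of the atmosphere as a function of $z$, the height above the surface.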
For a gas with $N$ particles, we recover Eq. 2.26. In general, we can say that any coordinate that appears as a quadratic term $a_j x_j^2$ in the Hamiltonian adds $k_B T/2$ to the average energy (independent of the value of the factor $a_j$). One often calls these coordinates degrees of freedom and says that each degree of freedom contributes one half $k_B T$. We shall see an application of this result later when we study the stretching of DNA under an externally applied force.

As a second example we discuss the so-called barometric formula which describes the density of the atmosphere above the surface of the earth. In this case the Hamiltonian contains, besides the kinetic energies of the particles, also the potential of each particle in the gravitational field:
$$H(p, q) = \sum_{i=1}^{N} \left( \frac{\mathbf{p}_i^2}{2m} + m g z_i \right). \tag{2.40}$$
Here $z_i$ measures the height of particle $i$ above the ground and $g$ is the gravitational acceleration, on earth $g = 9.8\ \text{m/s}^2$. The barometric formula gives the probability distribution of a given particle as a function of its height. This is obtained by integrating the density distribution, Eq. 2.5, over everything we are not interested in, i.e., over the momenta of all $N$ particles and over the $3(N - 1)$ position coordinates of all the other particles. We find
$$w(x, y, z) = \frac{e^{-\frac{mgz}{k_B T}}}{\int_0^{\infty} e^{-\frac{mgz}{k_B T}}\, dz} = \frac{mg}{k_B T}\, e^{-\frac{mgz}{k_B T}}. \tag{2.41}$$

Figure 2.5 The Maxwell distribution, Eq. 2.44, gives the distribution of momenta (or velocities) of a not necessarily ideal gas. Shown is also a balloon with a snapshot of gas molecules indicating to which part of the distribution each molecule contributes.
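The decay length of the exponential in Eq. 2.41 is $k_B T/(mg)$. A rough sketch of its size is given below; it assumes an isothermal atmosphere at $T = 293$ K made of nitrogen molecules with a mass of about 28 atomic mass units, both of which are simplifying assumptions introduced here and not taken from the text, as are the function names.

```python
import math

K_B = 1.380649e-23                 # Boltzmann constant in J/K
G_EARTH = 9.8                      # gravitational acceleration in m/s^2
ATOMIC_MASS_UNIT = 1.66053907e-27  # in kg
N2_MASS = 28.0 * ATOMIC_MASS_UNIT  # assumed molecular mass (nitrogen)


def scale_height(mass, temperature):
    """Decay length k_B T / (m g) of the exponential in Eq. 2.41, in metres."""
    return K_B * temperature / (mass * G_EARTH)


def height_density(z, mass, temperature):
    """Probability density (m g / k_B T) exp(-m g z / k_B T) of Eq. 2.41."""
    h = scale_height(mass, temperature)
    return math.exp(-z / h) / h


def fraction_above(z, mass, temperature):
    """Probability to find the particle above height z (integral of Eq. 2.41)."""
    return math.exp(-z / scale_height(mass, temperature))


if __name__ == "__main__":
    T = 293.0
    print(f"scale height: {scale_height(N2_MASS, T) / 1000:.1f} km")
    print(f"fraction above 5 km: {fraction_above(5000.0, N2_MASS, T):.2f}")
```

Under these assumptions the density drops by a factor $e$ roughly every 9 km.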
Equation 2.41 states that the probability to find the particle at a certain height above the surface of the earth decreases exponentially with that height. Of course, this also applies to the density of the entire gas. A snapshot of such a gas in a gravitational field is provided in Fig. 2.4 together with the exponentially decaying density distribution. Later, in Chapter 7, we shall compare this formula to that of the density distribution of ions around charged macromolecules like DNA and proteins, which shows a different behavior.

Finally, we discuss the Maxwell velocity distribution. We consider a system of particles that is given by the Hamiltonian
$$H(p, q) = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m} + W(q) \tag{2.42}$$
where $W(q)$ can be any potential that depends on the positions of the particles. To learn about the distribution of the particle momenta (or velocities) we use a similar trick as above when we derived the barometric formula: we integrate the Boltzmann distribution, Eq. 2.5, over everything we are not interested in. In this case we integrate over all the positions and over $3(N - 1)$ momenta and find
$$w(\mathbf{p}) = \frac{1}{(2\pi m k_B T)^{3/2}}\, e^{-\frac{p^2}{2 m k_B T}}. \tag{2.43}$$
The probability that the length $p = |\mathbf{p}|$ of the momentum vector $\mathbf{p}$ lies in the small interval $[p, p + dp]$ follows by integrating over a spherical shell of thickness $dp$ in the three-dimensional $\mathbf{p}$-space:
$$w(p)\, dp = \frac{4\pi p^2}{(2\pi m k_B T)^{3/2}}\, e^{-\frac{p^2}{2 m k_B T}}\, dp. \tag{2.44}$$
Note that the potential W (q) does not enter in this result. Equation 2.44 therefore applies not only to an ideal gas, but also to a gas in which particles interact with each other or particles in an external potential, as in the case of a gravitational field discussed above. As shown in Fig. 2.5 w ( p) has its maximum at intermediate values whereas it decays to zero for p → 0 and for p → ∞. In the former case, this is because there are fewer and fewer states with small momenta; in the latter case, this is just a reflection of the penalty for high energy states through the Boltzmann weight, Eq. 2.43.
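Equation 2.44 can be explored numerically. The sketch below works with the single parameter $m k_B T$ (set to 1 as an arbitrary choice of units) and uses a simple midpoint rule; the helper names are hypothetical. It checks that $w(p)$ is normalized, that the average kinetic energy is $(3/2) k_B T$ as required by Eq. 2.39, and that the maximum lies at $p = \sqrt{2 m k_B T}$.

```python
import math


def maxwell_w(p, m_kt):
    """Eq. 2.44: probability density of |p|; m_kt stands for the product m k_B T."""
    return (4.0 * math.pi * p**2 * math.exp(-p**2 / (2.0 * m_kt))
            / (2.0 * math.pi * m_kt) ** 1.5)


def integrate(func, a, b, steps=20000):
    """Composite midpoint rule on the interval [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


M_KT = 1.0       # arbitrary units with m k_B T = 1
P_CUT = 12.0     # upper cutoff; the tail beyond it is negligible for M_KT = 1

norm = integrate(lambda p: maxwell_w(p, M_KT), 0.0, P_CUT)

# average kinetic energy p^2 / (2m) measured in units of k_B T
kinetic_over_kt = integrate(
    lambda p: p**2 / (2.0 * M_KT) * maxwell_w(p, M_KT), 0.0, P_CUT)

p_peak = math.sqrt(2.0 * M_KT)   # position of the maximum of w(p)

if __name__ == "__main__":
    print(norm, kinetic_over_kt, p_peak, maxwell_w(p_peak, M_KT))
```

The height of the maximum evaluates to $4 e^{-1}/\sqrt{2\pi m k_B T}$.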
2.3 The Entropy

In this section we introduce a quantity that is crucial for the understanding of macroscopic systems: the entropy. As we shall see, the concept of entropy allows for a different, more convincing argument for the Boltzmann distribution, Eq. 2.5. Before we get to that, let us start with a simple model system where it is quite straightforward to grasp the ideas behind entropy, especially the relationship between a macroscopic state and its associated microscopic states.

The following system can be considered as an idealization of a so-called paramagnet. A paramagnet is a substance that consists of atoms that have magnetic dipole moments. The different dipoles do not feel each other and point in random directions. As a result, such a system shows no net macroscopic magnetization. The model consists of a collection of $N$ microscopic spins on a lattice as shown in Fig. 2.6. Each spin represents an atom sitting on the lattice of a solid, in contrast to a gas, Fig. 2.1, where the atoms can move freely in space. We call the spin at the $i$-th site $s_i$ and assume that it can take either the value $+1$ or $-1$ with a corresponding magnetic moment $+\mu$ or $-\mu$. This leads to the overall magnetization
$$M = \mu \sum_{i=1}^{N} s_i. \tag{2.45}$$

Figure 2.6 A system of $N$ non-interacting spins $s_1, s_2, \ldots, s_N$. Each spin can point either up or down.

We assume that the spins do not interact with each other. We also assume that there is no energy change involved when a spin flips from one value to the other. This means that all states have exactly the same energy. Therefore each microscopic state $\{s_1, s_2, \ldots, s_N\}$ is as likely to occur as any other. Due to the thermal environment, the spins in a paramagnet permanently flip back and forth. We should therefore expect, if we look at such a system long enough, to measure any value of $M$ between $-\mu N$ and $+\mu N$. However, for a large system, $N \gg 1$, a paramagnetic substance always (“always” not in the strict mathematical sense but almost always during the lifetime of the universe) shows an extremely small value, $|M| \ll \mu N$. How is this possible? To understand this, we need to examine the possible number of microstates that correspond to a given macrostate, i.e., a state with a given value $M$ of magnetization. If we find a macrostate $M$, then there must be $k$ spins pointing up (and hence $N - k$ spins pointing down) such that
$$M = \mu k - \mu (N - k) = \mu (2k - N). \tag{2.46}$$
Let us determine the number of microstates that have this property. This is a simple problem in combinatorics. There are
$$\binom{N}{k} = \frac{N!}{k!\,(N - k)!} \tag{2.47}$$
possible combinations of spins where $k$ spins point up and $N - k$ point down. The quantity $\binom{N}{k}$ is called the binomial coefficient with $N! = N (N - 1) \cdots 2 \times 1$. (If you are unfamiliar with this, think about the number of ways to place $N$ numbered socks in $N$ drawers, one sock per drawer. There are $N!$ ways to do this. If $k$ of these socks are red and the remaining $N - k$ blue, then there are fewer possible distributions if we only care about color, namely $\binom{N}{k}$, accounting for the $k!$ and $(N - k)!$ permutations between equally colored socks.) The point now is that for large $N$ there are overwhelmingly more configurations that lead to a vanishing $M$, $k = N/2$, than there are states for which $M$ takes its possible maximal value, $M = \mu N$. For the latter case there is obviously only one such state, namely all spins pointing up, whereas the former case can be achieved in $\binom{N}{N/2}$ different ways. To get a better understanding of how big this number is, we employ Stirling’s formula, which gives the leading behavior of $N!$ for large values of $N$:
$$N! \xrightarrow{N \to \infty} \left( \frac{N}{e} \right)^N \sqrt{2\pi N}. \tag{2.48}$$
Equation 2.48 holds up to additional terms that are of the order $1/N$ smaller and can thus be neglected for large values of $N$. Combining Eqs. 2.47 and 2.48 it is straightforward to show that the number $N_{\max}$ of spin configurations that lead to $M = 0$ obeys
$$N_{\max} = \binom{N}{N/2} \approx \sqrt{\frac{2}{\pi}}\, \frac{2^N}{N^{1/2}}. \tag{2.49}$$
As you can see, $N_{\max}$ grows exponentially with $N$, $N_{\max} \sim 2^N$. Macroscopic systems may contain something like $10^{23}$ spins which means that there is an astronomically large number of states with $M = 0$ (namely a 1 with $10^{22}$ zeros, a number much larger than anything you might have encountered before), compared to one state with $M = \mu N$. Let us call $N_{\text{micro}}(M)$ the number of microstates corresponding to a given macrostate characterized by $M$. In Appendix B it is shown that to a good approximation
$$N_{\text{micro}}(M) = N_{\max}\, e^{-\frac{M^2}{2 \mu^2 N}}. \tag{2.50}$$
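For moderate $N$ the approximations in Eqs. 2.49 and 2.50 can be compared with the exact binomial coefficient of Eq. 2.47. The following is a small sketch with hypothetical function names; it uses floating point powers of 2 and is therefore only meant for $N$ up to about a thousand.

```python
import math


def n_micro_exact(n_spins, k_up):
    """Exact number of microstates with k_up spins pointing up, Eq. 2.47."""
    return math.comb(n_spins, k_up)


def n_max_stirling(n_spins):
    """Approximate number of microstates with M = 0, Eq. 2.49."""
    return math.sqrt(2.0 / math.pi) * 2.0**n_spins / math.sqrt(n_spins)


def n_micro_gaussian(n_spins, k_up):
    """Gaussian approximation Eq. 2.50, using M / mu = 2k - N from Eq. 2.46."""
    m_over_mu = 2 * k_up - n_spins
    return n_max_stirling(n_spins) * math.exp(-m_over_mu**2 / (2.0 * n_spins))


if __name__ == "__main__":
    n = 100
    for k in (50, 55, 60, 70):
        exact = n_micro_exact(n, k)
        approx = n_micro_gaussian(n, k)
        print(f"k = {k}: exact {exact:.3e}, approx {approx:.3e}")
```

Already for $N = 100$ the two expressions agree to within a few percent near the maximum, while the single state with all spins up is outnumbered by about $10^{29}$ states with $M = 0$.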
This function is sharply peaked around $M = 0$ with the value $N_{\max}$ given by Eq. 2.49. It decays rapidly when one moves away from $M = 0$, e.g., it has decayed to $N_{\max}/e$ for $M = \pm\mu\sqrt{2N}$, a value much smaller than the maximal possible magnetization $\pm\mu N$. Suppose we could somehow start with some macroscopic state with a large value of $M$. Over the course of time the spins randomly flip back and forth. Given enough time it is overwhelmingly probable that $M$ will have values that stay in an extremely narrow range around $M = 0$, simply because there are so many more microstates available with tiny $M$-values than with larger $M$-values. Therefore it is just an effect of probabilities that a paramagnetic substance shows (close to) zero magnetization.

We can formulate this in a slightly different way. A macroscopic system will go to the state where there is the largest number of microstates to a given macrostate. This state is called the equilibrium state, because once the system has reached this state, it does not leave it anymore, not because this is impossible in principle, but because it is overwhelmingly improbable. We can also say the following: Of all the possible macroscopic states, the system chooses the one for which our ignorance of the microstate is maximal. If we measure $M = \mu N$ we would know for sure the microstate of the system, but if we measure $M = 0$ we only know that our system is in one of about $2^N / N^{1/2}$ (see Eq. 2.49) possible states. We introduce a quantity that measures our ignorance about the microstate. If we require that this quantity is additive in the sense that if we have two independent (sub)systems our ignorance of the two systems is simply the sum of the two, then we should choose this quantity, the so-called entropy, to be given by
$$S = k_B \ln N_{\text{micro}}. \tag{2.51}$$
The prefactor is in principle arbitrary, yet it is convention to choose it equal to the Boltzmann constant $k_B$, the quantity introduced in Eq. 2.25. A macroscopic system will always, given enough time, find the macroscopic state that maximizes its entropy. Let us reformulate Eq. 2.51. Suppose we know the macrostate of the system, namely that there are $k$ spins pointing up. Then each of the microstates corresponding to that macrostate has the same probability $p_k = 1/N_{\text{micro}}$. We can then rewrite Eq. 2.51 as follows:
$$S = -k_B \ln p_k. \tag{2.52}$$
When $k$, and therefore $M$, changes, the entropy changes. Since the entropy is extremely sharply peaked around $k = N/2$, the system will spontaneously reach states around $k \approx N/2$ and never deviate from this anymore, not because it is forbidden, but because it is extremely improbable. In Chapter 3 we shall discuss a system that is very similar to the spin system considered here, namely a polymer under tension. It will serve as an example from daily life in which we encounter such an entropic effect.

The goal in the following is to extend the concept of entropy to a system like our gas in a balloon. Also there we expect that the system goes to a macrostate with the largest number of microstates or, in other words, to the macrostate for which we know least about the microstate, the state of maximal entropy. In this case, however, there is a complication. We had required that the average energy has a certain value, $\langle H \rangle = E$, cf. Eq. 2.16. So we need to maximize the entropy with the constraint
$$\langle H \rangle = \sum_i p_i E_i = E. \tag{2.53}$$
Here we assume that the states are discrete, which (as outlined above) should in principle always be assumed due to the uncertainty principle. We already know from the previous section that the probabilities of states with different energies are different. Extending Eq. 2.52 we now define the entropy as our average ignorance about the system:
$$S = -k_B \sum_i p_i \ln p_i. \tag{2.54}$$
What we need to do is to maximize $S$, Eq. 2.54, under the constraint of having a certain average energy, Eq. 2.53. This can be achieved using the method of Lagrange multipliers. Suppose you want to maximize the function $f(x_1, \ldots, x_m)$. If this function has a maximum, it must be one of the points where the function has zero slope, i.e., where its gradient vanishes: $\nabla f = 0$ with $\nabla = (\partial/\partial x_1, \ldots, \partial/\partial x_m)$. However, what do we have to do if there is an additional constraint, $g(x_1, \ldots, x_m) = C$ with $C$ being a constant? This constraint defines an $(m - 1)$-dimensional surface in the $m$-dimensional parameter space.

Figure 2.7 The method of the Lagrange multiplier. The objective is to find the maximum of the function $f(x_1, x_2)$ under the constraint $g(x_1, x_2) = C$. Shown are lines of equal height of $f$ (purple curves) and of $g$ (blue curves). The red point indicates the maximum of interest. It is the highest point of $f$ on the line defined by $g = C$. At this point the gradients of the two height profiles are parallel or antiparallel (case shown here). This means there exists a number $\lambda \neq 0$, called the Lagrange multiplier, for which $\nabla f = \lambda \nabla g$.

Figure 2.7 explains the situation for $m = 2$. In that case $f(x_1, x_2)$ gives the height above (or below) the $(x_1, x_2)$-plane. As in a cartographic map we can draw contour lines for this function. The constraint $g(x_1, x_2) = C$ defines a single line $g_C$ (or combinations thereof) in the landscape. The line $g_C$ crosses contour lines of $f$. We are looking for the highest value of $f$ on $g_C$. It is straightforward to convince oneself that this value occurs when $g_C$ touches a contour line of $f$ (if it crosses a contour line one can always find a contour line with a higher value of $f$ that still crosses the $g_C$-line). Since $g_C$ and the particular contour line of $f$ touch tangentially, the gradients of the two functions at the touching point are parallel or antiparallel. In other words, at this point a number $\lambda$ exists (positive or negative), called the Lagrange multiplier, for which
$$\nabla (f - \lambda g) = 0. \tag{2.55}$$
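A toy example may help to see Eq. 2.55 at work. The functions below are hypothetical choices made only for illustration: we maximize $f = x_1 x_2$ on the line $g = x_1 + x_2 = C$ by brute-force scanning along the constraint, and then check that the gradients are parallel at the point found.

```python
def f(x1, x2):
    """Hypothetical objective function f = x1 * x2."""
    return x1 * x2


def grad_f(x1, x2):
    """Gradient of f."""
    return (x2, x1)


def grad_g(x1, x2):
    """Gradient of the hypothetical constraint function g = x1 + x2."""
    return (1.0, 1.0)


def constrained_maximum(c, steps=1000):
    """Scan the line x1 + x2 = c (0 <= x1 <= c) and return the point of largest f."""
    points = [(i * c / steps, c - i * c / steps) for i in range(steps + 1)]
    return max(points, key=lambda point: f(*point))


C = 2.0
x1, x2 = constrained_maximum(C)

# Lagrange multiplier from the first component of grad f = lambda * grad g
lam = grad_f(x1, x2)[0] / grad_g(x1, x2)[0]

# both components of grad(f - lambda * g) should vanish at the maximum
residual = [df - lam * dg for df, dg in zip(grad_f(x1, x2), grad_g(x1, x2))]

if __name__ == "__main__":
    print((x1, x2), lam, residual)
```

The scan finds $x_1 = x_2 = C/2$, and with $\lambda = C/2$ both components of $\nabla(f - \lambda g)$ vanish there.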
Let us use this method in the context of the entropy. We want to find the maximum of $S/k_B$, a function depending on the parameters $p_1, \ldots, p_{N_{\text{tot}}}$ where $p_i$ denotes the yet unknown probability of the $i$-th of the $N_{\text{tot}}$ microstates. In addition we need to fulfill the constraint 2.53. This leads to a condition equivalent to Eq. 2.55, namely
$$\nabla \left( S/k_B - \beta \langle H \rangle \right) = 0, \tag{2.56}$$
with $\nabla = (\partial/\partial p_1, \ldots, \partial/\partial p_{N_{\text{tot}}})$ and $\beta$ a Lagrange multiplier. For each $i$, $i = 1, \ldots, N_{\text{tot}}$, we find the condition
$$\frac{\partial}{\partial p_i} \left( \frac{S}{k_B} - \beta \langle H \rangle \right) = -\ln p_i - 1 - \beta E_i = 0. \tag{2.57}$$
This leads to $p_i \sim e^{-\beta E_i}$ which then still needs to be normalized to one, leading to
$$p_i = \frac{1}{Z}\, e^{-\beta E_i}. \tag{2.58}$$
This means that we again recover the Boltzmann distribution, Eq. 2.5, using a different line of argument. Whereas the previous argument combined the arguments regarding conserved physical quantities and independence of subsystems, the current argument simply looked for the probability distribution that, for a given average energy, maximizes the entropy of the system. The inverse temperature $\beta$ plays here the role of a Lagrange multiplier.

Inserting the Boltzmann distribution, Eq. 2.58, into the entropy, Eq. 2.54, one finds
$$S = -k_B \sum_i \frac{1}{Z}\, e^{-\beta E_i} \left( -\ln Z - \beta E_i \right) = k_B \ln Z + \frac{1}{T} \langle H \rangle. \tag{2.59}$$
Solving this relation for $-k_B T \ln Z$ leads to
$$F \equiv -k_B T \ln Z = E - T S. \tag{2.60}$$
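Both statements, that the Boltzmann distribution maximizes the entropy at fixed average energy and that $-k_B T \ln Z = E - TS$, can be verified for a small discrete system. The three equally spaced energy levels below are a hypothetical example, and the code uses units with $k_B = 1$ (so that $T = 1/\beta$); neither choice comes from the text.

```python
import math


def boltzmann(energies, beta):
    """Probabilities p_i = exp(-beta E_i) / Z of Eq. 2.58, and Z itself."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z


def entropy(probs):
    """Entropy of Eq. 2.54 in units of k_B."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_energy(probs, energies):
    """Average energy of Eq. 2.53."""
    return sum(p * e for p, e in zip(probs, energies))


ENERGIES = [0.0, 1.0, 2.0]   # hypothetical, equally spaced levels
BETA = 1.0                   # inverse temperature in units with k_B = 1

probs, z = boltzmann(ENERGIES, BETA)
e_mean = mean_energy(probs, ENERGIES)
s = entropy(probs)
free_energy = -math.log(z) / BETA          # F = -k_B T ln Z, Eq. 2.60

# A perturbation (eps, -2 eps, eps) leaves both the normalization and, for
# these equally spaced levels, the average energy unchanged.
EPS = 0.01
perturbed = [probs[0] + EPS, probs[1] - 2.0 * EPS, probs[2] + EPS]

if __name__ == "__main__":
    print(free_energy, e_mean - s / BETA)      # the two sides of Eq. 2.60
    print(s, entropy(perturbed))               # the perturbed entropy is lower
```

Any other admissible perturbation of the probabilities lowers the entropy as well, which is the content of Eq. 2.56.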
From the partition function $Z$ follows thus immediately the difference between $E$, the internal energy of the system, and $TS$, the entropy multiplied by the temperature. The quantity $F$ is called free energy. Since the quantity $S/k_B - \beta E$ is maximized in equilibrium, cf. Eq. 2.56, the free energy has to be minimized to find the most probable macrostate characterized by the temperature, volume and number of particles. $F$ is thus a function of these quantities, i.e., $F = F(T, V, N)$. The free energy is an example of a so-called thermodynamic potential. Knowing $F$ allows one to directly determine average quantities via differentiation, e.g., by combining Eqs. 2.22 and 2.60 we find
$$p = -\frac{\partial F}{\partial V}. \tag{2.61}$$
Let us again consider the ideal gas as an example. The free energy follows from Eq. 2.12:
$$F = -k_B T \ln \left[ \frac{1}{N!} \left( \frac{V}{\lambda_T^3} \right)^N \right] \approx k_B T N \left[ \ln \left( \frac{\lambda_T^3 N}{V} \right) - 1 \right]. \tag{2.62}$$
On the rhs we used Stirling’s formula, Eq. 2.48, and then neglected the term $(k_B T/2) \ln (2\pi N)$, which is much smaller than the other terms. The pressure follows by differentiation of Eq. 2.62 with respect to $V$, see Eq. 2.61, leading again to $p = k_B T N/V$.

The method of Lagrange multipliers can also be used to derive the grandcanonical ensemble. Maximizing the entropy with two constraints, $E = \langle H \rangle$ and $N = \langle N \rangle$, can be done analogously to the canonical case, Eq. 2.56, and leads to the condition
$$\nabla \left( \frac{S}{k_B} - \beta \langle H \rangle + \alpha \langle N \rangle \right) = 0. \tag{2.63}$$
The requirement is thus
$$\frac{\partial}{\partial p_i} \left( \frac{S}{k_B} - \beta \langle H \rangle + \alpha \langle N \rangle \right) = -\ln p_i - 1 - \beta E_i + \alpha N_i = 0. \tag{2.64}$$
This leads directly to the Boltzmann factor for the grandcanonical case, Eq. 2.29. Inserting this distribution into the entropy, Eq. 2.54, we obtain
$$S = \frac{k_B}{Z_G} \sum_i e^{-\beta E_i + \alpha N_i} \left( \ln Z_G + \beta E_i - \alpha N_i \right) = k_B \ln Z_G + \frac{1}{T} \langle H \rangle - k_B\, \alpha \langle N \rangle. \tag{2.65}$$
Solving this relation for $-k_B T \ln Z_G$ leads to
$$K = K(T, \mu, V) \equiv -k_B T \ln Z_G = E - T S - \mu N \tag{2.66}$$
where we introduced the quantity $\mu = \alpha/\beta$, called the chemical potential. From $Z_G$ follows thus the difference between the internal energy $E$ and $TS + \mu N$. The thermodynamic potential $K$ is called the grandcanonical potential or Gibbs potential. Surprisingly the grandcanonical potential is directly related to the pressure of the system:
$$K = -pV. \tag{2.67}$$
To see this, we start from the fact that $E$, $S$, $V$ and $N$ are so-called extensive quantities, i.e., quantities that are additive. For instance, let us look again at the gas-filled balloon, Fig. 2.1: The volume of the whole system is simply the sum of the volumes of the subsystems 1 and 2 and so are the energies, particle numbers and entropies. On the other hand, the temperature $T$, the pressure $p$ and the chemical potential $\mu$ are intensive quantities. For systems in equilibrium such quantities have the same value in the full system and in all its subsystems. Products of an intensive and an extensive quantity like $TS$ are therefore also extensive. From this follows that the Gibbs potential $K$ is an extensive quantity since all of its terms, $E$, $-TS$ and $-\mu N$, are extensive. This means that $K$ fulfills the relation
$$K(T, \mu, \lambda V) = \lambda K(T, \mu, V) \tag{2.68}$$
for any value of $\lambda > 0$. If we choose e.g., $\lambda = 1/2$, then the lhs of Eq. 2.68 gives the Gibbs potential of a subsystem with half the volume of the full system. Its Gibbs potential is half of that of the full system (rhs of Eq. 2.68). We now take the derivative with respect to $\lambda$ on both sides of Eq. 2.68 and then set $\lambda = 1$. This leads to
$$\frac{\partial K}{\partial V}\, V = K. \tag{2.69}$$
In complete analogy to the derivation of the relation for the free energy in Eq. 2.61, one can now show that $p = -\partial K/\partial V$ and thus
$$pV = -K = k_B T \ln Z_G. \tag{2.70}$$
So we can obtain the pressure from $Z_G$ immediately. For instance, for the ideal gas we calculated $Z_G$ in Eq. 2.33 from which follows
$$pV = k_B T\, \frac{zV}{\lambda_T^3} = N k_B T \tag{2.71}$$
where we used Eq. 2.34. We derived again the ideal gas equation of state, Eq. 2.27.

Finally, let us take one more close look at the previously discussed example of a gas in a cylinder, Fig. 2.3. We have been a bit sloppy since we wrote in the legend of that figure that “the piston is under an externally imposed force $f$,” but then calculated instead the expectation value of the force for a given volume, cf. Eqs. 2.20 to 2.23. If we want to be formally correct, then we need to maximize the entropy under the two constraints $\langle V \rangle = V$ and $\langle H \rangle = E$. This is achieved by solving the following set of conditions
$$\frac{\partial}{\partial p_i} \left( \frac{S}{k_B} - \beta \langle H \rangle - \gamma \langle V \rangle \right) = -\ln p_i - 1 - \beta E_i - \gamma V_i = 0 \tag{2.72}$$
where we introduced the additional Lagrange multiplier $\gamma$. Along similar lines that led us to the grandcanonical potential, Eq. 2.66, we find a thermodynamic potential $G = E - TS + (\gamma/\beta) V$. The ratio of the two Lagrange parameters in front of $V$ is just the pressure, $p = \gamma/\beta$, as we shall see in a moment. The new thermodynamic potential
$$G(T, p, N) = F(T, V(T, p, N), N) + p V(T, p, N) \tag{2.73}$$
is called the free enthalpy $G$. We can immediately check
$$\frac{\partial G}{\partial p} = \frac{\partial F}{\partial V} \frac{\partial V}{\partial p} + V + p\, \frac{\partial V}{\partial p} = V \tag{2.74}$$
where we used Eq. 2.61. For the gas in a cylinder with a piston under an externally imposed force $f$ we need to calculate the partition function, Eq. 2.10, with an extra term $-fl$ in the Hamiltonian to account for the different heights $l$ of the piston. Along a similar line of arguments as before we arrive at
$$G(T, p, N) = k_B T N \ln \left( \frac{\lambda_T^3\, p}{k_B T} \right). \tag{2.75}$$
Inserting this into Eq. 2.74 one recovers indeed the ideal gas law, Eq. 2.27, but this time in the version $V = k_B T N/p$. The grandcanonical potential obeys a very simple relation, $K = -pV$ (cf. Eq. 2.70), and so does the free enthalpy. Using the same line of argument that led to Eq. 2.70 we find
$$G = \frac{\partial G}{\partial N}\, N = \mu N. \tag{2.76}$$
That $\partial G/\partial N$ is the chemical potential $\mu$ follows by comparing Eqs. 2.66 and 2.67 to Eq. 2.73.
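The relations between the thermodynamic potentials of the ideal gas can be checked with simple finite differences. The sketch below uses reduced units with $k_B T = 1$ and $\lambda_T = 1$ and arbitrary values for $N$ and $V$; these choices and the helper names are assumptions made only for this illustration.

```python
import math


def free_energy(volume, n, kt=1.0, lam=1.0):
    """Ideal gas free energy F(T, V, N), rhs of Eq. 2.62."""
    return kt * n * (math.log(lam**3 * n / volume) - 1.0)


def free_enthalpy(p, n, kt=1.0, lam=1.0):
    """Ideal gas free enthalpy G(T, p, N), Eq. 2.75."""
    return kt * n * math.log(lam**3 * p / kt)


def derivative(func, x, rel_step=1e-6):
    """Central finite difference with a step proportional to |x|."""
    h = rel_step * abs(x)
    return (func(x + h) - func(x - h)) / (2.0 * h)


N = 1000.0        # particle number (arbitrary)
V = 5000.0        # volume in units of lambda_T**3 (arbitrary)

p = -derivative(lambda v: free_energy(v, N), V)           # Eq. 2.61
v_from_g = derivative(lambda q: free_enthalpy(q, N), p)   # Eq. 2.74
mu = derivative(lambda n: free_enthalpy(p, n), N)         # chemical potential

if __name__ == "__main__":
    print(p, N / V)                                       # ideal gas law, k_B T = 1
    print(v_from_g, V)
    print(free_enthalpy(p, N), free_energy(V, N) + p * V) # Eq. 2.73
    print(free_enthalpy(p, N), mu * N)                    # Eq. 2.76
```

All four pairs of numbers agree up to the small discretization error of the finite differences.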
2.4 Particles with Interactions and Phase Transitions So far we have introduced some of the concepts of statistical physics and illustrated them with two model systems: the ideal gas (inside a balloon, Fig. 2.1, or inside a cylinder, Fig. 2.3) and a system of non-interacting spins, Fig. 2.6. These systems have in common that the individual particles or spins do not interact with each other. However, there are usually interactions in real systems. Gas
46 Statistical Physics
molecules cannot occupy the same point in space because of their excluded volume. In addition they typically feel some attractive force if they are at close distance. Also, spins often tend to align with respect to each other. Such systems typically show a spectacular phenomenon: phase transitions. A substance like water at high temperatures (above 100◦ C at atmospheric pressure) is in a gas phase (called vapor), but below that temperature it forms the much denser liquid phase. Similarly, some materials that behave like a paramagnet at high temperatures show ferromagnetic behavior below a certain temperature (about 770◦ C for iron) where they can display spontaneous magnetization. Such phase transitions are the result of the interactions between a huge number of particles or spins. In principle statistical physics is able to predict phase transitions by considering microscopic models with the appropriate interactions. For instance, one could replace the simple ideal gas system by a system of interacting particles where particle i at position qi feels an interaction potential w qi − q j with particle j at position q j . Starting from a Hamiltonian that includes such interactions one could calculate the partition function and from that the macroscopic properties of the system. In practice, however, it is incredibly hard to do such a calculation and there are very few cases where it is possible. Instead of doing the full calculation one usually has to rely on computer simulations, see Chapter 9, or on approximate methods to calculate the partition function. In the following we introduce an approximate method, the virial expansion, that allows us to calculate the partition function of a socalled real gas. This is a gas of molecules that interact with each other, i.e., a system that we expect to show a phase transition from a gas to a liquid. 
Unfortunately the approximation is only good as long as the gas is dilute and it breaks down before one reaches the phase transition to the liquid phase. Nevertheless, this method leads to an expression that will give us at least qualitative (but not quantitative) insights into the gas-liquid phase transition. The Hamiltonian of the real gas is of the following form:
\[
H(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \sum_{i<j} w\left(|\mathbf{q}_i - \mathbf{q}_j|\right). \tag{2.77}
\]
The first term represents the kinetic energy and is the same as for the ideal gas, Eq. 2.9. The second term accounts for the interactions between the particles. The sum goes over all pairs of particles ("$i < j$" makes sure that each pair is only counted once) and we assume that the interaction potential $w$ depends only on the distances between the particles. It is now most convenient to use the grandcanonical ensemble for which the partition function is of the form
\[
Z_G = 1 + \sum_{N=1}^{\infty} z^N Z_N = 1 + \sum_{N=1}^{\infty} \frac{1}{N!} \left(\frac{z}{\lambda_T^3}\right)^N I_N. \tag{2.78}
\]
The first step is just Eq. 2.31 where we wrote the $N = 0$ term separately. In the second step we inserted the explicit form of $Z_N$, Eq. 2.7, with $H(\mathbf{p}, \mathbf{q})$ given by Eq. 2.77, and immediately performed the integration over the momenta. $I_N$ thus denotes the remaining integral
\[
I_N = \int e^{-\beta \sum_{i<j} w(|\mathbf{q}_i - \mathbf{q}_j|)} \, d^3q_1 \ldots d^3q_N. \tag{2.79}
\]
Let us first again consider the ideal gas. In this case $I_N = V^N$ and thus $Z_G = e^{zV/\lambda_T^3}$, Eq. 2.33. From that result we derived above, in Eq. 2.34, that $N = zV/\lambda_T^3$. In other words, the quantity $z/\lambda_T^3$ (that also appears in Eq. 2.78) is in the case of the ideal gas precisely its density $n = N/V$.

Now consider a real gas. If this gas is sufficiently dilute, then the interactions between its particles constitute only a small effect. The ratio $z/\lambda_T^3$ is then very close to its density. Since we assumed here the density to be small, the quantity $z/\lambda_T^3$ is small as well. We can thus interpret Eq. 2.78 as a series expansion in that small parameter. From this expansion we can learn how the interaction between the particles influences the macroscopic behavior of the system, at least in the regime of a sufficiently dilute gas. In that regime, it is often sufficient to account only for the first or the first two correction terms since the higher-order terms are negligibly small.

Unfortunately the quantity $z/\lambda_T^3$ does not have as clear a physical meaning as the density $n$. But since both parameters are similar and small we can rewrite Eq. 2.78 to obtain a series expansion in $n$ instead of $z/\lambda_T^3$. This can be done in a few steps that we outline here, for simplicity only to second order in $\zeta = z/\lambda_T^3$. We start from
\[
Z_G = 1 + \zeta I_1 + \frac{\zeta^2}{2} I_2 + \cdots \tag{2.80}
\]
To obtain the density $n = N/V$ we need to calculate the expectation value of $N$ that follows from $\ln Z_G$ via Eq. 2.32. So next we have to find the expansion of $\ln Z_G$ starting from the expansion of $Z_G$. This is achieved by inserting $Z_G$ from Eq. 2.80 into $pV = k_B T \ln Z_G$, Eq. 2.70. To obtain again a series expansion in $\zeta$ we use the series expansion of $\ln(1 + x)$ around $x = 0$, $\ln(1 + x) = \sum_{k=1}^{\infty} (-1)^{k+1} x^k / k$. This leads to
\[
\begin{aligned}
\beta p V = \ln Z_G &= \ln\left(1 + \zeta I_1 + \frac{\zeta^2}{2} I_2 + \cdots\right) \\
&= \zeta I_1 + \frac{\zeta^2}{2} I_2 - \frac{\zeta^2 I_1^2}{2} + \cdots \\
&= \zeta I_1 + \frac{\zeta^2}{2} \left(I_2 - I_1^2\right) + \cdots
\end{aligned} \tag{2.81}
\]
When going from the first to the second line in Eq. 2.81 we neglected all terms of order higher than $\zeta^2$. The particle number follows by taking the derivative of $\ln Z_G$ with respect to $\alpha$, Eq. 2.32. Since $\zeta = e^{\alpha}/\lambda_T^3$ one has $\partial \zeta / \partial \alpha = \zeta$ and thus
\[
N = \frac{\partial \ln Z_G}{\partial \alpha} = \zeta I_1 + \zeta^2 \left(I_2 - I_1^2\right) + \cdots \tag{2.82}
\]
We are now in the position to write an expansion in the density $n = N/V$ (instead of in $\zeta$) by subtracting Eq. 2.82 from Eq. 2.81 and dividing by $V$. This leads to
\[
\beta p = \frac{N}{V} - \frac{\zeta^2}{2V} \left(I_2 - I_1^2\right) + \cdots \tag{2.83}
\]
With this step we got rid of terms linear in $\zeta$ but there is still a $\zeta^2$ term. This term can now easily be replaced by using Eq. 2.82 which states that $N = \zeta I_1$ up to terms of the order $\zeta^2$. We can thus replace the $\zeta^2$-term in Eq. 2.83 by $(N/I_1)^2$, neglecting terms of the order $\zeta^3$. We then arrive at
\[
\beta p = n - \frac{1}{2V} \left(\frac{N}{I_1}\right)^2 \left(I_2 - I_1^2\right) + \cdots \tag{2.84}
\]
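The chain of manipulations from Eq. 2.80 to Eq. 2.84 can be checked numerically. The following minimal Python sketch takes the truncated partition function at face value, computes $\beta p V$ and $N$ from it without further expansion, and compares the result with the right-hand side of Eq. 2.84. The values chosen for $I_1$, $I_2$ and $\zeta$ are arbitrary illustrative numbers (an assumption, not taken from the text); the only point is that the mismatch shrinks like $\zeta^3$.

```python
import math

def exact_truncated(zeta, i1, i2):
    """beta*p*V and N from Z_G = 1 + zeta*I1 + zeta^2*I2/2 (Eq. 2.80), not expanded further."""
    z_g = 1.0 + zeta * i1 + 0.5 * zeta**2 * i2
    beta_pv = math.log(z_g)                   # beta*p*V = ln Z_G
    n_avg = zeta * (i1 + zeta * i2) / z_g     # N = d(ln Z_G)/d(alpha) = zeta * d(ln Z_G)/d(zeta)
    return beta_pv, n_avg

def density_expansion(n_avg, i1, i2):
    """Right-hand side of Eq. 2.84, multiplied by V."""
    return n_avg - 0.5 * (n_avg / i1) ** 2 * (i2 - i1**2)

def mismatch(zeta, i1=1.0, i2=0.4):
    """Difference between the two routes; i1 and i2 are illustrative values only."""
    beta_pv, n_avg = exact_truncated(zeta, i1, i2)
    return abs(beta_pv - density_expansion(n_avg, i1, i2))

if __name__ == "__main__":
    for zeta in (1e-2, 1e-3):
        print(f"zeta = {zeta:g}: mismatch = {mismatch(zeta):.3e}")
```

Reducing $\zeta$ by a factor of ten should reduce the mismatch by roughly a factor of a thousand, as expected for an error of order $\zeta^3$.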
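To see that Eq. 2.84 is indeed an expansion in $n$, we need to evaluate the integrals $I_1$ and $I_2$ defined in Eq. 2.79. We find
\[
I_1 = \int_V d^3q = V \tag{2.85}
\]
and
\[
I_2 = \int_V d^3q_1 \int_V d^3q_2 \, e^{-\beta w(|\mathbf{q}_1 - \mathbf{q}_2|)} = \int_V d^3q_1 \int_{\text{``}V - \mathbf{q}_1\text{''}} d^3r \, e^{-\beta w(r)} \approx V \int_V e^{-\beta w(r)} \, d^3r. \tag{2.86}
\]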
The first step in Eq. 2.86 is simply the definition of $I_2$, Eq. 2.79. In the second step we substitute $\mathbf{q}_2$ by $\mathbf{r} = \mathbf{q}_2 - \mathbf{q}_1$, the distance vector between the two particles. The integration goes over all values of $\mathbf{r}$ such that $\mathbf{q}_2 = \mathbf{q}_1 + \mathbf{r}$ lies within the volume, which we symbolically indicate by the shifted volume "$V - \mathbf{q}_1$". The last step, where we replaced the shifted volume by the unshifted one, involves an approximation. This can be done since the interaction between the particles, $w(r)$, practically decays to zero over microscopically small distances. Thus only a negligibly small fraction of configurations, namely those where particle 1 has a distance to the wall below that microscopically small distance, is not properly accounted for.

Now we can finally write down the virial expansion to second order. Plugging the explicit forms of the integrals, Eqs. 2.85 and 2.86, into Eq. 2.84 we arrive at
\[
\begin{aligned}
\beta p &= n - \frac{n^2}{2} \int_V \left(e^{-\beta w(r)} - 1\right) d^3r + \cdots \\
&\approx n - \frac{n^2}{2} \int \left(e^{-\beta w(r)} - 1\right) d^3r + \cdots \\
&= n + n^2 B_2(T) + \cdots
\end{aligned} \tag{2.87}
\]
In the second step we replaced the integration over $V$ by an integration over the infinite space. This is again an excellent approximation for short-ranged $w(r)$ since $e^{-\beta w(r)} - 1$ vanishes for large $r$. The quantity $B_2(T)$ depends on the temperature via $\beta$ and is called the second virial coefficient. Introducing spherical coordinates $(r, \theta, \varphi)$ with $r_1 = r \sin\theta \cos\varphi$, $r_2 = r \sin\theta \sin\varphi$, and
[Figure 2.8: snapshot of a gas with the three contributions labeled $n$, $B_2 n^2$ and $B_3 n^3$.]
Figure 2.8 The first three contributions to the pressure $\beta p$ of a real gas according to the virial expansion, Eq. 2.89. The main contribution is the ideal gas pressure $n$; the second is the two-body collision term $B_2 n^2$ that for low densities is considerably smaller than $n$. It accounts for collisions between two particles which occur with a probability proportional to $n^2$. In this snapshot of the gas there are three places, indicated by orange disks, where two bodies are in close contact. There is one place where three bodies are close (indicated by the green disk). Such configurations are accounted for by the even smaller three-body collision term $B_3 n^3$.
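$r_3 = r \cos\theta$ we can write $B_2(T)$ as
\[
B_2(T) = -\frac{1}{2} \int_0^{2\pi} d\varphi \int_{-1}^{1} d\cos\theta \int_0^{\infty} dr \, r^2 \left(e^{-\beta w(r)} - 1\right) = -2\pi \int_0^{\infty} r^2 \left(e^{-\beta w(r)} - 1\right) dr. \tag{2.88}
\]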
Let us take a closer look at Eq. 2.87. What we found is an expression for the pressure in the form of a series expansion in the density $n$. The first term is the ideal gas term, $\beta p = n$ (cf. Eq. 2.27). The second term is a correction to this ideal gas law. Depending on the sign of $B_2$ this term either increases or decreases the pressure. It is possible, but beyond the scope of this book, to derive the next terms of this expansion. One finds
\[
\beta p = \sum_{l=1}^{\infty} n^l B_l(T) \tag{2.89}
\]
where Bl (T ) is called the l-th virial coefficient. We now know explicitly B1 (T ) = 1 and B2 (T ), Eq. 2.88. Higher order terms look
[Figure 2.9: schematic plots of $w$, $w_{\mathrm{hard}}$ and $w_{\mathrm{attr}}$ versus $r$, illustrating $w = w_{\mathrm{hard}} + w_{\mathrm{attr}}$.]
Figure 2.9 The typical interaction potential $w$ between two molecules as a function of their distance $r$. It typically is the sum of two contributions. The first one is a hardcore repulsive term, $w_{\mathrm{hard}}$, that forbids particles to overlap (excluded volume). The second is a longer-ranged attractive potential, $w_{\mathrm{attr}}$.
increasingly complex, a result of collecting terms of order $l$ when going from $Z_G$ to $\ln Z_G$ (as we did for $l = 2$ in Eqs. 2.80 and 2.81) and then of going from $\zeta$ to $n$ (cf. Eqs. 2.81 to 2.87). One can interpret the $l$-th term of this expansion as accounting for collisions between $l$ molecules as indicated in Fig. 2.8.

Later in this book we need the free energy $F$ of a real gas instead of its pressure, Eq. 2.89. Remember that the pressure follows from $F$ by differentiation, $p = -\partial F / \partial V$, Eq. 2.61. The free energy is thus obtained by integrating the pressure, Eq. 2.89, leading to
\[
\beta F = N \left[\ln\left(\lambda_T^3 n\right) - 1\right] + V \sum_{l=2}^{\infty} \frac{n^l}{l - 1} B_l(T). \tag{2.90}
\]
The first term in Eq. 2.90 follows from integrating the $l = 1$ term that leads to $-N \ln V$. All the other contributions to the first term are just the integration constant, which has to be chosen such that the result matches the ideal gas result, Eq. 2.62, for the case that all $B_l = 0$ for $l \ge 2$. You can easily convince yourself that one indeed obtains Eq. 2.89 from the virial expansion of $F$ by taking the derivative with respect to $V$, $p = -\partial F / \partial V$.

We now use the virial expansion in the density, Eq. 2.89, to explain some of the basic properties of phase transitions. As a start let us estimate from Eq. 2.88 the typical temperature dependence of the second virial coefficient $B_2(T)$. Figure 2.9 depicts the typical form of the interaction $w(r)$ between two molecules. For short distances $w(r)$ rises sharply, reflecting the fact that two molecules cannot overlap in space due to hardcore repulsion. For larger distances there is typically a weak attraction. As schematically indicated in the figure the total interaction potential can be written as the sum of these two contributions, $w(r) = w_{\mathrm{hard}}(r) + w_{\mathrm{attr}}(r)$.
To a good approximation the hardcore term can be assumed to be infinite for $r \le d$ and zero otherwise, where $d$ denotes the center-to-center distance of the touching particles, i.e., their diameter. The integral 2.88 can then be divided into two terms that take into account the two contributions to the interaction:
\[
\begin{aligned}
B_2(T) &= -2\pi \int_0^{d} r^2 (-1) \, dr - 2\pi \int_d^{\infty} r^2 \left(e^{-\beta w_{\mathrm{attr}}(r)} - 1\right) dr \\
&\approx 2\pi \frac{d^3}{3} + 2\pi \int_d^{\infty} r^2 \beta w_{\mathrm{attr}}(r) \, dr = \upsilon_0 - \frac{a}{k_B T}.
\end{aligned} \tag{2.91}
\]
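How good the linearization in the second line of Eq. 2.91 is can be checked for a concrete potential. The minimal Python sketch below assumes, purely for illustration, an attractive tail $w_{\mathrm{attr}}(r) = -\varepsilon (d/r)^6$ for $r > d$ (this specific form and the name `beta_eps` for $\beta\varepsilon$ are assumptions, not taken from the text). For this choice one finds $a = \varepsilon \upsilon_0$, so the linearized result reads $B_2/\upsilon_0 = 1 - \beta\varepsilon$, while the unexpanded integral of Eq. 2.88 is evaluated numerically after the substitution $u = d/r$.

```python
import math

def b2_over_v0(beta_eps, n_steps=20000):
    """B2/v0 without linearization, for a hard core plus w_attr(r) = -eps*(d/r)**6 (assumed form).

    With u = d/r, Eq. 2.88 becomes
        B2/v0 = 1 - 3 * integral_0^1 (exp(beta_eps*u**6) - 1) / u**4 du,
    where v0 = 2*pi*d**3/3. The integral is evaluated with the midpoint rule.
    """
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        u = (k + 0.5) * h
        total += math.expm1(beta_eps * u**6) / u**4
    return 1.0 - 3.0 * h * total

def b2_over_v0_linearized(beta_eps):
    """Second line of Eq. 2.91 for the same potential: B2/v0 = 1 - a/(kB*T*v0) = 1 - beta_eps."""
    return 1.0 - beta_eps

if __name__ == "__main__":
    for beta_eps in (0.1, 0.5, 1.0):
        print(beta_eps, b2_over_v0(beta_eps), b2_over_v0_linearized(beta_eps))
```

For weak attraction ($\beta\varepsilon \ll 1$) the two expressions agree closely; as $\beta\varepsilon$ approaches one the linearized form increasingly underestimates the attraction, but the qualitative picture of a sign change of $B_2$ at a temperature of order $a/(k_B \upsilon_0)$ remains.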
The approximation involved in going to the second line is to replace $e^{-\beta w_{\mathrm{attr}}(r)}$ by $1 - \beta w_{\mathrm{attr}}(r)$. This is a good approximation if the attractive part is small compared to the thermal energy, i.e., if $\beta |w_{\mathrm{attr}}(r)| \ll 1$ for all values of $r > d$. In the final expression of Eq. 2.91 the volume $\upsilon_0 = 2\pi d^3/3$ accounts for the excluded volume of the particles. It is actually four times the volume $4\pi (d/2)^3/3$ of a particle. The factor 4 is the combination of two effects: (i) A particle excludes for the other a volume $4\pi d^3/3$ that is eight times its own volume. (ii) An additional factor 1/2 accounts for the implicit double counting of particle pairs by the $n^2$-term. The term $a = -2\pi \int_d^{\infty} r^2 w_{\mathrm{attr}}(r) \, dr$ is a positive quantity (assuming $w_{\mathrm{attr}}(r) \le 0$ everywhere, as is the case in Fig. 2.9). We thus find that the attractive term becomes less and less important with increasing temperature and the system behaves more and more like a system with pure hardcore repulsion. There is a temperature $T_0 = a/(k_B \upsilon_0)$ below which $B_2(T)$ becomes negative, i.e., the particles effectively start to attract each other.

Let us now insert Eq. 2.91 into the virial expansion 2.87. This leads to
\[
\beta p = \frac{1}{\upsilon} + B_2(T) \frac{1}{\upsilon^2} = \frac{1}{\upsilon} + \left(\upsilon_0 - \frac{a}{k_B T}\right) \frac{1}{\upsilon^2} \tag{2.92}
\]
where we introduced $\upsilon = 1/n$, the volume per particle. The pressure here is the sum of two terms, one proportional to $1/\upsilon$ and one to $1/\upsilon^2$. For large values of $\upsilon$, i.e., small densities, the first term is larger than the second one. On the other hand, for small values of $\upsilon$, i.e., large densities, the $1/\upsilon^2$-term dominates the pressure. Now
suppose we are at a temperature $T < T_0$ for which $B_2 < 0$. Then Eq. 2.92 does not make any sense in the limit $\upsilon \to 0$ since then $p \to -\infty$. What we should expect instead is that the pressure shoots up to infinity, $p \to \infty$, once the system reaches a density where the particles are densely packed. However, we should not be surprised that Eq. 2.92 does not work here since the virial expansion is an expansion for small densities. Even if we went to the trouble of considering the infinite series, Eq. 2.89, we cannot expect the virial expansion to work at high densities (i.e., small values of $\upsilon$). We know of a similar problem for series expansions of functions which are often only valid in a finite interval (e.g., $\ln(1 + x) = \sum_{k=1}^{\infty} (-1)^{k+1} x^k / k$ is only valid for $|x| < 1$).

A rough but at least simple way to make some sense out of Eq. 2.92 is to add the next term of the virial expansion Eq. 2.89, $B_3/\upsilon^3$, and simply to assume that the third virial coefficient $B_3$ is constant. This coefficient has the dimensions of length$^6$, since $\beta p$ has the dimensions of force/(area × energy), i.e., 1/length$^3$. A natural choice is $B_3 = d^6$ since $d$, the molecules' diameter, is the only length scale in the problem. With this, the pressure can be approximated by the following expression:
\[
\beta p = \frac{1}{\upsilon} + \left(\upsilon_0 - \frac{a}{k_B T}\right) \frac{1}{\upsilon^2} + \frac{d^6}{\upsilon^3}. \tag{2.93}
\]
This new form shows the same behavior for large values of $\upsilon$ as before, where in leading order $\beta p = 1/\upsilon$, but now it also makes sense for small values where in leading order $\beta p = d^6/\upsilon^3$, i.e., $p \to \infty$ for $\upsilon \to 0$. The exciting regime of this formula occurs at intermediate values around $\upsilon = d^3$ where both the first and the last term have the same size $1/d^3$. In that regime the second term in Eq. 2.93 becomes dominant for sufficiently low temperatures. In that case we have three regimes: $\sim 1/\upsilon^3$ for $\upsilon \ll d^3$, $\sim -1/\upsilon^2$ for $\upsilon \approx d^3$ and $\sim 1/\upsilon$ for $\upsilon \gg d^3$.

We can convince ourselves by inspecting Fig. 2.10. It shows a plot of the pressure, Eq. 2.93, as a function of $\upsilon$ for seven different temperatures. The resulting curves are called isotherms since they correspond to states of the same temperature. There are obviously three types of isotherms: monotonically decreasing curves for large temperatures (red), non-monotone curves that
[Figure 2.10: isotherms $p$ versus $\upsilon$ for $T = 0.7T^*$, $0.8T^*$, $0.9T^*$, $T^*$, $1.1T^*$, $1.2T^*$ and $1.3T^*$, with the critical values $\upsilon^*$ and $p^*$ marked on the axes.]
Figure 2.10 Pressure $p$ vs. volume per particle, $\upsilon$, as predicted by the first three terms of the virial expansion, Eq. 2.93. The curves are isotherms, lines of constant temperature. The red lines correspond to temperatures above the critical temperature, $T > T^*$. In this case the curves look qualitatively similar to the ones of an ideal gas. For temperatures below $T^*$ (blue curves) there appears at intermediate $\upsilon$-values a regime where the curves have a positive slope. This unphysical behavior can be interpreted as a phase transition between a gas at large $\upsilon$-values and a liquid at low ones, as explained in Fig. 2.11. The purple curve that divides the two types of isotherms has one inflection point with zero slope at $(\upsilon^*, p^*)$, the so-called critical point.
show an intermediate range of $\upsilon$-values with positive slope (blue) and precisely one isotherm in between (purple) that has one point with zero slope at $(\upsilon^*, p^*)$. The red curves are qualitatively similar to ideal gas curves for which $p \sim 1/\upsilon$, Eq. 2.27. These curves can be easily understood by looking at Fig. 2.3. The smaller the volume $V$ available to the gas, the higher its density and the more particles are found close to the piston exerting a pressure on it.

Let us next take a closer look at the purple isotherm for $T = T^*$ in Fig. 2.10, the so-called critical isotherm. We can determine the critical isotherm by finding the so-called critical point characterized by
\[
\frac{\partial p}{\partial \upsilon} = \frac{\partial^2 p}{\partial \upsilon^2} = 0. \tag{2.94}
\]
This leads to two conditions that allow us to determine $\upsilon^*$ and $T^*$:
\[
(\upsilon^*)^2 + 2 \left(\upsilon_0 - \frac{a}{k_B T^*}\right) \upsilon^* + 3 d^6 = 0 \tag{2.95}
\]
and
\[
(\upsilon^*)^2 + 3 \left(\upsilon_0 - \frac{a}{k_B T^*}\right) \upsilon^* + 6 d^6 = 0. \tag{2.96}
\]
Subtracting Eq. 2.95 from Eq. 2.96 allows us to solve for $\upsilon^*$:
\[
\upsilon^* = \frac{3 d^6}{\dfrac{a}{k_B T^*} - \upsilon_0}. \tag{2.97}
\]
This expression still contains the not yet determined critical temperature $T^*$. Plugging $\upsilon^*$ back into Eq. 2.95 allows us to find its explicit form:
\[
k_B T^* = \frac{a}{\upsilon_0 + \sqrt{3} d^3}. \tag{2.98}
\]
Inserting this into Eq. 2.97 finally gives a simple expression for the critical volume per particle:
\[
\upsilon^* = \sqrt{3} d^3. \tag{2.99}
\]
The critical pressure $p^*$ follows by inserting $\upsilon^*$ and $T^*$ into Eq. 2.93:
\[
p^* = \frac{k_B T^*}{3\sqrt{3} d^3} = \frac{1}{3\sqrt{3} d^3} \, \frac{a}{\upsilon_0 + \sqrt{3} d^3}. \tag{2.100}
\]
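The closed-form critical point can be cross-checked against the isotherm itself. The following minimal Python sketch evaluates Eqs. 2.98 to 2.100 and then verifies by finite differences that both derivatives in Eq. 2.94 vanish there. Units with $d = 1$ and $k_B = 1$ and the value $a = 10$ are assumptions made only for this illustration.

```python
import math

D = 1.0                            # molecular diameter d (sets the length unit)
KB = 1.0                           # Boltzmann constant (sets the energy unit)
V0 = 2.0 * math.pi * D**3 / 3.0    # excluded volume v0 from Eq. 2.91
A = 10.0                           # attraction parameter a (hypothetical value)

def pressure(v, T):
    """Eq. 2.93 solved for p."""
    b2 = V0 - A / (KB * T)
    return KB * T * (1.0 / v + b2 / v**2 + D**6 / v**3)

def critical_point():
    """Closed-form critical temperature, volume and pressure, Eqs. 2.98-2.100."""
    t_c = A / (KB * (V0 + math.sqrt(3.0) * D**3))
    v_c = math.sqrt(3.0) * D**3
    p_c = KB * t_c / (3.0 * math.sqrt(3.0) * D**3)
    return t_c, v_c, p_c

def slope_and_curvature(v, T, h=1e-3):
    """Central finite differences for dp/dv and d2p/dv2 at fixed T."""
    p_minus, p_mid, p_plus = pressure(v - h, T), pressure(v, T), pressure(v + h, T)
    return (p_plus - p_minus) / (2.0 * h), (p_plus - 2.0 * p_mid + p_minus) / h**2

if __name__ == "__main__":
    t_c, v_c, p_c = critical_point()
    print("T* =", t_c, " v* =", v_c, " p* =", p_c)
    print("dp/dv, d2p/dv2 at the critical point:", slope_and_curvature(v_c, t_c))
```

Both finite-difference derivatives come out at the level of the discretization error, and `pressure(v_c, t_c)` reproduces `p_c`.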
We come back to the physics that happens around the critical point at (υ ∗ , p∗ ) below but first discuss now the blue isotherms in Fig. 2.10. They are characterized by an intermediate range with positive slope. Suppose we compress a gas at some constant temperature T < T ∗ , e.g., T = 0.8T ∗ . The pressure would follow the T = 0.8T ∗ -isotherm in Fig. 2.10. First, as long as υ is sufficiently large, the pressure increases when we compress the system. However, when υ comes in the vicinity of υ ∗ the pressure decreases. Does this make any sense? Not at all. The gas in this regime cannot be stable. To keep it at a given volume, we would have to exert a certain pressure. But any small fluctuation changes the density locally and regions of increased density (smaller υ-value) could not withstand that pressure and would collapse spontaneously.
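[Figure 2.11: isotherm $p$ versus $\upsilon$ for $T = 0.7T^*$ with the horizontal Maxwell line between the points L (liquid, at $\upsilon_L$) and G (gas, at $\upsilon_G$).]
Figure 2.11 The Maxwell construction (see text). States where gas and liquid coexist (cylinder in the middle) lie on the line between points G (pure gas, right cylinder) and L (pure liquid, left cylinder).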
To make the unphysical isotherms of Eq. 2.93 that occur for T < T ∗ physical, one needs to employ the Maxwell construction: One replaces a part of the isotherm by a horizontal line as shown in Fig. 2.11. The height of the horizontal line needs to be chosen such that the two areas that are enclosed between that line and the original isotherm are equal. We will give a justification of this in a moment, but first let us discuss what happens to the system when it moves along the horizontal line. Suppose we start at a very dilute system, i.e., at a large υ-value. When we compress this system at constant temperature, then the pressure rises first. Once the volume υG per particle is reached something dramatic happens: the pressure remains constant under further compression, see Fig. 2.11. This signals the onset of a phase transition. Whereas at point G the cylinder is still completely filled with gas (right cylinder), as soon as we move along the Maxwell line there is also a second phase in the cylinder, which is shown in darker blue in the middle cylinder. This is the liquid phase which has a higher density and is thus found at the bottom of the cylinder. By compressing the volume further, more and more molecules in the
gas phase enter into the liquid. Once the point L is reached, all the molecules have been transferred to the liquid phase, see the cylinder on the left in Fig. 2.11. Upon further compression of the system the pressure rises sharply following again the original isotherm, reflecting now the compression of the liquid phase.

How is it possible that two phases coexist inside the cylinder? This is only possible if three conditions are fulfilled: (i) The temperatures in the two phases need to be the same since otherwise heat will flow from the hotter to the colder phase. This condition is fulfilled since both points, L and G, lie on the same isotherm. (ii) The pressure in both phases needs to be the same since otherwise the phase with the higher pressure expands at the expense of the phase with the lower pressure. The horizontal line is by construction a line of constant pressure. (iii) Finally, the chemical potentials of the two phases need to be the same, i.e., the chemical potentials at points L and G in Fig. 2.11 have the same value:
\[
\mu_G(T, p) = \mu_L(T, p). \tag{2.101}
\]
Since this is the least intuitive condition we explain it here in more detail. We can think of each phase as a system under a given pressure $p$ at a given temperature $T$. The appropriate thermodynamic potential is thus the free enthalpy, Eq. 2.73. Using Eq. 2.76 the total free enthalpy of the two coexisting phases is given by
\[
G = \mu_G N_G + \mu_L N_L = \mu_G N_G + \mu_L (N - N_G). \tag{2.102}
\]
On the rhs we used the fact that the total number of particles, $N$, is the sum of the particles in the two phases, $N_G + N_L$. Suppose now that the two chemical potentials were different, e.g., $\mu_G > \mu_L$. In that case the free enthalpy can be lowered by transferring particles from the gas to the liquid phase. Equilibrium between the two phases, as shown inside the middle cylinder of Fig. 2.11, is thus only possible if the two chemical potentials are the same. Only then is the free enthalpy minimized: $\partial G / \partial N_G = \mu_G - \mu_L = 0$.

We now need to show that condition 2.101 is fulfilled when the equal area construction is obeyed. Combining Eqs. 2.73 and 2.76 we find for each phase the relation
\[
\mu_k(T, p) = \frac{F_k + p V_k}{N_k} \tag{2.103}
\]
with $k = G, L$. The coexistence condition, Eq. 2.101, together with the relation 2.103 then leads to the condition
\[
f_L - f_G = p \, (\upsilon_G - \upsilon_L) \tag{2.104}
\]
where $f_k$ denotes the free energy per molecule in the $k$-th phase. Next we calculate the difference $f_L - f_G$ purely formally by integrating along the (unphysical) isotherm:
\[
f_L - f_G = \int_{\text{isotherm } G \to L} df = -\int_{\upsilon_G}^{\upsilon_L} p(T, \upsilon) \, d\upsilon. \tag{2.105}
\]
In the second step we used the relation
\[
df = f(T + dT, \upsilon + d\upsilon) - f(T, \upsilon) = \frac{\partial f}{\partial T} dT + \frac{\partial f}{\partial \upsilon} d\upsilon \overset{\text{isotherm}}{=} -p \, d\upsilon. \tag{2.106}
\]
On the rhs we made use of the fact that by definition $dT \equiv 0$ along the isotherm and of the relation $\partial f / \partial \upsilon = \partial F / \partial V = -p$, Eq. 2.61. Combining Eqs. 2.104 and 2.105 we arrive at
\[
\int_{\upsilon_L}^{\upsilon_G} p(T, \upsilon) \, d\upsilon = p \, (\upsilon_G - \upsilon_L). \tag{2.107}
\]
This is just the mathematical formulation of the equal area requirement since only then the area under the isotherm between $\upsilon_L$ and $\upsilon_G$ equals the area of a rectangle of height $p$ and width $\upsilon_G - \upsilon_L$.

Instead of changing the volume at fixed temperature and measuring the pressure, one can also do the opposite, namely impose the pressure and observe the volume. To see what happens, just swap the axes in Fig. 2.11. If we start at small pressures the sample is in the gas state taking up a huge volume. Once the pressure is large enough, the system becomes a liquid by jumping discontinuously to a much smaller volume. As you can see, by imposing $p$ instead of $V$, you can no longer control the ratio of liquid and gas in the sample. A phase transition is called a first-order phase transition if at least one of the first derivatives of the appropriate thermodynamic potential is discontinuous. In this example where we control $T$, $p$ and $N$ the thermodynamic potential is the free enthalpy $G$, Eq. 2.73, and one of its first derivatives, namely $V = \partial G / \partial p$ (see Eq. 2.74), is indeed discontinuous.
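The equal area requirement, Eq. 2.107, can be turned into a small numerical procedure. The following Python sketch is one possible way to do this for the model isotherm of Eq. 2.93; it is an illustration under stated assumptions rather than a definitive implementation. Units with $d = 1$ and $k_B = 1$, the value $a = 10$, and the helper names (`bisect`, `maxwell`) are all choices made here. For a trial pressure the liquid and gas volumes are located by bisection on the two outer branches of the isotherm, and the trial pressure is then adjusted, again by bisection, until the area under the isotherm equals the area of the rectangle.

```python
import math

D, KB = 1.0, 1.0                      # units: molecular diameter and Boltzmann constant
V0 = 2.0 * math.pi * D**3 / 3.0       # excluded volume v0, Eq. 2.91
A = 10.0                              # attraction parameter a (hypothetical value)
T_CRIT = A / (KB * (V0 + math.sqrt(3.0) * D**3))   # Eq. 2.98

def b2(T):
    return V0 - A / (KB * T)

def pressure(v, T):
    """Isotherm of Eq. 2.93."""
    return KB * T * (1.0 / v + b2(T) / v**2 + D**6 / v**3)

def work(v1, v2, T):
    """Integral of p dv from v1 to v2 along the isotherm (antiderivative of Eq. 2.93)."""
    def antiderivative(v):
        return KB * T * (math.log(v) - b2(T) / v - D**6 / (2.0 * v**2))
    return antiderivative(v2) - antiderivative(v1)

def bisect(f, lo, hi, n_iter=200):
    """Plain bisection; assumes f changes sign exactly once between lo and hi."""
    f_lo = f(lo)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maxwell(T):
    """Return (p_coex, v_L, v_G) satisfying the equal area condition, Eq. 2.107."""
    b = b2(T)
    disc = b * b - 3.0 * D**6
    if b >= 0.0 or disc <= 0.0:
        raise ValueError("isotherm has no loop (T is not below T*)")
    v_min = -b - math.sqrt(disc)          # local pressure minimum, dp/dv = 0
    v_max = -b + math.sqrt(disc)          # local pressure maximum, dp/dv = 0
    p_hi = pressure(v_max, T)
    p_lo = max(pressure(v_min, T), 0.0) + 1e-9 * p_hi

    def branches(p0):
        v_l = bisect(lambda v: pressure(v, T) - p0, 1e-3 * D**3, v_min)
        v_far = 2.0 * v_max
        while pressure(v_far, T) > p0:    # extend the bracket on the gas branch
            v_far *= 2.0
        v_g = bisect(lambda v: pressure(v, T) - p0, v_max, v_far)
        return v_l, v_g

    def area_mismatch(p0):
        v_l, v_g = branches(p0)
        return work(v_l, v_g, T) - p0 * (v_g - v_l)

    p_coex = bisect(area_mismatch, p_lo, p_hi)
    v_l, v_g = branches(p_coex)
    return p_coex, v_l, v_g

if __name__ == "__main__":
    print(maxwell(0.8 * T_CRIT))
```

The search interval for the coexistence pressure is bounded by the local minimum and maximum of the isotherm (or by zero if the minimum is negative), since a horizontal line can only cut the loop three times in that range.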
The critical point at $(\upsilon^*, p^*)$ and its immediate neighborhood exhibit spectacular physics. It is not just the point beyond which the difference between the gas and the liquid disappears. At this point the substance shows large fluctuations in density over all length scales in space, from very small ones up to infinity. As a result one finds that the behavior of such a system is independent of its microscopic details. In other words, independent of the detailed underlying chemistry, such systems always behave identically. The system looks the same at any length scale, i.e., it is self-similar, and as a result, several of its physical quantities exhibit power laws. To give an example, let us consider the pressure dependence on the volume just around the critical point. This can be done by a Taylor expansion:
\[
p - p^* = \sum_{k=1}^{\infty} \frac{1}{k!} \left.\frac{\partial^k p}{\partial \upsilon^k}\right|_{\upsilon = \upsilon^*} (\upsilon - \upsilon^*)^k = \frac{1}{3!} \left.\frac{\partial^3 p}{\partial \upsilon^3}\right|_{\upsilon = \upsilon^*} (\upsilon - \upsilon^*)^3 + \cdots \tag{2.108}
\]
The first two terms of the expansion vanish because of Eq. 2.94. So we find in leading order the power law $p - p^* \sim -(\upsilon - \upsilon^*)^3$ that relates the pressure change to the volume change. Unlike for the first-order phase transition that occurs for $T < T^*$, the curve $\upsilon$ vs. $p$ does not have a jump for $T = T^*$. We find, however, in this example that the curve $\upsilon = \upsilon(p)$ has an infinite slope at $p = p^*$ (again just swap the axes, now in Fig. 2.10, and look at the purple curve). Phase transitions where derivatives of order higher than first order of the thermodynamic potential have a jump, or are infinite, are called continuous phase transitions. In our example, all first-order derivatives of the free enthalpy turn out to be continuous but $\partial V / \partial p = \partial^2 G / \partial p^2$ is infinite at $p = p^*$, i.e., we have here a continuous phase transition.

Note, however, that our model of a real gas was oversimplified, i.e., there is no obvious reason why that power law should hold for a real physical system. Nevertheless it is a matter of fact that real systems exhibit such a power law, $p - p^* \sim -(\upsilon - \upsilon^*)^{\delta}$ for $\upsilon > \upsilon^*$ and $p - p^* \sim (\upsilon^* - \upsilon)^{\delta}$ for $\upsilon < \upsilon^*$, albeit with a different value of $\delta$, namely $\delta = 4.8$. The reason why our model does not work quantitatively is subtle and is related to the fact that we do not account for the density fluctuations that govern the behavior
of such a system around the critical point. The value δ = 4.8 is found for any fluid whether it is water, methane, neon or whatever you like. The exponent δ is a so-called critical exponent (there are several more). It is the set of the critical exponents that is universal, not the specific location of the critical point (υ ∗ , p∗ ) or the value of T ∗ . To understand this deeper is the subject of an advanced course in statistical physics where the appropriate theoretical framework, the renormalization group transformation, is discussed (Yeomans, 1992). Some more insights will be provided in the next chapter which is about polymers. As it turns out, long polymers often represent systems close to a critical point. As a result, polymer theory is particularly elegant and the behavior of polymers is described by an abundance of power laws.
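The model exponent $\delta = 3$ that follows from Eq. 2.108 can be read off numerically from the critical isotherm of Eq. 2.93. The minimal Python sketch below estimates the local slope of $\ln|p - p^*|$ versus $\ln|\upsilon - \upsilon^*|$; units with $d = 1$ and $k_B = 1$, the value $a = 10$, and the two sampling distances are assumptions made for this illustration. It says nothing about the measured value $\delta = 4.8$, which the simple model cannot reproduce.

```python
import math

D, KB = 1.0, 1.0                      # units: molecular diameter and Boltzmann constant
V0 = 2.0 * math.pi * D**3 / 3.0       # excluded volume v0, Eq. 2.91
A = 10.0                              # attraction parameter a (hypothetical value)
T_C = A / (KB * (V0 + math.sqrt(3.0) * D**3))   # Eq. 2.98
V_C = math.sqrt(3.0) * D**3                     # Eq. 2.99

def pressure(v, T):
    """Isotherm of Eq. 2.93."""
    b2 = V0 - A / (KB * T)
    return KB * T * (1.0 / v + b2 / v**2 + D**6 / v**3)

def local_exponent(x1=1e-2, x2=1e-3):
    """Slope of ln|p - p*| versus ln|v - v*| on the critical isotherm, for v > v*."""
    p_c = pressure(V_C, T_C)
    dp1 = abs(pressure(V_C + x1, T_C) - p_c)
    dp2 = abs(pressure(V_C + x2, T_C) - p_c)
    return math.log(dp1 / dp2) / math.log(x1 / x2)

if __name__ == "__main__":
    print("estimated exponent:", local_exponent())
```

The estimate approaches 3 as the two sampling distances are reduced, with small deviations coming from the next term of the Taylor expansion.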
2.5 Biomolecular Condensates

After the publication of the first edition of this textbook, the phenomenon of phase separation has taken center stage in biology. To account for this surprising new development we have now added the current section. In the previous section we discussed the coexistence of a gas and a liquid phase of the same substance. The new view in biology concerns the coexistence of liquid phases made from different compositions of molecules (Hyman et al., 2014). As a daily life example think about a mixture of oil and water. Similarly you can imagine that the many different ingredients in a cell might not all be neatly mixed but instead demix into different phases, forming small droplets inside the cell. As these droplets are composed of biomolecules, they are now called biomolecular condensates (Banani et al., 2017) but many other names, including cellular bodies, nuclear bodies, granules, speckles, aggregates, assemblages and membraneless organelles, can be found in the literature. Concerning "membraneless organelles": In Chapter 1 we mentioned organelles inside cells and pointed out that they are separated from the cytoplasm by membranes. Liquid-liquid phase separation is a different, simpler strategy that cells make use of
to organize their inside, offering other molecules different physical environments to reside or to act in. In the following, we first give some general theoretical background before briefly mentioning some examples of biomolecular condensates. We finish this section with some speculation about how liquid-liquid phase separation could be used by the cell to solve a difficult problem that chromatin faces after cell division.

Suppose there are two types of molecules, yellow and green, inside a box which is initially divided by a wall such that all yellow molecules are to the left of the wall and all green ones to the right of it, see Fig. 2.12(a). For simplicity, we use a lattice model where each site of the lattice is occupied by exactly one molecule. We assume that these molecules are in a liquid state and can move freely (e.g., by neighboring molecules exchanging places). Now let us remove the dividing wall. What happens? First assume that the molecules do not care about who their neighbor is, i.e., we assume that the interactions between all the particles are identical. Molecules will hop around randomly and eventually the system will reach equilibrium. The equilibrium state is a homogeneous mixed state, see Fig. 2.12(b). This is caused by entropy as there are many more microscopic states with the two types of molecules being randomly distributed in the whole volume as compared to the starting configuration.

We can quantify this by calculating the so-called entropy of mixing $S_{\mathrm{mix}}$. Considering each type of molecule as indistinguishable, the starting configuration has exactly one microstate, $N_{\mathrm{micro}} = 1$. According to Eq. 2.51 the entropy of this state is 0. Now let us calculate the entropy in the mixed state. Assume we have $N$ molecules, $N_y$ of which are yellow and $N_g$ of which are green. Now the number of different microstates is given by
\[
N_{\mathrm{micro}} = \frac{N!}{N_y! \, N_g!} = \frac{N!}{N_y! \, (N - N_y)!} \tag{2.109}
\]
where we used $N_y + N_g = N$. From this follows the mixing entropy
\[
S_{\mathrm{mix}} = k_B \ln N_{\mathrm{micro}} = k_B \left[\ln N! - \ln N_y! - \ln (N - N_y)!\right]. \tag{2.110}
\]
We use Stirling's formula, Eq. 2.48, here in the form $\ln m! \approx m \ln m - m$ (for $m \gg 1$) to arrive at
\[
S_{\mathrm{mix}} = -k_B N_y \ln\left(\frac{N_y}{N}\right) - k_B (N - N_y) \ln\left(\frac{N - N_y}{N}\right). \tag{2.111}
\]
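To get a feeling for the quality of the Stirling approximation at the small system size of Fig. 2.12 ($N = 70$, $N_y = 49$), the following minimal Python sketch compares the exact logarithm of Eq. 2.109 with Eq. 2.111, both in units of $k_B$. The function names are illustrative choices; the exact value is obtained from the standard library's log-gamma function using $\ln m! = \ln \Gamma(m + 1)$.

```python
import math

def ln_microstates(n_total, n_yellow):
    """Exact ln(N!/(Ny!(N-Ny)!)) from Eq. 2.109, via the log-gamma function."""
    return (math.lgamma(n_total + 1) - math.lgamma(n_yellow + 1)
            - math.lgamma(n_total - n_yellow + 1))

def s_mix_stirling(n_total, n_yellow):
    """Eq. 2.111 in units of kB; a species with zero molecules contributes nothing."""
    s = 0.0
    for n in (n_yellow, n_total - n_yellow):
        if n > 0:
            s -= n * math.log(n / n_total)
    return s

if __name__ == "__main__":
    n_total, n_yellow = 70, 49
    print("exact   :", ln_microstates(n_total, n_yellow))
    print("Stirling:", s_mix_stirling(n_total, n_yellow))
```

For such a small lattice the leading-order Stirling form overestimates the exact entropy by a couple of $k_B$; the relative error shrinks as $N$ grows, which is why Eq. 2.111 is adequate for macroscopic systems.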
[Figure 2.12: (a) demixed box with $\phi = 1$ on the left and $\phi = 0$ on the right; (b) mixed box with $\phi = 0.7$; (c) plot of $S_{\mathrm{mix}}/k_B$ versus $N_y$ with the mixed and demixed states marked.]
Figure 2.12 Demixed and mixed states. (a) Demixed state where yellow and green molecules are separated by a wall (red). $\phi$ denotes the volume fraction of yellow molecules. (b) After removing the wall the system increases its entropy by the mixing of the molecules. (c) Mixing entropy, Eq. 2.111, as a function of the number $N_y$ of yellow molecules. Indicated is the mixed state from (b) where 49 of the 70 molecules are yellow and states where all molecules are either green or yellow. The latter states feature zero mixing entropy. Likewise the demixed state shown in (a) has zero entropy.
This mixing entropy is depicted in Fig. 2.12(c). States which contain only one type of molecule (here $N_y = 0$ or $N_g = 0$) or where the two types are separated in two compartments, as in Fig. 2.12(a), have a vanishing entropy of mixing. On the other hand, a system increases its entropy by going to a mixed state. In Fig. 2.12(b) this state consists of a mixture of $N_y = 49$ yellow molecules and $N_g = 21$ green molecules.

Now let us go to a more realistic off-lattice system and assume that each yellow molecule occupies a volume $\upsilon_y$ and each green one a volume $\upsilon_g$. We introduce the volume fraction $\phi$ of the yellow molecules. It is defined as the fraction of the volume $V$ of the box that is occupied by yellow molecules, i.e., $\phi = N_y \upsilon_y / V$ (for the mixed state; otherwise it is the volume fraction of yellow molecules in the corresponding phase). This quantity is directly linked to the concentrations of the two types of molecules, $c_y = \phi/\upsilon_y$ and $c_g = (1 - \phi)/\upsilon_g$ (we assume that there is no free volume). Using these variables Eq. 2.111 can be generalized to:
\[
\frac{S_{\mathrm{mix}}}{V} = -k_B \frac{\phi}{\upsilon_y} \ln \phi - k_B \frac{1 - \phi}{\upsilon_g} \ln (1 - \phi). \tag{2.112}
\]
For the special case $\upsilon_y = \upsilon_g$ the two expressions, Eqs. 2.111 and 2.112, are identical. For the unmixed state one compartment has $\phi = 1$ and the other $\phi = 0$ and the total mixing entropy (the sum of
[Figure 2.13: three plots versus $\phi$ showing the free energy $F$ as the sum of the interaction energy $E$ and $-T S_{\mathrm{mix}}$.]
Figure 2.13 The free energy $F$ as a function of the volume fraction $\phi$ of yellow molecules is the sum of the interaction energy $E$, Eq. 2.113, and $-T S_{\mathrm{mix}}$ where $S_{\mathrm{mix}}$ is the mixing entropy given by Eq. 2.112. The (dimensionless) parameters used in the plots are $V = 1000$, $\chi = 2.5$, $T = 1$, $k_B = 1$, $\upsilon_y = 3/2$ and $\upsilon_g = 1$. The large energetic and entropic contributions almost cancel; what remains is a free energy with two local minima. The minima have different heights because the system is not symmetric with respect to an exchange between the green and yellow molecules ($\upsilon_y \neq \upsilon_g$).
the mixing entropies of the two subsystems) is zero. The increase in entropy drives the system to a mixed state with some intermediate value of $\phi$, $0 < \phi < 1$.

Now let us assume that the molecules interact. We need to consider the free energy of the system, $F = E - T S_{\mathrm{mix}}$, where $E$ denotes the total interaction energy in the system. Let us assume that the interaction is only between nearest neighbors and that it favors molecules of the same type being next to each other but disfavors close contact between different types of molecules. This type of interaction can be written as
\[
E = \chi V \phi (1 - \phi) \tag{2.113}
\]
with the interaction parameter $\chi > 0$. The interaction energy of the demixed state shown in Fig. 2.12(a) is lower than in the mixed state shown in Fig. 2.12(b). The free energy of the system now contains two terms, one that favors mixing and one that favors demixing, as can be seen in Fig. 2.13. In the absence of interactions, the free energy is just given by $-T S_{\mathrm{mix}}$, the blue convex curve in Fig. 2.13, corresponding to the case discussed in Fig. 2.12 where mixing is favored. The interactions, however, have the opposite tendency, leading to a region of $\phi$-values where the free energy is concave. What is the consequence of this concave region? Similar to the case of the gas-liquid phase coexistence in the previous section,
Figure 2.14 Free energy F of a system of green and yellow molecules as a function of the volume fraction φ of yellow molecules. This is exactly the same function as shown in Fig. 2.13; the parameters are stated in the caption of that figure. Here we show that the mixed state with φ = 0.7 is unstable. The system lowers its free energy by demixing into two phases whose composition can be found by the common tangent construction. The tangent (dashed line) touches the free energy at two points corresponding to a phase rich in green molecules with φ = φ1 = 0.05 and a phase rich in yellow molecules with φ = φ2 = 0.91. The free energy of the demixed system is the weighted average of the free energies of the two subsystems and thus lies on the common tangent at the location indicated by the arrow.
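The common tangent construction can also be carried out numerically. The following minimal sketch assumes that the mixing entropy of Eq. 2.112 has the Flory-Huggins form Smix = −kB V [(φ/υy) ln φ + ((1 − φ)/υg) ln(1 − φ)]; this form is an assumption made here for illustration, as are the function names and the choice of a two-dimensional Newton iteration. With the parameters of Fig. 2.13, it searches for the two volume fractions at which the free energy has a common tangent.

```python
import math

# Parameters from the caption of Fig. 2.13 (dimensionless units).
V, CHI, T, KB = 1000.0, 2.5, 1.0, 1.0
NU_Y, NU_G = 1.5, 1.0


def f(phi):
    """Free energy per unit volume, F/V = E/V - T*S_mix/V.

    The entropy term is the ASSUMED Flory-Huggins form of Eq. 2.112.
    """
    energy = CHI * phi * (1.0 - phi)
    entropy = -KB * (phi / NU_Y * math.log(phi)
                     + (1.0 - phi) / NU_G * math.log(1.0 - phi))
    return energy - T * entropy


def df(phi):
    """First derivative of f with respect to phi."""
    return (CHI * (1.0 - 2.0 * phi)
            + KB * T * ((math.log(phi) + 1.0) / NU_Y
                        - (math.log(1.0 - phi) + 1.0) / NU_G))


def d2f(phi):
    """Second derivative of f; negative values mark the concave region."""
    return -2.0 * CHI + KB * T * (1.0 / (NU_Y * phi) + 1.0 / (NU_G * (1.0 - phi)))


def common_tangent(a=0.05, b=0.9, steps=50):
    """Newton iteration for f'(a) = f'(b) and f(b) - f(a) = f'(b) (b - a)."""
    for _ in range(steps):
        g1 = df(a) - df(b)
        g2 = f(b) - f(a) - df(b) * (b - a)
        # Jacobian of (g1, g2) with respect to (a, b)
        j11, j12 = d2f(a), -d2f(b)
        j21, j22 = df(b) - df(a), -d2f(b) * (b - a)
        det = j11 * j22 - j12 * j21
        a += (-g1 * j22 + g2 * j12) / det
        b += (-j11 * g2 + j21 * g1) / det
    return a, b


phi1, phi2 = common_tangent()
print(f"phi1 = {phi1:.3f}, phi2 = {phi2:.3f}")
print(f"F(phi1) = {V * f(phi1):.1f}, F(phi2) = {V * f(phi2):.1f}")
```

Under these assumptions the iteration should converge to values close to the φ1 = 0.05 and φ2 = 0.91 quoted in the caption. Equal slopes at the two tangent points correspond to equal chemical potentials in the two phases.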
here is a region that cannot exist. Instead of forming a mixture of yellow and green molecules, the system can lower its free energy by separating into two phases, one rich in yellow and the other rich in green molecules. This is depicted in Fig. 2.14, which highlights the mixed state (assumed here to have φ = 0.7) and the demixed state with two phases, one with φ = φ1 = 0.05 and the other with φ = φ2 = 0.91. φ1 and φ2 delimit the range of volume fractions where demixing occurs. These two volume fractions can be determined by the common tangent construction; the common tangent to the free energy curve is shown as a dashed line in the figure. Note that this construction ensures that the two phases have the same chemical potential, defined as μ = dF/dNy = (υy/V) dF/dφ. As a result there is no net flux of particles from one phase to the other and the system is in equilibrium. The arrow in Fig. 2.14 indicates the free energy gain by going from the mixed to the demixed state. As the free energy of the demixed system is the weighted average of the free energies of the two phases, it is given by the height of the common tangent at φ = 0.7.

The demixing tendency changes with temperature. This is illustrated in Fig. 2.15 (top), which shows the free energy of the same system for a range of temperatures, from T = 0.8 to T = 1.6. For each curve the common tangent is indicated. Note that the concave region of the free energy vanishes at the critical temperature T∗ = 1.5153, indicated in red. Having determined the range of volume fractions where demixing occurs, we can construct the phase diagram displayed at the bottom of Fig. 2.15. It indicates for which compositions φ and temperatures T the system is a one-phase mixture and for which it is separated into two phases. As an example we highlight the case T = 1 from Fig. 2.14. Starting from the yellow-green disk at (φ, T) = (0.7, 1), the system separates into a green phase at (0.05, 1) (green disk) and a yellow phase at (0.91, 1) (yellow disk).

The next question we need to address is the spatial arrangement of the two phases. It turns out that there are two different cases that allow for a constant chemical potential throughout: two homogeneous phases separated by a flat interface, Fig. 2.16(a), and a single droplet embedded in a homogeneous phase, Fig. 2.16(b). There is an important difference between the two cases concerning the pressure in the two phases. For the case of two homogeneous phases the pressure is constant everywhere, whereas for the droplet case a pressure jump occurs across the interface. The pressure pin inside the droplet is larger than the pressure pout outside. The difference is given by the so-called Laplace pressure

\[ p_{\mathrm{in}} - p_{\mathrm{out}} = \frac{2\gamma}{R} \tag{2.114} \]

where R denotes the radius of the droplet. γ is the surface tension between the two phases and is related to the energy cost per area to have green and yellow molecules in contact. In the absence of the extra pressure inside the droplet, the droplet would not be stable and would shrink. As its size decreases from R to R − dR, the surface energy is reduced by 4πγ[R² − (R − dR)²] ≈ 8πγR dR. This amount divided by the change in radius, dR, and the surface area of the droplet, 4πR², is the normal pressure of the surface; to
[Figure 2.15: free energy F versus volume fraction φ for temperatures from T = 0.8 to T = 1.6 (top) and phase diagram of temperature T versus φ with mixed and demixed regions (bottom); the critical point is marked at (φ∗, T∗) = (0.4495, 1.5153).]
Figure 2.15 Free energies (top) and phase diagram (bottom) of a mixture of green and yellow molecules. Same system as in Figs. 2.13 and 2.14. Free energies are shown for different temperatures, ranging from T = 0.8 to T = 1.6 and including T = 1 from the previous plots (highlighted in purple). Using the common tangent construction for each free energy curve the range of volume fractions is determined where demixing occurs and then plotted below in the phase diagram of temperature vs. composition. There is no demixing above a critical temperature, T ∗ = 1.5153, indicated in red.
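The critical point can be checked with a few lines of code. At the critical point the concave region shrinks to a single point, so that both the second and the third derivative of F with respect to φ vanish. The sketch below again assumes the Flory-Huggins form of the mixing entropy, Smix = −kB V [(φ/υy) ln φ + ((1 − φ)/υg) ln(1 − φ)], for Eq. 2.112; this form and the function names are assumptions made for illustration.

```python
import math

# Parameters of Fig. 2.13 (dimensionless units).
CHI, KB = 2.5, 1.0
NU_Y, NU_G = 1.5, 1.0


def curvature(phi, temp):
    """Second derivative of F/V with respect to phi (assumed Flory-Huggins entropy)."""
    return -2.0 * CHI + KB * temp * (1.0 / (NU_Y * phi) + 1.0 / (NU_G * (1.0 - phi)))


def critical_point():
    # Third derivative zero:  nu_y * phi^2 = nu_g * (1 - phi)^2
    phi_c = 1.0 / (1.0 + math.sqrt(NU_Y / NU_G))
    # Second derivative zero at phi_c fixes the temperature.
    t_c = 2.0 * CHI / (KB * (1.0 / (NU_Y * phi_c) + 1.0 / (NU_G * (1.0 - phi_c))))
    return phi_c, t_c


def has_concave_region(temp, n_grid=999):
    """True if F is concave somewhere on a grid of volume fractions in (0, 1)."""
    return any(curvature(i / (n_grid + 1), temp) < 0.0 for i in range(1, n_grid + 1))


phi_c, t_c = critical_point()
print(f"phi* = {phi_c:.4f}, T* = {t_c:.4f}")
for temp in (1.4, 1.5, 1.6):
    print(f"T = {temp}: concave region present: {has_concave_region(temp)}")
```

With the stated parameters this should reproduce the critical point (φ∗, T∗) = (0.4495, 1.5153) given in the figure, with a concave region below T∗ and none above it.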
Figure 2.16 Two possible geometries for the two demixed phases: (a) the green and yellow phases are separated by a flat interface and (b) a single green droplet embedded in the yellow phase. In case (b) there is a pressure drop at the droplet interface, given by Eq. 2.114. As the pressure drop depends on the radius of the droplet, a system of two (or more) droplets is not stable as the largest droplet grows at the expense of the smaller ones, as shown in (c), a phenomenon called Ostwald ripening.
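Both statements of the caption, the pressure drop of Eq. 2.114 and the instability of a system of two droplets, can be illustrated numerically. The sketch below is only an illustration under simple assumptions: the droplets are perfect spheres, the total droplet volume is conserved, and the surface tension γ is set to one; all function names are hypothetical.

```python
import math


def radius(volume):
    """Radius of a sphere of given volume."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


def laplace_pressure(gamma, r):
    """Pressure jump across the interface of a droplet of radius r, Eq. 2.114."""
    return 2.0 * gamma / r


def normal_pressure(gamma, r, dr=1e-6):
    """Surface energy released on shrinking from r to r - dr,
    divided by dr and by the droplet area."""
    released = 4.0 * math.pi * gamma * (r ** 2 - (r - dr) ** 2)
    return released / (dr * 4.0 * math.pi * r ** 2)


def total_area(x, total_volume=1.0):
    """Total surface area of two droplets holding the volume fractions x and 1 - x."""
    area = 0.0
    for v in (x * total_volume, (1.0 - x) * total_volume):
        if v > 0.0:
            area += 4.0 * math.pi * radius(v) ** 2
    return area


# The finite-difference estimate agrees with 2*gamma/R.
print(normal_pressure(1.0, 2.0), laplace_pressure(1.0, 2.0))

# Shrink one droplet in favor of the other, starting from the symmetric case.
fractions = [0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
areas = [total_area(x) for x in fractions]
for x, a in zip(fractions, areas):
    print(f"x = {x:.1f}: total area = {a:.4f}")
```

The total area decreases monotonically along this path, and the smaller droplet always has the larger Laplace pressure, which is the driving force behind Ostwald ripening.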
ensure mechanical stability the inner phase needs to compensate this pressure by exactly this amount, see Eq. 2.114.

This has interesting consequences. In the previous section we claimed that two phases can only coexist if the temperature, chemical potential and pressure in the two phases are pairwise identical. We then implicitly assumed that the interface is planar. However, for curved interfaces the necessity to balance the normal forces at the interface modifies this rule for the pressure.

Now, if we have more than one droplet, the system is no longer stable. Whenever droplets have different sizes, they feature different pressure jumps across the interface as the Laplace pressure depends on the radius, see Eq. 2.114. The largest droplet with its smaller internal pressure grows at the expense of the smaller droplets with their higher internal pressures, Fig. 2.16(c). In simpler terms, you can compare the surface areas of two droplets (starting from the symmetric case) and then shrink one droplet while the other grows, up to the point where the smaller droplet disappears; all along this path the total surface area of the droplets decreases and with it the interfacial energy. This instability, namely that larger droplets grow at the expense of smaller ones, a phenomenon called Ostwald ripening, eventually drives the system toward the single droplet state depicted in Fig. 2.16(b).

Going back to the biological case: Does this mean that a cell features exactly one droplet for each phase? Not necessarily. One
possibility could be, for example, that surfactants go to the interface and reduce the surface tension; this is the case for milk, an oil-in-water emulsion. Another possibility might be polymers that are anchored to the surface of droplets, which might prevent their growth beyond a certain size; we will discuss some aspects of this scenario below. Chemical reactions can also be used to stabilize small droplets against Ostwald ripening (Zwicker et al., 2015).

A particularly prominent biomolecular condensate, already observed by bright-field microscopy in the 1930s, is the nucleolus. It is shown in the 5 μm-window of Fig. 1.6 as the grey area between the chromosomal territories. The nucleolus is composed of proteins, DNA and RNA and is the site of ribosome biogenesis. Centrosomes have also been known for a long time. They nucleate microtubules that form the mitotic spindle; during cell division the spindle is responsible for properly separating the sister chromatids into the two daughter cells. Cajal bodies are biomolecular condensates composed of proteins and RNA that are found in the nucleus. They are thought to have a role in assembling spliceosomes, complex molecular machines that remove introns (non-coding sections) from transcribed messenger RNA. P granules are macromolecular condensates found in the nematode Caenorhabditis elegans that are involved in germ-cell specification. They have been observed to show the characteristics of fluids: they fuse, they drip, they are spherical and they rearrange their content within seconds. The viscosity of P granules is comparable to that of honey.

A common feature of macromolecular condensates is that they are formed by macromolecules (proteins and/or RNA). These macromolecules contain multiple elements that cause inter- and intramolecular interactions. These elements are typically weakly adhesive disordered regions or modular interaction domains. As the components of such condensates are macromolecules themselves and as they even oligomerize into larger complexes through these interactions, the entropic cost of phase separation is much smaller than in the case of the individual green and yellow molecules discussed above. Some of the biological macromolecules were tuned over evolutionary time scales to use these mechanisms to form various types of macromolecular condensates.
As announced earlier, we now discuss a problem that cells face after they have divided into their two daughter cells. There are two main types of chromatin: euchromatin and heterochromatin. Euchromatin is more open and contains the genes that are actively transcribed, whereas heterochromatin is denser and its genes are not expressed or only poorly expressed. This division of chromatin into two types allows for an additional layer of regulation for the expression of genes. This is important for multicellular organisms as they feature different cell types for which different genes need to be accessible. This is reflected in the distribution of their heterochromatic regions. We do not discuss here how cell differentiation is established but ask instead a much simpler question: How is the pattern of eu- and heterochromatin faithfully transmitted from one cell generation to the next? This is important, for instance, in the growth of a tissue where all cells are of the same type. In a moment it will become clear that the challenge lies in isolating the euchromatic and heterochromatic regions from each other, such that one type of chromatin cannot spread into the other and vice versa. We shall speculate that the cells solve this problem by using a macromolecular condensate that ensures the faithful transmission of eu- and heterochromatin through the cell generations.

First we need to understand how heterochromatin is different from euchromatin. A main distinction is the chemical state of its nucleosomes. As mentioned earlier, the core of the nucleosomes is composed of eight histone proteins. Each protein is a polymer and therefore has two ends. These two ends are chemically different because of the chemical structure of the protein backbone (discussed later in Section 6.2) and they are called N-terminus and C-terminus, as they end with a nitrogen-containing amino group and a carbon-containing carboxyl group, respectively. As it happens, relatively large sections of the N-termini of the histone proteins are unstructured, i.e., they do not fold into particular conformations. Because of this, these so-called histone tails cannot be resolved in the nucleosome crystal structures, see, e.g., Fig. 1.8, where only small sections of some tails are visible. The histone tails are often chemically modified in the cell. These so-called posttranslational modifications can change the charge of an amino acid, for instance. To give a specific example: Lysine, which is positively charged in its unmodified form, is often acetylated, a
modification that neutralizes its charge. Some of these modifications cause certain proteins to bind to nucleosomes that carry these modifications. This is also the case for heterochromatin. Nucleosomes that are part of heterochromatin carry a specific mark on the N-terminal tail of one of their histones, histone H3. The modification is called H3K9me, which means that the 9th amino acid of histone H3 (counted from the start of the protein, the N-terminus) is a lysine (. . . K9. . . ) and that it is methylated (. . . me). To be precise, the octamer of the nucleosome is composed of eight proteins, two H2A, two H2B, two H3 and two H4 proteins, and it is only the H3 proteins that carry this specific mark. The H3K9me mark is important because it serves as a binding site for a protein called HP1 (heterochromatin protein 1).

HP1 is a crucial part of this story. This protein contains two structured domains called chromodomain and chromoshadow domain. The former domain binds to nucleosomes carrying the H3K9me mark whereas the latter domain serves as a dimerization interface between HP1 proteins. These two domains are connected by an unstructured hinge, and the N- and C-termini are unstructured as well. Remarkably, HP1 by itself can form droplets in the test tube (Larson et al., 2017; Strom et al., 2017), caused by weak interactions between these molecules. How this works in detail is not important here but HP1 seems to show the above-mentioned characteristic features of proteins that form macromolecular condensates. In the cell, where such condensates have been observed as well, we can imagine H3K9me nucleosomes to be localized within these droplets as this allows interactions between the HP1 chromodomains and the H3K9me marks.

What could be the biological reason behind this slightly intricate mechanism? One could imagine a (seemingly) much simpler scenario where nucleosomes in heterochromatin and nucleosomes in euchromatin carry different kinds of modifications in their histone tails. A sufficient number of such modifications could modulate the interactions between these two types of nucleosomes such that they would spontaneously phase separate in a manner similar to what was described above, thereby neatly forming eu- and heterochromatic regions. In fact, such effects might actually be in place, to some extent, as, e.g., regions that contain actively transcribed
genes feature a higher content of nucleosomes with acetylated tails compared to nucleosomes in heterochromatin. Here we disregard this effect, assuming that it is of less importance than the one involving HP1. Now we come to the heart of the problem that nature needs to solve. When a cell divides, the DNA gets duplicated. But what happens to the nucleosomes? Before cell division, the DNA is fully decorated with nucleosomes; the nucleosomes inside heterochromatin carry the H3K9me marks. It is believed that during duplication the nucleosomes are randomly distributed between the two DNA copies. This means that the chromatin of a given daughter cell carries only half of the original nucleosomes. The missing nucleosomes will be quickly replaced, filling all the gaps that resulted from the duplication. The problem, however, is that there seems to be no mechanism in place to put histone proteins with the right chemical modifications into these gaps. We can therefore assume that after one cell division the heterochromatic fraction of nucleosomes with H3K9me marks drops from one to 1/2. If there is no mechanism to reset this fraction to one, after a few more cell divisions it would be impossible to tell which part of the chromatin of the original cell was in the heterochromatic state and which was in the euchromatic state. What is needed is an enzyme that converts unmethylated nucleosomes in heterochromatin back again into nucleosomes carrying the H3K9me marks. Such an enzyme exists and is called H3K9 methylase. But how does this enzyme “know” which nucleosomes it should modify and which nucleosomes it should keep unchanged? Without any mechanism in place, it would just randomly methylate nucleosomes and thereby transform regions that were euchromatic into heterochromatin. Any difference between hetero- and euchromatin would “melt” away within a few generations. We speculate now that HP1 is the solution to this problem. As we know, it binds to H3K9me nucleosomes. 
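The dilution of the marks in the absence of any restoring mechanism can be illustrated with a toy simulation. The sketch below is purely illustrative: it assumes that each nucleosome of a heterochromatic stretch is inherited by a given daughter cell with probability 1/2 and is otherwise replaced by an unmethylated one; the stretch length and the function names are hypothetical choices.

```python
import random


def divide(marks, rng):
    """One cell division: each nucleosome is kept with probability 1/2,
    otherwise it is replaced by a new, unmethylated nucleosome (0)."""
    return [m if rng.random() < 0.5 else 0 for m in marks]


def marked_fraction(marks):
    """Fraction of nucleosomes carrying the H3K9me mark (1)."""
    return sum(marks) / len(marks)


rng = random.Random(1)
hetero = [1] * 10000              # fully methylated heterochromatic stretch
history = [marked_fraction(hetero)]
for generation in range(5):
    hetero = divide(hetero, rng)
    history.append(marked_fraction(hetero))

for n, fraction in enumerate(history):
    print(f"generation {n}: marked fraction = {fraction:.3f}, 2^-n = {0.5 ** n:.3f}")
```

In this toy model the marked fraction halves with every division, so that after a handful of generations the heterochromatic stretch can hardly be told apart from unmethylated chromatin.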
We can imagine that HP1 droplets reel in all the H3K9me nucleosomes and thus contain all the heterochromatin. Even after a cell division, with half of the methylated nucleosomes being gone, the driving force might still be large enough to pull the heterochromatic chromatin stretches into the HP1 condensates. We might therefore imagine a
Figure 2.17 A block copolymer containing blocks of yellow and green monomers at a strongly selective interface. This system might resemble the arrangement of eu- and heterochromatin at the surface of HP1 droplets.
liquid-liquid phase separation, with one phase consisting of HP1 and methylated nucleosomes (together with their unmethylated replacements which are connected to them via linker DNA) and another phase containing only unmethylated nucleosomes.

Now it is crucial to understand an analogy to a problem in polymer physics called "block copolymer at a strongly selective interface." We discuss polymers (long molecules composed of chains of small units called monomers) in the next chapter but we need to mention them briefly here already. A special type of polymer, the block copolymer, is of interest here. A block copolymer is a polymer that contains stretches made from one type of monomer, say yellow, and other stretches made from another type of monomer, say green. For simplicity, we assume here the monomers to be exactly the same molecules as the ones discussed further above, with just one modification: They are connected into a one-dimensional block copolymer chain. Now suppose we are in a parameter range where the yellow and green molecules separate into two liquid phases, see the region called "demixed" in Fig. 2.15 (bottom). In this situation, the yellow blocks of the polymer will reside in the yellow phase and the green blocks in the green phase (Sommer and Daoud, 1996). Parameters might be such that the volume fraction φ of the yellow molecules is close to one in the yellow phase and close to zero in the green phase; in Fig. 2.15 this would correspond to low temperatures. In this situation practically all monomers of the block copolymer will reside in their corresponding phases. The block copolymer is strongly pinned at the interface at all those locations where it changes its color, i.e., at the boundaries between its blocks, see Fig. 2.17. This is the above-mentioned system: a block copolymer at a strongly selective interface.

Now we are nearly there. After cell division the HP1 droplets form. Let us say they form the yellow phase. Chromatin can be considered to be a block copolymer consisting of heterochromatin, whose H3K9me nucleosomes prefer the yellow phase, and euchromatin, which prefers the green phase. The chromatin polymer will be pinned at the interface such that the heterochromatin is inside the HP1 droplet whereas the euchromatin remains outside. The non-methylated nucleosome replacements in the heterochromatin are forced along with their methylated neighbors into the droplets. Now methylase molecules are flushed into the cell. Let us assume that they cannot act outside the droplets as they need the physical environment provided by the HP1 molecules to show their enzymatic activity. Thus the methylase acts only inside the droplets. In this way the methylase molecules find all the replacement nucleosomes and convert them back into the H3K9me state. The result of this mechanism is that the division of chromatin into heterochromatic and euchromatic regions is faithfully transmitted from one cell generation to the next. Even though this is as yet a purely speculative scenario, it suggests that nature might combine liquid-liquid phase separations and polymer physics to solve difficult problems, here specifically to transmit some aspects of its so-called epigenetic state to its daughter cells.
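The logic of this speculative scenario can be played through in a toy model. The sketch below is not a model of real chromatin; it only contrasts a methylase that acts exclusively inside droplets with one that acts everywhere. All ingredients are assumptions made for illustration: the block length, the rule that a block is pulled into an HP1 droplet as long as more than a quarter of its nucleosomes are still marked, and the number of random methylation events per generation.

```python
import random

BLOCK = 200  # nucleosomes per block (illustrative value)


def divide(marks, rng):
    """Each nucleosome is inherited with probability 1/2, else replaced by an unmarked one."""
    return [m if rng.random() < 0.5 else 0 for m in marks]


def methylate_in_droplets(marks, threshold=0.25):
    """Restore the marks in every block that is still marked densely enough to
    be pulled into an HP1 droplet (toy criterion: marked fraction > threshold)."""
    out = []
    for start in range(0, len(marks), BLOCK):
        block = marks[start:start + BLOCK]
        inside = sum(block) / len(block) > threshold
        out.extend([1] * len(block) if inside else block)
    return out


def methylate_everywhere(marks, n_events, rng):
    """Unselective methylase: marks are added at randomly chosen positions."""
    out = list(marks)
    for _ in range(n_events):
        out[rng.randrange(len(out))] = 1
    return out


def block_fractions(marks):
    """Marked fraction of each block along the chain."""
    return [sum(marks[s:s + BLOCK]) / BLOCK for s in range(0, len(marks), BLOCK)]


rng = random.Random(0)
pattern = ([1] * BLOCK + [0] * BLOCK) * 2   # hetero, eu, hetero, eu
selective, unselective = list(pattern), list(pattern)
for generation in range(10):
    selective = methylate_in_droplets(divide(selective, rng))
    unselective = methylate_everywhere(divide(unselective, rng),
                                       len(pattern) // 4, rng)

print("droplet-restricted:", block_fractions(selective))
print("unselective:       ", block_fractions(unselective))
```

In this toy model the droplet-restricted methylase restores the original block pattern in every generation, whereas the unselective methylase drives all blocks toward the same intermediate level of methylation, so that the distinction between the blocks is lost.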
Previous ideas in the literature like “barrier insulators” that prevent the “spreading” of eu- or heterochromatin along the one-dimensional DNA molecule sounded like magic before since chromatin is folded in three dimensions. The scenario described above might put them on a solid physical footing. What acts as an insulator is here the strongly selective two-dimensional interface to which the one-dimensional polymer is pinned as it crosses back and forth through the surface of the HP1 droplet whenever it changes from eu- to heterochromatin and back. Polymers have now entered
the stage and will continue to be the main actors in most of the following chapters.
Problems

2.1 Lagrange multipliers
In Section 2.3 we used the method of the Lagrange multiplier to derive the Boltzmann distribution, Eq. 2.58. The role of the Lagrange multiplier was to ensure the constraint of a given average energy \(\langle H \rangle = E\), Eq. 2.53. There is a second constraint for the probability distribution, namely that it is normalized to one. We were somewhat sloppy by putting this in by hand when going from Eq. 2.57 to 2.58. Derive Eq. 2.58 with two Lagrange multipliers to account for the two constraints.

2.2 Second virial coefficient
Consider a dilute gas of penetrable spheres with a box-like attraction. Their interaction potential w(r) is given by +W for 0 ≤ r < D, by −U for D ≤ r ≤ A and by zero otherwise (with W, U, D and A being positive numbers with D < A).
(i) Calculate the second virial coefficient \(B_2 = -\frac{1}{2} \int \left( e^{-\beta w(r)} - 1 \right) d^3 r\).
(ii) Give the condition on the set of values W, U, D and A for which there exists a finite value of β with B2 = 0 (Warning: this calculation is a bit challenging).
(iii) Does such a gas with B2 = 0 behave like an ideal gas?

2.3 Virial expansion up to third order
Derive the virial expansion up to third order, i.e., redo the steps from Eq. 2.80 to Eq. 2.84 and notice the increase in complexity by just going this one step further.

2.4 van der Waals equation
The van der Waals equation of state is an ingeniously simple ad hoc approach that gives a qualitative description of the equation of state of a real substance including its gas-liquid phase transition. It was introduced by Johannes van der Waals in his thesis "Over de continuïteit van den gas- en vloeistoftoestand" (Leiden University, 1873). Van der Waals postulated the existence of atoms (controversial at the time) and, even more, that they have excluded volume and attract each other. In the main text we discussed a qualitatively similar approach based on the virial expansion (Eq. 2.93) but here we go back to van der Waals' original treatment of the problem. He modified the ideal gas equation pυ = kB T as follows: υ → υ − υ0 to account for the excluded volume of particles and p → p + a/υ², which means that the pressure is effectively reduced due to the attraction between particles and that this reduction should be proportional to n². This leads to the van der Waals equation:

\[ \left( p + \frac{a}{\upsilon^2} \right) (\upsilon - \upsilon_0) = k_B T. \tag{2.115} \]

(i) Sketch isotherms in the pressure versus volume per particle plot for various temperatures.
(ii) Determine the critical point and temperature.
(iii) For real substances one typically finds \(k_B T^* / (p^* \upsilon^*) \approx 3.4\). How does this compare to the van der Waals gas?

2.5 Maxwell and common tangent construction
In Section 2.4 we used the Maxwell construction to determine the pressure and densities of the liquid and gaseous phases in coexistence, see Fig. 2.11. On the other hand, in Section 2.5 we used the common tangent construction to determine the volume fractions of the yellow molecules in the two liquid phases in coexistence, see Fig. 2.14. Are these two approaches different or are they actually one and the same construction? Hint: Think about one of the systems, e.g., the system featuring the liquid-gas coexistence, and try to apply the other method.

2.6 Verifying the ideal gas law
Write a computer program (e.g., in Python) that simulates an ideal gas in a box and show that it obeys the ideal gas law. In the following, we list some steps you can take to write the program and ask some questions concerning the physics used.
(a) The gas is contained in a cube of volume V = L × L × L (L measures length in units of meters; a reasonable choice is L = 0.1). The position of particle i is given by (xi, yi, zi) with 0 < xi, yi, zi < L. There are N particles (a reasonable choice for N is N = 1000). Set up the initial situation such that the particles are randomly distributed inside the box. Hint: The components of the positions of all the particles can be contained in one NumPy array.
(b) Assign random velocities to the particles. To do this, you need to choose a temperature T; you might choose T = 300 (in Kelvin). You also need to set a certain mass m per particle, e.g., m = 5.31 × 10^−26 (in kg). What kind of particles are these? For each velocity component draw a random number from a Gaussian distribution with mean zero and width \(\sqrt{k_B T/m}\) with the Boltzmann constant kB = 1.38 × 10^−23 (in J/K). Give an argument for this choice of initial velocities. Hint: You can use the numpy.random.normal function.
(c) Now propagate the particles by a small time interval Δt by calculating the new positions from the old positions, e.g., x → x + υΔt where x and υ are the arrays storing all the position and velocity components. The time interval Δt should be chosen small enough such that the typical velocity component \(\sqrt{k_B T/m}\) times Δt is much smaller than the box size L, i.e., \(\Delta t \ll L/\sqrt{k_B T/m}\) (100 times smaller is a reasonable choice). After a few time steps the particles would fly out of your box. As the next step you need to implement a subroutine that reflects the particles at the walls. A simple way to do this is to check, after having calculated each new position, whether the particle is now outside the box (separately in x-, y- and z-directions). How would you implement the reflection at the walls, assuming that the collisions with the wall are elastic?
(d) To test whether your simulation works correctly, you could plot trajectories to see whether your particles are indeed reflected at the wall. For this you can store the trajectory in an array and make a scatter plot of the trajectories of all particles for a relatively short time interval. It is also helpful to check whether the total kinetic energy is preserved during the simulation and whether it agrees with the theoretical value. What is this theoretical value?
(e) Once you have a working simulation of the ideal gas, you need to measure the pressure p in order to verify the ideal gas law. Pressure is exerted on the walls when particles get reflected by them. In the following, you calculate the average of the total force exerted by all the particles on one of the surfaces, e.g., the one at x = L.
(f) Calculate the total force on that surface by summing over the forces from all reflections at that wall during your full simulation time \(t_{\mathrm{sim}} \gg \Delta t\) (e.g., tsim = 1000Δt). Argue that each such collision contributes a force \(2 m \upsilon_x / \Delta t\) that acts during the time interval Δt. Simply sum over all forces from all collisions at the x = L wall over your whole simulation time; let us call this sum "forces."
(g) At the end of the simulation you find the average force F from the sum "forces" by F = forces × Δt/tsim. How is the pressure p computed after this? Does it depend on Δt?
(h) Make a plot (including labels) of pressure p vs. volume V. Measure the pressure for 10 different box sizes from L = 0.1 to L = 0.55 (in meters). Plot in the same figure the theoretical curve, p = NkB T/V, see Eq. 2.27.
(i) Make a plot (including labels) of pressure p vs. temperature T. Measure the pressure for 10 different temperatures from T = 100 to T = 550 (in Kelvin). Again plot in the same figure the theoretical curve, p = NkB T/V.
(j) Redo the two plots, now adding error bars to the data points, providing you with an estimate of the error. To do this, repeat the simulations k times (e.g., k = 100), each time starting with new initial conditions. So for each volume (or temperature) you perform k independent "measurements." From this, you calculate the mean of these k values and the standard error of the mean. The latter quantity is the standard deviation σ of your sample (see Eq. A.5 for its definition) divided by \(\sqrt{k}\). So the more measurements you take, the better your estimate of the mean pressure.
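As a starting point for Problem 2.6, the steps (a) to (g) can be condensed into a short NumPy skeleton. This is only one possible sketch under the parameter suggestions of the problem; the function name, the use of numpy.random.default_rng and the bookkeeping of the momentum transferred to the wall are choices made here, and the plots, the tests of step (d) and the error analysis are left to the reader.

```python
import numpy as np

KB = 1.38e-23  # Boltzmann constant in J/K


def simulate_pressure(n=1000, box=0.1, temp=300.0, mass=5.31e-26,
                      n_steps=1000, seed=0):
    """Return the pressure on the wall x = box of an ideal gas in a cubic box."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(KB * temp / mass)           # width of each velocity component
    pos = rng.uniform(0.0, box, size=(n, 3))    # (a) random initial positions
    vel = rng.normal(0.0, sigma, size=(n, 3))   # (b) Gaussian velocities
    dt = box / sigma / 100.0                    # (c) time step, dt << box / sigma
    momentum = 0.0                              # momentum transferred to the wall x = box
    for _ in range(n_steps):
        pos += vel * dt                         # (c) free flight
        hit_right = pos[:, 0] > box             # (f) collisions with the wall x = box
        momentum += 2.0 * mass * vel[hit_right, 0].sum()
        over = pos > box                        # (c) elastic reflection, upper walls
        pos[over] = 2.0 * box - pos[over]
        vel[over] *= -1.0
        under = pos < 0.0                       # elastic reflection, lower walls
        pos[under] = -pos[under]
        vel[under] *= -1.0
    force = momentum / (n_steps * dt)           # (g) time-averaged force on the wall
    return force / box ** 2                     # pressure = force per area


pressure = simulate_pressure()
ideal = 1000 * KB * 300.0 / 0.1 ** 3
print(f"simulated p = {pressure:.3e} Pa, ideal gas law p = {ideal:.3e} Pa")
```

For a single run with N = 1000 particles the simulated pressure scatters by a few percent around the ideal gas value; repeating the run with different seeds, as suggested in step (j), gives an estimate of this error.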
Chapter 3
Polymer Physics
DNA, RNA and proteins are all polymers. Polymers are extremely long molecules that are obtained by polymerization reactions where thousands or even millions of identical or similar units, so-called monomers, are linked together into one-dimensional chains. Compared to synthetic polymers, which usually feature identical monomers linked into so-called homopolymers, biological polymers are formed from a set of a few different monomers, leading to so-called hetero- or copolymers. Three typical examples are given in Fig. 3.1. From a physical point of view, the heterogeneity of biological macromolecules is particularly important if the sequence is designed to induce a certain folded structure, as is the case with proteins and sometimes with RNA. On the other hand, the DNA base pair heterogeneity has only a minor impact on the DNA conformation. Before going into the intricacies of biological polymers, we focus in this section on generic features of homopolymers. This will help us later to appreciate the remarkable properties of biological macromolecules.

Figure 3.1 Sequences of polymers: a homopolymer made from A-monomers (top), a random copolymer composed of A- and B-units (middle) and an alternating copolymer (bottom).
3.1 Random Walks

Since polymers are extremely long with each bond featuring some flexibility, they are characterized by a practically infinite number of conformations which all have the same energy to a good approximation. This means that polymers are mainly governed by entropy rather than energy. A rather surprising feature of polymers is that their large-scale properties are essentially independent of the microscopic details, as we shall demonstrate later. This means that we can use any reasonable model to describe a polymer on length scales that are large compared to the size of its actual monomers. The simplest of those reasonable approximations is to describe a polymer configuration as a random walk (RW) on a periodic lattice. An example configuration is shown in Fig. 3.2, where we chose a two-dimensional square lattice for simplicity. The end-to-end vector is then given by

\[ \mathbf{R} = \sum_{i=1}^{N} \mathbf{a}_i \tag{3.1} \]
where ai denotes the bond vector of the i-th segment. Each bond connects two neighboring monomers which are thought to "live" on the nodes of the lattice. The length of each segment is a, corresponding to the spacing (the lattice constant) of the underlying lattice. We assume that the orientation of each bond is completely independent of the orientations of the other bonds; thus we even allow for the case that a bond folds back onto the previous bond. The summation in Eq. 3.1 goes over all N bond vectors; N (or more precisely N + 1) is called the degree of polymerization, the number of monomers in the chain.
Random Walks
In the spirit of statistical physics we are interested in averaged quantities rather than in one special realization. Let us therefore try to estimate the typical size of such a so-called polymer coil. For this we average over all different conformations. Since all these conformations have the same energy, they are all equally likely to occur, cf. Eq. 2.58. First, let us average the end-to-end vector itself:

$$\langle \mathbf{R} \rangle = \left\langle \sum_{i=1}^{N} \mathbf{a}_i \right\rangle = \sum_{i=1}^{N} \langle \mathbf{a}_i \rangle = 0. \tag{3.2}$$
Unfortunately, the average vanishes since every bond vector $\mathbf{a}_i$ points in every direction with the same probability (on a square lattice 4 directions, each with probability 1/4), leading to $\langle \mathbf{a}_i \rangle = 0$. The quantity $\langle \mathbf{R} \rangle$ is thus not a good measure for the typical coil size. What turns out to be a good choice is the mean-squared end-to-end distance

$$\langle \mathbf{R}^2 \rangle = \sum_{i,j=1}^{N} \langle \mathbf{a}_i \cdot \mathbf{a}_j \rangle = \sum_{i=1}^{N} \langle \mathbf{a}_i^2 \rangle + \underline{\sum_{i \neq j} \langle \mathbf{a}_i \cdot \mathbf{a}_j \rangle} = a^2 N \tag{3.3}$$

where the underlined term vanishes since the directions of different bonds are assumed to be uncorrelated, $\langle \mathbf{a}_i \cdot \mathbf{a}_j \rangle = 0$ for $i \neq j$. What we have found here is that the typical size of the polymer coil grows as the square root of the degree of polymerization:

$$R \sim N^{1/2}. \tag{3.4}$$
This is a so-called scaling law. As we shall see, scaling laws are characteristic for polymers (see also de Gennes’ beautiful monograph on that subject (de Gennes, 1979)). So far we considered a free polymer coil. It is interesting and, as we shall see in later chapters, experimentally feasible to constrain the possible configurations by imposing the end-to-end vector R of a polymer. In that case there are far fewer configurations possible, namely only the ones where the sum of all bond vectors, Eq. 3.1, just adds up to that imposed vector. When one changes R, the energy does not change, only the set of configurations consistent with the given R-value. Naively one might expect that when one e.g., increases the end-to-end distance R = |R|, one does not feel any force counteracting the polymer extension because its internal energy stays constant. However, this is not true because there is a
dramatic change in entropy involved. To see this, let us calculate the entropy of the chain as a function of its end-to-end vector. According to Eq. 2.51 this equals

$$S(\mathbf{R}) = k_B \ln\left[\mathcal{N}_N(\mathbf{R})\right]. \tag{3.5}$$

Figure 3.2 Random walk on a periodic lattice.
Here $\mathcal{N}_N(\mathbf{R})$ represents the number of distinct $N$-step RWs with end-to-end vector $\mathbf{R}$, i.e., the number of microstates belonging to the given macrostate $\mathbf{R}$. How can we estimate $\mathcal{N}_N(\mathbf{R})$? This is surprisingly straightforward. Let us introduce $p_N(\mathbf{R})$, the probability that a given RW happens to have the end-to-end vector $\mathbf{R}$. This probability is just the ratio of $\mathcal{N}_N(\mathbf{R})$ to the total number of distinct RWs, which is simply given by $\sum_{\mathbf{R}'} \mathcal{N}_N(\mathbf{R}')$. Here the summation goes over all possible end-to-end vectors. It is straightforward to calculate this sum. For a lattice where each site has $z$ neighbors ($z$ is called the coordination number of the lattice), there are $z$ possibilities for each step. The total number of walks thus follows to be

$$\sum_{\mathbf{R}'} \mathcal{N}_N(\mathbf{R}') = z^N. \tag{3.6}$$
We come back to this result below. But for now we actually do not need it. Instead we know $p_N(\mathbf{R})$ directly: for large $N$ it is to a very good approximation given by

$$\begin{aligned} p_N(\mathbf{R}) &= \frac{\mathcal{N}_N(\mathbf{R})}{\sum_{\mathbf{R}'} \mathcal{N}_N(\mathbf{R}')} \\ &\approx \frac{1}{\sqrt{2\pi \langle x^2 \rangle}} e^{-\frac{x^2}{2\langle x^2 \rangle}} \, \frac{1}{\sqrt{2\pi \langle y^2 \rangle}} e^{-\frac{y^2}{2\langle y^2 \rangle}} \, \frac{1}{\sqrt{2\pi \langle z^2 \rangle}} e^{-\frac{z^2}{2\langle z^2 \rangle}} \\ &= \frac{1}{\left(2\pi a^2 N/3\right)^{3/2}} e^{-\frac{3R^2}{2a^2 N}}. \end{aligned} \tag{3.7}$$

How do we know this? In the second line of Eq. 3.7 we used the central limit theorem, which states that the sum of a sufficiently large number of independent random variables can be approximated well by a Gaussian distribution (see Appendix B). Equation 3.7 is written down for the specific case of a three-dimensional cubic lattice where each bond vector can point in 6 directions. As a result one has e.g., $\langle x^2 \rangle = a^2 N/3$ because on average only one third of the bond vectors are parallel to the X-direction. Equation 3.7 tells us that as the end-to-end distance increases, fewer and fewer configurations become available. In fact, the most likely case is where the two ends lie on top of each other, $\mathbf{R} = 0$. By combining Eqs. 3.5 and 3.7 we obtain the entropy of the polymer

$$S(\mathbf{R}) = S_0 - \frac{3 k_B}{2 a^2 N} R^2 \tag{3.8}$$

and from that its free energy

$$F(\mathbf{R}) = E - T S(\mathbf{R}) = F_0 + \frac{3 k_B T}{2 a^2 N} R^2. \tag{3.9}$$

This looks precisely like Hooke’s law which describes the energy involved for small deformations of an elastic spring. In that case the mechanical energy of the spring as a function of the deformation $\Delta z = z - z_0$ from the unstrained conformation $z_0$ goes like $E(\Delta z) = (C/2)\,\Delta z^2$. In the polymer case, however, this law is not of mechanical but of entropic origin as is reflected in the $T$-dependence of the spring constant $C = 3 k_B T/(a^2 N)$. The higher the temperature, the higher $C$, i.e., the stiffer the chain. Analogous to Eq. 2.61 the force felt when imposing an end-to-end distance $z$ in the Z-direction is given by

$$f = \frac{\partial F}{\partial z} = \frac{3 k_B T}{a^2 N} z. \tag{3.10}$$
Figure 3.3 The same RW representation of a polymer as in Fig. 3.2 but now with a force acting on the end monomers. The underlying lattice is not depicted.
The difference in the minus sign compared to Eq. 2.61 reflects the fact that the force with which the chain pulls is inwards (trying to make $z$ smaller, see Fig. 3.3), whereas the pressure of the gas in Fig. 2.3 is pushing the piston outwards. Assume now that the polymer chain is under a tension $f$ applied at its ends in the Z-direction, see also Fig. 3.3. In this case the proper thermodynamic potential is the free enthalpy, Eq. 2.73:

$$G(f) = F_1 + \frac{3 k_B T}{2 a^2 N} z(f)^2 - f z(f) = F_1 - \frac{a^2 N}{6 k_B T} f^2. \tag{3.11}$$

Here we lumped all $z$-independent terms, $F_0$ and the $x^2$- and $y^2$-terms, together into a new constant $F_1$. $z(f)$ follows from inverting Eq. 3.10. The average end-to-end distance in the Z-direction follows analogously to Eq. 2.74:

$$\langle z \rangle = -\frac{\partial G}{\partial f} = \frac{a^2 N}{3 k_B T} f. \tag{3.12}$$
As is typical for a Hookean spring, force and extension are linearly related. But note again the surprising finding that at fixed force the end-to-end distance shrinks with increasing temperature. To stress the fact that this elasticity is of purely entropic origin, polymers are said to constitute entropic springs.
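The results of this section can be checked numerically. The following is a minimal sketch, assuming a simple cubic lattice with coordination number $z = 6$; the function names `mean_squared_end_to_end` and `entropic_spring_constant` are our own illustrative choices and do not appear in the text. It samples random walks and estimates $\langle \mathbf{R}^2 \rangle$ and $\langle x^2 \rangle$, which should approach $a^2 N$ (Eq. 3.3) and $a^2 N/3$ (used in Eq. 3.7), and it evaluates the spring constant of Eq. 3.9.

```python
import random

# The six unit steps of a simple cubic lattice (coordination number z = 6).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def mean_squared_end_to_end(n_bonds, n_walks, a=1.0, seed=0):
    """Estimate <R^2> and <x^2> by sampling n_walks lattice random walks."""
    rng = random.Random(seed)
    sum_r2 = 0.0
    sum_x2 = 0.0
    for _ in range(n_walks):
        x = y = z = 0
        for _ in range(n_bonds):
            dx, dy, dz = rng.choice(STEPS)  # each direction has probability 1/6
            x += dx
            y += dy
            z += dz
        sum_r2 += a * a * (x * x + y * y + z * z)
        sum_x2 += a * a * x * x
    return sum_r2 / n_walks, sum_x2 / n_walks


def entropic_spring_constant(n_bonds, a, kBT):
    """Spring constant C = 3 kB T / (a^2 N) read off from Eq. 3.9."""
    return 3.0 * kBT / (a * a * n_bonds)


if __name__ == "__main__":
    for n in (10, 40, 160):
        r2, x2 = mean_squared_end_to_end(n, n_walks=2000)
        # Both ratios should be close to 1 within statistical noise.
        print(n, r2 / n, x2 / (n / 3))
```

Quadrupling $N$ should roughly quadruple $\langle \mathbf{R}^2 \rangle$, i.e., double the typical coil size, in line with the scaling law of Eq. 3.4.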
3.2 Freely Jointed and Freely Rotating Chains

The RW polymer model lives on a lattice, which is quite an artificial assumption. Moreover, the force-extension relation, Eq. 3.12, predicts that the end-to-end distance goes to arbitrarily large values for sufficiently strong forces whereas an $N$-step random walk can never be stretched beyond its contour length $aN$. You might thus be worried that we went to a simplified model too quickly and that our results might be questionable. We now take a close look at two exactly solvable polymer models that live in continuous three-dimensional space. They will help us to understand in which sense the RW model above actually provides satisfactory predictions for polymer chains.

We start with the freely jointed chain. It consists of a chain of stiff bonds that are connected via totally flexible joints, see Fig. 3.4. Each bond can thus point in any direction in space. A conformation is given by the set $\{\mathbf{R}_0, \mathbf{R}_1, \ldots, \mathbf{R}_N\}$ of positions of the monomers that are thought to be co-localized with the joints. From the positions follow the bond vectors $\mathbf{r}_i = \mathbf{R}_i - \mathbf{R}_{i-1}$ which all have the fixed bond length $|\mathbf{r}_i| = b$. The end-to-end vector is $\mathbf{R} = \sum_{i=1}^{N} \mathbf{r}_i$. As above for the RW, we study the average $\langle \ldots \rangle$ over all possible conformations, but the averaging is now over a continuum of possible directions for each bond. Also here, as for the RW, we immediately obtain from $\langle \mathbf{r}_i \rangle = 0$ and $\langle \mathbf{r}_i \cdot \mathbf{r}_j \rangle = b^2 \delta_{ij}$ that $\langle \mathbf{R} \rangle = 0$ and

$$\langle \mathbf{R}^2 \rangle = b^2 N. \tag{3.13}$$

This looks just like the result of the RW model, Eq. 3.3. This is a first hint that polymers show universal behavior; in this case that polymer chains always feature the scaling law $R \propto N^{1/2}$, independent of the details of the underlying model.

Figure 3.4 Section of a freely jointed chain. The orientation of the $i$-th bond, $\mathbf{r}_i$, is characterized by the two angles $\theta_i$ and $\phi_i$.
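Next, we calculate the end-to-end distance under an externally applied force $f$ in the Z-direction, see Fig. 3.5. We use a different approach than in the RW model, namely we start from the Hamiltonian of the full system:

$$H = -f b \sum_{i=1}^{N} \cos\theta_i = -f z \tag{3.14}$$

with $\theta_i$ denoting the angle between $\mathbf{r}_i$ and the Z-direction, see Fig. 3.5. The term $b \cos\theta_i$ in the sum gives the Z-component of the $i$-th bond and the sum over all components amounts to $z$. The partition function follows then by integrating over all orientations of all the bonds:

$$\begin{aligned} Z &= \int_0^{2\pi} d\phi_1 \ldots d\phi_N \int_0^{\pi} d\theta_1 \ldots d\theta_N \, \sin\theta_1 \ldots \sin\theta_N \, e^{-\beta H} \\ &= (2\pi)^N \int_{-1}^{1} e^{\beta b f \sum_i \cos\theta_i} \, d\cos\theta_1 \ldots d\cos\theta_N \\ &= (2\pi)^N \left( \int_{-1}^{1} e^{\beta b f u} \, du \right)^N = \left( \frac{4\pi}{\beta b f} \right)^N \sinh^N(\beta b f). \end{aligned} \tag{3.15}$$

In the first line, we have used spherical coordinates to describe the orientation of the bonds, introducing besides the $\theta_i$’s also the azimuthal angles $\phi_i$, see Fig. 3.4. In the second line we integrated out the $\phi_i$-angles, each contributing a factor $2\pi$, and used the standard trick $d\cos\theta_i = -\sin\theta_i \, d\theta_i$. Having the partition function we can calculate the average end-to-end distance as a function of force. Since

$$\langle z \rangle = \left\langle b \sum_{i=1}^{N} \cos\theta_i \right\rangle = \frac{(2\pi)^N}{Z} \int_{-1}^{1} d\cos\theta_1 \ldots d\cos\theta_N \, \sum_{i=1}^{N} b \cos\theta_i \, e^{\beta b f \sum_j \cos\theta_j} \tag{3.16}$$

this average follows directly from the partition function, Eq. 3.15, by differentiation:

$$\begin{aligned} \langle z \rangle &= \frac{1}{\beta} \frac{1}{Z} \frac{\partial Z}{\partial f} = \frac{1}{\beta} \frac{\partial}{\partial f} \ln Z = \frac{N}{\beta} \frac{\partial}{\partial f} \ln\left[ \frac{1}{f} \sinh(\beta b f) \right] \\ &= bN \left( \coth(\beta b f) - \frac{1}{\beta b f} \right) = bN L(\beta b f) \\ &\approx \begin{cases} \dfrac{b^2 N}{3 k_B T} f & \text{for } \beta b f \ll 1 \\[2mm] bN - \dfrac{N}{\beta f} & \text{for } \beta b f \gg 1. \end{cases} \end{aligned} \tag{3.17}$$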
Figure 3.5 A freely jointed chain under tension.
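The force-extension relation of Eq. 3.17 is easy to evaluate numerically. The following is a minimal sketch; the function names `langevin` and `fjc_extension` and the small-argument cutoff are our own assumptions. Because $\coth x - 1/x$ suffers from cancellation near $x = 0$, the sketch switches to the first two terms of the Taylor series, $x/3 - x^3/45$, for very small arguments.

```python
import math


def langevin(x):
    """coth(x) - 1/x, using the series x/3 - x^3/45 for very small |x|."""
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


def fjc_extension(f, n_bonds, b, kBT):
    """Mean extension <z> = b N L(b f / kB T) of a freely jointed chain, Eq. 3.17."""
    return b * n_bonds * langevin(b * f / kBT)


if __name__ == "__main__":
    n_bonds, b, kBT = 100, 1.0, 1.0
    for f in (0.01, 0.1, 1.0, 10.0, 100.0):
        linear = b * b * n_bonds * f / (3.0 * kBT)  # entropic spring limit
        print(f, fjc_extension(f, n_bonds, b, kBT), linear)
```

For small forces the two printed columns agree, whereas for large forces the freely jointed chain saturates below the contour length $bN$ while the linear law keeps growing.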
Here $L(x) = \coth x - 1/x$ is called the Langevin function. We can now see that for sufficiently small forces we recover the linear entropic spring behavior, Eq. 3.12. Since we did not use any approximations we also recover a sensible result for high forces, where the end-to-end distance approaches that of a fully extended chain, $bN$. Remarkably, the problem we just calculated is mathematically identical to classical paramagnetism. If we let the spins in Fig. 2.6 point in any direction in space (not just up or down), then a system of $N$ such non-interacting magnetic dipoles in a magnetic field is mathematically identical to a freely jointed chain under tension. The tension corresponds to the magnetic field, and the extension to the magnetization. The low-force case in Eq. 3.17 translates into the so-called Curie’s law; the high-force case is commonly referred to as the saturation of the spins.

When comparing the mean-squared end-to-end distance of the RW, Eq. 3.3, and of the freely jointed chain, Eq. 3.13, we noticed the same scaling, $R \sim N^{1/2}$. But also the prefactors, $a^2$ for the RW and $b^2$ for the freely jointed chain, are exactly the squared bond length in both cases. However, the latter finding is just a coincidence. This can be understood very clearly by studying another exactly solvable model, the freely rotating chain. In this model the angle between two successive bonds of length $b$ has a fixed value $\theta$ but the bonds can still rotate freely around each other as depicted in Fig. 3.6. This means that, unlike in the previous models, the average $\langle \mathbf{r}_i \cdot \mathbf{r}_j \rangle$ does not vanish between different bonds $i \neq j$. This is immediately obvious for the average $\langle \mathbf{r}_i \cdot \mathbf{r}_{i+1} \rangle$ that is given by

$$\langle \mathbf{r}_i \cdot \mathbf{r}_{i+1} \rangle = b^2 \cos\theta \tag{3.18}$$

since $\mathbf{r}_{i+1}$ can be decomposed into a vector of length $b \cos\theta$ in the $\mathbf{r}_i$-direction and a vector of length $b \sin\theta$ perpendicular to it. In other
words, if we know the direction of the $i$-th bond to be $\mathbf{r}_i$, we know that on average the $(i+1)$-th bond points in the same direction and its length in that direction is $b \cos\theta$. Now if we go one bond further to the $(i+2)$-th bond, we know that if the direction of bond $i$ is given but the direction of bond $i+1$ is unknown, the $(i+2)$-th bond points on average in the $\mathbf{r}_i$-direction but with an average length $b \cos^2\theta$. With each segment that we go further, we attain an extra factor $\cos\theta$. So in general we predict the following relation:

$$\langle \mathbf{r}_i \cdot \mathbf{r}_j \rangle = b^2 \cos^{|i-j|}\theta. \tag{3.19}$$

According to Eq. 3.19 the bond-bond correlation decays exponentially with the so-called chemical distance $|i - j|$ along the chain.

Figure 3.6 The freely rotating chain model (see text).

We are now in the position to calculate the mean-squared end-to-end distance of the freely rotating chain:

$$\begin{aligned} \langle \mathbf{R}^2 \rangle &= \sum_{i=1}^{N} \sum_{j=1}^{N} \langle \mathbf{r}_i \cdot \mathbf{r}_j \rangle = \sum_{i=1}^{N} \sum_{k=-i+1}^{N-i} \langle \mathbf{r}_i \cdot \mathbf{r}_{i+k} \rangle \\ &\approx \sum_{i=1}^{N} \sum_{k=-\infty}^{\infty} \langle \mathbf{r}_i \cdot \mathbf{r}_{i+k} \rangle = b^2 \sum_{i=1}^{N} \left( 1 + 2 \sum_{k=1}^{\infty} \cos^k\theta \right) \\ &= b^2 N \left( -1 + 2 \sum_{k=0}^{\infty} \cos^k\theta \right) = \frac{1 + \cos\theta}{1 - \cos\theta} \, b^2 N. \end{aligned} \tag{3.20}$$
The above calculation is straightforward with one approximation involved when going from the first to the second line, namely to extend the summation over the chemical distance k to infinity. For
long chains this is an excellent approximation since, according to Eq. 3.19, the bond correlation decays rapidly with distance. Let us now take a closer look at the result of Eq. 3.20. First of all, note that we recover the same scaling with $N$, $R \sim N^{1/2}$, as in the previous models. What has changed, however, is the prefactor. It is as if the bond length has changed. Accordingly we can introduce the effective bond length $b_{\text{eff}}$ via $\langle \mathbf{R}^2 \rangle = b_{\text{eff}}^2 N$; here

$$b_{\text{eff}} = b \sqrt{\frac{1 + \cos\theta}{1 - \cos\theta}}. \tag{3.21}$$

Depending on the value of $\theta$, the bonds appear to be longer (for $0^\circ < \theta < 90^\circ$) or shorter (for $90^\circ < \theta < 180^\circ$) than the actual microscopic bonds. So far we have looked at three fairly different models for polymer chains and found each time that the mean-squared end-to-end distance is given by $\langle \mathbf{R}^2 \rangle = b_{\text{eff}}^2 N$. Whereas the value of $b_{\text{eff}}$ depends on the microscopic details of the model, the scaling with $N$, namely $R \sim N^{1/2}$, turns out to be completely robust and does not show any dependence on the microscopic details. This is indeed a true statement; these types of models for polymer chains always lead to the same scaling. From a physics point of view this universality makes polymers very attractive and it explains why a discussion of concrete chemical realizations of polymers is not necessary here: If we are only interested in the universal features, the chemistry does not matter.
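To see how good the long-chain approximation in Eq. 3.20 is, one can evaluate the finite double sum over the bond correlations of Eq. 3.19 directly. The following is a minimal sketch; the function names `frc_mean_squared_distance` and `effective_bond_length` are our own and not part of the text.

```python
import math


def frc_mean_squared_distance(n_bonds, b, theta):
    """Finite-N double sum <R^2> = sum_ij b^2 cos^|i-j|(theta), from Eq. 3.19."""
    c = math.cos(theta)
    total = float(n_bonds)  # the N terms with i == j
    for k in range(1, n_bonds):
        # There are 2 (N - k) ordered bond pairs at chemical distance k.
        total += 2 * (n_bonds - k) * c**k
    return b * b * total


def effective_bond_length(b, theta):
    """Effective bond length of Eq. 3.21."""
    c = math.cos(theta)
    return b * math.sqrt((1.0 + c) / (1.0 - c))


if __name__ == "__main__":
    b, theta = 1.0, math.radians(60.0)
    for n in (10, 100, 1000):
        exact = frc_mean_squared_distance(n, b, theta)
        approx = effective_bond_length(b, theta) ** 2 * n
        print(n, exact / approx)  # ratio approaches 1 for long chains
```

The ratio approaches one as $N$ grows because the terms neglected in Eq. 3.20 stem from the chain ends only and do not grow with $N$.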
3.3 The Role of Solvent Quality

Polymer coils that feature the scaling law $R \propto N^{1/2}$ are called Gaussian or ideal chains. The name “Gaussian” reflects the fact that the probability distribution of the end-to-end distance is Gaussian, as in Eq. 3.7. As it turns out, the exponent only has the value 1/2 if one assumes something rather radical, namely that the monomers are point-like. Point-like monomers do not occupy any space and thus cannot get into each other’s way. In fact, all three models above do not forbid configurations in which two monomers occupy precisely the same point in space.
Figure 3.7 Long-range excluded volume interaction between monomer m and n.
What happens if we take the excluded volume of monomers into account? As we discuss now, this induces a swelling of the polymer coil such that $R \propto N^{\nu}$ with a universal value of $\nu > 1/2$. In contrast to the models from the previous section, however, this is no longer easy to calculate. The problem is that this interaction can occur between any pair of monomers, also between pairs that are far apart along the chemical backbone, see Fig. 3.7. This makes the problem extremely non-trivial and in fact there exists no exact treatment for swollen polymer coils. The most straightforward way to describe the effect of the excluded volume on the polymer configuration is based on the virial expansion which we introduced in the previous chapter. Consider the interaction between two monomers in the solvent, e.g., between monomers $m$ and $n$ in Fig. 3.7. As we discussed earlier, the pair interaction $w$ can typically be split into a hardcore contribution $w_{\text{hard}}$ and a (small) attractive part $w_{\text{attr}}$, see Fig. 2.9. This allows us to approximate the second virial coefficient by
$$\upsilon = \upsilon_0 \left( 1 - \frac{\Theta}{T} \right). \tag{3.22}$$

This is precisely Eq. 2.91 but written in a way typically used in the polymer literature. Specifically one writes $\upsilon$ for $B_2(T)$ and introduces the quantity $\Theta$, the so-called $\Theta$-temperature. Depending on the value of $\upsilon$, three different cases can be distinguished:

• $\upsilon = 0$ ($T = \Theta$): $\Theta$-solvent; $R \propto N^{1/2}$ (ideal chain, Gaussian statistics)
• $\upsilon > 0$ ($T > \Theta$): good solvent; $R \propto N^{3/5}$ (swollen polymer coil, “real” chain)
• $\upsilon < 0$ ($T < \Theta$): poor solvent; $R \propto N^{1/3}$ (polymer globule, collapsed chain)

In order to characterize the different cases, one usually speaks about the solvent quality, since it is the solvent that largely determines the value of $\upsilon$. For each of the cases we also give above the characteristic power law that governs the polymer radius and indicate what these polymer configurations are typically called. The simplified models that we discussed above are only applicable at the $\Theta$-temperature where the excluded volume part is effectively cancelled by the attractive contribution. Note that for large $N$, a polymer changes its radius dramatically when the solvent quality is changed. We first treat the good solvent case in Sections 3.4 to 3.6 in detail before briefly discussing the poor solvent case in Section 3.7.
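The three cases can be summarized in a few lines of code. The following is a minimal sketch, assuming the asymptotic exponents quoted in the list above; the function names `second_virial` and `solvent_regime` are our own. It deliberately ignores the crossover close to the $\Theta$-temperature, where a finite chain still behaves nearly ideally.

```python
def second_virial(T, theta_temp, v0=1.0):
    """Second virial coefficient of Eq. 3.22: v = v0 * (1 - Theta / T)."""
    return v0 * (1.0 - theta_temp / T)


def solvent_regime(T, theta_temp, v0=1.0):
    """Return (solvent quality, asymptotic size exponent) for temperature T."""
    v = second_virial(T, theta_temp, v0)
    if v > 0:
        return "good", 3 / 5
    if v < 0:
        return "poor", 1 / 3
    return "theta", 1 / 2


if __name__ == "__main__":
    n_monomers = 10**4
    for T in (250.0, 300.0, 350.0):  # illustrative values with Theta = 300
        quality, nu = solvent_regime(T, theta_temp=300.0)
        # Coil size in units of the monomer size, prefactors dropped.
        print(T, quality, n_monomers**nu)
```

For $N = 10^4$ the three estimates differ by roughly an order of magnitude between the collapsed and the swollen state, which illustrates how dramatically the radius responds to the solvent quality.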
3.4 Self-Avoiding Walks

The simplest model for a polymer in a good solvent is the self-avoiding walk (SAW), a RW on a lattice that does not intersect with itself. The example configuration given in Fig. 3.2 above does not intersect and is thus actually a concrete realization of a SAW. The statistical properties of SAWs are complex. They have been studied by various numerical methods. Here we merely state the findings. We start with the total number of SAWs of $N$ steps. This number shows the asymptotic form

$$\mathcal{N}_N^{\text{tot}} = \text{const.} \; \tilde{z}^N N^{\gamma - 1}. \tag{3.23}$$

The first factor $\tilde{z}^N$ is similar to the term $z^N$ which gives the total number of random walks on a lattice of coordination number $z$, see Eq. 3.6. For a 3D cubic lattice one has $z = 6$ and $\tilde{z} = 4.68\ldots$ and for a 2D square lattice one has $z = 4$ and $\tilde{z} = 2.638\ldots$. The factor $N^{\gamma - 1}$ is called the enhancement factor. $\gamma$ is a universal critical exponent and only depends on the dimensionality $d$, namely $\gamma = 1.1596 \pm 0.0020$ for $d = 3$ (Guida and Zinn-Justin, 1998) and $\gamma = 43/32 \approx 4/3$ for $d = 2$ (the exact result follows from Ref. (Nienhuis, 1982)). The situation for $d = 1$ is trivial: in one dimension the walk can either always step to the right or always to the left. Hence $\mathcal{N}_N^{\text{tot}} = 2$ and thus $\tilde{z} = 1$ and $\gamma = 1$. The SAWs have a mean-squared end-to-end distance, called $R_F^2$, that scales as

$$R_F \simeq a N^{\nu} \tag{3.24}$$
with $a$ being the step size. Here $\nu$ is another universal exponent (arguably the most important one in polymer physics). This exponent again depends on the space dimension and is larger than 1/2 (the value for a RW) for $d < 4$. The values of this exponent for various space dimensions can be estimated with the Flory argument which is presented in Section 3.5. It is instructive to compare the probability distribution of the end-to-end distance of a SAW, Fig. 3.8(b), to that of an ordinary RW, Fig. 3.8(a), the latter being given in Eq. 3.7. The distribution depends on $r$ only through the ratio $r/R_F$:

$$p(r) = \frac{1}{R_F^d} \, \varphi_p\!\left( \frac{r}{R_F} \right) \tag{3.25}$$

for $a \ll r \ll aN$. The reduced distribution $\varphi_p(x)$ decreases sharply for small $x = r/R_F$,

$$\lim_{x \to 0} \varphi_p(x) \sim x^g \tag{3.26}$$

with an exponent $g \approx 1/3$ for $d = 3$, reflecting the difficulty for a SAW to return to its starting point due to self-avoidance. For large $x$ one finds

$$\lim_{x \to \infty} \varphi_p(x) \sim \exp\left( -x^{\delta} \right) f_1(x) \tag{3.27}$$

with

$$\delta = (1 - \nu)^{-1}, \tag{3.28}$$

as we will show further below. ($f_1(x)$ varies as a power of $x$.) Finally let us discuss SAWs that return to a terminal site B adjacent to the starting site A, Fig. 3.9. By closing the A-B link such a SAW can be mapped onto a self-avoiding closed polygon with $N + 1$ edges. The number of such polygons turns out to be

$$\mathcal{N}_N(r = a) \sim \tilde{z}^N \left( \frac{a}{R_F} \right)^d. \tag{3.29}$$
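Figure 3.8 Comparison of the distribution of the end-to-end distance of (a) a RW, Eq. 3.7, and (b) a SAW (see text). Panel (a) shows $p(x)$ versus $x = R/(a\sqrt{N})$ with the Gaussian form $e^{-\frac{3}{2}x^2}$; panel (b) shows $p(x)$ versus $x = R/R_F$ with the limiting forms $x^g$ for small $x$ and $e^{-x^{\delta}}$ for large $x$.

Figure 3.9 A SAW with its terminal site B adjacent to the starting site A.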
Let us compare this equation to what one would expect to find based on intuition. The total number $\mathcal{N}_N^{\text{tot}}$ of SAWs of $N$ steps was given by Eq. 3.23. Only a fraction of this number corresponds to looped configurations, which can be estimated as follows. The terminal points of all SAWs are spread over the $d$-dimensional volume $R_F^d$ whereas for loops they need to sit next to the starting point A (a distance $a$ away). We therefore expect that the fraction of looped configurations is given by $(a/R_F)^d$ and that $\mathcal{N}_N^{\text{tot}}$ has to be multiplied with this factor to arrive at the total number of looped configurations. However, this is not quite true as can be seen by inspecting Eq. 3.29: the enhancement factor $N^{\gamma - 1}$ is absent. This surprising fact reflects again the difficulty of a SAW to return to its starting point.
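For short walks the numbers discussed here can be obtained by brute-force enumeration. The following is a minimal sketch on the two-dimensional square lattice; the function name `count_saws` is our own, and the exponential growth of the count restricts this approach to small $N$. It returns both the total number of $N$-step SAWs and the number of those that end on a site adjacent to the starting site.

```python
def count_saws(n_steps):
    """Enumerate n-step SAWs on the square lattice starting at the origin.

    Returns (total, adjacent): the number of all such walks and the number
    of those whose terminal site is a nearest neighbor of the origin.
    """
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    visited = {(0, 0)}

    def extend(x, y, remaining):
        if remaining == 0:
            return 1, (1 if abs(x) + abs(y) == 1 else 0)
        total = adjacent = 0
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt in visited:  # self-avoidance: never revisit a site
                continue
            visited.add(nxt)
            t, c = extend(nxt[0], nxt[1], remaining - 1)
            visited.remove(nxt)
            total += t
            adjacent += c
        return total, adjacent

    return extend(0, 0, n_steps)


if __name__ == "__main__":
    previous = None
    for n in range(1, 11):
        total, adjacent = count_saws(n)
        ratio = total / previous if previous else None
        print(n, total, adjacent, ratio)
        previous = total
```

The first counts are 4, 12, 36 and 100. The ratio of successive counts decreases slowly with $N$; according to Eq. 3.23 it should tend to $\tilde{z} = 2.638\ldots$ for long walks, with the slow convergence caused by the enhancement factor.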
We will prove Eq. 3.29 later. Accepting it for now, we can predict $g$ from Eq. 3.26. According to that equation one has

$$p(a) \simeq \frac{1}{R_F^d} \left( \frac{a}{R_F} \right)^g = \frac{1}{R_F^d} N^{-\nu g}. \tag{3.30}$$

On the other hand, it is by definition related to $\mathcal{N}_N(a)$, Eq. 3.29, via

$$p(a) = \frac{1}{a^d} \frac{\mathcal{N}_N(a)}{\mathcal{N}_N^{\text{tot}}} \sim \frac{1}{R_F^d} N^{1 - \gamma}. \tag{3.31}$$

Comparing this to Eq. 3.30 we find

$$g = \frac{\gamma - 1}{\nu}. \tag{3.32}$$

The scaling laws that we reported here can be understood in terms of a beautiful analogy between polymers and the behavior of ferromagnets at critical points (de Gennes, 1979). We refer the interested reader to Appendix C where this remarkable connection is discussed in some detail.
3.5 The Flory Argument

As already mentioned, in the case of a good solvent no exact treatment is possible. We give here the beautiful Flory argument (de Gennes, 1979) which somehow manages to find almost the right value for the universal exponent $\nu$. It can therefore estimate how the sizes of swollen polymer coils scale with $N$. We drop in the following all numerical prefactors as they do not influence the scaling exponent. According to Flory, the free energy of a polymer in a good solvent can be approximated by the sum of two terms, the entropic spring term, Eq. 3.9, and the two-body collision term between monomers, i.e., the first correction term in the virial expansion, Eq. 2.90. Thus Flory’s free energy is of the form

$$\beta F = \frac{R^2}{a^2 N} + \upsilon \frac{N^2}{R^3} \tag{3.33}$$

with $\upsilon > 0$. Clearly, Eq. 3.33 is an oversimplification. The two-body collision term completely overlooks the fact that the monomers are connected into a polymer. Instead it allows the monomers to be at any place inside the polymer coil, independent of each other, i.e., it treats the monomers as if they are gas molecules. The first term does account for the chain connectivity but counts the chain configurations of an ideal chain without taking into account the fact that many configurations are forbidden due to monomer-monomer overlap. Despite these shortcomings, let us go ahead anyway. Minimization of the free energy with respect to the coil size, $\partial F/\partial R = 0$, gives the following scaling law:

$$R = \left( a^2 \upsilon \right)^{1/5} N^{3/5}. \tag{3.34}$$

One therefore calls $\nu = 3/5$ the Flory exponent. Amazingly the exact value of the exponent lies quite close; based on renormalization group theory, one knows that $\nu = 0.588 \pm 0.001$. In practice one often uses $\nu = 3/5$ for convenience; we will do the same below. Similar to the ideal chain case, the exponent is again universal and the prefactor is not. In other words, any reasonable microscopic model for a polymer in a good solvent shows this exponent. Most importantly, real world polymers exhibit this scaling law, too, independent of the underlying chemistry.

The Flory argument owes its success to the fortunate cancellation of two errors. We derived the entropic term above by counting the number of configurations consistent with a given end-to-end distance, see Eqs. 3.5 and 3.7, but under the assumption of point-like monomers. When monomers cannot overlap there are many forbidden configurations for short end-to-end distances, an effect that becomes less and less important with increasing chain stretching. On the other hand, the two-body collision term does not account for the connectivity of the chain, which leads to an overestimation of the actual collision probability between monomers: a given monomer nearly automatically avoids hitting monomers from other sections of the chain because its neighboring monomers avoid these other sections too. Again, this discrepancy between an ideal and a real chain decreases with increasing chain stretching.
In the end, both errors almost cancel each other out. We have given the Flory argument for a chain that lives in three-dimensional space. It is straightforward to extend this argument to $d$ space dimensions. You might think that this is a rather academic exercise since we live in a three-dimensional world and experiments in $d \neq 3$ are therefore impossible. But first of all, we can mimic $d = 1$ by having a polymer in a narrow tube and $d = 2$ by adsorbing it on a surface. And moreover, going to general $d$ teaches us something about why the approximation in $d = 3$ works so well. The Flory free energy in $d$ dimensions is given by

$$\beta F = \frac{R^2}{a^2 N} + \upsilon \frac{N^2}{R^d} \tag{3.35}$$

where $\upsilon$ now has the dimension of a $d$-dimensional volume. The first term is the entropic spring term. As you can easily convince yourself by going back to Eqs. 3.7 and 3.8 there is a factor $d/2$ in front of the “spring” term but this is again just a numerical factor and is thus disregarded here. The second term in Eq. 3.35 is the two-body collision term in $d$-dimensional space. You could obtain it by generalizing the virial expansion of the previous chapter to $d$ dimensions. Since we are only interested in its scaling, it is more helpful to simply realize that the two-body collision term must scale like $c^2 R^d$ with $c = N/R^d$ being the monomer density in $d$-dimensional space. Here the factor $c^2$ gives the probability density of two-body collisions. Multiplying it with $R^d$, one obtains the total number of collisions at a given time. The free energy, Eq. 3.35, is minimized with respect to $R$ for

$$R = \left( a^2 \upsilon \right)^{1/(d+2)} N^{3/(d+2)}. \tag{3.36}$$
Hence we find that the $d$-dimensional Flory exponent goes as

$$\nu = \frac{3}{d + 2}. \tag{3.37}$$

Remarkably, this is exact for $d = 1$ since a given chain can only be oriented in one direction on a one-dimensional lattice and thus $\nu = 1$. Far from obvious, and not further discussed here, is the fact that the result in two dimensions, $\nu = 3/4$, is also exact (Nienhuis, 1982). It is thus not surprising that the three-dimensional case works so well. Very interesting are also the findings for larger dimensions. For $d = 4$ one finds $\nu = 1/2$, i.e., the ideal chain exponent, Eq. 3.4. This finding indicates that we have now so many directions in space that monomers hardly “see” each other and that to a good approximation the chain behaves as ideal. The results for $d > 4$ are confusing at first sight since Eq. 3.37 predicts $\nu < 1/2$. This would mean that a chain with excluded volume would have a smaller coil size than an ideal chain, a finding that obviously makes no sense. This finding has to be understood as follows. Suppose we have a chain with an ideal chain configuration, $R \sim N^{1/2}$, in $d > 4$. Then the two-body collision term in Eq. 3.35 scales as $N^2/R^d = N^{(4-d)/2}$, tiny compared to the entropic spring term which is of order one for an ideal chain. This shows that excluded volume effects are simply of no relevance beyond four dimensions and that the chains show ideal chain behavior. The case $d = 4$ is special because it is just at the borderline between swollen and ideal chains.
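The minimization leading to Eqs. 3.36 and 3.37 can also be carried out numerically. The following is a minimal sketch, assuming $a = \upsilon = 1$ by default; the function names and the ternary search in $\ln R$ are our own choices (the search is justified because $\beta F$ of Eq. 3.35 is a sum of two convex functions of $\ln R$ and thus has a single minimum). The exponent is then estimated from the slope of $\ln R$ versus $\ln N$ between two chain lengths.

```python
import math


def flory_free_energy(R, N, d, a=1.0, v=1.0):
    """beta * F of Eq. 3.35, numerical prefactors dropped as in the text."""
    return R**2 / (a**2 * N) + v * N**2 / R**d


def minimize_flory(N, d, a=1.0, v=1.0, iterations=200):
    """Locate the coil size minimizing beta * F by ternary search in ln R."""
    lo = math.log(a) - 10.0      # generous bracket below the monomer size
    hi = math.log(a * N) + 10.0  # generous bracket above the contour length
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        f1 = flory_free_energy(math.exp(m1), N, d, a, v)
        f2 = flory_free_energy(math.exp(m2), N, d, a, v)
        if f1 < f2:
            hi = m2
        else:
            lo = m1
    return math.exp(0.5 * (lo + hi))


def flory_exponent(d, N1=10**3, N2=10**6):
    """Estimate nu as the slope of ln R versus ln N between N1 and N2."""
    R1, R2 = minimize_flory(N1, d), minimize_flory(N2, d)
    return math.log(R2 / R1) / math.log(N2 / N1)


if __name__ == "__main__":
    for d in (1, 2, 3, 4):
        print(d, flory_exponent(d), 3 / (d + 2))
```

The numerical slopes should reproduce $3/(d+2)$; the numerical prefactors that the text disregards shift $R$ by an $N$-independent factor and therefore do not affect the exponent.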
3.6 The Blob Picture

The SAW is an extreme case where the step size $a$ enters the second virial coefficient as $\upsilon = a^3$. As we explained above, the subtle interplay between hardcore repulsion and attraction can lead to much smaller values of $\upsilon$, $\upsilon \ll a^3$. What does a typical chain configuration look like in this case? We now give a powerful geometrical argument, the blob picture (de Gennes, 1979), which will lead us again to Eq. 3.34 but gives us a more microscopic view of the typical chain conformations. In fact, thinking in blobs is such a powerful approach to polymers that also the rest of this and the following section are dominated by blob arguments. We now assume that the second virial coefficient is small, $0 < \upsilon \ll a^3$, and consider a subchain, a polymer section of $g$ consecutive monomers. If $g$ is small enough we expect that, to a good approximation, ideal chain statistics (cf. Eq. 3.3) relates $\xi$, the typical spatial extension, to $g$:

$$\xi = a g^{1/2}. \tag{3.38}$$
This is true up to a number of $g_T$ monomers, the number at which the two-body collision term (i.e., the $l = 2$ term in Eq. 2.90 with $V = \xi^3$, $n = g/\xi^3$, $B_2 = \upsilon$, disregarding numerical factors) has become so large that it is on the order of $k_B T$:

$$\frac{\upsilon g_T^2}{\xi_T^3} = 1. \tag{3.39}$$

This defines the thermal blob of size $\xi_T$ and monomer number $g_T$. Combining Eqs. 3.38 and 3.39 we find that these quantities are given by

$$\xi_T = \frac{a^4}{\upsilon}, \qquad g_T = \frac{a^6}{\upsilon^2}. \tag{3.40}$$

Thermal blobs try to avoid overlapping, as this would cost about $k_B T$. Hence the chain forms a SAW of thermal blobs as depicted in Fig. 3.10. The step size of the SAW is set by the thermal blob size $\xi_T$ and the number of steps is just the number of blobs, $N/g_T$. According to Eq. 3.24 with $\xi_T$ instead of $a$ and $N/g_T$ instead of $N$ we find:

$$R = \xi_T \left( \frac{N}{g_T} \right)^{3/5} = \frac{a^4}{\upsilon} \left( \frac{N}{a^6/\upsilon^2} \right)^{3/5} = \left( a^2 \upsilon \right)^{1/5} N^{3/5}. \tag{3.41}$$

Figure 3.10 A good solvent chain forms a self-avoiding walk of thermal blobs of size $\xi_T$, each containing $g_T$ monomers. The subchains behave like ideal chains within the blobs.
This is indeed Eq. 3.34. We understand now the meaning of the $\left( a^2 \upsilon \right)^{1/5}$ factor that properly takes into account the influence of the thermal blobs on the overall size of the swollen chain. Note that if the excluded volume contribution is so small that $g_T > N$, the entire chain shows Gaussian statistics, $R = a N^{1/2}$. The condition for $g_T > N$ reads $\upsilon < a^3/N^{1/2}$. The blob picture is also helpful to calculate the stretching behavior of a polymer chain under an external tension $f$. We have calculated above the stretching behavior of ideal chains; we now try to also calculate the force-extension relation of real chains.
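The thermal blob construction translates directly into a short calculation. The following is a minimal sketch on the level of scaling (all numerical prefactors set to one), assuming $0 < \upsilon \le a^3$ and using $\nu = 3/5$ as in the text; the function names `thermal_blob` and `coil_size` are our own.

```python
def thermal_blob(a, v):
    """Thermal blob size and monomer number of Eq. 3.40 (for 0 < v <= a^3)."""
    xi_T = a**4 / v
    g_T = a**6 / v**2
    return xi_T, g_T


def coil_size(N, a, v):
    """Scaling estimate of the coil size R, prefactors dropped."""
    xi_T, g_T = thermal_blob(a, v)
    if g_T >= N:
        # The whole chain fits into one thermal blob: ideal chain statistics.
        return a * N**0.5
    # Self-avoiding walk of N / g_T thermal blobs, Eq. 3.41.
    return xi_T * (N / g_T) ** 0.6


if __name__ == "__main__":
    a, N = 1.0, 10**4
    for v in (1.0, 0.1, 0.01, 0.001):
        xi_T, g_T = thermal_blob(a, v)
        print(v, xi_T, g_T, coil_size(N, a, v), (a**2 * v) ** 0.2 * N**0.6)
```

As long as $g_T < N$ the last two columns coincide, as Eq. 3.41 requires; for $\upsilon < a^3/N^{1/2}$ the chain is ideal and the swollen-chain estimate in the last column no longer applies.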
Figure 3.11 A chain under tension can be subdivided into Pincus blobs of size $\xi_P$, each containing $g_P$ monomers.
Let us first derive again the ideal chain result but this time in the framework of blobs. When a tension $f$ acts on such a chain, there is a typical length scale

$$\xi_P = k_B T / f, \tag{3.42}$$

the size of the so-called Pincus blobs. Inside each blob the tension is only a small perturbation so that the blobs obey ideal chain statistics, $\xi_P = a g_P^{1/2}$. Two connected blobs prefer to be aligned in the direction of the force since they then gain $-\xi_P f = -k_B T$. As shown in Fig. 3.11, the total length of the polymer in the force direction is then just the length of the resulting blob chain

$$\langle z \rangle = \frac{N}{g_P} \xi_P = \frac{a^2 N}{\xi_P} = \frac{a^2 N}{k_B T} f. \tag{3.43}$$
The blob argument does indeed recover the entropic spring result, Eqs. 3.12 and 3.17—up to numerical factors. The power of the blob picture becomes apparent when applied to the stretching of a chain swollen in a good solvent. Let us first focus on the special case of a SAW where υ = a3 . The Pincus blobs have again the size given by Eq. 3.42 but are now swollen subchains with ξ P = agνP (cf. Eq. 3.24). The end-to-end distance of the stretched polymer scales like the length of the blob chain:
1−ν
2/3 ν N f f z= ξ P = a1/ν N = a5/3 N . (3.44) gP kB T kB T Note that we find here that the excluded volume leads to a nonlinear force-extension relation z ∼ f 2/3 with an exponent smaller than one. This can be understood if we recall the entropic nature of this force. For an ideal chain the decrease of possible conformations with increasing end-to-end distance leads to a linear force-extension relation, Eq. 3.43. For an excluded volume chain
100 Polymer Physics
+f −f Figure 3.12
A good-solvent chain under tension: blobs within blobs.
there are many forbidden conformations for small extensions and far fewer forbidden conformations for larger extensions, making it easier to stretch such a chain, as reflected by the sublinear behavior.
We are now in the position to calculate $\delta$, i.e., to derive Eq. 3.28. Equation 3.27 gives for the entropy
$$S(z) = \text{const} + k_B \ln p(z/R_F) = \text{const} - k_B \left(\frac{z}{R_F}\right)^{\delta}. \tag{3.45}$$
Analogous to Eq. 3.10 we find the force from the free energy $F = \text{const} - TS$:
$$f = \frac{\partial F}{\partial z} = k_B T \frac{\partial}{\partial z}\left(\frac{z}{R_F}\right)^{\delta} \sim \frac{k_B T}{a^{\delta} N^{\nu\delta}}\, z^{\delta - 1}. \tag{3.46}$$
This is indeed identical to Eq. 3.44 if one sets $\delta = (1-\nu)^{-1}$, recovering Eq. 3.28.
We next calculate the stretching of a chain in a good solvent with smaller values of $\upsilon$, $0 < \upsilon \ll a^3$. This serves as an illustration of a more complex geometry where one has blobs inside blobs. Two effects lead to two kinds of blobs, the excluded volume to thermal blobs of size $\xi_T$ (cf. Eq. 3.40) and the external force to Pincus blobs of size $\xi_P$ (cf. Eq. 3.42). For sufficiently small tensions, namely for $f < k_B T \upsilon / a^4$, Pincus blobs are larger than thermal blobs, $\xi_P > \xi_T$. In this case we have a self-avoiding chain of thermal blobs inside each Pincus blob (cf. Fig. 3.12) with $\xi_P = (a^2 \upsilon)^{1/5} g_P^{3/5}$ (cf. Eq. 3.41). So we find Eq. 3.44 but with $a$ replaced by $(a^2 \upsilon)^{1/5}$. For large $f$, $f > k_B T \upsilon / a^4$, the Pincus blobs have shrunk so much that they are smaller than thermal blobs, $\xi_P < \xi_T$. The subchains inside the Pincus blobs are then ideal and we recover ideal chain stretching behavior, Eq. 3.43 and Fig. 3.11.
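The two stretching regimes of the blobs-inside-blobs picture can be collected in a few lines of Python. This is a minimal sketch on the scaling level only: the function name `extension_good_solvent` and its arguments are hypothetical, all prefactors of order one are dropped, and the formulas are not meant to hold for forces so small that a Pincus blob exceeds the whole coil, or so large that it shrinks below the monomer size.

```python
def extension_good_solvent(f, n_monomers, a, v, kbt=1.0):
    """Scaling estimate of the extension z for an excluded volume 0 < v <= a**3."""
    f_cross = kbt * v / a**4           # force at which xi_P equals xi_T
    if f < f_cross:
        # Pincus blobs contain a swollen chain of thermal blobs:
        # Eq. 3.44 with a replaced by (a**2 * v)**(1/5)
        a_eff = (a**2 * v) ** 0.2
        return a_eff ** (5 / 3) * n_monomers * (f / kbt) ** (2 / 3)
    # Pincus blobs are smaller than thermal blobs: ideal stretching, Eq. 3.43
    return n_monomers * a**2 * f / kbt

if __name__ == "__main__":
    # hypothetical parameters in units where a = 1 and kB*T = 1
    for force in (0.01, 0.05, 0.1, 0.5):
        print(force, extension_good_solvent(force, n_monomers=10_000, a=1.0, v=0.1))
```

In this sketch the two branches meet continuously at $f = k_B T \upsilon / a^4$, where both give $z = N\upsilon/a^2$. For $\upsilon = a^3$ the crossover force becomes $k_B T / a$, so that the swollen branch covers the whole range in which a blob is larger than a monomer.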
3.7 Polymers in Poor Solvents
So far we have studied chains in $\Theta$- and in good solvents. We mentioned above that for sufficiently strong attraction between the monomers the second virial coefficient can become negative, $\upsilon < 0$, see also Eq. 3.22. The free energy of a chain in such a poor solvent needs to have at least two contributions, one accounting for the attraction between the monomers and one that stabilizes the chain to prevent it from collapsing into a point. For the first contribution we use, as for the good solvent case, the pair interaction (the second term in the virial expansion 2.90) but here with a negative $\upsilon$-value, i.e., $\upsilon = -|\upsilon|$. The stabilizing term is not so obvious. One could speculate that it is related to the entropic cost of confining the chain to a small volume but this effect turns out to be completely overpowered by another term: the three-body collision term, the third term ($l = 3$) in Eq. 2.90. For simplicity we do not calculate $B_3$ explicitly but assume that it is given by $B_3 \approx a^6$, directly reflecting the excluded volume. We can write down the free energy
$$\beta F = -|\upsilon| \frac{N^2}{R^3} + a^6 \frac{N^3}{R^6}, \tag{3.47}$$
which is minimized for
$$R = \frac{a^2 N^{1/3}}{|\upsilon|^{1/3}} \tag{3.48}$$
(up to a numerical factor). Note that the connectivity did not enter anywhere directly in this line of argument, except that we dropped the ideal gas contribution (the first term in Eq. 2.90). In fact, this argument is similar to our discussion of a real gas in a container that condenses into a liquid when there is sufficient attraction between the molecules, see Eq. 2.93. We now derive Eq. 3.48 in the context of the blob picture. First we have to determine the thermal blob size $\xi_T$ up to which the chain behaves as ideal. For this we can repeat the line of argument given above in Eqs. 3.38 to 3.40 but with $\upsilon$ replaced by its absolute value $|\upsilon|$. As in the good solvent case, the blobs do not like to overlap since this would cost about $k_B T$. This time the repulsion results, however, not from the two-body but from the three-body collision term, $a^6 g_T^3 / \xi_T^6 \approx 1$ (see also Eq. 3.47). The two-body attraction
Figure 3.13 A poor-solvent chain forms a densely packaged array of thermal blobs.
causes the blobs to attract each other, changing the free energy by $-k_B T$ if they are in contact since $-|\upsilon| g_T^2 / \xi_T^3 \approx -1$. In other words, the poor solvent chain can be considered as a globule comprised of densely packed thermal blobs. The volume of the entire globule is then just the volume of a dense array of thermal blobs such as the one depicted in Fig. 3.13. The radius is then proportional to the cube root of that volume (up to a numerical factor):
$$R = \left(\frac{N}{g_T}\,\xi_T^3\right)^{1/3} = \frac{a^2}{|\upsilon|^{1/3}}\, N^{1/3}. \tag{3.49}$$
We used here Eq. 3.40 for $\xi_T$ and $g_T$ (with $\upsilon$ replaced by $|\upsilon|$). Equation 3.49 is identical to the prediction 3.48 that followed from a free energy minimization. We can now understand very clearly that the poor solvent exponent 1/3 is trivial, in sharp contrast to the exponent in the good solvent case which cannot even be calculated exactly. We have studied earlier the stretching of polymers in $\Theta$- and in good solvent. It is instructive to discuss here what happens when one stretches a collapsed globule by applying a force to its end monomers. This case turns out to be surprisingly complex (Halperin and Zhulina, 1991). The basic idea is that the polymer globule behaves similarly to a liquid drop. And as a liquid drop assumes a spherical shape to minimize its surface, so does the polymer globule. The surface tension, the energy per area that one has to pay to expose the globule to the solvent, follows directly from the blob picture. As shown above, neighboring blobs feel a mutual attraction on the order of $k_B T$. Blobs at the surface are less happy than blobs
in the interior of the globule since they have fewer neighbors. With about one neighbor less at a cost of about one $k_B T$ and an area on the order of $\xi_T^2$ on the surface of the globule, we expect that surface blobs lead to a surface tension $\gamma$ that scales like
$$\gamma = \frac{k_B T}{\xi_T^2} = \frac{k_B T\, |\upsilon|^2}{a^8}. \tag{3.50}$$
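The scaling results for the globule obtained so far can be checked numerically. The sketch below minimizes the free energy of Eq. 3.47 by a simple ternary search and compares the result with the blob estimate of Eq. 3.49; it also evaluates the surface tension of Eq. 3.50. All function names are hypothetical, the thermal blob relations $\xi_T = a^4/|\upsilon|$ and $g_T = a^6/|\upsilon|^2$ are the ones implied by the two conditions $a^6 g_T^3/\xi_T^6 \approx 1$ and $|\upsilon| g_T^2/\xi_T^3 \approx 1$ quoted above, and the parameter values are arbitrary illustrative choices in units where $a = 1$ and $k_B T = 1$.

```python
def beta_free_energy(r, n_monomers, a, v_abs):
    """Eq. 3.47: two-body attraction plus three-body repulsion, in units of kB*T."""
    return -v_abs * n_monomers**2 / r**3 + a**6 * n_monomers**3 / r**6

def minimize_radius(n_monomers, a, v_abs, r_lo, r_hi, iterations=200):
    """Locate the single minimum of Eq. 3.47 on [r_lo, r_hi] by ternary search."""
    for _ in range(iterations):
        r1 = r_lo + (r_hi - r_lo) / 3
        r2 = r_hi - (r_hi - r_lo) / 3
        if beta_free_energy(r1, n_monomers, a, v_abs) < beta_free_energy(r2, n_monomers, a, v_abs):
            r_hi = r2
        else:
            r_lo = r1
    return 0.5 * (r_lo + r_hi)

def thermal_blob(a, v_abs):
    """Thermal blob size xi_T and number of monomers g_T per blob (scaling level)."""
    return a**4 / v_abs, a**6 / v_abs**2

def globule_radius(n_monomers, a, v_abs):
    """Eq. 3.49: dense packing of N / g_T thermal blobs."""
    xi_t, g_t = thermal_blob(a, v_abs)
    return (n_monomers / g_t) ** (1 / 3) * xi_t

def surface_tension(a, v_abs, kbt=1.0):
    """Eq. 3.50: about one kB*T per surface blob of area xi_T**2."""
    xi_t, _ = thermal_blob(a, v_abs)
    return kbt / xi_t**2

if __name__ == "__main__":
    n, a, v_abs = 1000, 1.0, 0.5          # hypothetical values
    r_min = minimize_radius(n, a, v_abs, r_lo=1.0, r_hi=100.0)
    print("radius from minimization:", r_min)
    print("radius from blob picture:", globule_radius(n, a, v_abs))
    print("surface tension:", surface_tension(a, v_abs))
```

The two radii agree up to a factor of order one, as expected for a scaling argument.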
The surface free energy between the globule and the solvent is then given by
$$F_{\text{surf}} = \gamma S \tag{3.51}$$
with $S$ denoting the total surface of the globule. The free energy for a globule under tension $f$ is the sum of the bulk term $F_{\text{bulk}}$, given by Eq. 3.47, and the surface term, Eq. 3.51. The value of $F_{\text{bulk}}$ follows from inserting the optimal radius, Eq. 3.48, into the free energy, Eq. 3.47. This suggests that $F_{\text{bulk}}$ has the value zero. But here you need to be careful: when presenting Eq. 3.48 we had thrown away the numerical coefficient since we only cared about the scaling. When plugging this radius back in, the two terms in the free energy seem to cancel. However, taking the numerical factor along (here $2^{1/3}$) we find a non-vanishing $F_{\text{bulk}}$, namely $\beta F_{\text{bulk}} = -|\upsilon|^2 N / (4a^6)$. This is just proportional to the number of blobs, $N/g_T$ (see Eq. 3.40), each giving about $-k_B T$ condensation energy. When we deform the globule, the number of thermal blobs does not change, i.e., the globule is incompressible like a typical fluid, and the bulk energy stays constant. All that happens is that the blobs rearrange, thereby enlarging the surface of the globule. This means that we need in the following only to discuss the surface term, Eq. 3.51. Let us consider first small deformations $\Delta z \ll R$ with $\Delta z$ denoting the difference between the long axis of the deformed globule and the diameter of the free globule. Since we start from a spherical globule that minimizes the surface we expect that for small deformations $\Delta z$ the surface increases quadratically as $F_{\text{surf}}(\Delta z) \sim (\Delta z)^2$. As in Eq. 3.10 the force follows through differentiation, namely $f = \partial F_{\text{surf}} / \partial \Delta z \sim \Delta z$. Now let us go to larger deformations $\Delta z > R$. In that case the globule has to rearrange its thermal blobs. It can maintain their
Figure 3.14 A poor-solvent chain under tension: (a) Force-extension curve and Maxwell construction. (b) Tadpole configuration: coexistence between condensed and non-condensed thermal blobs. [Labels in (a): $f \sim \Delta z$, $f \sim \Delta z^{-1/2}$, $f \sim \Delta z$, $f_c$, horizontal axis $\Delta z$; label in (b): $\xi_T$.]
number, $N/g_T$, but needs to expose more of them to the surface. The most straightforward assumption is that the spherical globule deforms into a cigar-like shape of diameter $D$ and length $\Delta z$. Since the blobs form an “incompressible fluid” the volume of the cigar, $D^2 \Delta z$ (dropping numerical factors), should equal the volume $R^3$ of the original spherical globule. The surface of the cigar is then $S = D \Delta z = R^{3/2} (\Delta z)^{1/2}$, leading to the surface energy
$$F_{\text{surf}} = \gamma R^{3/2} (\Delta z)^{1/2}. \tag{3.52}$$
The force $f$ with which the cigar resists its extension follows then via
$$f = \frac{\partial F_{\text{surf}}}{\partial \Delta z} \sim (\Delta z)^{-1/2}. \tag{3.53}$$
Finally, once we pull so hard that the diameter of the cigar equals the thermal blob size, $D = \xi_T$, further extension is only possible by shrinking the blobs. In that case the Pincus blobs become smaller than the thermal blobs, $\xi_P < \xi_T$ or, equivalently, $f > k_B T |\upsilon| / a^4$. As a result the Pincus blobs form a chain of ideal blobs as shown in Fig. 3.11 and the force rises again in a linear fashion, $\Delta z \sim f$ (see Eq. 3.43). To summarize, we predict the force-extension curve shown in Fig. 3.14(a): linear force laws $f \sim \Delta z$ dominate for small and large forces, whereas in between the force drops as $f \sim (\Delta z)^{-1/2}$. But the latter behavior is unphysical. If we increase the end-to-end distance so far that we are in the intermediate regime, we need to hold the chain at a certain tension that we can read off Fig. 3.14(a). A small
thermal fluctuation inside the chain will inevitably induce a part of the globule to be slightly more stretched. According to Fig. 3.14(a), the tension to hold this section stretched is then smaller, so that piece stretches even further. In other words, the cigar configuration is not stable against fluctuations. The solution to this puzzle is that the polymer exhibits a “tadpole” configuration at intermediate distances, Fig. 3.14(b), instead of the cigar shape. The tadpole consists of a globular head and a tail made from thermal blobs. It needs to be held at a force $f_c$ where the blobs in the head and in the tail are in chemical equilibrium. To pull out a blob one pays one $k_B T$ but gains $f_c \xi_T$, leading to the critical force
$$f_c = k_B T\, |\upsilon| / a^4. \tag{3.54}$$
This is reminiscent of a first order phase transition between two phases, here between that of condensed (inside the head) and that of non-condensed blobs (in the tail). We encountered a similar situation earlier when we discussed the gas-liquid phase transition in Fig. 2.11. In that case, we found a range of volumes where there is a coexistence between a liquid and a gas. The relative amount between the two phases could be shifted to any value by changing the volume of the container. We argued that the corresponding coexistence line can be found through the Maxwell construction. By doing so the chemical potentials of the two phases are identical, Eq. 2.101. Similarly the critical force in the tadpole follows from such a condition as outlined before Eq. 3.54. We indicated the Maxwell line in Fig. 3.14(a) as a dashed horizontal line. However, we note that there is an important difference to the liquid-gas transition. The latter is a true phase transition since a practically infinite number of molecules is involved. It is this infinite number that causes the integration of the “harmless” Boltzmann weight to induce a sharp singularity in the partition function and thus in the behavior of the system. For the globule, there is typically a rather limited number of thermal blobs. In that case there cannot be any singular behavior and instead the transition looks somewhat smooth—smoother than indicated in Fig. 3.14(a). Hence, this transition is said to be reminiscent of a first-order phase transition.
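The force-extension curve of Fig. 3.14(a) can be sketched in code as a piecewise function. This is only a scaling-level illustration under several assumptions that go beyond the text: the coefficient of the small-deformation branch is taken to be $\gamma$ (i.e., $F_{\text{surf}} \sim \gamma (\Delta z)^2$), the plateau at $f_c$ is assumed to start where this branch reaches $f_c$ (at $\Delta z \approx \xi_T$) and to end when all $N/g_T$ thermal blobs are pulled into the tail, and all numerical prefactors are dropped. The function names are hypothetical.

```python
def globule_force(dz, n_monomers, a, v_abs, kbt=1.0):
    """Force needed to hold a collapsed chain at deformation dz (cf. Fig. 3.14(a))."""
    xi_t = a**4 / v_abs                  # thermal blob size
    g_t = a**6 / v_abs**2                # monomers per thermal blob
    gamma = kbt / xi_t**2                # surface tension, Eq. 3.50
    f_c = kbt / xi_t                     # critical force, Eq. 3.54
    dz_tail = n_monomers / g_t * xi_t    # all thermal blobs pulled out into the tail
    if dz < xi_t:
        return gamma * dz                # weakly deformed sphere, f ~ dz
    if dz <= dz_tail:
        return f_c                       # tadpole: plateau along the Maxwell line
    return kbt * dz / (n_monomers * a**2)  # chain of ideal Pincus blobs, Eq. 3.43

def cigar_force(dz, n_monomers, a, v_abs, kbt=1.0):
    """Unstable cigar branch, f ~ gamma * R**(3/2) * dz**(-1/2), cf. Eq. 3.53."""
    xi_t = a**4 / v_abs
    g_t = a**6 / v_abs**2
    radius = (n_monomers / g_t) ** (1 / 3) * xi_t   # Eq. 3.49
    return kbt / xi_t**2 * radius**1.5 * dz**-0.5

if __name__ == "__main__":
    # hypothetical parameters in units where a = 1 and kB*T = 1
    for dz in (1.0, 10.0, 100.0, 500.0, 1000.0):
        print(dz, globule_force(dz, 1000, 1.0, 0.5), cigar_force(dz, 1000, 1.0, 0.5))
```

With these assumptions the cigar branch reaches $f_c$ exactly where the cigar diameter has shrunk to $\xi_T$, which is also the point where the ideal stretching law of Eq. 3.43 takes over.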
3.8 Internal Structure of Polymers
Later, in Chapter 8, we shall speculate about the large-scale properties of eukaryotic DNA inside nuclei. Experiments allow one to measure the mean-squared distance between given pairs of inner monomers $i$ and $j$, rather than just the distance between the end monomers that we have discussed so far. Therefore, we discuss in this section the internal structure of polymers. We might hope that this is straightforward and that the mean-squared distance between monomers $i$ and $j$ (see Fig. 3.15) is simply given by
$$\langle R_{ij}^2 \rangle = b^2 |i - j|^{2\nu}. \tag{3.55}$$
The factor $b$ is some effective bond length and the exponent $\nu$ is, according to our naive expectation, again given by 1/2 for an ideal chain (as in Eqs. 3.3, 3.13 and 3.20), by 3/5 for a good solvent chain (as in Eq. 3.34) and by 1/3 for a poor solvent chain (as in Eq. 3.48). One can indeed find various instances in the literature where Eq. 3.55 has been claimed to hold with those exponents for the various solvent conditions. But does the naive picture really work for all three cases? Let us start with an ideal chain. In that case monomers influence each other only because they are connected to each other along the chain. Therefore the distance between monomers $i$ and $j$ depends only on the stretch of chain in between and is not affected by the rest of the chain. Equation 3.55 is thus obviously true for all pairs of monomers. This is a hallmark of a so-called fractal, a structure
Figure 3.15 Definition of the internal distance vector $R_{ij}$ between monomers $i$ and $j$.
that is self-similar on all length scales. For a region of size $L$ of a fractal object one finds that its mass scales like $M \sim L^{d_f}$ where $d_f$ is the fractal dimension of the structure. For an ideal chain one has $\langle R_{ij}^2 \rangle \sim |i - j|$, i.e., the “mass” of $|i - j|$ monomers scales as $\langle R_{ij}^2 \rangle$. So its fractal dimension happens to be an integer, namely $d_f = 2$. An ideal chain is thus an object with a one-dimensional connectivity that has the same fractal dimension as a surface and lives in three dimensions. Obviously, in a good solvent it is no longer true that the statistics of the conformations of the $|i - j|$-subchain remains completely unaffected by the rest of the chain. The question here is whether the rest of the chain alters the statistics of the subchain in any significant way. A chain of length $|i - j|$ (i.e., the subchain in the absence of the rest of the chain) has a monomer density $c_{\text{sub}}$ that scales as $c_{\text{sub}} \sim |i - j| / |i - j|^{3\nu} = 1/|i - j|^{4/5}$. If the rest of the chain of length $N$ with $N \gg |i - j|$ is present, then the local monomer density is increased by an amount $c_{\text{rest}} \sim 1/N^{4/5}$, a density that is much smaller than the local subchain density, $c_{\text{rest}} \ll c_{\text{sub}}$. In the spirit of the Flory argument, Eq. 3.33, we argue that chain swelling is caused by monomer-monomer collisions that are proportional to the squared monomer density. Since the increase in monomer density due to the rest of the chain is so small, we expect that to leading order Eq. 3.55 still holds with $\nu = 3/5$. So far the internal distances had no surprise in store for us. However, this is different for the poor solvent case. In the following I shall argue that for a poor solvent chain one finds
$$\langle R_{ij}^2 \rangle \approx \begin{cases} a^2 |i - j| & \text{for } |i - j| < R^2/a^2 \\ R^2 & \text{for } |i - j| > R^2/a^2 \end{cases} \tag{3.56}$$
with $R$ given by Eq. 3.49. What is claimed in Eq. 3.56 is quite remarkable: for sufficiently short monomer distances the collapsed globule shows ideal chain statistics.
“Sufficiently short” means here that the monomer pair should on average not be further apart than the size $R$ of the whole globule. For larger values of $|i - j|$ the mean-squared distance saturates at $R^2$ (up to a numerical factor). Why does a poor solvent polymer show ideal chain behavior locally? We give here a rather intuitive line of argument. Let us first cut our globular chain into somewhat smaller pieces as indicated in
Figure 3.16 Cutting a poor solvent chain into smaller pieces, one essentially obtains a semidilute polymer solution with strongly overlapping chains.
Fig. 3.16. As a result, one has a globule that is formed by several disconnected chains that are strongly overlapping. This looks just like a so-called semidilute polymer solution, a solution of many identical, overlapping polymers, a fraction of which is shown on the right-hand side of Fig. 3.16. “Semidilute” here means that there is still a lot of solvent present, as we had also assumed earlier, see Fig. 3.13. If we understand the conformation of a polymer chain in such a semidilute solution, we might hope that we also understand what a fraction of a globular chain looks like, assuming that the few introduced cuts of the chain do not affect its typical conformations. Chain conformations inside a semidilute or dense polymer solution are indeed well understood, both theoretically and experimentally. Let us assume in the following good solvent conditions; as will become clear, the argument can then also be applied to poor solvent conditions. Flory had the decisive insight that chains in such a semidilute solution should show ideal chain behavior. The basic idea of this so-called Flory theorem is as follows. Consider one chain in a sea of other chains, as depicted in Fig. 3.17. If we measure the overall chain density when going along a line that passes through our test chain (the $z$-axis in Fig. 3.17), one finds on average a constant density. If one only measures the density of the test chain, one obtains a peak centered around that chain (bottom curve in Fig. 3.17). In the absence of the other chains, the monomer-monomer repulsion would thus induce a swelling of the test chain, as we had discussed above for a single chain in a good solvent. Now, however, the other chains are present. If we remove our test chain
Figure 3.17 Flory theorem: a given test chain (in red) in a semidilute solution shows a peak in the density that is exactly compensated by a dip in the density of the other chains (in blue). The resulting chain conformations are ideal, even under non-ideal solvent conditions. [Axes: density $\rho$ versus position $z$; curves: other chains, test chain.]
and consider the density of the remaining other chains, we find a constant density everywhere, except for a dip where the test chain had been located (blue curve in Fig. 3.17). That means that the other chains produce a pressure trying to close that density hole. For the test chain in the solution of the other chains, the two pressures, the outward pressure of the test chain and the inward pressure of the other chains, exactly cancel and there is no net pressure. As a result, the test chain and also all other chains obey ideal chain statistics. The same line of argument holds in a poor solvent with the pressures attaining minus signs. For polymer solutions the Flory theorem has been tested experimentally by performing neutron scattering measurements on a deuterium labeled chain in a solution of unlabeled chains. The advantage of this method is that hydrogen and deuterium are chemically identical isotopes with only the latter being “seen” by the neutrons. It was found that the deuterated chain shows ideal chain statistics, Eq. 3.4 (Cotton et al., 1974). Going back to Fig. 3.16: any piece of a poor solvent chain behaves like an ideal chain—as long as it does not see the globule’s surface. The best test so far for this claim stems from an extensive computer
enumeration (Lua et al., 2004) of dense RWs on cubic lattices, so-called Hamiltonian walks that completely fill a cubic region of the lattice and, like SAWs, nowhere cross each other. Chains of length $N = L^3$ on cubic lattices of size $L \times L \times L$ with $L$ up to 22 have been studied. For $L = 2$ there are only 3 symmetrically independent Hamiltonian walks but for $L = 3$ there are already 103346 different possibilities. For larger lengths it is impossible to perform a complete enumeration. Instead one needs to generate samples of conformations, trying to keep statistical biases as small as possible, a rather challenging endeavor. The data obtained in Lua et al. (2004) clearly support Eq. 3.56.
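For later use, the prediction of Eq. 3.56 can be written as a small function. The following lines are a minimal sketch with a hypothetical function name; the globule radius is taken from Eq. 3.49 without numerical prefactors, and the sharp crossover at $|i - j| = R^2/a^2$ is an idealization of what is in reality a smooth saturation.

```python
def mean_squared_internal_distance(s, n_monomers, a, v_abs):
    """Eq. 3.56 for a poor-solvent chain, with s = |i - j| (scaling level)."""
    r_globule = a**2 * n_monomers ** (1 / 3) / v_abs ** (1 / 3)   # Eq. 3.49
    s_cross = (r_globule / a) ** 2   # contour distance at which the subchain feels the surface
    if s < s_cross:
        return a**2 * s              # ideal chain statistics inside the globule
    return r_globule**2              # saturation at the size of the globule

if __name__ == "__main__":
    # hypothetical parameters in units where a = 1
    for s in (10, 100, 1_000, 10_000, 100_000):
        print(s, mean_squared_internal_distance(s, n_monomers=10**6, a=1.0, v_abs=1.0))
```

Note the contrast with the naive guess, Eq. 3.55 with $\nu = 1/3$: the globule exponent 1/3 describes how the overall size grows with $N$, not how internal distances grow with $|i - j|$.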
Problems
3.1 Polyelectrolyte
A polyelectrolyte is a polymer chain that contains charged monomers. The repulsion between the charges leads to a stretched polymer configuration which is derived here via a blob argument.
(i) Consider a chain of length $N$ in a $\Theta$-solvent. A (small) fraction $f$ of the monomers is charged, each carrying a charge $q$. An electrostatic blob is an ideal subchain with a length such that its electrostatic self-energy equals the thermal energy. Calculate the size $\xi_{\text{el}}$ and number of monomers $g_{\text{el}}$ of an electrostatic blob.
(ii) Calculate the end-to-end distance of the polyelectrolyte chain. For this you have to make a reasonable assumption for the overall arrangement of the electrostatic blobs.
3.2 Polymer in a slit
Consider a polymer confined between two infinite parallel walls at a distance $D$ from each other. The polymer can be described as a sequence of confinement blobs of size $D$.
(i) What is the end-to-end distance of a chain squeezed into that slab of height $D$ in a $\Theta$-solvent?
(ii) How does the result change for a good solvent (assuming $\upsilon = a^3$)?
(iii) Use a force $f$ to pull on the ends of the good solvent chain in the slit. Give the end-to-end distance in the various cases (only for blob acrobats!).
3.3 Fractal dimensions
For an ideal chain we found a fractal dimension $d_f = 2$. It is a coincidence that $d_f$ has an integer value. Self-similar objects typically have fractal dimensions with non-integer values. Here are a few examples.
(i) What is the fractal dimension of a swollen polymer coil?
(ii) Consider a swollen polymer chain in $d$ dimensions. What is the fractal dimension as a function of $d$?
(iii) The fractal dimension can be smaller than one. An example is the Cantor set. It is produced by an infinite number of steps. First cut out the middle third from a one-dimensional line of length $L$. This leads to two lines of length $L/3$ with a spacing of length $L/3$ in between. Then cut out the inner thirds of the two lines. And so on. What is the fractal dimension of this object?
(iv) Generalize the rules for the Cantor set such that you can produce any fractal dimension between 0 and 1, $0 < d_f < 1$.
3.4 Exact enumeration of self-avoiding walks
Calculate the total number of self-avoiding walks of given length $N$, first in two dimensions on a square lattice and then in three dimensions on a cubic lattice. This can be achieved by a very compact computer program (e.g., in Python), employing a recursive algorithm. We explain the necessary steps for the two-dimensional case here. Fill a (large enough) two-dimensional array with zeros which represents the lattice that has not yet been visited by the walk. Place your starting point in the middle of the lattice. Now you call subroutine SAW(i, N) where i is the number of steps you have already taken (zero at the beginning) and N is the total length of the SAWs you enumerate. This subroutine makes attempts to perform new steps (in all directions) and each time, if successful (i.e., if the new position is not already occupied by the growing chain), will call itself again.
In this recursive way you build up all the SAWs. Whenever i reaches N, you have found a valid SAW (you increase some counter by one). Then you step one level up again and try to grow the chain in a different direction, and so on. More specifically: Suppose you have built a SAW up to length $i - 1$. You have marked all the positions visited by your walk in your array, say the positions $(x_0, y_0)$ up to $(x_{i-1}, y_{i-1})$. Now choose to expand your walk by stepping to $(x_i, y_i)$. To do this, you call the subroutine SAW(i, N). This subroutine does the following (in symbolic notation):

    if not visited(x_i, y_i) then
        if i = N then
            increase the total number of SAWs by one
        else
            visited(x_i, y_i) = true;
            x_{i+1} = x_i + 1;  y_{i+1} = y_i;      SAW(i + 1, N);
            x_{i+1} = x_i - 1;  y_{i+1} = y_i;      SAW(i + 1, N);
            x_{i+1} = x_i;      y_{i+1} = y_i + 1;  SAW(i + 1, N);
            x_{i+1} = x_i;      y_{i+1} = y_i - 1;  SAW(i + 1, N);
            visited(x_i, y_i) = false

To which length do you manage to calculate all the configurations on a reasonable time scale, say less than an hour? Now try to extract the critical exponents $\gamma$ from Eq. 3.23 and $\nu$ from Eq. 3.24 (see also Eq. 3.37) for two and three dimensions. First plot the quantities $N_{\text{tot}}/z^N$ and $R_F$ as a function of $N$ in log-log plots. Check whether the literature values for the exponents (given in this chapter) are approximately consistent with your plots (they should be). You can also try to determine the exponents by finding the slopes of the corresponding plots, e.g., through simple linear regression (read up on how this works).
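One possible way to turn the recipe above into a program is sketched below. It is a minimal illustration, not a reference solution: the names `enumerate_saws` and `loglog_slope` are hypothetical, a Python set of occupied sites is used in place of the array of zeros described in the text (so that the same code also runs in three dimensions), and the end-to-end distance is accumulated along the way so that $\nu$ can be estimated. Estimating $\gamma$ additionally requires the effective coordination number $z$, which this sketch does not attempt to determine.

```python
import math

def enumerate_saws(n_steps, dim=2):
    """Enumerate all SAWs with n_steps steps on a hypercubic lattice in dim dimensions.

    Returns the total number of walks and their mean-squared end-to-end distance.
    """
    # the 2*dim unit steps (+1 and -1 along every lattice axis)
    moves = []
    for axis in range(dim):
        for sign in (1, -1):
            move = [0] * dim
            move[axis] = sign
            moves.append(tuple(move))

    visited = set()   # sites occupied by the growing walk
    total = 0
    sum_r2 = 0

    def saw(i, site):
        nonlocal total, sum_r2
        if site in visited:
            return                       # the step would hit the walk: dead end
        if i == n_steps:
            total += 1                   # found one valid SAW
            sum_r2 += sum(c * c for c in site)
            return
        visited.add(site)
        for move in moves:
            saw(i + 1, tuple(c + m for c, m in zip(site, move)))
        visited.remove(site)             # step back one level

    saw(0, (0,) * dim)
    return total, sum_r2 / total

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    den = sum((u - mx) ** 2 for u in lx)
    return num / den

if __name__ == "__main__":
    lengths = list(range(1, 9))
    results = [enumerate_saws(n, dim=2) for n in lengths]
    for n, (count, r2) in zip(lengths, results):
        print(n, count, round(r2, 3))
    # <R^2> ~ N^(2 nu), so half the log-log slope is a rough estimate of nu
    nu = 0.5 * loglog_slope(lengths, [r2 for _, r2 in results])
    print("rough estimate of nu in 2D:", round(nu, 3))
```

Since the number of walks grows exponentially with $N$, the accessible lengths are short and the fitted exponent is only a rough estimate; restricting the fit to the largest available $N$ is a natural refinement.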
Chapter 4
DNA
4.1 The Discovery of the DNA Double Helix
This section gives a brief account of the discovery of the DNA double helix (Watson, 1968). It is a fascinating story in itself (a story of astonishing coincidences, questionable scientific ethics, etc.), but here I am mostly referring to it because it helps us understand what is special about DNA. For now let us forget everything we know about DNA, especially its structure and the fact that it carries the genetic information. In the 1940s it was not yet clear what genes were about, but it was clear that there is genetic information that can be passed on to the next generation. At that time, DNA was certainly not the main suspect for carrying this information since it seemed to be a dull molecule that showed no special activities and therefore possibly only had a structural function. On the other hand, proteins were known to represent a very rich class of extremely diverse molecules. It therefore seemed logical that the composition of proteins, or a subset of them, somehow encodes the genetic information. The experiment by Oswald Theodore Avery in 1944 put an end to this idea. Two strains of the bacterium Streptococcus pneumoniae were used, the harmless R strain and the S strain, which causes
Figure 4.1 Avery’s key experiment that demonstrated that DNA is the carrier of genetic information. [Panel labels: Streptococcus pneumoniae, R strain, S strain; cell-free extracts from S strain cells, mixed or fractionated into DNA, RNA and proteins; transformation; “S strain cells contain molecules that carry heritable information”; “The molecule that carries heritable information is DNA”.]
pneumonia. When R strain cells were exposed to a cell extract from S strain cells, the R strain cells took up this material and changed into S strain cells, a process called transformation (see Fig. 4.1 left). The transformed cells stayed transformed and so also the cells of the next generation. In other words: the cell extract contained genetic information. To test which molecules of the extract carry this information, the cell extract was fractionated into RNA, proteins, and DNA (Fig. 4.1 right). Only R strain cells that were exposed to the DNA transformed. It was a big surprise that the genetic material was not in the proteins but in the DNA. Although it was now clear that DNA is the key molecule for understanding life, the scientific community focused on proteins and in particular tried to determine protein structures by X-ray diffraction from protein crystals. A young American zoologist, James D. Watson, was obsessed with revealing the structure of DNA in the early 1950s, sharing his office with up-to-then unsuccessful physics PhD student Francis Crick. There, in the Cavendish Laboratory
Figure 4.2 Pauling’s α-helix proposed for proteins in 1951. [Labels: side chains; carbon, nitrogen, oxygen, hydrogen; regular polypeptide backbone; assumption: same chemical environment for each atom of the regular backbone.]
in Cambridge, the emphasis was on determining the structure of hemoglobin freshly extracted from race horses. Studies on DNA crystals were performed in London at King’s College by Maurice Wilkins and Rosalind Franklin. It was known from these studies that DNA formed some form of helix. The general plan of Watson and Crick was to repeat the success story of the great American chemist Linus Pauling. Pauling predicted in 1951 that there could be a helical structure in proteins, the α-helix. The idea is based on the fact that proteins have a regular backbone, the polypeptide backbone, see Fig. 4.2. Even though the side chains (or residues) are very diverse, he suggested that stretches of the protein chain could adopt a configuration where each atom of the regular backbone has the same chemical environment. By playing with an atomic model kit made from balls and sticks, Pauling discovered that this is possible when the backbone assumes a helical shape that is stabilized by hydrogen bonds. The irregular side groups face to the outside of the helix, as schematically depicted in Fig. 4.2, without disturbing the helical geometry. Watson and Crick started from the facts that DNA features a regular sugar-phosphate backbone (see Fig. 4.3 left) and that, according to the King’s College scattering data, it assumes a helical shape (probably made from two or three strands). As for the
Figure 4.3 In 1951 Watson and Crick came up with this wrong scheme to predict a DNA double helix. [Labels: phosphate, sugar, purine, pyrimidine; regular sugar-phosphate backbone; assumption: same chemical environment for each atom of the regular backbone and 2 or 3 intertwined chains (X-ray); problem: multivalent ions (Mg++) necessary to prevent repulsion of phosphates.]
protein case, there was the problem of having to deal with irregular side chains, here large purines (A, G) and small pyrimidines (T, C). Playing with a ball and stick model they managed in 1951 to construct a helix made of two intertwined chains with the two sugar-phosphate backbones in the center and the side chains sticking out into the solution. However, the model had one serious problem: each phosphate group carries one negative charge in solution and the model had to assume that all these groups are located in the center of the helix. To make this possible they assumed the presence of the divalent cation magnesium that forms bridges between the phosphates of the two backbones (see Fig. 4.3 upper right). When Watson and Crick proudly presented their model to Franklin, she pointed out that there were no multivalent cations present in her experiments, and so the model had to be abandoned. After this altogether unpleasant incident, the lab forbade Watson and Crick to continue working on DNA. At that time they happened to share the office with Peter Pauling, Linus’ son. In early 1953, Peter gave them a preprint by his father in which he proposed the structure of DNA. The shock was followed by relief when they realized how Pauling in his model, three intertwined chains with
Figure 4.4 Once the base pairing was found, the road was paved for 1953’s great discovery by Watson and Crick: the DNA double helix. [Labels: guanine (G) paired with cytosine (C), adenine (A) paired with thymine (T), each base attached to a sugar; assumption: paired bases stacked in the center.]
the sugar-backbone in the middle, had dealt with the problem of the phosphate charges. As the world-leading chemist of his time he had just forgotten to put charges on his phosphates. The wrong model was published in February ’53 (Pauling and Corey, 1953) and has only been cited 158 times so far (June 2020). Now Watson and Crick hurried to find the right structure before Pauling would be aware of his mistake. They realized that they needed to bring the charged backbone to the outside of the helix and to pack the irregular bases in an ordered fashion inside the helix. The breakthrough came when Watson played with paper cutouts of the bases. He found out that he could pair the bases, A with T via two hydrogen bonds and G with C via three hydrogen bonds, see top of Fig. 4.4. Both are purine-pyrimidine pairs and therefore the resulting two structures are approximately the same size. The specific base pairing explained the mysterious Chargaff’s rules: For any given DNA
sample, the amount of A equals the amount of T and the amount of C equals the amount of G. It was then quite straightforward for Watson and Crick to build a ball-stick model of a double helix with the base pairs forming a regular stack in the middle and the backbones facing the water at the outside (see Fig. 4.4). The model was published in 1953 in the journal Nature (Watson and Crick, 1953). Watson and Crick did not speak about biological implications, except in the last sentence: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”
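How the specific base pairing enforces Chargaff’s rules can be illustrated with a few lines of Python. This is a toy sketch with hypothetical names: an arbitrary made-up sequence is completed to a double strand by pairing A with T and G with C, and the bases of both strands are counted. The orientation of the partner strand is irrelevant for the counting and is therefore ignored here.

```python
# pairing rules found by Watson: A with T, G with C
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def partner_strand(strand):
    """Return the strand that pairs base by base with the given one."""
    return "".join(PARTNER[base] for base in strand)

def duplex_base_counts(strand):
    """Count the four bases in the double strand built from one single strand."""
    duplex = strand + partner_strand(strand)
    return {base: duplex.count(base) for base in "ATGC"}

if __name__ == "__main__":
    # made-up sequence with very unequal composition on the single strand
    counts = duplex_base_counts("AAAAGC")
    print(counts)
```

However skewed the composition of a single strand is, the double strand always contains as many A as T and as many G as C.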
4.2 DNA on the Base Pair Level

In the previous section we presented the general ideas that led to the discovery of the DNA double helix. In this section we provide a more detailed discussion of possible helix geometries. As we shall see, there are various possibilities. We also discuss how the underlying base pair sequence affects the mechanical properties of such helices. In the first subsection we present a geometric picture which provides useful insights into the structure and mechanical properties of the DNA double helix. However, the approach has its limits: Even though there is nothing wrong with it, some of the experimental observations, e.g., the sequence preferences of nucleosomes which contain strongly bent DNA, go against our intuition. This requires a quantitative approach which is described in the second subsection.
4.2.1 A Geometrical Approach

This subsection follows the beautiful line of argument presented in (Calladine et al., 2004). In order to understand what drives the helix formation it is helpful to consider the solubilities of the components that make up DNA: sugars, phosphates and bases. Sugar is water-soluble, of which many of us convince ourselves at breakfast every day. Phosphates are also water-soluble, leading, e.g., to algae growth in lakes due to
[Figure 4.5 labels: base (B), sugar (S), phosphate (P); backbone repeat l = 6.5 Å; base pair thickness 3.3 Å; holes 3.2 Å wide; ladder width 18 Å; legend: negatively charged, uncharged polar, nonpolar.]
Figure 4.5 Primary structure of DNA. Two sugar-phosphate backbones are connected to each other via base pairs leading to a ladder-like structure. The hydrophobic holes between the base pairs render such a structure unstable in water.
the excessive use of fertilizers in agriculture. Bases, however, are not water-soluble. These three components are linked together in a ladder-like structure as schematically depicted in Fig. 4.5. The sugar-phosphate backbones form the rails of the 18 Å wide ladder with the base pairs constituting its steps. The repeat length of the backbone is 6.5 Å but the base pair plates are only 3.3 Å thick. This leaves 3.2 Å wide holes between the hydrophobic base pairs. Such a flat ladder structure is therefore not favorable in water. The holes can be closed by twisting the ladder into a helix as depicted in Fig. 4.6. The twist per base pair can be estimated as follows. The length of a backbone repeat is $l = 6.5$ Å, but its component in the helix direction (say the $Z$-direction) should ideally only amount to $l_z = 3.3$ Å to close the holes. Consider now the helix from the top (see top of Fig. 4.6). A repeat unit of the backbone then has a length projected into the $XY$-plane of $l_{xy} = \sqrt{l^2 - l_z^2} = 5.6$ Å. The twist angle $\theta$ per base pair is thus

$$\theta = 2 \arcsin\left(\frac{l_{xy}}{2 \times 9\,\text{Å}}\right) \approx 36^\circ \tag{4.1}$$
where we assumed that the distance from the midpoint to the corner of a base pair is approximately 9 Å. This suggests a helical repeat length of the resulting DNA helix of about 10 bp. In addition to the twist rate this simple model predicts that the spacing between the two backbone spirals has two values. This follows from the fact that the backbones are attached to two corners of one long side of each base pair block, the side that is highlighted in orange in the lower right of Fig. 4.6. As a result, the two backbones of the DNA are separated by a so-called minor groove and a major groove (shown in orange and yellow) spiraling around the double helix.

How does this simple geometrical argument compare to reality? Figure 4.7 depicts three idealized examples of DNA helices as deduced from crystal structures of oligonucleotides: A-DNA (Franklin and Gosling, 1953), B-DNA (Watson and Crick, 1953) and C-DNA (Marvin et al., 1958). The B-DNA structure shown in the middle comes closest to the DNA structure in the cell. It features a helical repeat length of 10 bp, in agreement with the simple geometrical argument given above. A-DNA is shown on the left, historically the first discovered structure with 11 bp per turn, and on the right a less well-known structure, C-DNA, with 9 bp per turn. All the structures depicted here are right-handed. The geometrical argument from above does not lead to a preferred handedness but details of the atomic structure lead to a strong bias toward right-handed helices. However, left-handedness is not forbidden per se and a left-handed structure called Z-DNA has actually been observed under certain conditions but is usually disfavored due to strong internal distortions. Why are there different forms of double helices? For the idealized crystal structures as shown in Fig. 4.7 different helices result from different conditions in the crystal like its water content.
In addition, there is another effect that is of more interest in the following, namely the underlying bp sequence. For instance, it has been found that a synthetic DNA chain just made up of a sequence of A’s on one strand (and T’s on the other) features the B-form; a chain of G’s on one strand (and C’s on the other) is found in the A-form. To understand this, we need to get a better grip on the geometric differences between the different helices, especially on the level of the base pairs.
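The twist estimate of Eq. 4.1 is easy to check numerically. The following minimal Python sketch uses only the numbers quoted in the geometrical argument (a backbone repeat of 6.5 Å, a base pair thickness of 3.3 Å and a radius of 9 Å); the variable names are chosen here for illustration and are not part of the original argument.

```python
import math

# Values quoted in the text (all in angstrom).
l = 6.5        # length of one backbone repeat
l_z = 3.3      # rise per base pair needed to close the holes
radius = 9.0   # assumed distance from the helix axis to the backbone attachment point

# Projection of one backbone repeat onto the XY-plane.
l_xy = math.sqrt(l**2 - l_z**2)

# Twist per base pair: l_xy is a chord of a circle with the radius above (Eq. 4.1).
theta = 2.0 * math.degrees(math.asin(l_xy / (2.0 * radius)))

# Number of base pairs needed for a full 360 degree turn.
bp_per_turn = 360.0 / theta

print(f"l_xy           = {l_xy:.2f} A")
print(f"theta          = {theta:.1f} deg")
print(f"helical repeat = {bp_per_turn:.1f} bp")
```

The result, a twist slightly above 36° and a helical repeat just under 10 bp, agrees with the rounded values used in the text.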
[Figure 4.6 labels: top view with backbone attachment points P1 to P11 on a circle of radius 9 Å, projected repeat l_xy = 5.6 Å, twist angle θ = 2 arcsin(l_xy / (2 × 9 Å)) ≈ 36°; side view with l = 6.5 Å, l_z = 3.3 Å, l_xy = 5.6 Å, minor groove and major groove; inset: A-T base pair attached to the sugars (S).]
Figure 4.6 The holes between the base pairs are closed by twisting DNA into a helix. Base pairs are depicted as blocks, their attachment points to the backbones by filled circles.
There are basically six degrees of freedom when going from one base pair to the next: three translational ones (shift, slide, rise) and three rotational ones (tilt, roll, twist), as schematically depicted in Fig. 4.8. Of these six degrees, three show large variations between different helix forms and they are the ones we consider in the following. They are two rotational degrees of freedom, twist around the short axis and roll around the long axis, and one translational degree, slide along the long axis—highlighted in red in Fig. 4.8. As
Figure 4.7 Side views of three right-handed double-helical DNA models: A-DNA, B-DNA, and C-DNA. From Fig. 11.2 of (Olson et al., 2009). Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with permission.
before, one side of the blocks is colored orange to indicate where the backbones (not shown here) are attached. This way the figures also define the sign of each of the deformations. For example, a positive value of the roll means, by convention, that the base pairs open up towards the minor-groove side. Figure 4.9 shows how one can arrive at the three kinds of helices, A-, B- and C-DNA, starting from an untwisted stack of base pairs (left). First we twist the stack by θtwist = 36◦ per base pair step and arrive at B-DNA (second from left, see also Fig. 4.7 middle). To get to A-DNA, we perform two further steps. First we introduce a positive roll, θroll = +12◦ . The base pairs are then tilted with regard to the helix axis, but not much else happens. Then we also “switch” on a negative slide with ρslide = −0.2 nm. As a result, the base pairs slide “downhill” leading to a shorter and thicker helix, reminiscent of A-DNA, Fig. 4.7 left. Figure 4.9 also shows the path to a C-DNA-like structure by introducing a negative roll of −12◦ , and then a positive slide of +0.2 nm (see also Fig. 4.7 right). To conclude, preferences for different helix shapes reflect preferences of the underlying
[Figure 4.8 labels: translations shift (ρshift), slide (ρslide) and rise (ρrise); rotations tilt (θtilt), roll (θroll) and twist (θtwist); axes: tilt/shift axis, roll/slide axis, twist/rise axis.]
Figure 4.8 There are six degrees of freedom per base pair step. The geometry of a base pair step is mainly characterized by the values of the three degrees of freedom highlighted in red: the twist θtwist around the short axis, the roll θroll around the long axis, and the slide ρslide along the long axis. Due to the coupling of the base pairs via the backbones, the other three degrees of freedom, tilt, shift and rise, play less important roles.
base pair sequence to assume certain values of slide, twist and roll. To understand these preferences, we need to take a closer look at individual base pair steps. As depicted in the inset of Fig. 4.10 the base pairs feature a propeller twist resembling propeller blades of an airplane. The origin of this propeller twist can be explained by going back to the model introduced in Fig. 4.6 where we argued that the holes between the base pairs are closed through their twisting into a helix. As we can see in the top of Fig. 4.10, this still leaves triangular sections exposed to the water. When looking onto the small side of a base pair step, one finds that only small sections of the extremities of the base pairs overlap, see the top right of Fig. 4.10. This area can be increased by a rotation around the major axis of the bases as shown on the bottom right of Fig. 4.10. Since this rotation has to go in opposite directions for each base of a pair, the base pairs take the form of a propeller at the expense of the hydrogen bonds in between that need to be deformed, see the inset in Fig. 4.10.
[Figure 4.9 labels: B-like: θtwist = +36°; A-like: θroll = +12°, ρslide = −0.2 nm; C-like: θroll = −12°, ρslide = +0.2 nm.]
Figure 4.9 Paths that lead from an untwisted stack of parallel base pairs (vanishing values of twist, roll, and slide) to A-, B-, and C-DNA-like helices. Note that the intermediate configurations shown cannot be realized with real DNA since the base pairs are connected via the backbones. This also leads to slightly different twist-values for A-, B-, and C-DNA, neglected in this schematic figure.
The propeller twist of the base pairs leads to preferences for roll and slide values, which are caused by even more microscopic details of the respective base pair steps. We will now inspect two specific examples: the AA/TT step and pyrimidine-purine steps. The AA/TT step is schematically depicted in Fig. 4.11(a) for the case of vanishing roll and slide (for simplicity shown here for a 0° twist). In this case, as indicated in the figure, an additional H-bond can be formed between the T of one base pair and the A of the following base pair. That is why AA/TT steps prefer small values of slide and roll. This can be verified by looking at a collective scatter plot in the roll-slide plane of AA/TT base pair steps, Fig. 4.11(b). These step parameters have been extracted from a large number of protein-DNA complexes like the one depicted in Fig. 1.11. One hopes that the different forces acting on the base pair steps in different crystals somehow average out and that the natural conformational response to forces emerges. Since the roll and slide values for the AA/TT step,
[Figure 4.10 labels: θtwist = 36°; water; not protected from water; increase contact area; distorted H bonds.]
Figure 4.10 Top: Base pair step seen from the top and from the side. Triangular sections are exposed to the water. Bottom: A propeller twisted base pair step has smaller unprotected hydrophobic surfaces. Inset: The propeller twist is achieved at the expense of the H-bonds between the bases that have to be deformed.
[Figure 4.11 labels: (a) AA/TT step with ρslide = 0 nm and θroll = 0°; (b) scatter plot for AA/TT, horizontal axis ρslide [Å] from −3 to 3, vertical axis θroll from −40 to 60.]
Figure 4.11 (a) Schematic sketch of the preferred geometry of an AA/TT step and (b) scatter plot in the roll-slide plane of AA/TT base pair steps found in high-resolution protein-DNA crystal complexes (data from Olson et al. (2009)).
as is characteristic for B-DNA, are mostly small, one should expect that DNA rich in AA steps tends to be in the B-form, as already mentioned.
[Figure 4.12 labels: (a) two extreme geometries of a pyrimidine-purine step: ρslide = −0.2 nm with θroll = +20° (cross-stacking of the purines) and ρslide = +0.2 nm with θroll = 0° (avoidance of steric clash); (b) CA/TG and (c) TA/TA scatter plots, horizontal axis ρslide [Å] from −3 to 3, vertical axis θroll from −40 to 60.]
Figure 4.12 (a) Pyrimidine-purine steps have a wide range of roll angles and slides between the two extremes shown here. Scatter plots in the roll-slide plane of (b) CA/TG and (c) TA/TA base pair steps found in high-resolution protein-DNA crystal complexes (adapted from Olson et al. (2009)).
Figure 4.12(a) shows a step where a small pyrimidine is followed by a large purine, such as CA/TG. Because of the steric clash of the purines, such a step is not stacked as effectively and is therefore more flexible than the AA/TT step discussed above. Two extreme cases are shown: in the top configuration positive roll orients the purines parallel and a negative slide achieves their partial cross-stacking. In the bottom configuration steric clash is avoided through adopting a positive slide. Figure 4.12(b) and (c) present two concrete examples of such pyrimidine-purine steps in the form of scatter plots of roll vs. slide for CA/TG and TA/TA. The general tendency suggested by the geometrical argument in Fig. 4.12(a), in particular the correlation between roll and slide, can indeed be observed in these plots. This suggests that pyrimidine-purine steps can easily be accommodated inside A-, B-, and C-DNA-like helices. Note that pyrimidine-purine steps differ from purine-pyrimidine steps since the backbones have a directionality (see the sugar rings
[Figure 4.13 labels: panels (a) and (b) show the roll angle θroll versus base pair step position n; vertical axis from −25° to 25°; marked roll values 9° and 22.5°.]
Figure 4.13 Bending DNA can be achieved by periodic rolling of the base pairs. (a) In this example, the base pair stack is bent in a smooth fashion. This is achieved by a sinusoidal dependence of the roll angle on n, the base pair step position. (b) Here the overall bending is achieved via localized kinks at the base pair steps where the minor or major groove points inwards.
in Fig. 4.5 and the arrows in Fig. 4.12(a)). Purine-pyrimidine steps turn out to be much less flexible than their pyrimidine-purine counterparts.

The details of the base pair sequence not only have consequences for the preferences for specific helix geometries but also affect the elastic properties of DNA, especially with regard to bending. Figure 4.13 shows what happens on the microscopic base pair level when DNA is bent on a much larger scale. Depicted is a stack of twisted base pair blocks in the B-like geometry, the same as shown in Fig. 4.9. Now, however, the helix axis is no longer straight, but bent with a certain macroscopic radius of curvature. In Fig. 4.13(a) this is achieved by introducing a sinusoidal roll of the form $\theta_{\text{roll}}^n = 9^\circ \cos(2\pi n/10)$ where $n$ denotes the base pair number. The largest roll value, here 9°, is localized at the points where the major groove faces inwards at n = 0, ±10, ... The most negative value, −9°, is assumed at the places where the minor groove points inwards, n = ±5, ±15, ... Figure 4.13(b) shows an alternative way to bend DNA away from the straight state by concentrating all the bends at the places where either groove points inwards, while all the other
steps have zero roll values. This leads to a helix with kinks every fifth base pair step. The smoother alternative shown on the left-hand side is cheaper as long as the curvature is not too high. However, once DNA is bent too strongly, the weakest points along its contour, namely those where either groove points inwards, give way and thereby focus the entire curvature on them. A closer look at the crystal structure of the nucleosome core particle, Fig. 1.8, reveals that the nucleosomal DNA is rather sharply bent at the places where the minor or major groove faces inwards, resembling the situation depicted in Fig. 4.13(b). The combination of the two facts that DNA has sharp bends in nucleosomes and that pyrimidine-purine steps are more flexible than purine-purine steps opens up a fantastic possibility: mechanical signals could be inscribed into the DNA base pair sequence that tell nucleosomes where to sit on the DNA. This is indeed possible; it turns out that rolling a pyrimidine-purine step (especially at negative roll positions where the minor groove has to be compressed) is much cheaper than for any other step (Olson et al., 2009). A section of DNA featuring a pyrimidine-purine step every 10 base pairs is thus much easier to wrap around a nucleosome than a DNA featuring a random sequence.

Surprisingly at first, it is even possible to inscribe such mechanical signals on DNA stretches that carry a gene. This is possible because the genetic code is redundant: 64 codons encode only 20 amino acids. Figure 1.5 not only depicts the classical genetic code but also indicates this second, mechanical, code (Segal et al., 2006). It can be read off the letters that represent the amino acid corresponding to a given codon. The left half of that letter is shown in white if the first step in its codon is a flexible step, namely a pyrimidine-purine step. Otherwise it is displayed in black. The right half indicates the same for the second base pair step of the corresponding codon.
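The black-and-white labeling rule just described can be sketched in a few lines of Python. The function names `is_flexible_step` and `codon_shading` are hypothetical helpers introduced only for this illustration. The sketch implements the two-state simplification of Fig. 1.5 (flexible versus stiff) and says nothing about the actual elastic energies of the steps.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_flexible_step(step):
    """Return True for a pyrimidine-purine step such as TA, CA, TG or CG."""
    first, second = step
    return first in PYRIMIDINES and second in PURINES


def codon_shading(codon):
    """Shading of the (left, right) half of the amino acid letter for a codon.

    'white' marks a flexible pyrimidine-purine step, 'black' any other step.
    The left half refers to the first step of the codon, the right half to the second.
    """
    steps = (codon[0:2], codon[1:3])
    return tuple("white" if is_flexible_step(s) else "black" for s in steps)


# Two codons that both encode serine (S) in the standard genetic code.
for codon in ("TCA", "TCT"):
    print(codon, codon_shading(codon))
```

Under these assumptions TCA comes out half black and half white, whereas TCT is entirely black, so the same amino acid can be written with two different mechanical labels.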
For example, the codon TCA stands for S (serine), its left half is shown in black (TC, a pyrimidine-pyrimidine step) and its right half in white (CA, a pyrimidine-purine step). The codon TCT also represents S, but this time both steps, TC (pyrimidine-pyrimidine) and CT (pyrimidine-pyrimidine), are stiff and hence the whole letter S is depicted in black. This example suggests that two signals, the
classical genetic code and the mechanical code, can actually be multiplexed on DNA. Looking at the genetic table, Fig. 1.5, it becomes clear that the mechanical code is far from optimal. Only 8 out of the 20 amino acids have two different mechanical codes available. In fact, however, there is more room for variation via the base pair step between neighboring codons. Also the black-white view of the elasticity presented in Fig. 1.5 oversimplifies the fact that there is a whole spectrum of elastic energies associated with the different base pair steps, not just the two cases represented by black and white. Various experiments have been performed to determine the positions of nucleosomes on DNA (see e.g., (Segal et al., 2006)). Figure 4.14(a) presents the fraction of dinucleotides that were found to be TA steps inside mapped nucleosomes as a function of position x (in bp). These data are the average over various experiments (on chicken and mouse DNA, extracted from cells, or on reconstituted nucleosomes on this DNA (Segal et al., 2006)). Depicted is the left half of the nucleosome (from x = −70 to 0) and the beginning of the right half (up to x = 10). We do not show more than that since the two halves are mirror symmetric with respect to each other. As you can see, there are well-defined peaks at positions x = −65, −55, −45 and so on (indicated by arrows). These positions correspond to places where the minor groove faces inwards. This finding is consistent with our expectation, namely that flexible pyrimidine-purine steps should be preferred at positions with strong bending. Note that the experiments typically only detect nucleosomes that are strongly positioned, so this is not the average over all nucleosomes that are present e.g., on the chicken genome. In Fig. 4.14(b) we use arrows to point at places where TA steps are more likely to be found. To test our hypothesis further, let us look at purine-purine (or pyrimidine-pyrimidine) steps in Fig. 4.14(c). 
The probability to find a CC or GG step is rather flat (i.e., position independent) compared to the distribution for the TA steps. In other words, the mechanism of localization of soft bends found for TAs does not apply here, as expected. However, rather confusingly, the fraction of AA or TT steps—also depicted in Fig. 4.14(c)—shows pronounced peaks in phase with the TA steps. Why should these stiff steps
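[Figure 4.14 labels: (a) fraction p of TA steps versus position x (ticks from −60 to 10), vertical axis ticks 0.02 to 0.06; (b) nucleosome with preferred TA positions marked; (c) fractions p of AA + TT and of CC + GG steps versus position x (ticks from −60 to 10), vertical axis ticks 0.05 to 0.20.]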
Figure 4.14 Dinucleotide fractions found for nucleosomal DNA (Segal et al., 2006). (a) and (c) depict the averages over various measurements on chromatin extracted from chicken and mouse cells as well as on reconstituted nucleosomes on chicken and mouse DNA. (a) The fraction of TA steps shows pronounced peaks at base pair step positions x = −65, −55, −45 and so on. These positions correspond to large negative roll where the flexible TA steps are energetically advantageous. (b) A nucleosome with the preferred positions for TA steps indicated by arrows. (c) Fractions of CC and GG steps combined (bottom curve) and of AA and TT steps (top curve). All these steps are stiff but surprisingly AA and TT steps are peaked where TA steps are peaked (see text). Technical note: There are data points exactly at the center of the nucleosome (x = 0; green lines). However, the central position is occupied by a base pair, not a base pair step. The reason is that each point represents a three base pair moving average (Segal et al., 2006).
preferably occur at positions of large negative roll? In fact, it is never found that such steps are strongly deformed in crystal structures of nucleosome core particles. As stated in Ref. (Tolstorukov et al., 2007): “Apparently, their positioning role is to bring the DNA sequence in register with the histone-octamer template; namely, to secure the most bendable DNA motifs adjacent to key histone arginine residues, which interact with the narrow minor groove formed by the AA:TT dimers, and seemingly facilitate the kinking and wrapping of DNA around the protein core.” Obviously it is rather challenging to understand the nucleosomal sequence preferences based on geometrical arguments alone. The next subsection provides a more systematic approach.
4.2.2 A Statistical Physics Approach

When writing the first edition of this textbook, it was rather unclear exactly how the sequence preferences of the nucleosome come about. In this section we make use of some new insights that follow from recent work (Zuiddam et al., 2017). This work suggests that it is less the elasticity than the intrinsic geometry of base pair steps that underlies most of the sequence preferences. In addition, one needs to be aware of the fact that each base pair step is part of a longer sequence. Only when accounting for this sequence continuity can the preferences of the nucleosome for certain base pair steps at certain positions of its wrapped DNA be understood. To come to this conclusion, hand-waving qualitative arguments, like the ones I provided in the previous subsection, are only of limited help and instead a rigorous analytical approach is needed.

As a starting point we introduce a physical model for the sequence-dependent mechanics and elasticity of the DNA double helix. There are various such models in the literature; here we use the so-called rigid base pair model (Olson et al., 1998). This is a coarse-grained representation of the DNA double helix that treats the base pairs as rigid plates. It is therefore in the same spirit as in the previous section (without accounting for the inner degrees of freedom of a base pair step). As already depicted above (see Fig. 4.8), the relative orientation and position of neighboring plates are described by six degrees of freedom which are known as tilt, roll, twist, shift, slide and rise. The model further assumes that a given base pair step has a preferred intrinsic geometry that is characterized by the values $\bar{\theta}_{\text{tilt}}$, $\bar{\theta}_{\text{roll}}$, $\bar{\theta}_{\text{twist}}$, $\bar{\rho}_{\text{shift}}$, $\bar{\rho}_{\text{slide}}$ and $\bar{\rho}_{\text{rise}}$ of its six degrees of freedom (the bars indicate that these are the intrinsically preferred and not the actual values of these conformational variables).
We can collect these values in a six-component vector $\bar{\mathbf{x}}$:

$$\bar{\mathbf{x}} = (\bar{x}_1, \ldots, \bar{x}_6) = \left(\bar{\theta}_{\text{tilt}}, \bar{\theta}_{\text{roll}}, \bar{\theta}_{\text{twist}}, \bar{\rho}_{\text{shift}}, \bar{\rho}_{\text{slide}}, \bar{\rho}_{\text{rise}}\right). \tag{4.2}$$

As already discussed in the previous subsection, these intrinsic values depend on the type of base pair step, i.e., a GA step features a different set of values than an AA step and so on. The values of the intrinsically preferred configurations of the different base pair steps
can be extracted from DNA-protein crystals (Olson et al., 1998), as mentioned already in the previous section, see also Figs. 4.11 and 4.12. A complication to mention is that rotations are non-commutative. Going back to Fig. 4.8, you can convince yourself that a 90° rotation around the tilt axis followed by a 90° rotation around the roll axis leads to a different final relative orientation between the base pairs than in the case where the order of the rotations is reversed. In other words, when providing the values of $\bar{\theta}_{\text{tilt}}$, $\bar{\theta}_{\text{roll}}$ and $\bar{\theta}_{\text{twist}}$, a recipe must also be given of how the rotations are defined; the same, of course, also applies when describing the geometry of a DNA molecule in terms of the degrees of freedom of its base pair steps. However, as long as the angles are small enough (as is the case here), the differences between those definitions stay small. We therefore do not go into detail here, but mention just for completeness that we use a definition (Coleman et al., 2003) based on the so-called Euler angles (which are discussed in the next section in a different context). There are good reasons to also use another convention (Lavery et al., 2009).

So far we have a purely geometrical model. We can determine the ground state geometry of a given DNA molecule by setting the values of all the degrees of freedom of each base pair step to their intrinsic values. However, what we want to know is the bending energy that one has to pay when one forces the DNA molecule from its ground state shape to another configuration, e.g., the one in which it is found in a nucleosome. To do this, we must also take into account the elasticity of the DNA double helix. The assumption is that the elastic energy $E_{\text{el}}$ of a given base pair step is quadratic with respect to deformations away from the preferred intrinsic state $\bar{\mathbf{x}}$, i.e., that it can be described by the quadratic form

$$E_{\text{el}}(\mathbf{x}) = \frac{1}{2} (\mathbf{x} - \bar{\mathbf{x}})^T K (\mathbf{x} - \bar{\mathbf{x}}). \tag{4.3}$$
Here $K$ is a sequence-dependent symmetric 6 × 6 stiffness matrix. The preferred intrinsic state $\bar{\mathbf{x}}$ of the base pair step is given by Eq. 4.2 and the actual state into which it is forced by

$$\mathbf{x} = (x_1, \ldots, x_6) = \left(\theta_{\text{tilt}}, \theta_{\text{roll}}, \theta_{\text{twist}}, \rho_{\text{shift}}, \rho_{\text{slide}}, \rho_{\text{rise}}\right). \tag{4.4}$$
In terms of the individual components, Eq. 4.3 can be rewritten as

$$E_{\text{el}}(\mathbf{x}) = \frac{1}{2} \sum_{i=1}^{6} \sum_{j=1}^{6} (x_i - \bar{x}_i) K_{ij} (x_j - \bar{x}_j). \tag{4.5}$$
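To make the double sum of Eq. 4.5 concrete, here is a minimal Python sketch that evaluates it for a single base pair step. All numbers below are hypothetical and chosen only so that the arithmetic is easy to follow; they are not fitted parameters from (Olson et al., 1998). Energies are assumed to be in units of $k_B T$, angles in degrees and lengths in Å.

```python
def elastic_energy(x, x_bar, K):
    """Elastic energy of one base pair step as the double sum of Eq. 4.5."""
    d = [xi - xbi for xi, xbi in zip(x, x_bar)]
    return 0.5 * sum(d[i] * K[i][j] * d[j] for i in range(6) for j in range(6))


# Order of the components: (tilt, roll, twist, shift, slide, rise).
# Hypothetical intrinsic and actual states of one step (illustration only).
x_bar = [0.0, 1.0, 36.0, 0.0, 0.0, 3.4]
x = [1.0, 5.0, 35.0, 0.0, 0.2, 3.4]

# Hypothetical symmetric stiffness matrix: diagonal entries plus one
# roll-twist coupling term (K[1][2] = K[2][1]).
diagonal = [0.04, 0.02, 0.03, 1.5, 2.0, 7.0]
K = [[diagonal[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
K[1][2] = K[2][1] = 0.005

# Same matrix with the off-diagonal coupling switched off.
K_diag = [[diagonal[i] if i == j else 0.0 for j in range(6)] for i in range(6)]

E_full = elastic_energy(x, x_bar, K)
E_diag = elastic_energy(x, x_bar, K_diag)
print(f"with coupling:    {E_full:.3f} kT")
print(f"without coupling: {E_diag:.3f} kT")
```

In this toy example the coupling term lowers the energy because roll and twist deviate in opposite directions, which illustrates why the off-diagonal elements of $K$ matter whenever they are not small.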
As is the case for the preferred intrinsic geometry, the components $K_{ij}$ of the stiffness matrix also depend on the type of base pair step. They have been extracted from DNA-protein crystals, i.e., from data like the ones shown in Figs. 4.11 and 4.12, as described in the following. The starting point is the assumption that the elasticity of the DNA molecule is given by Eq. 4.5 (in reality there are, of course, deviations from this simple harmonic behavior). It can then be shown that in this case the expectation values of the covariances of the conformational variables are given by $\langle (x_i - \bar{x}_i)(x_j - \bar{x}_j) \rangle / k_B T = [K^{-1}]_{ij}$ ($K^{-1}$ denotes the inverse matrix of $K$). This relation is the multi-dimensional version of Eq. A.12. In (Olson et al., 1998) this procedure was simply inverted. The observed covariances of the various degrees of freedom, extracted from the DNA-protein crystal complexes (see e.g., Figs. 4.11 and 4.12), were collected as the components of a covariance matrix $K^{-1}$ from which the stiffness matrix $K$ was calculated via matrix inversion. However, it is unclear what temperature must be assumed for this procedure. The idea is that there is an “effective temperature” (Olson et al., 1998) which can be determined by comparison to, for example, the experimentally known average stiffness of the DNA double helix which manifests itself in a persistence length, a quantity to be defined in Section 4.3.

Next, we force the DNA model into a configuration that is an idealized version of the DNA geometry inside a nucleosome: a smoothly deformed DNA superhelix. Before discussing how to achieve this, let us first consider a planar DNA circle. In fact, in the previous subsection we discussed how a stack of twisted base pair plates can be deformed into this geometry, namely by periodically changing the roll in a sinusoidal fashion, see Fig. 4.13(a).
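Returning briefly to the parameter extraction described above, the inversion of observed covariances into stiffnesses can be sketched as follows. This is a deliberately reduced illustration and not the procedure of (Olson et al., 1998) in full: it uses only two variables (roll and slide) instead of six, four made-up data points instead of crystal data, and an effective temperature set to $k_B T_{\text{eff}} = 1$. The helper names `covariance_2d` and `invert_2x2` are assumptions of this sketch.

```python
def covariance_2d(samples):
    """Covariance matrix of a list of (a, b) pairs, normalized by the sample count."""
    n = len(samples)
    mean_a = sum(a for a, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    c_aa = sum((a - mean_a) ** 2 for a, _ in samples) / n
    c_bb = sum((b - mean_b) ** 2 for _, b in samples) / n
    c_ab = sum((a - mean_a) * (b - mean_b) for a, b in samples) / n
    return [[c_aa, c_ab], [c_ab, c_bb]]


def invert_2x2(m):
    """Inverse of a 2 x 2 matrix via the explicit determinant formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Hypothetical (roll in degrees, slide in angstrom) values of one step type.
# Made-up numbers for illustration only, NOT crystal data.
samples = [(4.0, -0.5), (-4.0, 0.5), (2.0, 0.5), (-2.0, -0.5)]

kT_eff = 1.0                       # assumed effective temperature (energy unit)
C = covariance_2d(samples)         # observed covariances
K = [[kT_eff * v for v in row] for row in invert_2x2(C)]  # stiffness = kT * C^-1

print("covariance:", C)
print("stiffness: ", K)
```

In this toy data set roll and slide are negatively correlated, which shows up as a positive off-diagonal stiffness element: deviations with opposite signs of roll and slide are cheaper than deviations with equal signs. Note that restricting the inversion to two variables yields effective stiffnesses and is only a stand-in for inverting the full 6 × 6 covariance matrix.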
Strictly speaking, a sinusoidal roll alone does not lead to a circle with constant curvature, since base pair steps contribute differently to the overall bending: most is contributed by the steps that feature maximal and minimal roll, other steps do not contribute at all (at places of zero roll) and the
rest of the steps contributes with values in between. It is possible to let each base pair step contribute equally, and thus to create a circle with constant curvature, by making use of an appropriate amount of tilt in addition to roll. Specifically one needs to vary roll and tilt periodically, shifted by 90° with respect to each other:

$$\left(\theta_{\text{tilt}}^n, \theta_{\text{roll}}^n, \theta_{\text{twist}}^n\right) = \left(\Gamma \sin(2\pi n/10 - \phi), \Gamma \cos(2\pi n/10 - \phi), \theta_{\text{twist}}\right). \tag{4.6}$$

Here $n$ denotes the base pair step number and $\phi$ is a phase which determines the positions of maximum and minimum roll and tilt. $\theta_{\text{twist}}$ is a constant. By choosing $\theta_{\text{twist}} = 36^\circ$ one makes sure that roll and tilt always bend the base pair steps in the same plane toward the center of the resulting DNA ring (as the period of the undulations is 10 base pairs). $\Gamma$ is the amplitude of the undulations in roll and tilt; the net base pair step bending angle $\sqrt{(\theta_{\text{tilt}}^n)^2 + (\theta_{\text{roll}}^n)^2}$ always amounts to this value. Moreover, for the translational degrees of freedom we assume

$$\left(\rho_{\text{shift}}^n, \rho_{\text{slide}}^n, \rho_{\text{rise}}^n\right) = \left(0, 0, 3.4\,\text{Å}\right), \tag{4.7}$$

i.e., a constant rise but no shearing of the base pair steps. To create a circle of a given radius $R$, one needs to determine the net base pair bending angle $\Gamma$ from the relation $\sin(\Gamma/2) = 3.4\,\text{Å}/(2R)$; for instance, a radius $R = 43$ Å requires $\Gamma = 4.53^\circ$.

However, to mimic the overall nucleosomal DNA geometry, we need a non-planar arrangement, a left-handed superhelix. Following (Balasubramanian et al., 2009) we choose $R = 43.3$ Å for the radius and $P = 32.2$ Å for the pitch. One can show that this superhelix is obtained for a bending angle $\Gamma = 4.46^\circ$ and a twist of $\theta_{\text{twist}} = 35.575^\circ$ (Balasubramanian et al., 2009). Note that unlike for the planar circle, $\theta_{\text{twist}}$ is chosen to be slightly smaller than 36°. This means that after 10 steps the right-handed stack of base pairs has not yet made a full 360° turn. As a result, bending is slightly out of plane resulting in a left-handed superhelical configuration.
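The planar-circle construction can be checked with a short Python sketch. It computes the net bending angle per step (written `gamma` here) from the relation between rise and radius, and then verifies that the tilt and roll of Eq. 4.6 combine to the same net bend at every step. The function names and the choice of a vanishing phase are assumptions made for this illustration.

```python
import math

RISE = 3.4  # rise per base pair step in angstrom (Eq. 4.7)


def bending_angle_deg(radius):
    """Net bend per step in degrees from sin(gamma / 2) = rise / (2 R)."""
    return 2.0 * math.degrees(math.asin(RISE / (2.0 * radius)))


def tilt_roll(n, gamma, phi=0.0, period=10):
    """Tilt and roll of step n according to Eq. 4.6 (angles in degrees)."""
    arg = 2.0 * math.pi * n / period - phi
    return gamma * math.sin(arg), gamma * math.cos(arg)


gamma = bending_angle_deg(43.0)
print(f"net bend per step for R = 43 A: {gamma:.2f} deg")
print(f"steps needed for a closed circle: {360.0 / gamma:.1f}")

# Tilt and roll vary from step to step, but their combination stays constant.
for n in range(10):
    tilt, roll = tilt_roll(n, gamma)
    net = math.hypot(tilt, roll)
    print(f"n = {n}: tilt = {tilt:+.2f}, roll = {roll:+.2f}, net bend = {net:.2f}")
```

The number of steps per closed circle, about 79, is consistent with the circumference 2πR ≈ 270 Å divided by the rise of 3.4 Å per step.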
Moreover, since the overall length of the base pair stack is 146 × 0.34 nm = 49.6 nm and the length of one turn of the superhelix is $\sqrt{(2\pi R)^2 + P^2} = 27.4$ nm, this results in about 1.8 superhelical turns, comparable to the nucleosomal geometry, see Fig. 4.15. The phase $\phi$ in Eq. 4.6 is set to the value $147\pi/10$ such that the base
Figure 4.15 The nucleosomal DNA is represented by an ideal superhelix of rigid base pair plates. The six degrees of freedom for each base pair step are given by Eqs. 4.6 and 4.7 with a bending angle Γ = 4.46° and θtwist = 35.575°.
pair at the central position, between dinucleotide steps 73 and 74, corresponds to the position of maximal roll, in accordance with the fact that at that position the major groove faces the histone octamer, see Fig. 1.8. This is also the place where tilt changes sign from negative to positive values.

Now that we have the positions and orientations of the 147 base pairs in Eqs. 4.6 and 4.7, we can calculate the bending energy for a given sequence of nucleotides $\{N_1, N_2, \ldots, N_{147}\}$ where the $N_i$’s can be either A, T, G or C. To do this, we just need to sum over the elastic energies (given by Eq. 4.5) of all 146 base pair steps. However, this involves a large number of parameters. Some of these parameters do not play any role in our current geometric setup as we have set the shear degrees of freedom, shift and slide, to zero. To simplify our analysis even further, we only take those degrees of freedom into account that lead to the bending of the DNA double helix, i.e., tilt and roll, but neglect the energy contributions from twist and rise. We also set all non-diagonal elements of the stiffness matrix, which tend to be much smaller than the diagonal elements (Olson et al., 1998), to zero, i.e., $K_{ij} = 0$ for $i \neq j$. In the end, we just need to know the
stiffnesses $K_{11} = K_{\mathrm{tilt}}$ and $K_{22} = K_{\mathrm{roll}}$ together with the preferred intrinsic values $\bar{\theta}_{\mathrm{tilt}}$ and $\bar{\theta}_{\mathrm{roll}}$. Now what remains from the total energy, Eq. 4.5, of a given base pair step, say step $n$, are just two contributions, namely the bending costs due to roll and tilt:

$$E^{n}(N_n N_{n+1}) = E^{n}_{\mathrm{tilt}}(N_n N_{n+1}) + E^{n}_{\mathrm{roll}}(N_n N_{n+1}) \tag{4.8}$$

where $N_n$ and $N_{n+1}$ denote the nucleotides belonging to the $n$-th base pair step, read along one of the two strands of the double helix. Specifically, the tilting energy is given by

$$E^{n}_{\mathrm{tilt}}(N_n N_{n+1}) = \frac{1}{2} K_{\mathrm{tilt}}(N_n N_{n+1}) \left[\theta^{n}_{\mathrm{tilt}} - \bar{\theta}_{\mathrm{tilt}}(N_n N_{n+1})\right]^2 \tag{4.9}$$

and the rolling energy by

$$E^{n}_{\mathrm{roll}}(N_n N_{n+1}) = \frac{1}{2} K_{\mathrm{roll}}(N_n N_{n+1}) \left[\theta^{n}_{\mathrm{roll}} - \bar{\theta}_{\mathrm{roll}}(N_n N_{n+1})\right]^2. \tag{4.10}$$

The stiffnesses, $K_{\mathrm{tilt}}$ and $K_{\mathrm{roll}}$, and the intrinsic values, $\bar{\theta}_{\mathrm{tilt}}$ and $\bar{\theta}_{\mathrm{roll}}$, depend on the chemical identities of the base pairs involved in the step, namely the nucleotides $N_n$ and $N_{n+1}$. We therefore explicitly write $K_{\mathrm{tilt}}(N_n N_{n+1})$ and so on.

How many parameters do we now need to know explicitly? There are four different quantities, $\bar{\theta}_{\mathrm{tilt}}$, $K_{\mathrm{tilt}}$, $\bar{\theta}_{\mathrm{roll}}$ and $K_{\mathrm{roll}}$, and we need their values for all 16 possible base pair steps: AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC, and GG. However, some of these steps are related to each other. The DNA double helix has two strands that run in opposite directions, see Figs. 4.4 and 4.5. That means an AC step read along one strand, i.e., nucleotide A followed by nucleotide C, corresponds to a GT step read along the other strand. Likewise AA corresponds to TT, AG to CT, TC to GA, TG to CA and CC to GG. The remaining steps AT, TA, CG and GC do not change when read along the other strand. So in total there are only 10 independent base pair steps. Now it is obvious that the stiffnesses of a base pair step do not depend on which strand we look at. Therefore we have, e.g., $K_{\mathrm{tilt}}(\mathrm{AC}) = K_{\mathrm{tilt}}(\mathrm{GT})$ and $K_{\mathrm{roll}}(\mathrm{AC}) = K_{\mathrm{roll}}(\mathrm{GT})$. Also the intrinsic roll of a base pair step remains the same when we switch from one strand to the other, e.g., $\bar{\theta}_{\mathrm{roll}}(\mathrm{AC}) = \bar{\theta}_{\mathrm{roll}}(\mathrm{GT})$. The reason is that roll is defined such that it is positive if a base pair step closes
Table 4.1 Stiffnesses and intrinsic geometries with respect to tilt and roll of the 16 base pair steps, extracted from 76 DNA-protein crystals (Olson et al., 1998). Intrinsic angles are in degrees, stiffnesses in units of kB T/deg^2.

Dinucleotide   intrinsic tilt [deg]   Ktilt [kB T/deg^2]   intrinsic roll [deg]   Kroll [kB T/deg^2]
AA             −1.4                   0.100                0.7                    0.049
AT              0.0                   0.166                1.1                    0.055
AC             −0.1                   0.111                0.7                    0.080
AG             −1.7                   0.149                4.5                    0.096
TA              0.0                   0.148                3.3                    0.029
TT              1.4                   0.100                0.7                    0.049
TC              1.5                   0.087                1.9                    0.046
TG             −0.5                   0.082                4.7                    0.048
CA              0.5                   0.082                4.7                    0.048
CT              1.7                   0.149                4.5                    0.096
CC              0.1                   0.119                3.6                    0.064
CG              0.0                   0.068                5.4                    0.050
GA             −1.5                   0.087                1.9                    0.046
GT              0.1                   0.111                0.7                    0.080
GC              0.0                   0.082                0.3                    0.082
GG             −0.1                   0.119                3.6                    0.064
toward the major groove (as depicted in Fig. 4.8) and negative if it closes toward the minor groove; therefore it does not matter which strand one uses as reference. This is different for tilt: when a base pair step shows a non-zero tilt, it opens up toward one strand and closes toward the other, see Fig. 4.8. The intrinsic tilt of a given base pair step therefore switches sign when going from one strand to the other, e.g., it must be true that $\bar{\theta}_{\mathrm{tilt}}(\mathrm{AC}) = -\bar{\theta}_{\mathrm{tilt}}(\mathrm{GT})$. In total this leads to 4 × 10 = 40 independent parameters in our model. As mentioned earlier, all the parameters have been determined from DNA-protein crystals (Olson et al., 1998). We show all parameters in Table 4.1 where we chose to present all 16 possible dinucleotides. As you can check, the values of the different quantities indeed show the expected symmetries discussed above. Note also that pyrimidine-purine steps (especially TA) are quite soft with respect to roll, as claimed in the previous subsection based on geometrical arguments.
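The strand symmetries can be verified directly on the numbers of Table 4.1. The following is a small sketch in Python; the dictionary layout and the helper names `other_strand` and `check_symmetries` are illustrative choices, not taken from the original text. It reads every step along the complementary strand and tests that the stiffnesses and the intrinsic roll are unchanged while the intrinsic tilt flips sign.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Table 4.1: step -> (intrinsic tilt [deg], K_tilt, intrinsic roll [deg], K_roll),
# stiffnesses in k_B T / deg^2
TABLE_4_1 = {
    "AA": (-1.4, 0.100, 0.7, 0.049), "AT": (0.0, 0.166, 1.1, 0.055),
    "AC": (-0.1, 0.111, 0.7, 0.080), "AG": (-1.7, 0.149, 4.5, 0.096),
    "TA": (0.0, 0.148, 3.3, 0.029), "TT": (1.4, 0.100, 0.7, 0.049),
    "TC": (1.5, 0.087, 1.9, 0.046), "TG": (-0.5, 0.082, 4.7, 0.048),
    "CA": (0.5, 0.082, 4.7, 0.048), "CT": (1.7, 0.149, 4.5, 0.096),
    "CC": (0.1, 0.119, 3.6, 0.064), "CG": (0.0, 0.068, 5.4, 0.050),
    "GA": (-1.5, 0.087, 1.9, 0.046), "GT": (0.1, 0.111, 0.7, 0.080),
    "GC": (0.0, 0.082, 0.3, 0.082), "GG": (-0.1, 0.119, 3.6, 0.064),
}


def other_strand(step):
    """Return the same base pair step as read along the opposite strand."""
    return COMPLEMENT[step[1]] + COMPLEMENT[step[0]]


def check_symmetries(table):
    """True if tilt flips sign and the other three quantities are unchanged."""
    for step, (tilt, k_tilt, roll, k_roll) in table.items():
        if table[other_strand(step)] != (-tilt, k_tilt, roll, k_roll):
            return False
    return True


# One representative per pair of equivalent steps gives the independent steps.
independent = {min(step, other_strand(step)) for step in TABLE_4_1}

print("symmetries hold:", check_symmetries(TABLE_4_1))
print("independent steps:", sorted(independent))
```

The four steps AT, TA, CG and GC map onto themselves, which is consistent with their intrinsic tilt being zero in the table.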
[Figure 4.16. Panel (a): dinucleotide fraction p (axis from 0.05 to 0.30) versus position x (from −70 to 70), with one curve for GC and one for AA + TT + TA. Panel (b): schematic of the locations of the GC and of the AA, TT and TA steps on the nucleosome.]
Figure 4.16 The main sequence preferences of the nucleosome (Segal et al., 2006). (a) Dinucleotide fraction of GC and of AA, TT and TA combined extracted from 177 different nucleosome base pair sequences on the chicken genome. The nucleosomes were mapped from chromatin taken from red blood cells (Satchwell et al., 1986). (b) Schematic sketch showing the locations of these steps on the nucleosome. GC is found at locations where the major groove points inward (positive roll positions) and AA, TT and TA where the minor groove faces inward (negative roll positions).
Having now the model in place, we want to check whether it is capable of predicting the nucleosomal sequence preferences. Specifically, we aim to recover the rules stated by Segal et al. (2006), see Fig. 4.16: As can be seen from the experimental plot in (a), stable nucleosomes feature GC steps at positions of maximum positive roll and AA, TT and TA at positions of maximum negative roll, as summarized in the schematic picture of the nucleosome in (b). Let us now check whether these experimental findings are consistent with our model, trying first to get away with simple straightforward reasoning (in a similar style as in the previous subsection). We first check whether the preferred positions of GC make sense. According to Fig. 4.16, GC (i.e., nucleotide G followed by nucleotide C) peaks at positions of positive roll. Assuming that this preference reflects the intrinsic roll geometry and stiffness of GC steps, it seems logical to conclude that GC steps are among the steps with a high intrinsic roll and a low roll stiffness. Let us check whether this is true by inspecting the corresponding values in Table 4.1. To our surprise, the opposite is the case. The GC step has the lowest $\bar{\theta}_{\mathrm{roll}}$ value of all base pair steps and also the second largest roll stiffness (after AG/CT). So let us then go to TA which is typically
found at positions of negative roll in the mapped (and thus rather stable) nucleosomes, see Fig. 4.16; we expect TA steps to be soft and to feature a small intrinsic roll. According to Table 4.1 the TA step is in fact the softest of all steps with respect to roll, but its intrinsic roll has a rather large value, 3.3°. This seems to suggest that TA steps would typically be found at positive roll positions, in disagreement with the experimental findings. One possible solution might be that the nucleosomal DNA is non-uniformly bent with relatively sharp bends at positions with negative roll and that this attracts the soft TA steps, cf. Fig. 4.16(b). In fact, this is what we suggested in the previous subsection. Nevertheless we have not yet done a proper calculation of our model and at this point we do not yet know whether our simple argument is really correct or whether we have forgotten to consider something crucial.

We therefore now turn to an exact calculation of the probabilities of finding the different base pair steps along the nucleosome on the basis of a proper statistical mechanics treatment. To do this, we consider all possible sequences that can be wrapped into a nucleosome, $4^{147}$ in total, an extremely large number on the order of $10^{88}$. For each given sequence $\{N_1, N_2, \ldots, N_{147}\}$ it is straightforward to calculate the elastic energy using Eqs. 4.8 to 4.10. We further assume that our system is in equilibrium, i.e., that the probability of finding the nucleosome on any one of these sequences follows the Boltzmann distribution. Now we would like to create plots from our theoretical model that can be compared with experimental plots like the one shown in Fig. 4.16(a). For this we need to calculate the probability of finding a given base pair step at each position on the nucleosome. This is rather straightforward to write down in a formula.
For instance, the chance to encounter base pair step GC at step $s$ on the nucleosome is given by

$$P_s(\mathrm{GC}) = \frac{1}{Z} \sum_{\substack{N_1, \ldots, N_{147} \\ N_s = \mathrm{G},\ N_{s+1} = \mathrm{C}}} \exp\left(-\beta \sum_{n=1}^{146} E^{n}(N_n N_{n+1})\right) \tag{4.11}$$

where the $E^{n}$'s are given by Eq. 4.8. In this expression we sum over all sequences that have a GC at the $s$-th position (i.e., $N_s = \mathrm{G}$ and $N_{s+1} = \mathrm{C}$). The normalization factor $Z$ (the partition function) is
given by

$$Z = \sum_{N_1, \ldots, N_{147}} \exp\left(-\beta \sum_{n=1}^{146} E^{n}(N_n N_{n+1})\right). \tag{4.12}$$
After having written down concise formulas, we are left with the question of how to actually evaluate them. Note, for instance, that the summation in Eq. 4.12 is over $4^{147} \approx 10^{88}$ terms. At first, this seems to be an impossible task. However, it is in fact a straightforward calculation using the method of transfer matrices. We first interpret the four different nucleotides as the basis $B = \{\mathrm{A}, \mathrm{T}, \mathrm{C}, \mathrm{G}\}$ of a four-dimensional vector space. For each position $n$ on the nucleosome we introduce a 4 × 4 matrix $T^{n}$. Its components contain the Boltzmann weights. For instance, the component $T^{n}_{4,3}$ (written as $T^{n}_{\mathrm{GC}}$) is of the form

$$T^{n}_{\mathrm{GC}} = \exp\left[-\beta E^{n}(\mathrm{GC})\right]. \tag{4.13}$$
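This allows us to rewrite the probability $P_s(\mathrm{GC})$, Eq. 4.11 together with Eq. 4.12, as

$$P_s(\mathrm{GC}) = \frac{\sum_{N_1, N_{147}} \left(T^{1} \cdots T^{s-1}\right)_{N_1 \mathrm{G}}\, T^{s}_{\mathrm{GC}}\, \left(T^{s+1} \cdots T^{146}\right)_{\mathrm{C} N_{147}}}{\sum_{N_1, N_{147}} \left(T^{1} T^{2} \cdots T^{146}\right)_{N_1 N_{147}}}. \tag{4.14}$$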
Some reflection is required to understand this formula. Let us take a closer look at the denominator, which corresponds to the partition function $Z$, Eq. 4.12. Compared to that previous expression, we only have to perform very few summations explicitly, namely over $N_1$ and over $N_{147}$, 16 terms in total. So what happened to the summations over the $4^{145}$ combinations of the remaining nucleotides? They are now carried out in a very compact scheme, namely through the multiplication of all 146 transfer matrices, i.e., by calculating $T^{1} T^{2} \cdots T^{146}$. This results in a matrix where each component $N_1 N_{147}$ accounts for all the sequences that connect the first nucleotide, $N_1$, to the last nucleotide, $N_{147}$. More specifically, it contains the sum of the Boltzmann factors of all these sequences (from $N_1$ to $N_{147}$), i.e., of the exponentials of their negative energies measured in units of $k_B T$. If you then take the sum of all the components of this matrix, you get exactly $Z$, Eq. 4.12. Along the same lines, the numerator of Eq. 4.14 can be understood as accounting for all sequences with the extra condition that they contain a GC step at position $s$.
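The transfer matrix scheme of Eqs. 4.13 and 4.14 can be sketched in a few lines of Python with NumPy. This is an illustrative implementation under stated assumptions, not the author's code: the function names `step_energies` and `step_probabilities` are invented here, energies are taken in units of $k_B T$ (so β = 1, since the stiffnesses of Table 4.1 are given in $k_B T$/deg²), and the geometry follows Eq. 4.6 with amplitude 4.46° and phase 147π/10. The partial matrix products are rescaled at every step, which leaves the probabilities unchanged because each position is normalized separately.

```python
import numpy as np

BASES = "ATCG"  # basis order B = {A, T, C, G} as in the text
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Ten independent steps of Table 4.1:
# step -> (intrinsic tilt [deg], K_tilt, intrinsic roll [deg], K_roll)
TABLE = {
    "AA": (-1.4, 0.100, 0.7, 0.049), "AT": (0.0, 0.166, 1.1, 0.055),
    "AC": (-0.1, 0.111, 0.7, 0.080), "AG": (-1.7, 0.149, 4.5, 0.096),
    "TA": (0.0, 0.148, 3.3, 0.029), "TC": (1.5, 0.087, 1.9, 0.046),
    "TG": (-0.5, 0.082, 4.7, 0.048), "CC": (0.1, 0.119, 3.6, 0.064),
    "CG": (0.0, 0.068, 5.4, 0.050), "GC": (0.0, 0.082, 0.3, 0.082),
}
# The remaining six steps follow from the strand symmetry (tilt flips sign).
for step, (t0, kt, r0, kr) in list(TABLE.items()):
    partner = COMPLEMENT[step[1]] + COMPLEMENT[step[0]]
    TABLE.setdefault(partner, (-t0, kt, r0, kr))


def step_energies(gamma=4.46, n_steps=146):
    """E[n-1, a, b]: energy in k_B T of dinucleotide ab at step n (Eqs. 4.8-4.10)."""
    phi = (n_steps + 1) * np.pi / 10.0  # equals 147*pi/10 for 146 steps
    arg = 2.0 * np.pi * np.arange(1, n_steps + 1) / 10.0 - phi
    tilt, roll = gamma * np.sin(arg), gamma * np.cos(arg)  # Eq. 4.6
    energies = np.empty((n_steps, 4, 4))
    for a, first in enumerate(BASES):
        for b, second in enumerate(BASES):
            t0, kt, r0, kr = TABLE[first + second]
            energies[:, a, b] = (0.5 * kt * (tilt - t0) ** 2
                                 + 0.5 * kr * (roll - r0) ** 2)
    return energies


def step_probabilities(gamma=4.46, n_steps=146):
    """P[s-1, a, b]: probability of dinucleotide ab at step s (Eq. 4.14)."""
    T = np.exp(-step_energies(gamma, n_steps))  # Eq. 4.13 with beta = 1
    left = np.ones((n_steps, 4))   # rescaled row vectors 1^T T^1 ... T^(s-1)
    right = np.ones((n_steps, 4))  # rescaled column vectors T^(s+1) ... T^N 1
    for s in range(1, n_steps):
        v = left[s - 1] @ T[s - 1]
        left[s] = v / v.sum()
    for s in range(n_steps - 2, -1, -1):
        v = T[s + 1] @ right[s + 1]
        right[s] = v / v.sum()
    weights = left[:, :, None] * T * right[:, None, :]
    return weights / weights.sum(axis=(1, 2), keepdims=True)


P = step_probabilities()
p_gc = P[:, BASES.index("G"), BASES.index("C")]  # P_s(GC) for s = 1..146
p_ta = P[:, BASES.index("T"), BASES.index("A")]  # P_s(TA) for s = 1..146
print(p_gc.shape, p_ta.shape)
```

Plotting `p_gc` and the sum of the AA, TT and TA probabilities against the step index gives curves that can be compared with the figure discussed next. For short chains the result can be validated against a brute-force sum over all sequences, which is a useful sanity check on the index bookkeeping.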
[Figure 4.17. Main plot: probabilities Ps versus step position s (from −70 to 70) for GC and for AA + TT + TA combined. Inset: Ps of AA, TT and TA shown separately for s between 10 and 20.]
Figure 4.17 Dinucleotide fractions at room temperature calculated via Eq. 4.14. Note that the sequence preferences predicted by this model agree, at least qualitatively, with the experimentally measured sequence preferences of the nucleosome, see Fig. 4.16.
We have now transformed the seemingly impossible task of summing over $10^{88}$ terms into a multiplication of 146 matrices of size 4 × 4, which is a trivial task to perform on a computer. Let us look at some results. Figure 4.17 shows the probabilities of the key dinucleotides. First inspect the curve for $P_s(\mathrm{GC})$. In the experimental plot, Fig. 4.16, we saw that the probabilities peak at positions of positive roll. This seemed to clash with the intrinsic properties of GC steps as extracted from Table 4.1. However, to our surprise the full calculation reveals that also in our model GC steps peak at positions of positive roll. Also the combined probabilities of AA, TT and TA peak at negative roll positions, in agreement with the experimental observations. The inset in Fig. 4.17 shows the probabilities of AA, TT and TA separately, revealing that TA indeed peaks at locations that go against its intrinsic shape.

What went wrong in our simple argument? When we tried to predict the preferred locations of GC and TA steps on the nucleosome, we assumed that these are isolated mechanical units whose positional preferences simply reflect their intrinsic preferences. However, a TA step is not an isolated unit but part of a longer DNA molecule. When we place a TA step at a certain location inside the nucleosome, we restrict the possible neighboring base
pair steps. For instance, if we have a TA step at position $s$, we know that the previous step, at position $s - 1$, must end with a T. So only four out of 16 steps are possible. Similarly, we know that the step following TA, at position $s + 1$, needs to start with an A. Suppose that $s$ is a position with maximal negative roll. At this position, according to Eq. 4.6, tilt changes sign from positive to negative values. TT happens to have positive intrinsic tilt and AA negative intrinsic tilt. In addition, even though TA has positive intrinsic roll, it is rather soft. So you can imagine that the larger unit composed of the four nucleotides TTAA fits nicely at a negative roll position. In fact, if you look closely at the inset of Fig. 4.17, you can see that $P_s(\mathrm{TT})$ peaks just before and $P_s(\mathrm{AA})$ just after the peak of $P_s(\mathrm{TA})$.

It seems extremely hard to guess the sequence preferences of the nucleosome by just staring at Table 4.1 as one needs to go beyond thinking of sequence preferences of individual base pair steps. This leads to the question whether it is still possible to understand the sequence preferences in relatively simple terms. For instance, is the explanation given above concerning the TTAA motif a valid one or does one need to perform a calculation accounting for all $4^{147}$ different sequences? A first hint can be found by inspecting the probability curves in Fig. 4.17 close to the boundaries. As there are no base pair steps beyond these locations, we can estimate how far sequence effects propagate along the DNA. It is immediately obvious that the probabilities of the outermost base pair steps are indeed strongly affected but also that there is hardly any effect for base pair steps further inside. This suggests that only neighboring base pair steps influence each other's probabilities.
Indeed, a more detailed analysis (Zuiddam et al., 2017) shows that to a good approximation the probability of a given base pair step at a given location can be understood by only accounting for its own energy, the average energy of the four possible preceding base pair steps and the average energy of the four possible succeeding base pair steps. This is still simple enough so that one can understand the sequence rules by hand (Zuiddam et al., 2017). As this is rather cumbersome, we will not delve deeper into this subject. Instead we ask the question: What is more important for determining the nucleosomal sequence preferences: shape or stiffness? We now demonstrate that it is shape that is crucial whereas stiffness
[Figure 4.18. Probabilities Ps versus step position s (from −10 to 10) for GC and for AA + TT + TA; the curve labels read "new data set" and "homogeneous stiffness".]
Figure 4.18 Dinucleotide fractions calculated via Eq. 4.14. The colored curves represent the probabilities using the full set of parameters from Table 4.1 and are thus identical to Fig. 4.17. The dashed curves represent a simplified treatment where all stiffnesses are replaced by average stiffnesses. Finally, the dotted curves were calculated using a newer data set for the shapes (Balasubramanian et al., 2009) but also assuming homogeneous stiffnesses.
plays a minor role. We redo the calculations but replace all the individual $K_{\mathrm{tilt}}$'s and $K_{\mathrm{roll}}$'s by their respective averages over all 16 base pair steps. The resulting probabilities are shown in Fig. 4.18. The original probabilities are depicted by colored continuous curves and the simplified model with homogeneous stiffnesses by dashed black curves. As you can see, the probabilities hardly change. This clearly shows that it is the shape of the DNA double helix that is most important for determining the nucleosomal sequence preferences.

But how reliable is this result? The parameters used in our model were extracted a long time ago, in 1998, from the fluctuations and correlations of the rigid base pair parameters of 76 different protein-DNA crystals (Olson et al., 1998). We now use a more recent data set of 135 protein-DNA crystal complexes of higher quality covering a larger range of DNA deformations (Balasubramanian et al., 2009). Nucleosome crystals are explicitly excluded from this data set so as not to bias the results. We only consider here the intrinsic shapes, which are presented in Table 4.2. For the two types of stiffnesses we use again the same average values as before. Comparing Tables 4.1 and 4.2 one can find some substantial changes
Table 4.2 Intrinsic geometries with respect to tilt and roll of the 10 independent base pair steps, extracted from 135 DNA-protein crystals (Balasubramanian et al., 2009)

Dinucleotide            AA     AT     AC     AG     TA     CA     CG     GA     GC     GG
intrinsic tilt [deg]   −1.1    0.0    0.6   −1.1    0.0    0.1    0.0   −1.5    0.0    0.7
intrinsic roll [deg]    0.7    1.0    1.6    4.1    2.5    5.1    5.5    1.9    1.2    5.0
between the two sets of parameters. Nevertheless, the dotted black curves in Fig. 4.18 demonstrate that the probabilities are hardly affected by these changes, suggesting that this system is rather robust against variations in the set of parameters.

The general recipe used here, namely to model nucleosomal DNA by forcing the rigid base pair model into a superhelical configuration, is also a good starting point for computer simulations of nucleosomes. Such models (see, e.g., Morozov et al., 2009; Eslami-Mossallam et al., 2016) are more realistic than the model described above because they typically allow local relaxation of the DNA rather than fixing the positions and orientations of all its base pairs. Such models can be used to demonstrate explicitly that the sequence preferences of nucleosomes can be employed even on top of genes to position nucleosomes at specific locations along the DNA. This can be achieved by putting the model nucleosome at the desired position on a given gene and then trying out various synonymous mutations (i.e., replacing codons by other codons that code for the same amino acids) (Eslami-Mossallam et al., 2016). Doing this in a systematic way (Zuiddam and Schiessel, 2019) one can show for whole genomes that nearly everywhere (99.9943% of all positions in baker's yeast) one can find synonymous sequences that position nucleosomes with single bp precision, at least within such a model.

Such studies demonstrate the theoretical possibility of a "mechanical" evolution of genomes to put some of the nucleosomes at specific locations. But did nature make use of this possibility? Although the idea goes back a long way (Trifonov and Sussman, 1980; Satchwell et al., 1986), it is still an open question to what extent nucleosomes are positioned by mechanical cues. The answer also seems to depend on the specific organism one looks at. A well-known example concerns a form of "anti-positioning": Nucleosomes in yeast tend to avoid the regions before transcription start sites (Segal et al., 2006). That this effect is caused by DNA mechanics and not by other mechanisms can be shown by reconstituting chromatin from its pure components, yeast DNA and histone proteins. When mapping nucleosomes of the reconstituted chromatin, one finds that also in this case nucleosomes are depleted at these regions (Kaplan et al., 2009). These nucleosome-free stretches make the DNA accessible, which might be biologically advantageous as this facilitates transcription initiation.

Remarkably, however, this effect is only seen in unicellular life forms. Genomes of multi-cellular organisms show the opposite effect, namely that their DNA molecules intrinsically attract nucleosomes to regions around transcription start sites (Tompitak et al., 2017). Whereas this can be observed for reconstituted chromatin, it is puzzling that in vivo the mechanical cues are overruled by other processes, possibly transcription itself, such that nucleosomes tend to be depleted around transcription start sites. The mechanical cues around transcription start sites of multi-cellular life forms must therefore have a different function, if any. A possible explanation might be found by looking at cells that are transcriptionally inactive. Are there such cells in multi-cellular organisms? In fact, each animal, no matter how big it is (think of an elephant), needs to make itself very small when passing through the germ line into the next generation. Especially in sperm cells, elephants shrink substantially (even smaller than the sperm cells of fruit flies or mice). Small sperm cells are good swimmers and can be produced in larger numbers, a fact especially important for species with competing males. That might be the reason why in sperm cells DNA is tightly packed with the help of protamines, replacing the nucleosomes.
But not completely: about 4% of the nucleosomes are retained in human sperm cells (Hammoud et al., 2009). How does a sperm cell know which nucleosomes to take along? It turns out, it might be the mechanical cues along the DNA molecules that determine which nucleosomes are retained: Regions where a mechanical model (based on the rigid basepair model) predicts the most stable nucleosomes correspond to regions where sperm cells retain nucleosomes (Tompitak et al., 2017).
What is the evolutionary driving force for retaining a fraction of nucleosomes in sperm cells instead of getting rid of them all? We can only speculate. One possible reason is to allow for the transmission of epigenetic information via the father to the offspring (and not only via the mother where the nucleosomes are all kept in the egg cell). Epigenetic information is information in addition to and shorter-lived than genetic information, scribbled along the margins of the book of life. We have already discussed an example in Section 2.5, the H3K9me modification on the N-tail of histone H3 which marks heterochromatin. And it is the genes that are important for the early embryonic development that are singled out for receiving this extra information; these carry the mechanical cues around their transcription start sites and thus retain the nucleosomes with their modifications. A concrete (though controversial) experiment trained male mice using mild foot shocks to fear cherry blossom smell. Their offspring had an aversion to this specific odor (Dias and Ressler, 2014).
4.3 DNA as a Wormlike Chain

We now leave the microscopic length scales and take a look at DNA on much larger scales (100 bp and larger). Then the effects of the underlying base pair sequence average out and the elastic properties of a DNA chain are astonishingly simple. However, it is not straightforward and still the subject of active research to derive the simple model we describe below from the microscopic structure of DNA. A derivation that starts from the rigid base pair model of the previous section (see, e.g., Fig. 4.13) is presented in Ref. (Becker and Everaers, 2007) but the calculation is too involved to discuss it here any further. Also note that the rigid base pair model itself oversimplifies some of the DNA's features as it neglects important additional degrees of freedom (e.g., the propeller twist, Fig. 4.10) (Lankaš et al., 2009).

The wormlike chain (WLC) model is a continuum description of the DNA double helix. In this model the DNA chain is assumed to behave like an inextensible rubber tube. "Inextensible" here means that the chain has a fixed length. The state of lowest energy is
Figure 4.19 The conformation of a WLC is fully characterized by the local radius of curvature $R(s)$ and its twist variable $\tau(s)$.
the straight conformation. At an energetic cost the chain can be bent away from this state. Since the DNA double helix features two backbones, it cannot swivel freely around its bonds, as is normally the case with synthetic polymers. Instead the double helix has a preferred twist rate and it costs energy to twist the DNA away from it.

Figure 4.19 displays a DNA chain of contour length $L$. It is bent out of its equilibrium configuration, attaining a local radius of curvature $R(s)$, where $s$ denotes the arc length of the chain, $0 \le s \le L$. In addition, the DNA is twisted away from its natural twist (around 10 bp per helical repeat) by an extra twist $\tau(s)$. Within the WLC model it is assumed that the elastic energy of the chain is quadratic in deformations away from the straight, naturally twisted configuration. These deformations are the local curvature $1/R(s)$ and the extra twist rate $d\tau(s)/ds$. The local energy density per length for bending thus has the form $(A/2)\,(1/R(s))^2$ where $A$ is called the bending modulus, a quantity with the units energy times length. Likewise the twisting energy per length is given by $(C/2)\,(d\tau(s)/ds)^2$ where $C$ denotes the twisting modulus, also a quantity with units energy times length. The total elastic energy then follows by integration along the whole contour of the chain:

$$H = \frac{1}{2} \int_0^L \left[ A \left(\frac{1}{R(s)}\right)^2 + C \left(\frac{d\tau(s)}{ds}\right)^2 \right] ds. \tag{4.15}$$

Strictly speaking, the name WLC model refers to a system that is described by the bending term only, i.e., to a chain in the absence of a twisting constraint or a chain with no twist rigidity, $C = 0$. The WLC with twist rigidity might be referred to as rodlike chain or twistable WLC but in the following we shall not make this distinction.
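To make Eq. 4.15 concrete, here is a minimal numerical sketch in Python that integrates the energy density along a discretized contour. The function name `wlc_energy` is hypothetical, and the moduli used in the example (A = 50 nm and C = 100 nm, both in units of $k_B T$) are illustrative assumptions of a typical order of magnitude for DNA, not values given in this section.

```python
import numpy as np


def wlc_energy(s, curvature, twist_rate, A, C):
    """Elastic energy of Eq. 4.15, integrated with the trapezoidal rule.

    s          : arc length grid (same length unit as A and C)
    curvature  : local curvature 1/R(s) on that grid
    twist_rate : extra twist rate d(tau)/ds on that grid (radians per length)
    A, C       : bending and twisting moduli (energy times length)
    """
    density = 0.5 * (A * curvature ** 2 + C * twist_rate ** 2)
    return float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(s)))


# Example: an untwisted closed circle of radius R, contour length 2*pi*R.
A = 50.0   # assumed bending modulus in k_B T nm
C = 100.0  # assumed twisting modulus in k_B T nm
R = 10.0   # circle radius in nm
s = np.linspace(0.0, 2.0 * np.pi * R, 1001)
energy = wlc_energy(s, np.full_like(s, 1.0 / R), np.zeros_like(s), A, C)

# For constant curvature the integral is (A/2) * L / R^2 = pi * A / R.
print(f"numerical: {energy:.3f} k_B T, analytical: {np.pi * A / R:.3f} k_B T")
```

For a circle the curvature is constant, so the numerical result can be compared directly with the closed form πA/R; non-uniform shapes only require passing different `curvature` and `twist_rate` arrays.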
Figure 4.20 The Euler angle representation of the WLC.
It proves useful to rewrite Eq. 4.15 in the framework of the Euler angle representation. Consider a triad moving along the chain that is made from three normalized vectors, the tangent $\mathbf{t}(s) = \mathbf{e}_3(s)$, as well as $\mathbf{e}_1(s)$ and $\mathbf{e}_2(s)$, see Fig. 4.20. We can choose this triad such that its orientation points along the three axes of the bp plates, as depicted in Fig. 4.8, assuming that we always have B-DNA with its bp plates perpendicular to the helix axis. The rolling and tilting as well as the twisting of the bp steps then translate into the bending and twisting of the WLC; shear motions (sliding and shifting) between bp plates are not considered in this model, but can normally be safely neglected for such a B-DNA stack. It is, however, even more convenient to choose a triad that does not depend on $s$ when the chain is in its straight, untwisted conformation. The two unit vectors $\mathbf{e}_1(s)$ and $\mathbf{e}_2(s)$ are thus not pointing in the direction of the principal axes of the bp but are nevertheless imagined to be permanently inscribed into the DNA material. The advantage of this choice is that we no longer have to explicitly account for the DNA's natural twist. The configuration of the chain then follows from the three Euler angles $\phi(s)$, $\theta(s)$ and $\psi(s)$ that describe three consecutive rotations
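that bring the original triad $\mathbf{e}_1(0)$, $\mathbf{e}_2(0)$ and $\mathbf{e}_3(0)$ at the starting point $s = 0$ into the orientation of the triad at position $s$. The rotations are shown in Fig. 4.20 on the right: a rotation around the $\mathbf{e}_3(0)$-axis by $\phi(s)$, followed by a rotation around the new $\mathbf{e}_1$-axis by $\theta(s)$, followed by the final rotation around the new $\mathbf{e}_3$-axis by $\psi(s)$. In mathematical terms, this can be achieved by applying the three rotation matrices $R_3(\phi)$, $R_1(\theta)$ and $R_3(\psi)$ to the original coordinate system $\mathbf{e}_1(0)$, $\mathbf{e}_2(0)$ and $\mathbf{e}_3(0)$, e.g.,

$$\mathbf{e}_3(s) = R_3(\psi)\, R_1(\theta)\, R_3(\phi)\, \mathbf{e}_3(0) \tag{4.16}$$

with

$$R_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix}, \qquad R_3(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.17}$$

Matrix multiplication leads to

$$R = R_3(\psi)\, R_1(\theta)\, R_3(\phi) = \begin{pmatrix} c_\psi c_\phi - s_\psi c_\theta s_\phi & -c_\psi s_\phi - s_\psi c_\theta c_\phi & s_\psi s_\theta \\ s_\psi c_\phi + c_\psi c_\theta s_\phi & -s_\psi s_\phi + c_\psi c_\theta c_\phi & -c_\psi s_\theta \\ s_\theta s_\phi & s_\theta c_\phi & c_\theta \end{pmatrix} \tag{4.18}$$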
where we (only here) wrote $c_\psi = \cos\psi$, $s_\psi = \sin\psi$, etc., for compactness of notation. Now we show that there exists a vector $\boldsymbol{\Omega}(s)$ such that

$$\dot{\mathbf{e}}_i(s) = \boldsymbol{\Omega}(s) \times \mathbf{e}_i(s) \tag{4.19}$$

for $i = 1, 2, 3$. The dot here denotes the derivative with respect to the arc length. Equation 4.19 describes the change in the triad orientation as one goes along the contour length of the chain. This change is given by a rotation around an axis parallel to the vector $\boldsymbol{\Omega}(s)$. You can read off from its components how the DNA is bent and twisted at position $s$. To derive $\boldsymbol{\Omega}(s)$ we start from Eq. 4.16 in the form $\mathbf{e}_i(s) = R(s)\, \mathbf{e}_i(0)$. Then take the derivative with respect to $s$, $\dot{\mathbf{e}}_i(s) = \dot{R}(s)\, \mathbf{e}_i(0)$. Rotation matrices are orthogonal, i.e., they obey $R^{T} R = I$ where $R^{T}$ denotes the transpose of matrix $R$ and $I$ is the identity matrix. Hence $\mathbf{e}_i(0) = R^{T}(s)\, \mathbf{e}_i(s)$ from which follows $\dot{\mathbf{e}}_i(s) = \dot{R}(s)\, R^{T}(s)\, \mathbf{e}_i(s)$. Upon inserting Eq. 4.18 into this relation
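a straightforward but long calculation shows that this is identical to Eq. 4.19 with

$$\begin{aligned} \Omega_1 &= \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi, \\ \Omega_2 &= -\dot{\phi} \sin\theta \cos\psi + \dot{\theta} \sin\psi, \\ \Omega_3 &= \dot{\phi} \cos\theta + \dot{\psi}. \end{aligned} \tag{4.20}$$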
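The long calculation behind Eq. 4.20 can be spot-checked numerically. The sketch below, in Python with NumPy, is an illustration added here and not part of the original derivation; the function names are hypothetical. It builds the rotation matrix of Eq. 4.18, approximates its derivative by central finite differences for arbitrarily chosen angles and angle rates, and reads the rotation vector off the antisymmetric matrix obtained by multiplying that derivative with the transposed rotation matrix.

```python
import numpy as np


def R1(a):
    """Rotation about the 1-axis by angle a (Eq. 4.17)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def R3(a):
    """Rotation about the 3-axis by angle a (Eq. 4.17)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rotation(phi, theta, psi):
    """Full rotation matrix R = R3(psi) R1(theta) R3(phi) of Eq. 4.18."""
    return R3(psi) @ R1(theta) @ R3(phi)


def omega_formula(angles, rates):
    """Rotation vector according to Eq. 4.20."""
    phi, theta, psi = angles
    dphi, dtheta, dpsi = rates
    return np.array([
        dphi * np.sin(theta) * np.sin(psi) + dtheta * np.cos(psi),
        -dphi * np.sin(theta) * np.cos(psi) + dtheta * np.sin(psi),
        dphi * np.cos(theta) + dpsi,
    ])


def omega_numeric(angles, rates, h=1e-6):
    """Rotation vector from the antisymmetric matrix (dR/ds) R^T."""
    angles = np.asarray(angles, dtype=float)
    rates = np.asarray(rates, dtype=float)
    dR = (rotation(*(angles + h * rates)) - rotation(*(angles - h * rates))) / (2 * h)
    W = dR @ rotation(*angles).T  # W v equals Omega x v for any vector v
    return np.array([W[2, 1], W[0, 2], W[1, 0]])


angles = (0.3, 1.1, -0.7)  # arbitrary test values for phi, theta, psi
rates = (0.5, -0.2, 0.9)   # arbitrary test values for their s-derivatives
print("Eq. 4.20:          ", omega_formula(angles, rates))
print("finite differences:", omega_numeric(angles, rates))
```

The two printed vectors should agree to within the finite-difference error. Squaring and adding the first two components of `omega_formula` also shows that the cross terms cancel, which is the step used when the curvature is expressed through the Euler angles.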
Now we are in the position to replace the terms $1/R(s)$ and $d\tau(s)/ds$ in Eq. 4.15 by expressions in terms of the Euler angles. The inverse radius of curvature $1/R(s)$ is just the same as $|\dot{\mathbf{t}}(s)|$ and hence $(1/R(s))^2 = |\dot{\mathbf{t}}(s)|^2 = \Omega_1^2 + \Omega_2^2$ where we used Eq. 4.19 with $i = 3$. The twist rate $d\tau(s)/ds$ is simply given by the component of $\boldsymbol{\Omega}(s)$ that points in the direction of the tangent, i.e., by $\Omega_3$. Hence $d\tau(s)/ds = \Omega_3$. (More formally: $d\tau(s)/ds$ is the rotation of $\mathbf{e}_1$ around the $\mathbf{e}_3$-axis, which is given by $(\mathbf{e}_1 \times \dot{\mathbf{e}}_1) \cdot \mathbf{e}_3 = \Omega_3$.) From this finally follows the Hamiltonian 4.15 expressed in terms of the Euler angles:

$$H = \int_0^L \left[ \frac{A}{2} \left( \dot{\phi}^2 \sin^2\theta + \dot{\theta}^2 \right) + \frac{C}{2} \left( \dot{\phi} \cos\theta + \dot{\psi} \right)^2 \right] ds. \tag{4.21}$$

In this book we encounter many situations where the DNA is under an external force that either tries to extend or compress the molecule. Suppose this force acts in the $Z$-direction and has a value $f$, with $f$ being positive for a force that stretches the molecule, as e.g. in Fig. 3.5. We thus have to add a term $-fz$ to Eq. 4.21 with $z$ being the end-to-end distance in $Z$-direction. This leads to:

$$H = \int_0^L \left[ \frac{A}{2} \left( \dot{\phi}^2 \sin^2\theta + \dot{\theta}^2 \right) + \frac{C}{2} \left( \dot{\phi} \cos\theta + \dot{\psi} \right)^2 - f \cos\theta \right] ds. \tag{4.22}$$

Equation 4.22 might look long and complicated but if you have studied classical mechanics, it is quite likely that you have encountered this integral before. It appears there in a completely different context, namely when one studies the dynamics of a symmetric spinning top with a fixed point in a gravitational field, see Fig. 4.21. More precisely, Eq. 4.22 is mathematically identical to the Lagrangian action of a spinning top:

$$S = \int_0^T L\left(\phi(\tau), \theta(\tau), \psi(\tau)\right) d\tau. \tag{4.23}$$
Figure 4.21 A spinning top in a gravitational field. The motion of the rigid body can be described by three Euler angles as indicated in the figure. The spinning top is attached at a fixed point, here assumed to be located at the tip of the top.
This is an integration over time $\tau$ from $\tau = 0$ to some arbitrary $\tau = T$. The integrand $L$ is called the Lagrange function and is given by the kinetic energy minus the potential energy of the spinning top. Readers not familiar with the Lagrangian action and Hamilton's principle might want to take a look at Appendix D. The analogy between elastic rods and spinning tops is usually referred to as the Kirchhoff kinetic analogy after Gustav Kirchhoff who noticed this first in 1859 (Kirchhoff, 1859). Comparing the energy expression of the WLC, Eq. 4.22, with the Lagrangian action of the spinning top, Eq. 4.23 (not shown here explicitly), one can learn how the analogy works in detail. First of all you have to identify the arc length $s$ of the WLC with the time $\tau$ of the spinning top. The other quantities to be identified turn out to be as follows: the tension $f$ corresponds to the gravitational force, specifically to $Mgl$ ($M$: mass of the spinning top, $l$: distance between the fixed point and the center of mass, $g$: gravitational acceleration), the bending modulus $A$ to the moment of inertia $I_\perp$ for a rotation around an axis perpendicular to the symmetry axis of the spinning top, and the twisting modulus $C$ to the moment of inertia $I_\parallel$ along its symmetry axis (a moment of inertia $I$ relates the kinetic energy $E_{\mathrm{rot}}$ of a spinning body to the angular velocity vector $\boldsymbol{\Omega}$ via $E_{\mathrm{rot}} = I \Omega^2/2$). Using the correspondence between the two systems, it is straightforward to write down the Lagrangian action of a spinning top. Its set of equations of motion then follows directly from the
Figure 4.22 Three examples of the Kirchhoff analogy (see text). Panels: sleeping top (θ = 0) and twisted rod; regular precession (θ = const) and helix; pendulum (φ = ψ = 0) and planar filament.
Euler–Lagrange equations, Eq. D.9. Using the explicit solutions of the spinning top, one can employ the kinetic analogy to find shapes of elastic rods under tension or compression and torque. However, there is a caveat. According to Hamilton’s principle the solutions are stationary points but not necessarily minima of the action as outlined in Appendix D. This means that the shapes that are found by this analogy do not always minimize the elastic energy; all we can say is that any small perturbation around such a shape changes the energy only in second order in this perturbation. It is the same as when one determines a (local) minimum of an ordinary function. One needs to go to second order derivatives to know whether a certain point is a true minimum. Then this is the recipe to construct the shape of a WLC: Look at the direction of the figure axis of the spinning top in time and construct the corresponding WLC shape by letting its tangent (that “moves” at constant speed) point in the same direction. Similarly the twist rate of the WLC follows directly from the angular velocity of the spinning top. Three simple examples are given in Fig. 4.22. To the left is a so-called sleeping top where θ ≡ 0. This translates
Figure 4.23 Example of an Euler elastica under compression.
into a straight but twisted rod. In the middle, a spinning top is shown that performs a regular precession, familiar to anyone who has played with spinning tops as a child. This case translates into a rod of helical shape. Finally, on the right is a spinning top that does not spin. Since, as mentioned above, the spinning top has a fixed point, the top swings around this point. This is nothing but the familiar pendulum that performs an oscillatory motion in a given plane. It corresponds to a planar, untwisted filament, a member of the so-called Euler elasticas. A close-up of such an elastica is given in Fig. 4.23.

In the rest of this book we restrict ourselves to the case of untwisted DNA, $\phi = \psi \equiv 0$, i.e., to the Euler elasticas. The Hamiltonian 4.22 then reduces to

$$H = \int_0^L \left[ \frac{A}{2} \dot{\theta}^2 - f \cos\theta \right] ds \tag{4.24}$$

with $\theta(s)$ denoting the angle between the tangent at $s$ and the force direction, see Fig. 4.23. In this case it is easy to see that this Hamiltonian corresponds to the Lagrangian action of a pendulum

$$S = \int_0^T \left[ \frac{Ml^2}{2} \dot{\theta}^2 - Mgl \cos\theta \right] d\tau \tag{4.25}$$

since $Ml^2\dot{\theta}^2/2$ is obviously its kinetic energy and $Mgl\cos\theta$ its potential energy. This special case of the Kirchhoff analogy is depicted in Fig. 4.24. Leonhard Euler (1707–83), in the appendix of his landmark book on variational techniques, characterized these planar solutions of
Figure 4.24 Special case of the Kirchhoff kinetic analogy between the pendulum and the Euler elastica.
untwisted rods as early as 1744 and presented example configurations in two tables, reproduced here in Fig. 4.25. The shapes labelled “Fig. 6,” “Fig. 7,” “Fig. 8” and “Fig. 9” are Euler elasticas for which, in the language of the Kirchhoff analogy, there are corresponding oscillating pendulum motions with different amplitudes (smallest for “Fig. 6” and largest for “Fig. 9”). Interestingly, the overall direction of the shapes changes from going downwards, “Fig. 6” and “Fig. 7” (as also in Fig. 4.24), to upwards, “Fig. 9.” “Fig. 8” is the 8-shaped structure right at the boundary between these two cases; it is achieved when one starts a pendulum with an angle of close to 49°. The angle at the apex of each leaf is thus close to 81°. For a pendulum with a mass attached to an arm of fixed length there is also another set of possible motions, namely revolving orbits. This is the case when some kinetic energy remains as the pendulum mass reaches the top position, θ = 0. The pendulum is then revolving in one direction, clockwise or counterclockwise. Euler showed an Euler elastica that corresponds to such a revolving orbit in “Fig. 11” of his table. The actual calculation of the possible motions of a pendulum and of the shapes of the Euler elasticas is not straightforward since it involves special functions, namely elliptic integrals and Jacobi’s elliptic functions. We shall not discuss them here in the main text but refer the interested reader to Appendix D where the exact solutions are presented. Based on these solutions, Fig. 4.26 depicts example shapes of the Euler elasticas; on each curve a small pendulum indicates its corresponding motion. The
Figure 4.25 Leonhard Euler’s original drawings of elasticas.
phase portrait of the pendulum is shown in Fig. 4.27 for exactly the same set of parameters. It is remarkable how well Euler’s drawings, Fig. 4.25, hold up to modern computer plots, Fig. 4.26. Here in the main text we only discuss one special case of the Euler elasticas that corresponds in the analogy to a pendulum whose motion lies exactly at the boundary between the oscillating and the revolving solutions, the so-called homoclinic orbit. In that case the pendulum stands upright for an infinite time, then makes one swing and finally approaches again the upright position for an infinite time.
This corresponds to “Fig. 10” in Euler’s original drawings (Fig. 4.25) and to the case m, m → 1 in Fig. 4.26. This shape is of particular interest for us for the simple reason that it is the only shape that is asymptotically straight for s → ±∞ (besides, of course, the trivial solution of the straight rod). If one pulls on a long DNA molecule, then one should expect such an asymptotic behavior; this case is
Figure 4.26 Gallery of Euler elasticas based on Eqs. D.34 and D.35 as well as Eqs. D.39 and D.40. (The curves are labeled by their parameter values, ranging from 0.2 to 0.99 and including 0.8261, together with the limiting case → 1.)

Figure 4.27 Phase portrait, α̇ vs. α, of the pendulum, Eq. D.21 with α = π − θ. The parameter values are the same as in Fig. 4.26.

Figure 4.28 The homoclinic loop stabilized by a sliding ring.
therefore of experimental relevance, as we see in detail later in Section 8.3. In its simplest manifestation we can think of performing a stretching experiment with a piece of DNA that features a loop like the one depicted in Fig. 4.28. Such a loop is usually not stable in 3 dimensions. You can test this yourself by taking a rubber tube, creating a loop on it and pulling at its ends. Let us assume that the loop is somehow stabilized, e.g., by a small ring through which the two pieces of DNA can slide freely, see Fig. 4.28. Such rings actually exist in nature in the form of ringlike proteins like cohesin that can hold two DNA strands together (cohesin can do much more; in Chapter 8 you learn about its spectacular role in chromosome organization). Our aim is to calculate the force-extension relation for this setup. The energy of the looped DNA chain of length $L$ is

$$H = \int_{-L/2}^{L/2} \left[ \frac{A}{2} \dot{\theta}^2 - f \cos\theta \right] ds. \tag{4.26}$$

This is just Eq. 4.24 but with the contour length now running from $-L/2$ to $L/2$. This choice is convenient here since the loop is then centered around $s = 0$. We need to find the looped DNA conformation that minimizes the energy, Eq. 4.26. To find a stationary point of $H$ (a minimum, maximum or saddle point) we have to write down the corresponding Euler–Lagrange equation (if you are not familiar with this, see Appendix D):

$$\ddot{\theta} = \lambda^{-2} \sin\theta. \tag{4.27}$$

We introduce here a characteristic length scale for a WLC under tension:

$$\lambda = \sqrt{A/f}. \tag{4.28}$$

We call $\lambda$ the correlation length, for reasons that become clear later in this section. To solve Eq. 4.27, first multiply both sides by $2\dot{\theta}$, integrate over $s$ and then take the square root. This leads to

$$\dot{\theta} = \lambda^{-1} \sqrt{2\left(C - \cos\theta\right)} \tag{4.29}$$
with an integration constant $C$ (again, check out Appendix D for more details; $C$ should not be confused with the twisting modulus). Using $\dot{\theta} = d\theta/ds$ we separate variables, $ds = \lambda \, d\theta / \sqrt{2(C - \cos\theta)}$. Integration over $s$ (from $s_0$ to $s$) leads to

$$s - s_0 = \lambda \int_{\theta(s_0)}^{\theta(s)} \frac{d\theta}{\sqrt{2\left(C - \cos\theta\right)}}. \tag{4.30}$$
This is a so-called elliptic integral that unfortunately has no general solution in terms of elementary functions; check out Appendix D for details if you are interested. Fortunately, the special case of a loop we are discussing here leads to ordinary functions. We now write $\theta_{\mathrm{loop}}(s)$ for this special solution. For simplicity we assume an infinitely long DNA chain, i.e., we let $L$ go to infinity. As it turns out, this is an excellent approximation for a chain of finite length $L$ as long as $L$ is much larger than $\lambda$. If you go from one end to the other along the curve depicted in Fig. 4.28, the tangent vector makes a 360° rotation. That leads to the boundary conditions $\theta_{\mathrm{loop}}(-\infty) = 0$ and $\theta_{\mathrm{loop}}(+\infty) = 2\pi$. Since the arms are asymptotically straight, i.e., the DNA curvature vanishes asymptotically, we require $\lim_{s \to \pm\infty} \dot{\theta}_{\mathrm{loop}}(s) = 0$. From Eq. 4.29 it follows that we have to set $C = 1$. Hence

$$\frac{s}{\lambda} = \int_{\pi}^{\theta_{\mathrm{loop}}(s)} \frac{d\theta}{\sqrt{2\left(1 - \cos\theta\right)}} = \ln \tan \frac{\theta_{\mathrm{loop}}(s)}{4} \tag{4.31}$$
where we set $s_0 = 0$ with $\theta(s_0) = \pi$. It is straightforward to invert Eq. 4.31:

$$\theta_{\mathrm{loop}}(s) = 4 \arctan\left(e^{s/\lambda}\right). \tag{4.32}$$

We can also rewrite Eq. 4.32 as follows:

$$\cos\theta_{\mathrm{loop}}(s) = 1 - \frac{2}{\cosh^2(s/\lambda)}. \tag{4.33}$$
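As a quick numerical sanity check (a sketch only, not part of the original derivation), one can test the closed form of Eq. 4.32 against the differential equation, Eq. 4.27, and against Eq. 4.33. The function names and the finite-difference step `h` are illustrative choices.

```python
import math

def theta_loop(s, lam=1.0):
    """Homoclinic loop angle, Eq. 4.32."""
    return 4.0 * math.atan(math.exp(s / lam))

def ode_residual(s, lam=1.0, h=1e-3):
    """Residual of Eq. 4.27: theta'' - sin(theta) / lam**2.

    theta'' is approximated by a central finite difference with step h
    (an illustrative choice), so the residual is small but not exactly zero.
    """
    second = (theta_loop(s + h, lam) - 2.0 * theta_loop(s, lam)
              + theta_loop(s - h, lam)) / h**2
    return second - math.sin(theta_loop(s, lam)) / lam**2

def cos_theta_closed_form(s, lam=1.0):
    """Right-hand side of Eq. 4.33."""
    return 1.0 - 2.0 / math.cosh(s / lam) ** 2
```

Evaluating `ode_residual` at a few arc lengths should give values close to zero, and `math.cos(theta_loop(s, lam))` should agree with `cos_theta_closed_form(s, lam)` to machine precision.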
To convert Eq. 4.32 to Eq. 4.33, start from $e^{s/\lambda} = \tan\left(\theta_{\mathrm{loop}}/4\right)$. Then insert this into $\cosh x = \left(e^{x} + e^{-x}\right)/2$. Using the addition formulas for cosine and sine, it follows that $\cosh(s/\lambda) = \sqrt{2/\left(1 - \cos\theta_{\mathrm{loop}}\right)}$. Plots of Eqs. 4.32 and 4.33 are provided in Fig. 4.29. Note that $\theta_{\mathrm{loop}}$ approaches the straight configuration exponentially with the decay
Figure 4.29 The homoclinic loop solution. (Two panels: $\theta_{\mathrm{loop}}$ and $\cos\theta_{\mathrm{loop}}$ as functions of $s/\lambda$.)
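length $\lambda$, i.e., $\theta_{\mathrm{loop}}(s) \sim e^{-|s|/\lambda}$ for $|s| \gg \lambda$ as you can check using Eq. 4.33 (for $s \to +\infty$ this statement refers to the deviation $2\pi - \theta_{\mathrm{loop}}$). To produce the parametric plot in the XZ-plane, Fig. 4.28, explicit formulas of the x- and z-position of the looped DNA are needed; they follow by integration:

$$z(s) = \int_0^s \cos\theta_{\mathrm{loop}}(s') \, ds' = s - 2\lambda \tanh(s/\lambda) \tag{4.34}$$

and

$$x(s) = \int_0^s \sin\theta_{\mathrm{loop}}(s') \, ds' = -2\lambda \left( 1 - \frac{1}{\cosh(s/\lambda)} \right). \tag{4.35}$$

With $\theta_{\mathrm{loop}}$ as given in Eq. 4.32 the integral in Eq. 4.35 is negative; the overall sign of $x$ only determines on which side of the Z-axis the loop lies.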
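The loop coordinates can also be obtained by integrating the tangent direction numerically, which gives an independent check of the closed forms for $z(s)$ and $x(s)$. The following is a minimal sketch; the Simpson integrator and all function names are illustrative, and the sign of $x$ follows from using Eq. 4.32 as written (mirroring $x \to -x$ gives the same loop shape).

```python
import math

def theta_loop(s, lam):
    """Homoclinic loop angle, Eq. 4.32."""
    return 4.0 * math.atan(math.exp(s / lam))

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; also works for b < a."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def loop_position_numeric(s, lam):
    """(x, z) from direct integration of sin(theta_loop) and cos(theta_loop)."""
    x = simpson(lambda u: math.sin(theta_loop(u, lam)), 0.0, s)
    z = simpson(lambda u: math.cos(theta_loop(u, lam)), 0.0, s)
    return x, z

def loop_position_closed(s, lam):
    """(x, z) from the closed-form expressions, Eqs. 4.35 and 4.34."""
    x = -2.0 * lam * (1.0 - 1.0 / math.cosh(s / lam))
    z = s - 2.0 * lam * math.tanh(s / lam)
    return x, z
```

Sampling `loop_position_closed(s, lam)` on a grid of `s` values yields the points of a parametric plot like the one in Fig. 4.28.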
A DNA chain with a loop has a shorter end-to-end distance than a straight DNA chain. The shortening $\Delta L$ caused by the presence of the loop follows by subtracting the end-to-end distance of the looped chain from that of the straight one. This can even be done for an infinitely long chain:

$$\Delta L = \int_{-\infty}^{\infty} \left( 1 - \cos\theta_{\mathrm{loop}}(s) \right) ds = \int_{-\infty}^{\infty} \frac{2}{\cosh^2(s/\lambda)} \, ds = 4\lambda. \tag{4.36}$$

The loop size is thus of the order of $\lambda = \sqrt{A/f}$. The harder one pulls on the chain, the smaller the loop. Equation 4.36 leads directly to a force-extension relation. If the chain has a contour length $L$ with $L \gg \lambda$, its end-to-end distance $z$ is given by

$$z = L - \Delta L = L - 4\sqrt{A/f} \tag{4.37}$$
in a very good approximation. We rewrite this for later purposes as follows:

$$f = \frac{16A}{L^2} \frac{1}{\left(1 - z/L\right)^2}. \tag{4.38}$$

A situation often encountered in DNA-protein complexes is that the protein induces a kink in the DNA. Suppose one applies a force $f$ on that DNA chain as schematically depicted in Fig. 4.30. For this geometry we can construct the two DNA halves to the left and to the right of the protein from sections of the loop solution, Eq. 4.33. For a protein-induced kink with opening angle $\alpha$, one finds the following force-extension relation:

$$z = L - 4\sqrt{\frac{A}{f}} \left( 1 - \cos\frac{\pi - \alpha}{4} \right) \tag{4.39}$$

or, equivalently,

$$f = \frac{16A}{L^2} \left( 1 - \cos\frac{\pi - \alpha}{4} \right)^2 \frac{1}{\left(1 - z/L\right)^2}. \tag{4.40}$$
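To make the relation between Eqs. 4.39 and 4.40 concrete, here is a small sketch that evaluates both and also inverts Eq. 4.39 for the opening angle. The function names are hypothetical, and the units (nm, pN, and a bending modulus in pN nm²) are an assumption made for illustration only.

```python
import math

def kink_factor(alpha):
    """Geometric factor 1 - cos((pi - alpha)/4) appearing in Eqs. 4.39 and 4.40."""
    return 1.0 - math.cos((math.pi - alpha) / 4.0)

def extension_kink(f, L, A, alpha):
    """End-to-end distance z of a kinked chain at zero temperature, Eq. 4.39."""
    return L - 4.0 * math.sqrt(A / f) * kink_factor(alpha)

def force_kink(z, L, A, alpha):
    """Force needed to reach extension z, Eq. 4.40."""
    return 16.0 * A / L**2 * kink_factor(alpha) ** 2 / (1.0 - z / L) ** 2

def opening_angle(f, z, L, A):
    """Opening angle alpha obtained by solving Eq. 4.39 for alpha."""
    c = 1.0 - (L - z) / (4.0 * math.sqrt(A / f))
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.pi - 4.0 * math.acos(c)
```

For example, with the assumed values `A = 205.0` (pN nm², roughly 50 nm times 4.1 pN nm), `L = 10000.0` nm and `f = 1.0` pN, calling `opening_angle(f, extension_kink(f, L, A, alpha), L, A)` recovers the `alpha` that was put in.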
Equation 4.39 suggests the intriguing possibility that one could determine the microscopic geometry of the DNA-protein complex, namely its opening angle $\alpha$, from a stretching experiment on a macroscopically long piece of DNA. We shall discuss this point further at the end of this section. We outline the calculation that leads to Eq. 4.39. The length loss $\Delta L/2$ resulting from the bending of, say, the right arm is given by

$$\frac{\Delta L}{2} = \int_{s_0(\alpha)}^{\infty} \frac{2}{\cosh^2(s/\lambda)} \, ds = 2\lambda \left( 1 - \tanh\left( s_0(\alpha)/\lambda \right) \right) \tag{4.41}$$

with $s_0(\alpha)$ denoting the arc length where the curve $\theta_{\mathrm{loop}}(s)$ starts from the kink. Comparing Eqs. 4.39 and 4.41 we see that we still have to show that

$$\tanh\frac{s_0(\alpha)}{\lambda} = \cos\frac{\pi - \alpha}{4}. \tag{4.42}$$

We start from the fact that at the kink $\theta_{\mathrm{loop}}(s_0) = (\pi - \alpha)/2$. From Eq. 4.33 follows then:

$$\cos\frac{\pi - \alpha}{2} = 1 - \frac{2}{\cosh^2\left( s_0(\alpha)/\lambda \right)}. \tag{4.43}$$
Figure 4.30 A DNA chain under tension with a kink-inducing protein bound to it.
One finds Eq. 4.42 by inserting Eq. 4.43 into the relation $\cos\left((\pi - \alpha)/4\right) = \sqrt{\left(1 + \cos\left((\pi - \alpha)/2\right)\right)/2}$.

Up to now we treated DNA like an elastic beam or rubber tube as it would behave in our macroscopic world. We determined the shape of a DNA molecule by minimizing its energy. We therefore conveniently “forgot” what we have learned in the previous two chapters, namely the crucial role that entropy typically plays in the microscopic world. To put it in other words, we studied “cold” DNA at a vanishing temperature, while we should have considered DNA at room or body temperature. With DNA being very long and very thin it should not come as a surprise that thermal fluctuations can induce huge deformations of the chain. However, in contrast to flexible polymers, such as the freely jointed chain from the previous chapter, DNA has a certain stiffness, which has a strong impact on its conformations and on its force-extension behavior. DNA is an example of a semiflexible polymer, halfway in between a flexible polymer and a stiff rod, as we shall explain in more mathematical terms below. At the end of this section we return to the cases of looped DNA, Fig. 4.28, and kinked DNA, Fig. 4.30, and discuss how thermal fluctuations influence their force-extension relationships.

We start with studying the influence of thermal fluctuations on a DNA chain in the case when there is no external force present. Our aim is to calculate the shape of the chain. We first restrict our chain to two dimensions. The 3D case will turn out to be a trivial extension of this calculation. The 2D Hamiltonian is given by:

$$H = \frac{A}{2} \int_0^L \dot{\theta}^2 \, ds. \tag{4.44}$$

We arrive at Eq. 4.44 by setting $\phi \equiv 0$ in Eq. 4.21 to enforce planar shapes; furthermore the twist term is not considered here
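since it constitutes another degree of freedom that does not couple to the filament’s shape. Shapes that minimize Eq. 4.44 follow from the Euler–Lagrange equation, Eq. D.9, which takes here the form $\ddot{\theta}(s) = 0$. Consider a short section of the DNA chain of length $l \ll L$ between $s = s_0$ and $s = s_0 + l$, i.e., $s_0 \le s \le s_0 + l$. Let us assume without loss of generality that $\theta(s_0) = 0$. We calculate now the cost of bending this segment such that $\theta(s_0 + l) = \theta_l$. The solution that minimizes Eq. 4.44 subject to the boundary conditions $\theta(s_0) = 0$ and $\theta(s_0 + l) = \theta_l$ is given by $\theta(s) = (s - s_0)\theta_l/l$, which leads to the bending energy

$$H_l = \frac{A}{2} \int_0^l \left( \theta_l/l \right)^2 ds = \frac{A}{2l} \theta_l^2. \tag{4.45}$$

The energy, Eq. 4.45, is quadratic in $\theta_l$, so that we can apply the equipartition theorem as discussed in Section 2.2:

$$\left\langle \theta_l^2 \right\rangle = \frac{l k_B T}{A}. \tag{4.46}$$

This means that the tangential correlation function $\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + l) \right\rangle$ of two tangents to the chain, which are separated by a small distance $l$, behaves like

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + l) \right\rangle = \left\langle \cos\theta_l \right\rangle \approx 1 - \frac{1}{2} \left\langle \theta_l^2 \right\rangle = 1 - l \frac{k_B T}{2A}. \tag{4.47}$$

For tangents that are separated by twice the distance, we divide the DNA chain between them into two sections of length $l$, one of which is bent by the angle $\theta_{l,1}$ and the other by $\theta_{l,2}$. This leads to

$$\begin{aligned} \left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + 2l) \right\rangle &= \left\langle \cos\left( \theta_{l,1} + \theta_{l,2} \right) \right\rangle \\ &= \left\langle \cos\theta_{l,1} \right\rangle \left\langle \cos\theta_{l,2} \right\rangle - \underline{\left\langle \sin\theta_{l,1} \right\rangle \left\langle \sin\theta_{l,2} \right\rangle} \\ &= \left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + l) \right\rangle \left\langle \mathbf{t}(s_0 + l) \cdot \mathbf{t}(s_0 + 2l) \right\rangle. \end{aligned} \tag{4.48}$$

In the second step we used the independence of the two succeeding bending angles $\theta_{l,1}$ and $\theta_{l,2}$. The underlined term vanishes due to the symmetry of the sine-function. So we find

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + 2l) \right\rangle = \left( 1 - l \frac{k_B T}{2A} \right)^2. \tag{4.49}$$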
It is now straightforward to extend this procedure to $n$ consecutive segments. One obtains

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + nl) \right\rangle = \left( 1 - l \frac{k_B T}{2A} \right)^n. \tag{4.50}$$

Now consider two tangents separated by an arbitrary distance $x$ along the chain. We can subdivide $x$ into smaller pieces of length $x/n$ and let $n$ go to infinity:

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + x) \right\rangle = \lim_{n \to \infty} \left( 1 - \frac{x}{n} \frac{k_B T}{2A} \right)^n = e^{-x k_B T/(2A)} \tag{4.51}$$

where we made use of the identity $\lim_{n \to \infty} (1 - y/n)^n = e^{-y}$. We introduce now the so-called persistence length

$$l_P = A/k_B T \tag{4.52}$$

that allows us to rewrite Eq. 4.51 as follows:

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + x) \right\rangle = e^{-x/(2 l_P)}. \tag{4.53}$$
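The exponential decay of the 2D tangent correlations can be illustrated with a small Monte Carlo sketch. It assumes that each short segment of length $l$ contributes an independent, Gaussian-distributed bending angle with variance $l/l_P$, which is what the Boltzmann weight of the quadratic bending energy, Eq. 4.45, implies when the periodicity of the angle is ignored. Function name, segment count, sample size and seed are illustrative choices.

```python
import math
import random

def tangent_correlation_2d(x, l_p, n_segments=20, n_chains=5000, seed=1):
    """Monte Carlo estimate of <t(s0) . t(s0 + x)> for a 2D wormlike chain.

    Each of the n_segments pieces of length x / n_segments is bent by an
    independent Gaussian angle with variance (segment length) / l_p, Eq. 4.46
    with A = l_p * kB * T. The correlation is the average cosine of the
    accumulated angle.
    """
    rng = random.Random(seed)
    sigma = math.sqrt((x / n_segments) / l_p)
    total = 0.0
    for _ in range(n_chains):
        angle = sum(rng.gauss(0.0, sigma) for _ in range(n_segments))
        total += math.cos(angle)
    return total / n_chains
```

Up to statistical noise of the order of one percent for the default sample size, `tangent_correlation_2d(x, l_p)` should agree with `math.exp(-x / (2 * l_p))`.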
The last step that remains is to go from two to three dimensions. Unlike the 2D case, the angle $\phi$ is no longer necessarily constant in the Hamiltonian, Eq. 4.21. Disregarding again the twist term we find

$$H = \frac{A}{2} \int_0^L \left( \dot{\phi}^2 \sin^2\theta + \dot{\theta}^2 \right) ds. \tag{4.54}$$

We introduce now two new variables

$$\theta_x = \theta \cos\phi, \qquad \theta_y = \theta \sin\phi. \tag{4.55}$$

They can be interpreted as the angles between the tangent vector projected in the XZ- and YZ-planes and the line $\theta = 0$. These new variables obey

$$\theta_x^2 + \theta_y^2 = \theta^2 \tag{4.56}$$

and

$$\dot{\theta}_x^2 + \dot{\theta}_y^2 = \dot{\theta}^2 + \dot{\phi}^2 \theta^2 \approx \dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta. \tag{4.57}$$

Hence the 3D analogue to Eq. 4.44 is

$$H = \frac{A}{2} \int_0^L \left( \dot{\theta}_x^2 + \dot{\theta}_y^2 \right) ds. \tag{4.58}$$
Figure 4.31 A DNA chain is straight along distances of the order of the persistence length $l_P$, which is around 50 nm.
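Each degree of freedom, as in Eq. 4.46, takes up $k_B T/2$ (once again the equipartition theorem), such that e.g., $\left\langle \theta_{x,l}^2 \right\rangle = l k_B T/A$. The tangent correlations therefore decay as

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + l) \right\rangle = \left\langle \cos\theta_l \right\rangle \approx 1 - \frac{1}{2} \left\langle \theta_l^2 \right\rangle = 1 - \frac{1}{2} \left( \left\langle \theta_{x,l}^2 \right\rangle + \left\langle \theta_{y,l}^2 \right\rangle \right) = 1 - l \frac{k_B T}{A}. \tag{4.59}$$

Following the steps that led us from Eq. 4.47 to 4.53, we find that the tangent correlations in three dimensions decay as

$$\left\langle \mathbf{t}(s_0) \cdot \mathbf{t}(s_0 + x) \right\rangle = e^{-x/l_P}. \tag{4.60}$$

This relation shows that the persistence length $l_P$ is the typical contour length along which the chain forgets its previous orientation. It turns out that for DNA at room temperature the persistence length is about 50 nm or 50 nm/0.33 nm ≈ 150 bp, see Fig. 4.31. We shall show below how $l_P$ was determined with the help of a DNA pulling experiment. Having calculated the tangent-tangent correlation function, Eq. 4.60, it is now straightforward to calculate the mean-squared end-to-end distance of a DNA chain of arbitrary length:

$$\left\langle R^2 \right\rangle = \left\langle \left( \int_0^L \mathbf{t}(s) \, ds \right)^2 \right\rangle = \int_0^L ds \int_0^L ds' \left\langle \mathbf{t}(s) \cdot \mathbf{t}(s') \right\rangle = \int_0^L ds \int_0^L ds' \, e^{-|s - s'|/l_P} = 2 l_P^2 \left( \frac{L}{l_P} + e^{-L/l_P} - 1 \right). \tag{4.61}$$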
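Equation 4.61 is easy to evaluate numerically, which makes it simple to see how the chain crosses over from rod-like behavior for contour lengths far below $l_P$ to random-walk behavior far above it. The sketch below uses an illustrative function name; `math.expm1` is used only to limit rounding errors for very short chains.

```python
import math

def mean_squared_end_to_end(L, l_p):
    """Mean-squared end-to-end distance <R^2> of a wormlike chain, Eq. 4.61.

    L and l_p must be given in the same length unit; the result is in that
    unit squared. math.expm1(-x) equals exp(-x) - 1.
    """
    x = L / l_p
    return 2.0 * l_p**2 * (x + math.expm1(-x))
```

For instance, `math.sqrt(mean_squared_end_to_end(32800.0, 50.0))` gives the root-mean-squared size in nm of an ideal chain with the contour length and persistence length quoted in the text for λ-phage DNA.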
Figure 4.32 Onsager’s argument to determine the second virial coefficient of elastic rods. The black rod is fixed in space. The rotation of the blue rod is blocked once the rod (shown now in red) collides with the black rod. By shifting the blue rod downward by a distance D (the rod’s diameter) the rod, now shown in green, can rotate freely in a plane parallel to the plane in which the red rod got stuck.
Let us consider the two limiting cases. For $L \ll l_P$ we find from Eq. 4.61 $\left\langle R^2 \right\rangle \approx L^2$. This corresponds to a rigid rod with hardly any shape fluctuations. On the other hand, for $L \gg l_P$ Eq. 4.61 simplifies to

$$\left\langle R^2 \right\rangle \approx 2 l_P L. \tag{4.62}$$

This case also has a simple interpretation. The WLC shows the ideal chain statistics, as we have already found for the random walk, Eq. 3.3, the freely jointed chain, Eq. 3.13, and the freely rotating chain, Eq. 3.20. Specifically, the interpretation of Eq. 4.62 goes as follows: The chain consists of $N = L/(2 l_P)$ orientationally independent segments of length $b = 2 l_P$. Hence $\left\langle R^2 \right\rangle = b^2 N = 2 l_P L$. Strictly speaking, for a DNA chain one has also excluded volume effects. We should therefore expect the chain to have a swollen coil conformation as discussed in Section 3.3. But since its “monomers” have a huge aspect ratio (a length of about 100 nm vs. a thickness of 2 nm) the chain needs to be very long before the excluded volume becomes important. Let us use the blob argument that we introduced in Section 3.3, Eqs. 3.38 to 3.41. As usual we disregard numerical factors. Within a thermal blob the DNA chain shows ideal statistics. From Eq. 4.62 follows then that the blob size scales as

$$\xi_T = l_P^{1/2} \left( l_P g_T \right)^{1/2}, \tag{4.63}$$
where $g_T$ is the number of persistence lengths within a blob. A thermal blob is characterized in that its two-body collision term is of the order of the thermal energy:

$$\frac{\upsilon g_T^2}{\xi_T^3} = 1. \tag{4.64}$$
According to Lars Onsager (Onsager, 1949) the second virial coefficient $\upsilon$ for elongated rods of length $l_P$ and diameter $D \ll l_P$ scales like $\upsilon = l_P^2 D$. The $l_P^2$-scaling of the second virial coefficient is surprising since the rods are extremely thin so that one might expect $\upsilon$ to be much smaller, e.g., $\upsilon = l_P D^2$. It can be understood as follows, see also Fig. 4.32. Suppose there is already a rod fixed in space with its midpoint positioned at the origin. Consider a second rod. As long as it is farther away from the origin than $l_P$, it can freely rotate around its midpoint without ever colliding with the other rod. If its midpoint is closer than a distance of about $l_P$, collisions are possible if the rod rotates in the plane in which the fixed rod is located. If we now move the rod up or down by a distance $D$, the rod can rotate freely again without collisions. In summary, there is a volume $l_P^3$ around the fixed rod, in which the rotation of the other rod is affected by collisions if its midpoint is within this volume. But only the fraction $D/l_P$ of rotations leads to collisions. Hence the excluded volume scales as

$$\upsilon = l_P^3 \times \frac{D}{l_P} = l_P^2 D. \tag{4.65}$$

The thermal blob parameters scale in this case as

$$\xi_T = \frac{l_P^4}{\upsilon} = \frac{l_P^2}{D}, \qquad g_T = \frac{l_P^6}{\upsilon^2} = \frac{l_P^2}{D^2}. \tag{4.66}$$

Finally, the overall chain is a SAW of thermal blobs, Eq. 3.41:

$$R = \xi_T \left( \frac{L}{g_T l_P} \right)^{3/5} = D^{1/5} l_P^{1/5} L^{3/5}. \tag{4.67}$$
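The scaling relations for the thermal blob, Eqs. 4.65 to 4.67, can be turned into numbers with a few lines of code. This is only a sketch: numerical prefactors are disregarded as in the text, the function names are illustrative, and all lengths are assumed to be given in the same unit.

```python
def thermal_blob(l_p, D):
    """Thermal blob parameters from Eqs. 4.65 and 4.66 (prefactors dropped).

    Returns (g_T, xi_T, contour), where g_T is the number of persistence
    lengths per blob, xi_T the blob size, and contour = g_T * l_p the
    contour length contained in one blob.
    """
    upsilon = l_p**2 * D          # Onsager excluded volume, Eq. 4.65
    g_T = l_p**6 / upsilon**2     # equals (l_p / D)**2
    xi_T = l_p**4 / upsilon       # equals l_p**2 / D
    return g_T, xi_T, g_T * l_p

def swollen_size(L, l_p, D):
    """Size of a chain of contour length L as a SAW of thermal blobs, Eq. 4.67."""
    g_T, xi_T, _ = thermal_blob(l_p, D)
    return xi_T * (L / (g_T * l_p)) ** 0.6
```

With `l_p = 50.0` and `D = 2.0` (both in nm), `thermal_blob` returns the blob parameters for bare DNA; a larger effective diameter `D` reduces `g_T` quadratically.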
This result also follows directly from Eq. 3.34 by setting $a = l_P$, $\upsilon = l_P^2 D$ and $N = L/l_P$. Note that it follows from Eq. 4.66 that the number $g_T$ of persistence lengths that are needed to see Flory statistics is rather large, namely $g_T = (l_P/D)^2$. This is on the order of $(50\ \mathrm{nm}/2\ \mathrm{nm})^2 \approx 600$, corresponding to a contour length of 30 μm. This number should actually be smaller since DNA chains carry electrical charges and repel each other. Due to the presence of small ions, this repulsion is screened at physiological salt concentrations
beyond a typical distance of 1 nm; the theory behind this estimate will be presented later in the book, in Section 7.4. In that case the DNA chain would be effectively thicker by about 1 nm. $g_T$ is then smaller, on the order of $(50/4)^2 \approx 150$, corresponding to about 8 μm. This estimate is certainly rough, but gives the idea that it takes fairly long chains to find deviations from ideal chain behavior. That is why one usually gets away with neglecting swelling effects when studying DNA, also in the case that follows next.

We study now a DNA chain under tension. This appears to be straightforward since we already learned in Chapter 3 how to calculate polymers under tension. In addition, we have learned above that if we use the freely jointed chain expression, Eq. 3.13, with $N = L/(2 l_P)$ and $b = 2 l_P$, we can reproduce the WLC ideal coil result, Eq. 4.62. Now we have already calculated the exact force-extension relation of the freely jointed chain, Eq. 3.17. So the recipe is simply to translate this finding to the case of a WLC under tension by again replacing $N$ by $L/(2 l_P)$ and $b$ by $2 l_P$:

$$\langle z \rangle = bN \mathcal{L}(\beta b f) = L \mathcal{L}(2 \beta l_P f), \tag{4.68}$$

where $\mathcal{L}(x) = \coth x - 1/x$ denotes the Langevin function.
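A minimal sketch of Eq. 4.68 in code form may help when comparing with data. The Langevin function is written out explicitly as $\coth x - 1/x$; the default value `kT = 4.1` (in pN nm, roughly room temperature) and the function names are assumptions made for illustration, with lengths in nm and forces in pN.

```python
import math

def langevin(x):
    """Langevin function coth(x) - 1/x, with its small-x limit x/3."""
    if abs(x) < 1e-4:
        return x / 3.0  # leading term of the series; avoids 1/x cancellation
    return 1.0 / math.tanh(x) - 1.0 / x

def fjc_extension(f, L, l_p, kT=4.1):
    """Mean extension <z> of the freely jointed chain with b = 2 l_p, Eq. 4.68.

    f in pN, L and l_p in nm, kT in pN nm (4.1 pN nm is an assumed
    room-temperature value).
    """
    return L * langevin(2.0 * l_p * f / kT)
```

At small forces the result approaches the linear, entropic-spring response, whereas for large forces the extension saturates at the contour length `L`.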
Does Eq. 4.68 provide a satisfactory description of the force-extension relation of a WLC such as DNA? Since the beginning of the 1990s force-extension curves have been experimentally accessible by attaching a DNA chain to a surface on one end and to an optically or magnetically trapped micron-sized bead on the other end. Figure 4.33 shows the force-extension curve of a 97 kbp long DNA chain that has been recorded using such a so-called magnetic tweezer, the first such curve ever published (Smith et al., 1992); the DNA stems from λ-phage, a bacterial virus. The blue dashed curve in Fig. 4.33 corresponds to the freely jointed chain expression, Eq. 4.68. The parameters $L$ and $l_P$ have been chosen as follows. The force should go to infinity when the extension approaches the contour length $L$; the best fit is given for $L$ = 32.8 μm. This is reasonable since this corresponds to a length of 32.8 μm/97000 = 0.34 nm per base pair. Next $l_P$ is chosen such that the curve fits best for small forces, $f$ < 0.1 pN, where the chain is only slightly deformed away from the Gaussian coil; this is achieved for $l_P$ = 50 nm. As can be seen from Fig. 4.33, the curve constructed that way gives a good description of the data for small forces, $f$ < 0.1 pN. In this regime, DNA is an example of an entropic
Figure 4.33 The force-extension relation of the freely jointed chain model, Eq. 4.68 (blue dashed line), does not compare favorably with Smith’s 1992 stretching data on 97 kbp DNA (figure adapted from Bustamante et al. (1994)). The dashed red curve is based on the WLC model, Eq. 4.79, and shows a good agreement with the data for f > 0.1 pN. Finally, the solid red curve corresponds to Eq. 4.80, an interpolation formula between small and high forces. (Axes: force f [pN] with tick marks at 0.1, 1, 10 and 100 versus extension Δz [μm] with tick marks from 5 to 35.)
spring, as discussed in the chapter on polymer physics. Specifically, we find from Eq. 4.68:

$$\langle z \rangle = \frac{2 l_P L}{3 k_B T} f. \tag{4.69}$$
However, Eq. 4.68 fails entirely for larger forces, a problem also encountered in the original paper (Smith et al., 1992). Something is wrong with the freely jointed chain as a model for DNA under larger tension. Since the freely jointed chain does not work for large forces, we start from the WLC model instead and study the case where the chain is already nearly completely extended. That means we assume $\theta(s) \ll 1$ everywhere, cf. Fig. 4.34 for a comparison between the low force case and the high force case. For simplicity, let us start with the case where the chain is confined to two dimensions; the extension to the 3D case will then be straightforward. For $\theta \ll 1$, the Hamiltonian, Eq. 4.24, can be simplified by approximating $\cos\theta$ by $1 - \theta^2/2$. This
Figure 4.34 DNA under small and large tension f. (Panels: low tension; high tension, where θ(s) ≪ 1.)
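leads to a Hamiltonian that is quadratic in $\theta$ and $\dot{\theta}$:

$$H = \int_0^L \left[ \frac{A}{2} \dot{\theta}^2 + \frac{f}{2} \theta^2 \right] ds - fL. \tag{4.70}$$

We write $\theta(s)$ as a Fourier series

$$\theta(s) = \sum_{n=-\infty}^{\infty} \hat{\theta}_n e^{-2\pi i n s/L}. \tag{4.71}$$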
For an introduction to Fourier series, we refer the reader to Appendix E. For simplicity, we assumed periodic boundary conditions, $\theta(L) = \theta(0)$, in Eq. 4.71. This assumption does not affect the result for long chains. By inserting the Fourier series, Eq. 4.71, into the Hamiltonian, Eq. 4.70, we find that the Hamiltonian decouples into a sum over independent modes

$$H = \sum_n \left( \frac{2\pi^2 A}{L} n^2 + \frac{fL}{2} \right) \left| \hat{\theta}_n \right|^2 - fL. \tag{4.72}$$

“Independent” here means that there are no cross terms, e.g., terms of the form $\hat{\theta}_n \hat{\theta}_m^*$ with $n \ne m$, i.e., the amplitude of each mode is entirely unaffected by the amplitudes of the other modes. We can now use the equipartition theorem from Section 2.2. It predicts that the mean-squared amplitude of the $n$-th mode is given by

$$\left\langle \left| \hat{\theta}_n \right|^2 \right\rangle = \frac{k_B T}{\left( 4A\pi^2/L \right) n^2 + fL}. \tag{4.73}$$

Equation 4.73 together with Eq. 4.71 has an interesting physical interpretation (Odijk, 1995). The shape of the fluctuating DNA chain
Figure 4.35 Example configurations of a DNA chain under tension for three different applied forces (f = 1 pN, f = 3 pN and f = 10 pN; the length 2πλ is indicated in the figure). Note that the chain’s excursions perpendicular to the force direction have been exaggerated by a factor of 5 to make them better visible.
can be considered as a superposition of different modes with the $n$-th mode having a wavelength $L/n$ and a mean-squared amplitude given by Eq. 4.73. The applied force only has a strong effect on modes that have sufficiently long wavelengths; a comparison of the two terms in the denominator of Eq. 4.73 shows that these are modes for which $fL > \left( 4A\pi^2/L \right) n^2$. In other words, modes of wavelength $L/n$ larger than $2\pi\sqrt{A/f} = 2\pi\lambda$ are suppressed. Here the correlation length, Eq. 4.28, appears again. Earlier, it was the size of a protein-induced deformation, now the wavelength of the largest mode that survives the applied tension. Excursions of the chain away from the straight configuration are thus correlated over this characteristic length scale. On the other hand, modes of wavelength $L/n$ (much) smaller than $2\pi\sqrt{A/f} = 2\pi\lambda$ have an amplitude $|\hat{\theta}_n|$ proportional to $1/n$ and are thus scale invariant. Typical example configurations for three different experimentally relevant forces are displayed in Fig. 4.35. We are now in the position to calculate the end-to-end distance

$$\begin{aligned} \langle z \rangle &= \left\langle \int_0^L \cos\theta(s) \, ds \right\rangle \approx \int_0^L \left( 1 - \frac{1}{2} \left\langle \theta^2(s) \right\rangle \right) ds \\ &= L - \frac{L}{2} \sum_n \left\langle \left| \hat{\theta}_n \right|^2 \right\rangle \approx L - \frac{L}{2} \int_{-\infty}^{\infty} \frac{k_B T}{\left( 4A\pi^2/L \right) n^2 + fL} \, dn \\ &= L \left( 1 - \frac{k_B T}{4\sqrt{Af}} \right). \end{aligned} \tag{4.74}$$

Here in the third step we inserted the Fourier series, Eq. 4.71, and then in the following step we replaced the summation by an
integration. It is now straightforward to solve Eq. 4.74 for the force. We find

$$f = \frac{k_B T}{16 l_P} \frac{1}{\left( 1 - \langle z \rangle/L \right)^2}. \tag{4.75}$$
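The replacement of the mode sum by an integral in the 2D calculation can be checked numerically. The sketch below sums the equipartition amplitudes, Eq. 4.73, up to a cutoff `n_max` (an illustrative truncation that slightly underestimates the sum) and compares the resulting extension with the closed form of Eq. 4.74. Function names, units (nm, pN, pN nm) and parameter values are assumptions.

```python
import math

def mode_sum(f, L, A, kT, n_max):
    """Sum over n = -n_max..n_max of the mean-squared amplitudes, Eq. 4.73."""
    a = 4.0 * A * math.pi**2 / L
    b = f * L
    total = kT / b                      # n = 0 mode
    for n in range(1, n_max + 1):
        total += 2.0 * kT / (a * n * n + b)   # modes +n and -n
    return total

def extension_from_modes(f, L, A, kT, n_max=100000):
    """<z> = L - (L/2) * sum_n <|theta_n|^2>, third step of Eq. 4.74."""
    return L - 0.5 * L * mode_sum(f, L, A, kT, n_max)

def extension_2d(f, L, A, kT):
    """Closed-form 2D result, last step of Eq. 4.74."""
    return L * (1.0 - kT / (4.0 * math.sqrt(A * f)))

def force_2d(z, L, l_p, kT):
    """2D force-extension relation, Eq. 4.75."""
    return kT / (16.0 * l_p) / (1.0 - z / L) ** 2
```

For a chain much longer than $\lambda$, for instance `L = 2000.0`, `A = 205.0`, `kT = 4.1` and `f = 1.0`, the two extensions differ only by the small contribution of the truncated modes.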
In order to compare with the experimental data shown in Fig. 4.33 we need to extend the calculation to the 3D case. We start from the 3D Hamiltonian, Eq. 4.22, but drop the second term in the integral that accounts for the twist energy. This is possible since in the experiment the DNA ends were grafted to the surfaces via single-stranded overhangs around which the DNA can swivel freely; thus there is no coupling between the bending and twisting modes. Assuming sufficiently large forces such that $\theta(s) \ll 1$ we arrive at

$$H = \int_0^L \left[ \frac{A}{2} \left( \dot{\phi}^2 \theta^2 + \dot{\theta}^2 \right) + \frac{f}{2} \theta^2 \right] ds - fL. \tag{4.76}$$

We rewrite the Hamiltonian in terms of the variables $\theta_x$ and $\theta_y$ introduced in Eq. 4.55. Using Eqs. 4.56 and 4.57 we find immediately

$$H = \int_0^L \left[ \frac{A}{2} \left( \dot{\theta}_x^2 + \dot{\theta}_y^2 \right) + \frac{f}{2} \left( \theta_x^2 + \theta_y^2 \right) \right] ds - fL. \tag{4.77}$$

It follows from Eq. 4.77 that the two variables $\theta_x$ and $\theta_y$ decouple and that each part is identical to the 2D case, Eq. 4.70, which we solved already. The end-to-end distance is then given by

$$\begin{aligned} \langle z \rangle &\approx \int_0^L \left( 1 - \frac{1}{2} \left\langle \theta^2(s) \right\rangle \right) ds = \int_0^L \left( 1 - \frac{1}{2} \left( \left\langle \theta_x^2(s) \right\rangle + \left\langle \theta_y^2(s) \right\rangle \right) \right) ds \\ &= L \left( 1 - \frac{k_B T}{2\sqrt{Af}} \right) \end{aligned} \tag{4.78}$$

(cf. Eq. 4.56) which can be rewritten as

$$f = \frac{k_B T}{4 l_P} \frac{1}{\left( 1 - \langle z \rangle/L \right)^2}. \tag{4.79}$$
A comparison between Eqs. 4.75 and 4.79 shows that for a given value of $\langle z \rangle$ the force is four times larger if the chain can fluctuate in three dimensions instead of two. How well does Eq. 4.79 work? The red dashed curve in Fig. 4.33 gives a good description of the large force data points, $f$ > 0.1 pN, when one chooses a chain length $L$ = 32.8 μm and a persistence length $l_P$ = 50 nm, the same parameter values that we found when we fitted the freely jointed chain to the low force data points. While the formula works remarkably well, it obviously cannot describe the low force data where the assumption $\theta \ll 1$ is violated. Since no exact treatment is available that covers the whole range of forces, an interpolation formula between small and high forces was proposed in Ref. (Bustamante et al., 1994), namely

$$f = \frac{k_B T}{l_P} \left[ \frac{1}{4 \left( 1 - \langle z \rangle/L \right)^2} - \frac{1}{4} + \frac{\langle z \rangle}{L} \right]. \tag{4.80}$$
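The interpolation formula is simple to code up next to its two limits. In the sketch below `z_rel` stands for the relative extension $\langle z \rangle/L$; the function names and the default `kT = 4.1` pN nm are assumptions for illustration, with $l_P$ in nm and forces in pN.

```python
def wlc_interpolation(z_rel, l_p, kT=4.1):
    """Interpolation formula for the WLC force, Eq. 4.80; z_rel = <z>/L."""
    return kT / l_p * (0.25 / (1.0 - z_rel) ** 2 - 0.25 + z_rel)

def wlc_high_force(z_rel, l_p, kT=4.1):
    """High-force WLC result in 3D, Eq. 4.79."""
    return kT / (4.0 * l_p) / (1.0 - z_rel) ** 2

def entropic_spring(z_rel, l_p, kT=4.1):
    """Low-force linear response, Eq. 4.69 solved for f."""
    return 3.0 * kT * z_rel / (2.0 * l_p)
```

Comparing `wlc_interpolation` with `entropic_spring` at small `z_rel`, and with `wlc_high_force` for `z_rel` close to 1, shows that the relative deviations vanish in the respective limits.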
You can easily check that this interpolation formula indeed reduces to Eq. 4.69 for $\langle z \rangle \ll L$ and to Eq. 4.79 for $L - \langle z \rangle \ll L$. As expected, the solid curve in Fig. 4.33 gives the interpolation, Eq. 4.80, between the two asymptotic cases. Equation 4.79 is markedly different from the high-force limit of the freely jointed chain:

$$f = \frac{k_B T}{2 l_P} \frac{1}{1 - \langle z \rangle/L}. \tag{4.81}$$
This relation follows from the large force limit provided in Eq. 3.17 (substituting again $N$ by $L/(2 l_P)$ and $b$ by $2 l_P$). Why does the freely jointed chain model not work here? The force regime at which the chain is nearly straight is reached for forces for which the slack remaining in the chain is much smaller than $L$. According to the WLC model that slack is given by $L k_B T/\left( 2\sqrt{Af} \right)$, Eq. 4.78. This means that the high force regime, Eq. 4.79, is reached for $f l_P > k_B T$. In this case, not only chain sections that are shorter than $l_P$ are nearly straight (as always, even without tension) but also sections longer than $l_P$ are nearly straight since they are aligned parallel to the direction of force. The high-force condition, $f l_P > k_B T$, can be rewritten as $\lambda < l_P$. This shows that for the high force regime the surviving fluctuations are those with wavelengths smaller than $l_P$. The increase of the end-to-end distance with increasing force reflects the freezing-in of degrees of freedom on those short length scales. Obviously, the freely jointed chain, which models the chain as completely stiff below $b = 2 l_P$, fails to capture these short-wavelength fluctuations. As a result, the increase in
force, as the end-to-end distance approaches $L$, is by far less dramatic than with the WLC model. The freely jointed chain model can only provide a reasonable description for small forces, namely for $f < k_B T/l_P = (4/50)$ pN $\approx 0.1$ pN as can indeed be seen in Fig. 4.33.

We looked at two different types of force-extension relations in this section. First we studied a WLC under tension with a sliding loop, Fig. 4.28, or a kink, Fig. 4.30, in the absence of thermal fluctuations, and then we considered a straight WLC under tension in the presence of thermal fluctuations. Remarkably, we found force-extension relations of the same functional form for these two very different problems:

$$ f = \frac{f_0}{\left(1 - z/L\right)^2}. \tag{4.82} $$
For the loop we obtained $f_0 = 16A/L^2$, Eq. 4.38, and for the kink $f_0 = (16A/L^2)\left[1 - \cos\left((\pi - \alpha)/4\right)\right]^2$, Eq. 4.40. The proportionality $f_0 \sim A$ reflects the fact that in these examples the chain is extended through the purely elastic bending of the DNA associated with the loop or kink. On the other hand, for the stretching out of thermal fluctuations we found that $f_0$ depends on the temperature, namely $f_0 = (k_B T)^2/(4A)$, Eq. 4.79. This leads to the following interesting and experimentally relevant question: How does looped or kinked DNA react to an external force in the presence of thermal fluctuations? Let us try to guess the answer for the case of looped DNA. We start from the finding that in the absence of fluctuations a loop "eats up" a length $4\sqrt{A/f}$, Eq. 4.37. Now let us add fluctuations on top of this lowest energy configuration. Most of the DNA is in the arms that are oriented in the force direction. If we neglect the looped part, then we should estimate that the fluctuations shorten the chain by $L k_B T/(2\sqrt{A f})$, Eq. 4.78. There is a small error in this estimate since the loop affects the fluctuations. The loop engages only a small fraction of the chain, $\lambda/L$ (the loop size divided by chain length). Based on this, we expect an error in our estimate of the order $(\lambda/L) \times L k_B T/\sqrt{A f} = k_B T/f$. If we sum up all these contributions, we find for a loop under tension in the presence of thermal fluctuations the following end-to-end distance:

$$ z = L - \frac{k_B T}{2\sqrt{A f}}\,L - 4\sqrt{\frac{A}{f}} + O\!\left(\frac{k_B T}{f}\right). \tag{4.83} $$
This guess is supported by a full calculation (Kulić et al., 2007), which turns out to be quite complicated and is therefore not presented here. We only mention that the calculation starts from the Hamiltonian, Eq. 4.22 with $C \equiv 0$, and studies small fluctuations around the ground state $\theta_{loop}(s)$ given by Eq. 4.32, i.e., $\theta(s) = \theta_{loop}(s) + \delta\theta(s)$ and $\phi(s) = \delta\phi(s)$ with $\delta\theta(s) \ll 1$ and $\delta\phi(s) \ll 1$ everywhere. This calculation also allows one to derive the correction term in Eq. 4.83 explicitly, namely $-(9/4)\, k_B T/f$. That correction term is on the order of $\lambda/l_P$ smaller than the $\sqrt{A/f}$-term which describes the loop-induced shortening. As discussed below Eq. 4.81, the condition $\lambda < l_P$ is precisely the condition for high forces where fluctuations away from the straight state are small, here where $\delta\theta$ and $\delta\phi$ stay small as assumed in the calculation. This means that the last term in Eq. 4.83 is automatically the smallest term in the range of validity of the equation, namely for forces with $f \gg 0.1$ pN. In this case the two remaining $f$-dependent terms scale like $f^{-1/2}$. It is therefore straightforward to solve for $f$:

$$ f = \frac{k_B T}{4 l_P^{app}}\,\frac{1}{\left(1 - z/L\right)^2}, \tag{4.84} $$

with

$$ l_P^{app} = \frac{l_P}{\left(1 + 8\,\dfrac{l_P}{L}\right)^2}. \tag{4.85} $$

We remarked earlier that thermal fluctuations on a straight chain and a loop on a non-fluctuating chain lead to the same functional form of the force response, Eq. 4.82. Here we see that the combination of both effects leads to this functional form again. We wrote the relation Eq. 4.84 in such a way that it resembles the classical WLC response, Eq. 4.79. However, the quantity $l_P^{app}$ reflects not only the chain stiffness, but also its configuration. Along similar lines we can determine the force-extension relation for a DNA chain with a kink, Fig. 4.30. Combining Eqs. 4.39 and 4.78 we obtain Eq. 4.84 with

$$ l_P^{app} = \frac{l_P}{\left[1 + 8\,\dfrac{l_P}{L}\left(1 - \cos\dfrac{\pi - \alpha}{4}\right)\right]^2}. \tag{4.86} $$
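To see how strongly a single loop or kink can reduce the fitted persistence length, Eqs. 4.85 and 4.86 can be evaluated directly. The Python sketch below is a minimal illustration; the function names are hypothetical and lengths may be given in any common unit.

```python
import math


def apparent_lp_loop(lp, L):
    """Apparent persistence length of a chain of length L carrying one loop, Eq. 4.85."""
    return lp / (1.0 + 8.0 * lp / L) ** 2


def apparent_lp_kink(lp, L, alpha):
    """Apparent persistence length of a chain with one kink, Eq. 4.86.

    alpha is the opening angle of the kink in radians (alpha = pi means no kink).
    """
    kink_factor = 1.0 - math.cos((math.pi - alpha) / 4.0)
    return lp / (1.0 + 8.0 * (lp / L) * kink_factor) ** 2


if __name__ == "__main__":
    lp = 50.0  # nm
    for ratio in (10, 50, 1000):
        print(ratio, apparent_lp_loop(lp, ratio * lp) / lp)
```

For a looped chain with $L = 50\, l_P$ this gives $l_P^{app} \approx 0.74\, l_P$, and the reduction only disappears slowly as $L/l_P$ grows. A kink with $\alpha = \pi$ leaves the persistence length unchanged, as it should.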
[Figure 4.36, panels (a) to (c). Labels in the figure: GalR dimer 1, GalR dimer 2, GalR tetramer, f = 0.88 pN, 38 nm, 18 nm, Δl = 56 nm.]

Figure 4.36 DNA under tension in the presence of gal repressor proteins (Lia et al., 2003). (a) Dimers bind to two specific binding sites at a distance of 38 nm. (b) and (c) Tentative models for DNA binding: the parallel loop and the anti-parallel loop configuration (schematic drawings after Geanacopoulos et al. (2001)). Only the anti-parallel loop leads to bent DNA arms which, according to Eq. 4.39 with $\alpha = 0$ and $f = 0.88$ pN, cause a length loss of 18 nm.
These results show that one has to be careful when one determines the persistence length of DNA in micromanipulation experiments. If the DNA chain shows some extra features, e.g., a kink or a knot, or if there are special boundary conditions at its ends, one measures an apparent persistence length that reflects not only the chain stiffness but also conformational features and thus underestimates the true persistence length. Note, however, that this effect becomes negligible in the long chain limit $L/l_P \to \infty$, see Eqs. 4.85 and 4.86. But it remains significant over a surprisingly large range of parameters, e.g., $l_P^{app} = 0.74\, l_P$ for a looped DNA chain of length $L = 50\, l_P$. Moreover, if the number of defects increases in proportion to the chain length, e.g., if there are kink-inducing proteins at an average spacing $\bar{L}$, $l_P^{app}$ stays independent of the chain length, since you have to replace $L$ with $\bar{L}$ in Eq. 4.86.

How does this compare to experiments? Here we show the case of the so-called gal repressor dimer protein (GalR), which, when bound to a specific piece of DNA of the bacterium Escherichia coli, suppresses transcription of genes related to the utilization of galactose in, e.g., the synthesis of its cell wall. There are two dimers of GalR binding at two specific DNA positions at a distance of 113 bp (38 nm), see Fig. 4.36(a). The two dimers then bind together to form a tetramer, forcing the DNA in between to form a loop. Two competing models allow either for formation of an anti-parallel loop, Fig. 4.36(c), or a parallel loop, Fig. 4.36(b) (Geanacopoulos et al., 2001). A micromanipulation experiment (Lia et al., 2003) measured the extension of a DNA chain containing the two specific GalR binding sites under a moderate force of $f = 0.88$ pN, see Fig. 4.36. The binding and unbinding of the GalR repressor was detected by sudden jumps $\Delta l$ with $\Delta l = 55 \pm 5$ nm in the end-to-end distance. Remarkably, this change was substantially larger than just the 38 nm eaten up by the loop. This indicates an additional length loss caused by the bent DNA arms outside the loop. In the parallel loop model one has $\alpha = \pi$ (see Fig. 4.30) and hence no additional length loss (see Eq. 4.39); we expect $\Delta l = 38$ nm, which is inconsistent with the data. On the other hand, the anti-parallel loop conformation leads to $\alpha = 0$. Equation 4.39 predicts for $f = 0.88$ pN an extra length loss of $4\sqrt{A/f}\left(1 - 1/\sqrt{2}\right) = 18$ nm that leads to a total length loss of 56 nm, in good agreement with the data. This demonstrates that one can learn about the microscopic structure of a DNA-protein complex by pulling on a long DNA chain, just knowing about large scale DNA elasticity.
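The arithmetic behind the 18 nm and 56 nm quoted for the GalR experiment can be retraced in a few lines. The sketch below is illustrative only: the function name is hypothetical, the bending modulus is taken as $A = k_B T\, l_P$ with the assumed values $k_B T = 4.1$ pN nm and $l_P = 50$ nm, and the length loss of the bent arms is written with the same kink factor $1 - \cos((\pi - \alpha)/4)$ that appears in Eq. 4.86.

```python
import math


def bent_arm_length_loss(f, alpha, A):
    """Length loss (nm) of the bent arms next to a kink with opening angle alpha.

    f     : force in pN
    alpha : opening angle in radians
    A     : bending modulus in pN nm^2
    """
    return 4.0 * math.sqrt(A / f) * (1.0 - math.cos((math.pi - alpha) / 4.0))


KT = 4.1             # pN nm, assumed room-temperature value
LP = 50.0            # nm, persistence length of double-stranded DNA
A = KT * LP          # pN nm^2
LOOP_LENGTH = 38.0   # nm, DNA between the two GalR binding sites
FORCE = 0.88         # pN

# Parallel loop: alpha = pi, the arms stay straight.
parallel = LOOP_LENGTH + bent_arm_length_loss(FORCE, math.pi, A)
# Anti-parallel loop: alpha = 0, the arms are bent.
antiparallel = LOOP_LENGTH + bent_arm_length_loss(FORCE, 0.0, A)

if __name__ == "__main__":
    print("parallel loop:      %.1f nm" % parallel)
    print("anti-parallel loop: %.1f nm" % antiparallel)
```

With these numbers the parallel loop gives exactly the 38 nm of the loop itself, whereas the anti-parallel loop gives a total close to 56 nm, inside the measured window of $55 \pm 5$ nm.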
4.4 DNA Melting

When a gene is transcribed by RNA polymerase, Fig. 1.3, or when the whole genome is duplicated by DNA polymerases, Fig. 1.2, the two strands of the DNA double helix need to be separated. Experimentally, it is very easy to induce the separation of the two strands by heating a solution containing double-stranded DNA chains. One can measure the fraction $\theta_b$ of paired bases through the characteristic light absorption of double-stranded DNA at 260 nm. At low temperatures all the bases are paired, $\theta_b = 1$, whereas at high
temperatures all the bases are unbound, $\theta_b = 0$. At intermediate temperatures, typically around 70 to 90 °C, the thermal denaturation or melting of DNA occurs. In general, the actual melting curve $\theta_b = \theta_b(T)$ of long DNA chains looks complicated, exhibiting a multi-step behavior where sharp jumps are separated by plateaus of various lengths. This reflects the heterogeneity of the bp sequence. Remember that AT pairs are bound via two hydrogen bonds and are thus weaker than GC pairs with their three hydrogen bonds, see Fig. 4.4. As a result, stretches of the DNA double helix that have a high AT content open first and form what are known as denaturation loops or bubbles. The melting curve thus contains information about the sequence of the molecule under study. It is not possible to avoid these sequence effects: If one uses DNA with a homogeneous sequence instead, e.g., only A's on one strand and T's on the other, one has the problem that a given base of the A-strand can be paired with any base of the T-strand. In the following we will not go into the sequence dependence any further; see Ref. (Blossey, 2006) for an insightful discussion on that subject. Instead, we discuss here an idealized problem where all the base pairs have the same binding energy and each monomer can be bound to one specific matching base on the other strand. We ask ourselves how such an idealized DNA molecule would melt, especially in the limit of an infinitely long chain. In principle we can think of three possibilities:

(a) There is no phase transition and $\theta_b$ goes smoothly from one to zero.

(b) The curve $\theta_b = \theta_b(T)$ is continuous but reaches zero at some finite temperature $T_M$ and stays zero for $T \geq T_M$. Some higher-order derivative of $\theta_b$ has a jump or a singularity at $T = T_M$. DNA melting would then correspond to a continuous phase transition.

(c) The $\theta_b$-curve features a jump from a finite value to zero once a certain temperature has been reached. This would correspond to a first-order phase transition.

We discuss now the Poland–Scheraga model (Poland and Scheraga, 1966) following the treatment in (Kafri et al., 2002). Each monomer can be either in a bound or an unbound state. As a result, the conformation of two DNA strands can be described by a sequence of bound and unbound stretches. A schematic view of such a partly denatured DNA configuration is shown in Fig. 4.37. In this example we have a stretch of $l_1$ bases that are paired followed
Figure 4.37 Schematic view of a microscopic configuration of a partly molten DNA molecule. It consists of an alternating sequence of bound and denatured stretches (shown in black and red). Here the bound segments have lengths $l_1$, $l_3$ and $l_5$ bp, the denaturation loops have lengths $l_2$ and $l_4$ bases on each single strand. Finally, the rightmost stretch, the open end segment, has a length of $l_6$ bases. The leftmost pair of bases is assumed to be always bound.
by a stretch of $l_2$ bases that are unpaired and so on. The $l_i$'s sum up to the total length of the chain, $L = \sum_i l_i$. The pairing of bases is advantageous from an energetic point of view, since bound pairs have hydrogen bonds between each other. Unbinding of the pairs is advantageous from an entropic point of view since single-stranded DNA is much more flexible than double-stranded DNA. There are many more configurations available once the DNA starts to melt. We saw already in the previous section that double-stranded DNA has a persistence length of about 50 nm (or 150 bp). Single-stranded DNA, on the other hand, is much softer. In fact, when explaining why DNA forms a double helix, we assumed that the two single strands are rather flexible, which allows the base pairs to be stacked, see Fig. 4.6. Indeed, it is known that the effective bond length of single-stranded DNA is about 4 nm or 8 bases. For the sake of simplicity, the Poland–Scheraga model assumes that the bound sections are infinitely stiff. The unbound single strands are assumed to be infinitely flexible so that all the allowed conformations have the same energy. Moreover, interactions between different sections are neglected. The statistical weight of the particular configuration shown in Fig. 4.37 is thus given by the product of the weights of the individual sections:

$$ p\left(\{l_1, \ldots, l_6\}\right) = \frac{1}{Z}\, w^{l_1}\, \Omega(2 l_2)\, w^{l_3}\, \Omega(2 l_4)\, w^{l_5}\, \Lambda(2 l_6). \tag{4.87} $$

Here $Z$ is a normalization factor, which of course is nothing more than the partition function. The quantity $w$ is the Boltzmann weight for the matching bases to be paired,

$$ w = e^{-\beta E_0} \tag{4.88} $$

with $E_0 < 0$ being the binding energy. The quantities $\Omega$ and $\Lambda$ count the number of configurations of the unpaired sections, i.e., the inner loops and the dangling ends. In particular, the inner loops (in the figure, those of length $l_2$ and $l_4$ per strand) can be interpreted as closed loops of length $2 l_i$: One first goes along one strand of length $l_i$ and then goes back to the starting point along the other strand, also of length $l_i$. The number of configurations of a loop of length $2l$ has for large $l$ the form

$$ \Omega(2l) = A\, \frac{s^l}{l^c}. \tag{4.89} $$
Here $s$ and $A$ are constants that depend on microscopic details (i.e., they are non-universal); in particular $s$ reflects the number of conformations per segment of a single-stranded chain. On the other hand, $c$ has a universal value that is determined by the properties of the loop configurations. To give a specific example for Eq. 4.89, let us consider random walks on a three-dimensional cubic lattice. In Eq. 3.7 we found that the probability distribution for the end-to-end distance of such a random walk is Gaussian. Furthermore it is obvious that the total number of random walks on a cubic lattice with $N = 2l$ steps is given by $6^{2l}$, see also Eq. 3.6. For a loop we have $R = 0$ and hence $\Omega(2l) = \mathrm{const} \cdot 6^{2l}/(2l)^{3/2} \sim 36^l/l^{3/2}$. Obviously the quantities $s$ and $A$ depend on the details of the random walk, e.g., the lattice on which it lives, but $c$ (here $c = 3/2$) does not. If we consider excluded volume interactions between monomers within a loop and/or between different loops, $\Omega$ retains the form given in Eq. 4.89 but the value of $c$ changes, as discussed further below. Finally, let us discuss the two ends of the DNA molecule. We assume that the leftmost base pair in Fig. 4.37 is always bound. Otherwise, our single DNA molecule in an infinite volume would gain an infinite translational entropy by unbinding its two strands and would never form a double helix at any finite temperature. The end on the right is allowed to separate into two dangling denatured strands as shown schematically on the right of the conformation in Fig. 4.37. If each strand has a length $l$, the conformation can be
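The value $c = 3/2$ for ideal loops on the cubic lattice can be checked by exact counting. A walk of $2l$ steps returns to its starting point if it takes equally many steps in the $+x$ and $-x$ directions, and likewise for $y$ and $z$. The Python sketch below (function names are hypothetical, chosen here for illustration) sums the corresponding multinomial coefficients and estimates the exponent by comparing loops of length $2l$ and $4l$.

```python
from math import factorial, log


def closed_walks(l):
    """Exact number of 2*l-step walks on the cubic lattice that return to the origin."""
    fact = [factorial(n) for n in range(l + 1)]
    top = factorial(2 * l)
    total = 0
    for i in range(l + 1):              # i steps in +x and i steps in -x
        for j in range(l - i + 1):      # j steps in +y and j steps in -y
            k = l - i - j               # k steps in +z and k steps in -z
            total += top // (fact[i] ** 2 * fact[j] ** 2 * fact[k] ** 2)
    return total


def return_probability(l):
    """Fraction of all 6**(2*l) walks of 2*l steps that are closed."""
    return closed_walks(l) / 6 ** (2 * l)


def local_exponent(l):
    """Estimate of c in Omega(2l) ~ s**l / l**c, from the ratio at l and 2*l."""
    return log(return_probability(l) / return_probability(2 * l)) / log(2.0)


if __name__ == "__main__":
    for l in (5, 20, 50):
        print(l, local_exponent(l))
```

Dividing by $6^{2l}$ removes the non-universal factor $s^l = 36^l$, so that only the power law remains; the estimate approaches $3/2$ as $l$ grows.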
interpreted as a random walk of length $2l$ with the statistical weight

$$ \Lambda(2l) = B\, \frac{s^l}{l^{\bar{c}}} \tag{4.90} $$

with $B$ and $s$ being constants. In most cases the exponent $\bar{c}$ does not equal $c$ from Eq. 4.89. For instance, for a random walk on a cubic lattice one has obviously $\Lambda(2l) = 6^{2l}$, i.e., $B = 1$, $s = 36$ and $\bar{c} = 0$. For simplicity, we will set $A = B = 1$ from now on. In the following we work in the grandcanonical ensemble, the ensemble where the total chain length is allowed to fluctuate. As mentioned below Eq. 2.32, it is just a matter of convenience which ensemble one chooses. According to Eq. 2.30, the grandcanonical partition function is given by

$$ Z_G = \sum_{L=0}^{\infty} z^L Z_L \tag{4.91} $$
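where $Z_L$ is the canonical partition function of a chain of length $L$ and $z$ denotes the fugacity. We introduce now the functions $U(z)$, $V(z)$ and $Q(z)$:

$$ U(z) = \sum_{l=1}^{\infty} \Omega(2l)\, z^l = \sum_{l=1}^{\infty} \frac{s^l}{l^c}\, z^l = \Phi_c(zs), \tag{4.92} $$

$$ V(z) = \sum_{l=1}^{\infty} w^l z^l = \frac{wz}{1 - wz}, \tag{4.93} $$

and

$$ Q(z) = 1 + \sum_{l=1}^{\infty} \Lambda(2l)\, z^l = 1 + \sum_{l=1}^{\infty} \frac{s^l}{l^{\bar{c}}}\, z^l = 1 + \Phi_{\bar{c}}(zs). \tag{4.94} $$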
Here $\Phi_c$ denotes the so-called polylog function. As we shall see now, $Z_G$ can be expressed in terms of $U(z)$, $V(z)$ and $Q(z)$. We rearrange the summation in Eq. 4.91 such that we sum over $k$ where $k$ is the number of loops in a configuration. The rearranged sum is then given by

$$ Z_G = \left(1 + V(z)\right) Q(z) \sum_{k=0}^{\infty} \left[\left(\sum_{l=1}^{\infty} \Omega(2l)\, z^l\right)\left(\sum_{m=1}^{\infty} w^m z^m\right)\right]^k. \tag{4.95} $$
The first term accounts for the leftmost segment, which can either have a length 0 (the term 1) or lengths l ≥ 1 (the term V (z)).
[Figure 4.38: diagrams for k = 0, k = 1, k = 2, ..., each built from the factors 1 + V, U, V and Q.]

Figure 4.38 Graphic illustration of Eq. 4.95.
The second term, $Q(z)$, accounts for the different possible statistical weights at the other end. Finally, each term in the $k$-summation contains all possible lengths $l_1$ to $l_{2k}$ of $k$ bound stretches and $k$ loops. A graphic illustration of this summation is presented in Fig. 4.38. Each graph in that representation stands for all the conformations with $k$ loops. The grandcanonical partition function, Eq. 4.95, is a geometric series in the quantity $U(z) V(z)$:

$$ Z_G = \left(1 + V(z)\right) Q(z) \sum_{k=0}^{\infty} \left(U(z) V(z)\right)^k = \frac{\left(1 + V(z)\right) Q(z)}{1 - U(z) V(z)}. \tag{4.96} $$

The fugacity sets the average chain length through the relation Eq. 2.32, which here takes the form

$$ L = \langle L \rangle = \frac{\partial \ln Z_G}{\partial \alpha} = \frac{\partial \ln Z_G}{\partial \ln z}. \tag{4.97} $$

We are interested in the thermodynamic limit, $L \to \infty$, since only then can we hope to find a phase transition. For small $z$-values, $Z_G$ is finite, see Eq. 4.91. $Z_G$ grows with increasing $z$ and finally diverges, $Z_G \to \infty$, when $z$ approaches the fugacity $z^*$, $z \to z^*$. At that point, the slope of $\ln Z_G$ is infinite and according to Eq. 4.97 $L \to \infty$. The limit $Z_G \to \infty$ can either arise from the divergence of the numerator in Eq. 4.96, $\left(1 + V(z)\right) Q(z)$, or from the vanishing of the denominator, $1 - U(z) V(z)$. It turns out that the latter case is relevant here, namely $U(z^*) V(z^*) = 1$. By inserting Eq. 4.93 we arrive at the condition

$$ U(z^*) = \frac{1}{w z^*} - 1. \tag{4.98} $$

This is an implicit equation for the function $z^* = z^*(w)$. The experimentally measured quantity is the fraction $\theta_b$ of paired bases. The average number of paired bases is given by

$$ \langle m \rangle = w\, \frac{\partial \ln Z_G}{\partial w} = \frac{\partial \ln Z_G}{\partial \ln w}. \tag{4.99} $$
[Figure 4.39, panels (a) to (c): plots of $U$ and $1/V$ as functions of $z$, with $1/V$ drawn for several values of $w$ and the position $z = 1/s$ marked.]

Figure 4.39 Solving Eq. 4.98 graphically. Three qualitatively different scenarios are depicted: (a) no phase transition for $c \leq 1$ (here $c = 1/2$ and $s = 2$), (b) continuous phase transition for $1 < c \leq 2$ (here $c = 3/2$, $s = 6$) and (c) first-order phase transition for $c > 2$ (here $c = 2.115$, $s = 4$).
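You can see this immediately from Eq. 4.95 since $w\, \partial w^m/\partial w = m w^m$. Now we calculate the ratio of $\langle m \rangle$ and $\langle L \rangle$:

$$ \frac{\langle m \rangle}{\langle L \rangle} = \frac{\partial \ln Z_G/\partial \ln w}{\partial \ln Z_G/\partial \ln z} = \frac{w\, \partial Z_G/\partial w}{z\, \partial Z_G/\partial z}. \tag{4.100} $$

Inserting Eq. 4.96 into Eq. 4.100 with $V(z)$ replaced by its explicit form $wz/(1 - wz)$, Eq. 4.93, it is straightforward to show that

$$ \frac{\langle m \rangle}{\langle L \rangle} = \frac{Q(z)\left(1 + U(z)\right)}{Q'(z)\left(w^{-1} - z - U(z)\, z\right) + Q(z)\left(1 + U(z) + U'(z)\, z\right)}. \tag{4.101} $$

In the thermodynamic limit $L \to \infty$ we have to replace $z$ by $z^*$, Eq. 4.98. This dramatically simplifies Eq. 4.101:

$$ \theta_b = \lim_{L \to \infty} \frac{\langle m \rangle}{\langle L \rangle} = \frac{1}{1 + w\, U'(z^*)\, z^{*2}} = \frac{1}{1 + w\, \Phi_{c-1}(z^* s)\, z^*}. \tag{4.102} $$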
Thus the nature of the denaturation transition follows from the dependence of z∗ on w (or via Eq. 4.88 on the temperature). For instance, if z∗ changes smoothly with w, so will θb and there is no phase transition; if z∗ jumps at some w-value, so will θb and one finds a first-order phase transition. As we will now see, the nature of the transition depends on the value of the exponent c that characterizes the loop statistics. To determine the function z∗ = z∗ (w), Eq. 4.98 must be solved graphically by finding the intersection between the function U (z) and 1/V (z). This is depicted in Fig. 4.39 for the three different cases
[Figure 4.40: plot of $\theta_b$ (vertical axis, 0 to 1) versus $w$ (horizontal axis, 1 to 10) with curves for $c = 1/2$, $c = 3/2$, $c = 2.115$ and $c = 3$.]

Figure 4.40 Melting curves of an idealized DNA double helix for different values of $c$. Depicted is the fraction $\theta_b$ of paired bases as a function of $w = e^{-\beta E_0}$, see Eq. 4.102. The value $w = 1$ corresponds to infinite temperature, large $w$-values to low temperatures.
that we discussed above: (a) no phase transition, (b) continuous phase transition and (c) first-order phase transition. We start with the first scenario, which is found for sufficiently small values of $c$, namely $c \leq 1$. As an example we show in Fig. 4.39(a) the case $c = 1/2$ and $s = 2$. The red curve gives $U(z) = \Phi_c(zs)$ for the range 0 to $1/s$. When the argument of $U$ approaches $1/s$ the polylog function diverges smoothly (this is true for any $c \leq 1$). The three blue curves in Fig. 4.39(a) depict the function $1/V(z) = 1/(wz) - 1$ for three different values of $w$ as indicated at each curve. Specifically we choose $w = 2.7$, 1.6 and 1, the latter case corresponding to infinite temperature (see Eq. 4.88). The point of intersection between the blue curve for a given $w$-value and the red curve determines $z^*(w)$. As you can see from Fig. 4.39(a), $z^*$ moves smoothly with $w$ (and thus $T$) and saturates at its maximal value for the $w = 1$-curve. From $z^*(w)$ follows the fraction of bound pairs, $\theta_b = \theta_b(w)$, by inserting the numerically determined function $z^*(w)$ into Eq. 4.102. This leads to the blue curve in Fig. 4.40. As you can see, the curve starts at $\theta_b \approx 0.37$ for $w = 1$, i.e., for infinite temperature, and goes smoothly to 1 for $w \to \infty$, i.e., for going towards zero temperature. There is no jump in $\theta_b$ and thus no phase transition. Note that $\theta_b$ stays finite even at infinite temperature, and thus the two strands remain bound. We come back to that later.
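The graphical solution for the first scenario can be reproduced numerically. The Python sketch below is a minimal illustration under stated assumptions: the polylog function is evaluated by direct summation of its series (adequate for arguments safely below one), Eq. 4.98 is solved by bisection, and the result is inserted into Eq. 4.102. All function names are hypothetical, and the root search as written is only meant for $c \leq 1$, where $U$ diverges at $z = 1/s$ so that an intersection always exists.

```python
def polylog(c, x, tol=1e-16, max_terms=100000):
    """Truncated series Phi_c(x) = sum_{l>=1} x**l / l**c, intended for 0 <= x < 1."""
    total = 0.0
    power = 1.0
    for l in range(1, max_terms + 1):
        power *= x
        term = power / l ** c
        total += term
        if term <= tol * total:   # remaining terms are negligible
            break
    return total


def z_star(w, c, s, iterations=60):
    """Solve U(z) = 1/(w z) - 1, Eq. 4.98, by bisection on 0 < z < 1/s."""
    lo, hi = 0.0, 1.0 / s
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # g increases monotonically with z: U grows while 1/(w z) falls
        g = polylog(c, mid * s) - (1.0 / (w * mid) - 1.0)
        if g < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def theta_b(w, c, s):
    """Fraction of paired bases, Eq. 4.102, written with Phi_{c-1}."""
    z = z_star(w, c, s)
    return 1.0 / (1.0 + w * z * polylog(c - 1.0, z * s))


if __name__ == "__main__":
    for w in (1.0, 1.6, 2.7, 10.0):
        print(w, z_star(w, 0.5, 2.0), theta_b(w, 0.5, 2.0))
```

For $c = 1/2$, $s = 2$ and $w = 1$ the sketch gives a value of $\theta_b$ close to 0.37, the starting point of the $c = 1/2$ curve in Fig. 4.40, and $\theta_b$ increases smoothly with $w$.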
The second possible scenario occurs for $c$-values with $1 < c \leq 2$; an example, $c = 3/2$, $s = 6$, is shown in Fig. 4.39(b). In this case the function $U(z) = \Phi_c(zs)$ increases smoothly, approaching a finite value when it reaches $z = 1/s$; $U(z)$ is infinite for $z > 1/s$. As a result, $z^*$ increases smoothly with decreasing $w$ up to the point when $z^*$ reaches the value $1/s$; for the example this happens for $w \approx 1.66$. For smaller $w$-values $z^*$ remains equal to $1/s$. Another feature found for $c$-values in the range $1 < c \leq 2$ is that the slope of $U$ goes smoothly to infinity for $z \to 1/s$, $U'(z^* = 1/s) = \infty$. According to Eq. 4.102, this means that $\theta_b$ goes smoothly to zero, see the purple curve, $c = 3/2$, in Fig. 4.40. $\theta_b$ stays zero for smaller $w$-values as a more detailed discussion of Eq. 4.98 reveals (Kafri et al., 2002). There is clearly a phase transition, but since $\theta_b$ has no jump, this transition is continuous. Finally, we discuss the third scenario which occurs for $c$-values larger than two. Figure 4.39(c) provides an example, namely $c = 2.115$ and $s = 4$. In this case, as in the second case, the function $U(z)$ increases smoothly with $z$, reaching a finite value at $z = 1/s$ before it jumps to infinity. However, unlike in the second case, the function still has a finite slope at $z = 1/s$. This means that the fraction $\theta_b$ of bound pairs, Eq. 4.102, has a finite value at the transition point. $\theta_b$ is identically zero for smaller $w$-values, as a more detailed analysis of Eq. 4.98 shows (Kafri et al., 2002). The jump of $\theta_b$ between the low temperature and the high temperature phase is the hallmark of a first-order phase transition. We give two examples in Fig. 4.40, $c = 2.115$ and $c = 3$. The $c = 3$-curve shows a very clear jump whereas the value $c = 2.115$ is so close to $c = 2$ that in an experiment the melting curve could easily be confused with a continuous phase transition: with decreasing $w$-value the slope of $\theta_b$ goes to infinity before $\theta_b$ jumps to zero.
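For $c > 1$ the location of the transition follows from Eq. 4.98 evaluated at $z^* = 1/s$, where the polylog function reduces to the Riemann zeta function, $U(1/s) = \zeta(c)$, so that the critical weight is $w_c = s/(1 + \zeta(c))$. The Python sketch below evaluates this expression; the function names are hypothetical, and the zeta function is approximated by a partial sum plus an Euler–Maclaurin estimate of the tail, a numerical shortcut chosen here and not taken from the text.

```python
def zeta(c, n_terms=1000):
    """Riemann zeta function for c > 1.

    The sum over n < n_terms is taken directly; the remainder is estimated with
    the first terms of the Euler-Maclaurin formula.
    """
    N = n_terms
    partial = sum(n ** -c for n in range(1, N))
    tail = N ** (1.0 - c) / (c - 1.0) + 0.5 * N ** -c + c * N ** (-c - 1.0) / 12.0
    return partial + tail


def critical_weight(c, s):
    """Weight w at which z* reaches 1/s: zeta(c) = s/w - 1, from Eq. 4.98."""
    return s / (1.0 + zeta(c))


if __name__ == "__main__":
    print(critical_weight(1.5, 6.0))     # example of Fig. 4.39(b)
    print(critical_weight(2.115, 4.0))   # example of Fig. 4.39(c)
```

This gives $w_c \approx 1.66$ for $c = 3/2$, $s = 6$ and $w_c \approx 1.57$ for $c = 2.115$, $s = 4$. Via Eq. 4.88 the melting temperature then follows from $k_B T_M = |E_0|/\ln w_c$.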
We now come back to the question of why in the high temperature limit T → ∞ or equivalently w → 1 there is still a finite fraction of paired bases in some cases, e.g., θb ≈ 0.37 for c = 1/2, while in other cases θb ≡ 0, e.g., for c = 3/2, 2.115 and 3, see Fig. 4.40. We use a scaling argument which counts the number of configurations of a loop in the DNA molecule and estimates how that number changes when one closes a base pair in that loop. For simplicity, we set w = 1 so that the w-factors in Z G , Eq. 4.95, can
no longer compete with the $s$-factors (which appear in the $\Omega(2l)$'s). The behavior of the system is then dominated by entropy. Consider a large loop of length $l$. According to Eq. 4.89 it has a large number of configurations which scales as $s^l/l^c$. Now suppose you close one of the $l$ matching pairs of bases inside the loop, say the one in the middle. Then the large loop is divided into two smaller loops, each of size $l/2$. This double loop has now a different number of configurations which scales as $s^{l/2} s^{l/2} (2/l)^c (2/l)^c \sim s^l/l^{2c}$. An open $l$-loop has actually $l$ matching pairs of bases that can bind and thereby divide the loop into two. The number of configurations of double loops that are divided by any of those matching pairs therefore scales like $s^l/l^{2c-1}$. This means that the probability for closing of a large loop becomes negligible if $2c - 1 > c$. We therefore expect for $c > 1$ a high temperature phase where the two strands are unbound. On the other hand, for $c < 1$ a finite fraction of bases remains paired at an infinite temperature. The question about the nature of the melting transition of the DNA double helix thus boils down to the question: What is the value of $c$ that describes the statistics of the denaturation loops? We demonstrated below Eq. 4.89 that one has $c = 3/2$ if the loops are modeled by ideal chains. This would suggest that melting of an infinitely long DNA chain occurs via a continuous phase transition, see the purple curve in Fig. 4.40. With a similar line of argument you can easily convince yourself that $c = d/2$ corresponds to ideal loops in $d$ dimensions. This suggests that there is no phase transition for $d = 1$ and $d = 2$, continuous phase transitions for $d = 3$ and $d = 4$, and a first-order phase transition for $d > 4$. In Fig. 4.40 you can see that this is indeed true for the cases $d = 1$, 3, and 6. Obviously, changing the space dimension is of purely academic interest.
Figure 4.41 Configuration of a partially molten DNA double helix with denaturation loops shown in red and helical parts in black.

So far, however, we have considered ideal chains, which is not very realistic. Instead we should model our chain as an excluded volume chain and, as we will see now, this leads to a $c$-value greater than 3/2 in three dimensions. To estimate $c$ for an excluded volume loop, we model the loop as a self-avoiding walk on a $d$-dimensional cubic lattice that after $N$ steps returns to a site adjacent to the starting site, see Fig. 3.9 for a two-dimensional loop. The number of such loops was given before, Eq. 3.29. Here it takes the form

$$ \mathcal{N}_N(R = a) = \Omega(N) = s^N \left(\frac{a}{R_F}\right)^d \tag{4.103} $$

where $R_F = a N^{\nu}$, Eq. 3.24, with the Flory exponent $\nu = 3/(d + 2)$, Eq. 3.37. Comparison between Eq. 4.89 and Eq. 4.103 shows that now $c = \nu d$. For $d = 3$ we find $c = 9/5 = 1.8$, a value still below 2. This suggests that the DNA melting transition is still continuous, even if one takes excluded volume effects into account (Fisher, 1966). So far, however, we accounted only for excluded volume effects within a loop but neglected interactions of this loop with the rest of the chain. An example of the spatial arrangement of such a melting DNA helix with paired and molten stretches is shown in Fig. 4.41. It has been argued (Kafri et al., 2002) that the interaction of a loop with the rest of the DNA molecule leads to an effectively larger value of $c$ that was estimated to be on the order of $c = 2.115$, the value taken in Fig. 4.39(c). This would mean that the excluded volume interaction between all parts of the DNA would cause the melting transition to be first order, see the red curve in Fig. 4.40. As pointed
out in Ref. (Hanke and Metzler, 2003), in order for this effect to come into play, the helical stretches (the black ones in Fig. 4.41) need to be much longer than their persistence length, 50 nm; the DNA chains in experiments are too short to make this possible. In addition, as discussed earlier, we assume here an idealized situation where the pairing energies for the two types of matching bases are identical. This makes it hard, if not impossible, to resolve this issue experimentally.
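The different loop models discussed in this section can be summarized compactly. The Python sketch below (function names are hypothetical) maps the loop exponent $c$ onto the three scenarios of Fig. 4.39 and evaluates $c$ for ideal loops, $c = d/2$, and for self-avoiding loops in the Flory approximation, $c = \nu d = 3d/(d + 2)$.

```python
def transition_type(c):
    """Nature of the melting transition of the idealized model as a function of c."""
    if c <= 1.0:
        return "no phase transition"
    if c <= 2.0:
        return "continuous"
    return "first order"


def c_ideal_loop(d):
    """Loop exponent for ideal (random walk) loops in d dimensions."""
    return d / 2.0


def c_self_avoiding_loop(d):
    """Loop exponent c = nu * d with the Flory exponent nu = 3 / (d + 2)."""
    return 3.0 * d / (d + 2.0)


if __name__ == "__main__":
    for d in (1, 2, 3, 4, 6):
        print("ideal loop, d = %d: %s" % (d, transition_type(c_ideal_loop(d))))
    print("self-avoiding loop, d = 3:", c_self_avoiding_loop(3),
          transition_type(c_self_avoiding_loop(3)))
    print("loop interacting with the rest of the chain:", transition_type(2.115))
```

Only the last case, with the estimate $c \approx 2.115$ of Kafri et al. (2002), falls into the first-order regime; both the ideal and the isolated self-avoiding loop in three dimensions give a continuous transition.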
Problems

4.1 Zeros in the rigid base pair model
In Table 4.1 there are some entries where a quantity takes the value 0.0. Are there any cases where this is a coincidence or is there in each case a reason for this value to be exactly zero?

4.2 The nucleosomal sequence preferences
Reproduce Fig. 4.17. Write a small program (e.g., in Python) that uses the values from Table 4.1 and performs the necessary mathematical operations that are given by Eq. 4.14. Hints: You need to create 4 × 4-arrays for each position in the nucleosome, 146 in total. The elements in these arrays are the matrix elements of the transfer matrices, Eq. 4.13. The energies in the exponentials follow from Eqs. 4.8 to 4.10. Calculating the partition function (the denominator in Eq. 4.14) is a straightforward multiplication of 146 matrices, followed by a summation over the components of the resulting matrix. The numerator needs a little more care, as there are the special cases s = 1 and s = 146.

4.3 Kirchhoff kinetic analogy
Use the Kirchhoff kinetic analogy to write down the action functional of a symmetric spinning top in a gravity field.

4.4 Circle-line approximation
The shapes of the Euler elasticas are described by elliptic functions that are difficult to deal with. A useful approximation that typically deviates only 5% to 15% from the exact numerical result is the circle-line approximation. One replaces the exact shape by a set of straight lines and circles that are connected smoothly and minimizes the bending energy with respect to some free parameter. As an example consider one half of the lying figure 8 that Euler happened to call Fig. 8 (see his original drawing in Fig. 4.25). One obtains this shape when one bends a beam such that its two ends touch. From a numerical minimization one finds that the angle at the apex of the tip is given by 81.6°. Try to estimate this angle using a circle-line approximation. You can approximate the tear-shaped loop by two lines that touch at one end and are connected via a circular section at the other end. Assume that the total length of the two lines and the circle is fixed to L. Minimize the bending energy (only the circular part is bent with a constant curvature) with respect to the apex angle. This leads to a simple transcendental equation for the apex angle that you need to solve numerically.

4.5 Micromanipulation experiment
Suppose you perform a micromanipulation experiment with DNA. By fitting the force-extension curve to the wormlike chain model you deduce a 40.5 nm persistence length and a 1 μm contour length for your DNA molecule. Now bring the DNA in contact with a solution of identical DNA-binding proteins under conditions where all the proteins bind to the DNA practically irreversibly. These proteins induce fixed bends on the DNA with an opening angle α = 47° (see Fig. 4.30 for the definition of α). You know this value from a cocrystal structure between DNA and your protein. Pull again on the DNA, this time with the proteins bound to it. You find again the typical wormlike chain curve and the contour length is still 1 μm. But this time your fit produces a much smaller value for the persistence length, namely 17.5 nm. How many proteins are bound to your DNA molecule?

4.6 An exact approximation
Why can you replace in Eq. 4.57 sin θ by θ and still produce an exact result afterwards?
Chapter 5
Stochastic Processes
5.1 Introduction

Up to now, we restricted ourselves to the discussion of systems in equilibrium for which the framework of Chapter 2 can be applied. However, many processes in the cell are non-equilibrium processes. For example, consider the processes that take place within the central dogma of molecular biology, Fig. 1.1. Transcription and translation are both processes where a machine walks along a biopolymer (DNA or RNA) and reads out the sequence of bases in order to polymerize another biomolecule (RNA or protein). This is certainly not an equilibrium process that happens spontaneously without external input. In fact, like machines in our daily life, polymerases and ribosomes consume energy in order to perform their jobs. Another process that takes place in Fig. 1.1 is the folding of a protein. The transition from an unfolded configuration to a unique collapsed state does not require any external energy input, but it is not an equilibrium process: The unfolded state is energetically unfavorable and the folded state is energetically favorable. So protein folding has to be understood somewhat similarly to the downhill flow of water in a mountainous landscape. In
fact, one might go so far as to say that maintaining cells in a non-equilibrium state is right at the heart of being alive. Non-equilibrium processes are studied in many biophysical experiments. For instance, the protein folding process can be reversed by pulling on such a folded chain using a setup similar to the one discussed for DNA, see Fig. 4.34. If one pulls very slowly, the chain always has time to maintain equilibrium at every end-to-end distance imposed. Such a stretching experiment can then be described within the framework of equilibrium statistical physics. But often it is hard to perform an experiment slowly enough for simple practical reasons (e.g., a thermal drift in the setup, the degradation of molecules, etc.). Beyond such technicalities, it turns out that one can learn something qualitatively new in a non-equilibrium setup. For instance, by forcefully unfolding a molecule at different pulling speeds, one can extract information about its internal energetic structure that one would not be able to obtain from an equilibrium pulling experiment. In short, it is time for us to expand our physics toolbox to be able to deal also with non-equilibrium processes. This can be achieved by introducing the concept of stochastic processes (van Kampen, 1992). Many processes in nature feature a quantity that varies with time in a highly complicated and irregular way, e.g., the position of a small particle (pollen, fat droplet in milk, etc.), a so-called Brownian particle, that jiggles randomly around, reflecting the bombardment by the surrounding invisible small molecules. In many cases it is, nevertheless, possible to extract useful information by studying averaged features that vary in a regular way. For example, the force on a piston that is under bombardment of gas atoms varies rapidly but, averaged over small time intervals, appears to be a smooth function of the gas pressure and temperature.
This section provides a very general mathematical definition of such stochastic processes. In the subsequent sections we shall look at special cases. In Section 5.2 we introduce Markov processes for which we provide two mathematically equivalent formulations, the Chapman–Kolmogorov equation in Section 5.2 and the master equation in Section 5.3. A special type of master equation, the Fokker–Planck equation, is discussed in Section 5.4 which again has a mathematically equivalent formulation, the Langevin equation
Figure 5.1 Fokker–Planck or Langevin equations are special cases of Markov processes which themselves are a special class of stochastic processes. (Diagram labels: stochastic processes; Markov processes; Fokker–Planck equation = Langevin equation.)
introduced in Section 5.7. The relationship between these different types of processes is shown schematically in Fig. 5.1. Sections 5.5 and 5.6 present two applications of the Fokker–Planck equation, the escape over a barrier and dynamic force spectroscopy. Section 5.8 applies the Langevin formalism to polymer dynamics.
We start out from a stochastic variable, an object defined by a set of possible values $s$ (discrete or continuous) together with a probability distribution $p(s)$ over this set. As an example think about the possible outcomes of throwing a die, one to six pips, each with a probability 1/6 (cf. also Appendix A). In a similar fashion one can define a stochastic process as the set of possible processes in time that describe a particular physical system, each occurring with a certain probability. Each process is of the form $y_s(t)$, a function in time, called a realization of the process. Here $s$ labels the process, a stochastic variable with probability distribution $p(s)$. A stochastic process is thus an "ensemble" of the functions $y_s(t)$. At this point it is helpful to give a concrete example, namely the Brownian particle mentioned above, which is immersed in a solution of invisible solvent molecules. The trajectory of the particle is a random three-dimensional path induced by the collisions with the solvent molecules. Now $\mathbf{x}_s(t)$ is such a particular path, labelled by $s$; $\mathbf{x}_s(t)$ is an obvious generalization of the one-dimensional case,
Figure 5.2 Four trajectories of a Brownian particle. In each realization the particle starts at $\mathbf{x}_0$ at $t = 0$. Of the four example trajectories only the red trajectory, labelled $s = 3$, goes through $\mathbf{x}_1$ at $t = t_1$.
$y_s(t)$, introduced above. The stochastic process here is the set of all possible trajectories of the Brownian particle together with their probability distribution. Figure 5.2 shows a few example paths, all assumed to start at time $t = 0$ at the same point in space, $\mathbf{x}_0$. For simplicity, we label the curves with integers here, but in reality there is a continuous set of possible paths and therefore $s$ should be continuous. Note that it is far from obvious how one should actually label all the possible trajectories in a systematic way. But this need not concern us since we shall soon see an alternative, more accessible way to define such processes. Averages are defined in the following straightforward fashion:
$$\langle y(t) \rangle = \int y_s(t)\, p(s)\, ds \tag{5.1}$$
and
$$\langle y(t_1)\, y(t_2) \cdots y(t_n) \rangle = \int y_s(t_1)\, y_s(t_2) \cdots y_s(t_n)\, p(s)\, ds. \tag{5.2}$$
Of special interest is the so-called autocorrelation function:
$$\kappa(t_1, t_2) = \left\langle \left(y(t_1) - \langle y(t_1) \rangle\right) \left(y(t_2) - \langle y(t_2) \rangle\right) \right\rangle. \tag{5.3}$$
This function measures how much the process is correlated over time. For example, if one has κ (t1 , t2 ) > 0, one knows that when
$y_s(t_1)$ is larger than its average $\langle y(t_1) \rangle$, then $y_s(t_2)$ is also typically larger than its average $\langle y(t_2) \rangle$, and so on. If there exists a time interval $\tau_c$ such that $\kappa(t_1, t_2)$ is zero or negligibly small for $|t_2 - t_1| > \tau_c$, one calls $\tau_c$ the autocorrelation time. A stochastic process is called stationary when the moments are not affected by a shift in time,
$$\langle y(t_1 + \tau)\, y(t_2 + \tau) \cdots y(t_n + \tau) \rangle = \langle y(t_1)\, y(t_2) \cdots y(t_n) \rangle \tag{5.4}$$
for all $n$, $\tau$ and all $t_1, t_2, \ldots, t_n$.
A hierarchy of distribution functions can be constructed from a stochastic process. The probability density $P_1(y_1, t_1)$ that the process assumes the value $y_1$ at time $t_1$ is given by
$$P_1(y_1, t_1) = \int \delta\left(y_1 - y_s(t_1)\right) p(s)\, ds. \tag{5.5}$$
The integration in Eq. 5.5 goes over all realizations; the delta function then picks out all those realizations for which $y_s(t_1) = y_1$. Let us go back to the example of the Brownian particle shown in Fig. 5.2. Here the set of all possible trajectories together with their probabilities represents the stochastic process. Then $P_1(\mathbf{x}_1, t_1)$ gives the probability density that the trajectory passes through the point $\mathbf{x}_1$ at time $t_1$. In the limited set of four trajectories presented in that figure, there is one (shown in red) that fulfills this requirement. The joint probability density that the process has the value $y_1$ at $t_1$ and $y_2$ at $t_2$ up to $y_n$ at $t_n$ can be calculated in a similar fashion. It is given by
$$P_n(y_1, t_1; y_2, t_2; \ldots; y_n, t_n) = \int \delta\left(y_1 - y_s(t_1)\right) \delta\left(y_2 - y_s(t_2)\right) \cdots \delta\left(y_n - y_s(t_n)\right) p(s)\, ds. \tag{5.6}$$
From the infinite hierarchy of probability densities $P_n$ ($n = 1, 2, \ldots$) one can compute all averages, e.g.,
$$\langle y(t_1)\, y(t_2) \cdots y(t_n) \rangle = \int y_1 y_2 \cdots y_n\, P_n(y_1, t_1; y_2, t_2; \ldots; y_n, t_n)\, dy_1\, dy_2 \cdots dy_n. \tag{5.7}$$
It is thus not surprising that this infinite hierarchy defines the whole stochastic process. In other words, we have found an alternative formulation for a stochastic process, as promised above.
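The ensemble definitions of Eqs. 5.1, 5.3 and 5.5 can be made concrete with a small numerical sketch. It assumes a toy process that is not taken from the text: a symmetric random walk with unit steps, where each row of the array `y` plays the role of one realization $y_s(t)$. The names `ensemble_mean` and `autocorrelation` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy ensemble (not from the text): each row is one realization
# y_s(t) of a symmetric random walk with unit steps; column k holds the
# value after k + 1 steps.
n_real, n_steps = 20000, 200
steps = rng.choice([-1.0, 1.0], size=(n_real, n_steps))
y = np.cumsum(steps, axis=1)

def ensemble_mean(y, k):
    """Estimate <y(t)> (Eq. 5.1) by averaging over all realizations."""
    return y[:, k].mean()

def autocorrelation(y, k1, k2):
    """Estimate kappa(t1, t2) (Eq. 5.3) from the ensemble."""
    d1 = y[:, k1] - y[:, k1].mean()
    d2 = y[:, k2] - y[:, k2].mean()
    return (d1 * d2).mean()

k1, k2 = 49, 149                 # t1 = 50 steps, t2 = 150 steps
mean_t1 = ensemble_mean(y, k1)
kappa = autocorrelation(y, k1, k2)

# Discrete analog of P_1(y1, t1) (Eq. 5.5): the fraction of realizations
# that have the value y1 = 0 at time t1.
p1_at_zero = np.mean(y[:, k1] == 0.0)
```

For a walk with independent steps the autocorrelation equals the number of steps shared by both times, so `kappa` should come out close to 50 here, while `mean_t1` should be close to zero. Counting realizations in `p1_at_zero` is the discrete counterpart of the delta function in Eq. 5.5.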
Other important quantities that follow from the $P_n$'s are conditional probabilities. For example, $P_{1|1}(y_2, t_2 | y_1, t_1)$ is the probability density for the stochastic process to take the value $y_2$ at $t_2$ given that it had taken the value $y_1$ at an earlier time $t_1$. Hence
$$P_2(y_1, t_1; y_2, t_2) = P_{1|1}(y_2, t_2 | y_1, t_1)\, P_1(y_1, t_1). \tag{5.8}$$
More generally, one fixes the process at different times $t_1, \ldots, t_k$ and then asks for the joint probability at $l$ later times $t_{k+1}, t_{k+2}, \ldots, t_{k+l}$. Then
$$P_{l|k}(y_{k+1}, t_{k+1}; \ldots; y_{k+l}, t_{k+l} | y_1, t_1; \ldots; y_k, t_k) = \frac{P_{k+l}(y_1, t_1; \ldots; y_k, t_k; y_{k+1}, t_{k+1}; \ldots; y_{k+l}, t_{k+l})}{P_k(y_1, t_1; \ldots; y_k, t_k)}. \tag{5.9}$$
5.2 Markov Processes
We just learned that we can define a stochastic process by giving the infinite hierarchy of joint probability densities $P_n$. This does not seem very practical since one has to provide an infinite number of functions in order to define something. Note, however, that the situation is not quite as dramatic since these functions are not completely independent of each other. In fact, if one chooses a $P_n$, then all probability densities $P_k$ with $k < n$ are set immediately. This can easily be understood as follows. Consider, for instance, the case $n = 3$. Then obviously one has
$$P_2(y_1, t_1; y_3, t_3) = \int P_3(y_1, t_1; y_2, t_2; y_3, t_3)\, dy_2, \tag{5.10}$$
a relation that follows from the definition of the joint probability densities, Eq. 5.6. But even if you defined a $P_n$ with a very large $n$, you would still be left with an infinite set of undefined functions, namely all $P_m$'s with $m > n$. It is customary to cut the infinite tail of this monster right after $P_2$, reducing the large zoo of stochastic processes to that of the Markov processes, cf. Fig. 5.1. A Markov process is defined as a stochastic process with the property that for any set of $n$ successive times $t_1 < t_2 < \ldots < t_n$ one has
$$P_{1|n-1}(y_n, t_n | y_1, t_1; \ldots; y_{n-1}, t_{n-1}) = P_{1|1}(y_n, t_n | y_{n-1}, t_{n-1}). \tag{5.11}$$
Figure 5.3 Motion of a Brownian particle. (a) The full trajectory is continuous but not smooth since the velocity makes small jumps when the particle is hit by a small solvent molecule. (b) On a more coarse-grained time scale (relevant for experiments) the particle appears to make random jumps. In case (a) the velocity of the particle is a Markov process, in case (b) its position.
This means that the conditional probability at $t_n$, given the value $y_{n-1}$ at $t_{n-1}$, is uniquely determined by that value. It is not affected by the knowledge of values at earlier times. $P_{1|1}$ is called the transition probability. At this point it is helpful to discuss a concrete example where Markovian and non-Markovian properties can be illustrated: the motion of a Brownian particle. When observing a small particle under a microscope, one finds that the particle is performing an incessant random motion. This is the result of random collisions with the much smaller molecules of the surrounding fluid. Each collision leads to a small jump in the velocity $\upsilon$ of the particle. When the particle moves in a certain direction, it suffers on average more collisions in front than from behind. This means that the change in velocity $d\upsilon$ in a short time $dt$ depends on the current velocity but not on earlier values of the velocity. In other words, the velocity of a Brownian particle is a Markov process. On the other hand, its position does not obey the Markov requirement: knowing two recent positions allows one to guess the particle's current direction of motion and thus gives a hint where the particle is most likely to be found next.
However, as it turns out, this is not what is observed experimentally. What one sees is a much more coarse-grained version of that movement. Between two experimentally distinguishable positions, the particle has changed its direction of motion many times. A schematic sketch of a full trajectory of a Brownian particle and its experimentally observed coarse-grained version is provided in Figs. 5.3(a) and 5.3(b). In more mathematical terms: the velocity autocorrelation time is much smaller than the time interval between two observations. What one really sees is the net displacement of the particle as the result of many collisions with solvent molecules. If one has recorded the particle positions at many previous times and if one wants to estimate the following position, it is completely sufficient to only consider the most recent time. To conclude, on the experimentally relevant coarse-grained time scale the motion of a Brownian particle is a Markov process in its position, even though at a more microscopic time scale it is not. The Markov property can therefore be quite subtle.
A Markov process is fully determined by the two functions $P_1(y_1, t_1)$ and $P_{1|1}(y_2, t_2 | y_1, t_1)$. From these two functions one can construct the whole hierarchy of distribution functions $P_n$. For example, $P_3(y_1, t_1; y_2, t_2; y_3, t_3)$ with $t_1 < t_2 < t_3$ can be rewritten as
$$P_3(y_1, t_1; y_2, t_2; y_3, t_3) = P_2(y_1, t_1; y_2, t_2)\, P_{1|2}(y_3, t_3 | y_1, t_1; y_2, t_2) = P_1(y_1, t_1)\, P_{1|1}(y_2, t_2 | y_1, t_1)\, P_{1|1}(y_3, t_3 | y_2, t_2). \tag{5.12}$$
In the second step we used the Markov property, Eq. 5.11, to reduce $P_{1|2}$ to $P_{1|1}$. From Eq. 5.12 one can construct a relation that the transition probability must obey. First, by integrating Eq. 5.12 over $y_2$, one obtains for $t_1 < t_2 < t_3$:
$$P_2(y_1, t_1; y_3, t_3) = P_1(y_1, t_1) \int P_{1|1}(y_2, t_2 | y_1, t_1)\, P_{1|1}(y_3, t_3 | y_2, t_2)\, dy_2 \tag{5.13}$$
where we used Eq. 5.10 on the lhs. Then by dividing both sides by $P_1(y_1, t_1)$ we arrive at
$$P_{1|1}(y_3, t_3 | y_1, t_1) = \int P_{1|1}(y_2, t_2 | y_1, t_1)\, P_{1|1}(y_3, t_3 | y_2, t_2)\, dy_2. \tag{5.14}$$
Figure 5.4 Graphical representation of the Chapman–Kolmogorov equation, Eq. 5.14, which relates the transition probability $P_{1|1}(y_3, t_3 | y_1, t_1)$ to go from the start position $y_1$ at $t_1$ to the end position $y_3$ at $t_3$ to the transition probabilities to go from the start position to any position at $t_2$ and then from there to the end position.
This is the Chapman–Kolmogorov equation. The transition probability of any Markov process must fulfill this equation. It essentially says that the transition probability to go from the start position $y_1$ at $t_1$ to the end position $y_3$ at $t_3$ must equal the transition probability of going from start to finish via any position $y_2$ at a given intermediate time $t_2$. This statement is graphically represented in Fig. 5.4. As mentioned above, a Markov process is fully determined by $P_1$ and $P_{1|1}$. These functions cannot be chosen arbitrarily but must obey Eq. 5.14 and the relation
$$P_1(y_3, t_3) = \int P_{1|1}(y_3, t_3 | y_2, t_2)\, P_1(y_2, t_2)\, dy_2. \tag{5.15}$$
The latter relation follows from integrating Eq. 5.13 over $y_1$. Any two non-negative functions $P_1$ and $P_{1|1}$ that obey these two conditions uniquely define a Markov process. If $P_{1|1}$ does not depend on two times but only on the time interval, one can use a more compact notation for the transition probabilities:
$$P_{1|1}(y_2, t_2 | y_1, t_1) = T_\tau(y_2 | y_1) \tag{5.16}$$
with $\tau = t_2 - t_1$. The Chapman–Kolmogorov equation then reads
$$T_{\tau + \tau'}(y_3 | y_1) = \int T_{\tau'}(y_3 | y_2)\, T_\tau(y_2 | y_1)\, dy_2 \tag{5.17}$$
with $\tau' = t_3 - t_2$. For $-\infty < y < \infty$ the Chapman–Kolmogorov equation is obeyed by the transition probability
$$T_\tau(y_2 | y_1) = \frac{1}{\sqrt{2\pi\tau}} \exp\left(-\frac{(y_2 - y_1)^2}{2\tau}\right). \tag{5.18}$$
If one chooses $P_1(y, 0) = \delta(y)$, a non-stationary Markov process is defined, called the Wiener process or Wiener–Lévy process. It is usually considered only for $t > 0$ and was originally invented to describe the stochastic behavior of the position of a Brownian particle, as will become clear in Section 5.4. The probability density for $t > 0$ then follows from Eq. 5.15:
$$P_1(y, t) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{y^2}{2t}\right). \tag{5.19}$$
Equilibrium fluctuations can be described by stationary Markov processes where $P_1$ is independent of time and $P_{1|1}$ does not depend on two times but only on the time interval, see Eq. 5.16. The best known example of a stationary Markov process is the Ornstein–Uhlenbeck process that was originally constructed to describe the velocity of a Brownian particle, as we shall see in Section 5.4. It is defined by
$$P_1(y_1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} y_1^2} \tag{5.20}$$
and
$$T_\tau(y_2 | y_1) = \frac{1}{\sqrt{2\pi\left(1 - e^{-2\tau}\right)}} \exp\left(-\frac{\left(y_2 - y_1 e^{-\tau}\right)^2}{2\left(1 - e^{-2\tau}\right)}\right). \tag{5.21}$$
The Ornstein–Uhlenbeck process has zero average and using Eq. 5.7 one finds
$$\kappa(t_1, t_2) = \langle y(t_1)\, y(t_2) \rangle = e^{-\tau} \tag{5.22}$$
for the autocorrelation function, Eq. 5.3.
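That the Gaussian kernel of Eq. 5.18 satisfies the Chapman–Kolmogorov equation, Eq. 5.17, can be checked numerically. The following is a minimal sketch: the integral over the intermediate position $y_2$ is replaced by a simple Riemann sum, and the values chosen for $\tau$, $\tau'$, $y_1$ and $y_3$, as well as the integration window and grid size, are arbitrary assumptions made for illustration.

```python
import numpy as np

def T(tau, y2, y1):
    """Transition probability of the Wiener process, Eq. 5.18."""
    return np.exp(-(y2 - y1) ** 2 / (2.0 * tau)) / np.sqrt(2.0 * np.pi * tau)

def ck_rhs(tau, tau_p, y3, y1, half_width=30.0, n=6001):
    """Right-hand side of Eq. 5.17: sum over the intermediate position y2."""
    y2 = np.linspace(-half_width, half_width, n)
    dy = y2[1] - y2[0]
    return np.sum(T(tau_p, y3, y2) * T(tau, y2, y1)) * dy

# Arbitrary illustrative values (assumptions, not from the text)
tau, tau_p, y1, y3 = 0.7, 1.3, -0.4, 1.1

lhs = T(tau + tau_p, y3, y1)       # direct transition over tau + tau'
rhs = ck_rhs(tau, tau_p, y3, y1)   # transition via all intermediate y2
```

The two numbers `lhs` and `rhs` should agree to high precision. The same test can be repeated with the Ornstein–Uhlenbeck kernel of Eq. 5.21 by swapping out the function `T`.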
5.3 Master Equation
The master equation is an equivalent form of the Chapman–Kolmogorov equation, Eq. 5.14. It is easier to handle and more directly related to physical concepts. For the sake of convenience of notation, we assume here a process where the transition probability depends only on the time interval $\tau$. The master equation is a differential equation obtained by going to the limit of a vanishing time interval. We therefore begin by writing down the short-time behavior of the transition probability of a Markov process to leading order in $\tau$:
$$T_\tau(y_2 | y_1) = \left(1 - a_0(y_1)\, \tau\right) \delta(y_2 - y_1) + \tau\, W(y_2 | y_1). \tag{5.23}$$
The transition probability is the sum of two terms. The first term on the rhs of Eq. 5.23 corresponds to the case that nothing happens during the very short time interval $\tau$, i.e., the system stays in state $y_1$. The second term accounts for cases where the system has jumped to a state $y_2 \neq y_1$. Hence $W(y_2 | y_1)$ denotes the transition probability per unit time to go from $y_1$ to $y_2$, with $W(y_2 | y_1) \geq 0$. The coefficient $1 - a_0(y_1)\, \tau$ in front of the delta function is the probability that no transition takes place during $\tau$. $a_0(y_1)$ is therefore the total transition rate per unit time to go from $y_1$ anywhere else, i.e.,
$$a_0(y_1) = \int W(y' | y_1)\, dy'. \tag{5.24}$$
This choice of $a_0$ is indeed consistent since then $\int T_\tau(y_2 | y_1)\, dy_2 = 1$ for any given value of $\tau$. Note that for $\tau = 0$ there is no time for a transition and Eq. 5.23 actually gives $T_0(y_2 | y_1) = \delta(y_2 - y_1)$.
The master equation can be derived in two steps. First we replace the transition probability $T_{\tau'}(y_3 | y_2)$ in the Chapman–Kolmogorov equation, Eq. 5.17, by its short-time form, Eq. 5.23 with $\tau$ replaced by $\tau'$:
$$T_{\tau + \tau'}(y_3 | y_1) = \left(1 - a_0(y_3)\, \tau'\right) T_\tau(y_3 | y_1) + \tau' \int W(y_3 | y_2)\, T_\tau(y_2 | y_1)\, dy_2. \tag{5.25}$$
Now subtract $T_\tau(y_3 | y_1)$ on both sides, divide by $\tau'$, go to the limit $\tau' \to 0$ and use Eq. 5.24:
$$\frac{\partial}{\partial \tau} T_\tau(y_3 | y_1) = \int \left\{ W(y_3 | y_2)\, T_\tau(y_2 | y_1) - W(y_2 | y_3)\, T_\tau(y_3 | y_1) \right\} dy_2. \tag{5.26}$$
Figure 5.5 Graphical representation of the master equation for a discrete set of states, Eq. 5.28. The blue state in the center position changes its probability as the result of the in- and outfluxes from three other states. Only the transitions from and to the blue state are depicted (each arrow in the figure is labelled with its rate $W$).
We have thus achieved our goal to find a differential form of the Chapman–Kolmogorov equation. This equation is called the master equation. To arrive at a more intuitive relation, pick a time $t_1$ and then consider times $t \geq t_1$. Multiply both sides of Eq. 5.26 with $P_1(y_1, t_1)$ and integrate over $y_1$. That way the transition probabilities $T_\tau(y_i | y_1)$ turn into probability densities $P_1(y_i, t_1 + \tau)$. Finally rename $t_1 + \tau = t$, $y_3 = y$ and $y_2 = y'$. This leads to
$$\frac{\partial P(y, t)}{\partial t} = \int \left\{ W(y | y')\, P(y', t) - W(y' | y)\, P(y, t) \right\} dy' \tag{5.27}$$
where we simplified the notation to $P(y, t) = P_1(y, t)$. The meaning of Eq. 5.27 becomes clear when we look at the case where the range of $y$ is a discrete set of states, labelled by $n$:
$$\frac{dp_n(t)}{dt} = \sum_m \left\{ W_{nm}\, p_m(t) - W_{mn}\, p_n(t) \right\}. \tag{5.28}$$
Here $p_n(t)$ is the probability that the system is in state $n$ at time $t$ and $W_{nm}$ is the transition rate per unit time to go from state $m$ to state $n$. The lhs of Eq. 5.28 represents the rate with which $p_n(t)$ changes in time. One part of the rhs accounts for all the events where the system goes from state $m$ to state $n$ (gains of state $n$) leading to terms of the form $W_{nm}\, p_m(t)$. The other part accounts for events where the system leaves state $n$ (losses of state $n$) through terms
of the form $-W_{mn}\, p_n(t)$. A graphical representation of Eq. 5.28 with four states is shown in Fig. 5.5. To conclude, the master equation, Eq. 5.28, is simply a gain-loss equation for probabilities of separate states $n$. Equation 5.27 can be interpreted in the same way, only that there is a continuum of states.
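The gain-loss structure of Eq. 5.28 translates directly into a few lines of code. The sketch below assumes a hypothetical four-state system with made-up rates `W[n, m]` for jumps from state `m` to state `n`, and integrates the master equation with a plain explicit Euler step. Both the rate values and the choice of integrator are assumptions made for illustration.

```python
import numpy as np

# Hypothetical rates W[n, m] for jumps m -> n among four states
# (illustrative values, not from the text; the diagonal is unused).
W = np.array([[0.0, 1.0, 0.5, 0.2],
              [2.0, 0.0, 0.3, 0.0],
              [0.5, 0.4, 0.0, 1.0],
              [0.1, 0.0, 0.7, 0.0]])

def dp_dt(p, W):
    """Right-hand side of Eq. 5.28: gains minus losses for every state n."""
    gain = W @ p                   # sum over m of W_nm p_m
    loss = W.sum(axis=0) * p       # p_n times the sum over m of W_mn
    return gain - loss

def integrate(p0, W, dt=1e-3, n_steps=30000):
    """Explicit Euler integration (simple, but only first-order accurate)."""
    p = np.array(p0, dtype=float)
    for _ in range(n_steps):
        p = p + dt * dp_dt(p, W)
    return p

# Start with the system in state 0 with certainty
p_final = integrate([1.0, 0.0, 0.0, 0.0], W)
```

Because every loss term of one state appears as a gain term of another, the total probability `p_final.sum()` stays equal to one throughout the integration. After a sufficiently long time the gains and losses of each state balance and `dp_dt(p_final, W)` approaches zero.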
5.4 Fokker–Planck Equation
The Fokker–Planck equation is a special type of the master equation 5.27 in which the transition probability per unit time, $W$, is of such a form that it effectively acts as a differential operator of second order:
$$\frac{\partial P(y, t)}{\partial t} = -\frac{\partial}{\partial y} \left( A(y)\, P(y, t) \right) + \frac{1}{2} \frac{\partial^2}{\partial y^2} \left( B(y)\, P(y, t) \right). \tag{5.29}$$
The coefficients $A(y)$ and $B(y)$ may be any differentiable functions with the only restriction $B(y) > 0$. The range of $y$ needs to be continuous and we assume in the following $-\infty < y < \infty$. Equation 5.29 is also called the Smoluchowski equation or generalized diffusion equation. The first term on the rhs is sometimes called the transport term, convection term, or drift term, the second term the diffusion term or fluctuation term. The Fokker–Planck equation 5.29 can be considered as an approximation to any Markov process whose individual jumps are small (van Kampen, 1992). Max Planck derived it as an approximation to the master equation 5.27 in the following way. Express the transition probability per unit time, $W$, as a function of the jump size $r$ and the starting point $y'$:
$$W(y | y') = W(y'; r) \tag{5.30}$$
with $r = y - y'$. Then Eq. 5.27 takes the form
$$\frac{\partial P(y, t)}{\partial t} = \int_{-\infty}^{\infty} W(y - r; r)\, P(y - r, t)\, dr - P(y, t) \int_{-\infty}^{\infty} W(y; -r)\, dr. \tag{5.31}$$
We assume now that individual jumps are small, which means that $W(y'; r)$ is a sharply peaked function of $r$. Also let us require that $W(y'; r)$ is a slowly varying function of $y'$. More specifically, let us assume that there exists a $\delta > 0$ such that
$$W(y'; r) \approx 0 \quad \text{for } |r| > \delta, \qquad W(y' + \Delta y; r) \approx W(y'; r) \quad \text{for } |\Delta y| < \delta. \tag{5.32}$$
In addition we need to assume that the solution of interest, $P(y, t)$, also varies slowly with $y$ in the same sense as $W$ does. This allows us to perform a shift from $y - r$ to $y$ of the integrand, $W(y - r; r)\, P(y - r, t)$, in the first integral on the rhs of Eq. 5.31. That is achieved by means of a Taylor expansion up to second order:
$$\frac{\partial P(y, t)}{\partial t} = \int_{-\infty}^{\infty} W(y; r)\, P(y, t)\, dr - \int_{-\infty}^{\infty} r\, \frac{\partial}{\partial y} \left\{ W(y; r)\, P(y, t) \right\} dr + \frac{1}{2} \int_{-\infty}^{\infty} r^2\, \frac{\partial^2}{\partial y^2} \left\{ W(y; r)\, P(y, t) \right\} dr - P(y, t) \int_{-\infty}^{\infty} W(y; -r)\, dr. \tag{5.33}$$
The first and fourth term on the rhs cancel and we are left with
$$\frac{\partial P(y, t)}{\partial t} = -\frac{\partial}{\partial y} \left\{ a_1(y)\, P(y, t) \right\} + \frac{1}{2} \frac{\partial^2}{\partial y^2} \left\{ a_2(y)\, P(y, t) \right\} \tag{5.34}$$
where we introduced the jump moments
$$a_\nu(y) = \int_{-\infty}^{\infty} r^\nu\, W(y; r)\, dr. \tag{5.35}$$
Therefore, we have derived the Fokker–Planck equation, Eq. 5.29, from the master equation, Eq. 5.27. In this way we learned that the functions $A(y)$ and $B(y)$ can be interpreted as the first and second jump moments. The Fokker–Planck equation 5.29 can be divided into two equations: (i) the continuity equation for the probability density
$$\frac{\partial P(y, t)}{\partial t} = -\frac{\partial J(y, t)}{\partial y} \tag{5.36}$$
where $J(y, t)$ is the probability flux and (ii) a constitutive equation for the probability flux:
$$J(y, t) = A(y)\, P(y, t) - \frac{1}{2} \frac{\partial}{\partial y} \left\{ B(y)\, P(y, t) \right\}. \tag{5.37}$$
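To see how the jump moments of Eq. 5.35 turn a microscopic jump rule into the coefficients $A$ and $B$ of Eq. 5.29, consider the following sketch. It assumes a jump kernel that is not taken from the text: jumps occur at a total rate `rate`, with Gaussian-distributed sizes of mean `mu` and standard deviation `sigma`, independent of the starting point $y$. The helper `jump_moment` is a hypothetical name, and the integral is approximated by a Riemann sum.

```python
import numpy as np

def jump_moment(nu, W_of_r, r):
    """a_nu = integral of r**nu * W(y; r) dr (Eq. 5.35), as a Riemann sum."""
    dr = r[1] - r[0]
    return np.sum(r ** nu * W_of_r) * dr

# Assumed jump kernel (illustrative, not from the text): total jump rate
# `rate`, Gaussian jump sizes with mean `mu` and standard deviation `sigma`.
rate, mu, sigma = 5.0, -0.02, 0.1
r = np.linspace(-2.0, 2.0, 40001)
W_r = rate * np.exp(-(r - mu) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)

A = jump_moment(1, W_r, r)   # drift coefficient A = a_1
B = jump_moment(2, W_r, r)   # diffusion coefficient B = a_2
```

For this kernel the moments are known in closed form, `A = rate * mu` and `B = rate * (sigma**2 + mu**2)`, which gives a simple cross-check of the numerical integration. A slightly biased jump distribution thus produces a drift term, while the spread of the jumps sets the diffusion term.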
Figure 5.6 Schematic illustration of the one-dimensional continuity equation for the probability density $P(y, t)$. (a) If the influx and outflux at $y$ are the same, $J(y - dy) = J(y + dy)$, $P(y, t)$ stays constant in time. In cases (b) and (c) there is an imbalance of the fluxes, $J(y - dy) \neq J(y + dy)$, and hence $P(y, t)$ grows or shrinks in time.
The continuity equation for the probability density, Eq. 5.36, has a simple interpretation. It ensures that the probability is a conserved quantity such that the system is always somewhere with probability one, $\int P(y, t)\, dy = 1$. If we look at the probability density to be at state $y$, then $P(y, t)$ does not change if the probability flux into $y$ equals the flux out of $y$, $J(y - dy) = J(y + dy)$ for a small value of $dy$, i.e., $\partial J(y, t)/\partial y = 0$, as illustrated in Fig. 5.6(a). If on the other hand there are on average more transitions into $y$ than out of $y$, i.e., if $\partial J/\partial y < 0$, then $P(y, t)$ increases with $t$, Fig. 5.6(b). Finally, if $\partial J/\partial y > 0$ the probability $P(y, t)$ decreases with $t$, Fig. 5.6(c). The continuity equation for the probability density is analogous to a corresponding continuity equation for the flow $J$ of a compressible fluid with $P$ being its mass density. Note that Fig. 5.6 is reminiscent of Fig. 5.5 which illustrates the master equation. In fact, the Fokker–Planck equation was derived above from the master equation for the case that only small jumps occur. We still need to figure out how to calculate $J(y, t)$. This follows from the constitutive equation, Eq. 5.37, which depends on the system under consideration. We shall now do this for the case of the Brownian particle. The motion of a Brownian particle can be described, on a coarse-grained time scale, as a Markov process in
its position $y(t)$, assumed here to be one-dimensional. The particle makes random jumps back and forth on the $Y$-axis (cf. Fig. 5.3(b) for a depiction of a two-dimensional version of this motion). Jumps of any length $\Delta y$ may occur but the probability of large jumps falls off rapidly. The jumps are symmetric, $W(y; \Delta y) = W(y; -\Delta y)$, and independent of the starting point, i.e., $W(y; \Delta y) = W(y'; \Delta y)$ for any $y' \neq y$. We thus expect that the first and second jump moments are of the form
$$a_1(y) = a_1 = \frac{\langle \Delta y \rangle}{\Delta t} = 0, \qquad a_2(y) = a_2 = \frac{\langle (\Delta y)^2 \rangle}{\Delta t} = \text{const.} \tag{5.38}$$
where $\Delta t$ denotes some small time interval, set, e.g., by the time between two observations under the microscope. The Fokker–Planck equation, Eq. 5.34, thus takes the form
$$\frac{\partial P(y, t)}{\partial t} = \frac{a_2}{2} \frac{\partial^2 P(y, t)}{\partial y^2}. \tag{5.39}$$
What we derived here is nothing but the well-known diffusion equation
$$\frac{\partial P(y, t)}{\partial t} = D\, \frac{\partial^2 P(y, t)}{\partial y^2}. \tag{5.40}$$
The diffusion constant $D$, a phenomenological quantity, attains here a precise microscopic interpretation,
$$D = \frac{a_2}{2} = \frac{\langle (\Delta y)^2 \rangle}{2 \Delta t}, \tag{5.41}$$
connecting the macroscopic constant $D$ to the microscopic jumps of the particles. Consider an ensemble of independent Brownian particles that all start at $y = 0$ at $t = 0$. Their positions at $t \geq 0$ constitute a Markov process with a density distribution
$$P(y, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{y^2}{4 D t}\right). \tag{5.42}$$
This density distribution obeys the diffusion equation, Eq. 5.40, is normalized to one, Eq. A.9, and is sharply peaked around $y = 0$ for $t \to 0$, i.e., $P(y, 0) = \delta(y)$. This corresponds to the Wiener process defined in Eq. 5.18 which, after replacing $t$ by $2Dt$, has indeed the same density distribution, see Eqs. 5.19 and 5.42. Using properties
of the Gaussian distribution, Eqs. A.10 and A.12, we find for the first and second moment:
$$\langle y(t) \rangle = 0, \qquad \langle y^2(t) \rangle = 2 D t. \tag{5.43}$$
The trajectory of a Brownian particle is an example of a random walk. As discussed in Chapter 3, there is an analogy between trajectories of random walks and conformations of ideal polymer chains whose mean-squared end-to-end distance is given by $\langle R^2 \rangle = a^2 N$, see Eqs. 3.3 and 3.13. Here the mapping between polymers and the Wiener process goes as follows: $\langle R^2 \rangle \leftrightarrow \langle y^2(t) \rangle$, $N \leftrightarrow t/\Delta t$ and $a^2 \leftrightarrow \langle (\Delta y)^2 \rangle = 2 D \Delta t$.
Next, consider the same Brownian particles subjected to a gravitational force $mg$ in the direction $-Y$. In that case the force induces a non-vanishing average velocity $\langle \Delta y \rangle / \Delta t$, which in the usual condition of weak force is linear in the force:
$$a_1 = \frac{\langle \Delta y \rangle}{\Delta t} = -\frac{mg}{\zeta}. \tag{5.44}$$
The constant $\zeta$ is called the friction constant of the particle. This quantity can be obtained from hydrodynamics, e.g., for a rigid sphere of radius $a$ the friction constant is given by the so-called Stokes' law
$$\zeta = 6\pi\eta a \tag{5.45}$$
with $\eta$ denoting the viscosity of the solvent; for water $\eta \approx 10^{-3}$ Pa s. The particles attain an average speed $-mg/\zeta$ superimposed on the Brownian motion whose second jump moment is, as before, given by $a_2 = 2D$. The resulting Fokker–Planck equation, Eq. 5.34, is now of the form
$$\frac{\partial P(y, t)}{\partial t} = \frac{mg}{\zeta} \frac{\partial P(y, t)}{\partial y} + D\, \frac{\partial^2 P(y, t)}{\partial y^2}. \tag{5.46}$$
Let us search for a stationary solution, $\partial P/\partial t \equiv 0$, of Eq. 5.46, i.e., a solution to
$$\frac{\partial P(y)}{\partial y} = -\frac{mg}{\zeta D}\, P(y). \tag{5.47}$$
If $y$ is allowed in the range $-\infty < y < \infty$, there cannot be a stationary solution since particles would continuously fall in the negative $Y$-direction. But in our discussion of equilibrium statistical physics in Chapter 2 we learned that there exists a stationary
solution for this problem if we assume a reflecting bottom at, say, $y = 0$. For that case we found the barometric formula, Eq. 2.41, that is here of the form
$$P(y) = \frac{mg}{k_B T}\, e^{-\frac{mgy}{k_B T}}. \tag{5.48}$$
This satisfies Eq. 5.47 if one chooses $D$ as follows:
$$D = \frac{k_B T}{\zeta}. \tag{5.49}$$
This is the famous Einstein relation. It relates the quantity $D$ that characterizes the thermal motion to the quantity $\zeta$ that specifies the response to an external force. The Einstein relation is a special case of a more general theorem, called the fluctuation-dissipation theorem, which relates the spontaneous thermal fluctuations of a system to its response to an external perturbation. We learned above that any Fokker–Planck equation can be broken up into two equations, the continuity equation, Eq. 5.36, and a constitutive equation, Eq. 5.37. For an ensemble of independent Brownian particles in a gravity field, Eq. 5.46, the constitutive equation takes the form
$$J(y, t) = -\frac{mg}{\zeta}\, P(y, t) - D\, \frac{\partial P(y, t)}{\partial y}. \tag{5.50}$$
The probability flux is the sum of two terms. Let us start with the second term which is proportional to the spatial gradient of the concentration of particles; the proportionality constant is given by D. This is called Fick’s law and its interpretation is depicted in Fig. 5.7(a). In this figure the particles move randomly up and down along the Y -direction. The higher concentration at smaller Y -values results in a net flux of particles from the bottom to the top. The first term in Eq. 5.50 simply states that in the presence of an external field, here the gravity field, there is an additional flux P υ of particles where υ is their average speed, υ = −mg/ζ , see Eq. 5.44. Here the force and thus the average speed is constant but the flux is not since towards the bottom one has a higher concentration and thus a higher flux, as shown in Fig. 5.7(b). When the concentration profile is given by the barometric formula, Eq. 5.48, these two fluxes cancel, J ≡ 0, and according to Eq. 5.36 one has a stationary state.
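The Einstein relation and the cancellation of the two fluxes can be illustrated with numbers. The sketch below combines Stokes' law, Eq. 5.45, with Eq. 5.49 for a hypothetical sphere in water and then evaluates both terms of Eq. 5.50 for the barometric profile, Eq. 5.48, on a grid. The temperature, radius and effective mass are assumed values chosen only for illustration, and the derivative is taken by finite differences.

```python
import numpy as np

kB = 1.380649e-23      # Boltzmann constant, J/K
T = 298.0              # assumed temperature, K
eta = 1.0e-3           # viscosity of water, Pa s (value quoted in the text)
a = 0.5e-6             # assumed sphere radius, m
m = 2.6e-17            # assumed effective (buoyancy-corrected) mass, kg
g = 9.81               # gravitational acceleration, m/s^2

zeta = 6.0 * np.pi * eta * a        # Stokes' law, Eq. 5.45
D = kB * T / zeta                   # Einstein relation, Eq. 5.49

def barometric(y):
    """Stationary density above a reflecting bottom at y = 0, Eq. 5.48."""
    return m * g / (kB * T) * np.exp(-m * g * y / (kB * T))

height = kB * T / (m * g)           # decay length of the density profile
y = np.linspace(0.0, 5.0 * height, 2001)
P = barometric(y)

drift_flux = -(m * g / zeta) * P                    # first term of Eq. 5.50
fick_flux = -D * np.gradient(P, y, edge_order=2)    # second term of Eq. 5.50
J = drift_flux + fick_flux
relative_flux = np.max(np.abs(J)) / np.max(np.abs(drift_flux))
```

With these assumed numbers `D` comes out at roughly 0.4 μm²/s. The total flux `J` vanishes up to the discretization error of the finite-difference derivative, so `relative_flux` is tiny: the downward drift and the upward Fick flux cancel at every height.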
Figure 5.7 Schematic illustration of the fluxes of a collection of Brownian particles in a gravity field: (a) flux due to random thermal motion (Fick's law) and (b) flux as a result of an external field. Here we assume the density profile of Eq. 5.48, for which the two fluxes cancel each other exactly.
The Brownian particle studied on a finer time scale is referred to as the Rayleigh particle, see Fig. 5.3(a). It includes the description of the relaxation of the velocity. The stochastic variable considered here is thus the velocity rather than the position. The macroscopic law for the linear damping of the velocity $\upsilon$ is given by $m\, d\upsilon/dt = -\zeta \upsilon$ where $m$ is the mass and $\zeta$ the friction constant of the particle. This relation translates into
$$a_1(\upsilon) = \frac{\langle \Delta \upsilon \rangle}{\Delta t} = -\frac{\zeta}{m}\, \upsilon. \tag{5.51}$$
The particle is hit by solvent molecules, incessantly changing its velocity. We therefore expect the second jump moment to be of the form
$$a_2(\upsilon) = a_2^{(0)} + O\left(\upsilon^2\right) \approx a_2^{(0)}. \tag{5.52}$$
For symmetry reasons there is no term proportional to $\upsilon$. For not too large values of the velocity, it should be sufficient to approximate $a_2(\upsilon)$ by a constant, $a_2^{(0)}$. With this assumption we find the following Fokker–Planck equation for $P(\upsilon, t)$:
$$\frac{\partial P(\upsilon, t)}{\partial t} = \frac{\zeta}{m} \left\{ \frac{\partial}{\partial \upsilon} \left( \upsilon\, P(\upsilon, t) \right) + \frac{k_B T}{m} \frac{\partial^2 P(\upsilon, t)}{\partial \upsilon^2} \right\}. \tag{5.53}$$
(0)
(0)
To determine a2 , namely a2 = 2ζ kB T /m2 , we used our knowledge from equilibrium statistical mechanics: the stationary solution must be the Maxwell velocity distribution P (υ) = √
2 1 − mυ e 2kB T . 2πkB T /m
(5.54)
This is the one-dimensional version of the velocity distribution, Eq. 2.43, we derived earlier for three dimensions. Suppose we know the velocity of the particle at time t = 0 to be $\upsilon_0$. Then what is the probability distribution $P(\upsilon, t)$ for times t > 0? As it turns out, the solution has the following form:
$$P(\upsilon,t) = \frac{1}{\sqrt{2\pi k_B T\left(1 - e^{-2\zeta t/m}\right)/m}}\; \exp\!\left[-\frac{m\left(\upsilon - \upsilon_0 e^{-\zeta t/m}\right)^2}{2k_B T\left(1 - e^{-2\zeta t/m}\right)}\right]. \tag{5.55}$$
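The properties claimed for Eq. 5.55 can also be checked numerically instead of by hand. The following is a minimal sketch, assuming reduced units with ζ = m = k_B T = 1 and υ0 = 1 (these values and all function names are illustrative, not from the text). It evaluates the normalization integral and the residual of the Fokker–Planck equation, Eq. 5.53, with central finite differences.

```python
import math

# Assumed illustrative parameters in reduced units (not taken from the text).
ZETA, M, KT, V0 = 1.0, 1.0, 1.0, 1.0

def p_rayleigh(v, t):
    """Eq. 5.55: Gaussian in v with decaying mean and growing variance."""
    decay = math.exp(-ZETA * t / M)
    var = KT * (1.0 - decay**2) / M
    return math.exp(-(v - V0 * decay)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fokker_planck_residual(v, t, h=1e-3):
    """Lhs minus rhs of Eq. 5.53, evaluated with central finite differences."""
    dp_dt = (p_rayleigh(v, t + h) - p_rayleigh(v, t - h)) / (2.0 * h)
    d_vp_dv = ((v + h) * p_rayleigh(v + h, t) - (v - h) * p_rayleigh(v - h, t)) / (2.0 * h)
    d2p_dv2 = (p_rayleigh(v + h, t) - 2.0 * p_rayleigh(v, t) + p_rayleigh(v - h, t)) / h**2
    return dp_dt - (ZETA / M) * (d_vp_dv + (KT / M) * d2p_dv2)

def normalization(t, lo=-10.0, hi=10.0, n=4000):
    """Midpoint-rule integral of Eq. 5.55 over the velocity."""
    dv = (hi - lo) / n
    return dv * sum(p_rayleigh(lo + (i + 0.5) * dv, t) for i in range(n))

if __name__ == "__main__":
    print("normalization:", normalization(0.7))
    print("residual of Eq. 5.53:", fokker_planck_residual(0.5, 0.7))
```

The residual is limited only by the finite-difference step h, so it should shrink as h is reduced (until rounding errors take over).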
It is easy to check that this distribution is normalized to one, see Eq. A.9, and that the function is sharply peaked around υ = υ0 for t → 0, i.e., one has the initial condition P (υ, 0) = δ (υ − υ0 ). Finally, by inserting the probability distribution, Eq. 5.55, into Eq. 5.53 one can convince oneself—after a longer calculation—that this is indeed the solution. Comparison of Eqs. 5.54 and 5.55 to Eqs. 5.20 and 5.21 shows that the velocity υ (t) of the Rayleigh particle is described by an Ornstein–Uhlenbeck process. It is instructive to give a more intuitive picture of this process. Before doing so, let us present the case of Brownian particles in a general external potential U (y). The Fokker–Planck equation for this case reads
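$$\frac{\partial P(y,t)}{\partial t} = \frac{1}{\zeta}\,\frac{\partial}{\partial y}\left[k_B T\,\frac{\partial P(y,t)}{\partial y} + \frac{\partial U(y)}{\partial y}\,P(y,t)\right]. \tag{5.56}$$
This follows from the flux
$$J(y,t) = -\frac{1}{\zeta}\left[k_B T\,\frac{\partial P(y,t)}{\partial y} + \frac{\partial U(y)}{\partial y}\,P(y,t)\right] \tag{5.57}$$
together with the continuity equation 5.36. Equations 5.56 and 5.57 are obvious generalizations of the special case of the linear potential of a gravitational field, U(y) = mgy, leading to a constant force $-\partial U/\partial y = -mg$, see Eqs. 5.46 and 5.50.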
Figure 5.8 Brownian particles in a harmonic potential. This system can be mapped onto the free Rayleigh particles (see text for details).
With the Fokker–Planck equation for a general external potential U(y) at hand, one can construct an appealing analogy to the Ornstein–Uhlenbeck process, Eq. 5.53, which describes the velocity of a free Rayleigh particle. Specifically, that equation is mathematically identical to the Fokker–Planck equation of a Brownian particle in a harmonic potential $U(y) = K y^2/2$:
$$\frac{\partial P(y,t)}{\partial t} = \frac{K}{\zeta}\left[\frac{\partial}{\partial y}\bigl(y P(y,t)\bigr) + \frac{k_B T}{K}\,\frac{\partial^2 P(y,t)}{\partial y^2}\right]. \tag{5.58}$$
The velocity υ in Eq. 5.53 corresponds to the position y in Eq. 5.58, the mass m to the spring constant K. In addition, the combination $\zeta/m^2$ in Eq. 5.53 needs to be replaced by $1/\zeta$ in Eq. 5.58. The particle density of the Brownian particles in a harmonic potential is Gaussian, centered around the potential minimum. Two fluxes annihilate (see Fig. 5.8): an outward flux as the result of random thermal motion that is proportional to the density gradient, Fig. 5.8(a), and an inward flux resulting from the external potential, Fig. 5.8(b). In the analogy mentioned above, the coarse-grained position dynamics of the Brownian particle in a quadratic potential well is mathematically identical to the velocity dynamics of a free Rayleigh particle.
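The cancellation of the two fluxes in the harmonic well can be made explicit. The lines below are a short worked check, assuming the stationary density is the Boltzmann distribution for $U(y) = Ky^2/2$ and splitting the flux of Eq. 5.57 into its thermal and potential parts (the labels "thermal" and "potential" are introduced here only for illustration).

```latex
% Stationary (Boltzmann) density in the harmonic well, normalized to one:
P(y) = \sqrt{\frac{K}{2\pi k_B T}}\;\exp\!\left(-\frac{K y^2}{2 k_B T}\right)
% Thermal part of Eq. 5.57, directed outward (away from y = 0):
J_{\mathrm{thermal}} = -\frac{k_B T}{\zeta}\,\frac{\partial P}{\partial y} = +\frac{K y}{\zeta}\,P(y)
% Potential part of Eq. 5.57, directed inward:
J_{\mathrm{potential}} = -\frac{1}{\zeta}\,\frac{\partial U}{\partial y}\,P(y) = -\frac{K y}{\zeta}\,P(y)
% The sum vanishes at every position:
J = J_{\mathrm{thermal}} + J_{\mathrm{potential}} = 0
```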
5.5 Application: Escape over a Barrier

In many biophysical systems one encounters the problem of the escape over a barrier: The system is stuck in a local energy minimum, a metastable state, since it is separated from the global minimum, the ground state, through an energy barrier. However,
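Figure 5.9 Escape from state “A” over a barrier at “B” to state “C”. The sketch shows the potential U(y) with the local minimum A at $y_A$, the barrier B of height $E_b$ at $y_B$, the global minimum C at $y_C$, and the escape rate $k_{A\to C}$.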
given enough time a sufficiently large thermal fluctuation will eventually occur that helps the system over the barrier so that it reaches the global energy minimum. Later, in Chapters 6 and 8, we shall see examples of such barrier crossings. Here we work out the kinetics behind the escape over the barrier using the framework of the Fokker–Planck equation. For the sake of simplicity, we consider a one-dimensional system with a local minimum at position $y_A$ with energy $U(y_A) = U_A$, see Fig. 5.9. The global minimum is at position $y_C$, $U(y_C) = U_C$, with $U_A - U_C \gg k_B T$. The barrier in between has its maximum at $y_B$ and is assumed to be much higher than the thermal energy, i.e., $U(y_B) = U_B$ obeys $U_B - U_A \gg k_B T$. We assume that the system starts at $y = y_A$. Since the barrier is very high, the system only rarely crosses the barrier and we can assume that it is in a quasi-stationary state. This means that the distribution can be approximated to be constant in time, $\partial P/\partial t = 0$. From the continuity equation, Eq. 5.36, it follows that the flux J is independent of the position, J(y) = J = const. According to Eq. 5.57, J is given by
$$J = -\frac{1}{\zeta}\left[k_B T\,\frac{\partial P(y)}{\partial y} + \frac{\partial U(y)}{\partial y}\,P(y)\right], \tag{5.59}$$
which can be rewritten as
$$J = -\frac{k_B T}{\zeta}\; e^{-\frac{U(y)}{k_B T}}\,\frac{d}{dy}\left[P(y)\, e^{\frac{U(y)}{k_B T}}\right]. \tag{5.60}$$
Multiplying both sides of Eq. 5.60 by $e^{U(y)/k_B T}$ and then integrating from $y_A$ to $y_C$ leads to
$$J \int_{y_A}^{y_C} e^{\frac{U(y)}{k_B T}}\, dy = -\frac{k_B T}{\zeta}\left[P(y)\, e^{\frac{U(y)}{k_B T}}\right]_{y_A}^{y_C}, \tag{5.61}$$
the relation from which we determine J in the following. Since the system is in a local thermal equilibrium in the potential well around $y_A$, the probability density is Boltzmann distributed,
$$P(y) = P_0\, e^{-U(y)/k_B T} \tag{5.62}$$
for all y with $|y - y_A| < |y_A - y_B|$. We approximate U(y) in Eq. 5.62 by a Taylor expansion up to second order around the local minimum at $y_A$:
$$U(y) = U_A + \frac{1}{2}\left.\frac{d^2 U}{dy^2}\right|_{y=y_A}(y - y_A)^2 = U_A + \frac{1}{2}\,\omega_A^2\,(y - y_A)^2. \tag{5.63}$$
The factor $P_0$ in Eq. 5.62 is determined by requiring that P(y) is normalized to one, i.e., that the system is to be found somewhere around the local minimum $y_A$:
$$\int_{-\infty}^{y_B} P(y)\, dy \cong P_0\, e^{-\frac{U_A}{k_B T}} \int_{-\infty}^{\infty} \exp\!\left(-\frac{\omega_A^2 y^2}{2k_B T}\right) dy = 1. \tag{5.64}$$
Here we made an exponentially small error when we extended the integration domain to infinity. With Eq. A.8 it follows that Eq. 5.64 is fulfilled by choosing $P_0 = e^{U_A/k_B T}\,\omega_A/\sqrt{2\pi k_B T}$. The rhs of Eq. 5.61 can thus be approximated by
$$-\frac{k_B T}{\zeta}\left[P(y)\, e^{\frac{U(y)}{k_B T}}\right]_{y_A}^{y_C} \cong \frac{k_B T}{\zeta}\,P(y_A)\, e^{\frac{U_A}{k_B T}} = \frac{\omega_A}{\zeta}\sqrt{\frac{k_B T}{2\pi}}\; e^{\frac{U_A}{k_B T}}. \tag{5.65}$$
In the first step of Eq. 5.65, we neglected the upper boundary, $y = y_C$, since here we describe U(y) by the Taylor expansion, Eq. 5.63, making this term exponentially small. What remains to be calculated in Eq. 5.61 is the integral on the lhs. The major contribution to this integral comes from the maximum of U(y) around $y = y_B$. This allows us to evaluate the integral via the saddle-point approximation. Again we Taylor expand the potential, this time around $y = y_B$:
$$U(y) \cong U_B + \frac{1}{2}\left.\frac{d^2 U}{dy^2}\right|_{y=y_B}(y - y_B)^2 = U_B - \frac{1}{2}\,\omega_B^2\,(y - y_B)^2. \tag{5.66}$$
With this we estimate the integral as follows:
$$\int_{y_A}^{y_C} dy\; e^{\frac{U(y)}{k_B T}} \cong e^{\frac{U_B}{k_B T}} \int_{-\infty}^{\infty} \exp\!\left(-\frac{\omega_B^2 y^2}{2k_B T}\right) dy = e^{\frac{U_B}{k_B T}}\,\frac{\sqrt{2\pi k_B T}}{\omega_B}. \tag{5.67}$$
Inserting Eqs. 5.65 and 5.67 into Eq. 5.61 leads to an explicit expression for the probability flux:
$$J = \frac{\omega_A\,\omega_B}{2\pi\zeta}\; e^{\frac{U_A - U_B}{k_B T}}. \tag{5.68}$$
Equation 5.68 is called Kramers’ rate. It can be rewritten in a more compact notation:
$$k_{A\to C} = \nu_0\, e^{-E_b/k_B T}. \tag{5.69}$$
Here $k_{A\to C} = J$ is the escape rate from state A to state C and $E_b = U_B - U_A$ denotes the barrier height. The factor $\nu_0$ is called the attempt frequency. We showed that $\nu_0$ has a precise meaning, namely $\nu_0 = \omega_A\omega_B/(2\pi\zeta)$. Very roughly speaking, the attempt frequency is the typical frequency with which the system starting from A would reach C if the energy landscape were flat, i.e., $\nu_0 \approx 2D/|y_C - y_A|^2$, see Eq. 5.43. The exponential factor then accounts for the fact that most “attempts” are not successful in reaching the saddle point. Suppose one wants to calculate the escape rate of a system from state A to C. Since the escape rate, Eq. 5.69, is linear in the attempt frequency but depends exponentially on the barrier height, it is much more crucial to have a good estimate of the barrier height than of the attempt frequency. This is fortunate since one often has a good description of the energy landscape but only a rather vague idea about the friction constant. An example of this situation can be found in Section 8.3 where we discuss force-induced nucleosome unwrapping.
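To get a feeling for the quality of the two Gaussian approximations behind Kramers’ rate, one can compare Eq. 5.68 with a direct numerical evaluation of Eq. 5.61. The sketch below is only an illustration under stated assumptions: the cubic landscape `U`, the parameter values and all function names are hypothetical choices in reduced units, and the well normalization and the barrier integral are computed by simple midpoint quadrature.

```python
import math

KT, ZETA = 1.0, 1.0   # thermal energy and friction constant (assumed, reduced units)
EB, L = 10.0, 1.0     # barrier height and well-to-barrier distance (assumed)

def U(y):
    """Illustrative cubic landscape: minimum at y=0 (U=0), maximum at y=L (U=EB)."""
    s = y / L
    return EB * (3.0 * s**2 - 2.0 * s**3)

def integrate(f, a, b, n=20000):
    """Midpoint rule on n equal intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def kramers_rate():
    """Eq. 5.68; for this U, omega_A^2 = U''(0) and omega_B^2 = -U''(L) both equal 6 EB / L^2."""
    omega_a = omega_b = math.sqrt(6.0 * EB / L**2)
    return omega_a * omega_b / (2.0 * math.pi * ZETA) * math.exp(-EB / KT)

def quadrature_rate():
    """Eq. 5.61 with P = exp(-U/kT)/Z in the well, without the Gaussian steps of Eqs. 5.64 and 5.67."""
    well = integrate(lambda y: math.exp(-U(y) / KT), -L, L)           # normalization Z of the well
    barrier = integrate(lambda y: math.exp(U(y) / KT), 0.0, 1.5 * L)  # lhs integral of Eq. 5.61
    return KT / (ZETA * well * barrier)

if __name__ == "__main__":
    print("Kramers (Eq. 5.68):", kramers_rate())
    print("quadrature (Eq. 5.61):", quadrature_rate())
```

For a barrier of ten times the thermal energy the two numbers should agree to within a few percent; lowering `EB` makes the anharmonic corrections, and hence the deviation, larger.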
Figure 5.10 (a) Polymer with a stretch (shown in magenta) that condenses onto itself. (b) Same chain under an external force f. The condensed state of the magenta stretch is now only a local minimum of the energy (or, more accurately, the free energy to account for the large number of configurations of the polymer). (c) The global minimum corresponds to the non-condensed state where the chain can extend much more in the force direction.
5.6 Application: Dynamic Force Spectroscopy

The structures of many biomacromolecules (e.g., proteins) and their complexes (e.g., DNA–protein complexes) are governed by noncovalent bonds. This means that these structures and complexes have a limited lifetime before they fall apart as the result of thermal fluctuations. By applying an external force, the bond lifetimes can be shortened considerably. This is because the barriers that have to be crossed for unbinding are affected by the external force. Systematic pulling experiments can be used together with Kramers’ rate expression, Eq. 5.69, to determine the internal energy landscape of the molecular bonds. In this section we discuss a specific experimental scheme and the theory behind it, which makes it possible to extract such information: dynamic force spectroscopy (DFS) (Evans, 1999). This is an important method widely applied in single-molecule experiments; in Chapter 8 we shall give a concrete example when describing the force-induced unwrapping of nucleosomes. Let us consider first the system depicted in Fig. 5.10(a). It is a flexible polymer with a stretch S that attracts itself, forming some kind of condensed region. This stretch might be an unstructured homopolymer for which the solvent is poor so that S collapses into a molten globule. Or it might be a protein that folds into a specific
conformation as prescribed by its underlying amino acid sequence. Suppose one applies a force f to the ends of the polymer. If that force is not too large and not too small, the chain with its stretch S condensed, Fig. 5.10(b), is still a minimum of the (free) energy but now just a local one. The global minimum then corresponds to the stretched state where S is decondensed, see Fig. 5.10(c). In that state the end-to-end distance is much longer, substantially reducing the total energy of the system. The precise shape of the barrier between the local minimum (S condensed, Fig. 5.10(b)) and the global minimum (S decondensed, Fig. 5.10(c)) depends on the details of how the condensed stretch is folded and what the local interactions are. We give a specific example later, in Chapter 8. For now we only discuss how to extract information about the energy landscape through DFS measurements or how to predict the outcome of DFS if one knows the energy landscape from a theoretical model. Consider a system with an energy landscape like the one shown in Fig. 5.9. Now suppose we apply a force f to the system in the Y-direction; in the previous example, the polymer under tension, y would be its end-to-end distance. Then the energy landscape U(y) changes and takes the form U(y) − f y. We want to know how the escape rate $k_{A\to C}$ depends on the force. In the following we call this quantity the failure rate $\nu_{\mathrm{fail}}(f)$ of the bound state under tension f. Let us denote the f-dependent barrier height by E(f). Then Kramers’ rate, Eq. 5.69, takes the form
$$\nu_{\mathrm{fail}}(f) = \nu_0\, e^{-E(f)/k_B T}. \tag{5.70}$$
Let $E_b$ be the barrier height of the unforced system and $y_b$ the distance in Y-direction between the local minimum at A and the top of the barrier at B. The effect of the force on the energy landscape is just to tilt it or, more precisely, to shear it, see Fig. 5.11. If we assume a very steep potential, the force f does not change the distance $y_b$ between the bound state and the maximum much, and we can estimate the barrier height by $E(f) = E_b - f y_b$. In that case the failure rate is given by
$$\nu_{\mathrm{fail}}(f) = \nu_0\, e^{-(E_b - f y_b)/k_B T} \tag{5.71}$$
to a good approximation. Assume that the force varies with time t, f = f(t). Let us calculate the probability $P_{\mathrm{surv}}(t)$ that the bound state survives up
Figure 5.11 By applying a force in the Y -direction the energy landscape gets tilted. The height E b of the barrier at “B” is approximately reduced to E b − f yb with yb denoting the distance between the local minimum at “A” and the maximum at “B”.
to time t. This probability decays at a rate proportional to the product of the failure rate at time t and the probability that the bound state survives until that time:
$$\frac{d P_{\mathrm{surv}}(t)}{dt} = -\nu_{\mathrm{fail}}(f(t))\, P_{\mathrm{surv}}(t). \tag{5.72}$$
We assume that the system was in the bound state when the measurement started at t = 0, which leads to the initial condition $P_{\mathrm{surv}}(0) = 1$. Equation 5.72 can be rewritten as $d \ln P_{\mathrm{surv}}(t)/dt = -\nu_{\mathrm{fail}}(f(t))$, which is solved by
$$P_{\mathrm{surv}}(t) = \exp\!\left[-\int_0^t \nu_{\mathrm{fail}}(f(\tau))\, d\tau\right]. \tag{5.73}$$
The rate with which the bound state fails, $w_{\mathrm{fail}}$, equals the growth rate of the probability $1 - P_{\mathrm{surv}}(t)$, the probability to be in the unbound state:
$$w_{\mathrm{fail}}(t) = \frac{d}{dt}\bigl(1 - P_{\mathrm{surv}}(t)\bigr) = \nu_{\mathrm{fail}}(f(t))\, P_{\mathrm{surv}}(t). \tag{5.74}$$
On the rhs of Eq. 5.74 we used Eq. 5.72. The typical quantity determined in standard force spectroscopy is the time at which $w_{\mathrm{fail}}(t)$ has its maximum. Experimentally this time is found by repeating the same measurement many times and drawing a histogram of the number of breakages observed within given small time intervals. On the theoretical side, the time where $w_{\mathrm{fail}}(t)$ is maximal follows from
$$\frac{d w_{\mathrm{fail}}(t)}{dt} = 0. \tag{5.75}$$
Assuming a failure rate of the form Eq. 5.71, the decay rate $w_{\mathrm{fail}}(t)$, Eq. 5.74, takes the form
$$w_{\mathrm{fail}}(t) = \nu_b\, e^{\beta f(t) y_b}\, e^{-\nu_b \rho_f(t)} = -\frac{d}{dt}\, e^{-\nu_b \rho_f(t)} \tag{5.76}$$
with $\nu_b = \nu_0 e^{-\beta E_b}$ and
$$\rho_f(t) = \int_0^t e^{\beta y_b f(\tau)}\, d\tau. \tag{5.77}$$
Then the above condition, Eq. 5.75, takes the form
$$\frac{d^2}{dt^2} \exp\bigl(-\nu_b \rho_f(t)\bigr) = 0, \tag{5.78}$$
which can be rewritten as
$$\frac{d^2 \rho_f(t)}{dt^2} = \nu_b \left(\frac{d\rho_f(t)}{dt}\right)^2. \tag{5.79}$$
In DFS the imposed force is typically increased linearly in time,
$$f(t) = r_f\, t \tag{5.80}$$
where $r_f$ is called the loading rate. In that case, $\rho_f(t)$ can be easily obtained from Eq. 5.77:
$$\rho_f(t) = \int_0^t e^{\beta y_b r_f \tau}\, d\tau = \frac{1}{\beta y_b r_f}\left(e^{\beta y_b r_f t} - 1\right). \tag{5.81}$$
This together with Eq. 5.79 allows us to give an expression for the time $t^*$ and, using $f^* = r_f t^*$, for the force $f^*$ at which $w_{\mathrm{fail}}(t)$ has its maximum:
$$f^* = \frac{1}{\beta y_b} \ln\frac{\beta y_b r_f}{\nu_b} = \frac{1}{\beta y_b} \ln\frac{\beta y_b r_f}{\nu_0} + \frac{E_b}{y_b}. \tag{5.82}$$
Equation 5.82 is usually rewritten as
$$f^*(r_f) = \frac{1}{\beta y_b} \ln\frac{r_f}{r_0} + \frac{1}{\beta y_b} \ln\frac{r_0 \beta y_b}{\nu_0} + \frac{E_b}{y_b} \tag{5.83}$$
introducing some arbitrary loading scale r0 , e.g., r0 = 1pN/s. This expression is often used to interpret data obtained in DFS measurements.
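The closed-form result for the most probable rupture force can be cross-checked against a brute-force search for the maximum of the failure-time distribution. The following minimal sketch assumes a linear force ramp and reduced units with $k_B T = 1$; the parameter values and the function names are illustrative assumptions, not taken from the text.

```python
import math

def f_star_analytic(r_f, y_b, e_b, nu0, kT=1.0):
    """Most probable rupture force from Eq. 5.82."""
    beta = 1.0 / kT
    return math.log(beta * y_b * r_f / nu0) / (beta * y_b) + e_b / y_b

def w_fail(t, r_f, y_b, e_b, nu0, kT=1.0):
    """Failure-time density, Eq. 5.76, with rho_f(t) from Eq. 5.81 (linear ramp f = r_f t)."""
    beta = 1.0 / kT
    nu_b = nu0 * math.exp(-beta * e_b)
    a = beta * y_b * r_f              # growth rate of the exponent under the ramp
    rho = math.expm1(a * t) / a       # (exp(a t) - 1) / a
    return nu_b * math.exp(a * t - nu_b * rho)

def f_star_numeric(r_f, y_b, e_b, nu0, t_max, kT=1.0, n=100000):
    """Scan w_fail on a uniform time grid in [0, t_max] and convert the peak time to a force."""
    times = (i * t_max / n for i in range(n + 1))
    best_t = max(times, key=lambda t: w_fail(t, r_f, y_b, e_b, nu0, kT))
    return r_f * best_t

if __name__ == "__main__":
    params = dict(r_f=1.0, y_b=1.0, e_b=20.0, nu0=1.0e6)  # assumed example values
    print("Eq. 5.82 :", f_star_analytic(**params))
    print("grid scan:", f_star_numeric(t_max=12.0, **params))
```

The scan only makes sense if `t_max` lies beyond the peak; for very small loading rates Eq. 5.82 returns a negative force, which signals that the bond typically fails spontaneously before the ramp matters.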
Figure 5.12 Most probable rupture force $f^*$ vs. logarithm of the pulling rate $r_f$, Eq. 5.83. From the slope of the curve, $1/(\beta y_b)$, one can deduce $y_b$; from the intersection with the Y-axis, $\frac{1}{\beta y_b}\ln\frac{r_0 \beta y_b}{\nu_0} + \frac{E_b}{y_b}$, the barrier height $E_b$.
Note that only the first term in Eq. 5.83 depends on the loading rate $r_f$, namely logarithmically, while the other terms are constant. This implies that $f^*$ is small for small $r_f$ and large for large $r_f$. This can be understood as follows: For small $r_f$ the system has a lot of time to find a large enough thermal fluctuation that carries it over the barrier at a time when the applied force is still small. On the other hand, for steep force ramps the time available to wait for a larger fluctuation is typically too short and the system only fails once the force is large and the barrier is small. To perform a DFS experiment, one first needs to repeat a rupture experiment many times with a given loading rate $r_f$ so that one can determine the peak in the histogram of rupture forces. Then one has to repeat the experiment with another loading rate and so on. After obtaining $f^*(r_f)$ over a wide range of loading rates, the recipe for the computation of $y_b$ and $E_b$ is as follows. First plot $f^*$ vs. $\ln(r_f/r_0)$. According to Eq. 5.83 one expects that the data points lie on a straight line, see Fig. 5.12. Then one obtains $y_b$ from the slope of that line and, in a second step, $E_b$ from the offset constant. For that second step the attempt frequency $\nu_0$ is needed. It is often difficult to know the precise value of $\nu_0$, but uncertainties in the attempt frequency enter only logarithmically as uncertainties in the barrier height.
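The recipe above amounts to a straight-line fit. The sketch below illustrates it on synthetic data generated from Eq. 5.83 itself, so it demonstrates only the bookkeeping, not a real measurement. All numbers are assumptions chosen for illustration: a thermal energy of roughly 4.1 pN nm, a barrier distance of 0.5 nm, a barrier of 25 thermal energies and an attempt frequency of 10^6 per second.

```python
import math

KT = 4.1  # thermal energy in pN nm, roughly room temperature (assumed)

def rupture_force(r_f, y_b, e_b, nu0, r0=1.0):
    """Eq. 5.83: most probable rupture force (pN) at loading rate r_f (pN/s)."""
    return (KT / y_b) * (math.log(r_f / r0) + math.log(r0 * y_b / (KT * nu0))) + e_b / y_b

def fit_barrier(rates, forces, nu0, r0=1.0):
    """Least-squares line through (ln(r_f/r0), f*); returns (y_b, E_b) via Eq. 5.83."""
    x = [math.log(r / r0) for r in rates]
    n = len(x)
    x_mean, f_mean = sum(x) / n, sum(forces) / n
    slope = (sum((xi - x_mean) * (fi - f_mean) for xi, fi in zip(x, forces))
             / sum((xi - x_mean) ** 2 for xi in x))
    intercept = f_mean - slope * x_mean
    y_b = KT / slope                                               # slope = 1/(beta y_b)
    e_b = y_b * intercept - KT * math.log(r0 * y_b / (KT * nu0))   # offset term of Eq. 5.83
    return y_b, e_b

# Synthetic "data": noise-free forces computed from assumed parameters.
rates = [1.0, 10.0, 100.0, 1000.0, 10000.0]                        # pN/s
forces = [rupture_force(r, y_b=0.5, e_b=25 * KT, nu0=1.0e6) for r in rates]
y_b_fit, e_b_fit = fit_barrier(rates, forces, nu0=1.0e6)

if __name__ == "__main__":
    print("y_b (nm):", y_b_fit, " E_b (pN nm):", e_b_fit)
```

Changing the value of `nu0` passed to `fit_barrier` by a factor of ten shifts the fitted barrier height by only $k_B T \ln 10$, which illustrates the logarithmic sensitivity mentioned above.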
From the theoretical analysis given above one might expect that the data points of $f^*$ plotted against the logarithm of $r_f$ always form a straight line, see Eq. 5.83 and Fig. 5.12. However, this is not always the case. The straight line follows from the approximation we made when going from Eq. 5.70 to 5.71. The barrier is not always so steep that its position does not change much when a force is applied. Moreover, the whole energy landscape might not just be sheared but change altogether through the application of an external force, as we shall see later in Chapter 8. It is therefore not always possible to simplify Eq. 5.70 to 5.71. To derive a relation for the more general case, Eq. 5.70, we start from the condition for a maximum in the failure rate, Eq. 5.75, that follows from combining Eqs. 5.73 and 5.74, but restrict ourselves to the case of a linear force ramp, $f(t) = r_f t$:
$$\frac{d}{dt}\left\{\nu_{\mathrm{fail}}(r_f t)\, \exp\!\left[-\int_0^t \nu_{\mathrm{fail}}(r_f \tau)\, d\tau\right]\right\} = 0. \tag{5.84}$$
This leads to
$$\left[r_f\, \nu'_{\mathrm{fail}}(r_f t) - \nu^2_{\mathrm{fail}}(r_f t)\right] \exp\!\left[-\int_0^t \nu_{\mathrm{fail}}(r_f \tau)\, d\tau\right] = 0 \tag{5.85}$$
where the prime denotes the derivative with respect to the argument. The force $f^*$ follows from the condition
$$r_f\, \nu'_{\mathrm{fail}}(f^*) = \nu^2_{\mathrm{fail}}(f^*). \tag{5.86}$$
It is straightforward to derive the classical result, Eq. 5.82, from Eq. 5.86 for the special case, Eq. 5.71. For the general case, Eq. 5.70, we find from Eq. 5.86 the following condition:
$$r_f(f^*) = \nu_0\, k_B T \left(-\frac{\partial E(f^*)}{\partial f}\right)^{-1} e^{-E(f^*)/k_B T}. \tag{5.87}$$
This is an explicit equation for $r_f = r_f(f^*)$, the inverse function of the one we are interested in, namely $f^* = f^*(r_f)$. It relates the loading rate $r_f$ to the barrier height E and its derivative $\partial E/\partial f$, both at $f = f^*$. Equation 5.87 is ideal for the case that one has a theoretical expression for the energy landscape, as is the case for the nucleosome unwrapping problem discussed in Chapter 8.
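As a consistency check, the general condition, Eq. 5.87, can be specialized to the steep-barrier case. The short derivation below assumes the linear barrier reduction $E(f) = E_b - f y_b$ of Eq. 5.71 and uses $\beta = 1/k_B T$.

```latex
% Assumption: E(f) = E_b - f y_b, so that \partial E / \partial f = -y_b.
r_f = \nu_0 k_B T \left(-\frac{\partial E}{\partial f}\right)^{-1} e^{-E(f^*)/k_B T}
    = \frac{\nu_0 k_B T}{y_b}\, e^{-(E_b - f^* y_b)/k_B T}
% Solve for f^*:
e^{f^* y_b / k_B T} = \frac{y_b\, r_f}{k_B T\, \nu_0}\, e^{E_b / k_B T}
\quad\Longrightarrow\quad
f^* = \frac{1}{\beta y_b} \ln\frac{\beta y_b r_f}{\nu_0} + \frac{E_b}{y_b}
```

The last expression coincides with Eq. 5.82.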
5.7 Langevin Equation

An alternative and very popular approach to describe the impact of fluctuations on a system is provided by the Langevin formalism. It appears to be more concrete than the Fokker–Planck equation but, as we shall see, is mathematically equivalent to it. Let us start with Brownian motion, again in one dimension for the sake of simplicity. We denote the position of the particle at time t by y(t). The equation of motion of a Brownian particle is given by
$$\zeta\, \frac{dy(t)}{dt} = -\frac{dU(y(t))}{dy} + L(t). \tag{5.88}$$
On the lhs of this equation is the friction force experienced by the particle as it moves with the velocity dy/dt through the solvent. The friction constant ζ has already been introduced in Eq. 5.44. On the rhs are all the forces that “drive” the particle: an external force $-dU/dy$ and a random force L(t) that accounts for collisions with the surrounding solvent molecules. Thus two terms in Eq. 5.88, the friction term and the random term, describe the forces exerted on the particle by the surrounding fluid. The following three physically plausible properties are postulated about these forces (see also Fig. 5.13):
(i) The forces consist of a damping force linear in dy/dt and a random force L(t). The latter term is a stochastic process that cannot be predicted. However, averaged physical quantities are simple.
(ii) The force L(t) is caused by collisions of individual molecules of the surrounding fluid that hit the particle from any direction. Thus the average of L(t) vanishes:
$$\langle L(t) \rangle = 0. \tag{5.89}$$
(iii) One postulates for its autocorrelation function
$$\langle L(t)\, L(t') \rangle = \Gamma\, \delta(t - t') \tag{5.90}$$
with Γ being a constant. With this relation one assumes that the collisions are instantaneous and successive collisions are uncorrelated.
Figure 5.13 In Langevin equations the forces exerted by a fluid on a Brownian or Rayleigh particle are composed of two components: a friction force −ζ dR/dt and an irregular force, the Langevin force L (t). The Langevin force mimics collisions with solvent molecules (black disks). This microscopic point of view on the Langevin force is especially appropriate for the Rayleigh particle. This is the same particle and trajectory as in Fig. 5.3(a).
The quantity L(t) with properties (i) to (iii) is called a Langevin force. Equation 5.88 is named the Langevin equation, a prototype of a stochastic differential equation. Consider now a free Brownian particle that is described by Eq. 5.88 with U(y) ≡ 0. For a given realization of L(t) the trajectory of the particle follows simply by integration:
$$y(t) - y_0 = \frac{1}{\zeta} \int_0^t L(\tau)\, d\tau \tag{5.91}$$
with $y_0$ denoting the start position at t = 0, $y_0 = y(0)$. As usual, we are interested in averaged quantities instead of concrete realizations as in Eq. 5.91. The average position is trivial since
$$\langle y(t) \rangle = y_0 + \frac{1}{\zeta} \int_0^t \langle L(\tau) \rangle\, d\tau = y_0 \tag{5.92}$$
where we used Eq. 5.89. It is thus more appropriate to look at the mean-squared displacement:
$$\left\langle (y(t) - y_0)^2 \right\rangle = \left\langle \left(\frac{1}{\zeta}\int_0^t L(\tau)\, d\tau\right)\left(\frac{1}{\zeta}\int_0^t L(\tau')\, d\tau'\right) \right\rangle = \frac{1}{\zeta^2} \int_0^t d\tau \int_0^t d\tau'\, \langle L(\tau)\, L(\tau') \rangle = \frac{\Gamma}{\zeta^2}\, t. \tag{5.93}$$
In the first step we used Eq. 5.91 and in the last step Eq. 5.90. We are now in the position to determine the value of Γ. This is achieved through comparison to the corresponding Fokker–Planck equation, Eq. 5.40, which fulfills $\langle y(t) \rangle = 0$ and $\langle y^2(t) \rangle = 2Dt$ (see Eq. 5.43 with $y_0 = 0$). The Langevin equation leads to the same averages when we set
$$\Gamma = 2D\zeta^2 = 2k_B T \zeta \tag{5.94}$$
where we used the Einstein relation, Eq. 5.49. Does this mean that the Fokker–Planck equation, Eq. 5.40, and the Langevin equation, Eq. 5.88 with U ≡ 0 and Γ given by Eq. 5.94, are identical? Not quite. With postulates (i) to (iii) only the first two moments are determined but not terms like $\langle y^n(t) \rangle$ with n > 2. For instance, for n = 4 we would need to integrate over a term like $\langle L(t_1) L(t_2) L(t_3) L(t_4) \rangle$, as follows from a calculation similar to the one presented in Eq. 5.93. On the other hand, from the Fokker–Planck equation follows the density distribution and from this distribution moments of any order can be calculated. We therefore need to supplement the previous three postulates with the following one:
(iv) L(t) is Gaussian.
This means that moments of any order in L(t) are immediately defined. All odd moments vanish and all even moments can be broken down into sums of products of second moments, see Eq. A.15, which have been defined in postulate (iii). For instance, using Eq. A.16 and then Eq. 5.90 we find
$$\langle L(t_1) L(t_2) L(t_3) L(t_4) \rangle = \langle L(t_1) L(t_2) \rangle \langle L(t_3) L(t_4) \rangle + \ldots = \Gamma^2\, \delta(t_1 - t_2)\, \delta(t_3 - t_4) + \ldots \tag{5.95}$$
where “…” stands for the other two possible pairings. Thus all stochastic properties of L(t) are determined by the single parameter Γ. L(t) is called Gaussian white noise.

We now show that with postulate (iv) the Langevin equation 5.88 becomes equivalent to the Fokker–Planck equation 5.40. According to Eq. 5.91, y(t) is an integral of Gaussians. As shown in Appendix A, Eqs. A.17 to A.20 (for sums), y(t) itself must also be Gaussian. The equivalence to the Fokker–Planck equation 5.40 follows simply by inspection: The solution of Eq. 5.40 is also a Gaussian process, Eq. 5.42, with the same moments as the above Langevin equation. Both equations therefore describe the same process.

We come now to the Langevin description of the Rayleigh particle. Its equation of motion follows by adding a Langevin force to the macroscopic damping law, $m\,d\upsilon/dt = -\zeta\upsilon$, leading to
$$m\, \frac{d\upsilon(t)}{dt} = -\zeta\, \upsilon(t) + L(t). \tag{5.96}$$
Again, it is useful to mention the analogy between a free Rayleigh particle and a Brownian particle in a harmonic potential $U(y) = K y^2/2$, see Fig. 5.8. Starting with Eq. 5.88 one arrives at Eq. 5.96 by replacing y by υ, ζ by $m^2/\zeta$ and K by m.

We now explicitly solve Eq. 5.96 for the initial condition $\upsilon(0) = \upsilon_0$. We use the standard trick of first calculating the Green’s function $G(t, t')$, which is the solution of the equation $\mathcal{L}\, G(t, t') = \delta(t - t')$ where $\mathcal{L}$ is a linear differential operator (written here as $\mathcal{L}$ to distinguish it from the Langevin force L(t)). In our case $\mathcal{L} = m\,(d/dt) + \zeta$ and hence $G(t, t') = \Theta(t - t')\, e^{-\zeta(t - t')/m}/m$. Here Θ(x) is the Heaviside step function, Θ(x) = 0 for x < 0 and Θ(x) = 1 for x ≥ 0. Using the rule $(d/dt)\,\Theta(t - t') = \delta(t - t')$ one finds indeed $\mathcal{L}\, G(t, t') = \delta(t - t')$. With G being determined, a special solution $\upsilon^{(0)}(t)$ for t ≥ 0 to the equation $\mathcal{L}\,\upsilon^{(0)}(t) = L(t)$ follows immediately by integration, namely $\upsilon^{(0)}(t) = \int_0^\infty G(t, t')\, L(t')\, dt'$, as can be checked by applying $\mathcal{L}$ on both sides. This solution, however, corresponds to the initial condition $\upsilon^{(0)}(0) = 0$.
The general case is found by adding to $\upsilon^{(0)}(t)$ the solution to the homogeneous differential equation $\mathcal{L}\,\upsilon(t) = 0$ with the proper initial value; the latter simply has the form $\upsilon_0 e^{-\zeta t/m}$. We then find the solution to Eq. 5.96 with the initial condition $\upsilon(0) = \upsilon_0$:
$$\upsilon(t) = \upsilon_0\, e^{-\frac{\zeta}{m} t} + \frac{1}{m}\, e^{-\frac{\zeta}{m} t} \int_0^t e^{\frac{\zeta}{m} t'}\, L(t')\, dt'. \tag{5.97}$$
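From this follows with Eq. 5.89
$$\langle \upsilon(t) \rangle = \upsilon_0\, e^{-\frac{\zeta}{m} t} \tag{5.98}$$
and with Eq. 5.90
$$\left\langle (\upsilon(t))^2 \right\rangle = \upsilon_0^2\, e^{-2\frac{\zeta}{m} t} + \frac{\Gamma}{2m\zeta}\left(1 - e^{-2\frac{\zeta}{m} t}\right). \tag{5.99}$$
The long-time behavior, t → ∞, of Eq. 5.99 allows us to determine Γ:
$$\left\langle \upsilon^2 \right\rangle = \frac{\Gamma}{2m\zeta} = \frac{k_B T}{m}$$
where we used on the rhs the equipartition theorem, Eq. 2.36, with $H = m\upsilon^2/2$. Hence we recover Eq. 5.94.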
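The averages of Eqs. 5.98 and 5.99 can be reproduced by integrating the Langevin equation of the Rayleigh particle, Eq. 5.96, numerically. The sketch below uses a simple Euler–Maruyama time stepping, which is a numerical scheme chosen here for illustration and not a method described in the text; the reduced units (ζ = m = k_B T = 1), the step size, the ensemble size and the function names are likewise assumptions. Over a time step dt the integrated Langevin force is drawn as a Gaussian with variance Γ dt, with Γ from Eq. 5.94.

```python
import math
import random

def simulate_rayleigh(v0, t_end, n_particles=10000, dt=0.01,
                      zeta=1.0, m=1.0, kT=1.0, seed=1):
    """Euler-Maruyama integration of Eq. 5.96; returns (<v>, <v^2>) at time t_end."""
    rng = random.Random(seed)
    gamma = 2.0 * kT * zeta             # noise strength, Eq. 5.94
    kick = math.sqrt(gamma * dt) / m    # std. dev. of the velocity change per step
    n_steps = int(round(t_end / dt))
    total, total_sq = 0.0, 0.0
    for _ in range(n_particles):
        v = v0
        for _ in range(n_steps):
            v += -(zeta / m) * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v
        total_sq += v * v
    return total / n_particles, total_sq / n_particles

def theory(v0, t, zeta=1.0, m=1.0, kT=1.0):
    """Eqs. 5.98 and 5.99 with Gamma / (2 m zeta) = kT / m."""
    decay = math.exp(-zeta * t / m)
    return v0 * decay, v0**2 * decay**2 + (kT / m) * (1.0 - decay**2)

if __name__ == "__main__":
    print("simulation:", simulate_rayleigh(2.0, 1.0))
    print("theory    :", theory(2.0, 1.0))
```

The simulated averages carry a statistical error that decreases as the inverse square root of the number of particles and a small discretization error that decreases with the step size dt.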
5.8 Application: Polymer Dynamics

The Langevin equations examined so far always correspond to the case of a single particle in an external potential, Eqs. 5.88 and 5.96. As we show in this section, a coupled set of such equations can describe the dynamics of a more complicated object, e.g., of a polymer. We consider in the following two classical models of polymer dynamics, the Rouse model and the Zimm model (Doi and Edwards, 1986). We start with the Rouse model, which is computationally straightforward. However, as we shall see, with some notable exceptions, it does not reproduce the experimentally observed dynamics of single polymers. We have introduced several models for flexible polymers in Chapter 3. Here we introduce yet another model, the Gaussian chain model. The polymer is modeled as a string of N beads connected by harmonic springs. The springs make this model computationally more convenient than the models with fixed bond length. The chain’s conformation is given by the set $\{\mathbf{R}_k\}$ where $\mathbf{R}_k = (X_k, Y_k, Z_k)$ is the position vector of the k-th bead, k = 1, 2, . . . , N. $\mathbf{R}_k$ depends on time, i.e., $\mathbf{R}_k = \mathbf{R}_k(t)$. Neglecting the interactions between beads, i.e., assuming a Θ-solvent, the energy of the chain is given by
$$U(\{\mathbf{R}_k\}) = \frac{K}{2} \sum_{n=1}^{N+1} \left[\mathbf{R}_n - \mathbf{R}_{n-1}\right]^2. \tag{5.100}$$
Here K denotes the (entropic) spring constant $K = 3k_B T/b^2$. The spring constant is chosen in such a way that, according to the equipartition theorem, Eq. 2.36, the mean-squared length of each bond is $b^2$. For convenience we introduced in Eq. 5.100 hypothetical beads with indices n = 0 and n = N + 1 assumed to be at positions
$$\mathbf{R}_0 = \mathbf{R}_1, \qquad \mathbf{R}_{N+1} = \mathbf{R}_N. \tag{5.101}$$
Since each bond between two neighboring monomers is Gaussian distributed, we know that $\mathbf{R}_{ij} = \mathbf{R}_j - \mathbf{R}_i$ is Gaussian distributed. This is so because $\mathbf{R}_{ij} = \sum_{k=i}^{j-1} (\mathbf{R}_{k+1} - \mathbf{R}_k)$ (assuming here j > i) is the sum of Gaussian distributed vectors, see Eq. 5.100. According to Eqs. A.17 and A.18, $\mathbf{R}_{ij}$ is then also Gaussian with
$$\left\langle \mathbf{R}_{ij}^2 \right\rangle = b^2\, |i - j|. \tag{5.102}$$
Note that this model features Gaussian properties on all length scales, not only on large scales as for the polymer models discussed in Chapter 3. A Gaussian distribution is not very realistic on the microscopic monomer scale but we learned in Chapter 3 that usually the microscopic features do not matter for the understanding of the polymer behavior at larger scales. In the Rouse model the chain’s dynamics are described by N coupled Langevin equations, one for each monomer. The equation for the n-th bead is as follows:
$$\zeta\, \frac{d\mathbf{R}_n(t)}{dt} = -\frac{\partial U(\{\mathbf{R}_k(t)\})}{\partial \mathbf{R}_n(t)} + \mathbf{L}(n, t). \tag{5.103}$$
Here $U(\{\mathbf{R}_n\})$ is given by Eq. 5.100 and $\mathbf{L}(n, t)$ are Gaussian random forces with
$$\langle L_i(n, t) \rangle = 0 \tag{5.104}$$
and
$$\langle L_i(n, t)\, L_j(n', t') \rangle = 2\zeta k_B T\, \delta_{ij}\, \delta_{nn'}\, \delta(t - t'). \tag{5.105}$$
The indices i and j denote the components of the force vector, i.e., i, j = X, Y, Z, and ζ is the friction constant of a monomer. The set of N Langevin equations, Eq. 5.103, is thus a straightforward generalization of the Langevin equation of the Brownian particle, Eq. 5.88.
Equation 5.103 with the potential 5.100 is linear and hence the dynamics of the chain decouples in the X-, Y- and Z-directions. For instance, for the Y-direction of $\mathbf{R}_n$ one finds
$$\zeta\, \frac{\partial Y_n(t)}{\partial t} = K \left(Y_{n+1}(t) + Y_{n-1}(t) - 2Y_n(t)\right) + L_Y(n, t). \tag{5.106}$$
The mathematics simplifies by regarding the suffix n as being continuous. The connectivity term $Y_{n+1} + Y_{n-1} - 2Y_n$ then transforms into the second derivative $\partial^2 Y_n/\partial n^2$. In the continuum limit, Eq. 5.106 takes the form
$$\zeta\, \frac{\partial Y_n(t)}{\partial t} = K\, \frac{\partial^2 Y_n(t)}{\partial n^2} + L_Y(n, t) \tag{5.107}$$
with the boundary conditions
$$\left.\frac{\partial Y_n(t)}{\partial n}\right|_{n=0,\,N} = 0, \tag{5.108}$$
the continuous version of Eq. 5.101. One finds analogous equations for the X- and Z-component of $\mathbf{R}_n$. Any chain conformation can be written in terms of a Fourier series:
$$Y_n(t) = Y(0, t) + 2 \sum_{p=1}^{\infty} Y(p, t) \cos\!\left(\frac{p\pi n}{N}\right). \tag{5.109}$$
We chose here the orthonormal set given by Eq. E.18, for which each term fulfills the boundary conditions, Eq. 5.108, separately. The Fourier coefficients Y(p, t), p = 0, 1, . . . , are given by:
$$Y(p, t) = \frac{1}{N} \int_0^N \cos\!\left(\frac{p\pi n}{N}\right) Y_n(t)\, dn. \tag{5.110}$$
These are the normal coordinates, called Rouse modes in this model, for which Eq. 5.107 takes the form
$$\zeta_p\, \frac{\partial Y(p, t)}{\partial t} = -K_p\, Y(p, t) + \tilde{L}_Y(p, t) \tag{5.111}$$
where
$$\zeta_0 = N\zeta, \qquad \zeta_p = 2N\zeta \tag{5.112}$$
for p = 1, 2, . . . and
$$K_p = \frac{2\pi^2 K}{N}\, p^2 = \frac{6\pi^2 k_B T}{N b^2}\, p^2 \tag{5.113}$$
for p = 0, 1, 2, . . . Note that the dynamics of the different modes is completely decoupled since Eq. 5.111 does not contain any cross terms. One arrives at Eq. 5.111 by performing the transformation, Eq. 5.110, on both sides of Eq. 5.107. The symbol $\tilde{L}_Y(p, t)$ on the rhs of Eq. 5.111 denotes the Fourier transform of the thermal noise:
$$\tilde{L}_Y(p, t) = \frac{\zeta_p}{\zeta N} \int_0^N \cos\!\left(\frac{p\pi n}{N}\right) L_Y(n, t)\, dn. \tag{5.114}$$
Thus the Fourier transformed forces fulfill
$$\left\langle \tilde{L}_Y(p, t) \right\rangle = 0 \tag{5.115}$$
and
$$\left\langle \tilde{L}_Y(p, t)\, \tilde{L}_Y(q, t') \right\rangle = 2\zeta_p k_B T\, \delta_{pq}\, \delta(t - t'). \tag{5.116}$$
To calculate Eq. 5.116, we used Eq. 5.105 but with $\delta_{nn'}$ replaced by its continuous version, $\delta(n - n')$. We have chosen all the definitions so that the Langevin equation for each Rouse mode, Eq. 5.111, has the same form as the Langevin equation for a particle with friction constant $\zeta_p$ in a quadratic potential with spring constant $K_p$, see Eq. 5.88. Also the Fourier transformed noises fulfill the corresponding relations (compare Eqs. 5.90 and 5.94 with Eq. 5.116). Suppose we start from a given chain conformation $Y_n(0)$ at t = 0. Equation 5.110 then allows us to determine the values of the Fourier coefficients Y(p, 0). For the sake of simplicity, let us first consider the behavior of the Y(p, t) averaged over different realizations, $\langle Y(p, t) \rangle$, for p > 0 before solving the full problem. From Eqs. 5.111 and 5.115 it follows that $\langle Y(p, t) \rangle$ obeys
$$\frac{\partial \langle Y(p, t) \rangle}{\partial t} + \frac{p^2}{\tau_R}\, \langle Y(p, t) \rangle = 0 \tag{5.117}$$
where we introduced the so-called Rouse time $\tau_R$:
$$\tau_R = \frac{p^2 \zeta_p}{K_p} = \frac{\zeta b^2 N^2}{3\pi^2 k_B T}. \tag{5.118}$$
Therefore, for a given initial chain configuration, each mode decays according to $\langle Y(p, t) \rangle = Y(p, 0)\, e^{-p^2 t/\tau_R}$. The slowest mode is the mode with p = 1, which decays with the relaxation time $\tau_R$, Eq. 5.118. Since this mode is proportional to $\cos(\pi n/N)$ it describes the overall orientation of the chain in the Y-direction. The
Figure 5.14 The Rouse modes of an N = 5-chain. Displayed are contributions of the different modes to Yn (t). Each mode decays exponentially with a relaxation time τ p = τ R / p2 . Modes with t ≥ τ p are shown in gray. Note that only the vertical direction corresponds to the position in space; the horizontal axis gives the monomer position n.
Rouse time thus corresponds to the rotational relaxation time of the chain. Modes with larger values of $p$ describe features of the chain conformation on smaller scales that decay faster with relaxation times

$$\tau_p = \frac{\zeta_p}{K_p} = \frac{\tau_R}{p^2}. \qquad (5.119)$$

Figure 5.14 shows schematically the decay of an initial conformation of a chain with 5 monomers in the different modes. At $t = 0$ each mode has some given amplitude. Once a time $t \ge \tau_p$ has passed, the $p$-th mode has relaxed to almost zero and its conformation is shown in gray. The complete solution of the Langevin equation 5.107 for the $p$-th normal coordinate is straightforward since Eq. 5.111 has mathematically the same structure as Eq. 5.96 which is solved by Eq. 5.97. Hence

$$Y(p,t) = Y(p,0)\,e^{-p^2 t/\tau_R} + \frac{1}{\zeta_p}\,e^{-p^2 t/\tau_R}\int_0^t e^{p^2 t'/\tau_R}\,\tilde{L}_Y(p,t')\,dt'. \qquad (5.120)$$
From this we now derive the explicit time dependence of the mean-squared displacement of the chain’s center of mass and that of a tagged bead. We begin with the motion of the center of mass. The $Y$-component of the trajectory of the center of mass is given by the 0th Rouse mode since

$$Y_{CM}(t) = \frac{1}{N}\int_0^N Y_n(t)\,dn = Y(0,t). \qquad (5.121)$$
Using Eq. 5.120 with p = 0 and Eq. 5.116 with p = q = 0, we obtain the mean-squared displacement of the chain’s center of mass in the Y -direction:
$$\langle [Y_{CM}(t) - Y_{CM}(0)]^2\rangle = \frac{2k_B T}{\zeta_0}\,t = \frac{2k_B T}{\zeta N}\,t. \qquad (5.122)$$

This means that the diffusion constant of the whole chain is given by $D = k_B T/(\zeta N)$, $1/N$ times the value for a single monomer. In other words, the chain has a friction constant that is $N$ times larger than that of a single monomer. This is not unexpected: the friction forces of all the $N$ monomers simply add up when the chain is dragged through the solution. The behavior of a tagged bead, say at one of the chain’s ends, is more complicated. We aim at calculating the mean square displacement of the bead with $n = 0$, namely the quantity $\langle [Y_0(t) - Y_0(0)]^2\rangle$. As a first step, we write $Y_0(t) - Y_0(0)$ as the sum over the Rouse modes. From Eq. 5.109 we find

$$Y_0(t) - Y_0(0) = Y(0,t) - Y(0,0) + 2\sum_{p=1}^{\infty}\left(Y(p,t) - Y(p,0)\right). \qquad (5.123)$$

The second step consists of inserting the solutions given by Eq. 5.120 into the rhs of Eq. 5.123. In the third step one has to square the resulting expression, which results in many terms. Most of the terms vanish when, in the fourth step, the average is taken. Since we are not interested in a specific conformation at $t = 0$, we need to take two types of averages, one over the thermal noise and one over the starting conformation at $t = 0$. We assume here that at $t = 0$ the chain is in thermal equilibrium and hence, according to the equipartition theorem, obeys $\langle Y(p,0)\,Y(q,0)\rangle = \delta_{pq}\,k_B T/K_p$. After a longer calculation one arrives at

$$\langle [Y_0(t) - Y_0(0)]^2\rangle = \frac{2k_B T}{\zeta N}\,t + \frac{4k_B T}{\zeta N}\sum_{p=1}^{\infty}\int_0^t e^{-p^2\tau/\tau_R}\,d\tau. \qquad (5.124)$$
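The center-of-mass result, Eq. 5.122, can be checked with a small simulation. The following is only a minimal sketch: it integrates a discrete one-dimensional Rouse chain with a simple Euler-Maruyama step, and the function name, the parameter values (in arbitrary units) and the choice of integrator are illustrative assumptions, not taken from the text.

```python
import numpy as np

def simulate_rouse_cm_msd(n_beads=10, n_chains=2000, n_steps=500, dt=0.01,
                          zeta=1.0, k_spring=3.0, kT=1.0, seed=0):
    """Euler-Maruyama integration of a discrete 1D Rouse chain (hypothetical helper).

    Returns the elapsed time and the center-of-mass mean-squared
    displacement averaged over n_chains independent chains.
    """
    rng = np.random.default_rng(seed)
    # All beads start at the origin; the center-of-mass motion does not
    # depend on the initial conformation because the spring forces cancel.
    y = np.zeros((n_chains, n_beads))
    cm_start = y.mean(axis=1)
    noise_amplitude = np.sqrt(2.0 * kT * dt / zeta)
    for _ in range(n_steps):
        # Spring forces between neighbouring beads; the end beads have
        # only one neighbour (free ends).
        force = np.zeros_like(y)
        force[:, 1:] += k_spring * (y[:, :-1] - y[:, 1:])
        force[:, :-1] += k_spring * (y[:, 1:] - y[:, :-1])
        y += force * dt / zeta + noise_amplitude * rng.standard_normal(y.shape)
    elapsed = n_steps * dt
    msd = np.mean((y.mean(axis=1) - cm_start) ** 2)
    return elapsed, msd

if __name__ == "__main__":
    t, msd = simulate_rouse_cm_msd()
    print("measured:", msd, " Eq. 5.122:", 2.0 * 1.0 * t / (1.0 * 10))
```

With the default values the measured value should scatter by a few percent around $2k_BT\,t/(\zeta N)$, since only 2000 chains are averaged.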
Figure 5.15 With increasing time a given monomer (highlighted in red in the configuration on the left) feels more and more the presence of the neighboring monomers. The three panels correspond to the times $t = \zeta b^2/(3\pi^2 k_B T)$, $t = 7^2\,\zeta b^2/(3\pi^2 k_B T)$ and $t = 15^2\,\zeta b^2/(3\pi^2 k_B T)$. As a result the diffusion of this monomer slows down until the Rouse time, after which the monomer diffuses on average with the whole chain, see Eq. 5.126. For the sake of simplicity, we depict the chain in a frozen conformation.
The first term on the rhs of Eq. 5.124 corresponds to the diffusion of the whole chain, Eq. 5.122, that each monomer has to follow for larger times. The second term is important for short times, $t < \tau_R$, when the internal relaxation modes of the chain contribute to its dynamics, but becomes negligible compared to the whole chain diffusion for larger times, $t \gg \tau_R$. In fact, the second term of Eq. 5.124 can be well estimated for $t \ll \tau_R$ because then, to a very good approximation, the summation over $p$ can be replaced by an integration:

$$\frac{4k_B T}{\zeta N}\sum_{p=1}^{\infty}\int_0^t d\tau\, e^{-p^2\tau/\tau_R} \approx \frac{4k_B T}{\zeta N}\int_0^t d\tau \int_0^{\infty} dp\, e^{-p^2\tau/\tau_R} = 4b\,\sqrt{\frac{k_B T}{3\pi\zeta}}\; t^{1/2}. \qquad (5.125)$$

To summarize, the mean-squared displacement of monomer $n = 0$ obeys

$$\langle [Y_0(t) - Y_0(0)]^2\rangle \approx \begin{cases} 4b\,\sqrt{\dfrac{k_B T}{3\pi\zeta}}\; t^{1/2} & \text{for } t \ll \tau_R \\[2mm] \dfrac{2k_B T}{\zeta N}\, t & \text{for } t \gg \tau_R. \end{cases} \qquad (5.126)$$
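The two limits in Eq. 5.126 can be compared with a direct numerical evaluation of the mode sum in Eq. 5.124. The sketch below carries out the time integral analytically and truncates the sum at a large but finite `p_max`; the function names, the cut-off and the parameter values are assumptions made for illustration. Note also that the continuum model sums over arbitrarily large $p$, whereas a real chain of $N$ monomers has only about $N$ modes.

```python
import numpy as np

def rouse_time(zeta, b, n_monomers, kT):
    """Rouse time of Eq. 5.118."""
    return zeta * b**2 * n_monomers**2 / (3.0 * np.pi**2 * kT)

def msd_end_monomer(t, zeta=1.0, b=1.0, n_monomers=100, kT=1.0, p_max=200_000):
    """Eq. 5.124 with the mode sum truncated at p_max (illustrative cut-off)."""
    tau_r = rouse_time(zeta, b, n_monomers, kT)
    p = np.arange(1, p_max + 1, dtype=float)
    # int_0^t exp(-p^2 s / tau_R) ds, evaluated in closed form for each mode
    mode_integrals = tau_r / p**2 * (1.0 - np.exp(-p**2 * t / tau_r))
    prefactor = kT / (zeta * n_monomers)
    return 2.0 * prefactor * t + 4.0 * prefactor * mode_integrals.sum()

def msd_short_time(t, zeta=1.0, b=1.0, kT=1.0):
    """Upper line of Eq. 5.126, valid for t << tau_R."""
    return 4.0 * b * np.sqrt(kT / (3.0 * np.pi * zeta)) * np.sqrt(t)

def msd_long_time(t, zeta=1.0, n_monomers=100, kT=1.0):
    """Lower line of Eq. 5.126, valid for t >> tau_R."""
    return 2.0 * kT * t / (zeta * n_monomers)

if __name__ == "__main__":
    tau_r = rouse_time(1.0, 1.0, 100, 1.0)
    for factor in (1e-4, 1e-2, 1.0, 1e2):
        t = factor * tau_r
        print(factor, msd_end_monomer(t), msd_short_time(t), msd_long_time(t))
```

Printing the three columns side by side shows the crossover from the $t^{1/2}$ regime to ordinary diffusion around $t \approx \tau_R$.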
The behavior of the monomer at short times, $t < \tau_R$, is remarkable because the mean square displacement only grows with $\sqrt{t}$ instead of $t$, as we might have expected for a diffusing monomer. In short notation we derived $Y_0(t) \sim t^{1/4}$ here, and similarly we
would have found $Y_n(t) \sim t^{1/4}$ for any other monomer at short enough times. How can we understand this subdiffusive behavior? Let us start from the observation that the Langevin equation of the Rouse model, Eq. 5.107, is formally identical to the diffusion equation, Eq. 5.40, with $K/\zeta$ corresponding to the diffusion constant $D$. Suppose at $t = 0$ the chain features a kink around monomer $m$. Employing the diffusion analogy, we know that the kink is smeared out over time as in Eq. 5.42 and, according to Eq. 5.43, affects $M(t) \sim \sqrt{(K/\zeta)\,t} \sim \sqrt{k_B T\,t/(\zeta b^2)}$ neighboring monomers. Inverting this relation, one can see that $t$ just scales as the “Rouse time” $\tau_R(M) \sim \zeta b^2 M^2/k_B T$ of this subchain of $M$ monomers. As monomer $m$ moves together with the growing cluster of $M(t)$ collectively diffusing monomers, see Fig. 5.15, its mean square displacement is given by

$$\langle [Y_m(t) - Y_m(0)]^2\rangle \sim \frac{k_B T}{\zeta M(t)}\,t \sim b\,\sqrt{\frac{k_B T}{\zeta}}\; t^{1/2}. \qquad (5.127)$$

This is actually Eq. 5.125 up to a numerical constant. The subdiffusive behavior can thus be understood as resulting from the diffusion in space of a growing object whose diffusion constant decreases in time according to $D(t) \sim 1/M(t)$. The growth $M(t) \sim t^{1/2}$ of the collectively moving subchain is the result of yet another diffusion process that is mediated along the chain via the elastic bonds. Unfortunately the Rouse model does not compare favorably with experiments on single polymer chains in $\theta$-solvents. It is generally found that the diffusion constant $D$ of such chains and their slowest time scale, the rotational relaxation time $\tau_r$, scale with the polymerization degree $N$ as

$$D \sim N^{-1/2}, \qquad \tau_r \sim N^{3/2}. \qquad (5.128)$$
However, the Rouse model predicts $D \sim N^{-1}$, Eq. 5.122, and $\tau_r = \tau_R \sim N^2$, Eq. 5.118. The Rouse model appears to have a reasonable microscopic representation of the polymer and is even exactly solvable. So what could have gone wrong? As it turns out, the Rouse model lacks a crucial ingredient: the hydrodynamic interaction between the monomers.
In order to account for this effect, the Langevin equation for the $n$-th bead, Eq. 5.103 in the Rouse model, needs to be generalized to

$$\frac{d\mathbf{R}_n(t)}{dt} = \sum_m \mathbf{H}_{nm}\left[-\frac{\partial U(\{\mathbf{R}_k(t)\})}{\partial \mathbf{R}_m(t)} + \mathbf{L}(m,t)\right] \qquad (5.129)$$

where $\mathbf{H}_{nm}$ is called the mobility tensor. The potential $U$ is still given by Eq. 5.100. Equation 5.129 has on the lhs the velocity of the $n$-th bead, which is limited by the friction that slows the bead down, and on the rhs all the forces that drive the bead. What is special about this equation is that forces acting on other beads, i.e., $\mathbf{L}(m,t)$ for $m \neq n$, are felt by the $n$-th bead as well. These forces are not mediated along the elastic backbone of the chain but through the solvent. How does this work? First of all one needs to realize that water at the typical length scales and velocities that occur in such a microscopic system appears very different from the way we experience water in our macroscopic world. Water actually appears much more viscous, like honey. In fact, the flow pattern produced by moving a spoon in a glass of honey is completely different from the one produced by moving a spoon in a cup of coffee. While the honey flows very regularly around the spoon, in the latter case, after adding milk, a very chaotic flow pattern can be observed. This is the difference between the case of so-called low Reynolds numbers and that of high Reynolds numbers. The Reynolds number is a dimensionless quantity defined as

$$\mathrm{Re} = \frac{\upsilon R \rho}{\eta}. \qquad (5.130)$$

Here $\upsilon$ and $R$ denote the typical velocity and spatial extension of the flow whereas $\rho$ and $\eta$ denote the mass density and the viscosity of the fluid. The number is just the ratio of the typical inertial force (mass times acceleration) and the typical friction force. The inertial force scales like $\rho R^3 \upsilon^2/R$ where $\upsilon^2/R$ is the acceleration since the flow changes direction in a typical time $R/\upsilon$. The friction force scales like $\eta R \upsilon$, see, e.g., Eq. 5.45. When Re is small, $\mathrm{Re} \ll 1$, the friction dominates over the inertia and one has a very regular flow pattern, a so-called laminar flow that stops immediately after one stops stirring the fluid (“honey”).
On the other hand, when Re is large, $\mathrm{Re} \gg 1$, the inertia dominates over the friction and one finds a very irregular flow, a so-called turbulent flow that keeps moving after one
Figure 5.16 The hydrodynamic interaction at low Reynolds numbers. (a) A force $\mathbf{F}_m$ on particle $m$ creates a velocity field $\mathbf{v}(\mathbf{r})$ in the fluid around it. (b) This flow field carries other particles along, e.g., particle $n$ at position $\mathbf{r}$ moves with velocity $\mathbf{v}_n = \mathbf{v}(\mathbf{r})$.
stops stirring (“coffee”). Let us now calculate the Reynolds number in the two cases. We start with the cup of coffee: the viscosity of water is $9\times10^{-4}$ Pa s and its density 1000 kg m$^{-3}$, the size of the cup is $R \approx 10$ cm and the speed of a stirring spoon is around $\upsilon \approx 1$ cm/s. We find then from Eq. 5.130 $\mathrm{Re} \approx 10^3$, i.e., we expect turbulent flow. On the other hand, when stirring honey in a glass, one now has a fluid with a much higher viscosity, $\eta \approx 5$ Pa s. Keeping the other numbers the same, one finds $\mathrm{Re} \approx 0.2$, i.e., laminar flow. In the case of low Reynolds numbers one can calculate the mobility tensor $\mathbf{H}_{nm}$, the so-called Oseen tensor (Doi and Edwards, 1986):

$$\mathbf{H}_{nm} = \mathbf{H}(\mathbf{R}_{nm}) = \begin{cases} \dfrac{\mathbf{I}}{\zeta} & \text{for } n = m \\[2mm] \dfrac{1}{8\pi\eta\,|\mathbf{R}_{nm}|}\left[\mathbf{I} + \hat{\mathbf{R}}_{nm}\hat{\mathbf{R}}_{nm}\right] & \text{for } n \neq m. \end{cases} \qquad (5.131)$$

Here $\mathbf{R}_{nm}$ is the distance vector between bead $n$ and bead $m$, $\mathbf{R}_{nm} = \mathbf{R}_n - \mathbf{R}_m$, and $\hat{\mathbf{R}}_{nm}$ denotes the unit vector in that direction, $\hat{\mathbf{R}}_{nm} = \mathbf{R}_{nm}/|\mathbf{R}_{nm}|$. $\mathbf{I}$ is the unit tensor, $I_{\alpha\beta} = \delta_{\alpha\beta}$. What is the meaning of Eq. 5.131? Suppose one applies a force $\mathbf{F}_m$ on a point-like particle $m$. This particle will eventually be one of the monomers of the polymer under consideration. As the force acts on this particle, it moves with a velocity $\mathbf{v}_m = \mathbf{F}_m/\zeta$. But this is not all. The particle “drags” fluid along and creates the velocity field

$$\mathbf{v}(\mathbf{r}) = \mathbf{H}(\mathbf{r})\,\mathbf{F}_m \qquad (5.132)$$

with $\mathbf{H}(\mathbf{r})$ given by Eq. 5.131. This velocity field is schematically depicted in Fig. 5.16(a). Other particles (e.g., other monomers) that happen to be in this velocity field simply move along with the fluid, e.g., particle $n$ at position $\mathbf{r}$ moves with $\mathbf{v}_n = \mathbf{v}(\mathbf{r})$, cf. Fig. 5.16(b). So far we discussed the case when forces act on one particle only. What about the case when forces act on $N$ particles? The equations that govern the hydrodynamics for low Reynolds numbers are linear in the velocities and forces (Doi and Edwards, 1986). That means that one can simply sum up the effects of the forces of all $N$ particles, $m = 1, 2, \ldots, N$, on particle $n$ to find its velocity:

$$\mathbf{v}_n = \sum_m \mathbf{H}_{nm}\,\mathbf{F}_m. \qquad (5.133)$$
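The superposition of Eq. 5.133 is easy to evaluate for a handful of point-like beads. The following is a minimal sketch, assuming fixed bead positions and given forces; the function names and the test configuration are hypothetical and only serve to illustrate Eqs. 5.131 to 5.133.

```python
import numpy as np

def oseen_tensor(r_vec, eta):
    """Oseen tensor H(r) of Eq. 5.131 for r != 0, returned as a 3x3 matrix."""
    r_vec = np.asarray(r_vec, dtype=float)
    distance = np.linalg.norm(r_vec)
    r_hat = r_vec / distance
    return (np.eye(3) + np.outer(r_hat, r_hat)) / (8.0 * np.pi * eta * distance)

def bead_velocities(positions, forces, eta, zeta):
    """Velocities v_n = sum_m H_nm F_m (Eq. 5.133) for point-like beads."""
    positions = np.asarray(positions, dtype=float)
    forces = np.asarray(forces, dtype=float)
    velocities = forces / zeta  # self term, H_nn = I / zeta (new array)
    n_beads = len(positions)
    for n in range(n_beads):
        for m in range(n_beads):
            if m != n:
                h_nm = oseen_tensor(positions[n] - positions[m], eta)
                velocities[n] += h_nm @ forces[m]
    return velocities

if __name__ == "__main__":
    # Two beads a unit distance apart along x; only bead 0 is pushed.
    positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    print(bead_velocities(positions, [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]], 1.0, 6.0 * np.pi))
    print(bead_velocities(positions, [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]], 1.0, 6.0 * np.pi))
```

In this example the second bead is carried along twice as fast when the force points along the line connecting the beads as when it points perpendicular to it, which follows from the $\mathbf{I} + \hat{\mathbf{R}}_{nm}\hat{\mathbf{R}}_{nm}$ structure of the Oseen tensor entering Eq. 5.133.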
Equation 5.133 is precisely Eq. 5.129 with the forces being the sum of the “spring” and the Langevin forces. Equation 5.129 takes into account the hydrodynamic interaction between the monomers correctly and actually leads to the correct scaling of the diffusion constant and the rotational relaxation time, Eq. 5.128. To show this, we can, as a first step, go to the continuum limit of the chain leading to

$$\frac{\partial \mathbf{R}_n(t)}{\partial t} = \int_0^N \mathbf{H}_{nm}\left[K\,\frac{\partial^2 \mathbf{R}_m}{\partial m^2} + \mathbf{L}(m,t)\right] dm. \qquad (5.134)$$
However, it is far from obvious how to proceed further since the equation is non-linear in $\mathbf{R}_n - \mathbf{R}_m$ due to the complicated functional dependence of the Oseen tensor, Eq. 5.131. Zimm (Zimm, 1956) devised a scheme that allows one to find an approximate solution to Eq. 5.134 which shows the right scaling given by Eq. 5.128. This scheme is called the pre-averaging approximation and is worked out in Appendix F. One is led to approximate equations that have exactly the same form as the one for the Rouse model, Eq. 5.111, but with the $\zeta_p$’s given by Eq. F.13. This now enables the calculation of the diffusion constant of the whole chain and the relaxation times of the different modes. The diffusion constant is given by

$$D = \frac{k_B T}{\zeta_0} = \frac{8 k_B T}{3\sqrt{6\pi^3}\,\eta b N^{1/2}} \qquad (5.135)$$
(cf. Eq. 5.122) and the relaxation time of the $p$-th mode by

$$\tau_p = \frac{\zeta_p}{K_p} = \frac{\tau_1}{p^{3/2}}. \qquad (5.136)$$

Here $\tau_1$ is the slowest relaxation time:

$$\tau_1 = \tau_r = \frac{\eta b^3 N^{3/2}}{\sqrt{3\pi}\,k_B T}. \qquad (5.137)$$

Both the diffusion constant $D$, Eq. 5.135, and the rotational relaxation time $\tau_r$, Eq. 5.137, show the same scaling as found in experiments, Eq. 5.128. This leaves only the question what a more accurate calculation without a pre-averaging approximation would find. In fact, a calculation that solved the eigenvalue problem associated with Eq. F.4 (Zimm, 1956) leads to the same scaling as in Eqs. 5.135 and 5.137, albeit with slightly different numerical factors: 0.192 instead of $8/(3\sqrt{6\pi^3}) \cong 0.196$ and 0.398 instead of $1/\sqrt{3\pi} \cong 0.326$. What is the physical interpretation of these results? We found in Eq. F.3 that the hydrodynamic interaction decays on average very slowly with the chemical distance along the chain, namely $h(n-m) \sim |n-m|^{-1/2}$. This means that a monomer moving in the solvent creates a flow field that essentially pulls all the monomers and the fluid in between with it. Due to this strong hydrodynamic coupling, the Zimm chain behaves effectively like a sphere of the same size. Using Stokes’ law, Eq. 5.45, with the coil radius $a = bN^{1/2}/2$ one predicts the diffusion constant $D \approx k_B T/(3\pi\eta b N^{1/2})$ that has the same scaling as Eq. 5.135 but with a different numerical factor. A sphere also performs rotational diffusion, i.e., the orientational correlations of a sphere’s axis decay as $\langle \mathbf{u}(t)\cdot\mathbf{u}(0)\rangle = \exp(-t/\tau_r)$ with a rotational relaxation time

$$\tau_r = \frac{4\pi\eta a^3}{k_B T}. \qquad (5.138)$$
From the rigid sphere analogy one predicts $\tau_r = \pi\eta b^3 N^{3/2}/(2k_B T)$ for the Zimm chain, which scales indeed like Eq. 5.137. The Zimm model can be extended to good solvent conditions (Doi and Edwards, 1986). Using again a pre-averaging-type of approximation one finds $D \sim N^{-3/5}$ and $\tau_r \sim N^{9/5}$. This result shows the same scaling as a rigid sphere of size $bN^{3/5}$, see Eq. 3.34.
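The numerical prefactors quoted above for the pre-averaged Zimm model and for the rigid-sphere estimate can be compared directly. The short sketch below evaluates them; the function names are illustrative, and Stokes friction $\zeta = 6\pi\eta a$ is assumed for the sphere, consistent with the estimate $D \approx k_B T/(3\pi\eta b N^{1/2})$ given in the text.

```python
import math

def zimm_diffusion_prefactor():
    """c in D = c * kT / (eta * b * N**0.5), pre-averaged Zimm model (Eq. 5.135)."""
    return 8.0 / (3.0 * math.sqrt(6.0 * math.pi**3))

def sphere_diffusion_prefactor():
    """Same prefactor for a Stokes sphere of radius a = b * N**0.5 / 2."""
    return 1.0 / (3.0 * math.pi)

def zimm_relaxation_prefactor():
    """c in tau_r = c * eta * b**3 * N**1.5 / kT, pre-averaged Zimm model (Eq. 5.137)."""
    return 1.0 / math.sqrt(3.0 * math.pi)

def sphere_relaxation_prefactor():
    """Same prefactor from Eq. 5.138 with a = b * N**0.5 / 2."""
    return 4.0 * math.pi * 0.5**3

if __name__ == "__main__":
    print("D prefactor:     Zimm", zimm_diffusion_prefactor(),
          " sphere", sphere_diffusion_prefactor())
    print("tau_r prefactor: Zimm", zimm_relaxation_prefactor(),
          " sphere", sphere_relaxation_prefactor())
```

The comparison makes explicit that the sphere analogy reproduces the $N$-dependence but not the prefactors, which differ by factors of order one.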
Figure 5.17 Polymers usually exhibit dynamics that is governed by hydrodynamic interactions as described by the Zimm model. There are, however, various exceptions, e.g., (a) a polymer extended by an external force, $+\mathbf{f}$ and $-\mathbf{f}$ at its two ends (same polymer as in Fig. 3.11 with the velocity field $\mathbf{v}(\mathbf{r})$ induced by a point force $\mathbf{F}$, Fig. 5.16, superimposed) and (b) a polymer molecule in a semidilute solution where the velocity field is truncated beyond a certain distance, the hydrodynamic screening length, as indicated here by a dashed circle. In case (a) the velocity field is not perturbed by the chain, but different blobs hardly see each other due to the extended chain configuration. In case (b) the velocity field vanishes beyond the screening length.
The Zimm model describes the so-called non-draining limit whereas the Rouse model corresponds to the free-draining limit where the solvent can freely flow through the polymer coil. At first glance, it seems that polymers always show Zimm-like behavior and that the Rouse model is just an academic, albeit elegant, exercise. However, it turns out that there are several situations where polymers show Rouse-like behavior. Two important examples are shown in Fig. 5.17: (a) elongated polymers (e.g., chains elongated by an external tension) and (b) polymers in semidilute solutions. Let us first discuss the case of a polymer in a $\theta$-solvent under an external force $f$, see Fig. 5.17(a) (Pincus, 1997). We start from the Zimm model, Eq. 5.134, and employ again the pre-averaging approximation as detailed in Appendix F. Also in this case the conformational dynamics can be described by the superposition of (approximately) independent Rouse modes. Short wavelength modes with $p \gg N/g_P$ are essentially unperturbed by the force-induced chain stretching and show the usual Zimm dynamics
(compare Eq. F.10 with Eq. F.18). However, the tension has a dramatic effect on long wavelengths with $p \ll N/g_P$ for which $\zeta_p \approx \eta b^2 N f/k_B T$ is independent of $p$ (up to logarithmic corrections), as is the case for the Rouse model, see Eq. 5.112. This leads to Rouse-like scaling for the relaxation times of the modes with $p \ll N/g_P$, see Eq. F.19. One can again introduce a Rouse time, which has the form

$$\tilde{\tau}_R = \frac{p^2\zeta_p}{K_p} = \frac{\eta b^4 N^2 f}{(k_B T)^2} \qquad (5.139)$$
up to numerical constants. These results can be interpreted in terms of the Pincus blob picture that we introduced in Chapter 3, see Fig. 3.11. As discussed there, the chain can be described as a string of blobs of size $\xi_P = k_B T/f$ consisting of $g_P = (\xi_P/b)^2$ monomers. Inside the blobs the polymer statistics is the same as in the force-free case discussed above. We therefore expect full hydrodynamic coupling between the monomers for length scales smaller than $\xi_P$, i.e., modes with $p \gg N/g_P$ are described by the usual Zimm model, Eq. F.12. For length scales larger than $\xi_P$, i.e., for $p \ll N/g_P$, the stretched polymer behaves like a Rouse chain. The renormalized units, the effective monomers, are the $N/g_P$ blobs, each having a size $\xi_P$ with the friction constant of a Stokes sphere of corresponding size, $\tilde{\zeta} = \eta\xi_P$ (see Eq. 5.45). The Rouse time is then given by Eq. 5.118 but with $b$ replaced by $\xi_P$, with $\zeta$ replaced by $\tilde{\zeta}$ and with $N$ replaced by $N/g_P$. Up to numerical constants this gives $\tilde{\zeta}\,\xi_P^2 (N/g_P)^2/k_B T = \eta b^4 N^2/(\xi_P k_B T) = \eta b^4 N^2 f/(k_B T)^2$, the rescaled Rouse time of Eq. 5.139. That the hydrodynamic interaction between the Pincus blobs is negligible is due to the fact that they are arranged in a linear fashion so that the hydrodynamic interaction decays on average as $h(n-m) \sim |n-m|^{-1}$ (see Eq. F.15), which only leads to a logarithmic $p$-dependence of $h_{pp}$ in the small-$p$ limit, Eq. F.18. The second example mentioned above is that of semidilute polymer solutions where the chains overlap each other. We discussed in Chapter 3 that according to the Flory theorem excluded volume effects are screened, see Fig. 3.17. Likewise one can argue that the hydrodynamic interaction is also screened, see Fig. 5.17(b). Hydrodynamic screening is to be expected for the following reason: According to Eqs. 5.131 and 5.132, a force applied on the fluid should induce a velocity field proportional to $1/(\eta r)$ for short distances. On the other hand, at large length scales the polymer solution itself looks like a fluid, albeit with a much larger viscosity $\tilde{\eta} \gg \eta$. The velocity field at large distances from the point where the force is applied should thus behave as $1/(\tilde{\eta} r) \approx 0$, cf. Fig. 5.17(b). That means that the hydrodynamic interaction between monomers of a chain sufficiently far apart from each other is negligible. Thus a chain in a semidilute solution shows Rouse behavior. This is even true for chains in a good solvent since according to the Flory theorem the excluded volume is screened as well. For very long chains, however, there is an additional complication since the chains cannot cross through each other. In that case a chain is effectively trapped between other chains in a tube-like cage out of which it can escape only slowly. We shall come back to this problem in Chapter 8 of the book when we describe the structure of whole chromosomes.
Problems

5.1 Non-Markovian stochastic process
Consider a stochastic process that takes the values 0 and 1 and where the time $t$ only takes three values $t_1 < t_2 < t_3$. In principle there are eight possible realizations of this process. We attribute only to the following four a finite and equal probability:

1, 0, 0; 0, 1, 0; 0, 0, 1; 1, 1, 1.

Here the first position is the value at time $t_1$, etc. (i) Show that this process is not a Markov process. (ii) Show that it (nevertheless) obeys the Chapman–Kolmogorov equation. (iii) How is this possible? Does the fact that the transition probabilities obey the Chapman–Kolmogorov equation not automatically ensure that the process is Markovian?

5.2 Ornstein–Uhlenbeck process
Show that the autocorrelation function of the Ornstein–Uhlenbeck process is given by Eq. 5.22.

5.3 Rouse dynamics
Show that the mean-squared displacement of a single bead in a Rouse chain is given by Eq. 5.124.

5.4 Dynamics of a tagged monomer in a swollen polymer chain
Below Eq. 5.126 we provided an argument that explains the very slow displacement of a tagged monomer for short times, namely $Y(t) \sim t^{1/4}$. This applies for the case of an ideal polymer in a $\theta$-solvent. Suppose instead you have a good solvent in which the polymer forms a swollen coil. At equilibrium, the size of a subchain (say from monomer $i$ to monomer $j$) of this swollen polymer scales according to Eq. 3.55 with the Flory exponent $\nu \approx 3/5$. In this case, how does the displacement of a tagged monomer scale with time for very short times? Hints: Again, try to use an argument involving a growing cluster of monomers moving together, as we did for the $\theta$-solvent. However, you cannot use the argument with the diffusion along the chain. Instead, take the more general argument that the tagged monomer has just enough time to “explore” the whole growing cluster. This argument is astonishingly simple (only two lines) and leads to a $t^{\nu/(2\nu+1)}$-scaling instead of $t^{1/4}$.
Chapter 6
RNA and Protein Folding
6.1 RNA Folding

As mentioned in the introductory chapter, RNA and DNA are chemically very similar. Nevertheless, inside a living cell they behave radically differently. Why? This reflects the fact that DNA molecules are always paired, resulting in a stiff double helix, while RNA chains are single stranded. The DNA replication process automatically ensures that for each strand a complementary strand is also produced, as depicted in Fig. 1.2. On the other hand, RNA is produced as a single-stranded molecule, see Fig. 1.3. These molecules are rather flexible with an effective bond length of a few bp, as we have discussed in Section 4.4 on DNA melting. Since the RNA backbone is flexible, the molecule can easily fold onto itself. If a sequence of a few bases finds its counterpart somewhere else along the chain (e.g., if there is a sequence AUGGC and somewhere else GCCAU), these two stretches can hybridize with each other and form a short piece of RNA double helix. This might be biologically irrelevant for a messenger RNA en route to a ribosome, but in other cases the folding of an RNA chain is of vital importance, e.g., for transfer RNAs, the adapters between the RNA and the protein worlds that can do their job only because they are folded in a specific way, see Fig. 1.4. We outline in the following how one can calculate the so-called secondary structure of an RNA molecule, i.e., the set of base pairings that minimizes its free energy. Note that on top of this the RNA adopts a specific three-dimensional shape, its tertiary structure. Usually that higher-order structure does not disrupt the secondary structure since its formation involves smaller energy scales. We shall see that the problem of determining the secondary structure of an RNA chain can be simplified so that it is much easier to handle than the problem of protein folding discussed in the following section. In first approximation, the RNA chain folds such that it maximizes the number of CG and AU pairs. Two common ways of depicting the secondary structure are presented in Fig. 6.1. The structure in Fig. 6.1(a) roughly reflects the real spatial arrangement. The sugar-phosphate backbone is drawn as a solid line with the bases shown as vertices. Base pairing is indicated by dark blue and red lines between pairs of vertices. The structure shown here consists of two hairpins that “kiss” each other. Figure 6.1(b) shows the same RNA secondary structure, but now the backbone is stretched out into a straight line and the base pairings are indicated by arcs. This is a so-called arc or rainbow diagram. Note that the red arcs responsible for the “kissing” overlap with the two “rainbows” that form the stems
Figure 6.1 Two graphic representations, (a) and (b), of the same secondary structure (two “kissing” hairpins) of an RNA molecule; see text for details.
of the hairpins. Such overlapping bonds are called pseudoknots. Pseudoknots are relatively rare, which suggests that it is reasonable to neglect them when calculating the optimal secondary structure of an RNA molecule. In fact, the calculation of the minimum free energy structure then becomes trivial, as we shall see in the following. Before doing so, it is useful to look at two concrete examples that are depicted in Fig. 6.2. The smaller molecule, Fig. 6.2(a), shows a tRNA, namely exactly that of Fig. 1.4. It is 76 bases long and has 4 hairpins and three loops giving it the clover leaf structure that is characteristic for all tRNAs. On the bottom is a stretch of three bp, GAA, forming the anti-codon. The overhanging end at the top is the point where the amino acid (aa) is anchored, here phenylalanine (Phe). The second example is the ribonuclease P RNA, Fig. 6.2(b), a molecule that is involved in the processing of tRNAs. This molecule is 377 bases long and its secondary structure features various bulges (short mismatches), hairpin loops (capping of helices), internal loops (connecting two helices) and multibranched loops (connecting three or more helices). The structure also features two pseudoknots indicated by P1 and P2. The most obvious way of finding the optimal secondary structure would consist of trying out all possible pairings between the bases. Starting with the first base to the left, it might form a pair with any of the $N - 1$ other bases or it might not pair at all. This leads to $N$ possibilities. For each of these possibilities we can now choose the next free base to the right and go through its possible states, i.e., it can either be paired with one of the $N - 2$ or $N - 3$ remaining free bases or it can stay unpaired. Let us then take $N - 2$ as the lower estimate for the number of possible states of that base. Continuing like this up to the last base we have to check out more than

$$N(N-2)(N-4)\cdots 2 = 2^{N/2}\,(N/2)!$$

different configurations (assuming $N$ to be even). We are faced with an overwhelming number of configurations that grows with chain length essentially as $N^N$, see Stirling’s formula, Eq. 2.48. This means that it is impossible to go through all possible configurations of an RNA chain of any reasonable length. As we shall see now, the situation improves dramatically if one neglects the possibility of pseudoknots.
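To get a feeling for how quickly this lower estimate grows, one can evaluate it for the chain lengths mentioned above. The sketch below is purely illustrative (the function names are not from the text); it computes the product $N(N-2)(N-4)\cdots 2$ with exact integer arithmetic and checks it against the closed form $2^{N/2}(N/2)!$.

```python
import math

def pairing_count_lower_estimate(n_bases):
    """Product N (N-2) (N-4) ... 2 for an even number of bases N."""
    if n_bases <= 0 or n_bases % 2 != 0:
        raise ValueError("n_bases must be a positive even integer")
    count = 1
    for factor in range(n_bases, 0, -2):
        count *= factor
    return count

def pairing_count_closed_form(n_bases):
    """Closed form 2^(N/2) (N/2)! of the same product."""
    half = n_bases // 2
    return 2**half * math.factorial(half)

if __name__ == "__main__":
    for n in (10, 20, 76, 376):
        count = pairing_count_lower_estimate(n)
        assert count == pairing_count_closed_form(n)
        print(n, "bases:", len(str(count)), "decimal digits")
```

Already for a chain of the length of the tRNA in Fig. 6.2(a), $N = 76$, the estimate is a number with more than fifty digits.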
Figure 6.2 Secondary structures of RNA molecules: (a) tRNA specific for the aa phenylalanine (76 bases, with the anticodon marked) and (b) ribonuclease P RNA from Escherichia coli (377 bases, with the two pseudoknots marked P1 and P2) (Pace and Brown, 1995). The tRNA contains some unusual bases, produced by chemical modifications after the tRNA synthesis, e.g., Ψ, pseudouridine, that derives from U. The two structures also contain some non-canonical GU pairs, indicated by dots.
To define a secondary structure, let us label the bases from 1 to $N$. A base pair is allowed to form between two bases $i$ and $j$ only if $|j - i| \ge 4$ because there must usually be at least three unpaired bases in a hairpin loop (see, e.g., Fig. 6.2). With this restriction, however, the number of possible configurations still grows as $N^N$. A dramatic change is now brought about by restricting ourselves to so-called compatible pairings. The pair $(k, l)$ of bases $k$ and $l$ is said to be compatible with the pair $(i, j)$ if both pairs can be present simultaneously in a structure without forming a pseudoknot. Pairs are thus compatible if they are non-overlapping (e.g., if $i < j < k < l$) or if one pair is nested within the other pair (e.g., if $i < k < l < j$). Pseudoknots, i.e., interlocked pairs such as $i < k < j < l$, are not considered in the present calculation. The secondary structure that minimizes the free energy (excluding the possibility of pseudoknots) can then be found straightforwardly using so-called dynamic programming algorithms which, despite the name, have nothing to do with dynamics. A simple implementation of such an algorithm is the maximum matching model (Higgs, 2000). Here $\epsilon_{ij}$ is the energy of a bond between bases $i$ and $j$ with $\epsilon_{ij} = -1$ for a complementary bond and $\epsilon_{ij} = +\infty$ otherwise. We want to calculate $E_{i,j}$, the minimum energy of the RNA subchain that starts at base $i$ and ends at base $j$ with $i < j$. Suppose that base $j$ is bonded to another base $k \le j - 4$ in the subchain. This creates two stretches, one from $i$ to $k - 1$ and one from $k + 1$ to $j - 1$, that cannot interact with each other without forming a pseudoknot. The minimum energy of the subchain with pair $(k, j)$ is thus given by

$$E_{i,j}^{(k,j)} = E_{i,k-1} + E_{k+1,j-1} + \epsilon_{kj}. \qquad (6.1)$$

On the other hand, if base $j$ is not paired, then the minimum energy is $E_{i,j-1}$. Therefore, the minimum energy of the allowed configurations of a stretch of chain from $i$ to $j$ is

$$E_{i,j} = \min\left\{E_{i,j-1},\ \min_{i \le k \le j-4} E_{i,j}^{(k,j)}\right\}. \qquad (6.2)$$

Combining Eqs. 6.1 and 6.2 we can see that the minimum energy of any chain segment can always be expressed in terms of the minimum energy of smaller segments. We know by definition that $E_{i,j} = 0$ for $j - i < 4$. Thus we can build up the $E_{i,j}$ values for chains of
successively longer lengths until the minimum energy $E_{1,N}$ of the complete chain is obtained. All that we have to remember at each stage is the optimal configuration. The configuration corresponding to $E_{1,N}$ can then be found by a backtracking algorithm. The above example is, of course, too simple to describe real RNA. This requires a full set of energy parameters, including e.g., penalties for loop formation (Mathews et al., 1999). However, it is possible to predict the secondary structure of shorter molecules with a very simple extension of the maximum matching model, as you can convince yourself by working out Problem 6.1. Whatever the complications are, the algorithm scales in the end as $N^3$. This can be understood as follows. First one considers the $N - 4$ subchains of length 5 (one from base 1 to 5, one from base 2 to 6 and so on), then the $N - 5$ subchains of length 6 and so on. For each subchain of length $n$ one has to consider $n - 3$ possibilities, see Eq. 6.2. By going through all the possible subdivisions of all possible subchains, one finds $(N-4)\times(5-3) + (N-5)\times(6-3) + \ldots + 2\times(N-4) + 1\times(N-3)$ possibilities. This leads to

$$\sum_{k=5}^{N}(N-k+1)(k-3) = \frac{1}{6}\left(N^3 - 6N^2 + 5N + 12\right) \xrightarrow{N\to\infty} \frac{N^3}{6}. \qquad (6.3)$$

By removing the pseudoknots, the calculation time now grows as $N^3$ instead of $N^N$. This can be handled on a computer, even for large RNA chains. Biology often brings a new quality into play that systems of the non-living part of the world do not have. In fact, nature provides us with a powerful method to determine the secondary structure of a given RNA chain without having to do any calculation: comparative sequence analysis between different species. An example is given in Fig. 6.3 where the tRNAs for Ala of widely different species (a protozoan, a slime mould, yeast, rockcress, silkworm, a small fly, a frog, chicken and mouse) are aligned. Some of the bases differ between different species, but they are usually accompanied by a compensatory mutation in the base with which they are usually paired (e.g., A and U go over to G and C). The arc diagram on top of the sequences indicates the paired bases showing the typical cloverleaf structure of tRNAs, see Fig. 6.2(a). Paired bases with
Figure 6.3 Alignment of tRNA (Ala) sequences for widely different species (rows: Plasmodium, Dictyostelium, Saccharomyces, Arabidopsis, Bombyx, Drosophila, Xenopus, chicken, mouse), see text for details.
compensatory mutations are indicated by red arcs. Light red arcs connect bases where a GC pair changes to a non-canonical GU pair that maintains pairing ability. Comparative sequence analysis shows how good (or bad) free energy minimizations are: for the short tRNAs the predictions work in 85% of the cases but for longer RNAs it is only around 50% (Higgs, 2000). This indicates that the models are still not realistic enough to be reliable for longer chains.
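Returning briefly to the energy minimization of Section 6.1, the following is a minimal sketch of the maximum matching recursion with backtracking. It assumes that Eq. 6.2 has the standard form in which the last base of a subchain is either unpaired or paired with a base at least four positions upstream (this is what the count of $n-3$ possibilities per subchain suggests), and that only GC and AU pairs count, each with energy −1. The function name `max_matching` and the counter `examined` are illustrative choices, not notation from the text. The counter lets you compare the number of possibilities actually visited with Eq. 6.3.

```python
from typing import List, Tuple

# Assumed pairing rule: only Watson-Crick pairs count, each worth -1.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
MIN_SEP = 4  # assumed: paired bases k and j need j - k >= 4


def max_matching(seq: str) -> Tuple[int, List[Tuple[int, int]], int]:
    """Return (minimum energy, one optimal set of pairs, options examined)."""
    n = len(seq)
    # energy[i][j]: minimum energy of subchain i..j (0-based, inclusive)
    energy = [[0] * n for _ in range(n)]
    # partner[i][j]: base paired with j in the stored optimum, -1 if none
    partner = [[-1] * n for _ in range(n)]
    examined = 0

    for length in range(MIN_SEP + 1, n + 1):      # subchains of 5, 6, ..., N
        for i in range(0, n - length + 1):
            j = i + length - 1
            best, best_k = energy[i][j - 1], -1   # option: base j unpaired
            examined += 1
            for k in range(i, j - MIN_SEP + 1):   # option: j pairs with k
                examined += 1
                if (seq[k], seq[j]) not in PAIRS:
                    continue
                left = energy[i][k - 1] if k > i else 0
                cand = left + energy[k + 1][j - 1] - 1
                if cand < best:
                    best, best_k = cand, k
            energy[i][j], partner[i][j] = best, best_k

    # Backtracking: follow the stored choices from the full chain downwards.
    pairs: List[Tuple[int, int]] = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i < MIN_SEP:
            continue
        k = partner[i][j]
        if k < 0:
            stack.append((i, j - 1))
        else:
            pairs.append((k, j))
            stack.append((i, k - 1))
            stack.append((k + 1, j - 1))
    min_energy = energy[0][n - 1] if n else 0
    return min_energy, sorted(pairs), examined


if __name__ == "__main__":
    e, pairs, count = max_matching("GGGAAACCC")
    print(e, pairs, count)
```

For a chain of nine bases the counter should agree with the closed form in Eq. 6.3; the triple loop (subchain length, start position, pairing partner) is where the $N^3$ scaling comes from.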
6.2 Protein Folding

Proteins with their 20 different kinds of aa’s are more complex than RNA and DNA molecules, each of which has only four different types of bases. There are two negatively charged, three positively charged, five uncharged polar and 10 non-polar aa’s, see Fig. 1.5. Figure 6.4 shows the chemical structure of an aa: a central carbon atom (called the α-carbon) bound to an amino and a carboxyl group that eventually form the backbone of the protein, a hydrogen atom and a side chain, indicated by R, that characterizes the aa. A protein is a sequence of aa’s covalently linked via peptide bonds. The latter is the reason
Figure 6.4 Chemical structure of an amino acid, the subunit of proteins (labels: α-carbon, amino group, carboxyl group, side chain R).

Figure 6.5 A section of a polypeptide made from four aa’s (Glu, Lys, Ser and Phe).
why proteins are also called polypeptides. A short section of a protein made of glutamic acid (Glu), lysine (Lys), serine (Ser) and phenylalanine (Phe) is shown in Fig. 6.5. The basic mechanism of protein folding is that the aa-sequence induces the folding into a specific three-dimensional structure, the native state. Figure 6.6 displays an experimental test for this idea. One destroys the native state of a protein by heat or by a denaturing solvent. Then, when one restores the physiological conditions, the protein returns back to its native state. This amazing property of proteins leads to various physical questions, some of which we shall address in this section. That cells contain so-called chaperones, special proteins that help polypeptide chains to fold properly, does
Figure 6.6 Unfolding and refolding of a protein. A protein isolated from the cell is taken from its native state to the denatured state by heat or a denaturing solvent; physiological conditions bring it back to the native state.
not make protein folding less puzzling since proteins can also find their native state without those helpers.

There are characteristic folding patterns that are often found in proteins: α-helices and parallel or anti-parallel β-sheets, see Fig. 6.7. They are stabilized by hydrogen bonds between atoms of the regular backbone, as we discussed earlier in the book for the case of α-helices, see Fig. 4.2. These structures do not directly involve the side groups, which instead stick out into the solution, radially away from the helix or below and above the sheets. The set of these structures in a given protein is referred to as its secondary structure. In addition to that level, one calls the aa-sequence of a protein its primary structure and the three-dimensional conformation its tertiary structure. Complexes of more than one polypeptide chain are referred to as quaternary structures.

Note that the secondary structure of a protein is based on an entirely different principle than that of RNA molecules. In the latter case the formation of a double helical stretch is very specific since it can only form between two parts of the chain that are complementary to each other. In proteins, α-helices and β-sheets can in principle form anywhere along the chain since they are stabilized by H-bonds between backbone atoms that come close inside the helix or between adjacent polypeptide backbones in a sheet. The secondary structure of a protein, i.e., its specific set of helices and sheets, corresponds to the one where those structural elements can be embedded in an energetically most favorable three-dimensional environment. And it is precisely here that the
Figure 6.7 The three secondary structural elements in proteins: (a) α-helix, (b) parallel and (c) anti-parallel β-sheet. The red dashed lines indicate H-bonds.
aa-sequence comes into play. For example, an α-helix at the surface of a globular protein usually has hydrophilic side chains that stick into the water. The intimate linkage between the secondary and tertiary structures leads to a much more complex folding problem than that for RNA where the secondary structure constitutes the dominant energy scale.

A major problem of protein folding can be summarized in a paradox that was put forward by Cyrus Levinthal in 1968. Levinthal’s paradox states that a protein normally has such a huge number of possible configurations that it would take much longer than the age of the universe for the chain to find its native state if it randomly sampled its configurations. Take for example a protein that is 100 aa’s long. Its important degrees of freedom turn out to be rotations involving backbone atoms, namely 99 rotations between each α-carbon and its neighboring nitrogen and 99 rotations between neighboring carbon atoms. These so-called backbone dihedral angles cluster around three values, 120° apart from each other. This leads to $3^{198} \approx 10^{94}$ possible configurations of the backbone. If the protein were to sample configurations at a rapid rate of one per picosecond, and if it visited each misfolded state only once, it would take the molecule approximately $10^{94} \times 10^{-12}$ s $= 10^{82}$ s $\approx 10^{74}$ years to find its native state, much longer than 14 billion years, the age of the universe.

It is helpful to discuss protein folding in the context of energy landscapes (Dill and Chan, 1997). Levinthal’s paradox follows
Figure 6.8 Energy landscapes for protein folding: (a) Levinthal’s golf course, (b) a funnel-shaped landscape and (c) a rugged funnel with local minima corresponding to misfolded proteins. In each case three different folding pathways are shown.
from the assumption that the energy landscape of a protein resembles a golf course with the native state being a hole, indicated by “N” in Fig. 6.8(a). To be more precise, the height of an energy landscape represents the energy of the protein, i.e., all the electrostatic interactions between charged groups, hydrogen bonds, torsional energies in the polypeptide backbone, etc. The two horizontal directions in Fig. 6.8 give a very crude representation of the conformational coordinates. In reality, these would be, e.g., the backbone dihedral angles, making this a landscape with an unimaginably high number of directions. Figure 6.8(a) shows three non-native states by colored dots and their subsequent search for the native state by random walks. As the protein moves through the high-dimensional landscape without any guidance, it will never find the hole.

Whereas it was originally believed that the solution to Levinthal’s paradox could be found in a specific folding pathway, the modern view is that the energy landscape as a whole is in the form of a funnel, as shown schematically in Fig. 6.8(b). So no matter where the protein starts in this energy landscape, it finds its way to the global minimum by sliding downhill. Three different denatured states are indicated close to the rim of the funnel and all of them find the global minimum along different pathways. This picture suggests that one achieves only a partial understanding of protein folding if one studies how a given protein folds along a specific path. Instead, it is more appropriate to think of an ensemble of proteins that find the native state along various pathways simultaneously.
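The numbers behind Levinthal’s paradox are easy to reproduce. The following short sketch redoes the estimate for the 100-aa example with three states per dihedral angle and one configuration per picosecond; the variable names are illustrative, and logarithms are used only to keep the numbers printable. Without the intermediate rounding to $10^{94}$ the result comes out somewhat larger than the quoted $10^{74}$ years, which does not change the conclusion.

```python
import math

n_residues = 100
n_angles = 2 * (n_residues - 1)          # 198 backbone dihedral angles
states_per_angle = 3

# log10 of the number of backbone configurations, 3**198
log10_configs = n_angles * math.log10(states_per_angle)

# One configuration per picosecond: multiply by 1e-12 s
seconds = 10 ** (log10_configs - 12)
seconds_per_year = 365.25 * 24 * 3600
years = seconds / seconds_per_year
age_of_universe_years = 14e9

print(f"configurations  ~ 10^{log10_configs:.1f}")
print(f"search time     ~ 10^{math.log10(years):.1f} years")
print(f"in units of the age of the universe ~ "
      f"10^{math.log10(years / age_of_universe_years):.1f}")
```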
Where does the funnel shape come from? Since it drives the protein to the native state and this state is typically very compact, we can rephrase that question: What drives protein compaction? Electrostatic interactions, e.g., salt bridges between positively and negatively charged side chains, are not very important since the protein core has a low dielectric constant; this will become clear in the next chapter. Hydrogen bonds are important to a certain extent, especially in stabilizing α-helices and β-sheets. But the major driving force for protein compaction is the hydrophobic effect, i.e., the protein attempts to fold such that the non-polar side chains are hidden inside the protein.

It is hard to come up with a realistic model for proteins, but it is believed that simple models of self-avoiding heteropolymers on lattices have many features in common with real proteins. Five different example configurations of such a polymer are shown in Fig. 6.9. The polymer consists of 13 monomers of two types: yellow disks represent hydrophobic (H) and turquoise disks polar (P) monomers; that is why one refers to this model as the HP model (Chan and Dill, 1989). In this specific example, the polymer
Figure 6.9 Model protein on a lattice. Example configurations (with energies E = 0, −2, −2, −3 and −5) with HH-contacts indicated by red bars. The number of distinct configurations is provided at each energy level: 24482, 12903, 2922, 300, 9 and 1, listed from the highest (E = 0) to the lowest (E = −5) level. Adapted from (Dill and Chan, 1997) and (Chan and Dill, 1998).
“lives” on a two-dimensional square lattice. It is assumed that if a hydrophobic monomer becomes neighbor to another hydrophobic monomer on the lattice, the energy E of the polymer is reduced by one unit as compared to the case when that monomer was sitting next to a P-monomer or water, the latter being represented by empty sites. Exact enumeration studies show that there are many open conformations, fewer compact conformations and only one conformation with five HH-“bonds,” i.e., with energy E = −5; see Fig. 6.9 for the precise numbers for the chosen monomer sequence.

In reality, the landscape might feature many local minima like the one depicted in Fig. 6.8(c). These minima represent misfolded proteins. Going back to Fig. 6.9: the example configurations shown on the right-hand side have only native HH-contacts, shown in red. The configuration on the left-hand side has a native and a non-native HH-contact, the latter indicated by light red. Thermal fluctuations have to break the wrong HH-contacts before the protein can fold into its native state. After a misfolded protein has escaped such a local trap with a rate according to Eq. 5.69, it can proceed sliding down the funnel towards its native state.

The ground state in Fig. 6.9 is a compact configuration, a shape with minimum perimeter so that the contact with the surrounding fluid is as small as possible. Since ground states of HP-polymers are typically compact, we now take a closer look at such configurations. Figure 6.10(a) shows a compact configuration for a polymer with four monomers. There are five different configurations of such a chain, one of which is compact. In enumerating the configurations, the two ends of the chain are considered to be distinct. Only configurations are counted that are not related by translations, rotations or reflections. Two compact configurations of a 13-monomer polymer are shown in Fig. 6.10(b). The one to the left corresponds to the energy minimum in Fig. 6.9.
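The energy rule of the HP model can be written down in a few lines. The sketch below is only an illustration: the function name `hp_energy` is hypothetical, and it assumes, as is usual for this model, that two H monomers that are direct neighbors along the backbone do not count as a contact. The folded test configuration was constructed by hand for the sequence HPPHPHPPHPPHP (the sequence quoted in Problem 6.2) so that it has five HH-contacts; whether its orientation matches the drawing in Fig. 6.9 is not checked here.

```python
from typing import List, Tuple

Site = Tuple[int, int]


def hp_energy(sequence: str, positions: List[Site]) -> int:
    """Energy of an HP chain on the square lattice.

    Every pair of H monomers on neighboring sites that are not
    consecutive along the chain (an assumption) contributes -1.
    """
    assert len(sequence) == len(positions)
    occupied = {pos: idx for idx, pos in enumerate(positions)}
    assert len(occupied) == len(positions), "chain must be self-avoiding"
    energy = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only to the right and upwards so each contact counts once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = occupied.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


SEQUENCE = "HPPHPHPPHPPHP"
# Hand-built compact configuration with five HH-contacts.
FOLDED = [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-2, 0),
          (-2, -1), (-1, -1), (-1, -2), (0, -2), (0, -1), (1, -1)]
# Fully stretched chain without any contacts.
STRETCHED = [(k, 0) for k in range(len(SEQUENCE))]

if __name__ == "__main__":
    print(hp_energy(SEQUENCE, FOLDED), hp_energy(SEQUENCE, STRETCHED))
```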
Note that these two configurations have different overall shapes but both feature the smallest possible perimeter; in total one can find 68 compact shapes for a 13-monomer chain. Altogether there are 367 compact configurations out of a total of 40 617 self-avoiding walks. The latter is the sum of all the numbers given in Fig. 6.9. A compact configuration of a 16-monomer chain is shown in Fig. 6.10(c). It is
Figure 6.10 Compact configurations of lattice polymers: (a) a 4-monomer chain, (b) a 13-monomer chain with two different overall shapes and (c) a 16-monomer chain.
one of 69 possible compact configurations out of 802 075 possible self-avoiding configurations in total. These three examples show that one has a large variation with $N$ in the number $S(N)$ of compact shapes and correspondingly large oscillations in $c(N)$, the total number of compact configurations. Whenever the monomer number is a square number like in Fig. 6.10(a) and (c) there is only one shape; in the other cases the number of shapes is larger than one. This is the main reason why $c(13) = 367$ is much larger than $c(16) = 69$. To overcome this artifact of the discrete model, one might look instead at $c(N)/S(N)$. This ratio grows more smoothly, approximately following $1.41^N$ (Chan and Dill, 1989). In comparison, the total number of self-avoiding walks on a square lattice increases much faster, namely roughly as $2.64^N$, see below Eq. 3.23. The number of compact configurations is therefore considerably smaller than that of the open configurations, and this effect becomes more and more pronounced with increasing chain length.

We now try to answer the question why α-helices and β-sheets are so common in proteins. The configurations of compact lattice polymers suggest an answer to this question since they show a high probability to form helices, parallel and anti-parallel sheets (Chan and Dill, 1989). Figure 6.11 shows short stretches of a lattice polymer that have to be envisaged to be embedded inside a dense configuration. The depicted configurations might be considered as two-dimensional analogues to helices, Fig. 6.11(a), anti-parallel sheets, Fig. 6.11(b), parallel sheets, Fig. 6.11(c) and turns, Fig. 6.11(d). The dashed boxes encircle minimal units that need to be present in order to qualify them as secondary structures.
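The total numbers of self-avoiding configurations quoted above can be checked by brute force. The following is a minimal sketch of such an enumeration, not the method used in the cited studies: the generator `distinct_saws` (a hypothetical name) removes rotations by fixing the first step along +x and removes reflections by requiring the first step that leaves the x-axis to point along +y. The two chain ends are treated as distinct, as in the text.

```python
from typing import Iterator, List, Tuple

Site = Tuple[int, int]
STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))


def distinct_saws(n_monomers: int) -> Iterator[List[Site]]:
    """Yield one self-avoiding walk per symmetry class (square lattice)."""
    if n_monomers < 2:
        yield [(0, 0)][:n_monomers]
        return
    path: List[Site] = [(0, 0), (1, 0)]   # first step fixed: no rotations
    visited = set(path)

    def extend(straight: bool) -> Iterator[List[Site]]:
        if len(path) == n_monomers:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            if straight and dy < 0:
                continue   # mirror image of a walk whose first turn is +y
            nxt = (x + dx, y + dy)
            if nxt in visited:
                continue
            path.append(nxt)
            visited.add(nxt)
            yield from extend(straight and dy == 0)
            visited.remove(nxt)
            path.pop()

    yield from extend(True)


def count_saws(n_monomers: int) -> int:
    return sum(1 for _ in distinct_saws(n_monomers))


if __name__ == "__main__":
    for n in (4, 6, 13):
        print(n, count_saws(n))
```

The 16-monomer case can be counted with the same function, but in pure Python the run time becomes noticeable because the number of walks grows roughly by a factor of 2.64 with every added monomer.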
Figure 6.11 Definition of secondary structures inside compact two-dimensional lattice polymers: (a) helix, (b) anti-parallel sheet, (c) parallel sheet and (d) hairpin. The dashed boxes indicate minimal units of these structures.
In Fig. 6.12(a) we show the fraction of monomers that participate in secondary structures for the 16-monomer chain as a function of the number of contacts, i.e., the number of neighboring monomers that are not directly connected by the backbone. The maximal number of contacts, here 9, occurs for compact configurations, see Fig. 6.10(c) for an example. Note the steep increase of the number of secondary structures with the number of contacts, i.e., with the compactness of the chain. Chan and Dill also calculated, again by exact enumeration, the fraction of monomers participating in secondary structures of compact chains as a function of chain length from 13 to 30 monomers, Fig. 6.12(b). Interestingly, with increasing chain length secondary structures become more and more prominent.

We have learned from the HP model so far that configurations of lower energy are typically more compact because this enables more HH-contacts, Fig. 6.9, and that such compact configurations typically have a large amount of secondary structure, Fig. 6.12. But what we still need to understand is why proteins typically have a unique native state. One could easily imagine that there are many compact configurations that allow a maximum number of HH-contacts for a given sequence of H’s and P’s, so that the chain has many different ground states, i.e., states of lowest energy. Although the number of compact configurations is vastly smaller than that of all possible configurations, this number is still very large for longer chains. For example, a lattice polymer on a two-dimensional
Figure 6.12 Fraction of monomers that participate in secondary structures (curves: combined, helices, antiparallel sheets, parallel sheets, turns): (a) as a function of the number of monomer–monomer contacts in a 16-monomer chain and (b) as a function of the degree of polymerization for compact chain configurations. The two highlighted columns in (a) and (b) are identical. Adapted from (Chan and Dill, 1989).
square lattice with $N = 36$ monomers has only one compact shape, a 6 × 6 square, in which a total of 57 337 configurations fit. A famous three-dimensional example is the 27-monomer chain on a cubic lattice; there are 103 346 different configurations that fit into a cube of size 3 × 3 × 3 (Shakhnovich and Gutin, 1990). Going to a chain that fills a 4 × 4 × 3 box one finds an astonishing 134 131 827 475 different configurations (Pande et al., 1994) and the enumeration for a 4 × 4 × 4 cube might be already too much for a computer to handle. That is, at least, what I wrote in 2013 for the first edition of this book. In the meantime, this number has been calculated: 27 746 717 207 772 000 (Schram and Schiessel, 2013, 2016). Furthermore, in the same study it has been proposed (based on estimates up to the 7 × 7 × 7 cube) that the number of configurations grows as $2.14^N$.

Remarkably, within these compact configurations, various effects (hydrogen bonding, intrinsic properties, ion pairing, hydrophobicity, ...) pick out one configuration to have a lower energy than all the others. The simple HP model already has this property to an astonishing extent (Lau and Dill, 1989). Figure 6.13 shows
Figure 6.13 Histogram of the number of sequences as a function of the number of lowest-energy compact configurations of two-dimensional lattice polymers of chain lengths (a) N = 10, (b) N = 13 and (c) N = 24. The enumeration was restricted to compact configurations only. For the 10-monomer chain all the possible sequences were scanned, for the longer chains only 200 randomly chosen ones. Adapted from (Lau and Dill, 1990).
histograms of the number of sequences as a function of the number of lowest energy conformations for chains of three different chain lengths, $N = 10$, $N = 13$ and $N = 24$. The search was restricted to compact conformations since a full scan of the conformational space would have been too time consuming. Moreover, the full sequence space was only considered for the 10-monomer chain ($2^{10} = 1024$ sequences), while 200 sequences were selected at random for the longer chains. As can be seen from Fig. 6.13, for all three chain lengths there are many sequences that have a very small number of ground states and this effect becomes more and more enhanced with increasing chain length. For the longest chain with 24 monomers, Fig. 6.13(c), the histogram has its peak already at the unique ground state.

Finally let us ask the question how nature could have found the right sequence for a protein of, say, 100 monomers that folds into a specific configuration which performs a specific function inside the cell. The number of possible sequences of a 100-monomer-long protein is enormous, namely $20^{100} \approx 10^{130}$ different sequences. This means that the probability to draw randomly the right sequence is $10^{-130}$. It would be extremely unlikely to find this sequence through random mutations since the origin of life. We seem to encounter another paradox here, similar to that formulated by Levinthal. However, this paradox arises from the assumption that one specific sequence has to be found. But it is more likely that evolution cares to find the right folding configuration, regardless of the aa sequence by which this configuration can be achieved. And the number of sequences that fold into the same specific shape might actually be very large.

A simple estimate follows from the critical core model (Lau and Dill, 1990). The idea originates from an observation made for the HP model, namely that mutations of monomers in the inner core of a native compact configuration tend to affect this configuration more than those of monomers on the surface. “Mutation” means here to change an H monomer into a P monomer or vice versa. Let us assume that the protein is $N$ monomers long and forms a dense sphere of volume $a^3 N$ where $a^3$ denotes the volume per monomer. The radius of that sphere is given by $R = \left(3a^3N/(4\pi)\right)^{1/3}$. We define the critical core as the sum of all the monomers that are not on the surface and estimate its number of monomers by

$$N_c = \frac{4\pi}{3a^3}\,(R-a)^3. \qquad (6.4)$$

For $N = 100$ we find that there are about $N_c \approx 28$ monomers in the critical core. If we assume that all the other monomers, 72 in total, can be freely chosen to be either H or P, we find $2^{72}$ different sequences that all fold more or less the same way since they all share the same critical core. Real proteins have, however, 20 different monomers, 10 of which are non-polar, see Fig. 1.5. If we assume that it only matters whether a particular aa is non-polar or not, we have 10 choices per aa, giving us another factor of $10^{100}$ of possibilities. In the end, we find that there are about $2^{72} \times 10^{100} \approx 10^{121}$ different sequences that all fold into the same native state.

Figure 6.14 schematically shows the sequence space of a 100-monomer protein. Its size is enormous, $10^{130}$ different sequences, but also the number of sequences that fold into a specific configuration is mind-blowing, $10^{121}$ according to the estimate given above. There is no paradox: it is actually fairly easy to understand how functional polymers arose via random mutations through the course of evolution, since only a small fraction of the sequence space needed to be and has been sampled since the origin of life.
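The critical core estimate is simple enough to redo numerically. The sketch below evaluates Eq. 6.4 for $N = 100$ in units where $a = 1$ and then counts sequences as described above; the function name `critical_core_size` is an illustrative choice. Logarithms are used because the numbers themselves are far too large to be instructive when printed.

```python
import math


def critical_core_size(n_monomers: int, a: float = 1.0) -> float:
    """Number of monomers not at the surface of a dense sphere, Eq. 6.4."""
    radius = (3 * a**3 * n_monomers / (4 * math.pi)) ** (1 / 3)
    return 4 * math.pi / (3 * a**3) * (radius - a) ** 3


n = 100
n_core = round(critical_core_size(n))        # monomers in the critical core
n_free = n - n_core                          # freely chosen H or P
# 2 choices (H or P) for each free monomer, 10 aa's per monomer class
log10_same_fold = n_free * math.log10(2) + n * math.log10(10)
log10_all = n * math.log10(20)               # size of the sequence space

print(f"N_c = {n_core}, free monomers = {n_free}")
print(f"sequences with the same fold ~ 10^{log10_same_fold:.1f}")
print(f"all sequences                ~ 10^{log10_all:.1f}")
print(f"fraction                     ~ 10^{log10_same_fold - log10_all:.1f}")
```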
Figure 6.14 Schematic depiction of the sequence space of a protein with 100 monomers. Of $10^{130}$ possible sequences on the order of $10^{121}$ lead to the same native structure (see text for details).
Problems

6.1 Calculating the secondary structure of a transfer RNA molecule

Figure 6.2(a) depicts the secondary structure of the tRNA specific for the aa phenylalanine. Here you first implement the maximum matching model (as introduced in Section 6.1) and try to predict its structure. This tRNA molecule is $N = 76$ nucleotides long, which leads to about $N^3/6 \approx 70\,000$ different possible structures (excluding pseudoknots), see Section 6.1. Does the maximum matching model pick the right one? You will learn that the prediction is not impressive at all. However, your findings immediately give you an idea on how to improve the model. You then test this improved model and check whether it is good enough to predict the native structure.

(i) Calculate the lowest energy that the molecule can achieve through maximizing the number of GC and AU contacts. For each correct pairing you obtain an energy −1 (in arbitrary units). There are some special nucleotides for which we make the following highly simplified assumptions: Replace Ψ, pseudouridine, by U and also T by U, and do not allow any base pairing for D (dihydrouridine) and Y, a highly modified purine. First, do the calculation for a 17-nucleotide-long subchain, the one that forms the loop-hairpin structure with the anti-codon, i.e., the chain CCAGACUGAAYAΨCUGG. Then go to the full 76-nucleotide-long molecule.
Hint: Create a two-dimensional array that contains all the energies of all the subchains. The values of the different components of this array follow from Eq. 6.2 and are obtained by starting with short subchains, going to longer and longer chains. In a second array you keep track of which monomer the right monomer of your subchain is in contact with (i.e., the value k in Eq. 6.2) or whether there is no contact at all (you can then set k = 0). If there are several possibilities for k, make some arbitrary choice. Finally, in a third array you can keep track of the degree of degeneracy of the ground state for each subchain.

(ii) Now calculate at least one of the ground-state configurations (there might be more than one) of the 17-nucleotide subchain and of the full molecule.
Hint: This can be achieved using a recursive algorithm. Call a subroutine recursivecontacts(i, j), starting with i = 1 and j = N. The first operation to perform in this subroutine is to print i, j and k. If there is no contact (k = 0), you call recursivecontacts(i, j − 1). Otherwise you call recursivecontacts(i, k − 1) (if that subchain is long enough) and then recursivecontacts(k + 1, j − 1) (if that subchain is long enough).

(iii) Now draw rainbow diagrams of the actual secondary structure and of the one predicted by the algorithm and compare them. You are probably disappointed with the prediction of the full tRNA structure. In particular, there is a high degree of degeneracy. However, if you look at one of these structures, you may get an insight into what else is important for the folding of RNA molecules. Compare the number of contacts in the native and the predicted structure. And try to find which local characteristic of the native structure has not yet been reproduced well.

(iv) By comparing the predicted to the native structure, you find that the predicted structure has more contacts. It is, after all, the maximum matching model. However, the arcs are not nicely nested like in the real structure where an arc connecting monomer m with n is often surrounded by an arc connecting m − 1 with n + 1. This actually corresponds to two stacked bp’s.
Therefore, we now improve the model by giving stacked bp’s an energetic advantage. For the sake of simplicity, let us assume that each stack also changes the energy by −1. Calculate now the energy and structure of the tRNA molecule. How many different ground state structures do you find now? It should be one. And how many native contacts do you predict? It should be 100% (plus four extra contacts).
Hint: To include the stacking energy in your calculation, you can proceed as follows: When calculating the energy of subchain (i, j), check, when creating a contact between j and some other monomer k, whether there is also a contact for the subchain (k + 1, j − 1). If yes, add the stacking energy. In order for this algorithm to work properly, it is important that you preferably choose the state with k = i for a given subchain (i, j), if it is one of the ground states (there might be several). This way you allow for stacking to i − 1, j + 1 (if this connection should occur for larger subchains).

(v) Since the inclusion of a stacking energy works so well, one might wonder if stacking of matched bp’s is actually the most important factor. To test this, keep the stacking energy at −1 but reduce the pairing energy to −0.01 (that is in principle negligible but setting it exactly to zero leads to unwanted degeneracies). Based on the predicted structure: what would you say is the most important contribution to the secondary structure of RNA molecules?

6.2 Energy spectrum of the HP model

Figure 6.9 displays the energy spectrum of one particular HP polymer composed of 13 monomers, namely the one with sequence HPPHPHPPHPPHP. It features a unique ground state. Check whether the authors of (Dill and Chan, 1997) and (Chan and Dill, 1998) had to choose their sequence very carefully to achieve this or whether unique ground states are rather common for this type of polymer.

(i) As a start, verify the energy spectrum presented in Fig. 6.9. For this you need to enumerate all possible configurations of a 12-step SAW on a square lattice. You have done this already in Problem 3.4 using a recursive algorithm (if not, work out this problem first). Note that in that earlier problem you counted all configurations, even if they were related by rotations and reflections. Here, however, you count such related configurations only once. Also note that for the HP model the start and end monomers are considered as distinct. This is obvious for the sequence considered here but it is also assumed for polymers with palindromic sequences which exist when the number of monomers is even.
Hint: With one exception, there are always eight symmetrically related SAW configurations. Why? And what is the exception?

(ii) Now repeat this analysis for all possible sequences. What is the fraction of sequences that feature a unique ground state configuration? Note that this is a computationally rather demanding calculation. For 13 monomers there are $2^{13} = 8192$ different sequences and hence your calculation will take longer than in (i) by about this factor. This probably means that your program from (i) needs to be improved to perform this calculation in a reasonable time. Here are some hints. Calculate first all possible sequences. To do this, note that you can count your sequences from $i = 0$ to $2^{13} - 1$. Sequence i can be produced by repeatedly dividing i by two and assigning the remainder of the k-th division (0 or 1) as the chemical identity (H or P) of the k-th monomer of the i-th sequence. As you produce your configurations through the recursive algorithm, you might want to keep track of the locations of your monomers in a two-dimensional array that features zeros for unoccupied positions and positive integers at locations where monomers are, e.g., the number k + 1 at the location of the k-th monomer. Each time you have produced a full-length SAW configuration, you can go along the chain, monomer by monomer (you also have stored the x- and y-positions of all monomers), and check whether neighboring positions are occupied by H monomers (for all different sequences). Keep track for each sequence of the lowest energy reached and its current degree of degeneracy. Start by testing your program with shorter chains. For instance, for 5-step SAWs (i.e., 6 monomers) you should find 36 distinct configurations; 7 out of the 64 sequences have a unique ground state.
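The hint on how to generate all sequences by repeated division can be sketched as follows. The function name `hp_sequences` and the convention that a remainder of 1 stands for H and 0 for P are assumptions made for illustration; the opposite assignment works just as well.

```python
from typing import List


def hp_sequences(n_monomers: int) -> List[str]:
    """All 2**n HP sequences, sequence i built from the binary digits of i."""
    sequences = []
    for i in range(2 ** n_monomers):
        value, letters = i, []
        for _ in range(n_monomers):
            # The remainder of the k-th division fixes the k-th monomer.
            value, remainder = divmod(value, 2)
            letters.append("H" if remainder else "P")   # assumed: 1 -> H
        sequences.append("".join(letters))
    return sequences


if __name__ == "__main__":
    all_13 = hp_sequences(13)
    print(len(all_13))                      # number of sequences
    print(all_13.index("HPPHPHPPHPPHP"))    # index of the Fig. 6.9 sequence
```

With this convention the sequence of Fig. 6.9 has H monomers at positions 0, 3, 5, 8 and 11, so its index is the sum of the corresponding powers of two.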
(iii) Go back and look critically at Fig. 6.13. Do the numbers make sense to you? Note that only for chains with 10 monomers, Fig. 6.13(a), a full enumeration over all monomer sequences was performed. You can check whether you find the same number of sequences with unique ground states. Do you find far fewer such sequences? Also check for N = 13 (Fig. 6.13(b)) whether the fraction of sequences with unique ground states (only done with 200 randomly chosen sequences) is larger than what you would expect based on your full exact result. Then what is the difference between their calculation and yours?
Chapter 7
Electrostatics inside the Cell
7.1 Poisson–Boltzmann Theory

A living cell is essentially a bag filled with charged objects. Besides the charged macromolecules (DNA, RNA and proteins) and the membranes (that also contain some charged lipids) there are lots of small ions. These ions are mostly cations, positively charged ions, compensating the overall negative charges of the macromolecules: 5–15 mM sodium ions, Na$^+$, 140 mM potassium ions, K$^+$, as well as smaller amounts of divalent ions, 0.5 mM magnesium, Mg$^{2+}$, and $10^{-7}$ mM calcium, Ca$^{2+}$. Here mM stands for millimolar, $10^{-3}$ moles of particles per liter. There are also small anions, mainly 5–15 mM chloride ions, Cl$^-$.

We know the forces between those charged objects; in fact, basic electrostatics is even taught in school. But even if we consider the macromolecules as fixed in space for the sake of simplicity, a cell contains a huge number of mobile small ions that move according to the electrostatic forces acting on them, which in turn change the fields around them and so on. This problem is far too complicated to allow an exact treatment. There is no straightforward statistical physics approach that can account for all the charge–charge interactions occurring inside a cell. In other words, we do not have a really good handle on electrostatics, the major interaction
force between molecules in the cell. The current chapter tries to give you a feeling for what we understand well and what we do not. Hopefully, this makes you rather critical when you encounter electrostatics problems in biophysics in the future.
The standard approach to theoretically describe the many-body problem of mobile charges in an aqueous solution in the presence of charged surfaces is the so-called Poisson–Boltzmann (PB) theory. It is not an exact theory but contains a standard approximation scheme, the mean field approximation. This scheme is widespread in physics and often very successful. As I will argue, one has to be very careful when applying it to the highly charged molecules that are in a cell.
To construct the PB theory, one first distinguishes between mobile and fixed ions. This distinction comes very naturally since the small ions move much more rapidly than the macromolecules. So it is usually reasonable to assume that at any given point in time the small ions have equilibrated in the field of the much slower moving macromolecules. Let us denote the concentration of small ions of charge $Z_i e$ by $c_i(\mathbf{x})$ where $e$ denotes the elementary charge and $|Z_i|$ the valency of the ion: $|Z_i| = 1$ for monovalent ions, $|Z_i| = 2$ for divalent ions and so on. The charge density of the fixed charges, the “macromolecules,” is denoted by $\rho_{\text{fixed}}(\mathbf{x})$. The total charge density at point $\mathbf{x}$ is then
$$\rho(\mathbf{x}) = \sum_i Z_i e\, c_i(\mathbf{x}) + \rho_{\text{fixed}}(\mathbf{x}). \tag{7.1}$$
From a given charge density $\rho(\mathbf{x})$ the electrostatic potential $\varphi(\mathbf{x})$ follows via the Poisson equation:
$$\nabla \cdot \nabla \varphi(\mathbf{x}) = \Delta \varphi(\mathbf{x}) = -\frac{4\pi}{\varepsilon}\, \rho(\mathbf{x}). \tag{7.2}$$
Here $\varepsilon$ is the so-called dielectric constant, which has the value $\varepsilon = 1$ in vacuum and the much larger value $\varepsilon \approx 80$ in water. That this value is so high in water, the main ingredient of the cell, is crucial since otherwise there would hardly be any free charges, as we shall see below. The Poisson equation, Eq. 7.2, is linear in $\varphi$ and $\rho$ so that it is straightforward to solve for any given charge density. First one needs to know the Green’s function, i.e., the solution for a single point
charge $e$ at position $\mathbf{x}'$, $\rho(\mathbf{x}) = e\,\delta(\mathbf{x} - \mathbf{x}')$. Since $\Delta \left( 1/|\mathbf{x} - \mathbf{x}'| \right) = -4\pi\, \delta(\mathbf{x} - \mathbf{x}')$, this is given by
$$\varphi(\mathbf{x}) = e\, G(\mathbf{x}, \mathbf{x}') = \frac{e}{\varepsilon\, |\mathbf{x} - \mathbf{x}'|}. \tag{7.3}$$
Having the Green’s function $G(\mathbf{x}, \mathbf{x}')$ for the Poisson equation, one can calculate the potential resulting from any given charge distribution $\rho(\mathbf{x})$ via integration:
$$\varphi(\mathbf{x}) = \int G(\mathbf{x}, \mathbf{x}')\, \rho(\mathbf{x}')\, d^3x' = \int \frac{\rho(\mathbf{x}')}{\varepsilon\, |\mathbf{x} - \mathbf{x}'|}\, d^3x'. \tag{7.4}$$
You can easily check that this actually solves Eq. 7.2. Physically the integral in Eq. 7.4 can be interpreted as a linear superposition of potentials of point charges, Eq. 7.3.
Unfortunately things are not as easy here since mobile ions are present. The potential produced by a given charge density is in general not flat so that the mobile charges experience forces, i.e., they will move. If they move, the charge density changes and with it the potential and so on. Strictly speaking what we are looking for is the thermodynamic equilibrium where the net flux of each type of ion $i$ vanishes. We already know what the answer is: The density of each ion type is given by the Boltzmann distribution:
$$c_i(\mathbf{x}) = c_{0i}\, e^{-Z_i e \varphi(\mathbf{x})/k_B T} \tag{7.5}$$
with $c_{0i}$ denoting the density at places in space where $\varphi(\mathbf{x}) = 0$. Combining Eqs. 7.1, 7.2 and 7.5 leads to the Poisson–Boltzmann equation:
$$\Delta \varphi(\mathbf{x}) + \sum_i \frac{4\pi Z_i e\, c_{0i}}{\varepsilon}\, e^{-Z_i e \varphi(\mathbf{x})/k_B T} = -\frac{4\pi}{\varepsilon}\, \rho_{\text{fixed}}(\mathbf{x}). \tag{7.6}$$
This is an equation for $\varphi(\mathbf{x})$; the densities of the different mobile ion species are then given by Eq. 7.5. An additional constraint is that the total charge of the system must be zero:
$$\int_{\text{system}} \rho(\mathbf{x})\, d^3x = 0. \tag{7.7}$$
This condition can be understood as follows: If the system of size $R$ (here, e.g., the whole cell) carried a non-vanishing charge $Q$,
then the energy that it costs to charge it would scale like $Q^2/(\varepsilon R)$. It is extremely unlikely that this energy is much larger than the thermal energy and therefore $Q$ needs to stay very small. In other words, the huge positive and negative charges inside the cell need to cancel each other, leading to a total charge $Q$ that can be considered to be zero for all practical purposes.
There are two problems when dealing with a PB equation, one of a more practical, the other of a more fundamental nature. The practical problem is that this is a non-linear differential equation for the potential $\varphi(\mathbf{x})$ which is usually very hard to solve analytically; there exist exact solutions only in a few special cases, two of which will be discussed below. That $\varphi(\mathbf{x})$ occurs at two different places in Eq. 7.6 just follows from the above-mentioned fact that charges move in response to the potential and at the same time determine the potential. Solutions need to be self-consistent, i.e., the distribution of charges needs to induce an electrical potential in which they are Boltzmann distributed. In many cases, the non-linearity makes it difficult to understand how sensitive the solution is to details in the charge distribution.
However, what is much more worrying is the second problem. Solutions of Eq. 7.6 are usually smooth functions that look very different from the potentials featured by actual electrolyte solutions. Close to each ion the potential has very large absolute values that in the limit of point charges even go to infinity. Something has been lost on the way when we constructed the PB equation: Instead of looking at concrete realizations of ion distributions, we consider averaged densities $c_i(\mathbf{x})$, Eq. 7.5. These averages create smooth potentials. This is a typical example of a mean field approximation: the effect of ions on a given ion is replaced by an averaged effect. A priori it is not at all clear whether such an approximation makes sense when applied to the electrostatics of the cell.
But it is intuitively clear that the field emerging from a solution of monovalent ions shows less dramatic variations than that of a solution of ions of higher valency. The question we have to answer is when PB works reasonably well, when it breaks down and what new phenomena might emerge in this case. As we shall see, this is a fascinating topic with many surprising results.
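Returning briefly to the charge-neutrality condition, Eq. 7.7, the scaling argument given for it can be turned into a rough number. The following is only a back-of-the-envelope sketch: it assumes an illustrative system size of $R = 1\,\mu$m (not a value from the text), writes $Q = ne$ with $n$ the number of uncompensated elementary charges, and uses the fact that $e^2/(\varepsilon k_B T) \approx 0.7$ nm in water (this length is introduced as the Bjerrum length in Eq. 7.9).
$$\frac{n^2 e^2}{\varepsilon R} \lesssim k_B T \quad \Longrightarrow \quad n \lesssim \sqrt{\frac{R}{e^2/(\varepsilon k_B T)}} \approx \sqrt{\frac{1000\ \text{nm}}{0.7\ \text{nm}}} \approx 38.$$
Under these assumptions an imbalance of a few tens of elementary charges already costs an energy of the order of $k_B T$, and since the cost grows as $n^2$, the net charge stays negligible compared to the total number of ions present.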
7.2 Electrostatics of Charged Surfaces
In this chapter we aim at understanding the electrostatic interactions between macromolecules. In particular, we would like to know what happens if two DNA double helices come close to each other or if a positively charged protein approaches a DNA double helix. Usually the charges are not homogeneously distributed on the surface of a macromolecule. For instance, charges on the DNA double helix are located along the helical backbones as described in Chapter 4 and the distribution of charged groups on a protein is often rather complex. Despite these complications, we shall see that one can learn a great deal about these systems by looking at much simpler geometries, especially by looking at the electrostatics of homogeneously charged flat surfaces. The reason for this is that in many cases all the interesting electrostatics happens very close to the surface of a macromolecule. Essentially, the ions then experience the macromolecules in a similar way to how we experience our planet, namely as a flat disk. We shall see in the following section that this is indeed true; in this section we focus on charged planes.
To get started, we rewrite the PB equation 7.6 in a more convenient form by multiplying it on both sides by $e/k_B T$:
$$\Delta \Phi(\mathbf{x}) + \sum_i 4\pi Z_i l_B c_{0i}\, e^{-Z_i \Phi(\mathbf{x})} = -4\pi l_B\, \frac{\rho_{\text{fixed}}(\mathbf{x})}{e}. \tag{7.8}$$
Here $\Phi(\mathbf{x})$ denotes the dimensionless potential $\Phi(\mathbf{x}) = e\varphi(\mathbf{x})/k_B T$. In addition we introduced in Eq. 7.8 one of three important length scales in electrostatics, the so-called Bjerrum length
$$l_B = \frac{e^2}{\varepsilon k_B T}. \tag{7.9}$$
This is the length where two elementary charges feel an interaction energy $k_B T$: $e^2/(\varepsilon l_B) = k_B T$. In water with $\varepsilon = 80$ one has $l_B = 0.7$ nm. This is small enough compared to atomic scales so that two oppositely charged ions “unbind.” On the other hand, inside a protein core the dielectric constant is much smaller, roughly that of oil with $\varepsilon \approx 5$, and thus there are hardly any free charges inside the core. Looking again at Eq. 7.9, one can see that another route to free charges is to heat a substance to extremely high temperatures.
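The numbers quoted for the Bjerrum length can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: Eq. 7.9 is written in Gaussian units, so the SI form used below carries an extra factor $4\pi\varepsilon_0$; the temperature of 298.15 K and the function name `bjerrum_length_nm` are choices made here, not values or names from the text.

```python
import math

# Physical constants in SI units (exact or CODATA values).
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K


def bjerrum_length_nm(eps_r, temperature=298.15):
    """Bjerrum length in nm for relative dielectric constant eps_r.

    SI form of Eq. 7.9: l_B = e^2 / (4 pi eps_0 eps_r k_B T).
    The default temperature (room temperature) is an assumption.
    """
    l_b_m = E_CHARGE**2 / (4.0 * math.pi * EPS_0 * eps_r * K_B * temperature)
    return l_b_m * 1e9


# Water versus an oil-like protein core, with the dielectric constants from the text.
for label, eps_r in [("water", 80.0), ("protein core", 5.0)]:
    print(f"{label}: l_B = {bjerrum_length_nm(eps_r):.2f} nm")
```

Because $l_B \propto 1/\varepsilon$, lowering the dielectric constant from 80 to 5 increases the Bjerrum length sixteenfold, well beyond ionic sizes, which is the quantitative content of the statement that free charges are rare inside a protein core.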
Heating to such temperatures leads to a so-called plasma, a state of matter of no biological relevance.
As a warm-up exercise, let us first consider a simple special case, namely an infinite system without any fixed charges, containing only mobile charges. Suppose we have an equal number of positively and negatively charged ions of valency $Z$. In this case the PB equation 7.8 reduces to
$$-\Delta \Phi(\mathbf{x}) + 8\pi l_B Z c_{\text{salt}} \sinh\left( Z \Phi(\mathbf{x}) \right) = 0 \tag{7.10}$$
with $c_{\text{salt}}$ denoting the bulk ion density, the salt concentration. At first sight Eq. 7.10 might look difficult to solve but in fact the solution is as trivial as possible, namely
$$\Phi(\mathbf{x}) = 0 \tag{7.11}$$
everywhere. This result is rather disappointing but not really surprising since the PB equation results from a mean-field approximation. And the mean electrical field of an overall neutral system of uniform positive and negative charges vanishes. In reality one has thermal fluctuations that lead locally to an imbalance between the two charge species. But such fluctuations are not captured in PB theory. So far it seems that PB produces nothing interesting. This is, however, not true: as soon as fixed charges are introduced, one obtains non-trivial insights. As we shall see later, even the fluctuations in a salt solution in the absence of fixed charges can be incorporated nicely in a linearized version of the PB theory, the Debye–Hückel theory, which is treated in Section 7.4.
In the following we study the distribution of ions above a charged surface as depicted in Fig. 7.1. This is an exactly solvable case that provides crucial insight into the electrostatics of highly charged surfaces and, as we shall see later, of DNA itself. The system consists of the infinite half-space $z \geq 0$ and is bounded by a homogeneously charged surface of surface charge number density $-\sigma$ at $z = 0$. Above the surface, $z > 0$, we assume to have only ions that carry charges of sign opposite to that of the surface, so-called counterions. The counterions can be interpreted to stem from a chemical dissociation at the surface, leaving behind the surface charges. These ions make sure that the charge neutrality condition, Eq. 7.7, is respected. We assume that no salt is added, i.e., there are no negatively charged ions present. The PB equation, Eq. 7.8, then has the following form:
$$\Phi''(z) + C e^{-\Phi(z)} = 4\pi l_B \sigma\, \delta(z). \tag{7.12}$$
Figure 7.1 Atmosphere of positively charged counterions (blue) above a surface with a negative charge number density $-\sigma$ (light red) which is assumed to be homogeneously smeared out.
We replaced here the term $4\pi l_B Z c_0$ by the constant $C$, to be determined below, and the primes denote differentiation with respect to $z$, $' = d/dz$. As a result of the symmetry of the problem, this is an equation for the $z$-direction only since the potential is constant for directions parallel to the surface. To solve Eq. 7.12, let us consider the space above the surface, $z > 0$. Due to the absence of fixed charges, we find
$$\Phi''(z) + C e^{-\Phi(z)} = 0. \tag{7.13}$$
Multiplying this equation by $\Phi'$ and integrating along $z$ leads to
$$E = \frac{1}{2} \Phi'^2 - C e^{-\Phi} \tag{7.14}$$
where $E$ denotes an integration constant. To solve Eq. 7.14 we use the trick of the separation of variables, here of $z$ and $\Phi$, i.e., we rewrite this equation as
$$dz = \pm \frac{d\Phi}{\sqrt{2E + 2C e^{-\Phi}}}. \tag{7.15}$$
Integration yields
$$z - \bar{z} = \pm \int_{\bar{\Phi}}^{\Phi} \frac{d\Phi'}{\sqrt{2E + 2C e^{-\Phi'}}} \tag{7.16}$$
where we start the integration at height $\bar{z}$ above the surface where $\Phi(\bar{z}) = \bar{\Phi}$. As we shall see a posteriori, we obtain the solution with the right boundary conditions if we use the positive sign and set $E = 0$. This makes the integral in Eq. 7.16 trivial. If we set $\bar{z} = 0$ and choose $\bar{\Phi} = 0$ we find
$$z = \frac{1}{\sqrt{2C}} \int_0^{\Phi} e^{\Phi'/2}\, d\Phi' = \sqrt{\frac{2}{C}} \left( e^{\Phi/2} - 1 \right). \tag{7.17}$$
Solving this for $\Phi$, one finally obtains the potential as a function of $z$:
$$\Phi = 2 \ln\left( 1 + \sqrt{\frac{C}{2}}\, z \right). \tag{7.18}$$
At a charged surface the electrical field $-d\varphi/dz$ makes a jump proportional to the surface charge density. It vanishes below the surface and attains the value
$$\left. \frac{d\Phi}{dz} \right|_{z \downarrow 0} = 4\pi l_B \sigma = \sqrt{2C} \tag{7.19}$$
just above the surface. This sets $C$. In fact, $\sqrt{2/C}$ turns out to be the second important length scale in electrostatics, the Gouy–Chapman length:
$$\lambda = \frac{1}{2\pi l_B \sigma}. \tag{7.20}$$
The physical meaning of this length becomes clear further below. We can now rewrite Eq. 7.18 as
$$\Phi = 2 \ln\left( 1 + \frac{z}{\lambda} \right). \tag{7.21}$$
The atmosphere of counterions above the surface is then distributed according to Eq. 7.5:
$$c(z) = c_0 e^{-\Phi} = \frac{c_0 \lambda^2}{(z + \lambda)^2}. \tag{7.22}$$
The factor $c_0$ in Eq. 7.22 has to be chosen such that the total charge of the counterions exactly compensates the charge of the surface, see Eq. 7.7:
$$\int_{-\infty}^{\infty} \left[ c(z) - \sigma\, \delta(z) \right] dz = 0. \tag{7.23}$$
Figure 7.2 Potential $\Phi$, Eq. 7.21, and rescaled counterion density $2\pi l_B \lambda^2 c$, Eq. 7.24, as a function of the rescaled height $z/\lambda$ above a charged surface. The dashed lines indicate a simplified counterion profile where all counterions form an ideal gas inside a layer of thickness $\lambda$.
This sets $c_0$ to be $\sigma/\lambda$. We finally arrive at
$$c(z) = \frac{1}{2\pi l_B (z + \lambda)^2}. \tag{7.24}$$
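The closed-form results above are easy to check numerically. The sketch below, in plain Python, evaluates Eqs. 7.20, 7.21 and 7.24, verifies by finite differences that the potential satisfies Eq. 7.13 with $C = 2/\lambda^2$, and integrates the density up to the height $\lambda$. The surface charge density of one charge per nm$^2$ and all function names are illustrative assumptions, not taken from the text.

```python
import math

L_B = 0.7  # Bjerrum length in water in nm, as quoted in the text


def gouy_chapman_length(sigma, l_b=L_B):
    """Eq. 7.20: lambda = 1 / (2 pi l_B sigma), sigma in charges per nm^2."""
    return 1.0 / (2.0 * math.pi * l_b * sigma)


def potential(z, lam):
    """Eq. 7.21: dimensionless potential Phi(z) = 2 ln(1 + z / lambda)."""
    return 2.0 * math.log(1.0 + z / lam)


def density(z, lam, l_b=L_B):
    """Eq. 7.24: counterion density c(z) = 1 / (2 pi l_B (z + lambda)^2)."""
    return 1.0 / (2.0 * math.pi * l_b * (z + lam) ** 2)


def pb_residual(z, lam, h=1e-4):
    """Finite-difference value of Phi'' + C exp(-Phi), Eq. 7.13, with C = 2 / lambda^2."""
    second = (potential(z + h, lam) - 2.0 * potential(z, lam)
              + potential(z - h, lam)) / h**2
    return second + (2.0 / lam**2) * math.exp(-potential(z, lam))


def counterions_below(height, lam, l_b=L_B, steps=100_000):
    """Midpoint-rule integral of c(z) from 0 to height (counterions per nm^2)."""
    dz = height / steps
    return sum(density((i + 0.5) * dz, lam, l_b) for i in range(steps)) * dz


sigma = 1.0  # hypothetical surface charge density, charges per nm^2
lam = gouy_chapman_length(sigma)
print(f"Gouy-Chapman length: {lam:.3f} nm")
print(f"PB residual at z = lambda: {pb_residual(lam, lam):.2e}")
print(f"fraction of counterions below lambda: {counterions_below(lam, lam) / sigma:.4f}")
```

The residual should vanish up to discretization error, and the integrated fraction of counterions within one Gouy–Chapman length of the surface should come out as one half.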
The distribution of Eq. 7.24 is depicted in Fig. 7.2 together with the potential $\Phi$, Eq. 7.21. The density of ions above the surface decays for distances larger than $\lambda$ algebraically as $z^{-2}$. This is somewhat surprising since we have seen that the distribution of gas molecules in a gravity field decays exponentially, see Eq. 2.41. The physical reason is that the gas particles do not feel each other but the ions do. The higher the ions are above the surface, the less they “see” the original surface charge density since the atmosphere of ions below masks the surface charges. As a result, the ions farther above the surface feel less strongly attracted, which leads to a slower decay of the density with height.
We can now attach a physical meaning to the Gouy–Chapman length $\lambda$. First of all, $\lambda$ is the height up to which half of the counterions are found since $\int_0^{\lambda} c(z)\, dz = \sigma/2$. Secondly, if we take a counterion at the surface where $\Phi(0) = 0$ and move it up to the height $\lambda$ where $\Phi(\lambda) = 2 \ln 2$ we have to perform work on the order of the thermal energy, $e \Delta\varphi = 2 \ln 2\, k_B T \approx k_B T$. One can say that the ions in the layer of thickness $\lambda$ above the surface form an ideal gas since the thermal energy outweighs the electrostatic attraction
to the surface. On the other hand, if an ion attempts to “break out” and escape to infinity, it will inevitably fail since it would have to pay an infinite price: $\Phi \to \infty$ for $z \to \infty$. This means that all the counterions are effectively bound to the surface. But half of the counterions, namely those close to the surface, are effectively not aware of their “imprisonment.”
Based on these ideas, we now try to estimate the free energy $f_{\text{approx}}$ per area of this so-called electrical double layer. We assume that all the counterions form an ideal gas confined to a slab of thickness $\lambda$ above the surface, as indicated by the dashed box in Fig. 7.2. The density of the ions is thus $c = \sigma/\lambda$, which, according to Eq. 2.62, leads to the free energy density
$$\beta f_{\text{approx}} = c \left[ \ln\left( c \lambda_T^3 \right) - 1 \right] \lambda = \sigma \left[ \ln\left( \frac{\sigma \lambda_T^3}{\lambda} \right) - 1 \right] \tag{7.25}$$
where $\lambda_T$ is the thermal de Broglie wavelength, see Eq. 2.11. We now show that this simple expression is astonishingly close to the exact (mean-field) expression. A more formal, less intuitive way of introducing the PB theory would have been to write down an appropriate free energy functional $F$ from which the PB equation follows via minimization. This functional is the sum of the electrostatic internal energy and the entropy of the ions in the solution:
$$\beta F = \frac{1}{8\pi l_B} \int \left( \nabla \Phi(\mathbf{r}) \right)^2 d^3r + \int \frac{\rho(\mathbf{r})}{e} \left[ \ln\left( \frac{\rho(\mathbf{r})}{e} \lambda_T^3 \right) - 1 \right] d^3r. \tag{7.26}$$
Replacing $\rho(\mathbf{r})$ in this functional by $\Phi(\mathbf{r})$ through the Poisson equation $\Delta \Phi = -4\pi l_B \rho/e$, Eq. 7.2, one finds that the Euler–Lagrange equation is indeed identical to the PB equation, Eq. 7.8, namely here $\Delta \Phi(\mathbf{x}) + 4\pi l_B c_0 e^{-\Phi(\mathbf{x})} = 0$. Inserting the PB solution for a charged surface, Eqs. 7.21 and 7.24, into the free energy functional, Eq. 7.26, we find the following free energy density per area:
$$\beta f = \sigma \left[ \ln\left( \frac{\sigma \lambda_T^3}{\lambda} \right) - 2 \right]. \tag{7.27}$$
The exact expression, Eq. 7.27, differs from the approximate one, Eq. 7.25, just by a term $-\sigma$. Given this agreement, we can rightly say that we have a fairly clear qualitative understanding of the physics of the electrical double layer.
Since we are mainly interested in the interactions between macromolecules, especially between two DNA molecules and between a DNA molecule and a protein, we will now discuss two model cases: the interaction between two negatively charged surfaces and the interaction between two oppositely charged surfaces. We begin with two negatively charged surfaces. The exact electrostatics can be worked out along the lines of Eqs. 7.12 to 7.18 using appropriate values of the integration constant. We prefer to give a more physical line of argument here. Suppose the two parallel walls, at distance $D$, carry exactly the same surface charge density $-\sigma$, see Fig. 7.3. Then due to the symmetry of the problem the electrical field in the midplane vanishes; this plane is indicated in the drawing, Fig. 7.3, by dashed lines. The disjoining pressure $\Pi$ between the two planes, i.e., the force per area with which they repel each other, can then be easily calculated since it must equal the pressure of the counterions in that midplane. Using the ideal gas law, Eq. 2.27, we find
$$\frac{\Pi}{k_B T} = c\left( \frac{D}{2} \right). \tag{7.28}$$
Figure 7.3 Two parallel, negatively charged surfaces and their counterions. Left: For large separations $D$ between the surfaces, the two counterion clouds hardly interact. Right: If the planes are close to each other, the two clouds combine and form a dense “gas,” homogeneously distributed across the gap.
Without doing any extra work, we can now predict the disjoining pressure between the two surfaces in two asymptotic cases. If the distance is much larger than the Gouy–Chapman length $\lambda$ of the planes, $D \gg \lambda$, we can assume that the two counterion clouds are independent of each other. The density in the midplane is then the sum of the two single-plane densities, see Fig. 7.3 (left). From Eq. 7.24 we obtain
$$\frac{\Pi}{k_B T} \approx 2\, \frac{1}{2\pi l_B D^2/4} = \frac{4}{\pi l_B D^2}. \tag{7.29}$$
Remarkably, the disjoining pressure is here independent of $\sigma$. This results from the fact that the single plane counterion density, Eq. 7.24, becomes independent of $\lambda$ (and thus $\sigma$) for $D \gg \lambda$. In the other limit, $D \ll \lambda$, the two counterion clouds overlap strongly and we expect a flat density profile, see Fig. 7.3 (right). Hence
$$\frac{\Pi}{k_B T} \approx \frac{2\sigma}{D} = \frac{1}{\pi l_B \lambda D}. \tag{7.30}$$
The pressure here is linear in $\sigma$, reflecting the counterion density. Note that these results show that the situation is very different from how we are used to thinking about electrostatics, namely that the pressure results from the direct electrostatic repulsion of the two charged surfaces. In fact, in the absence of counterions the electrical field between the surfaces is constant and follows from the boundary condition, Eq. 7.19. This leads to
$$\frac{\Pi}{k_B T} = 4\pi l_B \sigma^2, \tag{7.31}$$
which is independent of the distance between the surfaces and, as a result of the pairwise interaction between surface charges, proportional to $\sigma^2$. Therefore, the counterions completely modify the electrostatics and actually “dominate” it.
This becomes even more evident when looking at the interaction between two oppositely charged surfaces, see Fig. 7.4. Such a situation arises when a positively charged protein comes close to a negatively charged DNA molecule. For simplicity, let us assume that the number charge densities of the two surfaces are identical, $\sigma_+ = \sigma_- = \sigma$. If the two surfaces are very far from each other, we can assume that both form the usual electrical double layer of thickness $\lambda$, one with positive counterions and one with negative counterions, see Fig. 7.4 (left). If the two surfaces come close to each other, Fig. 7.4 (right), there is, however, no need for counterions anymore since the two surfaces neutralize each other. The counterions can therefore escape to infinity and gain translational entropy on the order of $k_B T$. The binding energy per area of the two surfaces as a result of this counterion release should therefore be of the order of $k_B T \sigma$. If the surface charge densities are not the same, charge neutrality enforces that some of the counterions remain between the surfaces.
Figure 7.4 Two oppositely charged surfaces and their counterions. Left: For large separation $D$ between the surfaces, the two counterion clouds hardly interact. Right: If the planes are close to each other, the counterions are not needed anymore. They gain entropy by escaping to infinity.
Note that our model system, which assumes two infinitely large surfaces and no added salt, is quite academic and that a precise calculation of this effect is not possible in the current framework. No matter how far apart the two surfaces are: if we look at length scales much larger than their separation, the two surfaces together look like a neutral plane. As a result, the counterions are never really bound. In the following sections we have to come up with slightly more realistic situations that allow better descriptions of the counterion release mechanism.
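For the two like-charged surfaces discussed earlier in this section, the asymptotic estimates of Eqs. 7.29 and 7.30 can be compared with the exact mean-field result. The sketch below relies on the standard closed-form PB solution for counterions between two equally charged plates, which is not derived in the text: with the midplane at $z = 0$ the density is $c(z) = c_m/\cos^2(Kz)$ with $K^2 = 2\pi l_B c_m$, and charge neutrality requires $K \tan(KD/2) = 1/\lambda$. Writing $u = KD/2$, the pressure of Eq. 7.28 becomes $\Pi/k_B T = c_m = 2u^2/(\pi l_B D^2)$. The function names and the sample value of $\lambda$ are assumptions made for illustration.

```python
import math

L_B = 0.7  # Bjerrum length in water in nm, as quoted in the text


def exact_pressure(D, lam, l_b=L_B):
    """Pi / (k_B T) from the exact counterion-only PB solution between two plates.

    Solves u * tan(u) = D / (2 lambda) for u in (0, pi/2) by bisection and
    returns the midplane density c_m = 2 u^2 / (pi l_B D^2).
    """
    target = D / (2.0 * lam)
    lo, hi = 0.0, math.pi / 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.tan(mid) < target:  # u * tan(u) increases monotonically
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return 2.0 * u * u / (math.pi * l_b * D * D)


def far_estimate(D, l_b=L_B):
    """Eq. 7.29: superposition estimate for D >> lambda."""
    return 4.0 / (math.pi * l_b * D * D)


def near_estimate(D, lam, l_b=L_B):
    """Eq. 7.30: flat-profile estimate for D << lambda."""
    return 1.0 / (math.pi * l_b * lam * D)


lam = 0.23  # hypothetical Gouy-Chapman length in nm
for ratio in (1e-3, 1e-1, 1.0, 1e1, 1e3):
    D = ratio * lam
    print(f"D/lambda = {ratio:8.3f}  exact = {exact_pressure(D, lam):.4e}  "
          f"Eq. 7.29 = {far_estimate(D):.4e}  Eq. 7.30 = {near_estimate(D, lam):.4e}")
```

In the limit $D \ll \lambda$ the exact pressure approaches Eq. 7.30. For $D \gg \lambda$ it approaches $\pi/(2 l_B D^2)$, which has the same $\sigma$-independent $D^{-2}$ scaling as Eq. 7.29 but a prefactor larger by $\pi^2/8 \approx 1.23$, as one may expect from a superposition estimate.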
7.3 Electrostatics of Cylinders and Spheres
So far we have discussed charged planar surfaces. However, at length scales below its persistence length the DNA double helix looks more like a cylinder and the shapes of globular proteins might be better described by spheres. We ask here whether the basic physics described in the previous section still applies to such objects. As we shall see, this is actually a subtle problem that can be understood in beautiful physical terms.
Let us start with DNA. DNA is a charged cylinder with a diameter of 2 nm and a line charge density of $-2e/0.33$ nm, see Fig. 4.6. The question that we want to answer is whether the counterions of such a charged cylinder are effectively bound or whether they are free. The answer is surprising: Around three quarters of the DNA’s counterions are condensed but the rest are free and can go wherever they want. We give a simple physical argument here that goes back to the great Norwegian scientist Lars Onsager. For simplicity, we describe the DNA molecule as an infinitely long cylinder of line charge density $-e/b$ and diameter $2R$. The charges are assumed to be homogeneously smeared out on its surface. The dimensionless electrostatic potential of a cylinder is known to be
$$\Phi(r) = \frac{2 l_B}{b} \ln\left( \frac{r}{R} \right) \tag{7.32}$$
where $r \geq R$ denotes the distance from the centerline of the cylinder. Suppose we start with a universe that consists only of one infinitely long cylinder. Now let us add one counterion. We ask ourselves whether this counterion will be bound to the cylinder or whether it is able to escape to infinity. In order to find out, we introduce two arbitrary radii $r_1$ and $r_2$ with $r_2 \gg r_1 \gg R$ as depicted in Fig. 7.5. Now suppose the counterion tries to escape from the cylindrical region of radius $r_1$ to the larger cylindrical region of radius $r_2$. According to Eq. 7.32 the counterion has to pay a price, namely it has to move uphill in the electrostatic potential by an amount of the order of
$$\Delta \Phi = \Phi(r_2) - \Phi(r_1) = \frac{2 l_B}{b} \ln\left( \frac{r_2}{r_1} \right). \tag{7.33}$$
Figure 7.5 Onsager’s argument for counterion condensation on charged cylinders is based on an estimate of the free energy change for a counterion that goes from a cylindrical region of radius $r_1$ to a larger region of radius $r_2$.
At the same time it has a much larger space at its disposal, i.e., it enjoys an entropy gain. The entropy of a single ion in a volume $V$ follows from the ideal gas entropy $S = k_B N \left[ \ln\left( V/(N \lambda_T^3) \right) + 5/2 \right]$ with $N = 1$. This equation follows from combining Eq. 2.62 with Eqs. 2.26 and 2.60. When the ion moves from the smaller to the larger region we find the following change in entropy:
$$\Delta S = S(r_2) - S(r_1) = k_B \ln\left( \frac{r_2^2}{r_1^2} \right) = 2 k_B \ln\left( \frac{r_2}{r_1} \right). \tag{7.34}$$
Altogether this amounts to a change in the free energy of
$$\Delta F / k_B T = \Delta \Phi - \Delta S / k_B = 2 \left( \frac{l_B}{b} - 1 \right) \ln\left( \frac{r_2}{r_1} \right). \tag{7.35}$$
There are two possible cases. For weakly charged cylinders, $b > l_B$, the free energy change is negative, $\Delta F < 0$, and the counterion eventually escapes to infinity. For highly charged cylinders, $b < l_B$, one finds $\Delta F > 0$. In that case the energy cost is too high as compared to the entropy gain and the counterion always stays close to the cylinder. Now the same argument can be used for the rest of the counterions. What we have to do is simply add, one by one, all the counterions. The non-trivial and thus interesting case is that of a highly charged cylinder with $b < l_B$. In the beginning all the counterions that we add condense, thereby reducing the effective line charge. This continues up to the point when the line charge
density has been lowered to the value $-e/l_B$. All the following counterions added feel a cylinder that has an effective line density that is just too weak to keep them sufficiently attracted, allowing them to escape to infinity. To conclude, the interplay between entropy and energy regulates the charge density of a cylinder to the critical value $-e/l_B$. Cylinders with a higher effective charge density simply cannot exist.
According to the above given definition, DNA is a highly charged cylinder. Counterion condensation reduces its bare charge density of $1/b = 2/(0.33\ \text{nm})$ to the critical value $1/l_B = 1/(0.7\ \text{nm})$. That means that a fraction
$$\frac{e/b - e/l_B}{e/b} = 1 - \frac{b}{l_B} = 1 - \frac{1}{\xi}, \tag{7.36}$$
i.e., about 76%, of the DNA’s counterions are condensed. Counterion condensation on cylinders is called Manning condensation and is characterized by the dimensionless ratio $\xi = l_B/b$, the Manning parameter. Cylinders with $\xi > 1$ are highly charged and have condensed counterions. More precise treatments based on the PB equation show that this simple line of argument is indeed correct.
There is another interesting interpretation for Manning condensation. In Section 7.2 we have seen that all the counterions of an infinite, planar surface are condensed. Now a cylinder looks like a flat surface to a counterion if the Gouy–Chapman length, the typical height in which it lives above the surface, is much smaller than the radius of the cylinder, i.e., if $\lambda \ll R$. Using the definition of $\lambda$, Eq. 7.20, this leads to the condition $\xi \gg 1$. One can say that for $\xi > 1$ a counterion experiences the cylinder as a flat surface and thus stays bound to it.
Let us now study a model protein, a sphere of radius $R$ that carries a total charge $eZ$ homogeneously smeared out over its surface. We can again use an Onsager-like argument by adding a single counterion to a universe that consists only of that sphere.
We estimate the change in free energy when the counterion moves from a spherical region of radius $r_1 \gg R$ around the sphere to a larger region of radius $r_2 \gg r_1$, see Fig. 7.6. The change in electrostatic energy is given by $\Delta \Phi = \Phi(r_2) - \Phi(r_1) = l_B Z \left( r_2^{-1} - r_1^{-1} \right)$ and that of the entropy by $\Delta S = 3 k_B \ln(r_2/r_1)$. We learn from this that the free energy change $\Delta F / k_B T = \Delta \Phi - \Delta S / k_B$ goes to $-\infty$ for $r_2 \to \infty$, no matter how highly the sphere is charged. This suggests that a charged sphere always loses all its counterions.
Figure 7.6 Onsager’s argument applied to a charged sphere.
Our results on counterion condensation that we have obtained so far can be summarized as follows. The fraction of condensed ions, $f_{\text{cond}}$, depends on the shape of the charged object as follows:
• plane: $f_{\text{cond}} = 1$,
• cylinder: $f_{\text{cond}} = 1 - \xi^{-1}$ for $\xi > 1$, $f_{\text{cond}} = 0$ otherwise,
• sphere: $f_{\text{cond}} = 0$.
It is important to realize that we have considered fairly academic special cases so far. First of all, we assumed infinitely extended planes and infinitely long cylinders but any real object is of finite size. Any object of finite extension looks from far away like a point charge and will thus lose all its counterions, as a sphere does. One might therefore think that theorizing about counterion condensation is a purely academic exercise. Fortunately, this is not the case because, as we shall see now, counterions might also condense on spheres.
We concluded above for the spherical case $f_{\text{cond}} = 0$, assuming that we only had one sphere in the universe. If there is a finite density of spheres, each with its counterions, the situation can be different. Also we assumed that there are no small ions present, except the counterions of the sphere. If we have a single sphere but a finite salt concentration, the situation can again differ from the above given academic case. In both cases, for a finite density of spheres or for a finite salt concentration, the entropy gain for a counterion to escape to infinity is not infinite anymore. Depending on the sphere charge and on the concentration of small ions in the bulk, there might be a free energy penalty instead.
We consider now a single sphere in a salt solution following the line of argument given by Alexander and coworkers (Alexander et al., 1984). They postulated two zones for a highly charged sphere at moderate salt concentration $c_{\text{salt}}$, see Fig. 7.7. Zone I is the layer of condensed counterions of thickness $\lambda$ and zone II is the bulk.
Figure 7.7 A highly charged sphere in a salt solution. To a good approximation ions can “live” in two zones. Zone I contains “condensed” counterions, zone II the bulk electrolyte solution.
When a counterion from the bulk, zone II, enters zone I it loses entropy since it goes from the dilute salt solution of concentration $c_{\text{salt}}$ to the dense layer of condensed counterions. The ion concentration of that layer can be estimated to be $c_{\text{cond}} \approx \sigma/\lambda = 2\pi l_B \sigma^2$ where $\sigma$ denotes the surface charge number density of the sphere. We assume here that the sphere is so highly charged that most of its counterions are confined to zone I. The entropy loss is then given by
$$\Delta S = S_{\text{I}} - S_{\text{II}} \approx -k_B \ln\left( \frac{c_{\text{cond}}}{c_{\text{salt}}} \right) = -k_B \Omega. \tag{7.37}$$
Counterions also gain something by entering zone I. In zone II a counterion does not feel the presence of the charged sphere since the electrostatic interaction is screened by the other small ions, as shall become clear in the following section. On the other hand, in zone I it sees effectively a sphere of charge $Z^*$ where $Z^*$ denotes the sum of the actual sphere charge, $Z$, and the charges from the condensed counterions inside zone I. The gain in electrostatic energy is thus
$$\Delta \Phi \approx -\frac{l_B Z^*}{R}. \tag{7.38}$$
If we start with a system where all counterions are inside the bulk, counterions flow into zone I up to a point when there is no free energy gain anymore. This point is reached when the charge is renormalized to the value
$$e Z^* = e\, \Omega\, \frac{R}{l_B}. \tag{7.39}$$
To formulate it in a more elegant way: $Z^*$ is the point where the chemical potentials of zone I and II are identical. However, it should be noted that in order to obtain Eq. 7.39 we cheated a bit since we assumed that $\Omega$ is a constant. This is not really the case since according to Eq. 7.37 $\Omega$ depends on $c_{\mathrm{cond}}$ and thus on $Z^*$. Since this dependence is logarithmic, i.e., very weak, this simplification is quite reasonable and one can assume $\Omega$ to be a constant with a value of around 5 for typical salt concentrations and surface charge densities encountered in cells. A more concise way of calculating the renormalized charge $Z^*$ is given in the next section.
We are now in the position to refine our argument on counterion release from the end of Section 7.2. Consider again the case of two oppositely charged surfaces as depicted in Fig. 7.4 but with additional salt. If the surface charge densities of the two surfaces are the same, all the counterions are released and the gain in free energy reflects the change in concentration experienced by the counterions. The free energy change per surface thus scales as
$$\frac{f}{k_B T} \approx \Omega\,\sigma \approx \sigma \ln\frac{2\pi l_B \sigma^2}{c_{\mathrm{salt}}}. \tag{7.40}$$
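As a rough numerical illustration of Eqs. 7.37 and 7.39, the following sketch evaluates $\Omega$ and the renormalized charge $Z^*$. The inputs are assumptions chosen for illustration only: a Bjerrum length of 0.7 nm, a surface charge density of 1 nm$^{-2}$, a salt concentration of 100 mM and a hypothetical sphere radius of 5 nm. The function names are ours and not part of any library.

```python
import math

L_B = 0.7               # Bjerrum length in nm (assumed value for water)
AVOGADRO = 6.022e23     # particles per mole
NM3_PER_LITER = 1e24    # 1 liter = 1e24 cubic nanometers


def molar_to_per_nm3(c_molar):
    """Convert a concentration from mol/l to particles per nm^3."""
    return c_molar * AVOGADRO / NM3_PER_LITER


def omega(sigma, c_salt):
    """Omega = ln(c_cond / c_salt) with c_cond = 2 pi l_B sigma^2, cf. Eq. 7.37."""
    c_cond = 2.0 * math.pi * L_B * sigma**2
    return math.log(c_cond / c_salt)


def renormalized_charge(radius, om):
    """Rough two-zone estimate Z* = Omega R / l_B, cf. Eq. 7.39."""
    return om * radius / L_B


sigma = 1.0                      # assumed surface charge density in 1/nm^2
c_salt = molar_to_per_nm3(0.1)   # assumed 100 mM monovalent salt
om = omega(sigma, c_salt)
z_star = renormalized_charge(5.0, om)   # hypothetical sphere radius of 5 nm

print(f"Omega = {om:.2f}")
print(f"Z*    = {z_star:.1f}")
```

For these inputs $\Omega$ comes out slightly above 4, in line with the value of around 5 quoted for typical conditions. Because the dependence is logarithmic, changing the salt concentration by a factor of ten shifts $\Omega$ only by $\ln 10 \approx 2.3$.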
When discussing PB theory above, especially for a spherical geometry where no analytical solutions exist, we had to rely on simplified arguments. It turns out that one can gain a great deal of insight by linearizing PB theory. Strictly speaking, such a linearization only makes sense for weakly charged surfaces, but we shall see that there is an elegant argument that also allows us to extend this framework to highly charged objects.
7.4 Debye–Hückel Theory
As mentioned earlier, the PB equation is hard to handle since it is non-linear. Here we study its linearized version, the well-known Debye–Hückel (DH) theory. It provides an excellent approximation to PB theory for the case that the fixed charges are small enough. Consider the PB equation of a salt solution of valency $Z = 1$ and concentration $c_{\mathrm{salt}}$ in the presence of fixed charges of density $\rho_{\mathrm{fixed}}$. The PB equation 7.8 then has the form
$$\Delta\Phi + 4\pi l_B c_{\mathrm{salt}}\left(e^{-\Phi} - e^{+\Phi}\right) = -4\pi l_B \frac{\rho_{\mathrm{fixed}}}{e}. \tag{7.41}$$
Let us now assume that the electrostatic energy is small everywhere, i.e., that $\Phi(\mathbf{x}) \ll 1$ for all $\mathbf{x}$. In that case we can linearize the exponential functions, $e^{\Phi} \approx 1 + \Phi$ and $e^{-\Phi} \approx 1 - \Phi$. This results in the DH equation
$$-\Delta\Phi + \kappa^2\Phi = 4\pi l_B \frac{\rho_{\mathrm{fixed}}}{e}. \tag{7.42}$$
We introduced here the last of the three important length scales in electrostatics, the Debye screening length $\kappa^{-1}$. For monovalent salt, as assumed here, this length is given by
$$\kappa^{-1} = \frac{1}{\sqrt{8\pi l_B c_{\mathrm{salt}}}}. \tag{7.43}$$
Its physical meaning will become clear below. We can now come back to the disappointing result we encountered earlier when we looked at a salt solution in the absence of fixed charges where the PB equation 7.10 is solved by $\Phi \equiv 0$. This has not changed here since the DH equation also produces the same trivial answer. But now we are in the position to go beyond this result and to include in our discussion correlations between salt ions. This would have been very difficult to do for the PB equation where no appropriate analytical solutions are available. Consider a point charge $+eZ$ at position $\mathbf{x}'$. The DH equation for such a test charge takes the form:
$$\left(-\Delta + \kappa^2\right) G(\mathbf{x}, \mathbf{x}') = 4\pi l_B Z\,\delta(\mathbf{x} - \mathbf{x}'). \tag{7.44}$$
Knowing $G(\mathbf{x}, \mathbf{x}')$, the Green’s function, allows one to calculate $\Phi$ for an arbitrary distribution of fixed charges:
$$\Phi(\mathbf{x}) = \int G(\mathbf{x}, \mathbf{x}')\,\frac{\rho_{\mathrm{fixed}}(\mathbf{x}')}{eZ}\,d^3x'. \tag{7.45}$$
Figure 7.8 An ion of charge $+eZ$ is surrounded by an oppositely charged ion cloud of typical size $\kappa^{-1}$.
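To get a feeling for the numbers behind Eq. 7.43, the short sketch below evaluates the Debye screening length for a few monovalent salt concentrations. The Bjerrum length of 0.7 nm and the chosen concentrations are assumptions made for illustration, and the helper name `debye_length` is ours.

```python
import math

L_B = 0.7               # Bjerrum length in nm (assumed value for water)
AVOGADRO = 6.022e23     # particles per mole
NM3_PER_LITER = 1e24    # 1 liter = 1e24 cubic nanometers


def debye_length(c_salt_molar, l_b=L_B):
    """Debye screening length in nm for monovalent salt, cf. Eq. 7.43."""
    c_salt = c_salt_molar * AVOGADRO / NM3_PER_LITER   # ions per nm^3
    return 1.0 / math.sqrt(8.0 * math.pi * l_b * c_salt)


for c_molar in (0.001, 0.01, 0.1, 1.0):
    print(f"c_salt = {c_molar:5.3f} M  ->  1/kappa = {debye_length(c_molar):5.2f} nm")
```

At 100 mM the screening length is close to 1 nm, and it grows only with the inverse square root of the salt concentration, so a hundredfold dilution increases it by a factor of ten.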
The Green’s function for Eq. 7.44 is given by
$$G(\mathbf{x}, \mathbf{x}') = \frac{l_B Z}{|\mathbf{x} - \mathbf{x}'|}\,e^{-\kappa|\mathbf{x} - \mathbf{x}'|}. \tag{7.46}$$
One calls this a Yukawa-type potential, referring to Yukawa’s original treatment introduced to describe the nuclear interaction between protons and neutrons due to pion exchange. That this indeed solves Eq. 7.44 can be checked by letting the Laplace operator in spherical coordinates act on the potential of a point charge at the origin:
$$-\Delta\frac{e^{-\kappa r}}{r} = -\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right)\frac{e^{-\kappa r}}{r} = -\kappa^2\frac{e^{-\kappa r}}{r} + 4\pi\delta(\mathbf{r}). \tag{7.47}$$
To derive Eq. 7.47 we used the fact that $\Delta\left(1/|\mathbf{x} - \mathbf{x}'|\right) = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ as mentioned above Eq. 7.3. What is the physical picture behind Eq. 7.46? In the absence of salt ions, one would have just the potential $\Phi(\mathbf{x}) = l_B Z/|\mathbf{x} - \mathbf{x}'|$ around our test charge, i.e., Eq. 7.46 without the exponential factor or, if you prefer, the full Eq. 7.46 but with $\kappa = 0$. In the presence of salt ions, the test charge is surrounded by an oppositely charged ion cloud as schematically depicted in Fig. 7.8. This ion cloud effectively screens the test charge so that the potential decays faster than $1/r$, namely like $e^{-\kappa r}/r$. The screening length $\kappa^{-1}$ reflects the typical size of the cloud. Having at hand an expression for the potential around an ion, we calculate now the free energy of a salt solution on the level of the DH theory. As a first step we determine the change of the self-energy of an ion that is brought from ion-free water to the salt solution. We
consider the ion as a homogeneously charged ball of radius $a$ and charge density $\rho = 3e/(4\pi a^3)$. We shall show below that the result will not depend on the radius so that we can take the limit $a \to 0$. In an electrolyte-free environment the self-energy is
$$\left.\frac{1}{2}\int\!\!\int \frac{\rho(\mathbf{x})\,\rho(\mathbf{x}')}{\varepsilon|\mathbf{x} - \mathbf{x}'|}\,d^3x\,d^3x'\right|_{a=0} = \lim_{a\to 0}\frac{1}{2}\frac{e^2}{\varepsilon a} = \infty. \tag{7.48}$$
On the right-hand side we assumed a point-like charge for which $\rho(\mathbf{x}) = e\delta(\mathbf{x})$. There is evidently a problem since the self-energy of the point charge is infinite. Let us nevertheless go ahead and calculate the self-energy of the point charge inside an electrolyte solution:
$$\left.\frac{1}{2}\int\!\!\int \frac{\rho(\mathbf{x})\,\rho(\mathbf{x}')}{\varepsilon|\mathbf{x} - \mathbf{x}'|}\,e^{-\kappa|\mathbf{x} - \mathbf{x}'|}\,d^3x\,d^3x'\right|_{a=0} = \lim_{a\to 0}\frac{1}{2}\frac{e^2 e^{-\kappa a}}{\varepsilon a} = \infty. \tag{7.49}$$
Also here the self-energy is infinite. However, we are not interested in what it costs to “form” a point ion. What we want to know instead is the change in the self-energy when the ion is transferred from ion-free water to the electrolyte solution. This change turns out to be finite:
$$\beta\Delta E_{\mathrm{self}} = \frac{l_B}{2}\lim_{a\to 0}\left(\frac{e^{-\kappa a}}{a} - \frac{1}{a}\right) = -\frac{l_B\kappa}{2}. \tag{7.50}$$
Each particle in the electrolyte contributes this value to the internal energy. This leads to the following change in the internal energy density:
$$\beta u = 2c_{\mathrm{salt}}\,\beta\Delta E_{\mathrm{self}} = -\frac{\kappa^3}{8\pi}. \tag{7.51}$$
Combining Eqs. 2.14 and 2.60 we know that the average internal energy density $u$ follows from the free energy density $f$ via
$$u = \frac{\partial}{\partial\beta}\left[\beta f\right]. \tag{7.52}$$
This allows us to calculate the electrostatic contribution of the charge fluctuations to the free energy density. Since $l_B \propto \beta$ and hence $\kappa^3 \propto \beta^{3/2}$, integrating Eq. 7.52 over $\beta$ produces an additional factor 2/3:
$$f = -k_B T\,\frac{\kappa^3}{12\pi}. \tag{7.53}$$
This finding should surprise you. We discussed in Section 2.4 the impact of the interaction between particles of a real gas on
its pressure and free energy. According to the virial expansion (an expansion in the density $n$) the ideal gas expressions are changed by terms of the order $n^2$, see Eqs. 2.87 and 2.90. This reflects interactions between pairs of particles, see Fig. 2.8. Surprisingly, for the ion solution we find that interactions between ions instead lead to a free energy contribution proportional to $\kappa^3 \sim c_{\mathrm{salt}}^{3/2}$. How can one understand this discrepancy? The reason lies in the fact that the electrostatic interaction decays very slowly with distance. If one attempts to calculate the second virial coefficient $B_2$ for such a long-ranged $1/r$-potential, one finds a diverging integral: According to Eq. 2.88 the integrand is proportional to $r^2\left(e^{-\beta w(r)} - 1\right)$ which then scales for large $r$ as $r^2(1/r) = r$.
We provide now a scaling argument that makes Eq. 7.53 transparent. Consider a very small volume $V$ inside the electrolyte solution. Ions can enter and leave this volume at will as if they were uncharged and as a result the volume displays random fluctuations in its net charge. According to the central limit theorem, Eq. B.7, the net charge $Q$ can be estimated to be proportional to the square root of the number of ions $N_{\mathrm{ion}}$ inside that volume, i.e.,
$$Q/e \approx \pm\sqrt{N_{\mathrm{ion}}} = \pm\sqrt{c_{\mathrm{salt}} V}. \tag{7.54}$$
The assumption that the ions are independent of each other is only true up to regions of size $L$ with volume $V \approx L^3$ for which the electrostatic self-energy equals the thermal one (with $Q$ measured in units of $e$):
$$\frac{l_B Q^2}{L} \approx 1. \tag{7.55}$$
Inserting $Q^2 \approx c_{\mathrm{salt}} L^3$ from Eq. 7.54, this condition can be rewritten as
$$L \approx \frac{1}{\sqrt{l_B c_{\mathrm{salt}}}} \approx \kappa^{-1}, \tag{7.56}$$
i.e., the length scale up to which ions move independently from each other is just the Debye screening length, Eq. 7.43. For larger length scales a volume $\kappa^{-3}$ that happens to carry a positive excess charge is typically surrounded by regions with negative excess charge as schematically indicated in Fig. 7.9. The interaction energy of two such neighboring, oppositely charged regions is on the order of $-k_B T$ as follows directly from Eq. 7.55. We therefore expect that the fluctuations in the charge distribution lead to a contribution to the free energy density that scales like $-k_B T\kappa^3$. This is indeed what we found from the exact DH treatment, Eq. 7.53.
Figure 7.9 Schematic sketch of charge fluctuations inside an electrolyte solution. Regions of typical size $\kappa^{-1}$ with an excess of negative ions are surrounded by regions with positive net charge and vice versa.
The DH equation can be solved analytically for various geometries. Here we present the solutions for three standard geometries: a plane, a line and a charged ball. The DH equation for a plane of charge density $\sigma$ is given by
$$-\frac{\partial^2\Phi}{\partial z^2} + \kappa^2\Phi = 4\pi l_B\sigma\,\delta(z). \tag{7.57}$$
It is straightforward to check that this is solved by the potential
$$\Phi(z) = 4\pi l_B\sigma\kappa^{-1}e^{-\kappa z} \tag{7.58}$$
for $z \ge 0$ and $\Phi(z) = 0$ for $z < 0$. A corresponding DH equation in cylindrical symmetry for a charged line of line charge density $b^{-1}$ leads to the potential
$$\Phi(r) = -\frac{2l_B}{b}K_0(\kappa r) \approx \begin{cases} \dfrac{2l_B}{b}\ln\kappa r & \text{for } \kappa r \ll 1 \\[2ex] -\dfrac{l_B}{b}\sqrt{\dfrac{2\pi}{\kappa r}}\,e^{-\kappa r} & \text{for } \kappa r \gg 1. \end{cases} \tag{7.59}$$
The function $K_0$ is a modified Bessel function whose asymptotic behavior for small and large arguments has been used on the rhs of Eq. 7.59 to predict the potential close to and far from the charged line. The short-distance behavior is identical to the one of a naked rod, Eq. 7.32, and for larger distances the line charge is screened as $e^{-\kappa r}/\sqrt{r}$ (up to logarithmic corrections). Finally, for a charged sphere of radius $R$ and charge $Z$ one finds for $r > R$ the potential
$$\Phi(r) = \frac{l_B Z}{1 + \kappa R}\,\frac{e^{-\kappa(r - R)}}{r}. \tag{7.60}$$
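The two limiting forms quoted in Eq. 7.59 can be checked numerically. The sketch below is only an illustration: it evaluates $K_0$ with a simple trapezoidal rule applied to the standard integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ (our choice, so that no external library is needed) and compares the full DH line potential with its near and far approximations. The values $\kappa = 1$ nm$^{-1}$ and $b = 2$ nm are hypothetical, and all function names are ours.

```python
import math

L_B = 0.7  # Bjerrum length in nm (assumed value for water)


def bessel_k0(x, t_max=20.0, steps=20000):
    """Approximate K0(x) = int_0^inf exp(-x cosh t) dt with the trapezoidal rule."""
    h = t_max / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


def line_potential(r, kappa, b):
    """Full DH potential of a charged line, first expression in Eq. 7.59."""
    return -2.0 * L_B / b * bessel_k0(kappa * r)


def line_potential_near(r, kappa, b):
    """Leading logarithmic behavior for kappa r << 1."""
    return 2.0 * L_B / b * math.log(kappa * r)


def line_potential_far(r, kappa, b):
    """Screened behavior for kappa r >> 1."""
    return -(L_B / b) * math.sqrt(2.0 * math.pi / (kappa * r)) * math.exp(-kappa * r)


kappa, b = 1.0, 2.0   # hypothetical values: 1/nm and nm
for r in (1e-3, 1e-2, 5.0, 10.0):
    full = line_potential(r, kappa, b)
    approx = line_potential_near(r, kappa, b) if kappa * r < 1 else line_potential_far(r, kappa, b)
    print(f"kappa r = {kappa * r:7.3f}: full = {full:.4e}, asymptotic = {approx:.4e}")
```

The small-argument form keeps only the leading logarithm, so it approaches the full result slowly as $\kappa r \to 0$, whereas the large-argument form is already accurate to a few percent at $\kappa r$ of order 10.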
As for a point charge the potential decays proportional to $e^{-\kappa r}/r$. However, for a sphere larger than the screening length, $\kappa R > 1$, the full charge can never be seen, not even close to its surface, since it is distributed in a volume larger than the screening length. $Z$ is then effectively reduced to $Z/(\kappa R)$. The three potentials given above are not only exact solutions to the DH equation but also excellent approximations to the PB equation if the potential is everywhere much smaller than one, $\Phi \ll 1$. For a line charge this condition requires $l_B \ll b$, i.e., the Manning parameter $\xi$ needs to be much smaller than one. Hence DH theory works well if we do not have Manning condensation. In other words, counterion condensation is just a physical manifestation of the non-linearity of the PB equation. For spheres the situation is similar. Assuming a sufficiently small sphere so that $\kappa R < 1$, the DH approximation works well if $l_B Z/R \ll 1$, see Eq. 7.60. This condition is fulfilled if the sphere charge is much smaller than the charge $Z^*$, Eq. 7.39, the value to which a highly charged sphere would be renormalized. In other words, DH can be used for weakly charged spheres that do not have charge renormalization.
But what can one do if surface charge densities are so high that $\Phi$ becomes larger than unity? Does one necessarily have to deal with the difficulties of non-linear PB theory or can one somehow combine the insights into counterion condensation and DH theory to construct something that can be handled more easily? That this is indeed possible has been demonstrated by Alexander and coworkers (Alexander et al., 1984). The idea is that the non-linearities of the PB equation cause the charge renormalization of highly charged surfaces. As a result, the potential slightly away from such a surface is so small that DH theory can be used, but a DH theory with a properly reduced surface charge. Consider, for instance, a sphere with $\kappa R < 1$. If the sphere is weakly charged, we can simply use Eq. 7.60. If the sphere is highly charged, the non-linearities of the PB theory predict a layer of condensed counterions of thickness $\lambda$ that effectively reduces the sphere charge $Z$ to a smaller value $Z^*$ as estimated in Eq. 7.39. We thus expect that the potential sufficiently away from the sphere’s surface is given by
$$\Phi_{Z^*}(r) = l_B Z^*\,\frac{e^{-\kappa(r - R)}}{r}. \tag{7.61}$$
However, note that Eq. 7.39 is only a rough estimate of $Z^*$ based on an argument in which the space around the sphere is artificially divided into two zones. We are now in the position to give $Z^*$ a precise meaning by requiring that the renormalized DH solution $\Phi_{Z^*}$ and the exact PB solution $\Phi_{\mathrm{PB}}$, which is here only known numerically, match asymptotically for large distances:
$$\lim_{r\to\infty}\Phi_{Z^*}(r) = \lim_{r\to\infty}\Phi_{\mathrm{PB}}(r). \tag{7.62}$$
That Eq. 7.62 has a precise mathematical meaning follows from two facts: (1) due to the symmetry of the problem the electrical field is radially symmetric and (2) the potential decays to zero away from the sphere. Therefore, the potential must look asymptotically like the DH solution of a charged sphere. In Fig. 7.10(a) we sketch the potential $\Phi(r)$ schematically for the three solutions around a highly charged sphere: the full PB solution, the DH solution with the bare charge $Z$ and the DH solution with the renormalized charge $Z^*$. For a non-renormalized charge the DH solution overestimates the potential at large distances, while the renormalized DH solution matches asymptotically the full PB solution. The resulting counterion density $c(r) \sim e^{\Phi(r)}$ for the full PB solution and the renormalized DH solution is depicted in Fig. 7.10(b).
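The difference between the bare and the renormalized DH solution can be made concrete with a few lines of code. The sketch below is illustrative only: it uses the functional form of Eq. 7.60 for both charges (so the factor $1/(1 + \kappa R)$, which is close to one for $\kappa R < 1$, is kept), takes $Z^*$ from the rough estimate Eq. 7.39 with $\Omega = 5$, and assumes hypothetical values for the bare charge, the sphere radius and the screening length. It does not solve the PB equation.

```python
import math

L_B = 0.7  # Bjerrum length in nm (assumed value for water)


def dh_sphere_potential(r, charge, radius, kappa, l_b=L_B):
    """DH potential outside a charged sphere, functional form of Eq. 7.60."""
    return l_b * charge / (1.0 + kappa * radius) * math.exp(-kappa * (r - radius)) / r


def renormalized_charge(radius, omega=5.0, l_b=L_B):
    """Rough two-zone estimate Z* = Omega R / l_B, cf. Eq. 7.39."""
    return omega * radius / l_b


radius = 2.0      # hypothetical sphere radius in nm
kappa = 0.1       # hypothetical inverse screening length in 1/nm, so kappa R = 0.2
z_bare = 100.0    # hypothetical bare charge
z_star = renormalized_charge(radius)

for r in (2.0, 5.0, 10.0, 30.0):
    phi_bare = dh_sphere_potential(r, z_bare, radius, kappa)
    phi_star = dh_sphere_potential(r, z_star, radius, kappa)
    print(f"r = {r:5.1f} nm: bare DH = {phi_bare:8.3f}, renormalized DH = {phi_star:7.3f}")
```

Since both curves share the same $r$-dependence, the bare DH solution overestimates the far-field potential by the constant factor $Z/Z^*$, which is the statement made for Fig. 7.10(a).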
Figure 7.10 (a) Schematic sketch of the potential around a charged sphere for the full solution (PB), the DH solution ($Z$) and the DH solution with renormalized charge ($Z^*$). (b) Resulting counterion density for the PB solution and for the DH solution with renormalized charge. At large distances the densities are the same but close to the sphere PB predicts a dense layer of condensed counterions.
Figure 7.11 The area $A$ per charged group on the surface of the DNA double helix is around 1 nm$^2$ but the Gouy–Chapman length $\lambda$ of a homogeneously charged surface with the same surface charge density is only 0.24 nm.
You might be worried that all the details of the PB theory are lost since in this simple procedure everything is lumped together in one number, the renormalized charge. It is true that renormalized DH theory can only describe the electrostatics beyond the Gouy–Chapman length. It has nothing to say about the microscopic details inside the double layer. However, one can argue that one does not really want to know about these microscopic details anyway. As a concrete example let us consider again DNA which has two elementary charges per 0.33 nm and a radius of $R = 1$ nm. This leads to the Gouy–Chapman length
$$\lambda = \frac{\sigma^{-1}}{2\pi l_B} = \frac{0.33\ \mathrm{nm} \times R}{2 \times 0.7\ \mathrm{nm}} \approx 0.24\ \mathrm{nm}. \tag{7.63}$$
Up to now we assumed that the DNA charges are homogeneously smeared out. In reality the DNA surface area per phosphate charge is given by
$$A = \frac{2\pi R \times 0.33\ \mathrm{nm}}{2} \approx 1\ \mathrm{nm}^2. \tag{7.64}$$
In other words, the layer of condensed counterions per surface charge is much thinner than it is wide. So we must expect that the details of the charge distribution, namely its graininess, have an effect on the counterion condensation. Smearing out the surface charges might create huge errors, e.g., in the value of the renormalized charge. It is, however, difficult to estimate the size of this error since the PB theory is extremely non-linear close to the surface.
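The two DNA numbers in Eqs. 7.63 and 7.64 follow from elementary arithmetic, which the following sketch reproduces. It uses only the values stated in the text (two charges per 0.33 nm, $R = 1$ nm, $l_B = 0.7$ nm); the variable names are ours.

```python
import math

L_B = 0.7                # Bjerrum length in nm
RADIUS = 1.0             # DNA radius in nm
RISE = 0.33              # length along the helix axis carrying two charges, in nm
CHARGES_PER_RISE = 2

# Surface charge number density of the smeared-out cylinder surface.
sigma = CHARGES_PER_RISE / (2.0 * math.pi * RADIUS * RISE)

# Gouy-Chapman length, Eq. 7.63, and area per charge, Eq. 7.64.
gouy_chapman = 1.0 / (2.0 * math.pi * L_B * sigma)
area_per_charge = 1.0 / sigma

print(f"sigma           = {sigma:.3f} per nm^2")
print(f"lambda          = {gouy_chapman:.3f} nm")
print(f"A               = {area_per_charge:.3f} nm^2")
print(f"sqrt(A)/lambda  = {math.sqrt(area_per_charge) / gouy_chapman:.1f}")
```

The last line makes the point of the paragraph explicit: the lateral distance between neighboring surface charges is roughly four times larger than the thickness of the condensed layer.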
In principle it is, of course, possible to numerically solve the PB equation for any distribution of surface charges, but one has to ask oneself how meaningful that is. Typical ion radii are of the order of the λ-value of DNA which can have an effect that again is difficult to estimate due to the inherent non-linearity of PB theory. And finally, there is yet another effect that we have brushed under the carpet: the difference in the dielectric constants between the inside of a macromolecule and the surrounding water. Since electrical field lines try to avoid regions of low dielectricity, e.g., the inside of a protein, ions feel an effective repulsion from such a region. In standard electrostatics such effects can be modeled via the introduction of so-called image charges, virtual charges that “live” inside regions of low dielectricity and repel real ions nearby. This is also an effect in which microscopic details play a role and which is difficult to correctly estimate. All that we can say is that all these effects act together in effectively reducing the charge densities of highly charged surfaces.
7.5 Breakdown of Mean Field Theory
When discussing PB theory and its linearized version, DH theory, we might have given the impression that these theories always work in one way or another. We noted that the strong non-linearities close to highly charged surfaces are somewhat problematic but claimed that proper charge renormalization will fix this problem. However, as we shall see now, electrostatics is not always as simple as that. Let us go back to the problem of two equally charged surfaces. PB theory predicts that two such surfaces repel, see the two expressions for the disjoining pressure at short and large separations, Eqs. 7.29 and 7.30. In many experiments, however, it has been observed that equally charged objects attract each other, an effect that (as can be shown strictly mathematically) can never be produced by PB theory. In other words, PB theory sometimes does not even get the sign of the force right. A well-known example is DNA. Under the right conditions, a DNA molecule can condense onto itself. In contrast to a flexible polymer in a poor solvent, which is in the form of a molten globule, such a condensed DNA molecule typically forms a toroid, avoiding a region with too high a curvature in its center. How is it possible that a highly charged molecule like DNA attracts itself? In fact, this never happens inside monovalent salt solutions but when a sufficient amount of trivalent ions or ions of even higher valency is added, such a collapse is typically observed. It can be shown that the mean field approximation becomes less and less accurate with increasing ion valency. We are lucky that monovalent ion charges are small enough that PB theory can be applied. In fact, one can go much further and not just worry about the applicability of that theory: if the smallest charge unit were, e.g., $4e$ instead of $e$, everything would glue together and life would simply not be possible.
It is not straightforward to develop a very clean theory that describes the origins of this attraction. We give here a simple argument that goes back to Rouzina and Bloomfield (Rouzina and Bloomfield, 1996). Again we study the interaction between two identically charged surfaces with their counterions. We assume monovalent counterions but lower the temperature to zero, i.e., we study the ground state of the system. This is, of course, rather academic since water freezes long before but what we are aiming at is just a basic understanding of the principle. According to the so-called Earnshaw’s theorem, any electrostatic system collapses at a sufficiently low temperature. At zero temperature, two surfaces with their counterions should thus stick on top of each other, $D = 0$. We shall see that the two surfaces indeed attract in that case.
Figure 7.12 Two Wigner crystals formed by condensed counterions induce an attraction between two equally charged planes: (a) top view indicating the displacement vector $\mathbf{c}$ that leads to maximal attraction and (b) side view.
Let us first consider a single charged plane. For $T \to 0$ its Gouy–Chapman length goes to zero, $\lambda \to 0$, since the Bjerrum length goes to infinity, $l_B \to \infty$. This means that all the counterions sit on the surface. In order to minimize their mutual repulsion, they form a two-dimensional triangular so-called Wigner crystal as depicted in Fig. 7.12. If we now have two such surfaces that are sufficiently far apart, the counterions at both surfaces form such patterns independently of one another. When the two surfaces come closer, the counterions lower the electrostatic energy further by shifting their two Wigner crystals with respect to each other by a vector $\mathbf{c}$ as indicated in Fig. 7.12(a). That way an ion in plane B is located above an ion-free area in plane A, namely above the center of a parallelogram with A-ions in its corners. In other words, the relative position of the two planes is shifted with respect to each other by half a lattice constant, so that the two Wigner crystals are out-of-register. A counterion sitting on one plane, say plane A, then feels the following dimensionless potential resulting from the interaction with plane B and its counterions:
$$\Phi(D) = l_B\sum_l \frac{1}{\sqrt{|\mathbf{R}_l + \mathbf{c}|^2 + D^2}} - l_B\sigma\int\frac{d^2r}{\sqrt{r^2 + D^2}}. \tag{7.65}$$
The first term on the rhs describes the repulsion from the counterions condensed on surface B that are located at positions $\mathbf{R}_l + \mathbf{c}$ with $\mathbf{c}$ denoting the displacement vector between the two planes (both, $\mathbf{R}_l$ and $\mathbf{c}$, are in-plane vectors). The second term accounts for the attraction of the counterion to the homogeneous surface charge on plane B. Further terms do not appear in Eq. 7.65 since the attraction of the fixed charge of plane A to ions in plane B is exactly cancelled by the repulsion from the fixed charge of plane B. From Eq. 7.65 follows directly the pressure between the two surfaces:
$$\frac{\Pi(D)}{k_B T} = -\sigma\frac{\partial\Phi(D)}{\partial D} \approx -8\pi\sigma^2 l_B\,e^{-\frac{2\pi}{3^{1/4}}\sqrt{\sigma}\,D}. \tag{7.66}$$
This formula, derived in Appendix G, is accurate for distances $D$ much larger than the counterion spacing $\sim 1/\sqrt{\sigma}$. We thus find an attraction with a decay length proportional to the counterion-counterion spacing.
What is the condition that needs to be fulfilled to have attraction between equally charged surfaces? Above we argued that PB theory is not useful anymore if the Gouy–Chapman length becomes shorter than the distance between fixed charges on the surface, see Fig. 7.11. Here we use a similar argument, but this time we focus on the counterions in order to estimate when the alternative theory of correlated counterions becomes reasonable (Moreira and Netz, 2002). If the counterions have valency $Z$, then the height up to which half of the counterions are found is $\lambda/Z$. On the other hand, the spacing $a$ between the counterions sitting in a Wigner crystal, as shown in Fig. 7.12, is given by $(\sqrt{3}/2)\,a^2 = Z/\sigma$. The typical lateral distance between counterions is larger than the height of the counterion cloud if $a > \lambda/Z$. This leads to the condition
$$Z^3 > \frac{\sqrt{3}}{2}\,\frac{1}{4\pi^2 l_B^2\sigma}. \tag{7.67}$$
It follows that the cloud is essentially two-dimensional for sufficiently large counterion valencies (note the cubic dependence) and for large enough surface charge densities. Remarkably, when condition 7.67 is fulfilled, one finds that, up to a numerical factor, the spacing between counterions fulfills $a < Z^2 l_B$, i.e., the neighboring ions feel a mutual repulsion larger than $k_B T$. Even though this is by far not strong enough to induce their ordering into a perfect Wigner crystal, the ions are correlated to some extent and can induce the attraction between the charged surfaces. For DNA one has $\sigma = 1\ \mathrm{nm}^{-2}$ and condition 7.67 reads $Z^3 > 0.06$ or $Z > 0.4$. This seems to suggest that monovalent ions are already strong enough to cause attraction, but the argument is obviously too simple to provide a reliable quantitative estimate. In reality, ions with $Z = 3$ or larger cause attraction between DNA double helices.
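The estimate for DNA can be retraced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it uses $l_B = 0.7$ nm and $\sigma = 1$ nm$^{-2}$, takes the lattice relation $(\sqrt{3}/2)\,a^2 = Z/\sigma$ at face value, and evaluates the right-hand side of condition 7.67 together with the two lengths being compared. With these rounded inputs the threshold for $Z^3$ comes out at a few hundredths, the same order as the value quoted above, and the corresponding valency is about 0.4. The function names are ours.

```python
import math

L_B = 0.7  # Bjerrum length in nm (assumed value for water)


def valency_threshold_cubed(sigma, l_b=L_B):
    """Right-hand side of condition 7.67, a lower bound on Z^3."""
    return (math.sqrt(3.0) / 2.0) / (4.0 * math.pi**2 * l_b**2 * sigma)


def counterion_spacing(valency, sigma):
    """Lattice constant a of the triangular Wigner crystal, (sqrt(3)/2) a^2 = Z / sigma."""
    return math.sqrt(2.0 * valency / (math.sqrt(3.0) * sigma))


def cloud_height(valency, sigma, l_b=L_B):
    """Height lambda / Z below which half of the counterions are found."""
    return 1.0 / (2.0 * math.pi * l_b * sigma * valency)


sigma_dna = 1.0   # surface charge density of DNA in 1/nm^2, as used in the text
threshold = valency_threshold_cubed(sigma_dna)
print(f"Z^3 threshold = {threshold:.3f}, Z threshold = {threshold ** (1.0 / 3.0):.2f}")

for z in (1, 2, 3):
    a = counterion_spacing(z, sigma_dna)
    h = cloud_height(z, sigma_dna)
    print(f"Z = {z}: spacing a = {a:.2f} nm, cloud height = {h:.3f} nm, a > height: {a > h}")
```

The loop shows how quickly the two lengths separate with increasing valency: the spacing grows like $\sqrt{Z}$ while the cloud height shrinks like $1/Z$.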
Problems
7.1 Manning condensation You are going to explain Manning condensation rigorously by solving the PB equation and determining the concentration of counterions around a uniformly charged cylinder of radius $R$. The PB equation reads $\Delta\Phi(r) + C e^{-\Phi(r)} = 4\pi l_B\sigma\,\delta(r - R)$.
(i) Write down the explicit form of the Laplace operator in Cartesian coordinates $(x, y, z)$. Perform a transformation to cylindrical coordinates $(r, \theta, z)$ and write down the PB equation in these coordinates. Hint: Use the chain rule of differentiation. Use the fact that the three coordinates $(x, y, z)$, or equivalently $(r, \theta, z)$, are independent of each other.
(ii) Now that you have the PB equation in cylindrical coordinates, make the change of variable $s = R\ln(r/R)$ and show that this gives, after redefining the potential, an equation equivalent to that of a charged surface, the case that we discussed in Section 7.2. Write down this redefined potential.
(iii) Along the lines of Section 7.2, solve the PB equation to obtain the redefined potential. Determine $\sqrt{C}$ from the boundary condition. As you will see, there is a range of parameters where the solution makes no sense as $\sqrt{C}$ is negative. Our hope is that we can set $C = 0$ in those cases. What would be the physical interpretation for $C = 0$?
(iv) For those cases where $\sqrt{C} < 0$, set $C = 0$ and solve the PB equation again. Hint: You can use the same change of variable as in (ii).
(v) Give the full solution of $\Phi(r)$. Compute the concentration of ions. Hint: For notational simplicity, use the Manning parameter $\xi = 2\pi R\sigma l_B$.
7.2 OSF theory Consider a charged stiff polymer of length $L$ with persistence length $l_P$. The charge is smeared out homogeneously along the polymer with a charge density $e$ per length $b$; the charge density is assumed to be so small that one can use linearized PB theory. The polymer is immersed in a salt solution with Debye screening length $\kappa^{-1}$. Derive the so-called Odijk–Skolnick–Fixman (OSF) theory that says that the bare persistence length $l_P$ of such a polymer is increased to the value
$$l_P \to l_P + \frac{l_B}{4b^2\kappa^2}.$$
To do this, compare the energy (bending plus electrostatic repulsion) per length of a straight chain to that of a chain that is bent with a radius of curvature $R \gg \kappa^{-1}$. What is remarkable about this result? Hints: Use $\kappa R \gg 1$ and $\kappa L \gg 1$. Compare the electrostatic interaction of a charge on the chain with the rest of the chain for the two cases: a straight configuration and a bent configuration where the chain is bent along a circle of radius $R$. If that charge sits at $s = 0$, the distance to other charges along the bent chain is given by $d(s) = \sqrt{x(s)^2 + y(s)^2}$ with $x(s) = R\left(\cos(s/R) - 1\right)$ and $y(s) = R\sin(s/R)$. Taylor expand $e^{-\kappa d(s)}/d(s)$ up to terms of order $1/R^2$.
Chapter 8
DNA–Protein Complexes
8.1 Protein Target Search
Cells need to adapt quickly to changes in their environment. This often means that they have to either start or stop the production of certain proteins within a very short period of time. Transcriptional regulation typically involves transcription factors, DNA-binding proteins that bind to their specific target sites, which are a few bp long. These proteins are either activators that are needed to switch on transcription or repressors that switch it off. We will not go into the more involved transcriptional regulation inside eukaryotic cells, which typically involves a large number of proteins. Instead, we restrict ourselves to the much simpler regulation inside bacteria. As mentioned in Section 1.2, such cells have no nucleus and no chromatin. The presence of various repressors is crucial, as a bacterium only needs a fraction of the proteins that are encoded in its genome at any given time. Here we focus on a very famous repressor, the lac repressor in the bacterium Escherichia coli (E. coli for short). This repressor, if bound to its target site, suppresses the transcription of the genes of the lac operon, see Fig. 8.1. Operons are clusters of genes that are always transcribed together. They are widespread in prokaryotes and have even been found in eukaryotes. The lac repressor considered here controls the transcription of three genes called lacZ, lacY and lacA. These genes encode proteins that are involved in the metabolism of milk sugar, the so-called lactose.
Figure 8.1 Transcriptional regulation of the lac operon as an example of a genetic switch. (a) At high lactose concentration the lac repressor is inactivated and the genes of the lac operon are expressed. This leads to the production of proteins involved in the metabolism of lactose. (b) In the absence of lactose, the repressor binds to the operator blocking the transcription of the genes of the lac operon, preventing the wasteful production of useless proteins.
How do E. coli bacteria get into contact with lactose? Since E. coli live in our intestines, they will find themselves surrounded by lactose whenever we drink a glass of milk. Lactose is made from two covalently bound sugars, galactose and glucose. As a first step in the metabolism, the lactose needs to be broken into these two components. This is done by the enzyme β-galactosidase. This protein is encoded by the gene lacZ. lacY encodes β-galactoside permease, a membrane protein that pumps lactose into the cell, and lacA encodes β-galactoside transacetylase that is required for the chemical modification of the sugar molecules. As shown in Fig. 8.1, to the left of the lac operon, in the upstream direction, there is an operator site which is the binding site for the lac repressor. Adjacent to that site is the promoter, the site where the RNA polymerase has to bind first before it starts to transcribe the genes. If the operator site is unoccupied, the polymerase can bind and transcribe the three genes downstream, see Fig. 8.1(a). If the lac repressor is bound to the operator, RNA polymerase is blocked from binding to the promoter and the genes cannot be transcribed, see Fig. 8.1(b). The gene of the lac repressor itself, lacI, lies near the lac operon and is always expressed at a moderate level. This ensures that there are always lac repressors present in the cell, even though proteins are broken down after a certain period of time. The task of the lac repressor is to keep the lac operon inaccessible as long as there is no milk sugar around and to allow transcription if it is present.
In this way, the lac repressor ensures that the cell does not waste energy in producing the proteins for the lactose metabolism in the absence of lactose. On the other hand, it also ensures that those proteins are produced when we drink milk. How can the lac repressor “know” when to bind to the operon and when not? The lac repressor acts as a genetic switch with two states, one without and one with a lactose molecule bound to it. In the lactose-free conformation, the complex has reading heads made from α-helices that fit into the major groove of DNA. They recognize the specific operator sequence by forming
sequence-specific hydrogen bonds with the edges of the DNA bases exposed in the major groove. In this case, the repressor binds to the operator sequence with high affinity as shown in Fig. 8.1(b). In the lactose-bound state, the repressor has a different structure that no longer allows the insertion of the reading heads into the DNA, as schematically depicted in Fig. 8.1(a). So it is just the result of a chemical equilibrium whether a repressor can bind to the operator (low lactose concentration) or not (high lactose concentration).
Following Ref. (Bruinsma, 2002) we try to estimate the response time of E. coli to a change in the lactose level. In other words, we try to estimate the time that it takes for the lac repressor to stop the production of β-galactosidase and its sisters once the lactose concentration has dropped. We expect that it is advantageous for the survival of a bacterium to have a very fast response time. We study the reaction kinetics of complex formation between the lac repressor and the operator. The change in the concentration $c_{\mathrm{RS}}$ of the repressor-operator complex (“R” stands for repressor and “S” for substrate) is the sum of two terms:
$$\frac{dc_{\mathrm{RS}}}{dt} = k_{\mathrm{on}}c_{\mathrm{R}}c_{\mathrm{S}} - k_{\mathrm{off}}c_{\mathrm{RS}}. \tag{8.1}$$
The first term on the rhs is positive and describes the formation of complexes which should be proportional to the probability of finding a free repressor and a free operator at the same site, i.e., proportional to $c_{\mathrm{R}}c_{\mathrm{S}}$ ($c_{\mathrm{R}}$: concentration of free repressors, $c_{\mathrm{S}}$: free substrate concentration). The proportionality constant $k_{\mathrm{on}}$ is the so-called on-rate. The second term is a loss term describing the break-up of the complex that is proportional to $c_{\mathrm{RS}}$ with a proportionality constant $k_{\mathrm{off}}$, the off-rate. Note that only the off-rate has the dimensions of a rate, whereas the on-rate has dimensions volume/time. Using Eq. 8.1 we can estimate the response time of the bacterium to a change in the lactose concentration.
Suppose we start with a situation where the concentration of lactose molecules is high so that the repressors cannot bind to the operator. Now let the lactose concentration drop to zero at $t = 0$. It then follows from Eq. 8.1 that at the beginning the concentration of occupied operator sites grows linearly in time as
\[ \frac{dc_{RS}}{dt} \approx k_{\mathrm{on}} c_R c_S = (k_{\mathrm{on}} c_S)\, c_R, \tag{8.2} \]
neglecting the then still small loss term describing complex breakup. This allows us to estimate the characteristic time $T_{\mathrm{switch}}$ for a free repressor to find the operator site, i.e., the typical time that the bacterium needs to switch off its lac operon:
\[ T_{\mathrm{switch}} = \frac{1}{k_{\mathrm{on}} c_S}. \tag{8.3} \]
(More formally: using Eq. 8.2 and $c_R + c_{RS} = \mathrm{const}$ and assuming $c_R \ll c_S \approx \mathrm{const}$, an assumption fulfilled in the in vitro experiment mentioned immediately below, we find $dc_R/dt = -(k_{\mathrm{on}} c_S)\, c_R$ and hence $c_R \propto \exp(-t/T_{\mathrm{switch}})$.) This quantity was measured in an in vitro setup where the repressor was rapidly mixed into a solution of DNA molecules that each contained an operator site (Riggs et al., 1970). This allowed the on-rate to be estimated as $k_{\mathrm{on}} = 10^{10}\,\mathrm{M^{-1}\,s^{-1}}$. Here M stands for molar, moles per liter, i.e., $1\,\mathrm{M} = 6 \times 10^{23}/(10^{15}\,\mu\mathrm{m}^3)$. For instance, the molar concentration of a single molecule inside a bacterium (typical volume: $1\,\mu\mathrm{m}^3$) is $1/\mu\mathrm{m}^3 = 10^{15}\,\mathrm{M}/(6 \times 10^{23}) = (1/6) \times 10^{-8}\,\mathrm{M} \approx 1\,\mathrm{nM}$. Moreover, as mentioned in the beginning of Chapter 7, the typical salt concentration inside a cell is on the order of 100 mM. Let us now try to estimate the switching time inside E. coli. Since the E. coli genome contains only one operator sequence, we have an initial concentration of unoccupied operators on the order of $c_S \approx 1/\mu\mathrm{m}^3 \approx 10^{-9}\,\mathrm{M}$. Inserting this into Eq. 8.3 together with the measured on-rate $k_{\mathrm{on}} = 10^{10}\,\mathrm{M^{-1}\,s^{-1}}$ gives a typical switching time of about 0.1 s. This is a reasonable result as it suggests that E. coli can indeed adapt quickly to changes in its environment. Note, however, that this can only be viewed as a lower bound. Inside a bacterium one has a very crowded environment, presumably leading to much longer response times. But as long as the response time is less than a minute, this may still be acceptable. So far we have shown that the experimentally measured on-rate $k_{\mathrm{on}}$ is fast enough to allow E. coli to turn the switch rapidly.
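The unit bookkeeping behind Eq. 8.3 is easy to get wrong, so here is a minimal Python sketch of it. The function names and the value used for Avogadro's number are illustrative choices, not taken from the text; the on-rate and the cell volume are the values quoted above.

```python
# Unit bookkeeping for Eq. 8.3: T_switch = 1 / (k_on * c_S).
AVOGADRO = 6.022e23      # molecules per mole
UM3_PER_LITER = 1e15     # 1 liter = 10^15 cubic micrometers


def molar_from_count(n_molecules, volume_um3):
    """Concentration in mol/L of n_molecules in a volume given in um^3."""
    return n_molecules * UM3_PER_LITER / (AVOGADRO * volume_um3)


def switching_time(k_on, c_s):
    """Eq. 8.3 with k_on in 1/(M s) and c_s in M; returns seconds."""
    return 1.0 / (k_on * c_s)


K_ON_MEASURED = 1e10     # 1/(M s), in vitro value quoted in the text
CELL_VOLUME_UM3 = 1.0    # typical bacterial volume used in the text

# One operator per cell, without rounding the concentration to 1 nM.
c_operator = molar_from_count(1, CELL_VOLUME_UM3)
t_unrounded = switching_time(K_ON_MEASURED, c_operator)
# Same estimate with the rounded value c_S = 1 nM used in the text.
t_rounded = switching_time(K_ON_MEASURED, 1e-9)

print(f"one molecule per um^3 = {c_operator:.2e} M")
print(f"T_switch with unrounded c_S = {t_unrounded:.3f} s")
print(f"T_switch with c_S = 1 nM    = {t_rounded:.3f} s")
```

Both variants give a switching time of the order of 0.1 s; the difference between them only reflects the rounding of one molecule per cubic micrometer to 1 nM.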
Now we attempt to estimate the on-rate theoretically (Bruinsma, 2002). We use the classical theory of Debye and Smoluchowski for diffusion-limited chemical reactions. We describe the cell as a spherical container of radius R with the operator sequence in its center, see Fig. 8.2. We denote the concentration of free repressors by c (r, t). The concentration field obeys the diffusion equation,
Figure 8.2 Geometry assumed for calculating the on-rate for the binding of the repressor to the operator (see text for details).
Eq. 5.40, that takes here the three-dimensional form
\[ \frac{\partial c}{\partial t} = D_3 \nabla^2 c. \tag{8.4} \]
Here $D_3$ is the diffusion constant in water. To estimate $D_3$, we assume the repressor to be a sphere of radius $a$ with $a$ on the order of 4 nm. This leads to a diffusion constant $D_3 = k_B T/(6\pi\eta a) \approx 5 \times 10^{-7}\,\mathrm{cm^2/s}$ (see Eqs. 5.45 and 5.49) which serves us as an upper bound for the protein’s mobility; inside the crowded interior of the bacterium the diffusion constant is likely to be smaller. We want to determine the time that is needed for the first binding event between the operator and a repressor to occur. We assume that this happens once a repressor enters a small sphere of radius $b$ around the origin (see Fig. 8.2); $b$ represents the reaction radius for the repressor-operator binding. To make things easier, we assume that the operator acts as a sink: whenever a repressor hits the small sphere in the center, it disappears. Moreover, we assume that the concentration at the boundary of the cell, i.e., at $r = R$, is kept at a constant bulk value $c(R) = c(\infty)$. Under these conditions there will be a time-independent steady-state solution with a constant current $I$ of repressor molecules from the outer radius to the inner sphere. In this case $\partial c/\partial t = 0$ and Eq. 8.4 simplifies to the so-called Laplace equation
\[ \nabla^2 c = 0. \tag{8.5} \]
This is a special case of the Poisson equation which we encountered earlier in electrostatics, namely Eq. 7.2 with ρ ≡ 0. We have to solve the Laplace equation with the boundary conditions c (R) = c (∞) and c (b) = 0. The latter condition means that diffusing repressors disappear as soon as they are within the reaction radius of the operator. We already know the solution of the Poisson equation for a point charge, see Eq. 7.3. This is the only type of solution to the Laplace equation with spherical symmetry. Imposing the boundary conditions we find
\[ c(r) = c(\infty)\left(1 - \frac{b}{r}\right) \tag{8.6} \]
where we assumed $b \ll R$. We can now calculate the flux $\mathbf{J}$ of repressor molecules, which points radially inward. The flux is given by $\mathbf{J} = -D_3 \nabla c$ (see, e.g., Eq. 5.50 with $g \equiv 0$). The radial component $J_r$ of the flux follows from $J_r = -D_3\, \partial c/\partial r$ to be $-D_3\, b\, c(\infty)/r^2$. The current of repressor molecules into the reaction volume is then given by $J_r$ times the surface area $4\pi r^2$:
\[ I = -4\pi D_3\, b\, c(\infty). \tag{8.7} \]
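The steps leading to Eqs. 8.6 and 8.7 can be written out explicitly. The following is a sketch of the standard calculation for a spherically symmetric steady state; the integration constants $A$ and $B$ are auxiliary symbols introduced here only for this derivation.

```latex
\begin{align*}
% Only the radial part of the Laplacian survives for c = c(r):
\frac{1}{r^2}\frac{d}{dr}\!\left(r^2 \frac{dc}{dr}\right) &= 0
  \quad\Rightarrow\quad c(r) = A + \frac{B}{r}, \\
% Boundary conditions c(b) = 0 and c(R) = c(\infty):
A + \frac{B}{b} = 0, \quad A + \frac{B}{R} = c(\infty)
  &\quad\Rightarrow\quad
  c(r) = c(\infty)\,\frac{1 - b/r}{1 - b/R}
  \;\xrightarrow{\; b \ll R \;}\; c(\infty)\left(1 - \frac{b}{r}\right), \\
% Radial flux and total current through a sphere of radius r:
J_r = -D_3 \frac{dc}{dr} &= -\frac{D_3\, b\, c(\infty)}{r^2},
  \qquad I = 4\pi r^2 J_r = -4\pi D_3\, b\, c(\infty).
\end{align*}
```

The current $I$ is independent of $r$, as it must be in a steady state without sources between the two spheres.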
This expression has to be compared with Eq. 8.2 according to which the rate of complex formation is given by $c_S^{-1}\, dc_{RS}/dt = k_{\mathrm{on}} c_R$. Since this quantity must equal the incoming current, namely $-I$, we find
\[ k_{\mathrm{on}} = 4\pi D_3 b \tag{8.8} \]
where we identified $c(\infty)$ with the repressor concentration $c_R$ far from the operator. With $D_3 = 5 \times 10^{-7}\,\mathrm{cm^2/s}$ from above and a typical reaction radius $b = 0.5$ nm we find
\[ k_{\mathrm{on}} = 0.3\,\frac{\mu\mathrm{m}^3}{\mathrm{s}} \approx 2 \times 10^{8}\,\mathrm{M^{-1}\,s^{-1}}. \tag{8.9} \]
This is a really remarkable finding. The value for the on-rate that we just estimated is about 50 times smaller than the above mentioned experimentally measured value $10^{10}\,\mathrm{M^{-1}\,s^{-1}}$ (Riggs et al., 1970). How is this possible? We should not have been surprised if it had turned out that the estimated value was greater than the measured value since we did not account for all effects that might slow down the binding of the repressor to the target site.
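The numbers entering Eqs. 8.8 and 8.9 can be checked with a few lines of Python. This is a minimal sketch: the temperature (300 K) and the viscosity of water (1 mPa s) are assumptions made here, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
TEMPERATURE = 300.0    # K (assumed; not specified in the text)
VISCOSITY = 1.0e-3     # Pa s, water (assumed)
AVOGADRO = 6.022e23    # molecules per mole


def stokes_einstein(radius_m):
    """Diffusion constant D = k_B T / (6 pi eta a) in m^2/s."""
    return K_B * TEMPERATURE / (6.0 * math.pi * VISCOSITY * radius_m)


def smoluchowski_on_rate(d3_m2_s, reaction_radius_m):
    """Eq. 8.8, k_on = 4 pi D_3 b, in m^3/s."""
    return 4.0 * math.pi * d3_m2_s * reaction_radius_m


def to_per_molar_per_second(k_m3_s):
    """Convert a rate constant from m^3/s to 1/(M s); 1 m^3 = 1000 L."""
    return k_m3_s * 1.0e3 * AVOGADRO


# Diffusion constant of a sphere with a = 4 nm.
d3_estimate = stokes_einstein(4e-9)

# On-rate with the rounded values used in the text.
D3_TEXT = 5e-11        # m^2/s, i.e. 5e-7 cm^2/s
B_TEXT = 0.5e-9        # m, reaction radius
k_on_m3 = smoluchowski_on_rate(D3_TEXT, B_TEXT)
k_on_um3 = k_on_m3 * 1e18                  # 1 m^3 = 10^18 um^3
k_on_molar = to_per_molar_per_second(k_on_m3)
ratio_to_measured = 1e10 / k_on_molar      # measured value: 10^10 1/(M s)

print(f"D_3 (Stokes-Einstein) = {d3_estimate * 1e4:.2e} cm^2/s")
print(f"k_on = {k_on_um3:.2f} um^3/s = {k_on_molar:.2e} 1/(M s)")
print(f"measured / estimated = {ratio_to_measured:.0f}")
```

With these inputs the estimate reproduces the order of magnitude of Eq. 8.9 and the factor of about 50 between the measured and the diffusion-limited on-rate.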
For instance, we did not consider the possibility that the repressor comes within the reaction radius but is not properly aligned with the binding site. It seems that the lac repressor can circumvent the laws of physics. This cannot be the result of some additional sophisticated ingredients inside E. coli because the experiment has been carried out in vitro and there were only DNA molecules (containing operator sites), repressors and salt ions present. A first clue to solving this puzzle comes from equilibrium experiments. At thermodynamic equilibrium the concentrations of the reactants must be constant and thus the lhs of Eq. 8.1 needs to vanish. This leads to the condition
\[ \frac{c_R c_S}{c_{RS}} = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}} = K_{\mathrm{eq}}. \tag{8.10} \]
The ratio of concentrations is therefore directly related to the ratio of the off- and on-rates. This quantity is called the equilibrium constant $K_{\mathrm{eq}}$. In a solution of DNA molecules that contain operator sequences at physiological salt concentration, about 100 mM monovalent salt, $K_{\mathrm{eq}}$ is found to be about $10^{-12}$ M. When experiments are performed under identical conditions, but this time with operator-free DNA, it is surprisingly found that repressors still tend to be bound to the DNA but less strongly, namely $K_{\mathrm{eq}} \approx 10^{-6}$ M (deHaseth et al., 1977). The lac repressor therefore has two types of interactions with the DNA molecule: a strong specific interaction and a weaker non-specific interaction. Even though the non-specific interaction is much weaker than the specific one, the bacterial genome offers a huge number of non-specific binding positions. The genome of E. coli contains $4.6 \times 10^{6}$ bp corresponding to a length of about 1.5 mm. As a rough approximation, we can consider the non-operator part of the bacterial genome as a solution of 10 bp long segments, in total about $10^{5}$ of them, inside the $1\,\mu\mathrm{m}^3$ large E. coli cell. This suggests that the majority of the repressors are non-specifically bound to the DNA, namely $c_{RS}/c_R = c_S/K_{\mathrm{eq}} \approx 170$.
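The ratio of bound to free repressors follows directly from Eq. 8.10 once the site concentration is expressed in molar units. The sketch below uses the order-of-magnitude count of $10^5$ sites from the text and assumes that almost all sites are unoccupied, so that $c_S$ equals the total site concentration; the variable names are illustrative.

```python
AVOGADRO = 6.022e23        # molecules per mole
UM3_PER_LITER = 1e15       # 1 liter = 10^15 cubic micrometers

N_SITES = 1e5              # non-specific sites, order of magnitude from the text
CELL_VOLUME_UM3 = 1.0      # E. coli volume used in the text
K_EQ_NONSPECIFIC = 1e-6    # M, non-specific equilibrium constant


def molar_from_count(n_molecules, volume_um3):
    """Concentration in mol/L of n_molecules in a volume given in um^3."""
    return n_molecules * UM3_PER_LITER / (AVOGADRO * volume_um3)


# Assumption: repressors are few, so nearly all sites are free.
c_sites = molar_from_count(N_SITES, CELL_VOLUME_UM3)

# Eq. 8.10 rearranged: c_RS / c_R = c_S / K_eq.
bound_to_free = c_sites / K_EQ_NONSPECIFIC
bound_fraction = bound_to_free / (1.0 + bound_to_free)

print(f"c_S = {c_sites:.2e} M")
print(f"c_RS / c_R = {bound_to_free:.0f}")
print(f"fraction of repressors bound = {bound_fraction:.3f}")
```

Note that dividing $4.6 \times 10^6$ bp by 10 bp literally gives $4.6 \times 10^5$ segments, which would raise the ratio by the same factor of 4.6; the conclusion that nearly all repressors are bound is unaffected.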
From this follows that our assumption of the three-dimensional diffusion of the repressor to the operator is very likely wrong. A possible explanation for the fast on-rate onto the operator could then be that the repressor does not have to explore the whole three-dimensional space inside the E. coli cell since it is nonspecifically bound to the DNA. Instead, it finds the operator site
through one-dimensional diffusion along the DNA. We might expect this to greatly shorten the search time since the repressor always stays on track instead of having to explore the space in between the DNA chain. How can we test this idea experimentally? The most elegant setup would be to measure the on-rate for the binding to the operator as a function of the equilibrium constant for the non-specific binding. This would allow us to observe the dynamics of the system for either three- or one-dimensional repressor diffusion. How can we experimentally tune the strength of the non-specific binding? What we need to understand first is what underlies the non-specific binding of the repressor to the DNA. In the previous chapter we argued that electrostatics is the most important interaction inside the cell. We learned that DNA is surrounded by an atmosphere of condensed counterions. If the repressor were positively charged, some of the condensed counterions of the DNA would be released once the repressor gets very close to the DNA. According to our estimate, each of the released counterions gives a free energy gain that depends logarithmically on the salt concentration, see Eq. 7.37. This idea can be tested by measuring the non-specific equilibrium constant as a function of the salt concentration. Experimentally one finds (deHaseth et al., 1977)
\[ \ln K_{\mathrm{eq}} \approx 10 \ln c_{\mathrm{salt}} + 8.5 \tag{8.11} \]
where $K_{\mathrm{eq}}$ and $c_{\mathrm{salt}}$ are given in molar units. The equilibrium constant is related to the free enthalpy. This is the appropriate thermodynamic potential inside a cell which typically operates under conditions of nearly fixed temperature and pressure. According to Eq. 2.76, the free enthalpy of a system of $N$ identical particles is of the form $G = \mu N$. In our system we have a mixture of $N_R$ free repressors, $N_S$ unoccupied binding sites (modeled as a solution of short fragments, as mentioned above) and $N_{RS}$ non-specifically bound repressors, for which the free enthalpy is given by $G = N_R\, \mu_R(c_R) + N_S\, \mu_S(c_S) + N_{RS}\, \mu_{RS}(c_{RS})$. The total number of repressors, $N_R + N_{RS}$, is fixed and so is the total number of (non-specific) binding sites. The thermodynamic equilibrium follows from $\partial G/\partial N_{RS} = 0$, which leads to the usual condition on the chemical potentials: $\mu_R(c_R) + \mu_S(c_S) = \mu_{RS}(c_{RS})$. To estimate the chemical potential for the three types of solute particles, we account
Figure 8.3 The non-specific binding of the lac repressor to the DNA leads to the release of 10 condensed counterions according to Eq. 8.15.
for their translational entropy through an ideal gas-like term (see Eq. 2.75):
\[ \mu_i(c_i) = k_B T \ln\left(\lambda_{T,i}^3\, c_i\right) + \mu_i^{(0)} \tag{8.12} \]
with $i$ = R, S or RS. The last term $\mu_i^{(0)}$ is the standard chemical potential representing the intrinsic free enthalpy per solute particle, which depends on the type of particle, the temperature and pressure but not on the concentration. Free enthalpy minimization leads then to
\[ \frac{c_R c_S}{c_{RS}} = \frac{1}{\upsilon}\, e^{-\Delta G_0/k_B T} \tag{8.13} \]
where we introduced $\Delta G_0 = \mu_R^{(0)} + \mu_S^{(0)} - \mu_{RS}^{(0)}$, the standard free enthalpy change. The other quantity, $\upsilon$, has units of volume and can be interpreted as a reaction volume; the repressor can only bind to a DNA site if it is within a reaction radius from that site. By comparing to Eq. 8.10 one finds that the rhs of Eq. 8.13 is nothing but the equilibrium constant
\[ K_{\mathrm{eq}} = \frac{1}{\upsilon}\, e^{-\Delta G_0/k_B T}. \tag{8.14} \]
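The step from the chemical potentials of Eq. 8.12 to Eq. 8.13 is short but worth spelling out. The sketch below inserts Eq. 8.12 into the equilibrium condition on the chemical potentials; the explicit expression for the reaction volume in terms of the thermal wavelengths is simply what this algebra yields for the ideal form of Eq. 8.12.

```latex
\begin{align*}
% Equilibrium condition with Eq. 8.12 inserted for each species:
k_B T \ln\!\left(\lambda_{T,R}^3 c_R\right) + \mu_R^{(0)}
  + k_B T \ln\!\left(\lambda_{T,S}^3 c_S\right) + \mu_S^{(0)}
  &= k_B T \ln\!\left(\lambda_{T,RS}^3 c_{RS}\right) + \mu_{RS}^{(0)} \\
% Collect the logarithms on the left and the standard potentials on the right:
k_B T \ln\!\left(\frac{\lambda_{T,R}^3 \lambda_{T,S}^3}{\lambda_{T,RS}^3}\,
  \frac{c_R c_S}{c_{RS}}\right)
  &= -\left(\mu_R^{(0)} + \mu_S^{(0)} - \mu_{RS}^{(0)}\right) = -\Delta G_0 \\
% Solve for the ratio of concentrations:
\frac{c_R c_S}{c_{RS}} &= \frac{1}{\upsilon}\, e^{-\Delta G_0/k_B T},
  \qquad \upsilon = \frac{\lambda_{T,R}^3 \lambda_{T,S}^3}{\lambda_{T,RS}^3}.
\end{align*}
```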
Combining Eqs. 8.11 and 8.14 we arrive at
\[ \Delta G_0 = -k_B T \left(10 \ln c_{\mathrm{salt}} + C\right) \tag{8.15} \]
where $C$ is a constant. This result indicates that 10 small ions are released when the repressor binds non-specifically to the DNA. It is believed that these are counterions of the DNA that escape into the bulk once the positively charged repressor dips into the counterion atmosphere around the DNA double helix (deHaseth et al., 1977), see Fig. 8.3. By increasing the salt concentration and therefore reducing the entropy gain for counterion release, one can systematically lower the non-specific repressor-DNA interaction. At high salt concentrations one expects predominantly three-dimensional repressor diffusion and at low salt concentration predominantly one-dimensional diffusion. If our idea is right, one should observe a monotonic dependence of the on-rate for operator binding on the salt concentration from small values at high salt to large values at low salt. However, in vitro experiments clearly show a non-monotonic dependence of $k_{\mathrm{on}}$ on the salt concentration with a peak around physiological salt concentrations (Winter et al., 1981), see Fig. 8.4. What went wrong? We started from the assumption that the target search by one-dimensional diffusion would be faster than by three-dimensional diffusion. But that is actually questionable. The diffusion constant for the non-specifically bound lac repressor along DNA has been estimated to be on the order of $D_1 \approx 10^{-9}\,\mathrm{cm^2/s}$ (Winter et al., 1981), a value which has since been confirmed through the direct measurement of the diffusion of repressors labeled with a fluorescent protein along stretched DNA (Elf et al., 2007). The genome of E. coli is about $L_{\mathrm{tot}} = 1.5$ mm long. The typical search time is then roughly $T = L_{\mathrm{tot}}^2/D_1 \approx 10^{7}$ s, which corresponds to about one year. Also from a conceptual point of view, it should come as no surprise that the one-dimensional search is not very effective. If we assume that the DNA chain has Gaussian statistics on length scales larger than the persistence length (Eq. 4.62), then the random walk of the repressor along that chain explores the three-dimensional space very slowly, like $t^{1/4}$, as compared to a search via three-dimensional diffusion that grows as $t^{1/2}$. The breakthrough idea in our understanding of the fast on-rate of the lac repressor was put forward by Berg, Winter and von Hippel in 1981, the BWH model (Berg et al., 1981). The basic idea of this theory is that the repressor speeds up the search by mixing
[Figure 8.4: log-log plot of the on-rate $k_{\mathrm{on}}$ [M$^{-1}$ s$^{-1}$] (axis range $10^{5}$ to $10^{10}$) against the sliding length $l_{\mathrm{slide}}$ [nm] (axis range 1 to $10^{6}$). Data points are labeled with salt concentrations from 0.025 M to 0.2 M; the points marked 3D and 1D are the limiting cases.]
Figure 8.4 On-rate of lac repressor binding to the operator as a function of the sliding length. Comparison between the in vitro experiment (Winter et al., 1981) and the theoretical prediction, Eq. 8.22, with $D_1 \approx 10^{-9}\,\mathrm{cm^2/s}$ (blue curve) and $D_1 = 5 \times 10^{-7}\,\mathrm{cm^2/s}$ (purple curve). Note that in the experiment only the salt concentration is controlled (value indicated next to each data point) but not the sliding length $l_{\mathrm{slide}}$. To relate those two quantities, we used Eq. 8.16 (see text for details). When the sliding length is as small as 0.5 nm, the reaction radius, we recover the 3D result (left red point), whereas for $l_{\mathrm{slide}} = L_{\mathrm{tot}} = 1.5$ mm we arrive at the 1D case (right red point).
one-dimensional and three-dimensional diffusion, see Fig. 8.5. We give here a simplified presentation of their argument following Ref. (Halford and Marko, 2004). Let us consider a single repressor and ask ourselves how long it takes for that repressor to find the target site after the lactose concentration has dropped at time $t = 0$. As a first step, we calculate the sliding length, the typical length that the repressor slides along the DNA after it has been adsorbed non-specifically:
\[ l_{\mathrm{slide}} \approx \sqrt{\frac{D_1}{k_{\mathrm{off}}}}. \tag{8.16} \]
Here and in the following we are only interested in the scaling, so we drop numerical factors and replace averages like $\sqrt{\langle x^2 \rangle}$ by $x$. The sliding length can be controlled experimentally since the off-rate depends on the salt concentration. We assume that any site along
Figure 8.5 By mixing three- and one-dimensional diffusion, the lac repressor (blue) finds the operator (yellow) much faster than through purely one- or purely three-dimensional search. Once the repressor is within the targeting radius rtarget , it finds the operator with a probability 0.5; the trajectory within rtarget is shown in green. According to Eq. 8.17, the targeting radius equals the sliding length l slide per one-dimensional search.
that length is visited during a sliding event and that, if the operator happens to be inside that length, the repressor binds to it. As a second step we introduce the targeting radius $r_{\mathrm{target}}$, see Fig. 8.5. This length is defined as follows: if the repressor starts at a distance $r_{\mathrm{target}}$ from the operator site, it will reach the target with a probability 0.5 without leaving that targeting radius. The volume with radius $r_{\mathrm{target}}$ around the operator contains some DNA stretch of length $l$. According to WLC statistics, Eq. 4.61, this length is given by $l = r_{\mathrm{target}}$ for $r_{\mathrm{target}} \ll l_P$ and by $\sqrt{l_P l} \approx r_{\mathrm{target}}$ for $r_{\mathrm{target}} \gg l_P$. We assume that whenever the repressor comes within the counterion atmosphere around the DNA double helix, it gets non-specifically adsorbed. Since the atmosphere is very thin, this means, to a good approximation, that the repressor adsorbs when it is within the DNA radius $r_{\mathrm{DNA}}$ around the DNA central axis. In order to estimate the chance that the repressor binds non-specifically while it is diffusing within the targeting radius, we divide the space within this volume into small volume elements of size $V = r_{\mathrm{DNA}}^3$. The repressor spends a typical time $r_{\mathrm{DNA}}^2/D_3$ in each volume element $V$ and a typical total time $r_{\mathrm{target}}^2/D_3$ in the whole targeting volume. The number of volume elements that it visits during that total time is thus given by the ratio of these two times, namely $r_{\mathrm{target}}^2/r_{\mathrm{DNA}}^2$. As the whole targeting volume contains $r_{\mathrm{target}}^3/r_{\mathrm{DNA}}^3$ volume elements, the chance of visiting a particular element is only $r_{\mathrm{DNA}}/r_{\mathrm{target}}$. The number of volume elements that contain DNA is $l/r_{\mathrm{DNA}}$. Therefore, the three-dimensional diffusion of the protein leads on average to $l/r_{\mathrm{target}}$ non-specific interactions with the DNA inside the targeting volume. How many such non-specific interactions are required on average for the repressor to find the operator? We assume that every non-specific interaction leads to one-dimensional sliding along the DNA with the above given sliding length $l_{\mathrm{slide}}$ and that the $n$ different one-dimensional searches are non-overlapping. Then the operator is expected to be found once the total length of these searches covers the whole length of DNA inside the targeting radius, i.e., $n\, l_{\mathrm{slide}} \approx l$. Equating $n$ with the number of non-specific interactions $l/r_{\mathrm{target}}$ from above, we find
\[ r_{\mathrm{target}} \approx l_{\mathrm{slide}}. \tag{8.17} \]
Note that this result does not depend on the chain conformation, since for Eq. 8.17 we only used the random walk statistics of the protein trajectory and the volume of the DNA molecule, but not its conformation. However, the number of sliding events within the targeting radius depends on the DNA conformation. Let us start with a short sliding length $l_{\mathrm{slide}} < l_P$. According to Eq. 8.17 one then has $r_{\mathrm{target}} < l_P$ which means that the DNA is straight within the targeting volume, i.e., $l \approx r_{\mathrm{target}} \approx l_{\mathrm{slide}}$. Therefore only one non-specific contact is enough for the repressor to find the operator in the subsequent sliding event. Now suppose the targeting radius is much larger than the DNA persistence length, $r_{\mathrm{target}} \gg l_P$. In that case, $r_{\mathrm{target}}^2 \approx l_P l$, i.e., $l \approx r_{\mathrm{target}}^2/l_P$. With $n \approx l/l_{\mathrm{slide}}$ and using Eq. 8.17, we find $n \approx l_{\mathrm{slide}}/l_P$. In other words, if the non-specific interaction is strong enough, the sliding length and hence the targeting radius (Eq. 8.17) become larger than the DNA persistence length. In that case, the contour length $l$ of the coiled DNA stretch within the targeting radius greatly exceeds $l_{\mathrm{slide}}$ and several one-dimensional searches are necessary on average before the repressor finds the operator site. How much time does the repressor spend on average within the targeting volume, before it either binds to the operator or leaves that volume? This time is the sum of the time of the 3D diffusion through the targeting volume and of the time spent during the $n$ sliding events, each contributing the time $k_{\mathrm{off}}^{-1}$, i.e.,
\[ \tau_{\mathrm{target}} \approx \frac{r_{\mathrm{target}}^2}{D_3} + \frac{n}{k_{\mathrm{off}}} \approx \frac{l_{\mathrm{slide}}^2}{D_3} + \frac{l_{\mathrm{slide}}\, l}{D_1} \approx \frac{l_{\mathrm{slide}}\, l}{D_1}. \tag{8.18} \]
In the second step on the rhs of Eq. 8.18 we replaced $n$ by $l/l_{\mathrm{slide}}$ and used Eq. 8.16 to eliminate $k_{\mathrm{off}}$. In the last step we used the fact that the repressor spends much more time diffusing along DNA than performing three-dimensional diffusion since $l \geq l_{\mathrm{slide}}$ and $D_3 > D_1$. We have now achieved a detailed understanding of the dynamics within the targeting radius, i.e., the region close to the operator. Next we need to look at the whole cell. We approximate the E. coli cell by a sphere of radius $R$ (see Fig. 8.6(b)), even though E. coli cells are rather elongated (see Fig. 8.6(a)). There is a region of size $r_{\mathrm{nuc}} < R$ in the cell where the bacterial genome is located, the so-called nucleoid. Note that the DNA in a bacterium is not separated from the rest of the cell through a nuclear envelope as in a eukaryotic cell, see Fig. 1.6. In fact, compartmentalization is a trait of eukaryotic cells but is not found in bacteria. Another case that we came across is the in vitro experiment with a solution of non-overlapping DNA coils. The calculation that we present in the following for the target search in an E. coli cell can be translated to the in vitro case if one identifies $r_{\mathrm{nuc}}$ with the coil size and $R$ with the typical spacing between different DNA coils in the test tube, see Fig. 8.6(c). Suppose the milk sugar level drops to zero at $t = 0$ and the repressor, diffusing somewhere outside the nucleoid, starts to search for the operator. We employ a line of argument similar to the one we used when we discussed the dynamics within the targeting radius. We first ask: how many times does the lac repressor have to explore the whole volume of the cell before it enters into the nucleoid? We divide the volume $R^3$ into smaller volumes $r_{\mathrm{nuc}}^3$. In each of those smaller volumes the protein spends a time $r_{\mathrm{nuc}}^2/D_3$. During one run through the entire volume, which takes the time $\tau_{\mathrm{cell}} \approx R^2/D_3$, the protein visits $R^2/r_{\mathrm{nuc}}^2$ smaller volumes and thus finds the nucleoid with a probability $r_{\mathrm{nuc}}/R$. Therefore, the number of times the protein needs to explore the whole cell before finding the nucleoid is given by $n_{\mathrm{cell}} = R/r_{\mathrm{nuc}}$.
[Figure 8.6: panels (a), (b) and (c); labels: nucleoid, cell wall, DNA coils, $r_{\mathrm{nuc}}$, $R$.]
Figure 8.6 (a) Schematic sketch of an E. coli bacterium showing its overall elongated shape enclosed by a cell wall and the nucleoid, the region where the bacterial genome is located. (b) Spherical model of E. coli as used in the calculation described in the main text. (c) A solution of DNA coils, the typical situation in in vitro experiments.
We next calculate the number of times the lac repressor has to reenter the nucleoid before it manages to come within the targeting radius of the operator. This calculation is similar to the previous one. We simply have to replace $R$ by $r_{\mathrm{nuc}}$ and $r_{\mathrm{nuc}}$ by $r_{\mathrm{target}}$. We immediately find that the protein needs on average $n_{\mathrm{visit}} = r_{\mathrm{nuc}}/r_{\mathrm{target}}$ visits to the nucleoid before it finds the targeting volume (and hence the operator with a probability 0.5). When inside the nucleoid, the motion of the repressor can be envisaged as a random walk of step length $r_{\mathrm{target}}$ with $r_{\mathrm{nuc}}^2/r_{\mathrm{target}}^2$ steps in total. However, not every $r_{\mathrm{target}}^3$-volume element contains DNA since there are $r_{\mathrm{nuc}}^3/r_{\mathrm{target}}^3$ such elements in total but only $L_{\mathrm{tot}}/l$ with DNA inside. Thus the repressor gets non-specifically stuck only in a fraction $(L_{\mathrm{tot}}/l)/(r_{\mathrm{nuc}}^3/r_{\mathrm{target}}^3)$ of a total of $r_{\mathrm{nuc}}^2/r_{\mathrm{target}}^2$ steps. At every such event, the repressor spends a time $\tau_{\mathrm{target}}$, see Eq. 8.18. Neglecting the time that the repressor spends inside empty $r_{\mathrm{target}}^3$-elements, the average time per visit to the nucleoid is given by
\[ \tau_{\mathrm{nuc}} \approx \frac{r_{\mathrm{target}}\, L_{\mathrm{tot}}}{r_{\mathrm{nuc}}\, l}\, \tau_{\mathrm{target}} \approx \frac{l_{\mathrm{slide}}^2\, L_{\mathrm{tot}}}{r_{\mathrm{nuc}}\, D_1}. \tag{8.19} \]
Before the repressor binds to the operator, it has to find the nucleoid $n_{\mathrm{visit}}$ times, each time spending the time $n_{\mathrm{cell}} \tau_{\mathrm{cell}}$, followed by the search inside the nucleoid for the targeting volume which takes time $\tau_{\mathrm{nuc}}$ every time. We therefore estimate the overall search time by
\[ T_{\mathrm{switch}} = n_{\mathrm{visit}} \left( n_{\mathrm{cell}} \tau_{\mathrm{cell}} + \tau_{\mathrm{nuc}} \right). \tag{8.20} \]
Putting all the results together, we find the following dependence of the search time on the sliding length:
\[ T_{\mathrm{switch}} = \frac{R^3}{D_3}\, \frac{1}{l_{\mathrm{slide}}} + \frac{L_{\mathrm{tot}}}{D_1}\, l_{\mathrm{slide}}. \tag{8.21} \]
The on-rate $k_{\mathrm{on}}$ for the specific binding of the repressor to the operator then follows immediately from Eq. 8.3. As there is one operator in the volume $R^3$, one has $c_S = R^{-3}$ and thus $k_{\mathrm{on}} \approx R^3/T_{\mathrm{switch}}$, leading to
\[ k_{\mathrm{on}} \approx \left( \frac{1}{D_3\, l_{\mathrm{slide}}} + \frac{L_{\mathrm{tot}}\, l_{\mathrm{slide}}}{D_1 R^3} \right)^{-1}. \tag{8.22} \]
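Equations 8.21 and 8.22 are simple enough to evaluate directly. The following Python sketch does so with the parameter values the text quotes for the in vitro comparison in Fig. 8.4 ($D_3 = 5 \times 10^{-7}$ cm²/s, $L_{\mathrm{tot}} = 1.5$ mm, $R = 12$ μm). The closed-form optimum is obtained here by minimizing Eq. 8.21 with respect to $l_{\mathrm{slide}}$; it is a consequence of the scaling formula, not a result stated in the text, and the function names are illustrative.

```python
import math

AVOGADRO = 6.022e23   # molecules per mole

# Parameters quoted in the text for the in vitro comparison (cgs units).
D3 = 5e-7             # cm^2/s, three-dimensional diffusion constant
L_TOT = 0.15          # cm, DNA contour length (1.5 mm)
R = 12e-4             # cm, spacing between DNA coils (12 um)


def search_time(l_slide_cm, d1):
    """Eq. 8.21: T_switch in seconds (scaling estimate, no prefactors)."""
    return R**3 / (D3 * l_slide_cm) + L_TOT * l_slide_cm / d1


def on_rate(l_slide_cm, d1):
    """Eq. 8.22: k_on = R^3 / T_switch in cm^3/s."""
    return R**3 / search_time(l_slide_cm, d1)


def optimal_sliding_length(d1):
    """Sliding length that minimizes Eq. 8.21, from dT/dl = 0."""
    return math.sqrt(R**3 * d1 / (D3 * L_TOT))


def to_per_molar_per_second(k_cm3_s):
    """Convert cm^3/s to 1/(M s); 1 cm^3 = 1e-3 L."""
    return k_cm3_s * 1e-3 * AVOGADRO


results = {}
for label, d1 in (("slow", 1e-9), ("fast", 5e-7)):
    l_opt = optimal_sliding_length(d1)
    k_max = to_per_molar_per_second(on_rate(l_opt, d1))
    results[label] = (l_opt * 1e7, k_max)   # sliding length in nm
    print(f"D1 = {d1:.0e} cm^2/s: optimal l_slide = {l_opt * 1e7:.0f} nm, "
          f"k_on = {k_max:.1e} 1/(M s)")
```

At the optimum the two terms of Eq. 8.21 are equal, i.e., the repressor spends as much time in three-dimensional excursions as it does sliding.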
In Fig. 8.4 we present a comparison between the theoretical prediction, Eq. 8.22, and the in vitro data from Ref. (Winter et al., 1981). One set of the data points (filled circles) gives the measured on-rate with the corresponding KCl concentration written next to it. To compare these data points to the theory which gives $k_{\mathrm{on}}$ as a function of the sliding length, we estimated $l_{\mathrm{slide}}$ using Eq. 8.16. In that equation, the off-rate for non-specific binding follows from Eq. 8.10 and its relation to $c_{\mathrm{salt}}$ from Eq. 8.11 and from the value $3 \times 10^{6}\,\mathrm{M^{-1}\,s^{-1}}$ for the on-rate for non-specific binding which was determined experimentally (Winter et al., 1981). Two versions of the data points are shown. The black dots with error bars assume $D_1 \approx 10^{-9}\,\mathrm{cm^2/s}$ to correlate $c_{\mathrm{salt}}$ to $l_{\mathrm{slide}}$, whereas the empty circles assume $D_1 = 5 \times 10^{-7}\,\mathrm{cm^2/s}$. The values used to plot both theoretical curves are $D_3 = 5 \times 10^{-7}\,\mathrm{cm^2/s}$, a chain length $L_{\mathrm{tot}} = 1.5$ mm and an average spacing $R = 12\,\mu\mathrm{m}$ of the DNA coils that follows from their concentration $10^{-12}$ M. For the blue curve we assume $D_1 \approx 10^{-9}\,\mathrm{cm^2/s}$; this curve must therefore be compared to the filled circles. On the other hand, for the purple curve we assume fast one-dimensional diffusion, $D_1 = 5 \times 10^{-7}\,\mathrm{cm^2/s}$, corresponding to the empty circles. For both cases, the theoretical curve is not very close to the corresponding experimental data. This is not surprising as this is a very simplified theory and we have also dropped all numerical factors. Nevertheless, the theory seems to reflect the overall trend of the data. In particular, there is a maximum of the on-rate at intermediate sliding lengths. For the larger $D_1$ value, our simplified treatment gives a value of $k_{\mathrm{on}}$ that is as large as the optimal experimental on-rate. Also indicated in Fig. 8.4 as red filled circles are our estimates of the on-rate for purely three- or purely one-dimensional search. The three-dimensional on-rate $k_{\mathrm{on}} \sim D_3 b$, Eq. 8.8, is reached once the sliding length is so short that it is on the order of the reaction radius $b$ of the operator site. This follows by comparison to Eq. 8.22 which for small values of $l_{\mathrm{slide}}$ simplifies to $k_{\mathrm{on}} \approx D_3\, l_{\mathrm{slide}}$. On the other hand, Eq. 8.22 approaches the case of purely one-dimensional search, $k_{\mathrm{on}} \approx R^3 D_1/L_{\mathrm{tot}}^2$, when $l_{\mathrm{slide}} = L_{\mathrm{tot}}$. For $D_1 \approx 10^{-9}\,\mathrm{cm^2/s}$ this leads to an extremely small on-rate of about $5 \times 10^{4}\,\mathrm{M^{-1}\,s^{-1}}$. When the sliding length lies in between these two extreme values, the repressor mixes one- and three-dimensional diffusion, which speeds up the search considerably.
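The mapping from salt concentration to sliding length described above combines Eqs. 8.10, 8.11 and 8.16. A minimal Python sketch of this chain, together with the purely one-dimensional limit, is given below. It assumes that the non-specific on-rate does not itself depend on the salt concentration, and the function names are illustrative; the numerical inputs are the ones quoted in the text.

```python
import math

AVOGADRO = 6.022e23        # molecules per mole
K_ON_NONSPECIFIC = 3e6     # 1/(M s), non-specific on-rate quoted in the text
D1_SLOW = 1e-9             # cm^2/s, one-dimensional diffusion constant
L_TOT = 0.15               # cm, DNA contour length (1.5 mm)
R = 12e-4                  # cm, spacing between DNA coils (12 um)


def nonspecific_keq(c_salt_molar):
    """Eq. 8.11: non-specific equilibrium constant in M."""
    return math.exp(10.0 * math.log(c_salt_molar) + 8.5)


def sliding_length_nm(c_salt_molar, d1_cm2_s):
    """Eq. 8.16 with k_off = K_eq * k_on (Eq. 8.10); result in nm."""
    k_off = nonspecific_keq(c_salt_molar) * K_ON_NONSPECIFIC   # 1/s
    return math.sqrt(d1_cm2_s / k_off) * 1e7                   # cm -> nm


def one_dimensional_on_rate(d1_cm2_s):
    """Pure 1D limit k_on = R^3 D_1 / L_tot^2, converted to 1/(M s)."""
    k_cm3_s = R**3 * d1_cm2_s / L_TOT**2
    return k_cm3_s * 1e-3 * AVOGADRO                           # 1 cm^3 = 1e-3 L


for c_salt in (0.025, 0.05, 0.1, 0.2):
    print(f"c_salt = {c_salt:5.3f} M -> l_slide = "
          f"{sliding_length_nm(c_salt, D1_SLOW):.3g} nm")

print(f"1D limit: k_on = {one_dimensional_on_rate(D1_SLOW):.1e} 1/(M s)")
print(f"1D search time L_tot^2 / D_1 = {L_TOT**2 / D1_SLOW:.1e} s")
```

Because of the factor 10 in Eq. 8.11, the sliding length scales as $c_{\mathrm{salt}}^{-5}$ in this sketch, so a modest change in salt concentration moves a data point by orders of magnitude along the horizontal axis of Fig. 8.4.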
8.2 RNA Polymerase
In the first chapter we discussed transcription (see Fig. 1.3), the copying of a gene into an RNA blueprint by an RNA polymerase. Since we are used to our macroscopic world, we described this as a simple deterministic process where nucleotides are added step by step according to the rules of Watson-Crick base pairing. As it should be clear by now, such a view is rather naive as the microscopic world of the molecules is governed by thermal fluctuations. In fact, a spectacular example is provided by the experimental trajectories of transcribing RNA polymerases which show that these enzymes move with wildly varying rates, see Fig. 8.7. The trajectories suggest that polymerases switch erratically between periods where they move with a more or less constant speed and periods where they appear to be stuck for extended times (Neuman et al., 2003). At first glance, you might think that polymerases are poorly designed, flimsy machines. However, as we shall see below, this seemingly erratic behavior is crucial for a reliable performance of RNA polymerase. In Fig. 8.8(a) a transcribing RNA polymerase is shown, which opens the double stranded DNA locally and uses one of the single strands as a template to grow a corresponding RNA transcript.
[Figure 8.7: (a) schematic labeled with $k_{\mathrm{cat}}$ and force; (b) template position [bp] plotted against time [s].]
Figure 8.7 (a) Schematic view of the experimental setup in Ref. (Neuman et al., 2003). A bead is optically trapped and attached to a transcribing RNA polymerase. In the setup shown, the bead pulls against the direction of transcription. (b) Measured time trace for a load of 18 pN and a concentration of 18 mM nucleoside triphosphates (NTP) (adapted from Ref. (Neuman et al., 2003)).
Indicated is an active site where a nucleotide has just been added. By adding more and more nucleotides, the transcript grows and the RNA polymerase moves along the DNA template from bp position n − 1 to bp position n and so on, see Fig. 8.8(c). A typical rate for adding nucleotides is kcat = 10/s (10 bp per second) and results in the trajectory shown as curve 1 in Fig. 8.9. This trajectory is much smoother than the experimental one, Fig. 8.7(b), and again you might wonder why real polymerases seem to show such an erratic behavior. At this point it is worthwhile to ask where the energy comes from that allows the polymerase to move in a preferred direction. Remarkably, the monomers themselves that are built into the growing transcript are the fuel. They are available in the cell in a high-energy state in the form of nucleoside triphosphate, NTP in short (formed through reactions driven by the oxidative breakdown of food). If a G needs to be added, GTP is taken from solution and added to the growing end of the transcript. The polymerization reaction releases so-called pyrophosphate, abbreviated as PPi , Fig. 8.8(c). Remarkably, one of those NTPs, namely adenosine triphosphate (ATP), acts as a universal carrier of chemical energy to drive hundreds of cellular reactions. This is the fuel used by molecular motors, even if they have nothing to do with DNA or RNA.
[Figure 8.8: panels (a), (b) and (c); labels: RNA, RNA polymerase, DNA, active site, error, $k_{\mathrm{cat}}$, NTP, PP$_i$, positions $n-1$, $n$, $n+1$.]
Figure 8.8 (a) Transcribing RNA polymerase with a new proper nucleotide just added at the active site. (b) Same as before but with an incorrect base added at the growing end of the transcript. (c) Reaction scheme for a transcribing RNA polymerase. The numbers inside the orange disks give the length of the RNA transcript (or equivalently the bp position on the DNA template).
For example, the motor protein myosin performs mechanical work that causes the contraction of our muscles by splitting off one of the phosphates from the ATP molecule thereby transforming it into ADP. This reaction goes on and on when you turn a page of this book or follow its lines with your eyes. Going back to the transcribing RNA polymerase, let us simplify things by assuming that there are only two types of nucleotides and that one of the two fits at any given position. The correct nucleotide has just been added in Fig. 8.8(a), whereas in Fig. 8.8(b) an error occurred when the last nucleotide was added to the growing end of the transcript. The free energy difference between adding the right and adding a wrong nucleotide has been estimated to be $\Delta G_{\mathrm{act}} \approx 6\, k_B T$ (see (Depken et al., 2013) and references therein). The error rate, the ratio of the probability to add a wrong nucleotide, $p_{\mathrm{error}}$, to the probability to add the right nucleotide, $p_{\mathrm{correct}}$, is given by
\[ r_0 = \frac{p_{\mathrm{error}}}{p_{\mathrm{correct}}} = e^{-\Delta G_{\mathrm{act}}/k_B T} \approx \frac{1}{400}. \tag{8.23} \]
Figure 8.9 Simulated RNA polymerase trajectories (template position in bp versus time in s). Curve 1 corresponds to an RNA polymerase without backtracking, see Fig. 8.8. Curve 2 shows an example trajectory of a polymerase with one backtracking state (Fig. 8.10). Curve 3, corresponding to an RNA polymerase with multi-state backtracking (Fig. 8.12), looks similar to experimental curves like the one shown in Fig. 8.7(b). All the parameters have been chosen as indicated in the text.
Since $p_{\text{correct}} + p_{\text{error}} = 1$, we find the probability to insert the right base

$$p_{\text{correct}} = \frac{1}{1 + r_0} \approx 1 - r_0 \approx 1 - \frac{1}{400}, \tag{8.24}$$

a number very close to one. At first glance, this seems reasonable: the polymerase makes a mistake in only about one of 400 cases when it adds a nucleotide. But what we really should care about is that a reasonable fraction of the end-products, the proteins, carry the proper sequence and therefore fold into the correct shape. The average length of a gene is about $10^4$ bp. The error made at each base is independent of the errors made at previously added bases. Therefore the probability that the whole transcript is correct is the product of the probabilities that the individual bases are correct. For a gene of length $l$, this leads to the following probability $p_{\text{correct}}(l)$ for a correct transcript:

$$p_{\text{correct}}(l) = p_{\text{correct}}^{\,l} \approx \left(1 - \frac{1}{400}\right)^{l}. \tag{8.25}$$
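As a quick numerical check of Eqs. 8.23 to 8.25, the following minimal Python sketch evaluates the probability of an error-free transcript for a gene of $10^4$ bp. The function and variable names are illustrative choices, not taken from the text; the second input, an error ratio of $10^{-5}$, is the experimental order of magnitude for real RNA polymerases and is included only for comparison.

```python
import math

def p_correct_transcript(r, length):
    """Probability that all `length` bases are correct for error ratio r = p_error / p_correct."""
    p_correct = 1.0 / (1.0 + r)   # Eq. 8.24
    return p_correct ** length    # Eq. 8.25

gene_length = 10**4               # average gene length in bp, as in the text
r0 = math.exp(-6.0)               # Eq. 8.23 with Delta G_act = 6 k_B T

p_no_proofreading = p_correct_transcript(r0, gene_length)
p_observed = p_correct_transcript(1e-5, gene_length)

print(f"r0 = 1/{1.0 / r0:.0f}")
print(f"fraction of correct transcripts without proofreading: {p_no_proofreading:.1e}")
print(f"fraction of correct transcripts for r = 1e-5:          {p_observed:.2f}")
```

The sketch makes the point of the argument explicit: a per-base error ratio that looks small is raised to the power of the gene length, so even modest changes in the ratio change the fraction of usable transcripts by many orders of magnitude.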
Figure 8.10 Transcribing RNA polymerase with built-in proofreading through backtracking. (a) RNA polymerase with a correct nucleotide just added to the growing transcript. (b) Same but with a wrong ultimate base pair. (c) Overall reaction scheme for a transcribing RNA polymerase with backtracking.
For a gene of average length $l = 10^4$ we find $p_{\text{correct}}(10^4) \approx 10^{-11}$. This means that only one in about one hundred billion proteins would carry the correct sequence. In other words, if the RNA polymerase relied only on the free energy difference between doing the right and the wrong thing, it would in practice produce only waste and never a blueprint for a functional protein. It is known from experiments that the error rate of RNA polymerases is much lower, typically of the order of $10^{-5}$. In this case, the above argument gives a dramatically different result, namely that about 90% of the transcripts are correct. How can the polymerase achieve such a high fidelity when the individual error for adding the wrong base is about 1/400? The key idea was put forward by John Hopfield in 1974 (Hopfield, 1974) as the kinetic proofreading scheme. According to this scenario, error suppression can be achieved through a sequence of serially connected energy-consuming checkpoints. We shall discuss this idea directly for the case of transcription following the treatment by Depken, Parrondo and Grill (Depken et al., 2013). It is known that RNA polymerase has the capability to proofread the transcript by selectively cleaving bases that have already been incorporated, as depicted in Fig. 8.10. The situation of an RNA polymerase that has just added a correct base to the growing transcript is shown in Fig. 8.10(a). Different from the simplified picture in Fig. 8.8(a), the polymerase has not only the choice to move forward with a rate $k_{\text{cat}}$ but can also go into a backtracked state with a rate $k_{\text{bt}}$. In that state it has moved backwards by one base. The structure of the polymerase is such that the last base pair that has just been formed has to be broken in the backtracked state.
However, the overall number of base pairings inside the hybrid of the single-stranded DNA template and the growing RNA transcript, about 8 to 9 bp long, does not change because it re-forms a bond at the opposite end of the hybrid. This mechanism exposes the last incorporated base, which is then cleaved off with a rate $k_{\text{clv}}$. Alternatively, the polymerase recovers from the backtracked state with a rate $k_{\text{rec}}$ by returning to a state in which it can continue to transcribe. Obviously, if only correct bases were incorporated, this whole scenario would be useless and the polymerase would only lose time and
Figure 8.11 Sketch of the free energy landscape of an RNA polymerase for the two cases shown in Fig. 8.10(a) and (b). The case of correct base pairing is shown in black. The modifications of the landscape for the case of incorrect pairing are shown in red.
energy by going backwards and cleaving off pieces of the transcript. However, since the polymerase makes a mistake quite often (in one of 400 bases or so), this scenario offers the polymerase a chance to correct it. The case when a wrong base has just been added is shown in Fig. 8.10(b). Now cleaving off the last base removes the error. As we shall see, the proofreading comes about since the polymerase cleaves off many more incorrect than correct bases. This is achieved by having different rates for transcription and for going into backtracking for the two cases shown in Fig. 8.10(a) and (b). The rates for the case of an incorrect last base are indicated with a superscript: $k_{\text{cat}}^{\text{error}}$ and $k_{\text{bt}}^{\text{error}}$. The free energy landscape of the polymerase is shown in Fig. 8.11. The minimum at the right corresponds to the active transcribing state, the minimum to the left represents the backtracked state. You can see from this sketch that, for simplicity, we assume that the rate for recovery from backtracking, $k_{\text{rec}}$, is the same for right and wrong bases. Since for a correct ultimate base the active and the backtracked state have the same free energy (same number of correct bases paired in both states), the recovery rate obeys the equality $k_{\text{rec}} = k_{\text{bt}}$ where $k_{\text{bt}}$ is a shorthand notation for $k_{\text{bt}}^{\text{correct}}$. As mentioned earlier, incorporating the wrong base leads to a cost in the free energy by the amount $\Delta G_{\text{act}}$. According to our above assumption, this cost enters in full in the rate to backtracking: $k_{\text{bt}}^{\text{error}} = k_{\text{bt}}\, e^{\Delta G_{\text{act}}/k_B T}$. Moreover, it is known that the catalysis rate for adding a new base is reduced after a wrong base has been incorporated. We describe this by an increase of the barrier to catalysis by an amount $\Delta G_{\text{cat}}$, see Fig. 8.11. In other words, $k_{\text{cat}}^{\text{error}} = k_{\text{cat}}\, e^{-\Delta G_{\text{cat}}/k_B T}$ with $k_{\text{cat}} = k_{\text{cat}}^{\text{correct}}$.

In the following we calculate the probability that the ultimate base is not cleaved off but another base is added and hence the checkpoint is cleared. This probability depends on whether the ultimate base is right or wrong and is denoted by $p_i$ where $i$ stands for either "correct" or "error." The probability to clear the checkpoint is given by

$$p_i = \frac{k_{\text{cat}}^{i}}{k_{\text{bt}}^{i} + k_{\text{cat}}^{i}} \sum_{n=0}^{\infty} \left( \frac{k_{\text{bt}}^{i}\,(1 - p_{\text{clv}})}{k_{\text{bt}}^{i} + k_{\text{cat}}^{i}} \right)^{n} = \frac{k_{\text{cat}}^{i}}{k_{\text{cat}}^{i} + p_{\text{clv}}\, k_{\text{bt}}^{i}}, \tag{8.26}$$

where $p_{\text{clv}}$ denotes the probability that the ultimate base is cleaved off once the backtracked state is entered. The summation over $n$ is a summation over probabilities for different scenarios. The case $n = 0$ is the probability that the polymerase adds another base without jumping into the backtracked state; the probability for that event is $k_{\text{cat}}^{i} / (k_{\text{bt}}^{i} + k_{\text{cat}}^{i})$. For $n = 1$ the polymerase jumps into the backtracked state with probability $k_{\text{bt}}^{i} / (k_{\text{bt}}^{i} + k_{\text{cat}}^{i})$, then recovers with probability $1 - p_{\text{clv}}$ and finally adds a new base with probability $k_{\text{cat}}^{i} / (k_{\text{bt}}^{i} + k_{\text{cat}}^{i})$. Larger values of $n$ correspond to cases where the polymerase repeatedly falls back into the backtracked state before resuming transcription. The error rate $r_1$, defined as in Eq. 8.23, is then given by

$$r_1 = \frac{k_{\text{cat}}^{\text{error}}}{k_{\text{cat}}^{\text{error}} + p_{\text{clv}}\, k_{\text{bt}}^{\text{error}}} \cdot \frac{k_{\text{cat}}^{\text{correct}} + p_{\text{clv}}\, k_{\text{bt}}^{\text{correct}}}{k_{\text{cat}}^{\text{correct}}} = \frac{k_{\text{cat}} + p_{\text{clv}}\, k_{\text{bt}}}{k_{\text{cat}} + p_{\text{clv}}\, k_{\text{bt}}\, e^{(\Delta G_{\text{act}} + \Delta G_{\text{cat}})/k_B T}}. \tag{8.27}$$

It is now straightforward to calculate $r_1$ for the case depicted in Fig. 8.10. The probability $p_{\text{clv}}$ is given by $p_{\text{clv}} = k_{\text{clv}} / (k_{\text{clv}} + k_{\text{bt}})$ (using $k_{\text{bt}} = k_{\text{rec}}$). From Eq. 8.27 we find

$$r_1 = \frac{k_{\text{cat}}\,(k_{\text{clv}} + k_{\text{bt}}) + k_{\text{clv}}\, k_{\text{bt}}}{k_{\text{cat}}\,(k_{\text{clv}} + k_{\text{bt}}) + k_{\text{clv}}\, k_{\text{bt}}\, e^{(\Delta G_{\text{act}} + \Delta G_{\text{cat}})/k_B T}} \tag{8.28}$$
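which for $k_{\text{clv}} \ll k_{\text{bt}} \ll k_{\text{cat}}$, a condition fulfilled for RNA polymerase (see below), can be approximated by

$$r_1 \approx \frac{k_{\text{cat}}}{k_{\text{cat}} + k_{\text{clv}}\, e^{(\Delta G_{\text{act}} + \Delta G_{\text{cat}})/k_B T}}. \tag{8.29}$$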
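The closed form in Eq. 8.26 can be checked by following individual paths through a single checkpoint. The sketch below is a minimal Monte Carlo illustration, not the simulation behind Fig. 8.9: it only decides, for each trial, whether the next base is added before the last one is cleaved off. The function names, the number of trials and the random seed are arbitrary choices; the rates and free energies are the typical values used in this section.

```python
import math
import random

def clear_probability_exact(k_cat, k_bt, p_clv):
    """Closed form of Eq. 8.26: probability to add the next base before cleavage."""
    return k_cat / (k_cat + p_clv * k_bt)

def clear_probability_mc(k_cat, k_bt, k_rec, k_clv, trials, rng):
    """Monte Carlo estimate of the same probability by following individual paths."""
    cleared = 0
    for _ in range(trials):
        while True:
            # Active state: catalysis competes with the jump into backtracking.
            if rng.random() < k_cat / (k_cat + k_bt):
                cleared += 1
                break
            # Backtracked state: cleavage competes with recovery.
            if rng.random() < k_clv / (k_clv + k_rec):
                break  # last base cleaved off, checkpoint not cleared
            # Otherwise the polymerase has recovered and the loop starts again.
    return cleared / trials

# Rates in 1/s and free energies in units of k_B T (typical values from this section).
k_cat, k_bt, k_rec, k_clv = 10.0, 1.0, 1.0, 0.1
dG_act, dG_cat = 6.0, 2.0
k_cat_err = k_cat * math.exp(-dG_cat)  # slower catalysis after a wrong base
k_bt_err = k_bt * math.exp(dG_act)     # faster backtracking after a wrong base
p_clv = k_clv / (k_clv + k_rec)        # one-step backtracking

rng = random.Random(1)                 # fixed seed, arbitrary choice
trials = 100_000
p_correct_mc = clear_probability_mc(k_cat, k_bt, k_rec, k_clv, trials, rng)
p_error_mc = clear_probability_mc(k_cat_err, k_bt_err, k_rec, k_clv, trials, rng)
p_correct_exact = clear_probability_exact(k_cat, k_bt, p_clv)
p_error_exact = clear_probability_exact(k_cat_err, k_bt_err, p_clv)

print(f"correct base: MC {p_correct_mc:.4f}, Eq. 8.26 {p_correct_exact:.4f}")
print(f"wrong base:   MC {p_error_mc:.4f}, Eq. 8.26 {p_error_exact:.4f}")
print(f"r1 from Eq. 8.27: 1/{p_correct_exact / p_error_exact:.1f}")
```

Only branching probabilities enter here, not waiting times, because the question is which event happens first and not when it happens.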
How much does the RNA polymerase gain in fidelity through this backtracking mechanism? To answer this question, we need to insert explicit values into Eq. 8.29. Typical numbers are $k_{\text{cat}} = 10$/s, $k_{\text{bt}} = 1$/s and $k_{\text{clv}} = 0.1$/s, $\Delta G_{\text{cat}} = 2\,k_B T$ and, as mentioned earlier, $\Delta G_{\text{act}} = 6\,k_B T$. With this we find $r_1 \approx 1/30$. At first you might think that this is worse than the error rate $r_0 \approx 1/400$ that we found before for the direct incorporation of base pairs, see Eq. 8.23. The point, however, is that checkpoints are arranged sequentially. In the first step, the wrong base pair is added in about one of 400 cases. In the second step, this wrong base pair is about 30 times less likely to survive the proofreading step than a correct ultimate base pair. As a result, the two error rates multiply to a total error rate

$$r = r_0 \times r_1 \approx \frac{1}{12\,000}. \tag{8.30}$$
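The numbers above are easy to reproduce. The following sketch evaluates the exact one-step result, Eq. 8.28, its approximation, Eq. 8.29, and the resulting fraction of error-free transcripts for a gene of $10^4$ bp; the variable names are illustrative, and $r_0$ is rounded to 1/400 as in the text.

```python
import math

# Typical rates (1/s) and free energies (in units of k_B T) quoted in the text.
k_cat, k_bt, k_clv = 10.0, 1.0, 0.1
boost = math.exp(6.0 + 2.0)  # exp[(Delta G_act + Delta G_cat) / k_B T]

r0 = 1.0 / 400.0             # Eq. 8.23, rounded as in the text

# Eq. 8.28 (exact for one-step backtracking) and Eq. 8.29 (k_clv << k_bt << k_cat).
r1_exact = (k_cat * (k_clv + k_bt) + k_clv * k_bt) / (
    k_cat * (k_clv + k_bt) + k_clv * k_bt * boost
)
r1_approx = k_cat / (k_cat + k_clv * boost)

r_total = r0 * r1_approx     # Eq. 8.30
gene_length = 10**4
fraction_correct = (1.0 / (1.0 + r_total)) ** gene_length

print(f"r1 exact  = 1/{1.0 / r1_exact:.1f}")
print(f"r1 approx = 1/{1.0 / r1_approx:.1f}")
print(f"total r   = 1/{1.0 / r_total:.0f}")
print(f"fraction of correct transcripts = {fraction_correct:.2f}")
```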
For a gene of length $10^4$ bp this brings a vast improvement. Now about 40% of the transcripts are correct as compared to the tiny fraction, one in about one hundred billion, found in the absence of the proofreading step. A simulated trajectory of such a polymerase with the same parameters as given above is depicted in Fig. 8.9 (curve 2). The polymerase is slightly slower and its trajectory is a bit less regular than that of the proofreading-free polymerase (curve 1) but the quality of its transcripts is vastly superior. To achieve even better results, the polymerase could have changed some of the parameters that govern its kinetics in the course of evolution. Looking at the error rate, Eq. 8.29, a straightforward choice would have been to increase the cleavage rate $k_{\text{clv}}$. However, maybe as the result of the nucleotide chemistry and the intracellular concentrations of molecules that had been fixed during evolution at a much earlier stage, the RNA polymerase had to perfect its internal design to enhance its proofreading capability. It achieved this by allowing multiple backtracking, i.e., by going back more than only one base, see Fig. 8.12. The polymerase can
Figure 8.12 Transcribing RNA polymerase with multi-step backtracking. (a) Overall reaction scheme showing the multi-step backtracking branching off state n + 1. (b) Zoom into part of the reaction scheme showing explicitly the various rates.
now diffuse back and forth between states BT1, BT2, BT3, . . . It can leave this set of backtracked states either by recovering into the active state from BT1 or it can directly jump from any state to the transcribing state by cleaving off the corresponding number of bases that have been exposed, see Fig. 8.12(a). Looking at Fig. 8.12(b) we find the following self-consistency relation for $p_{\text{clv}}$:

$$1 - p_{\text{clv}} = \frac{k_{\text{bt}}}{2k_{\text{bt}} + k_{\text{clv}}} \sum_{m=0}^{\infty} \left( \frac{k_{\text{bt}}\,(1 - p_{\text{clv}})}{2k_{\text{bt}} + k_{\text{clv}}} \right)^{m}. \tag{8.31}$$

This equation gives the probability $1 - p_{\text{clv}}$ that the polymerase recovers from the backtracked state BT1 without cleavage. It must equal the expression on the rhs that sums over all possible paths that start from BT1 and eventually recover from the backtracked state without cleavage. For $m = 0$, the polymerase leaves the backtracked state directly with probability $k_{\text{bt}} / (2k_{\text{bt}} + k_{\text{clv}})$, see Fig. 8.12(b). For $m = 1$, the polymerase goes one step deeper into
backtracking, namely to state BT2, with probability $k_{\text{bt}} / (2k_{\text{bt}} + k_{\text{clv}})$. Since the sequence of states BT2, BT3, BT4, . . . has precisely the same transition rates as the sequence BT1, BT2, BT3, . . . , the recovery back through all possible paths from BT2 to BT1 without cleavage must also be given by $1 - p_{\text{clv}}$. With the probability $k_{\text{bt}} / (2k_{\text{bt}} + k_{\text{clv}})$ the polymerase then returns from BT1 to transcription. For the cases $m > 1$, the polymerase recovers $m$ times from BT2 to BT1 before it goes back to transcription. Equation 8.31 leads to

$$1 - p_{\text{clv}} = \frac{k_{\text{bt}}}{k_{\text{bt}} + k_{\text{clv}} + k_{\text{bt}}\, p_{\text{clv}}} \tag{8.32}$$

which can be rewritten as $p_{\text{clv}}^2 + (k_{\text{clv}}/k_{\text{bt}})\, p_{\text{clv}} - k_{\text{clv}}/k_{\text{bt}} = 0$. This is solved by

$$p_{\text{clv}} = \frac{k_{\text{clv}}}{2k_{\text{bt}}} \left( \sqrt{1 + \frac{4k_{\text{bt}}}{k_{\text{clv}}}} - 1 \right). \tag{8.33}$$

If we assume $k_{\text{clv}} \ll k_{\text{bt}}$ again, the cleavage probability can be approximated by

$$p_{\text{clv}} \approx \sqrt{\frac{k_{\text{clv}}}{k_{\text{bt}}}}. \tag{8.34}$$

To obtain the error suppression $r_2$ through multi-step backtracking, we have to insert the cleavage probability, Eq. 8.34, into Eq. 8.27. We obtain

$$r_2 \approx \frac{k_{\text{cat}}}{k_{\text{cat}} + \sqrt{k_{\text{clv}}\, k_{\text{bt}}}\; e^{(\Delta G_{\text{act}} + \Delta G_{\text{cat}})/k_B T}} \tag{8.35}$$

where we assumed $k_{\text{clv}} \ll k_{\text{bt}} \ll k_{\text{cat}}$ again. By using Eq. 8.27 we assume that only the transition into the backtracked state and the rate of transcription depend on the type of the last added base but that the other transitions are not affected by it. If you compare $r_2$ for multi-step backtracking, Eq. 8.35, to $r_1$ for one-step backtracking, Eq. 8.29, you can see that the only difference of the multi-step mechanism is to replace the factor in front of the exponential, the small cleavage rate $k_{\text{clv}}$, by the geometric mean $\sqrt{k_{\text{clv}}\, k_{\text{bt}}}$. With the values for the rates and free energies assumed above, we now find $r_2 \approx 1/100$ instead of $r_1 \approx 1/30$. Altogether this leads to a total error rate $r = r_0 \times r_2 \approx 1/40\,000$. Now about three quarters
of the transcripts from a $10^4$ bp long gene are correctly copied. A simulated trajectory of such a polymerase is shown in Fig. 8.9 (curve 3). It shares similarities with experimental trajectories, cf. Fig. 8.7(b). This suggests that the seemingly erratic behavior of RNA polymerases simply reflects the fact that they are careful copy machines, constantly checking their transcripts. To reach the error rate of $10^{-5}$ observed for real transcription, we need even better proofreading. A possible second proofreading step might be that RNA polymerases account for the case that a wrong base has been added but slipped through the proofreading mechanism so that it sits now in the penultimate position. If there is still a bias for going into backtracking, even though the ultimate base is correctly paired, the required fidelity can be reached as outlined in Ref. (Depken et al., 2013).
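The multi-step result can be checked in the same spirit. The sketch below solves the self-consistency relation for $p_{\text{clv}}$ (Eq. 8.33), verifies that the root satisfies Eq. 8.32, and compares the square-root approximation, Eq. 8.34, with the exact root for the typical rates used above. The names are illustrative, and `r2_from_root` is simply Eq. 8.27 evaluated with the exact root instead of the approximation.

```python
import math

def p_clv_multistep(k_clv, k_bt):
    """Physical (positive) root of the quadratic equation behind Eq. 8.33."""
    x = k_clv / k_bt
    return 0.5 * x * (math.sqrt(1.0 + 4.0 / x) - 1.0)

k_cat, k_bt, k_clv = 10.0, 1.0, 0.1  # typical rates in 1/s
boost = math.exp(6.0 + 2.0)          # exp[(Delta G_act + Delta G_cat) / k_B T]

p_exact = p_clv_multistep(k_clv, k_bt)
p_approx = math.sqrt(k_clv / k_bt)   # Eq. 8.34

# Residual of Eq. 8.32; it should vanish up to rounding errors.
residual = (1.0 - p_exact) - k_bt / (k_bt + k_clv + k_bt * p_exact)

r2_approx = k_cat / (k_cat + math.sqrt(k_clv * k_bt) * boost)               # Eq. 8.35
r2_from_root = (k_cat + p_exact * k_bt) / (k_cat + p_exact * k_bt * boost)  # Eq. 8.27

r_total = (1.0 / 400.0) * r2_approx
fraction_correct = (1.0 / (1.0 + r_total)) ** 10_000

print(f"p_clv exact = {p_exact:.3f}, approximation = {p_approx:.3f}")
print(f"r2 (Eq. 8.35) = 1/{1.0 / r2_approx:.0f}, r2 with exact root = 1/{1.0 / r2_from_root:.0f}")
print(f"fraction of correct transcripts = {fraction_correct:.2f}")
```

Since $k_{\text{clv}}/k_{\text{bt}} = 0.1$ is small but not tiny, the printout gives a feeling for how rough the approximation $k_{\text{clv}} \ll k_{\text{bt}}$ is for these parameter values.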
8.3 Nucleosome Dynamics

In this section we consider various aspects related to the dynamics of nucleosomes. This is a field of great importance since in eukaryotes about three quarters of the DNA is wrapped into nucleosomes (for a reminder see again Figs. 1.6 and 1.9). Before going into the details of the various experiments, it is worthwhile to take a closer look at the crystal structure of the nucleosome core particle, Fig. 1.8. 147 bp of DNA are wrapped in 1 3/4 left-handed superhelical turns around an octamer of core histone proteins. The octamer is made of two of each of the core histones H2A, H2B, H3 and H4, shown in light/dark gray, orange/green, yellow/blue and red/gold in Fig. 1.8. The octamer plus the wrapped DNA results in a 6 nm high cylinder of 5 nm radius. There are fourteen regions where the wrapped DNA contacts the octamer surface, located where the minor groove of the DNA double helix faces inward toward the surface of the octamer. At each contact region there are several direct hydrogen bonds as well as positive charges that attract the phosphates of the DNA backbones. As indicated in Fig. 8.13(a), the complex has a twofold axis of symmetry, the dyad axis, which passes through the middle of the wrapped DNA.
8.3.1 Site Exposure Mechanism

In this subsection we try to answer two questions. How does a DNA binding protein gain access to its specific target site if that site happens to be "buried" inside the wrapped DNA portion of a nucleosome? How can we obtain a quantitative estimate of the energetics involved in the DNA wrapping into nucleosomes? Both questions can be answered in one experiment that will be discussed in the following. Before we do this, we first estimate the elastic energy required for the DNA to wrap around the protein core using the WLC model discussed in Section 4.3. This estimate is not very precise, because one cannot be sure whether the WLC model applies to DNA bending as strong as in the nucleosome, but at least gives us a rough idea of the elastic energy involved. In a nucleosome 127 bp of DNA are strongly bent around the octamer; the rest, 10 bp at each terminus, is essentially straight, see Fig. 1.8. From Eq. 4.15 (without the twist term) it follows that

$$\frac{E_{\text{elastic}}}{k_B T} = \frac{l_P\, l}{2R_0^2}. \tag{8.36}$$
Here $l_P = A/k_B T = 50$ nm is the DNA persistence length, $l$ is the length of the bent part of the wrapped DNA, $l = 127 \times 0.34$ nm $= 43$ nm, and $R_0$ is the radius of curvature of its centerline (see Fig. 8.13(b)), which is roughly 4.3 nm. This leads to a bending energy of the order of $58\,k_B T$. If this estimate holds, we know that the binding energy of the 14 sites together must exceed $58\,k_B T$ and it should exceed it by a substantial amount, at least on the order of one $k_B T$ per binding site, so that the nucleosome is stable and does not fall apart spontaneously. Figure 8.13(b) schematically shows a partially unwrapped nucleosome. The bending energy of the DNA is lowered by the straightening of the unwrapped section of the DNA but the cost of opening the binding sites must exceed this gain. If the DNA wraps back it has to pay a mechanical penalty but overall it lowers its energy by closing the binding sites. If the difference between the pure binding energy and the bending energy is small enough, one can imagine that the nucleosome can make parts of its DNA temporarily accessible by spontaneous unwrapping, which leads to open configurations as the one depicted in Fig. 8.13(b).
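The estimate of Eq. 8.36 amounts to a one-line calculation. The following sketch spells it out with the numbers given above; the constant names are illustrative, and the value per binding site is added only to put the total into perspective.

```python
# Inputs taken from the text; lengths in nm, energies in units of k_B T.
PERSISTENCE_LENGTH_NM = 50.0  # l_P
RISE_PER_BP_NM = 0.34         # length of DNA per base pair
BENT_BP = 127                 # strongly bent part of the wrapped DNA
RADIUS_NM = 4.3               # radius of curvature R_0 of the DNA centerline
N_BINDING_SITES = 14

bent_length_nm = BENT_BP * RISE_PER_BP_NM
e_elastic = PERSISTENCE_LENGTH_NM * bent_length_nm / (2.0 * RADIUS_NM**2)  # Eq. 8.36

print(f"bent length l     = {bent_length_nm:.1f} nm")
print(f"E_elastic         = {e_elastic:.0f} k_B T")
print(f"E_elastic per site = {e_elastic / N_BINDING_SITES:.1f} k_B T")
```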
Figure 8.13 (a) Schematic view of the nucleosome core particle showing half of the wrapped DNA with the binding sites to the protein cylinder. Also indicated is the dyad axis, the axis of two-fold symmetry. (b) A partially unwrapped nucleosome with open binding sites (stars). The nucleosome can lower its energy by closing these binding sites at the cost of bending the DNA.
Polach and Widom (Polach and Widom, 1995) demonstrated that nucleosomes actually show such opening fluctuations. They studied nucleosome core particles that consist of 147 bp of DNA wrapped around the histone octamer. Since all the DNA is wrapped in such a complex, one should expect it to be inaccessible to DNA binding proteins if the DNA is too strongly bound. The experimental setup is depicted in Fig. 8.14. The basic idea is that the nucleosome is a dynamic structure in which parts of its DNA are spontaneously unwrapped from either of its ends. This makes any given specific binding site inside the nucleosome (e.g., the yellow stretch of DNA in Fig. 8.14(a)) temporarily accessible to a corresponding DNA binding protein whenever its DNA unwraps far enough to open that binding site. Every time this happens, a window of opportunity opens for the protein (called "R" in the figure) to bind to its site. One expects that the probability for having the binding site temporarily exposed decreases with the distance from the closest terminus of the wrapped DNA and is smallest in the center of the wrapped portion at the nucleosomal dyad. To be able to demonstrate and measure this site exposure mechanism, Polach and Widom used special types of proteins, so-called restriction enzymes. These enzymes cut DNA at specific short bp sequences. Nature provides a large number of these enzymes, which are naturally found in bacteria. The biological function of restriction enzymes is to protect a bacterium against foreign
Figure 8.14 The experimental setup in Ref. (Polach and Widom, 1995). (a) A fully wrapped nucleosome unwraps spontaneously, thereby exposing the binding site (yellow) for the restriction enzyme R. The enzyme cuts the DNA at this particular site. (b) Same setup in the absence of the histone octamer.
DNA. This is related to transformation (discussed at the beginning of Chapter 4), where bacteria import foreign DNA and by this mechanism transform. Likewise DNA might be injected by bacterial viruses, so-called bacteriophages. A given restriction enzyme in a bacterium recognizes foreign DNA by a short bp stretch that does not occur in its own DNA. It works somewhat similarly to our leukocytes (white blood cells) by "killing" the foreign substance, in this case by simply cutting the DNA at that specific site. It was the exposure of such cutting sites for restriction enzymes inside nucleosomal DNA that was monitored in the Polach and Widom experiment. As long as the nucleosome is sufficiently wrapped, restriction enzymes cannot bind due to steric hindrance. Once the nucleosome "breathes" spontaneously, i.e., unwraps its DNA beyond the binding site of the enzyme, the enzyme has the opportunity to bind. Once bound, the enzyme can either unbind or cut the DNA at that particular site, cf. Fig. 8.14(a). One measures the rate with which the nucleosomal DNA in a solution of core particles is degraded into smaller segments. This is compared to a solution of bare DNA chains under identical conditions, see Fig. 8.14(b). In the latter case, DNA is cut much faster since the DNA binding site does not have to be exposed by unwrapping from the octamer. By comparing the
bare DNA kinetics to the nucleosomal one, the probability of that particular binding site to be open can be deduced.

Let us first consider the set of reactions with the bare DNA, Fig. 8.14(b). We denote the bare DNA with "S" (S standing for "site," the site where the enzyme binds), the restriction enzyme with "R," the complex of enzyme and DNA by "RS" and the cut DNA with "P" (P stands for "product"). The reaction scheme is then given by

$$S + R \underset{k_{32}}{\overset{k_{23}}{\rightleftharpoons}} RS \xrightarrow{k_{34}} P + R. \tag{8.37}$$

Here $k_{23}$ and $k_{32}$ denote the forward and backward rates for the binding and unbinding of the restriction enzyme to and from its target site and $k_{34}$ is the irreversible rate for the cutting of the DNA. For this reaction scheme one can estimate the rate of the decay of the intact, uncut DNA by writing down the rate equations for the concentrations of the different species. In a compact matrix notation the set of equations is as follows:

$$\frac{d}{dt} \begin{pmatrix} c_S \\ c_{RS} \end{pmatrix} = \begin{pmatrix} -k_{23} c_R & k_{32} \\ k_{23} c_R & -k_{32} - k_{34} \end{pmatrix} \begin{pmatrix} c_S \\ c_{RS} \end{pmatrix} = A \begin{pmatrix} c_S \\ c_{RS} \end{pmatrix}. \tag{8.38}$$

Here $c_S$ is the concentration of sites S, $c_{RS}$ the concentration of bound restriction enzymes and $c_R$ the concentration of free, unbound enzymes. These concentrations are functions of time, $c_S = c_S(t)$ and $c_{RS} = c_{RS}(t)$. We assume that the concentration of enzymes is so high that $c_R \gg c_{RS}$ at all times. This allows us to set $c_R = \text{const}$ to a good approximation. In Eq. 8.38 the concentration of the product P is not considered since it is directly related to the concentrations $c_S$ and $c_{RS}$. Solutions of Eq. 8.38 are linear superpositions of two solutions that follow from the ansatz

$$\mathbf{c}^{i}(t) = \mathbf{c}^{i} e^{\lambda^{i} t} = \begin{pmatrix} c_1^{i} \\ c_2^{i} \end{pmatrix} e^{\lambda^{i} t} \tag{8.39}$$

with $i = +, -$. Plugging this into Eq. 8.38 leads to the condition

$$A \mathbf{c}^{i} = \lambda^{i} \mathbf{c}^{i} \tag{8.40}$$

with the $2 \times 2$ matrix $A$ being defined in Eq. 8.38. From this we see immediately that $\mathbf{c}^{+}$ and $\mathbf{c}^{-}$ are the eigenvectors of $A$ and $\lambda^{+}$ and $\lambda^{-}$ the corresponding eigenvalues. Of interest to us are especially the eigenvalues, which are given by

$$\lambda^{\pm} = \frac{1}{2} \left[ \pm \sqrt{(k_{23} c_R + k_{32} + k_{34})^2 - 4 k_{23} k_{34} c_R} - k_{23} c_R - k_{32} - k_{34} \right] \tag{8.41}$$
where the plus sign in $\pm$ should be used for $\lambda^{+}$ and the minus sign for $\lambda^{-}$. The quantities $-\lambda^{+}$ and $-\lambda^{-}$ are called the relaxation rates of the components proportional to $\mathbf{c}^{+}$ and $\mathbf{c}^{-}$, respectively (see Eq. 8.39). Generally, the initial experimental concentrations $c_S(0)$ and $c_{RS}(0)$ are not known. However, we want to measure a single decay rate in the experiments so we assume that $|\lambda^{-}| \gg |\lambda^{+}|$. This is the case if and only if $(k_{23} c_R + k_{32} + k_{34})^2 \gg k_{23} k_{34} c_R$. Then after a very short time on the order of $1/|\lambda^{-}|$, only the component with the smaller decay rate, $-\lambda^{+}$, survives, whereas the faster mode has died out. The experimentally determined rate constant $k_{\text{bare}}$ that controls the decay of the bare DNA is then simply

$$k_{\text{bare}} = -\lambda^{+} \approx \frac{k_{23} k_{34} c_R}{k_{32} + k_{34} + k_{23} c_R}. \tag{8.42}$$
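To see how well Eq. 8.42 approximates the slow relaxation rate, one can evaluate the eigenvalues of Eq. 8.41 directly. The sketch below does this for one set of rate constants; these values are hypothetical, chosen only so that the separation of time scales holds, and are not taken from the experiment.

```python
import math

def relaxation_eigenvalues(k23_cR, k32, k34):
    """Eigenvalues lambda^+ and lambda^- of the matrix A in Eq. 8.38, see Eq. 8.41."""
    s = k23_cR + k32 + k34
    root = math.sqrt(s * s - 4.0 * k23_cR * k34)
    return 0.5 * (root - s), 0.5 * (-root - s)

def k_bare_approx(k23_cR, k32, k34):
    """Approximate slow decay rate, Eq. 8.42."""
    return k23_cR * k34 / (k32 + k34 + k23_cR)

# Hypothetical rates in 1/s; k23_cR stands for the product k23 * c_R.
k23_cR, k32, k34 = 0.1, 10.0, 5.0

lam_plus, lam_minus = relaxation_eigenvalues(k23_cR, k32, k34)
k_bare = k_bare_approx(k23_cR, k32, k34)

# Characteristic polynomial of A evaluated at lambda^+; it should vanish.
char_poly = lam_plus**2 + (k23_cR + k32 + k34) * lam_plus + k23_cR * k34

print(f"slow rate -lambda^+ = {-lam_plus:.5f} /s")
print(f"fast rate -lambda^- = {-lam_minus:.3f} /s")
print(f"Eq. 8.42 estimate   = {k_bare:.5f} /s")
```

With these inputs the two relaxation rates differ by more than two orders of magnitude, which is the regime in which a single exponential decay is observed.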
Next we determine the rate constant for the cutting of the nucleosomal DNA. The reaction scheme, Fig. 8.14(a), is now as follows:

$$N + R \underset{k_{21}}{\overset{k_{12}}{\rightleftharpoons}} S + R \underset{k_{32}}{\overset{k_{23}}{\rightleftharpoons}} RS \xrightarrow{k_{34}} P + R \tag{8.43}$$

where "N" stands for the closed nucleosome. As before, we assume that $c_R \gg c_{RS}$ so that $c_R = \text{const}$, in which case we have three linear first-order differential equations for $c_N$, $c_S$ and $c_{RS}$. We also assume that the first reaction in Eq. 8.43, the equilibrium between open and closed nucleosome, is fast compared to the other reactions, the so-called rapid conformational pre-equilibrium. One can show that this is the case if $k_{21} \gg k_{32} + k_{34} + k_{23} c_R$ (see Ref. (Prinsen and Schiessel, 2010) for a full analysis). After a short time the ratio $c_N / c_S$ is approximately constant and equal to $k_{21} / k_{12}$ and we can simplify the set of rate equations to

$$\frac{d}{dt} \begin{pmatrix} c_N + c_S \\ c_{RS} \end{pmatrix} = \begin{pmatrix} -k_{23} c_R\, p_{\text{open}} & k_{32} \\ k_{23} c_R\, p_{\text{open}} & -k_{32} - k_{34} \end{pmatrix} \begin{pmatrix} c_N + c_S \\ c_{RS} \end{pmatrix} \tag{8.44}$$

where

$$p_{\text{open}} = \frac{c_S}{c_N + c_S} \approx \frac{k_{12}}{k_{12} + k_{21}}. \tag{8.45}$$
The quantity $p_{\text{open}}$ has a simple meaning: it is the probability to find the binding site open or, in other words, it gives the fraction of time the binding site is accessible. Equation 8.44 corresponds to the following reaction scheme

$$D + R \underset{k_{32}}{\overset{k_{23}\, p_{\text{open}}}{\rightleftharpoons}} RS \xrightarrow{k_{34}} P + R \tag{8.46}$$
where D represents the intact DNA, i.e., N and S lumped together. Note that the scheme 8.46 is of the same form as that for the bare DNA case, Eq. 8.37, with $c_S$ replaced by $c_D = c_N + c_S$ and $k_{23}$ by $k_{23}\, p_{\text{open}}$. There is an additional factor $p_{\text{open}}$ because the restriction site is only available in a fraction $p_{\text{open}}$ of the DNA molecules. Analogous to the case of bare DNA, we assume that the smaller of the two eigenvalues (in absolute value) is much smaller than the other one, which is the case if and only if $(k_{23} c_R\, p_{\text{open}} + k_{32} + k_{34})^2 \gg k_{23} k_{34} c_R\, p_{\text{open}}$. Then for times larger than the decay time of the faster mode, D will decay with a single decay rate (cf. Eq. 8.42)

$$k_{\text{nucl}} \approx \frac{k_{23} k_{34} c_R\, p_{\text{open}}}{k_{32} + k_{34} + k_{23} c_R\, p_{\text{open}}}. \tag{8.47}$$

In principle one can determine $p_{\text{open}}$ from $k_{\text{nucl}}$. However, one needs to know the values of the rate constants $k_{23}$, $k_{32}$ and $k_{34}$. If $k_{32} + k_{34} \gg k_{23} c_R$, matters simplify considerably. In this case, it follows from Eqs. 8.42 and 8.47 that

$$p_{\text{open}} \approx k_{\text{nucl}} / k_{\text{bare}}. \tag{8.48}$$
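How accurate the simple ratio in Eq. 8.48 is can be tested on synthetic numbers. In the sketch below the slow decay rates of the bare and the nucleosomal scheme are computed exactly from the eigenvalue formula, Eq. 8.41 (with $k_{23}$ replaced by $k_{23}\, p_{\text{open}}$ in the nucleosomal case), and their ratio is compared with the value of $p_{\text{open}}$ that was put in. All rate constants and the value of $p_{\text{open}}$ are hypothetical and serve only as an illustration of the regime $k_{32} + k_{34} \gg k_{23} c_R$.

```python
import math

def slow_rate(k_on, k32, k34):
    """Slow relaxation rate -lambda^+ for an effective on-rate k_on (Eq. 8.41)."""
    s = k_on + k32 + k34
    return 0.5 * (s - math.sqrt(s * s - 4.0 * k_on * k34))

# Hypothetical rates in 1/s; k23_cR stands for the product k23 * c_R.
k23_cR, k32, k34 = 0.1, 10.0, 5.0
p_open_true = 0.01                # hypothetical fraction of open sites

k_bare = slow_rate(k23_cR, k32, k34)                # bare DNA, scheme 8.37
k_nucl = slow_rate(k23_cR * p_open_true, k32, k34)  # nucleosome, scheme 8.46
p_open_estimate = k_nucl / k_bare                   # Eq. 8.48

print(f"k_bare = {k_bare:.5f} /s, k_nucl = {k_nucl:.7f} /s")
print(f"p_open put in = {p_open_true}, recovered from the ratio = {p_open_estimate:.5f}")
```

Making `k23_cR` comparable to `k32 + k34` in this sketch shows how the ratio starts to deviate from the true $p_{\text{open}}$ once the condition behind Eq. 8.48 is violated.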
In other words, by comparing the two rates for DNA cutting, that for nucleosomes, $k_{\text{nucl}}$, and that for bare DNA, $k_{\text{bare}}$, we can easily determine $p_{\text{open}}$ without having to determine the other rate constants explicitly. The experiments of Polach and Widom (Polach and Widom, 1995) actually meet all the restrictions on the rate constants and concentrations that lead to Eq. 8.48 (Prinsen and Schiessel, 2010). Figure 8.15 shows the results of the experiment. We plot $p_{\text{open}}$ as a function of $x_b$, the position along the DNA of the binding site of the respective restriction enzyme (in bp). Experiments have been performed for positions close to the entrance at $x_b = 1$ bp up to close to the middle of the wrapped portion at $x_b = 74$ bp. Note that the accessibility is greatly reduced for binding sites anywhere in the nucleosomal DNA as compared to bare DNA, even for binding sites close to the terminus of the wrapped portion. Moreover, the data
Figure 8.15 Probability popen for a binding site to be open. The position of the binding site, xb , is given in bp. The termini of the wrapped portion are at xb = 1 bp and at xb = 147 bp. The data are taken from a restriction enzyme analysis (Polach and Widom, 1995); each data point provides the mean and standard deviation of the logarithm of popen . The theoretical curve, Eq. 8.50, is also shown. A nucleosome with the restriction sites is shown on the right.
points lie roughly along a straight line in the log-linear plot. This suggests that the probability decays exponentially from the termini towards the middle of the wrapped portion. So we found the answer to one of the two questions raised at the beginning of this section: DNA binding proteins can reach their target sites within nucleosomal DNA because of spontaneous opening fluctuations of the nucleosome. What remains to be determined from the data shown in Fig. 8.15 is the energetics involved in the site exposure mechanism. For this we need to relate the experimentally measured quantity popen to fcrit , the adsorption energy per length of the nucleosomal DNA. We call this quantity fcrit because it corresponds to the critical force that would be required to pull the DNA from the histone octamer. We assume that the unwrapping state of the nucleosome depends only on the number of unwrapped bp at each end of the DNA. We number the base pairs of the DNA that can be adsorbed on the histone octamer from x = 1 to x = L = 147. The unwrapping state of the nucleosome is then characterized by the section from u to w that is still wrapped where 0 ≤ u ≤ w ≤ L. The complexation energy of a fully wrapped
nucleosome is $-f_{\text{crit}} L$, whereas a partially unwrapped nucleosome has a complexation energy of smaller magnitude, $-f_{\text{crit}} (w - u)$. To obtain the partition function, we sum over all possible states weighted with the corresponding Boltzmann factor:

$$Z = \int_0^L \frac{du}{a} \int_u^L \frac{dw}{a}\, e^{\beta f_{\text{crit}} (w - u)} = \frac{e^{\beta f_{\text{crit}} L} - 1}{(\beta f_{\text{crit}} a)^2} - \frac{L}{\beta f_{\text{crit}} a^2} \approx \frac{e^{\beta f_{\text{crit}} L}}{(\beta f_{\text{crit}} a)^2}. \tag{8.49}$$

For simplicity, we assume here that $u$ and $w$ are continuous variables, which is a good approximation since $L \gg 1$. We made the integrals dimensionless by dividing them by some arbitrary length $a$; a natural choice is $a = 1$ bp. We now suppose that there is a restriction site between bp $x_b$ and $x_b + 1$ with $1 < x_b < 147$. Note that many restriction enzymes do not simply cut between two base pairs but instead produce overhangs (short single-stranded sections). In that case we define the restriction site as exactly between the cuts in the two single strands. We are interested in the probability that the restriction site is accessible to the enzyme. We assume that for the binding of the restriction enzyme it is not sufficient that the DNA is unwrapped up to the restriction site, but that $\delta$ extra bp's of DNA have to be unwrapped. This is schematically depicted in Fig. 8.16. The probability that the restriction site is accessible is then

$$p_{\text{open}} = \frac{1}{Z} \int_{x_b + \delta}^{L} \frac{dw}{a} \int_{x_b + \delta}^{w} \frac{du}{a}\, f(u, w) + \frac{1}{Z} \int_{0}^{x_b - \delta} \frac{du}{a} \int_{u}^{x_b - \delta} \frac{dw}{a}\, f(u, w) \approx e^{-\beta f_{\text{crit}} \delta} \left( e^{-\beta f_{\text{crit}} x_b} + e^{-\beta f_{\text{crit}} (L - x_b)} \right) \tag{8.50}$$

with the integrand denoting the Boltzmann factor $f(u, w) = e^{\beta f_{\text{crit}} (w - u)}$. A least-square fit of Eq. 8.50 to the data in Fig. 8.15 leads to the curve depicted in the same figure and does indeed show a reasonable agreement. The optimal fit parameters ($\pm$ one standard deviation) are $f_{\text{crit}} = 0.31 \pm 0.05\; k_B T/\text{nm}$ and $\delta = 30 \pm 12$ bp. The latter value suggests that a substantial amount of DNA needs to be unwrapped before the restriction enzyme can cut as efficiently as on a bare DNA substrate. The net adsorption energy of the total amount of nucleosomal DNA is $E_{\text{net}} = f_{\text{crit}} \times 50\,\text{nm} \approx 15 \pm 2\; k_B T$.
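To get a feeling for the numbers, the approximate form of Eq. 8.50 can be evaluated with the fitted parameters. The sketch below is an illustration only: it assumes a DNA length of 0.34 nm per bp (the value used in the bending estimate above) to convert $f_{\text{crit}}$ from $k_B T$/nm to $k_B T$/bp, and the function and constant names are arbitrary choices.

```python
import math

F_CRIT = 0.31          # fitted adsorption energy per length, in k_B T / nm
DELTA_BP = 30          # fitted extra unwrapping length delta, in bp
L_BP = 147             # wrapped DNA length, in bp
RISE_PER_BP_NM = 0.34  # assumed DNA length per bp, in nm

def p_open(x_b):
    """Approximate site exposure probability of Eq. 8.50 for a site at position x_b (in bp)."""
    bf = F_CRIT * RISE_PER_BP_NM  # beta * f_crit per bp
    return math.exp(-bf * DELTA_BP) * (
        math.exp(-bf * x_b) + math.exp(-bf * (L_BP - x_b))
    )

for x_b in (1, 20, 40, 60, 73):
    print(f"x_b = {x_b:3d} bp: p_open = {p_open(x_b):.1e}")

# Net adsorption energy of the roughly 50 nm of wrapped DNA.
e_net = F_CRIT * 50.0
print(f"E_net = {e_net:.1f} k_B T")
```

The two exponentials in `p_open` correspond to unwrapping from the two ends of the wrapped portion, so the result is symmetric about the dyad and smallest there.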
Figure 8.16 Schematic depiction of different wrapping states of the nucleosome. In case (a) the binding site (yellow) is closed and thus not accessible for the restriction enzyme. Case (b) shows a situation in which the left end of the nucleosome is unwrapped to position u with u > xb + δ. This means the binding site is open.
We mentioned earlier that the adsorption energy per binding site should not be much larger than $k_B T$ to allow for breathing but also not smaller than $k_B T$ to have well-defined binding sites. Interestingly, the average net binding energy per site is around $15\,k_B T / 14 \approx 1\,k_B T$, i.e., at the lower boundary of the expected range. This is a surprisingly small number, especially taking into account the fact that we calculated above, in Eq. 8.36, that the elastic energy $E_{\text{elastic}}$ is about 4 times larger. This suggests that nature has tuned the pure adsorption energy $E_{\text{ads}}$ such that its value is close to $E_{\text{elastic}}$, namely

$$E_{\text{net}} = E_{\text{ads}} - E_{\text{elastic}} \approx 15\,k_B T \tag{8.51}$$

with $E_{\text{ads}} \approx 75\,k_B T$ and $E_{\text{elastic}} \approx 60\,k_B T$. However, the fact that nucleosomes are so dynamic might come at a price: they might not be very stable and fall apart easily. This is especially the case if a protein binds at a DNA binding site that is located deep inside the nucleosome. Once the protein is bound, the nucleosome cannot rewrap but might easily unwrap completely and disintegrate. Another example is the case when the nucleosome is under tension, which can easily happen inside the nucleus, where many motor proteins are at work all the time. However, as we shall see in the next subsection, the nucleosome turns out to be much more stable than expected from Eq. 8.51 and this can be largely understood by its two-turn DNA spool geometry.
8.3.2 Force-Induced Nucleosome Unwrapping

The analysis of the site exposure experiment in the previous section leads to various predictions. In the following we test one of them: the value of the critical force needed to pull the DNA from the histone octamer. We estimated that the net adsorption energy of the DNA onto a histone octamer is $E_{\mathrm{net}} \approx 15\,k_BT$, see Eq. 8.51, and we know that the length of DNA adsorbed in a nucleosome is about 50 nm. From this we expect that the critical force for unwrapping is simply the ratio of these two quantities, namely

$$f_{\mathrm{crit}} \approx \frac{15\,k_BT}{50\ \mathrm{nm}} = 1.2\ \mathrm{pN}. \tag{8.52}$$

In other words, the critical force at which the nucleosome should become unstable is none other than the net adsorption energy per length, which we estimated and called $f_{\mathrm{crit}}$ in the previous subsection, in anticipation of its role as the critical unwrapping force.

The first experiment that studied the unwrapping of nucleosomes under tension was published in 2002 (Brower-Toland et al., 2002). The experiment, shown schematically in Fig. 8.17(a), was performed with a DNA chain that contained 17 nucleosomes at well-defined positions along the DNA. This was achieved by reconstituting the nucleosomes along a DNA template that contained so-called nucleosome positioning sequences, bp sequences that have a higher-than-average affinity for nucleosomes, as explained in more detail in the following section. One end of the DNA molecule was attached to a bead that was held in an optical trap; the other end was attached to a coverslip that could be moved to stretch the nucleosomal array. Figure 8.17(b) shows a typical force-extension curve measured with this setup. With increasing imposed end-to-end distance, the force initially rises slowly and then, around 700 nm extension, begins to increase sharply. Once a force of about 25 pN has built up, something dramatic happens, manifesting itself in a drop in the tension.

Increasing the end-to-end distance further, one observes 16 more of these rupture events. After the 17 events, the force-extension curve of the bare DNA chain is finally reached, which is comparable to that shown in Fig. 4.33. The 17 peaks
obviously represent the unwrapping events of the 17 nucleosomes. From the shift of the curve to the right one can estimate that a length of about 80 bp is freed at each rupture event, corresponding to one turn of DNA inside the nucleosome. This observation suggests that the first 3/4 turn has been unwrapped already at an earlier stage (in Fig. 8.17(b) most likely around an extension of 600 nm, where one finds a small drop in the force), whereas each distinct rupture event signals the unwrapping of the last DNA turn of a nucleosome followed by its disintegration.

Figure 8.17 Unwrapping of a string of nucleosomes under an externally imposed tension or strain: (a) Experimental setup used in Ref. (Brower-Toland et al., 2002). (b) Force $f$ [pN] vs. extension [nm] curve for a fixed velocity clamp of 28 nm/s. Note that the typical unwrapping forces are around 25 pN, 20 times greater than the critical force, Eq. 8.52. The red dashed curve corresponds to bare DNA of the same length. (c) DFS measurements showing the most likely force $f^*$ [pN] for nucleosome unwrapping as a function of the logarithm of the pulling rate, $\log(r_f/r_0)$ with $r_0 = 1$ pN/s. (d) Model put forward in Ref. (Brower-Toland et al., 2002) to explain the data observed in (b) and (c). The light red DNA sections (80 bp apart, symmetric about the dyad axis) were suggested to represent the locations of the strongest DNA-histone interactions, stabilizing the remaining DNA turn on the nucleosome.
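To put the mismatch between Eq. 8.52 and the measured rupture forces into numbers, here is a minimal sketch of the unit conversion. It assumes a thermal energy of $k_BT \approx 4.1$ pN nm (room temperature), which is the value implied by the 1.2 pN quoted in Eq. 8.52; the function name is illustrative.

```python
KBT_PN_NM = 4.1  # assumed thermal energy at room temperature, in pN nm


def to_piconewton(force_kbt_per_nm: float) -> float:
    """Convert a force from units of k_B T / nm to pN."""
    return force_kbt_per_nm * KBT_PN_NM


f_crit_pn = to_piconewton(15.0 / 50.0)  # Eq. 8.52: 15 k_B T over 50 nm
observed_pn = 25.0                      # typical rupture force in Fig. 8.17(b)
ratio = observed_pn / f_crit_pn

print(f"f_crit = {f_crit_pn:.2f} pN, observed/expected = {ratio:.0f}")
```

The ratio of roughly 20 is the discrepancy that the rest of this subsection sets out to explain.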
Figure 8.17(b) shows two features that are especially important to note: (1) The unwrapping events of the last turns happen sequentially, one nucleosome at a time, and not in parallel. (2) The forces at which the nucleosomes unwrap are around 25 pN and higher, at least 20 times larger than what we expected from Eq. 8.52. These two features clearly indicate a kinetic barrier that delays the unwrapping. Given enough time, the nucleosomes spontaneously unwrap at much smaller forces, but since the array is stretched at a finite rate (28 nm/s in Fig. 8.17(b)), the nucleosomes will not jump over the barrier until much higher forces have built up.

In Section 5.6 we learned how one can extract information about the height and position of an energy barrier with the help of dynamic force spectroscopy (DFS). In the experiment discussed here, a systematic DFS measurement was carried out. Many nucleosomal arrays were stretched with given pulling rates $r_f$, increasing the force linearly in time $t$, $f = r_f t$, and a distribution of rupture forces was recorded by combining the rupture events of all 17 nucleosomes. If the nucleosomes unwrap completely independently of one another, the distribution of forces of a 17-nucleosome chain should be identical to the distribution obtained from a series of experiments performed on single nucleosomes, despite the fact that earlier rupture events typically occur at smaller force values than later ones, see Fig. 8.17(b). The DFS plot combining all the stretching data is depicted in Fig. 8.17(c). By putting a line through the data points, as done in Fig. 8.17(c), it is straightforward (as outlined in Fig. 5.12) to extract from its slope the distance $y_b$ between the local minimum and the saddle point. One finds $y_b \approx 3.2$ nm. This number makes some sense, since it is comparable to the size of the nucleosome, but it is not obvious what precisely it corresponds to.
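The line-fit analysis can be sketched numerically. The snippet below assumes the standard Bell-Evans relation $f^* = (k_BT/y_b)\ln\left(r_f y_b/(k_0 k_BT)\right)$ with an Arrhenius zero-force rate $k_0 = \nu_0 e^{-E/k_BT}$, and a natural logarithm on the horizontal axis of Fig. 8.17(c); this may differ in detail from the book's Eq. 5.87 and Fig. 5.12, which are not reproduced here. The intercept (15.5 pN) and the attempt frequencies are the values quoted from the experimental paper in the paragraph that follows, and $k_BT = 4.1$ pN nm is assumed.

```python
import math

KBT = 4.1  # assumed thermal energy in pN nm


def bell_evans_force(rate, y_b, k0):
    """Most likely rupture force f* (pN) at loading rate `rate` (pN/s)."""
    return (KBT / y_b) * math.log(rate * y_b / (k0 * KBT))


def zero_force_rate(intercept, y_b, r0=1.0):
    """Invert the relation above at rate r0 to obtain k0 (1/s)."""
    return (r0 * y_b / KBT) * math.exp(-intercept * y_b / KBT)


y_b = 3.2                          # nm, barrier position from the slope
slope = KBT / y_b                  # pN per unit of ln(r_f / r0)
k0 = zero_force_rate(15.5, y_b)    # intercept f* = 15.5 pN at r_f = r0
barriers = [math.log(nu0 / k0) for nu0 in (1e9, 1e10)]  # in k_B T

print(f"slope = {slope:.2f} pN, k0 = {k0:.1e} 1/s")
print("barrier heights (kT):", [round(b, 1) for b in barriers])
```

Under these assumptions the two attempt frequencies give barriers of roughly 33 and 35 $k_BT$; each factor of ten in $\nu_0$ shifts the inferred barrier by only $\ln 10 \approx 2.3\,k_BT$, so the estimate is not very sensitive to the guess.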
In a second step, one can extract the barrier height from the intersection of this line with the vertical axis, here at $f^* = 15.5$ pN, as soon as one makes an educated guess about the attempt frequency $\nu_0$. The authors of Ref. (Brower-Toland et al., 2002) assumed $\nu_0 = 10^9$ to $10^{10}\ \mathrm{s}^{-1}$, which leads to a barrier of 33 to 35 $k_BT$. They suggested that this barrier is caused by some very strong DNA-histone interactions that need to be overcome after the first 70 bp have been unravelled, see Fig. 8.17(d). This number, however, is in serious conflict with our estimate from the previous section, namely that the total net adsorption energy is about 15 $k_BT$, Eq. 8.51, a value less than half of the barrier suggested by the unwrapping experiment. Not only do the numbers not work out, but it is also hard to imagine that all the adsorption energy would be focused in two binding sites. If such special sites existed, they should have a radically different structure than the other binding sites; the crystal structure, Fig. 1.8, does not support this view. In addition, these special sites should leave their signature on the accessibility of DNA binding proteins to their target sequence by causing a dramatic drop of $p_{\mathrm{open}}$ after the first 30 to 40 bp, an effect that is not visible in the data, see Fig. 8.15.

As we shall see, the barrier (or at least most of it) can be understood as resulting from the underlying geometry and elasticity of the DNA without referring to any specific biochemistry of the nucleosome. This follows from the physical model shown in Fig. 8.18 of a nucleosome under tension (Kulić and Schiessel, 2004). In this model the DNA is described as a WLC. The torsional stiffness is neglected, since in the experiment one has freely rotating ends (the DNA is anchored with single-stranded overhangs at one end to a bead and at the other to a coverslip). According to Eq. 4.15, the elastic energy of a WLC of length $L$ is given by

$$E_{\mathrm{bend}} = \frac{A}{2}\int_0^L \kappa^2(s)\,ds \tag{8.53}$$

with $\kappa(s)$ denoting the curvature of the chain at point $s$ along its contour. The DNA is assumed to be adsorbed on the protein spool surface along a predefined helical path of radius $R_0$ and pitch height $H$ with a pure adsorption energy per wrapped length, $\varepsilon_{\mathrm{ads}}$, representing the attraction of the binding sites (not including the bending contribution, which is incorporated in Eq. 8.53). The degree of DNA adsorption is described by the desorption angle $\alpha$, which is defined to be zero for one full turn wrapped and to be $\pi$ for complete unwrapping, see Fig. 8.18.
It is clear that the unwrapping problem is non-planar and that the spool needs to transiently rotate out of the plane while a full unwrapping turn is performed. Therefore a second angle, $\beta$, is introduced to describe the out-of-plane tilting of the spool, as indicated in Fig. 8.18. When a tension $f$ (in $Y$-direction) acts on the two outgoing DNA “arms,” the system (i.e., the wrapped
spool together with the free DNA ends) responds simultaneously with DNA deformation, nucleosome tilting and DNA desorption from the protein spool.

Figure 8.18 Model of a nucleosome under tension (Kulić and Schiessel, 2004): a WLC is continuously adsorbed onto a cylinder along a helical adsorption path. The sketch indicates the spool axis, the tilt angle $\beta$, the desorption angle $\alpha$ and the forces $\pm f$ acting on the two arms along the $Y$-direction.

The total energy of the system as a function of $\alpha$ and $\beta$ has three contributions:

$$E_{\mathrm{tot}}(\alpha, \beta) = E_{\mathrm{bend}} + 2R_0\,\varepsilon_{\mathrm{ads}}\,\alpha - 2 f y. \tag{8.54}$$
The first term is the deformation energy of the DNA chain, Eq. 8.53. The second term describes the cost to desorb a stretch $R_0\alpha$ at each end of the wrapped portion. Finally, the third term represents the potential energy gained by pulling out the DNA ends, each by a distance $y$, in the force direction. The energy (up to an unimportant constant) can be rewritten in a more convenient form:

$$E_{\mathrm{tot}}(\alpha, \beta) = 2E_{\mathrm{arm}} + 2R_0 f_{\mathrm{crit}}\,\alpha - 2 f y. \tag{8.55}$$
Here $E_{\mathrm{arm}}$ is the bending energy stored in each free DNA arm. The bending energy of the wrapped part is combined with the desorption term; hence $f_{\mathrm{crit}} = \varepsilon_{\mathrm{ads}} - l_P/(2R_0^2)$. In order to proceed further, we need to find the optimal shapes of the DNA arms. Then, in a second step, we have to “glue” them properly to the wrapped helical section. For the sake of simplicity, we do not take entropic shape fluctuations into account, since they have only a small effect for the relatively large forces considered here (Sudhanshu et al., 2011).

For given boundary conditions, i.e., given values of the angles $\alpha$ and $\beta$, it is possible to find the optimal shape that minimizes the bending energy, Eq. 8.53, by applying the Kirchhoff kinetic analogy discussed in Chapter 4, which relates saddle points of the WLC energy to trajectories of a symmetric spinning top in a gravity field, Fig. 4.22. For the twistless case under consideration, this analogy reduces to that between planar untwisted rods, the Euler elasticas, and the plane pendulum, see Fig. 4.24. One of the two boundary conditions for each DNA arm is that it has to be asymptotically straight. Each DNA arm must therefore be a section of the special Euler elastica that corresponds to the homoclinic orbit within the pendulum analogy, see Figs. 4.26 and 4.28. The natural parametric representation of a DNA arm within its plane is then given by Eqs. 4.34 and 4.35. These two equations are for the $X$-$Z$-plane, whereas each DNA arm is in its own plane, each in general being tilted with respect to the $X$-$Y$-plane, see Fig. 8.18. We call the plane of the right arm the $\tilde{X}$-$Y$-plane, which means we need to replace $z(s)$ in Eq. 4.34 by $y(s)$ and $x(s)$ in Eq. 4.35 by $\tilde{x}(s)$, leading to

$$y(s) = \int_0^s \cos\theta_{\mathrm{loop}}(s')\,ds' = s - 2\lambda\tanh(s/\lambda) \tag{8.56}$$

and

$$\tilde{x}(s) = \int_0^s \sin\theta_{\mathrm{loop}}(s')\,ds' = 2\lambda\left(1 - \frac{1}{\cosh(s/\lambda)}\right). \tag{8.57}$$

From Eq. 4.33 together with Eq. 4.29 (with $C = 1$) follows the local curvature

$$\kappa(s) = \dot{\theta}_{\mathrm{loop}}(s) = \frac{2}{\lambda}\,\frac{1}{\cosh(s/\lambda)} \tag{8.58}$$

with $\lambda$ denoting the correlation length, Eq. 4.28. The bending energy $E_{\mathrm{arm}}$ per DNA arm is then given by

$$E_{\mathrm{arm}} = \frac{A}{2}\int_{s_0}^{\infty} \kappa^2(s)\,ds = \frac{2A}{\lambda}\left(1 - \tanh(s_0/\lambda)\right). \tag{8.59}$$

The offset parameter $s_0$ is related to the angle $\theta = \theta_{\mathrm{loop}}(s_0)$, the angle between the $Y$-axis and the tangent to the DNA at the point where it leaves the nucleosome. To find this relation, we use Eq. 4.42, which we derived for the related problem of a protein-induced DNA kink. This leads to

$$\tanh(s_0/\lambda) = \cos(\theta/2), \tag{8.60}$$

allowing us to rewrite the bending energy, using $\lambda = \sqrt{A/f}$, as

$$E_{\mathrm{arm}}(\theta) = 2\sqrt{Af}\,\left(1 - \cos(\theta/2)\right). \tag{8.61}$$
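As a consistency check on Eqs. 8.58 to 8.61, the arm energy can be integrated numerically and compared with the closed form. The sketch below uses a plain trapezoidal rule and assumes illustrative parameter values that are not fixed by the text at this point: $A = 50\ k_BT\,\mathrm{nm}$ (a persistence length of 50 nm) and $f = 14\ \mathrm{pN} \approx 3.4\ k_BT/\mathrm{nm}$.

```python
import math


def arm_energy_numeric(A, f, theta, n=20000, cutoff=30.0):
    """Trapezoidal estimate of (A/2) * integral of kappa^2 from s0 onwards."""
    lam = math.sqrt(A / f)                         # lambda = sqrt(A / f)
    s0 = lam * math.atanh(math.cos(theta / 2.0))   # offset from Eq. 8.60
    h = cutoff * lam / n                           # step; tail beyond is negligible

    def kappa_sq(s):
        return (2.0 / (lam * math.cosh(s / lam))) ** 2   # Eq. 8.58 squared

    total = 0.5 * (kappa_sq(s0) + kappa_sq(s0 + cutoff * lam))
    total += sum(kappa_sq(s0 + i * h) for i in range(1, n))
    return 0.5 * A * total * h


def arm_energy_closed(A, f, theta):
    """Closed form of Eq. 8.61."""
    return 2.0 * math.sqrt(A * f) * (1.0 - math.cos(theta / 2.0))


A = 50.0           # k_B T nm, assumed bending modulus
f = 14.0 / 4.1     # k_B T / nm, i.e. 14 pN with k_B T = 4.1 pN nm
for theta in (0.5, math.pi / 2.0, math.pi):
    print(theta, arm_energy_numeric(A, f, theta), arm_energy_closed(A, f, theta))
```

The exit angle must satisfy $0 < \theta \le \pi$ here, because $\theta = 0$ corresponds to $s_0 \to \infty$ (a perfectly straight arm with zero bending energy).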
Next we have to glue the arms to the wrapped chain section. This first requires writing an expression for the helical wrapping path. In the non-planar case we have a tilting of $\mathbf{n}$, the spool normal, with respect to the $Z$-axis by an angle $\beta$, see Fig. 8.18. We choose the orientation of the spool such that its dyad axis always coincides with the $X$-axis. For symmetry reasons, $\mathbf{n}$ is then always confined to the $Y$-$Z$-plane. Let us start with the case $\beta = 0$. The helical wrapping path of the DNA on the spool and its tangent are then given by

$$\mathbf{h}(t) = \begin{pmatrix} R_0\cos t \\ R_0\sin t \\ \frac{H}{2\pi}(\pi - t) \end{pmatrix} \tag{8.62}$$

and

$$\mathbf{h}'(t) = \frac{1}{\bar{R}}\begin{pmatrix} -R_0\sin t \\ R_0\cos t \\ -\frac{H}{2\pi} \end{pmatrix} \tag{8.63}$$

with $\alpha < t < 2\pi - \alpha$. In Eq. 8.63, the factor $1/\bar{R}$ is chosen so that the length of the tangent vector is normalized to one, i.e., $\bar{R} = \sqrt{R_0^2 + \left(\frac{H}{2\pi}\right)^2}$. To obtain the path for a non-vanishing value of $\beta$, one has to tilt the spool by applying a rotation around the $X$-axis. Using the rotation matrix $R_1(\beta)$ (defined in Eq. 4.17) we find for the path

$$\mathbf{h}(t, \beta) = R_1(\beta)\,\mathbf{h}(t) = \begin{pmatrix} R_0\cos t \\ R_0\cos\beta\sin t - \frac{H}{2\pi}(\pi - t)\sin\beta \\ R_0\sin\beta\sin t + \frac{H}{2\pi}(\pi - t)\cos\beta \end{pmatrix} \tag{8.64}$$

and for its tangent

$$\mathbf{h}'(t, \beta) = R_1(\beta)\,\mathbf{h}'(t) = \frac{1}{\bar{R}}\begin{pmatrix} -R_0\sin t \\ R_0\cos\beta\cos t + \frac{H}{2\pi}\sin\beta \\ R_0\sin\beta\cos t - \frac{H}{2\pi}\cos\beta \end{pmatrix}. \tag{8.65}$$

We now have the helical wrapping path of the tilted spool in the $XYZ$ coordinates, Eq. 8.64, and the shape of the DNA arms, Eqs. 8.56 and 8.57. In the following we have to “glue” the two DNA arms to the ends of the wrapped portions at $t = \alpha$ and $t = 2\pi - \alpha$. As a first step we have to make sure that the exit point of the DNA from the spool at $t = \alpha$, which is given by $\mathbf{h}(\alpha, \beta)$, coincides with the starting point of the left arm. In addition, the arm needs to be parallel to the force direction, here the $Y$-axis. This means that the conformation $\mathbf{R}(s)$ of the arm is of the form

$$\mathbf{R}(s) = \mathbf{h}(\alpha, \beta) + \left(\frac{2\lambda}{\cosh(s/\lambda)} - \frac{2\lambda}{\cosh(s_0/\lambda)}\right)\mathbf{h}_\perp(\alpha, \beta) - \Big((s - s_0) - 2\lambda\left(\tanh(s/\lambda) - \tanh(s_0/\lambda)\right)\Big)\,\mathbf{e}_y. \tag{8.66}$$

Here $\mathbf{h}_\perp(\alpha, \beta)$ denotes the normalized orthogonal component of $\mathbf{h}'(\alpha, \beta)$ with respect to $\mathbf{e}_y$, the unit vector in $Y$-direction. The first term on the rhs of Eq. 8.66 makes sure that the arm at $s = s_0$ is attached to the end of the wrapped portion at $t = \alpha$, i.e., $\mathbf{R}(s_0) = \mathbf{h}(\alpha, \beta)$. The second term describes the $\tilde{X}$-component, Eq. 8.57 (note that the $\tilde{X}$-direction coincides with the $\mathbf{h}_\perp(\alpha, \beta)$-direction) and the third term describes the $Y$-component given by Eq. 8.56. The conformation of the other DNA arm starting at $t = 2\pi - \alpha$ follows by symmetry.

Equation 8.66 automatically ensures that the DNA arm lies in the proper plane. However, we still need to make sure that the arm and the wrapped section are connected smoothly, i.e., that their tangents coincide. To achieve this, we have to meet the following requirement on $\theta = \theta_{\mathrm{loop}}(s_0)$:

$$\cos\theta = \mathbf{h}'(\alpha, \beta)\cdot\mathbf{e}_y = \frac{R_0}{\bar{R}}\cos\beta\cos\alpha + \frac{H}{2\pi\bar{R}}\sin\beta. \tag{8.67}$$

On the rhs we used Eq. 8.65 with $t = \alpha$. From Eqs. 8.61 and 8.67 and the standard relation $\cos(\theta/2) = \sqrt{(1 + \cos\theta)/2}$ we immediately obtain the bending energy per arm as a function of $\alpha$ and $\beta$:

$$E_{\mathrm{arm}}(\alpha, \beta) = 2\sqrt{Af}\left[1 - \sqrt{\frac{1}{2}\left(1 + \frac{R_0}{\bar{R}}\cos\beta\cos\alpha + \frac{H}{2\pi\bar{R}}\sin\beta\right)}\right]. \tag{8.68}$$

Finally, using the explicit shape of the left DNA arm, Eq. 8.66, we can write down $y$, the last term still missing in Eq. 8.55. This is the distance of the left DNA terminus to the origin in the $Y$-direction.
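The geometry of Eqs. 8.65, 8.67 and 8.68 is easy to get wrong by a sign, so a small numerical check may help. The sketch below builds the tilted tangent vector, confirms that it has unit length, and evaluates the arm energy from its $Y$-component. The spool radius and pitch ($R_0 = 4.2$ nm, $H = 2.5$ nm) and the bending modulus are assumed, illustrative values; they are not specified in this part of the text.

```python
import math

R0 = 4.2   # nm, assumed radius of the helical wrapping path
H = 2.5    # nm, assumed pitch of the helical wrapping path


def tangent(t, beta, r0=R0, pitch=H):
    """Unit tangent of the tilted helical path, Eq. 8.65."""
    c = pitch / (2.0 * math.pi)
    r_bar = math.hypot(r0, c)   # normalisation factor, see Eq. 8.63
    return (
        -r0 * math.sin(t) / r_bar,
        (r0 * math.cos(beta) * math.cos(t) + c * math.sin(beta)) / r_bar,
        (r0 * math.sin(beta) * math.cos(t) - c * math.cos(beta)) / r_bar,
    )


def arm_energy(alpha, beta, A, f):
    """Bending energy per arm, Eq. 8.68, with cos(theta) from Eq. 8.67."""
    cos_theta = tangent(alpha, beta)[1]           # projection onto e_y
    half = max(0.0, 0.5 * (1.0 + cos_theta))      # guard against rounding
    return 2.0 * math.sqrt(A * f) * (1.0 - math.sqrt(half))


A = 50.0          # k_B T nm, assumed bending modulus
f = 14.0 / 4.1    # k_B T / nm (14 pN with k_B T = 4.1 pN nm)
print("wrapped state   :", arm_energy(0.0, 0.0, A, f))
print("near the saddle :", arm_energy(math.pi / 2, math.pi / 2, A, f))
```

With a non-zero pitch the arms are slightly bent even at $\alpha = \beta = 0$, because the helix tangent is not exactly parallel to the force direction there.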
If the arm were straight and nothing were wrapped, then $y = L/2$. If some DNA is wrapped onto the spool, $y$ is reduced to

$$y = \frac{L}{2} - \mathbf{h}(\alpha, \beta)\cdot\mathbf{e}_y - R_0(\pi - \alpha) - 2\sqrt{\frac{A}{f}}\left(1 - \cos\frac{\theta}{2}\right). \tag{8.69}$$

The second term on the rhs accounts for the $Y$-position of the point where the DNA leaves the spool, the third term gives the amount of DNA that is lost through wrapping, and the fourth term describes the length lost by bending of the free DNA. That last term follows directly from the expression found earlier for DNA with a protein-induced kink, Eq. 4.39.

We are now in the position to present the total energy, Eq. 8.55. From Eqs. 8.68 and 8.69 follows (up to an unimportant constant)

$$E_{\mathrm{tot}}(\alpha, \beta) = 2R_0 f_{\mathrm{crit}}\,\alpha + 2 f R_0\left[\cos\beta\sin\alpha - \alpha - \frac{H}{2\pi R_0}(\pi - \alpha)\sin\beta\right] + 8\sqrt{Af}\left[1 - \sqrt{\frac{1}{2}\left(1 + \frac{R_0\cos\alpha\cos\beta}{\bar{R}} + \frac{H\sin\beta}{2\pi\bar{R}}\right)}\right]. \tag{8.70}$$

The first term in Eq. 8.70 describes the effective cost of desorption due to pulling off the wrapped chain portion. The second term describes the gain/loss of potential energy by spool opening (change of $\alpha$) and rotation (change of $\beta$). Finally, the last term accounts for the stiffness of the non-adsorbed DNA portions. Two effects contribute equally to this term: (i) the bending energy of the deformed DNA arms, Eq. 8.68, and (ii) the loss of potential energy by “wasting” length due to DNA deformation, the fourth term on the rhs of Eq. 8.69.

The total energy, Eq. 8.70, has a complicated functional form, but the overall energy landscape looks simple. Figure 8.19 shows $E_{\mathrm{tot}}$ as a function of $\alpha$ and $\beta$ for a nucleosome with $f_{\mathrm{crit}} = 2.1\ k_BT/\mathrm{nm}$ under a tension of 14 pN. For these specific values (and also for a wide range of other values) the system has two minima, a local one around $\alpha = \beta = 0$, a nucleosome containing one full DNA turn (state “b”), and a global one around $\alpha = \beta = \pi$, the unwrapped state “f.” In order to move from the local minimum at $\alpha = \beta = 0$ to the
global minimum, the system has to cross a substantial barrier that is located close to $\alpha = \beta = \pi/2$. The state on top of the barrier, the so-called transition state, corresponds to a nucleosome with highly bent DNA arms, state “d” in Fig. 8.19. Note that the unwrapping path via this saddle point involves a flip of the nucleosome by 180° from $\beta = 0$ to $\beta = \pi$, which manifests itself in a rotation of the cylinder, see the example configurations in Fig. 8.19.

Figure 8.19 Energy landscape, Eq. 8.70, for an applied force of $f = 14$ pN. Even though $f$ is larger than the critical force $f_{\mathrm{crit}}$ (assumed here to be $2.1\ k_BT/\mathrm{nm}$), the nucleosome is kinetically trapped with one full DNA turn (state “b”). To unwrap, the system needs to pass through transition state “d” with highly bent DNA arms. During the unwrapping the nucleosome flips by 180° from $\alpha = \beta = 0$ (metastable state “b”) to $\alpha = \beta = \pi$ (fully unwrapped nucleosome, state “f”).

We now have all the information we need to compare the theoretical model to the DFS data shown in Fig. 8.17(c). In particular, we know the energy of the local minimum at state “b” and of the saddle point at “d” as a function of the applied force, which enables us to determine the barrier height $\Delta E(f)$ that we need for the DFS formula 5.87. Since Eq. 8.70 is rather complex, one can, however,
only proceed numerically. This makes it difficult to grasp the physics underlying the energy barrier. Instead of looking at the full formula, Eq. 8.70, we approximate it by a much simpler expression that still contains its most important terms but neglects the less relevant ones. Specifically, we use the following approximations:
(1) The pitch is much smaller than the radius of the spool and therefore we can set $H = 0$ everywhere to a good approximation.
(2) In the second term on the rhs of Eq. 8.70 we neglect the $\cos\beta\sin\alpha$ term and only keep the $-\alpha$ term; the latter dominates the former everywhere along the unwrapping path.
(3) Having made approximations (1) and (2), one can immediately see that the path of lowest resistance for the case $f = f_{\mathrm{crit}}$, the force at which the two minima attain the same height, is along the line $\alpha = \beta$. We then assume that $\alpha = \beta$ for every value of $f$, which turns out to be a reasonable approximation. This means we transform the two-dimensional energy landscape $E_{\mathrm{tot}}(\alpha, \beta)$ into a one-dimensional one, $E_{\mathrm{tot}}(\alpha) = E_{\mathrm{tot}}(\alpha, \beta = \alpha)$.
With these three approximations, the full expression, Eq. 8.70, simplifies substantially:
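$$E_{\mathrm{tot}}(\alpha) \approx 2R_0\left(f_{\mathrm{crit}} - f\right)\alpha + 8\sqrt{Af}\left[1 - \frac{1}{2}\sqrt{3 + \cos 2\alpha}\right]. \tag{8.71}$$

This can be further simplified by Taylor expanding the square root term, $\sqrt{1 + x} \approx 1 + x/2$ with $x = \cos(2\alpha)/3$, which is a good approximation since $|\cos 2\alpha| \le 1$ is small compared with 3 everywhere. We arrive at

$$E_{\mathrm{tot}}(\alpha) \approx 2R_0\left(f_{\mathrm{crit}} - f\right)\alpha - \frac{2}{\sqrt{3}}\sqrt{Af}\,\cos 2\alpha + C \tag{8.72}$$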
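The chain of approximations from Eq. 8.70 to Eq. 8.72 can be verified numerically. The sketch below implements all three expressions and checks two things along the diagonal $\alpha = \beta$ with $H = 0$: that the full expression differs from Eq. 8.71 only by the neglected term $2fR_0\cos\alpha\sin\alpha$, and that the Taylor step to Eq. 8.72 introduces only a small error. The values $A = 50\ k_BT\,\mathrm{nm}$ and $R_0 = 4.2$ nm are assumptions for illustration.

```python
import math


def e_tot_full(alpha, beta, f, f_crit, A, R0, H):
    """Total energy of Eq. 8.70 (forces in k_B T/nm, lengths in nm)."""
    c = H / (2.0 * math.pi)
    r_bar = math.hypot(R0, c)
    desorption = 2.0 * R0 * f_crit * alpha
    potential = 2.0 * f * R0 * (math.cos(beta) * math.sin(alpha) - alpha
                                - c / R0 * (math.pi - alpha) * math.sin(beta))
    cos_theta = (R0 * math.cos(alpha) * math.cos(beta) + c * math.sin(beta)) / r_bar
    half = max(0.0, 0.5 * (1.0 + cos_theta))
    bending = 8.0 * math.sqrt(A * f) * (1.0 - math.sqrt(half))
    return desorption + potential + bending


def e_tot_71(alpha, f, f_crit, A, R0):
    """One-dimensional landscape of Eq. 8.71."""
    return (2.0 * R0 * (f_crit - f) * alpha
            + 8.0 * math.sqrt(A * f) * (1.0 - 0.5 * math.sqrt(3.0 + math.cos(2.0 * alpha))))


def e_tot_72(alpha, f, f_crit, A, R0):
    """Taylor-expanded landscape of Eq. 8.72, including the constant C."""
    b = math.sqrt(A * f)
    c_const = (8.0 - 4.0 * math.sqrt(3.0)) * b
    return (2.0 * R0 * (f_crit - f) * alpha
            - 2.0 / math.sqrt(3.0) * b * math.cos(2.0 * alpha) + c_const)


A, R0 = 50.0, 4.2              # assumed bending modulus (k_B T nm) and radius (nm)
f, f_crit = 14.0 / 4.1, 2.1    # 14 pN and 2.1 k_B T/nm, as in Fig. 8.19
for a in (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4, math.pi):
    print(round(a, 3), e_tot_full(a, a, f, f_crit, A, R0, 0.0),
          e_tot_71(a, f, f_crit, A, R0), e_tot_72(a, f, f_crit, A, R0))
```

Since the neglected term equals $fR_0\sin 2\alpha$ on the diagonal, it vanishes at $\alpha = 0$, $\pi/2$ and $\pi$, so approximation (2) leaves the energies of the end states and of the point $\alpha = \beta = \pi/2$ untouched.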
In Eq. 8.72, $C = \left(8 - 4\sqrt{3}\right)\sqrt{Af}$ is an unimportant $f$-dependent constant. Before we continue, it is worthwhile to stop at this point and contemplate this remarkable result. What we have found here is that the energy landscape, Eq. 8.72, is the sum of two contributions: a linear term proportional to $(f_{\mathrm{crit}} - f)\,\alpha$, the term that tilts the energy landscape, and a barrier term proportional to $-\cos 2\alpha$, which has its largest value at $\pi/2$, precisely in the middle between one turn wrapped, $\alpha = 0$, and fully unwrapped, $\alpha = \pi$. This is the kind of landscape that we sketched in Fig. 5.11 but with one
remarkable difference: the factor in front of the barrier term is not constant but depends on the force, namely

$$E_b = E_b(f) = \frac{4}{\sqrt{3}}\sqrt{Af}. \tag{8.73}$$

Thus we cannot hope that it makes sense to fit a line to the DFS data, as in Fig. 8.17(c), to extract a constant barrier height $E_b$. In particular, note that Eq. 8.73 predicts that the barrier height vanishes for $f = 0$. This is not really unexpected: unwrapping DNA from a free nucleosome (like unrolling Scotch tape from a dispenser) costs just $2R_0 f_{\mathrm{crit}}\,\alpha$, i.e., the energy is proportional to the unwrapping length.

Figure 8.20 (a) Energy landscape felt by a nucleosome under tension during unwrapping. Shown is the approximate landscape $E_{\mathrm{tot}}(\alpha)$ (in units of $k_BT$), Eq. 8.72, as a function of the unwrapping angle $\alpha$ for five different applied forces (0, 4, 8, 14 and 20 pN), as indicated at each curve. Note the absence of a barrier in the force-free case, $f = 0$ pN. We assume here $f_{\mathrm{crit}} = 2.1\ k_BT/\mathrm{nm} = 8.6$ pN. (b) Barrier height $\Delta E(f)$ (in units of $k_BT$) for three different values of $f_{\mathrm{crit}}$ (0.7, 1.7 and 2.7 $k_BT/\mathrm{nm}$), as indicated at each curve. The continuous curves give the approximate formula, Eq. 8.76, the symbols the numerical solution of the exact expression, Eq. 8.70. As can be seen, the approximate expression overestimates the barrier height for small forces and underestimates it for large ones. This results in a steeper decay of $\Delta E$ with $f$.

Figure 8.20(a) displays the approximate energy landscape, Eq. 8.72, for several values of the applied force, as indicated at each curve. The value of the critical force is here assumed to be $f_{\mathrm{crit}} = 2.1\ k_BT/\mathrm{nm} = 8.6$ pN. As can be seen from this figure, the overall barrier height goes down with the applied tension despite the increase of $E_b$. This reflects the fact that the tilting term is linear in $f$ while the barrier grows only sublinearly, proportional to $\sqrt{f}$. If the barrier grew faster than the tilting term, one would have a
case where the structure gets more and more stable with increasing tension. Such an effect is used, for example, in safety tongs for lifting heavy weights. Their inventor Leonardo da Vinci wrote: “The greater the weight held by this lifting tong, the better and stronger it will be supported.”

Figure 8.21 The origin of the barrier against force-induced unwrapping is localized in the bent portions of typical size $\lambda$ where the DNA arms change their directions by 90°, as they lead into the wrapped portion, which is oriented perpendicular to the force direction. This results in the barrier height given by Eq. 8.73.

What is the physics underlying this barrier and why does the barrier term, Eq. 8.73, increase with force? Figure 8.21 shows the nucleosome in the transition state, the state on top of the barrier. The barrier energy is highly localized at the DNA stretches close to the nucleosome where the DNA has to make a 90° bend. Each bent portion has a length that scales like $\lambda = \sqrt{A/f}$ and a curvature that scales like $\lambda^{-1}$, see Eq. 8.58. This results in a bending energy that scales like $A\lambda/\lambda^2 = \sqrt{Af}$, see Eq. 8.73. The harder one pulls, the shorter the bent portions but the larger their curvature. Overall, this leads to a higher elastic price that one has to pay to reach the transition state.

Since the barrier contribution $E_b$ is not constant, we need to analyze the DFS data using the generalized expression, Eq. 5.87. With our simplified formula for the energy landscape, Eq. 8.72, this is straightforward. First we find the local minimum and the maximum from $dE_{\mathrm{tot}}/d\alpha = 0$:

$$\alpha_{\min}(f) = -\frac{1}{2}\arcsin\left(\frac{\sqrt{3}}{2}\,\frac{R_0\left(f_{\mathrm{crit}} - f\right)}{\sqrt{Af}}\right) \tag{8.74}$$

and

$$\alpha_{\max}(f) = \frac{\pi}{2} - \alpha_{\min}(f). \tag{8.75}$$

From this follows the barrier height

$$\Delta E(f) = E_{\mathrm{tot}}\left(\alpha_{\max}(f)\right) - E_{\mathrm{tot}}\left(\alpha_{\min}(f)\right) \tag{8.76}$$
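Equations 8.74 to 8.76 translate directly into a short routine for the approximate barrier height. The sketch below uses the landscape of Eq. 8.72 (the constant $C$ cancels in the difference) and again assumes $A = 50\ k_BT\,\mathrm{nm}$, $R_0 = 4.2$ nm and $k_BT = 4.1$ pN nm; with other parameter choices the numbers shift, so the output should not be read as a reproduction of Fig. 8.20(b).

```python
import math

KBT = 4.1   # pN nm, assumed thermal energy
A = 50.0    # k_B T nm, assumed bending modulus
R0 = 4.2    # nm, assumed spool radius


def barrier_height(f_pn, f_crit):
    """Approximate barrier (k_B T) from Eqs. 8.72 and 8.74-8.76.

    f_pn is the applied force in pN, f_crit is given in k_B T/nm.
    """
    f = f_pn / KBT                       # convert to k_B T/nm
    b = math.sqrt(A * f)
    x = math.sqrt(3.0) / 2.0 * R0 * (f_crit - f) / b
    if abs(x) > 1.0:
        raise ValueError("no stationary points: landscape is monotonic")
    alpha_min = -0.5 * math.asin(x)      # Eq. 8.74
    alpha_max = math.pi / 2.0 - alpha_min  # Eq. 8.75

    def landscape(a):                    # Eq. 8.72 without the constant C
        return 2.0 * R0 * (f_crit - f) * a - 2.0 / math.sqrt(3.0) * b * math.cos(2.0 * a)

    return landscape(alpha_max) - landscape(alpha_min)   # Eq. 8.76


for force in (8.0, 14.0, 20.0):
    print(force, "pN ->", round(barrier_height(force, 2.1), 1), "kT")
```

The routine raises an error when the argument of the arcsine leaves $[-1, 1]$; in that regime the tilt dominates the $\cos 2\alpha$ term and the approximate landscape has no metastable state at all.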
In Eq. 8.76, $E_{\mathrm{tot}}(\alpha)$ is given by Eq. 8.72. Example curves of Eq. 8.76 for three different values of $f_{\mathrm{crit}}$ are shown in Fig. 8.20(b) together with the numerical solutions of the exact barrier heights determined from Eq. 8.70. There is a systematic tilt between the exact and the approximate curves, but the approximation works well enough to show that the bulk of the barrier actually comes from the bending of the DNA arms near the entry-exit points.

Figure 8.22 (a) Comparison between DFS data, Fig. 8.17(c), and theory, Eq. 5.87 and Eq. 8.76, with $\nu_0 = 10^6\ \mathrm{s}^{-1}$, Eq. 8.77, and $f_{\mathrm{crit}} = 0.3\ k_BT/\mathrm{nm}$, Eq. 8.52. The theory greatly underestimates $f^*$. (b) Same comparison but now using $f_{\mathrm{crit}}$ as fit parameter. The continuous curves use the approximate estimate, Eq. 8.76, the diamonds the full expression, Eq. 8.70, with $\nu_0 = 10^6\ \mathrm{s}^{-1}$ (purple) and $\nu_0 = 10^7\ \mathrm{s}^{-1}$ (blue). The best fit is indicated at each curve ($r_0 = 1$ pN/s). Both panels show $f^*$ [pN] as a function of $\log(r_f/r_0)$.

We are now in the position to test whether our theoretical energy landscape, Eq. 8.70, or its simplified version, Eq. 8.72, predicts the DFS data, Fig. 8.17(c), correctly. The only parameter appearing in Eq. 5.87 that remains to be estimated is the attempt frequency $\nu_0$. Since the nucleosome unwraps through a 180° flip, we choose the inverse of the rotational relaxation time of a “spherical” nucleosome, Eq. 5.138, as the attempt frequency:

$$\nu_0 = \tau_r^{-1} = \frac{k_BT}{4\pi\eta R_0^3} \approx 10^6 \text{ to } 10^7\ \mathrm{s}^{-1} \tag{8.77}$$

with $\eta \approx 10^{-3}$ Pa s the viscosity of water.

Since we convinced ourselves in Fig. 8.20(b) that the approximate estimate of the effective barrier height, Eq. 8.76, works quite
well, we use this formula to compare with the DFS data shown in Fig. 8.17(c). We apply the general DFS expression, Eq. 5.87, and assume $f_{\mathrm{crit}} = 0.3\ k_BT/\mathrm{nm}$ for the net adsorption energy per length, which we estimated in the previous section from the independent experiment on spontaneous nucleosome breathing. A comparison between the theoretical curve and the data is shown in Fig. 8.22(a). Unfortunately, the theoretical estimate significantly underestimates the most likely rupture force $f^*$ for all experimental pulling rates $r_f$. This enormous discrepancy is even more worrying when we realize that there are no free parameters with which we can adjust the theoretical curve. All the quantities that enter in the theoretical model have been determined experimentally.

However, if a theory does not work, there is always hope that you will learn something new if you understand why it failed. Here I shall argue that we learn that the nucleosomal two-turn geometry is biologically advantageous. Let us reconsider the nucleosome model shown in Fig. 8.18. What could be missing in this model that has such a huge impact on its unwrapping dynamics? If we assume that the WLC model holds up to the strong DNA curvatures encountered during unwrapping, then all we can change is the adsorption energy. At first glance, however, the effective adsorption energy $f_{\mathrm{crit}}$ follows from the nucleosome breathing experiments discussed in the previous section, and we do not appear to be free to change this value. Note, however, that the breathing experiments are concerned with a situation where only a fraction of the DNA is unwound from the protein spool. Could it be that $f_{\mathrm{crit}}$ changes when more DNA is unwrapped in the force-induced unwrapping experiments? Let us assume that there is a stronger adsorption energy once there is only one turn left:

$$f_{\mathrm{crit}}(\alpha) = f_{\mathrm{crit}}^{(0)} + \Theta(\alpha)\, f_{\mathrm{crit}}^{(1)} \tag{8.78}$$
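To get a feeling for what the step in Eq. 8.78 implies, the sketch below integrates the desorption term $2R_0 f_{\mathrm{crit}}(\alpha)\,d\alpha$ of Eq. 8.55 from the fully wrapped state to complete unwrapping. Several ingredients are assumptions made only for illustration: the spool radius $R_0 = 4.2$ nm, the identification of the fully wrapped state with a wrapped length of 50 nm (so that it sits at $\alpha = \pi - 50\ \mathrm{nm}/(2R_0) < 0$), and the placeholder step height of $2.0\ k_BT/\mathrm{nm}$.

```python
import math

F0 = 0.3   # k_B T/nm, value from the breathing experiment
F1 = 2.0   # k_B T/nm, illustrative placeholder for the step height
R0 = 4.2   # nm, assumed spool radius


def f_crit(alpha, f0=F0, f1=F1):
    """Step-like adsorption energy per length, Eq. 8.78."""
    return f0 + (f1 if alpha >= 0.0 else 0.0)


def desorption_cost(alpha, alpha_start, r0=R0, f0=F0, f1=F1):
    """Integral of 2*r0*f_crit from alpha_start (<= 0) up to alpha, in k_B T."""
    outer = 2.0 * r0 * f0 * (min(alpha, 0.0) - alpha_start)   # part beyond one turn
    last_turn = 2.0 * r0 * (f0 + f1) * max(alpha, 0.0)        # final DNA turn
    return outer + last_turn


alpha_full = math.pi - 50.0 / (2.0 * R0)   # fully wrapped state (assumed 50 nm)
print("down to one turn :", round(desorption_cost(0.0, alpha_full), 1), "kT")
print("complete unwrap  :", round(desorption_cost(math.pi, alpha_full), 1), "kT")
```

Under these assumptions peeling the DNA down to the last turn costs only a few $k_BT$, whereas removing the last turn costs an order of magnitude more, which is the qualitative point of the step-function ansatz.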
In Eq. 8.78, $\Theta(\alpha)$ denotes the step function with $\Theta(\alpha) = 0$ for $\alpha < 0$ and $\Theta(\alpha) = 1$ for $\alpha \ge 0$. Here we denote the adsorption energy per length estimated in the previous section by $f_{\mathrm{crit}}^{(0)}$ (previously simply called $f_{\mathrm{crit}}$): $f_{\mathrm{crit}}^{(0)} \approx 0.3\ k_BT/\mathrm{nm}$. This means an extra adsorption energy $f_{\mathrm{crit}}^{(1)}$ is switched on once there is less than one turn adsorbed on the nucleosome ($\alpha \ge 0$), see Fig. 8.23(a). There is in fact a perfectly reasonable physical explanation for the steplike form of $f_{\mathrm{crit}}(\alpha)$, namely that the two
turns feel an effective repulsion, see Fig. 8.23(b).

Figure 8.23 (a) The effective adsorption energy per length, $f_{\mathrm{crit}}$, is expected to be stronger when there is less than one turn wrapped, i.e., for $\alpha > 0$: it steps from $f_{\mathrm{crit}}^{(0)}$ up by $f_{\mathrm{crit}}^{(1)}$ at $\alpha = 0$. (b) This reflects an effective repulsion between the two DNA turns.

There is certainly an electrostatic repulsion between the phosphates of the two turns whose precise value is hard to estimate because of the proximity of the low-dielectric protein core that modifies the electrical field, the presence of condensed counterions and other effects.

Instead of trying to estimate the effective repulsion theoretically, one can use the step height $f_{\mathrm{crit}}^{(1)}$ as a free parameter and check whether this allows one to obtain a good fit to the experimental data. Figure 8.22(b) demonstrates that this is indeed possible. This plot shows a comparison between the experimental points (filled circles) and several different theoretical predictions. The continuous curves use the approximate formula for the barrier, Eq. 8.76. As an input we use attempt frequencies $\nu_0$ as suggested by Eq. 8.77, namely $10^6\ \mathrm{s}^{-1}$ (purple curve) and $10^7\ \mathrm{s}^{-1}$ (blue). We then choose $f_{\mathrm{crit}}^{(1)}$ as a fit parameter. For $\nu_0 = 10^6\ \mathrm{s}^{-1}$ we obtain the best fit for $f_{\mathrm{crit}}^{(1)} = 2.1\ k_BT/\mathrm{nm}$ and for $\nu_0 = 10^7\ \mathrm{s}^{-1}$ we find $f_{\mathrm{crit}}^{(1)} = 2.4\ k_BT/\mathrm{nm}$. As we can see from Fig. 8.22(b), however, the agreement between the theoretical curves and the data points is not very satisfactory, since the curves systematically show a smaller slope than the data. However, this effect might reflect a systematic error in our approximation when going from the full energy landscape, Eq. 8.70, to the approximate expression, Eq. 8.72. In fact we saw already in Fig. 8.20(b) that the barrier height as a function of the applied force has a steeper slope in the approximate treatment than in the exact expression. The DFS expression, Eq. 5.87, suggests that the approximate curves should show systematically a smaller slope than the curves derived from the exact expression, Eq. 8.70. This
is indeed the case: The diamonds in Fig. 8.22(b) give the results of the exact treatment for the same two attempt frequencies. As can be seen, one obtains a good agreement between the data and the theoretical curves. We find slightly lower values for $f_{\mathrm{crit}}^{(1)}$, namely $f_{\mathrm{crit}}^{(1)} = 1.8\ k_BT/\mathrm{nm}$ for $\nu_0 = 10^6\ \mathrm{s}^{-1}$ (purple diamonds) and $f_{\mathrm{crit}}^{(1)} = 2.1\ k_BT/\mathrm{nm}$ for $\nu_0 = 10^7\ \mathrm{s}^{-1}$ (blue diamonds).

Figure 8.24 Schematic sketch indicating how the nucleosome manages to keep its DNA accessible without compromising its stability. The fully wrapped nucleosome (middle) can unwrap from either end with a desorption cost per length $f_{\mathrm{crit}}^{(0)}$ up to the point that there is only one DNA turn left (configurations to the left and right). The remaining turn (dark red) is effectively more strongly adsorbed, with an adsorption energy density $f_{\mathrm{crit}}^{(0)} + f_{\mathrm{crit}}^{(1)}$, preventing further unwrapping and destabilization of the nucleosome.

It is hard to tell how well we estimated the value of $f_{\mathrm{crit}}^{(1)}$, especially in view of the fact that it follows from the comparison between two very different experiments. The effect seems to be very strong but appears to be smaller when one studies the force-induced unwrapping of a single nucleosome, where one can observe the unwrapping of each DNA turn (Mihardja et al., 2006). Regardless of the exact numbers, the analysis given above suggests that there is a first-second-round difference that can be understood as resulting from the repulsion between the two nucleosomal DNA turns.

This raises the question of whether this effect has any biological advantage. As discussed in the previous section, site exposure is an important mechanism for allowing DNA-binding proteins to have temporary access to their target sites, see Fig. 8.14(a). Based on the experiment discussed in that section, we estimated a DNA adsorption energy of just 15 $k_BT$ for the whole nucleosome, a value small enough to allow site exposure to occur with a reasonable probability. This advantage seems, however, to come at a price, namely that the stability of the whole nucleosome would be in danger. The first-second-round difference suggests a simple solution to this problem,
combining both accessibility to buried binding sites and stability of the nucleosome as a whole. The idea is depicted in Fig. 8.24. The DNA can unwrap from either end and thereby spontaneously offer all its DNA. However, once there is only one turn left, that remaining turn has a strong grip on the nucleosome through the increased adsorption energy per length, namely $f_{\mathrm{crit}}^{(0)} + f_{\mathrm{crit}}^{(1)}$ instead of $f_{\mathrm{crit}}^{(0)}$. Further unwrapping is therefore rather unlikely. This way all the DNA can be made temporarily accessible without compromising the stability of the overall nucleosomal complex. This suggests that the nucleosomal two-turn design is nature's ingenious solution for combining accessibility and stability in the same DNA–protein complex. In addition, it provides protection against temporary tension by building up a barrier against unwrapping. We add a discussion of an important experiment (Ngo et al., 2015) here, which was published after the first edition of this book was printed. This experiment provided data that allow one to understand in detail how a single nucleosome unwraps under force. This was achieved by adding an extra ingredient to the micromanipulation setup discussed above (a nucleosome under force) so that one can actually "see" which section of the nucleosomal DNA is still wrapped and which section is already unwrapped at any time during the experiment. The extra ingredient is called FRET (an acronym for Förster resonance energy transfer). Two chromophores, light-sensitive molecules, are attached to a nucleosome, one on the nucleosomal DNA and the other one on a histone nearby, such that the two molecules are in close spatial proximity when the involved DNA stretch is wrapped. When one shines light of the proper wavelength on the wrapped nucleosome, the donor chromophore transfers energy to the acceptor chromophore via a non-radiative dipole-dipole coupling. The acceptor molecule then emits light of its own wavelength.
On the other hand, this phenomenon stops when the two chromophores are more than a few nm apart, which is the case when the corresponding DNA portion is unwrapped. In the experiment (Ngo et al., 2015) a whole set of measurements was performed, with various positions of the pair of dyes. All the nucleosomes were wrapped on the same base pair sequence, a nucleosome positioning sequence called the Widom 601 sequence.
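The sharp distance dependence that lets FRET report whether a DNA stretch is wrapped can be illustrated with the standard Förster expression for the transfer efficiency, $E = 1/[1 + (r/R_0)^6]$. The following is only a minimal sketch: the function name is hypothetical and the Förster radius $R_0 = 5$ nm is an assumed, illustrative value, not one taken from the experiment.

```python
def fret_efficiency(r_nm, r0_nm=5.0):
    """Foerster transfer efficiency for a donor-acceptor distance r_nm.

    r0_nm is the Foerster radius, i.e. the distance at which half of the
    donor excitations are transferred. The default of 5 nm is an assumed,
    illustrative value, not a number from Ngo et al. (2015).
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Efficiency for a dye pair that is close (wrapped DNA) or far apart
# (unwrapped DNA).
for r in (2.0, 5.0, 8.0, 12.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

Because of the sixth power, the efficiency drops from nearly one to nearly zero over a few nanometers, which is what makes the signal an almost binary indicator of local wrapping.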
Remarkably, it was found that the nucleosomes always unwrap from one end first and that it is always the same end. When the base pair sequence was changed (by flipping the inner two quarters of the sequence), the nucleosome always unwrapped from the other end. Are these findings consistent with the model that we discussed in this section? We had assumed that the nucleosome always unwraps symmetrically from both sides by an angle α, see Fig. 8.18. However, it is important to note that with this model (which assumes constant curvature and adsorption energy along the wrapping path) there is no particular energetic reason to unwrap symmetrically. In fact, we could have assumed that the DNA would only unwrap from one end and the other end would remain bound throughout the unspooling process, and we would have found exactly the same energy. This is the case because the configuration of the whole DNA would stay the same, with the only change that the wrapped part would be shifted more to the left or the right as compared to the symmetric case. For long DNA arms, as assumed here, this does not change the bending energy. However, as we mentioned in Subsection 8.3.1, the last 10 base pairs of the nucleosomal DNA are more or less straight in a real nucleosome. This suggests that it is energetically cheaper if the DNA remains bound to the protein core at one of the ends during the whole force-induced unwrapping process. This is one part of the explanation for the phenomenon of asymmetric unwrapping. But why does the 601 DNA always unwrap from the same end? And why does it unwrap from the other end when the inner two quarters are flipped and the outer two quarters remain unchanged? This can be understood as a base pair sequence effect. As discussed in detail above, the nucleosome under force gets stuck in a metastable state with one turn of DNA remaining wrapped. 
Since the terminal sections of the nucleosomal wrapping path are straight, it will be a single wrap including one of the two straight end sections. Because the sequences used in the experiment were non-palindromic (i.e., not symmetric with respect to the symmetry axis of the nucleosome), the bending energies in these two configurations differ substantially as they involve different sections of the nucleosomal DNA. A nucleosome model (de Bruin et al., 2016) using the rigid base pair model—similar to the one discussed in Subsection 4.2.2 but with a non-uniform wrapping path
extracted from the nucleosome crystal structure, actually leads to predictions consistent with the findings of the experiment (Ngo et al., 2015). This shows that it is useful to start with a model that is simple enough to understand the basic physics of a system before introducing more complexity as experiments reveal new details. It also demonstrates the richness of the physical properties of the nucleosome: the ideal spool geometry together with the DNA stiffness causes a transient barrier under force, and the non-uniform features of both the wrapping path and the intrinsic DNA shape modulate the path over that barrier. Finally, let us address the question whether it is experimentally possible to directly "see" the binding sites by unpeeling a nucleosome under force. Is there a way to modify the geometry such that one can circumvent the force-induced barrier? The same group at Cornell University that performed the pulling experiment with an array of nucleosomes (Brower-Toland et al., 2002) also found an alternative setup that avoids the problem with the energy barrier. Their idea was to unzip DNA, i.e., to pull its two strands apart, using a micromanipulation setup. If a nucleosome is present on the DNA molecule, the unzipping slows down once the unzipping fork reaches the nucleosome (Hall et al., 2009). If one pulls with a constant force, e.g., 28 pN, the unzipping temporarily stops at all positions where there is an interaction with the histone core. It was found that each of the 14 binding sites is actually composed of two interaction points, one on each DNA strand. In addition, the data revealed that the binding strength is not constant along the nucleosomal DNA but shows three broader regions with stronger interactions, with the strongest region around the dyad.
8.3.3 Nucleosome Sliding

Spontaneous unwrapping is one mode by which nucleosomes allow access to their DNA. Another mode is nucleosome sliding: The position of a nucleosome is not fixed, and given enough time, it can move along the DNA, releasing a previously occupied position. Experiments that study nucleosome sliding are typically performed on DNA chains that are not much longer than the nucleosomal
Figure 8.25 Nucleosome sliding on a short piece of DNA of 200 bp length. The nucleosome is found at different positions, 10 bp apart from each other. The numbers indicate the lengths of the free DNA end portions. The arrows indicate transitions between different states assuming that only 10 bp steps occur.
wrapping length. For instance, if a chain is 200 bp long, it exceeds the wrapping length by about 50 bp. As we shall see, it is possible to detect the positions of nucleosomes in a whole ensemble of such complexes at a given time t0 . For instance, one finds that some fraction of the nucleosomes is located at one end, see e.g., the left configuration in Fig. 8.25 where all the extra length of 50 bp is stored in the left arm. After waiting for some time t, one measures the positions of the nucleosomes at the new time t1 = t0 + t and finds that some of the nucleosomes that were sitting before at the right DNA end are now at new positions. For instance, some might be found at a position where the left arm is now only 10 bp long and the right arm grew to a length of 40 bp, see the second last configuration to the right in Fig. 8.25. Two facts are particularly noteworthy: (1) The process is very slow. Only after about an hour has a substantial fraction of the nucleosomes repositioned. (2) The nucleosome positions are “quantized.” As indicated in Fig. 8.25, nucleosomes are found at positions that are multiples of 10 bp apart from the starting position but not at positions in between. This suggests that nucleosomes “jump” along the DNA with 10 bp steps and that these jumps occur with a very slow rate. Before we study a possible physical mechanism that underlies this phenomenon, we first try to understand how it has been experimentally detected. Nucleosome sliding experiments are based on a widely used method in molecular biology, so-called gel electrophoresis. The idea of this method is to drive charged molecules through a gel (a polymer network) by means of an electrical field. It has been observed that different molecules show
Figure 8.26 Nucleosome sliding observed via two-dimensional gel electrophoresis (Pennings et al., 1991) without (a) and with incubation (b). A sample of nucleosomes is added to the upper left corner (brown disk). An electrical field E1 drives the complexes to the bottom, separating them into bands according to their nucleosome position. In a second step, a field E2 is applied, driving the nucleosomes to the right. (a) Without incubation between the two steps the different bands stay intact and end up on a diagonal. (b) During incubation nucleosomes slide to new positions and electrophoresis produces a rectangular array of bands.
in gels very different electrophoretic mobilities, defined as the ratio of their mean velocity to the strength of the applied electrical field. This means that gel electrophoresis provides an exquisite means to separate different molecules. And it gets even better: Electrophoresis allows one to separate nucleosomes according to their positions on the DNA. The complex runs fastest through the gel if the nucleosome is located at either DNA terminus and slowest when it sits in the middle of the DNA molecule. The precise mechanism that underlies this separation is not understood, but what counts here is that it somehow works. An elegant version of this method was presented by Pennings et al. (1991). A sample of complexes was first pulled in one direction by an electrical field $E_1$, see Fig. 8.26, under conditions where the nucleosomes do not slide (e.g., at low temperatures). In this way, the complexes were separated into different bands according to their nucleosome position. Then a field $E_2$, perpendicular to $E_1$, was applied, which drives the complexes
to the right. Since all the nucleosomes were still at the same position, the fastest species was still the fastest and so on. As a result, the different species were on a diagonal at the end of the electrophoresis, Fig. 8.26(a). However, the outcome was different when the sample was incubated at body temperature (37 °C) for an hour before the complexes were sent into the second direction. In this case the nucleosomes in each band had enough time to slide to new positions. When the electrical field was applied in the second direction, each band split again into several bands that reflected the new positions that the nucleosomes had attained during the incubation step. As a result, a whole rectangular set of positions was found, see Fig. 8.26(b). What is the mechanism that underlies the spontaneous repositioning of nucleosomes along DNA? As a first idea one might imagine that the DNA can slide around the octamer in a fashion similar to that of a rope sliding around a cylinder. As we know from the crystal structure, Fig. 1.8, the DNA is attached to the histone octamer at 14 binding sites, located at the points where the minor groove touches the protein surface, see also Fig. 8.27(a). In a bulk sliding motion, all these 14 sites would have to detach at once to allow a sliding by 10 bp, see Fig. 8.27(b). Our analysis of the Polach and Widom site exposure experiment led to the estimate that the total binding energy $E_{\mathrm{ads}}$ of all the 14 sites together amounts to $75\,k_BT$, Eq. 8.51. This is a lower bound; the first-second-round difference suggests an even larger number. Since a sliding event does not change the bending energy of the DNA, it is the full amount, $E_{\mathrm{ads}}$, that such an event would cost and not just the $15\,k_BT$ net adsorption energy $E_{\mathrm{net}}$, Eq. 8.51. This number is so large that a sliding event would not happen even once during the lifetime of the universe. However, a second way of sliding seems to come free of charge: The cylinder might simply roll along the DNA.
At one end it detaches DNA and at the other it attaches it, thereby keeping the length of wrapped DNA constant. This simple mechanism does not work. Let us start with a fully wrapped nucleosome. Of course, it is always possible to detach DNA at one end. At the other end, however, there are no sites where the DNA can bind since all the binding sites are already in use. If the octamer continues to roll in one direction, it simply rolls off the DNA. This mechanism could only work if the
octamer were an infinitely high cylinder that provides an infinitely long helical binding path. After having discussed two scenarios that cannot work, one too costly, one cheap but impossible, we now present two possible mechanisms. Both rely on intermediates with an energy penalty that is much lower than $E_{\mathrm{ads}}$. They are based on defects that spontaneously form in the wrapped DNA portion and that propagate through the nucleosome. The two possible types of defects are 10 bp loop defects, Fig. 8.27(c), and 1 bp twist defects, Fig. 8.27(d). The basic idea of the loop defects is as follows: first some wrapped DNA peels off spontaneously as shown, e.g., in Fig. 8.14(a). If the DNA is pulled in before it re-adsorbs, it creates an intranucleosomal bulge that stores some extra length $\Delta L$. This bulge diffuses back and forth inside the nucleosome before it leaves it at either end. If it leaves at the end where it was created, the nucleosome is again in the state from which it started and nothing happened. If it comes out at the other end, the nucleosome has effectively taken a step of length $\Delta L$ along the DNA in the direction where the loop had entered. Here we do not discuss this mechanism in detail but mention that the energetics of loop defects can be worked out using the theory of Euler elasticas (Kulić and Schiessel, 2003b). One inscribes disks representing the octamer into curves like the one depicted in Euler's original drawing, Fig. 4.25. From this follows that the optimal loop length $\Delta L$ is 10 bp. Larger untwisted loops that carry, e.g., 20 or 30 bp are more expensive, and much larger loops are simply not possible for short DNA templates such as the 200 bp chain of Fig. 8.25. Loops of lengths that are not integer multiples of 10 bp, e.g., $\Delta L$ = 9 bp or 11 bp, have to store twist energy and are therefore more expensive than untwisted loops.
Remarkably the loop mechanism seems to provide an immediate explanation for the observed 10 bp spacing of the nucleosome positions seen in the experiments, see Fig. 8.25. Rough estimates of the rate with which loops would induce repositioning events are also in the right ballpark. In the interest of space, we will not present a detailed calculation of the loop scenario. Instead we take a close look at the twist defect mechanism. We will arrive at the surprising finding that the repositioning dynamics caused by twist defects is also compatible with the experimental data. At the end of this section we discuss some
Figure 8.27 Nucleosome sliding scenarios: (a) The fully wrapped nucleosome as reference state. Shown is only half of the wrapped DNA (up to the dyad). (b) A bulk sliding motion of the DNA around the octamer requires the opening of all the binding sites (open binding sites are shown in yellow). (c) Nucleosome with a loop defect carrying 10 extra base pairs. (d) Nucleosome with a twist defect: A stretch between two binding sites shown in bulk red contains one bp less and is therefore overstretched and overtwisted.
very recent findings from experiments and computer simulations that shed new light on these two repositioning mechanisms. The basic idea of the twist defect mechanism is similar to that of the bulge mechanism. Here, a twist defect spontaneously forms at either end of the wrapped DNA portion. Such a defect carries either a missing or an extra base pair. A defect with a missing base pair is shown in Fig. 8.27(d). A defect is localized between two neighboring nucleosomal binding sites (i.e., within one helical pitch, 10 bp). In order to accommodate the defect, the corresponding DNA portion is stretched or compressed and, at the same time, over- or undertwisted. If a defect manages to cross the wrapped portion of a nucleosome and exits at the other end, the nucleosome makes a 1 bp step along the DNA. If, for example, a defect with a missing bp forms at the left and leaves at the right, the nucleosome steps 1 bp to the right. We now describe a theoretical model that allows one to quantitatively estimate the impact of twist defects on nucleosome repositioning (Kulić and Schiessel, 2003a). As shown in Fig. 8.28, the nucleosome is mapped onto a chain of beads connected via harmonic springs. The beads represent the base pairs. The springs have an equilibrium distance $b = 0.34$ nm, the base pair spacing, and a spring constant $K$ to be determined below. The elastic energy of the chain of beads
Figure 8.28 Model for twist defect diffusion: the nucleosome is mapped onto a bead-spring chain. In the undeformed chain, every tenth bead sits at the bottom of a potential well of depth U 0 and width 2a.
is therefore given by

$$E_{\mathrm{elastic}}(\{x_n\}) = \frac{K}{2} \sum_k \left( \frac{x_k - x_{k-1}}{b} - 1 \right)^2. \tag{8.79}$$
Here the conformation of the wrapped DNA is given by $\{x_n\}$ where $x_n$ is the position of the $n$-th base pair, measured along the DNA chain. Each tenth bead (of the undeformed chain) is adsorbed in a potential well of depth $U_0$ and width $2a$, see Fig. 8.28. This external potential of the 14 binding sites is modeled as follows:

$$E_{\mathrm{ads}}(\{x_n\}) = -U_0 \sum_k \sum_{l=1}^{14} \left[ \left( \frac{x_k - 10bl}{a} \right)^2 - 1 \right]^2 \theta\left( a - |x_k - 10bl| \right) \tag{8.80}$$

with $\theta$ denoting the Heaviside step function. The external potential is constructed in such a way that, if the chain is undeformed, each tenth bead sits on the bottom of the potential well. If an "adsorbed" bead is shifted out of that position by a distance $a$, it smoothly reaches zero adsorption energy. If there is a twist defect on a nucleosome, say one missing base pair between two binding sites, the DNA has to overtwist and overstretch to accommodate this defect. The spring constant $K$ is thus chosen to reflect the energetics of that DNA deformation. Within the framework of the WLC model, Eq. 4.15, there is, however,
only bending and twisting possible but no stretching. Therefore, according to this model $K = \infty$ and twist defects cannot be formed. In reality it is, however, possible to stretch DNA, i.e., to increase its contour length, albeit at a very high price since base pair stacking has to be severely disturbed. This is why the inextensible WLC is usually good enough to describe DNA elasticity. The DNA twist and stretch moduli are known experimentally, which allows one to estimate $K$, namely $K \approx 200\,k_BT$. From our previous considerations on the unwrapping of DNA from nucleosomes, spontaneously or induced by an externally applied force, we can estimate the depth $U_0$ of the potential well. There are three effects that determine the adsorption strength per length of the DNA onto the octamer (see Eqs. 8.36, 8.78 and Fig. 8.23): (1) the full net adsorption energy that hinders the unwrapping of the last DNA turn, $f_{\mathrm{crit}}^{(0)} + f_{\mathrm{crit}}^{(1)}$, (2) the smaller unwrapping force $f_{\mathrm{crit}}^{(0)}$ for the first turn in the presence of the repelling other turn, and (3) the bending energy per length that is released during unwrapping, $A/2R_0^2$. In the repositioning problem, DNA has to unbind from binding sites but, unlike during DNA unwrapping, without straightening of the DNA conformation. The relevant energy per length that determines $U_0$ is thus $f_{\mathrm{crit}}^{(0)} + f_{\mathrm{crit}}^{(1)} + A/2R_0^2$. With $f_{\mathrm{crit}}^{(0)} = 0.3\,k_BT/\mathrm{nm}$, $f_{\mathrm{crit}}^{(1)} = 1.8\,k_BT/\mathrm{nm}$, $A = 50\,k_BT\,\mathrm{nm}$ and $R_0 = 4.3$ nm, this leads to an adsorption energy per length of $3.5\,k_BT/\mathrm{nm}$. With the 3.4 nm spacing of binding sites, this amounts to $U_0 \approx 3.5 \times 3.4\,k_BT \approx 12\,k_BT$. The only remaining parameter that still needs to be determined in the above energy expressions is $2a$, the width of the potential wells. This can be estimated from the fluctuations of the DNA inside the nucleosome crystal (Fig. 1.8) (Luger et al., 1997). It is found that the DNA in between binding sites shows mean-squared fluctuations $\langle x^2_{\mathrm{middle}} \rangle$ that are about three times larger than those close to binding sites, $\langle x^2_{\mathrm{bound}} \rangle$:

$$\frac{\langle x^2_{\mathrm{middle}} \rangle}{\langle x^2_{\mathrm{bound}} \rangle} \approx 3. \tag{8.81}$$

We now compare this with the prediction of our theoretical model. A bead in the middle between two binding sites is "connected" to these sites via two stretches of 5 springs. This leads to an effective spring
Figure 8.29 (a) Energy landscape (in units of $k_BT$), which is felt by a twist defect with one missing base pair when it moves from one location inside the nucleosome to the next (see text for details). For clarity, we only draw three bp per helical repeat instead of 10 as in Fig. 8.28. (b) Energy landscape felt by the defect when it leaves the nucleosome from the rightmost location. In both panels the horizontal axis is the bead position and the vertical axis is the defect energy.
constant of $2K/5$ (in units of $1/b^2$, since Eq. 8.79 measures extensions relative to $b$). On the other hand, the bound bead mainly feels the attraction to its site, while the contributions from the springs can be neglected. According to Eq. 8.80, when the bead is displaced by $x$ from the equilibrium position of the binding site, the adsorption energy changes from $-U_0$ to

$$-U_0 \left[ 1 - \left( \frac{x}{a} \right)^2 \right]^2 \approx -U_0 + \frac{2U_0}{a^2} x^2. \tag{8.82}$$

Thus the binding site acts as an effective spring with a "spring constant" $4U_0/a^2$. The ratio of the fluctuations then follows from the equipartition theorem, Eq. 2.36, to be

$$\frac{\langle x^2_{\mathrm{middle}} \rangle}{\langle x^2_{\mathrm{bound}} \rangle} \approx \frac{10 b^2 U_0}{a^2 K}. \tag{8.83}$$

This ratio is around 3, the experimental value, if we choose $a = \sqrt{10U_0/3K}\,b \approx 0.5b$, the value that we use in the following. The theoretical model now enables us to calculate the diffusion constant for nucleosome sliding along DNA. As a first step, we have to study the diffusion of a twist defect inside the nucleosome. Once a defect has formed, it can hop from one position between two binding sites to one of the two neighboring positions. From Eqs. 8.79
and 8.80 with $K = 200\,k_BT$, $U_0 = 12\,k_BT$ and $a = b/2$ follows the curve depicted in Fig. 8.29(a), which shows the energy felt by a defect when moving from one position to a neighboring one. In the beginning the defect sits, say, to the left of the middle binding site, see the right example configuration in Fig. 8.29(a). Its energy is $U_{\mathrm{defect}} \approx 10\,k_BT$. In order for the defect to cross to the right, the bead bound to the middle binding site needs to detach first. This bead is highlighted throughout all example configurations; the abscissa of the plot gives its position (in units of bp steps). Once it has detached and moved halfway to the left (upper example configuration in Fig. 8.29(a)), the defect stretches out over 20 bp, thereby reducing its elastic energy. Nevertheless, since the middle binding site is now unoccupied, the total energy of the system is maximal. As can be seen from the graph, the overall cost to go over the barrier is of the order of $7.5\,k_BT$. At the end, the defect has moved to the right to the next possible location, and the highlighted bead ends up to the left of the binding site, as shown in the left configuration in Fig. 8.29(a). The situation at the end of the wrapped DNA section is depicted in Fig. 8.29(b), which shows how a defect leaves the nucleosome to the right. First the defect is located to the left of the rightmost binding site, see the example configuration on the right-hand side. In order to leave the nucleosome the defect has to cross over a barrier of about $5\,k_BT$. This value is lower than that for the inner locations because the deformation can be completely relaxed in the unbound part of the DNA. At the end the nucleosome is defect free and its energy has relaxed from $U_{\mathrm{defect}} \approx 10\,k_BT$ to zero. Using Kramers' rate theory, Eq. 5.68, we now calculate the rate with which a defect is ejected from the nucleosome. We know already the barrier, see Fig. 8.29(b), but we still have to determine the attempt frequency $\nu_0$.
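The bead-spring model of Eqs. 8.79 and 8.80 is simple enough to be written down directly. The following minimal sketch first re-evaluates the parameter estimates for $U_0$ and $a$ and then compares the undeformed chain with a crude trial configuration for a missing-bp defect. The trial configuration is an assumption made here for illustration (bound beads held fixed, the nine remaining springs stretched uniformly); it is not the relaxed configuration behind Fig. 8.29, so its energy should lie somewhat above the quoted $U_{\mathrm{defect}} \approx 10\,k_BT$. All function and variable names are hypothetical.

```python
import math

B = 1.0        # base pair spacing b, used as the length unit (b = 0.34 nm)
K = 200.0      # spring constant in kT (value quoted in the text)
U0 = 12.0      # well depth in kT (value quoted in the text)
A = 0.5 * B    # half-width a of a well, a = b/2
N_SITES = 14   # number of binding sites

# Parameter check against the estimates quoted in the text.
f0, f1, a_bend, r0 = 0.3, 1.8, 50.0, 4.3    # kT/nm, kT/nm, kT nm, nm
u0_estimate = (f0 + f1 + a_bend / (2 * r0**2)) * 3.4   # kT per binding site
a_estimate = math.sqrt(10 * U0 / (3 * K))              # in units of b


def elastic_energy(x):
    """Eq. 8.79: harmonic springs with rest length b between neighbors."""
    return 0.5 * K * sum(((x[k] - x[k - 1]) / B - 1.0) ** 2
                         for k in range(1, len(x)))


def adsorption_energy(x):
    """Eq. 8.80: quartic wells of depth U0 centered at 10*b*l, l = 1..14."""
    total = 0.0
    for xk in x:
        for l in range(1, N_SITES + 1):
            d = xk - 10.0 * B * l
            if abs(d) < A:                      # Heaviside factor
                total -= U0 * ((d / A) ** 2 - 1.0) ** 2
    return total


def total_energy(x):
    return elastic_energy(x) + adsorption_energy(x)


# Reference state: undeformed chain, every tenth bead at a well bottom.
reference = [k * B for k in range(0, 151)]

# Trial defect state (assumption): one bp is missing between sites 7 and 8,
# bound beads stay fixed and the 9 remaining springs stretch uniformly.
defect = ([k * B for k in range(0, 71)]
          + [70.0 * B + j * 10.0 * B / 9.0 for j in range(1, 9)]
          + [k * B for k in range(80, 151)])

e_ref = total_energy(reference)
e_defect = total_energy(defect) - e_ref
print(f"U0 estimate: {u0_estimate:.1f} kT, a estimate: {a_estimate:.2f} b")
print(f"reference energy: {e_ref:.1f} kT")
print(f"defect cost with rigid sites: {e_defect:.2f} kT")
```

In this rigid-site picture the cost is simply $9 \times (K/2)(1/9)^2 = K/18$, which shows that the defect energy is set almost entirely by the spring constant $K$.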
A rough estimate can be given by realizing that in order to reach the top of the barrier, a 20 bp stretch of DNA needs to perform a translational motion. For the friction constant we thus use the translational friction constant of a cylinder of length $L = 20 \times 0.34$ nm $= 6.8$ nm, which (up to a logarithmic correction in the aspect ratio) is given by $\zeta = 2\pi\eta L$ (Doi and Edwards, 1986). In total we find

$$\nu_0 = \frac{\omega_A \omega_B}{2\pi\zeta} \approx \frac{4U_0}{\sqrt{2}\,\pi a^2 \zeta} \approx 3.6 \times 10^{10}\,\mathrm{s}^{-1} \tag{8.84}$$
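The number in Eq. 8.84 can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that are not stated in the passage above: the solvent viscosity is taken to be that of water, $\eta = 10^{-3}$ Pa s, and the thermal energy is taken as $k_BT \approx 4.1 \times 10^{-21}$ J (room temperature). The curvatures are those of the adsorption potential, $4U_0/a^2$ at the well bottom and $8U_0/a^2$ at the barrier top for $a = b/2$.

```python
import math

KT = 4.1e-21     # thermal energy in J (assumed: about 4.1 pN nm)
ETA = 1.0e-3     # viscosity of water in Pa s (assumed)
b = 0.34e-9      # base pair spacing in m
U0 = 12 * KT     # depth of a binding-site well
a = b / 2        # half-width of the well
L = 20 * b       # length of the moving 20 bp stretch

zeta = 2 * math.pi * ETA * L          # translational friction constant
omega_a_sq = 4 * U0 / a**2            # curvature at the well bottom
omega_b_sq = 8 * U0 / a**2            # magnitude of curvature at the barrier top
nu0 = math.sqrt(omega_a_sq * omega_b_sq) / (2 * math.pi * zeta)

print(f"zeta = {zeta:.2e} kg/s")
print(f"nu0  = {nu0:.2e} 1/s")
```

Since $\nu_0$ scales inversely with the friction constant, a different choice of viscosity or of the length of the moving stretch changes the result proportionally.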
where we approximated the curvatures at the bottom and the top of the energy landscape by that of the dominant contribution, $E_{\mathrm{ads}}$. This leads to $\omega_A^2 \approx E''_{\mathrm{ads}}(0) = 4U_0/a^2$ and $\omega_B^2 \approx -E''_{\mathrm{ads}}(b/2) = 8U_0/a^2$ for $a = b/2$ (as assumed throughout). Let me reiterate that Eq. 8.84 can only be viewed as a very rough estimate. Unfortunately, $\nu_0$ enters linearly into later results, especially the diffusion constant of the nucleosome along DNA. If our estimate of $\nu_0$ is off by a factor of 10, the diffusion constant will be off by the same factor. The escape rate with which a defect at a terminal position leaves the nucleosome is thus

$$k_{\mathrm{esc}} = \nu_0 e^{-5} \approx 2 \times 10^8\,\mathrm{s}^{-1}. \tag{8.85}$$
In equilibrium, on average as many defects enter the nucleosome as leave it. This leads to the condition $p_{\mathrm{none}} k_{\mathrm{enter}} = p_{\mathrm{defect}} k_{\mathrm{esc}}$ where $k_{\mathrm{enter}}$ is the rate with which a particular defect (e.g., a missing bp defect) enters from, say, the right, $p_{\mathrm{none}}$ is the probability that there is no such defect at the rightmost position and $p_{\mathrm{defect}}$ is the probability that there is such a defect. With $p_{\mathrm{defect}}/p_{\mathrm{none}} = e^{-U_{\mathrm{defect}}/k_BT}$ we find for the injection rate of a particular kind of defect from one end of the nucleosome:

$$k_{\mathrm{enter}} = k_{\mathrm{esc}}\, e^{-U_{\mathrm{defect}}/k_BT} \approx 10^4\,\mathrm{s}^{-1}. \tag{8.86}$$
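The two rates of Eqs. 8.85 and 8.86 follow from plain Arrhenius factors, as the short sketch below shows. For comparison it also evaluates the waiting time for a bulk sliding event with the $75\,k_BT$ cost discussed earlier. Reusing the same attempt frequency $\nu_0$ for bulk sliding and the round value for the age of the universe are assumptions made here only to illustrate the orders of magnitude.

```python
import math

nu0 = 3.6e10            # attempt frequency in 1/s (Eq. 8.84)
barrier_exit = 5.0      # kT, barrier for leaving at a terminus
u_defect = 10.0         # kT, energy of a twist defect
e_ads_total = 75.0      # kT, cost of opening all 14 sites at once

k_esc = nu0 * math.exp(-barrier_exit)        # Eq. 8.85
k_enter = k_esc * math.exp(-u_defect)        # Eq. 8.86

# Bulk sliding for comparison (assumption: same attempt frequency).
k_bulk = nu0 * math.exp(-e_ads_total)
age_universe_s = 4.35e17                     # about 13.8 billion years

print(f"k_esc   = {k_esc:.2e} 1/s")
print(f"k_enter = {k_enter:.2e} 1/s")
print(f"bulk sliding waiting time = {1 / k_bulk:.1e} s")
print(f"in units of the age of the universe: {1 / (k_bulk * age_universe_s):.1e}")
```

Under these assumptions the bulk sliding waiting time exceeds the age of the universe by several orders of magnitude, whereas twist defects are injected roughly ten thousand times per second from each end.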
The total rate with which defects enter the nucleosome has to be multiplied by 4 since there are two entrances (left and right) and two types of twist defects (missing and extra bp). Not every defect that forms at one end reaches the other end. We need to know the fraction of defects that cross the nucleosome because only these defects contribute to nucleosome repositioning. Note that the transition rates to leave the nucleosome at the two termini are higher than those for transitions between inner locations, compare Fig. 8.29(b) with Fig. 8.29(a). The latter we have assumed for simplicity to be the same everywhere. What we need is the probability $p_N^{(2)}$ for a defect to start at, say, the left and leave at the right, where the lower index $N$ denotes the number of inner sites (11 for a nucleosome) and the upper index (here 2) stands for the two outer sites, see Fig. 8.30(c). This probability can be calculated in 3 steps. First we find $p_N$, the probability for a random walker to traverse $N$ identical sites with escape rates identical to the inner
Figure 8.30 How to calculate the probability for a defect to cross the nucleosome: (a) Start with the N inner sites, (b) add the leftmost site and finally (c) the rightmost site. For the nucleosome N = 11 (see text for details).
rates, see Fig. 8.30(a). Then we add an outer site at the start site, Fig. 8.30(b), and calculate the crossing probability $p_N^{(1)}$. In the last step, we determine $p_N^{(2)}$, Fig. 8.30(c). The $p_N$'s can be found by mathematical induction. Suppose we already know $p_{N-1}$. Then we can calculate $p_N$ as follows. The probability that a defect that starts at site 1 reaches site $N$, the site at the other end, is $p_{N-1}$. The chance that the defect at $N$ still falls off to the left is obviously $p_N$ for reasons of symmetry. This means that $1 - p_N$ is the probability that the defect at $N$ will eventually fall off to the right. Multiplying the two probabilities, namely $p_{N-1}$ for going from 1 to $N$ and $1 - p_N$ for eventually falling off to the right, gives us the crossing fraction $p_N$:

$$p_N = p_{N-1} (1 - p_N). \tag{8.87}$$
Equation 8.87 together with the trivial case $p_1 = 1/2$ is solved by

$$p_N = \frac{1}{N+1}. \tag{8.88}$$
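The induction step can be checked mechanically. Solving Eq. 8.87 for $p_N$ gives $p_N = p_{N-1}/(1 + p_{N-1})$, and iterating this from $p_1 = 1/2$ with exact rational arithmetic reproduces Eq. 8.88. The function name below is hypothetical.

```python
from fractions import Fraction


def crossing_fraction(n_sites):
    """Iterate Eq. 8.87, rewritten as p_N = p_{N-1} / (1 + p_{N-1})."""
    p = Fraction(1, 2)                  # trivial case p_1 = 1/2
    for _ in range(2, n_sites + 1):
        p = p / (1 + p)
    return p


for n in (1, 2, 3, 11):
    print(n, crossing_fraction(n))
```

For the 11 inner sites of a nucleosome alone this gives a crossing fraction of 1/12, before the two outer sites with their different escape rates are taken into account.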
Next we determine $p_N^{(1)}$. For inner sites the probabilities for the defect to jump to the left or to the right are the same: 1/2. For the
first site, however, the values are different: $p$ to the right and $q$ to the left with $p + q = 1$. Summing up all possibilities we find

$$p_N^{(1)} = p p_N + p (1 - p_N)\, p p_N + p^2 (1 - p_N)^2\, p p_N + \cdots = \frac{p p_N}{1 - p (1 - p_N)} = \frac{p}{qN + 1}. \tag{8.89}$$

Each term in this infinite series accounts for a different number of times the defect returns to the start site before it eventually falls off at the other end. For the first term, $p p_N$, the defect never passes through the left site again, i.e., it jumps from the left site one step to the right (factor $p$) and then crosses all the inner sites and falls off to the right (factor $p_N$). The second term, $p (1 - p_N)\, p p_N$, accounts for all random walks that start at the left, make a step to the right (factor $p$), return back to the starting site (factor $1 - p_N$), step again to the right (factor $p$) and finally fall off to the right without again going back to the start site (factor $p_N$). Note that for $p = q = 1/2$ one recovers Eq. 8.88 as expected, i.e., $p_N^{(1)} = 1/(N + 2) = p_{N+1}$. As the last step, we have to add the terminus to the right. The probability $p_N^{(2)}$ can then be calculated along similar lines as $p_N^{(1)}$ in Eq. 8.89. One finds

$$p_N^{(2)} = p_N^{(1)} q \left[ 1 + p \left( (1 - p_N) + p_N p_N^{(1)} \right) + p^2 \left( (1 - p_N) + p_N p_N^{(1)} \right)^2 + \cdots \right] = \frac{p_N^{(1)} q}{1 - p \left( 1 - p_N + p_N p_N^{(1)} \right)} = \frac{p}{N + 1 + p (1 - N)}. \tag{8.90}$$
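The two geometric series of Eqs. 8.89 and 8.90 can be summed numerically and compared with their closed forms. The sketch below truncates each series after a fixed number of terms, which is more than enough for the values of $p$ used here because the ratio of successive terms stays below $p$. All function names are hypothetical.

```python
import math


def p_inner(n):
    """Eq. 8.88: crossing fraction for n identical inner sites."""
    return 1.0 / (n + 1)


def p1_closed(n, p):
    """Closed form of Eq. 8.89."""
    return p / ((1.0 - p) * n + 1.0)


def p2_closed(n, p):
    """Closed form of Eq. 8.90."""
    return p / (n + 1.0 + p * (1.0 - n))


def p1_series(n, p, terms=400):
    """Truncated series of Eq. 8.89."""
    pn = p_inner(n)
    return sum(p * pn * (p * (1.0 - pn)) ** j for j in range(terms))


def p2_series(n, p, terms=400):
    """Truncated series of Eq. 8.90 (uses the closed form of p_N^(1))."""
    q = 1.0 - p
    pn = p_inner(n)
    p1 = p1_closed(n, p)
    ratio = p * ((1.0 - pn) + pn * p1)
    return sum(p1 * q * ratio ** j for j in range(terms))


for p in (0.5, 1.0 / (1.0 + math.exp(2.5))):
    print(f"p = {p:.4f}: "
          f"p1 series {p1_series(11, p):.6f}, closed {p1_closed(11, p):.6f}; "
          f"p2 series {p2_series(11, p):.6f}, closed {p2_closed(11, p):.6f}")
```

For $p = 1/2$ the outer sites are indistinguishable from inner ones, so the results reduce to $p_{N+1}$ and $p_{N+2}$ of Eq. 8.88.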
We explain the first two terms of the series in Eq. 8.90. For the first term, $p_N^{(1)} q$, the defect arrives with probability $p_N^{(1)}$ at the right end and then, without stepping back to the left, falls off with probability $q$. The second term accounts for the case that the defect, after having reached the right end (factor $p_N^{(1)}$), jumps back to the left (factor $p$) and diffuses through the inner sites without going through the start site (term $1 - p_N$) or with passing through it (term $p_N p_N^{(1)}$). As a test, we can set $p = q = 1/2$ again and indeed recover $p_N^{(2)} = p_{N+2}$. To determine the quantity $p$ for the nucleosome, we need to compare the escape rate $k_{\mathrm{esc}}$, Eq. 8.85, and the rate $k_{\mathrm{inner}}$ for hopping between inner sites, $k_{\mathrm{inner}} = \nu_0 e^{-7.5}$ (see also Fig. 8.29). The rates directly determine the probabilities: $p/q = k_{\mathrm{inner}}/k_{\mathrm{esc}}$. From this follows $p = e^{-2.5} q$ and hence $p = 1/(1 + e^{2.5}) \approx 1/13$. For a nucleosome one has $N = 11$ inner defect locations and thus $p_{11}^{(2)} \approx 1/150$ is found for the fraction of defects that cross the complex. This allows us to calculate the time $T_{\mathrm{step}}$ that passes on average between two successful crossing events or, in other words, the stepping time for the nucleosome along the DNA:

$$T_{\mathrm{step}} = \frac{1}{4 k_{\mathrm{enter}}\, p_{11}^{(2)}} \approx 4 \times 10^{-3}\,\mathrm{s}. \tag{8.91}$$
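The crossing fraction $p_{11}^{(2)}$ that enters Eq. 8.91 can be cross-checked without summing any series. A minimal sketch, using a method that is not part of the text: let $h_i$ be the probability that a defect at site $i$ eventually leaves at the right. These probabilities obey a tridiagonal linear system (inner sites hop left or right with probability 1/2, the two outer sites hop inward with probability $p$ and fall off with probability $q$), which is solved here by forward elimination and back substitution. The function name and the site numbering are hypothetical.

```python
import math


def crossing_probability_linear(n_inner, p):
    """Exit-right probability for a defect starting at the left outer site."""
    q = 1.0 - p
    m = n_inner + 2                          # two outer sites plus inner sites
    # Rows: h_0 - p*h_1 = 0, -h_{i-1}/2 + h_i - h_{i+1}/2 = 0,
    #       -p*h_{m-2} + h_{m-1} = q
    lower = [0.0] + [-0.5] * n_inner + [-p]
    diag = [1.0] * m
    upper = [-p] + [-0.5] * n_inner + [0.0]
    rhs = [0.0] * (m - 1) + [q]
    for i in range(1, m):                    # forward elimination
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    h = [0.0] * m                            # back substitution
    h[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        h[i] = (rhs[i] - upper[i] * h[i + 1]) / diag[i]
    return h[0]


p = 1.0 / (1.0 + math.exp(2.5))              # p/q = exp(-2.5)
numeric = crossing_probability_linear(11, p)
closed = p / (11 + 1 + p * (1 - 11))         # Eq. 8.90 with N = 11
print(f"p = 1/{1 / p:.1f}")
print(f"linear system: 1/{1 / numeric:.0f}, closed form: 1/{1 / closed:.0f}")
```

Both routes give a crossing fraction close to the value of 1/150 quoted above.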
As mentioned earlier, the factor 4 in front of kenter accounts for the fact that defects can enter from both sides and that there are two types of defects. This finally leads to the following estimate for the diffusion constant for nucleosome sliding along DNA: D=
b2 ≈ 130 bp2 /s. 2Tstep
(8.92)
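The numbers quoted around Eqs. 8.89 to 8.92 can be checked with a few lines of Python. The sketch below is only an illustration, not part of the original derivation: the function names are made up here, the inner-site crossing probability is taken as $p_N = 1/(N+1)$ (which is what the remark $p_N^{(1)} = 1/(N+2) = p_{N+1}$ implies for Eq. 8.88), the step length is assumed to be $b = 1$ bp, and $T_{\text{step}}$ is simply set to the rounded value $4 \times 10^{-3}$ s from Eq. 8.91 because $k_{\text{enter}}$ is not restated in this passage.

```python
import math

def p_inner(n):
    """Crossing probability p_N for n inner sites (assumed form of Eq. 8.88)."""
    return 1.0 / (n + 1)

def p_one_terminus(n, p):
    """Closed form of Eq. 8.89: p / (q N + 1) with q = 1 - p."""
    return p / ((1.0 - p) * n + 1.0)

def p_two_termini(n, p):
    """Closed form of Eq. 8.90: p / (N + 1 + p (1 - N))."""
    return p / (n + 1.0 + p * (1.0 - n))

def p_one_terminus_series(n, p, terms=200):
    """Partial sum of the geometric series on the left of Eq. 8.89."""
    pn = p_inner(n)
    return sum((p * (1.0 - pn)) ** k * p * pn for k in range(terms))

def p_two_termini_series(n, p, terms=200):
    """Partial sum of the geometric series on the left of Eq. 8.90."""
    pn = p_inner(n)
    p1 = p_one_terminus(n, p)
    ratio = p * ((1.0 - pn) + pn * p1)
    return sum(p1 * (1.0 - p) * ratio ** k for k in range(terms))

N_INNER = 11                            # inner defect locations in a nucleosome
P_HOP = 1.0 / (1.0 + math.exp(2.5))     # from p/q = exp(-2.5) and p + q = 1
P_CROSS = p_two_termini(N_INNER, P_HOP)

T_STEP = 4e-3       # s, rounded value quoted in Eq. 8.91 (taken as given)
STEP_BP = 1.0       # bp, assumed step length b
D_SLIDE = STEP_BP ** 2 / (2.0 * T_STEP)

print(f"1/p      = {1.0 / P_HOP:.1f}")
print(f"1/p_11^2 = {1.0 / P_CROSS:.0f}")
print(f"D        = {D_SLIDE:.0f} bp^2/s")
```

With the rounded stepping time the last line gives 125 bp²/s, which is consistent with the quoted estimate of about 130 bp²/s at the stated precision.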
This result, $D \approx 130\ \text{bp}^2/\text{s}$, is unfortunately far off experimental observations, which suggest much smaller diffusion constants of about $1\ \text{bp}^2/\text{s}$. Only for such small values are redistribution times on short DNA templates on the order of an hour instead of a fraction of a second as predicted by Eq. 8.92. Moreover, it was found that nucleosome sliding produces preferred locations that are 10 bp apart from each other. According to our model, the nucleosome should be found with the same probability at every position. There is clearly something missing in our model. What could this missing ingredient be? Remember Subsection 4.2.2 where we considered DNA bending at the base pair level. Since the geometry of a given bp step depends on its nucleotides, certain base pair sequences lead to a DNA double helix that is strongly intrinsically bent in one direction. Remarkably, the majority of repositioning experiments (e.g., Pennings et al., 1991) have been performed with DNA molecules that feature such sequences, mostly the sea urchin 5 S rDNA positioning sequence. The reason for this is mainly of a technical nature, since nucleosomes can only be reconstituted in a controlled manner on DNA templates containing such a sequence. What does the energy landscape look like that is felt by a sliding nucleosome? First of all, the nucleosome positioning sequence
comes with a favorite position where many of the dinucleotide steps are at locations that make it easy for the DNA to bend around the octamer (as it is already intrinsically bent in that direction). There might be, e.g., on average more TA steps at positions where the minor groove faces inward, in accordance with the rules depicted in Fig. 4.16(b). As the nucleosome slides, the DNA performs a corkscrew motion, thereby progressively violating these rules. Once it has moved 5 bp to the left or to the right, the dinucleotide steps will be on average out of phase and the bending energy of the nucleosomal DNA reaches a maximum. Once the nucleosome has moved 10 bp away from the optimal position, 1/14th of the positioning sequence has left the nucleosome but 13/14th are still wrapped, with many dinucleotides facing in the right direction. We thus expect to find undulations with a 10 bp wavelength which gradually disappear as the nucleosome moves out of the positioning sequence. To get an estimate of how much nucleosome sliding is reduced by a positioning sequence, we provide in Fig. 8.31 a plot of the bending energy cost to wrap DNA into a nucleosome as a function of the bp position for a 207 bp long DNA fragment that contains the 5 S rDNA positioning sequence. This DNA segment offers 207 − 147 + 1 = 61 different possible positions for the nucleosome. We calculate this energy landscape using the model from Subsection 4.2.2 where the rigid base pair model is forced into an ideal superhelix with dimensions comparable to those of the nucleosome, see Fig. 4.15. We only sum over the bending energy of the 146 involved bp steps and assume that the rest of the DNA is relaxed. The bending energy at position $k$, $1 \le k \le 61$, is the sum of the bending costs of all involved bp steps:

$$E_{\text{bend}} = \sum_{n=k}^{k+145} E_n^{N_n N_{n+1}} \tag{8.93}$$

where $E_n^{N_n N_{n+1}}$ has been defined in Eqs. 4.8 to 4.10. We show the elastic energy landscapes for all three parametrizations that we have used earlier, in Fig. 4.18: the blue curve uses the full set of parameters from Table 4.1 extracted from (Olson et al., 1998), the dashed curve uses average stiffnesses, and the dotted curve was calculated from the newer data set for the
Figure 8.31 Elastic energy landscapes for the 207 bp long 5 S rDNA DNA fragment (the specific bp sequence is given in Problem 8.2). The energy is calculated using the ideal superhelix model for the nucleosome, depicted in Fig. 4.15. We show the energy landscape for three different parametrizations, used also earlier, in Fig. 4.18: the full set of parameters from Table 4.1 extracted from (Olson et al., 1998) (blue curve), the same intrinsic values but average stiffnesses (dashed curve), and the newer data set for the shapes (Balasubramanian et al., 2009) but assuming homogeneous stiffnesses (dotted curve). All curves show characteristic 10 bp undulations with about the same amplitude. [Plot: $E_{\text{bend}}$ in units of $k_B T$ (vertical axis, 160 to 190) versus nucleosome position in bp (horizontal axis, 0 to 60); in-plot labels: "new data set" and "homogeneous stiffness".]
shapes (Balasubramanian et al., 2009) but assuming homogeneous stiffnesses. For all three parametrizations, the energy landscape shows typical energy undulations on the order of A ≈ 15 kB T with a periodicity given by 10 bp, the DNA helical repeat. Note that the total elastic energies are very high, between 166 and 188 kB T . This contrasts with our much smaller estimate of 58 kB T based on the WLC model, see below Eq. 8.36. Even worse, note that we find this high number despite having not even accounted for all degrees of freedom (like twist and rise) that cost energy. As it turns out, the overall large bending energies reflect the fact that we force our model into an ideal superhelix without giving it a possibility to relax locally. As we discussed earlier, the DNA is bound at 14 locations to the histone octamer but the rest of the DNA is rather free and shows substantial thermal fluctuations
in shape in the crystal structure (Luger et al., 1997). The overall deformations that are necessary to bend the DNA so that it has the proper orientations and positions at the 14 contact points can thus be distributed among the base pair steps, which lowers the overall bending costs substantially. In fact, a model (Eslami-Mossallam et al., 2016) based on the rigid base pair model with 28 constraints, two per binding site, shows an overall bending cost on the order of $68\,k_B T$, much closer to the WLC estimate of $58\,k_B T$, even though the elastic contributions of all degrees of freedom are accounted for. The amplitude of the undulations is also reduced accordingly. We therefore assume that the overall energy landscape shown in Fig. 8.31 has to be multiplied by a factor 1/3, which reduces the undulations to a value of $A \approx 5\,k_B T$. Other approaches, e.g., a free energy landscape estimated from experimental bp step distributions like the one shown in Fig. 4.14, come to similar findings. The latter approach (Segal et al., 2006) was in fact what we used in the first edition of this textbook, as the superhelix model was not yet available. We now simplify this energy landscape as follows:
$$U_{\text{bind}}(s) = \frac{A}{2} \cos\left( \frac{2\pi s}{10} - \phi \right). \tag{8.94}$$

Here $s$ counts the bp position of the nucleosome and $\phi$ is some phase factor. The nucleosome has therefore to cross a barrier of height $A$ in order to move 10 bp further to the left or to the right. The rate to cross the barrier then follows from Kramers' rate expression, Eq. 5.68:

$$k = \frac{U''_{\text{bind}}(s_{\min})}{2\pi\zeta}\, e^{-A/k_B T} = \frac{\pi A}{100\ \text{bp}^2\, \zeta}\, e^{-A/k_B T} \tag{8.95}$$

where $s_{\min}$ denotes a nucleosome position that minimizes $U_{\text{bind}}$. From this follows the effective diffusion constant:

$$D_{\text{eff}} = \frac{100\ \text{bp}^2}{2 k^{-1}} = D\, \frac{\pi A}{2 k_B T}\, e^{-A/k_B T} \tag{8.96}$$
where we used in the second step the Einstein relation $D = k_B T/\zeta$, Eq. 5.49. For $A = 5\,k_B T$ the diffusion constant is reduced from $D = 130\ \text{bp}^2/\text{s}$ to effectively $D_{\text{eff}} \approx 7\ \text{bp}^2/\text{s}$, a value that is reasonably close to the experimental value of about $1\ \text{bp}^2/\text{s}$. In addition, we
can now explain the preference for certain positions with a 10 bp spacing. This just reflects the Boltzmann weight since it is $e^5 \approx 150$ times more likely to find the nucleosome at a favorable rotational setting than at an unfavorable one.

We close this section by mentioning some new developments in the field. Computer simulations of nucleosomes from two different groups, one in Chicago and one in Kyoto, found evidence that both types of defects, twist and loop defects, can cause the repositioning of nucleosomes and that the predominant type depends on the underlying bp sequence (Lequieu et al., 2017; Niina et al., 2017). Both groups used the same coarse-grained DNA model (called "3 Site Per Nucleotide," or 3SPN for short) which is different from the rigid base pair model, discussed earlier in this book, but also accounts for the sequence dependent DNA elasticity. The model DNA includes charged phosphates which are attracted via a Debye-Hückel interaction to the oppositely charged protein cylinder (modeled by a coarse-grained protein model called "Atomic-Interaction-based Coarse-Grained," or AICG for short). Remarkably, as a result of this interaction, a proper nucleosome forms. Over the course of each simulation, the position of the octamer along the DNA molecule changes. For sequences where the energy landscape does not show very large 10 bp undulations, the nucleosome slides along DNA via twist defects. On the other hand, for sequences that position nucleosomes very strongly, the octamer seems to "tunnel" occasionally via loop defects through the high barriers in the energy landscape. A recent experimental observation also suggests that both types of defects might underlie nucleosome sliding (Rudnizky et al., 2019). The experiment uses a micromanipulation setup where a nucleosome-containing DNA molecule is unzipped. This is similar to the experiment mentioned in the previous section that made it possible to map the DNA-octamer interactions (Hall et al., 2009).
The difference is that the new experiment uses a slightly lower force (23 pN instead of 28 pN) such that the DNA does not peel off the nucleosome as the unzipping fork stops upon encountering the nucleosome. By repeatedly pulling and relaxing the two DNA strands, the position of the nucleosome was determined with a 2 bp-resolution at each probing cycle. Remarkably, nucleosomes containing a histone H2A
variant, called H2A.Z, showed two types of movements: small scale movements, possibly based on twist defects generated at short time scales, and longer ranged repositioning events on the time scale of minutes, possibly caused by loop defects. In addition to the spontaneous sliding of nucleosomes caused by thermal fluctuations, nucleosomes in cells are pushed by dedicated motor proteins along the DNA. Only recently, based on improved experimental techniques and computer simulations, have we found out how these motor proteins, called chromatin remodelers, accomplish this feat. Also in the case of chromatin remodelers the main challenge lies in the fact that DNA inside a nucleosome is bound at discrete sites to the protein cylinder and that the total binding energy amounts to many tens of $k_B T$. Remodelers use ATP and therefore have much less energy at their disposal (about $12\,k_B T$). They cannot act in groups on nucleosomes (like myosin in muscles) as there is simply not enough space for more than one remodeler to push or pull a nucleosome along. A breakthrough has been achieved with transmission electron cryomicroscopy (cryo-EM), an area that has made great strides recently due to advances in software algorithms. With this EM technique, samples are cooled quickly and thus remain in their natural environment, unlike with other methods such as X-ray crystallography. By combining a large number of images of particles, one can now reach near-atomic resolution. Using this method, a remodeler (Snf2) in complex with a nucleosome has been studied (Liu et al., 2017). By processing cryo-EM data from about 200 000 of these particles, their structure could be determined with about 5 Å resolution. It was found that the remodeler is bound asymmetrically on the nucleosome, namely at a location about 20 bp away from the middle of the wrapped portion, and that there is another contact with a stretch of DNA on the other nucleosomal turn.
Could the chromatin remodelers mobilize nucleosomes by actively creating defects inside nucleosomes? How this could work is far from obvious. The remodeler contains a motor that shares similarities with a class of well-known motors called helicases. Some helicases move unidirectionally along DNA, essentially in an inchworm fashion. To achieve this, they feature two lobes that bind to the DNA, changing their distance and modulating individually
their binding strengths through each ATP cycle. As a result, they walk like we would walk if we always kept the same foot in front of the other one. However, whereas the helicase moves along following the DNA helix, a remodeler bound to a nucleosome cannot perform any motion as it would clash either with the protein core or (moving in the opposite direction) with DNA from the other nucleosomal turn. So how can a remodeler move a nucleosome along DNA? A molecular dynamics study helped to answer this question (Brandani and Takada, 2018). The simulation uses the same Kyoto-based model (Niina et al., 2017) mentioned above that, together with the similar Chicago-based model (Lequieu et al., 2017), observed nucleosome repositioning by twist and loop defects. This new simulation used the information from the cryo-EM structure of the nucleosome-remodeler complex (Liu et al., 2017) to place the remodeler asymmetrically onto their nucleosome model. Using the similarity to helicases, the authors let their remodeler model go through three chemical states where each has a slightly different set of interactions. When they let their remodeler act on bare DNA, not surprisingly it walked along the double helix in an inchworm fashion. About 50% of the ATP cycles led to successful steps. Remarkably, the remodeler bound to the nucleosome managed to walk along the DNA as well, even though it seems to have no space to move. Even more remarkable is the fact that it managed to drag the whole nucleosome along and practically each cycle led to a successful step. How does the remodeler execute this seemingly impossible feat? In the starting position (the one corresponding to the cryo-EM structure (Liu et al., 2017)) the two lobes are distant. ATP binding leads to the motion of one lobe toward the other, changing the remodeler from an open to a closed configuration. As a result there is now a gap left behind that lobe.
Surprising is what happens next: The remodeler (still in the closed configuration) swings back, closing the gap and deforming with it the DNA around the position where it is bound. What causes this motion is a basic patch on the lobe that is attracted to the other DNA turn with which it was in contact before. Next, ATP hydrolysis can take place since there is now room for the other lobe to move so that the remodeler can return to the open configuration and the cycle is finished.
Now we can return to the twist defects. During the swinging-back of the remodeler, caused by the "electrostatic spring," the DNA is deformed. As a result, a twist defect with an extra base pair has formed on one side of the remodeler (between the two closest binding sites of the DNA to the protein cylinder) and another one with a missing base pair on the other side. With the electrostatic spring holding the remodeler tightly in place, this over-/under-twist pair cannot annihilate itself. Instead, the defects diffuse along their respective wrapped DNA portions and fall off at the ends. As a result, the nucleosome makes a step by one base pair. The whole scenario is reminiscent of historical treadmills (or treadwheels), machines where humans provided the power. Instead of the electrostatic spring pulling the remodeler back, it was gravity that pulled the person down again after each step, causing the rotation of the wheel. With this comparison, nucleosomes and treadmills, we finish the part on nucleosomes and now move on to larger length scales.
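Before moving on, the barrier-crossing estimate for thermally driven sliding (Eqs. 8.94 to 8.96) can be checked numerically. The following is a minimal sketch with made-up function names; it works in units where energies are measured in $k_B T$ and lengths in bp, and takes the free diffusion constant $D = 130$ bp²/s from Eq. 8.92 as given. The second function rebuilds $D_{\text{eff}}$ from the Kramers rate itself, under the assumption that the curvature at the minimum of the cosine potential is $(A/2)(2\pi/10\ \text{bp})^2$.

```python
import math

def reduction_factor(a_over_kt):
    """D_eff / D = (pi A / (2 kBT)) * exp(-A / kBT), the prefactor in Eq. 8.96."""
    return 0.5 * math.pi * a_over_kt * math.exp(-a_over_kt)

def d_eff_from_kramers(a_over_kt, d_free, period=10.0):
    """Rebuild D_eff from the Kramers rate of Eq. 8.95.

    a_over_kt : barrier height A in units of kBT
    d_free    : diffusion constant without sequence effects, in bp^2/s
    period    : wavelength of the cosine landscape in bp
    """
    # Curvature U''(s_min) / kBT of U = (A/2) cos(2 pi s / period - phi), in bp^-2
    curvature = 0.5 * a_over_kt * (2.0 * math.pi / period) ** 2
    # Einstein relation D = kBT / zeta, so zeta / kBT = 1 / D
    zeta = 1.0 / d_free
    rate = curvature / (2.0 * math.pi * zeta) * math.exp(-a_over_kt)
    # One barrier crossing moves the nucleosome by one period
    return period ** 2 * rate / 2.0

D_FREE = 130.0   # bp^2/s, from Eq. 8.92
for barrier in (5.0, 15.0):
    print(f"A = {barrier:4.1f} kBT -> D_eff = {D_FREE * reduction_factor(barrier):.3g} bp^2/s")

# Boltzmann preference for a favorable over an unfavorable rotational setting
print(f"exp(5) = {math.exp(5.0):.0f}")
```

The comparison of the two barrier heights illustrates why the rescaling of the landscape by the factor 1/3 matters: the exponential makes the effective diffusion constant extremely sensitive to $A$.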
8.4 Chromatin Fibers

What we really know for sure about the structure of chromatin inside cells goes only up to the level of the nucleosome. Surprisingly, the structures beyond that are still a matter of debate. If you look at Fig. 1.9 again, you can see that it depicts a fiber with about 30 nm diameter, the chromatin fiber, as the next level of chromatin organization. In fact, such structures are easily observed under an electron microscope if one extracts chromatin from the nuclei of cells and then produces chromatin fragments through mildly digesting its DNA with micrococcal nuclease (Finch and Klug, 1976). Fibers also form when one reconstitutes chromatin from its pure components, DNA and histones (Robinson et al., 2006). However, in both cases fibers are observed under in vitro conditions that are very far from the crowded conditions inside cell nuclei. In fact, in Ref. (Eltsov et al., 2008) electron micrographs of frozen sections of cell nuclei were presented that did not show any evidence of fibers but instead a uniform mass of nucleosomes. Whether this structure is caused by the preparation of the sample or reflects the true organization of chromatin in living cells is still a matter of debate.
Figure 8.32 The two main competing models for the chromatin fiber: the solenoid model (left) and the crossed linker model or two-angle fiber (right). The figures are adapted from the 1994 and the 2002 editions of Ref. (Alberts et al., 2008). Every new edition of this standard textbook sponsors a different model.
But that is not the whole problem with chromatin fibers. Even if we would observe fibers inside cells, we would still not know their actual geometry. Current experimental techniques have been spectacularly unsuccessful in determining their structure. Some successes have been made like, e.g., the crystallization of the tetranucleosome in 2005 (Schalch et al., 2005). However, it is not clear whether the spatial arrangement of the four nucleosomes reflects the spatial arrangement of nucleosomes inside a fiber. In this section, we do not try to answer the question of whether chromatin fibers exist in living cells but merely try to answer the second question, namely that of the geometry of the in vitro fibers. We will discuss the two most popular models of chromatin fibers. Even though they seem to be mutually exclusive, I shall argue that they are two sides of the same coin. Depending on the experimental conditions, one or the other model might apply. Specifically, the two major competing classes of models are the solenoid-type models and the zig-zag or crossed linker models, see Fig. 8.32. In the most classical version of solenoid-type models, the solenoid model (Finch and Klug, 1976) (see lhs of Fig. 8.32) it is assumed that the chain of nucleosomes forms a
helical structure with the axes of the superhelical DNA wrapping paths being perpendicular to the fiber axis. The DNA entry-exit sites to the nucleosomes face inward toward the center of the solenoidal fiber. The linker DNA needs to bend in order to connect neighboring nucleosomes in the solenoid, which in turn calls for strong nucleosomal attraction to hold this structure together. The other class of models posits straight linkers that connect nucleosomes located on opposite sides of the fiber. The geometry is then characterized by two angles so that these fibers are also referred to as two-angle fibers. This typically results in a three-dimensional zig-zag-like pattern of the linker DNA (see rhs of Fig. 8.32). We will now discuss the two types of models in detail, first the two-angle fibers where the linker DNA sets the fiber geometry, and then the solenoid-type fibers where the attraction between the nucleosomes overrules the DNA elasticity.
8.4.1 Two-Angle Model

The two-angle model is based on electron micrographs of swollen, open chromatin fibers at low salt concentrations. It is then speculated that such a two-angle geometry is also present at high, physiological salt concentrations where fibers are so dense that one cannot detect their actual structure (Woodcock et al., 1993). Before describing this geometry, we need to mention that chromatin fibers contain an additional protein, the linker histone. Experiments (Syed et al., 2010) indicate that the linker histone binds to portions of the in- and outgoing DNA and to a short piece of the wrapped DNA around the dyad axis. This results in a so-called stem as depicted in Fig. 8.33(a). The repeating unit of the two-angle model is then a cylinder (the core particle), a stem and a piece of linker DNA connecting to the stem of the next nucleosome, see Figs. 8.33(b) and (c). The geometry of the fiber can be described by two angles (Schiessel et al., 2001): the deflection angle θ, see Fig. 8.33(b), and the rotational or dihedral angle φ, see Fig. 8.33(c). The latter is the angle by which a linker is rotated out of the plane that is defined by the two preceding linkers. Since the DNA double helix is attached to the histone octamer in a specific rotational and translational setting (the minor groove is attached to specific binding patches),
Figure 8.33 (a) A single nucleosome with stem induced by a linker histone. (b) Simplified geometry with a spherical nucleosome. π − θ, the angle supplementary to the deflection angle, is indicated. (c) Section of the two-angle fiber showing the definitions of the dihedral angle φ and of the linker length b.
the rotational angle is a function of the linker length. Adding one bp leads to a change in φ by about 36°. Since a minute change in the linker length has such a strong effect on φ, we can simplify our analysis by setting the linker length to a fixed value and then consider the geometry of fibers as a function of the two angles. Let us assume that these two angles are constant throughout the fiber, which is the case when the nucleosomes are equally spaced along the DNA; this assumption is based on the observation that the distribution of linker lengths is peaked around some typical values (Widom, 1992). Under this assumption we obtain regular fibers; some examples are presented in Fig. 8.34. An analytical description of the structures (first done in the context of helical polymers in 1961 (Miyazawa, 1961)) can be achieved by constructing a spiral, the master solenoid, of radius R and pitch angle ψ such that the spiral passes through all the nucleosomes. More precisely, the spiral goes through the points where the linker DNA enters the stem or, in other words, through the vertices of the angle π − θ. One such vertex is indicated in Fig. 8.33(b). The vertices are placed along the spiral in such a way that successive vertices have a fixed distance b from one another. In Appendix H we derive analytical expressions that relate pitch
angle $\psi$ and radius $R$ of the solenoid as well as $s_0$ (defined as the distance between successive vertices along the helical axis) to the pair of angles $\theta$, $\phi$ and linker length $b$. The corresponding relations $b = b(\psi, R, s_0)$, $\theta = \theta(\psi, R, s_0)$ and $\phi = \phi(\psi, R, s_0)$ are Eqs. H.6 to H.8. From these follow the reverse relations that allow one to calculate the overall fiber geometry from the local geometry. Specifically, the radius $R$ of the master solenoid is given by

$$R = \frac{b \sin(\theta/2)}{2 - 2\cos^2(\theta/2)\cos^2(\phi/2)} \tag{8.97}$$

and its pitch angle $\psi$ by

$$\cot\psi = \frac{\tan(\theta/2)\, \arccos\left( 2\cos^2(\theta/2)\cos^2(\phi/2) - 1 \right)}{2 \sin(\phi/2) \sqrt{1 - \cos^2(\theta/2)\cos^2(\phi/2)}}. \tag{8.98}$$

Finally, the distance $s_0$ of neighboring vertices along the fiber axis is obtained from

$$s_0 = \frac{b \sin(\phi/2)}{\sqrt{\sec^2(\theta/2) - \cos^2(\phi/2)}}. \tag{8.99}$$

Since the stems with the nucleosomes point radially outward from the master solenoid, the actual chromatin fiber has a diameter given by $D_{\text{fiber}} = 2R + 2r_{\text{stem}} + 2D_{\text{nucl}}$ where $r_{\text{stem}}$ is the stem length and $D_{\text{nucl}}$ the nucleosome diameter. One can get an overview of all the possible fiber geometries by looking at the $(\theta,\phi)$-plane, Fig. 8.34. Depicted are example configurations of two-angle fibers with their location in the $(\theta,\phi)$-plane indicated. At the edges of the diagram where one of the angles is zero, the configurations are always planar (more precisely, the linker DNA lies in a plane, the nucleosomes might point out of that plane). On the line $\phi = 0$ one obtains circles (e.g., "A"), convex polygons (e.g., a square "B" and a triangle "C") and star-type polygons that are closed for special values of $\theta$, e.g., the regular pentagram "D." Planar zig-zag fibers are found for $\phi = \pi$ ("E," "F" and "G"). If one moves away from those boundaries circles evolve into solenoids, "I," star-shaped polygons into crossed-linker structures, "J," and zig-zag fibers into two-start helices, "K." For the fibers depicted in Fig. 8.34, we made the following choices for the various lengths: a linker length of $b = 7$ nm, a stem length of 2 nm and a diameter of the nucleosome sphere of 10 nm. These
Figure 8.34 (θ, φ)-plane of the two-angle model together with example configurations. The white part of the plane contains the area of allowed fibers where the nucleosomes (assumed here to be spherical) do not overlap. The colored areas constitute forbidden regions. The number at each region indicates how far one has to go along the fiber from a given nucleosome before it collides with another nucleosome. For instance, in the dark blue region “3” nucleosome i would overlap with nucleosome i + 3, as it is the case for structure “C.” The specific values of θ and φ of the example fibers are as follows: (π/5, 0) for “A,” (π/2, 0) for “B,” (2π/3, 0) for “C,” (4π/5, 0) for “D,” (0, π ) for “E,” (π/4, π ) for “F,” (1.55, π ) for “G,” (0, 1.07) for “H,” (π/5, 0.2) for “I,” (2.52, 1.33) for “J,” (1.99, 2.50) for “K,” and (π/2, π/2) for “L.”
numbers are chosen to mimic the 210 bp nucleosomal repeat length of chicken erythrocyte (red blood cell) chromatin. We assume that about 190 bp are associated with the nucleosome plus stem, leaving about 20 bp for the linker, which amounts to about 7 nm. Large sections of the (θ,φ)-plane are forbidden because nucleosomes would overlap. For simplicity we assume here the nucleosomes to be spherical (for a (θ,φ)-diagram with cylindrical particles see Ref. (Diesinger and Heermann, 2008)). This simplification might appear crude but is actually sufficient since, as we shall argue later, the two-angle model is not very useful to describe dense fibers. The forbidden regions are shown in various colors in Fig. 8.34. Each color corresponds to a number also displayed in the figure; it indicates how many nucleosomes one has to go beyond a given nucleosome before a steric clash occurs. The bumps in the intricate boundary between allowed and forbidden geometries toward the bottom of the diagram reflect commensurable angles where nucleosomes of the next round sit on top of nucleosomes of the previous round. The blue region "2" to the right is related to the overlap of the nucleosomes with their next nearest neighbors. Various example configurations ("G," "H," "J" and "K") are placed precisely on the boundary between the allowed and forbidden region. In these fibers each nucleosome touches at least two other nucleosomes. The crossed-linker geometry "J" sits on the boundary at the unique point where nucleosome i touches two nucleosomes on the next round, namely i + 5 and i + 7. In addition it is very close to its next-nearest neighbor, i + 2. Of all the fibers in the allowed region, this is the one with the largest three-dimensional density $1/(\pi s_0 D_{\text{fiber}}^2)$, with a value close to that of real fibers. The fiber also happens to have a diameter of $D_{\text{fiber}} \approx 31$ nm, a value very close to actual fiber diameters. Does this structure reflect the geometry of real fibers at physiological salt concentrations? Electron micrographs taken from fibers at various salt concentrations show that the deflection angle θ is a function of the salt concentration (Bednar et al., 1998), namely θ ≈ 95° at 5 mM, θ ≈ 135° at 15 mM and θ ≈ 146° at 40 mM.
This change in θ might reflect the increase in electrostatic screening leading to a smaller and smaller repulsion between the in- and outgoing DNA linkers. Since the other angle, φ, is fixed by the linker length, adding salt corresponds to a movement toward the right in the (θ,φ)-plane. Even though the highest ionic strength is still less than half of that under physiological conditions, the corresponding angle is already quite close to θ ≈ 144°, the value where we find our densest structure. This again suggests structure "J" as a candidate for dense chromatin fibers. We will come back to this point later, but first consider the
response of two-angle fibers and of real chromatin fibers to external forces. Up to now we discussed purely geometrical properties of the two-angle model. It is also possible to calculate its mechanical properties. The structure is held together by linker DNA which is bendable and twistable, see Eq. 4.15. Two-angle fibers can be mechanically deformed by bending and twisting their linker DNA. Because the DNA linkers in such fibers typically follow an intricate three-dimensional path, the mechanical properties of two-angle fibers and in fact of chromatin fibers differ dramatically from those of bare DNA. We focus here on the stretching modulus of such fibers, which describes how the contour length is stretched under an external tension. The stretching modulus $\gamma$ is defined by

$$f = k_B T\, \gamma\, \frac{\Delta L}{L_0} \tag{8.100}$$

where $L_0$ denotes the contour length of the undeformed fiber (taken along the axis of the master solenoid) and $L_0 + \Delta L$ that of the fiber under tension $f$. The stretching modulus therefore relates the relative extension $\Delta L/L_0$ to the applied force. Of course, such a linear relationship only applies to tensions that are not too large. It is indeed possible to work out the elastic properties of two-angle fibers on purely analytical grounds based on equilibrium conditions of the WLCs that make up the DNA linkers (Ben-Haïm et al., 2001). For the stretching modulus one finds

$$\gamma = \frac{3A}{k_B T}\, \frac{s_0}{b R^2}\, \frac{C + S \left( s_0/b \right)^2}{\cos^2\left( \frac{\alpha s_0}{2R} \right) \left[ 3A + \tan^2\left( \frac{\alpha s_0}{2R} \right) \left( C + S \left( s_0/b \right)^2 \right) \right]}. \tag{8.101}$$

Here $A$ and $C$ are the bending and twist modulus, see Eq. 4.15, and $S = A - C$. Since we know $R$, $\alpha = \cot\psi$ and $s_0$ as functions of $\theta$ and $\phi$, we can also write $\gamma$ as a function of these angles. We refrain from displaying this formula explicitly here because it is very long. We mention that from the fact that both $A$ and $C$ appear in Eq. 8.101, it follows that two-angle fibers stretch by a combination of bending and twisting the linker DNA. In the following we calculate $\gamma$ for a special case, the zig-zag fibers where $\phi = \pi$, see structures "E," "F" and "G" in Fig. 8.34. An
undeformed zig-zag fiber is shown in the top of Fig. 8.35(a). The stretching of the fiber is achieved by a bending (but no twisting) of the linkers with the entry-exit angle θ remaining constant, see bottom of Fig. 8.35(a). This leads to a deformation where the tangent vectors at the two ends of a DNA linker remain parallel but undergo a lateral displacement, see inset of Fig. 8.35(a). Since we are interested in the linear regime that is valid for small deformations, we do not have to solve this geometry for the general Euler elasticas discussed in Appendix D. We assume a displacement $u(s)$ from the straight configuration with $u(s) \ll b$ for all $s$, $0 \le s \le b$. The bending energy $E_{\rm linker} = (A/2)\int_0^b [u''(s)]^2\,ds$ needs to be minimized, which leads to the Euler-Lagrange equation $d^4u/ds^4 = 0$. The boundary conditions that must be obeyed by the solutions are $u(0) = u'(0) = u'(b) = 0$ and $u(b) = d$, where $d$ describes the displacement of the linker perpendicular to the original straight linker, see inset of Fig. 8.35(a). We neglected here terms of the order $(d/b)^2$. It follows that the deformation profile is given by $u(s) = -2ds^3/b^3 + 3ds^2/b^2$. The associated bending energy per linker is $E_{\rm linker} = 6Ad^2/b^3$. The deformation translates into an effective change in the deflection angle from θ to θ − Δθ where Δθ/2 = d/b, see inset of Fig. 8.35(a). The energy of a zig-zag fiber with N linkers as a function of Δθ is thus given by $E = (3/2)(A/b)(\Delta\theta)^2 N$. The change in θ leads to a change in the overall length of the zig-zag fiber from

$$L_0 = N s_0 = bN\cos(\theta/2) \tag{8.102}$$

to

$$L = bN\cos\left(\frac{\theta - \Delta\theta}{2}\right) \approx L_0 + bN\sin(\theta/2)\,\frac{\Delta\theta}{2} \tag{8.103}$$

where we used the condition $d \ll b$ again. The energy can be rewritten in terms of the extension $\Delta L = L - L_0$. The restoring force is then given by $f = dE/d\Delta L$:

$$f = \frac{12A}{N b^3 \sin^2(\theta/2)}\,\Delta L. \tag{8.104}$$

The associated stretching modulus (defined in Eq. 8.100) follows from Eqs. 8.102 and 8.104:

$$\gamma = \frac{12A\cos(\theta/2)}{k_B T\, b^2 \sin^2(\theta/2)}. \tag{8.105}$$
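The zig-zag result can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the bending energy of the cubic profile $u(s)$ with a simple midpoint rule and evaluates Eq. 8.105. The function names and the value $k_BT \approx 4.1$ pN nm are assumptions made for illustration; the parameter values $A/k_BT = 50$ nm, b = 7 nm and θ = 95° are the ones used elsewhere in this section.

```python
import math

KBT_PN_NM = 4.1  # assumed thermal energy at room temperature, in pN nm


def linker_bending_energy(A, b, d, n=10000):
    """Midpoint-rule integral of (A/2) u''(s)^2 over 0 <= s <= b
    for the cubic profile u(s) = -2 d s^3/b^3 + 3 d s^2/b^2."""
    ds = b / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        u_pp = -12.0 * d * s / b**3 + 6.0 * d / b**2  # second derivative u''(s)
        total += u_pp**2 * ds
    return 0.5 * A * total


def zigzag_gamma(lp_nm, b_nm, theta_rad):
    """Stretching modulus of Eq. 8.105 in nm^-1, with lp = A/(kB T)."""
    half = theta_rad / 2.0
    return 12.0 * lp_nm * math.cos(half) / (b_nm**2 * math.sin(half) ** 2)


# Compare the numerical integral with the closed form 6 A d^2 / b^3
# (A in units of kB T nm, lengths in nm; d is an arbitrary small displacement).
A, b, d = 50.0, 7.0, 0.1
numeric = linker_bending_energy(A, b, d)
closed_form = 6.0 * A * d**2 / b**3

gamma = zigzag_gamma(lp_nm=50.0, b_nm=7.0, theta_rad=math.radians(95.0))
print(f"E_linker numeric / closed form: {numeric / closed_form:.6f}")
print(f"gamma = {gamma:.1f} nm^-1, kB T gamma = {KBT_PN_NM * gamma:.0f} pN")
```

Note that this evaluates the zig-zag special case φ = π only. Fibers with other rotation angles also deform by twisting the linkers and require the general formula, Eq. 8.101.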
Chromatin Fibers 385
[Figure 8.35, panels (a) to (c). (a) Sketch of a zig-zag fiber with labels L0, ΔL, π − θ, +f, −f, b and d; the inset states Δθ/2 = d/b. (b) Force f [pN] versus extension [μm] at 5 mM NaCl. (c) Force f [pN] versus extension [μm] at 40 mM NaCl, with a plateau marked.]
Figure 8.35 (a) The zig-zag fiber: free (top) and under tension (bottom). (b) Force-extension relation measured on a chromatin fiber extracted from chicken erythrocyte chromatin at 5 mM NaCl. (c) Same as (b) but at 40 mM NaCl. (b) and (c) are adapted from (Cui and Bustamante, 2000); see text for details.
You can check for yourself that the general formula Eq. 8.101 reduces to Eq. 8.105 for φ = π. The stretching modulus of zig-zag fibers thus scales, up to a geometrical factor, as $l_P/b^2$. For a 7 nm linker length this ratio is $50\,{\rm nm}/(7\,{\rm nm})^2 \approx 1\,{\rm nm}^{-1}$. Note, however, that γ can in principle have any value, from γ = ∞ for θ = 0 (reflecting the inextensibility of the DNA linkers) down to γ = 0 for θ = π. The latter case, however, is sterically forbidden since it would lead to overlapping nucleosomes.

We now take a closer look at force-extension curves of real chromatin fibers. Figures 8.35(b) and 8.35(c) show results from the first chromatin fiber stretching experiment with fibers extracted from chicken erythrocytes (Cui and Bustamante, 2000). A force-extension plot taken for low salt concentrations, 5 mM NaCl, is shown in Fig. 8.35(b). The authors of this work argued that the contour length of the unstretched fiber is about 1 μm. In contrast to naked DNA, the force does not go to infinity once the shape
fluctuations have been stretched out (see Fig. 4.33 for comparison). Instead, the force increases linearly with extension with a stretching modulus $k_B T \gamma \approx 5$ pN (see the dashed line that intercepts the X axis at 1 μm) before non-linear effects lead to a steepening of the curve. We now compare this plot to the predictions of the two-angle model, especially to Eq. 8.101. As argued above, we have b = 7 nm and θ ≈ 95°. From the statistical distribution of nucleosome repeat lengths it was found that linker lengths equal to 10k + 1 bp with k a positive integer are preferred (Widom, 1992). This, in turn, indicates that the rotation angle φ corresponds to a change in helical pitch associated with 1 bp, i.e., 360°/10 = 36°. We therefore set φ = 36°. Such a fiber is shown in Fig. 8.35(b). If we also choose $A/k_B T = 50$ nm and $C/k_B T = 80$ nm, we find from Eq. 8.101 $k_B T \gamma = 6.3$ pN. This fits the data very nicely, see the slope of the red line in Fig. 8.35(b). However, it is important to recognize that this does not automatically mean that we understand the mechanical properties of chromatin fibers at low salt concentrations with such high precision. For example, the experimentalists showed that the DNA in the fiber was 20 μm long. This together with the 210 bp repeat length leads to about 280 nucleosomes in the fiber. If we choose the above given values for b, θ and φ, we find a total fiber length of around 0.5 μm, about half the value that was deduced from Fig. 8.35(b). One could, of course, argue that this difference is caused by about 10 missing nucleosomes, which would free roughly 10 × 50 nm. But according to Eq. 8.100 this would also mean that the γ-value of a fiber without missing nucleosomes would be about half as big. Overall, however, the predictions of the two-angle model are quite satisfactory, especially given that real fibers are not completely regular because their linker length varies.
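The bookkeeping in this paragraph can be reproduced in a few lines. This is only an illustrative sketch: the rise of 0.34 nm per base pair and the 147 bp of DNA wrapped in a core particle are assumed standard values that are not stated in this passage, and the variable names are hypothetical.

```python
BP_RISE_NM = 0.34    # assumed rise per base pair of B-DNA, in nm
CORE_DNA_BP = 147    # assumed length of DNA wrapped in one core particle, in bp

dna_length_nm = 20_000.0       # 20 um of DNA in the fiber (from the text)
repeat_length_bp = 210         # nucleosomal repeat length (from the text)
predicted_length_nm = 500.0    # two-angle model prediction (from the text)
observed_length_nm = 1000.0    # contour length read off Fig. 8.35(b)

# Number of nucleosomes carried by 20 um of DNA with a 210 bp repeat.
n_nucleosomes = dna_length_nm / BP_RISE_NM / repeat_length_bp

# Contour length released when one nucleosome is missing.
freed_per_nucleosome_nm = CORE_DNA_BP * BP_RISE_NM

# Fiber length if 10 nucleosomes are missing and their DNA is stretched out.
length_with_gaps_nm = predicted_length_nm + 10 * freed_per_nucleosome_nm

print(f"nucleosomes in the fiber: {n_nucleosomes:.0f}")
print(f"DNA freed per missing nucleosome: {freed_per_nucleosome_nm:.0f} nm")
print(f"length with 10 missing nucleosomes: {length_with_gaps_nm:.0f} nm "
      f"(observed: {observed_length_nm:.0f} nm)")
```

Under these assumptions ten missing nucleosomes are indeed enough to account for the missing half micrometer.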
A systematic analytical study on the effects of linker length heterogeneity has been published only recently and shows how dramatically this heterogeneity can influence the physical properties of the two-angle fiber, specifically with respect to the effective persistence length (Beltran et al., 2019). One might speculate that the fiber acts as a safety cushion for the DNA during large scale rearrangements inside the nucleus as they
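[Figure 8.36, panels (a) and (b). (a) Potential U versus distance x for the three cases "1" to "3", with Dnucl marked on the x axis, the labels C and S, and the common tangent of slope fCS. (b) Force f versus length L with the regimes elasticity (C), coexistence at f = fCS, and elasticity (S).]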
Figure 8.36 (a) Three examples of internucleosomal interaction potentials between two nucleosomes inside a fiber and (b) resulting force-extension curve, see text for details.
might occur, for example, during cell division. Due to the softness of the fibers there will only be small forces on the DNA.

Up to now we have neglected the influence of nucleosome-nucleosome interactions. When one goes to higher salt concentrations, chromatin fibers become denser and nucleosomes might start to interact. Let us assume that nucleosomes attract. Let us further assume that the undeformed DNA-linker backbone would lead to a structure where nucleosomes would not touch each other. A simple example is the zig-zag fiber in Fig. 8.35(a). As the linker DNA is bendable, the zig-zag fiber can form a denser zig-zag fiber with the nucleosomes in contact, if the nucleosomal attraction is strong enough. The linker DNA backbone would then be under an internal tension. By applying a force, one can decondense this dense structure. This is shown in Fig. 8.36(a) where the interaction potential U(x) between nucleosomes (say between number i and i + 2 in a zig-zag fiber like in Fig. 8.35(a)) is sketched as a function of their center-to-center distance x. The potential has two contributions, the elastic energy with a preferred distance at a larger x-value and a short-ranged attractive term that kicks in for nucleosomes close to contact, x ≈ Dnucl. Curves “1” to “3” show cases of different relative importance of the two contributions. For curve “1” the elastic energy dominates, whereas for curve “3” the condensed state corresponds to the global minimum. Curve “2” is just at the border between these two cases.
Let us now focus on a condensed fiber (case “3”). With an external force one can decondense such a fiber. The critical force fCS to do this follows from the common tangent construction, namely the slope of the common tangent to U(x) is fCS, see Fig. 8.36(a). Figure 8.36(b) shows the force-extension curve for this case. There are three regimes: “hard elasticity” where one stretches the fiber maintaining the nucleosomal contacts, a coexistence plateau between condensed and open fibers (similar to the horizontal lines in Figs. 2.11 and 3.14) and a soft-elasticity part that corresponds to the case discussed above. Stretching experiments on chromatin at salt concentrations closer to physiological values (40 mM as compared to physiological 100 mM) indeed show a hint of a plateau, see Fig. 8.35(c). This suggests that such fibers are normally condensed and that they can be decondensed by an external force fCS of about 5 pN. From the extent of the plateau, about 0.6 μm, its height, about 5 pN, and the number of nucleosomes in the stretched fiber, about 280, one can estimate an attractive energy of 0.6 μm × 5 pN/280 ≈ 3 $k_B T$ per nucleosome.
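The estimate of the attraction energy is the work done along the plateau divided by the number of nucleosomes. A short sketch of this arithmetic follows; the value $k_BT \approx 4.1$ pN nm and the function name are assumptions made for illustration.

```python
KBT_PN_NM = 4.1  # assumed thermal energy at room temperature, in pN nm


def attraction_per_nucleosome(plateau_length_nm, plateau_force_pn, n_nucleosomes):
    """Work done along the plateau (force x length), shared equally
    among the nucleosomes and expressed in units of kB T."""
    work_pn_nm = plateau_length_nm * plateau_force_pn
    return work_pn_nm / n_nucleosomes / KBT_PN_NM


# Plateau extent 0.6 um = 600 nm, height 5 pN, about 280 nucleosomes.
energy_kbt = attraction_per_nucleosome(600.0, 5.0, 280)
print(f"attractive energy per nucleosome: {energy_kbt:.1f} kB T")
```

The result is slightly below the rounded value of 3 $k_B T$ quoted above, which is well within the uncertainty of reading the plateau off the curve.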
8.4.2 Solenoid-Type Models

So far we have assumed that the geometry of the chromatin fiber is controlled by the geometry of the underlying linker DNA backbone. This leads to a satisfactory description of the geometrical and mechanical properties of chromatin fibers at low salt concentrations. We have also seen that when we move toward more physiological conditions, fibers get denser and force-extension curves show indications of an attraction between nucleosomes, see Fig. 8.35(c). This raises the question whether fibers might undergo a transition to an altogether different geometry when one approaches physiological salt concentrations. Experiments on solutions of nucleosome core particles indicate that such particles show a maximum mutual attraction around physiological ionic conditions (Mangenot et al., 2002), which manifests itself in a steep drop of the second virial coefficient around 100 mM salt. There is therefore the possibility that the nucleosomal attraction will override the linker DNA elasticity, resulting in structures in which the nucleosomes are arranged to optimize their interactions. This
[Figure 8.37 plot: fiber diameter Dfiber [nm] versus repeat length [bp], for repeat lengths from 187 bp to 237 bp.]
Figure 8.37 Chromatin fiber diameter as a function of the nucleosomal repeat length for reconstituted fibers (Robinson et al., 2006). Note that the fiber diameter stays constant for every three repeat lengths. The dashed lines indicate Dfiber = 33 nm and Dfiber = 44 nm respectively.
can be described by a class of fiber models where the nucleosomes dictate the geometry. An example of such a model is the classical solenoid model displayed on the lhs of Fig. 8.32. We now discuss an experiment which showed very clearly that, at least under the conditions of this experiment, the chromatin fiber geometry is governed by the arrangement of the nucleosomes, while the DNA linkers play only a minor role (Robinson et al., 2006). In this experiment between about 50 and 70 nucleosomes were assembled on DNA templates that contained equally spaced nucleosome positioning sequences. Electron micrographs of the resulting fibers at physiological ionic conditions made it possible to determine their diameters and lengths. This was done for 6 different nucleosomal repeat lengths from 187 bp up to 237 bp in steps of 10 bp. Surprisingly, the fiber diameter stayed constant over a wide range of repeat lengths, see Fig. 8.37: fibers with repeat lengths 187 bp, 197 bp and 207 bp have a diameter of 33 nm, whereas fibers with 217 bp, 227 bp and 237 bp repeats are 44 nm thick. This result was quite spectacular, as it clearly showed for the first time that fibers have bent linkers under certain conditions; otherwise the diameter would depend on the linker length, see Eq. 8.97. This suggests that it is the geometrical arrangement of the nucleosomes that determines the geometry of these fibers.
Figure 8.38 In a five-ribbon fiber five stacks of nucleosomes twirl around each other (lhs). To map the fiber into two dimensions an imaginary cylinder is constructed that goes through the centers of the nucleosomes (lhs and middle). The cylinder is cut open and rolled out. The nucleosomes are now transformed into 5 stacks of rectangles (rhs).
Let us assume that the geometry results from the attraction between nucleosomes and that nucleosomes pack as closely as possible. In (Depken and Schiessel, 2009) all possible dense packings of nucleosomes were characterized. A dense packing is achieved by stacking nucleosomes on top of each other and having one or several of those stacks twirl around each other, see lhs of Fig. 8.38 for a fiber made from 5 stacks. The nucleosomes are connected by one DNA chain and typically the DNA linkers have to bend to make this possible. The following discussion does not depend on the precise way in which nucleosomes are connected. We will come back to the DNA linkers at the end of this section. To proceed with the calculation, we place an imaginary cylinder through the centers of nucleosomes (lhs of Fig. 8.38). The cross sections of the nucleosomes with this cylinder are shown in the middle of Fig. 8.38. Finally we cut open the cylinder and roll it out, see rhs of Fig. 8.38. The stacks can now be seen as ribbons made from stacked rectangles that represent crosscuts through the nucleosomes. The rectangles have a diameter Dnucl = 11.5 nm and a height Hnucl = 6.0 nm, the known dimensions of the repeat unit of densely packed nucleosome core particles (Mangenot et al., 2003). The rolled out cylinder has a width π(Dfiber − Dnucl). This immediately implies that the nucleosome line density σ along the
[Figure 8.39 plot: nucleosome line density σ [nm⁻¹] versus Dfiber [nm], with data points labeled by repeat length (187 bp, 197 bp, 207 bp, 217 bp, 227 bp, 237 bp).]
Figure 8.39 Nucleosome line density σ versus chromatin fiber diameter. Comparison between experiment (Robinson et al., 2006) and theory, Eq. 8.106. The agreement between data and theory shows that nucleosomes are densely packed.
fiber is given by

$$\sigma = \frac{\pi (D_{\rm fiber} - D_{\rm nucl})}{D_{\rm nucl} H_{\rm nucl}}. \tag{8.106}$$

Note that this result is independent of the number of stacks that form the fiber. Equation 8.106 can be checked against the experiment (Robinson et al., 2006) since the lengths and diameters of fibers with a known number of nucleosomes were measured. In Fig. 8.39 the prediction of the model, Eq. 8.106, is compared to the experimental data. The agreement between the data and the model suggests that these fibers actually consist of tightly packed nucleosomes. The model suggests that dense arrangements can be achieved for any fiber diameter but the data suggest that there are two preferred “magical” diameters, namely 33 nm and 44 nm. Where could such a preference come from? In Fig. 8.40 we display again a five-ribbon fiber, this time highlighting two stacked nucleosomes. As you can see in the close-up to the right, there is a problem with the assumption of densely stacked nucleosomes that we overlooked when we first showed the 2D rollout of the fiber on the rhs of Fig. 8.38. As the nucleosomal stacks have to twirl around each other, there needs to
Figure 8.40 In a five-ribbon fiber, two nucleosomes stacked on top of each other are highlighted in red. A close-up shows their geometry and indicates the splay angle θ between them.
be a non-vanishing splay angle between them. As a result, there is a gap between the cylinders (representing the nucleosomes) at the outside of the fiber and an overlap at the inside. Let us ignore this problem first and instead calculate the splay angle θ as a function of the fiber diameter and of the number of ribbons in the fiber. A straightforward geometric analysis (see Appendix H) allows us to calculate this quantity:
$$\theta \approx \frac{2H_{\rm nucl}}{D_{\rm fiber} - D_{\rm nucl}}\left[1 - \left(\frac{D_{\rm nucl} N_{\rm rib}}{\pi (D_{\rm fiber} - D_{\rm nucl})}\right)^2\right]. \tag{8.107}$$

In Fig. 8.41(a) we plot the splay angle θ as a function of Dfiber for six-ribbon fibers up to a maximal diameter of 100 nm. The thinnest six-ribbon fiber with Dfiber = 33.5 nm is found for a vanishing splay angle, θ = 0, where the stacks are parallel to the fiber axis. With increasing diameter the splay angle first increases and then decreases. The decrease is related to the formation of very wide fibers with Dfiber ≫ Dnucl. As one round of ribbon is made of many nucleosomes, one has very small θ-values. In Fig. 8.41(b) θ is plotted again as a function of the fiber diameter but this time for 10 different numbers of ribbons, from Nrib = 1 until Nrib = 10. The general shape of the curves is the same. With increasing value of Nrib, the maximal splay angle decreases and is reached at a larger diameter. Since its beginnings in the late 1970s, the building of chromatin fiber models has typically involved placing the nucleosomes in such a way that the desired fiber diameter is ensured. Following this tradition, we have put in Fig. 8.41(b) a vertical line Dfiber = 33 nm through the θ versus Dfiber curves. We find five points of intersection
Figure 8.41 (a) Splay angle θ between stacked nucleosomes as a function of the diameter of six-ribbon fibers together with example configurations. (b) Splay angle versus fiber diameter for fibers made of one up to 10 ribbons. The intersections between the red vertical line at 33 nm diameter and the curves correspond to 5 possible geometries displayed to the left.
between this line and the curves, which correspond to five different possible dense packings of nucleosomes inside a 33 nm wide fiber, namely a one-ribbon, a two-ribbon, a three-ribbon, a four-ribbon and a five-ribbon fiber. Not surprisingly, all these geometries have been proposed in the literature. The one-ribbon fiber is nothing but the classical solenoid model (Finch and Klug, 1976) that we already displayed on the lhs of Fig. 8.32. The two-ribbon fiber is favored in (Dorigo et al., 2004), a three-ribbon fiber has been proposed in (Makarov et al., 1985) and four- and five-ribbon fibers are displayed in (Daban and Bermúdez, 1998). However, these models cannot explain why only chromatin fibers with a fiber diameter of 33 nm (or 44 nm) are found in the experiments.

Let us come back to the point where we cheated before, namely the problem of the non-zero splay angle between stacked nucleosomes which seems to entail a steric clash, see Fig. 8.40. The point is that there is actually not really a problem. Nucleosomes are not cylinders with parallel top and bottom surfaces. Look again at the crystal structure of the nucleosome core particle, Fig. 1.8. As you can see from the side view on the rhs, nucleosomes have the shape of a wedge. The thinner half is on the top where the DNA enters and leaves the complex, i.e., at the inner side of the fiber. This suggests that stacked nucleosomes prefer to have a non-zero splay angle. But what exactly is the preferred value? We could try to guess the angle by putting a ruler at the crystal structure. But of course it is much better to let the nucleosomes decide for themselves what their preferred splay is. In fact, (Dubochet and Noll, 1978) presented electron micrographs of stacked nucleosome core particles. It was found that core particles tend to stack into arcs with an 8° splay angle. In Fig. 8.42 we show again the plot of θ versus Dfiber, the same plot as in Fig. 8.41(b).
But this time we put a horizontal line at 8° through the plot. The intersections between the curves and the line correspond to dense packings of 8° wedges. Besides very wide fibers that we do not consider in the following there are 4 possible geometries: a five-ribbon fiber with 33 nm diameter, a six-ribbon fiber with 38 nm diameter, a seven-ribbon fiber with 44 nm diameter and an eight-ribbon fiber with 52 nm diameter. This result is remarkable. We now have a natural explanation of why chromatin fibers have a finite set of preferred diameters rather
Figure 8.42 Splay angle versus fiber diameter for fibers between 1 and 10 ribbons (same as Fig. 8.41(b)). A horizontal blue line indicates the preferred 8° splay angle between nucleosomes. The intersections with the curves give all the possible dense packings of 8° wedges. Two of those geometries, the five-ribbon and the seven-ribbon fiber, have diameters that coincide with the ones experimentally observed in (Robinson et al., 2006), namely 33 nm and 44 nm.
than a continuous range of values. Even better, two values coincide with measured values, suggesting that the 33 nm wide fibers found in (Robinson et al., 2006) are five-ribbon fibers and the 44 nm wide fibers have seven ribbons. However, current experimental methods do not allow us to check whether this is pure coincidence. In fact, some experiments (albeit with shorter fibers) give strong indications for other ribbon numbers, e.g., a stretching experiment (Kruithof et al., 2009) suggests a one-ribbon fiber (also called one-start helix) and a cross-linking experiment supports the two-ribbon fiber (also called two-start helix) (Dorigo et al., 2004). But I hope to have demonstrated how simple geometrical arguments can be used to categorize the plethora of fiber models and to come to quantitative predictions.
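The diameters quoted above can be recovered numerically from Eqs. 8.106 and 8.107. The sketch below assumes that Eq. 8.107 has the form θ ≈ 2Hnucl/(Dfiber − Dnucl) × [1 − (Dnucl Nrib/(π(Dfiber − Dnucl)))²]. It searches only the rising branch of each curve (between θ = 0 and the maximum of θ), so the very wide fibers on the falling branch are left out, and it follows the text in looking at five to eight ribbons only. The function names and the bisection approach are illustrative choices, not taken from (Depken and Schiessel, 2009).

```python
import math

D_NUCL = 11.5  # nucleosome diameter in nm (from the text)
H_NUCL = 6.0   # nucleosome height in nm (from the text)


def line_density(d_fiber):
    """Nucleosome line density of Eq. 8.106, in nm^-1."""
    return math.pi * (d_fiber - D_NUCL) / (D_NUCL * H_NUCL)


def splay_angle(d_fiber, n_rib):
    """Splay angle of Eq. 8.107 (assumed form), in radians."""
    x = d_fiber - D_NUCL
    c = D_NUCL * n_rib / (math.pi * x)
    return 2.0 * H_NUCL / x * (1.0 - c**2)


def diameter_on_rising_branch(n_rib, theta, tol=1e-9):
    """Bisection for the diameter with splay angle theta on the rising
    branch, i.e., between theta = 0 and the maximum of the curve.
    Returns None if the curve never reaches theta."""
    x_zero = D_NUCL * n_rib / math.pi  # D_fiber - D_nucl at theta = 0
    lo, hi = x_zero, math.sqrt(3.0) * x_zero  # maximum sits at sqrt(3) x_zero
    if splay_angle(hi + D_NUCL, n_rib) < theta:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if splay_angle(mid + D_NUCL, n_rib) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) + D_NUCL


theta_pref = math.radians(8.0)  # preferred splay angle (Dubochet and Noll, 1978)
diameters = {n: diameter_on_rising_branch(n, theta_pref) for n in range(5, 9)}
for n_rib, d_fiber in diameters.items():
    print(f"{n_rib} ribbons: D_fiber = {d_fiber:.1f} nm, "
          f"sigma = {line_density(d_fiber):.2f} nm^-1")
```

Within about half a nanometer the four diameters agree with the values 33 nm, 38 nm, 44 nm and 52 nm given in the text.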
Figure 8.43 Possible regular connections of five-, six-, seven- and eight-ribbon fibers. For simplicity, we show here fibers with vanishing splay angle θ. Note also that after having connected to all Nrib nucleosomes, the next linker connects to a nucleosome stacked on top of the starting nucleosome; this nucleosome is symbolically highlighted by a thicker edge. Each connection is a solution to Eq. 8.108. For instance, the fiber with Nrib = 7 and Nstep = 3 corresponds to k = 5 and n = 2.
Finally, let us address the problem of how to connect the nucleosomes by the DNA linkers. Let us find all possible ways in which the linker backbone connects the nucleosomes in an identical fashion from nucleosome to nucleosome. Denote by Nstep the distance across ribbons between connected nucleosomes; Nstep = 1 means connections between neighboring ribbons. The necessary and sufficient condition for a regular backbone winding is the existence of two integers n and k with 0 ≤ n ≤ k ≤ Nrib such that

$$k N_{\rm step} = n N_{\rm rib} + 1, \tag{8.108}$$
see (Depken and Schiessel, 2009) for a proof. Here n gives the number of times the backbone passes the nucleosome from which we started before it connects to the nucleosome in the ribbon next to it. In total, the backbone has then passed nNrib + 1 nucleosomes (with more than one passage for a given nucleosome allowed). k denotes the number of linkers that were needed to achieve this. Since Nstep nucleosomes are passed at each step, the total number
of nucleosomes must also equal kNstep. Hence for each possible solution of Eq. 8.108, there is a way to connect two nucleosomes in neighboring ribbons either directly or via other nucleosomes. Since all nucleosomes are connected in an identical fashion, we then automatically know that all nucleosomes are connected. All the possible backbone connections for five-, six-, seven- and eight-ribbon fibers are displayed in Fig. 8.43. Obviously one can always choose to connect to the neighboring ribbons corresponding to Nstep = 1. This can also be seen from Eq. 8.108: For any value of Nrib, n = 0 is always a solution by setting k = Nstep = 1. For some fiber types, e.g., six-ribbon fibers, this is also the only possible way to connect nucleosomes. For the other three examples shown in Fig. 8.43, there are also solutions that resemble the crossed-linker geometries discussed for the two-angle model, see Fig. 8.34. However, the requirement for densely packed nucleosomes leads here to a finite set of fiber geometries, which typically entail a substantial bending of the DNA linkers. One might speculate that the fiber diameters observed in (Robinson et al., 2006) correspond to the geometries that minimize the linker bending energy for a given nucleosomal repeat length. The problem is not straightforward, however, as the energies associated with these linker lengths are typically extremely high. A possible way out of this problem was discussed in (Lanzani and Schiessel, 2012), where it has been suggested that the bending energy can be relieved by sliding the nucleosomal ribbons out-of-register, which leads to a prediction that is in good agreement with Fig. 8.37. As current experiments do not allow us to test these ideas, we will not discuss them further here. Instead we devote the last section of this chapter to the structure of whole chromosomes. Until recently, this was an even more speculative topic, but thanks to new experiments, our understanding has improved tremendously.
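The solutions of Eq. 8.108 are easy to enumerate by brute force. The following sketch lists, for a given number of ribbons, all triples (Nstep, k, n) that satisfy the equation with 0 ≤ n ≤ k ≤ Nrib. As an assumption made here, Nstep is restricted to values up to Nrib/2, treating a step of Nrib − Nstep ribbons as the mirror image of a step of Nstep ribbons; the function name is hypothetical.

```python
def regular_connections(n_rib):
    """Return all (n_step, k, n) with k * n_step == n * n_rib + 1 and
    0 <= n <= k <= n_rib, for n_step = 1 ... n_rib // 2 (assumed range)."""
    solutions = []
    for n_step in range(1, n_rib // 2 + 1):
        for k in range(1, n_rib + 1):
            for n in range(0, k + 1):
                if k * n_step == n * n_rib + 1:
                    solutions.append((n_step, k, n))
    return solutions


for n_rib in (5, 6, 7, 8):
    print(n_rib, regular_connections(n_rib))
```

With this restriction the six-ribbon fiber has only the neighbor connection Nstep = 1, while the seven-ribbon fiber also admits Nstep = 2 and Nstep = 3, the latter with k = 5 and n = 2 as stated in the caption of Fig. 8.43.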
8.5 Chromatin at Large Scales

A natural starting point for approaching the organization of chromatin at large scales is the question: In which polymeric state are DNA molecules inside the nucleus? Up until 10 years ago, most
scientists assumed that the DNA molecules are in one of the standard polymeric states, such as the ones discussed in Chapter 3. However, new experimental methods suggested in 2009 that DNA molecules are in an exotic state called the fractal globule. We discuss these developments in Subsection 8.5.1. This is followed by Subsection 8.5.2, which provides deeper insights into how this state occurs and why it is biologically important. An analogy to non-concatenated polymer rings or loops proves to be essential. We also learn that this is a central motif in various other contexts when organizing chromatin on larger length scales, because cells use dedicated
[Figure 8.44 content, as timeline, method and model: 1990s, FISH, polymer coil; 2009, Hi-C (1 Mb resolution), fractal globule; 2016, Hi-C (1 kb resolution), loopy globule.]
Figure 8.44 Paradigm shifts in large-scale chromatin organization. Fluorescence in situ hybridization experiments or FISH (Fig. 8.45) on chromosomes in the 1990s suggested that DNA conformations in interphase chromosomes (interphase is the resting phase between successive mitotic divisions of a cell) behave like random polymer coils at equilibrium. Chromosome conformation capture (Fig. 8.46), specifically Hi-C data at 1 Mb resolution, suggested in 2009 that the chromosomes are in a metastable polymer state, the fractal or crumpled globule. In addition, it mapped two subcompartments (indicated here by colors). More recently, Hi-C experiments at 1 kb resolution point toward a loopy globule state, a steady state maintained by the continuous action of molecular motors called loop extrusion complexes. New subcompartments have been identified (three of which are indicated here by colors).
motors, called loop extruders, to produce them. An overview of the progress in our understanding is shown in Fig. 8.44.
8.5.1 From Classical Polymers to Fractal Globules

We have seen in Fig. 1.6 that chromosomes live in their own territories. This is quite surprising since we know that polymers overlap strongly in semidilute solutions, see Fig. 3.16. Since the DNA molecules (or chromatin fibers) are quite long compared to the cell nucleus, we would have expected chromosomes to mix as well. First, let us boldly ignore this inconsistency and simply accept the existence of territories. Then what do we expect for the structure of a chromosome within its own territory? Many polymer models have been proposed over the years to predict the configurations of chromosomes (e.g., Hahnfeldt et al., 1993; Münkel and Langowski, 1998; Mateos-Langerak et al., 2009; Emanuel et al., 2009). We might expect that the chromosomes have conformations that resemble those of polymers in poor solvents, as only in this case are they compact enough to fit in their own territories. Based on the Flory theorem, we had argued that such a polymer has an internal structure in which short enough pieces of the chain follow a Gaussian chain behavior that levels off once the extension of the subchain is comparable to the overall size of the globule, see Eq. 3.56.

To test whether this idea works, we need an experimental method that can measure the spatial distance between pairs of monomers as a function of their chemical distance. Such a method exists and is called fluorescence in situ hybridization (or FISH for short). It is a fairly harsh method in which the cell must first be killed before any distances can be measured (therefore it is in situ and not in vivo). First, short single-stranded DNA pieces are synthesized that are coupled to a green or a red dye. These DNA fragments are complementary to the DNA at the locations that one wants to mark. The problem lies in the fact that DNA inside the cell is double-stranded but the DNA probes can only hybridize with single-stranded DNA.
The required DNA melting is achieved through heating, see also Section 4.4. To heat the cell without destroying its structure is tricky. For this reason, the cell is fixed with formaldehyde
[Figure 8.45, panels (a) and (b). (a) Sketch of two marked positions i and j at spatial distance Rij. (b) Plot of R²ij [μm] versus |i − j| [Mbp].]
Figure 8.45 (a) FISH is a method to determine the spatial distance as a function of the genomic distance (see text for details). (b) Result from a FISH measurement on human fibroblast cells (1 Mbp = $10^6$ bp) (Mateos-Langerak et al., 2009).
before the heating step, which forms chemical cross-links between different proteins as well as between proteins and DNA. After melting the DNA, the short DNA pieces that carry the dye hybridize at the chosen positions, see Fig. 8.45(a), and their distance can be measured by a fluorescence microscope and plotted as a function of their chemical distance. An example of such an experiment on human fibroblast cells is depicted in Fig. 8.45(b) (Mateos-Langerak et al., 2009). As you can see, the distance first grows and then levels off at about 2 Mbp. Each data point is an average over many pairs of bp positions and over many cells. Note that the bars for each point do not correspond to errors in the measurement but rather reflect large cell-to-cell variations in the DNA conformation. We can now try to check whether chromosomes actually show the statistics of a poor solvent chain. The curve through the data points is a fit using Eq. 3.56. The precise shape of the curve results from a calculation where one assumes the territory to have the shape of a box with dimensions 4 μm × 4 μm × 0.15 μm, comparable to those observed in human fibroblasts (Emanuel et al., 2009). Two further assumptions were made, namely a nucleosomal repeat length of 200 bp and a nucleosome line density of 0.7 nm$^{-1}$ (a value slightly below that of the superdense fibers discussed in the previous section). The remaining fit parameter was the step length of the random walk. A step length of 300 nm led to the best fit, corresponding to a persistence length of
[Figure 8.46, panels (a) and (b). (a) Steps shown: original conformation, crosslink DNA, cut with restriction enzyme, dilute, ligate open ends. (b) Log-log plot of the contact probability pc versus |i − j| [Mbp], with the annotation 1.080.]
Figure 8.46 (a) Some of the many steps in the chromosome conformation capture method. (b) Contact probability as a function of the genomic distance (Lieberman-Aiden et al., 2009).
150 nm, see Eq. 4.62. This might be a reasonable value as chromatin fibers are thought to be stiffer than bare DNA. Altogether the agreement between data and theory is satisfactory. However, one could easily come up with other chain statistics that also fit the data. What we need is another method that gives access to a different quantity.

Chromosome conformation capture (Lieberman-Aiden et al., 2009) is such a method. In fact, it has revolutionized our understanding of large scale chromatin architecture in the last decade. This method also takes place in vitro, see Fig. 8.46(a). First, the cell is cross-linked with formaldehyde. However, the function of cross-linking here is not to preserve the overall structure of the nucleus (as in FISH), but to link pieces of DNA that happen to be close together in space. This makes it possible to extract the contact probability of the DNA as a function of the chemical distance. To extract this quantity from the cross-linked chromosomes, a few more steps are required, see Fig. 8.46(a). First, the DNA is digested with a restriction enzyme. A large number of short DNA fragments are obtained which are chemically linked to each other. The solution of fragments is then diluted. In the next step, the four open ends of the two cross-linked fragments are ligated, i.e., joined to each other. This way one has chemically linked DNA pieces that were close together in space in their original setting. A few more steps and tricks, not discussed here any further, then allow one to extract the contact probability pc as a function of the genomic distance (Lieberman-Aiden et al., 2009).
What do we expect, based on our theoretical model from above, as the outcome of this experiment? Let us calculate the contact probability between two locations on the DNA (at bp positions i and j) that have a genomic distance |i − j|. Since the DNA (or chromatin fiber) performs a random walk for not too large distances, it takes up a volume that scales like $V \sim |i-j|^{3/2}$. What is the probability pc then that the two “monomers” are close together in this volume (close enough that they would be cross-linked)? The first monomer is allowed to be anywhere in the volume V but the second needs to be close to the first one. The contact probability is therefore given by $p_c \sim 1/V$. Assuming that the chromosome obeys poor solvent statistics, we expect

$$p_c \sim \begin{cases} 1/|i-j|^{3/2} & \text{for } |i-j| \text{ small} \\ \text{const} & \text{for } |i-j| \text{ large.} \end{cases} \tag{8.109}$$

The point at which the crossover occurs depends on the details of how the DNA is packaged into a string of nucleosomes and does not concern us here. Did the chromosome capture method find any indications of the exponent −3/2? In Fig. 8.46(b) we display the contact probability in human lymphoblasts averaged over the genome. Interestingly, there is no indication of a slope of −3/2, but in the range from 500 kbp to 7 Mbp, a slope of −1 is found. This finding is rather surprising. If we repeat our argument from before but modify it such that we obtain $p_c \sim 1/|i-j|$, we need to assume that the volume of the |i − j|-subchain scales like $V \sim |i-j|$. This suggests a mean-squared distance between monomers that scales like $\langle R_{ij}^2 \rangle \sim |i-j|^{2/3}$ or $R_{ij} \sim |i-j|^{1/3}$ in short. Remember the discussion in Section 3.8 below Eq. 3.55 where I had stressed that this scaling law does not make sense as it is based on the wrong idea that the scaling of the overall chain with its volume proportional to N (“dense” packing of N monomers) is also applicable to short chain pieces. According to the Flory theorem, one should expect Gaussian behavior instead.
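The scaling argument can be condensed into one rule: if the subchain size grows like $R \sim |i-j|^{\nu}$, its volume grows like $|i-j|^{3\nu}$ and the contact probability decays with exponent −3ν. The sketch below states this rule and, as an independent check for the Gaussian case, measures the log-log slope of the density with which the two ends of an ideal three-dimensional chain coincide. The function names are hypothetical and the Gaussian end-to-end distribution is the standard ideal-chain result, not something derived in this section.

```python
import math


def contact_exponent(nu):
    """Exponent of p_c ~ s^(-3 nu) that follows from p_c ~ 1/V
    with V ~ R^3 and R ~ s^nu, where s = |i - j|."""
    return -3.0 * nu


def gaussian_contact_density(s, step=1.0):
    """Probability density for the two ends of an ideal chain of s steps
    (three dimensions, mean-squared end-to-end distance s * step^2)
    to sit at the same point."""
    return (3.0 / (2.0 * math.pi * s * step**2)) ** 1.5


def loglog_slope(func, s1, s2):
    """Slope of log(func) versus log(s) between s1 and s2."""
    return (math.log(func(s2)) - math.log(func(s1))) / (math.log(s2) - math.log(s1))


slope_gaussian = loglog_slope(gaussian_contact_density, 10.0, 1000.0)
print(f"Gaussian subchain (nu = 1/2): exponent {contact_exponent(0.5):.2f}, "
      f"measured slope {slope_gaussian:.2f}")
print(f"dense subchain (nu = 1/3): exponent {contact_exponent(1.0 / 3.0):.2f}")
```

The observed slope of −1 thus corresponds to ν = 1/3, the scaling of a subchain that is densely packed on all length scales.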
Remarkably, here nature seems to prove Flory wrong. How can this finding be rationalized? So far, we had implicitly assumed that we are dealing with equilibrium polymer physics. However, eukaryotic DNA molecules are extremely long and it is far from obvious whether their configurations are in equilibrium.
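The exponent $-3/2$ expected for an ideal chain can be checked numerically. The following is a minimal sketch that is not part of the original argument: it assumes a Gaussian chain with segment length `b` and counts two monomers as being in contact when their distance is smaller than a capture radius `r_c`. Both names and their default values are illustrative assumptions.

```python
import math

def gaussian_contact_probability(g, r_c=1.0, b=1.0):
    """Probability that two monomers g segments apart on an ideal (Gaussian)
    chain are closer than r_c, obtained by integrating the Gaussian
    end-to-end distribution over a sphere of radius r_c."""
    x = r_c * math.sqrt(3.0 / (2.0 * g * b * b))
    return math.erf(x) - 2.0 * x / math.sqrt(math.pi) * math.exp(-x * x)

def log_slope(f, g1, g2):
    """Slope of log f versus log g between the two chain lengths g1 and g2."""
    return (math.log(f(g2)) - math.log(f(g1))) / (math.log(g2) - math.log(g1))

# For g much larger than (r_c / b)**2 the slope should be close to -3/2.
slope = log_slope(gaussian_contact_probability, 1e3, 1e4)
print(f"log-log slope of p_c: {slope:.3f}")
```

For chain sections much longer than the capture radius, the slope obtained this way lies very close to $-3/2$, in line with the estimate $p_c \sim 1/V$.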
Chromatin at Large Scales
Figure 8.47 (a) The tube model: a given polymer (red) is effectively confined by other polymers (blue). (b) Polymer collapse into a crumpled globule.
In fact, the equilibration times can be rather spectacular. It has been estimated that human chromosomes need about 500 years to equilibrate (Rosa and Everaers, 2008). The reason why it takes so long is that the dynamics of a long polymer chain in a solution of other long polymers is rather different from the Rouse dynamics that we discussed in Section 5.8. According to Edwards’ tube model, the other chains effectively confine a given polymer in a tube, see Fig. 8.47(a) (Doi and Edwards, 1986). The chain can only escape from the tube through its ends by a snake-like motion called reptation (de Gennes, 1979). To do this, a polymer of $N$ monomers has to move a distance that is proportional to its own length $L$, i.e., proportional to $N$. As it diffuses back and forth along the tube, the time this takes, the so-called disengagement time, scales like $\tau_d \sim L^2/D \sim N^2/D$. In addition, the friction constant of the polymer is proportional to $N$, and thus, according to the Einstein relation, Eq. 5.49, its diffusion constant $D$ is proportional to $N^{-1}$. This contributes another factor $N$ to the disengagement time. In total, $\tau_d$ is proportional to $N^3$. Hence long chains like chromosomes show very slow dynamics. If chromosomes are not in equilibrium, their conformations could be almost anything. The contact probability suggests that for some reason the chain shows a very exotic behavior with $R_{ij} \sim |i-j|^{1/3}$. Remarkably, already in 1993, long before the cross-linking experiments, a polymer model for chromosomes was proposed on purely theoretical grounds that shows precisely this scaling law
(Grosberg et al., 1993). The polymer model is that of the crumpled or fractal globule. Such a globule can be formed from a polymer in a good solvent by suddenly switching to poor solvent conditions. According to Eq. 3.22 this can be achieved by a drop in temperature. As a result, the polymer collapses in a hierarchical fashion, as schematically depicted in Fig. 8.47(b). First, pieces of chain with some local slack form small globules, which then collapse onto each other, forming larger globules, and so on. This picture suggests that one ends up with a globule that is neatly folded on every length scale, the crumpled globule. What is special about this conformation is that it is not topologically entangled. If one switches back to good solvent conditions, it quickly unfolds again into a swollen chain conformation. However, the crumpled globule conformation is not an equilibrium conformation. If one waits long enough, an equilibrium globule will form. This is a knotted structure that will not immediately swell to its full size after switching to good solvent conditions. The reason why the crumpled globule can exist for an extended period of time before reaching the equilibrium globule state is similar to the reptation picture we mentioned above. In order to go from an unknotted state to a knotted state, the ends of the chain have to go around other parts of the chain, which is very time consuming. It is precisely the feature of being unknotted that led the authors of (Grosberg et al., 1993) to the speculation that DNA in vivo has a crumpled globule conformation, because this greatly facilitates access to the DNA for, e.g., proteins. What do we expect for the contact probability of such a crumpled globule? The crumpled globule has a self-similar conformation which successively fills space by forming globules of globules of globules and so on. It is in fact a space-filling fractal whose mass $|i-j|$ grows proportionally to the volume it occupies.
A fractal globule is therefore a curve in space with a fractal dimension $d_f = 3$. Using the previous argument, we expect that the contact probability decreases with chain length as $p_c \sim |i-j|^{-1}$. Remarkably, the exponent has the value $-1$, the same value that was observed by chromosome conformation capture (Lieberman-Aiden et al., 2009). The authors of this work actually claimed that chromosomes are folded in the form of a fractal globule, supporting the speculations previously presented in (Grosberg et al., 1993).
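The argument can be summarized for a general fractal dimension. The following is a short worked restatement under the same well-mixed assumption $p_c \sim 1/V$ used so far; the shorthand $g = |i-j|$ is introduced here only for brevity.

```latex
R(g) \sim g^{1/d_f}
\quad\Rightarrow\quad
V(g) \sim R^3 \sim g^{3/d_f}
\quad\Rightarrow\quad
p_c(g) \sim \frac{1}{V(g)} \sim g^{-3/d_f}
% d_f = 2 (random walk):      p_c \sim g^{-3/2}
% d_f = 3 (fractal globule):  p_c \sim g^{-1}
```

The two cases discussed in the text, the random walk and the space-filling globule, are the special cases $d_f = 2$ and $d_f = 3$.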
Figure 8.48 First three iterations in the construction of the Hilbert curve in two and three dimensions. Rhs: contact probability pc of the 3D Hilbert curve as a function of the distance |i − j | along the chain (see text for details).
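The two-dimensional construction can also be sketched in code. The following is a minimal illustration, not taken from the book: it uses a standard iterative index-to-coordinate mapping for the 2D Hilbert curve, and the function names are hypothetical. It also checks one feature worth noting, namely that a contiguous section of the curve stays in its own region of the grid.

```python
def hilbert_d2xy(order, d):
    """Map position d along a 2D Hilbert curve to lattice coordinates (x, y).
    The curve visits every cell of a 2**order x 2**order grid exactly once."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the sub-square built so far.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_curve(order):
    """List of lattice points visited by the curve, in order along the chain."""
    n = 1 << order
    return [hilbert_d2xy(order, d) for d in range(n * n)]

curve = hilbert_curve(3)  # third iteration: 8 x 8 grid, 64 monomers

# Successive monomers are lattice neighbors, and no site is visited twice.
is_connected = all(abs(x1 - x2) + abs(y1 - y2) == 1
                   for (x1, y1), (x2, y2) in zip(curve, curve[1:]))
is_space_filling = len(set(curve)) == len(curve)
# The first half of the chain never leaves the left half of the grid.
first_half_is_separated = all(x < 4 for x, _ in curve[:32])
print(is_connected, is_space_filling, first_half_is_separated)
```

Increasing `order` corresponds to further iterations of the construction; the same separation of chain sections is found at every level of the hierarchy.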
To corroborate this idea further, various examples of mathematical space-filling fractal curves were studied in (Lieberman-Aiden et al., 2009), and the collapse of a polymer into a crumpled globule was also simulated. We first discuss an example of a mathematical fractal curve, the 3D Hilbert curve, see middle of Fig. 8.48. To better understand how it is constructed, we also show on the lhs of Fig. 8.48 David Hilbert’s original construction of the two-dimensional version of this curve from 1891 (Hilbert, 1891). When one repeats this procedure infinitely many times, one obtains a self-similar curve of infinite length that fills either the surface or the three-dimensional space, i.e., a curve of fractal dimension 2 or 3. On the rhs of Fig. 8.48 we show the contact probability $p_c$ of the 3D Hilbert curve as a function of the chemical distance $|i-j|$ (Lieberman-Aiden et al., 2009). Remarkably, the slope of $p_c$ does not have the expected value $-1$ but $-1.335$. What went wrong? We assumed before that $p_c \sim 1/V$, where $V$ is the volume of the $|i-j|$-subchain under consideration. This relation implicitly presupposes that the
monomers are well mixed, i.e., a given monomer has a probability $1/V$ of coming into contact with any other monomer. However, space-filling fractal curves like the 3D Hilbert curve are special in having each section of their chain spatially separated from the rest of the chain. This can be clearly seen by coloring the curve along its contour from red to purple through all colors of the rainbow; the resulting fractal keeps all the colors spatially separated, see middle of Fig. 8.48. The relation $p_c \sim 1/V$ is only valid if the chain is well mixed on all length scales, which is precisely not the case here. We now give an argument that accounts for the fact that the Hilbert curve is spatially separated on all length scales. To determine the contact probability between two monomers that are spaced $g$ monomers apart along the chain, we first divide the connecting chain section into two halves of length $g/2$. These two halves are also spatially separated. In order for the two monomers at the ends of the $g$-monomer-long subchain to be in contact, they need to be at the interfacial area between the two subchains. The probability that a given monomer is in that interfacial area $S$ is proportional to $S/V$. Since $S \sim g^{2/3}$ and $V \sim g$, one has $S/V \sim g^{-1/3}$. Each monomer contributes this factor to the contact probability. In addition, if the two monomers are at the interface between the two subchains, they need to be close to each other. The probability for this to happen goes like $1/S \sim 1/g^{2/3}$. Collecting all three factors, we find for the contact probability $p_c \sim g^{-1/3} \times g^{-1/3} \times g^{-2/3} = g^{-4/3}$. This is precisely the scaling of the contact probability for the 3D Hilbert curve, see rhs of Fig. 8.48. Now we understand why the Hilbert curve (like many other similar space-filling fractal curves) does not lead to the $-1$ slope that we hoped for to explain the experimental data. Is there another way to achieve the desired slope? According to (Lieberman-Aiden et al., 2009) there is.
The authors claim that a curve like the 3D Hilbert curve does not work because it is a mathematical idealization, but that more physical conformations reproduce the desired $-1$ slope. To demonstrate this, they simulated the collapse of a polymer into a crumpled globule. The resulting chain conformations were indeed spatially separated, similar to the 3D Hilbert curve (see also Fig. 8.47(b)), but the slope of the contact probability was now close to $-1$ instead of $-4/3$. How can one understand this? A possible explanation might be that the interface between the
demixed subchains, each of length $g/2$, is not a smooth surface but a fractal object with dimension $2 \le d_s < 3$. We now generalize the argument from above. The probability for each monomer to be at the interface now scales as $g^{d_s/3}/g$, and the probability that two monomers from two subchains meet at their common interface scales as $1/g^{d_s/3}$. Overall, we expect that the contact probability scales like $g^{(d_s-6)/3}$. For a smooth surface, $d_s = 2$, we recover the above slope $-4/3$, but for a very rough surface, $d_s \to 3$, the slope approaches $-1$. This might be related to what the authors of (Lieberman-Aiden et al., 2009) call an interdigitated fractal. Do chromosomes have the conformations of fractal globules, and if so, why? There are a couple of rather confusing points. First of all, what does a polymer collapse have in common with any of the processes taking place inside the cell nucleus? One might argue that the condensation of the chromosomes into their mitotic state is some kind of collapse. But here we are interested in describing the chromosomes during interphase, which is the portion of the cell cycle after decondensation of the mitotic chromosomes up to their next condensation. Secondly, how does one explain the existence of chromosome territories (see the 5 μm-window in Fig. 1.6)? This seems to be a state very far from the equilibrium state, the entangled polymer solution depicted in Fig. 3.17. So what keeps the chromosomes in this state? Finally, the computer simulation accompanying the chromosome conformation capture experiment applied huge forces to induce a rapid polymer collapse (Lieberman-Aiden et al., 2009). A computer simulation (Schram et al., 2013) employing less extreme, more physical conditions did not find any evidence for the $-1$ power law for the wide range of chain lengths investigated.
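The bookkeeping of the three factors in the interface argument can be written out explicitly. The following is a small sketch in exact rational arithmetic; the function name is hypothetical, and the code only restates the scaling argument above for three dimensions. It is not a simulation of a crumpled globule.

```python
from fractions import Fraction

def interface_contact_exponent(d_s):
    """Exponent alpha in p_c ~ g**alpha for a demixed space-filling chain in
    three dimensions whose subchain interface has fractal dimension d_s."""
    d_s = Fraction(d_s)
    at_interface = d_s / 3 - 1      # S/V ~ g**(d_s/3 - 1), once per monomer
    meet_on_interface = -d_s / 3    # 1/S ~ g**(-d_s/3)
    return 2 * at_interface + meet_on_interface

for d_s in (Fraction(2), Fraction(5, 2), Fraction(3)):
    print(f"d_s = {d_s}: p_c ~ g^({interface_contact_exponent(d_s)})")
```

The smooth interface ($d_s = 2$) gives $-4/3$ and the limit $d_s \to 3$ gives $-1$; intermediate values such as $d_s = 5/2$ interpolate between the two.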
8.5.2 From Polymer Rings to Loop Extrusion

Is there a simpler, more natural explanation for the polymeric state of chromosomes? An attractive alternative approach (Rosa and Everaers, 2008) turned the argument around. Instead of starting with extended polymers, the initial state in this computer simulation study corresponded to several neatly folded polymers that were spaced apart in a container, see Fig. 8.49 (top) for a schematic view. These chains were then allowed to expand. As soon as the polymers
[Figure 8.49 labels: condensed chromosomes; decondensation; ring closure; 500 years; solution of nonconcatenated rings at equilibrium; chromosomal territories; solution of linear polymers at equilibrium]
Figure 8.49 Large scale organization of chromosomes. Bottom middle: Chromosomal territories in interphase. These follow from the decondensation of the compact chromosomes (top) after cell division. The organization of chromosomes in interphase is an extremely long-lived metastable state. In the case of human chromosomes one would have to wait 500 years to arrive at the well-mixed equilibrium state (bottom right). The metastable demixed state is related to the equilibrium state of a solution of nonconcatenated ring polymers (bottom left).
came into contact, the expansion stopped since chain crossing was not allowed. The resulting set of polymers in the container resembled the global arrangement of chromosomes in the nucleus, as they were neatly separated into chromosome territories, see Fig. 8.49 (bottom middle). Remarkably, the predictions of the model were also in excellent agreement with FISH data, particularly with plots of the spatial distance as a function of the genomic distance, even though the model did not have any adjustable parameters. And, finally, the whole process resembled a biological process: the decondensation of chromosomes after cell division. Can this scenario explain the $-1$ slope from the chromosome conformation capture experiments? This was not investigated, as such experimental data were not available at the time of the publication. However, the study contained a crucial insight that now allows us to answer this
question. The main point of the argument is that the equilibration times for human chromosomes, as mentioned above, are very long, about 500 years. Equilibration would take place via reptation-like motions involving the chain ends. As this process would occur only on biologically irrelevant time scales, we can forget about the existence of the chain ends altogether. We might as well just get rid of them by connecting them, creating DNA rings in the process, see Fig. 8.49 (bottom left). This does not affect the physics on the relevant time scales. It is now important to realize that the different chromosomes are not entangled, reflecting the initial conditions, which cannot be “forgotten” because the simulation (and also the biological) time scales are much shorter than the equilibration times of the chromosome polymers. This means that, after ring closure, the polymer rings are non-concatenated. We can therefore say that our system of decondensed polymers is practically identical to a solution of non-concatenated polymer rings. Furthermore, we assume that the polymer chains, after the expansion stops, quickly reach a quasi-equilibrated state. We can therefore also assume that the corresponding ring polymer state is that of a system at equilibrium. Through this elegant line of argument we got rid of non-equilibrium physics, which is notoriously hard to treat as it depends on initial conditions. Instead, we arrived at a well-defined equilibrium system. The exciting question now is: What does a semidilute solution of non-concatenated rings look like? We know that a semidilute solution of linear polymers consists of strongly overlapping chains, see Fig. 8.49 (bottom right), where each chain shows a random walk configuration, reflecting the screening of the excluded volume, Fig. 3.17. But what about ring polymers?
Large-scale computer simulations of solutions of non-concatenated polymer rings show features that are substantially different from solutions of linear polymers (Halverson et al., 2011). Notably, rings under these conditions show an overall compact structure, i.e., an overall size that scales like $N^{1/3}$. Importantly, the structure of these rings is self-similar on all length scales, i.e., a stretch of $g$ monomers has a size that scales like $g^{1/3}$. Again, there are crumples within crumples, similar to the crumpled globule mentioned above. The contact probability between monomers decreases with genomic distance
as $1/g^{1.1}$ (Halverson et al., 2011; Rosa and Everaers, 2014), which is compatible with data from chromosome conformation capture (Lieberman-Aiden et al., 2009).

The fact that non-concatenated rings do not mix is not just an isolated topological phenomenon, reflected in the existence of chromosomal territories. It turns out that cells actively make use of non-concatenated loops in a variety of contexts. This became clear only very recently, when chromosome conformation capture methods reached higher resolutions. We now turn to these new discoveries.

Imagine 46 garden hoses, each 5 cm thick and 1000 km long, lying entangled in your garden. Now imagine further that each hose doubles by dividing along its length. Finally, after this doubling is completed, magic happens: the 46 chains separate and each pair forms a nicely divided package. This is what occurs, at a 25 million times smaller scale, each time our DNA molecules duplicate in a cell, just before its division. The product of this process is well known: the iconic X-shaped mitotic chromosome shown on the right of Fig. 1.9. In these structures, the two identical DNA copies are already neatly divided from one another, rendering the last step of this process, the pulling apart of the two halves by the mitotic spindle, almost trivial. Many people think that this latter process “explains” how our DNA material gets divided between the two daughter cells. This overlooks the much more interesting and highly non-trivial phenomenon of the seemingly spontaneous segregation of each of the entangled pairs of identical DNA molecules.

But should we really be amazed? Here, too, a thorough understanding of polymer physics is required to recognize the gigantic problem that cells need to overcome with each division. As demonstrated by Grosberg and coworkers (Grosberg et al., 1982), the free energy cost for a substantial overlap of two polymer coils with excluded volume is only of the order of the thermal energy, $k_B T$.
This goes against intuition as one would expect a swollen polymer coil rather to behave as an impenetrable sphere. The fact that polymers can easily overlap is bad news for the cell, since polymer physics seems not to help to segregate the chromosomes after duplication.
Let us try to estimate the free energy cost for overlapping. We first present a calculation that seems to support the picture of impenetrable spheres. We start from the free energy of interaction of two overlapping polymer coils. It is given by $F_{\text{int}} = k_B T k$, where $k$ denotes the number of contacts between the two polymers. According to Flory, the size of a swollen polymer coil is given by $R = aN^{3/5}$, Eq. 3.34. We assume a very good solvent with a second virial coefficient $\upsilon = a^3$. In this way we can hope for a particularly strong repulsion between overlapping chains. For simplicity let us assume that the monomers of both polymers form an ideal gas in the volume $R^3$. The number of contacts is then the product of the number of monomers of one of the polymers, $cR^3$ ($c$: average monomer concentration, $c = N/R^3$), and the probability for each of these monomers to be in contact with a monomer from the other chain, $ca^3$ ($a$: monomer size). Hence

$$F_{\text{int}} = k_B T \times cR^3 \times ca^3 \qquad (8.110)$$

which is actually the term with $l = 2$ in the virial expansion of the free energy, Eq. 2.90. Therefore we find that the free energy of interaction (in units of $k_B T$) is given by

$$F_{\text{int}}/k_B T = N^{1/5} \qquad (8.111)$$

which for long polymers is substantially larger than one. This suggests that swollen coils behave as impenetrable spheres. However, this argument is wrong. We assumed above that the monomers can be considered to form an ideal gas. Instead, a swollen chain looks more like the configuration depicted in Fig. 3.10. You can imagine that if a monomer (or thermal blob) is not in contact with a corresponding element from the other polymer chain, then it is also less likely that the monomers (or blobs) directly connected to it are in contact with the other chain, and so on. As a result, the probability of contact between the two polymers is much smaller than the value assumed above. In Eq. 8.110 the probability of contact scaled like $c^2$, but because of the correlation effect the probability of contact is smaller. It turns out to scale like $c \times c^{5/4} = c^{9/4}$ (Daoud et al., 1975). This can be shown by a scaling argument that you can work out in Problem 8.6. The free energy of contact is now

$$F_{\text{int}}/k_B T = cR^3 \times \left(ca^3\right)^{5/4}. \qquad (8.112)$$
[Figure 8.50 labels: (a) chromosome 1 on both axes, compartments A and B, scale bar 10 Mb; off diagonal: intrachromosomal contacts between non-adjacent regions; diagonal: intrachromosomal contacts between adjacent regions. (b) TAD: topologically associated domain, scale bar 200 kb]
Figure 8.50 Schematic contact map of a chromosome (“chromosome 1”) at two levels of resolution. (a) Chromosome conformation capture data at low resolution (e.g., 1 Mb resolution) display a checkerboard pattern with regions of higher probability of contact (red) and lower probability of contact (blue). This suggests the existence of two types of compartments, A and B, as indicated. (b) Chromosome conformation capture at higher resolution (e.g., 1 kb) makes it possible to zoom in on the diagonal, as a result of which topologically associated domains (TADs; yellow squares) become visible. These are chromosome stretches with a high probability of intradomain contact (yellow) that are physically insulated from the rest of the chromosome.
Inserting the density from above, we arrive at

$$F_{\text{int}}/k_B T = 1. \qquad (8.113)$$
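The two estimates can be compared numerically. The following is a minimal sketch; the function name and the chosen chain length are illustrative assumptions. It evaluates the ideal-gas expression of Eq. 8.110 and the correlation-corrected expression of Eq. 8.112 for a swollen coil with the Flory radius of Eq. 3.34, ignoring all numerical prefactors, as the scaling argument does.

```python
import math

def overlap_free_energy(N, a=1.0):
    """Return (ideal-gas estimate, correlation-corrected estimate) of
    F_int / k_B T for two fully overlapping swollen coils of N monomers.
    Scaling estimates only: all numerical prefactors are set to one."""
    R = a * N ** 0.6                                  # Flory radius, R = a N^(3/5)
    c = N / R ** 3                                    # average monomer concentration
    monomers = c * R ** 3                             # monomers of one coil
    ideal = monomers * (c * a ** 3)                   # Eq. 8.110 in units of k_B T
    correlated = monomers * (c * a ** 3) ** 1.25      # Eq. 8.112
    return ideal, correlated

ideal, correlated = overlap_free_energy(1e5)
print(f"ideal-gas estimate:   {ideal:.2f}  (N^(1/5) = {1e5 ** 0.2:.2f})")
print(f"with correlations:    {correlated:.2f}")
```

The ideal-gas estimate grows slowly but without bound as $N^{1/5}$, whereas the corrected estimate stays of order one for any chain length.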
The surprising conclusion is that it does not cost much if two polymer chains overlap, although we have assumed a very strong repulsion with a second virial coefficient $\upsilon = a^3$. Now we understand that we should be amazed that cells manage to separate pairs of identical, highly entangled chromosomes in a reliable fashion. The cell cannot rely on simple polymer physics to sort out the mess of entangled chromosomes. So how does it work?

Before giving the amazing answer to this question, we take a closer look at chromosomes in interphase. In the previous section we discussed that chromosome conformation capture gives us information about the probability of contact between different sections of DNA inside a chromosome. We pointed out that the contact probability decays surprisingly slowly with genomic distance, and explained that this could reflect the topological state in which the chromosomes are trapped after decondensation. However, such data contain much more information. In Fig. 8.50(a) we schematically show a so-called contact map of a chromosome. It provides information about the probability of contact between pairs of stretches on the chromosome. This probability is simply proportional to the number of ligation products between a given pair of DNA stretches. Let us emphasize that such contact maps do not indicate the position of a particular DNA stretch in space, that is, its 3D location inside the nucleus, but only the probability that different DNA stretches are somewhere close together in the nucleus. The probability is greatest near the diagonal, since this corresponds to contacts between DNA stretches that are close together in terms of their genomic distance and thus are likely to encounter each other in space. The further one moves away from the diagonal, the smaller the contact probability is on average, as discussed in the previous section. However, this dependence is weak compared to other effects we discuss in the following.
Even at the relatively low resolution of earlier experiments, for example, at megabase resolution (1 Mb) (Lieberman-Aiden et al., 2009), one can observe some finer details of the chromosome organization. In particular, the contact maps reveal a characteristic
checkerboard pattern with regions of higher probability of contact (red) and lower probability of contact (blue), see Fig. 8.50(a). This pattern reflects the existence of two types of chromatin, A and B. Each type has a higher probability of contact with itself than with the opposite type. Further characterization revealed that the A (active) compartment corresponds to euchromatin and the B (inactive) compartment has the characteristics of heterochromatin. What the contact map reveals is then what we would actually have expected just on the basis of our discussion in Section 2.5, namely that eu- and heterochromatin are spatially segregated. So the checkerboard pattern is not surprising.

However, when experiments reached a higher resolution, a new structural detail of chromosome organization appeared in the contact maps which was completely unexpected. Specifically, we refer to experiments reaching kilobase resolution (1 kb) (Rao et al., 2014). When one zooms in on the diagonal of such a high-resolution contact map, squares appear along the diagonal, indicating regions with higher contact, see Fig. 8.50(b). Amazingly, these regions have increased contact only with themselves. Unlike the checkerboard pattern of the A and B compartments, they can be found only along the diagonal. Their size varies, with an average length of 185 kb. They are called topologically associated domains (TADs) or contact domains. Now it is important to realize that this finding is extremely surprising. Whereas it is easy to comprehend that one has two (or a few more) types of chromatin, leading to a checkerboard pattern in the contact map, it is very difficult to explain why there are thousands of domains that have increased contact only with themselves. What could isolate them from each other? To make things even more mysterious: The borders of TADs are typically written into the bp sequence in the form of motifs, each about a dozen bp long.
They are called CTCF motifs as they serve as binding sites for a protein, the so-called insulator protein CTCF. The experimentalists (Rao et al., 2014) made the very strange observation that these sequences act as borders of a TAD only if they are in a convergent orientation (the motif is non-palindromic, so it can be assigned a direction along the genome). In short, something makes the DNA form a loop, bringing these two CTCF motifs into spatial contact,
but only if they happen to point in the “right” orientation along the genome. But how could the two CTCF motifs know about their relative orientation, as they are about 200 kb apart from each other? Clearly any memory (e.g., through the finite persistence length of the chromatin fiber) has been lost over these extremely large distances. So what is the physics behind these mysterious TADs?

Remember again our discussion in the previous section about chromosomal territories. There, too, we faced the problem of explaining why different chromosomes do not mix. We explained their existence with their topology, namely the fact that chromosomes behave like non-concatenated rings on the time scales of interest. Could a similar effect also explain the existence of TADs? The problem is that we look here at much shorter length scales. This also means that the time scales involved in the dynamics of TADs are rather short, and there should be no problem for a TAD to equilibrate quickly. Nevertheless, if we could somehow assume that each TAD consists of a non-concatenated loop, we could understand why they are spatially separated. Amazingly, it turns out that this is the case: Chromosomes are spatially separated into TADs because they form non-concatenated loops. But how is that possible? The answer is that the cell constantly uses energy to maintain the existence of non-concatenated loops. They are created by special motor proteins, called cohesins. Cohesin is a so-called loop extruder. Extrusion complexes contain two DNA-binding subunits tethered together. Initially these two subunits bind close to each other on the DNA, see top of Fig. 8.51. They then move in opposite directions along the DNA while bridging these increasingly distant chromosomal sites, thereby extruding a DNA loop. The spooling of DNA into the loop continues until the subunits encounter CTCF proteins bound to flanking, convergently arranged CTCF binding sites, which block further extrusion, see bottom of Fig. 8.51.
As a result, TADs are dynamical systems of loops that are non-concatenated with each other. Since non-concatenated polymer loops do not mix, TADs are spatially separated from one another. This can be seen in detailed computer simulations (Fudenberg et al., 2016). Through genome editing one can even flip, add or remove CTCF motifs and show through contact maps that the TADs change as expected (Sanborn et al., 2015).
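The role of motif orientation can be illustrated with a deliberately simple toy model. The following sketch is not one of the published simulations cited above; it assumes a one-dimensional lattice of sites, a single extruder whose two legs step outward deterministically, and CTCF sites that are always occupied and block perfectly. The function name and the `'>'`/`'<'` encoding of motif orientation are hypothetical.

```python
def extrude_loop(motifs, load_site, length):
    """Toy one-dimensional loop extrusion on a lattice of `length` sites.

    motifs maps a site index to '>' (motif pointing right) or '<' (motif
    pointing left). The left-moving leg stalls at a '>' motif and the
    right-moving leg stalls at a '<' motif, so only a convergent pair
    '>' ... '<' anchors a loop. A leg also stops at the chain end.
    Returns the final positions (left, right) of the two legs.
    """
    left = right = load_site
    left_stalled = right_stalled = False
    while not (left_stalled and right_stalled):
        if not left_stalled:
            if motifs.get(left) == '>' or left == 0:
                left_stalled = True
            else:
                left -= 1
        if not right_stalled:
            if motifs.get(right) == '<' or right == length - 1:
                right_stalled = True
            else:
                right += 1
    return left, right

# Convergent motifs at sites 10 and 30: the loop is anchored at both motifs.
convergent = extrude_loop({10: '>', 30: '<'}, load_site=20, length=50)
# Flipping the motif at site 30 lets the right leg run past it.
flipped = extrude_loop({10: '>', 30: '>'}, load_site=20, length=50)
print(convergent, flipped)
```

In this toy picture, flipping a single motif changes the extent of the loop, which is qualitatively the kind of change in contact domains that the genome editing experiments probe.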
[Figure 8.51 labels: CTCF; cohesin; DNA; CTCF motif]
Figure 8.51 Loop extrusion: cohesin associates with DNA, forms and enlarges a DNA loop until it encounters CTCF proteins bound to convergent CTCF binding motifs.
We have now seen two examples of non-concatenated loops, one being just a theoretical construct to understand the conformations of whole chromosomes, the other corresponding to real structures. We do not yet understand the biological reason for TADs, but they seem to be important because cells pay a substantial energetic price to maintain them. We now return to the question we asked above: How can the two DNA copies be reliably separated before cell division? We have already convinced ourselves that the free energy gain from separating two overlapping DNA molecules would be just of the order of the thermal energy, Eq. 8.113, which clearly is not enough to make
equilibrium polymer physics do this job. This suggests that cells need to expend some energy to separate the two DNA molecules. However, it is not obvious how to do this, even if one were willing to spend a large amount of energy. The two DNA molecules cannot be distinguished from one another. So it would be a matter of luck to catch one piece from one chain and another piece from the other chain and then pull them apart in opposite directions. In addition, the DNA molecules are much longer than the diameter of the nucleus, so there is no space to pull them apart. So this mechanism, which is actually employed later (the pulling apart of the two halves of the mitotic chromosome by the mitotic spindle), does not work at this stage.

Non-concatenated rings do not mix, and loop extrusion produces non-concatenated loops. Could loop extrusion explain the separation of chromosomes? We now know that this is indeed the case and that the non-concatenated loops required for chromosome separation are produced by another loop extruder, called condensin (Gibcus et al., 2018). When condensin molecules start to act on the DNA molecules, they create loops, shortening each chromosome lengthwise. As the loops are non-concatenated, they repel each other. On the one hand, this creates the desired repulsion between the chromosome pairs, which are kept together only at their centers, the centromeres. On the other hand, the loops along each chromosome stiffen the complex. The result of this process is the mitotic X-shaped chromosome, as beautifully demonstrated in a computer simulation (Goloborodko et al., 2016). Note that the two DNA copies still suffer from entanglements when they are driven apart from each other. A specialized protein, topoisomerase II, resolves these entanglements by letting the DNA double helices pass through each other. It is worthwhile to mention that the whole idea goes back to Kim Nasmyth, a yeast geneticist.
In 2001 he wrote (Nasmyth, 2001): “To those aware of the difficulties of disentangling ropes, the apparent ease with which eukaryotic cells separate their chromatids during mitosis is nothing short of miraculous.” And further on: “One possibility is that condensin associates with the bases of small loops or coils of chromatin and enlarges these loops or coils in a processive manner, which ensures that all chromatin within the loop or coil
must have been cleanly segregated from all other sequences in the genome.”
Problems

8.1 Gel electrophoresis
Figure 8.26 shows how nucleosomes can be separated according to their position on the DNA using gel electrophoresis. A problem with this method is that it does not distinguish between the two chain ends. For example, a nucleosome located at one end has the same electrophoretic mobility as a nucleosome located at the other end. Can you suggest an additional experimental step that would allow one to determine the actual position of the nucleosome? Hint: The section on the site exposure mechanism might give you an idea.

8.2 Energy landscape
In Problem 4.2 you have implemented the ideal superhelix nucleosome model. Now adapt the model to calculate the energy landscape of the nucleosome along the 207 bp long DNA fragment that contains the 5 S rDNA positioning sequence, i.e., reproduce Fig. 8.31. The nucleotide sequence is as follows: AATTCCAACG AATAACTTCC AGGGATTTAT AAGCCGATGA CGTCATAACA TCCCTGACCC TTTAAATAGC TTAACTTTCA TCAAGCAAGA GCCTACGACC ATACCATGCT GAATATACCG GTTCTCGTCC GATCACCGAA GTCAAGCAGC ATAGGGCTCG GTTAGTACTT GGATGGGAGA CCGCCTGGGA ATACGAATTC CCCGAGG. What causes the 10 bp undulations? Tilt, roll or both? And why?

8.3 Reptation
Make a sketch that shows how a polymer disengages from its original tube (see Fig. 8.47(a)).

8.4 2D Hilbert curve
Use a scaling argument to predict how the contact probability decays with the chemical distance between monomers for the two-dimensional Hilbert curve (see lhs of Fig. 8.48).

8.5 Helices
Helical structures are very common in biological systems. In this book we encountered the DNA double helix as the
carrier of the genetic information, α-helices as secondary structural elements in proteins and also discussed various possible helical arrangements of chromatin fibers. Other examples are actin proteins that aggregate into helical filaments and microtubules built from tubulin proteins. Can you think of any reason for the widespread occurrence of helices?

8.6 Osmotic pressure of a semidilute polymer solution In Subsection 8.5.2 we claimed that the probability of contact between two overlapping polymers in a good solvent scales as $c^{9/4}$. Here you derive this scaling law by calculating the osmotic pressure of a solution of polymers. The osmotic pressure is the “extra” pressure from the polymers. The name reflects the fact that one would need to apply this pressure to the solution to prevent the inward flow of its pure solvent across a semipermeable membrane.
(i) Calculate the osmotic pressure $\Pi$ of a dilute solution of polymers. Hint: You can assume an ideal gas of molecules with each polymer being a particle.
(ii) Calculate the monomer concentration $c^*$ delineating the crossover from a dilute solution where polymer coils are far from each other to a semidilute solution where polymers overlap. Hint: This is the concentration at which the swollen polymer coils pack densely. For simplicity you can assume a very good solvent with a second virial coefficient $v = a^3$.
(iii) Now calculate the osmotic pressure in the semidilute regime. In the spirit of the virial expansion, one would assume an additional term proportional to $c^2$. However, because monomers are connected into chains, the probability for collision between monomers of different chains is smaller, a problem mentioned in Subsection 8.5.2. To take this correlation effect into account, we assume here that the osmotic pressure in the semidilute regime scales as $\Pi / k_B T = \text{constant} \times c^m$. Calculate the exponent $m$.
Hint: $m$ follows from comparing the osmotic pressure at the overlap concentration $c^*$ as calculated for the dilute case with
the pressure as calculated with the ansatz for the semidilute case and requiring that both are the same. Now you have shown that the osmotic pressure and thus the probability of contact between monomers of overlapping good solvent chains scales like $c^{9/4}$. This in turn explains why overlapping polymers do not repel each other much, which in turn forced biology to come up with a sophisticated mechanism to separate chromosomes after duplication, see Subsection 8.5.2.
Chapter 9
Computational Methods
So far in this textbook we have focused on biophysical models that can be solved with analytical methods, but not on problems that require a numerical approach. This is common for many textbooks, since calculations can be presented step by step, whereas for a computer simulation only the model and the results can be described but the steps in between remain hidden. However, computer simulations have become an integral part of biophysics, so the current chapter provides an introduction to the most important computational methods: molecular dynamics simulations and Monte Carlo simulations. Often, simulations of biophysical systems can be rather complex and time consuming. For example, if we want to study the interaction between a protein and DNA with atomic resolution, we have to represent not only the rather large molecules, but also the water molecules and ions. Here we just want to explain the methods and give some exercises on systems that are easy to simulate on your laptop. This way you understand the general principles of these methods. By simulating a gas of Argon atoms (Problem 9.1) and a system of magnetic spins (Problem 9.2), you also get some additional insight into phase transitions, which we discussed earlier in Section 2.4 and Appendix C.
9.1 Molecular Dynamics Simulations

Molecular dynamics (MD) simulations solve the equations of motion of a system of particles numerically. The method is indispensable as most systems of interacting particles cannot be solved analytically. For example, while the trajectory of two celestial bodies that interact via the gravitational force can be solved exactly, there is no general solution for the case of three bodies (the three-body problem). Before the advent of computers, the motion of, e.g., Halley’s comet was computed numerically (by hand) in order to predict its next appearance. This was achieved by approximating its continuous trajectory by a discrete one, evolving the system by small but finite time steps. The same is done in an MD simulation, often involving a huge number of interacting particles. An example can be seen in Fig. 9.1, which shows an MD simulation of 108 Argon atoms in a three-dimensional box. For each atom the last 100 positions can be seen. As small time steps have been chosen, the trajectories look almost continuous. You can perform this simulation yourself following the instructions given in Problem 9.1 and study this system in the gaseous, liquid and solid state. The particular example in Fig. 9.1 depicts a liquid.
Figure 9.1 MD simulation of a system of Argon atoms in a three-dimensional box with periodic boundary conditions. The density $\rho$ and temperature $T$ correspond to the liquid state (specifically $\rho = 0.8$ and $T = 1.0$ in dimensionless units, see Problem 9.1). Shown are the last 100 time steps in an equilibrated sample.
The motion of each particle is governed by Newton’s second law:

$$ m \frac{d^2 \mathbf{x}}{dt^2} = \mathbf{F}(\mathbf{x}) = -\nabla U(\mathbf{x}). \tag{9.1} $$

For a potential that only depends on the distance $r = |\mathbf{x}| = \sqrt{x^2 + y^2 + z^2}$, we have $\nabla U(r) = \frac{dU}{dr} \frac{\mathbf{x}}{r}$. In most cases, these equations cannot be solved exactly. In an MD simulation they are solved approximately. The general idea is to replace the evolution in continuous time by an evolution with finite time steps $h$. The simplest (often too simple) way to solve Eq. 9.1 is as follows. If $\mathbf{x}_n$ are the positions and $\mathbf{v}_n$ the velocities at time $t_n$, the positions and velocities at the next time point $t_{n+1} = t_n + h$ are given by

$$ \mathbf{x}_{n+1} = \mathbf{x}_n + \mathbf{v}_n h \tag{9.2} $$

and

$$ \mathbf{v}_{n+1} = \mathbf{v}_n + \frac{1}{m} \mathbf{F}(\mathbf{x}_n) h. \tag{9.3} $$
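As a minimal sketch of Eqs. 9.2 and 9.3, the following Python snippet advances a single particle with the Euler scheme and records the total energy at every step. The harmonic spring force and the names `euler_step` and `spring_force` are illustrative assumptions chosen only to have a simple test system; they are not part of the text.

```python
import numpy as np

def euler_step(x, v, force, m, h):
    """One explicit Euler step: Eq. 9.2 for positions, Eq. 9.3 for velocities."""
    x_new = x + v * h
    v_new = v + force(x) / m * h
    return x_new, v_new

def spring_force(x, k=1.0):
    """Hypothetical test force F = -k x (not from the text)."""
    return -k * x

# Initial condition: particle displaced by 1, at rest.
x = np.array([1.0])
v = np.array([0.0])
m, h = 1.0, 0.01

energy = []
for _ in range(1000):
    # Total energy = kinetic + potential (spring constant k = 1).
    energy.append(0.5 * m * np.sum(v**2) + 0.5 * np.sum(x**2))
    x, v = euler_step(x, v, spring_force, m, h)

energy_drift = energy[-1] - energy[0]
print(f"energy change after {len(energy) - 1} Euler steps: {energy_drift:.4f}")
```

Printing `energy_drift` for different values of `h` is a quick way to see how the total energy behaves under this scheme.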
These so-called Euler methods for integrating ordinary differential equations (ODEs) have a major downside for the molecular dynamics simulations we are doing: They do not conserve the energy. However, our goal will be to simulate a system with conserved energy (the microcanonical ensemble), so we should look for a better algorithm. The first of these algorithms with this astonishing property was introduced at the very beginning of molecular dynamics, see the classic paper of Verlet of 1967 (Verlet, 1967). The method was used even earlier, before the arrival of computers, e.g., in 1909 to calculate the orbit of Halley’s comet, and Delambre used it in the 18th century.

The Verlet algorithm makes direct use of the fact that the equation of motion 9.1 is a second order differential equation. For this we can derive a finite time step algorithm by using the Taylor expansion of $\mathbf{x}(t \pm h)$:

$$ \mathbf{x}(t + h) = \mathbf{x}(t) + h \dot{\mathbf{x}}(t) + \frac{h^2}{2} \ddot{\mathbf{x}}(t) + \frac{h^3}{6} \frac{d^3 \mathbf{x}(t)}{dt^3} + O(h^4) \tag{9.4} $$

$$ \mathbf{x}(t - h) = \mathbf{x}(t) - h \dot{\mathbf{x}}(t) + \frac{h^2}{2} \ddot{\mathbf{x}}(t) - \frac{h^3}{6} \frac{d^3 \mathbf{x}(t)}{dt^3} + O(h^4) \tag{9.5} $$

where $O(h^4)$ means terms of order $h^4$. Adding those two equations together and using $\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t))/m$, from Eq. 9.1, we obtain

$$ \mathbf{x}(t + h) = 2\mathbf{x}(t) - \mathbf{x}(t - h) + \frac{h^2}{m} \mathbf{F}(\mathbf{x}(t)) + O(h^4). \tag{9.6} $$

Using Eq. 9.6 (neglecting the $O(h^4)$-term), one can propagate the system with high accuracy from time $t$ to $t + h$ by just using the two previous positions of each particle. A small technicality occurs at the first time step at, say, $t = 0$. As we have no information about $\mathbf{x}(-h)$, we need to replace it with $\mathbf{x}(-h) = \mathbf{x}(0) - h\mathbf{v}(0) + O(h^2)$. Finally, in this version of the Verlet algorithm, only the positions enter. Velocities (that we need for calculating the kinetic energy) have to be estimated from

$$ \mathbf{v}(t) = \frac{\mathbf{x}(t + h) - \mathbf{x}(t - h)}{2h} + O(h^2). \tag{9.7} $$

The above algorithm can be slightly altered to use also velocities directly, the velocity Verlet algorithm. For this we take the Taylor expansion of $\mathbf{x}(t + h)$, Eq. 9.4, and replace $\dot{\mathbf{x}}(t)$ by $\mathbf{v}(t)$ and $\ddot{\mathbf{x}}(t)$ by $\mathbf{F}(\mathbf{x}(t))/m$ to arrive at

$$ \mathbf{x}(t + h) = \mathbf{x}(t) + h\mathbf{v}(t) + \frac{h^2}{2m} \mathbf{F}(\mathbf{x}(t)) + O(h^3). \tag{9.8} $$

In addition, we now need an equation for $\mathbf{v}$. Again, we use a Taylor expansion:

$$ \mathbf{v}(t + h) = \mathbf{v}(t) + h\dot{\mathbf{v}}(t) + \frac{h^2}{2} \ddot{\mathbf{v}}(t) + O(h^3). \tag{9.9} $$

We know that $\dot{\mathbf{v}}(t) = \mathbf{F}(\mathbf{x}(t))/m$ but we do not know $\ddot{\mathbf{v}}(t)$. To determine the latter quantity, we use a Taylor expansion again:

$$ \dot{\mathbf{v}}(t + h) = \dot{\mathbf{v}}(t) + h\ddot{\mathbf{v}}(t) + O(h^2), \tag{9.10} $$

from which we find $\ddot{\mathbf{v}}(t) = \frac{1}{h} \left( \mathbf{F}(\mathbf{x}(t + h)) - \mathbf{F}(\mathbf{x}(t)) \right)/m$ (up to terms of order $h$). Inserting this back into Eq. 9.9, we obtain:

$$ \mathbf{v}(t + h) = \mathbf{v}(t) + \frac{h}{2m} \left( \mathbf{F}(\mathbf{x}(t + h)) + \mathbf{F}(\mathbf{x}(t)) \right) + O(h^3). \tag{9.11} $$

Equation 9.8 followed by Eq. 9.11 constitutes the velocity Verlet algorithm.

Compared to the Euler method, the Verlet and the velocity Verlet algorithm are not just better in the trivial sense of taking terms of
higher order in h into account. They happen to be much better than that, as they are so-called symplectic integrators. These integrators preserve a certain quantity that is a discretized version of the energy. This conserved quantity acts like an anchor to which the total energy is bound so that it cannot drift away, something that is not the case with the Euler method, Eqs. 9.2 and 9.3.
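The difference between the two schemes can be checked numerically. The sketch below, under the assumption of a one-dimensional harmonic oscillator as a test system (the names `spring_force`, `total_energy` and `max_energy_error` are hypothetical helpers, not from the text), integrates the same initial condition with the Euler method and with velocity Verlet and records the largest deviation of the total energy from its initial value.

```python
import numpy as np

def euler_step(x, v, force, m, h):
    """Explicit Euler step, Eqs. 9.2 and 9.3."""
    return x + v * h, v + force(x) / m * h

def velocity_verlet_step(x, v, force, m, h):
    """Velocity Verlet step: Eq. 9.8 followed by Eq. 9.11."""
    f_old = force(x)
    x_new = x + h * v + h**2 / (2.0 * m) * f_old      # Eq. 9.8
    f_new = force(x_new)
    v_new = v + h / (2.0 * m) * (f_new + f_old)       # Eq. 9.11
    return x_new, v_new

def spring_force(x):
    """Hypothetical test force F = -x (spring constant 1)."""
    return -x

def total_energy(x, v, m):
    return 0.5 * m * np.sum(v**2) + 0.5 * np.sum(x**2)

def max_energy_error(step, n_steps=10_000, h=0.01, m=1.0):
    """Largest |E(t) - E(0)| observed along the trajectory."""
    x = np.array([1.0])
    v = np.array([0.0])
    e0 = total_energy(x, v, m)
    worst = 0.0
    for _ in range(n_steps):
        x, v = step(x, v, spring_force, m, h)
        worst = max(worst, abs(total_energy(x, v, m) - e0))
    return worst

err_euler = max_energy_error(euler_step)
err_verlet = max_energy_error(velocity_verlet_step)
print(f"max energy error, Euler:           {err_euler:.3e}")
print(f"max energy error, velocity Verlet: {err_verlet:.3e}")
```

In a production code one would store the force from the previous step instead of recomputing it, since the force evaluation is usually the most expensive part; the version here favors readability.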
9.2 Monte Carlo Simulations

If you only want to study the properties of a system at thermal equilibrium but are not interested in its dynamics, the Monte Carlo method is a very powerful approach. We learned in Chapter 2 that statistical mechanics typically requires the calculation of high-dimensional integrals. For example, the average energy of a system follows from integrating over all degrees of freedom, see Eq. 2.13. A natural way to get to the Monte Carlo method begins with the question of how such high-dimensional integrals can be calculated numerically.

We start first with a one-dimensional integral, $\int_a^b f(x)\,dx$. If we do not know the exact solution of this integral, we could try to determine its value numerically as follows:

$$ \int_a^b f(x)\,dx = (b - a) \int_a^b \frac{1}{b - a} f(x)\,dx \approx \frac{b - a}{N} \sum_{i=1}^{N} f_i \tag{9.12} $$
with $f_i = f(x_i)$. Here the $x_i$’s are drawn randomly from the uniform distribution in the interval $[a, b]$. The scheme is called Monte Carlo integration. To understand how to arrive at the sum on the right hand side, we inserted in the middle of Eq. 9.12 an extra step, a trivial rewriting of the original integral. This expression can be interpreted as calculating the expectation value of the function $f(x)$ for the probability distribution $p(x) = 1/(b - a)$, see also Eq. A.4. The expression on the right hand side of Eq. 9.12 is then an estimate of this integral obtained by drawing random numbers $x_i$ from the probability distribution $p(x)$.
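Equation 9.12 translates almost literally into code. The following is a minimal sketch; the function name `mc_integrate`, the test integrand $\sin x$ on $[0, \pi]$ (exact value 2) and the seed are illustrative assumptions.

```python
import numpy as np

def mc_integrate(f, a, b, n, rng):
    """Monte Carlo estimate of the integral of f over [a, b], Eq. 9.12."""
    x = rng.uniform(a, b, size=n)      # x_i drawn uniformly from [a, b]
    return (b - a) * np.mean(f(x))     # (b - a)/N * sum of f_i

rng = np.random.default_rng(seed=0)
estimate = mc_integrate(np.sin, 0.0, np.pi, 100_000, rng)
print(f"Monte Carlo estimate: {estimate:.4f} (exact value: 2)")
```

Repeating the call with different seeds or different values of `n` gives a feeling for how much the estimate fluctuates from one realization to the next.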
How accurate is this method? Let us calculate the variance (see Eqs. A.5 and A.6):

$$ \sigma^2 = \left\langle \left( \frac{b - a}{N} \sum_{i=1}^{N} f_i \right)^2 \right\rangle - \left\langle \frac{b - a}{N} \sum_{i=1}^{N} f_i \right\rangle^2 \tag{9.13} $$

where the angular brackets denote the average over all possible realizations of sequences of $N$ random coordinates $x_i$. Let us denote the average of $f$ on the interval $[a, b]$ by $\langle f \rangle$. Using $\langle f_i \rangle = \langle f \rangle$, $\langle f_i^2 \rangle = \langle f^2 \rangle$ and, for $i \neq j$, $\langle f_i f_j \rangle = \langle f_i \rangle \langle f_j \rangle = \langle f \rangle^2$, we can simplify Eq. 9.13 to

$$ \sigma^2 = \frac{(b - a)^2}{N} \left( \langle f^2 \rangle - \langle f \rangle^2 \right). \tag{9.14} $$

Therefore the standard deviation $\sigma$ from the real value, i.e., the error, scales as $1/\sqrt{N}$.

There are much better methods to numerically determine $\int_a^b f(x)\,dx$, with errors that scale as $1/N^k$ with $k$ being a positive integer. The simplest method leads to $k = 1$ and is as follows: Define an equidistant grid on the interval $[a, b]$. The grid points are then given by $x_i = a + ih$ with $h = (b - a)/N$ and $i = 0, \ldots, N$. The integral is then approximated by

$$ \int_a^b f(x)\,dx \approx h \sum_{i=0}^{N-1} f_i, \tag{9.15} $$

see also Fig. 9.2. Let us estimate the error. For the sake of simplicity, let us assume that $f(x)$ is a continuous function on the interval $[a, b]$, which has a finite slope everywhere with an absolute value that is less than or equal to a value $M$. Then the error of our estimate cannot exceed $Mh/2$ for each subinterval (this maximal deviation from the estimated value occurs for the special cases $f(x) = Mx$ and $f(x) = -Mx$). The overall error follows from Eq. 9.15 to be smaller than $h \times NMh/2 = (b - a)Mh/2$. Therefore the error scales as $1/N$. This clearly outperforms the Monte Carlo integration, which only scales as $1/\sqrt{N}$.

Then why should one ever consider numerically determining integrals using the Monte Carlo integration scheme? As we are now showing, this method is very powerful and outperforms other methods when calculating high-dimensional integrals. Suppose one
Figure 9.2 A standard numerical method for calculating $\int_a^b f(x)\,dx$ involves dividing the interval $[a, b]$ into an equidistant grid with corresponding $N$ values $x_i$. This particular example (Eq. 9.15) produces an error that scales like $1/N$. It is straightforward to design approximations that lead to smaller errors.
wants to calculate the integral over a function $f$ that is defined on a $d$-dimensional hypercube $L^d$. We can estimate the integral by evaluating $f$ on an equidistant grid by dividing the hypercube into $N$ little hypercubes, each of volume $h^d = L^d/N$. The standard deviation is then proportional to $h^k$ (with some $k \geq 1$, e.g., $k = 1$ as shown in the example above) which scales here as $1/N^{k/d}$. Therefore, these numerical methods do not work very well because increasing $N$ hardly improves their performance. On the other hand, the Monte Carlo integration is not affected by the dimensionality of the system. You can verify this by reexamining the argument that led to Eq. 9.14. The standard deviation always scales as $1/\sqrt{N}$.

Now the type of integral we have to estimate is typically of the form

$$ \langle A \rangle = \frac{1}{Z} \int A(x)\, e^{-\beta H(x)}\,dx \tag{9.16} $$

where $x$ denotes, e.g., the positions and momenta of all the particles (for an example where $A$ is the energy, see Eq. 2.13). Since this is a very high-dimensional integral, Monte Carlo integration is clearly the method of choice. However, there is an additional complication. The function $A(x)\, e^{-\beta H(x)}$ is typically very sharply peaked in phase space, see
Figure 9.3 In statistical mechanics, one typically has to compute very high-dimensional integrals. This figure is therefore misleading as it only shows a one-dimensional case. However, another typical feature is correctly displayed, namely that the functions that need to be integrated, e.g., $A(x)\, e^{-\beta H(x)}$, typically have very sharp peaks.
Fig. 9.3, and in most of phase space this function is practically zero. It is thus very inefficient to perform a uniform random sampling as most of the $x_i$’s happen to be in regions in phase space that do not contribute to the integral in any significant way. Despite this complication, the error still scales as $1/\sqrt{N}$; however, the prefactor is huge.

To solve this problem, one introduces importance sampling. Suppose the probability distribution on phase space is given by $p(x)$, with $p(x) \geq 0$ everywhere, a function normalized to one and possibly sharply peaked. Then the idea is to evaluate the integral $\int p(x) A(x)\,dx$ as follows:

$$ \int A(x)\, p(x)\,dx \approx \frac{1}{N} \sum_{i=1}^{N} A(x_i) \tag{9.17} $$

where $x_i$ is sampled according to the probability distribution $p(x)$. How can this be achieved in practice? The answer leads to the Metropolis algorithm which lies at the heart of Monte Carlo simulations. We introduce a Markov chain, a set of random states $\{x_i\} = \{x_1, x_2, x_3, \ldots\}$ where the probability of $x_{i+1}$ depends on $x_i$ only (see also Eq. 5.11). This can be described by transition probabilities $T(x \to x')$ between states. These fulfill

$$ \sum_{x'} T(x \to x') = 1, \tag{9.18} $$
reflecting the fact that you have to go somewhere from $x$. We can now write a master equation for the probabilities:

$$ p(x, i + 1) = p(x, i) - \sum_{x'} p(x, i)\, T(x \to x') + \sum_{x'} p(x', i)\, T(x' \to x). \tag{9.19} $$

This equation is an example of the master equation 5.28 with the only difference that time is now discrete, not continuous. We want to sample the system at thermal equilibrium, i.e., we are interested in a stationary probability distribution, $p(x, i + 1) = p(x, i) = p(x)$. In this case, Eq. 9.19 is simplified to

$$ \sum_{x'} p(x)\, T(x \to x') = \sum_{x'} p(x')\, T(x' \to x). \tag{9.20} $$

For this equation the values of $p(x)$ are known but we still do not know the transition probabilities $T(x \to x')$. Since this relation is hard to solve, stronger constraints are typically used, namely

$$ p(x)\, T(x \to x') = p(x')\, T(x' \to x). \tag{9.21} $$

This is called detailed balance and it is a special case that fulfills Eq. 9.20. To proceed further, we separate $T(x \to x')$ into two factors:

$$ T(x \to x') = w_{xx'} \times A_{xx'}. \tag{9.22} $$

The first factor, $w_{xx'}$, denotes the trial step probability: the probability to propose state $x'$ given state $x$. The second factor, $A_{xx'}$, is the acceptance probability: the probability to accept the proposed new state $x'$ if the system was in state $x$ before. The trial step probabilities should be symmetric, $w_{xx'} = w_{x'x}$. Whether this is actually the case must be checked when setting up a Monte Carlo simulation. Then the detailed balance condition, Eq. 9.21, simplifies to

$$ \frac{A_{xx'}}{A_{x'x}} = \frac{p(x')}{p(x)}. \tag{9.23} $$

This relation can be satisfied as follows:

$$ A_{xx'} = \begin{cases} 1 & \text{if } p(x') > p(x) \\ p(x')/p(x) & \text{if } p(x') < p(x). \end{cases} \tag{9.24} $$

Now we have all the information to outline the steps to be taken in the Metropolis algorithm, namely
(1) Start with a state $x_i$ (with $i = 0$).
(2) Generate a state $x'$ from $x_i$ (such that $w_{x_i x'} = w_{x' x_i}$).
(3) If $p(x') > p(x_i)$, $A_{x_i x'} = 1$, i.e., set $x_{i+1} = x'$. Otherwise, accept the move ($x_{i+1} = x'$) with probability $q = p(x')/p(x_i)$ or reject the move ($x_{i+1} = x_i$) with probability $1 - q$.
(4) Continue with (2).

In the context of statistical physics, $p(x)$ is the Boltzmann weight, i.e., $p(x) \propto e^{-\beta H(x)}$. Since the probability decreases monotonically with $H$, step (3) can be reformulated as follows. Instead of using probabilities, one uses energies, i.e., instead of checking $p(x') > p(x_i)$, one checks $H(x') < H(x_i)$. In other words, if the trial move lowers the energy, it is always accepted, and if it increases the energy by an amount $\Delta E$, it is accepted with a probability $e^{-\beta \Delta E}$.
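Steps (1) to (4), in their energy formulation, can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the state is a single real coordinate, the "Hamiltonian" is the hypothetical test case $H(x) = x^2/2$, and trial moves are drawn uniformly from a symmetric interval so that $w_{xx'} = w_{x'x}$ holds by construction. For this test case and $\beta = 1$ the exact result is $\langle x^2 \rangle = 1$, which the sample average can be compared against.

```python
import numpy as np

def metropolis(hamiltonian, x0, beta, step_size, n_steps, rng):
    """Metropolis sampling of p(x) ~ exp(-beta * H(x)) for a scalar state x."""
    x = x0
    e = hamiltonian(x)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        # Step (2): symmetric trial move, uniform in [-step_size, step_size].
        x_trial = x + rng.uniform(-step_size, step_size)
        delta_e = hamiltonian(x_trial) - e
        # Step (3): always accept if the energy is lowered, otherwise
        # accept with probability exp(-beta * delta_e).
        if delta_e <= 0.0 or rng.random() < np.exp(-beta * delta_e):
            x = x_trial
            e = e + delta_e
        # A rejected move counts the old state again.
        samples[i] = x
    return samples

rng = np.random.default_rng(seed=1)
samples = metropolis(lambda x: 0.5 * x**2, x0=0.0, beta=1.0,
                     step_size=1.5, n_steps=100_000, rng=rng)
# Discard the first part of the chain as equilibration (an ad hoc choice).
mean_x2 = np.mean(samples[10_000:] ** 2)
print(f"<x^2> from Metropolis sampling: {mean_x2:.3f}")
```

Note that a rejected trial move still contributes a sample (the old state is counted again); leaving those out would bias the averages.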
Problems

9.1 Different phases in a system of Argon atoms In the following you simulate different phases in a system of interacting particles that mimic Argon atoms. Depending on the externally imposed conditions (temperature and density) the system is either in a gaseous, liquid or solid state. Why do we choose Argon and not water, which would be much more interesting for biological systems? Water turns out to be very complicated because the molecule is asymmetric and forms hydrogen bonds, etc. We want something simpler and a noble gas is the first choice as we do not have to worry about the formation of molecules. Historically, Argon (Ar) is the best studied system, and that is why we chose this element here.

Unlike in Problem 2.6 (the ideal gas), the atoms interact. As Argon atoms are neutral, they do not interact via the Coulomb interaction. However, small displacements between the nucleus and the electron cloud give rise to a small dipole moment. Overall, the interaction between dipoles gives an attractive interaction that scales as $1/r^6$, where $r$ is the distance between atoms (van der Waals interaction). On the other hand, repulsion of quantum mechanical origin occurs at short distances (Pauli repulsion). Both of these aspects are captured
in the Lennard–Jones potential, which takes the form
$$ U(r) = 4\epsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^{6} \right]. $$

The specific form of the repulsive $(\sigma/r)^{12}$-term has purely historical reasons (it is convenient as it is the square of the attractive term). For Argon the fitting parameters have been determined to be $\epsilon/k_B = 119.8$ K and $\sigma = 3.405$ Å.

We want to simulate an infinite system of Argon atoms, but obviously a computer cannot simulate infinite space. In practice, we will use a finite box of length $L$, with periodic boundary conditions (to mimic an infinite system). The periodic boundary conditions have two important consequences:
• If a particle leaves the box, it reenters at the opposite side.
• The evaluation of the interaction between two atoms is also influenced by the periodic boundary conditions, as they introduce an infinite number of copies of the particles around the simulation box with which a given particle interacts.

Regarding the latter point: the Lennard–Jones potential decays fairly rapidly with distance, namely like $r^{-6}$. That is, although the interaction of a given particle, say $i$, with another, say $j$, contains an infinite number of terms (which results from the original particle and its infinite number of copies), only one term makes a significant contribution. All others are much smaller. It is only the copy (or the original) of particle $j$ that is closest to particle $i$ that needs to be accounted for; the rest can be safely neglected. This is called the minimum image convention and it is used in the following. Figure 9.4 depicts an example. It shows the original box with five particles in the center and images of the system all around (in pale colors). Even though the orange particle interacts with four particles and their infinitely many copies, only four interactions need to be accounted for. In this particular configuration only one of those four interactions is an interaction between a pair of original particles.

Now, how do the particle positions evolve over time? This is explained in the main chapter.
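To make the Lennard–Jones interaction and the minimum image convention concrete, here is a minimal NumPy sketch that returns the forces on all particles and the total potential energy in a periodic box. The function name `lj_forces`, its default parameters and the rounding-based way of selecting the nearest image are assumptions made for illustration; it is one possible implementation, not the one prescribed by the problem. The force follows from $\mathbf{F} = -\nabla U$ applied to the potential above.

```python
import numpy as np

def lj_forces(pos, box, epsilon=1.0, sigma=1.0):
    """Lennard-Jones forces (N, d) and total potential energy in a periodic box.

    Only the nearest image of each partner particle is taken into account.
    """
    # Separation vectors x_i - x_j for all pairs, shape (N, N, d).
    diff = pos[:, None, :] - pos[None, :, :]
    # Nearest image: shift every component into the range [-box/2, box/2].
    diff -= box * np.round(diff / box)
    r2 = np.sum(diff**2, axis=-1)
    np.fill_diagonal(r2, np.inf)            # no interaction of a particle with itself
    s6 = (sigma**2 / r2) ** 3               # (sigma/r)^6
    s12 = s6**2                             # (sigma/r)^12
    # Every pair appears twice in the (N, N) array, hence the factor 1/2.
    energy = 0.5 * np.sum(4.0 * epsilon * (s12 - s6))
    # F_i = sum_j 24 eps (2 (sigma/r)^12 - (sigma/r)^6) / r^2 * (x_i - x_j)
    prefactor = 24.0 * epsilon * (2.0 * s12 - s6) / r2
    forces = np.sum(prefactor[:, :, None] * diff, axis=1)
    return forces, energy

# Two particles close to opposite walls of a box of length 10:
# through the periodic boundary their nearest-image distance is 1, not 9.
pos = np.array([[0.5, 5.0], [9.5, 5.0]])
forces, energy = lj_forces(pos, box=10.0)
print(forces)
print(energy)
```

This version builds all $N \times N$ pair vectors at once, which is convenient for a few hundred particles but needs memory that grows as $N^2$.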
Here you should first implement the simplest scheme, Eqs. 9.2 and 9.3. As in Problem 2.6, the particles’ positions and velocities can be most conveniently stored in
Figure 9.4 The minimum image convention.
NumPy arrays. This then makes it possible to represent the original Eqs. 9.2 and 9.3 in a very compact form in your code.

First milestone: Play around with this system. Start by simulating the time evolution of a few particles (e.g., two particles on collision course or several particles with random initial positions and velocities) in a periodic box, add the forces due to the Lennard–Jones potential, employing the minimum image convention. Store the trajectories of the particles and then display them (e.g., as a scatterplot). Check how the total energy of your system evolves over time. It is easier to start with a 2D system, but plan to switch to 3D at a later stage.

When you start coding the simulation of the motion of the Argon atoms, you soon find that the normal SI units (kg, m, s, etc.) result in numbers of vastly different magnitude, e.g., the mass of the Argon atoms is $6.6 \times 10^{-26}$ kg and typical distances are of order Ångström, $10^{-10}$ m. Working with these different orders of magnitude is cumbersome and prone to round-off errors in the simulation. It is therefore useful to identify some natural units in the simulation so that the variables are of the same order of magnitude.
We can already find two natural units from the Lennard–Jones potential, namely $\sigma$ for the position and $\epsilon$ for the energy. Let us thus define $\tilde{\mathbf{x}} = \mathbf{x}/\sigma$ (and hence also $\tilde{r} = r/\sigma$). We can then define a dimensionless Lennard–Jones potential

$$ \tilde{U}(\tilde{r}) = U(r)/\epsilon = 4 \left( \tilde{r}^{-12} - \tilde{r}^{-6} \right). $$

Next we derive Newton’s equation for dimensionless units. We start from

$$ \frac{d^2 \tilde{\mathbf{x}}}{dt^2} = \sigma^{-1} \frac{d^2 \mathbf{x}}{dt^2} = -\sigma^{-1} \frac{1}{m} \nabla U(r) = -\frac{\epsilon}{m\sigma} \nabla \tilde{U}(\tilde{r}) = -\frac{\epsilon}{m\sigma^2} \tilde{\nabla} \tilde{U}(\tilde{r}). $$

The only variable that still has a dimension is time $t$. If we also define a dimensionless time $\tilde{t} = t/\sqrt{m\sigma^2/\epsilon}$, then Newton’s equation takes the following simple form:

$$ \frac{d^2 \tilde{\mathbf{x}}}{d\tilde{t}^2} = -\tilde{\nabla} \tilde{U}(\tilde{r}). $$

We can omit the tildes and just say that the length is measured in units of $\sigma$, the energy in units of $\epsilon$ and the time in units of $\sqrt{m\sigma^2/\epsilon}$. The advantages of this approach are
• Less cumbersome notation of formulas in the program (and thus less error prone)
• Simpler equations that are easier to code
• Insight into what are the expected length and time scales in our system

Concerning the last item: If we put the numbers for Argon in the units of time, we arrive at $\sqrt{m\sigma^2/\epsilon} = 2.15 \times 10^{-12}$ s. To check if this is a typical time for our system, we compare with the expected velocities. From the equipartition theorem we have $v = \sqrt{3k_B T/m}$. The natural unit of velocity in our units is $\sigma/\sqrt{m\sigma^2/\epsilon} = \sqrt{\epsilon/m}$. Hence, in dimensionless units $v/\sqrt{\epsilon/m} = \sqrt{3k_B T/\epsilon}$. As mentioned above, for Argon one has $\epsilon/k_B \approx 100$ K. This means that the ratio of these two velocities is of order one if we simulate the system at temperatures of around 100 K. The particles then move a
typical distance of order $\sigma$ in time $\sqrt{m\sigma^2/\epsilon}$. We need to approximate the continuous trajectories in the real system by sufficiently fine discretized trajectories of our MD simulation. This is especially the case for particles in close contact (at a typical distance $\sigma$) where we have to make sure that, e.g., the particles do not pass through each other. From this we can conclude that the time step $h$ should be smaller than one (of order $10^{-3}$ to $10^{-2}$).

Second milestone: Derive the expression of the kinetic energy in dimensionless units. Then write a molecular dynamics code that uses dimensionless units and simulate a few atoms (in three dimensions). Plot the kinetic, potential and total energy.

We discussed earlier that the periodic boundary conditions also influence how the forces are calculated. In particular, we chose the convention that we only take into account the forces due to the nearest images of other particles (minimum image convention). One might naively expect that to calculate the force exerted by particle $j$ at $\mathbf{x}_j$ on particle $i$ at $\mathbf{x}_i$, we would have to consider all the images of particle $j$ in the 26 boxes surrounding the original box, and then to choose between particle $j$ and its images the one that is closest to particle $i$. However, since we are working with an orthogonal coordinate system, there is a significant simplification. We need to minimize the distance

$$ r_{ij} = \sqrt{ \left( x_i - x_j' \right)^2 + \left( y_i - y_j' \right)^2 + \left( z_i - z_j' \right)^2 } $$

where $x_j'$, $y_j'$, $z_j'$ are the coordinates of the original particle $j$, $x_j' = x_j$, or of an image, for example, $x_j' = x_j + L$ or $x_j' = x_j - L$. Since we can shift each of the coordinates individually, we can minimize $r_{ij}$ by minimizing each coordinate direction separately. Instead of 27 possibilities, we only need to examine nine. The problem simplifies even more, since now we are effectively dealing with 1D problems. In particular we know that for a box of length $L$:

$-L/2 < x_i - x_j < L/2 \;\rightarrow\; x_j' = x_j$
$x_i - x_j > L/2 \;\rightarrow\; x_j' = x_j + L$
$x_i - x_j < -L/2 \;\rightarrow\; x_j' = x_j - L$

where $x_j'$ denotes now the closest image. A very compact formulation of this operation (in Python) makes use of the modulo operator:
    (xi - xj + L/2) % L - L/2

Here xi is a NumPy array that contains the positions of all particles (in a 3N-dimensional array). Think about why it works by looking at the different possible cases.

After having efficiently coded the minimum image convention, implement the Verlet algorithm, Eq. 9.6, or the velocity Verlet algorithm, Eqs. 9.8 and 9.11.

Third milestone: Redo the simulation from the second milestone. You should find that variations in the kinetic energy with time are exactly cancelled by opposite variations in the potential energy, leading to a constant total energy.

We now have a working code. Next we need to set correct initial conditions, i.e., we need to simulate the system at a given density and temperature. We start with the positions. Our choice will be motivated by the following two considerations:
(i) We do not want the particles to start too close to each other, otherwise the $r^{-12}$ repulsive potential can lead to huge contributions to the potential energy. So it makes sense to put particles on a regular grid.
(ii) We want to simulate different phases of Argon. It is known that Argon forms a face-centered cubic lattice in the solid phase, see Fig. 9.5. As we use periodic boundary conditions and as we want a regular arrangement of atoms in the solid, we best choose a total number of particles that is compatible with such a lattice.

In short, at the start we place the particles on a face-centered cubic lattice. Each repeating unit consists of 4 atoms. If we choose a total volume containing 3×3×3 such units, we need 108 atoms. This turns out to be a reasonable size for a simulation. It is obviously straightforward to set the desired particle density $\rho$ by choosing the appropriate volume of the simulation box.

Finally, we also need to set the temperature. The initial temperature should be chosen such that we get a Maxwell velocity distribution for that given temperature, Eq. 2.43. This is done by choosing the velocity components in $x$, $y$ and $z$ to be Gaussian distributed, e.g., $p(v_x) \sim e^{-m v_x^2/(2k_B T)}$.
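The initial conditions described above can be set up as in the following sketch, written in dimensionless units ($m = 1$, $k_B = 1$, lengths in units of $\sigma$). The function names `fcc_positions` and `maxwell_velocities` are hypothetical, and subtracting the mean velocity so that the box as a whole does not drift is an extra step assumed here for convenience; it is not required by the text.

```python
import numpy as np

def fcc_positions(n_cells, density):
    """Place 4 * n_cells**3 particles on an fcc lattice in a cubic box.

    Returns the positions, shape (N, 3), and the box length for the given
    number density.
    """
    n_particles = 4 * n_cells**3
    box = (n_particles / density) ** (1.0 / 3.0)
    a = box / n_cells                         # edge length of one unit cell
    # The four atoms of one fcc unit cell, in units of a.
    basis = np.array([[0.0, 0.0, 0.0],
                      [0.5, 0.5, 0.0],
                      [0.5, 0.0, 0.5],
                      [0.0, 0.5, 0.5]])
    # Integer coordinates of the origin of every unit cell.
    cells = np.array([[i, j, k]
                      for i in range(n_cells)
                      for j in range(n_cells)
                      for k in range(n_cells)])
    pos = (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3) * a
    return pos, box

def maxwell_velocities(n_particles, temperature, rng):
    """Gaussian velocity components with variance T (dimensionless units)."""
    vel = rng.normal(0.0, np.sqrt(temperature), size=(n_particles, 3))
    return vel - vel.mean(axis=0)             # assumption: remove overall drift

rng = np.random.default_rng(seed=2)
pos, box = fcc_positions(n_cells=3, density=0.8)
vel = maxwell_velocities(len(pos), temperature=1.0, rng=rng)
print(pos.shape, box)
```

With `n_cells=3` this gives the 108 atoms mentioned above; the density enters only through the box length.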
Figure 9.5 A unit cell of the face-centered cubic lattice. The full lattice is obtained by extending this unit in all directions.
At first it seems straightforward to set the temperature by drawing velocities from the Maxwell distribution. Unfortunately, the situation is more complex: When we start the simulation, the system is not yet equilibrated. In fact, it equilibrates by exchanging energy between its kinetic and potential form. It is impossible to predict in advance what exactly will happen. The solution is to first let the system equilibrate for a while and then force the kinetic energy to have a value corresponding to the desired temperature. We do this by rescaling the velocities $\mathbf{v}_i \to \lambda \mathbf{v}_i$ with the same parameter $\lambda$ for all particles. $\lambda$ follows from the equipartition theorem, Eq. 2.39. Specifically, we know that our “target” kinetic energy is given by

$$ E_{\text{kin}}^{\text{target}} = \frac{3}{2} (N - 1) k_B T. $$

The reason for using $N - 1$ instead of $N$ is that the total momentum in the system, $\sum_i m\mathbf{v}_i$, is conserved, leaving thus only $N - 1$ independent degrees of freedom per spatial direction. In order for our current kinetic energy $E_{\text{kin}}^{\text{current}} = (1/2) \sum_i m\mathbf{v}_i^2$ to have the right value $E_{\text{kin}}^{\text{target}}$, $\lambda$ must be chosen such that $E_{\text{kin}}^{\text{target}} = \lambda^2 E_{\text{kin}}^{\text{current}}$. From this follows

$$ \lambda = \sqrt{ \frac{(N - 1)\, 3 k_B T}{\sum_i m \mathbf{v}_i^2} }. $$
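In dimensionless units ($m = 1$, $k_B = 1$) the rescaling factor $\lambda$ reduces to a one-liner. The sketch below uses the hypothetical helper names `rescale_velocities` and `kinetic_temperature`; the random velocities only stand in for the state of a partially equilibrated simulation.

```python
import numpy as np

def kinetic_temperature(vel):
    """Temperature from the kinetic energy, using 3 (N - 1) degrees of freedom."""
    n_particles = vel.shape[0]
    return np.sum(vel**2) / (3.0 * (n_particles - 1))

def rescale_velocities(vel, target_temperature):
    """Multiply all velocities by lambda so that the kinetic energy hits its target."""
    n_particles = vel.shape[0]
    lam = np.sqrt(3.0 * (n_particles - 1) * target_temperature / np.sum(vel**2))
    return lam * vel

rng = np.random.default_rng(seed=4)
vel = rng.normal(0.0, 1.3, size=(108, 3))    # stand-in for velocities from a running simulation
vel_rescaled = rescale_velocities(vel, target_temperature=1.0)
print(kinetic_temperature(vel), kinetic_temperature(vel_rescaled))
```

In an actual run the call to `rescale_velocities` would alternate with stretches of ordinary time integration.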
The equilibration and rescaling needs to be repeated a few times until the temperature has converged. Note that the above formulas are not yet in dimensionless units. For the simulation they need to be converted accordingly.

Fourth milestone: Implement the initial conditions such that you can simulate the system at any desired density and temperature.

You now have a fully working code to simulate a system of Argon atoms in a box. Now it is time to study the system at different initial conditions and to observe the different phases. They can be neatly distinguished by the pair correlation function. Given a reference particle, it gives the probability to find another particle at distance $r$. In the simulation, you can compute the pair correlation by making a histogram of $n(r)$, the number of all pairs of particles within a distance $[r, r + \Delta r]$, where $\Delta r$ denotes the bin size. The pair correlation function is then given by

$$ g(r) = \frac{2V}{N(N - 1)} \frac{\langle n(r) \rangle}{4\pi r^2 \Delta r} $$

where $V$ is the volume of the simulation cell, and $N$ the number of particles. $\langle \ldots \rangle$ denotes the average over many configurations. The various factors surrounding $\langle n(r) \rangle$ are chosen in such a way that $g(r) = 1$ if the particles would be homogeneously distributed (e.g., in an ideal gas); convince yourself of this. The pair correlation function has very specific forms in the different states of matter, which reflect the different degrees of order in these states.

Finally, calculate the pressure for the different states of the system. The pressure $p$ can be obtained from the following formula (Thijssen, 2007):

$$ \frac{p}{k_B T \rho} = 1 - \frac{1}{3N k_B T} \left\langle \sum_i \sum_{j > i} r_{ij} \frac{\partial U}{\partial r}\left( r_{ij} \right) \right\rangle $$

where $r_{ij}$ denotes the distance between particle $i$ and $j$, i.e., $r_{ij} = |\mathbf{x}_j - \mathbf{x}_i|$, and $U$ is the Lennard–Jones potential. The first term on the right hand side corresponds to the ideal gas whereas the second provides corrections due to the interaction between particles that can increase or decrease the pressure. This exact formula, based on the so-called virial theorem, is given here without derivation.
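The histogram recipe for $g(r)$ might look as follows in NumPy. This is a sketch under stated assumptions: the function name `pair_correlation`, the number of bins and the use of the bin midpoint for $r$ in $4\pi r^2 \Delta r$ are illustrative choices, and uniformly random configurations are used as a stand-in for an ideal gas, for which $g(r)$ should come out close to one.

```python
import numpy as np

def pair_correlation(configs, box, n_bins=50):
    """Pair correlation function g(r) for r < box/2, averaged over configurations.

    configs is a list of position arrays of shape (N, 3).
    """
    n_particles = configs[0].shape[0]
    volume = box**3
    edges = np.linspace(0.0, box / 2.0, n_bins + 1)
    dr = edges[1] - edges[0]
    r_mid = 0.5 * (edges[1:] + edges[:-1])
    counts = np.zeros(n_bins)
    upper = np.triu_indices(n_particles, k=1)     # every pair exactly once
    for pos in configs:
        diff = pos[:, None, :] - pos[None, :, :]
        diff -= box * np.round(diff / box)        # nearest images only
        dist = np.sqrt(np.sum(diff**2, axis=-1))[upper]
        counts += np.histogram(dist, bins=edges)[0]
    n_avg = counts / len(configs)                 # <n(r)>
    g = (2.0 * volume / (n_particles * (n_particles - 1))
         * n_avg / (4.0 * np.pi * r_mid**2 * dr))
    return r_mid, g

# Stand-in data: uniformly random positions mimic an ideal gas.
rng = np.random.default_rng(seed=3)
box = 5.0
ideal_gas = [rng.uniform(0.0, box, size=(108, 3)) for _ in range(200)]
r, g = pair_correlation(ideal_gas, box)
print(np.mean(g[r > box / 4]))
```

For the innermost bins the midpoint approximation of the shell volume is crude, so small deviations from one at very small $r$ reflect the binning rather than the physics.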
Final milestone: Simulate Argon in three different states of matter (gas, liquid and solid) and plot the corresponding pair correlation functions. Also calculate the pressure values together with an estimate of the error by performing several (e.g., 10) independent simulations (see Problem 2.6(j) on how to calculate the error of the mean).

Hints: You can choose the following (dimensionless) parameters for the different states: ρ = 0.3 and T = 3.0 (gas), ρ = 0.8 and T = 1.0 (liquid) and ρ = 1.2 and T = 0.5 (solid). When plotting the correlation function, only plot up to the distance r = L/2 (why?).

9.2 Phase transition in the two-dimensional Ising model

Here you study the thermodynamic properties of the two-dimensional Ising model. It consists of a system of interacting spins on a lattice and serves as a simple model for ferromagnetism. It allows you to study the behavior of a system around a critical point, see Appendix C. More specifically, the spins sit on a square lattice of size N × N. $s_i$ is the value of the spin at site $i$; it can take the values +1 and −1, as it points either "up" or "down." The Hamiltonian is given by
$$ \mathcal{H} = -J \sum_{\langle i,j \rangle} s_i s_j - H \sum_i s_i . $$
The first term accounts for a coupling between spins. The summation is over nearest neighbor pairs on the lattice (each spin has four nearest neighbors; we assume periodic boundary conditions). $J$ is a coupling constant. We assume here $J > 0$, which means that the spins prefer to be aligned to lower the energy of the system; for simplicity, set $J = 1$. The second term in the Hamiltonian accounts for the presence of an external magnetic field $H$, which favors spins with the same sign as that of the field. In the following we do not consider this term, i.e., we always set $H = 0$. You will find that, below a critical temperature, the system shows overall alignment of the spins even though an external magnetic field is absent, i.e., the system exhibits ferromagnetic behavior.

There is obviously no molecular dynamics simulation of the Ising model. Instead, it is an ideal system to apply the Monte Carlo approach and to understand how it works. To set up the Metropolis algorithm we need to choose specific values for the trial step probabilities $\omega_{x x'}$, see Eq. 9.22. As we are on an N × N lattice, this is most naturally done as follows:
$$ \omega_{x x'} = \begin{cases} 1/N^2 & \text{if } x \text{ and } x' \text{ differ by one spin} \\ 0 & \text{otherwise.} \end{cases} $$
This means the following. Suppose your $N^2$ spins are in state $x$. You now create a trial state $x'$ by picking one spin at random and flipping it. You then calculate the energy difference $\Delta E = \mathcal{H}(x') - \mathcal{H}(x)$. To do this calculation, you only need to calculate the change in the interaction energy of the selected spin with its four neighbors. If $\Delta E \le 0$, always accept the move, and if $\Delta E > 0$, accept it only with probability $e^{-\beta \Delta E}$.

One complication is that you need to account for the periodic boundary conditions. As in the previous exercise, the easiest way to do this is with the modulo operator. Suppose you calculate the interaction of a spin at [nx, ny] with the neighbor to the right at [nx + 1, ny]. How does one deal with a spin at the right boundary? This is automatically accounted for if you look up the spin at [(nx + 1) % L, ny], where L is the number of lattice sites along one direction.

It is useful to always keep track of the total energy of the system. To do this, calculate this energy at the beginning of the simulation by adding up all interactions between the spins. Then, as you run the simulation, simply add $\Delta E$ to the total energy whenever there is a successful spin flip. Similarly, you can always keep track of the total magnetization, $M = \sum_i s_i$. Moreover, since $e^{-\beta \Delta E}$ can take only five different values, you can calculate these values only once for a given temperature and store them in an array (in fact, you only need two values, for those cases where $\Delta E > 0$).

First run some simulations at various fixed temperatures. Plot the magnetization per spin, $m = M/N$ (with $N$ from here on denoting the total number of spins), as a function of "time," measured here in Monte Carlo steps per lattice site (also called one sweep of the lattice). The reason we measure time this way is that each spin has, on average, attempted one flip per sweep (independent of the total size of the lattice).
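The bookkeeping described above (single spin flips, periodic boundaries via the modulo operator, tabulated Boltzmann factors, running totals of energy and magnetization) can be sketched as follows. This is a minimal illustration, not the reference solution: the function names and the list-of-lists lattice are choices made here, and $k_B = 1$ and $H = 0$ are assumed.

```python
import math
import random


def total_energy(spins, J=1.0):
    """Energy -J * sum over nearest-neighbour pairs, periodic boundaries."""
    L = len(spins)
    energy = 0.0
    for x in range(L):
        for y in range(L):
            # use only the right and the upper bond so each pair is counted once
            energy -= J * spins[x][y] * (spins[(x + 1) % L][y] + spins[x][(y + 1) % L])
    return energy


def metropolis_sweep(spins, beta, energy, magnetization, rng, J=1.0):
    """One sweep = L*L attempted single spin flips; returns updated totals."""
    L = len(spins)
    # Boltzmann factors are only needed for the two cases with dE > 0
    boltzmann = {4: math.exp(-4.0 * beta * J), 8: math.exp(-8.0 * beta * J)}
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        s = spins[x][y]
        neighbours = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                      + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
        k = 2 * s * neighbours  # dE / J, one of -8, -4, 0, 4, 8
        if k <= 0 or rng.random() < boltzmann[k]:
            spins[x][y] = -s
            energy += J * k
            magnetization -= 2 * s
    return energy, magnetization


def random_lattice(L, rng):
    """Random start, corresponding to infinite temperature."""
    return [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
```

A typical run would create a lattice with `random_lattice`, compute the initial totals once with `total_energy` and a sum over all spins, and then call `metropolis_sweep` repeatedly while recording the magnetization per spin after each sweep. Comparing the running energy against a fresh call to `total_energy` from time to time is a cheap way to catch bookkeeping errors.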
You can start either with a random spin configuration (corresponding to an infinite temperature) or with a state where all spins point in the same direction (corresponding to a state at zero temperature). A
reasonable size of the lattice is 50 × 50, i.e., N = 2500 spins, but it can also be useful to study smaller or larger lattices. It is also recommended to plot the spin configuration at the end of your simulation.

As mentioned in Appendix C, there is a critical temperature below which there is a spontaneous magnetization. For the two-dimensional Ising model the critical temperature is known analytically (Kramers and Wannier, 1941):
$$ \frac{k_B T_c}{J} = \frac{2}{\ln\left(1 + \sqrt{2}\right)} \approx 2.269. \tag{9.25} $$
You should run several simulations for various temperatures, (at least) from T = 1.0 to T = 4.0 (set $J = k_B = 1$) in steps of 0.2. For each temperature you need to get a rough estimate of the equilibration time by inspecting curves of the magnetization as a function of time. It is best to perform two simulations starting at different (or even the same) initial conditions and to see when the two curves reach similar values in magnetization. You may find that it takes longer to equilibrate near the critical temperature.

Now that you know how long to wait to get equilibrated samples, you need to figure out how long to run the simulation after equilibration to determine physical quantities with reasonable statistics. To do this, you next need to determine the correlation time, the time it takes for the system to forget its previous state. A physical quantity should then be measured in a simulation that is several correlation times long. To determine this time, you can follow the behavior of the magnetization after the sample is equilibrated. The autocorrelation function $\chi(t)$ of the magnetization measures how much the magnetization at time $t'$ is related to the magnetization at time $t' + t$. For example, if the magnetization per spin at time $t'$ is larger than its average, i.e., $m(t') > \langle m \rangle$, then, if $t$ is smaller than the correlation time, it is typically also larger than average at $t' + t$, i.e., $m(t' + t) > \langle m \rangle$.
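On the other hand, if $t$ is larger than the correlation time, then $m(t' + t) - \langle m \rangle$ is equally likely to be positive or negative, independent of the value of $m(t')$. The so-called time-displaced autocorrelation function of the magnetization is given by
$$ \chi(t) = \int \left[ m(t') - \langle m \rangle \right] \left[ m(t' + t) - \langle m \rangle \right] dt' = \int \left[ m(t') \, m(t' + t) - \langle m \rangle^2 \right] dt' . $$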
(This quantity should not be confused with the magnetic susceptibility $\chi_M$ that you measure further below.) In order to determine $\chi$ for all times $t$, one has in principle to integrate over an infinite time, but this is obviously not possible for a simulation. In addition, time is here discrete and counts the Monte Carlo steps per lattice site. Call the total number of sweeps (after equilibration) $t_{\max}$. For this case the formula for the autocorrelation function is given by
$$ \chi(t) = \frac{1}{t_{\max} - t} \sum_{t'=0}^{t_{\max} - t} m(t') \, m(t' + t) - \frac{1}{t_{\max} - t} \sum_{t'=0}^{t_{\max} - t} m(t') \times \frac{1}{t_{\max} - t} \sum_{t'=0}^{t_{\max} - t} m(t' + t) . $$
The second term calculates $\langle m \rangle^2$; the choice of the two summations makes sure that the same subsets of the data are used as in the first term. This is not strictly necessary but improves the behavior of $\chi(t)$ (Newman and Barkema, 1999). The measured autocorrelation function falls off approximately exponentially, $\chi(t) \approx \chi(0) \, e^{-t/\tau}$, where $\tau$ denotes the correlation time. It should be noted, however, that the statistical fluctuations of $\chi(t)$ increase for larger values of $t$. This reflects the fact that there are fewer and fewer intervals $[t', t' + t]$ to average over. For long enough simulations, the exponential function has already decayed to a very small value before these fluctuations become important. This allows one to determine $\tau$ by integrating over the autocorrelation function:
$$ \int_0^\infty \frac{\chi(t)}{\chi(0)} \, dt = \int_0^\infty e^{-t/\tau} \, dt = \tau . $$
To perform this operation for the discrete time steps, you replace the integration by a summation and simply stop the summation once $\chi(t) < 0$, as this signals the onset of poor statistics for longer times.

Calculate the correlation time as a function of temperature for the same temperature values as above. It is important that you only calculate this quantity after the system is equilibrated. Plot $\tau$ as a function of temperature. As you will observe, $\tau$ peaks around the critical temperature, an effect called critical slowing down.
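The discrete autocorrelation function and the summation rule for $\tau$ described above might be coded along the following lines. This is a sketch under assumptions: `m` is taken to be a list with one magnetization value per sweep (recorded after equilibration), the function names are made up here, and each sum is normalized by its actual number of terms, which differs slightly from the prefactor $1/(t_{\max} - t)$ in the formula above.

```python
def autocorrelation(m, t):
    """Time-displaced autocorrelation chi(t) of the series m at lag t."""
    n = len(m) - t                     # number of available pairs (t', t' + t)
    first = m[:n]
    shifted = m[t:t + n]
    mean_product = sum(a * b for a, b in zip(first, shifted)) / n
    # use the same two subsets of the data for the <m>^2 term
    return mean_product - (sum(first) / n) * (sum(shifted) / n)


def correlation_time(m):
    """Sum chi(t)/chi(0) over t, stopping at the first negative chi(t)."""
    chi0 = autocorrelation(m, 0)
    tau = 0.0
    for t in range(len(m)):
        chi = autocorrelation(m, t)
        if chi < 0.0:                  # onset of poor statistics
            break
        tau += chi / chi0
    return tau
```

Because the sum starts at $t = 0$ with a step of one sweep, this estimate is slightly larger than the $\tau$ of a pure exponential (by about half a sweep for large $\tau$), which is usually negligible compared with the statistical uncertainty.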
Now if one wants to measure the thermal average of some quantity, e.g., the average magnetization, one needs to sample it many times. Calculating the thermal average is straightforward, but calculating the standard deviation of the mean is more tricky. The problem is that two spin configurations drawn a time of $t$ sweeps apart are not independent of each other if $t$ is less than the correlation time $\tau$. A natural choice to achieve statistical independence turns out to be $t = 2\tau$ (Newman and Barkema, 1999). Consequently the standard deviation of the mean is given by
$$ \sigma = \sqrt{ \frac{2\tau}{t_{\max}} \left( \langle m^2 \rangle - \langle m \rangle^2 \right) } $$
which corresponds to Eq. 9.14 with $N$ replaced by $t_{\max}/(2\tau)$. So even if you average over all sweeps, $t_{\max}$ in total with $t_{\max} \gg \tau$, the equation for $\sigma$ accounts for the fact that you took only about $t_{\max}/(2\tau)$ independent measurements.

You are now ready to measure the equilibrium properties of the 50 × 50 system of spins. Begin with the magnetization, which is most straightforward to implement. The only problem with the magnetization is that the system (well) below the critical temperature shows either a state with an overall positive or an overall negative magnetization, depending on which state the system settles into first. One way to avoid this problem is to calculate instead the average of the absolute value of the magnetization (per spin)
$$ \langle |m| \rangle = \frac{1}{N} \left\langle \left| \sum_i s_i \right| \right\rangle . $$
Calculate the mean of this quantity and the standard deviation of the mean (see the equation for $\sigma$ above) for the set of temperatures mentioned earlier. Do the same for the energy per spin. For the same range of temperatures determine the magnetic susceptibility per spin
$$ \chi_M = \frac{1}{N} \frac{\partial \langle M \rangle}{\partial H} = \frac{\beta}{N} \left( \langle M^2 \rangle - \langle M \rangle^2 \right) $$
and the specific heat per spin
$$ C = \frac{1}{N} \frac{\partial \langle E \rangle}{\partial T} = \frac{1}{N k_B T^2} \frac{\partial^2 \ln Z}{\partial \beta^2} = \frac{1}{N k_B T^2} \left( \langle E^2 \rangle - \langle E \rangle^2 \right) . $$
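The fluctuation formulas above translate directly into code. The following is a small sketch, assuming that the total magnetization and total energy have been recorded once per sweep in plain Python lists, that $k_B = 1$, and that $\tau$ has been estimated beforehand; all function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """<x^2> - <x>^2, clipped at zero to guard against rounding errors."""
    return max(0.0, mean([v * v for v in values]) - mean(values) ** 2)


def error_of_mean(samples, tau):
    """Standard deviation of the mean for samples with correlation time tau."""
    return math.sqrt(2.0 * tau / len(samples) * variance(samples))


def susceptibility_per_spin(m_total, beta, n_spins):
    """chi_M = beta / N * (<M^2> - <M>^2), with M the total magnetization."""
    return beta / n_spins * variance(m_total)


def specific_heat_per_spin(e_total, temperature, n_spins):
    """C = (<E^2> - <E>^2) / (N T^2), with E the total energy and k_B = 1."""
    return variance(e_total) / (n_spins * temperature ** 2)
```

For the magnetization below the critical temperature, one would pass the absolute values of the recorded magnetization per spin to `mean` and `error_of_mean`, as discussed above.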
Estimate also the error for these two quantities. This is, however, a bit more challenging than for the cases of magnetization and energy. The reason is that, unlike for the latter two quantities, you do not determine $\chi_M$ and $C$ by averaging over some measurements repeated many times over the course of the simulation. Instead these quantities are determined in a more complex way from these measurements. In order to give a rough estimate of the error, you can use the blocking method. You divide the measurement into successive blocks, each containing $n$ measurements. For each block you determine $\chi_M$ and $C$. Assuming that blocks are independent of each other, the standard error is estimated in the usual way. How large should the blocks be? Obviously they need to be large enough such that a calculation of $\chi_M$ and $C$ within a block averages over (much) more than a correlation time $\tau$. Otherwise the fluctuations in magnetization and energy are severely underestimated, as, e.g., $\langle M^2 \rangle$ and $\langle M \rangle^2$ are too similar in value. In addition, different blocks should be approximately independent, which again requires a block length (much) larger than $\tau$. For the current system a reasonable value for the block length is $16\tau$, calling for a relatively long simulation time around the critical point (as a sufficient number of blocks is needed for proper averaging).

There are many possible ways to explore this system further, see, e.g., Newman and Barkema (1999). You could study the effect of an external magnetic field. There are more systematic methods to estimate errors for quantities like $\chi_M$ and $C$, especially the bootstrap and the jackknife methods. You can implement so-called cluster flipping algorithms (such as the Wolff algorithm), which are more suitable to explore the region around the critical point where large clusters of parallel spins occur (see also Eq. C.1 in Appendix C); these algorithms do not suffer from the dramatic critical slowing down around the critical point which we observed for the spin-flipping algorithm. Also of great interest is to extract the values for the critical exponents $\nu$, $\gamma$, $\alpha$ and $\beta$ introduced in Appendix C; this can be achieved in a systematic way through the finite size scaling method.
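The blocking method mentioned above can be sketched generically: split the time series into blocks, evaluate the quantity of interest on each block, and take the standard error of the block values. The helper below is an illustration only; `block_error` and its arguments are names chosen here, and `estimator` stands for any function that maps a list of samples to a number (for instance a susceptibility or specific heat computed from that block).

```python
import math


def block_error(samples, block_len, estimator):
    """Mean and standard error of `estimator` evaluated on successive blocks."""
    n_blocks = len(samples) // block_len     # incomplete last block is dropped
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    values = [estimator(samples[b * block_len:(b + 1) * block_len])
              for b in range(n_blocks)]
    block_mean = sum(values) / n_blocks
    # unbiased sample variance of the block values
    var = sum((v - block_mean) ** 2 for v in values) / (n_blocks - 1)
    return block_mean, math.sqrt(var / n_blocks)


def fluctuation(values):
    """<x^2> - <x>^2 within one block."""
    mu = sum(values) / len(values)
    return sum(v * v for v in values) / len(values) - mu ** 2
```

With a block length of about $16\tau$, as suggested in the text, calling `block_error(m_total, block_len, fluctuation)` and multiplying both returned numbers by $\beta/N$ would give a rough estimate of $\chi_M$ and its error.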
Appendix A
Probability Theory
Probability theory is essential to statistical physics. This is because when we look at a macroscopic system (e.g., an air-filled balloon), we do not know exactly the microscopic state of the system (i.e., the positions and momenta of all its molecules). This appendix provides the essential concepts of probability theory that are needed in this book. For example, if we roll a die, we have 6 different possible outcomes, each of which has a probability of 1/6 of occurring. In more general terms, one has a set of possible events $E_i$, $i = 1, \ldots, n$ ($n = 6$ for a die), and each event has a certain probability $p_i = p(E_i)$ with $0 \le p(E_i) \le 1$. The probabilities must add up to 1 (since one of all possible events has to occur), i.e.,
$$ \sum_{i=1}^{n} p_i = 1. \tag{A.1} $$
It is also possible that the outcome is continuous, e.g., the orientation of a wheel of fortune with respect to some reference direction can take an angle $\alpha$ between $-\pi$ and $\pi$. In that case $p(\alpha)$ is a function defined on the interval $-\pi \le \alpha \le \pi$, called the probability distribution. As in the discrete case, one has $\int_{-\pi}^{\pi} p(\alpha) \, d\alpha = 1$ or in more general terms
$$ \int p(x) \, dx = 1 \tag{A.2} $$
where the integral has to be taken over the allowed values of the continuous variable $x$ (or, alternatively, one can set $p(x) = 0$ if $x$ cannot occur and the integral can then be taken from $-\infty$ to $+\infty$). If the wheel is perfect, then $p(\alpha)$ is constant in the relevant interval of angles. It then follows from Eq. A.2 that $p(\alpha) = 1/(2\pi)$ for all $-\pi \le \alpha \le \pi$.

A random variable (or stochastic variable) is a function $f$ that attributes a number to each possible outcome (discrete or continuous). For instance, for the case of a die, we could choose the function $f_i = i$ that attributes to each outcome the number of dots shown. We can then define the expectation value of such a variable as follows
$$ \langle f \rangle = \sum_{i=1}^{n} f_i \, p_i . \tag{A.3} $$
Let us consider the case of a die for two different stochastic variables. For $f_i = i$ one finds $\langle f \rangle = 7/2$ and for $f_i = 1$ one finds $\langle f \rangle = 1$. The expectation value for a continuous variable can be defined in a similar way, namely
$$ \langle f \rangle = \int f(x) \, p(x) \, dx. \tag{A.4} $$
The special case $\langle x \rangle$ is called the average of the distribution. For the above-mentioned example of a wheel of fortune and choosing $f(\alpha) = \alpha$, we find $\langle \alpha \rangle = 0$. More generally, $\mu_m = \langle x^m \rangle$ is called the $m$-th moment of the distribution.

Of particular importance is the standard deviation $\sigma_f$ of $f$ or its square, $\sigma_f^2$, the variance. It is defined by
$$ \sigma_f^2 = \left\langle \left( f - \langle f \rangle \right)^2 \right\rangle . \tag{A.5} $$
One squares here the quantity $f - \langle f \rangle$ since otherwise this expression would average out to zero, $\langle f - \langle f \rangle \rangle \equiv 0$. $\sigma_f$ is a measure of how much the stochastic variable varies around the mean value when one repeats the experiment (throwing the die, turning the wheel, etc.) over and over again.

Figure A.1 The Gaussian distribution. [Plot of $p$ versus $x$: the peak at $x = \mu$ has height $1/(\sqrt{2\pi}\sigma)$, and the width $\sigma$ is marked on either side of $\mu$.]

One can rewrite $\sigma_f^2$ as follows
$$ \left\langle \left( f - \langle f \rangle \right)^2 \right\rangle = \langle f^2 \rangle - 2 \langle f \rangle \langle f \rangle + \langle f \rangle^2 = \langle f^2 \rangle - \langle f \rangle^2 , $$
i.e.,
$$ \sigma_f^2 = \langle f^2 \rangle - \langle f \rangle^2 . \tag{A.6} $$
For example, for a die and $f_i = i$, we find $\sigma_f^2 = 91/6 - 49/4 = 35/12$. The typical deviation $\sigma_f$ from the mean value is thus about 1.7.

The most important probability distribution is the Gaussian distribution, which we shall encounter many times in this book. It is of the form $p(x) \sim e^{-(x-\mu)^2/(2\sigma^2)}$ where $\mu$ and $\sigma$ are some arbitrary numbers. This distribution needs to be normalized to one, see Eq. A.2. This is easy if you know that
$$ \int_{-\infty}^{\infty} e^{-x^2} \, dx = \sqrt{\pi}. \tag{A.7} $$
This integral is worth remembering by heart. Hence
$$ \int_{-\infty}^{\infty} e^{-(x-\mu)^2/(2\sigma^2)} \, dx = \sqrt{2\pi} \, \sigma. \tag{A.8} $$
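This means that the Gaussian distribution is normalized to one if we choose
$$ p(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \, e^{-(x-\mu)^2/(2\sigma^2)} . \tag{A.9} $$
This distribution is depicted in Fig. A.1. Let us now calculate the average $\langle x \rangle$ and the variance $\langle x^2 \rangle - \langle x \rangle^2$ of this distribution. We start with the average:
$$ \langle x \rangle = \frac{1}{\sqrt{2\pi} \, \sigma} \int_{-\infty}^{\infty} x \, e^{-(x-\mu)^2/(2\sigma^2)} \, dx = \frac{1}{\sqrt{2\pi} \, \sigma} \int_{-\infty}^{\infty} (u + \mu) \, e^{-u^2/(2\sigma^2)} \, du = \mu. \tag{A.10} $$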
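For the last step we use the fact that the integral over $u \, e^{-u^2/(2\sigma^2)}$ vanishes since this is an odd function, i.e., a function with the symmetry $f(-u) = -f(u)$. The remaining term simply gives $\mu$ since the Gaussian distribution is normalized. Next we calculate the variance of the Gaussian distribution:
$$ \begin{aligned} \left\langle (x - \mu)^2 \right\rangle &= \frac{1}{\sqrt{2\pi} \, \sigma} \int_{-\infty}^{\infty} (x - \mu)^2 \, e^{-(x-\mu)^2/(2\sigma^2)} \, dx \\ &= -2\sigma^2 \, \frac{1}{\sqrt{2\pi} \, \sigma} \left. \frac{d}{da} \right|_{a=1} \int_{-\infty}^{\infty} e^{-a (x-\mu)^2/(2\sigma^2)} \, dx \end{aligned} \tag{A.11} $$
where we introduced a new auxiliary variable $a$. Taking the derivative of the exponential with respect to $a$ and then setting $a$ to one, we get exactly the factor $(x - \mu)^2$ that we need. Using Eq. A.8 (with $\sigma^2$ replaced by $\sigma^2/a$), we get
$$ \left\langle (x - \mu)^2 \right\rangle = -\sqrt{\frac{2}{\pi}} \, \sigma \left. \frac{d}{da} \right|_{a=1} \sqrt{\frac{2\pi}{a}} \, \sigma = \sigma^2 . \tag{A.12} $$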
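The normalization, mean and variance derived above can also be checked numerically. The following sketch uses a plain midpoint rule with the arbitrary example values μ = 1.5 and σ = 0.7; the function names, the integration range of ±10σ and the number of steps are choices made here for illustration.

```python
import math


def gaussian(x, mu, sigma):
    """Normalized Gaussian distribution, Eq. A.9."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def gaussian_moments(mu, sigma, steps=20000):
    """Midpoint-rule estimates of the norm, the mean and the variance."""
    lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma   # tails beyond this are negligible
    dx = (hi - lo) / steps
    norm = mean = second = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        w = gaussian(x, mu, sigma) * dx
        norm += w
        mean += x * w
        second += x * x * w
    return norm, mean, second - mean ** 2


norm, mean, var = gaussian_moments(1.5, 0.7)
print(norm, mean, var)
```

The three printed numbers should agree with 1, μ and σ² up to the accuracy of the numerical integration.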
Hence $\mu$ and $\sigma$ in the Gaussian distribution, Eq. A.9, correspond to its average and standard deviation.

The Gaussian distribution can be defined for a set of random variables $x_1, x_2, \ldots, x_N$ as
$$ p(x_1, x_2, \ldots, x_N) = C \exp\left( -\frac{1}{2} \sum_{n,m=1}^{N} A_{nm} (x_n - B_n)(x_m - B_m) \right) \tag{A.13} $$
where $C$ is a normalization constant and $A_{nm}$ is a symmetric positive definite matrix, i.e., $A_{nm} = A_{mn}$ and $\sum_{m,n} A_{nm} x_n x_m > 0$ for all $x_n$ that are not all zero. Through the coordinate transformation $y_n = x_n - B_n$ the distribution, Eq. A.13, takes the form
$$ p(y_1, y_2, \ldots, y_N) = C \exp\left( -\frac{1}{2} \sum_{n,m=1}^{N} A_{nm} \, y_n y_m \right) . \tag{A.14} $$
Higher moments of this multivariate Gaussian distribution with zero mean have remarkable properties. Whereas odd moments like $\langle y_i y_j y_k \rangle$ vanish due to the symmetry of the distribution, even moments can be broken down into sums over products of second moments. Without proof we give here the general formula:
$$ \left\langle y_{n_1} y_{n_2} \cdots y_{n_{2p}} \right\rangle = \sum_{\text{all pairings}} \left\langle y_{m_1} y_{m_2} \right\rangle \left\langle y_{m_3} y_{m_4} \right\rangle \cdots \left\langle y_{m_{2p-1}} y_{m_{2p}} \right\rangle . \tag{A.15} $$
The set of subscripts $\{m_1, m_2, \ldots, m_{2p}\}$ stands for a permutation of the original set $\{n_1, n_2, \ldots, n_{2p}\}$. The summation is taken over all possible pairings. To give a concrete example, we present the fourth moment:
$$ \langle y_n y_m y_k y_l \rangle = \langle y_n y_m \rangle \langle y_k y_l \rangle + \langle y_n y_k \rangle \langle y_m y_l \rangle + \langle y_n y_l \rangle \langle y_m y_k \rangle . \tag{A.16} $$
Another important property of the Gaussian distribution, Eq. A.14, is that any linear combination of the $y_n$'s, e.g., an expression of the form
$$ Y = \sum_{n=1}^{N} a_n y_n \tag{A.17} $$
is again Gaussian distributed. Specifically one has
$$ p(Y) = \frac{1}{\sqrt{2\pi \langle Y^2 \rangle}} \exp\left( -\frac{Y^2}{2 \langle Y^2 \rangle} \right) \tag{A.18} $$
with $\langle Y^2 \rangle = \sum_{n,m} a_n a_m \langle y_n y_m \rangle$. This can be shown as follows. One obtains $p(Y)$ by picking out all those states that fulfill Eq. A.17:
$$ p(Y) = \int p(y_1, y_2, \ldots, y_N) \, \delta\left( Y - \sum_{n=1}^{N} a_n y_n \right) dy_1 \ldots dy_N . \tag{A.19} $$
First, let us integrate over $y_1$. What is then left is an $(N-1)$-dimensional integral over the remaining variables $y_2$ to $y_N$. The integrand is the Gaussian distribution, Eq. A.14, but with $y_1$ replaced by $a_1^{-1} \left( Y - \sum_{n=2}^{N} a_n y_n \right)$. This integrand is again a Gaussian function of $y_2, y_3, \ldots, y_N$ and $Y$. For each of those variables the integrand is of the form $\exp\left( -a y^2 + b y \right)$ where $y$ stands for the variable and $a$ and $b$ are complicated expressions. $a$ is a combination of the constant coefficients $a_i$ and $A_{ij}$ and is thus also some constant. $b$ is a linear combination of the other remaining variables. Luckily we do not need to calculate these terms explicitly. All we need to know is that when we successively integrate over the $y_i$'s, we always find a Gaussian function in the remaining variables since
$$ \int_{-\infty}^{\infty} e^{-a y^2 - b y} \, dy = \int_{-\infty}^{\infty} e^{-a \left( y + \frac{b}{2a} \right)^2} e^{\frac{b^2}{4a}} \, dy = \sqrt{\frac{\pi}{a}} \, e^{\frac{b^2}{4a}} . \tag{A.20} $$
Thus, after having integrated over all $N$ $y_i$'s, we are left with a Gaussian distribution in $Y$. Since its average vanishes, we know that it must be of the form given in Eq. A.18.
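The sum over pairings in Eq. A.15 is easy to enumerate by machine, which can help to see the structure of the formula. The sketch below is an illustration only: the function names are invented here, and `second_moments[a][b]` is assumed to hold the second moment $\langle y_a y_b \rangle$ of a zero-mean Gaussian distribution.

```python
def pairings(indices):
    """Yield every way of grouping an even number of indices into pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_moment(indices, second_moments):
    """Even moment <y_n1 ... y_n2p> as a sum over products of second moments."""
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for a, b in pairing:
            term *= second_moments[a][b]
        total += term
    return total
```

For four indices there are three pairings, reproducing the three terms of Eq. A.16; for a single variable with $\langle y^2 \rangle = \sigma^2$, `gaussian_moment((0, 0, 0, 0), [[sigma2]])` returns $3\sigma^4$.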
Appendix B
The Distribution of Magnetization and the Central Limit Theorem
Here we explicitly derive an approximation for the number of microstates of the system shown in Fig. 2.6 for a given value $M$ of the magnetization. The distribution that we shall find also follows via the central limit theorem, which we discuss later in this appendix.

Let us give an approximate expression for the number of microstates belonging to a given macrostate, Eq. 2.47. We consider deviations from the most probable state $k = N/2$, namely $k = \frac{N}{2} + m$. This corresponds to a magnetization $M = 2m\mu$. We assume $m \ll N$, which allows us to use Stirling's formula, Eq. 2.48:
$$ \begin{aligned} N_{\text{micro}}(M) = \binom{N}{\frac{N}{2} + m} &\approx \frac{1}{\sqrt{2\pi}} \, \frac{N^{N + \frac{1}{2}}}{\left( \frac{N}{2} + m \right)^{\frac{N}{2} + m + \frac{1}{2}} \left( \frac{N}{2} - m \right)^{\frac{N}{2} - m + \frac{1}{2}}} \\ &= \frac{N_{\max}}{\left( 1 - 4m^2/N^2 \right)^{\frac{N+1}{2}}} \left( \frac{1 - \frac{2m}{N}}{1 + \frac{2m}{N}} \right)^{m} \end{aligned} \tag{B.1} $$
with $N_{\max}$ given by Eq. 2.49. Now we repeatedly use the fact that the exponential function is given by the limit:
$$ \left( 1 + \frac{x}{K} \right)^{K} \xrightarrow{K \to \infty} e^{x} . \tag{B.2} $$
In particular, we can approximate $\left( 1 - 4m^2/N^2 \right)^{\frac{N+1}{2}}$ by $\left( 1 - 4m^2/N^2 \right)^{\frac{N}{2}}$ (since $m \ll N$) and then we can use Eq. B.2 with $K = N/2$ to arrive at $e^{-2m^2/N}$. In addition, for $m \gg 1$ we can use Eq. B.2 (with $K = m$) to also replace the rightmost terms of Eq. B.1 by $e^{-2m^2/N}$ (numerator) and $e^{+2m^2/N}$ (denominator). Note that the latter replacements work well even for small values of $m$ (with $m \ll \sqrt{N}$) because $1 \pm 2m^2/N$ are the beginnings of the power series of the same exponential functions $\exp\left( \pm 2m^2/N \right)$. We can thus approximate
$$ N_{\text{micro}}(M) \approx \frac{N_{\max}}{e^{-2m^2/N}} \, e^{-4m^2/N} = N_{\max} \, e^{-2m^2/N} . \tag{B.3} $$
Replacing $m$ in Eq. B.3 by $M/(2\mu)$, we arrive at Eq. 2.50.

Instead of doing this explicitly by expanding Eq. B.1 using Stirling's formula, which (as we just saw) is quite cumbersome, one can make use of the famous central limit theorem. It states that the sum of a sufficiently large number of independent and identically distributed random variables $X_i$ with a finite mean $\mu$ and standard deviation $\sigma$ is Gaussian distributed. In other words, if we introduce the sum
$$ X = \sum_{i=1}^{N} X_i , \tag{B.4} $$
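then for sufficiently large $N$ one finds
$$ \rho(X) = \frac{1}{\sqrt{2\pi N} \, \sigma} \, e^{-\frac{(X - N\mu)^2}{2 N \sigma^2}} \tag{B.5} $$
irrespective of the shape of the distribution of the individual variables $X_i$. We give this here without proof, but as a consistency check we calculate
$$ \sigma_X^2 = \langle X^2 \rangle - \langle X \rangle^2 = \sum_{i,j=1}^{N} \langle X_i X_j \rangle - \sum_{i,j=1}^{N} \langle X_i \rangle \langle X_j \rangle . \tag{B.6} $$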
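Since the random variables are independent of each other, one has $\langle X_i X_j \rangle = \langle X_i \rangle \langle X_j \rangle$ for all $i \neq j$. What remains are the diagonal terms $i = j$, leading to
$$ \sigma_X^2 = \sum_{i=1}^{N} \langle X_i^2 \rangle - \sum_{i=1}^{N} \langle X_i \rangle^2 = \sum_{i=1}^{N} \sigma^2 = N \sigma^2 . \tag{B.7} $$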
This is indeed the variance of the distribution given in Eq. B.5. If we use for $X$ the magnetization, which is the sum of the individual magnetic moments $\mu s_i$ with $\langle \mu s_i \rangle = 0$ and $\sigma = \mu$, we find immediately from Eq. B.5 that $\rho(M) \sim e^{-\frac{M^2}{2 N \mu^2}}$, which leads directly to Eq. 2.50.
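The quality of the Gaussian approximation, Eq. B.3, can be checked against exact binomial coefficients. The sketch below compares the ratio $N_{\text{micro}}(M)/N_{\text{micro}}(0)$ computed exactly with the factor $e^{-2m^2/N}$; using the exact central binomial coefficient in place of $N_{\max}$ is a simplification made here so that Eq. 2.49 is not needed, and the function names are hypothetical.

```python
import math


def exact_ratio(n, m):
    """Exact N_micro(M) / N_micro(0) for even n and m flipped spins off centre."""
    return math.comb(n, n // 2 + m) / math.comb(n, n // 2)


def gaussian_ratio(n, m):
    """The same ratio in the Gaussian approximation, Eq. B.3."""
    return math.exp(-2.0 * m * m / n)


for m in (0, 5, 10, 20):
    print(m, exact_ratio(1000, m), gaussian_ratio(1000, m))
```

For $N = 1000$ the two columns agree closely as long as $m \ll N$; deviations only become visible far out in the tails, where the number of microstates is already negligible.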
Appendix C
Connection between Polymer Statistics and Critical Phenomena
Critical points in ferromagnets: There is a surprising analogy between the statistics of long flexible polymers and critical phenomena. First we summarize the well-known behavior of ferromagnets around critical points. Then we show how spin systems and polymer statistics are closely connected. The macroscopic state of a ferromagnet is characterized by its magnetization M, a vector with n independent components. n = 1 corresponds to a uniaxial ferromagnet, for n = 2 there is an "easy" plane of magnetization and for n = 3 the spins can point in any direction. The average magnetization is a function of the temperature τ and the magnetic field H (we write τ to distinguish this temperature from T, the temperature of the polymer system which, as we shall see, is different). According to Landau's theory, the free energy of a ferromagnet (with H = 0) as a function of M has two minima below a critical temperature τc (see Fig. C.1(a)) and the system will spontaneously take one of them. Above τc there is only one minimum (no spontaneous magnetization). The resulting M vs. τ is shown in Fig. C.1(b). The following discussion concerns temperatures very close to the critical temperature.
[Figure C.1. Panel (a): F versus M, with curves for τ > τc and τ < τc and the value M0 marked. Panel (b): M0(τ) versus τ, with τc marked.]

Figure C.1 (a) The free energy of a ferromagnet as a function of M (see text). (b) The spontaneous magnetization as a function of temperature for zero magnetic field.
Consider a temperature $\tau = \tau_c (1 + \varepsilon)$ with $\varepsilon$ small and positive. The average M is then zero, but zooming into the system you will see that there are regions of a characteristic size $\xi$, called the correlation length, where M does not average to zero. This length scales as
$$ \xi \cong a \, |\varepsilon|^{-\nu} \tag{C.1} $$
where $a$ is the distance between neighboring spins. Close enough to the critical point, $\xi$ is much larger than the lattice spacing and microscopic details of the lattice structure and of the couplings between spins become irrelevant. We reach a universal regime, where only two relevant parameters remain: the dimensionality, $d$, and the number of equivalent components, $n$. All the critical exponents, such as $\nu$ in Eq. C.1, depend only on $d$ and $n$.

Suppose we are just above the critical point, where $M_0 = 0$. If we then apply a small magnetic field $H$, the system turns out to be very susceptible to this external perturbation. It is found that $M = \chi_M H$, with the so-called magnetic susceptibility $\chi_M$ diverging as
$$ \chi_M = \chi_0 \, |\varepsilon|^{-\gamma} \tag{C.2} $$
when $\varepsilon = (\tau - \tau_c)/\tau_c$ approaches zero. Another signal of being close to a critical point is found when one probes the specific heat of the system (at zero field), which scales as
$$ C = C_0 \, |\varepsilon|^{-\alpha} \tag{C.3} $$
where $\alpha$ is a (small) exponent.
On the low-temperature side (just below $\tau_c$) we find the same scaling laws with the same exponents, e.g., $\nu$, $\gamma$ and $\alpha$, but prefactors such as $C_0$ are different. In addition, there is the non-zero spontaneous magnetization, which for small $\varepsilon$ also follows a power law (see also Fig. C.1(b)):
$$ M_0 = M_1 \, |\varepsilon|^{\beta} . \tag{C.4} $$
Only two critical exponents are independent of each other because they are connected by two relations. These relations, which we give here without proof, are the Widom relation
$$ \alpha + 2\beta + \gamma = 2 \tag{C.5} $$
and the Kadanoff relation:
$$ \alpha = 2 - \nu d . \tag{C.6} $$
We now make the discussion from above about correlated regions of spins more precise. Assume that we are slightly above $\tau_c$. A good measure for the correlation properties involves measuring M locally at two points separated by a distance $r$. Close to $\tau_c$, the thermal average of this quantity, $\langle \mathbf{M}(0) \cdot \mathbf{M}(r) \rangle$, can be written as
$$ \langle \mathbf{M}(0) \cdot \mathbf{M}(r) \rangle = \frac{1}{r^{d-2+\eta}} \, f_M\left( \frac{r}{\xi} \right) \tag{C.7} $$
with $\eta$ denoting another critical exponent, $\xi$ the correlation length from above and $f_M$ a dimensionless function that satisfies $f_M(0) = 1$ and
$$ f_M(x) \cong x^{\eta} \exp(-x) \quad \text{for } x \gg 1 . \tag{C.8} $$
In the limit of small correlated regions, $x \gg 1$, the correlations decay as $(1/r) \exp(-x)$ in $d = 3$, while for $x = 0$ correlations decay slowly as $r^{-(d-2+\eta)}$. The exponent $\eta$ is related to other critical exponents via a general thermodynamic theorem that connects the correlation function to the susceptibility:
$$ \tau \chi_M = n^{-1} \int \langle \mathbf{M}(0) \cdot \mathbf{M}(r) \rangle \, d\mathbf{r} . \tag{C.9} $$
Inserting Eq. C.7 and switching to the dimensionless variable $r/\xi$ implies (for $d = 3$) $\chi_M \sim \xi^{2-\eta}$ and hence (Eqs. C.1 and C.2):
$$ \gamma = \nu (2 - \eta) . \tag{C.10} $$
The exponent $\eta$ is small for all three-dimensional systems.
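As a quick arithmetic illustration of the relations C.5, C.6 and C.10, one can insert the exactly known critical exponents of the two-dimensional Ising model ($d = 2$, $n = 1$). These values are not given in the text above and are quoted here as an outside fact: $\alpha = 0$ (logarithmic divergence), $\beta = 1/8$, $\gamma = 7/4$, $\nu = 1$ and $\eta = 1/4$. The check below uses exact fractions.

```python
from fractions import Fraction as F

# Exact exponents of the two-dimensional Ising model (assumed input, see lead-in)
d = 2
alpha, beta, gamma, nu, eta = F(0), F(1, 8), F(7, 4), F(1), F(1, 4)

checks = {
    "C.5: alpha + 2 beta + gamma = 2": alpha + 2 * beta + gamma == 2,
    "C.6: alpha = 2 - nu d": alpha == 2 - nu * d,
    "C.10: gamma = nu (2 - eta)": gamma == nu * (2 - eta),
}

for name, ok in checks.items():
    print(name, ok)
```

All three relations are satisfied, and only two of the five exponents had to be chosen freely, in line with the statement that only two critical exponents are independent.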
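If we choose $r = a$, the correlations $\langle \mathbf{M}(0) \cdot \mathbf{M}(r) \rangle$ measure the coupling energy, which is responsible for the magnetic order. This energy must contain a term that scales as $\varepsilon^{1-\alpha}$ since by differentiation (note that $\partial/\partial\tau = (\partial\varepsilon/\partial\tau) \, \partial/\partial\varepsilon = \tau_c^{-1} \, \partial/\partial\varepsilon$ since $\tau = \tau_c (1 + \varepsilon)$) it gives the specific heat $\sim \varepsilon^{-\alpha}$. Thus
$$ \langle \mathbf{M}(0) \cdot \mathbf{M}(a) \rangle = \left. \langle \mathbf{M}(0) \cdot \mathbf{M}(a) \rangle \right|_{\tau = \tau_c} + \text{const.} \; \varepsilon^{1-\alpha} . \tag{C.11} $$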
The n vector model: We look now at a more specific model with atomic moments (spins) $\mathbf{S}_i$, the n vector model. The spins are located on a periodic lattice. Each spin has $n$ components, e.g., spin $\mathbf{S}_i$ has the components $S_{i1}, S_{i2}, \ldots, S_{in}$. The magnitude $S$ of each spin is fixed as follows:
$$ S^2 = \sum_{\alpha=1}^{n} S_{i\alpha}^2 = n . \tag{C.12} $$
Spins that are neighbors on the lattice are coupled. This effect plus the influence of an external magnetic field are described by the Hamiltonian
$$ \mathcal{H} = -\sum_{i>j} K_{ij} \, \mathbf{S}_i \cdot \mathbf{S}_j - \sum_i \mathbf{H} \cdot \mathbf{S}_i . \tag{C.13} $$
Here $K_{ij} = K$ for all nearest neighbor pairs (otherwise $K_{ij} = 0$). To proceed further, one would like to compute the partition function:
$$ Z = \int \prod_i d\Omega_i \, \exp(-\mathcal{H}/\tau) , \tag{C.14} $$
in which the integrations go over all orientations of all spins ($k_B = 1$ for compactness of notation). As is well known, this is only possible exactly in very few cases (never in 3D). It is thus tempting to expand $Z$ in powers of the coupling energy $K_{ij}$:
$$ \exp\left( K_{ij} \, \mathbf{S}_i \cdot \mathbf{S}_j / \tau \right) = 1 + \frac{K_{ij}}{\tau} \, \mathbf{S}_i \cdot \mathbf{S}_j + \frac{1}{2} \left( \frac{K_{ij}}{\tau} \right)^2 \left( \mathbf{S}_i \cdot \mathbf{S}_j \right)^2 + \ldots \tag{C.15} $$
Inserting this into $Z$ typically results in a very complicated structure. In one case, however, it becomes simple, namely in the limit $n = 0$. This sounds strange, since we have obviously assumed so far that $n$ is a positive integer. However, it is possible and useful. Most importantly, the expansion, Eq. C.15, leads to a problem related to self-avoiding walks on lattices. This then allows us to establish the link between polymers and critical phenomena.
The limit n = 0: We consider vector orientations in an $n$-dimensional space and formulate calculations such that they remain meaningful even if we take the limit $n \to 0$. We introduce the average over all spin configurations, equally weighted, denoted by $\langle \ldots \rangle_0$. Thermal averages (denoted by $\langle \ldots \rangle$) are then given by
$$ \langle G \rangle = \frac{\langle \exp(-\mathcal{H}/\tau) \, G \rangle_0}{\langle \exp(-\mathcal{H}/\tau) \rangle_0} \tag{C.16} $$
and the partition function by
$$ Z = \Omega \, \langle \exp(-\mathcal{H}/\tau) \rangle_0 \tag{C.17} $$
with the uninteresting factor $\Omega$, the total volume of phase space for the spins.

Now consider averages with respect to a vector $\mathbf{S}_i$, which we call $\mathbf{S}$ in this section for the sake of simplicity. Averages like $\langle S_\alpha \rangle_0$, $\langle S_\alpha S_\beta \rangle_0$, $\langle S_\alpha S_\beta S_\gamma \rangle_0$, etc. have to be computed when performing the expansion of $Z$ in terms of $K_{ij}/\tau$, see Eqs. C.15 and C.17. We now show that for the case $n = 0$ only one type of these averages does not vanish, namely
$$ \langle S_\alpha S_\beta \rangle_0 = \delta_{\alpha\beta} . \tag{C.18} $$
Higher moments like, e.g., $\langle S_\alpha^4 \rangle_0$ all vanish, which brings an enormous simplification to the expansion of $Z$. We now prove this moment theorem. Start with a positive integer $n$ with spins normalized according to Eq. C.12. We introduce the "characteristic function" $f(\mathbf{k})$ of the variable $S_\alpha$. The vector $\mathbf{k}$ also has $n$ components $k_\alpha$, and $f$ is defined by
$$ f(\mathbf{k}) = \langle \exp(i \, \mathbf{k} \cdot \mathbf{S}) \rangle_0 . \tag{C.19} $$
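From this all the moments of $\mathbf{S}$ can be extracted by differentiation. E.g., the second moment is:
$$ \langle S_\alpha S_\beta \rangle_0 = \left. \left( -i \frac{\partial}{\partial k_\alpha} \right) \left( -i \frac{\partial}{\partial k_\beta} \right) f(\mathbf{k}) \right|_{\mathbf{k} = 0} . \tag{C.20} $$
We now construct $f(\mathbf{k})$ explicitly. By differentiating Eq. C.19 twice in $\mathbf{k}$ space, we obtain
$$ \nabla^2 f = \sum_\alpha \frac{\partial^2 f}{\partial k_\alpha^2} = -\left\langle \sum_\alpha S_\alpha^2 \exp(i \, \mathbf{k} \cdot \mathbf{S}) \right\rangle_0 . \tag{C.21} $$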
Using the normalization, Eq. C.12, this gives
$$ \nabla^2 f = -n f . \tag{C.22} $$
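We now switch to $k = |\mathbf{k}|$ as the variable since $f$ only depends on $k$ (since it represents an average over equally weighted orientations). With $k = \sqrt{\sum_\alpha k_\alpha^2}$ one finds
$$ \frac{\partial f}{\partial k_\alpha} = \frac{\partial k}{\partial k_\alpha} \frac{\partial f}{\partial k} = \frac{k_\alpha}{k} \frac{\partial f}{\partial k} , $$
$$ \frac{\partial^2 f}{\partial k_\alpha^2} = \frac{1}{k} \frac{\partial f}{\partial k} + \frac{k_\alpha^2}{k} \frac{\partial}{\partial k} \left( \frac{1}{k} \frac{\partial f}{\partial k} \right) , $$
and
$$ \sum_\alpha \frac{\partial^2 f}{\partial k_\alpha^2} = \frac{n}{k} \frac{\partial f}{\partial k} + k \frac{\partial}{\partial k} \left( \frac{1}{k} \frac{\partial f}{\partial k} \right) = \frac{n-1}{k} \frac{\partial f}{\partial k} + \frac{\partial^2 f}{\partial k^2} . $$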
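Inserting this into Eq. C.22, we arrive at the final equation for $f(k)$:
$$ \frac{\partial^2 f}{\partial k^2} + \frac{n-1}{k} \frac{\partial f}{\partial k} + n f = 0 . \tag{C.23} $$
In addition, we have the following boundary conditions at $k = 0$:
$$ f(0) = 1 \quad \text{and} \quad \left. \frac{\partial f}{\partial k} \right|_{k=0} = 0 . \tag{C.24} $$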
Importantly, Eqs. C.23 and C.24 remain valid for all n (not only positive integers). Using these equations, one can construct f (k) for any n providing an analytical continuation of the standard cases n = 1, 2, . . . The interesting case for us is n = 0 for which Eq. C.23 becomes 1 ∂f ∂2 f − = 0. (C.25) 2 ∂k k ∂k The solution satisfying the boundary condition Eq. C.24 is of the form 1 f (k) = 1 − k2 . (C.26) 2 This implies that higher moments involving 3, 4 etc. components of S must vanish (see e.g. Eq. C.20).
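The radial equation, Eq. C.23, can be checked numerically for a few values of $n$. The sketch below is only an illustration: it assumes that the normalization of Eq. C.12 reads $\sum_\alpha S_\alpha^2 = n$ (as implied by Eq. C.22), so that the characteristic functions for $n = 1$ and $n = 3$ are $\cos k$ and $\sin(\sqrt{3} k)/(\sqrt{3} k)$. The function names and the finite-difference step are illustrative choices.

```python
import math

def ode_residual(f, n, k, h=1e-4):
    """Left-hand side of Eq. C.23, f'' + (n-1)/k f' + n f, via central differences."""
    d1 = (f(k + h) - f(k - h)) / (2.0 * h)
    d2 = (f(k + h) - 2.0 * f(k) + f(k - h)) / (h * h)
    return d2 + (n - 1) / k * d1 + n * f(k)

def f_n0(k):
    """Eq. C.26, the analytically continued case n = 0."""
    return 1.0 - 0.5 * k * k

def f_n1(k):
    """n = 1: S = +1 or -1 with equal weight (assumed normalization S^2 = 1)."""
    return math.cos(k)

def f_n3(k):
    """n = 3: S uniformly distributed on a sphere of radius sqrt(3) (assumed)."""
    x = math.sqrt(3.0) * k
    return math.sin(x) / x

if __name__ == "__main__":
    for n, f in ((0, f_n0), (1, f_n1), (3, f_n3)):
        # All residuals should be zero up to discretization and rounding errors.
        print(n, [ode_residual(f, n, k) for k in (0.5, 1.0, 2.0)])
```

The residuals vanish (up to numerical error) in all three cases, which illustrates that Eq. C.26 is the natural continuation of the familiar integer-$n$ results.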
Figure C.2 (a) A closed self-avoiding loop. These are the only graphs where each factor $S_{i\alpha}$ appears exactly twice (or not at all). (b) The smallest possible loop is the only one where $K_{ij}$ appears quadratically.
The magnetic partition function as an expansion in self-avoiding loops: We can now return to the expansion C.15 for the partition function C.14 (specifically for the case of zero external field),
$$\frac{Z}{\Omega} = \langle \exp(-H/\tau) \rangle_0 = \left\langle \prod_{i>j} \exp\left( \frac{K_{ij}}{\tau} \sum_\alpha S_{i\alpha} S_{j\alpha} \right) \right\rangle_0, \tag{C.27}$$
which for $n = 0$ has the simple exact form
$$\frac{Z}{\Omega} = \left\langle \prod_{i>j} \left[ 1 + \frac{K_{ij}}{\tau} \sum_\alpha S_{i\alpha} S_{j\alpha} + \frac{1}{2} \left( \frac{K_{ij}}{\tau} \right)^2 \left( \sum_\alpha S_{i\alpha} S_{j\alpha} \right)^2 \right] \right\rangle_0. \tag{C.28}$$
All higher moments in the expansion of the exponential must vanish because of the moment theorem. This becomes clear by representing different terms in $Z/\Omega$ by graphs on a lattice. Ignore for now the term quadratic in $K_{ij}$; we discuss it in a moment. Draw for each term $K_{ij}$ (with $(i, j)$ being a nearest neighbor pair) a line connecting lattice site $i$ with $j$. In order to have a non-vanishing contribution from lattice site $i$, the spin component $S_{i\alpha}$ must appear exactly twice, $S_{i\alpha} S_{i\alpha}$. The only way to achieve this is to have graphs that consist of one (or several) closed loops, see Fig. C.2(a). In addition, a loop should never intersect with itself (or with other loops). If there were an intersection at lattice
point $i$, this would result in a term $S_{i\alpha}^4$, and $\langle S_{i\alpha}^4 \rangle_0 = 0$. Here we have a first glimpse of the connection between magnetism and self-avoiding walks. All higher terms in $K_{ij}$ must vanish because the graph would pass the line connecting lattice sites $i$ and $j$ more than once. There is one exception: the term $(K_{ij}/\tau)^2$ in the expansion C.28 corresponds to the smallest possible loop, see Fig. C.2(b). Note that each loop has a single index $\alpha$ occurring at all sites because of Eq. C.18. Each loop (or set of loops) of total length $N$ contributes (after summing over the component index $\alpha$) $(K/\tau)^N n$ to the partition function. Because $n = 0$ we find
$$\frac{Z}{\Omega} = 1. \tag{C.29}$$
We arrived, after a long calculation, at a trivial result.

Spin correlations and self-avoiding walks: Consider the spin-spin correlation function in zero field $\langle S_{i1} S_{j1} \rangle$ (choosing one component, $\alpha = 1$; $i$ and $j$ are now lattice sites far apart). This corresponds to the magnetization correlation function $\langle M(0) M(\mathbf{r}) \rangle$ (apart from a normalization) where $\mathbf{r} = \mathbf{r}_{ij}$ denotes the distance vector. Using the rule of averages, Eq. C.16, we obtain
$$\langle S_{i1} S_{j1} \rangle = \frac{\langle \exp(-H/\tau) S_{i1} S_{j1} \rangle_0}{\langle \exp(-H/\tau) \rangle_0} = \langle \exp(-H/\tau) S_{i1} S_{j1} \rangle_0. \tag{C.30}$$
In the second step, we used the fact that the denominator is one (see Eqs. C.17 and C.29). Now expand the exponential. For $n = 0$, the only graphs that contribute are again self-avoiding walks. But here they are not closed loops because Eq. C.30 contains two extra spin factors, $S_{i1}$ and $S_{j1}$. What we get instead are self-avoiding walks that connect sites $i$ and $j$, see Fig. C.3. An $N$-step walk contributes the term $(K/\tau)^N$ to Eq. C.30. All along the walk, the index $\alpha$ must be identical to the specified value $\alpha = 1$. There is no summation over $\alpha$. This leads to the fundamental theorem:
$$\left. \langle S_{i1} S_{j1} \rangle \right|_{n=0} = \sum_N \left( \frac{K}{\tau} \right)^N \mathcal{N}_N(ij) \tag{C.31}$$
where $\mathcal{N}_N(ij)$ is the number of self-avoiding walks of $N$ steps connecting lattice site $i$ with $j$. This equation is the basic link between polymers and magnets.
Figure C.3 A self-avoiding walk from lattice site $i$ to $j$. Paths like this are the only ones contributing to Eq. C.30.
Properties of self-avoiding walks: The total number of SAWs of $N$ steps, starting from point $i$, is given by
$$\mathcal{N}_N^{tot} = \sum_j \mathcal{N}_N(ij). \tag{C.32}$$
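For short walks, the quantities $\mathcal{N}_N(ij)$ and $\mathcal{N}_N^{tot}$ of Eq. C.32 can be obtained by brute-force enumeration. The following sketch does this on a square lattice; the choice of lattice and the function names are assumptions made for illustration, and the exponential growth of the number of walks restricts the method to small $N$.

```python
from collections import Counter

# Nearest-neighbor steps on the square lattice (illustrative choice of lattice).
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def saw_endpoints(n_steps):
    """Return a Counter mapping end point j -> number of N-step SAWs from the origin to j."""
    counts = Counter()

    def extend(pos, visited, remaining):
        if remaining == 0:
            counts[pos] += 1
            return
        for dx, dy in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:          # self-avoidance: no site is visited twice
                visited.add(nxt)
                extend(nxt, visited, remaining - 1)
                visited.remove(nxt)

    extend((0, 0), {(0, 0)}, n_steps)
    return counts

def saw_total(n_steps):
    """Total number of N-step SAWs, i.e., the sum over end points in Eq. C.32."""
    return sum(saw_endpoints(n_steps).values())

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, saw_total(n))
```

The entry `saw_endpoints(N)[(1, 0)]` counts walks that end next to their starting point, the quantity called $\mathcal{N}_N(a)$ later in this appendix.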
We claimed earlier (Eq. 3.23) that the asymptotic form of $\mathcal{N}_N^{tot}$ is given by $\mathcal{N}_N^{tot} \cong z^N N^{\gamma-1}$. We now check whether this agrees with Eq. C.2 for the magnetic susceptibility $\chi_M$. The susceptibility may be expressed in terms of magnetic correlations, Eq. C.9, or in terms of correlations for one component, $\alpha = 1$:
$$\chi_M = \frac{1}{\tau} \sum_j \langle S_{i1} S_{j1} \rangle = \frac{1}{\tau} \sum_j \sum_N \left( \frac{K}{\tau} \right)^N \mathcal{N}_N(ij) \tag{C.33}$$
which with Eq. C.32 (and then Eq. 3.23) leads to
$$\chi_M = \frac{1}{\tau} \sum_N \mathcal{N}_N^{tot} \left( \frac{K}{\tau} \right)^N \cong \frac{1}{\tau} \sum_N \left( \frac{zK}{\tau} \right)^N N^{\gamma-1}. \tag{C.34}$$
This series converges for high temperatures $\tau$. If $\tau$ is decreased, the series diverges when $\tau$ reaches the critical value
$$\tau_c = K z. \tag{C.35}$$
Consider now temperatures slightly above $\tau_c$, namely $\tau = \tau_c (1 + \varepsilon) \approx \tau_c \exp \varepsilon$. Then the magnetic susceptibility is given by (up to a numerical prefactor of order one)
$$\chi_M \cong \frac{1}{\tau_c} \sum_N e^{-N\varepsilon} N^{\gamma-1}. \tag{C.36}$$
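Replacing the sum $\sum_N$ by the integral $\int_0^\infty dN$, we arrive at
$$\chi_M \cong \frac{1}{\tau_c} \int_0^\infty N^{\gamma-1} e^{-N\varepsilon} \, dN \cong \frac{1}{\tau_c} \varepsilon^{-\gamma} \tag{C.37}$$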
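How well the sum in Eq. C.36 is represented by the integral in Eq. C.37 can be tested directly. The sketch below compares the sum with the exact value of the integral, $\Gamma(\gamma)\, \varepsilon^{-\gamma}$; the value $\gamma = 1.16$ used in the example is only an illustrative number of the order expected for self-avoiding walks in three dimensions, and the cutoff of the sum is an assumption of this sketch.

```python
import math

def chi_sum(eps, gamma, cutoff=60.0):
    """Sum over N of N^(gamma-1) exp(-N eps), truncated where exp(-N eps) is negligible."""
    n_max = int(cutoff / eps)
    return sum(n ** (gamma - 1.0) * math.exp(-n * eps) for n in range(1, n_max + 1))

def chi_integral(eps, gamma):
    """Exact value of the integral in Eq. C.37: Gamma(gamma) * eps^(-gamma)."""
    return math.gamma(gamma) * eps ** (-gamma)

if __name__ == "__main__":
    gamma = 1.16  # illustrative exponent, not taken from the text
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, chi_sum(eps, gamma) / chi_integral(eps, gamma))
```

The ratio approaches one as $\varepsilon \to 0$, and halving $\varepsilon$ multiplies the sum by a factor close to $2^\gamma$, which is the power law $\varepsilon^{-\gamma}$.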
This agrees with Eq. C.2. Since we have located the transition point, we can now show how $N$ and $\varepsilon$ are related. We rewrite Eq. C.31 for small $\varepsilon$:
$$\langle S_{i1} S_{j1} \rangle = \sum_N e^{-N\varepsilon} \mathcal{N}_N(ij) \, z^{-N}. \tag{C.38}$$
This relationship shows that the behavior of the spin-spin correlations for very small values of $\varepsilon$ is determined by the number of paths with very large numbers of steps. Approaching the critical point in the spin system ($\varepsilon \to 0$) is analogous to going toward an infinitely long polymer ($N \to \infty$). Since polymers are long by definition, polymer physics is always close to a critical point. A direct consequence is the correspondence between the correlation length $\xi \sim \varepsilon^{-\nu}$ of the magnetic system and the characteristic size of self-avoiding walks, $R_F \sim N^\nu$. Finally, let us consider the number of SAWs that return to the origin. We announced earlier, Eq. 3.29, that
$$\mathcal{N}_N(a) \cong z^N \left( \frac{a}{R_F} \right)^d \cong z^N N^{-\nu d} = z^N N^{-2+\alpha} \tag{C.39}$$
where we used the Kadanoff relation C.6 in the last step. The magnetic analogue of $\mathcal{N}_N(a)$ is the correlation function between nearest-neighbor spins. This is directly related to the average energy $E$ per site:
$$E = -\frac{1}{2} z K \langle S_{i1} S_{j1} \rangle \tag{C.40}$$
where $i$ and $j$ are nearest neighbors; the factor 1/2 makes sure that each pair is only counted once. Now we check whether Eq. C.39 is consistent with the scaling properties of $E$. We use the fundamental theorem (Eq. C.38):
$$E = -\frac{1}{2} z K \sum_N e^{-N\varepsilon} \mathcal{N}_N(a) \, z^{-N} \cong -K \sum_N e^{-N\varepsilon} N^{-2+\alpha} \tag{C.41}$$
where we used Eq. C.39 on the rhs. We split $E$ into two parts:
$$E(\varepsilon) = E(0) + K \sum_N \left( 1 - e^{-N\varepsilon} \right) N^{-2+\alpha} \tag{C.42}$$
and then replace the sum by an integration
$$E(\varepsilon) = E(0) + K \int_0^\infty \left( 1 - e^{-N\varepsilon} \right) N^{-2+\alpha} \, dN. \tag{C.43}$$
$E(0)$ must be a finite number and the integral has no singularities for $N \to 0$ and $N \to \infty$. Substituting $t = \varepsilon N$ we arrive at
$$E(\varepsilon) = E(0) + K \varepsilon^{1-\alpha} \int_0^\infty \left( 1 - e^{-t} \right) t^{-2+\alpha} \, dt = E(0) + \text{const.} \, K \varepsilon^{1-\alpha}, \tag{C.44}$$
an equation equivalent to Eq. C.11. This proves that the exponent $\alpha$ governing the return of SAWs to the origin, Eq. C.39, is indeed the specific heat exponent in the magnetic counterpart, Eq. C.3.
Appendix D
Hamilton’s Principle and the Pendulum
Consider a particle of mass $M$ in one dimension. Its position at time $t$ is given by $x(t)$. Assume that the particle feels a time-dependent force $f(t)$. Newton’s second law states that the particle’s mass times its acceleration, $\ddot{x}(t) = d^2 x(t)/dt^2$, equals that force:
$$M \ddot{x}(t) = f(t). \tag{D.1}$$
This so-called equation of motion is easy to solve:
$$x(t) = x_0 + v_0 t + \frac{1}{M} \int_0^t dt' \int_0^{t'} dt'' \, f(t'') \tag{D.2}$$
with $x(0) = x_0$ and $\dot{x}(0) = v_0$ denoting the initial position and velocity of the particle. As a special case of Eq. D.1, we consider a particle in an external potential $V(x)$. In that case $f(t) = -dV(x(t))/dx$ and hence
$$M \ddot{x}(t) = -\frac{dV(x(t))}{dx}. \tag{D.3}$$
We now introduce Hamilton’s principle, which states that the dynamics of such a physical system is determined by a variational principle. As the first step we write down the Lagrange function $L$ of the system, which is given by the kinetic minus the potential energy. For the particle in the potential $V(x)$ this leads to
$$L(x(t), \dot{x}(t)) = \frac{1}{2} M \dot{x}^2(t) - V(x(t)). \tag{D.4}$$
Next we introduce the so-called action functional
$$S[x] = \int_0^T L(x(t), \dot{x}(t)) \, dt. \tag{D.5}$$
A functional maps a function, here $x(t)$, onto a number, here $S[x]$. The square brackets indicate that the argument is not a number but an entire function. Hamilton’s principle states that the time evolution of the system, $x(t)$, corresponds to a stationary point of the action, Eq. D.5. More precisely, of all the curves $x(t)$ with given start point $x(0) = x_0$ and given end point $x(T) = x_T$, the true solution is the one that is a minimum or a saddle point (in short, a stationary point) of the action. How can we find this stationary point? We consider a small perturbation $h(t)$ around a given function $x(t)$. The new function $x(t) + h(t)$ needs to have the same start and end points, i.e., we require $h(0) = h(T) = 0$. Now let us consider
$$S[x + h] = \int_0^T L\left( x(t) + h(t), \dot{x}(t) + \dot{h}(t) \right) dt. \tag{D.6}$$
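A Taylor expansion of the Lagrange function to first order leads to
$$S[x + h] = S[x] + \int_0^T \left( \frac{\partial L}{\partial x} h + \frac{\partial L}{\partial \dot{x}} \dot{h} \right) dt + O(h^2) \tag{D.7}$$
where $O(h^2)$ stands for higher order terms, namely integrals that contain terms like $h^2(t)$ and $\dot{h}^2(t)$. Through integration by parts, namely replacing $\dot{h} \, \partial L/\partial \dot{x}$ by $\frac{d}{dt}\left( h \, \partial L/\partial \dot{x} \right) - h \frac{d}{dt}\left( \partial L/\partial \dot{x} \right)$, and using the fact that the boundary terms vanish, one arrives at
$$S[x + h] - S[x] = \int_0^T \left( \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} \right) h \, dt + O(h^2). \tag{D.8}$$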
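The content of Eq. D.8 can be illustrated with a discretized action. The sketch below uses a particle in a linear potential $V(x) = Mgx$, whose equation of motion, Eq. D.3, is solved by a parabola, and perturbs the path by $\varepsilon \sin(\pi t/T)$, which vanishes at both ends. All numerical values, the discretization and the function names are illustrative assumptions. For the true path the change of the action is of second order in $\varepsilon$, whereas for a path that does not solve the equation of motion it is of first order.

```python
import math

M, G, T = 1.0, 9.81, 1.0   # mass, gravitational acceleration, total time (illustrative)
X0, V0 = 0.0, 2.0          # initial position and velocity (illustrative)

def true_path(t):
    """Solution of M x'' = -dV/dx for V(x) = M G x."""
    return X0 + V0 * t - 0.5 * G * t * t

def bump(t):
    """Perturbation h(t) with h(0) = h(T) = 0."""
    return math.sin(math.pi * t / T)

def action(path, n=2000):
    """Discretized version of Eq. D.5 with L = M v^2 / 2 - M G x."""
    dt = T / n
    s = 0.0
    for i in range(n):
        xa, xb = path(i * dt), path((i + 1) * dt)
        v = (xb - xa) / dt
        s += (0.5 * M * v * v - M * G * 0.5 * (xa + xb)) * dt
    return s

def action_change(eps, path=true_path):
    """S[x + eps h] - S[x] for the given path."""
    return action(lambda t: path(t) + eps * bump(t)) - action(path)

def wrong_path(t):
    """A path with the same end points that does not solve the equation of motion."""
    return true_path(t) + 0.3 * bump(t)

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, action_change(eps) / eps**2, action_change(eps, wrong_path) / eps)
```

For the true path the ratio `action_change(eps) / eps**2` stays constant, close to $M\pi^2/(4T)$, the contribution of the kinetic term $\frac{M}{2}\int_0^T \dot{h}^2 dt$; for the other path the change is proportional to $\varepsilon$.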
One says that $x(t)$ is a stationary point of $S$ if the integral in Eq. D.8 vanishes for any small $h$. This is the case if $x(t)$ fulfills the so-called Euler-Lagrange equation
$$\frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = 0. \tag{D.9}$$
Let us take the Lagrange function from above, Eq. D.4, as an example. By inserting it into the Euler-Lagrange equation, Eq. D.9, we find the equation of motion, Eq. D.3. The trajectory of the particle is thus a stationary point of the action, Eq. D.5. It is straightforward to extend the formalism to $d$ dimensions where one obtains $d$ Euler-Lagrange equations, one for each direction in space. One can then easily verify that this set of equations is identical to the equations of motion for a particle in $d$ dimensions.

So far, Hamilton’s principle seems to be a very complicated way to get the equation of motion, Eq. D.3, which you can write down immediately instead. For more complicated systems which contain certain constraints, however, such a framework is extremely useful. To give an example, consider the pendulum depicted on the rhs of Fig. 4.24. It consists of a mass $M$ attached to a massless rod of length $l$, which is suspended from a pivot point at position $(x, y) = (0, 0)$ around which it can swing freely. The potential of the mass in the gravitational field is given by $Mgy$. The Lagrange function of the pendulum is thus given by
$$L(x, y, \dot{x}, \dot{y}) = \frac{M}{2} \left( \dot{x}^2 + \dot{y}^2 \right) - Mgy. \tag{D.10}$$
The Euler-Lagrange equations for the $X$- and $Y$-coordinates lead to two equations of motion, $\ddot{x} = 0$ and $\ddot{y} = -g$. Unfortunately, these equations are completely wrong. Why? What we found are the equations of motion of a free particle in a gravitational field in two dimensions. Solutions are e.g. trajectories of rain drops or of cannon balls but certainly not the motion of a pendulum. What went wrong? We forgot to take into account the presence of the rod that imposes the constraint $x^2 + y^2 = l^2$.
A better approach would be to use a coordinate system that automatically takes this constraint into account, namely to describe the state of the pendulum by the angle θ (t) between the pendulum and the Y direction, see Fig. 4.24. But what does the equation of motion look like in terms of this angle?
This is where a big advantage of Hamilton’s principle comes into play: It is independent of the chosen coordinate system. Suppose one goes from one coordinate system $x_1, x_2, \ldots, x_N$ to another coordinate system $q_1, q_2, \ldots, q_f$ via the transformations $\mathbf{q} = \mathbf{q}(\mathbf{x})$ and $\mathbf{x} = \mathbf{x}(\mathbf{q})$. The trajectory $\mathbf{x}(t)$ then becomes $\mathbf{q}(\mathbf{x}(t))$. The action functional can then be rewritten as
$$S[\mathbf{x}] = \int_0^T L(\mathbf{x}(t), \dot{\mathbf{x}}(t)) \, dt = \int_0^T L\left( \mathbf{x}(\mathbf{q}(t)), \sum_{i=1}^f \frac{\partial \mathbf{x}(\mathbf{q}(t))}{\partial q_i} \dot{q}_i \right) dt. \tag{D.11}$$
The rhs of Eq. D.11 is again of the form
$$S[\mathbf{q}] = \int_0^T \tilde{L}(\mathbf{q}(t), \dot{\mathbf{q}}(t)) \, dt \tag{D.12}$$
with a new Lagrange function $\tilde{L}$. Also here Hamilton’s principle must hold, i.e., the dynamic evolution of the system follows from the Euler-Lagrange equations
$$\frac{\partial \tilde{L}}{\partial q_i} - \frac{d}{dt} \frac{\partial \tilde{L}}{\partial \dot{q}_i} = 0 \tag{D.13}$$
for $i = 1, \ldots, f$. If we have a system with constraints, we can sometimes introduce coordinates that automatically fulfill those constraints. The equations of motion are then simply given by the Euler-Lagrange equations in these coordinates. Let us go back to the pendulum. We now describe the configuration of the pendulum by the angle $\theta(t)$, which measures the deviation from the vertically upward pointing position, see Fig. 4.24. In terms of this angle, the kinetic energy of the pendulum is given by $M l^2 \dot{\theta}^2/2$ and the potential energy by $M l g \cos\theta$. This leads to the following Lagrange function:
$$L(\theta, \dot{\theta}) = \frac{M l^2}{2} \dot{\theta}^2 - M g l \cos\theta. \tag{D.14}$$
The corresponding Euler-Lagrange equation is given by
$$\ddot{\theta}(t) = \frac{g}{l} \sin\theta(t). \tag{D.15}$$
So we have found the equation of motion of the pendulum. In the following we solve this equation, which, as we shall see, is rather
cumbersome. We present this calculation here because it leads to the explicit formulas on which the plots of the Euler elasticas in Fig. 4.26 are based. We start by multiplying Eq. D.15 on both sides with $2\dot{\theta}(t)$. This leads to
$$2 \dot{\theta}(t) \ddot{\theta}(t) = \frac{2g}{l} \dot{\theta}(t) \sin\theta(t). \tag{D.16}$$
This is straightforward to integrate:
$$\dot{\theta}^2(t) = -\frac{2g}{l} \cos\theta(t) + C \tag{D.17}$$
with $C$ being an integration constant. By multiplying this equation on both sides with $M l^2/2$, the physical meaning of $C$ becomes obvious:
$$\frac{M l^2}{2} \dot{\theta}^2(t) + M g l \cos\theta(t) = \frac{M l^2}{2} C = E_{tot} - M g l. \tag{D.18}$$
Here we introduce $E_{tot}$, the total energy of the pendulum, the zero of which has been chosen so that it corresponds to the resting state of the pendulum, $\theta(t) \equiv \pi$. Thus the integration constant $C$ reflects the total energy, which is a conserved quantity: the sum of the kinetic and potential energy is always constant. Using the identity $\cos\theta = 1 - 2\sin^2(\theta/2)$, we can rewrite Eq. D.18 as follows:
$$\frac{\dot{\theta}^2(t)}{4} = \frac{1}{2 M l^2} \left( E_{tot} - 2 M l g + 2 M l g \sin^2\frac{\theta(t)}{2} \right). \tag{D.19}$$
Taking the square root we obtain
$$\frac{\dot{\theta}(t)}{2} = \pm \sqrt{\frac{g}{l}} \sqrt{\frac{E_{tot}}{2 M l g} - 1 + \sin^2\frac{\theta(t)}{2}}. \tag{D.20}$$
It is useful to rewrite Eq. D.20 in terms of the angle $\alpha = \pi - \theta$, which gives the deviation from the resting position. With $\sin^2(\theta/2) = 1 - \sin^2(\alpha/2)$ we obtain
$$\frac{\dot{\theta}(t)}{2} = -\frac{\dot{\alpha}(t)}{2} = \pm \sqrt{\frac{g}{l}} \sqrt{\frac{E_{tot}}{2 M l g} - \sin^2\frac{\alpha(t)}{2}} = \pm \sqrt{\frac{g}{l m}} \sqrt{1 - m \sin^2\frac{\alpha(t)}{2}} \tag{D.21}$$
where we introduced
$$m = \frac{2 M l g}{E_{tot}}. \tag{D.22}$$
Note that $2Mlg$ is the difference in potential energy between the topmost position, $\alpha = \pi$, and the lowest one, $\alpha = 0$. When $m > 1$ the total energy is smaller than the range of possible potential energies. This means that the pendulum cannot reach $\alpha = \pi$ and instead swings back and forth around the $\alpha = 0$ position. On the other hand, for $0 < m < 1$ the total energy exceeds $2Mlg$. In that case the pendulum still has kinetic energy when it reaches the top, $\alpha = \pi$. This corresponds to a revolving pendulum. The special case $m = 1$ is solved and discussed in the main text. Separation of variables in Eq. D.21 (using $\dot{\alpha} = d\alpha/dt$) leads to
$$\frac{d\alpha/2}{\sqrt{1 - m \sin^2\frac{\alpha}{2}}} = \mp \sqrt{\frac{g}{l m}} \, dt. \tag{D.23}$$
Depending on the case, oscillating or revolving, we need to consider different initial conditions. We start with the revolving case, $0 < m < 1$, for which we assume that at zero time the pendulum is pointing downwards, i.e., $\alpha(0) = 0$. We now integrate Eq. D.23 from the time $t' = 0$ with the angle $\alpha' = \alpha(0) = 0$ to some arbitrary time $t' = t$ with $\alpha' = \alpha(t)$:
$$\mp \sqrt{\frac{g}{l m}} \, t = \int_0^{\alpha(t)/2} \frac{d\alpha'/2}{\sqrt{1 - m \sin^2\frac{\alpha'}{2}}} = F\left( \left. \frac{\alpha(t)}{2} \right| m \right). \tag{D.24}$$
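Eq. D.24 can be tested without any special functions by integrating the equation of motion, Eq. D.15, numerically and evaluating the integral on the rhs of Eq. D.24 by quadrature. The following sketch does this for a revolving pendulum; the Runge-Kutta integrator, the Simpson rule, the parameter values and all function names are illustrative choices. The initial velocity follows from Eq. D.21 at $\alpha = 0$, with the sign chosen such that $\alpha$ increases with time.

```python
import math

def rk4_pendulum(theta0, omega0, g_over_l, t_end, steps):
    """Integrate theta'' = (g/l) sin(theta), Eq. D.15, with the classical RK4 scheme."""
    dt = t_end / steps
    th, om = theta0, omega0

    def acc(angle):
        return g_over_l * math.sin(angle)

    for _ in range(steps):
        k1t, k1o = om, acc(th)
        k2t, k2o = om + 0.5 * dt * k1o, acc(th + 0.5 * dt * k1t)
        k3t, k3o = om + 0.5 * dt * k2o, acc(th + 0.5 * dt * k2t)
        k4t, k4o = om + dt * k3o, acc(th + dt * k3t)
        th += dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        om += dt * (k1o + 2.0 * k2o + 2.0 * k3o + k4o) / 6.0
    return th, om

def elliptic_f(phi, m, n=2000):
    """Simpson quadrature of F(phi|m) = int_0^phi dx / sqrt(1 - m sin^2 x), n even."""
    h = phi / n
    total = 0.0
    for i in range(n + 1):
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight / math.sqrt(1.0 - m * math.sin(i * h) ** 2)
    return total * h / 3.0

def check_revolving(m, g_over_l, t, steps=6000):
    """Return (F(alpha(t)/2 | m), sqrt(g/(l m)) t) for a revolving pendulum, 0 < m < 1."""
    rate = math.sqrt(g_over_l / m)
    # alpha(0) = 0 means theta(0) = pi; d(alpha)/dt = 2 * rate means d(theta)/dt = -2 * rate.
    theta, _ = rk4_pendulum(math.pi, -2.0 * rate, g_over_l, t, steps)
    alpha = math.pi - theta
    return elliptic_f(0.5 * alpha, m), rate * t

if __name__ == "__main__":
    print(check_revolving(m=0.5, g_over_l=1.0, t=3.0))
```

Both numbers returned by `check_revolving` agree to high accuracy, as Eq. D.24 demands.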
The integral on the rhs of Eq. D.24 is called the elliptic integral of the first kind. So we now have an expression for $t$ as a function of $\alpha$, $t = t(\alpha)$. We want to invert this formula to obtain $\alpha = \alpha(t)$. The inverse of $F$ is called the amplitude, i.e., $\text{am}(x|m) = F^{-1}(x|m)$. Hence
$$\frac{\alpha(t)}{2} = \text{am}\left( \left. \mp \sqrt{\frac{g}{l m}} \, t \right| m \right). \tag{D.25}$$
Taking the sine on both sides of Eq. D.25, we introduce yet another special function:
$$\sin\frac{\alpha(t)}{2} = \text{sn}\left( \left. \mp \sqrt{\frac{g}{l m}} \, t \right| m \right) = -\text{sn}\left( \left. \pm \sqrt{\frac{g}{l m}} \, t \right| m \right). \tag{D.26}$$
The function $\text{sn}(x|m) = \sin(\text{am}(x|m))$ is one of the so-called Jacobian elliptic functions. On the rhs of Eq. D.26 we used the fact
that $\text{sn}(-x|m) = -\text{sn}(x|m)$. The identities $\cos\alpha = 1 - 2\sin^2(\alpha/2)$ and $\cos\alpha = -\cos\theta$ allow us to rewrite Eq. D.26 in the form
$$\cos\theta(t) = 2\,\text{sn}^2\left( \left. \sqrt{\frac{g}{l m}} \, t \right| m \right) - 1. \tag{D.27}$$
This is the solution to the equation of motion of the revolving pendulum.

Next we study the oscillating pendulum, $m > 1$. We choose the time so that for $t = 0$ the pendulum has reached its maximal amplitude and thus $\dot{\alpha}(0) = 0$. We call the time for one complete cycle $T_P$, the period. Hence we have $\alpha(T_P/4) = 0$. Now we have to integrate Eq. D.23 from the time $t' = T_P/4$ with the angle $\alpha' = \alpha(T_P/4) = 0$ to some arbitrary time $t' = t$ with $\alpha' = \alpha(t)$:
$$\mp \sqrt{\frac{g}{l m}} \left( t - \frac{T_P}{4} \right) = \int_{\alpha(T_P/4)/2}^{\alpha(t)/2} \frac{d\alpha'/2}{\sqrt{1 - m \sin^2\frac{\alpha'}{2}}} = F\left( \left. \frac{\alpha(t)}{2} \right| m \right). \tag{D.28}$$
Along similar lines as in Eqs. D.24 to D.27, we can invert this formula, leading us to the solution for the oscillating pendulum:
$$\cos\theta(t) = 2\,\text{sn}^2\left( \left. \sqrt{\frac{g}{l m}} \left( t - \frac{T_P}{4} \right) \right| m \right) - 1. \tag{D.29}$$
We can simplify things a bit by introducing a shift in time by $T_P/4$:
$$\cos\theta(t) = 2\,\text{sn}^2\left( \left. \sqrt{\frac{g}{l m}} \, t \right| m \right) - 1. \tag{D.30}$$
Using the Kirchhoff kinetic analogy, we can now write down the solutions for the Euler elasticas. Replacing $\sqrt{g/l}$ by $\sqrt{f/A} = 1/\lambda$ and $t$ by $s$, we find for the revolving case, Eq. D.27:
$$\cos\theta(s) = 2\,\text{sn}^2\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) - 1. \tag{D.31}$$
To make the parametric plots of Fig. 4.26 in Cartesian coordinates, we need to perform the integrations:
$$(x(s), y(s)) = \left( \int_0^s \sin\theta(s') \, ds', \int_0^s \cos\theta(s') \, ds' \right). \tag{D.32}$$
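We already know $\cos\theta(s)$, from which we obtain $\sin\theta(s) = \sqrt{1 - \cos^2\theta(s)}$:
$$\sin\theta(s) = 2\,\text{sn}\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) \sqrt{1 - \text{sn}^2\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right)} = 2\,\text{sn}\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) \text{cn}\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right), \tag{D.33}$$
where we introduced cn, another Jacobian elliptic function. Integrating this leads to yet another Jacobian elliptic function, $\text{dn} = \sqrt{1 - m\,\text{sn}^2}$:
$$x(s) = \frac{2\lambda}{\sqrt{m}} \left[ 1 - \text{dn}\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) \right]. \tag{D.34}$$
On the other hand, the integration over $\cos\theta(s)$ yields:
$$y(s) = \left( \frac{2}{m} - 1 \right) s - \frac{2\lambda}{\sqrt{m}} E\left( \left. \text{am}\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) \right| m \right) \tag{D.35}$$
with $E$ denoting the elliptic integral of the second kind.

Finally, we discuss the Euler elasticas that correspond to the oscillating case, $m > 1$. The Kirchhoff analogy leads to
$$\cos\theta(s) = 2\,\text{sn}^2\left( \left. \frac{1}{\sqrt{m}} \frac{s}{\lambda} \right| m \right) - 1 = 2\,\text{sn}^2\left( \left. \sqrt{\bar{m}} \, \frac{s}{\lambda} \right| \frac{1}{\bar{m}} \right) - 1, \tag{D.36}$$
where we introduced the reciprocal parameter $\bar{m} = 1/m$, which assumes the values $0 < \bar{m} < 1$ for the oscillating case. The sn-function has the transformation property $\text{sn}\left( \sqrt{\bar{m}} \, x \,|\, 1/\bar{m} \right) = \sqrt{\bar{m}} \, \text{sn}(x | \bar{m})$, which allows us to rewrite $\cos\theta(s)$ as follows:
$$\cos\theta(s) = 2 \bar{m} \, \text{sn}^2\left( \left. \frac{s}{\lambda} \right| \bar{m} \right) - 1. \tag{D.37}$$
To get parametric plots, we again calculate the $X$- and $Y$-coordinates, Eq. D.32. With $\sin\theta(s) = \sqrt{1 - \cos^2\theta(s)}$ we find
$$\sin\theta(s) = 2 \sqrt{\bar{m}} \, \text{sn}\left( \left. \frac{s}{\lambda} \right| \bar{m} \right) \sqrt{1 - \bar{m} \, \text{sn}^2\left( \left. \frac{s}{\lambda} \right| \bar{m} \right)} = 2 \sqrt{\bar{m}} \, \text{sn}\left( \left. \frac{s}{\lambda} \right| \bar{m} \right) \text{dn}\left( \left. \frac{s}{\lambda} \right| \bar{m} \right). \tag{D.38}$$
Integrating this leads to
$$x(s) = 2 \sqrt{\bar{m}} \, \lambda \left[ 1 - \text{cn}\left( \left. \frac{s}{\lambda} \right| \bar{m} \right) \right]. \tag{D.39}$$
The integration of $\cos\theta$ goes along similar lines as Eq. D.35:
$$y(s) = s - 2\lambda E\left( \left. \text{am}\left( \left. \frac{s}{\lambda} \right| \bar{m} \right) \right| \bar{m} \right). \tag{D.40}$$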
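The closed-form results for the oscillating case can be cross-checked without evaluating any elliptic function. Writing $\theta = \pi - \alpha$, Eq. D.37 gives $\text{sn}^2 = \sin^2(\alpha/2)/\bar{m}$, while Eq. D.39 can be solved for $\text{cn} = 1 - x/(2\sqrt{\bar{m}}\lambda)$; the two must satisfy $\text{sn}^2 + \text{cn}^2 = 1$ along the whole curve. The sketch below integrates the elastica equation $\alpha'' = -\sin\alpha/\lambda^2$ (the Kirchhoff analogue of Eq. D.15) together with $x' = \sin\theta = \sin\alpha$ and tests this identity. Here $\bar{m}$ denotes the reciprocal parameter $1/m$, and the initial slope $\alpha'(0) = 2\sqrt{\bar{m}}/\lambda$ follows from Eq. D.21 at $\alpha = 0$. The integrator, the parameter values and the function names are illustrative assumptions.

```python
import math

def integrate_elastica(m_bar, lam, s_end, steps=5000):
    """RK4 integration of alpha'' = -sin(alpha)/lam^2 and x' = sin(alpha) from s = 0."""
    ds = s_end / steps

    def deriv(state):
        alpha, dalpha, _x = state
        return (dalpha, -math.sin(alpha) / lam ** 2, math.sin(alpha))

    # alpha(0) = 0, alpha'(0) = 2 sqrt(m_bar)/lam, x(0) = 0
    state = (0.0, 2.0 * math.sqrt(m_bar) / lam, 0.0)
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(u + 0.5 * ds * k for u, k in zip(state, k1)))
        k3 = deriv(tuple(u + 0.5 * ds * k for u, k in zip(state, k2)))
        k4 = deriv(tuple(u + ds * k for u, k in zip(state, k3)))
        state = tuple(
            u + ds * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for u, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

def cn_mismatch(m_bar, lam, s_end):
    """Return sn^2 + cn^2 - 1 with sn^2 from Eq. D.37 and cn from Eq. D.39."""
    alpha, _dalpha, x = integrate_elastica(m_bar, lam, s_end)
    cn_from_x = 1.0 - x / (2.0 * math.sqrt(m_bar) * lam)
    sn_squared = math.sin(0.5 * alpha) ** 2 / m_bar
    return sn_squared + cn_from_x ** 2 - 1.0

if __name__ == "__main__":
    for s in (1.0, 3.0, 7.0):
        print(s, cn_mismatch(m_bar=0.5, lam=1.0, s_end=s))
```

The mismatch vanishes up to the integration error, which indicates that Eqs. D.37 and D.39 describe the same curve.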
Appendix E
Fourier Series
Consider a complex-valued function $f$ on the interval $[0, a]$ or, equivalently, a periodic function $f$ of periodicity $a$. Let us try to approximate $f$ by functions of the form
$$g_N(t) = \sum_{n=-N}^{+N} a_n e^{2\pi i n t/a}. \tag{E.1}$$
We ask ourselves: How should the coefficients $a_n$ be chosen such that $f(t)$ is approximated by $g_N(t)$ as closely as possible? More specifically, let us find the $a_n$-values ($-N \leq n \leq N$) such that the mean-squared deviation $\Delta_N$ between $f$ and $g_N$, namely
$$\Delta_N = \int_0^a |f(t) - g_N(t)|^2 \, dt, \tag{E.2}$$
is minimal. To solve this task, we introduce the following notation:
$$\langle f, g \rangle = \int_0^a f^*(t) \, g(t) \, dt \tag{E.3}$$
where the star indicates the complex conjugate of a complex number (e.g. $z = a + ib$, $z^* = a - ib$). One has $\langle f, g \rangle = \langle g, f \rangle^*$, $\langle f, \alpha_1 g_1 + \alpha_2 g_2 \rangle = \alpha_1 \langle f, g_1 \rangle + \alpha_2 \langle f, g_2 \rangle$ for any complex numbers
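$\alpha_1$ and $\alpha_2$, and $\langle f, f \rangle > 0$ for any continuous $f$ except $f \equiv 0$. Mathematically speaking, $\langle \cdot, \cdot \rangle$ is an inner product on the vector space of continuous complex-valued functions on the interval $[0, a]$. The functions
$$\varphi_n(t) = \frac{1}{\sqrt{a}} e^{2\pi i n t/a} \tag{E.4}$$
fulfill the condition
$$\langle \varphi_n, \varphi_m \rangle = \delta_{nm}, \tag{E.5}$$
i.e., they form an orthonormal system. We now rewrite the problem of minimizing Eq. E.2 as follows. What are the values of the coefficients $b_n$ with $-N \leq n \leq N$ such that
$$\Delta_N = \left\langle f - \sum_{n=-N}^{+N} b_n \varphi_n, \; f - \sum_{n=-N}^{+N} b_n \varphi_n \right\rangle \tag{E.6}$$
is as small as possible? Using two of the properties of the inner product, we find:
$$\Delta_N = \langle f, f \rangle - \sum_{n=-N}^{+N} \left( b_n^* \langle \varphi_n, f \rangle + b_n \langle f, \varphi_n \rangle \right) + \sum_{n,m=-N}^{+N} b_m^* b_n \langle \varphi_m, \varphi_n \rangle. \tag{E.7}$$
Introducing $c_n = \langle \varphi_n, f \rangle$ and using the orthonormality of the $\varphi_n$’s, this can be rewritten as
$$\Delta_N = \langle f, f \rangle - \sum_{n=-N}^{+N} \left( b_n^* c_n + b_n c_n^* - b_n^* b_n - c_n^* c_n + c_n^* c_n \right) \tag{E.8}$$
where we added and subtracted $c_n^* c_n$. This leads to
$$\Delta_N = \langle f, f \rangle - \sum_{n=-N}^{+N} |c_n|^2 + \sum_{n=-N}^{+N} |b_n - c_n|^2. \tag{E.9}$$
From this expression we can see directly that $\Delta_N$ is minimal if one chooses $b_n = c_n \equiv \langle \varphi_n, f \rangle$. Note that the $c_n$’s are independent of $N$. That is, if you choose a higher number of terms to approximate $f$, you do not need to adjust the coefficients $b_n$, just as is the case when one approximates a function by a Taylor series. The infinite series
$$\sum_{n=-\infty}^{+\infty} \langle \varphi_n, f \rangle \varphi_n(t) = \frac{1}{\sqrt{a}} \sum_{n=-\infty}^{+\infty} c_n e^{2\pi i n t/a} \tag{E.10}$$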
is called the Fourier series of the function $f$. It has been shown that the mean-squared deviation vanishes in the limit $N \to \infty$. This is true even for functions with a finite number of jumps, where the Fourier series converges at every point outside the jump discontinuities. The coefficients $c_n$ are called Fourier coefficients. The set $\{\varphi_n | n = 0, \pm 1, \pm 2, \ldots\}$ represents a complete orthonormal system of functions. The Fourier series
$$f(t) = \sum_{n=-\infty}^{+\infty} \langle \varphi_n, f \rangle \varphi_n(t) \tag{E.11}$$
is analogous to the representation of a vector by an orthonormal basis. In particular, one has
$$\langle f, f \rangle = \int_0^a f^*(t) f(t) \, dt = \sum_{n=-\infty}^{+\infty} c_n^* c_n. \tag{E.12}$$
Another complete orthonormal system is given by the set of functions
$$\left\{ \frac{1}{\sqrt{a}}, \; \sqrt{\frac{2}{a}} \cos\frac{2\pi n t}{a}, \; \sqrt{\frac{2}{a}} \sin\frac{2\pi n t}{a} \right\} \tag{E.13}$$
with $n = 1, 2, \ldots$ These functions are simply the normalized real and imaginary parts of the function $\varphi_n(t)$, Eq. E.4. The Fourier series, Eq. E.11, takes in this new system the form
$$f(t) = \frac{1}{\sqrt{a}} \sum_{n=0}^{\infty} \left[ a_n \cos\frac{2\pi n t}{a} + b_n \sin\frac{2\pi n t}{a} \right] \tag{E.14}$$
with
$$a_0 = c_0 = \frac{1}{\sqrt{a}} \int_0^a f(t) \, dt, \tag{E.15}$$
$$a_n = (c_n + c_{-n}) = \frac{2}{\sqrt{a}} \int_0^a f(t) \cos\frac{2\pi n t}{a} \, dt \tag{E.16}$$
for $n > 0$, and
$$b_n = i (c_n - c_{-n}) = \frac{2}{\sqrt{a}} \int_0^a f(t) \sin\frac{2\pi n t}{a} \, dt \tag{E.17}$$
for $n > 0$.
Figure E.1 Two approximations to the function $f(t) = t$ (shown in purple) on the interval $-\pi < t \leq \pi$. The red curve corresponds to the first 4 terms of its Fourier series, Eq. E.21, the blue curve to the first 10 terms.
We mention two further complete orthonormal sets on $[0, a]$ from which Fourier series can be built. One is the set $\sqrt{2/a} \sin(\pi n t/a)$ with $n = 1, 2, \ldots$ and the other is given by
$$\left\{ \frac{1}{\sqrt{a}}, \; \sqrt{\frac{2}{a}} \cos\frac{\pi n t}{a} \right\} \tag{E.18}$$
with $n = 1, 2, \ldots$ These systems are obtained by attributing to each function on $[0, a]$ an antisymmetric or symmetric function on $[-a, a]$, respectively. Fourier series of (anti)symmetric functions on $[-a, a]$ contain only (anti)symmetric trigonometric terms. We now give one example, namely $f(t) = t$ for $-\pi < t \leq \pi$, see Fig. E.1. In this case one finds
$$c_n = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{+\pi} t e^{-int} \, dt = \frac{(-1)^n \sqrt{2\pi}}{n} \, i \tag{E.19}$$
for $n \neq 0$ and $c_0 = 0$. Hence
$$f(t) = i \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \left( e^{int} - e^{-int} \right) \tag{E.20}$$
and thus
$$t = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin nt}{n} \tag{E.21}$$
for $-\pi < t < \pi$.
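The example $f(t) = t$ lends itself to a numerical illustration of Eqs. E.9 and E.21. In the sketch below, the mean-squared deviation of the partial sums is computed in two ways: from the Fourier coefficients, using $\langle f, f \rangle = 2\pi^3/3$ and $|c_n|^2 = 2\pi/n^2$ from Eq. E.19, and by direct numerical integration of $|f - g_N|^2$. The function names and the grid size are illustrative choices.

```python
import math

def partial_sum(t, n_terms):
    """First n_terms terms of the series in Eq. E.21."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * t) / n for n in range(1, n_terms + 1))

def deviation_from_coefficients(n_terms):
    """Eq. E.9 with b_n = c_n: <f,f> - sum over |n| <= N of |c_n|^2."""
    norm_squared = 2.0 * math.pi ** 3 / 3.0           # integral of t^2 from -pi to pi
    return norm_squared - 4.0 * math.pi * sum(1.0 / n ** 2 for n in range(1, n_terms + 1))

def deviation_by_quadrature(n_terms, n_grid=4000):
    """Midpoint-rule estimate of the integral of |t - g_N(t)|^2 over (-pi, pi)."""
    h = 2.0 * math.pi / n_grid
    total = 0.0
    for i in range(n_grid):
        t = -math.pi + (i + 0.5) * h
        total += (t - partial_sum(t, n_terms)) ** 2 * h
    return total

if __name__ == "__main__":
    for n_terms in (4, 10):   # the two truncations shown in Fig. E.1
        print(n_terms, deviation_from_coefficients(n_terms), deviation_by_quadrature(n_terms))
```

Both estimates agree, and the deviation decreases as more terms are included, although only slowly (roughly as $1/N$) because of the jump of the periodically continued function at $t = \pm\pi$.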
Appendix F
The Pre-Averaging Approximation
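Free Gaussian chain: The idea is to replace $\mathbf{H}_{nm}$ by its average $\langle \mathbf{H}_{nm} \rangle_{eq}$ over the equilibrium distribution of the Gaussian chain (Doi and Edwards, 1986). Since the orientation of $\mathbf{R}_{nm}$ is independent of its length, $\langle \mathbf{H}_{nm} \rangle_{eq}$ is of the form
$$\langle \mathbf{H}_{nm} \rangle_{eq} = \frac{1}{8\pi\eta} \left\langle \frac{1}{|\mathbf{R}_{nm}|} \right\rangle_{eq} \left\langle \mathbf{I} + \hat{\mathbf{R}}_{nm} \hat{\mathbf{R}}_{nm} \right\rangle_{eq} \tag{F.1}$$
which with $\langle \hat{\mathbf{R}}_{nm} \hat{\mathbf{R}}_{nm} \rangle_{eq} = \mathbf{I}/3$ simplifies to
$$\langle \mathbf{H}_{nm} \rangle_{eq} = \frac{\mathbf{I}}{6\pi\eta} \left\langle \frac{1}{|\mathbf{R}_{nm}|} \right\rangle_{eq}. \tag{F.2}$$
Internal monomer distances are Gaussian distributed with variances $b^2 |n - m|$, Eq. 5.102, allowing us to calculate $\langle \mathbf{H}_{nm} \rangle_{eq}$ exactly:
$$\langle \mathbf{H}_{nm} \rangle_{eq} = \int_0^\infty \left( \frac{3}{2\pi b^2 |n - m|} \right)^{3/2} e^{-\frac{3 r^2}{2 b^2 |n - m|}} \frac{\mathbf{I}}{6\pi\eta r} \, 4\pi r^2 \, dr = \frac{\mathbf{I}}{\sqrt{6\pi^3 |n - m|} \, \eta b} = h(n - m) \, \mathbf{I}. \tag{F.3}$$
Replacing $\mathbf{H}_{nm}$ in Eq. 5.134 by its average $\langle \mathbf{H}_{nm} \rangle_{eq}$ leads to
$$\frac{\partial \mathbf{R}_n(t)}{\partial t} = \int_0^N h(n - m) \left( K \frac{\partial^2 \mathbf{R}_m(t)}{\partial m^2} + \mathbf{L}(m, t) \right) dm. \tag{F.4}$$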
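The Gaussian average in Eq. F.3 can be verified by numerical integration. The sketch below evaluates the radial integral with a simple midpoint rule and compares it with the closed form $h(n - m) = 1/(\sqrt{6\pi^3 |n - m|} \, \eta b)$; the units $b = \eta = 1$, the upper cutoff of the integral and the function names are illustrative assumptions.

```python
import math

def h_numeric(sep, b=1.0, eta=1.0, n_grid=20000):
    """Midpoint-rule evaluation of the radial integral in Eq. F.3 for |n - m| = sep."""
    var = b * b * sep                       # variance b^2 |n - m|
    r_max = 10.0 * math.sqrt(var)           # beyond this the Gaussian weight is negligible
    dr = r_max / n_grid
    prefactor = (3.0 / (2.0 * math.pi * var)) ** 1.5
    total = 0.0
    for i in range(n_grid):
        r = (i + 0.5) * dr
        weight = prefactor * math.exp(-1.5 * r * r / var) * 4.0 * math.pi * r * r
        total += weight / (6.0 * math.pi * eta * r) * dr
    return total

def h_closed(sep, b=1.0, eta=1.0):
    """Closed form on the rhs of Eq. F.3."""
    return 1.0 / (math.sqrt(6.0 * math.pi ** 3 * sep) * eta * b)

if __name__ == "__main__":
    for sep in (1, 10, 100):
        print(sep, h_numeric(sep), h_closed(sep))
```

The comparison also makes the slow decay $h(n - m) \sim |n - m|^{-1/2}$ explicit: increasing the separation by a factor of 100 reduces the coupling only by a factor of 10.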
Through the pre-averaging approximation we arrived at a linear set of equations for the $\mathbf{R}_m$’s that, as in the Rouse model, decouples in the $X$-, $Y$- and $Z$-components. The motion of the monomers, however, is strongly coupled since $h(n - m) \sim |n - m|^{-1/2}$. We analyze Eq. F.4 in terms of the Rouse normal coordinates, Eq. 5.110. To do this, we first apply the transformation, Eq. 5.110, on both sides of Eq. F.4. In this way one goes on the left-hand side from $\mathbf{R}_n(t)$ to $\mathbf{R}(p, t)$ and on the right-hand side from a function with two variables $h(n, m) = h(n - m)$ to a function with one Fourier transformed variable $h_p(m)$. In a second step, the functions in $m$, $h_p(m)$, $\mathbf{R}_m(t)$ and $\mathbf{L}(m, t)$ are replaced by their Fourier series (as in Eq. 5.109) and the integration $\int_0^N dm$ is carried out. One arrives at
$$\frac{\partial \mathbf{R}(p, t)}{\partial t} = \sum_q h_{pq} \left( -K_q \mathbf{R}(q, t) + \tilde{\mathbf{L}}(q, t) \right). \tag{F.5}$$
Here $K_q$ is defined by Eq. 5.113 and
$$h_{pq} = \frac{1}{N^2} \int_0^N dn \int_0^N dm \, \cos\left( \frac{p\pi n}{N} \right) \cos\left( \frac{q\pi m}{N} \right) h(n - m). \tag{F.6}$$
For $p, q > 0$ this can be rewritten as follows:
$$h_{pq} = \frac{1}{N^2} \int_0^N dn \int_{-n}^{N-n} dm \, \cos\left( \frac{p\pi n}{N} \right) \cos\left( \frac{q\pi (n + m)}{N} \right) h(m)$$
$$= \frac{1}{N^2} \int_0^N dn \, \cos\left( \frac{p\pi n}{N} \right) \left[ \cos\left( \frac{q\pi n}{N} \right) \int_{-n}^{N-n} \cos\left( \frac{q\pi m}{N} \right) h(m) \, dm - \sin\left( \frac{q\pi n}{N} \right) \int_{-n}^{N-n} \sin\left( \frac{q\pi m}{N} \right) h(m) \, dm \right]. \tag{F.7}$$
For large $q$ the two $\int_{-n}^{N-n}$-integrals converge quickly to
$$\int_{-\infty}^{\infty} \cos\left( \frac{q\pi m}{N} \right) h(m) \, dm = \frac{1}{\eta b} \sqrt{\frac{N}{3\pi^3 q}} \tag{F.8}$$
and
$$\int_{-\infty}^{\infty} \sin\left( \frac{q\pi m}{N} \right) h(m) \, dm = 0. \tag{F.9}$$
Replacing the $\int_{-n}^{N-n}$-integrals in Eq. F.7 by these asymptotic values one obtains
$$h_{pq} \approx \frac{1}{N^2} \frac{1}{\eta b} \sqrt{\frac{N}{3\pi^3 q}} \int_0^N \cos\left( \frac{p\pi n}{N} \right) \cos\left( \frac{q\pi n}{N} \right) dn = \frac{1}{\sqrt{12\pi^3 q N} \, \eta b} \, \delta_{pq}. \tag{F.10}$$
This relation also allows us to estimate $h_{0q} \approx 0$ for $q > 1$. Finally, $h_{00}$ follows directly from Eq. F.6:
$$h_{00} = \frac{1}{N^2} \int_0^N dn \int_0^N dm \, h(n - m) = \frac{8}{3} \frac{1}{\sqrt{6\pi^3 N} \, \eta b}. \tag{F.11}$$
These equations indicate that $h_{pq}$ is nearly diagonal. The Rouse modes are effectively decoupled, despite the presence of hydrodynamic interactions that make Eq. 5.134 nonlinear. The equation for the $p$-th mode thus has the same structure as the one for the Rouse model, Eq. 5.111:
$$\zeta_p \frac{\partial \mathbf{R}(p, t)}{\partial t} = -K_p \mathbf{R}(p, t) + \tilde{\mathbf{L}}(p, t) \tag{F.12}$$
with $\zeta_p = 1/h_{pp}$. The only difference to the Rouse model lies in the functional form of the $\zeta_p$’s. In the Zimm model we find
$$\zeta_0 = \frac{3}{8} \eta b \sqrt{6\pi^3 N}, \quad \zeta_p = \eta b \sqrt{12\pi^3 N p} \tag{F.13}$$
for $p = 1, 2, \ldots$, whereas $\zeta_p$ is constant for the Rouse model, Eq. 5.112.
Gaussian chain under tension: We estimate the mean distances between pairs of monomers in equilibrium using the Pincus blob argument, see Fig. 3.11. According to this argument one has a characteristic subchain monomer number $g_P = (k_B T/bf)^2$ such that monomers $n$ and $m$ belong to the same blob if $|n - m| < g_P$ and to different blobs if $|n - m| > g_P$. Hence
$$\langle |\mathbf{R}_{nm}| \rangle_{eq} \approx \begin{cases} b |n - m|^{1/2} & \text{for } |n - m| < g_P \\ \dfrac{b^2 f |n - m|}{k_B T} & \text{for } |n - m| > g_P. \end{cases} \tag{F.14}$$
Furthermore, the average value of the tensor $\mathbf{I} + \hat{\mathbf{R}}_{nm} \hat{\mathbf{R}}_{nm}$ can be approximated by $(4/3) \mathbf{I}$ for $|n - m| < g_P$ (isotropic case; see also the force-free case above) and by $\mathbf{I} + \mathbf{e}_y \mathbf{e}_y$ for $|n - m| > g_P$, with $\mathbf{e}_y$ denoting the unit vector in the $Y$-direction, the direction of the force. Neglecting that anisotropy (and numerical factors), i.e., setting $\langle \mathbf{I} + \hat{\mathbf{R}}_{nm} \hat{\mathbf{R}}_{nm} \rangle_{eq} \approx \mathbf{I}$, we obtain
$$\langle \mathbf{H}_{nm} \rangle_{eq} = h(n - m) \, \mathbf{I} \approx \begin{cases} \dfrac{\mathbf{I}}{\eta b |n - m|^{1/2}} & \text{for } |n - m| < g_P \\ \dfrac{\mathbf{I}}{\eta \left( b^2 f/k_B T \right) |n - m|} & \text{for } |n - m| > g_P. \end{cases} \tag{F.15}$$
After the pre-averaging, we come back to Eq. F.4, but with a different function $h(n - m)$. In terms of the Rouse normal coordinates, this leads to Eq. F.5 with $K_q$ given by Eq. 5.113 and $h_{pq}$ by Eq. F.6. $h_{pq}$ for $p, q > 0$ can be calculated along a line similar to that in Eqs. F.7 through F.10 above. Here, however, Eq. F.8 needs to be replaced by
$$\int_{-\infty}^{\infty} \cos\left( \frac{q\pi m}{N} \right) h(m) \, dm \approx \frac{1}{\eta b} \left( \sqrt{\frac{N}{q}} \int_0^{q\pi g_P/N} \frac{\cos x}{x^{1/2}} \, dx + \frac{k_B T}{b f} \int_{q\pi g_P/N}^{\infty} \frac{\cos x}{x} \, dx \right) \tag{F.16}$$
leading to
$$h_{pq} \approx \frac{1}{\eta b N} \left( \sqrt{\frac{N}{q}} \int_0^{q\pi g_P/N} \frac{\cos x}{x^{1/2}} \, dx + \frac{k_B T}{b f} \int_{q\pi g_P/N}^{\infty} \frac{\cos x}{x} \, dx \right) \delta_{pq}. \tag{F.17}$$
From the asymptotic behavior of the integrals, the first being a Fresnel integral and the second the cosine integral, one finds the asymptotic behavior of $h_{pq}$:
$$h_{pq} \approx \begin{cases} \dfrac{k_B T}{\eta b^2 N f} \ln\left( \dfrac{N}{p\pi g_P} \right) \delta_{pq} & \text{for } p \ll \dfrac{N}{g_P} \\ \dfrac{1}{\eta b \sqrt{N p}} \, \delta_{pq} & \text{for } p \gg \dfrac{N}{g_P}. \end{cases} \tag{F.18}$$
The dynamics of the different Rouse modes are again given by Eq. F.12 with $\zeta_p = 1/h_{pp}$ now given by Eq. F.18. Short-wavelength modes with large $p$, $p \gg N/g_P$, behave as in the case of a Zimm chain in the absence of a force, $\zeta_p \propto \eta b \sqrt{N p}$, Eq. F.10. Remarkably, the behavior of long-wavelength modes with $p \ll N/g_P$ is entirely different. Neglecting the logarithmic factor in Eq. F.18, we find $\zeta_p \approx \eta b^2 N f/k_B T$, which is independent of $p$. This
suggests that the long-wavelength modes effectively behave Rouse-like. In complete analogy to the Rouse model (see Eq. 5.119) we find relaxation times for those modes of the form
$$\tau_p = \frac{\zeta_p}{K_p} = \frac{\tilde{\tau}_R}{p^2} \tag{F.19}$$
with the Rouse time $\tilde{\tau}_R$ given by Eq. 5.139.
Appendix G
Interaction between Two Equally Charged Plates at Zero Temperature
Here we derive the pressure between two equally charged plates at zero temperature, Eq. 7.66. We begin by rewriting the sum over $l$ in Eq. 7.65:
$$I = l_B \sum_l \frac{1}{\sqrt{|\mathbf{R}_l + \mathbf{c}|^2 + D^2}} = l_B \int d^2 r \, \frac{\sum_l \delta(\mathbf{r} - \mathbf{R}_l)}{\sqrt{|\mathbf{r} + \mathbf{c}|^2 + D^2}}. \tag{G.1}$$
The $XY$-positions of the ions on one surface form a lattice given by the set of 2D vectors $\mathbf{R}_l$, while the ions on the other surface are shifted to the positions $\mathbf{R}_l + \mathbf{c}$. The integral introduced above is thus two-dimensional. The sum over the delta-functions is a periodic function in two dimensions. Any such periodic function $f(\mathbf{r})$ can be written in the form of a plane wave expansion, a 2D version of the Fourier expansion introduced in Appendix E. Here
$$f(\mathbf{r}) = \sum_l \delta(\mathbf{r} - \mathbf{R}_l) = \sum_{\mathbf{k}} f_{\mathbf{k}} \, e^{i \mathbf{k} \mathbf{r}} \tag{G.2}$$
where the summation goes over all vectors $\mathbf{k}$ of the reciprocal lattice, which is defined further below. The $f_{\mathbf{k}}$ are the Fourier coefficients given by
$$f_{\mathbf{k}} = \sigma \int_C e^{-i \mathbf{k} \mathbf{r}} f(\mathbf{r}) \, d\mathbf{r} \tag{G.3}$$
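where $C$ denotes a primitive cell of the direct lattice, a minimum repeat unit containing one ion. Here $f_{\mathbf{k}} = \sigma$ and hence

$$ I = l_B \sigma \sum_{\mathbf{k}} \int \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{|\mathbf{r} + \mathbf{c}|^2 + D^2}} \, d^2r = l_B \sigma \sum_{\mathbf{k}} e^{-i\mathbf{k}\cdot\mathbf{c}} \int \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{r^2 + D^2}} \, d^2r. \tag{G.4} $$

We exchanged the order of summation and integration here; substituting $\mathbf{r} + \mathbf{c}$ by $\mathbf{r}$ yields the phase factor $e^{-i\mathbf{k}\cdot\mathbf{c}}$. Note that the term with $\mathbf{k} = 0$ in the summation corresponds exactly to the second term in Eq. 7.65. Hence we can write the dimensionless potential as

$$ \Phi(D) = l_B \sigma \sum_{\mathbf{k} \neq 0} e^{-i\mathbf{k}\cdot\mathbf{c}} \int \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{r^2 + D^2}} \, d^2r. \tag{G.5} $$

Using Eq. 7.66, we calculate the pressure from the potential by differentiation:

$$ \frac{\Pi(D)}{k_B T} = l_B \sigma^2 D \sum_{\mathbf{k} \neq 0} e^{-i\mathbf{k}\cdot\mathbf{c}} \int \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\left(r^2 + D^2\right)^{3/2}} \, d^2r = l_B \sigma^2 D \sum_{\mathbf{k} \neq 0} e^{-i\mathbf{k}\cdot\mathbf{c}} \int_0^{2\pi} \int_0^{\infty} \frac{r \, e^{ikr\cos\phi}}{\left(r^2 + D^2\right)^{3/2}} \, dr \, d\phi. $$

We introduced here polar coordinates where $\phi$ denotes the angle between the respective $\mathbf{k}$-vector and $\mathbf{r}$. The double integral can be calculated analytically (first integrate over $\phi$, then over $r$) and yields $(2\pi/D)\, e^{-kD}$ with $k = |\mathbf{k}|$. This leads to

$$ \frac{\Pi(D)}{k_B T} = 2\pi l_B \sigma^2 \sum_{\mathbf{k} \neq 0} e^{-i\mathbf{k}\cdot\mathbf{c}} \, e^{-kD}. \tag{G.6} $$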
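The value of the double integral can be checked in two steps. The sketch below relies on two standard results that are not derived in this appendix and are taken here as assumptions: the integral representation of the Bessel function of the first kind, $J_0$, and a tabulated integral of $J_0$ against an algebraic kernel.

$$ \int_0^{2\pi} e^{ikr\cos\phi} \, d\phi = 2\pi J_0(kr), \qquad \int_0^{\infty} \frac{r \, J_0(kr)}{\left(r^2 + D^2\right)^{3/2}} \, dr = \frac{e^{-kD}}{D} \quad (k > 0,\ D > 0), $$

so that

$$ \int_0^{2\pi} \int_0^{\infty} \frac{r \, e^{ikr\cos\phi}}{\left(r^2 + D^2\right)^{3/2}} \, dr \, d\phi = \frac{2\pi}{D} \, e^{-kD}. $$

Multiplying by the prefactor $l_B \sigma^2 D$ cancels the factor $1/D$, which is why each reciprocal lattice vector contributes a pure exponential $2\pi l_B \sigma^2 e^{-i\mathbf{k}\cdot\mathbf{c}} e^{-kD}$ to the pressure.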
We have therefore expressed the interaction between the two surfaces as an infinite sum of exponentials. In the following, we are interested in the leading terms of this sum for large distances. These are the terms with the smallest value of $k$. The ground state of a single plane is given by counterions that form a triangular Wigner crystal. We expect that each surface with its counterions still remains in this triangular ground state as long as $D$ is much larger than the spacing between counterions within their planes. More specifically, the positions of the counterions in one lattice are given by $n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2$ with $n_i = 0, \pm 1, \pm 2, \ldots$, an example of a so-called Bravais lattice.

Figure G.1 Primitive vectors a1 and a2 that span the triangular lattice. Also indicated are the primitive vectors of the reciprocal lattice, b1 and b2, and the shift vector c for maximum attraction between the two surfaces.

The vectors $\mathbf{a}_i$ that span the lattice, the so-called primitive vectors, are given by

$$ \mathbf{a}_1 = a \, \mathbf{e}_x, \qquad \mathbf{a}_2 = \frac{a}{2} \, \mathbf{e}_x + \frac{\sqrt{3}\, a}{2} \, \mathbf{e}_y \tag{G.7} $$

and are indicated in Fig. G.1. The lattice spacing $a$ has to be chosen so that it matches the charge density $\sigma$, leading to $a = \sqrt{2} / \left(3^{1/4} \sigma^{1/2}\right)$. The reciprocal lattice, the set of all vectors $\mathbf{k}$ for which $e^{i\mathbf{k}\cdot\mathbf{R}} = 1$ for all $\mathbf{R}$ in the Bravais lattice, is given by $\mathbf{k} = k_1 \mathbf{b}_1 + k_2 \mathbf{b}_2$, $k_i = 0, \pm 1, \pm 2, \ldots$, with

$$ \mathbf{b}_1 = \frac{2\pi}{a} \left( \mathbf{e}_x - \frac{1}{\sqrt{3}} \, \mathbf{e}_y \right), \qquad \mathbf{b}_2 = \frac{4\pi}{\sqrt{3}\, a} \, \mathbf{e}_y. \tag{G.8} $$

The primitive vectors of the reciprocal lattice fulfill $\mathbf{b}_i \cdot \mathbf{a}_j = 2\pi \delta_{ij}$, see also Fig. G.1. For large distances, the leading terms in Eq. G.6 are those with the smallest value of $k$, namely $(k_1, k_2) = (\pm 1, 0)$ and $(k_1, k_2) = (0, \pm 1)$. For distances $D$ with $D \gg a$, all higher-order terms are negligible and the large distance pressure is given by

$$ \frac{\Pi(D)}{k_B T} \approx 4\pi \sigma^2 l_B \left( \cos(\mathbf{b}_1 \cdot \mathbf{c}) + \cos(\mathbf{b}_2 \cdot \mathbf{c}) \right) e^{-\frac{4\pi D}{\sqrt{3}\, a}} \tag{G.9} $$

to a very good approximation. With a vanishing length of $\mathbf{c}$, the counterions of one surface are just directly above the counterions of the other surface, so that the two surfaces repel each other. One finds $\cos(\mathbf{b}_1 \cdot \mathbf{c}) + \cos(\mathbf{b}_2 \cdot \mathbf{c}) = 2$, which leads to maximum repulsion. However, if we allow one of the plates with its counterions to move in the XY-plane relative to the other at a fixed value of $D$, the system
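can lower its energy. It reaches the ground state when $\cos(\mathbf{b}_1 \cdot \mathbf{c}) + \cos(\mathbf{b}_2 \cdot \mathbf{c}) = -2$. This is the case if we choose the shift vector $\mathbf{c}$ to obey $\mathbf{b}_1 \cdot \mathbf{c} = -\pi$ and $\mathbf{b}_2 \cdot \mathbf{c} = \pi$. This is achieved for

$$ \mathbf{c} = -\frac{a}{4} \, \mathbf{e}_x + \frac{\sqrt{3}\, a}{4} \, \mathbf{e}_y, \tag{G.10} $$

as shown in Figs. 7.12 and G.1. We then find Eq. 7.66 from Eq. G.9.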
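The lattice relations used above are easy to verify numerically. The following is a minimal sketch, not part of the original derivation: it builds the vectors of Eqs. G.7, G.8 and G.10 for a given charge density and evaluates the large distance pressure of Eq. G.9. The function names and the numerical values of `sigma`, `l_B` and `D` are illustrative assumptions.

```python
import math


def dot(u, v):
    """Scalar product of two 2D vectors given as tuples."""
    return u[0] * v[0] + u[1] * v[1]


def triangular_lattice(sigma):
    """Return the lattice spacing a, the primitive vectors (a1, a2) of
    Eq. G.7 and the reciprocal vectors (b1, b2) of Eq. G.8 for a
    triangular lattice with one ion per primitive cell of area 1/sigma."""
    a = math.sqrt(2.0) / (3.0 ** 0.25 * math.sqrt(sigma))
    a1 = (a, 0.0)
    a2 = (a / 2.0, math.sqrt(3.0) * a / 2.0)
    b1 = (2.0 * math.pi / a, -2.0 * math.pi / (math.sqrt(3.0) * a))
    b2 = (0.0, 4.0 * math.pi / (math.sqrt(3.0) * a))
    return a, (a1, a2), (b1, b2)


def shift_vector(a):
    """Shift vector c of Eq. G.10."""
    return (-a / 4.0, math.sqrt(3.0) * a / 4.0)


def leading_pressure(D, sigma, l_B, c):
    """Pi / (k_B T) according to Eq. G.9 (leading terms only, D >> a)."""
    a, _, (b1, b2) = triangular_lattice(sigma)
    decay = math.exp(-4.0 * math.pi * D / (math.sqrt(3.0) * a))
    phases = math.cos(dot(b1, c)) + math.cos(dot(b2, c))
    return 4.0 * math.pi * sigma ** 2 * l_B * phases * decay


# Illustrative numbers only (lengths in nm, sigma in nm^-2).
sigma, l_B, D = 0.5, 0.7, 5.0
a, (a1, a2), (b1, b2) = triangular_lattice(sigma)
c = shift_vector(a)
print("b1.c / pi =", dot(b1, c) / math.pi)
print("b2.c / pi =", dot(b2, c) / math.pi)
print("aligned plates:", leading_pressure(D, sigma, l_B, (0.0, 0.0)))
print("shifted plates:", leading_pressure(D, sigma, l_B, c))
```

With aligned lattices the pressure is positive (repulsion); with the shift vector of Eq. G.10 it has the same magnitude and the opposite sign, which is the sign change that turns Eq. G.9 into the attractive result.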
Appendix H
Geometries of Chromatin Fiber Models
Two-angle model: For any given set of angles $(\theta, \phi)$ in the two-angle model, there is a helix, the master solenoid, so that all the successive vertices (see Fig. 8.33(b)), called monomers in the following, lie along this helical path. There are actually many such solutions, but we are interested in the one with the largest pitch angle $\psi$, see Fig. H.1. We parametrize the solenoid as follows:

$$ \mathbf{r}(s) = \begin{pmatrix} R \cos(\alpha s / R) \\ R \sin(\alpha s / R) \\ s \end{pmatrix}. \tag{H.1} $$

$R$ denotes the radius of the solenoid and $\alpha$ is related to the pitch $\psi$ by

$$ \alpha = \cot \psi \tag{H.2} $$

since $\dot{\mathbf{r}}(0) = (0, \alpha, 1)$. Assume now an infinite fiber of monomers with a given pair of angles $(\theta, \phi)$. The monomers are located at the positions $\mathbf{R}_0, \mathbf{R}_{\pm 1}, \mathbf{R}_{\pm 2}, \ldots$. The axis of the fiber coincides with the Z-axis. Put the monomer labeled $i = 0$ at $s = 0$ so that $\mathbf{R}_0 = \mathbf{r}(0) = (R, 0, 0)$. The next monomer, $i = 1$, is at a position $\mathbf{R}_1 = \mathbf{r}(s_0)$ and the next nearest monomer at $\mathbf{R}_2 = \mathbf{r}(2 s_0)$. Finally, the position of monomer $i = -1$ is given by $\mathbf{R}_{-1} = \mathbf{r}(-s_0)$.
Figure H.1 The master solenoid (blue) and the linker DNA backbone (red) of the two-angle fiber.
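Now let us calculate the bond vectors between these monomers, see Fig. H.1. Monomer $i = 1$ is connected to monomer $i = 0$ via

$$ \mathbf{r}_0 = \mathbf{R}_1 - \mathbf{R}_0 = \begin{pmatrix} R \cos(\alpha s_0 / R) - R \\ R \sin(\alpha s_0 / R) \\ s_0 \end{pmatrix}. \tag{H.3} $$

The vector between monomer $i = 2$ and $i = 1$ is given by

$$ \mathbf{r}_1 = \mathbf{R}_2 - \mathbf{R}_1 = \begin{pmatrix} R \cos(2\alpha s_0 / R) - R \cos(\alpha s_0 / R) \\ R \sin(2\alpha s_0 / R) - R \sin(\alpha s_0 / R) \\ s_0 \end{pmatrix} \tag{H.4} $$

and that between monomer $i = 0$ and $i = -1$ by

$$ \mathbf{r}_2 = \mathbf{R}_0 - \mathbf{R}_{-1} = \begin{pmatrix} R - R \cos(\alpha s_0 / R) \\ R \sin(\alpha s_0 / R) \\ s_0 \end{pmatrix}. \tag{H.5} $$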
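$s_0$ follows from the condition of fixed linker length, i.e., $|\mathbf{r}_0| = b$. This leads to the relation

$$ b^2 = 2R^2 \left( 1 - \cos(\alpha s_0 / R) \right) + s_0^2. \tag{H.6} $$

We determine $\theta$ from $\cos\theta = \mathbf{r}_0 \cdot \mathbf{r}_2 / |\mathbf{r}_0|^2$, which leads to

$$ \cos\theta = \frac{2R^2 \cos(\alpha s_0 / R) \left( 1 - \cos(\alpha s_0 / R) \right) + s_0^2}{2R^2 \left( 1 - \cos(\alpha s_0 / R) \right) + s_0^2}. \tag{H.7} $$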
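Finally, $\phi$ is the angle between the normal vectors of the two planes defined by monomers 0, 1, 2 and by monomers $-1$, 0, 1, i.e., $\cos\phi = \mathbf{n}_1 \cdot \mathbf{n}_2$. We obtain $\mathbf{n}_1$ and $\mathbf{n}_2$ from $\mathbf{n}_1 = \mathbf{r}_0 \times \mathbf{r}_1 / |\mathbf{r}_0 \times \mathbf{r}_1|$ and $\mathbf{n}_2 = \mathbf{r}_2 \times \mathbf{r}_0 / |\mathbf{r}_2 \times \mathbf{r}_0|$. After some algebra we arrive at

$$ \cos\phi = \frac{s_0^2 \cos(\alpha s_0 / R) + R^2 \sin^2(\alpha s_0 / R)}{s_0^2 + R^2 \sin^2(\alpha s_0 / R)}. \tag{H.8} $$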
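The closed forms for $b$, $\cos\theta$ and $\cos\phi$ can be cross-checked against a direct construction of the bond vectors. The sketch below is an illustration under stated assumptions, not the author's code: it places monomers $-1$, 0, 1, 2 on the master solenoid of Eq. H.1, computes the two angles from dot and cross products, and compares with Eqs. H.6 to H.8. All function names and the sample values of `R`, `alpha` and `s0` are hypothetical.

```python
import math


def monomer(i, R, alpha, s0):
    """Position R_i = r(i * s0) on the master solenoid, Eq. H.1."""
    s = i * s0
    return (R * math.cos(alpha * s / R), R * math.sin(alpha * s / R), s)


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(dot(u, u))
    return tuple(x / norm for x in u)


def from_vectors(R, alpha, s0):
    """Linker length b, cos(theta) and cos(phi) from explicit bond vectors."""
    Rm1, R0, R1, R2 = (monomer(i, R, alpha, s0) for i in (-1, 0, 1, 2))
    r0, r1, r2 = sub(R1, R0), sub(R2, R1), sub(R0, Rm1)
    b = math.sqrt(dot(r0, r0))
    cos_theta = dot(r0, r2) / dot(r0, r0)
    n1, n2 = unit(cross(r0, r1)), unit(cross(r2, r0))
    return b, cos_theta, dot(n1, n2)


def from_closed_forms(R, alpha, s0):
    """The same three quantities from Eqs. H.6, H.7 and H.8."""
    beta = alpha * s0 / R
    b_sq = 2.0 * R ** 2 * (1.0 - math.cos(beta)) + s0 ** 2
    cos_theta = (2.0 * R ** 2 * math.cos(beta) * (1.0 - math.cos(beta))
                 + s0 ** 2) / b_sq
    sin_sq = R ** 2 * math.sin(beta) ** 2
    cos_phi = (s0 ** 2 * math.cos(beta) + sin_sq) / (s0 ** 2 + sin_sq)
    return math.sqrt(b_sq), cos_theta, cos_phi


# Illustrative solenoid parameters (arbitrary length units).
print(from_vectors(10.0, 3.0, 2.0))
print(from_closed_forms(10.0, 3.0, 2.0))
```

Going in the opposite direction, from a prescribed $(\theta, \phi, b)$ to $(\alpha, R, s_0)$, requires solving the three equations simultaneously; a numerical root finder would be one way to do that, but it is not shown here.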
Equations H.6–H.8 relate $\alpha$ (or $\psi$), $R$ and $s_0$ of the master solenoid to $\theta$, $\phi$ and $b$.

Solenoid-type models: The geometry of the two-angle model and that of solenoid-type models have in common that some part of the fiber forms a helix. For the two-angle model, the helix consists of the linker DNA, Fig. H.1; for the solenoid-type models, one or several helices are made up of stacks of nucleosomes. This enables us to use the geometric relationships of the two-angle model to describe solenoid-type models. $R$ from Eq. H.1 is now given by $(D_{\text{fiber}} - D_{\text{nucl}})/2$. According to Fig. 8.38 for a fiber with $N_{\text{rib}}$ stacks, this quantity is given by

$$ \pi \left( D_{\text{fiber}} - D_{\text{nucl}} \right) = N_{\text{rib}} \frac{D_{\text{nucl}}}{\sin\psi} \tag{H.9} $$
where $\psi$ is again related to $\alpha$ via Eq. H.2. Equation H.1 explicitly gives us the space curve of one of the stacks; the other stacks follow by adding a constant $2\pi k / N_{\text{rib}}$ with $k = 1, 2, \ldots, N_{\text{rib}} - 1$ to the arguments of the trigonometric functions. The linker length $b$ of the two-angle model is now replaced by the nucleosomal height $H_{\text{nucl}}$. We need to determine the arc length of a helix segment that crosses through a nucleosome in a stack. We know that the length of the tangent vector of the space curve, Eq. H.1, is given by $|\dot{\mathbf{r}}(s)| = \sqrt{1 + \alpha^2}$. That is, when $s$ goes from $s = 0$ to $s = s_0 = H_{\text{nucl}} / \sqrt{1 + \alpha^2}$, the helix approximately crosses the nucleosome. This is an approximation because the helix is curved, but the error is negligible as long as $H_{\text{nucl}} \ll D_{\text{fiber}}$, which is always assumed here. The splay angle is identical to the angle $\theta$ of the two-angle model. Since $\theta$ and $\alpha s_0 / R$ are always small here, we can Taylor expand all the cosine functions in Eq. H.7. This leads to

$$ \theta \approx \frac{s_0 \alpha^2}{R \sqrt{1 + \alpha^2}} = \frac{2 H_{\text{nucl}}}{D_{\text{fiber}} - D_{\text{nucl}}} \, \frac{\alpha^2}{1 + \alpha^2} = \frac{2 H_{\text{nucl}}}{D_{\text{fiber}} - D_{\text{nucl}}} \left( 1 - \sin^2\psi \right). \tag{H.10} $$

If we replace $\sin\psi$ by using Eq. H.9, we find Eq. 8.107.
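How good the small-angle result is can be judged by comparing Eq. H.10 with the exact expression, Eq. H.7, evaluated with $R = (D_{\text{fiber}} - D_{\text{nucl}})/2$, $\alpha = \cot\psi$ and $s_0 = H_{\text{nucl}} / \sqrt{1 + \alpha^2}$. The following sketch does this; the function names and the numbers (lengths in nm, chosen only to be of a plausible order of magnitude) are assumptions made for illustration.

```python
import math


def theta_exact(D_fiber, D_nucl, H_nucl, psi):
    """Splay angle from Eq. H.7 with the solenoid-type substitutions."""
    R = (D_fiber - D_nucl) / 2.0
    alpha = 1.0 / math.tan(psi)              # Eq. H.2
    s0 = H_nucl / math.sqrt(1.0 + alpha ** 2)
    beta = alpha * s0 / R
    num = 2.0 * R ** 2 * math.cos(beta) * (1.0 - math.cos(beta)) + s0 ** 2
    den = 2.0 * R ** 2 * (1.0 - math.cos(beta)) + s0 ** 2
    return math.acos(num / den)


def theta_approx(D_fiber, D_nucl, H_nucl, psi):
    """Small-angle expression, Eq. H.10."""
    return 2.0 * H_nucl / (D_fiber - D_nucl) * (1.0 - math.sin(psi) ** 2)


# Hypothetical geometry: fiber diameter, nucleosome diameter and height in nm.
D_fiber, D_nucl, H_nucl = 33.0, 11.0, 6.0
for psi in (0.3, 0.5, 1.0):
    exact = theta_exact(D_fiber, D_nucl, H_nucl, psi)
    approx = theta_approx(D_fiber, D_nucl, H_nucl, psi)
    print(f"psi = {psi:.1f}: exact = {exact:.4f}, approx = {approx:.4f}")
```

For these sample values the two expressions differ by around one percent or less, and the agreement improves further as $H_{\text{nucl}} / D_{\text{fiber}}$ decreases.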
References
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., and Walter, P. (2008). Molecular Biology of the Cell, 5th edn. (Garland Science, New York). Alexander, S., Chaikin, P. M., Grant, P., Morales, G. J., and Pincus, P. (1984). Charge renormalization, osmotic pressure, and bulk modulus of colloidal crystals: Theory, J. Chem. Phys. 80, pp. 5776–5781. Balasubramanian, S., Xu, F., and Olson, W. K. (2009). DNA sequence-directed organization of chromatin: Structure-based computational analysis of nucleosome-binding sequences, Biophys. J. 96, pp. 2245–2260. Banani, S. F., Lee, H. O., Hyman, A. A., and Rosen, M. K. (2017). Biomolecular condensates: Organizers of cellular biochemistry, Nat. Rev. Mol. Cell Biol. 18, pp. 285–298. Becker, N. B., and Everaers, R. (2007). From rigid basepairs to semiflexible polymers: Coarse-graining DNA, Phys. Rev. E 76, pp. 021923–1–17. Bednar, J., Horowitz, R. A., Grigoryev, S. A., Carruthers, L. M., Hansen, J. C., Koster, A. J., and Woodcock, C. L. (1998). Nucleosomes, linker DNA, and linker histone form a unique structural motif that directs the higherorder folding and compaction of chromatin, Proc. Natl. Acad. Sci. USA 95, pp. 14173–14178. Beltran, B., Kannan, D., MacPherson, Q., and Spakowitz, A. J. (2019). Geometrical heterogeneity dominates thermal fluctuations in facilitating chromatin contacts, Phys. Rev. Lett. 123, pp. 208103–1–6. Ben-Ha¨ım, E., Lesne, A., and Victor, J.-M. (2001). Chromatin: A tunable spring at work inside chromosomes, Phys. Rev. E 64, pp. 051921–1–19. Berg, O. G., Winter, R. B., and von Hippel, P. H. (1981). Diffusion-driven mechanisms of protein translocation on nuclei acids. 1. Model and theory, Biochemistry 20, pp. 6929–6948. Blossey, R. (2006). Computational Biology: A Statistical Physics Perspective (Chapman & Hall/CRC, Boca Raton).
492 References
Brandani, G. B., and Takada, S. (2018). Chromatin remodelers couple inchworm motion with twist-defect formation to slide nucleosomal DNA, PLoS Comput. Biol. 14, pp. e1006512–1–25. Brower-Toland, B. D., Smith, C. L., Yeh, R. C., Lis, J. T., Peterson, C. L., and Wang, M. D. (2002). Mechanical disruption of individual nucleosomes reveals a reversible multistage release of DNA, Proc. Natl. Acad. Sci. USA 99, pp. 1960–1965. Bruinsma, R. F. (2002). Physics of protein-DNA interaction, Physica A 313, pp. 211–237. Bustamante, C., Marko, J. F., Siggia, E. D., and Smith, S. (1994). Entropic elasticity of λ-phage DNA, Science 265, pp. 1599–1600. Calladine, C. R., Drew, H. R., Luisi, B. F., and Travers, A. A. (2004). Understanding DNA: The Molecule and How It Works, 3rd edn. (Elsevier, Amsterdam). Chan, H. S., and Dill, K. A. (1989). Compact polymers, Macromolecules 22, pp. 4559–4573. Chan, H. S., and Dill, K. A. (1998). Protein folding in the landscape perspective: Chevron plots and non-Arrhenius kinetics, Proteins 30, pp. 3–22. Chen, F. E., Huang, D.-B., Chen, Y.-Q., and Ghosh, G. (1998). Chrystal structure of p50/p65 heterodimer of transcription factor NF-κB bound to DNA, Nature 391, pp. 410–413. Coleman, B. D., Olson, W. K., and Swigon, D. (2003). Theory of sequencedependent DNA elasticity, J. Chem. Phys. 118, pp. 7127–7140. Cotton, J. P., Decker, D., Benoit, H., Farnoux, B., Higgins, J., Jannink, G., Ober, R., Picot, C., and des Cloizeaux, J. (1974). Conformation of polymer chain in the bulk, Macromolecules 7, pp. 863–872. Cui, Y., and Bustamante, C. (2000). Pulling a single chromatin fiber reveals the forces that maintain its higher-order structure, Proc. Natl. Acad. Sci. USA 97, pp. 127–132. ´ Daban, J.-R., and Bermudez, A. (1998). Interdigitated solenoid model for compact chromatin fibers, Biochemistry 37, pp. 4299–4304. Daoud, M., Cotton, J. P., Farnoux, B., Jannink, G., Sarma, G., Benoit, H., Duplessix, R., Picot, C., and de Gennes, P. G. (1975). 
Solutions of flexible polymers. Neutron experiments and interpretation, Macromolecules 8, pp. 804–818. de Bruin, L., Tompitak, M., Eslami-Mossallam, B., and Schiessel, H. (2016). Why do nucleosomes unwrap asymmetrically? J. Phys. Chem. B 120, pp. 5855–5863.
References 493
de Gennes, P.-G. (1979). Scaling Concepts in Polymer Physics (Cornell University Press, Ithaka). deHaseth, P. L., Lohman, T. M., and Record, M. T. (1977). Nonspecific interaction of lac repressor with DNA: An association reaction driven by counterion release, Biochemistry 16, pp. 4783–4790. Depken, M., Parrondo, J. M. R., and Grill, S. W. (2013). Intermittent transcription dynamics for the rapid production of long transcripts of high fidelity, Cell Reports 5, pp. 521–530. Depken, M., and Schiessel, H. (2009). Nucleosome shape dictates chromatin fiber structure, Biophys. J. 96, pp. 777–784. Dias, B. G., and Ressler, K. J. (2014). Parental olfactory experience influences behavior and neural structure in subsequent generations, Nature Neuroscience 17, pp. 89–96. Diesinger, P. M., and Heermann, D. W. (2008). The influence of the cylindrical shape of the nucleosomes and H1 defects on properties of chromatin, Biophys. J. 94, pp. 4165–4172. Dill, K. A., and Chan, H. S. (1997). From Levinthal to pathways to funnels, Nature Struct. Biol. 4, pp. 10–19. Doi, M., and Edwards, S. F. (1986). The Theory of Polymer Dynamics (Oxford University Press, New York). Dorigo, B., Schalch, T., Kulangara, A., Duda, S., Schroeder, R. R., and Richmond, T. J. (2004). Nucleosome arrays reveal the two-start organization of the chromatin fiber, Science 306, pp. 1571–1573. Dubochet, J., and Noll, M. (1978). Nucleosome arcs and helices, Science 202, pp. 280–286. Elf, J., Li, G.-W., and Xie, X. S. (2007). Probing transcription factor dynamics at the single-molecule level in a living cell, Science 316, pp. 1191– 1194. Eltsov, M., MacLellan, K. M., Maeshima, K., Frangakis, A. S., and Dubochet, J. (2008). Analysis of cryo-electron microscopy images does not support the existence of 30-nm chromatin fibers in mitotic chromosomes in situ, Proc. Natl. Acad. Sci. USA 105, pp. 19732–19737. Emanuel, M., Radja, N. H., Henriksson, A., and Schiessel, H. (2009). 
The physics behind the larger scale organization of DNA in eukaryotes, Phys. Biol. 6, pp. 025008–1–11. Eslami-Mossallam, B., Schram, R. D., Tompitak, M., van Noort, J., and Schiessel, H. (2016). Multiplexing genetic and nucleosome positioning codes: A computational approach, PLoS ONE 11, pp. e0156905–1–14.
494 References
Evans, E. (1999). Looking inside molecular bonds at biological interfaces with dynamic force spectroscopy, Biophys. Chem. 82, pp. 83–97. Finch, J. T., and Klug, A. (1976). Solenoidal model for superstructure in chromatin, Proc. Natl. Acad. Sci. USA 73, pp. 1897–1901. Fisher, M. E. (1966). Effect of excluded volume on phase transitions in biopolymers, J. Chem. Phys. 45, pp. 1469–1473. Franklin, R. E., and Gosling, R. G. (1953). Molecular configuration in sodium thymonucleate, Nature 171, pp. 740–741. Fudenberg, G., Imakaev, M., Lu, C., Goloborodko, A., Abdennur, N., and Mirny, L. A. (2016). Formation of chromosomal domains by loop extrusion, Cell Reports 15, pp. 2038–2049. Geanacopoulos, M., Vasmatzis, G., Zhurkin, V. B., and Adhya, A. (2001). Gal repressosome contains an antiparallel DNA loop, Nature Struct. Biol. 8, pp. 432–436. Gibcus, J. H., Samejima, K., Goloborodko, A., Samejima, I., Naumova, N., Nuebler, J., Kanemaki, M. T., Xie, L., Paulson, J. R., Earnshaw, W. C., Mirny, L. A., and Dekker, J. (2018). A pathway for mitotic chromosome formation, Science 359, pp. eaao6135–1–12. Goloborodko, A., Imakaev, M. V., Marko, J. F., and Mirny, L. (2016). Compaction and segregation of sister chromatids via active loop extrusion, eLife 5, pp. e14864–1–16. Grosberg, A., Rabin, Y., Havlin, S., and Neer, A. (1993). Crumpled globule model of the three-dimensional structure of DNA, Europhys. Lett. 23, pp. 373–378. Grosberg, A. Y., Khalatur, P. G., and Khokhlov, A. R. (1982). Polymeric coils with excluded volume in dilute solution: The invalidity of the model of impenetrable spheres and the influence of excluded volume on the rates of diffusion-controlled intermacromolecular reactions, Macromol. Chem., Rapid Commun. 3, pp. 709–713. Guida, R., and Zinn-Justin, J. (1998). Critical exponents of the n-vector model, J. Phys. A: Math. Gen. 31, pp. 8103–8121. Hahnfeldt, P., Hearst, J. E., Brenner, D. J., Sachs, R. K., and Hlatky, L. R. (1993). 
Polymer models for interphase chromosomes, Proc. Natl. Acad. Sci. USA 90, pp. 7854–7858. Halford, S. E., and Marko, J. F. (2004). How do site-specific DNA-binding proteins find their targets? Nucl. Acids Res. 32, pp. 3040–3052. Hall, M. A., Shundrovsky, A., Bai, L., Fulbright, R. M., Lis, J. T., and Wang, M. D. (2009). High-resolution dynamic mapping of histone-DNA interactions in a nucleosome, Nature Struct. Mol. Biol. 16, pp. 124–129.
References 495
Halperin, A., and Zhulina, E. B. (1991). On the deformation behaviour of collapsed polymers, Europhys. Lett. 15, pp. 417–421. Halverson, J. D., Lee, W. B., Grest, G. S., Grosberg, A. Y., and Kremer, K. (2011). Molecular dynamics simulation study of nonconcatenated ring polymers in a melt. I. Statics, J. Chem. Phys. 134, pp. 204904–1–13. Hammoud, S. S., Nix, D. A., Zhang, H., Purwar, J., Carrell, D. T., and Cairns, B. R. (2009). Distinctive chromatin in human sperm packages genes for embryo development, Nature 460, pp. 473–478. Hanke, A., and Metzler, R. (2003). Comment on “Why is the DNA denaturation transition first order?”, Phys. Rev. Lett. 90, pp. 159801–1. Higgs, P. G. (2000). RNA secondary structure: Physical and computational aspects, Quart. Rev. Biophys. 33, pp. 199–253. Hilbert, D. (1891). Ueber die stetige Abbildung einer Linie auf ein ¨ ¨ Flachenst uck, Mathematische Annalen 38, pp. 459–460. Hopfield, J. J. (1974). Kinetic proofreading: A new mechanism for reducing errors in biosynthetic processes requiring high specificity, Proc. Natl. Acad. Sci. USA 71, pp. 4135–4139. ¨ Hyman, A. A., Weber, C. A., and Julicher , F. (2014). Liquid-liquid phase separation in biology, Annu. Rev. Cell Dev. Biol. 30, pp. 39–58. Kafri, Y., Mukamel, D., and Peliti, L. (2002). Melting and unzipping of DNA, Eur. Phys. J. B 27, pp. 135–146. Kaplan, N., Moore, I. K., Fondufe-Mittendorf, Y., Gossett, A. J., Tillo, D., Field, Y., LeProust, E. M., Hughes, T. R., Lieb, J. D., Widom, J., and Segal, E. (2009). The DNA-encoded nucleosome organization of a eukaryotic genome, Nature 458, pp. 362–366. Kirchhoff, G. (1859). Ueber das Gleichgewicht und die Bewegung eines ¨ unendlich dunnen elastischen Stabes, J. reine angew. Math. 56, pp. 285– 313. Kramers, H. A., and Wannier, G. H. (1941). Statistics of the two-dimensional ferromagnet. Part I, Phys. Rev. 60, pp. 252–262. Kruithof, M., Chien, F.-T., Routh, A., Logie, C., Rhodes, D., and van Noort, J. (2009). 
Single-molecule force spectroscopy reveals a highly compliant helical folding for the 30-nm chromatin fiber, Nature Struct. Mol. Biol. 16, pp. 534–540. ´ I. M., Mohrbach, H., Thaokar, R., and Schiessel, H. (2007). Equation of Kulic, state of looped DNA, Phys. Rev. E 75, pp. 011913–1–23. ´ I. M., and Schiessel, H. (2003a). Chromatin dynamics: Nucleosomes go Kulic, mobile through twist defects, Phys. Rev. Lett. 91, pp. 148103–1–4.
496 References
´ I. M., and Schiessel, H. (2003b). Nucleosome repositioning via loop Kulic, formation, Biophys. J. 84, pp. 3197–3211. ´ I. M., and Schiessel, H. (2004). DNA spools under tension, Phys. Rev. Kulic, Lett. 92, pp. 228101–1–4. ˇ F., Gonzales, O., Heffler, L. M., Stoll, G., Moakher, M., and Maddocks, Lankas, J. H. (2009). On the parametrization of rigid base and basepair models of DNA from molecular dynamics simulations, Phys. Chem. Chem. Phys. 11, pp. 10565–10588. Lanzani, G., and Schiessel, H. (2012). Out of register: How DNA determines the chromatin fiber geometry, Europhys. Lett. 97, pp. 38002–1–6. Larson, A. G., Elnatan, D., Keenen, M. M., Trnka, M. J., Johnston, J. B., Burlingame, A. L., Agard, D. A., Redding, S., and Narlikar, G. J. (2017). Liquid droplet formation by HP1α suggests a role for phase separation in heterochromatin, Nature 547, pp. 236–240. Lau, K. F., and Dill, K. A. (1989). A lattice statistical mechanics model for the conformational and sequence spaces of proteins, Macromolecules 22, pp. 3986–3997. Lau, K. F., and Dill, K. A. (1990). Theory for protein mutability and biogenesis, Proc. Natl. Acad. Sci. USA 87, pp. 638–642. Lavery, R., Moakher, M., Maddocks, J. H., Petkeviciute, D., and Zakrzewska, K. (2009). Conformational analysis of nucleic acids revisited: Curves+, Nucl. Acids Res. 37, pp. 5917–5929. Lequieu, J., Schwartz, D. C., and de Pablo, J. J. (2017). In silico evidence for sequence-dependent nucleosome sliding, Proc. Natl. Acad. Sci. USA 114, pp. E9197–E9205. Lia, G., Bensimon, D., Croquette, V., Allemand, J.-F., Dunlap, D., Lewis, D. E. A., Adhya, S., and Finzi, L. (2003). Supercoiling and denaturation in Gal repressor/heat unstable nucleoid protein (HU)-mediated DNA looping, Proc. Natl. Acad. Sci. USA 100, pp. 11373–11377. Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, R., Bernstein, B., Bender, M. 
A., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S., and Dekker, J. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome, Science 326, pp. 289–293. Liu, X., Li, M., Xia, X., Li, X., and Chen, Z. (2017). Mechanism of chromatin remodelling revealed by the Snf2-nucleosome structure, Nature 544, pp. 440–445.
References 497
Lua, R., Borovinskiy, A. L., and Grosberg, A. Y. (2004). Fractal and statistical properties of large compact polymers: A computational study, Polymer 45, pp. 717–731. ¨ , A. W., Richmond, R. K., Sargent, D. F., and Richmond, T. J. Luger, K., Mader (1997). Crystal structure of the nucleosome core particle at 2.8 A˚ resolution, Nature 389, pp. 251–260. Makarov, V., Dimitrov, S., Smirnov, V., and Pashev, I. (1985). A triple helix model for the structure of chromatin fiber, FEBS Letters 181, pp. 357– 361. Mangenot, S., Leforestier, A., Durand, D., and Livolant, F. (2003). Phase diagram of nucleosome core particles, J. Mol. Biol. 333, pp. 907–916. Mangenot, S., Raspaud, E., Tribet, C., Belloni, L., and Livolant, F. (2002). Interactions between isolated nucleosome core particles: A tailbridging effect? Eur. Phys. J. E 7, pp. 221–231. Marvin, D. A., Spencer, M., Wilkins, M. H. F., and Hamilton, L. D. (1958). A new configuration of deoxyribonucleic acid, Nature 182, pp. 387–388. Mateos-Langerak, J., Bohn, M., de Leeuw, W., Giromus, O., Manders, E. M. M., Verschure, P. J., Indemans, M. H. G., Gierman, H. J., Heermann, D. W., van Driel, R., and Goetze, S. (2009). Spatially confined folding of chromatin in the interphase nucleus, Proc. Natl. Acad. Sci. USA 106, pp. 3812–3817. Mathews, D. H., Sabina, J., Zuker, M., and Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure, J. Mol. Biol. 288, pp. 911–940. Mihardja, S., Spakowitz, A. J., Zhang, Y., and Bustamante, C. (2006). Effect of force on mononucleosomal dynamics, Proc. Natl. Acad. Sci. USA 103, pp. 15871–15876. Miyazawa, T. (1961). Molecular vibrations and structure of high polymers. II. Helical parameters of infinite polymer chains as functions of bond lengths, bond angles, and internal rotation angles, J. Pol. Sci 55, pp. 215– 231. Moreira, A. G., and Netz, R. R. (2002). Simulations of counterions at charged plates, Eur. Phys. J. E 8, pp. 33–58. 
Morozov, A. V., Fortney, K., Gaykalova, D. A., Studitsky, V. M., Widom, J., and Siggia, E. D. (2009). Using DNA mechanics to predict in vitro nucleosome positions and formation energies, Nucl. Acids Res. 37, pp. 4707–4722. ¨ el, C., and Langowski, J. (1998). Chromosome structure predicted by a Munk polymer model, Phys. Rev. E 57, pp. 5888–5896.
498 References
Nasmyth, K. (2001). Disseminating the genome: Joining, resolving, and separating sister chromatids during mitosis and meiosis, Annu. Rev. Genet. 35, pp. 673–745. Neuman, K. C., Abbondanzieri, E. A., Landick, R., Gelles, J., and Block, S. M. (2003). Ubiquitous transcriptional pausing is independent of RNA polymerase backtracking, Cell 115, pp. 437–447. Newman, M. E. J., and Barkema, G. T. (1999). Monte Carlo Methods in Statistical Physics (Oxford University Press, Oxford). Ngo, T. T. M., Zhang, Q., Zhou, R., Yodh, J. G., and Ha, T. (2015). Asymmetric unwrapping of nucleosomes under tension directed by DNA local flexibility, Cell 160, pp. 1135–1144. Nienhuis, B. (1982). Exact critical point and critical exponents of O(n) models in two dimensions, Phys. Rev. Lett. 49, pp. 1062–1065. Niina, T., Brandani, G. B., Tan, C., and Takada, S. (2017). Sequence-dependent nucleosome sliding in rotation-coupled and uncoupled modes revealed by molecular simulations, PLoS Comput. Biol. 13, pp. e1005880–1–22. Odijk, T. (1995). Stiff chains and filaments under tension, Macromolecules 28, pp. 7016–7018. Olson, W. K., Gorin, A. A., Lu, X.-J., Hock, L. M., and Zhurkin, V. B. (1998). DNA sequence-dependent deformability deduced from protein-DNA crystal complexes, Proc. Natl. Acad. Sci. USA 95, pp. 11163–11168. Olson, W. K., Srinivasan, A. R., Colasanti, A. V., Zheng, G., and Swigon, D. (2009). DNA biomechanics, in: Handbook of Molecular Biophysics (ed.: H. G. Bohr) (Wiley-VCH Verlag, Weinheim), pp. 359–382. Onsager, L. (1949). The effects of shape on the interaction of colloidal particles, Annals of the New York Academy of Sciences 51, pp. 627–659. Pace, N. R., and Brown, J. W. (1995). Evolutionary perspective on the structure and function of ribonuclease P, a ribozyme, J. Bacteriol. 177, pp. 1919–1928. Pande, V. S., Joerg, C., Grosberg, A. Y., and Tanaka, T. (1994). Enumerations of the Hamiltonian walks on a cubic sublattice, J. Phys. A: Math Gen. 27, pp. 6231–6236. 
Pauling, L., and Corey, R. B. (1953). A proposed structure for the nucleic acids, Proc. Natl. Acad. Sci. USA 39, pp. 84–97. Pennings, S., Meersseman, G., and Bradbury, E. M. (1991). Mobility of positioned nucleosomes on 5 S rDNA, J. Mol. Biol. 220, pp. 101–110. Pincus, P. (1997). Dynamics of stretched polymer chains, Macromolecules 10, pp. 210–213.
References 499
Polach, K. J., and Widom, J. (1995). Mechanism of protein access to specific DNA sequences in chromatin: A dynamic equilibrium model for gene regulation, J. Mol. Biol. 254, pp. 130–149. Poland, D., and Scheraga, H. A. (1966). Occurrence of a phase transition in nucleic acid models, J. Chem. Phys. 45, pp. 1464–1469. Prinsen, P., and Schiessel, H. (2010). Nucleosome stability and accessibility of its DNA to proteins, Biochimie 92, pp. 1722–1728. Rao, S. S. P., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., Robinson, J. T., Sanborn, A. L., Machol, I., Omer, A. D., Lander, E. S., and Lieberman Aiden, E. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping, Cell 159, pp. 1665– 1680. Riggs, A. D., Bourgeois, S., and Cohn, M. (1970). The lac repressor-operator interaction. III. Kinetic studies, J. Mol. Biol. 53, pp. 401–417. Robinson, P. J. J., Fairall, L., Huynh, V. A. T., and Rhodes, D. (2006). EM measurements define the dimensions of the 30-nm chromatin fiber: Evidence for a compact, interdigitated structure, Proc. Natl. Acad. Sci. USA 103, pp. 6506–6511. Rosa, A., and Everaers, R. (2008). Structure and dynamics of interphase chromosomes, PLoS Comp. Biol. 4, pp. e1000153–1–10. Rosa, R., and Everaers, R. (2014). Ring polymers in the melt state: The physics of crumpling, Phys. Rev. Lett. 112, pp. 118302–1–5. Rouzina, I., and Bloomfield, V. A. (1996). Macroion attraction due to electrostatic correlation between screening counterions. 1. Mobile surface-adsorbed ions and diffuse ion cloud, J. Phys. Chem. 100, pp. 9977–9989. Rudnizky, S., Khamis, H., Malik, O., Melamed, P., and Kaplan, A. (2019). The base pair-scale diffusion of nucleosomes modulates binding of transcription factors, Proc. Natl. Acad. Sci. USA 116, pp. 12161– 12166. Sanborn, A. L., Rao, S. S. P., Huang, S.-C., Durand, N. C., Huntley, M. H., Jewett, A. I., Bochkov, I. D., Chinnappan, D., Cutkosky, A., Li, J., Geeting, K. 
P., Gnirke, A., Melnikov, A., McKenna, D., Stamenova, E. K., Lander, E. S. and Lieberman Aiden, E. (2015). Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes, Proc. Natl. Acad. Sci. USA 112, pp. E6456–E6465. Satchwell, S. C., Drew, H. R., and Travers, A. A. (1986). Sequence periodicities in chicken nucleosome core DNA, J. Mol. Biol. 191, pp. 659–675.
500 References
Schalch, T., Duda, S., Sargent, D. F., and Richmond, T. J. (2005). X-ray structure of a tetranucleosome and its implications for the chromatin fibre, Nature 436, pp. 138–141. Schiessel, H., Gelbart, W. M., and Bruinsma, R. (2001). DNA folding: Structural and mechanical properties of the two-angle model for chromatin, Biophys. J. 80, pp. 1940–1956. Schram, R. D., Barkema, G. T., and Schiessel, H. (2013). On the stability of fractal globules, J. Chem. Phys. 138, pp. 224901–1–11. Schram, R. D., and Schiessel, H. (2013). Exact enumeration of Hamiltonian walks on the 4 x 4 x 4 cube and applications to protein folding, J. Phys. A: Math. Theor. 46, pp. 485001–1–14. Schram, R. D., and Schiessel, H. (2016). Corrigendium: Exact enumeration of Hamiltonian walks on the 4 x 4 x 4 cube and applications to protein folding, J. Phys. A: Math. Theor. 49, p. 369501. ˚ om, ¨ A., Field, Y., Moore, I. K., Segal, E., Fondufe-Mittendorf, Y., Chen, L., Thastr Wang, J.-P. Z., and Widom, J. (2006). A genomic code for nucleosome positioning, Nature 442, pp. 772–778. Shakhnovich, E., and Gutin, A. (1990). Enumeration of all compact conformations of copolymers with random sequence of links, J. Chem. Phys. 93, pp. 5967–5971. Smith, S. B., Finzi, L., and Bustamante, C. (1992). Direct mechanical measurements of the elasticity of single DNA molecules by using magnetic beads, Science 258, pp. 1122–1126. Sommer, J.-U., and Daoud, M. (1996). Adsorption of multiblock copolymers at interfaces between selective solvents: Single-chain properties, Phys. Rev. E 53, pp. 905–920. Strom, A. R., Emelyanov, A. V., Mir, M., Fyodorov, D. V., Darzacq, X., and Karpen, G. H. (2017). Phase separation drives heterochromatin domain formation, Nature 547, pp. 241–245. Sudhanshu, B., Mihardja, S., Koslover, E. F., Mehraeen, S., Bustamante, C., and Spakowitz, A. J. (2011). Tension-dependent structural deformation alters single-molecule transition kinetics, Proc. Natl. Acad. Sci. USA 108, pp. 1885–1890. Syed, S. 
H., Goutte-Gattat, D., Becker, N., Meyer, S., Shukla, M. S., Hayes, J. J., Everaers, R., Angelov, D., Bednar, J., and Dimitrov, S. (2010). Single-base resolution mapping of H1-nucleosome interactions and 3D organization of the nucleosome, Proc. Natl. Acad. Sci. USA 107, pp. 9620–9625.
References 501
Thijssen, J. M. (2007). Computational Physics, 2nd edn. (Cambridge University Press, Cambridge). Tolstorukov, M. Y., Colasanti, A. V., McCandlish, D. M., Olson, W. K., and Zhurkin, V. B. (2007). A novel roll-and-slide mechanism of DNA folding in chromatin: Implications for nucleosome positioning, J. Mol. Biol. 371, pp. 725–738. Tompitak, M., Vaillant, C., and Schiessel, H. (2017). Genomes of multicellular organisms have evolved to attract nucleosomes to promoter regions, Biophys. J. 112, pp. 505–511. Trifonov, E. N., and Sussman, J. L. (1980). The pitch of chromatin DNA is reflected in its nucleotide sequence, Proc. Natl. Acad. Sci. USA 77, pp. 3816–3820. van Kampen, N. G. (1992). Stochastic Processes in Physics and Chemistry (Elsevier, Amsterdam). Verlet, L. (1967). Computer “experiments” on classical fluids. I. Thermodynamic properties of Lennard-Jones molecules, Phys. Rev. 159, pp. 98– 103. Watson, J. D. (1968). The Double Helix: A Personal Account of the Discovery of Structure of DNA (Atheneum, New York). Watson, J. D., and Crick, F. H. C. (1953). Molecular structure of nucleic acids – a structure for deoxyribose nucleic acid, Nature 171, pp. 737–738. Widom, J. (1992). A relationship between the helical twist of DNA and the ordered positioning of nucleosomes in all eukaryotic cells, Proc. Natl. Acad. Sci. USA 89, pp. 1095–1099. Winter, R. B., Berg, O. G., and von Hippel, P. H. (1981). Diffusion-driven mechanisms of protein translocation on nuclei acids. 3. The Escherichia coli lac repressor-operator interaction: Kinetic measurements and conclusions, Biochemistry 20, pp. 6961–6977. Woodcock, C. L., Grigoryev, S. A., Horowitz, R. A., and Whitaker, N. (1993). A chromatin folding model that incorporates linker variability generates fibers resembling the native structures, Proc. Natl. Acad. Sci. USA 90, pp. 9021–9025. Yeomans, J. M. (1992). Statistical Mechanics of Phase Transitions (Oxford University Press, New York). Zimm, B. H. (1956). 
Dynamics of polymer molecules in dilute solution: Viscoelasticity, flow birefringence and dielectric loss, J. Chem. Phys. 24, pp. 269–278.
Zuiddam, M., Everaers, R., and Schiessel, H. (2017). Physics behind the mechanical nucleosome positioning code, Phys. Rev. E 96, pp. 052412–1–15.
Zuiddam, M., and Schiessel, H. (2019). Shortest paths through synonymous genomes, Phys. Rev. E 99, pp. 012422–1–9.
Zwicker, D., Hyman, A. A., and Jülicher, F. (2015). Suppression of Ostwald ripening in active emulsions, Phys. Rev. E 92, pp. 012317–1–13.
Index
A-DNA 120 action functional 466 adenosine triphosphate 317, 375 aggregate 60 alpha carbon 247 alpha helix 115, 249, 254, 301 amino acid 4, 128, 247 alanine 246 glutamic acid 248 lysine 69, 248 phenylalanine 243, 248 serine 248 anion 265 anti-codon 4, 243 arc length 147 archaea 10 Argon atom 422, 430 assemblage 60 atmospheric pressure 20 attempt frequency 214, 350 average 446 average energy 27 Avogadro constant 20
B-DNA 120 bacteria 10, 113, 299, 313, 329 transformation 114, 330 bacteriophage 330 baker’s yeast 144 barometric formula 34, 208, 273 barrier height 214, 339, 346 base pair step 121 AA/TT step 124, 138
GC step 138 pyrimidine-purine step 126, 129 roll 121, 134, 137 slide 121 stacking 126, 363 TA step 138 tilt 134, 137 twist 121 base pairing 2, 117, 178, 316 bending modulus 147 beta sheet 249 anti-parallel 249, 254 parallel 249, 254 beta-galactosidase 301 beta-galactoside permease 301 beta-galactoside transacetylase 301 binomial coefficient 37 biomolecular condensate 60, 68 Bjerrum length 269, 294 blob picture 97 block copolymer 72 at a strongly selective interface 73 blocking method 443 Boltzmann constant 30, 39 Boltzmann distribution 24, 267 bond length 85 effective 89 bond vector 80, 85 Bravais lattice 485 Brownian particle 192, 193, 197, 205, 211, 221
BWH model 309 sliding length 310 targeting radius 311
C-DNA 120 Cajal body 68 canonical ensemble 23 Cantor set 111 cation 265 cell differentiation 69 cell division 1, 15, 68, 71, 387, 416 cellular body 60 center of mass 230 central dogma of molecular biology 1, 191 central limit theorem 83, 287, 452 centrosome 68 chaperone 248 Chapman–Kolmogorov equation 199, 200 Chargaff’s rules 117 charge correlations 284, 295 charge density 266 charge fluctuations 286, 287 charge neutrality condition 267, 270 charge renormalization 283, 289 charged cylinder 278 charged plane 270, 288 charged sphere 281, 288 chemical distance 88, 399, 405 chemical potential 43, 57, 64, 105, 283, 307 chromatin 10 chicken erythrocyte 381, 385 chromatin fiber 13, 376 chromatin remodeler 374 Snf2 374 chromophore 354 chromosomal territory 12, 399, 410 chromosome 10, 399 contact map 412, 413
chromosome conformation capture 401, 412 circle-line approximation 188 classical mechanics 25, 150 codon 4, 128 coexistence plateau 388 cohesin 158, 415 combined gas law 20, 30 common tangent construction 64, 66, 75, 388 comparative sequence analysis 246 compensatory mutation 246 condensin 417 conditional probability 196 conserved quantity 23 contact probability 401 continuity equation 204 convection term 203 copolymer 79 correlation length 454 correlation time 440 Cosine integral 480 counterion 270, 275 condensation 280, 307 release 277, 283, 307, 309 coupling constant 438 critical core model 258 critical exponent 60, 454, 455 critical phenomena 453 critical point 54, 59, 75, 438, 454 density fluctuations 59 critical temperature 440, 453 crossed linker model 377 crumpled globule 404 CTCF motif 414 Curie’s law 87 cytoplasm 12 cytoskeleton 7
Debye screening length 284, 287 Debye–Hückel theory 284 free energy density 286
Debye-Hückel theory 373 degree of freedom 34, 165, 173 degree of polymerization 80 denaturing solvent 248 density 47 deoxyribonucleic acid 1 detailed balance 429 dielectric constant 252, 266 diffusion constant 206, 304, 403 diffusion equation 206, 232, 303 diffusion term 203 diffusion-limited reaction 303 disengagement time 403 disjoining pressure 275 DNA 1, 13, 79, 113, 146, 162, 168, 191, 241, 265, 291, 328 bending 127 bubble 178 crystal structure 115, 120 denaturation loop 178, 187 double helix 118, 278, 327 force-extension curve 168 major groove 120, 302 melting 178, 399 minor groove 120, 327, 359, 378 persistence length 173, 176, 328 thermal denaturation 178 toroid 292 unzipping 356, 373 DNA polymerase 3, 177 DNA–protein complex 215, 299, 354 DNA-protein complex 137, 161 drift term 203 dynamic force spectroscopy 215, 339 dynamic programming algorithms 245
Earnshaw’s theorem 293
Einstein relation 208, 223, 372, 403 elastic spring 83 electrical double layer 274 electrical field 272 electrophoretic mobility 358 electrostatic blob 110 electrostatic potential 266 elementary charge 266 elliptic integral 159 amplitude 470 of the first kind 470 of the second kind 472 end-to-end distance mean-squared 81, 88, 165 end-to-end vector 80 energy barrier 211 energy fluctuations 27 enhancement factor 91 entropic spring 84, 87, 94, 99, 226 entropy 39, 80, 82, 282 of mixing 61, 62 translational 277 epigenetic information 146 equation of motion 221, 224, 465 equilibrium 22 equilibrium constant 306 equilibrium state 39 equipartition theorem 33, 163, 165, 170, 225, 230, 364 escape rate 214 euchromatin 69, 414 eukaryote 301, 313, 327 eukaryotic cell 10 Euler angle representation 148 Euler angles 132, 148, 151 Euler elastica 153, 342, 360, 384, 471 Euler method 423 Euler–Lagrange equation 152, 158, 163, 274, 384, 467 exact enumeration 253 excluded volume 46, 90, 166, 187 expectation value 446
extensive quantity 43 external potential 465
Förster resonance energy transfer 354 face-centered cubic lattice 435, 436 failure rate 216 ferromagnet 46 ferromagnetism 438, 453 fibroblast 400 Fick’s law 208 Flory argument 94, 107 Flory exponent 95, 240 in d dimensions 96 Flory theorem 108, 238, 399, 402 fluctuation term 203 fluctuation-dissipation theorem 208 fluorescence in situ hybridization 399 fluorescence microscope 400 Fokker–Planck equation 203, 223 force-extension curve 337, 385 formaldehyde 399, 401 Fourier series 170, 227, 475, 478 Fourier coefficient 475, 483 fractal 106, 404 fractal dimension 107, 404 fractal globule 404 free energy 42, 63, 83, 94, 279 free enthalpy 45, 57, 58, 84, 307 free-draining limit 237 freely jointed chain 85, 166, 168, 173 partition function 86 freely rotating chain 87, 166 Fresnel integral 480 friction constant 207, 221, 226, 365, 403 fugacity 32, 181 functional 466 stationary point 467
gal repressor dimer protein 176 anti-parallel loop model 177 parallel loop model 177 galactose 301 gas phase 46, 438 Gaussian chain 89 Gaussian chain model 225 Gaussian distribution 83, 89, 207, 226, 447, 452 average 447 multivariate 449 variance 448 Gaussian integral 26 Gaussian white noise 224 gel electrophoresis 357, 418 two-dimensional 358 gene 3, 301, 319 generalized diffusion equation 203 genetic code 4 redundancy 5, 128 genetic information 113 genetic switch 301 genome 1 Gibbs potential 43 glucose 301 good solvent 91, 404 Gouy–Chapman length 272, 273, 276, 291, 294 grandcanonical ensemble 32, 43, 47, 181 grandcanonical potential 43 granule 60 gravitational acceleration 34, 151 Green’s function 224, 266, 284, 285 ground state 211, 293
Hamilton’s principle 152, 466 Hamiltonian 26 Hamiltonian walk 110 harmonic potential 211, 224 heat 57, 248
heat bath 27 Heaviside step function 224 Heisenberg’s uncertainty principle 25, 40 helicase 374 hemoglobin 115 heterochromatin 69, 146, 414 heterochromatin protein 70 chromodomain 70 chromoshadow domain 70 heteropolymer 79, 252 highly charged cylinder 280 Hilbert curve 405 histone octamer 14, 130, 329, 359 histone protein 13, 327 core histone H2A 70, 327 core histone H2B 70, 327 core histone H3 70, 327 H3K9me 70 core histone H4 70, 327 core protein H3 H3K9me 146 linker histone 378 histone tail 69 posttranslational modification 69 homopolymer 79 Hooke’s law 83 HP-model 252 compact configuration 253 hydrodynamic interaction 232 hydrodynamic screening 238 screening length 237 hydrodynamics 207 hydrogen bond 2, 115, 249, 302, 327, 430 hydrophilic substance 7 hydrophobic substance 7, 119
ideal chain 89, 90, 96, 106 ideal gas 26, 33, 42, 75, 273 canonical partition function 26
entropy 279 equation of state 30, 44, 275 grandcanonical partition function 32 Hamiltonian 26 image charge 292 incompressible fluid 104 inertia force 233 inner product 474 insulator protein CTCF 414 integration by parts 33, 466 intensive quantity 44 interaction parameter 63 internal energy 42 interphase 398, 407 ion 265 calcium 265 chloride 265 divalent 266 fixed 266 magnesium 265 mobile 266 monovalent 266 potassium 265 self-energy 285 sodium 265 trivalent 293 valency 266, 295 Ising model 438 isotherm 53, 75 critical 54
Jacobian elliptic function 470, 472 jump moment 204
Kadanoff relation 455 kinetic energy 26, 466 kinetic proofreading 321 Kirchhoff kinetic analogy 151, 188, 341, 471 Kramers’ rate 214, 365, 372 Kronecker delta 33
lac operon 299 lac repressor 299 non-specific equilibrium constant 307 non-specific interaction 306 specific interaction 306 lactose 301 Lagrange function 151, 465 Lagrange multiplier 40, 43, 44, 74 Lagrangian action 150 lambda phage 168 laminar flow 233 Landau’s theory 453 Langevin equation 222, 226 Langevin force 222 Langevin function 87 Laplace equation 304 Laplace operator 285 Laplace pressure 65, 67 lattice constant 80, 294 Lennard–Jones potential 431 leukocyte 330 Levinthal’s paradox 250 linear superposition 267 linker DNA 13, 378, 384, 386, 389 lipid 12, 265 liquid phase 46, 56, 422, 438 liquid-liquid phase separation 60, 72 loading rate 218 loop extrusion 415, 416 lymphoblast 402
macromolecular condensate 70 macroscopic variable 20 macrostate 21, 37, 82 magnetic moment 37 magnetic susceptibility 442, 454 magnetic tweezer 168 magnetization 37, 87, 439, 453 spontaneous 440, 454, 455 Manning condensation 280 Manning parameter 280, 289
Markov process 196, 200, 428 master equation 202, 429 mathematical induction 367 maximum matching model 245 Maxwell construction 56, 75, 105 Maxwell velocity distribution 35, 210, 435 mean field approximation 266, 268, 293 membrane 12, 265 membrane protein 301 messenger RNA 4 metastable state 211, 346 methylase 71, 73 Metropolis algorithm 429, 438 microcanonical ensemble 28 micrococcal nuclease 376 micromanipulation experiment 176 microstate 21, 37, 61, 82 minimum image convention 431 mitochondria 12 mitotic chromosome 15, 410, 417 mitotic spindle 68, 417 mobility tensor 233 modified Bessel function 288 molar 265, 303 mole 20 molecular dynamics simulation 422 molecular motor 317 moment of inertia 151 moment theorem 457 momentum 21 monomer 79 Monte Carlo integration 425 Monte Carlo method 438 Monte Carlo simulation 425 multi-cellular organisms 145 multiplexing 129 myosin 318
n vector model 456
neutron scattering 109 Newton’s second law 423, 465 non-concatenated polymer rings 409, 415 non-draining limit 237 non-equilibrium process 191 non-radiative dipole-dipole coupling 354 normal coordinates 227 nuclear body 60 nuclear pore 12 nucleoid 313 nucleolus 68 nucleoside triphosphate 317 nucleosomal repeat length 389, 397 nucleosome 13, 69, 129, 327 breathing 336 critical force 334 dyad axis 327, 343, 378 elastic energy 328, 336 first-second-round difference 353, 359 pure adsorption energy 336 sequence preferences 138 stem 378 nucleosome core particle 14, 329, 388, 390 crystal structure 14, 69, 327, 340, 359, 363, 394 wedge shape 394 nucleosome positioning sequence 337, 389 5 S rDNA 369, 370, 418 601 Widom 354 nucleosome sliding 356 corkscrew motion 370 diffusion constant 369 effective 372 loop defect 360, 373 twist defect 360, 373 crossing fraction 367 escape rate 366 injection rate 366
nucleotide 1, 316 adenine 1, 117 cytosine 1, 117 guanine 1, 117 pseudouridine 244 thymine 1, 117 uracil 3 nucleus 10, 12, 376
odd function 448 off-rate 302 oligonucleotide 14, 120 on-rate 302 one-start helix 395 operator 301 operon 299 organelle 12, 60 membraneless 60 Ornstein–Uhlenbeck process 200, 210 orthonormal system 474 Oseen tensor 234 OSF theory 296 osmotic pressure 419 Ostwald ripening 67
P granule 68 pair correlation function 437 paramagnet 36, 46, 87 particle-wave duality 25 partition function 24, 140, 179, 335, 456 canonical 24 grandcanonical 32, 181 Pauli repulsion 430 pendulum 153, 467, 468 equation of motion 468 homoclinic orbit 155, 342 Lagrange function 468 oscillating 470, 471 phase portrait 155 revolving orbit 154, 470
peptide bond 247 periodic boundary conditions 431, 438 permutation 38 persistence length 164, 400 apparent 176 phase coexistence 57 phase separation 60 phase space 21, 427 phase transition 46, 56 continuous 59, 178, 185 first order 58, 105, 178, 183, 185 phosphate 118 Pincus blob 99, 100, 104, 238, 479 Planck constant 25 plane wave expansion 483 plasma 270 pneumonia 114 Poisson equation 266, 305 Poisson–Boltzmann equation 267 self-consistency 268 Poisson–Boltzmann theory 266 breakdown 292 free energy functional 274 Poland–Scheraga model 178 polar coordinates 484 polyelectrolyte 110 polylog function 181 polymer 79, 207, 225 rotational relaxation time 232 polymer coil 81 polymer globule 91, 107, 292, 399, 404 polymerization reaction 79 polypeptide 4, 248 poor solvent 91, 107, 404 potential energy 466 power law 59, 91 pre-averaging approximation 235, 478 pressure 28, 57, 76, 437 critical 55 primitive cell 484
primitive vectors 485 probability distribution 24, 445 m-th moment 446 probability flux 204, 208 constitutive equation 204 probability theory 445 prokaryote 299 prokaryotic cell 10 promoter 301 propeller twist 123, 146 protamine 145 protein 4, 79, 113, 191, 215, 265, 278, 280, 292, 299, 319 backbone dihedral angles 250 C-terminus 69 crystal structure 14, 16, 114 misfolded 253 N-terminus 69 native state 248, 250 primary structure 249 quaternary structure 249 residue 115 secondary structure 249, 254 side chain 115, 247 tertiary structure 249 protein folding 7, 191, 248 energy landscape 250 funnel 251 hydrophobic effects 252 purine 116 pyrimidine 116 pyrophosphate 317 quantum mechanics 24 quasi-stationary state 212
radius of curvature 147 random mutation 257 random variable 446 random walk 80, 166, 180, 207, 251, 312, 400 rapid conformational pre-equilibrium 332
Rayleigh particle 209–211, 224 reaction kinetics 302 reaction radius 304 reaction volume 308 real gas 46, 59 Hamiltonian 46 reciprocal lattice 485 recursive algorithm 111 relative extension 383 relaxation rate 332 renormalization group transformation 60, 95 replication 1, 241 repressor-operator complex 302 reptation 403, 409 restriction enzyme 329, 335, 401 Reynolds number 233 ribonuclease 243 ribonucleic acid 3 ribosome 4, 7, 191 biogenesis 68 ribosomal protein 7 ribosomal RNA 7 rigid base pair model 131, 188, 355, 370 stiffness matrix 132 rigid rod 166 RNA 3, 79, 114, 191, 241, 249, 265, 316 arc diagram 242, 246 bulge 243 compatible pairings 245 double helix 241 hairpin 242 hairpin loop 243 internal loop 243 kissing hairpins 242 multi-branched loop 243 pseudoknot 243 rainbow diagram 242 secondary structure 242 tertiary structure 242 RNA polymerase 3, 177, 301, 316 backtracking state 321
multiple backtracking 324 RNA world 7 rodlike chain 147 room temperature 31 rotation matrix 149, 343 unitarity 149 rotational relaxation time 229, 350 Rouse model 225, 226, 403 mean-squared displacement 229 Rouse modes 227, 237, 478 Rouse time 228, 481
saddle point 346 saddle-point approximation 214 safety tong 349 salt bridge 252 salt solution 285 scaling law 81, 85 self-avoiding loop 459 self-avoiding walk 91, 111, 186, 254, 460 self-similarity 59, 107, 404 semidilute polymer solution 108, 238, 399, 409 semiflexible polymer 162 separation of variables 159, 271, 470 site exposure mechanism 329 Smoluchowski equation 203 solenoid model 377, 389, 394 solenoid-type models 377 splay angle 392 solid phase 438 solvent quality 91 specific heat 442, 454 speckle 60 sperm cell 145 nucleosome retention 145 spherical coordinates 49, 86, 285 spin 36, 438 spin-spin correlation function 460
spliceosome 68 spring constant 83, 226, 361 standard deviation 335, 446 stationary point 152 statistical physics 19, 20 Stirling’s formula 38, 43, 243, 451 stochastic differential equation 222 stochastic process 193, 221 autocorrelation function 194, 221 autocorrelation time 195 average 194 joint probability density 195 realization 193 stationary 195 stochastic variable 193, 446 Stokes’ law 207, 236 stop codon 6 stretching modulus 383 sugar 118 surface tension 65, 102 swollen polymer coil 91, 166, 240 free energy cost for overlapping 411 symmetric spinning top 150 precession 153 sleeping 152 symplectic integrator 425 synonymous mutation 144
tadpole configuration 105 target search 309 Taylor expansion 59, 204, 213, 423, 424, 489 temperature 20, 30, 162 critical 55, 66 tetranucleosome 377 thermal blob 98, 100, 101, 104, 166 thermal de Broglie wavelength 26, 274 thermal energy 31
thermal fluctuations 162, 174, 208, 212, 215, 316 thermally isolated system 28 thermodynamic equilibrium 267, 306 thermodynamic potential 42 theta solvent 90, 225, 232 theta temperature 90 three-body collision term 50, 101 time-displaced autocorrelation function 440 topological entanglements 404 topologically associated domain 412, 414 topoisomerase II 417 transcription 3, 191, 316 error rate 318 free energy landscape 322 transcription factor 17, 299 activator 299 repressor 299 transcriptional regulation 299 transfer matrix method 140, 188 transfer RNA 4, 241, 243, 246 cloverleaf structure 243, 246 transition probability 197, 199, 201, 428 transition state 346, 349 translation 4, 191 transmission electron cryomicroscopy 374 transport term 203 tube model 403 turbulent flow 233 twisting modulus 147 two-angle model 378 deflection angle 378 dihedral angle 378 master solenoid 379, 487 pentagram 380 stretching modulus 383
two-body collision term 50, 94, 97, 166 in d dimensions 96 two-start helix 395
unicellular organisms 145 unit tensor 234 universal gas constant 20 universality 60, 85, 180
van der Waals equation of state 74 van der Waals interaction 430 variance 446 variational principle 465 velocity Verlet algorithm 424 Verlet algorithm 423 virial coefficient 50, 167 second 49, 74, 90, 167, 287, 388 third 53 virial expansion 46, 74, 90, 94, 101, 287 virial theorem 437 viscosity 207, 233
water 430 dielectric constant 266 viscosity 207 Widom relation 455 Wiener process 200, 206 Wiener–Lévy process 200 Wigner crystal 294, 484 wormlike chain 146, 328, 340, 362 correlation length 158, 171, 342 tangential correlation function 163 twistable 147
X-ray diffraction 14, 114
Yukawa-type potential 285 Z-DNA 120 zig-zag fiber 380, 383 zig-zag model 377 Zimm model diffusion constant 235 good solvent 236 rotational relaxation time 236