Energetics of Biological Macromolecules [1 ed.] 9780121827847, 0121827844

This volume focuses on methods related to allosteric enzymes and receptors, including fluorescent probes, spectroscopic

English Pages 471 Year 2004

METHODS IN ENZYMOLOGY

EDITORS-IN-CHIEF

John N. Abelson

Melvin I. Simon

DIVISION OF BIOLOGY CALIFORNIA INSTITUTE OF TECHNOLOGY PASADENA, CALIFORNIA

FOUNDING EDITORS

Sidney P. Colowick and Nathan O. Kaplan

Contributors to Volume 380

Article numbers are in parentheses and following the names of contributors. Affiliations listed are current.

Vahe Bandarian (7), Department of Biochemistry, University of Arizona, Tucson, Arizona 85721

Carl Frieden (18), Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110

James G. Bann (18), Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110

D. Travis Gallagher (4), Biotech Division, Chemical Science and Technology Lab, National Institute of Standards and Technology, Gaithersburg, Maryland 20899

George Barany (17), Department of Chemistry, University of Minnesota, Minneapolis, Minnesota 55455

Bertrand García-Moreno E. (2), Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218

Elisar Barbar (11), Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio 45701

Robert A. Goldbeck (14), Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064

Michael Carey (10), Department of Biological Chemistry, UCLA School of Medicine, Los Angeles, California 90095

Natàlia Carulla (17), Department of Chemistry, Cambridge University, Cambridge CB2 1EW, England

Gregory A. Grant (5), Department of Molecular Biology and Pharmacology, Washington University School of Medicine, St. Louis, Missouri 63110

Hue Sun Chan (16), Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada

Michael Hare (11), Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio 45701

Eefie Chen (14), Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064

Sydney D. Hoeltzli (18), Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110

Diana Chinchilla (4), CARB/University of Maryland Biotechnology Institute, Rockville, Maryland 20850

Edward Eisenstein (4), CARB/University of Maryland Biotechnology Institute, Rockville, Maryland 20850

Vasanthi Jayaraman (8), Department of Integrative Biology and Pharmacology, University of Texas Health Sciences Center, Houston, Texas 77030

Carolyn A. Fitch (2), Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218

Kristina M. Johnson (10), Department of Biological Chemistry, UCLA School of Medicine, Los Angeles, California 90095

Hüseyin Kaya (16), Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada

David L. Smith (13), Department of Chemistry, University of Nebraska, Lincoln, Nebraska 68588

David S. Kliger (14), Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064

Elaine Stephens (1), Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, England

Heidi Lau (4), CARB/University of Maryland Biotechnology Institute, Rockville, Maryland 20850

Susan Marqusee (15), Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California 94720

Rowena G. Matthews (7), Biophysics Research Division, University of Michigan, Ann Arbor, Michigan 48109

Marek Štrajbl (3), Department of Chemistry, University of Southern California, Los Angeles, California 90089

Jin Wang (10), Department of Biochemistry, Nanjing University, Nanjing, People’s Republic of China

Arieh Warshel (3), Department of Chemistry, University of Southern California, Los Angeles, California 90089

Hai Pan (13), Amgen Inc., Thousand Oaks, California 91320

Joachim Weber (6), Department of Cell Biology and Biochemistry, Texas Tech University Health Sciences Center, Lubbock, Texas 79430

Gregory D. Reinhart (9), Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843

David Wildes (15), Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California 94720

Claudia N. Schutz (3), Department of Chemistry, University of Southern California, Los Angeles, California 90089

Dudley H. Williams (1), Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, England

Alan Senior (6), Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York 14642

Clare Woodward (17), Department of Biochemistry, Biophysics and Molecular Biology, University of Minnesota, St. Paul, Minnesota 55108

Seishi Shimizu (16), Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada

Robert W. Woody (12), Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, Colorado 80525

Avital Shurki (3), Department of Chemistry, University of Southern California, Los Angeles, California 90089

Andrea Smallwood (10), Department of Biological Chemistry, UCLA School of Medicine, Los Angeles, California 90095

Rosa Zerella (1), Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, England

Min Zhou (1), Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, England

Preface

One of the most intriguing problems in biological energetics is that of cooperativity. From the discovery of cooperativity and allostery in hemoglobin 100 years ago (Bohr et al., 1904)¹ to the characterization of cooperativity in a myriad of processes in modern times (e.g., transport, catalysis, signaling, assembly, folding), the molecular mechanisms by which energy is transferred from one part of a macromolecule to another continue to challenge us. Of course, the problem has many layers, as a molecule as "simple" and familiar as hemoglobin can simultaneously sense the chemical potential of each physiological ligand and adjust its interactions with the others accordingly. Ironically, the very allosteric intermediates that hold the structural and energetic secrets of cooperativity are the same ones whose populations are suppressed and, in many instances, largely obscured by the nature of cooperativity itself. Thus, innovative methodologies and techniques have been developed to address cooperative systems, many of which are presented in this volume, Energetics of Biological Macromolecules, Part E, and its companion volume, Part D. The reader will observe remarkable similarities among the wide range of experimental strategies employed, attesting to fundamental issues inherent in all cooperative systems.

Jo M. Holt
Michael L. Johnson
Gary K. Ackers

¹ C. Bohr, K. A. Hasselbalch, and A. Krogh, Skand. Arch. Physiol. 16, 402 (1904).

METHODS IN ENZYMOLOGY

Volume I. Preparation and Assay of Enzymes Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume II. Preparation and Assay of Enzymes Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume III. Preparation and Assay of Substrates Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume IV. Special Techniques for the Enzymologist Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume V. Preparation and Assay of Enzymes Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume VI. Preparation and Assay of Enzymes (Continued) Preparation and Assay of Substrates Special Techniques Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume VII. Cumulative Subject Index Edited by Sidney P. Colowick and Nathan O. Kaplan
Volume VIII. Complex Carbohydrates Edited by Elizabeth F. Neufeld and Victor Ginsburg
Volume IX. Carbohydrate Metabolism Edited by Willis A. Wood
Volume X. Oxidation and Phosphorylation Edited by Ronald W. Estabrook and Maynard E. Pullman
Volume XI. Enzyme Structure Edited by C. H. W. Hirs
Volume XII. Nucleic Acids (Parts A and B) Edited by Lawrence Grossman and Kivie Moldave
Volume XIII. Citric Acid Cycle Edited by J. M. Lowenstein
Volume XIV. Lipids Edited by J. M. Lowenstein
Volume XV. Steroids and Terpenoids Edited by Raymond B. Clayton

Volume XVI. Fast Reactions Edited by Kenneth Kustin
Volume XVII. Metabolism of Amino Acids and Amines (Parts A and B) Edited by Herbert Tabor and Celia White Tabor
Volume XVIII. Vitamins and Coenzymes (Parts A, B, and C) Edited by Donald B. McCormick and Lemuel D. Wright
Volume XIX. Proteolytic Enzymes Edited by Gertrude E. Perlmann and Laszlo Lorand
Volume XX. Nucleic Acids and Protein Synthesis (Part C) Edited by Kivie Moldave and Lawrence Grossman
Volume XXI. Nucleic Acids (Part D) Edited by Lawrence Grossman and Kivie Moldave
Volume XXII. Enzyme Purification and Related Techniques Edited by William B. Jakoby
Volume XXIII. Photosynthesis (Part A) Edited by Anthony San Pietro
Volume XXIV. Photosynthesis and Nitrogen Fixation (Part B) Edited by Anthony San Pietro
Volume XXV. Enzyme Structure (Part B) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume XXVI. Enzyme Structure (Part C) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume XXVII. Enzyme Structure (Part D) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume XXVIII. Complex Carbohydrates (Part B) Edited by Victor Ginsburg
Volume XXIX. Nucleic Acids and Protein Synthesis (Part E) Edited by Lawrence Grossman and Kivie Moldave
Volume XXX. Nucleic Acids and Protein Synthesis (Part F) Edited by Kivie Moldave and Lawrence Grossman
Volume XXXI. Biomembranes (Part A) Edited by Sidney Fleischer and Lester Packer
Volume XXXII. Biomembranes (Part B) Edited by Sidney Fleischer and Lester Packer
Volume XXXIII. Cumulative Subject Index Volumes I-XXX Edited by Martha G. Dennis and Edward A. Dennis
Volume XXXIV. Affinity Techniques (Enzyme Purification: Part B) Edited by William B. Jakoby and Meir Wilchek

Volume XXXV. Lipids (Part B) Edited by John M. Lowenstein
Volume XXXVI. Hormone Action (Part A: Steroid Hormones) Edited by Bert W. O’Malley and Joel G. Hardman
Volume XXXVII. Hormone Action (Part B: Peptide Hormones) Edited by Bert W. O’Malley and Joel G. Hardman
Volume XXXVIII. Hormone Action (Part C: Cyclic Nucleotides) Edited by Joel G. Hardman and Bert W. O’Malley
Volume XXXIX. Hormone Action (Part D: Isolated Cells, Tissues, and Organ Systems) Edited by Joel G. Hardman and Bert W. O’Malley
Volume XL. Hormone Action (Part E: Nuclear Structure and Function) Edited by Bert W. O’Malley and Joel G. Hardman
Volume XLI. Carbohydrate Metabolism (Part B) Edited by W. A. Wood
Volume XLII. Carbohydrate Metabolism (Part C) Edited by W. A. Wood
Volume XLIII. Antibiotics Edited by John H. Hash
Volume XLIV. Immobilized Enzymes Edited by Klaus Mosbach
Volume XLV. Proteolytic Enzymes (Part B) Edited by Laszlo Lorand
Volume XLVI. Affinity Labeling Edited by William B. Jakoby and Meir Wilchek
Volume XLVII. Enzyme Structure (Part E) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume XLVIII. Enzyme Structure (Part F) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume XLIX. Enzyme Structure (Part G) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume L. Complex Carbohydrates (Part C) Edited by Victor Ginsburg
Volume LI. Purine and Pyrimidine Nucleotide Metabolism Edited by Patricia A. Hoffee and Mary Ellen Jones
Volume LII. Biomembranes (Part C: Biological Oxidations) Edited by Sidney Fleischer and Lester Packer
Volume LIII. Biomembranes (Part D: Biological Oxidations) Edited by Sidney Fleischer and Lester Packer

Volume LIV. Biomembranes (Part E: Biological Oxidations) Edited by Sidney Fleischer and Lester Packer
Volume LV. Biomembranes (Part F: Bioenergetics) Edited by Sidney Fleischer and Lester Packer
Volume LVI. Biomembranes (Part G: Bioenergetics) Edited by Sidney Fleischer and Lester Packer
Volume LVII. Bioluminescence and Chemiluminescence Edited by Marlene A. DeLuca
Volume LVIII. Cell Culture Edited by William B. Jakoby and Ira Pastan
Volume LIX. Nucleic Acids and Protein Synthesis (Part G) Edited by Kivie Moldave and Lawrence Grossman
Volume LX. Nucleic Acids and Protein Synthesis (Part H) Edited by Kivie Moldave and Lawrence Grossman
Volume 61. Enzyme Structure (Part H) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume 62. Vitamins and Coenzymes (Part D) Edited by Donald B. McCormick and Lemuel D. Wright
Volume 63. Enzyme Kinetics and Mechanism (Part A: Initial Rate and Inhibitor Methods) Edited by Daniel L. Purich
Volume 64. Enzyme Kinetics and Mechanism (Part B: Isotopic Probes and Complex Enzyme Systems) Edited by Daniel L. Purich
Volume 65. Nucleic Acids (Part I) Edited by Lawrence Grossman and Kivie Moldave
Volume 66. Vitamins and Coenzymes (Part E) Edited by Donald B. McCormick and Lemuel D. Wright
Volume 67. Vitamins and Coenzymes (Part F) Edited by Donald B. McCormick and Lemuel D. Wright
Volume 68. Recombinant DNA Edited by Ray Wu
Volume 69. Photosynthesis and Nitrogen Fixation (Part C) Edited by Anthony San Pietro
Volume 70. Immunochemical Techniques (Part A) Edited by Helen Van Vunakis and John J. Langone
Volume 71. Lipids (Part C) Edited by John M. Lowenstein

Volume 72. Lipids (Part D) Edited by John M. Lowenstein
Volume 73. Immunochemical Techniques (Part B) Edited by John J. Langone and Helen Van Vunakis
Volume 74. Immunochemical Techniques (Part C) Edited by John J. Langone and Helen Van Vunakis
Volume 75. Cumulative Subject Index Volumes XXXI, XXXII, XXXIV–LX Edited by Edward A. Dennis and Martha G. Dennis
Volume 76. Hemoglobins Edited by Eraldo Antonini, Luigi Rossi-Bernardi, and Emilia Chiancone
Volume 77. Detoxication and Drug Metabolism Edited by William B. Jakoby
Volume 78. Interferons (Part A) Edited by Sidney Pestka
Volume 79. Interferons (Part B) Edited by Sidney Pestka
Volume 80. Proteolytic Enzymes (Part C) Edited by Laszlo Lorand
Volume 81. Biomembranes (Part H: Visual Pigments and Purple Membranes, I) Edited by Lester Packer
Volume 82. Structural and Contractile Proteins (Part A: Extracellular Matrix) Edited by Leon W. Cunningham and Dixie W. Frederiksen
Volume 83. Complex Carbohydrates (Part D) Edited by Victor Ginsburg
Volume 84. Immunochemical Techniques (Part D: Selected Immunoassays) Edited by John J. Langone and Helen Van Vunakis
Volume 85. Structural and Contractile Proteins (Part B: The Contractile Apparatus and the Cytoskeleton) Edited by Dixie W. Frederiksen and Leon W. Cunningham
Volume 86. Prostaglandins and Arachidonate Metabolites Edited by William E. M. Lands and William L. Smith
Volume 87. Enzyme Kinetics and Mechanism (Part C: Intermediates, Stereochemistry, and Rate Studies) Edited by Daniel L. Purich
Volume 88. Biomembranes (Part I: Visual Pigments and Purple Membranes, II) Edited by Lester Packer
Volume 89. Carbohydrate Metabolism (Part D) Edited by Willis A. Wood

Volume 90. Carbohydrate Metabolism (Part E) Edited by Willis A. Wood
Volume 91. Enzyme Structure (Part I) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume 92. Immunochemical Techniques (Part E: Monoclonal Antibodies and General Immunoassay Methods) Edited by John J. Langone and Helen Van Vunakis
Volume 93. Immunochemical Techniques (Part F: Conventional Antibodies, Fc Receptors, and Cytotoxicity) Edited by John J. Langone and Helen Van Vunakis
Volume 94. Polyamines Edited by Herbert Tabor and Celia White Tabor
Volume 95. Cumulative Subject Index Volumes 61–74, 76–80 Edited by Edward A. Dennis and Martha G. Dennis
Volume 96. Biomembranes [Part J: Membrane Biogenesis: Assembly and Targeting (General Methods; Eukaryotes)] Edited by Sidney Fleischer and Becca Fleischer
Volume 97. Biomembranes [Part K: Membrane Biogenesis: Assembly and Targeting (Prokaryotes, Mitochondria, and Chloroplasts)] Edited by Sidney Fleischer and Becca Fleischer
Volume 98. Biomembranes (Part L: Membrane Biogenesis: Processing and Recycling) Edited by Sidney Fleischer and Becca Fleischer
Volume 99. Hormone Action (Part F: Protein Kinases) Edited by Jackie D. Corbin and Joel G. Hardman
Volume 100. Recombinant DNA (Part B) Edited by Ray Wu, Lawrence Grossman, and Kivie Moldave
Volume 101. Recombinant DNA (Part C) Edited by Ray Wu, Lawrence Grossman, and Kivie Moldave
Volume 102. Hormone Action (Part G: Calmodulin and Calcium-Binding Proteins) Edited by Anthony R. Means and Bert W. O’Malley
Volume 103. Hormone Action (Part H: Neuroendocrine Peptides) Edited by P. Michael Conn
Volume 104. Enzyme Purification and Related Techniques (Part C) Edited by William B. Jakoby
Volume 105. Oxygen Radicals in Biological Systems Edited by Lester Packer
Volume 106. Posttranslational Modifications (Part A) Edited by Finn Wold and Kivie Moldave

Volume 107. Posttranslational Modifications (Part B) Edited by Finn Wold and Kivie Moldave
Volume 108. Immunochemical Techniques (Part G: Separation and Characterization of Lymphoid Cells) Edited by Giovanni Di Sabato, John J. Langone, and Helen Van Vunakis
Volume 109. Hormone Action (Part I: Peptide Hormones) Edited by Lutz Birnbaumer and Bert W. O’Malley
Volume 110. Steroids and Isoprenoids (Part A) Edited by John H. Law and Hans C. Rilling
Volume 111. Steroids and Isoprenoids (Part B) Edited by John H. Law and Hans C. Rilling
Volume 112. Drug and Enzyme Targeting (Part A) Edited by Kenneth J. Widder and Ralph Green
Volume 113. Glutamate, Glutamine, Glutathione, and Related Compounds Edited by Alton Meister
Volume 114. Diffraction Methods for Biological Macromolecules (Part A) Edited by Harold W. Wyckoff, C. H. W. Hirs, and Serge N. Timasheff
Volume 115. Diffraction Methods for Biological Macromolecules (Part B) Edited by Harold W. Wyckoff, C. H. W. Hirs, and Serge N. Timasheff
Volume 116. Immunochemical Techniques (Part H: Effectors and Mediators of Lymphoid Cell Functions) Edited by Giovanni Di Sabato, John J. Langone, and Helen Van Vunakis
Volume 117. Enzyme Structure (Part J) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume 118. Plant Molecular Biology Edited by Arthur Weissbach and Herbert Weissbach
Volume 119. Interferons (Part C) Edited by Sidney Pestka
Volume 120. Cumulative Subject Index Volumes 81–94, 96–101
Volume 121. Immunochemical Techniques (Part I: Hybridoma Technology and Monoclonal Antibodies) Edited by John J. Langone and Helen Van Vunakis
Volume 122. Vitamins and Coenzymes (Part G) Edited by Frank Chytil and Donald B. McCormick
Volume 123. Vitamins and Coenzymes (Part H) Edited by Frank Chytil and Donald B. McCormick
Volume 124. Hormone Action (Part J: Neuroendocrine Peptides) Edited by P. Michael Conn

Volume 125. Biomembranes (Part M: Transport in Bacteria, Mitochondria, and Chloroplasts: General Approaches and Transport Systems) Edited by Sidney Fleischer and Becca Fleischer
Volume 126. Biomembranes (Part N: Transport in Bacteria, Mitochondria, and Chloroplasts: Protonmotive Force) Edited by Sidney Fleischer and Becca Fleischer
Volume 127. Biomembranes (Part O: Protons and Water: Structure and Translocation) Edited by Lester Packer
Volume 128. Plasma Lipoproteins (Part A: Preparation, Structure, and Molecular Biology) Edited by Jere P. Segrest and John J. Albers
Volume 129. Plasma Lipoproteins (Part B: Characterization, Cell Biology, and Metabolism) Edited by John J. Albers and Jere P. Segrest
Volume 130. Enzyme Structure (Part K) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume 131. Enzyme Structure (Part L) Edited by C. H. W. Hirs and Serge N. Timasheff
Volume 132. Immunochemical Techniques (Part J: Phagocytosis and Cell-Mediated Cytotoxicity) Edited by Giovanni Di Sabato and Johannes Everse
Volume 133. Bioluminescence and Chemiluminescence (Part B) Edited by Marlene DeLuca and William D. McElroy
Volume 134. Structural and Contractile Proteins (Part C: The Contractile Apparatus and the Cytoskeleton) Edited by Richard B. Vallee
Volume 135. Immobilized Enzymes and Cells (Part B) Edited by Klaus Mosbach
Volume 136. Immobilized Enzymes and Cells (Part C) Edited by Klaus Mosbach
Volume 137. Immobilized Enzymes and Cells (Part D) Edited by Klaus Mosbach
Volume 138. Complex Carbohydrates (Part E) Edited by Victor Ginsburg
Volume 139. Cellular Regulators (Part A: Calcium- and Calmodulin-Binding Proteins) Edited by Anthony R. Means and P. Michael Conn
Volume 140. Cumulative Subject Index Volumes 102–119, 121–134

Volume 141. Cellular Regulators (Part B: Calcium and Lipids) Edited by P. Michael Conn and Anthony R. Means
Volume 142. Metabolism of Aromatic Amino Acids and Amines Edited by Seymour Kaufman
Volume 143. Sulfur and Sulfur Amino Acids Edited by William B. Jakoby and Owen Griffith
Volume 144. Structural and Contractile Proteins (Part D: Extracellular Matrix) Edited by Leon W. Cunningham
Volume 145. Structural and Contractile Proteins (Part E: Extracellular Matrix) Edited by Leon W. Cunningham
Volume 146. Peptide Growth Factors (Part A) Edited by David Barnes and David A. Sirbasku
Volume 147. Peptide Growth Factors (Part B) Edited by David Barnes and David A. Sirbasku
Volume 148. Plant Cell Membranes Edited by Lester Packer and Roland Douce
Volume 149. Drug and Enzyme Targeting (Part B) Edited by Ralph Green and Kenneth J. Widder
Volume 150. Immunochemical Techniques (Part K: In Vitro Models of B and T Cell Functions and Lymphoid Cell Receptors) Edited by Giovanni Di Sabato
Volume 151. Molecular Genetics of Mammalian Cells Edited by Michael M. Gottesman
Volume 152. Guide to Molecular Cloning Techniques Edited by Shelby L. Berger and Alan R. Kimmel
Volume 153. Recombinant DNA (Part D) Edited by Ray Wu and Lawrence Grossman
Volume 154. Recombinant DNA (Part E) Edited by Ray Wu and Lawrence Grossman
Volume 155. Recombinant DNA (Part F) Edited by Ray Wu
Volume 156. Biomembranes (Part P: ATP-Driven Pumps and Related Transport: The Na, K-Pump) Edited by Sidney Fleischer and Becca Fleischer
Volume 157. Biomembranes (Part Q: ATP-Driven Pumps and Related Transport: Calcium, Proton, and Potassium Pumps) Edited by Sidney Fleischer and Becca Fleischer
Volume 158. Metalloproteins (Part A) Edited by James F. Riordan and Bert L. Vallee

Volume 159. Initiation and Termination of Cyclic Nucleotide Action Edited by Jackie D. Corbin and Roger A. Johnson
Volume 160. Biomass (Part A: Cellulose and Hemicellulose) Edited by Willis A. Wood and Scott T. Kellogg
Volume 161. Biomass (Part B: Lignin, Pectin, and Chitin) Edited by Willis A. Wood and Scott T. Kellogg
Volume 162. Immunochemical Techniques (Part L: Chemotaxis and Inflammation) Edited by Giovanni Di Sabato
Volume 163. Immunochemical Techniques (Part M: Chemotaxis and Inflammation) Edited by Giovanni Di Sabato
Volume 164. Ribosomes Edited by Harry F. Noller, Jr., and Kivie Moldave
Volume 165. Microbial Toxins: Tools for Enzymology Edited by Sidney Harshman
Volume 166. Branched-Chain Amino Acids Edited by Robert Harris and John R. Sokatch
Volume 167. Cyanobacteria Edited by Lester Packer and Alexander N. Glazer
Volume 168. Hormone Action (Part K: Neuroendocrine Peptides) Edited by P. Michael Conn
Volume 169. Platelets: Receptors, Adhesion, Secretion (Part A) Edited by Jacek Hawiger
Volume 170. Nucleosomes Edited by Paul M. Wassarman and Roger D. Kornberg
Volume 171. Biomembranes (Part R: Transport Theory: Cells and Model Membranes) Edited by Sidney Fleischer and Becca Fleischer
Volume 172. Biomembranes (Part S: Transport: Membrane Isolation and Characterization) Edited by Sidney Fleischer and Becca Fleischer
Volume 173. Biomembranes [Part T: Cellular and Subcellular Transport: Eukaryotic (Nonepithelial) Cells] Edited by Sidney Fleischer and Becca Fleischer
Volume 174. Biomembranes [Part U: Cellular and Subcellular Transport: Eukaryotic (Nonepithelial) Cells] Edited by Sidney Fleischer and Becca Fleischer
Volume 175. Cumulative Subject Index Volumes 135–139, 141–167

Volume 176. Nuclear Magnetic Resonance (Part A: Spectral Techniques and Dynamics) Edited by Norman J. Oppenheimer and Thomas L. James
Volume 177. Nuclear Magnetic Resonance (Part B: Structure and Mechanism) Edited by Norman J. Oppenheimer and Thomas L. James
Volume 178. Antibodies, Antigens, and Molecular Mimicry Edited by John J. Langone
Volume 179. Complex Carbohydrates (Part F) Edited by Victor Ginsburg
Volume 180. RNA Processing (Part A: General Methods) Edited by James E. Dahlberg and John N. Abelson
Volume 181. RNA Processing (Part B: Specific Methods) Edited by James E. Dahlberg and John N. Abelson
Volume 182. Guide to Protein Purification Edited by Murray P. Deutscher
Volume 183. Molecular Evolution: Computer Analysis of Protein and Nucleic Acid Sequences Edited by Russell F. Doolittle
Volume 184. Avidin-Biotin Technology Edited by Meir Wilchek and Edward A. Bayer
Volume 185. Gene Expression Technology Edited by David V. Goeddel
Volume 186. Oxygen Radicals in Biological Systems (Part B: Oxygen Radicals and Antioxidants) Edited by Lester Packer and Alexander N. Glazer
Volume 187. Arachidonate Related Lipid Mediators Edited by Robert C. Murphy and Frank A. Fitzpatrick
Volume 188. Hydrocarbons and Methylotrophy Edited by Mary E. Lidstrom
Volume 189. Retinoids (Part A: Molecular and Metabolic Aspects) Edited by Lester Packer
Volume 190. Retinoids (Part B: Cell Differentiation and Clinical Applications) Edited by Lester Packer
Volume 191. Biomembranes (Part V: Cellular and Subcellular Transport: Epithelial Cells) Edited by Sidney Fleischer and Becca Fleischer
Volume 192. Biomembranes (Part W: Cellular and Subcellular Transport: Epithelial Cells) Edited by Sidney Fleischer and Becca Fleischer

Volume 193. Mass Spectrometry Edited by James A. McCloskey
Volume 194. Guide to Yeast Genetics and Molecular Biology Edited by Christine Guthrie and Gerald R. Fink
Volume 195. Adenylyl Cyclase, G Proteins, and Guanylyl Cyclase Edited by Roger A. Johnson and Jackie D. Corbin
Volume 196. Molecular Motors and the Cytoskeleton Edited by Richard B. Vallee
Volume 197. Phospholipases Edited by Edward A. Dennis
Volume 198. Peptide Growth Factors (Part C) Edited by David Barnes, J. P. Mather, and Gordon H. Sato
Volume 199. Cumulative Subject Index Volumes 168–174, 176–194
Volume 200. Protein Phosphorylation (Part A: Protein Kinases: Assays, Purification, Antibodies, Functional Analysis, Cloning, and Expression) Edited by Tony Hunter and Bartholomew M. Sefton
Volume 201. Protein Phosphorylation (Part B: Analysis of Protein Phosphorylation, Protein Kinase Inhibitors, and Protein Phosphatases) Edited by Tony Hunter and Bartholomew M. Sefton
Volume 202. Molecular Design and Modeling: Concepts and Applications (Part A: Proteins, Peptides, and Enzymes) Edited by John J. Langone
Volume 203. Molecular Design and Modeling: Concepts and Applications (Part B: Antibodies and Antigens, Nucleic Acids, Polysaccharides, and Drugs) Edited by John J. Langone
Volume 204. Bacterial Genetic Systems Edited by Jeffrey H. Miller
Volume 205. Metallobiochemistry (Part B: Metallothionein and Related Molecules) Edited by James F. Riordan and Bert L. Vallee
Volume 206. Cytochrome P450 Edited by Michael R. Waterman and Eric F. Johnson
Volume 207. Ion Channels Edited by Bernardo Rudy and Linda E. Iverson
Volume 208. Protein–DNA Interactions Edited by Robert T. Sauer
Volume 209. Phospholipid Biosynthesis Edited by Edward A. Dennis and Dennis E. Vance

Volume 210. Numerical Computer Methods Edited by Ludwig Brand and Michael L. Johnson
Volume 211. DNA Structures (Part A: Synthesis and Physical Analysis of DNA) Edited by David M. J. Lilley and James E. Dahlberg
Volume 212. DNA Structures (Part B: Chemical and Electrophoretic Analysis of DNA) Edited by David M. J. Lilley and James E. Dahlberg
Volume 213. Carotenoids (Part A: Chemistry, Separation, Quantitation, and Antioxidation) Edited by Lester Packer
Volume 214. Carotenoids (Part B: Metabolism, Genetics, and Biosynthesis) Edited by Lester Packer
Volume 215. Platelets: Receptors, Adhesion, Secretion (Part B) Edited by Jacek J. Hawiger
Volume 216. Recombinant DNA (Part G) Edited by Ray Wu
Volume 217. Recombinant DNA (Part H) Edited by Ray Wu
Volume 218. Recombinant DNA (Part I) Edited by Ray Wu
Volume 219. Reconstitution of Intracellular Transport Edited by James E. Rothman
Volume 220. Membrane Fusion Techniques (Part A) Edited by Nejat Düzgüneş
Volume 221. Membrane Fusion Techniques (Part B) Edited by Nejat Düzgüneş
Volume 222. Proteolytic Enzymes in Coagulation, Fibrinolysis, and Complement Activation (Part A: Mammalian Blood Coagulation Factors and Inhibitors) Edited by Laszlo Lorand and Kenneth G. Mann
Volume 223. Proteolytic Enzymes in Coagulation, Fibrinolysis, and Complement Activation (Part B: Complement Activation, Fibrinolysis, and Nonmammalian Blood Coagulation Factors) Edited by Laszlo Lorand and Kenneth G. Mann
Volume 224. Molecular Evolution: Producing the Biochemical Data Edited by Elizabeth Anne Zimmer, Thomas J. White, Rebecca L. Cann, and Allan C. Wilson
Volume 225. Guide to Techniques in Mouse Development Edited by Paul M. Wassarman and Melvin L. DePamphilis

Volume 226. Metallobiochemistry (Part C: Spectroscopic and Physical Methods for Probing Metal Ion Environments in Metalloenzymes and Metalloproteins) Edited by James F. Riordan and Bert L. Vallee
Volume 227. Metallobiochemistry (Part D: Physical and Spectroscopic Methods for Probing Metal Ion Environments in Metalloproteins) Edited by James F. Riordan and Bert L. Vallee
Volume 228. Aqueous Two-Phase Systems Edited by Harry Walter and Göte Johansson
Volume 229. Cumulative Subject Index Volumes 195–198, 200–227
Volume 230. Guide to Techniques in Glycobiology Edited by William J. Lennarz and Gerald W. Hart
Volume 231. Hemoglobins (Part B: Biochemical and Analytical Methods) Edited by Johannes Everse, Kim D. Vandegriff, and Robert M. Winslow
Volume 232. Hemoglobins (Part C: Biophysical Methods) Edited by Johannes Everse, Kim D. Vandegriff, and Robert M. Winslow
Volume 233. Oxygen Radicals in Biological Systems (Part C) Edited by Lester Packer
Volume 234. Oxygen Radicals in Biological Systems (Part D) Edited by Lester Packer
Volume 235. Bacterial Pathogenesis (Part A: Identification and Regulation of Virulence Factors) Edited by Virginia L. Clark and Patrik M. Bavoil
Volume 236. Bacterial Pathogenesis (Part B: Integration of Pathogenic Bacteria with Host Cells) Edited by Virginia L. Clark and Patrik M. Bavoil
Volume 237. Heterotrimeric G Proteins Edited by Ravi Iyengar
Volume 238. Heterotrimeric G-Protein Effectors Edited by Ravi Iyengar
Volume 239. Nuclear Magnetic Resonance (Part C) Edited by Thomas L. James and Norman J. Oppenheimer
Volume 240. Numerical Computer Methods (Part B) Edited by Michael L. Johnson and Ludwig Brand
Volume 241. Retroviral Proteases Edited by Lawrence C. Kuo and Jules A. Shafer
Volume 242. Neoglycoconjugates (Part A) Edited by Y. C. Lee and Reiko T. Lee
Volume 243. Inorganic Microbial Sulfur Metabolism Edited by Harry D. Peck, Jr., and Jean LeGall

Volume 244. Proteolytic Enzymes: Serine and Cysteine Peptidases Edited by Alan J. Barrett
Volume 245. Extracellular Matrix Components Edited by E. Ruoslahti and E. Engvall
Volume 246. Biochemical Spectroscopy Edited by Kenneth Sauer
Volume 247. Neoglycoconjugates (Part B: Biomedical Applications) Edited by Y. C. Lee and Reiko T. Lee
Volume 248. Proteolytic Enzymes: Aspartic and Metallo Peptidases Edited by Alan J. Barrett
Volume 249. Enzyme Kinetics and Mechanism (Part D: Developments in Enzyme Dynamics) Edited by Daniel L. Purich
Volume 250. Lipid Modifications of Proteins Edited by Patrick J. Casey and Janice E. Buss
Volume 251. Biothiols (Part A: Monothiols and Dithiols, Protein Thiols, and Thiyl Radicals) Edited by Lester Packer
Volume 252. Biothiols (Part B: Glutathione and Thioredoxin; Thiols in Signal Transduction and Gene Regulation) Edited by Lester Packer
Volume 253. Adhesion of Microbial Pathogens Edited by Ron J. Doyle and Itzhak Ofek
Volume 254. Oncogene Techniques Edited by Peter K. Vogt and Inder M. Verma
Volume 255. Small GTPases and Their Regulators (Part A: Ras Family) Edited by W. E. Balch, Channing J. Der, and Alan Hall
Volume 256. Small GTPases and Their Regulators (Part B: Rho Family) Edited by W. E. Balch, Channing J. Der, and Alan Hall
Volume 257. Small GTPases and Their Regulators (Part C: Proteins Involved in Transport) Edited by W. E. Balch, Channing J. Der, and Alan Hall
Volume 258. Redox-Active Amino Acids in Biology Edited by Judith P. Klinman
Volume 259. Energetics of Biological Macromolecules Edited by Michael L. Johnson and Gary K. Ackers
Volume 260. Mitochondrial Biogenesis and Genetics (Part A) Edited by Giuseppe M. Attardi and Anne Chomyn
Volume 261. Nuclear Magnetic Resonance and Nucleic Acids Edited by Thomas L. James

Volume 262. DNA Replication Edited by Judith L. Campbell Volume 263. Plasma Lipoproteins (Part C: Quantitation) Edited by William A. Bradley, Sandra H. Gianturco, and Jere P. Segrest Volume 264. Mitochondrial Biogenesis and Genetics (Part B) Edited by Giuseppe M. Attardi and Anne Chomyn Volume 265. Cumulative Subject Index Volumes 228, 230–262 Volume 266. Computer Methods for Macromolecular Sequence Analysis Edited by Russell F. Doolittle Volume 267. Combinatorial Chemistry Edited by John N. Abelson Volume 268. Nitric Oxide (Part A: Sources and Detection of NO; NO Synthase) Edited by Lester Packer Volume 269. Nitric Oxide (Part B: Physiological and Pathological Processes) Edited by Lester Packer Volume 270. High Resolution Separation and Analysis of Biological Macromolecules (Part A: Fundamentals) Edited by Barry L. Karger and William S. Hancock Volume 271. High Resolution Separation and Analysis of Biological Macromolecules (Part B: Applications) Edited by Barry L. Karger and William S. Hancock Volume 272. Cytochrome P450 (Part B) Edited by Eric F. Johnson and Michael R. Waterman Volume 273. RNA Polymerase and Associated Factors (Part A) Edited by Sankar Adhya Volume 274. RNA Polymerase and Associated Factors (Part B) Edited by Sankar Adhya Volume 275. Viral Polymerases and Related Proteins Edited by Lawrence C. Kuo, David B. Olsen, and Steven S. Carroll Volume 276. Macromolecular Crystallography (Part A) Edited by Charles W. Carter, Jr., and Robert M. Sweet Volume 277. Macromolecular Crystallography (Part B) Edited by Charles W. Carter, Jr., and Robert M. Sweet Volume 278. Fluorescence Spectroscopy Edited by Ludwig Brand and Michael L. Johnson Volume 279. Vitamins and Coenzymes (Part I) Edited by Donald B. McCormick, John W. Suttie, and Conrad Wagner

Volume 280. Vitamins and Coenzymes (Part J) Edited by Donald B. McCormick, John W. Suttie, and Conrad Wagner Volume 281. Vitamins and Coenzymes (Part K) Edited by Donald B. McCormick, John W. Suttie, and Conrad Wagner Volume 282. Vitamins and Coenzymes (Part L) Edited by Donald B. McCormick, John W. Suttie, and Conrad Wagner Volume 283. Cell Cycle Control Edited by William G. Dunphy Volume 284. Lipases (Part A: Biotechnology) Edited by Byron Rubin and Edward A. Dennis Volume 285. Cumulative Subject Index Volumes 263, 264, 266–284, 286–289 Volume 286. Lipases (Part B: Enzyme Characterization and Utilization) Edited by Byron Rubin and Edward A. Dennis Volume 287. Chemokines Edited by Richard Horuk Volume 288. Chemokine Receptors Edited by Richard Horuk Volume 289. Solid Phase Peptide Synthesis Edited by Gregg B. Fields Volume 290. Molecular Chaperones Edited by George H. Lorimer and Thomas Baldwin Volume 291. Caged Compounds Edited by Gerard Marriott Volume 292. ABC Transporters: Biochemical, Cellular, and Molecular Aspects Edited by Suresh V. Ambudkar and Michael M. Gottesman Volume 293. Ion Channels (Part B) Edited by P. Michael Conn Volume 294. Ion Channels (Part C) Edited by P. Michael Conn Volume 295. Energetics of Biological Macromolecules (Part B) Edited by Gary K. Ackers and Michael L. Johnson Volume 296. Neurotransmitter Transporters Edited by Susan G. Amara Volume 297. Photosynthesis: Molecular Biology of Energy Capture Edited by Lee McIntosh Volume 298. Molecular Motors and the Cytoskeleton (Part B) Edited by Richard B. Vallee

Volume 299. Oxidants and Antioxidants (Part A) Edited by Lester Packer Volume 300. Oxidants and Antioxidants (Part B) Edited by Lester Packer Volume 301. Nitric Oxide: Biological and Antioxidant Activities (Part C) Edited by Lester Packer Volume 302. Green Fluorescent Protein Edited by P. Michael Conn Volume 303. cDNA Preparation and Display Edited by Sherman M. Weissman Volume 304. Chromatin Edited by Paul M. Wassarman and Alan P. Wolffe Volume 305. Bioluminescence and Chemiluminescence (Part C) Edited by Thomas O. Baldwin and Miriam M. Ziegler Volume 306. Expression of Recombinant Genes in Eukaryotic Systems Edited by Joseph C. Glorioso and Martin C. Schmidt Volume 307. Confocal Microscopy Edited by P. Michael Conn Volume 308. Enzyme Kinetics and Mechanism (Part E: Energetics of Enzyme Catalysis) Edited by Daniel L. Purich and Vern L. Schramm Volume 309. Amyloid, Prions, and Other Protein Aggregates Edited by Ronald Wetzel Volume 310. Biofilms Edited by Ron J. Doyle Volume 311. Sphingolipid Metabolism and Cell Signaling (Part A) Edited by Alfred H. Merrill, Jr., and Yusuf A. Hannun Volume 312. Sphingolipid Metabolism and Cell Signaling (Part B) Edited by Alfred H. Merrill, Jr., and Yusuf A. Hannun Volume 313. Antisense Technology (Part A: General Methods, Methods of Delivery, and RNA Studies) Edited by M. Ian Phillips Volume 314. Antisense Technology (Part B: Applications) Edited by M. Ian Phillips Volume 315. Vertebrate Phototransduction and the Visual Cycle (Part A) Edited by Krzysztof Palczewski Volume 316. Vertebrate Phototransduction and the Visual Cycle (Part B) Edited by Krzysztof Palczewski

Volume 317. RNA–Ligand Interactions (Part A: Structural Biology Methods) Edited by Daniel W. Celander and John N. Abelson Volume 318. RNA–Ligand Interactions (Part B: Molecular Biology Methods) Edited by Daniel W. Celander and John N. Abelson Volume 319. Singlet Oxygen, UV-A, and Ozone Edited by Lester Packer and Helmut Sies Volume 320. Cumulative Subject Index Volumes 290–319 Volume 321. Numerical Computer Methods (Part C) Edited by Michael L. Johnson and Ludwig Brand Volume 322. Apoptosis Edited by John C. Reed Volume 323. Energetics of Biological Macromolecules (Part C) Edited by Michael L. Johnson and Gary K. Ackers Volume 324. Branched-Chain Amino Acids (Part B) Edited by Robert A. Harris and John R. Sokatch Volume 325. Regulators and Effectors of Small GTPases (Part D: Rho Family) Edited by W. E. Balch, Channing J. Der, and Alan Hall Volume 326. Applications of Chimeric Genes and Hybrid Proteins (Part A: Gene Expression and Protein Purification) Edited by Jeremy Thorner, Scott D. Emr, and John N. Abelson Volume 327. Applications of Chimeric Genes and Hybrid Proteins (Part B: Cell Biology and Physiology) Edited by Jeremy Thorner, Scott D. Emr, and John N. Abelson Volume 328. Applications of Chimeric Genes and Hybrid Proteins (Part C: Protein–Protein Interactions and Genomics) Edited by Jeremy Thorner, Scott D. Emr, and John N. Abelson Volume 329. Regulators and Effectors of Small GTPases (Part E: GTPases Involved in Vesicular Traffic) Edited by W. E. Balch, Channing J. Der, and Alan Hall Volume 330. Hyperthermophilic Enzymes (Part A) Edited by Michael W. W. Adams and Robert M. Kelly Volume 331. Hyperthermophilic Enzymes (Part B) Edited by Michael W. W. Adams and Robert M. Kelly Volume 332. Regulators and Effectors of Small GTPases (Part F: Ras Family I) Edited by W. E. Balch, Channing J. Der, and Alan Hall Volume 333. Regulators and Effectors of Small GTPases (Part G: Ras Family II) Edited by W. E. Balch, Channing J. Der, and Alan Hall Volume 334. Hyperthermophilic Enzymes (Part C) Edited by Michael W. W. Adams and Robert M. Kelly

Volume 335. Flavonoids and Other Polyphenols Edited by Lester Packer Volume 336. Microbial Growth in Biofilms (Part A: Developmental and Molecular Biological Aspects) Edited by Ron J. Doyle Volume 337. Microbial Growth in Biofilms (Part B: Special Environments and Physicochemical Aspects) Edited by Ron J. Doyle Volume 338. Nuclear Magnetic Resonance of Biological Macromolecules (Part A) Edited by Thomas L. James, Volker Dötsch, and Uli Schmitz Volume 339. Nuclear Magnetic Resonance of Biological Macromolecules (Part B) Edited by Thomas L. James, Volker Dötsch, and Uli Schmitz Volume 340. Drug–Nucleic Acid Interactions Edited by Jonathan B. Chaires and Michael J. Waring Volume 341. Ribonucleases (Part A) Edited by Allen W. Nicholson Volume 342. Ribonucleases (Part B) Edited by Allen W. Nicholson Volume 343. G Protein Pathways (Part A: Receptors) Edited by Ravi Iyengar and John D. Hildebrandt Volume 344. G Protein Pathways (Part B: G Proteins and Their Regulators) Edited by Ravi Iyengar and John D. Hildebrandt Volume 345. G Protein Pathways (Part C: Effector Mechanisms) Edited by Ravi Iyengar and John D. Hildebrandt Volume 346. Gene Therapy Methods Edited by M. Ian Phillips Volume 347. Protein Sensors and Reactive Oxygen Species (Part A: Selenoproteins and Thioredoxin) Edited by Helmut Sies and Lester Packer Volume 348. Protein Sensors and Reactive Oxygen Species (Part B: Thiol Enzymes and Proteins) Edited by Helmut Sies and Lester Packer Volume 349. Superoxide Dismutase Edited by Lester Packer Volume 350. Guide to Yeast Genetics and Molecular and Cell Biology (Part B) Edited by Christine Guthrie and Gerald R. Fink Volume 351. Guide to Yeast Genetics and Molecular and Cell Biology (Part C) Edited by Christine Guthrie and Gerald R. Fink

Volume 352. Redox Cell Biology and Genetics (Part A) Edited by Chandan K. Sen and Lester Packer Volume 353. Redox Cell Biology and Genetics (Part B) Edited by Chandan K. Sen and Lester Packer Volume 354. Enzyme Kinetics and Mechanisms (Part F: Detection and Characterization of Enzyme Reaction Intermediates) Edited by Daniel L. Purich Volume 355. Cumulative Subject Index Volumes 321–354 Volume 356. Laser Capture Microscopy and Microdissection Edited by P. Michael Conn Volume 357. Cytochrome P450, Part C Edited by Eric F. Johnson and Michael R. Waterman Volume 358. Bacterial Pathogenesis (Part C: Identification, Regulation, and Function of Virulence Factors) Edited by Virginia L. Clark and Patrik M. Bavoil Volume 359. Nitric Oxide (Part D) Edited by Enrique Cadenas and Lester Packer Volume 360. Biophotonics (Part A) Edited by Gerard Marriott and Ian Parker Volume 361. Biophotonics (Part B) Edited by Gerard Marriott and Ian Parker Volume 362. Recognition of Carbohydrates in Biological Systems (Part A) Edited by Yuan C. Lee and Reiko T. Lee Volume 363. Recognition of Carbohydrates in Biological Systems (Part B) Edited by Yuan C. Lee and Reiko T. Lee Volume 364. Nuclear Receptors Edited by David W. Russell and David J. Mangelsdorf Volume 365. Differentiation of Embryonic Stem Cells Edited by Paul M. Wassarman and Gordon M. Keller Volume 366. Protein Phosphatases Edited by Susanne Klumpp and Josef Krieglstein Volume 367. Liposomes (Part A) Edited by Nejat Düzgüneş Volume 368. Macromolecular Crystallography (Part C) Edited by Charles W. Carter, Jr., and Robert M. Sweet Volume 369. Combinatorial Chemistry (Part B) Edited by Guillermo A. Morales and Barry A. Bunin Volume 370. RNA Polymerases and Associated Factors (Part C) Edited by Sankar L. Adhya and Susan Garges

Volume 371. RNA Polymerases and Associated Factors (Part D) Edited by Sankar L. Adhya and Susan Garges Volume 372. Liposomes (Part B) Edited by Nejat Düzgüneş Volume 373. Liposomes (Part C) Edited by Nejat Düzgüneş Volume 374. Macromolecular Crystallography (Part D) Edited by Charles W. Carter, Jr., and Robert M. Sweet Volume 375. Chromatin and Chromatin Remodeling Enzymes (Part A) Edited by C. David Allis and Carl Wu Volume 376. Chromatin and Chromatin Remodeling Enzymes (Part B) Edited by C. David Allis and Carl Wu Volume 377. Chromatin and Chromatin Remodeling Enzymes (Part C) Edited by C. David Allis and Carl Wu Volume 378. Quinones and Quinone Enzymes (Part A) Edited by Helmut Sies and Lester Packer Volume 379. Energetics of Biological Macromolecules (Part D) Edited by Jo M. Holt, Michael L. Johnson, and Gary K. Ackers Volume 380. Energetics of Biological Macromolecules (Part E) Edited by Jo M. Holt, Michael L. Johnson, and Gary K. Ackers Volume 381. Oxygen Sensing (in preparation) Edited by Chandan K. Sen and Gregg L. Semenza Volume 382. Quinones and Quinone Enzymes (Part B) (in preparation) Edited by Helmut Sies and Lester Packer Volume 383. Numerical Computer Methods (Part D) (in preparation) Edited by Ludwig Brand and Michael L. Johnson Volume 384. Numerical Computer Methods (Part E) (in preparation) Edited by Ludwig Brand and Michael L. Johnson Volume 385. Imaging in Biological Research (Part A) (in preparation) Edited by P. Michael Conn Volume 386. Imaging in Biological Research (Part B) (in preparation) Edited by P. Michael Conn

[1] Contributions to the Catalytic Efficiency of Enzymes, and the Binding of Ligands to Receptors, from Improvements in Packing within Enzymes and Receptors

By Dudley H. Williams, Elaine Stephens, Min Zhou, and Rosa Zerella

Introduction

One of the great challenges to twenty-first-century science is to further our understanding of the noncovalent interactions that are responsible for the molecule-to-molecule binding that is the key to biological function. Suppose we were given a picture of a set of noncovalent interactions involved in the association of two entities (e.g., from X-ray crystallography). If we were then able to predict the binding constant successfully (say, to within a factor of 10), we could claim a relatively good understanding of noncovalent interactions. Among such attempts, the approach known as LUDI (ref. 1a) is, given its simplicity, moderately successful. LUDI builds on an equation developed in our own laboratory (ref. 2). Its modified version (ref. 1a) is Eq. (1).

$$\Delta G = \Delta G_{t+r} + \Delta G_{r} + \mathrm{Area}\,(\Delta G_{h}) + \Delta G_{hb} + \Delta G_{ionic} \tag{1}$$

In this equation, $\Delta G$ is the observed free energy of a bimolecular association. Since $\Delta G = -RT \ln K$, $\Delta G$ determines the binding constant $K$. Five common parameters (right-hand side of the equation) that are known to be important in binding are considered. It is assumed that their sum will give a useful approximation of $\Delta G$, and hence of $K$. The first two terms oppose binding. $\Delta G_{t+r}$ is the free energy cost of restricting the overall motion of a ligand when it binds to its receptor. $\Delta G_{r}$ is the free energy cost of restricting an internal rotation of the ligand that is restrained upon binding (summed over all such rotations). Both these terms are essentially adverse entropy terms. The remaining three terms promote binding. $\Delta G_{h}$ is the free energy benefit due to the removal of 1 Å² of hydrocarbon surface area from water upon binding (the hydrophobic effect). $\Delta G_{h}$ is therefore multiplied by the buried surface area for each specified case. $\Delta G_{hb}$ is the free energy benefit of a hydrogen bond in the binding site (summed over all such hydrogen bonds). $\Delta G_{ionic}$ is the free energy benefit of an ionic bond in the binding site (summed over all such ionic bonds).

To "train" the equation, a set of 45 complexes with experimentally known binding constants was used. In these complexes, ligands of relatively small molecular weight (66 to 1047) interact with proteins through sets of known interactions (determined by X-ray crystallography). Since Eq. (1) has only five types of $\Delta G$ contributions, and the 45 binding sites involve different combinations of these five types, average values for them can be obtained. Using these average values, the equation can then be used to estimate binding constants where "pictures" of binding sites are available. The equation is remarkably successful, for in a limited data set (but one that includes compounds outside the training set) it is able to predict binding constants with a standard deviation of only log10 1.7 (ref. 1a). However, in a wider data set, it performs less well (ref. 1b): estimated binding constants can be in error by a factor of 1000 or more. Partly this is because other important terms (e.g., other favorable terms such as π–π stacking) are neglected. Partly it is because cooperativity is neglected. Some physical consequences of cooperative binding are the subject of this chapter.

1 (a) H.-J. Böhm, J. Comp. Aided Mol. Des. 8, 243 (1994). (b) H.-J. Böhm, personal communication (2001).
2 D. H. Williams, J. P. L. Cox, A. J. Doig, M. Gardner, U. Gerhard, P. T. Kaye, A. R. Lal, I. A. Nicholls, C. J. Salter, and R. C. Mitchell, J. Am. Chem. Soc. 113, 7020 (1991).

METHODS IN ENZYMOLOGY, VOL. 380
Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

Cooperativity

Cooperativity is the phenomenon through which one set of binding interactions can change the binding energy of another. Equation (1) ignores such cooperativity. However, Eq. (1) shows that sets of interactions acting simultaneously can give more binding energy than the sum of the sets when occurring separately. This point can be understood by reference to Fig. 1A and B. Suppose that Z can make interactions to its receptor cup that promote binding by a factor of $10^{3}\ \mathrm{M}^{-1}$. Let the cost of restricting the motion of Z into its receptor cup ($\Delta G_{t+r}$) oppose binding by a factor of $10^{2}\ \mathrm{M}^{-1}$. The binding constant of Z to the receptor would therefore be $10^{1}\ \mathrm{M}^{-1}$. Let Y, when bound alone, interact with the same parameters into its (central) receptor cup. The binding constant of Y to its receptor cup would therefore also be $10^{1}\ \mathrm{M}^{-1}$. Equation (1) tells us that it would be false to conclude that Y–Z (Fig. 1B, where Y and Z are connected with a strain-free connection, allowing both groups to bind in the same geometry as when binding separately) would exhibit a binding constant of $10^{1} \times 10^{1} = 10^{2}\ \mathrm{M}^{-1}$ (the sum of the parts). Equation (1) assumes that the cost of a bimolecular association ($\Delta G_{t+r}$) has to be paid only once (ref. 2). Therefore, the estimated binding constant of Y–Z to the receptor is $10^{3} \times 10^{3}/10^{2}\ \mathrm{M}^{-1} = 10^{4}\ \mathrm{M}^{-1}$ (greater than the sum of the parts).
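
The arithmetic of this example can be checked numerically. The sketch below is an illustration only: the function names, the temperature of 298.15 K, and the conversion to kJ/mol are assumptions introduced here and are not taken from the text. Each factor is converted to a free energy through the relation between free energy and binding constant, the terms are summed in the manner of Eq. (1), and the motional cost is charged once for the linked ligand Y–Z.

```python
import math

R = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1
T = 298.15          # assumed temperature in K (not specified in the text)


def dg_from_k(k):
    """Free energy (kJ/mol) equivalent to a binding factor k: dG = -RT ln k."""
    return -R * T * math.log(k)


def k_from_dg(dg):
    """Binding constant (or factor) equivalent to a free energy dg in kJ/mol."""
    return math.exp(-dg / (R * T))


# Factors quoted in the text for a single group (Y or Z) binding alone.
dg_interactions = dg_from_k(1e3)  # favorable: promotes binding by 10^3
dg_t_plus_r = -dg_from_k(1e2)     # adverse: opposes binding by 10^2 (positive cost)

# One group alone: one set of interactions, one motional cost.
k_single = k_from_dg(dg_interactions + dg_t_plus_r)

# Naive estimate for Y-Z: multiply the two separate binding constants,
# which charges the motional cost twice.
k_naive = k_single * k_single

# Additive estimate for Y-Z: two sets of interactions, motional cost paid once.
k_linked = k_from_dg(2 * dg_interactions + dg_t_plus_r)

if __name__ == "__main__":
    print(f"single group: K = {k_single:.3g} M^-1")
    print(f"naive Y-Z:    K = {k_naive:.3g} M^-1")
    print(f"linked Y-Z:   K = {k_linked:.3g} M^-1")
```

Under these assumptions the three values reproduce the figures quoted above. The factor of 100 separating the last two is the motional cost that the linked ligand does not pay a second time.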

Fig. 1. Schematic representation of a receptor that binds ligands X, Y, and Z. (A) Binding of Z results in a structure with intermolecular distance $d_0$. (B) When Y and Z are connected by a rigid, strain-free linker (Y–Z), if they bind the receptor without positive cooperativity, then $d_0 = d_1$. If they bind with positive cooperativity, there is structural tightening ($d_1 < d_0$). (C) If X is connected to Y–Z by a rigid, strain-free linker to form X–Y–Z, then positive cooperativity will cause further structural tightening ($d_2 < d_1$). (D) The shorter linker between Y and Z does not allow both these binding interactions to occur with optimal geometries. Y–Z binds the receptor with negative cooperativity, and there is structural loosening ($d_3 > d_0$).

The assumption of a useful average $\Delta G_{t+r}$ term in Eq. (1) is an Achilles heel of the approach. It implies that all ligands are restricted in motion to the same, or similar, extents. That is, the degree to which Z is restrained in Fig. 1A is taken to be essentially the same as the degree to which Z is restrained in Fig. 1B, i.e., $d_0 = d_1$. However, the free energy cost $\Delta G_{t+r}$ is not a standard cost that is paid for any bimolecular association. Rather, the cost becomes greater as the motions of a ligand relative to its receptor become more restricted by stronger bonds (bonds that are formed with a greater exothermicity) (ref. 3).

Awareness of the above fact points to a problem for any approach to the estimation of binding constants that treats individual interactions as though (when formed with the same geometry) they have usefully constant free energy benefits [e.g., for terms 4 and 5 in Eq. (1)]. It is motion that opposes bonding. In light of this observation, reconsider cases where a number of noncovalent interactions can be simultaneously made in a strain-free manner to promote the ligand/receptor binding (Fig. 1). The motions about a specified noncovalent interaction (Fig. 1A) will typically become more restricted as the ligand is held in place by more adjacent noncovalent bonds (Fig. 1A → Fig. 1B → Fig. 1C). The specified interaction (Fig. 1A) then forms with a more favorable enthalpy (i.e., it is associated with better bonding), but with an increased cost in entropy (a greater restriction in motion). Each of the three noncovalent interactions made by X, Y, and Z to a receptor gives rise to better bonding when they are made simultaneously rather than separately. Evidence for the effects modeled in Fig. 1A–C is available from proton nuclear magnetic resonance (NMR) experiments carried out on the binding of ligands to glycopeptide antibiotics (ref. 4), and is detailed in the following section.

3 M. S. Westwell, M. S. Searle, J. Klein, and D. H. Williams, J. Phys. Chem. 100, 16000 (1996).

Positively Cooperative Binding Probed by NMR Spectroscopy

Several peptide ligands, all containing the carboxyl group depicted at the lower right in Fig. 2, were separately bound to the antibiotics. In all cases, a downfield chemical shift of the antibiotic amide NH proton w2 was observed upon ligand binding. A larger limiting downfield shift of w2 indicates a shorter carboxylate to NH hydrogen bond. This hydrogen bond was found to decrease in length as the number of the adjacent hydrogen bonds that aid ligand binding was increased.
The motional restriction of the carboxylate group afforded by these additional hydrogen bonds shortens the hydrogen bonds directly made to the carboxylate (ref. 4). Analogous effects have been observed at other interfaces (ref. 5).

Although the above experiments establish the shortening of noncovalent bonds as a consequence of positive cooperativity, they do not prove that the noncovalent bonds are thereby improved in terms of their free energy benefit. The proof that the benefit in improved bonding (increased exothermicity) outweighs the cost in entropy (more restricted motion) is seen in cases in which two interfaces made simultaneously give a larger free energy of association than the sum of their parts. Dimers of glycopeptide antibiotics of the vancomycin group are further stabilized when they bind two molecules of the bacterial cell wall analogues (Fig. 3). The dimeric system is stabilized by the ligand binding, with attendant distance reductions at the dimer interface (ref. 6). In nine of nine cases, the positive cooperativity is associated with a benefit in enthalpy; in eight of nine cases, it is associated with a cost in entropy (refs. 7, 8).

There are large numbers of papers (refs. 9–12) that report changes in receptor structures upon ligand binding, and clear indications that the receptor can in some cases be stabilized. The antibiotic work indicates some specific correlations that increase our understanding:

1. The dimeric nature of the receptor system allows the conclusion that tightening (shorter interfacial distances without geometric distortion) of an internal interface of the receptor induces increased stability of the receptor system.
2. The increased stability associated with the positive cooperativity is characterized by increased exothermicity and a cost in entropy.
3. A thermodynamic cycle establishes that increased stability of the receptor system when the ligand is bound necessarily leads to increased ligand-binding energy (ref. 13).

Fig. 2. Exploded view of the binding interaction between the glycopeptide antibiotics (in this case vancomycin) and the peptide ligand N-α-acetyl-Lys-(N-ε-acetyl)-D-Ala-D-Ala. Hydrogen bonds between the two are indicated by dotted lines. The binding is also promoted by hydrophobic interactions, notably of the Ala methyl groups to the aromatic rings of the antibiotic. The amide NH proton w2, mentioned in the text, is labeled.

Fig. 3. Peptide backbone of a glycopeptide antibiotic dimer, simultaneously bound to two molecules of a bacterial cell wall peptide precursor analogue (N-Ac-D-Ala-D-Ala). The binding of the N-Ac-D-Ala-D-Ala occurs with positive cooperativity, such that the dimer system is stabilized and shortens some of the distances at the central (dimer) interface, with an overall benefit in enthalpy and a cost in entropy.

4 M. S. Searle, G. J. Sharman, P. Groves, B. Benhamu, D. A. Beauregard, M. S. Westwell, R. J. Dancer, A. J. Maguire, A. C. Try, and D. H. Williams, J. Chem. Soc. Perkin Trans. 1 2781 (1996).
5 C. T. Calderone and D. H. Williams, J. Am. Chem. Soc. 123, 6262 (2001).
6 D. H. Williams, A. J. Maguire, W. Tsuzuki, and M. S. Westwell, Science 280, 711 (1998).
7 D. McPhail and A. Cooper, J. Chem. Soc. Faraday Trans. 93, 2283 (1997).
8 D. H. Williams, C. T. Calderone, and D. P. O'Brien, J. Chem. Soc. Chem. Commun. 1266 (2002).
9 M. Gonzalez, L. A. Bagatolli, I. Echabe, J. L. R. Arrondo, C. E. Argarana, C. R. Cantor, and G. D. Fidelio, J. Biol. Chem. 272, 11288 (1997).
10 D. C. Williams, D. C. Benjamin, R. J. Poljak, and G. S. Rule, J. Mol. Biol. 257, 866 (1996).
11 E. Freire, Proc. Natl. Acad. Sci. USA 96, 10118 (1999).
12 B. A. Johnson, E. M. Wilson, Y. Li, D. E. Moller, R. G. Smith, and G. Zhou, J. Mol. Biol. 298, 187 (2000).

Binding to Protein Receptors and the Use of Mass Spectrometry
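
The thermodynamic cycle invoked in point 3 of the preceding list can be written out for the antibiotic dimer, since the rest of this section relies on it. The notation is introduced here for illustration and is not taken from the cited work: A is the antibiotic monomer, L the ligand, $\Delta G_{\mathrm{dim}}$ and $\Delta G_{\mathrm{dim}}^{L}$ are the free energies of dimerization without and with bound ligand, and $\Delta G_{\mathrm{mono}}$ and $\Delta G_{\mathrm{dimer}}$ are the free energies of binding one ligand to the monomer and (per ligand) to the dimer.

```latex
\begin{aligned}
2\,\mathrm{A} + 2\,\mathrm{L}
  &\xrightarrow{\;\Delta G_{\mathrm{dim}}\;} \mathrm{A}_2 + 2\,\mathrm{L}
   \xrightarrow{\;2\,\Delta G_{\mathrm{dimer}}\;} (\mathrm{A{\cdot}L})_2 \\
2\,\mathrm{A} + 2\,\mathrm{L}
  &\xrightarrow{\;2\,\Delta G_{\mathrm{mono}}\;} 2\,\mathrm{A{\cdot}L}
   \xrightarrow{\;\Delta G_{\mathrm{dim}}^{L}\;} (\mathrm{A{\cdot}L})_2 \\
\Delta G_{\mathrm{dim}} + 2\,\Delta G_{\mathrm{dimer}}
  &= 2\,\Delta G_{\mathrm{mono}} + \Delta G_{\mathrm{dim}}^{L} \\
2\,\bigl(\Delta G_{\mathrm{dimer}} - \Delta G_{\mathrm{mono}}\bigr)
  &= \Delta G_{\mathrm{dim}}^{L} - \Delta G_{\mathrm{dim}}
\end{aligned}
```

Because free energy is a state function, both routes from the same reactants to the same product must give the same total. If the dimer is more stable with ligand bound ($\Delta G_{\mathrm{dim}}^{L} < \Delta G_{\mathrm{dim}}$), the right-hand side is negative, so each ligand binds more strongly to the dimer than to the monomer.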

From the above experiments, we can conclude that where the structure of a receptor undergoes tightening (reduced internal noncovalent distances) upon ligand binding, ligand binding is thereby enhanced. The properties of positively cooperative binding found above for a receptor dimer interface (Fig. 3) are equally applicable when the dimer interface is replaced by an interface that is within a monomeric receptor (Fig. 4). In each panel of Fig. 4, the ligand is represented as the upper molecule (a dipeptide) and the receptor as the lower structure (with illustration of only one set of its internal noncovalent interactions, in the form of two amide–amide hydrogen bonds). The tightening of noncovalent interactions (with exaggerated changes in bond lengths to illustrate the principle) occurs in Fig. 4B, where there is positively cooperative binding of ligand that is absent in Fig. 4A. Thus, in Fig. 4A and B, the proven reductions in dimer interfacial bond distances upon ligand binding with positive cooperativity are extrapolated to the monomeric receptor case. The physical basis for the tightening is that the matching fit of the ligand to the exposed binding site of the receptor causes, through the formation of the ligand/receptor noncovalent bonds, a reduction in the motions of the exposed part of the receptor (here a peptide backbone). Since it is motion that opposes bonding, such reductions in motion will be accompanied by bond shortening within the receptor (Fig. 4A → B).

Since the internal tightening of receptor structures upon positively cooperative ligand binding reduces their dynamic behavior, the extent to which such tightened receptor structures undergo NH → ND exchange of their backbone amide NHs upon exposure to D2O will be decreased. Such changes in exchange behavior can be conveniently monitored by mass spectrometry (refs. 14–16). A typical protocol, used to monitor H/D exchange in our laboratory, follows.

13 B. Bardsley and D. H. Williams, J. Chem. Soc. Chem. Commun. 2305 (1998).
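
Before the protocol, it may help to see how a measured mass is turned into a count of exchanged amide hydrogens. The sketch below is an illustration under stated assumptions, not the procedure of refs. 14–16: the function names are hypothetical, the per-site mass shift is taken as the deuterium minus protium mass difference, and the two-point correction (scaling between an undeuterated and a fully deuterated control) is a widely used form in H/D exchange mass spectrometry that is assumed here. The masses in the usage section are invented for illustration.

```python
# Mass difference between deuterium (2H) and protium (1H), in Da.
MASS_SHIFT_PER_D = 2.014102 - 1.007825


def deuterium_uptake(mass_exchanged, mass_undeuterated):
    """Apparent number of deuterons incorporated, from the mass increase."""
    return (mass_exchanged - mass_undeuterated) / MASS_SHIFT_PER_D


def corrected_uptake(mass_exchanged, mass_zero_control, mass_full_control,
                     n_exchangeable):
    """Uptake scaled between 0% and 100% deuteration controls.

    Assumed two-point correction: the observed shift is expressed as a
    fraction of the shift seen for a fully deuterated control, then
    multiplied by the number of exchangeable amide sites.
    """
    span = mass_full_control - mass_zero_control
    if span <= 0:
        raise ValueError("full-deuteration control must be heavier than zero control")
    return n_exchangeable * (mass_exchanged - mass_zero_control) / span


def protected_sites(uptake_free, uptake_bound):
    """Amide sites protected by ligand: uptake without ligand minus uptake with."""
    return uptake_free - uptake_bound


if __name__ == "__main__":
    # Hypothetical masses (Da) for one peptide with 10 exchangeable amides.
    m_zero, m_full = 1200.00, 1206.00
    free = corrected_uptake(1204.50, m_zero, m_full, n_exchangeable=10)
    bound = corrected_uptake(1203.00, m_zero, m_full, n_exchangeable=10)
    print(f"free: {free:.1f} D, bound: {bound:.1f} D, "
          f"protected: {protected_sites(free, bound):.1f}")
```

Expressing the shift as a ratio to the fully deuterated control is a design choice: on the assumption that sample and control lose label to the same fractional extent during analysis, the loss cancels in the ratio.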

Fig. 4. Schematic representation of a ligand (upper peptide backbone) binding to a receptor (below) (A) in the absence of cooperativity, (B) with positive cooperativity, (C) with negative cooperativity prior to enthalpy/entropy compensation, and (D) with negative cooperativity after enthalpy/entropy compensation. Where the tightened (B), or loosened (D), interactions are coupled to other interactions within the receptor system, they will be similarly affected.

An H/D Exchange Protocol

H/D exchange is typically initiated by dilution of 10 µl of a 3 mM solution of a receptor protein in 100 mM ammonium acetate buffer (pH 8.0) into 90 µl of 99.9 atom% excess D2O. The complex of the receptor protein with the appropriate ligand is formed by incubation of 0.3–3 mM receptor with a 10–20% molar excess of ligand at room temperature for more than 1 h prior to dilution into D2O. Solutions are maintained at room temperature for H/D exchange and allowed to exchange for the desired times. At appropriate intervals, 5-µl aliquots of the receptor solution are adjusted to pH 2.5 by the addition of 30 µl of chilled acidic quench solution. These aliquots are immediately cooled to 0°. The use of relatively acidic conditions and low temperatures at this stage minimizes the extent of ND → NH back exchange. A 10-µl aliquot is then loop injected for electrospray ionization mass spectrometry (ESI-MS) to determine the deuterium content of the receptor system, both in the presence and absence of ligand.

Locating the Regions of Structural Tightening in Receptors

The above procedure may indicate the tightening of a receptor system upon ligand binding and therefore indicates that the ligand-binding energy can be enhanced in this way. This was the case found in our laboratory for the binding of biotin to streptavidin (ref. 17). Specifically, 22 backbone amide NH protons per streptavidin are protected from H/D exchange upon biotin binding. Thus, tightening of the streptavidin structure upon the binding of biotin contributes to the binding affinity of biotin. Since the binding of biotin occurs to a streptavidin tetramer (which accommodates four molecules of biotin), it is important to determine where the tightening of the streptavidin tetramer occurs.

To determine where in the receptor system the structural tightening occurs, enzymatic digestion of the receptor is carried out. This can be achieved by two experiments involving pepsin digestion, of both the ligand-free and ligand-bound receptor. Pepsin digestion is used because this enzyme can function at pH 2.5, the pH at which back exchange of amide backbone ND → NH is minimized. The peptide fragments derived from the receptor in both experiments are then analyzed by liquid chromatography (LC)-ESI-MS. The relative deuterium contents of each set are determined from their molecular weights (in comparison with those of the corresponding peptides obtained in the absence of H/D exchange). The amino acid sequence of the receptor is of course typically known. Therefore, the structures of the peptides produced by pepsin digestion can be determined from their molecular weights, in combination with some sequence information derived from their collision-induced fragmentation. A protocol for achieving such digestion, and analyzing the products, follows. Previously published protocols are available (refs. 15, 16). The following protocol was used to show that the peptide backbone NHs that are protected upon the binding of biotin to streptavidin are widely distributed through the streptavidin (ref. 17). Thus the binding energy for biotin to streptavidin is widely delocalized.

14 V. Katta and B. T. Chait, J. Am. Chem. Soc. 115, 6317 (1993).
15 (a) Z. Zhang and D. L. Smith, Protein Sci. 2, 522 (1993). (b) D. L. Smith, Y. Deng, and Z. Zhang, J. Mass Spectrom. 32, 135 (1997).
16 F. Wang, R. W. Miles, G. Kicsa, E. Nieves, V. L. Schramm, and R. H. Angeletti, Protein Sci. 9, 1660 (2000).
17 D. H. Williams, E. Stephens, and M. Zhou, J. Mol. Biol. 329, 389 (2003).

A Protocol for Pepsin Digestion and Analysis of the Digest (Used in the Case of Streptavidin as the Receptor)

Pepsin digestions can be performed on-line by linking a digestion cartridge, made by packing a microbore guard column (1 × 20 mm) (Upchurch Scientific) with pepsin Porozyme media (Applied Biosystems), to a Rheodyne 7010 injector coupled with LC-MS. The protein solution, quenched to pH 2.5, is injected into the pepsin cartridge and the protein is digested for 3 min at 0°. The resulting peptide mixture is then infused at 100 µl min⁻¹ through a C18 reverse-phase peptide trap for 2 min, using ice-cold buffer (10 mM ammonium acetate, 2% acetic acid, pH 2.9).
When the injector is switched to inject mode, the peptide trap is subsequently placed in-line with the LC column (PepMap C18, 300 m  5 cm; LC-Packings, Dionex) and peptides are eluted with increasing organic concentration. An LC-Packings Ultimate capillary high-performance liquid chromatography (HPLC) (Dionex) can be used to generate the gradient (flow rate 4 l min1), e.g., solvent A 0.1% formic acid in H2O and solvent B 90% acetonitrile containing 9.95% H2O and 0.05% formic acid. The peptic peptides eluted between 3.5 and 9 min with a 5-min 20–50% B gradient, at which time the gradient is held at 50% B for 10 min. The column effluent is delivered directly to a nanoflow ESI probe held at 3 kV. For all experiments, the solvents, Rheodyne injectors, peptide trap, and HPLC column  are all immersed in an ice bath (0 ) to minimize back exchange with solvents. To account for deuterium gain or loss under quenched conditions,


two control samples are prepared. A "zero deuteration" control is prepared by diluting the protein solution directly into a 1:1 (v:v) mixture of deuterated buffer and quench buffer. A "full-deuteration" control is prepared by incubating streptavidin in 8 M urea-d4 in D2O at 55 °C for 2 h. Using the above protocol, the extent of deuterium loss in the peptic peptides was ca. 30–50%, consistent with previous reports.16 The deuterium content of each peptide can be calculated after correction for back exchange, as described previously.15

Identification of Peptides from Pepsin Digestion

Peptides can be sequenced by LC-MS/MS following pepsin proteolysis under conditions identical to those used for the deuterium exchange experiments, except that D2O is omitted. Switching between MS and MS/MS can be achieved automatically, triggered by the detection of specific peptide ions entered into MassLynx software as a peak list. Argon is used as the collision gas and collision energies from 32 to 35 eV are typically applied.

Evidence that Enzymes Derive Catalytic Efficiency by Tightening Their Structures to the Greatest Degree in the Transition State

The concept that when a small molecule (L) binds to a protein (P), binding energy of L to P is derived by tightening (contracting) the structure of P, has potential implications for enzyme catalysis. If enzymes exploit this effect to derive binding energy of the substrate and product, then the enzyme structure should be contracted when substrate and product are bound. However, if enzymes exploit this effect to increase catalytic efficiency, then enzyme structures should be contracted to the greatest extent in the transition state for reaction. This last point follows since, for efficient catalysis, the free energy of the substrate transition state–enzyme system must be lowered to the greatest degree.

Thermodynamic Evidence for Better Packing of Enzymes in the Transition States

A reaction [S → P, Eq. (2)] catalyzed by an enzyme (E) benefits, relative to the reaction in free solution, because the adverse entropy of the reaction in free solution is reduced by the preorganization of the catalytic groups in relation to the substrate (S).18 Catalysis will also be promoted

18 A. Fersht, "Structure and Mechanism in Protein Science," p. 362. W. H. Freeman and Co., New York, 1999.


if the enzyme binds the transition state (S#) for reaction with positive cooperativity. According to the model presented here, such positively cooperative binding will give a benefit in enthalpy and a cost in entropy due to better packing within the enzyme structure in the transition state [E·S#, Eq. (2)]. The prediction is therefore that this latter cost in entropy will offset the advantage of the preorganization, but that a large benefit in enthalpy should be apparent in enzyme catalysis.

$$\mathrm{E + S \rightarrow E{\cdot}S \rightarrow E{\cdot}S^{\#} \rightarrow E{\cdot}P \rightarrow E + P} \qquad (2)$$

The extent to which enzyme catalysis is provided by any overall improvement in bonding (ΔΔH#) is available from the difference between the enthalpy of activation for the enzyme-catalyzed reaction (ΔH#cat) and for the spontaneous reaction in the absence of enzyme (ΔH#non). The cost or benefit to catalysis in terms of an overall change in order (TΔΔS#) is available from the difference of the parameters TΔS#cat and TΔS#non for the same two processes. These differences are available for the reaction catalyzed by cytidine deaminase.19 The effect of enzyme catalysis is to increase the reaction rate by 10¹⁶ M⁻¹, due to a benefit in enthalpy (ΔΔH#) of 84 kJ mol⁻¹, and a benefit in entropy (TΔΔS#) of only 7 kJ mol⁻¹. From the Boltzmann equation, 5.7 kJ mol⁻¹ benefits a reaction rate at room temperature by a factor of 10¹. Thus, the benefit of improved bonding to the enzyme-catalyzed reaction is a factor of ca. 10¹⁵, whereas the benefit due to improved order is only a factor of ca. 10¹.

The very large overall improvement in bonding in the transition state of the enzyme-catalyzed reaction is consistent with catalysis being derived to a major extent by a tightening of the enzyme structure, induced by the transition state of the substrate. The above data are therefore interpreted to reflect to an important degree the increased bonding within the enzyme on passing from its free to transition-state–bound form. The expectation is that much of this increased bonding within the enzyme will be derived on passing from the enzyme/substrate complex to the form that is bound by the transition state of the substrate. This is because enzymes have evolved to bind the transition states (S#) of substrates more strongly than the substrates themselves. One way to effect this is through a greater degree of tightening of the enzyme upon binding the substrate transition state than upon binding the substrate.
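The conversion from an energy difference to a rate factor used above can be checked with a short calculation. The following Python sketch assumes a room temperature of 298.15 K (the text gives no exact value) and uses hypothetical helper names; it illustrates the arithmetic only and is not taken from the cited work.

```python
import math

R_KJ = 8.314462618e-3   # gas constant, kJ mol^-1 K^-1
T_ROOM = 298.15         # assumed room temperature, K

def log10_rate_factor(benefit_kj_per_mol, temperature=T_ROOM):
    """Orders of magnitude in rate gained from an activation-energy benefit.

    Uses the Boltzmann factor exp(dE / RT), expressed as a power of ten.
    """
    return benefit_kj_per_mol / (math.log(10.0) * R_KJ * temperature)

# Energy worth one order of magnitude in rate (about 5.7 kJ/mol at 298 K)
kj_per_decade = math.log(10.0) * R_KJ * T_ROOM

# Values quoted in the text for cytidine deaminase
enthalpy_decades = log10_rate_factor(84.0)  # bonding (enthalpy) benefit
entropy_decades = log10_rate_factor(7.0)    # order (entropy) benefit

print(f"kJ/mol per factor of 10: {kj_per_decade:.2f}")
print(f"enthalpy benefit: 10^{enthalpy_decades:.1f}")
print(f"entropy benefit:  10^{entropy_decades:.1f}")
```

Rounded to whole powers of ten, the two contributions reproduce the factors of ca. 10¹⁵ and ca. 10¹ quoted above.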
The enthalpic (bonding) component of this difference for cytidine deaminase19 is 30 kJ mol⁻¹, which establishes that there is an overall increase in bonding as the reaction proceeds from the enzyme-bound

19 M. J. Snider, S. Gaunitz, C. Ridgway, S. A. Short, and R. Wolfenden, Biochemistry 39, 9746 (2000).


ground state to the transition state. The data are consistent with an improvement in bonding within the enzyme during this transformation. The efficiency of enzyme catalysis can be measured in terms of the rate ratio of the enzyme-catalyzed and non-catalyzed reactions (kcat/knon). The enthalpic (bonding) component of this difference has been measured by Wolfenden and co-workers20,21 for reactions catalyzed by six enzymes. The work establishes that these reactions are accelerated largely as a result of a more favorable enthalpy of activation (in comparison to the reaction in free solution). These contributions are 33 (chorismate dismutase), 66 (chymotrypsin), 63 (staphylococcal nuclease), 80 (bacterial α-glucosidase), 93 (urease), and 143 (yeast OMP decarboxylase) kJ mol⁻¹. Since these differences are derived by comparison of reactions that both involve the transition state of the substrate, they must largely reflect bonding changes in the surroundings of this transition state structure. They give rise, when considered in isolation from other variables, to rate enhancements of ca. 10⁶, 10¹², 10¹¹, 10¹⁴, 10¹⁶, and 10²⁵ s⁻¹, respectively. These enthalpy changes are so large that widespread improvements in bonding in the enzyme structures, induced by the transition state of the substrate, offer a probable explanation.

Evidence for Better Packing and Reduced Dynamic Behavior from Backbone Amide NH → ND Exchange

Although there is evidence in the literature that some enzymes become better packed in the transition state for reaction, the point that this must improve catalytic efficiency has not been made clear.
The key data come from experiments carried out by Wang et al.16,22 Hydrogen/deuterium (H/D) exchange into backbone amide bonds in hypoxanthine-guanine phosphoribosyltransferase (HGPRT)22 and purine nucleoside phosphorylase16 was used to compare the dynamic properties of the enzymes alone, in forms with bound reactant/product, and in forms with bound transition-state analogues. For both enzymes, it was found that the rate and extent of deuterium incorporation decreased when the reactant/product was bound, and decreased to an even greater extent when the transition state analogue was bound. Thus, the greatest reduction in dynamic motion of the enzymes is caused by the transition state analogue. The effects are large: the binding of the transition state analogue protects 34 peptide backbone NHs from exchange in the case of HGPRT, and 27 peptide backbone

20 A. Radzicka and R. Wolfenden, Science 267, 90 (1995).
21 R. Wolfenden, M. Snider, C. Ridgway, and B. Miller, J. Am. Chem. Soc. 121, 7419 (1999).
22 F. Wang, W. Shi, E. Nieves, R. H. Angeletti, V. L. Schramm, and C. Grubmeyer, Biochemistry 40, 8043 (2001).


NHs are similarly protected in the case of purine nucleoside phosphorylase. The reduced dynamic behavior of the enzymes goes hand in hand with improved noncovalent bonding within them. Our proposals indicate that for both enzymes binding energy is provided for the reactant/product through the enzymes becoming better packed. More importantly, even greater binding energy is provided for the transition state analogue when the enzyme packing is further improved.

Negatively Cooperative Binding of Ligands and Structural Loosening in Receptors

So far, we have presented the case that positively cooperative binding can cause the tightening of noncovalently bonded structures. Negatively cooperative binding is the converse of positively cooperative binding. Therefore, it should be associated with converse properties, i.e., a reduction of the noncovalent bonding efficiency within the receptor system, and an increase in its dynamic behavior. These consequences of negatively cooperative binding should occur with a cost in enthalpy and a benefit in entropy.

The physical model for an increase in receptor dynamics upon the exercise of negative cooperativity involves arguing via two hypothetically separated steps. The first step is that the ligand binds by making noncovalent bonds to the receptor whose formation demands distortion of the noncovalent bonds that previously existed within the receptor. That is, making simultaneously the two sets of bonds in the preferred geometry that would occur if each set were made alone is not possible. Thus, the making of the ligand/receptor bonds decreases the bonding efficiency of the noncovalent bonds within the receptor (Fig. 4C); there has been a cost in enthalpy. In the second step, we consider the dynamic consequence of this cost in enthalpy. The decrease in bonding within the receptor will result in an increase in its dynamic behavior, which will in turn cause a further cost in enthalpy (Fig. 4C → D).

We were encouraged that the model might have general applicability by a study of changes in the properties of tetrameric recombinant human tyrosine hydroxylase isoform 1 upon binding the natural cofactor (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin.23 The binding of the cofactor occurs with negative cooperativity, and this cofactor-bound form of the enzyme then shows a decreased resistance to limited tryptic proteolysis, as would be expected from a loosening of the enzyme structure.

23 T. Flatmark, B. Almas, P. M. Knappskog, S. V. Berge, R. M. Svebak, R. Chehin, A. Muga, and A. Martinez, Eur. J. Biochem. 262, 840 (1999).


Test of the Model for Negatively Cooperative Binding

The classic work of Monod, Wyman, and Changeux (MWC) showed that the binding of O2 to a "tense" form (T, populated before O2 binding) of the hemoglobin tetramer could force the T form toward a "relaxed" (R) form (Fig. 5).24 We note that in the MWC model, the O2 binding is defined as positively cooperative because much of the work required for the T → R conversion is effected by the first O2 to bind. Therefore, subsequently binding O2 molecules have the advantage of accessing a relatively high population of the R state, and bind with greater affinity.25 In summary, the behavior is described as positively cooperative because the later-filled sites have affinities greater than the site that is preferentially filled first. However, in terms of the definition used here, there is negatively cooperative binding between the interface that O2 can most favorably form to hemoglobin and the one that is presented by the available T state. The optimal binding of O2 is incompatible with the geometry existing in the T state. The receptor is therefore forced to a modified geometry

Fig. 5. MWC model for the binding of the first molecule of ligand (L) to a tetrameric protein existing in tense (T) and relaxed (R) forms.

24 J. Monod, J. Wyman, and J.-P. Changeux, J. Mol. Biol. 12, 88 (1965).
25 See, for example, A. Fersht, "Structure and Mechanism in Protein Science," p. 292. W. H. Freeman and Co., New York, 1999.


when O2 is bound, and this modified geometry is that existing in the R state. The negatively cooperative binding does indeed force a loosening of the T state of the tetramer, through the breaking of intersubunit salt bridges.26 However, and more importantly, widespread structural changes in the T to R transition seem possible. All noncovalent interactions within a receptor system that are coupled with negative cooperativity to ligand binding should loosen.

To test this conclusion we determined27 by ESI-MS the change in dynamic behavior of the hemoglobin tetramer polypeptide backbone when it binds O2. Through the binding of oxygen, a further seven or eight exchangeable amide hydrogens per α-chain (5.2–6% of the total number) and a further 16 per β-chain (11.4% of the total number) were exposed to solvent exchange. Thus, since there are two α-chains and two β-chains per hemoglobin tetramer, complete saturation of the hemoglobin tetramer by O2 binding results in an increase of 46–48 in the number of backbone NHs undergoing exchange. The dramatic increase in amide NH exchange is in agreement with the predictions regarding the changes associated with negatively cooperative binding. These increases in dynamic behavior of the amide backbones of the hemoglobin subunits are not evident from previous X-ray studies.26 Presumably, the changes in dynamic behavior, which are important for an understanding of binding interactions, are masked by crystal packing forces.

The requirement that negative cooperativity, as exercised in this system, should be accompanied by enthalpy/entropy compensation (in the sense of a benefit in entropy and a cost in enthalpy; see earlier) is also satisfied. Thus, as O2 binding promotes the T to R transition, there should be an uptake of heat by, and increase in disorder within, the hemoglobin tetramer.
In the case of trout hemoglobin,28 starting from the T state, O2 binding occurs with an exothermicity very near to zero, and a favorable TΔS term of +21 kJ mol⁻¹. In contrast, O2 binding to the R state is exothermic (ΔH = −32 kJ mol⁻¹) and slightly unfavorable in entropy (TΔS = −3 kJ mol⁻¹). The difference in the two sets of thermodynamic data reflects the way in which O2 binding promotes the disordering of the T state tetramer toward the R state tetramer. All the properties demanded by the model for negatively cooperative binding are fulfilled.
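The enthalpy/entropy compensation in these figures can be made explicit with ΔG = ΔH − TΔS. The Python sketch below is illustrative only: it takes the "very near to zero" exothermicity of the T state as exactly 0 kJ mol⁻¹, assumes the signs implied by the wording (exothermic means negative ΔH, unfavorable entropy means negative TΔS), and uses hypothetical variable names.

```python
def binding_free_energy(delta_h, t_delta_s):
    """Gibbs free energy of binding, dG = dH - T*dS (all in kJ/mol)."""
    return delta_h - t_delta_s

# Values quoted in the text for O2 binding to trout hemoglobin (kJ/mol).
# dH for the T state is "very near to zero"; 0.0 is an assumption.
t_state = {"dH": 0.0, "TdS": 21.0}
r_state = {"dH": -32.0, "TdS": -3.0}

dg_t = binding_free_energy(t_state["dH"], t_state["TdS"])
dg_r = binding_free_energy(r_state["dH"], r_state["TdS"])

# Binding to T relative to binding to R
enthalpy_cost = t_state["dH"] - r_state["dH"]       # positive: less heat released
entropy_benefit = t_state["TdS"] - r_state["TdS"]   # positive: more disorder gained

print(f"dG(T) = {dg_t:.0f} kJ/mol, dG(R) = {dg_r:.0f} kJ/mol")
print(f"enthalpy cost = {enthalpy_cost:.0f} kJ/mol")
print(f"entropy benefit (T*dS) = {entropy_benefit:.0f} kJ/mol")
```

Under these assumptions the entropic gain on binding to the T state recovers most, but not all, of the enthalpy that is lost relative to binding to the R state.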

26 M. F. Perutz, A. J. Wilkinson, M. Paoli, and G. G. Dodson, Annu. Rev. Biophys. Biomol. Struc. 27, 1 (1998).
27 M. Zhou, Certificate of Post Graduate Study, University of Cambridge, June 2002.
28 A. Colosimo, M. Coletta, G. Falcioni, B. Giardina, S. J. Gill, and M. Brunori, J. Mol. Biol. 160, 531 (1982).


Conclusion

Furthering our understanding of the noncovalent interactions that provide the binding energies involved in protein folding, the binding of ligands to receptors, and the binding of transition states to enzymes, is an important goal. In simple systems (glycopeptide antibiotics), NMR chemical shift changes (in conjunction with thermodynamic cycles) show that a receptor system provides ligand binding energy when the receptor system becomes more compact upon ligand binding. This "tightening" of the receptor system is a manifestation of positively cooperative binding. It is accompanied by a favorable contribution to the change in enthalpy, and an unfavorable contribution to the change in entropy. Electrospray mass spectrometry (ESI-MS) shows that a high-affinity receptor (streptavidin) tightens its structure upon binding biotin; the binding affinity of biotin is thereby increased.

Where investigated, enzymes show analogous marked tightening of their structures (manifested by reduced dynamic behavior) in the transition state for reaction. This tightening of the enzyme structure therefore improves the catalytic efficiency of the enzymes. Where data are available, thermodynamic parameters for catalysis are consistent with this conclusion. Thus, it appears that catalysis is promoted through marked (and relatively widespread) improvements in noncovalent bonding in the transition state system. In this conclusion (the delocalization of binding energy), we can see why enzymes are relatively large.

Negatively cooperative binding should exhibit the converse properties of structural loosening, more dynamic behavior, an unfavorable contribution to the change in enthalpy, and a favorable contribution to the change in entropy. These properties are exhibited when O2 binds with negative cooperativity to the T form of hemoglobin.

Acknowledgments

We thank EPSRC (R.Z.), BBSRC (E.S.), and Churchill College, Cambridge (M.Z.) for financial support.


[2] Structural Interpretation of pH and Salt-Dependent Processes in Proteins with Computational Methods

By Bertrand García-Moreno E. and Carolyn A. Fitch

Introduction

It is widely recognized that electrostatics plays a significant role in modulating the structure, the stability, and the function of proteins. The study of protein electrostatics continues to be of interest because it is of paramount biological importance: most biochemical processes are governed by electrostatics. One unique and particularly attractive feature of the problem of protein electrostatics is the dual possibility of measuring electrostatic contributions to an equilibrium process experimentally, and of calculating these contributions with structure-based computational methods based on realistic physical models. Another unique feature is that it is possible to dissect global electrostatic contributions experimentally into contributions by individual ionizable groups. This ability to interpret physical and structural origins of measured electrostatic energies in microscopic detail is not available for any other type of noncovalent contribution to function. The most direct and exact connection that can be established between structure, function, and energy of proteins is through structure-based calculations of electrostatic energy.1

Electrostatic energies are central to the correlation of structure and function of proteins. Key biochemical processes of living systems are governed by electrostatics. Examples include the following:

1. The capture of photons (e.g., reaction centers) and their transduction into proton (H⁺) gradients (e.g., bacteriorhodopsin) or mechanical energy (e.g., photoactive yellow protein).
2. The conversion of H⁺ gradients into high-energy bonds (e.g., ATPase) or into mechanical energy (e.g., flagellar motors).
3. The establishment and maintenance of ion gradients (e.g., K⁺ and Cl⁻ channels).
4. e⁻ conduction (e.g., light harvesting complexes) and redox chemistry (e.g., all redox centers).
5. Role of Ca²⁺ as a secondary messenger (e.g., calmodulin).
6. Catalysis (e.g., all enzymes).

1 A. Warshel and M. Levitt, J. Mol. Biol. 103, 227 (1976).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

[2]

electrostatic calculations with proteins

21

7. Recognition (e.g., proteins in their interactions with nucleic acids) and binding (e.g., most associating systems).

Electrostatics also modulates important physicochemical properties of proteins, such as solubility, stability, and dynamics. It governs conformational transitions of proteins that are triggered physiologically by changes in pH and in the ionic composition. This includes (1) the allosteric transitions that regulate the affinity of human hemoglobin for oxygen, (2) the assembly, disassembly, and activation of viruses, (3) the activation of bacterial toxins such as the diphtheria toxin, and (4) the energetics of aggregation in many systems, including amyloidogenesis. In all of these cases it is impossible to understand the structural basis of function without quantitative understanding of the contributions made by electrostatics.

Although high-resolution structures are available for many examples of proteins involved in the processes listed above, the electrostatic contributions to function and energetics remain poorly understood. The gap between structure and function cannot be bridged without quantitative understanding of the reciprocal relationship between structure and energy. This is possible for processes governed by electrostatics, and it involves the interplay between experimental measurement of electrostatic energies and structure-based calculations to understand their microscopic, physical origins. Specifically, to connect structure with function in depth in processes governed by electrostatics, it is necessary to know the pKa values or the redox potentials that describe the affinities for H⁺ and e⁻, respectively. pKa values and redox potentials can be measured experimentally with very high accuracy and precision in some proteins. They can also be calculated from structure with models based on physical principles.
The ultimate goal of a study of electrostatic contributions to protein structure and function is to determine the pKa values or redox potentials, to understand their molecular determinants, and to understand how the values are linked to the conformation and dynamics of a protein. Computational methods for structure-based calculation of electrostatic energies are necessary in the analysis of structure–function relationship for other reasons. Experiments can carry the dissection of the electrostatic contributions to an equilibrium process only so far. Beyond a certain point, structure-based calculations are needed to interpret the structural origins of the measured energetics in detail. Calculations also provide specific, testable hypotheses. An even more important role of computational methods is to provide structural insight in cases in which experimental measurements are not possible. This is the situation, for example, with the integral membrane proteins that are involved in energy transduction modulated by electrostatics, such as bacteriorhodopsin, the photoreaction center,


ATPase, cytochrome oxidase, etc. The experimental methods for measurement of pKa values or stability in water-soluble proteins cannot be applied to these membrane proteins. In these cases, a theory that has been calibrated carefully against experimental data with other simpler proteins can be used to elucidate relationships between structure, energy, and function.

This chapter reviews some aspects of the theory and models available for calculation of electrostatic energies and pKa values in proteins. The emphasis is on the calculation of pKa values and H⁺ binding/release reactions because it is a more general and common problem than the calculation of redox potentials and e⁻ affinities (the calculations of pKa values and of redox potentials are analogous). We emphasize from the outset that despite the widespread dissemination and popularity of several sophisticated algorithms for structure-based electrostatic calculations, the calculation of thermodynamic parameters from structure remains a difficult and daunting proposition. The accuracy of the available methods is sufficient to allow calculation of some thermodynamic quantities, but not all. In this chapter we describe the types of analyses of electrostatic effects in proteins that are possible with judicious application of structure-based calculations in concert with experimental studies. Examples are given of the type of insight about the character of electrostatic effects in proteins that has emerged recently from the joint application of calculations and experiments. Problems that currently limit the reliability of these computational methods are also discussed.

Experimental Measurement of H⁺ and Salt-Linked Thermodynamics

An excellent and extensive monograph on the thermodynamic concepts germane to the analysis of pH and salt-dependent equilibria of proteins is available.2 Here we provide a brief outline of the nature of thermodynamic information that must be obtained to elucidate electrostatic contributions to stability and function. The experimental dissection of electrostatic contributions to an equilibrium process begins with the study of salt and pH effects. The macromolecular equilibrium reaction A ⇌ B will be pH dependent if protons (H⁺) are bound or released upon conversion of A to B. If, for example, H⁺ are bound preferentially by B, then the conversion of A to B will be promoted by low pH. Preferential H⁺ binding by one of the states is determined by the pKa values of ionizable residues. The pKa values are influenced by a large variety of molecular factors, including interactions with other

2 M. Schaefer, H. W. van Vlijmen, and M. Karplus, Adv. Protein Chem. 51, 1 (1998).


charges, with polar atoms, with polarizable atoms, and with water molecules. In the structural analysis of pH and salt-dependent processes of proteins it is necessary to understand the molecular determinants of pKa values. It is virtually impossible to isolate these factors experimentally. This can be done only with structure-based pKa calculations.

Electrolyte ions can interact with the charged and polar atoms of proteins; therefore they also modulate electrostatic effects and pKa values. For example, the energy of coulombic interactions between pairs of charges will decrease with increasing salt concentration because ions can screen charges effectively. Thus the salt dependence of the equilibrium can be used as a diagnostic, to identify certain types of electrostatic contributions to pKa values and to equilibrium processes in general. However, as discussed later, it has become increasingly clear that some of the early ideas about the origins of the salt sensitivity of equilibrium constants and their utility as a diagnostic for identifying electrostatic contributions to an equilibrium process must be revised.

Effects of Salts and pH on Phenomenological Equilibrium Constants

The analysis of the effects of pH on experimental equilibrium constants is performed with Wyman's linkage relationships. The number of H⁺ that are bound or released during an equilibrium process, $\Delta\nu_{\mathrm{H^+}}$, can be obtained from the slope of the pH dependence of the equilibrium constant3,4:

$$\frac{d \ln K_{eq}}{d \ln \mathrm{H^+}} = \Delta\nu_{\mathrm{H^+}} \qquad (1)$$
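As an illustration of Eq. (1), the slope can be evaluated numerically for a toy two-state system. The Python sketch below is a hypothetical example, not a published procedure: it assumes a single ionizable site whose pKa differs between states A and B (the values 3 and 8 are chosen for illustration only), treats H⁺ activity as concentration, and uses invented function names.

```python
import math

def fraction_protonated(pH, pKa):
    """Average number of H+ bound to one site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

def ln_keq(pH, pKa_A, pKa_B, ln_k0=0.0):
    """ln K for A <-> B when one site has pKa_A in state A and pKa_B in state B.

    K = K0 * (1 + [H+]/Ka_B) / (1 + [H+]/Ka_A), with K0 the constant for
    the fully deprotonated forms (ln_k0 = 0 is an arbitrary choice).
    """
    h = 10.0 ** (-pH)
    ka_a = 10.0 ** (-pKa_A)
    ka_b = 10.0 ** (-pKa_B)
    return ln_k0 + math.log(1.0 + h / ka_b) - math.log(1.0 + h / ka_a)

def linkage_slope(pH, pKa_A, pKa_B, step=1e-4):
    """Central-difference estimate of d ln K / d ln[H+] at the given pH."""
    d_ln_k = ln_keq(pH + step, pKa_A, pKa_B) - ln_keq(pH - step, pKa_A, pKa_B)
    d_ln_h = -math.log(10.0) * (2.0 * step)  # ln[H+] = -ln(10) * pH
    return d_ln_k / d_ln_h

pH = 5.5
slope = linkage_slope(pH, pKa_A=3.0, pKa_B=8.0)
# Direct count of H+ taken up on going from A to B at this pH
delta_nu = fraction_protonated(pH, 8.0) - fraction_protonated(pH, 3.0)

print(f"slope of ln K vs ln[H+]: {slope:.4f}")
print(f"H+ bound by B minus A:   {delta_nu:.4f}")
```

In this model the numerical slope and the direct difference in H⁺ bound agree, which is the content of the linkage relationship.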

Another way of obtaining the net differential H⁺ binding is by direct measurements with potentiometric methods.5 For example, in the case of protein folding, $\Delta\nu_{\mathrm{H^+}}$ refers to the moles of H⁺ bound preferentially to either the native (N) or the unfolded (U) states. $\Delta\nu_{\mathrm{H^+}}$ can also be obtained by using Eq. (1) to analyze the pH dependence of stability measured by chemical denaturation data assuming a two-state model N ⇌ U. $\Delta\nu_{\mathrm{H^+}}$ can also be obtained as the difference between H⁺ binding curves measured separately for the N and for the U states with potentiometric methods.5

The effects of salt on equilibrium processes are considerably more difficult to interpret. Note that in the case of pH effects, the $\Delta\nu_{\mathrm{H^+}}$ obtained

3 J. Wyman and S. J. Gill, "Binding and Linkage." University Science Books, Mill Valley, CA, 1990.
4 C. Tanford, J. Mol. Biol. 39, 539 (1969).
5 S. T. Whitten and B. García-Moreno E., Biochemistry 39, 14292 (2000).


with Eq. (1) can be interpreted unequivocally in terms of preferential H⁺ binding because H⁺ interact with proteins only through site-specific binding interactions. Considerable care is required in the application of equivalent expressions to analyze the dependence of equilibrium constants on salt concentration. There are different modes of interaction between ions and proteins,6 thus the slope of the salt dependence of the equilibrium constant represents a complex mixture of ionic effects on the thermodynamics of the process. Site-specific ion binding, nonspecific screening of coulombic interactions, and even effects of ions on solvation properties of macromolecules are all reflected in the measured equilibrium constants and in their salt dependence, and it is virtually impossible to deconvolute these different types of contributions experimentally. For this reason the structural interpretation of the meaning of $\Delta\nu_{ions}$ must proceed with great care.6 Structure-based calculations can be useful to identify the component that is related to the ionic strength effect.

Microscopic Origins of pH Effects: pKa Values

$\Delta\nu_{\mathrm{H^+}}$ is a global or macroscopic descriptor of the effects of pH on equilibrium constants. It can be dissected into microscopic components reflecting contributions by individual ionizable residues. Toward this end, it is necessary to know the pKa values of individual ionizable sites. pKa values can be measured experimentally for histidines with one-dimensional nuclear magnetic resonance (NMR) spectroscopy methods,7 and with multidimensional methods for other ionizable residues.8 The pKa of an acid describes the process AH ⇌ A⁻ + H⁺, and for a base it describes BH⁺ ⇌ B + H⁺. The most notable difference between the dissociation of acids and bases is that for acids the dissociated form is charged and for bases it is neutral. Another difference is that the dissociation of the acid results in separation of charge and the dissociation of a base does not.
For macromolecular equilibria to be pH sensitive, the pKa values of some of the ionizable residues must be different in the different macromolecular states in equilibrium. This is shown graphically in Fig. 1 for the process A ⇌ B. The H⁺ titration curves of an ionizable residue that titrates with a pKa = 3 in state A of the macromolecule and with pKa = 8 in state B are shown in Fig. 1. The difference in the pKa values in states A and B determines the pH dependence of the equilibrium between these two states.

6 B. García-Moreno E., Methods Enzymol. 240, 645 (1994).
7 S. P. Edgecomb and K. P. Murphy, Proteins Struct. Funct. Genet. 49, 1 (2002).
8 W. R. Forsyth, J. J. Antosiewicz, and A. D. Robertson, Proteins Struct. Funct. Genet. 48, 388 (2002).


Fig. 1. Simulation of the H⁺ titration curves of an ionizable residue that titrates with pKa = 3 when the macromolecule is in state A and with pKa = 8 when the macromolecule is in state B. The pH dependence of the free energy difference between states A and B, plotted with reference to the right ordinate, was calculated by integration of the area between the two titration curves with Eq. (2).
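The simulation described in the caption to Fig. 1 can be sketched numerically. The following Python example is a minimal illustration under stated assumptions, not the authors' code: it models each titration curve with the Henderson–Hasselbalch expression for a single site, integrates the area between the curves from pH 0 to 14 with the trapezoid rule, and reports only the magnitude of the free energy (the sign depends on the direction chosen for the reaction). Function and variable names are hypothetical.

```python
R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
T_25C = 298.15         # 25 degrees C in kelvin

def protons_bound(pH, pKa):
    """H+ bound to a single site at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

def area_between_curves(pKa_A, pKa_B, pH_low=0.0, pH_high=14.0, n=14000):
    """Trapezoid-rule integral of (H+ bound in B) - (H+ bound in A) over pH."""
    step = (pH_high - pH_low) / n
    total = 0.0
    for i in range(n + 1):
        pH = pH_low + i * step
        diff = protons_bound(pH, pKa_B) - protons_bound(pH, pKa_A)
        weight = 0.5 if i in (0, n) else 1.0  # half weight at the end points
        total += weight * diff * step
    return total

# pKa = 3 in state A and pKa = 8 in state B, as in Fig. 1
area = area_between_curves(pKa_A=3.0, pKa_B=8.0)

# Magnitude of the maximal free energy difference, kcal/mol
dg_max = 2.303 * R_KCAL * T_25C * area

print(f"area between curves: {area:.3f} pH units")
print(f"maximal |dG|: {dg_max:.2f} kcal/mol")
```

Over a pH range wide enough to contain both transitions, the area approaches the difference between the two pKa values (5 units here), so the result is close to 5 × 1.36 kcal/mol.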

The effects of pH on the equilibrium can be obtained exactly by numerical integration of the area between the two titration curves with Eq. (2). The ΔG° vs. pH curve calculated with Eq. (2) is also shown in Fig. 1.

$$\Delta G^{\circ} = 2.303RT \int_{\text{low pH}}^{\text{high pH}} \left( \nu_{\mathrm{H^+}}^{A} - \nu_{\mathrm{H^+}}^{B} \right) d\,\mathrm{pH} \qquad (2)$$

The pH-dependent ΔG° describes, in a completely model-independent way, the manner in which the free energy difference between states A and B is dependent on pH. The total area between the two titration curves in Fig. 1 represents the maximal ΔG°. It can be calculated by converting pKa values into ΔG° using ΔG° = −RT ln Keq, then taking a difference. At 25 °C, ΔG° = pKa × 1.36 kcal/mol.

Continuum Methods for Structure-Based Electrostatic Calculations

Several computational methods for structure-based calculation of electrostatic energies and pKa values in proteins are available, differing primarily in the amount of microscopic detail that is treated explicitly in the calculation, and in the accuracy and precision of the thermodynamic quantities that can be calculated. Atomistic models in which all atoms of both protein and water are represented explicitly represent one end of


the spectrum. All-atom models can provide clear physical insight into the origins of electrostatic energies. However, in general, they are not yet useful for pKa calculations owing to problems with convergence and with inaccurate handling of long-range electrostatic interactions.9 The more useful models for structure-based pKa calculations in proteins are based on the continuum approximation, where the concept of a dielectric constant (ε) is invoked to account implicitly for the polarizability of protein and water. In general, the ability of continuum models to obtain reliable electrostatic energies increases as microscopic detail is replaced by dielectric constants. However, this is achieved at the expense of a rigorous description of the physical origins of the energies.

One example of a continuum model is the popular method based on the numerical solution of the Poisson–Boltzmann equation by the method of finite differences (FDPB).10 This is a continuum method in which a considerable amount of microscopic detail is retained. The protein dipole–Langevin dipole model (PDLD) by Warshel and coworkers11 is another example of a model in which dielectric-like constants are invoked to account for some forms of polarizability. However, the PDLD model has considerable microscopic character, and in its most useful version, treats explicitly the contributions by dynamics to the protein dielectric effect. The PDLD model is discussed in detail elsewhere in this volume. The recent review by Schutz and Warshel has an excellent critical discussion of the physical basis, shortcomings, and unique features of different models for electrostatic calculations in proteins.12

Thermodynamic Cycle Used to Calculate pKa Values

The calculation of pKa values is a problem in electrostatics because it involves calculation of the difference in energy between charged and neutral forms of ionizable groups.13 The actual thermodynamic cycle that is used in pKa calculations is shown in Fig. 2.
The horizontal arrows in this figure represent ionization reactions. The free energy of the reaction at the bottom of the cycle is determined by the pKa value of the ionizable group in the protein ($pK_{a,i}^{prot}$). The free energy of the reaction at the top of the cycle is determined by the pKa value measured experimentally in the model compound ($pK_{a,i}^{model}$). The vertical arrows describe the free energies to transfer the model compound, either in the neutral ($q = 0$) or ionized ($q = 1$) states, from water into the protein environment. The largest spheres in this figure represent the protein. The large circles represent the ionizable groups. They can bear positive or negative charges. The smallest circles represent the background polar atoms. These are treated as point partial charges in some of the models. In most models the surface of the protein represents a boundary between regions with different polarizabilities: water, which has a dielectric constant $\varepsilon_{H_2O} = 78.5$, and protein, which has a lower dielectric constant $\varepsilon_{in}$. The calculation of pKa values consists of calculating correction factors to account for the effects of the macromolecular milieu on the pKa value measured experimentally in the model compounds used as reference. The transfer free energies that are calculated correspond to the electrostatic contributions to the ionization process in the protein:

$$\Delta G_{elec,i} = \Delta G^{tr}_{i,q=1} - \Delta G^{tr}_{i,q=0} = \Delta G^{prot}_{a,i} - \Delta G^{model}_{a,i} \qquad (3)$$

Equation (4) shows how these terms are used to modify the $pK_{a,i}^{model}$:

$$pK^{prot}_{a,i} = pK^{model}_{a,i} + \frac{\Delta G^{tr}_{i,q=1} - \Delta G^{tr}_{i,q=0}}{2.303RT} = pK^{model}_{a,i} - \frac{z_i \, \Delta G_{elec,i}}{2.303RT} \qquad (4)$$

Fig. 2. Thermodynamic cycle used in the structure-based calculation of pKa values. The largest circles in the bottom side of this cycle represent the protein. The ionizable species are represented as large circles. The smaller circles represent polar atoms. The horizontal reactions represent the ionization of a model compound in water (top) or of an ionizable group in the protein (bottom). The vertical reactions represent the transfer of an ionizable group from water to protein in the neutral state ($q = 0$, left) and in the charged state ($q = 1$, right).

9 G. S. Del Buono, F. E. Figueirido, and R. M. Levy, Proteins Struct. Funct. Genet. 20, 85 (1994).
10 I. Klapper, R. Hagstrom, R. Fine, K. Sharp, and B. Honig, Proteins Struct. Funct. Genet. 1, 47 (1986).
11 A. Warshel and S. T. Russell, Q. Rev. Biophys. 17, 283 (1984).
12 C. N. Schutz and A. Warshel, Proteins Struct. Funct. Genet. 44, 400 (2001).
13 A. Warshel, Biochemistry 20, 3167 (1981).

Energy Terms Included in pKa Calculations

In the continuum method for pKa calculations with the FDPB model, two different energy terms are included in the calculation of the transfer free energies shown in Fig. 2.14 One is the free energy of pairwise coulombic interactions between titratable sites in the protein, $\Delta G_{ij}$, which in kcal/mol (with $r_{ij}$ in Å) can be expressed as

$$\Delta G_{ij} = \frac{332 \, z_i z_j}{r_{ij} \, \varepsilon} \qquad (5)$$

The other term is the self-energy, $\Delta G_{ii}$. The self-energy in turn has two components. The first is the Born or reaction field energy, $\Delta G_{Born}$, which describes the differences in the self-energy of each charge in water and in the protein owing to their different polarizabilities. The general Born expression in Eq. (6) illustrates the overall magnitude of Born energies (kcal/mol) and their dependence on the dielectric properties of the system. This expression describes the difference in the self-energy of an ion of approximate radius $r_{cav}$ in water with electrolyte ions and in a material of polarizability equivalent to $\varepsilon_{in}$.

$$\Delta G_{Born} = \frac{332 \, z_i^2}{2 r_{cav}} \left( \frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{H_2O} \, e^{\kappa r_{ij}}} \right) \qquad (6)$$

In this expression the term $e^{\kappa r_{ij}}$ captures the effects of the ionic double layer about the charged group on the electrostatic potential. The second component of the self-energy is referred to as the background energy, $\Delta G_{bg}$. The background energy represents the free energy of interaction between the charge of interest and the background partial charges used to describe polar atoms in the protein. The background energies are calculated with expressions similar to Eq. (5). Together these three free energy terms are used to modify the pKa values measured in model compounds:

$$pK^{prot}_{a,i} = pK^{model}_{a,i} - \frac{z_i}{2.303RT} \left( \Delta G_{ij} + \Delta G_{ii} \right) = pK^{model}_{a,i} - \frac{z_i}{2.303RT} \left( \Delta G_{ij} + \Delta G_{bg} + \Delta G_{Born} \right) \qquad (7)$$

14 D. Bashford and M. Karplus, Biochemistry 29, 10219 (1990).
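The way Eqs. (5)–(7) combine can be illustrated with a minimal Python sketch. It is not the FDPB procedure itself: the energies here come from the closed-form expressions above rather than from a numerical solution of the Poisson–Boltzmann equation, and the function names, the default temperature, and the example geometry (an Asp side chain 5 Å from a single positive charge) are assumptions made only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def coulomb_energy(z_i, z_j, r_ij, eps):
    """Eq. (5): pairwise Coulomb energy in kcal/mol, with r_ij in angstroms."""
    return 332.0 * z_i * z_j / (r_ij * eps)


def born_energy(z_i, r_cav, eps_in, eps_water=78.5, kappa=0.0, r_ij=0.0):
    """Eq. (6): Born (reaction field) energy in kcal/mol.

    kappa = 0 (the default) corresponds to zero ionic strength.
    """
    screening = math.exp(kappa * r_ij)
    prefactor = 332.0 * z_i ** 2 / (2.0 * r_cav)
    return prefactor * (1.0 / eps_in - 1.0 / (eps_water * screening))


def pka_in_protein(pka_model, z_i, dg_ij, dg_bg, dg_born, temp=298.15):
    """Eq. (7): shift the model-compound pKa by the summed energy terms."""
    return pka_model - z_i * (dg_ij + dg_bg + dg_born) / (2.303 * R_KCAL * temp)


# Hypothetical example: Asp (z = -1, model pKa 4.0) near one +1 charge,
# eps = 20, no background or Born contribution.
dg_pair = coulomb_energy(-1, +1, 5.0, 20.0)
pka_asp = pka_in_protein(4.0, -1, dg_pair, 0.0, 0.0)
```

With these sign conventions a favorable interaction with a positive charge lowers the pKa of an acid, whereas a positive Born energy raises it, as expected when a carboxylate is moved into a less polarizable environment.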


The calculation of the $\Delta G$ terms in the thermodynamic cycle and in Eq. (7) is not trivial. In general, for an electrostatic system, $\Delta G$ are calculated as the product of charge, $z$, times potential, $\phi$: $\Delta G = z \phi$. An example of one of the expressions used in the FDPB algorithm to calculate energy terms for a simple ionizable residue is

$$\Delta G^{prot}_{a,i} = \frac{1}{2} q_i^2 \phi^{pr}_{ii} - \frac{1}{2} (q_i^o)^2 \phi^{o,pr}_{ii} + q_i \sum_{j=1}^{n} q_j \phi^{pr}_{ij} - q_i^o \sum_{j=1}^{n} q_j^o \phi^{o,pr}_{ij} + q_i \sum_{j=n+1}^{m} q_j \phi^{pr}_{ij} - q_i^o \sum_{j=n+1}^{m} q_j^o \phi^{o,pr}_{ij} \qquad (8)$$

In this expression $\phi_{ij}$ is the potential at site $j$ due to a unit positive test charge at $i$, $q$ is the charge at site $i$, $o$ designates the neutral state of charge $q$, $n$ is the number of partial charges in residue $i$, and $m$ is the number of ionizable sites. The first two terms correspond to the Born energy; the other terms correspond to the background and coulombic energies. A second similar expression, but without the rightmost two terms, would be necessary to calculate $\Delta G^{model}_{a,i}$. The actual calculation of pKa values is usually broken down into two steps. The intrinsic contributions to the pKa are calculated first, followed by the calculation of the energy of charge–charge interactions. The intrinsic pKa is defined to be the pKa value that an ionizable group has in the protein when all other titratable sites are neutral (i.e., the intrinsic pKa values reflect the pH-independent shifts to pKa values relative to the value in the model compound). The Born and the background energies are included in the calculation of the intrinsic pKa values. The actual protocols used in the calculations vary significantly depending on the specific implementation of the FDPB procedure that is being used.14–17

15 D. Bashford and K. Gerwert, J. Mol. Biol. 224, 473 (1992).
16 A.-S. Yang, M. R. Gunner, R. Sampogna, K. Sharp, and B. Honig, Proteins Struct. Funct. Genet. 15, 252 (1993).
17 J. Antosiewicz, J. A. McCammon, and M. K. Gilson, Biochemistry 35, 7819 (1996).

Calculation of Electrostatic Potentials with the FDPB Method

The most difficult step in the structure-based calculation of electrostatic $\Delta G$ is the calculation of the electrostatic potential ($\phi$) in the macromolecular milieu. This is difficult for two reasons. First, the electrostatic potential is a complex function of the charges of the system, of its polarizability, dynamics, composition, and shape. Second, proteins exist in electrolyte solutions, thus it is necessary to account for the effects of mobile counterions on $\phi$, which in turn requires knowledge of the organization of ions around proteins. In the FDPB method these issues are addressed by using the linear Poisson–Boltzmann equation to calculate the electrostatic potential. This equation relates the electrostatic potential, $\phi$, the effects of the ionic double layer, $\kappa$, the charge density of the protein, $\rho$, and the dielectric constants of the system, $\varepsilon$:

$$\nabla \cdot \left[ \varepsilon(r) \nabla \phi(r) \right] - \kappa^2 \varepsilon(r) \phi(r) = -4 \pi \rho(r) \qquad (9)$$

In this expression $\kappa$ is the Debye–Hückel parameter, whose square is proportional to the ionic strength. It is through this parameter that the effects of electrolyte ions on the electrostatic potential are captured. The best known solution of this second-order differential equation is the model-dependent solution by Debye–Hückel for the case of a spherical ion. The Tanford–Kirkwood model is another model-dependent solution of this equation, for the case of spherical proteins. The numerical solution of Eq. (9) by the FDPB method is the basis for the most popular continuum method for pKa calculation.10 In the finite difference solution of the linearized Poisson–Boltzmann equation the protein surface represents an interface between two regions of different polarizability. Water is treated with one dielectric constant, and the protein is treated with a lower dielectric constant. The dielectric constants of water are known over a wide range of physical conditions (temperature, pressure, and ionic strength). In contrast, quantitative treatment of the polarizability of the protein is difficult. It is not obvious that the inherently macroscopic concept of the dielectric constant is useful to describe the polarizability of proteins, which are chemically and dynamically heterogeneous and anisotropic. This is the problem that limits the accuracy and utility of all continuum models.

Calculations with the PDLD Method

The protein dipole–Langevin dipole model by Warshel is described elsewhere in this volume.11,18 In this semimicroscopic method some parts of the protein and solvent are treated as point-inducible dipoles on a lattice, some are approximated by a Langevin-type function, and a continuum dielectric constant is applied to treat the polarizability of the bulk solvent. Ionic strength effects were not treated explicitly in the original version of this algorithm. However, this is not a serious problem. A simple term proportional to $1/e^{\kappa r_{ij}}$ can be used to capture these effects nearly quantitatively. The unique features and advantages of the PDLD method for calculation of electrostatic energies are discussed later.

18 F. S. Lee, Z. T. Chu, and A. Warshel, J. Comput. Chem. 14, 161 (1993).
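The finite-difference idea behind the numerical solution of Eq. (9) can be sketched in one dimension. The example below is a deliberately reduced toy problem and not an FDPB implementation: it assumes a uniform dielectric, no fixed charges inside the grid, planar geometry, and fixed boundary potentials, so that Eq. (9) reduces to $\phi'' = \kappa^2 \phi$. The function names are hypothetical, and the relation $\kappa^{-1} \approx 3.04/\sqrt{I}$ Å is the standard Debye length for a 1:1 electrolyte in water at 25 °C.

```python
import math


def debye_kappa(ionic_strength_molar):
    """Inverse Debye length (1/angstrom) for a 1:1 salt in water at 25 C."""
    return math.sqrt(ionic_strength_molar) / 3.04


def solve_linear_pb_1d(phi0, kappa, length, n_points):
    """Finite-difference solution of phi'' = kappa^2 * phi on (0, length).

    Boundary values are phi(0) = phi0 and phi(length) = 0. Returns the
    potential at the n_points interior grid nodes (Thomas algorithm).
    """
    h = length / (n_points + 1)
    diag = 2.0 + (kappa * h) ** 2  # main diagonal; both off-diagonals are -1
    rhs = [0.0] * n_points
    rhs[0] = phi0                  # known boundary value moved to the right side
    c_prime = [0.0] * n_points
    d_prime = [0.0] * n_points
    c_prime[0] = -1.0 / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n_points):   # forward elimination
        denom = diag + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    phi = [0.0] * n_points
    phi[-1] = d_prime[-1]
    for i in range(n_points - 2, -1, -1):  # back substitution
        phi[i] = d_prime[i] - c_prime[i] * phi[i + 1]
    return phi


def analytic_phi(x, phi0, kappa, length):
    """Exact solution of the same boundary-value problem, for comparison."""
    return phi0 * math.sinh(kappa * (length - x)) / math.sinh(kappa * length)


kappa = debye_kappa(0.1)  # 100 mM salt
length, n_points = 40.0, 399
phi = solve_linear_pb_1d(1.0, kappa, length, n_points)
h = length / (n_points + 1)
max_error = max(abs(p - analytic_phi((i + 1) * h, 1.0, kappa, length))
                for i, p in enumerate(phi))
```

In a real FDPB calculation the grid is three-dimensional, $\varepsilon(r)$ changes across the protein surface, and the protein charges enter through $\rho(r)$, so the linear system is much larger and is usually solved iteratively.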


Parameters Used in FDPB Calculations

The pKa calculations with the FDPB method require the following set of inputs.

1. Coordinates of structures with H atoms added computationally.
2. Ionic strength, which in principle should not be higher than the physiological value.
3. Temperature, which is not a particularly important variable in these calculations.
4. A set of partial charges. The exact set will depend on the nature of the implementation that is used, as discussed later.
5. A set of pKa values of model compounds to describe the reaction in the top horizontal side of the thermodynamic cycle in Fig. 2. The values that are most commonly used are 3.8 for the C terminus, 4.0 for Asp, 4.4 for Glu, 6.3 for His, 7.5 for the N terminus, 9.6 for Tyr, 10.4 for Lys, and 12.0 for Arg.
6. A set of dielectric constants. Water is treated with $\varepsilon_{H_2O} = 78.5$ (25 °C). The assignment of a dielectric constant to the protein interior, $\varepsilon_{in}$, is highly controversial for reasons discussed later. The dielectric constant of dry protein powders is close to 4; thus in many of the calculations $\varepsilon_{in} = 4$ is used.19 However, in general this value exaggerates the magnitude of electrostatic effects. For this reason the empirical use of $\varepsilon_{in} = 20$ has become popular in pKa calculations with static structures. This improves the agreement between calculated and measured pKa values for surface ionizable residues.20

19 M. K. Gilson and B. H. Honig, Biopolymers 25, 2097 (1986).
20 J. Antosiewicz, J. A. McCammon, and M. K. Gilson, J. Mol. Biol. 238, 415 (1994).

Experimentally Measurable Thermodynamic Quantities That Can Be Calculated

There is a wide range of thermodynamic quantities that can be measured experimentally in solution that can also be calculated from structure. Indeed, one important aspect of structure-based electrostatic calculations is that they can be tested rigorously against experimental data. pKa values are the single most important thermodynamic quantity that can be calculated. Since in the models the pKa values are pH-dependent quantities, the calculated pKa values that are compared against the pKa values measured by NMR spectroscopy are termed pKa,app. These represent the midpoint of the calculated H$^+$ titration curve for each titratable site. They can be used with the Henderson–Hasselbalch equation to calculate the H$^+$ titration curves of individual sites. The overall H$^+$ titration of the protein, which can be measured by potentiometric methods, can in turn be calculated as a sum of individual site H$^+$ isotherms. As shown by the simulated data in Fig. 1, the pKa values of ionizable residues in two states of a macromolecule can be used to determine the pH dependence of the equilibrium between the two states. For example, independent pKa calculations with the liganded and unliganded forms of hemoglobin can be used to calculate the pH dependence of the ligand-binding reactions (i.e., the Bohr effect). Similarly, using pKa values calculated with the native state of a protein, and assuming that the ionizable residues of an unfolded protein are well represented by the pKa values of model compounds, the FDPB method can be used to calculate the pH dependence of stability of any protein. In all of these calculations it is possible to vary the ionic strength in order to gauge the salt dependence of the calculated electrostatic contributions. It is also possible to dissect the calculated data into contributions by individual groups or by specific types of interactions. For example, the calculated contributions to stability or to the pH dependence of a process can be dissected into contributions from short-range ion pairing interactions and from medium-range interactions and long-range coulombic interactions. As shown later, this type of analysis can be very productive in the elucidation of the structural and physical origins of measured energetics. The most interesting recent applications of the calculations are in the analysis and interpretation of the determinants of pKa values, and in analysis of the effects of mutations that affect properties of ionizable residues.
For example, the effects of removal of an ionizable residue on stability or on pKa values can be measured experimentally and compared against the calculated contributions. Equation (7) shows how FDPB-calculated electrostatic effects can be dissected into three types of contributions. In cases in which there is good agreement between calculated and measured pKa values, this dissection can be used to analyze experimental data in greater detail. The line of investigation also enables detailed tests of the accuracy and validity of the calculated energies by direct comparison with the experimental data. Use of experimental data to challenge structure-based pKa calculations has guided modifications to the algorithms that have greatly increased their accuracy and reliability. In the process our understanding of the physical character of electrostatic interactions in proteins has also been refined considerably.
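The use of pKa values to build H$^+$ titration curves and the pH dependence of stability, described above, can be sketched as follows. This is a simplified illustration that treats every site as independent, with a single pH-independent pKa per site and state; the function names and the two-site example are hypothetical. The stability expression is the standard linkage relation for independent sites, referenced to the fully deprotonated protein, and is an assumption of this sketch rather than a formula given in the text.

```python
import math

# Model-compound pKa values listed under "Parameters Used in FDPB Calculations".
MODEL_PKA = {"C-term": 3.8, "Asp": 4.0, "Glu": 4.4, "His": 6.3,
             "N-term": 7.5, "Tyr": 9.6, "Lys": 10.4, "Arg": 12.0}


def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch isotherm for a single site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def protons_bound(pkas, ph):
    """Overall H+ titration as the sum of the individual site isotherms."""
    return sum(fraction_protonated(pka, ph) for pka in pkas)


def stability_ph_term(pkas_native, pkas_unfolded, ph, temp=298.15):
    """pH-dependent part of the unfolding free energy, in kcal/mol.

    Zero at high pH (all sites deprotonated); negative values mean that
    protonation favors the unfolded state.
    """
    rt = 1.987e-3 * temp
    total = 0.0
    for pk_n, pk_u in zip(pkas_native, pkas_unfolded):
        total -= rt * math.log((1.0 + 10.0 ** (pk_u - ph)) /
                               (1.0 + 10.0 ** (pk_n - ph)))
    return total


# Hypothetical two-site example: an Asp whose pKa is depressed to 3.0 in the
# native state, and a Lys that titrates as in the model compound.
native = [3.0, MODEL_PKA["Lys"]]
unfolded = [MODEL_PKA["Asp"], MODEL_PKA["Lys"]]
curve = [(ph, protons_bound(native, ph)) for ph in range(0, 15)]
acid_destabilization = stability_ph_term(native, unfolded, 0.0)
```

In this example a one-unit depression of the Asp pKa in the native state translates into roughly 1.4 kcal/mol of destabilization at low pH, which is the kind of comparison against measured stability that the text describes.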


Problems with Existing Continuum Methods and Strategies to Address Them

The major problems with all models for calculation of pKa values and electrostatic energies in proteins stem from the difficulty of capturing quantitatively the dielectric properties of proteins. The protein surface represents an artificial boundary between media of different dielectric properties. The nature of dielectric relaxation and the properties of water or protein at this interface are not well understood. What is known with certainty is that the pKa values of surface ionizable residues measured by NMR spectroscopy are in general very similar to the values of model compounds in water.7,8 This suggests that the presence of the protein does not interfere significantly with the hydration of surface ionizable residues, or that the loss of hydration is compensated by favorable interactions with protein polar atoms. It also suggests that surface coulombic interactions are weak, or that strong attractive interactions are always balanced by repulsive interactions of the same magnitude. The notion that surface electrostatic effects are weak originated with the calculations with the Tanford–Kirkwood algorithm (SATK) modified empirically by inclusion of a normalized solvent accessibility parameter to attenuate all electrostatic effects. This algorithm is capable of reproducing the pKa values of surface residues.21 In the standard implementation of this method the lowest dielectric constant that is ever sampled between pairs of interacting charges is approximately 40, because even though the protein itself is treated with $\varepsilon_{in} = 4$, the dielectric constant of water is weighted heavily. In SATK calculations the small deviations of pKa values relative to model compound values are the result of weak perturbations by coulombic interactions (no self-energies are included in the standard SATK method).
In contrast, in early calculations with the FDPB method that treated the protein with $\varepsilon_{in} = 4$, the magnitude of electrostatic effects was exaggerated, and the calculations failed to reproduce the experimental pKa values. This can be appreciated in some of the data included in Fig. 3. Today it is widely acknowledged that any model that treats the protein surface with high dielectric constants can capture pKa values of surface residues.20–23

21 J. B. Matthew, F. R. N. Gurd, B. García-Moreno E., M. A. Flanagan, K. L. March, and S. J. Shire, CRC Crit. Rev. Biochem. 18, 91 (1985).
22 B. Svensson and B. Jönsson, J. Comput. Chem. 16, 370 (1995).

Modifications to the Standard Implementation of the FDPB Method

There are two trends among the strategies to improve the performance of pKa calculations with the FDPB method. One is to increase the amount of microscopic detail in the calculations. For example, in the simplest implementation of the FDPB method using static structures, the charge of an ionizable residue is considered to reside in only one atom.20 In this single-site mode the calculations with $\varepsilon_{in} = 4$ yield grossly incorrect pKa values. The calculated pKa values can be improved by instead distributing the unit charge over several atoms of each ionizable side chain.15–17 Another alternative to improve calculations with $\varepsilon_{in} = 4$ is to use the PARSE atomic charge set,24 which was parameterized to reproduce solvation energies of small molecules using FDPB theory. Another more recent device to improve the agreement between calculated and experimental pKa values is to use a small probe radius, smaller than the radius of a water molecule, to define the molecular surface of the protein.25 This helps because the dielectric constant of water will be weighted more heavily if the protein surface is described with a very small probe (i.e., more volume is treated with $\varepsilon_{H_2O}$). Yet another way of improving the performance of the FDPB method is to treat explicitly the structural waters bound at specific sites that are seen crystallographically rather than subsuming their effect into a dielectric constant.26 The second trend in the empirical strategies to improve the performance of pKa calculations with the FDPB method is to arbitrarily increase the value of $\varepsilon_{in}$ used in the calculations. FDPB calculations with static structures, using a single-site charge method, can give very reasonable results and reproduce the properties of surface residues if $\varepsilon_{in} = 20$ is used.20 A virtue of the use of $\varepsilon_{in} = 20$ is that it renders the calculations less sensitive to the details of parameters, structures, and charge sets used. In contrast, calculations with $\varepsilon_{in} = 4$ tend to amplify uncertainties and errors.
They are also highly dependent on the specific conformation of the structure used, requiring conformational averaging to improve the agreement between calculated and measured pKa values.16,17,20,27 With $\varepsilon_{in} = 20$ the discrepancies between calculations with slightly different structures of the same protein are diminished.20 Figure 3 shows the rank-ordered pKa values of acidic residues in staphylococcal nuclease calculated with a static structure using these different implementations of the FDPB algorithm. The pKa values calculated with the SATK algorithm are also included for comparison. Note that the FDPB-calculated pKa values are systematically more depressed than the values calculated with the Tanford–Kirkwood algorithm, suggesting that the FDPB-calculated electrostatic energies are still exaggerated, no matter how the calculations are performed. Note also the large differences between pKa values calculated with slightly different implementations of the same method. Staphylococcal nuclease was selected on purpose to illustrate the dependence of the calculated pKa values on the details of the calculation precisely because in this protein these differences are exaggerated. The calculated pKa values of acidic residues in nuclease are more depressed than they should be because this nuclease is very basic, and substantial positive potentials impinge on the acidic residues in the pH range where they titrate. Under more neutral conditions of pH the FDPB calculations on this same protein can reproduce the pKa values of histidines with remarkable accuracy when $\varepsilon_{in} = 20$.28,29 The data in Fig. 3 offer a sobering illustration of the degree to which the calculated pKa value can be affected by nuances and details of the calculations.

Fig. 3. Rank order of pKa values of Glu and Asp residues in staphylococcal nuclease calculated with the solvent accessibility modified Tanford–Kirkwood method (thick solid line) or with different implementations of the FDPB method. The symbols refer to FDPB calculations as follows: (&) $\varepsilon_{in} = 20$, 100 mM KCl; (d) $\varepsilon_{in} = 40$, 100 mM KCl; (m) $\varepsilon_{in} = 80$, 100 mM KCl; (4) $\varepsilon_{in} = 80$, 500 mM KCl; (!) $\varepsilon_{in} = 80$, 1 M KCl; ( ) $\varepsilon_{in} = 20$, PARSE charge set; (^) $\varepsilon_{in} = 4$, PARSE charge set, molecular surface defined with van der Waals radii instead of water-accessible surface; (H) $\varepsilon_{in} = 20$, 100 mM KCl, charges arbitrarily placed on atom OD1 or OE1; (t) $\varepsilon_{in} = 20$, 100 mM KCl, charges arbitrarily placed on atom OD2 or OE2; (o § ) $\varepsilon_{in} = 20$, 100 mM KCl, charges placed as in Tanford–Kirkwood calculation, 100 mM KCl.

23 A. Warshel, S. T. Russell, and A. K. Churg, Proc. Natl. Acad. Sci. USA 81, 4785 (1984).
24 D. Sitkoff, K. A. Sharp, and B. Honig, J. Phys. Chem. 98, 1978 (1994).
25 H.-X. Zhou and M. Vijayakumar, J. Mol. Biol. 267, 1002 (1997).
26 H. Trylska, J. Antosiewicz, M. Geller, C. N. Hodge, R. M. Klabe, M. S. Head, and M. K. Gilson, Protein Sci. 8, 180 (1999).
27 D. Bashford, D. Case, C. Dalvit, L. Tennant, and P. Wright, Biochemistry 32, 8045 (1993).

28 K. K. Lee, C. A. Fitch, J. T. J. Lecomte, and B. García-Moreno E., Biochemistry 41, 5656 (2002).
29 K. K. Lee, C. A. Fitch, and B. García-Moreno E., Protein Sci. 11, 1004 (2002).


Why Do pKa Calculations with FDPB Improve When Proteins Are Treated Artificially with High Dielectric Constants?

To understand the reasons behind the failure of FDPB calculations with static structures and $\varepsilon_{in} = 4$ it is necessary to understand the meaning of the dielectric constants of proteins. According to Schutz and Warshel,12 $\varepsilon = 2$ accounts for electronic polarizability and $\varepsilon = 4$ accounts for electronic polarizability plus small-scale atomic displacements. Values of $\varepsilon_{in} > 4$ are needed when larger-scale fluctuations of proteins contribute to the overall dielectric response. There is very little direct experimental information about the nature of the structural relaxation or flexibility that could affect the magnitudes of electrostatic effects so dramatically. It might involve dipolar movement, charge displacement, or even large-scale motions and fluctuations such as local unfolding or reorganization, and in extreme cases global unfolding.12 The problem with calculations that use static structures and $\varepsilon_{in} = 4$ is that they underestimate the natural flexibility of the protein, and especially of surface ionizable side chains. $\varepsilon_{in} = 4$ also likely underestimates the substantial structural response to the ionization events. Apparently, this is captured implicitly and nearly quantitatively with $\varepsilon_{in} = 20$. The general trend is for the value of the dielectric constants used in the models to increase as microscopic detail is removed from the model and treated implicitly within the dielectric constant of the protein. If the dynamic response of a protein to an ionization event is not treated explicitly in the calculations, it must be subsumed into a dielectric constant, thus the need for dielectric constants greater than 4 to reproduce experimental data in calculations with static structures.
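The sensitivity of calculated pKa values to the choice of dielectric constant can be made concrete with the Coulomb term alone. The sketch below is only an illustration of the scaling and is not an FDPB calculation: it assumes a single pair of unit charges in a uniform dielectric, and the function name, the 5 Å separation, and the temperature are hypothetical choices.

```python
def coulomb_pka_shift(r_ij, eps, z_i=-1, z_j=1, temp=298.15):
    """pKa shift of site i caused by one charge z_j at distance r_ij (angstroms).

    Uses the Coulomb energy 332 * z_i * z_j / (r_ij * eps) in kcal/mol and
    converts it to pKa units with 2.303 * R * T.
    """
    dg_ij = 332.0 * z_i * z_j / (r_ij * eps)
    return -z_i * dg_ij / (2.303 * 1.987e-3 * temp)


# An acidic group 5 angstroms from a basic group, for three dielectric constants.
shifts = {eps: coulomb_pka_shift(5.0, eps) for eps in (4.0, 20.0, 78.5)}
```

For this ion pair the shift is about 12 pKa units with a dielectric constant of 4, about 2.4 units with 20, and well under one unit with the dielectric constant of water, which shows why calculations with static structures and a low protein dielectric constant exaggerate electrostatic effects at the surface.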
Strategies for Handling Conformational Flexibility and Reorganization

The recognition that most of the problems with exaggerated electrostatic effects originate with the use of low $\varepsilon_{in}$ in calculations with static structures has led to yet another trend in modifications to methods for pKa calculations, focused on treating the relationship between electrostatic effects, dynamics, and dielectric relaxation explicitly. The attempts to improve FDPB calculations with explicit consideration of dynamics have involved mainly the use of molecular dynamics (MD) or Monte Carlo (MC) methods to explore side chain flexibility.16,30–33 The most important conclusion from the MD studies is that comparable pKa values can be obtained using a static structure and $\varepsilon_{in} = 20$, and using $\varepsilon_{in} = 4$ and averaging over many conformational states sampled in an MD trajectory.30 There is a disconnection between the time scale of minutes used in the measurement of equilibrium thermodynamic data by NMR or other equilibrium methods, and the time scales sampled by even the longest MD simulation possible. This issue is avoided with the multiple conformation continuum electrostatics (MCCE) method of Gunner and Alexov, which handles conformational heterogeneity and flexibility using an MC protocol to sample side chain rotamers. This procedure is designed to explore the sensitivity of pKa calculations to local sources of heterogeneity involving rotation of side chains, of hydroxyl- or site-bound water molecules, and also to details of placement of polar H atoms. The MCCE calculations using $\varepsilon_{in} = 4$ seem to be remarkably accurate, and they contribute significant insight about how the details of the microenvironments affect pKa values.34–36 Clearly in cases in which backbone fluctuations contribute to the high apparent polarizability of a site, neither the MD nor the MC approaches are appropriate.

30 H. W. T. van Vlijmen, M. Schaefer, and M. Karplus, Proteins Struct. Funct. Genet. 33, 145 (1998).
31 B. Rabenstein and E. W. Knapp, Biophys. J. 80, 1141 (2001).
32 T. You and D. Bashford, Biophys. J. 69, 1721 (1995).
33 P. Beroza and D. A. Case, J. Phys. Chem. 100, 20156 (1996).

The vast majority of computational studies on structure-based pKa calculations have focused on the calculation of pKa values of surface residues. Warshel has pointed out previously that surface residues do not offer a stringent test of the ability of any model to reproduce physical reality because their pKa values are too close to the pKa values of model compounds. Any model that uses high effective dielectric constants to describe the interactions between charges at the surface is capable of reproducing the pKa values of surface residues.12 In contrast, the pKa values of internal residues constitute stringent benchmarks useful for identifying the most accurate and physically correct models.
As shown later, the pKa values of internal residues can experience enormous shifts owing to the loss of hydration in the protein interior or at interfaces between proteins. Not enough experimental data on the pKa values of internal residues are available to allow calibration of the models. All that is known at present is that the ionization properties of buried groups are reproduced by FDPB calculations that treat proteins empirically with $\varepsilon_{in} \approx 10$.37–40 This is problematic because in calculations with $\varepsilon_{in} = 10$ the magnitude of surface electrostatic effects is exaggerated, and pKa values of surface ionizable residues are not captured. There is no FDPB-based strategy that we are aware of that can be used to capture the energetics of ionization of surface and internal groups simultaneously and in a self-consistent way, using a single value of $\varepsilon_{in}$. The practical solutions that have been implemented recently attempt to solve the problems by parameterization. In some of these methods the microenvironments of internal titrating sites are treated with low dielectric constants or with more atomic detail, and charge–charge interactions are treated with arbitrarily high effective dielectric constants.41–46 Some of these methods, in particular the screened Coulomb potential method of Mehler and co-workers, are extremely promising.44,46 In general, adding microscopic detail to continuum calculations can lead to improvement in reproducing pKa values of buried residues or of strongly interacting titratable groups, but it does so at the expense of generality and simplicity.

34 E. Alexov, Proteins Struct. Funct. Genet. 50, 94 (2003).
35 E. G. Alexov and M. R. Gunner, Biophys. J. 74, 2075 (1997).
36 R. E. Georgescu, E. G. Alexov, and M. R. Gunner, Biophys. J. 83, 1731 (2002).
37 K. Langsetmo, J. A. Fuchs, and C. Woodward, Biochemistry 30, 7603 (1991).
38 E. Demchuk and R. C. Wade, J. Phys. Chem. 100, 17373 (1996).
39 X. Raquet, J. Lounnas, J. Lamotte-Brasseur, J. M. Frère, and R. C. Wade, Biophys. J. 73, 2416 (1997).

Advantages of PDLD Methods

The use of static structures in FDPB calculations is desirable because it is fast and convenient: it avoids the costly exploration of dynamics to improve the treatment of dielectric relaxation. Unfortunately, the trend is that the most useful strategies that have evolved for improvement of FDPB-based continuum calculations increase the amount of microscopic detail in the calculations. Specifically, the most serious problems with FDPB calculations, related to the difficulty in accounting for structural relaxation due to the charging process, have required the use of MD or MC simulations to explore conformational relaxation. In some FDPB calculations separate MD simulations have been done for the fully charged and fully neutral protein.16 If conformational change is treated explicitly, separate calculations should be performed for each group in the charged and neutral states.
A rigorous approach to the problem is impossible due to the large number of states that would have to be considered; approximations are necessary.18

40 C. A. Fitch, D. A. Karp, K. K. Lee, W. E. Stites, E. E. Lattman, and B. García-Moreno E., Biophys. J. 82, 3289 (2002).
41 D. Voges and A. Karshikoff, J. Chem. Phys. 108, 2219 (1999).
42 J. J. Havranek and P. B. Harbury, Proc. Natl. Acad. Sci. USA 96, 11145 (1999).
43 R. A. Dimitrov and R. R. Crichton, Proteins Struct. Funct. Genet. 27, 576 (1997).
44 E. L. Mehler and F. Guarnieri, Biophys. J. 75, 3 (1999).
45 M. S. Wisz and H. W. Hellinga, Proteins Struct. Funct. Genet. 51, 360 (2003).
46 E. L. Mehler, M. Fuxreiter, I. Simon, and B. García-Moreno E., Proteins Struct. Funct. Genet. 48, 283 (2002).


In general, FDPB calculations are evolving toward the semimicroscopic method embodied in Warshel’s PDLD method.12 From birth this method has been inherently better suited for handling the contributions by fluctuations and dynamics to electrostatics. Variations of the PDLD model have been developed that incorporate structural relaxation in the neutral and charged states explicitly within the framework of the linear response approximation.18,47 This is tantamount to adding steps in the thermodynamic cycle in Fig. 2 to treat explicitly the reorganization of the protein concomitant with ionization. The difficulties in modeling the ionization behavior of groups in the interior of proteins self-consistently show that the treatment of polarizability in the protein interior with a single macroscopic parameter is fundamentally flawed. This has been pointed out by Warshel, who suggested that in principle the explicit treatment of reorganization should take care of this problem.18,26,35 However, this is a difficult problem, and even in the PDLD/LRA treatment47 it has been necessary to include screening parameters to improve agreement between calculated and measured energies. These screening parameters are used to attenuate interactions. When carefully calibrated against experimental data, this approach has produced excellent results.12 The PDLD method is the most reliable and promising approach for handling rigorously the calculations of pKa values, especially in cases of groups in environments secluded from solvent, which are the groups that are of greatest biological importance.

Physical Properties of Surface Ionizable Groups

It is important to emphasize that none of the available computational methods for pKa calculations is free of deficiencies. No one method is capable of handling all types of calculations, nor is any free of approximations. Most calculations still overestimate the energies of ion pairs, many overestimate the magnitude of long-range interactions, and still others are not appropriate for estimating the ionic strength dependence of electrostatic energies. None is yet capable of handling calculations of surface and internal residues simultaneously and self-consistently, and with the exception of the PDLD, few are ideally suited to deal with the calculation of pKa values of internal residues. On the other hand, used judiciously and in concert with experimental studies, the available computational methods are reliable and useful for the dissection of pH and salt-sensitive thermodynamics of proteins. In the next two sections we discuss some of the properties of electrostatic effects in proteins that have been elucidated recently with joint application of computational and experimental methods. No attempt is made to review the recent literature comprehensively. The intent is simply to illustrate cases in which the structure-based analysis of experimental data with a computational method has contributed significant novel insight about the physical character of protein electrostatics that was inaccessible by experimentation alone.

47 Y. Y. Sham, Z. T. Chu, and A. Warshel, J. Phys. Chem. B 101, 4458 (1997).

Accurate Electrostatic Potentials Calculated with FDPB

The validity of physical models for calculation of electrostatic energies in proteins is usually established by comparison of measured and calculated pKa values. As shown below, most ionizable groups have net favorable coulombic interactions and unfavorable self-energies; pKa values reflect the balance between these two terms. To understand the molecular determinants of pKa values it is necessary to understand how coulombic and self-energy contributions are parsed. This information is also necessary to establish that a computational method reproduces experimental pKa values for the right physical reasons. In the case of internal residues self-energies can be evaluated easily because the significant perturbation of their pKa values is dominated by the loss of hydration in the protein interior. For surface residues the experimental dissection of the determinants of pKa values is considerably more difficult because these pKa values are similar to the values in model compounds in water, and the nature of the balance between coulombic and self-energies is never obvious. The relatively unperturbed pKa values of surface residues can be reproduced with very different physical models, even incorrect ones. Therefore, in the case of surface residues, agreement between calculated and measured pKa values is a necessary but not a sufficient condition to establish that a model reproduces pKa values for the correct physical reasons.
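The statement that a pKa value reflects the balance between coulombic and self-energy terms can be made concrete with a small numerical sketch. The function below is an illustration, not the FDPB procedure. It assumes a convention of our own choosing: each contribution is the free energy change of the charged form of the group relative to the model compound in water, in kcal/mol, with positive values destabilizing. The sum is converted to a pKa shift through the factor 2.303RT. The function name and all numerical inputs are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pka_from_contributions(pka_model, dg_coulombic, dg_born, dg_background,
                           is_base, temp_k=298.15):
    """Return a pKa from energy terms acting on the charged form (kcal/mol).

    Assumed convention: positive energies destabilize the charged form.
    For a base the charged form is protonated, so destabilization lowers
    the pKa; for an acid the charged form is deprotonated, so it raises it.
    """
    dg_total = dg_coulombic + dg_born + dg_background
    shift = dg_total / (math.log(10.0) * R_KCAL * temp_k)
    return pka_model - shift if is_base else pka_model + shift


# Hypothetical inputs, chosen only to illustrate the bookkeeping.
his_like = pka_from_contributions(6.5, 0.8, 0.4, 0.2, is_base=True)
glu_like = pka_from_contributions(4.4, -0.6, 0.5, -0.4, is_base=False)
print(f"basic group: {his_like:.2f}, acidic group: {glu_like:.2f}")
```

At 25°C the factor 2.303RT is about 1.36 kcal/mol, so each 1.36 kcal/mol of net destabilization of the charged form shifts the pKa by one unit, downward for a basic group and upward for an acidic one.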
To test the validity of the electrostatic potentials calculated with the FDPB method, interaction energies between two charges (ΔGij) were obtained from the change in pKa that results from charge removal or charge reversal by mutagenesis. That is how the data in Fig. 4 were obtained.29 By measuring the pKa values of surface histidines in many mutants of staphylococcal nuclease in which charge has been either removed or reversed, it was possible to construct a map of the distance and salt dependence of charge–charge interactions. Some of the data in Fig. 4 were measured experimentally; the rest were obtained with FDPB calculations using εin = 20. The curves represent nonlinear least-squares fits of Eq. (10) to the data sets. The important thing to note is that the calculations reproduced the distance dependence and the salt sensitivity (not shown) of pairwise coulombic interactions among surface residues. The solid line represents the shape of the FDPB-calculated electrostatic potential. Despite the use of εin = 20 the calculated pairwise coulombic energies are slightly larger than the measured values, but overall there is excellent agreement. This is a significant observation. It confirms that the electrostatic potentials that were calculated with the FDPB method are accurate. With the assurance that this aspect of the calculation is correct, the more detailed analysis of the calculated molecular determinants of pKa values becomes meaningful. This entails calculation of the relative contributions to the pKa values by the three types of energies described in Eq. (7).

Fig. 4. Comparison of the distance dependence of calculated (closed squares) and measured (open squares) energies of pairwise coulombic interactions between histidines and other surface charges in staphylococcal nuclease.29 The lines represent nonlinear least-squares fits of the experimental (solid) and calculated (dashed) data with a simple coulombic expression, as previously described.29 All data at 25°C, in 10 mM KCl.

Magnitudes of Coulombic, Background, and Born Contributions to pKa Values

The pKa values of histidines in staphylococcal nuclease calculated with FDPB methods using a static structure and εin = 20 are in good agreement with the measured values.28 To identify the molecular determinants of these pKa values the data were dissected into contributions by coulombic, Born, and background energies. The data in Fig. 5 illustrate the magnitude and salt sensitivity of these contributions for His-8, His-121, and Glu-52 in nuclease. Note that in the case of these residues the salt dependence of the calculated pKa value closely mirrors the salt dependence of the coulombic energy. In contrast, the self-energy is salt insensitive. The magnitude of the calculated coulombic contributions is, in general, greater than the magnitude of the self-energy contributions. This is entirely consistent with what has been determined experimentally for surface residues.29,48,49 The Born contribution to the pKa values of surface residues is always destabilizing because surface residues are always less solvated than the ionizable groups in model compounds in water. Thus the Born term depresses the pKa values of the two histidines (A and B) and it raises the pKa value of the glutamic acid (C). In the case of the two histidines the background energies are also destabilizing; they depress the pKa value. This is not unusual for histidines. In the case of Glu, Asp, Lys, and Arg residues the background energies are usually stabilizing. For these residues the Born and the background energies have approximately the same magnitude but opposite sign, so they cancel each other. This is illustrated by the data in Fig. 5C.

Fig. 5. Representative set of FDPB-calculated pKa values of ionizable residues in staphylococcal nuclease. Ionic strength dependence of pKa values of His-8 (A), His-121 (B), and Glu-52 (C), and contributions from coulombic interactions, from Born energies, and from interactions with background polar atoms. The contributions from self-energies, calculated as the sum of Born and background contributions, are also shown. All calculations are with the simplest implementation of FDPB with εin = 20.

According to the NMR spectroscopy data the pKa value of His-121 is very depressed and extremely salt sensitive.28 The data in Fig. 5 illustrate how the calculations can help interpret the origins of these experimental observations. The calculated pKa for His-121 in Fig. 5B agrees with the experimental measurement: it is depressed and highly salt sensitive. Dissection of this pKa into the three energy terms suggests that at low ionic strength the self-energy and net repulsive coulombic interaction contribute significantly and equally to the depression of this pKa value. These calculated data also show that the salt sensitivity of the pKa originates entirely with the salt sensitivity of coulombic contributions. This histidine and the other residues described in Fig.
5 are in environments with substantial net positive potential; thus the pKa values increase as the repulsive coulombic interactions with other basic groups are screened by increasing salt concentration. This is an example of H+ titration behavior that could not be reproduced accurately with an electrostatic model that does not account for self-energies, such as the modified Tanford–Kirkwood method, or with a simple coulombic formula such as Eq. (10).21

Long-Range and Short-Range Contributions to Stability

The origins of the Born and background energies can be dissected further to describe the microenvironments responsible for their value. However, in general this type of analysis is not very productive because it is extremely difficult to test experimentally any hypotheses concerning the molecular origins of Born or background energies suggested by the calculations. In contrast, the contributions from coulombic interactions to pKa values can be analyzed further in a meaningful way. The effects of the contributions from individual charged groups can be calculated and measured, as shown by the data in Fig. 4. Given the good agreement between these calculated and measured data, further analysis of the character of coulombic interactions is warranted. For example, the distance dependence of pairwise coulombic interactions shown in Fig. 4 suggests that long-range coulombic interactions are very small, but nonzero. Interactions between charges separated by less than 10 Å are stronger than those separated by greater distances. The data in Fig. 6 show the ionic strength sensitivity of the total contributions by coulombic interactions to the pKa of His-121 in staphylococcal nuclease at pH 7, parsed into short-range and long-range, stabilizing and destabilizing contributions. According to these data the attractive and repulsive interactions at rij < 10 Å are well balanced and they cancel each other. In contrast the interactions at rij > 10 Å are not balanced. The long-range repulsive interactions are stronger than the long-range attractive interactions. Thus, the long-range repulsive coulombic interactions are responsible for the depressed pKa values of these and other histidines in nuclease.28 This type of parsing of coulombic interactions is useful because, as can be seen by the data in the left panel of Fig. 6, the data explain the detailed molecular origins of the unusually high salt sensitivity of this histidine at a level that is inaccessible by experimentation. The data show that although long-range electrostatic interactions are, in fact, very weak, they can add up to produce substantial electrostatic effects in situations when positive and negative charges are not balanced. This is a novel view, opposite to the view that has prevailed,50,51 and which was supported by the less physically realistic calculations with the SATK method, which can systematically underestimate the magnitude of medium- and long-range electrostatic interactions. Several groups have demonstrated recently that the significant contributions by medium- and long-range coulombic interactions to stability can be exploited to increase the stability of proteins artificially, by judicious manipulation of the constellation of surface charges.52–54

Fig. 6. Ionic strength dependence of coulombic interactions that contribute to the pKa of His-121 in staphylococcal nuclease, calculated with the simplest implementation of FDPB calculations with εin = 20. Shown separately are the long-range interactions between charges separated by more than 10 Å (left), and short- and medium-range interactions between charges separated by less than 10 Å (right). The total coulombic interactions (thick line) were parsed into attractive and repulsive interactions. The sum of attractive and repulsive interactions in each of the distance regimes is also shown.

48 D. V. Laurents, B. M. Huyghues-Despointes, M. Bruix, R. L. Thurlkill, D. Schell, S. Newsom, G. R. Grimsley, K. L. Shaw, S. Trevino, M. Rico, J. M. Briggs, J. M. Antosiewicz, J. M. Scholtz, and C. N. Pace, J. Mol. Biol. 325, 1077 (2003).
49 B. M. P. Huyghues-Despointes, R. L. Thurlkill, M. D. Daily, D. Schell, J. M. Briggs, J. M. Antosiewicz, C. N. Pace, and J. M. Scholtz, J. Mol. Biol. 325, 1093 (2003).

The Contributions of Ion Pairs to Protein Stability Are Context Dependent

The data in Fig. 4 show that the FDPB calculations reproduce quantitatively the magnitude of pairwise coulombic interactions between charges separated by 8 Å or more. In this range the distance dependence and salt sensitivity of coulombic interactions seem to follow general trends. Interactions at short range are considerably more complex and very difficult to reproduce with pKa calculations; no clear trends are obvious. Experimentally it is known that the contributions of ion pairs to the stability of proteins can vary dramatically.55 Not surprisingly, the artificial design of surface ion pairs to enhance the stability of proteins is notoriously difficult, and usually unsuccessful. Apparently other factors besides the distance of separation between charges become critical below a certain distance. These factors are not well understood.
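The parsing of coulombic contributions into distance regimes described for Fig. 6 can be sketched with a toy model. The example below is not an FDPB calculation. It assumes that each pairwise energy follows a screened coulombic expression of the Debye–Hückel type with the dielectric constant of water (taken here as 78.5), and the list of partner charges and distances is invented for illustration. The function names are hypothetical.

```python
import math


def screened_coulomb(z_i, z_j, r_ij, ionic_strength, eps=78.5, temp_k=298.15):
    """Pairwise energy (kcal/mol) from a Debye-Hueckel-type expression.

    r_ij in angstroms, ionic_strength in mol/L; eps is an assumed value.
    """
    kappa = 50.29 * math.sqrt(ionic_strength / (eps * temp_k))
    return 332.0 * z_i * z_j / (eps * r_ij) * math.exp(-kappa * r_ij)


def parse_coulombic(z_probe, partners, ionic_strength, cutoff=10.0):
    """Sum pairwise energies by distance regime and by sign."""
    totals = {
        "short": {"attractive": 0.0, "repulsive": 0.0},
        "long": {"attractive": 0.0, "repulsive": 0.0},
    }
    for z_partner, r_ij in partners:
        energy = screened_coulomb(z_probe, z_partner, r_ij, ionic_strength)
        regime = "short" if r_ij < cutoff else "long"
        kind = "attractive" if energy < 0.0 else "repulsive"
        totals[regime][kind] += energy
    return totals


# Hypothetical partners of a protonated histidine: (charge, distance in A).
PARTNERS = [
    (+1, 6.0), (-1, 6.0),                        # balanced short-range pair
    (+1, 12.0), (+1, 15.0), (+1, 18.0), (-1, 14.0),  # excess of distant basic groups
]

for ionic_strength in (0.01, 0.1, 1.0):
    t = parse_coulombic(+1, PARTNERS, ionic_strength)
    net_short = t["short"]["attractive"] + t["short"]["repulsive"]
    net_long = t["long"]["attractive"] + t["long"]["repulsive"]
    print(f"I = {ionic_strength:4.2f} M  short-range net = {net_short:+.3f}  "
          f"long-range net = {net_long:+.3f} kcal/mol")
```

With these invented inputs the short-range terms cancel at every ionic strength, while the individually weak long-range repulsions add up to a net repulsion that is screened as the salt concentration rises. This mirrors the qualitative argument in the text, not its numerical results.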
In the calculations of short-range interactions the details of the calculations seem to matter greatly.56–58 There are aspects of short-range interactions, for example, the contributions by hydrogen bonding,59 that make them unsuitable for study with calculations using static structures. These are problems that are best handled with explicit consideration of flexibility and conformational heterogeneity with methods such as the MCCE.34,35 There are other aspects of short-range interactions, for example, the interactions between ionizable residues and aromatic residues,60 which are not well suited for study with methods that consider only electrostatic factors in the calculation of pKa values.

50 D. Sali, M. Bycroft, and A. R. Fersht, J. Mol. Biol. 220, 779 (1991).
51 S. Dao-pin, E. Söderlind, W. A. Baase, J. A. Wozniak, U. Sauer, and B. W. Matthews, J. Mol. Biol. 221, 873 (1991).
52 K. L. Shaw, G. R. Grimsley, G. I. Yakovlev, A. A. Makarov, and C. N. Pace, Protein Sci. 10, 1206 (2001).
53 V. Loladze, B. Ibarra-Molero, J. Sanchez-Ruiz, and G. Makhatadze, Biochemistry 38, 16419 (1999).
54 S. Spector, M. Wang, S. Carp, J. Robblee, Z. Hendsch, R. Fairman, B. Tidor, and D. Raleigh, Biochemistry 39, 872 (2000).
55 S. Dao-pin, U. Sauer, H. Nicholson, and B. W. Matthews, Biochemistry 30, 7142 (1991).

A recent analysis of the contributions of ion pairs to the stability of proteins has contributed an important novel concept that could result in improved calculations in the future. Makhatadze and co-workers demonstrated experimentally that the extent to which an ion pair contributes to the stability of a protein is context dependent.61 Using double mutant cycles they established that the energy of interaction between a Lys and a Glu was identical even when the Lys-Glu pair was reversed to a Glu-Lys pair by mutagenesis. However, the magnitude of the net contribution to the global stability of the protein was highly sensitive to the orientation of the ion pair. This suggested that the contributions to stability are determined not only by the energy of interaction between the elements of the ion pair, but also by the medium- and long-range interactions between the charges forming the ion pair and the rest of the charges in the protein. The SATK model, devoid of self-energy terms and of background polar charges, is capable of capturing the energetics of the interaction between the two charges in the ion pair remarkably well. However, the SATK calculations are poor at capturing the contribution of the ion pair to stability because they do not reproduce quantitatively the medium- and long-range electrostatic interactions. The FDPB calculations, on the other hand, reproduce quantitatively the magnitude of medium- and long-range coulombic interactions (Fig. 4).
However, they routinely fail to reproduce the magnitude of short-range coulombic interactions because the calculations tend to exaggerate interaction energies and are too sensitive to structural and computational details even with εin = 20. A combination of the best aspects of the two different models should greatly improve our ability to calculate properties of short-range electrostatic interactions.
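The double mutant cycle analysis mentioned above separates the pairwise interaction energy from the net effect of the pair on stability. The sketch below uses the standard definition of the coupling energy. The stabilities are hypothetical values in kcal/mol, not data from the cited study, and the function names are our own.

```python
def coupling_energy(dg_wt, dg_single_a, dg_single_b, dg_double):
    """Pairwise coupling energy from a double mutant cycle.

    Arguments are unfolding free energies (kcal/mol) of the protein with
    both charges present, with only one of them removed (a or b), and
    with both removed.
    """
    return dg_wt - dg_single_a - dg_single_b + dg_double


def net_contribution(dg_wt, dg_double):
    """Net change in stability when both members of the pair are removed."""
    return dg_wt - dg_double


# Hypothetical stabilities for two orientations of the same ion pair.
orientation_1 = dict(dg_wt=6.0, dg_single_a=5.2, dg_single_b=5.1, dg_double=4.8)
orientation_2 = dict(dg_wt=5.0, dg_single_a=4.8, dg_single_b=4.7, dg_double=5.0)

for name, cycle in (("orientation 1", orientation_1),
                    ("orientation 2", orientation_2)):
    pair = coupling_energy(**cycle)
    net = net_contribution(cycle["dg_wt"], cycle["dg_double"])
    print(f"{name}: coupling = {pair:.2f}, net contribution = {net:.2f} kcal/mol")
```

The invented numbers are chosen so that the coupling energy is the same in both orientations while the net contribution to stability differs, which is the kind of context dependence the experiments revealed.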

56 W. R. Forsyth and A. D. Robertson, Biochemistry 39, 8067 (2000).
57 W. R. Forsyth, M. K. Gilson, J. Antosiewicz, O. R. Jaren, and A. D. Robertson, Biochemistry 37, 8643 (1998).
58 Y.-H. Kao, C. A. Fitch, S. Bhattacharya, C. J. Sarkisian, J. T. J. Lecomte, and B. García-Moreno E., Biophys. J. 79, 1637 (2000).
59 L. Swint-Kruse and A. D. Robertson, Biochemistry 34, 4724 (1995).
60 R. Loewenthal, J. Sancho, and A. R. Fersht, J. Mol. Biol. 224, 759 (1992).
61 G. I. Makhatadze, V. V. Loladze, D. N. Ermolenko, X. Chen, and S. T. Thomas, J. Mol. Biol. 327, 1135 (2003).


Origins of the Salt Sensitivity of Electrostatic Effects

Among biochemists, the dominant ideas concerning the molecular origins of the salt sensitivity of equilibrium processes of proteins are still rooted in predictions with the original Debye–Hückel model. According to this model the energy of a pair of interacting charges is given by

$$\Delta G_{ij} = \frac{332\, Z_i Z_j}{\varepsilon_{\mathrm{H_2O}}\, r_{ij}}\, e^{-\kappa r_{ij}} \qquad (10)$$

where $\kappa = 50.29\,(I/\varepsilon_{\mathrm{H_2O}} T)^{1/2}$, with the ionic strength, I, in moles/liter, and T in K. This function predicts a dramatic decay in ΔGij with increasing ionic strength when rij is comparable to the distance between the charges in an ion pair. This has led to the idea that salt sensitivity of an equilibrium process is diagnostic of energetic contributions by ion pairs to that equilibrium. That is not what was found experimentally in NMR studies of the salt sensitivity of pKa values of groups involved in ion pairs. The pKa values of residues in ion pairs in proteins are largely insensitive to the ionic strength.58,62 This is exactly what would be expected with a more sophisticated theory that captured the fact that the volume occupied by the protein is not accessible to ions. The salt insensitivity of pKa values of charges involved in ion pairs is reproduced in pKa calculations with both SATK and FDPB methods.58

The data in Figs. 4, 5, and 6 suggest that the effects of ionic strength on equilibrium processes reflect mostly the attenuation of medium- and long-range coulombic interactions. The unusually high salt sensitivity of pKa values and coulombic interactions in staphylococcal nuclease is the result of its high charge density and of its high isoionic point. Charges are not balanced in nuclease; there are many more basic than acidic residues, and this imbalance is reflected in the salt sensitivity. None of the ionizable groups included in Fig. 5 make short-range coulombic interactions. The salt sensitivity of their pKa values reflects mostly the attenuation of medium- and long-range interactions. As these two histidines have many more repulsive than attractive interactions with the other ionizable residues in nuclease, the net effect of increasing salt concentration is to increase the pKa values. The same is true of Glu-52 shown in Fig. 5C, but in this case the predominant type of medium- and long-range coulombic interactions are attractive; therefore the pKa value increases with increasing salt concentration as these attractive coulombic interactions with basic residues are screened.
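Equation (10) is simple enough to evaluate directly. The sketch below assumes the usual convention for the constant 332 (energies in kcal/mol, distances in Å, charges in units of the elementary charge) and a dielectric constant of 78.5 for water at 25°C. These values and the function names are assumptions made for illustration, and the dielectric constant in the denominator is taken to be that of water.

```python
import math


def kappa(ionic_strength, eps_water=78.5, temp_k=298.15):
    """Debye screening parameter in 1/angstrom; ionic strength in mol/L."""
    return 50.29 * math.sqrt(ionic_strength / (eps_water * temp_k))


def delta_g_ij(z_i, z_j, r_ij, ionic_strength, eps_water=78.5, temp_k=298.15):
    """Screened coulombic energy of a charge pair, in the form of Eq. (10)."""
    screening = math.exp(-kappa(ionic_strength, eps_water, temp_k) * r_ij)
    return 332.0 * z_i * z_j / (eps_water * r_ij) * screening


# Fraction of the zero-salt energy that remains at each ionic strength.
for r_ij in (4.0, 10.0, 20.0):
    remaining = [delta_g_ij(1, -1, r_ij, i) / delta_g_ij(1, -1, r_ij, 0.0)
                 for i in (0.01, 0.1, 1.0)]
    print(r_ij, [round(x, 3) for x in remaining])
```

With these assumed parameters the screening length 1/κ at an ionic strength of 0.1 M is about 9.6 Å. The loop shows how strongly this simple model attenuates even a 4 Å pair at high salt, which is the prediction that the NMR data on ion pairs contradict.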

62 B. Kuhlman, D. Luisi, P. Young, and D. Raleigh, Biochemistry 38, 4896 (1999).


The salt sensitivity of pKa values is very different when the number and surface distribution of acidic and basic residues are balanced. In those cases, every attractive interaction is balanced by a repulsive interaction; thus the net screening effect of increasing ionic strength is nullified. The effects of increasing ionic strength are much smaller in these cases, and they are governed by the ionic strength sensitivity of the self-energy.58,62 This effect favors the charged form of the ionizable group at high salt concentration because of the favorable solvation of the charged species relative to the neutral one. Thus, in proteins where positive and negative charges are balanced, increasing ionic strength increases the pKa values of basic residues and decreases those of acidic residues. These effects are captured nearly quantitatively by the FDPB calculations, which include the self-energy term explicitly. They are missed entirely in the SATK calculations, which normally do not include a self-energy term.58

Electrostatic Interactions in the Denatured State

Reproducing the pH dependence of stability with structure-based pKa calculations is difficult. As shown in Fig. 1, this requires knowledge of the pKa values of the protein in the native and in the denatured states. In the calculations based on the thermodynamic cycle in Fig. 2 the pKa values of model compounds in water are used to represent the pKa values of the denatured state. This assumes the absence of any electrostatic effects in the denatured state. This has been shown to be an incorrect approximation in some cases,62,63 and an appropriate one in others.64 This is noteworthy in the context of this chapter primarily because it illustrates another reason that structure-based pKa calculations can fail to reproduce experimental thermodynamic quantities. In this case, the difficulties in capturing the pH dependence of stability of proteins are not related to the problems associated with computation of the electrostatic potential in the macromolecular milieu, or with the quantitative treatment of dielectric relaxation effects of proteins. They are related to the set of model compound pKa values used in the calculations. More experimental studies of the sequence dependence of pKa values in peptides, and new computational methods to reproduce the electrostatic properties of denatured states, will be necessary to overcome these problems.
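The role of the denatured-state pKa values in the pH dependence of stability can be illustrated with the standard linkage expression for independently titrating sites. This is a sketch under stated assumptions: sites are treated as non-interacting, the reference state is the high-pH limit in which every site is deprotonated, and the pKa values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def ddg_unfolding(ph, pkas_native, pkas_denatured, temp_k=298.15):
    """Proton-linked part of the unfolding free energy (kcal/mol).

    Returned relative to the high-pH limit where every site is
    deprotonated in both states. Sites are assumed to titrate independently.
    """
    total = 0.0
    for pk_n, pk_d in zip(pkas_native, pkas_denatured):
        total += math.log((1.0 + 10.0 ** (pk_d - ph)) /
                          (1.0 + 10.0 ** (pk_n - ph)))
    return -R_KCAL * temp_k * total


# Hypothetical histidine with a depressed pKa in the native state, and two
# alternative choices for its pKa in the denatured state.
native = [5.5]
for denatured in ([6.5], [6.0]):
    values = [round(ddg_unfolding(ph, native, denatured), 2) for ph in (3, 5, 7, 9)]
    print(denatured, values)
```

Changing the assumed denatured-state pKa by half a unit changes the predicted acid destabilization substantially, which is why the choice of model compound values matters as much as the quality of the native-state calculation.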

63 M. Oliveberg, V. L. Arcus, and A. R. Fersht, Biochemistry 34, 9424 (1995).
64 M. Tollinger, K. A. Crowhurst, L. E. Kay, and J. D. Forman-Kay, Proc. Natl. Acad. Sci. USA 100, 4545 (2003).


Physical Properties of Internal Ionizable Residues

Nowhere are the inherent limitations of structure-based pKa calculations with continuum methods more apparent than in the calculations of pKa values of internal residues. The molecular determinants of the pKa values of residues buried in the interior of proteins or at interfaces between macromolecules are drastically different from those of surface residues. Lys-66 in staphylococcal nuclease is useful to illustrate this. This residue was introduced by substitution of the naturally occurring Val-66 by site-directed mutagenesis. Lys-66 is buried in the hydrophobic core of nuclease, surrounded by nonpolar atoms.40,65–67 The ionizable moiety is approximately 12 Å from the protein–water interface. Figure 7 describes the dependence of the calculated pKa value of Lys-66 on the value of εin used in an FDPB calculation with a static structure. The contributions by coulombic, Born, and background energies are also shown. When εin = 80 the calculated pKa value of Lys-66 is higher than the pKa of 10.4 for Lys in a model compound. At this value of εin the contributions by the Born and background terms to the pKa are zero and the slight shift in pKa reflects the effects of long-range attractive interactions with acidic residues elsewhere on the protein. As the value of εin decreases the pKa value of Lys-66 also decreases. When a critical value of εin ≈ 10 is reached the dependence of pKa on εin becomes very steep. The source of the depression of the pKa is the destabilizing Born contribution. The removal of the Lys from water is costly because the protein cannot compensate for the loss of hydration; therefore the equilibrium between the neutral and the charged form is shifted in favor of the neutral form. As the dielectric constant used to treat the interior of the protein decreases the charged form of Lys-66 becomes destabilized relative to its state in water. The dependences of the pKa and of the Born contribution to the pKa on εin are almost parallel when εin < 25. At values of εin < 10 the contributions by coulombic interactions also become significant. They are also destabilizing; thus they also promote the neutral state of Lys-66 and depress its pKa. The coulombic contribution to the pKa of Lys-66 is much smaller than the contribution related to the Born energy. Notice also that the background contribution is very small, as expected for an amino group entirely surrounded by nonpolar groups. However, at very low values of εin, even the background energy, reflecting interactions with polar groups at medium range, becomes significant. The dramatic dependence of pKa values on the value of εin used to treat the protein is not unique to Lys-66. Similar behavior has been observed with other internal residues in nuclease and in other proteins.12,68

Fig. 7. Dependence of the calculated pKa value of the internal Lys-66 in staphylococcal nuclease on εin, calculated with the simplest implementation of FDPB. Also shown are the contributions by the Born energy, by the background term, and by coulombic interactions. The horizontal line represents the experimental pKa value of 5.7.40 Note that it takes εin ≈ 10 to reproduce the experimental pKa value.

65 W. E. Stites, A. G. Gittis, E. E. Lattman, and D. Shortle, J. Mol. Biol. 221, 7 (1991).
66 B. García-Moreno E., J. Dwyer, A. Gittis, E. Lattman, D. Spencer, and W. Stites, Biophys. Chem. 64, 211 (1997).
67 J. Dwyer, A. Gittis, D. Karp, E. Lattman, D. Spencer, W. Stites, and B. García-Moreno E., Biophys. J. 79, 1610 (2000).

Notice in Fig. 7 that it takes a value of εin ≈ 10 in an FDPB calculation with a static structure to capture the experimental pKa value of 5.7 for the internal Lys-66. The need for such a high value of εin to capture the pKa of this internal group suggests that the protein is undergoing some substantial structural relaxation concomitant with the ionization of the internal Lys-66. This is what εin = 10 captures implicitly.
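The steep dependence of the calculated pKa on εin at low εin follows from the 1/ε form of the Born energy. The sketch below uses the textbook Born expression for transferring a spherical ion from water into a uniform medium of dielectric constant εin. It is a crude stand-in for the FDPB calculation, and the cavity radius of 2 Å is a hypothetical value. The water dielectric constant of 80 is chosen to match the statement that the Born term vanishes at εin = 80.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def born_transfer_energy(eps_in, radius=2.0, charge=1.0, eps_water=80.0):
    """Born energy (kcal/mol) for moving an ion from water into eps_in.

    radius is in angstroms; 166 is half of the 332 coulombic constant.
    """
    return 166.0 * charge ** 2 / radius * (1.0 / eps_in - 1.0 / eps_water)


def lys_pka_born_only(eps_in, pka_model=10.4, temp_k=298.15):
    """pKa of a basic group if only this Born penalty acted on it."""
    dg = born_transfer_energy(eps_in)
    return pka_model - dg / (math.log(10.0) * R_KCAL * temp_k)


for eps_in in (80, 40, 20, 10, 6, 4):
    print(f"eps_in = {eps_in:3d}  Born = {born_transfer_energy(eps_in):6.2f} kcal/mol"
          f"  pKa = {lys_pka_born_only(eps_in):6.2f}")
```

The values are not expected to match Fig. 7, because the model ignores the shape of the protein, the depth of burial, and the coulombic and background terms. It shows only why the curve steepens as εin falls: equal decrements in εin produce ever larger increments in 1/εin.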
It is noteworthy that according to the intrinsic fluorescence and CD signals, the protein is folded and nativelike after ionization of the internal Lys-66. Further experimental studies will be needed to elucidate the nature of the conformational relaxation induced by the ionization of this internal residue. In contrast to the FDPB calculation, the PDLD method with the linear response approximation reproduces the pKa values of Lys-66 and of Glu-66 when the protein is treated with εin = 6.12 This represents a remarkable improvement over the FDPB calculation; the difference between energies calculated with εin = 10 and εin = 6 is very large (see Fig. 7). In principle the PDLD-LRA method should have captured the pKa of Lys-66 with εin = 2 because all relaxation processes except induced electronic polarization are treated explicitly in this calculation. The reason that εin > 2 was needed to reproduce the pKa values of these residues is probably related to limitations inherent to the simulation times that are used to obtain the trajectories used in the LRA calculations. The calculations could also be underestimating the contributions by water penetration, which has been observed crystallographically.67

68 S. Dao-pin, D. E. Anderson, W. A. Baase, F. W. Dahlquist, and B. W. Matthews, Biochemistry 30, 11521 (1991).

Concluding Remarks

The computational approaches for structure-based calculation of pKa values and electrostatic energies in proteins have improved dramatically. The wealth of experimental data on the physical character of surface ionizable groups that has accumulated has allowed rigorous testing, calibration, and modification of continuum methods. Calculations with the method based on the finite difference solution of the Poisson–Boltzmann equation are already useful to analyze a variety of electrostatic contributions to stability and function of proteins governed by surface ionizable groups. These methods capture quantitatively the ionic strength dependence of surface electrostatic effects, and, in general, are useful to estimate pKa values of all surface groups except those involved in strong, short-range interactions. Empirical approaches such as the modified Tanford–Kirkwood method are capable of reliably estimating pKa values of ion pairs on the protein surface. Calculation of pKa values and electrostatic energies in the protein interior or at interfaces between proteins remains very challenging. Not enough experimental data are available to guide the calibration and modification of existing methods. Our ability to calculate pKa values of internal groups will not improve until we learn how to capture dielectric properties of proteins quantitatively. The most advanced computational method for calculation of pKa values in environments secluded from solvent is the PDLD-LRA, which is in principle equipped to calculate explicitly all contributions to dielectric relaxation other than the contributions from induced dipoles reflected in εin = 2. More experimental data on energetics of ionization of internal residues are needed in order to continue to improve the PDLD-LRA methods. There is urgency in this matter. Without an improved ability to reproduce quantitatively the physical properties of internal ionizable groups, we will remain unable to understand the structural and physical basis of the biological function of most proteins.

Acknowledgments

Supported by grants from the National Science Foundation (MCB-0212414) and the National Institutes of Health (GM061597). The University of Houston Brownian Dynamics software was used to calculate the data shown in Figs. 3–7.


[3] Electrostatic Basis for Bioenergetics

By Avital Shurki, Marek Štrajbl, Claudia N. Schutz, and Arieh Warshel

Introduction

Understanding the molecular basis of biological energy transduction and storage is a problem of fundamental importance. Although different aspects of this problem have been elucidated,1–4 we still lack a detailed description of the relationship between the structure of biological systems and the way they store and utilize their energy. A detailed understanding should cover the action of molecular machines, proton pumps, electron pumps, and enzymes as well as the coupling between these systems. Our early experience has indicated that the best correlation between the structure and function of biological molecules is provided by the corresponding electrostatic energies.5,6 In this chapter we reevaluate the previous proposal in light of the progress made in the past 22 years. We will focus on several key systems and show that their energetics is best described in terms of the corresponding electrostatic energy.

Enzyme Catalysis

Enzymes play a pivotal role in biochemistry in controlling the rate of energy transduction. An example is the ATPase reaction in F1-ATPase.4,7 Thus it is important to explore the relationship between the structures of enzymes and the way they catalyze their reactions. Here we will illustrate that the best structure–function correlation in enzymes is obtained by evaluating the corresponding electrostatic energy. Since the subject of enzyme catalysis has been reviewed by us extensively,8 we will consider only several examples including some nontrivial cases. We will also use the discussion of enzyme catalysis as an opportunity 1

1 P. Mitchell, Nature 191, 144 (1961).
2 W. P. Jencks, “Catalysis in Chemistry and Enzymology.” Dover, New York, 1987.
3 H. Michel et al., Annu. Rev. Biophys. Biomol. Struct. 27, 329 (1998).
4 P. D. Boyer, Biochim. Biophys. Acta 1140, 215 (1993).
5 A. Warshel, Acc. Chem. Res. 14, 284 (1981).
6 A. Warshel, “Computer Modeling of Chemical Reactions in Enzymes and Solutions.” John Wiley & Sons, New York, 1991.
7 H. Wang and G. Oster, Nature 396, 279 (1998).
8 A. Warshel, Annu. Rev. Biophys. Biomol. Struct. 32, 425 (2003).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


to emphasize the importance of proper energy-based analysis, which is essential to any proper analysis of bioenergetics.

Defining the Problem in Enzyme Catalysis

To clarify the issue of enzyme catalysis we will consider a generic enzymatic reaction using the equation

$$E + S \overset{K}{\rightleftharpoons} ES \xrightarrow{k_{\mathrm{cat}}} ES^{\ddagger} \rightarrow EP \rightarrow E + P \tag{1}$$

where E, S, and P are the enzyme, substrate, and product, respectively; ES, EP, and $ES^{\ddagger}$ are the enzyme–substrate complex, enzyme–product complex, and transition state, respectively; and $K = k_{-1}/k_1$. As was shown so eloquently by Wolfenden and Snider,9 many enzymes evolved by optimizing $k_{\mathrm{cat}}/K_M$, where $K_M = (k_{-1} + k_{\mathrm{cat}})/k_1$, which can be approximated as $K_M \approx k_{-1}/k_1$. However, this and related findings did not identify the factors responsible for the catalytic effect. As will be shown below, the key question is related to the reduction of the activation barrier in the chemical step. What is needed here is a quantitative tool for structure–function correlation and the ability to determine the individual contributions to the overall catalytic effect. It is gradually becoming clear that this requirement can best be met by computer simulation approaches.

Analysis of enzyme catalysis requires proper definition of the relevant questions. First, when talking about catalysis it is essential to ask “catalysis relative to what?” Here we must define a reference reaction, and the most obvious reference is the uncatalyzed reaction in water (Fig. 1). Thus elucidation of the origin of enzyme catalysis involves understanding the origin of the difference between the activation barrier in water ($\Delta g^{\ddagger}_w$) and the activation barrier in the protein ($\Delta g^{\ddagger}_p = \Delta g^{\ddagger}_{\mathrm{enz}}$). However, the enzyme can reduce $\Delta g^{\ddagger}_p$ by binding the substrate with equal strength in the reactant state (RS) and the transition state (TS) (in which case $\Delta g^{\ddagger}_p - \Delta g^{\ddagger}_w = \Delta G_{\mathrm{bind}}$) and/or by reducing the activation barrier $\Delta g^{\ddagger}_{\mathrm{cat}}$ for the chemical step. Since the factors that control the binding step are well understood, the real puzzle is related to the factors that govern the reduction of the activation barrier of the chemical step ($\Delta g^{\ddagger}_{\mathrm{cat}}$). This is, of course, a question of energetics.
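The kinetic definitions above can be checked numerically. The following is a minimal sketch and not part of the original analysis: all rate constants are hypothetical values chosen for illustration, and the conversion of a rate constant into an activation free energy assumes simple transition state theory with a transmission factor of one.

```python
import math

# Physical constants
K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
R_KCAL = 1.987204e-3        # gas constant, kcal/(mol*K)


def michaelis_constant(k1, k_minus1, kcat):
    """K_M = (k_-1 + k_cat) / k_1."""
    return (k_minus1 + kcat) / k1


def activation_free_energy(rate, temperature=298.15):
    """Activation free energy (kcal/mol) from a first-order rate constant (1/s).

    Assumption: transition state theory, k = (k_B T / h) exp(-dg / RT).
    """
    prefactor = K_B * temperature / H_PLANCK
    return -R_KCAL * temperature * math.log(rate / prefactor)


# Hypothetical rate constants, for illustration only.
k1, k_minus1, kcat = 1.0e7, 1.0e4, 1.0e2   # M^-1 s^-1, s^-1, s^-1
k_water = 1.0e-8                            # s^-1, hypothetical uncatalyzed rate

K_M = michaelis_constant(k1, k_minus1, kcat)
K_M_approx = k_minus1 / k1                  # reasonable when k_cat << k_-1
dg_cat = activation_free_energy(kcat)
dg_water = activation_free_energy(k_water)

print(f"K_M = {K_M:.3e} M (approximation: {K_M_approx:.3e} M)")
print(f"dg_cat = {dg_cat:.1f} kcal/mol, dg_w = {dg_water:.1f} kcal/mol")
print(f"barrier reduction = {dg_water - dg_cat:.1f} kcal/mol")
```

With these made-up numbers the approximation $K_M \approx k_{-1}/k_1$ is off by about 1%, and the barrier reduction depends only on the ratio of the two rate constants, which is why the choice of reference reaction matters so much.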
Experimentally, even the question of whether the enzyme works by destabilizing the RS or by stabilizing the TS is not easily resolved, although the type of mutational analysis introduced by Warshel10 (see also Kienhofer et al.11) can help in this respect. In view of the above discussion we should focus

9 R. Wolfenden and M. J. Snider, Acc. Chem. Res. 34, 938 (2001).
10 A. Warshel, J. Biol. Chem. 273, 27035 (1998).
11 A. Kienhofer, P. Kast, and D. Hilvert, J. Am. Chem. Soc. 125(11), 3206 (2003).


Fig. 1. Comparing the activation free energy profile for a reference reaction in water and the corresponding reaction in an enzyme active site. The figure illustrates the relationship between $\Delta g^{\ddagger}_{\mathrm{cat}}$, $\Delta g^{\ddagger}_{\mathrm{cage}}$, and $\Delta g^{\ddagger}_w$.
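A short worked relation may help in reading Fig. 1. It is a sketch under the assumption that $\Delta g^{\ddagger}_p$ is measured from the free enzyme and substrate (E + S), that $\Delta g^{\ddagger}_{\mathrm{cat}}$ is measured from the bound complex (ES), and that $\Delta G_{\mathrm{bind}}$ is the free energy of forming ES:

```latex
\begin{align}
\Delta g^{\ddagger}_{p} &= \Delta G_{\mathrm{bind}} + \Delta g^{\ddagger}_{\mathrm{cat}} \\
\Delta g^{\ddagger}_{p} - \Delta g^{\ddagger}_{w} &= \Delta G_{\mathrm{bind}}
  + \left(\Delta g^{\ddagger}_{\mathrm{cat}} - \Delta g^{\ddagger}_{w}\right)
\end{align}
```

If the chemical barrier is unchanged ($\Delta g^{\ddagger}_{\mathrm{cat}} = \Delta g^{\ddagger}_{w}$), the second line reduces to $\Delta g^{\ddagger}_p - \Delta g^{\ddagger}_w = \Delta G_{\mathrm{bind}}$, the pure binding case mentioned in the text. Any further reduction must come from the term in parentheses.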

on two questions: (1) What are the contributions responsible for the difference between $\Delta g^{\ddagger}_{\mathrm{cat}}$ and $\Delta g^{\ddagger}_{\mathrm{cage}}$, and (2) how do these contributions operate (i.e., do they destabilize the RS or stabilize the TS)? Many proposals (see Villa and Warshel12 for a partial list) have attempted to address the above questions in either direct or indirect ways. However, most of the proposals have not been defined in a clear logical way or properly analyzed by their proponents. This, in part, is due to the difficulty of performing such an analysis before the emergence of computer simulations. Moreover, some proposals have not considered a proper thermodynamic cycle (see below). In fact, a major problem in the field is the use of soft definitions, which cannot be examined conceptually or computationally. As will be shown below, with a clear definition of the problem and with a combination of experimental and computational studies it is possible to explore and quantify the origin of the catalytic power of enzymes.

Electrostatic Basis for Structure–Function Correlation in Enzymes

The field of computer simulations of chemical reactions in enzymes started in 1976.13 This field has expanded rapidly in recent years8 and has reached a stage where it is possible to reproduce the observed catalytic effect, while

12 J. Villa and A. Warshel, J. Phys. Chem. B 105, 7887 (2001).
13 A. Warshel and M. Levitt, J. Mol. Biol. 103, 227 (1976).


calibrating the energy of the reference reaction on the corresponding observed energy (in this way the calculations focus on the difference between the reaction in water and in the enzyme active site). Properly calibrated studies have found that the difference between $\Delta g^{\ddagger}_{\mathrm{cat}}$ and $\Delta g^{\ddagger}_{\mathrm{cage}}$ is mainly due to electrostatic effects. The calculations indicated that enzymes “solvate” their TSs more than the corresponding TSs in the reference solution reactions.14 The nature of this “solvation” effect appeared to be far from obvious. That is, the calculated interaction energy between the TS charges of the reacting atoms and the enzyme was found to be similar to the corresponding interaction energies in solution. This finding indicated that the entire electrostatic energy associated with the formation of the TS must be examined, rather than only the interaction energy at the TS.6 This includes the penalty for the reorganization of the environment upon “charging” the TS. To quantify this catalytic effect it is useful to express the electrostatic free energy of the TS charges by the linear response approximation (LRA) using15

$$\Delta G(Q^{\ddagger}) = 0.5\left[\langle U(Q = Q^{\ddagger}) - U(Q = 0)\rangle_{Q = Q^{\ddagger}} + \langle U(Q = Q^{\ddagger}) - U(Q = 0)\rangle_{Q = 0}\right] = 0.5\left[\langle \Delta U \rangle_{Q^{\ddagger}} + \langle \Delta U \rangle_{0}\right] \tag{2}$$

where U is the solute–solvent interaction potential, Q designates the residual charges of the solute atoms with $Q^{\ddagger}$ indicating the TS charges, and $\langle \Delta U \rangle_Q$ designates an average over configurations obtained from an MD run with the given solute charge distribution. The first term in Eq. (2) is the above-mentioned interaction energy at the TS, where $Q = Q^{\ddagger}$, which is similar in the enzyme and in solution. The second term expresses the effect of the preorganization of the environment. If the environment is randomly oriented toward the TS in the absence of charge (as is the case in water), then the second term is zero and we obtain

$$\Delta G(Q^{\ddagger})_w = \tfrac{1}{2}\langle \Delta U \rangle_{Q^{\ddagger}} \tag{3}$$

where the electrostatic free energy is half of the average electrostatic potential.16 However, in the preorganized environment of an enzyme we obtain a significant contribution from the second term and the overall $\Delta G(Q^{\ddagger})$ is more negative than in water. This extra stabilization is the catalytic effect of the enzyme. Another way to see this effect is to realize that in water, where the solvent dipoles are randomly oriented around the uncharged

14 A. Warshel, Proc. Natl. Acad. Sci. USA 75, 5250 (1978).
15 F. S. Lee et al., Protein Eng. 5, 215 (1992).
16 A. Warshel and S. T. Russell, Q. Rev. Biophys. 17, 283 (1984).
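The LRA expression of Eq. (2) and its water limit, Eq. (3), can be sketched in a few lines. The energy-gap samples below are hypothetical numbers chosen only to mimic the situation described in the text (similar interaction energy at the TS in both environments, preorganization only in the enzyme); in a real study they would come from MD runs with and without the TS charges.

```python
from statistics import mean


def lra_free_energy(gap_charged, gap_uncharged):
    """Linear response estimate: 0.5 * (<dU>_{Q=Q_ts} + <dU>_{Q=0}).

    gap_charged   : samples of dU = U(Q=Q_ts) - U(Q=0) from a run with TS charges
    gap_uncharged : samples of the same dU from a run with zero solute charges
    """
    return 0.5 * (mean(gap_charged) + mean(gap_uncharged))


# Hypothetical energy-gap samples in kcal/mol, for illustration only.
# Both environments interact equally well with the charged TS ...
water_charged = [-82.0, -78.0, -80.0, -81.0, -79.0]
enzyme_charged = [-81.0, -79.0, -80.0, -82.0, -78.0]
# ... but only the enzyme dipoles already point toward the TS before charging.
water_uncharged = [1.5, -2.0, 0.5, -1.0, 1.0]          # averages to zero
enzyme_uncharged = [-21.0, -19.0, -20.0, -18.0, -22.0]

dG_water = lra_free_energy(water_charged, water_uncharged)
dG_enzyme = lra_free_energy(enzyme_charged, enzyme_uncharged)

print(f"dG(Q_ts) in water : {dG_water:.1f} kcal/mol")
print(f"dG(Q_ts) in enzyme: {dG_enzyme:.1f} kcal/mol")
print(f"extra stabilization in enzyme: {dG_enzyme - dG_water:.1f} kcal/mol")
```

In this toy data set the first LRA term is identical in the two environments, so the whole difference comes from the second (preorganization) term.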


form of the TS, the activation free energy includes the free energy needed to reorganize the solvent dipoles toward the charged TS. On the other hand, the reaction in the protein costs less reorganization energy since the active site dipoles (associated with polar groups, charged groups, and water molecules) are already partially preorganized toward the TS charges.6,14 The reorganization energy is related to the well-known Marcus reorganization energy, but it is not equal to it. More specifically, the Marcus reorganization energy17 is related to the transfer from the reactant to the product state, while here we deal with charging the TS.12 Nevertheless, calculations of the Marcus reorganization energy in the enzyme ($\lambda_p$) and in solution ($\lambda_w$) are also consistent with the above idea, and it has been repeatedly found that $\lambda_p$ is smaller than $\lambda_w$.10,12,18,19

Now, the finding that electrostatic free energies are the key element in structure–catalysis correlation may seem to be inconsistent with many available alternative proposals. The problems with these proposals have been clarified recently.12 Thus we will focus only on some key alternative proposals and demonstrate that they are valid only when they reflect electrostatic effects.

Steric Confinement Effects Are Significant Only When They Reflect Electrostatic TSS: The NAC and Related Proposals

Several proposals for the origin of enzyme catalysis involve ground state steric strain.20–22 The original strain hypothesis of Phillips and coworkers20 and related subsequent works22 invoked the idea of “molding” the substrate toward the TS by strong steric forces. This idea was inconsistent with simulation studies, which demonstrated that enzymes are flexible and do not apply strong steric forces.6,13 A refined proposal, which is related to the strain hypothesis, has been put forward by Bruice and coworkers (e.g., ref. 23), who suggested that the confinement of the enzyme active site brings the substrate to a so-called near attack conformation (NAC), which is closer to the corresponding TS than in the reference reaction in water. However, the definition of this proposal was based on selecting a critical distance and angle, where the NAC is supposed to occur, rather than on free energy surfaces that could be related directly to the difference

17 R. A. Marcus, J. Chem. Phys. 24, 966 (1956).
18 A. Yadav et al., J. Am. Chem. Soc. 113, 4800 (1991).
19 J. Åqvist and M. Fothergill, J. Biol. Chem. 271, 10010 (1996).
20 L. O. Ford et al., J. Mol. Biol. 88, 349 (1974).
21 N. A. Khanjin, J. P. Snyder, and F. M. Menger, J. Am. Chem. Soc. 121, 11831 (1999).
22 O. Tapia, J. Andres, and V. S. Safont, J. Chem. Soc. Faraday Trans. 1 90, 2365 (1994).
23 T. C. Bruice, Acc. Chem. Res. 35(3), 139 (2002).


between $\Delta g^{\ddagger}_{\mathrm{cage}}$ and $\Delta g^{\ddagger}_{\mathrm{cat}}$. That is, the activation free energy can be defined only by the difference between the free energy at the TS and the lowest point at the RS minima, or by the difference between the free energy in the TS and the overall free energy of the RS.24 Thus, selecting an arbitrary point along the reaction coordinate as a reference for the evaluation of the activation energy cannot give unique results. There are clear cases in which it is easy to show that the NAC effect does not contribute significantly to catalysis.25 However, to clarify our perspective we will intentionally take the chorismate mutase (CM) case, in which the NAC effect might seem to be very significant. The catalytic reaction of this enzyme (i.e., the Claisen rearrangement of chorismate to prephenate26; see Fig. 2) has been the subject of a large number of theoretical studies (e.g., refs. 21, 27, and 28). A superficial examination of the results of different studies may suggest that we have here a clear case of reactant state destabilization (RSD). In particular, we can focus on recent MD studies of Hur and Bruice.29,30a These studies indicated that the enzyme helps in bringing the reacting atoms of the substrate to a typical distance that they defined as NAC, a distance that is rarely attained in water. Thus they considered this NAC effect as the major reason for the catalytic power of CM. As clarified in our previous work,25 the NAC effect and the NAC distance are poorly defined and cannot be uniquely related to the catalytic effect of the given enzyme (see also below). Nevertheless, the MD studies of Hur and Bruice29,30a indicate that the C9...C1 distance of the substrate (Fig. 2) can reach about 3.7 Å for very little free energy, while according to their estimate it costs 8 kcal/mol in water.30a The question that we would like to address concerns the meaning of this apparent NAC effect.
In other words, we would like to find out whether the NAC represents a genuine reason for catalysis or merely reflects the result of electrostatic transition state stabilization (TSS). These two options are described in Fig. 3. According to Fig. 3, in the case of RSD the enzyme pushes the reacting fragments in the RS toward the TS, so that the average distance along the reaction coordinate in the RS ($\langle R \rangle_{\mathrm{RS}}$ in Fig. 3) is shorter in the protein than in water. The second option is the TSS mechanism, which stabilizes both the RS and the TS in

24 A. Warshel and W. W. Parson, Q. Rev. Biophys. 34, 563 (2001).
25 A. Shurki et al., J. Am. Chem. Soc. 124, 4097 (2002).
26 E. Haslam, “Shikimic Acid: Metabolism and Metabolites.” John Wiley & Sons, New York, 1993.
27 S. Martí et al., J. Phys. Chem. B 104, 11308 (2000).
28 H. Guo et al., Proc. Natl. Acad. Sci. USA 98, 9032 (2001).
29 S. Hur and T. C. Bruice, Proc. Natl. Acad. Sci. USA 99, 1176 (2002).
30a S. Hur and T. C. Bruice, J. Am. Chem. Soc. 125, 1472 (2003).


Fig. 2. The rearrangement of chorismic acid (1) to prephenic acid (3). The presumed transition state is shown (2). This [3,3]-pericyclic process is formally analogous to a Claisen rearrangement.

Fig. 3. Describing the two possible definitions of the NAC effect in terms of the free energy profile of the reacting fragments along the reaction coordinate. (A) The NAC effect is associated with ground state steric destabilization. That is, the free energy surface of the reacting fragments is pushed up in the RS and thus leads to a small $\Delta g^{\ddagger}_{\mathrm{cat}}$. The modification of the water free energy surface by the enzyme is presumably due to the binding of the nonreactive parts of the substrate (this energy is not included in the figure). (B) The enzyme stabilizes the TS by electrostatic effects and in doing so pushes the RS minimum closer to the TS position. This option is the regular TSS proposal, which has nothing to do with the implications of the NAC proposal.


the protein more than in water. However, the stabilization is larger in the TS than in the RS, so that $\Delta G^{w \to p}_{\mathrm{TS}}$ is larger than $\Delta G^{w \to p}_{\mathrm{RS}}$ (where w and p designate water and protein, respectively). The two options in Fig. 3 correspond to two very different limits and to completely different proposals.

Bruice and co-workers suggested that CM catalyzes its reaction by helping C9–C1 reach a NAC distance (about 3.7 Å), which is hard to reach in water. Now, from a rigorous point of view, the NAC concept is arbitrary since one can choose any distance on the way to the TS (including the TS). Nevertheless, as explained before,25 it is possible to formulate what is probably meant by the NAC concept in a relatively clear way by the restraint release approach of Shurki et al.25 Similar results can be obtained by a simpler approach focusing on $\langle R \rangle^{p}_{\mathrm{RS}}$ and $\langle R \rangle^{w}_{\mathrm{RS}}$, where R is the solute contribution to the reaction coordinate. That is, if we evaluate $\langle R \rangle^{p}_{\mathrm{RS}}$ we may ask how much it would cost to reach $\langle R \rangle^{p}_{\mathrm{RS}}$ in water. We may approximate the NAC free energy by

$$\Delta G_{\mathrm{NAC}} \approx \Delta G\left(R = \langle R \rangle^{p}_{\mathrm{RS}}\right)_w - \Delta G\left(R = \langle R \rangle^{w}_{\mathrm{RS}}\right)_w \tag{4}$$

where $\Delta G(R)_w$ is the value of the free energy profile in water at the indicated R. In the above approximation $\langle R \rangle^{p}_{\mathrm{RS}}$ provides a proper definition for the NAC distance. However, this analysis does not reveal the origin of the NAC effect. Here we can move again to Fig. 3. In the RSD case (Fig. 3A) we have a real confinement effect where the enzyme pushes the reactants toward the TS and this leads to a reduction of $\Delta g^{\ddagger}_{\mathrm{cat}}$. Now in the TSS case (Fig. 3B) we also have a “compression” effect where $\langle R \rangle^{p}_{\mathrm{RS}}$ is smaller than $\langle R \rangle^{w}_{\mathrm{RS}}$. However, this is simply a reflection of the fact that the TSS flattens the potential surface in the enzyme and thus $\langle R \rangle^{p}_{\mathrm{RS}}$ moves closer to $\langle R \rangle^{p}_{\mathrm{TS}}$ (Fig. 3B). Obviously, we need to conduct a careful analysis to explore whether the catalytic effect of CM reflects RSD or TSS.

In a recent study31 we examined the reaction of CM by the empirical valence bond (EVB) method.6 Our simulations reproduced quantitatively the catalytic effect of CM, thus establishing the reliability of our analysis. After validating the quantitative nature of our free energy calculations we analyzed the NAC effect. As a first step of this analysis we performed LRA calculations of the binding free energy of the RS and TS. The results combined with the EVB free energy profiles are summarized in Fig. 4. Using the binding free energies established that the correct version of Fig. 4 corresponds to TSS. Thus, the apparent NAC is not the reason for the catalysis, but the result of TSS. With this in mind, we tried to find the exact reason for the apparent NAC effect. This was done by performing two sets of calculations: one set with the full EVB, and the other in which


Fig. 4. Finding the relative positions of the EVB free energy surfaces of CM and the corresponding reaction in water. Using the LRA analysis to obtain the binding free energy we are able to show that the protein curve corresponds to TSS.

Fig. 5. The free energy surfaces for the reaction in water and in CM for calculations that include the complete system (dark lines) and calculations that omit charges of the carboxylate groups.

we omitted the electrostatic interaction between the two carboxylates and between the carboxylates and their surrounding environment. The results of our analysis are shown in Fig. 5. As seen from the figure, the omission of the electrostatic contribution from the carboxylates leads to the disappearance of most of the catalytic effect, and the rest disappears when all residual charges of the substrate are set to zero. Thus, the main difference between the reaction in CM and in water is due to TSS electrostatic effects.
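The NAC free energy of Eq. (4) is easy to evaluate once a free energy profile in water is available. The sketch below is illustrative only: the harmonic reactant-state profiles, their force constant, and their minima are hypothetical stand-ins for profiles that would normally come from free energy simulations, and the helper names are not from the original work.

```python
import math

RT = 0.593  # kcal/mol at about 298 K


def boltzmann_average(r_values, free_energies):
    """Average R over the reactant region, weighting grid points by exp(-dG/RT)."""
    weights = [math.exp(-g / RT) for g in free_energies]
    return sum(r * w for r, w in zip(r_values, weights)) / sum(weights)


def nac_free_energy(profile_w, r_rs_protein, r_rs_water):
    """Eq. (4): cost in water of moving from <R>_RS(water) to <R>_RS(protein)."""
    return profile_w(r_rs_protein) - profile_w(r_rs_water)


# Hypothetical harmonic reactant-state profiles (kcal/mol, R in angstrom).
def profile_water(r):
    return 0.5 * 4.0 * (r - 4.5) ** 2


def profile_protein(r):
    return 0.5 * 4.0 * (r - 3.8) ** 2


grid = [2.0 + 0.01 * i for i in range(451)]   # 2.0 ... 6.5 angstrom
r_w = boltzmann_average(grid, [profile_water(r) for r in grid])
r_p = boltzmann_average(grid, [profile_protein(r) for r in grid])
dG_nac = nac_free_energy(profile_water, r_p, r_w)

print(f"<R>_RS in water   = {r_w:.2f} A")
print(f"<R>_RS in protein = {r_p:.2f} A")
print(f"dG_NAC            = {dG_nac:.2f} kcal/mol")
```

As the text stresses, a number obtained this way says nothing about its origin: the same shift in $\langle R \rangle_{\mathrm{RS}}$ can arise from steric confinement or from electrostatic TSS, and only an energy decomposition such as the charge-omission calculations of Fig. 5 can tell the two apart.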


Now, the reacting system (Fig. 2) involves two negative charges, which are covalently linked to the atoms involved in the bond making process. Thus, a major part of the electrostatic TSS involves stabilizing the two carboxylates at close proximity to each other. This electrostatic stabilization leads to a reduction in the average C9–C1 distance relative to the corresponding distance in water. Although this change in equilibrium distance can be called an NAC effect, it is simply the result of, rather than the reason for, the catalytic effect. That is, in analyzing enzyme catalysis it is very important to determine what factors actually lead to the overall catalytic effect and to distinguish these factors from other effects that are by-products of the catalytic factors. For example, if an enzyme operates by electrostatic stabilization and the electrostatic effect also leads to a change in the color of the substrate, we will have to recognize that the change in color is a result and not a reason.

A recent work of Hur and Bruice30b presented a good correlation between the calculated NAC energies and the increase in activation barrier for different mutants of CM. The correlation was presented as proof that the catalytic effect of CM is not associated with TS stabilization. Unfortunately, there are several major problems with the above assertion. First, any explicit energy diagram used by Bruice and coworkers to present their NAC proposal in CM corresponds to a very clear TSS situation and not to RSD or any other mechanism (see also the analysis in Fig. 3 of Štrajbl et al.31). Second, the calculated NAC energies seem to represent significant overestimates. For example, the calculated NAC energy for the water reaction is much larger than the value obtained by other workers, including Štrajbl et al.31 Similarly, the values for some mutants are likely to represent an overestimate (they reflect an upper limit with some arbitrary definitions).
This means that the correlation is not so impressive. Third, and most importantly, a much better correlation is obtained between the changes in the TS energies and the activation free energy. This is also an established experimental fact since KM is similar for the native enzyme and the most important mutant.11 Of course, in cases where the NAC effect is correlated with the TSS we will also have a correlation between the NAC effect and the catalytic effect. Finally, we would like to make a general point. At present all consistent studies indicate that enzymes do not work by RSD but by TSS.32 Now, since enzyme catalysis involves TSS and since this TSS is due to electrostatic preorganization, it is preferable to calculate the electrostatic stabilization

30b S. Hur and T. C. Bruice, J. Am. Chem. Soc. 125, 10540 (2003).
31 M. Štrajbl et al., J. Am. Chem. Soc. 125, 10228 (2003).
32 A. Warshel et al., Biochemistry 39, 14728 (2000).


effect and to obtain a clear estimate of the catalytic effect than to evaluate the NAC effect, which might or might not help in predicting the catalytic effect.6 It is important to note that the fact that CM works by electrostatic TSS was recently established experimentally in the work of Hilvert and co-workers.11

Most Reactant State Destabilization Proposals Reflect Incorrect Thermodynamic Cycles

Many catalytic proposals involve the idea of RSD. These proposals include the above-mentioned strain proposal, the entropy proposals,2,33,34 and the popular concept that enzymes provide a nonpolar (sometimes described as gas phase–like) environment that destabilizes highly charged ground states.32,35–37 All of these proposals were considered elsewhere,12,24 but in the general context of bioenergetics it is very useful to consider the desolvation and related proposals. The validity of the desolvation idea has been examined carefully6 and was shown to reflect improper thermodynamic cycles that do not use a proper reference state. This amounts to ignoring the desolvation energy associated with taking the RS from water to a hypothetical nonpolar enzyme site. With a proper reference state, one finds6 that a polar TS is less stable in nonpolar sites than in water and that the RSD does not help in increasing $k_{\mathrm{cat}}/K_M$. Thus, there is no evolutionary pressure for this mechanism. In fact, many desolvation models (e.g., refs. 36 and 38) involve ionized residues in a nonpolar environment. Such residues would not be ionized in nonpolar sites. Moreover, in any specific case, when the structure of the active site is known, one finds by current electrostatic models a very polar (rather than nonpolar) active site environment near the chemically active part of the substrate. A case in point is pyruvate decarboxylase, which was put forward as a classic case of RSD by desolvation.35 However, the structure of this enzyme39 appeared to be very polar.
In this respect we would also like to clarify a point of confusion in Jordan et al.,40 who assume that our opinion on polarity reflects a “simple inspection.” In fact, many of our studies that identify protein polarity as their most important feature are based on careful conversion of

33 P. A. Kollman et al., Acc. Chem. Res. 34, 72 (2001).
34 D. Blow, Structure 8, R77 (2000).
35 J. Crosby, R. Stone, and G. B. Lienhard, J. Am. Chem. Soc. 92, 2891 (1970).
36 J. K. Lee and K. N. Houk, Science 276, 942 (1997).
37 M. J. S. Dewar and D. M. Storch, Proc. Natl. Acad. Sci. USA 82, 2225 (1985).
38 F. C. Lightstone et al., Proc. Natl. Acad. Sci. USA 94, 8417 (1997).
39 P. Arjunan et al., J. Mol. Biol. 256, 590 (1996).
40 F. Jordan, H. Li, and A. Brown, Biochemistry 38(20), 6369 (1999).


X-ray structures to energetics.16,41,42 At any rate, despite the obvious fact that groups near charges were in a polar rather than nonpolar environment, it is still assumed40 that ion pairs are stabilized in a nonpolar environment and that this is the way pyruvate decarboxylase catalyzes its reaction. However, as clarified in many of our papers, ion pairs are destabilized (relative to water) rather than stabilized (e.g., ref. 43). Failing to realize this rigorous but seemingly counterintuitive point is one of the obstacles to the progress of modern bioenergetics.

A detailed illustration of the problem with the RSD proposal has been given in the case of orotidine 5′-monophosphate decarboxylase (ODCase).32 The catalytic action of this enzyme was first proposed to involve the desolvation effect.36 This was shown to involve an incorrect thermodynamic cycle.44 The elucidation of the structure of this enzyme showed that its active site is extremely polar (highly charged), but this led to a new RSD proposal in which the negatively charged groups of the protein destabilize the carboxylate of the orotate substrate.45 This proposal was shown to be inconsistent with the nature of the system, since a destabilized orotate will accept a proton and become stable.32 Furthermore, careful computational study illustrates that the protein works by TSS and not by RSD.32 This was shown first by including correctly the proton donor (Lys-72) in the reaction system and also by using the same reaction system as used by Wu et al.45 This fact was apparently overlooked in a recent review,46 in which it was suggested that the calculations of Warshel et al.32 involved only a specially selected reaction system.
Finally, the fact that the system works by TSS was established experimentally in the mutational studies of Wolfenden and co-workers.47

In discussing the proper use of thermodynamic cycles in bioenergetics in general and in analyzing the desolvation hypothesis in particular, it is useful to consider a recent paper by Devi-Kesavan and Gao.48 They examined the origin of the catalytic power of haloalkane dehalogenase (DhlA) by a QM/MM approach. The calculations reproduced the correct trend of the catalytic effect, indicating that it is due to electrostatic effects. It was also found that the activation barrier is higher in water than in the enzyme and that the reaction in water involves loss of solvation energy (the

41 A. Warshel, Nature 333, 15 (1987).
42 A. Warshel and A. Papazyan, Curr. Opin. Struct. Biol. 8, 211 (1998).
43 A. Warshel, S. T. Russell, and A. K. Churg, Proc. Natl. Acad. Sci. USA 81, 4785 (1984).
44 A. Warshel and J. Florian, Proc. Natl. Acad. Sci. USA 95, 5950 (1998).
45 N. Wu et al., Proc. Natl. Acad. Sci. USA 97(5), 2017 (2000).
46 J. Gao and D. G. Truhlar, Annu. Rev. Phys. Chem. 53, 467 (2002).
47 B. Miller and R. Wolfenden, Annu. Rev. Biochem. 71, 847 (2002).
48 L. S. Devi-Kesavan and J. Gao, J. Am. Chem. Soc. 125(6), 1532 (2003).


Fig. 6. Examining the desolvation hypothesis in haloalkane dehalogenase. The figure presents schematically the solvation free energies of the RS and TS, both in water and in the enzyme. It is shown clearly that the solvation effect stabilizes rather than destabilizes the RS in the enzyme (relative to water) but stabilizes the enzyme TS to a larger extent. This establishes that the enzyme works by a solvation rather than a desolvation mechanism.
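The bookkeeping behind Fig. 6 can be written out explicitly. The following is a minimal sketch in which the solvation free energies are hypothetical values (not taken from the DhlA calculations) and the function name is introduced here for illustration; the point is only that all four corners of the cycle are needed before a mechanism can be labeled.

```python
def solvation_cycle(rs_water, ts_water, rs_enzyme, ts_enzyme):
    """Compare absolute solvation free energies (kcal/mol, more negative = more stable).

    Returns the water -> enzyme change for the RS, the same change for the TS,
    the resulting change in the activation barrier, and a crude label based on
    whether the enzyme solvates the RS less than water does.
    """
    ddg_rs = rs_enzyme - rs_water
    ddg_ts = ts_enzyme - ts_water
    barrier_change = ddg_ts - ddg_rs
    if ddg_rs > 0.0:
        label = "RS solvated less than in water (desolvation picture)"
    else:
        label = "RS solvated at least as well as in water (solvation picture)"
    return ddg_rs, ddg_ts, barrier_change, label


# Hypothetical numbers: the TS is solvated less than the RS in water
# (a loss of 8 kcal/mol), while the enzyme solvates both states more than water.
case = solvation_cycle(rs_water=-70.0, ts_water=-62.0,
                       rs_enzyme=-72.0, ts_enzyme=-75.0)
ddg_rs, ddg_ts, barrier_change, label = case

print(f"RS: water -> enzyme = {ddg_rs:+.1f} kcal/mol")
print(f"TS: water -> enzyme = {ddg_ts:+.1f} kcal/mol")
print(f"change in barrier   = {barrier_change:+.1f} kcal/mol")
print(label)
```

Looking only at the water column (a loss of solvation on going from the RS to the TS) would not distinguish the two pictures; the label depends on the RS comparison between water and enzyme.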

corresponding solvation analysis was not done in the enzyme). However, these findings were confused with the desolvation proposal. That is, the desolvation proposal states very clearly and unambiguously that the enzyme solvates the RS less than water does, regardless of the difference between the solvation energy of the RS and the TS in water.2–3,6 Thus, examination of the origin of the catalytic effect should involve comparison of the absolute solvation energy of the RS in both the enzyme and water systems, as well as a study of the solvation of the TS in the enzyme and in water (rather than a study of the solvation only in water). Consistent studies25 (see also Fig. 6) have found that DhlA stabilizes (solvates) its RS somewhat more than water does, while solvating the TS much more than water. The above considerations are illustrated schematically and unambiguously in Fig. 6 and can be used as an instructive exercise in bioenergetics.

Electrostatic Stabilization by Hydrogen Bonds versus the LBHB Proposal

Hydrogen bonds (HBs) provide an excellent example of preorganized dipoles that should contribute to catalysis.14 Nevertheless, the catalytic effect of HBs has been the subject of significant controversy and in some cases led to proposals that overlooked key requirements of proper


energy-based analysis. Thus, we will consider the catalytic contribution of HBs as an excellent example of the need for clear energy considerations in bioenergetics. The first concrete support for the idea that HBs contribute to enzyme catalysis can be traced to the identification of the oxyanion hole in subtilisin.49 This structural observation did not involve, however, any estimate of the relevant catalytic energy. Subsequent theoretical studies14,50,51 have established the idea that the overall electrostatic effect of preorganized hydrogen bonds contributes in a major way to enzyme catalysis. These theoretical predictions were confirmed by mutation experiments, showing clearly that a single hydrogen bond can contribute around 5 kcal/mol to an ionic transition state.52,53 The results of some specific mutation experiments were subsequently reproduced by FEP/US calculations.50

After the experimental demonstration of TS stabilization by hydrogen bonds it was proposed that HBs stabilize TSs in a special nonelectrostatic way, which was termed low-barrier hydrogen bond (LBHB).54–56 The LBHB proposal54–56 suggested that catalytic HBs involve a single (or a flat ground state) rather than a double minimum. Unfortunately, this suggestion (which is sometimes true) does not allow one to distinguish the LBHB proposal from the previous electrostatic proposal of ionic HBs (and thus does not provide a relevant definition). To distinguish between ionic HBs and LBHB, it is essential to first define the LBHB proposal in a clear way, which reflects the energetics of the system and can be related to enzyme catalysis. At present the best (and in some respects the only) way to define the LBHB proposal is to use the valence-bond (VB) representation. This representation can be treated in a simplified two-state version of the three-state model of Coulson and Danielsson,57,58 augmented by the EVB solvent effect.6 In this representation (Fig. 7) one sees how the pure zero-order states, 1 = [X–H Y⁻] and 2 = [X⁻ H–Y], are mixed by an off-diagonal term $H_{12}$ (resonance term). When the energy gap $(g_2 - g_1)$ at $r_1^0$ is similar in magnitude to $H_{12}$ we have a single minimum with large covalent character, which is qualified to be defined as LBHB. The transition

49 R. A. Alden et al., Biochem. Biophys. Res. Commun. 45, 337 (1971).
50 A. Warshel, F. Sussman, and J.-K. Hwang, J. Mol. Biol. 201, 139 (1988).
51 S. N. Rao et al., Nature 328, 551 (1987).
52 R. J. Leatherbarrow, A. R. Fersht, and G. Winter, Proc. Natl. Acad. Sci. USA 82, 7840 (1985).
53 P. Carter and J. A. Wells, Proteins Struct. Funct. Genet. 6, 240 (1990).
54 W. W. Cleland and M. M. Kreevoy, Science 264, 1887 (1994).
55 P. A. Frey, S. A. Whitt, and J. B. Tobin, Science 264, 1927 (1994).
56 W. W. Cleland, P. A. Frey, and J. A. Gerlt, J. Biol. Chem. 273, 22529 (1998).
57 C. A. Coulson and U. Danielsson, Arkiv Fys. 8, 239 (1954).
58 C. A. Coulson and U. Danielsson, Arkiv Fys. 8, 245 (1954).


Fig. 7. (A) A two-state VB model for an ionic hydrogen bonded system.59 The free energies $g_1$ and $g_2$ correspond to the states (X–H Y⁻) and (X⁻ H–Y). The ground state surface $E_g$ (with a corresponding free energy surface $g$) is obtained from the mixing of the two states. The donor and the acceptor are held at a distance R. The equilibrium distances for isolated X–H and H–Y fragments are designated by $r_1^0$ and $r_2^0$. $\lambda$ and $\Delta G_{\mathrm{PT}}$ designate the reorganization energy and proton transfer free energy, respectively. (B) The LBHB limit occurs when $|H_{12}| > \lambda + \Delta G_{\mathrm{PT}}$, and also when $|H_{12}| \simeq \lambda + \Delta G_{\mathrm{PT}}$.
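The qualitative behavior of the two-state model in Fig. 7 can be reproduced with a few lines of code. This is a deliberately simplified sketch: the two zero-order states are represented by hypothetical harmonic curves along a single proton coordinate (the actual EVB treatment uses more realistic bond functions and includes the environment), and the force constant, minima, and mixing terms are illustrative values, not parameters from the original work.

```python
import math


def ground_state(e1, e2, h12):
    """Lower eigenvalue of the 2x2 Hamiltonian [[e1, h12], [h12, e2]]."""
    return 0.5 * (e1 + e2) - math.sqrt(0.25 * (e1 - e2) ** 2 + h12 ** 2)


def count_minima(values):
    """Number of strict interior local minima of a sampled curve."""
    return sum(
        1 for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    )


def profile(h12, force_const=200.0, r_min1=1.0, r_min2=1.6, dg_pt=0.0, n=601):
    """Ground state energy (kcal/mol) along a proton coordinate r (angstrom).

    r_min1 and r_min2 are hypothetical positions of the two zero-order minima.
    """
    rs = [0.7 + i * (1.2 / (n - 1)) for i in range(n)]
    curve = []
    for r in rs:
        e1 = 0.5 * force_const * (r - r_min1) ** 2            # state 1: X-H  Y(-)
        e2 = 0.5 * force_const * (r - r_min2) ** 2 + dg_pt    # state 2: X(-)  H-Y
        curve.append(ground_state(e1, e2, h12))
    return rs, curve


# Reorganization energy of these parabolas: 0.5 * k * (r_min2 - r_min1)^2
reorganization = 0.5 * 200.0 * (1.6 - 1.0) ** 2

weak_mixing = count_minima(profile(h12=5.0)[1])     # |H12| much smaller than lambda
strong_mixing = count_minima(profile(h12=50.0)[1])  # |H12| larger than lambda

print(f"lambda = {reorganization:.0f} kcal/mol")
print(f"minima with H12 = 5  kcal/mol: {weak_mixing}")
print(f"minima with H12 = 50 kcal/mol: {strong_mixing}")
```

In this toy model a small mixing term leaves two localized minima (the ionic HB picture), whereas a mixing term that exceeds the reorganization energy merges them into a single delocalized minimum. Anything that increases $\lambda$, such as a polar environment, pushes the system back toward the localized picture.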

between an ionic HB and LBHB can be quantified by considering the reorganization energy, λ, the mixing term H12, and the proton transfer free energy, ΔG_PT.59 In the LBHB limit, the covalent contribution (the mixing term H12) is larger than the electrostatic contribution (λ + ΔG_PT). In other words we are dealing here with a competition between the localized [X⁻ H–Y], [X–H Y⁻] picture and the delocalized [X^(−1/2) ··· H ··· Y^(−1/2)] picture. In the gas phase the delocalized picture tends to dominate while in solution the localized picture is more important. With these limiting cases in mind we can ask what is new in the LBHB proposal. Obviously the idea that HBs, which are preorganized to stabilize ionic TSs, contribute to catalysis is not new (see above). Thus, the only new element in the LBHB proposal is the idea that the covalent delocalized character, which leads to the single energy minimum (or a flat ground state), is the origin of the catalytic effect. In this respect, it should be clear that HBs in solution have a significant covalent character (see early works60). Furthermore, for the LBHB proposal to be valid the covalent character must be larger in the enzyme than in solution, and the corresponding difference must be the source of the HB catalytic effect.59 Obviously, these issues cannot be examined without evaluating the relevant energies. All current attempts to define the LBHB proposal in terms of experimental observations appear to be circular and logically flawed (e.g., ref. 59). The problem is that any experimental "proof" requires a theoretical interpretation and all current experimental observations are equally

59 A. Warshel and A. Papazyan, Proc. Natl. Acad. Sci. USA 93, 13665 (1996).
60 A. Warshel and R. M. Weiss, J. Am. Chem. Soc. 102(20), 6218 (1980).


consistent with the ionic HB model (see discussion in ref. 65), except pKa measurements, which are simply inconsistent with the LBHB proposal (see later). The entire issue of the validity of the LBHB proposal is related to the interaction between the environment and the VB states of the given ionic HB. The LBHB proponents (who originally assigned to LBHBs in enzymes the enormous energy, 20 kcal/mol, of gas phase LBHB54) argued that the enzyme environment is nonpolar. Obviously, such desolvation arguments are not useful without actual calculations of the relevant polarity and the corresponding solvation effect. Performing such calculations in a reliable way is the best way to examine the LBHB proposal. An excellent case for the analysis of the LBHB proposal is offered by the catalytic triad of serine proteases. Frey and co-workers55 put forward this system as a prime example of LBHB catalysis. They basically proposed that the proton of His-57 is shared between this residue and Asp-102, forming an O^(−1/2) ··· H ··· N^(+1/2) system at the TS of the hydrolytic reaction; this assertion is established by the definition of the LBHB as a system where ΔpKa = 0 and by all the drawings presented by the LBHB proponents (see Warshel et al.61 for the definition of the relevant system). Warshel and co-workers argued, in turn (based on early calculations61,62), that the pKa of Asp-102 is lower than that of His-57 in the protein, and, thus, the proton must be on His-57. This argument was supported by nuclear magnetic resonance (NMR) studies,63 which were ignored by the LBHB proponents. Now, instead of addressing the well-defined pKa issue, a new argument has been put forward by Cassidy et al.64 They considered the highly relevant system of chymotrypsin with a TS analogue (which will be referred to here as I) where the pKa of His-57 is shifted from 7 to 12. They then argued that the pKa shift is due to a binding-induced steric strain between Asp-102 and His-57, whose release supposedly leads to LBHB. The logical flaws of this proposal are quite significant,65 but here we will simply examine its validity. First, there is no obvious reason (or evidence) for binding-induced steric strain between Asp-102 and His-57. It is very likely that a negatively charged TS analogue should increase the pKa of His-57, and, thus, increase ΔpKa between Asp-102 and His-57 and reduce, rather than increase, the LBHB contribution. Although we believe that the trend expected from the I⁻ ··· His⁺ interaction based on simple electrostatic considerations should have excluded

61 A. Warshel et al., Biochemistry 28, 3629 (1989).
62 A. Warshel and S. Russell, J. Am. Chem. Soc. 108, 6569 (1986).
63 E. L. Ash et al., Science 278, 1128 (1997).
64 C. S. Cassidy, J. Lin, and P. A. Frey, Biochemistry 36, 4576 (1997).
65 C. N. Schutz and A. Warshel, Proteins Struct. Funct. Genet. (in press 2004).


Fig. 8. A schematic description of two thermodynamic cycles describing the different charging states of the intermediate structure with (A) and without (B) the negatively charged inhibitor for examining the LBHB proposal.64 The circles (from the left) designate Asp-102, His-57, and the inhibitor. The energy difference between the different states is given in kcal/mol.
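The free energies in thermodynamic cycles of this type are tied to pKa differences through the standard relation ΔG_PT = 2.3RT ΔpKa. The short sketch below applies this relation; the function name, the default temperature, and the pKa values in the usage line are placeholders chosen for illustration, and in a real protein each pKa has to be evaluated in the presence of the actual charge state of its partner.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pt_free_energy(pka_donor, pka_acceptor, temperature=298.15):
    """Free energy (kcal/mol) of moving a proton from a donor to an acceptor.

    Uses dG_PT = ln(10) * R * T * (pKa_donor - pKa_acceptor); a positive
    value means that the proton prefers to stay on the donor.
    """
    return math.log(10.0) * R_KCAL * temperature * (pka_donor - pka_acceptor)


if __name__ == "__main__":
    # Hypothetical pKa values, used only to show the order of magnitude.
    print(pt_free_energy(pka_donor=7.0, pka_acceptor=3.0))
```

Each pKa unit corresponds to about 1.4 kcal/mol at room temperature, which is why pKa differences of several units translate into proton transfer free energies of several kcal/mol.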

the proposal of Cassidy et al.64 as a valid illustration of the LBHB catalytic effect, we found it useful to examine the actual energetics of the system. The detailed analysis, which will be reported elsewhere,65 is based on the semimacroscopic version of the protein dipoles Langevin dipoles (PDLD) model, the PDLD/S-LRA method,66 which is one of the most reliable ways of evaluating pKas in proteins, and on EVB calculations.6 The results of this analysis are summarized in Fig. 8. As seen from Fig. 8, the ion pair state is more stable than the nonpolar pair state by 6–8 kcal/mol in the presence of the TS analogue and about 5–7 kcal/mol in the absence of the TS analogue. The same calculations reproduced the observed change in the pKa of His-57 (≈12 and 6.5 with and without the TS analogue) and other observables such as the pKa of Asp-102 in the absence of the

66 F. S. Lee, Z. T. Chu, and A. Warshel, J. Comp. Chem. 14, 161 (1993).


TS analogue (pKa = 3 as compared to the observed pKa of 2). Furthermore, direct EVB calculations gave free energies of 8 and 6 kcal/mol, respectively, for the PT from His-57 to Asp-102 with and without the TS analogue. Thus, the pKa shift is quantitatively consistent with a model in which there is no LBHB in the free enzyme and even less LBHB in the enzyme with the TS analogue. In addition, it is clear from the enormous contribution of the calculated solvation energy (very similar to those in Warshel and Russell62) that the environment here is very polar, and does not resemble the nonpolar environment envisioned by the LBHB proponents. Note, in this respect, that the arguments of Frey and co-workers64 do not involve any valid energy consideration. Furthermore, it is simply impossible to perform the analysis of Fig. 8 by any current experimental approach. On the other hand, the analysis of Fig. 8 is consistent with the available experimental information and involves reliable and well-tested electrostatic calculations. We consider our analysis to be valid due to its ability to reproduce the observed pKas using a realistic model of the protein–inhibitor system. Detailed examination of the LBHB proposal for the HBs in the oxyanion hole of subtilisin was already presented by Warshel and Papazyan.59 Here again, it was found that the catalysis is due to the preoriented [t⁻ ··· H–N] system and not to the [t^(−1/2) ··· H ··· N^(−1/2)] of LBHB (t designates the oxyanion TS). Recent EVB analysis67 examined the LBHB proposal in ketosteroid isomerase. It was found that the HB between Tyr-16 and the substrate enolate has a double minimum and that the charge transfer contribution is similar in the protein and in water.

The Energetics of ATP Synthase and Related Systems

The nature of energy transduction in biology has been one of the central issues of molecular biology and bioenergetics. The relationship between protonmotive force1 and the action of molecular machines4,68 is a prominent example of the apparent complexity of bioenergetics. In some systems it is more or less clear that the useful energy is transformed into a form of charge separation and charge formation. In others, and in particular in molecular machines, it is customary to talk about the conversion of energy to conformational changes. The elucidation of the structure of ATP synthase68 confirmed an early hypothesis4 and provides a detailed molecular picture of a remarkable molecular machine structure. Yet, the understanding of the relevant energetics is incomplete. Here we will demonstrate that the relevant energetics is best described as electrostatic energy.

67 I. Feierberg and J. Åqvist, Theor. Chem. Acc. 108, 71 (2002).
68 J. P. Abrahams et al., Nature 370, 621 (1994).


Fig. 9. A generic model of the accepted mechanism for the activation of ATP synthase.

Defining the Problem

The accepted mechanism of the activation of ATP synthase4,68,69 is described by Fig. 9. Although some modifications of this scheme are available, we can use Fig. 9 as a starting canonical model. In this figure we consider the conventional notation of the three subunits (each composed of α and β chains), i.e., the structures E, T, and DP, which correspond to the structures of a relaxed empty subunit, a relaxed adenosine triphosphate (ATP) bound structure, and a relaxed adenosine diphosphate (ADP) + Pi bound structure, respectively. We also use the shorthand notation t, d, and e for ATP + water, ADP + Pi, and no ligand, respectively. The steps described in Fig. 9 start with a 120° rotation of the γ-subunit, which forces the subunits to change their structure and to change the binding energies of the ligands, leading to binding of ADP and release of ATP (step A → B). Next, the "strained" ADP, which is formed as a result of step A → B, is converted to ATP (step B → C). Significant progress has been achieved in recent years in providing a mechanical picture of the action of ATP synthase.7,70,71 However, the current description is based on macroscopic concepts whose exact relationship to the actual molecular details is not fully understood. The instructive simulation72 showed that the system could be driven from one state to another by exerting a large force. However, a direct simulation of a process that occurs at a millisecond time scale (the time of the first step is 0.2 ms) by a nanosecond simulation might miss key relaxation processes. It is also hard to obtain reliable conclusions about the energetics of the system by such brute force molecular dynamics (MD) simulations.

69 A. Fersht, "Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding." W. H. Freeman and Company, New York, 1999.
70 T. Elston, H. Y. Wang, and G. Oster, Nature 391, 510 (1998).
71 D. A. Cherepanov, A. Y. Mulkidjanian, and W. Junge, FEBS Lett. 449(1), 1 (1999).
72 R. A. Böckmann and H. Grubmüller, Nat. Struct. Biol. 9, 198 (2002).


Fig. 10. A schematic description for the energetics of the mechanism for the activation of ATP synthase described with the three zero-order (diabatic) states in a curve-crossing diagram. The energies of the different states are displayed along the rotation of the γ-stalk. A hypothetical adiabatic surface (which describes the fully relaxed system as a function of the rotation angle φ) is also given. Note that ΔG2 is actually close to .

Here we would like to convert the structural information about F1-ATPase to a clear structure–energy correlation. As a starting point we convert Fig. 9 to a schematic energy diagram (Fig. 10). In this diagram we consider the three states of Fig. 9 (A, B, and C) as a function of a rotation of the γ-stalk. In the zero-order picture the structure and ligation state of the subunits are kept fixed (or allowed a limited relaxation) while the γ-subunit performs a 120° rotation. Thus, the free energy barrier obtained from the energy diagram provides an upper limit for the actual barrier of γ-rotation. Consideration of Fig. 10 can be useful as a conceptual framework. For example, ΔG_A→B(φ0) is the free energy of binding ADP + Pi by E1 and releasing ATP from T2 at φ = φ0. Some of the points on Fig. 10 can be elucidated by using experimental information. However, the construction of the actual free energy curves cannot be accomplished without some form of free energy calculations. Here the selection of the most effective formulation and computational strategy is extremely important. That is, brute force approaches, such as calculations of potential of mean force (PMF), are not likely to give reliable results due to convergence problems.73 Macroscopic calculations of the direct interaction between the γ-stalk and its surrounding subunits might also be problematic since the reorganization of the system should have a large effect, which is hard to model by a uniform dielectric constant.74 Thus, we will try to place special emphasis on the selection of the proper simulation approach and will also try to establish the error range on the calculated results.

73 A. Burykin et al., Proteins Struct. Funct. Genet. 47, 265 (2002).
74 C. N. Schutz and A. Warshel, Proteins Struct. Funct. Genet. 44, 400 (2001).


Simplifying the Problem

Analyzing the energetics of Fig. 10 by reliable calculations might look hopelessly complicated. Not only do we have to deal with a complex system, but we also have to consider the interaction between different subunits in different conformational states. In addition, we must consider the interactions between the subunits and their ligands. Here the task of obtaining stable and meaningful energy can be overwhelming. Fortunately, it is possible to simplify the problem enormously by using the LRA approach. That is, within LRA it is enough to look at the change in ligand–protein interaction upon a conformational change and to obtain a reasonable approximation for the corresponding change in the interaction between the subunits. This is based on the fact that the change in the interaction between the subunits and within the subunits is about half of the change in the interaction between the subunits and the ligands [note that Eq. (2) does not include the interaction between the environment and itself]. The LRA approach can be used for transformations between different states in the system and for the binding process. To illustrate our LRA approach we will start with the "simplest" case of moving from state B to state C in Fig. 9. Here we will not try to follow the reaction along the rotation coordinate of Fig. 10 but along the conformational change coordinate of Fig. 11A. In other words, we will try to use the LRA to evaluate ΔG(B → C) at φ = φ1 (ΔG2 in Fig. 10), by considering the free energy surfaces for the ligand's configurational states (L = I, II, and III). This is done by using the Marcus-type parabola of Fig. 11A and obtaining

Fig. 11. A schematic energy description of the different diabatic states, shown in Fig. 9, along the conformational changes rather than the rotation of the γ-stalk in Fig. 10. (A) The second step (B → C); (B) the first step (A → B).


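$$
\begin{aligned}
\Delta G_{II \to III} &= \tfrac{1}{2}\left[\langle V_{III} - V_{II}\rangle_{III} + \langle V_{III} - V_{II}\rangle_{II}\right] \\
\lambda_{II \to III} &= \tfrac{1}{2}\left[\langle V_{III} - V_{II}\rangle_{II} - \langle V_{III} - V_{II}\rangle_{III}\right]
\end{aligned}
\qquad (5)
$$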

where the Vs are the potential surfaces of the indicated states and ⟨ ⟩_L designates an average over the potential surface V_L (L = I, II, III). It is also understood that the average is performed near the equilibrium configuration of each designated subunit so that ⟨(d, e, t) − (d, e, d)⟩_II is done with the potential surface V(d1, e2, d3) where the β-subunits are constrained by weak constraints to be near their observed structure at (DP1, E2, T3). Also, note that at least formally the states I, II, and III are defined only by the ligands A, B, and C. However, the minima of states I, II, and III correspond to the minima of A, B, and C. Now, we can further simplify the nature of Eq. (5) using the explicit nature of the subsystems. For example, we can express

$$
\begin{aligned}
\Delta G_{II \to III} = \tfrac{1}{2}\Big[ &\big\langle [\beta_1(d)\beta_2(e)\beta_3(t)] - [\beta_1(d)\beta_2(e)\beta_3(d)] \big\rangle_{DP_1(d)E_2(e)T_3(t)} \\
+ &\big\langle [\beta_1(d)\beta_2(e)\beta_3(t)] - [\beta_1(d)\beta_2(e)\beta_3(d)] \big\rangle_{DP_1(d)E_2(e)T_3(d)} \Big]
\end{aligned}
\qquad (6)
$$

where β_i(x) designates that the ligand x is bound to the ith subunit. Thus, for example, ⟨[β1(d)β2(e)β3(t)] − [β1(d)β2(e)β3(d)]⟩ with the subscript DP1(d)E2(e)T3(t) designates an average over the interaction between d, e, and t and their surroundings when β1, β2, and β3 are held (by the fixed γ) near the DP, E, and T configurations, respectively. Now we may simplify our expression by writing

$$
\langle \beta_1(d)\beta_2(e)\beta_3(t) \rangle_{DP_1(d)E_2(e)T_3(t)} \cong \langle \beta'_1(d) \rangle_{DP'_1(d)} + \langle \beta'_2(e) \rangle_{E'_2(e)} + \langle \beta'_3(t) \rangle_{T'_3(t)}
\qquad (7)
$$

where β'_i(x) designates the interaction between the ligand x (at the site β_i) and the entire system. The index III indicates that the subunits are near the configurational state III. Note that we exploit the fact that the LRA approach gives additive contributions. With the help of Eq. (7) we obtain

$$
\begin{aligned}
\Delta G_{II \to III} &\simeq \tfrac{1}{2}\left[ \langle T'_3(t) - T'_3(d) \rangle_{T'_3(t)_{III}} + \langle T'_3(t) - T'_3(d) \rangle_{T'_3(d)_{II}} \right] \\
\lambda_{II \to III} &\simeq \tfrac{1}{2}\left[ \langle T'_3(t) - T'_3(d) \rangle_{T'_3(d)_{II}} - \langle T'_3(t) - T'_3(d) \rangle_{T'_3(t)_{III}} \right]
\end{aligned}
\qquad (8)
$$
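The LRA expressions above reduce to two averages of the same energy gap, one collected on the initial-state surface and one on the final-state surface. The following minimal sketch shows that bookkeeping; the function names and the sample values are hypothetical, and the gap samples would in practice come from simulations on the two surfaces rather than from a hand-written list.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def lra_estimates(gap_on_initial, gap_on_final):
    """Return (dG, lambda) from samples of the gap V_final - V_initial.

    gap_on_initial: gap values sampled on the initial-state surface.
    gap_on_final:   gap values sampled on the final-state surface.
    dG is half the sum of the two averages and lambda is half their
    difference, following the form of Eq. (5).
    """
    avg_initial = mean(gap_on_initial)
    avg_final = mean(gap_on_final)
    free_energy = 0.5 * (avg_initial + avg_final)
    reorganization = 0.5 * (avg_initial - avg_final)
    return free_energy, reorganization


if __name__ == "__main__":
    # Hypothetical gap samples in kcal/mol, for illustration only.
    dg, lam = lra_estimates([30.0, 34.0, 32.0], [-10.0, -12.0, -8.0])
    print(dg, lam)
```

For harmonic free energy surfaces of equal curvature the two averages equal ΔG + λ and ΔG − λ, which is why their half-sum and half-difference recover the free energy and the reorganization energy.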

Here the simplified notation ⟨ ⟩ with the subscript β_i(x)_L means that the average is over the configurations generated with ligand x where the system is kept near the specified structure of β_i. Equation (8) contains contributions from the chemical part (the ADP → ATP reaction). Following the EVB and our approach for pKa calculations75 we can use the reaction in water as a reference and write


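$$
\Delta G_{II \to III} = (\Delta\Delta G_{II \to III})^{w \to p} + \Delta G^{w}_{II \to III}
\qquad (9)
$$

where w and p designate water and protein, respectively. Thus, we have

$$
\Delta G_{II \to III} = (\Delta\Delta G_{II \to III})^{w \to p} + \Delta G(d \to t)^{w}
\qquad (10)
$$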

The first term can be evaluated conveniently by the semimacroscopic PDLD/S-LRA approach and the second term is 7.3 kcal/mol in standard conditions, where the concentration effect is easily treated.76 The reorganization energy term can also be divided into the solute contribution (inner shell contribution) and the contribution of the rest of the system (outer shell). Here we can also write

$$
\lambda_{II \to III} = (\Delta\lambda_{II \to III})^{w \to p} + \lambda^{w}_{II \to III}
\qquad (11)
$$

An estimate for the activation barrier can be obtained now from the modified Marcus theory6

$$
\Delta g^{\ddagger}_{II \to III} = \frac{(\lambda_{II \to III} + \Delta G_{II \to III})^{2}}{4\lambda_{II \to III}} - H_{II \to III}
\qquad (12)
$$

where Hij is the mixing between the corresponding states. Because in our case λ ≫ ΔG we have

$$
\Delta g^{\ddagger}_{II \to III} \approx \tfrac{1}{4}\lambda_{II \to III} - H_{II \to III} = \tfrac{1}{4}(\Delta\lambda_{II \to III})^{w \to p} + \left( \tfrac{1}{4}\lambda^{w}_{II \to III} - H_{II \to III} \right)
\qquad (13)
$$

Here again the expression for Δλ can be obtained from the PDLD/S-LRA approach and the second term is obtained conveniently from the EVB calculation. Trying to use the direct LRA configuration exchange approach for step A → B is more complicated since this step involves binding processes that complicate the use of the configurational exchange approach. One can still try to use such an approach and the diagram of Fig. 11 and obtain the following approximated expression

$$
\begin{aligned}
\Delta G_{I \to II} \cong{} & \tfrac{1}{2}\Big[ \langle E'_1(d) - E'_1(e) \rangle_{E'_1(e)_{I}} + \langle T'_2(e) - T'_2(t) \rangle_{T'_2(t)_{I}} \\
& + \langle DP'_1(d) - DP'_1(e) \rangle_{DP'_1(d)_{II}} + \langle E'_2(e) - E'_2(t) \rangle_{E'_2(e)_{II}} \Big] + \Delta G(d \to t)^{w} \\
={} & \Delta G'_{I \to II} + \Delta G(d \to t)^{w}
\end{aligned}
\qquad (14)
$$

where e designates the nonpolar and uncharged form of the ligand x in the expression ⟨(x) − (e)⟩. Equation (14) can also be expressed as

75 A. Warshel, Biochemistry 20, 3167 (1981).
76 M. Štrajbl, J. Florian, and A. Warshel, J. Phys. Chem. B 105, 4471 (2001).

$$
\Delta G_{I \to II} = (\Delta\Delta G'_{I \to II})^{w \to p} + \Delta G(t \to d)^{w} + \Delta G(d \to t)^{w} = (\Delta\Delta G'_{I \to II})^{w \to p}
\qquad (15)
$$

Equation (15) is, however, a rather rough approximation, since step A → B really involves at least two steps that can be modeled as binding processes.77 By evaluating the electrostatic contribution to the binding processes with the LRA approximation15 we can obtain

$$
\Delta G_{I \to II} \cong \Delta G(d)^{bind}_{T'_3} - \Delta G(t)^{bind}_{T'_2}
\qquad (16)
$$

Here we used the cycle introduced in our binding studies15,78,79 where we consider an electrostatic cycle in which the ligand moves from its fully nonpolar and uncharged form to its actual charged configuration in both the solution and the protein environments. The actual electrostatic calculations involved the use of the PDLD/S-LRA approach. This analysis involved a careful evaluation of the binding entropies (by a restraint release approach) and the free energy associated with assembling the ligand components in water.77 The same type of binding cycle allows us to evaluate ΔG_II→III. Finally, we can even estimate the activation barrier for the rotation of the γ-stalk by using the LRA approach. Here, we have to use a nonstandard trick since the corresponding step φ1 → φ2 does not involve any change of the ligand charges. However, we can generate a hypothetical process whose reorganization energy corresponds to the free energy of rotation. Doing so77 we obtain

$$
\lambda_{\varphi_1 \to \varphi_2} = \tfrac{1}{2}\left[ \langle DP'_3(t) - DP'_3(d) \rangle_{DP'_3} - \langle T'_3(t) - T'_3(d) \rangle_{T'_3(t)} \right]
\qquad (17)
$$

Converting Conformational Energy to Electrostatic Energy

The above formulation and the EVB approach allowed us to explore the energetics of the F1-ATPase.77 The first and the conceptually simplest study from our perspective involved the EVB study of the catalytic reaction in step B → C of Fig. 9. These EVB calculations77 reproduced the large catalytic effect of the enzyme as well as the correct trend in the reaction free energy, ΔG(d → t) ≅ 0 kcal/mol in the T site of the protein. Similar results were obtained by the PDLD/S-LRA binding calculations. The use of the LRA approach allows us to demonstrate that the energy needed to convert ADP to ATP is supplied by the change of the protein–substrate electrostatic energy upon change of the protein structure from DP to T.

77 M. Štrajbl, A. Shurki, and A. Warshel, Proc. Natl. Acad. Sci. USA (in press 2003).
78 Y. Y. Sham, I. Muegge, and A. Warshel, Proteins 36, 484 (1999).
79 Y. Y. Sham et al., Proteins Struct. Funct. Genet. 39, 393 (2000).


The use of the PDLD/S-LRA binding calculations and Eq. (16) also allows us to estimate ΔG_I→II. This free energy was also found to reflect electrostatic effects. Although more quantitative studies are obviously needed, it is clear that all or most of the conformational effects are converted to changes in electrostatic energies. It is interesting to note in this respect that a significant part of the difference between the energies of the ATP + H2O and ADP + Pi states is associated with the electrostatic repulsion between the ADP and Pi fragments. Thus a significant part of the I → II process can be viewed as a step where the electrostatic energy of assembling the ADP + Pi system is already paid by the binding process.77

Related Systems

Phosphate hydrolysis is used in controlling many other biological processes in addition to the above-mentioned energy transduction processes.80 One of the most important cases is the control of biological signal transduction by G-proteins.81–83 Our extensive studies of the RasGAP switch84–89a have established the mechanism of the guanosine triphosphate (GTP) hydrolysis reaction, showing that the GTP substrate itself is the general base. Furthermore, we were able to show how GAP catalyzes the Ras reaction.89b We were also able to quantify the action of Arg-789 (the arginine finger) and Gln-61. While the arginine finger acts by direct electrostatic TS stabilization, the effect of Gln-61 appeared to involve indirect electrostatic effects by changing the local environment of the GTP. Most significantly, from the perspective of the present review, we were able to show that control of GTP hydrolysis is mainly associated with electrostatic effects.

Proton Translocation and Electron Transport

The transfer of protons and electrons and the coupling between these processes play a central role in bioenergetics.1,3 Recent progress in the elucidation of the structures of proton transport90,91 and electron transfer

80 F. H. Westheimer, Science 235, 1173 (1987).
81 S. R. Sprang, Curr. Opin. Struct. Biol. 7(6), 849 (1997).
82 I. R. Vetter and A. Wittinghofer, Q. Rev. Biophys. 32(1), 1 (1999).
83 L. Wiesmuller and A. Wittinghofer, Cell. Signal. 6, 247 (1995).
84 R. Langen, T. Schweins, and A. Warshel, Biochemistry 31, 8691 (1992).
85 T. Schweins, R. Langen, and A. Warshel, Nat. Struct. Biol. 1(7), 476 (1994).
86 I. Muegge et al., Structure 4, 475 (1996).
87 T. Schweins and A. Warshel, Biochemistry 35, 14232 (1996).
88 T. Schweins et al., Biochemistry 35, 14225 (1996).
89a T. M. Glennon, J. Villa, and A. Warshel, Biochemistry 39, 9641 (2000).
89b A. Shurki and A. Warshel, Proteins Struct. Funct. Genet. (in press 2004).
90 U. Ermler et al., Structure 2, 925 (1994).
91 M. Y. Okamura and G. Feher, Annu. Rev. Biochem. 61, 861 (1992).


proteins (see refs. in Warshel and Parson24) has led to major advances in the understanding of such systems. Yet, converting the available structural information to clear energy-based concepts is far from obvious. Here again, we contend that the best way to correlate the structure of the relevant proteins to the corresponding functional properties, i.e., the best structure–energy–function correlation, is obtained by evaluating the relevant electrostatic energies. We will demonstrate this point by considering both proton transport and electron transfer.

Electrostatic Control of Proton Translocation Processes

To quantify the energetics of proton translocation in proteins, we have to consider the feasible proton conduction pathways in the given protein (e.g., the pathway presented in Fig. 12 for the bacterial reaction center of R. sphaeroides) and to evaluate the corresponding energetics and kinetics. This can be done by the approach formulated in Warshel92 and implemented in Sham et al.78 This approach considers the energetics of any relevant charge configuration, ΔG^(m), relative to the energy of a state in which all the residues are uncharged. This free energy is related directly to the pKa of each proton donor/acceptor at their sites in the protein and is given by92–94

$$
\Delta G^{(m)} = \sum_i \left\{ -2.3RT\, q_i^{(m)} \left[ pK^{p}_{int,i} - pH \right] + \tfrac{1}{2} \sum_{j \neq i} W_{ij}\, q_i^{(m)} q_j^{(m)} \right\}
\qquad (18)
$$

where m designates the vector of the charge states of the given configuration, i.e., m = (q_1^(m), q_2^(m), ..., q_n^(m)). Here q_i^(m) is the actual charge of the ith group at the mth configuration. This can be 0 or −1 for acids and 0 or +1 for bases (where we restrict our formulation to singly charged ions, although the extension to |q| > 1 is trivial). The W_ij q_i q_j term represents the charge–charge interaction term. The intrinsic pKa (pK_int) is the pKa that the given ionizable group would have if all other ionizable groups were kept at their neutral state (the evaluation of this term is described in Sham et al.94). Equation (18) can also be expressed in terms of the energy of forming the given configuration in a reference state (in this case in aqueous solution) at infinite separation of the ions and then transforming it into the protein. This gives75,95

92 A. Warshel, Photochem. Photobiol. 30, 285 (1979).
93 A. Warshel, Methods Enzymol. 127, 578 (1986).
94 Y. Y. Sham, Z. T. Chu, and A. Warshel, J. Phys. Chem. B 101, 4458 (1997).
95 S. H. Chung et al., Biophys. J. 77, 2517 (1999).

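$$
\begin{aligned}
\Delta G^{(m)} &= \sum_i \left\{ -2.3RT\, q_i^{(m)} \left[ pK^{p}_{int,i} - pK^{w}_{a,i} \right] + \tfrac{1}{2} \sum_{j \neq i} W_{ij}\, q_i^{(m)} q_j^{(m)} \right\} + \left( \Delta G^{(m)} \right)^{w} \\
&= \sum_i \left\{ -2.3RT\, q_i^{(m)} \left[ pK^{w}_{a,i} - pH \right] + |q_i^{(m)}| \left[ \Delta\Delta G^{w \to p}_{sol}\big(q_i^{(m)}\big) \right]_0 + \tfrac{1}{2} \sum_{j \neq i} W_{ij}\, q_i^{(m)} q_j^{(m)} \right\}
\end{aligned}
\qquad (19)
$$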

where [ΔΔG_sol^(w→p)(q_i)]_0 represents formally the energy of moving q_i from water to its actual protein site when all other ionizable groups are neutral. In the actual calculation we also include in ΔΔG_sol the solvation of the uncharged form of the given group.94 (ΔG^(m))^w is the free energy of Eq. (18) when all the residues are in water at infinite separation. Now we have to convert the ΔG^(m)s to the corresponding energy of protonating the different sites. This can be obtained by92

$$
\Delta G^{(m)}_{H^+} = \sum_i \Delta G^{(m,i)}_{H^+} = \sum_i W_i^{(m)}\, q_i^{m}
\qquad (20)
$$

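where ΔG_H+ is the free energy of the given proton configuration and W_i^(m) q_i^m is the contribution to Eq. (19) from its ith term. In more explicit form we can write

$$
\begin{aligned}
\Delta G^{(m,i)}_{H^+}(A_i^- \to A_iH) &= -2.3RT \left[ pK^{w}_{a}(A_iH) - pH \right] - \left[ \Delta\Delta G^{w \to p}_{sol}\big(q_i^{(m)}\big) \right]_0 - \sum_{j \neq i} W_{ij}\, q_i^{(m)} \langle q_j^{(m)} \rangle \\
\Delta G^{(m,i)}_{H^+}(B_i \to B_iH^+) &= -2.3RT \left[ pK^{w}_{a}(B_iH^+) - pH \right] + \left[ \Delta\Delta G^{w \to p}_{sol}\big(q_i^{(m)}\big) \right]_0 + \sum_{j \neq i} W_{ij}\, q_i^{(m)} \langle q_j^{(m)} \rangle
\end{aligned}
\qquad (21)
$$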

where instead of using the expression for the specific configuration of the q_j values we consider now the average charge, ⟨q_j⟩, at the given pH (see discussion below). Applying this approach to H3O+ gives

Fig. 12. Schematic description of the groups that can be involved in a proton conductance to QB. The numbers near the dotted lines give the corresponding distances in Å. The sequential numbers of the residues are given according to the Protein Data Bank (PDB) notation of the structure (1PCR) of Ermler et al.90

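$$
\Delta G^{(m,i)}_{H^+}\big( (H_2O)_i \to (H_3O^+)_i \big) = -2.3RT \left[ pK^{w}_{a}(H_3O^+) - pH \right] + \left[ \Delta\Delta G^{w \to p}_{sol}\big(q_i^{(m)}(H_3O^+)\big) \right]_0 + q_i(H_3O^+) \sum_{j \neq i} \langle q_j \rangle W_{ij}
\qquad (22)
$$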
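The bookkeeping behind Eq. (18) and the average charges ⟨q_j⟩ can be sketched by enumerating all charge configurations of a small set of groups. The group types, intrinsic pKa values, interaction matrix, and function names below are illustrative assumptions; real applications take pK_int and W_ij from electrostatic calculations and replace full enumeration by sampling when the number of groups is large.

```python
import itertools
import math

RT_LN10 = 1.364  # 2.3RT in kcal/mol near 298 K


def config_energy(q, pk_int, w, ph):
    """Free energy (kcal/mol) of charge configuration q, in the form of Eq. (18).

    q:      charges of the groups (0 or -1 for acids, 0 or +1 for bases)
    pk_int: intrinsic pKa of each group in the protein
    w:      symmetric matrix of charge-charge interactions for unit charges
    """
    total = 0.0
    for i, qi in enumerate(q):
        total += -RT_LN10 * qi * (pk_int[i] - ph)
        total += 0.5 * sum(
            w[i][j] * qi * qj for j, qj in enumerate(q) if j != i
        )
    return total


def average_charges(kinds, pk_int, w, ph):
    """Boltzmann-averaged charge of each group over all configurations."""
    rt = RT_LN10 / math.log(10.0)
    allowed = [(0, -1) if kind == "acid" else (0, 1) for kind in kinds]
    partition = 0.0
    sums = [0.0] * len(kinds)
    for q in itertools.product(*allowed):
        weight = math.exp(-config_energy(q, pk_int, w, ph) / rt)
        partition += weight
        for i, qi in enumerate(q):
            sums[i] += weight * qi
    return [s / partition for s in sums]


if __name__ == "__main__":
    # Hypothetical acid/base pair with a 2 kcal/mol unit-charge interaction.
    w = [[0.0, 2.0], [2.0, 0.0]]
    print(average_charges(["acid", "base"], [4.0, 7.0], w, ph=5.0))
```

For a single isolated acid the enumeration reduces to the Henderson–Hasselbalch result, which gives a quick check that the sign conventions are consistent.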

The most significant point from the perspective of the present work is that we have a general expression for the total electrostatic energy of the system including the external pH. This point, which was recognized and formulated in 1979,92 is basically a general formulation of the role of protons in bioenergetics. That is, the pKas and the corresponding electrostatic energies reflect the total energy of the system, including conformational rearrangements in response to change in charge states as well as the effect of the external proton concentration. The advances in calculations of electrostatic energies in general74 and pKas in particular94 allow us to use the LRA approach in its semimacroscopic version66,94 and to calculate the energy of all the protonation configurations involved in the proton translocation process. A typical calculation for the proton translocation in the RC of R. sphaeroides, which considered the pathway of Fig. 12, gave the free energy diagram of Fig. 13. The activation barriers Δg‡_i→j for a proton translocation between states i and j can be estimated by a simplified modified Marcus relationship6,78 and the ΔG^(m)s of Eq. (19). Thus, we can obtain all the relevant rate constants (using the Δg‡_i→j and transition state theory). With the rate constants it is possible to describe the overall time dependence of the proton transport or proton pumping process by a master equation78 or more reliably by Brownian dynamics (as was done for ion transport73,95). Such studies can provide a direct connection from structural information to energetics and kinetics, thus allowing quantification of the chemiosmotic theory.

Electrostatic Control of Light-Induced Electron Transfer

Photosynthesis is the most efficient known process for conversion and storage of light energy. Photosynthetic systems operate by light-induced charge separation across a membrane where the electrostatic energy of the charge-separated state is stored in the form of a pH gradient accompanied by conversion of ADP to ATP.96 Now, the previous section considered the relationship between pH gradient and electrostatic energy as well as

96 P. L. Dutton and R. C. Prince, in "The Photosynthetic Bacteria" (R. K. Clayton and W. R. Sistrom, eds.), p. 525. Plenum, New York, 1987.


Fig. 13. A free energy profile for a proton conductance that involves both water molecules and protein residues. This profile includes the effect of the interaction between ionized residues. An alternative profile that keeps some sites is also indicated.
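A free energy profile of this kind can be turned into kinetics in three steps: a Marcus-type barrier for each step, a transition state theory rate constant for each barrier, and a master equation for the populations. The sketch below assumes a linear chain of states, a single reorganization energy for every step, a constant mixing term, and simple explicit Euler integration; the function names and all numbers are hypothetical and only illustrate the procedure.

```python
import math

RT = 0.593            # kcal/mol near 298 K
KBT_OVER_H = 6.2e12   # s^-1 near 298 K


def marcus_barrier(dg, lam, h12=0.0):
    """Simplified modified Marcus barrier: (dg + lam)^2 / (4 lam) - h12."""
    return (dg + lam) ** 2 / (4.0 * lam) - h12


def tst_rate(barrier):
    """Transition state theory rate constant (s^-1) for a given barrier."""
    return KBT_OVER_H * math.exp(-barrier / RT)


def chain_rates(g, lam, h12=0.0):
    """Forward and backward rate constants between neighbouring states."""
    steps = range(len(g) - 1)
    fwd = [tst_rate(marcus_barrier(g[i + 1] - g[i], lam, h12)) for i in steps]
    bwd = [tst_rate(marcus_barrier(g[i] - g[i + 1], lam, h12)) for i in steps]
    return fwd, bwd


def propagate(p, fwd, bwd, dt, steps):
    """Explicit Euler integration of the master equation on a linear chain."""
    p = list(p)
    for _ in range(steps):
        # Net flux from state i to state i + 1 at the current populations.
        flux = [fwd[i] * p[i] - bwd[i] * p[i + 1] for i in range(len(fwd))]
        for i, f in enumerate(flux):
            p[i] -= f * dt
            p[i + 1] += f * dt
    return p


if __name__ == "__main__":
    g = [0.0, 2.0, -1.0]  # hypothetical state free energies, kcal/mol
    fwd, bwd = chain_rates(g, lam=20.0)
    dt = 0.05 / max(fwd + bwd)
    print(propagate([1.0, 0.0, 0.0], fwd, bwd, dt, 50000))
```

Because the forward and backward barriers differ exactly by the reaction free energy, the rate constants obey detailed balance and the populations relax to the Boltzmann distribution of the state free energies.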

the electrostatic basis of ATP synthesis. In this section we will consider briefly the electrostatic basis of the electron transport process. General considerations of the energetics and efficiency of light-induced electron transport across membranes were introduced in our early work.97 This work considered formally a chain of donors and acceptors that spans the width of the membrane (Fig. 14). It was shown that the entire energetics of the system is determined by electrostatic energy associated with the formation of charges in different sites along the donor and acceptor chain. It was also shown that the only way to control an efficient charge separation process is to control the energetics of the donors and acceptors as well as the reorganization energy in sequential charge transfer steps.97 A proper control can best be obtained by placing the donors and acceptors in the protein sites that will establish the proper electrostatic energy. In particular, it was suggested that fast ET steps should satisfy the relationship shown below (see also Fig. 15): Gij ffi ij

97

A. Warshel and D. W. Schlosser, Proc. Natl. Acad. Sci. USA 78, 5564 (1981).

(23)
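Equation (23) can be read against the standard Marcus expression for the activation barrier. This is offered only as an illustrative assumption: it is the textbook (unmodified) form, not the modified relationship cited in the text, and $\lambda_{ij}$ is taken here to be the reorganization energy of the $i \to j$ step.

```latex
\Delta g^{\ddagger}_{i \to j} \approx \frac{\left(\Delta G_{ij} + \lambda_{ij}\right)^{2}}{4\lambda_{ij}},
\qquad
\Delta G_{ij} = -\lambda_{ij}
\;\Longrightarrow\;
\Delta g^{\ddagger}_{i \to j} \approx 0 .
```

Under this assumption, a step that satisfies Eq. (23) is essentially activationless, which is one way to see why such steps can be fast.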


Fig. 14. A schematic model of a conduction chain for light-induced charge separation across a membrane. The chain is composed of a donor (D1) and acceptors (Ai) that span the width of the membrane.

Fig. 15. Energetics and dynamics of conduction chains in several limiting cases. (A) A conduction chain of identical acceptors in a low dielectric membrane. This system cannot give an efficient charge separation because of the large energy of transferring the charge through the membrane. (B) A conduction chain of identical acceptors in aqueous solution. This is an inefficient system because of the high activation barriers in the individual $i \to i+1$ steps. (C) A conduction chain that combines dielectric stabilization, redox gradient, and optimal relaxation $[\lambda(e) = -\Delta G_{i \to i+1}]$. This provides an optimal downhill charge-separation process.

Our 1981 works,97,98 however, could not provide quantitative proof of the reliability of their concepts in the absence of relevant structural information.

98 A. Warshel, Isr. J. Chem. 21, 341 (1981).


Fig. 16. Structure of the photosynthetic RC of R. sphaeroides.90 The protein is shown in green-blue; the “special pair” of BChls (P) in red; the “accessory” BChls (BL and BM) in cyan; the BPhs (HL and HM) in green; quinones (QA and QB) in yellow; the carotenoid (car) in orange; and the nonheme Fe in orange-red.

The elucidation of the structure of bacterial reaction centers (RCs)24,99 offered the opportunity to explore quantitatively the principle of light-induced charge separation in photosynthetic systems. Of particular interest was the relationship between the protein structure and the control of the charge separation process. It was not clear whether the process of moving from the primary donor (P in Fig. 16) to the bacteriopheophytin (H in Fig. 16) obeys the electrostatic principles of Eq. (23) and moves through the accessory bacteriochlorophyll (B in Fig. 16), or whether it involves the so-called superexchange mechanism (in which B participates only as a virtual state). Our electrostatic calculations100 were able to determine for the first time that the system involves a direct hopping mechanism with downhill energetics and very small reorganization energy. This was found at a time when most experimental studies supported the superexchange proposal. Subsequently, experimental studies have confirmed our prediction.24 More importantly, from the perspective of the present work, it was established that electrostatic calculations provide the ultimate tool for converting structure to energetics and to functions in charge separation processes.5,16,101 This was also found to be true with regard to the directionality of the ET in the two alternative branches in RCs101,102 and with regard to general questions about redox potentials and proteins.103,104

99 J. Deisenhofer et al., Nature 318, 618 (1985).

Concluding Remarks

This work considered the structure–correlation aspects of bioenergetics, focusing on key processes that are involved in energy transduction and utilization. It was demonstrated that all of these processes can be described quantitatively by considering the corresponding electrostatic energy. Thus, we believe that bioenergetics can be formulated by considering all processes as charge formation, charge distribution, and charge transport. Doing so relative to the corresponding energetics in water provides what is probably the best way of quantifying bioenergetics.

Acknowledgments

This work was supported by NIH Grants GM 24492 and GM 40283 and NSF Grant MCB-0003872.

100 S. Creighton et al., Biochemistry 27, 774 (1988).
101 A. Warshel and J. Åqvist, Chem. Scripta 29A, 75 (1989).
102 W. W. Parson et al., in “Reaction Centers of Photosynthetic Bacteria” (M.-E. Michel-Beyerle, ed.), p. 239. Springer-Verlag, Berlin, 1991.
103 P. J. Stephens, D. R. Jollie, and A. Warshel, Chem. Rev. 96, 2491 (1996).
104 A. Warshel, A. Papazyan, and I. Muegge, J. Biol. Inorg. Chem. 2, 143 (1997).


[4] Local and Global Control Mechanisms in Allosteric Threonine Deaminase

By D. Travis Gallagher, Diana Chinchilla, Heidi Lau, and Edward Eisenstein

Introduction

Control of enzyme activity through cooperative interactions in multisubunit proteins is a fundamental mechanism for cellular regulation. Decades of biochemical and enzymological research have revealed a host of allosteric systems that exhibit a sigmoidal dependence of ligand binding on concentration.1–5 Additionally, many allosteric enzymes regulate complex metabolic pathways by binding effector ligands at distinct sites that are remote from active sites, in a process referred to as control by feedback inhibition.6,7 Feedback modifiers are often the end products of pathways. Feedback inhibitors, or negative allosteric effectors, typically decrease the activity of the first enzyme in a metabolic pathway. On the other hand, positive allosteric effectors increase the activity of allosteric enzymes at moderate substrate concentrations and are usually products of parallel pathways. The net effect of allosteric interactions is to balance metabolite pools, to increase the efficiency of flux through pathways, and to achieve cellular homeostasis. Although the control of metabolism by allosteric enzymes and their interactions is widely important and has been well characterized physiologically, there is still much to learn about the molecular basis of this control. The allosteric enzyme threonine deaminase in part controls the biosynthesis of branched-chain amino acids in plants and microorganisms (and it has therefore been referred to as biosynthetic threonine deaminase).8–10

1 H. K. Schachman, J. Biol. Chem. 263, 18583 (1988).
2 J.-P. Changeux, BioEssays 15, 625 (1993).
3 W. N. Lipscomb, Chemtracts-Biochem. Mol. Biol. 2, 1 (1991).
4 P. R. Evans, Curr. Opin. Struct. Biol. 1, 773 (1991).
5 A. Mattevi, M. Rizzi, and M. Bolognesi, Curr. Opin. Struct. Biol. 6, 824 (1996).
6 H. E. Umbarger, Science 123, 848 (1956).
7 J. C. Gerhart and A. B. Pardee, J. Biol. Chem. 237, 891 (1962).
8 H. E. Umbarger, Adv. Enzymol. 37, 349 (1973).
9 H. E. Umbarger, Annu. Rev. Biochem. 47, 533 (1978).
10 H. E. Umbarger, “The Biosynthesis of Isoleucine and Valine and Its Regulation.” Addison-Wesley, Reading, MA, 1983.

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


Threonine deaminase [threonine dehydratase; L-threonine hydro-lyase (deaminating); EC 4.2.1.16; TD] catalyzes the pyridoxal 5′-phosphate (PLP)-dependent conversion of threonine to 2-ketobutyrate in a sigmoidal manner.11 Additionally, enzyme activity is decreased by isoleucine, the end product of the pathway, whereas valine, the product of a parallel pathway, increases activity. The assay and purification of threonine deaminase from several sources have been described previously in these volumes.12–15 Our aim in this contribution is to describe thermodynamic and kinetic approaches that have led to new insights into the molecular basis for allosteric control of threonine deaminase. A key asset to the implementation of these approaches was elucidation of the three-dimensional structure of the enzyme.16 Detailed structural knowledge of the tetrameric enzyme was essential for guiding site-directed mutagenesis to construct enzyme variants with useful properties for a study of the allosteric transition. First we introduce the allosteric and structural properties of threonine deaminase in order to provide a simple description of the promiscuous ligand-binding properties of the enzyme that complicated previous quantitative analyses of valine activation and isoleucine inhibition. Then, we show how the oligomeric structure of the enzyme was altered by a single amino acid substitution at a subunit interface that disrupted the tetrameric quaternary structure, yielding active dimers. The dimeric variant of threonine deaminase we describe shows an increased tendency to assemble to the native tetrameric structure in the presence of active site ligands, and therefore has been useful in providing estimates for the free energies for changes in the subunit interactions that give rise to cooperative active site ligand binding.
Finally, we describe the use of chaotropic agents for the generation of hybrid tetramers containing two different dimeric protomers that have specific arrangements of native active sites or regulatory sites. These enzymelike molecules were used to determine whether effector binding to sites on one dimer affects the active sites in an apposing dimer. By extending these classic approaches to the study of threonine deaminase, we have gained significant insight into the nature of allosteric communication that gives rise to metabolic regulation. We believe that adaptation of these approaches to other complex regulatory proteins may also prove valuable in a study of protein interactions linked to biological regulation.

11 E. Eisenstein, J. Biol. Chem. 266, 5801 (1991).
12 R. O. Burns, Methods Enzymol. 17B (1971).
13 P. Datta, Methods Enzymol. 17B, 566 (1971).
14 G. W. Hatfield and H. E. Umbarger, Methods Enzymol. 17B, 561 (1971).
15 Y. Shizuta and M. Tokushige, Methods Enzymol. 17B, 575 (1971).
16 D. T. Gallagher, G. L. Gilliland, G. Xiao, J. Zondlo, K. E. Fisher, D. Chinchilla, and E. Eisenstein, Structure 6, 465 (1998).


An Expanded Two-State Model for Homotropic Cooperativity in Threonine Deaminase

Historically, investigators have attempted to describe the sigmoidal enzyme kinetics, as well as the effect of isoleucine inhibition and valine activation of threonine deaminase, in terms of the classic two-state model of Monod et al.17 In its simplest form, this model assumes that the enzyme exists in two global forms, known as the T state and the R state (or low- and high-activity/affinity forms), that are in equilibrium in the absence of any ligands. Sigmoidal binding is manifested when the equilibrium distribution of these global conformations or energy states of the enzyme is shifted in the presence of substrates or effectors. Hence, one can test the applicability of the two-state model by determining its well-known parameters: the allosteric equilibrium constant, L (the [T]/[R] ratio); KR, the dissociation constant for ligands to the R state of the enzyme; and c, the ratio KR/KT, where KT is the dissociation constant for ligands to the T state. However, early analyses of steady-state kinetic data with threonine18,19 and more recent analyses of the cooperative binding of inhibitors to the active site of the enzyme20,21 were problematic. One issue that has long been considered a complication to the development of a quantitative description of regulation is the structural similarity of threonine, valine, and isoleucine and their potential promiscuous binding to the enzyme.19,22,23 Because the binding isotherms for these ligands to the wild-type enzyme led to conflicting estimates for the allosteric parameters, it was unclear whether the simple two-state model provided a reasonable description of the allosteric transition of threonine deaminase. Can the sigmoidal enzyme kinetics, the activation by valine and the inhibition by isoleucine, be explained by a simple mechanism whereby an equilibrium between global conformations of the enzyme is shifted by the binding of ligands?
Or is another mechanism needed to explain the sigmoidal kinetics of threonine deaminase and its allosteric control by feedback modifiers?

17 J. Monod, J. Wyman, and J.-P. Changeux, J. Mol. Biol. 12, 88 (1965).
18 C. J. Decedue, J. G. Hofler, and R. O. Burns, J. Biol. Chem. 250, 1563 (1975).
19 J. G. Hofler and R. O. Burns, J. Biol. Chem. 253, 1245 (1978).
20 E. Eisenstein, H. D. Yu, and F. P. Schwarz, J. Biol. Chem. 269, 29423 (1994).
21 E. Eisenstein, J. Biol. Chem. 269, 29416 (1994).
22 J.-P. Changeux, J. Mol. Biol. 4, 220 (1962).
23 J.-P. Changeux, Cold Spring Harbor Symp. Quant. Biol. 28, 497 (1963).

An important milestone in the attempt to describe the allosteric mechanism of threonine deaminase was the development of an expanded two-state model in which the substrate threonine acts as both a homotropic and heterotropic effector for the allosteric transition by binding not only
to the active sites, but also to the four regulatory sites on the tetrameric enzyme.24 Experimental data in support of this model have come from studies of the kinetic and ligand-binding properties of two mutants that are altered in catalysis or regulation. An analysis of the biochemical properties of the feedback-resistant mutant TDL447F revealed that it was insensitive to isoleucine because it was stabilized in the R state. This variant exhibited hyperbolic kinetics using either threonine or serine as substrates, and bound the allosteric activator valine hyperbolically with higher average affinity than that seen for the wild-type enzyme. Moreover, the inhibitors 2-aminobutyrate and alanine bound hyperbolically to the active sites of TDL447F, consistent with the enzyme adopting an R conformation. Another useful variant was an inactive mutant enzyme in which a lysine residue that is essential for catalysis and that forms the Schiff base with PLP in the active site was replaced with alanine (K62A). Several separate preparations of this enzyme variant revealed that it purified with threonine bound covalently to the pyridoxal phosphate cofactor in the active site, thereby stabilizing it in the R state. Additional support for the interpretation that these two enzyme variants were stabilized in the R state conformation came from binding studies that revealed that the parameters determined for isoleucine and valine binding to the TDK62A-Thr complex were identical with those obtained with TDL447F. Importantly, because the active sites of TDK62A were already completely occupied with threonine as a covalent complex with the PLP cofactor, it was possible to assess whether threonine possessed measurable affinity for the regulatory sites on the enzyme. Interestingly, threonine binding to the regulatory sites of the TDK62A-Thr complex revealed a binding constant of 0.8–1.1 mM, at least five-fold stronger than that to the active sites of wild-type enzyme (5–8 mM). 
Thus, threonine shows stronger affinity for the regulatory sites than it does for the active sites of threonine deaminase. This was seen even more strikingly for 2-aminobutyrate, which binds to the regulatory sites of TDK62A with a binding constant of 110 µM, roughly 100-fold more tightly than to the active sites (8–12 mM). An expanded form of the two-state model was used to describe the cooperative binding of substrate analogs, which bind not only to the active sites but also to the effector sites and thereby synergistically promote the allosteric transition. This model accounts for the effects of regulatory site saturation in a manner analogous to that proposed by Rubin and Changeux for nonexclusive ligand binding to either the T or the R states in allosteric systems.25

24 E. Eisenstein, H. D. Yu, K. E. Fisher, D. A. Iacuzio, K. R. Ducote, and F. P. Schwarz, Biochemistry 34, 9403 (1995).
25 M. M. Rubin and J.-P. Changeux, J. Mol. Biol. 21, 265 (1966).
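The expanded two-state model can be explored numerically. The sketch below is a minimal illustration in Python, assuming the functional form given for this model (exclusive binding of active site ligands to the R state, c = 0, with the allosteric constant L modulated by occupancy of the four regulatory sites) and the threonine parameters listed in Table I. The function name, argument names, and the dictionary `THREONINE` are hypothetical and are not taken from the original analysis software.

```python
def fractional_saturation(x, L, K_R, K_Ract, c_act):
    """Expanded two-state saturation function for a tetramer (c = K_R/K_T = 0).

    x       ligand concentration (same units as K_R and K_Ract)
    L       allosteric equilibrium constant [T]/[R] in the absence of ligand
    K_R     active-site dissociation constant in the R conformation
    K_Ract  regulatory-site dissociation constant in the R conformation
    c_act   ratio K_Ract / K_Tact for the regulatory sites
    """
    alpha = x / K_R        # reduced concentration at the active sites
    gamma = x / K_Ract     # reduced concentration at the regulatory sites
    # Apparent allosteric constant after regulatory-site binding (four sites).
    L_app = L * ((1.0 + c_act * gamma) / (1.0 + gamma)) ** 4
    return alpha * (1.0 + alpha) ** 3 / ((1.0 + alpha) ** 4 + L_app)


# Threonine parameters as reported in Table I (concentrations in mM).
THREONINE = dict(L=1450.0, K_R=3.42, K_Ract=1.0, c_act=0.15)

if __name__ == "__main__":
    for x_mM in (0.5, 2.0, 5.0, 10.0, 50.0):
        y = fractional_saturation(x_mM, **THREONINE)
        print(f"[Thr] = {x_mM:5.1f} mM  fractional saturation = {y:.3f}")
```

Setting `c_act` to 1 removes the regulatory-site contribution and recovers the simple exclusive-binding two-state curve, which gives a quick way to see how much of the apparent cooperativity in this sketch comes from effector-site binding.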

[4]

89

control mechanisms in allosteric TD

This expanded form of the two-state model for cooperativity has the following functional form:

$$\text{Fractional saturation} = \frac{\alpha(1+\alpha)^{3}}{(1+\alpha)^{4} + L\left(\dfrac{1 + c_{act,X}[X]/K_{Ract,X}}{1 + [X]/K_{Ract,X}}\right)^{4}}$$

where L is the allosteric equilibrium constant ([T]/[R]) in the absence of ligands, cact,X is the ratio of dissociation constants, KRact,X/KTact,X, for the association of ligand X with the regulatory sites when the enzyme is in either the R or the T conformation, KRact,X is the dissociation constant for ligand X binding to the regulatory sites when the enzyme is in the R conformation, α = [X]/KR where [X] is the particular ligand concentration, and KR is the dissociation constant for the ligand for the active sites in the R conformation. The parameter c, or the ratio KR/KT, was fixed at zero because numerical analyses indicated that this value tended to be exceedingly small, reflecting the exclusive binding limit for active site ligands.20

TABLE I
Allosteric Parameters for Homotropic Cooperativity in Threonine Deaminase (a)

Parameter (b)    Threonine (c)    2-Aminobutyrate
L                1450             1125
KR               3.42 mM          10.9 mM
c                0                0
KRact,X          1.0 mM           105 µM
cact,X           0.15             0.15

(a) The allosteric parameters are reported for binding experiments and steady-state kinetics that were conducted in 50 mM potassium phosphate, pH 7.50, at 25°. The fractional saturation for active site ligand binding was measured from the large change in PLP fluorescence upon formation of the external aldimine with an amino acid substrate or inhibitor.21
(b) The allosteric parameters were determined using nonlinear least-squares analysis methods,26 with the error on the parameters corresponding to 65% confidence intervals. This usually resulted in confidence intervals of about 10–20% of the parameter value for binding constants, and between about 20 and 40% of the value for the allosteric equilibrium constants.20,21,24,27 When binding constants were fixed to the independently determined values, the error on the allosteric equilibrium constants improved substantially to about 10% of the parameter value.
(c) The parameters for threonine are derived from analysis of steady-state kinetics experiments in which it is assumed that velocity is directly proportional to fractional saturation.

As can be seen in Table I, not only is there excellent agreement in the
analyses for the allosteric parameters for threonine and 2-aminobutyrate binding to threonine deaminase, but the returned values for the binding constants for these ligands to the effector sites are in accord with those determined separately for TDL447F and the TDK62A-Thr complex. Although the expanded model provided a good description of the regulatory properties of threonine deaminase, it cannot explain in molecular terms how ligands exert their effect on the activity of the enzyme. To probe these issues, structural information for the enzyme was needed.

Structure and Control of Threonine Deaminase

The 2.8-Å resolution crystal structure of unliganded threonine deaminase (PDB code 1TDJ) reveals that the four subunits of the tetramer appear loosely assembled and are related by three molecular two-fold axes, designated x, y, and z. There is no contact between subunits along the y-axis, and only limited interaction between the N-terminal catalytic domains along the x-axis. In contrast, there is extensive contact between adjacent subunits along the z-axis. The organization of the four chains in the tetramer can be seen in Fig. 1, where each chain is shown in a different color. Each polypeptide chain adopts two roughly equal-sized domains that relate to the mutually perpendicular two-fold symmetry axes determining the tetramer structure. The larger, N-terminal domain contains the PLP-containing active site and is composed of two α/β folding units. Each of these two folding units consists essentially of four parallel β-strands, with helices on either side of the sheet. Between these two folding units is the PLP cofactor. The N-domain is nearest the center of the molecule and contacts two of its three symmetry mates in the tetramer, at the x- and z-directed symmetry axes. Interestingly, the N-domain shows a striking similarity to the β-subunit of the tryptophan synthase complex, in agreement with previous predictions based on sequence analysis and genetic selections for inactive variants.28,29 The active site can clearly be identified, and various residues stabilizing the PLP cofactor are evident.

26 M. L. Johnson and S. G. Frasier, Methods Enzymol. 117, 301 (1985).
27 E. Eisenstein, Arch. Biochem. Biophys. 316, 311 (1995).
28 B. E. Taillon, R. Little, and R. P. Lawther, Gene 63, 245 (1988).
29 K. E. Fisher and E. Eisenstein, J. Bacteriol. 175, 6605 (1993).

Fig. 1. Tertiary and quaternary structure of the threonine deaminase tetramer. Each of the four chains is shown in a different color. Shown in yellow are the pyridoxal 5′-phosphate cofactors that are embedded in the N-terminal catalytic domains. The C-terminal regulatory domains are at the top and bottom in this view. Also shown are the locations of the two symmetry axes that are involved in intersubunit contacts (the y-axis is perpendicular to the plane of the figure). Each subunit contacts two others in the tetramer, e.g., the magenta chain contacts both the green chain and the blue chain. The contacts along the x-axis are relatively few; they include the emphasized key residue glutamine 175 (see also Fig. 2). Contacts along the z-axis include both catalytic and regulatory domains; the regulatory domain contacts are the most extensive in the tetramer, giving the paired regulatory domains the appearance of a single dimeric globular unit. The asterisk indicates the helix in the green chain’s regulatory domain that contains leucine 447, leucine 451, and leucine 454 and is probably near to this chain’s regulatory site. Note that due to the twist in the “neck” between the N and C domains, this green chain regulatory helix is nearer to the magenta chain’s active site than it is to the active site in its own chain.

The C-terminal domain is only slightly smaller in size and provides extensive contact with its symmetry mate around the z-axis. The C-domain also has an α/β organization, with three helices flanking eight antiparallel β-strands, which surround a stretch of irregular structure at the subunit
interface along the z-axis. The location of the effector binding site in the TD regulatory domain has not been determined crystallographically, but both mutational evidence and alignment with the homologous phosphoglycerate dehydrogenase structure point to the same regions (the helix consisting of residues 446–454, see Fig. 1). A narrow necklike region formed by a single helix (residues 323–334) and its symmetry mate connects the catalytic and regulatory domains. Interestingly, due to the twist in this neck, the putative regulatory helix in one chain is closer to the active site in a different chain than to the active site in its own chain. The structure determination for unligated threonine deaminase provides only a starting point for analyzing the regulation of binding and catalysis. A detailed, molecular explanation for the allosteric transition of the enzyme will remain elusive without additional architectural information. However, the structure in hand has guided the engineering of several variants that have proven useful for a dissection of the allosteric properties of threonine deaminase and has enabled the construction of variants with useful, defined molecular attributes.

Active Dimeric Variants Enable an Estimation of the Coupling Free Energy for Cooperative Active Site Ligand Binding

Tetrameric threonine deaminase can be considered a “dimer of dimers” since the crystal structure revealed significant differences in the extent and the nature of the quaternary interactions along the x- and z-axes. Moreover, the structure suggested that it might be possible to alter the smaller of the two quaternary interfaces (the x-interface) so as to promote dissociation of tetramers into component dimers. As can be seen in Fig. 2, a key target for site-specific mutagenesis was glutamine 175 since it forms the central interaction of the x-interface, a bidentate hydrogen bond with the same residue on a symmetrically related monomer. Replacement of the glutamine with glutamate results in a dramatic destabilization of the tetrameric quaternary structure of the enzyme, likely attributable to the significant potential for charge repulsion in the variant. Indeed, the TDQ175E variant is a stable, active dimer, with many functional properties nearly indistinguishable from wild-type threonine deaminase. The ability to construct stable dimers that retain wild-type functional properties provides a significant advantage in dissecting the regulatory energetics of the tetrameric enzyme.

Fig. 2. Quaternary interactions along the x-axis. The view is along the x-axis, which is at the center. The symmetry-related glutamine 175 side chains, interacting through bidentate hydrogen bonds, are emphasized. About six other side chains from each subunit, mostly hydrophobics, are involved in this contact.

30 R. W. Noble, J. Mol. Biol. 39, 479 (1969).

Because cooperative ligand binding in allosteric systems is coupled to changes in the intersubunit bonding energy,30 a useful approach for understanding this coupling is to combine
ligand-binding studies with subunit association studies.31–33 In this way it is possible to correlate the changes in subunit interaction energies with changes in ligand affinity and thus the regulatory properties of a protein. Several lines of evidence suggest that the TDQ175E variant possesses nativelike functional properties. First, an analysis of the kinetic parameters for TDQ175E using the Hill equation yields a K0.5 for threonine of 11 mM, which is increased by about a factor of two from the wild-type value of 5–8 mM, and a Hill coefficient, nH, of 2.0, which is similar to wild-type TD, reflecting a sigmoidal saturation curve. In the presence of 0.5 mM valine, the K0.5 for threonine of 10 mM is virtually unchanged, although nH is lowered to 1.2. On the other hand, in the presence of 50 µM isoleucine, the K0.5 increases to 88 mM and the value for nH is 2.4. Second, measurements of isoleucine and valine binding to the regulatory sites of TDQ175E, monitored by changes in tryptophan fluorescence, suggest that the variant possesses nativelike functional properties.11,21 Analysis of the binding isotherm for isoleucine using the Hill equation reveals a K0.5 of 5.5 µM and a Hill coefficient, nH, of 1.5, whereas analysis of the valine isotherm yields a K0.5 of 146 µM and an nH value of 1.3. Both of these sets of parameters are quite similar to those obtained for the wild-type enzyme.20 Third, at relatively low (mg/ml) concentrations the circular dichroism spectrum of TDQ175E is virtually identical to wild-type threonine deaminase, both in the near- and far-UV regions, as well as in the visible region near 410 nm, indicative of a nativelike environment for the pyridoxal phosphate cofactor. And fourth, at high concentrations (20 mg/ml) the mutant enzyme crystallizes in the same space group as the wild-type enzyme.

Despite the similarity of the functional properties of TDQ175E to the wild-type enzyme, sedimentation equilibrium showed conclusively that the mutational alteration indeed resulted in a dimeric quaternary structure. Under native buffer conditions, the tetrameric wild-type enzyme does not show any tendency either to aggregate at high concentration or to dissociate at lower concentration, yielding a molecular weight of 220,000.21 However, as can be seen in Fig. 3A, analysis of sedimentation equilibrium data for TDQ175E reveals that the substitution results in the formation of a dimeric enzyme with a molecular weight of 105,000.

31 G. K. Ackers and H. R. Halvorson, Proc. Natl. Acad. Sci. USA 71, 4312 (1974).
32 G. K. Ackers, M. L. Johnson, F. C. Mills, and S. H. C. Ip, Biochem. Biophys. Res. Commun. 69, 135 (1976).
33 G. K. Ackers, M. L. Doyle, D. Myers, and M. A. Daugherty, Science 255, 54 (1992).

Fig. 3. Sedimentation equilibrium of TDQ175E. Sedimentation equilibrium was performed with a Beckman XL-A analytical ultracentrifuge.21 Because of the relatively weak association of dimeric TDQ175E, concentration profiles were obtained from absorbance scans of the PLP cofactor of threonine deaminase at 412 nm, which is reduced by a factor of about six relative to protein absorbance at 280 nm. Scans were performed after equilibrium was attained (between 18 and 24 h for 3-mm column heights). Typically, data for TDQ175E were taken at 11,000 rpm at two different protein concentrations at 25°. The molecular weight of dimeric TDQ175E was determined by nonlinear least-squares analysis26 in terms of a single species according to

$$c_r = B + c_m \exp\left[ M(1 - \bar{v}\rho)\,\omega^{2}\,(r^{2} - r_m^{2})/2RT \right]$$

where $c_r$ is the concentration of the protein at a given radial position, $c_m$ is the concentration of the protein at some reference position (e.g., the meniscus), M is the molecular weight, $\bar{v}$ is the partial specific volume, $\rho$ is the solvent density, $\omega$ is the angular velocity, r is the radial position in centimeters from the center of rotation, $r_m$ is the distance in centimeters from the center of rotation to the meniscus, R is the gas constant, T is the absolute (Kelvin) temperature, and B is a correction term for a nonzero baseline. A partial specific volume of 0.738 ml/g was estimated from the predicted amino acid sequence of threonine deaminase, and the solvent density was determined pycnometrically. In the presence of D-threonine, where an analysis of the molecular weight distribution did not conform to that for a single species, data were fit to

$$c_r = B + c_{m,D} \exp\left[ M_D(1 - \bar{v}\rho)\,\omega^{2}\,(r^{2} - r_m^{2})/2RT \right] + \frac{(c_{m,D})^{2}}{{}^{4}K_{DT}} \exp\left[ M_T(1 - \bar{v}\rho)\,\omega^{2}\,(r^{2} - r_m^{2})/2RT \right]$$

where $M_D$ and $M_T$ are the dimer and tetramer molecular weights, and ${}^{4}K_{DT}$ is the dimer–tetramer dissociation constant in the presence of saturating D-threonine. (A) Sedimentation equilibrium of TDQ175E in 50 mM potassium phosphate, pH 7.5, with no additional ligands. (B) Sedimentation equilibrium of TDQ175E in 50 mM potassium phosphate, pH 7.5, in the presence of 0.3 M D-threonine.

Interestingly, sedimentation experiments performed in the presence of isoleucine and valine yielded similar values for the molecular weight,
indicating that neither of the regulatory ligands has a measurable effect on the quaternary structure of TDQ175E. However, the active site ligands D-threonine and 2-aminobutyrate have a striking effect on the molecular weight distribution of TDQ175E as seen in sedimentation equilibrium experiments. Both of these ligands, which form stable complexes with pyridoxal phosphate in the active site, increase the tendency of the dimeric TDQ175E variant to assemble into tetramers. As can be seen in Fig. 3B, the average molecular weight for TDQ175E obtained in the presence of saturating concentrations of D-threonine is increased relative to the unligated variant. Analysis of the sedimentation equilibrium data presented in Fig. 3B yields a dissociation constant ($^{4}K_{DT}$) of 11.25 µM for the dimer–tetramer equilibrium of the ligated TDQ175E variant. Thus, in the presence of concentrations of D-threonine high enough to saturate the active site, the free energy for assembling fully liganded tetramers from liganded dimers, $^{4}\Delta G_{DT}$, is −6.75 kcal/mol. Since the sedimentation equilibrium results indicate that active site ligands stabilize the tetrameric form of TDQ175E, simple thermodynamic linkage relationships predict that D-threonine should bind more strongly to tetrameric relative to dimeric forms of the enzyme. This was verified by measuring the binding of D-threonine to TDQ175E at various enzyme concentrations. As can be seen in Fig. 4, representative isotherms for D-threonine binding to TDQ175E shift to the left, indicative of stronger average binding as the enzyme concentration is increased. Because of the limitations of the fluorescence assay for active site ligand binding,21,27 only a 20-fold range in active site concentration could be compared in D-threonine binding studies.
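The two sedimentation equilibrium expressions given in the legend to Fig. 3 can be sketched numerically as follows. This is a minimal illustration rather than the fitting procedure used for the data: concentrations are treated as molar dimer concentrations (the experiments recorded absorbance), the tetramer is assumed to have exactly twice the dimer molecular weight, and the solvent density of 1.0 g/ml and the function names are hypothetical placeholders. Only the partial specific volume (0.738 ml/g), rotor speed (11,000 rpm), and temperature (25°) come from the text.

```python
import math

R_ERG = 8.314e7  # gas constant in erg mol^-1 K^-1 (CGS units, so radii are in cm)


def sigma(M, vbar, rho, rpm, T):
    """Return M(1 - vbar*rho) * omega^2 / (2RT), in cm^-2."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return M * (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_ERG * T)


def single_species(r, r_m, c_m, M, B=0.0,
                   vbar=0.738, rho=1.0, rpm=11000.0, T=298.15):
    """Concentration at radius r for one ideal species plus baseline B."""
    return B + c_m * math.exp(sigma(M, vbar, rho, rpm, T) * (r ** 2 - r_m ** 2))


def dimer_tetramer(r, r_m, c_mD, M_D, K_DT, B=0.0,
                   vbar=0.738, rho=1.0, rpm=11000.0, T=298.15):
    """Concentration at radius r for a dimer-tetramer equilibrium.

    K_DT is the dimer-tetramer dissociation constant (same units as c_mD).
    The tetramer mass is assumed to be 2 * M_D.
    """
    dr2 = r ** 2 - r_m ** 2
    dimer = c_mD * math.exp(sigma(M_D, vbar, rho, rpm, T) * dr2)
    tetramer = (c_mD ** 2 / K_DT) * math.exp(sigma(2.0 * M_D, vbar, rho, rpm, T) * dr2)
    return B + dimer + tetramer


if __name__ == "__main__":
    r_m = 6.9  # hypothetical meniscus position in cm
    for r in (6.9, 6.95, 7.0, 7.05, 7.1):
        c1 = single_species(r, r_m, 2e-6, 105000.0)
        c2 = dimer_tetramer(r, r_m, 2e-6, 105000.0, 11.25e-6)
        print(f"r = {r:.2f} cm  dimer only = {c1:.3e}  dimer + tetramer = {c2:.3e}")
```

In a real analysis these functions would be passed to a nonlinear least-squares routine with M (or K_DT) and B as adjustable parameters; that step is omitted here.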
At the lowest protein concentration of the assay, corresponding to 1.45 µM sites, the average binding constant was estimated to be 49 mM, which corresponds to a free energy for ligand binding to a single site on a dimer, $\Delta G_{DX}$, of −1.8 kcal/mol. At the highest protein concentration examined, 28.3 µM sites, the average binding constant is reduced to 29.7 mM. Thus, on a qualitative basis, the interaction energy at the glutamine 175 interface increases as ligands bind to the active site of threonine deaminase. Except for its quaternary structure, dimeric TDQ175E is so similar to the wild-type enzyme in its structural and functional properties that it is useful to consider the consequences of these energetic binding patterns by making two assumptions. The first assumption is that at high enough TDQ175E concentrations, beyond those amenable to the fluorescence assay for active site ligand binding, the saturation curve for d-threonine will approach that seen for wild-type threonine deaminase, characterized by an average binding constant of 19.8 mM. The second assumption is that at the lowest concentrations of TDQ175E used in the binding assay, the protein

Fig. 4. Protein concentration-dependent d-threonine binding to TDQ175E. Saturation curves for d-threonine binding to various concentrations of TDQ175E were measured from the change in PLP fluorescence upon titration with d-threonine, which binds only to the active site of threonine deaminase.21 Binding experiments were performed in 50 mM potassium phosphate, pH 7.5, at 25 °C. Saturation curves were analyzed in terms of an average binding constant, equivalent to the median ligand activity, for each protein concentration.20,21,24,27 Three of the isotherms shown represent the following (site) concentrations of TDQ175E: (d) 1.45 µM sites, yielding a $K_{av}$ of 49 mM and a $\Delta G_{DX,av}$ of −1.8 kcal/mol; (h) 6.0 µM sites, yielding a $K_{av}$ of 42.7 mM; (m) 17.7 µM sites, yielding a $K_{av}$ of 34.2 mM. Also shown is a reference curve for d-threonine binding to tetrameric, wild-type threonine deaminase (m) at a concentration of 1.76 µM sites, yielding a $K_{av}$ of 19.8 mM and a $\Delta G_{TX,av}$ of −2.32 kcal/mol.
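The free energies quoted alongside the binding constants follow from the standard relation between a dissociation constant and the free energy of association. The sketch below assumes a 1 M standard state, a temperature of 25 °C, and R = 1.987 cal/(mol K); the function name is hypothetical and the calculation is offered only as a check on the arithmetic, not as the analysis used for the isotherms.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15    # 25 degrees C, the temperature of the binding experiments


def association_free_energy(k_dissociation_molar, temp=TEMP_K):
    """Free energy of association (kcal/mol) from a dissociation constant in M.

    Uses delta_G = RT ln(K_d) with a 1 M standard state (an assumption here).
    """
    return R_KCAL * temp * math.log(k_dissociation_molar)


dg_dimer_site = association_free_energy(49e-3)       # K_av for TDQ175E at 1.45 uM sites
dg_tetramer_site = association_free_energy(19.8e-3)  # K_av for the wild-type tetramer
dg_assembly = association_free_energy(11.25e-6)      # dimer-tetramer constant with d-threonine
```

With these inputs the three values come out close to −1.8, −2.32, and −6.75 kcal/mol, matching the numbers given in the text and legends.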

concentration is sufficiently below the dimer–tetramer equilibrium constant that the average binding energy for d-threonine reflects binding solely to the dimeric species. Employing these assumptions enables an evaluation of the total energies for ligand binding to dimeric and tetrameric forms of threonine deaminase and, along with the experimentally determined dimer–tetramer equilibrium constant, enables estimation of the free energy for unliganded dimers associating to tetramers and of the coupling free energy for cooperative active site ligand binding to the enzyme. A linkage scheme for d-threonine binding to dimers and tetramers of threonine deaminase and for the dimer–tetramer equilibrium in the absence and presence of ligand is presented in Fig. 5. The total binding energy for d-threonine saturation of two (TDQ175E) dimers, four sites at −1.8 kcal/mol each, is −7.2 kcal/mol. Since $^{4}\Delta G_{DT}$, the free energy for the dimer–tetramer equilibrium in the presence of saturating concentrations of d-threonine, was determined to be −6.75 kcal/mol, the total energy in assembling unliganded dimers to liganded tetramers is −13.95 kcal/mol. Because the total free energy for binding

Fig. 5. Thermodynamic linkage scheme for cooperative d-threonine binding and dimer–tetramer assembly in threonine deaminase. The total energy for four ligands binding to two dimers was estimated from the average energy obtained for d-threonine binding to TDQ175E at a concentration (1.45 µM sites) well below the dimer–tetramer dissociation constant. The total energy for binding 4 mol of d-threonine to tetrameric threonine deaminase is estimated from ligand binding to the wild-type enzyme. The free energy for the dimer–tetramer equilibrium of TDQ175E in the presence of saturating d-threonine was obtained from sedimentation equilibrium experiments. With these three quantities it is straightforward to estimate the free energy for the dimer–tetramer equilibrium of TDQ175E in the absence of ligands. A cooperative free energy of −2.08 kcal/mol for active site ligand binding in this system can be estimated either from the difference in assembly free energy in the presence and absence of ligands or from the difference in the total ligand binding free energy for tetramers and dimers. d-Threonine is indicated by an X in the scheme, dimeric TDQ175E by D, and tetrameric threonine deaminase by T.
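The arithmetic behind this linkage scheme can be written out explicitly. The sketch below closes the thermodynamic cycle using the three measured quantities named in the legend; the function and key names are hypothetical, the per-site energies are the rounded values quoted in the text, and RT is taken at 25 °C as an assumption.

```python
import math

RT_KCAL = 1.987e-3 * 298.15  # RT in kcal/mol at 25 degrees C (assumed temperature)


def linkage_cycle(dg_dx=-1.8, dg_tx=-2.32, dg4_dt=-6.75, n_sites=4):
    """Close the dimer/tetramer ligand-binding cycle (all energies in kcal/mol).

    dg_dx  : average per-site binding energy on dimers
    dg_tx  : average per-site binding energy on tetramers
    dg4_dt : assembly energy of fully liganded dimers into liganded tetramers
    """
    total_dimer = n_sites * dg_dx        # four ligands bound to two dimers
    total_tetramer = n_sites * dg_tx     # four ligands bound to one tetramer
    # Both routes from unliganded dimers to liganded tetramer must sum to the same value.
    dg0_dt = total_dimer + dg4_dt - total_tetramer
    return {
        "total_dimer": total_dimer,
        "total_tetramer": total_tetramer,
        "dg0_dt": dg0_dt,
        "coupling_from_binding": total_tetramer - total_dimer,
        "coupling_from_assembly": dg4_dt - dg0_dt,
        "k0_dt_molar": math.exp(dg0_dt / RT_KCAL),  # unliganded dissociation constant
    }


cycle = linkage_cycle()
```

Because the cycle is closed, the two routes to the coupling free energy necessarily agree; the value of the calculation lies in the unliganded assembly energy, which is not measured directly.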

d-threonine to the (wild-type) tetramer is −9.28 kcal/mol, it is possible to estimate an assembly free energy, $^{0}\Delta G_{DT}$, of −4.67 kcal/mol for the association of two unliganded dimers to form an unliganded tetramer, which corresponds to a dissociation constant of about 375 µM. Thus, the coupling or cooperative free energy for cooperative d-threonine binding to TDQ175E, and by analogy, to the wild-type enzyme, can be estimated either from the difference between the total energy of ligation for tetramers versus dimers, or the difference in assembly free energy between fully liganded dimers forming tetramers versus unliganded dimers forming tetramers. Either of these approaches yields an estimate for the coupling free energy for cooperative d-threonine binding to threonine deaminase of −2.08 kcal/mol, a small, negative number. The importance of estimating a value for the cooperative free energy for threonine deaminase is that this is the energy that the molecule releases in undergoing the allosteric transition from a low-affinity to a high-affinity

form. Interestingly, this value is negative, which reflects the tightening of intersubunit bonding energies in the presence of active site ligands as seen by sedimentation equilibrium. The fact that this value is rather small may simply reflect the relatively limited surface area at the glutamine 175 interface (only 730 Å²). Strikingly, the temperature dependence of the dimer–tetramer equilibrium constant for d-threonine–ligated TDQ175E yields a positive enthalpy change of +6.6 kcal/mol, indicating that the association reaction is entropy driven, possibly due to solvent release from the vicinity of glutamate 175 as the interface is buried. Although both isoleucine and valine showed significant effects on the d-threonine binding isotherms for TDQ175E, it was unexpected that they had no effect on the subunit association energies. The most obvious interpretation of this result is that the regulatory information stemming from isoleucine and valine binding to the effector sites is not communicated from one dimer to another across the x-axis, which raises the question of whether regulatory ligands act by a local or global mechanism.

Hybrid, Enzyme-Like Tetramers Support a Local Model for Feedback Regulation of Threonine Deaminase

Despite the pronounced inhibition and activation of threonine deaminase that isoleucine and valine show in kinetics assays, the molecular basis for these effects, and the similar effects seen for many allosteric enzymes, is unknown. Do isoleucine and valine act globally to propagate their effects to all of the active sites of the tetrameric enzyme, or do heterotropic ligands exert local changes in the activity of only those active sites in the proximity of their binding? A classic approach to address these issues is to construct hybrid enzymes with distinct arrangements of native and defective catalytic and regulatory sites and to characterize them with respect to their allosteric properties.33,34–39 Based on the striking changes in quaternary structure of threonine deaminase upon the introduction of a single amino acid substitution at the x-axis of the tetramer, the effect of low levels of chaotropic agents such as NaSCN to promote dissociation was evaluated. Sedimentation equilibrium

34 F. R. Smith and G. K. Ackers, Proc. Natl. Acad. Sci. USA 82, 5347 (1985).
35 Y. R. Yang and H. K. Schachman, Anal. Biochem. 163, 188 (1987).
36 S. R. Wente and H. K. Schachman, Proc. Natl. Acad. Sci. USA 84, (1987).
37 E. Eisenstein and H. K. Schachman, in ‘‘Protein Function: A Practical Approach’’ (T. Creighton, ed.), p. 135. IRL Press, Oxford, 1989.
38 E. Eisenstein, M. S. Han, T. S. Woo, J. M. Ritchey, I. Gibbons, Y. R. Yang, and H. K. Schachman, J. Biol. Chem. 267, 22148 (1992).
39 G. K. Ackers, Adv. Protein Chem. 51, 185 (1998).

indicated that the addition of as little as 0.125 M NaSCN increased the tendency of wild-type threonine deaminase to dissociate from tetramers into dimers, yielding an equilibrium constant for dissociation of about 0.1 µM under these conditions, consistent with the limited contact between the dimers at the x-axis interface of the enzyme. Because of the relative ease of dissociating native enzyme, the ability of limited NaSCN treatment to construct hybrid tetramers containing dimers from different parents was evaluated using two chromatographically distinguishable variants of threonine deaminase. This approach utilized a purified enzyme variant expressed from pET-15b (Novagen) with a 20-residue, thrombin-cleavable, His6 sequence that requires higher salt concentrations to elute from an anion-exchange column relative to wild-type threonine deaminase. After addition of 0.125 M NaSCN to an equal mixture of wild-type and His-tagged threonine deaminase (at mg/ml levels), overnight incubation, and dilution of NaSCN to minimize effects on chromatography, three species elute from an anion-exchange resin in a roughly binomial (1:2:1) distribution. These results are consistent with the formation of hybrid tetramers composed of two different homodimeric protomers. However, because His-tagged threonine deaminase exhibits a marked reduction in activity relative to native enzyme, analyses of the catalytic and allosteric properties of hybrids were performed on mixtures of hybrids and their parents to shed light on a mechanism for feedback regulation. The experimental strategy was to mix equal amounts of a catalytically deficient variant of threonine deaminase with another variant that was altered in binding feedback modifiers. In this way it was possible to generate a hybrid that contained one dimer with a pair of native catalytic sites but inactive regulatory sites apposed to another dimer with a pair of inactive catalytic sites but native regulatory sites.
These hybrids were used to assess the effect of binding feedback modifiers on one dimer on the activity of the other dimer in a tetramer. The compartmentalized, two-domain structure for threonine deaminase readily enabled the construction of inactive variants with native regulatory binding properties29 and fully active variants that were unable to bind isoleucine and valine.40 The first hybridization experiment presented in Fig. 6 illustrates the predicted and observed properties of hybrid tetramers composed of one catalytically defective dimer and one regulatory site defective dimer. The catalytically inactive parent in this experiment contains the active site S86G substitution, resulting in the complete loss of enzyme activity and isoleucine auxotrophy.29 The parent defective in regulatory ligand binding is a triple mutant of threonine deaminase in which leucine residues at

40 D. Chinchilla, F. P. Schwarz, and E. Eisenstein, J. Biol. Chem. 273, 23219 (1998).

Fig. 6. Experimental strategy and results for the hybridization of catalytically inactive and regulatory site-deficient variants of threonine deaminase. The top panel is a schematic of the hybridization of a parental variant of threonine deaminase with no catalytic activity, TDS86G (shaded catalytic domains) and another parental species that cannot bind ligands at the effector sites, TDL447,451,454A (shaded regulatory domains) to yield hybrid tetramers with defined arrangements of native active and regulatory sites. (The catalytic and regulatory domains of each chain are presented in the same orientation as Fig. 1, with the small, central circles representing active sites and the small peripheral rectangles representing regulatory sites.) The expected hybrid possesses native regulatory sites adjacent to inactive catalytic sites in one dimer, and competent catalytic sites adjacent to impaired regulatory sites in the other dimer. The bottom panel shows the specific activities of the various components in the hybridization mixture. Control experiments (d) show no activity of the active site variant either in the absence or presence of isoleucine (50 µM). Alternatively, the activity of the regulatory site variant (&) is unchanged in the presence of isoleucine. The mixture (m) exhibits half of the specific activity of the active parent because only half of the active sites are functional. However, there is no effect of isoleucine on the activity of the mixture. This result is consistent with a local mechanism for regulation by feedback modifiers and is in contrast to that expected for a global effect for feedback regulation (shown as a dotted line). The production of 2-ketobutyrate was assayed continuously at 230 nm in 50 mM potassium phosphate, pH 7.5, at 25 °C, containing 10 mM l-threonine. Reactions were initiated by 500-fold dilution of enzyme mixtures that were incubated in the presence or absence of 0.125 M NaSCN into substrate solutions.
There was no effect of up to 25 mM NaSCN on the activity or regulation of wild-type threonine deaminase. Because there was virtually no difference between the specific activities seen for the mixtures either in the absence or presence of feedback modifiers, only one set of data points is presented for each experimental condition to simplify the presentation of the data. The predicted effects for local and global mechanisms of regulation are shown parenthetically, adjacent to the relevant curves in the bottom panel.
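The contrast between the local and global predictions in this figure can be made concrete with a small calculation. The sketch below assumes random (binomial) reassembly of dimers after NaSCN treatment, equal specific activity for every native active site, and a hypothetical `inhibition` parameter describing how strongly isoleucine bound to one dimer would suppress the apposing dimer under a global mechanism; none of these names or the degree of inhibition come from the text.

```python
def hybrid_fractions(frac_active_parent=0.5):
    """Binomial tetramer populations after random reassembly of dimers.

    'A' dimers come from the catalytically active, regulatory-site-defective parent;
    'B' dimers come from the inactive parent with native regulatory sites.
    """
    p = frac_active_parent
    q = 1.0 - p
    return {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}


def predicted_activity(mechanism, frac_active_parent=0.5, inhibition=1.0):
    """Activity of the mixture with isoleucine, relative to the pure active parent.

    Local model: isoleucine binds only B dimers, whose active sites are already
    dead, so nothing changes. Global model: isoleucine bound to the B dimer also
    suppresses the A dimer in AB hybrids by the (assumed) fraction 'inhibition'.
    """
    fractions = hybrid_fractions(frac_active_parent)
    from_parent = fractions["AA"] * 1.0   # all four sites active, no effector binding
    from_hybrid = fractions["AB"] * 0.5   # two of four sites active
    if mechanism == "global":
        from_hybrid *= (1.0 - inhibition)
    elif mechanism != "local":
        raise ValueError("mechanism must be 'local' or 'global'")
    return from_parent + from_hybrid


fractions = hybrid_fractions()
local_activity = predicted_activity("local")
global_activity = predicted_activity("global")
```

For an equal mixture, the populations are 1:2:1 and the local model leaves the relative activity at one half, whereas complete cross-dimer inhibition under the global model would lower it to one quarter as hybrids form.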

positions 447, 451, and 454 are replaced by alanine, resulting in a dramatic loss in affinity for isoleucine and valine.40 According to this rationale, dissociation of these two parents to dimers followed by random reassembly should yield hybrid tetramers with a distinct arrangement of native active sites and native regulatory sites (Fig. 6, top panel). If isoleucine (and valine) regulated the enzyme through a local mechanism in which they affected only the active sites within a dimer that were adjacent to competent regulatory binding sites, then no change would be expected in the level of activity of the mixture in the presence of heterotropic ligands. On the other hand, if feedback ligands altered enzyme activity in a global manner that was communicated across the x-axis interface, then a gradual decrease in the activity for a hybrid mixture would be expected in the presence of isoleucine, reflecting the time course for hybridization. Representative results can be seen in the bottom panel of Fig. 6 for the hybridization of inactive TDS86G with a regulatory site triple mutant deficient in isoleucine and valine binding. At 0.125 M NaSCN, which is sufficient to promote the complete hybridization at the dimer level for these two parents, there is no effect of isoleucine or valine on the mixture relative to controls incubated in the absence of chaotropic agent. These results are consistent with the interpretation that heterotropic ligand binding affects only those catalytic sites within the same dimer of a hybrid tetramer, and are in sharp contrast to that expected if the binding of feedback modifiers resulted in a global effect that altered activity throughout the tetramer. A second experiment performed using another catalytically inactive TD variant, the TDK62A-Thr complex, which is stabilized in the high-affinity, R state24 and the regulatory site triple mutant, yielded surprising results. As can be seen in Fig.
7, this hybridization mixture displayed an increase in activity with time, rather than a fixed level based on the concentration of native active sites in the mixture. However, because there was no effect of isoleucine (or valine) on the activity of this mixture, similar to the results for TDS86G, they provide additional support for a local mechanism for feedback regulation in threonine deaminase. The simplest interpretation for the increase in activity seen for this mixture is that the R form of the TDK62A-Thr complex was able to convert the unliganded active sites in the apposing dimer of the hybrid tetramers from the T into the R state, thereby increasing the activity of the mixture relative to control or predicted values (Fig. 7, bottom panel). Support for this interpretation comes from additional binding and enzyme kinetic measurements. Threonine saturation curves of the mixture treated with NaSCN relative to controls in the absence of chaotropic salt reveal that the hybrid mixture exhibits nearly hyperbolic kinetics, consistent with its

Fig. 7. Activation of unliganded active sites by inactive, R state dimers in a hybrid tetramer of threonine deaminase. The top panel shows the specific activities of the various components in a hybridization mixture containing the TDK62A-Thr complex, which is stabilized in the R state, and TDL447,451,454A, which is in the T state and cannot bind regulatory ligands. Control experiments show no activity for the inactive parent, TDK62A-Thr, (bottom curve, d), and no change in activity for the regulatory site parent, TDL447,451,454A (top curve, d), either in the absence or presence of NaSCN and isoleucine. Also, a mixture of these enzymes in the absence of NaSCN (&) exhibited half of the specific activity of the active parent. In the presence of NaSCN, however, the hybrid mixture (m) exhibited an increase in activity, reflecting an activation of the hybrid enzyme during the time course. The bottom panel shows a schematic representation of a possible mechanism for activation of the native catalytic domains in one dimer by the liganded, but inactive catalytic domains in an apposing dimer. Inactive domains are shaded gray, and liganded active sites are shown as small filled circles. Because there was virtually no difference between the specific activities seen for the mixtures either in the absence or presence of feedback modifiers, or in relevant cases, in the absence or presence of sodium thiocyanate, only one set of data points is presented for each experimental condition to simplify the presentation of the data.

conversion to the R state. Additionally, valine-binding experiments provide an estimate for an average dissociation constant of roughly 25 µM, similar to that seen for the R state variant TDL447F, and vastly reduced from the value of roughly 140 µM for control reactions of the parent tetramers mixed in the absence of NaSCN. Thus, these hybridization experiments suggest that it is possible to stabilize the activated R form of threonine deaminase when two catalytic sites within one dimer are

occupied with substrate. This interpretation has significant implications for identifying important intermediates that populate the T and R states in the course of cooperative ligand binding to the active sites of the enzyme. The binding of a second substrate within the same dimer of threonine deaminase may therefore be sufficient to promote the allosteric transition by converting the entire enzyme population from the T to the R form. Although the previous two approaches do not support the notion of a global mechanism for feedback regulation in threonine deaminase, and they are also consistent with the finding that neither isoleucine nor valine showed an effect on the dimer–tetramer dissociation constant of TDQ175E, more direct evidence could be obtained for a local mechanism in a third hybridization experiment illustrated in Fig. 8. This strategy consisted of

Fig. 8. Isoleucine inhibition within an active dimer of hybrid tetramers supports a local model for feedback regulation. The top panel illustrates the hybridization strategy for the inactive, R state–like TDK62A-Thr complex that also contains alanine substitutions for leucines 447, 451, and 454, and wild-type threonine deaminase. The bottom panel shows the effect of isoleucine (50 µM) on the steady-state kinetics of the hybrid mixture using threonine as a substrate. In the absence of isoleucine (d), the saturation curve is virtually hyperbolic, consistent with the fact that the majority of the active sites have been converted to the R state. The addition of isoleucine (&) results in a marked decrease in the activity of the hybrid mixture, indicating that feedback regulation occurs within a dimeric protomer of threonine deaminase.

combining in a single threonine deaminase variant the K62A mutation along with the three regulatory site mutations to produce an inactive enzyme that was unable to bind effector ligands. Once again, because the TDK62A-Thr complex is stabilized in the R state, hybridization experiments between this variant and wild-type threonine deaminase resulted in a mixture that exhibited an increase in activity with time. However, in this case, there was a (slight) effect of isoleucine on reducing the increase in activity during the time course. This can be seen more clearly in the bottom panel of Fig. 8, which shows that a hybrid tetramer containing inactive catalytic and regulatory sites in one dimer and native catalytic and regulatory sites in another dimer exhibits hyperbolic kinetics, indicating that the hybrid is stabilized in the R state. However, in this case, the hybrid enzyme shows a significant decrease in activity in the presence of isoleucine. This experiment provides important, additional support for the idea that feedback modifiers act on catalytic sites that are adjacent to local regulatory sites, and that these ligands do not have a substantial effect on the global properties of the enzyme.

Conclusions

The strategy of combining structural, thermodynamic, and kinetic approaches to a study of allosteric enzymes can yield important insights into the molecular mechanisms for their regulation. Here we have illustrated how the three-dimensional structure of threonine deaminase guided mutagenesis efforts to enable the construction of a number of enzymes that proved useful for biochemical studies. Experimental results on these variants have led to an interesting, yet complex picture for threonine deaminase regulation. Substrates bind not only to active sites, but also to regulatory sites to activate the enzyme. Thus, substrates can synergistically promote the allosteric transition, which can be explained quantitatively by an expansion of the classic two-state model to incorporate promiscuous ligand binding. Cooperative active-site ligand binding to the tetramer strengthens intersubunit interactions, and the relatively small coupling free energy that can be estimated suggests that subtle changes at subunit interfaces may significantly influence homotropic cooperativity. Indeed, analyses of hybrid tetramers composed of two different parental dimers that exhibit altered kinetics and effector-binding properties suggest that once two substrates are bound to the active sites within one dimer, the entire tetrameric enzyme converts to the high-activity conformation. Thus, full dimer ligation is an important switch for a global transition that leads to homotropic cooperativity in active site ligand binding. On the other hand, feedback modifiers appear to regulate only the active sites within

the dimer to which they are bound. Since they do not have an effect on the catalytic activity of the apposing dimer within tetramers, feedback modifiers regulate threonine deaminase by a local mechanism that may involve intra- or interchain interactions that are communicated only within a dimeric protomer of enzyme.

[5] Methods for Analyzing Cooperativity in Phosphoglycerate Dehydrogenase

By Gregory A. Grant

Introduction

d-3-Phosphoglycerate dehydrogenase (PGDH) from Escherichia coli exhibits multiple cooperative processes. These include positive and negative cooperativity in effector binding,1 positive cooperativity in catalytic inhibition by the effector that reaches a maximum at less than full occupancy of effector binding sites,1 and negative cooperativity in cofactor binding.2 Thus, this enzyme provides an unusual and very interesting system for studying the structural basis of cooperativity. d-3-Phosphoglycerate dehydrogenase (EC 1.1.1.95) catalyzes the first committed step in the biosynthesis of l-serine by converting d-3-phosphoglycerate to hydroxypyruvic acid phosphate utilizing NAD+ as a cofactor.3,4 Subsequently, hydroxypyruvic acid phosphate is converted to phosphoserine by phosphoserine transaminase and then to l-serine by phosphoserine phosphatase.3 In some organisms, such as E. coli,5 PGDH is inhibited by l-serine, the end product of the pathway. E. coli PGDH is a V-type enzyme6 that accomplishes its regulation of catalysis by altering the velocity of the reaction rather than the affinity of the substrates. PGDH is found throughout the range of living organisms4,7,8 and is a member of a family of proteins that are classified as 2-hydroxyacid dehydrogenases, which are generally specific for substrates with a d

1 G. A. Grant, X. L. Xu, and Z. Hu, Protein Sci. 8, 2501 (1999).
2 G. A. Grant, Z. Hu, and X. L. Xu, J. Biol. Chem. 277, 39548 (2002).
3 A. Ichihara and D. M. Greenberg, J. Biol. Chem. 224, 331 (1957).
4 D. A. Walsh and H. J. Sallach, J. Biol. Chem. 241, 4068 (1966).
5 L. I. Pizer, J. Biol. Chem. 238, 3934 (1963).
6 E. Sugimoto and L. I. Pizer, J. Biol. Chem. 243, 2081 (1968).
7 J. E. Willis and H. J. Sallach, Biochim. Biophys. Acta 81, 39 (1964).
8 J. C. Slaughter and D. D. Davies, Methods Enzymol. 41, 278 (1975).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

configuration.9 PGDH is the only known member of this family that exists as a tetramer; all other members so far identified are dimeric proteins. It appears that it is the tetrameric structure of E. coli PGDH that is responsible for its ability to be regulated by l-serine (see below). However, even though eukaryotic PGDHs retain a tetrameric structure, no eukaryotes have been shown to contain a PGDH that is inhibited in this manner. PGDH exists in two different structural motifs that do not appear to be strictly specific for organism type.10 The PGDH of some bacteria and some single-cell eukaryotes, such as yeast, are structurally similar to the E. coli enzyme. In addition to the 2-hydroxyacid substrate and nucleotide-binding domains, they possess a C-terminal domain that is involved in effector binding and regulation of activity. This third domain has been called the ACT domain11,12 and consists of a βαββαβ structural motif that has been found in other proteins as well. Other bacteria and higher order eukaryotes, including mammals, possess a large polypeptide insertion in their C-terminal segment between the ACT domain and the 2-hydroxyacid domain. E. coli PGDH (EC 1.1.1.95) is composed of four identical subunits (see Fig. 1) each with three structurally distinct domains, the substrate-binding domain, the nucleotide-binding domain, and the regulatory domain.13 The substrate-binding domain is flanked on either side by the nucleotide-binding domain and the regulatory domain, which binds the effector, l-serine. The catalytic cleft is found between the substrate- and nucleotide-binding domains. In the tetramer, the subunit interfaces are formed by contact between adjacent regulatory domains on the one hand and adjacent nucleotide-binding domains on the other, to form a ‘‘dimer of dimers’’ type structure. Two l-serine molecules can bind between each of the two pairs of adjacent regulatory domains forming a hydrogen bond network across the noncovalent interface.
Note that each subunit does not contain a complete serine-binding site within its own structure. Rather, binding of effector at two symmetrical interfaces controls the activity of four active sites. The regulatory and substrate-binding domains are linked by only a single strand of polypeptide, which contains a Gly-Gly sequence approximately midway between the two domains. The substrate-binding and the nucleotide-binding domains are connected at one end of the catalytic cleft by only two strands of polypeptide, one of which also contains the only

9 G. A. Grant, Biochem. Biophys. Res. Comm. 165, 1371 (1989).
10 Y. Achouri, M. H. Rider, E. Van Schaftingen, and M. Robbi, Biochem. J. 323, 365 (1997).
11 L. Aravind and E. V. Koonin, J. Mol. Biol. 287, 1023 (1999).
12 D. M. Chipman and B. Shaanan, Curr. Opin. Struct. Biol. 11, 694 (2001).
13 D. Schuller, G. A. Grant, and L. J. Banaszak, Nat. Struct. Biol. 2, 69 (1995).

Fig. 1. Ribbon diagram of the structure of PGDH. The enzyme is a tetramer of identical subunits shown in alternating dark and light gray. The three structural domains in each subunit are labeled for the upper left subunit. The subunit contacts are at the interface of the regulatory domains and the interface of the nucleotide-binding domains. The locations of the serine-binding sites are shown and the serines are depicted with space-filling structures. The serine molecules bind at the interfaces and form hydrogen bonds to both subunits. One of the active sites is also labeled. Bound NADH is shown in ball and stick form.

other Gly-Gly sequence found in the enzyme. Site-directed mutagenesis studies14,15 have shown that the properties of these Gly-Gly sequences are consistent with them functioning as flexible hinge regions in PGDH. Thus, catalysis and regulation of PGDH are thought to occur through movement of rigid domains about flexible hinges. As mentioned above, E. coli PGDH is particularly interesting because its regulation by L-serine displays various levels of cooperativity. While the

14 G. A. Grant, X. L. Xu, and Z. Hu, Biochemistry 39, 7316 (2000).
15 G. A. Grant, Z. Hu, and X. L. Xu, J. Biol. Chem. 276, 17844 (2001).


catalytic activity of PGDH displays normal Michaelis–Menten kinetics, its inhibition by L-serine behaves in a positively cooperative manner.6,16 Under appropriate conditions, the binding of L-serine to PGDH can display both positive and negative cooperativity in a stepwise manner.1 Furthermore, these two cooperative processes, serine inhibition and serine binding, can be uncoupled by restricting the rotational freedom around one of the Gly-Gly hinges.15 This observation underscores the concept that the two processes occur by distinct pathways that, although initiated by a common event, diverge at some point and proceed by way of unique structural elements.

In addition, E. coli PGDH displays negative cooperativity in the binding of NADH.2 Two sites in the enzyme bind NADH very tightly and two sites bind NADH with weaker affinity. A very interesting property of E. coli PGDH is that cofactor binding modulates the degree of cooperativity observed in L-serine binding. The differential binding of NADH provides evidence for at least two additional conformational states that can be directly correlated to altered degrees of cooperativity of serine binding. Somewhat unexpected is the significant effect of cofactor binding on effector binding without a similar effect in the opposite direction.

In summary, at least three cooperative processes can be distinguished in PGDH: the binding of L-serine, the inhibition of catalytic activity, and the binding of the cofactor, NADH. Investigations of these properties require reliable means to assess the nature of the interaction of the effector molecule with the protein and the modulation of catalytic activity produced, both in the native enzyme and in response to structural mutations introduced into the enzyme. This chapter discusses the methods that have been used to provide this information. None of the methods described here is new or unique to this investigation.
However, they have been very productive in providing insight into the structural basis and dynamics of allosteric regulation in an enzyme with multiple dependent sites.

Characterizing Cooperative Ligand Binding

Binding to a Single Site or Multiple Independent Sites

In the simplest case, one molecule of ligand binds to one binding site on each molecule of protein. As shown below, this case is relatively straightforward and can be characterized quite easily.

16 E. Sugimoto and L. I. Pizer, J. Biol. Chem. 243, 2090 (1968).


The binding of a ligand (L) to a protein (P) can be depicted simply as

$$P + L \rightarrow PL$$

and the equilibrium constant, which is equivalent to the association constant, is defined as

$$K_{eq} = K_a = \frac{[PL]}{[P][L]} \tag{1}$$

Conversely, the process of dissociation of a ligand from a protein is depicted as

$$PL \rightarrow P + L$$

and the dissociation constant, which is the inverse of the association constant, is

$$K_d = \frac{[P][L]}{[PL]} \tag{2}$$

To determine the magnitude of these binding constants, the concentrations of protein [P], ligand free in solution [L], and ligand bound to protein [PL] must be determined. Note that [L] refers to the free ligand concentration and is not equivalent to the total ligand concentration [Lt], which is [L] + [PL]. The total protein and ligand concentrations can be directly measured prior to the start of the binding experiment. The challenge is in distinguishing bound from free ligand (see equilibrium dialysis below). The experimental determination of the binding constants is accomplished by measuring [PL] as a function of [L]. This is usually expressed in terms of the fractional saturation of P by L. The fractional saturation (Y) is defined as the fraction of available binding sites that are occupied by ligand and ranges from 0 to 1. Thus

$$Y = \frac{[PL]}{[P] + [PL]} \tag{3}$$

Rearranging Eq. (2) to [PL] = [P][L]/Kd and substituting into Eq. (3) yields

$$Y = \frac{[L]/K_d}{1 + [L]/K_d} = \frac{[L]}{K_d + [L]} \tag{4}$$

For a protein with only a single ligand-binding site (n = 1), the fractional saturation Y equals r, the number of ligands bound per protein molecule. However, if more than one binding site exists (n > 1), then

$$Y = \frac{r}{n} \tag{5}$$

where r is the number of ligands bound per protein molecule. Thus Eq. (4) becomes

$$r = \frac{n[L]}{K_d + [L]} \tag{6}$$
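As a minimal sketch of Eqs. (4) and (6), the following Python functions evaluate the hyperbolic binding curve for equivalent, independent sites. The function names and the numerical values are hypothetical and chosen only for illustration; concentrations and Kd are assumed to share the same (arbitrary) units.

```python
def fractional_saturation(free_ligand, kd):
    """Eq. (4): Y = [L] / (Kd + [L]) for equivalent, independent sites."""
    return free_ligand / (kd + free_ligand)


def ligands_bound(free_ligand, kd, n_sites=1):
    """Eq. (6): r = n[L] / (Kd + [L]), ligands bound per protein molecule."""
    return n_sites * free_ligand / (kd + free_ligand)


if __name__ == "__main__":
    kd = 10.0  # hypothetical dissociation constant
    for free in (1.0, 10.0, 100.0):
        r = ligands_bound(free, kd, n_sites=4)
        # r / n reproduces the fractional saturation of Eq. (5).
        print(free, round(r, 3), round(r / 4, 3))
```

When the free ligand concentration equals Kd, the sketch returns Y = 0.5 regardless of the number of sites, which is the defining property of the hyperbolic curve.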


This equation works well to define a system in which all of the binding sites are equivalent and independent of each other, and it produces a hyperbolic binding curve when r is plotted against [L].

Multiple Dependent Sites

What if there is more than one binding site for a ligand and these binding sites are not equivalent, that is, what if the affinity for the ligand is different for each site and the affinity of the other sites changes as each new ligand binds? This is the situation that one is presented with when dealing with cooperative processes, that is, processes in which the ligand-binding sites are not independent but instead interact with each other by virtue of a dynamic structural change in the protein during the successive binding of multiple ligands. A useful equation to deal with this situation was developed by Adair17 in 1925, when it was used to describe the binding of oxygen to hemoglobin. The Adair equation is derived in a manner similar to that above by considering each individual binding event

$$PL_{n-1} + L \rightarrow PL_n$$

For instance, the binding events for a protein with four ligand sites are

$$\begin{aligned}
P + L &\rightarrow PL \\
PL + L &\rightarrow PL_2 \\
PL_2 + L &\rightarrow PL_3 \\
PL_3 + L &\rightarrow PL_4
\end{aligned}$$

The dissociation constant for the binding of the first ligand is expressed by Eq. (2), but for binding of each subsequent ligand it is expressed by

$$K_{d,n} = \frac{[PL_{n-1}][L]}{[PL_n]} \tag{7}$$

The fractional saturation, Y, can then be expressed as above by

$$Y = \frac{[PL] + \cdots + n[PL_n]}{n\left([P] + \cdots + [PL_n]\right)} \tag{8}$$

Substituting the appropriate terms for [PLn] from Eqs. (2) and (7) and dividing both the numerator and denominator by [P] yields the Adair equation for n binding sites. For n = 1:

$$Y = \frac{[L]/K_1}{1 + [L]/K_1} \tag{9}$$

17 G. S. Adair, Proc. R. Soc. Lond. A 109, 292 (1925).


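For n = 2:

$$Y = \frac{\frac{[L]}{K_1} + \frac{2[L]^2}{K_1K_2}}{2\left(1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2}\right)} \tag{10}$$

For n = 3:

$$Y = \frac{\frac{[L]}{K_1} + \frac{2[L]^2}{K_1K_2} + \frac{3[L]^3}{K_1K_2K_3}}{3\left(1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3}\right)} \tag{11}$$

For n = 4:

$$Y = \frac{\frac{[L]}{K_1} + \frac{2[L]^2}{K_1K_2} + \frac{3[L]^3}{K_1K_2K_3} + \frac{4[L]^4}{K_1K_2K_3K_4}}{4\left(1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}\right)} \tag{12}$$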

where Y is the fractional occupancy, [L] is the free ligand concentration, and Ki are the stepwise Adair constants. Note that in the case of n = 1, Eq. (9) is equivalent to Eq. (4).

The same equation can also be derived in a more general way using a binding polynomial or binding partition function.18,19 The partition function represents the sum of the concentrations of all of the species present relative to a reference species, which is usually taken to be the unliganded protein. For PGDH, which has four binding sites, the partition function can be represented as

$$Q = \frac{[P] + [PL] + [PL_2] + [PL_3] + [PL_4]}{[P]} \tag{13}$$

First consider the various association constants that can be represented for each binding step,

$$K_1 = \frac{[PL]}{[P][L]} \tag{14}$$

$$K_2 = \frac{[PL_2]}{[PL][L]} \tag{15}$$

$$K_3 = \frac{[PL_3]}{[PL_2][L]} \tag{16}$$

$$K_4 = \frac{[PL_4]}{[PL_3][L]} \tag{17}$$

18 J. Wyman and S. J. Gill, "Binding and Linkage, Functional Chemistry of Biological Macromolecules." University Science Books, Mill Valley, CA, 1990.
19 M. L. Johnson and M. Straume, Methods Enzymol. 323, 155 (2000).


Solving for [PLn] and putting each equation in terms of just [P] and [L] gives

$$[PL] = K_1[P][L] \tag{18}$$

$$[PL_2] = K_2[PL][L] = K_1K_2[P][L]^2 \tag{19}$$

$$[PL_3] = K_3[PL_2][L] = K_1K_2K_3[P][L]^3 \tag{20}$$

$$[PL_4] = K_4[PL_3][L] = K_1K_2K_3K_4[P][L]^4 \tag{21}$$

Substituting into Q gives

$$Q = 1 + K_1[L] + K_1K_2[L]^2 + K_1K_2K_3[L]^3 + K_1K_2K_3K_4[L]^4 \tag{22}$$

The degree of binding, r, is found by taking the partial derivative of ln Q with respect to ln [L], or

$$r = \frac{\partial \ln Q}{\partial \ln [L]} = \frac{[L]}{Q}\,\frac{\partial Q}{\partial [L]} \tag{23}$$

which is

$$r = \frac{K_1[L] + 2K_1K_2[L]^2 + 3K_1K_2K_3[L]^3 + 4K_1K_2K_3K_4[L]^4}{1 + K_1[L] + K_1K_2[L]^2 + K_1K_2K_3[L]^3 + K_1K_2K_3K_4[L]^4} \tag{24}$$

And since Y = r/n, and n = 4,

$$Y = \frac{K_1[L] + 2K_1K_2[L]^2 + 3K_1K_2K_3[L]^3 + 4K_1K_2K_3K_4[L]^4}{4\left(1 + K_1[L] + K_1K_2[L]^2 + K_1K_2K_3[L]^3 + K_1K_2K_3K_4[L]^4\right)} \tag{25}$$

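Since the dissociation constant is the inverse of the association constant, Eq. (25) can be expressed in terms of the stepwise dissociation constants as

$$Y = \frac{\frac{[L]}{K_1} + \frac{2[L]^2}{K_1K_2} + \frac{3[L]^3}{K_1K_2K_3} + \frac{4[L]^4}{K_1K_2K_3K_4}}{4\left(1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}\right)} \tag{26}$$

which is the same as Eq. (12).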
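The equivalence of the two derivations can be checked numerically. The sketch below evaluates the Adair equation directly from stepwise dissociation constants and, separately, differentiates the binding polynomial of Eq. (22) as in Eq. (23) using a finite difference. The function names, the step size, and the example constants are assumptions made for illustration, not part of the original analysis.

```python
import math


def adair_saturation(free_ligand, kd_steps):
    """Eqs. (9)-(12): Y from stepwise Adair dissociation constants."""
    n = len(kd_steps)
    term = 1.0
    numerator = 0.0
    denominator = 1.0
    for i, kd in enumerate(kd_steps, start=1):
        term *= free_ligand / kd  # [L]^i / (K1 K2 ... Ki)
        numerator += i * term
        denominator += term
    return numerator / (n * denominator)


def binding_polynomial(free_ligand, ka_steps):
    """Eq. (22): Q = 1 + K1[L] + K1 K2 [L]^2 + ... (association constants)."""
    q = 1.0
    term = 1.0
    for ka in ka_steps:
        term *= ka * free_ligand
        q += term
    return q


def saturation_from_polynomial(free_ligand, ka_steps, rel_step=1e-6):
    """Eq. (23): r = d ln Q / d ln [L] by central difference; Y = r / n."""
    up = free_ligand * (1.0 + rel_step)
    down = free_ligand * (1.0 - rel_step)
    d_ln_q = math.log(binding_polynomial(up, ka_steps)) - math.log(
        binding_polynomial(down, ka_steps)
    )
    r = d_ln_q / (math.log(up) - math.log(down))
    return r / len(ka_steps)


if __name__ == "__main__":
    kd_steps = [100.0, 10.0, 1.0, 0.1]  # hypothetical Adair constants
    ka_steps = [1.0 / kd for kd in kd_steps]
    for free in (0.5, 3.0, 20.0):
        print(free, adair_saturation(free, kd_steps),
              saturation_from_polynomial(free, ka_steps))
```

The two columns printed for each ligand concentration should agree to within the accuracy of the finite-difference derivative.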


Note that the Adair constants are not equivalent to the intrinsic dissociation constant of each site. This is because as each site becomes occupied, there are fewer sites available for occupancy and more sites from which ligand can dissociate. In other words, for a protein with no bound ligand, there are four sites available for binding. Once the first site binds ligand, only one site can dissociate a ligand and there are now only three sites available for binding. When the second site is occupied, dissociation can occur from two sites but there are also only two sites available for more binding, and so on. Thus there is a statistical relationship between the Adair constant (Ki) and the intrinsic constant (Ki′) for each site.

For n = 2 sites:

$$K_1' = 2K_1; \quad K_2' = K_2/2 \tag{27}$$

For n = 3 sites:

$$K_1' = 3K_1; \quad K_2' = K_2; \quad K_3' = K_3/3 \tag{28}$$

For n = 4 sites:

$$K_1' = 4K_1; \quad K_2' = 3K_2/2; \quad K_3' = 2K_3/3; \quad K_4' = K_4/4 \tag{29}$$

The statistical parameters can be incorporated into the Adair equation to directly yield the intrinsic constants. For n = 2:

$$Y = \frac{\frac{2[L]}{K_1'} + \frac{2[L]^2}{K_1'K_2'}}{2\left(1 + \frac{2[L]}{K_1'} + \frac{[L]^2}{K_1'K_2'}\right)} \tag{30}$$

For n = 3:

$$Y = \frac{\frac{3[L]}{K_1'} + \frac{6[L]^2}{K_1'K_2'} + \frac{3[L]^3}{K_1'K_2'K_3'}}{3\left(1 + \frac{3[L]}{K_1'} + \frac{3[L]^2}{K_1'K_2'} + \frac{[L]^3}{K_1'K_2'K_3'}\right)} \tag{31}$$

For n = 4:

$$Y = \frac{\frac{4[L]}{K_1'} + \frac{12[L]^2}{K_1'K_2'} + \frac{12[L]^3}{K_1'K_2'K_3'} + \frac{4[L]^4}{K_1'K_2'K_3'K_4'}}{4\left(1 + \frac{4[L]}{K_1'} + \frac{6[L]^2}{K_1'K_2'} + \frac{4[L]^3}{K_1'K_2'K_3'} + \frac{[L]^4}{K_1'K_2'K_3'K_4'}\right)} \tag{32}$$
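The statistical factors of Eqs. (27)–(29) all follow the pattern (n − i + 1)/i for step i of n sites. This general form is not stated in the text but reproduces each listed factor. The sketch below applies it; the function names are hypothetical, and the example values are the Adair constants reported in Table I.

```python
def statistical_factor(step, n_sites):
    """Factor (n - i + 1) / i converting the Adair constant K_i to the intrinsic K_i'."""
    return (n_sites - step + 1) / step


def intrinsic_constants(adair_kd):
    """Apply the statistical factor to each stepwise Adair dissociation constant."""
    n = len(adair_kd)
    return [kd * statistical_factor(i, n) for i, kd in enumerate(adair_kd, start=1)]


if __name__ == "__main__":
    # Adair constants from Table I; the fourth is reported only as very large.
    adair = [7.8, 2.7, 42.1, 1e26]
    for i, k in enumerate(intrinsic_constants(adair), start=1):
        print(f"K{i}' = {k:.4g}")
```

For four sites the factors evaluate to 4/1, 3/2, 2/3, and 1/4, matching the statistical correction column of Table I.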


Thus, the Adair equation can be used to analyze systems in which cooperative behavior is occurring, since dissociation constants can be derived for each binding event by fitting the equation to a plot of Y versus [L]. Once the Adair constants are determined, the relative distribution of the bound species as a function of free ligand concentration can be plotted using expressions for each species derived from the Adair equation:

$$D_1 = \frac{\frac{[L]}{K_1}}{1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}} \tag{33}$$

$$D_2 = \frac{\frac{[L]^2}{K_1K_2}}{1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}} \tag{34}$$

$$D_3 = \frac{\frac{[L]^3}{K_1K_2K_3}}{1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}} \tag{35}$$

$$D_4 = \frac{\frac{[L]^4}{K_1K_2K_3K_4}}{1 + \frac{[L]}{K_1} + \frac{[L]^2}{K_1K_2} + \frac{[L]^3}{K_1K_2K_3} + \frac{[L]^4}{K_1K_2K_3K_4}} \tag{36}$$

where Di indicates the distribution of the species with i bound ligands. For molecules that bind fewer ligands, the equations can be adjusted by deleting the later terms in the denominator.

The change in free energy associated with ligand binding can be expressed as the free energy of dissociation and is given by

$$\Delta G_d = -RT \ln K_d \tag{37}$$

Then, the interaction free energy, which reflects the effect of binding of one ligand on another, is the difference in intrinsic free energy expressed as

$$\Delta G_{int} = RT \ln K_2' - RT \ln K_1' = RT \ln \left(K_2'/K_1'\right) \tag{38}$$
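The species distributions of Eqs. (33)–(36) and the interaction free energy of Eq. (38) can be sketched as follows. The function names are hypothetical, and the temperature of 298.15 K is an assumption, since the text does not state the temperature used. With the intrinsic constants of Table I, the sketch gives approximately −1.2 and +1.1 kcal/mol for the first two interaction free energies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal / (mol K)


def species_fractions(free_ligand, kd_steps):
    """Eqs. (33)-(36): fraction of protein carrying i = 1..n bound ligands."""
    terms = []
    term = 1.0
    for kd in kd_steps:
        term *= free_ligand / kd  # [L]^i / (K1 K2 ... Ki)
        terms.append(term)
    total = 1.0 + sum(terms)  # the leading 1 is the unliganded protein
    return [t / total for t in terms]


def interaction_free_energy(k_first, k_second, temperature=298.15):
    """Eq. (38): RT ln(K2'/K1') in kcal/mol, from intrinsic dissociation constants.

    The default temperature is an assumption made for this sketch.
    """
    return R_KCAL * temperature * math.log(k_second / k_first)


if __name__ == "__main__":
    print(species_fractions(10.0, [7.8, 2.7, 42.1, 1e26]))
    print(round(interaction_free_energy(31.2, 4.1), 1))
    print(round(interaction_free_energy(4.1, 28.1), 1))
```

A negative value indicates that the second ligand binds more tightly than the first, and a positive value indicates the reverse.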

Note that ΔGint is negative for a positively cooperative process.

Assessing the Cooperativity of Binding

A common perception in regard to positive cooperativity is that it produces a binding curve with sigmoidal character. This is generally true, but the sigmoidicity of the curve can be obscured in plots made over the whole range of binding. As the number of binding sites increases and the


interaction between sites becomes more complex, the idea that cooperativity can necessarily be detected by visual inspection of the Y versus [L] plot is less useful. Even if positive cooperativity is sustained from step to step, the effect cannot always be determined visually. Figure 2 shows theoretical

Fig. 2. Theoretical binding curves. Curves were generated from the Adair equation for four binding sites [Eqs. (12) and (26)] to demonstrate the effect of cooperative binding on the curve shape. Top left: Curves demonstrating increasing positive cooperativity where the dissociation constants differ 10-fold ( ) K1′ = 100, K2′ = 10, K3′ = 1, K4′ = 0.1; increasing negative cooperativity where the dissociation constants differ 10-fold (r) K1′ = 0.1, K2′ = 1, K3′ = 10, K4′ = 100; and no cooperativity (d) K1′ = 10, K2′ = 10, K3′ = 10, K4′ = 10. Bottom left: Expansion of the top left plot at low L-serine concentrations. Top right: Curves demonstrating increasing positive cooperativity where the dissociation constants differ 2-fold ( ) K1′ = 10, K2′ = 5, K3′ = 2.5, K4′ = 1.25; increasing negative cooperativity where the dissociation constants differ 2-fold (r) K1′ = 1.25, K2′ = 2.5, K3′ = 5, K4′ = 10; and no cooperativity (d) K1′ = 10, K2′ = 10, K3′ = 10, K4′ = 10. Bottom right: Expansion of the top right plot at low L-serine concentrations.


Adair plots for a protein with four dependent binding sites (like PGDH) when the dissociation constants for successive sites differ by either 2- or 10-fold. In no case is a sigmoidal curve shape very evident when the data are plotted over the whole range of ligand concentration. The insets show that at low ligand concentrations, sigmoidal character can be visually detected in the positively cooperative plots, but this does not occur for negative cooperativity. An example of l-serine binding to PGDH that shows both positive and negative cooperativity is given in Fig. 3. Table I shows the dissociation constants determined from fitting the data to the Adair equation for four binding sites as well as the interaction free energy calculated from those constants. Ideally, the values of the dissociation constants determined with the Adair equation will reveal if cooperativity is present. If subsequent dissociation constants decrease in magnitude, the binding is positively cooperative. If they increase in magnitude, the binding is negatively cooperative. The distribution of species calculated from the dissociation constants by Eqs. (33)–(36) is shown in Fig. 4. Note that due to the positive cooperativity of binding the second ligand, the D2 species rises very rapidly and closely follows the occurrence of the D1 species at low serine concentration. As the number of binding sites in a protein increases, the requirement to fit additional variables in the Adair equation can introduce a large degree of uncertainty, which manifests itself as relatively large error factors. Although the fit may look very good visually and the fitting statistics are acceptable, the error for each variable may be so large that they are not statistically different. An example of such a fit to the Adair equation for a mutant of PGDH is presented in Fig. 5. Note that the dissociation constants produced from this fit (see figure legend) are associated with very

TABLE I
Parameters Derived from Fitting Equilibrium Dialysis Binding Data to the Adair Equation for Four Sites

Adair constant (μM)    Statistical correction factor    Intrinsic dissociation constant (μM)    Interaction free energy (kcal/mol)
K1 = 7.8 ± 0.001       4/1                              K1′ = 31.2                              ΔGint (1→2) = −1.2
K2 = 2.7 ± 0.004       3/2                              K2′ = 4.1                               ΔGint (2→3) = +1.1
K3 = 42.1 ± 0.0002     2/3                              K3′ = 28.1                              ΔGint (3→4) = + very large
K4 = 10^26             1/4                              K4′ = very large
R = 0.99304


Fig. 3. Serine binding to PGDH. Analysis of serine binding was performed by equilibrium dialysis under conditions in which both positive and negative cooperativity are seen. The fractional occupancy of L-serine is plotted against the free L-serine concentration. The solid line is the fit of the data (d) to the Adair equation for four binding sites [Eq. (12)]. The intrinsic dissociation constants (μM) determined from the fit are given in Table I.

large errors. In this case, there are some methods of plotting the data that may help with an assessment of cooperativity. These include the Hill plot and the Scatchard plot.

The Hill Equation and the Hill Plot

Hill20 derived an equation in 1910 to describe the binding of multiple ligands by assuming a system in which as soon as the first ligand binds all subsequent ligands bind immediately. This can be depicted as

$$P + nL \rightarrow PL_n$$

The dissociation constant is then defined as

$$K_d = \frac{[P][L]^n}{[PL_n]} \tag{39}$$

and the fractional saturation as

$$Y = \frac{[L]^n}{K_d + [L]^n} \tag{40}$$

20 A. V. Hill, J. Physiol. 40, iv (1910).

Fig. 4. Fractional distribution of bound species. The distribution of PGDH tetramers with 1 (d), 2 ( ), 3 (r), and 4 (m) bound L-serine molecules was generated from the data determined in Fig. 3 using Eqs. (33)–(36).

In addition, the ligand concentration at one-half maximal binding, K0.5, is equivalent to the nth root of the dissociation constant, or Kd = (K0.5)^n, and thus

$$Y = \frac{[L]^n}{(K_{0.5})^n + [L]^n} \tag{41}$$

Since a binding phenomenon that meets Hill's criteria is highly unlikely, except in the case where n = 1, the Hill equation is not as useful as the Adair equation for describing multiple binding phenomena. However, Eq. (40) can be rearranged to

$$\log\left[\frac{Y}{1 - Y}\right] = n \log [L] - \log K_d \tag{42}$$
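As a minimal sketch of Eqs. (41) and (42), the following code evaluates the Hill equation and confirms numerically that the Hill transformation of an ideal Hill curve is a straight line of slope n. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def hill_saturation(free_ligand, k_half, n_hill):
    """Eq. (41): Y = [L]^n / (K0.5^n + [L]^n)."""
    return free_ligand ** n_hill / (k_half ** n_hill + free_ligand ** n_hill)


def hill_transform(y):
    """Left side of Eq. (42): log10[Y / (1 - Y)]."""
    return math.log10(y / (1.0 - y))


def hill_slope(free_low, free_high, k_half, n_hill):
    """Slope of the Hill plot between two free-ligand concentrations."""
    rise = hill_transform(hill_saturation(free_high, k_half, n_hill)) - hill_transform(
        hill_saturation(free_low, k_half, n_hill)
    )
    return rise / (math.log10(free_high) - math.log10(free_low))


if __name__ == "__main__":
    # Hypothetical K0.5 and Hill coefficient, arbitrary concentration units.
    print(hill_slope(5.0, 20.0, k_half=10.0, n_hill=2.0))
```

For real cooperative data the same slope calculation, applied between neighboring points, gives a local estimate of the Hill coefficient that varies along the curve.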

A plot of log[Y/(1 − Y)] versus log[L] gives information about site–site interaction that can be useful as an aid to the Adair equation. The theory and derivation are too complex to go into here, but a good treatment can be found in Wyman and Gill.18 If there is no cooperativity, this plot will produce a straight line whose slope is equal to n, and the intercept on the ordinate will yield the Kd. However, the presence of cooperativity produces a plot that deviates from linearity. Figure 6 is a Hill plot in the form


Fig. 5. Serine binding to a mutant of PGDH. Analysis of serine binding was performed by equilibrium dialysis. The fractional occupancy of L-serine is plotted against the free L-serine concentration. The solid line is the fit of the data (d) to the Adair equation for four binding sites [Eq. (12)]. The intrinsic dissociation constants (μM) determined from the fit are K1′ = 4206 ± 23764, K2′ = 2.1 ± 12.4, K3′ = 40.4 ± 22.6, K4′ = 13.9 ± 4.5. The correlation coefficient for the fit = 0.99832.

of Eq. (42) produced from the L-serine binding data in Fig. 5. In the Hill plot, the positive cooperativity is manifest in an upward change of the slope of the data (negative cooperativity would produce a downward slope). The slope at any particular point on the curve yields the Hill coefficient at that point, which can be determined by calculating the derivative of the curve at that point. Thus the Hill plot depicts the cooperativity directly. In the limiting regions, as Y → 0 and Y → 1, the Hill plot will have linear asymptotes with a slope equal to 1, as shown by the solid lines in Fig. 6. This is because at these extremes there is no cooperativity. At very low concentrations the second ligand has not yet bound and at very high concentrations all ligands are bound. The intercepts of these asymptotes on the ordinate will yield estimates for the first and last intrinsic dissociation constant. So, from this figure, K1′ = 525 and K4′ = 12.6. These can then be used in the Adair equation to constrain the fit so that K2′ and K3′ can be determined with reasonable error values (see figure legend). The overall value of the Hill coefficient is usually determined from the slope


Fig. 6. Hill plot. The data from Fig. 5 are plotted as log[Y/(1 − Y)] versus the log of the free L-serine concentration according to Eq. (42) to produce a Hill plot showing positive cooperativity. The solid lines are asymptotes to the data (d) with slopes = 1. Extrapolation of these asymptotes to the ordinate yields estimates of the first and last intrinsic dissociation constant. When these estimates are introduced into the fit for Fig. 5, the dissociation constants are K1′ = 525, K2′ = 19.3 ± 2.1, K3′ = 34.7 ± 3.7, K4′ = 12.6. The slope of the data at the point of half saturation [log Y/(1 − Y) = 0] yields the Hill coefficient.

(derivative) of the line at half saturation, i.e., where log[Y/(1 − Y)] = 0. For the plot in Fig. 6, n = 2.1. Note that the measurements that define the limiting asymptotes are made at very low and very high saturations and are often difficult to make accurately.

The Scatchard Equation and the Scatchard Plot

Introduced by Scatchard in 1949,21 a Scatchard plot can be produced by rearrangement of Eq. (6) to give

$$\frac{r}{[L]} = \frac{n}{K_d} - \frac{r}{K_d} \tag{43}$$

The Scatchard equation is in the form of that for a straight line. So when r/[L] is plotted versus r (with r plotted on the abscissa), the intercept on

21 G. Scatchard, Ann. NY Acad. Sci. 51, 660 (1949).


the abscissa gives n and the slope gives −1/Kd. Note here that n refers to the number of sites and not the Hill coefficient. However, this linear analysis is applicable only to systems with n independent binding sites. When the sites are not independent or there is more than one class of site, the plot will deviate from linearity. Typically, a plot that is concave downward indicates positive cooperativity and a plot that is concave upward indicates either negative cooperativity or more than one class of site with different dissociation constants. A Scatchard plot for the data from Fig. 3 is shown in Fig. 7. The downward concavity indicates positive cooperativity, and the observation that the plot approaches the abscissa at a value less than the number of binding sites on the enzyme (i.e., 4) suggests negative cooperativity. There is no general formula that can be fit to curved Scatchard data that will yield values for n and Kd or that will produce a quantitative assessment of the curvature (cooperativity). A general polynomial expression can be used to fit a line to the Scatchard plot for the purpose of visual presentation. However, none of the parameters from such a fit is useful for the determination of Kd or n.

Fig. 7. Scatchard plot. The data from Fig. 3 are plotted as r/[L] versus r according to Eq. (43). The solid line is the fit to the data (d) using a general polynomial expression.
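For the linear case of independent sites, the relationship in Eq. (43) can be illustrated with a short sketch that generates Scatchard coordinates from Eq. (6) and recovers Kd and n from the straight line through two points. The function names and the values of Kd and n are hypothetical; the sketch does not apply to curved (cooperative) Scatchard data.

```python
def scatchard_point(free_ligand, kd, n_sites):
    """Return (r, r/[L]) for n equivalent, independent sites, Eqs. (6) and (43)."""
    r = n_sites * free_ligand / (kd + free_ligand)
    return r, r / free_ligand


def line_through(p1, p2):
    """Slope and x-intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    x_intercept = x1 - y1 / slope
    return slope, x_intercept


if __name__ == "__main__":
    # Hypothetical system: four independent sites with Kd = 10 (arbitrary units).
    low = scatchard_point(2.0, kd=10.0, n_sites=4)
    high = scatchard_point(50.0, kd=10.0, n_sites=4)
    slope, x_intercept = line_through(low, high)
    print("Kd =", -1.0 / slope, "n =", x_intercept)
```

The slope of the line equals −1/Kd and the intercept on the abscissa equals n, so both parameters are read directly from the plot in this ideal case.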


Characterizing Effector Inhibition of Catalytic Activity

A counterpart of the Hill equation for evaluating kinetic phenomena has also been developed,

$$\frac{v}{V_{max}} = \frac{[L_t]^n}{K_d + [L_t]^n} \tag{44}$$

where v is the velocity at a particular ligand concentration and Vmax is the maximum velocity. Note that here the ligand concentration is represented by the total ligand concentration [Lt]. It is not possible to determine the free ligand concentration in a kinetic experiment. However, since ligand is usually present in much greater concentrations than the enzyme, the total ligand concentration approximates the free ligand concentration. Also, the term S0.5 can be defined similarly to Eq. (41) by the relationship Kd = (S0.5)^n and is the ligand or substrate concentration at which the velocity is one-half the maximum velocity. Substitution into Eq. (44) gives the relationship

$$\frac{v}{V_{max}} = \frac{[L_t]^n}{(S_{0.5})^n + [L_t]^n} \tag{45}$$

Similarly, for an inhibitory process the following relationship is derived

$$\frac{I}{I_{max}} = \frac{[L_t]^n}{K_d + [L_t]^n} = \frac{[L_t]^n}{(I_{0.5})^n + [L_t]^n} \tag{46}$$

where I0.5 is the inhibitor concentration at one-half the maximal inhibition. Thus, fitting a plot of I versus ligand concentration [Lt] to Eq. (46) yields values for n and I0.5. Keep in mind that n in this case refers to the Hill coefficient and not the number of binding sites. The Hill coefficient is sometimes wrongly interpreted as the number of binding sites in the system, but, as Weiss22 pointed out, this is necessarily true only in cases in which extreme cooperativity exists. In theory, the Hill coefficient should not exceed the total number of ligand binding sites, but except in extreme cases it really provides only an estimate of the extent of cooperativity or "degree of interaction" that exists in the system. The Hill coefficient can be a nonintegral number or can be less than one in cases of negative cooperativity, even though there are no fractional binding sites nor can there be less than one binding site. Thus, the Hill coefficient is useful in providing an estimate of the relative degree of cooperativity exhibited but should not be equated with the number of binding sites. Note also that the term for Kd in the Hill equation is a macroscopic constant that can be composed of multiple intrinsic constants. Thus, it does not describe sequential binding

22 J. N. Weiss, FASEB J. 11, 835 (1997).


phenomena very well unless each intrinsic constant is the same, such as for equivalent independent sites. Figure 8 shows how the shape of the inhibition curve will vary as a function of the value of the Hill coefficient. A change in the I0.5 value has the effect of shifting the curves to the right or left. Figure 9 shows an L-serine inhibition curve for PGDH. The parameters derived by fitting the data to Eq. (46) are given in the legend. Note that although PGDH contains four binding sites for L-serine, the Hill coefficient is approximately 2. One might be tempted to conclude that this suggests that the binding of only two L-serine molecules is sufficient to produce maximal inhibition in PGDH. Although this turns out to be correct in this case, mutant forms of PGDH have been produced that also require only two L-serine molecules to produce maximal inhibition but have a Hill coefficient of less than 2. Note also that the Hill coefficient for both curves in Fig. 9 is the same, but the curves are shifted, reflecting a difference in I0.5.

Fig. 8. Theoretical serine inhibition plots. The fractional inhibition by L-serine is plotted versus the L-serine concentration according to Eq. (46) where the Hill coefficient is n = 1 (d), 2 ( ), 3 (r), and 4 (m). I0.5 = 10 in all cases.


Fig. 9. Experimental serine inhibition plots. The fractional inhibition by L-serine is plotted versus the L-serine concentration according to Eq. (46) for native PGDH (d, n = 2.03 ± 0.07, I0.5 = 9.2 ± 0.1) and a mutant of PGDH (&, n = 2.02 ± 0.07, I0.5 = 11.2 ± 0.1).
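The dependence of the inhibition curve on the Hill coefficient and on I0.5 can be explored with a short sketch of Eq. (46). The function names are hypothetical. The inverse function is simply Eq. (46) solved for [Lt], and the example values of n and I0.5 are those quoted for native PGDH in the legend to Fig. 9.

```python
def fractional_inhibition(total_ligand, i_half, n_hill):
    """Eq. (46): I/Imax = [Lt]^n / (I0.5^n + [Lt]^n)."""
    return total_ligand ** n_hill / (i_half ** n_hill + total_ligand ** n_hill)


def ligand_for_inhibition(fraction, i_half, n_hill):
    """Eq. (46) solved for [Lt]: I0.5 * (f / (1 - f))^(1/n)."""
    return i_half * (fraction / (1.0 - fraction)) ** (1.0 / n_hill)


if __name__ == "__main__":
    n_hill, i_half = 2.03, 9.2  # values from the Fig. 9 legend (native PGDH)
    for ligand in (2.0, 9.2, 40.0):
        print(ligand, round(fractional_inhibition(ligand, i_half, n_hill), 3))
    # Ratio of ligand concentrations giving 90% and 10% inhibition;
    # a steeper curve (larger n) gives a smaller ratio.
    span = ligand_for_inhibition(0.9, i_half, n_hill) / ligand_for_inhibition(
        0.1, i_half, n_hill
    )
    print(round(span, 2))
```

Changing I0.5 shifts the whole curve along the concentration axis without altering this ratio, whereas changing n alters the ratio without moving the midpoint.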

Measuring Ligand Binding

Equilibrium Dialysis

As can be seen from the derivation of the Adair equation, two parameters are needed to determine the dissociation constants: the fractional saturation (Y) and the free ligand concentration [L]. With a knowledge of the protein concentration in the system, Y is determined if one is able to distinguish between the concentration of bound and free ligand. Equilibrium dialysis is a very straightforward method for measuring these concentrations and one that has routinely been used for PGDH. It relies on the ability to accurately measure the concentration of ligand present. In the case of PGDH, radiolabeled L-serine can be obtained and used for this purpose as a tracer. The bulk of ligand present is unlabeled, but it is assumed or determined that labeled ligand binds to the protein with the same affinity. Thus, the radiolabeled ligand can be present in very small (i.e., tracer) amounts relative to unlabeled ligand. Equilibrium dialysis has been described before,23 but essentially it consists of two compartments that are separated by a membrane that is impermeable to protein but allows the


ligand to pass freely. In this way, at equilibrium, the concentration of free ligand can be measured from one compartment and the concentration of free plus bound ligand can be measured from the other compartment that contains the protein. The experiment is initiated by placing a solution of protein in one chamber and an initial concentration of ligand [Li] in the other. The ligand then diffuses across the membrane and a portion of it binds to the protein. At equilibrium, the chamber that originally contained the protein contains protein with ligand bound (PL) and free ligand (L), while the other chamber contains only free ligand. Since all the ligand started on one side of the membrane, when the ligand equilibrates between the two chambers, it effectively becomes diluted. If there were no protein present and the chambers were of equal volume, as they usually are, the dilution would be a factor of 2 and

$$[L_i] = 2[L] \tag{47}$$

However, when protein is present, the initial concentration of ligand is equal to twice the concentration of free ligand plus the concentration of bound ligand,

$$[L_i] = [PL] + 2[L] \tag{48}$$

and thus bound ligand is the initial ligand concentration minus twice the concentration of free ligand,

$$[PL] = [L_i] - 2[L] \tag{49}$$

If radioisotopic labeling of the ligand is used to measure the distribution of ligand across the membrane, the bound and free ligand concentrations can be determined from the counts on each side of the membrane

$$[PL] = \frac{\mathrm{cpm}_{prot} - \mathrm{cpm}_{buffer}}{\mathrm{cpm}_{total}}\,[L_i] \tag{50}$$

and

$$[L] = \frac{\mathrm{cpm}_{buffer}}{\mathrm{cpm}_{total}}\,[L_i] \tag{51}$$

where the subscripts "prot," "buffer," and "total" refer to the protein chamber, the initial ligand chamber, and their sum, respectively. All that remains is to determine the concentration of protein in the protein chamber, which can be done by a variety of methods such as amino acid analysis.
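The arithmetic of Eqs. (48)–(51) can be sketched as below. The function name and the counts are hypothetical; the sketch assumes chambers of equal volume and equal sample volumes counted from each chamber, as in the derivation above.

```python
def dialysis_concentrations(cpm_protein, cpm_buffer, initial_ligand):
    """Eqs. (50) and (51): bound and free ligand from the counts in each chamber.

    Assumes equal chamber volumes and equal sample volumes counted.
    """
    cpm_total = cpm_protein + cpm_buffer
    bound = (cpm_protein - cpm_buffer) / cpm_total * initial_ligand
    free = cpm_buffer / cpm_total * initial_ligand
    return bound, free


if __name__ == "__main__":
    # Hypothetical counts and initial ligand concentration (arbitrary units).
    bound, free = dialysis_concentrations(6000.0, 4000.0, initial_ligand=20.0)
    print("bound =", bound, "free =", free)
    # Consistency check against Eq. (48): [Li] = [PL] + 2[L].
    print("recovered [Li] =", bound + 2.0 * free)
```

Dividing the bound concentration by the protein concentration in the protein chamber then gives the number of ligands bound per protein molecule.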

23 D. J. Winzor and W. H. Sawyer, "Quantitative Characterization of Ligand Binding." Wiley-Liss, New York, 1995.


These values can then be used to calculate r, which is defined as the moles of ligand bound per mole of protein. Thus,

$$r = \frac{[PL]}{[P]} \tag{52}$$

The fractional occupancy (Y) is then calculated according to Eq. (5). A plot of Y versus the free ligand concentration can then be fit with the Adair equation [Eqs. (9)–(12)] to yield the stepwise intrinsic dissociation constants as discussed above.

Quantitating Protein for Binding Experiments

Determination of the stoichiometry of binding is completely dependent on the determination of the protein concentration used in the experiment. If great care is not taken in this regard, equivocal or incorrect conclusions may be drawn from the data. Colorimetric determination of protein concentration can vary widely due to interfering substances during color generation or the use of incorrect standards. Determination of protein by light absorbance can be affected by the use of an incorrect extinction coefficient or absorbing materials in the solution. Amino acid analysis by quantitative column chromatography is generally considered to be the most accurate, but even it has been found to have an error of 10–20% in routine use.24 An error of as much as 20% in the protein level can have a significant influence on the conclusions made from a binding experiment, since that represents a potential change in the measured binding stoichiometry by almost one complete ligand in a protein with four ligand-binding sites.

Stoichiometric Binding

The determination of dissociation constants using equilibrium dialysis requires that the protein and ligand concentrations be in the same range as the dissociation constants. If the dissociation constants for a particular ligand are very small or very large (very tight or very weak binding), this condition cannot always be practically met. For instance, the binding of the cofactor, NADH, to PGDH occurs with such a high affinity that it cannot be measured by equilibrium dialysis. The dissociation constant for NADH has been estimated to be in the low nanomolar range. If low nanomolar concentrations of protein were used, the total binding of NADH would be so low that the bound concentration could not be distinguished from the free concentration. Conversely, under conditions in which sufficient protein is employed, the binding of NADH becomes stoichiometric. That is, at concentrations of ligand below that of protein, all ligand binds to the protein and no free ligand is left in solution. Operationally, stoichiometric binding starts occurring when the concentration of protein is at least 10 times greater than the dissociation constant for the ligand.

When NADH binds to PGDH, fluorescence resonance energy transfer occurs between the single tryptophan residue in the protein subunit and the bound NADH molecule. This provides a signal that can be monitored during the binding of NADH. When a solution of NADH and PGDH is excited at 295 nm, the tryptophan emits light at 340–360 nm that then excites bound NADH, which in turn emits light at 420 nm. Although NADH in solution will also emit some light at 420 nm due to the excitation at 295 nm, that due to bound NADH can be distinguished by the difference in slope of the change in fluorescence at 420 nm as a function of NADH concentration. Figure 10 illustrates this property. The initial slope is due to bound NADH and the latter slope is due to just free NADH after the enzyme becomes saturated with bound NADH. Dissociation constants cannot be determined directly from this type of analysis, but extrapolation of the point where these two slopes cross to the abscissa yields the moles of NADH bound per mole of enzyme. Note that in this type of analysis, which employs a continuous titration of a sample, the fluorescence and the component concentrations must be corrected for dilution.

Fig. 10. Stoichiometric binding. The fluorescence signal produced when PGDH is titrated with NADH (F  Fcorr) is plotted versus the ratio of NADH to PGDH. Extrapolation of the intersection of the two slopes to the abscissa yields r, the moles of NADH bound per mole of PGDH tetramer.

24 K. U. Yuksel, T. T. Andersen, I. Apostol, J. W. Fox, R. J. Paxton, and D. J. Strydom, in “Techniques in Protein Chemistry VI” (J. W. Crabb, ed.), p. 185. Academic Press, New York, 1995.

Linkage Analysis

Site-directed mutagenesis has been extensively used to study the structure–function relationships in the regulation of PGDH. When a single modification is made, if that modification affects a specific functional process of the enzyme, the change in that functional process due to the modification can be measured. The power of the approach is maximized when the system property is chosen to reflect the energy states of the molecule as a whole and is directly related to its biological function, such as the Gibbs free energy.25 Each individual state of the enzyme can be characterized by the relationship

$$\Delta G^{0} = RT \ln(K) \tag{53}$$

and the change from one state to another can be expressed in terms of a change in Gibbs free energy, ΔG. If two individual mutations, as well as the double mutation, are made, then the process can be depicted as a thermodynamic cycle.

25 G. K. Ackers and F. R. Smith, Annu. Rev. Biochem. 54, 597 (1985).

In this cycle, “N” denotes a residue position that is not mutated and “M” denotes a residue position that is mutated. In this scheme, ΔG12 is the functional perturbation arising from modification at site 2 when site 1 has already been modified. Likewise, ΔG21 is the perturbation of modification at site 1 when site 2 has been previously modified. The change in free energy is calculated from values that can be related to the free energy by the equation

$$\Delta G = RT \ln\left(K^{m}/K^{n}\right) \tag{54}$$

where the superscripts m and n indicate the value after and before mutation, respectively. The coupling energy, ΔGCUPL, is defined in terms of the respective ΔGs as follows:

$$\Delta G_{CUPL} = \Delta G_{12} - \Delta G_{1} = \Delta G_{21} - \Delta G_{2} \tag{55}$$

If the two positions do not interact, the differences in ΔG will be zero. On the other hand, if they do interact, the differences in ΔG will be nonzero and ΔGCUPL will have a nonzero value. The strategy employed by Fersht26 is based on the relationship between the overall rate constant, kcat/Km, and the free energy of binding (ΔGs) between enzyme and activated substrate so that

$$\Delta G = RT \ln\left[\frac{(k_{cat}/K_m)^{m}}{(k_{cat}/K_m)^{n}}\right] \tag{56}$$

where the superscripts m and n indicate the enzyme after and before modification, respectively. In addition, effector binding can be evaluated with either the individual dissociation constants (Kd) or the I0.5, which can be viewed as a global approximation of the overall macroscopic binding of serine based on the effector concentration producing 50% inhibition. In this case, an analogous relationship to the one above can be derived:

$$\Delta G = RT \ln\left[\frac{(I_{0.5})^{m}}{(I_{0.5})^{n}}\right] \tag{57}$$
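The bookkeeping of a double-mutant cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sign convention follows Eqs. (54) and (57) as they appear in this text, the temperature is assumed to be 25°, the mapping of the four corner states (NN, MN, NM, MM) onto the chapter's subscripts is an assumption because the cycle diagram is not reproduced here, and all numerical values are invented.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g(value_after, value_before, temp_k=298.15):
    """RT ln(after/before) in kcal/mol, for K, kcat/Km, or I0.5 values."""
    return R_KCAL * temp_k * math.log(value_after / value_before)


def coupling_energy(k_nn, k_mn, k_nm, k_mm, temp_k=298.15):
    """Coupling energy of a double-mutant cycle (assumed corner labels).

    k_nn: neither site mutated; k_mn: site 1 mutated only;
    k_nm: site 2 mutated only; k_mm: both sites mutated.
    """
    # Effect of mutating site 2 in the unmodified background.
    dg_site2_alone = delta_g(k_nm, k_nn, temp_k)
    # Effect of mutating site 2 when site 1 is already mutated.
    dg_site2_after_site1 = delta_g(k_mm, k_mn, temp_k)
    return dg_site2_after_site1 - dg_site2_alone


# Independent sites: the two single-mutant effects simply multiply.
print(coupling_energy(1.0, 2.0, 3.0, 6.0))
# Interacting sites: the double mutant deviates from the product.
print(coupling_energy(1.0, 2.0, 3.0, 12.0))
```

The same value is obtained if the cycle is traversed in the other order (site 1 mutated in the presence and absence of the site 2 mutation), which is the equality expressed in Eq. (55).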

Table II shows results that have been obtained with double mutants of PGDH. G294V and G295V are both at the active site hinge and G336V and G337V are both at the regulatory domain hinge. The analysis shows that the double mutant at the active site hinge is not linked with respect to serine inhibition but is strongly linked with respect to catalytic activity. Moreover, the opposite is true for the double mutant at the regulatory hinge. However, the double mutant consisting of the critical residue from both hinges shows that they are linked with respect to both activities. In a manner similar to that for ΔGint discussed previously, a negative sign for ΔG indicates a more positive effect. The opposite sign of the values with respect to serine inhibition for G294V, G336V and G336V, G337V indicates that the linkage acts in opposing directions. For G336V, G337V the linkage decreases the sensitivity to serine, but for G294V, G336V the linkage actually increases serine sensitivity. The second mutation has a countereffect to the first and demonstrates a positive interplay between the two hinge regions. In other words, a mutation at the second hinge can partially overcome the effect of a mutation at the first hinge.

TABLE II
Thermodynamic Linkage Analysis for Selected Double Mutants

                     ΔGcupl (kcal mol⁻¹)
Mutant               kcat/Km      IC50
G294V, G295V         +3.4         0
G294V, G336V         +0.8         −0.6
G336V, G337V         +0.1         +0.8

26 A. R. Fersht, “Enzyme Structure and Mechanism.” W. H. Freeman and Co., San Francisco, 1985.

Conclusion

The goal of this chapter was to present the ensemble of methods used to study cooperativity in PGDH. As mentioned in the introduction, the methods described here are not new or unique to the study of PGDH. However, the chapter brings together, in a single treatment, a number of different analytical approaches that can be used to study cooperative processes in general. The chapter was written with more of a practical approach in mind and, as a result, much of the statistical thermodynamics underlying the methodology has not been discussed. A discussion on the derivation of complex ligand-binding formulas can be found in Johnson and Straume,19 Wyman and Gill,18 and Winzor and Sawyer.23

Acknowledgments

The author acknowledges the support of the National Institutes of Health (GM-56676) and the assistance of Xiao Lan Xu, Zhiqin Hu, and the staff of the Protein and Nucleic Acid Chemistry Laboratory at Washington University School of Medicine in these investigations.


[6] Fluorescent Probes Applied to Catalytic Cooperativity in ATP Synthase

By Joachim Weber and Alan E. Senior

Fluorescence spectroscopy is a very sensitive, rapid, and convenient method to study environmental changes in a protein. Provided that suitable probes can be found or developed, this makes it an ideal tool to investigate the interaction of the protein with a ligand by a true equilibrium method. Knowledge of thermodynamic binding parameters, i.e., stoichiometries and affinities, for substrate(s) and product(s) is essential for understanding the mechanism of an enzyme. When combined with site-directed mutagenesis, this technique can add a frequently missing functional dimension to high-resolution structural models. Here we will describe the design and application of fluorescence probes to monitor nucleotide binding in adenosine triphosphate (ATP) synthase.

Introduction and Background

ATP synthase catalyzes the final step of oxidative or photophosphorylation, the synthesis of ATP from adenosine diphosphate (ADP) and Pi.1–5 Proton translocation, down the electrochemical gradient, through the membrane-embedded F0 subcomplex supplies the energy for ATP synthesis on the peripheral F1 subcomplex. In bacteria, under certain physiological conditions, the enzyme runs in reverse, hydrolyzing ATP to generate a transmembrane proton gradient. F0 has the subunit composition ab2cn and F1 consists of subunits α3β3γδε. F1 contains six nucleotide-binding sites. Three of these sites, located on the three β-subunits at the interface to the adjacent α-subunit, participate in catalysis (“catalytic sites”). The remaining three sites, located mainly on the α-subunits, have no known physiological function (“noncatalytic sites”). F1 can be easily detached from the membrane and is an active ATPase (“F1-ATPase”). The holoenzyme is also referred to as “F1F0-ATP synthase” or “F1F0.”

1 J. Weber and A. E. Senior, FEBS Lett. 545, 61 (2003).
2 R. A. Capaldi and R. Aggeler, Trends Biochem. Sci. 27, 154 (2002).
3 R. I. Menz, J. E. Walker, and A. G. W. Leslie, Cell 106, 331 (2001).
4 H. Noji and M. Yoshida, J. Biol. Chem. 276, 1665 (2001).
5 R. K. Nakamoto, C. J. Ketchum, and M. K. Al-Shawi, Annu. Rev. Biophys. Biomol. Struct. 28, 205 (1999).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


Although no consensus mechanism for catalysis by ATP synthase has yet emerged, many current models share common features based on Boyer’s “binding change” principles.6 According to the models, the three catalytic sites have different substrate-binding affinities (termed here “high,” “medium,” and “low affinity,” or “site 1,” “2,” and “3”) at any given time during steady-state catalysis, but they switch their affinities in a synchronized manner at one step of the catalytic cycle. This “binding change” step is coupled to proton translocation through F0 via subunit rotation. It is believed that catalysis occurs on the high-affinity site, sequestered from the medium. However, if only the high-affinity site is filled with substrate, catalysis and, especially, product release are extremely slow; this “unisite catalysis” can occur in the absence of rotation. All three catalytic sites must be occupied in the physiologically relevant working mode, featuring rapid catalysis rates and subunit rotation (“trisite catalysis”).

Progress made in our understanding of the catalytic mechanism of ATP synthase during the last decade came mainly from three experimental approaches: (1) high-resolution structural analysis, mostly by X-ray crystallography, showing details of the enzyme with different conformations of the catalytic sites;3,7 (2) the demonstration that ATP hydrolysis on the catalytic sites actually drives rotation of a “rotor” assembly consisting of γ, ε, and c subunits;8–10 and (3) the topic of this chapter, the development of fluorescence probes to measure binding of substrates and products to the catalytic sites and site occupancy during steady-state catalysis, on a real-time basis. In this chapter, we will first briefly describe some general strategies for the selection of probes to characterize ligand binding in a protein and, specifically, in the ATP synthase of Escherichia coli.
Then we will illustrate application of one of these probes in more detail and discuss the contribution of experimental results to our knowledge of the enzymatic mechanism. Specifically, we will discuss the unique type of cooperativity in this enzyme, and the information about structure and function of the catalytic sites obtained by determination of binding affinities for substrates and products in normal and mutant F1.

6 P. D. Boyer, in “Membrane Bioenergetics” (C. P. Lee, G. Schatz, and L. Ernster, eds.), p. 461. Addison-Wesley, Reading, MA, 1979.
7 J. P. Abrahams, A. G. W. Leslie, R. Lutter, and J. E. Walker, Nature 370, 621 (1994).
8 H. Noji, R. Yasuda, M. Yoshida, and K. Kinosita, Jr., Nature 386, 299 (1997).
9 Y. Kato-Yamada, H. Noji, R. Yasuda, K. Kinosita, Jr., and M. Yoshida, J. Biol. Chem. 273, 19375 (1998).
10 Y. Sambongi, Y. Iko, M. Tanabe, H. Omote, A. Iwamoto-Kihara, I. Ueda, T. Yanagida, Y. Wada, and M. Futai, Science 286, 1722 (1999).


Selection of Probes and Insertion Sites*

When selecting a fluorescent probe to follow ligand binding to a protein, the first choice that has to be made is between an intrinsic probe (an aromatic amino acid) and an extrinsic probe (a fluorophor covalently attached to an amino acid side chain). In most cases, this means that the initial question is whether to insert a Trp as intrinsic probe or a Cys as attachment point for extrinsic probes. Phe, Tyr, and Trp are all fluorescent, but Trp is the preferred intrinsic probe due to its higher extinction coefficient, the fact that it can be excited selectively at λexc ≥ 295 nm, the environmental sensitivity of its fluorescence, and its low natural abundance (1–1.5% of all amino acid residues in proteins).11,12 Amines, in Lys or at the N-terminus, and thiols, in Cys, have the highest reactivity for covalent attachment of extrinsic probes. Here again the low natural abundance of Cys (1–1.5%)11,12 enables greater specificity. Advantages of Trp over Cys-bound fluorophors are relative ease of insertion, since no subsequent labeling reaction is required, absence of concerns about labeling stoichiometry, and the relatively small size (the smallest Cys-probe combinations are at least as bulky as a Trp residue), which makes it more likely that the substitution is tolerated by the enzyme.

One argument against the use of Trp is its low experimental sensitivity. The extinction coefficient (ε) is 5600 M⁻¹ cm⁻¹ at 280 nm; at 295 nm, frequently used to avoid excitation of Tyr residues, it is less than 2000 M⁻¹ cm⁻¹. The quantum yield (Φ) can vary between 0 and 0.35 and is often around 0.1.13 Covalently attached probes, on the other hand, can have extinction coefficients of 100,000 M⁻¹ cm⁻¹ or more, and quantum yields of close to 1.13,14 In addition, excitation and emission spectra of extrinsic probes are generally located at higher wavelengths than those of Trp, making interference due to contaminating fluorophors and light scattering in turbid samples less problematic, which is important when working with membrane proteins. On the other hand, in the specific case of membrane proteins, many extrinsic probes are sufficiently nonpolar that they partition into the phospholipid/detergent micelle phase around the protein, thus increasing background fluorescence.

An advantage of extrinsic probes is versatility; once a Cys has been inserted and the protein purified, a number of different fluorophors can be covalently attached and tested. Another point that should be considered is whether it is necessary to first replace all natural Trp or Cys residues to reduce the background signal. In the F1 portion of the ATP synthase of E. coli, both types of residues could be replaced without impairment of function.15,16 However, certain Trp residues in subunit a of the F0 membrane sector could not be replaced,17 probably because they are required for properly anchoring transmembrane helices. Trp probes turned out to be superior for monitoring nucleotide binding in F1. Of 10 Trp residues that we inserted within 12 Å of the catalytic site, nine showed a change in their steady-state fluorescence upon nucleotide binding.18,19 In contrast, when we tested 45 Cys-probe combinations (five Cys, each reacted with nine different thiol-reactive probes), only four gave a usable response.20 Finally, it should be mentioned that probes with improved fluorescence properties have been obtained in some systems by biosynthetically incorporating Trp analogs such as 7-azatryptophan.21–23 This approach has not been used in ATP synthase but clearly deserves more attention in the future.

Positioning of the probes for nucleotide binding in F1 turned out to be straightforward once the crystal structure was available. Selected Phe and Tyr residues close to the catalytic site could often be replaced by Trp with preservation of enzymatic function. In general, Trp residues inserted into the adenine-binding subdomain gave the same response with MgADP, MgAMPPNP (a nonhydrolyzable ATP analog), and MgATP, while those located toward the phosphate-binding region reacted differently upon binding of MgADP versus MgAMPPNP.18 Interestingly, two Trp residues of the latter group responded selectively to nucleotide binding at just the high-affinity catalytic site.

* This discussion assumes a working system for site-directed mutagenesis of the protein under investigation.
11 S. H. White, J. Mol. Biol. 227, 991 (1992).
12 S. Chakravarty and R. Varadarajan, FEBS Lett. 470, 65 (2000).
13 M. R. Eftink, Methods Biochem. Anal. 35, 127 (1991).
14 R. P. Haugland, “Handbook of Fluorescent Probes and Research Chemicals,” 6th Ed. Molecular Probes, Eugene, OR, 1996.
However, lack of a high-resolution structure of a protein under investigation should not discourage using the approach described here. In fact, our first (and very successful) Trp probes were engineered based on structural information obtained with photoreactive and fluorescent nucleotide analogs24 or via sequence homologies.25

15 S. Wilke-Mounts, J. Weber, E. Grell, and A. E. Senior, Arch. Biochem. Biophys. 309, 363 (1994).
16 P. H. Kuo, C. J. Ketchum, and R. K. Nakamoto, FEBS Lett. 426, 217 (1998).
17 S. Wilke-Mounts, J. Weber, and A. E. Senior, unpublished results (2000).
18 J. Weber, S. Wilke-Mounts, S. T. Hammond, and A. E. Senior, Biochemistry 37, 12042 (1998).
19 J. Weber and A. E. Senior, Biochemistry 39, 5287 (2000).
20 J. Weber, V. Bijol, S. Wilke-Mounts, and A. E. Senior, Arch. Biochem. Biophys. 397, 1 (2002).
21 M. Négrerie, S. M. Bellefeuille, S. Whitham, J. W. Petrich, and R. W. Thornburg, J. Am. Chem. Soc. 112, 7419 (1990).
22 J. B. A. Ross, A. G. Szabo, and C. W. V. Hogue, Methods Enzymol. 278, 151 (1997).
23 J. Broos, F. ter Veld, and G. T. Robillard, Biochemistry 38, 9798 (1999).

β-Trp331: An Exemplary Probe of Nucleotide Binding to the Catalytic Sites of F1

A number of fluorescent probes have been used to measure ligand binding in F1, and of these β-Trp331* has been the most widely used (Table I). The crystal structure shows that the wild-type residue, β-Tyr331, contributes considerably to the lining of the adenine-binding pocket of the catalytic site,3,7 with the Tyr side chain in van der Waals contact with the adenine base of bound nucleotide, forming a sandwich-type complex. It is very likely that the β-Trp331 side chain in the Y331W mutant assumes a similar position, which would explain why the fluorescence of this residue is virtually completely quenched upon binding of nucleotide (Fig. 1).24 In the absence of nucleotide, β-Trp331 shows substantial fluorescence (Φ = 0.22) with a maximum at 350 nm (Fig. 1), reflecting a highly polar environment, as the binding site is in an open conformation and readily accessible to the medium. The high accessibility of β-Trp331 was confirmed in acrylamide quenching experiments.19 Functionally, the Y331W mutation is well tolerated. Oxidative phosphorylation in vivo is nearly as efficient as in wild-type ATP synthase, and while in Y331W mutant F1-ATPase activity (Vmax) is reduced by 50%, so is Km (MgATP), thus resulting in unchanged catalytic efficiency, kcat/Km.24 In the following, we will describe the determination of nucleotide-binding parameters using the β-Trp331 probe.

Enzyme Purification and Characterization

A high degree of protein purity is necessary for fluorescence spectroscopy studies. Calculation shows that this is particularly true for ATP synthase. Insertion of three Trp residues (one per catalytic site) in E. coli F1, which consists of more than 3500 amino acid residues, adds less than 0.1 Trp per 100 amino acids. Taking into account that the Trp content of an “average” natural protein is 10–15 times higher, this means that contamination of only 5–10% can add a Trp signal of the same intensity as the total from the three inserted probes.
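The estimate above can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the Trp content of an “average” protein is taken as the 1–1.5% abundance quoted earlier in this chapter, and it assumes that every Trp residue, whether inserted probe or contaminant, contributes the same fluorescence and that residue fraction approximates mass fraction.

```python
def trp_per_100_residues(n_trp, n_residues):
    """Trp content expressed per 100 amino acid residues."""
    return 100.0 * n_trp / n_residues


def contaminant_fraction_for_equal_signal(probe_per_100, contaminant_per_100):
    """Residue fraction x of contaminant at which its Trp content,
    x * contaminant_per_100, equals that of the probe protein,
    (1 - x) * probe_per_100."""
    return probe_per_100 / (probe_per_100 + contaminant_per_100)


# Three inserted Trp in an F1 of roughly 3500 residues.
probe = trp_per_100_residues(3, 3500)
print(f"inserted Trp: {probe:.3f} per 100 residues")

for average in (1.0, 1.5):  # assumed Trp content of contaminating protein
    x = contaminant_fraction_for_equal_signal(probe, average)
    print(f"{average} Trp/100 residues: equal signal at {100 * x:.1f}% contamination")
```

Under these assumptions the contaminant matches the probe signal at roughly 5 to 8% of the total protein, in line with the 5–10% range stated in the text.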

24 J. Weber, S. Wilke-Mounts, R. S. F. Lee, E. Grell, and A. E. Senior, J. Biol. Chem. 268, 20126 (1993).
25 J. Weber, C. Bowman, and A. E. Senior, J. Biol. Chem. 271, 18711 (1996).
* E. coli numbering is used throughout.


TABLE I
Selected Fluorescent Probes of Nucleotide Site Ligand Binding in F1-ATPase

Probe | Comments | References
Y331W mutation(a) | Fluorescence of inserted Trp fully quenched upon nucleotide binding to catalytic sites. Monitors range of different nucleotides. Does not discriminate nucleoside triphosphate from diphosphate. Used in F1 and F1F0. | b–l
F148W mutation(a) | Fluorescence of inserted Trp discriminates nucleoside triphosphate from diphosphate in all three catalytic sites. Small responses require use of Trp-free background. | f, m
R323W mutation(a) | Fluorescence of inserted Trp monitors presence of γ-phosphate of nucleoside triphosphate and release of Pi from the catalytic site. Used only with one site loaded so far. | n
T106C labeled with CM(o) | Coumarin fluorescence monitors the presence of γ-phosphate of nucleoside triphosphate and hydrolysis in the catalytic site. Similar to R323W above. Large responses. | p, q
F410C labeled with ABD-F | ABD fluorescence monitors catalytic site nucleotide binding in F1 and F1F0. Large responses. | r
R365W mutation | Fluorescence of inserted Trp fully quenched upon nucleotide binding to all three noncatalytic sites. Analogous to Y331W. | s, t

a E. coli numbering is used. It should be noted that not all probes give the same responses when used in different systems. Although Y331W works equally well in E. coli and the thermophilic Bacillus sp. PS3, nucleotide-induced signals of F148W showed differences in both systems [K. Dong, H. Ren, and W. S. Allison, J. Biol. Chem. 277, 9540 (2002)]. Responses of R323W to phosphate binding described for PS3 F1 are significantly smaller with the E. coli enzyme [Z. Ahmad, J. Weber, and A. E. Senior, unpublished results (2003)].
b J. Weber, S. Wilke-Mounts, R. S. F. Lee, E. Grell, and A. E. Senior, J. Biol. Chem. 268, 20126 (1993).
c J. Weber, S. Wilke-Mounts, and A. E. Senior, J. Biol. Chem. 269, 20462 (1994).
d J. Weber and A. E. Senior, J. Biol. Chem. 271, 3474 (1996).
e G. Grüber and R. A. Capaldi, J. Biol. Chem. 271, 32623 (1996).
f J. Weber and A. E. Senior, J. Biol. Chem. 273, 33210 (1998).
g S. Löbau, J. Weber, and A. E. Senior, Biochemistry 37, 10846 (1998).
h S. Nadanaciva, J. Weber, and A. E. Senior, J. Biol. Chem. 274, 7052 (1999).
i N. P. Le, H. Omote, Y. Wada, M. K. Al-Shawi, R. K. Nakamoto, and M. Futai, Biochemistry 39, 2778 (2000).
j H. Ren and W. S. Allison, J. Biol. Chem. 275, 10057 (2000).
k J. Weber and A. E. Senior, J. Biol. Chem. 276, 35422 (2001).
l N. Mitome, S. Ono, T. Suzuki, K. Shimabukuro, E. Muneyuki, and M. Yoshida, Eur. J. Biochem. 269, 53 (2002).


Fig. 1. Trp fluorescence spectra of Y331W mutant F1. λexc = 295 nm. Spectra are corrected. Spectrum of Y331W mutant F1 in the absence of nucleotide (1, solid line) and in the presence of 1 mM MgATP (2, dotted line). The latter spectrum is indistinguishable from that of wild-type F1 (3, solid line). The difference spectrum of “1” minus “3” represents the fluorescence of the three inserted Trp residues at position 331.

m J. Weber, C. Bowman, and A. E. Senior, J. Biol. Chem. 271, 18711 (1996).
n T. Masaike, E. Muneyuki, H. Noji, K. Kinosita, Jr., and M. Yoshida, J. Biol. Chem. 277, 21643 (2002).
o CM, N-{4-[7-(diethylamino)-4-methylcoumarin-3-yl)]maleimide}; ABD-F, 7-fluoro-2-oxa-1,3-diazole-4-sulfonamide.
p P. Turina and R. A. Capaldi, J. Biol. Chem. 269, 13465 (1994).
q P. Turina and R. A. Capaldi, Biochemistry 33, 14275 (1994).
r J. Weber, V. Bijol, S. Wilke-Mounts, and A. E. Senior, Arch. Biochem. Biophys. 397, 1 (2002).
s J. Weber, S. Wilke-Mounts, E. Grell, and A. E. Senior, J. Biol. Chem. 269, 11261 (1994).
t J. Weber and A. E. Senior, J. Biol. Chem. 270, 12653 (1995).

Expression Systems

One provision for purity of a protein preparation is a high level of expression. For expression of wild-type F1 (or F1F0) from E. coli we use strain SWM1,26 which gives a yield of about 1 mg F1 per g of cells (wet weight), or 100 mg F1 per 13 liters of fermentor culture. For mutant enzymes best yields are found upon introduction of the mutation into strain pBWU13.4/DK8,27 or, if an F1 with a Trp-free background is desired, into strain pB0W1/DK8. (Plasmid pB0W1 is a pBWU13.4 derivative, in which all Trp residues naturally occurring in F1 have been replaced.28) Depending on the mutation, yields of up to 0.4 mg F1 per g of cells are obtained. The Y331W mutant F1 described here is an exception, with yields of up to 0.6 mg F1 per g of cells when expressed from strain pSWM4/JP17.24 Due to the unusually strong fluorescence signal of the introduced Trp331 residues, use of a Trp-free background is not necessary in this case.

Enzyme Preparation and Analysis

Preparation of F1 follows published procedures.29 Briefly, cells are grown in a 13-liter fermentor (New Brunswick Scientific, Edison, NJ), harvested, washed, and lysed using a French press (Thermo IEC, Needham Heights, MA). After removal of cell debris by centrifugation, membranes are collected by ultracentrifugation and washed.30 F1 is released from the membranes by repeated washes in low-ionic-strength buffer, and precipitated using Mg2+/poly(ethylene glycol) 8000.31 The redissolved pellet is loaded on a Whatman DE52 cellulose ion-exchange column, washed, and eluted with Na2SO4.32–34 After concentration in an Amicon ultrafiltration cell (Millipore, Billerica, MA), the final step of the purification consists of gel chromatography on Sephacryl S-300 HR,29 followed by concentration by ultrafiltration. Purity and subunit composition of the final product are checked by sodium dodecyl sulfate (SDS) gel electrophoresis.32 Protein concentrations are determined using the Bio-Rad (Hercules, CA) protein assay,33 with bovine serum albumin as standard. ATPase assays are described in Weber and Senior.34 F1 is stored at −70° in 50- or 100-μl aliquots to avoid repeated freezing and thawing. It should be noted that several His35,36 and Flag epitope37 tags for E. coli F1 have been described; however, for large-scale preparations tag-based purification schemes have not yet supplanted conventional methods such as the one given here.

26 R. Rao, M. K. Al-Shawi, and A. E. Senior, J. Biol. Chem. 263, 5569 (1988).
27 C. J. Ketchum, M. K. Al-Shawi, and R. K. Nakamoto, Biochem. J. 330, 707 (1998).
28 J. Weber, S. Wilke-Mounts, and A. E. Senior, J. Biol. Chem. 277, 18390 (2002).
29 J. Weber, R. S. F. Lee, E. Grell, J. G. Wise, and A. E. Senior, J. Biol. Chem. 267, 1712 (1992).
30 A. E. Senior, D. R. H. Fayle, J. A. Downie, F. Gibson, and G. B. Cox, Biochem. J. 180, 111 (1979).
31 A. E. Senior, J. A. Downie, G. B. Cox, F. Gibson, L. Langman, and D. R. H. Fayle, Biochem. J. 180, 103 (1979).
32 U. K. Laemmli, Nature 227, 680 (1970).
33 M. M. Bradford, Anal. Biochem. 72, 248 (1976).
34 J. Weber and A. E. Senior, J. Biol. Chem. 276, 35422 (2001).
35 H. Omote, N. Sambonmatsu, K. Saito, Y. Sambongi, A. Iwamoto-Kihara, T. Yanagida, Y. Wada, and M. Futai, Proc. Natl. Acad. Sci. USA 96, 7780 (1999).
36 S. P. Tsunoda, R. Aggeler, H. Noji, K. Kinosita, Jr., M. Yoshida, and R. A. Capaldi, FEBS Lett. 470, 244 (2000).
37 C. J. Ketchum and R. K. Nakamoto, J. Biol. Chem. 273, 22292 (1998).

Determination of Catalytic Site Nucleotide-Binding Parameters Using β-Trp331 Fluorescence

Experimental Procedure 

Routinely, experiments are performed at 23° in 10 × 10-mm (2-ml) quartz fluorescence cuvettes, equipped with a Teflon stirbar. λexc is 295 nm, λem is 360 nm.* The assay buffer is 50 mM Tris/H2SO4, pH 8.0; for experiments with “free” nucleotides, i.e., in the absence of Mg2+, 0.5 mM ethylenediaminetetraacetic acid (EDTA) is added. F1, which is stored in a buffer containing MgATP, is equilibrated with the assay buffer by two sequential passages through 1-ml Sephadex G-50 centrifuge columns. This procedure effectively removes nucleotides from the catalytic sites.38 F1 concentration in the fluorescence cuvette is 50–150 nM. In Mg2+–nucleotide-binding experiments, MgSO4 is added after reading the initial fluorescence of the enzyme, either all at once, if a fixed concentration of Mg2+ is used, or together with the nucleotide, if a constant Mg2+:nucleotide ratio is desired. In MgADP and MgAMPPNP as well as in some MgATP-binding experiments we use 2.5 mM Mg2+. In MgATP-binding experiments, especially when investigating the correlation between substrate binding and catalysis, we use an Mg2+:ATP ratio of 1:2.5, where F1 has its maximal ATPase activity. The fluorescence intensity of Y331W F1 before addition of Mg2+ is required as reference, representing the enzyme with empty catalytic sites. Addition of Mg2+ alone can cause a small quench of fluorescence, due to nucleotides released from noncatalytic sites binding subsequently to the high-affinity catalytic site. If the affinity of the latter site is enhanced, as in the presence of fluoroaluminate, this quench can reach 10% of the total signal.39 On the other hand, if totally nucleotide-depleted F1 (i.e., enzyme with endogenous nucleotides removed from the noncatalytic sites)40 is used, this Mg2+-induced quench is much reduced or completely absent.

* λem of 360 nm was chosen because with the spectrofluorometer used in our laboratory (SPEX Fluorolog 2) at this wavelength the technical (uncorrected) spectrum of the inserted β-Trp331 residue had its maximum. In other instruments, a λem of 350 nm might give an even better response (see Fig. 1).
38 J. Weber, S. Wilke-Mounts, and A. E. Senior, J. Biol. Chem. 269, 20462 (1994).
39 S. Nadanaciva, J. Weber, and A. E. Senior, J. Biol. Chem. 274, 7052 (1999).
40 A. E. Senior, R. S. F. Lee, M. K. Al-Shawi, and J. Weber, Arch. Biochem. Biophys. 297, 340 (1992).


To cover the whole affinity range, adenine nucleotide concentrations of between 30 nM and 2 mM are required (for guanine and inosine nucleotides the range is shifted toward higher concentrations36). As the enzyme solution in the cuvette is dilute, stability considerations make it advisable not to measure a complete titration curve consisting of 15–20 data points in a single experiment. We prefer to split a curve into three or four different experiments, each covering the whole concentration range (set 1: 30 nM, 300 nM, 3 μM, 30 μM, 300 μM; set 2: 60 nM, 600 nM, 6 μM, 60 μM, and 600 μM; set 3: 100 nM, 1 μM, 10 μM, 100 μM, 1 mM; set 4: 200 nM, 2 μM, 20 μM, 200 μM, 2 mM). With MgATP as ligand, only one or maximally two data points should be measured in the same experiment, to avoid interference by the hydrolysis product ADP. Measurements at concentrations in the low nanomolar range become difficult, for reasons discussed below. For each experiment with Y331W mutant F1, two controls are performed in parallel. In one, nucleotide is added to buffer without enzyme, to correct for a signal due to impurities in the nucleotide, tangible at high concentrations. In the other control, wild-type F1 is titrated with nucleotide. The nine Trp residues in wild-type F1 do not respond to nucleotide binding, so this control corrects for volume and inner filter effects due to added nucleotide. We found this method of correction of inner filter effects more convenient than those based on absorbance measurements.41

Evaluation of Nucleotide Titrations

In the discussion of the evaluation of the binding assay, we will use the following abbreviations and symbols: F0 is the fluorescence intensity of Y331W mutant F1 with empty catalytic sites, before addition of ligand. Fexp is the fluorescence intensity after addition of a given concentration of nucleotide, and Fsat is the fluorescence intensity upon saturation with nucleotide (F0, Fexp, and Fsat are in arbitrary units).
A prime indicates that the fluorescence has been normalized by setting the value obtained in the absence of ligand, F0, to 1. Thus, $F'_{exp} = F_{exp}/F_0$ and $F'_{sat} = F_{sat}/F_0$ ($F'_{exp}$ and $F'_{sat}$ are dimensionless). Given in boldface, $\mathbf{F}_0$ and $\mathbf{F}_{wt}$ are the molar fluorescence intensities of Y331W mutant F1 (in the absence of added nucleotide) and wild-type F1, respectively; $\mathbf{F}_0 = F_0/[E]_0$ and $\mathbf{F}_{wt} = F_{wt}/[E]_0$ (in arbitrary units M⁻¹), where [E]0 is the total (molar) concentration of the respective enzyme. [L]0 is the total concentration of nucleotide, [L] the concentration of free nucleotide (i.e., nucleotide not bound to protein), and [L]bound the concentration of bound nucleotide. ν is the degree of

41 M. R. Eftink, Methods Enzymol. 278, 221 (1997).


Fig. 2. Conversion of fluorescence quenching data into binding data. Normalized fluorescence intensities (left-hand scale) obtained in titrations of Y331W mutant F1 with MgATP (solid circles) or ATP (open circles) are plotted versus the ligand concentration. A logarithmic scale on the abscissa is preferred to cover the whole ligand concentration range in equal detail. The right-hand ordinate scale gives the degree of binding (ν = [L]bound/[E]0) calculated from the fluorescence results using Eq. (3).

binding, defined as the molar ratio of bound ligand to total enzyme, ν = [L]bound/[E]0. In the case described here, the measured quantity is the molar ratio of occupied binding sites to total enzyme, which is reflected in the label “occupied catalytic sites (mol/mol F1)” on the y-axis of the plots (see Fig. 2, right-hand scale). n is the binding stoichiometry, i.e., the number of binding sites of a certain type, and Kd1, Kd2, and Kd3 give the (thermodynamic) dissociation constants for binding of nucleotide to sites 1, 2, and 3, respectively.

After subtraction of the background signal (buffer and, if necessary, nucleotide), for each point of a titration curve with Y331W mutant F1 the fluorescence signal is adjusted for volume and inner filter effects by multiplying it by the correction factor obtained in the control titration with wild-type F1. It is convenient for the subsequent evaluation to normalize the corrected intensities, Fexp, by setting the fluorescence of the Y331W enzyme with empty catalytic sites, F0, to 1. Next, the corrected and normalized fluorescence values, F′exp, are converted into degree of binding, ν, which requires knowledge of the binding stoichiometry, n. In a typical fluorescence binding assay, determination of the binding stoichiometry can pose a problem, unless the affinity of the macromolecule for the ligand is very high (Kd < [E]0).41 By engineering the fluorescent


probes directly into the binding site, a part of the problem is solved, as the number of probes determines the maximal stoichiometry. Still, calculation of binding data by Eq. (1) or, after normalization, Eq. (2),

$$\nu = \frac{[L]_{\mathrm{bound}}}{[E]_0} = \frac{n\,(F_0 - F_{\mathrm{exp}})}{F_0 - F_{\mathrm{sat}}} \tag{1}$$

$$\nu = \frac{[L]_{\mathrm{bound}}}{[E]_0} = \frac{n\,(1 - F'_{\mathrm{exp}})}{1 - F'_{\mathrm{sat}}} \tag{2}$$

relies on the further assumptions that ligand binding to each of the n sites gives the same fluorescence response, and that when the fluorescence signal reaches saturation at Fsat or F′sat, all n sites are actually occupied (excluding a scenario in which a fraction of the sites has very low affinities, beyond the concentration range covered in the experiment). However, the system presented here has a unique advantage, because there cannot be any doubt that the fluorescence of the three inserted β-Trp331 probes (one per catalytic binding site) is the same when the respective site is occupied by nucleotide, namely zero. Thus, when the signal obtained with Y331W mutant F1 during a titration experiment is the same as that of an equimolar concentration of wild-type enzyme, as in curve 2 of Fig. 1, all three catalytic sites are filled. The only remaining assumption that has to be made is that each of the three β-Trp331 probes has the same fluorescence signal in the absence of nucleotide. Several findings support the idea of equal contribution of the three β-Trp331 residues to the overall signal: (1) fluorescence lifetime measurements demonstrating that 98% of the β-Trp331 fluorescence intensity is associated with a decay process with an unusually long, but apparently uniform lifetime,19 (2) independent determination of occupied sites using a fluorescent nucleotide analog,38 (3) nucleotide titrations with Y331W F1 modified by NBD-Cl (7-chloro-4-nitrobenz-2-oxa-1,3-diazole), which saturate at a quench of two-thirds of the β-Trp331 fluorescence,38 in agreement with the X-ray structure of NBD-modified mitochondrial F1 showing that the modification prevents one of the three sites from closing and binding nucleotide,42 and (4) numerous titration curves that show distinct intermediate plateaus at a quench of one-third24,39 or two-thirds39,43 of the β-Trp331 fluorescence, before continuing to reach full quenching.
At 360 nm, we observed a fluorescence ratio of wild-type F1 (or Y331W mutant F1 with occupied catalytic sites) to Y331W mutant F1

42 G. L. Orriss, A. G. W. Leslie, K. Braig, and J. E. Walker, Structure 6, 831 (1998).
43 S. Nadanaciva, J. Weber, and A. E. Senior, Biochemistry 39, 9583 (2000).


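with empty catalytic sites, $\mathbf{F}_{wt}/\mathbf{F}_0$ ($= F_{sat}/F_0$), of 0.49 (see Fig. 1). Thus, in this case Eq. (2) can be simplified to

$$\nu = \frac{[L]_{\mathrm{bound}}}{[E]_0} = \frac{3\,(1 - F'_{\mathrm{exp}})}{1 - 0.49} = \frac{1 - F'_{\mathrm{exp}}}{0.17} \tag{3}$$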
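The conversion from a raw reading to the degree of binding can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names, the argument layout, and the way the wild-type control is turned into a correction factor (ratio of the wild-type signal before and after nucleotide addition) are assumptions made for the example. Only the saturation value of 0.49 and the stoichiometry of 3 come from the text.

```python
def correction_factor(f_wt_before, f_wt_after):
    """Factor that undoes dilution and inner-filter losses (assumed form).

    Wild-type F1 does not respond to nucleotide binding, so a drop in its
    signal after nucleotide addition is attributed to volume and inner
    filter effects.
    """
    return f_wt_before / f_wt_after


def degree_of_binding(f_exp, f_0, background=0.0, correction=1.0,
                      f_sat_norm=0.49, n_sites=3):
    """Convert a raw fluorescence reading into nu = [L]bound/[E]0.

    f_exp       raw intensity after nucleotide addition
    f_0         background-corrected intensity with empty catalytic sites
    background  buffer (and, if needed, nucleotide) signal to subtract
    correction  factor obtained from the wild-type control titration
    f_sat_norm  normalized intensity at saturation (0.49 at 360 nm)
    n_sites     binding stoichiometry (three catalytic sites)
    """
    # Subtract background, correct, then normalize to the empty-site signal.
    f_norm = (f_exp - background) * correction / f_0
    # Normalized form of the conversion equation.
    return n_sites * (1.0 - f_norm) / (1.0 - f_sat_norm)


if __name__ == "__main__":
    # A quench of 17% of the normalized signal corresponds to one filled site.
    print(round(degree_of_binding(0.83, 1.0), 2))
```

With the default parameters, an unquenched signal maps to zero occupied sites and a normalized signal of 0.49 maps to three.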

Figure 2 illustrates the conversion of fluorescence measurements (left-hand scale) to binding data (right-hand scale). Before calculating affinities (Kd values) from the binding curves, two issues regarding ligand concentrations should be discussed.

First, in experiments with Mg2+-nucleotide, depending on the stability constant of the Mg2+-nucleotide complex, not all nucleotide will be in the form of the Mg2+ complex. We calculate the actual Mg2+-nucleotide concentration using stability constants of 20 μM for MgATP (and MgAMPPNP) and 78 μM for MgADP.44,45 In experiments with a constant Mg2+ concentration of 2.5 mM, the changes resulting from plotting [MgADP] instead of [ADP] are barely noticeable; however, when using an Mg2+:ATP ratio of 1:2.5, at low ligand concentrations most of the Mg2+ and the ATP will be in the uncomplexed form, and calculation of the concentration of MgATP complex becomes a necessity.

The second issue concerns the fact that evaluation of the binding curves requires knowledge of the concentration of free ligand (i.e., ligand not bound to enzyme). The enzyme concentration used is 100 nM. Thus, at ligand concentrations >1 μM the difference between total and free ligand becomes negligible. Below 1 μM, free ligand concentration is obtained by subtracting the concentration of bound ligand, [L]bound = ν[E]0, from that of total ligand. In the concentration range where binding of Mg2+-nucleotide to the high-affinity site 1 occurs, these corrections become substantial, giving the calculated values for Kd1 an increased margin of error. In MgATP titration experiments using an Mg2+:ATP concentration ratio of 1:2.5, this correction is omitted, because it can be assumed that all MgATP complex lost from the medium due to binding to the enzyme is immediately reformed from the large pool of uncomplexed Mg2+ and ATP.
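Both concentration corrections can be written down compactly. The sketch below is an illustration under stated assumptions: the quoted constants (20 μM for MgATP, 78 μM for MgADP) are treated as dissociation constants of a simple 1:1 complex, competing equilibria (protons, other cations) are ignored, and the function names are hypothetical. All concentrations are in μM.

```python
import math


def mg_complex(total_mg, total_nuc, k_diss):
    """Concentration of the 1:1 Mg-nucleotide complex.

    Solves (Mg_t - c)(L_t - c) = K * c for c, taking the physically
    meaningful (smaller) root of the quadratic.
    """
    s = total_mg + total_nuc + k_diss
    return (s - math.sqrt(s * s - 4.0 * total_mg * total_nuc)) / 2.0


def free_ligand(total_ligand, nu, enzyme_conc):
    """Free ligand = total ligand minus bound ligand (nu * [E]0)."""
    return total_ligand - nu * enzyme_conc


if __name__ == "__main__":
    # Constant 2.5 mM Mg2+, 10 uM ADP: nearly all ADP is complexed.
    print(mg_complex(2500.0, 10.0, 78.0))
    # Mg2+:ATP ratio of 1:2.5 at 1 uM ATP: only a small fraction is complexed.
    print(mg_complex(0.4, 1.0, 20.0))
    # 0.3 uM total ligand, one site filled on 0.1 uM enzyme.
    print(free_ligand(0.3, 1.0, 0.1))
```

The two printed complex concentrations illustrate why the correction is barely noticeable at constant 2.5 mM Mg2+ but becomes necessary at a fixed Mg2+:ATP ratio and low ligand concentration.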
Binding affinities are calculated by fitting theoretical curves to the measured data points using nonlinear least-squares regression analysis as offered by most commercially available data presentation programs, e.g., SigmaPlot (SPSS, Chicago, IL). For MgATP binding, a model with three

44 V. L. Pecoraro, J. D. Hermes, and W. W. Cleland, Biochemistry 23, 5262 (1984).
45 In earlier studies, we used a program based on A. Fabiato and F. Fabiato [J. Physiol. Paris 75, 463 (1979)] to calculate Mg2+-nucleotide concentrations. However, comparison with other work from the literature suggested that the constants used in this program might underestimate the concentration of the Mg2+ complexes.


different and independent binding sites gives in general a good fit (see curve through the solid circles in Fig. 2):

$$\nu = \frac{[L]}{[L] + K_{d1}} + \frac{[L]}{[L] + K_{d2}} + \frac{[L]}{[L] + K_{d3}} \tag{4}$$

where [L] is the concentration of free ligand and Kdi the thermodynamic dissociation constant at the ith catalytic binding site. Lately, we have been using the three-site model to describe MgADP binding.39 Originally, we had applied a model with two types of sites for MgADP and MgAMPPNP binding24,46 that was better able to account for noninteger stoichiometries at sites 2 and 3, caused by enzyme instability during long titration experiments. After eliminating this problem by employing the protocol described above, both models gave similarly suitable fits. Using the same model for the different nucleotides allows better comparison of the results. To describe nucleotide binding in the absence of Mg2+, a model with a single type of site is generally sufficient:

$$\nu = \frac{n\,[L]}{[L] + K_d} \tag{5}$$
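The two binding models are simple enough to evaluate directly. The following sketch is illustrative only: the function names are hypothetical, and the sum-of-squares helper merely shows the quantity that a nonlinear least-squares routine would minimize; it is not a fitting procedure and does not reproduce the program used by the authors.

```python
def nu_three_sites(lig, kd1, kd2, kd3):
    """Three different, independent sites (one term per site)."""
    return lig / (lig + kd1) + lig / (lig + kd2) + lig / (lig + kd3)


def nu_single_type(lig, n, kd):
    """A single type of site with stoichiometry n."""
    return n * lig / (lig + kd)


def sum_of_squares(model, params, ligand_concs, nu_observed):
    """Objective minimized in a least-squares fit of `model` to the data."""
    return sum((model(lig, *params) - nu) ** 2
               for lig, nu in zip(ligand_concs, nu_observed))


if __name__ == "__main__":
    # Free ligand concentrations in uM; Kd values chosen for illustration.
    concs = [0.03, 0.3, 3.0, 30.0, 300.0]
    kds = (0.02, 1.4, 28.0)
    curve = [nu_three_sites(c, *kds) for c in concs]
    print([round(v, 2) for v in curve])
    # With three identical Kd values the three-site model collapses to the
    # single-type model with n = 3.
    print(nu_three_sites(10.0, 53.0, 53.0, 53.0), nu_single_type(10.0, 3, 53.0))
```

Any general-purpose least-squares optimizer can be pointed at `sum_of_squares`; which one to use is left open here.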

In most cases, values for n will be close to 3 (2.7–3.0). Small deviations from n = 3 could be due to enzyme instability, which appears more severe in the absence of Mg2+.

Since MgATP is hydrolyzed during the course of the experiments, it has been questioned whether the binding data reflect true Kd values for MgATP. However, MgATP is bound with similar affinities and similar kinetics to F1 in which hydrolysis is fully inhibited by chemical modification38 or the presence of additional mutations.38,39,46 In the case of azide inhibition, it could actually be shown using the β-Trp148 probe (see Table I) that the nucleotide species on the catalytic sites is MgATP.47 Thus, it is justified to take the calculated Kd values as representative for MgATP.

Catalytic Site Cooperativity in F1-ATPase

Nucleotide binding and catalysis properties of F1-ATPase are often described as “negative binding cooperativity” combined with “positive catalytic cooperativity.” Still, in the evaluation of Mg2+-nucleotide binding described in the preceding section we used a model with three different and independent sites, not a cooperative model. Based solely on the

46 S. Löbau, J. Weber, S. Wilke-Mounts, and A. E. Senior, J. Biol. Chem. 272, 3648 (1997).
47 J. Weber and A. E. Senior, J. Biol. Chem. 273, 33210 (1998).


binding data it is not possible to differentiate between different, independent sites and negative cooperativity; only positive binding cooperativity, which obviously does not apply in this case, would be easily recognized. There is no doubt that ATP synthase is a cooperative enzyme in the true sense of the word, as all three catalytic sites have to “work together” for physiological catalysis to occur (see below). However, with regard to its nucleotide-binding behavior it does not appear to be negatively cooperative as traditionally defined in enzymology. This would require all sites to have a priori the same high affinity for Mg2+-nucleotide; then after one site is filled, the remaining two would display a reduced affinity, with a further reduction for the third site after filling of the second. In contrast, in F1 it seems to be the position of the central γ-subunit that predetermines which one of the three catalytic sites has high affinity, which one has medium affinity, and which one has low affinity. The latter scenario can most easily be accommodated within the framework of rotational catalysis. Thus, at any given time the three sites are different, as assumed in the evaluation model, depending on which face of γ is directed toward the respective site. Whether they are completely independent, meaning that occupancy of one of them does not influence the affinity of the others, remains an open question. Some studies report an effect of nucleotide binding to a low-affinity site on the release rates for nucleotides from site(s) of higher affinity under conditions in which it should not be caused by a rotation-induced binding change.48,49 However, such behavior should not affect the choice of binding model used for evaluation of the data. Available information on the molecular basis for the different affinities of the three catalytic sites toward Mg2+-nucleotides will be discussed below.
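Why binding data alone cannot separate the two cases can be illustrated numerically. A Hill slope below 1 is the classic signature attributed to negative cooperativity, yet three different and fully independent sites produce the same signature. The sketch below is an illustration, not part of the original evaluation: the function names are hypothetical, and the slope is estimated by a finite difference at the half-saturation point.

```python
import math


def nu_independent(lig, kds):
    """Degree of binding for independent sites with the given Kd values."""
    return sum(lig / (lig + kd) for kd in kds)


def half_saturation(kds):
    """Ligand concentration at which half of the sites are filled."""
    target = len(kds) / 2.0
    lo, hi = 1e-9, 1e9
    for _ in range(200):              # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if nu_independent(mid, kds) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def hill_slope(kds, step=1e-4):
    """Slope of ln(nu / (n - nu)) versus ln[L] at half saturation."""
    n = len(kds)
    l_half = half_saturation(kds)

    def hill_y(lig):
        nu = nu_independent(lig, kds)
        return math.log(nu / (n - nu))

    up = hill_y(l_half * math.exp(step))
    down = hill_y(l_half * math.exp(-step))
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    print(hill_slope((53.0, 53.0, 53.0)))   # identical sites
    print(hill_slope((0.02, 1.4, 28.0)))    # three different, independent sites
```

Identical sites give a slope of 1, whereas widely spaced Kd values give a slope well below 1 without any interaction between the sites.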
Applications of β-Trp331 Fluorescence: Implications for the Enzymatic Mechanism

β-Trp331 fluorescence was the first and is still the best probe to directly demonstrate the vastly different affinities of the three catalytic sites for Mg2+-nucleotides (Table II), a cornerstone of most models of the enzymatic mechanism. It was also the first probe to show that these different affinities were displayed only in the presence of Mg2+. In the absence of Mg2+, all three catalytic sites have the same, relatively low affinity for nucleotides (Table II). Arguably the most important result was obtained upon comparison of MgATP and MgITP binding and enzymatic activity,

48 H. Tiedge and G. Schäfer, Biol. Chem. Hoppe-Seyler 367, 689 (1986).
49 J. Weber, S. Schmitt, E. Grell, and G. Schäfer, J. Biol. Chem. 265, 10884 (1990).

TABLE II
Binding Affinities of F1 Catalytic Sites for ATP and ADP

              Presence of Mg2+                No Mg2+         ΔΔG° (kJ/mol) a
Nucleotide    Kd1 (μM)  Kd2 (μM)  Kd3 (μM)    Kd1,2,3 (μM)    Site 1 (Mg2+)      Site 2 (Mg2+)      Site 3 (Mg2+)
                                                              vs. site 2 (Mg2+)  vs. site 3 (Mg2+)  vs. site 3 (no Mg2+)
ATP b         0.02      1.4       28          53              10.5               7.4                1.6
ADP           0.04      1.8       35          83              9.4                7.3                2.1

a Differences in binding energy (1) for Mg2+-nucleotide at site 1 versus site 2, (2) for Mg2+-nucleotide at site 2 versus site 3, and (3) for Mg2+-nucleotide versus uncomplexed nucleotide at site 3.
b MgATP binding was measured in the presence of 2.5 mM Mg2+, to make the results comparable to the MgADP-binding data.

establishing that all three sites have to be occupied for rapid, physiologically relevant catalysis.24,34 This establishes the critical importance of filling of the third site to trigger rotation. However, as this aspect has been discussed extensively in previous studies,24,34,50 we will focus here rather on the results of the affinity measurements themselves. Table II gives the binding affinities (Kd values) for ATP and ADP in the presence and absence of Mg2+. It also lists the differences in free energy of binding between selected pairs of sites or selected conditions, according to

$$\Delta\Delta G^{\circ} = -RT \ln (K_{d1}/K_{d2}) = 2.3\,RT \log (K_{d2}/K_{d1}) \tag{6}$$
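The free energy differences in Table II follow directly from the listed Kd values. The short sketch below recomputes them; it is an illustration, not the authors' calculation. The temperature is not stated in this section, so 23 °C (296.15 K) is an assumption, chosen because it reproduces the tabulated values to one decimal place. The function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
TEMP_K = 296.15         # assumed temperature (23 degrees C), not given in the text


def delta_delta_g(kd_tight, kd_weak, temp_k=TEMP_K):
    """Difference in binding free energy (kJ/mol) between two sites or
    conditions, positive when kd_weak > kd_tight."""
    return R_KJ * temp_k * math.log(kd_weak / kd_tight)


if __name__ == "__main__":
    # Kd values in uM: (site 1, site 2, site 3 with Mg2+, all sites without Mg2+)
    table_ii = {"ATP": (0.02, 1.4, 28.0, 53.0), "ADP": (0.04, 1.8, 35.0, 83.0)}
    for nucleotide, (kd1, kd2, kd3, kd_no_mg) in table_ii.items():
        print(nucleotide,
              round(delta_delta_g(kd1, kd2), 1),        # site 1 vs. site 2
              round(delta_delta_g(kd2, kd3), 1),        # site 2 vs. site 3
              round(delta_delta_g(kd3, kd_no_mg), 1))   # site 3, Mg2+ vs. no Mg2+
```

The same function applied to Kd1 with Mg2+ and the Kd without Mg2+ gives the Mg2+-dependent gain in binding energy at site 1.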



The effect of Mg2+ on the binding energy at site 1 is pronounced, increasing it by close to 20 kJ/mol for both nucleotides. At site 2, the increase in binding energy due to Mg2+ is about 9 kJ/mol, whereas at site 3 there is little effect. The Mg2+-induced asymmetry appears to be a prerequisite for catalysis; in the absence of Mg2+, no enzymatic activity can be detected. These findings are in agreement with the postulate that catalysis occurs on the high-affinity site (see Introduction), which is formed only in the presence of Mg2+.

The Role of the Base-Binding Pocket in Nucleotide Binding

Table III gives the Kd values obtained in binding experiments where adenine nucleotides were replaced by inosine nucleotides. The overall affinities are lower, but otherwise the binding pattern remains the same. The

50 C. Dou, P. A. G. Fortes, and W. S. Allison, Biochemistry 37, 16757 (1998).


TABLE III
Binding Affinities of F1 Catalytic Sites for ITP and IDP

              Presence of Mg2+                                No Mg2+
Nucleotide    Kd1 (μM)        Kd2 (μM)      Kd3 (μM)          Kd1,2,3 (μM)
ITP a         0.33 [6.9] b    62 [9.3]      1400 [9.6]        2600 [9.6]
IDP           1.2 [8.4]       100 [9.9]     3500 [11.3]       3600 [9.3]

a MgITP and MgIDP binding were measured using a 2.5 mM excess of [Mg2+] over nucleotide.
b Given in brackets is the loss of binding energy, ΔΔG° (kJ/mol), between adenine and inosine nucleotides.

lower affinity of MgITP and MgIDP is an experimental advantage, because corrections necessary to calculate the concentrations of the free Mg2+-nucleotide complex are much smaller, increasing the precision with which Kd1 can be measured.34 Losses in binding energy due to replacement of the adenine base by inosine amount to 9 ± 2 kJ/mol for all sites, independent of the absence or presence of Mg2+ (Table III). These findings can be taken as an indication that the interaction of the nucleotide base with the binding pocket is the same at all three catalytic sites, suggesting that this region of the catalytic sites is not contributing to differences (or changes) in affinity. On the other hand, the base of the nucleotide (and/or the ribose moiety) is certainly required for efficient binding. As shown by competition experiments with MgAMPPNP, neither Mg2+-pyrophosphate51 nor Mg2+-triphosphate52 binds to catalytic sites (Kd > 1 mM). Interestingly, both of these compounds bind with reasonable affinity to the three noncatalytic sites (Kd = 20 and 50 μM, respectively), as determined in similar competition experiments using the α-Trp365 probe (Table I).51,53 This could be due to conformational differences between the two types of site. The crystal structure of an α3β3 complex in the absence of nucleotide54 suggests that the empty catalytic sites are in an “open” conformation, while the empty noncatalytic sites are “closed.” It is tempting to speculate that it is binding of the base moiety that triggers closing of the catalytic binding site, thereby effectively increasing affinity by several orders of magnitude.
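The roughly constant loss in binding energy on replacing adenine by inosine can be checked from the Kd values in Tables II and III. The sketch below is a consistency check only: the temperature of 23 °C (296.15 K) is an assumption, since it is not stated in this section, and the names used are hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
TEMP_K = 296.15         # assumed temperature (23 degrees C)

# Kd values in uM: (site 1, site 2, site 3 with Mg2+, all sites without Mg2+)
KD_ADENINE = {"NTP": (0.02, 1.4, 28.0, 53.0), "NDP": (0.04, 1.8, 35.0, 83.0)}
KD_INOSINE = {"NTP": (0.33, 62.0, 1400.0, 2600.0), "NDP": (1.2, 100.0, 3500.0, 3600.0)}


def binding_energy_loss(kd_reference, kd_variant, temp_k=TEMP_K):
    """Loss of binding energy (kJ/mol) when Kd rises from reference to variant."""
    return R_KJ * temp_k * math.log(kd_variant / kd_reference)


def all_losses():
    """Losses for every site and condition, triphosphates first."""
    losses = []
    for key in ("NTP", "NDP"):
        for kd_a, kd_i in zip(KD_ADENINE[key], KD_INOSINE[key]):
            losses.append(binding_energy_loss(kd_a, kd_i))
    return losses


if __name__ == "__main__":
    losses = all_losses()
    print([round(x, 1) for x in losses])
    print(round(sum(losses) / len(losses), 1), round(min(losses), 1), round(max(losses), 1))
```

The eight values cluster around 9 kJ/mol regardless of site or of the presence of Mg2+, which is the basis for the argument that the base-binding pocket does not contribute to the differences in affinity.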

51 J. Weber and A. E. Senior, J. Biol. Chem. 270, 12653 (1995).
52 J. Weber and A. E. Senior, unpublished results (1997).
53 J. Weber and A. E. Senior, Biochim. Biophys. Acta 1458, 300 (2000).
54 Y. Shirakihara, A. G. W. Leslie, J. P. Abrahams, J. E. Walker, T. Ueda, Y. Sekimoto, M. Kambara, K. Saika, Y. Kagawa, and M. Yoshida, Structure 5, 825 (1997).


Functions of Residues Interacting with the Phosphate Moiety of Nucleotides

A probe such as β-Trp331 adds a new dimension to mutational analysis of functional roles of specific amino acid residues, by allowing investigation of the effects of the various mutations on nucleotide-binding parameters. This approach was used to study three positively charged residues located in the vicinity of the phosphate groups of catalytic-site–bound nucleotide, namely β-Lys155, β-Arg182, and (approaching from the adjacent α subunit) α-Arg376.7 All three residues are essential for catalysis. Results of the binding experiments indicated that none of them is involved in binding of MgADP,46,55,56 but β-Lys155 and β-Arg182 participate in binding of MgATP (Table IV). In general, the contribution of binding energy is more pronounced at site 1 than at site 2 or, especially, site 3. Thus, β-Lys155 and β-Arg182 both appear to be a factor in the differences in affinity of the three catalytic sites for MgATP. Interestingly, α-Arg376 is not significantly involved in MgATP binding (Table IV and Le et al.57), even though the crystal structure suggests otherwise.3,7 According to the structural model, the shortest distance for a possible hydrogen bond between an α-Arg376 nitrogen and a γ-phosphate oxygen would be 3.1 Å, at a favorable angle. For comparison, for β-Lys155 this distance would be 2.7 Å, for β-Arg182 3.1 Å.

TABLE IV
ATP Binding Affinities of F1 with Mutations in the Phosphate-Binding Pocket

              Presence of Mg2+                                No Mg2+
Enzyme        Kd1 (μM)        Kd2 (μM)      Kd3 (μM)          Kd1,2,3 (μM)
Wild-type     0.02 a          1.4           28                53
K155Q         0.8 [9.1] b     60 [9.2]      200 [4.8]         2700 [9.7]
R182Q         0.8 [9.1]       13 [5.5]      120 [3.6]         670 [6.2]
R376Q         0.03 [1.0]      2.0 [0.9]     88 [2.8]          100 [1.6]

a MgATP binding was measured using 2.5 mM Mg2+.
b Given in brackets is the loss of binding energy, ΔΔG° (kJ/mol), between the mutant and the wild-type enzyme at the respective site.

55 S. Nadanaciva, J. Weber, and A. E. Senior, Biochemistry 38, 7670 (1999).
56 S. Nadanaciva, J. Weber, S. Wilke-Mounts, and A. E. Senior, Biochemistry 38, 15493 (1999).
57 N. P. Le, H. Omote, Y. Wada, M. K. Al-Shawi, R. K. Nakamoto, and M. Futai, Biochemistry 39, 2778 (2000).


However, the main function of the three residues, β-Lys155, β-Arg182, and α-Arg376, is clearly stabilization of the catalytic transition state. Using the tight-binding transition state analog MgADP fluoroaluminate as indicator, we showed that formation of the transition state complex requires each one of the three residues.39,55,56 If only one of the positive charges is missing, the transition state complex can no longer be formed. Based on these findings, we developed a model for the natural transition state complex56 that was subsequently supported by X-ray crystallography.3

Mg2+ Coordination

Mutational analysis combined with nucleotide-binding measurements was also used to characterize the coordination of Mg2+ in the catalytic site. Mg2+ has a strong propensity for octahedral coordination. The first coordination shell consists of oxygen atoms from the hydroxyl group of β-Thr156, from the β-phosphate of Mg2+-nucleotide, and, if present, the γ-phosphate of the nucleotide, and three or four water molecules, which were not resolved in the original crystal structure.7 We were able to identify two residues necessary for hydrogen bonding of three of the water molecules, β-Glu185 and β-Asp242.58 Elimination of any of the three functional groups, the β-Thr156 hydroxyl or the β-Glu185 and β-Asp242 carboxylates, resulted in abolition of enzymatic activity and striking changes of the binding behavior (see Table V and Weber et al.61). The strong Mg2+-induced

TABLE V
ATP Binding Affinities of F1 with Mutations Affecting Mg2+ Coordination

              Presence of Mg2+                                No Mg2+
Enzyme        Kd1 (μM)        Kd2 (μM)      Kd3 (μM)          Kd1,2,3 (μM)
Wild-type     0.02 a          1.4           28                53
T156A         120 [21.4] b    120 [11.0]    120 [3.6]         110 [1.8]
E185Q         2.0 [11.3]      2.0 [0.9]     2.0 [−6.5]        3.5 [−6.7]
D242N         17 [16.6]       17 [6.1]      17 [−1.2]         20 [−2.4]

a MgATP binding was measured using 2.5 mM Mg2+.
b Given in brackets is the loss of binding energy, ΔΔG° (kJ/mol), between the mutant and the wild-type enzyme at the respective site. A negative value indicates an increase in binding energy, presumably caused by removal of repulsion between the negatively charged phosphate groups and the negatively charged carboxyl groups in the mutants.

58 J. Weber, S. T. Hammond, S. Wilke-Mounts, and A. E. Senior, Biochemistry 37, 608 (1998).


binding asymmetry vanished, as did the overall preference of the catalytic sites for Mg2+-nucleotide over the uncomplexed form. In the mutant enzymes, all three sites had the same or very similar affinity for nucleotide, independent of the absence or presence of Mg2+. These changes demonstrated that all three residues are involved in Mg2+ coordination. Their contributions are not additive; rather, they act in a cooperative way. Each is essential, whether located in the first coordination shell (β-Thr156) or in the second (β-Glu185 and β-Asp242). Based on the results we proposed a model for coordination of the Mg2+ ion, which was later supported by crystal structures at higher resolution, showing the coordinating water molecules.3,59 The findings emphasize that proper coordination of the Mg2+ ion is an absolute requirement for catalysis. It is actually the quality of Mg2+ coordination that is preeminently responsible for the different affinities of the three catalytic sites for Mg2+-nucleotide and the drop in binding energy of 18 kJ/mol between sites 1 and 3. If Mg2+ coordination is disrupted, the contribution of residues β-Lys155 and β-Arg182 to the differences in affinity (discussed above) is no longer detectable.

Another residue with an essential carboxylate group, β-Glu181, although located at a similar distance from the Mg2+ ion as β-Glu185 and β-Asp242,3,7 is not involved in Mg2+ coordination. Removal of the carboxylate in the E181Q mutant does not affect the Mg2+-nucleotide-binding pattern; the binding asymmetry is preserved.46,58 On the other hand, experiments with MgADP fluoroaluminate showed that β-Glu181 is necessary for formation of the transition state complex,39 by hydrogen bonding the apical water molecule.3,56 This is in agreement with the proposed catalytic role of this residue, namely aligning and polarizing the attacking water molecule during ATP hydrolysis, and stabilizing it as the leaving group during ATP synthesis.7,39,56

Further Applications of β-Trp331 Fluorescence

Due to its large response, β-Trp331 fluorescence is well suited to monitor nucleotide binding in the F1F0 holoenzyme. The additional nine Trp residues in the F0 subcomplex increase the background, reducing the fluorescence quench observed with detergent-solubilized Y331W mutant ATP synthase upon saturation with nucleotide to 15%.60 Nevertheless, the measured nucleotide-binding constants were indistinguishable from

59 C. Gibbons, M. G. Montgomery, A. G. W. Leslie, and J. E. Walker, Nat. Struct. Biol. 7, 1055 (2000).
60 S. Löbau, J. Weber, and A. E. Senior, Biochemistry 37, 10846 (1998).


those determined for F1.60 β-Trp331 fluorescence was also used to determine kinetic nucleotide-binding parameters. Binding of MgATP and MgADP, but not that of MgAMPPNP, to catalytic sites was found to be fast (5 × 10⁵ M⁻¹ s⁻¹).24,60 If recording of extended time courses is necessary, it is advisable to close the shutter in the excitation light path between periods of data collection, to avoid photobleaching.60

Conclusion

We hope we have convinced the reader of this chapter of the usefulness of fluorescent probes in the elucidation of the mechanism of ATP synthase, and that we have encouraged wider use of this approach. No doubt such probes will be used to further dissect the enzymatic mechanism of ATP synthase in the future, addressing questions such as the molecular basis of integration of substrate binding, catalysis, and subunit rotation, and integration of subunit rotation with proton translocation.

Acknowledgment

Supported by NIH Grant GM25349 to A.E.S.

[7] Measurement of Energetics of Conformational Change in Cobalamin-Dependent Methionine Synthase

By Vahe Bandarian and Rowena G. Matthews

Introduction

Conformational changes are essential for the function of a number of modular proteins that catalyze reactions ranging from the biosynthesis of amino acids to the production of antibiotics. The detection and characterization of conformational changes in these proteins, however, are often challenging. Several tools involving site-specific labeling of proteins with fluorescent or paramagnetic reporters have been developed to circumvent the paucity of in situ reporters of conformational changes in proteins. However, when available, naturally occurring ‘‘handles’’ provide the best estimate of conformational change. The cobalamin-dependent methionine synthase from Escherichia coli (MetH) is a 136-kDa (1227 amino acid) protein with a colored chromophore, whose spectral properties are sensitive to changes in conformation of the protein. Therefore, the cobalamin cofactor

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


serves as an embedded reporter of conformational changes of MetH, allowing one to detect the various conformations of the protein, quantify the distribution of the protein among the different forms, and extract the energetics of conformational change under a variety of conditions. Cobalamin-dependent methionine synthase or 5-methyltetrahydrofolate-homocysteine S-methyltransferase (EC 2.1.1.13) catalyzes the production of methionine as shown in Fig. 1. The tightly bound cobalamin cofactor is alternately demethylated by homocysteine (Hcy) forming cob(I)alamin and methionine, and remethylated by methyltetrahydrofolate (CH3-H4folate) to form tetrahydrofolate (H4folate) and to regenerate the methylcobalamin cofactor. The cob(I)alamin form of the cofactor is susceptible to oxidation to the catalytically inactive cob(II)alamin state, which

Fig. 1. The reactions catalyzed by MetH. The primary turnover cycle of MetH, where Hcy is methylated by a methyl group derived from CH3-H4folate, is indicated in black. The oxidation of cob(I)alamin to the catalytically inactive cob(II)alamin state and the subsequent reactivation of the protein are depicted in gray. (Adapted with permission from Bandarian et al.13)


occurs every 100 to 2000 turnovers under microaerophilic conditions.1,2 Reactivation of the protein is accomplished by reduction of the cobalamin by reduced E. coli flavodoxin1,3 and methylation by S-adenosyl-l-methionine (AdoMet).4 If cob(I)alamin generated during turnover were to have access to AdoMet, futile cycling would result. Despite the ability of MetH to bind two alternative methyl donors (CH3-H4folate and AdoMet), the protein discriminates against AdoMet during the catalytic cycle and against CH3-H4folate during reductive methylation.5 The three substrates and the cobalamin cofactor are bound to distinct modules in the 136-kDa MetH polypeptide2,6; from the N- to the Cterminus of MetH these modules bind Hcy, CH3-H4folate, cobalamin, and AdoMet, respectively. Although an X-ray crystal structure of the fulllength MetH polypeptide is not available, structures of several fragments of MetH and of homologous proteins afford a glimpse of the structural elements that comprise MetH. By analogy to betaine homocysteine methyltransferase7 and CH3-H4folate-corrinoid–dependent methyltransferase,8 the Hcy- and CH3-H4folate–binding regions of MetH are expected to be ()8 barrels and in each case, the active site is located at the C-terminal end of the barrel strands. The structure of the cobalamin-binding module9 shows that the cobalamin is bound in a Rossmann fold, which is typically found in nucleotide-binding proteins. The upper face of the cobalamin is ‘‘capped’’ by a 4-helix bundle that precedes the Rossmann domain in this module, preventing access to the upper axial coordination position of the cobalamin where the methyl transfer reactions to or from the cobalamin occur. The lower coordination position of the cofactor, which in solution is occupied by a dimethylbenzimidazole nucleotide substituent of the corrin ring, is replaced by the imidazole side chain of His-759. 
This histidine is hydrogen bonded to the carboxylate side chain of Asp-757 and then to the hydroxyl of Ser-810 (Fig. 2). The AdoMet-binding region adopts a helmet shape and binds AdoMet on the inside of the helmet.10 A 65-kDa

1 K. Fujii, J. H. Gallivan, and F. M. Huennekens, Arch. Biochem. Biophys. 178, 662 (1977).
2 J. T. Drummond, S. Huang, R. M. Blumenthal, and R. G. Matthews, Biochemistry 32, 9290 (1993).
3 K. Fujii and F. M. Huennekens, J. Biol. Chem. 249, 6745 (1974).
4 R. T. Taylor and H. Weissbach, J. Biol. Chem. 242, 1517 (1967).
5 J. T. Jarrett, S. Huang, and R. G. Matthews, Biochemistry 37, 5372 (1998).
6 C. W. Goulding, D. Postigo, and R. G. Matthews, Biochemistry 36, 8082 (1997).
7 J. C. Evans, D. P. Huddler, J. Jiracek, C. Castro, N. S. Millian, T. A. Garrow, and M. L. Ludwig, Structure 10, 1159 (2002).
8 T. Doukov, J. Seravalli, J. J. Stezowski, and S. W. Ragsdale, Structure 8, 817 (2000).
9 C. L. Drennan, S. Huang, J. T. Drummond, R. G. Matthews, and M. L. Ludwig, Science 266, 1669 (1994).
10 M. M. Dixon, S. Huang, R. G. Matthews, and M. Ludwig, Curr. Biol. 4, 1263 (1996).


conformational change in methionine synthase


Fig. 2. Catalytic triad of MetH.

C-terminal fragment of MetH [MetH (649–1227)] spanning the cobalamin- and AdoMet-binding regions catalyzes the formation of methionine when supplied with a 71-kDa N-terminal fragment comprising the Hcy- and CH3-H4folate–binding domains.11 The smaller size of MetH (649–1227) limits the conformations accessible to this fragment, providing a useful probe for studies of conformational equilibria in MetH. The structures of the domains that comprise the MetH molecule suggest that during the catalytic cycle of MetH each substrate-bearing module must approach the cobalamin cofactor as needed and that conformational rearrangements of the domains vis-à-vis the cobalamin cofactor are required for each of the three methyl transfer reactions catalyzed by the protein. In each of these rearrangements, the 4-helix bundle that caps the cobalamin (as observed in the structure of the cobalamin-binding region in isolation) is displaced to permit approach of the required module to the cobalamin. Table I summarizes the minimal set of arrangements of the modules in full-length MetH (2–1227) and the C-terminal fragment MetH (649–1227). The absorbance properties of the cobalamin are sensitive to its coordination state. When the lower axial coordination position of the cobalamin is occupied by Nε of the imidazole side chain of His-759 the cofactor is designated ‘‘base-on’’; when the nitrogen of imidazole dissociates from the

11 V. Bandarian and R. G. Matthews, Biochemistry 40, 5056 (2001).


TABLE I Conformational States of MetH (2–1227) and MetH (649–1227)a

cobalamin, the cofactor is designated ‘‘base-off’’ and distinct spectral changes accompany the change. In the methylcobalamin form, the transition from base-on to base-off is accompanied by a shift in the maximal absorbance of the protein from 525 nm (red) to 450 nm (yellow). In the cob(II)alamin form of methionine synthase the base-off form of the cofactor has a higher molar extinction and a maximum at a shorter wavelength


than the base-on form but the differences are subtle (see later). A growing body of evidence12,13 points to a correlation between the coordination environment of the cobalamin and the conformation of the protein. Observation of the base-off form of the cofactor is thought to signal the presence of a significant population of the cob(II)alamin form of MetH in state 4, the reactivation conformation, where the AdoMet-binding module is positioned above the cobalamin.12 Since cob(I)alamin is preferentially 4-coordinate (base-off), the change in coordination state of the cob(II)alamin from 5- to 4-coordinate would preorganize the cobalamin and facilitate the reduction of the cob(II)alamin cofactor during reactivation. In this chapter, we will discuss the use of ultraviolet (UV)/visible spectroscopy and electron paramagnetic resonance as tools to probe the coordination environment of the cobalamin bound to MetH and to obtain the ratios of base-off and base-on forms of the cofactor. Methods for overexpression and purification of recombinant MetH and for manipulations that are required to place the cofactor in various oxidation states were discussed previously.14 In the present chapter, we discuss methodology for obtaining a C-terminal fragment of MetH that spans amino acids 649–1227 and contains the cobalamin- and AdoMet-binding regions of the protein, spectroscopic methods for detecting base-off cobalamin, and measurement of the thermodynamics of conformational interconversions.

Expression of the C-Terminal Fragment of Cobalamin-Dependent Methionine Synthase and Purification of the Fragment

Principle

Previous studies have shown that the wild-type full-length recombinant MetH can be overexpressed in E. coli XL1-Blue strains when the cultures are grown in glucose M9 minimal medium supplemented with 20 amino acids, thiamine, ampicillin, micronutrients, and cobalamin. Although the fragment can be overexpressed under similar conditions, we have utilized a modified M9 medium that lacks NH4Cl and glucose. In the modified medium, E. coli is grown in the presence of ethanolamine and cobalamin, which is known to activate expression of the adenosylcobalamin-dependent ethanolamine ammonia-lyase15 and may increase the intracellular concentration of cobalamin.

12 J. T. Jarrett, M. Amaratunga, C. L. Drennan, J. D. Scholten, R. H. Sands, M. L. Ludwig, and R. G. Matthews, Biochemistry 35, 2464 (1996).
13 V. Bandarian, M. L. Ludwig, and R. G. Matthews, Proc. Natl. Acad. Sci. USA 100, 8156 (2003).
14 J. T. Jarrett, C. W. Goulding, K. Fluhr, S. Huang, and R. G. Matthews, Methods Enzymol. 281, 196 (1997).


The overproducing strain pMMA-11/XL1-Blue contains nucleotides 1947–3683 of the E. coli metH gene in a pTrc99a (Pharmacia) vector and expression is induced by isopropyl-β-d-thiogalactopyranoside (IPTG). The doubling time of the cultures under these conditions is 2 h.

Reagents

10× buffer concentrate
10,000× micronutrient mixture
10× amino acid supplement
Cobalamin supplement
Ethanolamine stock, 1 M
Ampicillin stock
Tosyl-l-lysine chloromethyl ketone (TLCK), 1 mg/ml in deionized water
Phenylmethylsulfonyl fluoride (PMSF), 20 mg/ml in deionized water
Potassium phosphate buffer (KPi), pH 7.2, 1 M stock in deionized water

Procedures

Preparation of Growth Medium. The 10× buffer concentrate was prepared by dissolving 128 g of Na2HPO4·7H2O, 30 g of KH2PO4 (anhydrous), and 5 g of NaCl in 1 liter of deionized water. The pH of the solution was adjusted to 7.4 with solid NaOH. The 10,000× micronutrient stock was prepared by dissolving 37.1 mg of (NH4)6(Mo)7O24·4H2O, 71.4 mg of CoCl2·6H2O, 25 mg of CuSO4·5H2O, 247 mg of H3BO3, 197.9 mg of MnCl2·4H2O, and 28.8 mg of ZnSO4·7H2O in 1 liter of deionized water. The 10× amino acid supplement was prepared by adding 2.85 g of alanine, 3.37 g of arginine hydrochloride, 2.40 g of asparagine, 2.74 g of potassium aspartate, 0.71 g of cysteine hydrochloride monohydrate, 4.45 g of potassium glutamate, 3.51 g of glutamine, 2.40 g of glycine, 1.68 g of histidine hydrochloride dihydrate, 2.10 g of isoleucine, 4.20 g of leucine, 2.92 g of lysine hydrochloride, 1.19 g of methionine, 2.64 g of phenylalanine, 1.84 g of proline, 42.0 g of serine, 1.91 g of threonine, 0.82 g of tryptophan, 1.45 g of tyrosine, and 2.81 g of valine to 4 liters of deionized water, and was sterilized by autoclaving for 20 min at 120°.
The 10× cobalamin supplement was prepared by combining 2 ml of the 10,000× micronutrient mixture with 4.6 g of MgSO4·7H2O, 0.3 g of CaCl2·2H2O, 18.5 mg of thiamine, and 63 mg of hydroxocobalamin in 250 ml of deionized water; this solution was sterilized by passage through the 0.2-µm filter of a filter sterilization apparatus. The

15 C. M. Blackwell and J. M. Turner, Biochem. J. 176, 751 (1978).


1 M ethanolamine solution was prepared by adding 60 ml of ethanolamine (free base) into 1 M HCl in a total volume of 1 liter. The pH of the solution was adjusted to 7.4 with HCl or NaOH as necessary and the stock solution was sterilized by filtration through a sterile filtration apparatus fitted with a 0.2-µm filter. The ampicillin stock solution was prepared by dissolving 1 g of ampicillin in a total volume of 10 ml of deionized water and was sterilized by passage through a 0.2-µm syringe filter. To prepare 1 liter of bacterial growth medium, 0.1 liter of 10× buffer concentrate, 0.1 liter of 10× amino acid supplement, and 0.76 liter of water were combined and autoclaved. Cobalamin supplement (13.5 ml), ethanolamine (25 ml), and ampicillin (0.1 mg/ml final) were added after cooling the medium to room temperature.

Growth of Bacteria. Cells from an XL1-Blue/pMMA-11 freezer stock (Ref. 11) were plated on a Luria–Bertani agar plate16 containing 0.1 mg/ml ampicillin and incubated at 37°. A single colony from this plate was used to inoculate 5 ml of growth medium that was grown to stationary phase (24 h). A 0.1-liter starter culture was inoculated with the 5 ml of culture, grown until turbid (A420nm < 1, 24 h), and used to inoculate six 3.5-liter flasks each containing 1 liter of M9-ethanolamine growth medium. Cultures were grown to an OD420nm of 1 and expression of the C-terminal fragment was induced by addition of 0.5 mM IPTG. Cultures were allowed to grow 18 h after induction to a final OD420nm of approximately 2. Cells were pelleted by centrifugation at 4440g (15 min at 4°) and stored at −80°.

Purification of Wild-Type and Variant C-Terminal Fragments. All procedures were carried out at 4°. The crude lysates were prepared by suspending cells in 50–100 ml of 0.01 M KPi (pH 7.2) containing the protease inhibitors PMSF (0.1 ml) and TLCK (0.2 ml) and sonicating with six 1-min bursts with a Branson Sonifier 450 (power setting 8), with 0.5- to 1-min periods between bursts.
The lysate was cleared by centrifugation at 64,000g for 30 min and loaded onto a DEAE-Sepharose column (2.5 cm × 20 cm) that had been equilibrated with 0.01 M KPi. The column was washed with 10 ml 0.01 M KPi, and protein was eluted with a 0.5-liter 0.01 M KPi linear gradient. The fractions containing the C-terminal fragment were identified by color and gel electrophoresis, pooled, and dialyzed overnight against 4 liters of 0.01 M KPi. The dialysate was loaded onto a Q-Sepharose column (2.5 cm × 20 cm) that had been equilibrated with 0.01 M KPi. The column was washed with 0.1 liter of 0.01 M KPi and eluted with a 0.5-liter linear gradient of 0.01–0.3 M KPi (pH 7.2). The fractions containing the protein were identified by color and electrophoretic analysis

16 F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, and K. Struhl (eds.), ‘‘Current Protocols in Molecular Biology.’’ John Wiley & Sons, Inc., New York, 2000.


and pooled. The solution was combined, over a 20-min period, with an equal volume of 2.4 M (NH4)2SO4 in 0.025 M KPi (pH 7.2) and loaded immediately onto a phenyl-Sepharose CL-4B column (2.5 cm × 15 cm) that had been equilibrated with 1.2 M (NH4)2SO4 in 0.025 M KPi. The column was washed with 0.1 liter of the same buffer and the protein was eluted with a linear gradient (0.4 liter) of 0.6–0 M (NH4)2SO4 in 0.025 M KPi. The fractions containing the C-terminal fragment were identified by color and pooled on the basis of gel electrophoretic analysis. The protein was dialyzed overnight against 4 liters of 0.01 M KPi and concentrated using an Amicon pressure concentrator (YM-30 membrane) and Centricon concentrators (YM-30). The wild-type protein purifies in the methylcobalamin form and, therefore, this protein was frozen (−80°) and used without further treatment. The Asp-757-Glu variant C-terminal fragment was isolated as a mixture of methylcobalamin, cob(II)alamin, and cob(III)alamin and was reductively methylated in an electrochemical cell using S-adenosyl-l-methionine as methyl donor as described previously for the wild-type protein.14 The protein was desalted by passage over a Sephadex G-50 column (1.2 cm × 20 cm) that had been equilibrated with 0.01 M KPi. The protein-containing fractions were pooled, concentrated in a Centricon concentrator (YM-30), and washed several times with 0.01 M KPi prior to freezing at −80°.

UV/Visible Spectrophotometry of Methylcobalamin-Containing Methionine Synthase

Principle

The UV/visible spectra of MetH are exquisitely sensitive to changes in the coordination environment of the cobalamin cofactor, and cobalamins in virtually all oxidation/alkylation states exhibit distinct absorbance spectra. Figure 3A shows UV/visible spectra obtained with methylcobalamin-containing wild-type MetH and the His-759-Gly variant of MetH, in which the lower axial ligand to the cobalamin has been deleted. The 525-nm absorbance of the base-on form of the cofactor (red) is replaced by a 450-nm peak in the spectrum of the variant (yellow). The spectra shown in Fig. 3A represent two extremes in the ligation state of cobalamin, and these reference spectra can be used to deconvolute experimental spectra to obtain the percentage of the protein containing cobalamin in the base-on and the base-off coordination states. Figure 3B shows a series of spectra, calculated using the reference spectra in Fig. 3A, that would be observed in the presence of varying mixtures of base-on and base-off cobalamin. The spectral features and the extinction coefficient of the enzyme-bound


Fig. 3. The UV/visible spectra of methylcobalamin are sensitive to the coordination environment of the cobalamin. (A) The absorption spectra of wild-type (——) and His-759-Gly (–––) variants of E. coli MetH showing the effect of deleting the lower axial ligand to the cobalamin on the spectrum of the protein. (B) The effect of varying proportions of base-off methylcobalamin on the spectrum of a sample of methylcobalamin. The spectra were calculated by addition of the reference spectra in (A) assuming 0, 20, 40, 60, 80, and 100% base-off methylcobalamin. (C) Overlay of the spectra of methylcobalamin bound to wild-type and Asp-757-Glu variants showing the changes in the spectra of the protein upon introduction of conservative mutations in the catalytic triad.

methylcobalamin are sensitive to changes in the environment of the cobalamin, and care should be exercised when reference spectra are chosen for deconvolution. Figure 3C shows the differences between the UV/visible spectra of the wild-type and Asp-757-Glu variants of MetH and illustrates that a conservative mutation in the catalytic triad of MetH introduces significant differences in the absorbance properties of the variant. Thus, for experiments with the Asp-757-Glu variant, the base-on reference spectrum was obtained from the spectrum of the full-length Asp-757-Glu protein at low temperature. The base-off reference spectrum was obtained from the Asp-757-Glu C-terminal fragment in the presence of saturating concentrations of AdoHcy. Since even at saturating concentrations of AdoHcy the sample is composed of a mixture of base-on and


Fig. 4. Absorption spectra of Asp-757-Glu MetH (649–1227) obtained in the temperature range 40° (——) to 15° (–––); the actual temperatures at which spectra were measured were 40, 30, 25, 20, and 15°. The protein concentration was 3 µM. (Adapted with permission from Bandarian et al.13)

base-off cobalamin, the reference spectrum for the base-off Asp-757-Glu variant that was used in the deconvolutions was obtained by subtracting the contribution of the base-on form (17%) from the spectrum of the variant in the presence of AdoHcy. Temperature is an additional factor that must be considered in the measurement of base-on and base-off equilibria since the absorption spectrum of the C-terminal variant of MetH is sensitive to temperature, favoring base-off cobalamin at high temperatures (37°) and base-on cobalamin at lower temperatures (15°) (Fig. 4).

Procedures

Spectral Deconvolution of Methylcobalamin Enzyme. MetH (5–10 µM) in the methylcobalamin form was placed in 0.1 M KPi buffer, which had been preequilibrated at the appropriate temperature, and the spectrum was obtained after allowing the contents to equilibrate for 2 min. The fractions of base-on and base-off cobalamin present in the sample were extracted by spectral reconstruction using appropriate reference spectra. Briefly, varying amounts of the base-on and base-off reference spectra were combined and the resulting spectra were visually evaluated against the experimentally acquired trace to identify the simulation that best reproduced the positions and amplitudes of the features within the spectral envelope.
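The reconstruction described above models the observed spectrum as a weighted sum of the two reference spectra. The procedure in the text relies on visual comparison; as an illustrative alternative (not the authors' method), the same two-component model can be solved by least squares. The sketch below assumes that all spectra share one wavelength grid and are normalized to the same protein concentration; the function names and the toy spectra are hypothetical.

```python
def mix(ref_on, ref_off, f):
    """Spectrum expected for a sample with fraction f of base-off cobalamin."""
    return [(1.0 - f) * on + f * off for on, off in zip(ref_on, ref_off)]


def base_off_fraction(observed, ref_on, ref_off):
    """Least-squares estimate of f in observed = (1 - f)*ref_on + f*ref_off.

    Assumes the three spectra are sampled at identical wavelengths and are
    scaled to the same protein concentration.
    """
    if not (len(observed) == len(ref_on) == len(ref_off)):
        raise ValueError("spectra must share one wavelength grid")
    diff = [off - on for on, off in zip(ref_on, ref_off)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference spectra are identical")
    numer = sum((obs - on) * d for obs, on, d in zip(observed, ref_on, diff))
    return numer / denom


# Hypothetical toy reference spectra (arbitrary units) at five wavelengths;
# real reference spectra would come from measurements such as those in Fig. 3A.
ref_on = [4.0, 5.5, 7.0, 8.5, 6.0]
ref_off = [9.0, 7.0, 5.0, 3.5, 2.0]

observed = mix(ref_on, ref_off, 0.40)   # simulate a 40% base-off sample
fraction = base_off_fraction(observed, ref_on, ref_off)
print(f"estimated base-off fraction: {fraction:.2f}")
```

A numerical fit of this kind does not remove the need for appropriate reference spectra; a poorly chosen reference biases the estimate just as it would bias a visual comparison.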


Fig. 5. Absorbance changes associated with titration of cob(II)alamin protein with flavodoxin. The initial spectrum before addition of flavodoxin is characteristic of base-on cob(II)alamin (–––), while the spectrum obtained after addition of 90 µM flavodoxin is predominantly due to base-off cob(II)alamin (——). [Reprinted with permission from Hoover et al.,17 Copyright (1997), American Chemical Society.]

UV/Visible Spectrophotometry of Cob(II)alamin-Containing Methionine Synthase

Principle

In contrast to the differences between the spectra of base-on and base-off methylcobalamin (see Fig. 3A), the differences between the spectra of the base-on and base-off forms of cob(II)alamin are subtle. The spectra shown in Fig. 5 were obtained during titration of the protein in the cob(II)alamin oxidation state with flavodoxin. Flavodoxin is the physiological redox partner of MetH in E. coli and stabilizes the base-off conformation of the protein,17 whereas the wild-type full-length MetH is initially predominantly in the base-on conformation. The base-on and base-off cob(II)alamin forms are distinguished based on the differences in their extinction coefficients (εbase-off > εbase-on) and absorbance maxima [λmax ≈ 474 nm for base-on and 464 nm for base-off cob(II)alamin]. Similarity in the overall shape of the spectral envelopes of the base-on and base-off forms precludes quantitative analyses of the absorbance changes to obtain the concentrations of base-on and base-off cobalamin.

17 D. M. Hoover, J. T. Jarrett, R. H. Sands, W. R. Dunham, M. L. Ludwig, and R. G. Matthews, Biochemistry 36, 127 (1997).


Procedures

Preparation of Cob(II)alamin-Containing MetH and Flavodoxin. Methylcobalamin-containing protein was converted into the cob(II)alamin form by homolysis under anaerobic conditions as described previously.14 E. coli flavodoxin was isolated and purified as described previously.18

Titration of Cob(II)alamin-Containing MetH by Flavodoxin. The absorbance spectrum of the endogenous flavin mononucleotide of flavodoxin must be accounted for in the titrations. MetH (10–25 µM) and buffer are placed in the sample chamber of a dual-beam spectrophotometer and an equal volume of buffer is placed in the reference cuvette. Equal aliquots from concentrated solutions of E. coli flavodoxin in its oxidized state (>1 mM) are titrated into both the sample and the reference cuvettes, the resulting solutions are mixed, and difference spectra are acquired. Addition of oxidized flavodoxin to MetH in the cob(II)alamin state results in some formation of the flavodoxin semiquinone, which reacts with molecular oxygen, so the titration must be conducted under anaerobic conditions.17

Electron Paramagnetic Resonance (EPR) Spectroscopy of Cob(II)alamin-Containing Methionine Synthase

Principle

Cobalamin in the 2+ oxidation state is paramagnetic due to the presence of an unpaired electron in the dz2 orbital of the low-spin cobalt atom. The interaction of the unpaired electron with the cobalt nuclear spin (I = 7/2) produces hyperfine splittings that are centered about the g⊥ (2.3) and g∥ (2.0) values of the cobalt. The hyperfine splittings along g⊥ are not resolved; however, eight sets of features, 120 G apart, are observed centered about g∥. The interaction between the unpaired electron spin and the nitrogen atom (I = 1) of the lower axial ligand to the cobalt, when present, leads to a further splitting of each of the eight features into triplets. In the absence of a lower axial ligand, singlets are observed along g∥. The EPR spectra are sensitive to the coordination state of the cobalamin. Figure 6 shows an overlay of cob(II)alamin spectra in solution as well as bound to MetH. As is evident from the traces, the g∥ features are very sensitive to the lower axial ligand. As with deconvolution of UV/visible spectra of cobalamin to obtain the contribution of base-on and base-off states, appropriate reference spectra are required. The EPR spectra of wild-type MetH in the cob(II)alamin state and that of the His-759-Gly variant may be used to estimate the contribution of each form.

18 V. Bianchi, R. Eliasson, M. Fontecave, E. Mulliez, D. M. Hoover, R. G. Matthews, and P. Reichard, Biochem. Biophys. Res. Commun. 197, 792 (1993).
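The line counts quoted in the Principle above follow from the usual first-order rule that a single nucleus of spin I splits an EPR feature into 2I + 1 lines. A short worked count, using only the nuclear spins given in the text (the subscripted symbols are introduced here for illustration):

```latex
\begin{align*}
N_{\mathrm{Co}} &= 2I_{\mathrm{Co}} + 1 = 2\left(\tfrac{7}{2}\right) + 1 = 8
  && \text{(cobalt hyperfine features along } g_{\parallel}\text{)} \\
N_{\mathrm{N}}  &= 2I_{\mathrm{N}} + 1 = 2(1) + 1 = 3
  && \text{(superhyperfine lines from one axial nitrogen)} \\
N_{\text{base-on}}  &= N_{\mathrm{Co}} \times N_{\mathrm{N}} = 8 \times 3 = 24
  && \text{(eight triplets)} \\
N_{\text{base-off}} &= N_{\mathrm{Co}} = 8
  && \text{(eight singlets)}
\end{align*}
```

In practice not all of these lines need be resolved, so the count is a guide to the expected pattern rather than a prediction of the observed number of peaks.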


Fig. 6. Comparison of EPR spectra of cob(II)alamin (Cbl) and cob(II)inamide (Cbi) in solution with wild-type (WT), Ser-810-Ala (SA), Asp-757-Asn (DN), Asp-757-Glu (DE), and His-759-Gly (HG) MetH (2–1227) variants. [Reprinted with permission from Jarrett et al.,12 Copyright (1996), American Chemical Society.]

A major advantage of EPR as a method to detect the ratio of base-on and base-off cobalamin is that one directly observes the molecular interaction between the cobalt and the nitrogen contributed by the lower axial ligand. However, the requirement for the presence of an unpaired spin in the system limits this method to studies with cob(II)alamin-containing protein.

Procedure

Preparation of Cob(II)alamin-Containing MetH for EPR. Cob(II)alamin forms an EPR-active cob(II)alamin/superoxo complex in the presence of oxygen. To avoid the formation of the superoxo complex the protein solution containing 150 µM methylcobalamin in 0.05 M KPi (pH 7.2) was deoxygenated in an anaerobic cuvette19 by repeated exposure to cycles of

19 C. H. Williams, Jr., D. Arscott, R. G. Matthews, C. Thorpe, and K. D. Wilkinson, Methods Enzymol. 62, 185 (1979).


vacuum and equilibration with O2-free argon. The EPR tube was fitted with a rubber septum containing two syringe needles; one of the needles was fitted with a small-diameter polyethylene tube that was used to flush the tube with O2-free gas and the second served as an outlet. The cob(II)alamin protein was transferred to the tube with a syringe and the contents of the tube were frozen in liquid N2. The EPR spectra of cob(II)alamin were examined at X-band frequencies at temperatures […]

[…] 1 if X is a V-type activator. Similarly, if X is a K-type effector, its maximal effect would be given by the ratio of Michaelis constants, which we have previously denoted by the parameter Q (Ref. 16):




$$Q = \frac{K_a^{0}}{K_a^{\infty}} \tag{8}$$

Although this definition of Q at first appears to be inverse to that used for the definition of W, by defining Q in this way we preserve the way in which Q distinguishes activation and inhibition. For a K-type inhibitor Q < 1 and for a K-type activator Q > 1. If there is interest in the effect of X on the kinetic parameter V/K, this information is given by the quantity QW, with QW < 1 implying inhibition and QW > 1 implying activation. It is important to consider at this time how the mechanism in Fig. 1 differs from a model in which an enzyme can assume one of two states, particularly when X is an inhibitor. Although not required in their original formulations,4,5 most of the time, in the name of simplicity, it is assumed that each of the two states binds either X or A, but not both. This is equivalent to Fig. 1 with the segments enclosed in the dashed line omitted, since an enzyme form that cannot bind substrate cannot turn over. Note that neither of our parameters that characterize allosteric behavior, W and Q, can be defined in this abbreviated scheme since each requires the existence of the XEA form. Rather the scheme without XEA describes functionally a simple competitive mechanism despite the structural fact that X and A are envisioned to bind to different sites. Even if an isomerization equilibrium is imposed between the binding of A and the binding of X, the resulting form of the rate equation does not change. Thus an exclusive two-state model omits from consideration any partial kcat effects and presumes that the influence of X on the binding of A will not saturate. Note that the competitive situation is obtained by the entire mechanism in Fig. 1 in the limit when W → 0 and when Q → 0. It follows that an advantage of considering Fig. 1 in its entirety is that the parameters W and Q convey the magnitude as well as the nature of the allosteric effect. Despite the simplicity of the kinetic mechanism in Fig. 1, the steady-state rate equation is reasonably complex, with second-order terms in both A and X.17 Fortunately, the rate equation simplifies considerably if it is assumed that the substrate achieves a rapid equilibrium during the steady state17–19:

$$v = \frac{V^{0}\left(K_{ix}^{0}[A] + QW[A][X]\right)}{K_{ia}^{0}K_{ix}^{0} + K_{ix}^{0}[A] + K_{ia}^{0}[X] + Q[A][X]} \tag{9}$$

17 C. Frieden, J. Biol. Chem. 239, 3522 (1964).
18 J. Botts and M. Morales, Trans. Faraday Soc. 49, 696 (1953).
19 R. A. Alberty and V. Bloomfield, J. Biol. Chem. 238, 2804 (1963).

[9]

analysis of allosteric behavior


where v represents initial velocity. One significant simplification implied by this equation is that both A and X bind hyperbolically. Although the applicability of Eq. (9) to the situation in which the substrate is in equilibrium in the steady state is widely recognized, there is some controversy whether a similar limiting condition regarding the allosteric ligand, X, also leads to an equation of the form of Eq. (9). If one assumes X achieves a rapid equilibrium, then an equation that is first order in A but second order in X results.20 However, if X achieves a true binding equilibrium prior to rate determination, i.e., if the enzyme forms are constrained completely by Eqs. (5c) and (5d), then Eq. (9) results, although the $K_{ia}^{0}$ terms are replaced by $K_a^{0}$. Such a circumstance may arise if binding equilibrium with respect to X is achieved after a slow transient phase in turnover, and the rate after this transient is measured, instead of a true initial velocity. Regardless, the real utility of Fig. 1 as it relates to the mechanism of action of an allosteric ligand is how it leads to the determination of the parameters that describe allosteric action, namely W and Q. Since W is a kinetic parameter, its determination is independent of the validity of the rapid equilibrium assumption. So the real issue is whether the value of Q, determined from the ratio of Michaelis constants as defined in Eq. (8), is equal to the corresponding ratio of thermodynamic dissociation parameters. One reason this equivalence is important is that when true, it gives rise to the principle of reciprocity. When this is the case, combining Eqs. (6) and (8) yields

$$Q = \frac{K_a^{0}}{K_a^{\infty}} = \frac{K_{ia}^{0}}{K_{ia}^{\infty}} = \frac{K_{ix}^{0}}{K_{ix}^{\infty}} \tag{10}$$

The right equality implies that the extent to which the binding of X modifies the dissociation constant for A must be equal to the extent to which the binding of A changes the dissociation constant for X. Although various approaches, such as isotope trapping experiments or use of a viscogen,21 can be utilized to determine whether Q derived from steady-state kinetics experiments is a thermodynamic quantity, Symcox and Reinhart22 have pointed out that the steady-state solution requires that X be in binding equilibrium in the limit of very low A concentration and in the limit of very high A concentration. Consequently they described a method in which the value of Q obtained from Michaelis constants is compared to the value obtained from apparent dissociation parameters for X

20 I. H. Segel, in ‘‘Enzyme Kinetics’’ (I. H. Segel, ed.), p. 838. Wiley, New York, 1975.
21 W. W. Cleland, in ‘‘Techniques of Chemistry’’ (C. F. Bernasconi, ed.), Part I, Vol. VI, p. 791. Wiley, New York, 1986.
22 M. M. Symcox and G. D. Reinhart, Anal. Biochem. 206, 394 (1992).


obtained at low and high concentrations of A, which must give the thermodynamic value of Q. If they are equivalent, it can safely be presumed that Eq. (10) is valid. Fortunately, this seems often to be the case for allosteric enzymes, which tend to bind substrates with relatively low affinity, thus increasing the likelihood that the off-rate constant for substrate is substantially greater than kcat. Enzymes that exhibit positive cooperativity in substrate binding have an even lower initial binding affinity, further increasing the chance that Eq. (10) is applicable. Indeed, Michaelis constants so often closely approximate thermodynamic dissociation parameters that investigators sometimes uncritically presume that to be the case. This, of course, can lead to trouble if subsequent interpretations rely on thermodynamic rather than kinetic arguments.

Mechanistic and Energetic Insights

As we have seen, the commonly invoked exclusive binding two-state model is a limiting case of the more general, yet still simple, mechanism depicted in Fig. 1. The essential difference lies in whether one considers the ternary complex, XEA, capable of forming. As will be elaborated below, when it comes to understanding allosteric inhibition the issue is really ‘‘why is it difficult to form XEA?’’ It is hard to imagine answering this question without considering the nature, or at least the possible nature, of XEA, and therein lies the biggest limitation associated with ignoring the XEA form conceptually when considering the basis of action of an allosteric ligand. It might reasonably be asked next, how can one establish whether XEA can form? The most direct way to address this question is to evaluate Q and see if it has a value different from 0. This is most easily accomplished by plotting, usually on a log–log scale, the apparent Michaelis constant as a function of the concentration of X. Equation (9) predicts that these data will be described by the following function:

$$K_a = K_{ia}^{0}\left(\frac{K_{ix}^{0} + [X]}{K_{ix}^{0} + Q[X]}\right) \tag{11}$$

The analytical geometry associated with such a graph is shown in Fig. 2. A fit of data to Eq. (11) becomes a convenient way to estimate the values of $K_{ia}^{0}$, $K_{ix}^{0}$, and Q. Although Eq. (11) strictly pertains to Fig. 1, as we shall see below, it is also valid in more complex situations in which ligands bind cooperatively, provided the allosteric ligand does not bind cooperatively. An important point that is emphasized in Eq. (11) is that the parameters $K_{ia}^{0}$, $K_{ix}^{0}$, and Q are independent of one another. Thus the action


Fig. 2. Dependence of the dissociation of A on the concentration of X predicted by Fig. 1 when the rapid-equilibrium assumption is valid. The equation describing this curve is Eq. (11) in the text.
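The curve in Fig. 2 can be explored numerically. The sketch below is illustrative only: it assumes that Eq. (9) has the rapid-equilibrium form v = V0(Kix0[A] + QW[A][X]) / (Kia0·Kix0 + Kix0[A] + Kia0[X] + Q[A][X]) and that Eq. (11) is Ka = Kia0(Kix0 + [X]) / (Kix0 + Q[X]). The function names and parameter values are hypothetical, and the apparent maximal velocity is obtained here by rearranging Eq. (9) rather than taken from the text.

```python
def rate_eq9(a, x, v0, kia0, kix0, q, w):
    """Initial velocity from the rapid-equilibrium rate equation, Eq. (9)."""
    numerator = v0 * (kix0 * a + q * w * a * x)
    denominator = kia0 * kix0 + kix0 * a + kia0 * x + q * a * x
    return numerator / denominator


def apparent_ka_eq11(x, kia0, kix0, q):
    """Apparent Michaelis constant for A at a fixed [X], Eq. (11)."""
    return kia0 * (kix0 + x) / (kix0 + q * x)


def apparent_vmax(x, v0, kix0, q, w):
    """Velocity at saturating [A] for a fixed [X], by rearranging Eq. (9)."""
    return v0 * (kix0 + q * w * x) / (kix0 + q * x)


# Hypothetical values for a K-type inhibitor (Q < 1) with no V-type effect (W = 1).
params = dict(v0=1.0, kia0=0.05, kix0=0.1, q=0.01, w=1.0)
x = 2.0

ka = apparent_ka_eq11(x, params["kia0"], params["kix0"], params["q"])
vmax = apparent_vmax(x, params["v0"], params["kix0"], params["q"], params["w"])
v_half = rate_eq9(ka, x, **params)   # velocity when [A] equals the apparent Ka

print(f"apparent Ka at [X] = {x}: {ka:.3f}")
print(f"v at [A] = Ka: {v_half:.3f} (half of apparent Vmax = {vmax:.3f})")

# Limiting plateaus of the log-log plot: Kia0 at low [X], Kia0/Q at high [X].
ka_low = apparent_ka_eq11(0.0, params["kia0"], params["kix0"], params["q"])
ka_high = apparent_ka_eq11(1e12, params["kia0"], params["kix0"], params["q"])
```

The two plateaus differ by the factor Q, which is why a plot of this kind reports the coupling directly, while the position of the transition along the [X] axis reflects the binding affinity of X.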

of an allosteric ligand, or more precisely the interaction between an allosteric ligand and substrate, is independent of the binding affinity either ligand displays for the free enzyme alone. This point is illustrated in the data shown in Fig. 3. The response of phosphofructokinase to the inhibitor PEP changes with pH such that at lower pH the affinity of the enzyme for PEP increases, yet the coupling free energy between PEP and the substrate Fru-6-P decreases. In other words, the inhibitor binds more tightly, yet the inhibitor inhibits less effectively. When Eq. (10) is valid, it is often convenient to express the value of Q as a coupling free energy, $\Delta G_{ax}$, by applying Eq. (12):

$$\Delta G_{ax} = -RT \ln(Q) \tag{12}$$

The coupling free energy, or free energy of interaction, was introduced by Weber9,10 to illuminate the principle of linkage that had been developed by Wyman.6–8 Note that the coupling free energy is a standard free energy, although the superscript "0" is often dropped from the designation. The coupling free energy depicts the nature of the allosteric effect by its sign, with negative values for activation and positive values for inhibition. The absolute value of ΔGax conveys the strength of the interaction, with an equivalent scale for either activation or inhibition. When ΔGax = 0, there is no allosteric effect on binding.
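As a rough numerical illustration of Eqs. (11) and (12), the following Python sketch evaluates the apparent Michaelis constant over a range of [X] and converts Q into a coupling free energy. The function names, the parameter values, and the temperature of 298.15 K are assumptions chosen only for illustration; they are not taken from the data in Fig. 3.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def apparent_ka(x, kia0, kix0, q):
    """Eq. (11): apparent Michaelis constant for A at allosteric ligand concentration x."""
    return kia0 * (kix0 + x) / (kix0 + q * x)


def coupling_free_energy(q, temp_k=298.15):
    """Eq. (12): coupling free energy in kcal/mol from the coupling parameter Q."""
    return -R_KCAL * temp_k * math.log(q)


# Hypothetical inhibitor (Q < 1); concentrations in mM, values illustrative only.
KIA0, KIX0, Q = 0.05, 0.2, 0.01

for x in (0.0, 0.02, 0.2, 2.0, 20.0, 2000.0):
    print(f"[X] = {x:8.2f} mM   Ka = {apparent_ka(x, KIA0, KIX0, Q):.4f} mM")

# The ratio of the low-[X] plateau to the high-[X] plateau recovers Q.
q_from_plateaus = apparent_ka(0.0, KIA0, KIX0, Q) / apparent_ka(1e9, KIA0, KIX0, Q)
print(f"Q from plateaus      = {q_from_plateaus:.4f}")
print(f"coupling free energy = {coupling_free_energy(Q):+.2f} kcal/mol")
```

With Q < 1 the computed free energy is positive (inhibition), and with Q > 1 it would be negative (activation), in keeping with the sign convention described above. Changing KIX0 shifts the curve along the [X] axis without changing the plateau ratio, which is the independence of affinity and coupling that Eq. (11) emphasizes.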

Fig. 3. Influence of the allosteric inhibitor phospho(enol)pyruvate (PEP) on the Michaelis constant, Ka, for fructose 6-phosphate (Fru-6-P) for phosphofructokinase from B. stearothermophilus (BsPFK) as a function of pH as indicated. Independent experiments have confirmed that the rapid equilibrium assumption is appropriate under these conditions. The data at each pH are shown fit to Eq. (11) described in the text. Note how the affinity for PEP increases at lower pH, yet the coupling between PEP and Fru-6-P decreases, emphasizing the independence of these two features of allosteric action. Note also that Eq. (11) fits the data well despite the fact that BsPFK is a tetramer to which PEP binds cooperatively.

The identities in Eq. (10) indicate that ΔGax is the standard free energy for the following disproportionation equilibrium:

$$EA + XE \rightleftharpoons XEA + E \qquad (13)$$

Upon reflection, this equilibrium reveals another important conceptual distinction between this view of allosteric behavior and one that presupposes that X and A bind exclusively to two different enzyme forms. Since ΔGax conveys quantitatively both the nature and the magnitude of the allosteric effect, understanding the basis for that effect basically involves understanding why the disproportionation equilibrium achieves its particular value. Note that the EA and XE enzyme forms occur on the same side of the reaction. Their chemical potentials could be vastly different from one another but that would not, in and of itself, reveal anything about the poise of the equilibrium. Rather it is the contrast of the sum of these two enzyme forms to the sum of the ternary complex and free enzyme forms that establishes the value of ΔGax. In particular we once again see the importance of understanding the properties of the ternary complex, XEA.

The true issue might be clarified by the following elaboration. ΔGax can be determined from the difference in free energy of formation of the products minus that of the reactants of the equilibrium in Eq. (13):

$$\Delta G_{ax} = (G_{XEA} + G_E) - (G_{EA} + G_{XE}) \qquad (14)$$

Simple algebra leads to the following equivalent expression:

$$\Delta G_{ax} = (G_{XEA} - G_E) - [(G_{EA} - G_E) + (G_{XE} - G_E)] \qquad (15)$$

To simplify the notation, let us designate the differences in the parentheses with the lower case delta, δ:

$$\Delta G_{ax} = \delta G_{XEA} - (\delta G_{EA} + \delta G_{XE}) \qquad (16)$$
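Before Eq. (16) is put into words, the equivalence of Eqs. (14) and (16) can be confirmed with arbitrary numbers. The short Python sketch below does so; the free energies of formation are hypothetical values picked only to give an additive, an antagonistic, and a synergistic case, and the function names are likewise illustrative.

```python
def coupling_eq14(g_e, g_ea, g_xe, g_xea):
    """Eq. (14): products minus reactants of the disproportionation in Eq. (13)."""
    return (g_xea + g_e) - (g_ea + g_xe)


def coupling_eq16(g_e, g_ea, g_xe, g_xea):
    """Eq. (16): the same quantity written as perturbations relative to free enzyme."""
    d_xea = g_xea - g_e  # perturbation when both ligands are bound
    d_ea = g_ea - g_e    # perturbation when only A is bound
    d_xe = g_xe - g_e    # perturbation when only X is bound
    return d_xea - (d_ea + d_xe)


# Hypothetical free energies of formation (kcal/mol), for illustration only.
G_E, G_EA, G_XE = -100.0, -106.0, -105.0

for label, g_xea in (("additive", -111.0), ("antagonistic", -109.0), ("synergistic", -113.0)):
    dg14 = coupling_eq14(G_E, G_EA, G_XE, g_xea)
    dg16 = coupling_eq16(G_E, G_EA, G_XE, g_xea)
    print(f"{label:12s}  Eq. 14: {dg14:+.1f}   Eq. 16: {dg16:+.1f} kcal/mol")
```

In the additive case the ternary complex accommodates both perturbations in full and the coupling free energy is zero, however large the individual perturbations are.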

Thus, for example, δG_XEA is equal to the difference between the free energy of formation of XEA and that of free enzyme. In other words, δG_XEA is equal to the perturbation in the free energy of formation created by the binding of both ligands simultaneously. Equation (16) indicates, therefore, that the coupling free energy is derived from the difference between the perturbation of the free energy of formation when both ligands bind simultaneously versus the sum of the perturbations created by the binding of each ligand individually. The implication is that the binding of A and X can each do very different things to the structure of the enzyme, but if the enzyme can accommodate both changes when the ligands bind together then δG_XEA = δG_EA + δG_XE, and there will be no allosteric interaction. Rather, the degree to which the binding of both ligands simultaneously mitigates the effects each ligand would otherwise make on its own is the essence of the cause of the allosteric influence. Clearly the nature of the XEA enzyme form is the key to understanding the basis for allosteric interactions, and the focus of interest should be on those areas in which there is an overlapping influence of the binding of X and A.

Multisubstrate Enzymes

Most allosteric enzymes have more than one substrate. If the kinetic mechanism is random, one can usually consider the effects of the allosteric ligand on one substrate at a time while keeping the other substrates at saturating levels. Figure 1 then applies, with "E" representing the enzyme–substrate complex associated with the nonvariable substrate. It is important to verify that the rapid equilibrium assumption is valid for each variable substrate. For example, we have found that PFK from E. coli behaves in rapid equilibrium fashion with regard to Fru-6-P at saturating ATP concentrations, but that ATP is not at rapid equilibrium at saturating Fru-6-P concentrations.23

One can force a multisubstrate enzyme to operate at binding equilibrium if the nonvaried substrates are very low in concentration. For a random Bi–Bi mechanism, for example, the Ka approaches Kia as the concentration of B approaches 0. Consequently Eq. (11) must pertain if B is held substantially below its Michaelis constant. Of course, the variable substrate will be interacting with free enzyme in this circumstance, not an enzyme–substrate complex.

An exception occurs if the kinetic mechanism is ordered. The second substrate to bind can still be evaluated at saturating A concentration as if it were a random mechanism. However, Ka for the first substrate to bind with B saturating is equal to the second-order rate constant for A combining with E. So Ka will never be equal to a thermodynamic dissociation parameter in such a case. Ka will still approach Kia at low B concentrations. Kb may or may not approach Kib at low concentrations of A, depending on whether the rapid equilibrium condition is met. The situation is not forced as it is for a random mechanism.

Oligomeric Enzymes

Unfortunately, most oligomeric enzymes are too complex to be evaluated at a level of detail comparable to that to which the single-substrate–single-modifier mechanism can be. This fact has given great impetus to the utilization of two-state models in an effort to provide a means of simplifying an otherwise prodigious analysis. However, this approach often sacrifices important principles of linkage, such as reciprocity of ligand-binding effects, quantifying separately parameters related to binding and coupling, and explicitly considering the properties of the ternary complex, that potentially provide much more insight into the basis for the allosteric properties being manifest.

Alternatively, one can systematically approach the analysis of allosteric enzymes that are more structurally and functionally complex so that these model-independent parameters can be determined or estimated. The basic principles of this approach are revealed by a consideration of a symmetrical dimeric allosteric enzyme with two active sites and two allosteric sites. Such a system is tractable to complete analysis provided the rapid-equilibrium assumption is valid.24 We will further restrict our discussion by presuming that the allosteric ligand does not affect kcat. If X does influence kcat, apparent dissociation parameters can still be estimated at any particular concentration of X and the following analysis will still be valid.

23 J. L. Johnson and G. D. Reinhart, Biochemistry 31, 11510 (1992).
24 G. D. Reinhart, Biophys. Chem. 30, 159 (1988).

The first complication that is often encountered with a multimeric enzyme is cooperativity of ligand binding due to multiple binding sites for a given ligand. Although negative as well as positive cooperativity have been observed, positive cooperativity, in which the binding affinity of a ligand increases with its extent of saturation, is by far the most common. Positive cooperativity gives rise to the characteristic sigmoidal binding isotherm when plotted on linear axes. An apparent dissociation parameter, K0.5, can be determined from the concentration of ligand required to produce half saturation (one-half Vmax if the ligand is a substrate). K0.5 is equal to the geometric mean of the individual dissociation constants for the individual ligand-binding interactions. Consequently, for a dimeric enzyme, K0.5 pertaining to substrate is equal to

$$K_{0.5} = (K_1 K_2)^{0.5} \qquad (17)$$

where K1 and K2 represent the dissociation constants for the first equivalent of ligand and the second equivalent of ligand, respectively. However, in our case of a symmetrical dimer, the binding sites are initially equivalent. K2 ≠ K1 because of an allosteric interaction between the two like-binding sites that is no different in principle from the heterotropic interaction between A and X discussed in the single-substrate–single-modifier mechanism above. By analogy to this equation, if we let $K_1 = K_{ia}^0$, K2 would therefore be given by

$$K_2 = \frac{K_1}{Q_{aa}} \qquad (18)$$

where Qaa is equal to the coupling parameter between the two bound equivalents of A. Next the question presents itself, how can we determine the value of Qaa? For this we can turn to the Hill coefficient (nH), which is the slope of the tangent to the curve at mid saturation of a plot of log [v/(Vmax − v)] versus log [A]. Qaa is then given by24

$$n_H = \frac{2\sqrt{Q_{aa}}}{1 + \sqrt{Q_{aa}}} \qquad (19a)$$

$$Q_{aa} = \left( \frac{n_H}{2 - n_H} \right)^2 \qquad (19b)$$
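The relationships in Eqs. (17) through (19) are easy to explore numerically. The following Python sketch is one way to do so for the symmetrical dimer; the function names and the test values are illustrative assumptions rather than anything prescribed in the text.

```python
import math


def hill_from_qaa(qaa):
    """Eq. (19a): Hill coefficient of a dimer from the homotropic coupling parameter."""
    root = math.sqrt(qaa)
    return 2.0 * root / (1.0 + root)


def qaa_from_hill(n_h):
    """Eq. (19b): homotropic coupling parameter from a measured Hill coefficient."""
    if not 0.0 < n_h < 2.0:
        raise ValueError("for a dimer the Hill coefficient must lie between 0 and 2")
    return (n_h / (2.0 - n_h)) ** 2


def k_half(k1, qaa):
    """Eqs. (17) and (18): half-saturation constant from K1 and Qaa."""
    k2 = k1 / qaa              # Eq. (18)
    return math.sqrt(k1 * k2)  # Eq. (17)


for qaa in (0.25, 1.0, 9.0, 100.0, 1e6):
    print(f"Qaa = {qaa:10.2f}   nH = {hill_from_qaa(qaa):.3f}   "
          f"K0.5/K1 = {k_half(1.0, qaa):.4f}")
```

A value of Qaa = 1 returns nH = 1 (no cooperativity), values below 1 give nH < 1 (negative cooperativity), and nH approaches but never reaches 2 as Qaa grows, which is why the inverse function rejects nH = 2.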

Note that the Hill coefficient is bounded by the stoichiometry of binding, 2 in this case, in the limit of an infinitely high value of the coupling parameter, Qaa. Equivalently, a value of 2 will be obtained only in the case of a very large negative value for the homotropic coupling free energy, calculated from Qaa according to Eq. (12).

Of course, similar relationships pertain to the potentially cooperative interactions of X with its two sites. A practical challenge comes from estimating the cooperativity of the binding interaction if an independent assay of the binding of X is not available. This cooperativity can be assessed with steady-state kinetics by titrating X at fixed concentrations of substrate. Data can be difficult to obtain at very low concentrations of substrate, because of the low activity and limited extent of reaction over which to estimate a rate, and at very high concentrations of substrate, because of the limited extent of inhibition. Nonetheless, with care, data can be obtained that will allow the determination of Qxx according to the relationships presented in Eq. (19).

The heterotropic interactions can be assessed by evaluating the effect of X on K0.5. These data can be plotted in a manner similar to that shown in Fig. 2, and if a plateau value at high concentrations of X is sufficiently defined, an apparent coupling parameter, Q, can be obtained from the ratio of the two plateau values as indicated in Fig. 2. Often, however, a plateau value is not fully established, leading to a desire to fit the data to a functional dependence that will provide an extrapolated estimate of the coupling. For our symmetrical dimer, the dependence of K0.5 on the concentration of X is given by24

$$K_{0.5} = K_{0.5}^0 \left( \frac{(K_{ix}^0)^2 + 2K_{ix}^0[X] + Q_{xx}[X]^2}{(K_{ix}^0)^2 + 2K_{ix}^0 Q \left( Q_{xx}/Q_{xx/aa} \right)^{0.5}[X] + Q^2 Q_{xx}[X]^2} \right)^{0.5} \qquad (20)$$

where Qxx/aa is equal to the value of Qxx in the limit when A is saturating. It is significant to note that if the enzyme displays no homotropic cooperativity in the binding of allosteric ligand with or without substrate bound, i.e., if Qxx = Qxx/aa = 1, then Eq. (20) reduces to Eq. (11).
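The reduction of Eq. (20) to Eq. (11) can be checked numerically. The Python sketch below assumes that Eq. (20) has the form coded in `k_half_dimer`, with the numerator and denominator being the binding polynomials for X in the absence and at saturation of A. The function names and parameter values are hypothetical and serve only as an illustration.

```python
import math


def apparent_ka(x, kia0, kix0, q):
    """Eq. (11): single-substrate, single-modifier dependence of Ka on [X]."""
    return kia0 * (kix0 + x) / (kix0 + q * x)


def k_half_dimer(x, k_half0, kix0, q, qxx, qxx_aa):
    """Eq. (20), as assumed here: K0.5 of a symmetrical dimer as a function of [X]."""
    numerator = kix0 ** 2 + 2.0 * kix0 * x + qxx * x ** 2
    denominator = (kix0 ** 2
                   + 2.0 * kix0 * q * math.sqrt(qxx / qxx_aa) * x
                   + q ** 2 * qxx * x ** 2)
    return k_half0 * math.sqrt(numerator / denominator)


# Illustrative parameters: an inhibitor with Q = 0.1.
K_HALF0, KIX0, Q = 1.0, 0.5, 0.1

for x in (0.0, 0.05, 0.5, 5.0, 500.0):
    no_coop = k_half_dimer(x, K_HALF0, KIX0, Q, qxx=1.0, qxx_aa=1.0)
    coop = k_half_dimer(x, K_HALF0, KIX0, Q, qxx=30.0, qxx_aa=5.0)
    eq11 = apparent_ka(x, K_HALF0, KIX0, Q)
    print(f"[X] = {x:7.2f}   Eq. 20 (Qxx = 1): {no_coop:.4f}   "
          f"Eq. 11: {eq11:.4f}   Eq. 20 (Qxx = 30): {coop:.4f}")
```

In this form the high-[X] plateau equals the zero-[X] value divided by Q whatever the homotropic couplings are; cooperativity in the binding of X changes only the shape of the transition between the two plateaus.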
This will generally be true of higher-order oligomers as well: in the absence of cooperative binding of the allosteric ligand, Eq. (11) will describe how K0.5 varies with the concentration of the allosteric ligand. We have often found that Eq. (11) provides a satisfactory fit to the data even when the allosteric ligand binds cooperatively, as is evident in Fig. 3. Simulations indicate that with large values of Qxx (i.e., corresponding to ΔGxx = −2 kcal/mol), Eq. (11) allows the estimate of Q with an error of less than 20%.

What is the meaning of the apparent heterotropic coupling, Q, in the case of a symmetrical dimer? We must first appreciate that two copies of two different heterotropic couplings exist in principle, Qax1 and Qax2. The difference is easy to visualize if the binding sites are completely contained within each subunit. In that case, we can consider that the binding of substrate to a particular binding site might be affected by the binding of the allosteric ligand to either the site within that subunit or to the site in the other subunit, leading to a consideration of the intrasubunit coupling, Qax1, and the intersubunit coupling, Qax2. Clearly the magnitude of these two couplings could be the same, but in general they need not be. In the completely general case, in which the enzyme displays both homotropic and heterotropic allosteric effects, Q is given by the following24:

$$Q = Q_{ax1} Q_{ax2} \left( \frac{Q_{xx/a}}{Q_{xx}} \right) \left( \frac{Q_{aa/xx}}{Q_{aa}} \right)^{0.5} \qquad (21)$$

where Qxx/a is equal to Qxx when a single equivalent of A is bound and Qaa/xx is equal to Qaa when X is saturating. Equation (21) illustrates several principles. First, in the absence of all homotropic cooperativity, the apparent heterotropic coupling parameter is equal to the product of each individual, unique heterotropic coupling parameter. In terms of free energy, the free energy of interaction in an oligomer is equal to the sum of all the individual, unique coupling free energies in the absence of homotropic cooperativity. Second, this principle of additivity will also be true in the presence of homotropic cooperativity if the magnitude of the homotropic cooperativity does not change as the concentration of the heterotropic ligand changes. In other words, if the Hill coefficient for substrate binding does not change with allosteric ligand concentration, even if the Hill coefficient is different from 1, then it will not impact the magnitude of the apparent heterotropic coupling. The same principle holds for the Hill coefficient evident in the binding of the allosteric ligand. Third, if Eq. (11) does a satisfactory job of fitting the dependence of K0.5 on [X], then only the cooperativity in substrate binding will potentially enter into the magnitude of Q. The ratio of Qaa/xx to Qaa can be assessed from the Hill coefficients associated with substrate binding at low and saturating concentrations of X as just described. The existence of homotropic coupling parameters is not the only origin of nonhyperbolic binding profiles in oligomeric allosteric enzymes. As Weber10 originally pointed out, multiple heterotropic couplings can give rise to apparent cooperative behavior even if all of the homotropic coupling free energies are equal to 0. The cooperative behavior is positive (i.e., sigmoidal) regardless of whether the heterotropic couplings are positive or negative and occurs specifically when the heterotropic ligand is only partially saturating. 
Consequently we have previously termed this phenomenon "subsaturating heterotropic cooperativity."24 This effect is directly related to the principle of reciprocity discussed above. Consider first the case when X is an activator, i.e., X increases the binding affinity for A, and A increases the binding affinity for X. At a concentration of X that leads to only partial saturation of X, A will begin to bind with a certain partially enhanced affinity. But as A binds, the binding affinity of the enzyme for X will also be increased, leading to a greater degree of saturation of X. This in turn further increases the binding affinity of A, leading to a positively cooperative binding titration. Alternatively, if X antagonizes the binding of A, and X is only partially saturating, when A initially binds to the enzyme population, its affinity will be diminished to some extent relative to binding in the absence of X. As A is titrated into the system, it will bind and in so doing diminish the binding affinity of X, thus decreasing the degree of saturation of X. With less X bound, A will proceed to bind further with higher affinity, producing a positively cooperative binding isotherm for A.

Subsaturating heterotropic cooperativity will not be observed either in the absence of X or when the concentration of X is high enough that the enzyme remains fully saturated with X regardless of the degree of saturation of A. Since this latter situation is the state that defines $K_{ia}^{\infty}$, this subsaturating effect should not influence the value of the apparent heterotropic coupling that is measured. However, in practice care must be taken to make sure, when determining the value for Qaa/xx, that one is extrapolating to a limiting value for the Hill coefficient approached when X is very large.

Subsaturating cooperativity will tend to obscure the value of Qxx/a appearing in Eq. (21) should the allosteric ligand exhibit homotropic cooperativity that is influenced by the binding of A. Establishing the value of the homotropic coupling when only a single equivalent of A is bound would be problematic in any event.
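The subsaturating effect described above can be illustrated with a deliberately simplified model, sketched here in Python. The assumptions are ours and are not taken from the cited derivation: a symmetrical dimer with two A sites and two X sites, no homotropic coupling of any kind, heterotropic couplings that combine multiplicatively with no higher-order terms, and a fixed free concentration of X expressed relative to its dissociation constant for free enzyme. Under these assumptions the apparent homotropic coupling for A follows from the statistical weights of the forms with zero, one, and two equivalents of A bound, and the Hill coefficient follows from Eq. (19a).

```python
import math


def apparent_hill(x_rel, qax1, qax2):
    """Hill coefficient for A at a fixed relative concentration of X (x_rel = [X]/Kix0).

    qax1 and qax2 are the intrasubunit and intersubunit heterotropic couplings.
    """
    z0 = (1.0 + x_rel) ** 2                                # no A bound
    z1 = (1.0 + qax1 * x_rel) * (1.0 + qax2 * x_rel)       # A bound at one given site
    z2 = (1.0 + qax1 * qax2 * x_rel) ** 2                  # A bound at both sites
    root_q = math.sqrt(z0 * z2) / z1                       # square root of apparent Qaa
    return 2.0 * root_q / (1.0 + root_q)                   # Eq. (19a)


for label, q in (("inhibitor, Qax = 0.1", 0.1), ("activator, Qax = 10", 10.0)):
    print(label)
    for x_rel in (0.0, 0.1, 1.0, 10.0, 100.0, 1e6):
        print(f"   [X]/Kix0 = {x_rel:10.1f}   nH = {apparent_hill(x_rel, q, q):.3f}")
```

In this sketch the Hill coefficient rises above 1 at intermediate [X] for an inhibitor and an activator alike and returns to 1 both without X and at saturating X. It stays at 1 throughout if either coupling equals 1, so in this simplified model both the intrasubunit and the intersubunit coupling must be present for the effect to appear.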
Usually the value of Qxx/a must be approximated by the value of the square root of Qxx/aa. This approximation assumes that the influence of a single equivalent of A is half that of both equivalents. The error introduced if this assumption is not completely correct is likely to be small.

Summary

Even complex allosteric enzymes can be analyzed in a manner guided by the principles of thermodynamic linkage. Apparent coupling parameters between pairs of ligands can be determined regardless of the oligomeric nature of the enzyme. Except in the simplest of cases the associated coupling free energies will be equal to the sum of the individual unique coupling free energies that exist between the multiple binding sites for each ligand. If the coupling free energy is divided by the stoichiometry, the average coupling free energy will be obtained. The overall value will include a contribution from homotropic interactions only if the degree of cooperativity changes in the presence of the other ligand. The sum or average nature of these coupling parameters compromises their utility only to the same extent as does the fact that K0.5 represents an average dissociation constant.

The coupling free energy, even if a composite of more fundamental couplings, still quantitatively describes both the nature and magnitude of the allosteric effect, and as such it provides a means of monitoring how another experimental variable, for example, the introduction of a site-specific mutation, might alter the action of an allosteric ligand. Thus, even without a precise understanding of the individual interactions that comprise the coupling free energy, it provides a model-independent basis for interpreting experiments. One might draw an analogy to the value of the kinetic parameter V/K in a kinetic analysis of a nonallosteric enzyme. Even before (or without) knowing the precise kinetic mechanism for an enzyme, the meaning of V/K is clear, and its evaluation assists in the eventual determination of that mechanism.

Other principles of linkage are also useful to keep in mind, in particular the principle of reciprocity, the principle of the independence of binding affinity and allosteric efficacy, and the principle that ultimately it is the poise of a disproportionation equilibrium, such as Eq. (13), that must be understood. In particular, the properties of the ternary complex are the key to understanding any structural basis for the allosteric function. The ternary complex is too important to leave its true nature to mere presumptions, or worse yet to the seemingly irrelevant ambiguity of a two-state model.
Acknowledgments

I would like to thank the many students and postdoctoral fellows who have contributed through the years to these ideas and to the experimental observations that have been discussed. Special thanks are extended to Dr. Valarie Tlapak-Simmons who obtained the data shown in Fig. 3. This work has been supported by the National Institutes of Health grant GM33216.

[10] The Immobilized Template Assay for Measuring Cooperativity in Eukaryotic Transcription Complex Assembly

By Kristina M. Johnson, Jin Wang, Andrea Smallwood, and Michael Carey

Introduction

Assembly of the 3.5-MDa RNA polymerase II preinitiation complex (PIC) on promoter DNA is essential for transcription. Transcriptional activators initiate a cascade of events that results in PIC formation and have the capacity to regulate PIC assembly in a cell-specific and temporal manner. The activation process begins when transcriptional activators bind to their specific binding sites on the DNA within the promoters they regulate.1,2 Typically, a limited number of activators work combinatorially in the context of enhanceosomes to execute a larger repertoire of regulatory decisions.3 Protein–protein interactions between the multiple activators that form an enhanceosome,3 between the activators and the PIC, and interactions within the PIC all have the effect of stabilizing the final complex. These interactions define cooperative binding in PIC assembly. Cooperative assembly of the PIC by multiple activators leads to synergistic transcription. Transcriptional synergy manifests itself in the ability of the cell to respond rapidly to outside signals and to integrate multiple signaling pathways controlled by individual activators. Due to the principles of cooperativity and synergy, small changes in concentration of a few key transcription factors can have a large net effect on PIC formation and transcription.3

The preinitiation complex contains all of the factors required for transcription initiation. The PIC assembly includes but is not limited to (1) general transcription factors (GTFs), which include TBP of TFIID, TFIIA, TFIIB, TFIIE, TFIIH, and TFIIF,1 (2) coactivators, including the 25 subunit mediator complex (Med)4 and the TAFs of TFIID,5 (3) various chromatin remodeling factors,6,7 and (4) RNA polymerase II (Pol II).1

1 N. A. Woychik and M. Hampsey, Cell 108, 453 (2002).
2 G. Orphanides and D. Reinberg, Cell 108, 439 (2002).
3 M. Carey, Cell 92, 5 (1998).
4 M. Boube, L. Joulia, D. L. Cribbs, and H. M. Bourbon, Cell 110, 143 (2002).
5 S. Hahn, Cell 95, 579 (1998).
6 G. J. Narlikar, H. Y. Fan, and R. E. Kingston, Cell 108, 475 (2002).
7 P. J. Horn and C. L. Peterson, Science 297, 1824 (2002).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

An important mechanistic insight to emerge from in vivo DNA binding analyses in yeast is that PICs assemble in a cooperative, concerted manner in response to activators. Cooperative binding is indicated because mutations in individual Med or GTF subunits have been observed to prevent assembly of the entire PIC in vivo.8,9 In vitro studies additionally indicate that the subcomplexes of the PIC can assemble in a cooperative manner.10 Our laboratory has demonstrated that part of the concerted PIC assembly in vivo is probably due to cooperative interactions between TFIID and Med; TFIID and Med stimulate each other's binding to promoter DNA in vitro.11 Collectively, these in vivo and in vitro data suggest that cooperative interactions between the various components of the PIC are a fundamental aspect of PIC assembly.

Conventional methods for studying the problem of cooperative DNA binding by relatively small proteins include DNase I footprinting and electrophoretic mobility shift assay (EMSA) analysis. Early studies on cooperative binding of the bacteriophage λ repressor protein to operator sites utilized DNase I footprinting. By comparing the affinity of repressor for a 32P end-labeled DNA fragment containing two binding sites with either binding site alone, Ptashne and colleagues showed that λ repressor binds DNA cooperatively.12 Quantitative DNase I footprinting methods for extracting thermodynamic parameters to model cooperative binding of λ repressor were later described by Ackers and colleagues.13 DNase I footprinting can also be used to study larger protein–DNA complexes where the different components bind DNA directly.
For example, recruitment of RNA polymerase by activators in prokaryotes has been studied by DNase I footprinting.14 Additionally, both DNase I and EMSA assays were utilized in studies of cooperative binding between activators and TFIID in eukaryotes.15–17 Work by Sharp and colleagues exemplifies the use of these techniques to study stepwise assembly of general transcription factors in the absence of activators.18 While often used together, EMSA and DNase I footprinting offer distinct advantages. DNase I footprinting provides specific information regarding the location of DNA binding; EMSA can yield additional information when components do not bind DNA directly. However, both of these techniques are limited with respect to use with large complexes.

8 L. Kuras and K. Struhl, Nature 399, 609 (1999).
9 X. Y. Li, A. Virbasius, X. Zhu, and M. R. Green, Nature 399, 605 (1999).
10 J. A. Ranish, N. Yudkovsky, and S. Hahn, Genes Dev. 13, 49 (1999).
11 K. M. Johnson, J. Wang, A. Smallwood, C. Arayata, and M. Carey, Genes Dev. 16, 1852 (2002).
12 A. D. Johnson et al., Nature 294, 217 (1981).
13 M. Brenowitz, D. F. Senear, M. A. Shea, and G. K. Ackers, Methods Enzymol. 130, 132 (1986).
14 A. L. Meiklejohn and J. D. Gralla, Cell 43, 769 (1985).
15 P. M. Lieberman and A. J. Berk, Genes Dev. 5, 2441 (1991).
16 M. Horikoshi, T. Hai, Y. S. Lin, M. R. Green, and R. G. Roeder, Cell 54, 1033 (1988).
17 T. Chi, P. Lieberman, K. Ellwood, and M. Carey, Nature 377, 254 (1995).
18 S. Buratowski, S. Hahn, L. Guarente, and P. A. Sharp, Cell 56, 549 (1989).

How can the cooperative assembly of the activated PIC be studied when only a subset of proteins binds DNA and when complexes become too large to resolve by EMSA or biophysical techniques? Here, we describe an alternative biochemical method, the immobilized template recruitment assay. In this assay, a biotinylated DNA fragment is immobilized via attachment to streptavidin-coated, magnetic Dynal beads. DNA-binding proteins are incubated with the immobilized template. Then proteins not specifically assembled in a complex are removed by separating the immobilized template with a magnetic particle concentrator and by subsequent washing of the magnetic bead/template pellet. Bound proteins are detected by immunoblot. To detect cooperative binding, subsaturating amounts of two or more proteins or protein complexes are added to the immobilized template. Template binding of a protein in concert with its cooperative binding partner is then compared to the binding of each individual protein or protein complex alone.

Use of the immobilized template assay to study cooperative binding of protein complexes offers many advantages over traditional approaches. For instance, the immobilized promoter template can also be used in functional assays such as in vitro transcription, so this technique offers the opportunity to directly correlate cooperative assembly with function. Additionally, because protein binding is assayed by immunoblot, one can detect the presence of individual subunits of a multisubunit complex regardless of size. Analysis of the assembly of large complexes can be achieved in a short time frame and, as there are no large, sophisticated pieces of equipment involved, this technique is not particularly expensive.
Since the protein detection method is immunoblotting, it is possible to detect posttranslational modifications of individual subunits of protein complexes by shifts in electrophoretic mobility and, because immunoblotting is relatively sensitive, one can, with the use of standard curves, reproducibly detect small (less than 2-fold) changes in DNA binding. Of particular advantage in the study of transcription complex assembly is that the immobilized DNA templates can be preassembled with nucleosomes to mimic the more natural, chromatin environment within the cell.19

The disadvantages of the immobilized template assay are as follows. First, detection of protein binding is dependent on the availability of high titer antibodies. Additionally, background, nonspecific DNA binding such as binding of proteins to DNA ends, as is characteristic of RNA polymerase II, for example, can pose a problem. Techniques for addressing this issue are discussed below. As nonspecific DNA binding is a common problem with this technique, controls for specificity should be included in experimental design. A TATA-less promoter, for instance, should be used as a control for specific TFIID binding. Also, the immobilized template assay gives less positional information than DNase I footprinting and, with respect to correlating function with recruitment, the activity of assembled PIC complexes is, for reasons we do not fully understand, below 100%.

19 A. H. Hassan, K. E. Neely, and J. L. Workman, Cell 104, 817 (2001).

The immobilized template assay can be used to decipher mechanistic events in PIC assembly. Early experiments by Green and colleagues showed that an activator helps to assemble a PIC from factors present in a HeLa nuclear extract (NE).20 Hahn and colleagues used the immobilized template assay to address how mutations in PIC components affected assembly of PICs in a yeast NE.10 The mechanism and recruitment of multiple chromatin remodeling complexes were also studied using immobilized nucleosomal templates.19 Our group used an immobilized GAL4-VP16 responsive promoter containing multiple GAL4 DNA-binding sites upstream of the adenovirus E4 TATA box (G5E4T)21 to study cooperative recruitment of coactivator complexes. This model system has been used extensively to study the mechanism of transcription complex assembly.22 We have extended the use of this system via the immobilized template assay. This has allowed us to determine that two human coactivator complexes, Mediator and TFIID, bind cooperatively to promoter DNA and form a critical intermediate, the DAMed complex. Preassembly of DAMed stimulates transcription due to a stimulatory effect of DAMed on PIC formation.11

Template Preparation

Template Design and PCR Amplification

Any promoter DNA fragment can be immobilized for the assay. We utilized the model promoter G5E4T, which contains five GAL4 DNA-binding sites 23 bp upstream of the adenovirus E4 TATA box. Biotinylated DNA fragments are polymerase chain reaction (PCR) amplified from pG5E4T with a 27-nucleotide primer biotinylated at the 5′ end (we purchase the primer with the biotin moiety attached) and a 27-nucleotide downstream primer, generating a 650-bp fragment (Fig. 1A11). The upstream primer is positioned 205 bp upstream of the GAL4 DNA-binding sites. This distance is sufficient to allow binding of large protein complexes without steric interference from the magnetic beads. A distance of 50 bp between the GAL4 DNA-binding sites and the 5′ end of the upstream primer is not sufficient; these templates do not support efficient PIC recruitment or transcriptional activity. DNA fragment sizes of 1 kb or below are recommended for sufficient binding to magnetic beads. The PCR product is electrophoresed on a preparative agarose gel and purified using the QIAquick gel extraction kit. DNA fragments purified from one 100-µl PCR reaction are resuspended in 60 µl TE.

Fig. 1. Preparation of G5E4T immobilized templates. (A) Schematic of the 650-bp immobilized template bearing five GAL4 DNA-binding sites 23 bp upstream of the adenovirus E4 TATA box. The HindIII site is used to cleave the DNA from the Dynabeads for quantitation by agarose gel electrophoresis. From Johnson et al.,11 with permission. (B) Quantitation of template bound to Dynabeads: 2 µl (lane 1), 1 µl (lane 2), and 0.5 µl (lane 3) of HindIII-digested immobilized templates are run on an ethidium bromide–stained agarose gel with dilutions of a DNA ladder where the 500-bp fragment is 40 (lane 4), 20 (lane 5), or 10 ng (lane 6).

20 Y. S. Lin and M. R. Green, Cell 64, 971 (1991).
21 D. Tantin, T. Chi, R. Hori, S. Pyo, and M. Carey, Methods Enzymol. 274, 133 (1996).
22 M. Carey, J. Leatherwood, and M. Ptashne, Science 247, 710 (1990).

Binding DNA Fragments to Dynal Beads

Dynabeads M-280 streptavidin (30 µl) (Dynal catalog no. 112.05) are placed on a magnetic particle concentrator (MPC) (Dynal MPC-S) and the storage buffer is removed. The beads are then washed two times in 2× concentrated binding and washing buffer (B&W buffer) [10 mM Tris-HCl (pH 7.5), 1 mM ethylenediaminetetraacetic acid (EDTA), 2 M NaCl] and resuspended in 60 µl 2× B&W buffer. Biotinylated PCR fragment (60 µl) in TE is added to the Dynal beads and the mixture is incubated at room temperature with constant rotation on a Labquake rotisserie. The required incubation time depends on the length of the DNA fragment; a 30-min incubation


is sufficient for complete binding of DNA fragments up to 1 kb. Subsequent to the binding reaction, the supernatant is removed from the beads using the MPC. The beads are washed twice (by rapid resuspension) in 1× B&W buffer (200 µl per wash) and then washed two times with 0.1 M Buffer D (20% v/v glycerol, 20 mM HEPES, pH 7.9, 0.1 mM EDTA, 0.1 M KCl). Beads are resuspended and stored in 60 µl 0.1 M Buffer D. Beads can be stored for up to 2 months at 4 °C. We resuspend the immobilized templates in Buffer D as it is commonly used in in vitro transcriptional studies. Beads can be washed and resuspended in TE if that is preferred for a given application.

Quantitation of Fragment Bound to Beads

To assess the amount of DNA fragment bound to a given volume of beads, a restriction endonuclease digest is performed. A HindIII site is located 170 bp from the 5′ end of the template on the immobilized G5E4T fragment. Beads (5 µl) are washed twice with 200 µl TE and are resuspended in a 20-µl HindIII reaction mix. To ensure complete cleavage we intermittently mix the reaction to prevent the beads from settling at the bottom of the tube. The supernatant is removed from the beads and serial dilutions of the reaction products are electrophoresed on a 1.5% agarose gel alongside quantitative DNA markers (1 kb DNA ladder, NEB catalogue no. 323-2L). Figure 1B shows an ethidium bromide–stained agarose gel in which 2 µl of beads (8 µl of the restriction endonuclease digest), 1 µl of beads, and 0.5 µl of beads are electrophoresed adjacent to dilutions of a DNA ladder where the 500-bp fragment is 40, 20, or 10 ng. The amount of DNA bound to a given volume of beads is estimated; the femtomoles of DNA per volume (or micrograms) of beads is calculated from the size, and thus the approximate molecular weight, of the DNA fragment. For a 650-bp fragment, our typical binding efficiency is 2 fmol of DNA per µg of beads.
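The conversion from an estimated band mass to femtomoles of template can be sketched in Python as follows. This is only an illustration under stated assumptions: the average mass of 650 g/mol per base pair is a common approximation that is not given in the text, the 10-ng band estimate is an invented number, and the 480-bp length of the released fragment is inferred from the 650-bp template and the HindIII site 170 bp from its 5′ end, assuming that this is the bead-attached end.

```python
# Rough conversion of double-stranded DNA mass to femtomoles.
# ASSUMPTION: ~650 g/mol per base pair (common approximation, not from the text).
AVG_G_PER_MOL_PER_BP = 650.0


def dsdna_ng_to_fmol(mass_ng, length_bp, g_per_mol_per_bp=AVG_G_PER_MOL_PER_BP):
    """Return femtomoles of a double-stranded fragment of the given length."""
    grams = mass_ng * 1e-9
    molecular_weight = length_bp * g_per_mol_per_bp  # g/mol
    return grams / molecular_weight * 1e15  # mol -> fmol


def fmol_per_ul_beads(band_ng, released_bp, bead_volume_ul):
    """Template per microliter of bead suspension.

    One released fragment corresponds to one bound template molecule.
    """
    return dsdna_ng_to_fmol(band_ng, released_bp) / bead_volume_ul


# HYPOTHETICAL example values, for illustration only.
released_bp = 650 - 170   # inferred length of the HindIII-released fragment
band_ng = 10.0            # invented band mass read against the ladder
bead_volume_ul = 2.0      # bead volume loaded in that lane

density = fmol_per_ul_beads(band_ng, released_bp, bead_volume_ul)
volume_for_40_fmol = 40.0 / density  # bead volume supplying 40 fmol template

print(f"{density:.1f} fmol template per µl beads")
print(f"{volume_for_40_fmol:.2f} µl beads supply 40 fmol")
```

Running the same calculation for each bead preparation gives the bead volume needed to keep the amount of template constant between experiments.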
It is important to determine the amount of template bound to the magnetic beads with each preparation so that there is consistency between experiments. This is also critical if one is to assess the stoichiometry of protein–DNA binding.

Immobilized Template-Binding Reactions

Typical Binding Reaction Conditions

Our binding reaction mixtures are generally 60 µl and the buffer conditions are those of a typical in vitro transcription reaction: 37.5 µl of the reaction mixture is composed of 0.1 M Buffer D (20% v/v glycerol, 20 mM HEPES, pH 7.9, 0.1 mM EDTA, pH 8.0, 0.1 M KCl) and the remaining 22.5 µl of the mixture is aqueous. The Buffer D portion includes all proteins or immobilized templates that are stored in 0.1 M Buffer D. In addition, the reactions contain 0.025% NP-40, 7.5 mM MgCl2, 50 µg/µl bovine serum albumin (BSA), 200 ng pGEM3 (nonspecific competitor DNA), 1 mM dithiothreitol (DTT), and 1 mM phenylmethylsulfonyl fluoride (PMSF). We typically use 40 fmol DNA template in each reaction. In our studies of PIC assembly from HeLa NE we employ 5 ng GAL4-VP16 and 35 µg NE. In assessment of cooperative binding of purified Med and TFIID we use 5 ng GAL4-VP16, 100 ng TFIID, 12 ng TFIIA, and 1 unit Med. One unit of Med is defined as the amount required to complement 35 µg of Med-depleted NE (ΔNE) to produce the same femtomole amount of transcript as that produced by 35 µg HeLa nuclear extract. Purified Med,11 TFIID,23 TFIIA,24 and GAL4-VP1621 are prepared as described previously.

Reaction mixtures are assembled on ice. Typically, binding reactions are incubated for 30 min at 25 °C with constant rotation. Beads are then immobilized on the MPC and washed three times with 200 µl wash buffer (62.5% 0.1 M Buffer D, 7.5 mM MgCl2, 0.05% NP-40, 1 mM DTT, and 1 mM PMSF). Subsequent to washing, tubes are spun for 10 s in a microfuge, placed on the MPC, and all wash buffer is removed by pipette. The beads are resuspended in 10 µl 1× sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) loading dye and electrophoresed on an SDS–PAGE gel. Binding to the template is detected by immunoblot using antibodies against various factors. To study TFIID and Med binding we employ commercial antibodies against a small representative subset of the subunits for each coactivator. Many Med and TFIID antibodies can be purchased from Santa Cruz Biotech. A single gel can be used to study proteins in different molecular weight ranges by simply cutting the membrane horizontally with a razor blade.
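The final buffer composition of the 60-µl binding reaction follows from simple dilution of the 37.5-µl Buffer D portion. The Python sketch below works this out from the Buffer D recipe given in the text; the function and variable names are illustrative, and the calculation deliberately ignores any contribution from the separately added components (MgCl2, NP-40, BSA, DTT, PMSF).

```python
# Composition of 0.1 M Buffer D as listed in the text.
BUFFER_D = {
    "glycerol (% v/v)": 20.0,
    "HEPES (mM)": 20.0,
    "EDTA (mM)": 0.1,
    "KCl (mM)": 100.0,
}


def dilute(stock, stock_volume_ul, total_volume_ul):
    """Final concentration of each stock component after dilution."""
    fraction = stock_volume_ul / total_volume_ul
    return {name: value * fraction for name, value in stock.items()}


# 37.5 µl of Buffer D in a 60-µl reaction.
buffer_d_fraction = 37.5 / 60.0
final = dilute(BUFFER_D, 37.5, 60.0)

for name, value in final.items():
    print(f"{name}: {value:g}")
print(f"Buffer D fraction of the reaction: {buffer_d_fraction:.1%}")
```

The Buffer D fraction obtained this way is the same 62.5% specified for the wash buffer, so the washes keep the salt and glycerol concentrations of the binding reaction.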
A potential disadvantage of this assay is background, nonspecific DNA binding of proteins either because of an intrinsic DNA-binding activity or due to the ability of a given protein, RNA polymerase II, for instance, to bind to DNA ends. Several options are available to address this issue. Additional competitor DNA (plasmid DNAs such as pGEM3 or linear polymers such as dGdC) can be added, the salt or detergent concentration can be increased, or the total incubation time can be decreased. The amounts of immobilized template and proteins can also be adjusted. If available, DNA-binding mutants or immobilized templates lacking DNA-binding sites are appropriate and preferable controls. To avoid nonspecific binding to DNA ends, small DNA fragments can be added. These are generated by cleaving plasmid DNA with a restriction enzyme that generates many fragments. We observed high background binding of RNA polymerase to our immobilized G5E4T template. Some of this background was decreased by the addition of Sau3A-digested pGEM3. Importantly, all binding conditions should be assessed in a functional assay (see below for a description of in vitro transcription assays on immobilized templates).

23 Q. Zhou, P. M. Lieberman, T. G. Boyer, and A. J. Berk, Genes Dev. 6, 1964 (1992).
24 J. Ozer et al., Genes Dev. 8, 2324 (1994).

Using Immobilized Templates to Examine Cooperative Binding

To detect cooperative binding of TFIID and Med, we first determine conditions under which binding of each protein is subsaturating. This is done by titration of onput and also by adjustment of the stringency of binding conditions. Then, binding of TFIID or Med is assessed alone and together. Cooperative binding is indicated by a stimulation of TFIID binding in the presence of Med and, reciprocally, by stimulation of Med binding in the presence of TFIID.

Results

Figure 2A demonstrates that purified Med stimulates recruitment of purified TFIID to immobilized G5E4T promoter DNA.11 TFIID (40 ng) was incubated with immobilized templates in the absence (lane 1) or presence (lane 2) of saturating levels of GAL4-VP16 (200 ng). Levels of TFIID recruitment in these first two lanes are subsaturating and below the detectable limit of the immunoblot. However, when the same amount of TFIID is incubated in the presence of saturating levels (1 unit) of purified Med (lanes 5 and 6), TFIID recruitment in the presence of GAL4-VP16 is substantially increased (compare lanes 2 and 6). A model for cooperative interactions between TFIID and Med implies the potential for mutual stabilization of these coactivator complexes on the DNA.
Figure 2B depicts an experiment where, in the presence of GAL4-VP16, TFIID is saturating with respect to both template occupancy and transcriptional activity.11 Comparison of lanes 4 and 6 demonstrates that saturating levels (120 ng) of TFIID stimulate the recruitment of subsaturating levels of Med (0.04 units). Importantly, the approximate stoichiometry of GAL4-VP16, TFIID, and Med recruitment is 10 (5 GAL4-VP16 dimers):1:1, respectively. Stoichiometric measurements were made as follows: (1) measurement of Med 130, Med 220, and TBP concentrations within their respective complexes by silver-stained SDS–PAGE gel and comparison to BSA standards; and (2) determination of the moles of protein bound to immobilized templates by immunoblot, by comparison to standard, known amounts of protein.

Fig. 2. Human mediator and TFIID bind cooperatively to promoter DNA. (A) Saturating levels of purified Med recruit TFIID to immobilized templates. G5E4T immobilized template (every lane) was prebound with GAL4-VP16 (+) in every other lane. Subsequently, TFIID and TFIIA were added to lanes 1–2 and 5–6. Saturating levels of Med were added to lanes 3–6. After washing, bound proteins were detected by immunoblot. (B) Saturating levels of TFIID stimulate Med recruitment to immobilized templates. G5E4T templates were prebound with GAL4-VP16 in every other lane as in (A). TFIID and TFIIA were added to lanes 1–2 and 5–6; Med was added to lanes 3–6. Protein bound to the immobilized templates was detected by immunoblot. From Johnson et al.,11 with permission.

In Vitro Transcription on Immobilized Templates

Methods

Transcription can be performed on immobilized templates. However, the efficiency of transcription reactions on linear templates, and therefore on immobilized templates, is lower than that observed on supercoiled templates (approximately a 5-fold decrease). We recommend initially optimizing in vitro transcription conditions on supercoiled DNA templates and then transitioning to use of the immobilized DNA templates.


The conditions for in vitro transcription are the same as for the immobilized template-binding assay but with the addition of 0.5 mM of each nucleoside triphosphate (NTP). Reactions are performed at room temperature with rotation for 60 min. The supernatant is removed from the beads with use of the MPC and the RNA is processed as described.21 Transcripts are analyzed by primer extension21 and fractionated by electrophoresis on 10% polyacrylamide/urea gels in 1× TBE buffer.

Results

Figure 3A demonstrates that purified Med is active on immobilized DNA templates. Med can complement mediator-depleted nuclear extract, enabling GAL4-VP16-activated transcription on immobilized G5E4T.11 Figure 3B shows that preassembly of the DAMed complex on immobilized templates under conditions of mutual cooperative binding accelerates transcription.11 Immobilized promoter templates were incubated with either TFIID alone (lane 1), Med alone (lane 2), both TFIID and Med (lane 3), or neither coactivator (lane 4) for 30 min. After washing, a second, 5-min incubation was performed in which a mediator-depleted nuclear extract and NTPs were added along with the coactivator(s) not added in the preassembly step. Transcription was observed only when both TFIID and Med were preassembled.

Quantitative Assessment of Cooperative Binding

Methods

The extent of cooperative binding can be quantitated with a minor addition to the immobilized template assay. The cooperative binding assay is performed as described above, but, during the protein detection phase of the protocol, standard curves of protein onput are added. In this way it is possible for us to quantitate the stimulatory effect of TFIID on Med binding and vice versa. As immunoblots allow considerable sensitivity in detection, particularly with the use of more advanced scanning devices that can detect indirect immunofluorescence, this simple modification to the immobilized template assay enables a more sophisticated interpretation of cooperative effects.

Results

Figure 4 shows a standard immobilized template recruitment assay in which TFIID and Med bind cooperatively (lanes 4–6). In lanes 1–3, 2-fold titrations of the TFIID and Med onput are electrophoresed and representative subunits of each coactivator are immunoblotted. We can thus estimate the extent of stimulation in Med recruitment in the presence of TFIID to be 2-fold and the extent of stimulation in TBP binding in the presence of Med to be 4-fold. The precise molar amounts of TFIID and Med bound to the immobilized promoter can be determined by titrating known concentrations of onput for direct comparison.

Fig. 3. Immobilized templates can be used in in vitro transcription assays to correlate recruitment with function. (A) Purified Med complements mediator-depleted nuclear extract (ΔNE) for activated transcription on immobilized templates. Immobilized G5E4T template (lanes 1–6) was incubated with NE (lanes 1–4) and (lanes 3–4) or purified Med (lanes 5 and 6) in the absence (lanes 1, 3, and 5) or presence (lanes 2, 4, and 6) of GAL4-VP16. Transcription was measured by primer extension. (B) Preincubation of TFIID and Med enhances transcription. The experimental design is outlined at top. Where indicated in the figure, factors were preincubated (TFIIA was added along with TFIID). NE, NTPs, and coactivators were added to the preincubation mixes for 4 min at 30 °C and transcription was measured by primer extension analysis. An autoradiograph of the gel is shown. From Johnson et al.,11 with permission.

Fig. 4. The immobilized template assay can be used to quantitate the extent of cooperative binding between TFIID and Med. G5E4T was prebound with 200 ng GAL4-VP16 in the immobilized template reactions represented in lanes 4–6. Purified TFIID was then bound for 30 min in the absence (lane 4) or presence (lane 6) of Med. Med was bound alone in the reaction represented in lane 5. Comparison of the extent of binding reflected in the Med 220 and TBP immunoblots (lanes 4–6) to the 2-fold titration of onput shown in lanes 1–3 enables approximation of the extent to which each protein complex stimulates recruitment of the other. Recruitment of Med, as represented by Med 220, is stimulated 2-fold by TFIID; recruitment of TFIID, as represented by TBP, is stimulated 4-fold by Med.

Future Prospects

We previously undertook an analysis in which we systematically altered model templates to vary activator and TFIID affinities for DNA. EMSA was used to calculate the affinities of activator and TFIID for the altered promoters and in vitro transcription in a HeLa NE was used to measure transcription. We employed the data to generate a mathematical model for synergistic eukaryotic gene activation.25 This model was used to make quantitative predictions of the transcriptional response in the in vitro system and the model was found to adhere closely to the experimental measurements.25 We are currently in the process of using quantitative measurements attained through use of the immobilized template assay to mathematically model cooperative interactions between TFIID and Med. This type of analysis should lend insight into the nature of cooperative assembly of the PIC and should allow us to gain a greater understanding of the influence of cooperative coactivator binding on synergistic eukaryotic transcription.

Acknowledgment

The preparation and original research in this article were supported by NIH Grant GM057283.

25 J. Wang, K. Ellwood, A. Lehman, M. F. Carey, and Z. S. She, J. Mol. Biol. 286, 315 (1999).

[11] Characterization of the Cargo Attachment Complex of Cytoplasmic Dynein Using NMR and Mass Spectrometry

By Elisar Barbar and Michael Hare

Introduction

Cell biological and genetic studies have clearly demonstrated that cytoskeleton-based transport and motor function impact all dynamic aspects of cell behavior, including cell division, the generation and maintenance of cell polarity, cell signaling, and cell motility.1 Understanding the regulatory mechanisms that control motor function is important for attempts to define the basic cellular processes underlying numerous diseases, including cancer and respiratory and neurodegenerative disorders. Dynein is the largest and most complex of the motor proteins and is a principal motor for minus end-directed intracellular transport along microtubules. During mitosis, dynein plays an important role in the localization of the mitotic spindle and in centrosome separation.2–4 In the interphase, cytoplasmic dynein is engaged in the perinuclear positioning of the Golgi complex,5 and in the retrograde transport of other membranous organelles in the cytoplasm.6

1 T. Hays and M. G. Li, Curr. Biol. 11, R136 (2001).
2 M. McGrail and T. S. Hays, Development 124, 2409 (1997).
3 C. M. Pfarr, M. Coue, P. M. Grissom, T. S. Hays, M. E. Porter, and J. R. McIntosh, Nature 345, 263 (1990).
4 R. B. Vallee, C. Y. Tai, N. E. Faulkner, and D. L. Dujardin, J. Gen. Physiol. 118, 12 (2001).
5 I. Corthesy-Theulaz, A. Pauloin, and S. R. Pfeffer, J. Cell Biol. 118, 1333 (1992).
6 B. M. Paschal and R. B. Vallee, Nature 330, 181 (1987).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


Fig. 1. Schematic representation showing the relationship of the components of dynein to each other.

Based on images from electron microscopy,7 cytoplasmic dynein is a large multisubunit complex (1.2 MDa) composed of two globular heads joined by flexible stalk domains to a common base (Fig. 1). The head and stalk domains comprise the heavy chain motor subunits (530 kDa) that contain the microtubule-binding sites and the hydrolytic adenosine triphosphate (ATP)-binding sites required for force production. The base of the motor complex contains two 74-kDa intermediate chains (IC74), four light intermediate chains (52–61 kDa), and several light chains (10–25 kDa). The proteins that assemble to form the active complex are tissue specific.8,9 By far the most variable of the subunits are the light and intermediate chains, and they are presumed to have a predominant role in controlling dynein assembly and activity.10

The following are some of the major questions of interest in the dynein field: How are motors targeted to the cargoes that they transport? How is the motor activity regulated? Are there switches that turn some motors on and others off? How are cargoes latched to the complex and released at their destination? What is the role of light and intermediate chains in dynein assembly and regulation? To gain a mechanistic understanding of the function and regulation of dynein, detailed studies of the structure, dynamics, and interactions of each of the subunits are necessary. We have focused our studies on three subunits that are located at the base of the complex, LC8, Tctex-1, and IC74. The light chains LC8 and Tctex-1 (10–12.5 kDa) are highly conserved and mediate interactions with a number of cellular molecules and putative cargoes outside the dynein motor complex.10 The intermediate chain IC74 forms a key intermediary in the complex, as it associates with the heavy chain and the accessory complex dynactin as well as with light chain subunits.11,12 Furthermore, the intermediate chains act as negative regulators of the motor domain.13

Research on the structural biology of multiprotein complexes such as dynein is in its infancy. Technologies including mass spectrometry and nuclear magnetic resonance (NMR), which have become standard tools to examine protein structure and dynamics at the atomic level, are only recently being extended to the study of multiprotein complexes. This review is an overview of our plans to apply these techniques to the dynein complex, along with other biochemical and biophysical approaches.

7 U. Goodenough and J. Heuser, J. Mol. Biol. 180, 1083 (1984).
8 D. I. Nurminsky, M. V. Nurminskaya, E. V. Benevolenskaya, Y. Y. Shevelyov, D. L. Hartl, and V. A. Gvozdev, Mol. Cell. Biol. 18, 6816 (1998).
9 S. M. King, E. Barbarese, J. F. Dillman, S. E. Benashski, K. T. Do, R. S. Patel-King, and K. K. Pfister, Biochemistry 37, 15033 (1998).
10 S. M. King, Biochim. Biophys. Acta 1496, 60 (2000).

Overview of Strategy

We have chosen to focus our structural studies on the cargo attachment complex at the base of the motor for several reasons. This subcomplex is the site for cargo attachment, it binds to the dynein regulator complex dynactin, and the heterogeneity of the subunits within it suggests that they regulate the activity of the motor and its targeting to various cargoes. Our strategy is described in four main sections.

Protein Preparation

The proteins prepared include intact dynein subunits and smaller constructs that contain the binding site but are large enough to reflect the conformation of the intact protein. Limited proteolysis coupled with mass spectrometry is used to identify independently folded domains in large subunits.

Protein Characterization

Folding, stability, and self-aggregation of individual subunits and domains are determined by circular dichroism, fluorescence spectroscopy, and analytical ultracentrifugation.

Structural Determination

The structures of individual subunits and domains are determined by heteronuclear NMR spectroscopy.

Structural Characterization of Complexes

This section will cover biochemical techniques to assay for binding, biophysical techniques to monitor conformational changes that accompany binding, and limited proteolysis coupled with mass spectrometry to identify segments that are at the binding interfaces.

11 W. Steffen, S. Karki, K. T. Vaughan, R. B. Vallee, E. L. F. Holzbaur, D. G. Weiss, and S. A. Kuznetsov, Mol. Biol. Cell. 8, 2077 (1997).
12 S. Karki and E. L. F. Holzbaur, Curr. Opin. Cell Biol. 11, 45 (1999).
13 A. R. Kini and C. A. Collins, Cell Motil. Cytoskeleton 48, 52 (2001).

Protein Preparation

Our goal is to prepare soluble protein subunits and domains in the small to medium size range that can be characterized by NMR, yet large enough to reflect the conformation of the intact protein. All the proteins in this study are recombinant proteins expressed from Drosophila genes.

Small and Medium Size Subunits

Eukaryotic genes are expressed in bacteria using primarily the pET expression system with an N-terminal 6-His tag to facilitate purification on Ni-NTA affinity resin. To remove the His tag fusion peptide, we routinely engineer an FXa protease site that is cleaved at the first residue of the protein. The advantages of this expression system are the high protein levels produced, a robust promoter, and the ease of expression in minimal media since no additional amino acids are needed. The proteins produced are generally of high purity and elute in 350 mM imidazole, high salt, and pH 6. Following purification and cleavage with FXa, ion-exchange chromatography yields purities greater than 95%. This procedure was applied to production of light chains LC8 and Tctex-1 with a typical yield of about 30 mg of purified protein from 1 liter of growth media.

Design of Smaller Constructs of Large Subunits

To design smaller constructs of the larger subunits, we use several sequence analysis programs to predict independently folded domains. For example, web-based secondary structure prediction programs including PHD, COILS, and MultiCoil show that the intermediate chain IC74 is composed of two structurally independent folding domains. The N-terminal domain IC(1–289) is predicted to be disordered except for two short segments of coiled coil, and the C-terminal domain IC(300–640) is predominantly β-sheet with six WD repeats predicted to fold into a β-propeller. Further analysis using PONDR14 predicts that the N-terminal domain is primarily disordered. The program PFAM,15 which defines structural domain boundaries, also predicts that the N-terminal segment is an independent domain. Functional analysis confirms that both domains have independent functions. Truncation mutations show that the N-terminal domain binds dynactin while the C-terminal domain binds the heavy chain.16 The above information aids in the rational design of constructs that are structurally and functionally representative of the native protein.

In addition to sequence analysis, we use limited proteolysis with a variety of enzymes to confirm domain boundaries.17 Proteinase K, a nonspecific enzyme, is presumed to cleave at exposed loops that connect domains and other flexible regions. Stable domains are identified as bands on sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) that resist proteolysis for longer times relative to the rest. The identity of these bands is then determined by mass spectrometry after extraction and in-gel digestion.

Preparation of Constructs and Comparison to Large Intact Protein Subunits

For preparation of domains, both pGEX with GST fusion (Pharmacia) and pET vectors with His tag fusion (Novagen) systems are used. The advantages of the pGEX system are rapid screening for binding using GST pull-down assays and, more importantly, a significant increase in the yield of soluble proteins. Expression of constructs and large subunits with a His tag often results in insoluble proteins in inclusion bodies. Solubility of expressed protein is generally increased by lowering the temperature during induction and using a low concentration of IPTG. If the above steps are unsuccessful, we use the His tag fusion and purify the protein of interest from inclusion bodies under denaturing conditions and then refold at protein concentrations in the micromolar to nanomolar range.
All constructs of IC74 prepared are compared to the intact IC74 in terms of their ability to bind the light chains.

14 P. Romero, Z. Obradovic, X. H. Li, E. C. Garner, C. J. Brown, and A. K. Dunker, Proteins Struct. Func. Genet. 42, 38 (2001).
15 A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall, and E. L. L. Sonnhammer, Nucleic Acids Res. 30, 276 (2002).
16 S. Ma, L. Trivinos-Lagos, R. Graf, and R. L. Chisholm, J. Cell Biol. 147, 1261 (1999).
17 J. Carey, Methods Enzymol. 328, 499 (2000).


Characterization of Protein Folding and Self-Assembly

Purified proteins are checked for correct mass by mass spectrometry, and their purity is determined by SDS–PAGE. There are several other steps for characterization as explained below.

Overall Fold

Far-ultraviolet circular dichroic (far-UV CD) spectra indicate whether the protein is folded and give an estimate of the percentage of helices and sheets in the structure. Figure 2A shows a comparison of far-UV CD spectra for a folded globular protein (LC8) and for the primarily unfolded N-terminal domain of IC74, IC(1–289). The lack of signal at 220 nm and the negative signal at 203 nm indicate very limited secondary structure for IC(1–289), while the pronounced negative ellipticity at 220 nm and the positive signal at 198 nm indicate that LC8 is a compact folded protein made up of both helices and sheets. Near-UV CD can also be used to probe packing of aromatic residues, but for proteins with few aromatic residues such as IC(1–289) or for nonglobular proteins of limited solubility, the absence of a signal in the near-UV CD can be incorrectly interpreted as due to absence of tertiary structure. Fluorescence spectroscopy of intrinsic Trp is a very sensitive alternative technique to monitor tertiary structure. Figure 2B shows an overlay of Trp fluorescence emission spectra for LC8 and IC(1–289). The emission maximum of IC(1–289) at 350 nm relative to 330 nm for LC8 indicates that there is no tertiary packing in the environment of the Trp residues in IC(1–289), while LC8 has a well-buried Trp residue.

Stability and Globular/Nonglobular Structure

Unfolding profiles monitored by far-UV CD as a function of temperature or chemical denaturant can give the level of cooperativity in addition to the stability of the protein. The absence of cooperativity in both temperature and chemical denaturant profiles (Fig. 2C and previously published work18) indicates that the protein under study is not a folded globular protein.
While LC8 is a relatively stable protein with a free energy of unfolding of 16 kcal/mol,19 the unfolding profile of IC(1–289) in the temperature range of 0–80 °C shows that helicity is decreased noncooperatively.
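To make the notion of a cooperative unfolding profile concrete, the following Python sketch evaluates a textbook two-state van't Hoff model with a temperature-independent unfolding enthalpy. This model is not taken from the text, and the midpoint temperature and enthalpy values are purely hypothetical; they are not measurements for LC8 or IC(1–289). A large enthalpy gives the sharp sigmoidal transition expected of a cooperatively folded globular protein, whereas a small enthalpy gives a broad, gradual change across the whole temperature range.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(temp_c, tm_c, delta_h_kcal):
    """Two-state model, assuming the unfolding enthalpy does not vary with T.

    dG(T) = dH * (1 - T/Tm), K = exp(-dG / RT), f_unfolded = K / (1 + K).
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    delta_g = delta_h_kcal * (1.0 - t / tm)
    k = math.exp(-delta_g / (R_KCAL * t))
    return k / (1.0 + k)


# HYPOTHETICAL parameters chosen only to contrast the two profile shapes.
for label, delta_h in (("cooperative", 80.0), ("weakly cooperative", 10.0)):
    profile = [fraction_unfolded(t, 60.0, delta_h) for t in range(0, 81, 20)]
    print(label, [f"{f:.3f}" for f in profile])
```

In this simplified picture, a profile like that in Fig. 2C corresponds to the second, gradual case rather than to a sharp transition.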

18 M. Makokha, M. Hare, M. Li, T. Hays, and E. Barbar, Biochemistry 41, 4302 (2002).
19 E. Barbar, B. Kleinman, D. Imhoff, M. Li, T. Hays, and M. Hare, Biochemistry 40, 1596 (2001).


Fig. 2. Physical characterization of IC(1–289) and comparison to LC8. (A) Overlay of CD spectra of LC8 (dotted line) and IC(1–289) (solid line) in molar ellipticity. (B) Overlay of fluorescence emission spectra in arbitrary units (A.U.). Both data show that LC8 is folded while IC(1–289) is primarily unstructured. (C) Thermal unfolding profile of IC(1–289) monitored by far-UV CD at 220 nm showing noncooperative unfolding.
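
Since Fig. 2A reports CD in molar ellipticity, a short sketch of the standard unit conversion from the raw instrument signal may be useful. The function names and the numerical inputs below are illustrative assumptions, not values from this work. Some authors divide by the number of peptide bonds rather than the number of residues.

```python
def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert observed ellipticity (millidegrees) to molar ellipticity
    in deg cm^2 dmol^-1, using [theta] = theta_mdeg / (10 * c * l)."""
    if conc_molar <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm)


def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Molar ellipticity divided by the number of residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return molar_ellipticity(theta_mdeg, conc_molar, path_cm) / n_residues


# Hypothetical reading: -20 mdeg for a 10 uM sample of an 89-residue
# protein in a 1-mm (0.1 cm) cell.
print(mean_residue_ellipticity(-20.0, 10e-6, 0.1, 89))
```

Normalizing per residue is what allows spectra of proteins of different length, such as LC8 and IC(1–289), to be overlaid on one axis.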

Self-Assembly

To determine the association state of subunits and domains, we use size-exclusion chromatography and analytical ultracentrifugation. In size-exclusion chromatography, the association state of a protein is determined by comparing its elution time to those of a set of molecular weight standards. Figure 3 shows analytical size-exclusion chromatograms of LC8WT (top), and of LC8H55K, a mutant in which a His-to-Lys substitution at the dimer interface


Fig. 3. Size-exclusion chromatography of LC8WT (dimer) and LC8H55K (monomer). Experiments were carried out at a flow rate of 1 ml/min on a TSK2000SW column. The running buffer was 0.1 M sodium phosphate, 0.3 M sodium sulfate, 5 mM DTT, pH 7, and elution was monitored by absorbance at 280 nm. The peak at 25 min is for a small-molecule internal standard.

destabilizes the dimer (bottom). The peak at 17.5 min corresponds to a dimer, while the peak for LC8H55K at 21 min corresponds to a monomer. No soluble higher aggregates are detected at higher protein concentrations. Analytical ultracentrifugation is a more precise method, particularly for ellipsoidal or nonglobular proteins. For example, LC8 and Tctex-1 elute earlier than expected for dimeric proteins of their size, while sedimentation velocity and equilibrium show that they are both dimers.19,20 For IC74 fragments that are primarily unfolded and nonglobular, size-exclusion chromatography is useful only for monitoring whether the protein is a large aggregate that elutes in the excluded volume or a mixture of species of lower oligomerization states. For disordered elongated proteins, sedimentation velocity is the method of choice because of the short lifetime of the samples. In these experiments, the association state is calculated from the frictional ratio and diffusion coefficient.18,21
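
The way sedimentation and diffusion coefficients combine to give a molar mass can be sketched with the Svedberg equation, M = sRT / [D(1 − v̄ρ)]. The code below is only an illustration under stated assumptions: the default partial specific volume and solvent density are generic values, the input numbers are hypothetical rather than measurements for LC8 or IC74, and the function names are ours. Dividing the result by the monomer mass calculated from the sequence gives an estimate of the association state.

```python
R_ERG = 8.314e7  # gas constant in erg/(mol K), for CGS units


def svedberg_molar_mass(s_svedberg, d_cm2_per_s, vbar_ml_per_g=0.73,
                        rho_g_per_ml=1.0, temp_k=293.15):
    """Molar mass (g/mol) from the Svedberg equation.

    s_svedberg is the sedimentation coefficient in Svedberg units
    (1 S = 1e-13 s) and d_cm2_per_s the diffusion coefficient. The default
    vbar and rho are generic assumptions, not values from this chapter.
    """
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml
    if buoyancy <= 0 or d_cm2_per_s <= 0:
        raise ValueError("buoyancy term and D must be positive")
    s = s_svedberg * 1e-13
    return s * R_ERG * temp_k / (d_cm2_per_s * buoyancy)


def association_state(molar_mass, monomer_mass):
    """Ratio of measured molar mass to the sequence-derived monomer mass."""
    return molar_mass / monomer_mass


# Hypothetical inputs: s = 2.0 S, D = 8.0e-7 cm^2/s, monomer mass 10.4 kDa.
mass = svedberg_molar_mass(2.0, 8.0e-7)
print(round(mass), round(association_state(mass, 10400.0), 2))
```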

20 M. Makokha and E. Barbar, Biophys. J. 80, 2530 (2001).
21 K. Langsetmo, W. F. Stafford, K. Mabuchi, and T. Tao, J. Biol. Chem. 276, 34318 (2001).


Structural Determination of Subunits and Domains by Heteronuclear NMR

After establishing purity, solubility at the high concentrations required for NMR, and the association state of the subunits or domains under study, the next step is to use heteronuclear NMR to determine high-resolution structures. For dimeric proteins, we first attempt to find conditions that disrupt the dimer interface and form a folded monomer. The rationale is to simplify NMR structural analysis while keeping the protein intact. Furthermore, comparison of dimeric and monomeric structures allows identification of residues that are important for stability of the dimer, and of the conformational changes that accompany dimerization. Comparison of structure and function of dimer and monomer allows us to probe the functional significance of dimerization. LC8 is used as an example in the description below.

Methods to Disrupt the Dimer Interface

Low pH. We have shown by sedimentation equilibrium experiments19 that LC8 is a pH-dependent dimer that dissociates below pH 4.5 and is fully monomeric at pH 3. The protein at pH 3 is folded and compact and has a CD-detected structure similar to that of the dimer at pH 7. Fluorescence emission spectra at both pH values show similar emission wavelengths, indicating that the tertiary packing in the environment of the single Trp, Trp-54, is not perturbed upon dissociation. To further confirm that the monomer at pH 3 is similar in its overall fold and stability to the dimer, we compare the GdnCl-unfolding profiles. Figure 4 shows unfolding profiles for LC8 at pH 7 and at pH 3 monitored by fluorescence intensity. The monomer at pH 3 (triangles) shows remarkable stability and cooperativity of unfolding, indicating that it still has a compact core.

Single-Site–Directed Mutagenesis.
To explain the titration behavior of LC8, we examined the crystal structure of PIN, mammalian LC8, solved as a homodimer with a bound peptide fragment of nNOS.22 Since the primary sequence of PIN is 94% identical to that of Drosophila LC8, and the remaining 6% are of the same charge, the two variants are expected to show similar association behavior. The dimer interface of PIN is primarily hydrophobic. Hydrophobic contacts across the interface include Ile-57/57′, Phe-62/62′, and Phe-86/86′, and the ring stacking of His-55 and His-55′. The pH-dependent dissociation of LC8 implies pH titration of one or more critical groups. In general, protonation may cause dimer dissociation by eliminating a favorable charge/charge interaction between residues, or by introducing an unfavorable interaction between like charges. Of the

22 J. Liang, S. R. Jaffrey, W. Guo, S. H. Snyder, and J. Clardy, Nat. Struct. Biol. 6, 735 (1999).


Fig. 4. LC8 at pH 3 is a moderately stable protein and undergoes cooperative unfolding. GdnCl unfolding profiles of LC8 monomer at pH 3 (triangles) and dimer at pH 7 (circles) are followed as changes in fluorescence emission intensity at 327 nm. Data were acquired at 30 °C using a batch-type experiment to ensure that equilibrium was achieved before data acquisition. Dimer LC8 unfolds by a three-state process with 0.6 M and 3.7 M GdnCl midpoints for the first and second transitions, respectively. Monomeric LC8 at pH 3 unfolds with a single transition, midpoint 3.0 M. Solid lines represent nonlinear least-squares fits of the data. The data are normalized so that a population consisting entirely of dimer would have a value of 1, and a completely unfolded sample would have a value of 0. The plateau between the dissociation and unfolding is normalized to 0.5. Reproduced with permission from Barbar et al.,19 Copyright American Chemical Society.

residues at the interface of LC8 that could titrate between pH 4 and 5, His-55 and His-55′ are the most probable candidates. Since they are separated by less than 6 Å, protonation of these residues would create a repulsive interaction buried in the dimer interface. To test whether protonation of His-55 is primarily responsible for pH-dependent dimer dissociation, we used site-directed mutagenesis to replace His-55 with a Lys residue, which inserts a repulsive positive charge at the interface at pH 7. The mutant produced is indeed a monomer at pH 7, as predicted (Fig. 3).

NMR Sample Preparation

Uniformly 15N or 15N–13C isotopically labeled proteins were prepared by growing the bacteria in MJ9 media containing 1 g/liter of 15NH4Cl and 2 g/liter of 12C or 13C glucose. Protein concentrations were in the


0.8–1.4 mM range, in 50 mM citrate phosphate buffer with 50 mM NaCl, 1 mM sodium azide, 10% D2O, and 3% glycerol, in 5-mm Shigemi susceptibility-matched NMR tubes. Samples of >95% purity, as verified by SDS–PAGE and analytical size-exclusion chromatography, are always used. All experiments are conducted on 600- or 750-MHz spectrometers at 30 °C, and referenced to internal DSS (2,2-dimethyl-2-silapentane-5-sulfonic acid). Spectra are processed with the programs Felix 97.0 (Accelrys, San Diego) and NMRPipe.23 The programs Felix and SPARKY (Goddard and Kneller, University of California, San Francisco) are used for peak picking and interactive spectral analysis. All NMR spectra for monomeric LC8 were obtained at pH 3.00 and for dimeric LC8 at pH 5.00 in the same citrate-phosphate buffer but of different proportions. We used pH 5 for the dimer instead of pH 6 or 7 in order to increase protein solubility. At 1 mM concentration, LC8 at pH 5 is primarily dimeric (only a single set of peaks was observed in NMR spectra) due to the shift in equilibrium to the dimer at these concentrations. The lower pH is also ideal for NMR measurements because it slows the exchange of amide protons with the solvent. Figure 5 shows a comparison of 1H–15N HSQC spectra of LC8 at pH 3 (monomer) and LC8 at pH 5 (dimer). The peak dispersion observed at pH 3 supports our previous conclusion that LC8 at pH 3 is folded. In addition, the chemical shifts of peaks in the two spectra differ, indicating that resonance assignments have to be made for both association states.

Resonance Assignments

Complete resonance assignments are generally obtained from three-dimensional (3D) heteronuclear experiments. For backbone resonance assignments, we use the program AutoAssign,24 which is an expert system providing automated analysis of backbone and Cβ resonance assignments.
The input includes peak lists from 2D HSQC and 3D HNCO spectra along with peak lists from three intraresidue experiments [CANH, CBCANH, and H(CA)NH] that correlate the CA, CB, and HA resonances of residue (i) with the backbone amide 15N–1H of residue (i). Three interresidue experiments [CA(CO)NH, H(CA)(CO)NH, and CBCA(CO)NH] correlate the CA, CB, and HA resonances of residue (i−1) with the backbone amide 15N–1H of residue (i). Side chain assignments are obtained manually using HCCH-TOCSY and HCC(CO)NH-TOCSY spectra

23 F. Delaglio, S. Grzesiek, G. W. Vuister, G. Zhu, J. Pfeifer, and A. Bax, J. Biomol. NMR 6, 277 (1995).
24 D. E. Zimmerman, C. A. Kulikowski, Y. P. Huang, W. Q. Feng, M. Tashiro, S. Shimotakahara, C. Y. Chien, R. Powers, and G. T. Montelione, J. Mol. Biol. 269, 592 (1997).


Fig. 5. Comparison of 1H–15N HSQC spectra of LC8 at pH 3 (monomer, left) and pH 5 (dimer, right). Spectra were acquired on a 500-MHz Varian Inova spectrometer at 30 °C in 0.1 M citrate phosphate buffer and 0.1 M NaCl.

recorded with either the 1H or 13C in the indirect dimension. To assign aromatic side chain chemical shifts, 2D NOESY, TOCSY, and 1H–13C CT-HSQC spectra and an HCCH-TOCSY dedicated to the aromatic region are acquired in D2O after lyophilizing the protein. The advantage of acquiring spectra in 100% D2O is improved sensitivity and resolution in the aliphatic–aliphatic connectivities. Figure 6 shows a representative spectrum for backbone and side chain resonance assignments.

Structure Determination Using AutoStructure

For 3D structure determination, 1H–1H distance constraints are obtained from estimates of nuclear Overhauser effect (NOE) intensities using a homonuclear 1H NOESY spectrum that identifies NOEs involving the aromatic and methyl proton resonances, a 3D 15N–1H NOESY-HSQC that provides short- and long-range contacts between pairs of amide protons, and a 3D 13C–1H NOESY-HSQC that provides short- and long-range contacts between aliphatic protons, all recorded at 750 MHz. The program AutoStructure25 is used for automatic assignment of NOEs from data,

25 N. J. Greenfield, Y. J. Huang, T. Palm, G. V. T. Swapna, D. Monleon, G. T. Montelione, and S. E. Hitchcock-DeGregori, J. Mol. Biol. 312, 833 (2001).


Fig. 6. Strip plots showing residue assignments for monomeric LC8 corresponding to amide planes taken from 3D spectra of HCC(CO)NH-TOCSY on 13C/15N doubly labeled protein. All carbon resonances corresponding to the same spin system are labeled. For example, Ile-83 has a long spin system (α, β, γ1, γ2, δ). Spectra were acquired on a 600-MHz Bruker spectrometer at 30 °C.

including chemical shift assignments, NOE contacts, residues with slow amide hydrogen–deuterium exchange, and scalar coupling constants. Three-bond J coupling constants between amide protons and Hα (3JHNHα), used to estimate φ dihedral angle ranges, are obtained from 3D HNHA.26 Amide hydrogen exchange rates are determined by dissolving the lyophilized protein in D2O and acquiring a series of 2D 1H–15N HSQC spectra over an interval of 20 min to 1 week. The conformational restraints generated by AutoStructure were used with the structure generation program DYANA27 to determine the 3D structure of monomeric LC8. The AQUA and PROCHECK-NMR programs28 provide a means of validating the geometry and restraint violations of an ensemble of protein structures. The outputs include a detailed breakdown of the restraint violations and summary statistics, and show the degree of agreement of the model structures with the experimental data. Using the knowledge-based set of rules that constitutes the core algorithms of AutoStructure, a total of 1129 conformationally restricting constraints were identified for monomeric LC8. Each of the resulting

26 H. Kuboniwa, S. Grzesiek, F. Delaglio, and A. Bax, J. Biomol. NMR 4, 871 (1994).
27 P. Güntert, C. Mumenthaler, and K. Wüthrich, J. Mol. Biol. 273, 283 (1997).
28 R. A. Laskowski, J. A. C. Rullmann, M. W. MacArthur, R. Kaptein, and J. M. Thornton, J. Biomol. NMR 8, 477 (1996).


constraints corresponds to a pair of upper- and lower-bound internuclear distances. These include 939 NOE distance constraints, 122 dihedral angle constraints, and 68 hydrogen bond constraints (two per hydrogen bond). Twenty cycles of iterative structure generation and NOESY cross-peak assignments were used for identification of these constraints. The total number of conformationally restricting constraints per residue is 12.7, of which 4.2 are long-range (|i − j| > 5 residues). The solution NMR structure generated has a maximum violation of the NOE-based distance constraints of 0.26 Å. There are no violations of dihedral constraints >10° (maximum violation is 7.5°). The root mean square (RMS) deviations for backbone and all heavy atoms in this ensemble of structures are 1 Å and 1.5 Å, respectively. Analysis of backbone dihedral angle distributions shows that 97% of residues are in most-favored or favored regions and the remaining 3% are in generously allowed regions, with no residues in the disallowed regions. The above statistics indicate that the structure obtained is robust.29

For large proteins that are not amenable to NMR structural determination because of low solubility and slow tumbling rates that cause uniform peak broadening, recent advances in NMR technology help alleviate these problems. These advances include the use of cryoprobes, which have the potential to increase the signal-to-noise ratio 3- to 4-fold and hence are ideal for dilute protein samples. In addition, the use of higher fields and pulse sequences utilizing TROSY,30 a method that overcomes the line broadening effects in large systems, is becoming more routine and extends the range of proteins characterized by NMR to large proteins and complexes of greater than 200 kDa.31

NMR Backbone Dynamics

NMR relaxation techniques probe the fast internal dynamics of proteins at specific residues.
For LC8, for example, changes in flexibility are observed at the dimer interface upon dissociation29 and may occur at other subunit/cargo interfaces upon binding. For domains or segments of IC74 that are primarily unfolded, no unique structure can be obtained by NMR, but dynamic characterization may allow identification of relatively more ordered segments and of the segments that undergo disorder–order transitions upon binding.

29 M. Makokha, Y. Huang, G. Montelione, A. S. Edison, and E. Barbar, Protein Science, in press.
30 M. Salzmann, K. Pervushin, G. Wider, H. Senn, and K. Wüthrich, Proc. Natl. Acad. Sci. USA 95, 13585 (1998).
31 J. Fiaux, E. B. Bertelsen, A. L. Horwich, and K. Wüthrich, Nature 418, 207 (2002).


Internal motions in proteins can be characterized by the spatial restriction of motion or rigidity of the main chain (a measure of the amplitude) and the correlation time (an indication of the time scale of motion). Pulse sequences detect motion by creating a nonequilibrium state; the rate of return to equilibrium depends on the amplitude and the time scale of motion. These parameters can be measured for every residue in a protein, thereby giving access to the internal dynamics at many locations within the molecule.32 Longitudinal and transverse relaxation times T1 and T2 are measured from heteronuclear 1H–15N spectra recorded with several durations of relaxation delay. Steady-state 1H–15N heteronuclear NOEs are determined from ratios of average intensities obtained with and without proton saturation. In general, residues with the most restricted local mobility have the lowest values of T1 and T2. The magnitude of the NOE reflects the internal mobility of the N–H vector.

Other Methods for Backbone Dynamics

Hydrogen exchange can be followed by both NMR and mass spectrometry to measure the dynamics of the protein at specific residues or short segments. Hydrogen exchange mass spectrometry does not require the high concentrations needed for NMR and offers the advantage of measuring faster exchange rates. Fast-exchanging protons at the surface are often the ones that change the most upon dimerization or ligand interactions. These techniques are reviewed elsewhere in this volume.

Characterization of Complexes

Biochemical Techniques to Assay for Binding

Biochemical techniques include coprecipitation of His-tagged or GST-tagged proteins on an affinity matrix. The IC(1–289) domain of IC74 that binds LC8 was identified by the presence of a band for LC8 with IC(1–289), indicating binding, and no band with IC(300–640), indicating the absence of binding.18 High-salt or mild detergent washes eliminate the complications from nonspecific binding. Covalent cross-linking is used routinely to probe self-association and can similarly be used to test for the presence of interactions. Other techniques, including fluorescence spectroscopy, titration calorimetry, and sedimentation velocity, can also be used to determine the energetics of binding. These techniques will allow probing different

32 J. W. Peng and G. Wagner, Methods Enzymol. 239, 563 (1994).


conditions of pH, salt, and temperature, as reviewed elsewhere in this volume.

Mapping Binding Sites

Limited proteolysis coupled to mass spectrometry was used to locate specific cleavage sites in the IC(1–289) sequence that are exceptionally well protected by association with the light chains LC8 and Tctex-1.18 This technique generally maps segments that are at the interface,33 and can also identify segments that are protected due to a change in conformation or increase in structure upon binding. In these experiments, segments of a protein that are protected upon interaction with another molecule are identified by comparing proteolytic digests of the free and bound proteins. A site that is infrequently cleaved in the interacting protein is presumably close in the sequence to the binding site. Figure 7 illustrates this method; the flexible free protein is a model for IC(1–289) and the globular protein drawn as a sphere represents LC8. The major advantage of this method is that proteolysis of a large domain of IC74 is carried out under near-physiological conditions in the presence of the associating light chains. In addition, when one of the interacting proteins, in this case IC(1–289), is far more susceptible to protease digestion than the light chains, it is possible to perform the experiment under conditions where all of the proteolytic fragments are from IC(1–289). After digestion, the protected IC(1–289) fragments were first identified on silver-stained SDS–PAGE tricine gels by comparison with the digest of free IC(1–289). Figures 8 and 9 show sample proteolysis data using trypsin and proteinase K. The fragments were separated by HPLC and analyzed by ESI-MS. The sequences of the tryptic fragments were unambiguously identified from their mass. With proteinase K, the sequence of protected fragments was identified by further digestion with trypsin to produce fragments easily identifiable by mass spectrometry.
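
As an illustration of how tryptic fragments can be identified from their mass, the sketch below digests a sequence in silico and matches an observed mass against the predicted fragments. It rests on several assumptions: the sequence is a made-up example and not part of IC74, cleavage follows the simple rule of cutting after Lys or Arg unless followed by Pro with no missed cleavages, masses are monoisotopic and uncharged, and the function names are ours. For large fragments, average masses and partial cleavage would have to be considered.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(seq):
    """Split seq after K or R unless the next residue is P (no missed cleavages)."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        if aa in "KR" and not is_last and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def peptide_mass(peptide):
    """Monoisotopic mass of an uncharged, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_fragments(observed_mass, seq, tol=0.5):
    """Predicted tryptic fragments of seq whose mass is within tol of observed_mass."""
    return [p for p in tryptic_digest(seq)
            if abs(peptide_mass(p) - observed_mass) <= tol]


# Hypothetical sequence and observed mass, for illustration only.
print(tryptic_digest("AKGRPLKM"))
print(match_fragments(569.36, "AKGRPLKM"))
```

A match is unambiguous only when no other predicted fragment falls within the mass tolerance, which is the situation described for the tryptic digests.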
The use of several enzymes such as trypsin and proteinase K serves to better define the region that is protected.

NMR of Complexes

The precise recognition of partner molecules is an essential step in any biological function. To delineate the structural basis of the assembly of the subcomplex LC8/Tctex-1/IC74, characterization of the interactions of Tctex-1 and LC8 may be possible using short segments of

33 A. Scaloni, N. Miraglia, S. Orru, P. Amodeo, A. Motta, G. Marino, and P. Pucci, J. Mol. Biol. 277, 945 (1998).


Fig. 7. Drawing of limited proteolysis coupled with identification of fragments by mass spectrometry. The technique is most effective when one of the interacting proteins is less susceptible to proteolytic cleavage (shown as dark sphere, model for LC8), while the other is labile with exposed cutting sites [shown as disordered protein, model for IC(1–289)].

IC74 that were identified from limited proteolysis and mass spectrometry, and tested for tight binding by biochemical techniques. We prefer recombinant constructs of segments of IC74 over synthesized peptides for two reasons: (1) the constructs are larger than commonly available commercial peptides (preferred length of 20 residues, while the binding segments are 30–40 residues) and better reflect the conformation of native IC74; and (2) it is possible to multiply label the peptide with 13C, 15N, and 2H, to significantly ease the assignments and structural characterization. Mixed labeling techniques will identify NOEs between proteins in the complex.34 For example, 3D 12C-filtered/13C-edited NOE spectroscopic

34 C. Zwahlen, P. Legault, S. J. F. Vincent, J. Greenblatt, R. Konrat, and L. E. Kay, J. Am. Chem. Soc. 119, 6711 (1997).


Fig. 8. SDS–PAGE gel showing time-dependent trypsin digestion of IC(1–289) in the presence and absence of LC8 and Tctex-1. Lanes 2–20 show tryptic digests (25 °C, enzyme–substrate ratio 1:150) of IC(1–289) in the presence (even-numbered lanes) and absence (odd-numbered lanes) of equimolar amounts of both LC8 and Tctex-1. The fragment appearing slightly higher than Mr 20 kDa (indicated by an arrow) is protected in the presence of the light chains.

experiments will give only intermolecular NOEs when one interacting protein is labeled and the other is not. The general strategy for mapping the binding sites with NMR is to uniformly label one of the proteins with 15N and 13C and titrate it with the unlabeled interacting protein.35 The limitations of this experiment are aggregation upon mixing and during the time required for the experiment, and uniform broadening of all peaks in the spectra, not just those at the interface. We test for optimal concentrations before aggregation using size-exclusion chromatography. 1H–15N HSQC spectra of the protein of interest are typically recorded at increasing concentrations of the ligand until the binding sites are saturated. Since the ligand is unlabeled, no extra peaks will be observed, and the only changes observed in the spectra will be due to the effect of binding. Titration with the ligand at the high concentrations suitable for NMR causes precipitation and aggregation upon binding. To minimize solubility problems, we prepare the complex while dilute and then concentrate the mixture. We prepare several samples at the same concentration of LC8 but with varying amounts of the ligand.

35 Y. K. Chae, F. Abildgaard, E. R. Chapmann, and J. L. Markley, J. Biol. Chem. 273, 25659 (1998).
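
To give a feel for how many ligand additions are needed before the binding sites are saturated, the following sketch computes the fraction of labeled protein bound for simple 1:1 binding, using the exact quadratic solution that accounts for ligand depletion. This is an assumption-laden illustration: the dissociation constant and concentrations are hypothetical, 1:1 stoichiometry is assumed, and the function names are ours. Under fast exchange the observed chemical shift change would scale with this fraction, which may not hold in the intermediate exchange regime.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of protein in complex for 1:1 binding.

    All three arguments must be in the same concentration units.
    Uses the exact quadratic solution, so it remains valid when the
    ligand is not in large excess over the protein.
    """
    if protein_total <= 0 or kd <= 0 or ligand_total < 0:
        raise ValueError("protein_total and kd must be > 0, ligand_total >= 0")
    b = protein_total + ligand_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


def titration_curve(protein_total, ligand_points, kd):
    """Fraction bound at each total ligand concentration in ligand_points."""
    return [fraction_bound(protein_total, l, kd) for l in ligand_points]


# Hypothetical titration: 1 mM protein, Kd = 10 uM (0.01 mM), ligand 0 to 3 mM.
points = [0.0, 0.25, 0.5, 1.0, 1.5, 2.0, 3.0]
for l, f in zip(points, titration_curve(1.0, points, 0.01)):
    print(l, round(f, 3))
```

Because the protein concentration in an NMR sample is usually far above the dissociation constant for tight binders, the curve is nearly linear up to the equivalence point and then flattens, which is why a series of samples at fixed protein concentration and varying ligand amounts is informative.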


Fig. 9. SDS–PAGE gel showing time-dependent proteinase K digestion of IC(1–289) in the presence and absence of LC8 and Tctex-1. Lanes 2, 5, 8, and 11 are proteinase K digests of IC(1–289) alone, showing rapid degradation in the absence of the ligand. Lanes 3, 6, 9, and 12 are proteinase K digests of LC8 and Tctex-1, showing that no observable digestion of these proteins occurs in the time of the experiment. Digestion was carried out at 25 °C, with an enzyme–substrate ratio of 1:400. The arrow points to an Mr 5-kDa fragment that is protected by the presence of LC8 and Tctex-1. Reproduced with permission from Makokha et al.,18 Copyright American Chemical Society.

Information determined from NMR on interacting proteins in general is discussed briefly below.

Chemical Shift and Line Width Perturbations. Since chemical shifts of amide protons are sensitive to changes in the local environment, amide resonances in 1H–15N HSQC spectra of residues close to or at the binding interface are expected to undergo clear chemical shift changes. Chemical shift and line width perturbations are measured directly from HSQC spectra of bound and free protein. Changes in line widths may be observed if the chemical exchange between free and bound is in the intermediate time scale regime. The chemical exchange on/off rate can be determined from measurements of transverse relaxation rates as a function of ligand concentration.

NOEs at the Protein–Peptide Interface. One limitation to mapping binding sites using chemical shift changes or line width broadening is that several peaks in the spectrum undergo significant change in chemical shift in the complex. For such cases, the assignments of the complex are better obtained using standard triple-resonance experiments similar to what was discussed earlier for the free protein, but supplemented with those


employing deuterium decoupling. Different combinations of labeled proteins can be used to resolve ambiguities in intermolecular NOEs. For example, one component can be labeled with 15N/2H and the other with 13C. 15N-filtered/13C-separated experiments can be used to detect intermolecular NOEs between 15NHs of one component and 13CH of the other component.36

15N- and 13C-Filtered NOESY and TOCSY to Monitor Changes in the Conformation of the Ligand upon Interactions. In these experiments, the peaks arising from 15N- or 13C-labeled protein are filtered out, and only the peaks arising from the ligand are observed. This technique is particularly effective for protein segments that are small and disordered, and may adopt a compact structure upon binding. The presence of ordered structure is reflected in changes in NOE pattern, chemical shifts, and peak broadening. Filtered NOESY and TOCSY spectra of the bound ligand are compared to NOESY and TOCSY spectra of the free ligand, which show no chemical shift dispersion outside the random coil envelope. The additional NOEs observed in the spectrum of the bound peptide may be caused by an increase in structure of the peptide upon binding, giving NOEs from protons within the bound peptide, or they may be due to NOEs between the peptide and the protein. This technique was used effectively to study the interaction between ligand-binding repeats of a fibronectin-binding protein and a fibronectin module.37

Conformational Changes That Accompany Binding

Conformational changes upon binding can be followed by several techniques. We have monitored changes in the far-UV CD, fluorescence emission spectra, and increased resistance to proteolytic enzymes of IC(1–289) upon binding to LC8.
IC(1–289) has limited regular secondary structure when free in solution but gains structure and stability on binding to LC8.18 The mechanism by which LC8 increases the structure of IC74 on binding may be highly relevant to the assembly of the dynein complex and could give insights into the assembly of other large macromolecular complexes. To probe this mechanism, we make smaller constructs of IC74 that are large enough to reflect native conformation and to undergo conformational change upon LC8 binding. The technical problems of working with large constructs are that they are prone to aggregation and proteolysis with time. All these constructs aggregate and change

36 D. S. Garrett, Y. J. Seok, A. Peterkofsky, A. M. Gronenborn, and G. M. Clore, Nat. Struct. Biol. 6, 166 (1999).
37 C. J. Penkett, C. M. Dobson, L. J. Smith, J. R. Bright, A. R. Pickford, I. D. Campbell, and J. R. Potts, Biochemistry 39, 2887 (2000).


conformation when frozen and have a shelf life at 4 °C of 2–3 days. For reproducible results, it is best to work with fresh protein preparations.

Far-UV CD. Binding of LC8 to IC(1–289) results in an increase in CD-detected structure in the complex compared with free IC. CD spectra of bound and free IC(1–289) show a clear increase in the signal at 208 nm in the bound form, indicating an increase in helical or coiled-coil structure in IC(1–289) in the presence of LC8 (Fig. 10). The spectrum of bound IC was obtained by subtracting an LC8 spectrum from that of a 1:1 LC8:IC mixture, where LC8 and IC constructs were kept at the same concentrations in the bound and the free samples. Care should be taken to measure exact concentrations of free and bound LC8. A 1-mm cell length was used because the effect of LC8 on the structure was observed only at protein concentrations of 6 µM and higher.

Fluorescence Emission. IC(1–289) has two Trp residues that are exposed in the free protein, as indicated by an emission wavelength maximum of 350 nm. If there is an increase in tertiary structure upon binding, one would expect a change in fluorescence intensity and a blue shift in the emission maximum. To perform this experiment, we mutated the single Trp residue in LC8 to Phe and verified by GST pull-down assay that binding

Fig. 10. Increase in average ordered structure of IC(1–289) upon binding to LC8. CD spectra of bound and free IC(1–289) show a significant change in IC(1–289) upon LC8 binding as indicated by an increase in CD signal at 220 nm and a shift of the 203-nm signal toward 208 nm. Stronger 208- and 222-nm signals indicate increased percentage of coiled-coil or helical structure.


to IC(1–289) was not affected. The experiment was performed using an excitation wavelength of 295 nm to selectively excite tryptophan. The blank spectra, which contain the buffer and an equal concentration of LC8, are subtracted from the spectra of the complex. An increase in intensity and a blue shift were observed upon addition of LC8, indicating formation of a more compact structure (unpublished data).

Limited Proteolysis. An increase in structure is also observed as protection from proteolysis, visualized as intense bands for large fragments on an SDS–PAGE gel. The increase in resistance to proteolysis of IC(1–289) upon binding to LC8 is indicated by the persistence of bands for intact IC(1–289) and large fragments upon interaction with LC8 (Fig. 11). Interaction with other molecules often increases the stability of proteins, so that the unfolding–folding equilibrium is shifted toward the folded conformation. This tends to decrease the effective exposure of the protein to the protease, without necessarily indicating a significant change in the secondary or tertiary structure of the protein. Hence, this method alone does not verify the increase in structure. However, since both CD and fluorescence show an increase in structure upon binding, the increase in protection from proteolysis can be interpreted as due to an increase in structure and

Fig. 11. SDS–PAGE gel showing proteinase K digests of IC(1–289) in the presence (lanes 5, 7, 9, 11, 13, and 15) and absence of LC8 (lanes 6, 8, 10, 12, 14, and 16) with digestion time of 1, 5, 10, 15, 30, and 60 min. Lane 2 is a mixture of LC8 and IC(1–289) at time 0. Lanes 3 and 4 are free IC(1–289) and LC8 at time 0, respectively. Lane 17 shows 30-min digestion of free LC8 indicating that LC8 is not cleaved under these conditions. Molecular weight markers are shown in lane 1. The large-molecular-weight fragments (pointed to by arrows) and that of IC(1–289) persist significantly longer in the presence of LC8, while rapid degradation is observed in the absence of the ligand. Digestion was carried out at 4 °C, with an enzyme–substrate ratio of 1:500.


not simply due to stabilization upon binding. LC8 is not cleaved under the conditions of this experiment, and hence the increase in protection cannot be attributed to dilution of the enzyme by accessible sites on LC8.

Comparison to the Full-Length IC74

It is imperative that the binding studies on domains of IC74 be compared to the full-length IC74 to better reflect physiological conditions. The N-terminal domain IC(1–289) is disordered, as indicated by CD and fluorescence spectra and the relative ease of proteolysis. Although natively unfolded proteins are not unusual, it is necessary to confirm that this large domain of the protein is indeed disordered when part of intact IC74. For this we have prepared the full IC construct as a GST fusion and used limited proteolysis to identify folded and unfolded domains. Indeed, the full IC is digested at a rate similar to the N-terminal domain constructs, and the resulting stable domains correspond to the C-terminal region. This indicates that the N-terminal domain is also unfolded in the full protein (unpublished data).

Summary

We are investigating fundamental questions regarding the structural and thermodynamic basis for the activity of a cytoplasmic dynein motor protein complex. In this complex, the assembly of the various subunits is dynamic, and at least for IC74, assembly is accompanied by disorder-to-order transitions. In studying complex assembly, it is imperative to get a structural view of the relation of subunits to each other as well as the thermodynamic forces that drive the assembly and the conformational changes involved. We hope to have underscored the importance of bringing together a large suite of complementary and unique techniques for solving important biological problems.

Acknowledgment

The preparation and original research in this article were supported by NIH grant GM60969.

cooperativity in protein folding and assembly

[12] Circular Dichroism of Protein-Folding Intermediates

By Robert W. Woody

Introduction

The protein-folding problem is one of the central problems of molecular biophysics.1,2 There are two contrasting views3 of the process of protein folding: (1) the pathway model, according to which the protein folds via a series of discrete intermediates, and (2) the folding funnel model, according to which the polypeptide chain folds by moving down a free energy surface along any of a vast number of pathways. The role of folding intermediates in these two models is very different. In the pathway model, a series of well-defined intermediates is essential to the protein-folding process. By contrast, the folding funnel model considers folding to be an intrinsically fast process and regards intermediates as nonproductive kinetic traps that retard native structure formation.

The characterization of protein-folding intermediates is important, regardless of which of the fundamental models is correct. First, an understanding of their role is essential to distinguish between the models. In some cases, intermediates have been demonstrated to be on the folding pathway,4,5 providing strong support for the pathway model, at least in the latter stages of folding. In other cases, intermediates have been demonstrated to be nonproductive,6 and their elimination by mutation or changes in solution conditions greatly facilitates folding. Another general question about early intermediates is whether they are best described by the hierarchic model of secondary structure first, tertiary structure last, or whether they are generated by a nonspecific hydrophobic collapse, followed by parallel development of secondary and tertiary structure.

Circular dichroism (CD) has played a key role in the study of protein-folding intermediates. CD, fluorescence, and nuclear magnetic resonance (NMR) are the most extensively used methods for characterizing equilibrium and kinetic intermediates. This chapter will focus on the use of CD to provide structural information about folding intermediates. The much more numerous studies in which CD is used simply to monitor transitions and kinetics will not be considered. Similarly, results from fluorescence, NMR, and other biophysical methods that augment evidence from CD will be discussed in some cases, but not extensively.

1 C. M. Dobson, A. Šali, and M. Karplus, Angew. Chem. Int. Ed. Engl. 37, 868 (1998).
2 R. H. Pain, “Mechanisms of Protein Folding,” 2nd Ed. Oxford University Press, Oxford, 2000.
3 K. A. Dill and H. S. Chan, Nat. Struct. Biol. 4, 10 (1997).
4 D. V. Laurents, M. Bruix, M. Jamin, and R. L. Baldwin, J. Mol. Biol. 283, 669 (1998).
5 Y. Bai, Proc. Natl. Acad. Sci. USA 96, 477 (1999).
6 T. R. Sosnick, L. Mayne, R. Hiller, and S. W. Englander, Nat. Struct. Biol. 1, 149 (1994).

METHODS IN ENZYMOLOGY, VOL. 380
Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

Equilibrium Intermediates

Most small, single-domain proteins exhibit equilibrium unfolding transitions that show no evidence for intermediates.7 Such proteins are said to undergo two-state transitions because only two states, the native (N) and unfolded (U), are detectable. Thus, the unfolding equilibrium can be formulated as:

$$\mathrm{N} \rightleftharpoons \mathrm{U} \tag{1}$$

and the equilibrium constant for unfolding is

$$K_U = \frac{(\mathrm{U})}{(\mathrm{N})} = \frac{f_U}{f_N} = \frac{f_U}{1 - f_U} \tag{2}$$

where $f_U$ and $f_N$ are, respectively, the fraction of the protein in the unfolded and folded forms. There are two standard criteria for the validity of Eqs. (1) and (2) and hence for a two-state transition. (1) If only two states are present, the extent of the transition can be characterized by a single parameter, $f_U$ (or $f_N$), and this can be calculated from measurements of any physical property that is different in the folded and unfolded forms:

$$f_U = \frac{y - y_N}{y_U - y_N} \tag{3}$$

where $y$ is the measured value of the property, and $y_N$ and $y_U$ are the values of the property for the native and unfolded forms, respectively, under the same conditions. For two-state transitions, the transition parameter $f_U$ will exhibit the same dependence on the unfolding conditions (temperature, denaturant concentrations, pH, etc.) for all physical properties. This test should use properties that measure very different aspects of the protein structure. Far-ultraviolet (UV) and near-UV CD are frequently used for this purpose because they depend on the secondary and tertiary structure of the protein, respectively. (2) The enthalpy of unfolding, $\Delta H_U$, the enthalpy change for reaction (1), can be determined directly by calorimetry ($\Delta H_U^{\mathrm{cal}}$) or indirectly from the van’t Hoff equation:

$$\Delta H_U^{\mathrm{vH}} = -R\left[\frac{\partial \ln K_U}{\partial (1/T)}\right] \tag{4}$$

7 P. L. Privalov, Adv. Protein Chem. 33, 167 (1979).
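As an illustration of how Eqs. (2)–(4) chain together, the following minimal Python sketch converts a measured signal into the fraction unfolded, then into the unfolding constant, and finally estimates the van’t Hoff enthalpy from the least-squares slope of ln K_U against 1/T. The function names, the baseline ellipticities, and the synthetic melting data (ΔH = 60 kcal/mol, midpoint 330 K) are hypothetical and chosen only for the example; temperature-independent baselines and a temperature-independent enthalpy are assumed, which real data rarely satisfy exactly.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(y, y_native, y_unfolded):
    """Eq. (3): f_U = (y - y_N) / (y_U - y_N)."""
    return (y - y_native) / (y_unfolded - y_native)


def unfolding_constant(f_u):
    """Eq. (2): K_U = f_U / (1 - f_U); defined only for 0 < f_U < 1."""
    if not 0.0 < f_u < 1.0:
        raise ValueError("f_U must lie strictly between 0 and 1")
    return f_u / (1.0 - f_u)


def vant_hoff_enthalpy(temps_kelvin, k_values):
    """Eq. (4): -R times the least-squares slope of ln K_U versus 1/T."""
    xs = [1.0 / t for t in temps_kelvin]
    ys = [math.log(k) for k in k_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return -R_KCAL * sxy / sxx


def synthetic_signal(temp, dh, tm, y_native, y_unfolded):
    """Hypothetical two-state melt with constant dH and flat baselines."""
    k = math.exp(-(dh / R_KCAL) * (1.0 / temp - 1.0 / tm))
    f_u = k / (1.0 + k)
    return y_native + f_u * (y_unfolded - y_native)


# Hypothetical baselines (deg cm^2 dmol^-1) and temperatures (K).
Y_N, Y_U = -12000.0, -2000.0
temps = [320.0, 325.0, 330.0, 335.0, 340.0]
signals = [synthetic_signal(t, 60.0, 330.0, Y_N, Y_U) for t in temps]

k_vals = [unfolding_constant(fraction_unfolded(y, Y_N, Y_U)) for y in signals]
dh_vh = vant_hoff_enthalpy(temps, k_vals)
print(round(dh_vh, 3))  # van't Hoff enthalpy in kcal/mol
```

Because the synthetic data are strictly two-state, the recovered enthalpy matches the input value; with real data, comparing this number with the calorimetric enthalpy is the test described in the text.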

For a two-state transition, the calorimetric and van’t Hoff enthalpies are equal. The presence of detectable levels of intermediate(s) in the transition, thereby invalidating Eqs. (1) and (2), is manifested by a van’t Hoff enthalpy that is smaller than the calorimetric value.

Evidence for equilibrium intermediates was reported in the 1960s and 1970s for several proteins: growth hormone,8,9 carbonic anhydrase,10,11 and α-lactalbumin.12–14 The most common equilibrium folding intermediate is the molten globule (MG), which has the following characteristics:

1. It is less compact than the native protein but much more compact than the unfolded form.
2. The secondary structure content is comparable to that of the native form.
3. It has a fluctuating tertiary structure, generally lacking specific side chain interactions.
4. It has a hydrophobic core that is more loosely packed than in the native protein and into which solvent and small solutes can penetrate.

The term “molten globule” was coined in a discussion by Oleg Ptitsyn, Colyn Crane-Robinson, and Akiyoshi Wada at a meeting in Galzignano, Italy in 1982.15,16 It emphasizes the compact, nativelike features (globule) and the absence of specific tertiary interactions (molten). Molten globules and their role in protein folding have been reviewed frequently.17–23

8 H. G. Burger, H. Edelhoch, and P. G. Condliffe, J. Biol. Chem. 241, 449 (1966).
9 L. A. Holladay, R. G. Hammonds, Jr., and D. Puett, Biochemistry 13, 1653 (1974).
10 K.-P. Wong and C. Tanford, J. Biol. Chem. 248, 8518 (1973).
11 K.-P. Wong and L. M. Hamlin, Biochemistry 13, 2678 (1974).
12 K. Kuwajima, K. Nitta, M. Yoneyama, and S. Sugai, J. Mol. Biol. 106, 359 (1976).
13 K. Kuwajima, J. Mol. Biol. 114, 241 (1977).
14 M. Nozaka, K. Kuwajima, K. Nitta, and S. Sugai, Biochemistry 17, 3753 (1978).
15 M. Ohgushi and A. Wada, FEBS Lett. 164, 21 (1983).
16 C. Crane-Robinson, personal communication (2003).
17 M. Arai and K. Kuwajima, Adv. Protein Chem. 53, 209 (2000).
18 A. L. Fink, Annu. Rev. Biophys. Biomol. Struct. 24, 495 (1995).
19 K. Kuwajima and M. Arai, in “Mechanisms of Protein Folding” (R. H. Pain, ed.), 2nd Ed., p. 138. Oxford University Press, Oxford, 2000.
20 O. B. Ptitsyn, J. Protein Chem. 6, 273 (1987).
21 O. B. Ptitsyn, in “Protein Folding” (T. E. Creighton, ed.), p. 243. W. H. Freeman and Co., New York, 1992.
22 O. B. Ptitsyn, Adv. Protein Chem. 47, 83 (1995).
23 K. Kuwajima, Proteins Struct. Funct. Genet. 6, 87 (1989).


Circular dichroism is frequently used to demonstrate the existence of a molten globule intermediate. The CD criteria are (1) far-UV CD spectrum comparable in intensity and shape to that of the native form, indicative of similar secondary structure, and (2) weak near-UV CD spectrum, comparable to that of the unfolded protein, indicative of a poorly defined tertiary structure. Molten globules have been observed as intermediates in the unfolding of proteins by low and high pH, chemical denaturants, i.e., urea and guanidinium chloride (GuCl), high and low temperature, high pressure, organic solvents, e.g., alcohols, cleavage of disulfide bonds, or removal of tightly bound ligands. Molten globules produced by these various means are thermodynamically and structurally similar. Several of the earliest reports of MG behavior, antedating the term itself, were derived from studies of proteins at low pH. The term ‘‘A-form’’ for a protein partially unfolded by acid was introduced by Kuwajima et al.12 This term has been largely supplanted by ‘‘molten globule.’’ Many studies of MGs have used the acid-induced forms because it is frequently possible to find a pH at which the MG intermediate is substantially more stable than either the native or fully unfolded protein, and therefore the MG can be studied with little or no interference from other forms. This is rarely possible with MGs generated by denaturants or by heat. Variations on the MG theme have been proposed.24–27 An equilibrium intermediate that has substantial secondary structure, is less compact than classic MGs but more compact than the unfolded protein, and has low protection factors against hydrogen exchange ( II > III). The dashed lines represent the transformed spectra of components I, II, and III. The solid lines denote the spectra of cyt c in the acid-denatured state at pH 2.0 and the folded state at pH 4.5. The dotted line illustrates the spectrum of the equilibrium A state of cyt c in 1.5 M NaCl at pH 2.2. 
(Reprinted with permission from Akiyama et al.117 Copyright 2000 Nature America Inc.)


Fig. 14. Far-UV time-resolved CD results for reduced cytochrome c (cyt c) folding induced by photoreduction of oxidized cyt c. Although time-resolved CD spectra were measured at 32 time points after the initial photoevent, only 7 time points are shown above: 3 and 16 µs; 1, 10, 50, 100, and 320 ms. The spectra measured at 16 µs and 1 ms are shown in thicker black to highlight the dynamics of the CD signal in that time region. Each spectrum represents the average of at least 512 scans. The data were accumulated with 100 µM cyt c in 0.1 M NaP–3.5 M GuCl–500 µM NADH, pH 7, using a 0.5-mm path length flow cell. (Reprinted with permission from Chen et al.121 Copyright 1999 American Chemical Society.)

spectra were obtained with a custom-made instrument that measures changes in ellipticity with high sensitivity on a nanosecond time scale.122 (The usual method for measuring CD utilizes 50-kHz modulation of polarization and is therefore inapplicable on the microsecond time scale.) The time-resolved CD spectra are shown in Fig. 14 for times ranging from 3 µs to 320 ms, and compared with the equilibrium spectra of oxidized and reduced cytochrome c in 3.5 M GuCl. Singular value decomposition (SVD) analysis of the time-resolved spectra clearly indicated three kinetic phases, with a possible fourth phase. Data averaged over 222–225 nm could be analyzed to yield four phases, with time constants of 3.5 µs, 180 µs, 6 ms, and 180 ms. (For comparison, the fit to the full spectra yielded time constants of 5 µs, 6 ms, and 110 ms.) The 222 nm amplitudes of the spectra at 16 µs and 1 ms are similar to those observed in the burst phase for oxidized cytochrome c. The 5-µs process presumably corresponds

122 J. W. Lewis, R. A. Goldbeck, D. S. Kliger, X. Xie, R. C. Dunn, and J. D. Simon, J. Phys. Chem. 96, 5243 (1992).


to the nonspecific collapse inferred for the oxidized protein. The 180-µs process is associated with a small decrease in the CD at 222 nm. Chen et al.121 suggest that this phase may correspond to unfolding of some nonnative helices or of some secondary structure that is formed around a nonnative heme ligation.6 (The latter refers to the presence of two potential heme ligands, His-26 and His-33, that can bind to the heme iron opposite to His-18, which is liganded under nearly all conditions. The second ligand is Met-80 in the native protein.) Refolding experiments triggered by CO photolysis from the CO complex of reduced cyt c, cyt c CO, have been reported.123 Time-resolved CD experiments on this system have superior time resolution, down to approximately 420 ns. However, under the conditions used (4.6 M GuCl, 40°), the early intermediates in folding are unstable, so only small changes were observed in the time-resolved far-UV CD spectra. A maximal change in 222 nm CD is observed at 2 µs, with an amplitude corresponding to 8% of the native CD signal. At 5 ms, the CD spectrum is indistinguishable from that of the initial cyt c CO.

Ribonucleases

The folding of bovine pancreatic ribonuclease A (RNase A) has been studied by Houry et al.94 using stopped-flow kinetics. Unfolded RNase A at equilibrium is heterogeneous, consisting of a mixture of species differing in their state of cis–trans isomerization about the four X-Pro bonds in the molecule. A homogeneous population can be produced by rapidly unfolding the native protein and refolding after a minimal delay, thus assuring that all X-Pro bonds are in the same isomerization state as in the native protein. The unfolded species produced this way is called Uvf because it exhibits very fast refolding. This procedure is known as a double-jump experiment because it involves two successive rapid changes in GuCl concentrations, pH, etc. Refolding of Uvf on dilution from 5.9 M to 0.9 M GuCl, pH 7, gave rise to a burst-phase intermediate (I) in which about 52% of the native CD intensity at 222 nm was recovered. A significant burst phase was also observed at 275 nm, corresponding to recovery of 39% of the native intensity. The observation of near-UV CD in the burst phase is interesting and contrasts with the lack of a burst phase in absorption.124 The latter observation implies that the Tyr side chains of I are exposed to the same extent as they are in the unfolded form. Houry et al.94 suggest that the exposed Tyr(s) must give rise to the side chain CD in I,

123 E. Chen, M. J. Wood, A. L. Fink, and D. S. Kliger, Biochemistry 37, 5589 (1998).
124 W. A. Houry, D. M. Rothwarf, and H. A. Scheraga, Biochemistry 33, 2516 (1994).


and their subsequent discussion implies that they are referring to Tyr(s) exposed in the native form. However, since all Tyr are exposed in I, one cannot make any assignment to specific individual side chains or groups of side chains. Houry et al.94 conclude that I is a hydrophobically collapsed species with significant levels of secondary and tertiary structure.

Equilibrium MGs of RNase A have not been well characterized. GuCl-induced unfolding of RNase appears to be two-state: the midpoints of the transition detected by far- and near-UV CD agree within experimental error.94 The A state of RNase A is nativelike, with a far-UV CD spectrum very similar to that of the native protein32,65,78 and a near-UV CD spectrum with 83% of the native intensity. Thus, the A state of RNase A is more ordered than the I intermediate observed by Houry et al.94

The burst phase in folding of GuCl-unfolded RNase was also investigated by Qi et al.125 They compared the 222 nm CD of the burst phase and the steady-state CD of reduced, carboxamidomethylated RNase (CAM-RNase A). CAM-RNase A has been widely used as a model for unfolded proteins. Qi et al.125 showed that the CD of CAM-RNase A at 222 nm is nearly independent of temperature from 0° to 100° in the absence of GuCl, is equal to that of RNase A at 100°, and shows only a broad noncooperative dependence on GuCl concentration. Earlier studies of HD exchange demonstrated no protection against exchange in CAM-RNase A.126 Houry et al.94 observed that the CD of their burst-phase intermediate at 0.9 M GuCl, I, is equal to that for CAM-RNase A but dismissed the agreement as coincidental. Qi et al.125 found that the 222 nm CD of the burst phase and of CAM-RNase A agree not just at a single GuCl concentration but over the whole range from 0 to 5 M. This suggests that the burst-phase intermediate in folding of RNase A is not a MG with substantial secondary structure, but the product of a hydrophobic collapse with relatively little defined secondary structure. In support of this, the molar ellipticity of the burst-phase intermediate at low GuCl concentrations is −3000 deg cm² dmol⁻¹.78 However, it is difficult to reconcile the substantial intensity of the near-UV spectrum of the burst-phase intermediate94 with a product of nonspecific hydrophobic collapse. The near-UV CD spectrum of CAM-RNase has not been reported.

Dilution from 5.9 to 2.6 M GuCl at pH 7 gives rise to a different burst-phase species, called IU.94 The 222 nm CD of the IU intermediate corresponds to a recovery of less than 7% of the native CD intensity. There is no evidence for a burst phase in near-UV CD. Houry et al.94 suggest that

125 P. X. Qi, T. R. Sosnick, and S. W. Englander, Nat. Struct. Biol. 5, 882 (1998).
126 Y. Bai, J. S. Milne, L. Mayne, and S. W. Englander, Proteins Struct. Funct. Genet. 17, 75 (1993).


IU probably lacks any significant secondary or tertiary structure. The difference in CD between Uvf and IU may be simply due to a redistribution of local conformations within the ensemble of unfolded conformations, as the PII conformers stabilized by GuCl are redistributed on the Ramachandran map.127

The burst-phase intermediate in the refolding of urea-denatured ribonuclease H has a far-UV CD spectrum that is identical to that of the A state, which is observed at pH 1.128 When the A state is subjected to a jump to pH 5.5, the kinetics of folding as detected by 222 nm CD is identical to the kinetics for folding the urea-unfolded protein at the same pH. Thus, the burst-phase intermediate and the A state of RNase H are kinetically equivalent.

Apomyoglobin

Jennings and Wright84 studied the kinetics of refolding of apomyoglobin from concentrated urea solutions. Within the dead time (5 ms), 64% of the 222 nm CD was recovered, as shown in Fig. 15. The 222 nm ellipticity of the burst phase extrapolated to zero urea concentration is −15,000 deg cm² dmol⁻¹, to be compared with −16,200 obtained for the urea-induced MG129 and −14,500 for the MG formed at pH 4.2 in the absence of urea.130 The free energy of unfolding the burst-phase intermediate was estimated to be 2.5 ± 0.5 kcal/mol, whereas a value of 2.1 kcal/mol was determined for the equilibrium MG induced by urea. These lines of evidence for the equivalence of the burst-phase intermediate and equilibrium MG, based on CD, have received strong support from NMR studies of the MG and of HD exchange kinetics.131

Ubiquitin

Many small, single-domain proteins have been found to fold by two-state kinetics, i.e., without detectable intermediates. A recent study of ubiquitin folding132 illustrates the need for caution in assigning proteins to this class. At 25° and low GuCl, a burst phase in the folding of ubiquitin was detected by fluorescence,133 using a single-tryptophan mutant, F45W (Ub*). At 8°, however, no burst phase was observed by fluorescence. The

127 Z. Shi, R. W. Woody, and N. R. Kallenbach, Adv. Protein Chem. 62, 163 (2002).
128 T. M. Raschke and S. Marqusee, Nat. Struct. Biol. 4, 298 (1997).
129 D. Barrick and R. L. Baldwin, Biochemistry 32, 3790 (1993).
130 S. N. Loh, M. S. Kay, and R. L. Baldwin, Proc. Natl. Acad. Sci. USA 92, 5446 (1995).
131 A. K. Chamberlain and S. Marqusee, Adv. Protein Chem. 53, 283 (2000).
132 Z. Qin, J. Ervin, E. Larios, M. Gruebele, and H. Kihara, J. Phys. Chem. B 106, 13040 (2002).
133 S. Khorasanizadeh, I. D. Peters, T. R. Butt, and H. Roder, Biochemistry 32, 7054 (1993).




Fig. 15. (A) Refolding kinetics of apomyoglobin (apoMb) at pH 6.1 and 5° monitored by stopped-flow CD spectroscopy at 222 nm. The ellipticity of the unfolded protein (left arrow) was obtained by linear extrapolation of data collected above 4 M urea in equilibrium CD experiments. The ellipticity of the folded protein (right arrow) in 0.8 M urea was determined from CD experiments. The solid line is a single-exponential fit of the data. (B) The equilibrium unfolding of apoMb ( ) was monitored by the change in ellipticity at 222 nm as a function of urea concentration at pH 6.1 and 5°. The burst-phase amplitudes [θ]0 observed in refolding experiments are also indicated (m). (C) CD spectra of the burst-phase intermediate ( ) and refolded ( ) protein generated by varying the wavelength of detection during stopped-flow refolding experiments. Spectra of the native protein in 0.8 M urea (lower curve) and unfolded protein in 6.0 M urea (upper curve) under equilibrium conditions are also shown. (Reprinted with permission from Jennings and Wright.84 Copyright 1993 American Association for the Advancement of Science.)
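The burst-phase amplitude marked in a trace such as panel (A) can be expressed as a fraction of the total ellipticity change between the unfolded and folded baselines. The short Python sketch below shows that arithmetic. It assumes the recovery is defined relative to the unfolded-to-native difference, which is one common convention and is not stated explicitly in the text; the function name and the three ellipticity values are hypothetical, chosen only so that the result is 64%.

```python
def burst_phase_fraction(theta_unfolded, theta_burst, theta_native):
    """Fraction of the unfolded-to-native ellipticity change that is
    already complete at the end of the instrument dead time."""
    total_change = theta_native - theta_unfolded
    if total_change == 0:
        raise ValueError("native and unfolded ellipticities must differ")
    return (theta_burst - theta_unfolded) / total_change


# Hypothetical mean residue ellipticities at 222 nm (deg cm^2 dmol^-1).
frac = burst_phase_fraction(
    theta_unfolded=-2000.0,   # extrapolated unfolded baseline
    theta_burst=-8400.0,      # signal extrapolated to zero time
    theta_native=-12000.0,    # folded protein under final conditions
)
print(f"{frac:.0%}")  # prints 64%
```

If the recovery is instead quoted relative to the native intensity alone, the denominator becomes the native ellipticity, so the convention should be checked against the original report before numbers are compared.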





absence of the intermediate at low temperature was interpreted as evidence that the species observed at room temperature was a product of hydrophobic collapse, which was destabilized at low temperature. Stopped-flow CD experiments on ubiquitin at lower temperatures (4° to −20°) in ethylene glycol (EG)–water mixtures (up to 45% EG) revealed


Fig. 16. Ubiquitin F45W mutant (Ub*) CD spectra under different solvent conditions. The native spectra are similar to one another over the whole −20° to +25°/0 to 45% ethylene glycol range, compared to the GuCl-denatured state and the burst-phase intermediate. The latter has a larger magnitude of the CD spectrum than even the native state. GuCl and intermediate data were acquired at −20°. (Reprinted with permission from Qin et al.132 Copyright 2002 American Chemical Society.)

an overshoot in the CD spectrum, as shown in Fig. 16. The amplitude of the burst phase remains constant over temperatures of −10° to −20° and for EG concentrations of 25–40%. Above 4°, the burst phase is no longer resolvable from the main phase in CD. A small burst phase (4% of the total amplitude) is also seen in fluorescence at low temperatures. The static far-UV CD (Fig. 16) is essentially the same at 25°, 0% EG and at −20°, 45% EG, demonstrating that the conformation of the native protein remains unchanged under the cryoscopic conditions. The shape of the burst-phase CD spectrum is very different from that for β-lactoglobulin, showing that the overshoot does not arise from nonnative α-helix but is probably due to nonnative β-sheet in the intermediate. The radius of gyration of the intermediate is 20% larger than that of the native protein. The CD intensity of the burst-phase intermediate argues strongly against the hydrophobic collapse model and in favor of a MG. Fluorescence is probably less sensitive to burst-phase formation in this case because the fluorophore remains largely exposed in this species.

Dihydrofolate Reductase

Stopped-flow studies61 of the folding of urea-denatured E. coli dihydrofolate reductase (DHFR) were a landmark in the application of CD to the study of folding intermediates. Kuwajima et al.61 observed a burst phase in


which 40% of the 220 nm CD was recovered in the dead time (16 ms). The shape of the CD spectrum for the burst-phase intermediate indicated little or no α-helix. Secondary structure was analyzed by three different methods.36,38,134 The data were limited in wavelength range by absorption due to residual urea, and one of the methods36 gave results that deviated substantially from those for the other two methods.38,134 Since the latter two methods gave the best agreement with the X-ray structure for the native protein, the average from these will be quoted here. The quantitative analysis (average of the two methods) indicated that the burst-phase intermediate has less α-helix than the native (14% vs. 23%) and comparable β-sheet to the native (18% vs. 20%). Measurement of the burst-phase amplitude at various urea concentrations provided data on the unfolding of the burst-phase species. This showed a broad, noncooperative transition with increasing urea concentration. Near-UV CD was not investigated, but a previous stopped-flow fluorescence study of DHFR folding did not show a burst phase.135

After testing fits to varying numbers of exponentials, the CD data over the period 16 ms to 250 s were fit to five exponentials,61 using the time constants determined in a previous kinetics study by fluorescence and absorption.136 The transient spectra associated with the five time constants are shown in Fig. 17, represented as kinetic difference spectra. The fastest transient, τ5, with a time constant of 200 ms, has the most distinctive and readily interpretable shape. It is a positive couplet centered at 228 nm. Kuwajima et al.61 proposed that this couplet results from exciton coupling between Trp-47 and Trp-74, which are nearest neighbors in the native structure. The 200-ms transient had previously been identified by stopped-flow fluorescence137 and was interpreted as the effect of burying Trp-74 in a hydrophobic cluster. Mutation of Trp-74 to Leu (W74L) leads to loss of this transient in both fluorescence and far-UV CD. The static CD difference spectrum (wild type − W74L) is nearly superposable on the kinetic difference spectrum τ5. These strong experimental arguments for an exciton interaction are supported by theoretical calculations62 that predict that exciton coupling of Trp-47 and Trp-74 in the native form of DHFR gives rise to a positive CD couplet with a magnitude comparable to that observed in the equilibrium difference spectrum. The appearance of exciton coupling between the two Trp side chains, separated by 30 residues in

134 C. T. Chang, C.-S. Wu, and J. T. Yang, Anal. Biochem. 91, 13 (1978).
135 E. P. Garvey and C. R. Matthews, Biochemistry 28, 2083 (1989).
136 N. A. Touchette, K. M. Perry, and C. R. Matthews, Biochemistry 25, 5445 (1986).
137 E. P. Garvey, J. Swank, and C. R. Matthews, Proteins Struct. Funct. Genet. 6, 259 (1989).


Fig. 17. Kinetic difference CD spectra for the five phases in refolding of DHFR at 0.4 M urea, pH 7.8, and 15°. Observed difference spectra are shown by ( ), and theoretical best-fit curves with three sets of reference data for secondary structure CD spectra: (- - -) Greenfield and Fasman,36 (– – –) Chen et al.,38 and (——) Chang et al.134 A thick solid line in the panel for τ5 shows the equilibrium difference between the native spectra of the wild-type and the W74L mutant proteins. The equilibrium difference spectrum coincides with the kinetic difference spectrum. (Reprinted with permission from Kuwajima et al.61 Copyright 1991 American Chemical Society.)



the sequence, indicates that an important element of tertiary structure forms on the 200-ms time scale. The spectra of the other four transients, with time constants of 900 ms, 6 s, 37 s, and 160 s, are more difficult to interpret. The quantitative analysis indicates that transients τ4 and τ3 (in order of increasing time constants) involve some loss of β-sheet and gain of α-helix. The CD changes associated with transients τ2 and τ1 are at or below the level of error in the analysis, especially transient τ1. These latter two processes may involve proline isomerizations and lead to rather local structural changes.
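The quantitative secondary-structure analyses referred to above rest on representing a CD spectrum as a weighted sum of reference spectra. The following Python sketch shows the simplest version of that idea, an unconstrained linear least-squares fit solved through the normal equations. It is not a reimplementation of any of the three cited methods, which differ in their reference sets and constraints; the basis values are hypothetical numbers invented for the example, and the test spectrum is synthesized from the 14% helix and 18% sheet quoted for the DHFR burst-phase intermediate, with the remainder assigned to a generic "other" class.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fractions(spectrum, basis):
    """Weights f_k minimizing sum over wavelengths of (y - sum_k f_k B_k)^2."""
    k = len(basis)
    ata = [[sum(p * q for p, q in zip(basis[i], basis[j])) for j in range(k)]
           for i in range(k)]
    aty = [sum(p * y for p, y in zip(basis[i], spectrum)) for i in range(k)]
    return solve_linear(ata, aty)


# Hypothetical reference ellipticities (deg cm^2 dmol^-1) at five wavelengths.
BASIS = [
    [-30000.0, -33000.0, -36000.0, -20000.0, -5000.0],  # helix
    [-10000.0, -18000.0, -12000.0, -6000.0, -1000.0],   # sheet
    [-15000.0, -4000.0, 2000.0, 3000.0, 500.0],         # other
]

true_fractions = [0.14, 0.18, 0.68]
spectrum = [sum(f * b[i] for f, b in zip(true_fractions, BASIS))
            for i in range(5)]

fitted = fit_fractions(spectrum, BASIS)
print([round(f, 3) for f in fitted])
```

With noise-free synthetic input the fit returns the fractions used to build the spectrum. Real data, truncated at short wavelengths by urea absorption as noted in the text, constrain the fit far less well, which is one reason the methods can disagree.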


Pectate Lyase C

The folding of GuCl-unfolded pectate lyase C (pelC), a parallel β-helix protein, was studied by stopped-flow CD.92 Far-UV CD revealed three kinetic phases following a burst phase with 20% of the total amplitude (Fig. 18). The observed time constants were 250 ms, 21 s, and 46 s. Near-UV CD (Fig. 19) gave evidence of an additional phase at 1 s, besides the three observed in far-UV CD. Manual-mixing fluorescence kinetics was consistent with the three slower processes observed in CD. About 30% of the total change in CD at 218 nm occurs in the 250-ms phase, 10% in the

Fig. 18. (Upper panel) A representative folding curve for pelC measured by far-UV CD. Conditions are 25 mM MES, pH 6, 50 mM NaCl, 0.2 M GuCl, 25°. The main figure shows the results from manual mixing and the inset is from stopped-flow mixing. (Lower panel) Time-resolved CD spectra of the fast events in pelC folding. Final conditions were the same as above. Shown are the spectra of pelC in 5 M GuCl ( – – ) and native pelC (- - -). The solid lines are pelC folding from 0.24 to 2.2 s at 200 ms intervals. Also shown is the spectrum at 0 s determined by extrapolation (– – –). (Reprinted with permission from Kamen and Woody.92 Copyright 2002 American Chemical Society.)


Fig. 19. A representative folding curve for pelC measured by near-UV CD at 277 nm. Conditions were 25 mM MES, 50 mM NaCl, 0.3 M GuCl, 25°. The data are cut for clarity but equilibrate to the native CD value. The inset shows the fast phase and the change in sign. (Reprinted with permission from Kamen and Woody.92 Copyright 2002 American Chemical Society.)

21-s phase, and 40% in the 46-s phase. An interesting feature of the 250-ms phase in near-UV CD is that the sign is negative, opposite that of the slower phases. The near-UV CD contributions of each Tyr and Trp side chain were calculated92 for the native state from the crystal structure.138 The total near-UV CD from the side chains was found to be positive and in good agreement with the equilibrium CD spectrum. The total for those aromatic groups in the β-sheet was found to be negative and of a magnitude consistent with that associated with the fast phase. This led to the proposal that the 250-ms phase corresponds to the formation of the secondary and tertiary structure in the β-sheet region of pelC. The 1-s phase was then assigned to the locking in of the conformation of the loops connecting the β-strands. This is consistent with the observation of this phase in near-UV CD and fluorescence, but not in far-UV CD. The loops are dynamically disordered in the intermediate produced in the 250-ms phase and become statically disordered in the 1-s phase, leading to only small changes in far-UV CD but readily observable near-UV CD and fluorescence signals. The two slow phases result from proline isomerization.139 The slowest phase is specifically attributable to Pro-220, which forms a cis peptide bond in the native protein.
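The far-UV amplitudes and time constants quoted for pelC can be combined into a simple picture of how much of the total CD change is complete at a given time. The Python sketch below does this under the assumption that each resolved phase is a single exponential and that the burst phase is instantaneous on the time scale of interest. The amplitudes (20%, 30%, 10%, 40%) and time constants (250 ms, 21 s, 46 s) are taken from the text; the function name and the sampled time points are hypothetical.

```python
import math

# Burst-phase amplitude as a fraction of the total far-UV CD change.
BURST = 0.20

# (amplitude as fraction of the total change, time constant in seconds)
PHASES = [
    (0.30, 0.250),  # 250-ms phase
    (0.10, 21.0),   # 21-s phase
    (0.40, 46.0),   # 46-s phase
]


def fraction_complete(t_seconds):
    """Fraction of the total CD change reached at time t after mixing,
    modeled as a burst plus a sum of single-exponential phases."""
    done = BURST
    for amplitude, tau in PHASES:
        done += amplitude * (1.0 - math.exp(-t_seconds / tau))
    return done


for t in (0.0, 1.0, 30.0, 300.0):
    print(f"t = {t:6.1f} s  fraction complete = {fraction_complete(t):.3f}")
```

In this model the signal starts at the burst amplitude and approaches 1 only after several multiples of the slowest time constant, which illustrates why both stopped-flow and manual-mixing measurements are needed to cover the full time course.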

138 M. D. Yoder, N. T. Keen, and F. Jurnak, Science 260, 1503 (1993).
139 D. E. Kamen and R. W. Woody, Biochemistry 41, 4724 (2002).

Tryptophan Synthase

It has frequently been noted87,89,90 that although the CD of burst-phase intermediates implies nativelike secondary structure, pulse-labeling HD exchange measurements indicate only weak protection of backbone protons against exchange. The F2 fragment of the β2-subunit of E. coli tryptophan synthase, an equilibrium MG at ordinary temperatures,26 provides a useful model to examine this discrepancy. Stopped-flow CD and fluorescence in the presence of ANS showed a burst phase with the same amplitudes as those of the protein at equilibrium with no further changes. Thus, the burst-phase intermediate has the same CD and ANS-binding properties as the equilibrium MG of F2. As demonstrated for the equilibrium MG of F2, discussed previously, low protection factors against amide HD exchange are not in conflict with CD evidence for significant secondary structure, and this is expected to hold for other burst-phase intermediates.

Acknowledgments

The author thanks Dr. Narasimha Sreerama and Dr. A-Young Moon Woody for critically reading the manuscript and providing helpful comments and suggestions; Ms. Janice Chapman and Dr. Sreerama for expert assistance in preparing the manuscript; and NIH Grant EB-02803 (formerly GM22994) for financial support.

[13] Amide Hydrogen Exchange/Mass Spectrometry Applied to Cooperative Protein Folding: Equilibrium Unfolding of Staphylococcus aureus Aldolase

By Hai Pan and David L. Smith

Introduction

Understanding the rules that govern protein folding is of considerable practical importance to many areas of biotechnology and medical research. Since the mid-1960s, protein folding has generally been described as a highly cooperative process.1 According to the predominant two-state hypothesis, the thermodynamics of the folding/unfolding process can be described using only two states, the native state and the unfolded state. Although the folding path may include partially folded intermediates, the intermediates are not significantly populated. Studies by amide hydrogen

1 R. Lumry and R. Biltonen, Biopolymers 4, 917 (1966).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

exchange (H/D) nuclear magnetic resonance (NMR) performed under native conditions indicate that different regions within small, single-domain proteins can have different stabilities.2,3 These results suggest that folding of small proteins involves a small number of discrete, metastable intermediates.4,5 To understand better how large proteins fold, it is important to obtain detailed information on the structure, stability, and relationships of these intermediates. These partially folded intermediates are difficult to detect under physiological conditions because their populations are small and their signals are often masked by those of the abundant native state. However, detection of these partially folded intermediates is possible when denaturants, such as urea or guanidine hydrochloride (GdHCl), are used to destabilize the folded form of the protein.

It has been known for nearly 40 years that H/D exchange at peptide amide linkages in polypeptides depends strongly on the noncovalent structures (i.e., folded conformations) of proteins.6,7 Experimental methods used to detect H/D exchange in proteins include tritium, ultraviolet (UV), infrared (IR), NMR, neutron scattering, and mass spectrometry (MS). Although H/D exchange has been used to study a wide range of structural features in folded proteins, applications to protein folding have been particularly successful.8–18 Two-dimensional NMR has been the standard technique for characterizing the structures of folding intermediates in small proteins

2 Y. Bai, T. R. Sosnick, L. Mayne, and S. W. Englander, Science 269, 192 (1995).
3 A. K. Chamberlain, T. M. Handel, and S. Marqusee, Nature Struct. Biol. 3, 782 (1996).
4 A. K. Chamberlain and S. Marqusee, Structure 5, 855 (1997).
5 S. W. Englander, Annu. Rev. Biophys. Biomol. Struct. 29, 213 (2000).
6 A. Hvidt and S. O. Nielsen, Adv. Protein Chem. 21, 287 (1966).
7 S. W. Englander and N. R. Kallenbach, Q. Rev. Biophys. 16, 521 (1984).
8 B. B. Kragelund, C. V. Robinson, J. Knudsen, C. M. Dobson, and F. M. Poulsen, Biochemistry 34, 7217 (1995).
9 A. Miranker, C. V. Robinson, S. E. Radford, R. T. Aplin, and C. M. Dobson, Science 262, 896 (1993).
10 Q. Yi and D. Baker, Protein Sci. 5, 1060 (1996).
11 D. K. Heidary, L. A. Gross, M. Roy, and P. A. Jennings, Nat. Struct. Biol. 4, 725 (1997).
12 Y. Deng and D. L. Smith, Biochemistry 37, 6256 (1998).
13 C. S. Maier, M. I. Schimerlik, and M. L. Deinzer, Biochemistry 38, 1136 (1999).
14 V. Tsui, C. Garcia, S. Cavagnero, G. Siuzdak, H. J. Dyson, and P. E. Wright, Protein Sci. 8, 45 (1999).
15 J. E. Coyle, F. L. Texter, A. E. Ashcroft, D. Masselos, C. V. Robinson, and S. E. Radford, Nat. Struct. Biol. 6, 683 (1999).
16 M. Jager and A. Plückthun, Protein Sci. 9, 552 (2000).
17 S. J. Eyles, J. P. Speir, G. H. Kruppa, L. M. Gierasch, and I. A. Kaltashov, J. Am. Chem. Soc. 122, 495 (2000).
18 J. Chen, S. Walter, A. L. Horwich, and D. L. Smith, Nat. Struct. Biol. 8, 721 (2001).

because it can be used to quantify deuterium levels at individual amide linkages. In the last decade, H/D MS has evolved into a useful tool for studying the folding of both small and large proteins. Although H/D MS remains a low-resolution method, it offers several advantages over NMR.19,20 First, MS has the ability to detect proteins or peptides with very high sensitivity. Because of its high absolute sensitivity, MS requires less than a nanomole of protein. The high sensitivity is essential when the quantity of protein is limited. In addition to its high absolute sensitivity, MS can be used to analyze very dilute solutions where the concentration of protein is only micromolar. Second, MS can be used to study relatively large proteins. Some proteins with molecular masses over 100,000 Da have been studied by H/D MS, whereas NMR studies are typically limited to proteins under 30,000 Da. Third, exchange of all amide hydrogens can be detected by MS, whereas exchange of more labile amide hydrogens generally cannot be detected by NMR. Isotopic exchange is often quenched before analysis by either NMR or MS. In NMR experiments, quenching is achieved by rapid refolding of the protein to the native state where exchange at many, but not all, amide linkages is slow. In MS experiments, H/D exchange may be quenched by lowering the pH and the temperature, which slows the exchange rate at all amide linkages, thereby facilitating detection of structural changes along the entire polypeptide backbone. These advantages of MS are countered by its relatively low spatial resolution (typically 5–10 residues), although ion fragmentation in the gas phase may lead to improved spatial resolution.21 Two different labeling techniques have been used to label unfolded regions of proteins.22 In continuous labeling, the protein is exposed to D2O for a wide range of times. 
This approach has been used to detect very slow unfolding processes under native conditions where the population of the unfolded form was always small.23 The small population of unfolded molecules can be detected because the labeling is cumulative over time, even though the unfolded forms quickly refold. In pulsed labeling experiments, the protein is exposed to the labeling buffer for a short period of time that is sufficient to label only unfolded regions. This approach provides a direct measure of the populations of different structural forms present at the time of labeling.

19 D. L. Smith, Y. Deng, and Z. Zhang, J. Mass Spectrom. 32, 135 (1997).
20 J. R. Engen and D. L. Smith, Anal. Chem. 73, 256A (2001).
21 Y. Deng, H. Pan, and D. L. Smith, J. Am. Chem. Soc. 121, 1966 (1999).
22 Y. Deng, Z. Zhang, and D. L. Smith, J. Am. Soc. Mass Spectrom. 10, 675 (1999).
23 J. R. Engen, T. E. Smithgall, W. H. Gmeiner, and D. L. Smith, Biochemistry 36, 14384 (1997).

Protein folding may be studied by pulse-labeling H/D MS under either equilibrium or kinetic conditions. In equilibrium experiments, isotope labeling is performed only after the populations of all structural forms (i.e., folded, intermediate, and unfolded) reach thermodynamic equilibrium. In kinetic experiments, labeling is performed as the protein approaches an equilibrium state (e.g., folded or unfolded). In either case, the H/D MS results can show which regions are folded and the populations of these regions. These populations may be used with modeling to determine important features of the folding/unfolding thermodynamics and kinetics.

This contribution provides a detailed prescription for designing and performing typical pulse-labeling H/D MS experiments used to characterize the structures and thermodynamics of folding intermediates. Application of these procedures to the equilibrium unfolding of Staphylococcus aureus aldolase will be described.

Experimental Methods

Sample Preparation

The experimental procedure used to study equilibrium unfolding of S. aureus aldolase by pulsed labeling H/D MS is illustrated in Fig. 1. Aldolase from S. aureus (Sigma Chemical Co.) was used without further purification. The protein powder was dissolved in phosphate buffer (5 mM, pH 7.5) and allowed to equilibrate. Because of the high sensitivity of amide hydrogen exchange rates to pH, it is usually necessary to use the appropriate buffers at each step. Phosphate buffers (5–25 mM) are particularly convenient because they have good capacity at both the exchange step (pH 7.5) and the quench step (pH 2.4). The protein solution was diluted 10-fold into various urea/H2O solutions [0.5–3 M urea, 5 mM phosphate, pH 7.5, 1 mM dithiothreitol (DTT)]. Another denaturant, GdHCl, was used as the incubation buffer in some experiments. After 24-h incubation in denaturant, an aliquot (100 µl) of this solution was pulse labeled for 5 s with 900 µl of the labeling buffer (urea/D2O, 5 mM phosphate buffer, 1 mM DTT, pH 7.4). The same concentration of urea was used in both the incubation and labeling solutions to minimize structural changes during the labeling step. The duration of the labeling time was chosen to be just adequate to completely exchange all peptide amide hydrogens in unfolded regions. Although 5–10 s is generally adequate for experiments performed at pH 7 and 20 °C, detailed considerations for optimizing the labeling time have been discussed elsewhere.22

Successful H/D MS unfolding studies normally require preparation of high concentrations of urea or GdHCl where all exchangeable hydrogens

Fig. 1. The experimental procedure used to study equilibrium unfolding of S. aureus aldolase by pulsed labeling H/D MS.

have been replaced with deuterium. Deuterated denaturants can be prepared by dissolving reagent-grade denaturant in D2O and drying in a vacuum centrifuge. Effectively complete exchange occurs after three or four cycles. The concentrations of these solutions can be determined by their refractive index.24

Following the 5-s labeling period, isotope exchange was quenched by decreasing the pH and temperature to 2.4 and 0 °C, respectively. Under these conditions, the half-life for H/D exchange at most peptide amide linkages is 20–300 min.25 The pH of the solution can be decreased by adding the appropriate volume of quench solution, such as hydrochloric acid (HCl), phosphoric acid, trifluoroacetic acid (TFA), or phosphate buffer (100 mM, pH 2.3). In folding/unfolding studies, strong acid is recommended to minimize sample dilution, thereby increasing sensitivity. The volume of quench solution added is ideally less than 10% of the labeling solution. The quench solution used in this study was 0.2 M HCl; the volume added

24 C. N. Pace, Methods Enzymol. 131, 266 (1986).
25 Y. Bai, J. S. Milne, L. Mayne, and S. W. Englander, Proteins Struct. Funct. Genet. 17, 75 (1993).

depended on the concentration of urea in the solution. The temperature of the solution was decreased using dry ice. Labeled samples can be stored at −80 °C for 1–2 weeks without appreciable H/D exchange.

Analysis of Labeled Protein by HPLC ESIMS

Although a variety of MS approaches have been used to analyze the labeled protein, high-performance liquid chromatography electrospray ionization mass spectrometry (HPLC ESIMS) has many advantages. For example, this approach permits use of virtually any buffers or denaturants, separation of large quantities of other proteins, detection of more peptic fragments, and rapid concentration of protein and peptides. The conditions required to quench H/D exchange are compatible with reverse-phase HPLC. It is also important to note that rapidly exchanging deuterium located in the side chains is replaced with protium during HPLC. As a result, the increase in molecular mass is attributable only to deuterium located at peptide amide linkages.

To desalt the proteins, two small perfusion columns (2 × 20 mm) were hand-packed with Poros 20 (Perseptive Biosystems). One column was used as an injector loop (to concentrate sample) and the other was located between the injector and the ESI probe (to separate components). Because these columns are inexpensive and easily packed, they can be repacked when blocked by protein aggregates. Injection and HPLC mobile phases (water and acetonitrile with 0.5% formic acid) were placed in an ice bath to minimize H/D exchange. TFA, another ion-pairing agent, is also often used for H/D MS analyses. However, TFA may suppress the ESI signal. Depending on the ionization efficiency of a particular protein and the sensitivity of the mass spectrometer, 50–200 pmol of protein is injected. The flow rate to the ESI probe was relatively high (50 µl/min) because the ESI probe cannot be easily cooled. High flow rates are used to minimize the time labeled protein spends in regions that are not cooled to 0 °C.
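As a rough guide to why cold mobile phases and short transit times matter, the sketch below estimates the fraction of deuterium retained at an amide linkage after a given time under quench conditions. It assumes simple first-order back-exchange with a single half-life, which is a simplification because a real protein contains amides with a spread of rates. The 20–300 min range is the half-life quoted above for pH 2.4 and 0 °C; the function name and the elapsed times are illustrative assumptions, not values from this study.

```python
def fraction_retained(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of deuterium left after first-order back-exchange."""
    return 0.5 ** (elapsed_min / half_life_min)


# Half-lives quoted for most amide linkages under quench conditions (min).
HALF_LIVES_MIN = (20.0, 300.0)
# Illustrative elapsed times (min); assumptions, not measured values.
ELAPSED_MIN = (1.0, 5.0, 10.0)

for t in ELAPSED_MIN:
    fast, slow = (fraction_retained(t, h) for h in HALF_LIVES_MIN)
    print(f"{t:4.1f} min: {fast:.2f} to {slow:.2f} of the label retained")
```

Under this simple model the fastest-exchanging amides dominate the loss, which is consistent with the emphasis on keeping every step before the ESI probe as short and as cold as practical.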
Following a short desalting time (3–5 min) at 30% acetonitrile, the protein was eluted by a 30–80% gradient in 5 min. Approximately 5–7 min passed between the time the samples were thawed and their arrival at the ESI probe. Although mass spectra presented here were acquired using a Micromass Qtof electrospray ionization mass spectrometer, several other types of ESI instruments have been used effectively. When applying H/D MS to protein folding, one must control many parameters, including pH, temperature, digestion time, and separation conditions. It is a good practice to perform some preliminary experiments when initiating studies of a new protein. Additional peaks may appear in the mass spectra due to interactions with the denaturants. For example, urea

slowly decomposes to form isocyanate ions that can carbamylate Lys residues (43-Da mass increase). As a result, urea solutions should be prepared daily using only very high-purity urea. In addition, some proteins tend to form noncovalent adducts with some denaturants. Many of these problems can be detected by performing the experiment in the absence of deuterium.

To study the structures within specific regions of folding intermediates, the labeled protein may be fragmented with an acid protease under quench conditions.26–28 Pepsin has been used most often for H/D MS measurements because it is highly active under H/D quench conditions (i.e., pH 2–3, 0 °C). Although the low specificity of pepsin prohibits accurate prediction of its cleavage sites, the reproducibility of cleavage is high. These peptides may be readily identified by exact molecular mass, CID MS/MS, or C-terminal sequencing using various carboxypeptidases.29 Most proteins are readily digested using a solution digestion approach, which requires a large amount of pepsin (enzyme/substrate mass ratio 1:1) and a relatively long digestion time (5 min). Although many successful applications of this approach have been reported, high concentrations of pepsin can create problems. Plugging of the HPLC columns, either by pepsin or undigested substrate, is particularly common when using high-performance columns packed with small-pore materials.

To avoid such problems, an on-line approach was used in this study to rapidly digest labeled proteins and to concentrate the peptic fragments prior to their separation and analysis by HPLC ESIMS.30 Labeled protein was rapidly loaded into the sample loop of the injection valve. Solvent (H2O, 0.5% formic acid) from an HPLC pump carried the sample through the immobilized pepsin column to the peptide trap column (1 × 8 mm, Michrom BioResources, Inc.) located on the switch valve. Preparation of the immobilized pepsin column is described elsewhere.30 Following a 3-min period for loading, digesting, and desalting, the switch valve was set so that solvents (H2O, 0.5% formic acid and 80% acetonitrile, 0.5% formic acid) eluted peptides from the peptide trap, enabling their separation and analysis by HPLC ESIMS. The separation column was a peptide mapping C18 column (0.8 × 50 mm, LC Packings). The flow rate was 50 µl/min and the gradient was 2–38% acetonitrile in 12 min. The digestion time, which was only 20 s, was determined by the volume of the pepsin column (170 µl) and the flow rate of the HPLC pump (500 µl/min). Because the time required for

26 J. J. Rosa and F. M. Richards, J. Mol. Biol. 133, 399 (1979).
27 J. J. Englander, J. R. Rogero, and S. W. Englander, Anal. Biochem. 147, 234 (1985).
28 Z. Zhang and D. L. Smith, Protein Sci. 2, 522 (1993).
29 J. A. McCloskey, Methods Enzymol. 193 (1990).
30 L. Wang, H. Pan, and D. L. Smith, J. Mol. Cell. Proteomics 1, 132 (2003).

digestion was substantially less than the time required for digestion with mobile pepsin, back-exchange during digestion was substantially reduced. The efficiency of immobilized pepsin is much higher than that of soluble pepsin for hydrogen exchange studies where the substrate concentration is low. Immobilization of pepsin leads to effective digestion, even when the substrate concentration is in the submicromolar range. Compared with solution digestion, the intensities of individual peptides may be substantially different for immobilized pepsin. However, the cleavage sites were rather similar. Optimization of digestion conditions to give peptides derived from the entire length of a protein is an important step in most H/D MS studies. To achieve high coverage of the backbone, several techniques were employed, including varied digestion time, different HPLC gradients, different ion-pairing agents, and different types of ESI sources.

Data Analysis

The levels and isotope patterns of deuterium found in labeled proteins and peptides are important sources of structural information. Detailed interpretation of these results requires similar information for reference samples. Two samples, designated folded and unfolded references, are used to determine whether the protein is folded or unfolded. The folded reference is prepared by pulse labeling the native protein in the absence of any denaturants. Deuterium levels found in this sample indicate the extent of H/D exchange with pulse labeling of the folded protein. The unfolded reference can be prepared by equilibrating the protein in high concentrations of denaturant, followed by pulse labeling. In the present study, aldolase was incubated in 4 M urea for 24 h and labeled with D2O/urea for 5 s. When compared to results obtained for a sample completely exchanged in D2O, the deuterium levels found in the unfolded reference show whether the pulse labeling conditions lead to complete H/D exchange.
Mass spectra of proteins and peptides may exhibit multiple envelopes under various denaturant concentrations.9,31 To relate these envelopes to different structural forms of the protein, a spreadsheet containing m/z and relative intensity was generated using commercial software, such as Masslynx (Micromass) or Xcalibur (Thermo Finnigan). Data in these spreadsheets were imported into the program PEAKFIT (Jandel Scientific Software, version 4.0), which generated Gaussian peaks fitted to the envelopes of isotope peaks. The areas and centroids of these Gaussian peaks were used to determine the populations and deuterium levels of different structural forms. Although the most intense charge state is normally used,

31 Y. Deng and D. L. Smith, Anal. Biochem. 276, 150 (1999).

all charge states should give similar results because the proteins were unfolded in the acid/acetonitrile solution used for HPLC.

The procedure described above gives the deuterium level present at the time the protein or peptides reached the mass spectrometer. The deuterium levels present at the end of pulse labeling can be determined by adjusting the measured value for losses that occurred during digestion and HPLC. In this study, deuterium levels found in the intact protein were multiplied by a factor of 285/234, where 285 is the number of peptide amide hydrogens in the protein and 234 is the deuterium level found in the unfolded reference. Because the same labeling solution was used for all samples and references, this adjustment also accounts for the fact that the labeling solution was only 90% D2O. Deuterium levels of the peptic fragments were adjusted similarly. However, the adjusting factor was based on the theoretical and found deuterium levels for each peptide fragment in the unfolded reference. For optimized experimental conditions, one may expect deuterium recoveries of 80–95% for analysis of intact proteins and 65–95% for analysis of peptic fragments.32,33

Results
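The adjustment for deuterium loss described under Data Analysis can be sketched as follows. The numbers 285, 234, and 90% D2O are taken from the text; the function and variable names are illustrative assumptions. The last quantity, the recovery relative to the maximum attainable in 90% D2O, is not reported by the authors and is included only as a plausibility check against the 80–95% recovery range quoted for intact proteins.

```python
THEORETICAL_AMIDES = 285      # peptide amide hydrogens in aldolase
UNFOLDED_REF_MEASURED = 234   # deuteriums measured in the unfolded reference
D2O_FRACTION = 0.90           # 100 µl sample mixed with 900 µl labeling buffer


def adjust_deuterium(measured: float,
                     theoretical: float = THEORETICAL_AMIDES,
                     reference: float = UNFOLDED_REF_MEASURED) -> float:
    """Scale a measured deuterium level to 100% D2O labeling with no loss."""
    return measured * theoretical / reference


# Fraction of the attainable label (90% of 285) that survived analysis.
recovery_after_labeling = UNFOLDED_REF_MEASURED / (D2O_FRACTION * THEORETICAL_AMIDES)

print(f"adjustment factor: {THEORETICAL_AMIDES / UNFOLDED_REF_MEASURED:.3f}")
print(f"unfolded reference, adjusted: {adjust_deuterium(UNFOLDED_REF_MEASURED):.1f}")
print(f"recovery relative to 90% D2O labeling: {recovery_after_labeling:.1%}")
```

For peptic fragments the same function would be called with the theoretical and measured levels of each peptide in the unfolded reference instead of the whole-protein defaults.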

Analysis of Intact Protein

The equilibrium unfolding of aldolase in urea was investigated using pulse-labeling hydrogen exchange techniques reported previously and illustrated in Fig. 1. Following 24 h of destabilization in urea, the protein was exposed to D2O for 5 s to complete H/D exchange at amide linkages in unfolded regions of the peptide backbone. This destabilization time was used because no significant changes were found in the mass spectra for longer incubation times. That is, equilibrium populations of all structural forms had been established after 24 h. For some urea concentrations, mass spectra of proteins exhibiting two-state behavior will have two peaks corresponding to populations of folded and unfolded structural forms of the protein. More peaks may appear in the mass spectra of proteins exhibiting multistate behavior.

Electrospray ionization mass spectra are presented in Fig. 2 for analysis of intact protein equilibrated in 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0 M urea for 24 h. Spectra of folded and unfolded references are presented in the top and the bottom panels, respectively. Although results for all charge states were recorded, only results for the (M + 36H)36+ charge state are presented

32 L. Wang and D. L. Smith, Current Protocols Protein Sci. 2, 17.6.1 (2002).
33 L. Wang and D. L. Smith, Anal. Biochem. 314, 46 (2003).

Fig. 2. Electrospray ionization mass spectra of intact aldolase (+36 charge state) equilibrated in various concentrations of urea. Dashed lines indicate envelopes of isotope peaks representing specific structural forms of the protein. Each sample was labeled for 5 s in urea/D2O prior to quenching H/D exchange. Spectra of folded and unfolded references (top and bottom, respectively) indicate the deuterium levels expected for folded and unfolded aldolase.

here. The m/z of the unfolded reference peak was 921.7, indicating that approximately 234 amide hydrogens had been replaced with deuterium. This protein has 296 residues, 10 of which are proline. It follows that this protein has a total of 285 peptide amide hydrogens. Finding fewer than

the theoretical maximum number of deuteriums in the unfolded protein is due to labeling in 90% D2O and back-exchange during the HPLC process. The m/z of the folded reference was 916.6, indicating that hydrogens at approximately 60 amide linkages were replaced with deuterium prior to quenching. The m/z of the intact protein following equilibration in 0.5 M urea was 916.5, which is very close to the m/z of the peak from the folded reference, indicating that nearly all aldolase molecules were still in the folded form. However, equilibration of the protein in 1.0 M urea gave two major peaks (envelopes of isotope peaks) with average m/z of 916.6 and 918.4. The incomplete separation of the peaks is due primarily to the broad intermolecular distribution of deuterium, not to the resolution of the mass spectrometer. Because the m/z of the high-mass peak was much less than that of the unfolded reference, this peak must represent an intermediate. In 1.5 M urea, the population of folded molecules was barely detectable, whereas an intermediate form with m/z 918.3 dominated. Three peaks were found in the mass spectrum of aldolase equilibrated in 2.0 M urea. The first peak, m/z 918.5, suggests population of the same intermediate found following equilibration in 1.0 and 1.5 M urea. The second peak with m/z 920.0 indicates population of another intermediate with more residues unfolded. The m/z of the last peak in the spectrum (921.6) is very similar to the m/z of the unfolded reference (921.7), indicating that this population of protein molecules was totally unfolded. All molecules were unfolded in 3.0 M urea. These four different structural forms have been designated as F (native), I1 and I2 (intermediates), and U (unfolded) in Eq. (1).

\[
\mathrm{F} \rightleftharpoons \mathrm{I_1} \rightleftharpoons \mathrm{I_2} \rightleftharpoons \mathrm{U} \tag{1}
\]
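The way overlapping isotope envelopes can be resolved into populations may be illustrated with a deliberately simplified sketch. PEAKFIT performs a nonlinear fit of Gaussian centroids, widths, and areas; the code below instead fixes the two centroids (916.6 and 918.4, the values reported for 1.0 M urea) and a common width, and solves only for the amplitudes by linear least squares. The width, the 60:40 mixture, and all function names are invented for illustration and are not taken from the study.

```python
import math


def gaussian(x, center, sigma):
    """Unit-height Gaussian profile."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def fit_two_amplitudes(xs, ys, centers, sigma):
    """Least-squares amplitudes of two Gaussians with fixed centers and width."""
    g1 = [gaussian(x, centers[0], sigma) for x in xs]
    g2 = [gaussian(x, centers[1], sigma) for x in xs]
    s11 = sum(a * a for a in g1)
    s22 = sum(b * b for b in g2)
    s12 = sum(a * b for a, b in zip(g1, g2))
    b1 = sum(a * y for a, y in zip(g1, ys))
    b2 = sum(b * y for b, y in zip(g2, ys))
    det = s11 * s22 - s12 * s12          # 2 x 2 normal equations, Cramer's rule
    return (b1 * s22 - b2 * s12) / det, (b2 * s11 - b1 * s12) / det


# Synthetic spectrum: centroids from the text, width and mixture assumed.
CENTERS = (916.6, 918.4)
SIGMA = 0.6
xs = [915.0 + 0.05 * i for i in range(101)]          # m/z 915.0 to 920.0
ys = [0.6 * gaussian(x, CENTERS[0], SIGMA) + 0.4 * gaussian(x, CENTERS[1], SIGMA)
      for x in xs]

a1, a2 = fit_two_amplitudes(xs, ys, CENTERS, SIGMA)
# With equal widths, peak area is proportional to amplitude.
pop_low, pop_high = a1 / (a1 + a2), a2 / (a1 + a2)
print(f"low-mass form: {pop_low:.2f}, high-mass form: {pop_high:.2f}")
```

Because the synthetic data are noise-free, the fit recovers the mixture it was built from; with real spectra the centroids and widths would normally be fitted as well, and the widths of different structural forms need not be equal.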

Deuterium levels for the four different structural forms F, I1, I2, and U in different concentrations of urea are presented in Table I. These results reflect the deuterium levels at the end of the labeling period if labeling were performed in 100% D2O. The deuterium level of each structural form was almost identical at every urea concentration in which that form was observed, suggesting that the structures were not substantially affected by the urea. Changing the urea concentration changed only the populations of these four structural forms. The increase in deuterium level, ΔD, is given in parentheses for each unfolding step. As will be discussed below, the increase in deuterium level with each unfolding step indicates the increase in the number of residues exhibiting no protection to H/D exchange. Because folded aldolase, F, had an average of 58.7 deuteriums, the increased deuterium level measured for each unfolding step is only a lower limit to the number of residues unfolding in that step. Results presented in Table I show that at least 81.8 residues unfolded in the first step, 70.8 unfolded in the second step, and 73.0 unfolded in the last step. The distribution of the 58.7 residues in F showing no protection to H/D exchange cannot be determined from these data.

TABLE I
Deuterium Levels(a) Found in the Native State (F), Two Equilibrium Unfolding Intermediates (I1, I2), and the Unfolded State (U) of S. aureus Aldolase Equilibrated in Different Concentrations of Urea

[Urea] (M)             F        I1               I2               U
0 (folded ref.)        60.9     -                -                -
0.5                    56.5     -                -                -
1.0                    60.9     140.5            -                -
1.5                    56.5     136.1            -                -
2.0                    -        144.9            211.3            282.1
2.5                    -        -                211.3            285.0
3.0                    -        -                -                285.0
4.0 (unfolded ref.)    -        -                -                285.0
Average                58.7     140.5 (81.8)(b)  211.3 (70.8)(b)  284.3 (73.0)(b)

(a) Deuterium levels, determined from centroids of envelopes of isotope peaks (Fig. 2), were adjusted for 10% H2O in the labeling solution and for deuterium loss during HPLC. Measured deuterium levels were multiplied by a factor 285/234, where 285 is the theoretical deuterium level and 234 is the deuterium level measured in unfolded, pulse-labeled aldolase.
(b) Numbers in parentheses indicate the change in deuterium level (i.e., the increase in the number of residues with no protection against H/D exchange) involved in each unfolding step.

Mass spectra presented in Fig. 2 show that there are two unfolding intermediates and the number of exchangeable amide hydrogens in each intermediate. In addition, the relative intensities of the peaks are direct measures of the populations of these intermediates. The populations of F, I1, I2, and U determined at different concentrations of urea are presented in Fig. 3. These results show that the population of native aldolase (F) decreased from 1.0 to 0.2 as the urea concentration increased from 0 to 1.5 M, and that the population of I1 increased accordingly over the same range. These results also show that I1, I2, and U were equally populated when the urea concentration was 2.0 M, and that all of the protein was unfolded in 3.0 M urea.

Equilibrium constants for each of the three unfolding/refolding steps depicted in Eq. (1) were determined from the populations of F, I1, I2, and U given in Fig. 3. The change in free energy for each of these steps was determined from these equilibrium constants. The free energy change for

Fig. 3. Populations of folded (F), two intermediates (I1 and I2), and unfolded (U) S. aureus aldolase equilibrated in different concentrations of urea.

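each unfolding step under native conditions was estimated by the linear extrapolation method (LEM) using the following equations:

$$
\begin{aligned}
\Delta G(\mathrm{F/I_1}) &= \Delta G(\mathrm{F/I_1})(\mathrm{H_2O}) - m_{\mathrm{F/I_1}}[\mathrm{urea}] \\
\Delta G(\mathrm{I_1/I_2}) &= \Delta G(\mathrm{I_1/I_2})(\mathrm{H_2O}) - m_{\mathrm{I_1/I_2}}[\mathrm{urea}] \\
\Delta G(\mathrm{I_2/U}) &= \Delta G(\mathrm{I_2/U})(\mathrm{H_2O}) - m_{\mathrm{I_2/U}}[\mathrm{urea}]
\end{aligned}
\tag{2}
$$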

where ΔG(F/I1) is the change in the Gibbs free energy for unfolding of F to give the intermediate I1 in denaturant and ΔG(F/I1)(H2O) is the Gibbs free energy change for the same reaction in water. The parameter m has been attributed to the change in exposed surface area. The dependence of the free energy change for each unfolding step on urea concentration is presented in Fig. 4. The intercepts of these three lines represent ΔG(F/I1)(H2O), ΔG(I1/I2)(H2O), and ΔG(I2/U)(H2O), respectively, which are the free energy changes for each step under native conditions. Formation of the first intermediate, I1, required 2.2 kcal/mol of free energy, formation of the second intermediate, I2, required 3.3 kcal/mol, and unfolding of the most stable region required 4.2 kcal/mol. The slopes of these lines show that mF/I1, mI1/I2, and mI2/U are equal to 2.1, 1.7, and 2.1 kcal mol−1 M−1, respectively. The similarity of these three values for m suggests that similar amounts of new surface area were exposed in each unfolding step.
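The linear extrapolation described above can be sketched numerically. The following Python sketch is only an illustration and not the analysis code used in this study: the populations are hypothetical placeholders (they are not read from Fig. 3), a temperature of 298.15 K is assumed, and the function names delta_g and linear_extrapolation are invented for this example. Each equilibrium constant is taken as the ratio of the populations of two adjacent structural forms, converted to a free energy change, and fitted to a straight line in urea concentration.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g(pop_reactant, pop_product, temperature=298.15):
    """Free energy change (kcal/mol) for reactant -> product from populations."""
    k_eq = pop_product / pop_reactant
    return -R * temperature * math.log(k_eq)


def linear_extrapolation(urea, dg):
    """Least-squares fit of dG = dG(H2O) - m*[urea]; returns (dG(H2O), m)."""
    n = len(urea)
    mean_x = sum(urea) / n
    mean_y = sum(dg) / n
    sxx = sum((x - mean_x) ** 2 for x in urea)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(urea, dg))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope  # m is defined as the negative of the slope


# Hypothetical populations of F and I1 (illustration only, not data from Fig. 3)
urea_molar = [0.75, 1.0, 1.25, 1.5]
pop_f = [0.74, 0.52, 0.30, 0.14]
pop_i1 = [0.26, 0.48, 0.70, 0.86]

dg_values = [delta_g(f, i1) for f, i1 in zip(pop_f, pop_i1)]
dg_water, m_value = linear_extrapolation(urea_molar, dg_values)
print(f"dG(H2O) = {dg_water:.2f} kcal/mol, m = {m_value:.2f} kcal/mol/M")
```

Because the populations can be measured reliably only in the transition region, where both forms are appreciably populated, the fit would normally be restricted to those urea concentrations.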

Fig. 4. Free energy change for each unfolding step of aldolase versus urea concentration. Linear extrapolation to zero molar urea gives the free energy changes in H2O.

Analysis of Peptic Fragments Derived from Labeled Protein

Analysis of the intact protein shows that S. aureus aldolase has three unfolding domains. In addition, these measurements provide estimates of changes in the number of unfolded residues, the free energy, and the exposed surface area for each unfolding/refolding step. To identify specific residues comprising each unfolding domain, labeled intact aldolase was digested with pepsin. The deuterium levels and intermolecular distributions of the peptic fragments were determined from their mass spectra. For the digestion conditions used in this study, over 60 peptides were identified. To simplify analyses, only 28 peptides covering 94% of the backbone were chosen for analysis. Presented in Fig. 5 are electrospray ionization mass spectra of three peptic fragments including residues 43–55, 101–114, and 160–175 taken from intact aldolase equilibrated in various concentrations of urea. Mass spectra for the folded and unfolded reference samples are in the top and the bottom panels. The molecular masses of these three peptic fragments derived from the unfolded reference sample indicate that they had an average of 8.4, 9.4, and 10.0 excess deuteriums. The accuracy of the mass measurement depends on the signal-to-noise ratio and the particular type of mass spectrometer. One may expect the uncertainty in mass measurement to be 10–500 ppm. Comparison of these values with deuterium levels

Fig. 5. Electrospray ionization mass spectra of three peptic fragments derived from intact aldolase equilibrated in various concentrations of urea. Spectra of folded and unfolded references (top and bottom, respectively) indicate the deuterium levels expected for folded and unfolded aldolase. Each sample was labeled for 5 s in urea/D2O prior to quenching and digestion by pepsin.

expected for complete exchange at each peptide amide linkage shows that the deuterium recoveries for these peptic fragments were 70, 73, and 71%, respectively. The mass spectra for peptides from the reference samples, as well as for peptides from aldolase equilibrated in some concentrations of urea, have only one envelope of isotope peaks. However, mass spectra of most peptides from aldolase incubated in other concentrations of urea have bimodal isotope patterns. These bimodal isotope patterns, which are due to bimodal intermolecular distributions of deuterium, show that the regions

represented by the peptide were folded in some molecules and unfolded in others. The average deuterium levels of the low-mass envelopes of these three fragments are 1.8, 3.8, and 1.4, respectively, which are equal, within experimental error, to the deuterium levels measured for the folded reference. Similarly, the average deuterium levels of the high-mass envelopes of these three fragments are equal to deuterium levels measured for the unfolded reference. These results indicated that these regions of the aldolase molecules were either completely folded or completely unfolded. The populations of aldolase folded and unfolded in a particular region, as indicated by the area of the low- and high-mass envelopes, were estimated by fitting the reference spectra to the bimodal spectra to achieve the best fit. Twenty-two peptic fragments that clearly show bimodal isotopic patterns were processed by this approach. This analysis of bimodal isotope patterns of peptides is the most direct way to quantify the fraction of aldolase that was unfolded in specific regions of the protein. However, this approach is most useful when the isotope envelopes can be clearly resolved. In the three small fragments including residues 37–43, 47–53, and 115–122, the low- and high-mass envelopes were not resolved. An alternative approach based on the deuterium levels found in these fragments was used to determine the populations of unfolded molecules.34 The average molecular mass of a peptic fragment, determined from the centroid of all of its isotope peaks, gives the deuterium level in this fragment. Finding a deuterium level equal to that found in the same fragment when derived from the folded reference indicates that this population was folded. Likewise, finding a deuterium level equal to that found in the same fragment when derived from unfolded reference indicates that this population was unfolded. Deuterium levels between these two limits indicate the fraction of molecules that was unfolded. 
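The centroid-based alternative described above amounts to a linear interpolation between the folded and unfolded limits. The sketch below is a minimal illustration under stated assumptions, not the procedure of ref. 34: it assumes deconvoluted (neutral) masses rather than m/z values, ignores the adjustments for back-exchange, and uses invented function names. The folded and unfolded limits (1.8 and 8.4 D) are borrowed from the values quoted in the text for fragment 43–55 purely for illustration, and the sample value of 5.1 D is hypothetical.

```python
MASS_SHIFT_PER_D = 1.00628  # mass difference between 2H and 1H, in Da


def centroid(masses, intensities):
    """Intensity-weighted average mass of all isotope peaks of a fragment."""
    total = sum(intensities)
    return sum(m * i for m, i in zip(masses, intensities)) / total


def deuterium_level(centroid_mass, undeuterated_mass):
    """Number of deuteriums from the shift in average molecular mass."""
    return (centroid_mass - undeuterated_mass) / MASS_SHIFT_PER_D


def fraction_unfolded(d_sample, d_folded, d_unfolded):
    """Fraction of molecules unfolded in the region covered by a fragment.

    Assumes every molecule is either fully folded or fully unfolded in this
    region, so the deuterium level is a population-weighted average of the
    two reference levels. The result is clipped to the range 0..1.
    """
    fraction = (d_sample - d_folded) / (d_unfolded - d_folded)
    return min(1.0, max(0.0, fraction))


# Limits quoted for fragment 43-55; the sample level (5.1 D) is hypothetical.
print(fraction_unfolded(5.1, d_folded=1.8, d_unfolded=8.4))
```

This interpolation is valid only if the region behaves as a two-state unit, which is what the bimodal patterns of the larger fragments suggest.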
Regions covered by 25 peptides (76% of the backbone) were processed by these two approaches to determine the unfolded populations in different urea concentrations. The urea concentration, Cm, required to unfold 50% of the molecules in the regions represented by the peptic fragment was determined for each peptide. The Cm, or local unfolding midpoint, indicates the stability of a particular region. Close examination of the local unfolding midpoints for the 25 peptic fragments indicates that their stabilities fall into three groups (see Table II). For example, the Cm values for the regions including residues 43–55, 101–114, and 160–175 are 1.1, 1.9, and 2.3 M, respectively. The fraction of molecules that was unfolded in specific regions following equilibration of the intact protein in 0–4 M urea is presented in Fig. 6. Data points and error bars are the means and standard deviations

34 H. Yang and D. L. Smith, Biochemistry 36, 14992 (1997).

TABLE II
Cm Values for Peptic Fragments of S. aureus Aldolase Equilibrated in Urea

Least stable domain      Medium stable domain     Most stable domain
Segment     Cm (M)       Segment     Cm (M)       Segment     Cm (M)
20–32       1.0          1–19        1.9          147–158     2.3
20–36       1.1          66–73a      1.8          159–175     2.4
37–43       1.1          74–89       1.7          160–175     2.4
43–55       1.0          90–100      1.9          176–196     2.4
47–52       1.0          101–114     1.9          176–203     2.3
56–66a      1.1          115–122     1.9          204–221     2.4
241–249b    1.0          126–145     1.8          222–231     2.3
250–257     1.1          126–146     1.9          222–234     2.3
261–268     1.0                                   222–237     2.4
278–286     1.1                                   235–241b    2.3
287–296     1.1

a Results for residues 56–66 and 66–73 were derived from fragments 56–73 and 53–70.
b Results for residues 235–241 and 241–249 were derived from fragment 235–249.

for all of the peptic fragments comprising each unfolding group. These three groups are designated as least stable, medium stable, and most stable domains. In addition to the 25 fragments discussed above, three fragments displayed more complex isotope patterns. The ESI mass spectrum of the peptide 56–73 is shown in Fig. 7. The masses of the folded reference (data not shown) and the 0.5 M sample show that 2.0 D exchanged into this segment of native aldolase under the present labeling conditions. The mass spectrum of the same peptic fragment following equilibration of the intact protein in 1.0 M urea has two envelopes of isotope peaks, indicating exchange-in of 1.8 or 8.8 D. Mass spectra of this peptide following equilibration of the intact protein in 1.5 M urea have the same two envelopes of isotopic peaks, but with different relative intensities. Equilibration of the protein in 2.0 M urea also gave a bimodal isotope pattern. However, the two envelopes corresponded to exchange-in of 8.8 or 14.8 D. Equilibration of the protein in 2.5 or 3.0 M urea gave one envelope of isotope peaks with 14.8 D. All deuterium levels presented here were adjusted to indicate the deuterium level at the end of the labeling period, as if the labeling had been performed in 100% D2O. These results indicate that there are two transitions in the unfolding of the segment including residues 56–73. The first transition occurred at 1.0 M urea, which corresponds to unfolding of the least stable domain. The second transition occurred near 2.0 M urea, which corresponds to unfolding of the medium

Fig. 6. Populations of aldolase molecules unfolded in three unfolding domains as a function of the urea concentration. Data points and error bars are the means and standard deviations for all of the peptic fragments comprising each unfolding domain (see Table II).

stable domain. These results suggest that the transition between the least and medium stable domains occurs within the peptic fragment including residues 56–73. This interpretation is supported by two other observations. Fragments located just before this fragment belong to the least stable region, whereas fragments located just after this fragment belong to the medium stable region, suggesting that this fragment covers a transition region between these domains. Additional evidence may be found in mass spectra of an overlapping peptide including residues 53–70, which showed similar behavior. Deuterium levels indicated by the masses of the envelopes of isotope peaks may be used with the sequence for this peptic fragment (see Fig. 7) to locate the transition point more accurately. This peptide has 15 amide hydrogens. The difference between the deuterium levels of the high- and low-mass envelopes in the 1.0 M urea mass spectrum indicates that at least seven residues in this segment unfold with the least stable domain. Likewise, the difference between the deuterium levels of the high- and low-mass envelopes in the mass spectrum of this peptide from aldolase equilibrated in 2.0 M urea shows that at least six residues unfolded with unfolding of the medium stable domain. These results show that the

Fig. 7. Mass spectra of the peptic fragment 56–73 derived from intact aldolase equilibrated in various concentrations of urea. Each sample was labeled for 5 s in urea/D2O prior to quenching and digestion by pepsin. The dashed lines represent each envelope of isotope peaks.

boundary between the least and medium stable domains is very near Pro65 in S. aureus aldolase. Results for the overlapping peptic fragment including residues 53–70 support this conclusion. Similar analysis of another fragment including residues 235–249 suggests that this fragment includes a transition between the most and least stable domains, and that the transition between these two domains is very near residue 241 of S. aureus aldolase.

Discussion

Although many different experimental approaches have been used to study the kinetics and thermodynamics of protein folding, all involve determining populations of different structural forms (i.e., folded, partially folded intermediates, and unfolded). Signals from most methods, including NMR, CD, and fluorescence, are a sum of contributions from all structural forms. Transformation of these signals into populations requires assuming a folding model (two-, three-, or four-state). Amide hydrogen exchange/mass spectrometry differs from traditional methods in that a signal specific for each structural form (i.e., an envelope of isotope peaks) is measured.9,31 These structure-specific signals are easily transformed into populations. In addition, these signals may point to a wide range of rather detailed information about the folding process. Analysis of intact protein, pulse-labeled under either equilibrium or nonequilibrium conditions, is relatively easy and gives a good overview of the folding process. The four envelopes of isotope peaks found in the mass spectra of S. aureus aldolase (Fig. 2) clearly show unfolding in three steps through four structural forms. Analysis of peptic fragments of the labeled protein is technically more demanding but gives additional information. The stability of specific regions of the aldolase backbone was determined from the mass spectra of peptic fragments taken from the labeled protein following incubation in various urea concentrations. Sorting 28 fragments by their midpoints for unfolding, Cm, led to three groups of fragments with average Cm of 1.1, 1.9, and 2.3 M (Table II). Regions represented by these three groups of peptides were designated as least stable, medium stable, and most stable domains. The folding status of these domains in different concentrations of urea was used to identify regions in the four structural forms (F, I1, I2, and U) that were unfolded in the same concentrations of urea.
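The sorting of fragments into three stability groups can be illustrated with the Cm values listed in Table II. The sketch below is an assumption about how such a grouping might be automated, not the procedure used in this study: it orders the Cm values and cuts the ordered list at its two largest gaps. The function name split_by_largest_gaps is invented for this example, and the values are those of Table II as printed.

```python
from statistics import mean


def split_by_largest_gaps(values, n_groups=3):
    """Sort the values and cut the sorted list at the (n_groups - 1) largest gaps."""
    ordered = sorted(values)
    # Indices i at which a cut between ordered[i - 1] and ordered[i] is considered
    candidates = sorted(
        range(1, len(ordered)),
        key=lambda i: ordered[i] - ordered[i - 1],
        reverse=True,
    )
    cuts = sorted(candidates[: n_groups - 1])
    groups, start = [], 0
    for cut in cuts:
        groups.append(ordered[start:cut])
        start = cut
    groups.append(ordered[start:])
    return groups


# Cm values (M) for the segments listed in Table II
cm_table_ii = [
    1.0, 1.1, 1.1, 1.0, 1.0, 1.1, 1.0, 1.1, 1.0, 1.1, 1.1,  # least stable
    1.9, 1.8, 1.7, 1.9, 1.9, 1.9, 1.8, 1.9,                 # medium stable
    2.3, 2.4, 2.4, 2.4, 2.3, 2.4, 2.3, 2.3, 2.4, 2.3,       # most stable
]

for group in split_by_largest_gaps(cm_table_ii):
    print(f"{len(group)} segments, mean Cm = {mean(group):.2f} M")
```

For these data the two largest gaps lie between 1.1 and 1.7 M and between 1.9 and 2.3 M, so the gap-based split reproduces the three columns of Table II.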
The unfolded regions in I1, I2, and U were identified by comparing their populations (Fig. 3) with the unfolded populations of the three domains (Fig. 6). For example, in 1.5 M urea both I1 and molecules with the least stable domain unfolded were dominant. In 2.0 M urea, the populations of I1, I2, and U

were equal and the population of F was below the limit of detection (Fig. 2). If the molecules comprising I1 had only the least stable domain unfolded, and the molecules comprising I2 had both the least and medium stable domains unfolded, the unfolded populations of the least, medium, and most stable domains would be 100, 67, and 33%. Analysis of the peptic fragments of aldolase equilibrated in 2.0 M urea showed that the unfolded populations of the least, medium, and most stable domains were 97, 55, and 22% (Fig. 6). These correlations show which parts of I1 and I2 were unfolded. In addition, they show that the unfolding of aldolase is not due to general loosening of the entire structure, but to step-by-step unfolding of three domains. The number of residues in each domain can be estimated from the deuterium levels found in F, I1, I2, and U (Table I). Because pulse labeling of folded aldolase led to exchange of approximately 59 amide hydrogens, the differences in the deuterium levels found for the three envelopes of isotope peaks represent only the lower limits to the number of residues in each domain. Results from Table I show that the deuterium levels in F and I1 differ by 82, indicating that at least 82 residues unfolded in the first transition. Likewise, at least 71 residues became unfolded in the transition from I1 to I2, and at least 73 residues unfolded in the transition from I2 to U. If the 59 residues with rapidly exchanging amide hydrogens and the 10 proline residues were distributed evenly among the three unfolding domains, these results would suggest that the most, medium, and least stable domains contain 96, 94, and 105 residues, respectively. Mass spectra of peptic fragments of the labeled protein were used to determine the folding status of approximately 94% of the aldolase backbone. Thus, the number of residues in each domain can also be determined from the peptic fragments comprising the three unfolding domains. 
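The two pieces of arithmetic in this argument can be written out explicitly. The sketch below is illustrative only, with invented function names. The first function encodes the sequential model assumed in the text (I1 has only the least stable domain unfolded, I2 has the least and medium stable domains unfolded, and U has all three unfolded); the second distributes the 59 unprotected residues and the 10 prolines evenly among the three domains.

```python
def domain_unfolded_fractions(pop_f, pop_i1, pop_i2, pop_u):
    """Unfolded fraction of the (least, medium, most) stable domains.

    Sequential model: F -> I1 unfolds the least stable domain, I1 -> I2 the
    medium stable domain, and I2 -> U the most stable domain.
    """
    total = pop_f + pop_i1 + pop_i2 + pop_u
    least = (pop_i1 + pop_i2 + pop_u) / total
    medium = (pop_i2 + pop_u) / total
    most = pop_u / total
    return least, medium, most


def estimate_domain_sizes(step_changes, unprotected=59, prolines=10):
    """Add an equal share of unprotected and proline residues to each step."""
    share = (unprotected + prolines) / len(step_changes)
    return [round(change + share) for change in step_changes]


# 2.0 M urea: I1, I2, and U equally populated, F below the limit of detection
fractions = domain_unfolded_fractions(0.0, 1.0, 1.0, 1.0)
print([f"{100 * x:.0f}%" for x in fractions])

# Lower limits from Table I for the F->I1, I1->I2, and I2->U steps
print(estimate_domain_sizes([82, 71, 73]))
```

The predicted 100, 67, and 33% are the values compared in the text with the 97, 55, and 22% measured from the peptic fragments, and the size estimate returns 105, 94, and 96 residues for the least, medium, and most stable domains.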
Results presented in Table II show that the least stable domain contained 89 residues. In addition, this domain included parts of two other peptic fragments, residues 258–260 and 269–277, leading to a total of 101 residues. Similar analysis of the other peptic fragments shows that the medium and most stable unfolding domains contain 99 and 94 residues, respectively. These results show that three cooperative unfolding processes occur in the unfolding of S. aureus aldolase and that each process involves approximately one-third of the backbone. These results also demonstrate the inherent consistency of intact protein analysis and peptic fragment analysis. The three-dimensional (3D) structure of S. aureus aldolase has not been reported. However, sequence alignment with rabbit muscle aldolase (Fig. 8) indicates very high homology. More than 30% of the residues are identical and 42% are chemically similar. The relationship between the primary structure of aldolase and its three folding domains is presented

Fig. 8. Primary structures of S. aureus and rabbit muscle aldolase illustrating their homology, secondary structure, and unfolding domains. The residues in boxes with solid lines are identical and the residues in the boxes with dashed lines are homologous. Peptic fragments illustrating the least stable, medium stable, and most stable domains are indicated by white, gray, and black, respectively.

in Fig. 8. These results show that the least stable domain is made up of two different regions including residues 20–66 and 241–296. Note that the residue numbers given in Fig. 8 correspond to rabbit muscle aldolase. These results also show that the medium stable domain includes two different regions (residues 1–19 and 66–146), while the most stable domain appears to consist of only one region (residues 147–241). Thus, folding domains

Fig. 9. Three-dimensional view of aldolase illustrating the least stable, medium stable, and most stable domains in S. aureus aldolase indicated by red, green, and blue, respectively. The structure was modeled using the coordinates for rabbit muscle aldolase monomer (1ADO).

may include backbone regions that are widely separated in the primary structure. A 3D representation of S. aureus aldolase, based on the structure of rabbit muscle aldolase and illustrating these folding domains, is presented in Fig. 9. The least, medium, and most stable domains are indicated by blue, green, and red, respectively. This presentation shows that all of the residues comprising each of the domains are in close proximity to each other in folded aldolase, even though parts of the least and medium stability domains are widely separated in the primary structure (Fig. 8).

Summary

To determine the stabilities of folded or partially folded proteins, one often measures changes in stability due to temperature or denaturant and extrapolates these measurements to native conditions. These measurements have most often been made by either spectroscopic or calorimetric techniques. Hydrogen exchange/mass spectrometry is playing an increasingly important role in such measurements. This approach not only offers substantial advantages in sensitivity but also provides distinct signals for different structural forms. In addition to providing highly specific thermodynamic information, H/D MS analyses are an important source of

mechanistic and structural information. Furthermore, protein folding can be studied by H/D MS under a wide range of experimental conditions, including high concentrations of denaturant, salt, buffer, ligands, and other proteins.

Acknowledgments

This research was supported by a grant from the National Institutes of Health (GM RO1 40384) and the Nebraska Center for Mass Spectrometry.

[14] Kinetic and Spectroscopic Analysis of Early Events in Protein Folding

By David S. Kliger, Eefei Chen, and Robert A. Goldbeck

Introduction

It has long been known that the primary sequences of proteins determine their three-dimensional structures, which in turn enable them to carry out their specific functions. It is thus of fundamental importance to understand the rules that determine the structure of a folded protein given its sequence. Beyond the basic scientific interest in this problem, understanding how proteins achieve their native structures has taken on increased practical importance with the information now available about the human genome. Many new proteins are being identified and determining their structures and functions could lead to major advances in biotechnology and biomedical sciences. One approach to understanding the forces by which sequence determines structure is to study the kinetics of folding. To tease apart complex interactions and study them in simpler steps, one often looks for reaction intermediates. Equilibrium intermediates are generally difficult to observe in cooperative transformations such as protein folding, however, so experimentalists turn to kinetic methods. Until recently, kinetic studies of protein folding have centered on the late stages. The most common approach involved adding a denaturant to a protein solution to unfold the protein and then rapidly diluting the solution to reach a denaturant concentration that was low enough that the protein would fold. Various spectroscopic probes could then be used to follow the folding as a function of time. Because typical solution mixing times were on the order of milliseconds or longer, only slow steps in the folding process could be monitored. Often such studies would reveal that at the

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

earliest observable times some folding had already taken place. This is typically referred to as the “burst phase” of folding, which did not receive a lot of attention because it was experimentally inaccessible. However, advances in time-resolved spectroscopies as well as in rapid triggering methods for initiating folding have made it possible to extend the time range of observations of folding processes, allowing for studies of protein folding from the earliest times after folding is initiated. The ability to look at early events in protein folding is very important to fully understand the factors that control the structures of proteins. It is the events that occur early in the folding process, when the conformational heterogeneity that characterizes the protein folding problem is greatest, that determine how the forces driving folding overcome the entropic barrier presented by that heterogeneity to find the unique native structure. This heterogeneity was recently incorporated by theoretical studies into a new way of thinking about the nature of protein folding that is best evaluated through experiments studying early folding events. Before the introduction of the new “landscape model,”1,2 investigations into the mechanisms of protein folding used the language common in describing any other chemical reaction. One started from an initial reactant, the unfolded state, and proceeded through intermediate state(s) to a final product, the folded state. A relatively new language has been developed that acknowledges that in fact there may be many different unfolded states and these different unfolded states may take different pathways to reach the native folded state. An interesting question is whether this “landscape model” of protein folding represents an important distinction for understanding protein folding or merely represents a semantic difference.
No one would argue that there are not many possible unfolded states, but couldn’t one simply consider the collection of unfolded states together and think of an ensemble of unfolded states as equivalent to one state? In other words, can one experimentally distinguish a classic model of folding from a landscape model of folding? To answer this question one must think about the possible paths from different unfolded states. The landscape model is usually described in terms of a “folding funnel” in which motion around the perimeter of the open end of the funnel represents conformational diffusion between different unfolded states and motion down the neck of the funnel represents biased diffusion toward the folded state. A given unfolded configuration can move in either one of these directions. If conformational diffusion between unfolded states is fast compared to motion toward the folded state,

1 J. D. Bryngelson and P. G. Wolynes, Proc. Natl. Acad. Sci. USA 84, 7524 (1987).
2 J. D. Bryngelson and P. G. Wolynes, J. Phys. Chem. 93, 6902 (1989).

Fig. 1. Schematic diagram of a folding funnel for which slow conformational interconversion around the rim results in heterogeneous kinetics as ensembles diffuse toward the folded state under the downhill bias provided by the funnel. Ensembles that encounter a low global barrier on the way downhill diffuse quickly to the folded state, whereas ensembles that encounter a high barrier must slowly diffuse around the obstacle before crossing at the lower barrier and resuming rapid downhill diffusion.

the ensemble of unfolded states will act as an average unfolded state that ultimately moves toward the folded state (Fig. 1). Thus it would not be possible to experimentally distinguish folding of such an ensemble of unfolded states from folding of the average unfolded configuration. If, on the other hand, this conformational diffusion is slower than the folding process, it might be possible to experimentally distinguish different paths, perhaps different intermediates, in the folding of one unfolded configuration relative to another. To detect these differences, and thus experimentally confirm the utility of the landscape model of folding, it is necessary to look at early events in protein folding where such differences would show up. Thus this chapter will focus on early events in protein folding and recently developed techniques to both trigger the folding process rapidly and probe the folding processes with a variety of spectroscopic techniques that have high time resolution. We hope to show not only the value of time-resolved spectral techniques in understanding early events in protein folding, but also the value of applying a variety of techniques to investigate these processes. Folding processes can be complicated and detection of different folding pathways can be subtle. Different spectroscopic probes are sensitive to different types of structural changes so one should not rely on only one probe to make conclusions about folding pathways. For example, we will see below that in studies of the folding of cytochrome c very different conclusions would be reached about the nature of early events in the folding

process when relying only on time-resolved optical absorption (TROA) measurements than would be reached with the more structurally sensitive time-resolved circular dichroism (TRCD), time-resolved optical rotatory dispersion (TRORD), or time-resolved magnetic circular dichroism (TRMCD) measurements. To introduce the issues of rapid initiation of protein folding, the value of multiple spectroscopic probes in studying protein folding, and what we have learned about the merits of the protein folding landscape model, we will primarily discuss studies of the early events in the folding of cytochrome c. Rapid folding has been initiated in this protein by a ligand photolysis3 method as well as a photoreduction4,5 method. Folding reactions have been monitored in our laboratory with fast time-resolved absorption, circular dichroism (and optical rotatory dispersion), and magnetic circular dichroism (and magnetic optical rotatory dispersion) techniques. Other fast initiation and probe techniques that will be mentioned below include temperature jump and stopped-flow initiation triggers, and fluorescence, vibrational, and small-angle X-ray solution scattering probes, which have provided valuable information on the folding of other peptide and protein systems. Classically, reaction kinetics are understood from a quasithermodynamic point of view in which the reactant state is in equilibrium with a transition state, i.e., transition state theory (TST). In TST, the kinetics are characterized by the free energy of the transition state and a preexponential factor incorporating the frequency with which the transition state evolves to the product state. This classic description does indeed appear to be adequate for the relatively slow (>1 ms) folding reactions of many proteins.
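The classic description can be made concrete with the usual TST form of the rate constant, k = A exp(−ΔG‡/RT), where ΔG‡ is the free energy of the transition state relative to the reactant and A is the preexponential factor. The short sketch below is only an illustration: the preexponential factor of 10^6 s−1 and the barrier heights are hypothetical values chosen to show the scaling and are not taken from any measurement discussed here.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def tst_rate_constant(dg_activation, prefactor, temperature=298.15):
    """Rate constant (same units as prefactor) for a barrier given in kcal/mol."""
    return prefactor * math.exp(-dg_activation / (R * temperature))


prefactor = 1.0e6  # s^-1, hypothetical preexponential factor
for barrier in (2.0, 4.0, 6.0):  # kcal/mol, hypothetical barrier heights
    k = tst_rate_constant(barrier, prefactor)
    print(f"barrier {barrier:.1f} kcal/mol: k = {k:.3g} s^-1, 1/k = {1.0 / k:.3g} s")
```

Each increase in the barrier of RT ln 10 (about 1.4 kcal/mol at room temperature) slows the reaction tenfold, which is why modest changes in transition state free energy shift folding between the microsecond and millisecond regimes.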
Moreover, for small proteins such as cytochrome c there is often a further connection between kinetics and thermodynamics via free energy relations, i.e., the transition state free energy is observed to vary in response to denaturing perturbations as a linear function of the variation in the reaction free energy. In the landscape model, however, there is no guarantee that the many conformations making up the unfolded state will be in rapid equilibrium with each other and a transition state, and thus it cannot be assumed that the thermodynamics of the transition state will be adequate to describe the folding kinetics. How quickly this conformational equilibrium is established, and thus how late in the progress of the folding

3 C. M. Jones, E. R. Henry, Y. Hu, C.-K. Chan, S. D. Luck, A. Bhuyan, H. Roder, J. Hofrichter, and W. A. Eaton, Proc. Natl. Acad. Sci. USA 90, 11860 (1993).
4 (a) T. Pascher, J. P. Chesick, J. R. Winkler, and H. B. Gray, Science 271, 1558 (1996). (b) G. A. Mines, T. Pascher, S. C. Lee, J. R. Winkler, and H. B. Gray, Chem. Biol. 3, 491 (1996).
5 J. R. Telford, P. Wittung-Stafshede, H. B. Gray, and J. R. Winkler, Acc. Chem. Res. 31, 755 (1998).


process a landscape versus a classic description may still be required, is a principal question motivating the fast spectral techniques for folding studies described here.

Techniques for Rapid Initiation of Folding

As mentioned above, early folding studies were limited to events that occurred on time scales of milliseconds or longer, primarily because methods for initiating folding reactions typically had millisecond resolution. The most common initiation approach involved stopped-flow mixing of solutions. A protein solution could be rapidly mixed with a solution of denaturant to yield a solution with protein plus enough denaturant to unfold the protein. Alternatively, a solution of protein plus denaturant could be rapidly diluted to reduce the denaturant concentration so that the initially unfolded protein would fold. In either case the reaction kinetics could be followed spectroscopically. In the early 1970s a faster technique for initiation of protein unfolding using a rapid temperature jump (T-jump) was developed. If the sample temperature is poised near the midpoint of the temperature-induced protein unfolding curve, a sudden rise in temperature will initiate unfolding of the protein. A capacitive electrical discharge across the sample cell can typically increase the temperature of a sample by several degrees within a microsecond, but a T-jump as rapid as 50 ns can be achieved in small volumes by using a coaxial cable capacitor.6 However, rapid-discharge T-jumps typically require high salt concentrations, which are not always optimal for protein studies.

Recently, a number of initiation methods that work in the submillisecond time regime have become available, making it possible to study early events in the folding process. Two of these techniques represent refinements of the stopped-flow and T-jump methods described above. The first is a microflow stopped-flow method that was developed based on a design by Regenfuss et al.,7 who used highly turbulent flow conditions to achieve complete mixing of two solutions within tens of microseconds. When coupled with a charge-coupled device it was shown that a measurement dead time of 45 μs is possible.8 The second technique uses a fast, high-powered laser of the appropriate wavelength to generate a rapid change in temperature by either dye9 or water absorption.10,11 Relying on

6 G. W. Hoffman, Rev. Sci. Instrum. 42, 1643 (1971).
7 P. Regenfuss, R. M. Clegg, M. J. Fulwyler, F. J. Barrantes, and T. M. Jovin, Rev. Sci. Instrum. 56, 283 (1985).
8 M. C. R. Shastry, S. D. Luck, and H. Roder, Biophys. J. 74, 2714 (1998).


the absorption properties of water eliminates the potential of interference by the dye or high salt concentration with the folding/unfolding processes. Thus, this is currently the most frequently used way to generate a fast T-jump in a protein sample. Furthermore, by sending laser light at a wavelength of 1.4 μm (for H2O) to 2 μm (for D2O) into an aqueous sample, it is possible to generate a 10–30 K T-jump within the typical 10 ns pulse width of a fast laser.

Other techniques to trigger protein folding that take advantage of both the rapid nature of photochemical processes and the availability of fast laser pulses have also been developed in recent years. Two of these are particularly useful for studying folding of heme proteins, as we will see below. The first involves photodissociation of a ligand bound to the heme.3 In the presence of the denaturant guanidine hydrochloride (GuHCl), carbon monoxide (CO) will displace the native heme axial ligand, Met-80, in the reduced cytochrome c (redcyt c) protein. In addition, in the presence of GuHCl the (CO-unbound) redcyt c and CO-bound redcyt c (cyt c-CO) forms have different folding free energies. For example, when CO is bound to the reduced heme the protein is half unfolded when the GuHCl concentration is about 3 M. However, when CO is unbound, redcyt c favors the folded state, with half of the protein being unfolded only when the GuHCl concentration reaches about 5 M. Since absorption of light by cyt c-CO rapidly results in photodissociation of the CO ligand, one can make a solution of cyt c-CO in 3–5 M GuHCl, where the CO-bound redcyt c is largely unfolded, and rapidly photodissociate the Fe(II)–CO bond. The loss of the CO ligand triggers the folding reaction because the equilibrium is shifted toward the folded state in the CO-unbound redcyt c form. One can then follow the kinetics of folding of redcyt c as a function of time.
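The logic of the CO photolysis trigger can be illustrated with a two-state, linear-extrapolation model of denaturant unfolding. This is a hedged sketch: the midpoints (about 3 M and about 5 M GuHCl) come from the text, but the function name `fraction_unfolded`, the two-state assumption, and the m-value of 12 kJ mol^-1 M^-1 are assumptions introduced here for illustration only.

```python
import math

R_KJ = 8.314462618e-3  # gas constant, kJ/(mol K)

def fraction_unfolded(denaturant_m, midpoint_m, m_value, temperature_k=313.15):
    """Two-state fraction unfolded, with dG_unfold = m * (Cm - C)."""
    dg_unfold = m_value * (midpoint_m - denaturant_m)  # kJ/mol
    return 1.0 / (1.0 + math.exp(dg_unfold / (R_KJ * temperature_k)))

M_VALUE = 12.0   # assumed m-value, kJ/(mol M); not from the chapter
GUHCL = 4.6      # M, a concentration inside the 3-5 M window

f_co_bound = fraction_unfolded(GUHCL, 3.0, M_VALUE)  # midpoint ~3 M
f_co_free = fraction_unfolded(GUHCL, 5.0, M_VALUE)   # midpoint ~5 M

print(f"fraction unfolded with CO bound: {f_co_bound:.3f}")
print(f"fraction unfolded without CO:    {f_co_free:.3f}")
```

Under these assumptions the CO-bound protein is almost entirely unfolded at 4.6 M GuHCl while the CO-free protein is mostly folded at equilibrium, which is the free energy gap that photolysis exploits.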
A related approach takes advantage of the fact that oxidized cytochrome c (oxcyt c) is less stable to denaturant than redcyt c. The midpoint of unfolding of oxcyt c is about 2.5 M GuHCl, whereas the midpoint of unfolding of redcyt c is near 5 M GuHCl. Thus, in a solution of oxcyt c at GuHCl concentrations between 2.5 and 5 M, folding of the redcyt c can be triggered if one can rapidly reduce the oxcyt c. This is accomplished by a fast photoreduction technique.4,5 When a reducing agent, such as NADH,5 is added to the sample and a fast pulse of UV light, which is absorbed by the NADH, is applied, a rapid ejection of an electron occurs. This electron, as well as the resulting NADH photoproducts, then reduces the

9 C. M. Phillips, Y. Mizutani, and R. M. Hochstrasser, Proc. Natl. Acad. Sci. USA 92, 7292 (1995).
10 D. H. Turner, G. W. Flynn, N. Sutin, and J. V. Beitz, J. Am. Chem. Soc. 94, 1554 (1972).
11 D. H. Turner, G. W. Flynn, S. K. Lundberg, L. D. Faller, and N. Sutin, Nature 239, 215 (1972).


oxcyt c heme. Again, because the sample conditions of 2.5–5 M GuHCl stabilize the folded state of redcyt c, it will begin to fold and this folding process can be followed spectroscopically.

Finally, a different approach to initiation of protein folding involves synthesis of peptides with photosensitive molecules. For example, when the photocleavable aryl disulfide group is incorporated into a helical, polyalanine peptide a decrease in helicity by half that of its unconstrained analogue is observed.12 In the presence of a 3′-(carboxymethoxy)benzoin group a cyclic peptide with random coil structure will form a helical structure upon photolysis.13 Azobenzene groups have been successfully used to modulate the conformation of cyclic peptides, the sense of a peptide helix, the changes in peptide aggregation, and the content of peptide helix, β-sheet, and coil structures.14–17 Although many of these photosensitive peptides show significant structural changes under equilibrium conditions, few have been used successfully to observe the nanosecond dynamics of peptide structural changes. For example, the peptide study using the aryl disulfide linker was limited by recombination of the disulfide bond.12 Recently, however, Flint et al. demonstrated the ability to engineer 16-residue peptides with different stabilities, different contents of helix secondary structures, and different levels of light-induced conversion to disordered structure by varying the spacing of two cysteine residues that link an azobenzene group.17 Depending upon the spacing between the cysteine residues, the compatibility of the helix conformation with the trans and cis geometries can be regulated so that significant increases or decreases in helix secondary structure can be induced by photoirradiation. Recently we demonstrated that it is possible to follow unfolding in the 16-residue peptide with i, i + 11 cysteine spacing using nanosecond TRORD detection.18

Techniques for Probing Fast Folding Reactions

The most common probes for following fast folding reactions in proteins are time-resolved absorption and fluorescence (TRF) measurements. These tend to be the easiest kinetic spectroscopic measurements to make

12 M. Volk, Y. Kholodenko, H. S. M. Lu, E. A. Gooding, W. F. DeGrado, and R. M. Hochstrasser, J. Phys. Chem. B 101, 8607 (1997).
13 K. C. Hansen, R. S. Rock, R. W. Larsen, and S. I. Chan, J. Am. Chem. Soc. 122, 11567 (2000).
14 O. Pieroni, A. Fissi, N. Angelini, and F. Lenci, Acc. Chem. Res. 34, 9 (2001).
15 R. Cerpa, F. E. Cohen, and I. D. Kuntz, Folding Design 1, 91 (1996).
16 J. R. Kumita, O. S. Smart, and G. A. Woolley, Proc. Natl. Acad. Sci. USA 97, 3803 (2000).
17 D. G. Flint, J. R. Kumita, O. S. Smart, and G. A. Woolley, Chem. Biol. 9, 391 (2002).
18 E. Chen, J. R. Kumita, G. A. Woolley, and D. S. Kliger, J. Am. Chem. Soc. 125, 12443 (2003).


and can provide valuable kinetic information about folding. However, they do have some limitations. Fluorescence measurements are sensitive but provide limited information because fluorescence intensities are generally determined by the proximity of a fluorophore and quencher. The fluorophore and quencher can be internal moieties in the protein or extrinsic groups linked at specific locations on the protein. These measurements yield information about changes in a specific distance between two points on the protein. Although this can be very valuable information, care must be taken not to overinterpret these results to conclude more about global folding processes than is warranted from one distance measurement. TROA measurements can yield a great deal of information about intermediates involved in folding as long as there is a significant difference in the absorption spectra of the different states. This is generally the case for heme proteins that have chromophore spectra that are sensitive to their local environments. For nonchromophoric proteins TROA measurements are likely to be of more limited value (though far-UV TROA could still provide information on protein changes). However, even for proteins with useful intrinsic chromophores, TROA measurements tend to report on the presence of different intermediates but provide little information on their structural nature. To understand the structural changes associated with folding reactions time-resolved vibrational spectroscopies, with their higher spectral resolution and structural sensitivity, as well as time-resolved small-angle X-ray solution scattering (SAXS) methods, have proven to be useful. Resonance Raman spectroscopy provides information about structural changes around specific chromophores and IR spectroscopy provides more general information on protein changes. 
Because vibrations involve local motions, vibrational spectroscopies tend to yield information on local structures, though correlations have been made between vibrational spectra and structural motifs such as helices, sheets, and random coils. SAXS techniques have been used as a probe of the global dimensions of macromolecules. Measurements with 200-μs time resolution have recently been applied to the study of fast oxcyt c folding reactions, allowing for the size and shape characterization of intermediate species on the hundreds of microseconds to seconds time scale.19,20 CD spectroscopy is sensitive to global protein structural motifs and has long been used to determine the amounts of α-helix, β-sheet, and random

19 S. Akiyama, S. Takahashi, T. Kimura, K. Ishimori, I. Morishima, Y. Nishikawa, and T. Fujisawa, Proc. Natl. Acad. Sci. USA 99, 1329 (2002).
20 L. Pollack, M. W. Tate, N. C. Darnton, J. B. Knight, S. M. Gruner, W. A. Eaton, and R. H. Austin, Proc. Natl. Acad. Sci. USA 96, 10115 (1999).


coil structures in proteins. In addition to naturally occurring CD spectra, which report on asymmetric molecular structures, one can induce CD in molecules by inserting the sample in a magnetic field. The resulting MCD spectra are most sensitive to aromatic moieties and their surrounding environments, making them an excellent probe of the environments around the heme group and aromatic amino acids in heme proteins. Until recently the time resolution of CD measurements was limited to several milliseconds because modulation of the light between left and right circularly polarized light was accomplished with optoacoustic modulators with frequencies typically in the 100 kHz range. This meant that to collect data with reasonable signal to noise (S/N) in a CD measurement took at least milliseconds. This allowed for stopped-flow CD measurements of protein folding, but only to monitor the slow folding steps. Since the mid-1980s methods have been developed that enabled CD and MCD [as well as the equivalent ORD and magnetic ORD (MORD)] measurements to be made with nanosecond time resolution (Fig. 2).21 It is worth discussing how these methods work because they are not commonly used and because these techniques have been quite important in understanding early events in the folding of redcyt c, as we discuss below. The difference between absorption of left and right circular polarized light by circularly dichroic samples tends to be very small. Thus, direct measurement of CD would involve determining a very small difference between two large absorption signals for each circular polarization. To get around this problem the standard way of making CD measurements is to modulate the light between left and right circular polarizations and, using phase-sensitive detection, to directly measure the absorption differences. However, as mentioned above, this results in measurements that have time resolutions limited to milliseconds or longer. 
An alternative approach first introduced in 1985 uses a different method for CD measurements.21 Rather than probing a CD signal with left and right circularly polarized light, elliptically polarized light is used. Elliptically polarized light can be thought of as comprising components of left and right circularly polarized light of unequal amplitudes. Thus, if elliptically polarized light is passed through a circularly dichroic sample, the amplitudes of the left and right circular components will be altered and the eccentricity of the light polarization will change (i.e., the polarization ellipse will get fatter or skinnier) (Fig. 3). Such a system, described below, permits sensitive CD measurements with high time resolution.
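The ellipsometric idea in the paragraph above can be checked numerically with a small Jones-calculus sketch. Everything named here is an assumption made for illustration: the function names, the retardance of 0.02 rad (chosen so that the minor-to-major axis intensity ratio is about 10^-4), the sample absorbances, and the sign and handedness conventions. The sketch passes x-polarized light through a weakly strained plate at 45°, then through a circularly dichroic sample, then through a crossed analyzer, and compares the normalized difference for the two plate orientations with the small-signal estimate 2.3 ΔA/δ.

```python
import math

def crossed_analyzer_intensity(delta_rad, abs_left, abs_right):
    """Intensity behind a crossed (y) analyzer for unit x-polarized input.

    delta_rad is the strain-plate retardance (axis at 45 degrees);
    abs_left/abs_right are sample absorbances for the circular components.
    """
    # Field after the strain plate: large x part, small quadrature y part.
    ex = complex(math.cos(delta_rad / 2), 0.0)
    ey = complex(0.0, -math.sin(delta_rad / 2))
    # Amplitude transmissions of the two circular components.
    t_left = 10 ** (-abs_left / 2)
    t_right = 10 ** (-abs_right / 2)
    # y row of the sample Jones matrix, built from circular projectors.
    m_yx = 0.5j * (t_left - t_right)
    m_yy = 0.5 * (t_left + t_right)
    ey_out = m_yx * ex + m_yy * ey
    return abs(ey_out) ** 2

def cd_signal(delta_rad, abs_left, abs_right):
    """Normalized difference between the two strain-plate orientations."""
    i_plus = crossed_analyzer_intensity(+delta_rad, abs_left, abs_right)
    i_minus = crossed_analyzer_intensity(-delta_rad, abs_left, abs_right)
    return (i_plus - i_minus) / (i_plus + i_minus)

delta = 0.02                          # assumed retardance, radians
a_left, a_right = 0.50005, 0.49995    # assumed absorbances, dA = 1e-4
signal = cd_signal(delta, a_left, a_right)
estimate = math.log(10) * (a_left - a_right) / delta

print(f"minor/major intensity ratio: {math.tan(delta / 2) ** 2:.2e}")
print(f"simulated signal:            {signal:.6f}")
print(f"2.3 * dA / delta:            {estimate:.6f}")
```

The small retardance acts as a gain factor: an absorbance difference of 10^-4 becomes a fractional intensity change of order 10^-2, which is why a single pair of intensity measurements can replace phase-sensitive modulation.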

21 J. W. Lewis, R. F. Tilton, C. M. Einterz, S. J. Milder, I. D. Kuntz, and D. S. Kliger, J. Phys. Chem. 89, 289 (1985).


Fig. 2. Nanosecond multichannel apparatus for natural and magnetic TRCD/TRORD spectroscopy. Based on a laser photolysis apparatus, a near-null modulator (NNM) and crossed analyzing polarizer (P) are used to detect CD/ORD in the photolyzed sample (S). In TRMCD/TRMORD, Faraday rotation of the sample is compensated by a solvent blank (B) in a reversed magnetic field.

The skeleton of the TRCD apparatus is a standard laser photolysis apparatus in which a laser is used to initiate a photochemical reaction and a secondary light source is used to probe the absorption spectra as a function of time. In our laboratory we use a pulsed xenon flash lamp for the probe source as it provides the intensity needed for high S/N measurements with low total light energy so samples are not degraded by the probe light. TROA measurements are made with this apparatus by following the change in intensity of light passing through the sample as a function of wavelength and time. For TRCD measurements the optical train of the apparatus is modified to follow the change in polarization properties of the probe light, rather than the change in intensity, as a function of time. Light from the probe source is passed through a linear polarizer. The resulting linearly polarized beam is then passed through a quartz plate to which a strain has been applied along an axis oriented at 45° relative to the linear polarization axis. This produces birefringence in the plate, resulting in highly eccentric elliptically polarized light (the intensity along the major ellipse axis is about 10⁴ times the intensity along the minor axis). After passing through


Fig. 3. (A) CD is detected with an NNM comprising a polarizer and strain plate, a fused silica plate slightly compressed by an anvil mechanism to produce a birefringence along an axis diagonal to the polarizer axis and mounted on a motorized rotation stage. The CD of the sample adds to or subtracts from the reference ellipticity produced by the strain plate, depending on its ±45° orientation as the stage is rotated. The difference between intensity measurements after P is proportional to the CD. (B) ORD is detected with an NNM comprising a polarizer mounted on a motorized rotation stage. The ORD of the sample adds to or subtracts from the reference rotation produced by the polarizer, depending on the sign (±) of its rotation. The difference between intensity measurements after the analyzer is proportional to the ORD.

the sample, the probe light passes through another linear polarizer oriented perpendicular to the axis of the first polarizer and finally to a spectrograph and gated multichannel detector. The detected signal thus monitors the intensity along the minor polarization ellipse axis, which is most sensitive to changes induced by CD. In fact, it turns out that by measuring the signal using both right and left elliptically polarized light one can directly


determine the CD signal by taking the difference between the two measurements normalized to their sum:

$$\text{Signal} = \frac{I_{REP} - I_{LEP}}{I_{REP} + I_{LEP}} = \frac{2.3\,\Delta\varepsilon\,C\,l}{\delta}$$

where $I_{REP}$ and $I_{LEP}$ are the intensities of right and left elliptically polarized light that reach the detector, $\Delta\varepsilon$ is the difference in extinction coefficient for absorption of left and right circularly polarized light (i.e., the CD signal), $C$ is the sample concentration, $l$ is the sample path length, and $\delta$ is the retardation (in radians) of the birefringent element (the strained quartz plate). Using this method it is possible to measure multiwavelength CD spectra with nanosecond time resolution (an alternative approach that synchronizes picosecond lasers with high repetition rates to optoacoustic modulators for a more classic approach to CD measurements at even higher time resolution has also been used22). With appropriate modifications of the TRCD apparatus, MCD,23 ORD,24 and MORD25,26 spectra can also be measured with similar time resolution.

Early Events in the Folding of Cytochrome c

As an example of how the different triggering techniques described above can be used to understand early events in protein folding, we will discuss a variety of studies that have been carried out on folding of redcyt c. The first group of studies focuses on redcyt c folding triggered by a photodissociation event. This trigger was developed based on the observations that the presence of GuHCl weakens the Met-80 axial ligand, which can then be readily replaced by nonnative, side chain amino acids such as histidine29,30 or extrinsic ligands such as imidazole, cyanide, azide, or CO.27–29 Two important experimental details arise from this observation.

22 X. Xie and J. D. Simon, Biochemistry 30, 3682 (1991).
23 R. A. Goldbeck, T. D. Dawes, S. J. Milder, J. W. Lewis, and D. S. Kliger, Chem. Phys. Lett. 156, 545 (1989).
24 D. B. Shapiro, R. A. Goldbeck, D. Che, R. M. Esquerra, S. J. Paquette, and D. S. Kliger, Biophys. J. 68, 326 (1995).
25 R. M. Esquerra, R. A. Goldbeck, D. B. Kim-Shapiro, and D. S. Kliger, J. Phys. Chem. A 102, 8740 (1998).
26 R. M. Esquerra, R. A. Goldbeck, D. B. Kim-Shapiro, and D. S. Kliger, J. Phys. Chem. A 102, 8749 (1998).
27 J. Babul and E. Stellwagen, Biopolymers 10, 2359 (1971).
28 K. Muthukrishnan and B. T. Nall, Biochemistry 30, 4706 (1991).
29 D. N. Brems and E. Stellwagen, J. Biol. Chem. 258, 3655 (1983).
30 T. Pascher, Biochemistry 40, 5812 (2001).


First, the folding free energies for cyt c-CO and (CO-unbound) redcyt c in GuHCl are different, as described above. Second, when CO is present as the extrinsic ligand it will preferentially bind at the axial position that is coordinated by Met-80 under native conditions and form an iron–CO bond. This presents us with an opportunity, based on the well-known phenomenon that CO bound to a reduced iron heme can be rapidly dissociated upon absorption of light. Taken together with the difference in folding stabilities for cyt c-CO and redcyt c in GuHCl, this phenomenon provides a useful way to trigger protein folding in redcyt c. One can add GuHCl to a solution of redcyt c and CO at a concentration at which the CO-bound redcyt c is unfolded (such as 4.6 M GuHCl). Photolysis of this sample by a fast light pulse will then transiently yield an environment in which the equilibrium between refolding and unfolding is expected to favor the folded redcyt c state.3 Because the photodissociation process occurs on a subnanosecond time scale, folding of the redcyt c will be triggered on a time scale useful for studying early events in the folding process.

A second approach to rapid triggering of folding in redcyt c takes advantage of the observation that the folded state of redcyt c is more stable than that of oxcyt c.4,5 For example, the fraction of unfolded protein in oxcyt c and the fraction of folded protein in redcyt c are both near 1 in the range of 3–4 M GuHCl. Thus if one could rapidly convert oxcyt c to redcyt c it would again be possible to trigger folding on a time scale useful to study early events in the folding process. This rapid reduction of oxcyt c to form redcyt c has been accomplished by photoreduction with a number of different reducing agents,4,5 including inorganic salts, such as ruthenium trisbipyridyl, Ru(2,2′-bipyridine)₃²⁺, and cobalt oxalate, Co(C₂O₄)₃³⁻, or agents such as nicotinamide adenine dinucleotide (NADH). In each case photoreduction of oxcyt c can occur on microsecond or submicrosecond time scales.

The first use of these schemes for rapidly triggering redcyt c folding was made by Eaton and co-workers using the CO photolysis method.3 The CO bound to the redcyt c in 4.6 M GuHCl was photodissociated by a 532-nm laser pulse and the resulting folding reactions were followed by monitoring absorption changes as a function of time from 10 ns to 1 s after photolysis. The initial photoproduct observed at 10 ns was identified as a 5-coordinate reduced heme intermediate and the final product had a spectrum like that of native redcyt c. At intermediate times they observed spectra that were identified as at least two different 6-coordinate heme species. The initial 5-coordinate species was taken to be the heme ligated only to His-18 without a ligand in the sixth axial position. Ligation of this species by Met-80 would produce native cytochrome c ligation but nonnative ligation by His-26, His-33, or Met-65 would also be possible. Their results were interpreted in terms of early ligation, occurring with a net time constant of 2 μs, to each of


these residues to varying degrees depending on their distance from the heme. A shifting of the proposed metastable equilibrium to predominantly His-33 and His-26 ligation states was then assigned to a process taking place on a time scale of 50 μs, followed by displacement of the nonnative histidine ligands by Met-80 and folding to the native structure occurring on a millisecond time scale.

Triggering redcyt c folding by a rapid electron transfer reaction was first accomplished by Gray and colleagues.4 They were able to photoreduce oxcyt c in the presence of GuHCl and Ru(2,2′-bipyridine)₃²⁺ to study early (<1 ms) folding events. To study later (>1 ms) folding reactions, reduction of the oxcyt c was accomplished by photoinitiated electron transfer from Co(C₂O₄)₃³⁻. These conditions led to reduction of the cytochrome in less than a millisecond and enabled study of later stages of folding. Upon photoreduction of the oxcyt c the changes in the resulting redcyt c were monitored by time-resolved absorption,4 as in the CO photolysis studies described above, or by fluorescence.5,30 Given the time required for photoreduction of oxcyt c, a 2-μs process was not observed, but at 4.6 M GuHCl (pH 7, 40°) a 40-μs process was observed and assigned to collapse of the unfolded protein to a compact denatured structure. In addition, two slow processes were detected and associated with folding of the redcyt c and reoxidation of the heme.

These two different pictures of the early events in the folding of this protein derived from time-resolved absorption studies led us to look at these processes again by monitoring the folding processes with more structurally sensitive polarization spectroscopies. This initially involved TRCD studies of folding induced by the CO photolysis method.31,32 As a precursor to the TRCD measurements, TROA measurements were carried out on CO-bound redcyt c in 4.6 M GuHCl. As seen before, four exponential processes, with lifetimes of 2, 50, 225, and 880 μs, were required to reproduce the pattern of spectral changes as a function of time. However, noticing that the spectra returned to their prephotolysis shape at late times, it was clear that one or more of these processes involved a bimolecular recombination of the CO to the redcyt c rather than a folding process. The experiment was then repeated after lowering the concentration of CO by a factor of 3 and it was found that the last two rates slowed by a factor of 3 while the first two rates were not significantly changed.
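The CO-dilution test can be made concrete with a short calculation. For a pseudo-first-order bimolecular rebinding step the observed lifetime should scale inversely with CO concentration, whereas a unimolecular step should not respond. In the sketch below the lifetimes are the values listed in Table I for horse cyt c-CO in 4.6 M GuHCl at 1 atm CO and at 3:1 Ar/CO; the helper name `scales_with_co` and the 25% tolerance are assumptions introduced here, not part of the original analysis.

```python
def scales_with_co(lifetime_ratio, dilution_factor, tolerance=0.25):
    """True if a lifetime lengthens roughly in proportion to CO dilution."""
    return abs(lifetime_ratio / dilution_factor - 1.0) <= tolerance

# Lifetimes in microseconds for processes 1-4 (Table I).
tau_1atm_co = [2.0, 50.0, 225.0, 880.0]
tau_diluted_co = [2.0, 70.0, 640.0, 2200.0]
dilution = 3.0  # CO concentration lowered by a factor of 3, as stated in the text

ratios = [slow / fast for fast, slow in zip(tau_1atm_co, tau_diluted_co)]
flags = [scales_with_co(r, dilution) for r in ratios]

for index, (ratio, flag) in enumerate(zip(ratios, flags), start=1):
    label = "CO-dependent" if flag else "CO-independent"
    print(f"process {index}: lifetime ratio {ratio:.2f} -> {label}")
```

With this crude criterion only the two slowest processes track the CO concentration, in line with their assignment to bimolecular recombination.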

31 E. Chen, M. J. Wood, A. L. Fink, and D. S. Kliger, Biochemistry 37, 5589 (1998).
32 E. Chen and D. S. Kliger, Inorg. Chim. Acta 242, 149 (1996).


Thus, the last two processes were taken to be due to two different CO bimolecular recombination processes. To assign the nature of the first two processes, experiments were repeated under different conditions. The results are shown in Table I. When tuna heart redcyt c was studied instead of horse heart redcyt c very similar rates were found for each of the four processes, though the amplitude of the second process was reduced by a factor of 3. Because one difference between these two proteins is that His-33 in horse cyt c is replaced by a Trp in tuna cyt c, this suggested that the 50-μs process could involve binding of the heme to a nonnative His residue. Further evidence of this was found for horse cyt c experiments in 6 M GuHCl (pH 4.1), where His-26 and His-33 are protonated, and in experiments at 4.6 M GuHCl (pH 1.7), where all of the His residues are protonated. In both cases no 50-μs process was observed. Hints about the nature of the 2-μs process were obtained in experiments performed on samples of cyt c-CO in 3.7 M GuHCl and of CO-unbound redcyt c in 4.6 M GuHCl. For the cyt c-CO sample in 3.7 M GuHCl the amplitude of the 2-μs process doubled and in the CO-unbound redcyt c sample only a 2-μs component was observed. MCD spectral studies (see below) had shown that at lower GuHCl concentrations a His–Met heme ligation is favored over a His–His heme ligation.

TABLE I
Kinetics of Horse Heart Cyt c-CO Photolysis (a)

                                     Time constants (μs) and amplitudes
Sample                               (1)          (2)         (3)          (4)           (5)
4.6 M GuHCl                          2 (0.08)     50 (0.30)   225 (0.25)   880 (0.35)    —
4.6 M GuHCl, 3:1 Ar/CO               2 (0.09)     70 (0.30)   640 (0.26)   2200 (0.34)   —
4.6 M GuHCl, pH 1.7 (b)              3.5 (0.09)   —           150 (0.50)   2000 (0.21)   17000 (0.19)
6 M GuHCl, pH 4.1                    3 (0.11)     —           160 (0.50)   2600 (0.11)   26000 (0.28)
3.7 M GuHCl                          1.6 (0.16)   50 (0.28)   590 (0.49)   —             43000 (0.04)
CO-unbound redcyt c, 4.6 M GuHCl     2.4 (0.56)   —           —            —             —
Tuna heart, 4.6 M GuHCl              2 (0.09)     40 (0.11)   250 (0.56)   710 (0.24)    —

(a) Experimental conditions: pH 6.5 and 40° (1 atm CO) for horse cyt c unless otherwise noted.
(b) The sample also contained 0.1 M NaCl.

Thus the 3.7 M GuHCl


experiment provided a clue that the 2-μs process could involve binding of the Met residue to the heme. Photolysis of redcyt c alone was hypothesized to result in photodissociation of the Met-80 group, and the lone 2-μs process was taken to be due to rebinding of this group to the heme.

The above assignments of the 2-μs and 50-μs processes were confirmed in TRMCD studies as described below. First, however, TRCD experiments in the far-UV region were performed to determine the folding kinetics. Interestingly, the far-UV TRCD measurements showed that only 8% of the folding expected upon going from the unfolded CO-bound protein to the folded CO-unbound redcyt c protein (4.6 M GuHCl, pH 6.5, 40°) was observed. Folding to the native structure was observed to be slower than CO rebinding to the unfolded state(s). Thus, what were thought to be slow folding processes in previous TROA studies3 were reinterpreted as CO recombination with little ultimate folding. The TRCD results also produced another surprise: the 8% of the proteins that actually folded did so in less than 2 μs. These results suggested the presence of a fast-folding ensemble of unfolded proteins along with a larger ensemble of slow-folding proteins.

To test the validity of the above assignments for the 2-μs and 50-μs processes and to understand the apparent heterogeneity of the folding rates, TRMCD studies of this system proved invaluable.33 While absorption spectra of His–His and His–Met ligated hemes show subtle differences, the MCD spectra of these species are dramatically different. It was thus straightforward to determine the MCD spectral changes associated with the 2-μs and 50-μs processes and determine that they must, as previously hypothesized, involve heme ligation to a Met residue and a His residue, respectively. Beyond identifying the ligands involved in binding processes on the 2- and 50-μs time scales, the MCD data provided new information about the nature of the folding mechanism that the TROA and TRCD data did not reveal. With binding of a Met ligand on a 2-μs time scale and binding of a His ligand on a 50-μs time scale one might guess that these reactions proceeded sequentially, first with binding of a Met residue to an unfolded state and then with displacement of this Met ligand by a His ligand. The TRMCD data showed, however, that this could not be true since the 2-μs process clearly involved binding of a Met residue to a 5-coordinate heme and the 50-μs process clearly involved binding of a His residue to a 5-coordinate heme. Parallel reactions involving binding of these two residues to two different populations of unfolded protein must thus be occurring in this system. What turned out to be even more exciting was that the MCD

33 R. A. Goldbeck, Y. G. Thomas, E. Chen, R. M. Esquerra, and D. S. Kliger, Proc. Natl. Acad. Sci. USA 96, 2782 (1999).


spectra had sufficient structure and sensitivity that they could be used as a test of the landscape model of folding in redcyt c. As discussed above, a classic mechanism of protein folding and the landscape model of protein folding can be distinguished only if the rate of interconversion between different unfolded conformations is on the order of or slower than the rate of folding from the unfolded conformations to the native structure. The TRMCD experiments followed the changes at the heme group as a function of time after CO photolysis of cyt c-CO. Analysis of these data showed that four exponential processes were needed to describe the pattern of spectral changes, and that the lifetimes of the processes closely matched those extracted from the absorption studies.3,31 The exponential relaxation spectra obtained in the global kinetic analysis for those experiments are linear combinations of the spectra of the intermediates involved in the mechanism. To obtain the intermediate spectra one must know the mechanism involved in the various transformations being studied. It is possible to turn this requirement around by assuming different mechanisms and determining if the resulting calculated intermediate spectra are realistic in terms of known model spectra. This made it possible to provide the first definitive experimental test of the landscape model of protein folding.

The TRMCD data for the redcyt c folding reaction were analyzed in terms of two different models. One was a classic model in which the different initial species involved in the folding, a His–Met ligated heme, a His–His ligated heme, and a 5-coordinate heme, were in rapid equilibrium with each other, as assumed in the early TROA studies3 discussed above. The other was a heterogeneous model in which each of these intermediates could be involved in a pathway leading to folding of the protein. However, these pathways were not coupled, i.e., folding along each of the pathways was faster than interconversion of these intermediates. For each of the analyses intermediate spectra were calculated from the TRMCD data and the resulting calculated intermediate spectra were compared to model compound spectra for each intermediate. The results showed that for the His–Met heme and 5-coordinate heme species (as well as the CO-ligated heme species) the two folding mechanisms yielded the same, good fits with model spectra. In the case of the His–His ligated heme, however, only the heterogeneous (landscape) model could even qualitatively produce a spectrum like that of model compounds (Fig. 4). This was the first experimental demonstration that protein folding is better described by the landscape model than by the classic model of folding.

Although the direct experimental support for the landscape model of folding was valuable, a question remained as to whether this is a general feature of protein folding or whether it could even be an artifact of the method of initiation of the folding. The study of redcyt c is useful for answering the latter question since it is possible to initiate folding through a rapid

[14]

kinetics of early events in protein folding

325

Fig. 4. Calculated MCD spectra for the bis–His intermediate that forms within tens of microseconds after photolysis of cytochrome c-CO in 4.6 M guanidine hydrochloride indicate that the unfolded protein conformations are not in equilibrium on this time scale.33 The spectrum obtained from an energy landscape model in which slow conformational equilibration kinetically isolates the heterogeneous unfolded conformations (bold solid line) gives a much better fit to the bis–His model compound spectrum (light solid line) than the spectrum obtained from a classic model in which the unfolded conformations are homogeneously equilibrated with one another on the microsecond time scale (bold dashed line). (The cytochrome b5 model compound spectrum has been shifted in wavelength and scaled slightly for comparison with the calculated spectra for cytochrome c.)

photoreduction of oxcyt c. To do this, NADH was added to an oxcyt c solution and, as described above, photoexcitation of the NADH resulted in submicrosecond reduction of the protein to form redcyt c.34 An advantage of this folding initiation method over the CO photolysis method is that experimental conditions can be controlled such that competing back reactions are minimized, so that late as well as early events in folding can be followed through to formation of the final native protein structure. Compared to the formation of 8% secondary structure within 2 μs that was observed in folding of redcyt c initiated by photolysis of cyt c-CO, it was now expected that nearly 100% of secondary structure formation would be detectable in TRCD studies of redcyt c folding initiated by a photoreduction trigger. Thus, by monitoring the folding reaction with

34 E. Chen, P. Wittung-Stafshede, and D. S. Kliger, J. Am. Chem. Soc. 121, 3811 (1999).


TRCD measurements it was possible to follow slow folding reactions (at times longer than 4 ms) to the native structure and see that the kinetics of late folding reactions were similar to those measured by previous stopped-flow folding experiments.35,36 It was also possible to follow faster folding processes not observable in the earlier stopped-flow studies. The early-time observations found that secondary structure appeared within 5 μs after photoreduction.34 The secondary structure formed during these early times remained fairly constant for milliseconds, although there was a small amount of structure that seemed to undergo unfolding (i.e., a small decrease in the CD signal was observed) with a time constant of 180 μs. This 180-μs component preceded a 6-ms phase that involved an increase in secondary structure and a 110-ms phase that resulted in formation of the native redcyt c structure.

Extension of these experiments to study the denaturant concentration dependence of early events in redcyt c folding yielded further insights into the landscape nature of folding.37 TRORD was used in this study in order to extend the measurements to earlier times and maximize information about early folding events. In Fig. 5 the kinetics of folding at 2.7, 3.0, 3.3, and 4.0 M GuHCl clearly show dramatic changes in the early time kinetics with changing GuHCl concentrations. The results show a surprising trend: early secondary structure formation becomes faster with increasing denaturant concentration. It is also interesting to note that the time for reduction of the oxcyt c to form redcyt c from electrons ejected from the NADH is nanoseconds but the time to fully reduce the sample is about 100 μs, slower than the kinetics of the fast-folding process. These results show that the fast-folding process proceeds from a conformational ensemble that is not in equilibrium with the bulk of protein conformers during this time.
Slow-folding processes in this protein, however, exhibit homogeneous folding. Thus, one cannot simply think of folding as being heterogeneous, as suggested in the landscape model of folding, or as homogeneous, as suggested by the classic model of folding. Rather, the appropriate model describing folding will depend on the relative time scales of interconformational equilibration versus kinetics of folding. In redcyt c conformational equilibration takes place on a time scale of 10⁻⁴–10⁻³ s. Submillisecond folding processes will thus appear to follow heterogeneous folding kinetics while slower folding processes will appear to follow homogeneous folding kinetics. The kinetic heterogeneity that is expected to be a hallmark of the energy landscape/funnel folding regime has heretofore largely eluded

35 K. Kuwajima, H. Yamaya, S. Miwa, S. Sugai, and T. Nagamura, FEBS Lett. 221, 115 (1987).
36 G. A. Elöve, A. F. Chaffotte, H. Roder, and M. Goldberg, Biochemistry 31, 6876 (1992).
37 E. Chen, R. A. Goldbeck, and D. S. Kliger, J. Phys. Chem. A 107, 8149 (2003).


Fig. 5. The rise time of the submillisecond TRORD signal gets faster with increasing concentrations of GuHCl. In (A–D) the difference ORD signals of the time-dependent photoreduced folding product and the initial oxcyt c state are shown as a kinetic trace that was obtained by averaging the multiwavelength data over a 6-nm interval around 230 nm. The data are shown for GuHCl concentrations of 2.7, 3.0, 3.3, and 4.0 M in (A)–(D), respectively. The equilibrium difference signal for redcyt c minus oxcyt c is shown as a dotted black line. Oxcyt c (75 μM) was prepared in 0.1 M NaP, 500 μM NADH, and the appropriate concentration of GuHCl at pH 7, and equilibrium redcyt c was prepared by adding dithionite to oxcyt c. All data were measured in a 1.3-mm path length cell at 25 °C.

detection, but with fast, structure-sensitive spectral methods such as those described here it has become increasingly possible to probe the earliest events of folding. These events are expected to occur mainly on the upper reaches of the folding funnel where the heterogeneity of the unfolded conformations is greatest. The surprising kinetic behavior of unfolded cytochrome c protein chains as revealed by TRMCD/MORD and TRORD measurements points to the 10⁻⁴–10⁻³ s time scale as the likely temporal crossing point between landscape and classic behavior in this protein. Testing the generality of this finding for protein folding will require further application of these methods to a wider variety of protein sequences and conditions.

Acknowledgment

This research was supported by the National Institutes of Health grant GM38549.


[15] Hydrogen-Exchange Strategies Applied to Energetics of Intermediate Processes in Protein Folding

By David Wildes and Susan Marqusee

Introduction

The incredible success of X-ray crystallography has provided protein chemists with a wealth of information about protein structure. These models provide a description of the average atomic positions for the protein, which serves as a good approximation for the native or most-populated conformation. Although important, these studies neglect other less-populated conformations. In addition to the native conformation, proteins sample partially and globally unfolded states whose populations and lifetimes are defined by a Boltzmann distribution, with kinetic barriers dictating the average lifetime of a molecule in a given state. Together, these features comprise a protein's energy landscape. Understanding the structures and energetics of these high-energy regions of the energy landscape is an important, often overlooked goal in going from protein structure to function. This ensemble of conformations contributes to many fundamental processes, including enzymatic catalysis, ligand recognition, allosteric regulation, and protein folding.

Detection of high-energy states on the energy landscape is experimentally challenging. By definition these states are poorly populated; they form a tiny fraction of the ensemble under native conditions. Using most traditional probes of protein structure, only well-populated conformations are detected. Structural techniques sensitive to these rare conformations are needed in order to characterize the entire energy landscape. The rate of exchange of solvent hydrogens with those on the peptide backbone is sensitive to the presence of minutely populated protein conformations, making amide hydrogen exchange an excellent technique for probing high-energy forms. When coupled with high-resolution nuclear magnetic resonance (NMR) or mass spectrometry, hydrogen exchange can provide residue-specific information about the energetics of conformational fluctuations in a polypeptide chain.
Here we present an overview of the basics of hydrogen exchange and its applications to the equilibrium energy landscape, both in the absence and presence of reversible ligand binding. For more details on hydrogen exchange and its many applications we suggest the following reviews as a starting point.1–3

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

[15]

hydrogen-exchange strategies in protein folding

329

Chemistry of Hydrogen Exchange: The Basics

Amide hydrogen exchange is a powerful tool for combining structural and energetic studies of proteins. Proteins contain several different classes of hydrogens that are labile to exchange with solvent, both on the peptide backbone and the side chains. Of these, only amide hydrogens on the backbone exchange at an experimentally useful rate. Chemically, the exchange of an amide proton with a solvent proton is catalyzed by both acid and base. When examined using an unstructured random-coil polypeptide, this gives rise to the characteristic V-shaped curve for the log of the exchange rate (log krc) versus pH, with a minimum around pH 3.0 (Fig. 1). This exchange rate shows considerable dependence on primary structure, due to steric and inductive effects from nearby side chains. These effects, as well as the effect of temperature, have been calibrated using short model peptides, and can be very easily predicted from protein sequence4; the resulting krc are often called ‘‘Bai factors.’’ A comparison of Bai factors with hydrogen-exchange rates measured in unfolded proteins suggests that they serve as a good approximation of the true exchange rate in the unfolded state. Detailed instructions for calculating residue-specific krc for a given protein sequence, using a simple spreadsheet, have been previously outlined.3
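The V-shaped pH dependence described above can be illustrated with a minimal sketch that sums an acid-catalyzed and a base-catalyzed term. The rate constants below are hypothetical placeholders chosen only so that the minimum falls near pH 3; they are not the calibrated values of Bai et al., and the function names are introduced here purely for illustration.

```python
import math

# Illustrative (hypothetical) second-order rate constants. These are NOT the
# calibrated Bai et al. parameters; they are chosen so the minimum is near pH 3.
K_ACID = 1.0      # M^-1 s^-1, acid-catalyzed term (assumed)
K_BASE = 1.0e8    # M^-1 s^-1, base-catalyzed term (assumed)
PKW = 14.0        # ion product of water (assumed)


def log_k_rc(ph):
    """log10 of a random-coil exchange rate with acid and base catalysis."""
    h_conc = 10.0 ** (-ph)          # [H+]
    oh_conc = 10.0 ** (ph - PKW)    # [OH-]
    return math.log10(K_ACID * h_conc + K_BASE * oh_conc)


def ph_of_minimum():
    """pH at which the acid and base terms contribute equally (curve minimum)."""
    return 0.5 * (PKW + math.log10(K_ACID / K_BASE))


for ph in (1.0, 2.0, 3.0, 4.0, 5.0, 7.0):
    print(f"pH {ph:4.1f}  log k_rc = {log_k_rc(ph):6.2f}")
```

Above the minimum the base term dominates, so in this sketch log krc rises by about one unit per pH unit, which is the behavior used later to move between exchange regimes.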

Fig. 1. pH dependence of the intrinsic rate of exchange from a random-coil polypeptide. krc (calculated from Bai et al.4) is plotted versus pH for the valine amide in the sequence Phe-Val-Ala.

1 S. W. Englander and N. R. Kallenbach, Q. Rev. Biophys. 16, 521 (1983).
2 S. W. Englander, L. Mayne, Y. Bai, and T. R. Sosnick, Protein Sci. 6, 1101 (1997).
3 C. B. Arrington and A. D. Robertson, Methods Enzymol. 323, 104 (2000).
4 Y. Bai, J. S. Milne, L. Mayne, and S. W. Englander, Proteins 17, 75 (1993).


Hydrogen Exchange in Proteins: Slowing by Structure

In folded proteins, the observed rates of amide hydrogen exchange (kobs) are often much smaller than the predicted Bai factors. This slowing is usually represented by the protection factor, P, the ratio of the random coil exchange rate constant krc to the observed rate constant kobs (P = krc/kobs). In a folded protein, values of P can range from 1 (no protection) to 10¹⁰ or higher. Such protection reflects a dramatic slowing of exchange due to the fact that the amide hydrogen is in a folded protein. Retardation of exchange results from exclusion of solvent (and hydrogen-exchange catalysts) from the protein interior, from hydrogen bonding, or from both. To exchange, protected amide sites must transiently become exposed, either by penetration of solvent into the protein interior or by a transient opening event that exposes the amides to bulk solvent. Potential opening events include small fluctuations that expose only single amides, cooperative partial unfolding, and even complete unfolding of the protein (see section on native state exchange). These exchange-competent, or open, states represent higher energy conformations within the native state ensemble; the connection between these high-energy states and hydrogen exchange rates is what makes hydrogen exchange such a powerful technique for studying the details of protein native state ensembles. As first outlined by Linderstrøm-Lang and co-workers,5 the exchange reaction for a given protected amide can be modeled using a microscopic two-state model (Scheme I):

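$$\text{closed} \;\underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}}\; \text{open} \;\xrightarrow{\;k_{rc}\;}\; \text{exchanged}$$

Scheme I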

where closed represents the native, exchange-incompetent conformation, and open represents the rare, exchange-competent state. Application of the steady-state approximation to the small population of open states results in Eq. (1), describing the observed rate of exchange.

$$k_{obs} = \frac{k_{op}\,k_{rc}}{k_{cl} + k_{rc}} \tag{1}$$

Under most conditions, this equation can be further simplified depending on the relative rates of closure (kcl) and intrinsic exchange (krc). If opening is rate limiting (i.e., krc ≫ kcl), Eq. (1) simplifies to Eq. (2).

$$k_{obs} = k_{op} \tag{2}$$

5 A. Hvidt and S. O. Nielsen, Adv. Protein Chem. 21, 287 (1966).


This extreme case is usually called an EX1 exchange mechanism. In the other extreme, the EX2 mechanism, krc ≪ kcl, meaning that opening events only rarely result in exchange. The opening reaction is therefore a preequilibrium step, and, assuming that kcl ≫ kop, so that the open state exists as a minor population at equilibrium, Eq. (1) simplifies to Eq. (3).

$$k_{obs} = \frac{k_{op}\,k_{rc}}{k_{cl}} = K_{op}\,k_{rc} \tag{3}$$
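The two limits of Eq. (1) can be checked numerically. The following is a minimal sketch, assuming hypothetical rate constants and an arbitrary 100-fold separation as the cutoff for calling a regime EX1 or EX2; neither the numbers nor the function names come from the chapter.

```python
def k_obs(k_op, k_cl, k_rc):
    """Observed exchange rate from Eq. (1): k_op * k_rc / (k_cl + k_rc)."""
    return k_op * k_rc / (k_cl + k_rc)


def regime(k_cl, k_rc, factor=100.0):
    """Label the kinetic regime. The 100-fold cutoff is an arbitrary assumption."""
    if k_rc >= factor * k_cl:
        return "EX1"   # exchange much faster than closing: k_obs ~ k_op
    if k_cl >= factor * k_rc:
        return "EX2"   # closing much faster than exchange: k_obs ~ (k_op/k_cl) * k_rc
    return "mixed"


# Hypothetical opening and closing rate constants (s^-1), for illustration only.
K_OP = 1.0e-3
K_CL = 1.0e3

# Sweep the intrinsic rate k_rc over several orders of magnitude.
for k_rc in (1.0e-1, 1.0e1, 1.0e3, 1.0e5, 1.0e7):
    full = k_obs(K_OP, K_CL, k_rc)
    ex2_limit = (K_OP / K_CL) * k_rc
    print(f"k_rc={k_rc:8.1e}  k_obs={full:9.3e}  "
          f"EX2 limit={ex2_limit:9.3e}  EX1 limit={K_OP:9.3e}  {regime(K_CL, k_rc)}")
```

With these assumed values the full expression tracks the EX2 limit when krc is small and levels off at kop when krc is large, with a mixed region in between where neither simplification should be used.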

Kop is the equilibrium constant for the opening event. In the EX2 regime, the change in free energy required for opening (ΔGHX) can be calculated from the measured rate constant:

$$\Delta G_{HX} = -RT \ln K_{op} = -RT \ln \frac{k_{obs}}{k_{rc}} \tag{4}$$

where R is the ideal gas constant and T is the absolute temperature. Based on the above analysis, hydrogen-exchange rates can reflect either equilibrium or kinetic features of a protein depending on the conditions. Under conditions favoring an EX2 mechanism, the observed rate reports on the equilibrium constant for an opening event, whereas under conditions favoring an EX1 mechanism, the observed rate reports directly on the rate of opening. In most proteins, at pH 7.0 and below, the amide protons exchange in the EX2 regime. The EX1 regime can often be accessed at high pH, where, due to the pH dependence of intrinsic exchange, random coil exchange rates (krc) can get very large.

EX1 and EX2 exchange provide different but complementary information about the native state ensemble of a protein. When the open, exchange-competent state is an unfolded conformation (either partially or globally unfolded), EX2 exchange provides a measurement of the free energy difference between the open and closed conformations, yielding the relative populations of these high-energy states under native conditions. If the same protein can be switched to EX1 conditions without altering the conformational energetics, then the hydrogen-exchange rates report on the rate of conversion between the folded form and these high-energy forms. Therefore, experiments in EX2 give information about the relative populations of rare states, whereas EX1 experiments provide information about their dynamics of interconversion.

Clearly, before interpreting hydrogen-exchange data, one needs to determine whether the EX1 or EX2 conditions apply. This is usually done by measuring hydrogen exchange rates as a function of pH. In EX2, kobs = Kop krc; since Kop is normally considered insensitive to small changes in pH, the log of the observed exchange rate [log(kobs)] should have a


linear dependence on pH with a slope of 1. Similarly, because we expect the rate of opening, kop, to be independent of pH over small ranges, there should be no dependence of kobs on pH under EX1 conditions. Typically, a plot of log(kobs) versus pH for a given amide will be linear at low pH (EX2) and will take the shape of a rectangular hyperbola as pH is raised, asymptotically approaching kop as the EX1 mechanism dominates. Although measuring the pH dependence of observed exchange rates provides a direct method to distinguish between EX1 and EX2 kinetics, it may not be possible in a system in which the open states that give rise to hydrogen exchange show a strong dependence on pH. In this case, one can predict which regime is more likely by comparing the folding rate of the protein (kf) to the average krc under the experimental conditions. Folding should be the slowest possible “closing” event for a protein molecule, so the folding rate provides a lower bound for kcl. If kf ≫ krc, then one generally assumes that hydrogen exchange occurs by an EX2 mechanism.

It is important to note that in order to interpret exchange rates using Eq. (3), the exchange-competent state must be rare. This is certainly the case for very well-protected amides, but not for the fast-exchanging amides in exposed turns or in marginally stable proteins. Typically, these faster exchanging protons are not detected due to the inherent time limitations associated with NMR. Fast rates can be measured, however, using techniques in which exchange is quenched by low pH and temperature. If the EX2 condition is satisfied, these exchange rates can still be interpreted in terms of conformational equilibria. From Scheme I, it is clear that in general the measured rate of exchange must be the product of the random coil exchange rate constant, krc, and the fraction of molecules in the open state at equilibrium, Fop.

$$k_{obs} = k_{rc}\,F_{op} \tag{5}$$

The quantity Fop can be derived from the equilibrium constant Kop, assuming a reversible two-state equilibrium.

$$k_{obs} = k_{rc}\,\frac{K_{op}}{K_{op} + 1} \tag{6}$$

Equation (6) contains no assumptions about the relative populations of closed and open states at equilibrium. The hydrogen-exchange free energy is now

$$\Delta G_{HX} = -RT \ln \frac{k_{obs}}{k_{rc} - k_{obs}} \tag{7}$$
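As a worked illustration of Eqs. (4) through (7), the sketch below computes a protection factor and the hydrogen-exchange free energy with and without the rare-open-state approximation. The rate constants, the temperature of 298.15 K, and the function names are assumptions made here for illustration and do not come from the chapter.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def protection_factor(k_rc, k_obs):
    """P = k_rc / k_obs."""
    return k_rc / k_obs


def fraction_open(k_obs, k_rc):
    """F_op from Eq. (5), k_obs = k_rc * F_op."""
    return k_obs / k_rc


def dg_hx_ex2(k_obs, k_rc, temp_k=298.15):
    """Eq. (4): assumes the open state is rare (K_op << 1)."""
    return -R_KCAL * temp_k * math.log(k_obs / k_rc)


def dg_hx_general(k_obs, k_rc, temp_k=298.15):
    """Eq. (7): no assumption about the relative populations."""
    return -R_KCAL * temp_k * math.log(k_obs / (k_rc - k_obs))


# Hypothetical amide with k_rc = 10 s^-1 and k_obs = 0.1 s^-1 (P = 100).
k_rc, k_obs = 10.0, 0.1
print("P           =", protection_factor(k_rc, k_obs))
print("F_op        =", fraction_open(k_obs, k_rc))
print("dG (Eq. 4)  =", round(dg_hx_ex2(k_obs, k_rc), 3), "kcal/mol")
print("dG (Eq. 7)  =", round(dg_hx_general(k_obs, k_rc), 3), "kcal/mol")
```

For this assumed amide the two free energies differ by less than 0.01 kcal/mol, whereas for a weakly protected site (P close to 1) Eq. (7) diverges from Eq. (4) and should be preferred.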


The discrepancy between Eqs. (3) and (6) is only 1% at a protection factor of 100, so the simpler equation is reliable for the analysis of most protection factors.

Detecting the Hydrogen-Exchange Reaction

Although all proteins in solution are constantly exchanging their labile amide hydrogens, the process usually goes undetected because it does not result in a change in chemical species. To detect the hydrogen-exchange reaction, one usually employs different hydrogen isotopes (1H, 2H, 3H). Exchange is typically initiated in one of two ways—either by dissolving lyophilized protein into an aqueous solution of a different isotope or by rapidly exchanging unlabeled for labeled solvent by means of a spin column. Depending on the isotope incorporated, a number of different methods can be used to measure the exchange reaction over time. These methods vary considerably in temporal and spatial resolution; some are also limited by the size of the protein. Early hydrogen-exchange experiments used tritium as the isotope of choice, and exchange was detected by scintillation counting. This technique is limited in spatial resolution, requires the use of large amounts of radioactivity, and has since been supplanted by detection methods sensitive to deuterium. Deuterium oxide (D2O) has the advantages of being nonradioactive and available at reasonably low cost. Deuterium incorporation into polypeptides can be detected by a number of spectroscopic techniques, including Fourier transform infrared (FTIR), ultraviolet (UV), and nuclear magnetic resonance (NMR), as well as neutron diffraction by protein crystals and mass spectrometry. Of these, NMR and mass spectrometry are by far the most commonly used. The development of two-dimensional NMR techniques in the 1980s revolutionized hydrogen exchange by making possible the measurement of exchange rates at individual amide sites in a protein. NMR detection of hydrogen exchange takes advantage of the very different spin properties of protons and deuterons. NMR detects only the protons and so as protons are replaced by deuterons (or vice versa), the corresponding resonance peaks disappear (or appear). 
This technique has made possible most of the analysis presented in this chapter, and is the most widely used detection technology at present. Protocols for NMR-detected hydrogen-exchange experiments have been previously detailed.3 NMR-detected hydrogen exchange does have some significant limitations. Practical considerations like shimming, tuning, and pulse calibration mean that the dead time in an NMR hydrogen-exchange experiment is long, on the order of 20–30 min. Also, acquisition times by NMR are


relatively slow, ranging from 5 min to an hour, depending on the experiment. These considerations make fast-exchanging amides difficult to detect by NMR without the use of more complex saturation transfer experiments. Also, proteins suitable for NMR analysis are limited to those that are small enough and soluble enough to give good data with short acquisition times. Advances in spectrometer hardware and pulse sequences are constantly expanding the number of proteins that can be studied using NMR, but limitations remain.

Mass spectrometry is also a commonly employed detection technique. Compared to NMR, mass spectrometry is not nearly as limited by size or solubility and is capable of measuring very fast exchange rates.6,7 The exchange of hydrogen for deuterium results in a change of mass that is measured either directly on the intact protein or on fragments following proteolysis. Unlike NMR, which measures the average amount of deuterium at a given site in all molecules in the sample, mass spectrometry directly detects individual molecules, making it possible to distinguish directly between sites that exchange together (cooperative opening events) and those that exchange in separate, independent events (noncooperative opening events). The resolution of hydrogen exchange detected by mass spectrometry is, however, limited by the size of the proteolytic peptides, and single-amide resolution is generally not possible. Still, the ability to make hydrogen-exchange measurements on large and biologically interesting proteins makes mass spectrometry an important and growing detection technology.

Hydrogen Exchange as a Structural Technique

The protection factors determined by monitoring hydrogen exchange (either by NMR or mass spectrometry) are often used for structural studies. Protection arises when a protein is in a structured conformation; residues displaying a high level of protection are therefore assumed to be in regions of structure such as helices and sheets. This inferred structure can be used to confirm NMR secondary structural assignments or even as a constraint during a structure calculation. However, a more novel application of protection from amide hydrogen exchange is to infer structure on conformations not amenable to more direct techniques such as NMR or crystallography. Such experiments are not the focus of this review and will be discussed only briefly. Several reviews cover this topic in detail.8–10

6 Y. Deng and D. L. Smith, Anal. Biochem. 276, 150 (1999).
7 J. G. Mandell, A. M. Falick, and E. A. Komives, Anal. Chem. 70, 3987 (1998).
8 J. Clarke and L. S. Itzhaki, Curr. Opin. Struct. Biol. 8, 112 (1998).


A major asset of hydrogen-exchange studies is that they can be carried out under a wide variety of conditions, including very dilute solutions, extremes of pH, and high levels of denaturant. Exchange can also be applied to atypical systems such as protein aggregates and structures transiently populated during the folding process. At various times during the exchange process under this diverse array of conditions, the reaction can be quenched by bringing the protein back to native conditions, lowering the pH or temperature, or a combination of the two. Detection is then carried out by NMR or mass spectrometry under these quenched conditions. Amides that do not exchange significantly during the quench and data collection phases are used as indirect probes of protein structure; they report on the degree of protection during the exchange process.

The quenched hydrogen-exchange technique has most notably been used in studies of protein folding. A major challenge in experimental studies of protein folding is characterizing partially folded intermediates. At equilibrium, usually under mildly denaturing conditions, some proteins form flexible, partially folded states termed molten globules. Due to their heterogeneous nature, inherent flexibility, and low solubility, molten globules have not been amenable to standard techniques such as crystallography and NMR. The quenched hydrogen-exchange experiment has been used to identify the structured (protected) regions in a number of these molten globules and other partially folded states. Proteins also form partially folded intermediate states transiently during the folding process (so-called kinetic intermediates). The ephemeral and heterogeneous nature of these states makes them particularly challenging to study. Indirect hydrogen-exchange studies have provided structural information for several folding intermediates.
The simplest of these kinetic experiments is a technique called “pulse-labeled hydrogen exchange” in which the hydrogen-exchange reaction is carried out transiently during the folding process and detection takes place after the protein is folded. Usually, this hydrogen-exchange pulse is carried out at high pH where the exchange process is orders of magnitude faster than the folding process. All of these indirect structural studies rely on quenching the exchange reaction and trapping the isotopic label in the native conformation. Structural information is thereby limited to residues whose exchange is much slower than the quench and detection phase. These so-called probe residues are by definition in the slowly exchanging core of the protein.

9 S. W. Englander, Annu. Rev. Biophys. Biomol. Struct. 29, 213 (2000).
10 T. M. Raschke and S. Marqusee, Curr. Opin. Biotechnol. 9, 80 (1998).


Intermediate structure can therefore be inferred only in a well-structured region of the native conformation, biasing models of these partially folded states to a subset of the native conformation. Although this limits the interpretation of these studies, indirect hydrogen exchange has provided the folding community with some of the highest-resolution structural information about intermediates.

Hydrogen Exchange and Protein Stability

EX2 hydrogen exchange is a probe of conformational energetics at the level of individual residues. As described above, in EX2 the observed exchange rate reports on the equilibrium (Kop) between the closed and open states (Scheme I). What is, however, the nature of these closed and open states? Clearly, one possibility for the open state is the globally unfolded state (U). In this case the closed to open equilibrium reflects the native and unfolded states of the protein. Traditionally, proteins are found to be highly cooperative and at equilibrium are approximated by the two-state equilibrium (N ⇌ U). For a stable protein, under so-called native conditions, such as those in a living cell, the native state is more stable than the unfolded state and the equilibrium is such that the native state (N) exists in vast excess over the unfolded state (U). Under these conditions, the native state population swamps most spectroscopic techniques, such as circular dichroism or fluorescence, and the small population of the unfolded state is undetectable. Protein stability is therefore determined by perturbing the equilibrium between N and U, using temperature or denaturant, until a regime is reached where the populations of N and U are both detectable (the transition zone).11 Although the data are taken in the transition zone, the stability of the protein under native conditions [ΔGunf(H2O)] is then inferred by extrapolation. Hence the stability of a protein is not usually determined directly under the specified conditions. However, when a protected amide primarily exchanges through a global unfolding event, ΔGHX = ΔGunf. For these amide protons, hydrogen exchange can directly detect the minute fraction of protein in the unfolded state under conditions that favor the native state.
This incredible sensitivity (less than 1 in 10⁶ molecules are in the unfolded state for a protein that is 8.5 kcal/mol stable) is achieved because only the open conformations undergo exchange, and the populated native state is invisible to the reaction. Therefore, unlike traditional spectroscopic techniques, hydrogen exchange can directly measure the stability of a protein at physiologically relevant temperature and solvent conditions.

11 C. N. Pace, Methods Enzymol. 131, 266 (1986).
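The sensitivity figure quoted above for a protein that is 8.5 kcal/mol stable can be reproduced with a short two-state calculation. This is a sketch only: the temperature of 298.15 K is an assumption (the text does not state one), and the function name is introduced here for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(dg_unf_kcal, temp_k=298.15):
    """Two-state fraction of U, given the unfolding free energy (N -> U)."""
    k_unf = math.exp(-dg_unf_kcal / (R_KCAL * temp_k))  # K = [U]/[N]
    return k_unf / (1.0 + k_unf)


# Stability quoted in the text; temperature is assumed to be 25 degrees C.
f_u = fraction_unfolded(8.5)
print(f"fraction unfolded = {f_u:.2e}")
print("fewer than 1 in 10^6:", f_u < 1.0e-6)
```

Under these assumptions the unfolded fraction comes out below one part per million, in line with the statement in the text.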


If the two-state assumption (N ⇌ U) holds at the microscopic level, then ΔGHX should equal ΔGunf for every amide hydrogen in a protein. In practice, this is never the case. Proteins usually contain a subset of amides with very high protection for which ΔGHX agrees well with ΔGunf;12 however, many amides exchange faster than predicted by a mechanism governed by global unfolding. The existence of these less protected amides is strong evidence that most proteins do not adopt a single, static native structure but have complex energy landscapes and populate states (open states) intermediate in free energy between N and U under physiological conditions. These lower energy states give rise to the range of observed protection factors. Although a given amide can exchange through a host of open conformations, exchange will be dominated by the most populated, or lowest energy, state, and the observed ΔGHX will correspond to this low-energy open state. Higher energy open states involving that amide are masked because they occur much less frequently. Thus all amides can and do exchange through global unfolding, but those that can also exchange through lower free energy open states will have ΔGHX that reflect these lower states.

Native State Hydrogen Exchange

Although protection factors can provide information about the energetics of high-energy states, alone they do not provide any information about the nature of the opening events that lead to exchange. Mechanistic information can be obtained, however, by examining hydrogen exchange as a function of denaturant, a technique termed native state hydrogen exchange (NSHX). Denaturant-induced unfolding transitions fit well to a model in which the free energy difference between the native and unfolded states varies linearly with denaturant concentration. The slope of a ΔGunf versus [denaturant] plot is called the m value, and correlates with the predicted change in accessible surface area (ASA) upon unfolding.13 Amide protons for which exchange occurs through global unfolding (ΔGHX = ΔGunf) should show the same denaturant dependence, or m value, as that determined by the more traditional probes. But what about those amides where ΔGHX < ΔGunf? If denaturant dependence of unfolding free energy is proportional to the surface area exposed by unfolding, then the m values

12 B. M. P. Huyghues-Despointes, J. M. Scholtz, and C. N. Pace, Nat. Struct. Biol. 6, 910 (1999).
13 J. K. Myers, C. N. Pace, and J. M. Scholtz, Protein Sci. 4, 2138 (1995).


of these amides should reveal how extensive an unfolding event is responsible for their exchange. If a very small fluctuation, involving only a few amino acids, is sufficient to expose an amide to exchange, then ΔG_HX should show little dependence on denaturant concentration. If exchange occurs through a larger unfolding event, involving the cooperative opening of a significant portion of the protein structure, then ΔG_HX should have a significant dependence on denaturant. By providing an estimate of the surface area exposed in conformational fluctuations, native state hydrogen exchange makes it possible to model the structures of high-energy open states in proteins.

In NSHX experiments, only low concentrations of denaturant are used, so that fewer than 1% of protein molecules are unfolded. Although the native state is always the most populated conformation, the low levels of denaturant will modulate the populations of the rare unfolded forms (both partial and global unfolding). Hydrogen-exchange rates were first measured as a function of dilute denaturant on RNase A in 1993,14 and shortly thereafter the complete native state hydrogen-exchange analysis was carried out on horse heart cytochrome c15 and Escherichia coli ribonuclease H.16 To date more than a dozen proteins have been studied by native state hydrogen exchange.

For virtually all of the amides studied, exchange can be classified as belonging to one of three mechanistic classes based on the response of their hydrogen-exchange rates to denaturant. The first class represents the global stability of the protein. These sites are characterized by a ΔG_HX that is equal to ΔG_unf for all denaturant concentrations, and m_HX equal to m_unf. Such amides are typically buried in the center of the protein and define a hydrogen-exchange core; they do not appear to exchange in any open structures with a free energy lower than the unfolded state.
These amides provide a good direct measure of protein stability under conditions that strongly favor the native state.

Amides in the second mechanistic class exchange through subglobal or partial unfolding events. These amides also show a dependence on denaturant but exchange faster than expected for exchange through global unfolding. For these amides ΔG_HX = ΔG_subglobal and ΔG_subglobal < ΔG_unf. At low denaturant concentrations, they show an m value lower than that for global unfolding. As denaturant is increased, the difference in m values means that a concentration is reached at which ΔG_unf equals ΔG_HX, and these amides begin to show an m value

14 S. L. Mayo and R. L. Baldwin, Science 262, 873 (1993).
15 Y. Bai, T. R. Sosnick, L. Mayne, and S. W. Englander, Science 269, 192 (1995).
16 A. K. Chamberlain, T. M. Handel, and S. Marqusee, Nat. Struct. Biol. 3, 782 (1996).


consistent with global unfolding, which becomes the dominant exchange mechanism. Amides of this second class are often found in clusters of similar ΔG_subglobal and m_HX within the native structure, consistent with secondary or supersecondary structural elements that unfold cooperatively, but independently of global unfolding. These partial unfolding events expose a large amount of surface area to solvent, but less than would be exposed by global unfolding, giving rise to a lower m value. That ΔG_HX is lower than ΔG_unf indicates that there must be a population of protein molecules in the native state ensemble in which these regions are unfolded but the remainder of the protein remains folded. The resulting partially unfolded forms, or PUFs, have probably the best experimentally defined structures of any high-energy state in the native state ensemble, because both the native and unfolded regions are defined by the hydrogen-exchange data.

The third mechanistic class is indicated by a ΔG_HX that is insensitive to denaturant. The nature of the opening events that give rise to this type of exchange and the origin of the extremely low or zero m value are controversial. The simplest and most convenient explanation is that these amides exchange through very small, noncooperative unfolding events that involve only a single amide. Such a small opening would result in a negligible change in solvent accessible surface area, giving an undetectably low m value. Because of this model, these denaturant-independent amides are often said to exchange through “local fluctuations.”

The three mechanisms of opening (global unfolding, partial unfolding, and fluctuation) are not mutually exclusive. A single amide may exchange through a combination of unfolding and fluctuation. This means that the hydrogen-exchange opening equilibrium may be expressed as the sum of multiple, independent equilibria:

$$K_{op} = K_g + K_s + K_l \tag{8}$$

K_g, K_s, and K_l are the equilibrium constants for global, subglobal, and local unfolding, respectively. According to the aforementioned linear free energy model, each equilibrium constant K_i can be expressed as a function of denaturant concentration.

$$K_i = e^{(-\Delta G_i^{H_2O} + m_i[\mathrm{den}])/RT} \tag{9}$$

By combining Eqs. (4), (8), and (9), we can express the denaturant dependence of the hydrogen-exchange free energy for a given amide.

$$\Delta G_{HX} = -RT \ln\left( e^{(-\Delta G_g^{H_2O} + m_g[\mathrm{den}])/RT} + e^{(-\Delta G_s^{H_2O} + m_s[\mathrm{den}])/RT} + e^{-\Delta G_l/RT} \right) \tag{10}$$
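The following is a minimal sketch of how Eq. (10) could be evaluated numerically. It is an illustration rather than the authors' analysis code: the function name dg_hx, the default temperature of 298.15 K, and every parameter value in the usage example are assumptions chosen only to show the behavior of the equation, with free energies in kcal/mol and m values in kcal/mol per M.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def dg_hx(den, dg_g, m_g, dg_s, m_s, dg_l, temp_k=298.15):
    """Apparent exchange free energy from Eq. (10), in kcal/mol.

    den        denaturant concentration (M)
    dg_g, m_g  free energy in water and m value for global unfolding
    dg_s, m_s  free energy in water and m value for subglobal unfolding
    dg_l       free energy of the denaturant-independent local fluctuation
    """
    rt = R_KCAL * temp_k
    # Eq. (9): each opening constant follows the linear free energy model.
    k_g = math.exp((-dg_g + m_g * den) / rt)
    k_s = math.exp((-dg_s + m_s * den) / rt)
    k_l = math.exp(-dg_l / rt)  # no denaturant dependence
    # Eq. (8) gives K_op as the sum; converting back to a free energy gives Eq. (10).
    return -rt * math.log(k_g + k_s + k_l)


if __name__ == "__main__":
    # Hypothetical amide: values are illustrative, not measured data.
    for den in (0.0, 0.5, 1.0, 1.5):
        value = dg_hx(den, dg_g=10.0, m_g=4.0, dg_s=7.0, m_s=2.0, dg_l=6.0)
        print(f"[den] = {den:.1f} M  dG_HX = {value:.2f} kcal/mol")
```

Because the opening constants add, the computed ΔG_HX always lies at or slightly below the lowest of the three individual free energies at a given denaturant concentration, which is the masking behavior described in the text.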


In practice, the contribution of global unfolding to ΔG_HX for those amides that exchange through partial unfolding is usually negligible, so data may be fit to one denaturant-dependent term and one independent term:

$$\Delta G_{HX} = -RT \ln\left( e^{(-\Delta G^{H_2O} + m[\mathrm{den}])/RT} + e^{-\Delta G_l/RT} \right) \tag{11}$$
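One consequence of Eq. (11) can be worked out directly; the derivation below is a sketch added for illustration, and the symbol [den]_x (the denaturant concentration at which the two terms contribute equally) is a label introduced here, not one used in the original text. Setting the two exponents equal gives

$$-\Delta G^{H_2O} + m[\mathrm{den}]_x = -\Delta G_l \quad\Longrightarrow\quad [\mathrm{den}]_x = \frac{\Delta G^{H_2O} - \Delta G_l}{m}$$

and at that concentration

$$\Delta G_{HX}([\mathrm{den}]_x) = -RT \ln\left( 2\,e^{-\Delta G_l/RT} \right) = \Delta G_l - RT \ln 2$$

Well below [den]_x the local term dominates and ΔG_HX approaches ΔG_l; well above it ΔG_HX approaches the line ΔG^{H2O} − m[den]. Near 25 °C, RT ln 2 is roughly 0.4 kcal/mol, so the two limiting lines overestimate ΔG_HX by about that much at the crossover.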

NSHX data simulated with Eq. (11) are shown in Fig. 2B. Typically, denaturant-independent fluctuations dominate the exchange mechanism at low denaturant concentration, and unfolding events dominate at higher concentration. Linear extrapolation of the denaturant-dependent region of ΔG_HX to 0 M denaturant gives the unfolding free energies of global and partial unfolding events in water, even if these events are not apparent in water due to local fluctuations. A Boltzmann diagram for a protein can be constructed by linear extrapolation of NSHX data, as shown in Fig. 2C. Clusters of amides with similar extrapolated ΔG_HX and m values indicate partially unfolded forms in equilibrium under native conditions.

In this analysis we have assumed that denaturant-independent hydrogen exchange occurs through small structural fluctuations, and that the Bai factors are a good approximation of the unprotected exchange rate in such a fluctuation. These may be poor assumptions. First, the open state in a small fluctuation cannot be expected to resemble a random coil; steric effects from surrounding protein structure may block access of catalysts to the amides. The Bai factors may be a significant overestimate for the actual rate constant for exchange in these open states. Nevertheless, k_rc is still used to calculate ΔG_HX for these amides, and the resulting value should be cautiously considered an upper limit on the free energy difference between the open and closed states. The local fluctuation model has been shown to adequately account for the exchange of a solvent-exposed helical residue in cytochrome c.17 However, the results of this study indicate that the Bai factors do not accurately estimate the chemical exchange rate for this fluctuation.

Local fluctuation is not the only model to describe denaturant-independent hydrogen exchange.
An alternative mechanism has been proposed, based on a statistical–mechanical view of protein structure.18 If both the closed and open states are themselves ensembles of structures, then it is possible to imagine many different structural fluctuations that give rise to exchange at a single site. If the closed and open ensembles have similar average accessible surface area (ASA), then the average change in ASA, and thus the average m value, will be zero. Because in an equilibrium experiment it is

17 H. Maity, W. K. Lim, J. N. Rumbley, and S. W. Englander, Protein Sci. 12, 153 (2003).
18 J. O. Wooll, J. O. Wrabl, and V. J. Hilser, J. Mol. Biol. 301, 247 (2000).


Fig. 2. Native state hydrogen exchange on a hypothetical protein. (A) Structural changes giving rise to hydrogen exchange. Three different transitions comprise the closed–open equilibrium in this protein: complete unfolding, partial unfolding, and “local fluctuation.” Four amide probes are indicated by symbols on the native structure. (B) Simulated NSHX data for this protein using the amide probes indicated in (A). Filled symbols depict two amides that report on the global unfolding reaction, and open symbols depict amides that report on the subglobal unfolding. The m values for partial and global unfolding are 4 and 2.5 kcal mol^-1 M^-1, respectively. Triangles depict amides where global or partial unfolding is masked at low denaturant concentration by denaturant-independent “local fluctuations.” Note that filled triangles and open circles are indistinguishable at 0 M denaturant, despite their very different mechanisms of opening. All denaturant concentrations are in the folded baseline using traditional spectroscopic probes (inset). (C) Boltzmann diagram for this protein, calculated by extrapolating denaturant dependences back to 0 M. A high-energy partially unfolded form (PUF) is evident in equilibrium with the native and unfolded conformations.


possible to measure only ensemble average values, this complex exchange will give a denaturant-independent hydrogen-exchange free energy. Computational studies show that this model can account for the NSHX data for staphylococcal nuclease. It is not yet clear whether local fluctuation, ensemble modulation, or some other model best describes the physical origin of denaturant-independent hydrogen exchange. Whatever the mechanism of exchange, these residues indicate that the native state of all proteins studied thus far consists of myriad higher energy states, raising the question of the functional relevance of these states.

Superprotection

Some amides do not fall into one of the above three classes. Instead, they have a hydrogen-exchange free energy higher than the free energy of unfolding for the whole protein. This is a strange situation, because every amide hydrogen should be accessible for exchange in the globally unfolded state and therefore any fluctuations with a higher free energy than global unfolding should be masked. This phenomenon is referred to as superprotection.

There are several different explanations for superprotection. The presence of superprotection may indicate an error in the traditional measurements of global stability, due to either solvent isotope effects or failure of the linear extrapolation model. The most trivial explanation for superprotection is stabilization of the native state by deuterium. D2O has unpredictable effects on protein stability, and may alter the stability of a protein by 1 kcal/mol or more.19 Thus a stability determined by denaturation in water may be an incorrect model for the conditions of a hydrogen-exchange experiment. To compare ΔG_HX with ΔG_unf, it is necessary to determine ΔG_unf under exactly the same conditions as the native state exchange experiment, including D2O. This is a simple experiment that can quickly resolve some mysteries in hydrogen-exchange data.

A small degree of superprotection is frequently seen even after solvent isotope effects have been accounted for. This may be due to the effects of proline residues on the free energy of the unfolded state.12 For most amino acids in a random-coil polypeptide, the trans conformation of the peptide bond is favored over the cis by a factor of approximately 1000. For proline, this factor is reduced to about four. In the folded conformation a proline is typically in either the cis or the trans conformation, while in the unfolded state, proline residues will equilibrate between the cis and trans conformations. The unfolding equilibrium is described by Scheme II.

19 G. I. Makhatadze, G. M. Clore, and A. M. Gronenborn, Nat. Struct. Biol. 2, 852 (1995).


Scheme II

Two features of this equilibrium are important for hydrogen exchange. First, the unfolded state equilibrium favors the conformation with a random-coil distribution of proline isomers. This means that the change in free energy between N and U_native must be larger than that between N and U_isomerized. Second, the rate of direct transitions between N and U_isomerized is trivial; isomerization must occur from the unfolded state. Under native conditions, the concentration of U_native is so small that proline isomerization is extremely slow. Hydrogen exchange is a kinetic process, and is thus dependent on the paths between species. The rate-determining equilibrium in hydrogen exchange is between N and U_native. The rate of transitions to U_isomerized is so low that it does not contribute significantly to the hydrogen-exchange rate. Even though all chemical species (disregarding isotopes of hydrogen) are in equilibrium in a hydrogen-exchange experiment, the measured equilibrium constant between closed and open states reflects the fast equilibrium that occurs between N and U_native. The slight difference in free energy between U_native and U_isomerized is reflected in a slightly higher value of ΔG_HX compared to ΔG_unf determined by equilibrium denaturation. The effect of proline isomerization on the free energy of the unfolded state has been determined experimentally, and small correction factors can be calculated based on the number of cis and trans prolines in the native state of a protein of interest.12

In a few proteins, notably the cSrc SH3 domain20 and Syrian hamster PrP,21 neither proline isomerization nor solvent isotope effects can account for the observed superprotection. For these proteins, superprotection may result from residual structure in the unfolded state under native conditions. This could lead to superprotection for one of two reasons.
First, if the unfolded state is exchange competent, but is not accurately modeled by a random-coil polypeptide, then the actual rate of chemical exchange may be slower than k_rc. This would lead to an artificially high calculated protection factor, and thus an overestimate of ΔG_HX. Alternatively, the unfolded state may be completely closed to exchange, and exchange for the superprotected amides may occur only through an open state that is higher in free energy than the unfolded state. The former case indicates some residual protection in the unfolded state, but does not imply an energetic

20 V. P. Grantcharova and D. Baker, Biochemistry 36, 15685 (1997).
21 E. M. Nicholson, H. Mo, S. B. Prusiner, F. E. Cohen, and S. Marqusee, J. Mol. Biol. 316, 807 (2002).


difference between the unfolded state and the exchange-competent state. The apparent excess free energy is merely an artifact of an inaccurate k_rc. The latter case, however, suggests that the measured excess in free energy is real. According to this model there is a hitherto undetected unfolding intermediate, which leads to an underestimate of the unfolding free energy by linear extrapolation. The “superprotected” ΔG_HX thus reflects the real unfolding free energy of the protein.

Hydrogen-Exchange Studies on Binding and Allostery

High-energy conformations are likely to play an important role in protein–ligand interactions and protein signaling. The great majority of proteins interact with ligands (metals, small molecules, or other macromolecules) at some point during the course of their biological function. Binding is often accompanied by large structural and/or functional changes throughout the protein molecule, despite the fact that ligand-binding sites are localized and in many cases involve only a few residues. These distant changes imply a cooperative pathway connecting the two regions of the protein such as those detected by evolutionary covariation.22 Such a pathway implies that different regions of the protein respond to the change differently, which, in turn, suggests a role for high-energy states such as those detected by hydrogen exchange.23 Consider a protein with a ground state with poor affinity for ligand and a multitude of high-energy states. Some of these high-energy states have a very high affinity for the ligand. Addition of ligand to this protein will shift the native state ensemble to favor those states that bind ligand (Fig. 3). Although functionally important, this redistribution of conformational states may occur without changing the ensemble average structure of the protein, making it undetectable by X-ray crystallography or NMR. The energetic change, however, may be detected using hydrogen exchange.

The utility of hydrogen exchange for the study of binding and allostery was recognized very early. Extensive studies on hemoglobin using tritium labeling began in the 1970s,24 and over the past 30 years have provided important information about the energetic transitions induced by oxygen binding.25 Oxyhemoglobin has significantly more fast-exchanging amides than deoxyhemoglobin, suggesting that oxygen binding increases the number of amides that can exchange by local fluctuations or subglobal unfolding. A complex of cytochrome c and a monoclonal antibody was first

22 S. W. Lockless and R. Ranganathan, Science 286, 295 (1999).
23 I. Luque, S. A. Leavitt, and E. Freire, Annu. Rev. Biophys. Biomol. Struct. 31, 235 (2002).
24 S. W. Englander and C. Mauel, J. Biol. Chem. 247, 2387 (1972).
25 S. W. Englander and J. J. Englander, Methods Enzymol. 232, 26 (1994).


Fig. 3. The role of high-energy states in ligand binding. If binding induces a change in a protein, the thermodynamics of binding can be dissected into two parts, preorganization, which is endergonic, and complexation, which is exergonic. ΔG_preorg is the energy required to rearrange the protein into a high-affinity state in the absence of bound ligand. ΔG_complex is the affinity of the ligand for the preorganized protein. The observed binding free energy ΔG_obs is the sum of ΔG_preorg and ΔG_complex. In this way some of the energy of binding is dissipated through the protein, causing a conformational and/or functional change. (Figure courtesy of C. Park.)

used to demonstrate the ability of hydrogen exchange to provide information on protein–ligand binding surfaces.26,27 More recently, hydrogen exchange has been applied to the energetics of increasingly complex systems, including signaling protein kinases and other large, allosteric enzymes. Such studies are still in their infancy. The data analysis is not straightforward; there are several additional complexities that must be addressed in order to interpret hydrogen exchange in the presence of ligands.

Hydrogen Exchange in Protein–Ligand Complexes

Ligand binding can alter hydrogen-exchange rates by a variety of mechanisms. Residues at the binding site may show slowed exchange due to direct protection (solvent exclusion) at the binding surface. Exchange rates of some amides may be altered by ligand-induced changes in local fluctuations or partial unfolding. Binding will also slow rates within the hydrogen-exchange core as a result of the increase in stability afforded by the binding free energy. This is an important and sometimes ignored feature of hydrogen exchange with ligands: binding will alter the rate of

26 L. Mayne, Y. Paterson, D. Cerasoli, and S. W. Englander, Biochemistry 31, 10678 (1992).
27 Y. Paterson, S. W. Englander, and H. Roder, Science 249, 755 (1990).


any amide that exchanges through global unfolding, regardless of its proximity to the binding site. Interpretation of ligand-induced changes therefore requires an appreciation of the diverse possible mechanisms of hydrogen exchange in a ligand-bound protein. Some mechanistic information can be inferred by comparing observed exchange rates at several ligand concentrations, just as unfolding mechanisms can be inferred by examining rates at various denaturant concentrations. Just as in native state exchange, the free energy of exchange for a given amide may reflect the ligand-dependent change in global stability. The free energy of exchange may also show less or no ligand dependence, depending on the mechanism of opening.

When crystallographers want to compare the bound and free states of a protein, they simply crystallize the protein with and without an excess of ligand. It is tempting to use the same simple comparison with hydrogen exchange, but such an experiment may be difficult to interpret. X-ray diffraction is sensitive only to the most populated states in an ensemble. Hydrogen exchange, in contrast, detects very rare conformations in an ensemble. If ligand binding is reversible, then there will always be a minute population of free protein in equilibrium with bound protein, regardless of ligand concentration or affinity. If an amide is open to exchange in the free protein, then it will contribute to the exchange reaction even in a large molar excess of ligand. Scheme III shows a model for hydrogen exchange in the presence of ligand. K_op^B and K_op^F are the equilibrium constants for opening of the bound and free states, respectively; K_d and K_d′ are the dissociation constants for the open and closed states; and k_rc is the intrinsic exchange rate from a random coil.

Scheme III

For a given amide, opening may occur from either the bound or the free state. To interpret hydrogen-exchange rates, the contributions of bound and free molecules to exchange must be separated. If the equilibration between the bound and free states, as well as the opening reaction, is fast compared to k_rc (analogous to the EX2 condition), the general expression


for the observed exchange rate constant as a function of the concentration of free ligand is given in Eq. (12).

$$k_{obs} = \left( \frac{K_{op}^{B} + K_{op}^{F} K_d/[L]}{1 + K_{op}^{B} + K_d/[L] + K_{op}^{F} K_d/[L]} \right) k_{rc} \tag{12}$$

This seemingly complicated equation can be simplified considerably if a few reasonable assumptions are made. First, the concentration of free ligand must be much larger than the dissociation constant. Most ligands of interest to protein chemists bind with high affinity, so the concentration of free ligand can be closely approximated as the molar excess of ligand over protein after saturation. It is desirable to work with essentially saturated protein to avoid complex NMR spectra, so in an NMR-detected experiment this condition will be met. Second, K_op^B must be much less than 1. This is reasonable, because the bound state is not expected to be less stable than the apo state, and the equilibrium constant for opening of the free state is typically several orders of magnitude less than unity. Given these assumptions, Eq. (12) reduces to

$$k_{obs} = \left( K_{op}^{B} + \frac{K_{op}^{F} K_d}{[L]} \right) k_{rc} \tag{13}$$

In this simplified form, k_obs/k_rc varies linearly with 1/[L]; the equilibrium constants for opening from the bound and free state can be determined by a linear fit to the data. Equation (13) makes it possible to separate the contributions of the free and bound states in a hydrogen-exchange experiment.

An example of this type of analysis is shown in Fig. 4 for the case of the cSrc SH3 domain in complex with the peptide ligand RALPPLPRY. Data from two residues are shown to illustrate two different types of ligand-dependent exchange. Lysine 28 represents an amide that exchanges almost purely through dissociation followed by global unfolding. Using a reported K_d of 19 μM,28 this plot gives a value for K_op^F that closely matches the hydrogen-exchange protection factor measured for this residue in the absence of ligand,20 which is within error of the stability of this protein.
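The linear analysis based on Eq. (13) can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure used for Fig. 4: the function names are hypothetical, an unweighted least-squares line is assumed, and the example data are synthetic values generated from Eq. (13) itself rather than measurements. The intercept of k_obs/k_rc versus 1/[L] is taken as K_op^B and the slope as K_op^F K_d.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fit_line(x, y):
    """Unweighted least-squares line; returns (intercept, slope)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def opening_constants(ligand_molar, kobs_over_krc, k_d):
    """Separate bound and free contributions using Eq. (13).

    ligand_molar   free ligand concentrations (M), each much larger than k_d
    kobs_over_krc  observed rate divided by the random-coil rate
    k_d            dissociation constant (M), taken from an independent source
    Returns (K_op_B, K_op_F).
    """
    inv_l = [1.0 / c for c in ligand_molar]
    intercept, slope = fit_line(inv_l, kobs_over_krc)
    return intercept, slope / k_d


def free_energy(k_op, temp_k=298.15):
    """Convert an opening equilibrium constant to a free energy (kcal/mol)."""
    return -R_KCAL * temp_k * math.log(k_op)


if __name__ == "__main__":
    # Synthetic, noise-free example: all numbers are made up for illustration.
    k_d = 19e-6
    ligand = [1e-3, 2e-3, 5e-3, 1e-2]
    data = [2e-5 + 4e-4 * k_d / c for c in ligand]
    k_b, k_f = opening_constants(ligand, data, k_d)
    print(f"K_op^B = {k_b:.3g}, K_op^F = {k_f:.3g}")
    print(f"dG_HX (bound) = {free_energy(k_b):.2f} kcal/mol")
```

With real data the intercept can be close to zero, as described for lysine 28, in which case the logarithm in free_energy is not meaningful and only an upper bound on K_op^B can be stated.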
The near-zero intercept indicates that there is no detectable exchange from the bound form of the protein. The exchange rate of lysine 28 is altered by ligand, but the data show that ligand binding has no detectable effect on the stability of this amide. In contrast, valine 11 has an intercept that indicates a large exchange contribution from the peptide complex. The slightly steeper slope of the regression line reflects the lower ΔG_HX of V11 compared to K28

28 C. Wang, N. H. Pawley, and L. K. Nicholson, J. Mol. Biol. 313, 873 (2001).


Fig. 4. Hydrogen exchange as a function of ligand concentration reveals diverse opening mechanisms. Data shown are for two amides in the cSrc SH3 domain in complex with a peptide ligand. Lines show fits to Eq. (13). Lysine 28 (circles) appears to exchange purely by ligand dissociation followed by protein unfolding. Valine 11 (triangles) exchanges through two mechanisms. Dissociation and unfolding compete with exchange from the bound state, as is indicated by the nonzero y-intercept.

in the free protein. The intercept indicates a ΔG_HX in the complex of approximately 6.5 kcal/mol, in comparison to the free protein’s global stability of 4.7 kcal/mol. For V11 the effect of peptide on exchange due to protein stabilization can be separated from the effect due to ligand binding, and the hydrogen-exchange data indicate a high-energy open form in the complex that is not detectable in the free protein.

Interpretation of Rate Changes: Binding Sites, Affinity, and Ensemble Modulation

Hydrogen Exchange as a Probe of Ligand-Binding Interfaces

One simple use of hydrogen exchange to study protein–ligand complexes is a footprinting experiment designed to identify binding interfaces. Solvent-exposed amides at the binding interface should be poorly protected in the apoprotein but will gain protection upon binding by exclusion of solvent by ligand. Protein surfaces where amides gain protection upon addition of ligand may therefore represent the binding interface. Other mechanisms of protection may, however, complicate this experiment. Ligand-induced folding (in either a small part or in the whole protein) can produce spurious results. If, for example, a protein is only marginally stable, hydrogen exchange will be very fast everywhere in the free protein.


A tight-binding ligand will dramatically slow exchange throughout the protein molecule, which will make identification of a binding site problematic. Such natively unfolded proteins may be common in eukaryotic proteomes29,30 and present just one example of an obstacle to using hydrogen exchange as a general method to map binding sites.

Hydrogen Exchange as a Probe of Ligand Affinity

The ligand-dependent exchange rates of amides that report on global unfolding are useful for measuring protein–ligand affinity. The ligand dependence of these amides reflects the fact that by preferentially binding folded conformations, ligands stabilize proteins. This coupling between binding, stability, and hydrogen-exchange rates has been exploited in an approach that uses hydrogen exchange to screen small molecules for protein binding,31 taking advantage of the sensitivity and speed of hydrogen exchange detected by mass spectrometry to quantitatively screen large libraries of compounds for protein binding.

Hydrogen Exchange as a Probe of Ligand-Induced Ensemble Modulation

Ligand-induced changes in hydrogen-exchange rates that cannot be accounted for by occlusion of the binding interface or global stabilization may be the most interesting class. These rate changes report on more subtle changes in the protein conformational ensemble that occur when ligand binds. The K_op^B term in Eq. (13) is of particular interest, because it reflects features that are specific to the bound state. The open conformations responsible for this exchange are high-energy states with affinity for ligand. By binding to these conformations, ligand increases their relative population in the ensemble and makes them detectable by hydrogen exchange. This is direct evidence of ligand binding to high-energy states and reveals the potential that hydrogen exchange has for uncovering functionally relevant energetic changes in proteins.
A thorough investigation of hydrogen exchange at specific amides in an allosteric model such as hemoglobin could reveal new details about the mechanism of allostery.

Acknowledgments

The authors would like to thank Dr. Chiwook Park for assistance with figures and calculations, and Dr. Eric Nicholson, Kathleen Ratcliff, and Erik Miller for critical reading of the manuscript.

29 P. E. Wright and H. J. Dyson, J. Mol. Biol. 293, 321 (1999).
30 V. N. Uversky, Protein Sci. 11, 739 (2002).
31 K. D. Powell, S. Ghaemmaghami, M. Z. Wang, L. Ma, T. G. Oas, and M. C. Fitzgerald, J. Am. Chem. Soc. 124, 10256 (2002).


[16] Cooperativity Principles in Protein Folding

By Hue Sun Chan, Seishi Shimizu, and Hüseyin Kaya

Introduction

Knowledge of the physical driving forces in proteins is essential for understanding their structures and functions. As polymers, proteins have remarkable thermodynamic and kinetic properties. A well-known observation is that the folding and unfolding of many small single-domain proteins, of which chymotrypsin inhibitor 2 is a prime example, appear to involve only two main states: N (native) and D (denatured).1,2 These proteins’ folding/unfolding transitions are often referred to as “cooperative” because of their phenomenological similarity to “all-or-none” processes. Traditionally, only N, D, and a small number of postulated intermediate states were invoked to account for experimental protein folding data. Under such an interpretative framework, two-state folding is described by the reaction N ⇌ D, and different properties are ascribed to N and D to account for different proteins. Although useful, this approach does not address the microscopic origins of experimentally observed two-state-like behavior. Traditional analyses simply assume that there are a small number of conformational states. But proteins are chain molecules. Physically, it is obvious that a polymer chain can adopt many conformations, ranging from the most open to maximally compact, and all intermediate compactness in between. Thus, whether and how the multitude of conformations available to a protein may be grouped into two or more “states” (as traditionally assumed) should be ascertained through a fundamental understanding of the effective intrachain interactions involved.

In the protein literature, however, folding energetics are often discussed in terms of the sum of contactlike energies of a fully folded native structure versus that of a random-coil-like state or a certain other prespecified unfolded conformational ensemble.3 Such analyses have yielded important insight. But they obscure the remarkable nature of protein cooperativities.
This is because cooperativity has already been presumed in these discourses by their preclusion of many a priori possible conformations (notably compact nonnative conformations) from the energetic equation. To gain a consistent understanding

1 S. E. Jackson and A. R. Fersht, Biochemistry 30, 10428 (1991).
2 D. Baker, Nature 405, 39 (2000).
3 E. Freire and K. P. Murphy, Adv. Protein Chem. 43, 313 (1992).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


of the physical origins of proteinlike cooperativities, it is only logical that one should seek to reproduce them in self-contained polymer models in which the distribution of explicit-chain conformations is determined solely by the interactions considered explicitly in the model.4 This is the approach we take. Our rationale is based on the discovery that proteinlike cooperativities are nontrivial to achieve in chain models (see below). Consequently, we expect the stringent requirements of proteinlike cooperativities to provide us with important clues to protein energetics in general. These considerations led us to ask: What effective intrachain interactions can give rise to the remarkable cooperativity of many small proteins?5 Are traditional pictures based on additive interactions sufficient to rationalize experimental observations? At a more basic level, what insight can be gained from detailed molecular accounts of the elementary driving forces of protein folding? Using hydrophobic interactions as a test case, our investigation indicates that driving forces for protein folding are intrinsically nonadditive. Significantly, extensive evaluations of coarse-grained chain models suggest that the high degrees of thermodynamic and kinetic cooperativity of small single-domain proteins likely originate from many-body interactions in the form of a coupling between local conformational preferences and favorable nonlocal interactions. This chapter summarizes our recent advances in these respects.

Multiple Meanings of Cooperativity in Biomolecular Processes

We begin with more precise definitions of cooperative behavior. Generally speaking, “cooperativity” refers to a particular type of deviation from a reference situation deemed not cooperative. Thus, whether a physical interaction or process is cooperative hinges on the reference interaction or process to which it is compared.
Often the hypothetical reference noncooperative situation is one in which some form of additivity is satisfied (Fig. 1). For example, ligand binding to a protein is additive (not cooperative) if the free energy of association per ligand is independent of the number of ligands already bound to the same protein.6 In this case, cooperative binding means that the favorability of binding one ligand increases with the number of ligands already bound until the binding sites of an allosteric protein are saturated, as for binding of oxygen to hemoglobin.7

4 H. S. Chan, H. Kaya, and S. Shimizu, in “Current Topics in Computational Molecular Biology” (T. Jiang, Y. Xu, and M. Q. Zhang, eds.), p. 403. MIT Press, Cambridge, MA, 2002.
5 H. S. Chan, Nature 392, 761 (1998).
6 C. R. Cantor and P. R. Schimmel, “Biophysical Chemistry,” Part III. Freeman, New York, 1980.
7 G. K. Ackers, J. M. Holt, and A. L. Klinger, Biochemistry 39, 117 (2000).

352

cooperativity in protein folding and assembly

[16]

Fig. 1. ‘‘Cooperativity’’ and ‘‘anticooperativity’’ as deviations from additivity. Schematics of several questions of interest. (A) Multiple-site ligand binding: Is the intrinsic free energy of association per ligand dependent upon the number of bound sites? The dotted arrows here highlight the unbound ligand’s translational freedom. (B) Implicit-solvent potentials and group additivity: Can solvent-accessible surface area (SASA) or other geometric parameterizations based on bulk-phase transfer data (left) accurately predict solvent-mediated interactions involving more than one solute (right)? (C) Many-body interactions and pairwise additivity: To what extent can interactions among three or more physical entities be accurately described as a pairwise sum of two-body interactions (dotted lines)?

On the other hand, binding is considered to be anticooperative if the favorability of binding one ligand decreases with the number of ligands already bound (Fig. 1A). A different context of additivity is illustrated in Fig. 1B. Many empirical treatments of solvation employ experimental data from bulk-phase transfer, e.g., between aqueous and nonpolar phases. By itself, transfer data effectively characterize only the interactions between a single solute and its surrounding solvent molecules. Nonetheless, additivity assumptions are often invoked to apply transfer data to the energetics of more complex situations involving multiple-body interactions, including protein folding and protein–protein interactions. In these applications, free energy and other thermodynamic signatures are assumed to be proportional to geometric measures such as solvent-accessible surface area (SASA). Then,


many-body interactions are computed as the product of a given configuration’s SASA with the energetic proportionality coefficient determined from (single-solute) transfer data. Using free energy of hydration $\Delta G$ and SASA as an example, this means that one first sets $\Delta G_1 = \gamma(\mathrm{SASA})_1$, where $\Delta G_1$ is the hydration free energy of a single solute and $(\mathrm{SASA})_1$ is its SASA. Then the free energy of hydration $\Delta G_m$ of an interacting $m$-solute system is taken to be $\Delta G_m = \gamma(\mathrm{SASA})_m$, where $\gamma = \Delta G_1/(\mathrm{SASA})_1$ is determined from the single-solute experimental data and $(\mathrm{SASA})_m$ is the SASA of the $m$-solute system. This approach is often referred to as group additivity.8 The idea is intuitive. But the question is, in view of the granularity or particulate nature of real solvents, how valid are such additivity assumptions (Fig. 1B)? In this context, multiple-solute interactions for a given solute configuration may be termed “cooperative” or “anticooperative” depending on whether the actual interactions are more or less favorable to the association of solutes than that predicted by SASA.

Figure 1C depicts yet another additivity condition. Now, instead of using the solvation interaction of a single solute (Fig. 1B, left) as a standard for additivity, the reference interaction is taken to be the solvent-mediated interactions between a pair of solutes (Fig. 1C). The question of cooperativity then is whether the solvent-mediated interaction free energy $W^{(m)}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_m)$ among three or more solutes (the $\mathbf{r}$'s are the position vectors of the solutes) can be adequately described by the sum of independent two-solute interactions. In other words, whether $W^{(m)}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_m) = \sum_{i < j}^{m} W^{(2)}(\mathbf{r}_i, \mathbf{r}_j)$ [...] $T_f > T_g$.
In this perspective, faster folding is associated with a higher $T_f/T_g$ ratio, and the $T_f/T_g$ criterion is taken as a quantification of the minimal frustration principle to achieve fast stable folding.32 In applying the $T_f/T_g$ criterion, it is important to recognize that these temperatures are model parameters defined with temperature-independent interactions. In these models, native stability always increases with decreasing temperature. However, the effective intrachain interactions in real proteins are temperature dependent.25 Thus, instead of identifying $T_f$ and $T_g$ with experimental temperatures,34 physically it is more appropriate to interpret these temperatures as reciprocals of interaction strength (see below), and the $T_f > T_g$ condition as meaning that the onset of glassy dynamics occurs when native stability is higher than that at the transition midpoint. A simple relation between $T_f/T_g$ and $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ can readily be established12,13 in the random energy model (REM):

$$\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}} = \sqrt{1 - 4\,(T_g/T_f)^2} \qquad (6)$$

where we have used the REM definition of $T_f/T_g$ of Onuchic et al.32 [The quantity $\ln g_D$ in Eq. (5) is equivalent to the $S_0/k_B$ expression in Eq. (12) of Onuchic et al.32] As in Eq. (5), the model interaction energies here are taken to be temperature independent, the population-based $\Delta H_{\mathrm{vH}}$ is used, and the variation in energies among the denatured conformations is assumed to be relatively narrow. It is apparent from Eq. (6) that $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ increases with increasing $T_f/T_g$. Another parameter of interest is the energy gap. Fast folding of heteropolymer models has been linked to a large gap between the native energy and the energy of certain denatured conformations.35 Despite problematic

32 J. N. Onuchic, Z. Luthey-Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).
33 J. D. Bryngelson and P. G. Wolynes, Proc. Natl. Acad. Sci. USA 84, 7524 (1987).
34 B. Gillespie and K. W. Plaxco, Proc. Natl. Acad. Sci. USA 97, 12014 (2000).
35 A. Šali, E. Shakhnovich, and M. Karplus, Nature 369, 248 (1994).


features of an early lattice formulation of this criterion,32,36,37 the consideration in Fig. 4B indicates that a larger energy gap, or more appropriately a larger stability gap38 between the native and denatured states, is generally conducive to a higher degree of thermodynamic cooperativity. Thus, in summary, the experimental calorimetric requirement for two-state cooperativity is closely associated with theoretical criteria such as large Z-scores and large $T_f/T_g$ ratios, and is seen to be qualitatively consistent with a larger energy gap. These observations imply that these theoretical criteria for a chain model’s good folding behavior are fundamentally connected to the experimentally observed thermodynamic cooperativity of real proteins. This is reassuring. Furthermore, the quantitative relationships above provide novel avenues to explore what values these theoretical parameters have to take to reproduce thermodynamic properties similar to those of real proteins. For instance, $T_f/T_g \approx 1.6$ has been proposed for a helical protein with approximately 60 residues.39 But Eq. (6) suggests that $T_f/T_g$ should be much higher for real proteins. Indeed, according to this formula,13 $T_f/T_g = 6.4$ is required for $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}} = 0.95$. This issue will be further addressed below.

Kinetic Cooperativity of Protein Folding and Unfolding

Additional criteria for cooperativity are provided by the folding and unfolding kinetics of small single-domain proteins. A prominent characteristic of these proteins is that the data obtained using traditional optical probes on their reversible folding and unfolding are well described by the simple two-state reaction

where the logarithms of the folding rate $k_f$ and unfolding rate $k_u$ are both essentially linear in denaturant concentration (Fig. 4C). In other words, their chevron plots40 have linear folding and unfolding arms. Moreover, the two-state kinetics of these proteins are consistent with their thermodynamics. Their thermodynamically determined free energy of unfolding $\Delta G_u$ is an essentially linear function of denaturant concentration, and is well approximated by the kinetic quantity $k_BT \ln(k_f/k_u)$ obtained from comparing directly measured and extrapolated parts of the linear folding

36 H. S. Chan, Nature 373, 664 (1995).
37 D. K. Klimov and D. Thirumalai, Proteins Struct. Funct. Genet. 26, 411 (1996).
38 J. N. Onuchic, P. G. Wolynes, Z. Luthey-Schulten, and N. D. Socci, Proc. Natl. Acad. Sci. USA 92, 3626 (1995).
39 P. G. Wolynes, J. N. Onuchic, and D. Thirumalai, Science 267, 1619 (1995).
40 C. R. Matthews, Methods Enzymol. 154, 498 (1987).


and unfolding arms of the chevron plot (dashed lines in Fig. 4C). These properties constitute a more stringent requirement for cooperative behavior than thermodynamic cooperativity alone because they are not shared by all calorimetrically two-state proteins. For protein folding, it appears that thermodynamic cooperativity is necessary but not sufficient for kinetic cooperativity. Proteins that are moderately larger than the simple two-state variety often exhibit chevron rollovers, as for barnase41 and ribonuclease A,42 even though they are calorimetrically two-state.25

Linear Chevron Plots Require a High Degree of Thermodynamic Cooperativity

What types of intrachain interactions might be behind the remarkable cooperativities of the protein folding/unfolding transition? To explore this question, three representative heteropolymer models are compared in Figs. 5 and 6. In Fig. 5, model i is the 55-mer cooperative model of Kaya and Chan,43 containing a physically motivated favorable coupling between helix formation and the packing of the native core as well as an extra stabilizing energy for the ground state. Model ii is the 48-mer Gō model of Pande and Rokhsar44 with pairwise additive interactions.44,45 Model iii is the 27-mer three-letter model of Socci et al.32,46 Models i and ii here are “native-centric” in that their interactions are highly specific. Only interactions present in the native ground-state conformation are favored in these models. Model iii is based on general contact interactions of a three-letter alphabet such that nonnative contact interactions can be favored. Figure 5A indicates that these models have very different thermodynamic cooperativities. The transition is sharp for models i and ii, but relatively broad for model iii. Quantitatively, the $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ values ($\kappa_2$ ratio without empirical baseline subtractions13) for the three models are,13,43 respectively, (i) 0.91, (ii) 0.87, and (iii) 0.46.
We should point out here that the heat capacities in Fig. 5A were obtained using only temperature-independent interactions. But the general conclusion that additive nonspecific hydrophobic interactions (as modeled by chain sequences with small alphabets, cf. model iii) are insufficient for calorimetric cooperativity12 is more broadly supported, including by lattice model studies using

41 A. Matouschek, J. T. Kellis, L. Serrano, M. Bycroft, and A. R. Fersht, Nature 346, 440 (1990).
42 W. A. Houry, D. M. Rothwarf, and H. A. Scheraga, Nature Struct. Biol. 2, 495 (1995).
43 H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 52, 510 (2003).
44 V. S. Pande and D. S. Rokhsar, Proc. Natl. Acad. Sci. USA 96, 1273 (1999).
45 H. Kaya and H. S. Chan, Phys. Rev. Lett. 90, 258104 (2003).
46 N. D. Socci, J. N. Onuchic, and P. G. Wolynes, J. Chem. Phys. 104, 5860 (1996).


Fig. 5. Comparing the thermodynamic and kinetic cooperativity of three models. (A) Heat capacity scans for the (i) 55-mer cooperative, (ii) 48-mer Gō, and (iii) 27-mer three-letter models are obtained by standard Monte Carlo histogram techniques. The model interaction energies are taken to be temperature independent, with $\epsilon = -1$. (B–D) Model chevron plots as negative logarithmic mean first passage time (MFPT) of folding (open circles) and unfolding (filled circles) versus intrachain interaction energy. (B) is for the 27-mer three-letter model (iii); (C) and (D) are for the 48-mer Gō (ii) and 55-mer cooperative (i) models, respectively. The dotted V shapes depict hypothetical two-state chevron plots. In (C), the $\tilde{\epsilon}$ variables used in the text are indicated. Each data point in (B) is averaged from 500 trajectories. For (B), folding starts from a randomly generated conformation and first passage is defined by achieving the ground-state conformation, whereas unfolding starts from the ground-state conformation and first passage is achieved when the chain has fewer than four native contacts. End flips are attempted for the monomers at the two chain ends of the 27-mer model, while corner flips (70%) and crankshafts (30%) are used to simulate the motion of other monomers. Model time in this figure and Fig. 7 is measured in units of attempted Monte Carlo moves. Part of the data shown in (A), (C), and (D) are from H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 40, 637 (2000); 52, 510 (2003) as well as H. Kaya and H. S. Chan, Phys. Rev. Lett. 90, 258104 (2003).
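The hypothetical two-state chevron (the dotted V shapes of Fig. 5) can be illustrated with a minimal sketch. It assumes strictly linear folding and unfolding arms in the reduced interaction strength and the standard two-state relaxation rate $k_{\mathrm{obs}} = k_f + k_u$; the function name, midpoint, midpoint rate, and arm slopes (`eps_half`, `ln_k_half`, `m_f`, `m_u`) are illustrative values chosen here, not parameters taken from the models in this chapter.

```python
import math

def two_state_chevron(eps, eps_half=-2.0, ln_k_half=-12.0, m_f=8.0, m_u=6.0):
    """Hypothetical two-state chevron in reduced interaction strength eps.

    eps       : reduced interaction strength (more negative favors the native state)
    eps_half  : assumed transition midpoint, where k_f = k_u
    ln_k_half : assumed ln(rate) of each arm at the midpoint
    m_f, m_u  : assumed slopes of the linear folding and unfolding arms
    Returns (ln k_f, ln k_u, ln k_obs, dG_u / k_B T).
    """
    d = eps - eps_half
    ln_kf = ln_k_half - m_f * d   # folding speeds up as eps becomes more negative
    ln_ku = ln_k_half + m_u * d   # unfolding slows down as eps becomes more negative
    # ln(k_f + k_u), evaluated in a numerically stable way
    hi, lo = max(ln_kf, ln_ku), min(ln_kf, ln_ku)
    ln_kobs = hi + math.log1p(math.exp(lo - hi))
    # Two-state consistency: dG_u / k_B T = ln(k_f / k_u), linear in eps
    dG_u = ln_kf - ln_ku
    return ln_kf, ln_ku, ln_kobs, dG_u

if __name__ == "__main__":
    for eps in (-3.0, -2.5, -2.0, -1.5, -1.0):
        ln_kf, ln_ku, ln_kobs, dG_u = two_state_chevron(eps)
        print(f"eps={eps:5.2f}  ln_kf={ln_kf:7.2f}  ln_ku={ln_ku:7.2f}  "
              f"ln_kobs={ln_kobs:7.2f}  dGu/kBT={dG_u:6.2f}")
```

In this construction the free energy of unfolding vanishes at the midpoint and grows linearly as the interaction becomes more favorable, which is the consistency between kinetics and thermodynamics that a simulated chevron plot is compared against.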

physically more realistic temperature-dependent effective interactions12,19 (results not shown here). Nonetheless, because effective intrachain interactions in real proteins are temperature dependent, it is problematic to directly identify the T-dependent kinetics of a model having only temperature-independent interactions with the temperature effects on real protein


kinetics.47,48 Instead, for kinetic considerations, it is often more appropriate to view the variation in interaction strength $\tilde{\epsilon} = \epsilon/k_BT$ in these models as corresponding to the variation of denaturant concentration at constant temperature.26,47–49 With this in mind, chevron plots of real proteins are modeled by the logarithmic folding and unfolding rates as functions of $\tilde{\epsilon}$. Model chevron plots in Fig. 5 thus constructed show that thermodynamic cooperativity has a direct impact on kinetic cooperativity. Among the models shown, model iii with a three-letter alphabet is thermodynamically least cooperative. Concomitantly, its folding/unfolding kinetics (Fig. 5B) deviates most seriously from simple two-state behavior. As thermodynamic cooperativity increases for the native-centric models ii and i whose $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ values are much higher, the corresponding chevron plots develop larger regions of quasilinear behavior resembling that of simple two-state proteins (Fig. 5C and D). The dotted V shapes in these plots depict hypothetical simple two-state linear chevron plots. In other words, the rates given by the dotted V shapes are consistent with a two-state account of the thermodynamic free energy of unfolding $\Delta G_u$, as discussed above. The comparison in Fig. 5 indicates that for a chevron plot to possess a significant linear regime consistent with two-state thermodynamics, i.e., have a substantial $\tilde{\epsilon}$ region of agreement between the simulated rates and the dotted V shape as in Fig. 5D, a high thermodynamic cooperativity with $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}} > 0.9$ is most likely required.

Generic Statistical Mechanical Properties as Stringent Experimental Constraints on Possible Protein Energetics

It is clear from Fig. 5 that protein models with different interaction schemes can lead to very different predictions with respect to thermodynamic and kinetic cooperativities. This has an important ramification. It means that generic experimental statistical mechanical properties of simple two-state protein folding such as calorimetric cooperativity and linear chevron plots are useful for deciphering protein energetics. Because not all chain models can produce simple two-state behavior, important insights into how real proteins work may be gained by ascertaining what model intrachain interaction schemes can lead to experimental cooperativity properties, and what interaction schemes are deficient in those regards.

47 H. S. Chan, in “Monte Carlo Approach to Biopolymers and Protein Folding” (P. Grassberger, G. T. Barkema, and W. Nadler, eds.), p. 29. World Scientific, Singapore, 1998.
48 H. S. Chan and K. A. Dill, Proteins Struct. Funct. Genet. 30, 2 (1998).
49 H. Kaya and H. S. Chan, J. Mol. Biol. 326, 911 (2003).


To evaluate a protein model’s kinetic cooperativity against experiment and to characterize chevron rollovers or lack thereof, we found it useful to pay special attention to the behavior of the model at the transition midpoint (interaction strength $\tilde{\epsilon}_{1/2}$) as well as at the interaction strength $\tilde{\epsilon}_{\mathrm{opt}}$ at which the folding rate is fastest or optimal (cf. Fig. 5C). Simple two-state behavior requires $\tilde{\epsilon}_{\mathrm{opt}}$ to be significantly more negative (more favorable to the native state) than $\tilde{\epsilon}_{1/2}$.

Interaction Specificity Enhances Cooperativity

Figure 5 shows that protein models with more specific interactions are more cooperative. The model with least specific interactions in Fig. 5 is the three-letter model. Figure 5B shows that the maximum folding rate of this model occurs at an interaction strength essentially identical to that of the transition midpoint ($\tilde{\epsilon}_{\mathrm{opt}} \approx \tilde{\epsilon}_{1/2}$). This observation, which is consistent with earlier simulations of Onuchic et al.,32 means that the folding rate to the left of the transition midpoint of this model decreases with increasing native stability. This trend is opposite to that of real small single-domain proteins. In Fig. 5C, intrachain interactions for the additive 48-mer Gō model13,44,45 are more specific, as the energy function is constructed with explicit biases for the native structure. The resulting chevron plot is more proteinlike, in that the folding rate around the transition midpoint increases with increasing native stability ($\tilde{\epsilon}_{\mathrm{opt}} < \tilde{\epsilon}_{1/2}$). But the severe chevron rollover exhibited by this model implies that its folding/unfolding kinetics still differ significantly from those of simple two-state proteins. In contrast, the 55-mer cooperative model43 in Fig. 5D provides a better agreement with experimental simple two-state behavior. Intrachain interactions in this model are more specific than those of Gō models with only pairwise additive contact energies.
The present 55-mer cooperative model energy function favors relatively large fragments of the native structure as a whole, resulting in a wider average energetic separation between the native and denatured conformations (cf. Fig. 6). The 55-mer cooperative model incorporates many-body interactions embodying two main ideas. (1) The first is a cooperative interplay between local conformational preferences and favorable nonlocal interactions. This is motivated by experimental observations that secondary structure elements are not stable in isolation but are stable when packed against other parts of a protein in the folded core.50 (2) The second is an extra favorable energy for the native conformation as a whole. This is motivated by experimental mutagenesis data51 suggesting that driving forces for protein

50 K. A. Dill, Biochemistry 29, 7133 (1990).
51 J. G. B. Northey, A. A. Di Nardo, and A. R. Davidson, Nature Struct. Biol. 9, 126 (2002).


Fig. 6. Energy distributions and their implications for thermodynamic cooperativity (cf. Fig. 4B). P(E) is the fraction of conformations with $E - 0.5 <$ energy $\le E + 0.5$. The 27-mer three-letter (A), 48-mer Gō (B), and 55-mer cooperative (C) models in Fig. 5 are compared. Results are obtained by standard Monte Carlo histogram techniques. Distributions under strongly folding and strongly unfolding conditions are shown by solid and dotted curves, respectively. Distributions near the models’ thermodynamic transition midpoints are shown by dashed curves with the areas underneath shaded. The $\epsilon/k_BT$ values used for strongly folding, transition, and strongly unfolding conditions for the three models are, respectively, (A) −0.77, −0.67, −0.33; (B) −1.47, −1.28, −1.14; and (C) −2.11, −2.0, −1.67.
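The link between the shape of the energy distribution and calorimetric cooperativity can be made concrete with a toy calculation. The sketch below is not one of the lattice models of Figs. 5 and 6; it compares two hypothetical densities of states, a strictly bimodal one (a single native level plus a highly degenerate denatured level) and a "ladder" whose population shifts continuously with temperature. It assumes temperature-independent energies, units with $k_B = 1$, the commonly used definition $\Delta H_{\mathrm{vH}} = 2T_{\max}\sqrt{k_B C_P(T_{\max})}$, and a calorimetric enthalpy taken as the change in mean energy over the scanned temperature range without baseline subtraction. All names and numerical values are illustrative.

```python
import math

def mean_energy_and_cp(levels, T):
    """levels: list of (energy, ln_degeneracy) pairs; k_B = 1.
    Returns the mean energy and the heat capacity var(E) / T^2."""
    ln_w = [ln_g - E / T for E, ln_g in levels]
    shift = max(ln_w)                      # avoid overflow in the Boltzmann weights
    w = [math.exp(x - shift) for x in ln_w]
    Z = sum(w)
    e_mean = sum(wi * E for wi, (E, _) in zip(w, levels)) / Z
    var = sum(wi * (E - e_mean) ** 2 for wi, (E, _) in zip(w, levels)) / Z
    return e_mean, var / (T * T)

def kappa2(levels, T_lo, T_hi, n=4000):
    """van't Hoff to calorimetric enthalpy ratio on a uniform temperature grid."""
    best_T, best_C = T_lo, 0.0
    for i in range(n + 1):
        T = T_lo + (T_hi - T_lo) * i / n
        C = mean_energy_and_cp(levels, T)[1]
        if C > best_C:
            best_T, best_C = T, C
    dH_vH = 2.0 * best_T * math.sqrt(best_C)
    # The integral of C_P over [T_lo, T_hi] equals the change in mean energy
    dH_cal = mean_energy_and_cp(levels, T_hi)[0] - mean_energy_and_cp(levels, T_lo)[0]
    return dH_vH / dH_cal

E0, LN_G, N = 50.0, 23.0, 20               # hypothetical toy parameters
two_state = [(-E0, 0.0), (0.0, LN_G)]      # well-separated bimodal distribution
ladder = [(-E0 * (1 - n / N), LN_G * n / N) for n in range(N + 1)]  # continuous shift
T_m = E0 / LN_G                            # midpoint temperature of both toy models

k_two = kappa2(two_state, 0.2 * T_m, 5.0 * T_m)
k_ladder = kappa2(ladder, 0.2 * T_m, 5.0 * T_m)
print(f"bimodal toy model: kappa2 = {k_two:.3f}")
print(f"ladder toy model:  kappa2 = {k_ladder:.3f}")
```

Under these assumptions the bimodal toy model gives a ratio close to unity, whereas the ladder, whose energy distribution at the midpoint is spread over intermediate energies, gives a substantially lower value, in qualitative correspondence with the cooperative and noncooperative scenarios discussed in the text.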

folding kinetics are partially separated from the specific interactions that stabilize the native structure. Details of implementation of these ideas in a lattice model context are provided by Kaya and Chan.43 These features of the 55-mer cooperative model lead to a chevron plot with an extensive linear regime ($\tilde{\epsilon}_{\mathrm{opt}}$ significantly more negative than $\tilde{\epsilon}_{1/2}$). For this prototypical model with many-body interactions, the folding arm of the chevron plot is approximately linear for $\epsilon/k_BT > -2.4$, corresponding to native stabilities $\Delta G_u$ of up to about $10k_BT$. Therefore, if $10k_BT$ is the maximum native stability that can be physically attained, the folding arm of this chevron plot would be linear for the entire physical regime. Because $\Delta G_u \approx 10k_BT$ is comparable to the native stabilities of many small single-domain proteins at zero denaturant at room temperature (e.g., $\Delta G_u \approx 9.0k_BT$ for protein L at 22°), Fig. 5D indicates how simple two-state folding/unfolding kinetics might arise from many-body effects similar to those postulated in this model and by virtue of the limited native stability of a small protein.43

Many-Body Interactions Needed for Linear Chevron Behavior

The existence of a maximum folding rate at a certain $\tilde{\epsilon}_{\mathrm{opt}}$ is a robust feature observed across many chain models. Physically, it has long been recognized that $\tilde{\epsilon}_{\mathrm{opt}}$ represents a balance between two opposing kinetic effects of making intrachain interactions thermodynamically more favorable to folding ($\tilde{\epsilon}$ more negative).52 On the one hand, more favorable intrachain interactions imply a stronger bias toward the native structure and thus faster folding. On the other hand, if intrachain interactions are too favorable, they would lead to deep kinetic traps, glassy dynamics, and slow folding.32,53 Hence folding must be fastest or optimal at a certain intermediate interaction strength.52 This perspective implies that chevron rollover is generally unavoidable. Nonetheless, as demonstrated by Fig. 5D, when thermodynamic cooperativity is enhanced by many-body interactions, kinetic trapping is reduced and $\tilde{\epsilon}_{\mathrm{opt}}$ can be pushed to more negative values relative to $\tilde{\epsilon}_{1/2}$. In that event, folding-arm chevron rollover can be practically eliminated if the hypothetical native stability at the theoretical $\tilde{\epsilon}_{\mathrm{opt}}$ is much higher than the maximum native stability achievable in the given system. Table I provides a broader comparison to further underscore the relationship between kinetic cooperativity and interaction specificity. In addition to the three chain models in Fig.
5, this table considers also a designed 20-letter 48-mer sequence54 and a designed 2-letter 27-mer sequence (2LCa in ref. 32). Table I shows that for the 2-, 3-, and 20-letter models with finite alphabets, kinetic cooperativity increases (as characterized by larger $\tilde{\epsilon}_{\mathrm{opt}}/\tilde{\epsilon}_{1/2}$ ratios) with increasing number of letters in the alphabet. This is consistent with the above observation that kinetic cooperativity increases with increasing interaction specificity. Sequences designed using

52 R. Miller, C. A. Danko, M. J. Fasolka, A. C. Balazs, H. S. Chan, and K. A. Dill, J. Chem. Phys. 96, 768 (1992).
53 D. Thirumalai and S. A. Woodson, Acc. Chem. Res. 29, 433 (1996).
54 A. Gutin, A. Sali, V. Abkevich, M. Karplus, and E. I. Shakhnovich, J. Chem. Phys. 108, 6466 (1998).

TABLE I
Characterization of Kinetic Cooperativity of Model Protein Folding and Unfolding by Comparing Properties at $\tilde{\epsilon}_{\mathrm{opt}}$ (When Folding Rate Is Maximum) to That at the Transition Midpoint $\tilde{\epsilon}_{1/2}$

Protein chain model        ε̃opt/ε̃1/2    log10[(kf)opt/(kf)1/2] (a)    (ΔGu)opt/kBT (b)
55-mer cooperative (c)     1.48          3.2                           30
48-mer Gō (d)              1.24          1.4                           14
20-letter (e)              1.06          0.0                           0.9
3-letter (f)               0.99          0.0                           0.0
2-letter (f)               0.84          0.3                           −3.0

(a) (kf)opt and (kf)1/2 are the folding rates at interaction strengths ε̃opt and ε̃1/2, respectively.
(b) (ΔGu)opt is the free energy of unfolding at interaction strength ε̃opt.
(c) H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 52, 510 (2003).
(d) V. S. Pande and D. S. Rokhsar, Proc. Natl. Acad. Sci. USA 96, 1273 (1999).
(e) A. Gutin, A. Sali, V. Abkevich, M. Karplus, and E. I. Shakhnovich, J. Chem. Phys. 108, 6466 (1998).
(f) J. N. Onuchic, Z. Luthey-Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).

a larger alphabet tend to be more specific because they can exploit the higher degree of energetic heterogeneity afforded by the larger number of interaction types.4,55–57

Quantitative Characterizations of Chevron Plots

Two novel parameters are introduced in Table I to better characterize kinetic cooperativity: $\log_{10}[(k_f)_{\mathrm{opt}}/(k_f)_{1/2}]$ compares the folding rate at $\tilde{\epsilon}_{\mathrm{opt}}$ with that at $\tilde{\epsilon}_{1/2}$, and $(\Delta G_u)_{\mathrm{opt}}$ is the free energy of unfolding at $\tilde{\epsilon}_{\mathrm{opt}}$. (By definition $\Delta G_u = 0$ at $\tilde{\epsilon}_{1/2}$.) These parameters serve to facilitate more direct comparisons with experiments by eliminating references to the model interaction strength $\tilde{\epsilon}$. The discussion above implies that kinetic cooperativity is associated with a large $(\Delta G_u)_{\mathrm{opt}}$. The maximum folding rate generally occurs at a hypothetical native stability much higher than that covered by the linear regime of the model chevron plot. Therefore, for a model to behave like small single-domain proteins, $(\Delta G_u)_{\mathrm{opt}}$ has to be significantly larger than typical zero-denaturant stabilities ($\approx 10k_BT$) of these proteins. Among the models listed in Table I, only the 55-mer cooperative model satisfies this requirement. For the same reason, $\log_{10}[(k_f)_{\mathrm{opt}}/(k_f)_{1/2}]$ has to

55 P. G. Wolynes, Nature Struct. Biol. 4, 871 (1997).
56 H. S. Chan, Nature Struct. Biol. 6, 994 (1999).
57 H. S. Chan and E. Bornberg-Bauer, Appl. Bioinform. 1, 121 (2002).


be significantly larger than the common logarithm of the ratio between the zero-denaturant folding rate $(k_f)_0$ and transition midpoint folding rate $(k_f)_{1/2}$. The quantity $\log_{10}[(k_f)_0/(k_f)_{1/2}]$ is generally not small for real, small single-domain proteins, as their folding rates often span several orders of magnitude under different denaturant conditions. For example, $\log_{10}[(k_f)_0/(k_f)_{1/2}] \approx 3.1$ at 25° for wild-type chymotrypsin inhibitor 2.1 In view of these considerations, Table I shows that 2-, 3-, and 20-letter models are not kinetically cooperative in that their $\log_{10}[(k_f)_{\mathrm{opt}}/(k_f)_{1/2}]$ and $(\Delta G_u)_{\mathrm{opt}}$ are all very close to zero. This suggests strongly that model sequences designed using solely pairwise additive contact energies in a finite alphabet with $\le 20$ letters are not kinetically cooperative in general. Among them, the two-letter 27-mer sequence represents an extreme noncooperative case in which $\tilde{\epsilon}_{\mathrm{opt}}$ is less negative than $\tilde{\epsilon}_{1/2}$ such that $\tilde{\epsilon}_{\mathrm{opt}}/\tilde{\epsilon}_{1/2} < 1$ and fastest folding occurs under denaturing [$(\Delta G_u)_{\mathrm{opt}} < 0$] rather than native conditions.

It should be emphasized here that the relatively high kinetic cooperativity of the 55-mer cooperative model in Table I is a direct consequence of its many-body interactions; it does not arise from its longer chain length per se. Much shorter chain models with many-body interactions can also achieve similar kinetic cooperativities. A case in point is a 27-mer (with relative contact order CO = 0.27) we have considered that has an extra favorable energy for the ground-state conformation as a whole.43 We found that its $\tilde{\epsilon}_{\mathrm{opt}}/\tilde{\epsilon}_{1/2} = 1.9$, $\log_{10}[(k_f)_{\mathrm{opt}}/(k_f)_{1/2}] = 3.4$, and $(\Delta G_u)_{\mathrm{opt}} = 35$, indicating that this particular 27-mer model has thermodynamic and kinetic cooperativities similar to those of the present 55-mer cooperative model.
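The stability criterion discussed above can be applied to the Table I entries in a few lines. This is only a sketch: the text requires $(\Delta G_u)_{\mathrm{opt}}$ to be "significantly larger" than typical zero-denaturant stabilities of about $10k_BT$ without giving a numerical cutoff, so the factor `margin = 2.0` below is an arbitrary stand-in chosen here, as are the function and variable names. The two-letter entry is taken as negative, following the statement that its fastest folding occurs under denaturing conditions.

```python
# Table I rows: (model, eps_opt/eps_half, log10[(kf)opt/(kf)1/2], (dGu)opt / kBT)
TABLE_I = [
    ("55-mer cooperative", 1.48, 3.2, 30.0),
    ("48-mer Go",          1.24, 1.4, 14.0),
    ("20-letter",          1.06, 0.0,  0.9),
    ("3-letter",           0.99, 0.0,  0.0),
    ("2-letter",           0.84, 0.3, -3.0),
]

def passes_stability_criterion(dG_opt, native_stability=10.0, margin=2.0):
    """True if (dGu)opt exceeds the typical zero-denaturant stability
    (about 10 kBT per the text) by the assumed factor `margin`.
    The factor is a hypothetical reading of "significantly larger"."""
    return dG_opt > margin * native_stability

cooperative_models = [name for name, _, _, dG_opt in TABLE_I
                      if passes_stability_criterion(dG_opt)]

for name, ratio, log_rate, dG_opt in TABLE_I:
    verdict = "passes" if passes_stability_criterion(dG_opt) else "fails"
    print(f"{name:20s} eps ratio={ratio:4.2f}  log10 rate ratio={log_rate:3.1f}  "
          f"(dGu)opt={dG_opt:5.1f} kBT  {verdict}")
```

With this assumed margin only the 55-mer cooperative model passes, which matches the conclusion stated in the text; a different margin would shift the boundary for the 48-mer Gō entry.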
Relationship between Thermodynamic and Kinetic Cooperativities

Figure 6 shows the distributions of conformational population that underlie the differences in thermodynamic cooperativity among the three models in Fig. 5. The three-letter model in Fig. 6A is calorimetrically noncooperative because the broad peak of its transition midpoint energy distribution ($E \approx -52$) lies approximately midway between that of the fully folded ($E = -84$) and fully unfolded ($E \approx -27$) states, corresponding to the noncooperative scenario in Fig. 4Biii. The significant conformational population with intermediate energy here is associated with substantial kinetic trapping, which is the root cause of the severe chevron rollover in Fig. 5B. As we pointed out, the continuous T-dependent shift of the energy distribution peak of this model contributes to a long high-temperature tail in the heat capacity function (Fig. 5Aiii) and implies a significant postdenaturational expansion of conformational dimension (as measured by root-mean-square radius of gyration, for example). But such an expansion is


not observed in small-angle X-ray scattering experiments on several small single-domain proteins.13 This provides additional evidence that the thermodynamic behavior of this three-letter model is very different from that of these proteins. Therefore, in conjunction with the above consideration suggesting that $T_f/T_g > 6$ for simple two-state proteins (much higher than the $T_f/T_g = 1.6$ for the three-letter model), these observations cast doubt13 on the proposal that “real proteins resemble bead models in which only three kinds of residues are used to encode sequence.”39 In this connection, our findings also argue against the hypothesis that the folding behavior of this particular three-letter sequence can be mapped onto that of a 60-residue helical protein through a “law of corresponding states.”39

Figure 6B shows that the energy distribution of the 48-mer Gō model is more in line with the cooperative scenario in Fig. 4Bii. But even in this case, pairwise additive native-centric contact energies are insufficient for simple two-state kinetics (cf. Fig. 5C and Table I). This is because of the existence of a small yet not negligible non–ground-state population with near–ground-state energies13 (small peak on the left of Fig. 6B). Some of these conformations would act as kinetic traps to slow folding and thus cause the folding chevron arm to roll over under conditions that are only mildly favorable to the native state.45 In contrast, for the 55-mer cooperative model in Fig. 6C that has an extended regime of linear chevron behavior (Fig. 5D), the corresponding near–ground-state population is very much reduced. So, simple two-state kinetics appears to require a well-separated bimodal energy distribution quantitatively similar to that afforded by the cooperative model in Fig. 6C.

Additive Gō-Like Constructs Are Insufficient for Simple Two-State Kinetics: Implications for the Principle of Minimal Frustration

The 48-mer Gō model example in Figs.
5 and 6 and Table I indicates that additive interaction schemes envisioned by the common Gō potential cannot account for proteinlike simple two-state kinetics. This is a general deficiency, not an artifact of the lattice approach. Indeed, we have recently demonstrated that continuum (off-lattice) Gō-like models with essentially additive energies are also unable to produce simple two-state kinetics. As in Fig. 5C, the chevron plots of several continuum Gō models are seen to contain significant chevron rollovers, indicating that internal friction arising from kinetic trapping remains substantial in such constructs.49 It is noteworthy that $T_f/T_g > 2$ has been reported for a 27-mer lattice Gō model.58 Therefore, the failure of common Gō models to predict linear

58 H. Nymeyer, N. D. Socci, and J. N. Onuchic, Proc. Natl. Acad. Sci. USA 97, 634 (2000).

chevron plots supports our contention above that the parameter Tf/Tg has to be significantly larger than 2 for a model protein to exhibit apparent simple two-state behavior. This limitation of additive Gō models is basic. As such, it has far-reaching implications for the fundamental principles of protein folding energetics. Natural proteins are evolved molecules. What makes them different from other heteropolymers? A consistency principle was proposed by Gō 20 years ago.59 It stipulates that different energetic components (e.g., local and nonlocal interactions) in naturally occurring proteins are evolutionarily designed to be consistent with one another when the protein adopts the native conformation. In other words, the native state is essentially free of energetic stress. This serves to ensure the stability of the native structure. A similar hypothesis was subsequently offered by the principle of minimal frustration of Bryngelson and Wolynes33 (see above), which recognizes that energetic frustration is minimized but not nonexistent in real proteins,60 i.e., “consistency is not perfect.”61 As discussed above, we found that the minimal frustration principle is intimately related to the calorimetric criterion for thermodynamic cooperativity. The minimal frustration principle is extremely insightful in providing a first constraint on how real protein energetics should look. However, minimal frustration per se does not ensure proteinlike thermodynamic and kinetic cooperativities. In applications of the consistency principle and the principle of minimal frustration, the primary focus is often on the native state’s energetic situation. But our analysis above shows clearly that cooperativity entails not only the stabilization of the native structure but also the destabilization of otherwise stable nonnative conformations (cf. Fig. 6). Therefore, inasmuch as the principle of minimal frustration is postulated to be well embodied by three-letter32,39,58 or common Gō-like62 models, results in Figs. 5 and 6 and Table I imply that the principle of minimal frustration is insufficient for the simple two-state thermodynamics and folding/unfolding kinetics of small single-domain proteins. In short, minimal frustration of the native structure appears to be necessary but not sufficient for proteinlike thermodynamic and kinetic cooperativities. It follows that for purposes of evaluating protein chain models and for delineating the remarkable differences between natural

59 N. Gō, Annu. Rev. Biophys. Bioeng. 12, 183 (1983).
60 A. R. Panchenko, Z. Luthey-Schulten, R. Cole, and P. G. Wolynes, J. Mol. Biol. 272, 95 (1997).
61 N. Gō, in “Old and New Views of Protein Folding” (K. Kuwajima and M. Arai, eds.), p. 97. Elsevier, Amsterdam, 1999.
62 C. Clementi, H. Nymeyer, and J. N. Onuchic, J. Mol. Biol. 298, 937 (2000).

proteins and heteropolymers in general, the minimal frustration principle should be superseded by cooperativity principles based upon quantitative experimental criteria.

Chevron Rollover as a Consequence of Nonideal Thermodynamic Cooperativity

The present theoretical analysis of protein folding/unfolding kinetics offers a consistent perspective on both simple two-state kinetics and kinetics with chevron rollovers that are often operationally referred to as non–two-state. As discussed above, general considerations of polymer physics indicate that chevron rollover is probably unavoidable when interactions favoring intrachain sticking are sufficiently strong, i.e., when the stability of the native state is higher than a certain threshold. The phenomenon of chevron rollover is seen to arise from kinetic trapping that may be characterized as an internal friction effect related to the “front factor” in the transition state picture.26,49 Microscopically, this means that there are increasing barrier recrossings and other impediments to conformational search under increasingly native conditions, as is evident from the simulated folding trajectories under such conditions45 (not shown here). From this vantage point, the experimentally pertinent question about chevron rollover becomes whether the omnipresent theoretical rollover occurs at a native stability accessible by experiments. Therefore, the simple two-state kinetics of small single-domain proteins imply that their theoretical rollovers occur at hypothetical native stabilities significantly higher than their real stabilities in zero denaturant. Our modeling effort above indicates that this already requires a high degree of thermodynamic cooperativity, and that many-body interactions beyond the additive contact interactions in common Gō models are necessary to achieve this feat. But obviously there are physicochemical limitations to the thermodynamic cooperativity achievable by a protein. As a result, chevron rollover would occur if the interactions in a particular protein are not sufficiently cooperative, or when native stability is relatively high at zero denaturant. Thus, chevron rollovers may be viewed as beginning signs of glassy dynamics that are expected by energy landscape theory to commence at higher native stabilities,32,33 even though such hypothetical high-stability conditions are often not experimentally attainable for real proteins.34 Our simulations of the 48-mer Gō model and the 55-mer cooperative model show that when intrachain interactions are sufficiently specific, folding relaxation remains essentially single exponential in the rollover regime for native stabilities less than that at ε̃opt, i.e., for ΔGu < (ΔGu)opt. (But folding relaxation often becomes non–single-exponential for ΔGu >

(ΔGu)opt.26,45) This feature echoes experimental rollover data on barnase41 and ribonuclease A,42 which exhibit essentially single-exponential behavior after effects of proline isomerization are factored out.41,42 Consistent with the present perspective, the zero-denaturant stability for barnase is ≈18 kBT at 25°, which is considerably higher than that of many small single-domain proteins, but it is significantly lower than barnase’s hypothetical45 (ΔGu)opt ≈ 40 kBT at the same temperature. In situations in which intrachain interactions are less specific, however, chevron rollover can also be associated with non–single-exponential relaxation and kinetic partitioning.48,53

Coupling of Local and Nonlocal Interactions as a Possible Key to Contact-Order–Dependent Cooperative Folding

In addition to thermodynamic and kinetic cooperativities, we have also applied the remarkable empirical correlation between relative contact order (CO) and folding rate63,64 to further narrow the types of cooperative energetics that are likely to be operating in real, small single-domain proteins. CO-dependent folding constitutes another nonredundant physical constraint on possible protein energetics because not all interaction schemes that provide for high thermodynamic and kinetic cooperativity can lead to a significant correlation between CO and folding rate similar to that observed experimentally.65 Figure 7A shows that the common Gō potential with pairwise contact energies, which is insufficient for simple two-state kinetics to begin with, is also not capable of producing CO-dependent folding. The folding rates span only approximately one order of magnitude, with much scatter and a very low correlation coefficient with CO. This and a similar finding by Jewett et al.66 buttress our point that common Gō models are not adequate for certain basic generic properties of small single-domain proteins. Recently, Jewett et al.66 introduced a new interaction scheme aimed at enhancing thermodynamic cooperativity with a nonlinear relation between energy and the number of native contacts. Figure 7B shows that their cooperative interaction scheme leads to a better correlation between CO and folding rate than the common Gō potential, and the range spanned by the folding rates of the model proteins also increases to ≈1.8 orders of magnitude. However, r = 0.80 for this interaction scheme is weak in comparison with the experimental correlation coefficient, and the divergence in folding

63 K. W. Plaxco, K. T. Simons, and D. Baker, J. Mol. Biol. 277, 985 (1998).
64 D. Makarov and K. W. Plaxco, Protein Sci. 12, 17 (2003).
65 H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 52, 524 (2003).
66 A. I. Jewett, V. S. Pande, and K. W. Plaxco, J. Mol. Biol. 326, 247 (2003).

Fig. 7. Modeling CO-dependent folding. The folding rates (open circles) of a set of 97 three-dimensional 27-mer lattice model proteins with different CO values for their maximally compact native structures [H. Kaya and H. S. Chan, Proteins Struct. Funct. Genet. 52, 524 (2003)] are determined under three different native-centric interaction schemes. Solid lines are least-squares fits. The correlation between log10(folding rate) and CO is quantified by the squared correlation coefficient r². (A) Common Gō potential with pairwise additive contact energies; r² = 0.39. (B) The s = 3 cooperative scheme of A. I. Jewett, V. S. Pande, and K. W. Plaxco [J. Mol. Biol. 326, 247 (2003)]; r² = 0.65. (C) The cooperative scheme of Kaya and Chan with a = 0.1 local–nonlocal coupling; r² = 0.84. All folding rates shown are simulated at ε/kBT = 1.47, as in Kaya and Chan. Each data point in (A) and (C) is averaged from 500 trajectories, whereas each data point in (B) is averaged from 100 trajectories, using the same move set as that in Kaya and Chan.
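
The correlation statistics quoted in this legend can be computed for any set of (CO, folding rate) pairs with an ordinary least-squares fit of log10(rate) against CO. The Python sketch below is illustrative only: the function name `co_rate_correlation` and the five data points are hypothetical placeholders, not the 97 lattice structures of the study, and the last loop simply converts the rounded r² values of the legend into the corresponding magnitudes of r.

```python
import math

def co_rate_correlation(co_values, rates):
    """Least-squares fit of log10(rate) against CO.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient; r**2 is the quantity quoted in the figure legend.
    """
    y = [math.log10(k) for k in rates]
    n = len(co_values)
    mean_x = sum(co_values) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in co_values)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(co_values, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical data: folding rate decreases as contact order increases.
co = [0.20, 0.25, 0.30, 0.35, 0.40]
rates = [1e-4, 6e-5, 2e-5, 1.5e-5, 4e-6]
slope, intercept, r = co_rate_correlation(co, rates)
print(f"slope = {slope:.2f}, r = {r:.2f}, r^2 = {r * r:.2f}")

# Magnitude of r implied by each (rounded) r^2 value in the legend.
for panel, r2 in (("A", 0.39), ("B", 0.65), ("C", 0.84)):
    print(panel, round(math.sqrt(r2), 2))
```

Because the legend reports r² to two decimal places, values of r recovered this way can differ in the last digit from values of r computed from the unrounded data.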

rate between the low- and high-CO structures is still quite limited. It should be noted that the set of structures and chain moves used in the present study was chosen independently and is not exactly identical to that of Jewett et al. As a result, the s = 3 result of r = 0.80 (r² = 0.65) in Fig. 7B is different from the r = 0.75 (r² = 0.57) reported by Jewett et al.66

Our exploration thus far indicates that a cooperative interplay between local conformational preferences and the nonlocal favorable interactions responsible for protein core formation can lead to a significant CO/folding rate correlation.65 This view, which is motivated by experimental observations (see above), differs from either the local-dominant or the nonlocal-dominant picture of protein folding.50,67 Physically, it posits that nonlocal

67 R. L. Baldwin and G. D. Rose, Trends Biochem. Sci. 24, 26 (1999).

contact interactions cannot be strong unless local conformations sequentially near the contacting residues are essentially native so as to promote better packing between chain segments around the contacting residues. This is implemented in a model interaction scheme whereby a contact interaction is assigned a strongly favorable value when the pair of local chain segments centered around each of the contacting residues are in their native conformation, and assigned an attenuated value (by a factor a) otherwise. Figure 7C shows that this interaction scheme significantly expands the range of model folding rates to 2.6 orders of magnitude and leads to a high level of correlation with CO (r = 0.91) similar to that observed experimentally.63,64 We note that the divergence between low- and high-CO folding rates in this model is still substantially lower than the six orders of magnitude among experimental folding rates of small single-domain proteins. This is probably because of the shortness of the model chains. Although this question remains to be investigated further, we have verified that this many-body interaction scheme with local–nonlocal coupling satisfies the thermodynamic and kinetic cooperativity criteria as well. Taken together, these observations lead us to hypothesize that similar local–nonlocal coupling mechanisms are at work in real proteins.

Concluding Remarks and Outlook

Results summarized in this chapter demonstrate that when generic, apparently mundane properties of simple two-state proteins are applied to evaluate self-contained polymer models, they can provide stringent constraints that lead to unexpected in-depth understanding of protein energetics. This is quite remarkable because so far only data acquired by traditional (structurally low-resolution) optical probes have been emphasized in our analysis. We emphasize again that the present usage of the term “two state” should be construed in a structurally low-resolution context. Obviously, for a macromolecule such as a protein, it is physically inconceivable that there are only two discrete energy levels as envisioned in the “Levinthal paradox.”68 Conformations are expected to span a quasicontinuum of energies (or enthalpies), so it goes without saying that the “two-state” behavior of a polymer chain is not absolute like that of two quantum states. Nonetheless, from a polymer physics perspective, the low-resolution experimental hallmarks of small single-domain proteins, notably their calorimetric cooperativity and linear chevron plots, are remarkable feats that demand highly unusual energetics. Our model evaluation suggests strongly that in these proteins, while

68 R. L. Baldwin, BioEssays 16, 207 (1994).

conformations with intermediate energies or enthalpies do exist, their populations must be greatly reduced by specific many-body interactions. The thermodynamics of their energy landscapes may be characterized as “near-Levinthal” in this respect.43 Indeed, for our cooperative chain models, small populations of non–ground-state conformations always exist even under strongly native conditions26 by virtue of the Boltzmann distribution. This feature is consistent with experimental data from native state hydrogen exchange.28,69–71 As we have pointed out, some of the “partially unfolded” states revealed by native state hydrogen exchange may be viewed as part of a multiple-conformation native state implicitly defined by empirical baseline subtractions.13,26

Here, as a first step, we have focused on small single-domain proteins because their near-ideal cooperative behavior is expected to provide more clear-cut information. Nevertheless, as shown above, our analysis is also pertinent to proteins that fold with chevron rollovers. Because many heteropolymer models are calorimetrically noncooperative, we expect self-contained polymer modeling to be useful in explaining downhill folding72 as well.

Several foldability criteria pioneered by earlier researchers are extremely useful in distinguishing natural proteins from random heteropolymers. These include the consistency principle,59 the principle of minimal frustration,33 the energy gap35 or stability gap ideas,38 and the σ parameter that compares the conditions for folding with those for chain collapse.37 These criteria recognized key energetic ingredients that set natural proteins apart from random polypeptides. As such, these foldability criteria had to be critical in prebiotic evolution. However, these criteria alone do not address quantitatively the high degrees of thermodynamic and kinetic cooperativities of today’s small single-domain proteins. The cooperativity principles discussed in this chapter build on these earlier foldability criteria. Apparently, among foldable proteins, there has been further evolutionary pressure to enhance folding/unfolding cooperativity. A probable biological impetus might be the “avoidance of aggregation, particularly to highly insoluble amyloid fibrils” in the crowded cellular environment, as has been pointed out by Dobson.73 In this view, the lesser tendency for RNAs to aggregate may explain why RNAs have not evolved similar folding/unfolding cooperativities.

69 S. Marqusee and D. Wildes, Chapter 15, this volume.
70 H. Maity, W. K. Lim, J. N. Rumbley, and S. W. Englander, Protein Sci. 12, 153 (2003).
71 C. Woodward, N. Carulla, and G. Barany, Chapter 17, this volume.
72 M. M. Garcia-Mira, M. Sadqi, N. Fischer, J. M. Sanchez-Ruiz, and V. Muñoz, Science 298, 2191 (2002).
73 C. M. Dobson, Trends Biochem. Sci. 24, 329 (1999).

The work presented here shows that coarse-grained models are effective in delineating the general principles of protein folding/unfolding cooperativity. Ultimately, however, the physical bases (or lack thereof, for that matter) of the many-body interactions postulated in our coarse-grained models have to be ascertained. Some of these investigations are already under way. For example, we found that the temperature dependence of the hydrophobic effect (Fig. 3B) may account for the lack of postdenaturational chain expansion, and anticooperativity of certain hydrophobic interactions (Fig. 3C) may lessen the tendency of premature chain collapse and thus contribute to overall folding/unfolding cooperativity. In this pursuit, it would be extremely interesting to see how side chain packing, hydrogen bonding, and other atomic interactions may give rise to mechanisms of local–nonlocal coupling similar to those proposed above.

[17] Native State Hydrogen-Exchange Analysis of Protein Folding and Protein Motional Domains

By Clare Woodward, Natàlia Carulla, and George Barany

Introduction

Slow hydrogen isotope exchange is a defining characteristic of the folded state of proteins. If a protein is water soluble, not self-aggregated, and has slow exchange of 10–25% of its backbone amide groups, it is reasonable to presume that folding to a biologically functional conformational ensemble has occurred. Slow exchange in this context refers to buried amide hydrogens that exchange with solvent hydrogens of different isotopic composition on the hour-to-day time scale at neutral pH and room temperature. Pioneering investigators of protein hydrogen exchange1 recognized that slow exchange implies the existence of internal motions that expose buried amide NH groups to solvent and thereby permit isotope exchange. When viewing a diagram of protein hydrogen exchange as in Fig. 1A, the mind’s eye should fill in not only a picture of an actual protein, but also the third and fourth dimensions of space and time, to envision an ensemble that fluctuates over diverse conformations and on many time scales, but for the most part populates native structure. Although hydrogen exchange is a result of internal motility, the hydrogen-exchange experiment does not usually yield the frequency or amplitude of a motion, but

1 A. Hvidt and S. O. Nielsen, Adv. Protein Chem. 21, 287 (1966).

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00

Fig. 1. Protein hydrogen-exchange models for an NH exchanging from the native state. (A) The two-process model. Under native conditions, for each NH, the observed exchange rate constant, kobs, is the sum of the rate constant for exchange by the folded state mechanism (upper arrow), kN, and the rate constant for exchange by the unfolding mechanism (lower dashed arrow), kD. The subscripts N and D stand for the native and denatured states of the protein. For the folded state process (upper), no specific fluctuations/transitions are drawn, since the two-process model does not depend on the mechanism one favors for motions that mediate folded state exchange (penetration versus local unfolding). When exchange is only by the folded state mechanism, kobs = kN = βkcx,N, where kcx,N is the rate constant for the chemical step of native state exchange for the NH being monitored, and β expresses the probability of interaction of the exchanging NH with water and catalyst. For the unfolding mechanism, the dashed arrow indicates two steps, reversible global unfolding followed by exchange from the D state. Rate constants for global unfolding and folding are ku and kf, respectively. When exchange is only by the unfolding mechanism, kobs = kD = Kfold kcx,D, where kcx,D is the rate constant for the chemical step of exchange from the D state for the NH being monitored, and Kfold is the equilibrium constant for global unfolding, equal to ku/kf. (B) The EX2/EX1 analysis. A closed form, NH-closed, is in equilibrium with an open form, NH-open, with interconversion rates k1 and k2; kexg is the rate constant for exchange from the open form. The conditions under which EX2 versus EX1 prevail are discussed in the text. We apply the EX2/EX1 analysis to exchange by the unfolding mechanism, but not to exchange by the folded state mechanism.
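
A minimal numerical sketch of the two schemes in this legend follows. It is not taken from the chapter: the function names and all rate constants are hypothetical, the name `beta` for the probability factor in the folded state term is an assumed label, and the open/closed expression kobs = k1·kexg/(k2 + kexg) is the standard steady-state result for a scheme in which the closed form dominates (k2 much greater than k1), assumed here rather than quoted from the text.

```python
def k_folded(beta, kcx_n):
    """Folded state mechanism: kN = beta * kcx,N."""
    return beta * kcx_n

def k_unfolding(k_u, k_f, kcx_d):
    """Unfolding mechanism as in the legend: kD = Kfold * kcx,D, Kfold = ku/kf."""
    return (k_u / k_f) * kcx_d

def k_observed(k_n, k_d):
    """Two parallel pathways: kobs = kN + kD."""
    return k_n + k_d

def k_open_closed(k1, k2, kexg):
    """Steady-state rate for closed <-> open -> exchanged, assuming k2 >> k1."""
    return k1 * kexg / (k2 + kexg)

# Hypothetical rate constants (per minute), chosen only for illustration.
k_n = k_folded(beta=1e-6, kcx_n=100.0)
k_d = k_unfolding(k_u=1e-5, k_f=10.0, kcx_d=100.0)
k_obs = k_observed(k_n, k_d)
print(f"kN = {k_n:.1e}, kD = {k_d:.1e}, kobs = {k_obs:.1e}")
print(f"fraction exchanging by unfolding = {k_d / k_obs:.2f}")

# EX2 limit (kexg << k2): kobs approaches (k1/k2) * kexg.
# EX1 limit (kexg >> k2): kobs approaches k1.
k1, k2 = 1e-3, 10.0
print(k_open_closed(k1, k2, kexg=1e-2), (k1 / k2) * 1e-2)
print(k_open_closed(k1, k2, kexg=1e4), k1)
```

With these made-up numbers the two pathways contribute comparably, which is the mixed case in which neither mechanism can be neglected.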

rather a probability that a given NH will exchange with the hydrogen isotope in solvent water. This deceptively simple measurement of protein behavior opens a window on complex dynamic properties and associated biological functions, and monitors a subset of these. It is not surprising

that hydrogen-exchange analyses and interpretations are subjects of some controversy, occasional misconceptions, and fairly regular reinvention of explanations.

A common hydrogen-exchange protocol is to dilute or dialyze a native protein from H2O into deuterated buffer (D2O) and measure its nuclear magnetic resonance (NMR) spectrum at varying times after transfer to D2O. From the decay in intensity of an assigned N¹H peak in a series of spectra taken at specified time intervals, the observed hydrogen-exchange rate constant, kobs, is obtained for that NH at the experimental pH and temperature. The beauty of the method is that many NH reporters scattered throughout the molecule are monitored simultaneously under the same conditions. The complication is that, under a given set of conditions and in the same spectrum, some NHs may exchange by one mechanism while others exchange by a second mechanism. This feature of hydrogen exchange is explained by the two-process model2,3 developed below. Because it is crucial to specify the mechanism (folded state exchange versus exchange by cooperative unfolding versus some of both), and because the range of exchange times is broad, exchange rate constants of the same NH groups should be obtained for an incremental series of temperatures and pH values. Within one protein at one set of conditions (pH, temperature, ionic strength, etc.), the typical range of exchange times is so broad that many are too fast (minutes or less) or too slow (days or more) to measure from the decay of ¹H NMR peaks in D2O solvent. To obtain reliable values of kobs for all or most NHs in a protein, it is usually necessary to systematically vary conditions over much of the range of pH from 3.5 to 9, and of temperature from 3° to the thermal unfolding midpoint. These concepts are discussed in detail below.

Exchange rates are often expressed as protection factors, equal to the ratio kcalc/kobs, where kobs is the observed rate constant and kcalc is the exchange rate constant for an NH in a small peptide of equivalent sequence computed from empirical, nearest-neighbor rules.4 The most rapidly exchanging amides are on the surface of a folded protein. Some surface protons exchange with rate constants that are an order of magnitude less than kcalc,5,6 demonstrating that NHs may be “protected” from free exchange even when accessible to solvent and not intramolecularly H-bonded in the crystal structure.

2 C. Woodward and B. Hilton, Biophys. J. 32, 561 (1980).
3 C. Woodward, I. Simon, and E. Tüchsen, Mol. Cell. Biochem. 48, 135 (1982).
4 Y. Bai, J. S. Milne, L. Mayne, and S. W. Englander, Proteins Struct. Funct. Genet. 17, 75 (1993).
5 E. Tüchsen and C. Woodward, J. Mol. Biol. 185, 405 (1985).
6 E. Tüchsen and C. Woodward, J. Mol. Biol. 193, 793 (1987).
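
To make the two quantities introduced above concrete, the following Python sketch estimates kobs from a first-order decay of peak intensity and converts it to a protection factor kcalc/kobs. It is a sketch under stated assumptions: the time points, intensities, and the value of kcalc are invented for illustration (in practice kcalc comes from the nearest-neighbor rules of Bai et al.4), and a log-linear least-squares fit stands in for whatever fitting procedure a given laboratory uses.

```python
import math

def fit_first_order(times, intensities):
    """Fit I(t) = I0 * exp(-kobs * t) by linear regression of ln(I) on t.

    Returns (kobs, I0). Assumes all intensities are positive and that the
    decay runs to a zero baseline.
    """
    y = [math.log(i) for i in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(y) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    sty = sum((t - mean_t) * (v - mean_y) for t, v in zip(times, y))
    slope = sty / stt
    return -slope, math.exp(mean_y - slope * mean_t)

def protection_factor(k_calc, k_obs):
    """Protection factor = kcalc / kobs."""
    return k_calc / k_obs

# Hypothetical noise-free decay generated with kobs = 0.05 per hour.
times_h = [0.0, 5.0, 10.0, 20.0, 40.0]
peaks = [1000.0 * math.exp(-0.05 * t) for t in times_h]
k_obs, i0 = fit_first_order(times_h, peaks)

k_calc = 3600.0  # hypothetical small-peptide rate constant, per hour
print(f"kobs = {k_obs:.3f} per hour, half-time = {math.log(2) / k_obs:.1f} h")
print(f"protection factor = {protection_factor(k_calc, k_obs):.2e}")
```

A log-linear fit weights noisy late time points heavily, so a nonlinear fit with an explicit baseline term may be preferable for real data.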

This chapter describes how we go about designing hydrogen-exchange experiments and analyzing the data. A few recently reported and particularly promising studies on hydrogen exchange are briefly reviewed toward the end. The discussion draws heavily from our own work, and no attempt is made to include all relevant literature. It is assumed that the hydrogen-exchange measurement is the decay rate of assigned peaks in high-resolution ¹H NMR spectra, unless otherwise specified.

Two-Process Exchange in Proteins

Hydrogen-exchange data are often gathered in the context of one or more of the following experiments. (1) NMR structure determinations, or initial explorations of NMR-detected structural features, frequently include characterization of the slower exchanging NHs. Typically these are located in center regions of secondary structural elements. (2) Hydrogen exchange is used to characterize native state flexibility and fluctuation, or native state surface groups, and their perturbation by ligands or cosolvents. (3) Hydrogen exchange is commonly used to characterize global protein stability and its pH or temperature dependence.

The most useful first step in developing a new hydrogen-exchange system is determination of which NHs exchange by which mechanism, and under what conditions. This can be readily accomplished by sliding the observation window along pH and temperature axes using the two-process model as a guide (Fig. 1A). The two parallel processes referred to are exchange mediated by fluctuations of the folded state (folded state exchange mechanism) and exchange mediated by global unfolding/refolding (the unfolding mechanism), as illustrated in Fig. 1A. The observed exchange rate constant is the sum of contributions from both, that is, kobs = kN + kD. An analogy for the isotope label on each NH is a water reservoir that has two drain pipes, each with a flow-controlling valve (the rate constant for that process); water flows out through either or both pipes depending on how the valves are adjusted relative to each other (the relative magnitudes of kN and kD). An NH may exchange by the first mechanism under one set of pH and temperature conditions (kN ≫ kD), by the second under another set of conditions (kD ≫ kN), or by both simultaneously under a third set of conditions (kN ≈ kD). Whether by one mechanism only, or with both contributing, exchange is “two-process” because the pathways operate simultaneously for each NH, albeit with different rates and with different dependencies on solvent conditions. Considering the entire protein at a given pH and temperature, some NHs in a protein may exchange by one mechanism and others exchange by the second mechanism, or a mixture of both. The two mechanisms are most easily distinguished by their temperature or urea dependence,

because the unfolding mechanism is much more sensitive to changes in temperature or denaturant concentration (below). The first proposal of dual parallel exchange mechanisms came from tritium-exchange experiments; it explained how some NHs exchange from the folded state and others exchange via global unfolding.7,8 In tritium experiments, bulk exchange averaged over the entire molecule is monitored, and rate constants of individual NHs are not obtained. As NMR techniques improved and each resonance peak could be assigned to an individual NH, a two-process model was developed to explain the complex behavior of individual, slowly exchanging NHs in bovine pancreatic trypsin inhibitor (BPTI).2,3 The model explained how one NH may change from a folded state mechanism to the unfolding mechanism depending on solution conditions, and how the exchange by unfolding may at times exhibit pH-independent kinetics (EX1 behavior, described below). Mistakes have been made when two-process exchange was not taken into account in interpreting hydrogen-exchange data, most often in ascribing to native state fluctuations the exchange properties that were actually measured for the global unfolding process. The validity of the two-process model is demonstrated by a number of results, including the switch for some NHs from folded state exchange to the unfolding mechanism as temperature is raised,2,9 a different effect of urea on the two processes,9 and the good agreement of ΔΔG(HX) with ΔΔGu for the unfolding mechanism (Table I, described below). Methods for fitting out-exchange data to the two-process model10,11 have been published.

First Steps

It is simplest to begin by identifying NHs that exchange only or primarily by the unfolding mechanism, and the range of conditions under which this is the case. These NHs are more easily recognized because they are the last to exchange, and all tend to have a similar protection factor since all are governed by the same unfolding/folding transition. To identify this group, reference conditions should be established, and then variations of pH and temperature over appropriate intervals will sketch in the essential outlines of the system. It is reasonable to start at ambient temperature and

7 A. Rosenberg and K. Chakravarti, J. Biol. Chem. 243, 5193 (1968).
8 C. Woodward and A. Rosenberg, J. Biol. Chem. 246, 4114 (1971).
9 K.-S. Kim and C. Woodward, Biochemistry 32, 9609 (1993).
10 H. Qian, S. Mayo, and A. Morton, Biochemistry 33, 8167 (1994).
11 L. Swint-Kruse and A. D. Robertson, Biochemistry 35, 171 (1996).

384

[17]

cooperativity in protein folding and assembly TABLE I Comparison of G(HX) and Gu for BPTI Wild Type and Mutantsa

WT Y21A F22A Y23A Y35G G37A N43G N44G F45A a

Tm  ( )

H(Tm) (kcal/mol)

Gu (kcal/mol)

Gu (kcal/mol)

G(HX) (kcal/mol)

87 65 78 54 69 66 55 66 49

70 53 68 45 38 47 46 49 36

9.0 4.2 7.8 3.1 4.0 4.1 3.3 4.3 2.1

— 4.8 1.2 5.9 5.0 4.9 5.7 4.7 6.9

— 5.1 2.2 7.0 5.7 4.7 6.0 4.7 7.2

Values are from differential scanning calorimetry at pH 2, except the last column in which values are from hydrogen-exchange experiments at pH 3.5. The parameters are temperature at the midpoint of unfolding, Tm, enthalpy change at Tm, H(Tm), and free energy change parameters defined in the text. The table is compiled from published values [K.-S. Kim, J. Fuchs, and C. Woodward, Biochemistry 32, 9600 (1993); K.-S Kim et al., Protein Sci. 2, 588 (1993)]. Mutants of bovine pancreatic trypsin inhibitor (BPTI) in the first column are named to indicate the WT amino acid (one-letter code), the sequence number, and the mutant amino acid.

around neutral pH, after confirming the absence of self-aggregation under these conditions and at the protein concentration of the NMR experiment. Hydrodynamic methods such as gel filtration are useful for demonstrating the absence of aggregation. Also, the concentration dependence of the line widths and chemical shifts in the NMR spectra themselves is also an indicator of the presence of self-aggregation. Specific protocols for setting up and acquiring NMR spectra with resolved resonances in the fingerprint (amide NH) region, and for determining exchange rates constants from a timed sequence of spectra, may be found in any of a number of publications.9,11 Briefly, from two-dimensional NMR spectra, often 1H–15N heteronuclear single-quantum coherence (HSQC), total correlation spectroscopy (TOCSY), nuclear Overhauser enhancement spectroscopy (NOESY), or correlation spectroscopy (COSY), in which the amide backbone (fingerprint) region is well resolved, the peak volumes of assigned NH resonances are measured as a function of time. Exchange rate constants at specified pH and temperature are obtained from fits to the volume and time data to the first-order rate equation. If the initially chosen conditions are appropriate, many NHs will exchange with half times in the minute-to-hour range, while a group of slower exchanging NHs will have half-times in the day, or even the week,

[17]

hydrogen exchange—protein folding and dynamics

385

range. If all NHs exchange very quickly under the initially chosen conditions, then the pH and/or temperature should be lowered until useful reference conditions are identified. Two critical issues may then be addressed. First, determine whether the pH dependence of the observed rate constants is consistent with first-order catalysis of the chemical step by hydroxide ion (HO). This is the case if the rate increases by one order of magnitude for each increasing pH unit in the range pH 3.5–9 while the temperature is held constant. This is by far the most common observation and is often taken as a demonstration of EX2 exchange (discussed in the next section). Deviations from this pH dependence are interesting in their own right and should be pursued. They might arise from a number of sources, including a switch to EX1 exchange, a pH dependence of global stability, and/or a pH-dependent conformational change affecting folded state exchange, as discussed below. Second, identify conditions under which a group of slowest exchanging protons has roughly the same protection factor. To scan for these conditions, it is most useful to increase temperature while holding pH constant. The slowest exchanging group with similar protection factors at the higher temperatures exchange only or primarily by the unfolding mechanism. After identification of the conditions under which specific NHs exchange by the unfolding mechanism, further analysis of protein folding energetics can be carried out using the EX2/EX1 formalism developed in the next section. EX2/EX1 Model
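As a practical entry point, the pH test used in the preceding section as the usual demonstration of EX2 exchange can be written as a slope check. For first-order catalysis by hydroxide ion, log kobs rises by about one unit per pH unit at constant temperature, so a linear fit of log10(kobs) against pH should give a slope near 1. The sketch below is a minimal illustration in Python; the function names, the tolerance of 0.2, and the rate constants are hypothetical and are not taken from the experiments discussed here.

```python
import math


def ph_slope(ph_values, k_obs_values):
    """Least-squares slope of log10(k_obs) versus pH at constant temperature."""
    logs = [math.log10(k) for k in k_obs_values]
    n = len(ph_values)
    mean_x = sum(ph_values) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ph_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ph_values, logs))
    return sxy / sxx


def consistent_with_ex2(ph_values, k_obs_values, tolerance=0.2):
    """True if the slope is within `tolerance` of 1 (hypothetical cutoff)."""
    return abs(ph_slope(ph_values, k_obs_values) - 1.0) <= tolerance


# Hypothetical rate constants (arbitrary units) for one NH.
ph_points = [4.0, 5.0, 6.0, 7.0]
k_base_catalyzed = [1e-7, 1e-6, 1e-5, 1e-4]   # 10-fold per pH unit
k_leveling_off = [1e-5, 5e-5, 8e-5, 9e-5]     # approaching a plateau

print(ph_slope(ph_points, k_base_catalyzed))
print(consistent_with_ex2(ph_points, k_leveling_off))
```

A slope well below 1 does not by itself identify the cause; as noted above, it may reflect a switch to EX1, a pH dependence of global stability, or a pH-dependent conformational change.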

The EX2/EX1 analysis introduced by Linderstrøm-Lang, Hvidt, and associates1 was derived for the following scenario. Exchange kinetics of an NH are regulated by a preexchange equilibrium between “closed” and “open” conformations, NH-closed and NH-open in Fig. 1B, and exchange occurs only from NH-open. Under experimental conditions where the protein is essentially folded, the equilibrium favors NH-closed. Two limiting cases are apparent. When k2 ≫ k1 + kexg, the observed exchange rate constant, kobs, is approximately equal to (k1/k2) × kexg, where k1 and k2 are the rate constants for opening and closing and kexg is the rate constant for exchange from NH-open. This is called the EX2 mechanism. When k2 ≪ kexg, then kobs ≈ k1; this is called the EX1 mechanism. Assumptions of the EX2/EX1 model are that the open/close equilibrium is pH independent, and that NH-open is fully exposed to solvent; as a consequence kexg is approximated by the exchange rate constant of a model amide, kcx, where the subscript refers to the chemical exchange step modeled by small amides and polyalanine. Using small model compounds, the chemistry, and the associated pH and temperature dependencies of the rate constant for the chemical step, could be elucidated.1 In the EX2 regime, when kobs ≈ (k1/k2)kexg, taking kexg ≈ kcx accounts for the observation that most native state exchange rate constants have the same pH dependence as model amides, since the observed rate constant is proportional to kexg. In present-day applications of the EX2 formalism, kexg is taken as equal to kcalc, the exchange rate constant computed from empirical rules for a small peptide with the same neighboring amino acids.4

Hydrogen-Exchange Kinetics Applied to Protein Folding
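Before the formalism is applied to global unfolding, the two limiting cases of the preceding section can be checked numerically. The sketch below assumes the usual steady-state expression for the scheme in Fig. 1B, kobs = k1 kexg/(k1 + k2 + kexg); that general expression is not written out in the text and is included here as an assumption, as are the function names and the illustrative rate constants.

```python
def k_obs_general(k1, k2, k_exg):
    """Assumed steady-state rate constant for closed <-> open -> exchanged."""
    return k1 * k_exg / (k1 + k2 + k_exg)


def k_obs_ex2(k1, k2, k_exg):
    """EX2 limit: closing is much faster than opening plus exchange."""
    return (k1 / k2) * k_exg


def k_obs_ex1(k1):
    """EX1 limit: exchange from NH-open is much faster than closing."""
    return k1


# Hypothetical rate constants in s^-1.
opening, closing = 1.0e-3, 1.0e3

slow_chemistry = 1.0      # k2 >> k1 + k_exg, so EX2 is expected
fast_chemistry = 1.0e7    # k2 << k_exg, so EX1 is expected

print(k_obs_general(opening, closing, slow_chemistry),
      k_obs_ex2(opening, closing, slow_chemistry))
print(k_obs_general(opening, closing, fast_chemistry),
      k_obs_ex1(opening))
```

Because kexg rises with pH in the base-catalyzed regime while k1 and k2 are taken as pH independent, the same expression moves from the first limit toward the second as pH is raised.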

Application of the EX2/EX1 formalism (Fig. 1B) to the global unfolding mechanism is straightforward, since the preequilibrium transition in this case is well described. Unfolding and folding rate constants, ku and kf, and the equilibrium constant for unfolding/folding, Kfold (= ku/kf), can be determined from other types of biophysical experiments. In the unfolding mechanism, NH-closed is the native state and NH-open is the globally denatured state pictured in Fig. 1A, so that k1 = ku and k2 = kf, and kcx,D is the rate constant for exchange from the denatured (D) state. The free energy change for global unfolding can be computed from the observed hydrogen-exchange rate constant, kobs, as

$$\Delta G(\mathrm{HX}) \approx -RT \ln\left(k_{\mathrm{obs}}/k_{\mathrm{calc}}\right) \qquad (1)$$

if three conditions are met. (1) Exchange is EX2, and kobs ≈ Kfold × kcx,D (so Kfold ≈ kobs/kcx,D). (2) The local region around the exchanging NH in the D conformation is essentially disordered so that kcx,D ≈ kcalc. (3) Exchange is solely by the unfolding mechanism with no contribution from folded state exchange. Usually the EX2 condition is met, that is, the exchange rate shows first-order base catalysis, the same as for model peptides, so that at pH > 3.5 the rate constant increases 10-fold for every unit increase in pH. (Occasionally, EX1 behavior, i.e., pH-independent exchange for the unfolding mechanism, is observed, as discussed below.) ΔG(HX) computed as described is often, but not always, similar to ΔGu for global denaturation determined by methods such as calorimetry or chemical denaturation. The simplest explanation for the failure of ΔG(HX) to agree with ΔGu is that the latter is in fact correct, but kcx,D ≠ kcalc. If, for example, the denatured state retains nonrandom structure around the exchanging NH or has large hydrophobic side chains near the exchanging NH, then kcx,D < kcalc, and, to that extent, Eq. (1) gives a free energy change for global unfolding that overestimates the actual ΔGu.
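Equation (1) and the overestimate described above can be illustrated with a short calculation. The following Python sketch assumes R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹; the function names and the example rate constants are hypothetical, chosen only to show the arithmetic.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def delta_g_hx(k_obs, k_calc, temp_k):
    """Eq. (1): apparent free energy of global unfolding from an EX2 rate."""
    return -R_KCAL * temp_k * math.log(k_obs / k_calc)


def overestimate(k_cx_d, k_calc, temp_k):
    """Amount by which Eq. (1) exceeds the true value when k_cx,D < k_calc."""
    return R_KCAL * temp_k * math.log(k_calc / k_cx_d)


# Hypothetical NH whose observed rate is one millionth of k_calc at 25 C.
dg = delta_g_hx(k_obs=1.0e-6, k_calc=1.0, temp_k=298.15)

# If exchange from the D state were 10-fold slower than k_calc,
# Eq. (1) would be too high by RT ln 10.
bias = overestimate(k_cx_d=0.1, k_calc=1.0, temp_k=298.15)

print(f"dG(HX) = {dg:.2f} kcal/mol; overestimate = {bias:.2f} kcal/mol")
```

With these numbers, a protection of six orders of magnitude corresponds to roughly 8 kcal/mol, and each 10-fold discrepancy between kcx,D and kcalc shifts the result by about 1.4 kcal/mol.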


However, even when kcx,D ≠ kcalc, a reliable value of ΔΔG for wild-type versus mutant protein can be derived from the ratio of observed rate constants (unfolding mechanism only) of the same NH in wild type (WT) and mutant. ΔΔG is the difference between mutant and wild-type values of ΔG for global denaturation: ΔΔG(HX) = RT ln(KWT/Kmut), where the equilibrium constants are given by KWT = kobs,WT/kcx,D and Kmut = kobs,mutant/kcx,D. Assuming that the chemical exchange rate is the same for both WT and mutant, the kcx,D terms cancel out. It follows that

$$\Delta\Delta G(\mathrm{HX}) = RT \ln\left(k_{\mathrm{obs,WT}}/k_{\mathrm{obs,mutant}}\right) \qquad (2)$$

and the question of whether kcx,D ≠ kcalc does not arise.12 For this reason, we confirmed the two-process model by comparison of ΔΔG(HX) to ΔΔGu, with the latter determined by other methods at comparable solution conditions. Table I shows the results with eight BPTI mutants12,13 for all of which the crystal or NMR structure has been determined. Except for F22A, these mutations are among the most destabilizing for single amino acid replacements in the literature. This is not surprising, as the replaced residues are either packed in the slow exchange core (Y21A, Y23A, F45A), or their absence disrupts a network of buried polar interactions (N43G, N44G) or causes rearrangement of the flexible loops (Y35G). Mutant G37A offers a special case, as its NMR structure is indistinguishable from WT but its dynamics are quite different14; its destabilization is attributed to a combination of strain in the 36–37 peptide bond and increased internal motions.

Agreement, or not, between ΔG(HX) and ΔGu, and related questions of if/when kcx,D ≠ kcalc in the D state, are ongoing issues. It has been suggested15,16 that correction of ΔG(HX) for the nearest-neighbor effects of proline arising from cis versus trans isomers accounts for differences between ΔG(HX) and ΔGu. To examine this interesting possibility, Table 1 in Huyghues-Despointes et al.15 provides a useful summary of ΔG(HX) and ΔGu values reported for the same protein. Without the cis/trans proline correction of kcalc, 15 of 17 proteins have ΔG(HX) values within 1.5 kcal/mol of ΔGu, and 12 are within 1 kcal/mol. Huyghues-Despointes et al.15 note that the proline correction brings ΔG(HX)*, the adjusted value of the free energy change, to within 1 kcal/mol for 19 of the 20 cases (with RNase A counted three times). However, the correction improves agreement to within 1.5 kcal/mol for only three proteins, RNase A, RNase T1, and RNase H. Although not of similar structure, these proteins are highly stable and it is possible that their D state retains residual structure under the experimental conditions (where the N state is favored).

12 K.-S. Kim, J. Fuchs, and C. Woodward, Biochemistry 32, 9600 (1993).
13 K.-S. Kim et al., Protein Sci. 2, 588 (1993).
14 J. L. Battiste, R. Li, and C. Woodward, Biochemistry 41, 2237 (2002).
15 B. M. P. Huyghues-Despointes, J. M. Scholtz, and C. N. Pace, Nat. Struct. Biol. 6, 910 (1999).
16 Reference 15 incorrectly cites Li and Woodward25 as disagreeing that protein conformational stabilities can be estimated from exchange rate constants of the slowest exchanging NHs. The error was noted in a personal communication from the authors of reference 15.

Our view of how closely global stabilities from hydrogen exchange match equivalent values from other conventional methods is summarized as follows. The value of ΔΔG(HX) calculated from Eq. (2) is very likely to be a good estimate of ΔΔGu. Regarding ΔG, reasonable estimates of the change in free energy for global unfolding are obtained from Eq. (1) for many proteins. For moderately stable proteins (ΔG in the range 4.0–8.5 kcal/mol), presently reported values of ΔG(HX) are the same as, or up to 1.5 kcal/mol higher than, ΔGu. For very stable proteins (ΔG ≥ 9 kcal/mol), ΔG(HX) may be 2.0–2.5 kcal/mol higher than ΔGu. Correction15 for proline cis/trans effects may be appropriate, but this is not proven; instead the D state could have residual structure, meaning simply that kcx ≠ kcalc for the slowest exchanging NHs, and this would account for discrepancies observed between ΔG(HX) and ΔGu. Hopefully, the question of proline effects will be further probed by mutational replacement of proline(s).

In addition to the free energy change, the enthalpy change for reversible global unfolding, ΔHu, can be estimated from hydrogen-exchange rates.9 Ea, the apparent activation energy of kobs, is computed in the usual way from the Arrhenius relationship and plots of log kobs versus reciprocal temperature.
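The Arrhenius analysis can be sketched as a linear fit of ln kobs against 1/T, whose slope is −Ea/R. The Python example below is a minimal illustration, assuming R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹; the function name and the synthetic rate constants (generated with an arbitrary Ea of 60 kcal/mol) are hypothetical. If log10 kobs is plotted instead of ln kobs, the slope must also be multiplied by 2.303.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def activation_energy(temps_k, k_obs_values):
    """Apparent Ea (kcal/mol) from the slope of ln(k_obs) versus 1/T."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in k_obs_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx          # equals -Ea/R for Arrhenius behavior
    return -slope * R_KCAL


# Synthetic data: rate constants generated from a known, arbitrary Ea.
temps = [303.15, 308.15, 313.15, 318.15, 323.15]
ea_true = 60.0   # kcal/mol, hypothetical
ln_a = 80.0      # ln of the pre-exponential factor, hypothetical
rates = [math.exp(ln_a - ea_true / (R_KCAL * t)) for t in temps]

ea_fit = activation_energy(temps, rates)
print(f"fitted Ea = {ea_fit:.1f} kcal/mol")
```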
Under EX2 conditions, and when exchange is only by the unfolding mechanism, Ea is essentially the sum of ΔHu(HX) and the activation energy of the chemical step, usually taken to be around 17 kcal/mol.17 ΔHu determined by other experimental methods usually compares favorably to ΔHu(HX).

After the overall exchange behavior has been characterized by the experiments described above, a systematic search for EX1 exchange by the unfolding mechanism should be undertaken. EX1 behavior of the unfolding mechanism means that at pH > 8, plots of the rate constant versus pH level off and display a pH-independent plateau,2,18 which signals a switch from EX2 to EX1. This occurs because kcx,D, which increases by an order of magnitude with each pH unit in the base-catalyzed regime (pH > 3.5), eventually becomes so large relative to the refolding rate that every unfolding event leads to exchange. Thus, unfolding is the rate-limiting step, kcx,D no longer enters the rate expression, and the observed rate constant equals the unfolding rate constant. The special interest in EX1 behavior for exchange by the unfolding mechanism is that it provides the unfolding rate constant under conditions that favor folding.18 In searching for EX1 behavior, it is important to show by other methods that global protein folding is pH independent over the higher pH range studied.

Having identified the NH groups exchanging by the unfolding mechanism under specified conditions, one can then ask a number of questions about the effects on global unfolding energetics of various perturbants, including amino acid mutations and chemical denaturants.

17 J. J. Englander, D. B. Calhoun, and S. W. Englander, Anal. Biochem. 92, 517 (1979).
18 C. B. Arrington, L. M. Teesch, and A. D. Robertson, J. Mol. Biol. 285, 1265 (1999).

Folded State Exchange

The NHs exchanging exclusively or partially by the folded state mechanism(s) are identified as those not exchanging only by the unfolding mechanism. They exchange more rapidly, and with a lower temperature dependence, than NHs exchanging only by the unfolding mechanism, and their rate constants are widely spread. Rates for NHs exchanging most rapidly by the folded state mechanism are similar or equal to those of model compounds. NHs exchanging most slowly by the folded state mechanism approach the rate for the unfolding mechanism; for such NHs the observed rate constant may have contributions from both the folded state and the unfolding mechanism (kN ≈ kD). Exchange of buried amides with rates greater than those for the unfolding mechanism, however fast or slow, implies that native state fluctuations populate conformations in which buried, internally H-bonded NHs react with solvent water and catalyst ions (H+ or OH−). Three models for the folded state mechanism are commonly discussed. (1) In the “penetration” model, multiple small, noncooperative internal fluctuations create ensembles of interconverting conformations of varying protection and provide transient access of solvent to buried NHs.3 Lumry and Rosenberg proposed a dynamic property of proteins, called mobile defects, to account for folded state exchange.19 (2) In a “local unfolding” model, an element of secondary structure undergoes cooperative breakage of H bonds, and in the exchangeable conformation that element is locally unfolded.20 (3) In a third model, exchange occurs from first-excited states that can have a slightly higher free energy yet very different conformations than the native state, and that could be produced by the types of protein motions invoked in both penetration and local unfolding mechanisms.21

19 R. Lumry and A. Rosenberg, Col. Int. CNRS L’Eau Syst. Biol. 246, 55 (1975).
20 S. W. Englander and N. R. Kallenbach, Q. Rev. Biophys. 16, 521 (1984).

In summary, there is no definitive proof for or against penetration or local unfolding as an explanation of folded state exchange. Several experiments that may distinguish between the two have been discussed. (1) In our view, the pH dependence of histidine C-2 hydrogen exchange supports a penetration model because the effective pKa to which exchange is linked arises from burial of the imidazole.3 (2) The temperature dependence of adjacent NHs in a secondary structural element is not expected to be the same if exchange is by a penetration mechanism but should be similar for local unfolding.3 The same is true of pressure dependence. Recent experiments22 indicate that there is no correlation of the temperature or pressure dependence of folded state exchange with secondary structural elements, consistent with a penetration model. (3) In molecular dynamics simulations of native proteins, intramolecular H bonds in secondary structure commonly break and reform noncooperatively on the picosecond time scale, and their breakage/reformation is not necessarily rate limiting once solvent gains access to buried regions; we take this as an indication that cooperative H-bond breakage is not required for isotope exchange with solvent. A recent 1.5-ns molecular dynamics simulation23 shows over 200 events in which water molecules reside for at least 300 ps. (4) Periodicity of exchange rates in amphiphilic secondary structure is expected to be different for the two models. A penetration model in which exchangeable species are on average approximated by the crystal structure predicts that in an amphiphilic helix, NHs on the interior side of the helix will exchange more slowly than those on the exterior, and exchange rates will show an i, i+3 or i+4 periodicity (similarly, β-sheet may show i, i+1 periodicity). Exchange rates for some helices show clear periodicity, and some do not; in general, amphiphilic helices with slower exchanging NHs show the most prominent periodicity.
Analysis of the Folded State Mechanism

Analysis of data for NHs whose exchange is mediated by folded state motions depends on the model for the folded state process favored by the investigator. One issue is whether to express and discuss folded state exchange in terms of rate constants and protection factors or, alternatively, in terms of ΔG(HX)(folded state mechanism) computed from Eq. (1) when kobs is the observed exchange rate constant for folded state exchange. Although aspects of EX2 analysis are valid for folded state exchange, we think that application of Eq. (1) to folded state exchange rates carries with it a questionable literal interpretation of the EX2 formalism. In particular, it is not valid to stipulate that because exchange is EX2, in that the observed pH dependence is first order in OH− ion, there is a single, two-state, NH-closed to NH-open motion governing exchange of that NH. This physical interpretation of an open–close transition makes sense for the unfolding mechanism, but not necessarily for the folded state mechanism. This view will now be elaborated.

The central feature of EX2/EX1 is a preequilibrium transition between two conformational states, one in which the NH is not exchangeable and one in which it is. For the unfolding mechanism, the pretransition is reversible global unfolding. For folded state exchange, the preequilibrium is often assumed to be a local open–close transition, again between one nonexchangeable conformation and one exchangeable conformation of the NH being measured. However, preequilibrium transitions may also be envisioned when folded state exchange is mediated by numerous transitions within an ensemble of conformations that differ in the probability of exchange of a specified NH. In a penetration model, it is not one open–close transition but the collective effect of numerous small fluctuations, occurring over a range of frequencies and energy barriers between structures with varying degrees of local order, that transiently exposes buried NHs to solvent on the exchange time scale. In this case, kcx,N still enters the expression for the observed rate constant of folded state exchange and accounts for the first-order dependence on OH− ion. Here, however, the parameter kop/kcl computed with the EX2 equation does not necessarily describe any particular transition.

In summary, to express folded state exchange data as ΔG(HX)(folded state mechanism) computed from Eq. (1) predisposes the discussion to a literal physical interpretation of the EX2/EX1 model: that a single open–closed transition is being measured, and that the equilibrium constant and the time scale of the open and closed forms are given by the EX2/EX1 equations. We think that more advanced mathematical/physical methods are needed to analyze exchange kinetics of the folded state process in terms of an ensemble of interconverting conformations with varying probabilities of a reaction between a given NH, solvent H2O, and OH− or H+ catalyst ions. Until that time, the more useful descriptors of folded state exchange are the observed exchange rate constant and the protection factor, kcalc/kobs. In practice, discussion of the folded state mechanism for the various NHs with reference to quantitative differences in protection factors is usually little different from the same discussion with reference to apparent stabilities [ΔG(HX)(folded state mechanism) values].

Further, even if one does assume an orthodox EX2 model, the value of ΔG(HX)(folded state mechanism) calculated with Eq. (1) carries the implicit assumption that local environmental effects on the NH-open (folded state exchange) are the same as for an NH freely exposed to solvent in a small peptide (that is, that kcx,N = kcalc). The observation5,6 that for many surface protons, exposed to water in the crystal structure and not intramolecularly H-bonded, kobs ≪ kcalc, cautions against the assumption that kcx,N = kcalc, even if one favors a local unfolding model. In any case, published articles should explicitly state observed exchange rate constants, because these can be lost if ΔG(HX) is reported but kcalc is unspecified (kcalc values vary somewhat depending on the investigator, due primarily to variations in temperature and pH corrections).

Once NHs exchanging by the folded state mechanism are identified, one can ask a number of questions about the effects of various perturbants on folded state fluctuations, as discussed in the next sections. Also, ligand-binding sites can be mapped in favorable systems if one can easily separate out effects from global stabilization due to ligand binding.

21 D. W. Miller and K. Dill, Protein Sci. 4, 1860 (1995).
22 M. E. Dixon, T. K. Hitchens, and R. G. Bryant, Biochemistry 39, 248 (2000).
23 A. Garcia and G. Hummer, Proteins 38, 261 (2000).

Temperature Can Switch the Exchange Mechanism
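The switch treated in this section can be sketched numerically before the details are given. The Python example below assumes that the observed rate constant is the sum of the two parallel processes, kobs = kN + kD, and that each follows Arrhenius behavior when extrapolated from a reference temperature. The activation energies (32 and 78 kcal/mol) are of the magnitude quoted in this section for BPTI; the reference rate constants, the reference temperature, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def arrhenius(k_ref, ea, temp_k, t_ref):
    """Extrapolate a rate constant from t_ref to temp_k with activation energy ea."""
    return k_ref * math.exp(-(ea / R_KCAL) * (1.0 / temp_k - 1.0 / t_ref))


def k_obs_two_process(temp_k, kn_ref, kd_ref, ea_n, ea_d, t_ref):
    """Sum of folded state (kN) and unfolding (kD) contributions."""
    k_n = arrhenius(kn_ref, ea_n, temp_k, t_ref)
    k_d = arrhenius(kd_ref, ea_d, temp_k, t_ref)
    return k_n + k_d


def crossover_temperature(kn_ref, kd_ref, ea_n, ea_d, t_ref):
    """Temperature (K) at which kN equals kD, from the two Arrhenius lines."""
    inv_t = 1.0 / t_ref - R_KCAL * math.log(kn_ref / kd_ref) / (ea_d - ea_n)
    return 1.0 / inv_t


# Hypothetical NH: at the reference temperature kN is 100-fold larger than kD.
T_REF = 309.15                 # K
KN_REF, KD_REF = 1.0e-4, 1.0e-6
EA_N, EA_D = 32.0, 78.0        # kcal/mol

t_switch = crossover_temperature(KN_REF, KD_REF, EA_N, EA_D, T_REF)
print(f"kN = kD near {t_switch - 273.15:.1f} C")
for t in (T_REF, t_switch, t_switch + 10.0):
    print(t, k_obs_two_process(t, KN_REF, KD_REF, EA_N, EA_D, T_REF))
```

Below the crossover the sum is dominated by kN and the Arrhenius plot has the shallower slope; above it kD dominates and the slope steepens, which is the curvature described in the text.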

Temperature variation can switch the same NH between folded state and unfolding exchange mechanisms. Because kD has a higher temperature coefficient, arising from the large enthalpy of global unfolding (above), a rise in temperature accelerates kD more than kN. For NHs whose rates are just faster than those of the very slowest group, the value of kN approaches kD. At some higher temperature, kD > kN, and a switch in mechanism occurs. This is observed as curvature in Arrhenius plots, as shown in Fig. 2. The high-temperature slope of the plot gives Ea equal to ΔHu + Ea(chemical step), as described previously. For BPTI, fits to the data, using 78 kcal/mol for Ea of the unfolding mechanism (measured value for the very slowest exchanging group), give Ea values of 27–40 kcal/mol for the folded state exchange rates. Subtracting 17 kcal/mol as the estimated value17 of Ea(chemical step) gives ΔHu(HX) ≈ 61 kcal/mol, and an apparent enthalpy for the preequilibrium of the folded state mechanism of 10–23 kcal/mol.

Global Stability Is Not Correlated with Folded State Exchange

The lack of correlation of folded state exchange rates with global stability is shown unambiguously by the differential effect of destabilizing mutations on the unfolding mechanism versus the folded state mechanism.12 An effective graphic representation of the effect of amino acid substitutions (or additives) on protein out-exchange is a perturbation plot, in which out-exchange rate constants of the unmodified protein (x-axis) are graphed against the rate constants for the same NHs in the mutant (y-axis). Perturbation plots of WT and mutant exchange rate constants (Fig. 3) clearly illustrate the effect of single amino acid mutations on hydrogen exchange. NHs with the same exchange rate in WT and mutant lie along the diagonal; their rates are not correlated with global stability. This is the case for the majority of amides. Exceptions are readily apparent, as a group along the dotted line, and a group to the left of the diagonal and above the dotted line (Fig. 3). Both groups exchange more rapidly in the mutant. Those along the dotted line exchange by the unfolding mechanism; they are accelerated because destabilization increases ku/kf (Fig. 1A). For example, in F22A the slowest NHs on the lower left (Fig. 3) are several orders of magnitude faster than in WT. In the other three mutants, all with ΔΔG between −4.7 and −5.1 kcal/mol (Table I), the very slowest exchanging NHs are joined along the dotted line by other NHs. These amides exchange in WT by the folded state mechanism (kN ≫ kD), but in the mutant by the unfolding mechanism (kD ≫ kN). The data points in Fig. 3 above the dotted line and to the left of the diagonal have faster folded state exchange rates in the mutant; all are located in the vicinity of the mutation site. In summary, rates for exchange by the folded state mechanism are not correlated with global stability. Some NHs near the mutation site may have faster folded state exchange rates. NHs exchanging by the unfolding mechanism in the mutant may include some that switch from folded state exchange in WT.

Fig. 2. Temperature dependence of NH exchange rate constants of BPTI residues at pH 3.6. Solid lines show simulated curves of kobs as a function of temperature, assuming that kobs = kN + kD and that at each temperature kD is given by the dotted line (lower left) while kN is given by extrapolation of the observed rate constant at 36°C, using an activation energy of 40, 32, or 27 kcal/mol for Gln-31, Phe-45, or Met-52, respectively. Experimental data show the average of observed rate constants for Tyr-21, Phe-22, and Tyr-23, and observed rate constants of Gln-31, Phe-45 (+), and Met-52 (□). Adapted from Fig. 3 in K.-S. Kim and C. Woodward, Biochemistry 32, 9609 (1993), reprinted with permission.

Fig. 3. Perturbation plot of exchange rate constants of BPTI mutants. The log of the rate constant of each NH in wild type (WT) is plotted against the log of the rate constant for the same NH in the mutant (mut).

A distinction between the two parallel exchange processes, based on their response to the denaturant urea, came from early BPTI experiments using tritium exchange; the experiments were later refined in NMR studies of individual NHs.9 As expected, the unfolding mechanism is accelerated by addition of urea, due to destabilization of the protein. In contrast, for folded state exchange some NHs are unaffected, some are slowed, and a few are accelerated. For 8 M urea, the perturbation plot (Fig. 4 in Kim and Woodward9) readily demonstrates the effect of urea on folded state exchange: nine NHs are slower, three to four NHs are accelerated, and five NHs show no effect. It was suggested that buried NHs slowed by urea must be accessible to the denaturant, meaning that folded state fluctuations permit interaction with urea for some buried sites. Also, some surface NHs on side chains of arginine or lysine are slowed, suggesting that they are at or near urea-binding sites of the protein. There are reports and extensive discussions on the implications of cases in which NHs exchanging by the folded state mechanism are affected by addition of urea.24 However one views these, it is clear that urea accelerates the unfolding mechanism in a way clearly distinguishable from its effect on folded state exchange.

Protein Out-Exchange Reveals Motional Domains

Analysis of hydrogen exchange and ring flip rates in light of the crystal or NMR structures of the eight mutants of BPTI in Table I led to the proposal of motional domains in proteins.12 The central domain is the slow exchange core, i.e., those elements of secondary structure (usually mutually packed) that carry the very slowest exchanging NHs.25 Three other domains, relatively independent in flexibility, are (1) faster exchanging buried areas often composed of loops, (2) secondary structural regions not in the core, and (3) the surface of the molecule. Examples of (1) are the overlapping active site loops of BPTI. Perturbation greatly alters their motility relative to the rest of the protein. In one loop mutant, Y35G, the crystal structure undergoes large rearrangements, and folded state exchange as well as aromatic ring flips are much faster in the vicinity of the mutation site, relative to WT. In another loop mutant, G37A, the NMR structure is indistinguishable from WT,14 but folded state exchange rates and ring flips are much faster in the mutant around the replacement site, and essentially unchanged in the rest of the molecule. In a different series of BPTI mutants, partially folded species with only one of the three disulfides intact,26 the core is relatively stable, while the N- and C-terminal regions are disordered. It is clear that different regions of the folded protein have independent local internal motions, and that these may be selectively perturbed.

Other Aspects of Protein Hydrogen Exchange

In addition to temperature, pH, and urea dependence, other areas of protein out-exchange warrant fresh experimental attention. A few particulars are given below. For most of these, similar observations are reported in 24

Y. Qu and D. W. Bolen, Biochemistry 42, 5837 (2003). R. Li and C. Woodward, Protein Sci. 8, 1571 (1999). 26 E. Barbar, V. Licata, G. Barany, and C Woodward, Biophys. Chem. 64, 45 (1997). 25

396

cooperativity in protein folding and assembly

[17]

more recent hydrogen-exchange papers, but only in the ‘‘fine print.’’ It is important to include data on these issues in the abstract and conclusion sections of publications where they may be readily noted by the reader. Surface NHs, those with solvent accessibility and without intramolecular H bonds in the crystal structure, are the most rapidly exchanging amides in a protein. Many surface protons in BPTI exchange with rate constants that are an order of magnitude less than kcalc, demonstrating5,6 that NHs with no apparent structural basis for slowed exchange can be more slowly exchanging than in small extended peptides. Clearly, the local environment of surface amides can differ from extended peptides. Explanations offered for why surface NHs may exchange more slowly than model peptides include rigid local geometry and/or local electrostatic effects that inhibit formation of the charged intermediate in the chemical step. Experiments with surface NHs also showed a marked ionic strength dependence for exchange of some NHs, and confirmed mechanistic aspects of acid-catalyzed exchange. The first demonstration that buried water molecules readily exchange with bulk solvent (10-sec to 5-min time scale) was with 18O-labeled tracer experiments conducted on BPTI.27 Refinements of the water-exchange experiment by NMR methods showed that protein buried waters have a residence time in the range 102–108 s in solution.28 For buried NHs that are H-bonded to buried waters, direct exchange with water (uncatalyzed exchange) apparently contributes to folded state exchange rates.27 The approximate preservation among NHs of rank order of rates at varying temperature and pH is characteristic of hydrogen exchange in proteins.29 This was shown by tritium-hydrogen methods, in which the average number of NHs exchanged per protein molecule is measured as a function of time after transfer to tritium solvent. 
When temperature or pH is jumped to a new condition, the tritium-exchange curve for a protein quickly merges with the curve for exchange measured entirely at postjump conditions. If rank order is approximately preserved, the expected observation for folded state exchange using NMR measurements is that Arrhenius plots of ln kobs versus 1/T for individual NHs will, for the most part, tend not to cross one another at temperatures below onset of the unfolding regime, i.e., when kN ≫ kD. Recent studies of T4 lysozyme by NMR methods22 indicate that the rank order of exchange at varying temperature is preserved as a result of entropy–enthalpy compensation.

27 E. Tüchsen, J. Hayes, S. Ramaprasad, V. Copié, and C. Woodward, Biochemistry 26, 5363 (1987).
28 G. Otting, E. Liepinsh, and K. Wüthrich, J. Am. Chem. Soc. 113, 4363 (1991).
29 C. Woodward and A. Rosenberg, J. Biol. Chem. 246, 4105 (1971).
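The Arrhenius analysis described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it fits ln kobs versus 1/T by ordinary least squares for one NH and checks whether a set of NHs keeps the same slow-to-fast order at every temperature. The function names, amide labels, and rate constants are hypothetical and are not taken from the studies cited.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def arrhenius_fit(temps_K, k_obs):
    """Least-squares fit of ln k = ln A - Ea/(R*T).

    Returns (Ea in J/mol, ln A).
    """
    x = [1.0 / t for t in temps_K]
    y = [math.log(k) for k in k_obs]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx              # slope of the Arrhenius plot equals -Ea/R
    ln_a = y_mean - slope * x_mean
    return -slope * R, ln_a


def rank_order_preserved(rates_by_temp):
    """True if the NHs sort in the same slow-to-fast order at every temperature.

    rates_by_temp maps temperature (K) -> {NH label: k_obs}.
    """
    orders = [tuple(sorted(rates, key=rates.get)) for rates in rates_by_temp.values()]
    return all(order == orders[0] for order in orders)


# Hypothetical rate constants (s^-1) for two amides at two temperatures.
example = {
    288.15: {"NH-a": 2.0e-6, "NH-b": 5.0e-4},
    308.15: {"NH-a": 3.0e-5, "NH-b": 4.0e-3},
}
print(rank_order_preserved(example))
ea, ln_a = arrhenius_fit([288.15, 308.15], [2.0e-6, 3.0e-5])
print(f"Apparent Ea for NH-a: {ea / 1000:.0f} kJ/mol")
```

Arrhenius lines that do not cross over the measured range correspond to `rank_order_preserved` returning True for the rate constants at those temperatures.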


Hydrogen exchange occurs from protein crystals. Comparison of the same amide group in solution and in crystal experiments shows that some buried NHs exchange with the same rates in the two different states, while others are markedly slowed.30 Extension of these types of experiments should yield insight into the types of internal motions that are responsible for folded state exchange.

Exchange of buried primary amides in asparagine residues in BPTI is intriguing because two Asn side chains play unusual roles in buried interactions. NMR methods can be used to determine whether the two hydrogens on the same nitrogen have correlated or uncorrelated exchange. For Asn-43 and -44 in BPTI, exchange of the primary amide hydrogens of each Asn is uncorrelated,31 meaning that hydrogen exchange is several orders of magnitude more restricted than CO-NH2 rotation, implying that the primary amide group flips many times before isotope exchange occurs.

Emerging Directions in Hydrogen-Exchange Methods

New and notable avenues of native state out-exchange research involve detailed examination of long-standing issues and development of novel methods applied to uncharted areas. Experiments summarized below and discussed in terms of the two-process model illustrate the rich variety of questions concerning protein dynamics that may be addressed with hydrogen exchange.

Detailed studies by Robertson and associates of the pH and temperature dependence of out-exchange of ovomucoid third domain from turkey reveal one of the most complete pictures of protein hydrogen exchange.11 About 30 buried amides were measured from a total of 100 exchangeable NHs. The very slowest exchanging NHs, on residues 38, 39, and 40, are slower than expected for the unfolding mechanism, i.e., their observed exchange rate constants are smaller than the product Kfoldkcalc; these are termed “superprotected.”11 For another group of 11 NHs, ΔG(HX) values are very similar to ΔGu determined by other standard methods (kD ≫ kN). A third group of eight NHs exchanges with contributions from both the folded state and the unfolding mechanism (kN ≈ kD). A fourth group of nine NHs exchanges by the folded state mechanism and shows a pH effect, presumably arising from a conformational transition(s) linked to a carboxyl titration(s), that both increases kN for these amides and destabilizes the protein. It was suggested that superprotection arises from residual order in the D state, meaning that the value of kcalc derived from the rules based

30 W. Gallagher, F. Tao, and C. Woodward, Biochemistry 31, 4673 (1992).
31 E. Tüchsen and C. Woodward, Biochemistry 26, 8073 (1987).


on small peptides4 overestimates the actual value of kcx,D. Correction for proline cis and trans isomers is also proposed15 as the reason for the apparent superprotection (discussed above). If this is true, however, then the same correction should presumably be made to the 11 NHs for which ΔG(HX) ≈ ΔGu, and one then has to assume that the observed rate constant for these NHs has contributions from folded state exchange.

EX1 behavior is observed in the ovomucoid third domain for the 14 slowest exchanging NHs, of which 9 have the same value of k1 (ref. 18). Assuming that exchange is only by the unfolding mechanism, k1 ≈ ku (using the notation in Fig. 1). Mass spectrometry methods reveal whether the detected NHs exchange in a correlated or uncorrelated manner. Fits to mass spectrometry data for the ovomucoid third domain for exchange of the 13 most slowly exchanging NHs32 are most consistent with uncorrelated exchange for 5 NHs and correlated exchange for 4 NHs. The latter could represent the superprotected group. The five NHs with uncorrelated exchange also exhibit EX1 behavior, implying that ku represents global unfolding to an ensemble of partially folded and completely unfolded conformations, an intriguing outcome in the context of the multiple parallel folding pathways proposed to underlie funnel-shaped energy landscapes.33

Evidence of EX1 behavior is also observed for the folded state mechanism.34 Rather than actual rate constants, the amplitudes of individual peaks were measured after a 12-ms jump to (variable) high pH, followed by quenching to low pH (very slow exchange). A simple explanation would be that at high pH, for these NHs, kN ≈ kD and both mechanisms contribute; the leveling off behavior would then arise from the kD contribution, and the EX1 behavior arises from global unfolding, which accounts for some or all of the observed exchange.
However, this is apparently ruled out because the slowest exchanging NHs (unfolding mechanism) have little or no exchange under conditions in which the EX1 NHs (folded state mechanism) show a decrease and then a leveling in peak intensity with increasing pH. The very interesting possibility is that the EX1 exchange rate under these conditions gives the rate of solvent and catalyst exposure to the NH for the folded state mechanism.

Use of pressure as a thermodynamic variable opens a new and exciting dimension to the understanding of protein dynamics. Pioneering studies by Bryant and associates22,35 of the pressure dependence of hydrogen exchange in T4 lysozyme provide new insight into protein motional

32 C. B. Arrington and A. D. Robertson, J. Mol. Biol. 300, 221 (2000).
33 K. Dill and H.-S. Chan, Nat. Struct. Biol. 4, 10 (1997).
34 C. B. Arrington and A. D. Robertson, J. Mol. Biol. 296, 1307 (2000).
35 T. K. Hitchens and R. G. Bryant, Biochemistry 37, 5878 (1998).


properties associated with exchange from the folded state. The pressure dependence of the rate for exchange with no contribution from global unfolding yields the activation volume associated with interaction of the NH with water and catalyst when exchange is mediated by motions of the folded state. There is no evident correlation among activation volumes, protection factors, and structure.35 Comparison of the activation volumes (ΔV‡) to other activation parameters of hydrogen exchange in T4 lysozyme showed no correlation with ΔH‡ or ΔS‡. Further, the variations in ΔV‡, ΔH‡, and ΔS‡ for residues that are sequentially or three-dimensionally close in the crystal structure suggest different pathways for the access of solvent to these sites. We take this as support of a penetration model for folded state exchange (above).

In another frontier of protein folding and dynamics, Bolen and associates have analyzed and described the mechanism of the complex behavior of osmolytes in protein folding. The solvophobic action of osmolytes like trimethylamine oxide and sucrose on the peptide backbone is called the osmophobic effect. This unfavorable interaction between peptide backbone units and the osmolyte solvent component is essential to survival of organisms in which osmolytes protect against protein denaturation. Osmophobic backbone–solvent interactions offer special interest for hydrogen exchange. A careful investigation24 of osmolyte effects suggests that osmolytes suppress exchange of some NHs exchanging by the folded state mechanism.

Design of New Proteins Based on the Slow Exchange Core

The slow exchange core of proteins has been proposed to be the folding core, based on the tendency for the secondary structural elements containing the slowest exchanging NHs in a native protein to also be the secondary structural elements that contain the NHs first protected during folding, and the NHs most protected in partially folded analogs.25 We think that ‘‘core elements’’ of a protein (secondary structure containing the slowest exchanging amide protons) are the most likely to be nativelike during folding. At various stages of folding, there are ensembles of random and nonrandom conformations; among these, the nonrandom conformations favor nativelike structure in their ordered regions. This idea is woven into a new strategy for the design of novel proteins that mimic natural proteins in spontaneous folding to a native state. The native state of a natural protein is a family of interconverting conformers that is more stable than any other possible conformations. This is a remarkable property for a polypeptide, and one presumably honed by evolution. We have designed and synthesized core modules from templates corresponding to core elements of a protein. Core modules modeled on


natural BPTI, when water soluble, form conformational ensembles that favor nativelike structure. To produce a native state mimic, we incorporate two or more core modules into larger peptides in which module–module interactions are promoted. The interactions are expected to be mutually stabilizing, and thus to further shift the conformational bias of component core modules toward more ordered structure. This idea has been realized in the protein BetaCore, in which two core modules are incorporated into a single molecule by means of a long cross-link.36,37 BetaCore is monomeric in water and forms a new fold composed of a four-stranded, antiparallel β-sheet. The single, dominant conformation of BetaCore has been characterized by various NMR experiments.

Acknowledgments

We thank Andrew Robertson and D. Wayne Bolen for critical reading of the manuscript and helpful discussions. This work is currently supported by NIH Grant GM51628.

36 N. Carulla, C. Woodward, and G. Barany, Protein Sci. 11, 1539 (2002).
37 N. Carulla, G. Barany, and C. Woodward, Biophys. Chem. 101–102, 67 (2002).

[18] The Preparation of 19F-Labeled Proteins for NMR Studies

By Carl Frieden, Sydney D. Hoeltzli, and James G. Bann

Introduction

The incorporation of 19F-labeled amino acids into proteins for nuclear magnetic resonance (NMR) spectroscopy has been used for many years as a probe of protein structure and dynamics. Three previous articles in Methods in Enzymology1–3 have dealt with this subject. Two recent excellent reviews4,5 describe the advantages of using 19F-NMR for such studies and cover the field well, pointing out the usefulness of fluorine as a probe of local environment. The present chapter discusses methods of incorporating 19F-labeled amino acids into proteins.

1 B. D. Sykes and W. E. Hull, Methods Enzymol. 49, 271 (1978).
2 G. Horton and I. Boime, Methods Enzymol. 96, 777 (1983).
3 J. T. Gerig, Methods Enzymol. 177, 3 (1989).
4 M. A. Danielson and J. J. Falke, Annu. Rev. Biophys. Biomol. Struct. 25, 163 (1996).
5 J. T. Gerig, in “Biophysical Textbook Online” (2001). www.biophys.org/btol/

METHODS IN ENZYMOLOGY, VOL. 380

Copyright 2004, Elsevier Inc. All rights reserved. 0076-6879/04 $35.00


As noted in the two recent reviews,4,5 there are many properties of fluorine that make it a useful probe for studies of protein structure and function. For example, the fluorine atom is small, only slightly larger than the hydrogen atom. Although the dipole moment is considerably different, it is expected that there will be a minimal perturbation to the structure, stability, and functionality of the protein. Fluorine does not occur naturally in proteins, so a protein labeled with a 19F-labeled amino acid will exhibit NMR peaks due only to the label. Because fluorine is extraordinarily sensitive to its environment, and to local shielding effects,6 the NMR peaks are typically well resolved from one another in a one-dimensional (1D) NMR spectrum. This sensitivity makes 19F-NMR extremely well suited for studies of both protein structure and protein folding, since even in the denatured state the peaks are frequently resolved.7–10 It should also be noted that 19F-NMR can be used to examine structural aspects of proteins of much higher molecular weight than are generally accessible by proton NMR.

Under conditions in which two or more conformers exist in equilibrium, it may be possible to use 19F-NMR to calculate rate or dissociation constants if the conformers have different chemical shifts. If, for example, two different conformers exchange slowly on the NMR time scale, there will be two peaks and dissociation constants can be determined by quantifying the area of each peak. For conformers or liganded forms in intermediate exchange on the NMR time scale (roughly 5–5000 s⁻¹, depending upon chemical shift difference), the exchange rate kex may be determined by lineshape analysis.11 For conformers or liganded forms in slow exchange, in the millisecond range, the exchange rate may be accessible through two-dimensional experiments such as nuclear Overhauser enhancement spectroscopy (NOESY).
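As a rough numerical illustration of the slow-exchange case, the sketch below converts two integrated peak areas into populations, an equilibrium constant, and a free energy difference. It assumes a simple two-state conformational equilibrium A ⇌ B with K = [B]/[A] and quantitatively integrated peaks; a ligand dissociation constant would additionally require the free ligand concentration. The coalescence expression k = πΔν/√2 is the standard textbook result for two equally populated, uncoupled sites and is included only to indicate where slow exchange gives way to intermediate exchange. Function names and peak areas are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def two_state_from_areas(area_a, area_b, temp_K):
    """Populations and equilibrium constant for A <-> B in slow exchange.

    Returns (fraction of B, K = [B]/[A], delta G in J/mol for A -> B).
    """
    frac_b = area_b / (area_a + area_b)
    k_eq = area_b / area_a
    delta_g = -R * temp_K * math.log(k_eq)
    return frac_b, k_eq, delta_g


def coalescence_rate(delta_nu_hz):
    """Exchange rate (s^-1) at coalescence for two equally populated sites
    separated by delta_nu_hz (Hz)."""
    return math.pi * delta_nu_hz / math.sqrt(2.0)


# Hypothetical integrated areas for two resolved 19F peaks at 295 K.
frac_b, k_eq, delta_g = two_state_from_areas(1.0, 3.0, 295.0)
print(f"fraction B = {frac_b:.2f}, K = {k_eq:.1f}, dG = {delta_g / 1000:.2f} kJ/mol")
```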
Finally, the accessibility of a residue to bound or bulk water can be assessed by 19F-1H HOESY experiments.12

When using 19F-NMR to study protein folding, two major types of experiments can be carried out: an equilibrium measurement that involves the determination of peak chemical shift and intensity as a function of denaturant concentration, or a kinetic experiment measuring peak intensity as a function of time after dilution of the denaturant. The latter can be done by manual mixing or, for faster reactions (1 s or longer), using a

6 E. Y. Lau and J. T. Gerig, Biophys. J. 73, 1579 (1997).
7 S. D. Hoeltzli and C. Frieden, Biochemistry 33, 5502 (1994).
8 S. D. Hoeltzli and C. Frieden, Biochemistry 35, 16843 (1996).
9 S. D. Hoeltzli and C. Frieden, Biochemistry 37, 387 (1998).
10 J. G. Bann, J. Pinkner, S. J. Hultgren, and C. Frieden, Proc. Natl. Acad. Sci. USA 99, 709 (2002).
11 J. Sandstrom, “Dynamic NMR Spectroscopy.” Academic Press, New York, 1982.
12 D. P. Cistola and K. B. Hall, J. Biomol. NMR 5, 415 (1995).


Fig. 1. 19F-NMR spectra of apo, binary, and ternary complexes of E. coli dihydrofolate reductase labeled with 6-19F-tryptophan. The data were collected at 22° on a Varian VXR 500 equipped with a Nalorac indirect detection probe. The five resonances were assigned using site-directed mutagenesis. MTX is the dihydrofolate analog, methotrexate. Data taken from Hoeltzli and Frieden.7

stopped-flow device.13 Because unfolded chemical shifts are frequently resolved, it is also possible to assess each residue individually in the unfolded state. Note that these measurements are different from those of hydrogen/deuterium exchange in that the latter measures the ability of backbone amide protons to exchange with the solvent, while the 19F-NMR experiments assess the environment of a specific amino acid side chain.

Figure 1 illustrates a typical equilibrium experiment with native Escherichia coli dihydrofolate reductase. In this figure, the five tryptophan residues have been substituted with 6-19F-tryptophan. There are five clearly separated peaks for the native protein. Addition of ligands such as NADP or methotrexate (MTX) affects some peaks more dramatically than others. Four peaks are observed in the denatured protein (data not shown). In the presence of urea, all peaks in the native protein, except for Trp-22, are in slow exchange with the denatured form.

In a technological advance, a fluorine cryoprobe (Varian) is currently available, allowing either fewer transients or lower protein concentrations to be used for a given signal-to-noise (S/N) ratio. In our experience the S/N ratio is at least 4- to 5-fold greater than that of the conventional probes currently in use. This is particularly important for studies of proteins that tend

13 S. D. Hoeltzli, I. J. Ropson, and C. Frieden, in “Techniques in Protein Chemistry,” pp. 455–465. Academic Press, New York, 1994.


Fig. 2. 19F-NMR spectra of 6-19F-tryptophan–labeled PapD recorded as a function of time after a stopped-flow urea jump from 4.5 M urea to 2.25 M urea. The final concentration of protein was 70 μM. Spectra were acquired with a fluorine cryoprobe on a Varian Unity-Plus 500-MHz spectrometer with 32 transients at each time point. Data were obtained at 20° as described by Bann et al.10 The buffer was 30 mM Mops/HCl, pH 7.0.

to aggregate at higher concentrations, or for detecting peaks that may have previously been difficult to quantify. Figure 2 shows a kinetic experiment using 70 μM of the protein PapD and the fluorine cryoprobe. PapD, a bacterial chaperone, is a two-domain protein in which each domain has one tryptophan. In these experiments we used stopped-flow methods as described elsewhere.7–10 As shown by Fig. 2, changes in peak intensity can be measured as a function of time after dilution of the denaturant. Interestingly, an intermediate form, represented by the peak at 45.5 ppm, is present during refolding and disappears as the protein finishes the folding process. Although not shown in Fig. 2, it should be pointed out that the earliest time one can collect a spectrum after diluting out the denaturant is about 1–2 s.

Unfortunately, there is at least one problem that has not been solved: it has not been easy, so far, to relate the chemical shift of the fluorine signal to the structural environment surrounding the fluorine within the protein.5,14 Therefore, it is not yet possible to predict the chemical shifts in a 19F-NMR spectrum based on analysis of the protein’s amino acid sequence or the analysis of its three-dimensional structure. Consequently, to assign

14 J. G. Pearson et al., Biochemistry 36, 3590 (1997).


the peaks from the labeled amino acids in a spectrum, one needs to use site-directed mutagenesis. Techniques used for this method will be outlined in detail in this chapter.

Currently, there are a number of different analogs that can be used to label proteins biosynthetically, and the list of available amino acids continues to grow as improved methods for the efficient synthesis of these amino acids continue to be developed.15 Most studies thus far have utilized aromatic amino acids for reasons discussed later.

Incorporation of 19F-Labeled Amino Acids into Proteins

For production of proteins containing 19F-labeled amino acids, biosynthetic incorporation of the labeled amino acids by microbial protein expression is the strategy of choice. High yields of >90% labeled protein are possible and biosynthetic methods are cost effective relative to chemical or in vitro synthesis. High levels of incorporation are important since this allows lower protein concentrations to be used and avoids a heterogeneous population of labeled protein. A number of approaches have been successfully utilized to produce very high label incorporation and two common themes emerge: (1) the importance of placing the gene for the protein of interest under control of a tightly regulated inducible promoter and (2) the critical issue of completely depleting natural amino acids from the labeling media, and preventing their synthesis, prior to inducing protein production.

Because 19F-labeled amino acids can inhibit bacterial growth to differing degrees,16 the bacteria usually cannot be grown from inoculation on fluorine-containing medium. Bacterial cultures can be grown on defined medium containing a limited amount of the unlabeled amino acid and then either grown to the point where unlabeled amino acid is completely depleted prior to introduction of the 19F-labeled amino acid (for example, Kranz et al.17) or harvested and transferred to defined medium containing 19F-labeled amino acid. Our laboratory has obtained more consistent results for a number of different proteins with the latter approach. It should be noted that some proteins cannot be produced under conditions that allow complete labeling (for example, Luck and Falke18), and in these cases it may be necessary to accept a lower level of fluorine label incorporation by adding unlabeled amino acid to the labeling medium or by incompletely depleting the unlabeled amino acid.

15 A. Sutherland and C. L. Willis, Nat. Prod. Rep. 17, 621 (2000).
16 R. E. Marquis, in “Handbook of Experimental Pharmacology” (F. A. Smith, ed.), Vol. 20, pp. 165–192. Springer-Verlag, Berlin, 1970.
17 J. K. Kranz, J. Lu, and K. B. Hall, Protein Sci. 5, 1567 (1996).
18 L. A. Luck and J. J. Falke, Biochemistry 30, 4257 (1991).
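The cost of incomplete incorporation can be made concrete with a short calculation. The sketch below assumes that every site is labeled independently and with the same probability (a simplifying assumption for illustration, not something established in this chapter) and uses the binomial distribution to estimate how the protein population divides among species carrying different numbers of fluorines. Function names are illustrative.

```python
from math import comb


def label_distribution(n_sites, incorporation):
    """Probability of finding k = 0..n_sites labeled positions per molecule,
    assuming independent labeling with the same probability at each site."""
    p = incorporation
    return [
        comb(n_sites, k) * p**k * (1.0 - p) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


def fraction_fully_labeled(n_sites, incorporation):
    """Fraction of molecules in which every site carries the label."""
    return incorporation**n_sites


# A protein with five labeled positions at 90% and at 98% incorporation per site.
for p in (0.90, 0.98):
    full = fraction_fully_labeled(5, p)
    print(f"incorporation {p:.0%}: fully labeled fraction {full:.2f}")
```

Under these assumptions, a protein with five labeled sites at 90% incorporation per site would be fully labeled in only about 59% of molecules (0.9^5 ≈ 0.59), with the remainder spread over partially labeled species. This is one way to see why incorporation well above 90% is worth pursuing.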


Labeling with Aromatic Amino Acids

To date, most studies of 19F-labeled proteins have involved incorporation of 19F-labeled aromatic amino acids. There are several reasons for this. Usually, even quite large proteins contain a limited number of aromatic amino acids, simplifying the problem of assignment. Several healthy strains of bacteria auxotrophic for each of the aromatic amino acids are readily available. In addition, m-19F-tyrosine, 4-, 5-, and 6-19F–labeled tryptophan, and m-, o-, and p-19F-phenylalanine are all commercially available at reasonable cost. All of these analogs have been used successfully in studies of various proteins. In some studies, more than one analog has been successfully incorporated into the same protein.19–22 These results and those from our laboratory (Ropson and Frieden, unpublished observations; Hoeltzli and Frieden, unpublished observations) suggest that proteins labeled with different analogs can express at different levels, can differ in chemical shift resolution, and might differ in structural perturbation and stability. Examination of the crystal structure and the local contacts that are likely to be affected by the substitution will certainly help guide the experimentalist in the choice of an analog. It may be necessary to first try a series of analogs to determine if there is an effect on stability. In the absence of a comprehensive theory predicting fluorine chemical shifts, the choice of analog remains largely empirical.

Use of Nonauxotrophic Strains. The simplest approach to incorporation of a 19F-labeled amino acid is to use whatever nonauxotrophic bacterial strain has been found to give good protein expression and to repress endogenous amino acid synthesis with high concentrations of amino acids. This approach is limited by the fact that 19F-labeled amino acids are generally inhibitory to bacterial growth.
However, Lu and co-workers23 achieved incorporation of greater than 90% 3-19F-tyrosine into lac repressor protein by using high concentrations of tryptophan, phenylalanine, and tyrosine to repress 3-deoxy-D-arabinoheptulosonate-7-phosphate synthetase, the first enzyme of the aromatic pathway. Strain CSH46 (also known as 96) was grown in M9 minimal media supplemented with 1% glucose, 1 μg/ml thiamine, and 0.2 mM amino acids except tyrosine, tryptophan, and phenylalanine. Tryptophan and phenylalanine were supplemented at 1 mM, tyrosine was omitted, and the cells were grown to an A550 nm of 1.0. The cultures

19 G. S. Rule, E. A. Pratt, V. Simplaceanu, and C. Ho, Biochemistry 26, 549 (1987).
20 C. Lian et al., Biochemistry 33, 5238 (1994).
21 M. A. Dominguez, Jr., K. C. Thornton, M. G. Melendez, and C. M. Dupureur, Proteins 45, 55 (2001).
22 C. Minks, R. Huber, L. Moroder, and N. Budisa, Biochemistry 38, 10649 (1999).
23 P. Lu, M. Jarema, K. Mosser, and W. E. Daniel, Proc. Natl. Acad. Sci. USA 73, 3471 (1976).




were shifted to 42° to induce production of lac repressor, and 1 mM 3-19F-tyrosine was added. Use of a nonauxotrophic strain to incorporate 19F-labeled amino acids has resulted in lower degrees of label incorporation in other reports.21,24 Careful attention to optimizing the inhibition of amino acid synthesis and the concentration of labeled amino acid is probably critical to success. Whether or not high (>90%) levels of label incorporation can be obtained, this approach may be valuable when expression of the protein of interest requires specific properties not available in an existing auxotrophic strain.

A second approach is the use of glyphosate to induce aromatic amino acid auxotrophy in a nonauxotrophic bacterial strain. Glyphosate is an inhibitor of 5-enolpyruvylshikimic acid 3-phosphate biosynthesis and suppresses the production of aromatic metabolites, including the aromatic amino acids tryptophan, tyrosine, and phenylalanine.25 This strategy may be especially useful when an amino acid auxotroph of desirable properties is not available, or when the protein of interest cannot be expressed in available auxotrophs.26 Recently, complete 5-fluorotryptophan labeling of T. maritima cold shock protein using glyphosate to repress aromatic amino acid synthesis has been reported by Schuler et al.27 Briefly, cells were grown at 37° in minimal medium containing 50 mg/liter of all amino acids except tryptophan. Aromatic amino acid synthesis was suppressed with 1 g/liter glyphosate. When the cultures left exponential growth at an A550 nm of 1.5, 50 mg/liter DL-5-19F-tryptophan was added and protein expression induced for 4 h prior to harvest. Use of glyphosate has resulted in lower levels of incorporation in other reports.28 Depletion of the unlabeled amino acid from the growth medium is critical and needs to be monitored carefully.

Use of Auxotrophic Strains. Most studies of 19F-labeled amino acids incorporated into proteins have utilized bacterial strains auxotrophic for the amino acid of interest. Numerous aromatic amino acid auxotrophs are available from individual investigators, through the ATCC or the Yale-New Haven E. coli Genetic Stock Center. The properties and sources of several auxotrophs we have found to be particularly useful are listed in Table I. Generally, the auxotrophic bacteria are transformed with a plasmid encoding the gene for the protein of interest under the control of a tightly regulated promoter using standard techniques. Transformed bacteria are grown on rich or defined medium containing a specific concentration of

24 C. M. Dupureur and L. M. Hallman, Eur. J. Biochem. 261, 261 (1999).
25 L. Comai, L. C. Sen, and D. M. Stalker, Science 221, 370 (1983).
26 P. Bai, L. Luo, and Z. Peng, Biochemistry 39, 372 (2000).
27 B. Schuler, W. Kremer, H. R. Kalbitzer, and R. Jaenicke, Biochemistry 41, 11670 (2002).
28 H. W. Kim, J. A. Perez, S. J. Ferguson, and I. D. Campbell, FEBS Lett. 272, 34 (1990).


TABLE I
Useful E. coli Auxotrophs^a

Strain                 Auxotrophy                 Mutation             Reference
W3110TrpA33            Trp                        trpA33               40
DL39 (CGSC #6913)      Asp, Ile, Leu, Phe, Tyr    tyrB                 30
NK6024 (CGSC #6178)    Phe                        pheA::Tn10 (tetR)    31

^a Strains are available from the E. coli Genetic Stock Center (http://cgsc.biology.yale.edu).

the natural amino acid of interest, then harvested and resuspended in defined medium containing the fluorine-labeled amino acid. The cells are allowed to recover for a period of time sufficient to deplete intracellular stores of unlabeled amino acid; protein production is then induced for an optimal period of time and the cells are harvested.

Using this method, we have successfully produced 6-19F-tryptophan–labeled rat intestinal fatty acid binding protein,29 6-19F-tryptophan8,9 and p-19F-phenylalanine–labeled E. coli dihydrofolate reductase (Hoeltzli and Frieden, unpublished observation), 6-19F-tryptophan10 and p-19F-phenylalanine–labeled E. coli PapD (Bann and Frieden, unpublished data), and 6-19F-tryptophan–labeled murine adenosine deaminase (Shu and Frieden, unpublished data), with greater than 90% incorporation of label as assessed by electrospray mass spectrometry or by comparison of deconvoluted resonance intensity to the intensity of an internal standard of known concentration.

For example, to produce >90% 6-19F-tryptophan–labeled E. coli dihydrofolate reductase, we used the E. coli auxotroph W3110TrpA33 containing the plasmid pTrc99DHFR. This plasmid was constructed by inserting the folA gene from plasmid pTY1 into the plasmid pTrc99A (Pharmacia Co., Piscataway, NJ). The cells were grown in a Biostat B fermentor (Braun Instruments, Allentown, PA) at 37° and pH 7 on M9 minimal medium supplemented with twice the normal concentration of phosphate salts and with 1.5 g/liter CSM-TRP (Bio-101 Inc., Vista, CA), 0.2% glucose, 0.2 mM L-tryptophan, 50 μg/ml ampicillin, and 1 ml/liter Poly-Vi-Sol vitamin drops with iron (Mead-Johnson, Evansville, IN). The pH was maintained at 7 by addition of NH4OH. Glucose was maintained between 0.1% and 0.2% and pO2 between 25% and 35% by varying the rate of addition of a feed mixture containing 45 g/liter CSM-TRP, 1.2 g/liter L-tryptophan, 30 g/liter NH4Cl, 4.8 g/liter MgSO4, and 20% glucose.
The cells were harvested at A600 = 16 and resuspended in fresh

29 I. J. Ropson and C. Frieden, Proc. Natl. Acad. Sci. USA 89, 7222 (1992).


medium containing 0.2 mM 6-19F-tryptophan in place of L-tryptophan. After 30 min of growth in the 6-19F-tryptophan medium, expression from the plasmid was induced with 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) for 2 h, and then cells were harvested. The yield is approximately 1 mg of protein per 1 g wet weight of cells.

In the case of adenosine deaminase, bacteria (W3110TrpA33) containing the plasmid pQE80LmADA have been grown in Luria broth to A600 = 4 and then harvested and washed twice with minimal media as described by Muchmore et al.30 containing 1 mM 6-19F-tryptophan. Expression was induced with 1 mM IPTG and the cells were grown for 3 h. Greater than 90% incorporation into adenosine deaminase was achieved (Shu and Frieden, unpublished observation). Initial use of rich media has the advantage of faster growth and possibly higher optical density than minimal media.

To produce >90% p-19F-phenylalanine–labeled dihydrofolate reductase or E. coli PapD, we have used two phenylalanine auxotrophs, DL39 (ref. 30) and NK6024 (also called CGSC #6178).31 The former is auxotrophic for both phenylalanine and tyrosine. The latter has a transposon insertion encoding tetR in the pheA gene, which encodes the enzyme that converts chorismate to prephenate to phenylpyruvate. The bacteria are grown on media containing tetracycline to select for the insertion. Both strains grow well on minimal media as used above.30 We have achieved >90% labeling by growing DL39 containing the plasmid pTrc99DHFR or pQ80DHFR in the presence of 0.1 mM phenylalanine to an A600 of 3.0. At this point the cells have just stopped the logarithmic phase of growth. Cells are then harvested and resuspended in media containing 0.2 mM p-19F-phenylalanine. After 30 min, expression is induced with 1 mM IPTG for 2 h and the bacteria are harvested.
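When preparing the labeling media described here, the mass of fluorinated amino acid to weigh out follows directly from the target concentration. The sketch below is a convenience calculation only. It assumes the free, anhydrous amino acid (salt or hydrate forms would need their own formula weights), takes the elemental compositions of the analogs as the parent amino acid with one ring hydrogen replaced by fluorine, and uses rounded average atomic masses. The dictionary and function names are hypothetical.

```python
# Rounded average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

# Elemental compositions: tryptophan is C11H12N2O2 and phenylalanine is
# C9H11NO2; each fluoro analog replaces one ring hydrogen with fluorine.
FORMULAS = {
    "6-fluorotryptophan": {"C": 11, "H": 11, "F": 1, "N": 2, "O": 2},
    "p-fluorophenylalanine": {"C": 9, "H": 10, "F": 1, "N": 1, "O": 2},
}


def molar_mass(formula):
    """Molar mass (g/mol) from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def grams_needed(compound, conc_mM, volume_L):
    """Grams of free amino acid required for conc_mM in volume_L of medium."""
    return conc_mM * 1e-3 * volume_L * molar_mass(FORMULAS[compound])


# Amount of 6-19F-tryptophan for 1 liter of medium at 0.2 mM.
print(f"{grams_needed('6-fluorotryptophan', 0.2, 1.0) * 1000:.1f} mg")
```

If a racemic DL preparation is used, the concentration of the L isomer is half of the nominal value, which may need to be taken into account.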
To produce p-19F-phenylalanine–labeled PapD, we grow NK6024 containing the plasmid pQE80papD in minimal media containing 1 mM unlabeled phenylalanine and harvest the cells while they are in the log phase of growth (A600 = 5). The cells are washed twice with 0.9% NaCl containing 1 mM p-19F-phenylalanine and then resuspended in new media containing 1 mM of the 19F-labeled amino acid. We have achieved >95% labeling with this approach (Bann and Frieden, unpublished observations), which was outlined by Furter31 for the incorporation of p-19F-phenylalanine into mouse dihydrofolate reductase.

30 D. C. Muchmore, L. P. McIntosh, C. B. Russell, D. E. Anderson, and F. W. Dahlquist, Methods Enzymol. 177, 44 (1989).

31 R. Furter, Protein Sci. 7, 419 (1998).


Assignment of Fluorine Resonances

As already mentioned, one drawback of fluorine NMR is the lack of a comprehensive theory relating the fluorine chemical shift to its environment as deduced, for example, from a three-dimensional structure solved by X-ray crystallography or NMR spectroscopy. Therefore, the assignment of the fluorine resonances in a uniformly labeled protein is usually accomplished by site-directed mutagenesis. Each residue of the amino acid type being labeled is mutated, one at a time, to a chemically similar residue, such as tyrosine to phenylalanine or tryptophan to phenylalanine. Each mutant protein is then fluorine labeled and purified using the procedure developed for the wild-type protein. 19F-NMR spectra of the mutant proteins are obtained under each of the conditions of interest (e.g., apo, in the presence of ligand, or denatured). If the mutation has not introduced a significant structural perturbation, only one resonance will have disappeared from the spectrum, the remaining resonances will show minimal chemical shift perturbation, and the missing resonance can be assigned unambiguously to the mutated amino acid. A problem with this method is that the mutation may perturb the 19F-NMR spectrum such that a remaining resonance undergoes a significant chemical shift change or disappears. Large spectral perturbations would probably reflect large changes in the structure or dynamics of the mutant protein caused by the chosen amino acid substitution. Several tools are available to minimize such effects. For example, one could search the sequence databases for homologous proteins using standard similarity search tools (e.g., BLAST via the Internet at http://us.expasy.org/tools/) to identify potential conservative substitutions. The BLOSUM62 scoring matrix32 quantifies the results of such an approach.
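The bookkeeping behind assignment by mutagenesis can be sketched in a few lines of Python. This is only an illustration of the logic described above, not a published procedure: the chemical shifts, the mutant names, and the 0.2 ppm matching tolerance are all hypothetical, and the greedy nearest-peak matching is an assumption that is reasonable only for well-resolved spectra.

```python
from typing import Dict, List, Optional


def missing_resonances(wild_type: List[float], mutant: List[float],
                       tol: float = 0.2) -> List[float]:
    """Return wild-type shifts (ppm) that have no mutant peak within tol ppm."""
    remaining = list(mutant)
    unmatched = []
    for shift in wild_type:
        # Greedy nearest-peak match; adequate only for well-resolved spectra.
        nearest = min(remaining, key=lambda peak: abs(peak - shift), default=None)
        if nearest is not None and abs(nearest - shift) <= tol:
            remaining.remove(nearest)
        else:
            unmatched.append(shift)
    return unmatched


def assign_by_mutagenesis(wild_type: List[float],
                          mutant_spectra: Dict[str, List[float]],
                          tol: float = 0.2) -> Dict[str, Optional[float]]:
    """Map each mutant to the single resonance it removes, or None if ambiguous."""
    assignments = {}
    for mutant_name, peaks in mutant_spectra.items():
        missing = missing_resonances(wild_type, peaks, tol)
        # Exactly one lost resonance gives an unambiguous assignment;
        # anything else signals a perturbed spectrum that needs inspection.
        assignments[mutant_name] = missing[0] if len(missing) == 1 else None
    return assignments


# Hypothetical shifts (ppm) for a protein with three labeled tryptophans.
WILD_TYPE = [-46.2, -47.9, -49.5]
MUTANTS = {
    "W24F": [-46.2, -49.4],    # one peak lost, others nearly unchanged
    "W74F": [-47.8, -49.5],    # one peak lost, others nearly unchanged
    "W133F": [-46.9, -47.0],   # remaining peaks shifted: not assignable
}

if __name__ == "__main__":
    for name, shift in assign_by_mutagenesis(WILD_TYPE, MUTANTS).items():
        print(name, shift if shift is not None else "ambiguous: inspect spectrum")
```

A mutant that returns None corresponds to the problem case noted above, in which the substitution itself perturbs the remaining resonances and a different conservative substitution should be tried.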
If conservative substitutions cannot be identified using sequence search and alignment tools, or from a complementary approach, it may be useful to choose a substitution based on statistical analysis of a database of protein structures. Such methods attempt to identify "similar" amino acids based on structural elements (helix, sheet, turn, etc.) or environment (solvent exposed vs. interior). An example of the former approach is PSIPRED,33 accessible on the Internet via http://bioinf.cs.ucl.ac.uk/psipred/. Examples of the latter approach can be found at http://prowl.rockefeller.edu/aainfo/contents.htm.
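For the sequence-based route, candidate replacements for an aromatic residue can be ranked by their BLOSUM62 scores. The sketch below is a minimal illustration, assuming a hand-transcribed subset of the standard BLOSUM62 matrix that covers only a few pairs; the values should be checked against a full copy of the matrix, and the function names are hypothetical.

```python
# Scores transcribed from the standard BLOSUM62 matrix for a few pairs only;
# verify them against a full copy of the matrix before relying on them.
# Keys are alphabetically sorted residue pairs (one-letter codes).
BLOSUM62_SUBSET = {
    ("F", "Y"): 3, ("F", "W"): 1, ("F", "L"): 0, ("A", "F"): -2,
    ("W", "Y"): 2, ("L", "W"): -2, ("A", "W"): -3,
    ("H", "Y"): 2, ("L", "Y"): -1, ("A", "Y"): -2,
}


def score(a: str, b: str) -> int:
    """Look up the substitution score for an unordered residue pair."""
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def rank_substitutions(residue: str, candidates: list) -> list:
    """Return (candidate, score) pairs, most conservative substitution first."""
    scored = [(c, score(residue, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(rank_substitutions("W", ["A", "L", "F", "Y"]))
    print(rank_substitutions("Y", ["A", "L", "F", "H"]))
```

A higher score indicates a substitution observed more often among related sequences. The score says nothing about whether the replacement residue would itself take up the fluorinated analog, which has to be considered separately when choosing the mutation.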

32 S. Henikoff and J. G. Henikoff, Proc. Natl. Acad. Sci. USA 89, 10915 (1992).

33 D. T. Jones, J. Mol. Biol. 292, 195 (1999).


Single-Site–Specific Labeling of Proteins

Labeling of a protein with a 19F-labeled amino acid at more than one site could affect the global stability of the protein. Thus, there is a potential for heterogeneity in the stabilizing/destabilizing effects of the fluorine substitution. An excellent example is the effect of a single fluorine substitution with different 19F-labeled tryptophan analogs in the protein annexin V.22 The X-ray crystal structures of wild-type and 4-, 5-, and 6-19F-tryptophan–labeled annexin V were compared with the effects on thermal stability observed by circular dichroism. A decrease in thermal stability was observed for the 4- and 6-19F–labeled proteins, and this correlated with a decrease in molecular packing in the crystal structure, while the 5-19F-tryptophan protein showed no altered packing and a slight increase in thermal stability. Thus, even at a single site there can be heterogeneity in the effects on stability. In most cases, biosynthetic incorporation results in more than one site being labeled. Therefore, it would be highly advantageous to be able to selectively place a single 19F-labeled amino acid at a given position in the sequence in order to probe the effects on stability in that region only. This would be a minimal perturbation and thus most closely represent the native unlabeled protein. Furthermore, if the number of 19F-labeled amino acids is high, as in large proteins, or if a complex of proteins is studied, labeling at a single site would greatly simplify the observations. Furter31 recently developed such a system for the site-specific incorporation of p-19F-phenylalanine, which relies upon the expression of three separate genes from two vectors: a yeast phenylalanyl-tRNA synthetase, a yeast amber suppressor tRNA, and the gene of interest with an amber mutation. The use of the heterologous yeast tRNA/synthetase pair is the cornerstone of this technique, since there is little cross-reactivity between the yeast tRNA/synthetase pair and the E. coli tRNA/synthetase pair. We have used this method to unambiguously assign the p-19F-phenylalanine resonances of both PapD and dihydrofolate reductase from E. coli. The approach is simple and straightforward. An example is shown in Fig. 3 for PapD.

Strains and Vectors

The strain used for specific incorporation of a single labeled phenylalanine is K10-F6. This is a p-19F-Phe–resistant Phe auxotroph. Two plasmids are required to produce site-specifically labeled protein, and these have been outlined by Furter.31 One plasmid (pRO148) contains the yeast tRNA synthetase and the mouse DHFR gene, and the other (pRO117) contains the tRNAPhe/amber. Both plasmids and the strain


Fig. 3. Site-specific labeling of PapD with p-19F-phenylalanine using the procedure described in Appendix I. The data were collected at 20° using a fluorine cryoprobe on a Varian Unity-Plus 500-MHz spectrometer.

K10-F6 were obtained as gifts from Dr. David Tirrell (Caltech). The plasmid pRO148 is a derivative of pQE16, one of a set of vectors available from Qiagen that allow expression of six-His–tagged coding sequences. pRO117 is a derivative of the pACYC177 plasmid pREP4 and also encodes the lacIq gene. Both of the requisite genes (tRNA synthetase and tRNAPhe/amber from pRO148 and pRO117, respectively) can be subcloned after digestion with PvuII into blunt-end restriction sites; thus, it should be feasible to use other vectors as long as they are compatible. Because overexpression of PapD is toxic to E. coli, it was essential to repress the papD gene by placing the lacIq gene in cis rather than in trans. We therefore used the pQE80 vector (Qiagen), which has this feature but also has a six-His coding sequence between the start codon and the multiple cloning site. To remove the six-His tag sequence, we used the polymerase chain reaction (PCR) to generate a fragment that could be inserted between the EcoRI site upstream of the RBS and ATG start sites and a KpnI site within the multiple cloning site of pQE80. The forward primer should therefore include the RBS and ATG start sites that are normally present in the pQE80 plasmid, together with an additional 12 bases of coding sequence from the gene of interest. For instance, to remove the six-His tag and introduce the papD gene into pQE80, we used as the forward primer the sequence CCCGAATTCATTAAAGAGGAGAAATTAACTATGATTCGAAAAAAG. This primer encodes the pQE80 EcoRI site, the RBS, and the ATG start site. If the gene of interest is not toxic, or if the six-His tag is desired for ease of purification, then the gene of interest can be subcloned using the multiple cloning sites provided in the appropriate vector.

Fluorine-Labeled Amino Acids That Can Be Incorporated into Proteins

As implied above, a number of fluorine-labeled amino acids are commercially available. These include 4-, 5-, and 6-19F-tryptophan; o-, m-, and p-19F-phenylalanine; m-19F-tyrosine; and 19F-methylhistidine. Incorporation of difluoromethionine,34 hexafluoroleucine,35a (2S,4R)-5-19F-leucine,35b and 19F-proline35c has been reported. These other analogs will provide valuable tools for the investigation of protein structure and folding. The recent discovery of a fluorinase36 may allow other fluorine-labeled amino acids to be synthesized enzymatically. Currently, only 19F-phenylalanine can be incorporated site specifically as described above. On the other hand, Wang et al.37 reported a method for site-specific incorporation of O-methyltyrosine, suggesting that it may be possible to incorporate 19F-tyrosine site specifically. The chapter by Gerig5 includes a number of suggestions for the incorporation of fluorine labels into proteins, and the reader would be well served by consulting this reference.

Site-Specific Labeling with Cysteine-Reactive Compounds

Another method that has been used for site-specific labeling of proteins with fluorine for 19F-NMR experiments is the use of compounds that react with cysteine residues. This approach was pioneered by Gerig5 and has the advantage that one can probe a single site. Obviously, the effect of the cysteine mutation, before and after treatment with a fluorinated compound, must be taken into account. However, a major advantage of cysteine labeling is that one can use these sites to probe with other cysteine-reactive reagents, such as fluorescent or spin labels, for comparative purposes. A recent study of the integral membrane protein rhodopsin has shown that fluorine labeling of two cysteine residues that are close in space can allow one to obtain through-space information under different conditions (i.e., light and dark) using 19F nuclear Overhauser effects (NOEs).38

34 M. D. Vaughn, P. Cleve, V. Robinson, H. S. Duewel, and J. F. Honek, J. Am. Chem. Soc. 121, 8475 (1999).

35 (a) Y. Tang and D. A. Tirrell, J. Am. Chem. Soc. 123, 11089 (2001). (b) J. Feeney et al., J. Am. Chem. Soc. 118, 8700 (1996). (c) C. Renner et al., Angew. Chem. Int. Ed. 40, 923 (2001).

36 D. O’Hagan, C. Schaffrath, S. L. Cobb, J. T. Hamilton, and C. D. Murphy, Nature 416, 279 (2002).

37 L. Wang, A. Brock, B. Herberich, and P. G. Schultz, Science 292, 498 (2001).

38 M. C. Loewen et al., Proc. Natl. Acad. Sci. USA 98, 4888 (2001).


TABLE II
Cysteine-Reactive Fluorinating Reagents

Formula                    Name                                           Commercially available   Reference
CF3-CH2-SH                 2,2,2-Trifluoroethanethiol                     Yes                      38, 39
CF3-CO-CH2-Br              3-Bromo-1,1,1-trifluoroacetone                 Yes                      41, 42
F-C6H4-SH                  4-Fluorobenzenethiol                           Yes                      43
CF3-C6H4-NH-CO-CH2I        4-(Trifluoromethyl)phenyliodoacetamide         No                       44
(CF3)3C-C6H4-NH-CO-CH2I    4-(Perfluoro-tert-butyl)phenyliodoacetamide    No                       45

There are a number of different compounds that can be used to label cysteines, and some recently used ones are presented in Table II. Although several cysteine-reactive fluorocompounds have been used recently for 19F-NMR studies, trifluoroethylthiol (TET) is a highly selective reagent for cysteine.38,39 Whereas a fluorohalogenated compound may in some cases show reactivity toward amines, TET labeling of cysteine is specific for the formation of a disulfide. TET labeling produces an analog that is similar in length to lysine and chemically similar to methionine, such that either substitution [Lys to Cys(TET) or Met to Cys(TET)] would probably result in minimal perturbation of the molecule. Also, this compound exhibits a sharp resonance, owing to the lack of proton coupling of the CF3 group.40–45

Conclusions

This chapter outlines the methods that can be used to incorporate fluorine-labeled amino acids, as well as other ligands, into proteins. There are many reasons for using these techniques to study the structural and dynamic aspects of proteins. Of these, the most dramatic is the sensitivity of the fluorine nucleus to its environment under conditions in which there are only minimal perturbations of the wild-type structure. Thus, studies can investigate the changes due to ligand binding and protein–protein interactions, as well as protein folding. Currently, no good theory exists for relating chemical shifts to the structural environment. When such a theory is developed, the amount of information that can be obtained from fluorine chemical shifts should be impressive.

39 J. Klein-Seetharaman, E. V. Getmanova, M. C. Loewen, P. J. Reeves, and H. G. Khorana, Proc. Natl. Acad. Sci. USA 96, 13744 (1999).

40 G. R. Drapeau, W. J. Brammar, and C. Yanofsky, J. Mol. Biol. 35, 357 (1968).

41 M. R. Thomas and S. G. Boxer, Biochemistry 40, 8588 (2001).

42 K. Oxenoid, F. D. Sonnichsen, and C. R. Sanders, Biochemistry 41, 12876 (2002).

43 J. P. Caradonna, E. W. Harlan, and R. H. Holm, J. Am. Chem. Soc. 108, 7856 (1986).

44 J. W. Shriver and B. D. Sykes, Biochemistry 21, 3022 (1982).

45 D. Heintz, H. Kany, and H. R. Kalbitzer, Biochemistry 35, 12686 (1996).

Appendix I: General Method for the Production of Proteins Site Specifically Labeled with p-19F-Phenylalanine Using PapD as an Example

Originally, the procedure described by Furter for labeling mouse dihydrofolate reductase was to grow the bacteria in M9 minimal media supplemented with phenylalanine and p-19F-phenylalanine (0.2 and 3 mM, respectively).31 This serves to maintain selective pressure for resistance to p-19F-phenylalanine. In the original paper,31 the cells were grown to an A600 of 1.0 and then shifted to media containing 3 mM p-19F-phenylalanine and 0.03 or 0.04 mM phenylalanine. It was found that the level of specific p-19F-phenylalanine incorporation increased with "an unproportional 2.3-fold increase in p-19F-phenylalanine contamination" when using 0.03 versus 0.04 mM phenylalanine. Because the level of specific incorporation is important, we use a final concentration of 0.04 mM phenylalanine for labeling, and find very similar levels of specific incorporation over uniformly labeled protein. A modified protocol for site-specific labeling is described below and should be applicable to other proteins. Two days prior to the growth of bacteria, frozen K10-F6 cells harboring the pQE80papD Phe→amber mutant plasmid and pRO117 are streaked onto LB agar plates containing 100 µg/ml ampicillin and 50 µg/ml kanamycin. On the morning of the following day, a single colony is picked and grown at 37° in 5 ml of LB containing antibiotics. In the evening, the cells are diluted 1:1000 into 100 ml of the defined media containing 0.2 mM phenylalanine and 3 mM p-19F-phenylalanine with antibiotics.
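The millimolar concentrations quoted above can be converted to weighed amounts with a short calculation. The following is a sketch under stated assumptions: the compositions are those of the free, anhydrous amino acids (C9H11NO2 for phenylalanine, C9H10FNO2 for p-fluorophenylalanine), the atomic masses are standard values rounded to three decimals, and the function names are illustrative. A salt or hydrate form of the reagent would require the molar mass given by the supplier instead.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

# Elemental compositions of the free amino acids
# (assumed anhydrous, no counter-ion).
PHENYLALANINE = {"C": 9, "H": 11, "N": 1, "O": 2}
P_FLUORO_PHENYLALANINE = {"C": 9, "H": 10, "F": 1, "N": 1, "O": 2}


def molar_mass(composition: dict) -> float:
    """Sum atomic masses over an elemental composition (g/mol)."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def grams_needed(conc_mM: float, volume_l: float, composition: dict) -> float:
    """Mass in grams required for conc_mM (mmol/l) in volume_l liters."""
    return conc_mM / 1000.0 * volume_l * molar_mass(composition)


if __name__ == "__main__":
    # Amounts for 1 liter of defined medium at 0.2 mM Phe / 3 mM p-19F-Phe.
    print(f"Phe:       {grams_needed(0.2, 1.0, PHENYLALANINE) * 1000:.1f} mg")
    print(f"p-19F-Phe: {grams_needed(3.0, 1.0, P_FLUORO_PHENYLALANINE) * 1000:.1f} mg")
```

For the 100-ml overnight culture, the same functions can be called with volume_l set to 0.1.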
The defined media we use30 contains (for a 1-liter volume) 0.2 mM phenylalanine/3 mM p-19F-phenylalanine, 0.5 g alanine, 0.4 g arginine, 0.4 g aspartate, 0.05 g cystine, 0.4 g glutamine, 0.65 g glutamate, 0.55 g glycine, 0.1 g histidine, 0.23 g isoleucine, 0.23 g leucine, 0.42 g lysine hydrochloride, 0.25 g methionine, 0.1 g proline, 2.1 g serine, 0.23 g threonine, 0.17 g tyrosine, 0.23 g valine, 0.5 g adenine, 0.65 g guanosine, 0.2 g thymine, 0.5 g uracil, 0.2 g cytosine, 1.5 g sodium acetate, 1.5 g succinic acid, 0.5 g NH4Cl, 0.85 g NaOH, and 10.5 g K2HPO4 per 950 ml of H2O. This is then autoclaved. To this is added 50 ml of a sterile-filtered solution of 40% glucose, 4 ml of 1 M MgSO4, and 10 ml of a sterile solution of 2.7 mg FeCl3·6H2O, 2 mg CaCl2·2H2O, 2 mg ZnSO4·7H2O, 2 mg MnSO4·H2O, 50 mg L-tryptophan, 50 mg thiamine, 50 mg niacin, and 1 mg biotin. This solution has a pH near 7.2. The antibiotics (1 ml of 50 mg/ml kanamycin, 1 ml of 100 mg/ml ampicillin) are added separately.

We have found that it is important to maintain the bacteria in the logarithmic phase of growth and not to let them enter stationary phase, as we have observed premature lysis of the bacteria. Thus, the overnight culture is started rather late (7:00 p.m.). The bacteria from the overnight culture are then diluted 1:50 into 750 ml of media in Fernbach flasks. The cells are grown to an A600 of 1.0 and harvested by centrifugation. Once harvested, the cells are washed twice with 0.9% NaCl, 3 mM p-19F-phenylalanine. The resuspended cells are grown for about 20–30 min. Then, IPTG is added to a final concentration of 1.0 mM, and the cells are grown for an additional 1–3 h. Following this, the usual procedures for purifying the protein of choice are applicable.
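The dilutions in this protocol follow from simple proportionality, and a short script can serve as a bench check. This is a minimal sketch, not part of the original procedure: the function names are illustrative, the small volume contributed by each addition is neglected, and the 1 M IPTG stock is a hypothetical choice that is not specified in the text.

```python
def final_concentration(stock_conc: float, stock_volume_ml: float,
                        final_volume_ml: float) -> float:
    """C2 = C1 * V1 / V2, ignoring the volume contributed by the stock."""
    return stock_conc * stock_volume_ml / final_volume_ml


def inoculum_volume_ml(culture_volume_ml: float, dilution_factor: float) -> float:
    """Volume of starter culture for a 1:dilution_factor dilution."""
    return culture_volume_ml / dilution_factor


def stock_volume_ml(target_conc: float, stock_conc: float,
                    final_volume_ml: float) -> float:
    """Volume of stock needed to reach target_conc (same units as stock_conc)."""
    return target_conc * final_volume_ml / stock_conc


if __name__ == "__main__":
    # 1 ml of 50 mg/ml kanamycin or 100 mg/ml ampicillin per liter of medium.
    kan_ug_ml = final_concentration(50.0, 1.0, 1000.0) * 1000.0
    amp_ug_ml = final_concentration(100.0, 1.0, 1000.0) * 1000.0
    print(f"kanamycin {kan_ug_ml:.0f} ug/ml, ampicillin {amp_ug_ml:.0f} ug/ml")

    # 1:1000 into the 100-ml overnight culture; 1:50 into 750 ml per flask.
    print(f"overnight inoculum: {inoculum_volume_ml(100.0, 1000.0):.2f} ml")
    print(f"flask inoculum:     {inoculum_volume_ml(750.0, 50.0):.1f} ml")

    # IPTG to 1.0 mM in 750 ml from an assumed (hypothetical) 1 M stock.
    print(f"IPTG stock:         {stock_volume_ml(1.0, 1000.0, 750.0):.2f} ml")
```

The antibiotic results agree with the 50 µg/ml kanamycin and 100 µg/ml ampicillin used on the LB plates.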

Author Index

Numbers in parentheses are footnote reference numbers and indicate that an author’s work is referred to although the name is not cited in the text.

A Abaturov, L. V., 246, 247(30), 248(30), 250(30), 263(30) Abildgaard, F., 236 Abkevich, V., 369, 370 Abrahams, J. P., 69, 74(68), 133, 136(7), 148, 149(7), 150(7), 151(7) Acharya, K. R., 252 Achouri, Y., 107 Ackers, G. K., 93, 99, 99(33), 129, 208, 351 Aggeler, R., 132, 139 Ahmad, F., 259 Ahmad, Z., 137 Akiyama, S., 273, 274(117), 315 Alberty, R. A., 192 Alcorn, S. W., 249 Alden, R. A., 64 Alexov, E., 37, 39(35), 46(34; 35) Allison, W. S., 137, 147 Almas, B., 16 Al-Shawi, M. K., 132, 137, 138, 139, 140, 149 Amaratunga, M., 157, 165(12), 168(12), 169(12) Amemiya, Y., 270 Amodeo, P., 234 Anantharayanan, V. S., 259 Anderrsen, T. T., 127 Anderson, D. E., 50, 407(30), 408, 414(30) Andreola, A., 263(85), 264 Andres, J., 56 Angeletti, R. H., 9(16), 11, 13(16), 15, 15(16) Angelini, N., 314 Antosiewicz, J. J., 24, 29, 31, 33(8; 20), 34, 34(17; 20), 39(26), 43, 45(57), 46 Aplin, R. T., 286, 292(9), 304(9) Apostol, I., 127 ˚ qvist, J., 56, 68, 84 A

Arai, M., 244, 245(17; 19), 246(17), 254(17), 264, 265(99), 266(99), 270 Aravind, L., 107 Arayata, C., 208, 211(11), 213(11), 214(11), 215(11), 216(11), 217(11) Arcus, V. L., 48 Argarana, C. E., 7(9), 8 Arico-Muendel, C., 254, 255(57), 256(57; 59), 263(59), 269(59) Arjunan, P., 62 Armstrong, N. A., 171, 176(8), 178(8), 179(8), 180(8) Arrington, C. B., 328(3), 329, 333(3), 388, 389(18), 398, 398(18) Arrondo, J. L. R., 7(9), 8 Arscott, D., 165 Arvola, M., 171 Ash, E. L., 66 Ashcroft, A. E., 286 Austin, R. H., 273, 315 Ausubel, F. M., 159

B Baase, W. A., 45, 50 Babul, J., 272, 319 Baenziger, J. E., 172 Bagatolli, L. A., 7(9), 8 Bai, P., 406 Bai, Y., 242, 277, 286, 289, 328(2), 329, 338, 358, 378(28), 380, 386(4), 398(4) Baker, D., 286, 343, 347(20), 350, 375, 377(63) Balazs, A. C., 369 Balbach, J., 264, 266, 266(100) Baldwin, R. L., 242, 261, 262, 278, 338, 376, 377 Banaszak, L. J., 107 Bandarian, V., 152, 155, 157, 159(11), 162(13), 166(13), 167(13)

417

418

author index

Bann, J. G., 400, 401, 403(10) Barany, G., 378, 379, 395, 400 Barbar, E., 219, 224, 226, 226(19), 227(19), 228(19), 233(18), 234(18), 237(18), 238(18), 395 Barbarese, E., 220 Bardsley, B., 9 Barlow, D., 250 Barrantes, F. J., 312 Barrick, D., 278 Bashford, D., 28, 29, 29(14), 34, 34(15), 36 Bateman, A., 223 Batt, C. A., 270 Battiste, J. L., 387, 395(14) Baum, J., 247, 248(33), 249, 249(33), 250(33; 46) Bax, A., 229, 231 Beauregard, D. A., 6 Beitz, J. V., 312(10), 313 Bellefeuille, S. M., 135 Bellotti, V., 263(85), 264 Benashski, S. E., 220 Benevolenskaya, E. V., 220 Benhamu, B., 6 Benjamin, D. C., 7(10), 8 Berge, S. V., 16 Berk, A. J., 208, 213 Beroza, P., 36 Bertelsen, E. B., 232 Bertrand Garcı´a-Moreno, E., 33, 35, 37(40), 38, 41(28; 29), 43(21; 28), 44(28), 45(58), 46, 47(58), 48(58), 49, 49(40), 51(67) Bettler, B., 170 Bhattacharya, S., 45(58), 46, 47(58), 48(58) Bhuyan, A., 311, 313(3), 320(3), 323(3), 324(3) Bianchi, V., 163 Bigelow, C. C., 259 Bijol, V., 135, 138 Biltonen, R., 285, 358, 361(24) Birney, E., 223 Blackwell, C. M., 157(15), 158 Bloomfield, V., 192 Blout, E. R., 269 Blow, D., 61 Blumenthal, R. M., 154, 168(2) Blundell, T., 250 Bo¨ckmann, R. A., 70 Bo¨hm, H.-J., 3, 4(1)

Boime, I., 400 Bolen, D. W., 395, 399(24) Bolognesi, M., 85, 260 Bolotina, I. A., 246, 247(30), 248(30), 250(30), 263(30) Borkakorti, N., 250 Bornberg-Bauer, E., 370 Botts, J., 192 Boube, M., 207 Boulas, S., 174, 175(20) Bourbon, H. M., 207 Bowie, J. U., 361 Bowman, C., 136, 138 Boxer, S. G., 413 Boyer, P. D., 52, 69(4), 74(4), 133 Boyer, T. G., 213 Bradford, M. M., 139 Brahms, J., 268 Brahms, S., 268 Braig, K., 143 Braiman, M. S., 172, 174, 174(17) Brammer, W. J., 407(40), 413 Brandts, J. F., 358, 361(24) Braxton, B. L., 188 Brazhnikov, E. V., 246, 247(30), 248(30), 250, 250(30), 263(30) Brenowitz, M., 208 Brent, R., 159 Brew, K., 252, 264, 266(100) Briggs, J. M., 43 Bright, J. R., 238 Brock, A., 412 Bromberg, S., 354 Broos, J., 135 Brown, A., 62 Brown, C. J., 223 Bruice, T. C., 56, 57 Bruix, M., 43, 242 Brunori, M., 18 Bryant, C., 272 Bryant, R. G., 398, 399(35) Bryngelson, J. D., 309, 362, 373(33), 374(33), 378(33) Budisa, N., 405, 410(22) Buratowski, S., 208 Burger, H. G., 244 Burns, R. O., 86, 87 Burykin, A., 70, 80(73) Butt, T. R., 278

author index Bychkova, V. E., 246, 247(30), 248(30), 250, 250(30), 263(30) Bycroft, M., 45, 364, 375(41) Byler, D. M., 180

C Cadieux, C., 263(98), 264 Calciano, L. J., 256, 257(64), 258(64), 259(64; 65), 260(65), 277(65) Calderone, C. T., 6 Calhoun, D. B., 388, 392(17) Campbell, I. D., 238, 406 Cantor, C. R., 7(9), 8, 351 Capaldi, R. A., 132, 137, 138, 139 Caradonna, J. P., 413 Carey, J., 223 Carey, M., 207, 208, 210, 211(11), 213(11; 21), 214(11), 215(11), 216(11; 21), 217(11), 218 Carlsson, U., 252 Carp, S., 45 Carter, P., 64 Carulla, N., 378, 379, 400 Case, D. A., 34, 36 Cassidy, C. S., 66, 67(64), 68(64) Castro, C., 154 Cavagnero, S., 286 Cerasoli, D., 345 Cerpa, R., 314 Cerruti, L., 223 Chae, Y. K., 236 Chaffotte, A. F., 245, 262, 262(24; 26), 263(25; 86; 87; 98), 264, 266(25), 267(25), 268(25), 271(87), 273(87), 285(26; 87), 326 Chait, B. T., 9(14), 11 Chakravarti, K., 383 Chakravarty, S., 134 Chamberlain, A. K., 278, 286, 338 Chan, C.-K., 311, 313(3), 320(3), 323(3), 324(3) Chan, H. S., 242, 350, 351, 353, 354, 355, 355(9), 356, 356(19; 20), 357(17–20), 358, 358(13), 360(13), 361(12; 13), 363, 363(13), 364, 364(12; 13), 365, 365(12; 19), 366, 367(13; 43; 45), 368(43), 369, 369(43), 370, 371(43), 372(13; 45; 49), 374(26; 45; 49), 375, 375(26; 45; 48), 376, 376(65), 378(26; 43), 398

419

Chan, S. I., 314 Chang, C. T., 281, 282(134) Changeux, J.-P., 17, 85, 87, 88, 187, 192(4) Chapman, D., 172, 177(13), 180 Chapmann, E. R., 236 Chau, K. H., 248, 281(38), 282(38) Che, D., 319 Chehin, R., 16 Chen, E., 273, 275(121), 276, 276(121), 308, 314, 321, 323, 324(31), 325, 325(33), 326, 326(34) Chen, G.-Q., 171 Chen, J., 286 Chen, X., 46 Chen, Y. H., 248, 281(38), 282(38) Cheng, Q., 171, 175, 176(25), 183(24), 184(24), 185(24) Cherepanov, D. A., 70 Chesick, J. P., 311, 313(4), 320(4), 321(4) Chi, T., 208, 210, 213(21), 216(21) Chien, C. Y., 229 Chinchilla, D., 85, 86, 100, 102(40) Chipman, D. M., 107 Chisholm, R. L., 223 Chiti, F., 263(85), 264 Chorongiewski, H., 174, 175(20) Christenson, H., 263(89), 264, 285(89) Christy, A. A., 172 Chrysina, E. D., 252 Chu, Z. T., 30, 38(18), 39, 39(18), 67, 78, 79(94), 80(66; 94) Chung, S. H., 79 Churg, A. K., 33(23), 34, 62 Chyan, C.-L., 247, 248(33), 249(33), 250(33) Cistola, D. P., 401 Clardy, J., 227 Clarke, J., 334 Clegg, R. M., 312 Cleland, W. W., 64, 65(54; 56), 66(54), 144, 190, 191(15), 193 Clementi, C., 373 Cleve, P., 412 Clore, G. M., 238, 263(89), 264, 285(89), 342 Cobb, S. L., 412 Cohen, F. E., 314, 343 Cole, R., 373 Coletta, M., 18 Collins, C. A., 221 Colosimo, A., 18 Comai, L., 406

420

author index

Condliffe, P. G., 244 Cooper, A., 7 Cooper, E. A., 172 Copie´, V., 396 Corthesyntheulaz, I., 219 Coue, M., 219 Coulson, C. A., 65 Covell, D. G., 356 Cox, G. B., 139 Cox, J. P. L., 3, 4(2) Coyle, J. E., 286 Crane-Robinson, C., 244 Creighton, S., 83(100), 84 Creighton, T. E., 269 Cribbs, D. L., 207 Crichton, R. R., 38 Crosby, J., 61, 62(35), 63(35) Crowhurst, K. A., 48

D Dahlquist, F. W., 50, 407(30), 408, 414(30) Daily, M. D., 43 Dalvit, C., 34 Damaschun, G., 266 Dancer, R. J., 6 Daniel, W. E., 405 Danielson, M. A., 400, 401(4) Danielsson, 65 Danko, C. A., 369 Dao-pin, S., 45, 50 Darnton, N. C., 273, 315 Dato-Samada, Y., 133 Datta, P., 86 Daugherty, M. A., 93, 99(33) Davidson, A. R., 367 Davies, D. D., 106 Dawes, T. D., 319 Decedue, C. J., 87 DeGrado, W. F., 314 Deinzer, M. L., 286 Deisenhofer, J., 82 Delaglio, F., 229, 231 Del Buono, G. S., 26 Delepierre, M., 245, 262, 262(24; 26), 285(26) Demchuk, E., 37 Deming, D., 171 Demura, M., 254, 266 Deng, Y., 9(15), 11, 13(15), 286, 287, 288(22), 292, 304(31), 334

Devi-Kesavan, L. S., 63 Dewar, M. J. S., 61 Dill, K. A., 242, 353, 354, 366, 367, 369, 375(48), 389(21), 390, 398 Dillman, J. F., 220 Dimitrov, R. A., 38 Di Nardo, A. A., 367 Dioumaev, A. K., 174 Dixon, M. E., 390, 398(22) Dixon, M. M., 154 Do, K. T., 220 Dobson, C. M., 238, 242, 247, 248(33), 249, 249(33), 250(33; 46–48), 254, 255(57), 256(57; 59), 263(59; 85; 90), 264, 266, 266(90; 100), 269, 269(59), 285(90), 286, 292(9), 304(9), 378 Dodson, G. G., 18 Doig, A. J., 3, 4(2) Dolgikh, D. A., 246, 247(30), 248(30), 250, 250(30), 263(30) Dominguez, M. A., Jr., 405, 406(21) Dong, K., 137 Dou, C., 147 Doukov, T., 154 Downie, J. A., 139 Doyle, M. L., 93, 99(33) Dragan, A. I., 358 Drapeau, G. R., 407(40), 413 Drennan, C. L., 154, 157, 165(12), 168(12), 169(12) Drummond, J. T., 154, 168(2) Ducote, K. R., 88, 89(24), 97(24), 102(24) Duewel, H. S., 412 Dujardin, D. L., 219 Dukor, R. K., 248, 250(35) Dunham, W. R., 163, 164(17) Dunker, A. K., 223 Dunn, B. C., 174, 182(21) Dunn, R. C., 275 Dupureur, C. M., 405, 406, 406(21) Durbin, R., 223 Dutton, P. L., 80 Dwyer, J., 49, 51(67) Dyson, H. J., 286, 349

E Eaton, W. A., 273, 311, 313(3), 315, 320(3), 323(3), 324(3) Echabe, I., 7(9), 8

author index Eddy, S. R., 223 Edelhoch, H., 244 Edgecomb, S. P., 24, 33(7) Eftink, M. R., 134, 141, 142(44) Einterz, C. M., 316 Eisenberg, D., 361 Eisenstein, E., 85, 86, 87, 88, 89(20; 21; 24; 27), 90, 93(11; 20; 21), 94(21), 95(20; 21), 96(21; 27), 97(20; 21; 24; 27), 99, 100, 100(29), 102(24; 40) Eliasson, R., 163 Eliopoulos, E. E., 259, 260, 261(68) Ellwood, K., 208, 218 Elo¨ve, G. A., 263(82; 87), 264, 271(87), 273(87), 285(87), 326 Elston, T., 70 Emelyamenko, V. I., 254, 256(59), 263(59), 269(59) Engelborgh, Y., 263(86), 264 Engen, J. R., 287 Englander, J. J., 291, 344, 388, 392(17) Englander, S. W., 242, 271, 272(114), 273(114), 276(6), 277, 286, 289, 291, 328(1; 2), 329, 334(9), 335, 338, 340, 344, 345, 358, 378, 378(28), 380, 386(4), 388, 389, 392(17), 398(4) Ermler, U., 77(90), 78 Ermolenko, D. N., 46 Ervin, J., 278, 280(132) Esquerra, R. M., 319, 323, 325(33) Etwiller, L., 223 Evans, J. C., 154 Evans, P. A., 247, 248(33), 249, 249(33), 250(33; 46), 263(90), 264, 266(90), 285(90) Evans, P. R., 85, 96(4) Eyles, S. J., 286 Eyring, E. M., 174, 182(21)

F Fabiato, A., 144 Fabiato, F., 144 Fairman, R., 45 Falcioni, G., 18 Falick, A. M., 334 Falke, J. J., 400, 401(4), 404 Faller, L. D., 312(11), 313 Fan, H. Y., 207

421

Farrow, N. A., 358 Fasman, G. D., 248, 249(36), 273(36), 281(36), 282(36) Fasolka, M. J., 369 Faulkner, N. E., 219 Fayle, D. R. H., 139 Feeney, J., 412 Feher, G., 78 Feierberg, I., 68 Feng, W. Q., 229 Ferguson, S. J., 406 Fersht, A. R., 13, 17, 45, 46, 48, 64, 69, 130, 350, 359, 364, 371(1), 375(41) Fiaux, J., 232 Fidelio, G. D., 7(9), 8 Figueirido, F. E., 26 Filmer, D., 187, 192(5) Findlay, J. B. C., 259, 261(68) Fine, R., 26, 30(10) Fink, A. L., 244, 256, 257(64), 258(64), 259(64; 65), 260(65), 276, 277(65), 321, 324(31) Finn, B. E., 255, 263(61), 280(61), 281(61), 282(61) Fischer, N., 378 Fisher, E. E., 86 Fisher, K. E., 88, 89(24), 90, 97(24), 100(29), 102(24) Fissi, A., 314 Fitch, C. A., 20, 35, 37(40), 38, 41(28; 29), 43(28), 44(28), 45(58), 46, 47(58), 48(58), 49(40) Fitzgerald, M. C., 349 Flanagan, M. A., 33, 43(21) Flatmark, T., 16 Flint, D. G., 314 Florian, J., 62, 73 Fluhr, K., 157, 160(14), 163(14) Flynn, G. W., 312(10; 11), 313 Folque´, H., 260 Fontecave, M., 163 Ford, L. O., 56 Forge, V., 264, 266, 266(100), 270 Forman-Kay, J. D., 48, 358 Forsyth, W. R., 24, 33(8), 45(56; 57), 46, 263(88), 264 Fortes, P. A. G., 147 Fothergill, M., 56 Fox, J. W., 127 Fraser, S. G., 89(26), 90, 95(26)

422

author index

Frei, H., 174 Freire, E., 7(11), 8, 187, 344, 350 Fre`re, J. M., 37 Freskga˚rd, P.-O., 252 Frey, P. A., 64, 65(55; 56), 66, 66(55), 67(64), 68(64) Frieden, C., 193, 263(93), 264, 400, 401, 402, 402(7), 403(7–10), 407, 407(8; 9) Friguet, B., 263(97), 264 Fuchs, J. A., 37, 384, 387, 392(12), 395(12) Fueki, S., 263(91), 264 Fujii, K., 154 Fujisawa, T., 315 Fujiwara, K., 270 Fulwyler, M. J., 312 Furter, R., 407(31), 408, 410(31), 414(31) Futai, M., 133, 137, 139, 149 Fuxreiter, M., 38

G Gallagher, D. T., 85, 86 Gallagher, W., 397 Gallivan, J. H., 154 Gao, J., 63 Garcia, A., 390 Garcı´a, A. E., 357 Garcia, C., 286 Garcia-Mira, M. M., 378 Garde, S., 357 Gardner, M., 3, 4(2) Garner, E. C., 223 Garrett, D. S., 238 Garrow, T. A., 154 Garvey, E. P., 255, 263(61), 280(61), 281, 281(61), 282(61) Gast, K., 266 Gaunitz, S., 14 Geller, M., 34, 39(26) Georgescu, R. E., 37 Gerhard, U., 3, 4(2) Gerhart, J. C., 85 Gerig, J. T., 400, 401, 401(5), 403(5), 412(5) Gerlt, J. A., 64, 66(56) Gerstein, M., 353(10), 354, 355(10) Gerwert, K., 29, 34(15), 174, 175(20) Getmanova, E. V., 413 Ghaemmaghami, S., 349 Ghosh, T., 357 Giardina, B., 18

Gibbons, C., 151 Gibbons, I., 99 Gibson, F., 139 Gierasch, L. M., 261, 277(78), 286 Gill, S. J., 18, 23, 112, 119(18), 131(18) Gillespie, B., 362, 362(34), 374(34) Gilliland, G. L., 86 Gilmanshin, R. I., 246, 247(30), 248(30), 250, 250(30), 263(30) Gilson, M. K., 29, 31, 33(20), 34, 34(17; 20), 39(26), 45(57), 46 Giorgetti, S., 263(85), 264 Gittis, A. G., 49, 51(67) Glennon, T. M., 77 Glo¨ckner, J., 248, 249(37), 252(37), 254(37), 260(37) Gmeiner, W. H., 287 Go, N., 373, 378(59) Goldbeck, R. A., 275, 308, 319, 323, 325(33), 326 Goldberg, M. E., 245, 262, 262(24; 26), 263(25; 87; 97; 98), 264, 266(25), 267(25), 268(25), 271(87), 273(87), 285(26; 87), 326 Gonzalez, M., 7(9), 8 Goodenough, U., 220 Gooding, E. A., 314 Gordon, J. I., 263(93), 264 Goto, Y., 256, 257(64), 258(64), 259, 259(64), 260, 260(67), 261, 270, 270(67; 69) Gouaux, E., 171, 175, 176(8; 25), 178(8), 179(8), 180(8) Goulding, C. W., 154, 157, 160(14), 163(14) Graf, R., 223 Gralla, J. D., 208 Grant, G. A., 106, 107, 108, 109(1; 2; 15) Grantcharova, V. P., 343, 347(20) Gray, H. B., 311, 313(4; 5), 320(4; 5), 321(4; 5) Green, M. R., 208, 210 Greenberg, D. M., 106 Greenblatt, J., 235 Greene, L. H., 252 Greenfield, N. J., 230, 248, 249(36), 273(36), 281(36), 282(36) Greer, J., 250, 253(54) Gregoriou, V. G., 172 Grell, E., 135, 136, 137, 138, 139, 139(25), 143(25), 145(25), 146, 147(25), 152(25) Griffiths-Jones, S., 223 Grimsley, G. R., 43, 45

author index Grishina, I. B., 250, 255, 269(62), 281(62) Grissom, P. M., 219 Gronenborn, A. M., 238, 263(89), 264, 285(89), 342 Gross, L. A., 286 Groves, P., 6 Gru¨ber, G., 137 Grubmeyer, C., 15 Grubmtiller, H., 70 Gruebele, M., 278, 280(132) Gru¨ner, S. M., 273, 315 Grzesiek, S., 229, 231 Guarente, L., 208 Guarnieri, F., 38 Guijarro, J. I., 262 Guillou, Y., 245, 262(24; 26), 263(25; 98), 264, 266(25), 267(25), 268(25), 285(26) Gunner, M. R., 29, 34(16), 36(16), 37, 38(16), 39(35), 46(35) Guntert, P., 231 Guo, H., 57 Guo, W., 227 Gurd, F. R. N., 33, 43(21) Gutin, A., 369, 370 Gvozdev, V. A., 220

H Haezebrouck, P., 254, 255(57), 256(57), 263(58) Hagstrom, R., 26, 30(10) Hahn, S., 207, 208, 210(10) Hai, T., 208 Hall, K. B., 401, 404 Hallman, L. M., 406 Halvorson, H. R., 93 Hamada, D., 259, 260, 260(67), 261, 270(67; 69) Hamilton, J. T., 412 Hamlin, L. M., 244 Hammond, S. T., 135, 150, 151(61) Hammonds, R. G., Jr., 244 Hampsey, M., 207 Han, M. S., 99 Handel, T. M., 286, 338 Hanley, C., 249, 250(46) Hansen, K. C., 314 Harbury, P. B., 38 Hare, M., 219, 224, 226(19), 227(19), 228(19), 233(18), 234(18), 237(18), 238(18)

Haris, P. I., 180 Harlan, E. W., 413 Hartl, D. L., 220 Haslam, E., 57 Hassan, A. H., 209, 210(19) Hatfield, G. W., 86 Haugland, R. P., 134 Havranek, J. J., 38 Hayes, J., 396 Haynie, D. T., 254, 256(59), 263(59), 269(59) Hays, T., 219, 224, 226(19), 227(19), 228(19), 233(18), 234(18), 237(18), 238(18) Head, M. S., 34, 39(26) Heidary, D. K., 286 Heinemann, S., 170 Heintz, D., 413 Hellinga, H. W., 38 Hendsch, Z., 45 Henikoff, J. G., 409 Henikoff, S., 409 Henry, E. R., 311, 313(3), 320(3), 323(3), 324(3) Herberich, B., 412 Hermes, J. D., 144 Heuser, J., 220 Hill, A. V., 118 Hiller, R., 242, 276(6) Hilser, V. J., 340 Hilton, B., 380, 383(2), 388(2) Hilvert, D., 53, 61(11) Hinz, H.-J., 245, 262(24) Hiraoka, Y., 246, 247(31), 248(31), 249(31), 264(31) Hitchcock-DeGregori, S. E., 230 Hitchens, T. K., 390, 398, 398(22), 399(35) Ho, C., 405 Hochstrasser, R. M., 312(9), 313, 314 Hodge, C. N., 34, 39(26) Hoeltzli, S. D., 400, 401, 402, 402(7), 403(7–9), 407(8; 9) Hoffman, G. W., 312 Hofler, J. G., 87 Hofrichter, J., 311, 313(3), 320(3), 323(3), 324(3) Hogue, C. W. V., 135 Holladay, L. A., 244 Hollmann, M., 170 Holm, R. H., 413 Holmes, L. G., 248, 249(34) Holt, J. M., 351

Holzbaur, E. L. F., 221 Honek, J. F., 412 Honig, B., 26, 29, 30(10), 31, 34, 34(16), 36(16), 38(16) Hoover, D. M., 163, 164(17) Hore, P. J., 266 Hori, R., 210, 213(21), 216(21) Horikoski, M., 208 Horn, P. J., 207 Horton, G., 400 Horwich, A. L., 232, 286 Hoshino, M., 270 Houk, K. N., 61, 62(36) Houry, W. A., 263(94), 264, 276, 276(94), 277(94), 364, 375(42) Howe, K. L., 223 Hrylska, H., 34, 39(26) Hu, X., 172, 174 Hu, Y., 311, 313(3), 320(3), 323(3), 324(3) Hu, Z., 106, 108, 109(1; 2; 15) Huang, S., 154, 157, 160(14), 163(14), 168(2) Huang, Y. J., 230 Huang, Y. P., 229 Huber, R., 405, 410(22) Huddler, D. P., 154 Huennekens, F. M., 154 Hull, W. E., 400 Hultgren, S. J., 401, 403(10) Hur, S., 57 Huyghues-Despointes, B. M., 43, 337, 342(12), 343(12), 387, 388(15), 398(15) Hvidt, A., 286, 330, 379, 386(1) Hwang, J.-K., 64

I Iacuzio, D. A., 88, 89(24), 97(24), 102(24) Ibarra-Molero, B., 45 Ichihara, A., 106 Ikeguchi, M., 246, 247(31), 248(31), 249(31), 254(29), 260, 264(29; 31), 270, 270(70), 356 Iko, Y., 133 Ikura, T., 270 Imhoff, D., 224, 226(19), 227(19), 228(19) Ip, S. H. C., 93 Irons, L. I., 252 Ishimori, K., 273, 274(117), 315 Itzhaki, L. S., 334 Iwamoto-Kihara, A., 133, 139

J Jackson, M., 180, 262 Jackson, S. E., 350, 359, 371(1) Jaenicke, R., 406 Jaffrey, S. R., 227 Jager, M., 286 Jamin, M., 242 Jarema, M., 405 Jaren, O. R., 45(57), 46 Jarrett, J. T., 154, 157, 160(14), 163, 163(14), 164(17), 165(12), 168(12), 169(12) Jayaraman, V., 170, 171, 172, 175, 175(9), 176(25), 177(9), 178, 179(26), 180(9), 181(9), 183, 183(24), 184(24), 185(24), 186(30) Jencks, W. P., 52, 61(2), 63(2) Jennings, P. A., 263(84), 264, 278(84), 279(84), 286 Jewett, A. I., 375, 376, 376(66) Jiracek, J., 154 Johnson, A. D., 208 Johnson, B. A., 7(12), 8 Johnson, J. L., 188, 198 Johnson, K. M., 207, 208, 211(11), 213(11), 214(11), 215(11), 216(11), 217(11) Johnson, M. L., 89(26), 90, 93, 95(26), 112, 131(19) Johnson, W. C., Jr., 249, 254(45) Jollie, D. R., 84 Jonasson, P., 252 Jones, C. M., 311, 313(3), 320(3), 323(3), 324(3) Jones, D. T., 409 Jones, J. A., 269 Jones, T. A., 259, 261(68) Joniau, M., 254, 263(58) Jönsson, B., 33 Jonsson, B.-H., 252 Jordan, F., 62 Joulia, L., 207 Jovin, T. M., 312 Junge, W., 70 Jurnak, F., 284

K Kabsch, W., 250, 253(53), 254(53), 267(53), 268(53) Kagawa, Y., 148

Kalbitzer, H. R., 406, 413 Kallenbach, N. R., 278, 286, 328(1), 329, 389 Kaltashov, I. A., 286 Kambara, M., 148 Kamen, D. E., 263(92), 264, 283(92), 284, 284(92) Kanaya, S., 263(95), 264 Kany, H., 413 Kao, Y.-H., 45(58), 46, 47(58), 48(58) Kaptein, R., 231 Karki, S., 221 Karp, D. A., 37(40), 38, 49, 49(40), 51(67) Karplus, M., 22, 28, 29(14), 36, 37(30), 242, 362, 369, 370, 378(35) Karshikoff, A., 38 Kast, P., 53, 61(11) Kato, S.-I., 260, 270(70) Katta, V., 9(14), 11 Kay, L. E., 48, 235, 358 Kay, M. S., 278 Kaya, H., 350, 351, 355, 358, 358(13), 360(13), 361(13), 363(13), 364, 364(13), 365, 366, 367(13; 43; 45), 368(43), 369(43), 370, 371(43), 372(13; 45; 49), 374(26; 45; 49), 375, 375(26; 45), 376, 376(65), 378(26; 43) Kaye, P. T., 3, 4(2) Keen, N. T., 284 Keesey, R., 171, 175(9), 177(9), 180(9), 181(9) Keiderling, T. A., 248, 250(35) Keinänen, K., 171 Kellis, J. T., 364, 375(41) Ketchum, C. J., 132, 135, 139 Khanjin, N. A., 56, 57(21) Kholodenko, Y., 314 Khorana, H. G., 413 Khorisanizadeh, S., 278 Kiefhaber, T., 263(86), 264 Kienhofer, A., 53, 61(11) Kihara, H., 270, 278, 280(132) Kim, H. W., 406 Kim, K.-S., 383, 384, 384(9), 387, 388(9), 392(12), 393, 394(9), 395(12) Kim, P. S., 249, 250(47; 48) Kim-Shapiro, D. B., 319 Kimura, T., 315 King, S. M., 220, 221(10) Kingston, R. E., 159, 207 Kini, A. R., 221

Kinosita, K., Jr., 133, 138, 139 Klabe, R. M., 34, 39(26) Klapper, I., 26, 30(10) Klein, J., 5 Kleinman, B., 224, 226(19), 227(19), 228(19) Klein-Seetharaman, J., 413 Kliger, D. S., 273, 275, 275(121), 276, 276(121), 308, 314, 316, 319, 321, 323, 324(31), 325, 325(33), 326, 326(34) Klimov, D. K., 363, 378(37) Klinger, A. L., 351 Knapp, E. W., 36 Knappskog, P. M., 16 Knight, J. B., 273, 315 Knudsen, J., 286 Knutson, K., 172 Kobashigawa, Y., 254 Koenig, J. L., 180 Kolinski, A., 355 Kollman, P. A., 61 Komives, E. A., 334 Konrat, R., 235 Koonin, E. V., 107 Kosen, P. A., 269 Koshiba, T., 254 Koshland, D. E., Jr., 187, 192(5) Kragelund, B. B., 286 Kranz, J. K., 404 Kraulis, P. J., 259, 261(68) Kreevoy, M. M., 64, 65(54), 66(54) Kremer, W., 406 Kronman, M. J., 250 Kruppa, G. H., 286 Kuhlman, B., 47, 48(62) Kulikowski, C. A., 229 Kumita, J. R., 314 Kuntz, I. D., 314, 316 Kuo, P. H., 135 Kuras, L., 208 Kuroda, Y., 261 Kurotsu, T., 256, 259(65), 260(65), 277(65) Kuusinen, A., 171 Kuwajima, K., 244, 245(12; 17; 19), 246, 246(12; 14; 17), 247(13; 31), 248(31), 249(31), 254, 254(17; 29), 255, 263(61; 81; 83; 91; 96; 97), 264, 264(29; 31), 270, 271(81), 280(61), 281(61), 282(61), 326 Kuwata, K., 270 Kuznetsov, S. A., 221

L Laemmli, U. K., 139 Lal, A. R., 3, 4(2) Lamotte-Brasseur, J., 37 Langen, R., 77 Langman, L., 139 Langsetmo, K., 37, 226 Larios, E., 278, 280(132) Larsen, R. W., 314 Laskowski, R. A., 231 Lattman, E. E., 37(40), 38, 49, 49(40), 51(67) Lau, E. Y., 401 Lau, H., 85 Laurents, D. V., 43, 242 Lawther, R. P., 90 Le, N. P., 137, 149 Leatherbarrow, R. J., 64 Leatherwood, J., 210 Leavitt, S. A., 187, 344 Lebedev, Yu. O., 246, 247(30), 248(30), 250(30), 263(30) Lecomte, J. T. J., 35, 41(28), 43(28), 44(28), 45(58), 46, 47(58), 48(58) Lee, F. S., 30, 38(18), 39(18), 55, 67, 74(15), 75(15), 80(66) Lee, J. K., 61, 62(36) Lee, K. K., 35, 37(40), 38, 41(28; 29), 43(28), 44(28), 49(40) Lee, R. S. F., 136, 137, 139, 139(25), 140, 143(25), 145(25), 147(25), 152(25) Lee, S. C., 311, 313(4), 320(4), 321(4) Legault, P., 235 Leipinsh, E., 396 Lenci, F., 314 Leslie, A. G. W., 132, 133, 133(3), 136(3; 7), 143, 148, 149(3; 7), 150(3; 7), 151, 151(3; 7) Levitt, M., 20, 54, 56(13), 250, 253(54), 353(10), 354, 355(10), 361 Levy, R. M., 26 Lewis, J. W., 275, 316, 319 Li, H., 62 Li, H. M., 172, 181(12) Li, M. G., 219, 224, 226(19), 227(19), 228(19), 233(18), 234(18), 237(18), 238(18) Li, R., 387, 387(25), 395, 395(14), 399(25) Li, X. H., 223 Li, X. Y., 208 Li, Y., 7(12), 8 Lian, C., 405

Liang, J., 227 Licata, V., 395 Lieberman, P. M., 208, 213 Lienhard, G. B., 61, 62(35), 63(35) Lightstone, F. C., 62 Lim, W. K., 340 Lin, J., 66, 67(64), 68(64) Lin, Y. S., 208, 210 Lipscomb, W. N., 85 Little, R., 90 Liu, Z.-P., 261, 277(78) Löbau, S., 137, 145, 149(49), 151, 151(49), 152(63) Lockless, S. W., 187, 344 Loewen, M. C., 412, 413, 413(38) Loewenthal, R., 46 Loh, S. N., 278 Loladze, V., 45, 46 Lounnas, J., 37 Lu, H. S. M., 314 Lu, J., 404 Lu, P., 405 Luboniwa, H., 231 Luck, L. A., 404 Luck, S. D., 311, 312, 313(3), 320(3), 323(3), 324(3) Ludwig, M. L., 154, 157, 162(13), 163, 164(17), 165(12), 166(13), 167(13), 168(12), 169(12) Luisi, D., 47, 48(62) Lumry, R., 285, 358, 361(24), 389 Lundberg, S. K., 312(11), 313 Luo, L., 406 Luque, I., 187, 344 Luthey-Schulten, Z., 362, 363, 363(32), 364(32), 367(32), 369(32), 370, 373, 373(32), 374(32) Luthy, R., 361 Lutter, R., 133, 136(7), 149(7), 150(7), 151(7)

M Ma, L., 349 Ma, S., 223 Mabuchi, K., 226 MacArthur, M. W., 231 MacGowan, E. B., 272 Madden, D. R., 170, 171, 175(9), 177(9), 178, 179(26), 180(9), 181(9), 183, 186(30) Maguire, A. J., 6, 7 Maier, C. S., 286

Maity, H., 340 Makarov, A. A., 45 Makarov, D., 375, 377(64) Makhatadze, G., 45, 46, 342, 358, 362(25), 364(25) Makokha, M., 224, 226, 233(18), 234(18), 237(18), 238(18) Manavalan, P., 249 Mandell, J. G., 334 Mangione, P., 263(85), 264 Mantsch, H. H., 262 Mantsch, H. M., 172, 177(13) March, K. L., 33, 43(21) Marcus, R. A., 56 Marino, G., 234 Markley, J. L., 236 Marquis, R. E., 404 Marqusee, S., 278, 286, 328, 334(10), 335, 338, 343, 378 Marshall, M., 223 Mårtensson, L.-G., 252 Martí, S., 57 Martinez, A., 16 Masaike, T., 138 Masselos, D., 286 Matouschek, A., 364, 375(41) Mattevi, A., 85 Matthew, J. B., 33, 43(21) Matthews, B. W., 45, 50 Matthews, C. R., 255, 263(61; 88), 264, 280(61), 281, 281(61), 282(61), 363, 372(39), 373(39) Matthews, R. G., 152, 154, 155, 157, 159(11), 160(14), 162(13), 163, 163(14), 164(17), 165, 165(12), 166(13), 167(13), 168(2; 12), 169(12) Mauel, C., 344 Mayne, L., 242, 271, 272(114), 273(114), 276(6), 277, 286, 289, 328(2), 329, 338, 345, 358, 378(28), 380, 386(4), 398(4) Mayo, S. L., 338, 383 McCammon, A. J., 29, 34(17) McCammon, J. A., 31, 33(20), 34(20) McCloskey, J. A., 291 McGrail, M., 219 McIntosh, J. R., 219 McIntosh, L. P., 407(30), 408, 414(30) McPhail, D., 7 Mehler, E. L., 38 Meiklejohn, A. L., 208

Melendez, M. G., 405, 406(21) Mendieta, J., 260 Menger, F. M., 56, 57(21) Menz, R. I., 132, 133(3), 136(3), 149(3), 150(3), 151(3) Méthot, N., 172 Michel, H., 52, 77(3) Milder, S. J., 316, 319 Miles, R. W., 9(16), 11, 13(16), 15(16) Miller, B., 15, 63 Miller, D. W., 389(21), 390 Miller, R., 369 Millian, N. S., 154 Mills, F. C., 93 Milne, J. S., 277, 289, 329, 380, 386(4), 398(4) Mines, G. A., 311, 313(4), 320(4), 321(4) Minks, C., 405, 410(22) Miraglia, N., 234 Miranker, A., 286, 292(9), 304(9) Mitchel, P., 52, 69(1), 74(1), 77(1) Mitchell, R. C., 3, 4(2) Mitome, N., 137 Miwa, S., 263(81), 264, 271(81), 326 Mizutani, Y., 312(9), 313 Mo, H., 343 Mok, Y. K., 358 Moller, D. E., 7(12), 8 Monaco, H. L., 260 Monleon, D., 230 Monod, J., 17, 87, 187, 192(4) Montelione, G. T., 229, 230 Montgomery, M. G., 151 Moore, D. D., 159 Morales, M., 192 Morishima, I., 273, 274(117), 315 Moroder, L., 405, 410(22) Morozova, L., 254, 255(57), 256(57) Morozova-Roche, L. A., 254, 256(59), 263(59), 269, 269(59) Morton, A., 383 Mosser, K., 405 Motta, A., 234 Moult, J., 267 Muchmore, D. C., 407(30), 408, 414(30) Muegge, I., 75, 77, 78(78), 80(78), 84 Muga, A., 16 Mulkidjanian, A. Y., 70 Mulle, C., 170 Muller Frohne, M., 266 Mulliez, E., 163

Mumenthaler, C., 231 Muneyuki, E., 137, 138 Muñoz, V., 378 Murphy, C. D., 412 Murphy, K. P., 24, 33(7), 350 Muthukrishnan, K., 319 Myers, D., 93, 99(33) Myers, J. K., 337

N Nadanaciva, S., 137, 140, 143, 143(42), 145(42), 149, 150(42; 58; 59), 151(42; 59) Nagamura, T., 263(81), 264, 271(81), 326 Nakagawa, A., 254 Nakamoto, R. K., 132, 135, 137, 139, 149 Nakamura, S., 356 Nall, B. T., 319 Narlikar, G. J., 207 Neely, K. E., 209, 210(19) Négrerie, M., 135 Neilsen, S. O., 379, 386(1) Nemethy, G., 187, 192(5) Newcomer, M. E., 259, 261(68) Newsom, S., 43 Nicholls, I. A., 3, 4(2) Nicholson, E. M., 343 Nicholson, H., 45 Nicholson, L. K., 347 Nielsen, S. O., 286, 330 Nieves, E., 9(16), 11, 13(16), 15, 15(16) Nishikawa, K., 261 Nishikawa, Y., 315 Nitta, K., 244, 245(12), 246(12; 14), 254, 266 Noble, R. W., 92 Noguchi, T., 261 Noji, H., 132, 133, 138, 139 Noppe, W., 269 North, A. C. T., 259, 261(68) Northey, J. G. B., 367 Nozaka, M., 244, 246(14) Nurminskaya, M. V., 220 Nurminsky, D. I., 220 Nymeyer, H., 372, 373, 373(58)

O Oas, T. G., 349 Obradovic, Z., 223 O’Brien, D. P., 7

Ogasahara, K., 263(95), 264 O’Hagan, D., 412 Ohgushi, M., 244 Okamura, M. Y., 78 Oliveberg, M., 48 Omote, H., 133, 137, 139, 149 Onuchic, J. N., 362, 363, 363(32), 364, 364(32), 367(32), 369(32), 370, 372, 372(39), 373, 373(32; 39; 58), 374(32) Oobatake, M., 263(95), 264 Orphanides, G., 207 Orriss, G. L., 143 Orru, S., 234 Oster, G., 52, 70, 70(7) Otting, G., 396 Oxenoid, K., 413 Ozaki, Y., 172 Ozer, J., 213

P Pace, C. N., 43, 45, 289, 336, 337, 342(12), 343(12), 387, 388(15), 398(15) Pain, R. H., 242, 263(89), 264, 285(89) Palleros, D. R., 256, 259(65), 260(65), 277(65) Palm, T., 230 Pan, H., 285, 287, 291 Panchenko, A. R., 373 Pancoska, P., 248, 250(35) Pande, V. S., 364, 367(44), 370, 375, 376, 376(66) Paoli, M., 18 Papageorgiu, A. C., 252 Papazyan, A., 62, 65, 68(59), 84 Papiz, M. Z., 259, 261(68) Paquette, S. J., 319 Pardee, A. B., 85 Park, B., 361 Park, C., 345 Parson, W. W., 57, 61(24), 78(24), 82(24), 84, 84(24) Paschal, B. M., 219 Pascher, T., 311, 313(4), 319, 320(4), 321(4; 30) Patel-King, R. S., 220 Paterson, Y., 345 Pauloin, A., 219 Pawley, N. H., 347 Paxton, R. J., 127 Pearson, J. G., 403 Pecoraro, V. L., 144

Peng, J. W., 233 Peng, Z., 406 Peng, Z. Y., 249, 250(48) Penkett, C. J., 238 Perez, J. A., 406 Perry, K. M., 281 Perutz, M. F., 18 Pervushin, K., 232 Peterkofsky, A., 238 Peters, I. D., 278 Peterson, C. L., 207 Petrich, J. W., 135 Pfarr, C. M., 219 Pfeffer, S. R., 219 Pfeifer, J., 229 Pfister, K. K., 220 Phillips, C. M., 312(9), 313 Pickford, A. R., 238 Pieroni, O., 314 Pinkner, J., 401, 403(10) Pizer, L. I., 106, 109(6) Plaxco, K. W., 362, 362(34), 374(34), 375, 376, 376(66), 377(63; 64) Plückthun, A., 286 Podjarny, A., 267 Pokarowski, P., 355 Poljak, R. J., 7(10), 8 Pollack, L., 273, 315 Postigo, D., 154 Potts, J. R., 238 Poulsen, F. M., 286 Powell, K. D., 349 Powers, R., 229 Pratt, E. A., 405 Prince, R. C., 80 Privalov, P. L., 243, 358, 362(25), 364(25) Provencher, S. W., 248, 249(37), 252(37), 254(37), 260(37) Prusiner, S. B., 343 Ptashne, M., 210 Ptitsyn, O. B., 244, 245, 245(22), 246, 247(30), 248(30), 250, 250(30), 263(30; 97), 264 Pucci, P., 234 Puett, D., 244 Pyo, S., 210, 213(21), 216(21)

Q Qi, P. X., 277 Qian, H., 383

Qin, Z., 278, 280(132) Qu, Y., 395, 399(24)

R Rabenstein, B., 36 Rabinovich, D., 267 Radford, S. E., 263(90), 264, 266(90), 285(90), 286, 292(9), 304(9) Radzicka, A., 15 Ragsdale, S. W., 154 Raleigh, D., 45, 47, 48(62) Ramaprasad, S., 396 Rammelsberg, R., 174, 175(20) Ranganathan, R., 187, 344 Ranish, J. A., 208, 210(10) Rao, R., 138 Rao, S. N., 64 Raquet, X., 37 Raschke, T. M., 278, 334(10), 335 Redfield, C., 249, 250(47; 48), 264, 266(100) Reeves, P. J., 413 Regenfuss, P., 312 Reichard, P., 163 Reinberg, D., 207 Reinhart, G. D., 187, 188, 190, 191(16), 193, 198, 199(24), 200(24), 201(24) Ren, H., 137 Renner, C., 412 Richards, F. M., 291 Rico, M., 43 Rider, M. H., 107 Ridgeway, C., 14, 15 Riek, R., 232 Ritchey, J. M., 99 Rizo, J., 261, 277(78) Rizzi, M., 85 Robbi, M., 107 Robbins, E. M., 248, 249(34) Robblee, J., 45 Robertson, A. D., 24, 33(8), 45(56; 57; 59), 46, 328(3), 329, 333(3), 383, 384(11), 388, 389(18), 397(11), 398, 398(18) Robillard, G. T., 135 Robinson, C. V., 264, 266(100), 286, 292(9), 304(9) Robinson, V., 412 Rock, R. S., 314 Roder, H., 263(82; 87), 264, 271(87), 273, 273(87), 278, 285(87), 311,

312, 313(3), 320(3), 323(3), 324(3), 326, 345 Roeder, R. G., 208 Rogero, J. R., 291 Rohl, C. A., 262 Rokhsar, D. S., 364, 367(44), 370 Romero, P., 223 Romm, J., 178, 179(26) Ropson, I. J., 263(93), 264, 402, 407 Rosa, J. J., 291 Rose, G. D., 261, 376 Rosenberg, A., 383, 389, 396 Ross, J. B. A., 135 Rothschild, K. J., 172, 174(17) Rothwarf, D. M., 263(94), 264, 276, 276(94), 277(94), 364, 375(42) Roy, M., 286 Rubin, M. M., 88 Rule, G. S., 7(10), 8, 405 Rullmann, J. A. C., 231 Rumbley, J. N., 340 Russell, C. B., 407(30), 408, 414(30) Russell, S. T., 26, 30(11), 33(23), 34, 55, 62, 62(16), 66, 67(62), 84(16)

S Sadqi, M., 378 Safront, V. S., 56 Saika, K., 148 Saito, K., 139 Sakuraoka, A., 263(91), 264 Sali, A., 242, 362, 369, 370, 378(35) Sali, D., 45 Sallach, H. J., 106 Salter, C. J., 3, 4(2) Salzmann, M., 232 Sambongi, Y., 133, 139 Sambonmatsu, N., 139 Sampogna, R., 29, 34(16), 36(16), 38(16) Sanchez-Ruiz, J., 45, 378 Sancho, J., 46 Sander, C., 250, 253(53), 254(53), 267(53), 268(53) Sanders, C. R., 413 Sands, R. H., 157, 163, 164(17), 165(12), 168(12), 169(12) Sandstrom, J., 401 Sarkisian, C. J., 45(58), 46, 47(58), 48(58) Sasahara, K., 266

Sauer, U., 45 Sawyer, L., 259, 260, 261(68) Sawyer, W. H., 125(23), 126, 131(23) Saya, A., 267 Scaloni, A., 234 Scatchard, G., 121 Schachman, H. K., 85, 99 Schaefer, M., 22, 36, 37(30) Schäfer, G., 146 Schaffrath, C., 412 Schell, D., 43 Scheraga, H. A., 263(94), 264, 276, 276(94), 277(94), 364, 375(42) Schimerlik, M. I., 286 Schimmel, P. R., 351 Schlosser, D. W., 81 Schmid, F. X., 263(86), 264 Schmitt, S., 146 Scholten, J. D., 157, 165(12), 168(12), 169(12) Scholtz, J. M., 43, 337, 342(12), 343(12), 387, 388(15), 398(15) Schramm, V. L., 9(16), 11, 13(16), 15, 15(16) Schuler, B., 406 Schuller, D., 107 Schulman, B. A., 249, 250(47; 48) Schultz, P. G., 412 Schutz, C. N., 26, 36(12), 37(12), 39(12), 50(12), 52, 67, 71, 80(74) Schwarz, F. P., 88, 89(24), 97(24), 100, 102(24; 40) Schweins, T., 77 Searle, M. S., 5, 6 Segawa, S.-I., 260, 270(69) Segel, I. H., 192 Seidman, J. G., 159 Sekimoto, Y., 148 Semisotnov, G. V., 246, 247(30), 248(30), 250, 250(30), 263(30; 97), 264, 270 Sen, L. C., 406 Senear, D. F., 208 Senior, A. E., 132, 135, 136, 136(20), 137, 138, 139, 139(25), 140, 141(36), 143, 143(20; 25; 41; 42), 145, 145(25; 41; 42), 147(25; 36), 148, 148(36), 149, 149(49), 150, 150(42; 58; 59), 151, 151(42; 49; 59; 61), 152(25; 63) Senn, H., 232 Seok, Y. J., 238 Seravalli, J., 154 Serrano, L., 364, 375(41)

Shaanan, B., 107 Shakhnovich, E. I., 362, 369, 370, 378(35) Sham, Y. Y., 39, 75, 78, 78(78), 79(94), 80(78; 94) Shapiro, D. B., 319 Sharman, G. J., 6 Sharp, K., 26, 29, 30(10), 34, 34(16), 36(16), 38(16) Sharp, P. A., 208 Shastry, M. C. R., 273, 312 Shaw, K. L., 43, 45 She, Z. S., 218 Shea, M. A., 208 Shevelyov, Y. Y., 220 Shi, W., 15 Shi, Z., 278 Shimabukuro, K., 137 Shimizu, A., 260, 270, 270(70) Shimizu, K., 356 Shimizu, S., 350, 351, 353, 354, 355, 355(9), 356(19; 20), 357(17–20), 365(19) Shimotakahara, S., 229 Shiraki, K., 261 Shirakihara, Y., 148 Shire, S. J., 33, 43(21) Shizuta, Y., 86 Short, S. A., 14 Shortle, D., 49 Shriver, J. W., 413 Shurki, A., 52, 57, 59(25), 63(25), 74, 75(77) Simon, I., 38, 380, 383(3), 389(3), 390(3) Simon, J. D., 275, 319 Simons, K. T., 375, 377(63) Simplaceanu, V., 405 Sitkoff, D., 34 Siuzdak, G., 286 Sivaprasadarao, R., 259, 261(68) Skolnick, J., 355 Slaughter, J. C., 106 Smallwood, A., 207, 208, 211(11), 213(11), 214(11), 215(11), 216(11), 217(11) Smart, O. S., 314 Smilansky, A., 267 Smith, B. C., 172 Smith, D. L., 9(15), 11, 13(15), 285, 286, 287, 288(22), 291, 292, 293, 300, 304(31), 334 Smith, F. R., 99, 129 Smith, J. A., 159 Smith, L. J., 238

Smith, R. G., 7(12), 8 Smithgall, T. E., 287 Snider, M. J., 14, 15, 53 Snyder, J. P., 56, 57(21) Snyder, S. H., 227 Socci, N. D., 363, 364, 372, 373(58) Söderlind, E., 45 Sonnhammer, E. L. L., 223 Sonnichsen, F. D., 413 Sosnick, T. R., 242, 271, 272(114), 273(114), 276(6), 277, 286, 328(2), 329, 338, 358, 378(28) Spadon, P., 260 Spector, S., 45 Speir, J. P., 286 Spencer, D., 49, 51(67) Spiro, T. G., 171, 172, 174 Sprang, S. R., 77 Sreerama, N., 249, 254(42), 273 Srinivasan, R., 261 Stafford, W. F., 226 Stalker, D. M., 406 Stefani, M., 263(85), 264 Steffen, W., 221 Steinmetz, M. G., 175, 183(24), 184(24), 185(24) Stellwagen, E., 272, 319 Stephens, E., 3 Stephens, P. J., 84 Stezowski, J. J., 154 Stilerman, M. D., 271, 272(114), 273(114) Stites, W. E., 37(40), 38, 49, 49(40), 51(67) Stone, R., 61, 62(35), 63(35) Storch, D. M., 61 Štrajbl, M., 52, 59, 73, 74, 75(77) Straume, M., 112, 131(19) Struhl, K., 159, 208 Strydom, D. J., 127 Suel, G. M., 187 Sugai, S., 244, 245(12), 246, 246(12; 14), 247(31), 248(31), 249(31), 254(29), 255, 260, 263(61; 81; 91; 96; 97), 264, 264(29; 31), 270, 270(70), 271(81), 280(61), 281(61), 282(61), 326 Sugawara, T., 263(96), 264 Sugimoto, E., 106, 109(6) Susi, H., 180 Sussman, F., 64 Sutherland, A., 404

Sutin, N., 312(10; 11), 313 Suzuki, T., 137 Svebak, R. M., 16 Svensson, B., 33 Swank, J., 281 Swapna, G. V. T., 230 Swint-Kruse, L., 45(59), 46, 383, 384(11), 397(11) Sykes, B. D., 400, 413 Symcox, M. M., 193 Szabo, A. G., 135

T Tabb, D. L., 180 Taddei, N., 263(85), 264 Tai, C. Y., 219 Taillon, B. E., 90 Takahashi, S., 273, 274(117), 315 Tanabe, M., 133 Tanaka, I., 254 Tanaka, T., 261 Tanford, C., 23, 244 Tang, Y., 412 Tantin, D., 210, 213(21), 216(21) Tao, F., 397 Tao, T., 226 Tapia, O., 56 Tate, M. W., 273, 315 Tauler, R., 260 Taylor, R. T., 154 Teesch, L. M., 388, 389(18), 398(18) Telford, J. R., 311, 313(5), 320(5), 321(5) Tennant, L., 34 ter Veld, F., 135 Texter, F. L., 286 Thiran, S., 175, 176(25), 178, 179(26), 183, 186(30) Thirumalai, D., 356, 363, 369, 372(39), 373(39), 375(53), 378(37) Thomas, G. J., 172, 181(11; 12) Thomas, M. R., 413 Thomas, S. T., 46 Thomas, Y. G., 323, 325(33) Thornburg, R. W., 135 Thornton, J. M., 231, 250 Thornton, K. C., 405, 406(21) Thorpe, C., 165 Thurkill, R. L., 43 Tidor, B., 45

Tiedge, H., 146 Tiktopulo, E. I., 246, 247(30), 248(30), 250(30), 263(30) Tilton, R. F., 316 Tirrell, D. A., 412 Tlapak-Simmons, V. L., 188 Tobin, J. B., 64, 65(55), 66(55) Tokushige, M., 86 Tollinger, M., 48 Touchette, N. A., 281 Toumadje, A., 249 Traub, W., 267 Trevino, S., 43 Trivinos-Lagos, L., 223 Truhlar, D. G., 63 Try, A. C., 6 Tsai, J., 353(10), 354, 355(10) Tshiro, M., 229 Tsui, V., 286 Tsunoda, S. P., 139 Tsuzuki, W., 7 Tüchsen, E., 380, 381, 383(3), 389(3), 390(3), 392(5; 6), 396, 396(5; 6), 397 Turina, P., 138 Turner, D. H., 312(10; 11), 313 Turner, J. M., 157(15), 158

U Ueda, I., 133 Ueda, T., 148 Umbarger, H. E., 85, 86 Urbanova, M., 248, 250(35) Uversky, V. N., 245, 247, 250(32), 252(32), 254(32), 277(32), 349

V Vallee, R. B., 219, 221 Van Dael, H., 254, 255(57), 256(57; 59), 263(58; 59), 269(59) van Nuland, N. A. J., 266 Van Schaftingen, E., 107 van Vlijmen, H. W., 22, 36, 37(30) Varadarajan, R., 134 Varley, P., 263(89), 264, 285(89) Vassilenko, K. S., 247, 250(32), 252(32), 254(32), 277(32) Vaughan, K. T., 221

Vaughn, M. D., 412 Venyaminov, S. Yu., 250, 273 Vetter, I. R., 77 Vijayakumar, M., 34 Villa, J., 54, 56(12), 61(12), 77 Vincent, S. J. F., 235 Virbasius, A., 208 Voges, D., 38 Volk, M., 314 Vuister, G. W., 229

W Wada, A., 244 Wada, Y., 133, 137, 139, 149 Wade, R. C., 37 Wagner, G., 233 Walker, J. E., 132, 133, 133(3), 136(3; 7), 143, 149(3; 7), 150(3; 7), 151, 151(3; 7) Wallqvist, A., 356 Walsh, D. A., 106 Walter, S., 286 Wang, B., 248, 250(35) Wang, C., 347 Wang, F., 9(16), 11, 13(16), 15, 15(16) Wang, H., 52, 70(7) Wang, H. Y., 70 Wang, J., 207, 208, 211(11), 213(11), 214(11), 215(11), 216(11), 217(11), 218 Wang, L., 291, 293, 412 Wang, M., 45 Wang, M. Z., 349 Warshel, A., 20, 26, 30, 30(11), 33(23), 34, 36(12), 37(12), 38(18), 39, 39(12; 18), 50(12), 52, 53, 54, 54(8), 55, 55(6), 56(6; 10; 12–14), 57, 59(6), 61, 61(6; 12; 24), 62, 62(6; 16; 32), 63(6; 14; 32), 64, 64(14), 65, 65(6), 66, 67, 67(6; 62), 68(59), 71, 73, 74, 75, 75(77), 77, 78, 78(24; 78), 79(75; 94), 80(6; 66; 74; 78; 92; 94), 81, 82(24), 84, 84(5; 16; 24) Weber, G., 188, 195(9) Weber, J., 132, 135, 136, 136(20), 137, 138, 139, 139(25), 140, 141(36), 143, 143(20; 25; 41; 42), 145, 145(25; 41; 42), 146, 147(25; 36), 148, 148(36), 149, 149(49), 150, 150(42; 58; 59), 151, 151(42; 49; 59; 61), 152(25; 63) Weiss, D. G., 221 Weiss, J. N., 123

Weiss, R. M., 66 Weissbach, H., 154 Well, M. A., 187 Wells, J. A., 64 Wente, S. R., 99 Westheimer, F. H., 77 Westwell, M. S., 5, 6, 7 White, S. H., 134 Whitham, S., 135 Whitt, S. A., 64, 65(55), 66(55) Whitten, S. T., 23 Wider, G., 232 Wijesinha, R. T., 264, 266(100) Wildes, D., 328, 378 Wilke-Mounts, S., 135, 136, 137, 138, 139, 139(25), 140, 143(25; 41), 145, 145(25; 41), 147(25), 149, 149(49), 150, 150(59), 151(49; 59; 61), 152(25) Wilkinson, A. J., 18 Wilkinson, K. D., 165 Willert, K., 263(86), 264 Williams, C. H., Jr., 165 Williams, D. H., 3, 4(2), 5, 6, 7, 7(10), 8, 9, 11, 12(17) Willis, C. L., 404 Willis, J. E., 106 Wilson, E. M., 7(12), 8 Winder, S. L., 266 Wingfield, P. T., 263(89), 264, 285(89) Winkler, J. R., 311, 313(4; 5), 320(4; 5), 321(4; 5) Winter, G., 64 Winzor, D. J., 125(23), 126, 131(23) Wise, J. G., 139 Wisz, M. S., 38 Wittinghofer, A., 77 Wittung-Stafshede, P., 273, 275(121), 276(121), 311, 313(5), 320(5), 321(5), 325, 326(34) Wolfenden, R., 14, 15, 53, 63 Wolynes, P. G., 309, 362, 363, 363(32), 364, 364(32), 367(32), 369(32), 370, 372(39), 373, 373(32; 33; 39), 374(32; 33), 378(33) Wong, K.-P., 244 Woo, T. S., 99 Wood, M. J., 276, 321, 324(31) Woodson, S. A., 369, 375(53) Woodward, C., 37, 378, 379, 380, 381, 383, 383(2; 3), 384, 384(9), 387, 387(25), 388(2; 9), 389(3), 390(3), 392(5; 6; 12),

393, 394(9), 395, 395(12; 14), 396, 396(5; 6), 397, 399(25), 400 Woody, R. W., 242, 249, 254(42), 255, 263(92), 264, 269, 269(62), 273, 278, 281(62), 283(92), 284, 284(92) Wooll, J. O., 340 Woolley, G. A., 314 Workman, J. L., 209, 210(19) Wormald, C., 247, 248(33), 249(33), 250(33) Woychik, N. A., 207 Wozniak, J. A., 45 Wrabl, J. O., 340 Wright, P., 34 Wright, P. E., 263(84), 264, 278(84), 279(84), 286, 349 Wu, C.-S., 281, 282(134) Wu, N., 62, 63(45) Wuthrich, K., 231, 232, 396 Wyman, J., 17, 23, 87, 112, 119(18), 131(18), 187, 188, 192(4), 195(6–8)

X Xiao, G., 86 Xie, X., 275, 319 Xu, X. L., 106, 108, 109(1; 2; 15)

Y Yadev, A., 56 Yakolev, G. I., 45 Yamasaki, K., 263(95), 264 Yamaya, H., 263(81), 264, 270, 271(81), 326 Yanagida, T., 133, 139 Yang, A.-S., 29, 34(16), 36(16), 38(16) Yang, D., 358 Yang, H., 300 Yang, J. T., 248, 281, 281(38), 282(38; 134) Yang, Y. R., 99 Yanofsky, C., 407(40), 413 Yao, M., 254 Yasuda, R., 133 Yasui, S. C., 248 Yernool, D., 175, 176(25) Yi, Q., 286 Yoder, M. D., 284 Yonath, A., 267 Yoneyama, M., 244, 245(12), 246(12), 263(91), 264 Yoshida, M., 132, 133, 137, 138, 139 You, T., 36 Young, P., 47, 48(62) Yu, H. D., 87, 88, 89(20; 24), 93(20), 95(20), 97(20; 24), 102(24) Yudkovsky, N., 208, 210(10) Yuksel, K. U., 127 Yutani, K., 263(95), 264

Z Zanotti, G., 260 Zerella, R., 3 Zhang, Z., 9(15), 11, 13(15), 287, 288(22), 291 Zhou, G., 7(12), 8 Zhou, H.-X., 34 Zhou, M., 3, 18 Zhou, Q., 213 Zhu, G., 229 Zhu, X., 208 Zimmerman, D. E., 229 Zimmermann, H., 178, 179(26) Zirwer, D., 266 Zondlo, J., 86 Zwahlen, C., 235

Subject Index

A

Aldolase, cooperative protein folding studies using Staphylococcus aureus enzyme and amide hydrogen/deuterium exchange mass spectrometry advantages over nuclear magnetic resonance, 287 data analysis, 292–293 high-performance liquid chromatography electrospray ionization mass spectrometry, 290–292 homology with rabbit enzyme, 305–307 intact protein analysis, 293–297 peptide fragment analysis, 298–302, 304–305 principles, 286–287 prospects, 307–308 pulsed versus continuous labeling, 287–288 sample preparation, 288–290 three-state unfolding, 304 Allosterism cooperativity comparison, 188 coupling free energy, 195–197, 203 definition, 188 hydrogen exchange analysis, see Amide hydrogen/deuterium exchange K-type effects, 189 multisubstrate enzymes, 197–198 oligomeric enzymes, 198–202 reciprocity, 193, 203 single substrate–single modifier mechanism, 189–194 ternary complex formation, 194–197 V-type effects, 189 Amide hydrogen/deuterium exchange binding and allostery studies, 344–345 chemistry, 329 conformational sensitivity, 328, 330–333 cooperative protein folding studies using Staphylococcus aureus aldolase and mass spectrometry advantages over nuclear magnetic resonance, 287 data analysis, 292–293 high-performance liquid chromatography electrospray ionization mass spectrometry, 290–292 homology with rabbit enzyme, 305–307 intact protein analysis, 293–297 peptide fragment analysis, 298–302, 304–305 principles, 286–287 prospects, 307–308 pulsed versus continuous labeling, 287–288 sample preparation, 288–290 three-state unfolding, 304 ligand-binding studies ligand affinity analysis, 349 ligand-binding interface probing, 348–349 ligand-induced ensemble modulation probing, 349 overview, 345–348 native state hydrogen exchange experiments buried amide exchange, 396 crystal studies, 397 EX2/EX1 model, 385–386 folded state exchange, 389–390 folded state mechanism analysis, 390–392 global stability independence of folded state exchange, 392–395 kinetics and thermodynamics of protein folding, 386–389 nuclear magnetic resonance, 381–384 overview, 337–340, 342, 379–381 pH dependence, 385 prospects, 397–399 protein-out exchange and motional domains, 395 slow exchange core and protein design, 399 surface amide exchange, 396 temperature switching of exchange mechanism, 392 two-process exchange, 381–383 nuclear magnetic resonance versus mass spectrometry detection, 287, 333–334 protein folding analysis, 334–336 protein stability studies, 336–337 receptor–ligand binding studies using mass spectrometry exchange conditions, 11 packing in transition state binding, 15–16 principles, 9 structural tightening region identification using pepsin digestion, 11–13 superprotection mechanisms, 342–344 two-state model and equations, 330–332 ATP synthase electrostatic basis for bioenergetics conformational energy conversion to electrostatic energy, 75–77 problem defining and simplification, 69–75 structure and forms, 132 catalytic mechanism models, 133 fluorescent probe studies of cooperativity cysteine-modified residues, 134–135, 137 intrinsic tryptophan fluorescence, 134–135, 137, 151–152 magnesium coordination in catalytic site, 150–151 purification of Escherichia coli enzyme anion-exchange chromatography, 139 cell lysis, 139 expression, 138–139 gel filtration, 139 purity requirements, 136 Trp-331 of β-subunit studies of nucleotide binding to Escherichia coli F1-ATPase advantages, 136, 151 base-binding pocket role in nucleotide binding, 147–148 data analysis, 141–145

fluorescence titration, 140–141 mechanistic implications, 145–147 mutational analysis of residues interacting with phosphate moiety of nucleotides, 149–150 nucleotide analog studies, 152

B Bioenergetics, electrostatics analysis ATP synthase studies, see ATP synthase electrostatic basis for structure–function correlation in enzymes, 54–56 electrostatic stabilization by hydrogen bonds versus low-barrier hydrogen bond proposal, 63–68 enzyme catalysis overview, 52–54 EVB simulations, 59–61 G-protein GTPase, 77 light-induced electron transfer electrostatic control, 80–84 near attack conformation concept, 56–57, 59–61 proton translocation electrostatic control, 78–80 reactive state destabilization proposals, 61–63

C CD, see Circular dichroism Chevron plot, protein folding cooperativity linear plots, 364–366, 369–370 quantitative characterization, 370–371 rollover and nonideal thermodynamic cooperativity, 374–375 Chorismate mutase, near attack conformation, 56–57, 59–61 Circular dichroism dynein cargo attachment complex conformational change monitoring on assembly, 238–239 folding and assembly analysis, 224 molten globule comparison between proteins, 250, 252–254, 256, 258, 261–262 detection, 245 α-lactalbumin studies, 246–250, 255–256 -lactalbumin studies, 259–261

lysozyme studies, 254–256 time-resolved studies of early protein folding events cytochrome c, 321, 323, 326 instrumentation, 315–319 magnetic circular dichroism, 316, 319, 322–324 Cobalamin-dependent methionine synthase catalytic reaction, 153–154 conformational change energetics carboxy-terminal fragment purification for study anion-exchange chromatography, 159 cell growth and induction, 157–159 gel filtration, 160 growth medium preparation, 158–159 hydrophobic affinity chromatography, 160 lysate preparation, 159 reductive methylation, 160 cobalamin chromophore as reporter, 152–153, 155–157 cob(II)alamin enzyme studies electron paramagnetic resonance, 164–165 flavodoxin titration, 163–164 spectrophotometry, 162–164 conformational states, 155–156 conformer assignment using S-adenosylmethionine, S-adenosylhomocysteine, and methyltetrahydrofolate, 166–168 energetics calculations, 166–169 ligand-binding domains, 154–155 methylcobalamin enzyme spectrophotometry and deconvolution, 160–162 Conformational change, see also Cooperativity; Protein packing glutamate receptor, see Glutamate receptor methionine synthase, see Cobalamin-dependent methionine synthase reporters, 152 Cooperativity additivity versus cooperativity, 351–353 ATP synthase, see ATP synthase definition, 188, 351–352

interaction cooperativity versus transition cooperativity, 354–355 modeling, 4–6, 353–355 negative cooperativity and structural loosening in receptors, 16–18 nuclear magnetic resonance probing of noncovalent bonding in positive cooperativity, 6–7, 9 phosphoglycerate dehydrogenase, see Phosphoglycerate dehydrogenase protein assembly, see Protein assembly cooperativity protein folding, see Protein folding threonine deaminase, see Threonine deaminase Cytochrome c protein folding studies of early events, 319–327 stopped-flow circular dichroism of burst phases in folding, 271–273, 275–276

D DHFR, see Dihydrofolate reductase Dihydrofolate reductase, stopped-flow circular dichroism of burst phases in folding, 280–282 DNase I footprinting, cooperative DNA binding studies, 208 Dynein cargo attachment complex binding assays, 233–234 binding site mapping, 234 components, 220–221 conformational change monitoring on binding circular dichroism, 238–239 fluorescence emission, 239–240 full-length IC74 studies, 241 limited proteolysis, 240–241 folding and assembly analysis analytical ultracentrifugation, 226 circular dichroism, 224 gel filtration, 225–226 functions, 219 nuclear magnetic resonance of complexes binding site mapping, 236–237 chemical shift and line width perturbations, 237 conformational change monitoring on binding, 238

constructs for study, 234–236 nuclear Overhauser effects at protein–peptide interface, 237–238 nuclear magnetic resonance determination of subunit and domain structure AutoStructure program for structure determination, 230–232 backbone dynamics, 232–233 dimer interface disruption low pH, 227 site-directed mutagenesis, 227–228 resonance assignments, 229–230 sample preparation, 228–229 protein purification small constructs of large subunits, 222–223 small and medium-size subunits, 222 structure, 220

E Electron paramagnetic resonance, cobalamin-dependent methionine synthase conformational change studies, 164–165 Electrophoretic mobility shift assay, cooperative DNA binding studies, 208–209 Electrostatic interactions biochemical processes, 20–21 bioenergetics analysis ATP synthase studies, see ATP synthase electrostatic basis for structure–function correlation in enzymes, 54–56 electrostatic stabilization by hydrogen bonds versus low-barrier hydrogen bond proposal, 63–68 enzyme catalysis overview, 52–54 EVB simulations, 59–61 G-protein GTPase, 77 light-induced electron transfer electrostatic control, 80–84 near attack conformation concept, 56–57, 59–61

proton translocation electrostatic control, 78–80 reactive state destabilization proposals, 61–63 computational structure-based electrostatic calculations conformational flexibility and reorganization handling, 36–38 limitations, 33 overview of continuum methods, 25–26 Poisson–Boltzmann equation solution by method of finite differences Coulombic, background, and Born contributions to pKa values, 41, 43 electrostatic potential calculation, 29–30, 40–41 internal ionizable residue physical properties, 49–51 modifications, 33–35 overview, 26 pKa value calculation, 28–29, 31–32, 36–38 protein dipole-Langevin dipole model advantages, 38–39 internal ionizable residue physical properties, 49–51 overview, 26, 30 pKa value calculation, 51 stability contributions context dependence of ion pair contributions, 45–46 long-range and short-range contributions, 43–45 Tanford–Kirkwood algorithm, 33, 45 thermodynamic cycle calculation of pKa values, 26–29, 31 denatured state studies, 48 experimental measurement of pH- and salt-linked thermodynamics equilibrium constants, 23–24 overview, 23 pKa determinations, 24–25, 31–32 physicochemical modulation of proteins, 21 salt sensitivity of electrostatic effects, 47–48

EMSA, see Electrophoretic mobility shift assay EPR, see Electron paramagnetic resonance

F FDBP, see Poisson–Boltzmann equation solution by method of finite differences Fluorescence resonance energy transfer, phosphoglycerate dehydrogenase stoichiometric binding studies of NADH, 127–129 Fluorine-19-labeled proteins, see Nuclear magnetic resonance Fourier transform infrared spectroscopy, S1S2 ligand-binding domain of GluR4 advantages and limitations, 170 caged glutamate studies difference spectra, 185 photolysis mechanism, 183–185 time-resolved studies, 182–183 conformational changes with full agonists, partial agonists, and antagonists environment of cysteine 426, 181–182 secondary structure, 180–181 cyclic versus noncyclic systems, 174–175 data collection, 172–173 enthalpy of noncovalent interactions at carboxylate moieties of ligand, 175–178 prospects, 186–187 stereochemistry of quinoxaline antagonist binding, 178–180 time-resolved spectroscopy, 173–174 vibrational modes, 171–172 Free energy of binding LUDI approach, 3–4 FRET, see Fluorescence resonance energy transfer FTIR, see Fourier transform infrared spectroscopy

G Gel filtration cobalamin-dependent methionine synthase, 160 dynein cargo attachment complex assembly analysis, 225–226

F1-ATPase of Escherichia coli, 139 molten globule sizing, 258 Glutamate receptor Fourier transform infrared spectroscopy of S1S2 ligand-binding domain of GluR4 advantages and limitations, 170 caged glutamate studies difference spectra, 185 photolysis mechanism, 183–185 time-resolved studies, 182–183 conformational changes with full agonists, partial agonists, and antagonists environment of cysteine 426, 181–182 secondary structure, 180–181 cyclic versus noncyclic systems, 174–175 data collection, 172–173 enthalpy of noncovalent interactions at carboxylate moieties of ligand, 175–178 prospects, 186–187 stereochemistry of quinoxaline antagonist binding, 178–180 time-resolved spectroscopy, 173–174 vibrational modes, 171–172 topology of subunits, 170–171 Gō model, protein folding cooperativity, 364, 372–374

H Hill plot, phosphoglycerate dehydrogenase cooperativity analysis, 118–121 Hydrogen exchange, see Amide hydrogen/deuterium exchange

I Interaction cooperativity, versus transition cooperativity, 354–355 -Lactalbumin, circular dichroism molten globule studies, 246–250, 255–256 stopped-flow circular dichroism of burst phases in folding, 264, 266 -Lactalbumin, circular dichroism molten globule studies, 259–261

stopped-flow circular dichroism of burst phases in folding, 270–271

L LBHB, see Low-barrier hydrogen bond Low-barrier hydrogen bond, transition state stabilization, 63–68 Lysozyme, circular dichroism molten globule studies, 254–256 stopped-flow circular dichroism of burst phases in folding, 266–269

M Mass spectrometry amide hydrogen/deuterium exchange studies of receptor–ligand binding exchange conditions, 11 packing in transition state binding, 15–16 principles, 9 structural tightening region identification using pepsin digestion, 11–13 cooperative protein folding studies using Staphylococcus aureus aldolase and amide hydrogen/deuterium exchange advantages over nuclear magnetic resonance, 287 data analysis, 292–293 high-performance liquid chromatography electrospray ionization mass spectrometry, 290–292 homology with rabbit enzyme, 305–307 intact protein analysis, 293–297 peptide fragment analysis, 298–302, 304–305 principles, 286–287 prospects, 307–308 pulsed versus continuous labeling, 287–288 sample preparation, 288–290 three-state unfolding, 304 MetH, see Cobalamin-dependent methionine synthase

Methionine synthase, see Cobalamin-dependent methionine synthase Molten globule characteristics, 244 circular dichroism comparison between proteins, 250, 252–254, 256, 258, 261–262 detection, 245 -lactalbumin studies, 246–250, 255–256 -lactalbumin studies, 259–261 lysozyme studies, 254–256 gel filtration and sizing, 258 highly structured molten globule, 245 nuclear magnetic resonance, 262 phase transitions, 245 pH effects on stability, 245, 259 precursor, 245 MS, see Mass spectrometry Myoglobin, stopped-flow circular dichroism of burst phases in folding, 278

N NMR, see Nuclear magnetic resonance Nuclear magnetic resonance amide hydrogen/deuterium exchange detection versus mass spectrometry detection, 287, 333–334 dynein cargo attachment complex complex assembly binding site mapping, 236–237 chemical shift and line width perturbations, 237 conformational change monitoring on binding, 238 constructs for study, 234–236 nuclear Overhauser effects at protein–peptide interface, 237–238 structure determination of subunits and domains AutoStructure program for structure determination, 230–232 backbone dynamics, 232–233 dimer interface disruption, 227–228 resonance assignments, 229–230 sample preparation, 228–229

fluorine-19-labeled protein studies advantages, 401 conformer exchange detection, 401 incorporation of labeled amino acids in proteins aromatic amino acids, 405–408 commercial availability of amino acids, 412 overview, 404 site-specific labeling, 410–415 protein folding studies, 401–404 resonance assignment, 409 molten globule, 262 native state hydrogen exchange experiments, 381–384 pKa determinations, 24, 31 positive cooperativity probing of noncovalent bonding, 6–7, 9

O Optical rotatory dispersion, time-resolved studies of early protein folding events, 316, 319, 326–327 ORD, see Optical rotatory dispersion

P PDGH, see Phosphoglycerate dehydrogenase PDLD, see Protein dipole-Langevin dipole model Pectate lyase C, stopped-flow circular dichroism of burst phases in folding, 283–284 Phosphoglycerate dehydrogenase catalytic reaction, 106 cooperative ligand binding characterization binding curves Hill plot, 118–121 Scatchard plot, 121–122 theoretical binding curves, 115–118 multiple dependent site binding equations, 111–115 serine inhibition, 123–124 single-site binding or multiple independent site binding equations, 109–111 equilibrium dialysis

ligand binding measurement, 125–127 protein quantification for binding experiments, 127 NADH binding and cooperativity, 109 serine binding and cooperativity, 108–109 site-directed mutagenesis and thermodynamic linkage analysis of ligand binding, 129–131 species distribution, 106–107 stoichiometric binding studies of NADH using fluorescence resonance energy transfer, 127–129 structure, 107–108 Photosynthesis, electrostatic control of light-induced electron transfer, 80–84 PIC, see Preinitiation complex PMF, see Potential of mean force Poisson–Boltzmann equation solution by method of finite differences Coulombic, background, and Born contributions to pKa values, 41, 43 electrostatic potential calculation, 29–30, 40–41 internal ionizable residue physical properties, 49–51 modifications, 33–35 overview, 26 pKa value calculation, 28–29, 31–32, 36–38 Potential of mean force, protein folding cooperativity, 355–357 Preinitiation complex components, 207 immobilized template assay of cooperative DNA binding in assembly advantages and limitations, 209–210 applications, 210 binding reaction conditions, 212–214 cooperative binding detection, 214–215 DNA fragment binding to streptavidin beads, 211–212 quantification of bound fragments, 212 polymerase chain reaction, 211 prospects, 218–219 quantitative analysis, 216–218 template design, 210–211 transcription on immobilized templates, 215–216

Protein assembly cooperativity dynein, see Dynein cargo attachment complex preinitiation complex, see Preinitiation complex Protein dipole-Langevin dipole model advantages, 38–39 internal ionizable residue physical properties, 49–51 overview, 26, 30 pKa value calculation, 51 Protein folding cooperative protein folding studies using Staphylococcus aureus aldolase and amide hydrogen/deuterium exchange mass spectrometry advantages over nuclear magnetic resonance, 287 data analysis, 292–293 high-performance liquid chromatography electrospray ionization mass spectrometry, 290–292 homology with rabbit enzyme, 305–307 intact protein analysis, 293–297 peptide fragment analysis, 298–302, 304–305 principles, 286–287 prospects, 307–308 pulsed versus continuous labeling, 287–288 sample preparation, 288–290 three-state unfolding, 304 cooperativity principles coupling of local and nonlocal interactions in contact-order-dependent cooperative folding, 375–377 Gō model, 364, 372–374 interaction cooperativity versus transition cooperativity, 354–355 kinetic cooperativity chevron plots, 364–366 rate constants, 363 nonadditivity prevalence among solvent-mediated interactions, 355–357 prospects for study, 377–379 statistical mechanical properties as constraints on protein energetics

chevron rollover and nonideal thermodynamic cooperativity, 374–375 interaction specificity enhancement of cooperativity, 367–369 many-body interactions for linear chevron behavior, 369–370 overview, 366–367 principle of minimal frustration, 372–374 quantitative characterization of chevron plots, 370–371 thermodynamic versus kinetic cooperativity, 371–372 thermodynamic cooperativity of folding transitions calorimetric criterion, 358–360 relationship with minimal frustration and energy gap ideas, 361–363 enthalpy of unfolding, 243–244 folding funnel model, 242, 309–310 hydrogen exchange analysis, see Amide hydrogen/deuterium exchange landscape model, 309 molten globule characteristics, 244 circular dichroism comparison between proteins, 250, 252–254, 256, 258, 261–262 detection, 245 -lactalbumin studies, 246–250, 255–256 -lactalbumin studies, 259–261 lysozyme studies, 254–256 gel filtration and sizing, 258 highly structured molten globule, 245 nuclear magnetic resonance, 262 phase transitions, 245 pH effects on stability, 245, 259 precursor, 245 pathway model, 242 stopped-flow circular dichroism of burst phases cytochrome c, 271–273, 275–276 dihydrofolate reductase, 280–282 -lactalbumin, 264, 266 -lactalbumin, 270–271 lysozyme, 266–269 myoglobin, 278 overview, 262–264

pectate lyase C, 283–284 ribonuclease A, 276–278 ribonuclease H, 278 tryptophan synthase, 285 ubiquitin, 278–280 time-resolved studies of early events circular dichroism, 315–319, 321, 323, 326 cytochrome c studies, 319–327 fluorescence measurement, 314–315 magnetic circular dichroism, 316, 319, 322–324 optical absorption, 315, 321 optical rotatory dispersion, 316, 319, 326–327 overview of techniques, 310–311 rapid initiation of folding, 312–31 small-angle X-ray scattering, 315 transition state theory, 311–312 two-state transition, 243, 285, 350, 363 Protein packing mass spectrometry amide hydrogen/deuterium exchange studies of receptor–ligand binding exchange conditions, 11 packing in transition state binding, 15–16 principles, 9 structural tightening region identification using pepsin digestion, 11–13 negative cooperativity and structural loosening in receptors, 16–18 thermodynamic evidence for enzyme packing in transition state, 13–15

R Reactive state destabilization proposals, limitations, 61–63 Ribonuclease A, stopped-flow circular dichroism of burst phases in folding, 276–278 Ribonuclease H, stopped-flow circular dichroism of burst phases in folding, 278 RNA polymerase II, see Preinitiation complex

S SATK, see Tanford–Kirkwood algorithm SAXS, see Small-angle X-ray scattering Scatchard plot, phosphoglycerate dehydrogenase cooperativity analysis, 121–122 Site-directed mutagenesis ATP synthase analysis of residues interacting with phosphate moiety of nucleotides, 149–150 magnesium coordination in catalytic site, 150–151 dynein cargo attachment complex, 227–228 phosphoglycerate dehydrogenase thermodynamic linkage analysis of ligand binding, 129–131 Small-angle X-ray scattering, time-resolved studies of early protein folding events, 315

T Tanford–Kirkwood algorithm, pKa determinations, 33, 45 Threonine deaminase catalytic reaction, 86 coupling free energy for cooperative active site ligand binding estimation using active dimeric variants, 92–93, 95–99 crystal structures, 90, 92 feedback inhibition, 85 feedback regulation studies using hybrid, enzyme-like tetramers, 99–100, 102–105 homotropic cooperativity, expanded two-state model, 87–90 Transition cooperativity, versus interaction cooperativity, 354–355 Tryptophan synthase, stopped-flow circular dichroism of burst phases in folding, 285

U Ubiquitin, stopped-flow circular dichroism of burst phases in folding, 278–280